stopwords-json

Stopwords for various languages in JSON format. Per Wikipedia:

Stop words are words which are filtered out prior to, or after, processing of natural language data [...] these are some of the most common, short function words, such as the, is, at, which, and on.

You can use all stopwords at once with stopwords-all.json (keyed by ISO 639-1 language code), or see the table below for the individual per-language stopword files.
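As a rough illustration of how the combined file might be used, the sketch below filters stopwords out of a token list. It assumes only what is stated above: that the data is an object keyed by ISO 639-1 code, with each value taken here to be an array of stopword strings. The helper name `removeStopwords` and the tiny `sample` object are hypothetical stand-ins for this example and are not part of stopwords-json; in practice you would pass in the parsed contents of stopwords-all.json instead.

```javascript
// Hypothetical helper, not part of stopwords-json itself.
// `stopwordsByLang` is assumed to have the shape of stopwords-all.json:
// an object mapping ISO 639-1 codes to arrays of stopword strings.
function removeStopwords(tokens, stopwordsByLang, lang) {
  const list = stopwordsByLang[lang];
  if (!Array.isArray(list)) {
    throw new Error(`No stopword list for language code "${lang}"`);
  }
  // A Set gives constant-time lookups for each token.
  const stopwords = new Set(list);
  // Compare case-insensitively by lowercasing each token before lookup.
  return tokens.filter((token) => !stopwords.has(token.toLowerCase()));
}

// Tiny illustrative stand-in for the real file (NOT its actual contents).
const sample = {
  en: ['the', 'is', 'at', 'which', 'on'],
  de: ['der', 'die', 'das', 'und'],
};

const tokens = 'The cat is on the mat'.split(' ');
const kept = removeStopwords(tokens, sample, 'en');
console.log(kept); // [ 'cat', 'mat' ]
```

Building a `Set` once per language and reusing it is usually preferable to calling `Array.prototype.includes` for every token, since some of the lists contain several hundred entries.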

Languages

There are a total of 50 supported languages:

Language Stopword count Filename
Afrikaans 51 af.json
Arabic 162 ar.json
Armenian 45 hy.json
Basque 98 eu.json
Bengali 116 bn.json
Breton 126 br.json
Bulgarian 259 bg.json
Catalan 218 ca.json
Chinese 542 zh.json
Croatian 179 hr.json
Czech 346 cs.json
Danish 101 da.json
Dutch 275 nl.json
English 570 en.json
Esperanto 173 eo.json
Estonian 35 et.json
Finnish 772 fi.json
French 606 fr.json
Galician 160 gl.json
German 596 de.json
Greek 75 el.json
Hausa 39 ha.json
Hebrew 194 he.json
Hindi 225 hi.json
Hungarian 781 hu.json
Indonesian 355 id.json
Irish 109 ga.json
Italian 619 it.json
Japanese 109 ja.json
Korean 679 ko.json
Latin 49 la.json
Latvian 161 lv.json
Marathi 99 mr.json
Norwegian 172 no.json
Persian 332 fa.json
Polish 260 pl.json
Portuguese 408 pt.json
Romanian 282 ro.json
Russian 539 ru.json
Slovak 110 sk.json
Slovenian 446 sl.json
Somali 30 so.json
Southern Sotho 31 st.json
Spanish 577 es.json
Swahili 74 sw.json
Swedish 401 sv.json
Thai 115 th.json
Turkish 279 tr.json
Yoruba 60 yo.json
Zulu 29 zu.json

Sources

License and Copyright

Copyright (c) 2017 Peter Graham, contributors. Released under the Apache-2.0 license.
