seongl / spark-nlp-models

This project forked from johnsnowlabs/spark-nlp-models

Models and Pipelines for the Spark NLP library

Home Page: https://nlp.johnsnowlabs.com/

License: Apache License 2.0

Jupyter Notebook 94.94% Shell 2.44% Python 2.62%

Spark NLP Models

We use this repository to maintain our releases of pre-trained pipelines and models for the Spark NLP library. For more info please take a look at our releases.

Project's website

Take a look at our official Spark NLP page, http://nlp.johnsnowlabs.com/, for user documentation and examples.

Slack community channel

Join Slack

Pretrained Pipelines

Example of how to use Spark NLP pretrained pipelines:

# Import Spark NLP
from sparknlp.base import *
from sparknlp.annotator import *

from sparknlp.pretrained import PretrainedPipeline
import sparknlp

# Start Spark Session with Spark NLP
spark = sparknlp.start()

# Download a pre-trained pipeline
pipeline = PretrainedPipeline('explain_document_dl', lang='en')

# Your testing dataset
text = """
The Mona Lisa is a 16th century oil painting created by Leonardo. 
It's held at the Louvre in Paris.
"""

# Annotate your testing dataset
result = pipeline.annotate(text)

# What's in the pipeline
list(result.keys())
# Output: ['entities', 'stem', 'checked', 'lemma', 'document',
#          'pos', 'token', 'ner', 'embeddings', 'sentence']

# Check the results
result['entities']
# Output: ['Mona Lisa', 'Leonardo', 'Louvre', 'Paris']
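
The result of annotate() is a dictionary of lists, so token-level outputs can be lined up with plain Python. The sketch below is a minimal illustration: pair_annotations is a hypothetical helper (not part of Spark NLP), the sample dictionary is hand-written rather than real pipeline output, and it assumes the two chosen lists align one-to-one.

```python
def pair_annotations(result, left="token", right="pos"):
    """Zip two token-level annotation lists from an annotate()-style dict."""
    left_values, right_values = result[left], result[right]
    if len(left_values) != len(right_values):
        raise ValueError(f"'{left}' and '{right}' have different lengths")
    return list(zip(left_values, right_values))


# Hand-written stand-in for an annotate() result; not real pipeline output.
sample = {
    "token": ["Leonardo", "painted", "it"],
    "pos": ["NNP", "VBD", "PRP"],
}

for token, tag in pair_annotations(sample):
    print(token, tag)
```

With a real result, the same helper could be called as pair_annotations(result, "token", "ner"), provided both keys are present and their lists have the same length.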

Public Pipelines

Danish - Pipelines

Pipeline Name Build lang Description Offline
Explain Document Small explain_document_sm 2.6.0 da Download
Explain Document Medium explain_document_md 2.6.0 da Download
Explain Document Large explain_document_lg 2.6.0 da Download
Entity Recognizer Small entity_recognizer_sm 2.6.0 da Download
Entity Recognizer Medium entity_recognizer_md 2.6.0 da Download
Entity Recognizer Large entity_recognizer_lg 2.6.0 da Download

Dutch - Pipelines

Pipeline Name Build lang Description Offline
Explain Document Small explain_document_sm 2.5.0 nl Download
Explain Document Medium explain_document_md 2.5.0 nl Download
Explain Document Large explain_document_lg 2.5.0 nl Download
Entity Recognizer Small entity_recognizer_sm 2.5.0 nl Download
Entity Recognizer Medium entity_recognizer_md 2.5.0 nl Download
Entity Recognizer Large entity_recognizer_lg 2.5.0 nl Download

English - Pipelines

Pipeline Name Build lang Description Offline
Explain Document ML explain_document_ml 2.4.0 en Download
Explain Document DL explain_document_dl 2.4.3 en Download
Recognize Entities DL recognize_entities_dl 2.4.3 en Download
Recognize Entities DL recognize_entities_bert 2.6.0 en Download
OntoNotes Entities Small onto_recognize_entities_sm 2.4.0 en Download
OntoNotes Entities Large onto_recognize_entities_lg 2.4.0 en Download
Match Datetime match_datetime 2.4.0 en Download
Match Pattern match_pattern 2.4.0 en Download
Match Chunk match_chunks 2.4.0 en Download
Match Phrases match_phrases 2.4.0 en Download
Clean Stop clean_stop 2.4.0 en Download
Clean Pattern clean_pattern 2.4.0 en Download
Clean Slang clean_slang 2.4.0 en Download
Check Spelling check_spelling 2.4.0 en Download
Check Spelling DL check_spelling_dl 2.5.0 en Download
Analyze Sentiment analyze_sentiment 2.4.0 en Download
Analyze Sentiment DL analyze_sentimentdl_use_imdb 2.5.0 en Download
Analyze Sentiment DL analyze_sentimentdl_use_twitter 2.5.0 en Download
Dependency Parse dependency_parse 2.4.0 en Download

Finnish - Pipelines

Pipeline Name Build lang Description Offline
Explain Document Small explain_document_sm 2.6.0 fi Download
Explain Document Medium explain_document_md 2.6.0 fi Download
Explain Document Large explain_document_lg 2.6.0 fi Download
Entity Recognizer Small entity_recognizer_sm 2.6.0 fi Download
Entity Recognizer Medium entity_recognizer_md 2.6.0 fi Download
Entity Recognizer Large entity_recognizer_lg 2.6.0 fi Download

French - Pipelines

Pipeline Name Build lang Description Offline
Explain Document Large explain_document_lg 2.4.0 fr Download
Explain Document Medium explain_document_md 2.4.0 fr Download
Entity Recognizer Large entity_recognizer_lg 2.4.0 fr Download
Entity Recognizer Medium entity_recognizer_md 2.4.0 fr Download

German - Pipelines

Pipeline Name Build lang Description Offline
Explain Document Large explain_document_lg 2.4.0 de Download
Explain Document Medium explain_document_md 2.4.0 de Download
Entity Recognizer Large entity_recognizer_lg 2.4.0 de Download
Entity Recognizer Medium entity_recognizer_md 2.4.0 de Download

Italian - Pipelines

Pipeline Name Build lang Description Offline
Explain Document Large explain_document_lg 2.4.0 it Download
Explain Document Medium explain_document_md 2.4.0 it Download
Entity Recognizer Large entity_recognizer_lg 2.4.0 it Download
Entity Recognizer Medium entity_recognizer_md 2.4.0 it Download

Norwegian - Pipelines

Pipeline Name Build lang Description Offline
Explain Document Small explain_document_sm 2.5.0 no Download
Explain Document Medium explain_document_md 2.5.0 no Download
Explain Document Large explain_document_lg 2.5.0 no Download
Entity Recognizer Small entity_recognizer_sm 2.5.0 no Download
Entity Recognizer Medium entity_recognizer_md 2.5.0 no Download
Entity Recognizer Large entity_recognizer_lg 2.5.0 no Download

Polish - Pipelines

Pipeline Name Build lang Description Offline
Explain Document Small explain_document_sm 2.5.0 pl Download
Explain Document Medium explain_document_md 2.5.0 pl Download
Explain Document Large explain_document_lg 2.5.0 pl Download
Entity Recognizer Small entity_recognizer_sm 2.5.0 pl Download
Entity Recognizer Medium entity_recognizer_md 2.5.0 pl Download
Entity Recognizer Large entity_recognizer_lg 2.5.0 pl Download

Portuguese - Pipelines

Pipeline Name Build lang Description Offline
Explain Document Small explain_document_sm 2.5.0 pt Download
Explain Document Medium explain_document_md 2.5.0 pt Download
Explain Document Large explain_document_lg 2.5.0 pt Download
Entity Recognizer Small entity_recognizer_sm 2.5.0 pt Download
Entity Recognizer Medium entity_recognizer_md 2.5.0 pt Download
Entity Recognizer Large entity_recognizer_lg 2.5.0 pt Download

Russian - Pipelines

Pipeline Name Build lang Description Offline
Explain Document Small explain_document_sm 2.4.4 ru Download
Explain Document Medium explain_document_md 2.4.4 ru Download
Explain Document Large explain_document_lg 2.4.4 ru Download
Entity Recognizer Small entity_recognizer_sm 2.4.4 ru Download
Entity Recognizer Medium entity_recognizer_md 2.4.4 ru Download
Entity Recognizer Large entity_recognizer_lg 2.4.4 ru Download

Spanish - Pipelines

Pipeline Name Build lang Description Offline
Explain Document Small explain_document_sm 2.4.0 es Download
Explain Document Medium explain_document_md 2.4.0 es Download
Explain Document Large explain_document_lg 2.4.0 es Download
Entity Recognizer Small entity_recognizer_sm 2.4.0 es Download
Entity Recognizer Medium entity_recognizer_md 2.4.0 es Download
Entity Recognizer Large entity_recognizer_lg 2.4.0 es Download

Swedish - Pipelines

Pipeline Name Build lang Description Offline
Explain Document Small explain_document_sm 2.6.0 sv Download
Explain Document Medium explain_document_md 2.6.0 sv Download
Explain Document Large explain_document_lg 2.6.0 sv Download
Entity Recognizer Small entity_recognizer_sm 2.6.0 sv Download
Entity Recognizer Medium entity_recognizer_md 2.6.0 sv Download
Entity Recognizer Large entity_recognizer_lg 2.6.0 sv Download

Multi-language - Pipelines

Pipeline Name Build lang Description Offline
LanguageDetectorDL detect_language_7 2.5.2 xx Download
LanguageDetectorDL detect_language_20 2.5.2 xx Download
  • The model with 7 languages: Czech, German, English, Spanish, French, Italian, and Slovak
  • The model with 20 languages: Bulgarian, Czech, German, Greek, English, Spanish, Finnish, French, Croatian, Hungarian, Italian, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Swedish, Turkish, and Ukrainian

Pretrained Models

Public Models

To use a pre-trained model for a specific annotator in your pipeline, call the pretrained(name, lang) function on the annotator listed in the Model column.

Example of loading a pretrained BERT model or NER model:

bert = BertEmbeddings.pretrained(name='bert_base_cased', lang='en')

ner_onto = NerDLModel.pretrained(name='ner_dl_bert', lang='en')

NOTE: build is the minimum Spark NLP version for which the model can be downloaded or loaded. For instance, a model with build 2.4.0 can be used with release 2.4.0 and every later release, but not with earlier ones.
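
The build rule in the note above amounts to a version comparison. The following is a minimal sketch, assuming dotted numeric version strings such as 2.4.0; the helper names parse_version and is_build_compatible are hypothetical and not part of Spark NLP.

```python
def parse_version(version):
    """Turn a dotted version string such as '2.4.3' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_build_compatible(model_build, installed_version):
    """Return True if the installed release is at or above the model's build."""
    return parse_version(installed_version) >= parse_version(model_build)


print(is_build_compatible("2.4.0", "2.6.0"))  # True
print(is_build_compatible("2.6.0", "2.4.3"))  # False
```

Comparing integer tuples rather than raw strings keeps the ordering correct for multi-digit components, for example 2.10.0 versus 2.4.0.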

Pretrained models are useful for building a custom pipeline when the pretrained pipelines don't offer a feature you need or when you need more flexibility:

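from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import UniversalSentenceEncoder, SentimentDLModel

# Turn the raw text in the "description" column into document annotations
document = DocumentAssembler()\
    .setInputCol("description")\
    .setOutputCol("document")

# Encode each document with the pretrained Universal Sentence Encoder
use = UniversalSentenceEncoder.pretrained(name="tfhub_use", lang="en")\
    .setInputCols(["document"])\
    .setOutputCol("sentence_embeddings")

# Pretrained sentiment classifier; predictions go to the "sentiment" column
sentimentdl = SentimentDLModel.pretrained(name="sentimentdl_use_imdb", lang="en")\
    .setInputCols(["sentence_embeddings"])\
    .setOutputCol("sentiment")

pipeline = Pipeline(
    stages=[
        document,
        use,
        sentimentdl
    ])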

Danish - Models

Model Name Build Lang Offline
LemmatizerModel (Lemmatizer) lemma 2.5.5 da Download
PerceptronModel (POS UD) pos_ud_ddt 2.5.5 da Download
NerDLModel (glove_100d) dane_ner_6B_100 2.6.0 da Download
NerDLModel (glove_6B_300) dane_ner_6B_300 2.6.0 da Download
NerDLModel (glove_840B_300) dane_ner_840B_300 2.6.0 da Download

Dutch - Models

Model Name Build Lang Offline
LemmatizerModel (Lemmatizer) lemma 2.5.0 nl Download
PerceptronModel (POS UD) pos_ud_alpino 2.5.0 nl Download
NerDLModel (glove_100d) wikiner_6B_100 2.5.0 nl Download
NerDLModel (glove_6B_300) wikiner_6B_300 2.5.0 nl Download
NerDLModel (glove_840B_300) wikiner_840B_300 2.5.0 nl Download

English - Models

Model Name Build Lang Offline
LemmatizerModel (Lemmatizer) lemma_antbnc 2.0.2 en Download
PerceptronModel (POS) pos_anc 2.0.2 en Download
PerceptronModel (POS UD) pos_ud_ewt 2.2.2 en Download
NerCrfModel (NER with GloVe) ner_crf 2.4.0 en Download
NerDLModel (NER with GloVe) ner_dl 2.4.3 en Download
NerDLModel (NER with BERT) ner_dl_bert 2.6.0 en Download
NerDLModel (OntoNotes with GloVe 100d) onto_100 2.4.0 en Download
NerDLModel (OntoNotes with GloVe 300d) onto_300 2.4.0 en Download
DeepSentenceDetector ner_dl_sentence 2.4.0 en Download
SymmetricDeleteModel (Spell Checker) spellcheck_sd 2.0.2 en Download
NorvigSweetingModel (Spell Checker) spellcheck_norvig 2.0.2 en Download
ContextSpellCheckerModel (Spell Checker) spellcheck_dl 2.5.0 en Download
ViveknSentimentModel (Sentiment) sentiment_vivekn 2.0.2 en Download
DependencyParser (Dependency) dependency_conllu 2.0.8 en Download
TypedDependencyParser (Dependency) dependency_typed_conllu 2.0.8 en Download
StopWordsCleaner stopwords_en 2.5.4 en Download

Embeddings

Model Name Build Lang Offline
WordEmbeddingsModel (GloVe) glove_100d 2.4.0 en Download
BertEmbeddings electra_small_uncased 2.6.0 en Download
BertEmbeddings electra_base_uncased 2.6.0 en Download
BertEmbeddings electra_large_uncased 2.6.0 en Download
BertEmbeddings bert_base_uncased 2.6.0 en Download
BertEmbeddings bert_base_cased 2.6.0 en Download
BertEmbeddings bert_large_uncased 2.6.0 en Download
BertEmbeddings bert_large_cased 2.6.0 en Download
BertEmbeddings biobert_pubmed_base_cased 2.6.0 en Download
BertEmbeddings biobert_pubmed_large_cased 2.6.0 en Download
BertEmbeddings biobert_pmc_base_cased 2.6.0 en Download
BertEmbeddings biobert_pubmed_pmc_base_cased 2.6.0 en Download
BertEmbeddings biobert_clinical_base_cased 2.6.0 en Download
BertEmbeddings biobert_discharge_base_cased 2.6.0 en Download
BertEmbeddings covidbert_large_uncased 2.6.0 en Download
BertEmbeddings small_bert_L2_128 2.6.0 en Download
BertEmbeddings small_bert_L4_128 2.6.0 en Download
BertEmbeddings small_bert_L6_128 2.6.0 en Download
BertEmbeddings small_bert_L8_128 2.6.0 en Download
BertEmbeddings small_bert_L10_128 2.6.0 en Download
BertEmbeddings small_bert_L12_128 2.6.0 en Download
BertEmbeddings small_bert_L2_256 2.6.0 en Download
BertEmbeddings small_bert_L4_256 2.6.0 en Download
BertEmbeddings small_bert_L6_256 2.6.0 en Download
BertEmbeddings small_bert_L8_256 2.6.0 en Download
BertEmbeddings small_bert_L10_256 2.6.0 en Download
BertEmbeddings small_bert_L12_256 2.6.0 en Download
BertEmbeddings small_bert_L2_512 2.6.0 en Download
BertEmbeddings small_bert_L4_512 2.6.0 en Download
BertEmbeddings small_bert_L6_512 2.6.0 en Download
BertEmbeddings small_bert_L8_512 2.6.0 en Download
BertEmbeddings small_bert_L10_512 2.6.0 en Download
BertEmbeddings small_bert_L12_512 2.6.0 en Download
BertEmbeddings small_bert_L2_768 2.6.0 en Download
BertEmbeddings small_bert_L4_768 2.6.0 en Download
BertEmbeddings small_bert_L6_768 2.6.0 en Download
BertEmbeddings small_bert_L8_768 2.6.0 en Download
BertEmbeddings small_bert_L10_768 2.6.0 en Download
BertEmbeddings small_bert_L12_768 2.6.0 en Download
ElmoEmbeddings elmo 2.4.0 en Download
AlbertEmbeddings albert_base_uncased 2.5.0 en Download
AlbertEmbeddings albert_large_uncased 2.5.0 en Download
AlbertEmbeddings albert_xlarge_uncased 2.5.0 en Download
AlbertEmbeddings albert_xxlarge_uncased 2.5.0 en Download
XlnetEmbeddings xlnet_base_cased 2.5.0 en Download
XlnetEmbeddings xlnet_large_cased 2.5.0 en Download
UniversalSentenceEncoder (USE) tfhub_use 2.4.0 en Download
UniversalSentenceEncoder (USE) tfhub_use_lg 2.4.0 en Download
BertSentenceEmbeddings sent_electra_small_uncased 2.6.0 en Download
BertSentenceEmbeddings sent_electra_base_uncased 2.6.0 en Download
BertSentenceEmbeddings sent_electra_large_uncased 2.6.0 en Download
BertSentenceEmbeddings sent_bert_base_uncased 2.6.0 en Download
BertSentenceEmbeddings sent_bert_base_cased 2.6.0 en Download
BertSentenceEmbeddings sent_bert_large_uncased 2.6.0 en Download
BertSentenceEmbeddings sent_bert_large_cased 2.6.0 en Download
BertSentenceEmbeddings sent_biobert_pubmed_base_cased 2.6.0 en Download
BertSentenceEmbeddings sent_biobert_pubmed_large_cased 2.6.0 en Download
BertSentenceEmbeddings sent_biobert_pmc_base_cased 2.6.0 en Download
BertSentenceEmbeddings sent_biobert_pubmed_pmc_base_cased 2.6.0 en Download
BertSentenceEmbeddings sent_biobert_clinical_base_cased 2.6.0 en Download
BertSentenceEmbeddings sent_biobert_discharge_base_cased 2.6.0 en Download
BertSentenceEmbeddings sent_covidbert_large_uncased 2.6.0 en Download
BertSentenceEmbeddings sent_small_bert_L2_128 2.6.0 en Download
BertSentenceEmbeddings sent_small_bert_L4_128 2.6.0 en Download
BertSentenceEmbeddings sent_small_bert_L6_128 2.6.0 en Download
BertSentenceEmbeddings sent_small_bert_L8_128 2.6.0 en Download
BertSentenceEmbeddings sent_small_bert_L10_128 2.6.0 en Download
BertSentenceEmbeddings sent_small_bert_L12_128 2.6.0 en Download
BertSentenceEmbeddings sent_small_bert_L2_256 2.6.0 en Download
BertSentenceEmbeddings sent_small_bert_L4_256 2.6.0 en Download
BertSentenceEmbeddings sent_small_bert_L6_256 2.6.0 en Download
BertSentenceEmbeddings sent_small_bert_L8_256 2.6.0 en Download
BertSentenceEmbeddings sent_small_bert_L10_256 2.6.0 en Download
BertSentenceEmbeddings sent_small_bert_L12_256 2.6.0 en Download
BertSentenceEmbeddings sent_small_bert_L2_512 2.6.0 en Download
BertSentenceEmbeddings sent_small_bert_L4_512 2.6.0 en Download
BertSentenceEmbeddings sent_small_bert_L6_512 2.6.0 en Download
BertSentenceEmbeddings sent_small_bert_L8_512 2.6.0 en Download
BertSentenceEmbeddings sent_small_bert_L10_512 2.6.0 en Download
BertSentenceEmbeddings sent_small_bert_L12_512 2.6.0 en Download
BertSentenceEmbeddings sent_small_bert_L2_768 2.6.0 en Download
BertSentenceEmbeddings sent_small_bert_L4_768 2.6.0 en Download
BertSentenceEmbeddings sent_small_bert_L6_768 2.6.0 en Download
BertSentenceEmbeddings sent_small_bert_L8_768 2.6.0 en Download
BertSentenceEmbeddings sent_small_bert_L10_768 2.6.0 en Download
BertSentenceEmbeddings sent_small_bert_L12_768 2.6.0 en Download

Classification

Model Name Build Lang Offline
ClassifierDLModel (with tfhub_use) classifierdl_use_trec6 2.5.0 en Download
ClassifierDLModel (with tfhub_use) classifierdl_use_trec50 2.5.0 en Download
ClassifierDLModel (with tfhub_use) classifierdl_use_spam 2.5.3 en Download
ClassifierDLModel (with tfhub_use) classifierdl_use_fakenews 2.5.3 en Download
ClassifierDLModel (with tfhub_use) classifierdl_use_emotion 2.5.3 en Download
ClassifierDLModel (with tfhub_use) classifierdl_use_cyberbullying 2.5.3 en Download
ClassifierDLModel (with tfhub_use) classifierdl_use_sarcasm 2.5.3 en Download
MultiClassifierDLModel (with tfhub_use) multiclassifierdl_use_toxic 2.6.0 en Download
MultiClassifierDLModel (with tfhub_use) multiclassifierdl_use_toxic_sm 2.6.0 en Download
MultiClassifierDLModel (with tfhub_use) multiclassifierdl_use_e2e 2.6.0 en Download
SentimentDLModel (with tfhub_use) sentimentdl_use_imdb 2.5.0 en Download
SentimentDLModel (with tfhub_use) sentimentdl_use_twitter 2.5.0 en Download
SentimentDLModel (with glove_100d) sentimentdl_glove_imdb 2.5.0 en Download

Finnish - Models

Model Name Build Lang Offline
LemmatizerModel (Lemmatizer) lemma 2.5.0 fi Download
PerceptronModel (POS UD) pos_ud_tdt 2.5.0 fi Download
StopWordsCleaner stopwords_fi 2.5.4 fi Download
NerDLModel (glove_100d) wikiner_6B_100 2.6.0 fi Download
NerDLModel (glove_6B_300) wikiner_6B_300 2.6.0 fi Download
NerDLModel (glove_840B_300) wikiner_840B_300 2.6.0 fi Download
BertEmbeddings bert_finnish_cased 2.6.0 fi Download
BertEmbeddings bert_finnish_uncased 2.6.0 fi Download
BertSentenceEmbeddings sent_bert_finnish_cased 2.6.0 fi Download
BertSentenceEmbeddings sent_bert_finnish_uncased 2.6.0 fi Download

French - Models

Model Name Build Lang Offline
LemmatizerModel (Lemmatizer) lemma 2.0.2 fr Download
PerceptronModel (POS UD) pos_ud_gsd 2.0.2 fr Download
NerDLModel (glove_840B_300) wikiner_840B_300 2.0.2 fr Download
StopWordsCleaner stopwords_fr 2.5.4 fr Download
Feature Description
Lemma Trained by Lemmatizer annotator on lemmatization-lists by Michal Měchura
POS Trained by PerceptronApproach annotator on the Universal Dependencies
NER Trained by NerDLApproach annotator with Char CNNs - BiLSTM - CRF and GloVe Embeddings on the WikiNER corpus and supports the identification of PER, LOC, ORG and MISC entities

German - Models

Model Name Build Lang Offline
LemmatizerModel (Lemmatizer) lemma 2.0.8 de Download
PerceptronModel (POS UD) pos_ud_hdt 2.0.8 de Download
NerDLModel (glove_840B_300) wikiner_840B_300 2.4.0 de Download
StopWordsCleaner stopwords_de 2.5.4 de Download
Feature Description
Lemma Trained by Lemmatizer annotator on lemmatization-lists by Michal Měchura
POS Trained by PerceptronApproach annotator on the Universal Dependencies
NER Trained by NerDLApproach annotator with Char CNNs - BiLSTM - CRF and GloVe Embeddings on the WikiNER corpus and supports the identification of PER, LOC, ORG and MISC entities

Italian - Models

Model Name Build Lang Offline
LemmatizerModel (Lemmatizer) lemma_dxc 2.0.2 it Download
PerceptronModel (POS UD) pos_ud_isdt 2.0.8 it Download
NerDLModel (glove_840B_300) wikiner_840B_300 2.4.0 it Download
StopWordsCleaner stopwords_it 2.5.4 it Download
Feature Description
Lemma Trained by Lemmatizer annotator on DXC Technology dataset
POS Trained by PerceptronApproach annotator on the Universal Dependencies
NER Trained by NerDLApproach annotator with Char CNNs - BiLSTM - CRF and GloVe Embeddings on the WikiNER corpus and supports the identification of PER, LOC, ORG and MISC entities

Norwegian - Models

Model Name Build Lang Offline
LemmatizerModel (Lemmatizer) lemma 2.5.0 nb Download
PerceptronModel (POS UD) pos_ud_nynorsk 2.5.0 nn Download
PerceptronModel (POS UD) pos_ud_bokmaal 2.5.0 nb Download
NerDLModel (glove_100d) norne_6B_100 2.5.0 no Download
NerDLModel (glove_6B_300) norne_6B_300 2.5.0 no Download
NerDLModel (glove_840B_300) norne_840B_300 2.5.0 no Download

Polish - Models

Model Name Build Lang Offline
LemmatizerModel (Lemmatizer) lemma 2.5.0 pl Download
PerceptronModel (POS UD) pos_ud_lfg 2.5.0 pl Download
NerDLModel (glove_100d) wikiner_6B_100 2.5.0 pl Download
NerDLModel (glove_6B_300) wikiner_6B_300 2.5.0 pl Download
NerDLModel (glove_840B_300) wikiner_840B_300 2.5.0 pl Download
StopWordsCleaner stopwords_pl 2.5.4 pl Download

Portuguese - Models

Model Name Build Lang Offline
LemmatizerModel (Lemmatizer) lemma 2.5.0 pt Download
PerceptronModel (POS UD) pos_ud_bosque 2.5.0 pt Download
NerDLModel (glove_100d) wikiner_6B_100 2.5.0 pt Download
NerDLModel (glove_6B_300) wikiner_6B_300 2.5.0 pt Download
NerDLModel (glove_840B_300) wikiner_840B_300 2.5.0 pt Download
StopWordsCleaner stopwords_pt 2.5.4 pt Download
BertEmbeddings bert_portuguese_base_cased 2.6.0 pt Download
BertEmbeddings bert_portuguese_large_cased 2.6.0 pt Download

Russian - Models

Model Name Build Lang Offline
LemmatizerModel (Lemmatizer) lemma 2.4.4 ru Download
PerceptronModel (POS UD) pos_ud_gsd 2.4.4 ru Download
NerDLModel (glove_100d) wikiner_6B_100 2.4.4 ru Download
NerDLModel (glove_6B_300) wikiner_6B_300 2.4.4 ru Download
NerDLModel (glove_840B_300) wikiner_840B_300 2.4.4 ru Download
StopWordsCleaner stopwords_ru 2.5.4 ru Download
Feature Description
Lemma Trained by Lemmatizer annotator on the Universal Dependencies
POS Trained by PerceptronApproach annotator on the Universal Dependencies
NER Trained by NerDLApproach annotator with Char CNNs - BiLSTM - CRF and GloVe Embeddings on the WikiNER corpus and supports the identification of PER, LOC, ORG and MISC entities

Spanish - Models

Model Name Build Lang Offline
LemmatizerModel (Lemmatizer) lemma 2.4.0 es Download
PerceptronModel (POS UD) pos_ud_gsd 2.4.0 es Download
NerDLModel (glove_100d) wikiner_6B_100 2.4.0 es Download
NerDLModel (glove_6B_300) wikiner_6B_300 2.4.0 es Download
NerDLModel (glove_840B_300) wikiner_840B_300 2.4.0 es Download
StopWordsCleaner stopwords_es 2.5.4 es Download
Feature Description
Lemma Trained by Lemmatizer annotator on lemmatization-lists by Michal Měchura
POS Trained by PerceptronApproach annotator on the Universal Dependencies
NER Trained by NerDLApproach annotator with Char CNNs - BiLSTM - CRF and GloVe Embeddings on the WikiNER corpus and supports the identification of PER, LOC, ORG and MISC entities

Swedish - Models

Model Name Build Lang Offline
LemmatizerModel (Lemmatizer) lemma 2.5.0 sv Download
PerceptronModel (POS UD) pos_ud_tal 2.5.0 sv Download
StopWordsCleaner stopwords_sv 2.5.4 sv Download
NerDLModel (glove_100d) swedish_ner_6B_100 2.6.0 sv Download
NerDLModel (glove_6B_300) swedish_ner_6B_300 2.6.0 sv Download
NerDLModel (glove_840B_300) swedish_ner_840B_300 2.6.0 sv Download

Afrikaans - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_af 2.5.4 af Download

Arabic - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_ar 2.5.4 ar Download

Armenian - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_hy 2.5.4 hy Download
LemmatizerModel (Lemmatizer) lemma 2.5.5 hy Download
PerceptronModel (POS UD) pos_ud_armtdp 2.5.5 hy Download

Basque - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_eu 2.5.4 eu Download
LemmatizerModel (Lemmatizer) lemma 2.5.5 eu Download
PerceptronModel (POS UD) pos_ud_bdt 2.5.5 eu Download

Bengali - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_bn 2.5.4 bn Download

Breton - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_br 2.5.4 br Download
LemmatizerModel (Lemmatizer) lemma 2.5.5 br Download
PerceptronModel (POS UD) pos_ud_keb 2.5.5 br Download

Bulgarian - Models

Model Name Build Lang Offline
LemmatizerModel (Lemmatizer) lemma 2.5.0 bg Download
PerceptronModel (POS UD) pos_ud_btb 2.5.0 bg Download
StopWordsCleaner stopwords_bg 2.5.4 bg Download

Catalan - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_ca 2.5.4 ca Download
LemmatizerModel (Lemmatizer) lemma 2.5.5 ca Download
PerceptronModel (POS UD) pos_ud_ancora 2.5.5 ca Download

Czech - Models

Model Name Build Lang Offline
LemmatizerModel (Lemmatizer) lemma 2.5.0 cs Download
PerceptronModel (POS UD) pos_ud_pdt 2.5.0 cs Download
StopWordsCleaner stopwords_cs 2.5.4 cs Download

Esperanto - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_eo 2.5.4 eo Download

Galician - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_gl 2.5.4 gl Download
LemmatizerModel (Lemmatizer) lemma 2.5.5 gl Download
PerceptronModel (POS UD) pos_ud_treegal 2.5.5 gl Download

Greek - Models

Model Name Build Lang Offline
LemmatizerModel (Lemmatizer) lemma 2.5.0 el Download
PerceptronModel (POS UD) pos_ud_gdt 2.5.0 el Download
StopWordsCleaner stopwords_el 2.5.4 el Download

Hausa - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_ha 2.5.4 ha Download

Hebrew - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_he 2.5.4 he Download

Hindi - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_hi 2.5.4 hi Download
LemmatizerModel (Lemmatizer) lemma 2.5.5 hi Download
PerceptronModel (POS UD) pos_ud_hdtb 2.5.5 hi Download

Hungarian - Models

Model Name Build Lang Offline
LemmatizerModel (Lemmatizer) lemma 2.5.0 hu Download
PerceptronModel (POS UD) pos_ud_szeged 2.5.0 hu Download
StopWordsCleaner stopwords_hu 2.5.4 hu Download

Indonesian - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_id 2.5.4 id Download
LemmatizerModel (Lemmatizer) lemma 2.5.5 id Download
PerceptronModel (POS UD) pos_ud_gsd 2.5.5 id Download

Irish - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_ga 2.5.4 ga Download
LemmatizerModel (Lemmatizer) lemma 2.5.5 ga Download
PerceptronModel (POS UD) pos_ud_idt 2.5.5 ga Download

Japanese - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_ja 2.5.4 ja Download

Latin - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_la 2.5.4 la Download
LemmatizerModel (Lemmatizer) lemma 2.5.5 la Download
PerceptronModel (POS UD) pos_ud_llct 2.5.5 la Download

Latvian - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_lv 2.5.4 lv Download
LemmatizerModel (Lemmatizer) lemma 2.5.5 lv Download
PerceptronModel (POS UD) pos_ud_lvtb 2.5.5 lv Download

Marathi - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_mr 2.5.4 mr Download
LemmatizerModel (Lemmatizer) lemma 2.5.5 mr Download
PerceptronModel (POS UD) pos_ud_ufal 2.5.5 mr Download

Persian - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_fa 2.5.4 fa Download

Romanian - Models

Model Name Build Lang Offline
LemmatizerModel (Lemmatizer) lemma 2.5.0 ro Download
PerceptronModel (POS UD) pos_ud_rrt 2.5.0 ro Download
StopWordsCleaner stopwords_ro 2.5.4 ro Download

Slovak - Models

Model Name Build Lang Offline
LemmatizerModel (Lemmatizer) lemma 2.5.0 sk Download
PerceptronModel (POS UD) pos_ud_snk 2.5.0 sk Download
StopWordsCleaner stopwords_sk 2.5.4 sk Download

Slovenian - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_sl 2.5.4 sl Download
LemmatizerModel (Lemmatizer) lemma 2.5.5 sl Download
PerceptronModel (POS UD) pos_ud_ssj 2.5.5 sl Download

Somali - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_so 2.5.4 so Download

Southern Sotho - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_st 2.5.4 st Download

Swahili - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_sw 2.5.4 sw Download

Thai - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_th 2.5.4 th Download

Turkish - Models

Model Name Build Lang Offline
LemmatizerModel (Lemmatizer) lemma 2.5.0 tr Download
PerceptronModel (POS UD) pos_ud_imst 2.5.0 tr Download
StopWordsCleaner stopwords_tr 2.5.4 tr Download

Ukrainian - Models

Model Name Build Lang Offline
LemmatizerModel (Lemmatizer) lemma 2.5.0 uk Download
PerceptronModel (POS UD) pos_ud_iu 2.5.0 uk Download

Yoruba - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_yo 2.5.4 yo Download
LemmatizerModel (Lemmatizer) lemma 2.5.5 yo Download
PerceptronModel (POS UD) pos_ud_ytb 2.5.5 yo Download

Zulu - Models

Model Name Build Lang Offline
StopWordsCleaner stopwords_zu 2.5.4 zu Download

Multi-language

Model Name Build Lang Offline
WordEmbeddingsModel (GloVe) glove_840B_300 2.4.0 xx Download
WordEmbeddingsModel (GloVe) glove_6B_300 2.4.0 xx Download
BertEmbeddings bert_multi_cased 2.6.0 xx Download
BertSentenceEmbeddings sent_bert_multi_cased 2.6.0 xx Download
BertSentenceEmbeddings labse 2.6.0 xx Download
LanguageDetectorDL ld_wiki_7 2.5.2 xx Download
LanguageDetectorDL ld_wiki_20 2.5.2 xx Download
  • The model with 7 languages: Czech, German, English, Spanish, French, Italian, and Slovak
  • The model with 20 languages: Bulgarian, Czech, German, Greek, English, Spanish, Finnish, French, Croatian, Hungarian, Italian, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Swedish, Turkish, and Ukrainian

Licensed Enterprise

Licensed models require a third argument to the pretrained(name, lang, location) function, which specifies where these models are located.

Pretrained Models - Spark NLP For Healthcare

English Language, Clinical/Models Location

{Model}.pretrained({Name}, 'en', 'clinical/models')
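
The template above can be wrapped in a small helper so the location is written only once. This is a sketch under stated assumptions: load_clinical_model and CLINICAL_LOCATION are hypothetical names, and FakeModel is a stand-in that only records its arguments so the snippet runs without a licensed installation.

```python
CLINICAL_LOCATION = "clinical/models"


def load_clinical_model(model_class, name, lang="en"):
    """Call model_class.pretrained with the licensed location as third argument."""
    return model_class.pretrained(name, lang, CLINICAL_LOCATION)


class FakeModel:
    """Stand-in for a Spark NLP annotator class; it only records the arguments."""

    @classmethod
    def pretrained(cls, name, lang, remote_loc):
        return (name, lang, remote_loc)


print(load_clinical_model(FakeModel, "ner_clinical"))
```

With the licensed library installed and a valid license, passing a real annotator class such as NerDLModel in place of FakeModel would issue the call shown in the template.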

Model Name Build
AssertionDLModel assertion_dl_large 2.5.0 🔍 📋 💾
AssertionDLModel assertion_dl 2.4.0 🔍 📋 💾
AssertionDLModel assertion_dl_biobert 2.6.2 🔍 📋 💾
AssertionLogRegModel assertion_ml 2.4.0 🔍 📋 💾
ChunkEntityResolverModel chunkresolve_cpt_clinical 2.4.5 📋 💾
ChunkEntityResolverModel chunkresolve_icd10cm_clinical 2.4.5 🔍 📋 💾
ChunkEntityResolverModel chunkresolve_icd10cm_diseases_clinical 2.4.5 🔍 📋 💾
ChunkEntityResolverModel chunkresolve_icd10cm_injuries_clinical 2.4.5 🔍 📋 💾
ChunkEntityResolverModel chunkresolve_icd10cm_musculoskeletal_clinical 2.4.5 🔍 📋 💾
ChunkEntityResolverModel chunkresolve_icd10cm_neoplasms_clinical 2.4.5 🔍 📋 💾
ChunkEntityResolverModel chunkresolve_icd10cm_puerile_clinical 2.4.5 🔍 📋 💾
ChunkEntityResolverModel chunkresolve_icd10pcs_clinical 2.4.5 🔍 📋 💾
ChunkEntityResolverModel chunkresolve_icdo_clinical 2.4.5 🔍 📋 💾
ChunkEntityResolverModel chunkresolve_loinc_clinical 2.5.0 🔍 📋 💾
ChunkEntityResolverModel chunkresolve_rxnorm_cd_clinical 2.5.1 🔍 📋 💾
ChunkEntityResolverModel chunkresolve_rxnorm_sbd_clinical 2.5.1 🔍 📋 💾
ChunkEntityResolverModel chunkresolve_rxnorm_scd_clinical 2.5.1 🔍 📋 💾
ChunkEntityResolverModel chunkresolve_snomed_findings_clinical 2.5.1 🔍 📋 💾
ContextSpellCheckerModel spellcheck_clinical 2.4.2 📋 💾
DeIdentificationModel deidentify_rb_no_regex 2.5.0 📋 💾
DeIdentificationModel deidentify_rb 2.0.2 📋 💾
DeIdentificationModel deidentify_large 2.5.1 🔍 📋 💾
NerDLModel ner_anatomy 2.4.2 🔍 📋 💾
NerDLModel ner_bionlp 2.4.0 🔍 📋 💾
NerDLModel ner_cellular 2.4.2 🔍 📋 💾
NerDLModel ner_clinical_large 2.5.0 🔍 📋 💾
NerDLModel ner_clinical 2.4.0 🔍 📋 💾
NerDLModel ner_deid_enriched 2.5.3 🔍 📋 💾
NerDLModel ner_deid_large 2.5.3 🔍 📋 💾
NerDLModel ner_diseases 2.4.4 🔍 📋 💾
NerDLModel ner_drugs 2.4.4 🔍 📋 💾
NerDLModel ner_events_clinical 2.5.5 🔍 📋 💾
NerDLModel ner_healthcare 2.4.4 🔍 📋 💾
NerDLModel ner_jsl_enriched 2.4.2 🔍 📋 💾
NerDLModel ner_jsl 2.4.2 🔍 📋 💾
NerDLModel ner_medmentions_coarse 2.5.0 📋 💾
NerDLModel ner_posology_large 2.4.2 🔍 📋 💾
NerDLModel ner_posology_small 2.4.2 🔍 📋 💾
NerDLModel ner_posology 2.4.4 🔍 📋 💾
NerDLModel ner_risk_factors 2.4.2 🔍 📋 💾
NerDLModel ner_human_phenotype_go_clinical 2.6.0 🔍 📋 💾
NerDLModel ner_human_phenotype_gene_clinical 2.6.0 🔍 📋 💾
NerDLModel ner_chemprot_clinical 2.6.0 🔍 📋 💾
NerDLModel ner_ade_clinical 2.6.2 🔍 📋 💾
NerDLModel ner_ade_healthcare 2.6.2 🔍 📋 💾
NerDLModel ner_ade_biobert 2.6.2 🔍 📋 💾
NerDLModel ner_ade_clinicalbert 2.6.2 🔍 📋 💾
ClassifierDLModel classifierdl_ade_biobert 2.6.2 🔍 📋 💾
ClassifierDLModel classifierdl_ade_conversational_biobert 2.6.2 🔍 📋 💾
ClassifierDLModel classifierdl_ade_clinicalbert 2.6.2 🔍 📋 💾
PerceptronModel pos_clinical 2.0.2 📋 💾
RelationExtractionModel re_clinical 2.5.5 🔍 📋 💾
RelationExtractionModel re_posology 2.5.5 🔍 📋
RelationExtractionModel re_temporal_events_clinical 2.6.0 🔍 📋 💾
RelationExtractionModel re_temporal_events_enriched_clinical 2.6.0 🔍 📋 💾
RelationExtractionModel re_human_phenotype_gene_clinical 2.6.0 🔍 📋 💾
RelationExtractionModel re_drug_drug_interaction_clinical 2.6.0 🔍 📋 💾
RelationExtractionModel re_chemprot_clinical 2.6.0 🔍 📋 💾
TextMatcherModel textmatch_cpt_token 2.4.5 📋 💾
TextMatcherModel textmatch_icdo_ner 2.4.5 📋 💾
WordEmbeddingsModel embeddings_clinical 2.4.0 📋 💾
WordEmbeddingsModel embeddings_healthcare_100d 2.5.0 📋 💾
WordEmbeddingsModel embeddings_healthcare 2.4.4 📋 💾

Spanish Language, Clinical/Models Location

{Model}.pretrained({Name}, 'es', 'clinical/models')

Model Name Build
NerDLModel ner_diag_proc 2.5.3 🔍 📋 💾
NerDLModel ner_neoplasms 2.5.3 🔍 📋 💾
WordEmbeddingsModel embeddings_scielo_150d 2.5.0 📋 💾
WordEmbeddingsModel embeddings_scielo_300d 2.5.0 📋 💾
WordEmbeddingsModel embeddings_scielo_50d 2.5.0 📋 💾
WordEmbeddingsModel embeddings_scielowiki_150d 2.5.0 📋 💾
WordEmbeddingsModel embeddings_scielowiki_300d 2.5.0 📋 💾
WordEmbeddingsModel embeddings_scielowiki_50d 2.5.0 📋 💾
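
As a minimal sketch of how the `{Model}.pretrained({Name}, 'es', 'clinical/models')` template might be filled in, the Python below loads one embeddings model and one NER model from the table. The helper names `pretrained_args` and `load_spanish_ner_stages` are hypothetical, the column names are illustrative, and pairing `ner_diag_proc` with `embeddings_scielowiki_300d` is an assumption (check the model's own documentation for the embeddings it expects). It also assumes an environment where Spark NLP is installed and the `clinical/models` location is accessible.

```python
def pretrained_args(name, lang="es", remote_loc="clinical/models"):
    """Return the positional arguments for {Model}.pretrained(...).

    Hypothetical helper: it only packages the three values used by the
    template, so it can be checked without a Spark session.
    """
    return (name, lang, remote_loc)


def load_spanish_ner_stages():
    """Load a Spanish embeddings model and a NER model (not called here).

    Requires Spark NLP and a running session with access to the
    'clinical/models' location; the imports are kept local for that reason.
    """
    from sparknlp.annotator import NerDLModel, WordEmbeddingsModel

    # Assumption: these embeddings match what the NER model was trained on.
    embeddings = (
        WordEmbeddingsModel.pretrained(*pretrained_args("embeddings_scielowiki_300d"))
        .setInputCols(["sentence", "token"])
        .setOutputCol("embeddings")
    )
    ner = (
        NerDLModel.pretrained(*pretrained_args("ner_diag_proc"))
        .setInputCols(["sentence", "token", "embeddings"])
        .setOutputCol("ner")
    )
    return embeddings, ner
```

The two returned stages would typically be placed in a Spark ML `Pipeline` after a document assembler, sentence detector and tokenizer.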

Pretrained Healthcare Pipelines

PretrainedPipeline({Name}, 'en', 'clinical/models')

Pipeline Name Build lang Description Offline
Explain Clinical Document (type-1) explain_clinical_doc_carp 2.6.0 en a pipeline with ner_clinical, assertion_dl, re_clinical and ner_posology. It will extract clinical and medication entities, assign assertion status and find relationships between clinical entities. Download
Explain Clinical Document (type-2) explain_clinical_doc_era 2.6.0 en a pipeline with ner_clinical_events, assertion_dl and re_temporal_events_clinical. It will extract clinical entities, assign assertion status and find temporal relationships between clinical entities. Download
Explain Clinical Document (type-3) recognize_entities_posology 2.6.0 en a pipeline with ner_posology. It will only extract medication entities. Download
Explain Clinical Document (type-4) explain_clinical_doc_ade 2.6.2 en a pipeline for Adverse Drug Events (ADE) with ner_ade_biobert, assertiondl_biobert and classifierdl_ade_conversational_biobert. It will extract ADE and DRUG clinical entities, assign assertion status to ADE entities, and then assign an ADE status to the text (True means ADE, False means not related to ADE). Download
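
The `PretrainedPipeline({Name}, 'en', 'clinical/models')` template can be sketched in Python as follows. The names `CLINICAL_PIPELINES`, `pipeline_args` and `load_clinical_pipeline` are hypothetical helpers introduced for illustration, and the sketch assumes Spark NLP is installed and a session with access to the `clinical/models` location has already been started.

```python
# Pipeline names and builds copied from the table above.
CLINICAL_PIPELINES = {
    "explain_clinical_doc_carp": "2.6.0",
    "explain_clinical_doc_era": "2.6.0",
    "recognize_entities_posology": "2.6.0",
    "explain_clinical_doc_ade": "2.6.2",
}


def pipeline_args(name, lang="en", remote_loc="clinical/models"):
    """Build keyword arguments for PretrainedPipeline (hypothetical helper)."""
    if name not in CLINICAL_PIPELINES:
        raise KeyError(f"unknown clinical pipeline: {name}")
    return {"name": name, "lang": lang, "remote_loc": remote_loc}


def load_clinical_pipeline(name):
    """Download a healthcare pipeline by name (not called here).

    The import is local because it needs Spark NLP and a started session.
    """
    from sparknlp.pretrained import PretrainedPipeline

    return PretrainedPipeline(**pipeline_args(name))
```

With a session running, `load_clinical_pipeline("recognize_entities_posology").annotate(text)` would return a dictionary of annotations, in the same way as the `explain_document_dl` example earlier in this README.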

German Models

Model Name Build lang Offline
NER Healthcare ner_healthcare 2.6.0 de Download
Entity Resolver ICD10GM chunkresolve_ICD10GM 2.6.0 de Download
WordEmbeddings w2v_cc_300d 2.6.0 de Download
NER Legal ner_legal 2.6.0 de Download

Contact

[email protected]

John Snow Labs

https://johnsnowlabs.com
