soniacq / textexplorer
License: BSD 3-Clause "New" or "Revised" License
A TypeError is raised in internal code that invokes nlp.pipe from the spaCy library. The error occurs at this line:
for idx, doc in enumerate(nlp.pipe(texts, n_threads=16, batch_size=100)):
Removing n_threads=16 seems to make it work with the spaCy version that I'm using.
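If the call needs to keep working on both older and newer spaCy releases, one option is to drop any keyword argument that the installed pipe() does not accept before calling it. The following is only a minimal sketch of that idea, not the project's code: pipe_compat and fake_pipe are hypothetical names introduced here, and fake_pipe merely stands in for a pipe() without an n_threads parameter so the snippet runs without spaCy installed. With spaCy, nlp.pipe would presumably be passed as pipe_fn.

```python
import inspect


def pipe_compat(pipe_fn, texts, **kwargs):
    """Call pipe_fn(texts, **kwargs), dropping keyword arguments it does not accept.

    Hypothetical helper: it inspects the signature of pipe_fn and silently
    discards unsupported keywords such as n_threads.
    """
    params = inspect.signature(pipe_fn).parameters
    accepts_any_kwarg = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if not accepts_any_kwarg:
        # Keep only the keywords that appear in the function's signature.
        kwargs = {k: v for k, v in kwargs.items() if k in params}
    return pipe_fn(texts, **kwargs)


def fake_pipe(texts, batch_size=1000):
    """Stand-in for a pipe() that has no n_threads parameter (assumption for the demo)."""
    for text in texts:
        yield text.upper()


# n_threads is dropped because fake_pipe does not accept it; batch_size is kept.
docs = list(pipe_compat(fake_pipe, ["good food", "slow service"],
                        n_threads=16, batch_size=100))
```

Under that assumption, the loop in get_entities_frequency could become for idx, doc in enumerate(pipe_compat(nlp.pipe, texts, n_threads=16, batch_size=100)):, although simply removing n_threads=16, as described above, is the smaller change.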
VisualTextAnalyzer.plot_text_summary(yelp_data, category_column='category', text_column='comments')
Word Frequency:
Analyzing 69 documents (positive category)
Analyzing 65 documents (negative category)
Named Entity Recognition:
Analyzing 69 documents (positive category)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-6-2065f93d9da3> in <module>
----> 1 VisualTextAnalyzer.plot_text_summary(yelp_data, category_column='category', text_column='comments')
~/workspace/nyu/d3m/piracy-demo/TextExplorer/VisualTextAnalyzer/_data_preprocessing.py in plot_text_summary(data, category_column, text_column, positive_label, negative_label, words_entities)
343 processed_data = {}
344 if words_entities is None:
--> 345 processed_data = get_words_entities(data,category_column, text_column, positive_label, negative_label)
346 global_processed_data = processed_data
347 else:
~/workspace/nyu/d3m/piracy-demo/TextExplorer/VisualTextAnalyzer/_data_preprocessing.py in get_words_entities(data, category_column, text_column, positive_label, negative_label)
261 processed_data["words"] = get_words (positive_texts, negative_texts, labels)
262 print('Named Entity Recognition:')
--> 263 processed_data["entities"] = get_entities (positive_texts, negative_texts, labels)
264 raw_text = {}
265 raw_text['positive_texts'] = positive_texts
~/workspace/nyu/d3m/piracy-demo/TextExplorer/VisualTextAnalyzer/_data_preprocessing.py in get_entities(positive_texts, negative_texts, labels)
219
220 def get_entities (positive_texts, negative_texts, labels):
--> 221 positive_entities = get_entities_frequency(positive_texts, labels['pos'])
222 negative_entities = get_entities_frequency(negative_texts, labels['neg'])
223
~/workspace/nyu/d3m/piracy-demo/TextExplorer/VisualTextAnalyzer/_data_preprocessing.py in get_entities_frequency(texts, label)
191 alias = {'ORG':'ORGANIZATION', 'LOC':'PLACE', 'GPE':'CITY/COUNTRY', 'NORP':'GROUP', 'FAC':'BUILDING'}
192 unique_entities = {}
--> 193 for idx, doc in enumerate(nlp.pipe(texts, n_threads=16, batch_size=100)):
194 for entity in doc.ents:
195 if entity.label_ in {'CARDINAL', 'ORDINAL', 'QUANTITY'}:
TypeError: pipe() got an unexpected keyword argument 'n_threads'
spaCy version:
$ pip show spacy
Name: spacy
Version: 3.0.3
Summary: Industrial-strength Natural Language Processing (NLP) in Python
Home-page: https://spacy.io
Author: Explosion
Author-email: [email protected]
License: MIT
Location: ~/miniconda2/envs/myenv/lib/python3.6/site-packages
Requires: preshed, tqdm, typer, pathy, srsly, requests, importlib-metadata, murmurhash, cymem, thinc, setuptools, pydantic, jinja2, packaging, spacy-legacy, typing-extensions, wasabi, numpy, catalogue, blis
Required-by: en-core-web-sm, text-labeling, visual-text-explorer