
dostoevsky's People

Contributors

avbelyaev, dveselov, fossabot, pyup-bot

dostoevsky's Issues

install problem

Hello.
I tried pip install dostoevsky and got a lot of red text during installation. I copied the end of it:
ERROR: Command errored out with exit status 1: 'C:\Users\afecn\anaconda3.1\envs\ml\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\afecn\AppData\Local\Temp\pip-install-onr9g1p0\fasttext\setup.py'"'"'; file='"'"'C:\Users\afecn\AppData\Local\Temp\pip-install-onr9g1p0\fasttext\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record 'C:\Users\afecn\AppData\Local\Temp\pip-record-t7yp1hx1\install-record.txt' --single-version-externally-managed --compile --install-headers 'C:\Users\afecn\anaconda3.1\envs\ml\Include\fasttext' Check the logs for full command output.

Has anyone run into a similar problem? Thanks.

StopIteration

Hi! I applied your sentiment model to a DataFrame column. At first everything worked fine, but a few minutes ago I got RuntimeError: generator raised StopIteration. Do you have any idea why this happens and how to fix it? Thank you in advance.

My input is:
model = FastTextSocialNetworkModel(tokenizer=tokenizer)
df_clean['sentiment'] = df_clean['prep_text'].apply(model.predict)
df_clean

What I get is:

StopIteration Traceback (most recent call last)
~/anaconda3/lib/python3.7/site-packages/razdel/segmenters/tokenize.py in segment(self, parts)
299 def segment(self, parts):
--> 300 buffer = next(parts)
301 for split in parts:

StopIteration:

The above exception was the direct cause of the following exception:

RuntimeError Traceback (most recent call last)
in
8 model = FastTextSocialNetworkModel(tokenizer=tokenizer)
9
---> 10 df_clean ['sentiment'] = df_clean ['prep_text'].apply(model.predict)
11 df_clean

~/anaconda3/lib/python3.7/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
3846 else:
3847 values = self.astype(object).values
-> 3848 mapped = lib.map_infer(values, f, convert=convert_dtype)
3849
3850 if len(mapped) and isinstance(mapped[0], Series):

pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()

~/anaconda3/lib/python3.7/site-packages/dostoevsky/models.py in predict(self, sentences, k)
82 Dict[str, float]
83 ]:
---> 84 X = self.preprocess_input(sentences)
85 Y = (
86 self.model.predict(sentence, k=k) for sentence in X

~/anaconda3/lib/python3.7/site-packages/dostoevsky/models.py in preprocess_input(self, sentences)
76 )
77 )
---> 78 for sentence in sentences
79 ]
80

~/anaconda3/lib/python3.7/site-packages/dostoevsky/models.py in (.0)
76 )
77 )
---> 78 for sentence in sentences
79 ]
80

~/anaconda3/lib/python3.7/site-packages/dostoevsky/tokenization.py in split(self, text, lemmatize)
37 ]:
38 return [
---> 39 (token.text.lower(), None) for token in regex_tokenize(text)
40 ]
41

~/anaconda3/lib/python3.7/site-packages/dostoevsky/tokenization.py in (.0)
37 ]:
38 return [
---> 39 (token.text.lower(), None) for token in regex_tokenize(text)
40 ]
41

~/anaconda3/lib/python3.7/site-packages/razdel/substring.py in find_substrings(chunks, text)
16 def find_substrings(chunks, text):
17 offset = 0
---> 18 for chunk in chunks:
19 start = text.find(chunk, offset)
20 stop = start + len(chunk)

RuntimeError: generator raised StopIteration
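
One possible reading of the traceback, offered as a guess rather than a diagnosis: predict loops over its sentences argument, so Series.apply(model.predict) hands it a single string and each character may be treated as a separate sentence. A blank or whitespace-only item could then leave the tokenizer with nothing to yield. The sketch below passes the whole column as one list and skips blank rows. predict_column is a hypothetical helper, and StubModel is only a stand-in that mimics the predict(sentences, k) call shape seen in the traceback.

```python
from typing import Dict, List, Optional


def predict_column(model, texts: List[object], k: int = 2) -> List[Optional[Dict[str, float]]]:
    """Call model.predict once on the non-blank strings; other rows map to None."""
    # Remember the positions of usable rows so results can be put back in order.
    keep = [i for i, text in enumerate(texts) if isinstance(text, str) and text.strip()]
    results: List[Optional[Dict[str, float]]] = [None] * len(texts)
    if keep:
        predictions = model.predict([texts[i] for i in keep], k=k)
        for i, prediction in zip(keep, predictions):
            results[i] = prediction
    return results


class StubModel:
    """Stand-in for illustration only; not the library's model class."""

    def predict(self, sentences, k=2):
        return [{"neutral": 1.0} for _ in sentences]
```

With a real model, the call would look like df_clean['sentiment'] = predict_column(model, df_clean['prep_text'].tolist()).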

Implement learning pipeline in subpackage

The current fastText models were trained manually, but it would be great if the library could be retrained with different hyperparameters or on a different dataset with as little code as possible, or even automatically.

_lzma issue when trying to download models

Got

  File "/usr/local/lib/python3.6/lzma.py", line 27, in <module>
    from _lzma import *
ModuleNotFoundError: No module named '_lzma'

when using dostoevsky download fasttext-social-network-model
or dostoevsky download vk-embeddings cnn-social-network-model on Ubuntu 16.04, Python 3.6

Found info:

It looks like Ubuntu does not include a Python 3 version of python-lzma. It might be more practical to use a different compression method for now.

Installing lzma via pip install pylzma and sudo apt-get install liblzma-dev did not help
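
A quick way to confirm what the interpreter itself supports: _lzma is a compiled part of the standard library, built together with Python, so it is likely that a pip package cannot supply it and that the interpreter has to be rebuilt after the liblzma headers are installed. The check below is a minimal sketch; lzma_available is a hypothetical helper name.

```python
import importlib


def lzma_available() -> bool:
    """Return True if this interpreter was built with the _lzma extension."""
    try:
        importlib.import_module("_lzma")
    except ImportError:
        return False
    return True


print("lzma support:", lzma_available())
```

If this prints False under /usr/local/lib/python3.6, the same ModuleNotFoundError is to be expected from any package that imports lzma.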

Remove non-fasttext code and dependencies

  • Leave only fasttext and razdel as primary dependencies.
  • Add a separate dependency list, tests.txt, containing e.g. pytest.
  • Move the tests package outside of the main package.

the model doesn't return a WordVectorModel or SupervisedModel anymore?

Warning : load_model does not return WordVectorModel or SupervisedModel any more, but a FastText object which is very similar.

I'm trying to use this in Colab and it appears that everything loads correctly, but when I work through the given example with the three simple phrases, it fails, saying that:

ValueError: /usr/local/lib/python3.7/dist-packages/dostoevsky/data/models/fasttext-social-network-model.bin cannot be opened for loading!

I'm not sure what these messages mean or how to proceed. Do you have any advice?

ImportError: cannot import name 'RegexTokenizer'

I got this error when trying to run the example code.
Even if I remove RegexTokenizer from the import and set tokenizer = UDBaselineTokenizer(), I get the following error:

(vp36) Dannys-MBP:dost marzique$ python dost.py
Traceback (most recent call last):
  File "dost.py", line 2, in <module>
    from dostoevsky.embeddings import SocialNetworkEmbeddings
ModuleNotFoundError: No module named 'dostoevsky.embeddings'

please help

Improve model performance

We can create a large automatically annotated corpus, as was done with natasha/nerus, and train a fasttext classifier on it.
It looks like the deeppavlov BERT model has better accuracy than ours, and it was trained on rusentiment too, so we can use it for knowledge distillation.

ModuleNotFoundError: No module named '_lzma'

Why can't dostoevsky find the lzma module?

Traceback (most recent call last):
  File "http_wrapper.py", line 202, in <module>
    from sentiment import Sentiment
  File "/local/opt/pksa/http_wrapper/sentiment.py", line 7, in <module>
    from dostoevsky.models import FastTextSocialNetworkModel
  File "/local/opt/pksa/env/lib/python3.6/site-packages/dostoevsky/models.py", line 9, in <module>
    from dostoevsky.data import DATA_BASE_PATH
  File "/local/opt/pksa/env/lib/python3.6/site-packages/dostoevsky/data/__init__.py", line 2, in <module>
    import lzma
  File "/usr/local/lib/python3.6/lzma.py", line 27, in <module>
    from _lzma import *
ModuleNotFoundError: No module named '_lzma'

fasttext-social-network-model license?

According to this repo's README, fasttext-social-network-model was trained on the rusentiment data, and the rusentiment dataset is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Could you please share the license of fasttext-social-network-model? Is it MIT, like the code?

FileNotFoundError

pip3 install dostoevsky
then
dostoevsky download vk-embeddings cnn-social-network-model
which led to this error:

Traceback (most recent call last):
  File "/home/tarn/.local/bin/dostoevsky", line 22, in <module>
    downloader.download(source=source, destination=destination)
  File "/home/tarn/.local/lib/python3.6/site-packages/dostoevsky/data/__init__.py", line 29, in download
    with open(destination_path, 'wb') as output:
FileNotFoundError: [Errno 2] No such file or directory: '/home/tarn/.local/lib/python3.6/site-packages/dostoevsky/data/embeddings/vk-min-100-300d-none.tar.xz'

Ubuntu 18.04.2 LTS
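
The traceback suggests that open(destination_path, 'wb') failed because the parent directory (here data/embeddings) did not exist yet. A minimal sketch of the usual guard follows; open_for_download is a hypothetical helper, not the library's code.

```python
import os
import tempfile


def open_for_download(destination_path: str):
    """Create the parent directory if needed, then open the file for binary writing."""
    parent = os.path.dirname(destination_path)
    if parent:
        # exist_ok=True makes this a no-op when the directory is already there.
        os.makedirs(parent, exist_ok=True)
    return open(destination_path, "wb")


# Demonstration in a throwaway directory; the file name is a placeholder.
with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "embeddings", "example.tar.xz")
    with open_for_download(target) as output:
        output.write(b"placeholder")
    print(os.path.exists(target))
```

As a manual workaround under the same assumption, creating the missing directory by hand before re-running the download may be enough.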

Model source code?

@dveselov, @annargrs, the dataset has been removed at VK's request, but what about the model's source code? After reading the article I thought it was available... Thanks!

Encountered error while trying to install package fasttext

Collecting dostoevsky
  Using cached dostoevsky-0.6.0-py2.py3-none-any.whl (8.5 kB)
Requirement already satisfied: razdel==0.5.0 in c:\users\joker\appdata\local\programs\python\python310\lib\site-packages (from dostoevsky) (0.5.0)
Collecting fasttext==0.9.2
  Using cached fasttext-0.9.2.tar.gz (68 kB)
  Preparing metadata (setup.py) ... done
Requirement already satisfied: pybind11>=2.2 in c:\users\joker\appdata\local\programs\python\python310\lib\site-packages (from fasttext==0.9.2->dostoevsky) (2.9.1)
Requirement already satisfied: setuptools>=0.7.0 in c:\users\joker\appdata\local\programs\python\python310\lib\site-packages (from fasttext==0.9.2->dostoevsky) (60.9.3)
Requirement already satisfied: numpy in c:\users\joker\appdata\local\programs\python\python310\lib\site-packages (from fasttext==0.9.2->dostoevsky) (1.22.2)
Building wheels for collected packages: fasttext
  Building wheel for fasttext (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [52 lines of output]
      C:\Users\joker\AppData\Local\Programs\Python\Python310\lib\site-packages\setuptools\dist.py:738: UserWarning: Usage of dash-separated 'description-file' will not be supported in future versions. Please use the underscore name 'description_file' instead
        warnings.warn(
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build\lib.win-amd64-3.10
      creating build\lib.win-amd64-3.10\fasttext
      copying python\fasttext_module\fasttext\FastText.py -> build\lib.win-amd64-3.10\fasttext
      copying python\fasttext_module\fasttext\__init__.py -> build\lib.win-amd64-3.10\fasttext
      creating build\lib.win-amd64-3.10\fasttext\util
      copying python\fasttext_module\fasttext\util\util.py -> build\lib.win-amd64-3.10\fasttext\util
      copying python\fasttext_module\fasttext\util\__init__.py -> build\lib.win-amd64-3.10\fasttext\util
      creating build\lib.win-amd64-3.10\fasttext\tests
      copying python\fasttext_module\fasttext\tests\test_configurations.py -> build\lib.win-amd64-3.10\fasttext\tests
      copying python\fasttext_module\fasttext\tests\test_script.py -> build\lib.win-amd64-3.10\fasttext\tests
      copying python\fasttext_module\fasttext\tests\__init__.py -> build\lib.win-amd64-3.10\fasttext\tests
      running build_ext
      building 'fasttext_pybind' extension
      creating build\temp.win-amd64-3.10
      creating build\temp.win-amd64-3.10\Release
      creating build\temp.win-amd64-3.10\Release\python
      creating build\temp.win-amd64-3.10\Release\python\fasttext_module
      creating build\temp.win-amd64-3.10\Release\python\fasttext_module\fasttext
      creating build\temp.win-amd64-3.10\Release\python\fasttext_module\fasttext\pybind
      creating build\temp.win-amd64-3.10\Release\src
      "C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -IC:\Users\joker\AppData\Local\Programs\Python\Python310\lib\site-packages\pybind11\include -IC:\Users\joker\AppData\Local\Programs\Python\Python310\lib\site-packages\pybind11\include -Isrc -IC:\Users\joker\AppData\Local\Programs\Python\Python310\include -IC:\Users\joker\AppData\Local\Programs\Python\Python310\Include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" /EHsc /Tppython/fasttext_module/fasttext/pybind/fasttext_pybind.cc /Fobuild\temp.win-amd64-3.10\Release\python/fasttext_module/fasttext/pybind/fasttext_pybind.obj /EHsc /DVERSION_INFO=\\\"0.9.2\\\"
      fasttext_pybind.cc
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(171): error C2065: ssize_t: undeclared identifier
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(171): error C2672: "pybind11::init": no matching overloaded function found
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(170): error C2974: pybind11::init: invalid template argument for "CFunc", type expected
      C:\Users\joker\AppData\Local\Programs\Python\Python310\lib\site-packages\pybind11\include\pybind11\pybind11.h(1702): note: see declaration of "pybind11::init"
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(170): error C2974: pybind11::init: invalid template argument for "Func", type expected
      C:\Users\joker\AppData\Local\Programs\Python\Python310\lib\site-packages\pybind11\include\pybind11\pybind11.h(1697): note: see declaration of "pybind11::init"
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(170): error C2974: pybind11::init: invalid template argument for "Args", type expected
      C:\Users\joker\AppData\Local\Programs\Python\Python310\lib\site-packages\pybind11\include\pybind11\pybind11.h(1690): note: see declaration of "pybind11::init"
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(171): error C2672: "pybind11::class_<fasttext::Vector>::def": no matching overloaded function found
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(170): error C2780: pybind11::class_<fasttext::Vector> &pybind11::class_<fasttext::Vector>::def(const char *,Func &&,const Extra &...): expects 3 arguments, 1 provided
      C:\Users\joker\AppData\Local\Programs\Python\Python310\lib\site-packages\pybind11\include\pybind11\pybind11.h(1416): note: see declaration of "pybind11::class_<fasttext::Vector>::def"
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(185): error C2065: ssize_t: undeclared identifier
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(185): error C2065: ssize_t: undeclared identifier
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(185): error C2672: "pybind11::init": no matching overloaded function found
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(182): error C2974: pybind11::init: invalid template argument for "CFunc", type expected
      C:\Users\joker\AppData\Local\Programs\Python\Python310\lib\site-packages\pybind11\include\pybind11\pybind11.h(1702): note: see declaration of "pybind11::init"
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(182): error C2974: pybind11::init: invalid template argument for "Func", type expected
      C:\Users\joker\AppData\Local\Programs\Python\Python310\lib\site-packages\pybind11\include\pybind11\pybind11.h(1697): note: see declaration of "pybind11::init"
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(182): error C2974: pybind11::init: invalid template argument for "Args", type expected
      C:\Users\joker\AppData\Local\Programs\Python\Python310\lib\site-packages\pybind11\include\pybind11\pybind11.h(1690): note: see declaration of "pybind11::init"
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(185): error C2672: "pybind11::class_<fasttext::DenseMatrix>::def": no matching overloaded function found
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(182): error C2780: pybind11::class_<fasttext::DenseMatrix> &pybind11::class_<fasttext::DenseMatrix>::def(const char *,Func &&,const Extra &...): expects 3 arguments, 1 provided
      C:\Users\joker\AppData\Local\Programs\Python\Python310\lib\site-packages\pybind11\include\pybind11\pybind11.h(1416): note: see declaration of "pybind11::class_<fasttext::DenseMatrix>::def"
      error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\BuildTools\\VC\\Tools\\MSVC\\14.29.30133\\bin\\HostX86\\x64\\cl.exe' failed with exit code 2
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for fasttext
  Running setup.py clean for fasttext
Failed to build fasttext
Installing collected packages: fasttext, dostoevsky
  Running setup.py install for fasttext ... error
  error: subprocess-exited-with-error

  × Running setup.py install for fasttext did not run successfully.
  │ exit code: 1
  ╰─> [54 lines of output]
      C:\Users\joker\AppData\Local\Programs\Python\Python310\lib\site-packages\setuptools\dist.py:738: UserWarning: Usage of dash-separated 'description-file' will not be supported in future versions. Please use the underscore name 'description_file' instead
        warnings.warn(
      running install
      C:\Users\joker\AppData\Local\Programs\Python\Python310\lib\site-packages\setuptools\command\install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
        warnings.warn(
      running build
      running build_py
      creating build
      creating build\lib.win-amd64-3.10
      creating build\lib.win-amd64-3.10\fasttext
      copying python\fasttext_module\fasttext\FastText.py -> build\lib.win-amd64-3.10\fasttext
      copying python\fasttext_module\fasttext\__init__.py -> build\lib.win-amd64-3.10\fasttext
      creating build\lib.win-amd64-3.10\fasttext\util
      copying python\fasttext_module\fasttext\util\util.py -> build\lib.win-amd64-3.10\fasttext\util
      copying python\fasttext_module\fasttext\util\__init__.py -> build\lib.win-amd64-3.10\fasttext\util
      creating build\lib.win-amd64-3.10\fasttext\tests
      copying python\fasttext_module\fasttext\tests\test_configurations.py -> build\lib.win-amd64-3.10\fasttext\tests
      copying python\fasttext_module\fasttext\tests\test_script.py -> build\lib.win-amd64-3.10\fasttext\tests
      copying python\fasttext_module\fasttext\tests\__init__.py -> build\lib.win-amd64-3.10\fasttext\tests
      running build_ext
      building 'fasttext_pybind' extension
      creating build\temp.win-amd64-3.10
      creating build\temp.win-amd64-3.10\Release
      creating build\temp.win-amd64-3.10\Release\python
      creating build\temp.win-amd64-3.10\Release\python\fasttext_module
      creating build\temp.win-amd64-3.10\Release\python\fasttext_module\fasttext
      creating build\temp.win-amd64-3.10\Release\python\fasttext_module\fasttext\pybind
      creating build\temp.win-amd64-3.10\Release\src
      "C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -IC:\Users\joker\AppData\Local\Programs\Python\Python310\lib\site-packages\pybind11\include -IC:\Users\joker\AppData\Local\Programs\Python\Python310\lib\site-packages\pybind11\include -Isrc -IC:\Users\joker\AppData\Local\Programs\Python\Python310\include -IC:\Users\joker\AppData\Local\Programs\Python\Python310\Include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" /EHsc /Tppython/fasttext_module/fasttext/pybind/fasttext_pybind.cc /Fobuild\temp.win-amd64-3.10\Release\python/fasttext_module/fasttext/pybind/fasttext_pybind.obj /EHsc /DVERSION_INFO=\\\"0.9.2\\\"
      fasttext_pybind.cc
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(171): error C2065: ssize_t: undeclared identifier
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(171): error C2672: "pybind11::init": no matching overloaded function found
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(170): error C2974: pybind11::init: invalid template argument for "CFunc", type expected
      C:\Users\joker\AppData\Local\Programs\Python\Python310\lib\site-packages\pybind11\include\pybind11\pybind11.h(1702): note: see declaration of "pybind11::init"
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(170): error C2974: pybind11::init: invalid template argument for "Func", type expected
      C:\Users\joker\AppData\Local\Programs\Python\Python310\lib\site-packages\pybind11\include\pybind11\pybind11.h(1697): note: see declaration of "pybind11::init"
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(170): error C2974: pybind11::init: invalid template argument for "Args", type expected
      C:\Users\joker\AppData\Local\Programs\Python\Python310\lib\site-packages\pybind11\include\pybind11\pybind11.h(1690): note: see declaration of "pybind11::init"
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(171): error C2672: "pybind11::class_<fasttext::Vector>::def": no matching overloaded function found
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(170): error C2780: pybind11::class_<fasttext::Vector> &pybind11::class_<fasttext::Vector>::def(const char *,Func &&,const Extra &...): expects 3 arguments, 1 provided
      C:\Users\joker\AppData\Local\Programs\Python\Python310\lib\site-packages\pybind11\include\pybind11\pybind11.h(1416): note: see declaration of "pybind11::class_<fasttext::Vector>::def"
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(185): error C2065: ssize_t: undeclared identifier
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(185): error C2065: ssize_t: undeclared identifier
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(185): error C2672: "pybind11::init": no matching overloaded function found
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(182): error C2974: pybind11::init: invalid template argument for "CFunc", type expected
      C:\Users\joker\AppData\Local\Programs\Python\Python310\lib\site-packages\pybind11\include\pybind11\pybind11.h(1702): note: see declaration of "pybind11::init"
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(182): error C2974: pybind11::init: invalid template argument for "Func", type expected
      C:\Users\joker\AppData\Local\Programs\Python\Python310\lib\site-packages\pybind11\include\pybind11\pybind11.h(1697): note: see declaration of "pybind11::init"
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(182): error C2974: pybind11::init: invalid template argument for "Args", type expected
      C:\Users\joker\AppData\Local\Programs\Python\Python310\lib\site-packages\pybind11\include\pybind11\pybind11.h(1690): note: see declaration of "pybind11::init"
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(185): error C2672: "pybind11::class_<fasttext::DenseMatrix>::def": no matching overloaded function found
      python/fasttext_module/fasttext/pybind/fasttext_pybind.cc(182): error C2780: pybind11::class_<fasttext::DenseMatrix> &pybind11::class_<fasttext::DenseMatrix>::def(const char *,Func &&,const Extra &...): expects 3 arguments, 1 provided
      C:\Users\joker\AppData\Local\Programs\Python\Python310\lib\site-packages\pybind11\include\pybind11\pybind11.h(1416): note: see declaration of "pybind11::class_<fasttext::DenseMatrix>::def"
      error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\BuildTools\\VC\\Tools\\MSVC\\14.29.30133\\bin\\HostX86\\x64\\cl.exe' failed with exit code 2
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> fasttext

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.

Could someone please help me out?

Broken FastText Social Network Model download

I have been unable to download the models for 2 days now. I get this error message:

Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.8.1/x64/bin/dostoevsky", line 22, in <module>
    downloader.download(source=source, destination=destination)
  File "/opt/hostedtoolcache/Python/3.8.1/x64/lib/python3.8/site-packages/dostoevsky/data/__init__.py", line 40, in download
    tar.extractall(os.path.dirname(destination_path))
  File "/opt/hostedtoolcache/Python/3.8.1/x64/lib/python3.8/tarfile.py", line 2026, in extractall
    self.extract(tarinfo, path, set_attrs=not tarinfo.isdir(),
  File "/opt/hostedtoolcache/Python/3.8.1/x64/lib/python3.8/tarfile.py", line 2067, in extract
    self._extract_member(tarinfo, os.path.join(path, tarinfo.name),
  File "/opt/hostedtoolcache/Python/3.8.1/x64/lib/python3.8/tarfile.py", line 2139, in _extract_member
    self.makefile(tarinfo, targetpath)
  File "/opt/hostedtoolcache/Python/3.8.1/x64/lib/python3.8/tarfile.py", line 2188, in makefile
    copyfileobj(source, target, tarinfo.size, ReadError, bufsize)
  File "/opt/hostedtoolcache/Python/3.8.1/x64/lib/python3.8/tarfile.py", line 247, in copyfileobj
    buf = src.read(bufsize)
  File "/opt/hostedtoolcache/Python/3.8.1/x64/lib/python3.8/lzma.py", line 200, in read
    return self._buffer.read(size)
  File "/opt/hostedtoolcache/Python/3.8.1/x64/lib/python3.8/_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "/opt/hostedtoolcache/Python/3.8.1/x64/lib/python3.8/_compression.py", line 99, in read
    raise EOFError("Compressed file ended before the "
EOFError: Compressed file ended before the end-of-stream marker was reached
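
This EOFError usually means the .tar.xz archive on disk is shorter than the compressed stream expects, for example after an interrupted or truncated download. The sketch below checks an archive before extraction by reading the xz stream to its end; is_complete_xz is a hypothetical helper that uses only the standard lzma module.

```python
import lzma


def is_complete_xz(path: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if the xz stream at `path` decompresses through to its end marker."""
    try:
        with lzma.open(path, "rb") as stream:
            # Read and discard the data chunk by chunk; only completeness matters here.
            while stream.read(chunk_size):
                pass
    except (EOFError, lzma.LZMAError):
        return False
    return True
```

If the check fails on a freshly downloaded file, deleting it and downloading again is a reasonable first step; if it keeps failing, the hosted archive itself may be incomplete.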

fasttext-social-network-model.bin cannot be opened for loading!

I tried to run the tutorial code, but got an error:

Traceback (most recent call last):
  File "train.py", line 8, in <module>
    model = FastTextSocialNetworkModel(tokenizer=tokenizer)
  File "C:\Users\Рамазан\AppData\Local\Programs\Python\Python38-32\lib\site-packages\dostoevsky\models.py", line 61, in __init__
    super(FastTextSocialNetworkModel, self).__init__(
  File "C:\Users\Рамазан\AppData\Local\Programs\Python\Python38-32\lib\site-packages\dostoevsky\models.py", line 26, in __init__
    self.get_compiled_model()
  File "C:\Users\Рамазан\AppData\Local\Programs\Python\Python38-32\lib\site-packages\dostoevsky\models.py", line 69, in get_compiled_model
    return load_fasttext_model(self.MODEL_PATH)
  File "C:\Users\Рамазан\AppData\Local\Programs\Python\Python38-32\lib\site-packages\fasttext\FastText.py", line 350, in load_model
    return _FastText(model_path=path)
  File "C:\Users\Рамазан\AppData\Local\Programs\Python\Python38-32\lib\site-packages\fasttext\FastText.py", line 43, in __init__
    self.f.loadModel(model_path)
ValueError: C:\Users\Рамазан\AppData\Local\Programs\Python\Python38-32\lib\site-packages\dostoevsky\data\models/fasttext-social-network-model.bin cannot be opened for loading!

Before that, I had run: $ python -m dostoevsky download fasttext-social-network-model
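
Before digging further into "cannot be opened for loading", it may help to confirm that the .bin file actually exists at the path named in the error and is not empty. A minimal sketch follows; model_file_status is a hypothetical helper, and the path to pass in is the one printed in the ValueError.

```python
import os


def model_file_status(path: str) -> str:
    """Classify a model path as 'missing', 'empty', or 'present'."""
    if not os.path.isfile(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    return "present"
```

A result of "missing" or "empty" would point at the download step rather than at fastText; "present" would suggest looking elsewhere, for example at how the path itself is handled on that system.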

Implement fastText model

Train and implement a fastText model on the rusentiment dataset via the fasttext Python module.
With it, we may be able to improve quality and get rid of the fixed sentence length.

Models

I can't download the models through the package. How can I download them directly?

[Question] Has it been tested on ARM?

I'm trying to use this library on a Raspberry Pi 4 and am having some trouble.

Traceback (most recent call last):
  File "SentimentAnalyzer.py", line 3, in <module>
    from dostoevsky.models import FastTextSocialNetworkModel
ModuleNotFoundError: No module named 'dostoevsky'

But on a Windows 10 PC it works just fine.

Python version 3.7.3
Same result with or without a virtual env
fasttext-social-network-model is downloaded
Using the code snippet from the README
Running with sudo

Without sudo I get a different error:

ImportError: /home/pi/.local/lib/python3.7/site-packages/fasttext_pybind.cpython-37m-arm-linux-gnueabihf.so: undefined symbol: __atomic_load_8 ...

Incorrect prediction

from typing import Dict
from dostoevsky.models import FastTextToxicModel
from dostoevsky.tokenization import RegexTokenizer

tokenizer = RegexTokenizer()
toxic_model = FastTextToxicModel(tokenizer=tokenizer)

# Test inputs, roughly: "hi", "I love you!!", and an insult
messages = ['привет', 'я люблю тебя!!', 'малолетние дебилы']
results = toxic_model.predict(messages, k=2)
for message, sentiment in zip(messages, results):
    print(message, '->', sentiment)

Output:
привет -> {'normal': 0.9972950220108032, 'toxic': 0.0026416745968163013}
я люблю тебя!! -> {'toxic': 1.0000100135803223, 'normal': 1.0000003385357559e-05}
малолетние дебилы -> {'toxic': 1.0000100135803223, 'normal': 1.0000003385357559e-05}

'я люблю тебя!!' ("I love you!!") gets the same toxic score as 'малолетние дебилы' (an insult).

Model fails to download

First of all, you'll need to download binary model:
$ python -m dostoevsky download fasttext-social-network-model

(venv) iMac-Vitaliy:pythonProject vitaliy$ python -m dostoevsky download fasttext-social-network-model
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 1350, in do_open
encode_chunked=req.has_header('Transfer-encoding'))
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1277, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1323, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1272, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1032, in _send_output
self.send(msg)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 972, in send
self.connect()
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1447, in connect
server_hostname=server_hostname)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 423, in wrap_socket
session=session
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 870, in _create
self.do_handshake()
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 1139, in do_handshake
self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1091)

During handling of the above exception, another exception occurred:

dostoevsky is not recognized as an internal or external command

Hi, this is perhaps a silly question, but I'm stuck on this step:
"First of all, you'll need to download binary model:
$ dostoevsky download fasttext-social-network-model"

After entering it in cmd I get:
'dostoevsky' is not recognized as an internal or external command,
operable program or batch file.

Thank you for helping
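
Other reports on this page invoke the downloader as python -m dostoevsky download fasttext-social-network-model, which does not depend on the dostoevsky script being on PATH. A small sketch of building that command for the interpreter that is currently running; build_download_command is a hypothetical helper.

```python
import sys
from typing import List


def build_download_command(model_name: str) -> List[str]:
    """Build the module-style download command for the current interpreter."""
    # sys.executable is the running interpreter, so the command targets the
    # environment where the package was installed.
    return [sys.executable, "-m", "dostoevsky", "download", model_name]


print(build_download_command("fasttext-social-network-model"))
```

The resulting list can be passed to subprocess.run(..., check=True), or the equivalent command can simply be typed into cmd.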

How to interpret the sentiment?

Questions:

  1. model.predict() includes argument k, what does it configure?
  2. model.predict() returns values 'speech' & 'skip', what do they mean?
  3. model.predict() returns 'positive', 'negative', 'neutral', how to get a unified sentiment value for the whole sentence? Sometimes positive is the greatest, sometimes negative is the greatest, sometimes neutral is the greatest. I would like to understand how to get a single score of the sentence sentiment in a range of -1 to 1. Any advice?
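
On the first question: in the fastText Python API, the k argument of predict is the number of most likely labels to return per input. On the third: one simple heuristic, offered as a sketch rather than as the library's own method, is to subtract the negative probability from the positive one. sentiment_score is a hypothetical helper, and it assumes k was set high enough for both labels to appear in the result.

```python
from typing import Dict


def sentiment_score(prediction: Dict[str, float]) -> float:
    """Collapse a label-to-probability dict into one number in [-1, 1]."""
    # Labels missing from the dict (for example cut off by a small k) count as 0.
    positive = prediction.get("positive", 0.0)
    negative = prediction.get("negative", 0.0)
    # Clamp, because reported probabilities can sit slightly above 1.0.
    return max(-1.0, min(1.0, positive - negative))
```

A mostly neutral sentence then lands near 0, which may or may not be the behaviour you want; weighting neutral differently is a design choice this sketch leaves open.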

How to fix the error


ValueError Traceback (most recent call last)

in ()
1 tokenizer = RegexTokenizer()
----> 2 model = FastTextSocialNetworkModel(tokenizer=tokenizer)
3 sentiment_list = []
4 results = model.predict(list_of_posts, k=3)
5 for sentiment in results:

4 frames

/usr/local/lib/python3.6/dist-packages/fasttext/FastText.py in __init__(self, model_path, args)
41 self.f = fasttext.fasttext()
42 if model_path is not None:
---> 43 self.f.loadModel(model_path)
44 self._words = None
45 self._labels = None

ValueError: /usr/local/lib/python3.6/dist-packages/dostoevsky/data/models/fasttext-social-network-model.bin cannot be opened for loading!

Installing collected packages: dostoevsky
Successfully installed dostoevsky-0.5.0
