Comments (8)
@piskvorky With gensim==2.3.0 and GoogleNews-vectors-negative300.bin.gz, everything works fine:
In [1]: from gensim.models import KeyedVectors
In [2]: kv = KeyedVectors.load_word2vec_format("GoogleNews-vectors-negative300.bin.gz", binary=True)
In [3]: kv.most_similar("king")
Out[3]:
[(u'kings', 0.7138046026229858),
(u'queen', 0.6510956883430481),
(u'monarch', 0.6413194537162781),
(u'crown_prince', 0.6204220056533813),
(u'prince', 0.6159993410110474),
(u'sultan', 0.5864822864532471),
(u'ruler', 0.5797567367553711),
(u'princes', 0.5646552443504333),
(u'Prince_Paras', 0.543294370174408),
(u'throne', 0.5422104597091675)]
FYI, the command in the README, python runserver.py hetzner.conf, is incorrect (the script should be w2v_server.py and the config one of w2v_*.conf). The path to the binary is also hardcoded.
With gensim==2.3.0 this web app doesn't run, because it uses the deprecated Word2Vec.load_word2vec_format:
(qqq) ivan@P50:~/w2v_server_googlenews$ python w2v_server.py w2v_home.conf
2017-09-22 11:23:06,643 : INFO : w2v_server:163 : <module>(MainThread) : running w2v_server.py w2v_home.conf
Traceback (most recent call last):
File "w2v_server.py", line 182, in <module>
cherrypy.quickstart(Server(config.MODEL_FILE), config=conf_file)
File "w2v_server.py", line 69, in __init__
self.model = gensim.models.word2vec.Word2Vec.load_word2vec_format(fname, binary=True)
File "/home/ivan/.virtualenvs/qqq/local/lib/python2.7/site-packages/gensim/models/word2vec.py", line 1450, in load_word2vec_format
raise DeprecationWarning("Deprecated. Use gensim.models.KeyedVectors.load_word2vec_format instead.")
DeprecationWarning: Deprecated. Use gensim.models.KeyedVectors.load_word2vec_format instead.
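The deprecation message above names the replacement call. A minimal sketch of a loader lookup that tolerates both API generations, assuming the module passed in is gensim.models; the helper name resolve_loader is hypothetical and is not part of w2v_server.py or gensim:

```python
def resolve_loader(models):
    """Return the word2vec-format loader exposed by a gensim.models-like module.

    Prefers KeyedVectors.load_word2vec_format (named in the deprecation
    message above) and falls back to Word2Vec.load_word2vec_format for
    older releases that have no KeyedVectors class.
    """
    keyed_vectors = getattr(models, "KeyedVectors", None)
    if keyed_vectors is not None:
        return keyed_vectors.load_word2vec_format
    return models.Word2Vec.load_word2vec_format


# Intended use (requires gensim and the vectors file; not executed here):
#   import gensim.models
#   load = resolve_loader(gensim.models)
#   model = load(fname, binary=True)
```

With the newer loader the returned object is a KeyedVectors instance rather than a Word2Vec model, so other parts of the server may need checking; the thread does not say which ones.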
With the versions from requirements.txt (and the path to the word vectors already fixed), the problem is the hardcoded, non-existent log directories:
(qqq) ivan@P50:~/w2v_server_googlenews$ python w2v_server.py w2v_home.conf
2017-09-22 11:25:40,287 : INFO : w2v_server:163 : <module>(MainThread) : running w2v_server.py w2v_home.conf
2017-09-22 11:25:40,292 : INFO : word2vec:1023 : load_word2vec_format(MainThread) : loading projection weights from /home/ivan/w2v_server_googlenews/GoogleNews-vectors-negative300.bin.gz
2017-09-22 11:27:14,305 : INFO : word2vec:1077 : load_word2vec_format(MainThread) : loaded (3000000, 300) matrix from /home/ivan/w2v_server_googlenews/GoogleNews-vectors-negative300.bin.gz
2017-09-22 11:27:14,305 : INFO : word2vec:1345 : init_sims(MainThread) : precomputing L2-norms of word weight vectors
2017-09-22 11:27:33,115 : INFO : word2vec:1345 : init_sims(MainThread) : precomputing L2-norms of word weight vectors
Traceback (most recent call last):
File "w2v_server.py", line 182, in <module>
cherrypy.quickstart(Server(config.MODEL_FILE), config=conf_file)
File "/home/ivan/.virtualenvs/qqq/local/lib/python2.7/site-packages/cherrypy/__init__.py", line 169, in quickstart
_global_conf_alias.update(config)
File "/home/ivan/.virtualenvs/qqq/local/lib/python2.7/site-packages/cherrypy/_cpconfig.py", line 158, in update
reprconf.Config.update(self, config)
File "/home/ivan/.virtualenvs/qqq/local/lib/python2.7/site-packages/cherrypy/lib/reprconf.py", line 166, in update
self._apply(config)
File "/home/ivan/.virtualenvs/qqq/local/lib/python2.7/site-packages/cherrypy/_cpconfig.py", line 168, in _apply
reprconf.Config._apply(self, config)
File "/home/ivan/.virtualenvs/qqq/local/lib/python2.7/site-packages/cherrypy/lib/reprconf.py", line 178, in _apply
self.namespaces(config)
File "/home/ivan/.virtualenvs/qqq/local/lib/python2.7/site-packages/cherrypy/lib/reprconf.py", line 120, in __call__
handler(k, v)
File "/home/ivan/.virtualenvs/qqq/local/lib/python2.7/site-packages/cherrypy/__init__.py", line 645, in <lambda>
config.namespaces["log"] = lambda k, v: setattr(log, k, v)
File "/home/ivan/.virtualenvs/qqq/local/lib/python2.7/site-packages/cherrypy/_cplogging.py", line 391, in _set_access_file
self._set_file_handler(self.access_log, newvalue)
File "/home/ivan/.virtualenvs/qqq/local/lib/python2.7/site-packages/cherrypy/_cplogging.py", line 363, in _set_file_handler
self._add_builtin_file_handler(log, filename)
File "/home/ivan/.virtualenvs/qqq/local/lib/python2.7/site-packages/cherrypy/_cplogging.py", line 349, in _add_builtin_file_handler
h = logging.FileHandler(fname)
File "/usr/lib/python2.7/logging/__init__.py", line 913, in __init__
StreamHandler.__init__(self, self._open())
File "/usr/lib/python2.7/logging/__init__.py", line 943, in _open
stream = open(self.baseFilename, self.mode)
IOError: [Errno 2] No such file or directory: '/var/log/w2v/access.log'
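The IOError above comes from logging.FileHandler opening a file whose parent directory does not exist. One way around it, sketched here under the assumption that the log paths are known before cherrypy.quickstart is called, is to create the missing directories first. The function name ensure_log_dirs is hypothetical, and creating anything under /var/log normally needs elevated permissions, so pointing the config at a writable directory is the other option:

```python
import os


def ensure_log_dirs(log_paths):
    """Create the parent directory of each log file if it is missing.

    Returns the list of directories that were created.
    """
    created = []
    for path in log_paths:
        directory = os.path.dirname(os.path.abspath(path))
        if not os.path.isdir(directory):
            os.makedirs(directory)
            created.append(directory)
    return created


# Example call with the path from the traceback (may need root):
#   ensure_log_dirs(["/var/log/w2v/access.log"])
```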
With all of these fixes the server runs (the engine still logs some errors, but they don't matter) and everything works correctly:
ivan@P50:~$ curl 'http://127.0.0.1:8889/most_similar?positive%5B%5D=woman&positive%5B%5D=king&negative%5B%5D=man'
{"taken": 0.22298312187194824, "similars": [["queen", 0.7118192911148071], ["monarch", 0.6189675331115723], ["princess", 0.5902431011199951], ["crown_prince", 0.5499460697174072], ["prince", 0.5377322435379028]], "success": 1}
ivan@P50:~$ curl 'http://127.0.0.1:8889/suggest?term=IPhon'
["iPhone", "iphone", "IPhone", "Iphone", "IPHONE", "iPHONE", "iPHone", "iPhone.com", "iphone.org", "iPhone.org"]
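The %5B%5D in the first curl call is the URL-encoded form of [], so the parameters are positive[] and negative[], each repeated once per word. A small Python 3 sketch that builds the same query string; the host and port are copied from the curl calls above, and the helper name most_similar_url is hypothetical:

```python
from urllib.parse import urlencode

BASE_URL = "http://127.0.0.1:8889"  # host and port taken from the curl calls above


def most_similar_url(positive, negative=(), base_url=BASE_URL):
    """Build a /most_similar URL with repeated positive[] / negative[] params."""
    params = [("positive[]", word) for word in positive]
    params += [("negative[]", word) for word in negative]
    # urlencode percent-encodes the brackets in the keys as %5B and %5D.
    return base_url + "/most_similar?" + urlencode(params)


print(most_similar_url(["woman", "king"], ["man"]))
```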
from w2v_server_googlenews.
What version of gensim is this?
@zppinto can you load the model using the latest version of gensim?
gensim (2.3.0) and CherryPy (11.0.0)
I have also tried gensim (0.13.1) and CherryPy (5.1.0) on an older machine that has these dependencies. The problem happens on both!
Hmm, not good. Thanks for reporting. @zppinto Can you send the exact command for loading the vectors which gives you this exception?
@menshikh-iv can you replicate this? Can we really not load the GoogleNews vectors in gensim any more?
Thank you @menshikh-iv .
@zppinto we're unable to reproduce, so I'm closing the issue. Perhaps the data files you downloaded are corrupted? If you send concrete instructions on how to replicate the error, we can reopen and investigate.
I've found the problem! I was running the server and the GoogleNews file inside a directory with special characters in its name ("Transferências", the "Downloads" directory in Portuguese), and that is what triggers the "UnicodeDecodeError".
@zppinto glad you figured it out :) How did you fix it?
@piskvorky it was easy... I just moved the server and the GoogleNews file to a different directory in my user folder. I've also fixed some of the errors that @menshikh-iv mentioned, and everything works fine.
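For anyone hitting the same error, a quick way to spot the kind of path described above is to list the path components that contain non-ASCII characters. This is only a diagnostic sketch in Python 3; the function name non_ascii_parts and the example path are hypothetical, and the thread does not establish exactly where in the server the decoding fails:

```python
def non_ascii_parts(path):
    """Return the components of a path that contain non-ASCII characters."""
    parts = path.replace("\\", "/").split("/")
    return [part for part in parts if any(ord(ch) > 127 for ch in part)]


# Hypothetical example path modelled on the directory name in the thread.
print(non_ascii_parts("/home/user/Transferências/w2v_server_googlenews"))
```

An empty list means the path is plain ASCII, which matches the workaround of moving the files to a different directory.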