Comments (8)

menshikh-iv commented on July 24, 2024

@piskvorky With gensim==2.3.0 and GoogleNews-vectors-negative300.bin.gz, everything works fine:

In [1]: from gensim.models import KeyedVectors
In [2]: kv = KeyedVectors.load_word2vec_format("GoogleNews-vectors-negative300.bin.gz", binary=True)
In [3]: kv.most_similar("king")
Out[3]: 
[(u'kings', 0.7138046026229858),
 (u'queen', 0.6510956883430481),
 (u'monarch', 0.6413194537162781),
 (u'crown_prince', 0.6204220056533813),
 (u'prince', 0.6159993410110474),
 (u'sultan', 0.5864822864532471),
 (u'ruler', 0.5797567367553711),
 (u'princes', 0.5646552443504333),
 (u'Prince_Paras', 0.543294370174408),
 (u'throne', 0.5422104597091675)]

FYI, the command in the README, "python runserver.py hetzner.conf", isn't correct (it should be w2v_server.py and w2v_*.conf), and the path to the binary is hardcoded.

With gensim==2.3.0 this web app doesn't run, because it uses the deprecated Word2Vec.load_word2vec_format:

(qqq) ivan@P50:~/w2v_server_googlenews$ python w2v_server.py w2v_home.conf
2017-09-22 11:23:06,643 : INFO : w2v_server:163 : <module>(MainThread) : running w2v_server.py w2v_home.conf
Traceback (most recent call last):
  File "w2v_server.py", line 182, in <module>
    cherrypy.quickstart(Server(config.MODEL_FILE), config=conf_file)
  File "w2v_server.py", line 69, in __init__
    self.model = gensim.models.word2vec.Word2Vec.load_word2vec_format(fname, binary=True)
  File "/home/ivan/.virtualenvs/qqq/local/lib/python2.7/site-packages/gensim/models/word2vec.py", line 1450, in load_word2vec_format
    raise DeprecationWarning("Deprecated. Use gensim.models.KeyedVectors.load_word2vec_format instead.")
DeprecationWarning: Deprecated. Use gensim.models.KeyedVectors.load_word2vec_format instead.
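
One way to make the loading line work with both old and new gensim releases is to pick the loader at runtime. The sketch below is only an illustration: the helper name pick_word2vec_loader is hypothetical, and it assumes older releases such as 0.13.1 do not expose gensim.models.KeyedVectors. The object returned by the KeyedVectors loader is not a Word2Vec instance, so other parts of the server may still need adjusting.

```python
def pick_word2vec_loader(models):
    """Return the load_word2vec_format callable that `models` supports.

    `models` is expected to be the `gensim.models` package. It is passed in
    as an argument so the choice can be made without importing gensim here.
    """
    keyed_vectors = getattr(models, "KeyedVectors", None)
    if keyed_vectors is not None:
        # Newer gensim: the loader lives on KeyedVectors.
        return keyed_vectors.load_word2vec_format
    # Assumption: releases without KeyedVectors still offer the
    # Word2Vec classmethod, as the old requirements.txt versions do.
    return models.word2vec.Word2Vec.load_word2vec_format


# Usage inside the server (fname is the model path from the config):
#   import gensim.models
#   load = pick_word2vec_loader(gensim.models)
#   model = load(fname, binary=True)
```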

With the versions from requirements.txt (and the path to the word vectors already fixed), it fails on hardcoded, non-existent log directories:

(qqq) ivan@P50:~/w2v_server_googlenews$ python w2v_server.py w2v_home.conf 
2017-09-22 11:25:40,287 : INFO : w2v_server:163 : <module>(MainThread) : running w2v_server.py w2v_home.conf
2017-09-22 11:25:40,292 : INFO : word2vec:1023 : load_word2vec_format(MainThread) : loading projection weights from /home/ivan/w2v_server_googlenews/GoogleNews-vectors-negative300.bin.gz
2017-09-22 11:27:14,305 : INFO : word2vec:1077 : load_word2vec_format(MainThread) : loaded (3000000, 300) matrix from /home/ivan/w2v_server_googlenews/GoogleNews-vectors-negative300.bin.gz
2017-09-22 11:27:14,305 : INFO : word2vec:1345 : init_sims(MainThread) : precomputing L2-norms of word weight vectors
2017-09-22 11:27:33,115 : INFO : word2vec:1345 : init_sims(MainThread) : precomputing L2-norms of word weight vectors
Traceback (most recent call last):
  File "w2v_server.py", line 182, in <module>
    cherrypy.quickstart(Server(config.MODEL_FILE), config=conf_file)
  File "/home/ivan/.virtualenvs/qqq/local/lib/python2.7/site-packages/cherrypy/__init__.py", line 169, in quickstart
    _global_conf_alias.update(config)
  File "/home/ivan/.virtualenvs/qqq/local/lib/python2.7/site-packages/cherrypy/_cpconfig.py", line 158, in update
    reprconf.Config.update(self, config)
  File "/home/ivan/.virtualenvs/qqq/local/lib/python2.7/site-packages/cherrypy/lib/reprconf.py", line 166, in update
    self._apply(config)
  File "/home/ivan/.virtualenvs/qqq/local/lib/python2.7/site-packages/cherrypy/_cpconfig.py", line 168, in _apply
    reprconf.Config._apply(self, config)
  File "/home/ivan/.virtualenvs/qqq/local/lib/python2.7/site-packages/cherrypy/lib/reprconf.py", line 178, in _apply
    self.namespaces(config)
  File "/home/ivan/.virtualenvs/qqq/local/lib/python2.7/site-packages/cherrypy/lib/reprconf.py", line 120, in __call__
    handler(k, v)
  File "/home/ivan/.virtualenvs/qqq/local/lib/python2.7/site-packages/cherrypy/__init__.py", line 645, in <lambda>
    config.namespaces["log"] = lambda k, v: setattr(log, k, v)
  File "/home/ivan/.virtualenvs/qqq/local/lib/python2.7/site-packages/cherrypy/_cplogging.py", line 391, in _set_access_file
    self._set_file_handler(self.access_log, newvalue)
  File "/home/ivan/.virtualenvs/qqq/local/lib/python2.7/site-packages/cherrypy/_cplogging.py", line 363, in _set_file_handler
    self._add_builtin_file_handler(log, filename)
  File "/home/ivan/.virtualenvs/qqq/local/lib/python2.7/site-packages/cherrypy/_cplogging.py", line 349, in _add_builtin_file_handler
    h = logging.FileHandler(fname)
  File "/usr/lib/python2.7/logging/__init__.py", line 913, in __init__
    StreamHandler.__init__(self, self._open())
  File "/usr/lib/python2.7/logging/__init__.py", line 943, in _open
    stream = open(self.baseFilename, self.mode)
IOError: [Errno 2] No such file or directory: '/var/log/w2v/access.log'
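
The IOError above comes from opening a log file whose parent directory does not exist. A possible workaround, sketched here in Python 3, is to create the directories before calling cherrypy.quickstart. The helper name ensure_log_dirs is hypothetical, and creating anything under /var/log usually needs elevated permissions, so pointing the config at a writable location may be the simpler fix.

```python
import os


def ensure_log_dirs(log_paths):
    """Create the parent directory of each log file path if it is missing.

    Returns the list of directories that were actually created.
    """
    created = []
    for path in log_paths:
        parent = os.path.dirname(os.path.abspath(path))
        if not os.path.isdir(parent):
            # exist_ok guards against another process creating it first.
            os.makedirs(parent, exist_ok=True)
            created.append(parent)
    return created


# Usage before starting the server, with the path from the traceback:
#   ensure_log_dirs(["/var/log/w2v/access.log"])
```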

With all of these fixes, the server runs (with some errors from the engine, which don't matter) and everything works correctly:

ivan@P50:~$ curl 'http://127.0.0.1:8889/most_similar?positive%5B%5D=woman&positive%5B%5D=king&negative%5B%5D=man'
{"taken": 0.22298312187194824, "similars": [["queen", 0.7118192911148071], ["monarch", 0.6189675331115723], ["princess", 0.5902431011199951], ["crown_prince", 0.5499460697174072], ["prince", 0.5377322435379028]], "success": 1}
ivan@P50:~$ curl 'http://127.0.0.1:8889/suggest?term=IPhon'
["iPhone", "iphone", "IPhone", "Iphone", "IPHONE", "iPHONE", "iPHone", "iPhone.com", "iphone.org", "iPhone.org"]
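
The same most_similar request can be built from Python instead of curl. This is a minimal Python 3 sketch that assumes the server is listening on 127.0.0.1:8889 as in the commands above. The function names are hypothetical; the endpoint and the positive[]/negative[] parameter names are taken from the curl call.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def most_similar_url(base_url, positive=(), negative=()):
    """Build the /most_similar query URL with repeated positive[]/negative[] keys."""
    params = [("positive[]", word) for word in positive]
    params += [("negative[]", word) for word in negative]
    # urlencode percent-encodes the brackets as %5B%5D, matching the curl call.
    return base_url + "/most_similar?" + urlencode(params)


def fetch_json(url, timeout=10):
    """Send a GET request and decode the JSON response body."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (requires the server to be running):
#   url = most_similar_url("http://127.0.0.1:8889", ["woman", "king"], ["man"])
#   print(fetch_json(url)["similars"])
```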

from w2v_server_googlenews.

piskvorky commented on July 24, 2024

What version of gensim is this?

@zppinto can you load the model using the latest version of gensim?

zppinto commented on July 24, 2024

gensim (2.3.0) and CherryPy (11.0.0)

I have also tried gensim (0.13.1) and CherryPy (5.1.0) on an older machine that has these dependencies installed.
The problem happens on both!

piskvorky commented on July 24, 2024

Hmm, not good. Thanks for reporting. @zppinto Can you send the exact command for loading the vectors which gives you this exception?

@menshikh-iv can you replicate this? Can we really not load the GoogleNews vectors in gensim any more?

piskvorky commented on July 24, 2024

Thank you @menshikh-iv .

@zppinto we're unable to reproduce this, so I'm closing the issue. Perhaps the data files you downloaded are corrupted? If you send concrete instructions on how to replicate the error, we can reopen and investigate.

zppinto commented on July 24, 2024

I've found the problem! I was running the server and the GoogleNews file inside a directory with special characters in its name ("Transferências", the "Downloads" directory in Portuguese), and that is what triggers the "UnicodeDecodeError" problem.
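
A quick pre-flight check can flag this situation before the model is loaded. The Python 3 sketch below only reports which parts of a path contain non-ASCII characters. It does not reproduce or fix the error itself, since the thread does not show which call raised it. The helper name and the example path are hypothetical.

```python
import os


def non_ascii_components(path):
    """Return the components of the absolute path that contain non-ASCII characters."""
    parts = os.path.abspath(path).split(os.sep)
    return [part for part in parts if any(ord(ch) > 127 for ch in part)]


# Usage with a hypothetical location of the vectors file:
#   bad = non_ascii_components("/home/user/Transferências/GoogleNews-vectors-negative300.bin.gz")
#   if bad:
#       print("Path contains non-ASCII directory names:", bad)
```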

piskvorky commented on July 24, 2024

@zppinto glad you figured it out :) How did you fix it?

zppinto commented on July 24, 2024

@piskvorky it was easy... I just moved the server and the GoogleNews file to a different directory in my user folder. I've also fixed some of the errors that @menshikh-iv mentioned, and everything works fine.
