Comments (7)
When you set the number of workers to 5, I assume that DeepWalk will run on a single instance with 5 threads. To spread the work across several instances, you would need to write some additional code that generates walks on each instance and then merges the walk files together.
Perhaps the reason it got killed is that your machine ran out of memory. Can you provide more detailed information on that?
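As a rough illustration of the "generate walks on each instance, then merge" idea, here is a minimal sketch in plain Python. It is not DeepWalk's own code: the function names (`shard_nodes`, `write_shard_walks`, `merge_walk_files`), the `walks.<shard_id>.txt` file naming, and the round-robin node split are all assumptions made for this example. It also assumes each instance can hold the adjacency list in memory and that node ids are sortable.

```python
import random
from pathlib import Path


def shard_nodes(nodes, num_shards, shard_id):
    """Return the start nodes owned by one instance (round-robin split)."""
    return [n for i, n in enumerate(sorted(nodes)) if i % num_shards == shard_id]


def random_walk(adj, start, walk_length, rng):
    """Uniform random walk; stops early at a node without neighbours."""
    walk = [start]
    while len(walk) < walk_length:
        neighbours = adj.get(walk[-1], [])
        if not neighbours:
            break
        walk.append(rng.choice(neighbours))
    return walk


def write_shard_walks(adj, shard_id, num_shards, num_walks, walk_length,
                      out_dir, seed=0):
    """Run on each instance: write this shard's walks, one walk per line."""
    rng = random.Random(seed + shard_id)  # different seed per shard
    starts = shard_nodes(adj.keys(), num_shards, shard_id)
    out_path = Path(out_dir) / f"walks.{shard_id}.txt"  # hypothetical naming
    with out_path.open("w") as f:
        for _ in range(num_walks):
            rng.shuffle(starts)
            for node in starts:
                walk = random_walk(adj, node, walk_length, rng)
                f.write(" ".join(map(str, walk)) + "\n")
    return out_path


def merge_walk_files(paths, merged_path):
    """Run once at the end: concatenate the per-shard files into one corpus."""
    with open(merged_path, "w") as out:
        for p in paths:
            with open(p) as f:
                for line in f:
                    out.write(line)
    return merged_path
```

Each instance would call `write_shard_walks` with its own `shard_id`, and the merged file can then be fed to whatever word2vec trainer you use. Note that sharding only splits the walk generation work; it does not by itself reduce the memory needed to hold the graph on each instance.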
from deepwalk.
I am currently running it on an AWS EC2 instance with 8 vCPUs, and I have set the number of workers to 8. It runs fine, but the process is still slow. Could you please help me write the additional code to generate walks on each instance and then merge the walk files? I am also thinking of running this code on Apache Spark. Could you share your thoughts on that?
How about using Spark all along? We did that for our RandomWalks implementation, but trained the word2vec models locally, since the Spark word2vec implementation is nowhere near as good as the C variant from Mikolov himself; for example, its maximum vocabulary size is heavily restricted.
Hi @thomasniebler, could you please share your Spark code for random walk generation? I would like to integrate my word2vec code with it and see how it works.