Hi, I read through the code and I have a question. The embedding is

for nodes didn't appear in walks (deepwalk issue, 8 comments, CLOSED)

phanein commented on August 16, 2024

for nodes didn't appear in walks

from deepwalk.

Comments (8)

GTmac commented on August 16, 2024

Hi,

DeepWalk generates a fixed number of walks (the default value is 10) starting from each node in the graph, thus every node should appear in some random walks. Can you show me your input graph?
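To make the claim above concrete, here is a minimal sketch of per-node walk generation. It is not DeepWalk's actual code: `build_walks` and `toy_graph` are hypothetical names, and the defaults (10 walks per node, walk length 40) are simply the values mentioned in this thread.

```python
import random

def build_walks(graph, num_walks=10, walk_length=40, seed=0):
    """Start num_walks truncated random walks from every node in graph.

    graph maps each node to a list of neighbors; isolated nodes map to [].
    This is an illustrative sketch, not the library's implementation.
    """
    rng = random.Random(seed)
    nodes = list(graph)
    walks = []
    for _ in range(num_walks):
        rng.shuffle(nodes)  # visit the start nodes in a fresh random order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = graph[walk[-1]]
                if not neighbors:
                    break  # dead end: the walk stops early but still contains its start
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks

# Toy graph (hypothetical): a triangle plus one isolated node "d".
toy_graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"], "d": []}
walks = build_walks(toy_graph)
starts = [walk[0] for walk in walks]
```

Because the outer loop iterates over every node, each node is the start of exactly `num_walks` walks, regardless of whether it has any edges.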


lemmonation commented on August 16, 2024

Hi,

I saw in the code that the default length is 40? But that doesn't matter here.

My graph is the original Citeseer dataset, downloaded from their website. The total number of nodes should be 3327, but the nodes I collected from the random walks generated by your code number around 3250, which means some nodes are missing.

Is this because of the dataset itself? Maybe the Citeseer graph isn't fully connected? But the output embedding does have 3327 representations, while the generated walks don't cover all nodes, which confuses me.


GTmac commented on August 16, 2024

Even if the input graph is not connected, there should be multiple walks starting from each node. Say we create a graph with 5 nodes and 0 edges as follows, and store it in test.adjlist:
1
2
3
4
5

Run DeepWalk on this graph:
deepwalk --input test.adjlist --output test.embeddings --max-memory-data-size 0
We set max-memory-data-size to 0 to dump the walks to disk.
If you read the walks file, each node still appears exactly 10 times. Thus, I wonder if there is a problem with your code for collecting nodes from random walks.
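A check like the one described above can be sketched as follows. This is only an illustration: `count_nodes` and `missing_nodes` are hypothetical helpers, and the layout of the dumped walks (one walk per line, node ids separated by whitespace) is an assumption rather than something confirmed in this thread.

```python
from collections import Counter

def count_nodes(walk_lines):
    """Count how often each node id occurs across whitespace-separated walk lines."""
    counts = Counter()
    for line in walk_lines:
        counts.update(line.split())
    return counts

def missing_nodes(expected_nodes, walk_lines):
    """Return the expected node ids that never occur in any walk, sorted as strings."""
    seen = count_nodes(walk_lines)
    # Compare ids as strings so that 1 and "1" are treated as the same node.
    return sorted(set(map(str, expected_nodes)) - set(seen))

# With 5 isolated nodes and 10 walks per node, every walk is a single node id.
example_lines = [str(node) for _ in range(10) for node in range(1, 6)]
example_counts = count_nodes(example_lines)
```

Comparing ids as strings avoids a common source of apparent "missing" nodes, where the graph file is read with integer ids and the walks are read back as text.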


lemmonation commented on August 16, 2024

Thanks for your help. It seems I overlooked the format parameter and was reading an edge-list file with the adjlist setting -.-


amarzullo24 commented on August 16, 2024

Dear all,
I am getting a similar issue. I ran the algorithm on an 84x84 adjacency matrix, representing a directed graph of 84 nodes, using the command:
deepwalk --input graph.csv --output out.csv --undirected false --format adjlist
However, the output matrix is 23x64.
From the paper, I understand that the output should be |V| x d, i.e. an 84 x 64 matrix. Am I missing something?

Thank you for your help.


GTmac commented on August 16, 2024

Can you paste the content of graph.csv here? Thanks.


amarzullo24 commented on August 16, 2024

I copied the content in this pastebin


GTmac commented on August 16, 2024

If you are reading in the data as an adjacency list (as specified by --format adjlist), then the first value in each row should be the source node, while the remaining values are the nodes connected to the source node. It seems that your input file does not follow the format of an adjacency list.
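One way to get from an adjacency matrix to the row layout described above is to convert it first. The sketch below assumes the matrix file is comma-separated numeric rows and that node ids are 0-based row indices; the actual content of graph.csv is not shown in this thread, and `matrix_to_adjlist` and `parse_matrix_csv` are hypothetical helpers, not part of deepwalk.

```python
def parse_matrix_csv(text):
    """Parse comma-separated numeric rows (an assumed layout for the matrix file)."""
    return [[float(cell) for cell in line.split(",")]
            for line in text.strip().splitlines()]

def matrix_to_adjlist(rows):
    """Convert a square adjacency matrix into adjacency-list lines.

    A nonzero entry rows[i][j] is treated as a directed edge i -> j.
    Edge weights are discarded; only the presence of an edge is kept.
    """
    lines = []
    for source, row in enumerate(rows):
        neighbors = [str(target) for target, value in enumerate(row) if value != 0]
        # First token is the source node, the remaining tokens are its neighbors.
        lines.append(" ".join([str(source)] + neighbors))
    return lines

# Small 3x3 example matrix (made up for illustration).
example = "0,1,0\n0,0,1\n1,1,0\n"
adjlist_lines = matrix_to_adjlist(parse_matrix_csv(example))
# adjlist_lines == ["0 1", "1 2", "2 0 1"]
```

Every row index is written as a source node even when it has no outgoing edges, so an 84x84 matrix would yield 84 lines, one per node.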

