
simple's People

Contributors

baharefatemi


simple's Issues

"Min" Policy for ties in scoring

Hi, thank you for your work on this model. I really appreciate it.
I am writing because, while studying the code in the tester.py module, I came across the get_rank method:

    def get_rank(self, sim_scores):
        # Assumes the test fact's score is the first element of sim_scores.
        return (sim_scores > sim_scores[0]).sum() + 1.0

In this method, you compute the rank of the target entity as one plus the number of entities whose score is strictly higher than the target entity's own score.

This means that if other entities receive exactly the same score as the target entity, they are not counted in the ranking. In other words, in case of ties, you always return the minimum rank.
This is a "min" policy; is this the expected behavior?

I am asking because I believe the "min" policy is not the best one for link prediction models.
In theory, a model using the "min" policy could assign the same score to all entities in every prediction, and it would still achieve MRR = 1.0.

Of course, its effect depends on how prone your model is to assign the same, identical score to multiple answers: if there are no ties at all, the policy never comes into play.
In your experience, does SimplE generate ties?
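For reference, here is a small sketch (not code from this repo) showing how the three common tie policies differ on the same score vector; `policy` and the function shape are my own illustration:

```python
import numpy as np

def get_rank(sim_scores, policy="min"):
    """Rank of the test fact (first element of sim_scores) under a tie policy.

    "min": ties ranked optimistically (the behavior in tester.py)
    "max": ties ranked pessimistically
    "avg": ties receive the average of the min and max ranks
    """
    target = sim_scores[0]
    higher = int((sim_scores > target).sum())      # strictly better scores
    ties = int((sim_scores == target).sum()) - 1   # other entities tied with the target
    if policy == "min":
        return higher + 1.0
    if policy == "max":
        return higher + ties + 1.0
    return higher + ties / 2.0 + 1.0               # "avg"

scores = np.array([0.5, 0.9, 0.5, 0.5, 0.1])       # three-way tie with the target
print(get_rank(scores, "min"))  # 2.0
print(get_rank(scores, "max"))  # 4.0
print(get_rank(scores, "avg"))  # 3.0
```

Under "avg" (the policy used by e.g. `scipy.stats.rankdata`'s default), a degenerate constant-score model no longer gets a perfect MRR.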

How to build a new dataset

Sorry for asking such a naive question.
I'm really new to this area, and I'm trying to create a dataset of my own. The dataset already has three columns (h, r, t); how should I split them into train, valid, and test? For other (non-graph) datasets, I split them randomly 8:1:1. Is that appropriate for a graph dataset, or specifically for a SimplE dataset?
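A random 8:1:1 split is a common starting point, but for knowledge graphs you usually also want every entity and relation in valid/test to appear in train, or the model cannot score them. A hypothetical sketch (the function name and filtering strategy are my own, not from this repo):

```python
import random

def split_triples(triples, seed=0):
    """Split a list of (h, r, t) triples roughly 8:1:1, then move any held-out
    triple whose entity or relation never appears in train back into train."""
    rng = random.Random(seed)
    triples = triples[:]
    rng.shuffle(triples)
    n = len(triples)
    train = triples[: int(0.8 * n)]
    rest = triples[int(0.8 * n):]

    seen_ents = {e for h, r, t in train for e in (h, t)}
    seen_rels = {r for h, r, t in train}
    held = [x for x in rest
            if x[0] in seen_ents and x[2] in seen_ents and x[1] in seen_rels]
    train += [x for x in rest if x not in held]    # unseen entities/relations go to train

    mid = len(held) // 2
    return train, held[:mid], held[mid:]
```

The final valid/test sizes end up slightly below 10% each when rare entities get pushed back into train, which is the usual trade-off.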

Reproducing FB15K Results / FB15K-237?

I am trying to reproduce results, as well as try a new dataset, FB15K-237; I'd appreciate any thoughts you have (see questions at the end):

Reproducing FB15K results: I am using Windows 10, pytorch-nightly (1.1, May 15), Titan XP. I ran the example from the README.md:

python main.py -ne 1000 -lr 0.05 -reg 0.1 -dataset FB15K -emb_dim 200 -neg_ratio 10 -batch_size 4832 -save_each 50

Training took about 6 hours; validation takes about 1250 seconds per epoch. Results:

FB15K:

| Source     | Model       | MRR (Filter) | MRR (Raw) | Hit@1 | Hit@3 | Hit@10 |
|------------|-------------|--------------|-----------|-------|-------|--------|
| Paper      | SimplE-ignr | 0.700        | 0.237     | 0.625 | 0.754 | 0.821  |
| Paper      | SimplE      | 0.727        | 0.239     | 0.660 | 0.773 | 0.838  |
| My results | SimplE-a    | 0.726        | 0.240     | 0.659 | 0.770 | 0.837  |

FB15K-237
I looked at RotatE on GitHub and noticed that its test.txt, train.txt, and valid.txt exactly match the files in your FB15K directory. So I thought I would be able to run their FB15K-237 dataset with SimplE. Using the same command as above (but referencing FB15K237), I got the following, which seems a little low (notwithstanding that this is a much harder dataset, with inverse relations removed):

| Source     | Model    | MRR (Filter) | MRR (Raw) | Hit@1 | Hit@3 | Hit@10 |
|------------|----------|--------------|-----------|-------|-------|--------|
| My results | SimplE-b | 0.168        | 0.074     | 0.094 | 0.178 | 0.324  |

Questions:

  1. What is the proper command to recreate the SimplE results from the paper? I can already see I should have used 0.03 for the learning rate, for example; but clearly, the results are quite close anyway.
  2. Are the results from the paper from a single run, or the average/best over N runs?
  3. Are the modifications needed to run "SimplE-ignr" straightforward, or more involved?
  4. Are my run-times about the same as what you get?
  5. Does one need to do anything special to incorporate a new dataset like FB15K-237? The train/test/val files look like the same format to me.
  6. Does the removal of inverse relations from the dataset impact SimplE? Based on some of your comments in the paper, it seems it might, but where in the code is the impact particularly felt?
  7. Regarding section 6.2 from the paper, how did you incorporate the background knowledge?

Thanks!

Output of evaluation script with 0 values for Raw setting

I just ran the FB15K example and got this:

    Loss in iteration 1000: 160424.18872070312 (FB15K)
    Saving the model
    ~~~~ Select best epoch on validation set ~~~~
    50
    Raw setting:
            Hit@1 = 0.0
            Hit@3 = 0.0
            Hit@10 = 0.0
            MR = 0.0
            MRR = 0.0

    Fil setting:
            Hit@1 = 0.51695
            Hit@3 = 0.70038
            Hit@10 = 0.81951
            MR = 106.17348
            MRR = 0.6264143401644883

All the results for Raw setting are 0.
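For context, the raw and filtered settings differ only in whether other known true triples are removed from the candidate list before ranking; if the raw branch is never executed during evaluation, its metrics stay at their initialized value of 0. A minimal sketch (my own illustration, not this repo's code) of the two settings:

```python
import numpy as np

def rank(scores, target_idx, known_true=None):
    """Rank of the candidate at target_idx. In the filtered setting,
    other known true candidates are excluded before counting."""
    mask = np.ones(len(scores), dtype=bool)
    if known_true is not None:          # filtered setting
        mask[list(known_true)] = False
        mask[target_idx] = True         # never filter out the test fact itself
    return int((scores[mask] > scores[target_idx]).sum()) + 1

scores = np.array([0.6, 0.9, 0.8, 0.1])
# Raw: two candidates outscore the target.
print(rank(scores, 0))                  # 3
# Filtered: candidate 1 is another known true triple, so it is removed.
print(rank(scores, 0, known_true={1}))  # 2
```

So identical zeros across all raw metrics point to the raw computation being skipped (or its accumulators never updated), rather than to a genuinely zero score.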
