
Comments (3)

AmazingDD commented on May 26, 2024

self.wc = {self.unk: 1}

I think you can check this code.
I just treat it as one of the items and give it an index, e.g. 1,
then the actual items are added to the item2idx dict after it.
P.S. In the dev branch we do not focus on the Item2Vec model,
but you can see a demo of item2vec in my master branch.
Here is the link: https://github.com/AmazingDD/daisyRec/blob/master/test_kit/run_item2vec.py

from daisyrec.

ACnoWA commented on May 26, 2024

I think you may have misremembered your code a bit.
The values in the dictionary self.wc are the number of occurrences of each item.
The index of an item in item2idx is its position after sorting in descending order of occurrence count. So self.unk, with its count of 1, ends up at index max_item_num, which will cause errors in subsequent processing.
Also, may I ask why this item is added to the return value oitems of skip-gram?
Please don't hesitate to enlighten me! Thank you very much.
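
For readers following along, here is a minimal sketch of the situation described above. It is not the repository's code: the function name build_vocab, the "<UNK>" token, and the toy sequences are assumptions made for illustration. Only the two lines that initialize wc and build idx2item mirror the snippets quoted in this thread.

```python
UNK = "<UNK>"  # hypothetical stand-in for self.unk


def build_vocab(sequences, max_item_num):
    """Count items, then index them by descending count (illustrative only)."""
    wc = {UNK: 1}  # the placeholder starts with a count of 1
    for seq in sequences:
        for item in seq:
            wc[item] = wc.get(item, 0) + 1
    # Same slicing as the line quoted in the thread: keep the top max_item_num entries.
    idx2item = sorted(wc, key=wc.get, reverse=True)[:max_item_num]
    item2idx = {item: idx for idx, item in enumerate(idx2item)}
    return wc, idx2item, item2idx


# Toy data: three distinct items, each seen more than once.
sequences = [["a", "b", "c"], ["a", "b"], ["a", "c"]]
wc, idx2item, item2idx = build_vocab(sequences, max_item_num=3)

# Position the placeholder would have had without the slice.
unk_rank = sorted(wc, key=wc.get, reverse=True).index(UNK)
unk_was_dropped = UNK not in item2idx
```

In this toy case the placeholder ranks at position 3, which equals max_item_num, so the slice drops it and any later lookup of item2idx[UNK] would raise a KeyError.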


AmazingDD commented on May 26, 2024

I just looked at my code again and found that self.wc is the word count (the number of occurrences of each item).
At first, I thought this model might be used in a wider range of settings, where the items in the known dataset would not cover new items appearing in the future. Therefore, I thought it reasonable to create a fake unknown item to represent this situation. In any case, its count stays at 1 forever.

self.idx2item = sorted(self.wc, key=self.wc.get, reverse=True)[:max_item_num]

I guess the code would reflect my original intent if it were changed like this:
self.idx2item = sorted(self.wc, key=self.wc.get, reverse=True)[:max_item_num + 1]
so that it contains not only all the known items but also this fake item. The rest of the code is then similar to Word2Vec, as in other repositories.
Besides, as I mentioned before, I didn't focus on item2vec in our paper, so this code is only a toy implementation and doesn't even have an interface in main.py. To be honest, I kept this code only because it felt like a pity to delete it XD.
But if you have any ideas or optimizations, I'd be happy to merge your pull request!
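
A rough sketch of how the max_item_num + 1 change discussed in this comment might behave, under stated assumptions: build_vocab_with_unk, encode, and the "<UNK>" token are hypothetical names, not part of daisyRec. The sketch also assumes the data holds no more than max_item_num distinct items; with more, the placeholder could still be sliced off.

```python
UNK = "<UNK>"  # hypothetical stand-in for self.unk


def build_vocab_with_unk(sequences, max_item_num):
    """Index items by descending count, leaving one extra slot for the placeholder."""
    wc = {UNK: 1}
    for seq in sequences:
        for item in seq:
            wc[item] = wc.get(item, 0) + 1
    # The proposed change: slice one position further so the placeholder survives.
    idx2item = sorted(wc, key=wc.get, reverse=True)[:max_item_num + 1]
    item2idx = {item: idx for idx, item in enumerate(idx2item)}
    return idx2item, item2idx


def encode(seq, item2idx):
    """Map items to indices, sending unseen items to the placeholder's index."""
    unk_idx = item2idx[UNK]
    return [item2idx.get(item, unk_idx) for item in seq]


sequences = [["a", "b", "c"], ["a", "b"], ["a", "c"]]
idx2item, item2idx = build_vocab_with_unk(sequences, max_item_num=3)

# "z" never appeared in the training sequences, so it falls back to the placeholder.
encoded = encode(["a", "z", "c"], item2idx)
```

Here the placeholder keeps index 3, and the unseen item "z" is encoded with that index instead of raising an error.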

