Comments (3)
Line 192 in 421d16a
I think you can check this code.
I just treat it as one of the items and give it an index, e.g. 1;
the other actual items are then assigned indices one after another in the item2idx dict.
P.S. In the dev branch we do not focus on the Item2Vec model,
but you can see the item2vec demo in my master branch.
Here is the link: https://github.com/AmazingDD/daisyRec/blob/master/test_kit/run_item2vec.py
from daisyrec.
I think you may have forgotten part of your code.
The values of the dictionary self.wc are the numbers of occurrences of each item.
An item's index in item2idx is its position when the items are sorted in descending order of occurrence count. So the index of self.unk will be max_item_num, which will cause errors in subsequent processing.
Also, may I ask why this item is added to the return value oitems of skip-gram?
Please don't hesitate to enlighten me! Thank you very much.
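To make the concern above concrete, here is a minimal sketch, not the repository's actual code. It assumes the original slice is `[:max_item_num]` (inferred from the fix proposed later in the thread), and the token name `"<UNK>"`, the item names, and the counts are all made up for illustration.

```python
# Hypothetical data: the unknown token is counted once, real items more often.
UNK = "<UNK>"  # assumed placeholder name, stands in for self.unk
wc = {UNK: 1, "item_a": 5, "item_b": 3, "item_c": 2}
max_item_num = 3  # number of real items in this toy example

# Rank items by occurrence count, most frequent first.
ranked = sorted(wc, key=wc.get, reverse=True)

# With the lowest count, the unknown token lands at position max_item_num.
unk_rank = ranked.index(UNK)

# Assumed original behaviour: keep only the first max_item_num entries.
idx2item = ranked[:max_item_num]
item2idx = {item: idx for idx, item in enumerate(idx2item)}

# The unknown token has been cut off, so any lookup of it would fail.
unk_in_vocab = UNK in item2idx
print(unk_rank, unk_in_vocab)
```

Under these assumptions the unknown token sits exactly one position past the end of the slice, so it never receives an index.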
I just looked at my code again and found that self.wc is the word count (the number of occurrences of each item).
At first, I thought this model might be used more widely, so the items in the known dataset might not cover new items appearing in the future. Therefore, I thought it was reasonable to create a fake "unknown" item to represent this situation. In any case, its count stays at 1 forever.
Line 199 in 421d16a
I guess this code would express my original idea if it were changed like this:
self.idx2item = sorted(self.wc, key=self.wc.get, reverse=True)[:max_item_num + 1]
so that it contains not only all the known items but also this fake item. The code that follows is then similar to Word2Vec, as in other repositories.
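A rough sketch of how the `max_item_num + 1` slice behaves, using made-up data. The `"<UNK>"` token name, the counts, and the `encode` helper are hypothetical and not part of the repository.

```python
# Hypothetical data: the unknown token is counted once, real items more often.
UNK = "<UNK>"  # assumed placeholder name, stands in for self.unk
wc = {UNK: 1, "item_a": 5, "item_b": 3, "item_c": 2}
max_item_num = 3

# Proposed slice: one extra slot so the unknown token survives truncation.
idx2item = sorted(wc, key=wc.get, reverse=True)[:max_item_num + 1]
item2idx = {item: idx for idx, item in enumerate(idx2item)}


def encode(item):
    """Hypothetical helper: map unseen items to the unknown token's index."""
    return item2idx.get(item, item2idx[UNK])


print(item2idx[UNK], encode("item_b"), encode("never_seen"))
```

Note that indices now run from 0 to max_item_num inclusive, so any embedding table indexed by them would need max_item_num + 1 rows. This also relies on every real item having a count above 1; an item that occurs only once would tie with the unknown token.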
Besides, as I mentioned before, I didn't focus on item2vec in our paper, so this code is only a toy implementation and doesn't even have an interface in main.py. To be honest, I kept this code only because I thought deleting it would be a pity XD. But if you have any ideas or optimizations, I'd be happy to merge your pull request!