Comments (8)
Could you please share your setting for num_neg, which is used during training for negative sampling? I found this parameter has a strong effect on the validation loss.
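For context, a minimal sketch of what num_neg-style negative sampling typically looks like in this kind of implementation. This is an illustration, not the repo's actual code; the names `sample_training_instances` and `user_item_set` are my own, and I assume observed interactions are stored as a set of (user, item) pairs:

```python
import random

def sample_training_instances(user_item_set, num_items, num_neg=4, seed=0):
    """For each observed (user, item) pair, draw num_neg unobserved items
    as negatives (label 0) alongside the positive (label 1)."""
    rng = random.Random(seed)
    users, items, labels = [], [], []
    for u, i in sorted(user_item_set):
        users.append(u); items.append(i); labels.append(1)
        for _ in range(num_neg):
            j = rng.randrange(num_items)
            while (u, j) in user_item_set:  # resample until the item is unobserved
                j = rng.randrange(num_items)
            users.append(u); items.append(j); labels.append(0)
    return users, items, labels
```

With num_neg=4, every positive yields four negatives, so the training set has a 1:4 positive-to-negative ratio, which is why the value chosen here can shift the loss scale noticeably.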
from neural_collaborative_filtering.
So I suppose the error you mentioned here refers to the binary cross-entropy loss on the validation set? If so, I think it would be more appropriate to use a ranking metric like HR or NDCG instead. The reason is that the number of negative samples differs between the training set and the validation set (1:4 vs. 1:99 in the *.neg.dat data file), so the loss may behave strangely.
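To make the suggested metrics concrete, here is a rough sketch of HR@k and NDCG@k under the paper's leave-one-out protocol, where each held-out positive is ranked against its 99 sampled negatives. The function name and the assumption that `scores` maps item id to predicted score are mine:

```python
import math

def hit_ratio_and_ndcg(scores, pos_item, k=10):
    """Rank the held-out positive among itself + its 99 sampled negatives.
    HR@k is 1 if the positive lands in the top-k; NDCG@k discounts by rank."""
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    if pos_item in ranked:
        rank = ranked.index(pos_item)              # 0-based position in the top-k
        return 1.0, math.log(2) / math.log(rank + 2)
    return 0.0, 0.0
```

Unlike the raw BCE loss, these numbers are comparable across splits regardless of the positive-to-negative ratio, since they only depend on the relative ordering of the 100 candidates.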
So you paired another 4 negative samples with each positive sampled from the original training set to make a new validation set? If so, I think you could check whether those negative samples overlap with any positive training samples. If all the negative samples are correctly chosen, I don't have a better idea about this issue at present.
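The overlap check suggested above is a simple set intersection. A sketch, assuming `train_positives` and `val_negatives` are lists of (user, item) pairs (both names are illustrative):

```python
def find_leaked_negatives(train_positives, val_negatives):
    """Return validation 'negatives' that are actually observed training
    positives -- these would corrupt the validation labels."""
    positive_set = set(train_positives)
    return [pair for pair in val_negatives if pair in positive_set]
```

If this returns a non-empty list, some validation negatives carry the wrong label, which could partly explain an oddly behaving validation loss.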
You're welcome :)
Hey @HenryNebula ,
My question is in some ways related to the validation loss, so I felt I should comment here rather than creating a new issue. The question is somewhat trivial. I've read the paper and run the models successfully, but I'm a bit confused about one part.
The use of a sigmoid activation with binary cross-entropy is making me fuzzy. To me this is more or less a regression problem, i.e., when applied to the MovieLens data set, trying to predict movie ratings based on previous interactions, an 'mse' loss along with 'relu' or even linear activations makes sense, but how come a 'sigmoid' function is used for activation on the last layer?
Wouldn't the sigmoid function always produce outputs in [0, 1]? Even if we perform N hyper-parameter tuning steps and apply other regularization techniques, a sigmoid function technically never crosses an output of 1.0, right? I looked at other implementations of this paper and found pretty much the same thing: a sigmoid function at the end.
I'm not trying to contradict your or the original authors' idea here; I'm just trying to figure out how a sigmoid activation makes sense for this problem, or whether there's a piece I'm missing fundamentally. Let me know what you think.
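For anyone landing on this thread with the same question: the NCF paper frames implicit feedback as binary classification, not rating regression. The labels are 1 (interaction observed) or 0 (sampled negative), so the sigmoid's [0, 1] range is exactly the target space, and binary cross-entropy is the matching loss. A toy sketch (the function names are mine, not from the repo):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bce(y_true, logit):
    """Binary cross-entropy on a sigmoid output: the model predicts the
    probability that a (user, item) interaction occurred, not a rating."""
    p = sigmoid(logit)
    eps = 1e-12                      # guard against log(0)
    return -(y_true * math.log(p + eps) + (1 - y_true) * math.log(1 - p + eps))
```

If the goal were instead to predict explicit ratings (1-5 stars), the commenter's instinct would be right: an MSE loss with a linear output head would be the natural choice. The sigmoid only looks odd until you notice the ratings have been binarized into interacted/not-interacted.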