Comments (4)
Hi @hyp1231, I read through the 0.1.x version again and noticed that padding is currently done in the dataset rather than in the dataloader. This could become a problem when the batch data goes beyond 2D, for example a batch of comments (a 3D tensor) or a batch of images (a 4D tensor).
Would it therefore be more flexible to implement the padding within the dataloader, so that the function dict_to_interaction only converts the numpy arrays to torch tensors?
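To make the question concrete, here is a minimal sketch of batch-level padding, assuming each sample is a plain dict mapping a field name to a (possibly nested) Python list. The helpers `max_shape`, `make_full`, `pad_to`, and `collate_with_padding` are hypothetical names for illustration and are not part of RecBole.

```python
def max_shape(x):
    """Return the largest length found at each nesting depth of a nested list."""
    if not isinstance(x, list):
        return ()
    inner = [max_shape(e) for e in x]
    depth = max((len(s) for s in inner), default=0)
    dims = tuple(
        max((s[i] for s in inner if len(s) > i), default=0) for i in range(depth)
    )
    return (len(x),) + dims


def make_full(shape, value):
    """Build a nested list of the given shape filled with `value`."""
    if not shape:
        return value
    return [make_full(shape[1:], value) for _ in range(shape[0])]


def pad_to(x, shape, pad_value=0):
    """Pad the nested list `x` so that it has exactly `shape`."""
    if not shape:
        return x
    padded = [pad_to(e, shape[1:], pad_value) for e in x]
    missing = shape[0] - len(x)
    padded += [make_full(shape[1:], pad_value) for _ in range(missing)]
    return padded


def collate_with_padding(samples, pad_value=0):
    """Collate {field: nested list} samples, padding each field to the batch maximum."""
    batch = {}
    for field in samples[0]:
        values = [sample[field] for sample in samples]
        batch[field] = pad_to(values, max_shape(values), pad_value)
    return batch


# Two users, each with a variable number of variable-length comments.
samples = [
    {"user_id": 7, "comments": [[1, 2], [3]]},
    {"user_id": 9, "comments": [[4, 5, 6]]},
]
batch = collate_with_padding(samples)
```

Because the padded shape is computed per batch, the same function handles 2D, 3D, or deeper fields. A function like this could be passed as the `collate_fn` argument of `torch.utils.data.DataLoader`, with the resulting lists converted to tensors afterwards.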
Hi @rowedenny, thanks for the kind suggestion, and sorry for the delayed reply. It's an interesting question, and our development team has discussed this issue thoroughly.
First, regarding higher-dimensional tensors: as a trade-off between uniformity and flexibility, it may be a good idea to store them as SEQ-like features (e.g. `token_seq` or `float_seq`) in atomic files (they are actually flattened at present). After being fed into the model, these tensors can be reshaped manually into 3D or 4D according to hyperparameters. This way we still have only four feature types, but we can support higher-dimensional tensor inputs.
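As a rough illustration of that reshaping step, the sketch below assumes each row of a `float_seq` field holds one flattened image and that the channel, height, and width values come from hyperparameters. The function `unflatten_images` and the constant names are hypothetical, and NumPy is used only to keep the example small.

```python
import numpy as np

# Hypothetical hyperparameters describing the original image layout.
CHANNELS, HEIGHT, WIDTH = 3, 4, 4


def unflatten_images(flat_batch, channels, height, width):
    """Reshape a (batch, channels*height*width) array into (batch, channels, height, width)."""
    flat_batch = np.asarray(flat_batch, dtype=np.float32)
    expected = channels * height * width
    if flat_batch.ndim != 2 or flat_batch.shape[1] != expected:
        raise ValueError(
            f"expected shape (batch, {expected}), got {flat_batch.shape}"
        )
    return flat_batch.reshape(-1, channels, height, width)


# A batch of two flattened images, as they might be read from a float_seq field.
flat = np.arange(2 * CHANNELS * HEIGHT * WIDTH).reshape(2, -1)
images = unflatten_images(flat, CHANNELS, HEIGHT, WIDTH)
```

The shape check matters here: once the data is flattened, a mismatch between the stored sequence length and the hyperparameters can only be caught at reshape time. Inside a model, the same step on a torch tensor would typically use `reshape` or `view`.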
Moreover, we found that the time bottleneck in the 0.1.x branch was the conversion from `pandas.DataFrame` to `Interaction` in the `DataLoader`. @chenyushuo therefore refactored this, and in the 0.2.x branch these conversions are done in the `Dataset`, which speeds things up considerably. In this setup, however, it is much more difficult to implement the padding within the `DataLoader`.
By the way, we have just opened Discussions, and you are welcome to try it out! :D
I recently read through the implementation of version 0.2 and found that `model.type` is a key factor in how the pipeline decides which dataset, dataloader, and sampler to use. So I am wondering whether we could register an enum value named `customized` so that users can freely implement their own corresponding dataset, dataloader, and sampler?
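One way such a hook might look is a small registry keyed by model type. This is only a sketch of the idea: the `ModelType` enum here is a stand-in defined for the example (not RecBole's own enum), and `register_pipeline`, `resolve_pipeline`, and the `My*` classes are hypothetical.

```python
from enum import Enum


class ModelType(Enum):
    """Stand-in enum for illustration; not RecBole's actual definition."""
    GENERAL = 1
    SEQUENTIAL = 2
    CUSTOMIZED = 3


# Maps a model type to the (dataset, dataloader, sampler) classes to use.
_PIPELINES = {}


def register_pipeline(model_type, dataset_cls, dataloader_cls, sampler_cls=None):
    """Associate user-supplied classes with a model type."""
    _PIPELINES[model_type] = (dataset_cls, dataloader_cls, sampler_cls)


def resolve_pipeline(model_type):
    """Look up the classes registered for a model type."""
    try:
        return _PIPELINES[model_type]
    except KeyError:
        raise ValueError(f"no pipeline registered for {model_type}") from None


# Hypothetical user-defined components.
class MyDataset: ...
class MyDataLoader: ...
class MySampler: ...


register_pipeline(ModelType.CUSTOMIZED, MyDataset, MyDataLoader, MySampler)
dataset_cls, dataloader_cls, sampler_cls = resolve_pipeline(ModelType.CUSTOMIZED)
```

With this kind of lookup, the pipeline would not need to know about user classes in advance; it would only consult the registry when the model declares the customized type.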
@rowedenny Hi, thanks for your advice; we will consider it carefully. By the way, we have released a new version (v1.0.0) in which we rebuilt the dataloader; you can read our latest code for more details.