zenogantner / mymedialite
recommender system library for the CLR (.NET)
Home Page: http://mymedialite.net
Implement example web application that uses the web service interface.
The recommender type can also be derived from the information in the model file.
create a Fedora package of MyMediaLite
add pre- and post-filter APIs to MyMediaLite
pre-filters generate candidate lists
post-filter
user and item IDs could be uints (it is assumed anyway that they are >= 0)
Same for index data types in many places in the library.
Those changes would make it harder to port MyMediaLite to Java, so we had better be careful.
Not high priority.
GraphLab has a nice library of rating prediction algorithms based on matrix/tensor factorization:
http://graphlab.org/pmf.html
It would be nice to have an interface to GraphLab, both to use this library and to use other recommenders written "in" GraphLab that exploit its particular parallelization features.
Support chronological splits, both relative to user history and to absolute times.
--chronological-split=DATETIME
--chronological-split=RATIO
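The two split modes can be illustrated with a minimal Python sketch. It is not MyMediaLite code: the tuple layout `(user, item, rating, timestamp)` and the function names are assumptions made for illustration, and the ratio mode is read here as "relative to each user's history".

```python
from collections import defaultdict
from datetime import datetime

def split_by_time(ratings, cutoff):
    """Absolute split: events before `cutoff` train, the rest test."""
    train = [r for r in ratings if r[3] < cutoff]
    test = [r for r in ratings if r[3] >= cutoff]
    return train, test

def split_per_user(ratings, ratio):
    """Relative split: the oldest `ratio` share of each user's history trains."""
    by_user = defaultdict(list)
    for r in ratings:
        by_user[r[0]].append(r)
    train, test = [], []
    for events in by_user.values():
        events.sort(key=lambda r: r[3])  # oldest first
        n_train = int(len(events) * ratio)
        train.extend(events[:n_train])
        test.extend(events[n_train:])
    return train, test

# Hypothetical toy data: (user, item, rating, timestamp)
ratings = [
    (1, 10, 4.0, datetime(2010, 1, 5)),
    (1, 11, 3.0, datetime(2010, 3, 1)),
    (2, 10, 5.0, datetime(2010, 2, 10)),
    (2, 12, 2.0, datetime(2010, 4, 20)),
]
train_abs, test_abs = split_by_time(ratings, datetime(2010, 3, 1))
train_rel, test_rel = split_per_user(ratings, 0.5)
```

With an absolute cutoff, a user whose whole history lies after the cutoff ends up with no training data, which is one reason to offer the per-user variant as well.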
Support track 1+2 in rating prediction program, and track 2 in item prediction program.
Support the combination of several recommenders by the command-line programs.
... so that people know where to read about the implemented method.
Paper: http://www.ismll.uni-hildesheim.de/pub/pdfs/RendleFreudenthaler2010-FPMC.pdf
Can be used for next-basket recommendations (=recommendations based on the last purchase)
For this, create ISequentialItemRecommender.
Output graphs (image files or CSV files) for things like precision@N and recall@N for different N.
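As a rough sketch of the CSV variant, the following Python snippet computes precision@N and recall@N for several cutoffs and writes them as CSV text. The function names and the column layout are assumptions for illustration, not an existing MyMediaLite output format.

```python
import csv
import io

def precision_recall_at(ranked, relevant, n):
    """Precision and recall over the top-n entries of a ranked item list."""
    hits = sum(1 for item in ranked[:n] if item in relevant)
    precision = hits / n
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

def curve_as_csv(ranked, relevant, cutoffs):
    """One CSV row per cutoff N, ready for plotting with an external tool."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["n", "precision", "recall"])
    for n in cutoffs:
        p, r = precision_recall_at(ranked, relevant, n)
        writer.writerow([n, f"{p:.4f}", f"{r:.4f}"])
    return buf.getvalue()

ranked = [5, 3, 9, 1, 7]   # hypothetical recommendation list, best first
relevant = {3, 7, 8}       # hypothetical held-out items of the user
print(curve_as_csv(ranked, relevant, [1, 2, 5]))
```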
MyMediaLite.dll : core library without external dependencies
MyMediaLite.SVM.dll : recommenders that need LIBSVM
MyMediaLite.Math.NET.dll : recommenders that need Math.NET
MyMediaLiteExperimental.dll : experimental code
MyMediaLiteExperimental.SVM.dll
MyMediaLiteExperimental.Math.NET.dll
two modes: split relations, do not split relations
support top-n evaluation (and other item prediction measures) in the rating prediction command-line program
Chris wrote:
I've done some work with Mahout, and one feature I appreciate is that it stores the user and item mappings with the model data when you save it.
It makes it easier to resurrect a recommender and reduces the likelihood I'll get all the IDs mixed up!
Hyperparameter search by line/grid search and Nelder-Mead should be supported for all recommenders.
For recommenders that use a learn rate (=step size), there should also be routines for learning good step sizes.
This will push MyMediaLite more towards being usable as a black-box tool.
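A generic grid search needs only a callable that maps a hyperparameter setting to a validation error. The Python sketch below shows the idea; `grid_search` and `toy_rmse` are hypothetical names, and the toy objective merely stands in for "train the recommender and measure RMSE on held-out data".

```python
from itertools import product

def grid_search(evaluate, grid):
    """Try every combination in `grid` and keep the one with the lowest error.

    evaluate: callable taking the hyperparameters as keyword arguments
    grid:     dict mapping a hyperparameter name to a list of candidate values
    """
    names = sorted(grid)
    best_params, best_error = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        error = evaluate(**params)
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error

def toy_rmse(learn_rate, regularization):
    # Stand-in objective with its minimum at (0.01, 0.1).
    return (learn_rate - 0.01) ** 2 + (regularization - 0.1) ** 2

best, err = grid_search(
    toy_rmse,
    {"learn_rate": [0.001, 0.01, 0.1], "regularization": [0.01, 0.1, 1.0]},
)
```

Because the search only sees a callable, the same routine could be reused for any recommender that exposes its hyperparameters by name.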
Create an interface for active learning recommenders, i.e. recommenders that request certain items to be rated by a user in order to improve the predictive model.
Currently we have binary relations over users or items.
In the future, we additionally may want to have
Only read in a certain percentage of the training data:
--sample-ratings=RATIO
--sample-users=RATIO
--sample-items=RATIO
for rating prediction and item prediction
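What the three options could do is sketched below in Python for ratings and users (items would work like users). The function names, the tuple layout `(user, item, rating)`, and the seeding are assumptions, not existing MyMediaLite behavior.

```python
import random

def sample_ratings(ratings, ratio, seed=0):
    """Keep each rating independently with probability `ratio`."""
    rng = random.Random(seed)
    return [r for r in ratings if rng.random() < ratio]

def sample_users(ratings, ratio, seed=0):
    """Keep all ratings of a random `ratio` share of the users."""
    rng = random.Random(seed)
    users = sorted({r[0] for r in ratings})
    kept = set(rng.sample(users, round(len(users) * ratio)))
    return [r for r in ratings if r[0] in kept]

# Hypothetical data: 10 users with 5 ratings each.
ratings = [(u, i, 3.0) for u in range(10) for i in range(5)]
subset = sample_users(ratings, 0.3)
```

Sampling by user keeps complete user histories intact, whereas sampling individual ratings thins out every history; the two therefore answer different experimental questions.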
The idea is that users of other software packages can use those to create the predictions, and then evaluate the predictions using MyMediaLite's evaluation routines.
Suggested by Lucas Drumond.
The current solution is not the most elegant.
KNN recommenders are (usually) not iterative models, so we should rather use the UserItemBaseline via composition, not inheritance.
Support CV for cold-start evaluation protocols
Currently, we instantiate a recommender and then load a model via its LoadModel() method.
It would be nice to have a tiny helper tool that looks into the model file, instantiates the recommender by itself, and then does the above.
Create an example that explains how to use MyMediaLite from F#, and how to implement a new recommender in F#.
For each positive item, sample a negative item according to its overall frequency/popularity.
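One way to realize this is rejection sampling from the item popularity distribution, as in the Python sketch below. The data layout (a dict from user to the set of positive items) and the name `sample_negatives` are assumptions for illustration.

```python
import random
from collections import Counter

def sample_negatives(user_items, seed=0):
    """For each (user, positive item) pair, draw one negative item.

    Negatives are drawn with probability proportional to overall item
    frequency; draws that the user has already interacted with are rejected.
    """
    rng = random.Random(seed)
    counts = Counter(i for items in user_items.values() for i in items)
    pool, weights = zip(*sorted(counts.items()))
    triples = []
    for user, positives in user_items.items():
        if len(positives) == len(pool):
            continue  # the user has seen every item: no negative exists
        for pos in sorted(positives):
            while True:
                neg = rng.choices(pool, weights=weights, k=1)[0]
                if neg not in positives:
                    break
            triples.append((user, pos, neg))
    return triples

user_items = {1: {10, 11}, 2: {10, 12}, 3: {10}}  # hypothetical feedback
triples = sample_negatives(user_items)
```

Popular items are thus chosen as negatives more often than under uniform sampling, which changes what the model learns about them.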
Give an example of how to get training data from a database.
For recommenders that are trained with gradient-based algorithms we need suitable learn rates. These usually differ from data set to data set. MyMediaLite should contain a routine that automatically finds a suitable learn rate for a given data set.
add namespace ContextAwareRecommendation with the interfaces IContextAwareItemRecommender (also covers tag recommendation, time-aware recommendations, and search queries) and IContextAwareRatingRecommender
... for reading in attributes
A rating prediction program that does not need training data, but just relies on the model file to make predictions.
Will not work for memory-based recommenders; we will also take care to change the model file format to incorporate user ID and item ID mappings.
Create an interface for recommenders that aggregate score predictions for several users.
The item and rating command-line programs should remain in the core repository, but the attribute-to-factor mapping code and the GUI demo could go into another repository.
Currently, the user (of the command-line tool or the library) has to set the minimum and maximum ratings manually (if they are not the default 1 and 5). It would be more convenient to read them from the data, while still allowing the user to set them manually if necessary.
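The intended behavior amounts to "infer the bounds unless the user overrides them". A small Python sketch, with hypothetical function names and a `(user, item, rating)` tuple layout assumed for illustration:

```python
def rating_range(ratings, min_rating=None, max_rating=None):
    """Return (min, max): explicit overrides win, otherwise use the data."""
    values = [r[2] for r in ratings]
    lo = min(values) if min_rating is None else min_rating
    hi = max(values) if max_rating is None else max_rating
    return lo, hi

def clip(prediction, lo, hi):
    """Restrict a predicted rating to the valid range."""
    return max(lo, min(hi, prediction))

ratings = [(1, 10, 0.5), (1, 11, 4.5), (2, 10, 3.0)]  # hypothetical data
lo, hi = rating_range(ratings)                  # inferred from the data
lo2, hi2 = rating_range(ratings, max_rating=5.0)  # manual override of the maximum
```

The override remains useful when the training sample happens not to contain the extreme values of the rating scale.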
Currently, attributes are supposed to be binary: https://github.com/zenogantner/MyMediaLite/blob/master/src/MyMediaLite/IO/AttributeData.cs
It would be nice if the recommender API supported at least binary and real-valued attributes, and the IO methods supported binary, real-valued, nominal, and text attributes, and would map them accordingly to binary and real-valued attributes.
Currently, parsing floats/doubles in the Mono.Options command-line parameters follows the current locale.
This is not desirable, because we want the command line options to be the same everywhere so that people can copy+paste commands from the documentation etc.
Currently, the bold-driver learning rate adaptation schemes (for BiasedMatrixFactorization and BPRMF) use fixed values to increment/decrement the step size. These values should be configurable (and set to sensible defaults).
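A configurable bold-driver rule could look like the following Python sketch. The multiplicative form, the parameter names, and the default factors 1.05 and 0.5 are assumptions chosen for illustration; they are not taken from the MyMediaLite source. The demo minimizes f(x) = x^2 by gradient descent instead of training a recommender.

```python
def bold_driver(learn_rate, previous_loss, current_loss,
                increase=1.05, decrease=0.5):
    """Grow the step size after an improvement, shrink it otherwise."""
    if current_loss < previous_loss:
        return learn_rate * increase
    return learn_rate * decrease

# Toy demo: gradient descent on f(x) = x^2, whose gradient is 2x.
x, rate = 5.0, 0.1
loss = x * x
for _ in range(20):
    x -= rate * 2 * x
    new_loss = x * x
    rate = bold_driver(rate, loss, new_loss)
    loss = new_loss
```

Exposing `increase` and `decrease` as parameters is the configurability asked for above; some bold-driver variants additionally undo the last update when the loss goes up, which this sketch omits.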
For consistency with the item prediction program, and because it would be a useful feature.
That way, we could also generate rating predictions for arbitrary items.
... by parallelizing the candidate score computations.
--static: slower loading (2 passes), less memory consumption, faster access, no new data can be added
--non-static: faster loading (1 pass), new data can be added
Chris suggested this for item prediction.
An interface for this could be:
IList<WeightedItems> Predict(IList<int> watched_items, IList<int> candidate_items)
IList<int> Predict(IList<int> watched_items, IList<int> candidate_items, int n)
This would train features for a user specified by the list watched_items, and then predict scores for the list candidate_items.
One additional thing to consider for the interface would be to extend the interface to allow user attributes (not supported by BPR-MF, but possibly by other recommenders):
IList<double> Predict(IList<int> watched_items, var user_features, IList<int> candidate_items)
Also implement a similar thing for rating prediction, like
IList<WeightedItem> PredictItems(IList<WeightedItem> rated_items);
IList<int> PredictItems(IList<WeightedItem> rated_items, int n);
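To make the shape of such an interface concrete, here is a Python sketch of the item prediction case. Note the swapped-in technique: instead of training user features as proposed above, it simply averages the latent factors of the watched items, which is a deliberately crude stand-in. The `item_factors` dictionary and the function name `predict` are hypothetical.

```python
def predict(watched_items, candidate_items, item_factors, n=None):
    """Score candidates for an ad-hoc user described only by watched items.

    The user vector is the mean of the watched items' factor vectors
    (a stand-in for actually training user features).
    Returns (item, score) pairs, or only the top-n item IDs if n is given.
    """
    dims = len(next(iter(item_factors.values())))
    known = [i for i in watched_items if i in item_factors]
    if known:
        user = [sum(item_factors[i][d] for i in known) / len(known)
                for d in range(dims)]
    else:
        user = [0.0] * dims  # nothing known about the user
    scored = [
        (item, sum(u * v for u, v in zip(user, item_factors[item])))
        for item in candidate_items if item in item_factors
    ]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))  # best score first
    if n is None:
        return scored
    return [item for item, _ in scored[:n]]

# Hypothetical 2-dimensional item factors from an already trained model.
item_factors = {1: [1.0, 0.0], 2: [0.9, 0.1], 3: [0.0, 1.0], 4: [0.8, 0.2]}
top = predict([1, 2], [3, 4], item_factors, n=1)
```

The two return shapes mirror the two proposed overloads: weighted items when no cutoff is given, plain item IDs when `n` is set.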
The current command-line programs for item and rating prediction share many concepts.
It may be worthwhile to implement those shared concepts in one class and derive both programs from it.