lienm / recpack

GitHub Mirror of RecPack: Experimentation Toolkit for Top-N Recommendation (see https://gitlab.com/recpack-maintainers/recpack)

Home Page: https://recpack.froomle.ai

License: GNU Affero General Public License v3.0

Python 100.00%
implicit-feedback machine-learning python recommender-system recommender-systems

recpack's People

Contributors

crapsjeroen, joeydp, joostatsooj, joostuautsooj, lienm, obe-froomle, ttn114, verachtertr

recpack's Issues

Random recommender implementation

Just reporting that the Random recommender has the same issue as in #4.

As an alternative, I want to share my implementation of the Random recommender, which I found to be faster and more concise.
It replaces parameter 'K' with parameter 'density'.

import random

import numpy as np
from scipy.sparse import csr_matrix, random as random_matrix

from recpack.algorithms.base import Algorithm


class Random(Algorithm):
    """Uniform random algorithm, each item has an equal chance of getting recommended.

    Simple baseline: scores are drawn uniformly at random for a fraction
    ``density`` of all user-item pairs. If ``use_only_interacted_items`` is
    set, only items that were interacted with in the matrix provided to fit
    keep their scores.

    :param density: The density of the random matrix, defaults to 1.0 (fully dense)
    :type density: float, optional
    :param seed: Seed for the random number generator used, defaults to None
    :type seed: int, optional
    :param use_only_interacted_items: Whether only items visited in the training
        matrix should be recommended. If False, all items are recommended
        uniformly at random.
        Defaults to True.
    :type use_only_interacted_items: boolean, optional
    """

    def __init__(self, density=1.0, seed=None, use_only_interacted_items=True):
        super().__init__()
        self.density = density
        self.use_only_interacted_items = use_only_interacted_items

        if seed is None:
            # NumPy-based generators only accept integer seeds below 2**32.
            seed = random.randrange(2**32)
        self.seed = seed

    def _fit(self, X: csr_matrix):
        # Pass the seed explicitly: scipy.sparse.random does not use the state
        # of Python's `random` module, so random.seed() has no effect on it.
        self.random_matrix_ = random_matrix(
            *X.shape, density=self.density, format='csr', random_state=self.seed
        )
        if self.use_only_interacted_items:
            # Keep only the columns of items that occur at least once in X.
            I = list(set(X.nonzero()[1]))
            X_mask = np.zeros(X.shape, dtype=bool)
            X_mask[:, I] = True
            self.random_matrix_ = self.random_matrix_.multiply(X_mask).tocsr()

    def _predict(self, X: csr_matrix):
        """Predicts random scores for items per user.

        Returns the sparse matrix generated in fit, which has the shape
        of the training matrix and holds random scores for items.
        """

        return self.random_matrix_
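
As a standalone illustration of the idea (not the recpack class itself), the sketch below builds a seeded random score matrix with scipy.sparse.random and removes items without interactions by multiplying with a sparse diagonal matrix rather than a dense boolean mask. The function name random_scores and the diagonal-mask approach are assumptions made for this example.

```python
import numpy as np
from scipy.sparse import csr_matrix, diags, random as random_matrix


def random_scores(X: csr_matrix, density: float = 1.0, seed: int = 0) -> csr_matrix:
    """Hypothetical helper: seeded random scores with the shape of X."""
    # random_state makes the draw reproducible for a given seed.
    scores = random_matrix(*X.shape, density=density, format="csr", random_state=seed)

    # 1.0 for items with at least one interaction in X, 0.0 otherwise.
    interacted = np.zeros(X.shape[1])
    interacted[np.unique(X.nonzero()[1])] = 1.0

    # Right-multiplying by a diagonal matrix scales each column,
    # which zeroes the columns of items that were never interacted with.
    return (scores @ diags(interacted)).tocsr()


X = csr_matrix(np.array([[1, 0, 1], [0, 0, 1]]))
first = random_scores(X, seed=42)
second = random_scores(X, seed=42)
```

With the same seed both calls should yield identical matrices, and the column of item 1, which has no interactions in X, stays empty.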

Recommending 'consumed' items

I might be wrong, but judging by the code of the Popularity recommender, it seems to recommend the same set of items to every user. The comments actually say so: "all users are recommended the same items".

The issue is that a user might already have some of the recommended items in their profile. A typical recommendation scenario is to recommend new items that a user hasn't accessed yet. There are rarer cases where recommending already known items is meaningful (the so-called 'reminders', e.g. batteries), but that is not the common case.

Where is this filtering taken care of? Is this considered a post-processing step in the library?
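
To make the question concrete, here is a minimal sketch of what such a post-processing step could look like on a sparse score matrix. It is not taken from recpack, and whether the library does this elsewhere is exactly what the question asks; the function name remove_seen_items is hypothetical.

```python
import numpy as np
from scipy.sparse import csr_matrix


def remove_seen_items(scores: csr_matrix, history: csr_matrix) -> csr_matrix:
    """Hypothetical post-processing: drop scores of items already in the user's history."""
    seen = history.astype(bool)
    # Subtract the scores at seen positions from themselves, leaving zeros there.
    filtered = (scores - scores.multiply(seen)).tocsr()
    # Remove the explicit zeros so they are not treated as stored scores.
    filtered.eliminate_zeros()
    return filtered


# Every user gets the same popularity-style scores for three items.
scores = csr_matrix(np.array([[0.5, 0.3, 0.2], [0.5, 0.3, 0.2]]))
# User 0 already consumed item 0, user 1 already consumed item 2.
history = csr_matrix(np.array([[1, 0, 0], [0, 0, 1]]))
filtered = remove_seen_items(scores, history)
```

After filtering, each user keeps scores only for items that are not in their history, so the top-ranked items differ per user even though the input scores were identical.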

User and item id mappings are gone after applying filters

The apply method of some filters returns a copy of the passed dataframe.
So when the process_many method of DataFramePreprocessor assigns uid and iid, it is no longer done on the original dataframe, but on its copy. And since this copy is not returned, the original dataframe is not affected. It will have no id mappings, and will not be filtered.

Potential fix: either do all dataframe manipulations inplace in filters, or return the modified dataframe from the process method.
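
The effect can be reproduced outside the library with a few lines of pandas. This is a simplified sketch, not the DataFramePreprocessor code: filter_copy, process_discarding and process_returning are hypothetical stand-ins for a filter's apply method and for the two variants of the processing step.

```python
import pandas as pd


def filter_copy(df: pd.DataFrame, min_count: int = 2) -> pd.DataFrame:
    """Hypothetical filter: keeps items seen at least min_count times, returns a copy."""
    counts = df["item"].map(df["item"].value_counts())
    return df[counts >= min_count].copy()


def process_discarding(df: pd.DataFrame) -> None:
    # Rebinding the local name does not touch the caller's dataframe.
    df = filter_copy(df)
    df["iid"] = df["item"].astype("category").cat.codes
    # The filtered, mapped copy is lost here because nothing is returned.


def process_returning(df: pd.DataFrame) -> pd.DataFrame:
    df = filter_copy(df)
    df["iid"] = df["item"].astype("category").cat.codes
    return df  # the caller receives the filtered dataframe with id mappings


raw = pd.DataFrame({"user": ["u1", "u2", "u3"], "item": ["a", "a", "b"]})
process_discarding(raw)            # raw is neither filtered nor mapped
processed = process_returning(raw)  # filtered and mapped
```

Returning the modified dataframe, the second of the two proposed fixes, avoids having to make every filter operate in place.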

Add SANSA model ("sparse EASE")

Hi, I'm one of the authors of SANSA and we'd like to ask you if we could add our model to your framework.

The official implementation of our model is open-source and we've recently released it as a package. Therefore, it will be relatively straightforward to write an interface similar to how it's done for EASE, except that the core (training) would just be a call to a function from our package. We'll be happy to create a pull request, just let us know if you have some instructions/requirements for us :)

There is one non-standard dependency, though: the numerical library SuiteSparse. Our readme describes how to install it and "link it" for the package installation. I think the easiest way would be to include our package in the optional requirements (in pyproject.toml), and to describe in the readme that SuiteSparse must be installed as a prerequisite to our package (the details should just point to our readme, where it is explained).

Let me know what you think, thanks! :)

Popularity recommender with less than K items

A minor thing that could be an issue for smaller datasets.

Since the default value of K is 200 for the Popularity recommender, it will fail with a ValueError if the dataset has fewer than 200 items.
This is because the lengths of U, I and V differ when the csr_matrix is created:

U, I, V = [], [], []

for user in users:
    U.extend([user] * self.K)
    I.extend(items)
    V.extend(values)

score_matrix = csr_matrix((V, (U, I)), shape=X.shape)

This of course can be solved by manually providing the value for K, but we can't rely on that.

Suggested fix:
U.extend([user] * min(self.K, len(items)))
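
A self-contained sketch of the loop with the suggested fix applied is shown below. It is a simplified reconstruction for illustration only: the function popularity_scores and the assumption that sorted_scores is a list of (item, score) pairs sorted by descending score are not taken from the recpack source.

```python
import numpy as np
from scipy.sparse import csr_matrix


def popularity_scores(X: csr_matrix, sorted_scores, K: int = 200) -> csr_matrix:
    """Hypothetical helper: give every active user the same top-K item scores."""
    # At most K items are available; fewer if the dataset is small.
    items, values = zip(*sorted_scores[:K])
    users = sorted(set(X.nonzero()[0]))

    U, I, V = [], [], []
    for user in users:
        # Repeat the user once per recommended item, not K times,
        # so that U, I and V always have the same length.
        U.extend([user] * min(K, len(items)))
        I.extend(items)
        V.extend(values)

    return csr_matrix((V, (U, I)), shape=X.shape)


# Three items only, far fewer than the default K of 200.
X = csr_matrix(np.array([[1, 0, 1], [1, 1, 0]]))
sorted_scores = [(0, 0.5), (2, 0.3), (1, 0.2)]
score_matrix = popularity_scores(X, sorted_scores)
```

With the original `[user] * self.K`, U would hold 200 entries per user while I and V hold only three, which is the length mismatch that csr_matrix rejects.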
