
Comments (7)

vgoloviznin commented on June 16, 2024

Hey @AndreasKleineberg, features have a ttl property that you can set to discard older values: https://docs.metarank.ai/reference/overview/feature-extractors#configuration. The default is 3 months, but you can set a smaller value, depending on your use case.

from metarank.

AndreasKleineberg commented on June 16, 2024

If I understand it correctly, the individual events (item/user/ranking/interaction) are persistently stored in Redis (or local storage). The individual events are merged and additionally stored as click-through events, correct? The ttl property you mention makes sure that individual features (e.g. memory-hungry embeddings) are deleted after some time. But what happens if the underlying item can still show up in rankings? Is the feature then recalculated? Or does the item event have to be resent before the 3 months expire?

On a side note, I think you've got a really interesting project going, but it's never taken me this long to grasp the context of a project (especially in terms of production use). Nevertheless, good work!


shuttie commented on June 16, 2024

@AndreasKleineberg Metarank does not store raw events at all. It only stores some derived feature values used for the ranking, so the original idea was to stick to a soft expiration logic with TTLs in Redis to purge old data.

For example:

  • when you send an item event with a single field price=100 (which is used as-is in the ranking as a scalar feature), it creates a Redis KV record item/id1/price=100 with TTL=90 days, and then discards the original event.
  • when you send a click event (and you track CTR, for example), Metarank updates the daily click counters for that specific item (by doing hincrby in Redis) and marks the item as clicked in the original ranking event. It then discards the event as well.

With this approach we don't store the raw events in Redis; we only update the feature values needed for the ranking.
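The two cases above can be sketched roughly as follows. This is only a toy illustration, not Metarank's actual code: `FeatureStore`, `on_item_event` and `on_click_event` are hypothetical names, the store is an in-memory stand-in for Redis, the key layout is assumed from the item/id1/price example, and the step of marking the item as clicked in the ranking event is left out.

```python
import time


class FeatureStore:
    """Toy in-memory stand-in for a Redis-like KV store with per-key TTLs."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.values = {}   # key -> value
        self.expires = {}  # key -> absolute expiry timestamp

    def _purge(self, key):
        # Lazily drop the key if its TTL has run out.
        if key in self.expires and self.clock() >= self.expires[key]:
            self.values.pop(key, None)
            self.expires.pop(key, None)

    def set(self, key, value, ttl_seconds):
        self.values[key] = value
        self.expires[key] = self.clock() + ttl_seconds

    def hincrby(self, key, field, amount, ttl_seconds):
        # Real Redis HINCRBY does not set a TTL; a separate EXPIRE would be needed.
        self._purge(key)
        bucket = self.values.setdefault(key, {})
        bucket[field] = bucket.get(field, 0) + amount
        self.expires[key] = self.clock() + ttl_seconds
        return bucket[field]

    def get(self, key):
        self._purge(key)
        return self.values.get(key)


TTL_90_DAYS = 90 * 24 * 3600


def on_item_event(store, event):
    # Keep only the derived feature value; the raw event is not stored.
    store.set(f"item/{event['item']}/price", event["price"], TTL_90_DAYS)


def on_click_event(store, event, day):
    # Bump the per-day click counter for the item.
    store.hincrby(f"item/{event['item']}/clicks", day, 1, TTL_90_DAYS)


store = FeatureStore()
on_item_event(store, {"item": "id1", "price": 100})
on_click_event(store, {"item": "id1"}, day="2024-06-16")
on_click_event(store, {"item": "id1"}, day="2024-06-16")
print(store.get("item/id1/price"))   # 100
print(store.get("item/id1/clicks"))  # {'2024-06-16': 2}
```

After both handlers run, the store holds one scalar and one counter hash per item, and nothing of the original event payloads.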

But you're right, bulk removal of values is not implemented right now in the way you ask.


shuttie commented on June 16, 2024

So I guess for your use-case the simplest way to go is:

  • set TTLs for all the features (AFAIK the default is 90 days, you may prefer a smaller value)
  • when the item is removed, it won't appear in the ranking events anymore (so it's never going to be presented to a customer), so the TTL will never be refreshed
  • over time, all the inactive user/item features will eventually expire and be removed.
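A minimal sketch of that expiration flow, assuming (as described in this thread) that an item's TTL is refreshed when it shows up in a ranking event. `ExpiringFeatures` and its methods are hypothetical names for illustration, and time is passed in explicitly instead of relying on Redis expiry:

```python
DAY = 24 * 3600


class ExpiringFeatures:
    """Hypothetical TTL bookkeeping; timestamps are passed in explicitly."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.data = {}  # item id -> (feature value, expiry timestamp)

    def put(self, item, value, now):
        self.data[item] = (value, now + self.ttl)

    def touch(self, items, now):
        # Assumed behaviour: appearing in a ranking event refreshes the TTL.
        for item in items:
            if item in self.data and self.data[item][1] > now:
                self.data[item] = (self.data[item][0], now + self.ttl)

    def sweep(self, now):
        # Drop everything whose TTL has run out (Redis does this on its own).
        self.data = {k: v for k, v in self.data.items() if v[1] > now}

    def alive(self):
        return sorted(self.data)


features = ExpiringFeatures(ttl_seconds=30 * DAY)
features.put("kept", 100, now=0)
features.put("removed", 200, now=0)

# Only "kept" still appears in ranking events; "removed" left the catalogue.
features.touch(["kept"], now=20 * DAY)
features.sweep(now=40 * DAY)
print(features.alive())  # ['kept']
```

The removed item is never touched again, so its features disappear once the original TTL runs out, without any explicit delete call.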

There is still a chance that you may want to explicitly nuke a significant part of the inventory (like off-boarding a large vendor from a marketplace), but considering that this use case is quite rare, we're still not sure that it should be part of our roadmap.


shuttie commented on June 16, 2024

Here is a fix for a bunch of issues with Redis TTL pass-through: #1114


AndreasKleineberg commented on June 16, 2024

> when the item is removed, then it won't appear in the ranking events (so it's never going to be presented to a customer), so the TTL will be never refreshed

Ah okay, I think that's the point I've missed so far. So the features that I have set a TTL for are updated whenever they appear in a ranking event. So then that also means if a product (item event) never appears in a ranking, it automatically flies out after the TTL expires.

One last question about this: What happens if a product was never shown in the rankings before its features expired, then the features expire and afterwards someone does look at the product (so it does show up in a ranking event). Is it then simply ignored? The features are then no longer available in Redis.


shuttie commented on June 16, 2024

> One last question about this: What happens if a product was never shown in the rankings before its features expired, then the features expire and afterwards someone does look at the product (so it does show up in a ranking event). Is it then simply ignored? The features are then no longer available in Redis.

Then the feature value would be NaN, i.e. an empty value. All the backends (LightGBM/XGBoost/CatBoost) support this natively and use this information in training as yet another signal. More details on how it works: https://datascience.stackexchange.com/questions/65956/how-do-gbm-algorithms-handle-missing-data
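As a rough illustration of what that means for the feature vector passed to the model (the `feature_vector` helper and the feature names are hypothetical, not part of Metarank's API):

```python
import math


def feature_vector(stored, feature_names):
    # Missing or expired features become NaN rather than 0, so the model
    # can tell "unknown" apart from a real zero.
    return [float(stored.get(name, math.nan)) for name in feature_names]


names = ["price", "ctr"]
fresh = feature_vector({"price": 100, "ctr": 0.25}, names)
expired = feature_vector({}, names)

print(fresh)                             # [100.0, 0.25]
print([math.isnan(v) for v in expired])  # [True, True]
```

The item is still ranked; it just enters the model with missing values in place of the expired features.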

