Hello, I'm trying to understand the concept of Metarank and I'm stuck on one point


How to discard old (item-) events? about metarank HOT 7 CLOSED

AndreasKleineberg commented on June 16, 2024

How to discard old (item-) events?

from metarank.

Comments (7)

vgoloviznin commented on June 16, 2024

Hey @AndreasKleineberg, there's a ttl property for the features that you can set which will discard older events: https://docs.metarank.ai/reference/overview/feature-extractors#configuration. Default is 3 months, but you can set it for a smaller scope, depending on your use case


AndreasKleineberg commented on June 16, 2024

If I understand it correctly, the individual events (item/user/ranking/interaction) are persistently stored in Redis (or local storage). The individual events are merged and additionally stored as click-through events, correct? The mentioned ttl property makes sure that individual features (e.g. memory eating embeddings) are deleted after some time. But what happens if the underlying item can still show up in rankings? Is the feature then recalculated? Or does the item event have to be reloaded before the 3 months expire?

On a side note, I think you've got a really interesting project going, but it's never taken me this long to grasp the context of a project (especially in terms of production use). Nevertheless, good work!


shuttie commented on June 16, 2024

@AndreasKleineberg Metarank does not store raw events at all. It only stores some derived feature values used for the ranking, so the original idea was to stick to a soft expiration logic with TTLs in Redis to purge old data.

For example:

* When you send an item event with a single field price=100 (used as-is in the ranking as a scalar feature), Metarank creates a Redis KV record item/id1/price=100 with TTL=90 days and then discards the original event.
* When you send a click event (and you track CTR, for example), Metarank updates the daily click counters for that specific item (by doing hincrby in Redis) and marks the item as clicked in the original ranking event. It then discards the click event as well.

With this approach we don't store all the raw events in Redis; we only update the feature values needed for the ranking.
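To make the flow above concrete, here is a minimal sketch of the "derive a feature, set a TTL, drop the event" idea. It is not Metarank's actual code: `TinyStore`, `handle_item_event`, `handle_click_event`, the event dictionaries, and the key layout are all hypothetical names chosen for illustration, and an in-memory dictionary stands in for Redis so the example runs without a server.

```python
import time
from typing import Callable, Dict, Optional, Tuple

TTL_SECONDS = 90 * 24 * 3600  # 90 days, matching the example in the thread


class TinyStore:
    """Hypothetical in-memory stand-in for Redis with per-key expiration."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        # key -> (value, absolute expiration timestamp)
        self._data: Dict[str, Tuple[object, float]] = {}

    def set(self, key: str, value: object, ttl: float) -> None:
        self._data[key] = (value, self._clock() + ttl)

    def get(self, key: str) -> Optional[object]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Lazily drop the expired key on read.
            del self._data[key]
            return None
        return value

    def hincrby(self, key: str, field: str, amount: int, ttl: float) -> int:
        # Simplification: increments a hash field and resets the key's TTL
        # in one step (in Redis this would be HINCRBY followed by EXPIRE).
        counters = dict(self.get(key) or {})
        counters[field] = counters.get(field, 0) + amount
        self.set(key, counters, ttl)
        return counters[field]


def handle_item_event(store: TinyStore, event: dict) -> None:
    """Store each item field as a feature value; the raw event is not kept."""
    for name, value in event["fields"].items():
        store.set(f"item/{event['item']}/{name}", value, TTL_SECONDS)


def handle_click_event(store: TinyStore, event: dict) -> int:
    """Bump the per-day click counter for the item; the raw event is not kept."""
    return store.hincrby(f"item/{event['item']}/clicks", event["day"], 1, TTL_SECONDS)
```

After `handle_item_event(store, {"item": "id1", "fields": {"price": 100}})`, the only thing left behind is the key `item/id1/price`; nothing else from the event is retained.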

But you're right, bulk removal of values is not implemented right now in the way you ask.


shuttie commented on June 16, 2024

So I guess for your use-case the simplest way to go is:

* Set TTLs for all the features (AFAIK the default is 90 days; you may prefer a smaller value).
* Once an item is removed, it no longer appears in ranking events (so it is never presented to a customer), and its TTL is never refreshed.
* Over time, all the inactive user/item features eventually expire and are removed.
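The expiration behaviour described in these steps can be sketched as a small day-by-day simulation. This is only an illustration under the assumption stated in the thread (a feature's TTL is refreshed when its item appears in a ranking event); `surviving_items` and its parameters are hypothetical and do not correspond to any Metarank API.

```python
from typing import Dict, Iterable, Mapping, Set


def surviving_items(
    initial_items: Iterable[str],
    rankings: Mapping[int, Set[str]],
    days: int,
    ttl_days: int = 90,
) -> Set[str]:
    """Return the items whose features are still stored after `days` days.

    `rankings` maps a day number to the set of items shown in rankings that day.
    """
    # Every item starts at day 0 with a full TTL.
    expires_at: Dict[str, int] = {item: ttl_days for item in initial_items}
    for day in range(1, days + 1):
        # Drop everything whose expiration day has been reached.
        expires_at = {i: d for i, d in expires_at.items() if d > day}
        # Items that appear in a ranking today get their TTL refreshed.
        for item in rankings.get(day, set()):
            if item in expires_at:
                expires_at[item] = day + ttl_days
    return set(expires_at)
```

For example, with items `active` and `removed`, where only `active` shows up in rankings on days 60 and 120, `removed` is gone after day 90 while `active` is still present on day 150.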

There is still a chance that you may want to explicitly nuke a significant part of the inventory (like off-boarding a large vendor from a marketplace), but since this use case is quite rare, we're still not sure that it should be part of our roadmap.


shuttie commented on June 16, 2024

Here is a fix for a bunch of issues with Redis TTL pass-through: #1114


AndreasKleineberg commented on June 16, 2024

* when the item is removed, then it won't appear in the ranking events (so it's never going to be presented to a customer), so the TTL will be never refreshed

Ah okay, I think that's the point I've missed so far. So the features that I have set a TTL for are updated whenever they appear in a ranking event. So then that also means if a product (item event) never appears in a ranking, it automatically flies out after the TTL expires.

One last question about this: What happens if a product was never shown in the rankings before its features expired, then the features expire and afterwards someone does look at the product (so it does show up in a ranking event). Is it then simply ignored? The features are then no longer available in Redis.


shuttie commented on June 16, 2024

One last question about this: What happens if a product was never shown in the rankings before its features expired, then the features expire and afterwards someone does look at the product (so it does show up in a ranking event). Is it then simply ignored? The features are then no longer available in Redis.

Then the feature value would be NaN, i.e. an empty value. All the backends (lightgbm/xgboost/catboost) support this natively and use this information in training as yet another signal. More details on how it works: https://datascience.stackexchange.com/questions/65956/how-do-gbm-algorithms-handle-missing-data
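As a rough illustration of what "the value becomes NaN" means in practice, the sketch below builds a feature vector from whatever is still stored and fills the gaps with NaN. The function name `feature_vector`, the plain dictionary used as storage, and the key layout are assumptions made for this example, not Metarank internals.

```python
import math
from typing import Dict, List, Sequence


def feature_vector(
    stored: Dict[str, float], item_id: str, feature_names: Sequence[str]
) -> List[float]:
    """Look up each feature for the item; expired or missing ones become NaN."""
    return [stored.get(f"item/{item_id}/{name}", math.nan) for name in feature_names]
```

An item whose features have expired simply yields a row of NaN values, which the gradient boosting backends named above treat as missing data rather than as an error.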

