Comments (13)
I thought about this in detail, and it appears to be a lot less trivial than I originally thought.
One problem here is guarantees on ordering:
- We'd have to keep a tombstone around marking the object as deleted, using a logical clock that dominates the logical clock associated with the last known write, as Riak does. Then we would need custom logic in all of the anti-entropy backends that performs the normal CRDT merge and, in addition, special data store logic under which an object tombstone overwrites the merge result. This tombstone couldn't be a lattice value like bottom, because merging with bottom would always yield the most recent state.
- This is further complicated by the delta mechanism, because we'd have to find a way to model a tombstone as a delta. It can't be a value in the lattice itself because, again, we'd end up with the same merging problem.
- One alternative would be to extend lattices with a top value that represents deletions (or variable undeclares). This seems the most reasonable solution, since we can trivially extend all lattices in the system with a top value in the runtime and ensure that a merge always yields top when either side of the merge is top. However, this complicates matters further because it would prevent values from ever being redeclared.
- A more interesting option might be to use the partial replication mechanism in Lasp, which ensures that values outside a "replication group" aren't stored on that node. Right now this just filters objects out of the anti-entropy protocol, but ideally it could also periodically prune the backend of objects that are no longer needed. This might be the more reasonable idea here.
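To make the bottom-versus-top point concrete, here is a minimal sketch in Python rather than Lasp's Erlang. It assumes a grow-only set as the lattice; the `TOP` sentinel and the `merge` function are hypothetical names for illustration, not part of any Lasp API.

```python
# Hypothetical sketch: a grow-only set lattice extended with a TOP element.
TOP = object()        # sentinel standing in for "deleted / undeclared"
BOTTOM = frozenset()  # the lattice's bottom value (empty set)


def merge(a, b):
    """Join of two states: TOP absorbs everything, otherwise set union."""
    if a is TOP or b is TOP:
        return TOP
    return a | b


last_write = frozenset({"x", "y"})

# A tombstone modelled as bottom is useless: the merge restores the value.
bottom_tombstone_result = merge(BOTTOM, last_write)

# A tombstone modelled as TOP wins every merge ...
deleted = merge(last_write, TOP)

# ... but a later redeclare (starting from bottom) is absorbed as well.
redeclared = merge(BOTTOM, deleted)
```

In this sketch `bottom_tombstone_result` equals the last write, while both `deleted` and `redeclared` end up as `TOP`, which is the "can never be redeclared" drawback described above.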
I'm looping @russelldb in to see what he thinks, since this is a problem that's at least been thought of in the context of Riak many times.
from lasp.
@russelldb is correct to point out that a single logical clock for the entire node can also be used to handle a removal.
The complication here is that Lasp's internal key-value store accepts any CRDT that implements the required interface (an Erlang behavior), so we can't assume a uniform data representation. So either we need to extend the types in a generic way, or we need to support something specific in the backend (i.e., Russell's solution or the partial replication scheme I proposed).
from lasp.
Obviously, another issue to worry about here is an update that is concurrent with an undeclare, which would effectively restore the value.
In fact, the declare operation is superfluous anyway, because all it does is create a local register with the bottom value for the lattice, which is implicitly done through the update operation.
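The restore-on-concurrent-update problem can be sketched as follows. This is an illustration in Python, not Lasp code: it assumes each replica is a dictionary from key to a grow-only set, and `anti_entropy` is a hypothetical stand-in for a state-based synchronization round.

```python
# Hypothetical sketch: each replica maps key -> grow-only set.
def anti_entropy(local, remote):
    """Merge remote state into local; missing keys start from bottom."""
    merged = dict(local)
    for key, value in remote.items():
        # No explicit declare is needed: an absent key behaves like bottom.
        merged[key] = merged.get(key, frozenset()) | value
    return merged


node_a = {"k": frozenset({1})}
node_b = {"k": frozenset({1})}

del node_a["k"]                              # undeclare on node A
node_b["k"] = node_b["k"] | frozenset({2})   # concurrent update on node B

# The next synchronization round brings the key back on node A.
node_a = anti_entropy(node_a, node_b)
```

After the merge, node A holds `"k"` again with both elements, because a locally dropped key is indistinguishable from one that was never declared.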
from lasp.
This is what I was planning to look at in case it was the cause of our memory leak. I think the memory spikes too quickly for this to be the main issue we are seeing, but I figure it could still be adding to it over time.
from lasp.
Can you elaborate on why you think this might be the memory leak you are experiencing?
from lasp.
@cmeiklejohn based on the comment from @bullno1, it sounded like variables that are no longer used would continue to take up space, and our staging environment consists of continually creating new devices/channels that get registered, used briefly, and never used again, 24/7.
Just a thought. Even if the space is left taken, it may be so little that it doesn't matter and we have other concerns. I'm still trying to find where the issue is.
from lasp.
I'd been trying to find what was eating up all the memory on our node for a while now and finally discovered why it was so hard to find :). The mem column for ets:i() is in words, not bytes, so now I see that the 1gig I couldn't account for is the lasp_ets_storage_backend.
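For anyone hitting the same confusion, the conversion is a single multiplication by the VM word size, which Erlang reports via erlang:system_info(wordsize) and which is 8 bytes on a 64-bit VM. A small Python sketch of the arithmetic follows; the function name and the sample word count are made up for illustration and are not figures from this thread.

```python
def ets_words_to_bytes(words, word_size=8):
    """Convert a word count (as shown in the ets:i() mem column) to bytes.

    word_size defaults to 8, the word size on a 64-bit Erlang VM.
    """
    return words * word_size


# Illustrative number only: this many words is exactly 1 GiB on 64-bit.
sample_words = 134_217_728
sample_bytes = ets_words_to_bytes(sample_words)
sample_gib = sample_bytes / 2**30
```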
from lasp.
Yes, that's right.
So, you're creating new keys often and abandoning other keys? Is that the root cause of the issue? If so, we probably need to come up with a solution for this sooner rather than later.
Can you confirm this is the actual issue?
from lasp.
Just realized I never responded here and only in gitter.
I think the abandoning is not currently the issue. I'd expect that growth to be much slower if not for the growing waiting_threads element in the dv record. It looks to be related to enforce_once.
from lasp.
Can you dump the ets table (or, even just a single table entry) so we can inspect the stored state and identify where the bloating is coming from?
from lasp.
FWIW: the previous fix on master should have removed the issue with waiting_threads growing forever. That ensured that we prune terminated processes at every invocation.
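The general shape of that pruning can be sketched as below. This is not the actual fix on master; it is a Python illustration in which the entry layout and the `is_alive` predicate are assumptions (in Erlang the liveness check would typically be erlang:is_process_alive/1).

```python
def prune_waiting(waiting_threads, is_alive):
    """Keep only the waiters whose owning process is still running."""
    return [entry for entry in waiting_threads if is_alive(entry["pid"])]


# Hypothetical data: two blocked readers, one of which has terminated.
live_pids = {"<0.1746.0>"}
waiting = [
    {"pid": "<0.1746.0>", "threshold": "strict"},
    {"pid": "<0.1800.0>", "threshold": "strict"},
]

# Run on every invocation so dead waiters never accumulate.
waiting = prune_waiting(waiting, lambda pid: pid in live_pids)
```

Pruning on every invocation keeps the list bounded by the number of live waiters, at the cost of one liveness check per entry each time.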
from lasp.
Oh, I'll try master.
And from the ets table the other day:
[{threshold,read,<0.1746.0>,awset,
{strict,{state_awset,{[{<0.1741.0>,
[{<<131,100,0,18,110,117,99,108,101,117,115,64,49,...>>,
1}]}],
{[{<<131,100,0,18,110,117,99,108,101,117,115,64,49,48,...>>,
1}],
[]}}}}},
from lasp.
I'll be pushing the use of master tomorrow to see how it does.
It looks like there were more commits after the gc branch commit we were working off.
from lasp.