Comments (4)
Would this approach violate any internal implementation contracts of Tantivy, and is it feasible?
It will most likely not work.
The problem has to do with how delete and commit work.
Deletes are performed right before serialization.
The reason something like
- add_doc(1)
- delete_doc(1)
- delete_doc(2)
- add_doc(2)
works the way you expect is that we attach an opstamp to each document and to each delete operation, so we know the order in which those operations happened.
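To make the opstamp idea concrete, here is a minimal sketch in plain Rust. It is a toy model, not tantivy's actual code: the `Op` type and `surviving_docs` function are hypothetical names, and the only rule modeled is the one described above, namely that a delete affects only documents added with a smaller opstamp.

```rust
use std::collections::BTreeSet;

/// One writer operation, stamped with the order in which it was issued.
/// (Hypothetical type for illustration; not part of tantivy's API.)
#[derive(Debug, Clone, Copy)]
enum Op {
    Add { doc: u32, opstamp: u64 },
    Delete { doc: u32, opstamp: u64 },
}

/// Toy rule: a document survives unless some delete targeting it
/// carries a larger opstamp than the add.
fn surviving_docs(ops: &[Op]) -> BTreeSet<u32> {
    let mut alive = BTreeSet::new();
    for op in ops {
        if let Op::Add { doc, opstamp: added_at } = *op {
            let deleted = ops.iter().any(|other| {
                matches!(
                    *other,
                    Op::Delete { doc: d, opstamp: deleted_at }
                        if d == doc && deleted_at > added_at
                )
            });
            if !deleted {
                alive.insert(doc);
            }
        }
    }
    alive
}

fn main() {
    // Same sequence as in the list above, stamped in issue order.
    let ops = [
        Op::Add { doc: 1, opstamp: 0 },
        Op::Delete { doc: 1, opstamp: 1 },
        Op::Delete { doc: 2, opstamp: 2 },
        Op::Add { doc: 2, opstamp: 3 },
    ];
    // doc1 is deleted after it was added; doc2 is added after its delete.
    println!("{:?}", surviving_docs(&ops)); // prints {2}
}
```

In this model the deletes can be applied late (for instance right before serialization) and still give the expected result, because the stamps record the original order.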
With your scheme, two concurrent writes could end up with very different outcomes.
- add_doc(1)
- add_doc(2)
- delete_doc(1)
- delete_doc(2)
You could end up with any of the following in the resulting tantivy index: no docs, only doc1, only doc2, or both doc1 and doc2.
It will NOT look like the transactions were executed in the order in which they took the write lock.
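The four outcomes can be checked with a small enumeration. This is only a sketch under assumptions: I am assuming one writer issues the two adds and the other issues the two deletes, that their operations can be stamped in any interleaving that keeps each writer's own order, and that a delete only affects documents already added. The names `Op`, `apply` and `outcomes` are hypothetical and not tantivy API.

```rust
use std::collections::BTreeSet;

/// Hypothetical operation type for this sketch (not tantivy's API).
#[derive(Clone, Copy)]
enum Op {
    Add(u32),
    Delete(u32),
}

/// Apply operations in stamp order: a delete only removes a doc
/// that was added earlier in the sequence.
fn apply(ops: &[Op]) -> BTreeSet<u32> {
    let mut alive = BTreeSet::new();
    for op in ops {
        match *op {
            Op::Add(d) => {
                alive.insert(d);
            }
            Op::Delete(d) => {
                alive.remove(&d);
            }
        }
    }
    alive
}

/// Collect the result of every interleaving of `a` and `b` that
/// preserves the relative order inside each writer.
fn outcomes(a: &[Op], b: &[Op], prefix: &mut Vec<Op>, out: &mut BTreeSet<Vec<u32>>) {
    if a.is_empty() && b.is_empty() {
        out.insert(apply(prefix).into_iter().collect());
        return;
    }
    if let Some((first, rest)) = a.split_first() {
        prefix.push(*first);
        outcomes(rest, b, prefix, out);
        prefix.pop();
    }
    if let Some((first, rest)) = b.split_first() {
        prefix.push(*first);
        outcomes(a, rest, prefix, out);
        prefix.pop();
    }
}

fn main() {
    let writer1 = [Op::Add(1), Op::Add(2)];
    let writer2 = [Op::Delete(1), Op::Delete(2)];
    let mut out = BTreeSet::new();
    outcomes(&writer1, &writer2, &mut Vec::new(), &mut out);
    // Each inner list is one possible final index content.
    println!("{:?}", out); // prints {[], [1], [1, 2], [2]}
}
```

Under these assumptions every one of the four index contents is reachable, which is why the result cannot be predicted from the order in which the write lock was taken.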
The problem has to do with the delete and commit work. Deletes are performed right before serialization.
Can you explain more about it?
Tantivy's commit would be called from pg's commit command. I don't actually need the tantivy operations to be executed in the order in which they took the write lock; I need them to meet the RC (Read Committed) transaction isolation level: only committed operations and data are visible to other concurrent transactions.
The behavior I expect:
(1)
Because doc1 and doc2 are invisible to the delete ops in txn2, once txn1 and txn2 have both committed, doc1 and doc2 are both in the resulting tantivy index;
(2)
once txn1 and txn2 have both committed, there are no docs in the resulting tantivy index.
Is tantivy able to do that?
Actually, I think you are right: it might work.
Thank you very much. I'll try implementing the code based on this plan first to see if there are any other issues. There might be some questions I will need to ask you later. I will close this issue for now.