Comments (14)
Maybe we need to keep track of creation and expiration CIDs (which is for example what postgres mvcc does). (see p 58ff http://momjian.us/main/writings/pgsql/internalpics.pdf)
from hyrise-v1.
This is true. Anyone volunteers to replace the valid column with a creation vector and update the validation methods?
from hyrise-v1.
I'll take care of it.
from hyrise-v1.
The Postgres approach provides fewer transactional guarantees though: "The above assumes the default
read committed isolation level". Currently (ignoring the problem above), we guarantee serializable transactions. As far as I understand it, the Postgres approach would introduce lost updates, phantom reads, and non-repeatable reads.
from hyrise-v1.
Basically, we are not changing much: The CID becomes the end timestamp and we add a creation timestamp. The creation timestamp is written once together with the CID of the insertion commit. Validity rules remain as there are with the addition of the creation check.
A record with an end CID is not valid
A record with a creation CID is valid
A record with no CIDs and a TID is currently written
...
David has a state table for that I think...
from hyrise-v1.
do we actually need the valid bit then?
from hyrise-v1.
nope...
from hyrise-v1.
Ok, so if I get this right we would have the following for each row: TID, BeginCID, EndCID.
When writing the following states should occur (ignoring inconsistent states):
v.TID | v.BeginCID | v.EndCID | LastCID | Status |
---|---|---|---|---|
-- | Inf | Inf | 12 | Init |
5 | Inf | Inf | 12 | Insert uncommitted |
5 | 13 | Inf | 12 | Insert commit in prog. |
5 | 13 | Inf | 13 | Insert committed |
-- | 13 | Inf | 13 | Insert committed (cleaned up) |
-- | 13 | 14 | 13 | delete commit in prog. |
-- | 13 | 14 | 14 | delete committed |
For the version visibility this results in:
v.TID == tx.TID | tx.LastCID >= v.BeginCID | tx.LastCID >= v.EndCID | Keep? | Comment |
---|---|---|---|---|
yes | yes | yes | No | Impossible |
no | yes | yes | No | Past Delete |
yes | no | yes | No | Impossible |
(yes) | yes | no | No | Own Delete (delete list in TM) |
no | no | yes | No | Impossible |
yes | no | no | Yes | Own Insert |
no | yes | no | Yes | Past Insert, Future Delete |
no | no | no | No | Uncommitted Insert, Future Insert |
Anything I missed?
from hyrise-v1.
We had a discussion about the implementation of deletes yesterday:
Having the delete list in the transaction manager might turn into a performance issue, depending on how many deletes we have. If ValidatePositions has to check for the existence of a deletion entry in the delete list, the best we can achieve O(log n).
The other idea would be to use some kind of delete lock, which could be implemented by reusing the TID. As you can see in the first table above, we do not need the TID field any more as soon as BeginCID is set. What we could do is to add a "delete uncommitted" state with the TID set to the TID of the deleting transaction and leaving the other fields untouched.
This might cause live locks when two transactions try to delete two rows in a different orders and always have to get aborted. However, this case does not violate the ACID criteria. Also, it can happen in other databases, like in MySQL.
from hyrise-v1.
Just my two cents here: validate positions will always preform a sequential scan over the deleted positions thus this is not o(LOG N). During the scan it will evaluate of it can commit.
However this does not solve the serialization of commits if they're executed in parallel as the is no possibility to mark a row as delete pending.
A way to mark rows as pending would be great and we can use the tid field for that.
If tid and begin cid is set it's a pending delete while a tid and no begin cid is a insert.
Open question : how to delete your own writes? How to handle abort and rollback of transactions correctly.
from hyrise-v1.
Regarding the decision if a transaction can commit or if it detects conflicts, we clearly need the list of deleted positions inside the TM.
However, I think the problem Markus refers to is a different one. If we do not write a TID in a deleted row and only remember the deleted position in a list in the TM, how does a transaction detect its own deletes. If it reads a table and has a list of positions, the ValidatePosScan has to detect all the own deletes.
If both lists were sorted, this could be done easily by iterating through the sorted lists in parallel. However, the delete list might not be in sort order but could be sorted before the ValidatePosScan. Can we assume that the list of positions that will be validated is sorted?
from hyrise-v1.
I think the issue of parallelizing commits should be a different discussion. See #138.
from hyrise-v1.
Regarding the sort order of the position lists, we cannot assume any ordering. So it's true we need something else and the proposed way of using the tid column is fine for me.
from hyrise-v1.
Ok, so let's assume we write the TID into a row when deleting the row in a transaction. So we have the following status combinations for a row:
v.TID | v.BeginCID | v.EndCID | LastCID | Status |
---|---|---|---|---|
-- | Inf | Inf | 12 | Init |
5 | Inf | Inf | 12 | Insert uncommitted |
5 | 13 | Inf | 12 | Insert commit in prog. |
5 | 13 | Inf | 13 | Insert committed |
-- | 13 | Inf | 13 | Insert committed (cleaned up) |
6 | 13 | Inf | 13 | delete uncommited |
6 | 13 | 14 | 13 | delete commit in prog. |
6 | 13 | 14 | 14 | delete committed |
If we consider a set TID in a row as an exclusive lock, can the step between 'insert committed' and 'insert committed cleaned up' make any trouble? So we could have rows that are already committed but still locked. Do you see any problem here?
from hyrise-v1.
Related Issues (20)
- Test sync_task_test/5 occasionally fails
- Add vTune task API for worker threads
- getInputTable <-> getResultTable HOT 1
- TPCC Table Import fails
- Add log-format version information to delta log files
- Add SQL Interface HOT 1
- Bug when Updating and Selecting within same transaction
- Distinct only works on single fields
- Compiling docs fails
- Docu for Query Duration is not correct
- Docker Container HOT 1
- Typo in QueryParser error message
- CompoundIndexScan throws misleading error message
- Dumped tables are not added to StorageManager after loading
- Failing tests Scheduler/SchedulerTest.wait_task_test
- Compilation error when Metis is configured with 64-bit indices
- Failing test BufferedLoggerTests.log_test HOT 3
- Test BufferedLoggerTests.simple_checkpoint_test fails
- Can hyrise be used to parse MySQL queries? HOT 1
- index
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from hyrise-v1.