Comments (7)
I should have also linked the `upload-pack` side of things. Here's the file. It reads from and writes to a set of channels, which act as the intermediary for the QUIC connection, afair (@cloudhead might be able to say more about that).

> This also means that `gix` probably isn't the right abstraction for now, as it's way too high-level, and I don't know if it should be able to support such a specialised use case while plumbing exists.

Aye, that makes sense and resonates with my memory of trying to use `gix` way back :)

> This probably also means you don't have to recheck the `gix`-level code of the fetch API; even I don't like to look at it, it's so much and quite complex, always troublesome to find anything 😅.

Hahaha thank you for saving me that time up-front 😄

> I probably won't get to it very soon, but it's on my list now and I will make it a priority once the last stage of `gix status` is implemented (HEAD->index diff).

Sounds good! Ping me if there's anything I can help with when you get around to it.
from gitoxide.
This is exciting!
Something I'd like to point out is that replacing `git2` functionality with `gix` wouldn't be my largest concern. While a lot of work, I imagine it would be straightforward.
The heaviest lifting and biggest concern for a first task is the fact that we're still using the `Delegate` approach for the fetch protocol. I did look at using the more modern approach that Gitoxide was using; however, iirc it didn't provide enough control for us to use the current staged approach to fetching from one remote to the other. If this is something that I could go into more detail about, please let me know.
from gitoxide.
Thank you for clarifying that. From my experience, using `gix` in the frontend has advantages in terms of compatibility at the very least, but I also hear that recently `libgit2` really ramped up its contributions so these issues might even go away in the mid-term.
And I'd definitely love to finally come up with a fetch-API that is easy to use but not unnecessarily limiting, and can thus work for you as well. If that was the case, I think you could start tracking a more recent version which might ultimately pay off.
If you would link the latest code that uses the `Delegate` here, I should be able to see what can or can't be done with higher-level APIs and fix these. Sharing what specifically prevented the adoption last time you tried would certainly be helpful to me as well.
Thanks again!
from gitoxide.
> Thank you for clarifying that. From my experience, using `gix` in the frontend has advantages in terms of compatibility at the very least, but I also hear that recently `libgit2` really ramped up its contributions so these issues might even go away in the mid-term.

Compatibility in which sense? :) In my mind, the `radicle-fetch` code is quite isolated and the only conversion points are OIDs and refnames, which are already in place.

> And I'd definitely love to finally come up with a fetch-API that is easy to use but not unnecessarily limiting, and can thus work for you as well. If that was the case, I think you could start tracking a more recent version which might ultimately pay off.

Interesting. I can have another look because, admittedly, it has been a while since I looked, and then I got swept away by other tasks.

> If you would link the latest code that uses the `Delegate` here, I should be able to see what can or can't be done with higher-level APIs and fix these. Sharing what specifically prevented the adoption last time you tried would certainly be helpful to me as well.

I should have taken notes when I explored trying to use the updated version, but unfortunately my foresight is not 20/20 😅
So the main entry points to the `gix` fetch code are the following helpers:
The `ls_refs` function calls into the `Delegate` found here -- a lot of the code is a modified version of the `gix` code. The same goes for `fetch` (here).
I think the best way to understand how we use the delegate approach is by describing the high-level flow. The important context is that there are two special `rad` references:
- `rad/id` -- this reference points to a repository's identity, which contains important information about the repository, in this case the delegates of the repository. A delegate is an entity that manages the repository and is a trusted peer in the context of the repository.
- `rad/sigrefs` -- each peer who contributes to the repository has a `rad/sigrefs`. It points to a signed payload of that peer's reference state: key/value pairs of references and SHAs, e.g. `refs/heads/main deadbeef`, `refs/cobs/xyz.radicle.patch/1234 beefdead`, etc. The pairing of the signature and payload allows the protocol to cryptographically verify a peer's set of refs and gives us the ground truth for that peer.
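As a rough illustration of the payload shape described above, here is a minimal sketch that parses lines of the form `<refname> <sha>` into a map. The function name `parse_sigrefs` and the one-entry-per-line, whitespace-separated layout are assumptions taken from the example entries in this comment, not the actual radicle encoding, and signature verification is left out entirely.

```rust
use std::collections::BTreeMap;

/// Parse a sigrefs-like payload into a map of refname -> SHA.
///
/// Hypothetical helper: it assumes one `<refname> <sha>` pair per line,
/// mirroring the examples in the text, and does not verify any signature.
fn parse_sigrefs(payload: &str) -> Result<BTreeMap<String, String>, String> {
    let mut refs = BTreeMap::new();
    for (idx, line) in payload.lines().enumerate() {
        let line = line.trim();
        if line.is_empty() {
            // Tolerate blank lines, e.g. a trailing newline.
            continue;
        }
        let mut parts = line.split_whitespace();
        match (parts.next(), parts.next(), parts.next()) {
            // Exactly two fields: the reference name and the SHA it points to.
            (Some(name), Some(sha), None) => {
                refs.insert(name.to_string(), sha.to_string());
            }
            _ => {
                return Err(format!("malformed entry on line {}: {:?}", idx + 1, line));
            }
        }
    }
    Ok(refs)
}
```

Calling `parse_sigrefs("refs/heads/main deadbeef\n")` would give a map with a single entry for `refs/heads/main`; a real implementation would check the signature over the payload before trusting any of it.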
With this in mind, the protocol is essentially a series of stages where we fetch a set of this data and ensure some data validity. The state is kept in a type called `FetchState` and each stage is executed via its method `FetchState::run_stage`.
The stages are run in the following order:
- Fetch `rad/id` -- we fetch the `refs/rad/id` so that the delegates can be identified and their references are included in the fetch.
- Fetch `rad/sigrefs` -- we fetch this reference for each delegate and, depending on our configuration, each peer that we are following.
- Fetch the contents of `rad/sigrefs` -- we fetch the references that are listed in each peer's `rad/sigrefs`. Note that we have the SHAs so we can effectively calculate the wants/haves. In some cases we also have the `rad/sigrefs` SHA, since it can be announced as part of gossip, so we can ask for it directly too.
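To make the ordering a bit more concrete, the stages could be modelled roughly as below. This is only a sketch: `Stage`, `sigrefs_peers` and `data_wants` are hypothetical names invented for illustration and are not the `FetchState` API from `radicle-fetch` or anything from `gix`. It shows the two pieces of bookkeeping mentioned above: which peers' `rad/sigrefs` to ask for, and which SHAs end up as wants in the last stage.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stage names mirroring the three steps described above.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Stage {
    /// Fetch `refs/rad/id` to learn who the delegates are.
    Identity,
    /// Fetch `rad/sigrefs` for each delegate and each followed peer.
    SigRefs { peers: BTreeSet<String> },
    /// Fetch the objects listed in the peers' sigrefs.
    Data { wants: BTreeSet<String> },
}

/// Peers whose `rad/sigrefs` should be fetched: delegates plus followed peers,
/// with duplicates removed.
fn sigrefs_peers(delegates: &[&str], followed: &[&str]) -> BTreeSet<String> {
    delegates
        .iter()
        .chain(followed.iter())
        .map(|peer| peer.to_string())
        .collect()
}

/// SHAs to request in the final stage: every SHA listed in any peer's sigrefs
/// (peer -> refname -> SHA) that is not already present locally.
fn data_wants(
    sigrefs: &BTreeMap<String, BTreeMap<String, String>>,
    have: &BTreeSet<String>,
) -> BTreeSet<String> {
    sigrefs
        .values()
        .flat_map(|refs| refs.values())
        .filter(|sha| !have.contains(*sha))
        .cloned()
        .collect()
}
```

In this sketch each stage's input comes from the previous stage's result: `Stage::SigRefs` needs the delegates learned in `Stage::Identity`, and `Stage::Data` needs the sigrefs fetched in between.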
All of this happens over a long-lasting connection, so we only perform the handshake step once. The handshake payload is passed down to each of the `ls_refs` and `fetch` calls.
The protocol finishes by signalling to the other end that it's done, so the other end can cease sending upload-packs. The fetcher can then validate all the data it received, checking signatures and ensuring that the data is in a consistent state, and apply all the updates to the refdb only if everything is consistent.
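The validate-then-apply step can be pictured as an all-or-nothing update. The sketch below is an assumption-laden simplification: `apply_if_consistent` is a made-up name, the refdb is a plain in-memory map, and "consistent" is reduced to "every fetched tip matches the SHA recorded in the signed refs". Signature checking is not shown.

```rust
use std::collections::BTreeMap;

/// Apply `fetched` tips to `refdb` only if every one of them matches the SHA
/// recorded for that reference in `signed`. Returns the number of refs applied.
///
/// Hypothetical helper for illustration; the refdb is left untouched on error.
fn apply_if_consistent(
    refdb: &mut BTreeMap<String, String>,
    signed: &BTreeMap<String, String>,
    fetched: &BTreeMap<String, String>,
) -> Result<usize, String> {
    // First pass: validate everything without modifying the refdb.
    for (name, sha) in fetched {
        match signed.get(name) {
            Some(expected) if expected == sha => {}
            Some(expected) => {
                return Err(format!("{} is at {} but sigrefs say {}", name, sha, expected));
            }
            None => return Err(format!("{} is not listed in sigrefs", name)),
        }
    }
    // Second pass: all checks passed, so apply every update.
    for (name, sha) in fetched {
        refdb.insert(name.clone(), sha.clone());
    }
    Ok(fetched.len())
}
```

Splitting validation and application into two passes is what keeps a partially valid fetch from leaving the refdb half-updated.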
I hope that makes sense, but please let me know if I can elaborate on any points!
from gitoxide.
Thanks a lot for writing all this down! All this sounds familiar, particularly the multi-step process of downloading packs for different refs. If I remember correctly, the server side is a plain git server over QUIC transport, which supports git protocol V2. That allows a single connection to be used for multiple requests/commands, which are tuned according to need, with each stage informing what the next stage can or should do.
My goal here would be to see how I can transform the code away from the Delegate approach to the new command-oriented API, and I think I could consider that successful once the test-suite passes again. This also means that `gix` probably isn't the right abstraction for now, as it's way too high-level, and I don't know if it should be able to support such a specialised use case while plumbing exists.
Once successful, this should allow you to track the latest versions of these plumbing crates, and since I'd be doing it, I would make the API adjustments necessary to support this case as well. This probably also means you don't have to recheck the `gix`-level code of the fetch API; even I don't like to look at it, it's so much and quite complex, always troublesome to find anything 😅.
I probably won't get to it very soon, but it's on my list now and I will make it a priority once the last stage of `gix status` is implemented (HEAD->index diff).
from gitoxide.
We're no longer using QUIC, we're using a custom framed protocol over TCP, though I don't think it's so relevant since the fetch code just works with generic writers.
from gitoxide.