scottarbeit / grace

Grace Version Control System

License: MIT License

F# 95.88% C# 3.50% PowerShell 0.54% GLSL 0.01% Dockerfile 0.08%

dapr cloud-native fsharp dotnet source-control version-control version-control-systems developer-tools

grace's Introduction

Hi! I'm Scott.

I'm a Technical Architect at GitHub.

Grace Version Control System

I've invented a new, cloud-native, easy-to-use version control system, called Grace, and you should totally check it out. ❤️

About me

I've been a programmer since age 11, starting with the Apple ][+, AppleSoft BASIC, and 6502 Assembler. Since then, I've learned and mostly forgotten tons of other languages, operating systems, and server and client software, during a long career in which I've been a programmer, network administrator, DBA, consultant, architect, Program Manager, and Product Manager.

My most current resume, with a long list of stuff I know (and stuff I used to know) and stuff I've done is here.

After 18 years of object-oriented programming (C++, VB.NET, and C#), around 2016 I started to explore the mathematical and functional end of programming. I learned some Category Theory and even some Category Theory II (for free on YouTube from a great teacher, the world is amazing) and what a monad is, and now I write code mostly in F#. F# is just fun. It has a beautiful, minimal syntax, a full library of functional constructs plus access to the entire .NET ecosystem, it's strongly-typed, and it's very fast.

The journey from object-oriented(-ish) thinking to functional(-ish) thinking can take time - think months, not weeks - but I'd like to report from the other side of that journey: it's worth it, both for quality and for developer ergonomics. Loved it, would recommend.

I wish I could be good at All The Things, but there's just too many of them, so I specialize in cloud architecture using Microsoft Azure, and programming in .NET. I 💛 PaaS, virtual actors, CQRS and Event Sourcing.

Interests

Baseball. Hockey. Poker. Philosophy. Quantum physics. Yoga. Walking. Meditation.

Hi from a little bit east of Seattle. ☁️

Don't forget to check out Grace. 😉

Lucky to have a great girlfriend, a great family, and more friends than I know what to do with.

grace's People

Forkers

devbox10 melursus23 devdoshi waynemunro kyrvlasiuk robmaw rrodriguezxtg torratdev bernhardschuller akkineniramesh aguluman garcia-lab danieltanfh95 diode23 767829413

grace's Issues

Add authorization to Grace Server

Grace has no authorization logic yet.

Obviously, we need some.

#2 (comment)

Identify individual permissions for each server endpoint.

Permissions in Grace should be granular. Each server endpoint (i.e. each endpoint listed in Startup.Server.fs) should be assigned its own permission.

These permissions will be collected into sensible roles in #4.

Add `grace init` command

Grace needs to be able to initialize a new repository based on the contents of an existing directory. This involves several steps, including uploading files to object storage, creating the actors that represent the repository and default branches, writing a graceconfig.json and .graceignore file, and probably more.

Upload existing files to object storage.
Create new repository in actor storage.
Create local graceconfig.json file.
Create local .graceignore file.
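
The first step in the list, deciding which existing files to upload, could be sketched roughly as follows. This is a minimal illustration, not Grace's implementation: it assumes .graceignore holds glob patterns similar to .gitignore (the issue doesn't specify the format), and `files_to_upload` is a hypothetical helper name.

```python
import fnmatch
import os


def files_to_upload(root, ignore_patterns):
    """Return sorted relative paths under root that match none of the ignore patterns.

    Assumption: ignore patterns are simple globs; the real .graceignore format may differ.
    """
    selected = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            # Normalize to forward slashes so patterns behave the same on every OS.
            rel = os.path.relpath(full_path, root).replace(os.sep, "/")
            if not any(fnmatch.fnmatch(rel, pattern) for pattern in ignore_patterns):
                selected.append(rel)
    return sorted(selected)
```

Each path returned would then be uploaded to object storage before the repository and branch actors are created.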

Gather individual permissions into sensible roles.

Each server endpoint has a unique permission assigned to it.

For ease-of-use, those individual permissions should be gathered into default roles for repository owners to assign to users.

These roles should include:

Admin
Contributor
Reader
others as we think of them.

Eventually, we want repository owners to be able to create custom roles by choosing which permissions to include. Make sure the data schema handles that feature.
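
One way to keep the schema open to custom roles is to treat every role, default or custom, as a named set of permissions. The sketch below is illustrative only: the permission names are made up (the real ones would mirror the endpoints in Startup.Server.fs), and `Role`, `make_custom_role`, and `is_allowed` are hypothetical names.

```python
from dataclasses import dataclass

# Hypothetical permission names, one per server endpoint.
ALL_PERMISSIONS = frozenset({
    "repository.read",
    "repository.create",
    "repository.delete",
    "branch.read",
    "branch.create",
    "branch.promote",
})


@dataclass(frozen=True)
class Role:
    name: str
    permissions: frozenset
    is_custom: bool = False


# Default roles are just predefined permission sets.
DEFAULT_ROLES = {
    "Admin": Role("Admin", ALL_PERMISSIONS),
    "Contributor": Role(
        "Contributor",
        frozenset({"repository.read", "branch.read", "branch.create", "branch.promote"}),
    ),
    "Reader": Role("Reader", frozenset({"repository.read", "branch.read"})),
}


def make_custom_role(name, permissions):
    """Build a custom role, rejecting permissions that no endpoint defines."""
    unknown = set(permissions) - ALL_PERMISSIONS
    if unknown:
        raise ValueError(f"unknown permissions: {sorted(unknown)}")
    return Role(name, frozenset(permissions), is_custom=True)


def is_allowed(roles, permission):
    """A user is allowed if any assigned role contains the permission."""
    return any(permission in role.permissions for role in roles)
```

Because default and custom roles share one shape, adding custom roles later needs no schema change, only a way for repository owners to store their own permission sets.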

Figure out how to run Grace side-by-side with Git

It would be wonderful if we could run Grace side-by-side with Git in the same repository, and have Git not notice Grace's presence (and vice-versa).

This would enable us to scale up real-world testing of Grace, with no risk to the user. In that directory, Git would still be the official VCS (until it's not), and users would run Grace in that directory at the same time, using Grace the way they'd use it if it were the real VCS.

I'm not sure this is possible in more than a limited way. We'll run into the Grace/Git impedance mismatch on merges vs. promotions pretty quickly if we try to automate it, and this idea might not be worth the work beyond tracking a single developer's local work.

It is worth thinking through more deeply... if we can find a good way, it would be extremely helpful.

Investigate if local git hooks could provide enough fidelity to match with Grace events
Double-check both Git and Grace .ignore files to make sure they're correct
Write instructions

Keeping forks on the same server is a security vulnerability and legal risk for the server host, and an operational risk for the maintainer of the fork.

Scott,

You've got some interesting ideas here, and it's nice to see someone tackling the UX issues that git has. That said, as someone who's been the technical co-founder of a number of companies (including Cloudability, which directly informs my thoughts here), my first impressions upon reading through the introductory documentation are as follows:

If I publish an open source project, and someone who wants to fork it does so on my server, then they have an easy way to forcibly increase my OpEx spend pretty arbitrarily. They can just make a fork, and start adding multi-gigabyte files by the boatload to inflate my S3 bill.
Of course, if I can ban someone from doing that on my server, then as a fork maintainer I face the risk that my fork could be shut down at a moment's notice by someone outside of my organization.
Similarly, the fork could be used to host illegal content by a bad actor. As the host of that content, it's entirely reasonable to expect that laws in many jurisdictions may make me liable for that. So, anyone hosting a server for the sake of hosting their OSS project may wind up with a de facto obligation to closely monitor everything being done by anyone forking their project.
A common use-case for me in forking repositories is simply to ensure that they continue to exist if the original developer decides to delete them. Requiring that the fork be on the same server is only a viable strategy if all the projects I rely on are using the Grace equivalent of Github. The moment one of them has their own server is the moment when I am either forced to come up with a way to continuously pull updates from one Grace server to another, or carry a risk that a dependency may simply disappear from under me. That adds to my operational risks.

I have other concerns, as well.

One you might have a strategy for addressing, and it might be helpful if you address it in the FAQ: I routinely make use of git-lfs, and have a few repos that are simply too large to work with over an Internet connection -- I host a server for them in my home where I have robust, high-speed, local network access. Think 5.9GB .git folder with a 2GB working directory. I also used to do game development, where large files being updated frequently is a common occurrence. For example, in one project I have a Photoshop file that's 247MB. Having that much data get tossed around every time the artist hits save, if we happen to be working on related branches, could be pretty disruptive. Even if I have the bandwidth to handle it, the "instantly pick up [other] updates and automatically rebase" aspect would be gated on the time those transfers take. For scale: In that repository, there are dozens of .psd, .fbx, and .max files that are 10+MB. I suspect this is a case where my requirements are simply out of scope for what you're trying to achieve. Worth talking about in your FAQs either way, perhaps.

My largest other concern is one you've explicitly decided is out-of-scope (poor/intermittent network access), so I won't delve into that other than to point out that it's not as simple as "sometimes I'm on an airplane". In fact, I rarely fly. I am, however, routinely in areas where I'm on cellular Internet and access is marginal at best. Even if I can maintain a sufficiently stable, fast connection, cost could easily be a concern for me. Not that that should necessarily impact your thinking on this particular architectural decision. What might be worth considering, however, is the impact on my organization's productivity if the VCS server becomes unavailable. You can make it as fault tolerant as you want, but an operational error / backhoe error / etc. would mean that my entire team is unable to continue development without losing the ability to at least incrementally snapshot their work and go back to previous iterations if needed. Local caching of saves/checkpoints might adequately address that concern, however, so perhaps I'm overly worried there.

Relatedly, I will note, another common use-case for me is repositories that are only ever local. I don't know if that's a common occurrence or if I'm just being idiosyncratic with it. So it may very well be not worth addressing. I suspect, though, that if your thinking is driven by your experience at GitHub, you might not really have a view on how common such a use-case is, as it wouldn't come up in discussions with development teams about their professional usage. It could be, hypothetically, that most developers do this and it simply never came up in conversation. Of course, I doubt it's most, but simply want to note that it's very hard to have confidence in how common such a use-case is.

All of that said, I appreciate that someone is pushing forward with bold ideas to try and address the substantial UX issues git has and wish you the best of luck. However things turn out for Grace, I look forward to seeing learnings from it driving the state of the art forward in the future.

Add Authorization attributes to server endpoint functions.

Create initial public videos for Grace

There are a number of videos I'd like to record and publish about Grace in the coming months.

Each of these videos will be released as they're completed. This is a draft list, but something like:

Basic demo of grace watch and basic repository usage between two users
Architectural deep-dive, including Dapr
F# and functional constructs that make Grace's code easier to understand and more maintainable
Things I've learned (and sometimes rewritten) along the way
Walkthrough of the concepts in each .NET project in Grace

Please subscribe to this issue if you'd like to be notified when each video is released.

Add authentication to Grace Server

Grace Server needs to implement OAuth2 for authentication, and test it against multiple identity providers.

#6 (comment)

Figure out correct settings for Native AOT build

Right now, I'm testing Grace CLI using .NET's Ready To Run (R2R) feature, which emits pre-JITted code to run. That means that instead of waiting for the .NET CLR to JIT the code at startup, we ship precompiled, fast-to-start code in the .dll, and allow the .NET runtime to do further hot-path recompilation as it decides it should.

The R2R version has a few-hundred-millisecond delay in getting started, owing to getting the CLR going before running Grace itself. It's fine for now, but it's not what I want to ship with.

.NET 7 has a Native Ahead-of-Time (AOT) compilation feature. This builds the entire application including the CLR, trims unused .dlls and even unused individual methods, and emits a single, native executable file that starts instantly.

I tried a .NET 7 Native AOT build for Grace.CLI on 29-Dec-2022, and it failed. I assume it's because there were issues during trim; we probably need to prevent it from trimming things having to do with actors, but that's just a guess.

If it's easy to include AOT for Grace Server, great. If not, R2R is fine for it: .NET will recompile the hot paths and the server will be at top speed fairly quickly in production.

Research using Dapr State Store Query API to query actor storage

I'm writing this to capture that I've tried to use the Dapr State Store Query API, instead of writing separate functions for each actor storage database, and it doesn't work.

Here's why:

The Dapr documentation explicitly states that it doesn't work for actor storage. (Despite that, I tried it.)
The Query API is in alpha, and right now it looks like it will not proceed to beta or stable.
I actually tried it, and actually got an error.

2023-08-11T04:36:28.6865588Z 2A In getReferenceByType: referenceType: Promotion; branchId: de9220fb-e2bc-4562-b5c4-1798d81100c1; maxCount: 30.
2023-08-11T04:36:28.6868061Z 2A { "filter": {"AND": [{"EQ": { "value.Class": "ReferenceDto" } }, {"EQ": { "value.ReferenceType": "Promotion" } }, {"EQ": { "value.BranchId": "de9220fb-e2bc-4562-b5c4-1798d81100c1" } }]}, "sort": [{"key": "value.CreatedAt","order": "DESC"}], "page": {"limit": 30 } }
2023-08-11T04:36:28.7314964Z 23 Exception in getReferenceByType: Query state operation failed: the Dapr endpointed indicated a failure. See InnerException for details.
2023-08-11T04:36:28.7409083Z 23    at Dapr.Client.DaprClientGrpc.QueryStateAsync[TValue](String storeName, String jsonQuery, IReadOnlyDictionary`2 metadata, CancellationToken cancellationToken)
   at [email protected]() in /src/Grace.Server/Services.Server.fs:line 568
2023-08-11T04:36:28.7409644Z 23 Inner exception:
2023-08-11T04:36:28.7410811Z 23 Status(StatusCode="Internal", Detail="failed query in state store actorstorage: POST https://gracevcs-development.documents.azure.com:443/dbs/gracevcs-development-db/colls/grace-development/docs
--------------------------------------------------------------------------------
RESPONSE 400: 400 Bad Request
ERROR CODE: BadRequest
--------------------------------------------------------------------------------
{
  "code": "BadRequest",
  "message": "The provided cross partition query can not be directly served by the gateway. This is a first chance (internal) exception that all newer clients will know how to handle gracefully. This exception is traced, but unless you see it bubble up as an exception (which only happens on older SDK clients), then you can safely ignore this message.
ActivityId: 9e5641d4-61e4-48e7-9a87-b1ede3d0a23d, Windows/10.0.17763 cosmos-netstandard-sdk/3.18.0",
  "additionalErrorInfo": "{\"partitionedQueryExecutionInfoVersion\":2,\"queryInfo\":{\"distinctType\":\"None\",\"top\":null,\"offset\":null,\"limit\":null,\"orderBy\":[\"Descending\"],\"orderByExpressions\":[\"c[\\"value\\"][\\"value\\"][\\"CreatedAt\\"]\"],\"groupByExpressions\":[],\"groupByAliases\":[],\"aggregates\":[],\"groupByAliasToAggregateType\":{},\"rewrittenQuery\":\"SELECT c._rid, [{\\"item\\": c[\\"value\\"][\\"value\\"][\\"CreatedAt\\"]}] AS orderByItems, c AS payload\nFROM c\nWHERE ((((c[\\"value\\"][\\"value\\"][\\"Class\\"] = @__param__0__) AND (c[\\"value\\"][\\"value\\"][\\"ReferenceType\\"] = @__param__1__)) AND (c[\\"value\\"][\\"value\\"][\\"BranchId\\"] = @__param__2__)) AND ({documentdb-formattableorderbyquery-filter}))\nORDER BY c[\\"value\\"][\\"value\\"][\\"CreatedAt\\"] DESC\",\"hasSelectValue\":false,\"dCountInfo\":null},\"queryRanges\":[{\"min\":\"\",\"max\":\"FF\",\"isMinInclusive\":true,\"isMaxInclusive\":false}]}"
}
--------------------------------------------------------------------------------
")
2023-08-11T04:36:28.7412527Z 23    at Dapr.Client.DaprClientGrpc.QueryStateAsync[TValue](String storeName, String jsonQuery, IReadOnlyDictionary`2 metadata, CancellationToken cancellationToken)

In order to work around the cross-partition error, the Query API filter would need to have something like STARTSWITH() or CONTAINS() and it does not.
The recommendation from a core Dapr maintainer continues to be: use the native query API for the data store.
There is openness to a better solution to this problem in Dapr, but there's no actual coding effort going towards it as of August, 2023. If it happens, I'd be the first person to put in a weekend and rewrite the entire query layer to whatever new Dapr standard gets created.
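
For reference, the query document that appears in the log above can be rebuilt as plain JSON. This sketch only reconstructs that payload (three EQ clauses joined with AND, a descending sort, and a page limit); `build_reference_query` is a hypothetical helper name, and it does not call Dapr.

```python
import json


def build_reference_query(reference_type, branch_id, max_count):
    """Build the Dapr state store query document seen in the log output."""
    return {
        "filter": {
            "AND": [
                {"EQ": {"value.Class": "ReferenceDto"}},
                {"EQ": {"value.ReferenceType": reference_type}},
                {"EQ": {"value.BranchId": branch_id}},
            ]
        },
        "sort": [{"key": "value.CreatedAt", "order": "DESC"}],
        "page": {"limit": max_count},
    }


if __name__ == "__main__":
    query = build_reference_query(
        "Promotion", "de9220fb-e2bc-4562-b5c4-1798d81100c1", 30
    )
    print(json.dumps(query))
```

Every clause here is an equality test on a field inside the stored value, so nothing in the filter lets the query target a single partition, which is why Cosmos DB reports a cross-partition query.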

It looks like we'll end up with 30-40 query functions. It's mildly annoying to have to fill out match expressions for each of them, for each kind of database, but it's a one-time thing, it's code that's easy for GitHub Copilot to generate once you have the first couple of functions written, and trying to be clever about it would take longer than just writing the damn code.

Anyway, #fail (for now), and we'll just write implementations for each data store someone wants to use until the situation changes. Not worth getting bogged down about right now.
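
The "one query function, one implementation per data store" shape might look like the sketch below. It is written in Python rather than F# purely for illustration. The store name "in-memory" and all function names are hypothetical, and the in-memory implementation is a stand-in for a native query such as Cosmos DB SQL. It applies the same filter, sort, and limit as the failed query above.

```python
def _in_memory_references_by_type(records, reference_type, branch_id, max_count):
    """Stand-in for a native query: filter, sort newest first, take max_count."""
    matches = [
        r for r in records
        if r["Class"] == "ReferenceDto"
        and r["ReferenceType"] == reference_type
        and r["BranchId"] == branch_id
    ]
    # ISO 8601 timestamps in one format sort correctly as strings.
    matches.sort(key=lambda r: r["CreatedAt"], reverse=True)
    return matches[:max_count]


# One entry per supported data store; each store gets its own native implementation.
REFERENCES_BY_TYPE = {
    "in-memory": _in_memory_references_by_type,
}


def get_references_by_type(store_kind, records, reference_type, branch_id, max_count):
    """Dispatch to the implementation registered for the configured store."""
    try:
        implementation = REFERENCES_BY_TYPE[store_kind]
    except KeyError:
        raise NotImplementedError(
            f"no getReferenceByType implementation for store '{store_kind}'"
        ) from None
    return implementation(records, reference_type, branch_id, max_count)
```

Supporting another data store then means adding one entry per query function, which is repetitive but mechanical.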

Rename "merge" to "promote"

In Grace, a "merge" to a parent branch is actually a promotion of a version of the code from a child branch to be the current state of the parent branch.

We need to rename "merge" to "promote" in all code and documentation.

Create first-draft onboarding documentation

Hey @ScottArbeit, I saw your talk at NDC, and decided to check out your project myself. However, I find it really difficult to navigate the repo because the technical documentation is lacking. Could you please point me to a place where I can find information on how to get started with everything?
I suspect that the server can be hosted in docker and Grace CLI is a dotnet cli tool, but it is hard for me to connect these things together.

It is also possible that I am just blind and missed all of the documentation somewhere :)
