bluesky-social / atproto

Social networking technology created by Bluesky

License: Other

TypeScript 97.76% JavaScript 0.28% Shell 0.08% Handlebars 1.74% Dockerfile 0.12% Makefile 0.02%

atproto's Issues

Persistent Blockstore

Right now state is reset when server is restarted.

Would be nice to have state persistence, but also a 'purge' functionality in the cli to easily reset during prototyping.

provide containerized setup

Some people may have trouble installing ADX because of the node >= 15 requirement. Besides, keeping 4 terminals open to run the demo is complex.

Using docker-compose would simplify the install & allow running the demo with just two terminals.

error /home/raphy/adx/node_modules/@vscode/sqlite3: Command failed.

(base) raphy@pc:~$ git clone https://github.com/bluesky-social/adx.git
Cloning into 'adx'...
remote: Enumerating objects: 2865, done.
remote: Counting objects: 100% (286/286), done.
remote: Compressing objects: 100% (86/86), done.
remote: Total 2865 (delta 215), reused 201 (delta 200), pack-reused 2579
Receiving objects: 100% (2865/2865), 14.78 MiB | 18.15 MiB/s, done.
Resolving deltas: 100% (1779/1779), done.
(base) raphy@pc:~$ cd adx/
(base) raphy@pc:~/adx$ yarn
yarn install v1.22.18
[1/5] Validating package.json...
[2/5] Resolving packages...
[3/5] Fetching packages...
[4/5] Linking dependencies...
warning "workspace-aggregator-c0efa86d-08e7-43e5-8716-5f449fa1807f > @adx/common > [email protected]" has unmet peer dependency "@types/node@*".
[5/5] Building fresh packages...
[1/3] ⠂ leveldown
[2/3] ⠂ @vscode/sqlite3
error /home/raphy/adx/node_modules/@vscode/sqlite3: Command failed.
Exit code: 127
Command: node-gyp rebuild
Arguments: 
Directory: /home/raphy/adx/node_modules/@vscode/sqlite3
Output:
/home/raphy/.nvm/versions/node/v16.15.0/lib/node_modules/npm/bin/node-gyp-bin/node-gyp: 5: /home/raphy/.nvm/versions/node/v14.17.0/lib/node_modules/node-gyp/bin/node-gyp.js: not found

Feedback if user runs init & already has a repo

Scope out UCAN permission levels

Right now Bluesky UCANs have one permission level: POST.

Scope out additional permissions levels for UCANs

Separate web app, build CLI for data storage backend

We should stub out the interfaces around the distributed db that needs to exist, which is currently represented by the CAR files.

The web app view should be separate, and purposefully minimal, so people don't expect a fully featured end-user application from this right now.

The right audience is developers at this point, so it'd be interesting to tuck the CAR files/db functionality behind a CLI interface.

Documentation

Document:

What it does
What libraries it builds on
Why we chose to do this as an experiment

Dedup tooling config

Right now we have configs for tsc, eslint, prettier & ava in each package. We should dedupe these & have them at the repo root.

Demonstrate advanced functionality of UCANs

Right now, we're not getting more from UCANs than we'd get from a generic signature. But they should come in handy when we delegate perms between devices, across third-party servers, etc.

delegate UCAN to server to post on your behalf
delegate maintenance of the data structure to server

Can brainstorm some directions to go on this issue below.

There are some places in the code where errors aren't being handled, which will lead to panics if data is malformed. One TODO for malformed auth headers is in cmd/server/main.go line 388. I saw some in the go code too, but without TODOs. We should fix these, and either record them in TODOs to fix shortly, or just add error handling as code is written as much as possible.

Follows, interactions, and profile

Extend the structure of #41 to represent other data in the user store including follows, interactions, and user profiles.

3rd party server

port over 3rd party server from earlier ucan-demo

Sync repo events instead of full repos

Right now, if a server wants to send some update to another server (for instance, Alice's server wants to tell Bob's server that she liked a post of his), it pushes the user's entire repo.

This is a costly operation to inform an actor of an otherwise small event.

Servers should communicate updates in a lower bandwidth manner. Sometimes this will mean performing a repo sync, other times this will mean sending a simple event (for instance, a "like event").

This is enabled by Light Clients (#89) & formatting merkle proofs of inclusion.

Improve timestamps

Two main issues:

time should be in microseconds
- browsers don't have microsecond precision, so we use milliseconds & add a counter for two timestamps made in the same millisecond
no random clock ids
- designate a segment of clock ids (0-31) for randomized ids, with no guarantee of non-collision
- eventually we'll want to have a checkout system for clock ids from the data server (maybe identity server)
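
The scheme above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `TimestampGenerator` class, the `Timestamp` shape, and the injectable `now` function are hypothetical names, not part of the codebase. The counter is folded into the microsecond value, so two timestamps made in the same millisecond differ by one.

```typescript
// Hypothetical sketch, not the project's implementation.
type Timestamp = { micros: number; clockId: number }

class TimestampGenerator {
  private lastMicros = 0
  private now: () => number
  readonly clockId: number

  // `now` returns milliseconds since the epoch; it is injectable so the sketch can be tested.
  // By default the clock id is drawn at random from the 0-31 segment, so collisions are possible.
  constructor(now?: () => number, clockId?: number) {
    this.now = now ?? (() => Date.now())
    this.clockId = clockId ?? Math.floor(Math.random() * 32)
  }

  next(): Timestamp {
    // Convert milliseconds to microseconds, then bump by one if this value
    // would not be strictly greater than the previous timestamp.
    const micros = Math.max(this.now() * 1000, this.lastMicros + 1)
    this.lastMicros = micros
    return { micros, clockId: this.clockId }
  }
}
```

A side effect of this design is that timestamps stay strictly increasing even if the wall clock briefly moves backwards.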

Light Clients

Right now, we only support Full Clients (entire repo + history synced), and Delegator Clients (no repo synced, reads & writes through a personal data server).

We should support Light Clients, that is, partially synced repositories. This includes:

Non-historical repos: no past versions, only the current state of the repo
Partial repos: not syncing the entire repository, but just some subsection of it. For instance, only the microblogging namespace, or even only recent posts within the microblogging namespace

Investigate DIDs

ION
Ceramic
Possible bluesky network?

Make a decision on what works for us. Are we using the same provider for user DIDs & network entity DIDs?

Schema network

We need a system for creating, referencing & distributing data schemas.

Schemas should be addressed by a DID, upgradable, and extensible, with a relatively rigid description of the data format.

demo UI

Having some sort of UI for seeing in real time what happens in the demo would help people understand what's going on.

On top of #110 we could add the ui1 and ui2 services, each running a tiny vuejs app that show a reactive, live view of the feed/timeline.

As a plus, one could have a /data route that shows the Merkle DAG in a user-friendly way.

Logout button

Add a logout button to the webpage that deletes keys from localStorage & redirects to the register page

Note: this is a permanent logout, since we don't have a key management solution yet 😛
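
A minimal sketch of what the button handler could do, assuming hypothetical key names and a `/register` path (neither is confirmed by the codebase). The storage and redirect are passed in so the function is not tied to a browser global; `window.localStorage` satisfies the `KeyStore` shape.

```typescript
// Hypothetical sketch: key names and the register path are assumptions.
type KeyStore = { removeItem(key: string): void }

function logout(
  storage: KeyStore,
  redirect: (path: string) => void,
  keyNames: string[],
): void {
  // Remove every stored key; without key management this cannot be undone.
  for (const name of keyNames) {
    storage.removeItem(name)
  }
  redirect('/register')
}

// In a browser this might be wired up as:
// logout(window.localStorage, (path) => window.location.assign(path), ['hypotheticalKeyName'])
```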

Basic schema set

Our current microblogging schemas are intentionally stripped down and simple. We should flesh these out and allow for more complex interactions such as replies, threading, retweets, and more.

We should release a set of schemas describing basic social media objects, including microblogs, likes, image posts and long form writing.

These schemas will be distributed on the schema network (#91)

Posts branch rewrite

Switching out HAMT for schema-addressed SSTables indexed by timestamp. Compression, merging of smaller tables, etc

https://www.notion.so/blueskyweb/Posts-Branch-6aa2a46fb776492187406046b1c46560

Auth library

We should create a developer-facing authorization and authentication library for permissioning device and application keys.

Some flows are laid out in the architecture overview, and are covered in more depth in the forthcoming identity spec.

This library would include:

functions for formatting UCANs and capabilities for interacting with bluesky repository objects
redirect flows for an application to get a valid UCAN attenuated from a user's device or root key
similar redirect flows for attenuating a UCAN for a third-party application

Transfer only updates to a user's store

Right now we serialize the user's entire store into a CAR file and transfer that on each GET & POST.

Instead, we should only be sending the new CIDs.
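
One way to picture "only the new CIDs" is as a set difference between what the sender holds and what the receiver already has. The sketch below treats CIDs as opaque strings and uses a hypothetical `cidsToSend` helper; how each side learns the other's CID set is left out and would need its own design.

```typescript
// Hypothetical sketch: CIDs are modelled as plain strings.
type CidString = string

// Returns the CIDs held locally that the remote side does not already have,
// in local order and without duplicates.
function cidsToSend(local: CidString[], remoteHas: CidString[]): CidString[] {
  const known = new Set<CidString>(remoteHas)
  const missing: CidString[] = []
  for (const cid of local) {
    if (!known.has(cid)) {
      missing.push(cid)
      known.add(cid) // avoid sending the same CID twice
    }
  }
  return missing
}
```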

Better middleware logging

We log everything as an ERROR, even things that should be an INFO
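
As a rough illustration, the log level could be derived from the response status. The policy below (5xx is ERROR, 4xx is WARN, everything else is INFO) and the function names are assumptions for the sketch, not the server's actual middleware.

```typescript
// Hypothetical sketch of a status-based log level policy.
type LogLevel = 'INFO' | 'WARN' | 'ERROR'

function levelForStatus(status: number): LogLevel {
  if (status >= 500) return 'ERROR' // server-side failure
  if (status >= 400) return 'WARN' // client-side mistake
  return 'INFO' // normal traffic
}

// Builds a single log line for a handled request.
function formatRequestLog(method: string, path: string, status: number): string {
  return `${levelForStatus(status)} ${method} ${path} ${status}`
}
```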

Permanent Root DIDs

Right now the DID at the root of a user's identity is an ephemeral did:key, and the key related to that DID is held in either local storage or as a simple key file in the user's filesystem.

We will be introducing a more complete identity system soon. User DIDs will be rooted in a proper DID network. Starting out that will likely be the user's choice of

Bluesky Consortium
ION

Key Management

We should deploy better key management strategies and helpers for users and devs to take advantage of.

This breaks down into 3 pieces:

A secure in-browser storage solution for a user's device keys. This will likely take the form of a service worker that only responds to requests from approved hosts.
Custodial key management for the user's root key. A hosted HSM that holds users' root private keys (or recovery keys) and signs updates or UCANs on the user's behalf after some sort of authentication flow (for instance, an email link).
Helpers for sovereign key management. Linking flows for a user to sign attenuated UCANs with their non-custodial private key, whether it is held on a hardware device or in a mobile wallet. This may include a "Sign in with Ethereum" flow.

Add user data to indexed & queryable store

The canonical version of user data is in UserStore. We need to put that data in an indexed & queryable db (say postgres?) & push updates to the DB as the UserStore is updated.

Native C/CPP solution for small embedded systems?

Is there a native C/CPP solution for small embedded systems? Would love to see something more low level for places that node is far too large to run on.

I envision small MIPS low power data harvesting/gathering devices similar to weather stations being able to connect up and offer services from solar power.

Interesting project :)

Rework TID Collection SSTables

SSTables are currently a non-recursive data structure with relatively costly merge operations.

SSTables also have a large number of siblings, which causes quite large merkle proofs to prove the existence of some object - especially an older object in a larger table.

We should rework TID collections to use a recursive data structure, with fewer siblings, that still preserves the ability to do range queries at the structural level

Clarify terminal example

The quick use terminal example is not very clear & can lead to confusion.

Maybe we do a different writeup with four separate panes instead of one pane with terminal numbers in parentheses.
Also was reported that (Y) didn't work for the register command.
In code comments, we still say "true" & "false" instead of "yes" & "no"

webfinger route

Add webfinger route for retrieving a user's DID

Curation/Moderation

Right now the "indexing server" returns a simple linear timeline of content from a user's follows.

We should provide more options around curation & moderation.

This includes explicit block lists for users that you do not wish to interact with, tooling to check "bad hash lists" for illegal content, and user-selected algorithms that allow a user to customize the sort of content they want - or do not want - to see in their feed.

Posts Schema v1

Come up with a schema for posts. Consider upgradability.

Initial idea:

type PostObj = LivePost | DeletedPost

type LivePost = {
  tombstone: false
  text: string
  attachments: CID[]
  time: Date
  author: DID
  replaces: CID
  reply_to: CID | null // CID[]?
  mentions: DID[]
}

type DeletedPost = {
  tombstone: true
  replaces: CID
  time: Date
}

Json Web Proofs (JWP)

Evaluate in comparison with UCANs https://github.com/json-web-proofs/json-web-proofs#user-content-fn-JWS-66165f84d74d387b4cbbf94b42d9a297

And think through what a demo might entail

Generated type checks

TS runtime type guards suck.

#52 (comment)

Look into something like yup or joi

Data structure needs for the CLI (tracking issue)

Creating an issue to track fields / features I need from the data store to implement the CLI.

Post
- ID
- Created-at timestamp
- All fields related to post replies
- Edit(), delete()
Social
- A way to query followers
- Unfollow()
Interactions
- Everything

Add ucans alongside data in user store

In other words: Attach a UCAN to the user store when making a change. This makes the data layer more self-certifying because it brings "authentication" to that level.

Currently UCANs are sent as an authorization token along with updates to a user's store and discarded by the server after verifying.

This isn't strictly necessary, but I think it's an interesting experiment in terms of making the protocol more self-certifying.

This fulfills a different function from signing the root hash:
#10 (comment)

Signing the root of the user graph certifies that the data is correct. Sending a properly permissioned UCAN certifies that the current device/service/user has the authorization to make the changes that they did.

Add some kind of validation on the ws-relay to make sure it's not used for non-AWAKE traffic

Social Graph

We should provide a more complete social graph for users. In addition to simple follows, this includes follows related to specific schemas (or collections of schemas) and blocks.

The social graph can be thought of as a user's "Phone book for the web". In the manner that mobile applications can tap into your phone's contact book, web applications can tap into your social graph to make app-specific connections to users that you are connected to more generally. The social graph also serves as the routing table for data added to the bluesky network.

The jury is still out on this one, but this social graph will likely exist outside of the user repository - although it will incorporate many of the patterns used in the user repository. This allows users & projects to make use of the identity & social graph features without opting into the entirety of the data network.

More granular UCAN permissions

Right now there are two levels of permission for Bluesky UCAN capabilities:

WRITE
MAINTENANCE

We should introduce additional capability levels. For instance, some apps should be write-only, and not be allowed to delete from or revise a user's repository.

Fission, for instance, uses the following capability levels for WNFS:

WNFS_ABILITY_LEVELS = {
  "SUPER_USER": 0,
  "OVERWRITE": -1,
  "SOFT_DELETE": -2,
  "REVISE": -3,
  "CREATE": -4,
}
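
To make the idea concrete, here is a small sketch of how numeric levels like these could be compared. It assumes the levels are totally ordered and that a higher number covers every lower one; that reading is suggested by the values above but is an assumption, and the `covers` helper is hypothetical.

```typescript
// Hypothetical sketch reusing the WNFS-style levels quoted above.
const ABILITY_LEVELS = {
  SUPER_USER: 0,
  OVERWRITE: -1,
  SOFT_DELETE: -2,
  REVISE: -3,
  CREATE: -4,
} as const

type Ability = keyof typeof ABILITY_LEVELS

// Assumption: a granted ability covers a requested one when its level is at least as high.
function covers(granted: Ability, requested: Ability): boolean {
  return ABILITY_LEVELS[granted] >= ABILITY_LEVELS[requested]
}
```

Under this reading, an app granted only CREATE could add records but would be refused a REVISE or SOFT_DELETE request.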

Convert server to Typescript

Rewrite the Go server in TypeScript so that we can re-use data structure & identity code across the project

Don't fetch update if merkle root is the same

Related to #20

Don't fetch an update for the user store if already in possession of the merkle root
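
A sketch of the check, assuming a hypothetical `StoreCache` that remembers the last merkle root it saw. It is synchronous to stay short; a real client would await a network call, and how the remote root is learned cheaply is not specified here.

```typescript
// Hypothetical sketch: skips the expensive fetch when the merkle root is unchanged.
class StoreCache {
  private root: string | null = null
  private store: Uint8Array | null = null

  // `fetchStore` is only invoked when the remote root differs from the cached one.
  update(remoteRoot: string, fetchStore: () => Uint8Array): Uint8Array {
    let store = this.store
    if (store === null || remoteRoot !== this.root) {
      store = fetchStore()
      this.store = store
      this.root = remoteRoot
    }
    return store
  }
}
```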

Attach domain to usernames

Federate usernames

Can't post just a number

Running: yarn cli post "42"

Results:

yarn run v1.22.11
$ yarn workspace @bluesky/cli cli post 42
$ node dist/bin.js post 42
Error: Could not retrieve server did Request failed with status code 500
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
error Command failed.
Exit code: 1
Command: /Users/jer/.local/share/nvm/v16.14.2/bin/node
Arguments: /opt/homebrew/Cellar/yarn/1.22.11/libexec/lib/cli.js cli post 42
Directory: /Users/jer/projects/bluesky/bluesky-experiment/cli
Output:

info Visit https://yarnpkg.com/en/docs/cli/workspace for documentation about this command.
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.

Server logged:

    at file:///Users/jer/projects/bluesky/bluesky-experiment/server/dist/routes/data/root.js:14:15
Error: Did not find expected object at bafyreigl26pc6yho2yzuohn2fhd55wh2j7kir2tx45aogrftr2rqjjdfv4: [
  {
    "code": "invalid_type",
    "expected": "string",
    "received": "number",
    "path": [
      "text"
    ],
    "message": "Expected string, received number"
  }
]

Verified quick fix:

diff --git a/cli/src/commands/posts/post.ts b/cli/src/commands/posts/post.ts
index 81fd910..134cf0a 100644
--- a/cli/src/commands/posts/post.ts
+++ b/cli/src/commands/posts/post.ts
@@ -8,7 +8,7 @@ export default cmd({
   help: 'Create a new post.',
   args: [{ name: 'text' }],
   async command(args) {
-    const text = args._[0]
+    const text = String(args._[0])
     const client = await loadClient(REPO_PATH)
     const post = await client.addPost(text)
     const tid = post.tid

Verify signature on server

TODO in cmd/sky/main.go

Not checking user signature on server yet

Post failure not handled?

After the cli tries to post and it fails (as in #76), it seems to have lost sync and any more posts fail with:

Error: Could not find commit in repo history: $bafyreiebze2qei5nub2x477ckxfsczpyfwsliez3o3uw4a5ya27kecfixi

Display a feed from multiple users

Collate posts from a user's follows into a feed similar to what you'd expect from Twitter. Consider how this plays with a user's datastore. Does it contain pointers to posts from users that they follow? The server would need to be responsible for updating the user's datastore with pointers. If not, how does a user know that there are new posts from their follows? The naive option is to grab the data store of each of their follows (or the first x posts in each follow's store).

This should be built in a modular manner so that the user can opt to switch their feed algorithm out for another.
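
The naive option with a swappable algorithm might look like the sketch below. The `Post` shape, the `FeedAlgorithm` type, and the assumption that each follow's store lists posts newest first are all hypothetical and only serve the illustration.

```typescript
// Hypothetical sketch of a pluggable feed built from each follow's store.
type Post = { author: string; text: string; time: number }
type FeedAlgorithm = (posts: Post[]) => Post[]

// Default algorithm: newest first, similar to a simple timeline.
const reverseChronological: FeedAlgorithm = (posts) =>
  [...posts].sort((a, b) => b.time - a.time)

function buildFeed(
  storesByFollow: Map<string, Post[]>,
  perFollowLimit: number,
  algorithm: FeedAlgorithm = reverseChronological,
): Post[] {
  const collected: Post[] = []
  // Take the first `perFollowLimit` posts from each follow's store.
  storesByFollow.forEach((posts) => {
    collected.push(...posts.slice(0, perFollowLimit))
  })
  return algorithm(collected)
}
```

Swapping the feed algorithm then only means passing a different function as the third argument.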

Better cli errors

I tried to register w/ an existing username, and the resulting error is a lot of unhelpful text :)

yarn cli init
yarn run v1.22.11
$ yarn workspace @bluesky/cli cli init
$ node dist/bin.js init
Repo path: /Users/jer/.sky-alice
This utility will initialize your sky repo.
Press ^C at any time to quit.
Username: alice
Server: localhost:2583
Register with the server?: (true) 
Run a delegator client (and avoid storing repo locally): (false) 
Generating repo...
Registering with server...
Failed to register with server
Error: Request failed with status code 403
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
error Command failed.
Exit code: 1
Command: /Users/jer/.local/share/nvm/v16.14.2/bin/node
Arguments: /opt/homebrew/Cellar/yarn/1.22.11/libexec/lib/cli.js cli init
Directory: /Users/jer/projects/bluesky/bluesky-experiment/cli
Output:

info Visit https://yarnpkg.com/en/docs/cli/workspace for documentation about this command.
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.

Type "Keypair" needs to be exported by @adx/common

The Keypair type is used as a parameter to the delegator client but that type isn't exported by the module. Probably just need to add

export * from './common/types.js'

to /common/index.js

Separate out network roles

Right now, a single server does all of the network roles: identity provider, DID network, personal data server, and indexer.

Each of these roles should be split out into their own entity.
