internetarchive / dweb-transports
License: GNU Affero General Public License v3.0
createReadStream is a useful function in its own right, for cases where the request does not come from an AV element.
IPFS currently uses WebSocketStar (WSS), since WebRTC crashes browsers on pretty much any decentralized platform, not just IPFS (see internetarchive/dweb-transport#1).
There are several issues with WSS:
Most critical: clients connecting via WSS can only retrieve IPFS CIDs that are known to the node they are connected to. This essentially means CIDs aren't universal, just known to the subset of connected peers. The "websocket-relay" project at Protocol Labs is supposed to fix this.
Most urgent: this could be improved (for the archive), especially in the short term, by connecting directly to the IPFS instance at the archive, since that node also knows all the IA files we've added, but so far none of the Protocol Labs people have been able to do this.
Most important long term: the star in WSS is a single point of failure, which means that IPFS using WSS is inappropriate for any anti-censorship application. I think the WSS-Relay could be used along with a changing list of places to connect to. Ideally that would be built into IPFS, but in its absence someone is going to have to build a wrapper that, for example, saves potential places to connect to between sessions and feeds them to the config during p_connect.
Feel free to pull these into separate issues if working on them ....
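A minimal sketch of the "save places to connect to between sessions" wrapper idea, under stated assumptions: the storage key, the function names `loadPeers` and `rememberPeers`, and the address strings are all hypothetical, and `storage` is assumed to be any object with `getItem`/`setItem` (such as `window.localStorage`). How the resulting list is fed into the IPFS config during p_connect is not shown.

```javascript
// Hypothetical helpers: nothing here is part of IPFS or dweb-transports.
const PEERS_KEY = "dweb.knownPeers"; // assumed storage key

// Return saved addresses first, then the built-in defaults, without duplicates.
function loadPeers(storage, defaults) {
  let saved;
  try {
    saved = JSON.parse(storage.getItem(PEERS_KEY) || "[]");
  } catch (err) {
    saved = []; // corrupt entry: fall back to the defaults only
  }
  if (!Array.isArray(saved)) saved = [];
  return [...new Set([...saved, ...defaults])];
}

// Merge newly seen addresses in front of the saved ones and persist them.
function rememberPeers(storage, addresses, limit = 20) {
  const merged = [...new Set([...addresses, ...loadPeers(storage, [])])];
  storage.setItem(PEERS_KEY, JSON.stringify(merged.slice(0, limit)));
}
```

A caller would run `loadPeers` before connecting and `rememberPeers` once a connection has succeeded, so the list of places to try can change over time instead of depending on one fixed star server.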
Most urgent is
GUN has a problem with not managing full storage. See internetarchive/dweb-archive#46 for why we are not testing with GUN in dweb-archive because of this problem.
We should support the DAT protocol in dweb-transports; this should be relatively straightforward.
Note, this won't (currently) work in the browser due to WebRTC issues, but should work in Node (e.g. in dweb-mirror).
See DAT meta: mitra42/dweb-universal#1
We should fork or monkeypatch the WebTorrent library so that, if it sees an HTTP (or WS) URL for a download or for the tracker while running under https, and has no other usable URL, it will try the https or wss URL instead.
Also, when I fetch images via services I'm seeing data-dependent results which just look wrong.
https://archive.org/services/img/software etc., and all of them work fine:
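The fallback rule described for WebTorrent could be sketched roughly as below. This is not a WebTorrent API: `upgradeInsecureUrl` and `usableUrls` are hypothetical names, and the assumption is that the caller passes in the list of candidate URLs plus a flag saying whether the page is served over https.

```javascript
// Hypothetical helpers illustrating the rule; not part of WebTorrent.
const INSECURE = /^(http|ws):\/\//i;

// Rewrite http:// to https:// and ws:// to wss://; leave other URLs alone.
function upgradeInsecureUrl(url) {
  return url.replace(/^http:\/\//i, "https://").replace(/^ws:\/\//i, "wss://");
}

// On a secure page, prefer URLs that are already usable; only when none
// exist, fall back to trying the upgraded form of the insecure ones.
function usableUrls(urls, pageIsSecure) {
  if (!pageIsSecure) return urls.slice();
  const usable = urls.filter((u) => !INSECURE.test(u));
  return usable.length > 0 ? usable : urls.map((u) => upgradeInsecureUrl(u));
}
```

The upgraded URL is only a guess that the same host also answers on https or wss, so a real patch would still need to handle the case where that request fails.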
HTTP/1.1 200 OK
Server: nginx/1.14.0 (Ubuntu)
Date: Sat, 23 Nov 2019 06:14:06 GMT
Content-Type: image/jpeg; charset=UTF-8
Content-Length: 3286
Connection: keep-alive
Cache-Control: max-age=21600
Expires: Sat, 23 Nov 2019 07:14:06 GMT
Last-Modified: Thu, 05 Jul 2018 02:34:06 GMT
ETag: "5b3d839e-cd6"
Expires: Sat, 23 Nov 2019 12:14:06 GMT
Access-Control-Allow-Origin: *
Accept-Ranges: bytes
Strict-Transport-Security: max-age=15724800
Accept-Ranges: bytes
But some images are missing the CORS headers, e.g.
curl -o/dev/null -Lv "https://archive.org/services/img/DonkeyKong64_101p"
HTTP/1.1 200 OK
Server: nginx/1.14.0 (Ubuntu)
Date: Sat, 23 Nov 2019 06:13:38 GMT
Content-Type: image/jpeg; charset=UTF-8
Content-Length: 7181
Connection: keep-alive
Cache-Control: max-age=3600
Expires: Sat, 23 Nov 2019 07:13:18 GMT
Last-Modified: Sat, 23 Nov 2019 06:13:18 GMT
Strict-Transport-Security: max-age=3600
X-Fastcgi-Cache: HIT
Accept-Ranges: bytes
https://archive.org/services/img/opensource_movies fails with a 403 (Forbidden). If I access it directly in the browser it's fine. I can't think what could be different about it.
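To compare responses like the two header dumps above without eyeballing them, a small checker along these lines might help. It is only a sketch: `parseHeaders` and `hasCorsHeader` are hypothetical names, and the input is assumed to be raw header text as printed by curl (an optional leading `< ` from `-v` output is stripped).

```javascript
// Parse raw header text into a Map keyed by lower-cased header name.
// Lines without a colon (such as the "HTTP/1.1 200 OK" status line) are skipped.
// If a header repeats, the last value wins.
function parseHeaders(rawText) {
  const headers = new Map();
  for (const line of rawText.split(/\r?\n/)) {
    const cleaned = line.replace(/^<\s*/, "");
    const idx = cleaned.indexOf(":");
    if (idx <= 0) continue;
    const name = cleaned.slice(0, idx).trim().toLowerCase();
    headers.set(name, cleaned.slice(idx + 1).trim());
  }
  return headers;
}

// True when the response carries an Access-Control-Allow-Origin header.
function hasCorsHeader(rawText) {
  return parseHeaders(rawText).has("access-control-allow-origin");
}
```

Run against the two dumps in this issue, the first should report the CORS header as present and the second as missing; the second also carries `X-Fastcgi-Cache: HIT`, which the first does not.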
I'm working on adding "seed" as another supported function - the design thinking is in dweb-mirror#117 which is the first use case.
@rodneywitcher - particularly interested in how we might want this to work with Wolk as well. I think it would involve adding keys during config, but not sure what info needs passing during the request to seed a file or directory.
Collections without thumbnail images display as the IA logo - instead should not display any logo
See for example http://localhost:4244/details/@mitra and compare to http://dweb.archive.org/details/@mitra
Problems using the same code on Node as in the browser mean we can't use GUN for metadata on the Node-based dweb-mirror.
Move the newly split naming into dweb-archivecontroller, so dweb-transports becomes less archive.org specific.
See #22
As more transports get integrated into dweb-transports, and as the IA's UI team starts looking at using dweb-archivecontroller, which depends on dweb-transports, it has become necessary to split up dweb-transports and make it lighter.
Solution will need ...
Experimentation is in the 'split' branch, which may or may not always be working !
Steps might be ... (this section will be edited)
Pull out each transport to its own script (most are NOT ES6 Module compatible)
Split each transport, test in browser and node
Move code from DA/archive.html and DM/internetarchive.js into DTS/Transports &/or shims
Cleanup - when this is done
figure out distribution mechanisms for
Moved from: internetarchive/dweb-transport#2
There are issues with persistence of the content stored in IPFS. This is inherent to IPFS, since it gives no guarantee of persistence: things are only stored by people who publish them, pin them, or, for a period, look at them.
Since the publisher is a browser, and is probably offline at this point, and no one may have looked at the content, we need a way to be able to store it. It's unclear whether this should be via pinning, or whether we have to go outside of IPFS to do so.
For now, given the challenge of pinning in a browser, this is solved with https://github.com/internetarchive/dweb-transport/issues/13 which stores both on our http servers and in IPFS.
Note, I'm leaving this open in the hope that an IPFS specific solution can be found.
2018-01-23: Confirmed this is not currently possible directly in IPFS. A solution would be to build a pinning service, e.g. one that the client hits over HTTP and that then pins the content. This would introduce another single point of failure (client access to http), so it would really need to use something like an IPFS pubsub channel that picks the content up and passes it back for pinning, which needs GoLang skills or maybe a separate node.js client at IA. For now we will stick to HTTP for persistent storing from browsers.
I'm seeing two issues with wolk.
a) URLs like https://cloud.wolk.com/dweb.archive.org/metadata/netlabels are failing with "Key not found"
b) The library is returning that failure as a success, with a data structure that includes headers and a 404.
I've disabled it for now in the default library.
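Until issue (b) is fixed in the library, a caller could guard against it with a check like the sketch below. The response shape is an assumption: the issue only says the result "includes headers and a 404", so the field names `statusCode` and `status` are guesses, and `unwrapResponse` is a hypothetical helper rather than anything in the Wolk library.

```javascript
// Hypothetical guard: turn an HTTP error disguised as a successful result
// into a thrown Error. Field names statusCode/status are assumptions.
function unwrapResponse(res) {
  const status = res ? (res.statusCode ?? res.status) : undefined;
  if (typeof status === "number" && status >= 400) {
    const err = new Error(`Transport returned HTTP ${status}`);
    err.status = status;
    throw err;
  }
  return res;
}
```

With this in place a "Key not found" result would surface as a failure, so the next transport can be tried instead of the 404 structure being treated as data.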
Moved from: internetarchive/dweb-transport#7
Lists should support deletion. Note that a deletion is just a flag of some sort (I think YJS supports it), so any retrieval should also have the option of eliminating deletions or retaining them.
Note there is already code that filters out duplicates; whether to eliminate deletions probably belongs as an argument to that code (eliminating them first, so that deduplication gets the not-deleted one).
Note - part of this is having some way to delete a list all the way back to empty.
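One possible shape for the retrieval filter, as a sketch only: the entry fields `key` and `deleted`, the option name `keepDeleted`, and the function name `filterList` are all hypothetical, and this is not the existing deduplication code. It drops flagged entries first and deduplicates second, so that a surviving duplicate is a not-deleted one.

```javascript
// Hypothetical list filter. Each entry is assumed to look like
// { key: "...", deleted: true|false, ... }.
function filterList(entries, { keepDeleted = false } = {}) {
  // Step 1: optionally eliminate entries flagged as deleted.
  const candidates = keepDeleted ? entries : entries.filter((e) => !e.deleted);
  // Step 2: deduplicate by key, keeping the first remaining occurrence.
  const seen = new Set();
  return candidates.filter((e) => {
    if (seen.has(e.key)) return false;
    seen.add(e.key);
    return true;
  });
}
```

Under this scheme a list whose entries are all flagged as deleted retrieves as empty by default, while `keepDeleted: true` still exposes the full history.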
The following systems are integrated currently, updates are welcome.