Internet Archive's Projects
Project Gutenberg collection importation via IAS3 interface
Source for MaxMind's GeoIP-Python to install via pip
Machine-readable dataset for better understanding Wikipedia and MediaWiki installations
A MediaWiki extension that supports importing Archive.org palm leaf items
Video editing with Python
Daily TV News Summary using GPT
Tool to build a Solr index offline
One webpage for every book ever published!
API documentation for https://github.com/internetarchive/openlibrary
A repository of cleanup bots implementing the openlibrary-client
Python Client Library for the Archive.org OpenLibrary API
Coordination between the OpenLibrary.org Librarian community
Web archive index server based on RocksDB
A PDF classifier ensemble with REST API service
Clone of Apache Pig repo, branch 0.10, with IA-specific hacks/mods.
Automatic polyfill service.
Changes to poppler to get accurate coordinates from PDFs
A client for the [Archive-It] WASAPI Data Transfer API
A Python module for working with 10- and 13-digit ISBNs
[vault fork] of "rsync for cloud storage" - Google Drive, S3, Dropbox, Backblaze B2, One Drive, Swift, Hubic, Wasabi, Google Cloud Storage, Yandex Files
Demo code for the Open Library Read API
A Headless Chrome rendering solution
A powerful web component router.
Model and front end for rules that manage Wayback playback
Python client package for the playback rules engine
Watch for local files to appear and move them into S3
Backend, IA-specific tools for crawling and processing the scholarly web. Content ends up in https://fatcat.wiki
Support for writing WARC files with Scrapy