Topic: internet-archiving

👇 Here are 25 public repositories matching this topic...

akamhy / waybackpy

Wayback Machine API interface & a command-line tool

Home Page: https://pypi.org/project/waybackpy/

internet-archive wayback-machine internet-archiving archive-webpage archive-webpages wayback-machine-api cdx-api wayback-machine-python savepagenow web-archiving

archivebox / archivebox

🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...

Organization: archivebox

Home Page: https://archivebox.io

pocket wget browser-bookmarks pinboard chromium firefox backups rss web-archiving python

archivebox / archivebox-browser-extension

Official ArchiveBox browser extension: automatically/manually preserve your browsing history using ArchiveBox.

Organization: archivebox

Home Page: https://chromewebstore.google.com/detail/archivebox-exporter/habonpimjphpdnmcfkaockjnffodikoj

archivebox chrome-extension firefox-extension svelte archiving browser-extension digipres digital-preservation internet-archiving web-archiving

archivebox / archivebox-proxy

Official ArchiveBox MITM proxy: saves URLs of all requests passing through to an ArchiveBox server for archival.

Organization: archivebox

Home Page: https://github.com/ArchiveBox/archivebox-proxy

archivebox digipres digital-preservation https-proxy internet-archiving mitmproxy proxy web-archiving web-proxy

archivebox / debian-archivebox

Home of the official apt/deb package for Ubuntu/Debian-based systems.

Organization: archivebox

Home Page: https://launchpad.net/~archivebox/+archive/ubuntu/archivebox

archivebox debian apt package internet-archiving stdeb web-archiving digipres aptitude ubuntu

archivebox / digestbox

DigestBox takes any webpage URL (news article, video link, comment thread, etc.) and gives you just the raw content. It's powered by ArchiveBox.io under the hood.

Organization: archivebox

Home Page: https://DigestBox.io

archivebox backups digipres headless-browser internet-archiving warc web-archiving

archivebox / docker-archivebox

Home of the official Docker image for ArchiveBox.

Organization: archivebox

archivebox docker container image digipres internet-archiving kubernetes oci podman docker-compose docker-image

archivebox / docs

Source for the GitHub Wiki / ReadTheDocs documentation for ArchiveBox, the self-hosted internet archiving solution.

Organization: archivebox

Home Page: https://docs.archivebox.io

archivebox sphinx python rest cli ui documentation wiki usage community

archivebox / electron-archivebox

Desktop Electron app for the ArchiveBox internet archiver. (ALPHA: not ready for general use)

Organization: archivebox

Home Page: https://archivebox.io

archivebox electron docker internet-archiving digipres web-archiving desktop desktop-electron macos windows

archivebox / good-karma-kit

😇 A Docker Compose bundle to run on servers with spare CPU, RAM, disk, and bandwidth to help the world. Includes Tor, ArchiveWarrior, BOINC, and more...

Organization: archivebox

Home Page: https://archivebox.github.io/good-karma-kit/

docker docker-compose distributed-computing good-karma tor i2p ipfs storj sia boinc

archivebox / homebrew-archivebox

Homebrew formula for the ArchiveBox self-hosted internet archiving solution.

Organization: archivebox

Home Page: https://archivebox.io

archivebox homebrew macos package linuxbrew brew-tap internet-archiving web-archiving digipres

archivebox / pip-archivebox

Official Python package for ArchiveBox, the self-hosted internet archiving solution.

Organization: archivebox

Home Page: https://pypi.org/project/archivebox/

archivebox python pip pypi internet-archiving web-archiving digipres setuptools sdist wheel

archivebox / readability-extractor

JavaScript/Node wrapper around Mozilla's Readability library so that ArchiveBox can call it as a one-shot CLI command to extract each page's article text.

Organization: archivebox

archivebox node readability wrapper internet-archiving

fooftilly / rss_archiver

Download and archive RSS feeds to the Wayback Machine. Save a list of archived feeds in a local DB.

User: fooftilly

archive archiver internet-archive internet-archiving link-archive link-archiver rss rss-archive rss-feed wayback-machine webarchive

gabldotink / sharkive.old

Upload stuff to the Internet Archive using a shell script.

User: gabldotink

internet-archive internet-archiving youtube youtube-dl youtube-downloader

httpreserve / conventoarchiver

Repository of scripts that help capture MyConvento newsroom press releases from the MyConvento PR management suite. The README provides an analysis of the MyConvento URL architecture for users hoping to develop a solution for themselves.

Organization: httpreserve

myconvento web-archiving pr-newsroom press-releases webarchives digipres internet-archive internet-archiving my-convento

itsliamdowd / waybackbrowsermacos

Pick a date and explore websites from the early days of the internet to now, all in an easy-to-use browser format! 💻

User: itsliamdowd

application browser coding developer html internet internet-archive internet-archiving js macos

itsliamdowd / waybackbrowserwindows

Pick a date and explore websites from the early days of the internet to now, all in an easy-to-use browser format! 💻

User: itsliamdowd

browser css flask flask-application flask-webapp html internet internet-archive pyqt5 python wayback-machine app application apps flask-website html-css internet-archiving internet-connection internet-explorer windows

mikwielgus / forum-dl

Scrape posts and threads from forums, news aggregators, and mail archives; export to JSONL, mailbox, or WARC.

User: mikwielgus

python scraper forum discourse phpbb simplemachines data-fetching internet-archiving warc

own-data-privateer / pwebarc

A suite of tools for mirroring and hoarding the web pages you visit for later offline viewing: your own personal Wayback Machine that can also archive HTTP POST requests and responses, as well as most other HTTP-level data. It follows an "archive everything now, figure out what to do with it later" philosophy.

Organization: own-data-privateer

Home Page: https://oxij.org/software/pwebarc/

archive backups internet internet-archiving self-hosted wayback-machine web-archiving

pirate / internet-archiving-talk

🎭 An introduction to the Internet Archiving ecosystem, tooling, and some of the ethical dilemmas that the community faces.

User: pirate

Home Page: https://pirate.github.io/internet-archiving-talk/

internet-archiving talks slideshow web-archiving wget warc archivebox censorship ethics

pirate / wikipedia-mirror

🌐 Guide and tools to run a full offline mirror of Wikipedia.org with three different approaches: Nginx caching proxy, Kiwix + ZIM dump, and MediaWiki/XOWA + XML dump.

User: pirate

Home Page: https://docs.sweeting.me/s/self-host-a-wikipedia-mirror

wikipedia wikipedia-dump wiki mediawiki xowa nginx docker docker-compose internet-archiving archiving