archive.is

Unofficial Node.js API for archive.is

Install

npm install archive.is --save

Usage

var archive = require('archive.is');

// Get the last existing snapshot of https://www.kernel.org
archive.timemap('https://www.kernel.org').then(function (timemap) {
  console.log(timemap.last);
  // { url: 'https://archive.is/20160109153444/https://www.kernel.org/',
  //   date: Sat, 09 Jan 2016 15:34:44 GMT }
});

// Take a new snapshot of https://www.kernel.org
archive.save('https://www.kernel.org').then(function (result) {
  console.log(result.shortUrl); // https://archive.is/EJoGi
});

API

timemap(url, [callback])

Get a list of all snapshots of a given page.

  • url {string} Page URL
  • callback {function} If omitted, a promise will be returned

Returned promise will be fulfilled with an object with the following keys:

  • original {string} Original page URL
  • timegate {string} Timegate URL
  • first {Memento} The oldest snapshot
  • last {Memento} The newest snapshot
  • mementos {Array.<Memento>} All snapshots sorted by date in ascending order

Example result:

{ original: 'https://www.kernel.org/',
  timegate: 'https://archive.is/timegate/https://www.kernel.org/',
  first:
   { url: 'https://archive.is/19980130085039/http://www.kernel.org/',
     date: Fri, 30 Jan 1998 08:50:39 GMT },
  last:
   { url: 'https://archive.is/20160127210011/https://www.kernel.org/',
     date: Wed, 27 Jan 2016 21:00:11 GMT },
  mementos:
   [ { url: 'https://archive.is/19980130085039/http://www.kernel.org/',
       date: Fri, 30 Jan 1998 08:50:39 GMT },
     { url: 'https://archive.is/19990429093120/http://www.kernel.org/',
       date: Thu, 29 Apr 1999 09:31:20 GMT },
     ...
     { url: 'https://archive.is/20160127180405/https://www.kernel.org/',
       date: Wed, 27 Jan 2016 18:04:05 GMT },
     { url: 'https://archive.is/20160127210011/https://www.kernel.org/',
       date: Wed, 27 Jan 2016 21:00:11 GMT } ] }
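
Once a timemap has been fetched, its mementos array can be searched locally without further requests. The sketch below is a hypothetical helper (closestMemento is not part of the package) that picks the snapshot nearest to a given date. It runs on hand-built sample data shaped like the example result, not on a live response.

```javascript
// Hypothetical helper: not part of the archive.is package.
// Returns the memento whose date is nearest to `target`.
function closestMemento(timemap, target) {
  var best = null;
  var bestDiff = Infinity;
  timemap.mementos.forEach(function (memento) {
    var diff = Math.abs(memento.date.getTime() - target.getTime());
    if (diff < bestDiff) {
      best = memento;
      bestDiff = diff;
    }
  });
  return best; // null when there are no mementos
}

// Sample data shaped like the example result (dates built by hand).
var sample = {
  mementos: [
    { url: 'https://archive.is/19980130085039/http://www.kernel.org/',
      date: new Date(Date.UTC(1998, 0, 30, 8, 50, 39)) },
    { url: 'https://archive.is/20160127210011/https://www.kernel.org/',
      date: new Date(Date.UTC(2016, 0, 27, 21, 0, 11)) }
  ]
};

console.log(closestMemento(sample, new Date(Date.UTC(2015, 0, 1))).url);
// https://archive.is/20160127210011/https://www.kernel.org/
```

With the real package, the same helper could be called inside the then() handler of archive.timemap(url).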

save(url, [options], [callback])

Take a new snapshot of a page.

  • url {string} Page URL
  • options {Object}
  • options.anyway {boolean} Force a new snapshot even if one already exists (default: false)
  • callback {function} If omitted, a promise will be returned

Returned promise will be fulfilled with an object with the following keys:

  • id {string} Snapshot ID
  • shortUrl {string} Short URL (https://archive.is/ + id)
  • alreadyExists {boolean} true if an already existing snapshot was returned, false if a new one was created

Note that the anyway option cannot be used more than once every ~3–5 minutes for the same URL, so you may get an already existing snapshot even after setting anyway to true.

Example result:

{ id: 'nUdVJ',
  shortUrl: 'https://archive.is/nUdVJ',
  alreadyExists: true }
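
Because anyway can be ignored during that cooldown, callers may want to check alreadyExists rather than assume a fresh snapshot. This is a minimal sketch, assuming save resolves with the keys listed above. forceSnapshot and fakeSave are hypothetical names, and fakeSave stands in for archive.save so the example runs offline.

```javascript
// Hypothetical wrapper: `saveFn` stands in for archive.save so the sketch
// can run without network access.
function forceSnapshot(saveFn, url) {
  return saveFn(url, { anyway: true }).then(function (result) {
    return {
      shortUrl: result.shortUrl,
      // true only when a new snapshot was actually created
      fresh: result.alreadyExists === false
    };
  });
}

// Stub that mimics the documented result shape (assumption: no real request).
function fakeSave(url, options) {
  return Promise.resolve({
    id: 'nUdVJ',
    shortUrl: 'https://archive.is/nUdVJ',
    alreadyExists: true
  });
}

forceSnapshot(fakeSave, 'https://www.kernel.org').then(function (info) {
  console.log(info.shortUrl, info.fresh ? '(new snapshot)' : '(existing snapshot reused)');
});
```

Passing archive.save in place of fakeSave would apply the same check to a real request.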

Memento object

  • url {string} Snapshot access URL
  • date {Date} Date the snapshot was taken
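
In the examples in this README, the 14-digit path segment of a snapshot URL matches the memento's date (YYYYMMDDhhmmss in GMT). Assuming that pattern holds, which the package does not document, the date can be recovered from a URL alone. timestampFromSnapshotUrl is a hypothetical helper, not part of the package.

```javascript
// Hypothetical helper: parses the 14-digit timestamp from a snapshot URL.
// Assumes the digits are YYYYMMDDhhmmss in GMT, as in the README examples.
function timestampFromSnapshotUrl(url) {
  var pattern = /^https?:\/\/archive\.is\/(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})\//;
  var match = pattern.exec(url);
  if (!match) {
    return null; // e.g. a short URL such as https://archive.is/EJoGi
  }
  var p = match.slice(1).map(Number);
  // Months are zero-based in Date.UTC
  return new Date(Date.UTC(p[0], p[1] - 1, p[2], p[3], p[4], p[5]));
}

var parsed = timestampFromSnapshotUrl(
  'https://archive.is/20160109153444/https://www.kernel.org/'
);
console.log(parsed.toUTCString()); // Sat, 09 Jan 2016 15:34:44 GMT
```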

License

The archive.is package is released under the GPL-3.0 license. See the LICENSE file for more information.

Contributors

qvint

archive.is's Issues

Seems that User-Agent must be set for the timemap request to work

FYI I wasn't getting results until overriding User-Agent in the request lib with something like "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36"

Using the default request headers, I got a weird response: a Google Analytics snippet from archive.is.

Do you have any trouble with 504s?

I tried using this library and got nothing but 504s. I got the same result with the request lib. I can query archive.is with curl, though. It's odd.

Couldn't save page errors on every url

This was working until today, but now I'm getting the following error for every url:
(node:12148) UnhandledPromiseRejectionWarning: Error: Couldn't save page

Not working?

raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

Not working anymore

Doesn't seem to work anymore. Did they change something?
I see that the request also takes a submitid parameter
