gyotaku's Introduction

Gyotaku - 魚拓(ぎょたく)

What's this?

Gyotaku is a simple tool for saving complete copies of web pages.

Requirements

  • Mac OS/Linux/Windows
  • Java Runtime Environment
  • Firefox

Usage

Get Gyotaku

Download gyotaku.zip and unzip it.

https://github.com/seratch/gyotaku/downloads

Invoke Gyotaku

The easiest way is to use the Gyotaku UI (a Swing application).

./gyotaku_ui

Authentication

To save a page that requires authentication, supply your own customized Selenium WebDriver.

input/tumblr-login.scala

Add the following source code:

import org.openqa.selenium._
// Open the login page in Firefox.
val driver = new firefox.FirefoxDriver
driver.get("https://www.tumblr.com/login")
// Fill in the login form and submit it.
driver.findElement(By.id("signup_email")).sendKeys("YOUR_EMAIL")
driver.findElement(By.id("signup_password")).sendKeys("YOUR_PASSWORD")
driver.findElement(By.id("signup_form")).submit()
// Return the logged-in driver as the script's result.
driver

input/tumblr.yml

name: tumblr-dashboard
url: http://www.tumblr.com/dashboard
driver: { path: input/tumblr-login.scala }

Configuration

name: example
url: http://www.example.com/
driver: { path: input/login_operation.scala }
charset: UTF-8
prettify: false
replaceNoDomainOnly: false
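
The keys in this configuration and their documented defaults can be summarized as a small data model. The following is only a sketch: `GyotakuConfig` and `GyotakuConfigExample` are hypothetical names used for illustration, not classes from Gyotaku itself, and it assumes that `name` and `url` are the only required keys.

```scala
// Hypothetical model of a Gyotaku configuration (not the tool's own classes).
// Default values follow the per-key descriptions in this README.
final case class GyotakuConfig(
  name: String,                       // directory name under the output directory
  url: String,                        // page to download
  driverPath: Option[String] = None,  // None: FirefoxDriver is used
  charset: String = "UTF-8",
  prettify: Boolean = false,
  replaceNoDomainOnly: Boolean = true
)

object GyotakuConfigExample {
  // Only name and url are given; every other key takes its default.
  val minimal: GyotakuConfig =
    GyotakuConfig(name = "example", url = "http://www.example.com/")

  // The sample configuration from this section, with every key spelled out.
  val explicit: GyotakuConfig = GyotakuConfig(
    name = "example",
    url = "http://www.example.com/",
    driverPath = Some("input/login_operation.scala"),
    charset = "UTF-8",
    prettify = false,
    replaceNoDomainOnly = false
  )
}
```

Note that the sample sets replaceNoDomainOnly to false explicitly, which differs from the value used when the key is omitted.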

name

The name of this gyotaku (the saved page). It is used as the directory name under the output directory.

url

The URL of the page to download.

driver

Specifies how to create an org.openqa.selenium.WebDriver instance.

FirefoxDriver is used if this key is omitted.

driver:
  path: path/to/driver.scala

charset

The charset used for the downloaded HTML and the modified CSS files.

"UTF-8" is used if this key is omitted.

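To illustrate what the charset setting affects, here is a minimal sketch of writing page content with a configurable charset and a UTF-8 fallback. `CharsetExample.save` is a hypothetical helper written for this README, not part of Gyotaku, and it only shows the encoding step.

```scala
import java.nio.charset.Charset
import java.nio.file.{Files, Path}

object CharsetExample {
  // Hypothetical helper: encode `content` with the configured charset,
  // falling back to UTF-8 when no charset is configured.
  def save(target: Path, content: String, charset: Option[String] = None): Path = {
    val cs = Charset.forName(charset.getOrElse("UTF-8"))
    Files.write(target, content.getBytes(cs))
  }
}
```

The same text produces different bytes on disk depending on the charset, so the value should match whatever the saved HTML declares.
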
prettify

Whether to tidy the HTML using HtmlCleaner.

false is used if this key is omitted.

replaceNoDomainOnly

Replace URLs in HTML/CSS only when they don't start with 'http://' or 'https://'.

true is used if this key is omitted.
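
The rule described above can be sketched as a small predicate. This is not Gyotaku's actual code: `ReplaceRule`, `hasDomain`, and `shouldReplace` are hypothetical names, and the behavior for `replaceNoDomainOnly = false` (rewrite every URL) is an assumption inferred from the option's name.

```scala
object ReplaceRule {
  // True when the URL starts with an explicit http:// or https:// prefix.
  def hasDomain(url: String): Boolean =
    url.startsWith("http://") || url.startsWith("https://")

  // Hypothetical decision function: should this URL be rewritten?
  //   replaceNoDomainOnly = true  -> rewrite only URLs without a domain
  //   replaceNoDomainOnly = false -> rewrite every URL (assumed behavior)
  def shouldReplace(url: String, replaceNoDomainOnly: Boolean = true): Boolean =
    !replaceNoDomainOnly || !hasDomain(url)
}
```

With the default setting, a path such as `/css/site.css` would be rewritten, while `http://cdn.example.com/a.js` would be left untouched.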

gyotaku's Issues

Replace all the url link to full url with domain

Feature request: it would be useful to have an option that replaces every URL link with a full URL that includes the domain.

  • url(...) in CSS files should start with 'http'
  • links in HTML should point to the __local__ directory
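
For the first point, turning a relative reference into a full URL with a domain can be done with the JDK's `java.net.URI`. The sketch below shows only that resolution step and is not Gyotaku's implementation. `AbsoluteUrl.resolve` is a hypothetical helper, and the page URL is assumed to include a path (for example a trailing `/`).

```scala
import java.net.URI

object AbsoluteUrl {
  // Resolve `link` against the URL of the page it was found on.
  // Links that are already absolute come back unchanged.
  def resolve(pageUrl: String, link: String): String =
    new URI(pageUrl).resolve(link).toString
}
```

For example, resolving `../img/logo.png` against `http://www.example.com/blog/index.html` gives `http://www.example.com/img/logo.png`.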
