Gyotaku is a simple tool for saving complete copies of web pages.
- Mac OS/Linux/Windows
- Java Runtime Environment
- Firefox
Download gyotaku.zip and unzip it.
https://github.com/seratch/gyotaku/downloads
The easiest way to get started is the Gyotaku UI (a Swing application).
./gyotaku_ui
To save a page that requires authentication, supply your own customized Selenium WebDriver.
Add the following source code as a Scala script (for example, input/tumblr-login.scala):
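import org.openqa.selenium._
// Start a Firefox session and open the login page.
val driver = new firefox.FirefoxDriver
driver.get("https://www.tumblr.com/login")
// Fill in the credentials and submit the login form.
driver.findElement(By.id("signup_email")).sendKeys("YOUR_EMAIL")
driver.findElement(By.id("signup_password")).sendKeys("YOUR_PASSWORD")
driver.findElement(By.id("signup_form")).submit()
// The last expression is the logged-in driver that the script evaluates to.
driver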
Then refer to the script from the configuration:
name: tumblr-dashboard
url: http://www.tumblr.com/dashboard
driver: { path: input/tumblr-login.scala }
name: example
url: http://www.example.com/
driver: input/login_operation.scala
charset: UTF-8
prettify: false
replaceNoDomainOnly: false
- name: The name of the gyotaku. It is used as the directory name under the output directory.
- url: The URL of the page to download.
- driver: How to create the org.openqa.selenium.WebDriver instance. FirefoxDriver is used if omitted. To use a custom driver, give the path to its Scala script:
      driver:
        path: path/to/driver.scala
- charset: The charset used for the downloaded HTML and the modified CSS files. "UTF-8" if omitted.
- prettify: Whether to modify the HTML using HtmlCleaner. false if omitted.
- replaceNoDomainOnly: Whether to replace URLs in HTML/CSS only when they do not start with 'http://' or 'https://'. true if omitted.
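
The defaults and the replaceNoDomainOnly rule described above can be sketched in Scala as follows. This is only an illustration, not Gyotaku's actual implementation: the names Entry, UrlRule, isAbsolute and shouldReplace are hypothetical. The behaviour when replaceNoDomainOnly is false (every URL is replaced) is an assumption.

```scala
// Hypothetical model of one configuration entry (not Gyotaku's own classes).
final case class Entry(
  name: String,
  url: String,
  charset: String = "UTF-8",            // documented default
  prettify: Boolean = false,            // documented default
  replaceNoDomainOnly: Boolean = true   // documented default
)

object UrlRule {
  // True when the URL starts with 'http://' or 'https://'.
  def isAbsolute(url: String): Boolean =
    url.startsWith("http://") || url.startsWith("https://")

  // With replaceNoDomainOnly on, only URLs without a domain are replaced.
  // Assumption: with it off, every URL is replaced.
  def shouldReplace(url: String, entry: Entry): Boolean =
    !entry.replaceNoDomainOnly || !isAbsolute(url)
}
```

With the defaults, a reference such as /css/site.css would be replaced while http://www.example.com/css/site.css would be left alone. Under the stated assumption, an entry with replaceNoDomainOnly set to false would have both replaced.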