
phantomjs's Introduction

Deprecation warning

Active development of PhantomJS has ended in favor of Chrome's headless functionality (reference). Instead of using this library, consider a Go package built on that API, such as chromedp.

phantomjs

This is a Go wrapper for the phantomjs command line program. It exposes the full webpage API through a strongly typed interface. The wrapper provides an idiomatic Go interface while still letting you communicate with the underlying WebKit and JavaScript engine.

Installing

First, install phantomjs on your machine. This can be done using your package manager (such as apt-get or brew). Then install this package using the Go toolchain:

$ go get -u github.com/benbjohnson/phantomjs

Usage

Starting the process

This wrapper works by communicating with a separate phantomjs process over HTTP. The process can take several seconds to start up and shut down, so you should start it once and then share it. The package-level variable phantomjs.DefaultProcess exists for this purpose.

package main

import (
	"fmt"
	"os"

	"github.com/benbjohnson/phantomjs"
)

func main() {
	// Start the process once.
	if err := phantomjs.DefaultProcess.Open(); err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
	defer phantomjs.DefaultProcess.Close()

	// Do other stuff in your program.
	// doStuff is a placeholder for your own code.
	doStuff()
}

You can run multiple processes, but each one needs its own port so that they do not conflict. This library uses port 20202 by default.
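One way to avoid port conflicts is to ask the operating system for an unused port before starting each extra process. The sketch below uses only the standard library; `freePort` is a hypothetical helper, and it assumes that each process value has a settable port (such as a `Port` field) that you assign before calling Open(). Because the listener is closed before the port is used, another program could claim the port in between, so treat this as a convenience rather than a guarantee.

```go
package main

import (
	"fmt"
	"net"
)

// freePort asks the operating system for an unused TCP port by listening on
// port 0 and reading back the port that was assigned.
func freePort() (int, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer ln.Close()
	return ln.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	// Pick one port per additional phantomjs process.
	for i := 0; i < 2; i++ {
		port, err := freePort()
		if err != nil {
			fmt.Println(err)
			return
		}
		// Hypothetical usage: assign port to the process before Open().
		fmt.Printf("process %d could use port %d\n", i, port)
	}
}
```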

Working with WebPage

The WebPage will be the primary object you work with in phantomjs. Typically you will create a web page from a Process and then either open a URL or you can set the content directly:

// Create a web page from an open process p, such as phantomjs.DefaultProcess.
// IMPORTANT: Always make sure you close your pages!
page, err := p.CreateWebPage()
if err != nil {
	return err
}
defer page.Close()

// Open a URL.
if err := page.Open("https://google.com"); err != nil {
	return err
}

The HTTP API uses a reference map to track references between the Go library and the phantomjs process. Because of this, it is important to always Close() your web pages or else you can experience memory leaks.

Executing JavaScript

You can synchronously execute JavaScript within the context of a web page by using the Evaluate() function. The example below opens Hacker News, retrieves the text and URL from the first link, and prints them to the terminal.

// Open a URL.
if err := page.Open("https://news.ycombinator.com"); err != nil {
	return err
}

// Read first link.
info, err := page.Evaluate(`function() {
	var link = document.body.querySelector('.itemlist .title a');
	return { title: link.innerText, url: link.href };
}`)
if err != nil {
	return err
}

// Print title and URL.
link := info.(map[string]interface{})
fmt.Println("Hacker News Top Link:")
fmt.Println(link["title"])
fmt.Println(link["url"])
fmt.Println()

You can pass back any object from Evaluate() that can be marshaled over JSON.
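Because the result travels as JSON and arrives as an interface{}, it follows the usual encoding/json decoding rules: objects become map[string]interface{}, arrays become []interface{}, and numbers become float64. The sketch below simulates that with json.Unmarshal instead of a live page and uses checked type assertions so an unexpected shape produces an error rather than a panic. `linkFromValue` is a hypothetical helper, not part of this package.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// linkFromValue extracts the title and url fields from a decoded JSON object
// using checked type assertions.
func linkFromValue(v interface{}) (title, url string, err error) {
	m, ok := v.(map[string]interface{})
	if !ok {
		return "", "", errors.New("expected a JSON object")
	}
	title, ok = m["title"].(string)
	if !ok {
		return "", "", errors.New("title is missing or not a string")
	}
	url, ok = m["url"].(string)
	if !ok {
		return "", "", errors.New("url is missing or not a string")
	}
	return title, url, nil
}

func main() {
	// Stand-in for a value returned by Evaluate(): JSON decoded into interface{}.
	var v interface{}
	raw := `{"title": "Example", "url": "https://example.com", "rank": 1}`
	if err := json.Unmarshal([]byte(raw), &v); err != nil {
		fmt.Println(err)
		return
	}

	title, url, err := linkFromValue(v)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(title, url)

	// JSON numbers decode as float64, so assert to float64 before converting.
	rank, isFloat := v.(map[string]interface{})["rank"].(float64)
	fmt.Println(int(rank), isFloat)
}
```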

Rendering web pages

Another common task with PhantomJS is to render a web page to an image. Once you have opened your web page, simply set the viewport size and call the Render() method:

// Open a URL.
if err := page.Open("https://news.ycombinator.com"); err != nil {
	return err
}

// Set up the viewport and render the page.
if err := page.SetViewportSize(1024, 800); err != nil {
	return err
}
if err := page.Render("hackernews.png", "png", 100); err != nil {
	return err
}

You can also use RenderBase64() to get a base64-encoded image back in your program instead of writing a file to disk.
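Once you have a base64 string, turning it back into image bytes takes only the standard library. The sketch below assumes the string uses standard base64 encoding and holds a PNG; the exact signature of RenderBase64() is not shown here, so the input is a stand-in value. `decodePNG` is a hypothetical helper.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"errors"
	"fmt"
)

// pngSignature is the fixed 8-byte header that starts every PNG file.
var pngSignature = []byte{0x89, 'P', 'N', 'G', '\r', '\n', 0x1a, '\n'}

// decodePNG decodes a standard base64 string and checks that the result
// starts with the PNG signature.
func decodePNG(encoded string) ([]byte, error) {
	data, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return nil, err
	}
	if !bytes.HasPrefix(data, pngSignature) {
		return nil, errors.New("decoded data is not a PNG")
	}
	return data, nil
}

func main() {
	// Stand-in for a rendered image: only the PNG header, base64 encoded.
	encoded := base64.StdEncoding.EncodeToString(pngSignature)

	data, err := decodePNG(encoded)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("decoded %d bytes\n", len(data))
}
```

The decoded bytes can then be served over HTTP, stored, or passed to image/png for further processing.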


phantomjs's Issues

Evaluate() vs EvaluateJavaScript()

What are the differences between Evaluate() and EvaluateJavaScript()? Aside from the function names and the comment descriptions, they appear to be the same thing. What am I missing? Thank you, Ben.

// EvaluateJavaScript executes a JavaScript function.
// Returns the value returned by the function.
func (p *WebPage) EvaluateJavaScript(script string) (interface{}, error) {
	var resp struct {
		ReturnValue interface{} `json:"returnValue"`
	}
	if err := p.ref.process.doJSON("POST", "/webpage/EvaluateJavaScript", map[string]interface{}{"ref": p.ref.id, "script": script}, &resp); err != nil {
		return nil, err
	}
	return resp.ReturnValue, nil
}

// Evaluate executes a JavaScript function in the context of the web page.
// Returns the value returned by the function.
func (p *WebPage) Evaluate(script string) (interface{}, error) {
	var resp struct {
		ReturnValue interface{} `json:"returnValue"`
	}
	if err := p.ref.process.doJSON("POST", "/webpage/Evaluate", map[string]interface{}{"ref": p.ref.id, "script": script}, &resp); err != nil {
		return nil, err
	}
	return resp.ReturnValue, nil
}

Return page source after an element has been created by JS.

I use something like this:

"use strict"

function waitFor(testFx, onReady, timeOutMillis) {
    var maxtimeOutMillis = timeOutMillis ? timeOutMillis : 3000,
        start = new Date().getTime(),
        condition = false,
        interval = setInterval(function() {
            if ( (new Date().getTime() - start < maxtimeOutMillis) && !condition ) {
                condition = (typeof(testFx) === "string" ? eval(testFx) : testFx());
            } else {
                if(!condition) {
                    phantom.exit(1);
                } else {
                    typeof(onReady) === "string" ? eval(onReady) : onReady();
                    clearInterval(interval);
                }
            }
        }, 250);
};

var page = require('webpage').create();

page.open('https://www.somewebsite.com', function(status) {
    if (status !== 'success') {
        console.log('Unable to access network');
    } else {
        waitFor(function() {
            return page.evaluate(function() {
                var bets = document.getElementsByClassName('someclass').length;
                if (bets < 1) {
                    return false;
                }
                return true;
            });
        }, function() {
            console.log(page.evaluate(function() {
                return document.documentElement.innerHTML;
            }))
        }, 20000);
    }
});

What is the best way to do something like this with this package?

Thanks for your help.
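One possible approach, offered as a sketch rather than the package's recommended method, is to move the polling loop into Go. The hypothetical `waitFor` helper below calls a condition function until it returns true, returns an error, or a timeout elapses. With this package, the condition would call page.Evaluate() with a function that tests for the element and type-assert the result to bool; after it succeeds, a second Evaluate() call returning document.documentElement.innerHTML would fetch the page source. The example uses a stand-in condition so it runs without a phantomjs process.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTimeout is returned when the condition does not become true in time.
var errTimeout = errors.New("waitFor: timed out")

// waitFor calls cond every interval until it returns true, returns an error,
// or the timeout elapses.
func waitFor(cond func() (bool, error), interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		ok, err := cond()
		if err != nil {
			return err
		}
		if ok {
			return nil
		}
		if time.Now().After(deadline) {
			return errTimeout
		}
		time.Sleep(interval)
	}
}

func main() {
	// Stand-in condition that succeeds on the third check.
	calls := 0
	err := waitFor(func() (bool, error) {
		calls++
		return calls >= 3, nil
	}, 10*time.Millisecond, time.Second)
	fmt.Println(err, calls)
}
```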

Not working over SSL

When I make requests over SSL I get an error. I've tried using page.SetSettings, but I can't find a setting that ignores SSL errors or something along those lines.

// Works fine, loads the page, etc.
if err := page.Open("http://whatsmyuseragent.org"); err != nil {
    panic(err)
}

// err isn't nil
if err := page.Open("https://google.com"); err != nil {
    panic(err)
}

Stacktrace:

panic: failed

goroutine 1 [running]:
panic(0x624820, 0xc4200e0570)
	/usr/local/go/src/runtime/panic.go:500 +0x1a1
main.main()
	/home/gianluca/Desktop/go-test/main.go:42 +0x554

Evaluate function()

Is it possible to return a group of CSS selectors from this function and then pass it back into another Evaluate function later on? Or is there an equivalent workaround? Thank you.
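A possible workaround, assuming the selectors are plain strings, is to keep them on the Go side and embed them into the script text of the later call. Strings survive the JSON round trip, whereas DOM elements themselves presumably do not. The sketch below builds such a script with the standard library only; `buildCountScript` is a hypothetical helper, and the resulting string would be passed to Evaluate().

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildCountScript returns the source of a JavaScript function that counts the
// elements matching each selector. The selectors are embedded as a JSON array,
// which is also a valid JavaScript array literal.
func buildCountScript(selectors []string) (string, error) {
	encoded, err := json.Marshal(selectors)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf(`function() {
	var selectors = %s;
	return selectors.map(function(s) {
		return document.querySelectorAll(s).length;
	});
}`, encoded), nil
}

func main() {
	// Selectors as they might have been returned by an earlier Evaluate() call.
	script, err := buildCountScript([]string{".itemlist .title a", "#footer"})
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(script)
}
```

Encoding with json.Marshal rather than concatenating strings keeps quotes and backslashes inside a selector from breaking the generated script.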

Crashed with phantomjs v2.0/v2.1.1/v2.5beta on Windows7SP1

This is the only code:

package main

import (
	"fmt"
	"os"

	"github.com/benbjohnson/phantomjs"
)

func main() {
	// Start the process once.
	if err := phantomjs.DefaultProcess.Open(); err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
	defer phantomjs.DefaultProcess.Close()
}

It reports

Fatal Windows exception, code 0xc0000005.
PhantomJS has crashed. Please read the bug reporting guide at
<http://phantomjs.org/bug-reporting.html> and file a bug report.

On macOS, however, it works well.
Maybe this is related to shim.js?

Fix for phantomjs 2.1.1 crash at startup on Windows 10.

I don't have any open source contribution experience and
don't know how to create a patch file.
Sorry for the inconvenience.

Here is the fix:

func (p *Process) Open() error {
(...)
// The following line causes the crash.
// cmd.Env = []string{fmt.Sprintf("PORT=%d", p.Port)} // original code

// After changing that line to the one below, the crash no longer occurs.
cmd.Env = append(os.Environ(), fmt.Sprintf("PORT=%d", p.Port)) // fixed code
(...)
}

Thank you.
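The difference between the two lines can be seen without phantomjs. Assigning a fresh slice to cmd.Env replaces the child's whole environment, while appending to os.Environ() extends the parent's. One plausible explanation for the crash, not confirmed in this thread, is that the child process needs variables inherited from the parent. The sketch below uses a hypothetical `envWithPort` helper that copies the base environment before appending, so the caller's slice is never modified.

```go
package main

import (
	"fmt"
	"os"
)

// envWithPort returns a copy of base with a PORT entry appended, leaving base
// untouched.
func envWithPort(base []string, port int) []string {
	env := make([]string, 0, len(base)+1)
	env = append(env, base...)
	return append(env, fmt.Sprintf("PORT=%d", port))
}

func main() {
	// Replacing the environment leaves the child with PORT only.
	replaced := []string{fmt.Sprintf("PORT=%d", 20202)}

	// Extending the parent's environment keeps every inherited variable.
	extended := envWithPort(os.Environ(), 20202)

	fmt.Println("replaced entries:", len(replaced))
	fmt.Println("extended entries:", len(extended))
}
```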
