phridge

A bridge between node and PhantomJS.

Working with PhantomJS in node is a bit cumbersome since you need to spawn a new PhantomJS process for every single task. However, spawning a new process is quite expensive and thus can slow down your application significantly.

phridge provides an API to easily

  • spawn new PhantomJS processes
  • run functions with arguments inside PhantomJS
  • return results from PhantomJS to node
  • manage long-running PhantomJS instances

Unlike other node-PhantomJS bridges, phridge provides a way to run code directly inside PhantomJS instead of turning every call and assignment into an async operation.

phridge uses PhantomJS' stdin and stdout for inter-process communication. It stringifies the given function, passes it to PhantomJS via stdin, executes it in the PhantomJS environment and passes back the results via stdout. Thus you can write your PhantomJS scripts inside your node modules in a clean and synchronous way.

Instead of ...

phantom.addCookie("cookie_name", "cookie_value", "localhost", function () {
    phantom.createPage(function (page) {
        page.set("customHeaders.Referer", "http://google.com", function () {
            page.set(
                "settings.userAgent",
                "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5)",
                function () {
                    page.open("http://localhost:9901/cookie", function (status) {
                        page.evaluate(function (selector) {
                            return document.querySelector(selector).innerText;
                        }, function (text) {
                            console.log("The element contains the following text: "+ text)
                        }, "h1");
                    });
                }
            );
        });
    });
});

... you can write ...

// node
phantom.run("h1", function (selector, resolve) {
    // this code runs inside PhantomJS

    phantom.addCookie("cookie_name", "cookie_value", "localhost");

    var page = webpage.create();
    page.customHeaders = {
        Referer: "http://google.com"
    };
    page.settings = {
        userAgent: "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5)"
    };
    page.open("http://www.google.com", function () {
        var text = page.evaluate(function (selector) {
            return document.querySelector(selector).innerText;
        }, selector);

        // resolve the promise and pass 'text' back to node 
        resolve(text);
    });
}).then(function (text) {
    // inside node again
    console.log("The element contains the following text: " + text);
});

Please note that the phantom object provided by phridge is completely different from the phantom object inside PhantomJS. The same is true of the page object. Check out the API for further information.


Installation

npm install phridge


Examples

Spawn a new PhantomJS process

phridge.spawn({
    proxyAuth: "john:1234",
    loadImages: false,
    // passing CLI-style options does also work
    "--remote-debugger-port": 8888
}).then(function (phantom) {
    // phantom is now a reference to a specific PhantomJS process
});

phridge.spawn() takes an object which will be passed as config to PhantomJS. Check out their documentation for a detailed overview of options. CLI-style options are added as they are, so be sure to escape the space character.

Please note: Due to known issues in PhantomJS, some config options are only supported in CLI-style.
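If a camelCase option appears to be ignored, one thing to try is passing it in CLI-style instead. The helper below is a hypothetical sketch (toCliStyle is not part of phridge) that mechanically rewrites camelCase keys into "--kebab-case" keys. Not every PhantomJS flag follows this naming pattern, so check each resulting name against the PhantomJS command line documentation before relying on it.

```javascript
// Hypothetical helper, not part of phridge: rewrite camelCase option
// names into CLI-style keys such as "--remote-debugger-port".
function toCliStyle(config) {
    var result = {};

    Object.keys(config).forEach(function (key) {
        // Prefix each upper-case letter with "-" and lower-case it.
        var flag = "--" + key.replace(/[A-Z]/g, function (letter) {
            return "-" + letter.toLowerCase();
        });

        result[flag] = config[key];
    });

    return result;
}

console.log(toCliStyle({ remoteDebuggerPort: 8888, loadImages: false }));
// { '--remote-debugger-port': 8888, '--load-images': false }
```

The returned object could then be handed to phridge.spawn() in place of the camelCase version.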

Run any function inside PhantomJS

phantom.run(function () {
    console.log("Hi from PhantomJS");
});

phridge stringifies the given function, sends it to PhantomJS and evals it again. Hence you can't use scope variables:

var someVar = "hi";

phantom.run(function () {
    console.log(someVar); // throws a ReferenceError
});
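The effect can be reproduced without PhantomJS. The following sketch is only an analogy: it uses node's vm module as a stand-in for the separate PhantomJS process and shows that a function's source text does not carry its surrounding variables with it. It uses typeof so that the missing variable shows up as "undefined" instead of throwing.

```javascript
// Sketch only: mimics "stringify, send, eval" inside a single node process.
// vm.runInNewContext stands in for the separate PhantomJS process.
var vm = require("vm");

var someVar = "hi";

function usesClosure() {
    return typeof someVar; // someVar exists only in the sender's scope
}

// 1. Turn the function into source text (this is all that can be sent).
var source = "(" + usesClosure.toString() + ")()";

// 2. Evaluate the text in a fresh global scope that knows nothing of someVar.
var seenRemotely = vm.runInNewContext(source, {});

console.log(usesClosure()); // "string"
console.log(seenRemotely);  // "undefined"
```

Anything the function needs therefore has to be passed in as an argument.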

Passing arguments

You can also pass arguments to the PhantomJS process:

phantom.run("hi", 2, {}, function (string, number, object) {
    console.log(string, number, object); // 'hi', 2, [object Object]
});

Arguments are stringified by JSON.stringify(), so be sure to use JSON-valid objects.
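To see what "JSON-valid" means in practice, the sketch below runs a value through JSON.stringify() and JSON.parse() in plain node. The jsonRoundTrip helper is made up for this example; it only illustrates what a receiver of JSON-serialized data can expect to see.

```javascript
// Sketch: what a value looks like after a JSON round trip.
function jsonRoundTrip(value) {
    return JSON.parse(JSON.stringify(value));
}

var original = {
    name: "hi",
    count: 2,
    when: new Date(0),         // becomes an ISO date string
    callback: function () {},  // dropped
    missing: undefined         // dropped
};

var received = jsonRoundTrip(original);

console.log(Object.keys(received)); // [ 'name', 'count', 'when' ]
console.log(typeof received.when);  // "string"
```

Functions and undefined properties disappear silently, and dates arrive as strings, so it is safest to pass only plain strings, numbers, booleans, arrays and plain objects.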

Returning results

The given function can run sync and async. However, the run() method itself will always run async as it needs to wait for the process to respond.

Sync

phantom.run(function () {
    return Math.PI;
}).then(function (pi) {
    console.log(pi === Math.PI); // true
});

Async

phantom.run(function (resolve) {
    setTimeout(function () {
        resolve("after 500 ms");
    }, 500);
}).then(function (msg) {
    console.log(msg); // 'after 500 ms'
});

Results are also stringified by JSON.stringify(), so returning application objects with functions won't work.

phantom.run(function () {
    ...
    // doesn't work because page is not a JSON-valid object
    return page;
});

Returning errors

Errors can be returned by using the throw keyword or by calling the reject function. Both ways will reject the promise returned by run().

Sync

phantom.run(function () {
    throw new Error("An unknown error occurred");
}).catch(function (err) {
    console.log(err); // 'An unknown error occurred'
});

Async

phantom.run(function (resolve, reject) {
    setTimeout(function () {
        reject(new Error("An unknown error occurred"));
    }, 500);
}).catch(function (err) {
    console.log(err); // 'An unknown error occurred'
});

Async methods with arguments

resolve and reject are just appended to the regular arguments:

phantom.run(1, 2, 3, function (one, two, three, resolve, reject) {

});

Persisting states inside PhantomJS

Since the function passed to phantom.run() can't declare variables in the global scope, it has no way of its own to keep state between calls. That's why phantom.run() calls all functions on the same context object, so you can store state variables on this.

phantom.run(function () {
    this.message = "Hello from the first call";
}).then(function () {
    phantom.run(function () {
        console.log(this.message); // 'Hello from the first call'
    });
});
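The idea of a shared context object can be shown in plain node. This is a sketch of the general technique only, with a made-up runOnContext helper; it does not reproduce phridge's internals.

```javascript
// Sketch: one context object reused as `this` across separate calls.
var context = {};

function runOnContext(fn) {
    return fn.call(context);
}

runOnContext(function () {
    this.message = "Hello from the first call";
});

var seen = runOnContext(function () {
    return this.message;
});

console.log(seen); // "Hello from the first call"
```

Because both functions are invoked with the same object as this, the second call sees what the first one stored.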

For further convenience all PhantomJS modules are already available in the global scope.

phantom.run(function () {
    console.log(webpage);           // [object Object]
    console.log(system);            // [object Object]
    console.log(fs);                // [object Object]
    console.log(webserver);         // [object Object]
    console.log(child_process);     // [object Object]
});

Working in a page context

Most of the time it's more useful to work in a specific webpage context. This is done by creating a Page via phantom.createPage(), which internally calls require("webpage").create(). The returned page wrapper will then execute all functions bound to a PhantomJS webpage instance.

var page = phantom.createPage();

page.run(function (resolve, reject) {
    // `this` is now a webpage instance
    this.open("http://example.com", function (status) {
        if (status !== "success") {
            return reject(new Error("Cannot load " + this.url));
        }
        resolve();
    });
});

And for the busy ones: You can just call phantom.openPage(url) which is basically the same as above:

phantom.openPage("http://example.com").then(function (page) {
    console.log("Example loaded");
});

Cleaning up

If you don't need a particular page anymore, just call:

page.dispose().then(function () {
    console.log("page disposed");
});

This will clean up all page references inside PhantomJS.

If you don't need the whole process anymore call

phantom.dispose().then(function () {
    console.log("process terminated");
});

which will terminate the process cleanly by calling phantom.exit(0) internally. You don't need to dispose all pages manually when you call phantom.dispose().

However, calling

phridge.disposeAll().then(function () {
    console.log("All processes created by phridge.spawn() have been terminated");
});

will terminate all processes.

I strongly recommend calling phridge.disposeAll() when the node process exits, as this is the only way to ensure that all child processes terminate as well. Since disposeAll() is async, it is not safe to call it on process.on("exit"). It is better to call it on SIGINT, SIGTERM and within your regular exit flow.
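One way to wire this up is sketched below. installCleanup is a hypothetical helper, not part of phridge. It takes any function that returns a promise (for example function () { return phridge.disposeAll(); }) and an optional process-like object, which exists only so the helper can be tried out with a stand-in.

```javascript
// Sketch: run an async cleanup on SIGINT/SIGTERM, then exit.
// `disposeAll` is any function returning a promise, e.g.
//     function () { return phridge.disposeAll(); }
// `proc` defaults to the real process object.
function installCleanup(disposeAll, proc) {
    proc = proc || process;

    ["SIGINT", "SIGTERM"].forEach(function (signal) {
        proc.once(signal, function () {
            disposeAll().then(
                function () { proc.exit(0); }, // cleanup succeeded
                function () { proc.exit(1); }  // cleanup failed, exit anyway
            );
        });
    });
}
```

Installing a signal handler replaces node's default behaviour of exiting immediately, which is why the sketch calls exit() itself once the promise settles. The same cleanup function should also be called at the end of the regular exit flow.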


API

phridge

.spawn(config?): Promise → Phantom

Spawns a new PhantomJS process with the given config. Read the PhantomJS documentation for all available config options. Use camelCase style for option names. The promise will be fulfilled with an instance of Phantom.

.disposeAll(): Promise

Terminates all PhantomJS processes that have been spawned. The promise will be fulfilled when all child processes emitted an exit-event.

.config.stdout: Stream = process.stdout

Destination stream where PhantomJS' clean stdout will be piped to. Set it to null if you don't want it. Changing the value does not affect processes that have already been spawned.

.config.stderr: Stream = process.stderr

Destination stream where PhantomJS' stderr will be piped to. Set it to null if you don't want it. Changing the value does not affect processes that have already been spawned.


Phantom.prototype

.childProcess: ChildProcess

A reference to the ChildProcess-instance.

.childProcess.cleanStdout: ReadableStream

phridge extends the ChildProcess-instance by a new stream called cleanStdout. This stream is piped to process.stdout by default. It provides all data not dedicated to phridge. Streaming data is considered to be dedicated to phridge when the new line is preceded by the classifier string "message to node: ".
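The filtering idea can be sketched as follows. Only the classifier string is taken from the description above; the splitOutput function and its return shape are invented for this example and are not phridge's actual implementation.

```javascript
// Sketch: separate lines meant for the bridge from ordinary output.
var CLASSIFIER = "message to node: ";

function splitOutput(text) {
    var result = { messages: [], clean: [] };

    text.split("\n").forEach(function (line) {
        if (line.indexOf(CLASSIFIER) === 0) {
            // Line starts with the classifier: keep only the payload.
            result.messages.push(line.slice(CLASSIFIER.length));
        } else if (line !== "") {
            // Anything else is regular output, e.g. from console.log().
            result.clean.push(line);
        }
    });

    return result;
}

var out = splitOutput('Hi from PhantomJS\nmessage to node: {"id":1}\n');

console.log(out.clean);    // [ 'Hi from PhantomJS' ]
console.log(out.messages); // [ '{"id":1}' ]
```

In this picture, cleanStdout corresponds to the "clean" part: everything PhantomJS printed that was not addressed to the bridge.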

.run(args..., fn): Promise → *

Stringifies fn, sends it to PhantomJS and executes it there again. args... are stringified using JSON.stringify() and passed to fn again. fn may simply return a result or throw an error or call resolve() or reject() respectively if it is asynchronous. phridge compares fn.length with the given number of arguments to determine whether fn is sync or async. The returned promise will be resolved with the result or rejected with the error.
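The fn.length comparison can be pictured with a small sketch. This is an interpretation of the rule described above, assuming that "more declared parameters than passed arguments" means async; the expectsCallbacks name is made up and the code is not taken from phridge's source.

```javascript
// Sketch: a function that declares more parameters than the caller
// passed is assumed to expect resolve (and optionally reject) as extras.
function expectsCallbacks(fn, args) {
    return fn.length > args.length;
}

console.log(expectsCallbacks(function () {}, []));                          // false
console.log(expectsCallbacks(function (resolve) {}, []));                   // true
console.log(expectsCallbacks(function (a, b) {}, [1, 2]));                  // false
console.log(expectsCallbacks(function (a, b, resolve, reject) {}, [1, 2])); // true
```

One consequence of a length-based rule is that every regular argument must be declared as a parameter, and resolve must be declared explicitly if fn is meant to be async. Rest parameters and parameters with default values do not count toward fn.length.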

.createPage(): Page

Creates a wrapper to execute code in the context of a specific PhantomJS webpage.

.openPage(url): Promise → Page

Calls phantom.createPage(), then page.open(url, cb) inside PhantomJS and resolves when cb is called. If the returned status is not "success" the promise will be rejected.

.dispose(): Promise

Calls phantom.exit(0) inside PhantomJS and resolves when the child process emits an exit-event.

Events

unexpectedExit

Will be emitted when PhantomJS exited without a call to phantom.dispose() or one of its std streams emitted an error event. This event may be fired on some OS when the process group receives a SIGINT or SIGTERM (see #35).

When an unexpectedExit event is encountered, the phantom instance will be unusable and therefore automatically disposed. Usually you don't need to listen for this event.


Page.prototype

.phantom: Phantom

A reference to the parent Phantom instance.

.run(args..., fn): Promise → *

Calls fn on the context of a PhantomJS page object. See phantom.run() for further information.

.dispose(): Promise

Cleans up this page instance by calling page.close().


Contributing

From opening a bug report to creating a pull request: every contribution is appreciated and welcome. If you're planning to implement a new feature or change the API, please create an issue first. This way we can ensure that your precious work is not in vain.

All pull requests should have 100% test coverage (with notable exceptions) and need to pass all tests.

  • Call npm test to run the unit tests
  • Call npm run coverage to check the test coverage (using istanbul)

License

Unlicense

phridge's People

Contributors

aju, ashtonsix, domasx2, frosas, jhnns, laggingreflex, meaku, mikxail, prototypealex, systemparadox, tecfu, yesyo

phridge's Issues

CasperJS support?

In the spirit of the SlimerJS support request, how complicated would it be to support CasperJS?

How to disable loading images?

I tried
phridge.spawn({'loadImages': false, 'loadimages': false, 'LoadImages': false, 'load-images': false, 'autoLoadImages': false, '--load-images': false})
but images are still loaded.

[CRITICAL] QNetworkReplyImpl: backend error: caching was enabled after some bytes had been written

I am trying to spawn multiple child phantom processes using phridge which perform certain tasks like submitting a form.

Everything works fine when the number of processes is less than 8, but as soon as I try to spawn 8 or more phridge processes I get the error below.

2014-11-26T14:41:31 [CRITICAL] QNetworkReplyImpl: backend error: caching was enabled after some bytes had been written
2014-11-26T14:41:31 [CRITICAL] QNetworkReplyImpl: backend error: caching was enabled after some bytes had been written

Is it something related to PhantomJS itself? I am using the latest build of PhantomJS 2.

Phridge not receiving messages via phantom's stdout on Ubuntu

I've been spending most of my day tracking down this issue. I'm still not sure of the root cause but I was able to figure out a workaround via a lucky guess.

The latest version of phridge works great when running my node application on a mac. However, when trying to run it on an Ubuntu linux machine, Phridge never receives the "message from node: hi" prompt and so the spawn() promise is never resolved (or rejected for that matter). Phridge is successfully launching phantom; it's just not getting the stdout from phantom.

After spending hours messing with Phridge code trying to get some response from phantom, I decided to try force flushing stdout from within phantom. I had no idea this would work or even be a valid operation on phantom's stream object, but adding the following statement after every writeLine() in lib/phantom/start.js resolves the issue:

system.stdout.flush();

Again I have no idea why this statement is required on the linux machine and not the mac. Any ideas?

Here's some version info:

  • node: 0.10.28
  • phantomjs: 1.9.0 (special development build)
  • Ubuntu: 11.10
  • Phridge: 1.0.6

I've created a pull request with this fix for your review. This fix causes no issues on mac as far as I can tell.

lift.js exits with "ReferenceError: Promise is not defined" after update.

Hi Guys, I've seen that there has been changes lately so I decided to update. Unfortunately node is now exiting with following error message:

> node spawn.js

/Users/sli/node_modules/phridge/lib/lift.js:14
        return new Promise(function (resolve, reject) {
                   ^
ReferenceError: Promise is not defined
    at /Users/sli/node_modules/phridge/lib/lift.js:14:20
    at Object.spawn (/Users/sli/node_modules/phridge/lib/spawn.js:45:12)
    at Object.<anonymous> (/Users/sli/Documents/phridge/spawn.js:3:9)
    at Module._compile (module.js:456:26)
    at Object.Module._extensions..js (module.js:474:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:312:12)
    at Function.Module.runMain (module.js:497:10)
    at startup (node.js:119:16)
    at node.js:929:3

After it didn't work, I figured a fresh install may fix the issue. Unfortunately it didn't. I've un- and reinstalled node through brew, un- and reinstalled all npm packages. So I should have a fresh install, but unfortunately the Error is still present.

content of "spawn.js":

var phridge = require('phridge'); // https://www.npmjs.com/package/phridge
phridge.spawn().then(function(phantom) {
    phantom.run(function(resolve) {
        this.page = webpage.create();
    });
});

Cheers!

Memory exhausted?

I got the following error:

Memory exhausted.
c:\Exposebox\Source\Spider\node_modules\phridge\lib\Phantom.js:380
        error.message = errorMessage + ": " + error.message;
                      ^

TypeError: Cannot assign to read only property 'message' of 1
    at Phantom._onUnexpectedError (c:\Exposebox\Source\Spider\node_modules\phridge\lib\Phantom.js:380:23)
    at emitTwo (events.js:92:20)
    at ChildProcess.emit (events.js:172:7)
    at Process.ChildProcess._handle.onexit (internal/child_process.js:200:12)

From what I can understand, this means that the phantom process exited with status 1?

How can I cancel (settle) the promise of a running page?

I want to cancel or abort a running page, but the when promise cannot be cancelled.

Is there some way to abort the running promise without an exception like "Cannot call reject() after the promise has already been resolved or rejected"?

How about changing the promise lib to bluebird, which is more popular and has a better API and performance?

thx!

Footer contents (reference to phantom.callback) not working

Hi, inside my run function, I define the page.paperSize property like this:

        page.paperSize = {
            format: 'A4',
            orientation: 'potrait',
            border: {
              top:"1in",
              bottom:".25in",
              left:"1.2in",
              right:"1.2in"
            },
            footer: {
               height: '.5in',
               contents: phantom.callback(function(pageNum, numPages) {
                  return "Page " + pageNum + " of " + numPages;
              })
            }
        }

But for some reason, the footer never gets rendered. Is there a way I can debug this?

examples

Hi, how do you use it? Is there an example of what needs to be required: phantom/webpage etc.?

How to use phantomjs events?

Hi,

we are trying to use phridge as our phantomjs bridge, but in our usecase we have to inject Javascript to mock the AudioContext object (because in phantomjs it's unsupported). This injection has to happen after the WebPage instance was created and before the first <script> is parsed. The software we are testing creates an AudioContext instance immediately in the global scope.

Long story short: We would like to define a callback for phantomjs' onInitialized event and use page.injectJs (or modify the DOM) in this callback. A stackoverflow thread indicates that this should theoretically work.

How can we use events/handlers with phridge?

Accumulating slow down on sequential Page.run()

I'm trying to create an application that renders PNG files of many webpages one at a time. I'm doing it with one instance of phantom using phridge. For each webpage I want to render I create a new page via phridge's phantom.createPage() and then call run() on it. The run() callback (which runs in phantomjs land) simply does a page.open() and renders the PNG before returning.

I'm noticing that with every subsequent run() call I do, it takes more and more time for the phantomjs function to start executing.

The severity of the accumulating delay is related to how much data we're pushing through via the run() call, which in my case is about 500kb. This causes about an additional 2 seconds for each call (i.e. the first one takes 0 seconds, then 2 seconds, then 4 seconds, and so on) However, even without all of that data there is still an accumulation albeit a lot smaller (i.e. 0 seconds, 0.1 seconds, 0.2 seconds, 0.3 seconds, etc)

I'm thinking it has to do with the plumbing in writing to stdin, as in it's getting clogged up and hence slowed down. When I add about a 4 second delay between subsequent run calls, the accumulation goes away. So my thinking is that some sort of garbage collection is allowed to run in this time.

Any ideas?

Trying to page.open, but no joy

Using phridge to run Phantom to login into a site and retrieve an attachment. This is done inside the SailsJS framework.

Running

sails.phridge
  .spawn()          
  .then(function(phantom) { 

    return phantom.run(function() {  
      console.log('run');
      page = webpage.create();
      page.open('https://commercialservices.freshservice.com/support/login', function(status) {
        console.log('hi');
        page.render('test.png');
      });
      console.log('run');
    });

  })
  .catch(function(e) {
    sails.log.info('catch');
    sails.log.error(e);
  })
  .finally(function(f) {
    sails.log.info('finally');
    sails.log.info(f);
  });

does not result in a console.log() or a page.render() inside the page.open(). However, I do see the two console.log('run') lines output to the screen. Running the same block of code inside page.open() directly in PhantomJS does produce a PNG. Am I doing something wrong?

PS: FWIW, the sails.phridge... line is equivalent to a require('phridge')

Evaluate returning "null"

As in the example:

page.open("http://www.google.com", function () {
        var text = page.evaluate(function (selector) {
            return document.querySelector(selector).innerText;
        }, selector);

        // resolve the promise and pass 'text' back to node 
        resolve(text);
    });

Returns: The element contains the following text: null
Why?

Wait for callback

Is it possible to call a javascript function in the phantomjs page that accepts a callback function,
in contrast to using a helper function and polling?

Phridge 0.1.4 fails to install

I have been trying to install it via npm and keep getting this error:

Error: No compatible version found: temp@'^0.8.0'
Valid install targets:
["0.2.0","0.3.0","0.4.0","0.5.0","0.5.1","0.6.0","0.7.0","0.8.0"]

It seems phridge is requesting a temp version compatible with ^0.8.0 and npm can't find one, although I believe 0.8.0 is the latest release.

Can I pass page to run method?

var page;

phantomNode.run({}, function(result, resolve) {
    var page = webpage.create();
    result['page'] = page;

    page.open("https://google.com", function() {
        result['title'] = page.evaluate(function() {
            return document.querySelector('title').innerText;
        });

        resolve(result)
    });
}).then(function(result) {
    page = result['page'];
   ...
});

phantomNode.run({}, page, function(result, page, resolve) {
    page.open("https://google.com", function() {
        result['title'] = page.evaluate(function() {
            return document.querySelector('title').innerText;
        });

        resolve(result)
    });
}).then(function(result) {
    ...
});

How can I pass page for reuse? What am I doing wrong?
...
I use a global variable now, but I think it is wrong.

phantom.dispose() causes 'cannot call function of deleted QObject' when page was already closed

In my application I'm calling page.close() within phantom manually when I'm done with a page. When I'm done with the phantom instance I call phridge's phantom.dispose() but I get the following error when I do this after the call to page.close().

cannot call function of deleted QObject
  at reject (node_modules/phridge/lib/phantom/start.js:71:69)
  at run (node_modules/phridge/lib/phantom/start.js:109:15)
  at run (node_modules/phridge/lib/phantom/start.js:160:12)
  at loop (node_modules/phridge/lib/phantom/start.js:30:12)
    -----------------------------------------------------------------------
    at Phantom._send (node_modules/phridge/lib/Phantom.js:165:20)
    at Phantom.run (node_modules/phridge/lib/Phantom.js:95:17)
    at dispose (node_modules/phridge/lib/Phantom.js:147:14)
    at init (node_modules/when/lib/makePromise.js:39:5)
    at new Promise (node_modules/when/lib/makePromise.js:27:53)
    at Function.promise (node_modules/when/when.js:97:10)
    at Phantom.dispose (node_modules/phridge/lib/Phantom.js:139:17)
    at Generator._killPhantom (modules/generator/index.js:307:23)
    at Generator.stop (modules/generator/index.js:303:15)
    at Object.poolModule.Pool.destroy (endpoints/generate.js:28:15)

If I remove the call to page.close() this does not happen. I'm calling it because the phantom instance is reused many times and I want to clear out any resources being used before creating a new page using phridge's phantom.createPage().

Seems like phridge has no way to tell if a page it has created has been closed from within phantom. So when phantom.dispose() is called, it's trying to close pages which are already closed.

Cannot read property 'stdin' of null

Hello! Thanks for this wonderful wrapper that makes interaction with phantom much simpler and straightforward.

I am using phridge to generate png sprite from svg sources. Yesterday I received an issue from someone using my gulp plugin: w0rm/gulp-svgfallback#7 with an error message coming from phridge sources.

I am not able to reproduce this, but, maybe, you have an idea of what can possibly go wrong there.

phridge is used only in the generateSprite function: https://github.com/w0rm/gulp-svgfallback/blob/master/index.js#L104

Please give an example of loading and using jQuery

var phridge = require("phridge");
var path = require("path");
var jqueryPath = path.resolve(__dirname, "jquery-1.6.1.min.js");

phridge.spawn()
    .then(function (phantom) {
        return phantom.openPage("http://localhost:3000/");
    })
    .then(function (page) {
        page.injectJs(jqueryPath);
        return page.run(function () {
            return this.evaluate(function () {
                return $("button").innerText;
            });
        });
    })

gives an error: TypeError: page.injectJs is not a function

After disposing one phantom, the other phantom's stdout does not work

I need help, friends~

Steps, as in the code below:
1. Set phridge.config.stdout to my stream
2. Call phridge.spawn() to create phantomA and phantomB
3. Call phantomA.dispose()
4. Open a page with phantomB. From the logs I am sure that phantom has accessed the local page, but phantom's stdout does not go through my stream

var phridge = require('phridge');
var through = require('through'); //https://github.com/dominictarr/through

//set stdout to the two-way stream(by through), so i handle the messages from phantom
phridge.config.stdout = through(function(buf){
    console.log(buf.toString());
    this.queue(buf);
}, function(){
    this.queue(null)
});


var phantomA, phantomB;

phridge.spawn()
    .then(function(phantom){
        phantomA = phantom;

        return phridge.spawn();
    })
    .then(function(phantom){
        phantomB = phantom;

        return phantomA.dispose();
    })
    .then(function(){
        var page = phantomB.createPage();
        return page.run(function(resolve){
            var page = this;
            page.open('http://localhost/touch', function(){
                console.log(page.url);                 //not to my std stream
                resolve();
            })
        })
    })
    .then(function(){
        console.log('done');
    });

Any advice?
thank you!

Holding a reference to the phantom instance (JS-Object) outside of the ".spawn().then( [...] )" scope?

Hey guys, so far I'm quite satisfied with Phridge. Thanks for all the hard work.

This is not a real issue, more like a question or feature request. I'm failing at the attempt to hold references to my phantom instances outside of the given scope. Ultimately I'm planning on juggling multiple phantom instances, which are hopefully working simultaneously. Although it might be better (in regards to performance) to spawn multiple instances of node and access them from "outside", I hoped for a solution within my node scope.

What I tried:

var phantomInstances = {};

function spawnPhantom (id) {
    return new Promise(function (resolve,reject) {
        phridge.spawn().then(function(phantom) {
            phantomInstances[id] = phantom;
            // required for communication from within the page scope to the node scope
            phantom.run(function(resolve) {
                var page = webpage.create();
                page.onCallback = function(data) {
                    console.log(data);
                };
            });
            resolve(phantom);
        });
    });
}

Analogously I wrote a function which injects JavaScript into a phantom instance by id and filename (the JS file to inject). For testing purposes this JS file currently only calls the previously defined .onCallback function, thus logging something to the console. Unfortunately the Phantom instance is not accessible. Neither the object I put into phantomInstances[id] nor the phantom instance returned by the resolve of my promise (spawnPhantom) seems to be accessible.

spawnPhantom('ph00').then(function(phantom) {
    // not working:
    phantom.run(function(resolve) {
        // does not get executed, tested to call console.log from here...
        page.injectJs('myScript.js');
    });
    // not working either:
    var ph00 = phantomInstances('ph00')
    ph00.run(
        [...];
    );
});

My first thought was that the phantom process might be disposed by the time I try to access it. This seems not to be the case, as the phantom process is running until I close (SIGINT) the node process.

When I try the same tasks iteratively and within the first spawn()-scope, everything works fine. So please tell me whether what I'm trying is possible and what I'm doing wrong.

Best Regards,
Senad

disclaimer: code simplified for readability.

remoteDebuggerPort not respected

I'm attempting to pass the following flags to phantomjs via phridge:

phridge.spawn({
  webSecurityEnabled : false,
  remoteDebuggerPort : 8888,
  ignoreSslErrors : true,
  remoteDebuggerAutorun : true
})

remoteDebuggerPort does not behave the same as --remote-debugger-port=8888, i.e. :

phantomjs --remote-debugger-port=8888 somfile.js

Allows one to open up a debugging session at 127.0.0.1:8888 in chrome/chromium.
These debugging sessions don't fire in phridge, however.

Workaround?

PhantomJS process hanging in some cases

We are using phridge as a dependency of https://github.com/Otouto/gulp-raster

At some point, we discovered that ps aux was listing lots of hanging processes on our Jenkins instance although no gulp task was running anymore:

jenkins  20650  0.0  0.6 1293792 25160 ?       Sl   Sep12   0:00 /var/lib/jenkins/workspace/node_modules/gulp-raster/node_modules/phridge/node_modules/phantomjs/lib/phantom/bin/phantomjs

Do you have any idea why this would happen? Is there a possible workaround (e.g. specifying a timeout for PhantomJS processes)?

read() function not available on cleanStdout

In older versions of phridge that use HTTP to communicate to phantom, I was using the phantom.childProcess.stdout stream to capture and log javascript errors that occurred within a phantom page. I was simply calling read() on it after loading a page and writing the string it returned to a file. This was working remarkably well.

However, in the latest versions of Phridge that use stdin/out to communicate with phantom, I no longer get intelligible responses from that stream. I noticed the phantom.childProcess.cleanStdout and thought it would be the answer to my prayers. However, I cannot even call read() on this stream because it does not define a function by that name. I've noticed, oddly enough, that this stream is marked as 'readable'. How should I read from this stream? Or is this even the right way to get at these errors?

Documentation is unclear about Phantom, Page objects and their difference from "native" PhantomJS phantom and page objects

The phantom.run("h1"... code example in the README uses both phantom.run and phantom.addCookie. The first is a custom phridge method and the second is a native PhantomJS method, so the two phantom objects must be different. Because both objects share the same name, it is not immediately obvious that they are in fact very different, or at what point the switch from one to the other occurs. It is therefore easy to call phantom.openPage inside the phantom.run callback and wonder why it doesn't work.

The README notes that "phantom-object provided by phridge is completely different to the phantom-object inside PhantomJS", but it is unclear how to tell these two objects apart in the API reference. Which phantom is returned by phridge.spawn()? Which page is returned by phantom.createPage()?

Orphaned phantomjs processes

Hi. This might just be a documentation issue, but it is not clear to me how to correctly clean up the PhantomJS processes when node exits, either due to a SIGTERM or an uncaught exception.

At the moment we are getting a collection of orphaned PhantomJS processes which grows whenever we restart node.

The documentation specifically states that it is not safe to call phridge.disposeAll() from process.on('exit') because it is async. But when an exception occurs, or when the node process is being terminated (e.g. by a supervisor), the only way to reliably clean up is to use process.on('exit') and do it synchronously.

How should I best deal with this? Maybe we need a phridge.killAll()? Although, shouldn't it be phridge's responsibility to ensure that any child processes it created are definitely cleaned up when node exits?

Thanks.
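
One possible approach, sketched here with Node's standard API only: keep a list of child processes and kill them from a synchronous 'exit' handler. The helper names trackChild and killTracked are hypothetical, and the sketch assumes that phantom.childProcess (mentioned in another issue on this page) is a regular Node ChildProcess object.

```javascript
var tracked = [];

// Remember a child process so it can be killed when node exits.
function trackChild(child) {
  tracked.push(child);

  // Forget the child once it has exited on its own.
  child.once("exit", function () {
    var index = tracked.indexOf(child);
    if (index !== -1) {
      tracked.splice(index, 1);
    }
  });

  return child;
}

// child.kill() only sends a signal and returns immediately,
// so it can be used from a synchronous "exit" handler.
function killTracked() {
  tracked.splice(0).forEach(function (child) {
    child.kill("SIGKILL");
  });
}

process.on("exit", killTracked);

// Signals do not emit "exit" by default, so turn them into a normal exit.
["SIGINT", "SIGTERM"].forEach(function (signal) {
  process.on(signal, function () {
    process.exit(1);
  });
});
```

Under that assumption, calling trackChild(phantom.childProcess) inside the phridge.spawn() callback would register each instance. SIGKILL gives PhantomJS no chance to shut down gracefully, so this is meant as a last resort next to phridge.disposeAll(), not a replacement for it.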

Unsure about how to call resolve()

I have a script to capture a webpage using phridge:

var phridge = require('phridge');

var capture = function(url, outputPath) {
  phridge.spawn()
  .then(function(phantom) {
    return phantom.openPage(url);
  })
  .then(function(page) {
    page.run(url, outputPath, function(pageUrl, outputUrl) {
      this.open(pageUrl, function(status) {
        this.render(outputUrl);
        resolve();
      });
    });
  });
};

capture('http://www.joshtompkins.com', 'test_img.png');

Unfortunately, the promise returned by page.run seems to finish before the render method finishes executing, and I get the following error:

Error: Cannot resolve value: The response has already been closed. Have you called resolve/reject twice?

As you can see, I haven't called resolve twice, at least as far as I can tell. Can you give me a clue about what I'm doing wrong?
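
A possible explanation, offered as an assumption and not as confirmed phridge behavior: the README example declares resolve as a trailing parameter of the function passed to run(), whereas the callback above declares only its two arguments. phridge then most likely treats the function as synchronous and closes the response as soon as it returns, so the later resolve() call arrives too late. The sketch below declares resolve and reject explicitly; renderInPhantom and the injected phridge parameter are hypothetical names used for illustration.

```javascript
// Runs inside PhantomJS; `this` is the PhantomJS webpage instance.
// Declaring `resolve` and `reject` after the two passed arguments is what
// (by assumption) marks the function as asynchronous for phridge.
var renderInPhantom = function (pageUrl, outputPath, resolve, reject) {
  var page = this;

  page.open(pageUrl, function (status) {
    if (status !== "success") {
      return reject(new Error("Cannot load " + pageUrl));
    }
    page.render(outputPath);
    resolve(outputPath);
  });
};

// `phridge` is passed in so the sketch does not hard-code the require call.
function capture(phridge, url, outputPath) {
  return phridge.spawn().then(function (phantom) {
    var page = phantom.createPage();

    // Return the promise so callers can wait for the render to finish.
    return page.run(url, outputPath, renderInPhantom);
  });
}
```

With this shape, capture(require("phridge"), "http://www.joshtompkins.com", "test_img.png") returns a promise that should settle only after render() has been called. The function passed to run() is stringified, so it must not reference variables from the surrounding node scope.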

Use of babel polyfill

I use the code:
.then(function (phantom) {
  var page = phantom.createPage();
  var jqueryPath = path.resolve(__dirname, "jquery-3.2.1.min.js");
  var babelPolyfillPath = path.resolve(__dirname, "babel-polyfill.js");

  return page.run(jqueryPath, babelPolyfillPath, function (jqueryPath, babelPolyfillPath, resolve, reject) {
    var page = this;

    page.onInitialized = function () {
      page.injectJs(babelPolyfillPath);
      page.injectJs(jqueryPath);
      console.log('init');
    };

    page.onConsoleMessage = function (msg) {
      console.log('(PAGE CONSOLE:) ' + msg);
    };

    page.open("http://localhost:3000/", function (status) {
      var p = page.evaluate(function () {
        $("button").click().click().click();
        return $("div").last().text();
      });
      resolve(p);
    });
  });
})

When the loaded page (http://localhost:3000/) uses let or const (for example <script> let i=0; </script>), an error occurs:
"ReferenceError: Can't find variable: i"

phridge.dispose causes EPIPE error after SIGINT

test program:

var phridge = require('phridge');
var ph = phridge.spawn();
ph.then(function (ph) {
    console.log('phantomJS started');
});
process.on('SIGINT', function () {
    phridge.disposeAll().then(function () {
        console.log('phantomJS closed');
        process.exit();
    });
});

output:

phantomJS started
^C
events.js:72
        throw er; // Unhandled 'error' event
              ^
Error: write EPIPE
    at errnoException (net.js:901:11)
    at Object.afterWrite (net.js:718:19)

Change SIGINT to SIGTERM, use a timeout or anything else to trigger the shutdown and it works fine. Maybe SIGINT closes stdin/stdout?

Interestingly it also works if I do process.emit('SIGINT').

I am using Linux (bash).

Thanks.
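
A likely explanation: pressing Ctrl+C in a terminal delivers SIGINT to every process in the foreground process group, so PhantomJS receives the signal as well and may exit before disposeAll() writes to its stdin. That would also fit the observation that process.emit('SIGINT') works, since no real signal reaches the child in that case. The sketch below shows one way to tolerate the resulting EPIPE; ignoreEpipe is a hypothetical helper, and it is an assumption that phantom.childProcess.stdin is the stream phridge writes to.

```javascript
// Swallow EPIPE on a writable stream and rethrow every other error.
function ignoreEpipe(stream) {
  stream.on("error", function (err) {
    if (err && err.code === "EPIPE") {
      return; // the other end of the pipe is already gone
    }
    throw err;
  });

  return stream;
}
```

Under that assumption, ignoreEpipe(phantom.childProcess.stdin) could be called right after phridge.spawn() resolves. Whether disposeAll() still resolves once PhantomJS is already gone is not something this sketch can guarantee.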

Wrong PhantomJS binary?

I'm having an issue with PhantomJS on Ubuntu 14.04.2 LTS (GNU/Linux 3.13.0-57-generic x86_64) throwing this error (used in UnCSS package):

<snip>/phridge/node_modules/phantomjs/lib/phantom/bin/phantomjs: Syntax error: Unterminated quoted string

When I try to invoke it manually from the same path, I get:

./phantomjs: cannot execute binary file: Exec format error

It seems that the included phantomjs package installs the wrong binary for this architecture.

Example of how to submit a form and wait for the response?

I'm able to fill in the form:

jQuery("username").val("foo") ...

But if I call click() or trigger(click), or even submit the form, how do I know that a new page has loaded? I see the OnPageLoaded event, but it never seems to fire.

Could you create an example that fills in a form, clicks submit, and then proceeds on the next page?
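
A minimal sketch of one way this could work, assuming the form page is already loaded in the webpage instance: PhantomJS calls the page's onLoadFinished callback when a load completes, so the function can register that callback before submitting and resolve from inside it. submitAndWait and its parameters are hypothetical names, and the function is meant to be passed to page.run().

```javascript
// Runs inside PhantomJS via page.run(); `this` is the webpage instance,
// which is assumed to have the form page loaded already.
var submitAndWait = function (formSelector, fields, resolve, reject) {
  var page = this;

  // onLoadFinished is PhantomJS' callback for a completed page load.
  page.onLoadFinished = function (status) {
    if (status !== "success") {
      return reject(new Error("Loading the next page failed"));
    }
    resolve(page.url);
  };

  // Fill in the fields and submit the form inside the page context.
  var submitted = page.evaluate(function (formSelector, fields) {
    var form = document.querySelector(formSelector);
    if (!form) {
      return false;
    }
    Object.keys(fields).forEach(function (name) {
      form.elements[name].value = fields[name];
    });
    form.submit();
    return true;
  }, formSelector, fields);

  if (!submitted) {
    reject(new Error("Form not found or not submitted: " + formSelector));
  }
};
```

A call such as page.run("form", { username: "foo" }, submitAndWait) would then return a promise for the URL of the next page. If the target page triggers more than one load (redirects, for instance), onLoadFinished may fire several times, so a real script would need to guard against resolving twice.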

page.onConsoleMessage event handler is null after page.includeJs

I got really confused about this; I thought my page.evaluate() was not working...

        var page = phantom.createPage();
        page.onConsoleMessage = function (msg) {
            console.log(msg);
        };

        page.run(target, function (product, resolve, reject) {
            console.log('phantom> loading page:', product);
            // `this` is now a webpage instance
            this.open(product, function (status) {
                if (status !== "success") {
                    return reject(new Error("Cannot load " + this.reason_url +
                        " due to " + this.reason));
                }

                var page = this
                this.includeJs('https://ajax.googleapis.com/ajax/libs/jquery/1.12.0/jquery.min.js', function () {
                    // only after this event handler was added I saw the EVALUATING print
                    page.onConsoleMessage = function (msg) {
                        console.log(msg);
                    };
                    var enriched = page.evaluate(function () {
                        console.log('EVALUATING... @title' + document.title);
                        return document.title
                    });

                    console.log('page> attached exId and eventHandler to all' +
                        ' elements');
                    resolve(enriched);
                });
            });
        });

Only after adding the second event registration did I see anything inside the evaluate.

support slimerjs?

I tried using this with slimerjs (replacing the phantomjs module and supplying .path to slimerjs) and it worked, which isn't surprising, as slimer aims to have an API almost identical to phantom's, and many other phantom bridges support it as well.

Thoughts?
