corion / www-mechanize-chrome

automate the Chrome browser

Home Page: https://metacpan.org/release/WWW-Mechanize-Chrome

License: Artistic License 2.0


www-mechanize-chrome's Introduction

CONTRIBUTING

See lib/WWW/Mechanize/Chrome/Contributing.pod

Google Keep Extraction

NAME

WWW::Mechanize::Chrome - automate the Chrome browser

SYNOPSIS

use Log::Log4perl qw(:easy);
use WWW::Mechanize::Chrome;

Log::Log4perl->easy_init($ERROR);  # Set priority of root logger to ERROR
my $mech = WWW::Mechanize::Chrome->new();
$mech->get('https://google.com');

$mech->eval_in_page('alert("Hello Chrome")');
my $png = $mech->content_as_png();

A collection of other Examples is available to help you get started.

DESCRIPTION

Like WWW::Mechanize, this module automates web browsing with a Perl object. Fetching and rendering of web pages is delegated to the Chrome (or Chromium) browser by starting an instance of the browser and controlling it with Chrome DevTools.

Advantages Over WWW::Mechanize

The Chrome browser provides advanced abilities useful for automating modern web applications that are not (yet) possible with WWW::Mechanize alone:

  • Page content can be created or modified with JavaScript. You can also execute custom JavaScript code on the page content.
  • Page content can be selected with CSS selectors.
  • Screenshots of the rendered page can be saved as an image or PDF file.

Disadvantages

Installation of a Chrome-compatible browser is required. There are some quirks, including sporadic but harmless error messages issued by the browser when run with DevTools.

A Brief Operational Overview

WWW::Mechanize::Chrome (WMC) leverages developer tools built into Chrome and Chrome-like browsers to control a browser instance programmatically. You can use WMC to automate tedious tasks, test web applications, and perform web scraping operations.

Typically, WMC launches a host instance of the browser and acts as its client. The host instance is visible to you on your desktop (unless the browser is running in "headless" mode, in which case it will not open a window). The client is the Perl program you write with the WMC module to issue commands that control the host instance. As you navigate and "click" on various nodes from the client, you can watch the host browser respond to these actions as if by magic.

This magic happens as a result of commands that are issued from your client to the host using Chrome's DevTools Protocol, which exchanges JSON data structures over a WebSocket or pipe connection. The host also responds to the client with JSON to describe the web pages it has loaded. WMC conveniently hides the complexity of the lower level communication between the client and the host browser and wraps it in a Perl object to provide the easy-to-use methods documented here.
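
To make the message exchange more concrete, here is a minimal sketch of what a single command and its reply might look like as JSON. The id/method/params envelope follows the Chrome DevTools Protocol; the URL and the frame id are made-up values for illustration, and the reply is hand-written rather than captured from a browser. WMC builds and sends such messages for you.

```perl
use strict;
use warnings;
use JSON::PP;

# canonical() sorts the keys so the output is stable
my $json = JSON::PP->new->canonical;

# A command from the client: navigate the tab to a URL.
# Each command carries an id so the reply can be matched to it.
my $command = $json->encode({
    id     => 1,
    method => 'Page.navigate',
    params => { url => 'https://example.com' },   # illustrative URL
});
print "$command\n";

# A reply as the browser might send it (hand-written, illustrative values)
my $reply = $json->decode('{"id":1,"result":{"frameId":"ABC123"}}');
print "Reply to command $reply->{id}, frame $reply->{result}{frameId}\n";
```

The id field is what lets the client pair an asynchronous reply with the command that caused it.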

OPTIONS

WWW::Mechanize::Chrome->new( %options )

my $mech = WWW::Mechanize::Chrome->new(
    headless => 0,
);

  • autodie

      autodie => 0   # make HTTP errors non-fatal
    

    By default, autodie is set to true. If an HTTP error is encountered, the program dies along with its associated browser instances. This frees you from having to write error checks after every request. Setting this value to false makes HTTP errors non-fatal, allowing the program to continue running if there is an error.

  • headless

    Don't display a browser window. Default is to display a browser window.

  • host

  • listen_host

    Set the host the browser listens on:

      host => '192.168.1.2'
      host => 'localhost'
    

    Defaults to 127.0.0.1. The browser will listen for commands on the specified host. The host address should be inaccessible from the internet.

  • port

      port => 9223   # set port the launched browser will use for remote operation
    

    Defaults to 9222. Commands to the browser will be issued through this port.

  • tab

    Specify the browser tab the Chrome browser will use:

      tab => 'current'
      tab => qr/PerlMonks/
    

    By default, a web page is opened in a new browser tab. Setting tab to current will use the current, active tab instead. Alternatively, to use an existing inactive tab, you can pass a regular expression to match against the existing tab's title. A false value implements the default behavior and a new tab will be created.

  • autoclose

      autoclose => 0   # keep tab open after program end
    

    By default, autoclose is set to true, closing the tab opened when running your code. If autoclose is set to a false value, the tab will remain open even after the program has finished.

  • launch_exe

    Set the name and/or path to the browser's executable program:

      launch_exe => 'name-of-chrome-executable'   # for non-standard executable names
      launch_exe => '/path/to/executable'         # for non-standard paths
      launch_exe => '/path/to/executable/chrome'  # full path
    

    By default, WWW::Mechanize::Chrome will search the appropriate paths for Chrome's executable file based on the operating system. Use this option to set the path to your executable if it is in a non-standard location or if the executable has a non-standard name.

    The default paths searched are those found in $ENV{PATH}. For OS X, the user and system Application directories are also searched. The default values for the executable file's name are chrome on Windows, Google Chrome on OS X, and google-chrome elsewhere.

    If you want to use Chromium, you must specify that explicitly with something like:

      launch_exe => 'chromium-browser', # if Chromium is named chromium-browser on your OS
    

    Results may vary for your operating system. Use the full path to the browser's executable if you are having issues. You can also set the name of the executable file with the $ENV{CHROME_BIN} environment variable.

  • cleanup_signal

      cleanup_signal => 'SIGKILL'
    

    The signal that is sent to Chrome to shut it down. On Linux-like OSes, this will be TERM; on OS X and Windows, it will be KILL.

  • start_url

      start_url => 'http://perlmonks.org'  # Immediately navigate to a given URL
    

    By default, the browser will open with a blank tab. Use the start_url option to open the browser to the specified URL. More typically, the ->get method is used to navigate to URLs.

  • launch_arg

    Pass additional switches and parameters to the browser's executable:

      launch_arg => [ "--some-new-parameter=foo", "--another-option" ]
    

    Examples of other useful parameters include:

      '--start-maximized',
      '--window-size=1280,1696',
      '--ignore-certificate-errors',
    
      '--disable-web-security',
      '--allow-running-insecure-content',
    
      '--load-extension',
      '--no-sandbox',
      '--password-store=basic',
    
  • separate_session

      separate_session => 1   # create a new, empty session
    

    This creates an empty, fresh Chrome session without any cookies. Setting this will disregard any data_directory setting.

  • incognito

      incognito => 1   # open the browser in incognito mode
    

    Defaults to false. Set to true to launch the browser in incognito mode.

    Most likely, you want to use separate_session instead.

  • data_directory

      data_directory => '/path/to/data/directory'  #  set the data directory
    

    By default, an empty data directory is used. Use this setting to change the base data directory for the browsing session.

      use File::Temp 'tempdir';
      # create a fresh Chrome every time
      my $mech = WWW::Mechanize::Chrome->new(
          data_directory => tempdir(CLEANUP => 1 ),
      );
    

    Using the "main" Chrome cookies:

      my $mech = WWW::Mechanize::Chrome->new(
          data_directory => '/home/corion/.config/chromium',
      );
    
  • profile

      profile => 'ProfileDirectory'  #  set the profile directory
    

    By default, your current user profile directory is used. Use this setting to change the profile directory for the browsing session.

    You will need to set the data_directory as well, so that Chrome finds the profile within the data directory. The profile directory/name itself needs to be a single directory name, not the full path. That single directory name will be relative to the data directory.

  • wait_file

      wait_file => "$tempdir/CrashpadMetrics-active.pma"
    

    When shutting down, wait until this file no longer exists or can be deleted. This can help make sure that the Chrome process has really shut down.

  • startup_timeout

      startup_timeout => 5  # set the startup timeout value
    

    Defaults to 20, the maximum number of seconds to wait for the browser to launch. Higher or lower values can be set based on the speed of the machine. The process attempts to connect to the browser once each second over the duration of this setting.

  • driver

      driver => $driver_object  # specify the driver object
    

    Use a Chrome::DevToolsProtocol::Target object that has been manually constructed.

  • report_js_errors

      report_js_errors => 1  # turn javascript error reporting on
    

    Defaults to false. If true, checks for Javascript errors after each request and warns about any that are found. This is useful for testing with use warnings qw(fatal).

  • mute_audio

      mute_audio => 0  # turn sounds on
    

    Defaults to true (sound off). A false value turns the sound on.

  • background_networking

      background_networking => 1  # turn background networking on
    

    Defaults to false (off). A true value enables background networking.

  • client_side_phishing_detection

      client_side_phishing_detection => 1  # turn client side phishing detection on
    

    Defaults to false (off). A true value enables client side phishing detection.

  • component_update

      component_update => 1  # turn component updates on
    

    Defaults to false (off). A true value enables component updates.

  • default_apps

      default_apps => 1  # turn default apps on
    

    Defaults to false (off). A true value enables default apps.

  • hang_monitor

      hang_monitor => 1  # turn the hang monitor on
    

    Defaults to false (off). A true value enables the hang monitor.

  • hide_scrollbars

      hide_scrollbars => 1  # hide the scrollbars
    

    Defaults to false (off). A true value will hide the scrollbars.

  • infobars

      infobars => 1  # turn infobars on
    

    Defaults to false (off). A true value will turn infobars on.

  • popup_blocking

      popup_blocking => 1  # block popups
    

    Defaults to false (off). A true value will block popups.

  • prompt_on_repost

      prompt_on_repost => 1  # allow prompts when reposting
    

    Defaults to false (off). A true value will allow prompts when reposting.

  • save_password_bubble

      save_password_bubble => 1  # allow the display of the save password bubble
    

    Defaults to false (off). A true value allows the save password bubble to be displayed.

  • sync

      sync => 1   # turn syncing on
    

    Defaults to false (off). A true value turns syncing on.

  • web_resources

      web_resources => 1   # turn web resources on
    

    Defaults to false (off). A true value turns web resources on.

  • json_log_file

    Filename to log all JSON communications to, one line per message/event/reply

  • json_log_fh

    Filehandle to log all JSON communications to, one line per message/event/reply

    Open this filehandle via

      open my $fh, '>:utf8', $logfilename
          or die "Couldn't create '$logfilename': $!";
    

The $ENV{WWW_MECHANIZE_CHROME_TRANSPORT} variable can be set to a different transport class to override the default transport class. This is primarily used for testing but can also help rule out bugs introduced by the underlying websocket implementation(s).

The $ENV{WWW_MECHANIZE_CHROME_CONNECTION_STYLE} variable can be set to either websocket or pipe to specify the kind of transport that you want to use.

The pipe transport is only available on unixish OSes and only with Chrome v72 onwards.
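
As an illustration of how the profile and data_directory options described above fit together, the following sketch builds the constructor options in a small helper. The helper name session_options and the profile name 'Profile 1' are assumptions made for this example; they are not part of the module.

```perl
use strict;
use warnings;
use File::Temp 'tempdir';

# Hypothetical helper: returns options for WWW::Mechanize::Chrome->new().
# The profile must be a single directory name relative to the data directory.
sub session_options {
    my (%args) = @_;
    my $profile = $args{profile};
    die "profile must be a single directory name, not a path: '$profile'\n"
        if defined $profile and $profile =~ m{[/\\]};

    my %options = (
        # fall back to a fresh, temporary data directory
        data_directory => $args{data_directory} // tempdir( CLEANUP => 1 ),
    );
    $options{profile} = $profile if defined $profile;
    return %options;
}

my %options = session_options( profile => 'Profile 1' );
print "$_ => $options{$_}\n" for sort keys %options;

# With the module installed, the options would be passed on like this:
# my $mech = WWW::Mechanize::Chrome->new( %options );
```

Rejecting a profile value that contains a path separator early gives a clearer error than letting the browser start with a profile it cannot find.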

METHODS

WWW::Mechanize::Chrome->find_executable

my $chrome = WWW::Mechanize::Chrome->find_executable();

my $chrome = WWW::Mechanize::Chrome->find_executable(
    'chromium.exe',
    '.\\my-chrome-66\\',
);

my( $chrome, $diagnosis ) = WWW::Mechanize::Chrome->find_executable(
    ['chromium-browser','google-chrome'],
    './my-chrome-66/',
);
die $diagnosis if ! $chrome;

Finds the first Chrome executable in the path ($ENV{PATH}). For Windows, it also looks in $ENV{ProgramFiles}, $ENV{ProgramFiles(x86)} and $ENV{"ProgramFilesW6432"}. For OSX it also looks in the user home directory as given through $ENV{HOME}.

This is used to find the default Chrome executable if none was given through the launch_exe option or if the executable is given and does not exist and does not contain a directory separator.
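
The search described above can be pictured with a simplified sketch. This is not the module's actual implementation: the function name first_executable is made up for this example, and the platform-specific extras (Program Files directories, the OS X Application directories, Windows file extensions) are left out.

```perl
use strict;
use warnings;
use File::Spec;

# Simplified sketch of a PATH search; not the module's actual implementation.
# Returns the first existing executable among @$names in @dirs, or nothing.
sub first_executable {
    my ($names, @dirs) = @_;
    for my $dir (@dirs) {
        for my $name (@$names) {
            my $candidate = File::Spec->catfile( $dir, $name );
            return $candidate if -f $candidate and -x _;
        }
    }
    return;
}

# Default names as described under the launch_exe option
my $default = $^O eq 'MSWin32' ? 'chrome'
            : $^O eq 'darwin'  ? 'Google Chrome'
            :                    'google-chrome';

# File::Spec->path returns the directories listed in $ENV{PATH}
my $found = first_executable( [ $default, 'chromium-browser' ], File::Spec->path );
print defined $found
    ? "Found $found\n"
    : "No browser executable found in PATH\n";
```

In real code, prefer WWW::Mechanize::Chrome->find_executable, which also returns a diagnosis when nothing is found.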

$mech->chrome_version

print $mech->chrome_version;

Synonym for ->browser_version

$mech->browser_version

print $mech->browser_version;

Returns the version of the browser executable being used. Obtaining this information requires launching the browser and asking it for its version over the network.

$mech->chrome_version_info

print $mech->chrome_version_info->{product};

Returns the version information of the Chrome executable and various other APIs of Chrome that the object is connected to.

$mech->driver

deprecated - use ->target instead

my $driver = $mech->driver

Access the Chrome::DevToolsProtocol instance connecting to Chrome.

Deprecated, don't use this anymore. Most likely you want to use ->target to talk to the Chrome tab or ->transport to talk to the Chrome instance.

$mech->target

my $target = $mech->target

Access the Chrome::DevToolsProtocol::Target instance connecting to the Chrome tab we use.

$mech->transport

my $transport = $mech->transport

Access the Chrome::DevToolsProtocol::Transport instance connecting to the Chrome instance.

$mech->tab

my $tab = $mech->tab

Access the tab hash of the Chrome::DevToolsProtocol::Target instance. This represents the tab we control.

$mech->new_tab

$mech->new_tab_future

my $tab2 = $mech->new_tab_future(
    start_url => 'https://google.com',
)->get;

Creates a new tab (basically, a new WWW::Mechanize::Chrome object) connected to the same Chrome session.

# Use a targetInfo structure from Chrome
my $tab2 = $mech->new_tab_future(
    tab => {
        'targetId' => '1F42BDF32A30700805DDC21EDB5D8C4A',
    },
)->get;

It returns a Future because most event loops do not like recursing within themselves, which happens if you want to access a fresh new tab within another callback.

EVENTS

popup

use feature qw(say signatures);
no warnings 'experimental::signatures';

my $opened;
$mech->on( 'popup' => sub( $mech, $tab_f ) {
    # This is a bit heavyweight, but ...
    $tab_f->on_done(sub($tab) {
        say "New window/tab was popped up:";
        $tab->uri_future->then(sub($uri) {
            say $uri;
        });
        $opened = $tab;
    })->retain;
});

$mech->click({ selector => '#popup_window' });
if( $opened ) {
    say $opened->title;
} else {
    say "Did not find new tab?";
};

This event is sent whenever a new tab/window gets popped up or created. The callback is handed the current WWW::Mechanize::Chrome instance and a Future that resolves to a second instance for the new tab. Note that depending on your event loop, you are quite restricted in which synchronous methods you can call from within the callback.

$mech->allow( %options )

$mech->allow( javascript => 1 );

Allow or disallow execution of Javascript

$mech->emulateNetworkConditions( %options )

# Go offline
$mech->emulateNetworkConditions(
    offline => JSON::true,
    latency => 10, # ms ping
    downloadThroughput => 0, # bytes/s
    uploadThroughput => 0, # bytes/s
    connectionType => 'offline', # cellular2g, cellular3g, cellular4g, bluetooth, ethernet, wifi, wimax, other.
);

$mech->setRequestInterception( @patterns )

$mech->setRequestInterception(
    { urlPattern => '*', resourceType => 'Document', interceptionStage => 'Request'},
    { urlPattern => '*', resourceType => 'Media', interceptionStage => 'Response'},
);

Sets the list of request patterns and resource types for which the interception callback will be invoked.

$mech->continueInterceptedRequest( %options )

$mech->continueInterceptedRequest_future(
    interceptionId => ...
);

Continues an intercepted request

$mech->add_listener

use Data::Dumper;

my $url_loaded = $mech->add_listener('Network.responseReceived', sub {
    my( $info ) = @_;
    warn "Loaded URL "
         . $info->{params}->{response}->{url}
         . ": "
         . $info->{params}->{response}->{status};
    warn "Resource timing: " . Dumper $info->{params}->{response}->{timing};
});

Returns a listener object. If that object is discarded, the listener callback will be removed.

Calling this method in void context croaks.

To see the browser console live from your Perl script, use the following:

my $console = $mech->add_listener('Runtime.consoleAPICalled', sub {
  warn join ", ",
      map { $_->{value} // $_->{description} }
      @{ $_[0]->{params}->{args} };
});

If you want to explicitly remove the listener, either set it to undef:

undef $console;

Alternatively, call

$console->unregister;

or call

$mech->remove_listener( $console );

$mech->on_request_intercepted( $cb )

$mech->on_request_intercepted( sub {
    my( $mech, $info ) = @_;
    warn $info->{request}->{url};
    $mech->continueInterceptedRequest_future(
        interceptionId => $info->{interceptionId}
    )
});

A callback for intercepted requests that match the patterns set up via setRequestInterception.

If you return a future from this callback, it will not be discarded but kept in a safe place.

$mech->searchInResponseBody( $id, %options )

my $request_id = ...;
my @matches = $mech->searchInResponseBody(
    requestId     => $request_id,
    query         => 'rumpelstiltskin',
    caseSensitive => JSON::true,
    isRegex       => JSON::false,
);
for( @matches ) {
    print $_->{lineNumber}, ":", $_->{lineContent}, "\n";
};

Returns the matches (if any) for a string or regular expression within a response.

$mech->on_dialog( $cb )

$mech->on_dialog( sub {
    my( $mech, $dialog ) = @_;
    warn $dialog->{message};
    $mech->handle_dialog( 1 ); # click "OK" / "yes" instead of "cancel"
});

A callback for Javascript dialogs (alert(), prompt(), ... )

$mech->handle_dialog( $accept, $prompt = undef )

$mech->on_dialog( sub {
    my( $mech, $dialog ) = @_;
    warn "[Javascript $dialog->{type}]: $dialog->{message}";
    $mech->handle_dialog( 1 ); # click "OK" / "yes" instead of "cancel"
});

Closes the current Javascript dialog.

$mech->js_console_entries()

print $_->{type}, " ", $_->{message}, "\n"
    for $mech->js_console_entries();

An interface to the Javascript Error Console

Returns the list of entries in the JEC

$mech->js_errors()

print "JS error: ", $_->{message}, "\n"
    for $mech->js_errors();

Returns the list of errors in the JEC

$mech->clear_js_errors()

$mech->clear_js_errors();

Clears all Javascript messages from the console

$mech->eval_in_page( $str, %options )

$mech->eval( $str, %options )

my ($value, $type) = $mech->eval( '2+2' );

Evaluates the given Javascript fragment in the context of the web page. Returns a pair of value and Javascript type.

This allows access to variables and functions declared "globally" on the web page.

  • returnByValue

    If you want to create an object in Chrome and only want to keep a handle to that remote object, use JSON::false for the returnByValue option:

      my ($dummyObj,$type) = $mech->eval(
          'new Object',
          returnByValue => JSON::false
      );
    

    This is also helpful if the object in Chrome cannot be serialized as JSON. For example, window is such an object. The return value is a hash, whose objectId is the most interesting part.

This method is special to WWW::Mechanize::Chrome.

$mech->eval_in_chrome $code, @args

$mech->eval_in_chrome(<<'JS', "Foobar/1.0");
    this.settings.userAgent= arguments[0]
JS

Evaluates Javascript code in the context of Chrome.

This allows you to modify properties of Chrome.

This is currently not implemented.

$mech->callFunctionOn( $function, @arguments )

my ($value, $type) = $mech->callFunctionOn(
    'function(greeting) { window.alert(greeting)}',
    objectId => $someObjectId,
    arguments => [{ value => 'Hello World' }]
);

Runs the given function with the specified arguments. This is the only way to pass arguments to a function call without doing risky string interpolation. The Javascript this object will be set to the object referenced from the objectId.

The arguments option expects an arrayref of hashrefs. Each hash describes one function argument.

The objectId parameter is optional. Leaving out the objectId parameter will create a dummy object on which the function then is called.

This method is special to WWW::Mechanize::Chrome.
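
Since the arguments option expects an arrayref of hashrefs, a small wrapper can take care of the conversion from plain Perl values. The following is a sketch only: the names as_arguments and call_with_values are made up for this example, and it assumes that every argument can be passed by value.

```perl
use strict;
use warnings;

# Wrap plain Perl values in the arrayref-of-hashrefs form
# that the arguments option expects.
sub as_arguments {
    return [ map { +{ value => $_ } } @_ ];
}

# Hypothetical convenience wrapper around ->callFunctionOn.
# No objectId is passed, so the function is called on a dummy object.
sub call_with_values {
    my ($mech, $function, @values) = @_;
    return $mech->callFunctionOn(
        $function,
        arguments => as_arguments(@values),
    );
}

# Usage, assuming $mech is a WWW::Mechanize::Chrome object:
# my ($value, $type) = call_with_values(
#     $mech, 'function(a, b) { return a + b }', 2, 3 );
```

Because the values travel as data rather than being pasted into the function source, quotes or other special characters in them cannot change the Javascript that gets executed.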

->autoclose_tab

Set the autoclose option

->close

$mech->close()

Tear down all connections and shut down Chrome.

$mech->list_tabs

my @open_tabs = $mech->list_tabs()->get;
say $open_tabs[0]->{title};

Returns the open tabs as a list of hashrefs.

$mech->highlight_node( @nodes )

my @links = $mech->selector('a');
$mech->highlight_node(@links);
print $mech->content_as_png();

Convenience method that marks all nodes in the arguments with a red frame.

This is convenient if you need visual verification that you've got the right nodes.

NAVIGATION METHODS

$mech->get( $url, %options )

my $response = $mech->get( $url );

Retrieves the given URL.

It returns a HTTP::Response object for interface compatibility with WWW::Mechanize.

Note that the returned HTTP::Response object gets the response body filled in lazily, so you might have to wait a moment to get the response body from the result. This is a premature optimization and later releases of WWW::Mechanize::Chrome are planned to fetch the response body immediately when accessing the response body.

Note that Chrome does not support download of files through the API.

Options

  • intrapage - Override the detection of whether to wait for a HTTP response or not. Setting this will never wait for an HTTP response.

$mech->_collectEvents

my $events = $mech->_collectEvents(
    sub { $_[0]->{method} eq 'Page.loadEventFired' }
);
my( $e,$r) = Future->wait_all( $events, $self->target->send_message(...));

Internal method to create a Future that waits for an event that is sent by Chrome.

The subroutine is the predicate to check to see if the current event is the event we have been waiting for.

The result is a Future that will return all captured events.

$mech->get_local( $filename , %options )

$mech->get_local('test.html');

Shorthand method to construct the appropriate file:// URI and load it into Chrome. Relative paths will be interpreted as relative to $0 or the basedir option.

This method accepts the same options as ->get().

This method is special to WWW::Mechanize::Chrome but could also exist in WWW::Mechanize through a plugin.

Warning: Chrome does not handle local files well. In particular, subframes do not get loaded properly.

$mech->getRequestPostData

if( $info->{params}->{response}->{requestHeaders}->{":method"} eq 'POST' ) {
    $req->{postBody} = $m->getRequestPostData( $id );
};

Retrieves the data sent with a POST request

$mech->post( $url, %options )

not implemented

$mech->post( 'http://example.com',
    params => { param => "Hello World" },
    headers => {
      "Content-Type" => 'application/x-www-form-urlencoded',
    },
    charset => 'utf-8',
);

Sends a POST request to $url.

A Content-Length header will be automatically calculated if it is not given.

The following options are recognized:

  • headers - a hash of HTTP headers to send. If not given, the content type will be generated automatically.
  • data - the raw data to send, if you've encoded it already.

$mech->reload( %options )

$mech->reload( ignoreCache => 1 )

Acts like the reload button in a browser: repeats the current request. The history (as per the "back" method) is not altered.

Returns the HTTP::Response object from the reload, or undef if there's no current request.

$mech->set_download_directory( $dir )

my $downloads = tempdir();
$mech->set_download_directory( $downloads );

Enables automatic file downloads and sets the directory where the files will be downloaded to. Setting this to undef will disable downloads again.

The directory in $dir must be an absolute path, since Chrome does not know about the current directory of your Perl script.
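
One way to satisfy the absolute path requirement is to resolve the directory in Perl before handing it to Chrome. The helper name set_download_directory_abs below is an assumption for this sketch; it relies only on File::Spec from the Perl core and on the set_download_directory method documented here.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: resolve $dir against the current directory of the
# Perl script before handing it to Chrome, which needs an absolute path.
sub set_download_directory_abs {
    my ($mech, $dir) = @_;
    my $absolute = File::Spec->rel2abs($dir);
    $mech->set_download_directory($absolute);
    return $absolute;
}

# Usage, assuming $mech is a WWW::Mechanize::Chrome object:
# my $dir = set_download_directory_abs( $mech, 'downloads' );
```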

$mech->cookie_jar

my $cookies = $mech->cookie_jar

Returns all the Chrome cookies in a HTTP::Cookies::ChromeDevTools instance. Setting a cookie in there will also set the cookie in Chrome. Note that the ->cookie_jar does not automatically refresh when a new page is loaded. To manually refresh the state of the cookie jar, use:

$mech->get('https://example.com/some_page');
$mech->cookie_jar->load;

$mech->add_header( $name => $value, ... )

$mech->add_header(
    'X-WWW-Mechanize-Chrome' => "I'm using it",
    Encoding => 'text/klingon',
);

This method sets up custom headers that will be sent with every HTTP(S) request that Chrome makes.

Note that currently, we only support one value per header.

Since version 63, Chrome no longer allows setting and sending the Referer header. The bug report is at https://bugs.chromium.org/p/chromium/issues/detail?id=849972.

$mech->delete_header( $name , $name2... )

$mech->delete_header( 'User-Agent' );

Removes HTTP headers from the agent's list of special headers. Note that Chrome may still send a header with its default value.

$mech->reset_headers

$mech->reset_headers();

Removes all custom headers and makes Chrome send its defaults again.
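
The header methods can be combined to send extra headers only for a limited stretch of code. The following sketch uses a made-up helper named with_headers and relies only on add_header and reset_headers as documented above. Keep in mind that reset_headers drops every custom header, including ones set before the helper was called.

```perl
use strict;
use warnings;

# Hypothetical helper: send extra headers only while $code runs.
# Note that reset_headers removes all custom headers, including
# any that were set before this helper was called.
sub with_headers {
    my ($mech, $headers, $code) = @_;
    $mech->add_header(%$headers);
    my @result = eval { $code->($mech) };
    my $error  = $@;
    $mech->reset_headers;      # restore Chrome's defaults even on failure
    die $error if $error;
    return @result;
}

# Usage, assuming $mech is a WWW::Mechanize::Chrome object:
# with_headers( $mech, { 'X-Debug' => '1' }, sub {
#     $_[0]->get('https://example.com/');
# });
```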

$mech->block_urls()

$mech->block_urls( '//facebook.com/js/conversions/tracking.js' );

Sets the list of blocked URLs. These URLs will not be retrieved by Chrome when loading a page. This is useful to eliminate tracking images or to test resilience in the face of bad network conditions.

$mech->res() / $mech->response(%options)

my $response = $mech->response(headers => 0);

Returns the current response as a HTTP::Response object.

$mech->success()

$mech->get('https://google.com');
print "Yay"
    if $mech->success();

Returns a boolean telling whether the last request was successful. If there hasn't been an operation yet, returns false.

This is a convenience function that wraps $mech->res->is_success.
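
When autodie is switched off, checking success after each request becomes your responsibility. The sketch below wraps that check in a helper. The name try_get is an assumption for this example; it relies only on the get, success, status and content methods documented here.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch $url and return the page content,
# or nothing (with a warning) if the request was not successful.
# Expects a $mech created with autodie => 0, so that get() does not die.
sub try_get {
    my ($mech, $url) = @_;
    $mech->get($url);
    if ( $mech->success ) {
        return $mech->content;
    }
    warn "Request for $url failed with status " . $mech->status . "\n";
    return;
}

# Usage, assuming WWW::Mechanize::Chrome is installed:
# my $mech = WWW::Mechanize::Chrome->new( autodie => 0 );
# my $html = try_get( $mech, 'https://example.com/' );
```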

$mech->status()

$mech->get('https://google.com');
print $mech->status();
# 200

Returns the HTTP status code of the response. This is a 3-digit number like 200 for OK, 404 for not found, and so on.

$mech->back()

$mech->back();

Goes one page back in the page history.

Returns the (new) response.

$mech->forward()

$mech->forward();

Goes one page forward in the page history.

Returns the (new) response.

$mech->stop()

$mech->stop();

Stops all loading in Chrome, as if you pressed ESC.

This function is mostly of use in callbacks or in a timer callback from your event loop.

$mech->uri()

$mech->uri_future()

print "We are at " . $mech->uri;
print "We are at " . $mech->uri_future->get;

Returns the current document URI.

$mech->infinite_scroll( [$wait_time_in_seconds] )

$new_content_found = $mech->infinite_scroll(3);

Loads content into pages that have "infinite scroll" capabilities by scrolling to the bottom of the web page and waiting up to the number of seconds, as set by the optional $wait_time_in_seconds argument, for the browser to load more content. The default is to wait up to 20 seconds. For reasonably fast sites, the wait time can be set much lower.

The method returns a boolean true if new content is loaded, false otherwise. You can scroll to the end (if there is one) of an infinitely scrolling page like so:

my $count = 0;
while( $mech->infinite_scroll ) {
    # Tests for exiting the loop earlier
    last if $count++ >= 10;
}

CONTENT METHODS

$mech->document_future()

$mech->document()

print $mech->document->{nodeId};

Returns the document node.

This is WWW::Mechanize::Chrome specific.

$mech->content( %options )

print $mech->content;
print $mech->content( format => 'html' ); # default
print $mech->content( format => 'text' ); # identical to ->text
print $mech->content( format => 'mhtml' ); # identical to ->captureSnapshot

This always returns the content as a Unicode string. It tries to decode the raw content according to its input encoding. This currently only works for HTML pages, not for images etc.

Recognized options:

  • format - the format of the content to return

    The allowed values are html, text and mhtml. The default is html.

$mech->text()

print $mech->text();

Returns the text of the current HTML content. If the content isn't HTML, $mech will die.

$mech->captureSnapshot_future()

$mech->captureSnapshot()

print $mech->captureSnapshot( format => 'mhtml' )->{data};

Returns the current page as MHTML.

This is WWW::Mechanize::Chrome specific.

$mech->content_encoding()

print "The content is encoded as ", $mech->content_encoding;

Returns the encoding that the content is in. This can be used to convert the content from UTF-8 back to its native encoding.

$mech->update_html( $html )

$mech->update_html($html);

Writes $html into the current document. This is mostly implemented as a convenience method for HTML::Display::MozRepl.

The value passed in as $html will be stringified.

$mech->base()

print $mech->base;

Returns the URL base for the current page.

The base is either specified through a base tag or is the current URL.

This method is specific to WWW::Mechanize::Chrome.

$mech->content_type()

$mech->ct()

print $mech->content_type;

Returns the content type of the currently loaded document

$mech->is_html()

print $mech->is_html();

Returns true/false on whether our content is HTML, according to the HTTP headers.

$mech->title()

print "We are on page " . $mech->title;

Returns the current document title.

EXTRACTION METHODS

$mech->links()

print $_->text . " -> " . $_->url . "\n"
    for $mech->links;

Returns all links in the document as WWW::Mechanize::Link objects.

Currently accepts no parameters. See ->xpath or ->selector when you want more control.

$mech->selector( $css_selector, %options )

my @text = $mech->selector('p.content');

Returns all nodes matching the given CSS selector. If $css_selector is an array reference, it returns all nodes matched by any of the CSS selectors in the array.

This takes the same options that ->xpath does.

This method is implemented via WWW::Mechanize::Plugin::Selector.

$mech->find_link_dom( %options )

print $_->{innerHTML} . "\n"
    for $mech->find_link_dom( text_contains => 'CPAN' );

A method to find links, like WWW::Mechanize's ->find_links method. This method returns DOM objects from Chrome instead of WWW::Mechanize::Link objects.

Note that Chrome might have reordered the links or frame links in the document so the absolute numbers passed via n might not be the same between WWW::Mechanize and WWW::Mechanize::Chrome.

The supported options are:

  • text and text_contains and text_regex

    Match the text of the link as a complete string, substring or regular expression.

    Matching as a complete string or substring is a bit faster, as it is done in the XPath engine of Chrome.

  • id and id_contains and id_regex

    Matches the id attribute of the link as a complete string, substring or regular expression.

  • name and name_contains and name_regex

    Matches the name attribute of the link

  • url and url_regex

    Matches the URL attribute of the link (href, src or content).

  • class - the class attribute of the link

  • n - the (1-based) index. Defaults to returning the first link.

  • single - If true, ensure that only one element is found. Otherwise croak or carp, depending on the autodie parameter.

  • one - If true, ensure that at least one element is found. Otherwise croak or carp, depending on the autodie parameter.

    The method croaks if no link is found. If the single option is true, it also croaks when more than one link is found.

$mech->find_link( %options )

print $_->text . "\n"
    for $mech->find_link( text_contains => 'CPAN' );

A method quite similar to WWW::Mechanize's method. The options are documented in ->find_link_dom.

Returns a WWW::Mechanize::Link object.

By default, this does not look through child frames.

$mech->find_all_links( %options )

print $_->text . "\n"
    for $mech->find_all_links( text_regex => qr/google/i );

Finds all links in the document. The options are documented in ->find_link_dom.

Returns them as list or an array reference, depending on context.

By default, this does not look through child frames.

$mech->find_all_links_dom( %options )

print $_->{innerHTML} . "\n"
    for $mech->find_all_links_dom( text_regex => qr/google/i );

Finds all matching link-like DOM nodes in the document. The options are documented in ->find_link_dom.

Returns them as list or an array reference, depending on context.

By default, this does not look through child frames.

$mech->follow_link( $link )

$mech->follow_link( %options )

$mech->follow_link( xpath => '//a[text() = "Click here!"]' );

Follows the given link. Takes the same parameters that find_link_dom uses.

Note that ->follow_link will only try to follow link-like things like A tags.

$mech->xpath( $query, %options )

my $link = $mech->xpath('//a[@id="clickme"]', single => 1);
# croaks if there is no link or more than one link found

my @para = $mech->xpath('//p');
# Collects all paragraphs

my @para_text = $mech->xpath('//p/text()', type => $mech->xpathResult('STRING_TYPE'));
# Collects all paragraphs as text

Runs an XPath query in Chrome against the current document.

If you need more information about the returned results, use the ->xpathEx() function.

Note that Chrome sometimes returns a node with node id 0. This node then cannot be found again using the Chrome API. This is bad luck and results in a warning.

The options allow the following keys:

  • document - document in which the query is to be executed. Use this to search a node within a specific subframe of $mech->document.

  • frames - if true, search all documents in all frames and iframes. This may or may not conflict with node. This will default to the frames setting of the WWW::Mechanize::Chrome object.

  • node - node relative to which the query is to be executed. Note that you will have to use a relative XPath expression as well. Use

      .//foo
    

    instead of

      //foo
    

    Querying relative to a node only works for restricting to children of the node, not for anything else. This is because we need to do the ancestor filtering ourselves instead of having a Chrome API for it.

  • single - If true, ensure that only one element is found. Otherwise croak or carp, depending on the autodie parameter.

  • one - If true, ensure that at least one element is found. Otherwise croak or carp, depending on the autodie parameter.

  • maybe - If true, ensure that at most one element is found. Otherwise croak or carp, depending on the autodie parameter.

  • all - If true, return all elements found. This is the default. You can use this option if you want to use ->xpath in scalar context to count the number of matched elements, as it will otherwise emit a warning for each usage in scalar context without any of the above restricting options.

  • any - no error is raised, no matter if an item is found or not.

Returns the matched results as WWW::Mechanize::Chrome::Node objects.

You can pass in a list of queries as an array reference for the first parameter. The result will then be the list of all elements matching any of the queries.

This is a method that is not implemented in WWW::Mechanize.

In the long run, this should go into a general plugin for WWW::Mechanize.
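The node option described above can be wrapped in a small helper. This is only a sketch: xpath_within is a hypothetical name that is not part of WWW::Mechanize::Chrome, and the only module call it relies on is ->xpath with the documented node option. It rejects queries starting with a slash because the documentation asks for a relative expression when node is used.

```perl
use strict;
use warnings;

# Hypothetical convenience wrapper, not part of the module: run $query
# relative to $node. Absolute queries are refused, since the documentation
# asks for a relative expression whenever the node option is used.
sub xpath_within {
    my ($mech, $node, $query, %options) = @_;
    die "Use a relative query such as './/td' together with node\n"
        if $query =~ m{^\s*/};
    return $mech->xpath( $query, node => $node, %options );
}

# Usage with a WWW::Mechanize::Chrome object in $mech:
#   for my $row ( $mech->xpath('//table//tr') ) {
#       my @cells = xpath_within( $mech, $row, './/td' );
#   }
```

The check is deliberately crude; a query such as (//td)[1] would still slip through.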

$mech->by_id( $id, %options )

my @text = $mech->by_id('_foo:bar');

Returns all nodes matching the given ids. If $id is an array reference, it returns all nodes matched by any of the ids in the array.

This method is equivalent to calling ->xpath:

$self->xpath(qq{//*[\@id="$_"]}, %options)

It is convenient when your element ids get mistaken for CSS selectors.
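The equivalence above can be tried out in plain Perl without a browser. The helper names id_to_xpath and ids_to_xpaths are hypothetical and not part of the module; the sketch only shows that an id such as _foo:bar, whose colon a CSS selector would read as the start of a pseudo-class, needs no special treatment inside an XPath attribute comparison. Ids containing a double quote are rejected here to keep the example short.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of WWW::Mechanize::Chrome:
# build the XPath query that the text above gives for one id.
sub id_to_xpath {
    my ($id) = @_;
    die "ids containing a double quote are not handled in this sketch\n"
        if $id =~ /"/;
    return qq{//*[\@id="$id"]};
}

# Accept a single id or an array reference of ids, like ->by_id does.
sub ids_to_xpaths {
    my ($ids) = @_;
    return map { id_to_xpath($_) } ( ref($ids) eq 'ARRAY' ? @$ids : ($ids) );
}

print "$_\n" for ids_to_xpaths( [ '_foo:bar', 'login' ] );
```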

$mech->click( $name [,$x ,$y] )

# If the element is within a <form> element
$mech->click( 'go' );

# If the element is anywhere on the page
$mech->click({ xpath => '//button[@name="go"]' });

Has the effect of clicking a button (or other element) on the current form. The first argument is the name of the button to be clicked. The second and third arguments (optional) allow you to specify the (x,y) coordinates of the click.

If there is only one button on the form, $mech->click() with no arguments simply clicks that one button.

If you pass in a hash reference instead of a name, the following keys are recognized:

  • text - Find the element to click by its contained text

  • selector - Find the element to click by the CSS selector

  • xpath - Find the element to click by the XPath query

  • dom - Click on the passed DOM element

    You can use this to click on arbitrary page elements. There is no convenient way to pass x/y co-ordinates when using the dom option.

  • id - Click on the element with the given id

    This is useful if your document ids contain characters that do look like CSS selectors. It is equivalent to

      xpath => qq{//*[\@id="$id"]}
    
  • intrapage - Override the detection of whether to wait for a HTTP response or not. Setting this will never wait for an HTTP response.

Returns a HTTP::Response object.

As a deviation from the WWW::Mechanize API, you can also pass a hash reference as the first parameter. In it, you can specify the parameters to search much like for the find_link calls.

$mech->click_button( ... )

$mech->click_button( name => 'go' );
$mech->click_button( input => $mybutton );

Has the effect of clicking a button on the current form by specifying its name, value, or index. Its arguments are a list of key/value pairs. Exactly one of name, number, input or value must be specified in the keys.

  • name - name of the button
  • value - value of the button
  • input - DOM node
  • id - id of the button
  • number - number of the button

If you find yourself wanting to specify a button through its selector or xpath, consider using ->click instead.

FORM METHODS

$mech->current_form()

print $mech->current_form->{name};

Returns the current form.

This method is incompatible with WWW::Mechanize. It returns the DOM <form> object and not a HTML::Form instance.

The current form will be reset by WWW::Mechanize::Chrome on calls to ->get() and ->get_local(), and on calls to ->submit() and ->submit_with_fields.

$mech->dump_forms( [$fh] )

open my $fh, '>', 'form-log.txt'
    or die "Couldn't open logfile 'form-log.txt': $!";
$mech->dump_forms( $fh );

Prints a dump of the forms on the current page to the filehandle $fh. If $fh is not specified or is undef, it dumps to STDOUT.

$mech->form_name( $name [, %options] )

$mech->form_name( 'search' );

Selects the current form by its name. The options are identical to those accepted by the "$mech->xpath" method.

$mech->form_id( $id [, %options] )

$mech->form_id( 'login' );

Selects the current form by its id attribute. The options are identical to those accepted by the "$mech->xpath" method.

This is equivalent to calling

$mech->by_id($id,single => 1,%options)

$mech->form_number( $number [, %options] )

$mech->form_number( 2 );

Selects the $number-th form. The options are identical to those accepted by the "$mech->xpath" method.

$mech->form_with_fields( [$options], @fields )

$mech->form_with_fields(
    'user', 'password'
);

Find the form which has the listed fields.

If the first argument is a hash reference, it's taken as options to ->xpath.

See also "$mech->submit_form".

$mech->forms( %options )

my @forms = $mech->forms();

When called in a list context, returns a list of the forms found in the last fetched page. In a scalar context, returns a reference to an array with those forms.

The options are identical to those accepted by the "$mech->selector" method.

The returned elements are the DOM <form> elements.

$mech->field( $selector, $value [, $index, \@pre_events [, \@post_events]] )

$mech->field( user => 'joe' );
$mech->field( not_empty => '', 0, [], [] ); # bypass JS validation
$mech->field( date => '2020-04-01', 2 );    # set second field named "date"

Sets the field with the name given in $selector to the given value. Returns the value.

The method understands very basic CSS selectors in the value for $selector, like the HTML::Form find_input() method.

A selector prefixed with '#' must match the id attribute of the input. A selector prefixed with '.' matches the class attribute. A selector prefixed with '^' or with no prefix matches the name attribute.

By passing the array reference @pre_events, you can indicate which Javascript events you want to be triggered before setting the value. @post_events contains the events you want to be triggered after setting the value.

By default, the events set in the constructor for pre_events and post_events are triggered.
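The prefix rules for $selector can be summarised in a few lines of Perl. This is an illustration only: field_selector_target is a hypothetical helper that does not exist in the module, and it merely reports which attribute a given selector would be compared against according to the rules stated above.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module: report which attribute a
# ->field selector is matched against, following the prefix rules above.
sub field_selector_target {
    my ($selector) = @_;
    return ( id    => $1 ) if $selector =~ /^#(.*)/s;
    return ( class => $1 ) if $selector =~ /^\.(.*)/s;
    return ( name  => $1 ) if $selector =~ /^\^(.*)/s;
    return ( name  => $selector );    # no prefix: match the name attribute
}

for my $selector ( '#login', '.wide', '^user', 'user' ) {
    my ($attribute, $value) = field_selector_target($selector);
    print "$selector matches $attribute '$value'\n";
}
```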

$mech->sendkeys( %options )

$mech->sendkeys( string => "Hello World" );

Sends a series of keystrokes. The keystrokes can be either a string or a reference to an array containing the detailed data as hashes.

  • string - the string to send as keystrokes
  • keys - reference of the array to send as keystrokes
  • delay - delay in ms to sleep between keys

$mech->upload( $selector, $value )

$mech->upload( user_picture => 'C:/Users/Joe/face.png' );

Sets the file upload field with the name given in $selector to the given file. The filename must be an absolute path and filename in the local filesystem.

The method understands very basic CSS selectors in the value for $selector, like the ->field method.

$mech->value( $selector_or_element, [ $index | %options] )

print $mech->value( 'user' );

Returns the value of the field given by $selector_or_element or of the DOM element passed in.

If you have multiple fields with the same name, you can use the index to specify the index directly:

print $mech->value( 'date', 2 ); # get the second field named "date"

The legacy form of

$mech->value( name => value );

is not supported anymore.

For fields that can have multiple values, like a select field, the method is context sensitive and returns the first selected value in scalar context and all values in list context.

Note that this method does not support file uploads. See the ->upload method for that.

$mech->get_set_value( %options )

Allows fine-grained access to getting/setting a value with a different API. Supported keys are:

name
value
pre
post

in addition to all keys that $mech->xpath supports.

$mech->set_field( %options )

$mech->set_field(
    field => $field_node,
    value => 'foo',
);

Low level value setting method. Use this if you have an input element outside of a <form> tag.

$mech->select( $name, $value )

$mech->select( $name, \@values )

$mech->select( 'items', 'banana' );

Given the name of a select field, set its value to the value specified. If the field is not <select multiple> and the $value is an array, only the first value will be set. Passing $value as a hash with an n key selects an item by number (e.g. {n => 3} or {n => [2,4]}). The numbering starts at 1. This applies to the current form.

If you have a field with <select multiple> and you pass a single $value, then $value will be added to the list of fields selected, without clearing the others. However, if you pass an array reference, then all previously selected values will be cleared.

Returns true on successfully setting the value. On failure, returns false and calls $self->warn() with an error message.

$mech->tick( $name, $value [, $set ] )

$mech->tick("confirmation_box", 'yes');

"Ticks" the first checkbox that has both the name and value associated with it on the current form. Dies if there is no named check box for that value. Passing in a false value as the third optional argument will cause the checkbox to be unticked.

(Un)ticking the checkbox is done by sending a click event to it if needed. If $value is undef, the first checkbox matching $name will be (un)ticked.

If $name is a reference to a hash, that hash will be used as the options to ->find_link_dom to find the element.

$mech->untick( $name, $value )

$mech->untick('spam_confirm','yes',undef)

Causes the checkbox to be unticked. Shorthand for

$mech->tick($name,$value,undef)

$mech->submit( $form )

$mech->submit;

Submits the form. Note that this does not fire the onClick event and thus does not run any JavaScript handlers attached to it. You may want to use $mech->click instead.

The default is to submit the current form as returned by $mech->current_form.

$mech->submit_form( %options )

$mech->submit_form(
    with_fields => {
        user => 'me',
        pass => 'secret',
    }
);

This method lets you select a form from the previously fetched page, fill in its fields, and submit it. It combines the form_number/form_name, ->set_fields and ->click methods into one higher level call. Its arguments are a list of key/value pairs, all of which are optional.

  • form => $mech->current_form()

    Specifies the form to be filled and submitted. Defaults to the current form.

  • fields => \%fields

    Specifies the fields to be filled in the current form

  • with_fields => \%fields

    Probably all you need for the common case. It combines a smart form selector and data setting in one operation. It selects the first form that contains all fields mentioned in \%fields. This is nice because you don't need to know the name or number of the form to do this.

    (calls "$mech->form_with_fields()" and "$mech->set_fields()").

    If you choose this, the form_number, form_name, form_id and fields options will be ignored.

$mech->set_fields( $name => $value, ... )

$mech->set_fields(
    user => 'me',
    pass => 'secret',
);

This method sets multiple fields of the current form. It takes a list of field name and value pairs. If there is more than one field with the same name, the first one found is set. If you want to select which of the duplicate fields to set, pass an anonymous array as the value, containing the field value and the field's number as its two elements.

$mech->set_fields(
    user => 'me',
    pass => 'secret',
    pass => [ 'secret', 2 ], # repeated password field
);
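The calling convention above takes a list rather than a hash, which is why the same name may appear twice. The following sketch shows one way such a list could be read. It is not the module's code: expand_field_pairs is a hypothetical name, and treating a plain value as field number 1 is an assumption based on the statement that the first field found is set.

```perl
use strict;
use warnings;

# Illustration only, not the module's code: expand the name/value list that
# ->set_fields accepts into one record per field to be set.
sub expand_field_pairs {
    my @pairs = @_;
    my @fields;
    while ( my ($name, $value) = splice @pairs, 0, 2 ) {
        # A plain value addresses the first field of that name (assumed to
        # be number 1); [ $value, $number ] addresses the $number-th one.
        my ($field_value, $number)
            = ref($value) eq 'ARRAY' ? @$value : ( $value, 1 );
        push @fields, { name => $name, value => $field_value, number => $number };
    }
    return @fields;
}

my @fields = expand_field_pairs(
    user => 'me',
    pass => 'secret',
    pass => [ 'secret', 2 ],    # repeated password field
);
printf "%s #%d = %s\n", @{$_}{qw(name number value)} for @fields;
```

Passing the same pairs through a hash would silently drop the first pass entry, so the list form matters here.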

CONTENT MONITORING METHODS

$mech->is_visible( $element )

$mech->is_visible( %options )

if ($mech->is_visible( selector => '#login' )) {
    print "You can log in now.";
};

Returns true if the element is visible, that is, it is a member of the DOM and neither it nor its ancestors have a CSS visibility attribute of hidden or a display attribute of none.

You can either pass in a DOM element or a set of key/value pairs to search the document for the element you want.

  • xpath - the XPath query
  • selector - the CSS selector
  • dom - a DOM node

The remaining options are passed through to either the $mech->xpath or the $mech->selector method.

$mech->wait_until_invisible( $element )

$mech->wait_until_invisible( %options )

$mech->wait_until_invisible( $please_wait );

Waits until an element is not visible anymore.

Takes the same options as ->is_visible.

In addition, the following options are accepted:

  • timeout - the timeout after which the function will croak. To catch the condition and handle it in your calling program, use an eval block. A timeout of 0 means to never time out.

    See also max_wait if you want to wait a limited time for an element to disappear.

  • max_wait - the maximum time to wait until the function will return. A max_wait of 0 means to never time out. If the element is still visible, the function will return a false value.

  • sleep - the interval in seconds used to sleep. Subsecond intervals are possible.

Note that when passing in a selector, that selector is requeried on every poll instance. So the following query will work as expected:

xpath => '//*[contains(text(),"stand by")]'

This also means that if your selector query relies on finding a changing text, you need to pass the node explicitly instead of passing the selector.
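The advice to catch the timeout with an eval block can be sketched as follows. gone_within is a hypothetical helper name; the only module call used is ->wait_until_invisible with the documented timeout option, and the sketch assumes that the method dies (croaks) on timeout as described above.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module: wait up to $seconds for the
# element to disappear and report success instead of dying on timeout.
sub gone_within {
    my ($mech, $seconds, %search) = @_;
    my $gone = eval {
        $mech->wait_until_invisible( %search, timeout => $seconds );
        1;
    };
    return $gone ? 1 : 0;    # on failure, the croak message is left in $@
}

# Usage with a WWW::Mechanize::Chrome object in $mech:
#   gone_within( $mech, 10, xpath => '//*[contains(text(),"stand by")]' )
#       or die "Page is still busy: $@";
```

If a false return value is all you need, the max_wait option described above avoids the eval block altogether.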

$mech->wait_until_visible( %options )

$mech->wait_until_visible( selector => 'a.download' );

Waits until a query returns a visible element.

Takes the same options as ->is_visible.

In addition, the following options are accepted:

  • timeout - the timeout after which the function will croak. To catch the condition and handle it in your calling program, use an eval block. A timeout of 0 means to never time out.
  • sleep - the interval in seconds used to sleep. Subsecond intervals are possible.

Note that when passing in a selector, that selector is requeried on every poll instance. So the following query will work as expected:

xpath => '//*[contains(text(),"click here for download")]'

CONTENT RENDERING METHODS

$mech->content_as_png()

my $png_data = $mech->content_as_png();

# Create scaled-down 480px wide preview
my $png_data = $mech->content_as_png(undef, { width => 480 });

Returns the given tab or the current page rendered as a PNG image.

All parameters are optional.

This method is specific to WWW::Mechanize::Chrome.

$mech->saveResources_future

my $file_map = $mech->saveResources_future(
    target_file => 'this_page.html',
    target_dir  => 'this_page_files/',
    wanted      => sub { $_[0]->{url} =~ m!^https?:!i },
)->get();

Rough prototype of "Save Complete Page" feature

$mech->viewport_size

print Dumper $mech->viewport_size;
$mech->viewport_size({ width => 1388, height => 792 });

Returns (or sets) the new size of the viewport (the "window").

The recognized keys are:

width
height
deviceScaleFactor
mobile
screenWidth
screenHeight
positionX
positionY

$mech->element_as_png( $element )

my $shiny = $mech->selector('#shiny', single => 1);
my $i_want_this = $mech->element_as_png($shiny);

Returns PNG image data for a single element

$mech->render_element( %options )

my $shiny = $mech->selector('#shiny', single => 1);
my $i_want_this= $mech->render_element(
    element => $shiny,
    format => 'png',
);

Returns the data for a single element or writes it to a file. It accepts all options of ->render_content.

Note that while the image will have the node in the upper left corner, the width and height of the resulting image will still be the size of the browser window. Cut the image using element_coordinates if you need exactly the element.

$mech->element_coordinates( $element )

my $shiny = $mech->selector('#shiny', single => 1);
my ($pos) = $mech->element_coordinates($shiny);
print $pos->{left},',', $pos->{top};

Returns the page-coordinates of the $element in pixels as a hash with four entries, left, top, width and height.

This function might get moved into another module more geared towards rendering HTML.

$mech->render_content(%options)

my $pdf_data = $mech->render_content( format => 'pdf' );

Returns the current page rendered as PDF or PNG as a bytestring.

Note that the PDF format will only be successful with headless Chrome. At least on Windows, when launching Chrome with a UI, printing to PDF will be unavailable.

This method is specific to WWW::Mechanize::Chrome.

$mech->content_as_pdf(%options)

my $pdf_data = $mech->content_as_pdf();

my $pdf_data = $mech->content_as_pdf( format => 'A4' );

my $pdf_data = $mech->content_as_pdf( paperWidth => 8, paperHeight => 11 );

Returns the current page rendered in PDF format as a bytestring. The page format can be specified through the format option.

Note that this method will only be successful with headless Chrome. At least on Windows, when launching Chrome with a UI, printing to PDF will be unavailable. See the html-to-pdf.pl script in the examples/ directory of this distribution.

This method is specific to WWW::Mechanize::Chrome.

INTERNAL METHODS

These are methods that are available but exist mostly as internal helper methods. Use of these is discouraged.

$mech->element_query( \@elements, \%attributes )

my $query = $mech->element_query(['input', 'select', 'textarea'],
                           { name => 'foo' });

Returns the XPath query that searches for all elements with tagNames in @elements having the attributes %attributes. The @elements will form an or condition, while the attributes will form an and condition.
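To make the or/and combination concrete, here is a standalone sketch that builds a query of that shape. It is an illustration only: sketch_element_query is a hypothetical function, the query text returned by the real ->element_query may look different, and attribute values containing double quotes are not handled.

```perl
use strict;
use warnings;

# Illustration only: the query produced by ->element_query may look different.
sub sketch_element_query {
    my ($elements, $attributes) = @_;
    die "Need at least one element name\n" unless @$elements;

    # Any one of the tag names may match ...
    my $tags = join ' or ',
        map { sprintf( 'local-name(.)="%s"', $_ ) } @$elements;

    # ... but every listed attribute has to match.
    my @tests = map { sprintf( '@%s="%s"', $_, $attributes->{$_} ) }
        sort keys %$attributes;

    return '//*[' . join( ' and ', "($tags)", @tests ) . ']';
}

print sketch_element_query( [ 'input', 'select', 'textarea' ], { name => 'foo' } ), "\n";
```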

DEBUGGING METHODS

This module can collect the screencasts that Chrome can produce. The screencasts are sent to your callback which either feeds them to ffmpeg to create a video out of them or dumps them to disk as sequential images.

# Callback that receives each screencast frame
sub saveFrame {
    my( $mech, $framePNG ) = @_;
    print $framePNG->{data};
}

$mech->setScreenFrameCallback( \&saveFrame );
# ... do stuff ...
$mech->setScreenFrameCallback( undef ); # stop recording

If you want a premade screencast receiver for debugging headless Chrome sessions, see Mojolicious::Plugin::PNGCast.
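Dumping the frames to disk as sequential images, as mentioned above, could look roughly like this. make_frame_saver is a hypothetical helper, not part of the module. The sketch assumes that $frame->{data} already holds the image bytes, as the example above suggests by printing it directly; if the data turns out to be encoded, it would need decoding first.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: returns a callback for ->setScreenFrameCallback that
# writes each frame to "$dir/frame-00001.png", "$dir/frame-00002.png", ...
sub make_frame_saver {
    my ($dir) = @_;
    my $count = 0;
    return sub {
        my ($mech, $frame) = @_;
        my $file = File::Spec->catfile( $dir, sprintf( 'frame-%05d.png', ++$count ) );
        open my $fh, '>:raw', $file
            or die "Couldn't write '$file': $!";
        print {$fh} $frame->{data};    # assumption: already raw image bytes
        close $fh
            or die "Couldn't close '$file': $!";
        return $file;
    };
}

# Usage with a WWW::Mechanize::Chrome object in $mech:
#   $mech->setScreenFrameCallback( make_frame_saver('frames') );
#   ... # drive the browser
#   $mech->setScreenFrameCallback( undef ); # stop recording
```

The zero-padded numbering keeps the files in order for tools that read image sequences.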

$mech->sleep

$mech->sleep( 2 ); # wait for things to settle down

Suspends the progress of the program while still handling messages from Chrome.

The main use of this method is to give Chrome enough time to send all its screencast frames and to catch up before shutting down the connection.

INCOMPATIBILITIES WITH WWW::Mechanize

As this module is in a very early stage of development, there are many incompatibilities. The main thing is that only the most needed WWW::Mechanize methods have been implemented by me so far.

Unsupported Methods

At least the following methods are unsupported:

  • ->find_all_inputs

    This function is likely best implemented through $mech->selector.

  • ->find_all_submits

    This function is likely best implemented through $mech->selector.

  • ->images

    This function is likely best implemented through $mech->selector.

  • ->find_image

    This function is likely best implemented through $mech->selector.

  • ->find_all_images

    This function is likely best implemented through $mech->selector.

Functions that will likely never be implemented

These functions are unlikely to be implemented because they make little sense in the context of Chrome.

  • ->clone

  • ->credentials( $username, $password )

  • ->get_basic_credentials( $realm, $uri, $isproxy )

  • ->clear_credentials()

  • ->put

    I have no use for it

  • ->post

    This module does not yet support POST requests

INSTALLING

See WWW::Mechanize::Chrome::Install

SEE ALSO

MASQUERADING AS OTHER BROWSERS

Some articles about what you need to change to appear as a different browser

https://multilogin.com/why-mimicking-a-device-is-almost-impossible/

https://github.com/berstend/puppeteer-extra/tree/master/packages/puppeteer-extra-plugin-stealth

REPOSITORY

The public repository of this module is https://github.com/Corion/www-mechanize-chrome.

SUPPORT

The public support forum of this module is https://perlmonks.org/.

TALKS

I've given a German talk at GPW 2017, see http://act.yapc.eu/gpw2017/talk/7027 and https://corion.net/talks for the slides.

At The Perl Conference 2017 in Amsterdam, I also presented a talk, see http://act.perlconference.org/tpc-2017-amsterdam/talk/7022. The slides for the English presentation at TPCiA 2017 are at https://corion.net/talks/WWW-Mechanize-Chrome/www-mechanize-chrome.en.html.

At the London Perl Workshop 2017 in London, I also presented a talk, see YouTube. The slides for that talk are here.

BUG TRACKER

Please report bugs in this module via the Github bug queue at https://github.com/Corion/WWW-Mechanize-Chrome/issues

CONTRIBUTING

Please see WWW::Mechanize::Chrome::Contributing.

KNOWN ISSUES

Please see WWW::Mechanize::Chrome::Troubleshooting.

AUTHOR

Max Maischein

CONTRIBUTORS

Andreas König

Tobias Leich

Steven Dondley

Joshua Pollack

COPYRIGHT (c)

Copyright 2010-2024 by Max Maischein.

LICENSE

This module is released under the same terms as Perl itself.

www-mechanize-chrome's People

Contributors

amanyadev, chrisnovakovic, cmadamsgit, corion, froggs, haraldjoerg, jpollack, lorenzota, sdondley

www-mechanize-chrome's Issues

VERY slow on xpath getting TDs of a TR

Hi there!

If there's a mailing list pls. let me know and I'll post the discussion there.

This is taking 2 to 3 seconds on average:
my @cells = $mech->xpath('.//td', node => $rows[$row_index]);

What is making it so slow? Is there a way to speed this up?

TIA!

--
Alex

Can't locate object method "port" via package "URI::_generic" at .../perl/5.28.1/WWW/Mechanize/Chrome.pm line 1033.

#!/usr/bin/perl
use warnings;
use WWW::Mechanize::Chrome;
use Log::Log4perl qw(:easy);

my $mech = WWW::Mechanize::Chrome->new();

=> Can't locate object method "port" via package "URI::_generic"
at .../perl/5.28.1/WWW/Mechanize/Chrome.pm line 1033.

=> LINES 1009 - 1034

sub _spawn_new_chrome_instance( $self, $options ) {
    my $class = ref $self;
    my @cmd = $class->build_command_line( $options );
    $self->log('debug', "Spawning for $options->{ connection_style }", \@cmd);
    (my( $pid , $to_chrome, $from_chrome, $chrome_stdout ))
        = $self->spawn_child( $options->{ connection_style }, @cmd );
    $options->{ writer_fh } = $to_chrome;
    $options->{ reader_fh } = $from_chrome;
    $self->{pid} = $pid;
    $self->{ kill_pid } = 1;
    if( $options->{ connection_style } eq 'pipe') {
        $options->{ writer_fh } = $to_chrome;
        $options->{ reader_fh } = $from_chrome;

    } else {
        if( $chrome_stdout ) {
            # Synchronously wait for the URL we can connect to
            # Maybe this should become part of the transport, or a second
            # class to asynchronously wait on a filehandle?!
            $options->{ endpoint } = $self->read_devtools_url( $chrome_stdout );
            close $chrome_stdout;

            # set up host/port here so it can be used later by other instances
            my $ws = URI->new( $options->{endpoint});

            ###### LINE 1033 #######
            $options->{port} = $ws->port;
            ######################
            $options->{host} = $ws->host;

Is this a temporary bug, or did I just miss installing a certain module?

Installed modules are:
Algorithm::Diff
Algorithm::Loops
App::cpanminus
CAM::PDF
CGI
Capture::Tiny
Class::Accessor
Class::Method::Modifiers
Crypt::RC4
Data::Dump
Devel::Cycle
Devel::Symdump
ExtUtils::Config
ExtUtils::Helpers
ExtUtils::InstallPaths
Filter::signatures
Font::TTF
Future
Future::HTTP
HTML::Form
HTML::Selector::XPath
HTTP::Daemon
HTTP::Request::AsCGI
HTTP::Server::Simple
IO::Async
IO::String
Image::Info
Imager
Imager::File::PNG
JSON
Log::Log4perl
MRO::Compat
Metrics::Any
Module::Build
Module::Build::Tiny
Mojolicious
Moo
Net::Async::WebSocket
Object::Import
PDF::API2
PadWalker
Path::Tiny
Perl
PerlX::Maybe
Pod::Coverage
Protocol::WebSocket
Role::Tiny
Socket
Spiffy
Struct::Dumb
Sub::Quote
Sub::Uplevel
Test::Base
Test::Deep
Test::Exception
Test::Fatal
Test::HTTP::LocalServer
Test::Identity
Test::Memory::Cycle
Test::Metrics::Any
Test::Needs
Test::NoWarnings
Test::Output
Test::Pod
Test::Pod::Coverage
Test::Refcount
Test::RequiresInternet
Test::Taint
Test::Warn
Test::Warnings
Test::Without::Module
Text::Diff
Text::Levenshtein
Text::Levenshtein::Damerau
Text::PDF
URI
URI::ws
WWW::Mechanize
WWW::Mechanize::Chrome
install
libwww::perl
local::lib

Greetings,
Mx

An example from synopsis does not work

the errors are:

Use of uninitialized value $cmd[0] in exec at /home/kes/work/projects/tucha/monkeyman/local/lib/perl5/WWW/Mechanize/Chrome.pm line 438.
Can't exec "": No such file or directory at /home/kes/work/projects/tucha/monkeyman/local/lib/perl5/WWW/Mechanize/Chrome.pm line 438.

Didn't see a 'Network.responseReceived' event for frameId

I'm trying to log into twitter.com automatically. When I try to submit the login form, I get an error:

Didn't see a 'Network.responseReceived' event for frameId CCFEFDAEDAF4F4BDFBD57DCF85C5B736, requestId 1000091631.28, cannot synthesize response at /Users/me/perl5/perlbrew/perls/perl-5.24.1/lib/site_perl/5.24.4/WWW/Mechanize/Chrome.pm line 1852.

Here's the code which is pretty straightforward.

  $mech->form_number(1);
  $mech->field('session[username_or_email]' => 'user');
  $mech->field('session[password]' => 'password');
  $mech->click({ selector => "form.LoginForm input.EdgeButton" });  # error occurs here

I've tried different ways of submitting the form but without luck.

Can't connect without knowing the port?! 0 at .../Chrome/DevToolsProtocol.pm line 317.

With a running Chrome browser already open, when I run this program (Ubuntu 20.04, perl version 5.30, Chrome Version 89.0.4389.114 (Official Build) (64-bit)) :

use strict;
use warnings;
use Log::Log4perl qw(:easy);
use WWW::Mechanize::Chrome;
Log::Log4perl->easy_init($ERROR);
my $mech = WWW::Mechanize::Chrome->new(
    autoclose => 0,
    tab       => 'current',
);

the constructor WWW::Mechanize::Chrome->new() aborts with

Can't connect without knowing the port?! 0 at /home/hakon/perlbrew/perls/perl-5.30.0/lib/site_perl/5.30.0/Chrome/DevToolsProtocol.pm line 317.

See also this question on stackoverflow.

viewport->size() does not appear to work as described in documentation

On a Mac, High Sierra. Chrome Version 67.0.3396.87 (Official Build) (64-bit).

When viewport->size() is called without any arguments, a Too few arguments for subroutine error is thrown.

When called with arguments (e.g. viewport_size({ height => 768, width => 1024, deviceScaleFactor => 1 })), a "mobile: boolean value expected" error is thrown. However, when a mobile argument is passed with a boolean value, the same error is still thrown.

Awaiting a future while the event loop is running would recurse

I have a Mojolicious application with an action that runs the following helper:

Chrome.pm.txt

When I update my modules (except WWW/Mechanize/Chrome) and comment out the loops as follows:

our @loops = (
    # ['Mojo/IOLoop.pm' => 'Chrome::DevToolsProtocol::Transport::Mojo' ],
    # ['IO/Async.pm'    => 'Chrome::DevToolsProtocol::Transport::NetAsync'],
    # ['AnyEvent.pm'    => 'Chrome::DevToolsProtocol::Transport::AnyEvent'],
    ['AE.pm'          => 'Chrome::DevToolsProtocol::Transport::AnyEvent'],
    # native POE support would be nice

    # The fallback, will always catch due to loading strict (for now)
    ['strict.pm'      => 'Chrome::DevToolsProtocol::Transport::AnyEvent'],
);

All is fine, but when I install 'Future::Mojo' I get the error:

Awaiting a future while the event loop is running would recurse at /home/kes/work/projects/tucha/monkeyman/local/lib/perl5/WWW/Mechanize/Chrome.pm line 703.

I can not understand why WWW::Mechanize::Chrome is using Mojo when loops are commented out (see above)?

 # Mojo/IOLoop.pm' => 'Chrome::DevToolsProtocol::Transport::Mojo

PS. I do not use the latest WWW::Mechanize::Chrome because a call to:

my $pdf  =  $c->html2pdf( $html );

causes the application to fall into an infinite loop =(

Name of Chrome executable not accurate for Debian system

I installed Google Chrome on a Debian system using the procedure at https://www.tecmint.com/install-google-chrome-in-debian-ubuntu-linux-mint/

The name of the Google Chrome executable on my system as a result of following this procedure is chrome-browser-stable not chrome-browser as indicated in the documentation at:

https://metacpan.org/pod/WWW::Mechanize::Chrome#INSTALLING

Also, Debian allows you to install google-chrome-beta and google-chrome-unstable.

I'd be happy to submit a patch to the docs to reflect the fact that the name of the executable may vary by OS flavor and that the user should consult their system specific documentation or system administrator for guidance.

Please let me know if you'd like me to submit a patch for this improvement, and whether you have any suggestions for how it should be documented.
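Because the executable name varies by distribution and release channel, one option is to probe a list of candidate names on the PATH. This is a minimal sketch of that idea: find_browser is a hypothetical helper, and the candidate list is illustrative only and should be adjusted to whatever names your system actually uses.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the full path of the first candidate name that
# exists as an executable file in one of the given directories, or nothing.
sub find_browser {
    my ($names, $dirs) = @_;
    for my $name (@$names) {
        for my $dir (@$dirs) {
            my $path = File::Spec->catfile($dir, $name);
            return $path if -f $path && -x _;
        }
    }
    return;
}

# Illustrative candidate names only; the right list depends on the system.
my @candidates = qw(
    google-chrome-stable google-chrome-beta google-chrome-unstable
    chromium chromium-browser
);

my $found = find_browser(\@candidates, [ File::Spec->path ]);
print defined $found
    ? "Found browser at $found\n"
    : "No candidate browser found on PATH\n";
```

The resulting path could then be handed to the module when launching the browser; check the module's documentation for the exact constructor option before relying on it.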

Test freezes on Win10, strawberry 5.32.1

When cpanm runs the test suite, it freezes on 50-gh63-encode-response-content.t.
I ran cpanm --look and then prove -vl t\50-gh63-encode-response-content.t: it ran the first three tests, then froze. After adding debug prints, I saw it got to the following, where it calls the mech on the http:// URL.


If I change the URL to https://, then the test passes without freezing.

Also freezes in 50-tick.t:

is $mech->selector('#unchecked_1',single => 1)->get_attribute('checked'),undef, "#unchecked_1 is not checked";

Unfortunately, I couldn't find a smoking gun for why that one was freezing.

Neither appears intermittent, and they freeze rather than fail, so I don't think it's the "temporary" timing issue from #65, but maybe I'm wrong on that.

If I edit 50-gh63-... and skip 50-tick.t, all the other tests pass without freezing.

Summary of my perl5 (revision 5 version 32 subversion 1) configuration:

  Platform:
    osname=MSWin32
    osvers=10.0.19042.746
    archname=MSWin32-x64-multi-thread
    uname='Win32 strawberry-perl 5.32.1.1 #1 Sun Jan 24 15:00:15 2021 x64'
    config_args='undef'
    hint=recommended
    useposix=true
    d_sigaction=undef
    useithreads=define
    usemultiplicity=define
    use64bitint=define
    use64bitall=undef
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
    bincompat5005=undef
  Compiler:
    cc='gcc'
    ccflags =' -DWIN32 -DWIN64 -D__USE_MINGW_ANSI_STDIO -DPERL_TEXTMODE_SCRIPTS -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -fwrapv -fno-strict-aliasing -mms-bitfields'
    optimize='-s -O2'
    cppflags='-DWIN32'
    ccversion=''
    gccversion='8.3.0'
    gccosandvers=''
    intsize=4
    longsize=4
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='long long'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='g++.exe'
    ldflags ='-s -L"C:\USR\LOCAL\APPS\STRAWBERRY\perl\lib\CORE" -L"C:\USR\LOCAL\APPS\STRAWBERRY\c\lib"'
    libpth=C:\USR\LOCAL\APPS\STRAWBERRY\c\lib C:\USR\LOCAL\APPS\STRAWBERRY\c\x86_64-w64-mingw32\lib C:\USR\LOCAL\APPS\STRAWBERRY\c\lib\gcc\x86_64-w64-mingw32\8.3.0
    libs= -lmoldname -lkernel32 -luser32 -lgdi32 -lwinspool -lcomdlg32 -ladvapi32 -lshell32 -lole32 -loleaut32 -lnetapi32 -luuid -lws2_32 -lmpr -lwinmm -lversion -lodbc32 -lodbccp32 -lcomctl32
    perllibs= -lmoldname -lkernel32 -luser32 -lgdi32 -lwinspool -lcomdlg32 -ladvapi32 -lshell32 -lole32 -loleaut32 -lnetapi32 -luuid -lws2_32 -lmpr -lwinmm -lversion -lodbc32 -lodbccp32 -lcomctl32
    libc=
    so=dll
    useshrplib=true
    libperl=libperl532.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_win32.xs
    dlext=xs.dll
    d_dlsymun=undef
    ccdlflags=' '
    cccdlflags=' '
    lddlflags='-mdll -s -L"C:\USR\LOCAL\APPS\STRAWBERRY\perl\lib\CORE" -L"C:\USR\LOCAL\APPS\STRAWBERRY\c\lib"'


Characteristics of this binary (from libperl):
  Compile-time options:
    HAS_TIMES
    HAVE_INTERP_INTERN
    MULTIPLICITY
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_IMPLICIT_CONTEXT
    PERL_IMPLICIT_SYS
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    USE_64_BIT_INT
    USE_ITHREADS
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_PERLIO
    USE_PERL_ATOF
  Built under MSWin32
  Compiled at Jan 24 2021 15:05:42
  @INC:
    C:/usr/local/apps/STRAWBERRY/perl/site/lib/MSWin32-x64-multi-thread
    C:/usr/local/apps/STRAWBERRY/perl/site/lib
    C:/usr/local/apps/STRAWBERRY/perl/vendor/lib
    C:/usr/local/apps/STRAWBERRY/perl/lib

Chrome application automation

Some automation tasks within the Chrome application could be exported through a submodule

  • Create tab
  • Close arbitrary tab
  • List open tabs
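For illustration, the three tasks map onto the HTTP endpoints that Chrome exposes when started with --remote-debugging-port. This is a minimal sketch, not a proposal for the submodule's API: tab_endpoint and list_tabs are hypothetical helpers, CHROME_DEBUG_PORT is a hypothetical environment variable, and newer Chrome versions may require a PUT request for the "new" endpoint.

```perl
use strict;
use warnings;
use HTTP::Tiny;
use JSON::PP qw(decode_json);

# Build the URL for one of Chrome's HTTP debugging endpoints.
# $action is 'list', 'new' or 'close'; 'close' takes a tab id.
sub tab_endpoint {
    my ($host, $port, $action, $id) = @_;
    my $url = "http://$host:$port/json/$action";
    $url .= "/$id" if $action eq 'close' && defined $id;
    return $url;
}

# Fetch the open tabs. Returns an array reference of tab descriptions,
# or undef if the debugging port could not be reached.
sub list_tabs {
    my ($host, $port) = @_;
    my $http = HTTP::Tiny->new( timeout => 5 );
    my $res  = $http->get( tab_endpoint($host, $port, 'list') );
    return unless $res->{success};
    return decode_json( $res->{content} );
}

# Only talk to a browser when the (hypothetical) variable is set.
if ( my $port = $ENV{CHROME_DEBUG_PORT} ) {
    my $tabs = list_tabs('127.0.0.1', $port)
        or die "Chrome not reachable on port $port\n";
    printf "%s  %s\n", $_->{id}, $_->{url} for @$tabs;
}
```

Closing an arbitrary tab would then be a GET on tab_endpoint($host, $port, 'close', $id) with an id taken from the list.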

HTTP::Response from ->get does not return content

sub url2pdf {
    my $chrome =  shift->chrome;
    my $res =  $chrome->get( shift );
    DB::x;
    $res->is_success  &&  $res->content   or return;
    return $chrome->content_as_pdf( format => 'A4' );
}


DBG>$res
HTTP::Response {
  _content => ,
  _headers => HTTP::Headers {
    content-length => 4561,
    content-type => text/html;charset=UTF-8,
    date => Mon, 27 May 2019 11:37:49 GMT,
    server => Mojolicious (Perl),
  },
  _msg => OK,
  _rc => 200,
  _request => undef,
}

DBG>$res->content


As you can see, the HTTP::Response object reports a Content-Length of 4561 bytes, but ->content returns nothing. The _content field is also empty.

MacOS fails numerous tests

Most of these look like they can be attributed to file system incompatibilities.

t/00-load.t ...................... ok
t/01-chrome-devtools-protocol.t .. # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130612.382148:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/01-chrome-devtools-protocol.t .. 1/6 # Open tabs
t/01-chrome-devtools-protocol.t .. ok
t/02-chrome-devtools-tab.t ....... # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130615.947746:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/02-chrome-devtools-tab.t ....... 1/4 # HeadlessChrome/67.0.3396.99
# Created new tab CB26CFC78DAF37CFAB7FC95AC499C15C
# Closing tab
[0705/130616.215842:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
t/02-chrome-devtools-tab.t ....... ok
t/47-mech-simplest.t ............. ok
t/49-mech-get-file.t ............. # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130623.089382:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/49-mech-get-file.t ............. 1/14 # Loading /Users/stevedondley/perl/raw_modules/git_repo/WWW-Mechanize-Chrome/t/49-mech-get-file.html
# Loading /Users/stevedondley/perl/raw_modules/git_repo/WWW-Mechanize-Chrome/t/49-mech-get-file.html
t/49-mech-get-file.t ............. ok
t/49-mech-nav.t .................. # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130628.534455:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
No elements found for form number 1 at t/49-mech-nav.t line 43.
Couldn't remove tempfile /var/folders/nc/1y1czqf96736xppxnnkmgkrc0000gn/T/sgTIPLxCID : No such file or directory
# Looks like your test exited with 9 before it could output anything.
t/49-mech-nav.t .................. Dubious, test returned 9 (wstat 2304, 0x900)
Failed 4/4 subtests
t/49-port.t ...................... # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
t/49-port.t ...................... 1/1 # Failed on Chrome version ''
Couldn't remove tempfile /var/folders/nc/1y1czqf96736xppxnnkmgkrc0000gn/T/GgjZhG8d5k : No such file or directory
t/49-port.t ...................... ok
t/50-follow-link.t ............... # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130713.584083:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/50-follow-link.t ............... 3/9 Couldn't remove tempfile /var/folders/nc/1y1czqf96736xppxnnkmgkrc0000gn/T/WyLGZlj7Ri : No such file or directory
t/50-follow-link.t ............... ok
t/50-form-with-fields.t .......... # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130719.191845:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/50-form-with-fields.t .......... 7/8 Couldn't remove tempfile /var/folders/nc/1y1czqf96736xppxnnkmgkrc0000gn/T/MqHeqUAmXP : No such file or directory
t/50-form-with-fields.t .......... ok
t/50-form2.t ..................... # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130724.653484:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/50-form2.t ..................... 6/32 Couldn't remove tempfile /var/folders/nc/1y1czqf96736xppxnnkmgkrc0000gn/T/MISKOHQnUF : No such file or directory
t/50-form2.t ..................... ok
t/50-mech-content.t .............. # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130730.296573:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/50-mech-content.t .............. ok
t/50-mech-ct.t ................... # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130734.846432:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/50-mech-ct.t ................... 1/2
#   Failed test 'Content-type of text/html'
#   at t/50-mech-ct.t line 43.
#          got: undef
#     expected: 'text/html'
Couldn't remove tempfile /var/folders/nc/1y1czqf96736xppxnnkmgkrc0000gn/T/NNRl1HBsvE : No such file or directory
# Looks like you failed 1 test of 2.
t/50-mech-ct.t ................... Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/2 subtests
t/50-mech-encoding.t ............. # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130738.481579:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/50-mech-encoding.t ............. 1/4 # Length of content5375
# Length of content8706
t/50-mech-encoding.t ............. ok
t/50-mech-forms.t ................ # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130743.301739:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/50-mech-forms.t ................ 1/16 Couldn't remove tempfile /var/folders/nc/1y1czqf96736xppxnnkmgkrc0000gn/T/p5SjCzTWNc : No such file or directory
t/50-mech-forms.t ................ ok
t/50-mech-get-nonexistent.t ...... # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130748.219142:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/50-mech-get-nonexistent.t ...... 1/4 Couldn't remove tempfile /var/folders/nc/1y1czqf96736xppxnnkmgkrc0000gn/T/h1jaK1Kj3J : No such file or directory
t/50-mech-get-nonexistent.t ...... ok
t/50-mech-get.t .................. # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130752.921560:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/50-mech-get.t .................. 1/6
#   Failed test 'Navigated to http://127.0.0.1:53873/'
#   at t/50-mech-get.t line 45.
#          got: 'chrome-error://chromewebdata/'
#     expected: 'http://127.0.0.1:53873/'

#   Failed test 'GETting http://127.0.0.1:53873/ returns HTTP code 200 from response'
#   at t/50-mech-get.t line 47.
#          got: '599'
#     expected: '200'
# <html><head></head><body></body></html>

#   Failed test 'GETting http://127.0.0.1:53873/ returns HTTP status 200 from mech'
#   at t/50-mech-get.t line 50.
#          got: '599'
#     expected: '200'
# <html><head></head><body></body></html>

#   Failed test 'We consider this response successful'
#   at t/50-mech-get.t line 53.
Couldn't remove tempfile /var/folders/nc/1y1czqf96736xppxnnkmgkrc0000gn/T/OlJ_ZhCcP8 : No such file or directory
# Looks like you failed 4 tests of 6.
t/50-mech-get.t .................. Dubious, test returned 4 (wstat 1024, 0x400)
Failed 4/6 subtests
t/50-mech-new-dsl.t .............. # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130756.591171:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/50-mech-new-dsl.t .............. ok
t/50-mech-new.t .................. # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130800.261659:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
[0705/130800.478020:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Tabs open in PID 52001: 115
# Releasing mechanize 52001
# Released mechanize
# Listing tabs
t/50-mech-new.t .................. ok
t/50-mech-start-url.t ............ # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130809.343532:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/50-mech-start-url.t ............ 1/2
#   Failed test 'We moved to the start URL instead of about:blank'
#   at t/50-mech-start-url.t line 49.
#          got: 'chrome-error://chromewebdata/'
#     expected: 'http://127.0.0.1:53898/'
Couldn't remove tempfile /var/folders/nc/1y1czqf96736xppxnnkmgkrc0000gn/T/stpnX10B2Q : No such file or directory
# Looks like you failed 1 test of 2.
t/50-mech-start-url.t ............ Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/2 subtests
t/50-mech-status.t ............... # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130812.898237:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/50-mech-status.t ............... ok
t/50-popup.t ..................... # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130817.585703:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/50-popup.t ..................... 1/3 # But we don't know what window was opened
Couldn't remove tempfile /var/folders/nc/1y1czqf96736xppxnnkmgkrc0000gn/T/q4iAt0bt7b : No such file or directory
t/50-popup.t ..................... ok
t/51-mech-form-with-fields.t ..... # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130832.369068:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/51-mech-form-with-fields.t ..... 5/6
#   Failed test 'We didn't crash'
#   at t/51-mech-form-with-fields.t line 71.
#          got: undef
#     expected: '1'
# No elements found for form with fields [baz bar] at t/51-mech-form-with-fields.t line 66.
Couldn't remove tempfile /var/folders/nc/1y1czqf96736xppxnnkmgkrc0000gn/T/qFZxEyQngA : No such file or directory
# Looks like you failed 1 test of 6.
t/51-mech-form-with-fields.t ..... Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/6 subtests
t/51-mech-links.t ................ # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130838.366522:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/51-mech-links.t ................ 1/7 [0705/130838.818931:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
Couldn't remove tempfile /var/folders/nc/1y1czqf96736xppxnnkmgkrc0000gn/T/NpzVSoZgGb : No such file or directory
t/51-mech-links.t ................ ok
t/51-mech-set-content.t .......... # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130842.392041:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/51-mech-set-content.t .......... 1/2
t/51-mech-set-content.t .......... ok
t/51-mech-submit.t ............... # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130846.971441:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/51-mech-submit.t ............... 15/17 Couldn't remove tempfile /var/folders/nc/1y1czqf96736xppxnnkmgkrc0000gn/T/lsb0ZfHtlJ : No such file or directory
t/51-mech-submit.t ............... ok
t/53-mech-capture-js-error.t ..... # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130852.685298:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/53-mech-capture-js-error.t ..... 7/25 # File loaded
Couldn't remove tempfile /var/folders/nc/1y1czqf96736xppxnnkmgkrc0000gn/T/oaeqrYx2Rm : No such file or directory
t/53-mech-capture-js-error.t ..... ok
t/56-render-content.t ............ # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130857.031578:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/56-render-content.t ............ ok
t/58-alert.t ..................... # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130901.541456:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/58-alert.t ..................... ok
t/60-mech-custom-headers.t ....... # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130906.229130:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/60-mech-custom-headers.t ....... 1/20
#   Failed test 'Navigated to http://127.0.0.1:53967/'
#   at t/60-mech-custom-headers.t line 59.
#          got: 'chrome-error://chromewebdata/'
#     expected: 'http://127.0.0.1:53967/'
[0705/130906.556031:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.

#   Failed test 'Navigated to http://127.0.0.1:53967/'
#   at t/60-mech-custom-headers.t line 75.
#          got: 'chrome-error://chromewebdata/'
#     expected: 'http://127.0.0.1:53967/'
No elements found for CSS selector '#request_headers' at /Users/stevedondley/perl5/perlbrew/perls/perl-5.24.1/lib/site_perl/5.24.1/WWW/Mechanize/Plugin/Selector.pm line 37.
Couldn't remove tempfile /var/folders/nc/1y1czqf96736xppxnnkmgkrc0000gn/T/ArkXZjcHZm : No such file or directory
# Looks like you planned 20 tests but ran 5.
# Looks like you failed 2 tests of 5 run.
# Looks like your test exited with 9 just after 5.
t/60-mech-custom-headers.t ....... Dubious, test returned 9 (wstat 2304, 0x900)
Failed 17/20 subtests
t/61-mech-download.t ............. # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130910.234259:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/61-mech-download.t ............. 1/5 [0705/130910.461303:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.

#   Failed test 'The download (always) succeeds'
#   at t/61-mech-download.t line 73.

#   Failed test 'We got a download response'
#   at t/61-mech-download.t line 74.
#                   undef
#     doesn't match '(?^:attachment;)'
t/61-mech-download.t ............. 5/5
#   Failed test 'File 'mytest.txt' was downloaded OK'
#   at t/61-mech-download.t line 77.
Couldn't remove tempfile /var/folders/nc/1y1czqf96736xppxnnkmgkrc0000gn/T/GvgP0Oz7ud : No such file or directory
# Looks like you failed 3 tests of 5.
t/61-mech-download.t ............. Dubious, test returned 3 (wstat 768, 0x300)
Failed 3/5 subtests
t/61-screencast.t ................ # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130916.143085:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/61-screencast.t ................ 1/4 No elements found for form number 1 at t/61-screencast.t line 67.
Couldn't remove tempfile /var/folders/nc/1y1czqf96736xppxnnkmgkrc0000gn/T/kgfH1ZWGJn : No such file or directory
# Looks like you planned 4 tests but ran 2.
# Looks like your test exited with 9 just after 2.
t/61-screencast.t ................ Dubious, test returned 9 (wstat 2304, 0x900)
Failed 2/4 subtests
t/62-networkstatus.t ............. # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130918.814661:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
[0705/130919.029678:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
t/62-networkstatus.t ............. ok
t/62-viewport-size.t ............. # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130923.900070:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
[0705/130924.122103:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
[0705/130924.347009:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
t/62-viewport-size.t ............. 1/6 Couldn't remove tempfile /var/folders/nc/1y1czqf96736xppxnnkmgkrc0000gn/T/IU_ITB0Uvj : No such file or directory
t/62-viewport-size.t ............. ok
t/65-is_visible.t ................ # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130928.381545:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/65-is_visible.t ................ ok
t/65-save-content.t .............. # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/130959.948191:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
[0705/131000.155959:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
t/65-save-content.t .............. 1/3
#   Failed test 'Top HTML file exists'
#   at t/65-save-content.t line 70.

#   Failed test 'We save the URL under the top HTML filename'
#   at t/65-save-content.t line 71.
#          got: undef
#     expected: '/var/folders/nc/1y1czqf96736xppxnnkmgkrc0000gn/T/39MkmcOoHc/test page.html'
# $VAR1 = {
#           'chrome-error://chromewebdata/' => '/var/folders/nc/1y1czqf96736xppxnnkmgkrc0000gn/T/39MkmcOoHc/test page.html'
#         };
Couldn't remove tempfile /var/folders/nc/1y1czqf96736xppxnnkmgkrc0000gn/T/mEKmvHtsXl : No such file or directory
# Looks like you failed 2 tests of 3.
t/65-save-content.t .............. Dubious, test returned 2 (wstat 512, 0x200)
Failed 2/3 subtests
t/65-wait_until_visible.t ........ # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/131003.786155:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/65-wait_until_visible.t ........ ok
t/70-mech-png.t .................. # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/131010.930796:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/70-mech-png.t .................. ok
t/75-classnames.t ................ # Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0705/131016.352740:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
# Using Chrome version 'HeadlessChrome/67.0.3396.99'
t/75-classnames.t ................ ok

Test Summary Report
-------------------
t/49-mech-nav.t                (Wstat: 2304 Tests: 0 Failed: 0)
  Non-zero exit status: 9
  Parse errors: Bad plan.  You planned 4 tests but ran 0.
t/50-mech-ct.t                 (Wstat: 256 Tests: 2 Failed: 1)
  Failed test:  2
  Non-zero exit status: 1
t/50-mech-get.t                (Wstat: 1024 Tests: 6 Failed: 4)
  Failed tests:  3-6
  Non-zero exit status: 4
t/50-mech-start-url.t          (Wstat: 256 Tests: 2 Failed: 1)
  Failed test:  2
  Non-zero exit status: 1
t/51-mech-form-with-fields.t   (Wstat: 256 Tests: 6 Failed: 1)
  Failed test:  5
  Non-zero exit status: 1
t/60-mech-custom-headers.t     (Wstat: 2304 Tests: 5 Failed: 2)
  Failed tests:  3, 5
  Non-zero exit status: 9
  Parse errors: Bad plan.  You planned 20 tests but ran 5.
t/61-mech-download.t           (Wstat: 768 Tests: 5 Failed: 3)
  Failed tests:  3-5
  Non-zero exit status: 3
t/61-screencast.t              (Wstat: 2304 Tests: 2 Failed: 0)
  Non-zero exit status: 9
  Parse errors: Bad plan.  You planned 4 tests but ran 2.
t/65-save-content.t            (Wstat: 512 Tests: 3 Failed: 2)
  Failed tests:  2-3
  Non-zero exit status: 2
Files=38, Tests=320, 248 wallclock secs ( 0.15 usr  0.08 sys + 17.05 cusr  5.94 csys = 23.22 CPU)
Result: FAIL
Failed 9/38 test programs. 14/320 subtests failed.
make: *** [test_dynamic] Error 255

Could not read websocket endpoint from Chrome output because of 10 lines limit in sub read_devtools_url

I was getting the error "Could not read websocket endpoint from Chrome output. Do you maybe have a non-debug instance of Chrome already running?" at line 1052 of Chrome.pm.

I noticed that sub read_devtools_url makes at most 10 line-read attempts, and I found that I was getting exactly 10 lines of "irrelevant" warnings from Chromium (related to GTK, plus one related to Chrome's Cloud management controller), regardless of whether I was running Chromium from the module or directly. The module therefore stopped looking for the websocket endpoint in Chromium's output before it had read enough lines to find it.

I solved it by increasing the number of line-read attempts from 10 to 20, changing:
sub read_devtools_url( $self, $fh, $lines = 10 ) {
to
sub read_devtools_url( $self, $fh, $lines = 20 ) {
but I guess that a much higher number or a different approach would probably be better.
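One "different approach" would be to scan for the endpoint line itself instead of counting on it appearing within a fixed number of lines. This is a minimal sketch under stated assumptions: scan_for_devtools_url is a hypothetical stand-in (not the module's read_devtools_url), the cap of 200 is arbitrary, and the Chromium output is simulated with an in-memory filehandle.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a fixed 10-line read: scan until the endpoint
# line appears, giving up only at end of output or after a generous cap.
sub scan_for_devtools_url {
    my ($fh, $max_lines) = @_;
    $max_lines //= 200;
    for (1 .. $max_lines) {
        my $line = <$fh>;
        last unless defined $line;
        return $1 if $line =~ m{DevTools listening on (ws://\S+)};
    }
    return;
}

# Simulated Chromium output: 12 unrelated warnings, then the endpoint line.
my $output = join '', map { "Gtk-WARNING: unrelated message $_\n" } 1 .. 12;
$output .= "DevTools listening on ws://127.0.0.1:9222/devtools/browser/example-id\n";

open my $fh, '<', \$output or die "Cannot open in-memory handle: $!";
my $url = scan_for_devtools_url($fh);
print defined $url ? "$url\n" : "endpoint not found\n";
```

With a real browser the read happens on a pipe and can block, so a production version would also want a timeout rather than relying on the line cap alone.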

Warning: Use of uninitialized value

Sometimes I get this warning:

Use of uninitialized value $event in hash element at /home/kes/work/projects/tucha//local/lib/perl5/Chrome/DevToolsProtocol.pm line 235 during global destruction.

lines are: 234, 235, 237

Version 0.34 failing to install on Mac

Build log:

cpanm (App::cpanminus) 1.7044 on perl 5.024004 built for darwin-2level
Work directory is /Users/stevedondley/.cpanm/work/1564420241.85926
You have make /usr/bin/make
You have LWP 6.31
You have /usr/bin/tar: bsdtar 2.8.3 - libarchive 2.8.3
You have /usr/bin/unzip
Searching WWW::Mechanize::Chrome () on cpanmetadb ...
--> Working on WWW::Mechanize::Chrome
Fetching http://www.cpan.org/authors/id/C/CO/CORION/WWW-Mechanize-Chrome-0.34.tar.gz
-> OK
Unpacking WWW-Mechanize-Chrome-0.34.tar.gz
Entering WWW-Mechanize-Chrome-0.34
Checking configure dependencies from META.json
Checking if you have ExtUtils::MakeMaker 6.58 ... Yes (7.34)
Configuring WWW-Mechanize-Chrome-0.34
Running Makefile.PL
(Re)Creating lib/WWW/Mechanize/Chrome/Examples.pm
Checking if your kit is complete...
Looks good
Generating a Unix-style Makefile
Writing Makefile for WWW::Mechanize::Chrome
Writing MYMETA.yml and MYMETA.json
-> OK
Checking dependencies from MYMETA.json ...
Checking if you have HTTP::Cookies 0 ... Yes (6.04)
Checking if you have Filter::signatures 0.09 ... Yes (0.15)
Checking if you have Test::Deep 0 ... Yes (1.128)
Checking if you have File::Spec 0 ... Yes (3.75)
Checking if you have Algorithm::Loops 0 ... Yes (1.032)
Checking if you have Imager 0 ... Yes (1.009)
Checking if you have Storable 0 ... Yes (3.11)
Checking if you have Test::More 0 ... Yes (1.302156)
Checking if you have Imager::File::PNG 0 ... Yes (0.94)
Checking if you have Log::Log4perl 0 ... Yes (1.49)
Checking if you have Future 0.35 ... Yes (0.39)
Checking if you have File::Basename 0 ... Yes (2.85)
Checking if you have JSON 0 ... Yes (4.00)
Checking if you have Data::Dumper 0 ... Yes (2.173)
Checking if you have AnyEvent::Future 0 ... Yes (0.03)
Checking if you have ExtUtils::MakeMaker 5.52_01 ... Yes (7.34)
Checking if you have Try::Tiny 0 ... Yes (0.30)
Checking if you have Scalar::Util 0 ... Yes (1.50)
Checking if you have Moo 2 ... Yes (2.003004)
Checking if you have IO::Socket::INET 0 ... Yes (1.35)
Checking if you have Carp 0 ... Yes (1.50)
Checking if you have Object::Import 0 ... Yes (1.005)
Checking if you have AnyEvent 0 ... Yes (7.14)
Checking if you have AnyEvent::WebSocket::Client 0 ... Yes (0.50)
Checking if you have URI::file 0 ... Yes (4.21)
Checking if you have HTML::Selector::XPath 0 ... Yes (0.25)
Checking if you have WWW::Mechanize::Link 0 ... Yes (1.91)
Checking if you have Future::HTTP 0.06 ... Yes (0.12)
Checking if you have Image::Info 0 ... Yes (1.41)
Checking if you have URI 0 ... Yes (1.76)
Checking if you have Exporter 5 ... Yes (5.72)
Checking if you have MIME::Base64 0 ... Yes (3.15)
Checking if you have HTTP::Headers 0 ... Yes (6.18)
Checking if you have Test::HTTP::LocalServer 0.61 ... Yes (0.64)
Checking if you have POSIX 0 ... Yes (1.65_01)
Checking if you have HTTP::Response 0 ... Yes (6.18)
Building and testing WWW-Mechanize-Chrome-0.34
cp lib/WWW/Mechanize/Chrome/Troubleshooting.pm blib/lib/WWW/Mechanize/Chrome/Troubleshooting.pm
cp lib/Chrome/DevToolsProtocol/Transport/Mojo.pm blib/lib/Chrome/DevToolsProtocol/Transport/Mojo.pm
cp lib/HTTP/Cookies/ChromeDevTools.pm blib/lib/HTTP/Cookies/ChromeDevTools.pm
cp lib/Chrome/DevToolsProtocol/Transport/AnyEvent.pm blib/lib/Chrome/DevToolsProtocol/Transport/AnyEvent.pm
cp lib/Chrome/DevToolsProtocol.pm blib/lib/Chrome/DevToolsProtocol.pm
cp lib/WWW/Mechanize/Chrome/Cookbook.pm blib/lib/WWW/Mechanize/Chrome/Cookbook.pm
cp lib/WWW/Mechanize/Chrome.pm blib/lib/WWW/Mechanize/Chrome.pm
cp lib/WWW/Mechanize/Chrome/Examples.pm blib/lib/WWW/Mechanize/Chrome/Examples.pm
cp lib/Chrome/DevToolsProtocol/Transport.pm blib/lib/Chrome/DevToolsProtocol/Transport.pm
cp lib/WWW/Mechanize/Chrome/Contributing.pod blib/lib/WWW/Mechanize/Chrome/Contributing.pod
cp lib/WWW/Mechanize/Chrome/DSL.pm blib/lib/WWW/Mechanize/Chrome/DSL.pm
cp lib/Chrome/DevToolsProtocol/Transport/NetAsync.pm blib/lib/Chrome/DevToolsProtocol/Transport/NetAsync.pm
cp lib/WWW/Mechanize/Chrome/Node.pm blib/lib/WWW/Mechanize/Chrome/Node.pm
cp lib/WWW/Mechanize/Chrome/Install.pod blib/lib/WWW/Mechanize/Chrome/Install.pod
Manifying 14 pod documents
PERL_DL_NONLAZY=1 "/Users/stevedondley/perl5/perlbrew/perls/perl-5.24.1/bin/perl" "-MExtUtils::Command::MM" "-MTest::Harness" "-e" "undef *Test::Harness::Switches; test_harness(0, 'blib/lib', 'blib/arch')" t/00-load.t t/01-chrome-devtools-protocol.t t/02-chrome-devtools-tab.t t/03-listener-leak-test.t t/47-mech-simplest.t t/49-launch.t t/49-mech-get-file.t t/49-mech-nav.t t/49-port.t t/50-follow-link.t t/50-form-with-fields.t t/50-form2.t t/50-mech-content.t t/50-mech-ct.t t/50-mech-encoding.t t/50-mech-eval.t t/50-mech-forms.t t/50-mech-get-nonexistent.t t/50-mech-get.t t/50-mech-new-dsl.t t/50-mech-new.t t/50-mech-redirect.t t/50-mech-start-url.t t/50-mech-status.t t/50-popup.t t/51-mech-form-with-fields.t t/51-mech-links.t t/51-mech-set-content.t t/51-mech-submit.t t/53-mech-capture-js-error.t t/56-render-content.t t/58-alert.t t/60-mech-cookies.t t/60-mech-custom-headers.t t/61-mech-download.t t/61-screencast.t t/62-networkstatus.t t/62-viewport-size.t t/65-is_visible.t t/65-save-content.t t/65-wait_until_visible.t t/70-mech-png.t t/75-classnames.t t/76-scroll.t t/77-reload-fragment.t t/78-memleak.t
# Testing WWW::Mechanize::Chrome 0.34, Perl 5.024004
# B 1.62
# Carp 1.50
# Carp::Heavy 1.50
# Chrome::DevToolsProtocol 0.34
# Chrome::DevToolsProtocol::EventListener <unknown>
# Chrome::DevToolsProtocol::Transport 0.34
# Class::Method::Modifiers 2.12
# Class::XSAccessor 1.19
# Class::XSAccessor::Heavy 1.19
# Config 5.024004
# Cwd 3.75
# Data::Dumper 2.173
# Devel::GlobalDestruction 0.14
# DynaLoader 1.38
# Errno 1.25
# Exporter 5.72
# Exporter::Heavy 5.72
# Fcntl 1.13
# File::Basename 2.85
# File::Spec 3.75
# File::Spec::Unix 3.75
# Filter::Simple 0.92
# Filter::Util::Call 1.55
# Filter::signatures 0.15
# Future 0.39
# Future::HTTP 0.12
# HTML::Selector::XPath 0.25
# HTTP::Cookies 6.04
# HTTP::Cookies::ChromeDevTools 0.34
# HTTP::Cookies::Netscape 6.04
# HTTP::Date 6.02
# HTTP::Headers 6.18
# HTTP::Headers::Util 6.18
# HTTP::Message 6.18
# HTTP::Response 6.18
# HTTP::Status 6.18
# IO 1.36_01
# IO::Handle 1.36
# IO::Socket 1.38
# IO::Socket::INET 1.35
# IO::Socket::UNIX 1.26
# JSON 4.00
# JSON::XS 4.0
# List::Util 1.5
# Log::Log4perl 1.49
# Log::Log4perl::Appender <unknown>
# Log::Log4perl::Appender::String <unknown>
# Log::Log4perl::Config <unknown>
# Log::Log4perl::Config::BaseConfigurator <unknown>
# Log::Log4perl::Config::PropertyConfigurator <unknown>
# Log::Log4perl::Config::Watch <unknown>
# Log::Log4perl::DateFormat <unknown>
# Log::Log4perl::Filter <unknown>
# Log::Log4perl::Filter::Boolean <unknown>
# Log::Log4perl::JavaMap <unknown>
# Log::Log4perl::Layout <unknown>
# Log::Log4perl::Layout::PatternLayout <unknown>
# Log::Log4perl::Layout::PatternLayout::Multiline <unknown>
# Log::Log4perl::Layout::SimpleLayout <unknown>
# Log::Log4perl::Level <unknown>
# Log::Log4perl::Logger <unknown>
# Log::Log4perl::MDC <unknown>
# Log::Log4perl::NDC <unknown>
# Log::Log4perl::Util <unknown>
# Log::Log4perl::Util::TimeTracker <unknown>
# MIME::Base64 3.15
# Method::Generate::Accessor <unknown>
# Method::Generate::Constructor <unknown>
# Module::Runtime 0.016
# Moo 2.003004
# Moo::Object <unknown>
# Moo::_Utils <unknown>
# Moo::_mro <unknown>
# Moo::_strictures <unknown>
# Moo::sification <unknown>
# POSIX 1.65_01
# PerlIO 1.09
# Scalar::Util 1.5
# SelectSaver 1.02
# SelfLoader 1.23
# Socket 2.020_03
# Storable 3.11
# Storable::Limit <unknown>
# Sub::Defer 2.005001
# Sub::Exporter::Progressive 0.001013
# Sub::Quote 2.005001
# Sub::Util 1.5
# Symbol 1.07
# Sys::Hostname 1.20
# Test::Builder 1.302156
# Test::Builder::Formatter 1.302156
# Test::Builder::Module 1.302156
# Test::Builder::TodoDiag 1.302156
# Test::More 1.302156
# Test2::API 1.302156
# Test2::API::Context 1.302156
# Test2::API::Instance 1.302156
# Test2::API::Stack 1.302156
# Test2::Event 1.302156
# Test2::Event::Bail 1.302156
# Test2::Event::Diag 1.302156
# Test2::Event::Exception 1.302156
# Test2::Event::Fail 1.302156
# Test2::Event::Note 1.302156
# Test2::Event::Ok 1.302156
# Test2::Event::Pass 1.302156
# Test2::Event::Plan 1.302156
# Test2::Event::Skip 1.302156
# Test2::Event::Subtest 1.302156
# Test2::Event::V2 1.302156
# Test2::Event::Waiting 1.302156
# Test2::EventFacet 1.302156
# Test2::EventFacet::About 1.302156
# Test2::EventFacet::Amnesty 1.302156
# Test2::EventFacet::Assert 1.302156
# Test2::EventFacet::Control 1.302156
# Test2::EventFacet::Error 1.302156
# Test2::EventFacet::Hub 1.302156
# Test2::EventFacet::Info 1.302156
# Test2::EventFacet::Meta 1.302156
# Test2::EventFacet::Parent 1.302156
# Test2::EventFacet::Plan 1.302156
# Test2::EventFacet::Trace 1.302156
# Test2::Formatter 1.302156
# Test2::Formatter::TAP 1.302156
# Test2::Hub 1.302156
# Test2::Hub::Interceptor 1.302156
# Test2::Hub::Interceptor::Terminator 1.302156
# Test2::Hub::Subtest 1.302156
# Test2::Util 1.302156
# Test2::Util::ExternalMeta 1.302156
# Test2::Util::Facets2Legacy 1.302156
# Test2::Util::HashBase 1.302156
# Test2::Util::Trace 1.302156
# Text::Balanced 2.03
# Tie::Hash 1.05
# Time::HiRes 1.9741
# Time::Local 1.2300
# Try::Tiny 0.30
# Types::Serialiser 1.0
# URI 1.76
# URI::Escape 3.31
# WWW::Mechanize::Chrome 0.34
# WWW::Mechanize::Chrome::Node 0.34
# WWW::Mechanize::Link 1.91
# XSLoader 0.22
# attributes 0.27
# base 2.2301
# bytes 1.05
# common::sense 3.74
# constant 1.33
# feature 1.42
# mro 1.18
# overload 1.26
# overloading 0.02
# re 0.32
# strict 1.11
# vars 1.03
# warnings 1.36
# warnings::register 1.04
t/00-load.t ...................... ok
# Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0729/131048.295550:ERROR:browser_process_sub_thread.cc(221)] Waited 3 ms for network service
# Using Chrome version 'HeadlessChrome/75.0.3770.142'
t/01-chrome-devtools-protocol.t .. ok
# Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0729/131052.223738:ERROR:browser_process_sub_thread.cc(221)] Waited 5 ms for network service
# Using Chrome version 'HeadlessChrome/75.0.3770.142'
# HeadlessChrome/75.0.3770.142
# Created new tab 55631AE15314953F03907DD9A54D1757
# Closing tab
[0729/131052.751609:ERROR:browser_process_sub_thread.cc(221)] Waited 5 ms for network service
t/02-chrome-devtools-tab.t ....... ok
# Testing with /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
[0729/131058.697720:ERROR:browser_process_sub_thread.cc(221)] Waited 3 ms for network service
# Using Chrome version 'HeadlessChrome/75.0.3770.142'
-> FAIL Timed out (> 1800s). Use --verbose to retry.
make: *** [test_dynamic] Terminated: 15
-> FAIL Installing WWW::Mechanize::Chrome failed. See /Users/stevedondley/.cpanm/work/1564420241.85926/build.log for details. Retry with --force to force install it.
Expiring 3 work directories.

Use of @_ in numeric eq (==) with signatured subroutine is experimental

Use of @_ in numeric eq (==) with signatured subroutine is experimental at local/lib/perl5/WWW/Mechanize/Chrome.pm line 818.
Use of @_ in array element with signatured subroutine is experimental at local/lib/perl5/WWW/Mechanize/Chrome.pm line 818.
Use of @_ in array element with signatured subroutine is experimental at local/lib/perl5/WWW/Mechanize/Chrome.pm line 1981.
Use of @_ in numeric eq (==) with signatured subroutine is experimental at local/lib/perl5/WWW/Mechanize/Chrome.pm line 5488.
Use of @_ in list assignment with signatured subroutine is experimental at local/lib/perl5/WWW/Mechanize/Chrome.pm line 5488.
Use of @_ in list assignment with signatured subroutine is experimental at local/lib/perl5/WWW/Mechanize/Chrome.pm line 5491.
Use of @_ in numeric eq (==) with signatured subroutine is experimental at local/lib/perl5/WWW/Mechanize/Chrome.pm line 5598.
Use of @_ in list assignment with signatured subroutine is experimental at local/lib/perl5/WWW/Mechanize/Chrome.pm line 5598.
Use of @_ in list assignment with signatured subroutine is experimental at local/lib/perl5/WWW/Mechanize/Chrome.pm line 5601.

Perl version: 5.36.0
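
Until a release containing a fix is installed, one possible stopgap is to filter these messages with a `__WARN__` handler that is in place before the module is compiled. The following is only a sketch: `is_signature_warning` and `make_warning_filter` are hypothetical helper names, not part of WWW::Mechanize::Chrome, and the pattern is derived from the messages quoted above.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the message is the Perl 5.36
# "Use of @_ ... with signatured subroutine is experimental" warning.
sub is_signature_warning {
    my ($message) = @_;
    return $message =~ /^Use of \@_ in .+ with signatured subroutine is experimental/ ? 1 : 0;
}

# Hypothetical helper: build a __WARN__ handler that drops the warnings
# matched above and forwards everything else to $forward (default: warn).
sub make_warning_filter {
    my ($forward) = @_;
    $forward ||= sub { warn @_ };
    return sub {
        my ($message) = @_;
        return if is_signature_warning($message);
        $forward->($message);
    };
}
```

To use it, the two subs would be defined first, followed by `BEGIN { $SIG{__WARN__} = make_warning_filter() }` and only then `use WWW::Mechanize::Chrome;`, so that the handler is active while the module compiles. All other warnings still pass through unchanged.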

Chrome/Chromium v97 closes CDP websocket while setting up a new instance

Launching browser at scripts/vidconf-jitsi.pl line 64.
2022/01/20 17:38:50 Spawning for websocket $VAR1 = [
          '/bin/chromium',
          '--remote-debugging-port=0',
          '--remote-debugging-address=127.0.0.1',
          '--user-data-dir=/tmp/0cwJVdJ9YF',
          '--profile-directory=/tmp/0cwJVdJ9YF/profile/test1',
          '--enable-automation',
          '--no-sandbox',
          '--safebrowsing-disable-auto-update',
          '--disable-background-networking',
          '--disable-breakpad',
          '--disable-client-side-phishing-detection',
          '--disable-component-update',
          '--disable-hang-monitor',
          '--disable-prompt-on-repost',
          '--disable-sync',
          '--disable-translate',
          '--disable-web-resources',
          '--disable-default-apps',
          '--disable-infobars',
          '--disable-popup-blocking',
          '--disable-gpu',
          '--disable-save-password-bubble',
          'about:blank'
        ];
2022/01/20 17:38:50 Spawned child as 1814216, communicating via websocket
2022/01/20 17:38:50 [[[1814257:1814257:0120/173850.697653:ERROR:gpu_init.cc(457)] Passthrough is not supported, GL is disabled, ANGLE is]]
2022/01/20 17:38:50 [[[1814216:1814216:0120/173850.699659:ERROR:profile_manager.cc(888)] Cannot create profile at path /tmp/0cwJVdJ9YF//tmp/0cwJVdJ9YF/profile/test1]]
2022/01/20 17:38:50 [[[1814216:1814216:0120/173850.699691:ERROR:profile_manager.cc(1759)] Cannot create profile at path /tmp/0cwJVdJ9YF//tmp/0cwJVdJ9YF/profile/test1]]
2022/01/20 17:38:50 [[]]
2022/01/20 17:38:50 [[DevTools listening on ws://127.0.0.1:43339/devtools/browser/8010f652-877a-413f-966d-6fbf70f0ae46]]
2022/01/20 17:38:50 Found ws endpoint from child output as 'ws://127.0.0.1:43339/devtools/browser/8010f652-877a-413f-966d-6fbf70f0ae46'
2022/01/20 17:38:50 Using endpoint ws://127.0.0.1:43339/devtools/browser/8010f652-877a-413f-966d-6fbf70f0ae46
2022/01/20 17:38:50 Connecting to ws://127.0.0.1:43339/devtools/browser/8010f652-877a-413f-966d-6fbf70f0ae46
2022/01/20 17:38:50 Connected to ws://127.0.0.1:43339/devtools/browser/8010f652-877a-413f-966d-6fbf70f0ae46
2022/01/20 17:38:50 Sent 'Target.getTargets' message $VAR1 = '{"method":"Target.getTargets","params":{},"id":1}';
2022/01/20 17:38:50 Replying to 1 $VAR1 = {
          'result' => {
                        'targetInfos' => []
                      },
          'id' => 1
        };
2022/01/20 17:38:50 Sent 'Target.createTarget' message $VAR1 = '{"params":{"url":"about:blank"},"method":"Target.createTarget","id":2}';
2022/01/20 17:38:50 Connection closed

Obviously, the connection should not be closed. It seems to be unlikely that this is a heartbeat issue. Further investigation is needed.

Setting form fields does not seem to send pre and post events

$mech->set_fields calls $mech->do_set_fields which in turn calls $mech->get_set_value.
$mech->get_set_value doesn't seem to send the pre and post events. In fact, there are only comments saying they will be sent.

I need to send an "input" event for these and after the form is filled send an "ngSubmit" event to the form.

I could do this with WWW::Mechanize::Firefox but the ability to send events in the code seems to have vanished.

Was this intentional?

Browser hangs when click() method called on a javascript link

Let's say you have a link that when clicked triggers an ajax call to load more content into the page. If the click() method is used to click the link, the browser hangs because it never receives an HTTP::Response back. I've hacked together a fix for my own purposes but it would be better to make a proper fix. I'm looking for guidance on what that fix would be. Here's my temporary hack which modifies the _mightNavigate call found at the end of the click method:

    my $response = $x
        ? $self->_mightNavigate( sub {
              $self->driver->send_message('Runtime.callFunctionOn', objectId => $id, functionDeclaration => 'function() { this.click(); }', arguments => []);
          }, %options)
        : $self->_mightNavigate( sub {
              $self->driver->send_message('Runtime.callFunctionOn', objectId => $id, functionDeclaration => 'function() { this.click(); }', arguments => []);
          }, %options)->get;

The way it works is if I pass in a value for $x to the method, it will call _mightNavigate without the get method.

form_with_fields versus _field_by_name (inconsistent results)

I'm attempting to load, then fill in, the amazon.co.uk signin page .. I get as far as "wait_until_visible" (the email input field is loaded by js.. why, who knows).. but then get stuck.

It seems the various xpath fetches don't all do the same things (is my guess):

  • form_with_fields('email','password') - this one works / no errors, selects the signIn form, as expected

  • _field_by_name(name => 'email', ..) as called by get_set_value (and ultimately from submit_form(with_fields => {email => '...'})) - this one doesn't; we get No elements found for input with name 'email'

  • xpath 1 (works):
    //form[.//*[(local-name(.)="input" or local-name(.)="select" or local-name(.)="textarea") and @name="email"] and .//*[(local-name(.)="input" or local-name(.)="select" or local-name(.)="textarea") and @name="password"]]

  • xpath 2 (fails):
    .//*[(local-name(.)="input" or local-name(.)="select" or local-name(.)="textarea") and @name="email"]

Any ideas?

Could not find node with given id

I am having trouble with some websites, getting this error when getting almost any information:

Could not find node with given id

-32000 at /home/cmadams/perl5/lib/perl5/Chrome/DevToolsProtocol/Target.pm line 504

It seems somewhat hit or miss (so maybe some kind of timing issue/race condition?) and varies based on the site. For example, with Amazon, adding a sleep allows $mech->base to work, but still $mech->links fails.

Here's my test code:

#!/usr/bin/perl

use Modern::Perl;
use WWW::Mechanize::Chrome;
use Log::Log4perl qw(:easy);

Log::Log4perl->easy_init ($WARN);
my $mech = WWW::Mechanize::Chrome->new (
    launch_exe => "/usr/bin/chromium-browser",
);

eval {
        $mech->get ("https://amazon.com/");
        $mech->sleep (1);
        print $mech->base, "\n";
        print $_->text . " -> " . $_->url . "\n" for $mech->links;
};
print $@ if ($@);
$mech->target->send_message ("Browser.close")->get;
sleep (1);

->update_html halts

I have the following Mojolicious helper:

sub html2pdf {
	my $chrome =  shift->chrome;
	my $html =  shift;
	$chrome->update_html( $html );
	return $chrome->content_as_pdf( format => 'A4' );
}

It halts execution of the script, but if I pass a stringified copy to the update_html method, it works fine:

$chrome->update_html( "$html" );

How to reproduce:

$html = '<b>Hi</b>';
$chrome->update_html( "$html" ); # works
$chrome->update_html(  $html  ); # halted

Referer header not working

The add_header method does not work with the Referer header. I also tested it with User-Agent, and that works fine.

$mech->add_header( Referer => $ref );

The reload method also crashes when the ignoreCache option is given. It works fine without the ignoreCache option.

$mech->reload( ignoreCache => 1 );

Throws:

Uncaught exception from user code:
        Invalid parameters
        ignoreCache: boolean value expected
        -32602 at ... Chrome/DevToolsProtocol/Target.pm line 460
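
The "boolean value expected" message suggests that the value reaches Chrome as the JSON number 1 rather than the JSON boolean true. The sketch below only demonstrates that difference with the core JSON::PP module; whether reload passes the option through unchanged in a given release is an assumption, not something confirmed here.

```perl
use strict;
use warnings;
use JSON::PP ();

# canonical() only sorts keys so the output is stable.
my $json = JSON::PP->new->canonical;

# A plain Perl 1 is serialised as a JSON number ...
my $as_number = $json->encode({ ignoreCache => 1 });

# ... while JSON::PP::true() is serialised as a JSON boolean.
my $as_boolean = $json->encode({ ignoreCache => JSON::PP::true() });

print "$as_number\n";    # {"ignoreCache":1}
print "$as_boolean\n";   # {"ignoreCache":true}
```

If the option is indeed forwarded as-is, calling `$mech->reload( ignoreCache => JSON::PP::true() )` may be worth trying as a workaround.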

Test code attached comparing WWW::Mechanize and WWW::Mechanize::Chrome results.
test.zip

Can't locate object method "new" via package "Chrome::DevToolsProtocol::Transport::NetAsync"

After upgrading WWW-Mechanize-Chrome from 0.28 to 0.40 I get the error:

Can't locate object method "new" via package "Chrome::DevToolsProtocol::Transport::NetAsync" (perhaps you forgot to load "Chrome::DevToolsProtocol::Transport::NetAsync"?) at /home/kes/work/projects/tucha/monkeyman/local/lib/perl5/Chrome/DevToolsProtocol/Transport.pm line 44

_

  1. If there are some implementation running
  2. default is not tried
  3. But if reloading failed (if it is already in the memory (@inc) why you try to require it again?)
  4. the default is returned (without trying to load it)

Cookie support not working

Is cookie support working? I tried ->set_cookie and ->load and both throw errors. The same code with HTTP::Cookies is working fine.

Am I doing something wrong? Test code attached.
test.zip

Sleep statements needed in code to avoid errors

I've got chromium installed on a Debian 10 box with 8 GB of ram. I will frequently, but not always, see errors unless I place sleep statements in the code like so:

#!/usr/bin/perl
use File::Temp 'tempdir';
use Time::HiRes 'usleep';
use Log::Log4perl qw(:easy);
use WWW::Mechanize::Chrome;

Log::Log4perl->easy_init($ERROR);
my $mech = WWW::Mechanize::Chrome->new(sync => 1, launch_exe => '/usr/bin/chromium', headless => 1, incognito => 1, data_directory => tempdir( CLEANUP => 1 ), profile => '/home/admin');
$mech->get('http://exmaple.com/wp-login.php');

usleep 500000;
$mech->submit_form( with_fields => { log => 'user', pwd => 'pass' } );

usleep 500000;
print $mech->content;

Here's an example of the errors I'll get without the sleep statements:

Bad luck: Node with nodeId 0 found. Info for this one cannot be retrieved at /usr/local/share/perl/5.28.1/WWW/Mechanize/Chrome.pm line 4026.
No node with given id found

-32000 at /usr/local/share/perl/5.28.1/Chrome/DevToolsProtocol/Target.pm line 502
Node 0 has gone away in the meantime, could not resolve at /usr/local/share/perl/5.28.1/WWW/Mechanize/Chrome/Node.pm line 206.
Can't call method "send_message" on an undefined value at /usr/local/share/perl/5.28.1/WWW/Mechanize/Chrome/Node.pm line 95.

When this error occurs, the HTML does not get printed to the screen at all as expected.

A less frequent error I see is this one:

No search session with given id found

-32000 at /usr/local/share/perl/5.28.1/Chrome/DevToolsProtocol/Target.pm line 502

Other times, the script will run but will print this error before printing the content of the page:

No node with given id found

-32000 at /usr/local/share/perl/5.28.1/Chrome/DevToolsProtocol/Target.pm line 502
Node 157 has gone away in the meantime, could not resolve at /usr/local/share/perl/5.28.1/WWW/Mechanize/Chrome/Node.pm line 206.

Are there any settings I can use to avoid these errors? I'd prefer not to have to litter my code with sleep statements.
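
One way to keep fixed sleeps out of the main flow is to retry a step only when it actually fails. This is a minimal sketch, assuming the failing call raises an exception (some of the messages above may be plain warnings, which this does not catch); `retry` is a hypothetical helper, not a WWW::Mechanize::Chrome method.

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep);

# Hypothetical helper: run $code and return its result. If it dies,
# wait $delay seconds and try again, up to $tries attempts in total.
# The last error is rethrown when every attempt fails.
sub retry {
    my ($code, %opt) = @_;
    my $tries = $opt{tries} || 5;
    my $delay = defined $opt{delay} ? $opt{delay} : 0.2;
    my $error;
    for my $attempt (1 .. $tries) {
        my @result;
        return wantarray ? @result : $result[0]
            if eval { @result = $code->(); 1 };
        $error = $@;
        sleep $delay if $attempt < $tries && $delay > 0;
    }
    die $error;
}
```

With the script above, the call might look like `my $html = retry(sub { $mech->content }, tries => 10, delay => 0.25);`, which only waits when a node lookup has failed.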

Don't know how to set the value for node 'input.my-css-class', sorry

Hi, it seems the tag name of a node currently contains css class names. This example triggers it:

test-input-with-class.html:

<html>
<body>
    <form>
        <input name="username" class="my-css-class" value="">
    </form>
</body>
</html>

test-input.pl:

use Log::Log4perl qw(:easy);
use WWW::Mechanize::Chrome;

Log::Log4perl->easy_init($ERROR);
my $m = WWW::Mechanize::Chrome->new(
    launch_exe => '/opt/google/chrome/chrome',
);

$m->get_local('test-input-with-class.html');

$m->field('username', 'foobar');

Output:

$ perl5.24.0 test-input.pl 
[...]
Don't know how to set the value for node 'input.my-css-class', sorry at /home/froggs/perl5/perlbrew/perls/perl-5.24.0/lib/site_perl/5.24.0/WWW/Mechanize/Chrome.pm line 2782.

Potential fix:

--- lib/WWW/Mechanize/Chrome.pm	2017-06-26 23:38:34.429987377 +0200
+++ lib/WWW/Mechanize/Chrome.pm	2017-06-26 23:38:39.158062618 +0200
@@ -3388,7 +3388,7 @@
 }
 
 sub get_tag_name( $self ) {
-    $self->nodeName
+    $self->nodeName =~ /^([^.]+)/ && $1
 }
 
 sub get_text( $self ) {
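
The effect of the proposed regular expression can be checked in isolation. The sketch below is a stand-alone variant with a hypothetical function name; unlike the patch, it falls back to the unmodified input when nothing matches.

```perl
use strict;
use warnings;

# Stand-alone variant of the proposed fix: keep everything before the
# first '.', so 'input.my-css-class' becomes 'input'.
sub tag_name_from_node_name {
    my ($node_name) = @_;
    return $node_name =~ /^([^.]+)/ ? $1 : $node_name;
}

print tag_name_from_node_name('input.my-css-class'), "\n";   # input
print tag_name_from_node_name('textarea'), "\n";             # textarea
```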

Possible to suppress annoying Chrome errors?

I'll often get errors in the output of Chrome. Here's a sample:

2018-07-08 13:13:20.143 Google Chrome[78169:25078696] *** Owner supplied to -[NSTrackingArea initWithRect:options:owner:userInfo:] referenced a deallocating object. Tracking area behavior is undefined. Break on NSTrackingAreaDeallocatingOwnerError to debug.
2018-07-08 13:13:22.963 Google Chrome[78169:25078696] *** Owner supplied to -[NSTrackingArea initWithRect:options:owner:userInfo:] referenced a deallocating object. Tracking area behavior is undefined. Break on NSTrackingAreaDeallocatingOwnerError to debug.
2018-07-08 13:13:25.752 Google Chrome[78169:25078696] Error loading /Library/Audio/Plug-Ins/HAL/DVCPROHDAudio.plugin/Contents/MacOS/DVCPROHDAudio:  dlopen(/Library/Audio/Plug-Ins/HAL/DVCPROHDAudio.plugin/Contents/MacOS/DVCPROHDAudio, 262): no suitable image found.  Did find:
  /Library/Audio/Plug-Ins/HAL/DVCPROHDAudio.plugin/Contents/MacOS/DVCPROHDAudio: no matching architecture in universal wrapper
  /Library/Audio/Plug-Ins/HAL/DVCPROHDAudio.plugin/Contents/MacOS/DVCPROHDAudio: no matching architecture in universal wrapper
2018-07-08 13:13:25.752 Google Chrome[78169:25078696] Cannot find function pointer NewPlugIn for factory C5A4CE5B-0BB8-11D8-9D75-0003939615B6 in CFBundle/CFPlugIn 0x7ff6a0ce1530 </Library/Audio/Plug-Ins/HAL/DVCPROHDAudio.plugin> (bundle, not loaded)

They clutter up my tests. Wondering if there is a known way to suppress them?

proxy basic auth

Hello. Please tell me how to use basic authorization (login, password) for the proxy when launching with:

launch_arg => ['--proxy-server=proxy.com']

Add infinite scroll function?

I have created a wrapper for WMC (with Moose) that includes a method for scrolling down to the bottom of a page with an infinite scroll and then waits for more elements to load. I'm wondering if it might be useful to improve it and incorporate into the WMC module. One call to the function will cause the browser to scroll down to the bottom of the page (twice just to make sure it registers) and then return once it detects more elements have been loaded. Here it is along with its helper functions:

sub infinite_scroll {
  my $s = shift;
  my $wait_time = shift || 120;

  my $current_element_count = $s->get_element_count;
  $s->scroll_to_bottom;

  # wait 1/10th sec for more of the page to load
  usleep 100000;

  my $new_element_count = $s->get_element_count;

  my $start_time = time();
  while (($new_element_count - $current_element_count) < 10) {

    # wait for wait time
    if (time() - $start_time > $wait_time) {
      return 0;
    }

    # wait 1/10th sec for more of the page to load
    usleep 100000;
    $new_element_count = $s->get_element_count;
  }
  usleep 100000;
  return 1;
}

sub scroll_to_bottom {
  my $s = shift;
  $s->eval( 'window.scroll(0,document.body.scrollHeight + 100)' );
  usleep 100000;
  $s->eval( 'window.scroll(0,document.body.scrollHeight + 200)' );
}

sub get_element_count {
  my $s = shift;
  my ($el_count) = $s->eval( 'document.getElementsByTagName("*").length' );
  return $el_count;
}
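
The waiting loop in infinite_scroll can be factored into a generic polling helper, which would also serve other "wait until something changes" cases. This is a sketch under assumptions: `wait_until` and its `timeout`/`interval` options are hypothetical names, not existing WMC API.

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep time);

# Hypothetical helper: call $condition every $interval seconds until it
# returns true or $timeout seconds have passed.
# Returns 1 when the condition became true, 0 on timeout.
sub wait_until {
    my ($condition, %opt) = @_;
    my $timeout  = defined $opt{timeout}  ? $opt{timeout}  : 120;
    my $interval = defined $opt{interval} ? $opt{interval} : 0.1;
    my $deadline = time() + $timeout;
    while (1) {
        return 1 if $condition->();
        return 0 if time() >= $deadline;
        sleep $interval;
    }
}
```

Inside infinite_scroll, the loop could then shrink to something like `return wait_until(sub { $s->get_element_count - $current_element_count >= 10 }, timeout => $wait_time);`.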

The content is always wrapped in HTML

When requesting something that returns a non-HTML document, e.g. application/json, if the response from the server is HTTP 304, then the content_type is undefined but the content (presumably the cached content) is wrapped in HTML, e.g.

<html><head></head><body><pre style="word-wrap: break-word; white-space: pre-wrap;">{ value => 1 }</pre></body></html>

This consistently happens when the response is HTTP 304, and it also seems to happen sometimes when the response is HTTP 200.
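
As a workaround on the caller's side, the wrapper can be stripped again when it is detected. The sketch below assumes the wrapper always has the shape shown above and that only the basic HTML entities are escaped; `unwrap_pre` is a hypothetical helper, not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper: if $content is the <pre> wrapper that Chrome puts
# around a plain-text response, return the inner text; otherwise return
# $content unchanged.
sub unwrap_pre {
    my ($content) = @_;
    return $content
        unless $content =~ m{\A\s*<html><head>.*?</head><body><pre[^>]*>(.*)</pre></body></html>\s*\z}s;
    my $text = $1;
    # Undo the basic entity escaping; &amp; has to be decoded last.
    $text =~ s/&lt;/</g;
    $text =~ s/&gt;/>/g;
    $text =~ s/&quot;/"/g;
    $text =~ s/&amp;/&/g;
    return $text;
}
```

A call such as `my $body = unwrap_pre($mech->content);` would then yield the raw payload for wrapped responses and leave real HTML pages untouched.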

Error when using "tab" argument in constructor

tab => argument appears to be buggy. Tried setting it to current, 0, and qr/about/ and WMC crashes with:

error when connectingInternal Exception at /Users/me/perl5/perlbrew/perls/perl-5.24.1/lib/site_perl/5.24.4/WWW/Mechanize/Chrome.pm line 533

WWW::Mechanize::Chrome->new hangs when --remote-allow-origins= is not set

Using WWW::Mechanize::Chrome 0.68 on debian 11 with chromium 111.

%> chromium --version
Chromium 111.0.5563.64 built on Debian 11.6, running on Debian 11.6

When using the basic example it hangs, and strace reveals that Chrome denies the connection because remote-allow-origins is not set:

%> strace -s 1000 perl test.pl

poll([{fd=4, events=POLLIN|POLLOUT}], 1, -1) = 1 ([{fd=4, revents=POLLOUT}])
write(4, "GET /devtools/browser/a414f313-72d9-4402-aa79-a3e90d0ff50b HTTP/1.1\r\nUpgrade: WebSocket\r\nConnection: Upgrade\r\nHost: 127.0.0.1:39039\r\nOrigin: http://127.0.0.1:39039\r\nSec-WebSocket-Key: 8nRsNY16bho1R/TEneonDQ==\r\nSec-WebSocket-Version: 13\r\n\r\n", 239) = 239
poll([{fd=4, events=POLLIN}], 1, -1)    = 1 ([{fd=4, revents=POLLIN}])
read(4, "HTTP/1.1 403 Forbidden\r\nContent-Length:241\r\nContent-Type:text/html\r\n\r\nRejected an incoming WebSocket connection from the http://127.0.0.1:39039 origin. Use the command line flag --remote-allow-origins=http://127.0.0.1:39039 to allow connections from this origin or --remote-allow-origins=* to allow all origins.", 8192) = 311

Adding --remote-allow-origins=* as launch_arg did the trick here.

Another annoying thing is Chromium asking for access to the kdewallet. This can be disabled by adding the launch_arg --password-store=basic.

So it works for me now, but you might want to extend your default launch options so it will work out of the box.
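
For versions that do not set these flags by default, the two workarounds can be merged into the constructor options. The sketch below only builds the option list; `with_workaround_flags` is a hypothetical helper, while `launch_arg` is the constructor option mentioned in this report. Note that `--remote-allow-origins=*` accepts every origin, so a narrower value may be preferable where that matters.

```perl
use strict;
use warnings;

# Hypothetical helper: add the two workaround flags from this report to
# the caller's launch_arg list, unless a flag of the same name is
# already present.
sub with_workaround_flags {
    my (%options) = @_;
    my @flags = @{ $options{launch_arg} || [] };
    for my $flag ('--remote-allow-origins=*', '--password-store=basic') {
        my ($name) = $flag =~ /^(--[^=]+)/;
        push @flags, $flag unless grep { /^\Q$name\E(?:=|$)/ } @flags;
    }
    # The later launch_arg entry wins when this list is assigned to a hash.
    return (%options, launch_arg => \@flags);
}
```

The result could be passed straight to the constructor, for example `WWW::Mechanize::Chrome->new( with_workaround_flags( headless => 1 ) )`.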

Thanks for your module.

Provide access to transport

I need access to the transport used by Mechanize Chrome.
$mech->driver and $mech->target return the same thing.

I found that I can get to the transport by $mech->target->{transport} but am afraid this might change in future releases.

A specific method would be better.

Add example how to setup pagesize

When I save pages as PDF, they are saved as US Letter.

Here I can see an example showing that Chrome accepts a parameter to set the page size:

await page.pdf({path: 'page.pdf', format: 'A4'});

I did not find out how to set this up with your module. If it is possible, could you please add an example?

I tried to work around this with @page{ size: A4 }, but there is a bug in Chrome.
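
Another report on this page calls `content_as_pdf( format => 'A4' )`, which suggests a format option exists. Where explicit dimensions are needed instead, the DevTools `Page.printToPDF` command takes `paperWidth` and `paperHeight` in inches. The sketch below only does that unit conversion; `paper_size_inches` is a hypothetical helper, and whether content_as_pdf forwards these two parameters unchanged is an assumption.

```perl
use strict;
use warnings;

# ISO 216 paper sizes in millimetres (width, height).
my %PAPER_MM = (A3 => [297, 420], A4 => [210, 297], A5 => [148, 210]);

# Hypothetical helper: return (width, height) in inches, rounded to two
# decimals, for a named paper format.
sub paper_size_inches {
    my ($format) = @_;
    my $mm = $PAPER_MM{ uc $format }
        or die "Unknown paper format '$format'\n";
    return map { sprintf('%.2f', $_ / 25.4) + 0 } @$mm;
}

my ($width, $height) = paper_size_inches('A4');
print "paperWidth=$width paperHeight=$height\n";   # paperWidth=8.27 paperHeight=11.69
```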

"'Page.navigate' wasn't found" when calling `get` on a page that's already loaded

Using W::M::C 0.67 to connect to an existing chrome instance, I've noticed I get the error message:

'Page.navigate' wasn't found

-32601 at /usr/local/share/perl/5.28.1/Chrome/DevToolsProtocol/Target.pm line 502

This happens when I attempt to call get to navigate to a page that's already loaded, but only sometimes. I haven't yet found a short and reliable sequence of steps to reproduce it; the only pattern I notice is that it happens if I call ->get for a page that's already loaded. If I open a new tab, or call ->get for a different URL, I don't see the error.

Any thoughts?

Constructor param data_directory has no effect but it is present in the launch args

I can't spawn the browser with a different user data dir.

Via the CLI I can do:

google-chrome --user-data-dir=xxxxx

(EDIT: dir is created automatically if it does not exist, it contains various browser items and no cookies nor history are remembered from last session)

And the dir is created and the browser does not know history+cookies of the previous session.

But running the browser via [WWW::Mechanize::Chrome] does not run in separate data dir although I can see in the log that the spawning options contain --user-data-dir=xxxxx

use Log::Log4perl qw(:easy);
use WWW::Mechanize::Chrome;
Log::Log4perl->easy_init($DEBUG);

    my %default_mech_params = (
	 'data_directory' => 'xxxxx',
	);
    my $mech_obj = eval {
        WWW::Mechanize::Chrome->new(%default_mech_params)
    };
    die $@ if $@;

sleep(1000);

The result:

2023/08/29 11:52:33 Spawning for websocket $VAR1 = [
          '/usr/bin/google-chrome',
          '--remote-debugging-port=0',
          '--remote-allow-origins=*',
          '--remote-debugging-address=127.0.0.1',

          '--user-data-dir=xxxxx' , # <<<<<<<<< it is here!!!!!!

          '--enable-automation',
          '--no-first-run',
          '--mute-audio',
          '--no-sandbox',
          '--safebrowsing-disable-auto-update',
          '--no-default-browser-check',
          '--disable-background-networking',
          '--disable-breakpad',
          '--disable-client-side-phishing-detection',
          '--disable-component-update',
          '--disable-hang-monitor',
          '--disable-prompt-on-repost',
          '--disable-sync',
          '--disable-web-resources',
          '--disable-default-apps',
          '--disable-popup-blocking',
          '--disable-gpu',
          '--disable-domain-reliability',
          'about:blank'
        ];
2023/08/29 11:52:33 Spawned child as 34709, communicating via websocket
2023/08/29 11:52:34 Connecting to ws://127.0.0.1:35455/devtools/browser/0610305f-a9e1-491c-9d1c-a37afbf4299f
2023/08/29 11:52:34 Connected to ws://127.0.0.1:35455/devtools/browser/0610305f-a9e1-491c-9d1c-a37afbf4299f
2023/08/29 11:52:34 Attached to tab 9084D773A279E05A69A2952880C358D8, session BBC2805B8C0CB0C88487AC9A2937A441
2023/08/29 11:52:34 Ignoring 'Runtime.executionContextCreated'

No dir is created, cookies and history are remembered.

EDIT: I have hacked the code to print the spawned command line which works just fine if I run it from my terminal:

/usr/bin/google-chrome --remote-debugging-port=0 --remote-allow-origins=* --remote-debugging-address=127.0.0.1 --user-data-dir=xxxxx --enable-automation --no-first-run --mute-audio --no-sandbox --safebrowsing-disable-auto-update --no-default-browser-check --disable-background-networking --disable-breakpad --disable-client-side-phishing-detection --disable-component-update --disable-hang-monitor --disable-prompt-on-repost --disable-sync --disable-web-resources --disable-default-apps --disable-popup-blocking --disable-gpu --disable-domain-reliability about:blank

Even by short-circuiting the module's code to simplify the command line args, this still does not create the user data dir:

2023/08/29 11:22:45 Spawning for websocket $VAR1 = [
          '/usr/bin/google-chrome',
          '--user-data-dir=xxxxx',
          '--remote-debugging-port=0',
          '--remote-allow-origins=*'
        ];
2023/08/29 11:22:45 Spawned child as 20605, communicating via websocket
2023/08/29 11:22:46 Connecting to ws://127.0.0.1:41239/devtools/browser/3fdbf9ea-ad73-4bcd-aaa5-7a0400ad4f1d
2023/08/29 11:22:46 Connected to ws://127.0.0.1:41239/devtools/browser/3fdbf9ea-ad73-4bcd-aaa5-7a0400ad4f1d
2023/08/29 11:22:46 Attached to tab F469D08E7340E1B6592302A73797FF94, session DF552A462B7BEC9D7D7196C660E25840
2023/08/29 11:22:46 Ignoring 'Runtime.executionContextCreated'

EDIT: I forgot to mention that I also get these warnings when using the module. I did not think they could affect this problem, but maybe they do. They are signature warnings (similar to issue #67):

Use of @_ in numeric eq (==) with signatured subroutine is experimental at /opt/perlbrew/perls/perl-5.36.0-O3/lib/site_perl/5.36.0/WWW/Mechanize/Chrome.pm line 837.
Use of @_ in array element with signatured subroutine is experimental at /opt/perlbrew/perls/perl-5.36.0-O3/lib/site_perl/5.36.0/WWW/Mechanize/Chrome.pm line 837.
Use of @_ in array element with signatured subroutine is experimental at /opt/perlbrew/perls/perl-5.36.0-O3/lib/site_perl/5.36.0/WWW/Mechanize/Chrome.pm line 2007.
Use of @_ in numeric eq (==) with signatured subroutine is experimental at /opt/perlbrew/perls/perl-5.36.0-O3/lib/site_perl/5.36.0/WWW/Mechanize/Chrome.pm line 5704.
Use of @_ in list assignment with signatured subroutine is experimental at /opt/perlbrew/perls/perl-5.36.0-O3/lib/site_perl/5.36.0/WWW/Mechanize/Chrome.pm line 5704.
Use of @_ in list assignment with signatured subroutine is experimental at /opt/perlbrew/perls/perl-5.36.0-O3/lib/site_perl/5.36.0/WWW/Mechanize/Chrome.pm line 5707.
Use of @_ in numeric eq (==) with signatured subroutine is experimental at /opt/perlbrew/perls/perl-5.36.0-O3/lib/site_perl/5.36.0/WWW/Mechanize/Chrome.pm line 5814.
Use of @_ in list assignment with signatured subroutine is experimental at /opt/perlbrew/perls/perl-5.36.0-O3/lib/site_perl/5.36.0/WWW/Mechanize/Chrome.pm line 5814.
Use of @_ in list assignment with signatured subroutine is experimental at /opt/perlbrew/perls/perl-5.36.0-O3/lib/site_perl/5.36.0/WWW/Mechanize/Chrome.pm line 5817.

WWW::Mechanize::Chrome version 0.71
google-chrome version is 116.0.5845.110
perl version 5.36.0
EDIT: linux fedora 36

Running version 0.71 on Perl 5.36 produces a lot of warnings

Running version 0.71 on Perl 5.36 produces a lot of warnings - for example:

Use of @_ in numeric eq (==) with signatured subroutine is experimental at perl5/perlbrew/perls/perl-5.36.1/lib/site_perl/5.36.1/WWW/Mechanize/Chrome.pm line 836.
Use of @_ in numeric eq (==) with signatured subroutine is experimental at perl5/perlbrew/perls/perl-5.36.1/lib/site_perl/5.36.1/WWW/Mechanize/Chrome.pm line 5703

These are solved in branch fix-signatures-array.
I would very much welcome a release which contains this branch to silence those warnings.

Thank you!

Failed test when installing

Trying to install on a Mac with Chrome v. 70.0.3538.110.

I'm failing a test:

#   Failed test 'The two links were found'
#   at t/51-mech-links.t line 72.
#          got: '3'
#     expected: '2'
# relative
# myiframe
# http://searchguide.level3.com/search/?q=http://somewhere.example/myiframe&t=0
# Looks like you failed 1 test of 7.
t/51-mech-links.t ................
Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/7 subtests

"Prepare Chromium" utility

I wrote a little utility for myself to prepare Chromium for use by W::M::C:

https://github.com/sdondley/WWW-Mechanize-Chrome-PrepareChromium

It might be useful to others with the same setup as me:

  1. MacOS
  2. I have a copy of Chromium installed exclusively for use by W::M::C.

I haven't written it with the public in mind, and it contains some code specific to my own setup. If you download it, though, you can get it to work with minimal hacking. Specifically, the internal _quit function calls a custom wrapper script of mine around osascript. You will definitely want to change this to something standard like:

system('osascript', '-e', 'quit app "Chromium"'); # this is untested

There isn't a lot to this module, and it can easily be adapted to other OSes and browser configurations.

'Lost UI shared context' Chrome error

When running the test suite, Chrome prints the following error repeatedly:

[0706/042333.730708:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.

This occurs on both macOS and Linux (Debian). The error doesn't cause any tests to fail and seems harmless. I'm not sure whether there is a way to get rid of it.
