jpramosi / geckordp

A client implementation of Firefox DevTools over the remote debug protocol in Python

Home Page: https://jpramosi.github.io/geckordp/

License: MIT License

Python 100.00%
rdp remote-debug-protocol firefox crawler debug webconsole ui-testing

geckordp's Introduction

geckordp

This is a client implementation of Firefox DevTools over the remote debug protocol in Python.

It essentially exposes the raw API for interacting with the debug server and has some similarities with a common webdriver. See also Documentation.

What's possible with geckordp?

Geckordp is meant to be used as a low-level library on which to build tools. With a few helpers like the WebExtension-API and a proxy server, it can be feature-rich enough for:

  • web ui-testing
  • extension testing
  • browser test tools
  • webdriver
  • data scraping
  • https recording
  • network traffic analysis
  • remote controller for browser
  • ...and possibly more

Getting Started

To use Geckordp, install it with:

pip install geckordp
# python -m pip install geckordp
# python -m pip install geckordp[develop]

Documentation can be generated with:

sphinx-build -a -c docs/src -b html docs/build docs

Package signature can be checked with:

pip download --no-deps geckordp
wget https://github.com/jpramosi.gpg -O pub.gpg
wget https://raw.githubusercontent.com/jpramosi/geckordp/master/signatures/geckordp-latest.zip.asc -O latest.asc
gpg --no-default-keyring --output pub.sig --dearmor pub.gpg
gpg --no-default-keyring --keyring ./pub.sig --verify latest.asc geckordp-*.zip
# exemplary output:
gpg: Signature made So 23 Okt 2022 14:08:20 CEST
gpg:                using RSA key 21F942661941E642894267539B8551A5AEA1227A
gpg:                issuer "[email protected]"
gpg: Good signature from "Jimmy Pramosi (git) <[email protected]>" [ultimate]

Usage

import json

from geckordp.actors.root import RootActor
from geckordp.firefox import Firefox
from geckordp.profile import ProfileManager
from geckordp.rdp_client import RDPClient

""" Uncomment to enable debug output
"""
# from geckordp.settings import GECKORDP
# GECKORDP.DEBUG = 1
# GECKORDP.DEBUG_REQUEST = 1
# GECKORDP.DEBUG_RESPONSE = 1


def main():
    # clone default profile to 'geckordp'
    pm = ProfileManager()
    profile_name = "geckordp"
    port = 6000
    pm.clone("default-release", profile_name)
    profile = pm.get_profile_by_name(profile_name)
    profile.set_required_configs()

    # start firefox with specified profile
    Firefox.start("https://example.com/", port, profile_name, ["-headless"])

    # create client and connect to firefox
    client = RDPClient()
    client.connect("localhost", port)

    # initialize root
    root = RootActor(client)

    # get a list of tabs
    tabs = root.list_tabs()
    print(json.dumps(tabs, indent=2))

    input()


if __name__ == "__main__":
    main()

See also examples and tests.

Tested Platforms

Tested Platform   Working   Firefox-Version   Geckordp-Version
Windows (x64)     yes       126.0             0.5.0
Ubuntu 20.04      yes       126.0             0.5.0
macOS 12          ?         126.0             0.5.0

Geckordp requires at least Python 3.10 and the latest Firefox build. Older versions of Firefox may also work as long as the API changes are not too drastic. In case of doubt, clone the repository and run the tests with:

cd <your-repositories-path>
git clone https://github.com/jpramosi/geckordp
cd geckordp
python -m pip uninstall geckordp
python -m pip install -e $PWD
pytest tests/ &> test.log

Archived Versions

Older versions of Geckordp with their corresponding Firefox versions can be found here. Keep in mind that they may be missing actors or bug fixes.

Contribute

Any help in the form of issues, questions or pull requests is very much appreciated. If you would like to improve the project, there are a few things to keep in mind:

For submitted code:

  • formatting
  • tests required (if new)
  • should basically reflect the geckodriver API (if possible)

For issues or improvements see here.

You can also contribute to the project simply by asking on the issue tracker for what you need (examples, a specific task, features or whether something is feasible). Often this helps other users too.

Develop

To get an idea of what's missing, here is a rough list of some notable objectives:

  • add remaining actors from geckodriver
  • add documentation for all actors and their functions (even the official repository has none)

If you are willing to get your hands dirty, please follow me here.

Technical Details

To be able to communicate with the server, a pre-configured profile is required.

Geckordp offers additional helper functions to resolve this problem with the ProfileManager.

The following flags are changed on profile configuration:

### disable crash-recover after 'ungraceful' process termination
("browser.sessionstore.resume_from_crash", False)

### disable safe-mode after 'ungraceful' process termination
("browser.sessionstore.max_resumed_crashes", 0)
("toolkit.startup.max_resumed_crashes", -1)
("browser.sessionstore.restore_on_demand", False)
("browser.sessionstore.restore_tabs_lazily", False)

### set download folder (not set by firefox)
("browser.download.dir", str(Path.home()))

### enable compatibility
("devtools.chrome.enabled", True)

### don't open dialog to accept connections from client
("devtools.debugger.prompt-connection", False)

### enable remote debugging
("devtools.debugger.remote-enabled", True)

### allow tab isolation (for e.g. separate cookie-jar)
("privacy.userContext.enabled", True)

### misc
("devtools.cache.disabled", True)
("browser.aboutConfig.showWarning", False)
("browser.tabs.warnOnClose", False)
("browser.tabs.warnOnCloseOtherTabs", False)
("browser.shell.skipDefaultBrowserCheckOnFirstRun", True)
("pdfjs.firstRun", True)
("doh-rollout.doneFirstRun", True)
("browser.startup.firstrunSkipsHomepage", True)
("browser.tabs.warnOnOpen", False)
("browser.warnOnQuit", False)
("toolkit.telemetry.reportingpolicy.firstRun", False)
("trailhead.firstrun.didSeeAboutWelcome", True)
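
For readers who want to see how such name/value pairs map onto a Firefox profile, here is a minimal sketch that renders a few of the flags above as `user_pref(...)` lines, the syntax Firefox uses in a profile's user.js and prefs.js files. The helper name `to_user_pref` and the `PREFS` subset are hypothetical and chosen for illustration only. How `ProfileManager` actually writes the values is not shown in this document, so treat this as an approximation of the idea rather than the library's implementation.

```python
import json
from pathlib import Path

# A small subset of the flags listed above (illustrative selection).
PREFS = [
    ("browser.sessionstore.resume_from_crash", False),
    ("toolkit.startup.max_resumed_crashes", -1),
    ("browser.download.dir", str(Path.home())),
    ("devtools.debugger.remote-enabled", True),
]


def to_user_pref(name, value):
    """Render one preference as a user.js-style line (hypothetical helper)."""
    # json.dumps produces true/false for booleans, bare integers for ints
    # and quoted, escaped strings, which matches the literal syntax
    # expected inside user_pref(...).
    return f"user_pref({json.dumps(name)}, {json.dumps(value)});"


lines = [to_user_pref(name, value) for name, value in PREFS]
print("\n".join(lines))
```

Lines of this form could be placed in a user.js file inside a profile directory by hand. When using Geckordp, calling `profile.set_required_configs()` as in the usage example is the intended route.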

Once the new profile has been created, Firefox can be started with it. However, actors need to be initialized first.

Some actors must call additional functions to be initialized on the server side. This is not always necessary and depends on what is actually needed. According to the pcap dumps, these actors and their required functions are initialized and used in the following order:

Browser initialization:

RDPClient()                 -> .connect()
    v
RootActor()                 -> .get_root()
    v
DeviceActor()               -> .get_description()
    v
ProcessActor()              -> .get_target()
    v
WebConsoleActor()           -> .start_listeners([])
    v
ContentProcessActor()       -> .list_workers()

Tab initialization:

TabActor()                  -> .get_target()*
    v
WebConsoleActor()           -> .start_listeners([])*
    v
ThreadActor()               -> .attach()*
    v
WatcherActor()              -> .watch_resources(...)*
    v
TargetConfigurationActor()

*required if this actor will be used or events are wanted

The following hierarchy diagram shows dependencies between the actors and how to initialize individual actors:

For debugging purposes, Geckordp can be configured to print out requests and responses, which helps in understanding the structure of the JSON packets. To enable it, use:

from geckordp.settings import GECKORDP
GECKORDP.DEBUG = 1
GECKORDP.DEBUG_REQUEST = 1
GECKORDP.DEBUG_RESPONSE = 1
# environment variables can also be used for e.g.
# GECKORDP_DEBUG_RESPONSE=1
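
To make the packet structure more concrete, the sketch below shows the framing that the Firefox remote debugging protocol documents for JSON packets on its stream transport: the decimal byte length of the body, a colon, then the JSON body. The function names `encode_packet` and `decode_packets` are hypothetical and written for this illustration. Geckordp handles framing internally through `RDPClient`, and its own implementation may differ.

```python
import json


def encode_packet(message: dict) -> bytes:
    """Frame a message as b'<byte length>:<json body>'."""
    body = json.dumps(message).encode("utf-8")
    return str(len(body)).encode("ascii") + b":" + body


def decode_packets(buffer: bytes):
    """Split a receive buffer into complete messages.

    Returns (messages, remaining_bytes); an incomplete trailing
    packet is left in remaining_bytes for the next read.
    """
    messages = []
    while True:
        sep = buffer.find(b":")
        if sep == -1:
            break  # length prefix not fully received yet
        length = int(buffer[:sep])
        start = sep + 1
        end = start + length
        if len(buffer) < end:
            break  # body not fully received yet
        messages.append(json.loads(buffer[start:end].decode("utf-8")))
        buffer = buffer[end:]
    return messages, buffer


request = {"to": "root", "type": "listTabs"}
wire = encode_packet(request)

# Simulate one complete packet followed by the start of a second one.
messages, rest = decode_packets(wire + wire[:5])
print(messages)
print(rest)
```

The length is counted in bytes of the UTF-8 encoded body, not in characters, which matters as soon as a message contains non-ASCII text.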

Other noteworthy general hints, issues or experiences:

  • actor initialization (plus related functions such as attach, watch or listening) on blank new tabs may get detached after visiting a new URL and must then be reinitiated (this can be avoided if the page has an HTML header and body)
  • received messages are plain Python dictionaries and most of the time have consistent fields which can be accessed directly
  • failed requests will return 'None'
  • actors can have multiple contexts, meaning different actor IDs can have the same actor model (e.g. WebConsoleActor for a process or a tab)
  • functions called within manually registered async handlers on RDPClient cannot call functions which invoke 'RDPClient.send_receive()' later in their execution path (use non-async handlers in this case instead)
  • after a Firefox update, a few events may not get caught by the RDPClient handler, or requests may get a wrong response; unfortunately a few event/response packets don't follow the same pattern, so events must be manually specified in Geckordp, which can have these side effects
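
Two of the hints above (messages are plain dictionaries, failed requests return 'None') suggest a defensive access pattern. The following is a minimal sketch of it. The helper `find_tab` is hypothetical, and the sample data is only modeled on the shape of a listTabs response. It assumes that `root.list_tabs()` yields either a list of such tab dictionaries or 'None' on failure, which should be checked against the installed Geckordp version.

```python
def find_tab(tabs, url_prefix):
    """Return the first tab dict whose URL starts with url_prefix, else None."""
    if not tabs:
        # Covers both a failed request (None) and an empty tab list.
        return None
    for tab in tabs:
        # .get() avoids a KeyError if a field is unexpectedly missing.
        if tab.get("url", "").startswith(url_prefix):
            return tab
    return None


# Sample data shaped like a listTabs response (illustrative values).
sample_tabs = [
    {
        "actor": "server1.conn0.tabDescriptor1",
        "selected": True,
        "title": "SameSite Cookies Tester",
        "url": "https://samesitetest.com/cookies/set",
    },
]

tab = find_tab(sample_tabs, "https://samesitetest.com")
print(tab["actor"] if tab else "no matching tab")

# A failed request is handled without raising.
print(find_tab(None, "https://example.com/"))
```

Checking for 'None' before indexing keeps a failed request from surfacing later as an unrelated TypeError.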

License

MIT License

Copyright (c) 2024 jpramosi

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

geckordp's People

Contributors

goodboy, jpramosi, longguzzz

geckordp's Issues

How to execute js code in Browser Console?

I want to execute JS code in the Browser Console (Ctrl+Shift+J) rather than invoke it in a single tab. Is that possible?
And how can I call the extension API to execute something like localStorage?
Thanks

How to hide robot icon and tips?

I tried setting useAutomationExtension=False, but it still does not work. Thanks!

profile = pm.get_profile_by_name(profile_name)
profile.set_config("useAutomationExtension", False)
profile.set_config("dom.webdriver.enabled", False)
profile.set_required_configs()

Tab mgmt APIs?

I know as per #9 and #10 that there are definitely tab reading methods but is there any way to create new tabs, and further possibly unload or discard those tabs during creation?

For example I'd like to be able to do something like what the does but dynamically 😎


For some examples ideally we could use some of these subsys apis to accomplish this:


Probably more discussion / research results to come..

Cookie storage example fails with error `sessionString is undefined`

Hello, I'm trying to run the cookie_storage.py example script, but it seems to fail out of the box. Here are the console logs after enabling debug mode:

2023-10-15 23:57:41,394 [geckordp][DEBUG] - wait_process_loaded(): expired=False
2023-10-15 23:57:41,407 [geckordp][DEBUG] - connect():
2023-10-15 23:57:41,407 [geckordp][DEBUG] - __connect(): Queue read task
2023-10-15 23:57:41,407 [geckordp][DEBUG] - __connect(): Run IO loop
2023-10-15 23:57:41,407 [geckordp][DEBUG] - __open_connection(): Try to open connection
2023-10-15 23:57:41,412 [geckordp][DEBUG] - __open_connection(): Start listening
2023-10-15 23:57:41,422 [geckordp][DEBUG] - __read(): message complete (226 == 226)
2023-10-15 23:57:41,422 [geckordp][INFO] - __print_response(): RESPONSE<-
{
  "from": "root",
  "applicationType": "browser",
  "testConnectionPrefix": "server1.conn0.",
  "traits": {
    "networkMonitor": true,
    "resources": {
      "extensions-backgroundscript-status": true
    },
    "workerConsoleApiMessagesDispatchedToMainThread": true
  }
}
2023-10-15 23:57:41,422 [geckordp][DEBUG] - __handle_single_request(): response valid, set result
2023-10-15 23:57:41,422 [geckordp][DEBUG] - __sync_send_receive():
2023-10-15 23:57:41,422 [geckordp][INFO] - __send(): REQUEST->
{
  "to": "root",
  "type": "listTabs"
}
2023-10-15 23:57:41,425 [geckordp][DEBUG] - __read(): message complete (292 == 292)
2023-10-15 23:57:41,425 [geckordp][INFO] - __print_response(): RESPONSE<-
{
  "tabs": [
    {
      "actor": "server1.conn0.tabDescriptor1",
      "browserId": 3,
      "browsingContextID": 4,
      "isZombieTab": false,
      "outerWindowID": 8,
      "selected": true,
      "title": "SameSite Cookies Tester",
      "traits": {
        "watcher": true,
        "supportsReloadDescriptor": true
      },
      "url": "https://samesitetest.com/cookies/set"
    }
  ],
  "from": "root"
}
2023-10-15 23:57:41,425 [geckordp][DEBUG] - __handle_single_request(): response valid, set result
2023-10-15 23:57:41,425 [geckordp][DEBUG] - __sync_send_receive():
2023-10-15 23:57:41,425 [geckordp][INFO] - __send(): REQUEST->
{
  "to": "server1.conn0.tabDescriptor1",
  "type": "getWatcher",
  "isServerTargetSwitchingEnabled": null,
  "isPopupDebuggingEnabled": null
}
2023-10-15 23:57:41,428 [geckordp][DEBUG] - __read(): message complete (596 == 596)
2023-10-15 23:57:41,429 [geckordp][INFO] - __print_response(): RESPONSE<-
{
  "actor": "server1.conn0.watcher2",
  "traits": {
    "frame": true,
    "process": true,
    "worker": true,
    "resources": {
      "console-message": true,
      "css-change": true,
      "css-message": true,
      "document-event": true,
      "Cache": true,
      "cookies": true,
      "error-message": true,
      "extension-storage": true,
      "indexed-db": true,
      "local-storage": true,
      "session-storage": true,
      "platform-message": true,
      "network-event": true,
      "network-event-stacktrace": true,
      "reflow": true,
      "stylesheet": true,
      "source": true,
      "thread-state": true,
      "server-sent-event": true,
      "websocket": true,
      "tracing-state": true,
      "last-private-context-exit": true
    }
  },
  "from": "server1.conn0.tabDescriptor1"
}
2023-10-15 23:57:41,429 [geckordp][DEBUG] - __handle_single_request(): response valid, set result
2023-10-15 23:57:41,429 [geckordp][DEBUG] - __sync_send_receive():
2023-10-15 23:57:41,429 [geckordp][INFO] - __send(): REQUEST->
{
  "to": "server1.conn0.watcher2",
  "type": "watchTargets",
  "targetType": "frame"
}
2023-10-15 23:57:41,431 [geckordp][DEBUG] - __read(): message complete (33 == 33)
2023-10-15 23:57:41,431 [geckordp][INFO] - __print_response(): RESPONSE<-
{
  "from": "server1.conn0.watcher2"
}
2023-10-15 23:57:41,431 [geckordp][DEBUG] - __handle_single_request(): response valid, set result
2023-10-15 23:57:41,431 [geckordp][DEBUG] - __sync_send_receive():
2023-10-15 23:57:41,431 [geckordp][INFO] - __send(): REQUEST->
{
  "to": "server1.conn0.watcher2",
  "type": "watchResources",
  "resourceTypes": [
    "cookies"
  ]
}
2023-10-15 23:57:41,436 [geckordp][DEBUG] - __read(): message complete (384 == 384)
2023-10-15 23:57:41,436 [geckordp][INFO] - __print_response(): RESPONSE<-
{
  "type": "resource-available-form",
  "resources": [
    {
      "actor": "server1.conn0.cookies3",
      "hosts": {
        "https://samesitetest.com": []
      },
      "traits": {
        "supportsAddItem": true,
        "supportsRemoveItem": true,
        "supportsRemoveAll": true,
        "supportsRemoveAllSessionCookies": true
      },
      "resourceType": "cookies",
      "resourceId": "cookies-2147483650",
      "resourceKey": "cookies",
      "browsingContextID": 4
    }
  ],
  "from": "server1.conn0.watcher2"
}
2023-10-15 23:57:41,436 [geckordp][DEBUG] - __handle_events(): [server1.conn0.watcher2][resource-available-form] handled
2023-10-15 23:57:41,436 [geckordp][DEBUG] - __read(): message complete (33 == 33)
2023-10-15 23:57:41,436 [geckordp][INFO] - __print_response(): RESPONSE<-
{
  "from": "server1.conn0.watcher2"
}
2023-10-15 23:57:41,437 [geckordp][DEBUG] - __handle_single_request(): response valid, set result
2023-10-15 23:57:41,437 [geckordp][DEBUG] - __sync_send_receive():
2023-10-15 23:57:41,437 [geckordp][INFO] - __send(): REQUEST->
{
  "to": "server1.conn0.cookies3",
  "type": "getStoreObjects",
  "host": "https://samesitetest.com",
  "names": null,
  "options": {}
}
2023-10-15 23:57:41,438 [geckordp][DEBUG] - __read(): message complete (181 == 181)
2023-10-15 23:57:41,438 [geckordp][INFO] - __print_response(): RESPONSE<-
{
  "from": "server1.conn0.cookies3",
  "error": "TypeError",
  "message": "sessionString is undefined",
  "fileName": "resource://devtools/shared/natural-sort.js",
  "lineNumber": 50,
  "columnNumber": 5
}
2023-10-15 23:57:41,438 [geckordp][DEBUG] - __handle_single_request(): response valid, set result
2023-10-15 23:57:41,438 [geckordp][ERROR] - __sync_send_receive(): Error on request:
{'to': 'server1.conn0.cookies3', 'type': 'getStoreObjects', 'host': 'https://samesitetest.com', 'names': None, 'options': {}}
{
  "from": "server1.conn0.cookies3",
  "error": "TypeError",
  "message": "sessionString is undefined",
  "fileName": "resource://devtools/shared/natural-sort.js",
  "lineNumber": 50,
  "columnNumber": 5
}
{
  "from": "server1.conn0.cookies3",
  "error": "TypeError",
  "message": "sessionString is undefined",
  "fileName": "resource://devtools/shared/natural-sort.js",
  "lineNumber": 50,
  "columnNumber": 5
}
2023-10-15 23:57:41,438 [geckordp][DEBUG] - __sync_send_receive():
2023-10-15 23:57:41,438 [geckordp][INFO] - __send(): REQUEST->
{
  "to": "server1.conn0.cookies3",
  "type": "editItem",
  "data": {
    "host": "https://samesitetest.com",
    "key": "uniqueKey",
    "field": "value",
    "oldValue": "",
    "newValue": "my_new_value",
    "items": {}
  }
}
2023-10-15 23:57:41,439 [geckordp][DEBUG] - __read(): message complete (33 == 33)
2023-10-15 23:57:41,439 [geckordp][INFO] - __print_response(): RESPONSE<-
{
  "from": "server1.conn0.cookies3"
}
2023-10-15 23:57:41,439 [geckordp][DEBUG] - __handle_single_request(): response valid, set result
Traceback (most recent call last):
  File "cookie_storage.py", line 200, in <module>
    main()
  File "cookie_storage.py", line 152, in main
    _data = stores_update_fut.result(1.0)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/concurrent/futures/_base.py", line 460, in result
    raise TimeoutError()
concurrent.futures._base.TimeoutError

I'm using Python 3.10.8 and Firefox 115.3.1esr (64 bit) on macOS 10.14.

Do you have any guidance for debugging the script and making it work? Thanks.

How to disable error log from geckordp?

I keep getting this error after restarting Firefox:

2022-06-03 11:09:24,550 [geckordp][ERROR] - wait_process_loaded(): waiting for process[5876] failed, wait 15 seconds:
process no longer exists (pid=5876)

Development instructions

Hi, thanks for this very promising tool!
I would be interested in using it for accessibility testing, by using the features available in the accessibility tab of the devtools.

I followed the instructions in the Develop section of your readme.md, but I did not succeed in starting 2 parallel instances of Firefox on Linux and on macOS.

The issue is that the geckordp profile had not yet been created. May I send a PR that updates the documentation to clarify this step?
