noahcardoza / cloudproxy

Proxy server to bypass Cloudflare protection.

License: MIT License

JavaScript 0.65% Dockerfile 3.92% TypeScript 95.43%
cloudflare cloudflare-bypass cloudflare-scrape anti-bot-page sneakerbot hacktoberfest

cloudproxy's Introduction

CloudProxy

Proxy server to bypass Cloudflare protection

⚠️ This project is in beta state. Some things may not work and the API can change at any time. See the known issues section.

2captcha Alternatives

Capsolver offers an affordable and quick automatic captcha solving solution with a success rate of 99.15% and the ability to solve a variety of captchas, including reCAPTCHA V2, hCaptcha, FunCaptcha, and more. Integration with various API clients is also supported, and a free trial balance is available with upgraded personal details.

Discord

If you need help feel free to swing by my Discord!

How it works

CloudProxy starts a proxy server that waits for user requests in an idle state, using few resources. When a request arrives, it uses puppeteer with the stealth plugin to launch a headless browser (Chrome). It opens the URL with the user's parameters and waits until the Cloudflare challenge is solved (or the timeout expires). The HTML and the cookies are sent back to the user, and those cookies can be used to bypass Cloudflare with other HTTP clients.
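
The cookie hand-off described above can be sketched as follows. This is a minimal illustration and not part of CloudProxy: the `SolvedCookie` type and the helpers `toCookieHeader` and `followUpHeaders` are hypothetical names, and only the `name` and `value` fields are taken from the example response in this README. The sketch also assumes the follow-up client should reuse the User-Agent that CloudProxy reported, since Cloudflare clearance cookies are typically tied to it.

```typescript
// Hypothetical helper types and functions (not part of CloudProxy).
interface SolvedCookie {
  name: string;
  value: string;
}

// Join cookies as "name=value" pairs separated by "; ",
// which is the format of the Cookie request header.
function toCookieHeader(cookies: SolvedCookie[]): string {
  return cookies.map((c) => `${c.name}=${c.value}`).join("; ");
}

// Build the headers for a follow-up request made by another HTTP client.
function followUpHeaders(
  cookies: SolvedCookie[],
  userAgent: string
): Record<string, string> {
  return {
    "Cookie": toCookieHeader(cookies),
    "User-Agent": userAgent,
  };
}
```

The resulting object can be passed as the headers of whatever HTTP client makes the subsequent requests.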

NOTE: Web browsers consume a lot of memory. If you are running CloudProxy on a machine with little RAM, do not make many requests at once. A new browser is launched for each request unless you use a session ID, which is strongly recommended. However, if you use sessions, make sure to close them as soon as you are done using them.

Installation

It requires NodeJS.

Run PUPPETEER_PRODUCT=chrome npm install to install CloudProxy dependencies.

Usage

First run npm run build. Once the TypeScript is compiled, you can use npm start to start CloudProxy.

Example request:

curl -L -X POST 'http://localhost:8191/v1' \
-H 'Content-Type: application/json' \
--data-raw '{
  "cmd": "request.get",
  "url":"http://www.google.com/",
  "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.0 Safari/537.36",
  "maxTimeout": 60000,
  "headers": {
    "X-Test": "Testing 123..."
  }
}'
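
For callers that prefer code over curl, the same request might be sent from TypeScript roughly like this. It is a sketch under assumptions: it relies on the global `fetch` available in Node.js 18+, and `buildRequestGet` and `sendCommand` are hypothetical helper names, not part of CloudProxy. The field names and the default endpoint are the ones used in the curl example.

```typescript
// Shape of the request.get body, limited to fields shown in this README.
interface RequestGetCommand {
  cmd: "request.get";
  url: string;
  userAgent?: string;
  maxTimeout?: number;
  headers?: Record<string, string>;
  session?: string;
}

// Hypothetical helper: assemble the JSON body for a request.get command.
function buildRequestGet(
  url: string,
  options: Omit<RequestGetCommand, "cmd" | "url"> = {}
): RequestGetCommand {
  return { cmd: "request.get", url, ...options };
}

// Hypothetical helper: POST one command to a CloudProxy instance and
// return the parsed JSON reply.
async function sendCommand(
  command: object,
  endpoint: string = "http://localhost:8191/v1"
): Promise<unknown> {
  const res = await fetch(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(command),
  });
  return res.json();
}
```

With CloudProxy running locally, `sendCommand(buildRequestGet("http://www.google.com/", { maxTimeout: 60000 }))` resolves with the JSON reply.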

Commands

+ sessions.create

This will launch a new browser instance which will retain cookies until you destroy it with sessions.destroy. This comes in handy so you don't have to keep solving challenges over and over and you won't need to keep sending cookies for the browser to use.

This also speeds up the requests since it won't have to launch a new browser instance for every request.

Parameter Notes
session Optional. The session ID that you want to be assigned to the instance. If one isn't set, a random UUID will be assigned.
userAgent Optional. Will be used by the headless browser.

+ sessions.list

Returns a list of all the active sessions. Mainly for debugging, if you are curious to see how many sessions are running. You should always make sure to properly close each session when you are done using it, as too many may slow your computer down.

Example response:

{
  "sessions": [
    "session_id_1",
    "session_id_2",
    "session_id_3..."
  ]
}
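
One possible use of this response is cleaning up leftover sessions. The sketch below only builds the commands; `destroyAllCommands` is a hypothetical helper name, and the `sessions` field is the one shown in the example response.

```typescript
// Shape of the sessions.list reply, as shown in the example response.
interface SessionsListResponse {
  sessions: string[];
}

interface SessionsDestroyCommand {
  cmd: "sessions.destroy";
  session: string;
}

// Hypothetical helper: produce one sessions.destroy command per active session.
function destroyAllCommands(list: SessionsListResponse): SessionsDestroyCommand[] {
  return list.sessions.map(
    (session): SessionsDestroyCommand => ({ cmd: "sessions.destroy", session })
  );
}
```

Each returned object can then be POSTed to the `/v1` endpoint in the same way as any other command.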

+ sessions.destroy

This will properly shut down a browser instance and remove all files associated with it to free up resources for a new session. Whenever you no longer need a session, make sure to close it.

Parameter Notes
session The session ID that you want to be destroyed.
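
The advice to always close sessions can be enforced on the client side with a small wrapper. This is a sketch, not CloudProxy code: `Send` and `withSession` are hypothetical names, and `send` stands for any function that POSTs a command to the `/v1` endpoint. The caller supplies its own session ID, which the `session` parameter of sessions.create allows.

```typescript
// A function that posts one command to CloudProxy and resolves with the reply.
type Send = (command: Record<string, unknown>) => Promise<unknown>;

// Hypothetical helper: create a session, run `work` with it, and always
// destroy the session afterwards, even when `work` throws.
async function withSession<T>(
  send: Send,
  session: string,
  work: (session: string) => Promise<T>
): Promise<T> {
  await send({ cmd: "sessions.create", session });
  try {
    return await work(session);
  } finally {
    await send({ cmd: "sessions.destroy", session });
  }
}
```

Because the cleanup sits in a `finally` block, a failed request does not leave a browser instance running.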

+ request.get

Parameter Notes
url Mandatory
session Optional. Will send the request from an existing browser instance. If one is not sent, it will create a temporary instance that will be destroyed immediately after the request is completed.
headers Optional. To specify user headers.
maxTimeout Optional. Max timeout to solve the challenge
cookies Optional. Will be used by the headless browser. Follow this format
encode Optional. Add to header list 'Content-Type': 'application/x-www-form-urlencoded' — can be useful if you need to send a JSON in postData.

Example response from running the curl above:

{
    "solution": {
        "url": "https://www.google.com/?gws_rd=ssl",
        "status": 200,
        "headers": {
            "status": "200",
            "date": "Thu, 16 Jul 2020 04:15:49 GMT",
            "expires": "-1",
            "cache-control": "private, max-age=0",
            "content-type": "text/html; charset=UTF-8",
            "strict-transport-security": "max-age=31536000",
            "p3p": "CP=\"This is not a P3P policy! See g.co/p3phelp for more info.\"",
            "content-encoding": "br",
            "server": "gws",
            "content-length": "61587",
            "x-xss-protection": "0",
            "x-frame-options": "SAMEORIGIN",
            "set-cookie": "1P_JAR=2020-07-16-04; expires=Sat, 15-Aug-2020 04:15:49 GMT; path=/; domain=.google.com; Secure; SameSite=none\nNID=204=QE3Ocq15XalczqjuDy52HeseG3zAZuJzID3R57g_oeQHyoV5DuvDhpWc4r9IcPoeIYmkr_ZTX_MNOU8IAbtXmVO7Bmq0adb-hpIHaTBIdBk3Ofifp4gO6vZleVuFYfj7ePkHeHdzGoX-en0FvKtd9iofX4O6RiAdEIAnpL7Wge4; expires=Fri, 15-Jan-2021 04:15:49 GMT; path=/; domain=.google.com; Secure; HttpOnly; SameSite=none",
            "alt-svc": "h3-29=\":443\"; ma=2592000,h3-27=\":443\"; ma=2592000,h3-25=\":443\"; ma=2592000,h3-T050=\":443\"; ma=2592000,h3-Q050=\":443\"; ma=2592000,h3-Q046=\":443\"; ma=2592000,h3-Q043=\":443\"; ma=2592000,quic=\":443\"; ma=2592000; v=\"46,43\""
        },
        "response":"<!DOCTYPE html>...",
        "cookies": [
            {
                "name": "NID",
                "value": "204=QE3Ocq15XalczqjuDy52HeseG3zAZuJzID3R57g_oeQHyoV5DuvDhpWc4r9IcPoeIYmkr_ZTX_MNOU8IAbtXmVO7Bmq0adb-hpIHaTBIdBk3Ofifp4gO6vZleVuFYfj7ePkHeHdzGoX-en0FvKtd9iofX4O6RiAdEIAnpL7Wge4",
                "domain": ".google.com",
                "path": "/",
                "expires": 1610684149.307722,
                "size": 178,
                "httpOnly": true,
                "secure": true,
                "session": false,
                "sameSite": "None"
            },
            {
                "name": "1P_JAR",
                "value": "2020-07-16-04",
                "domain": ".google.com",
                "path": "/",
                "expires": 1597464949.307626,
                "size": 19,
                "httpOnly": false,
                "secure": true,
                "session": false,
                "sameSite": "None"
            }
        ],
        "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.0 Safari/537.36"
    },
    "status": "ok",
    "message": "",
    "startTimestamp": 1594872947467,
    "endTimestamp": 1594872949617,
    "version": "1.0.0"
}
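
Note that the reply carries two different status fields: the top-level `status` and the HTTP status of the target site in `solution.status`. One of the issue reports further down this page shows a reply with a top-level `"status": "ok"` whose `solution.status` is 403. A caller may therefore want to check both. The following is a sketch; `CloudProxyResponse` lists only some of the fields from the example above, and `isSolved` is a hypothetical helper name.

```typescript
// Partial shape of a reply, limited to fields shown in the example response.
interface CloudProxyResponse {
  status: string;
  message: string;
  solution?: {
    url: string;
    status: number;
    response: string;
    userAgent: string;
  };
}

// Hypothetical check: the command was processed AND the target site
// answered with a 2xx or 3xx HTTP status.
function isSolved(reply: CloudProxyResponse): boolean {
  return (
    reply.status === "ok" &&
    reply.solution !== undefined &&
    reply.solution.status >= 200 &&
    reply.solution.status < 400
  );
}
```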

+ request.post

This is the same as request.get but it takes one more param:

Example request:

curl -L -X POST 'http://localhost:8191/v1' \
-H 'Content-Type: application/json' \
--data-raw '{
  "cmd": "request.post",
  "url":"http://www.google.com/",
  "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.0 Safari/537.36",
  "maxTimeout": 60000,
  "postData": { "string": "string", "number": 10, "boolean": false },
  "headers": {
    "X-Test": "Testing 123..."
  }
}'

Parameter Notes
postData Must be an object.
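
A request.post body can be assembled with a small guard on `postData`. This is a sketch: `buildRequestPost` is a hypothetical helper, and rejecting arrays is an assumption, since the README only says that `postData` must be an object.

```typescript
interface RequestPostCommand {
  cmd: "request.post";
  url: string;
  postData: Record<string, unknown>;
  maxTimeout?: number;
}

// Hypothetical helper: build a request.post body and reject postData
// values that are not plain objects (assumption: arrays are not accepted).
function buildRequestPost(
  url: string,
  postData: Record<string, unknown>,
  maxTimeout?: number
): RequestPostCommand {
  if (typeof postData !== "object" || postData === null || Array.isArray(postData)) {
    throw new TypeError("postData must be a plain object");
  }
  const command: RequestPostCommand = { cmd: "request.post", url, postData };
  if (maxTimeout !== undefined) {
    command.maxTimeout = maxTimeout;
  }
  return command;
}
```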

Downloading Images and PDFs (small files)

If you need to access an image, a PDF or another small file, pass the download parameter to request.get and set it to true. Rather than reading the HTML and returning text, it will return the buffer base64 encoded, which you can then decode and save as the image/PDF.

This method isn't recommended for videos or anything larger, as that should be streamed back to the client and at the moment nothing is set up to do so. If this is something you need, feel free to create an issue and/or submit a PR.
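
Decoding the returned data could look like the sketch below. Two assumptions are made: the base64 text is taken to arrive in `solution.response` (the README does not say which field holds it), and `atob` is available as a global, as it is in browsers and Node.js 16+. `buildDownload` and `decodeBase64` are hypothetical helper names.

```typescript
// Body for a download request: request.get with the download parameter set.
function buildDownload(url: string): { cmd: string; url: string; download: boolean } {
  return { cmd: "request.get", url, download: true };
}

// Hypothetical helper: decode a base64 string into raw bytes.
function decodeBase64(data: string): Uint8Array {
  const binary = atob(data); // each character holds one byte (0..255)
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}
```

In Node.js the resulting bytes can be written to disk with `fs.writeFileSync("image.jpg", bytes)`.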

Environment variables

To set the environment vars in Linux run export LOG_LEVEL=debug and then start CloudProxy in the same shell.

Name Default Notes
LOG_LEVEL info Used to change the verbosity of the logging.
LOG_HTML false Used for debugging. If true all html that passes through the proxy will be logged to the console.
PORT 8191 Change this if you already have a process running on port 8191.
HOST 0.0.0.0 This shouldn't need to be messed with but if you insist, it's here!
CAPTCHA_SOLVER None This is used to select which captcha solving method is used when a captcha is encountered.
HEADLESS true This is used to debug the browser by not running it in headless mode.
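
The defaults in the table can be summarized in code. The reader below is an illustrative sketch, not CloudProxy's actual configuration code: `CloudProxyConfig`, `Env` and `readConfig` are hypothetical names, and treating only the string "true" as true is an assumption.

```typescript
interface CloudProxyConfig {
  logLevel: string;
  logHtml: boolean;
  port: number;
  host: string;
  captchaSolver: string | null;
  headless: boolean;
}

type Env = Record<string, string | undefined>;

// Hypothetical reader: fall back to the defaults listed in the table above.
function readConfig(env: Env): CloudProxyConfig {
  return {
    logLevel: env.LOG_LEVEL ?? "info",
    logHtml: (env.LOG_HTML ?? "false") === "true",
    port: Number(env.PORT ?? "8191"),
    host: env.HOST ?? "0.0.0.0",
    captchaSolver: env.CAPTCHA_SOLVER ?? null,
    headless: (env.HEADLESS ?? "true") === "true",
  };
}
```

In Node.js, `readConfig(process.env)` would yield the effective settings for the current shell.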

Captcha Solvers

Sometimes CF not only gives mathematical computations and browser tests, but also requires the user to solve a captcha. If this is the case, CloudProxy will return the captcha page. But that's not very helpful to you, is it?

CloudProxy can be customized to solve captchas automatically by setting the environment variable CAPTCHA_SOLVER to the file name of one of the adapters inside the /captcha directory.

This method makes use of the CaptchaHarvester project which allows users to collect their own tokens from ReCaptcha V2/V3 and hCaptcha for free.

To use this method you must set these ENV variables:

CAPTCHA_SOLVER=harvester
HARVESTER_ENDPOINT=https://127.0.0.1:5000/token

Note: above I set HARVESTER_ENDPOINT to the default configuration of the captcha harvester's server, but that could change if you customize the command line flags. Simply put, HARVESTER_ENDPOINT should be set to the URI of the route that returns a token in plain text when called.
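
What "returns a token in plain text" means for a client can be sketched as follows. This is not the adapter shipped in the /captcha directory: `TextFetcher` and `fetchToken` are hypothetical names, and the error handling is an assumption about what a careful client might do.

```typescript
// Minimal fetch-like signature, so the sketch can be exercised without a network.
type TextFetcher = (
  url: string
) => Promise<{ ok: boolean; status: number; text(): Promise<string> }>;

// Hypothetical client: request a token from the endpoint and fail loudly
// when the endpoint reports an error or returns an empty body.
async function fetchToken(endpoint: string, fetcher: TextFetcher): Promise<string> {
  const res = await fetcher(endpoint);
  if (!res.ok) {
    throw new Error(`Token endpoint answered HTTP ${res.status}`);
  }
  const token = (await res.text()).trim();
  if (token === "") {
    throw new Error("Token endpoint returned an empty body");
  }
  return token;
}
```

Any fetch-compatible function can be passed as `fetcher`, with the value of HARVESTER_ENDPOINT as `endpoint`.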

This method makes use of the hcaptcha-solver project which attempts to solve hcaptcha by randomly selecting images.

To use this solver you must first install it and then set it as the CAPTCHA_SOLVER.

npm i hcaptcha-solver
CAPTCHA_SOLVER=hcaptcha-solver

Other Options

Everyone likes more options to choose from. Help contribute to the project by submitting PRs for other 3rd party captcha solvers or your own projects. PRs are welcome for any and all captcha solving methods and services.

Docker

You may edit the ./Dockerfile as well as ./docker-compose.yml as you see fit.

# To build the image & run it using `docker compose` (detached mode)
docker compose up -d

# To stop & remove containers:
docker compose down

# You may also build and run manually, however the configuration is
# already set in the compose file, that way you don't have to remember it.
docker build -t cloudproxy:latest .
docker run --restart=always --name cloudproxy -p 8191:8191 -d cloudproxy:latest

TypeScript

I'm quite new to TypeScript. If you spot any funny business or anything that is or isn't being used properly feel free to submit a PR or open an issue.

Known issues / Roadmap

The current implementation seems to be working on the sites I have been testing it on. However, if you find it unable to access a site, open an issue and I'd be happy to investigate.

That being said, the project uses the puppeteer stealth plugin. If Cloudflare is able to detect the headless browser, it's more that project's domain to fix.

TODO:

  • Fix remaining issues in the code (see TODOs in code)
  • Make the maxTimeout more accurate (count the time to open the first page / maybe count the captcha solve time?)
  • Hide sensitive information in logs
  • Reduce Docker image size
  • Docker image for ARM architecture
  • Install instructions for Windows

Credits

Based off of ngosang's FlareSolverr.

For help contact @MacHacker#7322 (Discord)

Has CloudProxy saved or made you money on your project? Consider buying me a coffee!

Buy Me A Coffee

cloudproxy's People

Contributors

abeloin, dependabot[bot], dextromethorphanum, iiiusi0n, jairoxyz, jbou, khorezm0, lululombard, ngosang, noahcardoza, notnotquinn, serk7, souloriginal, thebetauser, twanislas

cloudproxy's Issues

cant bypass this site

It doesn't bypass Cloudflare on this site:
https://bato.to/

error detail
{ "status": "ok", "message": "", "startTimestamp": 1626848183197, "endTimestamp": 1626848183575, "session": "d282ab10-e9ea-11eb-bea0-1b284ff0b0fb", "solution": { "url": "https://bato.to/", "status": 403, "headers": { "status": "403", "date": "Wed, 21 Jul 2021 06:16:23 GMT", "content-type": "text/plain; charset=UTF-8", "content-length": "16", "x-frame-options": "SAMEORIGIN", "cache-control": "private, max-age=0, no-store, no-cache, must-revalidate, post-check=0, pre-check=0", "expires": "Thu, 01 Jan 1970 00:00:01 GMT", "cf-request-id": "0b694e5c8200004ec7521f4000000001", "expect-ct": "max-age=604800, report-uri=\"https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct\"", "report-to": "{\"endpoints\":[{\"url\":\"https:\/\/a.nel.cloudflare.com\/report\/v3?s=uVAUOhtze1dS6YUzrURwYkkE%2Bwab6FEs%2FKD4JQaoHD8YyE8NyTDXzMi2RFPnQHG1wNW0A3MG78VJU02dPmfZob4NAJ9zVPHF2G9HdU1zFC6Zy0mo5GT8m4PmYWHWAQbLJwun9jw%3D\"}],\"group\":\"cf-nel\",\"max_age\":604800}", "nel": "{\"report_to\":\"cf-nel\",\"max_age\":604800}", "vary": "Accept-Encoding", "server": "cloudflare", "cf-ray": "67224cda6c454ec7-FRA", "alt-svc": "h3-27=\":443\"; ma=86400, h3-28=\":443\"; ma=86400, h3-29=\":443\"; ma=86400, h3=\":443\"; ma=86400" }, "response": "<html><head></head><body><pre style=\"word-wrap: break-word; white-space: pre-wrap;\">error code: 1005</pre></body></html>", "cookies": [], "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.0 Safari/537.36" } }
i tried
{ "cmd": "request.get", "url":"https://bato.to", "session":"d282ab10-e9ea-11eb-bea0-1b284ff0b0fb", "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.0 Safari/537.36", "maxTimeout": 60000 }

cant bypass this site

It doesn't bypass Cloudflare on this site:
https://hpjav(.)tv/

error detail

"status": 403,
"response": "<html><head></head><body><pre style=\"word-wrap: break-word; white-space: pre-wrap;\">error code: 1020</pre></body></html>",

If I use the ngosang/FlareSolverr package, I get the error "Please enable cookies."

"response": "<!DOCTYPE html><html class=\"no-js\" lang=\"en-US\"><!--<![endif]--><head>\n<title>Access denied | hpjav.tv used Cloudflare to restrict access</title>\n<meta charset=\"UTF-8\">\n<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">\n<meta http-equiv=\"X-UA-Compatible\" content=\"IE=Edge,chrome=1\">\n<meta name=\"robots\" content=\"noindex, nofollow\">\n<meta name=\"viewport\" content=\"width=device-width,initial-scale=1\">\n<link rel=\"stylesheet\" id=\"cf_styles-css\" href=\"/cdn-cgi/styles/main.css\" type=\"text/css\" media=\"screen,projection\">\n</head>\n<body>\n<div id=\"cf-wrapper\">\n<div class=\"cf-alert cf-alert-error cf-cookie-error hidden\" id=\"cookie-alert\" data-translate=\"enable_cookies\">Please enable cookies.</div>\n<div id=\"cf-error-details\" class=\"p-0\">\n<header class=\"mx-auto pt-10 lg:pt-6 lg:px-8 w-240 lg:w-full mb-15 antialiased\">\n<h1 class=\"inline-block md:block mr-2 md:mb-2 font-light text-60 md:text-3xl text-black-dark leading-tight\">\n<span data-translate=\"error\">Error</span>\n<span>1020</span>\n</h1>\n<span class=\"inline-block md:block heading-ray-id font-mono text-15 lg:text-sm lg:leading-relaxed\">Ray ID: 5ea6b7a11b95ef59 •</span>\n<span class=\"inline-block md:block heading-ray-id font-mono text-15 lg:text-sm lg:leading-relaxed\">2020-10-30 17:04:46 UTC</span>\n<h2 class=\"text-gray-600 leading-1.3 text-3xl lg:text-2xl font-light\">Access denied</h2>\n</header>\n<section class=\"w-240 lg:w-full mx-auto mb-8 lg:px-8\">\n<div id=\"what-happened-section\" class=\"w-1/2 md:w-full\">\n<h2 class=\"text-3xl leading-tight font-normal mb-4 text-black-dark antialiased\" data-translate=\"what_happened\">What happened?</h2>\n<p>This website is using a security service to protect itself from online attacks.</p>\n</div>\n</section>\n<div class=\"cf-error-footer cf-wrapper w-240 lg:w-full py-10 sm:py-4 sm:px-8 mx-auto text-center sm:text-left border-solid border-0 border-t border-gray-300\">\n<p 
class=\"text-13\">\n<span class=\"cf-footer-item sm:block sm:mb-1\">Cloudflare Ray ID: <strong class=\"font-semibold\">5ea6b7a11b95ef59</strong></span>\n<span class=\"cf-footer-separator sm:hidden\">•</span>\n<span class=\"cf-footer-item sm:block sm:mb-1\"><span>Your IP</span>: 45.76.49.139</span>\n<span class=\"cf-footer-separator sm:hidden\">•</span>\n<span class=\"cf-footer-item sm:block sm:mb-1\"><span>Performance &amp; security by</span> <a rel=\"noopener noreferrer\" href=\"https://www.cloudflare.com/5xx-error-landing\" id=\"brand_link\" target=\"_blank\">Cloudflare</a></span>\n</p>\n</div>\n</div>\n</div>\n<script type=\"text/javascript\">\n  window._cf_translation = {};\n  \n  \n</script>\n\n\n</body></html>",

didnt work with harvester

Describe the bug
Solving the captcha with CaptchaHarvester didn't work.

To Reproduce

curl --location --request POST 'http://serverip:8191/v1' \
--header 'Content-Type: application/json' \
--data-raw '{
  "cmd": "request.get",
  "url":"https://www.primewire.li/?&links=With+Links&type=movie&t=y&ts=y&c=e&page=1",
  "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.0 Safari/537.36",
  "maxTimeout": 60000,
  "headers": {
    "X-Test": "Testing 123..."
  }
}'

Expected behavior
2020-10-16T20:56:49.852Z DEBUG REQ-1 { headers: [Function] }
2020-10-16T20:56:49.857Z DEBUG REQ-1 Navegating to... https://www.primewire.li/?&links=With+Links&type=movie&t=y&ts=y&c=e&page=1
2020-10-16T20:56:49.877Z DEBUG REQ-1 {
headers: {
'upgrade-insecure-requests': '1',
'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.0 Safari/537.36',
'X-Test': 'Testing 123...'
}
}

2020-10-16T20:56:50.411Z INFO REQ-1 Cloudflare detected
2020-10-16T20:56:50.418Z DEBUG REQ-1 No '.ray_id' challenge element detected.
2020-10-16T20:56:50.419Z DEBUG REQ-1 No '.attack-box' challenge element detected.
2020-10-16T20:56:50.722Z INFO REQ-1 captcha type:hCaptcha
2020-10-16T20:56:50.723Z INFO REQ-1 Waiting to recive captcha token to bypass challenge...
TypeError: Cannot read property 'statusCode' of undefined
    at solve (/home/mvideos/tools/CloudProxy/src/captcha/harvester.ts:27:22)
    at processTicksAndRejections (internal/process/task_queues.js:97:5)
2020-10-16T20:56:53.823Z ERROR REQ-1 Cannot read property 'statusCode' of undefined

Screenshots
N/A

Possibility to interact with the opened page?

Hi.
A very nice solution to bypass Cloudflare, and it works great.
I need to interact with an opened page: click buttons, fill in forms, etc.
Not using postData, but really interacting with the page, like with Selenium.
Would it be possible for you to add this? Or does it already exist and I just haven't found how to use it?

Thanks

Cant Install

Describe the bug
I ran PUPPETEER_PRODUCT=chrome npm install, but it throws an error:

npm WARN deprecated [email protected]: Please update to ini >=1.3.6 to avoid a prototype pollution issue
npm WARN deprecated [email protected]: this library is no longer supported
npm WARN deprecated [email protected]: Debug versions >=3.2.0 <3.2.7 || >=4 <4.3.1 have a low-severity ReDos regression when used in a Node.js environment. It is recommended you upgrade to 3.2.7 or 4.3.1. (https://github.com/visionmedia/debug/issues/797)
npm WARN deprecated [email protected]: Debug versions >=3.2.0 <3.2.7 || >=4 <4.3.1 have a low-severity ReDos regression when used in a Node.js environment. It is recommended you upgrade to 3.2.7 or 4.3.1. (https://github.com/visionmedia/debug/issues/797)
npm WARN deprecated [email protected]: Please upgrade  to version 7 or higher.  Older versions may use Math.random() in certain circumstances, which is known to be problematic.  See https://v8.dev/blog/math-random for details.
npm WARN deprecated [email protected]: request has been deprecated, see https://github.com/request/request/issues/3142
npm WARN deprecated [email protected]: < 21.5.0 is no longer supported
npm ERR! code 1
npm ERR! path /home/komga/proxy/node_modules/puppeteer
npm ERR! command failed
npm ERR! command sh -c node install.js
npm ERR! node:internal/modules/cjs/loader:1080
npm ERR!   throw err;
npm ERR!   ^
npm ERR!
npm ERR! Error: Cannot find module './lib/node-progress'
npm ERR! Require stack:
npm ERR! - /home/komga/proxy/node_modules/progress/index.js
npm ERR! - /home/komga/proxy/node_modules/puppeteer/install.js
npm ERR!     at Module._resolveFilename (node:internal/modules/cjs/loader:1077:15)
npm ERR!     at Module._load (node:internal/modules/cjs/loader:922:27)
npm ERR!     at Module.require (node:internal/modules/cjs/loader:1143:19)
npm ERR!     at require (node:internal/modules/cjs/helpers:121:18)
npm ERR!     at Object.<anonymous> (/home/komga/proxy/node_modules/progress/index.js:1:18)
npm ERR!     at Module._compile (node:internal/modules/cjs/loader:1256:14)
npm ERR!     at Module._extensions..js (node:internal/modules/cjs/loader:1310:10)
npm ERR!     at Module.load (node:internal/modules/cjs/loader:1119:32)
npm ERR!     at Module._load (node:internal/modules/cjs/loader:960:12)
npm ERR!     at Module.require (node:internal/modules/cjs/loader:1143:19) {
npm ERR!   code: 'MODULE_NOT_FOUND',
npm ERR!   requireStack: [
npm ERR!     '/home/komga/proxy/node_modules/progress/index.js',
npm ERR!     '/home/komga/proxy/node_modules/puppeteer/install.js'
npm ERR!   ]
npm ERR! }
npm ERR!
npm ERR! Node.js v18.17.1

To Reproduce
Steps to reproduce the behavior:
PUPPETEER_PRODUCT=chrome npm install

Desktop (please complete the following information):

  • OS: CentOS Stream release 9

Very high CPU usage

Describe the bug
The CloudProxy container is draining my CPU after a short time.

To Reproduce
Steps to reproduce the behavior:

  • I use jackett modified image sclemenceau/docker-jackett:cloudproxy
  • scraping https://www.yggtorrent.si/ website every 15 minutes or so

Expected behavior
The CPU should remain low

Screenshots

CONTAINER ID        NAME                CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
e8d0c363dcc2        cloudproxy          128.99%             362.2MiB / 1.045GiB   33.85%              32.8MB / 3.65MB     0B / 0B             0

Desktop (please complete the following information):

  • Docker
  • Latest Version

Additional context
I see a lot of Chromium processes. Is that normal?

cloudproxy.log
chromium-processes.txt

HCaptcha solver support

Hey mate, this is related to #12 - I followed your advice and wanted to add https://github.com/JimmyLaurent/hcaptcha-solver as a captcha solver. The node lib actually works with the site I tried and returns a token but I have difficulties adding it as a solver under ./captcha.

First problem, routes.ts tries to pass hostname and sitekey to the solver, but hcaptcha-solver needs just a url:

 const token = await captchaSolver({
            hostname: (new URL(url)).hostname,
            sitekey,
            type: captchaType
          })

Second problem, page waits for this element and code fails with timeout if not found:
await page.waitForSelector('#challenge-form [type=submit]')

So I commented these out just for testing and the returned token works and I get a successful response from the site 👍

But I am really not good at typescript, so would not even know where to grab the url from in the function I export from my solver. Also no idea how to import the module correctly. But here is what I have as ./captcha/hcaptcha.ts

const solveHCaptcha = require('hcaptcha-solver');

export default async function solve(): Promise<string> {
    try {
        const response = await solveHCaptcha(url);
        return response
        // F0_eyJ0eXAiOiJKV1Q...
      } catch (error) {
        console.log(error);
      }
    
}

Maybe you can help adding this one like the harvester? Or maybe it would make more sense to integrate it in the main code and try to solve hcaptchas with it before calling the solver services.

Thanks!

How to download image or file using CloudProxy?

Good morning,

I have been trying to download images using CloudProxy, but without any luck. When I run this example code, found in the README, I get a JSON response with the cookies and a { solution: URL_TO_IMAGE_WITH_PARAMS }
curl -L -X POST 'http://localhost:8191/v1' \-H 'Content-Type: application/json' \--data-raw '{"cmd": "request.get","url":"IMAGEURL","userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.4103.0 Safari/537.36","maxTimeout": 60000}' > ./Desktop/image.html

I tried changing the Content-Type to image/jpeg and the output to image.jpeg, but it didn't work as expected and the image is corrupted.

Example response:
{"solution":{"url":"URL_TARGET_WEBSITE?__cf_chl_jschl_tk__=4518ca28c4f3080c7cd4080b68b74cfab43d6ed9-1596091916-0-AWNxm5ZZLK_lomayFZXqdLSbRBpWX5XWE1DS4qxLgVsx1XwOR_wdrKafxErGVoleIvRhg-Fkg2xAGd2TcqJPr2tYllR83aAtShoNABz4UCVZZMtTrL9rWoKFillt7NRJ3dFfxCrGvCA5WuXeJ_D_3IZnQ8gA2hH8NmpJ3LCP6aLpwqllTofEi9f3IOuYLEv-oGsFY9O6kwK6S6lntYsKpItscj-XPyO2T30v844gIOanw8DSqG_Yg_bXTdAaY4xKWhVNtGfhRal4k9lvRVm-KLAZTXGLtEllQroGnzQpBhxpjYa9zh_LagLFtLLscKidn_OkSYZJsavPX6NQgOK1h7CoC9snfY2AuSE_p3TeUEThWY4WsivqkOb0quA7J3-I9G5U7iiZsBfQX1mT9a_rnHA","status":503,"headers":{"status":"503","date":"Thu, 30 Jul 2020 06:51:43 GMT","content-type":"text/html; charset=UTF-8","set-cookie":"__cfduid=d0959ba67f595b25dba943c78589e84e91596091903; expires=Sat, 29-Aug-20 06:51:43 GMT; path=/; domain=.TARGET_DOMAIN; HttpOnly; SameSite=Lax; Secure","x-frame-options":"SAMEORIGIN","cache-control":"private, max-age=0, no-store, no-cache, must-revalidate, post-check=0, pre-check=0","expires":"Thu, 01 Jan 1970 00:00:01 GMT","cf-request-id":"044016c4d000001145db89c200000001","expect-ct":"max-age=604800, 
report-uri=\"https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct\"","vary":"Accept-Encoding","server":"cloudflare","cf-ray":"5bad271aec581145-MAD"},"response":"","cookies":[{"name":"cf_clearance","value":"5726025c42ac569a13f3dc3b6f51770903ee7769-1596091920-0-1z3f877e9bzbf7799e6z83dc411a-150","domain":".TARGET_DOMAIN","path":"/","expires":1596181920.323598,"size":98,"httpOnly":true,"secure":true,"session":false,"sameSite":"None"},{"name":"cf_chl_prog","value":"x19","domain":"TARGET_DOMAIN","path":"/","expires":1596095520,"size":14,"httpOnly":false,"secure":false,"session":false},{"name":"cf_chl_1","value":"0a019381102983e","domain":"TARGET_DOMAIN","path":"/","expires":1596095516,"size":23,"httpOnly":false,"secure":false,"session":false},{"name":"__cfduid","value":"d0959ba67f595b25dba943c78589e84e91596091903","domain":".TARGET_DOMAIN","path":"/","expires":1598683903.204168,"size":51,"httpOnly":true,"secure":true,"session":false,"sameSite":"Lax"}],"userAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.4103.0 Safari/537.36"},"status":"ok","message":"","startTimestamp":1596091902775,"endTimestamp":1596091920407,"version":"1.0.0"}

Add generic API caller for captcha

Hi,

I'm using FlareSolverrSharp and FlareSolverr right now. I adapted the C# part to work with CloudProxy, but I'm having problems with captchas. The Harvester solution is not feasible for me; I need to automate that part too...

Is it possible to have an API call to a generic solver? I can implement it on my end.

I was thinking of something like this:

CAPTCHA_SOLVER=generic
CAPTCHA_URL=http://host/method

So CloudProxy would call the URL and send the type of captcha, site key/id and site url, maybe a json POST?

What do you think?
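
For illustration, the JSON POST proposed above might look roughly like this. Nothing here exists in CloudProxy: the field names `type`, `sitekey` and `url`, as well as the names `GenericSolverRequest` and `buildSolverRequest`, are assumptions based only on the description in this issue.

```typescript
// Hypothetical payload for the proposed generic solver call.
interface GenericSolverRequest {
  type: string;    // captcha type, e.g. "hCaptcha"
  sitekey: string; // site key / id found on the challenge page
  url: string;     // page that presented the captcha
}

// Hypothetical helper: build the parts of the JSON POST that would be
// sent to the URL configured in CAPTCHA_URL.
function buildSolverRequest(
  type: string,
  sitekey: string,
  url: string
): { method: string; headers: Record<string, string>; body: string } {
  const payload: GenericSolverRequest = { type, sitekey, url };
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  };
}
```

The external solver would then be expected to answer with a token, in whatever format the two sides agree on.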

Thanx for your time...

Feature request: Official support in Jackett

Hello, I'm the original author of https://github.com/ngosang/FlareSolverr and one of the Jackett developers.
Since I don't have time to maintain FlareSolverr, I'm thinking of using this project as the official Cloudflare/Captcha resolver for Jackett. I will do all the development, of course, but I have some concerns. Maybe you can help us in some way.

  • The main concern is that this project is hard to install for regular users. In Jackett we have more than 16k daily users, but most of them don't know much about software, development or networking. If we add a new feature and they don't know how to use it, they are going to open a lot of support issues and we don't have the time to manage that.
    • Provide installers. At least for Windows and Linux. In Jackett we are using Github Actions / Azure Pipelines to automatically generate and upload to GitHub the binaries. You don't need to go further, but a Windows/Linux zip file (manually generated) with all dependencies (Chrome, Node...) will be nice. Just extract and run. https://github.com/Jackett/Jackett/releases/tag/v0.16.2157
    • Official Docker image. It's really easy to create an account in DockerHub and upload the image. Some users are uploading unofficial images and I don't know if they are legit or updated... The best solution will be to ask for help in https://github.com/linuxserver but it can take too much time...
    • Update Docker installation documentation from registry. Maybe we should split the readme into several files. Most end users are not interested in development or internals, just installation.
  • Improvements
    • Build/Docker for ARM architecture. Around 40% of Jackett's users are running Jackett on ARM boards like Raspberry Pi and NAS devices like QNAP, Synology... There are some experiments with Puppeteer on ARM out there. Maybe we can do something.
    • Reduce Docker image size. Current Docker image takes 1.4GB. I'm sure we can do better. Space can be a problem in embedded devices.
    • Include all captcha resolvers in the Docker image / installer. The user can't use hcaptcha-solver with current Docker image.
    • Reduce Memory usage. Memory usage when IDLE is around 130MB. It's acceptable but memory is a scarce resource in embedded systems.

I don't expect all of these issues to be resolved anytime soon, but we need CloudProxy to be easy to install so that we can write a short installation guide in Jackett.

Original discussion: Jackett/Jackett#9029

Raspberry Pi - exec user process caused

Describe the bug
docker-compose up -d isn't working. It fails at step 4/15.

To Reproduce
Steps to reproduce the behavior:

  • Git clone the repo
  • Run docker compose up -d

Expected behavior
I expected it to start successfully.

Screenshots
image

Desktop (please complete the following information):

  • OS: Raspbian ARM64
  • Version: latest (cloned from git)

Additional context
Error log:

Building cloudproxy
Step 1/15 : FROM --platform=${TARGETPLATFORM:-linux/amd64} node:15.2.1-alpine3.11
 ---> 7ddc154413f5
Step 2/15 : ARG TARGETPLATFORM
 ---> Using cache
 ---> 5a09e0dc10a1
Step 3/15 : ARG BUILDPLATFORM
 ---> Using cache
 ---> ec19d88eb66d
Step 4/15 : RUN printf "I am running on ${BUILDPLATFORM:-linux/amd64}, building for ${TARGETPLATFORM:-linux/amd64}\n$(uname -a)\n"
 ---> [Warning] The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
 ---> Running in c483a3a6978b
standard_init_linux.go:219: exec user process caused: exec format error
ERROR: Service 'cloudproxy' failed to build: The command '/bin/sh -c printf "I am running on ${BUILDPLATFORM:-linux/amd64}, building for ${TARGETPLATFORM:-linux/amd64}\n$(uname -a)\n"' returned a non-zero code: 1
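The log above suggests that TARGETPLATFORM was empty during this build, so the Dockerfile's ${TARGETPLATFORM:-linux/amd64} fell back to linux/amd64 on an arm64 host. Docker fills in TARGETPLATFORM automatically only when the build runs through BuildKit. The sketch below mimics that fallback and the resulting mismatch; the helper names are made up for illustration and are not part of CloudProxy.

```typescript
// Mirrors the shell expansion ${VAR:-fallback}: the fallback is used when
// the variable is unset or empty.
function expandWithDefault(value: string | undefined, fallback: string): string {
  return value === undefined || value === "" ? fallback : value;
}

// Hypothetical helper: compares only "os/arch" and ignores a variant suffix
// such as "/v8", so "linux/arm64" and "linux/arm64/v8" count as equal.
function platformsMatch(imagePlatform: string, hostPlatform: string): boolean {
  const osAndArch = (p: string): string => p.split("/").slice(0, 2).join("/");
  return osAndArch(imagePlatform) === osAndArch(hostPlatform);
}

// Without BuildKit, TARGETPLATFORM is not set, so the default wins.
const resolvedPlatform = expandWithDefault(undefined, "linux/amd64");
console.log(resolvedPlatform, platformsMatch(resolvedPlatform, "linux/arm64/v8"));
```

When the two platforms differ and no emulation is installed, the first RUN step cannot execute the image's binaries, which matches the "exec format error" at step 4/15.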

Session related suggestions

Hi mate, I see you converted to TS. It looks really great now and works nicely for me, especially with a custom session profile, which seems to help avoid triggering the Captcha after a couple of retries of request.get. I have a few suggestions:

  • When using the session param without a custom User-Agent, a randomly named browser profile folder is created instead of the custom-named folder. Is there a reason for this? Why not always create the named folder when the session param is used? I believe the session and cookies would then be reused even after a server or Docker app restart.

  • I also noticed that named profile folders get deleted after a server restart, I guess because the Docker container's storage is not persistent. Maybe we could add a config param for a custom temp folder so that a persistent volume could be mounted on it?

  • Lastly, when request.get is used with a session param and the session does not exist, it returns a notice to create a session first. Why not call sessions.create automatically, as is already done when no session param is provided?
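The suggestions above amount to a "get or create" lookup keyed by the session ID, with the profile folder named after that ID under a configurable base directory. A minimal sketch, assuming an in-memory store; SessionStore, profileDir and baseDir are hypothetical names, not CloudProxy's actual code.

```typescript
interface Session {
  id: string;
  profileDir: string; // hypothetical: folder holding the browser profile
}

class SessionStore {
  private readonly sessions = new Map<string, Session>();
  private readonly baseDir: string;

  // baseDir could point at a mounted volume so profiles survive restarts.
  constructor(baseDir: string) {
    this.baseDir = baseDir;
  }

  // Returns the existing session or creates it on first use, always in a
  // folder named after the session ID rather than a random one.
  getOrCreate(id: string): Session {
    let session = this.sessions.get(id);
    if (session === undefined) {
      session = { id, profileDir: `${this.baseDir}/${id}` };
      this.sessions.set(id, session);
    }
    return session;
  }

  has(id: string): boolean {
    return this.sessions.has(id);
  }
}

const store = new SessionStore("/data/profiles");
console.log(store.getOrCreate("jackett").profileDir);
```

With this shape, a request that names an unknown session creates it instead of returning an error, and the deterministic folder name lets a restarted server find the old profile again.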

Anyway, thanks for the great app!

Jx-

Docker Build Fails

Describe the bug

  • Docker Image Build fails

To Reproduce

  • Run docker build against the dockerfile

Expected behavior

  • Build succeeds

Desktop (please complete the following information):

  • OS: Ubuntu 21.04
  • Docker Version: Docker version 20.10.6, build 370c289

Additional context
src/captcha/anti-captcha.ts(3,53): error TS2339: Property 'userAgent' does not exist on type 'SolverOptions'.
src/captcha/anti-captcha.ts(3,64): error TS2339: Property 'proxy' does not exist on type 'SolverOptions'.
src/captcha/anti-captcha.ts(3,71): error TS2339: Property 'apiKey' does not exist on type 'SolverOptions'.
npm notice
npm notice New minor version of npm available! 7.0.8 -> 7.13.0
npm notice Changelog: https://github.com/npm/cli/releases/tag/v7.13.0
npm notice Run npm install -g npm@7.13.0 to update!
npm notice
npm ERR! code 2
npm ERR! path /home/node/cloudproxy
npm ERR! command failed
npm ERR! command sh -c tsc

npm ERR! A complete log of this run can be found in:
npm ERR! /home/node/.npm/_logs/2021-05-14T16_13_36_622Z-debug.log
The command '/bin/sh -c npm install && npm run build && rm -rf src tsconfig.json && npm prune --production' returned a non-zero code: 2

Unfortunately I'm not familiar enough with TS to recommend a fix.
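TS2339 is reported when code reads a property that the declared type does not have. The three errors all sit on line 3 of anti-captcha.ts, which looks like a destructuring of userAgent, proxy and apiKey from a SolverOptions value. A sketch of the kind of declaration that would satisfy the compiler follows; the field list and the helper are assumptions, and the real interface in the repository may differ.

```typescript
// Hypothetical shape of SolverOptions; only the three optional fields are
// taken from the error messages, the rest is made up for the example.
interface SolverOptions {
  url: string;
  sitekey: string;
  userAgent?: string;
  proxy?: string;
  apiKey?: string;
}

// The destructuring compiles only because every name is declared above.
// Removing `proxy` from the interface would bring back error TS2339 here.
function describeSolverCall({ url, sitekey, userAgent, proxy, apiKey }: SolverOptions): string {
  return [
    url,
    sitekey,
    userAgent ?? "default-ua",
    proxy ?? "no-proxy",
    apiKey ? "key-set" : "key-missing",
  ].join(" | ");
}

console.log(describeSolverCall({ url: "https://example.com", sitekey: "abc" }));
```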

Unable to process browser request when downloading a PDF

Describe the bug
I'm trying to download a PDF from https://www.mdpi.com/1996-1944/12/18/2995/pdf, but CloudProxy throws an error.

To Reproduce
Make the following request:

{
    'cmd': 'request.get',
    'url': 'https://www.mdpi.com/1996-1944/12/18/2995/pdf',
    'userAgent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:95.0) Gecko/20100101 Firefox/95.0',
    'download': True
} # Python syntax

Expected behavior
A PDF is downloaded as a byte stream

Desktop (please complete the following information):

  • OS: Linux
  • Browser: Any
  • Version: 2.1.1

Additional context
This is the stack trace:

cloudproxy_1  | 2022-01-07T23:13:16.548Z INFO REQ-0 CloudProxy v2.1.1 listening on http://0.0.0.0:8191
cloudproxy_1  | 2022-01-07T23:13:24.087Z INFO REQ-1 Incoming request: POST /v1
cloudproxy_1  | 2022-01-07T23:13:24.089Z INFO REQ-1 Params: {"cmd":"request.get","url":"https://www.mdpi.com/1996-1944/12/18/2995/pdf","download":true}
cloudproxy_1  | 2022-01-07T23:13:24.090Z DEBUG REQ-1 Launching headless browser...
cloudproxy_1  | 2022-01-07T23:13:24.390Z DEBUG REQ-1 Adding custom headers: {}
cloudproxy_1  | 2022-01-07T23:13:24.390Z DEBUG REQ-1 { headers: [Function (anonymous)] }
cloudproxy_1  | 2022-01-07T23:13:24.397Z DEBUG REQ-1 Navigating to... https://www.mdpi.com/1996-1944/12/18/2995/pdf
cloudproxy_1  | 2022-01-07T23:13:24.407Z DEBUG REQ-1 {
cloudproxy_1  |   headers: {
cloudproxy_1  |     'upgrade-insecure-requests': '1',
cloudproxy_1  |     'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.113 Safari/537.36',
cloudproxy_1  |     'accept-language': 'en-US,en;q=0.9'
cloudproxy_1  |   }
cloudproxy_1  | }
cloudproxy_1  | 2022-01-07T23:13:25.506Z ERROR REQ-1 Error: net::ERR_ABORTED at https://www.mdpi.com/1996-1944/12/18/2995/pdf
cloudproxy_1  |     at navigate (/home/node/cloudproxy/node_modules/puppeteer/lib/FrameManager.js:95:23)
cloudproxy_1  |     at processTicksAndRejections (node:internal/process/task_queues:93:5)
cloudproxy_1  |     at async FrameManager.navigateFrame (/home/node/cloudproxy/node_modules/puppeteer/lib/FrameManager.js:70:21)
cloudproxy_1  |     at async Frame.goto (/home/node/cloudproxy/node_modules/puppeteer/lib/FrameManager.js:295:16)
cloudproxy_1  |     at async Page.goto (/home/node/cloudproxy/node_modules/puppeteer/lib/Page.js:485:16)
cloudproxy_1  |     at async resolveChallenge (/home/node/cloudproxy/dist/routes.js:71:20)
cloudproxy_1  |     at async browserRequest (/home/node/cloudproxy/dist/routes.js:285:22)
cloudproxy_1  |     at async request.get (/home/node/cloudproxy/dist/routes.js:325:9)
cloudproxy_1  |     at async Object.Router [as default] (/home/node/cloudproxy/dist/routes.js:346:16)
cloudproxy_1  |   -- ASYNC --
cloudproxy_1  |     at Frame.<anonymous> (/home/node/cloudproxy/node_modules/puppeteer/lib/helper.js:94:19)
cloudproxy_1  |     at Page.goto (/home/node/cloudproxy/node_modules/puppeteer/lib/Page.js:485:53)
cloudproxy_1  |     at Page.<anonymous> (/home/node/cloudproxy/node_modules/puppeteer/lib/helper.js:95:27)
cloudproxy_1  |     at resolveChallenge (/home/node/cloudproxy/dist/routes.js:71:31)
cloudproxy_1  |     at browserRequest (/home/node/cloudproxy/dist/routes.js:285:28)
cloudproxy_1  |     at processTicksAndRejections (node:internal/process/task_queues:93:5)
cloudproxy_1  |     at async request.get (/home/node/cloudproxy/dist/routes.js:325:9)
cloudproxy_1  |     at async Object.Router [as default] (/home/node/cloudproxy/dist/routes.js:346:16)
cloudproxy_1  | 2022-01-07T23:13:25.507Z ERROR REQ-1 Unable to process browser request

CloudProxy gets stuck on jsch challenge

So I've been implementing CloudProxy on one of my projects, basically I'm trying to download plugins from SpigotMC.org and everything works fine until I want to download a plugin.

When browsing normally, the first time I download a plugin, I'll get the following screen:
image
And the download starts.

But with CloudProxy, this happens:

2020-08-07T01:06:09.818Z DEBUG REQ-99 Navegating to... https://www.spigotmc.org/resources/cmi-270-commands-insane-kits-portals-essentials-economy-mysql-sqlite-much-more.3742/download?version=349138
2020-08-07T01:06:09.820Z DEBUG REQ-99 {
  headers: {
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.0 Safari/537.36',
    'accept-language': 'en-US,en;q=0.9'
  }
}
2020-08-07T01:06:09.919Z INFO REQ-99 Cloudflare detected
2020-08-07T01:06:09.923Z DEBUG REQ-99 Waiting for Cloudflare challenge...
2020-08-07T01:06:14.127Z DEBUG REQ-99 {
  headers: {
    'upgrade-insecure-requests': '1',
    origin: 'https://www.spigotmc.org',
    'content-type': 'application/x-www-form-urlencoded',
    'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.0 Safari/537.36',
    'accept-language': 'en-US,en;q=0.9',
    referer: 'https://www.spigotmc.org/resources/cmi-270-commands-insane-kits-portals-essentials-economy-mysql-sqlite-much-more.3742/download?version=349138'
  }
}
2020-08-07T01:06:15.926Z DEBUG REQ-99 Found challenge element again...
2020-08-07T01:06:15.928Z DEBUG REQ-99 {
  headers: {
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.0 Safari/537.36',
    'accept-language': 'en-US,en;q=0.9'
  }
}
TimeoutError: Navigation timeout of 30000 ms exceeded
    at /home/lululombard/Workspace/RedCraft/CloudProxy/node_modules/puppeteer/lib/LifecycleWatcher.js:100:111
    at FrameManager.waitForFrameNavigation (/home/lululombard/Workspace/RedCraft/CloudProxy/node_modules/puppeteer/lib/FrameManager.js:107:23)
    at Frame.waitForNavigation (/home/lululombard/Workspace/RedCraft/CloudProxy/node_modules/puppeteer/lib/FrameManager.js:298:16)
    at Page.waitForNavigation (/home/lululombard/Workspace/RedCraft/CloudProxy/node_modules/puppeteer/lib/Page.js:492:16)
    at async Promise.all (index 0)
    at Page.reload (/home/lululombard/Workspace/RedCraft/CloudProxy/node_modules/puppeteer/lib/Page.js:488:24)
    at resolveChallenge (/home/lululombard/Workspace/RedCraft/CloudProxy/src/routes.ts:112:24)
    at request.get (/home/lululombard/Workspace/RedCraft/CloudProxy/src/routes.ts:330:18)
    at Object.Router [as default] (/home/lululombard/Workspace/RedCraft/CloudProxy/src/routes.ts:349:23)
  -- ASYNC --
    at Frame.<anonymous> (/home/lululombard/Workspace/RedCraft/CloudProxy/node_modules/puppeteer/lib/helper.js:94:19)
    at Page.waitForNavigation (/home/lululombard/Workspace/RedCraft/CloudProxy/node_modules/puppeteer/lib/Page.js:492:53)
    at Page.<anonymous> (/home/lululombard/Workspace/RedCraft/CloudProxy/node_modules/puppeteer/lib/helper.js:95:27)
    at Page.reload (/home/lululombard/Workspace/RedCraft/CloudProxy/node_modules/puppeteer/lib/Page.js:488:48)
    at Page.<anonymous> (/home/lululombard/Workspace/RedCraft/CloudProxy/node_modules/puppeteer/lib/helper.js:95:27)
    at resolveChallenge (/home/lululombard/Workspace/RedCraft/CloudProxy/src/routes.ts:112:35)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (internal/process/task_queues.js:93:5)
    at request.get (/home/lululombard/Workspace/RedCraft/CloudProxy/src/routes.ts:330:18)
    at Object.Router [as default] (/home/lululombard/Workspace/RedCraft/CloudProxy/src/routes.ts:349:23)
2020-08-07T01:06:45.956Z ERROR REQ-99 Navigation timeout of 30000 ms exceeded

I think it's probably related to the // TODO: find out why these pages hang sometimes comment in the code, but no matter how many times I try, it fails on that link. You can add me on Discord (lululombard#1337) if you require more information. I can even give you the credentials to my account so you can run it yourself to debug :)

Return error when Captcha is detected and no solver is defined

I think it would be better to return an error instead of success when a captcha is triggered but cannot be solved. A small change around line 180 in routes.ts would do:

} else {
  message = 'Captcha detected but \'CAPTCHA_SOLVER\' not set in ENV.'
  ctx.errorResponse(message)
  return
}
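The intended control flow can be written as a small pure function, which makes the three outcomes easy to see. This is only a sketch: CaptchaOutcome and decideCaptchaOutcome are made-up names, and the real handler in routes.ts works on the request context instead of returning a value.

```typescript
type CaptchaOutcome =
  | { status: "ok" }
  | { status: "solve"; solver: string }
  | { status: "error"; message: string };

// solverName stands in for the CAPTCHA_SOLVER environment variable.
function decideCaptchaOutcome(captchaDetected: boolean, solverName: string | undefined): CaptchaOutcome {
  if (!captchaDetected) {
    return { status: "ok" };
  }
  if (solverName === undefined || solverName === "") {
    // Proposed behavior: fail loudly instead of reporting success.
    return { status: "error", message: "Captcha detected but 'CAPTCHA_SOLVER' not set in ENV." };
  }
  return { status: "solve", solver: solverName };
}

console.log(decideCaptchaOutcome(true, undefined).status);
```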

Cheers and thanks again for the excellent coding!

Jx-

Add socks proxy

Can we have a socks proxy that uses an open session for all its requests transparently? I.e., by setting this proxy on a tool, that tool will be using CloudProxy transparently. Most tools support socks proxies, so this will make it very easy to use CloudProxy with them.

POST application/json

How do I send application/json?

My request body:
{
  "cmd": "request.post",
  "url": "https://jsonplaceholder.typicode.com/posts",
  "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.0 Safari/537.36",
  "postData": { "title": "foo", "body": "bar", "userId": 1 },
  "headers": { "content-type": "application/json" }
}

My response:
{ "status": "error", "message": "Unable to process browser request", "startTimestamp": 1668098616053, "endTimestamp": 1668098646678, "version": "2.1.1" }

curl:
curl --location --request POST 'http://localhost:8191/v1' \
--header 'Content-Type: application/json' \
--data-raw '{
  "cmd": "request.post",
  "url": "https://jsonplaceholder.typicode.com/posts",
  "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.0 Safari/537.36",
  "postData": { "title": "foo", "body": "bar", "userId": 1 },
  "headers": { "content-type": "application/json" }
}'

Error in console:
ERROR REQ-15 TimeoutError: Navigation timeout of 30000 ms exceeded
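One thing worth trying, on the assumption that postData is expected to be a string rather than a nested object, is to serialize the payload before placing it in the command. The sketch below only builds the command body; whether CloudProxy then forwards it as JSON is not confirmed here, and buildJsonPost is a made-up helper name.

```typescript
interface PostCommand {
  cmd: "request.post";
  url: string;
  postData: string; // assumption: the body is passed as a string
  headers: Record<string, string>;
}

// Serializes the payload so that postData is a JSON string, not an object.
function buildJsonPost(url: string, payload: unknown): PostCommand {
  return {
    cmd: "request.post",
    url,
    postData: JSON.stringify(payload),
    headers: { "content-type": "application/json" },
  };
}

const command = buildJsonPost("https://jsonplaceholder.typicode.com/posts", {
  title: "foo",
  body: "bar",
  userId: 1,
});
console.log(JSON.stringify(command, null, 2));
```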

jayspov.net solving failed

The request:

curl -L -X POST 'http://localhost:8191/v1' \
-H 'Content-Type: application/json' \
--data-raw '{
"cmd": "request.get",
"url":"https://jayspov.net",
"userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.0 Safari/537.36",
"maxTimeout": 60000,
"headers": {"upgrade-insecure-requests": "1", "user-agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.0 Safari/537.36", "accept-language": "en-US,en;q=0.9"}
}'

The corresponding log:

INFO REQ-16 Cloudflare detected
DEBUG REQ-16 No '#trk_jschal_js' challenge element detected.
DEBUG REQ-16 No '.ray_id' challenge element detected.
DEBUG REQ-16 No '.attack-box' challenge element detected.
INFO REQ-16 Successful response in 2.869 s

But the response contains a page with a 403 error:
... ,"headers":{"status":"403", ...

I hope you can fix it. Thanks for your time!
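Until this is handled on the server side, a client can guard against it by checking the upstream status inside the returned headers instead of trusting the "Successful response" log line. A minimal sketch, assuming the headers arrive as a flat string map like the fragment quoted above; both function names are made up.

```typescript
// Reads the upstream HTTP status from the returned headers, if present.
function upstreamStatus(headers: Record<string, string>): number | undefined {
  const raw = headers["status"];
  if (raw === undefined) {
    return undefined;
  }
  const code = parseInt(raw, 10);
  return isNaN(code) ? undefined : code;
}

// Treats any 4xx/5xx upstream status as a failed bypass.
function isUpstreamError(headers: Record<string, string>): boolean {
  const code = upstreamStatus(headers);
  return code !== undefined && code >= 400;
}

console.log(isUpstreamError({ status: "403" }));
```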

Wrong browser product

Hey mate, you are using product="chrome" but the Dockerfile installs Firefox. Does Chrome work better?

I am also getting protocol errors when trying to use headers. Maybe this is related to Firefox?

2020-07-16T07:52:51.594Z DEBUG REQ-2 Adding custom headers: {
  "Referer": "https://google.com",
  "X-Requested-With": "XMLHttpRequest"
}
Error: Protocol error (Fetch.enable): Fetch.enable RemoteAgentError@chrome://remote/content/Error.jsm:25:5
UnknownMethodError@chrome://remote/content/Error.jsm:108:7
execute@chrome://remote/content/domains/DomainCache.jsm:96:13
receiveMessage@chrome://remote/content/sessions/ContentProcessSession.jsm:86:45
......
.....

hCaptcha protected sites without 'data-sitekey' and 'challenge-form submit' elements

Some sites seem to obfuscate their site-keys and other elements on the page (or they use the hCaptcha stuff in a different way). Since hcaptcha-solver doesn't require the sitekey, I made some changes so the code doesn't break when these elements are not found:

routes.ts
lines 157ff

let sitekey = null
if (captchaType != 'hCaptcha' && process.env.CAPTCHA_SOLVER != 'hcaptcha-solver') {
  const sitekeyElem = await page.$('*[data-sitekey]')
  if (!sitekeyElem) { return ctx.errorResponse('Could not find sitekey!') }
  sitekey = await sitekeyElem.evaluate((e) => e.getAttribute('data-sitekey'))
}

lines 187ff

try {
  await page.waitForSelector('#challenge-form [type=submit]', {
    timeout: 5000
  })
} catch (err) {
  log.debug(`No '#challenge-form [type=submit]' element detected.`)
}

I can make a PR if you like but maybe you want to code it differently or look into the issue.

Cheers,

Jx-

[Not a bug] Just some help

Hello @NoahCardoza, can your library be used in my script to bypass Cloudflare and reCAPTCHA?
If so, could you show me an example of how to upgrade my script? Here is my example script (I used the hooman library, but it has been patched). Thanks.

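const hooman = require('hooman');

const url = process.argv[2];

// `proxies` is the pool of proxy agents defined elsewhere in the script.
// The function name is a placeholder; the original wrapper was not posted.
function fetchThroughProxy(proxy) {
  return new Promise((resolve, reject) => {
    hooman.get(url, {
      agent: {
        https: proxy,
      },
      cloudflareRetry: 10,
    })
      .then(response => {
        resolve(response);
      })
      .catch((error) => {
        // error.response is missing when the request never got a reply
        if (error.response) console.log(error.response.body);
        // Drop the failing proxy from the pool, if it is still in it
        const index = proxies.indexOf(proxy);
        if (index !== -1) proxies.splice(index, 1);
        console.log(error.message);
        reject(error.message);
      });
  });
}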

Npm build / Docker build fails

Describe the bug
I wasn't able to compile the source code using docker build -t cloudproxy:latest . (the same happens outside Docker, so Docker itself is not the cause). The build fails with the following errors:

1 package is looking for funding
  run `npm fund` for details

2 vulnerabilities (1 moderate, 1 high)

To address all issues, run:
  npm audit fix

Run `npm audit` for details.
npm notice
npm notice New minor version of npm available! 7.0.8 -> 7.14.0
npm notice Changelog: <https://github.com/npm/cli/releases/tag/v7.14.0>
npm notice Run `npm install -g npm@7.14.0` to update!
npm notice

> [email protected] build
> tsc

src/captcha/anti-captcha.ts(3,53): error TS2339: Property 'userAgent' does not exist on type 'SolverOptions'.
src/captcha/anti-captcha.ts(3,64): error TS2339: Property 'proxy' does not exist on type 'SolverOptions'.
src/captcha/anti-captcha.ts(3,71): error TS2339: Property 'apiKey' does not exist on type 'SolverOptions'.
npm notice
npm notice New minor version of npm available! 7.0.8 -> 7.14.0
npm notice Changelog: <https://github.com/npm/cli/releases/tag/v7.14.0>
npm notice Run `npm install -g npm@7.14.0` to update!
npm notice
npm ERR! code 2
npm ERR! path /home/node/cloudproxy
npm ERR! command failed
npm ERR! command sh -c tsc

npm ERR! A complete log of this run can be found in:
npm ERR!     /home/node/.npm/_logs/2021-05-25T12_44_18_247Z-debug.log
The command '/bin/sh -c npm install &&     npm run build &&     rm -rf src tsconfig.json &&     npm prune --production' returned a non-zero code: 2

To Reproduce
Steps to reproduce the behavior:
docker build -t somename:latest .

Expected behavior
A docker image of the program.

Suggested Solution
So I've done some digging and I found out the following commit broke the code:
280a640
so I reverted that commit using

git revert 280a640d814e39aac9f459b8214ed1b8efd080ff

and then everything seemed to compile and build successfully, and afterwards I tested the program against a website I wanted to try, and it succeeded.

I hope it would be of some help.

Compile Error

Describe the bug
When I run npm run build, the compiler throws an error.

Using [email protected] and [email protected]

To Reproduce
Clone the repository
Install using PUPPETEER_PRODUCT=chrome npm install
Run npm run build

Expected behavior
The project compiles.

Desktop (please complete the following information):

  • OS: Ubuntu 22.04
  • Browser: Chrome
  • Version: Idk

Additional context
Log:

> [email protected] build
> tsc

node_modules/got/dist/source/core/index.d.ts:12:15 - error TS1005: ',' expected.

12 import { type PlainResponse, type Response } from './response.js';
                 ~~~~~~~~~~~~~

node_modules/got/dist/source/core/index.d.ts:12:35 - error TS1005: ',' expected.

12 import { type PlainResponse, type Response } from './response.js';
                                     ~~~~~~~~

node_modules/got/dist/source/core/options.d.ts:21:29 - error TS1005: ',' expected.

21 import http2wrapper, { type ClientHttp2Session } from 'http2-wrapper';
                               ~~~~~~~~~~~~~~~~~~


Found 3 errors.
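All three errors point at import { type Name } statements in got's declaration files. Inline type modifiers on import names were introduced in TypeScript 4.5, so an older compiler reports TS1005 at exactly these positions. The versions in the report above are not legible, so treating an outdated TypeScript as the cause is an assumption. The check can be sketched as follows; supportsInlineTypeImports is a made-up helper.

```typescript
// Returns true when the given TypeScript version understands
// `import { type Foo } from "..."` (added in TypeScript 4.5).
function supportsInlineTypeImports(tsVersion: string): boolean {
  const parts = tsVersion.split(".").map((part) => parseInt(part, 10));
  const major = parts[0] ?? 0;
  const minor = parts[1] ?? 0;
  if (isNaN(major) || isNaN(minor)) {
    return false;
  }
  return major > 4 || (major === 4 && minor >= 5);
}

console.log(supportsInlineTypeImports("4.4.4"), supportsInlineTypeImports("4.5.2"));
```

If the installed compiler turns out to be older than 4.5, upgrading the typescript dev dependency or pinning got to a release whose typings predate this syntax are the two usual ways out.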

Sandbox error

When trying to curl the docker image I get the following issues

 curl -L -X POST 'http://docker01:8191/v1' \
-H 'Content-Type: application/json' \
--data '{
  "cmd": "request.get",
  "url" : "https://www.google.com/",
  "userAgent": "Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:79.0) Gecko/20100101 Firefox/79.0",
  "maxTimeout": 60000
}'
{"status":"error","message":"Failed to launch the browser process!\n[0806/174417.347735:FATAL:zygote_host_impl_linux.cc(116)] No usable sandbox! Update your kernel or see https://chromium.googlesource.com/chromium/src/+/master/docs/linux/suid_sandbox_development.md for more information on developing with the SUID sandbox. If you want to live dangerously and need an immediate workaround, you can try using --no-sandbox.\n#0 0x55fdfccc2399 base::debug::CollectStackTrace()\n#1 0x55fdfcc232a3 base::debug::StackTrace::StackTrace()\n#2 0x55fdfcc34c95 logging::LogMessage::~LogMessage()\n#3 0x55fdfe51140e service_manager::ZygoteHostImpl::Init()\n#4 0x55fdfc7ed060 content::ContentMainRunnerImpl::Initialize()\n#5 0x55fdfc84e5e7 service_manager::Main()\n#6 0x55fdfc7eb631 content::ContentMain()\n#7 0x55fdfc84d80d headless::(anonymous namespace)::RunContentMain()\n#8 0x55fdfc84d50c headless::HeadlessShellMain()\n#9 0x55fdfa2415a7 ChromeMain\n#10 0x7f2628654b45 __libc_start_main\n#11 0x55fdfa2413ea _start\n\nReceived signal 6\n#0 0x55fdfccc2399 base::debug::CollectStackTrace()\n#1 0x55fdfcc232a3 base::debug::StackTrace::StackTrace()\n#2 0x55fdfccc1f35 base::debug::(anonymous namespace)::StackDumpSignalHandler()\n#3 0x7f262eb1f890 (/lib/x86_64-linux-gnu/libpthread-2.19.so+0xf88f)\n#4 0x7f2628668067 gsignal\n#5 0x7f2628669448 abort\n#6 0x55fdfccc0e95 base::debug::BreakDebugger()\n#7 0x55fdfcc35132 logging::LogMessage::~LogMessage()\n#8 0x55fdfe51140e service_manager::ZygoteHostImpl::Init()\n#9 0x55fdfc7ed060 content::ContentMainRunnerImpl::Initialize()\n#10 0x55fdfc84e5e7 service_manager::Main()\n#11 0x55fdfc7eb631 content::ContentMain()\n#12 0x55fdfc84d80d headless::(anonymous namespace)::RunContentMain()\n#13 0x55fdfc84d50c headless::HeadlessShellMain()\n#14 0x55fdfa2415a7 ChromeMain\n#15 0x7f2628654b45 __libc_start_main\n#16 0x55fdfa2413ea _start\n  r8: 0000000000000000  r9: 0000000000000000 r10: 0000000000000008 r11: 0000000000000206\n r12: 00007ffd23825178 r13: 
00007ffd23824150 r14: 00007ffd23825180 r15: aaaaaaaaaaaaaaaa\n  di: 000000000000001d  si: 000000000000001d  bp: 00007ffd23824100  bx: 00007ffd23824174\n  dx: 0000000000000006  ax: 0000000000000000  cx: ffffffffffffffff  sp: 00007ffd23823fc8\n  ip: 00007f2628668067 efl: 0000000000000206 cgf: aaaa000000000033 erf: 0000000000000000\n trp: 0000000000000000 msk: 0000000000000000 cr2: 0000000000000000\n[end of stack trace]\nCalling _exit(1). Core file will not be generated.\n\n\nTROUBLESHOOTING: https://github.com/puppeteer/puppeteer/blob/master/docs/troubleshooting.md\n","startTimestamp":1596735853720,"endTimestamp":1596735857863,"version":"1.0.0"}%   

I think this did not get applied or fixed: #8 (comment)
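Chrome's own error message names --no-sandbox as an immediate workaround, with the caveat that it weakens isolation. One way to keep that choice explicit is to build the launch arguments from an opt-in setting. In the sketch below, CLOUDPROXY_NO_SANDBOX is a made-up variable name and buildChromeArgs a made-up helper; the resulting array would be passed as the args option of puppeteer.launch.

```typescript
// Builds extra Chrome flags from an environment-like map.
// CLOUDPROXY_NO_SANDBOX is hypothetical and not an existing CloudProxy setting.
function buildChromeArgs(env: Record<string, string | undefined>): string[] {
  const args: string[] = [];
  if (env["CLOUDPROXY_NO_SANDBOX"] === "true") {
    // Disables Chrome's sandbox; only sensible inside an already isolated
    // container where the kernel cannot provide a usable sandbox.
    args.push("--no-sandbox", "--disable-setuid-sandbox");
  }
  return args;
}

console.log(buildChromeArgs({ CLOUDPROXY_NO_SANDBOX: "true" }));
```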
