Comments (10)

brandonrobertz commented on June 10, 2024

I pushed a fix for this. Will you pull the latest and give it a shot?

from autoscrape-py.

golfecholima commented on June 10, 2024

It 'worked': it gave a success message but only scraped the first page. Form depth was set to zero:

celery_1    | [2019-04-09 18:04:00,399: WARNING/ForkPoolWorker-2] Completed iteration!
celery_1    | [2019-04-09 18:04:00,399: WARNING/ForkPoolWorker-2] Completed iteration!
celery_1    | [2019-04-09 18:04:00,399: DEBUG/ForkPoolWorker-2] Completed iteration!
celery_1    | [2019-04-09 18:04:00,401: WARNING/ForkPoolWorker-2] Scrape complete! Exiting.
celery_1    | [2019-04-09 18:04:00,401: WARNING/ForkPoolWorker-2] Scrape complete! Exiting.
celery_1    | [2019-04-09 18:04:00,400: DEBUG/ForkPoolWorker-2] Scrape complete! Exiting.
celery_1    | [2019-04-09 18:04:01,080: WARNING/ForkPoolWorker-2] [Errno 2] No such file or directory: '/tmp/tmpv_z9wvgb'
celery_1    | [2019-04-09 18:04:01,082: INFO/ForkPoolWorker-2] Task autoscrape.tasks.start[b8308024-f90e-48bc-84ff-818606cbc086] succeeded in 99.0302832999987s: None

New issue I imagine?

brandonrobertz commented on June 10, 2024

Can you give me a verbose output? I'm guessing it's not finding your button.

brandonrobertz commented on June 10, 2024

Sorry, I realized verbose output isn't an option in the UI. I need to add that. Are you targeting the "Ward" selector?

golfecholima commented on June 10, 2024

It's not, but the terminal output does have references to ...

flask_1     | 172.18.0.5 - - [09/Apr/2019 19:02:15] "POST /receive/27a58b08-f604-4e91-8c65-8bb8de5b6866 HTTP/1.1" 200 -
celery_1    | [2019-04-09 19:02:15,556: WARNING/ForkPoolWorker-2] Next button not found!
celery_1    | [2019-04-09 19:02:15,556: WARNING/ForkPoolWorker-2] Next button not found!
celery_1    | [2019-04-09 19:02:15,557: WARNING/ForkPoolWorker-2] Next button not found!
celery_1    | [2019-04-09 19:02:15,556: DEBUG/ForkPoolWorker-2] Next button not found!

I am targeting the ward selector, which seems to work, judging by the PNG and the fact that the first page does come through in the results. But there are hundreds of pages after it.

golfecholima commented on June 10, 2024

Can I use the terminal version if I've set up via Docker?

brandonrobertz commented on June 10, 2024

Okay, I ran the scrape you're attempting and found the problem.

The "Next" buttons on the results page aren't part of a form, and they aren't button elements either (which is how most other sites do this). They're just links, indistinguishable from the other links on the page.

I pushed a fix that includes links inside of tables, which I think will be fairly common for other scrapes as well. I tried the scrape and it appears to be working for me now. Pull latest and try for yourself.
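For readers curious what "including links inside of tables" can look like, here is a minimal sketch of the idea. It is not AutoScrape's actual implementation: the class `TableLinkCollector`, the function `find_table_links`, and the sample HTML are all hypothetical names made up for illustration, and it only uses Python's standard `html.parser` module. It walks a page and keeps just the anchors nested inside a `<table>`, which is where this site's "Next" link lives.

```python
from html.parser import HTMLParser


class TableLinkCollector(HTMLParser):
    """Collect (href, text) for every <a href=...> nested inside a <table>.

    Hypothetical helper for illustration only; not part of AutoScrape.
    """

    def __init__(self):
        super().__init__()
        self.table_depth = 0      # > 0 while we are inside at least one <table>
        self.current_href = None  # href of the anchor currently being read
        self.current_text = []    # text fragments seen inside that anchor
        self.links = []           # collected (href, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.table_depth += 1
        elif tag == "a" and self.table_depth > 0:
            # Anchors without an href yield None here and are skipped.
            self.current_href = dict(attrs).get("href")
            self.current_text = []

    def handle_data(self, data):
        if self.current_href is not None:
            self.current_text.append(data)

    def handle_endtag(self, tag):
        if tag == "table" and self.table_depth > 0:
            self.table_depth -= 1
        elif tag == "a" and self.current_href is not None:
            text = "".join(self.current_text).strip()
            self.links.append((self.current_href, text))
            self.current_href = None


def find_table_links(html):
    """Return a list of (href, text) for links that sit inside tables."""
    parser = TableLinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up page: one ordinary link, plus a "Next" link inside a results table.
SAMPLE = """
<a href="/about">About</a>
<table>
  <tr><td><a href="/results?page=2">Next</a></td></tr>
</table>
"""

if __name__ == "__main__":
    print(find_table_links(SAMPLE))
```

On the sample page this keeps the paginator link and ignores the "About" link outside the table. A real scraper would still need to decide which of the collected links is the "Next" one, for example by matching the link text.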

brandonrobertz commented on June 10, 2024

> Can I use the terminal version if I've set up via Docker?

No, you need to follow the "Setup for Standalone Local CLI" instructions to run the CLI version outside of a Docker container.

golfecholima commented on June 10, 2024

This appears to be working ... though I set maxdepth to 10 and it's still cranking away :-)

brandonrobertz commented on June 10, 2024

Awesome, closing this then. :)
