Comments (10)
I pushed a fix for this. Will you pull the latest and give it a shot?
from autoscrape-py.
It 'worked': it gave a success message but only scraped the first page. Form depth was set to zero:
celery_1 | [2019-04-09 18:04:00,399: WARNING/ForkPoolWorker-2] Completed iteration!
celery_1 | [2019-04-09 18:04:00,399: WARNING/ForkPoolWorker-2] Completed iteration!
celery_1 | [2019-04-09 18:04:00,399: DEBUG/ForkPoolWorker-2] Completed iteration!
celery_1 | [2019-04-09 18:04:00,401: WARNING/ForkPoolWorker-2] Scrape complete! Exiting.
celery_1 | [2019-04-09 18:04:00,401: WARNING/ForkPoolWorker-2] Scrape complete! Exiting.
celery_1 | [2019-04-09 18:04:00,400: DEBUG/ForkPoolWorker-2] Scrape complete! Exiting.
celery_1 | [2019-04-09 18:04:01,080: WARNING/ForkPoolWorker-2] [Errno 2] No such file or directory: '/tmp/tmpv_z9wvgb'
celery_1 | [2019-04-09 18:04:01,082: INFO/ForkPoolWorker-2] Task autoscrape.tasks.start[b8308024-f90e-48bc-84ff-818606cbc086] succeeded in 99.0302832999987s: None
New issue, I imagine?
Can you give me a verbose output? I'm guessing it's not finding your button.
Sorry, I realized verbose output isn't an option in the UI. I need to add that. Are you targeting the "Ward" selector?
It's not, but the terminal output does have references to ...
flask_1 | 172.18.0.5 - - [09/Apr/2019 19:02:15] "POST /receive/27a58b08-f604-4e91-8c65-8bb8de5b6866 HTTP/1.1" 200 -
celery_1 | [2019-04-09 19:02:15,556: WARNING/ForkPoolWorker-2] Next button not found!
celery_1 | [2019-04-09 19:02:15,556: WARNING/ForkPoolWorker-2] Next button not found!
celery_1 | [2019-04-09 19:02:15,557: WARNING/ForkPoolWorker-2] Next button not found!
celery_1 | [2019-04-09 19:02:15,556: DEBUG/ForkPoolWorker-2] Next button not found!
I am targeting the ward selector, which seems to work, judging by the PNG and the fact that the first page does come through in the results. But there are hundreds of pages after it.
Can I use the terminal version if I've set up via Docker?
Okay, I ran the scrape you're attempting and found the problem.
The "Next" buttons on the results page aren't part of a form, nor are they button elements (which is how most other sites do this). They're just links, indistinguishable from the other links on the page.
I pushed a fix that includes links inside of tables, which I think will be fairly common for other scrapes as well. I tried the scrape and it appears to be working for me now. Pull the latest and try it yourself.
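For readers following along, here is a minimal sketch of the idea described above: treating plain links inside a table as pagination candidates. It is not the code from the actual fix. The names `TableLinkFinder` and `find_next_link`, the sample HTML, and the "text contains next" matching rule are all assumptions made for illustration, and it uses only the Python standard library.

```python
from html.parser import HTMLParser


class TableLinkFinder(HTMLParser):
    """Collect (href, text) pairs for <a> tags nested inside a <table>."""

    def __init__(self):
        super().__init__()
        self.table_depth = 0      # number of <table> elements we are inside
        self.current_href = None  # href of the table link being read, if any
        self.current_text = []    # text fragments of that link
        self.links = []           # collected (href, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.table_depth += 1
        elif tag == "a" and self.table_depth > 0:
            # Only links that sit inside a table are tracked.
            self.current_href = dict(attrs).get("href")
            self.current_text = []

    def handle_data(self, data):
        if self.current_href is not None:
            self.current_text.append(data)

    def handle_endtag(self, tag):
        if tag == "table" and self.table_depth > 0:
            self.table_depth -= 1
        elif tag == "a" and self.current_href is not None:
            text = "".join(self.current_text).strip()
            self.links.append((self.current_href, text))
            self.current_href = None


def find_next_link(html, label="next"):
    """Return the href of the first table link whose text contains `label`."""
    finder = TableLinkFinder()
    finder.feed(html)
    for href, text in finder.links:
        if label in text.lower():
            return href
    return None


# Hypothetical page: one unrelated link outside the table, pager links inside.
SAMPLE = """
<a href="/about">Next steps for the city</a>
<table>
  <tr><td><a href="/results?page=1">1</a></td>
      <td><a href="/results?page=2">Next</a></td></tr>
</table>
"""

print(find_next_link(SAMPLE))
```

Restricting the search to links inside tables is what keeps the unrelated "Next steps for the city" link from being mistaken for the pager in this sketch.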
Regarding "Can I use the terminal version if I've set up via Docker?": no, you need to follow the "Setup for Standalone Local CLI" instructions to run the CLI version outside of a Docker container.
This appears to be working ... though I set maxdepth to 10 and it's still cranking away :-)
Awesome, closing this then. :)