Comments (4)
Thank you for reporting. We will investigate the issue and fix it shortly.
from crawlee.
Regarding the second issue: this happens because the autoscaled pool is configured to finish when no task is in progress and workerFunction returns null. We will add a configuration parameter to keep the pool running in this case, and a new method to finish it manually.
Regarding the first issue: I ran some tests and it seems to be working correctly. The pool checks for spare capacity and tries to run new worker functions every time one of the functions finishes. There is also an interval, firing every 1 s, that checks for spare capacity and possibly starts new worker functions.
In your example you are adding a new task every 900 ms and each task takes 1000 ms, so there should be about 1 task running most of the time.
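To make the behavior described above easier to follow, here is a minimal sketch of the scheduling logic. It is not the library's actual implementation: `MiniPool`, `maybeRunTasks`, `maxConcurrency`, and `checkIntervalMs` are hypothetical names chosen for illustration, and real autoscaling (CPU and memory checks) is left out. It only shows the two points from this thread: the pool looks for spare capacity whenever a task finishes and on a periodic interval, and it finishes as soon as nothing is in progress and the worker function returns null.

```typescript
// Simplified illustration only; not the library's actual AutoscaledPool code.
// A worker function returns a promise for the next task, or null if there is none.
type WorkerFunction = () => Promise<void> | null;

class MiniPool {
  private running = 0;
  private finished = false;
  private timer: ReturnType<typeof setInterval> | undefined;
  private resolveRun: (() => void) | undefined;

  constructor(
    private readonly workerFunction: WorkerFunction,
    private readonly maxConcurrency: number, // hypothetical fixed limit, no autoscaling
    private readonly checkIntervalMs = 1000, // the 1 s interval mentioned above
  ) {}

  // Resolves once the pool decides it is finished.
  run(): Promise<void> {
    return new Promise<void>((resolve) => {
      this.resolveRun = resolve;
      // Periodic check for spare capacity.
      this.timer = setInterval(() => this.maybeRunTasks(), this.checkIntervalMs);
      this.maybeRunTasks();
    });
  }

  private maybeRunTasks(): void {
    if (this.finished) return;
    while (this.running < this.maxConcurrency) {
      const task = this.workerFunction();
      if (task === null) {
        // No new task AND nothing in progress: the pool finishes.
        // This is the behavior reported as the second issue.
        if (this.running === 0) this.finish();
        return;
      }
      this.running++;
      task
        .catch(() => undefined) // error handling omitted in this sketch
        .then(() => {
          this.running--;
          // A task finished, so look for spare capacity again.
          this.maybeRunTasks();
        });
    }
  }

  private finish(): void {
    this.finished = true;
    if (this.timer !== undefined) clearInterval(this.timer);
    this.resolveRun?.();
  }
}
```

With a worker function that only occasionally has a task to hand out, this sketch would resolve `run()` the first time it sees null while idle, which is why an option to keep the pool alive (plus a manual finish method) is needed for that use case.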
Thanks for the details.
I now understand the reason for the first issue. It would be worth mentioning in the docs!