Comments (8)
@qo4on I had a thought about this over the weekend. I wonder if your problem would be solved by not passing a session at all, so that a new session is created for every request.
My theory is that interleaved requests using the same underlying TCP connection are causing your problem. By default, when you use a session, the TCP connections will be reused. If you do not use a session object, a new session will be created for each request, which should also mean a new TCP connection will be created for each request.
You will have to pass in any necessary headers/cookies to each request, but I think this may help you send multiple requests concurrently, instead of needing to do them all serially.
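To make the difference concrete, here is a minimal sketch that counts TCP connections on the server side. It is a stand-in, not requests or grequests code: it uses the standard library's http.client against a throwaway local server, and the names CountingHandler and count_connections are made up for this illustration. One persistent connection plays the role of a session that reuses its connection; opening a new connection per request plays the role of not using a session.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

connections_seen = []  # one entry per TCP connection the server accepts


class CountingHandler(BaseHTTPRequestHandler):
    # Hypothetical handler for this sketch only.
    protocol_version = "HTTP/1.1"  # HTTP/1.1 keeps connections alive by default

    def setup(self):
        # setup() runs once per accepted connection, not once per request.
        super().setup()
        connections_seen.append(self.client_address)

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def count_connections(reuse, n=3):
    """Send n GET requests and return how many TCP connections the server saw."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    connections_seen.clear()
    try:
        if reuse:
            # One connection for every request (what a session's pool aims for).
            conn = http.client.HTTPConnection("127.0.0.1", port)
            for _ in range(n):
                conn.request("GET", "/")
                conn.getresponse().read()
            conn.close()
        else:
            # A brand-new connection for each request.
            for _ in range(n):
                conn = http.client.HTTPConnection("127.0.0.1", port)
                conn.request("GET", "/")
                conn.getresponse().read()
                conn.close()
        return len(connections_seen)
    finally:
        server.shutdown()
        server.server_close()
```

Calling count_connections(reuse=True) and count_connections(reuse=False) lets you compare the two modes without touching the real service.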
from grequests.
Thank you for your help. It looks like you are right. Their cloud drive works reliably only with sequential requests. I tried multiple processes, threads, asyncio + requests, http.client + pipelining, dugong, grequests, coroutines... It mostly freezes at some point or returns incorrect responses. I have no idea what they mean by HTTP pipelining.
Unfortunately other services are even worse.
from grequests.
Thanks. I wrote a message to their support team and they fixed some errors. Finally, this code works. I have no idea why it works, but it works: it returns fd = [1, 2, 3, 4, 5...] as it should. I tried to do the same with httpx and got incorrect values for all files: fd = [1, 1, 1, 1, 1...].
from grequests.
grequests supports all the usual features of requests. My understanding is that keep-alive is on by default for sessions. If you provide a session object to the grequests request, it will use that session for its connection, so you can use the same connection pool across requests sent with grequests.
I think this will do what you want @qo4on
import logging

import grequests  # import grequests before requests
import requests

logging.basicConfig(level=logging.DEBUG)  # to see the connection logs

sesh = requests.Session()
# Every request shares the same session, and therefore the same connection pool.
request_list = [grequests.get('http://httpbin.org/status/200', session=sesh) for _ in range(10)]
for resp in grequests.imap(request_list):
    ...
If you don't use a session, you'll see that a new connection is made for every request. You can also further configure the connection pooling by configuring your session object accordingly.
from grequests.
@spyoungtech Thank you. I'm not sure that the same session means the same TCP connection. In my view, grequests does not guarantee that the order it sends the requests is the same as the order in which the remote server receives them. That means grequests uses a different TCP connection for every request. Am I right?
Also, I don't know why, but my authorization fails after a few requests with grequests. A plain requests.Session() does not have this problem.
request_list = []
for item in items[:10]:
    params = {'path': f"/{self.tts}/{self.theme}/{item['name']}",
              'flags': self.pc.O_CREAT}
    request_list.append(grequests.get('https://eapi.pcloud.com/file_open', session=self.pc.session, params=params))
for resp in grequests.imap(request_list):
    print(resp.json())
{'result': 0, 'fd': 1, 'fileid': 85081146}
{'result': 0, 'fd': 2, 'fileid': 85081155}
{'result': 0, 'fd': 3, 'fileid': 85081252}
{'result': 0, 'fd': 4, 'fileid': 85081157}
{'result': 1000, 'error': 'Log in required.'}
{'result': 1000, 'error': 'Log in required.'}
{'result': 1000, 'error': 'Log in required.'}
{'result': 1000, 'error': 'Log in required.'}
{'result': 1000, 'error': 'Log in required.'}
{'result': 0, 'fd': 5, 'fileid': 85081255}
{'result': 0, 'fd': 1, 'fileid': 85081146}
{'result': 0, 'fd': 2, 'fileid': 85081155}
{'result': 0, 'fd': 3, 'fileid': 85081252}
{'result': 0, 'fd': 4, 'fileid': 85081157}
{'result': 0, 'fd': 5, 'fileid': 85081255}
{'result': 0, 'fd': 6, 'fileid': 85081158}
{'result': 0, 'fd': 7, 'fileid': 85081259}
{'result': 0, 'fd': 8, 'fileid': 85081159}
{'result': 0, 'fd': 9, 'fileid': 85081262}
{'result': 1000, 'error': 'Log in required.'}
from grequests.
I think I see what you mean now. The core underlying library, urllib3 (and by extension requests), does not support HTTP pipelining, therefore grequests does not support it either. However, urllib3's connection pooling is likely to already be giving you similar, if not better, performance gains.
grequests does not guarantee that the order it sends the requests
No, I don't believe there are any guarantees of the order in which requests are sent, at least not when using grequests.map/grequests.imap. This is, in part, an inescapable consequence of handling multiple requests/responses concurrently.
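If the caller only needs to know which response belongs to which request, one common approach is to tag every request with its index and reassemble the results afterwards. The sketch below illustrates that idea with the standard library's thread pool rather than grequests; fake_fetch is a hypothetical stand-in that sleeps for a random time instead of making a network call. It does nothing about the order in which the server receives the requests.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_fetch(index, url):
    # Hypothetical stand-in for a network call: random latency, then a reply.
    time.sleep(random.uniform(0, 0.05))
    return index, f"response for {url}"


def fetch_all_in_request_order(urls, workers=4):
    """Run the fetches concurrently, but return results in request order."""
    results = [None] * len(urls)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(fake_fetch, i, url) for i, url in enumerate(urls)]
        # as_completed yields futures in completion order, which varies per run.
        for future in as_completed(futures):
            index, body = future.result()
            results[index] = body  # the index puts each result back in its slot
    return results
```

Completion order changes from run to run, but the returned list always lines up with the input list.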
That means grequests uses different tcp connection for every request.
Like in requests, the behavior of creating connections and connection pooling is handled in urllib3. Using the code in my first comment, you might see in the debug logs that urllib3 ends up creating two HTTP connections and uses/reuses them for all 10 requests. It does not create a new connection for each request when using a session.
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): httpbin.org:80
DEBUG:urllib3.connectionpool:Starting new HTTP connection (2): httpbin.org:80
You can configure the underlying connection pool size. See also: urllib3 advanced usage.
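As a rough sketch of what that configuration can look like with requests: the helper name make_session and the sizes are only illustrative, and nothing here is specific to grequests beyond passing the resulting session along. pool_connections is the number of per-host pools the adapter caches, and pool_maxsize is the number of connections kept in each pool.

```python
import requests
from requests.adapters import HTTPAdapter


def make_session(pool_connections=10, pool_maxsize=10):
    """Build a session whose adapter uses the given pool sizes (illustrative helper)."""
    sesh = requests.Session()
    adapter = HTTPAdapter(
        pool_connections=pool_connections,  # number of per-host pools to cache
        pool_maxsize=pool_maxsize,          # connections kept alive in each pool
    )
    # Mount the same adapter for both schemes so every request uses these settings.
    sesh.mount('http://', adapter)
    sesh.mount('https://', adapter)
    return sesh
```

The session returned by make_session can then be passed as session=... to each grequests call, as in the earlier example.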
However, in order to send multiple requests concurrently, you need at least two connections! Trying to limit the pool to 1 and exactly 1 connection will cause a deadlock waiting for a connection in the pool to become available that never becomes available.
In this example, you'll see that only one HTTPS connection is created. However, it's kind of useless because the program deadlocks.
import logging

import grequests  # import grequests before requests
import requests

logging.basicConfig(level=logging.DEBUG)  # to see the connection logs

sesh = requests.Session()
# A blocking pool that holds exactly one connection.
adapter = requests.adapters.HTTPAdapter(pool_maxsize=1, pool_block=True)
sesh.mount('https://', adapter)
request_list = [grequests.get('https://httpbin.org/status/200', session=sesh) for _ in range(10)]
for resp in grequests.imap(request_list):
    # DEADLOCK
    ...
Basically, it seems that when you have a blocking pool (see urllib3 docs linked above) with just 1 available connection, you get a deadlock with grequests. Essentially, you need at least two connections to prevent the deadlock.
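The general shape of that failure can be shown with a toy blocking pool. This is only a sketch of blocking-pool semantics using the standard library's queue module; BlockingPool and second_borrower_starves are hypothetical names, it is not urllib3's implementation, and it does not establish why grequests in particular never returns the connection. It only shows that a second borrower waits for as long as the single slot is held.

```python
import queue


class BlockingPool:
    """Toy fixed-size pool: acquire() blocks until a slot is free."""

    def __init__(self, maxsize):
        self._slots = queue.LifoQueue(maxsize)
        for n in range(maxsize):
            self._slots.put(f"conn-{n}")  # placeholder objects, not real sockets

    def acquire(self, timeout=None):
        # With timeout=None this waits forever, which is where a deadlock comes from.
        return self._slots.get(block=True, timeout=timeout)

    def release(self, conn):
        self._slots.put(conn)


def second_borrower_starves(pool_size, timeout=0.1):
    """Return True if a second acquire() cannot succeed while the first is held."""
    pool = BlockingPool(pool_size)
    first = pool.acquire()
    try:
        second = pool.acquire(timeout=timeout)
    except queue.Empty:
        return True  # nothing was returned to the pool in time
    finally:
        pool.release(first)
    pool.release(second)
    return False
```

With a pool size of 1 the second borrower times out; with a size of 2 both borrowers get a connection.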
urllib3 fulfills the entire request -> response cycle; you can't send subsequent requests on the same connection without first receiving a response. Instead, connections are reused for subsequent requests (and responses). But even when reusing the same TCP connection, it's not quite the same as HTTP pipelining. Though, as mentioned, I don't think you'll see much of a performance gain, if any, from pipelining compared to urllib3's connection pooling.
from grequests.
The problem is that the server does not support multiple connections: https://docs.pcloud.com/protocols/http_json_protocol/single_connection.html
Sending requests in a for loop works great but takes too long. Multiple processes/threads are not supported.
Is there any workaround?
from grequests.
I don't think that document is suggesting the server does not support multiple connections or that threading your requests is prohibited.
By my reading, the document is suggesting to use a single connection for performance, but does not require it. The comment about threads/processes is regarding having different threads handling different requests writing to the same connection, which doesn't happen in the case of urllib3/requests/grequests.
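For completeness, if you ever did share one raw connection between threads yourself, the usual way to satisfy that kind of rule is to serialize writes with a lock. This is a minimal sketch under that assumption; LockedWriter and run_demo are hypothetical names, and the "connection" is just a list standing in for a socket.

```python
import threading


class LockedWriter:
    """Serialize multi-part writes to one shared sink (stand-in for a socket)."""

    def __init__(self, sink):
        self._sink = sink
        self._lock = threading.Lock()

    def send(self, chunks):
        # Holding the lock for the whole message keeps its chunks contiguous.
        with self._lock:
            for chunk in chunks:
                self._sink.append(chunk)


def run_demo(n_threads=8):
    """Have several threads each send a three-chunk message; return the sink."""
    sink = []
    writer = LockedWriter(sink)
    threads = [
        threading.Thread(target=writer.send, args=([f"{i}-a", f"{i}-b", f"{i}-c"],))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sink
```

The messages may land in any order, but the chunks of one message are never interleaved with another's.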
Some quotes from that page with emphasis added:
"You can push multiple requests over single connection without waiting for answer, to improve performance"
"However you should make sure that in no event two threads/processes write to the same connection at the same time."
Is there any workaround?
I believe grequests is working for your use case, based on the output you described (aside from the auth issue). Maybe adjusting the session adapter pool settings or the gevent pool size (via the size argument to imap) will help you achieve better results.
I'm not sure why you're having that authentication issue. I'd have to look more closely at the authentication and know how you're handling it. Are you using the auth tokens with session cookies or some other method?
As a personal note, their documentation leaves much to be desired and does not inspire confidence in their service... some of the design there, particularly around authentication, is also kind of a big 'yikes' for me 😬
from grequests.