Comments (9)
Here's code from the youtube-dl project that checks for the gzip magic bytes:

    if len(video_webpage) > 2 and video_webpage[0] == '\x1f' and video_webpage[1] == '\x8b':
        buf = StringIO.StringIO(video_webpage)
        f = gzip.GzipFile(fileobj=buf)
        video_webpage = f.read()

Here's that applied to workflow.web.Response.content:

    @property
    def content(self):
        """Raw content of response (i.e. bytes)

        :returns: Body of HTTP response
        :rtype: :class:`str`

        """
        if not self._content:
            self._content = self.raw.read()
            if len(self._content) > 2 and (self._content[0] == '\x1f' and
                                           self._content[1] == '\x8b'):
                buf = StringIO.StringIO(self._content)
                gzip_f = gzip.GzipFile(fileobj=buf)
                self._content = gzip_f.read()
        return self._content

from alfred-workflow.
I'll look into it.
Why aren't you using the solution from the chosen answer?
If the data is gzipped for transmission, the Content-Encoding header should reflect this.
Your proposed solution (checking the first two bytes) has the same problem as your proposed adaptation to decode() to handle mdls output: you're wrongly generalising a specific edge case, and it will probably break other cases.
It's perfectly possible for a binary file to start with the same bytes, and your solution would then incorrectly try to unzip it.
That's fine for youtube-dl because it only handles a very limited number of file formats. It is not appropriate in the general case.
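
A small sketch of that objection, comparing the two checks on a page compressed for transmission and on a plain download of a .gz file. The helper names `looks_gzipped` and `is_gzip_encoded` are hypothetical, the headers are assumed to be a plain dict with lowercase keys, and the code is written for Python 3 rather than the Python 2 used elsewhere in this thread:

```python
import gzip

GZIP_MAGIC = b'\x1f\x8b'


def looks_gzipped(body):
    """Magic-byte check: True if the body starts with the gzip signature."""
    return body[:2] == GZIP_MAGIC


def is_gzip_encoded(headers):
    """Header check: True only if the server declared a gzip Content-Encoding."""
    return headers.get('content-encoding', '').strip().lower() == 'gzip'


# Case 1: the server compressed an HTML page for transmission.
page = gzip.compress(b'<html>hello</html>')
page_headers = {'content-type': 'text/html', 'content-encoding': 'gzip'}

# Case 2: the user is downloading an archive that is itself a .gz file.
# The body is gzip data, but it was not compressed for transmission.
archive = gzip.compress(b'contents of backup.tar')
archive_headers = {'content-type': 'application/gzip'}
```

Both bodies pass the magic-byte check, so the archive would be silently unzipped; only the header check tells the two cases apart.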
from alfred-workflow.
Good point. I had read the youtube-dl solution last, so that's what stuck in my mind, but checking the header is clearly better. Good call.
Here's a second run at it:

    import gzip
    from cStringIO import StringIO

    @property
    def content(self):
        """Raw content of response (i.e. bytes)

        :returns: Body of HTTP response
        :rtype: :class:`str`

        """
        if not self._content:
            self._content = self.raw.read()
            # .get() avoids a KeyError when the server sends no
            # Content-Encoding header at all
            if self.headers.get('content-encoding') == 'gzip':
                inbuffer = StringIO(self._content)
                f = gzip.GzipFile(mode='rb', fileobj=inbuffer)
                try:
                    self._content = f.read()
                finally:
                    f.close()
        return self._content

from alfred-workflow.
Looks like a fair start.
iter_content() would also need to support gzip encoding, and the library would have to send an Accept-Encoding header that contains gzip in order to be correct.
I've had a little play with GzipFile and this looks tricky to do.
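
For the request side, a minimal sketch of advertising gzip support. It uses Python 3's urllib.request rather than web.py's own request code (which isn't shown in this thread), and the URL is a placeholder:

```python
import urllib.request

# Placeholder URL; nothing is fetched here, the request is only constructed.
request = urllib.request.Request(
    'http://example.com/search',
    headers={'Accept-Encoding': 'gzip'},
)

# urllib normalises header names with str.capitalize(), hence the lowercase "e".
accept_encoding = request.get_header('Accept-encoding')
```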
from alfred-workflow.
So, I've had a look at this, and it doesn't look like it's possible to decompress a stream of gzipped data, at least not without piping it through gunzip or keeping all the data in memory, which is what iter_content() is supposed to avoid.
Which website were you pulling the data from? A webserver shouldn't be sending gzipped data if the HTTP client hasn't explicitly told the server that it supports it (which web.py doesn't).
from alfred-workflow.
It's the search results from Kickass Torrents.
from alfred-workflow.
Hmm. That's actually naughty behaviour by the server. It shouldn't be sending gzipped data unless the client has told it that it accepts that.
Currently, I'm inclined to leave gzip-handling out of web.py.
It's a good feature to have, but I don't want to have Response.content support it but not Response.iter_content().
I didn't put too much effort into it, but it looks like piping the data through gunzip is the only thing likely to work (Python seems to only be able to decompress the data if it's all loaded into memory or saved to a temporary file).
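
One option not mentioned in this thread is the standard library's zlib module, whose decompressobj() accepts compressed data piece by piece. A minimal sketch, assuming the chunks come from repeated reads of the raw response; `iter_gunzip` is a hypothetical name, and the code is written for Python 3:

```python
import gzip
import zlib


def iter_gunzip(chunks):
    """Yield decompressed data for an iterable of gzip-compressed chunks."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer
    decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)
    for chunk in chunks:
        data = decoder.decompress(chunk)
        if data:
            yield data
    # Emit anything still buffered once the input is exhausted
    tail = decoder.flush()
    if tail:
        yield tail


# Simulate a response body arriving in 64-byte pieces
original = b'search result line\n' * 1000
compressed = gzip.compress(original)
pieces = [compressed[i:i + 64] for i in range(0, len(compressed), 64)]
restored = b''.join(iter_gunzip(pieces))
```

Only one compressed chunk and its decompressed output are held at a time, so this would not need gunzip or a temporary file.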
from alfred-workflow.
I've had an idea regarding iter_content() and gzipped content.
The way I see it, streaming HTTP data isn't so useful in a workflow; the main selling point of iter_content() is handling large files without having to load them into memory.
So, could it be replaced with a save_to_path() function? This could first save to a temporary file, which is then moved/renamed to the destination file upon completion, and decompressed if necessary.
How does that sound?
I should probably make this a separate issue…
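
A rough sketch of the temporary-file-then-rename idea, not the implementation that later shipped. The signature (an iterable of byte chunks plus a destination path) is an assumption, decompression is left out, and the code is written for Python 3:

```python
import os
import tempfile


def save_to_path(chunks, dest):
    """Write chunks to a temporary file, then move it to dest on success."""
    dest_dir = os.path.dirname(os.path.abspath(dest))
    # Create the temp file next to dest so the final rename stays on one filesystem
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir)
    try:
        with os.fdopen(fd, 'wb') as fp:
            for chunk in chunks:
                fp.write(chunk)
        os.replace(tmp_path, dest)
    except BaseException:
        # Don't leave a partial download behind
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Because the destination only appears after the final rename, an interrupted download never leaves a truncated file at `dest`.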
from alfred-workflow.
Gzip support and save_to_path() implemented in v1.9.6.
from alfred-workflow.