Comments (7)
Ideally I think the Content-Encoding and Content-Length headers should be set by the HTTP implementation, which would give the String instance liberty to default to something like UTF-8 (modulo user configuration perhaps).
Alternatively, I would just delete the String instance and bump the version number. Applications that rely on this instance should really be fixed to do their own encoding. Just recently I had the pleasure of figuring out what was causing package CouchDB to break: it was the broken String instance in here.
If I see a String instance like that I'll assume it's sane - as in actually handles all valid Strings appropriately.
from http.
I agree in principle that a broken String instance shouldn't exist; it's just that removing it now could have a very substantial impact and I'm scared of doing that. On the other hand, there is all the time people like you will have wasted figuring out the brokenness.
As well as dealing with the sending aspect, don't we need to deal with any encoding the server chooses to return?
from http.
I'm in favour of doing String properly by looking at Content-Type, but removing support for String and just using ByteString or [Word8] or whatever is certainly better than the current situation.
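To make "looking at Content-Type" concrete, here is a minimal sketch of pulling the charset parameter out of a header value. It uses only base and is not code from the HTTP package; the names charsetOf, splitOn, trim and unquote are made up for illustration, and the parsing is deliberately simplistic (it does not handle a ';' inside a quoted parameter value).

```haskell
import Data.Char (isSpace, toLower)

-- | Hypothetical helper: return the lower-cased charset parameter of a
-- Content-Type header value, or Nothing if no charset is given.
charsetOf :: String -> Maybe String
charsetOf header =
  -- The first ';'-separated piece is the media type; the rest are parameters.
  case [v | p <- drop 1 (splitOn ';' header), Just v <- [param (trim p)]] of
    (v : _) -> Just v
    []      -> Nothing
  where
    param p =
      let (name, rest) = break (== '=') p
      in case rest of
           ('=' : value)
             | map toLower (trim name) == "charset" ->
                 Just (map toLower (unquote (trim value)))
           _ -> Nothing

-- | Split a string on every occurrence of a separator character.
splitOn :: Char -> String -> [String]
splitOn c s = case break (== c) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn c rest

-- | Drop leading and trailing whitespace.
trim :: String -> String
trim = f . f
  where f = reverse . dropWhile isSpace

-- | Strip one pair of surrounding double quotes, if present.
unquote :: String -> String
unquote ('"' : rest)
  | not (null rest), last rest == '"' = init rest
unquote s = s
```

Loaded in GHCi, charsetOf "text/html; charset=UTF-8" gives Just "utf-8" and charsetOf "image/png" gives Nothing. The Nothing case is exactly the hard one: the library then has no reliable way to know how to decode the body.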
from http.
I've spent a while investigating and I currently think doing String properly is too hard :-( I guess removing the String instance is the only remaining option.
http://www.haskell.org/pipermail/libraries/2012-September/018426.html
from http.
You are right that if the library is handed content that does not have a Content-Type with a charset already, then it is rather hard. And, of course, if handed a charset the library does not support (for whatever reason) that is also pretty hard.
Since in practice encodings are sometimes detected from the actual content itself (XML declarations, meta tags), there is in fact no sane simplifying assumption if Content-Type lacks an encoding. While by-the-book such a Content-Type should indicate binary data, in practice it is used to deliver data with all sorts of character encodings (yay browsers magically auto-detecting things!). So, the only sane thing for a generic HTTP library that is not content aware to do in the case of Content-Type with no charset is to throw an error (or at least a very strong warning), which is likely not in the spirit of this library.
Long live ByteString.
from http.
+1 for moving to ByteString. Any progress on this?
Another option is Either ByteString String, where the Right is decoded properly, but only if the content is actually, unambiguously, text.
I was really dismayed, on my very first HTTP attempt, to find that the HTTP module blatantly ignores Content-Type (and its charset) and assumes everything is a string. This will confuse a lot of first-timers; the additional work required to become encoding-aware will be fairly daunting. And among other things it makes it impossible to work with images.
You are right that if the library is handed content that does not have a Content-Type with a charset already, then it is rather hard.
According to the MIME spec, content without Content-Type is treated as application/octet-stream, i.e. a raw stream of 8-bit bytes. Text content that has a Content-Type but no charset is assumed to be 7-bit ASCII, except when the particular MIME type defines a default.
from http.
Sorry, I was hoping to do this much more promptly. It's a bit harder than I thought to do nicely because the entire library is based around using Strings :-(
from http.