Unicode about pydelicious HOT 6 CLOSED

illegalwalker avatar illegalwalker commented on September 16, 2024
Unicode

from pydelicious.

Comments (6)

GoogleCodeExporter avatar GoogleCodeExporter commented on September 16, 2024
It seems to me that u'\x96' is just not a correct Python Unicode string. Could you
have a non-Unicode character in a Unicode string?

Original comment by [email protected] on 8 May 2008 at 10:57

GoogleCodeExporter avatar GoogleCodeExporter commented on September 16, 2024
Sorry, I don't fully understand this question. What do you mean by "not correct"?
Do you have a pointer to something where I can learn more about these non-Unicode
characters (in the context of Unicode strings)? We are also interested in getting
to the bottom of this with the gbookmark2delicious project. Thanks.

Original comment by [email protected] on 8 May 2008 at 5:06

GoogleCodeExporter avatar GoogleCodeExporter commented on September 16, 2024
After a ton of experimentation, I think I've got it all figured out: one must use
the 'utf-8' codec instead of the 'iso-8859-1' codec. I advise changing the default
codec in DeliciousAPI's constructor.

E.g., if you try to posts_add something with the string '\xf6', then delicious
misinterprets that and stores the wrong character (if you query it, it gives you
u'\u2298'). If, on the other hand, you send it the utf-8-encoded string '\xc3\xb6',
you'll get back the same string.
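
The two byte strings above are the same character under two codecs. A minimal sketch of the difference, written for Python 3 (the thread itself uses Python 2 literals) and assuming the server decodes incoming bytes as UTF-8:

```python
# Python 3 sketch; variable names are illustrative only.
text = "\u00f6"  # the character 'ö'

latin1_bytes = text.encode("iso-8859-1")
utf8_bytes = text.encode("utf-8")

print(latin1_bytes)  # b'\xf6'
print(utf8_bytes)    # b'\xc3\xb6'

# Assumption: the receiving side decodes as UTF-8.
# The lone latin-1 byte is not valid UTF-8, so it cannot round-trip.
try:
    latin1_bytes.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# The UTF-8 bytes decode back to the original character.
round_trip = utf8_bytes.decode("utf-8")
```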

Original comment by [email protected] on 13 May 2008 at 6:12

GoogleCodeExporter avatar GoogleCodeExporter commented on September 16, 2024
Hmmm... I *think* the 'encode' is only relevant when someone passes in unicode
strings instead of plain strings to the DeliciousAPI methods.

yaaang: what is your locale encoding?

But the handling in _call_server is not correct. I think the following would be
the right way to ensure we post plain (byte) strings to del.icio.us:

    if isinstance(params[key], unicode):
        params[key] = params[key].encode(self.codec)
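
For readers on Python 3, where `unicode` no longer exists and text is `str`, the same idea might look roughly like this. `encode_params` is a hypothetical helper written for illustration, not part of pydelicious:

```python
def encode_params(params, codec="utf-8"):
    """Return a copy of params with every text value encoded to bytes.

    Hypothetical helper: values that are already bytes (or not text
    at all) are passed through unchanged.
    """
    encoded = {}
    for key, value in params.items():
        if isinstance(value, str):  # Python 3 equivalent of `unicode`
            value = value.encode(codec)
        encoded[key] = value
    return encoded


params = {"description": "\u2605", "replace": b"yes"}
result = encode_params(params)
```

Only text values are touched, so callers who already pass byte strings keep whatever encoding they chose.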

The thing I am left wondering about is how the server interprets these bytes.
Neither the XML nor the HTTP headers indicate an encoding, so presumably it is
XML's default: UTF-8.
The elementtree XML parsing always seems to return unicode strings for these...
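
A small sketch of that parsing behaviour, using Python 3's standard `xml.etree.ElementTree`. The `<post>` element here is made up for the example and only loosely mimics a del.icio.us response:

```python
import xml.etree.ElementTree as ET

# UTF-8 bytes with no XML declaration and no other encoding hint.
raw = '<post description="\u2605"/>'.encode("utf-8")

# Without a declaration, the parser falls back to XML's default, UTF-8.
elem = ET.fromstring(raw)
description = elem.get("description")

print(type(description).__name__)  # str
```

The attribute comes back as text rather than bytes, which matches the observation that the parsed values are always unicode strings.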

I work in a UTF-8 environment, but what about people using latin-1/ISO-8859-1
encoded strings in their bookmarks?
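
One possible approach for such users, sketched in Python 3 and assuming the caller knows the source encoding of their byte strings, is to transcode to UTF-8 before posting. `to_utf8` is a hypothetical helper, not a pydelicious function:

```python
def to_utf8(raw, source_codec="iso-8859-1"):
    """Decode bytes from a known source codec and re-encode as UTF-8."""
    return raw.decode(source_codec).encode("utf-8")


latin1_title = b"caf\xe9"            # 'café' encoded as latin-1
utf8_title = to_utf8(latin1_title)   # the same text as UTF-8 bytes
```

This only works when the source codec is known; bytes of unknown origin cannot be transcoded reliably.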

With the above code, any unicode strings I pass to the instance get handled correctly:

In [231]: da = pydelicious.DeliciousAPI('mpe', passwd, codec='utf-8')

In [232]: da.posts_add('cid:[email protected]', unicode('★', 'utf-8'), replace=True)
Out[232]: {'result': (True, 'done')}

In [233]: da.posts_add('cid:[email protected]', '★', replace=True)
Out[233]: {'result': (True, 'done')}

In [234]: for u in 'cid:[email protected]', 'cid:[email protected]': da.posts_get(url=u)
   .....:
Out[234]:
{'dt': '2008-06-02',
 'posts': [{'description': u'\u2605',
            'hash': '15a97870f0707fb9d33496391eac572f',
            'href': 'cid:[email protected]',
            'others': '',
            'shared': 'no',
            'tag': 'system:unfiled',
            'time': '2008-06-02T15:56:12Z'}],
 'tag': '',
 'user': 'mpe'}
Out[234]:
{'dt': '2008-06-02',
 'posts': [{'description': u'\u2605',
            'hash': '5caff95c3d3ea03a7598f300419a3848',
            'href': 'cid:[email protected]',
            'others': '',
            'shared': 'no',
            'tag': 'system:unfiled',
            'time': '2008-06-02T15:56:25Z'}],
 'tag': '',
 'user': 'mpe'}

So both have the same result and delicious either uses or recognizes UTF-8.

Original comment by [email protected] on 2 Jun 2008 at 3:58

GoogleCodeExporter avatar GoogleCodeExporter commented on September 16, 2024
Err, which is:
- '★' # plain string: '\xe2\x98\x85'
- unicode('★', 'utf-8') # unicode string: u'\u2605'

Original comment by [email protected] on 2 Jun 2008 at 4:01

  • Added labels: Type-Other
  • Removed labels: Type-Defect

GoogleCodeExporter avatar GoogleCodeExporter commented on September 16, 2024
OK. Encoding issues should now be resolved and committed.

BTW, see tests/test_encodings.py to see encoding/decoding of utf8 and latin1 in action.



Original comment by [email protected] on 28 Nov 2008 at 3:57

  • Changed state: Fixed
