
Correct handling of Unicode in py2 (pantable issue, closed)

ickc commented on June 15, 2024
Correct handling of Unicode in py2

from pantable.

Comments (9)

ickc commented on June 15, 2024

A quick look at unicodecsv's README suggests that backports.csv is a better fit here. I mostly use Python 3 (I write Python 3 code and pasteurize it for Python 2), so a backport of csv to Python 2 seems better.


ickc commented on June 15, 2024

I'm wondering whether we should avoid the extra dependency for Python 3 users. Do you know a quick and clean way to do so?


reenberg commented on June 15, 2024

I don't have any preference for which module. I just ended up not being able to load my .csv files when they contained Danish text instead of English, which is what I had primarily used this filter for in the past.

We could obviously do a conditional import of the different modules, but that doesn't take care of the dependency in setup.py. I don't have a solution off the top of my head for doing conditional dependencies in there.

I thought the py3 csv module had the same issues as the py2, but it seems that it was only an issue in py3.0, and it was fixed in 3.1.

It seems that the backports.csv module is a pure Python implementation, meaning it is slow (according to the author). However, I don't know how big of a deal this is in practice.

I could try to make a proposal using this module and io.open() instead?
I'm also trying to figure out how to deal with setup.py. Alternatively, we could just try to import it: if that works, great; if not, we use the default csv module. This way, if the user installs the module herself, it will work, and the setup.py file wouldn't pollute py3 installations.
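
A minimal sketch of the "try to import it, otherwise fall back" idea described above. It assumes the optional package is importable as `backports.csv`; the sample data is purely illustrative and not taken from pantable.

```python
# -*- coding: utf-8 -*-
import io

# Prefer backports.csv when the user has installed it (useful on Python 2),
# otherwise fall back to the standard library csv module.
try:
    from backports import csv  # optional third-party package
except ImportError:
    import csv  # standard library

# Sanity check: parse Unicode text from an in-memory text stream.
rows = list(csv.reader(io.StringIO(u"navn,by\nSøren,Århus\n")))
```

The rest of the code can then use `csv` without caring which implementation was picked up, at the cost of silently getting the plain Python 2 csv module when the backport is missing.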


reenberg commented on June 15, 2024

Dealing with this in setup.py actually seems quite easy. Environment markers (PEP 508) are designed for this.

However, there seems to be some fuss about old versions of setuptools. Instead of specifying the environment markers in install_requires, they should be set in extras_require, as this has been supported since setuptools 18 (e.g., here).

It seems you need version 20.6.8 (May 2016) for support to be fully functional in install_requires.
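
For illustration, the two ways of declaring a Python 2-only dependency mentioned above could look roughly like this. This is a sketch rather than the contents of pantable's actual setup.py; only the dependency name backports.csv comes from this thread, and the variable names simply mirror the setuptools keyword arguments.

```python
# Option A: PEP 508 environment marker inside install_requires.
# Per the comment above, this needs a fairly recent setuptools.
install_requires = [
    'backports.csv; python_version < "3"',
]

# Option B: the marker as an extras_require key with an empty extra name
# (nothing before the colon), the form described above as supported
# since setuptools 18.
extras_require = {
    ':python_version < "3"': ['backports.csv'],
}

# In setup.py, one of the two would be passed to setuptools.setup(...),
# e.g. setup(..., install_requires=install_requires).
```

With either form, Python 3 installations should skip the extra dependency entirely.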


reenberg commented on June 15, 2024

See reenberg@7e3fa94 for an initial test at using environment markers. It works like a charm. And I don't think it is unreasonable to depend on a fairly new version of setuptools.


reenberg commented on June 15, 2024

@ickc And with the added backports.csv module instead for py2 (reenberg@bb13cd2).

It actually simplified the code a bit, as there was no longer a need to differentiate between io.BytesIO and io.StringIO.

However, the tests seem to fail on Python 2, as some of the parameters are unicode instead of str:
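
To illustrate why the io.BytesIO / io.StringIO distinction can go away, here is a rough sketch, not the code from the linked commit. The helper name `read_csv_rows`, the UTF-8 encoding, and the sample data are assumptions made for this example.

```python
# -*- coding: utf-8 -*-
import csv  # on Python 2, `from backports import csv` would take this place
import io
import os
import tempfile


def read_csv_rows(stream):
    """Return all rows from a text stream that yields Unicode strings."""
    return list(csv.reader(stream))


SAMPLE = u"navn,by\nSøren,Århus\n"

# Inline table: a single io.StringIO, no BytesIO branch needed.
inline_rows = read_csv_rows(io.StringIO(SAMPLE))

# Table from a file: io.open decodes to Unicode on both Python 2 and 3.
fd, path = tempfile.mkstemp(suffix=".csv")
os.close(fd)
try:
    with io.open(path, "w", encoding="utf-8", newline="") as f:
        f.write(SAMPLE)
    with io.open(path, encoding="utf-8", newline="") as f:
        file_rows = read_csv_rows(f)
finally:
    os.remove(path)
```

Both code paths hand the reader a Unicode text stream, so the same helper serves inline and included tables.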

- [...] TableRow(TableCell(Para(Math(E=mc^2; format=u'InlineMath'))) [...]
?
+ [...] TableRow(TableCell(Para(Math(E=mc^2; format='InlineMath'))) [...]

However, this is an issue in the master branch as well, so it doesn't seem related to what I have changed.

See full log: pantable.test.txt

Should I make a PR for this, or do you see anything that needs changing? It has minimum impact on py3 as requested.


reenberg commented on June 15, 2024

Actually, I can see that something changed: test_read_data now also fails on assert read_data(True, '') is None in py3, which it doesn't on master, because I removed the str() call in io.open(str(include)).

Is there a particular reason for the str() call? Is there any valid way to sneak a bool, or anything other than a string, into this variable? It comes from the YAML metadata block.
You would have to pass something other than a string here, for example a list, but calling str() on that would still yield something that isn't useful.
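
A small sketch of the point about str(): converting a non-string metadata value produces a string, but not a usable path. The helper `include_path` is hypothetical and is not part of pantable; it only shows one way to reject such values explicitly. It is written for Python 3, where str is the text type.

```python
def include_path(value):
    """Return value if it is a non-empty string, otherwise None."""
    if isinstance(value, str) and value:
        return value
    return None


# str() turns YAML values such as a bool or a list into text,
# but the result is not a meaningful file name.
bool_as_text = str(True)                 # 'True'
list_as_text = str(["a.csv", "b.csv"])   # "['a.csv', 'b.csv']"

# An explicit type check rejects them instead.
checked = [include_path(True), include_path(["a.csv"]), include_path("a.csv")]
```

An explicit check like this makes the rejection intentional, rather than relying on opening a file literally named "True" and having that fail.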


ickc commented on June 15, 2024

Also see #21


ickc commented on June 15, 2024

@reenberg, please check whether pantable v0.11 (#25) fixes your problem. Thanks.

