Comments (10)

p2 avatar p2 commented on May 28, 2024

From [email protected] on September 05, 2012 14:01:38
I used a hex editor rather than a text editor to ensure that the issue is not related to text/linebreak encoding.

from quicklook-csv.

p2 avatar p2 commented on May 28, 2024

From [email protected] on October 18, 2012 16:11:38
Please react to my bug report.

p2 avatar p2 commented on May 28, 2024

From p2 on October 23, 2012 23:09:07
Sorry, I don't seem to be receiving emails for bug reports any more!

The problem you describe has to do with how the plugin detects which character to use to separate the cells. It does so using only the first 200 characters, which in your case is not sufficient. I'll raise that limit to make it work for you.

p2 avatar p2 commented on May 28, 2024

From p2 on October 24, 2012 00:43:34
Fixed in rev 24

p2 avatar p2 commented on May 28, 2024

From [email protected] on October 24, 2012 08:23:34
AUTODETECT_NUM_FIRST_CHARS = 1000
That is 10 lines of 99 characters each.
If someone has a longer introductory comment before the CSV data starts, it will fail again.

Can we increase AUTODETECT_NUM_FIRST_CHARS a bit more, to a value that tolerates introductory text while still remaining performant?
If someone quick-looks through multiple files, the delay in between should not be too long. What number would that be? 10000? If you assume a low read rate of 3 MB/sec (e.g. a USB flash drive), then this would be read in about 3 milliseconds (10000 bytes / 3000000 bytes/sec ≈ 0.003 secs). Even about 30 ms (at a 0.3 MB/sec read rate) would be acceptable, I think.
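
The estimate above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes pure sequential throughput and ignores seek time, file system overhead and caching, and the helper name read_time_ms is made up for illustration.

```python
def read_time_ms(num_bytes, bytes_per_sec):
    """Time in milliseconds to read num_bytes at a constant throughput."""
    return num_bytes / bytes_per_sec * 1000


# 10000 bytes at 3 MB/sec and at 0.3 MB/sec, the two rates mentioned above
for rate in (3_000_000, 300_000):
    print(f"{rate} bytes/sec: {read_time_ms(10_000, rate):.1f} ms")
# prints 3.3 ms and 33.3 ms respectively
```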

p2 avatar p2 commented on May 28, 2024

From p2 on October 24, 2012 12:42:13
The autodetect is not implemented in a very smart way. What it does is split the first 1000 characters on comma, tab, semicolon and pipe and then chooses the separator that gives the most columns. If you have a "real" CSV separated by commas and the values contain semicolons or pipes, then this algorithm will wrongly pick the other separator instead of the comma. That's the main reason I've limited it to the first 1000 (and 200 before) characters, though this might be a futile attempt.
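
For illustration, the behaviour described above could be sketched in Python roughly as follows. This is not the plugin's actual code, only a minimal reconstruction from the description; the function name detect_separator and the tie-breaking order are assumptions.

```python
AUTODETECT_NUM_FIRST_CHARS = 1000  # limit quoted in this thread
CANDIDATES = [",", "\t", ";", "|"]  # comma, tab, semicolon, pipe


def detect_separator(text, limit=AUTODETECT_NUM_FIRST_CHARS):
    """Pick the candidate that splits the first `limit` characters into the most pieces."""
    head = text[:limit]
    # max() keeps the first candidate on a tie, so comma wins ties (an assumption)
    return max(CANDIDATES, key=lambda sep: len(head.split(sep)))


# A comma-separated file whose values contain semicolons:
# 3 commas give 4 pieces, 6 semicolons give 7 pieces, so ";" is chosen.
sample = "name,notes\nfoo,a;b;c;d\nbar,e;f;g;h\n"
print(repr(detect_separator(sample)))
```

The sample shows the failure mode mentioned above: the file is really comma-separated, but the semicolons inside the values outnumber the commas.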

p2 avatar p2 commented on May 28, 2024

From [email protected] on October 24, 2012 15:59:50
Another algorithm idea:

Preparation:
Read in the first AUTODETECT_NUM_FIRST_CHARS characters.
Then split them into lines.
Count the commas, tabs, semicolons and pipes on each line.

Analysis:
Check whether any of these counts is the same on every line.
If 0 candidates match that condition, then choose the candidate which occurs most often across all lines. Not necessarily correct, but likely true.
If 1 candidate matches that condition, this must be the separator. Only very few exceptions imaginable.
If more than 1 candidate matches that condition, take the one which occurs first in the first line. Not necessarily true, but again likely, as the probability that the first cell contains a separator-class character as a literal is not that high.
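
The steps above could be sketched in Python along these lines. It is a rough sketch under a few assumptions that the proposal does not spell out: a candidate only counts as "the same on every line" if its count is non-zero, empty lines are skipped, and a possibly truncated last line is dropped. The function name and the comma fallback for empty input are made up for illustration.

```python
CANDIDATES = [",", "\t", ";", "|"]  # comma, tab, semicolon, pipe


def detect_separator_by_lines(text, limit=1000):
    head = text[:limit]
    lines = head.splitlines()
    # Assumption: a line cut off by the limit would distort the counts, so drop it
    if len(text) > limit and lines and not head.endswith("\n"):
        lines.pop()
    lines = [ln for ln in lines if ln]
    if not lines:
        return ","  # arbitrary fallback for empty input (assumption)

    # Preparation: occurrences of each candidate on each line
    counts = {sep: [ln.count(sep) for ln in lines] for sep in CANDIDATES}

    # Analysis: candidates with the same non-zero count on every line
    consistent = [
        sep for sep in CANDIDATES
        if counts[sep][0] > 0 and len(set(counts[sep])) == 1
    ]
    if not consistent:
        # 0 matches: take the candidate that occurs most often overall
        return max(CANDIDATES, key=lambda sep: sum(counts[sep]))
    if len(consistent) == 1:
        return consistent[0]
    # More than 1 match: take the one that appears first in the first line
    return min(consistent, key=lambda sep: lines[0].index(sep))


sample = "name,notes\nfoo,a;b;c;d\nbar,e;f\n"
print(repr(detect_separator_by_lines(sample)))  # ',' (one comma on every line)
```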

p2 avatar p2 commented on May 28, 2024

From [email protected] on November 06, 2012 15:37:05
What do you think about my alternative detection proposal?

p2 avatar p2 commented on May 28, 2024

From p2 on November 09, 2012 01:23:31
Because this is merely a side project, I'm not going to invest more time any time soon. Your approach has one problem, which is the "split by lines" part. CSV values can contain newlines, protected by double quotes. Thus the data would need to be correctly parsed, once for every possible separator, before this can be decided. That's why I limit it to a few chars at the start; it could definitely be better, but it's good enough.
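
A small Python example may make the quoted-newline problem concrete. It uses Python's standard csv module purely for demonstration (the plugin itself does not use it), and the sample data is invented.

```python
import csv
import io

# The second record has a line break inside a double-quoted value
data = 'id,comment\n1,"first line\nsecond line"\n2,plain\n'

# Naive line splitting sees 4 "lines", with comma counts 1, 1, 0, 1
naive_lines = data.splitlines()
comma_counts = [ln.count(",") for ln in naive_lines]

# A real CSV parser, told that the separator is a comma, sees 3 records
rows = list(csv.reader(io.StringIO(data), delimiter=","))

print(len(naive_lines), comma_counts)  # 4 [1, 1, 0, 1]
print(len(rows))                       # 3
```

Because the per-line comma count is not constant here, a check based on physical lines would reject the comma even though it is the real separator.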

p2 avatar p2 commented on May 28, 2024

From [email protected] on November 09, 2012 07:37:51
Ok. Maybe sometime in the future, if you are in the mood for it. Thanks for what you offered so far!
