Comments (6)
When working with (potentially) large text files, it's a good idea to read the file line by line to get all the benefits mentioned in the README. But if you know your file isn't too big to handle, or if you don't care, you can always use this initializer to create a CSVImporter object from a String. To do this you need to read the contents of the CSV file yourself. That way you have a String object which you can use to get the total number of lines. The code could look something like this:
let contentString = try! String(contentsOfFile: "path/to/your/file.csv")
let totalLinesCount = contentString.components(separatedBy: CharacterSet.newlines).count
let importer = CSVImporter<[String: String]>(contentString: contentString)
You can also see this example in the tests here.
The above code is a workaround though, and it might not work perfectly depending on the line endings of your file. As you can see here, we already have the lines somewhere within CSVImporter, but the property isn't public, so you can't read it.
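To illustrate the line-ending caveat, here is a minimal sketch of a more tolerant count. The function name `countNonEmptyLines` is made up for this example and is not part of CSVImporter. It relies on Swift treating "\r\n" as a single `Character`, so Windows line endings are not counted twice. It still miscounts files where a quoted field contains a line break.

```swift
/// Counts non-empty lines, treating "\n", "\r", and "\r\n" each as one line break.
/// Hypothetical helper for illustration; not part of CSVImporter.
func countNonEmptyLines(in content: String) -> Int {
    // "\r\n" is one grapheme cluster (one `Character`) in Swift, and
    // `isNewline` is true for it, so it acts as a single separator.
    // `split` omits empty subsequences by default, so blank lines and a
    // trailing line break do not add to the count.
    return content.split(whereSeparator: { $0.isNewline }).count
}
```

With this, `countNonEmptyLines(in: contentString)` could stand in for the `components(separatedBy:)` call above, assuming blank lines should not be counted.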
I think to add official support for the total number of lines, we could add a public computed property to CSVImporter that returns an Optional, which could look like this:
public var totalDataLinesCount: Int? {
    // Only a String-backed source holds all lines in memory; otherwise the count is unknown.
    guard let stringSource = source as? StringSource else { return nil }
    return stringSource.lines.count
}
It would only work if you initialize CSVImporter with a String, but it would make sure you don't get into trouble with line endings.
@ambujpunn Would you like to add this feature with test and send a PR? 😃
from csvimporter.
@Dschee Wouldn't this only work when loading an entire CSV file into a huge string? Ideally, we'd like to keep and extend the awesome behavior of CSVImporter, which is to read line by line rather than storing everything somewhere first.
Well, there's a logical problem there though, isn't there? I mean, if you wanna read a file "line by line" then you can't know how many lines the file has, since you haven't read the entire file yet, no? What you could do is guess the total number of lines based on the file size. But as this is not accurate by any means, I tend not to include such a feature in CSVImporter. It's gonna result in this.
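For reference, a guess based on file size could be sketched roughly as follows. The function `estimateLineCount(sample:totalBytes:)` is hypothetical and not part of CSVImporter. It assumes an ASCII-compatible encoding and that the sampled bytes are representative of the whole file, which is exactly why the result is only an estimate.

```swift
import Foundation

/// Extrapolates a line count from a sample taken at the start of a file.
/// Hypothetical helper for illustration; the result is an estimate only.
/// Returns nil if the sample contains no line feed to measure against.
func estimateLineCount(sample: Data, totalBytes: Int) -> Int? {
    let lineFeed = UInt8(ascii: "\n")
    // Count line feeds in the sample to get an average line length.
    let lineBreaks = sample.reduce(0) { $1 == lineFeed ? $0 + 1 : $0 }
    guard lineBreaks > 0 else { return nil }
    let averageBytesPerLine = Double(sample.count) / Double(lineBreaks)
    return Int((Double(totalBytes) / averageBytesPerLine).rounded())
}
```

The sample would be the first few kilobytes of the file and `totalBytes` its size on disk. Files whose rows vary a lot in length will give a poor estimate.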
If you have any other idea of how we could do this, then please, explain and I'll consider adding it.
Just a suggestion, but perhaps a separate API could be added that would iterate through the file in chunks, so everything wouldn't need to be in memory at once, just counting the line endings (not within quoted strings).
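A rough sketch of what such a chunked counter might look like is below. The names `CSVLineCounter` and `countCSVLineBreaks(atPath:chunkSize:)` are invented for this example and do not exist in CSVImporter. It assumes an ASCII-compatible encoding such as UTF-8, double quotes as the only quoting character, and escaped quotes written as doubled quotes.

```swift
import Foundation

/// Counts line breaks outside of double-quoted fields, one chunk at a time.
/// Hypothetical type for illustration; not part of CSVImporter.
struct CSVLineCounter {
    private(set) var lineBreakCount = 0
    private var insideQuotes = false
    private var previousWasCarriageReturn = false

    mutating func consume(_ chunk: Data) {
        let quote = UInt8(ascii: "\"")
        let carriageReturn = UInt8(ascii: "\r")
        let lineFeed = UInt8(ascii: "\n")
        for byte in chunk {
            if byte == quote {
                // A doubled quote ("") toggles twice, leaving the state unchanged.
                insideQuotes.toggle()
                previousWasCarriageReturn = false
            } else if insideQuotes {
                // Line breaks inside a quoted field belong to the field's value.
                previousWasCarriageReturn = false
            } else if byte == carriageReturn {
                lineBreakCount += 1
                previousWasCarriageReturn = true
            } else if byte == lineFeed {
                // The "\n" of a "\r\n" pair was already counted at the "\r",
                // even when the pair is split across two chunks.
                if !previousWasCarriageReturn { lineBreakCount += 1 }
                previousWasCarriageReturn = false
            } else {
                previousWasCarriageReturn = false
            }
        }
    }
}

/// Streams the file through the counter so only one chunk is in memory at a time.
func countCSVLineBreaks(atPath path: String, chunkSize: Int = 64 * 1024) -> Int? {
    guard let handle = FileHandle(forReadingAtPath: path) else { return nil }
    defer { handle.closeFile() }
    var counter = CSVLineCounter()
    while true {
        let chunk = handle.readData(ofLength: chunkSize)
        if chunk.isEmpty { break }
        counter.consume(chunk)
    }
    return counter.lineBreakCount
}
```

Note that this counts line breaks rather than lines, so a file whose last line has no trailing line break contains one more line than the returned value.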
Yeah, that could be possible. But it would still mean that the file is traversed twice, once for checking the total number of lines and once for actually processing the data. Of course, in some cases this might not be a problem, so as long as the documentation is very clear on the performance drawback, I'd be happy to merge this feature into CSVImporter. Any volunteers? Cause I won't have much time in the coming months, maybe sometime in December ...
I'm closing this feature request as not many people seemed to be interested in it and there's a workaround available by checking the file manually. Feel free to post a PR if you want this feature and are ready to implement it yourself.