Comments (6)
👋 Hey @samy-at-shopify ! Supporting compressed CSVs sounds like a simple solution to the large file upload problem, which we've definitely encountered before 🤔 (e.g. #600). I think we may want to look more into direct file uploads at one point, but I still think supporting compressed file formats makes sense.
cc @etiennebarrie -- any opinion here?
@samy-at-shopify are you interested in trying to put together a PR for this?
from maintenance_tasks.
Hey @adrianna-chang-shopify 👋
Happy to take this on! I'm not sure how much time I'll be able to dedicate to this though, so don't worry about hot-potatoing this to someone else if the need to support compressed files becomes urgent 👍
Thought about this a little more, and this feature might not be as easy to implement as I first thought.
These lines handle reading the contents of the .csv input file. They do so by reading the entire file and representing its contents as a string in memory.
This clashes a bit with the intention for supporting zipped .csv files (allowing someone to use a very large .csv file as input; e.g., > 1GB): the gem currently loads the entire file rather than reading and processing lines one-by-one.
Ruby does have the IO.foreach method, which would allow the gem to process a .csv file line by line without putting it all in memory. I'll see if I can get a version of the gem working with this lazy approach.
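A minimal sketch of the line-by-line idea, assuming the CSV is available as a local file path. The helper name `each_csv_row` is hypothetical and is not part of the gem; `CSV.foreach` is used here instead of a bare IO.foreach so that quoted fields are parsed correctly while rows are still read one at a time.

```ruby
require "csv"

# Hypothetical helper (not the gem's API): yields one row at a time as a Hash,
# so the whole file never has to be held in memory as a single String.
def each_csv_row(path)
  # Without a block, return an Enumerator so callers can chain lazily.
  return enum_for(:each_csv_row, path) unless block_given?

  # CSV.foreach reads and parses the file row by row.
  CSV.foreach(path, headers: true) do |row|
    yield row.to_h
  end
end

# Example usage (the path is an assumption):
#   each_csv_row("input.csv") { |row| puts row["id"] }
```

Note that this only helps once the file is on local disk; it does not by itself change how the upload is stored or downloaded.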
No strong opinions on that, but I'd rather avoid the additional dependency it will require (rubyzip, I assume). So maybe we should make the feature optional on the dependency being present, and otherwise not allow zip files to be uploaded?
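One way the "optional on the dependency being present" idea could look, as a rough sketch. rubyzip is loaded with `require "zip"`; the constant `ZIP_SUPPORTED` and the helpers `accepted_upload_extensions` and `upload_allowed?` are hypothetical names for illustration, not anything the gem defines.

```ruby
# Try to load the optional dependency. rubyzip is required as "zip".
begin
  require "zip"
  ZIP_SUPPORTED = true
rescue LoadError
  # The host application did not add rubyzip, so zip uploads stay disabled.
  ZIP_SUPPORTED = false
end

# Hypothetical helper: file extensions the upload form would accept.
def accepted_upload_extensions(zip_supported: ZIP_SUPPORTED)
  extensions = [".csv"]
  extensions << ".zip" if zip_supported
  extensions
end

# Hypothetical helper: rejects .zip uploads unless rubyzip could be loaded.
def upload_allowed?(filename, zip_supported: ZIP_SUPPORTED)
  accepted_upload_extensions(zip_supported: zip_supported)
    .include?(File.extname(filename).downcase)
end
```

With this shape the gem would not need to list rubyzip as a hard dependency; applications that want zip uploads would add it to their own Gemfile.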
I wish we could let the browser/frontend web server do the work of compressing/uncompressing, but quickly looking I couldn't find anything to do that.
Regarding the fact that we download everything at once, it's because while ActiveStorage lets you download in chunks and StringIO lets you turn a String into an IO, there's nothing in Ruby that would combine the two: an IO that would yield to a block when it needs more data (which could be used to download the next chunk with ActiveStorage). It would be neat to have that, but that wouldn't solve the large file upload issue.
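To make the idea concrete, here is a rough sketch of a reader that asks a block for more data only when its buffer runs dry. `ChunkedReader` is a hypothetical class written for illustration; it implements only `gets` and `each_line`, not the full IO interface, and the chunk source below is a plain Array standing in for a chunked download.

```ruby
# Hypothetical IO-like reader: pulls the next chunk from a block on demand.
class ChunkedReader
  # The block must return the next chunk as a String, or nil when exhausted.
  def initialize(&next_chunk)
    @next_chunk = next_chunk
    @buffer = +""
    @eof = false
  end

  # Returns the next line (including its newline), or nil at end of data.
  def gets
    # Keep fetching chunks until the buffer holds a full line or data runs out.
    until (index = @buffer.index("\n")) || @eof
      fill
    end
    return nil if @buffer.empty?

    index ? @buffer.slice!(0..index) : @buffer.slice!(0..)
  end

  def each_line
    return enum_for(:each_line) unless block_given?

    while (line = gets)
      yield line
    end
  end

  private

  # Asks the block for one more chunk and appends it to the buffer.
  def fill
    chunk = @next_chunk.call
    if chunk.nil?
      @eof = true
    else
      @buffer << chunk
    end
  end
end

# Example usage with in-memory chunks that split lines at arbitrary points:
#   chunks = ["id,name\n1,A", "da\n2,Gr", "ace\n"]
#   reader = ChunkedReader.new { chunks.shift }
#   reader.each_line { |line| puts line }
```

Only one chunk plus any partial line is buffered at a time, which is the property the comment above is asking for. As noted there, this addresses memory use while reading, not the upload size limit.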
This issue has been marked as stale because it has not been commented on in two months.
Please reply in order to keep the issue open. Otherwise, it will close in 14 days.
Thank you for contributing!