Comments (9)
jq -c leaves one object per line. So IMO
- it would be OK if gron could work with JSON Lines/ndjson out of the box and without sacrificing speed, and
- it should only switch to the expensive "multiple lines per JSON object and multiple objects" parsing if a command line option is given
from gron.
@jan-schulz-k24 @filippog firstly: thank you for your patience!
I've added basic support for multi-object input in b9faf39
At the moment it only supports one object per line; I'm not 100% convinced this is the best solution but it's certainly the easiest to implement.
You can use the feature with the -s/--stream flag:
tom@work:~▶ cat stream.json
{"one": 1, "two": 2, "three": [1, 2, 3]}
{"one": 1, "two": 2, "three": [1, 2, 3]}
tom@work:~▶ gron --stream stream.json
json = [];
json[0] = {};
json[0].one = 1;
json[0].three = [];
json[0].three[0] = 1;
json[0].three[1] = 2;
json[0].three[2] = 3;
json[0].two = 2;
json[1] = {};
json[1].one = 1;
json[1].three = [];
json[1].three[0] = 1;
json[1].three[1] = 2;
json[1].three[2] = 3;
json[1].two = 2;
Internally it reads the input line by line, so it will start to provide output as soon as a line is available to read. So in the below example the output appears in three chunks with two second intervals between them.
tom@work:~▶ cat delay.sh
#!/bin/bash
echo '{"one": 1, "two": 2}'
sleep 2
echo '{"three": 3, "four": 4}'
sleep 2
echo '{"five": 5, "six": 6}'
tom@work:~▶ ./delay.sh | gron -s
json = [];
json[0] = {};
json[0].one = 1;
json[0].two = 2;
json[1] = {};
json[1].four = 4;
json[1].three = 3;
json[2] = {};
json[2].five = 5;
json[2].six = 6;
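The line-by-line behaviour described above can be sketched roughly as follows. This is an illustration in Python, not gron's actual implementation (gron is written in Go); the helper names flatten and stream_statements are made up for the example, and it assumes every key is a plain identifier that needs no quoting.

```python
import json
from typing import Any, Iterable, Iterator


def flatten(path: str, value: Any) -> Iterator[str]:
    """Yield gron-style assignment statements for one decoded JSON value."""
    if isinstance(value, dict):
        yield f"{path} = {{}};"
        for key in sorted(value):
            # Assumption: keys are plain identifiers, so no bracket quoting.
            yield from flatten(f"{path}.{key}", value[key])
    elif isinstance(value, list):
        yield f"{path} = [];"
        for i, item in enumerate(value):
            yield from flatten(f"{path}[{i}]", item)
    else:
        yield f"{path} = {json.dumps(value)};"


def stream_statements(lines: Iterable[str]) -> Iterator[str]:
    """Treat each non-empty line as one JSON document in a top-level array."""
    yield "json = [];"
    index = 0
    for line in lines:
        if not line.strip():
            continue
        # Statements for a line are produced as soon as that line is read.
        yield from flatten(f"json[{index}]", json.loads(line))
        index += 1
```

Because stream_statements is a generator, passing it sys.stdin and printing each statement with flush=True should give the same chunked output as the delay.sh example, one chunk per input line.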
This should make it possible to work with streaming HTTP APIs, most of which seem to provide one object per line.
I haven't tagged a release yet, and I'm going to leave this issue open for a while longer because I'd like to think more about supporting objects that span many lines.
Let me know your thoughts / if you have any problems.
Thanks again for your patience at this particularly busy time in my life! 😆
from gron.
Thanks @tomnomnom for working on this! I did a quick test with the dataset I have and it works great with --stream!
from gron.
I've got JSON output with multiple objects but they're not one-per-line - gron currently only handles the first of these objects. I do have a hacky/sketchy patch for stream mode which handles this case but obviously don't want to step on any toes if there's another solution in the works?
from gron.
@rjp I also ran into this issue (specifically with the GitHub CLI tool's --paginate option), and I ended up with this hacky sed one-liner to work around it:
gh api --paginate /repos/{owner}/{repo}/environments | sed -E 's|\}\{|\}\n\{|g' | gron --stream
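One caveat with the sed approach is that it would also split on a }{ sequence that appears inside a string value. A possible alternative, sketched here in Python under the assumption that the whole input fits in memory, is to let a real JSON decoder find the document boundaries and re-emit one object per line for gron --stream. The function names split_concatenated and to_ndjson are hypothetical; json.JSONDecoder.raw_decode is part of the Python standard library.

```python
import json
from typing import Any, Iterator


def split_concatenated(text: str) -> Iterator[Any]:
    """Yield each JSON document found back to back in text."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            return
        # Returns the decoded value and the index just past its end.
        value, pos = decoder.raw_decode(text, pos)
        yield value


def to_ndjson(text: str) -> str:
    """Re-serialise the documents in text as one JSON document per line."""
    return "".join(json.dumps(doc) + "\n" for doc in split_concatenated(text))
```

Writing to_ndjson(sys.stdin.read()) to standard output in a small script would take the place of the sed stage in the pipeline above.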
from gron.
Hi @filippog! Thank you! :)
It certainly can be supported, and I think it would be a good idea.
Do you think it would be OK to enable the feature with a command line option rather than trying to auto-detect multiple objects in the input?
from gron.
If autodetection is expensive and/or unreliable to convert to/from gron then yeah a command line option would do. Otherwise I was expecting gron to just work when fed multiple objects, case in point for me is reading from an access log where every entry is a json object, separated by \n
from gron.
I currently have the same problem, in this case when using gron with jq:
Minimal example:
λ cat ~/gron_tmp
{
"data" : [
{"a": "1"},
{"a": "2"}
]
}
λ cat ~/gron_tmp |jq '.data[]'
{
"a": "1"
}
{
"a": "2"
}
λ cat ~/gron_tmp |jq '.data[]' |gron
json = {};
json.a = "1";
I've also seen logfiles consisting of one json object per line (ndjson), but that expects minified json.
-> my preference would be: if an object ends and a new one starts straight after (only whitespace between }...{), just take that object as well. I wouldn't even mind if there were no difference between the lines:
λ cat ~/gron_tmp |jq '.data[]' |gron
json = {};
json.a = "1";
json = {};
json.a = "2";
Although a commandline switch to implicitly treat the input as a list would be nice:
λ cat ~/gron_tmp |jq '.data[]' |gron --assume-list
json[0] = {};
json[0].a = "1";
json[1] = {};
json[1].a = "2";
from gron.
Hey, sorry this hasn't had the attention it needs... Kids keep you pretty busy!
I've been giving some thought to the approach needed for this, and there are probably only two sane options:
- Require that the input be one JSON blob per line so it's easy to split on \n
- Do a pre-parse step to detect multiple JSON objects in the input
Option 1 is by far the easiest to implement, but it doesn't work for @jan-schulz-k24's example where each JSON blob spans multiple lines.
Option 2 is far more permissive, but it requires a rune-by-rune inspection of the input text (you can't just, say, regex for }[^,]*{ because that sequence could appear in a string value).
The problem with option 2 is that it's pretty expensive to do, especially when the input is very large. This is made slightly better by only enabling multi-object input when a command line flag is specified.
On balance I think that gron working in more situations is more important than performance, so option 2 is probably best.
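To make option 2 a little more concrete, here is a rough sketch of what a character-by-character pre-parse step could look like. It is written in Python for illustration only and is not taken from gron; split_top_level is a hypothetical name. It tracks whether the scanner is inside a string (including backslash escapes) and how deeply nested it is, and it assumes well-formed input whose top-level values are objects or arrays.

```python
from typing import Iterator


def split_top_level(text: str) -> Iterator[str]:
    """Yield the raw text of each top-level JSON object or array in text."""
    depth = 0          # current nesting level of {} and []
    in_string = False  # True while between an opening and closing quote
    escaped = False    # True right after a backslash inside a string
    start = 0
    for i, ch in enumerate(text):
        if in_string:
            # Brackets inside strings must not affect the nesting depth.
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            if depth == 0:
                start = i
            depth += 1
        elif ch in "}]":
            depth -= 1
            if depth == 0:
                yield text[start : i + 1]
```

Each yielded chunk could then be handed to the normal parser. The scan visits every character once, which is where the extra cost on very large inputs comes from.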
from gron.