Comments (11)
TL;DR Just omit the dash at the end.
Hi. Your example works as expected. It seems that in your case the JSON Pointer (`pointer` option) is just not used correctly. The `pointer` option means "iterate over items in this element". If you only need to iterate over the items in the `user-provided-parameters` key, just use `/user-provided-parameters` as the pointer. The dash at the end means "any index", so it matches `/user-provided-parameters/0`, `/user-provided-parameters/1`, and so on, and then tries to iterate over what's inside the vector at that index. If you need more explanation, let me know or have a second look at the JSON Machine documentation.
from json-machine.
Thanks for the quick response! You're right.
I tried to reduce the case and did it incorrectly. Let me try again:
Let's say we have a `number[][]` matrix where we want to iterate through cells, same as:
function cells($matrix) {
    foreach ($matrix as $row) {
        foreach ($row as $cell) {
            yield $cell;
        }
    }
}
$options = ['pointer' => '/table/-'];
Items::fromString('{"table": [[1,2], [3,4]]}', $options);
// Expected: [1,2,3,4]
// Actual: same
Items::fromString('{"table": [[1,2], 3]}', $options);
// Expected: error
// Actual: [1,2,3]
Is this possible?
from json-machine.
And the reason I was using `/table/-/-` was because then you get nice results in `getCurrentJsonPointer()`:
1 - /table/0/0
2 - /table/0/1
3 - /table/1/0
4 - /table/1/1
from json-machine.
What are your thoughts on an option `"flatten" => false` (default `true`), where, of your examples:
JSON Pointer value | Will iterate through |
---|---|
(empty string - default) | `["this", "array"]` or `{"a": "this", "b": "object"}` will be iterated (main level) |
`/result/items` | `{"result": {"items": ["this", "array", "will", "be", "iterated"]}}` |
`/0/items` | `[{"items": ["this", "array", "will", "be", "iterated"]}]` (supports array indices) |
`/results/-/status` | `{"results": [{"status": "iterated"}, {"status": "also iterated"}]}` (a hyphen as an array index wildcard) |
`/` (gotcha! - a slash followed by an empty string, see the spec) | `{"":["this","array","will","be","iterated"]}` |
`/quotes\"` | `{"quotes\"": ["this", "array", "will", "be", "iterated"]}` |
All of them return a single item, except `/results/-/status` (with an explicit wildcard), which returns the same as today?
from json-machine.
I'm not sure what the question is now. Can you be more specific?
Anyway, let me just elaborate a little on the `flatten` topic. JSON Machine supports finding data in a JSON document down to a single scalar value if needed. It does that automatically. If it finds a scalar value at a pointer instead of an object or an array, it just yields it in a single iteration. So it might seem that it somehow flattens the structure when used in combination with `-` and when the structure is not rigid. But in reality, no such thing happens.
Try this and you'll see no deep flattening is happening:
$options = ['pointer' => '/table/-'];
Items::fromString('{"table": [[[1,2]], [3,4]]}', $options);
// Expected: [[1,2],3,4]
Also, this example is not expected to produce an error:
$options = ['pointer' => '/table/-'];
Items::fromString('{"table": [[1,2], 3]}', $options);
because at `/table/0` there is `[1,2]`, which is sequentially iterated, and at `/table/1` there is `3`, which is a scalar value and as such it's simply yielded as a single value.
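To make the described behaviour concrete, here is a minimal plain-PHP sketch that imitates it for the `/table/-` pointer. It is only an emulation under stated assumptions: it is not JSON Machine's code, it loads the whole document with `json_decode()` instead of streaming, and the function name `emulateTableWildcard` is made up for illustration.

```php
<?php
// Illustrative emulation only: this is NOT JSON Machine's code and it loads
// the whole document with json_decode() instead of streaming it.

/**
 * Hypothetical helper mimicking what the thread describes for '/table/-':
 * every element of "table" is matched; an array is iterated one level deep,
 * a scalar is yielded once.
 */
function emulateTableWildcard(string $json): Generator
{
    $document = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

    foreach ($document['table'] as $element) {
        if (is_array($element)) {
            // An array at the pointer: yield its direct children only.
            foreach ($element as $child) {
                yield $child;
            }
        } else {
            // A scalar at the pointer: yielded in a single iteration.
            yield $element;
        }
    }
}

$rigid  = iterator_to_array(emulateTableWildcard('{"table": [[1,2], [3,4]]}'), false);
$mixed  = iterator_to_array(emulateTableWildcard('{"table": [[1,2], 3]}'), false);
$nested = iterator_to_array(emulateTableWildcard('{"table": [[[1,2]], [3,4]]}'), false);

echo json_encode($rigid), "\n";  // [1,2,3,4]
echo json_encode($mixed), "\n";  // [1,2,3]
echo json_encode($nested), "\n"; // [[1,2],3,4]
```

The third call shows that only one level is descended: the inner `[1,2]` comes out as a single item.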
from json-machine.
I would expect a behavior where:
- For every non-wildcard pointer component:
  - Machine navigates into the object property/array element, and
  - the number of items does not increase.
- For every wildcard pointer component:
  - Machine explodes the object properties/array elements,
  - the number of the items increases, and
  - the key/index is available using `getCurrentJsonPointer()`.
Currently, even a non-wildcard component explodes the items (but has nowhere to indicate this in the path), if the element pointed to is an object/array. It is this behavior that I would like to have a way to disable.
Below is (yet another) example, which demonstrates both my concerns (indexes in `getCurrentJsonPointer()` and unpredictable levels).
Say you have a two-level array `mixed[][]`, where all of these are valid:
{"2d": [[1,2], [3]]}
$value['2d'][0][0] (/2d/0/0) = 1
$value['2d'][0][1] (/2d/0/1) = 2
$value['2d'][1][0] (/2d/1/0) = 3
{"2d": [[1,2], [3,true]]}
$value['2d'][0][0] (/2d/0/0) = 1
$value['2d'][0][1] (/2d/0/1) = 2
$value['2d'][1][0] (/2d/1/0) = 3
$value['2d'][1][1] (/2d/1/1) = true
{"2d": [[1,2], [3,[4,5]]]}
$value['2d'][0][0] (/2d/0/0) = 1
$value['2d'][0][1] (/2d/0/1) = 2
$value['2d'][1][0] (/2d/1/0) = 3
$value['2d'][1][1] (/2d/1/1) = [4,5]
The following is not valid, because it's not really `mixed[][]`:
{"2d": [[1,2], false]}
$value['2d'][0][0] (/2d/0/0) = 1
$value['2d'][0][1] (/2d/0/1) = 2
$value['2d'][1][0] = error
I would like to
- properly get the elements in the valid examples,
- know their indexes, and
- (ideally) somewhat gracefully handle the invalid example (error or ignore the non-matching value).
This cannot be currently achieved:
- If you use `/2d/-/-`:
  - ✅ You do get both indices.
  - ❌ Third valid example (`[[1,2], [3,[4,5]]]`) gets flattened (and you get 5 items)
  - ✅ The invalid example ignores the invalid element.
- If you use `/2d/-`:
  - ❌ You do not get both indices, only the first.
  - ✅ Third valid example doesn't get flattened (properly get 4 elements)
  - ❌ The invalid example gets silently ignored (you get same items as first valid example)
from json-machine.
- If you use `/2d/-/-`:
  - ✅ You do get both indices.
  - ❌ Third valid example ([[1,2], [3,[4,5]]]) gets flattened (and you get 5 items)
    - That's a feature, not a bug as explained earlier.
  - ✅ The invalid example ignores the invalid element.
- If you use `/2d/-`:
  - ❌ You do not get both indices, only the first.
    - Ok, this seems weird. Can you give the exact output? Could it be the same problem as #100?
  - ✅ Third valid example doesn't get flattened (properly get 4 elements)
  - ❌ The invalid example gets silently ignored (you get same items as first valid example)
    - Not-found items get ignored. That's normal behavior. It's as if you wanted the `find` command to fail on every existing file in the searched dir that does not match searched string.
from json-machine.
Sorry for being brief ;)
from json-machine.
No worries, I appreciate your responses, responsiveness, and patience with me iterating on trying to get the best example.
If you use `/2d/-/-`
❌ Third valid example ([[1,2], [3,[4,5]]]) gets flattened (and you get 5 items)
- That's a feature, not a bug as explained earlier.
Yes, I understand. But disabling this feature is essentially my feature request! :D
If you use `/2d/-`:
❌ You do not get both indices, only the first.
- Ok, this seems weird. Can you give the exact output? Could it be the same problem as #100 (Why only red is output)?
I'm not saying that the items do not get iterated over, just that in the `getCurrentJsonPointer()` return value you don't have both indices (which makes sense, since there is no "placeholder" for them).
❌ The invalid example gets silently ignored (you get same items as first valid example)
- Not-found items get ignored. That's normal behavior. It's as if you wanted the `find` command to fail on every existing file in the searched dir that does not match searched string.
By "silently ignored" I don't mean not returned by the iterator (that's what happens with `/2d/-/-` and that's OK) but returned identically to how it would be if it were in a different structure.
Perhaps I owe an explanation for this admittedly weird use-case:
I'm querying OpenAI's text completions AI with the new function calling/structured output mechanism, which returns JSON. JSON Machine is used to return results in a streaming fashion to the user live (see videos here if curious). That table should be `string[][]` and 95% of the time it is, but occasionally the model hallucinates and omits a level of nesting, adds a level of nesting, or returns the wrong number of rows or cells. So when iterating over `/2d/-/-` I check that both indexes are monotonically increasing with no gaps, that the values are indeed strings, and so on... very defensively.
To recap, I don't think path no. 2 (`/2d/-`) is the way forward. `/2d/-/-` is mostly there, but I would prefer not to have that auto-descent feature.
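For readers wondering what such defensive checking could look like, below is a rough plain-PHP sketch. Everything in it is an assumption for illustration: `validateCells` is a hypothetical helper, not part of JSON Machine, and the `[pointer, value]` pairs are assumed to have been collected by the caller during iteration (the sample data is written by hand, not produced by the library).

```php
<?php
// Hypothetical helper, not part of JSON Machine. The [pointer, value] pairs
// are assumed to have been collected by the caller while iterating.

/**
 * @param array<int, array{0: string, 1: mixed}> $cells pairs in arrival order
 * @return string[] problems found (empty when the table looks sound)
 */
function validateCells(array $cells): array
{
    $problems = [];
    $row = -1; // row index of the previous cell (-1: nothing seen yet)
    $col = -1; // column index of the previous cell

    foreach ($cells as [$pointer, $value]) {
        if (!preg_match('~^/2d/(\d+)/(\d+)\z~', $pointer, $m)) {
            $problems[] = "unexpected pointer $pointer";
            continue;
        }
        $r = (int) $m[1];
        $c = (int) $m[2];

        // Allowed moves: next column in the same row, or column 0 of the next row.
        $sameRow = $r === $row && $c === $col + 1;
        $nextRow = $r === $row + 1 && $c === 0;
        if (!$sameRow && !$nextRow) {
            $problems[] = "gap or reordering at $pointer";
        }
        if (!is_string($value)) {
            $problems[] = "non-string value at $pointer";
        }

        $row = $r;
        $col = $c;
    }

    return $problems;
}

$good = [['/2d/0/0', 'a'], ['/2d/0/1', 'b'], ['/2d/1/0', 'c']];
$bad  = [['/2d/0/0', 'a'], ['/2d/0/2', 'b'], ['/2d/1/0', 7]];

var_dump(validateCells($good) === []); // bool(true)
print_r(validateCells($bad));          // one gap problem, one non-string problem
```

Note that this sketch treats an empty row as a gap; whether that is desirable depends on the data.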
from json-machine.
But disabling this feature is essentially my feature request! :D
Now it makes perfect sense 😁. Because in terms of JSON Machine, there's no 'flattening', I'd suggest modifying the scalar parsing logic, which is what's actually behind your problem. Maybe an option something like `iterate_scalars`, with three settings:
- `AUTO` (current behavior, would remain the default)
- `ALWAYS`/`ONLY`/`FORCE` (an iterable on the pointer position will throw)
- `NEVER` (a scalar on the pointer position will throw)
This example of yours:
$options = ['pointer' => '/table/-'];
Items::fromString('{"table": [[1,2], 3]}', $options);
// Expected: error
// Actual: [1,2,3]
would then throw an error with option 'iterate_scalars' => NEVER
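As a rough illustration of how the three proposed settings might behave, here is a plain-PHP sketch. It is purely hypothetical: `iterate_scalars` is only a proposal in this thread, none of the constants or the function below exist in JSON Machine, and `json_decode()` stands in for the streaming parser.

```php
<?php
// Sketch of the PROPOSED option only. Nothing here exists in JSON Machine;
// the constant and function names are hypothetical, and json_decode() stands
// in for the streaming parser.

const ITERATE_SCALARS_AUTO   = 'auto';   // scalars and iterables both accepted
const ITERATE_SCALARS_ALWAYS = 'always'; // an iterable at the pointer throws
const ITERATE_SCALARS_NEVER  = 'never';  // a scalar at the pointer throws

function iterateTableElements(string $json, string $mode = ITERATE_SCALARS_AUTO): Generator
{
    $document = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

    foreach ($document['table'] as $index => $element) {
        $isIterable = is_array($element);

        if ($isIterable && $mode === ITERATE_SCALARS_ALWAYS) {
            throw new UnexpectedValueException("Iterable found at /table/$index");
        }
        if (!$isIterable && $mode === ITERATE_SCALARS_NEVER) {
            throw new UnexpectedValueException("Scalar found at /table/$index");
        }

        if ($isIterable) {
            yield from $element; // iterate the array's direct children
        } else {
            yield $element;      // yield the scalar once
        }
    }
}

$json = '{"table": [[1,2], 3]}';

// AUTO keeps the behaviour described earlier in the thread.
echo json_encode(iterator_to_array(iterateTableElements($json), false)), "\n"; // [1,2,3]

// NEVER rejects the stray scalar once iteration reaches it.
try {
    iterator_to_array(iterateTableElements($json, ITERATE_SCALARS_NEVER), false);
} catch (UnexpectedValueException $e) {
    echo $e->getMessage(), "\n"; // Scalar found at /table/1
}
```

Because the generator is lazy, the exception surfaces only when iteration reaches the offending element, which is roughly how a streaming parser would have to report it as well.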
from json-machine.
Also for a less predictable structure maybe #36 would help?
from json-machine.