Comments (12)
No, it is still not clear. You have already given counter examples yourself. Unless the N->1 mapping illustrated in your counter examples (again, many other forms exist) can be resolved, I don't see a clear round-trip-invariant conversion between MATLAB data and JSON.
The argument that an automated encoder "cannot produce" those counter examples is not strong enough. You can't predict/specify what an encoder should output. For jsonlab, any valid JSON should be accepted. But for the above reasons, it cannot guarantee level-1 round-trip convergence (i.e. arbitrary input data -> JSON -> reproduced data, or JSON -> MATLAB data -> reproduced JSON).
On the other hand, if you read my comments in this closed tracker (#1 (comment)),
one of the current design goals of jsonlab is to satisfy level-2 or higher round-trip convergence. In other words, if you iterate the input through loadjson/savejson multiple times, I want the output to be reproduced. Even this is difficult, not to mention arbitrary user input (either data or a JSON string).
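To make the difference between level-1 and level-2 round-trip convergence concrete, here is a minimal Python sketch. It uses Python's standard json module as a stand-in for loadjson/savejson, which is an assumption made purely for illustration and says nothing about jsonlab's internals; the helper name round_trip is hypothetical.

```python
import json

def round_trip(text: str) -> str:
    """Decode a JSON string and re-encode it with one fixed formatting."""
    return json.dumps(json.loads(text), separators=(",", ":"))

# Three hand-written inputs that differ only in whitespace.
inputs = ["[1, 2, 3]", "[1,2,3]", "[\n  1,\n  2,\n  3\n]"]

first = [round_trip(s) for s in inputs]    # json -> data -> json
second = [round_trip(s) for s in first]    # one more iteration

# Level 1 fails in general: the first output need not equal the input.
level1_ok = [a == b for a, b in zip(inputs, first)]
# Level 2 holds here: once normalised, further iterations change nothing.
level2_ok = [a == b for a, b in zip(first, second)]

print(level1_ok)   # [False, True, False]
print(level2_ok)   # [True, True, True]
```

All three inputs collapse onto the same output string, which is the N->1 mapping in its simplest form: the encoder cannot know which of the inputs it started from.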
from jsonlab.
You can't predict/specify what an encoder should output.
Pardon?
What I meant was that, as a general-purpose JSON parser (at least that is what loadjson aims to be), I cannot specify the behavior and format of users' JSON encoders - they can use any encoder!
In fact, judging from past emails from users and posts on the mailing list, many of them (>30%) produced their JSON data with an encoder other than savejson() - some data were web-streamed, some were dumped from dedicated software, and some were hand-coded. From loadjson's perspective, the only restriction I can impose is that the input must be a valid JSON string. Other than that, I have no control over the nesting depth of the brackets, the use of whitespace, or the user's preferred data format.
For jsonlab, any valid JSON should be accepted.
Maybe that's the point where we disagree
I agree. But jsonlab is open-source, and both savejson and loadjson have straightforward structures. If you want to contribute round-trip conservation for MATLAB-originated data, you are absolutely welcome! I just don't know how to do it (i.e. how to implement it in a consistent and efficient way).
from jsonlab.
Because of this problem :
b = {1, 2, 3}
savejson('', b)
ans =
[
1,
2,
3
]
c = [1, 2, 3]
savejson('', c)
ans =
[1,2,3]
So you should be able to tell that it is a cell when the JSON contains line breaks.
Another bug :
d = {1, 2, [1 2 3]}
savejson('', d)
ans =
[
1,
2,
[1,2,3]
]
loadjson(ans)
Error using reshape
Size arguments must be real integers.
Error in loadjson>parse_array (line 194)
object=reshape(obj,dim2,numel(obj)/dim2)';
Error in loadjson (line 104)
data{jsoncount} = parse_array(opt);
from jsonlab.
@sheljohn, to make the data round-trip invariant, you need to disable the "FastArrayParser" by setting it to 0
res=loadjson(savejson('',foo),'FastArrayParser',0)
to save the data with single
otherwise
res=loadjson(savejson('',foo))
gives you a row vector, and
res=loadjson(savejson('',foo,'SingletArray',1))
gives you a column vector.
A JSON array is defined as a 1D object list. It has no built-in definition of columns and rows for high-dimensional arrays. So, when exporting from a MATLAB cell array/matrix, I have to use [] to signify a row in a high-dimensional array. As a result, in jsonlab, [ [1], [2], [3] ] denotes a 3x1 vector, and [1, 2, 3] denotes a 1x3 row vector.
The difficulty you noticed is a result of ambiguities in the mapping between cells/arrays and a JSON object. A JSON array is closer to a cell. But when it has a regular numerical structure, I want MATLAB to use cell2mat to collapse it into an array. If one does not like this automatic conversion, one can use SimplifyCell and FastArrayParser to prevent it.
I don't see a clear fix for this, again because of the ambiguity of the mapping. So I am closing this tracker, but feel free to reopen it if you want to suggest a fix.
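The row/column convention described in this comment can be sketched in a few lines of Python. This is only a model of the stated convention (a flat array is a row vector, each inner [] is one row, irregular input stays a cell); it is not jsonlab's code, it handles two dimensions only, and the function name matrix_shape is hypothetical.

```python
import json

def matrix_shape(node):
    """Return (rows, cols) if node collapses to a numeric matrix, else None."""
    def is_num(x):
        return isinstance(x, (int, float)) and not isinstance(x, bool)

    if not isinstance(node, list) or not node:
        return None
    if all(is_num(e) for e in node):
        return (1, len(node))                  # flat array: 1xN row vector
    rows_ok = all(
        isinstance(r, list) and r and all(is_num(e) for e in r) for r in node
    )
    if rows_ok and len({len(r) for r in node}) == 1:
        return (len(node), len(node[0]))       # each inner [] is one row
    return None                                # irregular: keep it as a cell

samples = ["[1, 2, 3]", "[[1], [2], [3]]", "[[1, 2], [3, 4], [5, 6]]", "[1, 2, [1, 2, 3]]"]
shapes = [matrix_shape(json.loads(s)) for s in samples]
print(shapes)   # [(1, 3), (3, 1), (3, 2), None]
```

The last sample is the {1, 2, [1 2 3]} case reported earlier in the thread: it mixes scalars and arrays, so under this model it cannot be collapsed and has to remain a cell.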
from jsonlab.
@fangq
JSON does not have built-in definitions for container shapes, that is true, but that is not the problem here.
There is no "ambiguity" whatsoever when it comes to distinguishing between cells and vectors:
[ [1], [2], [3] ] maps to the (shape-less) cell {1,2,3}
[1, 2, 3] maps to the vector [1,2,3]
[ [1], [[2]], [3,4] ] maps to the cell {1,{2},[3,4]}
If you are not convinced, then you should realise that in Matlab {1,2,3} is equivalent to {[1],[2],[3]}, and [ [1], [2], [3] ] is equivalent to [1,2,3]; so arrays of arrays are always arrays, and cells of scalars are cells of arrays, and hence there is a way to distinguish unambiguously between cells and arrays in JSON (note that a cell of struct can also be distinguished from a struct-array in a similar manner).
The only "ambiguity" (which cannot be produced by an automated writer) is that in theory [ [1], [2], 3 ], [ [1], 2, 3 ], [ 1, [2], 3 ] etc. all map to the same cell {1,2,3}, even though I would argue that they should be treated as errors, and as I said there is no way to generate them with a correct writer.
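The mapping proposed in this comment can be modelled with a short Python sketch. The modelling choices are assumptions for illustration: a Python tuple stands for a Matlab numeric vector (a scalar is a 1-element tuple), a Python list stands for a cell, mixed arrays are rejected as errors, and the empty cell is left out because this encoding cannot tell it apart from the empty array. The names encode and decode are hypothetical.

```python
import json

def encode(value):
    """Tuple -> flat JSON array of numbers; list (cell) -> array of arrays."""
    if isinstance(value, tuple):
        return list(value)
    return [encode(item) for item in value]

def decode(node):
    """Inverse of encode; rejects arrays that mix numbers and arrays."""
    if all(isinstance(e, (int, float)) for e in node):
        return tuple(node)                   # numeric vector (or empty array)
    if all(isinstance(e, list) for e in node):
        return [decode(e) for e in node]     # cell
    raise ValueError("mixed array: not produced by encode()")

cell = [(1,), [(2,)], (3, 4)]                # models {1,{2},[3,4]}
text = json.dumps(encode(cell))
print(text)                                  # [[1], [[2]], [3, 4]]
print(decode(json.loads(text)) == cell)      # True
print(decode(json.loads("[1, 2, 3]")))       # (1, 2, 3)
```

Under this model the round trip starting from the data side is exact, while hand-written forms such as [ [1], 2, 3 ] raise an error instead of being silently mapped onto {1,2,3}.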
from jsonlab.
From your proposed mapping, I think you prefer using cells as containers. In that case, just set both "FastArrayParser" and "SimplifyCell" to 0. Everything becomes cell elements. But in my opinion, a cell is not the most efficient data form for many applications, especially when processing such data inside MATLAB. On the other hand, an array is the most natural and convenient data form in MATLAB; converting cells into arrays, whenever possible, is a natural choice.
Out of curiosity, what do you propose [ [1,2], [3,4], [5,6] ] should map to? Also, I don't fully understand your statement
"arrays of arrays are always arrays, and therefore can be used in JSON to represent cells without ambiguity"
In that regard, should [ [1], [2], [3] ] be mapped to an array or a cell?
You gave a valid ambiguity example at the end, but it is just one of many possibilities. You can add nested brackets to the elements, such as [[]], [[], []] etc., uniformly or to a subset of elements. I cannot reject these forms because they are all valid JSON (check it at http://jsonlint.com/). If you prefer, you can keep the data inside their cell containers, but letting cell2mat collapse the cell elements into an array makes the data much easier to process.
from jsonlab.
I updated my comment, maybe it is clearer now? I do not have a preference for any container, and turning everything into a cell array is incorrect and not what I would expect a parser to do. I just don't understand why you would say that there is no Matlab-complete reversible mapping with JSON, because there clearly exists one.
[ [1,2], [3,4], [5,6] ] is clearly a cell, because an array of arrays in Matlab would reduce to an array, and therefore arrays of arrays do not exist. Again, if that surprises you, it is probably because you are thinking about array-shapes, which don't exist in JSON. It is idiotic to use newlines and tabulations as shape specifiers, because no one would expect a compact or binary representation to allow such characters to be present (and good luck for maintenance too). Instead, it would make sense that shaped-arrays would be translated to JSON objects (whatever the source language), with parameterisable fieldnames (say parameter prefix for instance, because Matlab is so stringent on what a struct-fieldname can be).
Similarly, [ [], [], [] ] is a 3-cell; if you meant an empty array you should write [], and there is no array with empty elements in Matlab. Similarly, [ [1], [2], [3] ] is a cell, and [1,2,3] is an array.
Is that clearer?
from jsonlab.
The argument that an automated encoder "cannot produce" those counter examples is not strong enough.
If these cases cannot be produced, they should never be encountered, should they? This opinion sounds very subjective to me.
You can't predict/specify what an encoder should output.
Pardon?
For jsonlab, any valid JSON should be accepted.
Maybe that's the point where we disagree; imo the important round-trip to support is starting from Matlab, not JSON (nothing starts from JSON). And that round-trip can absolutely be guaranteed, while also ensuring that any valid JSON would be accepted.
from jsonlab.
Because I consistently found issues with all the parsers I found online so far (I didn't test the ones relying on C++ libraries, but that might be a bit overkill in terms of requirements), I started working on a version of my own. As I am doing this on the side, it will take some time, and I am not aiming for performance in the first instance, nor to support Unicode. One of the design requirements I have found so far is that parsing needs to be done in two depth-first passes for a correct output to be produced. A single pass is not enough.
When I am done though, I would be happy to post a comment here and if you have time to give some feedback that would be great. No worries if you're not interested though.
from jsonlab.
@GeoffreyGamaya I ended up writing my own parser and it works much better than any other one I could find online. Please give it a try if you're interested, the repository is:
https://github.com/Sheljohn/Deck
There are plenty of other things in there, but the parser can be used with:
dk.json.read: read JSON file
dk.json.decode: parse JSON string
dk.json.write: write to JSON file
dk.json.encode: return JSON string
from jsonlab.
I made some tests and yours seems better in terms of recovering cell values.
Thx for the link!
One bug: when you have a single-column cell like {1; 2; 3}, your parser returns a single-row cell like {1, 2, 3}.
from jsonlab.
@GeoffreyGamaya You can open an issue in my repo about this, and I can explain at length why it is not possible to strictly resolve array shapes with JSON encoding. If you want to preserve shape, then the right thing to do is to store a structure with size/type/data fields instead. Feel free to open that issue and I can give you a better explanation.
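The suggestion above, storing a structure with size/type/data fields, can be sketched as follows. This is a Python illustration under stated assumptions, not the format used by either parser: the field names are taken from the comment, the data are flattened in row-major order (Matlab itself is column-major), only two dimensions are handled, and the helper names encode_shaped and decode_shaped are hypothetical.

```python
import json

def encode_shaped(rows, dtype="double"):
    """Store a 2-D list as an object with explicit size, type and flat data."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    flat = [x for row in rows for x in row]    # row-major flattening
    return json.dumps({"size": [n_rows, n_cols], "type": dtype, "data": flat})

def decode_shaped(text):
    """Rebuild the 2-D list from the size and data fields."""
    obj = json.loads(text)
    n_rows, n_cols = obj["size"]
    data = obj["data"]
    return [data[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]

column = [[1], [2], [3]]    # 3x1
row = [[1, 2, 3]]           # 1x3

print(encode_shaped(column))
print(decode_shaped(encode_shaped(column)) == column)   # True
print(decode_shaped(encode_shaped(row)) == row)         # True
```

The two vectors share the same flat data and differ only in the size field, so the shape survives the round trip without relying on bracket nesting or whitespace.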
from jsonlab.