Comments (2)
@giancarlobi pinging you here again. This is old, but I keep these issues as my personal diary, and when I keep coming back to the same idea I have to conclude it's worth pursuing.
Question: I wonder if we can measure memory usage in PHP based just on different JSON string lengths?
Now that I'm working more closely on importing XML and CSV directly into JSON, I want to be sure we can handle anything.
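One rough way to approach the question above is to compare memory_get_usage() before and after json_decode() for strings of growing size. This is only a sketch: build_sample_json() and measure_decode_memory() are hypothetical helper names, the sample data is synthetic, and real ADO metadata will have a different shape and ratio.

```php
<?php
// Hypothetical helper: builds a JSON array string holding $items small objects.
function build_sample_json(int $items): string {
    $rows = [];
    for ($i = 0; $i < $items; $i++) {
        $rows[] = ['id' => $i, 'label' => 'item ' . $i];
    }
    return json_encode($rows);
}

// Returns the string length and the memory gained while the decoded
// structure is alive (measured with PHP's own allocator counters).
function measure_decode_memory(string $json): array {
    gc_collect_cycles();
    $before = memory_get_usage();
    $decoded = json_decode($json, true);
    $after = memory_get_usage();
    unset($decoded);
    return ['length' => strlen($json), 'decoded_bytes' => $after - $before];
}

$results = [];
foreach ([100, 1000, 10000] as $items) {
    $result = measure_decode_memory(build_sample_json($items));
    $results[] = $result;
    printf("%d bytes of JSON -> %d bytes once decoded\n", $result['length'], $result['decoded_bytes']);
}
```

The numbers depend on the PHP version and on how nested the JSON is, so they are better read as a ratio per data shape than as a fixed rule.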
The idea would be to decide at which point (string size) we swap the normal json_decode for one of these (especially the Salsify one; I can't stop thinking this is something California related?).
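The swap itself could be as small as a size check. The sketch below is an assumption, not existing code: STREAMING_THRESHOLD_BYTES is a made-up value that would have to come from real measurements, and $streamingDecoder is a placeholder callable standing in for whichever streaming parser ends up being used.

```php
<?php
// Hypothetical threshold; the real value would come from measurements.
const STREAMING_THRESHOLD_BYTES = 50 * 1024 * 1024;

/**
 * Decodes $json with json_decode() when it is shorter than $threshold bytes,
 * otherwise hands it to $streamingDecoder (placeholder for a streaming parser).
 */
function decode_by_size(string $json, callable $streamingDecoder, int $threshold = STREAMING_THRESHOLD_BYTES) {
    if (strlen($json) < $threshold) {
        return json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    }
    return $streamingDecoder($json);
}

// Usage: a small string takes the json_decode() path.
$small = decode_by_size('{"a":1}', function (string $json) {
    return 'streamed';
});
```

Keeping the streaming side behind a callable means the threshold logic can be tested without committing to a specific parser.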
@alliomeria I know this is not your realm, but I will keep pinging you. The idea here is that when we take a very, very large JSON string, loading and transforming it into a PHP structure uses memory, sometimes a lot. What you see in the SBF raw is quite small; I will share a HUGE one here coming from a Finding Aid import. The same happens when you have 1000 pages for a book and it's not a PDF nor a Sub Graph/Linked Objects Graph (I hate the word compound). This issue is to think about when to change the way we decode JSON, the implications, and the why. A JSON string can be up to 2 GB in our database!
If we can use the Salsify code, we could do:
use JsonStreamingParser\Listener\InMemoryListener;

$listener = new InMemoryListener();
// $testfile here would need to be a streaming HTTP endpoint or a pointer. We do not use files!
$stream = fopen($testfile, 'r');
try {
    $parser = new \JsonStreamingParser\Parser($stream, $listener);
    $parser->parse();
} finally {
    // Close the stream whether or not parsing throws; exceptions still propagate.
    fclose($stream);
}
$actual = $listener->getJson(); // Same as json_decode() but with a lower memory footprint.
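Since the JSON lives in the database and not in a file, the parser would need a stream built from something else. A minimal sketch, assuming the string (or its chunks) is already at hand: json_string_to_stream() is a hypothetical helper, and it relies only on PHP's built-in php://temp wrapper, which keeps data in memory up to a limit and then spills to a temporary file. php://memory could be used instead if nothing may ever touch disk.

```php
<?php
/**
 * Hypothetical helper: copies a JSON string into a php://temp stream
 * and rewinds it so a streaming parser can read from the start.
 */
function json_string_to_stream(string $json, int $maxMemoryBytes = 2097152) {
    $stream = fopen('php://temp/maxmemory:' . $maxMemoryBytes, 'r+');
    if ($stream === false) {
        throw new RuntimeException('Could not open php://temp stream.');
    }
    fwrite($stream, $json);
    rewind($stream);
    return $stream;
}

$stream = json_string_to_stream('{"label": "test"}');
$firstBytes = fread($stream, 8); // Read back the first 8 bytes as a check.
fclose($stream);
```

If the whole string is already loaded in PHP this copy adds cost, so the benefit only shows up when the data can be fetched and written in chunks.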
The real beauty lies here
https://github.com/salsify/jsonstreamingparser/tree/master/src/Listener
Basically, when you parse like this, the listener provides events and data as the JSON parsing progresses, which allows us, e.g., to get only the data we need and keep all the rest out of memory. There are also format-specific listeners (looking at @dmer), which means we can deal with some pretty intensive data coming from huge GeoJSONs.
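To make the event idea concrete, here is a standalone sketch of "keep only what we need". WantedKeysCollector is a hypothetical class: its key() and value() methods are assumed to mirror the kind of callbacks a streaming listener receives, but it does not implement the library's real interface, and the events are simulated by hand. It only handles flat key/scalar pairs.

```php
<?php
// Standalone sketch: not the library's listener interface, only the same idea.
class WantedKeysCollector {
    private array $wanted;
    private ?string $currentKey = null;
    private array $found = [];

    public function __construct(array $wantedKeys) {
        // Flip so key lookups are O(1) isset() checks.
        $this->wanted = array_flip($wantedKeys);
    }

    // Called when the parser reads an object key.
    public function key(string $key): void {
        $this->currentKey = $key;
    }

    // Called when the parser reads a scalar value; unwanted values are dropped.
    public function value($value): void {
        if ($this->currentKey !== null && isset($this->wanted[$this->currentKey])) {
            $this->found[$this->currentKey][] = $value;
        }
        $this->currentKey = null;
    }

    public function getFound(): array {
        return $this->found;
    }
}

// Simulated events for {"label": "Map", "ocr": "...", "type": "Feature"}.
$collector = new WantedKeysCollector(['label', 'type']);
foreach ([['label', 'Map'], ['ocr', '...'], ['type', 'Feature']] as [$k, $v]) {
    $collector->key($k);
    $collector->value($v);
}
$found = $collector->getFound();
```

The large "ocr" value passes through the collector but is never stored, which is where the memory saving would come from.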
This example is gold
https://github.com/salsify/jsonstreamingparser/blob/master/example/example_regex.php
Thanks!
from strawberryfield.
PS: this is not for Archipelago (right now), but since @giancarlobi and I are using ReactPHP for realtime background processing, it's good to mention that you can stream MySQL:
https://clue.engineering/2018/introducing-reactphp-mysql