Comments (2)

d-frey commented on June 15, 2024

The question is what parts of our library you are using and where the bottlenecks are. Given the modular design of taoJSON, you could use simdjson as a parser (or any other parser) and feed events from that parser to our events interface, hence making use of all the other parts our library offers. Start by adding benchmarks and profiling to your code and make sure you focus on the parts that are most relevant.
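
To make the events idea concrete, here is a minimal sketch that depends on neither library. A producer walks some input and calls methods on a consumer, so any consumer exposing those methods can be plugged in. The method names (`null`, `boolean`, `number`, `string`, `begin_array`, `element`, `end_array`, `begin_object`, `key`, `member`, `end_object`) are intended to mirror the taoJSON events interface; `Node`, `produce`, `TraceConsumer` and `trace_example` are hypothetical stand-ins, with `Node` playing the role that a simdjson document would play.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for a parsed document (the role simdjson would play).
struct Node {
    enum class Kind { Null, Boolean, Integer, String, Array, Object };
    Kind kind = Kind::Null;
    bool b = false;
    std::int64_t i = 0;
    std::string s;
    std::vector<Node> items;        // array elements or object values
    std::vector<std::string> keys;  // object keys, parallel to items
};

// Producer: walks the tree and emits one event per token to any consumer.
template <typename Consumer>
void produce(const Node& n, Consumer& c) {
    switch (n.kind) {
        case Node::Kind::Null: c.null(); break;
        case Node::Kind::Boolean: c.boolean(n.b); break;
        case Node::Kind::Integer: c.number(n.i); break;
        case Node::Kind::String: c.string(std::string_view(n.s)); break;
        case Node::Kind::Array:
            c.begin_array();
            for (const Node& item : n.items) {
                produce(item, c);
                c.element();
            }
            c.end_array();
            break;
        case Node::Kind::Object:
            assert(n.keys.size() == n.items.size());
            c.begin_object();
            for (std::size_t k = 0; k < n.items.size(); ++k) {
                c.key(std::string_view(n.keys[k]));
                produce(n.items[k], c);
                c.member();
            }
            c.end_object();
            break;
    }
}

// Consumer: records the events it receives as text.
struct TraceConsumer {
    std::string trace;
    void null() { trace += "null "; }
    void boolean(bool v) { trace += v ? "true " : "false "; }
    void number(std::int64_t v) { trace += std::to_string(v) + " "; }
    void string(std::string_view v) { trace += "\"" + std::string(v) + "\" "; }
    void begin_array() { trace += "[ "; }
    void element() {}
    void end_array() { trace += "] "; }
    void begin_object() { trace += "{ "; }
    void key(std::string_view k) { trace += std::string(k) + ": "; }
    void member() {}
    void end_object() { trace += "} "; }
};

// Builds {"a": [1, 1], "b": true} by hand and returns the recorded events.
std::string trace_example() {
    Node one;  one.kind = Node::Kind::Integer;  one.i = 1;
    Node flag; flag.kind = Node::Kind::Boolean; flag.b = true;
    Node arr;  arr.kind = Node::Kind::Array;    arr.items = {one, one};
    Node doc;  doc.kind = Node::Kind::Object;
    doc.keys = {"a", "b"};
    doc.items = {arr, flag};
    TraceConsumer consumer;
    produce(doc, consumer);
    return consumer.trace;
}
```

In a real integration the producer would iterate the simdjson document and the consumer would be one of the library's own consumers, so no intermediate tree like `Node` would be needed.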

meftunca commented on June 15, 2024
#ifndef simjson_tao_converter
#define simjson_tao_converter

#include <stdexcept>
#include <string>
#include <string_view>
#include <utility>
#include <tao/json/contrib/traits.hpp>
#include <tao/json/value.hpp>
#include "simdjson.h"

namespace convert_simdjson {

namespace tao_core {

inline tao::json::value renderWithType(simdjson::ondemand::value& token);
inline tao::json::value parse(std::string_view& str);
inline tao::json::value parse(simdjson::ondemand::value& token);
inline tao::json::value parseString(simdjson::ondemand::value& stringToken);
inline tao::json::value parseNumber(simdjson::ondemand::value& numberToken);
inline tao::json::value parseBoolean(simdjson::ondemand::value& boolToken);
inline tao::json::value parseNull();
inline tao::json::value parseObject(simdjson::ondemand::value& objectToken);
inline tao::json::value parseArray(simdjson::ondemand::value& arrayToken);
inline tao::json::value parseString(simdjson::ondemand::value& stringToken) {
    // std::cout << "stringToken.value: " << stringToken.value << std::endl;
    return std::string(stringToken.get_string().value());
}
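inline tao::json::value parseNumber(simdjson::ondemand::value& numberToken) {
    // Dispatch on the parsed number type so integers are stored as integers
    // and only floating-point input becomes a double.
    simdjson::ondemand::number number = numberToken.get_number().value();
    switch (number.get_number_type()) {
        case simdjson::ondemand::number_type::signed_integer:
            return number.get_int64();
        case simdjson::ondemand::number_type::unsigned_integer:
            return number.get_uint64();
        default:
            return number.get_double();
    }
}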
inline tao::json::value parseBoolean(simdjson::ondemand::value& boolToken) {
    return boolToken.get_bool().value();
}
inline tao::json::value parseNull() { return tao::json::null; }
inline tao::json::value parseObject(simdjson::ondemand::value& objectToken) {
    // std::cout << "Parse Object: " << objectToken.start << std::endl;
    tao::json::value result = tao::json::empty_object;
    // std::for_each(object.begin(), object.end(), [result](auto& keyToken) {
    //     auto key = keyToken.unescaped_key();
    //     auto value = keyToken.value();
    //     result.set_object(key, renderWithType(value));
    // });
    // std::for_each(objectToken.get_object().begin(),
    //               objectToken.get_object().end(), [&result](auto keyToken) {
    //                   auto key = keyToken.unescaped_key();
    //                   simdjson::ondemand::value fieldValue = keyToken.value();
    //                   auto value = renderWithType(fieldValue);
    //                   result[key] = value;
    //               });
    for (auto field : objectToken.get_object()) {
        std::string keyv(std::string_view(field.unescaped_key()));
        simdjson::ondemand::value fieldValue = field.value();
        // Assign the temporary directly so the subtree is moved, not copied.
        result[keyv] = renderWithType(fieldValue);
    }
    return result;
}

inline tao::json::value parseArray(simdjson::ondemand::value& arrayToken) {
    // std::cout << "Parse Array: " << arrayToken.start << std::endl;
    tao::json::value result = tao::json::value::array({});
    // result.reserve(arrayToken.count_elements());
    auto array = arrayToken.get_array();

    for (auto item : array) {
        // Pass the temporary directly so the element is moved, not copied.
        result.push_back(renderWithType(item.value()));
    }
    return result;
}

inline tao::json::value parse(std::string_view& str) {
    // std::cout << "Parse JSON" << std::endl;
    simdjson::ondemand::parser parser;

    simdjson::padded_string json = simdjson::padded_string(str);
    simdjson::ondemand::document data = parser.iterate(json);
    simdjson::ondemand::value val = data;
    // return tao::json::empty_object;
    return renderWithType(val);
}

inline tao::json::value renderWithType(simdjson::ondemand::value& token) {
    switch (token.type()) {
        case simdjson::ondemand::json_type::array:
            return parseArray(token);
        case simdjson::ondemand::json_type::object:
            return parseObject(token);
        case simdjson::ondemand::json_type::string:
            return parseString(token);
        case simdjson::ondemand::json_type::number:
            return parseNumber(token);
        case simdjson::ondemand::json_type::boolean:
            return parseBoolean(token);
        case simdjson::ondemand::json_type::null:
            return parseNull();
        default:
            throw std::logic_error("Unsupported JSON type");
    }
}

}  // namespace tao_core

}  // namespace convert_simdjson

#endif  // simjson_tao_converter
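
A recursive converter of this shape is sensitive to whether each finished subtree is moved or copied into its parent: appending an lvalue copies the whole subtree, and that happens again at every enclosing level. The sketch below counts copies using a hypothetical `Tree` type (not a taoJSON type) to show the difference. Whether copying is the dominant cost in the converter is an assumption that profiling would need to confirm.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical value type that counts how often it is copied.
struct Tree {
    static inline std::size_t copies = 0;
    std::vector<Tree> children;

    Tree() = default;
    Tree(const Tree& other) : children(other.children) { ++copies; }
    Tree(Tree&&) noexcept = default;
    Tree& operator=(const Tree& other) {
        children = other.children;
        ++copies;
        return *this;
    }
    Tree& operator=(Tree&&) noexcept = default;
};

// Builds a chain of the given depth the way a recursive converter would:
// the child is built first, then appended to its parent.
Tree build(std::size_t depth, bool use_move) {
    Tree result;
    if (depth > 0) {
        Tree child = build(depth - 1, use_move);
        if (use_move) {
            result.children.push_back(std::move(child));  // steals the subtree
        } else {
            result.children.push_back(child);  // copies the whole subtree
        }
    }
    return result;
}

// Returns how many Tree copies were made while building the chain.
std::size_t count_copies(std::size_t depth, bool use_move) {
    Tree::copies = 0;
    Tree tree = build(depth, use_move);
    assert(tree.children.size() == (depth > 0 ? 1u : 0u));
    return Tree::copies;
}
```

For a chain of depth D the copying variant makes D(D+1)/2 copies, while the moving variant makes none, so deeply nested documents are affected most.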

@d-frey I get a very slow conversion speed with this build:

Run on (12 X 2600 MHz CPU s)
CPU Caches:
L1 Data 32 KiB
L1 Instruction 32 KiB
L2 Unified 256 KiB (x6)
L3 Unified 12288 KiB
Load Average: 3.16, 3.00, 2.86

Benchmark                    Time          CPU  Iterations  UserCounters...
BM_JSON_SIMD_TO_TAO   15475392 ns  15444750 ns          44  FileSize=631.514k bytes_per_second=38.9944M/s items_per_second=40.8886M/s
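
The rate counters in the benchmark row can be reproduced from its other columns, which helps when comparing the figure against other parsers. The sketch below assumes the file is 631514 bytes (the report rounds it to 631.514k), that the rates are derived from the CPU time column, and that `bytes_per_second` uses M = 2^20 while `items_per_second` uses M = 10^6. These assumptions are inferred from the numbers, not stated in the thread, and the function names are hypothetical.

```cpp
#include <cassert>

// Bytes processed per second, given bytes and time per iteration in nanoseconds.
inline double bytes_per_second(double bytes, double nanoseconds) {
    assert(nanoseconds > 0.0);
    return bytes / (nanoseconds * 1e-9);
}

// Rate in units of 2^20 bytes per second, using the CPU time from the report.
inline double reported_mib_per_second() {
    return bytes_per_second(631514.0, 15444750.0) / (1024.0 * 1024.0);
}

// Rate in units of 10^6 bytes per second, using the CPU time from the report.
inline double reported_mb_per_second() {
    return bytes_per_second(631514.0, 15444750.0) / 1e6;
}
```

Under those assumptions the two functions land close to the reported 38.9944 and 40.8886, so the conversion runs at roughly 40 MB/s of input.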
