
toml-f / toml-f


TOML parser implementation for data serialization and deserialization in Fortran

Home Page: https://toml-f.readthedocs.io

License: Apache License 2.0

Meson 4.79% Fortran 91.90% CMake 3.04% Python 0.27%
serde toml fortran-library toml-parser fortran-package-manager

toml-f's Introduction

TOML parser for Fortran projects


A TOML parser implementation for data serialization and deserialization in Fortran.

TOML-Fortran

Installation

The TOML Fortran library is available via various distribution channels. If your channel is not listed here, please check out the instructions for building from source. You can also find these instructions in the user documentation at Installing TOML Fortran.

Conda package


This project is packaged for the mamba package manager and available on the conda-forge channel. To install the mamba package manager we recommend the mambaforge installer. If the conda-forge channel is not yet enabled, add it to your channels with

mamba config --add channels conda-forge
mamba config --set channel_priority strict

Once the conda-forge channel has been enabled, TOML Fortran can be installed with mamba:

mamba install toml-f

It is possible to list all of the versions of TOML Fortran available on your platform with mamba:

mamba repoquery search toml-f --channel conda-forge

FreeBSD port


A port for FreeBSD is available and can be installed using

pkg install textproc/toml-f

In case no package is available, build the port using

cd /usr/ports/textproc/toml-f
make install clean

For more information see the toml-f port details.

Alternative distributions

Please let us know if you are packaging TOML Fortran. Other available distributions of TOML Fortran currently include

An overview of the availability of TOML Fortran in distributions tracked by Repology is provided here:

Packaging status

Building from source

To build this project from the source code in this repository you need to have

  • a Fortran compiler supporting Fortran 2008

    • GFortran 5 or newer
    • Intel Fortran 18 or newer
    • NAG 7 or newer
  • One of the supported build systems

Get the source by cloning the repository

git clone https://github.com/toml-f/toml-f
cd toml-f

Building with meson

To integrate TOML Fortran in your meson project, check out the Integrate with meson recipe.

To build this project with meson a build-system backend is required, i.e. ninja version 1.7 or newer. Set up a build with

meson setup _build --prefix=/path/to/install

You can select the Fortran compiler by the FC environment variable. To compile the project run

meson compile -C _build

We employ a validator suite to test the standard compliance of this implementation. To use this testing a Go installation is required. The installation of the validator suite will be handled by meson automatically without installing into the user's Go workspace. Run the tests with

meson test -C _build --print-errorlogs

The binary used for transcribing the TOML documents to the testing format is _build/test/toml2json and can be used to check individual tests.

Finally, you can install TOML Fortran using

meson install -C _build

Building with CMake

To integrate TOML Fortran in your CMake project, check out the Integrate with CMake recipe.

While meson is the preferred way to build this project, it also offers CMake support. Configure the CMake build with

cmake -B _build -G Ninja -DCMAKE_INSTALL_PREFIX=/path/to/install

Similar to meson the compiler can be selected with the FC environment variable. You can build the project using

cmake --build _build

You can run basic unit tests using

ctest --test-dir _build

The validation suite is currently not supported as a unit test for CMake builds and instead requires a manual setup using the toml2json binary.

Finally, you can install TOML Fortran using

cmake --install _build

Building with fpm

To integrate TOML Fortran in your fpm project, check out the Using the Fortran package manager recipe.

The Fortran package manager (fpm) supports the addition of TOML Fortran as a dependency. In the package manifest, fpm.toml, you can add TOML Fortran as a dependency via:

[dependencies]
toml-f.git = "https://github.com/toml-f/toml-f"

Then build and test normally.

fpm build
fpm test

A more detailed example is described in example 1.

Documentation

The user documentation is available at readthedocs. Additionally, the FORD generated API documentation is available here.

To build the user documentation locally we use sphinx. To install the dependencies you can use the mamba package manager

mamba create -n sphinx --file doc/requirements.txt
mamba activate sphinx

The documentation is built with

sphinx-build doc _doc

You can inspect the generated documentation by starting a webserver

python3 -m http.server -d _doc

Then open the displayed URL in a browser.

Translating the documentation

The documentation of TOML Fortran can be fully translated. Before adding a translation, reach out to the repository maintainers by creating an issue or starting a discussion thread.

To start a new translation you need the sphinx-intl package which can be installed with mamba

mamba install -n sphinx sphinx-intl

To add a new language to the translation extract the text with sphinx-build and create the respective locales with sphinx-intl using the commands shown below.

sphinx-build -b gettext doc _gettext
sphinx-intl update -l en -p _gettext -d doc/locales

Replace the argument to the language flag -l with your target language; the language keys are listed here. The same workflow can be used for updating existing locales. The translation files are available in doc/locales and can be translated using a translation editor, like gtranslator or poedit.

After a new translation is merged, a maintainer will create a new translation on readthedocs to ensure it shows up on the pages.

Generating the API docs

The API documentation is generated with FORD. We are looking for a better tool to automatically extract the documentation; suggestions and help with this effort are welcome.

The required programs can be installed with mamba

mamba create -n ford ford
mamba activate ford

To generate the pages use

ford docs.md -o _ford

You can inspect the generated documentation by starting a webserver

python3 -m http.server -d _ford

Usage

To make use of this library, use the tomlf module in your projects. You can access the individual modules, but those are not considered part of the public API and might change between versions.

An example program to load and dump a TOML file would look like this:

use tomlf
implicit none
character(len=*), parameter :: nl = new_line("a")
type(toml_table), allocatable :: table
character(kind=tfc, len=:), allocatable :: input_string
type(toml_serializer) :: ser

input_string = &
   & '# This is a TOML document.' // nl // &
   & 'title = "TOML Example"' // nl // &
   & '[owner]' // nl // &
   & 'name = "Tom Preston-Werner"' // nl // &
   & 'dob = 1979-05-27T07:32:00-08:00 # First class dates' // nl // &
   & '[database]' // nl // &
   & 'server = "192.168.1.1"' // nl // &
   & 'ports = [ 8001, 8001, 8002 ]' // nl // &
   & 'connection_max = 5000' // nl // &
   & 'enabled = true' // nl // &
   & '[servers]' // nl // &
   & '  # Indentation (tabs and/or spaces) is allowed but not required' // nl // &
   & '  [servers.alpha]' // nl // &
   & '  ip = "10.0.0.1"' // nl // &
   & '  dc = "eqdc10"' // nl // &
   & '  [servers.beta]' // nl // &
   & '  ip = "10.0.0.2"' // nl // &
   & '  dc = "eqdc10"' // nl // &
   & '[clients]' // nl // &
   & 'data = [ ["gamma", "delta"], [1, 2] ]' // nl // &
   & '# Line breaks are OK when inside arrays' // nl // &
   & 'hosts = [' // nl // &
   & '  "alpha",' // nl // &
   & '  "omega"' // nl // &
   & ']'

call toml_loads(table, input_string)
if (allocated(table)) then
   call table%accept(ser)
   call table%destroy  ! not necessary
end if
end

Here the TOML document is provided as a string. Note that you have to add a newline character using the intrinsic function new_line("a") to separate the lines correctly.

Alternatively, a file can be loaded from any connected, formatted unit using the same overloaded function. For the standard input the intrinsic input_unit should be passed. If the TOML file is successfully parsed the table will be allocated and can be written to the standard output by passing the toml_serializer as visitor to the table.
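As a minimal sketch of loading from a connected unit, the program below opens a file and hands the unit to the loader. It assumes a release of TOML Fortran that exports toml_load and toml_error from the tomlf module with an optional error argument (older releases may expose this as toml_parse instead); the file name config.toml is made up for illustration.

```fortran
program load_from_unit
   ! Sketch only: needs TOML Fortran installed and linked to build.
   ! Assumption: toml_load accepts a connected unit and an optional error object.
   use tomlf, only : toml_table, toml_error, toml_load
   implicit none
   type(toml_table), allocatable :: table
   type(toml_error), allocatable :: error
   integer :: io

   ! "config.toml" is a hypothetical file name; the open statement aborts
   ! the program if the file does not exist.
   open(newunit=io, file="config.toml", status="old", action="read")
   call toml_load(table, io, error=error)
   close(io)

   if (allocated(error)) then
      ! Parsing failed: report the message collected by the parser.
      print '(a)', error%message
      error stop 1
   end if

   if (allocated(table)) print '(a)', "config.toml parsed successfully"
end program load_from_unit
```

Checking the optional error object in addition to the allocation status of the table gives the user a message to act on rather than a silent failure.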

For more details check out the documentation pages. If you find an error in the documentation or a part is incomplete, please open an issue or start a discussion thread.

Contributing

This is a volunteer open source project and contributions are always welcome. Please take a moment to read the contributing guidelines on how to get involved in TOML-Fortran.

License

TOML-Fortran is free software: you can redistribute it and/or modify it under the terms of the Apache License, Version 2.0 or MIT license at your option.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an as is basis, without warranties or conditions of any kind, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in TOML-Fortran by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

toml-f's People

Contributors

aradi, aslozada, awvwgk, bhourahine, dmejiar, emilyviolet, haozeke, kjelljorner, rscohn2, yundantianchang


toml-f's Issues

Variable declaration: class type instead of derived type

When declaring variables for toml-f, the library expects the derived types, like toml_table, to be of class instead of type:

    class(toml_table), allocatable :: table
    class(toml_table), pointer     :: child

Is there any reason for this? Normally, class is used to define dummy arguments of object references only (this/self/…), not for the declaration of object-oriented types in general. Or am I missing something?

Pass invalid decoder tests

The TOML Fortran parser does not flag all invalid usage correctly at the moment. The following invalid decoder tests are passing:

  • invalid/control/bare-cr
  • invalid/control/comment-cr
  • invalid/control/comment-del
  • invalid/control/comment-lf
  • invalid/control/comment-null
  • invalid/control/comment-us
  • invalid/control/multi-del
  • invalid/control/multi-lf
  • invalid/control/multi-null
  • invalid/control/multi-us
  • invalid/control/rawmulti-del
  • invalid/control/rawmulti-lf
  • invalid/control/rawmulti-null
  • invalid/control/rawmulti-us
  • invalid/control/rawstring-del
  • invalid/control/rawstring-lf
  • invalid/control/rawstring-null
  • invalid/control/rawstring-us
  • invalid/control/string-bs
  • invalid/control/string-del
  • invalid/control/string-lf
  • invalid/control/string-null
  • invalid/control/string-us
  • invalid/datetime/hour-over
  • invalid/datetime/mday-over
  • invalid/datetime/mday-under
  • invalid/datetime/minute-over
  • invalid/datetime/month-over
  • invalid/datetime/month-under
  • invalid/datetime/second-over
  • invalid/encoding/bad-utf8-in-comment
  • invalid/encoding/bad-utf8-in-string
  • invalid/float/double-point-1
  • invalid/float/double-point-2
  • invalid/float/exp-double-e-1
  • invalid/float/exp-double-e-2
  • invalid/float/exp-leading-us
  • invalid/float/exp-point-2
  • invalid/float/trailing-us-exp
  • invalid/inline-table/add
  • invalid/inline-table/trailing-comma
  • invalid/integer/incomplete-bin
  • invalid/integer/incomplete-oct
  • invalid/integer/incomplete-hex
  • invalid/integer/negative-bin
  • invalid/integer/negative-hex
  • invalid/integer/positive-bin
  • invalid/integer/positive-hex
  • invalid/string/bad-codepoint
  • invalid/string/basic-multiline-out-of-range-unicode-escape-1
  • invalid/string/basic-multiline-out-of-range-unicode-escape-2
  • invalid/string/basic-out-of-range-unicode-escape-1
  • invalid/string/basic-out-of-range-unicode-escape-2
  • invalid/table/append-with-dotted-keys-1
  • invalid/table/append-with-dotted-keys-2

The respective tests must be checked, and the errors must be caught correctly in the parser and reported to the user.

CI Migration

At some point a CI migration from Travis-CI to another provider has to be considered.

Travis-CI testing for this project currently covers

  • Ubuntu 18.04, GCC 5, meson
  • Ubuntu 18.04, GCC 6, meson
  • Ubuntu 18.04, GCC 7, meson

Better user interface for retrieving arrays

Based on feedback found in a post at https://emilyviolet.github.io/2022/01/29/fortran-parsing.html including TOML Fortran (see #70).

A point raised is the parsing of data, which is currently limited especially for arrays:

  • Need to use and allocate special toml-f-specific data types.
  • Can’t just do a simple toml_parse() then get_value() if there are nested sections: each section needs its own allocation, which you then call get_value() on.
  • Can’t just parse input arrays/lists directly into an array. Need to specially allocate a TOML array type and then iterate over that.

This could be improved by providing a new getter in the tomlf_build interface to retrieve a whole array

  • allow passing allocatable array
  • check how default value can be provided (scalar or/and array valued?)
  • raise error if TOML array contains heterogeneous data
  • investigate optional conversion if data is compatible
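For context, the element-by-element workflow criticised above can be sketched as follows. This assumes the tomlf module exports toml_array, len and the get_value overloads for tables and arrays as described in the project documentation; the key name ports is taken from the README example.

```fortran
program read_ports
   ! Sketch only: needs TOML Fortran installed and linked to build.
   use tomlf, only : toml_table, toml_array, toml_loads, get_value, len
   implicit none
   type(toml_table), allocatable :: table
   type(toml_array), pointer :: array
   integer, allocatable :: ports(:)
   integer :: i

   call toml_loads(table, "ports = [ 8001, 8001, 8002 ]")
   if (.not.allocated(table)) error stop "parsing failed"

   ! Step 1: obtain a pointer to the TOML array stored in the table.
   call get_value(table, "ports", array)
   if (.not.associated(array)) error stop "no ports entry"

   ! Step 2: copy the elements one by one into a plain Fortran array.
   allocate(ports(len(array)))
   do i = 1, size(ports)
      call get_value(array, i, ports(i))
   end do

   print '(*(i0, 1x))', ports
end program read_ports
```

A whole-array getter as proposed would fold steps 1 and 2 into a single call that fills an allocatable array.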

Please don't include GCC version in installed file paths

This and many of your other projects include strings like GNU-10.3.0 in their installed file paths:

$ find /usr/local/include/ -name GNU-10.3.0
/usr/local/include/mctc-lib/GNU-10.3.0
/usr/local/include/mstore/GNU-10.3.0
/usr/local/include/multicharge/GNU-10.3.0
/usr/local/include/toml-f/GNU-10.3.0
/usr/local/include/tblite/GNU-10.3.0
/usr/local/include/s-dftd3/GNU-10.3.0
/usr/local/include/dftd4/GNU-10.3.0

This causes ports/packages to break when gcc version is changed because plist would change and ports would need to be updated.

Functional interface for accessing TOML data structures

Provide a functional interface to access TOML data structures. It is questionable whether it will be possible to produce pure functions due to the internal usage of pointers to return the memory location of data structure members. However, it might be possible to provide at least functions rather than subroutines for obtaining values.

The error handling has to be decided. Both the error handling and the dispatch of the interface could be based on a fallback value, which is returned if the actual value cannot be obtained from the data structure (missing entry or wrong type).

use tomlf_functional
real :: val
real, parameter :: fallback = 0

val = get_value(table, "entry", fallback)

An alternative model is to return a type which can contain either a value or an error and the user has to do the dispatch based on the allocation status of either member. In this case setting defaults will be more difficult.
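The alternative model could look roughly like the sketch below. It is independent of TOML Fortran: the names real_result and lookup are hypothetical, and the lookup works on plain key and value arrays instead of a TOML table so the example stays self-contained.

```fortran
module result_sketch
   implicit none
   private
   public :: real_result, lookup

   !> Holds either a value or an error message, never both.
   type :: real_result
      real, allocatable :: value
      character(len=:), allocatable :: error
   end type real_result

contains

   !> Hypothetical getter; searches parallel key/value arrays, not a TOML table.
   function lookup(keys, values, key) result(res)
      character(len=*), intent(in) :: keys(:)
      real, intent(in) :: values(:)
      character(len=*), intent(in) :: key
      type(real_result) :: res
      integer :: i

      do i = 1, size(keys)
         if (keys(i) == key) then
            res%value = values(i)   ! allocates the value member
            return
         end if
      end do
      res%error = "entry '" // key // "' not found"
   end function lookup

end module result_sketch

program demo
   use result_sketch
   implicit none
   type(real_result) :: res

   ! The caller dispatches on the allocation status of either member.
   res = lookup([character(len=5) :: "alpha", "beta"], [1.0, 2.0], "beta")
   if (allocated(res%value)) print '(a, f4.1)', "value:", res%value

   res = lookup([character(len=5) :: "alpha", "beta"], [1.0, 2.0], "gamma")
   if (allocated(res%error)) print '(a)', res%error
end program demo
```

The sketch also shows why defaults are harder in this model: the caller has to branch on the result before a fallback can be substituted.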

Nagfor: problem to compile sort.f90

Unfortunately, the NAG compiler cannot compile toml-f, as it does not like the procedure pointer assignment in the pure sort_keys subroutine in sort.F90. I think it is a compiler bug, and I've already reported it to NAG. However, it would be nice to have a quick workaround to enable testing with NAG for DFTB+ with tblite included.

Possible workarounds:

  • Leave away the pure attribute from sort_keys
  • Leave away the pointer assignment and put directly the quicksort() into the two branches of the if.

Thanks!

Tests fail with PGI

Building toml-f works with NVHPC 20.7 and 20.9, but the unit tests will fail with a segfault due to the destroy type bound procedure in the abstract structure type. Most prominently this is invoked in the manual deconstructor for the toml_table with:

call self%list%destroy

For some reason the PGI compiler fails to resolve this call correctly and ends with a segfault. Since the manual deconstructor is not required, as all types are allocatable rather than pointer, this call is usually not needed in practice, but might still be used when reallocating inside the abstract structure type or deleting from a data structure.

Better serializer for TOML

The current serializer writes directly to a unit, which can be somewhat limiting in terms of IO operations.

  • allow serializer to produce a string, write to a file or a connected unit
  • make the format more customizable

Looking for (co-)maintainers

I'm looking for contributors and maintainers for this project to ensure it stays available for the Fortran community in the long term.

Current status

  • Support for TOML v1.0.0 is available except for UTF-8 support
  • meson, CMake and fpm build system support
  • used as dependency in fpm, tblite, s-dftd3, and maybe more
  • GCC >=5, Intel >=18 and NAG >=7 are currently supported

Dependencies

Project assets

Packaging status

Return allocation on deletion from table

Currently, tables can delete values, but they just drop the allocation rather than returning it. In contrast arrays can only delete an element by returning the allocation and the user has to do the deletion (or just let the allocation go out of scope). This behavior should be more consistent, i.e. arrays should be able to delete an element and tables should be able to return an allocation from a key to ensure a backwards compatible interface.

child is always associated in get_value even if requested section is missing

Hi,
Using the Accessing nested tables tutorial as a base, I'm trying to understand how the child variable in the get_value call behaves when the requested section is missing.
When editing the example TOML file for the above tutorial to rename the hamiltonian section to something else (e.g. [foobar.dftb]), the code runs without displaying the expected error message "No hamiltonian section found in input file". Is there a bug in get_value so that child is always associated? Or am I just misunderstanding how the tutorial code works?
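A possible explanation, sketched under the assumption that the table getter takes an optional requested argument which defaults to creating a missing subtable: passing requested=.false. should leave the pointer unassociated when the section is absent. The input document below is made up to mirror the renamed section.

```fortran
program missing_section
   ! Sketch only: needs TOML Fortran installed and linked to build.
   ! Assumption: get_value for child tables accepts an optional "requested" flag.
   use tomlf, only : toml_table, toml_loads, get_value
   implicit none
   type(toml_table), allocatable :: table
   type(toml_table), pointer :: child

   ! The section is deliberately named foobar, so "hamiltonian" is missing.
   call toml_loads(table, "[foobar.dftb]" // new_line("a") // "scc = true")
   if (.not.allocated(table)) error stop "parsing failed"

   ! Without requested=.false. a missing table would be created on the fly
   ! and the pointer would come back associated.
   call get_value(table, "hamiltonian", child, requested=.false.)
   if (.not.associated(child)) then
      print '(a)', "No hamiltonian section found in input file"
   end if
end program missing_section
```

Creating missing tables by default is convenient when defaults are filled in further down, but it hides absent sections unless the flag is set.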

Revisit data structure for storing tables

The toml_table currently uses an array internally to store the keys, which can be inefficient for a large number of keys. It might be useful to adopt a binary search tree or a hash map instead. See src/tomlf/structure for the current implementation.

However, for a new storage structure it is best to avoid using pointer attributes for any member of the data structure to ensure the automatic memory management by the Fortran compiler can be exploited. Also, the possibility to retain the order of the keys from the input would be a wanted feature.

Release version 0.3.0

Tracking issue for releasing version 0.3.0.

The version bump will happen with #88, due to ABI breakage and slight API changes (no breakage hopefully). However, the release might not be immediate to allow for adjustments and bugfixes.

Tests fail with flang

With the Ubuntu Focal (ubuntu-20.04) image and the flang compiler, toml-f builds successfully. However, the unit tests still fail.

Might be related to the test failure with PGI in #27. Needs further investigation.

Pass valid encoder tests

Currently the following valid encoder tests are failing

  • valid/array/mixed-string-table
  • valid/comment/everywhere
  • valid/datetime/local-date
  • valid/datetime/local-time
  • valid/datetime/local
  • valid/float/inf-and-nan
  • valid/inline-table/nest
  • valid/key/escapes
  • valid/key/special-chars
  • valid/string/escapes
  • valid/string/multiline-quotes

This might be due to a format change with the latest toml-test version. The tests must be checked and the format of the encoder updated.

Support creating machine readable diagnostics

Raised by Giannis @gnikit in https://fortran-lang.discourse.group/t/3949/2

Adding an option or slightly restructuring the diagnostic message so that it is easily parseable via regex. The message structure is already very good, but fetching the error message via regex would be hard. A good example is gfortran 11 or newer with the flag -fdiagnostics-plain-output. Having such an option would allow code editors to parse the output of the linter into their diagnostics console, such as the PROBLEMS tab in VS Code.


The current report interface turns an index of a token to a label object

allocate(labels(1))
labels(1) = toml_label(level_, &
& self%token(origin)%first, self%token(origin)%last, label, .true.)

And creates a diagnostic object from it

diagnostic = toml_diagnostic( &
& level_, &
& message, &
& self%filename, &
& labels)

For the human-facing output this is then turned into a string at

string = render(diagnostic, self%source, color)

Note that the actual source code is only needed when creating the report string, the diagnostic object itself contains only position information from the label objects as well as the messages to display.


To support this without much effort on the user side, we could add a state in the context objects which describes whether the report should be optimized for humans or machines, defaulting to human-friendly output. A tool which wants to integrate its TOML Fortran usage for error reporting with the VS Modern Fortran Extension can provide an option or environment variable to toggle this switch in the context object and make the error output automatically accessible to the Diagnostic console in VS Code.

The actual preferred format for the VS Modern Fortran Extension has to be defined first.

Support unicode escape characters

Currently unicode escape sequences are not supported by this implementation and are the only part missing to fully support all TOML features.

For convenience, some popular characters have a compact escape sequence.

 \b         - backspace       (U+0008)
 \t         - tab             (U+0009)
 \n         - linefeed        (U+000A)
 \f         - form feed       (U+000C)
 \r         - carriage return (U+000D)
 \"         - quote           (U+0022)
 \\         - backslash       (U+005C)
-\uXXXX     - unicode         (U+XXXX)
-\UXXXXXXXX - unicode         (U+XXXXXXXX)

Any Unicode character may be escaped with the \uXXXX or \UXXXXXXXX forms. The escape codes must be valid Unicode scalar values.

See specifications: https://github.com/toml-lang/toml/blob/master/toml.md#string
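To illustrate what supporting these escapes involves, here is a small self-contained sketch that checks whether a code point is a Unicode scalar value (any code point up to U+10FFFF except the surrogates U+D800 to U+DFFF) and encodes it as UTF-8. The module and function names are hypothetical and not part of TOML Fortran; the sketch also assumes the default character kind is single-byte, so that char() can produce byte values above 127.

```fortran
module unicode_sketch
   implicit none
   private
   public :: is_scalar_value, to_utf8

contains

   !> A Unicode scalar value is any code point except the surrogates.
   pure function is_scalar_value(code) result(valid)
      integer, intent(in) :: code
      logical :: valid
      valid = (code >= 0 .and. code <= int(z"D7FF")) &
         & .or. (code >= int(z"E000") .and. code <= int(z"10FFFF"))
   end function is_scalar_value

   !> Encode a scalar value as UTF-8 bytes; returns an empty string if invalid.
   pure function to_utf8(code) result(str)
      integer, intent(in) :: code
      character(len=:), allocatable :: str

      if (.not.is_scalar_value(code)) then
         str = ""
      else if (code < 128) then
         str = char(code)
      else if (code < 2048) then
         str = char(192 + ishft(code, -6)) // char(128 + iand(code, 63))
      else if (code < 65536) then
         str = char(224 + ishft(code, -12)) &
            & // char(128 + iand(ishft(code, -6), 63)) &
            & // char(128 + iand(code, 63))
      else
         str = char(240 + ishft(code, -18)) &
            & // char(128 + iand(ishft(code, -12), 63)) &
            & // char(128 + iand(ishft(code, -6), 63)) &
            & // char(128 + iand(code, 63))
      end if
   end function to_utf8

end module unicode_sketch

program demo
   use unicode_sketch
   implicit none
   print '(l1)', is_scalar_value(int(z"20AC"))   ! euro sign: prints T
   print '(l1)', is_scalar_value(int(z"D800"))   ! surrogate: prints F
   print '(i0)', len(to_utf8(int(z"20AC")))      ! three UTF-8 bytes: prints 3
end program demo
```

A parser would run the validity check on the value parsed from the four or eight hexadecimal digits and report an error for anything that fails it.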

Tutorial on creating an input format

A complete walk-through for building a configuration file for an application, discussing advantages and disadvantages of certain data structures, would also be a good choice for a tutorial. I don't have a particular topic in mind yet; either package management or computational chemistry might be themes I would choose for such a guide.

Relevant points (not all might be covered):

  • naming of tables / keys (lowercase-dashed, snake_case, camelCase)
  • name for array of tables, singular vs. plural
  • nesting tables, reusable input parser
  • versioning the format, automatic updates on breaking changes
  • error reporting for user, error handler vs. full context
  • reporting full input for reproducibility (filling in defaults, version migrations)
  • interacting with other input sources (command line arguments, global configuration file ~/.<program>rc)

Potential examples (suggestions are welcome):

  • atomistic simulation program
    • tasks like single point, geometry optimization, molecular dynamics, frequency calculation, ...
    • post processing and analysis, logging
    • needs an FF, SQM or DFT package as backend to be runnable (maybe tblite?)

Compatibility with stdlib

It would be desirable to have a close integration of TOML Fortran with stdlib and provide a way to directly use stdlib data types like string_type together with most procedures in TOML Fortran. However, due to feature constraints in TOML Fortran, stdlib cannot be a required dependency, since this project has to support GCC 5.3.0 to support Windows on conda-forge at the moment.

Related issues

Evaluate completeness of documentation

I found a post by @emilyviolet on Fortran parsing libraries at https://emilyviolet.github.io/2022/01/29/fortran-parsing.html including TOML Fortran.

The main critique is the documentation (or rather the lack of it), which is unfortunately still quite sparse even with the new readthedocs pages:

  • Terrible documentation
    • Project’s GitHub is both sparse and outdated
    • Need to rely on the automatically generated API docs again
    • Authors also recommend reading the fpm source, since it makes a lot of use of toml-f

toml-f unfortunately has nonexistent documentation, which admittedly is a fairly large downside

Actionable points for improving TOML Fortran's documentation:

  • check the project README
  • create recipes based on the fpm source and add those to the RTD pages

Related issues:

Compatibility of datetime data type

TOML Fortran implements a toml_datetime derived type to represent the date time values in TOML documents. Since the derived type mainly functions to store the value it doesn't provide much functionality on its own.

Potential options to make the date time values more useful are to provide either a richer set of functionality in TOML Fortran or a way to bridge to an existing date time implementation, e.g. a get_value interface to directly create a datetime_module::datetime or m_time_oop::date_time instance. It might be useful to at least provide a recipe in the documentation on how to create a bridge to the respective date time libraries.

Available libraries

Recipes

Related stdlib proposals

Fails to build with PGI

Fails due to a PGI bug with allocatable characters in derived type constructors (NVHPC 20.7 and 20.9).

FAILED: libtoml-f.so.0.2.1.p/src_tomlf_utils_convert.f90.o 
/opt/nvidia/hpc_sdk/Linux_x86_64/20.9/compilers/bin/nvfortran -Ilibtoml-f.so.0.2.1.p -I. -I.. -Minform=inform -O2 -g -Mbackslash -Mallocatable=03 -traceback -fPIC -module libtoml-f.so.0.2.1.p -o libtoml-f.so.0.2.1.p/src_tomlf_utils_convert.f90.o -c ../src/tomlf/utils/convert.f90
NVFORTRAN-F-0155-Empty structure constructor() - type toml_time (../src/tomlf/utils/convert.f90: 170)
NVFORTRAN/x86-64 Linux 20.9-0: compilation aborted

Dual build system support: CMake

toml-f can currently be built with meson and fpm.

To ease integration of toml-f in other projects, the CMake build system should be supported as well. This includes:

  • setup CMake build that matches meson build as close as possible
  • test both CMake and meson build on the CI
  • allow meson to write CMake package files to match the CMake build

Parsing crashes in case of unclosed string in array

With the release included in fpm 0.4 alpha and the development branch as of 2021-10-05, toml-f crashes in case of an unclosed string in an array. This can be reproduced with the following minimal file.

name = "ftn_portaudio"
[build]
link = ['portaudio_x64]

To reproduce: create a file named package-e.toml in .../toml-f/test/example-1 and execute fpm run -- package-e.toml in the same directory. Under gdb it results in:

(gdb) backtrace
#0  0x00007ffff7f41d23 in ?? () from /lib/x86_64-linux-gnu/libgfortran.so.5
#1  0x00007ffff7f43465 in ?? () from /lib/x86_64-linux-gnu/libgfortran.so.5
#2  0x00007ffff7f541e8 in ?? () from /lib/x86_64-linux-gnu/libgfortran.so.5
#3  0x0000555555574826 in tomlf_error::add_context (message=0x555555599790, context=..., _message=0x555556448090) at ./../../src/tomlf/error.f90:227
#4  0x00005555555761c4 in tomlf_error::syntax_error (error=0x555556448080, context=..., message=<error reading variable: frame address is not available.>,
    stat=<error reading variable: Cannot access memory at address 0x0>, _message=20) at ./../../src/tomlf/error.f90:113
#5  0x00005555555665ce in next_token::scan_string (de=..., ptr=0x5555555991c8, dot_is_token=.FALSE., _ptr=0x7fffff7ff800) at ./../../src/tomlf/de/character.f90:211
#6  0x0000555555567bb7 in tomlf_de_character::next_token (de=..., dot_is_token=.FALSE.) at ./../../src/tomlf/de/character.f90:114
#7  0x000055555556894c in tomlf_de_tokenizer::next (de=..., dot_is_token=.FALSE., whitespace_is_precious=<error reading variable: Cannot access memory at address 0x0>)
    at ./../../src/tomlf/de/tokenizer.f90:734
#8  0x000055555556915f in tomlf_de_tokenizer::parse_array (de=..., array=...) at ./../../src/tomlf/de/tokenizer.f90:555
#9  0x0000555555569497 in tomlf_de_tokenizer::parse_array (de=..., array=...) at ./../../src/tomlf/de/tokenizer.f90:578
#10 0x0000555555569497 in tomlf_de_tokenizer::parse_array (de=..., array=...) at ./../../src/tomlf/de/tokenizer.f90:578
#11 0x0000555555569497 in tomlf_de_tokenizer::parse_array (de=..., array=...) at ./../../src/tomlf/de/tokenizer.f90:578
#12 0x0000555555569497 in tomlf_de_tokenizer::parse_array (de=..., array=...) at ./../../src/tomlf/de/tokenizer.f90:578
#13 0x0000555555569497 in tomlf_de_tokenizer::parse_array (de=..., array=...) at ./../../src/tomlf/de/tokenizer.f90:578
#14 0x0000555555569497 in tomlf_de_tokenizer::parse_array (de=..., array=...) at ./../../src/tomlf/de/tokenizer.f90:578
...

followed by an almost infinite list of calls to the tokenizer.

It seems that call de%parse_array(arr) at line 578 calls itself recursively without proper termination in case of an error.

License question

@awvwgk, thanks for creating this package.

We are looking for a package to use as a TOML parser for the Fortran Package Manager, see the issue for it at fortran-lang/fpm#149.

It looks like your package would be exactly what we need. The only issue is that toml-f's GPL license would not allow us to distribute fpm under an MIT license as we currently do.

Would you be open to relicense toml-f to a more permissive license such as BSD, MIT or Apache?

If not, that is totally fine. I just wanted to ask.

Investigate an alternative to FORD

Maybe Doxygen or breathe could be used, however the interface support in the Doxygen generated documentation is somewhat suboptimal. At least the build interface is mainly just two interfaces (get_value and set_value).

Support retrieving entries using a key path

Based on feedback found in a post at https://emilyviolet.github.io/2022/01/29/fortran-parsing.html including TOML Fortran (see #70).

Keys which can query several layers into the data structure are currently not supported, but could internally retrieve/create the subtables for user convenience.

  • figure out a sane way to pass several keys to the getter (derived type or array of toml_key?)

An alternative would be to integrate with the Fortran standard library once maps and lists are available to turn a TOML data structure in a stdlib compatible object. This would probably be implemented in a separate project as TOML Fortran / stdlib bridge, to not encumber the project with additional dependencies and break fpm.

Add new \e shorthand for the escape character.

New escape sequence \e added to the TOML specification.

 \b         - backspace       (U+0008)
 \t         - tab             (U+0009)
 \n         - linefeed        (U+000A)
 \f         - form feed       (U+000C)
 \r         - carriage return (U+000D)
+\e         - escape          (U+001B)
 \"         - quote           (U+0022)
 \\         - backslash       (U+005C)
 \uXXXX     - unicode         (U+XXXX)
 \UXXXXXXXX - unicode         (U+XXXXXXXX)

This is part of an unreleased TOML standard at the moment.

Reference: toml-lang/toml#790
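To make the table above concrete, here is a minimal Python sketch of a decoder for these escape sequences, including the new \e shorthand. The function name `unescape` is hypothetical, and the sketch covers only the sequences listed above (no line-ending backslash for multi-line strings).

```python
SIMPLE_ESCAPES = {
    "b": "\u0008",   # backspace
    "t": "\u0009",   # tab
    "n": "\u000A",   # linefeed
    "f": "\u000C",   # form feed
    "r": "\u000D",   # carriage return
    "e": "\u001B",   # escape (the new shorthand)
    '"': "\u0022",   # quote
    "\\": "\u005C",  # backslash
}
HEX_DIGITS = "0123456789abcdefABCDEF"


def unescape(raw):
    """Replace the escape sequences from the table with their characters."""
    out = []
    i = 0
    while i < len(raw):
        if raw[i] != "\\":
            out.append(raw[i])
            i += 1
            continue
        if i + 1 >= len(raw):
            raise ValueError("dangling backslash at end of string")
        code = raw[i + 1]
        if code in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[code])
            i += 2
        elif code in "uU":
            width = 4 if code == "u" else 8
            digits = raw[i + 2 : i + 2 + width]
            if len(digits) != width or any(d not in HEX_DIGITS for d in digits):
                raise ValueError(f"invalid unicode escape at offset {i}")
            point = int(digits, 16)
            # Surrogates and values beyond U+10FFFF are not scalar values.
            if point > 0x10FFFF or 0xD800 <= point <= 0xDFFF:
                raise ValueError(f"invalid code point at offset {i}")
            out.append(chr(point))
            i += 2 + width
        else:
            raise ValueError(f"unknown escape \\{code} at offset {i}")
    return "".join(out)
```

For example, `unescape(r"\e[1mbold\e[0m")` yields a string that starts with the U+001B escape character.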

Style preserving deserializer / serializer

At the moment the parser creates the data structures directly. Instead, an intermediate representation could be constructed that maps the input to an abstract syntax tree, which is then translated into the data structure and can retain the line information.

This would be especially useful for applications that want to report errors in the input data directly at the offending location in the TOML file, giving the user as much information as possible about the problem.
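A minimal Python sketch of the idea follows, under heavy simplifying assumptions: only single-line `key = value` pairs are scanned, and tables, multi-line values and `=` inside strings are ignored. The names `KeyValue`, `scan` and `report` are hypothetical. The first pass records where each value sits in the source without interpreting it, so a later validation step can point back into the file.

```python
from dataclasses import dataclass


@dataclass
class KeyValue:
    key: str
    raw_value: str
    line: int    # 1-based line number in the source document
    column: int  # 1-based column where the value starts


def scan(source):
    """First pass: collect key/value pairs together with their positions."""
    nodes = []
    for number, text in enumerate(source.splitlines(), start=1):
        stripped = text.strip()
        if not stripped or stripped.startswith("#") or "=" not in text:
            continue
        key, _, value = text.partition("=")
        leading = len(value) - len(value.lstrip())
        column = len(key) + 1 + leading + 1
        nodes.append(KeyValue(key.strip(), value.strip(), number, column))
    return nodes


def report(source, node, message):
    """Format an error message that points at the value inside the source."""
    line = source.splitlines()[node.line - 1]
    marker = " " * (node.column - 1) + "^" * max(len(node.raw_value), 1)
    return (
        f"error: {message}\n"
        f" --> line {node.line}, column {node.column}\n"
        f"  | {line}\n"
        f"  | {marker}"
    )


source = 'name = "demo"\nport = eighty\n'
nodes = scan(source)
print(report(source, nodes[1], "expected an integer"))
```

The second pass, translating the nodes into the final data structure, is omitted here. It would keep a reference to the originating node so that errors found later can still be reported with their position.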

Build fails on FreeBSD

Meson fails on FreeBSD 12:

$  FC=gfortran meson setup build_gcc
The Meson build system
Version: 0.54.3
Source dir: /tml/toml-f
Build dir: /tmp/toml-f/build_gcc
Build type: native build
Project name: toml-f
Project version: 0.2.0
Using 'FC' from environment with value: 'gfortran'
Using 'FC' from environment with value: 'gfortran'
Fortran compiler for the host machine: gfortran (gcc 10.2.0 "GNU Fortran (FreeBSD Ports Collection) 10.2.0")
Fortran linker for the host machine: gfortran ld.bfd 2.33.1
Host machine cpu family: x86_64
Host machine cpu: x86_64
Program go found: NO
Build targets in project: 5

Found ninja-1.10.1 at /usr/local/bin/ninja
$ meson compile -C build           

ERROR: Path to builddir /tmp/toml-f/build does not exist!

Creating the missing directory:

$ mkdir build
$ meson compile -C build

ERROR: Could not find any runner or backend for directory /tmp/toml-f/build

Intel: problem compiling sort.f90 tests

Hello,

I had a problem while compiling toml-f via dftb+, so I’m not sure if the issue belongs here, sorry if it doesn't m(_ _)m
Intel compiler versions 16.0.1 and 18.2.199 (I haven't tested other versions yet) seem to choke on compiling the test/tftest/sort.f90 file, with

error #6580: Name in only-list does not exist.   [TOML_KEY]

followed by many errors resulting from the original one.

Changing line 16 of sort.f90 from:

   use tomlf, only : toml_key, sort

to:

   use tomlf_type, only : toml_key
   use tomlf_utils_sort, only : sort

seems to be a workaround.

I’m not sure if the change is necessary though: it might just be a problem with the supercomputer configuration / installation of the Intel compilers.

Sincerely,

Error using meson system setup

The recommended install (as described in the HowTo) through meson fails with the following error:

The Meson build system
Version: 0.53.2
Source dir: /home/roche/toml-f
Build dir: /home/roche/toml-f/_build
Build type: native build
Project name: toml-f
Project version: 0.2.2
Fortran compiler for the host machine: gfortran (gcc 9.3.0 "GNU Fortran (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0")
Fortran linker for the host machine: gfortran ld.bfd 2.34
Host machine cpu family: x86_64
Host machine cpu: x86_64
Program config/install-mod.py found: YES (/usr/bin/env python /home/roche/toml-f/config/install-mod.py)

meson.build:63:8: ERROR: add_install_script args must be strings

A full log can be found at /home/roche/toml-f/_build/meson-logs/meson-log.txt

The build process works as expected when using CMake.
Best wishes,
dmr
meson-log.txt

[Edit: I attach the full log for reference]

Pass valid decoder tests

Currently a number of validation tests are failing:

  • valid/array/array
  • valid/array/bool
  • valid/array/empty
  • valid/array/hetergeneous
  • valid/array/mixed-int-array
  • valid/array/mixed-int-float
  • valid/array/mixed-int-string
  • valid/array/mixed-string-table
  • valid/array/nested-double
  • valid/array/nested
  • valid/array/nospaces
  • valid/array/string-quote-comma-2
  • valid/array/string-quote-comma
  • valid/array/string-with-comma
  • valid/array/strings
  • valid/comment/tricky
  • valid/comment/everywhere
  • valid/datetime/local-date
  • valid/datetime/local-time
  • valid/datetime/datetime
  • valid/datetime/local
  • valid/example
  • valid/float/inf-and-nan
  • valid/inline-table/nest
  • valid/key/escapes
  • valid/spec-example-1-compact
  • valid/spec-example-1
  • valid/string/escape-tricky
  • valid/string/escapes
  • valid/string/multiline
  • valid/string/multiline-quotes
  • valid/string/unicode-escape

In most cases the expected JSON format in toml-test has changed.

Output for valid/array/array
FAIL valid/array/array
     Malformed output from your encoder: 'value' is not a JSON array: map[string]interface {}

     input sent to parser-cmd:
       ints = [1, 2, 3, ]
       floats = [1.1, 2.1, 3.1]
       strings = ["a", "b", "c"]
       dates = [
         1987-07-05T17:45:00Z,
         1979-05-27T07:32:00Z,
         2006-06-01T11:00:00Z,
       ]
       comments = [
                1,
                2, #this is ok
       ]

     output from parser-cmd (stdout):
       {
         "ints": {"type": "array", "value":
           [
             {"type": "integer", "value": "1"},
             {"type": "integer", "value": "2"},
             {"type": "integer", "value": "3"}
           ]
         },
         "floats": {"type": "array", "value":
           [
             {"type": "float", "value": "1.1000000000000001"},
             {"type": "float", "value": "2.1000000000000001"},
             {"type": "float", "value": "3.1000000000000001"}
           ]
         },
         "strings": {"type": "array", "value":
           [
             {"type": "string", "value": "a"},
             {"type": "string", "value": "b"},
             {"type": "string", "value": "c"}
           ]
         },
         "dates": {"type": "array", "value":
           [
             {"type": "datetime", "value": "1987-07-05T17:45:00Z"},
             {"type": "datetime", "value": "1979-05-27T07:32:00Z"},
             {"type": "datetime", "value": "2006-06-01T11:00:00Z"}
           ]
         },
         "comments": {"type": "array", "value":
           [
             {"type": "integer", "value": "1"},
             {"type": "integer", "value": "2"}
           ]
         }
       }

     want:
       {
         "comments": [
           {
             "type": "integer",
             "value": "1"
           },
           {
             "type": "integer",
             "value": "2"
           }
         ],
         "dates": [
           {
             "type": "datetime",
             "value": "1987-07-05T17:45:00Z"
           },
           {
             "type": "datetime",
             "value": "1979-05-27T07:32:00Z"
           },
           {
             "type": "datetime",
             "value": "2006-06-01T11:00:00Z"
           }
         ],
         "floats": [
           {
             "type": "float",
             "value": "1.1"
           },
           {
             "type": "float",
             "value": "2.1"
           },
           {
             "type": "float",
             "value": "3.1"
           }
         ],
         "ints": [
           {
             "type": "integer",
             "value": "1"
           },
           {
             "type": "integer",
             "value": "2"
           },
           {
             "type": "integer",
             "value": "3"
           }
         ],
         "strings": [
           {
             "type": "string",
             "value": "a"
           },
           {
             "type": "string",
             "value": "b"
           },
           {
             "type": "string",
             "value": "c"
           }
         ]
       }
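Comparing the two dumps above, the parser output wraps every array as {"type": "array", "value": [...]}, while the expected output uses plain JSON arrays. The floats are also printed with 17 significant digits instead of their shortest form. The following minimal Python sketch converts the first shape into the second. The function name `to_new_format` is hypothetical, and whether the float spelling matters to toml-test is an assumption, since the issue does not say so.

```python
import json


def to_new_format(node):
    """Rewrite tagged JSON so that arrays become plain JSON lists."""
    if isinstance(node, list):
        return [to_new_format(item) for item in node]
    if isinstance(node, dict):
        # Tagged values have exactly the keys "type" and "value", with a
        # string tag; a table that merely uses those key names maps them
        # to nested objects instead.
        if set(node) == {"type", "value"} and isinstance(node["type"], str):
            if node["type"] == "array":
                return to_new_format(node["value"])
            if node["type"] == "float":
                # repr() gives the shortest string that round-trips.
                return {"type": "float", "value": repr(float(node["value"]))}
            return node
        return {key: to_new_format(value) for key, value in node.items()}
    return node


old = json.loads(
    '{"floats": {"type": "array", "value":'
    ' [{"type": "float", "value": "1.1000000000000001"}]}}'
)
print(json.dumps(to_new_format(old)))
```

In the actual project the change would belong in the JSON serializer used for the validation suite, not in a post-processing step; the sketch only pins down the difference between the two shapes.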

Encoder testing

Currently only the decoding is tested with a validation suite. Testing the encoding is important to verify the serialization procedures work correctly.

This requires parsing JSON (objects, arrays and strings only), which could be done either with JSON-Fortran or with a minimal self-implemented parser.
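To give a sense of how small such a parser can be, here is a minimal sketch in Python of a recursive-descent parser restricted to objects, arrays and strings. All names are hypothetical, numbers and literals are rejected on purpose, and surrogate pairs in \u escapes are not combined. A Fortran version would follow the same structure.

```python
ESCAPES = {'"': '"', "\\": "\\", "/": "/", "b": "\b",
           "f": "\f", "n": "\n", "r": "\r", "t": "\t"}
HEX = "0123456789abcdefABCDEF"


class JsonError(ValueError):
    pass


def parse_json(text):
    value, pos = _value(text, _skip(text, 0))
    if _skip(text, pos) != len(text):
        raise JsonError(f"trailing data at offset {pos}")
    return value


def _skip(text, pos):
    while pos < len(text) and text[pos] in " \t\r\n":
        pos += 1
    return pos


def _expect(text, pos, char):
    if pos >= len(text) or text[pos] != char:
        raise JsonError(f"expected {char!r} at offset {pos}")
    return pos + 1


def _value(text, pos):
    opener = text[pos] if pos < len(text) else ""
    if opener == "{":
        return _container(text, pos + 1, "}")
    if opener == "[":
        return _container(text, pos + 1, "]")
    if opener == '"':
        return _string(text, pos + 1)
    raise JsonError(f"unsupported value at offset {pos}")


def _container(text, pos, closer):
    """Parse the members of an object ('}') or an array (']')."""
    result = {} if closer == "}" else []
    pos = _skip(text, pos)
    if pos < len(text) and text[pos] == closer:
        return result, pos + 1
    while True:
        pos = _skip(text, pos)
        if closer == "}":
            key, pos = _string(text, _expect(text, pos, '"'))
            pos = _skip(text, _expect(text, _skip(text, pos), ":"))
            result[key], pos = _value(text, pos)
        else:
            item, pos = _value(text, pos)
            result.append(item)
        pos = _skip(text, pos)
        if pos < len(text) and text[pos] == ",":
            pos += 1
            continue
        return result, _expect(text, pos, closer)


def _string(text, pos):
    out = []
    while pos < len(text):
        ch = text[pos]
        if ch == '"':
            return "".join(out), pos + 1
        if ch == "\\":
            esc = text[pos + 1 : pos + 2]
            digits = text[pos + 2 : pos + 6]
            if esc == "u" and len(digits) == 4 and all(d in HEX for d in digits):
                out.append(chr(int(digits, 16)))
                pos += 6
            elif esc and esc in ESCAPES:
                out.append(ESCAPES[esc])
                pos += 2
            else:
                raise JsonError(f"invalid escape at offset {pos}")
        else:
            out.append(ch)
            pos += 1
    raise JsonError("unterminated string")
```

Restricting the grammar to these three value kinds is enough for the tagged format used by the validation suite, where every scalar is carried as a string.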

Reliable display of colored output in docs

Looks like the ANSI color output block in the docs is currently broken. We should find an option to reliably display the color output produced by the context report. Currently a custom ansi-block directive is used, which seems to not be recognized by the RTD build and just ends up being a default code-block. An alternative might be to create a custom pygments highlighter.
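Whichever route is taken, the directive or highlighter has to split the report into runs of text and the styles active for each run. Below is a minimal sketch of that step in plain Python, using only the standard library. It does not use the Pygments API, the name `split_ansi` is hypothetical, and extended colors such as 38;5;n are simply kept as individual codes.

```python
import re

# Matches SGR ("select graphic rendition") sequences such as ESC[1;31m.
SGR = re.compile(r"\x1b\[([0-9;]*)m")


def split_ansi(text):
    """Return a list of (active_codes, chunk) pairs for an ANSI-colored string."""
    segments = []
    active = ()
    last = 0
    for match in SGR.finditer(text):
        if match.start() > last:
            segments.append((active, text[last:match.start()]))
        for param in match.group(1).split(";"):
            code = int(param) if param else 0
            # Code 0 resets all attributes; other codes accumulate.
            active = () if code == 0 else active + (code,)
        last = match.end()
    if last < len(text):
        segments.append((active, text[last:]))
    return segments


print(split_ansi("\x1b[1;31merror\x1b[0m: invalid key"))
```

Each pair could then be mapped to a token type in a highlighter or to a styled span in the rendered HTML.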

Previous art
