bluebrain / highfive

HighFive - Header-only C++ HDF5 interface

Home Page: https://bluebrain.github.io/HighFive/

License: Boost Software License 1.0


highfive's Introduction

Note: In preparation of v3 of HighFive, we've started merging breaking changes into the main branch. More information and opportunity to comment can be found at: #864

HighFive - HDF5 header-only C++ Library


Documentation: https://bluebrain.github.io/HighFive/

Brief

HighFive is a modern, header-only, C++14-friendly interface for libhdf5.

HighFive supports STL vector/string, Boost::uBLAS, Boost::MultiArray, Eigen, and Xtensor. It handles the mapping between C++ types and HDF5 automatically. HighFive does not require additional libraries (see dependencies).

It integrates nicely with other CMake projects by defining (and exporting) a HighFive target.

Design

  • Simple, C++-ish, minimalist interface
  • No dependency other than libhdf5
  • Zero overhead
  • Supports C++14

Feature support

  • create/read/write files, datasets, attributes, groups, dataspaces
  • automatic memory management / reference counting
  • automatic conversion of std::vector and nested std::vector from/to any dataset with basic types
  • automatic conversion of std::string to/from variable-length string datasets
  • selection() / slice support
  • parallel read/write operations from several nodes with Parallel HDF5
  • advanced types: Compound, Enum, arrays of fixed-length strings, References
  • half-precision (16-bit) floating-point datasets
  • std::byte in C++17 mode (with -DCMAKE_CXX_STANDARD=17 or higher)
  • etc. (see ChangeLog)

Dependencies

  • HDF5 or pHDF5, including headers
  • boost >= 1.41 (recommended)
  • eigen3 (optional)
  • xtensor (optional)
  • half (optional)

Known flaws

  • HighFive is not thread-safe. At best it has the same limitations as the HDF5 library itself. However, HighFive objects modify their members without protecting these writes. Users have reported that HighFive is not thread-safe even when using a thread-safe build of the HDF5 library, e.g., #675.
  • Eigen support in core HighFive was broken until v3.0. See #532. H5Easy was not affected.
  • Support for fixed-length strings isn't ideal.

Examples

Write a std::vector to 1D HDF5 dataset and read it back

#include <highfive/highfive.hpp>

using namespace HighFive;

std::string filename = "/tmp/new_file.h5";

{
    // We create an empty HDF5 file, by truncating an existing
    // file if required:
    File file(filename, File::Truncate);

    std::vector<int> data(50, 1);
    file.createDataSet("grp/data", data);
}

{
    // We open the file as read-only:
    File file(filename, File::ReadOnly);
    auto dataset = file.getDataSet("grp/data");

    // Read back, with allocating:
    auto data = dataset.read<std::vector<int>>();

    // Because `data` has the correct size, this will
    // not cause `data` to be reallocated:
    dataset.read(data);
}

Note: As of 2.8.0, one can use highfive/highfive.hpp to include all of HighFive. Prior to 2.8.0, one would include highfive/H5File.hpp.

Note: For advanced use cases, the dataset can be created without immediately writing to it. This is common in MPI-IO related patterns, or when growing a dataset over the course of a simulation.

Write a 2-dimensional C array of doubles to a 2D HDF5 dataset

See create_dataset_double.cpp

Write and read a boost::ublas matrix of doubles to/from a 2D HDF5 dataset

See boost_ublas_double.cpp

Write and read a subset of a 2D double dataset

See select_partial_dataset_cpp11.cpp

Create, write and list HDF5 attributes

See create_attribute_string_integer.cpp

And others

See src/examples/ subdirectory for more info.

H5Easy

For several 'standard' use cases the highfive/H5Easy.hpp interface is available. It allows:

  • Reading/writing in a single line.

  • Getting, in a single line:

    • the size of a DataSet,
    • the shape of a DataSet.

Example

#include <highfive/H5Easy.hpp>

int main() {
    H5Easy::File file("example.h5", H5Easy::File::Overwrite);

    int A = ...;
    H5Easy::dump(file, "/path/to/A", A);

    A = H5Easy::load<int>(file, "/path/to/A");
}

where the int type of this example can be replaced by any of the supported types. See easy_load_dump.cpp for more details.

Note: Classes such as H5Easy::File are just short for the regular HighFive classes (in this case HighFive::File). They can thus be used interchangeably.

CMake integration

There are two common paths for integrating HighFive into a CMake-based project. The first is to "vendor" HighFive; the second is to install HighFive as a normal C++ library. Since HighFive makes choices about how to integrate HDF5, a third, "Bailout Approach" is sometimes needed.

Regular HDF5 CMake variables can be used. Interesting variables include:

  • HDF5_USE_STATIC_LIBRARIES to link statically against the HDF5 library.
  • HDF5_PREFER_PARALLEL to prefer pHDF5.
  • HDF5_IS_PARALLEL to check if HDF5 is parallel.

Please consult tests/cmake_integration for examples of how to write libraries or applications using HighFive.

Vendoring HighFive

In this approach the HighFive sources are included in a subdirectory of the project (typically as a git submodule), for example in third_party/HighFive.

The project's CMakeLists.txt adds the following lines:

add_subdirectory(third_party/HighFive)
target_link_libraries(foo HighFive)

Note: add_subdirectory(third_party/HighFive) will search for and "link" HDF5, but won't search for or link any optional dependencies such as Boost.
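For orientation, a minimal top-level CMakeLists.txt for the vendoring approach might look like the following sketch (the project name, target name, and source file are placeholders):

```cmake
cmake_minimum_required(VERSION 3.10)
project(foo_project CXX)

# HighFive vendored as a git submodule under third_party/;
# this also searches for and "links" HDF5:
add_subdirectory(third_party/HighFive)

add_executable(foo main.cpp)
target_link_libraries(foo HighFive)
```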

Regular Installation of HighFive

Alternatively, HighFive can be installed and "found" like regular software.

The project's CMakeLists.txt should add the following:

find_package(HighFive REQUIRED)
target_link_libraries(foo HighFive)

Note: find_package(HighFive) will search for HDF5. "Linking" to HighFive includes linking with HDF5. The two commands will not search for or "link" to optional dependencies such as Boost.

Bailout Approach

To prevent HighFive from searching or "linking" to HDF5 the project's CMakeLists.txt should contain the following:

# Prevent HighFive CMake code from searching for HDF5:
set(HIGHFIVE_FIND_HDF5 Off)

# Then "find" HighFive as usual:
find_package(HighFive REQUIRED)
# alternatively, when vendoring:
# add_subdirectory(third_party/HighFive)

# Finally, use the target `HighFive::Include` which
# doesn't add a dependency on HDF5.
target_link_libraries(foo HighFive::Include)

# Proceed to find and link HDF5 as required.

Optional Dependencies

HighFive does not attempt to find or "link" to any optional dependencies, such as Boost, Eigen, etc. Any project using HighFive with any of the optional dependencies must include the respective header:

#include <highfive/boost.hpp>
#include <highfive/eigen.hpp>

and add the required CMake code to find and link against the dependencies. For Boost the required lines might be

find_package(Boost REQUIRED)
target_link_libraries(foo PUBLIC Boost::headers)

Questions?

Do you have questions on how to use HighFive? Would you like to share an interesting example or discuss HighFive features? Head over to the Discussions forum and join the community.

For bugs and issues please use Issues.

Funding & Acknowledgment

The development of this software was supported by funding to the Blue Brain Project, a research center of the École polytechnique fédérale de Lausanne (EPFL), from the Swiss government's ETH Board of the Swiss Federal Institutes of Technology.

HighFive releases are uploaded to Zenodo. If you wish to cite HighFive in a scientific publication you can use the DOIs for the Zenodo records.

Copyright © 2015-2022 Blue Brain Project/EPFL

License

Boost Software License 1.0

highfive's People

Contributors

1uc, acdemiralp, adevress, alexsavulescu, alkino, contre, emmenlau, ferdonline, github-actions[bot], henryiii, jawsnl, jrs65, kerim371, matz-e, mivade, nritsche, philipdeegan, plusangel, quark-x10, sergiorg-hpc, ssbotelh, tatatupi, tdegeus, tklauser, tristan0x, tuxu, tvandera, unbtorsten, weinaji, wolfv


highfive's Issues

Add the Block parameter to dataset.selection

Is it possible to add the block parameter to the dataset selection function (or is there a better practice in HighFive)?

The current code has a different-sized block depending on whether the current processor is on the 'edge' of the 3D space. For example:

const auto blockX = mpi.rankX() > 0 ? nx - 1 : nx;
const auto blockY = mpi.rankY() > 0 ? ny - 1 : ny;
const auto blockZ = mpi.rankZ() > 0 ? nz - 1 : nz;

const std::vector<hsize_t> block = { blockX, blockY, blockZ };

auto dataset = file.createDataSet<double>(dimensionName, fileDataspace);
auto selection = dataset.select(offset, count, stride, block);
selection.write(output);

Rename 1.5 release file?

I notice that the release tarballs have names such as archive/v1.2.tar.gz, except for version 1.5 which is missing the letter v. Is this intentional? Do you want to change the name of the tarball, possibly by changing the name of the branch or tag?

Add support for Compound DataType to HighFive

Compound datatypes in HDF5 are an efficient way to serialize heterogeneous structures to an HDF5 dataset.

The C API for compound datatypes relies extensively on creating a special type object containing the offsets of the different data members to map. This needs to be simplified in the C++ binding.

Critical bug in broadcasting

I've been getting incomprehensible segfaults after converting to HighFive from the H5 C++ bindings, and I've traced it down to the following reproducible bug (I think/hope):

If you have a datatype stored as a column vector or row vector, HighFive should correctly "broadcast" any size-one dimensions away, so both [3,1] and [1,3] should be readable into a vector producing 3 elements. However, this does not work for [1,3] (which, with my bad luck, is how the old bindings wrote vectors): it silently reads in 1 value, since checkDimensions is correct but data_converter is not. Here's the test case that will fail currently:

// Broadcasting is supported
BOOST_AUTO_TEST_CASE(ReadInBroadcastDims) {

    const std::string FILE_NAME("h5_missmatch1_dset.h5");
    const std::string DATASET_NAME("dset");

    // Create a new file using the default property lists.
    File file(FILE_NAME, File::ReadWrite | File::Create | File::Truncate);

    // Create the data space for the dataset.
    std::vector<size_t> dims_a{1,3};
    std::vector<size_t> dims_b{3,1};

    // 1D input / output vectors
    std::vector<double> some_data{5.0, 6.0, 7.0};
    std::vector<double> data_a;
    std::vector<double> data_b;

    DataSpace dataspace_a(dims_a);
    DataSpace dataspace_b(dims_b);

    // Create a dataset with double precision floating points
    DataSet dataset_a = file.createDataSet(DATASET_NAME + "_a", dataspace_a, AtomicType<double>());
    DataSet dataset_b = file.createDataSet(DATASET_NAME + "_b", dataspace_b, AtomicType<double>());

    dataset_a.write(some_data);
    dataset_b.write(some_data);

    DataSet out_a = file.getDataSet(DATASET_NAME + "_a");
    DataSet out_b = file.getDataSet(DATASET_NAME + "_b");

    out_a.read(data_a);
    out_b.read(data_b);

    BOOST_CHECK_EQUAL_COLLECTIONS(
            data_a.begin(), data_a.end(),
            some_data.begin(), some_data.end());

    BOOST_CHECK_EQUAL_COLLECTIONS(
            data_b.begin(), data_b.end(),
            some_data.begin(), some_data.end());
}

Support for fixed-length strings

Hi,

I am trying to read a dataset with this type:

DATATYPE H5T_STRING {
  STRSIZE 50;
  STRPAD H5T_STR_NULLPAD;
  CSET H5T_CSET_ASCII;
  CTYPE H5T_C_S1;
}

Glancing at AtomicType<std::string>::AtomicType() it seems that only variable-length strings are expected.

Is this something that can be supported in the future?

warning: use of old-style cast to ‘hsize_t*’

I get this warning when I compile an application with gcc (GCC) 8.2.1 20180831:

H5Slice_traits_misc.hpp: In member function ‘HighFive::Selection HighFive::SliceTraits<Derivate>::select(const HighFive::ElementSet&) const’:
include/highfive/bits/H5Slice_traits_misc.hpp:129:46: warning: use of old-style cast to ‘hsize_t*’ {aka ‘long long unsigned int*’} [-Wold-style-cast]
         data = (hsize_t*)(&(elements._ids[0]));

Add support for Enums

It would be helpful to have first class support for Enums in HighFive. Currently I have to use a workaround like this to add them:

namespace HighFive
{
class UserType : public DataType
{
public:
    UserType(hid_t hid) { _hid = hid; }
};
}
enum Sorting
{
        none = 0,
        by_gid = 1,
        by_time = 2
};
auto enumValue = Sorting::none;
H5Tenum_insert(sortingEnum.getId(), "none", &enumValue);
enumValue = Sorting::by_gid;
H5Tenum_insert(sortingEnum.getId(), "by_gid", &enumValue);
enumValue = Sorting::by_time;
H5Tenum_insert(sortingEnum.getId(), "by_time", &enumValue);

auto sortingAttr =
            group.createAttribute(std::string("sorting"),
                                  HighFive::DataSpace::From(sortingEnum),
                                  sortingEnum);

H5Awrite(sortingAttr.getId(), sortingEnum.getId(), &enumValue);

Enabling Chunking

Hello,

Firstly I'd like to thank you for the HighFive library, it's been mostly a pleasure to work with.

However, I'm currently converting some older parallel HDF5 Fortran code and have run into the following problem:

call h5pcreate_f(H5P_DATASET_CREATE_F, chunkingPropertyID, result)
call h5pset_chunk_f(chunkingPropertyID, numberOfDims, chunkDimensions, result)

call h5dcreate_f(groupID, datasetName, H5T_IEEE_F64LE, fileID, datasetID, result, dcpl_id = chunkingPropertyID)

Normally it's possible to apply such properties after creation by using the getId() function. However, it looks like HDF5 only accepts chunking as a parameter at dataset creation time (you can't set it later).

Is there another way to accomplish this?

Collective read

Hi,
I am happily using HighFive at NERSC to read HDF5 files. Currently, I am encountering a
performance bottleneck with parallel read and was wondering if it is possible to configure
HighFive to do collective read.

Thank you very much,
Holger

Write to DataSet from a pointer to a multidimensional array

I have some data that is on a 3D grid with dimensions [n, n, n], but I only have the pointer to this data array. If I try to store it as a [n, n, n] DataSet, I get the following error:

  terminate called after throwing an instance of 'HighFive::DataSpaceException'
  what():  Impossible to write buffer of dimensions 1 into dataset of dimensions 3

Also wrapping the array using a boost::multi_array_ref doesn't work, as DataSet::write only accepts a pure boost::multi_array, and I get linking errors using the ref array.

It would be nice to add this functionality to this really nice library! (Or maybe I just missed something...)

Parallel HDF5 support

README.md says that HighFive supports parallel HDF5. From looking at the HDF5 parallel tutorial, I think HDF5 needs some information from MPI (for example, a communicator) for parallel IO. But I see none of this in HighFive's code. Am I misunderstanding what you mean by supporting parallel HDF5? Or am I missing something in the code?

Dataset read error with 3D vector

When I compile this code I get the error below, but if I comment out the dataset1.read(data); line it compiles successfully and shows that label.size() is right. How can I solve this problem?

SOURCE CODE

std::vector<std::vector<std::vector<float>>> data;
std::vector<float> label;
std::string filename = "/path/to/data/h5_data/110020171219_h5/0.h5";
File file(filename, File::ReadOnly);
DataSet dataset1 = file.getDataSet("data");
dataset1.read(data);
DataSet dataset2 = file.getDataSet("label");
dataset2.read(label);
std::cout << label.size() << std::endl;

ERROR

In file included from /path/to/CLionProjects/Create_hdf5/src/highfive/bits/../bits/H5Attribute_misc.hpp:29:0,
from /path/to/CLionProjects/Create_hdf5/src/highfive/bits/../H5Attribute.hpp:73,
from /path/to/CLionProjects/Create_hdf5/src/highfive/bits/H5Annotate_traits_misc.hpp:18,
from /path/to/CLionProjects/Create_hdf5/src/highfive/bits/H5Annotate_traits.hpp:74,
from /path/to/CLionProjects/Create_hdf5/src/highfive/H5File.hpp:17,
from/path/to/CLionProjects/Create_hdf5/src/main.cpp:7:
/path/to/CLionProjects/Create_hdf5/src/highfive/bits/../bits/H5Converter_misc.hpp: In instantiation of ‘typename std::vector<_RealType>::iterator HighFive::details::single_buffer_to_vectors(typename std::vector<_RealType>::iterator, typename std::vector<_RealType>::iterator, const std::vector&, std::size_t, std::vector&) [with T = float; U = std::vector<std::vector >; typename std::vector<_RealType>::iterator = __gnu_cxx::__normal_iterator<float*, std::vector >; std::size_t = long unsigned int]’:
/path/to/CLionProjects/Create_hdf5/src/highfive/bits/../bits/H5Converter_misc.hpp:272:64: required from ‘void HighFive::details::data_converter<std::vector<_RealType>, typename HighFive::details::enable_if<HighFive::details::is_container< >::value>::type>::process_result(std::vector<_RealType>&) [with T = std::vector<std::vector >; typename HighFive::details::enable_if<HighFive::details::is_container< >::value>::type = void]’
/path/to/CLionProjects/Create_hdf5/src/highfive/bits/../bits/H5Slice_traits_misc.hpp:184:5: required from ‘void HighFive::SliceTraits::read(T&) const [with T = std::vector<std::vector<std::vector > >; Derivate = HighFive::Selection]’
/path/to/CLionProjects/Create_hdf5/src/main.cpp:113:61: required from here
/path/to/CLionProjects/Create_hdf5/src/highfive/bits/../bits/H5Converter_misc.hpp:105:69: error: no matching function for call to ‘single_buffer_to_vectors(std::vector::iterator&, std::vector::iterator&, const std::vector&, std::size_t, std::vector<std::vector >&)’
current_dim + 1, it);
^
/path/to/CLionProjects/Create_hdf5/src/highfive/bits/../bits/H5Converter_misc.hpp:105:69: note: candidates are:
/path/to/CLionProjects/Create_hdf5/src/highfive/bits/../bits/H5Converter_misc.hpp:81:1: note: typename std::vector<_RealType>::iterator HighFive::details::single_buffer_to_vectors(typename std::vector<_RealType>::iterator, typename std::vector<_RealType>::iterator, const std::vector&, std::size_t, std::vector<_RealType>&) [with T = std::vector; typename std::vector<_RealType>::iterator = __gnu_cxx::__normal_iterator<std::vector
, std::vector<std::vector > >; std::size_t = long unsigned int]
single_buffer_to_vectors(typename std::vector::iterator begin_buffer,
^
/path/to/CLionProjects/Create_hdf5/src/highfive/bits/../bits/H5Converter_misc.hpp:81:1: note: no known conversion for argument 1 from ‘std::vector::iterator {aka __gnu_cxx::__normal_iterator<float*, std::vector >}’ to ‘std::vector<std::vector >::iterator {aka __gnu_cxx::__normal_iterator<std::vector*, std::vector<std::vector > >}’
/path/to/CLionProjects/Create_hdf5/src/highfive/bits/../bits/H5Converter_misc.hpp:94:1: note: template<class T, class U> typename std::vector<_RealType>::iterator HighFive::details::single_buffer_to_vectors(typename std::vector<_RealType>::iterator, typename std::vector<_RealType>::iterator, const std::vector&, std::size_t, std::vector&)
single_buffer_to_vectors(typename std::vector::iterator begin_buffer,
^
/path/to/CLionProjects/Create_hdf5/src/highfive/bits/../bits/H5Converter_misc.hpp:94:1: note: template argument deduction/substitution failed:
/path/to/CLionProjects/Create_hdf5/src/highfive/bits/../bits/H5Converter_misc.hpp:105:69: note: couldn't deduce template parameter ‘T’
current_dim + 1, *it);
^

Compilation Issue with VS2017-15.7.0 Preview 1.0

When compiling both version 1.5 and master with the latest visual studio through cmake, I get the following error.

error C2668: 'HighFive::details::vectors_to_single_buffer': ambiguous call to overloaded function

Overload at position 49 vs Overload at 61
(std::vector<typename type_of_array::type>& buffer vs std::vector& buffer)

h5converter_misc.hpp(61): note: could be 'void HighFive::details::vectors_to_single_buffer<double>(const std::vector<double,std::allocator<_Ty>> &,const std
         ::vector<size_t,std::allocator<unsigned __int64>> &,size_t,std::vector<_Ty,std::allocator<_Ty>> &)'
                 with
                 [
                     _Ty=double
                 ]
h5converter_misc.hpp(49): note: or       'void HighFive::details::vectors_to_single_buffer<double>(const std::vector<double,std::allocator<_Ty>> &,const std
         ::vector<size_t,std::allocator<unsigned __int64>> &,size_t,std::vector<_Ty,std::allocator<_Ty>> &)'
                 with
                 [
                     _Ty=double
                 ]

Compilation Stack (master):

h5converter_misc.hpp(69)
h5converter_misc.hpp(262)
h5converter_misc.hpp(260)
h5slice_traits_misc.hpp(232)
h5slice_traits_misc.hpp(230)
select_partial_dataset_cpp11.cpp(37)

Continuous integration should cover additional environments

HighFive should be tested against several versions of gcc and clang, and on both Linux and macOS. There are open issues with MSVC, so this compiler should also be tested.

See the spdlog repository, which uses TravisCI for Linux/macOS and AppVeyor for Windows. The two YAML configs can almost be copy/pasted; we just have to add the libhdf5 installation.

Listing and H5Gget_objname_by_idx complexity

HighFive currently uses H5Gget_objname_by_idx for name listing.

H5Gget_objname_by_idx has a (non-documented) complexity of O(n) in the size of the group. This is an issue when listing a very large group, where the full listing takes O(n²) in the number of elements.

This needs to be switched to H5Literate or similar.

Compilation error when only including H5Group.hpp

I am using the master branch of HighFive and compiling with gcc 5.4.0.

I create a file main.cpp with these contents:

// Compiler error:
#include <highfive/H5Group.hpp>

// Works:
//#include <highfive/H5File.hpp>

int main()
{
	HighFive::Group group;
}

I then compile with this command:

g++ -std=c++11 -o program -I/home/jkarlsso/Work/HighFive/include -I/usr/include/hdf5/serial main.cpp -lhdf5_serial

I then get the following error:

In file included from /home/jkarlsso/Work/HighFive/include/highfive/bits/H5Node_traits.hpp:99:0,
                 from /home/jkarlsso/Work/HighFive/include/highfive/H5Group.hpp:14,
                 from main.cpp:2:
/home/jkarlsso/Work/HighFive/include/highfive/bits/H5Node_traits_misc.hpp: In member function ‘HighFive::Group HighFive::NodeTraits<Derivate>::createGroup(const string&)’:
/home/jkarlsso/Work/HighFive/include/highfive/bits/H5Node_traits_misc.hpp:72:77: error: return type ‘class HighFive::Group’ is incomplete
 inline Group NodeTraits<Derivate>::createGroup(const std::string& group_name) {
                                                                             ^
/home/jkarlsso/Work/HighFive/include/highfive/bits/H5Node_traits_misc.hpp:74:15: error: invalid use of incomplete type ‘class HighFive::Group’
     if ((group._hid = H5Gcreate2(static_cast<Derivate*>(this)->getId(),
               ^
In file included from /home/jkarlsso/Work/HighFive/include/highfive/H5Group.hpp:13:0,
                 from main.cpp:2:
/home/jkarlsso/Work/HighFive/include/highfive/bits/H5Annotate_traits.hpp:18:7: note: forward declaration of ‘class HighFive::Group’
 class Group;
       ^
In file included from /home/jkarlsso/Work/HighFive/include/highfive/bits/H5Node_traits.hpp:99:0,
                 from /home/jkarlsso/Work/HighFive/include/highfive/H5Group.hpp:14,
                 from main.cpp:2:
/home/jkarlsso/Work/HighFive/include/highfive/bits/H5Node_traits_misc.hpp: In member function ‘HighFive::Group HighFive::NodeTraits<Derivate>::getGroup(const string&) const’:
/home/jkarlsso/Work/HighFive/include/highfive/bits/H5Node_traits_misc.hpp:85:63: error: return type ‘class HighFive::Group’ is incomplete
 NodeTraits<Derivate>::getGroup(const std::string& group_name) const {
                                                               ^
/home/jkarlsso/Work/HighFive/include/highfive/bits/H5Node_traits_misc.hpp:87:15: error: invalid use of incomplete type ‘class HighFive::Group’
     if ((group._hid = H5Gopen2(static_cast<const Derivate*>(this)->getId(),
               ^
In file included from /home/jkarlsso/Work/HighFive/include/highfive/H5Group.hpp:13:0,
                 from main.cpp:2:
/home/jkarlsso/Work/HighFive/include/highfive/bits/H5Annotate_traits.hpp:18:7: note: forward declaration of ‘class HighFive::Group’
 class Group;
       ^

If I instead include "H5File.hpp" it compiles fine.

Slicing: selecting all elements along one dimension

Hi !

I'm trying to read an hdf5 file that has a dataset with shape (N,3), but I'd like to select some elements only along direction N. In this example, I'm trying to read four elements out of a 100-element array, but the main goal is to downsample very large hdf5 files. This is what I have tried:

#include <functional>
#include <iostream>
#include <string>
#include <vector>
#include <set>
 
 
#include <highfive/H5File.hpp>
#include <highfive/H5DataSet.hpp>
#include <highfive/H5DataSpace.hpp> 

const std::string FILE_NAME("data.h5");
const std::string DATASET_NAME("Velocities");
 
int main(void) {
    using namespace HighFive;

    File file(FILE_NAME, File::ReadWrite);
    DataSet dataset = file.getDataSet(DATASET_NAME);

    std::vector<std::vector<double>> result;

    std::cout << FILE_NAME << "\n";

    srand(time(NULL));
    std::set<long unsigned int> indices;
    while (indices.size() < 4) {
        indices.insert(rand() % 99);
    }

    const std::vector<long unsigned int> ind(indices.begin(), indices.end());

    dataset.select(ind, {0, 1, 2}).read(result);

    return 0;
}

And this is the error I get:

  what():  Impossible to read DataSet of dimensions 3 into arrays of dimensions 2
[1]    15113 abort (core dumped)  ./read

Any idea how I could read it? Moreover, I'd like to parallelise the reading in the future; however, I couldn't find any example (there's only an example on how to write files in parallel). Could someone please clarify how to do it?

Very short syntax

If a user just wants to be quick and accept all the defaults, a very nice syntax could be:

File file(FILE_NAME, File::ReadWrite | File::Create | File::Truncate);

std::vector<int> data(size_dataset);
// ...

file.write(DATASET_NAME, data);

The data-type would thereby be derived from the std::vector. Similarly one could

File file(FILE_NAME, File::ReadOnly);

std::vector<int> read_data = file.read<std::vector<int>>(DATASET_NAME);

which would shorten an example like examples/read_write_vector_dataset.cpp.

One could refine the syntax, for example

std::vector<int> read_data = file.readVector<int>(DATASET_NAME);

This would allow simpler implementation.

exist not recursive

The following

#include <highfive/H5File.hpp>

int main()
{
    HighFive::File file("test.h5", HighFive::File::Overwrite);
    file.exist("/path/to/group");

    return 0;
}

throws instead of returning false. The output:

HDF5-DIAG: Error detected in HDF5 (1.10.4) thread 0:
  #000: H5L.c line 815 in H5Lexists(): unable to get link info
    major: Links
    minor: Can't get value
  #001: H5L.c line 3095 in H5L__exists(): path doesn't exist
    major: Links
    minor: Object already exists
  #002: H5Gtraverse.c line 851 in H5G_traverse(): internal path traversal failed
    major: Symbol table
    minor: Object not found
  #003: H5Gtraverse.c line 741 in H5G__traverse_real(): component not found
    major: Symbol table
    minor: Object not found
libc++abi.dylib: terminating with uncaught exception of type HighFive::GroupException: Invalid link for exist()  (Symbol table) Object not found

Attribute assignment bug

The following code renders att invalid after the second assignment.

File h5file(TEST_FILE, File::ReadWrite);
Group g = h5file.getGroup("metadata");

Attribute att = g.getAttribute("family");
att = g.getAttribute("one");

att.read() will fail with error:
#000: ../../src/H5A.c line 1276 in H5Aget_type(): not an attribute
major: Invalid arguments to routine
minor: Inappropriate type

Recursively create groups

I want to switch from my own wrapper HDF5pp to HighFive by contributing to an xtensor-io wrapper. The current PR appears to be too complicated, as it wraps HighFive. I want to avoid this using the following syntax:

HighFive::File file("/tmp/file.h5", HighFive::File::ReadWrite);

xt::array<double> data = xt::load(file, "/path/to/data");

// ....

xt::dump(file, "/other/path", data);

However, I directly miss a feature that both the PR and my own library have, and that I deem essential in use: automatically creating groups recursively, such that a DataSet with an arbitrary path can be created. I was looking to open a PR for HighFive, but I must say that I don't directly see where to place such a function, nor how to make createDataSet use it.

Are you open to such a feature? How can I integrate it?

@hernando @wolfv

1D assertion failure

Hi,

I am getting assertion failures from commit a06a59c:

braynsTestData: /home/jkarlsso/local/include/highfive/bits/H5Converter_misc.hpp:162: HighFive::details::data_converter<std::vector<_RealType>, typename std::enable_if<(std::is_same<T, typename HighFive::details::type_of_array<T>::type>::value)>::type>::data_converter(std::vector<_RealType>&, HighFive::DataSpace&) [with T = vmml::vector<4ul, float>; typename std::enable_if<(std::is_same<T, typename HighFive::details::type_of_array<T>::type>::value)>::type = void]: Assertion `is_1D(_space.getDimensions())' failed.
unknown location(0): fatal error in "render_circuit_and_compare": signal: SIGABRT (application abort requested)

The assertion code looks like this:

inline bool is_1D(const std::vector<size_t>& dims)
{
    return std::count_if(dims.begin(), dims.end(),
                         [](size_t i){ return i > 1; }) < 2;
}

This checks that at most one dimension has size 2 or greater, which seems strange. I would assume it should look more like this:

inline bool is_1D(const std::vector<size_t>& dims)
{
    return dims.size() == 1 && dims[0] == 1;
}

What do you think?

[conda packaging] [windows] Adding H5_BUILT_AS_DYNAMIC_LIB to definitions

I opened a PR to package HighFive on conda-forge. It will be a dependency of xtensor-io in our next release. The PR URL is: conda-forge/staged-recipes#7195

When not disabling the build of the examples, we are seeing some errors on Windows. The full logs are available at: https://ci.appveyor.com/project/conda-forge/staged-recipes/builds/20860171

To include some information about this:

Scanning dependencies of target read_write_single_scalar_bin
[  5%] Building CXX object src/examples/CMakeFiles/read_write_single_scalar_bin.dir/read_write_single_scalar.cpp.obj
cl : Command line warning D9002 : ignoring unknown option '-std=c++11'
read_write_single_scalar.cpp
c:\bld\highfive_1544226794732\work\include\highfive\bits/H5PropertyList_misc.hpp(82): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data
c:\bld\highfive_1544226794732\work\include\highfive\bits/H5DataType_misc.hpp(132): warning C4244: '=': conversion from 'double' to 'std::size_t', possible loss of data
[ 10%] Linking CXX executable read_write_single_scalar_bin.exe
LINK Pass 1: command "C:\PROGRA~2\MI0E91~1.0\VC\bin\amd64\link.exe /nologo @CMakeFiles\read_write_single_scalar_bin.dir\objects1.rsp /out:read_write_single_scalar_bin.exe /implib:read_write_single_scalar_bin.lib /pdb:%SRC_DIR%\src\examples\read_write_single_scalar_bin.pdb /version:0.0 /machine:x64 /debug /INCREMENTAL /subsystem:console %PREFIX%\Library\lib\hdf5.lib kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib /MANIFEST /MANIFESTFILE:CMakeFiles\read_write_single_scalar_bin.dir/intermediate.manifest CMakeFiles\read_write_single_scalar_bin.dir/manifest.res" failed (exit code 1120) with the following output:
read_write_single_scalar.cpp.obj : MSIL .netmodule or module compiled with /GL found; restarting link with /LTCG; add /LTCG to the link command line to improve linker performance
LINK : warning LNK4075: ignoring '/INCREMENTAL' due to '/LTCG' specification
read_write_single_scalar.cpp.obj : error LNK2001: unresolved external symbol H5T_NATIVE_DOUBLE_g
read_write_single_scalar.cpp.obj : error LNK2001: unresolved external symbol H5T_NATIVE_INT_g
read_write_single_scalar_bin.exe : fatal error LNK1120: 2 unresolved externals
NMAKE : fatal error U1077: 'C:\bld\highfive_1544226794732\_build_env\Library\bin\cmake.exe' : return code '0xffffffff'
Stop.

By the way, if any of the highfive maintainers wants to be a maintainer of the conda recipe, let me know! I would be happy to add them in the PR.
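For what it's worth, unresolved H5T_NATIVE_*_g symbols are the classic symptom of compiling against a shared HDF5 on Windows without defining H5_BUILT_AS_DYNAMIC_LIB, so the headers emit dllimport declarations for the library's globals. A sketch of the fix at configure time; the exact placement in the recipe is an assumption:

```shell
# Sketch: make the HDF5 headers use dllimport declarations so the
# examples link against the shared hdf5.lib (MSVC flag syntax).
cmake -DCMAKE_CXX_FLAGS="/DH5_BUILT_AS_DYNAMIC_LIB" ..
```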

HighFive fails to open a file with flag Create when the file exist

On current master, using File file(FILE_NAME, File::ReadWrite | File::Create); fails when the file already exists, whereas it should simply ignore the Create flag and open the existing file:

HDF5-DIAG: Error detected in HDF5 (1.8.18) thread 0:
  #000: ../../src/H5F.c line 522 in H5Fcreate(): unable to create file
    major: File accessibilty
    minor: Unable to open file
  #001: ../../src/H5Fint.c line 1048 in H5F_open(): unable to open file
    major: File accessibilty
    minor: Unable to open file
  #002: ../../src/H5FD.c line 993 in H5FD_open(): open failed
    major: Virtual File Layer
    minor: Unable to initialize object
  #003: ../../src/H5FDsec2.c line 339 in H5FD_sec2_open(): unable to open file: name = 'dataset.h5', errno = 17, error message = 'File exists', flags = 15, o_flags = c2
    major: File accessibilty
    minor: Unable to open file
Unable to create file create_dataset_example.h5 (File accessibilty) Unable to open file
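Until this is fixed, one possible workaround is to attempt the plain open first and only fall back to creating the file. This is an untested sketch; it assumes HighFive::FileException is what the failed open throws, and open_or_create is a made-up helper name:

```cpp
#include <highfive/H5File.hpp>
#include <string>

// Sketch: emulate "open, or create if missing" explicitly.
HighFive::File open_or_create(const std::string& name) {
    try {
        // A plain open succeeds whenever the file already exists.
        return HighFive::File(name, HighFive::File::ReadWrite);
    } catch (const HighFive::FileException&) {
        // Fall back to creating a new file.
        return HighFive::File(name,
                              HighFive::File::ReadWrite | HighFive::File::Create);
    }
}
```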

Cannot read dataset to std::vector<uint8_t>

I get a runtime error when I try to read a dataset into a std::vector<uint8_t>. Here is the error message:

HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) MPI-process 0:
  #000: H5Dio.c line 170 in H5Dread(): can't read data
    major: Dataset
    minor: Read failed
  #001: H5Dio.c line 418 in H5D__read(): unable to set up type info
    major: Dataset
    minor: Unable to initialize object
  #002: H5Dio.c line 978 in H5D__typeinfo_init(): unable to convert between src and dest datatype
    major: Dataset
    minor: Feature is unsupported
  #003: H5T.c line 4560 in H5T_path_find(): no appropriate function for conversion path
    major: Datatype
    minor: Unable to initialize object
HDF5 error: Error during HDF5 Read: (Datatype) Unable to initialize object

The error goes away if I recompile the program after replacing "uint8_t" with "unsigned char". It also works with "unsigned short".
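One standalone data point (not an explanation of the conversion failure itself): on most platforms std::uint8_t is exactly unsigned char, which can be checked directly. If this holds on your platform, the two spellings name the same in-memory type, and the difference would have to come from how the datatype mapping is selected rather than from the memory layout:

```cpp
#include <cstdint>
#include <type_traits>

// Check whether std::uint8_t and unsigned char are the same type on
// this platform (they are on all mainstream toolchains).
bool uint8_is_unsigned_char() {
    return std::is_same<std::uint8_t, unsigned char>::value;
}
```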

Support non-STL types

I just came across your nice project. Is there any way that non-STL types one would use for multi-dimensional arrays (e.g. Eigen, xtensor, ...) could be directly supported, without, at least seemingly, first passing through std::vector? I have done this in my own lightweight wrapper HDF5pp for Eigen (and my own n-d array library cppmat).

Vote : migrate HighFive to full C++11 code base in release 2.0

A vote: whether or not to migrate HighFive to a full C++11 code base in release 2.0.

I would like to make C++11 a requirement for the future HighFive 2.0 release.

This would make the new versions >= 2.0 incompatible with old compilers (XLC-BGQ, GCC 4.4 and old PGI compilers).

The 1.X branch will be maintained with C++03 compatibility.

If anyone has arguments against this, or wants to support legacy compilers for longer, please comment here.

A decision will be taken on the 18th of July.

Write to 2-d extendible dataset

The following code

#include <highfive/H5DataSet.hpp>
#include <highfive/H5DataType.hpp>
#include <highfive/H5DataSpace.hpp>
#include <highfive/H5File.hpp>

int main()
{
  HighFive::File file("test.h5", HighFive::File::Overwrite);

  HighFive::DataSpace dataspace = HighFive::DataSpace({10,10}, {HighFive::DataSpace::UNLIMITED, HighFive::DataSpace::UNLIMITED});

  HighFive::DataSetCreateProps props;
  props.add(HighFive::Chunking(std::vector<hsize_t>{10,10}));

  HighFive::DataSet dataset = file.createDataSet("/A", dataspace, HighFive::AtomicType<double>(), props);

  double data = 10.;

  dataset.select({0,0},{1,1}).write(data);
  dataset.select({0,1},{1,2}).write(data);

  file.flush();
}

Throws the following:

libc++abi.dylib: terminating with uncaught exception of type HighFive::DataSpaceException: Impossible to write buffer of dimensions 0 into dataset of dimensions 2

What should be the correct usage in this case?
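One guess at the intended usage, based on the error message complaining about the buffer's dimensions: give write() a buffer whose shape matches the selected hyperslab rather than a bare scalar. This is an untested fragment continuing the example above; the nested-vector shape is an assumption:

```cpp
// Sketch: match the buffer shape to the selection shape.
std::vector<std::vector<double>> one = {{10.}};
dataset.select({0, 0}, {1, 1}).write(one);   // 1x1 selection, 1x1 buffer

std::vector<std::vector<double>> two = {{10., 10.}};
dataset.select({0, 1}, {1, 2}).write(two);   // 1x2 selection, 1x2 buffer
```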

Make selection lifetime independent of their parent dataset

Currently, HDF5 hyperslabs are valid only for the lifetime of their parent dataset object.

HighFive selections, however, do not increase the reference counter of their parent DataSet.

This can lead to confusing situations like

_hdf5_file.getDataSet("fdfdfd").select(...)

where the parent dataset's lifetime is shorter than the selection's, leaving an invalid selection object.

The solution is to make the "Selection" object own a dataset reference (increasing the reference counter) instead of just holding a weak reference.

Ability to create soft links

Is it possible to create a soft link to a dataset using HighFive?
I couldn't find such a feature in the current release.

Imagine it could do:

rootGroup.createSoftLink(std::string name,const HighFive::DataSet & targetDataSet)
or
rootGroup.createSoftLink(std::string name, std::string pathToDataSetObject /* /group1/group2/dataset*/)

Thanks in advance for any help.
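Until native support exists, a fallback through the raw HDF5 C API should be possible, since HighFive objects expose their hid_t via getId(). This is an untested sketch, and create_soft_link is a made-up helper name:

```cpp
#include <highfive/H5File.hpp>
#include <string>

// Sketch: create a soft link with the HDF5 C API, anchored at the
// file (or any group) whose id HighFive hands out via getId().
void create_soft_link(HighFive::File& file,
                      const std::string& link_name,
                      const std::string& target_path) {
    H5Lcreate_soft(target_path.c_str(), file.getId(),
                   link_name.c_str(), H5P_DEFAULT, H5P_DEFAULT);
}
```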

Use inline namespaces for library versioning

HighFive uses std::exception as the base type of its exception classes. Since this class has a virtual destructor, it is not entirely safe to use different versions of HighFive in libraries that end up linked together in the same final binary, because the compiler may generate symbols for the virtual tables of the exception types. There may be other definitions for which the compiler generates symbols as well.
A clean solution to this problem is to use inline namespaces to version the library (as standard-library vendors do to avoid mixing C++98 and C++11 symbols in the same binary).
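For illustration, the mechanism looks like this in a standalone sketch (the namespace and version tag names are made up; HighFive would choose its own):

```cpp
#include <string>

namespace mylib {
// Members of the inline namespace are reachable as mylib::..., but
// their mangled symbols carry the version tag, so two versions of the
// library linked into one binary cannot collide at the symbol level.
inline namespace v2 {
struct Exception {
    virtual ~Exception() = default;
    virtual std::string what() const { return "mylib v2 error"; }
};
}  // inline namespace v2
}  // namespace mylib
```

Callers keep writing mylib::Exception; only the mangled names (and thus the ABI) change between versions.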

License

Are you open to re-licensing the library under something more permissive than LGPL? (Like a BSD or an MIT license?) My employer insists on releasing software under BSD and won't let us include any GPL or LGPL code.

(It's also not entirely clear to me what it means for a header-only library to be under LGPL. There is no possibility of dynamic linking, so isn't it exactly the same as GPL?)

static const not defined

#include <highfive/H5File.hpp>
#include <iostream>
#include <memory>
#include <cstdio>
#include <fstream>
#include <cassert>
#include <functional>

int main()
{
  std::unique_ptr<HighFive::File> data = std::make_unique<HighFive::File>("test.h5", HighFive::File::Overwrite);

  return 0;
}

Compiled using

clang++ --version
Apple LLVM version 10.0.0 (clang-1000.11.45.5)
Target: x86_64-apple-darwin18.2.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

Results in

clang++ -std=c++14 -lhdf5 -o test test.cpp
Undefined symbols for architecture x86_64:
  "HighFive::File::Overwrite", referenced from:
      _main in test-96d5dd.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)

In fact this is not so surprising; it is a known limitation of static const data members, see e.g. this post.

One solution is to add an out-of-line definition const int HighFive::File::Overwrite; somewhere, but I don't think that is desirable. Removing the staticness would also solve the issue, but what I think is much more elegant is to use an enum.
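To illustrate the enum variant (a standalone sketch with made-up flag values, not HighFive's actual definition): an enumerator is a prvalue, so binding it to a const reference materializes a temporary and never requires an out-of-line definition, unlike an odr-used static const int member:

```cpp
// Sketch: flags as enumerators instead of static const int members.
struct File {
    enum Flag : int {
        ReadOnly  = 0x01,
        ReadWrite = 0x02,
        Truncate  = 0x04,
        Overwrite = ReadWrite | Truncate
    };
};

// Passing File::Overwrite here is fine without any definition in a
// .cpp file; with `static const int Overwrite = ...;` the same call
// would odr-use the member and could fail to link, as observed above.
int flag_value(const int& f) { return f; }
```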

Using traits in DataSpace::From()

We are trying to use the HighFive library with blaze matrices (https://bitbucket.org/blaze-lib/blaze). HighFive is a very nicely written library, so I was able to write a few traits such as array_dims and transform_read... to make it work with blaze. But I couldn't do the same for DataSpace::From(), since it doesn't use traits. This is not a serious issue, since I can create a DataSpace from a vector, but it would be much nicer if we could use DataSpace::From() with blaze.

So, is it possible to use traits in DataSpace::From() instead of a separate overload for each input type?

writing and reading 2d C array

I have the following working example:

#include <string>
#include <vector>
#include <iostream>
#include <highfive/H5File.hpp>
#include <highfive/H5DataSet.hpp>
#include <highfive/H5DataSpace.hpp>

namespace h5 = HighFive;
using namespace std;

const string FILE_NAME("test.h5");
const string DATASET_NAME("dset");
const int ROWS(5);
const int COLS(2);

int main(void) {
	h5::File file(FILE_NAME, h5::File::ReadWrite | h5::File::Create | h5::File::Truncate);

	vector<size_t> dims(2);
	dims[0] = ROWS;
	dims[1] = COLS;

	// dynamically allocate C array (2d)
	double **data = (double **)calloc(ROWS, sizeof(double *));
	data[0] = (double *)calloc(ROWS*COLS, sizeof(double));
	for (int ii = 1; ii < ROWS; ii++) {
		data[ii] = data[0] + ii * COLS;
	}

	// put some values
	for (int ii = 0; ii < ROWS; ii++) {
		for (int jj = 0; jj < COLS; jj++) {
			data[ii][jj] = (double)(ii*COLS + jj);
			cout << "data[" << ii << "][" << jj <<"] = " << data[ii][jj] << endl;
		}
	}

	// writing the dataset
	h5::DataSet dataset = file.createDataSet<double>(DATASET_NAME, h5::DataSpace(dims));
	dataset.write(data);
	file.flush();

	// reading the dataset
	double **data_read = NULL;
	dataset = file.getDataSet("dset");
	// dataset.read(data_read);
	// for (int ii = 0; ii < ROWS; ii++) {
	// 	for (int jj = 0; jj < COLS; jj++) {
	// 		cout << data[ii][jj] << " <-> " << data_read[ii][jj];
	// 	}
	// }

	// free memory: inner data block first, then the row-pointer array
	free(data[0]);
	free(data);

	return 0;
}

Two questions/problems:

  1. the printed values in data look okay, but the values written to test.h5 are wrong. Where is the error: in my example or in HighFive?
  2. reading back the 2-d dataset into a C array does not work at all. I get the runtime error

HDF5-DIAG: Error detected in HDF5 (1.8.16) thread 139984454051648:
  #000: ../../../src/H5Dio.c line 173 in H5Dread(): can't read data
    major: Dataset
    minor: Read failed
  #001: ../../../src/H5Dio.c line 443 in H5D__read(): no output buffer
    major: Invalid arguments to routine
    minor: Bad value
terminate called after throwing an instance of 'HighFive::DataSetException'
  what(): Error during HDF5 Read: (Invalid arguments to routine) Bad value

Do I expect too much simplicity and flexibility from the HighFive API?
Looking forward to your feedback... cheers
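For what it's worth, two things stand out in the example (a hedged reading, not a confirmed diagnosis): write(data) passes a double**, so HighFive may serialize the row-pointer array rather than the contiguous values, and read(data_read) passes a null pointer, so HDF5 has no output buffer at all. Since the README advertises automatic conversion of nested std::vector, here is a sketch that avoids raw pointers entirely:

```cpp
#include <cstddef>
#include <vector>

// Build the same values as the C arrays in the example; a nested
// vector like this can be handed straight to dataset.write(values)
// and refilled by dataset.read(values), per HighFive's advertised
// std::vector support.
std::vector<std::vector<double>> make_values(std::size_t rows, std::size_t cols) {
    std::vector<std::vector<double>> values(rows, std::vector<double>(cols));
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            values[i][j] = static_cast<double>(i * cols + j);
    return values;
}
```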
