jupyterlab / jupyterlab-hdf5

Open and explore HDF5 files in JupyterLab. It can handle very large (terabyte-scale) files and datasets of any dimensionality.

License: BSD 3-Clause "New" or "Revised" License

Python 26.69% TypeScript 24.23% Jupyter Notebook 46.65% CSS 1.22% JavaScript 1.13% Shell 0.08%
hdf5 hdf5-dataset hdf5-filebrowser high-dimensional jupyter jupyterlab jupyterlab-2 jupyterlab-extension labextension

jupyterlab-hdf5's Introduction

Installation | Documentation | Contributing | License | Team | Getting help


An extensible environment for interactive and reproducible computing, based on the Jupyter Notebook and Architecture.

JupyterLab is the next-generation user interface for Project Jupyter offering all the familiar building blocks of the classic Jupyter Notebook (notebook, terminal, text editor, file browser, rich outputs, etc.) in a flexible and powerful user interface.

JupyterLab can be extended using npm packages that use our public APIs. The prebuilt extensions can be distributed via PyPI, conda, and other package managers. The source extensions can be installed directly from npm (search for jupyterlab-extension) but require an additional build step. You can also find JupyterLab extensions exploring GitHub topic jupyterlab-extension. To learn more about extensions, see the user documentation.

Read the current JupyterLab documentation on ReadTheDocs.

Important

JupyterLab 3 will reach its end of maintenance date on May 15, 2024, anywhere on Earth. To help us make this transition, fixes for critical issues will still be backported until December 31, 2024. If you are still running JupyterLab 3, we strongly encourage you to upgrade to JupyterLab 4 as soon as possible. For more information, see JupyterLab 3 end of maintenance on the Jupyter Blog.


Getting started

Installation

If you use conda, mamba, or pip, you can install JupyterLab with one of the following commands.

  • If you use conda:
    conda install -c conda-forge jupyterlab
  • If you use mamba:
    mamba install -c conda-forge jupyterlab
  • If you use pip:
    pip install jupyterlab
    If installing using pip install --user, you must add the user-level bin directory to your PATH environment variable in order to launch jupyter lab. If you are using a Unix derivative (e.g., FreeBSD, GNU/Linux, macOS), you can do this by running export PATH="$HOME/.local/bin:$PATH". If you are using a macOS version that comes with Python 2, run pip3 instead of pip.

For more detailed instructions, consult the installation guide. Project installation instructions from the git sources are available in the contributor documentation.

Installing with Previous Versions of Jupyter Notebook

When using a version of Jupyter Notebook earlier than 5.3, the following command must be run after installing JupyterLab to enable the JupyterLab server extension:

jupyter serverextension enable --py jupyterlab --sys-prefix

Running

Start up JupyterLab using:

jupyter lab

JupyterLab will open automatically in the browser. See the documentation for additional details.

If you encounter an error like "Command 'jupyter' not found", please make sure the PATH environment variable is set correctly. Alternatively, you can start up JupyterLab using ~/.local/bin/jupyter lab without changing the PATH environment variable.

Prerequisites and Supported Browsers

The latest versions of the following browsers are currently known to work:

  • Firefox
  • Chrome
  • Safari

See our documentation for additional details.


Getting help

We encourage you to ask questions on the Discourse forum. A question answered there can become a useful resource for others.

Bug report

To report a bug, please read the guidelines and then open a GitHub issue. To keep resolved issues self-contained, the lock bot will lock closed issues as resolved after a period of inactivity. If a related discussion is still needed after an issue is locked, please open a new issue and reference the old issue.

Feature request

We also welcome suggestions for new features as they help make the project more useful for everyone. To request a feature please use the feature request template.


Development

Extending JupyterLab

To start developing an extension for JupyterLab, see the developer documentation and the API docs.

Contributing

To contribute code or documentation to JupyterLab itself, please read the contributor documentation.

JupyterLab follows the Jupyter Community Guides.

License

JupyterLab uses a shared copyright model that enables all contributors to maintain the copyright on their contributions. All code is licensed under the terms of the revised BSD license.

Team

JupyterLab is part of Project Jupyter and is developed by an open community. The maintenance team is assisted by a much larger group of contributors to JupyterLab and Project Jupyter as a whole.

JupyterLab's current maintainers are listed in alphabetical order, with affiliation, and main areas of contribution:

  • Mehmet Bektas, Netflix (general development, extensions).
  • Alex Bozarth, IBM (general development, extensions).
  • Eric Charles, Datalayer, (general development, extensions).
  • Frédéric Collonval, WebScIT (general development, extensions).
  • Martha Cryan, Mito (general development, extensions).
  • Afshin Darian, QuantStack (co-creator, application/high-level architecture, prolific contributions throughout the code base).
  • Vidar T. Fauske, JPMorgan Chase (general development, extensions).
  • Brian Granger, AWS (co-creator, strategy, vision, management, UI/UX design, architecture).
  • Jason Grout, Databricks (co-creator, vision, general development).
  • Michał Krassowski, Quansight (general development, extensions).
  • Max Klein, JPMorgan Chase (UI Package, build system, general development, extensions).
  • Gonzalo Peña-Castellanos, QuanSight (general development, i18n, extensions).
  • Fernando Perez, UC Berkeley (co-creator, vision).
  • Isabela Presedo-Floyd, QuanSight Labs (design/UX).
  • Steven Silvester, MongoDB (co-creator, release management, packaging, prolific contributions throughout the code base).
  • Jeremy Tuloup, QuantStack (general development, extensions).

Maintainer emeritus:

  • Chris Colbert, Project Jupyter (co-creator, application/low-level architecture, technical leadership, vision, PhosphorJS)
  • Jessica Forde, Project Jupyter (demo, documentation)
  • Tim George, Cal Poly (UI/UX design, strategy, management, user needs analysis).
  • Cameron Oelsen, Cal Poly (UI/UX design).
  • Ian Rose, Quansight/City of LA (general core development, extensions).
  • Andrew Schlaepfer, Bloomberg (general development, extensions).
  • Saul Shanabrook, Quansight (general development, extensions)

This list is provided to give the reader context on who we are and how our team functions. To be listed, please submit a pull request with your information.


Weekly Dev Meeting

We have videoconference meetings every week where we discuss what we have been working on and get feedback from one another.

Anyone is welcome to attend, if they would like to discuss a topic or just listen in.

Notes are archived on GitHub Jupyter Frontends team compass.

jupyterlab-hdf5's People

Contributors

athornton, cnydw, dependabot[bot], jonjonhays, loichuder, rcthomas, saulshanabrook, telamonian


jupyterlab-hdf5's Issues

unable to open file

After a successful installation, the file still does not open; it just keeps navigating through the groups.
Dataset: redd.h5 (low frequency)

Improve functionality of "Show in File Browser"

Currently, right clicking on a dataset tab and selecting "Show in File Browser" will work correctly, but only if the dataset has a .data extension at the end of its name.

The likeliest cause is that at some point the related command for "Show in..." tries to find a file type to go along with the dataset, but only relies on the path to do so. Without the extension, no file type can be determined, so the command fails.

Fixing this will entail making changes (probably to core) such that the file type finding in this case is more robust.

Support JupyterLab 3.0+ (also the release candidate?)

JupyterLab is going to release version 3.0 soon. I tried to use the hdf5 extension with jupyterlab==3.0.0rc6 and was able to get it working locally by updating the dependencies in package.json.

  "dependencies": {
    "@jupyterlab/application": "^3.0.0-rc.6",
    "@jupyterlab/apputils": "^3.0.0-rc.6",
    "@jupyterlab/coreutils": "^5.0.0-rc.6",
    "@jupyterlab/docmanager": "^3.0.0-rc.6",
    "@jupyterlab/docregistry": "^3.0.0-rc.6",
    "@jupyterlab/filebrowser": "^3.0.0-rc.6",
    "@jupyterlab/notebook": "^3.0.0-rc.6",
    "@jupyterlab/services": "^6.0.0-rc.6",
    "@lumino/algorithm": "^1.3.0",
    "@lumino/coreutils": "^1.5.0",
    "@lumino/datagrid": "^0.14.0",
    "@lumino/messaging": "^1.4.0",
    "@lumino/signaling": "^1.4.0",
    "@lumino/widgets": "^1.14.0"
  },
  "devDependencies": {
    "husky": "^3.0.1",
    "lint-staged": "^9.2.0",
    "prettier": "^1.13.7",
    "rimraf": "~2.6.2",
    "tslint": "^5.10.0",
    "tslint-config-prettier": "^1.13.0",
    "tslint-plugin-prettier": "^2.0.1",
    "typescript": "~4.0.2",
    "yarn-deduplicate": "^1.1.1"
  },

Do we wait until the official release of JupyterLab 3.0 to publish the corresponding npm package? Or maybe we can already publish a corresponding release candidate version?

Show attributes

Thanks for this really helpful extension.

I would really appreciate a way to view attributes on groups and datasets — possibly by extending a drop-down under the relevant item.

Proposals to improve the backend

I have played a bit with the new endpoints and have a few improvements to propose:

Needs discussion

  • 1) As attributes can store complex data (i.e. arrays), the type and shape of attributes should be returned. #87

2) The type of datasets is also tough to find, as the dtype field is the NumPy type and not the HDF5 type. One could infer the HDF5 type from the dtype, but I would prefer that the backend provide the HDF5 type directly. Not implemented: silx-kit/h5web#406 (comment)

3) Similarly, shape is the NumPy shape, not the shape in the HDF5 sense of the term. An HDF5 shape can be simple (as you would expect) or a compound shape for more complex datasets. To support these, I think the HDF5 shape should be returned. Not implemented: #53 (comment)

Uncontroversial

  • Added in #60 -> Less critical, HDF5 entities are indexed internally by an id. It could be an additional field returned by the hdf/meta endpoint. REMOVED in #70 and #77, see #68 for the discussion.

@telamonian please give me your thoughts

The extension "@jupyterlab/hdf5" does not yet support the current version of JupyterLab.

Extension Installation Error

An error occurred installing @jupyterlab/hdf5.

Error message: The extension "@jupyterlab/hdf5" does not yet support the current version of JupyterLab.

Conflicting Dependencies:
JupyterLab Extension Package

>=2.1.2 <2.2.0   >=1.2.0 <2.0.0   @jupyterlab/application
>=2.1.1 <2.2.0   >=1.2.0 <2.0.0   @jupyterlab/apputils
>=4.1.0 <4.2.0   >=3.2.0 <4.0.0   @jupyterlab/coreutils
>=2.1.2 <2.2.0   >=1.2.0 <2.0.0   @jupyterlab/docmanager
>=2.1.2 <2.2.0   >=1.2.0 <2.0.0   @jupyterlab/filebrowser
>=2.1.2 <2.2.0   >=1.2.0 <2.0.0   @jupyterlab/notebook
>=5.1.0 <5.2.0   >=4.2.0 <5.0.0   @jupyterlab/services

Add slicing of datasets

Related to n-dimensional dataset support in #4. Add support to the hdf data models (and thereby to the dataset grids) for showing arbitrary 2D slices. Will use a Numpy-like syntax.
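For illustration, the NumPy-like syntax mentioned above might select an arbitrary 2D slice of a higher-dimensional array like this (a sketch of the idea, not the extension's actual implementation):

```python
import numpy as np

# A 3-D array standing in for an HDF5 dataset; an h5py dataset
# would accept the same slice expression.
data = np.arange(24).reshape(2, 3, 4)

# A NumPy-like 2-D slice: fix the first axis, keep the remaining two.
plane = data[0, :, :]
print(plane.shape)  # (3, 4)
```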

Extending the default file browser to launch an HDF5 rendering extension

We're developing an extension that renders a visualization of data stored in HDF5 files. Our original intention was to extend the default file browser's "Open With" menu; however, after a chat with @telamonian and @saulshanabrook, it turns out this may not be trivial for our particular use case.

@saulshanabrook mentioned that the canonical way to extend the "Open With" menu would be to create a Mime Renderer extension, but a Mime Renderer extension wants to load the file into local memory by default. This is a problem for us, as we are dealing with remote files much too large to pull into local memory, and our extension only needs the small portion of the file that is actually being rendered.

So, we're looking for the simplest way to launch the extension via the browser while only retrieving the target filename in the process, rather than the file data. Extending the "Open With" menu, the file browser's menu itself, or something involving the data registry are all acceptable for us; we're just trying to avoid something super clunky like entering the filename in a text field. Thanks so much for all the help!

Tagging @shreddd here as well.

change type of index labels from slice to range

Having the type of the index labels be a list of Python slice objects on the Python side and an array of ISlice objects on the TypeScript side is, at best, very confusing. I'm essentially using each "slice" here to define a range of index values, so let's just call these things range (Python) and IRange (TypeScript) instead.

To clarify, I'm proposing simply to change the names of these types, not any behavior.

h5grove version

The install_requires section specifies h5grove==0.0.14 but the current release of that package is 1.3.0. Is there a reason that the extension needs to stay pinned to that specific version?

Linking hdf5 files

Hi there,

It seems there might be a strange behavior when using external link with different hdf5 files.

In the example below, the linked file doesn't behave as expected.

import h5py
import numpy as np
 
with h5py.File('atomicfile.hdf5', 'a') as f:
    dataset = np.random.rand(100)
    f.create_dataset('testset', data=dataset)
 
with h5py.File('linkfile.hdf5', 'a') as f:
    f['atomicfile'] = h5py.ExternalLink('atomicfile.hdf5', '/')

Anyway, this is a workaround that seems to solve the problem.

import h5py
import numpy as np

with h5py.File('atomicfile.hdf5', 'w') as f:
    dataset = np.random.rand(100)
    f.create_dataset('testset', data=dataset)

with h5py.File('linkfile.hdf5', 'w') as f:
    f['/testset'] = h5py.ExternalLink('atomicfile.hdf5', '/testset')
    
with h5py.File('linkfile.hdf5', 'r') as f:
    keys = list(f.keys())
    print(keys[0])

Or the following case using a group.

import h5py
import numpy as np

with h5py.File('atomicfile.hdf5', 'w') as f:
    dataset = np.random.rand(100)
    g = f.create_group('test')
    g.create_dataset('testset', data=dataset)

with h5py.File('linkfile.hdf5', 'w') as f:
    f['/test'] = h5py.ExternalLink('atomicfile.hdf5', '/test')
    
with h5py.File('linkfile.hdf5', 'r') as f:
    keys = list(f.keys())
    print(keys[0])

Opinions?

"File not found" error

When I double-click on an .h5 file in the standard file browser, I get this error message

(screenshot of the error message omitted)

And the "browse HDF" tab does not show any files or folders to me.

(screenshot omitted)

I am running JupyterHub version 2.2.4 and Python 3.6.9 (default, Jul 17 2020, 12:50:27).

Add cell selection and copy/paste by upgrading to `@phosphor/datagrid` v0.3.0

This extension desperately needs a way to actually select and grab data out of an HDF5 file. The latest version of @phosphor/datagrid (v0.3.0) has added this necessary feature.

I've already started working on this in a new branch: jupyterlab/jupyterlab-hdf5:upgrade-datagrid-0.3.0. There are two main issues to deal with:

  • Previously, it was impossible to get this to work, due to JupyterLab core pinning datagrid at v0.1.2 by marking it as a singleton. This has now been fixed, so that's no longer a blocker.

  • Various small changes have been made to the datagrid api, in particular to support cell selection. I'll have to go through this repo and upgrade it to match.

conda-forge packaging

Problem

Having jupyterlab-hdf5 packaged on conda-forge will be useful to avoid mixing libraries installed by pip and conda in a conda-managed python distribution.
See for example packaging of https://github.com/matplotlib/ipympl, which can be installed from conda-forge without having nodejs installed and without having to build the extension.

Proposed Solution

Package jupyterlab-hdf5 on conda-forge: https://conda-forge.org/docs/maintainer/adding_pkgs.html.

Additional context

It would make the installation of jupyterlab-hdf5 easier.

Double clicking above the files of the FileBrowser still opens the file

Description

(screen recording omitted)

When a folder contains enough files that you can scroll down in the file browser, and you double-click on the area above the DirListing widget, the file is still opened.

Reproduce

Like the description

Expected behavior

The file shouldn't be opened.

Context

this._browser.node.addEventListener("dblclick", handleDblClick, true);
patches the whole browser node, which includes the filter and other components above the DirListing widget. It should only patch the DirListing widget's node.

Produce valid JSON in the presence of NaN or INF values

Currently the backend server extension is using Python's native JSON package to serialize the payload. Python's JSON library will produce invalid JSON in the presence of NaN or Inf values: it emits tokens equivalent to JavaScript's NaN and Infinity, but these are not valid JSON, and we fail on the frontend when parsing the payload. While this could be handled on the frontend, I think it makes more sense to handle it server-side.

Unfortunately, I wasn't able to find a solution for this without bringing in a third-party dependency. The simplejson library provides an ignore_nan flag as input to its dump function that encodes NaN values as None/null.

While JavaScript data rendering and plotting libraries typically prefer null values, Python libraries (e.g., Matplotlib) tend to prefer NaN values, and may produce errors in the presence of null values. Is it likely this extension will ever be called from a non-JavaScript context? If not, it might make sense to simply encode NaN values as null by default. However, if it is likely that this extension will be called from a non-JavaScript context, we could add an ignoreNan flag to the query parameters. In my PR for n-dim support, I added an implementation of the former. Here's a discussion on this topic:

https://stackoverflow.com/questions/15228651/how-to-parse-json-string-containing-nan-in-node-js
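The server-side approach can be sketched without a third-party dependency: recursively replace non-finite floats with None before serializing, which is effectively what simplejson's ignore_nan flag does. The sanitize helper below is hypothetical, not the extension's actual code:

```python
import json
import math

def sanitize(obj):
    """Recursively replace non-finite floats (NaN, Inf) with None so that
    the payload serializes to strictly valid JSON."""
    if isinstance(obj, float) and not math.isfinite(obj):
        return None
    if isinstance(obj, dict):
        return {k: sanitize(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [sanitize(v) for v in obj]
    return obj

payload = {"values": [1.0, float("nan"), float("inf")]}
print(json.dumps(payload))            # emits NaN/Infinity tokens -- not valid JSON
print(json.dumps(sanitize(payload)))  # {"values": [1.0, null, null]} -- valid
```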

[Discussion] Keep/remove `id` from backend response ?

After discussion with @axelboc, I have come to wonder whether id should be in the API responses.

  • The true identifier for entities in the API is the path in the HDF5 (or uri). Adding an id which is in fact not the REST API identifier is confusing.
  • After a few tries with hard and soft links, I realized that the same entity accessed through two different paths has a different id.
  • In the end, it is a number that I had to dig from the lower levels of h5py while the rest of the backend uses only the high-level layer (which is fine as it eases things a lot, especially for link resolution).
  • It is a big integer that raises some issues: #67

For these reasons, I struggle to find a use case for id and would be in favor of removing it, favoring uri as the identifier.

404 instead of 500 when entity not found at given path

Problem

When requesting the metadata or value of an entity that doesn't exist, the server returns a 500 error response (i.e. "Internal Server Error"), which could technically mean anything.

Proposed Solution

Respond with a 404 "Not Found" error instead.

Additional context

A more specific error code will help us convey clearer error messages to the user: silx-kit/h5web#618

Make the "Open" filebrowser context menu item work with ".hdf5" files

I'm currently exploring some changes to core to make this happen. Some possibilities:

  • Introduce a dirlike filetype to go alongside directory and file in the core contents machinery. When you open a dirlike in the default filebrowser, it will switch to a particular browser, then use that browser's drive to cd to dirlike.path.

  • Add a contents-handling callback to the optional filetype parameters. At some point in the 'filebrowser:open' command, in between getting an item and passing item.path to the next open command, have a hook for a callback like:

    (item: Contents.IModel): string => {
      <do stuff>
    
      return path;
    }

Include group links in `/hdf/meta` endpoint response

Whenever I fetch a group that contains children, I always end up calling these two endpoints in sequence:

  • /hdf/meta, which tells me that the entity is a group and how many children it contains (childrenCount);
  • /hdf/contents, which gives me some basic info about the group's children (name, uri and type) -- in other words, the group's links.

I can't think of a real-world use case where one would need to know that an entity is a group and that this group contains links, without actually needing to fetch those links. I feel like /hdf/meta should just include a links array in a group's response.
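A hypothetical merged response shape for a group, with the links array built from the name/uri/type fields that /hdf/contents is described as returning (the exact field names here are assumptions, not the extension's documented schema):

```json
{
  "type": "group",
  "name": "my_group",
  "uri": "/my_group",
  "childrenCount": 2,
  "links": [
    { "name": "dset_a", "uri": "/my_group/dset_a", "type": "dataset" },
    { "name": "subgroup", "uri": "/my_group/subgroup", "type": "group" }
  ]
}
```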

Failure to open a 1D dataset

With the HDF5 file created by the following code, jupyterlab-hdf 0.1.0 fails to open the x dataset, which is 1D, showing no data. With the y and z datasets, which are 2D and 3D respectively, there is no problem.

Maybe there's a bug where the extension forgets that datasets can be one-dimensional.

import h5py
import numpy as np


x = np.zeros(shape=(1,), dtype=np.float64)
y = np.zeros(shape=(1, 1), dtype=np.float64)
z = np.zeros(shape=(1, 1, 1), dtype=np.float64)
with h5py.File('/tmp/bad.hdf5', 'w') as f:
    f.create_dataset('x', data=x)
    f.create_dataset('y', data=y)
    f.create_dataset('z', data=z)

Don't use `uri` to refer to internal object path

There are a bunch of places in the codebase where I call the internal path to a specific object within an HDF5 file the uri. This is a misnomer and should be changed to something else. Maybe opath?

Display one-dimensional data vertically

Problem

When I open a one-dimensional array, I see a single line that I have to scroll horizontally in order to see the entries. The cells are by default too narrow to read a whole floating-point number, let alone a UTF-8-encoded string. Presenting the values vertically would allow the more familiar vertical scrolling, and leave more horizontal space for longer entries. It would also normally allow more entries to be visible at once.

Proposed Solution

Treat a view of shape (n,) as shape (n,1), so that the built-in two-dimensional display code displays it vertically.
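In NumPy terms, the proposed reshape is trivial (a sketch of the idea, not the extension's code):

```python
import numpy as np

a = np.arange(5)          # shape (5,): a one-dimensional dataset
col = a.reshape(-1, 1)    # shape (5, 1): a single column the 2-D grid code can render
print(col.shape)  # (5, 1)
```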

Additional context

A couple of examples:

(three screenshots omitted)

I also note that if I supply the index "0" I get just the first entry, as expected, but I can no longer adjust the cell width (and it is too small to see much of the value).

It might be possible to make the indexing come out two-dimensional, as one would in NumPy with [np.newaxis, :] or [None, :], but these do not work. The need for these is questionable; really, I think it's simpler to just display one-dimensional arrays vertically, using the whole width of the window.

Display attributes more like datasets

Problem

I have an HDF5 file that has attributes associated with the file itself (the top-level group) and with some groups. (Not with datasets, although I know that is possible in HDF5.) When I open the file, these attributes are invisible; there is no indication that they are present. This is particularly awkward, as one attribute of the file - README - explains what all the datasets and attributes mean.

For the attributes of sub-groups, if I know that they are there I can right-click and query them. For the attributes of the File itself, I know of no way to access them through jupyter-hdf5.

I would like to be able to browse through the file, easily reading both the metadata stored in attributes and the contents of the datasets.

Proposed Solution

I suggest making attributes appear as "directory entries" of the objects that they are attached to, perhaps with a different icon than is used for datasets and sub-groups. It's not so simple to see how to do this for attributes of datasets, but if the file browser is still present the attributes should be visible there.

Additional context

Example attributes:

  • README - attribute of the top-level file, somewhat long text explaining what all is in the file
  • Fit parameters - attribute of the top-level file, list of strings giving the parameters that were fit, serving as column labels for one entry
  • DMX/DMX0001/DMX - attribute of a subgroup, a floating-point number giving the value of the named parameter.

The h5py package makes it easy to store strings and lists (arrays) of strings in attributes, but cumbersome to store them in datasets, so I suspect that the usage of attributes to store valuable metadata is common.

Include sample hdf5 file?

It might be useful to include some sample hdf5 files, to help with development and for demo purposes.

Jupyterlab-hdf5 does not currently support JupyterLab 4.x

Problem

jupyterlab-hdf5 is not JupyterLab 4.x compatible

Proposed Solution

That's the question: is jupyterlab-hdf5 actively maintained?

If so: is there already work underway to make this extension work as a prebuilt JL4+ extension?

If not: is there something else that provides equivalent functionality?

If the answer to that is also "no", are there other sites that feel like their desire for a JL4-compatible HDF extension approaches "need" ?

Additional context

I'm slowly getting the Rubin Science Platform migrated to JupyterLab 4 and am taking inventory of what extensions that we use are already there, and which ones either need updating or removal from our environment. Obviously the Rubin Observatory would prefer not to take ownership of the HDF server and lab extensions, but I think we're probably going to need HDF5 support. So knowing what Jupyter's status and intentions vis-a-vis this extension are would help us define and prioritize development work.

Dev install doesn't work without h5py already installed

pip install -e .
Obtaining file:///Users/saul/p/jupyterlab-hdf
    ERROR: Command errored out with exit status 1:
     command: /usr/local/miniconda3/envs/jupyterlab-hdf/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/Users/saul/p/jupyterlab-hdf/setup.py'"'"'; __file__='"'"'/Users/saul/p/jupyterlab-hdf/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info
         cwd: /Users/saul/p/jupyterlab-hdf/
    Complete output (9 lines):
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/Users/saul/p/jupyterlab-hdf/setup.py", line 56, in <module>
        from jupyterlab_hdf._version import __version__
      File "/Users/saul/p/jupyterlab-hdf/jupyterlab_hdf/__init__.py", line 10, in <module>
        from .contents import HdfContentsManager, HdfContentsHandler
      File "/Users/saul/p/jupyterlab-hdf/jupyterlab_hdf/contents.py", line 7, in <module>
        import h5py
    ModuleNotFoundError: No module named 'h5py'
    ----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

You are using option 6 here, which comes with this warning:

Although this technique is common, beware that it will fail if sample/__init__.py imports packages from install_requires dependencies, which will very likely not be installed yet when setup.py is run.

Fixup new error handling

  • Add missing error handlers as appropriate following the changes made in #41, #43, and #45

  • Currently, I've seen at least one modal error message that can't be closed. I think what's happening is that the error is re-thrown, reopening the modal every time the modal is closed. So that needs fixing.

TypeError: Object of type int64 is not JSON serializable

I can see the groups in the HDF5 file.
But I get TypeError: Object of type int64 is not JSON serializable when double-clicking on a table.
Python: 3.9.0
JupyterLab: Version 2.2.9
jupyterlab_hdf: 0.4.1
The HDF5 file was created with PyTables with this datatype:

class Tab(tables.IsDescription):
    number = tables.Float64Col()

I can see all content with h5ls.
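The error above can be reproduced, and worked around, with a custom JSON encoder that converts NumPy scalars to native Python types via .item() (a sketch; the extension's actual fix may differ):

```python
import json
import numpy as np

value = np.int64(42)

# json.dumps does not know about NumPy integer types and raises TypeError.
try:
    json.dumps({"n": value})
except TypeError as exc:
    print(exc)  # Object of type int64 is not JSON serializable

class NumpyEncoder(json.JSONEncoder):
    """Fall back to .item() for NumPy scalars, yielding native int/float."""
    def default(self, obj):
        if isinstance(obj, np.generic):
            return obj.item()
        return super().default(obj)

print(json.dumps({"n": value}, cls=NumpyEncoder))  # {"n": 42}
```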

Directory not found "hdf:"

Hello,

I am trying to open some .h5 files in JupyterLab 1.2.5 with jupyterlab-hdf 0.3.0, but this error appears:

Directory not found "hdf:"

I have JupyterHub running remotely on a Linux machine.

Thanks in advance!

Modernize signature of `get` method in backend

I keep forgetting about this, but

@gen.coroutine
def get(self, path):

needs to become

    from tornado import web

    @web.authenticated
    async def get(self):

This will:

  • update the (deprecated) tornado coroutine to the native python async support
  • add https authentication to the endpoint, which is just a Good Idea

@loichuder If you're currently writing unit tests for the endpoints, you'll probably want to make sure this change has been made before you get in too deep (it may subtly change some behavior). You can just add this change to the unit test PR, or I can take care of it if you prefer.

Support of compression filters

Thanks to h5py, most compression filters are supported out of the box 🎉

At ESRF, we usually use the bitshuffle filter, which isn't directly supported by h5py, leading to 500 errors when trying to fetch compressed datasets.

Importing the hdf5plugin module in the file handling the reading of the dataset would make it possible to support bitshuffle in addition to other compression filters.

@telamonian Would you agree to add it to the Python project dependencies?

Add support for 1-dimensional datasets

Currently, if a user tries to open a 1-dimensional dataset, a tab will open but nothing will be displayed. I get the following stack trace in the console:

backend.js:6 RangeError: Invalid array length
    at t._drawBodyRegion (datagrid.js:2323)
    at t._draw (datagrid.js:2249)
    at t._paint (datagrid.js:2229)
    at t.scrollTo (datagrid.js:813)
    at t._syncScrollState (datagrid.js:1093)
    at t.onResize (datagrid.js:954)
    at t.e.processMessage (widget.js:488)
    at t.processMessage (datagrid.js:870)
    at w (index.js:436)
    at Object.n [as sendMessage] (index.js:172)

Instead, a single column should be displayed containing the 1-dimensional data, as well as the index column. I was able to display the index column and an empty data column by making the following small modification:

this._colCount = shape[1] ? shape[1] : 1;

However, I haven't yet figured out how to populate the single data column, so I reverted the change in my working branch, and will leave this as an open issue for now.

Cells only display values if scrolled out of view once

I tried v0.4.0 (#30 (comment)) and had no problem installing/launching it 🎉

However, I observed the following behaviour when opening a dataset:

  • All visible cells are empty
  • If I scroll down, the cells that were hidden contain values as expected
  • If I scroll back up, the previously empty cells now contain values

That can lead to strange displays (tested on Firefox 80.0.1 and Chrome 84.0):
[screenshot]

I also get Uncaught (in promise) TypeError: shape is undefined in the console:
[screenshot]

but I am not sure this is related as it appears before I open the dataset.

EDIT: The error instead seems related to a 403 error I got above, as it cannot resolve a file I previously opened. Also, changing tabs seems to trigger the display of values as well.

File not found for correct path

I pasted the path that I confirmed working in a notebook into the top cell of the HDF viewer and it claims there's no file found (see screenshot).

The path is long, and the viewer shows wiggly lines in the part where there are multiple dots; is it breaking any filename rules for HDF that I'm not aware of?

The path is:

/Users/klay6683/Dropbox/data/planet4/P4_catalog_v1.0_supplementary_data/P4_catalog_v1.0_raw_classifications.hdf

[Screenshot 2019-11-18 23 16 07]

Document why h5serv isn't used

I just had a call with @telamonian, Shreyas Cholia, and @JonjonHays. Jon and Shreyas are going to push forward on this extension, and one question they had when looking at the code was why it wasn't using h5serv and was instead implementing its own backend. We could use h5serv with jupyter-server-proxy to expose it. However, @telamonian said that he originally looked into using it, but thought it would be more performant and easier to maintain to reimplement it, since we only need to support the read case.

It would be good to document this rationale in the README so that if others come to the project with the same question they can have it answered.

Add support for n-dimensional datasets

n-dimensional datasets can be supported by allowing the user to specify a desired 2D slice via an input box on the toolbar, using a numpy-like slicing syntax. The default slice would be something like

[:, :, 0, 0, 0, 0, etc]
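On the backend side, that default could be built mechanically from the dataset's shape: keep the first two axes whole and pin every further axis to index 0. A minimal numpy sketch (the `default_slice` helper is hypothetical, not the extension's code):

```python
import numpy as np

def default_slice(shape):
    # For a dataset with more than two dimensions: full slices on the
    # first two axes, index 0 on the rest (hypothetical helper).
    return tuple([slice(None), slice(None)] + [0] * (len(shape) - 2))

data = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)
sliced = data[default_slice(data.shape)]
print(sliced.shape)  # -> (2, 3)
```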

Include '.h5' extension

The .h5 extension is also commonly used for HDF5 files; would it be possible to include this extension as well?
