osplanning / omx

Open Matrix (OMX)

Home Page: https://github.com/osPlanning/omx/wiki

License: Apache License 2.0

Jupyter Notebook 100.00%

omx matrix-data open-data travel-modeling

omx's Issues

Python API overflow on zone number from unsigned short

Guys, we just had a user reproduce an overflow on zone numbers (he had zone numbers > 65535) when they were used in a mapping from Python. The cause looks like the unsigned short used in createMapping() at omx/File.py line 173:

...
mymap = self.createArray(self.root.lookup, title, atom=tables.UInt16Atom(),
...
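
To make the failure mode concrete, here is a minimal sketch in plain Python (no PyTables involved) of what a 16-bit unsigned store does to a zone number above 65535, together with a helper that picks a wide enough integer width. The names `wrap_uint16` and `smallest_unsigned_bits` are hypothetical and are not part of the OMX API.

```python
UINT16_MAX = 2**16 - 1  # 65535, the largest value an unsigned short can hold


def wrap_uint16(value):
    """Mimic what a C-style cast to an unsigned 16-bit integer keeps."""
    return value & 0xFFFF


def smallest_unsigned_bits(zone_numbers):
    """Return 16, 32 or 64: the narrowest unsigned width that fits every zone.

    Hypothetical helper for illustration only; not part of the OMX API.
    """
    if min(zone_numbers) < 0:
        raise ValueError("zone numbers must be non-negative")
    largest = max(zone_numbers)
    for bits in (16, 32, 64):
        if largest <= 2**bits - 1:
            return bits
    raise ValueError("zone number too large for 64 bits")


zones = [1, 2, 70000]
wrapped = [wrap_uint16(z) for z in zones]    # 70000 does not survive the cast
bits_needed = smallest_unsigned_bits(zones)  # a wider type is required here
```

Choosing the width from the data keeps small mappings compact, but simply always using a 32-bit atom would be the simpler fix.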

How should we rework API to handle future OMX data types?

Right now the API assumes everything is an OMX matrix. Assuming this project is a wild success and future datatypes are coming down the pike, what do we need to do to generalize the API?

Can we have a Javascript library for OMX?

Has anyone tackled Javascript yet for reading OMX matrix files?

Looks like there's an hdf5 library https://www.npmjs.com/package/hdf5

We have this most awesome matrix viewer/clusterer that works in the browser written in multithreaded Javascript.

I wrote an R package.

I remember a few months ago someone put together a python package that could be installed and maintained separately. Over the weekend I threw together an R package containing the API functions, adding some documentation files, etc.

I've got my work on a release branch in my gregmacfarlane/omxr repo. If anyone would like to help me check out the code, that would be great.

Also, I'd be happy to transfer ownership to osPlanning if you like.

Unable to load attribute info from object header

I am using the R API to load a .omx file I exported from Cube with the cube2omx.exe tool. I can see the resulting tables from my matrix files in AMPKHWY16.omx using HDFView for Ubuntu, so I think that step worked, which is why I'm filing this issue here.

But I cannot read these tables into R, using Ubuntu or OSX. For example,

library(rhdf5)
source("omx.r")
listOMX("AMPK16HWY.omx")
HDF5-DIAG: Error detected in HDF5 (1.8.7) thread 0:
  #000: H5A.c line 550 in H5Aopen(): unable to load attribute info from object header
    major: Attribute
    minor: Unable to initialize object
  #001: H5Oattribute.c line 530 in H5O_attr_open_by_name(): can't locate attribute
    major: Attribute
    minor: Object not found
HDF5: unable to open attribute
Error: Error in h5checktype(). Argument not of class H5IdComponent.

I filed this here because I can look into the tables with HDFView. On the other hand, I can run through the examples in text_omx.r with no difficulty, so the problem might be in cube2omx.exe or my use of it.

mapentries function always throws error

Python api File.mapentries() always throws error because of undefined variable.

File.py line 144 is:
return (keymap,entries)
but should be:
return entries

variable keymap is not defined, so code throws misleading exception of type LookupError

Need new class for array subclassing CArray

I was trying to implement where() and it is getting really ugly sans a class for matrices...

Suggest subclassing CArray strictly, but open to others...

Make C# API COM visible

Makes loading the library in Excel via VBA possible

+Arrow/Feather

I'd like to propose that we evaluate the feasibility of supporting the faster Arrow-based data format.

Do we need to deal w/ 0-based and 1-based matrices?

Cube, Transcad, and Emme all use zone numbers that are 1-based, which is a huge pain in the ass.

How do we want to deal with this? A flag? Ignore it and leave it to the user?
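
One possible answer, sketched below, is to keep the stored matrix 0-based and translate zone numbers through an explicit lookup, so it does not matter whether the source package numbers zones from 0 or from 1. This is only an illustration in plain Python; `build_zone_lookup` and `cell` are hypothetical names, not existing OMX functions.

```python
def build_zone_lookup(zone_numbers):
    """Map each zone number to its 0-based position in the matrix."""
    lookup = {}
    for position, zone in enumerate(zone_numbers):
        if zone in lookup:
            raise ValueError("duplicate zone number: %r" % (zone,))
        lookup[zone] = position
    return lookup


def cell(matrix, lookup, origin_zone, destination_zone):
    """Read one cell by zone number instead of by array offset."""
    return matrix[lookup[origin_zone]][lookup[destination_zone]]


zone_numbers = [1, 2, 3]  # 1-based, as exported by many modeling packages
trips = [
    [0, 5, 7],
    [4, 0, 9],
    [2, 8, 0],
]
lookup = build_zone_lookup(zone_numbers)
value = cell(trips, lookup, 1, 3)  # zone 1 is row 0, zone 3 is column 2
```

With this approach a flag is unnecessary: a file with 0-based zones just stores a mapping that starts at 0.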

Java OMX library won’t compile against HDF5 1.10

@jeabraham - I’m trying to use the Java OMX library and it won’t compile against HDF5 1.10 because they’ve moved to long pointers. Has anyone tackled the conversion/update? I’m giving it a go but it’s a bit of work.

My motivation is actually to put the PECAS economic flow matrices (700 or so of them) into one file that can be read by a visualization program. That visualization program is currently in Javascript. Has anyone tackled reading OMX matrices from Javascript? There’s a Javascript HDF5 library. See #32

update Python API documentation

We need to update https://github.com/osPlanning/omx/wiki/Python since we changed the API.

CArray not supported in creation, must convert to numpyarray

It would be nice to wrap this feature since we are going from h5-->h5 and it should be seamless.

Traceback of error when trying to write CArray:

>>> myfile = omx.openFile(TEST_FILE, 'r')
>>> outFile = omx.openFile(r"testData/sfptam2.omx",'w')
>>> m1in=myfile["1"]
>>> m1in
/1 (CArray(3693, 3693), shuffle, zlib(7)) ''
  atom := Float64Atom(shape=(), dflt=0.0)
  maindim := 0
  flavor := 'numpy'
  byteorder := 'little'
  chunkshape := (8, 3693)
>>> outFile["test1"]=m1in
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "omx\File.py", line 178, in __setitem__
    return self.createMatrix(key, atom, shape, obj=dataset)
  File "omx\File.py", line 35, in createMatrix
    matrix[:] = obj
  File "C:\Python27\Lib\site-packages\tables\array.py", line 672, in __setitem__
    nparr = convertToNPAtom2(value, self.atom)
  File "C:\Python27\Lib\site-packages\tables\utils.py", line 114, in convertToNPAtom2
    nparr = convertToNPAtom(object, atom, copy)
  File "C:\Python27\Lib\site-packages\tables\utils.py", line 83, in convertToNPAtom
    nparr = array_of_flavor(arr, 'numpy')
  File "C:\Python27\Lib\site-packages\tables\flavor.py", line 213, in array_of_flavor
    return array_of_flavor2(array, flavor_of(array), dst_flavor)
  File "C:\Python27\Lib\site-packages\tables\flavor.py", line 202, in flavor_of
    "supported objects are: %s" % (type_name, supported_descs) )
TypeError: objects of type ``CArray`` are not supported in this context, sorry; supported objects are: NumPy array, record or scalar; homogeneous list or tuple, integer, float, complex or string

Solution:

outFile["test"] = numpy.array((m1in))

Make a release of Python-OMX

ActivitySim needs to read OMX files. Based on feedback from the community we'd like to have OMX available as a standalone Python library installable via pip and conda (on PyPI and binstar, respectively). This kind of release is required so we can easily install the OMX library for things like automated testing and so it can be listed as an explicit dependency of ActivitySim.

Related to making a release, I'd recommend splitting the directories for each OMX library into separate repositories. That's not strictly necessary, but it has some conveniences:

• users can install omx directly from GitHub via pip
• tagged releases of Python code don't include the other languages
• maintenance and development branches for one language don't include the other languages

As an example, the official GitHub API wrappers for different languages are kept in separate repositories: https://github.com/octokit.

Whether you decide to split the directories or not, I'm happy to help with the infrastructure for a release. We'll need this available soon for ActivitySim development and testing. (Though not until after the New Year.)

You may also be interested in the changes I made for the proposed integration of OMX with ActivitySim: ActivitySim/activitysim#7. The changes included PEP8 compliance, switch to pytest for running unit tests, and additional test coverage.

Version 3.3.0 of pytables expires camelCase api breaking python version

Version 3.3.0 of pytables expired deprecated camelCase function names in favor of underscore_delimited names thus breaking references in openmatrix to functions like getNode (now renamed get_node).
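
One way to tolerate both naming styles is to look up the underscore name first and fall back to the camelCase one. The sketch below shows the idea with stand-in classes rather than real PyTables objects; `resolve_method`, `OldStyleFile` and `NewStyleFile` are hypothetical names used only for illustration.

```python
def resolve_method(obj, new_name, old_name):
    """Return obj.<new_name> if it exists, otherwise obj.<old_name>."""
    method = getattr(obj, new_name, None)
    if method is None:
        method = getattr(obj, old_name)
    return method


class OldStyleFile:
    """Stand-in for a file object that only offers the camelCase name."""

    def getNode(self, path):
        return "old:" + path


class NewStyleFile:
    """Stand-in for a file object that only offers the underscore name."""

    def get_node(self, path):
        return "new:" + path


old_result = resolve_method(OldStyleFile(), "get_node", "getNode")("/data")
new_result = resolve_method(NewStyleFile(), "get_node", "getNode")("/data")
```

A shim like this is mainly useful while old and new library versions both need to be supported; otherwise renaming the calls outright is simpler.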

Should "select by tag" return list of 2D arrays, 1 3D array, or a summed 2D array?

I think we probably would like capabilities on all fronts, but for trips (and other additive measures) we most likely want a summed 2D array while for something like a skim, we would probably like to be able to iterate through the list (simpler than a 3D array, although not all numpy functionality comes through).
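
The two main options can be sketched as follows. Plain nested lists stand in for numpy arrays so the example has no dependencies, and the `tags` dictionary, `select_as_list` and `select_summed` are hypothetical, not part of the current API.

```python
def select_as_list(matrices, names):
    """Return the selected 2D matrices as a list, e.g. for iterating over skims."""
    return [matrices[name] for name in names]


def select_summed(matrices, names):
    """Return one 2D matrix holding the cell-wise sum, e.g. for trips."""
    selected = select_as_list(matrices, names)
    if not selected:
        raise ValueError("no matrices selected")
    rows = len(selected[0])
    cols = len(selected[0][0])
    return [
        [sum(m[r][c] for m in selected) for c in range(cols)]
        for r in range(rows)
    ]


matrices = {
    "am_trips": [[1, 2], [3, 4]],
    "md_trips": [[10, 20], [30, 40]],
}
tags = {"trips": ["am_trips", "md_trips"]}  # tag -> matrix names (assumed layout)

as_list = select_as_list(matrices, tags["trips"])
summed = select_summed(matrices, tags["trips"])
```

With numpy arrays, the list form could be stacked into a single 3D array by the caller when that is wanted, so returning a list does not rule out the 3D option.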

Need for list format

Hello everyone and congratulations for this great project.
Standardising formats will definitely help everyone in the transport industry.

I just wanted to raise a concern regarding the selected approach. Although the transport industry has been using and storing OD matrices in the "matrix format" (e.g. rows and columns), I believe that this is not the most efficient approach. In my view, and in that of quite a few other data analysts and programmers (e.g. https://vita.had.co.nz/papers/tidy-data.pdf ), storing data in a "list" or "database format" is more efficient. In this format all ODs could be stored in a single file, and the user would be able to make selections based on simple and standardised queries.
For instance:

Origin_Zone, Destination_Zone, Trip_Purpose, Time_Period, Trips
A, B, HBW, AM, 10
Z, X, HBO, IP, 12
...
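
For readers who want to see the two layouts side by side, the following sketch flattens a small square matrix into the list format above using only the standard library. The function name `matrix_to_records` and the sample values are made up for illustration.

```python
import csv
import io

FIELDS = ["Origin_Zone", "Destination_Zone", "Trip_Purpose", "Time_Period", "Trips"]


def matrix_to_records(matrix, zone_labels, **attributes):
    """Turn a square OD matrix into one record per origin/destination pair."""
    records = []
    for i, origin in enumerate(zone_labels):
        for j, destination in enumerate(zone_labels):
            record = {"Origin_Zone": origin, "Destination_Zone": destination}
            record.update(attributes)  # e.g. purpose and time period
            record["Trips"] = matrix[i][j]
            records.append(record)
    return records


records = matrix_to_records(
    [[0, 10], [12, 0]], ["A", "B"], Trip_Purpose="HBW", Time_Period="AM"
)

# Write the records as CSV text in the column order shown in the example.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
writer.writeheader()
writer.writerows(records)
csv_text = buffer.getvalue()
```

Note that the list form repeats the zone labels and attributes on every row, so a dense matrix takes more space this way; the trade-off is easier querying.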

I would really like to know the views of the development team regarding this comment.

Kind Regards

-Haris

samples/python-omx-sample.py is out of date

python-omx-sample.py fails around line 52 calling:

myfile.listAllTags() # ['am','hwy','md','trips','trn']

This function appears to have been removed some time ago.

I tried changing it to call myfile.listAllAttributes() instead but that is apparently not equivalent and furthermore it just fails a little further on.

alphabetically based storage order based on the table names

from the SATURN team

One suggestion for the next revision to the specification concerns the internal storage of the multiple tables within the OMX file. At the moment, they are stored alphabetically based on the table names, so their relative position changes depending on the user-defined title. This causes some problems in implementation for our software, as we use their level (i.e. 1st table, 2nd table, 3rd table) as the unique identifier, not the table name (with the latter only optional). To get round this, we export the OMX tables (via UFM2OMX) with a Level position ‘Lxx’ prefix to the user-defined title name. When importing (via OMX2UFM), the levels are simply defined as the order they appear (i.e. alphabetically) and hence require some post-import manipulation. The process works but it’s not as streamlined as it could be. Has anybody else flagged this as a weakness?
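
The prefix workaround described above can be sketched roughly like this. The exact prefix format used by UFM2OMX is not given in the issue, so the two-digit `L01_` style and both function names here are assumptions.

```python
def add_level_prefix(names_in_level_order):
    """Prefix each name with its level so alphabetical order equals level order.

    The 'L01_' format is an assumption; two digits cover up to 99 levels.
    """
    return [
        "L%02d_%s" % (level, name)
        for level, name in enumerate(names_in_level_order, start=1)
    ]


def strip_level_prefix(stored_names):
    """Sort the stored names (as the file would) and drop the prefix again."""
    return [name.split("_", 1)[1] for name in sorted(stored_names)]


levels = ["trucks", "cars", "buses"]             # intended level order
stored = add_level_prefix(levels)                # names as written to the file
recovered = strip_level_prefix(reversed(stored))  # order in the file is irrelevant
```

Without the prefix, sorting these names alphabetically would put "buses" first, which is exactly the reordering the issue describes.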

Python - default matrix TITLE attribute is some funny character that is causing problems

The default matrix TITLE attribute is some funny character that is causing problems. In order to read our matrices into the OMX Viewer and VISUM, we had to override the TITLE after the matrix was created. We added the line below that sets attrs.TITLE:

import os
import openmatrix as omx
outfile = omx.openFile(os.path.join(dir_data, 'transit_od_demand_new_' + per + '.omx'), 'w')
outfile['metro'] = od_matrix[i+1][0]
outfile.createMapping('taz', taz_labels)
outfile['metro'].attrs.TITLE = 'metro'
outfile.close()

No exceptions are being thrown

API lists some exception types, but none are implemented yet.

add test suite

We need test cases to round-trip validate the OMX APIs. We can use this for the OMX APIs on GitHub, as well as for commercial third-party applications. We may also set up continuous integration.

creates index in new h5 file even when it wasn't successfully created

Issue:

When you try to create a node in the OMX file and creation fails, the partially created node is not deleted. We need to make it fail gracefully and delete the node if creation doesn't finish.

Traceback:

>>> outFile["test1"]=m1in
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "omx\File.py", line 178, in __setitem__
    return self.createMatrix(key, atom, shape, obj=dataset)
  File "omx\File.py", line 33, in createMatrix
    chunkshape, byteorder, createparents)
  File "C:\Python27\Lib\site-packages\tables\file.py", line 831, in createCArray
    chunkshape=chunkshape, byteorder=byteorder)
  File "C:\Python27\Lib\site-packages\tables\carray.py", line 203, in __init__
    byteorder, _log)
  File "C:\Python27\Lib\site-packages\tables\leaf.py", line 263, in __init__
    super(Leaf, self).__init__(parentNode, name, _log)
  File "C:\Python27\Lib\site-packages\tables\node.py", line 241, in __init__
    parentNode._g_refNode(self, name, validate)
  File "C:\Python27\Lib\site-packages\tables\group.py", line 494, in _g_refNode
    % (self._v_pathname, childName))
tables.exceptions.NodeError: group ``/`` already has a child node named ``test1``
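
A common pattern for this kind of cleanup is to remove the half-created node in an exception handler and then re-raise. The sketch below uses a dict-backed stand-in instead of a real HDF5 file, so `FakeFile` and `create_matrix` are hypothetical and do not reflect the PyTables API.

```python
class FakeFile:
    """Dict-backed stand-in for an HDF5 file; not the PyTables API."""

    def __init__(self):
        self.nodes = {}

    def create_matrix(self, name, data):
        if name in self.nodes:
            raise KeyError("node already exists: " + name)
        self.nodes[name] = None  # the node exists before any data is written
        try:
            # Copying the rows fails with TypeError if data is not iterable.
            self.nodes[name] = [list(row) for row in data]
        except Exception:
            del self.nodes[name]  # roll back the half-created node
            raise


f = FakeFile()
try:
    f.create_matrix("test1", 42)  # not a matrix, so the write step fails
    first_attempt_failed = False
except TypeError:
    first_attempt_failed = True

# The retry works because the failed attempt left nothing behind.
f.create_matrix("test1", [[1.0, 2.0]])
```

Re-raising after the cleanup keeps the original error visible to the caller instead of hiding it behind a later "already has a child node" failure.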

OMX viewer

The AequilibraE plugin for QGIS also has an OMX viewer, which might be a good alternative.
I have not found a way to issue a pull request for the Wiki content on this repo, so that is why I am filing this issue.

The description of the capability and how to get it to work is documented here: https://www.xl-optim.com/displaying-omx-matrix-in-qgis/

Check in dll and solution file from C# patch

Hi @mmilkovits, Can you check in your solution file on your branch and the resulting dll from the latest patch you made? Then I can pull it into the main code.

not opening my h5 file :-( "the file is not writable"

Issue Part 1: the call to pytables to open the h5 file returns an error saying it isn't writable (FileModeError), even though the file has no permission restrictions and isn't read-only.

Issue Part 2: why are we requiring write privileges if we are opening the file as read-only?

testTA.py:

import omx,numpy

TEST_FILE = r"testData/sfptam.h5"

myfile = omx.openFile(TEST_FILE, 'r')

Traceback:

C:\work\omx>python testTA.py
Traceback (most recent call last):
  File "testTA.py", line 5, in <module>
    myfile = omx.openFile(TEST_FILE, 'r')
  File "C:\work\omx\omx\__init__.py", line 21, in openFile
    f.root._v_attrs['omx_version'] = __version__
  File "C:\Python27\Lib\site-packages\tables\attributeset.py", line 526, in __setitem__
    self.__setattr__(name, value)
  File "C:\Python27\Lib\site-packages\tables\attributeset.py", line 427, in __setattr__
    nodeFile._checkWritable()
  File "C:\Python27\Lib\site-packages\tables\file.py", line 1620, in _checkWritable
    raise FileModeError("the file is not writable")
tables.exceptions.FileModeError: the file is not writable
Closing remaining open files: testData/sfptam.h5... done

C:\work\omx>python testTA.py > notwrited.log notwrite.log
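
The traceback shows the version attribute being written inside openFile() even in read mode. A minimal sketch of one possible fix, assuming the attribute only needs stamping when the file is opened for writing, is shown below. `FakeHandle` and `open_file` are stand-ins for illustration, not the real omx or PyTables code, and the version string is a placeholder.

```python
OMX_VERSION = "0.2"  # placeholder value for illustration


class FakeHandle:
    """Stand-in for an open file that refuses attribute writes in 'r' mode."""

    def __init__(self, mode):
        self.mode = mode
        self.attrs = {}

    def set_attr(self, name, value):
        if self.mode == "r":
            raise IOError("the file is not writable")
        self.attrs[name] = value


def open_file(mode="r"):
    """Open a file and stamp the version only when writing is allowed."""
    handle = FakeHandle(mode)
    if mode != "r":
        handle.set_attr("omx_version", OMX_VERSION)
    return handle


read_handle = open_file("r")   # no attribute write, so no error
write_handle = open_file("w")  # version attribute is stamped
```

Whether append mode should overwrite an existing version attribute is a separate design question that this sketch does not address.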

Should we do separate repositories for the different APIs?

@e-lo asked:

Where does API / Examples / Gists live?

Preferably in separate repositories, tagged with OMX version #. Currently the APIs of all code are in one place and move forward in commits together. I personally find this messy.

What do you all think about having:
• OMX.r
• OMX.py
• OMX.cpp
• OMX.java
• etc...
all as separate repositories? There are good and bad things about this. The main issue I see with the current setup is that one API moves forward without volunteers to move the other APIs forward, and we end up with stale parts of the repo. The major disadvantage I see in splitting is losing consolidated issue tracking/version control.

Python API - bug caused by compatibility issues with the tables module

I got this error using the method createMapping() from the Python API:

--> 172         mymap = self.createArray(self.root.lookup, title, atom=tables.UInt32Atom(),
AttributeError: 'File' object has no attribute 'createArray'
It turns out that in my "tables" module (version 3.3.0) the method createArray() is called create_array().

I fixed this bug locally by manually changing one line in the createMapping(self, title, entries, overwrite=False) method of the File class.

The original code was:

# Write the mapping!
mymap = self.createArray(self.root.lookup, title, atom=tables.UInt32Atom(),
shape=(len(entries),) )

I replaced it by:

# Write the mapping!
mymap = self.create_array(self.root.lookup, title, atom=tables.UInt32Atom(),
shape=(len(entries),) )

For information, I am using Python 2.7.3, and the openmatrix version 0.2

Indexing support needs to be written

Transcad-style indexing is not yet supported. Design details:

Where & how should indices be stored? Attributes vs. tables.
What's the API for accessing data via an index?