glotzerlab / garnett

Collection of file parsers and writers for particle trajectory formats used by the Glotzer Group.

Home Page: https://garnett.readthedocs.io/

License: BSD 3-Clause "New" or "Revised" License

Python 98.68% Shell 0.27% Cython 1.05%

garnett's People

Contributors

bdice, csadorf, djulia, erteich, gaofy95, harperic, j-proc, jamesaan, jdaaph, klarh, klywang, lyrivera, syjlee, tcmoore3, vyasr, zinebbe

garnett's Issues

Add ellipsoid support

Of the shape classes supported by GSD state data storage, the ellipsoid seems to be the only one not yet supported here.

Creating Frames in Python

Original report by Kyle Pettibone (Bitbucket: kjpett).


This would enable creation of glotzformats.trajectory.Frame objects without a file backing them.

Example use:

frame = glotzformats.trajectory.Frame()

frame.positions = [[0, 0, 0], [1, 0, 0]]
frame.orientations = [[1, 0, 0, 0], [1, 0, 0, 0]]

It might also be nice to add the ability to add a frame to a currently open writer (instead of having to create an entire glotzformats.trajectory.Trajectory object) without having to rewrite the entire trajectory.

CifFile Flexibility

Original report by JamesP (Bitbucket: jproc, GitHub: j-proc).


We currently only look at the "_atom_site_label" field to get our type information. CIF files may contain more relevant data that could be exposed either on demand or as a backup if these labels aren't found.

There is also a "_atom_site_type_symbol" in the cif specifications that I've noticed is used pretty commonly. Spec details here.

Also, note that CIF2.0 exists. I don't think we need to switch to it; however, if it becomes popular, we might want to be able to read it too. More info.
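
The fallback suggested above could look roughly like the following. This is only a sketch: `site_columns` is a hypothetical dict mapping CIF data names to already-parsed column values, and `particle_types` is an illustrative helper, not a function of the library.

```python
def particle_types(site_columns):
    """Pick per-site type names from parsed CIF atom-site columns.

    `site_columns` is a hypothetical dict: CIF data name -> list of strings.
    """
    # Keep the current behaviour first, then fall back to the type symbol.
    for key in ("_atom_site_label", "_atom_site_type_symbol"):
        values = site_columns.get(key)
        if values:
            return list(values)
    raise KeyError("no atom-site label or type symbol column found")


with_labels = {
    "_atom_site_label": ["Na1", "Cl1"],
    "_atom_site_type_symbol": ["Na", "Cl"],
}
symbols_only = {"_atom_site_type_symbol": ["Na", "Cl"]}
print(particle_types(with_labels))
print(particle_types(symbols_only))
```

Whether the label or the type symbol should take priority is a design decision; the order of the tuple is the only thing that would need to change.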

Sanitize inputs to writers

Original report by Vyas Ramasubramani (Bitbucket: vramasub, GitHub: vyasr).


Currently both the positions (and the cif_coordinates for CIF files) can be written arbitrarily with no checks. We may want to sanitize these inputs further (See Pull Request #36). However, there are also potentially legitimate use cases for making seemingly arbitrary changes, so we should think carefully about what restrictions to place on modification.
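
As a starting point for discussion, a deliberately conservative check might only enforce shape and finiteness. The sketch below is an assumption about what "sanitizing" could mean here; `validate_positions` is a hypothetical helper and does not reflect what Pull Request #36 implements.

```python
import math


def validate_positions(positions):
    """Return positions as a list of 3-tuples of floats, or raise ValueError."""
    checked = []
    for index, row in enumerate(positions):
        values = tuple(float(v) for v in row)
        if len(values) != 3:
            raise ValueError(
                f"position {index} has {len(values)} components, expected 3")
        if not all(math.isfinite(v) for v in values):
            raise ValueError(f"position {index} contains a non-finite value")
        checked.append(values)
    return checked


print(validate_positions([[0, 0, 0], [1, 0, 0]]))
```

Checks like these leave the values themselves untouched, so legitimate but unusual modifications would still be possible.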

Use custom numpy ndarrays as trajectory objects

Original report by Eric Harper (Bitbucket: harperic, GitHub: harperic).


I think it would make sense to create trajectory objects as numpy.ndarrays with a custom dtype:

dtype=[("position", np.float32, (3,)),
("orientation", np.float32, (4,)),
("box", [("Lx", np.float32), ...]), etc.]

This would let users access the data in more standard, Pythonic ways. I haven't thought through all the potential downsides; one is that a number of copy operations may be needed to ensure the data is contiguous when feeding it into freud.
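
To make the trade-off concrete, here is a minimal sketch using only NumPy. The field names follow the proposal above; the array size and the name `particle_dtype` are illustrative assumptions.

```python
import numpy as np

# Structured dtype with one record per particle (box omitted for brevity).
particle_dtype = np.dtype([
    ("position", np.float32, (3,)),
    ("orientation", np.float32, (4,)),
])

particles = np.zeros(5, dtype=particle_dtype)
particles["orientation"][:, 0] = 1.0  # identity quaternion for every particle

# Field access returns a strided view into the records, not a packed array.
positions = particles["position"]
print(positions.shape)
print(positions.flags["C_CONTIGUOUS"])

# Passing the data to code that needs contiguous memory requires a copy.
packed = np.ascontiguousarray(positions)
print(packed.flags["C_CONTIGUOUS"])
```

Because positions and orientations are interleaved in memory, every per-field array handed to a library that expects contiguous input would incur this copy.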

Refactor posfilereader._parse_shape_definition

Original report by Vyas Ramasubramani (Bitbucket: vramasub, GitHub: vyasr).


The code for this function is convoluted and should be cleaned up. Copying over my comments from Pull Request #33:

However, the design of the whole _parse_shape_definition function seems a little strange to me. Is wanting to place all of the return calls together at the end sufficient reason to justify the odd fragmentation of code here? Specifically, I'm referring to two oddities: 1) the same set of conditionals gets checked twice, once for actual processing and once for the return; and 2) the code uses a try-catch block to address a fallback that is dealt with entirely internally and therefore could just be in the else clause. The only reason for these choices (aside from trying to accumulate all of the return statements) seems to be in order to deal with the color, and I would prefer to duplicate that line of code. Am I missing something?

Crash when converting GSD to pos file with varying number of particles

Original report by Jens Glaser (Bitbucket: jens_glaser, GitHub: jglaser).


I have a .gsd file with a varying N, which I want to convert to .pos for visualization.

This is what happens when I run convert.py

$ python convert.py solute_two_component_AB_True_SB_True_AS_True_z5.000.pos solute_two_component_AB_True_SB_True_AS_True_z5.000_all.gsd 
/Users/jglaser/Library/Python/3.4/lib/python/site-packages/glotzformats-0.3.2-py3.4-macosx-10.10-x86_64.egg/glotzformats/reader.py:17: UserWarning: Mocking GetarFileReader, gtar package not available.
  "Mocking GetarFileReader, gtar package not available.")
Traceback (most recent call last):
  File "convert.py", line 16, in <module>
    pos_writer.write(traj, out)
  File "/Users/jglaser/Library/Python/3.4/lib/python/site-packages/glotzformats-0.3.2-py3.4-macosx-10.10-x86_64.egg/glotzformats/posfilewriter.py", line 90, in write
    _write(' '.join((str(_num(v)) for v in pos)))
  File "/Users/jglaser/Library/Python/3.4/lib/python/site-packages/glotzformats-0.3.2-py3.4-macosx-10.10-x86_64.egg/glotzformats/posfilewriter.py", line 90, in <genexpr>
    _write(' '.join((str(_num(v)) for v in pos)))
  File "/Users/jglaser/Library/Python/3.4/lib/python/site-packages/glotzformats-0.3.2-py3.4-macosx-10.10-x86_64.egg/glotzformats/posfilewriter.py", line 30, in _num
    return int(x) if int(x) == x else round(float(x), POSFILE_FLOAT_DIGITS)
ValueError: cannot convert float NaN to integer
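
The last traceback entry shows `int(x)` being applied to a NaN. A NaN-tolerant variant of that helper might look like the sketch below. This is an assumption about one possible fix, not the fix adopted by the project: the value of `POSFILE_FLOAT_DIGITS` is a placeholder, and passing NaN through unchanged is just one option (skipping such particles would be another).

```python
import math

POSFILE_FLOAT_DIGITS = 11  # placeholder; the real constant lives in posfilewriter.py


def num(x):
    """Format-friendly number: int when integral, rounded float otherwise."""
    x = float(x)
    if not math.isfinite(x):
        # NaN and infinities cannot be converted to int; hand them back as-is.
        return x
    return int(x) if x.is_integer() else round(x, POSFILE_FLOAT_DIGITS)


print(num(2.0), num(0.25), num(float("nan")))
```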

Docs won't build with cython >=0.29.0

I was having issues building the docs, which I've finally tracked down. The failures were with the dcdreader module, which is built with Cython. The error message below only appears when I try to build the documentation with sphinx. The package builds and installs just fine without any warnings. I could import the dcdreader class in a Python shell even though Sphinx failed to import that module. I did not try to execute the tests.

There seems to be a bug with Cython >=0.29.0 (the latest version is 0.29.6). The errors are introduced by the changes in this line, specifically the check_size ignore part: https://github.com/cython/cython/blame/477c152e4004b4bf581e1a05f5b24bd56dea87db/Cython/Includes/numpy/__init__.pxd#L206

This error message was repeated many times, for every attempt to import a class for Sphinx to extract docstrings.

WARNING: autodoc: failed to import class 'formats.FileFormat' from module 'glotzformats'; the following exception was raised:
Traceback (most recent call last):
  File "/.../lib/python3.7/site-packages/sphinx/ext/autodoc/importer.py", line 232, in import_module
    __import__(modname)
  File "/.../glotzformats/glotzformats/__init__.py", line 7, in <module>
    from . import reader
  File "/.../glotzformats/glotzformats/reader.py", line 39, in <module>
    from . import dcdreader
  File "__init__.pxd", line 206, in init glotzformats.dcdreader
TypeError: numpy.dtype is not a type object

By using Cython 0.28.5, I was able to fix this problem and build the docs.

GSD Reader with additional shape frame is not working

Original report by Vyas Ramasubramani (Bitbucket: vramasub, GitHub: vyasr).


Something on develop appears to have broken the ability to do gsd_reader.read(gsd_file, frame_with_shape). On the vislab, using the installed version (which I assume is master), I get the expected behavior, but using the current develop branch I just get spheres out. We should be able to identify the problem using git bisect; I imagine that it's related to the changes in the _parse_shape_definitions function that was added (to support the shape information contained in GSD files).

Review/update examples

I haven't looked at the examples in detail, but it wouldn't hurt to review them, update them if necessary, and possibly add new ones.

Add orientable property to sphere shapes

The orientable flag enables spherical particles to have orientations, which is necessary when particles are interacting via anisotropic pair potentials (e.g. patchy particles). This was first added to the HOOMD HPMC sphere integrator in version 2.3, and recently included in version 1.3 of the HOOMD Schema of the GSD package as part of the state information. Although not strictly necessary at the moment, I think it should be included for completeness, especially since it's a simple implementation.

Unit tests missing assert statements

The test_sphere and test_convex_polyhedron unit tests of the HPMCPosFileReaderTest are missing assert statements. Also, the test_convex_polyhedron test uses the convex_polygon integrator.

Update README

Original report by me.


The credits and maintainer contact info should be updated on the README.

Also, the testing information at the bottom of the README says to use hoomd -m unittest..., which is from HOOMD v1. This should only list the other two methods, python -m unittest discover tests and nosetests.

Interpret the rotation keyword

Original report by Carl Simon Adorf (Bitbucket: csadorf, GitHub: csadorf).


When rotating a frame in injavis and then storing it to a pos-file, injavis will not actually rotate the box matrix, but insert a rotation keyword into the pos-file.

This keyword is currently ignored. It would be better if glotzformats could interpret it and rotate the box matrix accordingly. (opt-in?)

Next steps

  1. How is the rotation keyword defined?
  2. Implement an (opt-in) way to rotate the box matrix when reading the file.
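
Step 2 can be sketched independently of step 1. Since the definition of the keyword is still open, the example below assumes, purely for illustration, that the rotation is available as a unit quaternion (w, x, y, z) and that the box matrix stores the box vectors as columns. Both function names are hypothetical.

```python
import math


def quaternion_to_matrix(q):
    """Rotation matrix of a unit quaternion given as (w, x, y, z)."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def rotate_box(box, q):
    """Left-multiply a 3x3 box matrix (box vectors as columns) by the rotation."""
    rot = quaternion_to_matrix(q)
    return [[sum(rot[i][k] * box[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


# Quarter turn about the z axis applied to an orthorhombic box.
quarter_turn_z = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
box = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 4.0]]
rotated = rotate_box(box, quarter_turn_z)
for row in rotated:
    print([round(v, 6) for v in row])
```

Particle positions and orientations would need the same rotation applied for the frame to stay consistent, which is one reason an opt-in flag seems sensible.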

Position array returned by XMLDCD file reader is not contiguous

Original report by Mayank Agrawal (Bitbucket: amayank, GitHub: amayank).


Computing the RDF of positions extracted through the XMLDCD file reader with freud's rdf module produces an incorrect RDF plot.
Positions were read as pos = np.copy(traj[0].positions)
Wrapping these positions as new_pos = np.ascontiguousarray(pos, dtype=np.float32) solved the problem, and the RDF plot was correctly reproduced.

This wrapping can be performed internally by glotzformats.
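
A sketch of what such an internal wrapper could do, using only NumPy. The helper name `as_contiguous_positions` is hypothetical, and the transposed array merely stands in for whatever non-contiguous layout the reader produces.

```python
import numpy as np


def as_contiguous_positions(positions):
    """Return positions as a C-contiguous float32 array, copying only if needed."""
    return np.ascontiguousarray(positions, dtype=np.float32)


# A transposed array has shape (N, 3) but is not C-contiguous.
raw = np.arange(12, dtype=np.float64).reshape(3, 4).T
print(raw.flags["C_CONTIGUOUS"])

fixed = as_contiguous_positions(raw)
print(fixed.flags["C_CONTIGUOUS"], fixed.dtype)

# Data that already has the right layout and dtype is passed through without a copy.
print(np.shares_memory(fixed, as_contiguous_positions(fixed)))
```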

Document how to operate on open/closed files

Original report by Carl Simon Adorf (Bitbucket: csadorf, GitHub: csadorf).


The glotzformats reader implementation philosophy is to not load unsolicited data into memory whenever possible. This is important because otherwise it would not be feasible to work with trajectory data containing millions of particles per frame.
Readers are stateful to the extent that they keep an inventory of the currently read trajectory, such as the number of frames and the individual frame positions within the file, to allow for fast random access, but otherwise will try to operate on the file directly as much as possible.

This philosophy and the technical requirements need to be better documented, since users commonly run into issues where they attempt to operate on closed files.
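
The pitfall can be illustrated with a toy example that uses only the standard library. `LazyFrame` is a hypothetical stand-in for a frame that remembers where its data lives instead of holding the data; it is not the library's class.

```python
import io


class LazyFrame:
    """Toy frame that stores a stream and an offset rather than the data."""

    def __init__(self, stream, offset):
        self._stream = stream
        self._offset = offset

    def load(self):
        # Seeking raises ValueError once the underlying stream is closed.
        self._stream.seek(self._offset)
        return self._stream.readline().strip()


stream = io.StringIO("frame-0 data\nframe-1 data\n")
frame = LazyFrame(stream, offset=0)

inside = frame.load()  # works: the stream is still open
stream.close()

try:
    frame.load()
    failed = False
except ValueError:
    failed = True  # accessing the frame after closing the file fails

print(inside, failed)
```

The practical consequence is that all frame access has to happen while the file is open, for example inside the `with` block, or the needed data has to be loaded explicitly before the file is closed.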

Make glotzformats open-source

Original report by Carl Simon Adorf (Bitbucket: csadorf, GitHub: csadorf).


Many Glotzer group members' workflows rely heavily on open-source components developed inside and outside of this group. However, one integral component, this suite of readers/writers and format definitions, is still closed-source. This makes it harder for other people to reproduce our protocols. It would therefore make sense to make this package available to the public as well.

Things that would need to be resolved in preparation for this:

Done:

  • Contributor Agreement (resolved by #85 #86)

To-do:

  • License (resolved by #87)
  • Rebranding (resolved by #97)
  • Deployment via PyPI (after making repository public)

On-demand reading of frame data

Original report by Carl Simon Adorf (Bitbucket: csadorf, GitHub: csadorf).


The trajectory data frames are currently loaded into memory for simplicity. This is not feasible for very large trajectories.

Proposed solution:

  1. Indexing of the trajectory file
  2. Reading on demand
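
The two steps can be sketched on a toy text format. Everything here is an illustrative assumption: the `#frame` marker, the function names, and the line-based layout do not correspond to any format the package reads.

```python
import io


def index_frames(stream, marker=b"#frame"):
    """Step 1: record the byte offset of every frame header in one pass."""
    offsets = []
    stream.seek(0)
    while True:
        position = stream.tell()
        line = stream.readline()
        if not line:
            break
        if line.startswith(marker):
            offsets.append(position)
    return offsets


def read_frame(stream, offsets, index):
    """Step 2: seek to one frame and parse only its rows."""
    index = range(len(offsets))[index]  # normalizes negative indices
    end = offsets[index + 1] if index + 1 < len(offsets) else None
    stream.seek(offsets[index])
    stream.readline()  # skip the header line
    rows = []
    while end is None or stream.tell() < end:
        line = stream.readline()
        if not line:
            break
        rows.append([float(v) for v in line.decode().split()])
    return rows


data = b"#frame\n0 0 0\n1 0 0\n#frame\n0 0 1\n"
stream = io.BytesIO(data)
offsets = index_frames(stream)
print(offsets)
print(read_frame(stream, offsets, -1))
```

Only the list of offsets is kept in memory, so the cost of opening a trajectory scales with the number of frames rather than with the number of particles.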

Box regularization

Original report by Vyas Ramasubramani (Bitbucket: vramasub, GitHub: vyasr).


The box regularization currently built in for reading pos files doesn't quite work. While we can handle certain simple cases where there are, e.g., negatives on the box diagonal, the more general case of a box matrix that is non-orthorhombic (or perhaps not even upper triangular) is not handled.

I actually don't think that this is something that we can do generally. Although it is relatively simple to transform the box vectors into an upper triangular basis, I don't think that we can commensurately transform all particle positions in a way that would generate the identical periodic structure. While the system might appear the same when only viewing one copy of the box, once the box is replicated across PBs this can fail very easily.

I propose that we enumerate certain simple cases (like chirality) that are solvable, and for the rest we simply include warnings that we cannot translate this box into an equivalent upper triangular system.
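
One way to read this proposal is a classification step that detects the easy cases and warns about the rest. The sketch below makes that concrete under stated assumptions: the box is a 3x3 nested list, handedness is judged by the sign of the determinant, and `inspect_box` is a hypothetical name.

```python
import warnings


def determinant(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def is_upper_triangular(m, tol=1e-12):
    """True when every entry below the diagonal is (numerically) zero."""
    return all(abs(m[i][j]) <= tol for i in range(3) for j in range(i))


def inspect_box(m):
    """Report handedness and triangularity; warn when no fix is attempted."""
    det = determinant(m)
    if det == 0:
        raise ValueError("box matrix is singular")
    upper = is_upper_triangular(m)
    if not upper:
        warnings.warn("box matrix is not upper triangular; "
                      "no equivalent upper-triangular box is attempted")
    return {"left_handed": det < 0, "upper_triangular": upper}


mirrored = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]
print(inspect_box(mirrored))
```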

Getar 2D

Original report by JamesP (Bitbucket: jproc, GitHub: j-proc).


The Getar reader recognizes 2D files as 3D. Minimal reproduction below:

import hoomd
from hoomd import *
from glotzformats.reader import GetarFileReader

context.initialize()

system=init.create_lattice(unitcell=lattice.sq(a=1.0),n=[4,4])
getar=dump.getar.immediate('ge.tar', static=['viz_static','global_all'], dynamic=['viz_dynamic'])
print(system.box)

reader = GetarFileReader()
with open('ge.tar', 'rb') as file:
    traj = reader.read(file)


frame=traj[-1]
print(frame.box)


Frame properties revisions

Original report by me.


A few to-do items to clean up frame property handling:

  • The check for equality between frames needs to properly treat None.
  • The property angmom should be treated the same for 3D and 2D systems.
  • Format readers should not assign default values to properties that are not present in the original file (assign None instead).
  • Format writers should check if a property is None before attempting to write it.
  • The image property (and possibly others) should be supported.
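
Two of these items, the None-aware equality check and the writer-side guard, can be sketched as small helpers. Both function names are hypothetical and the output line format is made up for the example.

```python
import numpy as np


def properties_equal(a, b):
    """Compare two frame properties, treating None as 'not present'."""
    if a is None or b is None:
        return a is None and b is None
    return np.array_equal(np.asarray(a), np.asarray(b))


def write_property(lines, name, value):
    """Append a line for `name` only when the frame actually has a value."""
    if value is None:
        return
    lines.append(f"{name} {np.asarray(value).tolist()}")


print(properties_equal(None, None))
print(properties_equal(None, [[0, 0, 0]]))

lines = []
write_property(lines, "velocity", None)
write_property(lines, "position", [[0, 0, 0]])
print(lines)
```

Checking for None before comparing avoids the ambiguous element-wise result that `==` produces when one side is an array.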

Can't read GSD file velocities

Original report by Chengyu Dai (Bitbucket: daich, GitHub: jdaaph).


Using the attached file, I can't read the velocities with the following code; it only returns zeros:

from glotzformats.reader import GSDHOOMDFileReader
reader = GSDHOOMDFileReader()
with open("trajectory.gsd", 'rb') as f:
    traj = reader.read(f)
    # Access the frame while the file is still open.
    print(traj[19].velocities)

Add unit test to ensure AttributeError is raised when frame attributes are None

Following the discussion on #39, unit tests are required to ensure AttributeError is properly raised when expected. I have already implemented some of these unit tests, for trajectories read from dcd and pos files, in the frame-default-fix branch, but these need to be thoroughly checked. Unit tests are still missing for the other supported file formats.
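
The shape of such a test can be sketched with a stand-in class. `ToyFrame` is hypothetical and only mimics the behaviour under discussion; the real tests would read frames from files in each supported format.

```python
import unittest


class ToyFrame:
    """Stand-in frame whose property raises when no data was read."""

    def __init__(self, velocities=None):
        self._velocities = velocities

    @property
    def velocities(self):
        if self._velocities is None:
            raise AttributeError("velocities not available for this frame")
        return self._velocities


class FramePropertyTest(unittest.TestCase):
    def test_missing_property_raises(self):
        with self.assertRaises(AttributeError):
            ToyFrame().velocities

    def test_present_property_is_returned(self):
        frame = ToyFrame(velocities=[[0.0, 0.0, 0.0]])
        self.assertEqual(frame.velocities, [[0.0, 0.0, 0.0]])


def run_tests():
    """Run the test case programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(FramePropertyTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_tests()
print(result.wasSuccessful())
```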
