rockdoc / cdlparser
A python module for parsing files encoded in netCDF's common data form language (CDL)
License: BSD 3-Clause "New" or "Revised" License
Docstrings are required for the main public methods, especially for describing the optional keyword arguments recognised by the __init__ methods.
Keep working in Python 2.
Need to add code to correctly assign data arrays to those variables which reference an unlimited dimension.
You might consider switching to use my 'bison.py' parser skeleton.
(https://github.com/Unidata/bison.py)
If you did, then you could directly use the ncgen.y grammar
to parse full netcdf-4 CDL.
The t_IDENT method finishes up with the line
return(t)
which looks like a copy-and-paste mis-cue. Works fine, but looks odd.
Better exception handling is required in a number of places, particularly the p_xxx methods which construct the in-memory netCDF structures.
Add a parse_data() method to the CDLParser base class so that client code can pass in CDL input text as a plain string. The existing parse_file() method can then simply call the proposed new method having read in the text from the CDL file.
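The delegation proposed here could look roughly like the sketch below. SketchParser is a hypothetical stand-in, not the module's CDLParser class, and its parse_data() body is a placeholder that only collects non-blank lines where the real method would run the PLY lexer and parser.

```python
class SketchParser(object):
    """Hypothetical stand-in for the CDLParser base class."""

    def parse_data(self, cdltext):
        # Placeholder for the real parsing step: the actual method would
        # hand cdltext to the PLY parser. Here we just keep non-blank lines.
        return [line for line in cdltext.splitlines() if line.strip()]

    def parse_file(self, cdlfile):
        # Read the whole file, then delegate to the string-based entry point
        # so that both code paths share one implementation.
        with open(cdlfile) as fh:
            return self.parse_data(fh.read())
```

With this layout, client code that already holds CDL text in memory calls parse_data() directly, and parse_file() stays a thin wrapper around it.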
It would be nice to support general-purpose stream input, but I don't think the underlying PLY parser supports that approach - need to check this.
To keep the TDD police happy :-)
Adding an extra class to handle parsing of CDL files that adhere to the netCDF-4 syntax is a longer term goal. The CDL4 grammar is significantly more complex than that used in CDL3, so it's a non-trivial task.
When the _FillValue attribute is specified for a variable, we need to assign a numeric value of the correct type to the corresponding netcdf variable.
For variables defined in the DATA section, check that the specified variable has been declared earlier in the VARIABLES section. If not, raise an exception.
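A minimal sketch of such a check follows. It assumes the parser keeps declared variables in a dictionary keyed by name. Both check_declared and UndeclaredVariableError are hypothetical names used for illustration, and the module's own exception classes may be a better fit.

```python
class UndeclaredVariableError(Exception):
    """Hypothetical exception type used only in this sketch."""


def check_declared(varname, declared_vars):
    """Return the declared variable, or raise if varname was never declared.

    declared_vars is assumed to be a dict mapping variable names to
    whatever object the parser created in the variables section.
    """
    if varname not in declared_vars:
        raise UndeclaredVariableError(
            "Variable '%s' appears in the data section but was not "
            "declared in the variables section" % varname)
    return declared_vars[varname]
```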
For multidimensional variables, any data arrays defined in the DATA section need to be suitably shaped before assigning to netCDF4 variable instances.
In the case of record variables which contain fill values (using '_' in the data block), an error is thrown if the specification of the data for the unlimited dimension appears after the record variable in the data block. If the unlimited dimension data precedes the record variable in the data block then all is well.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "cdlparser.py", line 177, in parse_file
    return self.parse_text(data, ncfile=ncfile)
  File "cdlparser.py", line 204, in parse_text
    self.parser.parse(input=cdltext, lexer=self.lexer)
  File "/Users/phil/lib/python/ply/yacc.py", line 265, in parse
    return self.parseopt_notrack(input,lexer,debug,tracking,tokenfunc)
  File "/Users/phil/lib/python/ply/yacc.py", line 971, in parseopt_notrack
    p.callable(pslice)
  File "cdlparser.py", line 639, in p_datadecl
    self.write_var_data(var, arr)
  File "cdlparser.py", line 812, in write_var_data
    raise CDLContentError(errmsg)
cdlparser.CDLContentError: Error attempting to write data array for variable tas
Exception details are as follows:
could not convert string to float: _
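The last line of the traceback shows the '_' token reaching a string-to-float conversion. One general way to guard against that is to substitute the fill value before converting, as sketched below. This is not a diagnosis of the order-dependent behaviour described above. tokens_to_floats is a hypothetical helper, and it assumes the data values arrive as a list of string tokens.

```python
def tokens_to_floats(tokens, fill_value):
    """Convert CDL data tokens to floats, mapping '_' to fill_value.

    tokens is assumed to be a list of strings taken from a data block;
    fill_value is the number to use wherever the CDL text contains '_'.
    """
    return [fill_value if tok == '_' else float(tok) for tok in tokens]
```

For example, tokens_to_floats(['1.5', '_', '3'], 1e20) yields a list in which the middle element is the fill value, with no conversion attempted on the '_' string.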
If a global attribute is defined ahead of the dimensions block, as shown in the CDL snippet below, then a CDLSyntaxError exception is raised. Although it's rare in practice, this is valid CDL, as evinced by the fact that this CDL example is handled correctly by the ncgen utility.
netcdf unusual_order {
// global attributes
    :comment = "blah blah" ;
dimensions:
    x = 10 ;
    y = 20 ;
    ...
}
Use netCDF4 module's experimental diskless mode to enable instantiation of the resulting netCDF dataset in memory only, without persisting to file on disk. Requires netcdf 4.2.
The cdlparser module is currently versioned using the 'version' attribute, which has to be updated manually, and is something that is easy to overlook. Can we have git automatically increment the version number? Or at least some kind of build number?
Use
super(CDL3Parser, self).__init__(**kw)
in preference to
CDL3Parser.__init__(self, **kw)
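As an illustration of the preferred form, the sketch below uses two hypothetical classes, BaseParser and DerivedParser, rather than the module's own. The two-argument super() call works in Python 2 (for new-style classes) as well as Python 3.

```python
class BaseParser(object):
    """Hypothetical base class that stores its keyword options."""

    def __init__(self, **kw):
        self.options = dict(kw)


class DerivedParser(BaseParser):
    """Hypothetical subclass that forwards keyword options to its parent."""

    def __init__(self, **kw):
        # super() resolves the parent through the method resolution order,
        # so the base class name is not hard-coded in the subclass.
        super(DerivedParser, self).__init__(**kw)
        self.dialect = 'CDL3'
```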
cldparser -> cdlparser
Char-valued variables defined in the data section are not being written out correctly, due to misinterpretation of the var.size attribute.
A dry-run option would be useful for those situations where, for example, you want to check the syntax of a CDL file without actually creating a netCDF output file.