testgroup-bnl / csvqa
Tools for QA checks, fixes and post-processing for CSV files
License: GNU General Public License v3.0
2021-01-08 12:12:20,944 [INFO]: [Chamber 1 - Blue_Min1.dat] Checking for duplicates in ['TIMESTAMP', 'RECORD']...
2021-01-08 12:12:21,016 [INFO]: [Chamber 1 - Blue_Min1.dat] Found 8216 duplicates
2021-01-08 12:12:21,250 [INFO]: [Chamber 1 - Blue_Min1.dat] Dropped duplicates, kept last.
2021-01-08 12:12:21,250 [INFO]: [Chamber 1 - Blue_Min1.dat] Checking for duplicates in TIMESTAMP...
2021-01-08 12:12:21,281 [INFO]: [Chamber 1 - Blue_Min1.dat] No duplicates found.
2021-01-08 12:12:21,281 [INFO]: [Chamber 1 - Blue_Min1.dat] Checking record order...
2021-01-08 12:12:21,281 [INFO]: [Chamber 1 - Blue_Min1.dat] No out of order timestamps found.
2021-01-08 12:12:21,281 [INFO]: [Chamber 1 - Blue_Min1.dat] Checking for missings rows...
2021-01-08 12:12:21,547 [INFO]: [Chamber 1 - Blue_Min1.dat] Filled 384815 missing rows
2021-01-08 12:12:52,828 [INFO]: [Chamber 1 - Blue_Min1.dat] Loading ../Raw\Chamber 1 - Blue_Min1_alt.dat...
2021-01-08 12:12:52,828 [WARNING]: [Chamber 1 - Blue_Min1_alt.dat] Problem while loading Chamber 1 - Blue_Min1_alt.dat: [Errno 2] No such file or directory: '../Raw\Chamber 1 - Blue_Min1_alt.dat'
2021-01-08 12:12:52,828 [INFO]: [Chamber 1 - Blue_Min1_alt.dat] Checking for duplicates in TIMESTAMP...
2021-01-08 12:12:52,966 [INFO]: [Chamber 1 - Blue_Min1_alt.dat] No duplicates found.
2021-01-08 12:12:52,969 [INFO]: [Chamber 1 - Blue_Min1_alt.dat] Checking record order...
Traceback (most recent call last):
  File "T:\Projects\NGEE-Arctic\ZPW\2019\Data\QA\Level0to1.py", line 359, in <module>
    d_dev = calcAltDevs(d, fname, **opts["Level 1"], **opts["Input"])
  File "T:\Projects\NGEE-Arctic\ZPW\2019\Data\QA\Level0to1.py", line 211, in calcAltDevs
    checkOrder(d_alt)
  File "T:\Projects\NGEE-Arctic\ZPW\2019\Data\QA\Level0to1.py", line 184, in checkOrder
    if df[ts].is_monotonic:
TypeError: 'NoneType' object is not subscriptable
[Level0to1_2021-01-08_121217.log](url)
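The log above is consistent with the `_alt.dat` load failing (the WARNING at 12:12:52) and the later checks then receiving `None` instead of a DataFrame. Below is a minimal sketch of a guard, assuming a loader that returns `None` on failure. `load_table` and `check_order` are hypothetical names for illustration, not the functions in Level0to1.py. It uses `is_monotonic_increasing` because `Series.is_monotonic` is no longer available in recent pandas releases.

```python
import logging
from typing import Optional

import pandas as pd

log = logging.getLogger("qa_sketch")


def load_table(path: str) -> Optional[pd.DataFrame]:
    """Hypothetical loader: return None (and log a warning) if the file cannot be read."""
    try:
        return pd.read_csv(path)
    except OSError as err:  # FileNotFoundError is a subclass of OSError
        log.warning("Problem while loading %s: %s", path, err)
        return None


def check_order(df: Optional[pd.DataFrame], ts: str = "TIMESTAMP") -> bool:
    """Return True only if a table was loaded and its timestamps never decrease."""
    if df is None:
        # Nothing was loaded, so skip the check instead of indexing into None.
        log.warning("No data loaded, skipping record order check.")
        return False
    return bool(df[ts].is_monotonic_increasing)
```

With a guard like this, a missing optional file would produce a warning and a skipped check instead of a `TypeError`.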
So ... is it correct that I copy the raw data .dat files, the .conf and the source_characteristics.csv into a directory (e.g. \dc3.bnl.gov\bnlfiles\Test\Projects\NGEE-Arctic\ZPW\2019\Data\QA) and then run from there?
Level0to1 ran for Chambers 1 and 2, but then failed on Chamber 3. Could this be catching an error in the recording of the data, where the timestamp has been messed up? (BTW, the status messages for every line are really instructive. Nice one.)
2021-01-10 20:09:34,639 [INFO]: [Chamber 3 - Black_Min1.dat] Checking record order...
2021-01-10 20:09:34,654 [INFO]: [Chamber 3 - Black_Min1.dat] Records are out of order, cannot continue.
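To see whether the timestamps in the raw file really are messed up, it can help to list the rows where time goes backwards. This is a rough sketch and not part of csvqa; `find_out_of_order` is a hypothetical helper, and it assumes the timestamp column is named `TIMESTAMP` as in the log messages.

```python
import pandas as pd


def find_out_of_order(df: pd.DataFrame, ts: str = "TIMESTAMP") -> pd.DataFrame:
    """Return the rows whose timestamp is earlier than that of the row before them."""
    stamps = pd.to_datetime(df[ts])
    # diff() is NaT for the first row; NaT < 0 evaluates to False, so row 0 is never flagged.
    backwards = stamps.diff() < pd.Timedelta(0)
    return df[backwards]
```

Inspecting the returned rows (and their neighbours in the file) should show whether this is a logger clock reset, a re-appended block, or something else.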
Add support for multiple headers, most likely using MultiIndex. Similar example: https://stackoverflow.com/questions/22356746/how-can-i-write-a-csv-file-with-multiple-header-lines-with-pandas-to-csv/26815898
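A minimal sketch of what the MultiIndex approach could look like, assuming a file with two header rows (column names, then units). The sample data and the helper names `read_multiheader` and `write_multiheader` are made up for illustration. `pd.read_csv(header=[...])` and `DataFrame.to_csv` are standard pandas calls.

```python
import io

import pandas as pd

# Made-up sample with two header rows: column names, then units.
SAMPLE = (
    "TIMESTAMP,RECORD,Temp\n"
    "TS,RN,degC\n"
    "2021-01-08 12:00:00,1,20.5\n"
    "2021-01-08 12:01:00,2,20.7\n"
)


def read_multiheader(source, n_header_rows: int = 2) -> pd.DataFrame:
    """Read a CSV whose first n_header_rows lines are all header rows."""
    return pd.read_csv(source, header=list(range(n_header_rows)))


def write_multiheader(df: pd.DataFrame) -> str:
    """Serialise the frame back to CSV text, one header line per column level."""
    return df.to_csv(index=False)


df = read_multiheader(io.StringIO(SAMPLE))
round_trip = read_multiheader(io.StringIO(write_multiheader(df)))
```

The columns become tuples such as `("Temp", "degC")`, so the existing checks would need to select columns by the first level, for example `df["TIMESTAMP"]`.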
Rather than using individual global vars, use a config dict. Then, when logging the config parameters, they don't have to be redundantly rebuilt as a dict.
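One possible shape for this, as a sketch only: the section names mirror the `opts["Input"]` and `opts["Level 1"]` lookups seen in the traceback on this page, but every key, value and function name below is made up for illustration.

```python
import json
import logging

log = logging.getLogger("qa_sketch")

# Hypothetical settings; the keys and values are placeholders, not csvqa's real options.
CONFIG = {
    "Input": {"raw_dir": "../Raw", "timestamp_col": "TIMESTAMP"},
    "Level 1": {"resample_freq": "30min", "keep": "last"},
}


def format_config(config: dict) -> str:
    """Render the whole config as one JSON string for the log."""
    return json.dumps(config, indent=2, sort_keys=True)


def run_step(raw_dir: str, timestamp_col: str, resample_freq: str, keep: str) -> str:
    """Stand-in for a processing step that receives its settings as keyword arguments."""
    return f"{raw_dir}:{timestamp_col}:{resample_freq}:{keep}"


# The dict is logged as-is, with no need to rebuild it from separate globals.
log.info("Configuration:\n%s", format_config(CONFIG))
summary = run_step(**CONFIG["Input"], **CONFIG["Level 1"])
```

Because the same dict feeds both the log message and the function calls, the logged parameters cannot drift from the ones actually used.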
Working through the README, Step 1.
Took me a while to find this file - should there be a template copy in an easy-to-find place in the repo?
It is also unclear whether I should create a data/2019 folder in the repo to store the updated source_characteristics file.
Parallel processing should be added, particularly to the resampling process. The custom agg functions are very likely the major time sink.
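A rough sketch of one way to do this, assuming the resampling can be split by column and that the frame has a DatetimeIndex. `value_range`, `_resample_column` and `resample_parallel` are hypothetical names, and `value_range` is only a stand-in for the real custom agg functions.

```python
from concurrent.futures import ProcessPoolExecutor

import pandas as pd


def value_range(values: pd.Series) -> float:
    """Made-up custom aggregation: max minus min within each resampling bin."""
    return float(values.max() - values.min()) if len(values) else float("nan")


def _resample_column(task):
    """Worker: resample one column. Defined at module level so it can be pickled."""
    column, rule, agg = task
    return column.resample(rule).agg(agg)


def resample_parallel(df, rule, agg, max_workers=None, executor_cls=ProcessPoolExecutor):
    """Resample each column in its own task and stitch the results back together."""
    tasks = [(df[name], rule, agg) for name in df.columns]
    with executor_cls(max_workers=max_workers) as pool:
        results = list(pool.map(_resample_column, tasks))
    return pd.concat(results, axis=1)
```

On Windows, a call such as `resample_parallel(df, "30min", value_range)` needs to sit under an `if __name__ == "__main__":` guard when a process pool is used. Each column is pickled and sent to a worker, so whether this is actually faster depends on how expensive the agg functions are relative to that overhead; it would be worth timing before adopting it.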