ONEFlux Processing Pipeline

ONEFlux (Open Network-Enabled Flux processing pipeline) is an eddy covariance data processing code package jointly developed by the AmeriFlux Management Project, the European Fluxes Database, and the ICOS Ecosystem Thematic Centre. ONEFlux is used for the standard processing and data product creation for these networks.

ONEFlux consolidates multiple computations to process half-hourly (or hourly) flux inputs automatically, including friction velocity threshold estimation methods and filtering, gap-filling of micrometeorological and flux variables, partitioning of CO2 fluxes into ecosystem respiration and gross primary production, uncertainty estimates, and more.

The current version of the code is compatible with the code base used to create the FLUXNET2015 dataset, and data processed with ONEFlux can be used in conjunction with data from FLUXNET2015.

The pipeline controlling code uses Python version 2.7 (it should work with Python 3.5 or later, but has not been fully tested with these versions; an update of the code to Python 3 is ongoing).

(THERE ARE CAVEATS AND KNOWN LIMITATIONS TO THIS CODE, PLEASE SEE CAVEATS LIST BELOW.) This iteration of the code is not fully in line with open source/free software development practices, but we intend to steadily move in that direction.

Implemented steps

The steps implemented in the ONEFlux processing pipeline are detailed in the data processing description page of the FLUXNET2015 dataset.

The output of each of these steps is saved to a sub-directory of the directory containing the data for a site. The structure of these output folders includes:

  • 01_qc_visual/: output of QA/QC procedures and visual inspection of data; this is the main input for the ONEFlux pipeline.
  • 02_qc_auto/: output of data preparation procedures for next steps and automated flagging of data based on quality tests (this step is implemented in C, and source available under ../ONEFlux/oneflux_steps/qc_auto/).
  • (step 03 is part of a secondary visual inspection and not included in this codebase)
  • 04_ustar_mp/: output of the Moving Point detection method for USTAR threshold estimation (this step is implemented in C, and source available under ../ONEFlux/oneflux_steps/ustar_mp/).
  • 05_ustar_cp/: output of the Change Point detection method for USTAR threshold estimation (this step is implemented in MATLAB, and source available under ../ONEFlux/oneflux_steps/ustar_cp/).
  • 06_meteo_era/: output of the downscaling of micromet data using the ERA-Interim dataset (this step is optional and is currently not part of this codebase).
  • 07_meteo_proc/: output of the micromet processing step, including gap-filling (this step is implemented in C, and source available under ../ONEFlux/oneflux_steps/meteo_proc/).
  • 08_nee_proc/: output of the NEE processing step, including gap-filling (this step is implemented in C, and source available under ../ONEFlux/oneflux_steps/nee_proc/).
  • 09_energy_proc/: output of the energy (LE and H) processing step, including gap-filling (this step is implemented in C, and source available under ../ONEFlux/oneflux_steps/energy_proc/).
  • 10_nee_partition_nt/: output of the NEE partitioning, nighttime method (this step is implemented in Python, and source available under ../ONEFlux/oneflux/).
  • 11_nee_partition_dt/: output of the NEE partitioning, daytime method (this step is implemented in Python, and source available under ../ONEFlux/oneflux/).
  • 12_ure_input/: output of the preparation of input for the uncertainty estimation step (this step is implemented in Python, and source available under ../ONEFlux/oneflux/).
  • 12_ure/: output of the uncertainty estimation step (this step is implemented in C, and source available under ../ONEFlux/oneflux_steps/ure/).
  • 99_fluxnet2015/: final output of the pipeline with combined products from previous steps (this step is implemented in Python, and source available under ../ONEFlux/oneflux/).

Building and installing

An installation script is available in the form of a Makefile, which can be used on Linux (x86-64) systems; versions for Windows and Mac are planned but not available at this time.

Running the command $ make in the source code folder will install all required Python dependencies, compile all C modules and install them in the user home directory under ~/bin/oneflux/ (gcc version 4.8 or later is required to compile the C modules), and also copy to the same destination an executable compiled version of the MATLAB code (see below for how to install MCR and run this code). Note that the Python modules for ONEFlux will not be installed, so the source code needs to be used directly; paths must be configured before calling the main pipeline execution, as in the sketch below.
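Since setup.py is not implemented, one way to make the oneflux package importable is to point Python at the source tree manually. A minimal sketch (the checkout path is illustrative):

import os
import sys

# Illustrative path: adjust to wherever the ONEFlux source was cloned.
ONEFLUX_SRC = os.path.expanduser("~/ONEFlux")
sys.path.insert(0, ONEFLUX_SRC)

import oneflux  # the package should now resolve against the source tree

Setting the PYTHONPATH environment variable to the source folder achieves the same effect.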

Installing MCR to run compiled MATLAB code

A compiled version of the MATLAB code for the Change Point detection method for USTAR threshold estimation is available (under ../ONEFlux/oneflux_steps/ustar_cp/bin/) and is copied into the executables directory along with the compiled versions of the steps implemented in C (currently only a version for the Linux x86-64 environment is available).

To run this compiled MATLAB code, the MATLAB Compiler Runtime (MCR) toolset must be installed. It can be downloaded from the MCR page. Version 2018a is required (this version was used to compile the code). Follow the instructions on the download page to install MCR.

The path to the newly installed MCR environment (e.g., ~/bin/matlab/v94/) is a required input to the pipeline execution if this step is to be executed.

Running

Run Python using the file runoneflux.py with the following parameters:

usage: runoneflux.py [-h] [--perc [PERC [PERC ...]]]
                     [--prod [PROD [PROD ...]]] [-l LOGFILE] [--force-py]
                     [--mcr MCR_DIRECTORY] [--ts TIMESTAMP] [--recint {hh,hr}]
                     [--versionp VERSIONP] [--versiond VERSIOND]
                     COMMAND DATA-DIR SITE-ID SITE-DIR FIRST-YEAR LAST-YEAR

positional arguments:
  COMMAND               ONEFlux command to be run [all, partition_nt, partition_dt]
  DATA-DIR              Absolute path to general data directory
  SITE-ID               Site Flux ID in the form CC-XXX
  SITE-DIR              Relative path to site data directory (within data-dir)
  FIRST-YEAR            First year of data to be processed
  LAST-YEAR             Last year of data to be processed

optional arguments:
  -h, --help            show this help message and exit
  --perc [PERC [PERC ...]]
                        List of percentiles to be processed
  --prod [PROD [PROD ...]]
                        List of products to be processed
  -l LOGFILE, --logfile LOGFILE
                        Logging file path
  --force-py            Force execution of PY partitioning (saves original
                        output, generates new)
  --mcr MCR_DIRECTORY   Path to MCR directory
  --recint {hh,hr}      Record interval for site
  --versionp VERSIONP   Version of processing
  --versiond VERSIOND   Version of data

Running examples

Sample data

Data formatted for use in the examples below are available. The sample input data (around 80 MB) can be used to run the full pipeline. To check that the processing worked as expected, the sample output data (around 400 MB) can be used.

  • sample input data: ftp://ftp.fluxdata.org/.ameriflux_downloads/.test/US-ARc_sample_input.zip
  • sample output data: ftp://ftp.fluxdata.org/.ameriflux_downloads/.test/US-ARc_sample_output.zip

Execution commands

Run all steps in the pipeline:

  • python: Python interpreter
  • runoneflux.py: Main code to be executed
  • all: Pipeline step to be executed (all)
  • "../datadir/": Main data directory
  • US-ARc: Site Flux ID of site to be processed
  • "US-ARc_sample_input": Relative path to data directory for site (with main data directory)
  • 2005: First year to be processed
  • 2006: Last year to be processed
  • -l fluxnet_pipeline_US-ARc.log: Uses file to store execution log
  • --mcr ~/bin/matlab/v94/: Path to MCR directory
  • --recint hh: Record interval for site (half-hourly)
python runoneflux.py all "../datadir/" US-ARc "US-ARc_sample_input" 2005 2006 -l fluxnet_pipeline_US-ARc.log --mcr ~/bin/matlab/v94/ --recint hh

Run nighttime partitioning method:

python runoneflux.py partition_nt "../datadir/" US-ARc "US-ARc_sample_input" 2005 2006 -l fluxnet_pipeline_US-ARc.log

To run the daytime partitioning method with only a single percentile and/or a single USTAR threshold type data product (recommended for first executions), use:

  • --prod y: processes only VUT USTAR threshold product
  • --perc 50: processes only 50-percentile USTAR threshold
  • --force-py: forces execution of Python partitioning code (replaces existing outputs)
python runoneflux.py partition_dt "../datadir/" US-ARc "US-ARc_sample_input" 2005 2006 -l fluxnet_pipeline_US-ARc.log --prod y --perc 50 --force-py

Note that for the partitioning steps, the code will only run and generate the output if the corresponding output *.csv file (e.g., nee_y_50_US-ARc_2005.csv) does not already exist. If it exists, nothing is done (unless the flag --force-py is used).
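This behavior can be pictured with a small sketch (illustrative only, not the actual pipeline code; the filename follows the example above):

import os

def run_partitioning():
    # Stand-in for the actual partitioning step.
    print("running partitioning...")

output_file = "nee_y_50_US-ARc_2005.csv"  # example output file from the text above
force_py = False                          # set to True by the --force-py flag

if force_py or not os.path.exists(output_file):
    run_partitioning()  # (re)generate the output file
else:
    print("output exists, nothing to do")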

Required input data

All steps

In the data directory for the site, the input data must be in the expected formats, especially for individual steps within the pipeline. If the full pipeline is being executed, the required inputs should be in the following directories (a sanity-check sketch follows the output list below):

  • 01_qc_visual/qcv_files/
  • 06_meteo_era/ (optional)

and the outputs will be generated in the directories:

  • 02_qc_auto/
  • 04_ustar_mp/
  • 05_ustar_cp/
  • 07_meteo_proc/
  • 08_nee_proc/
  • 09_energy_proc/
  • 10_nee_partition_nt/
  • 11_nee_partition_dt/
  • 12_ure_input/
  • 12_ure/
  • 99_fluxnet2015/
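Before launching the full pipeline, it can be useful to verify that the site directory has the expected layout. A minimal sketch using the directory names above (paths follow the running example and are illustrative):

import os

site_dir = os.path.join("../datadir/", "US-ARc_sample_input")

required = ["01_qc_visual/qcv_files"]  # mandatory input
optional = ["06_meteo_era"]            # optional input

for d in required:
    path = os.path.join(site_dir, d)
    if not os.path.isdir(path):
        raise RuntimeError("missing required input directory: %s" % path)

for d in optional:
    if not os.path.isdir(os.path.join(site_dir, d)):
        print("optional input directory not found: %s" % d)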

Flux partitioning steps only

For both the nighttime and daytime partitioning methods, the inputs that must be present should be in the following directories:

  • 02_qc_auto/
  • 07_meteo_proc/
  • 08_nee_proc/

and the outputs will be generated, respectively, in the directories:

  • 10_nee_partition_nt/
  • 11_nee_partition_dt/

Caveats and known limitations

  • NO SUPPORT. We are not offering any kind of support for the code at this time (including the creation of GitHub issues). This allows us to concentrate on improving the code and creating a more usable set of steps within the pipeline. This version of the code is intended to offer insight into how some of the steps work, not to be a fully supported codebase. Once the code is more mature, we will revise this approach.

  • NO CODE CONTRIBUTIONS. For the same reasons we are not offering support at this time, we will not accept code contributions for now. Once we have a more mature code and development process, we will revise this approach (and start encouraging contributions at that point).

  • Execution environment requirements. Many of the steps of the ONEFlux codebase have very specific requirements for the execution environment, including how the intermediate files are formatted, what outputs were generated successfully, execution logs being in place, etc. For this reason, it might be difficult for someone else to run this code if there are any unexpected conditions. Someone familiar with Python and C coding and Unix environments might be able to navigate and remedy errors, but the current version of the code is not intended to be "friendly" to the user (we hope to improve this in upcoming versions).

  • Data format requirements. We have tested this codebase extensively, but only on sites included in the FLUXNET2015 dataset. Specific formats with extra metadata (see sample data files for examples) are required for the pipeline to run correctly. Preparing a new site to be run could be difficult (this is also something we intend to improve soon).

  • Python version. This code has been tested to work with Python 2.7; it should work under Python 3.5 or later as well, but this was not fully tested.

  • MATLAB code. The CPD friction velocity threshold estimation step requires an MCR environment configured to run the MATLAB step compiled into an executable (see instructions above).

  • Daytime partitioning method exceptions. The current implementation of the daytime flux partitioning method aims at preserving the behavior of the code used to generate the FLUXNET2015 dataset. There are situations in which the non-linear least squares method fails for a window in the time series, stopping the execution. The solution in the original implementation (mimicked in this version) is to add the failed window to a list of exception windows to be skipped when the execution is restarted. This is done automatically in this implementation, but it will be removed in a future version, avoiding the failure conditions altogether.

  • Performance issues. The performance of a few of the steps is not optimal. In particular, the daytime and nighttime partitioning methods can take a long time. The daytime partitioning can be especially slow, with the non-linear least squares method taking a long time to run for each window. This will be addressed in future versions, taking advantage of newer least squares methods and other improvements to the execution flow.

  • Sundown reference partitioning method. One of the methods used for flux partitioning in FLUXNET2015 is not currently included in this version of the pipeline. We are working on porting the code and will include this method in future versions.

  • Micromet downscaling step. The downscaling of micrometeorological variables using ERA-Interim reanalysis data (for filling long gaps) is another step missing from the current iteration and will be included in future versions.

  • Bug for full years missing. Known bug, to be addressed in future updates: there are situations in which a full year of missing data for CO2 fluxes, energy/water fluxes, or micromet variables (or combinations of these) will cause the code to break.

  • Bug in input for nighttime partitioning. Known bug, to be addressed in future updates: certain rare conditions in the USTAR threshold estimation can cause the nighttime partitioning to break due to lack of data for one or more windows within the time series.

  • Incompatibility with numpy version 1.16 or higher. A change in how numpy handles access to multiple fields of structured arrays, introduced in version 1.16, broke the code in many places (to be corrected in future updates; see the sketch after this list).

  • Default output messages. The default logging level (both to the screen/standard output and to the log file) is set to the most verbose mode, showing all diagnostics and status messages, which can generate large log files.

  • Installation setup. The Python installation script (setup.py) is not implemented yet, so configuring the execution environment (e.g., the Python path) might have to be done manually.

  • Data input formats. Some of the data formats are not fully compatible with regional networks and FLUXNET formats. This is being addressed and will be fixed in future versions.

  • Not all steps callable. The current implementation only offers the execution of the full pipeline and of the nighttime and daytime partitioning methods. All other steps are implemented but do not have an easy interface for running them individually (this will also change in future versions).
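To illustrate the numpy caveat above: before version 1.16, selecting multiple fields of a structured array returned a packed copy; from 1.16 on it returns a view that keeps the original itemsize, so code relying on the packed memory layout breaks. A minimal sketch of the difference and the usual workaround (numpy.lib.recfunctions.repack_fields, available from 1.16):

import numpy as np
from numpy.lib import recfunctions as rfn

arr = np.zeros(3, dtype=[('a', 'f8'), ('b', 'i4'), ('c', 'f8')])

sub = arr[['a', 'c']]            # numpy >= 1.16: a view; itemsize still counts 'b'
packed = rfn.repack_fields(sub)  # packed copy, matching the pre-1.16 layout

print(sub.dtype.itemsize, packed.dtype.itemsize)  # 20 vs 16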

Support and Funding

This material is based upon work supported by:

  • U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research, through the AmeriFlux Management Project, under Contract No. DE-AC02-05CH11231
  • COOP+ project funded under the European Union's Horizon 2020 research and innovation programme - grant agreement No 654131
  • RINGO project funded under the European Union's Horizon 2020 research and innovation programme - grant agreement No 730944
  • ENVRIFAIR project funded under the European Union's Horizon 2020 research and innovation programme - grant agreement No 824068

Contributors

Development and Code

  • Gilberto Pastorello, gzpastorello <at> lbl <DOT> gov
  • Dario Papale, darpap <at> unitus <DOT> it
  • Carlo Trotta, trottacarlo <at> unitus <DOT> it
  • Alessio Ribeca, a.ribeca <at> unitus <DOT> it
  • Abdelrahman Elbashandy, aaelbashandy <at> lbl <DOT> gov
  • Alan Barr, alan.barr <at> canada <DOT> ca

Evaluation

  • Deb Agarwal, daagarwal <at> lbl <DOT> gov
  • Sebastien Biraud, scbiraud <at> lbl <DOT> gov

(THERE ARE CAVEATS AND KNOWN LIMITATIONS TO THIS CODE, PLEASE SEE CAVEATS LIST ABOVE.)


oneflux's Issues

Create strategy to integrate metadata (BIF-style) into the data product

Define how to integrate metadata into the ONEFlux data product zip package files for each site (similar to the BIF-style file made available for all sites in previous releases).

  • Add BIF file to final product -- decide what information to be included (BADM Variables)
  • Incorporate information from AUXNEE and AUXMeteo files into metadata (same BIF file for each site)
  • Make sure metadata about the product is included (e.g., filename version in product; see issue #21)
  • Consider adding information about who did the processing: machine where run happened, timestamp of processing, person running the code, etc.

Integrate ERA5 downscaling step into main pipeline

The original FLUXNET2015 product was based on the ERA-Interim product, downscaled to site-level data in a separate pipeline. ONEFlux now also supports ERA5 products generated in the same way. However, integration of this step into the pipeline would speed up the process and increase automation.

This is possible by incorporating behavior similar to the code currently used for the downscaling, either downloading large portions of the ERA5 product locally (faster for clusters) or adapting it to run on a site-by-site basis (lighter weight for running individual sites or small groups of sites).

Mismatched last ERA timestamp

I ran the pipeline from 2018 to 2021, but hit this error at the very end (the hourly results had already been output to the 99_ folder):

ONEFluxError: US-ARc mismatched last ERA timestamp expected (202212312330) and found (202112312330)

Zhenqi Luo

versions of python modules

Can you provide the Python module versions used in your implementation?
numpy
scipy
matplotlib
statsmodels

I tried a few versions, but was not able to find a compatible set.

Add new implementation of downscaling to pipeline

New implementation of the downscaling, retrieving data directly from ECMWF and running the downscaling algorithm locally.

Options/steps for implementation:

  • independent implementation
  • incorporated into regular pipeline

Define strategy for versioning in the product filenames

Define how the filename version information will be used:

  • Default values
  • Document meaning of data version
  • Document meaning of processing version
  • Define if processing version is fixed in the code (hard-coded with the code version) or can be parameterized
  • Networks always in sync vs individuals running the code

Filename template:
FLX_CC-SSS_DATAPRODNAME_[SUBSET/FULLSET]_FYYY-LYYY_DATAV-PROCV.zip

Filename examples:
FLX_US-ARc_FLUXNET2015_FULLSET_2005-2006_2-3.zip
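A minimal sketch of how the template could be filled in (variable names are illustrative):

site_id = "US-ARc"
product = "FLUXNET2015"
subset = "FULLSET"  # or SUBSET
first_year, last_year = 2005, 2006
data_version, proc_version = 2, 3

filename = "FLX_%s_%s_%s_%d-%d_%d-%d.zip" % (
    site_id, product, subset, first_year, last_year, data_version, proc_version)
print(filename)  # FLX_US-ARc_FLUXNET2015_FULLSET_2005-2006_2-3.zip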

Additional variables to be included in standard ONEFlux output

Several additional variables are supported in the FP Standard but not propagated to the ONEFlux processing result, even when present in the inputs. These should be included in the output when present in the input.

A first version could simply propagate them, without gapfilling or temporal aggregation, simplifying the code change (see the sketch after the examples below).

A new method for tracking and documenting these variables online will be necessary and needs to be synchronized with ONEFlux variable lists within the code.

Examples include:

  • SW_OUT
  • LW_OUT
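A minimal sketch of the simple propagation idea (data structures and the function name are illustrative, not the actual pipeline code):

EXTRA_VARS = ["SW_OUT", "LW_OUT"]  # examples from the list above

def propagate_extra_vars(input_rows, output_rows):
    # Copy extra variables straight through when present in the input,
    # with no gapfilling or temporal aggregation.
    for var in EXTRA_VARS:
        if input_rows and var in input_rows[0]:
            for in_row, out_row in zip(input_rows, output_rows):
                out_row[var] = in_row[var]
    return output_rows

inp = [{"SW_OUT": 120.5}, {"SW_OUT": 98.2}]
out = [{}, {}]
print(propagate_extra_vars(inp, out))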

Error linking ure

Hi ONEFluxers,

I'm getting some errors when using make -f Makefile in ONEFlux/oneflux_steps at the linking of ure. The screen output with the error is below.

gcc -O3 dataset.o main.o /home/peter/Development/ONEFlux/oneflux_steps/common/common.o -w -lm -o /home/peter/bin/oneflux/ure
/usr/bin/ld: main.o:(.data.rel.local+0x0): multiple definition of `authors_suffix'; dataset.o:(.bss+0x0): first defined here
/usr/bin/ld: main.o:(.data.rel.local+0x20): multiple definition of `types_suffix'; dataset.o:(.bss+0x20): first defined here
collect2: error: ld returned 1 exit status
make: *** [Makefile:174: /home/peter/bin/oneflux/ure] Error 1

OS is Ubuntu 22.04 and gcc version is 11.3.0. It links OK on Ubuntu 20.04 and gcc 9.4.0.

Many thanks for all your hard work on ONEFlux.

Cheers,
Peter

Include additional compiled tools for independent steps

Compiled tools are currently available for several of the steps in ONEFlux; there are additional tools using subsets of the functionality that should also be distributed in compiled/ready-to-use form, including, for instance, the MDS gapfilling method, the top-of-atmosphere solar radiation data generator, etc. (Some of these are currently available in the ICOS-ETC repo.)

Unit testing for C code in a simple way

Hi,

I want to share some info on how we can add checking (unit testing) to the existing code.
There are many unit testing libraries floating around the web, but I think for our needs we can use a C feature to mimic unit testing: macros.

In C, we can have parts of the code that are enabled by defining a symbol (keyword):

#ifdef ENABLE_TEST
puts("some test enabled!");
#endif

If the compiler finds ENABLE_TEST defined during compilation, the code inside the block will be compiled.
Let's make a working example:

#include <stdio.h>

int main(void)
{
	int a = 5;
	printf("the value of a is %d\n", a);
	
#ifdef ENABLE_TEST
	puts("test enabled!");
	printf("a is %s than 10\n", (a < 10) ? "<" : ">=");
	puts("end of test!");	
#endif

	return 0;
}

if we compile it with gcc:

gcc main.c && ./a.out

we have as output:

the value of a is 5

but if we compile it declaring the ENABLE_TEST definition using the -D parameter

gcc -DENABLE_TEST main.c && ./a.out

we'll have

the value of a is 5
test enabled!
a is < than 10
end of test!

As you can see, we can easily enable or disable this kind of checking.
Let's do a better example:

#include <stdio.h>

#ifdef ENABLE_TEST
#define str__(a) #a
#define STR(a) str__(a)
#define TEST(condition) \
		if (!(condition)) printf("condition '%s' fails at row %u in function %s inside file %s\n", \
		STR(condition), __LINE__, __FUNCTION__, __FILE__)
#else
#define TEST(unused)
#endif

int main(void)
{
	int a = 5;
	printf("the value of a is %d\n", a);
	
	// it will fail, condition must be true!
	TEST(a > 10);
	// nothing happens
	TEST(a < 7);

	return 0;
}

In this example the TEST macro evaluates the condition only if ENABLE_TEST is defined; otherwise nothing happens.
Again if we compile and run it

gcc -DENABLE_TEST main.c && ./a.out

we have

the value of a is 5
condition 'a > 10' fails at row 19 in function main inside file main.c

We can eventually adapt this to encapsulate a unit testing library too:

// please note: simplified example
#ifdef ENABLE_TEST
#include "unit_test_files.c"
#define TEST(condition) unit_test_func(condition)
#else
#define TEST(unused)
#endif

Any ideas?

Otherwise this seems like a good candidate for unit testing: https://github.com/sheredom/utest.h

Investigate daytime exception for edge case

Keep track of executions, see if this condition is triggered for any site, and use it as an example to investigate the problem.

Happening here

#### This if statement is weird but it's in the pvwave code
if ind[jmin[0], 1, i] == 6616:  # TODO: investigate and replace statement
    msg = "DT EXIT EXCEPTION: exact number of indices"
    _log.critical(msg)
    raise ONEFluxPartitionError(msg)

MDS General Improvement

The MDS method for gap-filling needs a reorganization. The main reasons or aspects are:

  1. introduce the flexibility in the selection of the drivers to be used and tolerances
  2. introduce the option to calculate the values to be used for gapfilling considering the uneven distribution of measurements above and below the reference value
  3. create the independent tool for this new version (linked to and solving issue #14)

This issue should be solved by adding these options as parameters to pass to the MDS code, setting them initially to the currently used values before switching to new setups.

Consider options for processing versioning

The current default value for the processing version (of the data and processing versioning pair) is set to beta in the code. An execution parameter allows setting a different value.

Consider a better alternative as default (maybe based on code version?)

Python testing with pytest

ONEFlux does not currently have any automated testing of Python code. We intend to update the ONEFlux Python code to Python 3 (see #8). Unittest tests exist, but they only check that ONEFlux can be imported.

ONEFlux would benefit from the implementation of a GitHub workflow in which the following is done automatically upon the submission of a pull request (PR).

  • Build oneflux
  • Gather example data US-ARc_sample_input
  • Execute integration test - possibly partitioning_nt.py step

This runs an integration test. Unit testing can be done in the same way. I would suggest using pytest as an alternative to unittest.
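For example, a first pytest test could mirror the existing import check (a minimal sketch; the file name is illustrative):

# test_oneflux_import.py -- run with: pytest test_oneflux_import.py

def test_import_oneflux():
    # Mirrors the existing unittest check: the package should import cleanly.
    import oneflux
    assert oneflux is not None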

Gapfilling meteo data where MDS is not applied (P, PA, WS...) - use original data if ERA are not available in aggregations

Currently, when ERA data are not available but original data are, the half-hourly final file contains P (with the original data) while P_F is all -9999. This happens even for days where P is always present. This is OK because the variable is in fact not fully gapfilled.
However, this creates a problem in the daily files, where P is entirely missing even for days with measurements.

Let's consider the possibility of providing P_F from daily to yearly aggregations when the periods are fully covered by measurements (so only for the days through years where there are no gaps), as sketched below.
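A sketch of the proposed rule for the daily aggregation (illustrative; -9999 marks missing values as in the pipeline outputs):

MISSING = -9999

def daily_p_f(halfhourly_p):
    # Provide the daily P_F only if every half-hour of the day is measured;
    # otherwise keep the daily value missing.
    if any(v == MISSING for v in halfhourly_p):
        return MISSING
    return sum(halfhourly_p)  # precipitation is summed over the day

day_full = [0.0] * 47 + [1.2]       # 48 half-hours, all measured
day_gappy = [0.0] * 47 + [MISSING]  # one half-hour missing
print(daily_p_f(day_full), daily_p_f(day_gappy))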

Error running US-ARc example files

I get the following error when running the latest ONEFlux with the US-ARc example files:
ONEFluxError: US-ARc mismatched last ERA timestamp expected (202112312330) and found (201412312330)

Details:
ONEFlux: Cloned from GitHub on 27th October 2022
US-ARc: ftp files downloaded on 26th October 2022
OS: Ubuntu 20.04
Python: conda environment using V2.7.15

I have saved the log file and can attach this if it will help.

Anyone else have this problem?

Unit testing C code using a framework

ONEFlux does not have any C unit testing in place (apart from on-the-fly checks). Writing tests for the pre-existing C code is a large engineering task, but a good (and perhaps necessary) first step is to write tests for code that is expected to change in the near term.

Testing ONEFlux's C code can be done by writing our own testing macros or by using a testing framework. Here we suggest adopting a framework. There are a few advantages:

  • They are well tested.
  • Provide advanced features like the ability to run subsets of tests. And many more...
  • Coverage support

Unlike Python, where pytest is really the only choice, in C the decision is less clear-cut. There are many options:

  • Check
  • gtest
  • cmocka
  • utest
  • kyua + atf

The aim is to demonstrate the adoption of a C testing framework by writing a few simple tests of a step in the ONEFlux pipeline. meteo_proc.c is a good candidate for testing.

qc_auto small fixes

Hi,

I can make a PR with the following changes because I have them on my local repository:

  • do not compute NEE if FC is entirely NULL
  • do not put NEE in the file if FC is missing or entirely NULL
  • do not create the U* file if U*, NEE, or FC are missing or entirely invalid
  • do not create the ENERGY file if H or LE are missing or entirely invalid

Please let me know if I can make a PR.

Update Matlab MCR requirement to latest release

The current readme specifies MATLAB MCR version 2018a for running the compiled MATLAB code. However, this version of the MCR does not support Apple M-series silicon.

Can we update the build to support M-series chips?

Add option to continue product generation when one or more partitioning results not available

Need to add an execution parameter to skip the partitioning step or continue execution if it fails.

Parameters are needed individually for each partitioning method (e.g., skip NT only, skip DT only, or skip all partitioning); this is important because DT is more susceptible to limitations, e.g., not enough data to finish the optimization step. There are also situations where NT does not work (e.g., tropical sites where temperature is too stable).

The URE should run and create final outputs (99_fluxnet) with results even if one or more partitioning results are not generated.

Consider also time limits, so partitioning for sites that take too long can be interrupted; this is important, for instance, for sites where it might be impossible to fit parameters or where fitting makes no sense (e.g., urban sites).

energy_proc fixes

Hi,

I can make a PR for the following bugs because I have the changes on my local repository:

  • bug in quality flags for gapfilled LE and H
  • correctly calculate the QC in case of missing data
  • rename vars SWIN_m and SWIN_mqc to the new convention names SW_IN_m and SW_IN_mqc
  • bug: change the value of the QC to -9999 when it was not possible to calculate the energy balance and create LE_Corr

Please let me know if I can make a PR,

Alessio

New input method for ONEFlux

Proposal

Unify the run parameters into a single run file, i.e., a .yaml file.

Motivation

At the moment, the run parameters of ONEFlux are spread across different places: local_settings.py and command arguments.
I propose to have all configurable parameters in a single configuration file so each run is easy to share and reproduce (see the sketch after the plan below).
The YAML file is a good candidate since it is easy to read and has been widely adopted as a configuration file in many projects.

Pros

  • The YAML format is easy to read.
  • It has indentation syntax for the hierarchical structure that is similar to Python syntax.

Cons

  • Need to install PyYAML library.
  • Extra precaution is needed when modifying the YAML file, because a single mismatched indentation can mess up the configuration. However, users may want to use the provided YAML template.

Plan

  • Add PyYAML library to requirements.txt.
  • Add template YAML file.
  • Users can provide the YAML file and override its values with command arguments.
  • Save run parameters.
  • Merge local_settings.py with the new configuration file.
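A minimal sketch of the proposed behavior, assuming hypothetical parameter names (requires the PyYAML library):

import yaml

# Hypothetical run file contents (a template would be provided with ONEFlux).
run_file_text = """
command: all
data_dir: ../datadir/
site_id: US-ARc
site_dir: US-ARc_sample_input
first_year: 2005
last_year: 2006
"""

params = yaml.safe_load(run_file_text)

# Command arguments would override the run file values.
cli_overrides = {"last_year": 2007}
params.update(cli_overrides)

print(params["site_id"], params["first_year"], params["last_year"])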

Consider alternative behaviors for gapfilling of very long gaps

For all variables for which this is implemented, the current gapfilling implementation runs independently of gap sizes, leading to unusable results in some cases (e.g., a full year missing). In these extreme cases, alternatives should be considered, e.g., keeping the gaps in the final results; one such alternative is sketched below.
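A minimal sketch of this idea: revert gapfilled values back to missing wherever the original gap exceeds a maximum length (the threshold and missing-value marker are illustrative):

MISSING = -9999
MAX_GAP = 48 * 30  # e.g., about 30 days of half-hourly records

def keep_long_gaps(original, gapfilled):
    # Revert gapfilled values to missing where the original gap run-length
    # exceeds MAX_GAP, keeping the gap in the final result.
    result = list(gapfilled)
    i = 0
    while i < len(original):
        if original[i] == MISSING:
            j = i
            while j < len(original) and original[j] == MISSING:
                j += 1
            if j - i > MAX_GAP:
                for k in range(i, j):
                    result[k] = MISSING
            i = j
        else:
            i += 1
    return result

orig = [1.0] + [MISSING] * (MAX_GAP + 1) + [2.0]
filled = [1.0] + [1.5] * (MAX_GAP + 1) + [2.0]
print(keep_long_gaps(orig, filled)[1] == MISSING)  # True: the long gap is kept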
