Table of Contents
- Introduction
- Installation
- Initial Configuration
- Development Environment Configuration
- CSV File Format for Import
- Niamoto CLI Command Examples
- Mapping Configuration
- Static Type Checking and Testing with mypy and pytest
- License
- Contribution
The Niamoto CLI is a tool designed to facilitate the configuration, initialization, and management of data for the Niamoto platform. This tool allows users to configure the database, import data from CSV files, and generate static websites.
pip install niamoto
After installation, initialize the Niamoto environment using the command:
niamoto init
This command creates the default configuration that Niamoto needs to operate. Use the --reset option to reset the environment if it already exists.
To set up a development environment for Niamoto, you must have Poetry installed on your system. Poetry is a dependency management and packaging tool for Python.
- Poetry Installation:
To install Poetry, run the following command:
curl -sSL https://install.python-poetry.org | python3 -
- Clone the Niamoto repository:
Clone the Niamoto repository on your system using git:
git clone https://github.com/niamoto/niamoto.git
- Configure the development environment with Poetry:
Move into the cloned directory and install the dependencies with Poetry:
cd niamoto
poetry install
- Activate the virtual environment:
Activate the virtual environment created by Poetry:
poetry shell
- Editable Installation:
If you want to install the project in editable mode (i.e., source code changes are immediately reflected without needing to reinstall the package), you can use the following command:
pip install -e .
To import taxonomic data into Niamoto, you must provide a structured CSV file with the following columns:
Column | Description |
---|---|
id_taxon | Unique identifier of the taxon |
full_name | Full name of the taxon |
rank_name | Taxonomic rank (e.g., family, genus, species) |
id_family | Identifier of the family to which the taxon belongs |
id_genus | Identifier of the genus to which the taxon belongs |
id_species | Identifier of the species to which the taxon belongs |
id_infra | Infraspecific identifier of the taxon |
authors | Authors of the taxon name |
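As a rough illustration of the format above, the following Python sketch writes a small taxonomy CSV with the listed columns and checks that a header contains all of them. The sample rows, the helper name `missing_columns`, and the use of empty strings for unset identifiers are assumptions made for illustration; they are not part of Niamoto.

```python
import csv
import io

# Column names taken from the table above.
TAXONOMY_COLUMNS = [
    "id_taxon", "full_name", "rank_name", "id_family",
    "id_genus", "id_species", "id_infra", "authors",
]


def missing_columns(csv_text: str) -> list[str]:
    """Return the expected columns that are absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    return [name for name in TAXONOMY_COLUMNS if name not in header]


# Hypothetical sample rows; identifiers and names are illustrative only.
rows = [
    {"id_taxon": 1, "full_name": "Araucariaceae", "rank_name": "family",
     "id_family": 1, "id_genus": "", "id_species": "", "id_infra": "",
     "authors": ""},
    {"id_taxon": 2, "full_name": "Araucaria", "rank_name": "genus",
     "id_family": 1, "id_genus": 2, "id_species": "", "id_infra": "",
     "authors": "Juss."},
]

# Write the rows to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=TAXONOMY_COLUMNS)
writer.writeheader()
writer.writerows(rows)
sample_csv = buffer.getvalue()

print(missing_columns(sample_csv))  # prints []
```

A check like this can be run on a file before importing it, so that a missing column is caught early.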
This section summarizes the command-line interface (CLI) commands available in Niamoto, which let users manage database operations and data imports without writing code.
Command:
$ niamoto init [--reset]
Explanation: Initializes or resets the Niamoto environment. Use the --reset option to reset the environment if it already exists, clearing all data and configurations to start fresh.
Command:
$ niamoto import-all
Explanation: Imports all data from CSV and GeoPackage files into the database. This command is a shortcut that imports taxonomy, plot, occurrence, and occurrence-plot link data in one go. It assumes that the required files are present in the current source directory and specified in the configuration file.
Command:
$ niamoto import-taxonomy <csvfile> [--ranks <ranks>]
Explanation: Imports taxonomy data from a specified CSV file. The --ranks option specifies the order of taxonomic ranks as they appear in the CSV file.
Command:
$ niamoto import-plots <gpkg_file>
Explanation: Imports plot data from a GeoPackage file into the database, which should contain plot geometries and associated attributes.
Command:
$ niamoto import-occurrences <csvfile> --taxon-id-column <column_name>
Explanation: Imports occurrence data from a CSV file. The --taxon-id-column option specifies the CSV column containing the taxon IDs needed to link occurrences to taxa.
Command:
$ niamoto import-occurrence-plots <csvfile>
Explanation: Imports links between occurrences and plots from a CSV file, establishing relational data within the database.
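The individual import commands above can also be scripted. The Python sketch below builds the four documented commands as argument lists and prints them; it runs them only if `run_commands` is called. The file names, the column name `id_taxonref`, the helper names, and the order of the commands are assumptions for illustration.

```python
import shutil
import subprocess


def build_import_commands(taxonomy_csv, plots_gpkg, occurrences_csv,
                          occurrence_plots_csv, taxon_id_column):
    """Return the four documented import commands as argument lists."""
    return [
        ["niamoto", "import-taxonomy", taxonomy_csv],
        ["niamoto", "import-plots", plots_gpkg],
        ["niamoto", "import-occurrences", occurrences_csv,
         "--taxon-id-column", taxon_id_column],
        ["niamoto", "import-occurrence-plots", occurrence_plots_csv],
    ]


def run_commands(commands):
    """Run each command in order, stopping at the first failure."""
    if shutil.which("niamoto") is None:
        raise RuntimeError("niamoto executable not found on PATH")
    for argv in commands:
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(argv, check=True)


# Hypothetical file and column names, for illustration only.
commands = build_import_commands(
    "taxonomy.csv", "plots.gpkg", "occurrences.csv",
    "occurrence_plots.csv", "id_taxonref",
)
for argv in commands:
    print(" ".join(argv))
```

Passing argument lists to `subprocess.run` avoids shell quoting problems with file names that contain spaces.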
Command:
$ niamoto generate-mapping --data-source <csv_file> --mapping-group <group> [--reference-table-name <table_name> --reference-data-path <path>]
Explanation: Generates mappings from a CSV file based on specified grouping criteria. Optional parameters allow linking to reference data for enhanced mapping accuracy.
Command:
$ niamoto calculate-statistics [--mapping-group <group> --csv-file <file>]
Explanation: Calculates statistics based on the mapping file. The mapping group and the CSV file can optionally be specified.
Command:
$ niamoto generate-static_files-site
Explanation: Generates a static website for each taxon in the database, providing a visual and informational representation of taxonomic data.
The mapping file defines the structure and transformations of data to be imported into the database. It is a YAML file that describes the different fields, their types, the transformations to apply, and visualization options.
The mapping consists of the following elements:
- group_by: The field used to group the data (e.g., "taxon").
- identifier: The unique identifier for each group (e.g., "id_taxonref").
- source_table_name: The name of the target table in the database (e.g., "occurrences").
- reference_table_name: The name of the reference table (e.g., "taxon_ref").
- reference_data_path: The path to the reference data (can be null).
- fields: A dictionary defining the different fields to import and their configurations.
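To make the top-level structure concrete, here is a minimal Python sketch of a mapping as a dictionary, using the example values given above, with a check for the listed keys. Treating every listed key as required and the helper name `missing_keys` are assumptions; Niamoto's own validation may differ.

```python
# Top-level keys listed above; treating all of them as required is an assumption.
REQUIRED_KEYS = {
    "group_by", "identifier", "source_table_name",
    "reference_table_name", "reference_data_path", "fields",
}


def missing_keys(mapping: dict) -> list[str]:
    """Return the listed top-level keys that the mapping does not define."""
    return sorted(REQUIRED_KEYS - set(mapping))


# Example values taken from the descriptions above.
mapping = {
    "group_by": "taxon",
    "identifier": "id_taxonref",
    "source_table_name": "occurrences",
    "reference_table_name": "taxon_ref",
    "reference_data_path": None,  # null in YAML
    "fields": {},
}

print(missing_keys(mapping))  # prints []
```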
Each field in the fields dictionary is defined by the following elements:
- source: The source of the data (e.g., "occurrences", "plots").
- source_field: The name of the target field in the occurrences table. Can be null for calculated fields.
- field_type: The data type of the field (e.g., "INTEGER", "DOUBLE", "BOOLEAN", "GEOGRAPHY").
- label: The label of the field.
- description: A description of the field.
- transformations: A list of transformations to apply to the field. Each transformation is defined by:
  - name: The name of the transformation (e.g., "count", "mean", "max", "min", "coordinates").
  - count (for top transformations): The number of top values to retrieve.
  - target_ranks (for top transformations): A list of target ranks to consider (e.g., "id_species", "id_infra").
  - chart_type: The type of chart to generate (e.g., "text", "pie", "map", "gauge", "bar").
  - chart_options: Specific options for the chart type (e.g., "max", "title", "label", "color", "indexAxis", "stacked").
- bins: A dictionary defining the bins for the field. It contains:
  - values: A list of values to discretize continuous data.
  - chart_type: The type of chart to generate for the bins (e.g., "bar").
  - chart_options: Specific options for the bin chart (e.g., "title", "color").
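The elements above can be pictured as a single field definition. The Python sketch below shows one hypothetical field and one possible reading of how the bins values discretize continuous data, with each pair of consecutive values treated as a half-open interval. The column name `dbh`, the labels, the chart options, and the interval semantics are all assumptions; the source does not specify how Niamoto applies bins.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical field definition built from the elements described above.
field = {
    "source": "occurrences",
    "source_field": "dbh",  # hypothetical column name
    "field_type": "DOUBLE",
    "label": "Diameter",
    "description": "Diameter distribution",
    "transformations": [
        {"name": "max", "chart_type": "gauge",
         "chart_options": {"max": 40, "title": "Maximum", "label": "units"}},
    ],
    "bins": {
        "values": [10, 20, 30, 40],
        "chart_type": "bar",
        "chart_options": {"title": "Distribution", "color": "green"},
    },
}


def count_per_bin(data, edges):
    """Count values in each half-open interval [edges[i], edges[i + 1])."""
    counts = Counter()
    for value in data:
        index = bisect_right(edges, value) - 1
        # Values below the first edge or at/above the last edge are skipped.
        if 0 <= index < len(edges) - 1:
            counts[(edges[index], edges[index + 1])] += 1
    return dict(counts)


print(count_per_bin([12.5, 18.0, 25.0, 39.9, 55.0], field["bins"]["values"]))
# prints {(10, 20): 2, (20, 30): 1, (30, 40): 1}
```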
Some fields may have specific configurations depending on their source and field_type:
- Calculated field (e.g., total number of occurrences):
  - source: "occurrences"
  - source_field: null
  - field_type: "INTEGER"
  - transformations: Must contain a "count" type transformation
- Boolean field (e.g., occurrence on a particular substrate):
  - source: "occurrences"
  - source_field: The name of the boolean field in the occurrences table
  - field_type: "BOOLEAN"
  - transformations: May contain a "count" type transformation
- Geographical field (e.g., location of the occurrence):
  - source: "occurrences"
  - source_field: The name of the geographical field in the occurrences table
  - field_type: "GEOGRAPHY"
  - transformations: May contain a "coordinates" type transformation
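The special cases above can be expressed as a small consistency check. This Python sketch flags a calculated field (null source_field) that lacks a "count" transformation, and a BOOLEAN or GEOGRAPHY field that has no source_field. The function name `check_field`, the field names, and the exact rules are assumptions derived from the list above, not Niamoto's actual validation.

```python
def check_field(name: str, field: dict) -> list[str]:
    """Return a list of problems for the special cases described above."""
    problems = []
    transformation_names = {
        t.get("name") for t in field.get("transformations", [])
    }
    field_type = field.get("field_type")
    source_field = field.get("source_field")

    if field_type in ("BOOLEAN", "GEOGRAPHY"):
        # These fields read an existing column, so source_field must be set.
        if source_field is None:
            problems.append(f"{name}: {field_type} field needs a source_field")
    elif source_field is None:
        # Calculated field: a "count" transformation is required.
        if "count" not in transformation_names:
            problems.append(f"{name}: calculated field needs a 'count' transformation")
    return problems


# Hypothetical field definitions, for illustration only.
fields = {
    "total_occurrences": {
        "source": "occurrences", "source_field": None,
        "field_type": "INTEGER", "transformations": [{"name": "count"}],
    },
    "location": {
        "source": "occurrences", "source_field": None,
        "field_type": "GEOGRAPHY", "transformations": [{"name": "coordinates"}],
    },
}

for field_name, definition in fields.items():
    print(field_name, check_field(field_name, definition))
```

Here the first field passes, while the second is reported because a GEOGRAPHY field is expected to name a column in the occurrences table.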
A transformation can be written in YAML flow (inline) notation:
transformations:
  - {"name": "max", "chart_type": "gauge", "chart_options": {"max": 40, "title": "Maximum", "label": "units"}}
or, equivalently, in block notation:
transformations:
  - name: max
    chart_type: gauge
    chart_options:
      max: 40
      title: Maximum
      label: units
mypy is an optional static type checker for Python that aims to combine the benefits of dynamic (duck) typing and static typing. It checks the type annotations in your Python code to find common bugs as soon as possible during the development cycle.
To run mypy on your code:
mypy src/niamoto
pytest is a framework that makes it easy to write simple tests, yet scales to support complex functional testing for applications and libraries.
To run your tests with pytest, use:
pytest --cov=src --cov-report html
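For readers new to these tools, here is a minimal sketch of a type-annotated function with two pytest tests. mypy would check the annotations, and pytest would collect the `test_` functions. The helper `parse_ranks`, the file name, and the comma-separated rank format are hypothetical and not taken from the Niamoto code base.

```python
# test_ranks.py: a hypothetical example, not part of the Niamoto code base.


def parse_ranks(ranks: str) -> list[str]:
    """Split a comma-separated rank string into trimmed, non-empty names."""
    return [rank.strip() for rank in ranks.split(",") if rank.strip()]


def test_parse_ranks() -> None:
    assert parse_ranks("family, genus,species") == ["family", "genus", "species"]


def test_parse_ranks_ignores_empty_items() -> None:
    assert parse_ranks("family,,genus,") == ["family", "genus"]
```

In a real project the function would live in the package and the tests in a separate file; they are combined here only to keep the example self-contained.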
The documentation for the Niamoto CLI tool is available in the docs directory. It includes information on the CLI commands, configuration options, and data import formats.
To build the documentation, run the following commands:
cd docs
sphinx-apidoc -o . ../src/niamoto
make html
make markdown
niamoto is distributed under the terms of the GPL-3.0-or-later license.
Instructions for contributing to the Niamoto project.