Coder Social home page Coder Social logo

PyPI - Version PyPI - Python Version


Table of Contents

Introduction

The Niamoto CLI is a tool designed to facilitate the configuration, initialization, and management of data for the Niamoto platform. This tool allows users to configure the database, import data from CSV files, and generate static websites.

Installation

pip install niamoto

Initial Configuration

After installation, initialize the Niamoto environment using the command:

niamoto init

This command will create the default configuration necessary for Niamoto to operate. Use the --reset option to reset the environment if it already exists.

Development Environment Configuration

To set up a development environment for Niamoto, you must have Poetry installed on your system. Poetry is a dependency management and packaging tool for Python.

  1. Poetry Installation:

To install Poetry, run the following command:

curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python -
  1. Clone the Niamoto repository:

Clone the Niamoto repository on your system using git:

git clone https://github.com/niamoto/niamoto.git
  1. Configure the development environment with Poetry:

Move into the cloned directory and install the dependencies with Poetry:

cd niamoto
poetry install
  1. Activate the virtual environment:

Activate the virtual environment created by Poetry:

poetry shell
  1. Editable Installation:

    If you want to install the project in editable mode (i.e., source code changes are immediately reflected without needing to reinstall the package), you can use the following command:

pip install -e .

CSV File Format for Taxonomy Import

To import taxonomic data into Niamoto, you must provide a structured CSV file with the following columns:

Column Description
id_taxon Unique identifier of the taxon
full_name Full name of the taxon
rank_name Taxonomic rank (e.g., family, genus, species)
id_family Identifier of the family to which the taxon belongs
id_genus Identifier of the genus to which the taxon belongs
id_species Identifier of the species to which the taxon belongs
id_infra Infraspecific identifier of the taxon
authors Authors of the taxon name

Niamoto CLI Command Examples

This markdown summarizes the command-line interface (CLI) commands available in the Niamoto system, which helps users manage database operations and data imports without direct code interaction.

1. Initialize or Reset the Environment

Command:

$ niamoto init [--reset]

Explanation: Initializes or resets the Niamoto environment. Use the --reset option to reset the environment if it already exists, clearing all data and configurations to start fresh.

2. Import All Data

Command:

$ niamoto import-all

Explanation: Imports all data from CSV files and GeoPackage files into the database. This command is a shortcut to import taxonomy, plot, occurrences, and occurrence-plot links data in one go. Assuming the following files are present in the current source directory and specified in the configuration file

3. Import Taxonomy Data

Command:

$ niamoto import-taxonomy <csvfile> [--ranks <ranks>]

Explanation: Imports taxonomy data from a specified CSV file. The --ranks option allows specifying the order of taxonomic ranks as they appear in the CSV file.

4. Import Plot Data

Command:

$ niamoto import-plots <gpkg_file>

Explanation: Imports plot data from a GeoPackage file into the database, which should contain plot geometries and associated attributes.

5. Import Occurrences Data

Command:

$ niamoto import-occurrences <csvfile> --taxon-id-column <column_name>

Explanation: Imports occurrences data from a CSV file. The --taxon-id-column option specifies the CSV column containing the taxon IDs needed to link occurrences to taxons.

6. Import Occurrence-Plot Links

Command:

$ niamoto import-occurrence-plots <csvfile>

Explanation: Imports links between occurrences and plots from a CSV file, establishing relational data within the database.

7. Generate Mapping

Command:

$ niamoto generate-mapping --data-source <csv_file> --mapping-group <group> [--reference-table-name <table_name> --reference-data-path <path>]

Explanation: Generates mappings from a CSV file based on specified grouping criteria. Optional parameters allow linking to reference data for enhanced mapping accuracy.

8. Calculate Statistics

Command:

$ niamoto calculate-statistics [--mapping-group <group> --csv-file <file>]

Explanation: Calculates statistics based on the provided mapping file and optional group or CSV file specifics.

9. Generate Static Site

Command:

$ niamoto generate-static_files-site

Explanation: Generates a static website for each taxon in the database, providing a visual and informational representation of taxonomic data.

Mapping Configuration

The mapping file defines the structure and transformations of data to be imported into the database. It is a YAML file that describes the different fields, their types, the transformations to apply, and visualization options.

Structure of the Mapping

The mapping consists of the following elements:

  • group_by: The field used to group the data (e.g., "taxon").
  • identifier: The unique identifier for each group (e.g., "id_taxonref").
  • source_table_name: The name of the target table in the database (e.g., "occurrences").
  • reference_table_name: The name of the reference table (e.g., "taxon_ref").
  • reference_data_path: The path to the reference data (can be null).
  • fields: A dictionary defining the different fields to import and their configurations.

Field Configuration

Each field in the fields dictionary is defined by the following elements:

  • source: The source of the data (e.g., "occurrences", "plots")
  • source_field: The name of the target field in the occurrences table. Can be null for calculated fields.
  • field_type: The data type of the field (e.g., "INTEGER", "DOUBLE", "BOOLEAN", "GEOGRAPHY").
  • label: The label of the field.
  • description: A description of the field.
  • transformations: A list of transformations to apply to the field. Each transformation is defined by:
    • name: The name of the transformation (e.g., "count", "mean", "max", "min", "coordinates").
    • count (for top transformations): The number of top values to retrieve.
    • target_ranks (for top transformations): A list of target ranks to consider (e.g., "id_species", "id_infra").
    • chart_type: The type of chart to generate (e.g., "text", "pie", "map", "gauge", "bar").
    • chart_options: Specific options for the chart type (e.g., "max", "title", "label", "color", "indexAxis", "stacked").
  • bins: A dictionary defining the bins for the field. It contains:
    • values: A list of values to discretize continuous data.
    • chart_type: The type of chart to generate for the bins (e.g., "bar").
    • chart_options: Specific options for the bin chart (e.g., "title", "color").

Special Fields

Some fields may have specific configurations depending on their source and field_type:

  • Calculated field (e.g., total number of occurrences):
    • source: "occurrences"
    • source_field: null
    • field_type: "INTEGER"
    • transformations: Must contain a "count" type transformation
  • Boolean field (e.g., occurrence on a particular substrate):
    • source: "occurrences"
    • source_field: The name of the boolean field in the occurrences table
    • field_type: "BOOLEAN"
    • transformations: May contain a "count" type transformation
  • Geographical field (e.g., location of the occurrence):
    • source: "occurrences"
    • source_field: The name of the geographical field in the occurrences table
    • field_type: "GEOGRAPHY"
    • transformations: May contain a "coordinates" type transformation

Note: The transformations and bins can be defined using two equivalent notations:

1. JSON Style Notation:

transformations:
  - {"name": "max", "chart_type": "gauge", "chart_options": {"max": 40, "title": "Maximum", "label": "units"}}

1. Standard YAML Style Notation:

This format will display both YAML notations under a single Markdown box, keeping the explanation compact and the code examples clear and easy to compare.

transformations:
  - name: max
    chart_type: gauge
    chart_options:
      max: 40
      title: Maximum
      label: units

Static Type Checking and Testing with mypy and pytest

Using mypy for Static Type Checking

mypy is an optional static type checker for Python that aims to combine the benefits of dynamic (duck) typing and static typing. It checks the type annotations in your Python code to find common bugs as soon as possible during the development cycle.

To run mypy on your code:

mypy src/niamoto

Running Tests with pytest

pytest is a framework that makes it easy to write simple tests, yet scales to support complex functional testing for applications and libraries.

To run your tests with pytest, use:

pytest --cov=src --cov-report html

Documentation

The documentation for the Niamoto CLI tool is available in the docs directory. It includes information on the CLI commands, configuration options, and data import formats.

To build the documentation, you can use the following command:

cd docs
sphinx-apidoc -o . ../src/niamoto
make html
make markdown

License

niamoto is distributed under the terms of the GPL-3.0-or-later license.

Contribution

Instructions for contributing to the Niamoto project.

niamoto's Projects

niamoto icon niamoto

Niamoto is a command-line application and library focused on processing and publishing botanical data

niamoto-core icon niamoto-core

Ecological information system for decision support in ecosystems and biodiversity conservation.

niamoto-portal icon niamoto-portal

niamoto django main project, including REST APIs and web application.

niamoto-static-site icon niamoto-static-site

This repository hosts the generated static web pages for the Niamoto project. Niamoto provides functionalities such as taxonomy data import, plot management, occurrence analysis, and static content generation. Includes detailed pages for taxons and plots, interactive taxonomy trees, and statistics.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.