nextflow-io / nf-co2footprint

[WIP] A Nextflow plugin to estimate the CO2 footprint of pipeline runs.

Home Page: https://nextflow-io.github.io/nf-co2footprint/

License: Apache License 2.0


nf-co2footprint plugin [WIP]

A Nextflow plugin to estimate the CO2 footprint of pipeline runs.

Introduction

The nf-co2footprint plugin estimates the energy consumption for each pipeline task based on the Nextflow resource usage metrics and information about the power consumption of the underlying compute system. The carbon intensity of the energy production is then used to estimate the respective CO2 emission.

The calculation is based on the carbon footprint computation method developed in the Green Algorithms project: www.green-algorithms.org

Green Algorithms: Quantifying the Carbon Footprint of Computation.

Lannelongue, L., Grealey, J., Inouye, M.

Adv. Sci. 2021, 2100707. https://doi.org/10.1002/advs.202100707
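
As a rough illustration of the calculation described above, the following Python sketch applies the energy and emission formula published by the Green Algorithms project. It is not the plugin's own code (the plugin is written in Groovy). The function and parameter names are hypothetical, the input values are made up, and the default memory power draw of 0.3725 W per GB is the value reported in the Green Algorithms paper.

```python
def estimate_energy_kwh(runtime_h, n_cores, power_per_core_w, core_usage,
                        memory_gb, pue, power_per_gb_w=0.3725):
    """Energy in kWh for one task, following the Green Algorithms formula.

    runtime_h         runtime in hours
    n_cores           number of CPU cores requested
    power_per_core_w  power draw per core in watts (TDP divided by core count)
    core_usage        average core usage as a fraction between 0 and 1
    memory_gb         memory available to the task in GB
    pue               power usage effectiveness of the data centre
    power_per_gb_w    power draw of memory in watts per GB
    """
    power_w = n_cores * power_per_core_w * core_usage + memory_gb * power_per_gb_w
    # W * h gives Wh; divide by 1000 to get kWh.
    return runtime_h * power_w * pue / 1000.0


def estimate_co2e_g(energy_kwh, carbon_intensity_g_per_kwh):
    """CO2-equivalent emission in grams: energy times carbon intensity."""
    return energy_kwh * carbon_intensity_g_per_kwh


# Made-up example values, for illustration only.
energy = estimate_energy_kwh(runtime_h=2.0, n_cores=4, power_per_core_w=10.0,
                             core_usage=0.5, memory_gb=8.0, pue=1.5)
print(energy, estimate_co2e_g(energy, carbon_intensity_g_per_kwh=500.0))
```

Summing these per-task values over all tasks of a run would give the footprint of the whole pipeline run.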

The nf-co2footprint plugin generates a detailed TXT carbon footprint report containing the energy consumption, the estimated CO2 emission and other relevant metrics for each task. Additionally, it generates an HTML report with information about the carbon footprint of the whole pipeline run, including plots that show, for instance, an overview of the CO2 emissions for the different processes.

Quick Start

Declare the plugin in your Nextflow pipeline configuration file:

plugins {
  id 'nf-co2footprint@<version>'
}

This is all that is needed. Run your pipeline with the usual command.

You can find more information on plugins in the Nextflow documentation.

Note

To test the plugin prior to its first release, refer to the contributing documentation.

Credits

The nf-co2footprint plugin is mainly developed and maintained by Sabrina Krakau and Júlia Mir-Pedrol at QBiC.

We thank the following people for their extensive assistance in the development of this plugin:


nf-co2footprint's Issues

Inclusion of the carbon footprint of cached processes

This is to discuss how to present the carbon footprint of cached processes.

Simple example: 3 processes [P1] [P2] [P3]

  • First run of the pipeline, all three are run
  • Second run of the pipeline, cached [P1] is used, [P2] and [P3] are run

When presenting the carbon footprint of run (2), we can either:

  • Take the carbon footprint of [P1] for run (1) and add the carbon footprints of [P2] and [P3] from run (2).
  • Or only add the carbon footprints of [P2] and [P3] and ignore the impact of [P1].

Option 1 gives a better estimate of the total carbon footprint of the pipeline if it were run again from start to finish, for example on new data. Option 2 gives a more accurate estimate of the true carbon footprint of run (2). When adding run (1) and run (2), option 2 should be used; otherwise the footprint of [P1] would be double counted even though it was only run once.

The right choice seems to depend a lot on what users want to do with this information, so perhaps it is best to give both values in the report so that users can decide which to use?
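
To make the two options concrete, here is a minimal Python sketch that reports both totals for a run. The TaskFootprint record, the summarize_run function and all numbers are hypothetical and only illustrate the idea; they do not reflect how the plugin stores or reports its results.

```python
from dataclasses import dataclass


@dataclass
class TaskFootprint:
    process: str    # process name, e.g. "P1"
    co2e_g: float   # footprint in grams CO2e (for cached tasks: value from the earlier run)
    cached: bool    # True if the result was taken from the cache


def summarize_run(tasks):
    """Return both candidate totals for one pipeline run."""
    executed = sum(t.co2e_g for t in tasks if not t.cached)
    cached = sum(t.co2e_g for t in tasks if t.cached)
    return {
        # Option 2: only what was actually executed in this run.
        "executed_only": executed,
        # Option 1: estimate for a full run from start to finish.
        "including_cached": executed + cached,
    }


# Run (2) from the example: [P1] is cached, [P2] and [P3] are executed.
run2 = [
    TaskFootprint("P1", 10.0, cached=True),
    TaskFootprint("P2", 4.0, cached=False),
    TaskFootprint("P3", 6.0, cached=False),
]
print(summarize_run(run2))
```

Reporting both values side by side leaves it to the user to pick the one that fits their question, and summing "executed_only" across runs avoids double counting [P1].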

Report all used plugin parameters

Currently, not all plugin parameters are reported. Among the missing ones are parameters that are used in the actual computation but are not task specific, such as the carbon intensity (CI) and the power usage effectiveness (PUE).
They should be listed in the HTML and TXT summary output.

Restructure code

Some parts of the code have become a bit chaotic. Restructure the code once the main functionality is ready.

Handle numeric types consistently

  • Handle the numeric types double, Double, float, Float and BigDecimal consistently
  • Change the type of values for which float would be sufficient, such as cpu_usage and powerdraw (adjust List accordingly, perhaps as part of the restructuring)

Map CPU model names from LINUX file to the `TDP_CPU` file

This is a tricky one and more long term. As it is now, the CPU model name returned by Linux is unlikely to match the model name in the TDP table exactly.

One option could be to find the standardised model names returned by Linux and build a mapping table. Another would be some sophisticated regex to map CPU model names. Open topic!
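
One possible shape of the regex approach is sketched below in Python: both the name reported by the system and the names in the TDP table are reduced to a normalised key before comparison. The normalisation rules, the function names and the table are assumptions made for illustration. The TDP values are placeholders, not authoritative data, and real model names will need more rules than these.

```python
import re


def normalize_cpu_name(raw):
    """Reduce a CPU model string to a lowercase key for table lookup."""
    name = raw.lower()
    name = re.sub(r"\((r|tm)\)", " ", name)            # drop (R) and (TM) marks
    name = re.sub(r"@.*$", " ", name)                  # drop trailing clock speed
    name = re.sub(r"\b(cpu|processor)\b", " ", name)   # drop filler words
    name = re.sub(r"\b\d+-core\b", " ", name)          # drop "64-core" style suffixes
    return " ".join(name.split())                      # collapse whitespace


def lookup_tdp(raw_name, tdp_table):
    """Return the TDP for raw_name, or None if no normalised key matches."""
    normalized = {normalize_cpu_name(k): v for k, v in tdp_table.items()}
    return normalized.get(normalize_cpu_name(raw_name))


# Placeholder table: the values are illustrative, not taken from the TDP_CPU file.
tdp_table = {
    "Intel Xeon E5-2680 v4": 120.0,
    "AMD EPYC 7742": 225.0,
}

print(lookup_tdp("Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz", tdp_table))
print(lookup_tdp("AMD EPYC 7742 64-Core Processor", tdp_table))
print(lookup_tdp("Some Unknown CPU", tdp_table))
```

Returning None for unmatched names keeps the fallback decision (for example a default TDP) explicit in the calling code.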
