spacetelescope / jwql

The James Webb Space Telescope Quicklook Application

License: BSD 3-Clause "New" or "Revised" License

The JWST Quicklook Application (`JWQL`)

The JWST Quicklook Application (JWQL) is a database-driven web application and automation framework for use by the JWST instrument teams to monitor and trend the health, stability, and performance of the JWST instruments. The system comprises the following:

A network file system that stores all uncalibrated and calibrated data products on disk in a centrally-located area, accessible to instrument team members (i.e. the MAST data cache).
A relational database that stores observational metadata allowing for data discovery via relational queries (MAST database API).
A software library that provides tools to support an automation framework in which to build automated instrument monitoring routines.
A web application that allows users to visually inspect new and archival JWST data as well as instrument-specific monitoring and performance results.

Official API documentation can be found on ReadTheDocs

The jwql application is available at https://jwql.stsci.edu. Please note that the application is currently restricted to specific JWST instrument team members.

Installation for Users

To install jwql, simply use pip:

pip install jwql

The section below describes a more detailed installation for users that wish to contribute to the jwql repository.

Installation for Contributors

Getting jwql up and running on your own computer requires four steps, detailed below:

Cloning the GitHub repository
Installing the conda environment
Installing the python package
Setting up the configuration file

Prerequisites

It is highly suggested that contributors have a working installation of anaconda or miniconda for Python 3.9+. Downloads and installation instructions are available here:

Requirements for contributing to the jwql package will be included in the jwql conda environment, which is included in our installation instructions below. Further package requirements will be provided for jwql by a pyproject.toml file included in the repository.

Clone the `jwql` repo

You first need to clone the current version of jwql. The simplest way to do this is to go to the directory you want your copy of the repository to be in and clone the repository there. Once you are in the directory you can do the following:

git clone https://github.com/spacetelescope/jwql.git
cd jwql

or, if you would rather use SSH instead of https, type

git clone git@github.com:spacetelescope/jwql.git
cd jwql

instead, and then proceed as stated.

Environment Installation

Following the download of the jwql repository, contributors can then install the jwql conda environment via the environment yaml file, which contains all of the dependencies for the project. First, if necessary, install conda. Next, ensure that your version of conda is up to date:

conda update conda

Next, activate the base or root environment (depending on your version of conda):

source activate base/root

Note: If you have added a step activating conda to your default terminal/shell (e.g. the .bashrc, .zshrc, or .profile file) then you don't need to do the above step.

Lastly, create the jwql environment via one of the environment.yml files (currently environment_python_3.9.yml, for python 3.9, and environment_python_3.10.yml, for python 3.10, are supported by jwql):

conda env create -f environment_python_3.9.yml

or

conda env create -f environment_python_3.10.yml

Configuration File

Much of the jwql software depends on the existence of a config.json file within the jwql directory. This file contains data that may be unique to users and/or contain sensitive information. Please see the Config File wiki page for instructions on how to provide this file.
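
As a rough illustration of how a script might read such a file, here is a minimal sketch that assumes config.json is plain JSON. The helper name `load_config` is hypothetical and is not the package's own loader; the keys the file must contain are described on the Config File wiki page and are not assumed here.

```python
import json
from pathlib import Path


def load_config(config_path):
    """Read a JSON configuration file and return its contents as a dict.

    ``load_config`` is a hypothetical helper used for illustration only.
    """
    path = Path(config_path)
    if not path.is_file():
        # Fail early with a pointer to the setup instructions.
        raise FileNotFoundError(
            f"Expected a configuration file at {path}; "
            "see the Config File wiki page for how to create one."
        )
    with path.open() as handle:
        return json.load(handle)
```

Failing early with a clear message is a design choice: a missing file would otherwise tend to surface later as a less obvious error in whatever code needed a setting.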

Citation

If you use JWQL for work/research presented in a publication (whether directly, or as a dependency to another package), we recommend and encourage the following acknowledgment:

  This research made use of the open source Python package 'jwql' (Bourque et al, 2020).

where (Bourque et al, 2020) is a citation of the Zenodo record available using the DOI badge above. By using the Export box in the lower right corner of the Zenodo page, you can export the citation in the format most convenient for you.

Software Contributions

There are two current pages to review before you begin contributing to the jwql development. The first is our style guide and the second is our suggested git workflow page, which contains an in-depth explanation of the workflow.

Contributors are also encouraged to check out the Checklist for Contributors Guide to ensure the pull request contains all of the necessary changes.

The following is a bare-bones example of a recommended workflow for contributing to the project:

Create a fork off of the spacetelescope jwql repository.
Make a local clone of your fork.
Ensure your personal fork is pointing upstream properly.
Create a branch on that personal fork.
Make your software changes.
Push that branch to your personal GitHub repository (i.e. origin).
On the spacetelescope jwql repository, create a pull request that merges the branch into spacetelescope:develop.
Assign a reviewer from the team for the pull request.
Iterate with the reviewer over any needed changes until the reviewer accepts and merges your branch.
Delete your local copy of your branch.

Issue Reporting / Feature Requests

Users who wish to report an issue or request a new feature may do so through the following channels:

Submit a new issue on GitHub (preferred method): https://github.com/spacetelescope/jwql/issues
Submit a new ticket on Jira: https://jira.stsci.edu/projects/JWQL/

Code of Conduct

Users and contributors to the jwql repository should adhere to the Code of Conduct. Any issues or violations pertaining to the Code of Conduct should be brought to the attention of a jwql team member or to [email protected].

Questions

Any questions about the jwql project or its software can be directed to [email protected].

Current Development Team

Bryan Hilbert (Project Manager, INS) @bhilbert4
Mees Fix (Technical Lead, INS) @mfixstsci
Misty Cracraft (INS) @cracraft
Mike Engesser (INS) @mengesser
Maria Pena-Guerrero @penaguerrero
Ben Sunnquist (INS) @bsunnquist
Brian York (INS) @york-stsci
Bradley Sappington (INS) @bradleysappington
Melanie Clarke (INS) @melanieclarke

Past Development Team Members

Matthew Bourque (INS) @bourque
Lauren Chambers (INS) @laurenmarietta
Joe Filippazzo (INS) @hover2pi
Graham Kanarek (INS) @gkanarek
Teagan King (INS) @tnking97
Sara Ogaz (DMD) @SaOgaz
Catherine Martlin (INS) @catherine-martlin
Johannes Sahlmann (INS) @Johannes-Sahlmann
Shannon Osborne (INS) @shanosborne

Acknowledgments:

Faith Abney (DMD)
Joshua Alexander (DMD) @obviousrebel
Anastasia Alexov (DMD)
Sara Anderson (DMD)
Tracy Beck (INS)
Francesca Boffi (INS) @frboffi
Clara Brasseur (DMD) @ceb8
Matthew Burger (DMD)
Steven Crawford (DMD) @stscicrawford
James Davies (DMD) @jdavies-st
Rosa Diaz (INS) @rizeladiaz
Van Dixon (INS)
Larry Doering (ITSD)
Tom Donaldson (DMD) @tomdonaldson
Kim DuPrie (DMD)
Jonathan Eisenhamer (DMD) @stscieisenhamer
Ben Falk (DMD) @falkben
Ann Feild (OPO)
Mike Fox (DSMO) @mfox22
Scott Friedman (INS)
Alex Fullerton (INS) @awfullerton
Macarena Garcia Marin (INS)
Lisa Gardner (DMD)
Vera Gibbs (ITSD)
Catherine Gosmeyer (INS) @cgosmeyer
Phil Grant (ITSD)
Dean Hines (INS)
Sherie Holfeltz (INS) @stholfeltz
Joe Hunkeler (DMD) @jhunkeler
Catherine Kaleida (DMD) @ckaleida
Deborah Kenny (DMD)
Jenn Kotler (DMD) @jenneh
Daniel Kühbacher (Goddard) @DanielKuebi
Mark Kyprianou (DMD) @mkyp
Stephanie La Massa (INS)
Matthew Lallo (INS)
Karen Levay (DMD)
Crystal Mannfolk (SCOPE) @cmannfolk
Greg Masci (ITSD)
Jacob Matuskey (DMD) @jmatuskey
Margaret Meixner (INS)
Christain Mesh (DMD) @cam72cam
Prem Mishra (ITSD)
Don Mueller (ITSD)
Maria Antonia Nieto-Santisteban (SEITO)
Brian O'Sullivan (INS)
Joe Pollizzi (JWSTMO)
Lee Quick (DMD)
Anupinder Rai (ITSD)
Matt Rendina (DMD) @rendinam
Massimo Robberto (INS) @mrobberto
Mary Romelfanger (DMD)
Elena Sabbi (INS)
Bernie Shiao (DMD)
Matthew Sienkiewicz (ITSD)
Arfon Smith (DSMO) @arfon
Linda Smith (INS)
Patrick Taylor (ITSD)
Dave Unger (ITSD)
Jeff Valenti (JWSTMO) @JeffValenti
Jeff Wagner (ITSD)
Thomas Walker (ITSD)
Geoff Wallace (DMD)
Lara Wilkinson (OPO)
Alex Yermolaev (ITSD) @alexyermolaev
Joe Zahn (ITSD)

jwql's Issues

Write tests for generate_preview_image

One test that we could write is to pick a program ID, make sure a preview_image and thumbnail directory exists, and make sure all expected preview images and thumbnails exist.

Build module that logs the execution of a script

One feature that has been particularly handy for both wfc3ql and acsql has been decorator functions that log the execution of a script. We should build something similar for jwql. Perhaps the logging_functions module in wfc3ql is a good reference/starting point.
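
A minimal sketch of what such a decorator could look like, using only the standard library. The name `log_execution` and the message formats are assumptions for illustration; this is not the wfc3ql `logging_functions` implementation.

```python
import functools
import logging
import time


def log_execution(func):
    """Log when the wrapped function starts, finishes, or fails.

    Hypothetical decorator for illustration only.
    """
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logging.info("Starting %s", func.__name__)
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
        except Exception:
            # logging.exception records the message plus the traceback.
            logging.exception("%s failed", func.__name__)
            raise
        elapsed = time.perf_counter() - start
        logging.info("Finished %s in %.2f s", func.__name__, elapsed)
        return result
    return wrapper


@log_execution
def example_monitor():
    """Stand-in for the main function of a monitoring script."""
    return "done"


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    example_monitor()
```

Re-raising after logging keeps the script's exit behavior unchanged, so a scheduler that runs the script still sees the failure.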

Build ginga plugin for web application

It would be awesome if we could interact with JWST FITS files via ginga straight through the web application. Through some quick research, @gkanarek pointed out that this should be possible.

Ask OPO if they have ST-specific css themes

@gkanarek pointed out that at least one other web application at ST was using some css that has the look and feel of the STScI branding. I should ask Chad Smith if OPO already has existing material for this.

Create wiki page describing how to make API docs for code

Now that we have sphinx all set up, it would be useful to have instructions on how to add API docs for any future code that we write.

Build filename parser utility function

It would be useful to have a function in the utils.py module that returned the individual elements of a given filename, for example:

from jwql.utils.utils import parse_filename
filename_dict = parse_filename('jw94015001001_02102_00001_nrcb1_uncal.fits')

where filename_dict is:

{
    'program_id' : '94015',
    'observation' : '001',
    'visit' : '001',
    'visit_group' : '02',
    'parallel_seq_id' : '1',
    'activity' : '02',
    'exposure_id' : '00001',
    'detector' : 'nrcb1',
    'suffix' : 'uncal'
}
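
One possible way to implement this is with a regular expression that uses named groups. The sketch below is an assumption-laden illustration: the pattern is inferred only from the single example filename above, so other JWST filename forms are not handled, and the field widths are guesses based on that one example.

```python
import os
import re

# Pattern inferred from the single example filename above; real JWST
# filenames may take other forms that this sketch does not handle.
FILENAME_PATTERN = re.compile(
    r"jw"
    r"(?P<program_id>\d{5})"
    r"(?P<observation>\d{3})"
    r"(?P<visit>\d{3})"
    r"_(?P<visit_group>\d{2})"
    r"(?P<parallel_seq_id>\d)"
    r"(?P<activity>[0-9a-z]{2})"
    r"_(?P<exposure_id>\d{5})"
    r"_(?P<detector>[A-Za-z0-9]+)"
    r"_(?P<suffix>[A-Za-z0-9]+)"
    r"\.fits$"
)


def parse_filename(filename):
    """Return the elements of a JWST-style filename as a dict of strings."""
    # Accept full paths by looking only at the final path component.
    match = FILENAME_PATTERN.match(os.path.basename(filename))
    if match is None:
        raise ValueError(f"Unrecognized filename: {filename}")
    return match.groupdict()
```

Raising `ValueError` on a non-matching name, rather than returning an empty dictionary, makes unexpected filenames visible to the caller.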

documentation - readthedocs?

I strongly suggest we use readthedocs to autogenerate our documentation (with the help of sphinx). We even have a STScI readthedocs style sheet (ex: http://stak-notebooks.readthedocs.io/en/latest/). Not shown in this example is the inclusion of the sphinx generated API documentation pulled from doc strings in the code.

Ask DMS/archives how proprietary data will be accessed and authenticated

Now that we have support from the JWST mission office to access JWST proprietary data from the same location as the DMS/archive teams will use, we should figure out the implementation details of how these data are accessed and how the access is authenticated.

Develop Database Schema

We need to develop a schema for the jwql database. I think a decent starting point is something like the schema I used for ACS Quicklook:

In this schema, we have a master table that keeps track of each rootname that is in the database and when it was ingested. The datasets table keeps track of which filetypes exist for a given rootname. Then there is a table for each detector/extension/filetype combination which is basically a dump of the headers (columns are header keys and values are header values).

To construct this for jwql, we will need to know the following for each instrument:

What are all of the possible filetypes and what purpose do they serve?
What is the data structure for each filetype (i.e. number of extensions, what purpose each extension serves, what datatype each one is)?
What are the header keywords for each filetype/extension combination?
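
To make the three kinds of table concrete, here is a minimal sketch using Python's built-in sqlite3 module, chosen only so the example runs standalone; it is not the database technology under discussion. All table and column names are hypothetical, the two header columns are placeholders, and recording one row per rootname/filetype pair in `datasets` is just one way to track which filetypes exist.

```python
import sqlite3

# Hypothetical schema: one master table, one datasets table, and one
# header table per detector/extension/filetype combination.
SCHEMA = """
CREATE TABLE master (
    rootname    TEXT PRIMARY KEY,
    ingest_date TEXT NOT NULL
);
CREATE TABLE datasets (
    rootname TEXT NOT NULL REFERENCES master (rootname),
    filetype TEXT NOT NULL,
    PRIMARY KEY (rootname, filetype)
);
CREATE TABLE nrcb1_uncal_ext0_header (
    rootname TEXT PRIMARY KEY REFERENCES master (rootname),
    telescop TEXT,  -- placeholder header keyword column
    exptime  REAL   -- placeholder header keyword column
);
"""


def create_schema(connection):
    """Create the example tables on an open sqlite3 connection."""
    connection.executescript(SCHEMA)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    conn.execute(
        "INSERT INTO master VALUES (?, ?)",
        ("jw94015001001_02102_00001_nrcb1", "2018-01-01"),
    )
    conn.execute(
        "INSERT INTO datasets VALUES (?, ?)",
        ("jw94015001001_02102_00001_nrcb1", "uncal"),
    )
    print(conn.execute("SELECT * FROM datasets").fetchall())
```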

Create module that generates a preview image for a given JWST file

Make Wiki page describing why and how to use the Logging function

Get access to/test the engineering database

There exists a JWST 'engineering database'. We should see if we can use this and if it could be helpful for our application.

Make updated schedule for project

Now that we have passed the proposal stage and are actively developing on this project, it is a good time to refine our initial schedule to more accurately reflect the work we are actually doing and come up with more accurate deadlines.

Overview of instrument-specific calibration and monitoring software

We can use this thread to discuss instrument-specific calibration and monitoring software and identify what work needs to be done. A good first step would be to consolidate the tables provided in the Phase A proposal, which I will do.

Build database monitor

Similar to #47, we should also build a monitoring script that gathers information about the MAST database. Some things that would be good to know:

Number of files in the database
Number of files in the database for a given instrument and/or observing mode
Number of header keywords stored in database

Output should be bokeh or matplotlib plots that track these things over time.

I'm working on securing central storage disk space for the jwql project. Eventually outputs should be stored there.

Figure out what to do for config unit test

Marked as xfail for now so jenkins runs, but we should decide if we want to keep that test or not.

Acquire some JWST FITS files to develop with

The infrastructure of the jwql system will be heavily dependent on the FITS file structure of JWST data, specifically their extensions, header contents, and available filetypes. It would be helpful if we could get our hands on some files that are at least anticipated to be close to the actual data products that will come out of the telescope after launch.

Create an interface to the jwqldb database

Akin to database_interface.py for acsql, we need to build a module that will serve as an interface to the jwqldb database. This module will hold the classes and functions for creating the tables and connecting to the database.

Ask ITSD to build a test and production server

Vera Gibbs suggests that we request a test server and environment sooner rather than later, as it helps ITSD plan their work better. The test environment should mimic the production environment, so we should do some thinking on what we would like our production environment to be first.

Generate text files that contain header keywords and their data type

In order to implement the database schema, we will need lists of each header keyword for each instrument/detector/filetype combination, akin to the acsql examples. These lists will eventually be used by database_interface to create the header tables of the database.

The code that generates these for the acsql example is here; perhaps it can be adapted for jwql use.

It would be ideal if we could generate these lists programmatically, with very little or no hard coding/file creation/file editing involved (since there sure are a lot of detectors and filetypes!).

Build module that sets the appropriate permissions for a given file

When the jwql project eventually has a filesystem containing preview images, proprietary files (still uncertain), and output products generated by automated calibration and monitoring scripts, we will need to make sure these files have appropriate unix permissions so that no one outside of the jwql "group" can see them.

As such, it would be convenient for scripts to be able to import a module that takes care of this without having to worry what permissions to set things to. So we should build this module.

The module should take as input a path to a file, check to see if the owner of the file is the jwql admin account, and if so, (1) set the permissions appropriately, and (2) set the group membership appropriately.

This module should also come with nosetest(s) to test if the function(s) within the module are working properly.
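
The behavior described above could be sketched roughly as follows for a POSIX system. The function name, the default mode `0o640`, and the use of the current user as a stand-in for the jwql admin account are all assumptions made for illustration.

```python
import os


def set_permissions(path, owner_uid=None, mode=0o640, group_gid=None):
    """Set the mode and group of ``path`` if ``owner_uid`` owns the file.

    Hypothetical sketch: ``owner_uid`` defaults to the current user as a
    stand-in for the jwql admin account, and ``mode`` is an assumed default.
    Returns True if the file was updated and False if it was left alone.
    """
    if owner_uid is None:
        owner_uid = os.getuid()
    # Leave files owned by anyone else untouched.
    if os.stat(path).st_uid != owner_uid:
        return False
    os.chmod(path, mode)
    if group_gid is not None:
        # A uid of -1 tells os.chown to keep the current owner.
        os.chown(path, -1, group_gid)
    return True
```

Returning a boolean rather than raising when the owner does not match lets a calling script apply the function to every file it writes without special-casing files it does not own.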

Enable "advanced" preview images

During the meeting with the NIRCam team, they expressed interest in having preview images that include a mosaic with all detectors that were observed during a particular observation. Perhaps this could be an option for a preview image to view on the web application.

Determine which database technology to use

According to Vera Gibbs, ITSD supports MySQL, PostgreSQL, and MSS. We need to decide which of these to use for the jwql database.

Add sphinx docs for db_monitor

With the merge of #52, we should add the db_monitor API docs to the sphinx build.

Determine if a public and/or proprietary filesystem will be available through MAST

Ideally, we would like to avoid having to maintain a separate copy of JWST data to support the jwql application. Through informal conversations with @tomdonaldson, I've been told that we will have access to a centrally-located filesystem of both public and proprietary JWST data, akin to the MAST public cache. We should make sure if this is actually true, and if so, what the organization of this filesystem will look like.

Update README with new environment installation

With #30 , there will be some extra steps to install the jwql-dev environment, namely:

conda update conda
source activate root
conda env create -f environment.yml

We should update the README to reflect this.

Make wiki page describing how to use the config file

Build script that will generate preview images for all files in filesystem

Now that we have preview_image.py, we can build a wrapper around this module to create preview images for all files in the filesystem. The preview images should be stored under the jwql project directory on central storage in some sort of organized filesystem.

Test the MAST API for JWQL needs

Now that we have been pointed to the MAST API, we should test it out with the needs of the JWQL application in mind and gather a list of any capabilities that we would like but that don't exist in the API.

Build database table(s) for tracking anomalies

One feature of our web application will be to tag images with having specific anomalies (e.g. satellite trail, various detector artifacts, etc.). Though we don't know which anomalies the instrument teams may want to track (nor will we probably know for sure until operations), we should build the framework for a database that stores anomaly information.

We already have a postgresql database built for us; we can use that one for now.

Draft a site map for the web application

A good starting point for designing the structure of the web application is to build a site map. We can build a draft site map for now and iterate on it until we are happy.

Build filesystem monitor

We should build a script that will monitor and gather statistics about the filesystem. We should be able to easily answer questions like:

How many files are there in total?
How much disk space is used by all of the files?
How many <some_filetype> files are there?
Other questions I can't think of right now.

I envision the script will create a series of bokeh or matplotlib plots that show these statistics over time. These plots can eventually be hosted on our web application. Currently our filesystem is just some static test filesystem, so the plots will be quite boring for now. But this will become more useful after launch.

This script will become one of the monitors that get run by cron once a day on our virtual machine. Output products should be directed to a specific directory (to be determined).
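
A minimal sketch of the statistics-gathering part, leaving out the plotting and the cron scheduling. The function name is hypothetical, and treating the last underscore-delimited token of the filename as the filetype (for example `uncal` in `..._uncal.fits`) is an assumption.

```python
import os
from collections import Counter


def gather_filesystem_stats(root):
    """Walk ``root`` and return file count, total size, and filetype counts.

    Hypothetical sketch: the "filetype" is taken to be the last
    underscore-delimited token of the filename before its extension.
    """
    file_count = 0
    total_bytes = 0
    filetype_counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            file_count += 1
            total_bytes += os.path.getsize(os.path.join(dirpath, name))
            # "jw..._uncal.fits" -> "jw..._uncal" -> "uncal"
            stem = os.path.splitext(name)[0]
            filetype_counts[stem.rsplit("_", 1)[-1]] += 1
    return {
        "file_count": file_count,
        "total_bytes": total_bytes,
        "filetype_counts": dict(filetype_counts),
    }
```

Appending each day's result with a timestamp to a file in the output directory would give the time series that the plots need.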

Make elevator line/pitch for JWQL

Build sphinx documentation

Now that we have some actual code with docstrings in our jwql package, we can start adding the tools to build official API documentation. I have done this before with sphinx and it has worked well, so unless anybody has a better suggestion I think we should use that.

The documentation that we build could then be hosted on Read the Docs.

Request development web server

Now that we have a prototype for a web application, it would be useful to test its efficiency on an internal server.

Create style guide for jwql software development

To ensure best software development practices and principles, we should make a style guide for software development on this project. We can then check pull requests against this style guide to ensure all collaborators are coding to standard.

Build in file download capability through web application

One feature of our web application should be to be able to download a JWST FITS file through the web app to the user's machine. I envision that, when looking at a webpage that displays a preview image of a particular observation, there are buttons to download each available filetype for that observation.

Determine which skipped keywords are important for jwql

For various reasons, the MAST API/database purposefully skips the storage of certain JWST header keywords (see attached file). However, Kim DuPrie and Lisa Gardner suggested that skipped keywords could be added to the database/API if there is a use case. We should identify the keywords that we think will be important for our application and ask that they add these.

skipped_jwst_keyword.txt

Estimate the size of the database

It would be useful for our work in creating the database schema (issue #6) if we had a rough idea of how big we think the database will eventually be. It may tell us if we need to be concerned about disk space and/or memory issues, which could dictate how we structure the schema.

Make web style guide for the JWQL web app

We should decide on a theme/brand for our web application and document it in a 'web style guide'. Something like this example:

Make dev conda environment more general

We should make our dev conda environment more generalized so that it can be used on the new test server.

Build tests for preview_image.py

We should probably have some tests for preview_image.py, like we do for permissions.py.

One obvious test would be to use a test file as input and check to see that the code successfully generates a preview image file.

Assigning to @Johannes-Sahlmann

Investigate JWST Data Analysis Tools

Reproducing here from the Slack channel:

"During the NIRSpec team meeting on Tuesday, we had a demo of three of the tools the JWST Data Analysis Development Team have been working on - specviz, mosviz, and cubeviz. One of my team members asked whether JWQL would be able to use the tools for QL, but I said no, since they’re stand-alone tools and wouldn’t work in a browser. However, I did say that we might be able to learn from them in terms of choices they’ve made & issues they’ve faced when visualizing various JWST data products. we could also think about whether we want to try and be consistent with those tools in terms of theme/aesthetic/whatever."

So, once we move towards implementing the JWQL visualizations, we should make sure to investigate these tools, and see what we can adapt.

Add sphinx docs for filesystem monitor

Now that the filesystem monitor is merged in (#69), we should add the sphinx documentation.

Decide on Continuous Integration and testing solutions

There are two main options used at the institute for continuous integration: Travis and Jenkins. IT has written up a policy about this, although it is still a draft: https://confluence.stsci.edu/pages/viewpage.action?pageId=99327010

I'm not sure of the specifics of writing tests that need to interact with a database. This may determine which CI we will use. AFAIK, as relates to our project, Jenkins's strength is the short wait time to run the test suite. We have private servers for this, whereas with Travis there can sometimes be a wait time of a few hours. Travis's strength is that anyone contributing a PR can start/stop a Travis run. For Jenkins, if a test suite run needs to be stopped, restarted, etc., that has to be done by someone internal with the correct permissions.

We may want to split unit testing out into a separate issue later on.

Create README for contributing to this project

It may be helpful to others if there were some instructions on how to get started on contributing code to this project (i.e. cloning the repo, submitting pull requests, etc.)

Explore using django for building a web application

For the wfc3ql and acsql projects, we used Flask as the web framework for building web applications. However, I'm not sure the lightweight nature of Flask is going to cut it for jwql, since jwql will be scaled up by a factor of ~5 compared to wfc3ql and acsql.

For this reason, I would like to explore using django as an alternative to Flask. django appears to be an 'industry standard' for web frameworks in python. (There are even entire dedicated conferences to it, so people must use it!)

It looks like the django website has decent documentation and basic tutorials on how to use it. Perhaps this is a good starting point.

Incorporate OWASP Top 10 into Workflow

https://www.owasp.org/images/7/72/OWASP_Top_10-2017_%28en%29.pdf.pdf

Task list

This issue serves as a place to list and discuss the various tasks that are to be completed for the project. Each task will eventually become its own issue to allow for further discussion (to be created by @bourque in the near future).

Create text files that list each header keyword and its data type for each instrument/detector/filetype combination (@bhilbert4)
Build module(s) for creating and connecting to the database (i.e. database_interface.py) (@Ogaz, @bhilbert4, @bourque)
Build module(s) that logs the execution of a script (i.e. logging.py) (@cmartlinSTScI )
Build module(s) that inserts records into the database (i.e. update_database.py) (@Johannes-Sahlmann )
Figure out how to connect Jenkins CI to repository (@hover2pi, @SaraOgaz )
Start building a web app with django (@laurenmarietta )

I've tentatively assigned individuals to tasks, but those can definitely change based on people's interests and availability. Please feel free to comment below any thoughts/feelings you have.

github workflow

Here are the github workflow recommendations that DATB/SCSB will be using for our repos. I like this workflow a lot, but if anyone has any different preferences we can always make adjustments for this project:

https://confluence.stsci.edu/display/DATB/Git+Development+and+Release+Workflows