spacetelescope / jwst_mast_query

Command line tool for querying MAST and downloading JWST data products

License: BSD 3-Clause "New" or "Revised" License

Python 95.26% JavaScript 4.74%
archive astronomy-astrophysics jwst python query stsci

jwst_mast_query's Introduction

jwst_mast_query

This repository contains software for querying MAST for JWST observations and products, displaying the results in ascii tables, summarizing the files in a local html file, and downloading the selected products to a local directory.

jwst_mast_query uses the MAST API to query MAST. It first queries over the input time frame and retrieves all of the matching "observations", using the same definition of "observation" as APT. An observation can be composed of one or more exposures, from one or more detectors of a single instrument. For each observation, the script identifies all of the individual "products", or files, to download. There can be many products for a given observation, including science data files in various stages of calibration, guide star observations, jpg files of the observations, etc. This means that if the user asks for too many observations, the number of products can be so large that the MAST query times out. For example, the pre-launch LRE5 rehearsal has 264 NIRCam observations, which in turn have more than 17,000 products. About 12,000 of these are guide star products, about 5,000 are science observation products, and 500 are raw, uncalibrated science exposures. jwst_mast_query is able to download all NIRCam LRE5 products, but times out when downloading NIRSpec LRE5 products, as NIRSpec had more observations during the rehearsal. The easiest way to avoid time-outs is to exclude unwanted products when defining the query.

Installation

Install the latest released version

pip install jwst_mast_query

Install the development version

pip install git+https://github.com/spacetelescope/jwst_mast_query.git

Server Installation

If you encounter the error below when running, try running pip install keyrings.alt and then repeating the installation of jwst_mast_query.

keyring.errors.NoKeyringError: No recommended backend was available. Install a recommended 3rd party backend package; or, install the keyrings.alt package if you want to use the non-recommended backends. See https://pypi.org/project/keyring for details.

Environment Variables

These environment variables are not required, but they make life easier:

  • MAST_API_TOKEN: Set this variable to your MAST token, and you get automatically logged in. You can also pass your token with --token. NOTE: if you don't use your MAST token for 10 days, it will become invalid, and you have to get a new one from here: https://auth.mast.stsci.edu/token

  • JWST_QUERY_CFGFILE: Set this variable to your config file (yaml), and it gets loaded automatically. There is a default config file jwst_query.cfg. The config file can also be supplied at run time using --config and a path to a local config file.

  • JWSTDOWNLOAD_OUTDIR: The yaml config file accepts environment variables in the format $XYZ. In the default config file, "outrootdir: $JWSTDOWNLOAD_OUTDIR". This allows different people to use the same config file, but store the images in different locations.
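
As a rough illustration of the $XYZ substitution described above, the sketch below expands environment variables in a config value with Python's standard os.path.expandvars. This is an assumption about how such expansion could be done, not the tool's actual implementation; the helper name expand_config_value and the path /data/jwst are made up for the example.

```python
import os


def expand_config_value(value):
    """Expand $VAR references in a config string (hypothetical helper)."""
    return os.path.expandvars(value)


# Example location, chosen only for illustration.
os.environ["JWSTDOWNLOAD_OUTDIR"] = "/data/jwst"

outrootdir = expand_config_value("$JWSTDOWNLOAD_OUTDIR")
print(outrootdir)
```

Two users sharing one config file would each set JWSTDOWNLOAD_OUTDIR in their own shell and end up with different output directories.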

Use

The jwst_mast_query package contains two tools, both designed to be called from the command line. The first is jwst_download.py. This script will query MAST for files matching the input parameters, and then download those files into a local output directory specified by the user. There is also an option to download associated jpgs from MAST and create a simple index.html file with a table containing these preview images and basic observation information. If requested, the script will also save ascii tables with information on the files identified by the query.

The other script is jwst_query.py. This script will query MAST for files matching the input parameters. jwst_download.py wraps around jwst_query.py and provides the ability to download the files identified by jwst_query.py.

NOTE: jwst_download.py will check whether the files returned by the MAST query already exist in the specified output directory. If they do, it will not download them. It checks on a file-by-file basis, and also checks whether each file is complete. Only missing or incomplete files are downloaded, meaning that you can safely run jwst_download.py multiple times with the same query and only download missing/new data, saving time.
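
The file-by-file check can be pictured with the minimal sketch below. It assumes that "complete" means the local file size equals the size reported for the product; the tool's real check may differ, and the function name needs_download is hypothetical.

```python
import os
import tempfile


def needs_download(path, expected_size):
    """Return True if the file is missing or its size differs from expected_size."""
    if not os.path.isfile(path):
        return True
    return os.path.getsize(path) != expected_size


with tempfile.TemporaryDirectory() as tmpdir:
    complete = os.path.join(tmpdir, "complete.fits")
    with open(complete, "wb") as f:
        f.write(b"\0" * 10)  # a 10-byte stand-in for a downloaded product
    missing = os.path.join(tmpdir, "missing.fits")

    results = [
        needs_download(complete, 10),  # file present with the expected size
        needs_download(complete, 20),  # file present but shorter than expected
        needs_download(missing, 10),   # file not present at all
    ]

print(results)
```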

Inputs

Inputs to the command line call of jwst_download.py are optional, but we recommend at a minimum using a configuration file to define input parameters. To get started, download/copy the configuration file from the repository. For a detailed description of the options in the configuration file, see the Config File section below.

Below we show an example of a typical call to jwst_download.py, with a few options specified. In this case, we specify verbose mode (-v) in order to get more details printed to the screen as the command runs. In addition, we want to locate all data from JWST proposal 1410 (--propID). We specify the name of the config file using --config. Since there is no path given, it is assumed that the config file is in the current working directory. We set the --lookbacktime to 3 days. This means that jwst_download.py will only search for observations taken within the last 3 days. Finally, we request data only from NIRCam's NRCA1 and NRCA2 detectors using the --sca option.

jwst_download.py -v --propID 1410 --config jwst_query.cfg --lookbacktime 3 --sca a1 a2

Common command line options

  • --lookbacktime : The number of days before the present to use as the beginning of the query

  • --propID : The JWST proposal number. e.g. --propID 1409 or --propID 01409

  • --obsnums : Optional observation number or numbers within propID to retrieve. e.g. --obsnums 3 103 would retrieve files only from observations 3 and 103.

  • --makewebpages : Make webpages for the products for each propID containing info and images of the retrieved data.

These are just a few of the options that can be set. The rest are specified in the config file provided in the call. Config file details are given below. For more example calls, see the Examples section below.

Config File and Input Options

Hierarchy

Most config parameters get their values from one of three places: defaults, the config file, and command line arguments.

When jwst_download.py or jwst_query.py is run, the config file is read in first, and all parameters in the config file are saved in self.params, overwriting the existing default values. Then, any command line arguments that are provided are added to self.params, overwriting the config file parameters.
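
The precedence described above (defaults, then config file, then command line) can be sketched with plain dictionaries. This is only an illustration: resolve_params is a hypothetical function, and treating None as "not provided on the command line" is an assumption of the sketch.

```python
def resolve_params(defaults, config, cli_args):
    """Merge parameters so that config beats defaults and the command line beats config."""
    params = dict(defaults)
    params.update(config)
    # Only arguments actually given on the command line (not None) override.
    params.update({key: val for key, val in cli_args.items() if val is not None})
    return params


defaults = {"instrument": "nircam", "lookbacktime": 1.0, "makewebpages": False}
config = {"lookbacktime": 3.0}
cli_args = {"lookbacktime": None, "makewebpages": True}

params = resolve_params(defaults, config, cli_args)
print(params)
```

Here lookbacktime comes from the config file, makewebpages from the command line, and instrument keeps its default.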

Config file

The config file is a convenient way to set and keep track of parameter values. The table below lists the contents of the config file and defines each parameter.

Parameter Name: default value Description
instrument: nircam Specify the instrument (nircam, nirspec, niriss, miri, fgs)
propID Specify the JWST proposal number to query for. This can be a 5 digit integer, or for a smaller number, an integer with or without prepended zeros. e.g. 1409 or 01409.
obsnums Specify the observation numbers within the propID. This can be a single number, or a bracketed list of numbers. e.g. 3 or [3, 103]
outrootdir: $JWSTDOWNLOAD_OUTDIR The base directory for the downloaded products. This can be the name of an environment variable (such as JWSTDOWNLOAD_OUTDIR in this example), or a path. Note the preceding "$" in the case where an environment variable is given. If you include the --outrootdir command line argument when calling jwst_query.py or jwst_download.py, that value will override the value provided here.
outsubdir: Any additional directory to add to the base directory. This can be used to customize the organization of the downloaded products.
skip_propID2outsubdir: False By default, the APT proposal ID is added as a subdir to the directory. You can skip this with this option.
obsnum2outsubdir: True If True, the observation number will be used to create a subdirectory into which the appropriate files will be placed. e.g. $JWSTDOWNLOAD_OUTDIR/01410/obsnum23/
propIDs_obsnum2outsubdir: [1409] Specify list of propID for which obsnum2outsubdir is True. Only for the listed proposal numbers will observation numbers be used to create a subdirectory into which the appropriate files will be placed. e.g. $JWSTDOWNLOAD_OUTDIR/01410/obsnum23/
skip_check_if_outfile_exists: False By default, the script queries MAST for products, and then checks if each file already exists in the output directory or not ("dl_code" and "dl_str" columns). For large numbers of products, this can take time. By setting this option to True, this check can be skipped.
Nobs_per_batch: 2 For large programs, the query for the products can time out, and in these cases it is better to split up the query into Nobs_per_batch observations per batch. For example, if there are 22 observations, and Nobs_per_batch=4, then there will be 6 batches, 5 batches with 4 observations, and the last batch with 2.
obsmode: ['image', 'wfss'] Optionally specify the observation modes to query for. If provided, MAST will be queried only for these types of observations. e.g. assuming instrument is 'nircam', obsmode: ['image', 'wfss'] will query MAST for NIRCAM/IMAGE and NIRCAM/WFSS data. If left empty, the query will include all modes. For a list of valid modes, see the example config file in the repository. Note that as of 24 March 2023, MAST is in the process of reprocessing all data to add the mode values to the instrument names. Until this process is complete, some older data may not include the mode name, and therefore will not be found via a query that includes the mode name.
filetypes: ['uncal'] List of file types to select in the product table, e.g., _uncal.fits or _uncal.jpg. If no suffix is given, .fits is appended. If only letters, then _ and .fits are added. For example, 'uncal' gets expanded to _uncal.fits. Typical image filetypes are uncal, rate, rateints, cal. For downloading a single file type, the brackets must still surround the file suffix, as the script expects a list. A relatively complete list of options includes: ['_segm.fits', '_asn.json', '_pool.csv', '_i2d.jpg', '_thumb.jpg', '_cat.ecsv', '_i2d.fits', '_uncal.fits', '_uncal.jpg', '_cal.fits', '_trapsfilled.fits', '_cal.jpg', '_rate.jpg', '_rateints.jpg', '_trapsfilled.jpg', '_rate.fits', '_rateints.fits'] See the JWST calibration pipeline documentation for a complete list.
jpg_separate_subdir: False If True, downloaded jpgs are saved in separate "jpg" subdirectories, alongside the fits files.
guidestars: False If guidestars is set to True, guidestar products are also included. Note: there are a lot of guide star products. We recommend you set to True only if really needed!
guidestar_data_only: False If guidestar_data_only is set to True, only guidestar products will be included. Science products will be filtered out.
lookbacktime: 1.0 Lookback time in days. The script will query MAST over a time from the lookback time to the present moment. Note that all other time parameters (date_select, etc) override the lookback time.
date_select: [] Specify a date range (MJD or isot format) applied to the "dateobs_center" column. If a single value is given, only exact matches will be returned. If a single value has "+" or "-" at the end, it is treated as a lower or upper limit, respectively. date_select will override the lookbacktime. Examples: 58400+, 58400-, 2020-11-23+, 2020-11-23 2020-11-25
savetables: Save the tables (selected products, obsTable, summary with suffix selprod.txt, obs.txt, summary.txt, respectively) with the specified string as basename. Tables are saved in the same output directory as the data. If no string is provided, the tables are not saved.
mastcolumns_obsTable: ['proposal_id', 'dataURL', 'obs_id', 't_min','t_exptime'] Core columns returned from MAST to the obsTable
outcolumns_productTable List of columns to be shown in product table, e.g., ['proposal_id', 'obsnum', 'obsID', 'parent_obsid', 'obs_id', 'dataproduct_type', 'productFilename', 'filetype', 'calib_level', 'size', 'outfilename', 'dl_code', 'dl_str']
outcolumns_obsTable: ['proposal_id', 'obsnum', 'obsid', 'obs_id', 't_min', 't_exptime', 'date_min'] Output columns for the obsTable.
sortcols_productTable: ['calib_level','filetype','obsID'] The productTable is sorted based on these columns. The default sorts the table based on calibration level.
sortcols_obsTable: ['date_min','proposal_id','obsnum'] The obsTable is sorted based on these columns. The defaults sort the table in the order the observations were observed
sortcols_summaryTable: ['date_start','proposal_id','obsnum'] The summary table is sorted based on these columns. The default sorts the table in the order the observations were observed
makewebpages: False Make webpages for the products for each propID containing info and images of the retrieved data.
webpage_tablefigsize_width: Specify the figure box width. Recommended: 100-150, or don't specify if webpage_mkthumbnails is True. In that case the size of the thumbnails is used by default.
webpage_tablefigsize_height: Specify the figure box height. Recommended: 100-150, or don't specify if webpage_mkthumbnails is True. In that case the size of the thumbnails is used by default.
webpage_level12_jpgs: ['_uncal.jpg','_dark.jpg','_rate.jpg','_rateints.jpg','_trapsfilled.jpg','_cal.jpg','_crf.jpg'] List of filetypes whose thumbnails will be shown in index.html.
webpage_fitskeys2table: ['TARG_RA', 'TARG_DEC', 'FILTER', 'PUPIL', 'READPATT', 'NINTS', 'NGROUPS', 'NFRAMES', 'DATE-BEG', 'DATE-END', 'EFFINTTM', 'EFFEXPTM'] List of fits header keywords that should be copied to the table.
webpage_cols4table: ['proposal_id', 'obsnum', 'visit', 'obsID', 'parent_obsid', 'sca', 'FILTER', 'PUPIL', 'READPATT', 'uncal', 'dark', 'rate', 'rateints', 'cal', 'TARG_RA', 'TARG_DEC', 'NINTS', 'NGROUPS', 'NFRAMES', 'DATE-BEG', 'DATE-END', 'EFFINTTM', 'EFFEXPTM', 'size', 'obs_id', 'outfilename'] Columns to be shown in index.html
webpage_sortcols: ['proposal_id', 'obsnum', 'visit', 'sca'] Columns to sort index.html by.
webpage_mkthumbnails: True If True, a thumbnail jpg is created for each of the jpg products listed in webpage_level12_jpgs
webpage_thumbnails_overwrite: False If True, remake thumbnails even if they already exist
webpage_thumbnails_width: 120 Width in pixels of the resized jpg images to be inserted into the index.html summary file.
webpage_thumbnails_height: Height in pixels of the resized jpg images to be inserted into the index.html summary file. If left undefined, the height will be determined from webpage_thumbnails_width and the aspect ratio of the original image.
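
The Nobs_per_batch arithmetic in the table above can be checked with a short sketch. The function split_into_batches is hypothetical and only illustrates the splitting; it does not query MAST.

```python
def split_into_batches(obsnums, nobs_per_batch):
    """Split a list of observation numbers into consecutive batches."""
    if nobs_per_batch < 1:
        raise ValueError("nobs_per_batch must be at least 1")
    return [
        obsnums[start:start + nobs_per_batch]
        for start in range(0, len(obsnums), nobs_per_batch)
    ]


# 22 observations with Nobs_per_batch=4, as in the table entry.
batches = split_into_batches(list(range(1, 23)), 4)
batch_sizes = [len(batch) for batch in batches]
print(len(batches), batch_sizes)
```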

Outputs

The primary outputs of jwst_download.py are the downloaded files themselves. By default, the downloaded files are saved into the directory:

<outrootdir>/<outsubdir>/<proposal number>/obsnum<XX>

where:

  • proposal number is the 5-digit, zero-padded APT number, i.e. the proposal ID.

  • outsubdir is an optional user-specified additional subdirectory.

  • XX is the observation number, as specified in the proposal.
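
A minimal sketch of how such a directory name could be assembled is shown below. The function build_outdir and its keyword names are hypothetical, the two-digit padding of the observation number is an assumption based on the "XX" placeholder, and the tool's own path logic may differ.

```python
from pathlib import PurePosixPath


def build_outdir(outrootdir, propid, obsnum, outsubdir=None,
                 skip_propid=False, obsnum2outsubdir=True):
    """Assemble an output directory from its parts (illustrative only)."""
    path = PurePosixPath(outrootdir)
    if outsubdir:
        path /= outsubdir
    if not skip_propid:
        # Proposal ID as a 5-digit, zero-padded string, e.g. 1410 -> 01410.
        path /= f"{int(propid):05d}"
    if obsnum2outsubdir:
        # Assumed padding: at least two digits, e.g. 3 -> obsnum03.
        path /= f"obsnum{int(obsnum):02d}"
    return path


outdir = build_outdir("/jwst_data", 1410, 23)
print(outdir)
```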

In addition to querying MAST and downloading the selected files, jwst_mast_query saves several ASCII files containing tables with details of the files. These are the Summary table, the obsTable, and the productTable.

By default, the summary table will be saved in the output directory with a name ending in .summary.txt. This table contains a high-level summary of the data identified in the query. The example below shows that the table contains the proposal ID, observation number, number of uncal.fits files found for each observation, and the date of the beginning of the observation.

proposal_id  obsnum  _uncal.fits          date_start
    1410       1            1 2022-02-13T22:58:32.378
    1410       3            1 2022-02-14T20:22:58.543
    1410      14            1 2022-02-14T20:34:26.670

The obsTable contains data about each observation found to have data matching the query. In the example below we see that this table contains the individual observation IDs for the files, as well as the exposure time for each.

proposal_id  obsnum  obsid                obs_id                  t_min      t_exptime       date_min              _uncal.fits
    1410       1    71672220 jw01410001001_02101_00001_guider1 59623.957319    161.052   2022-02-13T22:58:32.378         1
    1410      14    71673297 jw01410014001_02101_00001_guider1 59624.857253    161.052   2022-02-14T20:34:26.670         1
    1410       3    71673298 jw01410003001_02101_00001_guider1 59624.849289    161.052   2022-02-14T20:22:58.543         1

The table of selected products gives even more details about each downloaded product. In the example below we see there is information on the type of each data product, the calibration pipeline level through which the file has been run, and the location to which the file has been saved.

proposal_id obsnum   obsID  parent_obsid                   obs_id             sca     dataproduct_type    filetype  calib_level size                       outfilename                                  dl_code  dl_str
   1410       1    71672220  71672220     jw01410001001_02101_00001_guider1 guider1      image          _uncal.fits      1    100713600 /jwst_data/01410/jw01410001001_02101_00001_guider1_uncal.fits        0     NaN
   1410       3    71673298  71673298     jw01410003001_02101_00001_guider1 guider1      image          _uncal.fits      1    100713600 /jwst_data/01410/jw01410003001_02101_00001_guider1_uncal.fits        0     NaN
   1410      14    71673297  71673297     jw01410014001_02101_00001_guider1 guider1      image          _uncal.fits      1    100713600 /jwst_data/01410/jw01410014001_02101_00001_guider1_uncal.fits        0     NaN

Examples

Get all NIRCam NRCA1 and NRCA2 files for proposal 1410 taken in the last 3 days

jwst_download.py -v --propID 1410 --config jwst_query.cfg --lookbacktime 3 --sca a1 a2

Specify a proposal ID and specific observation numbers

Download the fits and jpg files for JWST proposal 1138, taken in the last 1 day and create an index.html summary file.

jwst_download.py -v -c jwst_query.cfg --outrootdir /jwst_data -l 1 --propID 01138 --makewebpages --filetypes jpg fits

Download the fits files for JWST proposal 1409, observations 3 and 103 only, taken in the last 2 days.

jwst_download.py -v --config jwst_query.cfg --lookbacktime 2 --propID 1409 --obsnums 3 103

Download the fits and jpg files for JWST proposal 1138, observation 4 only, taken in the last 1 day and create an index.html summary file.

jwst_download.py -v -c jwst_query.cfg --outrootdir /jwst_data -l 1 --propID 01138 --obsnums 4 --makewebpages --filetypes jpg fits

Download the fits and jpg files for JWST proposal 1138, observations 4, 5, and 7 only, taken in the last 1 day and create an index.html summary file.

jwst_download.py -v -c jwst_query.cfg --outrootdir /jwst_data -l 1 --propID 01138 --obsnums 4 5 7 --makewebpages --filetypes jpg fits

Specify dates

Get all files for proposal 743 with an observation date between Aug 11, 2021 16:49:49 and Aug 12, 2021 16:49:49

jwst_download.py -v  -c jwst_query.cfg --propID 743 --date_select 2021-08-11T16:49:49 2021-08-12T16:49:49

Get all files for proposal 743 with an observation date of Aug 11, 2021 16:49:49 or later

jwst_download.py -v  -c jwst_query.cfg --propID 743 --date_select 2021-08-11T16:49:49+

Get all files for proposal 743 with an observation date of Aug 11, 2021 16:49:49 or earlier

jwst_download.py -v  -c jwst_query.cfg --propID 743 --date_select 2021-08-11T16:49:49-

Get all files for proposal 743 with an observation date of MJD 59430.0 or later

jwst_download.py -v  -c jwst_query.cfg --propID 743 --date_select 59430.0+
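
The date_select forms used in the examples above (a range, a value with a trailing "+", a value with a trailing "-", and MJD or isot input) can be interpreted as in the sketch below. It is an illustration of the documented rules, not the tool's parser: the function names are hypothetical, and the isot-to-MJD conversion uses plain datetime arithmetic, which ignores time-scale subtleties that astropy's Time class would handle.

```python
from datetime import datetime

MJD_EPOCH = datetime(1858, 11, 17)  # MJD 0.0


def to_mjd(value):
    """Convert an MJD string or an isot date string to a float MJD."""
    try:
        return float(value)
    except ValueError:
        delta = datetime.fromisoformat(value) - MJD_EPOCH
        return delta.total_seconds() / 86400.0


def parse_date_select(values):
    """Return (lower, upper) MJD limits; None means no limit on that side."""
    if len(values) == 2:
        return to_mjd(values[0]), to_mjd(values[1])
    if len(values) != 1:
        raise ValueError("date_select takes one or two values")
    value = values[0]
    if value.endswith("+"):
        return to_mjd(value[:-1]), None   # lower limit only
    if value.endswith("-"):
        return None, to_mjd(value[:-1])   # upper limit only
    mjd = to_mjd(value)
    return mjd, mjd                       # exact match


lower_only = parse_date_select(["59430.0+"])
upper_only = parse_date_select(["2021-08-04-"])
date_range = parse_date_select(["2021-08-04", "2021-08-05"])
print(lower_only, upper_only, date_range)
```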

Download only the jpg (not the fits) files from the last 5 days, and create a table of results, saved into index.html

jwst_download.py -v -c jwst_query.cfg --outrootdir /my_jwst_data --lookbacktime 5 --makewebpages --filetypes jpg

jwst_mast_query's People

Contributors

arminrest, bhilbert4

jwst_mast_query's Issues

Trouble downloading data acquired after Feb 28.

reported by @bholler:

I attempted to use jwst_mast_query to download data obtained on March 2nd from MAST but was told by the tool that there were no files that fit my criteria (the same criteria I have used for months to obtain uncal files from various instruments). I checked and the data were available on MAST itself. A colleague let me know they were having the same issue today and first suggested that the issue could possibly be that MJD increased above 60000 the other day. However, he did some further investigation and found that he could not download data from after February 28th this year, and this was after MJD exceeded 60000, so that doesn't appear to be the issue. Could there be something hard-coded that doesn't like that February has less than 30 days? I did a quick search through the code and didn't find anything like that. Not sure what exactly is going on, but the cutoff at February 28 is suspicious. I also wonder if something changed on the MAST side of things.

No observations detected --- and NameError: name 'sys' is not defined

Hi folks,

Having issues with the latest version of the software. I'm using the .cfg file that's on the repo, but simply modifying the instrument, the PID to match 1541 (which is a public TSO) and the lookbacktime by doing the following edits:

instrument: niriss/soss
(...)
propIDs_obsnum2outsubdir: 1541
(...)
lookbacktime: 1000.0

I then run via jwst_download.py --propID 1541 --config jwst_query.cfg.

However, when I do this, I get the following:

Loading config file jwst_query.cfg
propID 01541
obsnums [1]
INSTRUMENT: niriss/soss
INFO: MAST API token accepted, welcome Nestor Espinoza [astroquery.mast.auth]
MJD range: 59027.13078453367 60027.230784538275
WARNING: NoResultsWarning: Query returned no results. [astroquery.mast.discovery_portal]
WARNING!! No observations found!!

################################
NO OBSERVATIONS FOUND! exiting....
################################
############## Nothing selected!!!! exiting...
Traceback (most recent call last):
  File "/Users/nespinoza/opt/anaconda3/envs/neonewen/bin/jwst_download.py", line 4, in <module>
    __import__('pkg_resources').run_script('jwst-mast-query==0.0.2.dev15+g44b81de', 'jwst_download.py')
  File "/Users/nespinoza/opt/anaconda3/envs/neonewen/lib/python3.10/site-packages/pkg_resources/__init__.py", line 672, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/Users/nespinoza/opt/anaconda3/envs/neonewen/lib/python3.10/site-packages/pkg_resources/__init__.py", line 1479, in run_script
    exec(script_code, namespace, namespace)
  File "/Users/nespinoza/opt/anaconda3/envs/neonewen/lib/python3.10/site-packages/jwst_mast_query-0.0.2.dev15+g44b81de-py3.10.egg/EGG-INFO/scripts/jwst_download.py", line 45, in <module>
  File "/Users/nespinoza/opt/anaconda3/envs/neonewen/lib/python3.10/site-packages/jwst_mast_query-0.0.2.dev15+g44b81de-py3.10.egg/EGG-INFO/scripts/jwst_download.py", line 37, in main
NameError: name 'sys' is not defined

So, first, no observations are detected (?) and then a sys not defined (perhaps related to this?).

Have folks experienced this? Don't know what is actually going wrong.

Thanks!
Néstor

Cannot find some instrument data due to mode

This potentially should be considered a MAST issue rather than an issue for this tool; not sure.

I have discovered that in MAST, for NIRCam coronagraphic data the ‘Instrument’ keyword is set to “NIRCAM/CORON” instead of just NIRCam. On the other hand for MIRI coronagraphy it’s just “MIRI”. In other words the instruments are not treated consistently.

A practical consequence of this is that jwst_mast_query cannot find or download any NIRCam coronagraphy data, because it searches for “nircam” data, and does not know to also search for “nircam/coron”.

There appear to be similar cases of this for other instruments, for example some data labeled as "NIRSPEC/IFU" which does not show up if you search for NIRSpec data using this tool...
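
One possible way to make the matching tolerant of mode suffixes is sketched below as a local string comparison. This is only an illustration of the idea, not how jwst_mast_query or MAST filter instruments; matches_instrument is a hypothetical helper.

```python
def matches_instrument(mast_instrument, requested):
    """True if the MAST instrument name is the requested one, with or without a mode suffix."""
    name = mast_instrument.strip().upper()
    base = requested.strip().upper()
    return name == base or name.startswith(base + "/")


checks = [
    matches_instrument("NIRCAM", "nircam"),        # plain instrument name
    matches_instrument("NIRCAM/CORON", "nircam"),  # instrument name with mode
    matches_instrument("NIRSPEC/IFU", "nirspec"),
    matches_instrument("NIRSPEC/IFU", "nircam"),   # different instrument
    matches_instrument("MIRI", "nircam"),
]
print(checks)
```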

Unable to retrieve guide star data

The flag for filtering out guide star data seems to always be True? I tried a query where I did not set it to True, but no guide star data were returned in the search. None of the queries below returned GS data.

We should make sure that the flag value is respected, so that users can retrieve guidestar data if they want it.
Also, we may want to add a e.g. --no_science flag, so that only guide star data are returned.

jwst_download.py -i nircam --propID 1068 --skipdownload --date_select 2022-05-14 2022-05-16 --savetables temp
jwst_download.py -i nircam --propID 1068 --obsnums 4 --skipdownload --date_select 2022-05-14 2022-05-16 --savetables temp
jwst_download.py --propID 1068 --skipdownload --date_select 2022-05-14 2022-05-16 --savetables temp
jwst_download.py -i fgs --propID 1068 --skipdownload --date_select 2022-05-14 2022-05-16 --savetables temp

LRE6 flag cuts off too early

LRE6 cuts off at 2021-10-23T00:00:00 but there is data for LRE6 beyond that.

I suggest changing the LRE6 flag from
mjd_max = mjd_min+6
to
mjd_max = mjd_min+10

querying for exposure types

would it be possible to add a query option for exposure types? this would help when querying for TA products for example.

MSA config files for NIRSpec download multiple times

Currently one can't specify _msa.fits files of AUXILIARY type to get for NIRSpec. This would be good to be able to get with this tool, as these are required for reprocessing using calwebb_spec2 for NIRSpec MSA data.

These MSA config files are the only AUXILIARY files currently required during pipeline reprocessing, outside of the SCIENCE (uncal, rate, cal, etc) and INFO (_asn.json) files.

EDIT

My bad. One can get _msa.fits files. I.e.:

jwst_download.py --propID 2736 --filetypes _msa.fits -l 80 -i nirspec

It would be good to add that to the docs as a possible --filetypes arg.

But I notice that it tries to download each one multiple times. This is probably due to the fact that a single MSA configuration file can be used for multiple observations. In the case of the ERO SMACS NIRSpec MSA data, the 2 MSA config files are used for all 1392 datasets, so it downloads each MSA file almost 700 times.

Even for a single exposure, a single MSA config file will be used for both NRS1 and NRS2 detectors, so a single config file for 2 datasets. Perhaps it would be good to find unique entries in the final download table?
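
The "unique entries" idea could look like the sketch below, which keeps the first row for each product filename. The rows are represented as plain dictionaries for illustration; the filenames and obs_id values are made up, and unique_products is a hypothetical helper rather than part of the tool.

```python
def unique_products(products, key="productFilename"):
    """Keep only the first row for each distinct value of `key`, preserving order."""
    seen = set()
    unique = []
    for row in products:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


# Made-up rows: one MSA config file shared by two detectors, plus a second file.
products = [
    {"obs_id": "dataset_nrs1", "productFilename": "example_01_msa.fits"},
    {"obs_id": "dataset_nrs2", "productFilename": "example_01_msa.fits"},
    {"obs_id": "dataset_nrs1", "productFilename": "example_02_msa.fits"},
]

to_download = unique_products(products)
print([row["productFilename"] for row in to_download])
```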

html file is empty when specifying one suffix of jpg

I just tried retrieving jpg files for 01143. When I set filetype to jpg, I get all the jpgs and an index.html file that looks good. If I set filetype to _rate.jpg then it downloads all of the *_rate.jpg files, but the index.html file is empty other than the column names.

Put index.html in correct location when there is no PID subdirectory

from Jo Taylor:

Just tested it for NIRISS commissioning data and it worked great. The only issue I had was when I set skip_propID2outsubdir to True and also made the index.html file. In this case it downloaded all files to outrootdir, but wrote the index.html to outrootdir/PID, with the locations of the jpegs also pointing to the PID subdir, even though they actually existed one directory up in outrootdir. Therefore none of the thumbnails worked.

No module named 'PIL'

I upgraded recently and got the following error:

(mast_query) % python jwst_query.py --propID 1185 -l 5
Traceback (most recent call last):
  File "~/jwst_mast_query/jwst_mast_query/jwst_query.py", line 18, in <module>
    from PIL import Image
ModuleNotFoundError: No module named 'PIL'

pip install Pillow solved the problem for me. Maybe that just needs to be in the installation requirements?

Questions for documentation

I'm going to add questions here as I go through the documentation:

  • What if you want all the data for proposal XXXXX, regardless of how long ago it was? Will providing the PID override the lookback time?
    Answer: No, it won't. Lookback time needs to be far enough back to get all the files.

  • productfilename is no longer a part of the path of the downloaded files?
    It looks like it is not.

keyring.backends.macOS.api.Error: (-25244, 'Unknown Error')

I get a Mac keychain error when running jwst_query.py out of the box with Python 3.8.11, astroquery=0.4.5, keyring=23.5.0 on an M1pro mac with MacOS Monterey 12.0.1. It works without an error in a different environment that has Python 3.6.7, astroquery='0.4.2.dev0' and some intel Rosetta thing. It says "python3 wants to use your confidential information stored in astroquery:mast.stsci.edu.token in your keychain". After that, I get an error.

Here is the full traceback:

python jwst_query.py                             
Loading config file /Users/~/outside_progs/jwst_mast_query/jwst_mast_query/jwst_query.cfg
INSTRUMENT: nircam
Traceback (most recent call last):
  File "/Users/~/miniconda3/envs/m1py3p8/lib/python3.8/site-packages/keyring/backends/macOS/__init__.py", line 40, in set_password
    api.set_generic_password(self.keychain, service, username, password)
  File "/Users/~/miniconda3/envs/m1py3p8/lib/python3.8/site-packages/keyring/backends/macOS/api.py", line 151, in set_generic_password
    delete_generic_password(name, service, username)
  File "/Users/~/miniconda3/envs/m1py3p8/lib/python3.8/site-packages/keyring/backends/macOS/api.py", line 172, in delete_generic_password
    Error.raise_for_status(status)
  File "/Users/~/miniconda3/envs/m1py3p8/lib/python3.8/site-packages/keyring/backends/macOS/api.py", line 114, in raise_for_status
    raise cls(status, "Unknown Error")
keyring.backends.macOS.api.Error: (-25244, 'Unknown Error')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "jwst_query.py", line 959, in <module>
    query.login(raiseErrorFlag=False)
  File "jwst_query.py", line 302, in login
    self.JwstObs.login(token=token, store_token=True)
  File "/Users/~/miniconda3/envs/m1py3p8/lib/python3.8/site-packages/astroquery/query.py", line 150, in login
    bases[0].login(*args, **kwargs)
  File "/Users/~/miniconda3/envs/m1py3p8/lib/python3.8/site-packages/astroquery/query.py", line 514, in login
    self._authenticated = self._login(*args, **kwargs)
  File "/Users/~/miniconda3/envs/m1py3p8/lib/python3.8/site-packages/astroquery/mast/core.py", line 66, in _login
    return self._auth_obj.login(token, store_token, reenter_token)
  File "/Users/~/miniconda3/envs/m1py3p8/lib/python3.8/site-packages/astroquery/mast/auth.py", line 76, in login
    keyring.set_password("astroquery:mast.stsci.edu.token", "masttoken", token)
  File "/Users/~/miniconda3/envs/m1py3p8/lib/python3.8/site-packages/keyring/core.py", line 60, in set_password
    get_keyring().set_password(service_name, username, password)
  File "/Users/~/miniconda3/envs/m1py3p8/lib/python3.8/site-packages/keyring/backends/macOS/__init__.py", line 44, in set_password
    raise PasswordSetError("Can't store password on keychain: " "{}".format(e))
keyring.errors.PasswordSetError: Can't store password on keychain: (-25244, 'Unknown Error')

A way around it is to change this line in jwst_query.py

self.JwstObs.login(token=token, store_token=True)

to

self.JwstObs.login(token=token, store_token=False)

Get all data for propID without having to input dates

It would be nice if a user who wants to download all data for a given program could do so without having to specify a lookbacktime or date_select value.

For example, it would be nice if these worked (they currently don't, because lookbacktime defaults to 1 day, which is a good thing):
jwst_download.py -i nircam --propID 1068
jwst_download.py -i nircam --propID 1068 --obsnums 3

This can be done within the current framework. I think it would require an initial query to MAST to retrieve the dates on which the data for the propID were acquired, and then using those dates as the date_select values.
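
A minimal sketch of the second half of that idea, assuming the initial MAST query has already returned one record per observation with a start time in MJD. The function name, the `pad_days` parameter, and the use of a `t_min` key are assumptions for illustration, not the tool's actual code.

```python
import math

def date_select_from_observations(observations, pad_days=1.0):
    """Return an (mjd_start, mjd_end) pair covering all observations.

    `observations` is assumed to be an iterable of dicts with a
    `t_min` entry holding the observation start time in MJD.
    Returns None when no start times are available.
    """
    starts = [obs["t_min"] for obs in observations if obs.get("t_min") is not None]
    if not starts:
        return None
    # Pad both ends so observations at the boundaries are not missed.
    return (math.floor(min(starts) - pad_days), math.ceil(max(starts) + pad_days))
```

The returned pair could then be passed on as the date_select values for the real query, leaving the 1-day lookbacktime default untouched for all other cases.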

Sort the obs and product tables

It would be nice if the results shown in the productTable and obsTable were sorted by observation and productFilename. The results would be easier to examine.

It's a little complicated because, as the code stands, the unwanted rows are not filtered out of the tables until the call to the write() function. It might be better to cut the tables down to the wanted rows earlier, then do something like the lines below, and then write the tables to the screen/files.

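    # Sort the productTable to make it easier to read.
    # sort_values() returns a new DataFrame, so the result must be assigned back.
    if "obsnum" in self.productTable.t and "obs_id" in self.productTable.t:
        self.productTable.t = self.productTable.t.sort_values(by=["obsnum", "obs_id"], ascending=[True, True])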

date_select does not work from the config file

I tried a query supplying date_select on the command line and everything worked as expected. However, repeating the query using date_select in the config file, it appears that date_select is being ignored. When using date_select in the config file, I tried the query with and without the lookback entry commented out, and with the date_select values as two numbers separated by a space, as well as two numbers in a python list e.g. [59650, 59651], as well as two strings in a python list e.g. ['59650', '59651']. The results were the same in all cases. @arminrest
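
The three forms tried above (space-separated values, a list of numbers, a list of strings) could all be reduced to one shape before the query is built. The following is only a sketch of that normalization; `normalize_date_select` is a hypothetical helper, and it makes no claim about how jwst_mast_query currently reads the config value.

```python
def normalize_date_select(value):
    """Return date_select as a list of strings, whatever form it arrived in.

    Accepts None, a string such as "59650 59651", or a list/tuple of
    numbers or strings such as [59650, 59651] or ['59650', '59651'].
    """
    if value is None:
        return []
    if isinstance(value, str):
        # Allow either spaces or commas between the two entries.
        parts = value.replace(",", " ").split()
    else:
        parts = list(value)
    return [str(part).strip() for part in parts]
```

Calling this on both the command-line value and the config-file value would at least make the two code paths see identical input, which should help narrow down where the config entry is being dropped.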

Add documentation of config file parameters to inline help

Currently the inline help

jwst_download.py --help

has less information (and some inaccurate listing of defaults) about the possible parameters one can use, relative to the information in the README describing the config file parameters. It would be good to merge that config file parameter documentation into the inline help, including the current updated defaults.

For example, the default for filetypes is currently uncal, but the inline help says it is None. The inline help shows:

  -f FILETYPES [FILETYPES ...], --filetypes FILETYPES [FILETYPES ...]
                        List of product filetypes to get, e.g., _uncal.fits or
                        _uncal.jpg. If only letters, then _ and .fits are
                        added, for example uncal gets expanded to _uncal.fits.
                        Typical image filetypes are uncal, rate, rateints, cal
                        (default=None)

and

filetypes: ['uncal'] | List of file types to select in the product table, e.g., _uncal.fits or _uncal.jpg. If no suffix is given, .fits is appended. If only letters, then _ and .fits are added. For example, 'uncal' gets expanded to _uncal.fits. Typical image filetypes are uncal, rate, rateints, cal. For downloading a single file type, the brackets must still surround the file suffix, as the script expects a list. A relatively complete list of options includes: ['_segm.fits', '_asn.json', '_pool.csv', '_i2d.jpg', '_thumb.jpg', '_cat.ecsv', '_i2d.fits', '_uncal.fits', '_uncal.jpg', '_cal.fits', '_trapsfilled.fits', '_cal.jpg', '_rate.jpg', '_rateints.jpg', '_trapsfilled.jpg', '_rate.fits', '_rateints.fits'] See the JWST calibration pipeline documentation for a complete list.
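
One way to keep the two in sync would be to feed the config-file defaults into the argument parser, so that --help always prints whatever default is actually in effect. The sketch below uses only the standard argparse module; `build_parser` and the `config_defaults` dict are hypothetical and do not reflect how jwst_download.py is currently structured.

```python
import argparse

def build_parser(config_defaults):
    """Build a parser whose help text reports the config-file defaults.

    `config_defaults` is assumed to be a dict already loaded from the
    config file, e.g. {"filetypes": ["uncal"]}.
    """
    parser = argparse.ArgumentParser(
        # This formatter appends "(default: ...)" to every help string.
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument(
        "-f", "--filetypes",
        nargs="+",
        default=config_defaults.get("filetypes"),
        help="List of product filetypes to get, e.g., _uncal.fits or _uncal.jpg.",
    )
    return parser
```

With `build_parser({"filetypes": ["uncal"]})`, the help output for --filetypes would report the uncal default instead of None, and the longer README description could be pasted into the `help` string in the same place.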

Btw, great tool. Everyone I know is using it, as the interface is simple. 🚀

.gitignore

A .gitignore file would be a useful addition to this repository. I do an in-place installation, which generates *.egg-info and __pycache__ directories that git then wants to commit. Here's a standard .gitignore I use that covers a number of hidden files that various programs might create in the directory:


# Created by https://www.gitignore.io/api/macos,emacs,python

### macOS ###
*.DS_Store
.AppleDouble
.LSOverride

# Visual Studio Code
.vscode

# Icon must end with two \r
Icon
# Thumbnails
._*
# Files that might appear in the root of a volume
.DocumentRevisions-V100
.fseventsd
.Spotlight-V100
.TemporaryItems
.Trashes
.VolumeIcon.icns
.com.apple.timemachine.donotpresent
# Directories potentially created on remote AFP share
.AppleDB
.AppleDesktop
Network Trash Folder
Temporary Items
.apdisk


### Emacs ###
# -*- mode: gitignore; -*-
*~
\#*\#
/.emacs.desktop
/.emacs.desktop.lock
*.elc
auto-save-list
tramp
.\#*

# Org-mode
.org-id-locations
*_archive

# flymake-mode
*_flymake.*

# eshell files
/eshell/history
/eshell/lastdir

# elpa packages
/elpa/

# reftex files
*.rel

# AUCTeX auto folder
/auto/

# cask packages
.cask/
dist/

# Flycheck
flycheck_*.el

# server auth directory
/server/

# projectiles files
.projectile

# directory configuration
.dir-locals.el


### Python ###
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints
_*.ipynb
Untitled*.ipynb

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# No FITS files
*.fits

# Ignore APT backups
*.aptbackup

No Observations Found

I am running a previously successful (version 0.0.3) command with the latest version of jwst_mast_query, and it is now failing. I can't figure out what needs to get changed.

jwst_download.py -v --config jwst_query.cfg --propID 2666 --outrootdir ./mydownloads/ -l 50 --i miri --obsmode image --token YOUR_TOKEN_HERE --filetypes ‘_uncal.fits’

html file overwritten when downloading separate observations

For 01138, I downloaded observation 4 with one command, and then downloaded observation 5 with a second command. This was done in order to save each to a separate subdirectory. Looking at the html file, it shows only the observation 5 files. It is also located in the directory alongside the obsnum directories. It might be better to save it in the obsnum directory. This would solve the overwriting issue and give easier-to-digest html files, with one file for each obsnum.

@arminrest

File number Issues with NIRSpec MSA downloads

In NIRSpec MSA downloads, the query returns tens of thousands of files, and each file seems to be downloaded multiple times, which takes an extraordinary amount of time.

For example, this config:


# define the output filename
outrootdir: nirspec/data/
outsubdir:
skip_propID2outsubdir: False
skip_check_if_outfile_exists: False

filetypes: ['_rate.fits', '_msa.fits']

yields:

### Summary propID/obsnum:
##########################
 proposal_id obsnum  _rate.fits  _msa.fits              date_start
        1345     61        2766       1386 2022-12-21T02:43:12.597
        1345     62          12          6 2022-12-21T06:33:48.039
        1345     63        5268       2634 2022-12-21T08:28:54.465
        1345     64        4140       2070 2022-12-21T10:07:02.395
        1345     66        1428        720 2022-12-21T13:59:18.006
        1345     67          12          6 2022-12-21T17:46:40.672
        1345     68        2718       1362 2022-12-21T19:31:26.938
        1345     69        2718       1362 2022-12-21T23:24:39.152
        1345     71          24         18 2022-12-22T03:16:53.100
Saving 28650 rows into ./mast_query_table.selprod.txt
Saving 1662 rows into ./mast_query_table.obs.txt
Saving 9 rows into ./mast_query_table.summary.txt

###############################
### Downloading 27730 files


### Downloading #1 out of 27730 files (status: 0 successful, 0 failed): jw01345061001_02101_00003_nrs1_rate.fits
Downloading URL https://mast.stsci.edu/jwst/api/v0.1/download/file?uri=mast:JWST/product/jw01345061001_02101_00003_nrs1_rate.fits tonirspec/data/01345/jw01345061001_02101_00003_nrs1_rate.fits ...
|====================================================================================================================================|  83M/ 83M (100.00%)        20s

### Downloading #2 out of 27730 files (status: 1 successful, 0 failed): jw01345061001_02101_00003_nrs1_rate.fits
Downloading URL https://mast.stsci.edu/jwst/api/v0.1/download/file?uri=mast:JWST/product/jw01345061001_02101_00003_nrs1_rate.fits to data/01345/jw01345061001_02101_00003_nrs1_rate.fits ...
|====================================================================================================================================|  83M/ 83M (100.00%)        17s

### Downloading #3 out of 27730 files (status: 2 successful, 0 failed): jw01345061001_02101_00003_nrs1_rate.fits
Downloading URL https://mast.stsci.edu/jwst/api/v0.1/download/file?uri=mast:JWST/product/jw01345061001_02101_00003_nrs1_rate.fits to nirspec/data/01345/jw01345061001_02101_00003_nrs1_rate.fits

Ideally the code should check for unique files before downloading, right? Otherwise, for a full program with all levels of calibration, there will be 100k+ files being downloaded repeatedly.
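
A minimal sketch of such a uniqueness check, assuming the selected products are available as a sequence of dict-like rows keyed by `productFilename`. The helper name is hypothetical, and where exactly this would hook into the download loop is left open.

```python
def unique_by_filename(products, key="productFilename"):
    """Return the products with duplicate filenames removed.

    The first occurrence of each filename is kept and the original
    order is preserved, so the download log stays predictable.
    """
    seen = set()
    unique = []
    for product in products:
        name = product[key]
        if name not in seen:
            seen.add(name)
            unique.append(product)
    return unique
```

If the product table is a pandas DataFrame, `drop_duplicates(subset="productFilename")` would do the same job in one call.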

Simplify webpage_level12_jpgs and webpage_cols4table

webpage_level12_jpgs lists the suffixes of the jpg file types for which jpg thumbnails are created.
webpage_cols4table lists the columns that will be written to the index.html file.

At the moment, there is very little checking that these two lists are consistent. #41 adds a small check so that any suffixes in webpage_level12_jpgs but not in webpage_cols4table will not crash the code. But the opposite situation, where the user requests a column in the html file for which they did not ask for jpgs, is much harder to check for. And if there is a jpg suffix in webpage_cols4table that is not in webpage_level12_jpgs, the code crashes.

It might be easier to redefine webpage_cols4table so that it initially does not include any jpg suffixes; within the code, the entire list of webpage_level12_jpgs would then be added to the webpage_cols4table list before creating the html file. A user would be locked in to seeing all of the jpgs they requested, which doesn't seem like a bad thing. This way it would be guaranteed that webpage_cols4table does not contain jpg suffixes for which jpgs were not created (unless the user places them in webpage_cols4table to begin with...)
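
The proposal could look roughly like the sketch below. It assumes both settings are plain lists of strings and that jpg columns can be recognized by a ".jpg" ending (for example "_uncal.jpg"); both assumptions, and the name `merge_columns`, are illustrative only. As a small extension of the proposal, it also drops any jpg suffix the user put into webpage_cols4table by hand, which would close the remaining loophole.

```python
def merge_columns(cols4table, level12_jpgs):
    """Combine the table columns with the requested jpg suffixes.

    Any jpg entry already present in `cols4table` is discarded first,
    then every suffix from `level12_jpgs` is appended once, so the
    html table can only contain jpg columns that were actually created.
    """
    merged = [col for col in cols4table if not col.endswith(".jpg")]
    for suffix in level12_jpgs:
        if suffix not in merged:
            merged.append(suffix)
    return merged
```

The trade-off is that the jpg columns always end up after the other columns, in the order given in webpage_level12_jpgs.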

Deal with re-processed data

Is there a way to have the tool check the date that the data were processed by DMS and download the files if the processing date is more recent than the date in the existing local files?
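
As a rough sketch of what such a check might look like: compare the processing date against the modification time of the local file and download again when the archive copy is newer. Where the processing date comes from is not specified here; `processed_at` is simply assumed to be a timezone-aware datetime supplied by the caller, and `needs_redownload` is a hypothetical helper.

```python
import os
from datetime import datetime, timezone

def needs_redownload(local_path, processed_at):
    """Return True if the file is missing or older than `processed_at`.

    `processed_at` must be a timezone-aware datetime giving the time
    the product was last processed (its source is an assumption).
    """
    if not os.path.exists(local_path):
        return True
    # Use the local file's modification time as a proxy for download time.
    local_mtime = datetime.fromtimestamp(os.path.getmtime(local_path), tz=timezone.utc)
    return processed_at > local_mtime
```

File modification times are only a proxy: copying or touching the files changes them, so recording the processing date at download time would be more robust.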

Search only returns results for NIRCam

I have cloned and installed the latest version of the repo, but when I supply a cfg file (attached) that specifies instrument: niriss, the code tries to retrieve nircam data. The output I get is below. I even tried a different cfg file that I know has worked in the past.

> jwst_download.py --config jwst_query.cfg
Loading config file jwst_query.cfg
INSTRUMENT:  nircam
obsmode:  [None]
propID:  01085
obsnums:  [12]
INFO: MAST API token accepted, welcome Jo Taylor [astroquery.mast.auth]
MJD range: 59573.0 60059.976084486516
No obsmode given. Querying for all files for nircam.
WARNING: NoResultsWarning: Query returned no results. [astroquery.mast.discovery_portal]
WARNING!! No observations found!!

################################
NO OBSERVATIONS FOUND! exiting....
################################
############## Nothing selected!!!! exiting...

jwst_query.txt

Tables are not saved in the output directory

Hi,

The README section for --savetables says "Tables are saved in the same output directory as the data". However, currently the tables seem to be saved in the current working directory (from where the script is run).

I implemented a simple fix locally (append self.outdir to each Table.write() command). Should I open a PR?
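
For reference, the kind of change described above amounts to building each table path from the output directory rather than writing to a bare filename. This is only a sketch of the idea; `table_path` is a hypothetical helper and the local fix itself may look different.

```python
import os

def table_path(outdir, filename):
    """Return the path of a table file inside the output directory.

    Any directory part of `filename` (such as a leading "./") is
    dropped so the table always lands directly in `outdir`.
    """
    return os.path.join(outdir, os.path.basename(filename))
```

For example, `table_path("nirspec/data", "./mast_query_table.obs.txt")` places the observation table next to the downloaded data instead of in the current working directory.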

Thanks for this package, it's really useful!
