
spacetelescope / jwst_mast_query


Command line tool for querying MAST and downloading JWST data products

License: BSD 3-Clause "New" or "Revised" License

Python 95.26% JavaScript 4.74%
archive astronomy-astrophysics jwst python query stsci

jwst_mast_query's Issues

No module named 'PIL'

I upgraded recently and got the following error:

(mast_query) % python jwst_query.py --propID 1185 -l 5
Traceback (most recent call last):
  File "~/jwst_mast_query/jwst_mast_query/jwst_query.py", line 18, in <module>
    from PIL import Image
ModuleNotFoundError: No module named 'PIL'

pip install Pillow solved the problem for me. Maybe that just needs to be in the installation requirements?

No Observations Found

I am running a previously successful (version 0.0.3) command with the latest version of jwst_mast_query, and it is now failing. I can't figure out what needs to get changed.

jwst_download.py -v --config jwst_query.cfg --propID 2666 --outrootdir ./mydownloads/ -l 50 --i miri --obsmode image --token YOUR_TOKEN_HERE --filetypes ‘_uncal.fits’

Trouble downloading data acquired after Feb 28.

reported by @bholler:

I attempted to use jwst_mast_query to download data obtained on March 2nd from MAST but was told by the tool that there were no files that fit my criteria (the same criteria I have used for months to obtain uncal files from various instruments). I checked and the data were available on MAST itself. A colleague let me know they were having the same issue today and first suggested that the issue could be that the MJD increased above 60000 the other day. However, he did some further investigation and found that the cutoff is February 28th of this year, which is after the MJD exceeded 60000, so that doesn't appear to be the issue. Could there be something hard-coded that doesn't like that February has fewer than 30 days? I did a quick search through the code and didn't find anything like that. Not sure what exactly is going on, but the cutoff at February 28 is suspicious. I also wonder if something changed on the MAST side of things.
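To make the MJD argument above easy to check, here is a minimal sketch that converts calendar dates to MJD using only the standard library. The helper name `to_mjd` is made up for this illustration and is not part of jwst_mast_query; leap seconds are ignored.

```python
from datetime import datetime, timezone

# MJD 0 corresponds to 1858-11-17 00:00 UTC.
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)


def to_mjd(year, month, day):
    """Return the Modified Julian Date at 00:00 UTC of a calendar date."""
    delta = datetime(year, month, day, tzinfo=timezone.utc) - MJD_EPOCH
    return delta.total_seconds() / 86400.0


# Print the MJD for the dates discussed in this issue.
for date in [(2023, 2, 25), (2023, 2, 28), (2023, 3, 2)]:
    print(date, to_mjd(*date))
```

With this definition, 2023-02-25 is MJD 60000, while February 28 and March 2 are MJD 60003 and 60005, which is consistent with the observation that the rollover past 60000 happened a few days before the cutoff.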

MSA config files for NIRSpec download multiple times

Currently one can't specify _msa.fits files of AUXILIARY type to get for NIRSpec. This would be good to be able to get with this tool, as these are required for reprocessing using calwebb_spec2 for NIRSpec MSA data.

These MSA config files are the only AUXILIARY files currently required during pipeline reprocessing, outside of the SCIENCE (uncal, rate, cal, etc) and INFO (_asn.json) files.

EDIT

My bad. One can get _msa.fits files. I.e.:

jwst_download.py --propID 2736 --filetypes _msa.fits -l 80 -i nirspec

It would be good to add that to the docs as a possible --filetypes arg.

But I notice that it tries to download each one multiple times. This is probably due to the fact that a single MSA configuration file can be used for multiple observations. In the case of the ERO SMACS NIRSpec MSA data, the 2 MSA config files are used for all 1392 datasets, so it downloads each MSA file almost 700 times.

Even for a single exposure, a single MSA config file will be used for both NRS1 and NRS2 detectors, so a single config file for 2 datasets. Perhaps it would be good to find unique entries in the final download table?
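A minimal sketch of the "unique entries" idea, assuming the download table can be treated as a list of rows keyed by a `productFilename` column. The function name and the example file names are hypothetical and not taken from the tool.

```python
def unique_products(rows, key="productFilename"):
    """Keep the first row for each distinct file name, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        name = row[key]
        if name not in seen:
            seen.add(name)
            unique.append(row)
    return unique


# Hypothetical rows: one MSA config file shared by two detectors.
rows = [
    {"obs_id": "example_nrs1", "productFilename": "example_01_msa.fits"},
    {"obs_id": "example_nrs2", "productFilename": "example_01_msa.fits"},
    {"obs_id": "example_nrs1", "productFilename": "example_nrs1_rate.fits"},
]
for row in unique_products(rows):
    print(row["productFilename"])
```

If the product table is a pandas DataFrame, `drop_duplicates(subset=["productFilename"])` would do the same job in one call.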

No observations detected --- and NameError: name 'sys' is not defined

Hi folks,

Having issues with the latest version of the software. I'm using the .cfg file that's on the repo, but simply modifying the instrument, the PID to match 1541 (which is a public TSO) and the lookbacktime by doing the following edits:

instrument: niriss/soss
(...)
propIDs_obsnum2outsubdir: 1541
(...)
lookbacktime: 1000.0

I then run via jwst_download.py --propID 1541 --config jwst_query.cfg.

However, when I do this, I get the following:

Loading config file jwst_query.cfg
propID 01541
obsnums [1]
INSTRUMENT: niriss/soss
INFO: MAST API token accepted, welcome Nestor Espinoza [astroquery.mast.auth]
MJD range: 59027.13078453367 60027.230784538275
WARNING: NoResultsWarning: Query returned no results. [astroquery.mast.discovery_portal]
WARNING!! No observations found!!

################################
NO OBSERVATIONS FOUND! exiting....
################################
############## Nothing selected!!!! exiting...
Traceback (most recent call last):
  File "/Users/nespinoza/opt/anaconda3/envs/neonewen/bin/jwst_download.py", line 4, in <module>
    __import__('pkg_resources').run_script('jwst-mast-query==0.0.2.dev15+g44b81de', 'jwst_download.py')
  File "/Users/nespinoza/opt/anaconda3/envs/neonewen/lib/python3.10/site-packages/pkg_resources/__init__.py", line 672, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/Users/nespinoza/opt/anaconda3/envs/neonewen/lib/python3.10/site-packages/pkg_resources/__init__.py", line 1479, in run_script
    exec(script_code, namespace, namespace)
  File "/Users/nespinoza/opt/anaconda3/envs/neonewen/lib/python3.10/site-packages/jwst_mast_query-0.0.2.dev15+g44b81de-py3.10.egg/EGG-INFO/scripts/jwst_download.py", line 45, in <module>
  File "/Users/nespinoza/opt/anaconda3/envs/neonewen/lib/python3.10/site-packages/jwst_mast_query-0.0.2.dev15+g44b81de-py3.10.egg/EGG-INFO/scripts/jwst_download.py", line 37, in main
NameError: name 'sys' is not defined

So, first, no observations are detected (?) and then a sys not defined (perhaps related to this?).

Has anyone else experienced this? I don't know what is actually going wrong.

Thanks!
Néstor
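The NameError is plausibly a separate, smaller problem: the script calls something from `sys` on its "nothing selected" exit path without importing the module. The sketch below shows the pattern with the import in place; the function name is made up, and this is not the script's actual code.

```python
import sys  # without this import, sys.exit() raises NameError


def exit_if_nothing_selected(n_selected):
    """Exit cleanly when the query selected no products."""
    if n_selected == 0:
        print("############## Nothing selected!!!! exiting...")
        sys.exit(0)
    return n_selected
```

This would only fix the traceback. The empty query result is a separate problem.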

.gitignore

A .gitignore file would be a useful addition to this repository. I do an in-place installation, which generates *.egg-info and __pycache__ directories that git then wants to commit. Here's a standard .gitignore I use that covers a bunch of hidden files that various programs might create in the directory:


# Created by https://www.gitignore.io/api/macos,emacs,python

### macOS ###
*.DS_Store
.AppleDouble
.LSOverride

# Visual Studio Code
.vscode

# Icon must end with two \r
Icon
# Thumbnails
._*
# Files that might appear in the root of a volume
.DocumentRevisions-V100
.fseventsd
.Spotlight-V100
.TemporaryItems
.Trashes
.VolumeIcon.icns
.com.apple.timemachine.donotpresent
# Directories potentially created on remote AFP share
.AppleDB
.AppleDesktop
Network Trash Folder
Temporary Items
.apdisk


### Emacs ###
# -*- mode: gitignore; -*-
*~
\#*\#
/.emacs.desktop
/.emacs.desktop.lock
*.elc
auto-save-list
tramp
.\#*

# Org-mode
.org-id-locations
*_archive

# flymake-mode
*_flymake.*

# eshell files
/eshell/history
/eshell/lastdir

# elpa packages
/elpa/

# reftex files
*.rel

# AUCTeX auto folder
/auto/

# cask packages
.cask/
dist/

# Flycheck
flycheck_*.el

# server auth directory
/server/

# projectiles files
.projectile

# directory configuration
.dir-locals.el


### Python ###
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints
_*.ipynb
Untitled*.ipynb

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# No FITS files
*.fits

# Ignore APT backups
*.aptbackup

html file overwritten when downloading separate observations

For 01138, I downloaded observation 4 with one command, and then downloaded observation 5 with a second command. This was done in order to save each to a separate subdirectory. The html file shows only the observation 5 files. It's also located in the directory alongside the obsnum directories. It might be better to save it in the obsnum directory. This would solve the overwriting issue and give easier-to-digest html files, with one file for each obsnum.

@arminrest

html file is empty when specifying one suffix of jpg

I just tried retrieving jpg files for 01143. When I set filetype to jpg, I get all the jpgs and an index.html file that looks good. If I set filetype to _rate.jpg then it downloads all of the *_rate.jpg files, but the index.html file is empty other than the column names.

Questions for documentation

I'm going to add questions here as I go through the documentation:

  • What if you want all the data for proposal XXXXX, regardless of how long ago it was? Will providing the PID override the lookback time?
    Answer: No, it won't. Lookback time needs to be far enough back to get all the files.

  • productfilename is no longer a part of the path of the downloaded files?
    It looks like it is not.

Search only returns results for NIRCam

I have cloned and installed the latest version of the repo, but when I supply a cfg file (attached) that specifies instrument: niriss, the code tries to retrieve nircam data. The output I get is below. I even tried a different cfg file that I know has worked in the past.

> jwst_download.py --config jwst_query.cfg
Loading config file jwst_query.cfg
INSTRUMENT:  nircam
obsmode:  [None]
propID:  01085
obsnums:  [12]
INFO: MAST API token accepted, welcome Jo Taylor [astroquery.mast.auth]
MJD range: 59573.0 60059.976084486516
No obsmode given. Querying for all files for nircam.
WARNING: NoResultsWarning: Query returned no results. [astroquery.mast.discovery_portal]
WARNING!! No observations found!!

################################
NO OBSERVATIONS FOUND! exiting....
################################
############## Nothing selected!!!! exiting...

jwst_query.txt

Add documentation of config file parameters to inline help

Currently the inline help

jwst_download.py --help

has less information (and some inaccurate listing of defaults) about the possible parameters one can use, relative to the information in the README describing the config file parameters. It would be good to merge that config file parameter documentation into the inline help, including the current updated defaults.

For example, the default for filetypes is currently uncal, but the inline help says it is None:

  -f FILETYPES [FILETYPES ...], --filetypes FILETYPES [FILETYPES ...]
                        List of product filetypes to get, e.g., _uncal.fits or
                        _uncal.jpg. If only letters, then _ and .fits are
                        added, for example uncal gets expanded to _uncal.fits.
                        Typical image filetypes are uncal, rate, rateints, cal
                        (default=None)

and

filetypes: ['uncal'] | List of file types to select in the product table, e.g., _uncal.fits or _uncal.jpg. If no suffix is given, .fits is appended. If only letters, then _ and .fits are added. For example, 'uncal' gets expanded to _uncal.fits. Typical image filetypes are uncal, rate, rateints, cal. For downloading a single file type, the brackets must still surround the file suffix, as the script expects a list. A relatively complete list of options includes: ['_segm.fits', '_asn.json', '_pool.csv', '_i2d.jpg', '_thumb.jpg', '_cat.ecsv', '_i2d.fits', '_uncal.fits', '_uncal.jpg', '_cal.fits', '_trapsfilled.fits', '_cal.jpg', '_rate.jpg', '_rateints.jpg', '_trapsfilled.jpg', '_rate.fits', '_rateints.fits'] See the JWST calibration pipeline documentation for a complete list.
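The expansion rule described in both help texts can be sketched as follows. This implements only the two rules as stated (letters only gets `_` and `.fits`; no suffix gets `.fits`) and is not the tool's actual code. The function name is an assumption.

```python
import re


def expand_filetype(filetype):
    """Expand a short filetype into a full product suffix."""
    # Letters only, e.g. 'uncal': add the leading '_' and '.fits'.
    if re.fullmatch(r"[A-Za-z]+", filetype):
        return f"_{filetype}.fits"
    # No file extension, e.g. '_uncal': append '.fits'.
    if "." not in filetype:
        return f"{filetype}.fits"
    # Already complete, e.g. '_uncal.jpg' or '_asn.json'.
    return filetype


for short in ["uncal", "_rate", "_uncal.jpg", "_asn.json"]:
    print(short, expand_filetype(short))
```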

Btw, great tool. Everyone I know is using it, as the interface is simple. 🚀

Unable to retrieve guide star data

The flag for filtering out guide star data seems to always be True? I tried a query where I did not set it to True, but no guide star data were returned in the search. None of the queries below returned GS data.

We should make sure that the flag value is respected, so that users can retrieve guide star data if they want it.
Also, we may want to add a flag, e.g. --no_science, so that only guide star data are returned.

jwst_download.py -i nircam --propID 1068 --skipdownload --date_select 2022-05-14 2022-05-16 --savetables temp
jwst_download.py -i nircam --propID 1068 --obsnums 4 --skipdownload --date_select 2022-05-14 2022-05-16 --savetables temp
jwst_download.py --propID 1068 --skipdownload --date_select 2022-05-14 2022-05-16 --savetables temp
jwst_download.py -i fgs --propID 1068 --skipdownload --date_select 2022-05-14 2022-05-16 --savetables temp

Put index.html in correct location when there is no PID subdirectory

from Jo Taylor:

Just tested it for NIRISS commissioning data and it worked great. The only issue I had was when I set skip_propID2outsubdir to True and also made the index.html file. In this case it downloaded all files to outrootdir, but wrote the index.html to outrootdir/PID, with the locations of the jpegs also pointing to the PID subdir, even though they actually existed one directory up in outrootdir. Therefore none of the thumbnails worked.
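One way to keep the html file next to the downloaded files is to derive both locations from the same helper. A minimal sketch, assuming the PID subdirectory is the zero-padded five-digit proposal ID; the function name is hypothetical.

```python
import os


def html_path(outrootdir, prop_id, skip_propID2outsubdir):
    """Return where index.html should go so it sits beside the data files."""
    if skip_propID2outsubdir:
        # Files were written straight into outrootdir.
        outdir = outrootdir
    else:
        # Files were written into outrootdir/<5-digit PID>.
        outdir = os.path.join(outrootdir, f"{int(prop_id):05d}")
    return os.path.join(outdir, "index.html")


print(html_path("mydownloads", 1138, skip_propID2outsubdir=True))
print(html_path("mydownloads", 1138, skip_propID2outsubdir=False))
```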

Cannot find some instrument data due to mode

This potentially should be considered a MAST issue rather than an issue for this tool; not sure.

I have discovered that in MAST, for NIRCam coronagraphic data the ‘Instrument’ keyword is set to “NIRCAM/CORON” instead of just NIRCam. On the other hand for MIRI coronagraphy it’s just “MIRI”. In other words the instruments are not treated consistently.

A practical consequence of this is that jwst_mast_query cannot find or download any NIRCam coronagraphy data, because it searches for “nircam” data, and does not know to also search for “nircam/coron”.

There appear to be similar cases of this for other instruments, for example some data labeled as "NIRSPEC/IFU" which does not show up if you search for NIRSpec data using this tool...
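A possible workaround on the tool side is to treat a requested instrument as matching both the bare name and any "NAME/MODE" variant. The sketch below is a local string comparison only; the function name is made up, and it does not show how the MAST query itself would need to change.

```python
def instrument_matches(requested, mast_instrument):
    """True if a MAST instrument name is the requested one or a mode of it."""
    requested = requested.strip().upper()
    mast_instrument = mast_instrument.strip().upper()
    return (mast_instrument == requested
            or mast_instrument.startswith(requested + "/"))


for name in ["NIRCAM", "NIRCAM/CORON", "NIRSPEC/IFU", "MIRI"]:
    print(name, instrument_matches("nircam", name))
```

A fully qualified request such as niriss/soss still matches only its exact name.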

keyring.backends.macOS.api.Error: (-25244, 'Unknown Error')

I get a Mac keychain error when running jwst_query.py out of the box with Python 3.8.11, astroquery=0.4.5, and keyring=23.5.0 on an M1 Pro Mac with macOS Monterey 12.0.1. It works without an error in a different environment that has Python 3.6.7, astroquery='0.4.2.dev0' and some Intel Rosetta thing. A prompt says "python3 wants to use your confidential information stored in astroquery:mast.stsci.edu.token in your keychain." After that, I get an error.

Here is the full traceback:

python jwst_query.py                             
Loading config file /Users/~/outside_progs/jwst_mast_query/jwst_mast_query/jwst_query.cfg
INSTRUMENT: nircam
Traceback (most recent call last):
  File "/Users/~/miniconda3/envs/m1py3p8/lib/python3.8/site-packages/keyring/backends/macOS/__init__.py", line 40, in set_password
    api.set_generic_password(self.keychain, service, username, password)
  File "/Users/~/miniconda3/envs/m1py3p8/lib/python3.8/site-packages/keyring/backends/macOS/api.py", line 151, in set_generic_password
    delete_generic_password(name, service, username)
  File "/Users/~/miniconda3/envs/m1py3p8/lib/python3.8/site-packages/keyring/backends/macOS/api.py", line 172, in delete_generic_password
    Error.raise_for_status(status)
  File "/Users/~/miniconda3/envs/m1py3p8/lib/python3.8/site-packages/keyring/backends/macOS/api.py", line 114, in raise_for_status
    raise cls(status, "Unknown Error")
keyring.backends.macOS.api.Error: (-25244, 'Unknown Error')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "jwst_query.py", line 959, in <module>
    query.login(raiseErrorFlag=False)
  File "jwst_query.py", line 302, in login
    self.JwstObs.login(token=token, store_token=True)
  File "/Users/~/miniconda3/envs/m1py3p8/lib/python3.8/site-packages/astroquery/query.py", line 150, in login
    bases[0].login(*args, **kwargs)
  File "/Users/~/miniconda3/envs/m1py3p8/lib/python3.8/site-packages/astroquery/query.py", line 514, in login
    self._authenticated = self._login(*args, **kwargs)
  File "/Users/~/miniconda3/envs/m1py3p8/lib/python3.8/site-packages/astroquery/mast/core.py", line 66, in _login
    return self._auth_obj.login(token, store_token, reenter_token)
  File "/Users/~/miniconda3/envs/m1py3p8/lib/python3.8/site-packages/astroquery/mast/auth.py", line 76, in login
    keyring.set_password("astroquery:mast.stsci.edu.token", "masttoken", token)
  File "/Users/~/miniconda3/envs/m1py3p8/lib/python3.8/site-packages/keyring/core.py", line 60, in set_password
    get_keyring().set_password(service_name, username, password)
  File "/Users/~/miniconda3/envs/m1py3p8/lib/python3.8/site-packages/keyring/backends/macOS/__init__.py", line 44, in set_password
    raise PasswordSetError("Can't store password on keychain: " "{}".format(e))
keyring.errors.PasswordSetError: Can't store password on keychain: (-25244, 'Unknown Error')

A way around it is to change this line in jwst_query.py

self.JwstObs.login(token=token, store_token=True)

to

self.JwstObs.login(token=token, store_token=False)
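Instead of hard-coding store_token=False, the login could try to store the token first and fall back only if that fails. A minimal sketch under that assumption: `login` stands for any callable with the same keyword arguments as the call above, and the broad `except` is a simplification that real code would narrow.

```python
def login_with_fallback(login, token):
    """Log in and store the token; retry without storing if that fails.

    Returns True if the token was stored, False if the fallback was used.
    """
    try:
        login(token=token, store_token=True)
        return True
    except Exception as exc:  # broad on purpose in this sketch
        print(f"Could not store token ({exc}); retrying with store_token=False")
        login(token=token, store_token=False)
        return False
```

In jwst_query.py this would be called with `self.JwstObs.login` as the first argument.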

Tables are not saved in the output directory

Hi,

The README section for --savetables says "Tables are saved in the same output directory as the data". However, currently the tables seem to be saved in the current working directory (from where the script is run).

I implemented a simple fix locally (append self.outdir to each Table.write() command). Should I open a PR?
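The fix amounts to joining the output directory onto each table file name before writing. A minimal sketch; the helper name is made up, and `outdir` stands for whatever the class stores in self.outdir.

```python
import os


def table_path(outdir, filename):
    """Return the path of a table file inside the data output directory."""
    # basename() drops any leading './' so the file always lands in outdir.
    return os.path.join(outdir, os.path.basename(filename))


print(table_path("mydownloads/01138", "./mast_query_table.obs.txt"))
```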

Thanks for this package, it's really useful!

Simplify webpage_level12_jpgs and webpage_cols4table

webpage_level12_jpgs lists the suffixes of the jpg file types for which jpg thumbnails are created.
webpage_cols4table lists the columns that will be written to the index.html file.

At the moment, there is very little checking that these two lists are consistent. #41 adds a small check so that any suffixes in webpage_level12_jpgs but not in webpage_cols4table will not crash the code. But the opposite situation, where the user requests a column in the html file for which they did not ask for jpgs, is much harder to check for. And if there is a jpg suffix in webpage_cols4table that is not in webpage_level12_jpgs, the code crashes.

Easier might be to redefine webpage_cols4table to initially not include any jpg suffixes, and within the code, the entire list of webpage_level12_jpgs will be added to the webpage_cols4table list before creating the html file. In that way, a user would be locked in to seeing all of the jpgs they requested, which doesn't seem like a bad thing. But this way it would be guaranteed that webpage_cols4table would not contain jpg suffixes for which jpgs were not created (unless the user places them in webpage_cols4table to begin with...)
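The proposal can be sketched as a small merge step that runs just before the html file is written. The function name and the example column values are assumptions for illustration only.

```python
def columns_for_html(webpage_cols4table, webpage_level12_jpgs):
    """Return the table columns with every requested jpg suffix appended once."""
    cols = list(webpage_cols4table)
    for suffix in webpage_level12_jpgs:
        if suffix not in cols:
            cols.append(suffix)
    return cols


# Hypothetical config values.
print(columns_for_html(["proposal_id", "obsnum"], ["_uncal.jpg", "_rate.jpg"]))
```

As noted above, this does not protect against a user who puts a jpg suffix directly into webpage_cols4table without requesting it in webpage_level12_jpgs.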

date_select does not work from the config file

I tried a query supplying date_select on the command line and everything worked as expected. However, repeating the query using date_select in the config file, it appears that date_select is being ignored. When using date_select in the config file, I tried the query with and without the lookback entry commented out, and with the date_select values as two numbers separated by a space, as well as two numbers in a python list e.g. [59650, 59651], as well as two strings in a python list e.g. ['59650', '59651']. The results were the same in all cases. @arminrest

Get all data for propID without having to input dates

It would be nice if a user who wants to download all data for a given program could do that without having to specify a lookbacktime or date_select value.

e.g., it would be nice if these worked: (they currently don't because lookbacktime defaults to 1 day, which is a good thing)
jwst_download.py -i nircam --propID 1068
jwst_download.py -i nircam --propID 1068 --obsnums 3

This can be done within the current framework. I think it would require an initial query to MAST to retrieve the dates that the data for the propID were acquired. And then use those dates as date_select values.
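The second half of that idea, turning the observation dates from the initial query into a date_select range, could look like the sketch below. The initial MAST query is not shown; the function name, the padding parameter, and the example MJD values are all made up.

```python
def date_range_for_program(obs_mjds, pad_days=1.0):
    """Return (mjd_min, mjd_max) covering all observations, or None if empty."""
    if not obs_mjds:
        return None
    return (min(obs_mjds) - pad_days, max(obs_mjds) + pad_days)


# Hypothetical observation start times (MJD) from an initial query.
print(date_range_for_program([59714.5, 59715.5, 59713.5]))
```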

Sort the obs and product tables

Would be nice if the results shown in the productTable and obsTable were sorted by observation and productFilename. The results would be easier to examine.

It's just a little complicated because the way the code is at the moment, the unwanted rows are not filtered out from the tables until within the call to the write() function. Might be better to cut the tables down to the wanted rows earlier, and then do something like the lines below, and then write the tables to the screen/files.

    # Sort the productTable to make it easier to read.
    # sort_values() returns a new DataFrame, so assign the result back.
    if "obsnum" in self.productTable.t and "obs_id" in self.productTable.t:
        self.productTable.t = self.productTable.t.sort_values(by=["obsnum", "obs_id"], ascending=[True, True])

Deal with re-processed data

Is there a way to have the tool check the date that the data were processed by DMS and download the files if the processing date is more recent than the date in the existing local files?
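One possible check, sketched under the assumption that MAST provides a processing timestamp for each product as an ISO string: compare it with the modification time of the local file. The function and parameter names are hypothetical. The local mtime is the download time, so it is only a proxy for the processing date of the local copy.

```python
import os
from datetime import datetime, timezone


def needs_redownload(local_path, processed_iso):
    """True if the file is missing or was reprocessed after it was saved."""
    if not os.path.exists(local_path):
        return True
    processed = datetime.fromisoformat(processed_iso)
    if processed.tzinfo is None:
        # Assume timestamps without a zone are UTC.
        processed = processed.replace(tzinfo=timezone.utc)
    local_mtime = datetime.fromtimestamp(os.path.getmtime(local_path),
                                         tz=timezone.utc)
    return processed > local_mtime
```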

LRE6 flag cuts off too early

LRE6 cuts off at 2021-10-23T00:00:00 but there is data for LRE6 beyond that.

I suggest changing the LRE6 flag from
mjd_max = mjd_min+6
to
mjd_max = mjd_min+10

querying for exposure types

Would it be possible to add a query option for exposure types? This would help when querying for TA products, for example.

File number Issues with NIRSpec MSA downloads

In NIRSpec MSA downloads, the query returns tens of thousands of files, and each file seems to be downloaded multiple times, which takes an extraordinary amount of time.

For example:


# define the output filename
outrootdir: nirspec/data/
outsubdir:
skip_propID2outsubdir: False
skip_check_if_outfile_exists: False

filetypes: ['_rate.fits', '_msa.fits']

yields:

### Summary propID/obsnum:
##########################
 proposal_id obsnum  _rate.fits  _msa.fits              date_start
        1345     61        2766       1386 2022-12-21T02:43:12.597
        1345     62          12          6 2022-12-21T06:33:48.039
        1345     63        5268       2634 2022-12-21T08:28:54.465
        1345     64        4140       2070 2022-12-21T10:07:02.395
        1345     66        1428        720 2022-12-21T13:59:18.006
        1345     67          12          6 2022-12-21T17:46:40.672
        1345     68        2718       1362 2022-12-21T19:31:26.938
        1345     69        2718       1362 2022-12-21T23:24:39.152
        1345     71          24         18 2022-12-22T03:16:53.100
Saving 28650 rows into ./mast_query_table.selprod.txt
Saving 1662 rows into ./mast_query_table.obs.txt
Saving 9 rows into ./mast_query_table.summary.txt

###############################
### Downloading 27730 files


### Downloading #1 out of 27730 files (status: 0 successful, 0 failed): jw01345061001_02101_00003_nrs1_rate.fits
Downloading URL https://mast.stsci.edu/jwst/api/v0.1/download/file?uri=mast:JWST/product/jw01345061001_02101_00003_nrs1_rate.fits tonirspec/data/01345/jw01345061001_02101_00003_nrs1_rate.fits ...
|====================================================================================================================================|  83M/ 83M (100.00%)        20s

### Downloading #2 out of 27730 files (status: 1 successful, 0 failed): jw01345061001_02101_00003_nrs1_rate.fits
Downloading URL https://mast.stsci.edu/jwst/api/v0.1/download/file?uri=mast:JWST/product/jw01345061001_02101_00003_nrs1_rate.fits to data/01345/jw01345061001_02101_00003_nrs1_rate.fits ...
|====================================================================================================================================|  83M/ 83M (100.00%)        17s

### Downloading #3 out of 27730 files (status: 2 successful, 0 failed): jw01345061001_02101_00003_nrs1_rate.fits
Downloading URL https://mast.stsci.edu/jwst/api/v0.1/download/file?uri=mast:JWST/product/jw01345061001_02101_00003_nrs1_rate.fits to nirspec/data/01345/jw01345061001_02101_00003_nrs1_rate.fits

Ideally the code should check for unique files before downloading, right? Otherwise, for a full program with all levels of calibration, there will be 100k+ files being repeatedly downloaded.
