astropy / astropy-data
The source for the astropy data repository (although the primary server is not on github)
hello,
STScI built a mirror for http://www.astropy.org/astropy-data/. It can be found here:
https://astropy-data.s3.us-east-1.amazonaws.com/data/index.html
To update the mirror, register the following URL as a GET webhook in the GitHub repository's webhook settings:
https://la6fmo7tpj.execute-api.us-east-1.amazonaws.com/main/astropy_hook
Where may we share a copy of the source code?
It occurs to me that there is some overlap between sites.json and https://github.com/astrofrog/acknowledgment-generator. They have different use cases, but both maintain a database of observatory data. In sites.json we record the source of the lat/lon information, which is often (but not always) the same scholarly paper that would be used to acknowledge data from that observatory. Thus, it might be good for the community to combine the two databases into one.
Since it would be considerable work to backfill all values for the observatories that are not common to both databases right now, I suggest allowing a string value such as "not filled in the database yet, but please support this work by opening a PR here: LINK". There is some overlap already, in particular for the major observatories.
sites.json also does not include space-based facilities. Either we change the structure to allow a location value of "space", or we keep this restricted to ground-based facilities and leave it up to users such as https://github.com/astrofrog/acknowledgment-generator to handle space facilities separately.
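As a concrete illustration of the first option, a space-based entry might look something like this (the "hst" key, the name, and the "location": "space" field are purely illustrative proposals, not an agreed format):

```python
import json

# Hypothetical sites.json entry for a space-based facility; the "hst"
# key and the "location": "space" value are illustrative only.
entry = {
    "hst": {
        "name": "Hubble Space Telescope",
        "location": "space",
    }
}
print(json.dumps(entry, indent=2))
```

The advantage of a sentinel value over a separate file is that existing consumers of sites.json only need to learn to skip entries whose location is "space".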
There seems to be an inconsistency in the sign of longitudes in the sites.json file. For example, Siding Spring Observatory should be at about 149 deg East longitude, the NASA IRTF at about 155 deg West, Lick Observatory at about 121 deg West, and La Palma at about 17 deg West. But what we see in the sites.json file is inconsistent (output truncated to highlight the problem):
"sso": {
    "name": "Siding Spring Observatory",
    "longitude": 149.06119444444445
}
"irtf": {
    "name": "NASA Infrared Telescope Facility",
    "longitude": 155.4719987888889  ## should be 204.5280012111111
}
"lick": {
    "name": "Lick Observatory",
    "longitude": 238.36333333333332
}
"lapalma": {
    "name": "Roque de los Muchachos, La Palma",
    "longitude": 342.12
}
When employed in coordinates.EarthLocation we see:
for site in ['sso', 'irtf', 'lick', 'lapalma']:
print(coordinates.EarthLocation.of_site(site).lon)
149d03m40.3s
155d28m19.1956s
-121d38m12s
-17d52m48s
But I think it should be:
149d03m40.3s
-155d28m19.1956s ## <- IRTF
-121d38m12s
-17d52m48s
This inconsistency arose from different conventions for representing longitude. The irtf data came in commit 808c025, which has other entries using negative longitudes, but those seem OK.
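One way to enforce a single convention would be to normalize every longitude before it is written to sites.json. The helper below is only a sketch (it is not part of this repo or of astropy); note that it can fix range inconsistencies like the Lick and La Palma entries, but not a sign error like the IRTF entry, which has to be corrected by hand:

```python
def wrap_longitude(lon_deg):
    """Normalize a longitude in degrees to [-180, 180), west negative."""
    return (lon_deg + 180.0) % 360.0 - 180.0

# The entries quoted above normalize to the expected values:
print(wrap_longitude(149.06119444444445))  # SSO: unchanged, ~149.06 East
print(wrap_longitude(238.36333333333332))  # Lick: ~-121.64, i.e. 121.64 West
print(wrap_longitude(342.12))              # La Palma: ~-17.88, i.e. 17.88 West
```

Alternatively, astropy's Longitude class with a wrap_angle of 180 deg performs the same wrapping with units attached.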
It might be neat to add the AAS facilities keywords to our sites.json. If I looked up a site in astropy, I would then know the correct facility keyword to add to my article. I can envision further uses where sites.json could be used to add keywords to data downloaded through astroquery, but those are beyond the scope of this repo. If the data is here, others may find other uses.
While I got this idea from looking at the list of AAS keywords, we should add a more generic category to the database, e.g. {'journal keywords': {'AAS': keyword}}, to make it extensible if we add others later. The AAS journals are the only ones with such a system that I'm aware of right now.
AAS journals facilities keywords: https://journals.aas.org/facility-keywords/
Maybe this could be done in collaboration with the AAS data editors? @augustfly
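To make the proposal concrete, an extended entry could carry the keyword under the generic category. The sketch below is illustrative only: the "SSO" keyword string is made up, and the schema is just this issue's proposal, not anything in the current file:

```python
import json

# Hypothetical extended sites.json entry; the AAS facility keyword
# value "SSO" is illustrative, not taken from the real AAS list.
entry = {
    "name": "Siding Spring Observatory",
    "journal keywords": {"AAS": "SSO"},
}
print(json.dumps(entry, indent=2))
```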
#38 fixed an issue described in #36 where astropy doesn't correctly parse the sites.json file when it has non-ASCII characters. While #38 fixes it, it does so by removing some characters from proper names that really ought to be Unicode. astropy/astropy#7082 fixes the underlying problem by making everything Unicode-compliant in astropy. So this issue is about essentially reverting #38 and, more broadly, allowing UTF-8 in the sites.json file.
We do not want to do this yet, however, as it would instantly break all released versions of astropy (at the time of this writing). So instead, we are thinking of following the suggestion in #36 (comment): wait a while until we think most installed astropy versions have the fix (which could be a long time...), and then bring the non-ASCII characters back in.
The current time zone for the Keck observatories is incorrect in the coordinates.json file. While currently marked as 'US/Aleutian', it should be Hawaii Standard Time (US/Hawaii).
The example code from here can be used to check.
'US/Aleutian' observes daylight saving time (UTC-9 in summer), whereas HST is UTC-10 year-round.
https://www.timeanddate.com/worldclock/usa/honolulu
http://tdc-www.harvard.edu/iraf/rvsao/bcvcorr/obsdb.html
Might want to consider making that tutorial use a smaller file and get rid of synchrotron_i_lobe_0700_150MHz.fits. cc @eblur and @adrn
remote: warning: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com.
remote: warning: See http://git.io/iEPt8g for more information.
remote: warning: File tutorials/synthetic-images/synchrotron_i_lobe_0700_150MHz.fits is 64.00 MB; this is larger than GitHub's recommended maximum file size of 50.00 MB
We need to set up some basic CI testing to make sure changes in this repo don't introduce a regression in the astropy core testing; see #36.
This can be assigned to me, and I will try to come back to it in a few days' time.
Without a way to automate the updates for the intersphinx mappings this may be suboptimal, but having links pointing to an old version is still better than the current situation of failing docs builds on astropy core.
This is somewhat related to #69
Currently the sites file lists lowell and DCT (Discovery Channel Telescope). The DCT was renamed to the Lowell Discovery Telescope (LDT) earlier this year; could we add a duplicate entry for this?
The lowell site actually points to the Anderson Mesa site, which should be renamed to NPOI, the main telescope on that site; Lowell should instead point to the campus on Mars Hill in Flagstaff (35.202875, -111.664781, 2195 m), where we now have research telescopes.
Is this possible?
When downloading a file using a hash/ URL, such as hash/34c33b3eb0d56eb9462003af249eff28, if that file has been previously downloaded and is in your local cache, that works fine. However, if you've never downloaded it, the hash URL isn't resolved on the data.astropy.org server.
Whenever we add a data file to the server we should also add the appropriate link so that the hash URL works. Or, perhaps even better, all files on the server should actually be stored under their hash primarily, with the actual filenames being symlinks to the correct hash for the latest version of that file.
We should also have a database on data.astropy.org (just a JSON file is fine for now) mapping hashes to filenames and vice versa (the latter may be one-to-many, with the hashes in chronological order for files that have had multiple versions).
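A minimal sketch of such a mapping file, using the hash from the example URL above (the filename is a made-up placeholder; a real database would be generated from the server's contents):

```python
import json

# Hypothetical hash<->filename database for data.astropy.org.
# The hash is the one from the example URL above; "example_file.fits"
# is a placeholder, not a real file on the server.
db = {
    "by_hash": {
        "34c33b3eb0d56eb9462003af249eff28": "example_file.fits",
    },
    "by_name": {
        # one-to-many: hashes listed in chronological order
        "example_file.fits": ["34c33b3eb0d56eb9462003af249eff28"],
    },
}
print(json.dumps(db, indent=2))
```

Keeping both directions in one file makes it cheap to verify that every hash URL resolves to exactly one current filename.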
Failure for astroquery today:
intersphinx inventory 'https://requests.kennethreitz.org/en/stable/objects.inv' not fetchable due to <class 'requests.exceptions.ConnectionError'>: HTTPSConnectionPool(host='requests.kennethreitz.org', port=443): Max retries exceeded with url: /en/stable/objects.inv (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f13742c48d0>: Failed to establish a new connection: [Errno -2] Name or service not known'))
A light suggestion that people here may want to consider expressing this collection of data-sets using Intake, which is a cataloging standard allowing you to describe the metadata and way-to-load for data in a single browsable, searchable spec, e.g., a YAML file. This is, for instance, what the pangeo collaboration does. Most of their data is in zarr or other xarray-compatible formats, so you would also need intake-astro to enable loading FITS data from remote files, with local caching if desired, and lazy-loading with Dask.
As part of astropy/astropy#8915 I tried to remember what we did for the backup intersphinx mappings here and realised that they are awfully outdated.
They should either be regularly updated (manually or automatically, though I'm not sure it's worth the effort of building machinery for the latter) or removed completely.
Do we need all the old matplotlib comparison images under https://github.com/astropy/astropy-data/tree/gh-pages/testing/astropy ?
With astropy/astropy#8787, only matplotlib>=2.1 is supported now for the master branch.
The telescope data file sites.json currently lists rubin and rubin_aux. The 8.4m telescope is called the Simonyi Survey Telescope and is part of Vera C. Rubin Observatory so "rubin" might not be the right name as described. LSST is not a telescope so the aliases shouldn't really include it. AuxTel is formally the Rubin Auxiliary Telescope. It's 1.2m (not 1.4m: see https://noirlab.edu/public/programs/vera-c-rubin-observatory/rubin-auxtel/) and should not include LSST in the aliases either.
We've just got the AAS facilities list updated to use Rubin:Simonyi and Rubin:1.2m.
I'm not sure if it's even possible to change "Rubin" without breaking people's code -- is it possible to deprecate sites?
I'm open to ideas as to how best to proceed with this.
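One possible mechanism would be an alias table that maps deprecated site names to their replacements and emits a warning, so existing code keeps working through a deprecation period. This is purely a sketch: the function, the table, and the "simonyi" key are all hypothetical, and nothing like this exists in astropy today:

```python
import warnings

# Hypothetical table of deprecated site names; the entry below is
# illustrative only and does not reflect any agreed renaming.
_DEPRECATED_SITE_ALIASES = {"rubin": "simonyi"}

def resolve_site_name(name):
    """Resolve a site name, warning if a deprecated alias was used."""
    key = name.lower()
    if key in _DEPRECATED_SITE_ALIASES:
        new = _DEPRECATED_SITE_ALIASES[key]
        warnings.warn(f"site name {name!r} is deprecated; use {new!r} instead",
                      FutureWarning)
        return new
    return key
```

A lookup layer like this would let "rubin" keep resolving while steering users toward the new name, instead of breaking code outright.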
From @eerovaher (https://github.com/astropy/astropy/pull/12721/files#r786773846):
Raising warnings about duplicated names sounds like a good idea, but it should not be implemented until the sites.json file in astropy-data is cleaned of duplicates; otherwise users would receive many warnings that they can't really do anything about. Removing the duplicates from sites.json would reduce the number of usable keys for anyone not using the bleeding-edge version of astropy, so it doesn't seem like a good idea to do that quite yet.
When it comes to checking for duplicated names for different sites, it wouldn't be too difficult to add a test to astropy-data that checks for that. The output of astropy.coordinates.EarthLocation.get_site_names() shows that currently there aren't any duplicated labels across sites other than the empty string that this pull request takes care of.
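Such a test could be fairly small. Below is a sketch of the core check, operating on an already-loaded sites.json-style dict (the demo entries are invented; a real test would load the actual file):

```python
from collections import Counter

def duplicated_labels(sites):
    """Return labels (names or aliases) used by more than one site.

    `sites` follows the sites.json structure: a mapping of site keys
    to dicts with a 'name' and, optionally, an 'aliases' list. Empty
    strings are ignored, mirroring the pull request discussed above.
    """
    labels = []
    for info in sites.values():
        labels.append(info.get("name", ""))
        labels.extend(info.get("aliases", []))
    counts = Counter(label for label in labels if label)
    return [label for label, n in counts.items() if n > 1]

# Toy example with an invented clash (not real sites.json content):
demo = {
    "a": {"name": "Observatory X", "aliases": ["OX"]},
    "b": {"name": "Observatory Y", "aliases": ["OX"]},
}
print(duplicated_labels(demo))  # ['OX']
```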
I think the fix is to replace the special characters introduced in #34. However, if I am wrong, please do whatever is correct and then close this issue when tests are passing again.
P.S. Maybe we need to find a way to test this stuff on the astropy side before merging.
When I try to run -
from astropy.coordinates.tests import test_sites
test_sites.check_builtin_matches_remote('file://path/to/your/new/sites.json')
I get the following error -
<ipython-input-128-46a0624d400b> in <module>()
----> 1 test_sites.check_builtin_matches_remote(f)
~\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\astropy\coordinates\tests\test_sites.py in check_builtin_matches_remote(download_url)
153 in_dl[name] = name in dl_registry
154 if in_dl[name]:
--> 155 matches[name] = quantity_allclose(builtin_registry[name], dl_registry[name])
156 else:
157 matches[name] = False
~\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\astropy\units\quantity.py in allclose(a, b, rtol, atol, **kwargs)
1666 """
1667 return np.allclose(*_unquantify_allclose_arguments(a, b, rtol, atol),
-> 1668 **kwargs)
1669
1670
<__array_function__ internals> in allclose(*args, **kwargs)
~\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\numpy\core\numeric.py in allclose(a, b, rtol, atol, equal_nan)
2169
2170 """
-> 2171 res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan))
2172 return bool(res)
2173
<__array_function__ internals> in isclose(*args, **kwargs)
~\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\numpy\core\numeric.py in isclose(a, b, rtol, atol, equal_nan)
2264 # This will cause casting of x later. Also, make sure to allow subclasses
2265 # (e.g., for numpy.ma).
-> 2266 dt = multiarray.result_type(y, 1.)
2267 y = array(y, dtype=dt, copy=False, subok=True)
2268
<__array_function__ internals> in result_type(*args, **kwargs)
TypeError: invalid type promotion
Python 3.5.2 |Enthought, Inc. (x86_64)| (default, Mar 2 2017, 16:37:47) [MSC v.1900 64 bit (AMD64)]
Numpy 1.17.3
Scipy 1.0.0
astropy 3.2.3
pytz 2019.3
https://github.com/astropy/astropy-data/tree/gh-pages/testing/astropy has been replaced by https://github.com/astropy/astropy-figure-tests , right, @astrofrog or @Cadair ?
I currently have 2.5 GB of data, stored as HDF5 files, in several subdirectories of the following web location:
http://www.astro.yale.edu/aphearin/Data_files/halo_catalogs
The files are organized into subdirectories of this location, and I would like the entire directory and its contents on the data server. These files contain reduced dark matter halo catalogs that have been pre-processed by Halotools, so that users can quickly get up and running with N-body simulation analysis.
This is following up on a discussion with @eteq , who told me to also ping @astrofrog .
For the present needs of Halotools, and for the next several months, I only need <~10Gb of space. However, before the first official package release (towards the end of 2015), I will need more like a few hundred Gb of space. The reason is that I would also like to provide pre-processed binaries of merger trees, not just single-snapshot halo catalogs.
I realize that astropy-data is not really configured for this kind of volume, and that this may take some time, so I'm raising the issue significantly in advance of my package release, as suggested by @eteq .