
astropy-data's Introduction

Astropy Data Server

GitHub Actions CI Status

This repository is the source for the Astropy Data Server. You should issue pull requests here to add data to the server, but note that it has to be mirrored by the main server to actually appear at http://data.astropy.org.

astropy-data's People

Contributors

adrn, aelanman, amgraf, astrofrog, bsipocz, cadair, caseyjlaw, cylammarco, dependabot[bot], dhilipsiva, dhomeier, eblur, eteq, eundas, hamogu, hazboun6, japp, keflavich, kreardon, lpsinger, majkelx, matteobachetti, parejkoj, pllim, sfgraves, sharongoliath, shbhuk, stuartlittlefair, tbowers7, yucelkilic


astropy-data's Issues

Add preferred reference and citation to `sites.json`?

Description

It occurs to me that there is some overlap between sites.json and https://github.com/astrofrog/acknowledgment-generator. They have different use cases, but they are both about keeping a database of observatory data. In sites.json we collect the source of the LAT/LON information, which often (but not always) is the same scholarly paper that would be used to acknowledge data from that observatory. Thus, it might be good for the community to combine the two databases into one.

Additional context

Since it would be considerable work to backfill all values for all observatories that are not common to both databases right now, I suggest allowing a string value of "not filled in the database yet, but please support this work by opening a PR here: LINK". There is some overlap already, in particular for the major observatories.

sites.json also does not include space-based facilities. Either we change the structure to allow a value for location that is "space", or we keep this restricted to ground-based facilities and leave it up to users such as https://github.com/astrofrog/acknowledgment-generator to handle space facilities separately.
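
For concreteness, a combined entry might carry the citation alongside the existing coordinate fields. Below is a hedged Python sketch of such an entry; the "citation" key, the site key, and all values are illustrative placeholders, not an agreed schema.

import json

# Hypothetical sites.json entry with a preferred citation attached.
# The "citation" key and all values below are placeholders for illustration.
entry = {
    "example_observatory": {
        "name": "Example Observatory",
        "latitude": 51.4769,
        "longitude": -0.0005,
        "source": "Where the LAT/LON values came from",
        "citation": "not filled in the database yet, but please support this work by opening a PR here: LINK",
    }
}
print(json.dumps(entry, indent=4))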

Add AAS facilities keywords to site-data?

Description

It might be neat to add the AAS facilities keywords to our sites.json. If I looked up a site in astropy, I would then know how to add the correct facility keyword to my article. I can envision further uses where sites.json could be used to add keywords to data downloaded through astroquery, but those are beyond the scope of this repo. If the data is here, others may find other uses.

While I got this idea from looking at the list of AAS keywords, we should add a more generic category to the database, e.g. {'journal keywords': {'AAS': keyword}}, to make it extensible if we add others later. The AAS journals are the only ones I'm aware of with a system like that right now, though.
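
As a sketch of what that could look like in practice (the site key and keyword value here are only examples, and the schema is not decided):

import json

# Illustrative entry with the proposed generic "journal keywords" category;
# "Keck:I" is just an example AAS facility keyword, not a committed value.
site = {
    "keck": {
        "name": "W. M. Keck Observatory",
        "journal keywords": {"AAS": "Keck:I"},
    }
}
print(json.dumps(site, indent=4))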

Additional context

AAS journals facilities keywords: https://journals.aas.org/facility-keywords/
Maybe this could be done in collaboration with the AAS data editors? @augustfly

Storing 10-100 GB of data on the data server for Halotools

I currently have 2.5 GB of data, stored as HDF5 files, at the following web location:
http://www.astro.yale.edu/aphearin/Data_files/halo_catalogs

The files are organized into subdirectories of this location, and I would like the entire directory and its contents on the data server. These files contain reduced dark matter halo catalogs that have been pre-processed by Halotools, so that users can quickly get up and running with N-body simulation analysis.

This is following up on a discussion with @eteq , who told me to also ping @astrofrog .

For the present needs of Halotools, and for the next several months, I only need <~10 GB of space. However, before the first official package release (towards the end of 2015), I will need more like a few hundred GB of space. The reason is that I would also like to provide pre-processed binaries of merger trees, not just single-snapshot halo catalogs.

I realize that astropy-data is not really configured for this kind of volume, and that this may take some time, so I'm raising the issue significantly in advance of my package release, as suggested by @eteq .

Use Intake

A light suggestion: people here may want to consider expressing this collection of datasets using Intake, which is a cataloging standard that lets you describe the metadata and the way to load data in a single browsable, searchable spec, e.g. a YAML file. This is, for instance, what the Pangeo collaboration does. Most of their data is in zarr or other xarray-compatible formats, so you would also need intake-astro to enable loading FITS data from remote files, with local caching if desired, and lazy loading with Dask.
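
For anyone unfamiliar with Intake, here is a minimal, hedged Python sketch of how such a catalog would be consumed; the catalog file name and entry names are assumptions, and this is not an existing astropy-data catalog.

import intake

# Open a (hypothetical) catalog describing the astropy-data files and browse it.
cat = intake.open_catalog("astropy_data.yaml")
print(list(cat))                    # names of the described data sources
# source = cat.some_fits_source    # a source would then load lazily...
# data = source.read()             # ...or be read into memory on demand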

Big data warning from GitHub

Might want to consider making that tutorial use a smaller file and get rid of synchrotron_i_lobe_0700_150MHz.fits. cc @eblur and @adrn

remote: warning: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com.
remote: warning: See http://git.io/iEPt8g for more information.
remote: warning: File tutorials/synthetic-images/synchrotron_i_lobe_0700_150MHz.fits is 64.00 MB; this is larger than GitHub's recommended maximum file size of 50.00 MB

hash/ URLs not working/available

When downloading a file using a hash/ URL, such as hash/34c33b3eb0d56eb9462003af249eff28, if that file has been previously downloaded and is in your local cache, the download works fine. However, if you've never downloaded it, the hash URL isn't resolved on the data.astropy.org server.

Whenever we add a data file to the server we should also add the appropriate link so that the hash URL works. Or, perhaps even better, all files on the server should actually be stored primarily under their hash, with the actual filenames being symlinks to the correct hash for the latest version of that file.

We should also have a database on data.astropy.org (just a JSON file is fine for now) mapping hashes to filenames and vice-versa (the latter may be one-to-many, with the hashes in chronological order for files that have had multiple versions).
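
A minimal sketch of building such a mapping is below, assuming the hashes are MD5 digests of the file contents (the data root and output file name are illustrative, not the server's actual tooling).

import hashlib
import json
from pathlib import Path

def file_md5(path, block_size=65536):
    # Compute the MD5 hex digest of a file, reading it in chunks.
    md5 = hashlib.md5()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(block_size), b""):
            md5.update(block)
    return md5.hexdigest()

# Map hash -> filename for everything under the (assumed) data root.
mapping = {file_md5(p): str(p) for p in Path("data").rglob("*") if p.is_file()}
Path("hash_index.json").write_text(json.dumps(mapping, indent=2))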

@astrofrog @eteq @perrygreenfield

BUG: Fix sites.json so Astropy remote data tests pass again

I think the fix is to replace special characters introduced in #34. However, if I am wrong, please do whatever is correct and then close this issue when tests are passing again.

P.S. Maybe we need to find a way to test this stuff on the astropy side before merging.
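
A quick diagnostic along these lines (an assumed helper, not the actual patch) could locate the offending characters before replacing them:

# Print the line number and any non-ASCII characters found in sites.json.
with open("sites.json", encoding="utf-8") as fh:
    for lineno, line in enumerate(fh, start=1):
        offenders = [ch for ch in line if ord(ch) > 127]
        if offenders:
            print(lineno, offenders)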

Clean duplicates off sites.json

From @eerovaher (https://github.com/astropy/astropy/pull/12721/files#r786773846)

Raising warnings about duplicated names sounds like a good idea, but it should not be implemented until the sites.json file in astropy-data is cleaned of duplicates; otherwise users would receive many warnings that they can't really do anything about. Removing the duplicates from sites.json would reduce the number of usable keys for anyone not using the bleeding-edge version of astropy, so it doesn't seem like a good idea to do that quite yet.

When it comes to checking for duplicated names for different sites, it wouldn't be too difficult to add a test to astropy-data that would check for that. The output of astropy.coordinates.EarthLocation.get_site_names() shows that there currently aren't any duplicated labels across sites other than the empty string that this pull request takes care of.
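
Such a test could be as simple as the following hedged sketch, which assumes each sites.json entry carries a "name" and optional "aliases" (this is not an existing test in the repo):

import json
from collections import Counter

with open("sites.json") as fh:
    sites = json.load(fh)

# Collect every label (name or alias) and report any that appear more than once.
labels = []
for entry in sites.values():
    labels.append(entry.get("name", ""))
    labels.extend(entry.get("aliases", []))

duplicates = [label for label, count in Counter(labels).items() if label and count > 1]
print("Duplicate labels:", duplicates)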

Update astropy.coordinates.EarthLocation.of_site for Lowell Observatory sites

Currently the site list includes lowell and DCT (Discovery Channel Telescope). The DCT was renamed to the Lowell Discovery Telescope (LDT) earlier this year; could we add a duplicate entry for this?

The lowell site actually points to the Anderson Mesa site, which should be renamed to NPOI (the main telescope on that site), and lowell should point to the campus on Mars Hill in Flagstaff (35.202875, -111.664781, 2195 m), where we now have research telescopes.

Is this possible?
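
For reference, the proposed Mars Hill location can be expressed directly with astropy (a hedged illustration only; this entry does not yet exist in sites.json):

from astropy import units as u
from astropy.coordinates import EarthLocation

# The coordinates quoted above for the Lowell campus on Mars Hill, Flagstaff.
mars_hill = EarthLocation.from_geodetic(lon=-111.664781 * u.deg,
                                        lat=35.202875 * u.deg,
                                        height=2195 * u.m)
print(mars_hill.geodetic)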

Set up CI for this repo

We need to set up some basic CI testing to make sure changes in this repo don't introduce a regression in the astropy core tests; see #36.

This can be assigned to me, and I will try to come back to it in a few days' time.

Allow non-ASCII characters in sites.json (when enough of the astropy user base supports it)

#38 fixed an issue described in #36 where astropy doesn't correctly parse the sites.json file when it has non-ASCII characters. While #38 fixes it, it does so by removing some characters in proper names that really ought to be Unicode. astropy/astropy#7082 fixes the underlying problem by making everything Unicode-compliant in astropy. So this issue is about essentially reverting #38 and, more broadly, allowing UTF-8 in the sites.json file.

We do not want to do this yet, however, as it would instantly break all released astropy versions (at the time of this writing). So instead, we are thinking of following the suggestion in #36 (comment): wait a while until we think most installed astropy versions have the fix (which could be a long time...), and then bring in the non-ASCII characters again.
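
When the time comes, the change itself is small; a minimal sketch (with a placeholder entry, assuming we simply stop escaping non-ASCII characters when writing the file) would be:

import json

sites = {"example": {"name": "Observatorio del Teide"}}   # placeholder entry only

# ensure_ascii=False keeps proper names as real Unicode instead of \uXXXX escapes.
with open("sites.json", "w", encoding="utf-8") as fh:
    json.dump(sites, fh, ensure_ascii=False, indent=4)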

Intersphinx mappings are outdated

As part of astropy/astropy#8915 I tried to remember what we did for the backup intersphinx mappings here and realised that they are awfully outdated.

They should either be regularly updated (manually or automatically, though I'm not sure it's worth the effort to invest in machinery for the latter) or removed completely.
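
If we do keep them, a manual refresh could be as simple as the sketch below; the project list and output file names are illustrative, not the repo's actual set of mappings.

from urllib.request import urlretrieve

# Download fresh objects.inv inventories for a few (example) projects.
inventories = {
    "python": "https://docs.python.org/3/objects.inv",
    "numpy": "https://numpy.org/doc/stable/objects.inv",
}
for name, url in inventories.items():
    urlretrieve(url, f"{name}-objects.inv")
    print(f"updated {name}")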

Add requests intersphinx

Failure for astroquery today:

intersphinx inventory 'https://requests.kennethreitz.org/en/stable/objects.inv' not fetchable due to <class 'requests.exceptions.ConnectionError'>: HTTPSConnectionPool(host='requests.kennethreitz.org', port=443): Max retries exceeded with url: /en/stable/objects.inv (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f13742c48d0>: Failed to establish a new connection: [Errno -2] Name or service not known'))

Inconsistencies in longitude sign in coordinates' sites.json file.

There seems to be an inconsistency in the sign of longitudes in the sites.json file. For example, Siding Spring Observatory should be at about 149 deg East longitude, the NASA IRTF at about 155 deg West, Lick Observatory at 121 deg West, and La Palma at about 17 deg West longitude. But what we see in the sites.json file is inconsistent (output truncated to highlight the problem):

"sso": {
    "name": "Siding Spring Observatory",
    "longitude": 149.06119444444445}

"irtf": {
    "name": "NASA Infrared Telescope Facility",
    "longitude": 155.4719987888889}  ##Should be 204.5280012111111

"lick": {
    "name": "Lick Observatory",
    "longitude": 238.36333333333332}

"lapalma": {
    "name": "Roque de los Muchachos, La Palma",
    "longitude": 342.12}

When employed in coordinates.EarthLocation() we see:

for site in ['sso', 'irtf', 'lick', 'lapalma']:
    print(coordinates.EarthLocation.of_site(site).lon)

149d03m40.3s
155d28m19.1956s
-121d38m12s
-17d52m48s

But I think it should be:

149d03m40.3s
-155d28m19.1956s ## <- IRTF
-121d38m12s
-17d52m48s

This inconsistency arose from different standards for representing longitude. The irtf data came in commit 808c025 which has other entries using negative longitudes, but those seem OK.
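
A quick consistency check along these lines (an assumed diagnostic, not part of the repo) would flag entries stored in the 0-360 east-positive convention, so all longitudes can be reviewed and moved to a single convention:

import json

with open("sites.json") as fh:
    sites = json.load(fh)

# Report longitudes above 180 deg, which are equivalent to negative (western) values.
for key, entry in sites.items():
    lon = float(entry.get("longitude", 0.0))
    if lon > 180.0:
        print(f"{key}: {lon} deg east == {lon - 360.0} deg")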

Running a check on sites.json gives an error

When I try to run -

from astropy.coordinates.tests import test_sites
test_sites.check_builtin_matches_remote('file://path/to/your/new/sites.json')

I get the following error -


<ipython-input-128-46a0624d400b> in <module>()
----> 1 test_sites.check_builtin_matches_remote(f)

~\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\astropy\coordinates\tests\test_sites.py in check_builtin_matches_remote(download_url)
    153         in_dl[name] = name in dl_registry
    154         if in_dl[name]:
--> 155             matches[name] = quantity_allclose(builtin_registry[name], dl_registry[name])
    156         else:
    157             matches[name] = False
~\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\astropy\units\quantity.py in allclose(a, b, rtol, atol, **kwargs)
   1666     """
   1667     return np.allclose(*_unquantify_allclose_arguments(a, b, rtol, atol),
-> 1668                        **kwargs)
   1669 
   1670 
<__array_function__ internals> in allclose(*args, **kwargs)
~\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\numpy\core\numeric.py in allclose(a, b, rtol, atol, equal_nan)
   2169 
   2170     """
-> 2171     res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan))
   2172     return bool(res)
   2173 
<__array_function__ internals> in isclose(*args, **kwargs)
~\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\numpy\core\numeric.py in isclose(a, b, rtol, atol, equal_nan)
   2264     # This will cause casting of x later. Also, make sure to allow subclasses
   2265     # (e.g., for numpy.ma).
-> 2266     dt = multiarray.result_type(y, 1.)
   2267     y = array(y, dtype=dt, copy=False, subok=True)
   2268 
<__array_function__ internals> in result_type(*args, **kwargs)
TypeError: invalid type promotion 

Python 3.5.2 |Enthought, Inc. (x86_64)| (default, Mar 2 2017, 16:37:47) [MSC v.1900 64 bit (AMD64)]
Numpy 1.17.3
Scipy 1.0.0
astropy 3.2.3
pytz 2019.3

Rubin isn't a telescope

Description

The telescope data file sites.json currently lists rubin and rubin_aux. The 8.4m telescope is called the Simonyi Survey Telescope and is part of Vera C. Rubin Observatory so "rubin" might not be the right name as described. LSST is not a telescope so the aliases shouldn't really include it. AuxTel is formally the Rubin Auxiliary Telescope. It's 1.2m (not 1.4m: see https://noirlab.edu/public/programs/vera-c-rubin-observatory/rubin-auxtel/) and should not include LSST in the aliases either.

We've just got the AAS facilities list updated to use Rubin:Simonyi and Rubin:1.2m.

Expected behavior

I'm not sure if it's even possible to change "Rubin" without breaking people's code -- is it possible to deprecate sites?

I'm open for ideas as to how best to proceed with this.

Add pytest intersphinx

Without a way to automate updates to the intersphinx mappings this may be suboptimal, but having links that point to an old version is still better than the current situation of failing docs builds in astropy core.

This is somewhat related to #69
