Comments (19)
Oh noooo.
We've had this happen before, and I thought we had impressed upon NIST TRC the importance of engaging users through a gradual community process about major changes.
@mrshirts Can you put us in touch to sort out what can be done here?
from openff-evaluator.
I'd be happy to - @mattwthompson or @SimonBoothroyd could you write up a sentence or two with the exact details for me to send to them so we can figure this out? In the meantime, I think we're mostly using a local copy, correct?
Sorry, I only have a surface-level knowledge here. Simon (or John, or somebody else who has used it before) would be better suited to provide direction.
@mrshirts The issue seems to be that the entry point we use to download the ThermoML tarballs has been removed and replaced. It used to be an individual .tgz for each of the journals, for example: https://trc.nist.gov/ThermoML/JCED.tgz . Now there is a single tarball at a different URL (https://data.nist.gov/od/ds/mds2-2422/ThermoML.v2020-09-30.tgz, landing page: https://data.nist.gov/od/id/mds2-2422). I'm not sure whether any of the data has changed (my assumption would be no), but the landing page shows they added .json files, so it's possible there were other changes.
Here's an example traceback of how it's failing:
Traceback (most recent call last):
  File "/home/owenmadin/Documents/python/binary-mixture-publication/data-set-curation/curate_boron_phosphorus_silicon_data.py", line 614, in <module>
    main()
  File "/home/owenmadin/Documents/python/binary-mixture-publication/data-set-curation/curate_boron_phosphorus_silicon_data.py", line 609, in main
    initial_data = prepare_initial_data()
  File "/home/owenmadin/Documents/python/binary-mixture-publication/data-set-curation/curate_boron_phosphorus_silicon_data.py", line 57, in prepare_initial_data
    initial_data = CurationWorkflow.apply(
  File "/home/owenmadin/anaconda3/envs/binary-mixture-publication/lib/python3.9/site-packages/openff/evaluator/datasets/curation/workflow.py", line 112, in apply
    data_frame = component_class.apply(
  File "/home/owenmadin/anaconda3/envs/binary-mixture-publication/lib/python3.9/site-packages/openff/evaluator/datasets/curation/components/components.py", line 90, in apply
    modified_data_frame = cls._apply(data_frame, schema, n_processes)
  File "/home/owenmadin/anaconda3/envs/binary-mixture-publication/lib/python3.9/site-packages/openff/evaluator/datasets/curation/components/thermoml.py", line 124, in _apply
    cls._download_data(schema)
  File "/home/owenmadin/anaconda3/envs/binary-mixture-publication/lib/python3.9/site-packages/openff/evaluator/datasets/curation/components/thermoml.py", line 71, in _download_data
    request.raise_for_status()
  File "/home/owenmadin/anaconda3/envs/binary-mixture-publication/lib/python3.9/site-packages/requests/models.py", line 953, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://trc.nist.gov/ThermoML/JCED.tgz
So essentially I think we'd need to change the place where we're getting the tarballs from, but it will also probably break some data collation tools that expect a series of tarballs rather than just one.
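One way to make the download step tolerant of a moved archive is to try a list of candidate URLs in order. The sketch below is not the evaluator's `_download_data` implementation: `download_first_available`, `CANDIDATE_URLS`, and the injectable `fetch` argument are hypothetical names, the ordering is an arbitrary choice, and the only URLs used are the two quoted above. It uses the standard library rather than `requests`.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, Tuple

# URLs quoted in this thread: the retired per-journal tarball first, then the
# single combined archive. The ordering is an illustrative assumption.
CANDIDATE_URLS = [
    "https://trc.nist.gov/ThermoML/JCED.tgz",
    "https://data.nist.gov/od/ds/mds2-2422/ThermoML.v2020-09-30.tgz",
]


def _fetch(url: str, timeout: float = 60.0) -> bytes:
    """Download ``url`` and return its raw bytes (raises on HTTP errors)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download_first_available(
    urls: Iterable[str], fetch: Callable[[str], bytes] = _fetch
) -> Tuple[str, bytes]:
    """Return ``(url, content)`` for the first URL that downloads cleanly."""
    errors = []
    for url in urls:
        try:
            return url, fetch(url)
        except OSError as error:
            # urllib's URLError and HTTPError (404, 500, ...) both derive
            # from OSError, so one clause covers network and HTTP failures.
            errors.append(f"{url}: {error}")
    raise RuntimeError(
        "no candidate URL could be downloaded:\n" + "\n".join(errors)
    )
```

This only addresses where the bytes come from. Any collation code that expects one tarball per journal would still need its own adjustment.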
@mattwthompson Let me know how I can help out with fixing this (I am probably the main user of this tool currently)
Is this still broken? I forget whether this has been resolved on another platform.
Looks like it has been resolved.
Unfortunately this is broken again, this time on NIST's end. It looks like there's an issue with their tarball. I get this message trying to download with evaluator:
Traceback (most recent call last):
  File "/home/owenmadin/Documents/python/binary-mixture-publication/data-set-curation/vapor_pressure_search.py", line 93, in <module>
    main()
  File "/home/owenmadin/Documents/python/binary-mixture-publication/data-set-curation/vapor_pressure_search.py", line 88, in main
    initial_data = prepare_initial_data()
  File "/home/owenmadin/Documents/python/binary-mixture-publication/data-set-curation/vapor_pressure_search.py", line 57, in prepare_initial_data
    initial_data = CurationWorkflow.apply(
  File "/home/owenmadin/anaconda3/envs/openff-force-fields/lib/python3.9/site-packages/openff/evaluator/datasets/curation/workflow.py", line 112, in apply
    data_frame = component_class.apply(
  File "/home/owenmadin/anaconda3/envs/openff-force-fields/lib/python3.9/site-packages/openff/evaluator/datasets/curation/components/components.py", line 90, in apply
    modified_data_frame = cls._apply(data_frame, schema, n_processes)
  File "/home/owenmadin/anaconda3/envs/openff-force-fields/lib/python3.9/site-packages/openff/evaluator/datasets/curation/components/thermoml.py", line 113, in _apply
    cls._download_data(schema)
  File "/home/owenmadin/anaconda3/envs/openff-force-fields/lib/python3.9/site-packages/openff/evaluator/datasets/curation/components/thermoml.py", line 60, in _download_data
    request.raise_for_status()
  File "/home/owenmadin/anaconda3/envs/openff-force-fields/lib/python3.9/site-packages/requests/models.py", line 960, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: for url: https://data.nist.gov/od/ds/mds2-2422/ThermoML.v2020-09-30.tgz
{
  "requestURL" : "/od/ds/mds2-2422/ThermoML.v2020-09-30.tgz",
  "method" : "GET",
  "status" : 500,
  "message" : "Unexpected Server Error"
}
Process finished with exit code 1
And if I try to download manually through their download manager I get the same thing:
Information about requested bundle/package is given below.
Following files are not included in the bundle because of errors:
https://data.nist.gov/od/ds/mds2-2422/ThermoML.v2020-09-30.tgz?requestId=5c75c307-4328-46dd-baf2-068675b89c47 There is an Error accessing this file, Server returned status with response code 500 and message:There is an error accessing this file/URL from server.
@mrshirts can you contact someone at NIST to figure out why this is happening?
So, this link seems to work for me now using a manual download - can you check if that works for you, and if it might be transient?
https://data.nist.gov/od/ds/mds2-2422/ThermoML.v2020-09-30.tgz
I'm still unable to download manually, on either RHEL or Ubuntu.
Are other NIST downloads down, or just this one?
It was working manually for me for a couple of minutes, but now it is not.
I tried to download something else from the NIST website and it also failed. Maybe their servers are just struggling today?
Yeah, sounds like an overall NIST problem.
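If the 500s are transient, as the on-and-off manual downloads suggest, a retry with exponential backoff may be enough to ride them out. This is a minimal sketch and not evaluator code: `fetch_with_retries`, its parameters, and the delay values are all hypothetical choices, and it uses only the standard library rather than `requests`.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def _fetch(url: str, timeout: float = 60.0) -> bytes:
    """Download ``url`` and return its raw bytes (raises on HTTP errors)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_retries(
    url: str,
    attempts: int = 4,
    initial_delay: float = 2.0,
    fetch: Callable[[str], bytes] = _fetch,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Fetch ``url``, retrying 5xx and connection errors with doubling delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except urllib.error.HTTPError as error:
            # Only 5xx responses are treated as transient. A 4xx such as the
            # earlier 404 means the URL itself is wrong, so re-raise at once.
            if error.code < 500 or attempt == attempts:
                raise
        except urllib.error.URLError:
            # Connection-level failure (DNS failure, refused connection).
            if attempt == attempts:
                raise
        sleep(delay)
        delay *= 2

    raise RuntimeError("unreachable: the final attempt returns or raises")
```

The `fetch` and `sleep` arguments are injectable only so the retry logic can be exercised without touching the network. A retry loop does not help when the server stays down for hours, which is the case a local copy covers.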
It would be good to have a "load from local tarball" option in evaluator.datasets.curation.thermoML.ImportThermoMLData in case this happens again in the future.
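A rough sketch of what reading from a local tarball could look like, using only the standard library. This is not an existing evaluator option: `iter_xml_from_tarball` is a hypothetical helper, and it assumes the archive stores the ThermoML records as UTF-8 `.xml` members (other members, such as the `.json` files mentioned earlier, are skipped).

```python
import tarfile
from pathlib import Path
from typing import Iterator, Tuple, Union


def iter_xml_from_tarball(
    tarball_path: Union[str, Path],
) -> Iterator[Tuple[str, str]]:
    """Yield ``(member name, XML text)`` for each ``.xml`` file in a tarball."""
    # "r:*" lets tarfile detect the compression (gzip for a .tgz).
    with tarfile.open(tarball_path, "r:*") as archive:
        for member in archive:
            # Skip directories, links, and non-XML members.
            if not member.isfile() or not member.name.endswith(".xml"):
                continue
            extracted = archive.extractfile(member)
            if extracted is None:
                continue
            with extracted:
                # Assumption: the XML files are UTF-8 encoded.
                yield member.name, extracted.read().decode("utf-8")
```

Yielding one file at a time avoids holding the whole archive in memory. Each XML string could then be handed to the same parsing step that currently runs on the downloaded data.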
It would be good to have a "load from local tarball" option
Good idea, file an issue?
Email from Damien Riccardi at NIST:
"I added a few links to the web app and data.nist.gov page this morning, and, before reaching out here, I reviewed your issue linked below. It appeared as though Thermoml issues on the Open FF end were resolved until the data.nist.gov download link to the .tgz file broke (as of yesterday). An email has been sent to admins of data.nist.gov and I hope it will be fixed soon. In clicking through the related openff thermoml issues I noticed the annoyance with historical movement in the data resource. The https://data.nist.gov/od/ds/mds2-2422/ThermoML.v2020-09-30.tgz file should now (technical difficulties with data.nist.gov servers aside) never change or be deleted."
Also, ThermoML had a software note published in JCC: https://onlinelibrary.wiley.com/share/author/WKPMRWMYRCFW79RXEQPW?target=10.1002/jcc.26842
@ocmadin posted this in Slack; I don't have time to look at it now but this might be a path forward:
https://onlinelibrary.wiley.com/doi/epdf/10.1002/jcc.26842 [...] TL;DR, don't think we need to change anything, but they are now offering a RESTful API to access the data which may be useful in the future.
Working URL:
https://nist-oar-cache.s3.amazonaws.com/prd/gen0/mds2-2422/ThermoML.v2020-09-30.tgz