Comments (6)
Ah, thanks for the heads up. I thought we created an uptime check in #45 (comment) that would restart the instance if it became unresponsive like this.
Tagging @falquaddoomi who helped last time. I can restart the instance, but might be good to keep this error active so we can make sure the uptime check detects it. (@falquaddoomi no rush, don't interrupt your weekend).
from hetionet.
Sorry for the trouble you've been having with the service, @jromanowska. Also, hey @dhimmel; we do have an uptime check set up for the neo4j instance, but it just reports that the instance is inaccessible, it doesn't reboot it. Also, it's unfortunately very noisy, so it's hard to tell when a real outage is occurring versus a transient network issue on Google's side. I'd assumed since no one complained that these were just transient issues, but apparently not -- I'll look into them as soon as they come up now.
After looking into the logs a bit today, it seems the neo4j instance hits a series of out-of-memory exceptions that cause it to stop being able to fully service requests. Oddly, it'll still serve static resources, just with very high (30 seconds+) latency. I'm going to try bumping up the RAM on the instance, and I'll also add a daemon on the machine itself that checks if https://neo4j.het.io/browser/ is responsive and reboots the docker container if it isn't. I'll keep investigating why this is happening, since if there's a memory leak what I proposed will just delay the outages, not eliminate them.
Perhaps let's keep this issue open for a week or so to see if the issue's resolved, and after that we can close it?
Just FYI, I've put in a monitoring script that'll reboot the neo4j container if https://neo4j.het.io/browser/ takes longer than 30 seconds to return, or if it returns a non-200 response. I've also increased the RAM on the instance from 8GB to 12GB, and I'll be watching the logs and the uptime check for "transient" issues as well. Here's hoping that the changes I made will improve its stability, but do let me know if any of you have issues with it. 🤞
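For anyone curious what a watchdog like the one described above might look like, here is a minimal sketch in Python. It is not the script actually deployed: the container name `neo4j`, the use of `docker restart`, and the function names are all assumptions made for illustration. Only the URL, the 30-second limit, and the 5-minute polling interval come from this thread. Note also that the `timeout` passed to `urlopen` bounds each socket operation rather than the total request time, so this only approximates "takes longer than 30 seconds to return".

```python
import http.client
import subprocess
import time
import urllib.request

URL = "https://neo4j.het.io/browser/"  # endpoint named in the thread
TIMEOUT_S = 30                          # response limit named in the thread
POLL_INTERVAL_S = 300                   # 5-minute polling interval from the thread
CONTAINER = "neo4j"                     # hypothetical container name (assumption)


def is_healthy(url=URL, timeout=TIMEOUT_S, opener=urllib.request.urlopen):
    """Return True only if the URL answers with HTTP 200 and no error occurs."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Covers timeouts, connection failures, and the HTTPError that
        # urlopen raises for 4xx/5xx responses (all are OSError subclasses),
        # plus malformed HTTP responses.
        return False


def check_once(restart, healthy=is_healthy):
    """Run one health check and call restart() on failure.

    Returns True if a restart was triggered, False otherwise.
    """
    if healthy():
        return False
    restart()
    return True


def restart_container(name=CONTAINER):
    """Restart the container via the Docker CLI (assumed to be on PATH)."""
    subprocess.run(["docker", "restart", name], check=False)


def main():
    """Poll forever, restarting the container whenever a check fails."""
    while True:
        check_once(restart_container)
        time.sleep(POLL_INTERVAL_S)
```

Calling `main()` starts the polling loop. Passing the opener and the restart action in as parameters keeps the check logic testable without touching the network or Docker.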
Right, the outages shouldn't be more than 5 minutes (that's the current polling interval), and if necessary the entire neo4j container gets restarted, which would reset its memory usage. Fair point about it not being worth tracking down a memory leak in an older version of neo4j. I'll take a look at #33 and see if I can make progress on it.
Thanks a lot @falquaddoomi! Stoked that we're able to automate the restarts.
> I'll keep investigating why this is happening, since if there's a memory leak what I proposed will just delay the outages, not eliminate them.

But the outages will be short-lived and the reboot will reset the memory usage, right?
Since the instance is running a pretty old version of Neo4j, there's probably not a ton of value in spending much time diagnosing the memory leak. I played around with upgrading in #33, but was hitting a bunch of problems.
So in summary, don't worry too much about digging into the memory leak unless you think that will create an actionable insight.
> I'll take a look at #33 and see if I can make progress on it

Any help is appreciated, but a forewarning that there are several things that were breaking: guides, HTTPS, and more. So I'm happy to video chat at any point and give you an overview of the hurdles if that'd be helpful.