Comments (5)
On the backend, the only log entry I see is:
09:43:34.218 [qtp1413184129-17] ERROR c.c.t.c.SinglePointAnalysisController - analysis error
which is not very helpful in understanding what has gone wrong.
I can describe what I did immediately before the problem arose: I was running an analysis on all of Switzerland with walk+transit modes, then switched to car mode. The worker re-linked the whole pointset for car mode. Now searches with any mode, including walk+transit, fail in this way.
The worker does not show any errors or any sign of failure. After a while the whole system begins working again, and in the worker log we see:
Jul 17, 2017 9:49:37 AM org.apache.http.impl.execchain.RetryExec execute
INFO: I/O exception (java.net.SocketException) caught when processing request to {}->http://10.0.0.120:9001: Connection timed out (Write failed)
[pool-3-thread-1] WARN com.conveyal.r5.analyst.cluster.AnalystWorker - Failed to mark task 296426 as completed.
org.apache.http.client.ClientProtocolException
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:186)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java
Caused by: org.apache.http.client.NonRepeatableRequestException: Cannot retry request with a non-repeatable request entity
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:107)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
... 7 more
Caused by: java.net.SocketException: Connection timed out (Write failed)
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
at org.apache.http.impl.io.SessionOutputBufferImpl.streamWrite(SessionOutputBufferImpl.java:126)
at org.apache.http.impl.io.S
[pool-1-thread-4] ERROR com.conveyal.r5.analyst.cluster.AnalystWorker - Error writing travel time surface to broker
java.io.IOException: Pipe closed
at java.io.PipedInputStream.checkStateForReceive(PipedInputStream.java:260)
at java.io.PipedInputStream.receive(PipedInputStream.java:226)
at java.io.PipedOutputStream.write(PipedOutputStream.java:149)
at java.util.zip.DeflaterOutputStream.deflate(DeflaterOutputStream.java:253)
at java.util.zip.DeflaterOutputStream.write(DeflaterOutputStream.java:211)
at java.util.zip.GZIPOutputStream.write(GZIPOutputStream.java:145)
at java.io.FilterOutputStream.write(FilterOutputStream.java:97)
at com.conveyal.r5.analyst.TravelTimeSurfaceComputer.write(TravelTimeSurfaceComputer.java:161)
at com.conveyal.r5.analyst.cluster.AnalystWorker.handleTravelTimeSurfaceRequest(AnalystWorker.java:534)
at com.conveyal.r5.analyst.cluster.AnalystWorker.handleOneRequest(AnalystWorker.java:414)
at com.conveyal.r5.analyst.cluster.AnalystWorker.lambda$null$6(AnalystWorker.java:671)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
I wonder if the single-point side channel was somehow blocked by a previous timeout. Could be related to R5 issue #303.
from analysis-backend.
I think I know what is going on here. First, the frontend expects all error responses to be JSON TaskErrors, so we should change the error returned by the backend. The root cause of the failure, though, is probably that Java piped streams don't behave the way pipes do in other languages: I suspect the writing thread is dying before the response has been written to the broker, so the broker never forwards it to the frontend.
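To illustrate the suspected failure mode (a hypothetical sketch, not the backend's actual code): with `java.io` piped streams, once the reading side goes away, any further write fails with the same "Pipe closed" IOException seen in the worker log above.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

public class PipeClosedDemo {
    // Returns the exception message produced by writing to a pipe whose
    // read end has been closed (standing in for a reader that died or
    // aborted, e.g. after a timeout).
    static String writeAfterReaderClosed() throws IOException {
        PipedOutputStream out = new PipedOutputStream();
        PipedInputStream in = new PipedInputStream(out);
        in.close(); // the reading side goes away
        try {
            out.write(42);
            return null;
        } catch (IOException e) {
            return e.getMessage(); // "Pipe closed"
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(writeAfterReaderClosed());
    }
}
```

Note also that, unlike OS-level pipes, `PipedInputStream` tracks the identity of the last thread that wrote to it and can fail reads if that thread exits, which is part of why piped streams behave unexpectedly when the writing thread dies mid-response.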
This is still happening, specifically when working with the Netherlands network. The client locks up on a white screen and shows the message "Fetch Error Unexpected token C in JSON at position 0". The client expects a JSON error response, but the backend is returning the plain text "Could not contact broker" (with a JSON content-type).
The dearth of information is due to SinglePointAnalysisController line 84, which swallows the exception and returns a meaningless string via Spark's halt() method. The log message contains no detail:
08:02:41.590 [qtp802836797-319] ERROR c.c.t.c.SinglePointAnalysisController - analysis error
After a long time, the errors have spontaneously ceased. This seems to happen on large graphs that take so long to start up that the first single-point request fails with a timeout; perhaps some thread is left in a bad state by that initial failure (as mentioned above).
The first thing to do is to add more logging on the backend and return meaningful errors, potentially including a stack trace, to the client so we can at least see what is failing. We should do this in every place an exception is caught.
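As a minimal sketch of that (the class and method names here are hypothetical, not the actual controller code), the caught exception's message and full stack trace can be rendered as a string and included in the error body instead of a fixed placeholder:

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class ErrorDetail {
    // Render an exception (message plus full stack trace, including causes)
    // as a string suitable for logging and for embedding in a structured
    // error response sent back to the client.
    static String describe(Throwable t) {
        StringWriter sw = new StringWriter();
        t.printStackTrace(new PrintWriter(sw, true));
        return sw.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(new IllegalStateException("Could not contact broker")));
    }
}
```

In a Spark route handler, a string like this could be wrapped in the JSON TaskError shape the frontend expects, rather than being passed to halt() as plain text with a mismatched content-type.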
The errors are network timeouts. Strangely, the server is also logging this message:
10:08:11.245 [qtp802836797-16] INFO spark.webserver.MatcherFilter - The requested route [/api/analysis/enqueue/single] has not been mapped in Spark
The system never works again until you restart the broker (after the worker is up and running), which does remedy the situation.
Lots of changes to error handling have landed since this issue was last updated; it should be much clearer now!
Related Issues (20)
- Boost regional analyses
- Regional and static-site analyses ignore remove-trips modification
- Allow importing a single modification from another project
- Better templating for worker startup script
- Aggregation areas in UTM projections do not work
- If requested, don't union aggregation area features
- Jobs status API endpoint fails
- Backend endpoints fail due to empty database responses
- Don't retry analyses that take too long
- Corrupt geotiffs
- Add more details to not found exceptions
- Fail to build
- polyline-encoder unavailable on Sonatype
- Error when uploading Opportunity dataset
- com.google.common.util.concurrent.ExecutionError: java.lang.OutOfMemoryError: GC overhead limit exceeded
- Aggregation areas upload error
- java.lang.ArrayIndexOutOfBoundsException
- Worker uses default security group
- Opportunity upload in wrong format leads to persistent processing message
- Error when running dev-latest.jar