Comments (8)
Perhaps just an exception getting swallowed, maybe here.
from outbackcdx.
Ah, yes that sounds like a swallowed exception. galgeek recently added some more logging to WbCdxApi to help with that.
from outbackcdx.
Yeah, I dunno what's going on. Some clumsy handling of error conditions in my client was confusing things. Having tidied that up I only get Socket closed
errors:
SEVERE: GET http://localhost:9090/fc?limit=100&matchType=exact&sort=reverse&url=http%3A%2F%2Facid.matkelly.com%2F failed with error Socket closed
java.net.SocketException: Socket closed
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:150)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:136)
at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:152)
at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:270)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:260)
at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:161)
at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:153)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:271)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:254)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:195)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:86)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:108)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
at uk.bl.wap.util.OutbackCDXClient.getCDXLine(OutbackCDXClient.java:268)
Having added some logging to OutbackCDX, I can see
java.net.SocketException: Broken pipe (Write failed)
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
at outbackcdx.NanoHTTPD$CountingOutputStream.write(NanoHTTPD.java:1092)
at java.io.FilterOutputStream.write(FilterOutputStream.java:97)
at outbackcdx.NanoHTTPD$ChunkingOutputStream.write(NanoHTTPD.java:702)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
at java.io.BufferedWriter.flush(BufferedWriter.java:254)
at outbackcdx.WbCdxApi.lambda$queryIndex$0(WbCdxApi.java:71)
at outbackcdx.NanoHTTPD$Response.send(NanoHTTPD.java:539)
at outbackcdx.NanoHTTPD$1$1.run(NanoHTTPD.java:210)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
(warning mixture of threads in that log)
So the error seems to be consistent with the client going away (broken pipe
).
from outbackcdx.
I note my NGINX comparison would not have stretched the chunked transfer decoding, for example, so may not be a fair test. It's also possible I'm exhausting ephemeral ports on this machine, but it's an odd way for that error to show up.
from outbackcdx.
Yeah, that's consistent with the client closing the connection. Although it could obviously be doing that in response to something the server does like invalid chunked encoding. Can't be ephemeral port exhaustion because the connection was already established (unless the client library responds to being unable to open new connections by closing random existing ones or something).
One thing you could try is running OutbackCDX with the undocumented and experimental '-u' command-line option. This makes it use Undertow instead of NanoHTTPD as the http server. Chances are Undertow has more overhead but it's more battle tested and probably has better error handling.
from outbackcdx.
Oof, well, to give you an idea of how it's going, I tried upping the threads some more and got this!
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x000000010c33ad1c, pid=36661, tid=111619
#
# JRE version: Java(TM) SE Runtime Environment (8.0-b124) (build 1.8.0-ea-b124)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b66 mixed mode bsd-amd64 compressed oops)
# Problematic frame:
# C [libnet.dylib+0x9d1c] Java_java_net_SocketInputStream_socketRead0+0x23d
So perhaps there are deeper problems.
Thanks for the tip about Undertow. It does seem slightly happier, reporting some errors on the back-end but not in the client (?!).
java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
at org.xnio.nio.NioSocketConduit.write(NioSocketConduit.java:164)
at io.undertow.server.protocol.http.HttpResponseConduit.write(HttpResponseConduit.java:609)
at io.undertow.conduits.ChunkedStreamSinkConduit.flush(ChunkedStreamSinkConduit.java:270)
at org.xnio.conduits.ConduitStreamSinkChannel.flush(ConduitStreamSinkChannel.java:162)
at io.undertow.channels.DetachableStreamSinkChannel.flush(DetachableStreamSinkChannel.java:119)
at org.xnio.channels.Channels.flushBlocking(Channels.java:63)
at io.undertow.io.UndertowOutputStream.close(UndertowOutputStream.java:348)
at outbackcdx.UWeb$UServer.sendResponse(UWeb.java:71)
at outbackcdx.UWeb$UServer.dispatch(UWeb.java:42)
at io.undertow.server.Connectors.executeRootHandler(Connectors.java:376)
at io.undertow.server.HttpServerExchange$1.run(HttpServerExchange.java:830)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
and then later on:
Oct 07, 2020 11:54:01 PM io.undertow.server.Connectors executeRootHandler
ERROR: UT005071: Undertow request failed HttpServerExchange{ GET /fc}
java.lang.IllegalStateException: UT000002: The response has already been started
at io.undertow.server.HttpServerExchange.setStatusCode(HttpServerExchange.java:1406)
at outbackcdx.UWeb$UServer.sendResponse(UWeb.java:54)
at outbackcdx.UWeb$UServer.dispatch(UWeb.java:49)
at io.undertow.server.Connectors.executeRootHandler(Connectors.java:376)
at io.undertow.server.HttpServerExchange$1.run(HttpServerExchange.java:830)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Upped the threads to OutbackCDX and these errors no longer showed up. Could be coincidence.
Very odd speed profile. First column is parallel thread count. Third is seconds per thread (where each thread has the same workload). Gets faster up to 750 threads then big slowdown at 1000. Not clear why.
10 1620 162.0 0 1000
50 818 16.36 0 5000
100 1209 12.09 0 10000
250 2286 9.144 0 25000
500 4664 9.328 0 50000
750 6665 8.886666666666667 0 75000
1000 63796 63.796 0 100000
I think the problem is likely not with OutbackCDX
, unless maybe something subtle going on with keepalive
and chunked
encoding. I'll do some more experiments on a different platform.
from outbackcdx.
Using Undertow, through Docker:
10 1522 152.2 0 1000
50 1890 37.8 0 5000
100 3621 36.21 0 10000
250 8615 34.46 0 25000
500 18705 37.41 0 50000
750 27317 36.422666666666665 0 75000
1000 86674 86.674 0 100000
No errors!
Then back to NanoHTTPD, still under Docker:
10 5633 563.3 0 1000
50 5720 114.4 0 5000
100 8543 85.43 0 10000
250 19600 78.4 0 25000
500 37367 74.734 0 50000
750 47824 63.76533333333333 0 75000
...LOTS OF SOCKET CLOSED BLARFING...
1000 71197 71.197 0 100000
So, while the underlying reasons are unclear, Undertow appears to be faster in general and more stable in particular. So I'll try using Undertow mode in production.
from outbackcdx.
Good to know. Maybe I'll make undertow mode the default then and deprecate nanohttpd.
from outbackcdx.
Related Issues (20)
- OWB compat: annotate 'closest' capture for BubbleCalendar
- Access rule editor: clearing 'accessed between' fields doesn't work
- CDXJ: Error: no such capture field: method HOT 7
- URL:s ending with an asterisk isn't found when searching for the same URL HOT 3
- Upgrade rocksdb
- Pasting a URL in the query field in the dashboard doesn't work
- Document index upgrades HOT 1
- ArrayIndexOutOfBoundsException, NumberFormatException when loading indexed ARC-files HOT 2
- Releases HOT 1
- API to scan all records HOT 3
- Possible infinite loop upon malformed requests HOT 6
- Handling of `+` characters in queries HOT 1
- CDX11+3 support HOT 2
- Replication secondary applies the last batch over and over HOT 3
- resolving a surt-timestamp collision should not replace a non-revisit record with a revisit record HOT 9
- Signed WARC URL generation
- Handle invalid dates HOT 4
- Report exception to client in WbCdxApi HOT 2
- XML protocol: numreturned and numresults
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from outbackcdx.