Support NIO non-blocking SSL sockets (pgjdbc, open issue, 20 comments)

pgjdbc commented on September 26, 2024
Support NIO non-blocking SSL sockets

Comments (20)

ringerc commented on September 26, 2024

I wrote a bunch of detail on this in this comment: #194 (comment)

In short, java.nio may not be necessary here, nor is it a simple answer.

quipsy-karg commented on September 26, 2024

As I am also wondering why PGJDBC is not using native NIO Channels instead of InputStream/OutputStream, I'd like to ask the following questions:

  • Do you mean that it is impossible to use NIO Channels?
  • Or do you mean that it is just not needed to solve this particular issue?
    In particular I wonder whether NIO Channels would be beneficial for non-SSL (pure TCP) connections?
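
To make the question concrete, here is a minimal sketch of the two API styles side by side. The class and method names are illustrative only, not pgjdbc internals, and both calls shown are still blocking unless the channel is explicitly configured otherwise:

    import java.io.IOException;
    import java.io.InputStream;
    import java.net.Socket;
    import java.nio.ByteBuffer;
    import java.nio.channels.SocketChannel;

    final class StreamVsChannel {
        // Classic java.io style, roughly what the driver does today via InputStream:
        static int readWithStream(Socket socket, byte[] buf) throws IOException {
            InputStream in = socket.getInputStream();
            return in.read(buf);      // blocks until at least one byte arrives
        }

        // java.nio style: the same read expressed against a SocketChannel and ByteBuffer.
        static int readWithChannel(SocketChannel channel, ByteBuffer buf) throws IOException {
            return channel.read(buf); // also blocking, unless channel.configureBlocking(false)
        }
    }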

davecramer commented on September 26, 2024

Markus,

The reality is that when Oliver wrote the V3 protocol, NIO was not available. Nobody has invested the effort into writing the protocol layer with NIO. I'd certainly welcome an NIO protocol implementation.

Dave

quipsy-karg commented on September 26, 2024

@davecramer Great to hear. I used NIO some years back in a different project and it gave it a performance boost. @vlsi Possibly Vladimir can help me this summer with some benchmarking, so I could provide a PR for that. Shouldn't be too complex, but it would reduce the number of byte arrays a lot.

vlsi commented on September 26, 2024

would reduce the number of byte arrays a lot

I would not be very optimistic here.

  1. In the "send binds to backend" case, the main "problem" is that the backend wants to have the "length" field before the data, so we really have to buffer the data somewhere (a minimal sketch of this framing follows below).
  2. In the "receive tuple" case, pgjdbc creates lots of individual byte[] for easier implementation of the further "get" APIs. Well, it might be the case that netty's ByteBuf would simplify "random access" into the received stream and allow us to avoid lots of temporary byte[].
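
For point 1, a minimal generic sketch (not pgjdbc internals) of why the length-first framing forces buffering: the whole body has to exist in memory before the 4-byte length can be written ahead of it.

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.Channels;
    import java.nio.channels.WritableByteChannel;
    import java.nio.charset.StandardCharsets;

    final class FramedMessageWriter {
        // v3-style frame: 1-byte type, 4-byte length (counts itself plus the body), then the body.
        static void writeMessage(WritableByteChannel channel, byte type, byte[] body) throws IOException {
            ByteBuffer frame = ByteBuffer.allocate(1 + 4 + body.length);
            frame.put(type);
            frame.putInt(4 + body.length);
            frame.put(body);
            frame.flip();
            while (frame.hasRemaining()) {
                channel.write(frame);
            }
        }

        public static void main(String[] args) throws IOException {
            // 'Q' is the simple-query message type; the query text is null-terminated.
            byte[] body = "SELECT 1\0".getBytes(StandardCharsets.UTF_8);
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            writeMessage(Channels.newChannel(sink), (byte) 'Q', body);
            System.out.println("framed " + sink.size() + " bytes");
        }
    }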

quipsy-karg commented on September 26, 2024

@vlsi I'll give it a quick shot anyways, just for fun. If it turns out it is useless, we can simply stay with the code as-is. :-)

vlsi commented on September 26, 2024

just for fun.

Ah, that is fun indeed.

quipsy-karg commented on September 26, 2024

I checked what Oracle reports w.r.t. NIO, and they say ByteBuffers surpass byte[] starting at about 8 KB. So I expect the biggest performance gain when batch-loading a large ResultSet. Maybe I should concentrate on that.

davecramer commented on September 26, 2024

Or in the COPY command?

Dave Cramer


quipsy-karg commented on September 26, 2024

@davecramer I expect it would speed up COPY, but I assume many more people will use ResultSets.

davecramer commented on September 26, 2024

True, but if you do it for one, we should be good to go for all of them.

Dave Cramer


ringerc commented on September 26, 2024

It'd be useful to know the size of the data we're going to send in advance anyway, for protocol deadlock avoidance reasons. IMO we should really have a class for each message type, so we can instantiate it, set its data, get its length, then send it on the wire. This would fit well with a channels-based approach.
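
A rough sketch of that shape (hypothetical types, not existing pgjdbc classes): each message knows its own type byte and body length up front, so the frame can be sized, and the length written, before the payload goes on the wire.

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.Channels;
    import java.nio.channels.WritableByteChannel;
    import java.nio.charset.StandardCharsets;

    /** Hypothetical "one class per protocol message" shape; not existing pgjdbc code. */
    interface FrontendMessage {
        byte type();
        int bodyLength();                     // known before serialization, which helps deadlock avoidance
        void writeBody(ByteBuffer target);

        default void writeTo(WritableByteChannel channel) throws IOException {
            ByteBuffer frame = ByteBuffer.allocate(1 + 4 + bodyLength());
            frame.put(type());
            frame.putInt(4 + bodyLength());   // v3 length field counts itself plus the body
            writeBody(frame);
            frame.flip();
            while (frame.hasRemaining()) {
                channel.write(frame);
            }
        }
    }

    /** Example message type: the simple-query 'Q' message. */
    final class QueryMessage implements FrontendMessage {
        private final byte[] sql;

        QueryMessage(String sql) {
            this.sql = (sql + "\0").getBytes(StandardCharsets.UTF_8);
        }

        @Override public byte type() { return (byte) 'Q'; }
        @Override public int bodyLength() { return sql.length; }
        @Override public void writeBody(ByteBuffer target) { target.put(sql); }

        public static void main(String[] args) throws IOException {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            new QueryMessage("SELECT 1").writeTo(Channels.newChannel(sink));
            System.out.println("message is " + sink.size() + " bytes on the wire");
        }
    }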

sasavilic commented on September 26, 2024

We have a service that periodically dumps/loads a lot of data. We were using batched inserts, but it was still slow. Only after reading the JDBC source code and finding this issue and a related one did I rewrite parts of our service so that it now generates big inserts (1000 rows per insert), which is a huge improvement. But it still feels like I am not able to achieve 100% of what the Postgres server could serve, because I am being slowed down by plain round trips to the database. It would be nice if batching in JDBC worked as expected and, if possible, used asynchronous operation.
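
The rewrite I describe above looks roughly like this (table and column names are placeholders for our real schema):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.List;

    final class MultiRowInsert {
        // Builds one INSERT with N value tuples instead of sending N single-row statements.
        // Note: the wire protocol caps bind parameters at 65535 per statement, so the
        // chunk size times the column count has to stay below that.
        static void insertChunk(Connection conn, List<String> names) throws SQLException {
            StringBuilder sql = new StringBuilder("INSERT INTO target_table (name) VALUES ");
            for (int i = 0; i < names.size(); i++) {
                sql.append(i == 0 ? "(?)" : ", (?)");
            }
            try (PreparedStatement ps = conn.prepareStatement(sql.toString())) {
                for (int i = 0; i < names.size(); i++) {
                    ps.setString(i + 1, names.get(i));
                }
                ps.executeUpdate();
            }
        }
    }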

davecramer commented on September 26, 2024

@sasavilic the batching issues are an artifact of the server: it does not support batching. If you really want performant inserts, COPY is the way to go.
As far as using NIO for inserts goes, pull requests are welcome.
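
For reference, a minimal sketch of the driver's Copy API (connection details, table, and data below are placeholders):

    import java.io.StringReader;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import org.postgresql.PGConnection;
    import org.postgresql.copy.CopyManager;

    public class CopyInExample {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/test", "user", "secret")) {
                CopyManager copy = conn.unwrap(PGConnection.class).getCopyAPI();
                String csv = "1,alice\n2,bob\n";
                long rows = copy.copyIn(
                        "COPY target_table (id, name) FROM STDIN WITH (FORMAT csv)",
                        new StringReader(csv));
                System.out.println("copied " + rows + " rows");
            }
        }
    }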

vlsi commented on September 26, 2024

with 1000 rows per insert

@sasavilic, which pgjdbc version are you using?
Have you tried adding the reWriteBatchedInserts=true option?
It enables rewriting batched "insert ... values (...)" queries into a single insert ... values (...), (...), (...), ... statement.

Unfortunately, the server side does not really support true batching (it still processes commands one by one even when they are sent in a single network packet), so for high performance there are only three options (a sketch of option 1 follows below):

  1. reWriteBatchedInserts=true
  2. Manually craft a single SQL statement
  3. The Copy API
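
A sketch of option 1 (connection details and table names are placeholders): the batch API stays the same; only the connection property changes, and the driver collapses the batch into multi-values inserts on the wire.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.util.Properties;

    public class RewriteBatchedInsertsExample {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.setProperty("user", "user");               // placeholder credentials
            props.setProperty("password", "secret");
            props.setProperty("reWriteBatchedInserts", "true");

            try (Connection conn = DriverManager.getConnection(
                     "jdbc:postgresql://localhost:5432/test", props);
                 PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO target_table (id, name) VALUES (?, ?)")) {
                conn.setAutoCommit(false);
                for (int i = 0; i < 1000; i++) {
                    ps.setInt(1, i);
                    ps.setString(2, "row-" + i);
                    ps.addBatch();
                }
                ps.executeBatch();
                conn.commit();
            }
        }
    }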

sasavilic commented on September 26, 2024

I think I owe an apology. After double-checking and repeating the tests, I can see that the performance is almost identical. I must have done something wrong when I was measuring the first time. Sorry about the false alarm.

I am using 42.2.2 with manually crafted SQL inserts (variant B) and simple SQL inserts with JDBC batching (variant A). Besides the obvious error in measurement during the first test, I was additionally confused by comments in the source code and on GitHub: I thought the JDBC library waits for command completion after each Bind/Describe/Execute, but after looking at the traffic in Wireshark, I can see that I was wrong.

(I don't want to pollute your thread further, but my problem is that I have a Java process that reads from one table, does some complex processing, and then stores the result in a second table. I use a cursor for fetching data from the source table (5000 rows fetch size) and then insert the rows in batches of 100 (variant A). My Java thread is at 50% utilization, the Postgres backend is at 40% utilization (while the whole server is almost idle), write performance is around 5-7 MB/s (although we have fast SSDs), and average throughput is around 10k rows/s; the network is supposed to be good. I expected that either Postgres or Java would be near 100% utilization, but I can't achieve that. That is why I was suspicious about batch processing and the JDBC driver, and was under the impression that I was losing a lot of time sending these small requests/responses between client and backend.)
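
For completeness, this is roughly the shape of the job (table names and the processing step are placeholders); one detail that matters for the cursor is that pgjdbc only honours the fetch size with autocommit turned off:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class CursorReadBatchWrite {
        public static void main(String[] args) throws Exception {
            String url = "jdbc:postgresql://localhost:5432/test";
            try (Connection read = DriverManager.getConnection(url, "user", "secret");
                 Connection write = DriverManager.getConnection(url, "user", "secret")) {
                read.setAutoCommit(false);    // required for cursor-based (fetch-size) streaming
                write.setAutoCommit(false);

                try (PreparedStatement select = read.prepareStatement(
                         "SELECT id, payload FROM source_table");
                     PreparedStatement insert = write.prepareStatement(
                         "INSERT INTO result_table (id, payload) VALUES (?, ?)")) {
                    select.setFetchSize(5000);                // stream 5000 rows per round trip
                    int pending = 0;
                    try (ResultSet rs = select.executeQuery()) {
                        while (rs.next()) {
                            insert.setInt(1, rs.getInt(1));
                            insert.setString(2, process(rs.getString(2)));
                            insert.addBatch();
                            if (++pending == 100) {           // variant A: batches of 100 rows
                                insert.executeBatch();
                                pending = 0;
                            }
                        }
                    }
                    if (pending > 0) {
                        insert.executeBatch();
                    }
                    write.commit();
                    read.commit();
                }
            }
        }

        private static String process(String in) {
            return in;                                        // stands in for the complex processing
        }
    }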

davecramer commented on September 26, 2024

I still think this would be useful; in fact, rewriting the entire thing with NIO would be useful.

LifeIsStrange commented on September 26, 2024

I might be off-topic here, but just saying that in Java 13 and 15:

https://openjdk.java.net/jeps/353
https://openjdk.java.net/jeps/373

The Java socket API has been reimplemented, which might alleviate some issues and, crucially, is forward compatible with Loom, which will enable true asynchronicity. NIO sockets should also ensure Loom compatibility.

Finally, a noob question:
Will NIO sockets enable truly non-blocking sockets and therefore enable asynchronous queries, query pipelining, and batch processing?
https://2ndquadrant.github.io/postgres/libpq-batch-mode.html

This feature is the biggest performance bottleneck for servers on the TechEmpower benchmark. If you were able to solve this before Loom or R2DBC, you would allow existing Java code to shine and Spring to be among the top 5 fastest HTTP servers in the world!

davecramer commented on September 26, 2024

Interesting. Thanks.

mikehearn commented on September 26, 2024

@LifeIsStrange Loom doesn't automatically make NOTIFY properly async or anything else; for that, the driver really needs to split rx/tx into separate threads. However, Loom would make it cheaper to do that.
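
To illustrate the status quo: notifications are effectively pulled today, so a dedicated rx thread ends up polling something like the sketch below (channel name and connection details are placeholders); Loom would mainly make it cheap to park such a thread per connection.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;
    import org.postgresql.PGConnection;
    import org.postgresql.PGNotification;

    public class NotificationPoller {
        public static void main(String[] args) throws Exception {
            Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/test", "user", "secret");
            try (Statement st = conn.createStatement()) {
                st.execute("LISTEN my_channel");
            }
            PGConnection pg = conn.unwrap(PGConnection.class);

            // Dedicated rx thread: it owns this connection and just polls for notifications,
            // leaving other threads/connections free to run queries.
            Thread rx = new Thread(() -> {
                try (Statement poll = conn.createStatement()) {
                    while (!Thread.currentThread().isInterrupted()) {
                        // A trivial query makes the driver read any pending backend
                        // messages, including NOTIFY payloads.
                        poll.executeQuery("SELECT 1").close();
                        PGNotification[] notifications = pg.getNotifications();
                        if (notifications != null) {
                            for (PGNotification n : notifications) {
                                System.out.println("NOTIFY " + n.getName() + ": " + n.getParameter());
                            }
                        }
                        Thread.sleep(500);
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
            rx.setDaemon(true);
            rx.start();

            Thread.sleep(10_000);   // keep the demo alive briefly
            conn.close();
        }
    }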
