Hi guys,
I've been using this driver for a long time now and just wanted to thank you for the work done.
My Application
I'm stuck on an issue concerning queries that return around 350k documents (more or less the same situation as issues #76 and #70). When there are a lot of documents to retrieve, the request takes far too long to complete.
I tried both ways: using a Cursor and, alternatively, issuing the export request directly over the client connection, but I hit the same latency issue of very slow document retrieval either way.
Here is how I was making the request when using the client connection:
// [...] Connection with entrypoints a list of arango instances (just one address for the moment)
conn, err := http.NewConnection(http.ConnectionConfig{
	Endpoints: entrypoints,
	TLSConfig: &tls.Config{InsecureSkipVerify: true},
})
if err != nil {
	panic("[PANIC] when creating the HTTP connection: " + err.Error())
}
// [...] Client Connection
client, err := driver.NewClient(driver.ClientConfig{
	Connection:     conn,
	Authentication: driver.BasicAuthentication(user, password),
})
if err != nil {
	panic("[PANIC] could not instantiate new Client: " + err.Error())
}
// [...] Requesting Database
conn := client.Connection()
req, err := conn.NewRequest("PUT", "/_db/dbtest/_api/export?collection=collectiontest")
if err != nil {
	return nil, fmt.Errorf("could not retrieve documents")
}
req.SetBody(struct {
	BatchSize int  `json:"batchSize"`
	Count     bool `json:"count"`
	Flush     bool `json:"flush"`
	TTL       int  `json:"ttl"`
}{50000, true, true, 360})
t := time.Now()
resp, err := conn.Do(ctx, req)
log.Printf("Since %s", time.Since(t))
if err != nil {
	return nil, fmt.Errorf("could not retrieve documents")
}
return resp, nil
// [...] Request next batch if more
conn := client.Connection()
var err error
req, err := conn.NewRequest("PUT", "/_db/dbtest/_api/export?collection=collectiontest")
if err != nil {
	return nil, fmt.Errorf("could not retrieve next batch")
}
ctxTimeout, cancel := context.WithTimeout(ctx, 6*time.Minute)
defer cancel()
var resp driver.Response
done := make(chan bool)
go func(resp *driver.Response, err *error) {
	*resp, *err = conn.Do(ctxTimeout, req)
	close(done)
}(&resp, &err)
select {
case <-done:
	if err != nil {
		return nil, fmt.Errorf("could not retrieve documents")
	}
	return resp, nil
case <-ctxTimeout.Done():
	if attempt <= 0 {
		return nil, fmt.Errorf("could not retrieve all documents")
	}
}
I tried the request with batch sizes of 5000 / 50000 / 500000, but the retrieval time stays the same, and it even gets worse as the batch size decreases.
The best time I achieved for 350k documents was 20 seconds.
I then started wondering what could really be slowing down result retrieval, so I dug deeper into the arangodb/go-driver sources.
My investigation
I found this piece of code (in github.com/arangodb/go-driver/http/connection.go), which is what runs when I call conn.Do from your driver:
// Do performs a given request, returning its response.
func (c *httpConnection) Do(ctx context.Context, req driver.Request) (driver.Response, error) {
	httpReq, ok := req.(httpRequest)
	if !ok {
		return nil, driver.WithStack(driver.InvalidArgumentError{Message: "request is not a httpRequest"})
	}
	r, err := httpReq.createHTTPRequest(c.endpoint)
	rctx := ctx
	if rctx == nil {
		rctx = context.Background()
	}
	rctx = httptrace.WithClientTrace(rctx, &httptrace.ClientTrace{
		WroteRequest: func(info httptrace.WroteRequestInfo) {
			httpReq.WroteRequest(info)
		},
	})
	r = r.WithContext(rctx)
	if err != nil {
		return nil, driver.WithStack(err)
	}
	// Block on too many concurrent connections
	if c.connPool != nil {
		select {
		case t := <-c.connPool:
			// Ok, we're allowed to continue
			defer func() {
				// Give back token
				c.connPool <- t
			}()
		case <-rctx.Done():
			// Context cancelled or expired
			return nil, driver.WithStack(rctx.Err())
		}
	}
	resp, err := c.client.Do(r)
	if err != nil {
		return nil, driver.WithStack(err)
	}
	var rawResponse *[]byte
	if ctx != nil {
		if v := ctx.Value(keyRawResponse); v != nil {
			if buf, ok := v.(*[]byte); ok {
				rawResponse = buf
			}
		}
	}
	// Read response body
	defer resp.Body.Close()
	body, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		return nil, driver.WithStack(err)
	}
	if rawResponse != nil {
		*rawResponse = body
	}
	ct := resp.Header.Get("Content-Type")
	var httpResp driver.Response
	switch strings.Split(ct, ";")[0] {
	case "application/json", "application/x-arango-dump":
		httpResp = &httpJSONResponse{resp: resp, rawResponse: body}
	case "application/x-velocypack":
		httpResp = &httpVPackResponse{resp: resp, rawResponse: body}
	default:
		if resp.StatusCode == http.StatusUnauthorized {
			// When unauthorized the server sometimes return a `text/plain` response.
			return nil, driver.WithStack(driver.ArangoError{
				HasError:     true,
				Code:         resp.StatusCode,
				ErrorMessage: string(body),
			})
		}
		// Handle empty 'text/plain' body as empty JSON object
		if len(body) == 0 {
			body = []byte("{}")
			if rawResponse != nil {
				*rawResponse = body
			}
			httpResp = &httpJSONResponse{resp: resp, rawResponse: body}
		} else {
			return nil, driver.WithStack(fmt.Errorf("Unsupported content type '%s' with status %d and content '%s'", ct, resp.StatusCode, string(body)))
		}
	}
	if ctx != nil {
		if v := ctx.Value(keyResponse); v != nil {
			if respPtr, ok := v.(*driver.Response); ok {
				*respPtr = httpResp
			}
		}
	}
	return httpResp, nil
}
I measured the time spent in this small piece of code from the function above:
resp, err := c.client.Do(r)
if err != nil {
	return nil, driver.WithStack(err)
}
which has an acceptable execution time (roughly 10% of the total 20-second latency, about 2-3 seconds). I was then wondering where the other 90% went.
Then I found this:
// Read response body
defer resp.Body.Close()
body, err := ioutil.ReadAll(resp.Body)
if err != nil {
	return nil, driver.WithStack(err)
}
which takes the other 90% of the latency.
Some research on ioutil.ReadAll suggests it is not a good fit for large payloads: it starts from a small buffer (bytes.MinRead, 512 bytes) and grows it repeatedly as data arrives, each growth reallocating the buffer and copying the previously read data before appending the new data.
The bigger the body, the more expensive this treatment becomes.
Here is the official source of ioutil.ReadAll:
// readAll reads from r until an error or EOF and returns the data it read
// from the internal buffer allocated with a specified capacity.
func readAll(r io.Reader, capacity int64) (b []byte, err error) {
	var buf bytes.Buffer
	// If the buffer overflows, we will get bytes.ErrTooLarge.
	// Return that as an error. Any other panic remains.
	defer func() {
		e := recover()
		if e == nil {
			return
		}
		if panicErr, ok := e.(error); ok && panicErr == bytes.ErrTooLarge {
			err = panicErr
		} else {
			panic(e)
		}
	}()
	if int64(int(capacity)) == capacity {
		buf.Grow(int(capacity))
	}
	_, err = buf.ReadFrom(r)
	return buf.Bytes(), err
}

// ReadAll reads from r until an error or EOF and returns the data it read.
// A successful call returns err == nil, not err == EOF. Because ReadAll is
// defined to read from src until EOF, it does not treat an EOF from Read
// as an error to be reported.
func ReadAll(r io.Reader) ([]byte, error) {
	return readAll(r, bytes.MinRead)
}
With bytes.MinRead = 512, and Grow() defined here: https://golang.org/src/bytes/buffer.go
Since the buffer starts at only 512 bytes, reading a large body triggers a whole series of reallocations, each one copying everything read so far.
I tried to propose a change to the way the data is read via a pull request, but I did not manage to modify your sources without impacting other files. Still, it would be really good to move away from ioutil.ReadAll here, especially for requests that return results with a lot of documents.
That would give a significant improvement in request execution time on the client side.
Could we size the buffer once, from the ContentLength set previously, and fill the body in one go instead?