t-tiger / gorm-bulk-insert
Implement BulkInsert using gorm: just pass a slice of structs. Simple and compatible.
License: Apache License 2.0
I used Elastic APM in my project, and every request to the database is logged in APM except the bulk_insert ones. I figured out that the callbacks registered on the referenced DB instance aren't called.
It would be good if the INSERT INTO could be changed to REPLACE INTO, so that rows are updated or created accordingly.
It would be really great to have PostgreSQL support.
I can help you if you guide me.
Just want to ask: has anyone checked whether BulkInsert is protected from SQL injection? I see that raw strings are sent to the DB. Am I right that it is important to validate all data to prevent security issues when using this package?
Can it support gorm v2?
Hi there,
Thanks a lot for putting together a bulk insert for gorm.
I would like to highlight an issue we experienced. Let's say we have a struct with a field defined as:
type fakeTable struct {
	TossACoin string `gorm:"column:toYourWitcher"`
}
Because there is this line:
dbColumns = append(dbColumns, mainScope.Quote(gorm.ToColumnName(key)))
The insert statement will be
... INSERT INTO table (`toss_a_coin`, ...
which is the default naming strategy of gorm.
The expected statement would be:
... INSERT INTO table (`toYourWitcher`, ...
Please, is there a way to avoid overriding the column names provided by tags when they are present?
Could you please create some release tags, so they can be targeted? Depending on master seems risky.
When using gorm batches, it seems that you cannot assign values to the primary key of the struct.
The commit eab1a9e introduces breaking changes.
diff --git a/bulk_insert.go b/bulk_insert.go
index ee09f01..cae9d99 100644
--- a/bulk_insert.go
+++ b/bulk_insert.go
@@ -103,7 +103,7 @@ func extractMapValue(value interface{}, excludeColumns []string) (map[string]int
_, hasForeignKey := field.TagSettingsGet("FOREIGNKEY")
if !containString(excludeColumns, field.Struct.Name) && field.StructField.Relationship == nil && !hasForeignKey &&
- !field.IsIgnored && !(field.DBName == "id" || fieldIsAutoIncrement(field)) {
+ !field.IsIgnored && !fieldIsAutoIncrement(field) {
if field.Struct.Name == "CreatedAt" || field.Struct.Name == "UpdatedAt" {
attrs[field.DBName] = time.Now()
} else if field.StructField.HasDefaultValue && field.IsBlank {
@@ -125,5 +125,5 @@ func fieldIsAutoIncrement(field *gorm.Field) bool {
if value, ok := field.TagSettingsGet("AUTO_INCREMENT"); ok {
return strings.ToLower(value) != "false"
}
- return field.IsPrimaryKey
+ return false
}
If you use the default gorm.Model for your ID/primary key, the field won't have the AUTO_INCREMENT tag. Since the tag is missing, fieldIsAutoIncrement will always return false, so the field will not be ignored. This results in a duplicate key error. The field does, however, have the primary_key tag:
type Model struct {
	ID        uint `gorm:"primary_key"`
	CreatedAt time.Time
	UpdatedAt time.Time
	DeletedAt *time.Time `sql:"index"`
}
I know that a field doesn't auto-increment by default just because it's a primary key, but whether or not the table was created by gorm's migrations, I think it's common to use gorm.Model for your ID field and expect it to be auto-incremented.
Code to reproduce
package main

import (
	"log"

	"github.com/jinzhu/gorm"
	_ "github.com/jinzhu/gorm/dialects/postgres"
	gormbulk "github.com/t-tiger/gorm-bulk-insert"
)

type MyType struct {
	gorm.Model
	Value string
}

func main() {
	// docker run -it --rm -p 5432:5432 -e POSTGRES_PASSWORD=pass postgres
	db, err := gorm.Open(
		"postgres",
		"host=localhost port=5432 user=postgres dbname=test password=pass sslmode=disable",
	)
	if err != nil {
		log.Fatalf("failed to open connection: %v", err)
	}
	defer db.Close()

	toInsert := []interface{}{
		MyType{Value: "first"},
		MyType{Value: "second"},
	}
	if err := gormbulk.BulkInsert(db, toInsert, 3000); err != nil {
		panic(err)
	}
}
Proposed fix
diff --git a/bulk_insert.go b/bulk_insert.go
index fb49a8b..dc38b05 100644
--- a/bulk_insert.go
+++ b/bulk_insert.go
@@ -124,5 +125,5 @@ func fieldIsAutoIncrement(field *gorm.Field) bool {
if value, ok := field.TagSettingsGet("AUTO_INCREMENT"); ok {
return strings.ToLower(value) != "false"
}
- return false
+ return field.DBName == "id" && field.IsPrimaryKey
}
I'm facing an issue with the bulk insert for this struct:
type GeoObject struct {
	ID          uint32  `json:"-" gorm:"column:id;primary_key;auto_increment:false"`
	CountryCode string  `json:"countryCode" gorm:"type:varchar(2);column:country_code"`
	PostalCode  string  `json:"postalCode" gorm:"type:varchar(10);column:postal_code"`
	Latitude    float32 `json:"latitude" gorm:"column:latitude"`
	Longitude   float32 `json:"longitude" gorm:"column:longitude"`
	...
}
The field ID is defined as the primary key but with an explicit auto_increment:false
as the app sets it by itself based on another counter.
Even with valid entries like
{ID:7 CountryCode:cn PostalCode:510000 Latitude:23.12472 Longitude:113.23861...
the insert fails, with MySQL returning the error: Error 1364: Field 'id' doesn't have a default value.
Looking at the code, I see that fields named id and set as a PK are ignored by extractMapValue, regardless of whether the column has an auto_increment or not. It is nonetheless a valid definition to have a table whose id column is a PK with no auto_increment.
I have an auto increment ID field defined in a struct like this:
type MasterJob struct {
	MasterJobID int       `gorm:"column:master_job_id;not null;AUTO_INCREMENT"`
	StartedAt   time.Time `gorm:"column:started_at;not null"`
	CompletedAt time.Time `gorm:"column:completed_at;"`
	Status      Status    `gorm:"column:status;"`
}
When I try to do a batch insert, I get an error:
pq: duplicate key value violates unique constraint "master_jobs_pkey"
Following is my code that uses gorm-bulk-insert and throws an error:
masterJob := &models.MasterJob{
	Status:    models.StatusInprogress,
	StartedAt: time.Now(),
}
masterJobs := make([]*models.MasterJob, 0)
masterJobs = append(masterJobs, masterJob)

masterJob = &models.MasterJob{
	Status:    models.StatusInprogress,
	StartedAt: time.Now(),
}
masterJobs = append(masterJobs, masterJob)

y := make([]interface{}, len(masterJobs))
for i, v := range masterJobs {
	y[i] = *v
}

err := gormbulk.BulkInsert(r.DB, y, 3000)
if err != nil {
	fmt.Println(err)
}
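A possible workaround, assuming the excludeColumns variadic parameter described in the project README (BulkInsert(db, objs, chunkSize, excludeColumns...)), is to exclude the auto-increment field by its struct field name. The helper below is illustrative only; it derives the exclusion list from the gorm tags (the Status field is omitted for brevity):

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
	"time"
)

type MasterJob struct {
	MasterJobID int       `gorm:"column:master_job_id;not null;AUTO_INCREMENT"`
	StartedAt   time.Time `gorm:"column:started_at;not null"`
	CompletedAt time.Time `gorm:"column:completed_at;"`
}

// autoIncrementFields collects struct field names whose gorm tag contains
// an AUTO_INCREMENT setting, so they can be passed as excludeColumns.
func autoIncrementFields(v interface{}) []string {
	var fields []string
	t := reflect.TypeOf(v)
	for i := 0; i < t.NumField(); i++ {
		if strings.Contains(strings.ToUpper(t.Field(i).Tag.Get("gorm")), "AUTO_INCREMENT") {
			fields = append(fields, t.Field(i).Name)
		}
	}
	return fields
}

func main() {
	exclude := autoIncrementFields(MasterJob{})
	fmt.Println(exclude) // [MasterJobID]
	// err := gormbulk.BulkInsert(r.DB, y, 3000, exclude...)
}
```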
Please provide a parameter for customizing the table into which the records are inserted.
Hi @t-tiger,
When will automatic now-time support be added?
Is such a development in progress now?
Bulk insert does not return and fill in the models' primary keys; the primary key is needed in some situations.
This is my testing example, which sets up the database using dockertest:
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/url"
	"os"
	"time"

	"github.com/jinzhu/gorm"
	_ "github.com/lib/pq"
	"github.com/ory/dockertest"
	gormbulk "github.com/t-tiger/gorm-bulk-insert/v2"
)

type ExampleModel struct {
	gorm.Model
	Data json.RawMessage `gorm:"not null;default:'{}'"`
}

func main() {
	r, err := dockerRDS()
	if err != nil {
		log.Fatalf("Failed to create rds connection: %v", err)
	}
	defer r.Release()
	db := r.DB

	err = db.AutoMigrate(&ExampleModel{}).Error
	if err != nil {
		log.Fatalf("Failed to migrate model: %v", err)
	}

	// Start debugging
	db = db.Debug()

	// INSERT INTO "example_models" ("created_at","updated_at","deleted_at")
	// VALUES ('2020-05-15 15:31:06','2020-05-15 15:31:06',NULL) RETURNING "example_models"."id"
	err = db.Save(&ExampleModel{}).Error
	if err != nil {
		log.Fatalf("Failed to save: %v", err)
	}

	// Error: pq: invalid input syntax for type json
	//
	// INSERT INTO "example_models" ("created_at", "data", "deleted_at", "updated_at")
	// VALUES ('2020-05-15 15:25:51', ''{}'', NULL, '2020-05-15 15:25:51')
	// ????
	err = gormbulk.BulkInsert(db, []interface{}{
		&ExampleModel{},
	}, 1000)
	if err != nil {
		log.Fatalf("Failed to bulk insert: %v", err)
	}
	log.Print("DONE")
}

type docker struct {
	DB       *gorm.DB
	URL      string
	pool     *dockertest.Pool
	resource *dockertest.Resource
}

func (d docker) Release() error {
	return d.pool.Purge(d.resource)
}

func dockerRDS() (*docker, error) {
	pool, err := dockertest.NewPool("")
	if err != nil {
		return nil, fmt.Errorf("failed to create docker client: %w (have you run the docker daemon?)", err)
	}
	pool.MaxWait = time.Second * 10

	resource, err := pool.Run("postgres", "9.4.20-alpine", []string{"POSTGRES_PASSWORD="})
	if err != nil {
		return nil, fmt.Errorf("failed to run docker container: %w", err)
	}

	var rdsURL string
	var rdsDB *gorm.DB
	if err := pool.Retry(func() error {
		host := resource.GetBoundIP("5432/tcp")
		if h := os.Getenv("DOCKER_HOST"); h != "" {
			u, err := url.Parse(h)
			if err == nil {
				host = u.Hostname()
			}
		}
		port := resource.GetPort("5432/tcp")
		url := fmt.Sprintf("postgres://postgres@%s:%s/postgres?sslmode=disable", host, port)
		db, err := gorm.Open("postgres", url)
		if err != nil {
			return err
		}
		rdsURL = url
		rdsDB = db
		return nil
	}); err != nil {
		pool.Purge(resource)
		return nil, fmt.Errorf("failed to connect to the docker container: %w", err)
	}

	return &docker{
		DB:       rdsDB,
		URL:      rdsURL,
		pool:     pool,
		resource: resource,
	}, nil
}
Steps to reproduce:
1. Save the code above as main.go.
2. Run go mod init under the same folder to set up the package dependencies (dockerRDS provides a *gorm.DB backed by a throwaway Postgres container).
3. Run go run main.go.
I saw that the SQL generated when calling db.Save(...) is correct:
INSERT INTO "example_models" ("created_at","updated_at","deleted_at")
VALUES ('2020-05-15 15:35:34','2020-05-15 15:35:34',NULL)
RETURNING "example_models"."id"
But the SQL is invalid when calling gormbulk.BulkInsert(db, ...):
INSERT INTO "example_models" ("created_at", "data", "deleted_at", "updated_at")
VALUES ('2020-05-15 15:35:34', ''{}'', NULL, '2020-05-15 15:35:34')
The error:
pq: invalid input syntax for type json
Please extend bulk operations to DELETEs, in order to help Go applications scale with larger traffic flows.
Please extend bulk operations to UPDATEs, in order to help Go applications scale with larger traffic flows.
Hi,
After bulk-creating new rows in the table, I want to obtain the newly generated rows' ids, similar in concept to mysqli_insert_id but applied to the entire batch I just inserted. Is there any plan to support this?
Best,
Peter
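The library builds a plain multi-row INSERT and does not read anything back, so the generated ids are not filled in. On Postgres, one hedged workaround is to issue the multi-row INSERT yourself with a RETURNING clause; the helper below only builds the SQL string (illustrative, not part of the package), leaving execution and id scanning to database/sql or db.Raw:

```go
package main

import (
	"fmt"
	"strings"
)

// bulkInsertReturningSQL builds a parameterized multi-row INSERT that asks
// Postgres to hand back the generated ids. Execute it with database/sql or
// db.Raw(...) and collect the ids from the returned rows.
func bulkInsertReturningSQL(table string, columns []string, rows int) string {
	ph := make([]string, rows)
	for i := range ph {
		ph[i] = "(" + strings.TrimSuffix(strings.Repeat("?,", len(columns)), ",") + ")"
	}
	return fmt.Sprintf(
		"INSERT INTO %q (%s) VALUES %s RETURNING id",
		table, strings.Join(columns, ", "), strings.Join(ph, ", "),
	)
}

func main() {
	fmt.Println(bulkInsertReturningSQL("example_models", []string{"created_at", "data"}, 2))
	// INSERT INTO "example_models" (created_at, data) VALUES (?,?), (?,?) RETURNING id
}
```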
gorm.io/gorm
I am using this with the postgres adapter. Time is inserted with microsecond precision when using gorm's .Create, but that precision seems to be dropped when using this library.
I'm parsing time like this:
// createdAt = "2019-11-20T10:23:09.433043Z"
createdAtTime, err := time.Parse(time.RFC3339, createdAt)
if err != nil {
	panic(err)
}
// record.CreatedAt = createdAtTime
I don't know why but I'm getting a lot of errors like:
(/home/kochan/go/pkg/mod/github.com/t-tiger/[email protected]/bulk_insert.go:95)
[2020-02-20 14:09:28] Error 1615: Prepared statement needs to be re-prepared
How should I do error handling in a batch insert?
Hi @t-tiger,
Would you accept a PR that adds support for looking up gorm:insert_option and adding it to the prepared bulk insert query?
This lets users pass in custom options, like ON DUPLICATE KEY UPDATE x=VALUES(x), allowing bulk upsert.
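For context, the clause such an option would carry looks like this; the helper is illustrative, assuming the proposal means passing the string through gorm v1's db.Set("gorm:insert_option", ...) mechanism:

```go
package main

import (
	"fmt"
	"strings"
)

// onDuplicateKeyUpdate builds the MySQL upsert suffix that could be passed
// through gorm's insert_option setting, e.g.
// db.Set("gorm:insert_option", onDuplicateKeyUpdate("x", "y")).
func onDuplicateKeyUpdate(columns ...string) string {
	assignments := make([]string, len(columns))
	for i, c := range columns {
		assignments[i] = fmt.Sprintf("%s=VALUES(%s)", c, c)
	}
	return "ON DUPLICATE KEY UPDATE " + strings.Join(assignments, ", ")
}

func main() {
	fmt.Println(onDuplicateKeyUpdate("x", "y"))
	// ON DUPLICATE KEY UPDATE x=VALUES(x), y=VALUES(y)
}
```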
Hi gorm-bulk-insert team,
We at the platform team of Razorpay use bulk-insert internally, which in turn uses github.com/jinzhu/gorm v1.9.12 and creates an indirect dependency on github.com/mattn/go-sqlite3 v2.0.3.
Reference: https://github.com/t-tiger/gorm-bulk-insert/blob/master/go.mod#L14
This setup was working fine for us until last week, when our Docker build in GitHub Actions suddenly started failing with:
github.com/mattn/go-sqlite3@v2.0.3+incompatible: unknown revision v2.0.3
Can you resolve this unstable version and help us continue using bulk insert?
Thanks,
Shriram
gormbulk.BulkInsert(db, sliceValue, 3000) can't use a "gorm.io/gorm" db?
I'd like to use this library with the latest version of gorm; however, I'm getting the following error:
Cannot use 'r.db' (type *"gorm.io/gorm".DB) as type *"github.com/jinzhu/gorm".DB
I tried to bulk insert a model like the one below:
type Item struct {
	Bucket *Bucket `gorm:"embedded,embedded_prefix:bucket_"`
	...
}
But I get an error like this:
(/Users/**/go/pkg/mod/github.com/t-tiger/[email protected]/bulk_insert.go:95)
[2020-04-29 13:38:50] Error 1054: Unknown column 'bucket' in 'field list'
Does gorm-bulk-insert not support structs that contain embedded columns?
GORM V2 moved to https://github.com/go-gorm/gorm
When I bulk insert a record with a sql.NullString field, like:
gormbulk.BulkInsert(
	db, []interface{}{
		Test{
			ID: uuid.New(),
			LoopID: sql.NullString{
				String: "",
				Valid:  false,
			},
		},
	}, 3000,
)
I get this SQL:
INSERT INTO `test` (`id`, `loop_id`) VALUES ('89cd202d-5340-4527-b85b-f15fad03b6a2', '')
If I change the insert statement to:
gormbulk.BulkInsert(
	db, []interface{}{
		Test{
			ID: uuid.New(),
			LoopID: sql.NullString{
				String: "abcdef", // Strings except the empty string
				Valid:  false,
			},
		},
	}, 3000,
)
I get this SQL:
INSERT INTO `test` (`id`, `loop_id`) VALUES ('89cd202d-5340-4527-b85b-f15fad03b6a2', NULL)
Should sql.NullString.Valid determine the value of sql.NullString, or is there something wrong with my understanding?
Prepared statements are preferable to sprintf()-style manipulation, in order to reduce the risk of SQL injection attacks.
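A minimal sketch of the placeholder-based alternative: build one (?, ?, …) group per row and hand the flattened values to db.Exec, so the driver does the quoting. The helper is illustrative, not library code:

```go
package main

import (
	"fmt"
	"strings"
)

// buildBulkArgs returns the VALUES placeholder groups and the flattened
// argument list for a parameterized multi-row INSERT.
func buildBulkArgs(rows [][]interface{}) (string, []interface{}) {
	groups := make([]string, len(rows))
	var args []interface{}
	for i, row := range rows {
		groups[i] = "(" + strings.TrimSuffix(strings.Repeat("?, ", len(row)), ", ") + ")"
		args = append(args, row...)
	}
	return strings.Join(groups, ", "), args
}

func main() {
	ph, args := buildBulkArgs([][]interface{}{{"n1", false}, {"n2", true}})
	fmt.Printf("INSERT INTO test (name, is_admin) VALUES %s\n", ph) // ... VALUES (?, ?), (?, ?)
	fmt.Println(args)                                               // [n1 false n2 true]
	// db.Exec("INSERT INTO test (name, is_admin) VALUES "+ph, args...)
}
```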
This is my test example:
type Test struct {
	Name    string `gorm:"Column:name;Size:100;DEFAULT:"`
	IsAdmin bool   `gorm:"Column:is_admin;Size:1;DEFAULT:false"`
}
When I use gorm's Create() function:
db.Create(&Test{
	Name:    "n1",
	IsAdmin: false,
})
I get SQL like:
INSERT INTO test (name, is_admin) VALUES ('n1', '0')
When I use gormbulk.BulkInsert():
gormbulk.BulkInsert(
	db, []interface{}{
		Test{
			Name:    "n1",
			IsAdmin: false,
		},
	}, 3000,
)
I get SQL like:
INSERT INTO test (name, is_admin) VALUES ('n1', 'false')
How should I deal with boolean fields to get results similar to gorm Create()?
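The difference comes from how the bool is stringified: the MySQL driver sends 1/0 for a TINYINT(1) column, while naive fmt-style formatting yields "true"/"false". A minimal illustration of the two renderings (the helper is hypothetical, not part of the library):

```go
package main

import "fmt"

// mysqlBool renders a Go bool the way the MySQL driver does for TINYINT(1).
func mysqlBool(b bool) string {
	if b {
		return "1"
	}
	return "0"
}

func main() {
	fmt.Printf("%v vs %s\n", false, mysqlBool(false)) // false vs 0
	fmt.Printf("%v vs %s\n", true, mysqlBool(true))   // true vs 1
}
```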
The SQL I need looks like:
UPDATE table_name
SET a = CASE WHEN id = 1
THEN 'something'
WHEN id = 2
THEN 'something else'
WHEN id = 3
THEN 'another'
END
WHERE id IN (1,2,3)
;
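A sketch of a helper that assembles such a statement from an id-to-value map is below; it is hypothetical (not part of gorm-bulk-insert) and emits placeholders for the values so they can be bound safely via db.Exec:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// bulkUpdateSQL builds the CASE WHEN form of a bulk update for one column.
// Values are emitted as placeholders so they can be bound safely.
func bulkUpdateSQL(table, column string, updates map[int]string) (string, []interface{}) {
	ids := make([]int, 0, len(updates))
	for id := range updates {
		ids = append(ids, id)
	}
	sort.Ints(ids) // deterministic order for the WHEN branches

	var b strings.Builder
	var args []interface{}
	fmt.Fprintf(&b, "UPDATE %s SET %s = CASE", table, column)
	idStrs := make([]string, len(ids))
	for i, id := range ids {
		fmt.Fprintf(&b, " WHEN id = %d THEN ?", id)
		args = append(args, updates[id])
		idStrs[i] = fmt.Sprint(id)
	}
	fmt.Fprintf(&b, " END WHERE id IN (%s)", strings.Join(idStrs, ","))
	return b.String(), args
}

func main() {
	sql, args := bulkUpdateSQL("table_name", "a", map[int]string{1: "something", 2: "something else"})
	fmt.Println(sql)  // UPDATE table_name SET a = CASE WHEN id = 1 THEN ? WHEN id = 2 THEN ? END WHERE id IN (1,2)
	fmt.Println(args) // [something something else]
}
```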