t-tiger / gorm-bulk-insert
Implement BulkInsert using gorm: just pass a slice of structs. Simple and compatible.
License: Apache License 2.0
I used Elastic APM in my project, and every request to the database is logged in APM except the bulk_insert ones. I figured out that the callbacks registered on the referenced DB instance aren't called.
It would be good if the INSERT INTO could be changed to REPLACE INTO, so that rows are updated or created accordingly.
It would be really great to have PostgreSQL support.
I can help you if you guide me.
Just want to ask: has anyone checked whether BulkInsert is protected from SQL injection? I see that raw strings are sent to the DB. Am I right that it is important to validate all data to prevent security issues when using this package?
Can it support gorm v2?
Hi there,
Thanks a lot for putting together a bulk insert for gorm.
I would like to highlight an issue we experienced. Let's say we have a struct with a field defined as:
type fakeTable struct {
	TossACoin string `gorm:"column:toYourWitcher"`
}
Because there is this line:
dbColumns = append(dbColumns, mainScope.Quote(gorm.ToColumnName(key)))
The insert statement will be
... INSERT INTO table (`toss_a_coin`, ...
which is the default naming strategy of gorm.
The expected statement would be:
... INSERT INTO table (`toYourWitcher`, ...
Please, is there a way to avoid overriding the column names provided by tags when they are present?
Could you please create some release tags, so they can be targeted? Depending on master seems risky.
When using gorm batches, it seems that you cannot assign values to the primary key of the struct.
The commit eab1a9e introduces breaking changes.
diff --git a/bulk_insert.go b/bulk_insert.go
index ee09f01..cae9d99 100644
--- a/bulk_insert.go
+++ b/bulk_insert.go
@@ -103,7 +103,7 @@ func extractMapValue(value interface{}, excludeColumns []string) (map[string]int
_, hasForeignKey := field.TagSettingsGet("FOREIGNKEY")
if !containString(excludeColumns, field.Struct.Name) && field.StructField.Relationship == nil && !hasForeignKey &&
- !field.IsIgnored && !(field.DBName == "id" || fieldIsAutoIncrement(field)) {
+ !field.IsIgnored && !fieldIsAutoIncrement(field) {
if field.Struct.Name == "CreatedAt" || field.Struct.Name == "UpdatedAt" {
attrs[field.DBName] = time.Now()
} else if field.StructField.HasDefaultValue && field.IsBlank {
@@ -125,5 +125,5 @@ func fieldIsAutoIncrement(field *gorm.Field) bool {
if value, ok := field.TagSettingsGet("AUTO_INCREMENT"); ok {
return strings.ToLower(value) != "false"
}
- return field.IsPrimaryKey
+ return false
}
If you use the default gorm.Model for your ID/primary key, the field won't have the AUTO_INCREMENT tag. Since the tag is missing, fieldIsAutoIncrement will always return false, so the field will not be ignored. This results in a duplicate key error. The field does, however, have the primary_key tag:
type Model struct {
	ID        uint `gorm:"primary_key"`
	CreatedAt time.Time
	UpdatedAt time.Time
	DeletedAt *time.Time `sql:"index"`
}
I know that a field doesn't auto-increment by default just because it's a primary key, but whether or not the table was created by gorm's migrations, I think it's common to use gorm.Model for your ID field and expect it to be auto-incremented.
Code to reproduce
package main

import (
	"log"

	"github.com/jinzhu/gorm"
	_ "github.com/jinzhu/gorm/dialects/postgres"
	gormbulk "github.com/t-tiger/gorm-bulk-insert"
)

type MyType struct {
	gorm.Model
	Value string
}

func main() {
	// docker run -it --rm -p 5432:5432 -e POSTGRES_PASSWORD=pass postgres
	db, err := gorm.Open(
		"postgres",
		"host=localhost port=5432 user=postgres dbname=test password=pass sslmode=disable",
	)
	if err != nil {
		log.Fatalf("failed to open connection: %v", err)
	}
	defer db.Close()

	toInsert := []interface{}{
		MyType{Value: "first"},
		MyType{Value: "second"},
	}
	if err := gormbulk.BulkInsert(db, toInsert, 3000); err != nil {
		panic(err)
	}
}
Proposed fix
diff --git a/bulk_insert.go b/bulk_insert.go
index fb49a8b..dc38b05 100644
--- a/bulk_insert.go
+++ b/bulk_insert.go
@@ -124,5 +125,5 @@ func fieldIsAutoIncrement(field *gorm.Field) bool {
if value, ok := field.TagSettingsGet("AUTO_INCREMENT"); ok {
return strings.ToLower(value) != "false"
}
- return false
+ return field.DBName == "id" && field.IsPrimaryKey
}
I'm facing an issue with the bulk insert for this struct:
type GeoObject struct {
	ID          uint32  `json:"-" gorm:"column:id;primary_key;auto_increment:false"`
	CountryCode string  `json:"countryCode" gorm:"type:varchar(2);column:country_code"`
	PostalCode  string  `json:"postalCode" gorm:"type:varchar(10);column:postal_code"`
	Latitude    float32 `json:"latitude" gorm:"column:latitude"`
	Longitude   float32 `json:"longitude" gorm:"column:longitude"`
	...
}
The field ID is defined as the primary key but with an explicit auto_increment:false
as the app sets it by itself based on another counter.
Even with valid entries like
{ID:7 CountryCode:cn PostalCode:510000 Latitude:23.12472 Longitude:113.23861...
the insert fails, with MySQL returning the error: Error 1364: Field 'id' doesn't have a default value.
Looking at the code, I see that fields named id and set as a PK are ignored by extractMapValue, regardless of whether the column has an auto_increment or not. It is nonetheless a valid definition to have a table whose id column is a PK with no auto_increment.
I have an auto increment ID field defined in a struct like this:
type MasterJob struct {
	MasterJobID int       `gorm:"column:master_job_id;not null;AUTO_INCREMENT"`
	StartedAt   time.Time `gorm:"column:started_at;not null"`
	CompletedAt time.Time `gorm:"column:completed_at;"`
	Status      Status    `gorm:"column:status;"`
}
When I try to do a batch insert, I get an error:
pq: duplicate key value violates unique constraint "master_jobs_pkey"
Following is my code that uses gorm-bulk-insert and throws an error:
masterJob := &models.MasterJob{
	Status:    models.StatusInprogress,
	StartedAt: time.Now(),
}
masterJobs := make([]*models.MasterJob, 0)
masterJobs = append(masterJobs, masterJob)

masterJob = &models.MasterJob{
	Status:    models.StatusInprogress,
	StartedAt: time.Now(),
}
masterJobs = append(masterJobs, masterJob)

y := make([]interface{}, len(masterJobs))
for i, v := range masterJobs {
	y[i] = *v
}

err := gormbulk.BulkInsert(r.DB, y, 3000)
if err != nil {
	fmt.Println(err)
}
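A possible workaround, assuming the excludeColumns variadic parameter described in the project README (BulkInsert(db, objs, chunkSize, excludeColumns...)), is to exclude the auto-increment field by its struct field name. The helper below is illustrative only; it derives the exclusion list from the gorm tags (the Status field is omitted for brevity):

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
	"time"
)

type MasterJob struct {
	MasterJobID int       `gorm:"column:master_job_id;not null;AUTO_INCREMENT"`
	StartedAt   time.Time `gorm:"column:started_at;not null"`
	CompletedAt time.Time `gorm:"column:completed_at;"`
}

// autoIncrementFields collects struct field names whose gorm tag contains
// an AUTO_INCREMENT setting, so they can be passed as excludeColumns.
func autoIncrementFields(v interface{}) []string {
	var fields []string
	t := reflect.TypeOf(v)
	for i := 0; i < t.NumField(); i++ {
		if strings.Contains(strings.ToUpper(t.Field(i).Tag.Get("gorm")), "AUTO_INCREMENT") {
			fields = append(fields, t.Field(i).Name)
		}
	}
	return fields
}

func main() {
	exclude := autoIncrementFields(MasterJob{})
	fmt.Println(exclude) // [MasterJobID]
	// err := gormbulk.BulkInsert(r.DB, y, 3000, exclude...)
}
```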
Please provide a parameter for customizing the table into which the records are inserted.
Hi @t-tiger,
When will automatic now-time support be added?
Is such a development in progress now?
Bulk insert does not return and fill in the models' primary keys; the primary key is needed in some situations.
This is my testing example, which sets up the database using dockertest:
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/url"
	"os"
	"time"

	"github.com/jinzhu/gorm"
	_ "github.com/lib/pq"
	"github.com/ory/dockertest"
	gormbulk "github.com/t-tiger/gorm-bulk-insert/v2"
)

type ExampleModel struct {
	gorm.Model
	Data json.RawMessage `gorm:"not null;default:'{}'"`
}

func main() {
	r, err := dockerRDS()
	if err != nil {
		log.Fatalf("Failed to create rds connection: %v", err)
	}
	defer r.Release()
	db := r.DB

	err = db.AutoMigrate(&ExampleModel{}).Error
	if err != nil {
		log.Fatalf("Failed to migrate model: %v", err)
	}

	// Start debugging
	db = db.Debug()

	// INSERT INTO "example_models" ("created_at","updated_at","deleted_at")
	// VALUES ('2020-05-15 15:31:06','2020-05-15 15:31:06',NULL) RETURNING "example_models"."id"
	err = db.Save(&ExampleModel{}).Error
	if err != nil {
		log.Fatalf("Failed to save: %v", err)
	}

	// Error: pq: invalid input syntax for type json
	//
	// INSERT INTO "example_models" ("created_at", "data", "deleted_at", "updated_at")
	// VALUES ('2020-05-15 15:25:51', ''{}'', NULL, '2020-05-15 15:25:51')
	// ????
	err = gormbulk.BulkInsert(db, []interface{}{
		&ExampleModel{},
	}, 1000)
	if err != nil {
		log.Fatalf("Failed to bulk insert: %v", err)
	}
	log.Print("DONE")
}

type docker struct {
	DB       *gorm.DB
	URL      string
	pool     *dockertest.Pool
	resource *dockertest.Resource
}

func (d docker) Release() error {
	return d.pool.Purge(d.resource)
}

func dockerRDS() (*docker, error) {
	pool, err := dockertest.NewPool("")
	if err != nil {
		return nil, fmt.Errorf("failed to create docker client: %w (have you run the docker daemon?)", err)
	}
	pool.MaxWait = time.Second * 10

	resource, err := pool.Run("postgres", "9.4.20-alpine", []string{"POSTGRES_PASSWORD="})
	if err != nil {
		return nil, fmt.Errorf("failed to run docker container: %w", err)
	}

	var rdsURL string
	var rdsDB *gorm.DB
	if err := pool.Retry(func() error {
		host := resource.GetBoundIP("5432/tcp")
		if h := os.Getenv("DOCKER_HOST"); h != "" {
			u, err := url.Parse(h)
			if err == nil {
				host = u.Hostname()
			}
		}
		port := resource.GetPort("5432/tcp")
		url := fmt.Sprintf("postgres://postgres@%s:%s/postgres?sslmode=disable", host, port)
		db, err := gorm.Open("postgres", url)
		if err != nil {
			return err
		}
		rdsURL = url
		rdsDB = db
		return nil
	}); err != nil {
		pool.Purge(resource)
		return nil, fmt.Errorf("failed to connect to the docker container: %w", err)
	}

	return &docker{
		DB:       rdsDB,
		URL:      rdsURL,
		pool:     pool,
		resource: resource,
	}, nil
}
Steps to reproduce:
1. Save the code above as main.go.
2. Run go mod init under the same folder to set up the package dependencies (dockerRDS provides a *gorm.DB backed by a throwaway Postgres container).
3. Run go run main.go.
I saw that the SQL generated when calling db.Save(...) is correct:
INSERT INTO "example_models" ("created_at","updated_at","deleted_at")
VALUES ('2020-05-15 15:35:34','2020-05-15 15:35:34',NULL)
RETURNING "example_models"."id"
But the SQL is invalid when calling gormbulk.BulkInsert(db, ...):
INSERT INTO "example_models" ("created_at", "data", "deleted_at", "updated_at")
VALUES ('2020-05-15 15:35:34', ''{}'', NULL, '2020-05-15 15:35:34')
The error:
pq: invalid input syntax for type json
Please extend bulk operations to DELETEs, in order to help Go applications scale with larger traffic flows.
Please extend bulk operations to UPDATEs, in order to help Go applications scale with larger traffic flows.
Hi,
After bulk-creating new rows in the table, I want to obtain the newly generated rows' ids, similar in concept to mysqli_insert_id but applied to the entire batch I just inserted. Is there any plan to support this?
Best,
Peter
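The library builds a plain multi-row INSERT and does not read anything back, so the generated ids are not filled in. On Postgres, one hedged workaround is to issue the multi-row INSERT yourself with a RETURNING clause; the helper below only builds the SQL string (illustrative, not part of the package), leaving execution and id scanning to database/sql or db.Raw:

```go
package main

import (
	"fmt"
	"strings"
)

// bulkInsertReturningSQL builds a parameterized multi-row INSERT that asks
// Postgres to hand back the generated ids. Execute it with database/sql or
// db.Raw(...) and collect the ids from the returned rows.
func bulkInsertReturningSQL(table string, columns []string, rows int) string {
	ph := make([]string, rows)
	for i := range ph {
		ph[i] = "(" + strings.TrimSuffix(strings.Repeat("?,", len(columns)), ",") + ")"
	}
	return fmt.Sprintf(
		"INSERT INTO %q (%s) VALUES %s RETURNING id",
		table, strings.Join(columns, ", "), strings.Join(ph, ", "),
	)
}

func main() {
	fmt.Println(bulkInsertReturningSQL("example_models", []string{"created_at", "data"}, 2))
	// INSERT INTO "example_models" (created_at, data) VALUES (?,?), (?,?) RETURNING id
}
```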
gorm.io/gorm
I am using this with the postgres adapter. Time is inserted with microsecond precision when using gorm's .Create, but that precision seems to be dropped when using this library.
I'm parsing time like this:
// createdAt = "2019-11-20T10:23:09.433043Z"
createdAtTime, err := time.Parse(time.RFC3339, createdAt)
if err != nil {
	panic(err)
}
// record.CreatedAt = createdAtTime
I don't know why but I'm getting a lot of errors like:
(/home/kochan/go/pkg/mod/github.com/t-tiger/[email protected]/bulk_insert.go:95)
[2020-02-20 14:09:28] Error 1615: Prepared statement needs to be re-prepared
How should I do error handling in a batch insert?
Hi @t-tiger,
Would you accept a PR that adds support for looking up gorm:insert_option and adding it to the prepared bulk insert query?
This lets users pass in custom options, like ON DUPLICATE KEY UPDATE x=VALUES(x), allowing bulk upsert.
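For context, the clause such an option would carry looks like this; the helper is illustrative, assuming the proposal means passing the string through gorm v1's db.Set("gorm:insert_option", ...) mechanism:

```go
package main

import (
	"fmt"
	"strings"
)

// onDuplicateKeyUpdate builds the MySQL upsert suffix that could be passed
// through gorm's insert_option setting, e.g.
// db.Set("gorm:insert_option", onDuplicateKeyUpdate("x", "y")).
func onDuplicateKeyUpdate(columns ...string) string {
	assignments := make([]string, len(columns))
	for i, c := range columns {
		assignments[i] = fmt.Sprintf("%s=VALUES(%s)", c, c)
	}
	return "ON DUPLICATE KEY UPDATE " + strings.Join(assignments, ", ")
}

func main() {
	fmt.Println(onDuplicateKeyUpdate("x", "y"))
	// ON DUPLICATE KEY UPDATE x=VALUES(x), y=VALUES(y)
}
```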
Hi gorm-bulk-insert team,
We at the platform team of Razorpay use bulk-insert internally, which in turn uses github.com/jinzhu/gorm v1.9.12 and creates an indirect dependency on github.com/mattn/go-sqlite3 v2.0.3.
Reference: https://github.com/t-tiger/gorm-bulk-insert/blob/master/go.mod#L14
This setup was working fine for us until last week, when our Docker build in GitHub Actions suddenly started failing with:
github.com/mattn/go-sqlite3@v2.0.3+incompatible: unknown revision v2.0.3
Can you resolve this unstable version and help us continue using bulk insert?
Thanks,
Shriram
gormbulk.BulkInsert(db, sliceValue, 3000) can't use a "gorm.io/gorm" db?
I'd like to use this library with the latest version of gorm; however, I'm getting the following error:
Cannot use 'r.db' (type *"gorm.io/gorm".DB) as type *"github.com/jinzhu/gorm".DB
I tried to bulk insert a model like the one below:
type Item struct {
	Bucket *Bucket `gorm:"embedded,embedded_prefix:bucket_"`
	...
}
But I get an error like this:
(/Users/**/go/pkg/mod/github.com/t-tiger/[email protected]/bulk_insert.go:95)
[2020-04-29 13:38:50] Error 1054: Unknown column 'bucket' in 'field list'
Does gorm-bulk-insert not support structs that contain embedded columns?
GORM V2 moved to https://github.com/go-gorm/gorm
When I bulk insert a record with a sql.NullString field, like:
gormbulk.BulkInsert(
	db, []interface{}{
		Test{
			ID: uuid.New(),
			LoopID: sql.NullString{
				String: "",
				Valid:  false,
			},
		},
	}, 3000,
)
I get this SQL:
INSERT INTO `test` (`id`, `loop_id`) VALUES ('89cd202d-5340-4527-b85b-f15fad03b6a2', '')
If I change the insert statement to:
gormbulk.BulkInsert(
	db, []interface{}{
		Test{
			ID: uuid.New(),
			LoopID: sql.NullString{
				String: "abcdef", // Strings except the empty string
				Valid:  false,
			},
		},
	}, 3000,
)
I get this SQL:
INSERT INTO `test` (`id`, `loop_id`) VALUES ('89cd202d-5340-4527-b85b-f15fad03b6a2', NULL)
Should sql.NullString.Valid determine the value of sql.NullString, or is there something wrong with my understanding?
Prepared statements are preferable to sprintf()-style manipulation, in order to reduce the risk of SQL injection attacks.
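A minimal sketch of the placeholder-based alternative: build one (?, ?, …) group per row and hand the flattened values to db.Exec, so the driver does the quoting. The helper is illustrative, not library code:

```go
package main

import (
	"fmt"
	"strings"
)

// buildBulkArgs returns the VALUES placeholder groups and the flattened
// argument list for a parameterized multi-row INSERT.
func buildBulkArgs(rows [][]interface{}) (string, []interface{}) {
	groups := make([]string, len(rows))
	var args []interface{}
	for i, row := range rows {
		groups[i] = "(" + strings.TrimSuffix(strings.Repeat("?, ", len(row)), ", ") + ")"
		args = append(args, row...)
	}
	return strings.Join(groups, ", "), args
}

func main() {
	ph, args := buildBulkArgs([][]interface{}{{"n1", false}, {"n2", true}})
	fmt.Printf("INSERT INTO test (name, is_admin) VALUES %s\n", ph) // ... VALUES (?, ?), (?, ?)
	fmt.Println(args)                                               // [n1 false n2 true]
	// db.Exec("INSERT INTO test (name, is_admin) VALUES "+ph, args...)
}
```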
This is my test example:
type Test struct {
	Name    string `gorm:"Column:name;Size:100;DEFAULT:"`
	IsAdmin bool   `gorm:"Column:is_admin;Size:1;DEFAULT:false"`
}
When I use gorm's Create() function:
db.Create(&Test{
	Name:    "n1",
	IsAdmin: false,
})
I get SQL like:
INSERT INTO test (name, is_admin) VALUES ('n1', '0')
When I use gormbulk.BulkInsert():
gormbulk.BulkInsert(
	db, []interface{}{
		Test{
			Name:    "n1",
			IsAdmin: false,
		},
	}, 3000,
)
I get SQL like:
INSERT INTO test (name, is_admin) VALUES ('n1', 'false')
How should I deal with boolean fields to get results similar to gorm Create()?
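The difference comes from how the bool is stringified: the MySQL driver sends 1/0 for a TINYINT(1) column, while naive fmt-style formatting yields "true"/"false". A minimal illustration of the two renderings (the helper is hypothetical, not part of the library):

```go
package main

import "fmt"

// mysqlBool renders a Go bool the way the MySQL driver does for TINYINT(1).
func mysqlBool(b bool) string {
	if b {
		return "1"
	}
	return "0"
}

func main() {
	fmt.Printf("%v vs %s\n", false, mysqlBool(false)) // false vs 0
	fmt.Printf("%v vs %s\n", true, mysqlBool(true))   // true vs 1
}
```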
The SQL I need looks like:
UPDATE table_name
SET a = CASE WHEN id = 1
THEN 'something'
WHEN id = 2
THEN 'something else'
WHEN id = 3
THEN 'another'
END
WHERE id IN (1,2,3)
;
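A sketch of a helper that assembles such a statement from an id-to-value map is below; it is hypothetical (not part of gorm-bulk-insert) and emits placeholders for the values so they can be bound safely via db.Exec:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// bulkUpdateSQL builds the CASE WHEN form of a bulk update for one column.
// Values are emitted as placeholders so they can be bound safely.
func bulkUpdateSQL(table, column string, updates map[int]string) (string, []interface{}) {
	ids := make([]int, 0, len(updates))
	for id := range updates {
		ids = append(ids, id)
	}
	sort.Ints(ids) // deterministic order for the WHEN branches

	var b strings.Builder
	var args []interface{}
	fmt.Fprintf(&b, "UPDATE %s SET %s = CASE", table, column)
	idStrs := make([]string, len(ids))
	for i, id := range ids {
		fmt.Fprintf(&b, " WHEN id = %d THEN ?", id)
		args = append(args, updates[id])
		idStrs[i] = fmt.Sprint(id)
	}
	fmt.Fprintf(&b, " END WHERE id IN (%s)", strings.Join(idStrs, ","))
	return b.String(), args
}

func main() {
	sql, args := bulkUpdateSQL("table_name", "a", map[int]string{1: "something", 2: "something else"})
	fmt.Println(sql)  // UPDATE table_name SET a = CASE WHEN id = 1 THEN ? WHEN id = 2 THEN ? END WHERE id IN (1,2)
	fmt.Println(args) // [something something else]
}
```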