gocarina / gocsv
The GoCSV package aims to provide easy CSV serialization and deserialization to the golang programming language
License: MIT License
I'm dealing with fixed-width CSV data, and I'd like a simple way to trim trailing spaces. I know that leading-space trimming is handled by encoding/csv, but I was looking for a way to handle trailing spaces by providing a custom CSVReader that wraps LazyCSVReader, or something along those lines. I can't get my head around how to do that. Is it possible to provide implementations of the Decoder or CSVReader interfaces?
See test case here: https://github.com/User4574/gocsvexample
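A minimal sketch of the wrapping idea, using only the standard library. The names trimReader, newTrimReader and trimAll are hypothetical and not part of gocsv. It assumes that a type with Read and ReadAll methods is enough to stand in for a CSV reader, which matches the CSVReader interface proposed further down this page.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// trimReader (hypothetical name) wraps a *csv.Reader and strips trailing
// spaces from every field it returns.
type trimReader struct {
	r *csv.Reader
}

// Read returns the next record with trailing spaces removed from each field.
func (t trimReader) Read() ([]string, error) {
	record, err := t.r.Read()
	if err != nil {
		return nil, err
	}
	for i, field := range record {
		record[i] = strings.TrimRight(field, " ")
	}
	return record, nil
}

// ReadAll drains the reader through Read so that every record is trimmed.
func (t trimReader) ReadAll() ([][]string, error) {
	var records [][]string
	for {
		record, err := t.Read()
		if err == io.EOF {
			return records, nil
		}
		if err != nil {
			return nil, err
		}
		records = append(records, record)
	}
}

// newTrimReader builds a trimReader over in.
func newTrimReader(in io.Reader) trimReader {
	r := csv.NewReader(in)
	r.TrimLeadingSpace = true // leading spaces are handled by encoding/csv itself
	return trimReader{r: r}
}

// trimAll parses input and returns all records with padding removed.
func trimAll(input string) ([][]string, error) {
	return newTrimReader(strings.NewReader(input)).ReadAll()
}

func main() {
	records, err := trimAll("id   ,name      \n1    ,Jose      \n")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", records)
}
```

If gocsv accepts a custom reader factory (a SetCSVReader call appears in a later report on this page), a wrapper of this shape could probably be returned from it, but that wiring is untested here.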
This may be a trivial point for the CSV format.
Because my customer's input is tab-delimited, I am currently modifying gocsv to support a tab delimiter instead of a comma.
So why not support a custom delimiter, as the Go standard library's encoding/csv does?
type Reader struct {
	Comma rune // field delimiter (set to ',' by NewReader)
	...
}
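For reference, a small sketch of how the Comma field is used with encoding/csv alone. The helper name readDelimited is made up for this example.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// readDelimited (hypothetical helper) parses input with the given field delimiter.
func readDelimited(input string, comma rune) ([][]string, error) {
	r := csv.NewReader(strings.NewReader(input))
	r.Comma = comma // for example '\t' or '|'
	return r.ReadAll()
}

func main() {
	rows, err := readDelimited("id\tname\n1\tJose\n", '\t')
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", rows)
}
```

Because the delimiter is an argument, the same helper covers comma, tab and pipe separated input without any global state.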
Currently, MarshalFile seems to leave parts of the old file behind. Is there any way to clear it out?
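One possible cause, stated as an assumption: if the file is opened without truncation and the new content is shorter than the old one, the tail of the old content remains. A sketch with only the standard library follows; overwrite and demo are hypothetical helper names.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// overwrite replaces the whole content of the file at path with data.
// O_TRUNC discards whatever the file contained before.
func overwrite(path, data string) error {
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0o644)
	if err != nil {
		return err
	}
	if _, err := f.WriteString(data); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// demo writes a long file, overwrites it with a shorter one,
// and returns what is left on disk.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "csvdemo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "out.csv")
	if err := overwrite(path, "id,name\n1,Jose\n2,Daniel\n"); err != nil {
		return "", err
	}
	if err := overwrite(path, "id,name\n3,Vincent\n"); err != nil {
		return "", err
	}
	content, err := os.ReadFile(path)
	return string(content), err
}

func main() {
	content, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Print(content)
}
```

For a file handle that is already open, calling Truncate(0) followed by Seek(0, io.SeekStart) before writing should have the same effect.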
We have a CSV file with duplicate headers, as in the following example:
client_id,client_name,client_age,class,class,class
1,Jose,42,maths,chem,
2,Daniel,26,chem,,
3,Vincent,32,maths,physics,chem
Hi,
I hit a race condition when using UnmarshalToCallback in my code. Running
go test -v -race
reports a race condition when the same code is accessed after the goroutines are spawned.
See the attached log for more details.
RaceCondition.txt
Hi, this is not really an issue with your software but with using it on a badly formed data source.
There are four columns in the CSV, and one of them contains ordinary text that may itself contain commas. Horrible. The fields are a phone number, a date, sent/received, Subject, Content.
An example row looks like this:
,+447755505585,08-22-2013 08:43:12,Send,,Some text, which may contain commas
Does your library provide any way that I can handle this case?
Many thanks
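A sketch of one workaround when only the last column can contain unquoted commas: split each line into a fixed number of fields so that everything after the last expected comma stays together. This bypasses CSV parsing entirely and assumes no other field contains a comma or a quote. The helper name splitRow and the field count of 6 (taken from the example row, which starts with an empty field) are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// splitRow (hypothetical helper) splits line into at most n fields.
// Any commas beyond the first n-1 remain inside the last field.
func splitRow(line string, n int) []string {
	return strings.SplitN(line, ",", n)
}

func main() {
	line := ",+447755505585,08-22-2013 08:43:12,Send,,Some text, which may contain commas"
	for i, field := range splitRow(line, 6) {
		fmt.Printf("%d: %q\n", i, field)
	}
}
```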
I have some csv field names which contain commas and I can't control that.
To deal with this, before unmarshalling the csv I will do:
gocsv.TagSeparator = "#"
But for some reason that causes the marshaller to use this # as the field separator in the CSV. AFAIK these are totally separate concepts: TagSeparator concerns struct tags, and the field separator concerns the format of the marshalled/unmarshalled CSV.
So I expected to be able to parse a CSV using this tag, which works:
DateOfCreation *CivilDate `bigquery:"DateOfCreation,nullable" csv:"Date, of creation#omitempty"`
But then when I marshall the struct back to a CSV, I get
field1#field2#field3
And what I want is
field1,field2,field3
I noticed a missing return (see between ## in the code below) within the unmarshall function in types.go.
When a type implements UnmarshalText rather than UnmarshalCSV, the function falls through and returns the error on the last line instead of returning nil, even though UnmarshalText exists and succeeded.
unMarshallIt := func(finalField reflect.Value) error {
	if finalField.CanInterface() && finalField.Type().Implements(unMarshallerType) {
		if err := finalField.Interface().(TypeUnmarshaller).UnmarshalCSV(value); err != nil {
			return err
		}
		return nil
	} else if finalField.CanInterface() && finalField.Type().Implements(textUnMarshalerType) { // Otherwise try to use TextMarshaller
		if err := finalField.Interface().(encoding.TextUnmarshaler).UnmarshalText([]byte(value)); err != nil {
			return err
		}
		## return nil ##
	}
	return fmt.Errorf("No known conversion from string to " + field.Type().String() + ", " + field.Type().String() + " does not implements TypeUnmarshaller")
}
Would it be reasonable to add functionality that would allow you to specify a default value for a given field? For example:
type User struct {
Email string `csv:"email"`
Name string `csv:"name"`
Status string `csv:"active,default=inactive"`
}
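To make the proposal concrete, here is a sketch of how such a tag option could be interpreted with reflection. The default= option is only the syntax suggested above; this code says nothing about what gocsv itself supports, and applyDefaults and statusAfterDefaults are hypothetical helpers that handle string fields only.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

type User struct {
	Email  string `csv:"email"`
	Name   string `csv:"name"`
	Status string `csv:"active,default=inactive"`
}

// applyDefaults fills every empty string field of the struct that ptr points
// to with the value of a "default=" option found in its csv tag.
func applyDefaults(ptr interface{}) {
	v := reflect.ValueOf(ptr).Elem()
	t := v.Type()
	for i := 0; i < t.NumField(); i++ {
		field := v.Field(i)
		if field.Kind() != reflect.String || field.String() != "" {
			continue
		}
		// The first tag element is the column name; the rest are options.
		options := strings.Split(t.Field(i).Tag.Get("csv"), ",")
		for _, opt := range options[1:] {
			if strings.HasPrefix(opt, "default=") {
				field.SetString(strings.TrimPrefix(opt, "default="))
			}
		}
	}
}

// statusAfterDefaults returns the Status of u once defaults are applied.
func statusAfterDefaults(u User) string {
	applyDefaults(&u)
	return u.Status
}

func main() {
	fmt.Println(statusAfterDefaults(User{Email: "a@example.com"}))
	fmt.Println(statusAfterDefaults(User{Status: "active"}))
}
```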
Error generated:
err: No known conversion from []string to string, []string does not implements TypeMarshaller nor Stringernative:
err: No known conversion from []string to string, []string does not implements TypeMarshaller nor Stringerref:
How would you feel about a PR that would change this function to take in a csv Reader interface instead of the type itself?
This interface would be defined in the package and look something like:
type CSVReader interface {
Read() ([]string, error)
ReadAll() ([][]string, error) // Maybe without this function even because I don't think it's used
}
I want to Unmarshal CSV into a struct with an embedded struct.
Does not seem to work.
type Identity struct {
Supplier string `csv:"supplier"`
Id string `csv:"id"`
}
type Product struct {
Identity
Price int `csv:"price"`
}
Do you have any pointers on how to make it work?
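For illustration, a sketch of how embedded structs can be flattened with reflection, which is roughly what a library needs to do to support this. It is not gocsv's implementation, and csvHeaders and productHeaders are hypothetical names. Embedded pointers to structs are handled the same way.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

type Identity struct {
	Supplier string `csv:"supplier"`
	Id       string `csv:"id"`
}

type Product struct {
	Identity
	Price int `csv:"price"`
}

// csvHeaders collects csv column names from t, descending into embedded
// structs (and embedded pointers to structs) in declaration order.
func csvHeaders(t reflect.Type) []string {
	if t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	var headers []string
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		ft := f.Type
		if ft.Kind() == reflect.Ptr {
			ft = ft.Elem()
		}
		if f.Anonymous && ft.Kind() == reflect.Struct {
			headers = append(headers, csvHeaders(ft)...)
			continue
		}
		// Keep only the name part of the tag and skip ignored fields.
		name := strings.Split(f.Tag.Get("csv"), ",")[0]
		if name != "" && name != "-" {
			headers = append(headers, name)
		}
	}
	return headers
}

// productHeaders returns the flattened column names of Product.
func productHeaders() []string {
	return csvHeaders(reflect.TypeOf(Product{}))
}

func main() {
	fmt.Println(productHeaders())
}
```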
It is currently impossible to use github.com/gocarina/gocsv/v2 in a go.mod-enabled library that may itself be used from a project without go.mod, for example a dep-enabled project.
For example, using github.com/gocarina/gocsv/v2 in github.com/gramework/gramework breaks any dep-enabled project that imports it.
Solving failure: No versions of github.com/gocarina/gocsv met constraints:
master: Could not introduce github.com/gocarina/gocsv@master, as its subpackage github.com/gocarina/gocsv/v2 is missing. (Package is required by github.com/gramework/[email protected].)
master: Could not introduce github.com/gocarina/gocsv@master, as its subpackage github.com/gocarina/gocsv/v2 is missing. (Package is required by github.com/gramework/[email protected].)
v1: Could not introduce github.com/gocarina/gocsv@v1, as its subpackage github.com/gocarina/gocsv/v2 is missing. (Package is required by github.com/gramework/[email protected].)
The Go Modules wiki describes a backward-compatible solution: https://github.com/golang/go/wiki/Modules#releasing-modules-v2-or-higher
Major branch: Update the go.mod file to include a /v3 at the end of the module path in the module directive (e.g., module github.com/my/module/v3). Update import statements within the module to also use /v3 (e.g., import "github.com/my/module/v3/foo"). Tag the release with v3.0.0.
Go versions 1.9.7+, 1.10.3+, and 1.11 are able to properly consume and build a v2+ module created using this approach without requiring updates to consumer code that has not yet opted in to modules (as described in the "Semantic Import Versioning" section above).
A community tool github.com/marwan-at-work/mod helps automate this procedure. See the repository or the community tooling FAQ below for an overview.
To avoid confusion with this approach, consider putting the v3.*.* commits for the module on a separate v3 branch.
If instead you have been previously releasing on master and would prefer to tag v3.0.0 on master, that is a viable option, but consider creating a v1 branch for any future v1 bug fixes.
I have a use case where I already have the csv rows in memory as [][]string. So I no longer need to parse strings.
Primarily, I would like to be able to implement a custom decoder and leverage the readTo function to populate the fields of my structs.
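A sketch of what such an in-memory reader might look like, assuming the two-method CSVReader interface proposed earlier on this page (it is redeclared here so the example is self-contained). The names sliceReader and countRows are hypothetical.

```go
package main

import (
	"fmt"
	"io"
)

// CSVReader mirrors the interface proposed earlier in this thread.
type CSVReader interface {
	Read() ([]string, error)
	ReadAll() ([][]string, error)
}

// sliceReader (hypothetical) serves rows that are already held in memory.
type sliceReader struct {
	rows [][]string
	next int
}

// Compile-time check that sliceReader satisfies the interface.
var _ CSVReader = (*sliceReader)(nil)

// Read returns the next row, or io.EOF once all rows have been consumed.
func (s *sliceReader) Read() ([]string, error) {
	if s.next >= len(s.rows) {
		return nil, io.EOF
	}
	row := s.rows[s.next]
	s.next++
	return row, nil
}

// ReadAll returns every row that has not been read yet.
func (s *sliceReader) ReadAll() ([][]string, error) {
	rest := s.rows[s.next:]
	s.next = len(s.rows)
	return rest, nil
}

// countRows reads rows one at a time until io.EOF and counts them.
func countRows(rows [][]string) int {
	r := &sliceReader{rows: rows}
	n := 0
	for {
		if _, err := r.Read(); err == io.EOF {
			return n
		}
		n++
	}
}

func main() {
	rows := [][]string{{"id", "name"}, {"1", "Jose"}, {"2", "Daniel"}}
	fmt.Println(countRows(rows))
}
```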
Calling MarshalChan on a closed channel (with no values) causes a panic.
The culprit is the zero value read from the channel ("firstValue := <-c") at:
https://github.com/gocarina/gocsv/blob/master/encode.go#L20
To reproduce the error: https://play.golang.org/p/1Hvmg1V9J3
Keep up the good work,
Enrico
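For readers unfamiliar with the underlying Go behaviour: a receive from a closed, empty channel does not block, it yields the zero value. The two-value receive form tells the two cases apart. A minimal sketch with hypothetical helper names:

```go
package main

import "fmt"

// firstOrNone receives one value from c and reports whether the channel
// actually delivered a value (ok is false when c is closed and drained).
func firstOrNone(c <-chan interface{}) (interface{}, bool) {
	v, ok := <-c
	return v, ok
}

// closedEmptyOK returns the ok flag obtained from a closed, empty channel.
func closedEmptyOK() bool {
	c := make(chan interface{})
	close(c)
	_, ok := firstOrNone(c)
	return ok
}

func main() {
	fmt.Println(closedEmptyOK())

	c := make(chan interface{}, 1)
	c <- "row"
	close(c)
	v, ok := firstOrNone(c)
	fmt.Println(v, ok)
}
```

Checking ok before using the first value would let an encoder return early instead of working with a nil interface.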
var operation = "operation_date"
type LatamHeaders struct {
	time_zone string
	Operation_date string `csv:operation`
}
I want to do something like this, where the csv tag takes its value from a variable, but it does not produce any result.
Hello.
I am trying to parse two CSV files that use different separators, "," and "\t", in one program.
I added the code below to change the default separator.
gocsv.SetCSVReader(func(in io.Reader) gocsv.CSVReader {
r := csv.NewReader(in)
r.Comma = '\t'
return r
})
It looks like it works, but it doesn't.
The tab-separated file is parsed correctly, but the comma-separated one no longer is.
It seems the same reader is used for the whole program.
How can I use a different reader for each CSV file?
It would be good to make the writing of the field header line optional
I had a bizarre issue where I was always losing the first field in my unmarshaled CSV files. I dug in and found that the first header string, when printed as a byte array, had the UTF-8 BOM as part of it. I was able to work around it by tagging my first fields like this:
Code string `csv:"\xEF\xBB\xBFCODE,CODE"`
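An alternative to the tag workaround is to drop the BOM before the data reaches the CSV parser. A sketch using only the standard library follows; skipBOM and firstHeader are hypothetical helper names.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// skipBOM returns a reader positioned after a leading UTF-8 byte order mark.
// Input without a BOM is passed through unchanged.
func skipBOM(in io.Reader) io.Reader {
	br := bufio.NewReader(in)
	head, err := br.Peek(3)
	if err == nil && bytes.Equal(head, []byte{0xEF, 0xBB, 0xBF}) {
		br.Discard(3)
	}
	return br
}

// firstHeader parses input and returns the first column name.
func firstHeader(input string) (string, error) {
	r := csv.NewReader(skipBOM(strings.NewReader(input)))
	record, err := r.Read()
	if err != nil {
		return "", err
	}
	return record[0], nil
}

func main() {
	header, err := firstHeader("\xEF\xBB\xBFCODE,NAME\nA1,Jose\n")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", header)
}
```

The wrapped reader can be passed to anything that accepts an io.Reader, so the struct tags can stay free of byte sequences.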
Here's the sample code:
gocsv.SetCSVWriter(func(out io.Writer) *SafeCSVWriter {
return csv.NewWriter(out)
})
csv.NewWriter returns a *csv.Writer, not a *SafeCSVWriter. Perhaps it's supposed to be the following?
gocsv.SetCSVWriter(func(out io.Writer) *gocsv.SafeCSVWriter {
	writer := csv.NewWriter(out)
	writer.Comma = '|'
	return gocsv.NewSafeCSVWriter(writer)
})
Sometimes we need to generate millions of records for testing.
func Marshal(in interface{}, out io.Writer) (err error)
requires all the records to be generated first, which uses a lot of memory.
Could we support an API like func Marshal(c <-chan interface{}, out io.Writer) (err error)?
The example code says import gocsv; it should be changed to import "github.com/gocarina/gocsv".
Pipe-delimited files are extremely common. There doesn't appear to be a way to override the delimiter, and that would be a necessary feature to be able to use this package.
Hello!
We would like you to add a tag (e.g. v1.0) for the current master head, so that we can use gopkg.in and lock our code to a certain version/commit of gocsv.
Thanks in advance!
#94 created a bug where you can no longer ignore structs when marshalling
type Quote struct {
Symbol string
Sector sql.NullString `csv:"-"` // should be ignored
}
from sql package:
type NullString struct {
String string
Valid bool // Valid is true if String is not NULL
}
Will always give you:
AAPL,-SEC,true
GOOGL,,false
I am using the gocsv package to read a CSV file. It works fine with comma-separated files, but for a '|'-separated file it fails with: column 0: wrong number of fields in line.
I need to unmarshal some CSVs without headers. As far as I understand, there is no way to do this with this library currently.
It seems we need a new function like UnmarshalWithoutHeaders, just like the MarshalWithoutHeaders that was created in issue #3. I would think that the order of fields in the struct you are unmarshalling to could be used to map columns, or we could use an index struct tag similar to this library: https://github.com/yunabe/easycsv.
I'm a Go noob so please tell me if I have things very wrong. If this feature does make sense, I'd be happy to try taking a stab at this, but I think I might need some guidance!
Currently, if one column in one line cannot be deserialized (e.g. failed conversion), the whole operation fails. It would be useful to have an option to continue the deserialization with the remaining lines. Collected errors could then be returned and the application could check if the whole operation failed.
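A sketch of the requested behaviour using only the standard library: rows that fail conversion are skipped and their errors collected, while the remaining rows are still processed. The helper parseAges and the two-column layout are made up for the example. Errors from the CSV reader itself are treated as fatal here, which is a simplification.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// parseAges converts the second column of every data row to an int.
// Conversion failures are collected in rowErrs and do not stop the loop.
func parseAges(input string) (ages []int, rowErrs []error) {
	r := csv.NewReader(strings.NewReader(input))
	if _, err := r.Read(); err != nil { // skip the header row
		return nil, []error{err}
	}
	for line := 2; ; line++ {
		record, err := r.Read()
		if err == io.EOF {
			return ages, rowErrs
		}
		if err != nil {
			// Reader-level errors end the loop in this sketch.
			return ages, append(rowErrs, fmt.Errorf("line %d: %w", line, err))
		}
		if len(record) < 2 {
			rowErrs = append(rowErrs, fmt.Errorf("line %d: missing age column", line))
			continue
		}
		age, err := strconv.Atoi(record[1])
		if err != nil {
			rowErrs = append(rowErrs, fmt.Errorf("line %d: %w", line, err))
			continue
		}
		ages = append(ages, age)
	}
}

func main() {
	ages, errs := parseAges("name,age\nJose,42\nDaniel,abc\nVincent,32\n")
	fmt.Println(ages)
	for _, err := range errs {
		fmt.Println(err)
	}
}
```

The caller can then decide whether a non-empty error list means the whole operation failed.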
My use case is reading an extremely large CSV coming from S3, processing the rows, and then outputting the results to a new CSV.
My problem is exiting early and gracefully. I don't see any way of stopping the CSV unmarshalling.
I was hoping that closing the input reader would work, but it does not.
Any suggestions?
If I am using a sql.NullString in my struct, is there a way to define a custom handler so that it is written out as a plain string? I am trying this code in https://github.com/OpenCoreData/ocdServices/blob/master/janus/ageModel.go and it works but does serialize the sql.NullStrings.
Thanks! nice package
Hello!
I am piping csv data into my program and I would like to be able to access them one by one.
I know that the first row will contain headers (or I can supply a string containing the headers), so maybe something like this might work? :
scanner := gocsv.NewScanner(os.Stdin)
for scanner.Scan() {
	err := scanner.UnmarshalString(scanner.Text(), &record)
	if err != nil {
		// process error
		continue
	}
	fmt.Println(record)
}
EDIT: After reading the source files, I made this, but I am still not sure that this should be the correct (only) way to access records one by one:
c := make(chan logRecord)
go processRecords(c)
for {
	err = gocsv.UnmarshalToChan(os.Stdin, c)
	if err == io.EOF {
		break
	}
	if err != nil {
		fmt.Println(err)
	}
}
Thank you!
Andrei
Hi
I noticed this when my CSV file had some bad data in it. If decode.readEach returns an error (https://github.com/gocarina/gocsv/blob/master/decode.go#L193), it does not close the channel and the program hangs. Maybe you could just add a defer outValue.Close() after it is initialized to avoid this?
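The suggested pattern, sketched with a plain typed channel instead of the reflect.Value that the library uses. The function names produce and consume are hypothetical. Because the close is deferred, the consumer's range loop ends even when the producer stops on an error.

```go
package main

import (
	"fmt"
	"strconv"
)

// produce sends parsed values on out. The deferred close runs on every
// return path, including the early return on a conversion error.
func produce(rows []string, out chan<- int) error {
	defer close(out)
	for _, s := range rows {
		n, err := strconv.Atoi(s)
		if err != nil {
			return err
		}
		out <- n
	}
	return nil
}

// consume runs produce in a goroutine, sums what arrives, and returns
// the producer's error once the channel has been closed.
func consume(rows []string) (int, error) {
	out := make(chan int)
	errc := make(chan error, 1) // buffered so the producer never blocks on it
	go func() { errc <- produce(rows, out) }()

	sum := 0
	for n := range out {
		sum += n
	}
	return sum, <-errc
}

func main() {
	sum, err := consume([]string{"1", "2", "x", "4"})
	fmt.Println(sum, err)
}
```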
Would be nice to be able to just get an array of dictionaries keyed by the header row:
https://gist.github.com/drernie/5684f9def5bee832ebc50cabb46c377a
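Independently of the linked gist, here is a short sketch of the idea with encoding/csv only. The helper name csvToMaps is an assumption for this example.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// csvToMaps returns one map per data row, keyed by the header row.
// With the default FieldsPerRecord setting, encoding/csv rejects rows whose
// field count differs from the header, so record[i] is always in range.
func csvToMaps(input string) ([]map[string]string, error) {
	r := csv.NewReader(strings.NewReader(input))
	header, err := r.Read()
	if err != nil {
		return nil, err
	}
	var rows []map[string]string
	for {
		record, err := r.Read()
		if err == io.EOF {
			return rows, nil
		}
		if err != nil {
			return nil, err
		}
		row := make(map[string]string, len(header))
		for i, name := range header {
			row[name] = record[i]
		}
		rows = append(rows, row)
	}
}

func main() {
	rows, err := csvToMaps("client_id,client_name\n4,Jose\n2,Daniel\n")
	if err != nil {
		panic(err)
	}
	for _, row := range rows {
		fmt.Println(row["client_id"], row["client_name"])
	}
}
```

Note that with duplicate header names the later column overwrites the earlier one in each map.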
Is there any way to read the headers and then start streaming, or read one record at a time?
I'd like to do the same on the write side. The idea is to be able to parse a file and filter it with a small memory footprint, since only a few rows would be kept around at any given time.
Is this possible?
UnmarshalCSV() gives an error if we try to parse a file that is missing this column: wrong number of fields in line.
Currently, unmarshalling pointers to built-in types (like *string) doesn't work without implementing an interface. This feature should support that natively.
Unmarshalling into pointer types allows us to make fields optional; otherwise, unmarshalling a partial record would set zero values for the missing fields in the resulting struct, and you wouldn't know if it was because the field was blank or absent.
I have a scenario where I need to accept a subset of supported fields and update a database. I want to make most of the fields optional and only update database fields that are present in a given CSV file.
Given the following User struct:
type User struct {
ID int `csv:"user_id"`
Username string `csv:"username"`
LocationID int `csv:"location_id"`
ExternalID *int `csv:"external_id"` // optional
Nickname *string `csv:"nickname"` // optional
}
user_id,username,external_id
1,userA,1000
2,,
Notice that the location_id and nickname fields are absent.
Any fields present in the CSV with empty values would receive a Go zero-value in the resulting struct. Any missing non-pointer fields would also receive a Go zero-value. Any missing pointer fields would retain a nil value.
[]User{
{
ID: 1,
Username: "userA",
LocationID: 0,
ExternalID: (*int)1000,
Nickname: nil,
},
{
ID: 2,
Username: "",
LocationID: 0,
ExternalID: (*int)0,
Nickname: nil,
},
}
I have structures with the format:
type Customer struct {
Email *string `json:"email,omitempty" csv:"email"`
FirstName *string `json:"firstName,omitempty" csv:"firstName"`
LastName *string `json:"lastName,omitempty" csv:"lastName"`
}
The values are pointers so that they can be nil. Works fine with go's json marshaller but returns the error:
No known conversion from *string to string, *string does not implements TypeMarshaller nor Stringer
I just tried to deserialize an empty CSV file, and 💥!
panic: runtime error: index out of range
goroutine 1 [running]:
github.com/gocarina/gocsv.readTo(0x7f35f4f833c8, 0xc82000a7b0, 0x5497a0, 0xc82000e820, 0x0, 0x0)
/home/liam/go/src/github.com/gocarina/gocsv/decode.go:73 +0x984
github.com/gocarina/gocsv.Unmarshal(0x7f35f4f833a0, 0xc82002c028, 0x5497a0, 0xc82000e820, 0x0, 0x0)
/home/liam/go/src/github.com/gocarina/gocsv/csv.go:121 +0xbb
github.com/gocarina/gocsv.UnmarshalFile(0xc82002c028, 0x5497a0, 0xc82000e820, 0x0, 0x0)
/home/liam/go/src/github.com/gocarina/gocsv/csv.go:106 +0x6e
I think this should return an error rather than causing a panic.
headers := csvRows[0]
body := csvRows[1:]
should probably be preceded by:
if len(csvRows) == 0 {
return errors.New("header row not found")
}
I'll open a PR soon.
I don't see a reason why "Yes" should not be treated the same as "yes" when converting a CSV string to bool.
Line 93 in a7422e7
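A sketch of what case-insensitive handling could look like. This is not the library's conversion code; parseBool is a hypothetical helper that accepts yes/no in any letter case and defers to strconv.ParseBool for everything else.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseBool accepts "yes" and "no" in any letter case. Other inputs are
// lowercased and handed to strconv.ParseBool ("true", "false", "1", "0", ...).
func parseBool(s string) (bool, error) {
	normalized := strings.ToLower(strings.TrimSpace(s))
	switch normalized {
	case "yes":
		return true, nil
	case "no":
		return false, nil
	}
	return strconv.ParseBool(normalized)
}

func main() {
	for _, s := range []string{"Yes", "yes", "NO", "True", "maybe"} {
		v, err := parseBool(s)
		fmt.Println(s, v, err)
	}
}
```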
Given the types from sample_structs_test,
type Sample struct {
Foo string `csv:"foo"`
Bar int `csv:"BAR"`
Baz string `csv:"Baz"`
Frop float64 `csv:"Quux"`
Blah *int `csv:"Blah"`
SPtr *string `csv:"SPtr"`
Omit *string `csv:"Omit,omitempty"`
}
type EmbedSample struct {
Qux string `csv:"first"`
Sample
Ignore string `csv:"-"`
Grault float64 `csv:"garply"`
Quux string `csv:"last"`
}
we know that this will work correctly:
sampleA := Sample{"hellofoo", 2, "hellobaz", 52.0, nil, nil, nil}
a := EmbedSample{"hello", sampleA, "helloignore", 42.0, "helloquux"}
sampleObjects := []EmbedSample{a}
resultNonPtr, err := gocsv.MarshalBytes(sampleObjects)
However, if you change the EmbedSample Sample field to *Sample (perfectly valid in Go), then:
sampleA := Sample{"hellofoo", 2, "hellobaz", 52.0, nil, nil, nil}
a := EmbedSample{"hello", &sampleA, "helloignore", 42.0, "helloquux"}
sampleObjects := []EmbedSample{a}
resultWithPtr, err := gocsv.MarshalBytes(sampleObjects)
resultWithPtr is missing all of the Sample struct's items and differs unexpectedly from resultNonPtr.
UnmarshalToCallback first creates a channel, starts a goroutine to fill the channel, then loops reading from the channel and calls the callback function to handle each object.
I think it should be the user's choice whether the whole process runs synchronously or asynchronously.
So just give us a function like filepath.Walk, so that we can read the CSV as a stream, with a callback that can return an error to stop the iteration.
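A sketch of the requested shape built on encoding/csv, with no goroutines or channels involved. The names walkCSV, errStop and countUntil are hypothetical; errStop plays the role that filepath.SkipAll plays for filepath.Walk.

```go
package main

import (
	"encoding/csv"
	"errors"
	"fmt"
	"io"
	"strings"
)

// errStop is returned by a callback to end the iteration without an error.
var errStop = errors.New("stop iteration")

// walkCSV calls fn synchronously for every record read from in.
// Any other error returned by fn aborts the walk and is passed to the caller.
func walkCSV(in io.Reader, fn func(record []string) error) error {
	r := csv.NewReader(in)
	for {
		record, err := r.Read()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		if err := fn(record); err != nil {
			if errors.Is(err, errStop) {
				return nil
			}
			return err
		}
	}
}

// countUntil counts records until one whose first field equals stopAt.
func countUntil(input, stopAt string) (int, error) {
	n := 0
	err := walkCSV(strings.NewReader(input), func(record []string) error {
		if record[0] == stopAt {
			return errStop
		}
		n++
		return nil
	})
	return n, err
}

func main() {
	n, err := countUntil("1\n2\n3\n4\n", "3")
	fmt.Println(n, err)
}
```

Since the callback runs on the caller's goroutine, stopping early needs no extra synchronization, and a caller who wants asynchronous processing can still start goroutines inside the callback.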
Would it be possible to create a map from two fields? For example:
client_id,client_name
4,Jose
2,Daniel
5,Vincent
I'm looking to create a map[client_id]client_name
Of course, after writing it into a struct, I could create a map. But it would be easier and simpler if gocsv could also do this automatically. Any thoughts?
in reflect.go:
func getStructInfo(rType reflect.Type) *structInfo {
structMapMutex.RLock()
stInfo, ok := structMap[rType]
structMapMutex.RUnlock()
...
I think you should take the write lock and assign structMap[rType] after the lookup misses. Otherwise structMap is useless in the current version.
Make it a real doc.
https://godoc.org/github.com/gocarina/gocsv
Consider the following example:
package main
import (
"github.com/gocarina/gocsv"
"fmt"
"encoding/json"
)
type Foo struct {
Id int `json:"id" csv:"id"`
Value interface{} `json:"value" csv:"value"`
}
func main() {
foo := Foo{Id: 1, Value:"xyz"}
out, err := gocsv.MarshalString([]Foo{foo})
if err != nil {
panic(err)
}
fmt.Println(out)
bytes, err := json.Marshal(foo)
if err != nil {
panic(err)
}
fmt.Println(string(bytes))
}
Output:
id,value
1,
Expected:
id,value
1,xyz
E.g. json tags resulted in expected behaviour:
{"id":1,"value":"xyz"}
I'm trying to use MarshalString with a struct slice containing sql.Null types, but it outputs a CSV with all the fields of each sql.Null type (so for each value it has the type, isNull, and the column value). How would I go about handling that?
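One approach that may work is a small wrapper type with its own CSV conversion method. The sketch below assumes that gocsv's TypeMarshaller interface (named in the error messages elsewhere on this page) consists of MarshalCSV() (string, error); treat that signature as an assumption to check against the package documentation. The names csvNullString and renderNull are hypothetical.

```go
package main

import (
	"database/sql"
	"fmt"
)

// csvNullString (hypothetical) wraps sql.NullString so that a NULL value
// becomes an empty CSV field and a valid one becomes its plain string.
type csvNullString struct {
	sql.NullString
}

// MarshalCSV uses the signature assumed for gocsv's TypeMarshaller.
func (n csvNullString) MarshalCSV() (string, error) {
	if !n.Valid {
		return "", nil
	}
	return n.String, nil
}

// renderNull shows the text that would be written for a given value.
func renderNull(value string, valid bool) string {
	n := csvNullString{sql.NullString{String: value, Valid: valid}}
	out, _ := n.MarshalCSV() // this implementation never returns an error
	return out
}

func main() {
	fmt.Printf("%q\n", renderNull("SEC", true))
	fmt.Printf("%q\n", renderNull("", false))
}
```

Because sql.NullString is embedded, its Scan method is promoted, so the wrapper should still be usable as a scan target when reading from the database.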
csv data:
number,column_a,column_b
1,a,b
2,c,d
3,e,f`
my struct:
type Whatever struct{
JustNumber int `csv:"number"`
ColumnA string `csv:"column_a"`
ColumnB string `csv:"column_b"`
}
the result will be:
{0 a b}
{0 c d}
{0 e f}
but if the CSV data looks like this:
,number,column_a,column_b
,1,a,b
,2,c,d
,3,e,f`
the result will be:
{1 a b}
{2 c d}
{3 e f}
Why can't the first column be populated into the struct?