analytics-go's Introduction

analytics-go

Segment analytics client for Go.

⚠️ Maintenance ⚠️

This library is in maintenance mode. It will send data as intended, but receive no new feature support and only critical maintenance updates from Segment.

Installation

The package can be installed with go get. We recommend using a package version management system, such as the Go vendor directory or a tool like Godep, to avoid issues caused by API-breaking changes between major versions of the library.

To install it in the GOPATH:

go get github.com/segmentio/analytics-go

Documentation

The links below should provide all the documentation needed to make the best use of the library and the Segment API:

Usage

package main

import (
    "os"

    "github.com/segmentio/analytics-go"
)

func main() {
    // Instantiates a client used to send messages to the Segment API.
    client := analytics.New(os.Getenv("SEGMENT_WRITE_KEY"))

    // Enqueues a track event that will be sent asynchronously.
    client.Enqueue(analytics.Track{
        UserId: "test-user",
        Event:  "test-snippet",
    })

    // Flushes any queued messages and closes the client.
    client.Close()
}

License

The library is released under the MIT license.

analytics-go's People

Contributors

achille-roussel, alanjcharles, corgrath, deankarn, ernesto-jimenez, f2prateek, hblanks, jeffreylo, kalamay, kke, metrue, michaelghseg, nickdelja, parkr, prayansh, rohitpaulk, systemizer, tj, vincepri, vladiacobsendgrid, wyattjoh

analytics-go's Issues

SSL: CERTIFICATE_VERIFY_FAILED trying to track an event

On AppEngine, using the urlfetch client, one gets:

segment 2015/11/09 23:34:31 error sending request: Post https://api.segment.io/v1/batch: API error 6 (urlfetch: SSL_CERTIFICATE_ERROR): [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)

The fetch method does take a validate_certificate flag which can be set to false (see https://cloud.google.com/appengine/docs/python/refdocs/google.appengine.api.urlfetch?csw=1#google.appengine.api.urlfetch.fetch).

Is there really a bad cert here?

tests

everyone loves tests

Timestamp is always overwritten

setTimestamp() is always called upon queue(), regardless of whether a value was previously set there.

Is it intentional that it uses time.Now() and not time.Now().UTC()? Does that mean you need to make sure your system is configured with the same timezone as your integrations?

Async Callback

@f2prateek

I think we should define the policy on how the success and failure callbacks are called. Right now they're triggered from different places in the code, which could end up creating dead locks if the application has locked a resource that its callback would also try to acquire, or just race conditions if the app makes assumptions on what goroutine notifies its callbacks.

Here are few options we can explore:

  • Have an internal queue + goroutine that's only in charge of calling the application's callbacks, so all function calls are made from a single goroutine
  • Spawn a new goroutine for every batch of messages we want to notify the application of.
  • Use an internal queue + executor so we dispatch callbacks concurrently with a limit on how many goroutines will be created

Whatever option we choose I'll make sure to document it in the Callback type.
Let me know if you have other ideas or what sounds the best approach to you.
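
As a rough illustration of the first option, here is a minimal sketch of a queue drained by a single goroutine that invokes the callbacks. The names dispatcher, result, notify and errExample are hypothetical and do not come from the library; the point is only that application callbacks always run on one known goroutine.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errExample is a hypothetical failure used for illustration.
var errExample = errors.New("example failure")

// result is a hypothetical notification: a message ID plus the send error
// (nil on success).
type result struct {
	id  string
	err error
}

// dispatcher invokes every application callback from one dedicated goroutine.
type dispatcher struct {
	results chan result
	done    sync.WaitGroup
}

func newDispatcher(onSuccess func(string), onFailure func(string, error)) *dispatcher {
	d := &dispatcher{results: make(chan result, 100)}
	d.done.Add(1)
	go func() {
		defer d.done.Done()
		// Callbacks run here, in FIFO order, and nowhere else.
		for r := range d.results {
			if r.err != nil {
				onFailure(r.id, r.err)
			} else {
				onSuccess(r.id)
			}
		}
	}()
	return d
}

// notify queues a result; it blocks only when the queue is full.
func (d *dispatcher) notify(id string, err error) {
	d.results <- result{id: id, err: err}
}

// close stops accepting results and waits for pending callbacks to finish.
func (d *dispatcher) close() {
	close(d.results)
	d.done.Wait()
}

func main() {
	var calls []string
	d := newDispatcher(
		func(id string) { calls = append(calls, "ok:"+id) },
		func(id string, err error) { calls = append(calls, "fail:"+id) },
	)
	d.notify("a", nil)
	d.notify("b", errExample)
	d.close()
	fmt.Println(calls)
}
```

The trade-off is that a slow callback delays every callback queued behind it, which is what the third option (a bounded executor) would address.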

maps

instead of structs perhaps, easier to remove stuff

Update README

It would make it a better experience for readers to have a README file with a bit more content (quick start, examples, links, etc...)

How to set Timestamp

I'd like to specify a timestamp at which the event occurred. See https://segment.com/docs/spec/common/#timestamps

Can anyone confirm if the below code properly sets the timestamp field? And what would I see in the segment UI?

client.Track(&Track{
        Event:  "my event",
        UserId: 1234,
        Properties: map[string]interface{}{
            "foo":      foo,
        },
        Message: analytics.Message{
            Timestamp: time.String(),
        },
    })

Would this set the originalTimestamp field? With the above code, I'm seeing a timestamp in a different format in Segment.

Calling close before sending anything blocks forever

Hi, thanks for releasing this tool!

I had a weird issue when testing it, where it would block forever if you created a new client, and closed it without sending a single event (in my case a track event):

func main() {
    client := analytics.New("<segmentWriteKey>")
    client.Close()
}

Release 3.0.0

  • Make migration guide from 2.x to 3.0
  • Update site documentation
  • Update debugger pretty examples
  • Update History.md

v1/batch vs. v1/import

The documentation doesn't really mention anything about v1/batch as an endpoint, but rather v1/import. The payload seems to be the same for both of them as documented here: import

godoc markdown?

no clue if it can produce md or not, gh-pages for such a small lib would be annoying

Get rid of the Message type

@f2prateek

I think we should get rid of the Message type and the inheritance-like approach in the current implementation and instead inject the MessageId, SentAt and Timestamp field into each object.

The thing that's bugging me the most is the exported Type field and the message interface hack. Type is only ever used internally and doesn't need to be exposed and we wouldn't even need it internally if we took the approach defined in #53

Make analytics.Client an interface

@f2prateek

What do you think about making analytics.Client an interface? From the current PRs it looks like we're moving toward having all fields unexported, which means there's not much need for the actual type to be exposed.

The interface can look like this:

type Client interface {
    io.Closer

    Alias(*Alias) error

    Page(*Page) error

    Group(*Group) error

    Identify(*Identify) error

    Track(*Track) error
}

This way we could also have a type MockClient struct{} which implements this interface and can be used in test code for example.

Let me know what you think about it.
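
To make the testing argument concrete, here is a minimal sketch of such a mock. It assumes a reduced version of the interface with only Track and Close; the Track struct, the MockClient fields and the signup function are stand-ins invented for this example.

```go
package main

import (
	"fmt"
	"io"
)

// Track is a trimmed-down stand-in for analytics.Track.
type Track struct {
	UserId string
	Event  string
}

// Client is a reduced version of the proposed interface (Track only).
type Client interface {
	io.Closer
	Track(*Track) error
}

// MockClient records messages instead of sending them.
type MockClient struct {
	Tracks []*Track
	Closed bool
}

func (m *MockClient) Track(t *Track) error {
	m.Tracks = append(m.Tracks, t)
	return nil
}

func (m *MockClient) Close() error {
	m.Closed = true
	return nil
}

// signup is hypothetical application code that depends only on the interface.
func signup(c Client, userID string) error {
	return c.Track(&Track{UserId: userID, Event: "signed-up"})
}

func main() {
	mock := &MockClient{}
	if err := signup(mock, "user-1"); err != nil {
		fmt.Println("unexpected error:", err)
		return
	}
	fmt.Println(len(mock.Tracks), mock.Tracks[0].Event)
}
```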

Add the Context type

@f2prateek

The spec says the context field is an object with a specific structure (unlike properties or traits, which are free-form objects).

How about defining an actual struct type to make the code more self-documenting on what can be set?

I guess my main question is, could this be too restrictive in some cases or is there any undocumented parts of the context that our API actually supports?

https://segment.com/docs/spec/common/#context

Question: Offline/Development mode?

Is there a way to configure this library to noop the event send? It's not a huge issue, but would be nice for no connectivity scenarios (train, etc) and to not put spurious events in the dev account.

Happy to submit a PR for this, but I thought I'd check for your thoughts first.

Pass values instead of pointers to send methods

@f2prateek

What do you think about passing values instead of pointers to the send methods? Besides making the syntax heavier and putting more pressure on the compiler and GC I don't really see why this is done (we're not using the nil value for example and it would always cause a panic to pass nil, crashing at runtime instead of compile time).

Basically changing

func (c *Client) Alias(a *Alias) error { ... }

to

func (c *Client) Alias(a Alias) error { ... }

Let me know what you think about it?

Closing a client twice will block

@f2prateek
@calvinfo

Currently the Close method is not idempotent, and worse, it blocks the program's execution if called more than once, see:

// Close and flush metrics.
func (c *Client) Close() error {
    c.quit <- struct{}{}
    close(c.msgs)
    <-c.shutdown
    return nil
}

The issue is that nothing is reading from the c.quit channel anymore once the client has been closed.

One way of solving this is to close the c.quit channel from the client's goroutine before it terminates, but this would cause writing to the channel to panic on the second call of Close, so we have to protect this with a recover call, basically changing the method to something like this:

// Close and flush metrics.
func (c *Client) Close() error {
    defer func() {
        // Always recover, a panic could be raised if c.quit was closed which means
        // the Close method was called more than once.
        recover()
    }()
    c.quit <- struct{}{}
    close(c.msgs)
    <-c.shutdown
    return nil
}

What do you guys think about it?
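
For comparison, an alternative that avoids recover is to guard the shutdown with sync.Once. The sketch below is not the library's code: the client type is a stripped-down stand-in that drops the quit channel and ends its loop when msgs is closed.

```go
package main

import (
	"fmt"
	"sync"
)

// client is a stripped-down stand-in for the library's Client.
type client struct {
	msgs     chan string
	shutdown chan struct{}
	once     sync.Once
	sent     []string
}

func newClient() *client {
	c := &client{msgs: make(chan string, 16), shutdown: make(chan struct{})}
	go c.loop()
	return c
}

// loop drains msgs until the channel is closed, then signals shutdown.
func (c *client) loop() {
	defer close(c.shutdown)
	for m := range c.msgs {
		c.sent = append(c.sent, m)
	}
}

// Close flushes and stops the loop. Only the first call closes c.msgs;
// later calls just wait on the already-closed shutdown channel.
func (c *client) Close() error {
	c.once.Do(func() { close(c.msgs) })
	<-c.shutdown
	return nil
}

func main() {
	c := newClient()
	c.msgs <- "hello"
	c.Close()
	err := c.Close() // second call returns immediately
	fmt.Println(err, c.sent)
}
```

Sending on msgs after Close would still panic, so a real implementation would also need to reject messages once the client is closed.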

Pulling config from file or environment

@f2prateek
@calvinfo

As part of #48 I mentioned we could pull the write-key or other configuration properties from a config file or environment variables (the way the AWS SDKs work).

I'm rather in favor of it. I think it is often handy to be able to change config without affecting the library's API, and it often makes development and operations easier.

For example, we currently require the write-key to be passed to the client's constructor, which means a program that wants to define those values from environment variables or a config file has to implement the logic itself to then pass the settings to the constructor.

The decision here is really defining what the signature of a function like analytics.NewWithConfig would be:

  • not using config files or environment variables
type Config struct {
    ...
}

func NewWithConfig(writeKey string, config Config) Client {
    ...
}
  • with config files or environment variables
type Config struct {
    Key string
    Dir string // path to the segment config directory, would default to something like ~/.segment
    ...
}

// In that case the key is not required anymore, if zero it will be loaded from
// the environment or config file.
// Here we should really return an error if something goes wrong or if no key
// could be found so the program has a way to check that it's not behaving
// properly.
func NewWithConfig(config Config) (Client, error) {
    ...
}

Let me know what you guys think about this.
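
A minimal sketch of the key lookup for the second option might look like this. It only covers the environment fallback, not config files, and it assumes the variable name SEGMENT_WRITE_KEY (borrowed from the README usage example); resolveKey and ErrNoWriteKey are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Config mirrors the second option above; only Key is used in this sketch.
type Config struct {
	Key string
}

// ErrNoWriteKey is returned when no key is set explicitly or in the environment.
var ErrNoWriteKey = errors.New("analytics: no write key configured")

// resolveKey returns the explicit key if set, and otherwise falls back to the
// SEGMENT_WRITE_KEY environment variable (an assumed name).
func resolveKey(config Config) (string, error) {
	if config.Key != "" {
		return config.Key, nil
	}
	if key := os.Getenv("SEGMENT_WRITE_KEY"); key != "" {
		return key, nil
	}
	return "", ErrNoWriteKey
}

func main() {
	key, err := resolveKey(Config{})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("using a write key of length", len(key))
}
```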

Client blocks for the duration of the batch marshal and post (including retries)

When the loop receives a message and len(msgs) == c.size, the list of messages is marshalled into JSON and sent. While this happens, the client will accept messages up to the channel's buffer size, and then block. Worst case, this will be:

jsonMarshalTime + 10 * http-roundtrip + ExponentialRetries

This could be a potentially long time (in CPU terms), all the while the client would be unresponsive or just blocking other calls to client.Track etc.

Since whatever errors or complications happened during the send does not really result in anything but a call to log, it might make sense to do this completely async/concurrently.
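
One way to do this, sketched under assumptions, is to hand each batch to a background goroutine and cap the number of uploads in flight. The sender type and its send function are hypothetical; send stands in for the marshal, POST and retry logic.

```go
package main

import (
	"fmt"
	"sync"
)

// sender posts batches in background goroutines so the receive loop does not
// wait on the network.
type sender struct {
	send func([]string) error // stand-in for marshal + HTTP POST + retries
	sem  chan struct{}        // limits concurrent uploads
	wg   sync.WaitGroup
}

func newSender(maxInFlight int, send func([]string) error) *sender {
	return &sender{send: send, sem: make(chan struct{}, maxInFlight)}
}

// sendAsync hands the batch to a goroutine. It blocks only when maxInFlight
// uploads are already running.
func (s *sender) sendAsync(batch []string) {
	s.sem <- struct{}{}
	s.wg.Add(1)
	go func() {
		defer s.wg.Done()
		defer func() { <-s.sem }()
		if err := s.send(batch); err != nil {
			fmt.Println("error sending batch:", err)
		}
	}()
}

// wait blocks until all in-flight uploads finish; Close would call it.
func (s *sender) wait() { s.wg.Wait() }

func main() {
	sizes := make(chan int, 2)
	s := newSender(2, func(batch []string) error {
		sizes <- len(batch)
		return nil
	})
	s.sendAsync([]string{"a", "b"})
	s.sendAsync([]string{"c"})
	s.wait()
	close(sizes)
	total := 0
	for n := range sizes {
		total += n
	}
	fmt.Println("messages sent:", total)
}
```

With this shape, batches may reach the server out of order, which matters only if something downstream depends on arrival order.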

Redundant Close() called in Client.report(resp) method

c.report is only called in one location, but both the caller and the report method close the Body:

    ...
    res, err := c.Client.Do(req)
    if err != nil {
        c.log("error sending request: %s", err)
        return err
    }
    defer res.Body.Close()

    c.report(res)

    return nil
}

// Report on response body.
func (c *Client) report(res *http.Response) {
    if res.StatusCode < 400 {
        c.verbose("response %s", res.Status)
        return
    }

    defer res.Body.Close() // <-- redundant
    body, err := ioutil.ReadAll(res.Body)
    if err != nil {
        c.log("error reading response body: %s", err)
        return
    }

    c.log("response %s: %s – %s", res.Status, res.StatusCode, body)
}

https://github.com/segmentio/analytics-go/blob/master/analytics.go#L301

Don't force using a log.Logger

@f2prateek

Right now we allow configuring the client logger by defining a *log.Logger object that a client should use.
This does limit the loggers usable to the logger from the standard Go library but there are other logging libs out there that may be used and that we could integrate with.

We only use the Printf method of the logger object we receive. I think we could take one of these two approaches, which also comes in handy when integrating with unit tests:

  • using an interface
type Logger interface {
    Logf(string, ...interface{})
}

type Client struct {
    ...
    Logger Logger
}
  • using a function
type Client struct {
    ...
    Logf func(string, ...interface{})
}

I tend to like the first approach better as it's usually how it's done in Go, it also makes it easy to integrate with the unit tests logger:

func TestX(t *testing.T) {
     client := analytics.New("write-key")
     client.Logger = t // could also be `client.Logf = t.Logf` with the other option
     ...
}

Go makes it also really easy to make functions implement single-method interfaces with tricks like this:

type LoggerFunc func(string, ...interface{})

func (f LoggerFunc) Logf(format string, args ...interface{}) {
    f(format, args...)
}
...
client.Logger = analytics.LoggerFunc(logf)

so we could get the best of both worlds this way.

Let me know what you think about it.

Interval is not configurable

Although the interval is public and mutable, since we start the loop goroutine right away, clients can't actually change the interval.

Predict the size of the batched request

@f2prateek
@calvinfo

I noticed that we don't check for the size of the serialized JSON before sending a request and only rely on the batch size (number of in-flight messages).
Is there a historical reason for that or did we just not spend the time implementing it because limiting the message count works well enough?
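
To illustrate what a size check could look like, here is a minimal sketch that splits messages into batches by their serialized size. The splitBySize function and the maxBatchBytes value are hypothetical, and the sketch ignores the overhead of the batch envelope (brackets, commas, and other top-level fields).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// maxBatchBytes is an illustrative limit, not a value taken from the library.
const maxBatchBytes = 64

// splitBySize groups messages into batches whose summed JSON size stays at or
// under limit. A message larger than limit gets a batch of its own.
func splitBySize(msgs []interface{}, limit int) ([][]json.RawMessage, error) {
	var batches [][]json.RawMessage
	var current []json.RawMessage
	size := 0
	for _, m := range msgs {
		b, err := json.Marshal(m)
		if err != nil {
			return nil, err
		}
		// Flush before adding a message that would push the batch over the limit.
		if len(current) > 0 && size+len(b) > limit {
			batches = append(batches, current)
			current, size = nil, 0
		}
		current = append(current, b)
		size += len(b)
	}
	if len(current) > 0 {
		batches = append(batches, current)
	}
	return batches, nil
}

func main() {
	msgs := []interface{}{
		map[string]string{"event": "first", "userId": "test-user"},
		map[string]string{"event": "second", "userId": "test-user"},
		map[string]string{"event": "third", "userId": "test-user"},
	}
	batches, err := splitBySize(msgs, maxBatchBytes)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, b := range batches {
		fmt.Printf("batch %d: %d message(s)\n", i, len(b))
	}
}
```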

Use time.Time to represent timestamps in data structures

@f2prateek
@calvinfo

Currently the timestamp and sentAt fields have a string type in the data structures (Alias, Track, ...), this is not very handy and developers have to refer to the HTTP API documentation to understand what format this should be in (which isn't linked from the Go code).

I feel like we should use time.Time for these fields and take care of the serialization internally so we take the burden off of the developers using the library and leverage the compiler to verify that they're using valid values (basically a time value, right now they could inject whatever string format in there).

Let me know what you think about it?
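
As a small sketch of what this would look like, encoding/json already serializes time.Time as an RFC 3339 string, so a typed field needs no format knowledge from the caller. The Track struct below is a trimmed-down stand-in with assumed JSON field names, and encode and exampleTime are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// exampleTime is a fixed instant used for illustration.
var exampleTime = time.Date(2016, time.March, 1, 12, 30, 0, 0, time.UTC)

// Track is a trimmed-down stand-in for analytics.Track with a typed timestamp.
type Track struct {
	UserId    string    `json:"userId"`
	Event     string    `json:"event"`
	Timestamp time.Time `json:"timestamp"`
}

// encode normalizes the timestamp to UTC and serializes the message.
// time.Time marshals to an RFC 3339 string on its own.
func encode(t Track) (string, error) {
	t.Timestamp = t.Timestamp.UTC()
	b, err := json.Marshal(t)
	return string(b), err
}

func main() {
	out, err := encode(Track{UserId: "test-user", Event: "test-snippet", Timestamp: exampleTime})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```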

Add Screen type

@f2prateek

There has been no Screen type in the Go library until now, I initially thought the intent was to not have this abstraction because it's not a mobile device but actually I'm seeing that the Java library exposes it.

Should we add analytics.Screen as well to get to feature parity with other server-side libs?

Change Integrations to a map[string]bool

@f2prateek
@calvinfo

From what I read in the documentation, integration objects are of the form { all: true, something: false }, so the values are always booleans.

How about changing the type of the Integrations fields to map[string]bool so we can leverage the compiler to check that the program is doing it right?

Drop godep dependency

@f2prateek
@calvinfo

Hey guys, what do you think about relying on the standard vendoring system (available from Go 1.5+ with GO15VENDOREXPERIMENT=1 set, and enabled by default in Go 1.6)?

Circle-CI is building with Go 1.6 already according to this.

Specify http client timeout

https://github.com/segmentio/analytics-go/blob/master/analytics.go#L155
c.Client.Timeout = 10 * time.Second

This seems to be best practice.

You can test with this program:

package main

import (
  "fmt"
  "log"
  "net/http"
  "net/http/httptest"
  "time"
)

func main() {
  svr := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
    time.Sleep(time.Hour)
  }))
  defer svr.Close()

  fmt.Println("making request")

  c := http.DefaultClient
  // c.Timeout = 10 * time.Second
  _, err := c.Get(svr.URL)
  if err != nil {
    log.Fatalln(err)
  }

  fmt.Println("finished request")
}

With and without the Timeout line.

https://medium.com/@nate510/don-t-use-go-s-default-http-client-4804cb19f779#.wxbmr3nqi

message leak

This program will send all messages:

package main

import (
    "log"
    "strconv"
    "time"

    "github.com/segmentio/analytics-go"
)

func main() {
    client := analytics.New("test")
    client.Endpoint = "https://api.segment.build"
    client.Verbose = true

    for i := 0; i < 10; i++ {
        log.Printf("track(%d)", i)
        err := client.Track(&analytics.Track{
            Event:  strconv.Itoa(i),
            UserId: strconv.Itoa(i),
        })
        if err != nil {
            log.Fatal(err)
        }
    }

    time.Sleep(client.Interval)
    err := client.Close()
    if err != nil {
        log.Fatal(err)
    }
}

Once you remove time.Sleep(client.Interval) it won't send them all.

len(msgs) >= c.Size

This would prevent the buffer from never being emptied if someone sets the size to 0:

// https://github.com/segmentio/analytics-go/blob/master/analytics.go#L316
for {
        select {
        case msg := <-c.msgs:
            c.verbose("buffer (%d/%d) %v", len(msgs), c.Size, msg)
            msgs = append(msgs, msg)
            if len(msgs) == c.Size {
                c.verbose("exceeded %d messages – flushing", c.Size)
                c.send(msgs)
                msgs = nil
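
The excerpt above is cut off in the original report. To show the effect of the comparison in isolation, here is a self-contained sketch; the batch function is hypothetical and only models the flush condition, not the client's loop.

```go
package main

import "fmt"

// batch groups msgs into slices, flushing whenever the buffer holds at least
// size messages. With ==, a size of 0 would never match after an append and
// the buffer would grow forever; with >= every message is flushed immediately.
func batch(msgs []string, size int) [][]string {
	var out [][]string
	var buf []string
	for _, m := range msgs {
		buf = append(buf, m)
		if len(buf) >= size {
			out = append(out, buf)
			buf = nil
		}
	}
	// Flush whatever is left, as Close would.
	if len(buf) > 0 {
		out = append(out, buf)
	}
	return out
}

func main() {
	msgs := []string{"a", "b", "c"}
	fmt.Println(len(batch(msgs, 0)), len(batch(msgs, 2)))
}
```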

Use http.RoundTripper instead of http.Client

@f2prateek

We're currently allowing the analytics.Client to configure the HTTP client it uses to connect to our API. While this is great it offers more flexibility than is usually needed.

http.Client is not a generic type and it allows configuring more than we need/want. For example it manages cookies, redirections, timeouts... things that we potentially want to have full control over.

My understanding is that we want to allow overriding the transport layer (let's say to do unit tests that don't actually connect over the network, use a different pooling policy, etc...) but the rest of the setup should be handled by the library.

I think we could change this to have a Transport http.RoundTripper field in the analytics.Client, which is an interface so consumers of the library can really inject whatever they want here.

Thoughts?
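
For the unit-test use case, here is a minimal sketch of a transport that never touches the network. The fakeTransport type and the post helper are hypothetical; post stands in for whatever the library would do internally with the injected http.RoundTripper.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// fakeTransport answers every request with 200 OK without touching the
// network, and records the URLs it was asked for.
type fakeTransport struct {
	urls []string
}

func (t *fakeTransport) RoundTrip(r *http.Request) (*http.Response, error) {
	t.urls = append(t.urls, r.URL.String())
	if r.Body != nil {
		r.Body.Close() // a RoundTripper is expected to close the request body
	}
	return &http.Response{
		StatusCode: http.StatusOK,
		Status:     "200 OK",
		Header:     make(http.Header),
		Body:       io.NopCloser(strings.NewReader("{}")),
		Request:    r,
	}, nil
}

// post is a hypothetical helper: the library keeps control of the http.Client
// and only the transport is supplied by the caller.
func post(transport http.RoundTripper, url, body string) (int, error) {
	client := &http.Client{Transport: transport}
	res, err := client.Post(url, "application/json", strings.NewReader(body))
	if err != nil {
		return 0, err
	}
	defer res.Body.Close()
	return res.StatusCode, nil
}

func main() {
	ft := &fakeTransport{}
	code, err := post(ft, "https://api.segment.io/v1/batch", `{"batch":[]}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(code, ft.urls)
}
```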

Merging all send methods into one

@f2prateek

I think we could make the client API simpler by merging all send methods into one Send(interface{}) and doing a type check internally to know which message has been passed. That way we could write things like:

client.Send(analytics.Track{ ... })
client.Send(analytics.Page{ ... })
...

which is less redundant than

client.Track(analytics.Track{ ... })
client.Page(analytics.Page{ ... })

and also makes the API simpler, therefore easier to document and learn?

The downside I'm seeing is that we lose the compile-time check that the program is passing valid types to the function, although we can get it back with a trick like this:

type Event interface {
    // This method is unexported so only types defined in the analytics package can
    // implement this interface.
    path() string
}

type Alias struct { ... }

func (a Alias) path() string { return "/alias" }

// Expects a type that satisfies the Event interface
func (c *Client) Send(e Event) error { ... }
