
data-frame's People

Contributors

alex-hhh, bennn, dstorrs, privong, ralsei


data-frame's Issues

Additional data-frame constructors

One of the issues I've had while using data-frame is the lack of auxiliary constructors for data that does not already live in a CSV file, such as data parsed from an API that returns, for example, JSON. data-frame provides the df-read/x family of functions to construct a data-frame from existing data, but nothing equivalent exists for data that doesn't come from a file. Mangling your data before it gets into a data-frame isn't particularly easy at the moment.

Right now, the workflow looks like:

  • turn your data into vectors
  • (possibly) sort those vectors
  • turn those vectors into series, with names
  • (mutably) add all of those series into a new data-frame

or, as I've done a few times:

  • open a temporary file
  • convert the data to a CSV
  • write to that temporary file
  • and read it as a data-frame

which is absolutely suboptimal.
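
For reference, the first workflow above might look roughly like the sketch below. The `rows` data is hypothetical and stands in for whatever an API returned; only `make-data-frame`, `make-series` and `df-add-series!` come from the library.

```racket
#lang racket
(require data-frame)

;; Hypothetical rows, standing in for data parsed from a JSON API.
(define rows '((3 "c") (1 "a") (2 "b")))

;; Steps 1 and 2: sort the rows by their first field, then split them
;; into one vector per column.
(define sorted-rows (sort rows < #:key first))
(define ids   (list->vector (map first sorted-rows)))
(define names (list->vector (map second sorted-rows)))

;; Step 3: wrap each vector in a named series.
(define id-series   (make-series "id" #:data ids))
(define name-series (make-series "name" #:data names))

;; Step 4: mutably add each series to a fresh data-frame.
(define df (make-data-frame))
(df-add-series! df id-series)
(df-add-series! df name-series)
```

Even for two columns this takes four separate steps, which is the overhead the ideas below try to remove.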

Some ideas:

  1. Have make-data-frame take (optionally) series as arguments, so (make-data-frame (make-series ...) (make-series ...) ...) -- this just generally reduces line count for small examples, or when you already have series and don't want to add a df-add-series for each individual one
  2. Add a for/data-frame form, something like:
#lang racket
(require data-frame
         (for-syntax syntax/for-body))
(provide for/data-frame)

(define-syntax (for/data-frame stx)
  (syntax-case stx ()
    [(_ clauses body ... tail-expr)
     (with-syntax ([original stx]
                   [((pre-body ...) (post-body ...))
                    (split-for-body stx #'(body ... tail-expr))])
       #'(for/fold/derived original
           ([current-df (make-data-frame)])
           clauses
           pre-body ...
           (df-add-series! current-df (let () post-body ...))
           current-df))]))

(define df
  (for/data-frame ([name (in-list '("x-var" "y-var"))])
    ;; build-vector needs a length; 10 is an arbitrary choice for this example
    (make-series name #:data (build-vector 10 (λ (_) (random -50 50))))))

but this still requires the make-series at the end -- perhaps it could be values? But this forgoes comparison functions, et cetera.

Better ideas are obviously welcome.
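
Until something like idea 1 exists in the library, it can be approximated in user code. The sketch below is only an illustration: `series->data-frame` is a hypothetical name, not part of data-frame, and it simply loops over `df-add-series!`.

```racket
#lang racket
(require data-frame)

;; Hypothetical helper: build a data-frame from any number of series.
(define (series->data-frame . all-series)
  (define df (make-data-frame))
  (for ([s (in-list all-series)])
    (df-add-series! df s))
  df)

;; Usage: two small series combined in a single expression.
(define df
  (series->data-frame
   (make-series "x" #:data (vector 0 1 2))
   (make-series "y" #:data (vector 2 3 5))))
```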

data-frame/read-csv numbers in scientific notation as string

Hi @alex-hhh, my CSV file contains numbers in both plain decimal format and scientific notation. I noticed that numbers in scientific notation, for example 7.423934362508338e-05, are imported as strings, i.e., "7.423934362508338e-05". This causes problems. For example, trying to plot the data using

#lang racket
(require plot)
(require data-frame)
(define eob (df-read/csv "./eob.csv" #:quoted-numbers? #t))
(plot (list (axes)
            (points (df-select* eob "time" "amp"))))

gives the following error

points: contract violation
  expected: real?
  given: "9.744929335518598e-05"
  in: an element of
      a part of the or/c of
      an element of
      a part of the or/c of
      the 1st argument of
      (->*
       ((or/c
         natural?
         (sequence/c
          (or/c natural? (sequence/c real?)))))
       (#:alpha
        (>=/c 0)
        #:color
        (or/c
         string?
         symbol?
         (and/c
          exact-integer?
          negative?
          (not/c fixnum?))
         (and/c
          exact-integer?
          positive?
          (not/c fixnum?))
         (and/c fixnum? negative?)
         (and/c fixnum? positive? (not/c index?))
         (and/c index? positive? (not/c byte?))
         g444
         1
         0
         (recursive-contract g470 #:impersonator)
         (list/c real? real? real?))
        #:fill-color
        (or/c
         string?
         symbol?
         (and/c
          exact-integer?
          negative?
          (not/c fixnum?))
         (and/c
          exact-integer?
          positive?
          (not/c fixnum?))
         (and/c fixnum? negative?)
         (and/c fixnum? positive? (not/c index?))
         (and/c index? positive? (not/c byte?))
         g444
         1
         0
         (recursive-contract g470 #:impersonator)
         (list/c real? real? real?))
        #:label
        (or/c string? #f pict?)
        #:line-width
        (>=/c 0)
        #:size
        (>=/c 0)
        #:sym
        (or/c
         string?
         char?
         (and/c
          exact-integer?
          negative?
          (not/c fixnum?))
         (and/c
          exact-integer?
          positive?
          (not/c fixnum?))
         (and/c fixnum? negative?)
         (and/c fixnum? positive? (not/c index?))
         (and/c index? positive? (not/c byte?))
         g444
         1
         0
         'dot
         'point
         'pixel
         'plus
         'times
         'asterisk
         '5asterisk
         'odot
         'oplus
         'otimes
         'oasterisk
         'o5asterisk
         'circle
         'square
         'diamond
         'triangle
         'fullcircle
         'fullsquare
         'fulldiamond
         'fulltriangle
         'triangleup
         'triangledown
         'triangleleft
         'triangleright
         'fulltriangleup
         'fulltriangledown
         'fulltriangleleft
         'fulltriangleright
         'rightarrow
         'leftarrow
         'uparrow
         'downarrow
         '4star
         '5star
         '6star
         '7star
         '8star
         'full4star
         'full5star
         'full6star
         'full7star
         'full8star
         'circle1
         'circle2
         'circle3
         'circle4
         'circle5
         'circle6
         'circle7
         'circle8
         'bullet
         'fullcircle1
         'fullcircle2
         'fullcircle3
         'fullcircle4
         'fullcircle5
         'fullcircle6
         'fullcircle7
         'fullcircle8
         'none)
        #:x-jitter
        (>=/c 0)
        #:x-max
        (or/c real? #f)
        #:x-min
        (or/c real? #f)
        #:y-jitter
        (>=/c 0)
        #:y-max
        (or/c real? #f)
        #:y-min
        (or/c real? #f))
       any)
  contract from: 
      <pkgs>/plot-lib/plot/private/plot2d/point.rkt
  blaming: /Users/arif/Documents/test-data-frame.rkt
   (assuming the contract is correct)
  at: <pkgs>/plot-lib/plot/private/plot2d/point.rkt:47:9

Adding more getters

Right now, there are various ways to set or add things on a data-frame, but no way to get them back.

The three most glaring examples, for me, are:

  • There is a df-add-series!, but no df-get-series (to get the series, not the vector)
  • There is a df-set-sorted!, but no df-is-sorted?, or way to recover a comparison function from a series (like series-cmpfn or df-get-cmpfn). This is useful for doing your own binary search over a series (outside of df-index-of and df-lookup, which do this internally)
  • Data frames are constructed with an NA value, for example (df-read/csv file #:na "NA"), but there's no way of finding out what that NA value is after construction

Most of these are useful for manipulation of data-frames after construction.

df-read/x for x separated data file

Hi,

Is there an existing function to read data from a file where the data is separated by some string x other than ,? For example, x=" ".
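
In case it helps frame the question, here is a minimal sketch of doing this by hand. `read-separated` is a hypothetical helper, not a library function. It assumes the first line holds the column names, every row has the same number of fields, and no field contains the separator or needs quoting.

```racket
#lang racket
(require data-frame)

;; Read `path`, split each non-blank line on `sep`, and build a data-frame
;; whose column names come from the first line. Cells that parse as
;; numbers are stored as numbers; everything else stays a string.
(define (read-separated path sep)
  (define rows
    (for/list ([line (in-list (file->lines path))]
               #:unless (string=? (string-trim line) ""))
      (string-split line sep)))
  (define headers (first rows))
  (define df (make-data-frame))
  (for ([name (in-list headers)]
        [i (in-naturals)])
    ;; Collect the i-th field of every data row into one vector.
    (define column
      (for/vector ([row (in-list (rest rows))])
        (define cell (list-ref row i))
        (or (string->number cell) cell)))
    (df-add-series! df (make-series name #:data column)))
  df)
```

A call would look like `(read-separated "data.txt" " ")`, where the file name is a placeholder.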

Changing headers after `df-read/csv`

Hi Alex,

Is there a way to change the headers of the resulting data-frame after df-read/csv?
My csv file contains only the raw data without headers, and I would like to add them manually.

Thanks for the good work and the great documentation,
Laurent
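
A rough sketch of one way to get new names is to copy the columns into a fresh data-frame. `copy-with-names` is a hypothetical helper, and the sketch does not cover how a headerless file gets its initial column names, which the question leaves open.

```racket
#lang racket
(require data-frame)

;; Hypothetical helper: return a new data-frame in which each series named
;; in `renames` (an association list of old-name . new-name) is copied
;; under its new name. Sort order and NA settings are not carried over.
(define (copy-with-names df renames)
  (define result (make-data-frame))
  (for ([pair (in-list renames)])
    (define data (df-select df (car pair)))
    (df-add-series! result (make-series (cdr pair) #:data data)))
  result)

;; Usage with a small hand-built data-frame.
(define original (make-data-frame))
(df-add-series! original (make-series "a" #:data (vector 1 2 3)))
(df-add-series! original (make-series "b" #:data (vector 4 5 6)))

(define renamed
  (copy-with-names original '(("a" . "time") ("b" . "amp"))))
```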

least-squares: matrix-solve contract violation

What program did you run?

#lang racket

(require data-frame)

(define df
  (let* ((df (make-data-frame))
         (xs (make-series "x" #:data '#(0 1)))
         (ys (make-series "y" #:data '#(2 3))))
    (df-add-series! df xs)
    (df-add-series! df ys)
    df))

(df-least-squares-fit df "x" "y" #:mode 'polynomial)

What happened?

matrix-solve: contract violation
  expected: matrix-invertible?
  given: (array #[#[2 1 1] #[1 1 1] #[1 1 1]])
  argument position: 1st
  other arguments...:
   (array #[#[5] #[3] #[3]])

What did you expect to happen?

No error.

I'm not sure what the problem is. Too few points? I would rather have a poor fit (straight line) than an error. But if an error is the right thing, can we clear up the message?
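
A possible explanation, offered as an assumption rather than a confirmed reading of the library's internals: the 3x3 matrix in the error is what the normal equations of a degree-2 polynomial fit would produce for these two points, with entry (i, j) equal to the sum of x^(i+j). With x values 0 and 1, every power sum above the zeroth equals 1, so rows two and three are identical and the matrix is singular. The sketch below rebuilds that matrix with math/matrix.

```racket
#lang racket
(require math/matrix)

;; x values of the two sample points from the report: (0, 2) and (1, 3).
(define xs '(0 1))

;; Sum of x^k over all sample points.
(define (power-sum k)
  (for/sum ([x (in-list xs)]) (expt x k)))

;; Normal-equation matrix for a degree-2 polynomial:
;; entry (i, j) is the sum of x^(i+j).
(define normal-matrix
  (build-matrix 3 3 (λ (i j) (power-sum (+ i j)))))

;; Two identical rows force the determinant to zero.
(define det (matrix-determinant normal-matrix))
(define invertible? (matrix-invertible? normal-matrix))
```

If this reading is right, two points are too few for a quadratic, and a third point with a distinct x value would make the system solvable.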

raco setup: WARNING: undefined tag (discrete-histogram and discrete-histogram-skip)

raco pkg install data-frame prints this warning:

raco setup: --- building documentation --- [5:39:26]
raco setup: 1 running: /data-frame/scribblings/data-frame.scrbl
raco setup: WARNING: undefined tag in /data-frame/scribblings/data-frame.scrbl:
raco setup: ((lib "plot/no-gui.rkt") discrete-histogram)
raco setup: ((lib "plot/utils.rkt") discrete-histogram-skip)
raco setup: 3 rendering: /data-frame/scribblings/data-frame.scrbl

and on https://docs.racket-lang.org/data-frame/index.html, discrete-histogram and discrete-histogram-skip are not clickable.
