
datavalues.jl's Introduction

DataValues

Project Status: Active - The project has reached a stable, usable state and is being actively developed.

Overview

This package provides the type DataValue that is used to represent missing data.

Currently the main use of this type is in the Query.jl and IterableTables.jl packages.

This repo is based on the following principles/ideas:

  • This type is meant to make life for data scientists as easy as possible. That is the main guiding principle.
  • We hook into the dot broadcasting mechanism from Julia 0.7 to provide lifting functionality for functions that don't have specific methods for DataValue arguments.
  • The & and | operators follow the 3VL semantics for DataValues.
  • Comparison operators like ==, < etc. on DataValues return Bool values, i.e. they are normal predicates.
  • The package provides many lifted methods.
  • One can access or unpack the value within a DataValue either via the get(x) function, or use the x[] syntax.
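
The 3VL rule for & and | and the Bool-returning comparisons listed above can be illustrated with a small self-contained sketch. ToyValue, TOY_NA, toy_isna, and3, or3 and toy_less are hypothetical names invented for this example. This is not the DataValues.jl implementation (the package extends the operators themselves), and returning false from toy_less when an operand is missing is an assumption of the sketch.

```julia
# Toy stand-in for a possibly-missing value. Hypothetical; not DataValues.jl code.
struct ToyValue{T}
    hasvalue::Bool
    value::T
end

# Wrap a present value.
ToyValue(x::T) where {T} = ToyValue{T}(true, x)

# A missing Bool; the payload field is ignored when hasvalue is false.
const TOY_NA = ToyValue{Bool}(false, false)

toy_isna(x::ToyValue) = !x.hasvalue

# Three-valued AND: a known false wins; otherwise any missing operand gives missing.
function and3(a::ToyValue{Bool}, b::ToyValue{Bool})
    known_false(x) = !toy_isna(x) && !x.value
    (known_false(a) || known_false(b)) && return ToyValue(false)
    (toy_isna(a) || toy_isna(b)) && return TOY_NA
    return ToyValue(true)
end

# Three-valued OR: a known true wins; otherwise any missing operand gives missing.
function or3(a::ToyValue{Bool}, b::ToyValue{Bool})
    known_true(x) = !toy_isna(x) && x.value
    (known_true(a) || known_true(b)) && return ToyValue(true)
    (toy_isna(a) || toy_isna(b)) && return TOY_NA
    return ToyValue(false)
end

# Comparison as a plain predicate: always a Bool (assumed false if a side is missing).
toy_less(a::ToyValue, b::ToyValue) =
    !toy_isna(a) && !toy_isna(b) && a.value < b.value
```

With these definitions, and3(ToyValue(false), TOY_NA) is a known false while and3(ToyValue(true), TOY_NA) stays missing, which is the asymmetry that distinguishes 3VL from simply propagating missing values.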

Any help with this package would be greatly appreciated!

datavalues.jl's People

Contributors

abhijithch, andreasnoack, andyferris, ararslan, bkamins, cjprybol, davidagold, davidanthoff, fratrik, github-actions[bot], iainnz, johnmyleswhite, mbauman, nalimilan, quinnj, ranjanan, ratanrsur, scottpjones, staticfloat, tkelman


datavalues.jl's Issues

Using Any as a lower bound

This isn't an issue per se, just curious (I am researching Julia subtyping, and we are looking into lower-bounds usages)...

function Base.convert(::Type{DataValueArray{T,N}}, A::AbstractArray{S,N}) where {S >: Any,T,N}

Is there a reason to use A::AbstractArray{S,N} with S >: Any instead of A::AbstractArray{Any,N}?
It seems that there shouldn't be any types distinct from Any that would be supertypes of Any.
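
For readers who want to check the question empirically, here is a minimal sketch using only Base Julia. The helper names lower_bound and exact_any are hypothetical and only report which signature matched. It shows that both forms accept the same example arguments, but it does not settle why the package wrote the signature this way.

```julia
# Hypothetical helpers; they only reveal which arrays each signature accepts.
lower_bound(A::AbstractArray{S,N}) where {S >: Any, N} = S
exact_any(A::AbstractArray{Any,N}) where {N} = Any

mixed = Any[1, "two", 3.0]   # element type is Any
ints = [1, 2, 3]             # element type is Int

# With a lower bound of Any, the only possible binding for S is Any itself.
bound_type = lower_bound(mixed)

# Neither signature accepts a Vector{Int}: Int is not a supertype of Any,
# and parametric types are invariant, so Vector{Int} is not an AbstractArray{Any}.
lower_accepts_ints = applicable(lower_bound, ints)
exact_accepts_ints = applicable(exact_any, ints)
exact_accepts_mixed = applicable(exact_any, mixed)
```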

method ambiguity with similar

A method for similar introduced in this package causes ambiguity for SubArrays:

julia> using DataValues

julia> similar(view([DataValue()], 1, 1))
ERROR: MethodError: similar(::SubArray{DataValues.DataValue{Union{}},0,Array{DataValues.DataValue{Union{}},2},Tuple{Int64,Int64},false}, ::Type{DataValues.DataValue{Union{}}}, ::Tuple{}) is ambiguous. Candidates:
  similar(x::AbstractArray, ::Type{DataValues.DataValue{T}}, dims::Tuple{Vararg{Int64,N}} where N) where T in DataValues at /Users/pietro/.julia/v0.6/DataValues/src/array/primitives.jl:13
  similar(V::SubArray, T::Type, dims::Tuple{Vararg{Int64,N}} where N) in Base at subarray.jl:58
Possible fix, define
  similar(::SubArray, ::Type{DataValues.DataValue{T}}, ::Tuple{Vararg{Int64,N}} where N)
Stacktrace:
 [1] similar(::SubArray{DataValues.DataValue{Union{}},0,Array{DataValues.DataValue{Union{}},2},Tuple{Int64,Int64},false}) at ./abstractarray.jl:520

This is a bit unfortunate as JuliaDB uses DataValue, similar and SubArray quite often.
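
The shape of this ambiguity can be reproduced without the package. The sketch below uses a hypothetical function pick and a hypothetical element type Wrapped in place of similar and DataValue: one method is specific in the element type, the other in the container type, and neither wins for a SubArray argument. Adding a method for the intersection, as the error message suggests, resolves it. This only illustrates the mechanism and is not a patch for DataValues.jl.

```julia
# Hypothetical element type standing in for DataValue{T}.
struct Wrapped{T}
    x::T
end

# Specific in the element type, generic in the container (like the package method).
pick(a::AbstractArray, ::Type{Wrapped{T}}) where {T} = :element_method

# Specific in the container, generic in the element type (like the Base method).
pick(a::SubArray, ::Type) = :container_method

v = view([1, 2, 3], 1:2)

# Neither method is more specific for (SubArray, Type{Wrapped{Int}}),
# so dispatch raises a MethodError reporting the ambiguity.
was_ambiguous = try
    pick(v, Wrapped{Int})
    false
catch err
    err isa MethodError
end

# Cover the intersection explicitly, as the error message proposes.
pick(a::SubArray, ::Type{Wrapped{T}}) where {T} = :resolved

resolved = pick(v, Wrapped{Int})
```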

Some issue with similar

@JeffBezanson reports this over at JuliaData/Missings.jl#6 (comment):

It works pretty well, but I ran into the issue recently that similar(a::DataValueArray, Any) does not return an array that can hold any value, which prevents Base map and collect from working.

@JeffBezanson: do you have a little more info what you were trying to do?

The following seems to work:

using DataValues

x = DataValueArray([1,2,NA])

y = similar(x, Any)
y[1]="asdf"
y[2]=3.5

The type of y is DataValues.DataValueArray{Any,1}, and now that I think about it, maybe it should just be Array{Any,1}, right? But things shouldn't break, regardless of that specific choice.

Naming

Couple of simple questions:

  • What should the name of the package be?
  • What should the name of the type be? Currently it is NAable.
  • What should the name of the function checking for missing values be? Currently it is isna.
  • What should the name of the missing value be? Currently it is NA.

I'd like to get feedback, but also make a decision fairly soonish so that I can register the package and then have Query.jl depend on it.

I'm not a huge fan of NAable, but I really like that we can keep using isna and NA with that choice. Plus, I feel that folks using this in e.g. Query.jl will more often have to use the latter two identifiers than the actual type name.

If anyone has a better name, please speak up! Having short names for the equivalent for isna and NA is a huge plus for such suggestions.

Should comparison operators return Bool values?

From the discussions in JuliaStats/NullableArrays.jl#85, https://github.com/JuliaData/Roadmap.jl/issues/3, and JuliaData/DataFramesMeta.jl#58 it seems that the main open question is the behavior of comparison operators. I understand that there are good reasons for the current behavior of NAables. Unfortunately, it could lead to subtle errors, e.g.:

julia> include("NAable.jl"); using .NAables

julia> a = [NAable(1), NAable(-1), NAable{Int}()]
3-element Array{NAables.NAable{Int64},1}:
1  
-1 
#NA

julia> positive = broadcast(>=, a, 0)
3-element Array{Bool,1}:
true
false
false

julia> b = Any[NA, NA, NA]; b[positive] = "+"; b[~positive] = "-"; b
3-element Array{Any,1}:
"+"
"-"
"-"

julia> getsign(x) = x < 0 ? "-" : "+"
getsign (generic function with 1 method)

julia> getsign.(a)
3-element Array{String,1}:
"+"
"-"
"+"

If comparison operators return NAable{Bool} instead, and a function like
bool{T<:Bool}(x::NAable{T}) = isna(x) ? false : x.value exists, the above would become:

julia> include("NAable3VL.jl"); using .NAables

julia> a = [NAable(1), NAable(-1), NAable{Int}()]
3-element Array{NAables.NAable{Int64},1}:
1  
-1 
#NA

julia> positive = broadcast(>=, a, 0)
3-element Array{NAables.NAable{Bool},1}:
true 
false
#NA  

julia> b = Any[NA, NA, NA]; b[bool.(positive)] = "+"; b[bool.(~positive)] = "-"; b
3-element Array{Any,1}:
"+"
"-"
NA 

julia> getsign(x) = x < 0 ? "-" : "+"
getsign (generic function with 1 method)

julia> getsign.(a)
ERROR: TypeError: non-boolean (NAables.NAable{Bool}) used in boolean context

Maybe one can automatically treat NA as false in filter-like contexts like Boolean indexing, generators, @where, ..., so one does not have to wrap such conditions in bool(...). But because it is not safe to assume that the author of an arbitrary function using if ... else ... end or the ternary operator has anticipated the possibility of NAs, an error should be thrown if NA occurs in a comparison in a control flow statement. As Stefan Karpinski pointed out in JuliaStats/NullableArrays.jl#85 (comment), it could still be possible to allow non-NA NAable{Bool}s in such situations.

Reduce invalidations

Loading DataValues causes a lot of invalidations, which in turn force a lot of already loaded package code to be recompiled (the full list is at the bottom of this post).

It would be nice to at least fix these methods, which together invalidate about 9,000 methods:

==(a::DataValue{T1}, b::T2) where {T1,T2} = isna(a) ? false : unsafe_get(a) == b

Base.convert(::Type{Any}, ::DataValue{Union{}}) = NA
(proposed to be removed in #51)
==(a::T1, b::DataValue{T2}) where {T1,T2} = isna(b) ? false : a == unsafe_get(b)

Here are the invalidations on v1.8:

julia> using SnoopCompileCore

julia> invalidations = @snoopr using DataValues;

julia> using SnoopCompile

julia> trees = invalidation_trees(invalidations)
13-element Vector{SnoopCompile.MethodInvalidations}:
 inserting convert(::Type{Array{S, N}}, X::DataValueArray{T, N}) where {S, T, N} in DataValues at /home/sethaxen/.julia/packages/DataValues/N7oeL/src/array/primitives.jl:271 invalidated:
   mt_backedges: 1: signature Tuple{typeof(convert), Type{Vector{Any}}, Any} triggered MethodInstance for setindex!(::Vector{Vector{Any}}, ::Any, ::Int64) (0 children)

 inserting |(x::DataValue{Bool}, y::Bool) in DataValues at /home/sethaxen/.julia/packages/DataValues/N7oeL/src/scalar/core.jl:272 invalidated:
   mt_backedges: 1: signature Tuple{typeof(|), Any, Bool} triggered MethodInstance for Base._base(::Int64, ::Integer, ::Int64, ::Bool) (0 children)

 inserting ^(a::T1, b::DataValue{T2}) where {T1, T2} in DataValues at /home/sethaxen/.julia/packages/DataValues/N7oeL/src/scalar/operations.jl:75 invalidated:
   mt_backedges: 1: signature Tuple{typeof(^), String, Any} triggered MethodInstance for OhMyREPL.untokenize_with_ANSI(::IOContext{IOBuffer}, ::Vector{Crayons.Crayon}, ::Vector{Tokenize.Tokens.Token}, ::Any) (0 children)

 inserting &(x::Bool, y::DataValue{Bool}) in DataValues at /home/sethaxen/.julia/packages/DataValues/N7oeL/src/scalar/core.jl:254 invalidated:
   mt_backedges: 1: signature Tuple{typeof(&), Bool, Any} triggered MethodInstance for div(::Unsigned, ::Int64, ::RoundingMode{:Down}) (0 children)
                 2: signature Tuple{typeof(&), Bool, Any} triggered MethodInstance for div(::Unsigned, ::Int64, ::RoundingMode{:Up}) (0 children)
                 3: signature Tuple{typeof(&), Bool, Any} triggered MethodInstance for Base.var"#string#427"(::Int64, ::Int64, ::typeof(string), ::Unsigned) (0 children)

 inserting mapreduce(f, op::Function, X::T; skipna) where {N, S<:DataValue, T<:AbstractArray{S, N}} in DataValues at /home/sethaxen/.julia/packages/DataValues/N7oeL/src/array/reduce.jl:109 invalidated:
   backedges: 1: superseding mapreduce(f, op, A::Union{Base.AbstractBroadcasted, AbstractArray}; dims, init) in Base at reducedim.jl:357 with MethodInstance for mapreduce(::Base.ExtremaMap{typeof(identity)}, ::typeof(Base._extrema_rf), ::Vector) (8 children)

 inserting similar(x::AbstractArray, ::Type{DataValue{T}}, dims::Tuple{Vararg{Int64, N}} where N) where T in DataValues at /home/sethaxen/.julia/packages/DataValues/N7oeL/src/array/primitives.jl:12 invalidated:
   backedges: 1: superseding similar(a::AbstractArray, ::Type{T}, dims::Tuple{Vararg{Int64, N}}) where {T, N} in Base at abstractarray.jl:806 with MethodInstance for similar(::UnitRange{Int64}, ::Type, ::Tuple{Int64}) (3 children)
              2: superseding similar(a::AbstractArray, ::Type{T}, dims::Tuple{Vararg{Int64, N}}) where {T, N} in Base at abstractarray.jl:806 with MethodInstance for similar(::UnitRange{Int64}, ::DataType, ::Tuple{Int64}) (3 children)
              3: superseding similar(a::AbstractArray, ::Type{T}, dims::Tuple{Vararg{Int64, N}}) where {T, N} in Base at abstractarray.jl:806 with MethodInstance for similar(::UnitRange{Int64}, ::Type, ::Tuple{Int64}) (6 children)
   1 mt_cache

 inserting convert(::Type{Array}, X::DataValueArray{T, N}) where {T, N} in DataValues at /home/sethaxen/.julia/packages/DataValues/N7oeL/src/array/primitives.jl:291 invalidated:
   backedges: 1: superseding convert(::Type{T}, a::AbstractArray) where T<:Array in Base at array.jl:617 with MethodInstance for convert(::Type, ::AbstractArray) (15 children)
   29 mt_cache

 inserting convert(::Type{Vector}, X::DataValueVector{T}) where T in DataValues at /home/sethaxen/.julia/packages/DataValues/N7oeL/src/array/primitives.jl:283 invalidated:
   mt_backedges: 1: signature Tuple{typeof(convert), Type{Vector}, Any} triggered MethodInstance for Pkg.REPLMode.Command(::Pkg.REPLMode.CommandSpec, ::Dict{Symbol, Any}, ::Any) (1 children)
                 2: signature Tuple{typeof(convert), Type{Vector}, Any} triggered MethodInstance for Pkg.REPLMode.Command(::Nothing, ::Dict{Symbol, Any}, ::Any) (12 children)
   backedges: 1: superseding convert(::Type{T}, a::AbstractArray) where T<:Array in Base at array.jl:617 with MethodInstance for convert(::Type{Vector}, ::AbstractArray) (2 children)

 inserting similar(x::Array, ::Type{DataValue{T}}, dims::Tuple{Vararg{Int64, N}} where N) where T in DataValues at /home/sethaxen/.julia/packages/DataValues/N7oeL/src/array/primitives.jl:16 invalidated:
   mt_backedges: 1: signature Tuple{typeof(similar), Vector, Any, Tuple{Int64}} triggered MethodInstance for similar(::Vector, ::Tuple{Base.OneTo{Int64}}) (2 children)
   backedges: 1: superseding similar(a::Array, T::Type, dims::Tuple{Vararg{Int64, N}}) where N in Base at array.jl:378 with MethodInstance for similar(::Vector{Pair{DataType, Function}}, ::DataType, ::Tuple{Int64}) (1 children)
              2: superseding similar(a::Array, T::Type, dims::Tuple{Vararg{Int64, N}}) where N in Base at array.jl:378 with MethodInstance for similar(::Vector{Any}, ::DataType, ::Tuple{Int64}) (1 children)
              3: superseding similar(a::Array, T::Type, dims::Tuple{Vararg{Int64, N}}) where N in Base at array.jl:378 with MethodInstance for similar(::Vector, ::Type, ::Tuple{Int64}) (1 children)
              4: superseding similar(a::Array, T::Type, dims::Tuple{Vararg{Int64, N}}) where N in Base at array.jl:378 with MethodInstance for similar(::Array, ::DataType, ::Tuple{Int64}) (2 children)
              5: superseding similar(a::Array, T::Type, dims::Tuple{Vararg{Int64, N}}) where N in Base at array.jl:378 with MethodInstance for similar(::Array, ::Type, ::Tuple{Int64}) (2 children)
              6: superseding similar(a::Array, T::Type, dims::Tuple{Vararg{Int64, N}}) where N in Base at array.jl:378 with MethodInstance for similar(::Vector{Pair{DataType, Function}}, ::Type, ::Tuple{Int64}) (6 children)
              7: superseding similar(a::Array, T::Type, dims::Tuple{Vararg{Int64, N}}) where N in Base at array.jl:378 with MethodInstance for similar(::Vector{Any}, ::Type, ::Tuple{Int64}) (11 children)
   30 mt_cache

 inserting !(x::DataValue{T}) where T<:Number in DataValues at /home/sethaxen/.julia/packages/DataValues/N7oeL/src/scalar/core.jl:203 invalidated:
   mt_backedges:  1: signature Tuple{typeof(!), Any} triggered MethodInstance for (::Base.var"#97#98"{typeof(iszero)})(::Any) (0 children)
                  2: signature Tuple{typeof(!), Any} triggered MethodInstance for !=(::Any, ::Int64) (0 children)
                  3: signature Tuple{typeof(!), Any} triggered MethodInstance for !=(::Int64, ::Any) (0 children)
                  4: signature Tuple{typeof(!), Any} triggered MethodInstance for !=(::AbstractFloat, ::AbstractFloat) (0 children)
                  5: signature Tuple{typeof(!), Any} triggered MethodInstance for Pkg.LazilyInitializedFields.lazy_struct(::Expr) (0 children)
                  6: signature Tuple{typeof(!), Any} triggered MethodInstance for !=(::Any, ::Type{Float64}) (0 children)
                  7: signature Tuple{typeof(!), Any} triggered MethodInstance for Base.run_main_repl(::Bool, ::Bool, ::Bool, ::Bool, ::Bool) (0 children)
                  8: signature Tuple{typeof(!), Any} triggered MethodInstance for showerror(::IOContext{Base.TTY}, ::MethodError) (0 children)
                  9: signature Tuple{typeof(!), Any} triggered MethodInstance for !=(::Tuple{Base.OneTo{Int64}}, ::Any) (0 children)
                 10: signature Tuple{typeof(!), Any} triggered MethodInstance for !=(::Union{Nothing, Pkg.Types.UpgradeLevel, VersionNumber, String, Pkg.Versions.VersionSpec}, ::Union{Nothing, Pkg.Types.UpgradeLevel, VersionNumber, String, Pkg.Versions.VersionSpec}) (0 children)
                 11: signature Tuple{typeof(!), Any} triggered MethodInstance for Base.Sort.var"#sort!#8"(::Base.Sort.Algorithm, ::typeof(isless), ::typeof(identity), ::Nothing, ::Base.Order.ForwardOrdering, ::typeof(sort!), ::Vector) (0 children)
                 12: signature Tuple{typeof(!), Any} triggered MethodInstance for Base.Sort.var"#sort!#8"(::Base.Sort.Algorithm, ::typeof(isless), ::Function, ::Nothing, ::Base.Order.ForwardOrdering, ::typeof(sort!), ::Vector) (0 children)
                 13: signature Tuple{typeof(!), Any} triggered MethodInstance for !=(::Any, ::Any) (0 children)
                 14: signature Tuple{typeof(!), Any} triggered MethodInstance for !=(::Unsigned, ::Int64) (0 children)
                 15: signature Tuple{typeof(!), Any} triggered MethodInstance for ==(::Dict{String, Any}, ::Dict{String, Any}) (0 children)
                 16: signature Tuple{typeof(!), Any} triggered MethodInstance for Pkg.REPLMode._completions(::String, ::Bool, ::Int64, ::Int64) (0 children)
                 17: signature Tuple{typeof(!), Any} triggered MethodInstance for allunique(::AbstractRange) (0 children)
                 18: signature Tuple{typeof(!), Any} triggered MethodInstance for !=(::Any, ::Char) (0 children)
                 19: signature Tuple{typeof(!), Any} triggered MethodInstance for (::GlobalRef, ::Any) (0 children)
                 20: signature Tuple{typeof(!), Any} triggered MethodInstance for Test.do_test_throws(::Test.ExecutionResult, ::Any, ::Any) (0 children)
                 21: signature Tuple{typeof(!), Any} triggered MethodInstance for Test.eval_test(::Expr, ::Expr, ::LineNumberNode, ::Bool) (0 children)
                 22: signature Tuple{typeof(!), Any} triggered MethodInstance for !=(::Any, ::Type{Float16}) (0 children)
                 23: signature Tuple{typeof(!), Any} triggered MethodInstance for !=(::Any, ::Type{Float32}) (0 children)
                 24: signature Tuple{typeof(!), Any} triggered MethodInstance for Base.CoreLogging.log_record_id(::Any, ::Any, ::Any, ::Tuple{}) (1 children)
                 25: signature Tuple{typeof(!), Any} triggered MethodInstance for Base.isdelimited(::IOContext{IOBuffer}, ::Pair{Symbol, Any}) (1 children)
                 26: signature Tuple{typeof(!), Any} triggered MethodInstance for Base.isdelimited(::IOContext{IOBuffer}, ::Pair) (1 children)
                 27: signature Tuple{typeof(!), Any} triggered MethodInstance for Base.Docs.moduledoc(::LineNumberNode, ::Module, ::Expr, ::Any, ::Expr) (1 children)
                 28: signature Tuple{typeof(!), Any} triggered MethodInstance for Base.at_disable_library_threading(::LinearAlgebra.var"#249#250") (1 children)
                 29: signature Tuple{typeof(!), Any} triggered MethodInstance for Base.at_disable_library_threading(::Function) (1 children)
                 30: signature Tuple{typeof(!), Any} triggered MethodInstance for REPL.LineEditREPL(::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any) (1 children)
                 31: signature Tuple{typeof(!), Any} triggered MethodInstance for Base.CoreLogging.log_record_id(::Any, ::Any, ::Any, ::Tuple{Any, Vararg{Any}}) (2 children)
                 32: signature Tuple{typeof(!), Any} triggered MethodInstance for Base.Docs.moduledoc(::Any, ::Any, ::Any, ::Any, ::Expr) (2 children)
                 33: signature Tuple{typeof(!), Any} triggered MethodInstance for ==(::Vector{Int64}, ::Array) (2 children)
                 34: signature Tuple{typeof(!), Any} triggered MethodInstance for REPL._trimdocs(::Markdown.MD, ::Bool) (2 children)
                 35: signature Tuple{typeof(!), Any} triggered MethodInstance for Base.CoreLogging.log_record_id(::Any, ::Any, ::Any, ::Any) (3 children)
                 36: signature Tuple{typeof(!), Any} triggered MethodInstance for Base._show_nonempty(::IOContext{Base.TTY}, ::AbstractMatrix, ::String, ::Bool, ::Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}) (3 children)
                 37: signature Tuple{typeof(!), Any} triggered MethodInstance for (::Pkg.REPLMode.var"#command_is_focused#53"{Bool, Int64})() (3 children)
                 38: signature Tuple{typeof(!), Any} triggered MethodInstance for (::Base.var"#38#40")(::Core.MethodMatch) (9 children)
                 39: signature Tuple{typeof(!), Any} triggered MethodInstance for (::Base.var"#isword#489")(::Char) (12 children)
                 40: signature Tuple{typeof(!), Any} triggered MethodInstance for Base.Docs.moduledoc(::LineNumberNode, ::Module, ::Any, ::Any, ::Expr) (15 children)

 inserting ==(a::T1, b::DataValue{T2}) where {T1, T2} in DataValues at /home/sethaxen/.julia/packages/DataValues/N7oeL/src/scalar/core.jl:224 invalidated:
   backedges: 1: superseding ==(x, y) in Base at Base.jl:116 with MethodInstance for ==(::Base.UUID, ::Any) (8 children)
              2: superseding ==(x, y) in Base at Base.jl:116 with MethodInstance for ==(::Module, ::Any) (12 children)
              3: superseding ==(x, y) in Base at Base.jl:116 with MethodInstance for ==(::Core.TypeName, ::Any) (14 children)
              4: superseding ==(x, y) in Base at Base.jl:116 with MethodInstance for ==(::Method, ::Any) (14 children)
              5: superseding ==(x, y) in Base at Base.jl:116 with MethodInstance for ==(::Symbol, ::Any) (118 children)

 inserting convert(::Type{Any}, ::DataValue{Union{}}) in DataValues at /home/sethaxen/.julia/packages/DataValues/N7oeL/src/scalar/core.jl:42 invalidated:
   backedges: 1: superseding convert(::Type{Any}, x) in Base at Base.jl:60 with MethodInstance for convert(::Type{Any}, ::Any) (1934 children)
   1 mt_cache

 inserting ==(a::DataValue{T1}, b::T2) where {T1, T2} in DataValues at /home/sethaxen/.julia/packages/DataValues/N7oeL/src/scalar/core.jl:223 invalidated:
   backedges: 1: superseding ==(x, y) in Base at Base.jl:116 with MethodInstance for ==(::Any, ::FileWatching._FDWatcher) (5 children)
              2: superseding ==(x, y) in Base at Base.jl:116 with MethodInstance for ==(::Any, ::Task) (6616 children)
   10 mt_cache

not a nice output from sum

I do something like:

res=sum(df[j,:IncidenceReported]*log(λReported[j])-λReported[j]-lfact.(df[j,:IncidenceReported]) for j in 1:nrecords if ((isa.(df[j,:IncidenceReported],Number)==true) && (λReported[j]>0.0)))
return(res)

where df is a dataframe containing "missing"-entries + float numbers. I then receive the result as DataValue{Float64}(-1468.31). I think it should be simply -1468.31

Add dot lifting support

Essentially we want the same broadcasting/. syntax that lifts Nullables in base to also work with DataValue.

@TotalVerb Is that generally doable? I naively assumed that I just have to copy all the Nullable-specific code from broadcast.jl in Base into this package, and then replace Nullable with DataValue. Are there any further gotchas?

One line that made me nervous is this, because that is probably something we can't extend from a package? Is that something we need to worry about?

Sorry for these pretty basic questions :)

Sort out a copyright issue

I copied code from Julia Base, in particular the test code, and then just replaced Nullable with NAable etc. That clearly is a derived work in a copyright sense, but I'm not entirely sure what the proper way to handle this is. Maybe just add all the people that show up in a git blame of the original Julia file as copyright holders?

Cannot `convert` an object of type DataValues.DataValue{Any} to an object of type DateTime

I'm not sure if this is a core DataValues issue or one with the implementation in ExcelFiles or DataFrames (where I posted the same issue):
queryverse/ExcelFiles.jl#13
JuliaData/DataFrames.jl#1478

(I'll close the inappropriate issues once I figure out where the issue is)

I have a DataFrame df whose first column is an array of DataValues.DataValue{Any}.
How can I convert it to DateTime, and allow for missing?

This doesn’t work:

datetimes = convert(Vector{Union{DateTime,Missing}}, df[1])
MethodError: Cannot `convert` an object of type DataValues.DataValue{Any} to an object of type DateTime

Two failing tests

Hi, two tests are failing for me:

Expression: isequal(mapreduce(f, +, X), DataValue(mapreduce(f, +, X.values)))
Evaluated: isequal(DataValue{Float64}(5130.89), DataValue{Float64}(5130.89))

Expression: method(f, X) == DataValue(method(f, A))
Evaluated: DataValue{Float64}(5130.89) == DataValue{Float64}(5130.89)

Maybe a float rounding error? No idea
Edit: Indeed it is:

DataValue(mapreduce(f, +, X.values)).value
5130.890169906765

mapreduce(f, +, X).value
5130.890169906763
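
A difference in the last digits like this is what one would expect when the same numbers are added in a different order. The sketch below uses only Base Julia and made-up input values to show the effect, and then compares the two values reported above with an approximate comparison. Comparing the unpacked values with isapprox is a suggestion here, not something the test suite is known to do.

```julia
xs = [0.1, 0.2, 0.3]   # made-up inputs, chosen because the effect is easy to see

# Floating-point addition is not associative, so grouping changes the result.
left_to_right = (xs[1] + xs[2]) + xs[3]
right_to_left = xs[1] + (xs[2] + xs[3])

same_value = left_to_right == right_to_left              # exact comparison fails
close_enough = isapprox(left_to_right, right_to_left)    # tolerant comparison passes

# The two values from the failing tests differ by roughly two units in the last place.
reported_equal = 5130.890169906765 == 5130.890169906763
reported_close = isapprox(5130.890169906765, 5130.890169906763)
```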

Add support for similar / undef initializer

similar(DataValueArray{T}, n) falls back to DataValueArray{T}(undef, n) which errors. Would it be OK to simply define it as the DataValueArray of size n where all the data is missing?

Equality of empty DataValues

Currently,

(DataValue{T}() == DataValue{T}()) == true

but, e.g.

(DataValue{T}() <= DataValue{T}()) == false

This is a bit weird, mathematically. Maybe empty DataValues could behave like NaN, instead (all comparisons yielding false)?
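
For reference, this is how NaN behaves in Base Julia, which is the model the suggestion points to. The variable names are only for illustration. Note that != is the one comparison that returns true, since it is defined as the negation of ==, and that isequal deliberately treats NaN as equal to itself.

```julia
# Ordinary comparisons involving NaN are false, including equality with itself.
nan_eq = NaN == NaN
nan_le = NaN <= NaN
nan_lt = NaN < NaN

# != is the negation of ==, so it is the exception.
nan_ne = NaN != NaN

# isequal is the comparison meant for hashing and sorting; it treats NaN as equal to NaN.
nan_isequal = isequal(NaN, NaN)
```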

promotion causes segfault/stack overflow

julia> using DataValues

julia> promote(nothing,DataValue{Any}())
Segmentation fault: 11

(see other examples on JuliaLang/julia#29639)

The segfault is obviously a Julia problem, but the discussion there suggests that the underlying cause is a misspecified promotion rule that results in a stack overflow.

Track a Union{T,Null} based implementation

Once Union{T,Null} is fast in base, we might change the internal representation to

struct DataValue{T}
   value::Union{T,Null}
end

This should be entirely transparent to users, but might be faster.
