SymbolicML / DynamicExpressions.jl

Ridiculously fast symbolic expressions

Home Page: https://symbolicml.org/DynamicExpressions.jl/dev

License: Apache License 2.0

Language: Julia (100.00%)

Topics: binary-trees, expression-evaluator, symbolic-computation, symbolic-manipulation, symbolic-regression

DynamicExpressions.jl's People

Contributors

alcap23, charfox1, chrisrackauckas, cobac, github-actions[bot], johanbluecreek, johannbrehmer, kazewong, milescranmer, pitmonticone, rikhuijzer, sheevy, timholy


DynamicExpressions.jl's Issues

Do/can we have an inverse `string_tree` function?

i.e., is there any way to parse a string output by string_tree back into a valid DynamicExpressions expression?

Or is there any other convenient way to save a large number of DynamicExpressions trees so they can be loaded back for later use?
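For the saving/loading part, one possible stopgap (a sketch, assuming the Serialization stdlib handles Node trees cleanly, which I have not verified): since a Node stores operator indices rather than the operator functions themselves, trees should round-trip as long as the same OperatorEnum is reconstructed on load.

using Serialization, DynamicExpressions

operators = OperatorEnum(; binary_operators=[+, *], unary_operators=[cos])
x1 = Node(; feature=1)
trees = [x1 * cos(x1), x1 + x1]

serialize("trees.bin", trees)      # write the trees to disk
loaded = deserialize("trees.bin")  # read them back in a later session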

Clean up convenience functions

Is there a way I can make the convenience functions more robust? Right now they are created with:

Base.MainInclude.eval(...)

from inside DynamicExpressions.OperatorEnum(...). Is there a way I can get a basic @eval to work here (and still extend the user-defined operators)? Perhaps I need to simply export the functions (or non-implemented versions of the functions)?
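For context, a toy illustration of the difference (hypothetical module, not the package's actual code): Base.MainInclude.eval defines into the user's Main module, while a plain @eval inside the package defines into the package module, where user code won't see it unless it is exported.

module ToyPkg
# Defines `name` in the user's Main module (what OperatorEnum effectively does now):
define_in_main(name) = Base.MainInclude.eval(:($name(x) = x + 1))
# Defines `name` inside ToyPkg, invisible from Main unless exported:
define_in_pkg(name) = @eval $name(x) = x + 1
end

ToyPkg.define_in_main(:foo)
foo(1)  # works: `foo` now lives in Main
ToyPkg.define_in_pkg(:bar)  # `bar` lives in ToyPkg, not Main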

@odow any ideas/tips?

[Feature] Interface with ChainRulesCore.jl

Apparently the proper way to build in differentiability is to define a rule for ChainRulesCore.jl: https://github.com/JuliaDiff/ChainRulesCore.jl. Specifically, we would define an frule (forward) and an rrule (reverse). Then, evaluations inside DynamicExpressions.jl would be able to link into a "chain" in a larger AD pipeline, which might make a lot of other things easier.

Maybe if we do this it would be relatively easy to get even higher order gradients in DynamicExpressions.jl?
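As a minimal sketch of what the reverse rule could look like, assuming we only differentiate with respect to X and reuse the package's existing eval_grad_tree_array for the actual derivatives (the _eval wrapper and the pullback shapes below are illustrative assumptions, not a final API):

using ChainRulesCore, DynamicExpressions

# Wrapper returning just the output array, so the primal is a plain vector:
_eval(tree, X, operators) = first(eval_tree_array(tree, X, operators))

function ChainRulesCore.rrule(::typeof(_eval), tree::Node, X::AbstractMatrix, operators::OperatorEnum)
    y = _eval(tree, X, operators)
    # Gradient of the output w.r.t. each feature row of X, via forward mode:
    _, dy_dX, _ = eval_grad_tree_array(tree, X, operators; variable=true)
    function _eval_pullback(ȳ)
        ΔX = dy_dX .* reshape(unthunk(ȳ), 1, :)  # shape (n_features, n_rows)
        return NoTangent(), NoTangent(), ΔX, NoTangent()
    end
    return y, _eval_pullback
end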

@kazewong check this out

[Feature] Preserve sharing during simplification

Right now the simplification routines will break any shared subexpressions. This means if you have an expression:

inner = cos(x1) - 3.2
expression = exp(inner) + inner

where the same inner node is used in both places. If you go to simplify it, the simplification will break this connection, and you will end up with:

expression = exp(cos(x1) - 3.2) + cos(x1) - 3.2

However, there is a way to get around this.

Whenever there is a shared node, it should be split into a system of equations. Then, each individual equation can be treated normally with simplification. At the end, the system of equations can be sewn together with the same structure as before, preserving shared variables.
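As a sketch of the first step, shared nodes can be detected by object identity (===) during traversal; splitting them into separate equations and re-linking after simplification is omitted here:

# Collect nodes reached more than once in a traversal (i.e. shared subexpressions).
function shared_nodes(tree::Node)
    seen = IdDict{Node,Bool}()
    shared = Node[]
    stack = Node[tree]
    while !isempty(stack)
        node = pop!(stack)
        if haskey(seen, node)
            push!(shared, node)  # reached a second time => shared
            continue
        end
        seen[node] = true
        node.degree >= 1 && push!(stack, node.l)
        node.degree == 2 && push!(stack, node.r)
    end
    return shared
end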


@AlCap23 in case of interest

Suppress Warning from ForwardDiff Duals

It turns out ForwardDiff.jl already works with DynamicExpressions.jl, but we would want to turn off the type-mismatch warnings.

using ForwardDiff, DynamicExpressions

operators = OperatorEnum(; binary_operators=[+, -, *], unary_operators=[cos]);
x1 = Node(; feature=1)
x2 = Node(; feature=2)
expr = x1 * cos(x2 - 3.2)

X = rand(2, 5)

ForwardDiff.gradient(X) do X
    return sum(abs2, first(eval_tree_array(expr, X, operators)))
end
┌ Warning: Warning: eval_tree_array received mixed types: tree=Float32 and data=ForwardDiff.Dual{ForwardDiff.Tag{var"#13#14", Float64}, Float64, 10}.
└ @ DynamicExpressions.EvaluateModule /mnt/research/lux/DynamicExpressions.jl/src/Evaluate.jl:95
2×5 Matrix{Float64}:
  0.182636   0.882172   0.906966    0.161635    1.86929
 -0.020271  -0.363872  -0.0764073  -0.0126587  -0.331082
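Until the warning is gated behind a flag, one generic way to silence it is to raise the minimum log level around the call using the Logging stdlib (a workaround, not a package feature):

using Logging

grad = with_logger(SimpleLogger(stderr, Logging.Error)) do
    ForwardDiff.gradient(X) do X
        sum(abs2, first(eval_tree_array(expr, X, operators)))
    end
end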

Possible extensions

Hey @MilesCranmer !

I've been thinking about possible extensions and wonder how you think about this:

  1. Parametric Operations
    In short, I want to be able to supply functions which hold tuneable constants, e.g. f(x, y, p) = x / (y + p) where p is a constant with a tuneable value. This could maybe be done with predefined patterns of nodes, but it might also be doable somewhat differently? (See the sketch below this list.)

  2. N-ary Operations
    This one seems a little harder, but might be doable as well (major refactoring). My reasoning for allowing this is basically steering more into the field of program synthesis and allowing chunks of expressions to be parsed. Possible applications might be inference of systems of equations based on "building blocks" which contain typical patterns. I know that in theory this is also possible using just binary ops, but the chance of discovery might increase given the structural prior.

  3. Arbitrary Expression Graphs
    This plays along with 2., and reuse of patterns is key here. In many (natural) systems, recurrence of features is common (in classical mechanics this would be the sin(q) and cos(q) describing the translational movement of an angular joint). Within a classical binary tree, each expression needs to be sampled individually, while in a DAG (possibly "just" a topologically sorted DAG), the influence of a given node could extend beyond its direct parent.

I've made some prototypes using Symbolics / MTK for this, but it is rather slow (given that I need to build the function for each candidate).
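For idea 1, here is a rough sketch of what is already expressible today by baking the tuneable constant into a closure; re-tuning p means rebuilding the OperatorEnum, so this is a workaround rather than true parametric nodes:

using DynamicExpressions

# Hypothetical workaround: `p` is captured by the closure, so "tuning" it
# requires constructing a fresh OperatorEnum with the new value.
make_operators(p) = OperatorEnum(;
    binary_operators=[+, -, *, (x, y) -> x / (y + p)],
    unary_operators=[cos],
)

operators = make_operators(0.5)
operators = make_operators(0.7)  # re-tune by rebuilding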

Something of a parallel part:
I've been working on some alternatives to GA/GP for solving symbolic regression, partly motivated by the latest trend of using RNNs to sample the graph, and also related to MINLP. If you're interested we could have a chat about this :).

Cheers!

Edit: Uh, I just noticed that this might all be doable quite easily, given that you generate dispatches using the OperatorEnum! Nice.

[Idea] GPU-native implementation

I wonder if it's possible to have a GPU-native implementation, with the tree stored as one-hot vectors whose dimension is given by the total number of operators + value + feature + degree. You would evaluate all operators at each node in the tree, and mask the outputs not used.

However, I'm not sure this would actually work, because with deeply-nested trees you would have to evaluate O(2^n) nodes for a depth of O(n), whereas dynamic expressions would need just n evaluations.

Remove NaN checks by default

Right now, NaN and Inf checks are performed throughout the evaluation code. This is tied to SymbolicRegression.jl, but isn't needed in other cases, and slows them down a bit. Thus, I think eval_tree_array and related functions should have a flag for whether to perform NaN and Inf checks.
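A sketch of what the opt-out could look like at the call site (the keyword name check_finite is hypothetical):

y, complete = eval_tree_array(tree, X, operators; check_finite=true)   # current behavior
y, _ = eval_tree_array(tree, X, operators; check_finite=false)         # skip NaN/Inf checks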

CUDA.jl & Metal.jl Support

It seems doable enough to add support for CUDA.jl and Metal.jl.
To do this, it might be enough to create an eval_tree_array method for the corresponding array types.
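As a strawman, the dispatch hook could look like the naive version below, which just round-trips through the CPU (purely illustrative; a real implementation would keep the per-node work on the device):

using CUDA, DynamicExpressions

# Naive first cut: copy to host, evaluate, copy back. This only shows where
# a CuArray method would slot in; it is not a real GPU evaluator.
function DynamicExpressions.eval_tree_array(
    tree::Node, X::CuMatrix, operators::OperatorEnum
)
    y, complete = eval_tree_array(tree, Array(X), operators)
    return CuArray(y), complete
end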

`tree_map`?

I realize a lot of functions could just be implemented as calls to a generic tree_map function. For example,

tree_map(t -> 1, tree; merge=max)

would calculate the depth of a tree. The "merge" function would be used to aggregate the left/right children of binary nodes. For example,

tree_map(t -> 1, tree; merge=(+))

would count the total number of nodes. Meanwhile,

tree_map(tree; merge=(+)) do t
    Int(t.degree==2)
end

would count the number of binary operators. Then something like

tree_map(tree; merge=(l, r)->[l, r]) do t
    if t.degree != 0 || !t.constant
        return []
    end
    return [t.val]
end

would return all constants in a tree (in depth-first traversal order).
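For reference, the package did later gain a closely related tree_mapreduce(f, op, tree), where op combines the parent's mapped value with the children's results; the examples above translate roughly as follows (from memory, so treat the exact signatures as assumptions):

using DynamicExpressions

depth    = tree_mapreduce(_ -> 1, (p, c...) -> p + max(c...), tree)
n_nodes  = tree_mapreduce(_ -> 1, (p, c...) -> p + sum(c), tree)
n_binary = tree_mapreduce(t -> Int(t.degree == 2), (p, c...) -> p + sum(c), tree)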


@Moelf would this have been helpful for writing that NYT puzzle solver? What do you think of the API?

@AlCap23 any comment?

Consider a different way to error on extension not loaded

Currently the functions are written as:

function foo(args...; kwargs...)
    error("ExtX.jl not loaded")
end

# In ExtX.jl
function foo(a::Int, b::Char)
    # do the correct thing
end

Now this works, but let's say I did using X, DynamicExpressions; foo(1, 2). I will get the error "ExtX.jl not loaded", which is somewhat confusing because X.jl is already loaded; the real problem is that no method matches foo(1, 2), but the catch-all fallback swallows it. For quite a few of my packages the way I handle this is:

@inline _is_extension_loaded(::Val) = false

function _foo_internal end

function foo(args...; kwargs...)
    _is_extension_loaded(Val(:X)) && return _foo_internal(args...; kwargs...)
    error("....")
end

# In ExtX.jl
@inline _is_extension_loaded(::Val{:X}) = true
_foo_internal(a::Int, b::Char) = nothing  # do the correct thing here
This does cause a minor increase in invalidations, but Julia compiles away the branch, so there is no runtime cost.

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!
