treemodel's People

Contributors

ryshoooo

treemodel's Issues

Remove string length from StringDataType

NumPy strings of any length can be represented as '<U', i.e. the longest_string parameter is not necessary.

NB: This will be very helpful in retyping and uniformization of trees.
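A small sketch of the NumPy behaviour this relies on, not code from treemodel: when an array is created with the unsized dtype '<U', NumPy infers the width from the data. The variable names are illustrative only. Note the caveat at the end: the resulting array still has a fixed width, so later assignments of longer strings are truncated.

```python
import numpy as np

# An unsized '<U' dtype carries no fixed length of its own.
unsized = np.dtype("<U")

# NumPy infers the width from the longest string when the array is built.
short = np.array(["a", "bc"], dtype="<U")
longer = np.array(["a", "hello world"], dtype="<U")

print(unsized.itemsize)
print(short.dtype, longer.dtype)

# Caveat: the built array is fixed-width, so a longer value is truncated.
short[0] = "truncated"
print(short[0])
```

If values are assigned after construction, the retyping logic would still need to widen the array first, for example with astype.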

Apply new schema to a built row

Find a consistent way to apply a new schema to an already built tree row, safely converting the old values to the types of the new schema.

Implement comparison methods for Trees

Given trees T1 and T2:

  • T1 <= T2 iff T1 is a subtree of T2
  • T1 < T2 iff T1 is a subtree of T2 and T1 != T2
  • and vice versa for >= and > (T1 >= T2 iff T2 is a subtree of T1)

This might heavily influence how the __mul__ method is defined; see issue #23.
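The ordering above could be sketched as follows. The Tree class here is a hypothetical stand-in, not the library's class, and it assumes that "subtree" means "same root name, and every child of T1 has a same-named child in T2 of which it is a subtree". The comparison methods are written out explicitly because this is only a partial order: two trees can be incomparable, so deriving the operators with functools.total_ordering would give wrong answers.

```python
class Tree:
    """Hypothetical stand-in for a tree: a node name plus named children."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = {child.name: child for child in children}

    def __eq__(self, other):
        if not isinstance(other, Tree):
            return NotImplemented
        return self.name == other.name and self.children == other.children

    def __le__(self, other):
        # Assumed meaning of "subtree": same root, and each child of self
        # is a subtree of the same-named child of other.
        if not isinstance(other, Tree):
            return NotImplemented
        if self.name != other.name:
            return False
        return all(
            name in other.children and child <= other.children[name]
            for name, child in self.children.items()
        )

    def __lt__(self, other):
        if not isinstance(other, Tree):
            return NotImplemented
        return self <= other and self != other

    def __ge__(self, other):
        if not isinstance(other, Tree):
            return NotImplemented
        return other <= self

    def __gt__(self, other):
        if not isinstance(other, Tree):
            return NotImplemented
        return other < self


small = Tree("root", [Tree("a")])
big = Tree("root", [Tree("a"), Tree("b", [Tree("c")])])
unrelated = Tree("root", [Tree("z")])

print(small <= big, small < big, big > small)
# Incomparable trees: neither is a subtree of the other.
print(small <= unrelated, unrelated <= small)
```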

Implement level attribute for ChildNode

Child nodes currently have no level attribute; it should be passed on to them when they are gathered via forks.

This task adds the level attribute to child nodes as well. It is essential for the union method.

Implement row-wise UDF functionality

The idea is that the user can write a user-defined Python function, which is then executed on each TreeRow and produces a new value of the specified DataType.

Example:

def multiply(a, b):
    return a * b
mul_udf = TreeUDF(multiply, FloatDataType())
tree_dataset['base/new_feature'] = mul_udf('base/old_feature_1', 'base/old_feature_2')
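One way the proposed interface might work internally is sketched below. Everything here is an assumption made for illustration: FloatDataType is reduced to a hypothetical cast method, rows are plain dicts keyed by path, and calling the UDF with paths returns a per-row function rather than assigning into a dataset.

```python
class FloatDataType:
    """Hypothetical data type that coerces values to float."""

    def cast(self, value):
        return float(value)


class TreeUDF:
    """Hypothetical wrapper pairing a Python function with a return type."""

    def __init__(self, func, return_type):
        self.func = func
        self.return_type = return_type

    def __call__(self, *paths):
        # Bind the UDF to input paths; the result is applied row by row.
        def apply_to_row(row):
            args = [row[path] for path in paths]
            return self.return_type.cast(self.func(*args))

        return apply_to_row


def multiply(a, b):
    return a * b


mul_udf = TreeUDF(multiply, FloatDataType())
compute = mul_udf("base/old_feature_1", "base/old_feature_2")

# Rows modelled as flat dicts keyed by path (an assumption of this sketch).
rows = [
    {"base/old_feature_1": 2, "base/old_feature_2": 3},
    {"base/old_feature_1": 1.5, "base/old_feature_2": 4},
]
for row in rows:
    row["base/new_feature"] = compute(row)

print([row["base/new_feature"] for row in rows])
```

Casting through the declared data type keeps the new column uniformly typed even when the function returns an int for some rows.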

What should the intersection of 2 tree schemas be?

The current implementation of the intersection (__mul__ in the tree schema and the fork node) matches from top to bottom, i.e. if the top doesn't match, nothing in the branches matches.

Should this be the case or not?
What if there is an equivalent branch in other that equals the tree in self, or vice versa? Should that be included? How do Arrays and Lists of trees fit into this concept?
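To make the question concrete, here is a minimal sketch of top-down matching. It is not the library's implementation: schemas are modelled as nested dicts whose leaves are type names, and the function name intersect is hypothetical. A branch is kept only if the same path exists on both sides, so an identical branch nested elsewhere in other is not found.

```python
def intersect(left, right):
    """Top-down intersection of two schemas given as nested dicts.

    Inner nodes are dicts; leaves are type names (strings).
    """
    result = {}
    for name, left_child in left.items():
        if name not in right:
            # The top does not match, so nothing below it is considered.
            continue
        right_child = right[name]
        if isinstance(left_child, dict) and isinstance(right_child, dict):
            common = intersect(left_child, right_child)
            if common:
                result[name] = common
        elif left_child == right_child:
            result[name] = left_child
    return result


a = {"base": {"x": "float", "y": "string"}, "meta": {"id": "int"}}
b = {"base": {"x": "float", "z": "date"}, "other": {"meta": {"id": "int"}}}

# "meta" exists in b only under "other", so top-down matching drops it.
print(intersect(a, b))
```

Including such relocated branches would require searching other at every depth, which changes both the cost and the meaning of the operation.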

Is schema necessary for TreeDataType?

Initializing a TreeDataType currently requires a TreeSchema as input, yet TreeDataType does not use any of the schema's functionality.

Furthermore, schemas should be essential only in datasets, not in data types, since data types are the building blocks of schemas.
