Coder Social home page Coder Social logo

alex-r-bigelow / intervaltree Goto Github PK

View Code? Open in Web Editor NEW

This project forked from chaimleib/intervaltree

0.0 1.0 0.0 2.98 MB

A mutable, self-balancing interval tree. Queries may be by point, by range overlap, or by range containment.

License: Apache License 2.0

Makefile 0.06% Python 99.94% Shell 0.01%

intervaltree's Introduction

Build status badge

intervaltree

A mutable, self-balancing interval tree for Python 3. Queries may be by point, by range overlap, or by range envelopment.

This library was designed to allow tagging text and time intervals, where the intervals include the lower bound but not the upper bound.

Fork details

Note: this fork adds a little bit of O(n log n) bloat for specific use cases, namely:

  • efficient histogram computation
  • iterative results for overlap queries (instead of collecting all the results in memory before returning)

If this isn't what you're looking for, I suggest you stick with the main repository.

Installing

pip install git+https://github.com/alex-r-bigelow/intervaltree.git

Features

  • Supports Python 3.4+ (Tested under 3.6.7)

  • Initializing

    • blank tree = IntervalTree()
    • from an iterable of Interval objects (tree = IntervalTree(intervals))
    • from an iterable of tuples (tree = IntervalTree.from_tuples(interval_tuples))
  • Insertions

    • tree[begin:end] = data
    • tree.add(interval)
    • tree.addi(begin, end, data)
  • Deletions

    • tree.remove(interval) (raises ValueError if not present)
    • tree.discard(interval) (quiet if not present)
    • tree.removei(begin, end, data) (short for tree.remove(Interval(begin, end, data)))
    • tree.discardi(begin, end, data) (short for tree.discard(Interval(begin, end, data)))
    • tree.remove_overlap(point)
    • tree.remove_overlap(begin, end) (removes all overlapping the range)
    • tree.remove_envelop(begin, end) (removes all enveloped in the range)
  • Point queries

    • tree[point]
    • tree.at(point) (same as previous)
  • Overlap queries

    • tree[begin:end]
    • tree.overlap(begin, end) (same as previous)
    • tree.iterOverlap(begin, end) (same as previous, except returns an iterator that yields Intervals immediately)
  • Envelop queries

    • tree.envelop(begin, end)
  • Membership queries

    • interval_obj in tree (this is fastest, O(1))
    • tree.containsi(begin, end, data)
    • tree.overlaps(point)
    • tree.overlaps(begin, end)
  • Iterable

    • for interval_obj in tree:
    • tree.items()
  • Sizing

    • len(tree)
    • tree.is_empty()
    • not tree
    • tree.begin() (the begin coordinate of the leftmost interval)
    • tree.end() (the end coordinate of the rightmost interval)
  • Set-like operations

    • union

      • result_tree = tree.union(iterable)
      • result_tree = tree1 | tree2
      • tree.update(iterable)
      • tree |= other_tree
    • difference

      • result_tree = tree.difference(iterable)
      • result_tree = tree1 - tree2
      • tree.difference_update(iterable)
      • tree -= other_tree
    • intersection

      • result_tree = tree.intersection(iterable)
      • result_tree = tree1 & tree2
      • tree.intersection_update(iterable)
      • tree &= other_tree
    • symmetric difference

      • result_tree = tree.symmetric_difference(iterable)
      • result_tree = tree1 ^ tree2
      • tree.symmetric_difference_update(iterable)
      • tree ^= other_tree
    • comparison

      • tree1.issubset(tree2) or tree1 <= tree2
      • tree1 <= tree2
      • tree1.issuperset(tree2) or tree1 > tree2
      • tree1 >= tree2
      • tree1 == tree2
  • Restructuring

    • chop(begin, end) (slice intervals and remove everything between begin and end, optionally modifying the data fields of the chopped-up intervals)
    • slice(point) (slice intervals at point)
    • split_overlaps() (slice at all interval boundaries, optionally modifying the data field)
    • merge_overlaps() (joins overlapping intervals into a single interval, optionally merging the data fields)
    • merge_equals() (joins intervals with matching ranges into a single interval, optionally merging the data fields)
  • Copying and typecasting

    • IntervalTree(tree) (Interval objects are same as those in tree)
    • tree.copy() (Interval objects are shallow copies of those in tree)
    • set(tree) (can later be fed into IntervalTree())
    • list(tree) (ditto)
  • Pickle-friendly

  • Automatic AVL balancing

  • Histogram computation

    • tree.computeCountHistogram(bins) (returns a list of evenly-spaced Interval objects of length bins, with counts for the number of Intervals that intersect each bin)
    • tree.computeUtilizationHistogram(bins) (returns a list of evenly-spaced Interval objects of length bins, with the percentage of each bin that contains an interval. Note that this will be higher than 1.0 if multiple intervals within the bin overlap each other; i.e. two intervals covering the entire bin would result in 2.0, or a 200% utilization rate)

Examples

  • Getting started

    >>> from intervaltree import Interval, IntervalTree
    >>> t = IntervalTree()
    >>> t
    IntervalTree()
  • Adding intervals - any object works!

    >>> t[1:2] = "1-2"
    >>> t[4:7] = (4, 7)
    >>> t[5:9] = {5: 9}
  • Query by point

    The result of a query is a set object, so if ordering is important, you must sort it first.

    >>> sorted(t[6])
    [Interval(4, 7, (4, 7)), Interval(5, 9, {5: 9})]
    >>> sorted(t[6])[0]
    Interval(4, 7, (4, 7))
  • Query by range

    Note that ranges are inclusive of the lower limit, but non-inclusive of the upper limit. So:

    >>> sorted(t[2:4])
    []

    Since our search was over 2 ≤ x < 4, neither Interval(1, 2) nor Interval(4, 7) was included. The first interval, 1 ≤ x < 2 does not include x = 2. The second interval, 4 ≤ x < 7, does include x = 4, but our search interval excludes it. So, there were no overlapping intervals. However:

    >>> sorted(t[1:5])
    [Interval(1, 2, '1-2'), Interval(4, 7, (4, 7))]

    To only return intervals that are completely enveloped by the search range:

    >>> sorted(t.envelop(1, 5))
    [Interval(1, 2, '1-2')]
  • Accessing an Interval object

    >>> iv = Interval(4, 7, (4, 7))
    >>> iv.begin
    4
    >>> iv.end
    7
    >>> iv.data
    (4, 7)
    
    >>> begin, end, data = iv
    >>> begin
    4
    >>> end
    7
    >>> data
    (4, 7)
  • Constructing from lists of intervals

    We could have made a similar tree this way:

    >>> ivs = [(1, 2), (4, 7), (5, 9)]
    >>> t = IntervalTree(
    ...    Interval(begin, end, "%d-%d" % (begin, end)) for begin, end in ivs
    ... )

    Or, if we don't need the data fields:

    >>> t2 = IntervalTree(Interval(*iv) for iv in ivs)

    Or even:

    >>> t2 = IntervalTree.from_tuples(ivs)
  • Removing intervals

    >>> t.remove(Interval(1, 2, "1-2"))
    >>> sorted(t)
    [Interval(4, 7, '4-7'), Interval(5, 9, '5-9')]
    
    >>> t.remove(Interval(500, 1000, "Doesn't exist"))  # raises ValueError
    Traceback (most recent call last):
    ValueError
    
    >>> t.discard(Interval(500, 1000, "Doesn't exist"))  # quietly does nothing
    
    >>> del t[5]  # same as t.remove_overlap(5)
    >>> t
    IntervalTree()

    We could also empty a tree entirely:

    >>> t2.clear()
    >>> t2
    IntervalTree()

    Or remove intervals that overlap a range:

    >>> t = IntervalTree([
    ...     Interval(0, 10),
    ...     Interval(10, 20),
    ...     Interval(20, 30),
    ...     Interval(30, 40)])
    >>> t.remove_overlap(25, 35)
    >>> sorted(t)
    [Interval(0, 10), Interval(10, 20)]

    We can also remove only those intervals completely enveloped in a range:

    >>> t.remove_envelop(5, 20)
    >>> sorted(t)
    [Interval(0, 10)]
  • Chopping

    We could also chop out parts of the tree:

    >>> t = IntervalTree([Interval(0, 10)])
    >>> t.chop(3, 7)
    >>> sorted(t)
    [Interval(0, 3), Interval(7, 10)]

    To modify the new intervals' data fields based on which side of the interval is being chopped:

    >>> def datafunc(iv, islower):
    ...     oldlimit = iv[islower]
    ...     return "oldlimit: {0}, islower: {1}".format(oldlimit, islower)
    >>> t = IntervalTree([Interval(0, 10)])
    >>> t.chop(3, 7, datafunc)
    >>> sorted(t)[0]
    Interval(0, 3, 'oldlimit: 10, islower: True')
    >>> sorted(t)[1]
    Interval(7, 10, 'oldlimit: 0, islower: False')
  • Slicing

    You can also slice intervals in the tree without removing them:

    >>> t = IntervalTree([Interval(0, 10), Interval(5, 15)])
    >>> t.slice(3)
    >>> sorted(t)
    [Interval(0, 3), Interval(3, 10), Interval(5, 15)]

    You can also set the data fields, for example, re-using datafunc() from above:

    >>> t = IntervalTree([Interval(5, 15)])
    >>> t.slice(10, datafunc)
    >>> sorted(t)[0]
    Interval(5, 10, 'oldlimit: 15, islower: True')
    >>> sorted(t)[1]
    Interval(10, 15, 'oldlimit: 5, islower: False')

Future improvements

See the issue tracker on GitHub.

Based on

Copyright

Licensed under the Apache License, version 2.0.

The source code for this project is at https://github.com/chaimleib/intervaltree

intervaltree's People

Contributors

612d7368 avatar alex-r-bigelow avatar chaimleib avatar escalonn avatar konstantint avatar mhsaul avatar milonimrod avatar progval avatar sinig avatar tuxzz avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.