Coder Social home page Coder Social logo

js-sorted-set's Introduction

Sorted Set

A sorted set is a data structure with these guarantees:

  • It's a set: it can only contain any given item once.
  • It's sorted: you can iterate over all its items in order.

As an illustration, let's build a simple sorted set out of an Array:

Operation Syntax (simple JavaScript Array)
Create const set = []
Insert set.push(value)
Remove set.splice(set.indexOf(value), 1)
Iterate set.sort(); set.forEach(doSomething)
Find set.sort(); const index = set.indexOf(value)
Previous const previousIndex = index - 1
Next const nextIndex = index + 1
Test const isInSet = set.indexOf(value) != -1

... this works, but it's a bit cryptic and some operations--notably iterate-- will be very slow with large sets.

Usage

You can npm install js-sorted-set. Alternatively, just download sorted-set.js from this directory.

To use it on a Website built with Webpack or Rollup:

import SortedSet from 'js-sorted-set'

To use it in Node:

const SortedSet = require('js-sorted-set')

Or, to pollute your global scope, insert this in your HTML:

<script src="path/to/sorted-set.js"></script>

Now that you have the SortedSet class, here's how to use it:

const set = new SortedSet({ comparator: function(a, b) { return b - a }})
set.insert(5)
set.insert(3)
set.insert(2)
set.remove(3)
const yes = set.contains(2)
console.log(set.map(function(x) { return x * 2 })) // returns [ 10, 4 ]

Operations

The SortedSet API:

Operation Syntax (js-sorted-set) Notes
Create const set = new SortedSet()
Insert set.insert(value)
Remove set.remove(value)
Clear set.clear()
Length set.length
Test set.contains(value) Returns true or false
Iterate set.forEach(doSomething) Plus set.map() and other iterative methods, returning Arrays and scalars

Find, Previous and Next work with an Iterator pattern. An iterator is an immutible pointer into the space "between" two items in the set.

const iterator = set.beginIterator() // points to the left of the leftmost item
const iterator2 = iterator.next() // points to the left of the second item
const value = iterator.value(), value2 = iterator2.value()
const end = set.endIterator() // points to the right of the final item
const value2 = end.value() // null, because there is no item

Here's the full SortedSet iterator API:

Operation Syntax (js-sorted-set) Notes
Length const len = set.length
Find const iterator = set.findIterator(value) iterator points to the left of value. If value is not in set, iterator points to the left of the first item greater than value. If value is greater than the final item in set, iterator points to the right of the final item.
Begin const iterator = set.beginIterator() If set is empty, this is equivalent to const iterator = set.endIterator()
End const iterator = set.endIterator() Points past the end of set there is never a value here
Value const value = iterator.value() For an end iterator, returns null
Forward const iterator2 = iterator.next() If iterator is an end iterator, returns null
Backward const iterator2 = iterator.previous() If iterator is a begin iterator, returns null
Can go forward const isBegin = !iterator.hasPrevious()
Can go backward const isEnd = !iterator.hasNext() Remember, if iterator is pointing to the left of the final item in set, then hasNext() will return true -- even though iterator.next().value() === null

All iterators on set become invalid as soon as something calls set.insert() or set.remove().

Options

How exactly will these elements be ordered? Let's add a comparator option. This is the argument we would pass to Array.prototype.sort:

const compareNumbers = (a, b) => a - b
const set = new SortedSet({ comparator: compareNumbers })

How to handle insert conflicts? We'll also add a onInsertConflict option that provides users with a way to specify what to do in case an item is inserted that matches another item already present within the set. Such behavior must be specified as a function taking in the conflicting items and returning a replacement item for the previously inserted one. The SortedSet class ships with three implementations such behavior:

SortedSet.OnInsertConflictThrow     // throws an error
SortedSet.OnInsertConflictReplace   // keeps the new item
SortedSet.OnInsertConflictIgnore    // keeps the previous item

Unless differently specified through the onInsertConflict option, the SortedSet class will default to SortedSet.OnInsertConflictThrow:

const set = new SortedSet({ 
    onInsertConflict: SortedSet.OnInsertConflictThrow
})
set.insert("foo")
set.insert("foo") // throw an error

Finally, some algorithms ask for really fast replacement mechanisms. So let's add a setValue() method to the iterator, which puts the onus on the user to keep things ordered.

Because this is a particularly dangerous API to use, you must set the option allowSetValue: true when creating the SortedSet.

const set = new SortedSet({ allowSetValue: true })
set.insert("foo")
set.insert("bar")
set.insert("baz")

// Shortcut API
const iterator = set.findIterator("bar")
iterator.setValue("baq") // It must stay ordered! Do not set "bbq" here!
// The shortcut executes very quickly, but if the user makes a mistake,
// future operations will likely fail

// iterator.setValue("baq") here is equivalent to:
// set.remove("bar")
// set.insert("baq")

Strategies

We can be somewhat efficient in an Array approach by avoiding sort() calls. This strategy keeps the array ordered at all times by inserting and removing elements into and out from the correct array indices. The downside: large swaths of the array must be rewritten during each insert and remove.

We can also create a simple binary tree. insert() and remove() won't overwrite the entire array each time, so this can be faster. But it's far slower to seek through a binary tree, because it can spread out very far across memory so the processor won't cache it well. Also, depending on the order in which elements were input, inserting a single item into the tree can actually be slower than rewriting an entire Array.

Finally, we can improve upon the binary tree by balancing it. This guarantees a certain maximum number of reads and writes per operation. Think of it this way: if you're lucky, a simple binary tree's operations can be extremely fast; if you're unlucky, they can be extremely slow; you'll usually be unlucky. A balanced tree makes all operations somewhat fast.

The balanced tree (which, incidentally, is a Left-Leaning Red-Black tree) is the default, because its speed is the most predictable.

Create the sets like this:

const set = new SortedSet({ strategy: SortedSet.ArrayStrategy }) // Array
const set = new SortedSet({ strategy: SortedSet.BinaryTreeStrategy }) // simple binary tree
const set = new SortedSet({ strategy: SortedSet.RedBlackTreeStrategy }) // default

Use the ArrayStrategy if your set will only have a few values at a time. Use the BinaryTreeStrategy if you've run lots of tests and can prove it's faster than the others. If neither of these conditions applies, use the default, RedBlackTreeStrategy.

You'll see running times like this:

Operation Array Binary tree Red-black tree
Create O(1) O(1) O(1)
Length O(1) O(1) O(1)
Clear O(1) O(n) (in garbage collector) O(n) (in garbage collector)
Insert O(n) (often slow) O(n) (often slow) O(lg n) (fast)
Remove O(n) (often slow) O(n) (often slow) O(lg n) (fast)
Iterate O(n) (fast) O(n) (slowest) O(n) (slower than Array)
Find, Test O(lg n) (fastest) O(n) (slowest) O(lg n) (slower than Array)

According to some simple jsPerf tests, you should use ArrayStrategy if you plan on maintaining about 100 to 1,000 items in your set. At that size, ArrayStrategy's insert() and remove() are fastest in today's browsers; and ArrayStrategy's iteration is faster at all sizes.

Contributing

  1. Fork this repository
  2. Run npm install
  3. Write the behavior you expect in test/
  4. Edit files in src/ until npm test says you're done
  5. Run npm run build to update build products
  6. Submit a pull request

License

I, Adam Hooper, the sole author of this project, waive all my rights to it and release it under the Public Domain. Do with it what you will.

My hope is that a JavaScript implementation of red-black trees somehow makes the world a better place.

js-sorted-set's People

Contributors

adamhooper avatar dependabot[bot] avatar jacoscaz avatar xeoneux avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

js-sorted-set's Issues

ArrayStrategy Remove

ArrayStrategy.remove seems like it would be better off checking the comparator == 0 vs this.data[index] !== value. Can lead to problems otherwise with seemingly identical objects not evaluating to equal.

Difference between comparing and sorting

I notice that when you return 0 in the comparator, the items are considered equal, but instead they should provide a stable sort.

Perhaps there should be a separate isEqual option, defaulting to ===?

RedBlackTreeStrategy throws error when in should not?

Hey Adam, this piece of code does not work as expected, is it? Or am I missing something?

const set = new SortedSet({comparator: () => 0});
set.insert(0);
set.remove(0);
set.insert(1);
set.insert(2);
set.insert(3);
const min = set.beginIterator().value();
console.log(`${min} in set: ${set.contains(min)}`); // shows false
set.remove(min); // throws 'Value not in set'

Add ability to initialize sorted set with an array of data

A great improvement could be adding an optional initial data argument to the constructor, or an initialize method to add an an initial data set. While the runtime would still be nLog(n) vs a user running repeated insert operations, you could avoid lots of data splicing on the ArrayStrategy.

Perhaps something along the lines of:

// AbstractSortedSet:
initialize(values){
this.priv.initialize(values);
this.length = values.length;
return this;
}

// ArrayStrategy:
initialize(values){
values.sort(this.comparator);
this.data =values
}

Add a method for check if item is in set

Hello,

i need a method to check if a item is in a set. I did not find that method, for my project. I use:

SortedSet.prototype.isInSet = function (val) {  
    return this.findIterator(val).value() !== null );
};

I would be happy to see such a function in js-sorted-set.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.