Coder Social home page Coder Social logo

validark / dynsdt Goto Github PK

View Code? Open in Web Editor NEW
21.0 3.0 0.0 240.35 MB

Dynamic Score-Decomposed Tries which more efficiently solve the prefix autocomplete problem

Home Page: https://validark.github.io/DynSDT/demo

License: MIT License

TypeScript 2.23% C# 14.61% Zig 83.16%
autocomplete autosuggest completion decomposition prefix-tree top-k trie type-ahead decomposed query

dynsdt's Introduction

DynSDT Implementations

Dynamic Score-Decomposed Tries which solve the scored prefix completion problem. The C# version is the primary version at the moment. The preliminary TypeScript version in the main repo was created just to match the pseudocode from the paper to help verify the correctness of the pseudocode. The Zig version is still in-development.

Paper: validark.github.io/DynSDT, entitled Heap-like Dynamic Score-Decomposed Tries for Top-k Autocomplete

Live Demo: validark.github.io/DynSDT/demo

Email autocomplete proof-of-concept: validark.github.io/DynSDT/web-autocomplete

  • This uses another TypeScript implementation I made which supports aliasing so multiple names/emails can autocomplete to the same person. The source code is available here.

Paper & Demo website code: /tree/paper

Give me a star if you appreciate this work!

Keywords: Query Autocomplete, QAC, type-ahead, ranked autosuggest, top-k autocomplete, trie decomposition, Dynamic Score-Decomposed Trie, trie autocomplete, Completion Trie.

dynsdt's People

Contributors

validark avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

dynsdt's Issues

TODO

Demo

  • Make transition from Completion Trie to decomposed trie animate the vertical linked lists into the parent nodes
  • Dark mode
  • Shift the Completion Trie and subsequent diagrams upwards slightly by default?

Paper

  • Fix font inconsistency across platforms
    • Currently, Figures 1-3 use the "serif" font-family, which becomes the default serif font on whatever the target platform is. For consistency, a dynamically loaded font should be used or the text should be converted to paths.
      • At present, I'd prefer a dynamically loaded font
    • Change the Big-Theta from Georgia to a dynamically loaded font
    • Consider moving all diagrams to monospace or the KaTeX_Main font?
  • Where $\large n$ and $\large bp[i]$ are identical, we should remove $\large n$ in favor of $\large bp[i]$
  • Add an "Experimental Analysis" section
    • The main problem I have with this is that performance characteristics may vary by implementation. Maybe I will support an analysis for each of them, but without scripts only one will be shown?
    • Decide: Should I have a performance comparison with an implementation of the Completion Trie? Just horizontally sorted or should I include the totally unsorted Completion Trie as well?
    • Diagrams I am thinking to include:
      • How $\large k$ correlates with runtime
        • This will only use those prefix queries which are in the top-10 or top-100 of number of total completions in the entire dataset
      • How number of completions to a query (for all possible prefix string queries) correlates with runtime (when $\large k$ is set to 10)
  • Submit to some academic platforms

Code

  • Add implementations!
    • I am open to Pull Requests from others for more implementations!

    • C-sharp This was the original version!
      • Add tests
      • Upload graphs (and code that generates them) for experimental analysis
      • #3
    • TypeScript The initial implementation only exists to closely match the pseudocode in the paper, and the pseudocode was actually based on the C# version. The problem with the direct port is that the C# version wraps nodes in tuples, which are values, not actual objects with lifetimes. However, in TS/JS, there is no such concept, just objects. Hence, the TS version should have the LCP values moved into the node that it actually describes, to cut the number of JS objects in half.
    • Lua Should be pretty easy to port from TypeScript, especially given my extensive experience developing https://roblox-ts.com/. Could probably be largely auto-generated.
    • Java Should be pretty similar to C#, right?
    • Zig All your codebase are belong to us!
      • Maybe do some nice compression of the data structure?
    • Python I don't actually "know python" yet, but the constructs I need to write this library should be pretty trivial.
    • Go fast!
    • Markdown Make a table that displays the supported features for each version. More than likely, the initial implementation for many languages may not have numeric compression initially.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.