Andreas Wagner's Projects

gpt-embeddings

Fast Text Embeddings retrofitted using the Google Product Taxonomy (GPT) Tree

granne

Graph-based Approximate Nearest Neighbor Search

graphjet

GraphJet is a real-time graph processing library.

gru4rec

GRU4Rec is the cleaned & simplified implementation of the algorithm of the "Session-based Recommendations with Recurrent Neural Networks" paper, published at ICLR 2016. The code is stripped of features that we had found to be unhelpful in increasing accuracy.

hesml

HESML Java software library of ontology-based semantic similarity measures and information content models

hnswlib

Java library for approximate nearest neighbors search using Hierarchical Navigable Small World graphs

hnswlib-jna

Native-Like Performance for Nearest Neighbor Search in Java Applications using Hnswlib and Java Native Access

homonym

A mini web crawler to get hundreds of websites' content based on a list of keywords.

hyperscan-java

Match tens of thousands of regular expressions within milliseconds - Java bindings for Intel's hyperscan 5

ice

ICE: Item Concept Embedding

ieturk

Intuitive Annotation Tool for Information Extraction / Named Entity Recognition using localturk / Amazon Mechanical Turk

implicit

Fast Python Collaborative Filtering for Implicit Datasets

improving-semantic-topic-clustering-for-search-queries-with-word-co-occurrence

Uncovering common themes from a large number of unorganized search queries is a primary step to mine insights about aggregated user interests. Common topic modeling techniques for document modeling often face sparsity problems with search query data as these are much shorter than documents. We present two novel techniques that can discover semantically meaningful topics in search queries: i) word co-occurrence clustering generates topics from words frequently occurring together; ii) weighted bigraph clustering uses URLs from Google search results to induce query similarity and generate topics. We exemplify our proposed methods on a set of Lipton brand as well as make-up & cosmetics queries. A comparison to standard LDA clustering demonstrates the usefulness and improved performance of the two proposed methods. Keywords: search queries, topic clustering, word co-occurrence, bipartite graph, co-clustering.

indexer4j

Simple full text indexing and searching library for Java

interactive-dictionary

In this program, the user interacts with a dictionary. The user can enter a word and a part of speech, and filter the dictionary by part of speech. The Java program pulls its data from an enum. A few fixes are still needed, but the program works overall.

interactive-search-engine

A search engine built on a web crawler: it stores a term/document frequency matrix from the collected logs, computes cosine similarity between each document and the query, and returns document URLs ordered by similarity.
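
The ranking step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names (`term_frequencies`, `cosine_similarity`, `rank_documents`), the whitespace tokenizer, the raw term counts used as weights, and the example URLs are all assumptions made for the sketch.

```python
import math
from collections import Counter


def term_frequencies(text):
    """Lowercase the text, split on whitespace, and count each term."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query, documents):
    """Return URLs ordered by descending similarity to the query.

    `documents` maps URL -> page text. Documents that share no term
    with the query (similarity 0) are dropped from the result.
    """
    query_vector = term_frequencies(query)
    scored = []
    for url, text in documents.items():
        score = cosine_similarity(query_vector, term_frequencies(text))
        if score > 0:
            scored.append((score, url))
    # Highest score first; ties broken by URL for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _, url in scored]


# Hypothetical crawled pages, for illustration only.
DOCS = {
    "https://example.com/a": "graph search with nearest neighbor graphs",
    "https://example.com/b": "recipes for pasta and pizza",
    "https://example.com/c": "nearest neighbor search in java",
}

if __name__ == "__main__":
    for url in rank_documents("nearest neighbor search", DOCS):
        print(url)
```

A real system would likely precompute the term/document matrix once instead of recounting terms per query, and might weight terms by inverse document frequency; the sketch keeps raw counts for brevity.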

ip-nsw

Implementation of ip-nsw from the paper "Non-metric Similarity Graphs for Maximum Inner Product Search"

ipa-dict

Monolingual wordlists with pronunciation information in IPA

ir-project

Project for the Information Retrieval course at the University of Padova: "GRAS Stemmer".
