andy-wagner
Name: Andreas Wagner
Type: User
Company: [email protected]
Bio: Serial entrepreneur, data, information retrieval and machine learning geek
Location: Germany
Blog: www.searchhub.io
Fast Text Embeddings retrofitted using the Google Product Taxonomy (GPT) Tree
Graph-based Approximate Nearest Neighbor Search
Search for a term/tag in the graph and return the sorted (based on edge weights) list of tags which are related to the search query
GraphJet is a real-time graph processing library.
GROBID extension for identifying and normalizing physical quantities.
GRU4Rec is the cleaned & simplified implementation of the algorithm of the "Session-based Recommendations with Recurrent Neural Networks" paper, published at ICLR 2016. The code is stripped of features that we had found to be unhelpful in increasing accuracy.
Efficient word co-occurrence statistics computation software
HESML Java software library of ontology-based semantic similarity measures and information content models
An implementation of the Hierarchical Embedding Model (HEM) for personalized product search
Java library for approximate nearest neighbors search using Hierarchical Navigable Small World graphs
Native-Like Performance for Nearest Neighbor Search in Java Applications using Hnswlib and Java Native Access
A mini web crawler to get hundreds of websites' content based on a list of keywords.
Match tens of thousands of regular expressions within milliseconds - Java bindings for Intel's hyperscan 5
ICE: Item Concept Embedding
Intuitive Annotation Tool for Information Extraction / Named Entity Recognition using localturk / Amazon Mechanical Turk
Library to find most similar items based on images
Image search engine
Fast Python Collaborative Filtering for Implicit Datasets
Uncovering common themes from a large number of unorganized search queries is a primary step to mine insights about aggregated user interests. Common topic modeling techniques for document modeling often face sparsity problems with search query data as these are much shorter than documents. We present two novel techniques that can discover semantically meaningful topics in search queries: i) word co-occurrence clustering generates topics from words frequently occurring together; ii) weighted bigraph clustering uses URLs from Google search results to induce query similarity and generate topics. We exemplify our proposed methods on a set of Lipton brand as well as make-up & cosmetics queries. A comparison to standard LDA clustering demonstrates the usefulness and improved performance of the two proposed methods. Keywords: search queries, topic clustering, word co-occurrence, bipartite graph, co-clustering.
Analyze for significant cooccurrence using Mahout sparse matrices
Simple full text indexing and searching library for Java
Trie-Based Prefix Index for Java
In this program, the user interacts with a dictionary. The user can input a word and a part of speech, and filter the dictionary by part of speech. The Java program pulls its data from an enum. There are still a few fixes to make, but the program works overall.
Interactive Elasticsearch Analyzer
Based on a web crawler: stores a term/document frequency matrix built from the collected logs, computes cosine similarity against the query, and returns document URLs ordered by similarity.
Implementation of ip-nsw from Non-metric Similarity Graphs for Maximum Inner Product Search
Monolingual wordlists with pronunciation information in IPA
Sequence Prediction Framework
Project for the Information Retrieval course at the University of Padova: "GRAS Stemmer".