Topic: wikipedia-dump Goto Github
Something interesting about wikipedia-dump
wikipedia-dump,Command line tool to extract plain text from Wikipedia database dumps
User: afuschetto
wikipedia-dump,A Python toolkit to generate a tokenized dump of Wikipedia for NLP
User: akb89
wikipedia-dump,unpack wikipedia XML dumps to files
User: alicebob
wikipedia-dump,Wikipedia importer tool for Apache Sling and Adobe AEM
User: artika4biz
wikipedia-dump,Wiki dump parser (jupyter)
Organization: bashkirtsevich-llc
wikipedia-dump,Work with Wikipedia dumps.
User: bfontaine
Home Page: https://pypi.org/project/wpydumps/
wikipedia-dump,Extract citation ISBNs from Wikipedia dump
Organization: calil
wikipedia-dump,Wikipedia Dump Processing
Organization: cogcomp
wikipedia-dump,Scripts to download the Wikipedia dumps (available at https://dumps.wikimedia.org/ )
User: cristiancantoro
wikipedia-dump,Chat with local Wikipedia embeddings 📚
User: deadbits
wikipedia-dump,Contains code to build a search engine by creating an index and perform search over Wikipedia data.
User: dhavaltaunk08
wikipedia-dump,Extracts geodata from a wikipedia dump
User: donomii
Home Page: https://donomii.github.io/wikipedia2geojson
wikipedia-dump,WikimediaDumpExtractor extracts pages from Wikimedia/Wikipedia database backup dumps.
Organization: eml4u
wikipedia-dump,Corpus creator for Chinese Wikipedia
User: howl-anderson
wikipedia-dump,Wikicompiler is a fully extensible Python library that compiles and evaluates text from Wikipedia dumps. You can extract text, do text analysis, or even evaluate the AST (Abstract Syntax Tree) yourself
User: iwasingh
wikipedia-dump,A library that assists in traversing and downloading from Wikimedia Data Dumps and their mirrors.
User: jon-edward
wikipedia-dump,A simple utility to index wikipedia dumps using Lucene.
User: lemire
wikipedia-dump,ORES-Inspect is a web app for auditing machine learning models used on Wikipedia.
User: levon003
wikipedia-dump,Natural and Technical Language Processing using Spacy, Named Entity Recognition and a custom Relationship Extraction and Labeling component
User: lsiecker
wikipedia-dump,Some Faroese language statistics taken from fo.wikipedia.org content dump
User: macbre
wikipedia-dump,Python package for working with MediaWiki XML content dumps
User: macbre
Home Page: https://pypi.org/project/mediawiki_dump/
wikipedia-dump,Generates a JSON file with F1 driver stats from a given year, based on that year's Wikipedia page
User: matiascarabella
wikipedia-dump,Wikipedia Dump Loader for Spark
User: nwtgck
wikipedia-dump,Collects a multimodal dataset of Wikipedia articles and their images
User: olehonyshchak
wikipedia-dump,🌐 Guide and tools to run a full offline mirror of Wikipedia.org with three different approaches: Nginx caching proxy, Kiwix + ZIM dump, and MediaWiki/XOWA + XML dump
User: pirate
Home Page: https://docs.sweeting.me/s/self-host-a-wikipedia-mirror
wikipedia-dump,A tool to get the plainest text out of Wikipedia XML dumps
User: prithvidasgupta
wikipedia-dump,A complete search-engine experience built on top of a 75 GB Wikipedia corpus with sub-second search latency. Results contain wiki pages ordered by TF-IDF relevance for the given search words. From optimized code to the k-way mergesort algorithm, this project addresses latency, indexing, and big-data challenges.
User: priyendumori
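The TF-IDF ranking that the search-engine projects in this list describe can be sketched in a few lines. The tiny corpus, tokenizer, and function names below are illustrative, not taken from any repository here:

```python
import math
from collections import Counter

# Toy corpus standing in for indexed Wikipedia pages (illustrative).
docs = {
    "page1": "wikipedia dump parser",
    "page2": "search engine over wikipedia dump",
    "page3": "image dataset",
}
tokenized = {d: t.split() for d, t in docs.items()}
N = len(docs)

def tfidf(term, doc):
    """Term frequency in the document, weighted by inverse document frequency."""
    tf = tokenized[doc].count(term) / len(tokenized[doc])
    df = sum(1 for toks in tokenized.values() if term in toks)
    return tf * math.log(N / df) if df else 0.0

def search(query):
    """Return document ids ordered by summed TF-IDF score for the query words."""
    scores = {d: sum(tfidf(w, d) for w in query.split()) for d in docs}
    return sorted(scores, key=scores.get, reverse=True)

print(search("search engine"))  # the page containing both terms ranks first
```

Real projects replace the linear document-frequency scan with an inverted index so scoring stays sub-second over tens of gigabytes.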
wikipedia-dump,Wikipedia-based Explicit Semantic Analysis, as described by Gabrilovich and Markovitch
User: pvoosten
wikipedia-dump,Research for a master's degree; operation projizz-I/O
User: qcl
Home Page: http://nlg.csie.ntu.edu.tw/~ccli/
wikipedia-dump,Convert dumped Wikipedia XML (Chinese) to human-readable documents in Markdown and plain text.
User: quqixun
wikipedia-dump,A search engine built on a 75 GB Wikipedia dump. Involves creating an index file and returns search results in real time
User: rajatyadav1994
wikipedia-dump,Index and Search wikiDump
User: ramkishore07s
wikipedia-dump,Visualize/explore word2vec datasets with pygame
User: rocket-pig
wikipedia-dump,A search system based on the Wikipedia dump dataset.
User: rsakib15
wikipedia-dump,Fact Extraction and Verification
User: rspai
wikipedia-dump,WikiBank is a new partially annotated resource for multilingual frame-semantic parsing task.
User: sascezar
wikipedia-dump,Identifies acronyms in a text file and disambiguates possible expansions
User: sayarghoshroy
wikipedia-dump,Extracting useful metadata from Wikipedia dumps in any language.
User: shyamupa
wikipedia-dump,Extract human names from Wikipedia
User: sinkasula
wikipedia-dump,Russian Wikipedia movie parser
User: slotabr
wikipedia-dump,Java tool to convert Wikimedia dumps into Java Article POJOs for test or fake data.
User: studerw
wikipedia-dump,Wikipedia archive downloader+text parser for every language
User: temurchichua
wikipedia-dump,Node.js module for parsing the content of Wikipedia articles into JavaScript objects
User: tomer8007
wikipedia-dump,📚 A Kotlin project which extracts ngram counts from Wikipedia data dumps.
User: tomeraberbach
wikipedia-dump,Reading the data from OPIEC - an Open Information Extraction corpus
Organization: uma-pi1
Home Page: https://www.uni-mannheim.de/dws/research/resources/opiec/
wikipedia-dump,Website with an interactive game where you have to travel from a random Wikipedia page to Adolf Hitler's page (or any page you specify in the settings).
User: vityaschel
Home Page: https://wikipedia.utidteam.com
wikipedia-dump,Convert Wikipedia XML dump files to JSON or Text files
User: wolfgarbe
wikipedia-dump,A command-line toolkit to extract text content and category data from Wikipedia dump files
User: yohasebe
wikipedia-dump,mirror of https://git.noxz.tech/wikid
User: z0noxz
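Nearly every tool in this list starts the same way: streaming pages out of a multi-gigabyte XML export rather than loading it whole. A minimal sketch using only the Python standard library — the inline sample XML and function name are illustrative, though the tag names and namespace follow the real MediaWiki export schema:

```python
import io
import xml.etree.ElementTree as ET

# Tiny stand-in for a real dump file such as enwiki-latest-pages-articles.xml.
SAMPLE_DUMP = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Example</title>
    <revision><text>'''Example''' is a page.</text></revision>
  </page>
  <page>
    <title>Another</title>
    <revision><text>Another page body.</text></revision>
  </page>
</mediawiki>"""

NS = "{http://www.mediawiki.org/xml/export-0.10/}"

def iter_pages(source):
    """Yield (title, wikitext) pairs, releasing each subtree after use."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == NS + "page":
            title = elem.findtext(NS + "title")
            text = elem.findtext(f"{NS}revision/{NS}text") or ""
            yield title, text
            elem.clear()  # keep memory flat even on multi-GB dumps

pages = list(iter_pages(io.StringIO(SAMPLE_DUMP)))
print(pages[0][0])  # title of the first page
```

With a real dump you would pass an open (possibly bz2-decompressed) file object instead of the `StringIO`; the `elem.clear()` call is what keeps `iterparse` from accumulating the whole tree.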