Seminar Knowledge Mining

Wikimedia image classification and suggestions for article authors.

Set up instructions

Unix

  1. Install these dependencies by using your system's package manager if you don't have them already.

    Dependency  Apt                 Pacman             Homebrew
    Python 3    python3             python
    Cython      cython3             cython
    Pip         python3-pip         python-pip
    Virtualenv  virtualenv          python-virtualenv
    Fortran     gfortran            gcc-fortran
    Blas        libblas-dev         blas
    Lapack      liblapack-dev       lapack
    PNG         libpng-dev          libpng
    JPEG        libjpeg8-dev        libjpeg-turbo
    Freetype    libfreetype6-dev    freetype2
    Cairo       libcairo2-dev       cairo
    FFI         libffi-dev
  2. Create a virtual environment inside the repository root by running virtualenv . (the trailing dot is the target directory). If you have multiple Python versions installed, run virtualenv -p python3 . instead.

  3. Activate your virtual environment using source bin/activate. The repository name should now appear in front of your shell prompt.

  4. Install dependencies inside your virtual environment

     pip install -r requirements.txt
    
  5. Install OpenCV 3.0 with bindings for Python 3 by running

     chmod +x tool/setup-opencv.sh
     tool/setup-opencv.sh
    
  6. UTF-8 is required, so you may need to add these lines to your ~/.bash_profile and apply the changes with source ~/.bash_profile.

     export LC_ALL=en_US.UTF-8
     export LANG=en_US.UTF-8
    
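To confirm that the locale change took effect, the encodings Python picked up from the environment can be inspected. This is a minimal sketch using only the standard library; the helper names `is_utf8` and `encoding_report` are made up for illustration and are not part of this repository.

```python
import locale
import sys


def is_utf8(name):
    """Return True if an encoding name refers to UTF-8 (e.g. 'utf-8', 'UTF8')."""
    return (name or "").lower().replace("-", "").replace("_", "") == "utf8"


def encoding_report():
    """Collect the encodings Python derived from the current environment."""
    return {
        "preferred": locale.getpreferredencoding(False),
        "stdout": getattr(sys.stdout, "encoding", None),
        "filesystem": sys.getfilesystemencoding(),
    }


if __name__ == "__main__":
    for source, name in sorted(encoding_report().items()):
        status = "ok" if is_utf8(name) else "NOT UTF-8"
        print("{}: {} ({})".format(source, name, status))
```

If any entry is reported as not UTF-8, re-check the two export lines and open a new shell before running the project.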

Windows

  1. Create a virtual environment inside the repository root by running virtualenv . (the trailing dot is the target directory). If you have multiple Python versions installed, run virtualenv -p C:\Python34\python.exe . instead.

  2. Activate your virtual environment using Scripts\activate. The repository name should now appear in front of your shell prompt.

  3. Download these dependencies. If in doubt, use the second-to-last link in each list. Run pip install <path-to-file> on each downloaded file.

  4. Install remaining dependencies inside your virtual environment using pip install -r requirements.txt.
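After either setup path, a quick check from inside the activated environment can show whether the key modules are importable. The sketch below is an assumption-laden example: `cv2` is the module name of OpenCV's Python bindings, while `numpy` is only a guess at what requirements.txt installs, so adjust the list to the actual file. The helper `missing_modules` is hypothetical.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed module list; replace it with the packages your setup requires.
    missing = missing_modules(["cv2", "numpy"])
    if missing:
        print("missing: " + ", ".join(missing))
    else:
        print("all modules found")
```

Using `find_spec` only locates the modules without importing them, so the check stays fast and does not fail on import-time errors.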

Workflows

Data set

  1. Download DBpedia dump
  2. Extract list of image names
  3. Fetch image and metadata of random entries
  4. Manually label data
  5. Balance the number of images per class
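The balancing step can be sketched as downsampling every class to the size of the smallest one. This is only one possible strategy and not necessarily what the project does; the function name `balance_classes` and the `(image_name, class_label)` pair format are assumptions made for the example.

```python
import random
from collections import defaultdict


def balance_classes(labeled, seed=0):
    """Downsample every class to the size of the smallest class.

    labeled: iterable of (image_name, class_label) pairs.
    Returns a new list of pairs with equally many entries per class.
    """
    by_class = defaultdict(list)
    for name, label in labeled:
        by_class[label].append(name)
    if not by_class:
        return []
    target = min(len(names) for names in by_class.values())
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    balanced = []
    for label in sorted(by_class):
        for name in rng.sample(by_class[label], target):
            balanced.append((name, label))
    return balanced


if __name__ == "__main__":
    # Hypothetical labels, purely for illustration.
    data = [("a.jpg", "map"), ("b.jpg", "map"), ("c.jpg", "map"),
            ("d.jpg", "photo"), ("e.jpg", "photo")]
    print(balance_classes(data))
```

Downsampling discards labeled data; if labels are scarce, oversampling the smaller classes would be an alternative.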

Training

  1. Preprocess data set
  2. Extract image- and text-based features
  3. Train classifier
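How image- and text-based features might end up in one vector for the classifier can be sketched as follows. This is an illustrative assumption, not the project's feature extractor: the text features are plain bag-of-words counts over a fixed vocabulary, the image features are taken as already computed, and the names `text_features` and `combine_features` are hypothetical.

```python
import re
from collections import Counter


def text_features(description, vocabulary):
    """Count how often each vocabulary word occurs in a description."""
    words = re.findall(r"[a-z0-9]+", description.lower())
    counts = Counter(words)
    return [counts[word] for word in vocabulary]


def combine_features(image_features, description, vocabulary):
    """Concatenate precomputed image features with bag-of-words counts."""
    return list(image_features) + text_features(description, vocabulary)


if __name__ == "__main__":
    vocabulary = ["map", "photo", "city"]  # assumed vocabulary
    image_features = [0.5, 0.25]           # placeholder values
    print(combine_features(image_features, "Map of Berlin, a city map", vocabulary))
```

Keeping the vocabulary order fixed matters, because the classifier expects each position in the vector to mean the same thing for every sample.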

Suggesting article images

  1. Get user search term
  2. Query DBpedia for related images based on description
  3. Fetch image and metadata of first results
  4. Extract image- and text-based features
  5. Use trained classifier to predict class
  6. Filter against user's class selection
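The DBpedia query in step 2 could, for instance, go to DBpedia's public SPARQL endpoint. The sketch below only builds the request URL and sends nothing. It assumes that descriptions are matched against `dbo:abstract` and that images are linked through `foaf:depiction`; the project's actual query is not shown in this README, and the function names are hypothetical.

```python
from urllib.parse import urlencode

ENDPOINT = "https://dbpedia.org/sparql"  # public DBpedia SPARQL endpoint


def build_image_query(term, limit=20):
    """Build a SPARQL query for resources whose English abstract mentions term."""
    # Collapse whitespace, then escape backslashes and quotes so the term
    # cannot break out of the SPARQL string literal.
    safe = " ".join(term.lower().split())
    safe = safe.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
        "SELECT DISTINCT ?resource ?image WHERE {\n"
        "  ?resource dbo:abstract ?abstract ;\n"
        "            foaf:depiction ?image .\n"
        '  FILTER(LANG(?abstract) = "en")\n'
        '  FILTER(CONTAINS(LCASE(STR(?abstract)), "' + safe + '"))\n'
        "}\n"
        "LIMIT " + str(int(limit))
    )


def build_request_url(term, limit=20):
    """Return a GET URL that asks the endpoint for JSON results."""
    params = {
        "query": build_image_query(term, limit),
        "format": "application/sparql-results+json",
    }
    return ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_request_url("lighthouse", limit=5))
```

A substring filter over all abstracts can be slow on a public endpoint, so a small `limit` is a sensible default while experimenting.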

Contributors

danijar, dencrash, janukobytsch, sleighsoft
