jbradmil / skills-ml

This project is forked from workforce-data-initiative/skills-ml

Data Processing and Machine learning methods for the Open Skills Project

Home Page: https://workforce-data-initiative.github.io/skills-ml/

License: Other

skills-ml

Open Skills Project - Machine Learning

This library provides the methods used by the Open Skills API, including processing algorithms and utilities for computing our jobs and skills taxonomy.

New to Skills-ML? Check out the Skills-ML Tour! It will get you started with the concepts. A notebook version of the tour is also available, which you can run on your own.

Documentation

Hosted on GitHub Pages: https://workforce-data-initiative.github.io/skills-ml/

Quick Start

1. Virtualenv

skills-ml depends on Python 3.6, so create a virtual environment using a python3.6 executable.

virtualenv venv -p /usr/bin/python3.6

Activate your virtualenv:

source venv/bin/activate

2. Installation

pip install skills-ml

3. Import skills_ml

import skills_ml
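
After the three steps above, a quick sanity check can confirm that the active interpreter is recent enough and that the package is visible to it. The following is a minimal sketch, not part of skills-ml: `check_environment` is a hypothetical helper, and it assumes that Python 3.6 or newer is acceptable (the README itself only names 3.6).

```python
import importlib.util
import sys

# Version named in the Quick Start; treating it as a minimum is an assumption.
REQUIRED = (3, 6)


def check_environment(package="skills_ml", required=REQUIRED):
    """Return (python_ok, package_found) for the current interpreter.

    Hypothetical helper for illustration; it only uses the standard library.
    """
    python_ok = sys.version_info[:2] >= required
    # find_spec returns None when a top-level package cannot be located.
    package_found = importlib.util.find_spec(package) is not None
    return python_ok, package_found


if __name__ == "__main__":
    python_ok, package_found = check_environment()
    print("Python >= 3.6:", python_ok)
    print("skills_ml importable:", package_found)
```

If the second value is False, the most likely cause is that the virtualenv was not activated before running pip install.
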
  • The examples directory contains a few examples that use specific components to perform specific tasks.
  • Check out the descriptions of different algorithm types in algorithms/ and look at any individual directories that match what you'd like to do (e.g. skill extraction, job title normalization).
  • skills-airflow is the open-source production system that uses skills-ml algorithms in an Airflow pipeline to generate open datasets.

Building the Documentation

skills-ml uses a forked version of pydocmd, and a custom script to keep the pydocmd config file up to date. Here's how to keep the docs updated before you push:

$ cd docs
$ PYTHONPATH="../" python update_docs.py # this will update docs/pydocmd.yml with the package/module structure and export the Skills-ML Tour notebook to the documentation directory
$ pydocmd serve # will serve local documentation that you can check in your browser
$ pydocmd gh-deploy # will update the gh-pages branch
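
The same sequence could be scripted, for example to run it from a pre-push hook. The sketch below is an assumption-laden illustration rather than anything shipped with skills-ml: `DOC_STEPS`, `describe` and `run_step` are hypothetical names, the commands are copied from the steps above, and the default is a dry run that only reports what would be executed.

```python
import os
import subprocess

# (command, extra environment) pairs, in the order listed above.
# The names here are hypothetical; only the commands come from the README.
DOC_STEPS = [
    (["python", "update_docs.py"], {"PYTHONPATH": "../"}),
    (["pydocmd", "serve"], {}),      # serves locally and blocks until interrupted
    (["pydocmd", "gh-deploy"], {}),  # updates the gh-pages branch
]


def describe(argv, extra_env):
    """Render a step as the shell line it corresponds to."""
    prefix = " ".join('{}="{}"'.format(k, v) for k, v in extra_env.items())
    return (prefix + " " + " ".join(argv)).strip()


def run_step(argv, extra_env, docs_dir="docs", dry_run=True):
    """Run one step inside docs_dir, or just describe it when dry_run is True."""
    if dry_run:
        return describe(argv, extra_env)
    env = dict(os.environ, **extra_env)
    subprocess.run(argv, cwd=docs_dir, env=env, check=True)
    return None


if __name__ == "__main__":
    for argv, extra_env in DOC_STEPS:
        print(run_step(argv, extra_env))
```

Because pydocmd serve does not return until it is stopped, a real script would probably skip that step or run it separately.
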

Structure

  • algorithms/ - Core algorithmic module. Each submodule is meant to contain a different type of component, such as a job title normalizer or a skill tagger, with a common interface so different pipelines can try out different versions of the components.
  • datasets/ - Wrappers for interfacing with different datasets, such as ONET and Urbanized Area.
  • evaluation/ - Code for testing different components against each other.
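
To illustrate what a "common interface" for swappable components might look like, here is a minimal sketch. Every name in it (`JobTitleNormalizer`, `LowercaseNormalizer`, `SeniorityStrippingNormalizer`, `run_pipeline`) is hypothetical and invented for this example; it does not reproduce the actual classes in algorithms/.

```python
from abc import ABC, abstractmethod


class JobTitleNormalizer(ABC):
    """Hypothetical shared interface; not the real skills-ml base class."""

    @abstractmethod
    def normalize(self, title):
        """Return a normalized form of a raw job title."""


class LowercaseNormalizer(JobTitleNormalizer):
    """Lowercases the title and collapses runs of whitespace."""

    def normalize(self, title):
        return " ".join(title.lower().split())


class SeniorityStrippingNormalizer(JobTitleNormalizer):
    """Also drops a few seniority words (an illustrative list only)."""

    SENIORITY_WORDS = ("senior", "junior", "lead")

    def normalize(self, title):
        words = title.lower().split()
        return " ".join(w for w in words if w not in self.SENIORITY_WORDS)


def run_pipeline(normalizer, titles):
    """A pipeline depends only on the interface, so versions can be swapped."""
    return [normalizer.normalize(t) for t in titles]


if __name__ == "__main__":
    titles = ["  Senior  Data Scientist ", "Lead Software Engineer"]
    for normalizer in (LowercaseNormalizer(), SeniorityStrippingNormalizer()):
        print(type(normalizer).__name__, run_pipeline(normalizer, titles))
```

Since the pipeline only calls normalize, comparing two versions of a component, as the evaluation/ module is described as doing, reduces to passing in a different object.
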

License

This project is licensed under the MIT License - see the LICENSE.md file for details.
