
treeinrandomforest / aicoe-insights-clustering


This project forked from redhatinsights/aiops-insights-clustering


Clustering of systems

License: GNU General Public License v3.0


aicoe-insights-clustering's Introduction

Clustering Systems

Running the clustering on OpenShift

First, load the templates that define all required resources:

❯ oc create -f ./cluster-job-template.yaml -f ./build-config-template.yaml
template "systems-clustering-job" created
template "systems-clustering-bc-is" created

Then create the BuildConfig and ImageStream from the template:

❯ oc new-app --template systems-clustering-bc-is
--> Deploying template "mhild-test/systems-clustering-bc-is" to project mhild-test

     * With parameters:
        * APPLICATION_NAME=systems-clustering
        * GIT_URI=https://github.com/ManageIQ/aiops-insights-clustering.git

--> Creating resources ...
    buildconfig "systems-clustering" created
    imagestream "systems-clustering" created
--> Success
    Use 'oc start-build systems-clustering' to start a build.
    Run 'oc status' to view your app.

Development workflow

Copy the example .env file and adjust the variables:

cp .env.example .env

Start a build

❯ oc start-build systems-clustering
build "systems-clustering-1" started

Or, to push your locally committed code:

git add .
make oc_build_head

Finally, run a job that performs the clustering:

make oc_cluster_train

Inspect the output of the job:

❯ oc logs systems-clustering-job-spah -f
---> Running application from Python script (app.py) ...
app.py:50: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
  np_rules = pd_rules.as_matrix()
0        1
1        4
2        1
        ..
40160    3
40161    1
Name: cluster, Length: 40162, dtype: int32
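The log above prints one row per system: the row index followed by the assigned cluster label. As a rough illustration of how such output could be summarized, the following sketch tallies the labels that are visible in a captured log. The function name `tally_clusters` is hypothetical and not part of this project, and because the printed series is truncated (the `..` row), the tally only covers the rows actually shown.

```python
import re
from collections import Counter

# A data row looks like "40160    3": a row index, whitespace, a cluster label.
ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s*$")


def tally_clusters(log_text):
    """Count cluster labels in the visible rows of a job log (hypothetical helper)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ROW.match(line)
        if match:
            # Group 2 is the cluster label; group 1 is the row index.
            counts[int(match.group(2))] += 1
    return counts


if __name__ == "__main__":
    sample = "0        1\n1        4\n2        1\n        ..\n40160    3\n40161    1\n"
    print(tally_clusters(sample))
```

Warning lines, the `..` placeholder and the trailing `Name: cluster, ...` summary do not match the row pattern and are skipped.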

Tests

make test

Local build

If you would like to deploy the clustering service locally, you can build the container image using S2I:

❯ s2i build -c . centos/python-36-centos7 aicoe-insights-clustering

For convenience, you can store your desired environment variables in a separate file:

❯ cat << EOT >> env.list
FLASK_ENV=development
CEPH_KEY=...
CEPH_SECRET=...
CEPH_ENDPOINT=...
EOT

Then run it as a Docker container:

❯ docker run --env-file env.list -p 8080:8080 -it aicoe-insights-clustering
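Before starting the container, it can help to check that `env.list` actually contains the values you expect. The sketch below reads the file as plain `KEY=VALUE` lines, which is a simplified reading of the env-file format, and reports which Ceph variables are still empty. The helper names `parse_env_file` and `missing_ceph_variables` are hypothetical and not part of this project.

```python
REQUIRED_FOR_CEPH = ("CEPH_KEY", "CEPH_SECRET", "CEPH_ENDPOINT")


def parse_env_file(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    variables = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if separator:
            # A later duplicate overwrites an earlier one in this sketch.
            variables[key] = value
    return variables


def missing_ceph_variables(variables):
    """Return the Ceph variable names that are absent or empty."""
    return [name for name in REQUIRED_FOR_CEPH if not variables.get(name)]


if __name__ == "__main__":
    example = "FLASK_ENV=development\nCEPH_KEY=abc\nCEPH_SECRET=\n"
    print(missing_ceph_variables(parse_env_file(example)))
```

Note that the heredoc above uses `>>`, which appends, so running it twice leaves duplicate entries in `env.list`.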

Data storage

Currently we support three types of data storage. The clustering service selects the appropriate one based on environment variables.

  • Ceph (use CEPH_KEY, CEPH_SECRET and CEPH_ENDPOINT)
  • AWS S3 (use AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY)
  • Local (none of the variables above are set)

Please note that AWS S3 access is not intended for development, since it may touch sensitive production data. For development purposes, please use local storage as described below.
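The selection rule in the list above could be sketched roughly as follows. This is an illustration only, not the service's actual code: the function name `select_storage` is hypothetical, and two behaviors are assumptions the README does not state, namely that Ceph is checked before AWS S3 and that a partially configured backend falls through to the next option.

```python
import os

CEPH_VARS = ("CEPH_KEY", "CEPH_SECRET", "CEPH_ENDPOINT")
AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def select_storage(env=None):
    """Pick a storage backend name from environment variables (hypothetical)."""
    env = os.environ if env is None else env
    # Assumption: Ceph takes precedence when both sets of variables are present.
    if all(env.get(name) for name in CEPH_VARS):
        return "ceph"
    if all(env.get(name) for name in AWS_VARS):
        return "s3"
    # Assumption: anything incomplete is treated the same as "nothing set".
    return "local"


if __name__ == "__main__":
    print(select_storage())
```

Passing a plain dictionary instead of relying on `os.environ` makes the rule easy to try out without touching real credentials.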

Fetching a local copy of data from AWS

Install and configure the AWS CLI:

❯ pip install awscli --user
...
❯ aws configure --profile insights
AWS Access Key ID [None]: <YOUR_ACCESS_KEY>
AWS Secret Access Key [None]: <YOUR_SECRET_ACCESS_KEY>
Default region name [None]: <leave_blank>
Default output format [None]: <optional [json|text|table]>

Sync the data locally (replace <YYYY-MM-DD> with an existing date). Please use the DH-DEV-INSIGHTS bucket, which contains only nonsensitive data.

❯ # List available dates
❯ aws s3 ls --profile insights \
            --endpoint-url=http://storage-016.infra.prod.upshift.eng.rdu2.redhat.com:8080/ \
            s3://DH-DEV-INSIGHTS/
PRE 2018-02-28/
PRE 2018-03-01/
PRE 2018-03-02/
PRE 2018-03-03/
...

❯ # Sync to ./data
❯ aws s3 sync --profile insights \
              --endpoint-url=http://storage-016.infra.prod.upshift.eng.rdu2.redhat.com:8080/ \
              s3://DH-DEV-INSIGHTS/<YYYY-MM-DD>/rule_data ./data
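If you script the sync, a small wrapper can catch a malformed date before the AWS CLI is invoked. The sketch below only assembles the argument list for the `aws s3 sync` command shown above; the function name `build_sync_command` and its defaults are hypothetical, while the endpoint, bucket and profile are copied from this README.

```python
from datetime import datetime

ENDPOINT_URL = "http://storage-016.infra.prod.upshift.eng.rdu2.redhat.com:8080/"
BUCKET = "DH-DEV-INSIGHTS"
DATE_FORMAT = "%Y-%m-%d"


def build_sync_command(date, target="./data", profile="insights"):
    """Return the argument list for syncing one day's rule_data (hypothetical)."""
    # strptime accepts unpadded values such as "2018-3-1", so round-trip the
    # date to insist on the exact YYYY-MM-DD form used by the bucket prefixes.
    parsed = datetime.strptime(date, DATE_FORMAT)
    if parsed.strftime(DATE_FORMAT) != date:
        raise ValueError(f"expected YYYY-MM-DD, got {date!r}")
    return [
        "aws", "s3", "sync",
        "--profile", profile,
        f"--endpoint-url={ENDPOINT_URL}",
        f"s3://{BUCKET}/{date}/rule_data",
        target,
    ]


if __name__ == "__main__":
    print(" ".join(build_sync_command("2018-03-01")))
```

The returned list can be handed to `subprocess.run(..., check=True)`. The sketch does not verify that the date actually exists in the bucket; listing the available dates as shown above remains the way to check that.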

Contributors

durandom, jhjaggars, treeinrandomforest, tumido
