sodafoundation / delfin

delfin is the SODA Infrastructure Manager project which provides unified, intelligent and scalable resource management, alert and performance monitoring

Home Page: https://sodafoundation.io/

License: Apache License 2.0

Python 98.97% Shell 0.42% Dockerfile 0.03% RobotFramework 0.59%

delfin's Introduction

delfin : SODA Infrastructure Manager Project

Introduction

delfin (Dolphin in Spanish!), the SODA Infrastructure Manager project, is an open source project that provides unified, intelligent and scalable resource management, alert and performance monitoring. It will cover resource management for all storage backends and other infrastructure under a SODA deployment. It will also provide alert management and metric data (performance/health) for monitoring and further analysis. It will provide a scalable framework to which more backends and client exporters can be added. This makes it possible to add more storage and infrastructure backends and to support different management clients for monitoring and health prediction.

It provides unified APIs to access, export and connect with clients, as well as a set of interfaces for adding drivers.

This is one of the SODA Core Projects and is maintained by SODA Foundation directly.

Documentation

https://docs.sodafoundation.io

Quick Start - To Use/Experience

https://docs.sodafoundation.io/guides/user-guides/delfin

Quick Start - To Develop

https://docs.sodafoundation.io/guides/developer-guides/delfin

Demo videos - To get to know the capabilities better

https://www.youtube.com/watch?v=WtlxF7SHID4

Latest Releases

https://github.com/sodafoundation/delfin/releases

Support and Issues

https://github.com/sodafoundation/delfin/issues

Project Community

https://sodafoundation.io/slack/

How to contribute to this project?

Join https://sodafoundation.io/slack/ and share your interest in the ‘general’ channel

Check out the issues at https://github.com/sodafoundation/delfin/issues labelled ‘good first issue’, ‘help needed’, ‘help wanted’, ‘StartMyContribution’ or ‘SMC’

Project Roadmap

We want to build a unified, intelligent and scalable infrastructure management framework for resource management (config, add, remove, update), alert management and performance metrics management.

https://docs.sodafoundation.io

Join SODA Foundation

Website : https://sodafoundation.io

Slack : https://sodafoundation.io/slack/

Twitter : @sodafoundation

Mailinglist : https://lists.sodafoundation.io

delfin's People

Contributors

andrewliu83, anmolbansal1, anvithks, code4y, devanshjain7, divyanshukumarpcm, ghxiaobo, guankc, jiangyutan, jiuyunzhao, joseph-v, kumarashit, littlehotcx, liuxiaohuan-ghca, luojiagen, najmudheenct, navaneetha167, nikita15p, pravinranjan10, qinwang-murphy, sfzeng, skdwriting, skm26, sushanthakumar, tanjiangyu-ghca, thisisclark, utkarshshah0, wisererik, yuanyu-ghca, zhilong-xu

delfin's Issues

[Alert Manager] Clear alert at backend

@sushanthakumar commented on May 13, 2020, 10:46 AM UTC:

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

What happened:
The alert manager exposes a Swagger API to clear alerts. The user triggers a clear alert through this API.

What you expected to happen:
The clear alert should be triggered from the driver side for the backend device.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • NBP version:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

This issue was moved by kumarashit from sodafoundation/SIM-TempIssues#33.

Test alert processing flow with vmax backend

@sushanthakumar commented on May 12, 2020, 7:53 PM UTC:

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

What happened:
Alert manager is responsible for:

  • Listening to traps
  • Processing incoming traps and extracting meaningful info
  • Identifying the respective driver
  • Invoking the driver manager interface and getting the filled alert model
  • Exporting the model

What you expected to happen:
Once the alert manager functionalities are ready, they need to be tested with different backends, starting with EMC VMAX.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • NBP version:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

This issue was moved by kumarashit from sodafoundation/SIM-TempIssues#26.

Implement Distributed Lock for task management

@NajmudheenCT commented on May 21, 2020, 4:49 AM UTC:

Issue/Feature Description:
Need to have a locking mechanism to synchronize tasks.
Why this issue needs to be fixed / feature is needed (give scenarios or use cases):
When requests arrive simultaneously, the task manager should handle the synchronization problem by locking.
How to reproduce, in case of a bug:
NA
Other Notes / Environment Information: (Please give the env information, log link or any useful information for this issue)

This issue was moved by skdwriting from sodafoundation/SIM-TempIssues#45.
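
The issue does not say which coordinator a distributed lock would use, so the following is only a single-process sketch of the idea: one lock per storage id, so that two tasks for the same storage never run at the same time. The names `TaskLocks`, `hold` and `sync_task` are hypothetical. A real multi-node deployment would need a shared lock service instead of `threading.Lock`.

```python
import threading
from collections import defaultdict
from contextlib import contextmanager


class TaskLocks:
    """Hypothetical per-storage lock registry (single process only)."""

    def __init__(self):
        self._guard = threading.Lock()
        self._locks = defaultdict(threading.Lock)

    @contextmanager
    def hold(self, storage_id, timeout=5.0):
        # Look up (or create) the lock for this storage under a guard,
        # so two threads never create two locks for the same id.
        with self._guard:
            lock = self._locks[storage_id]
        if not lock.acquire(timeout=timeout):
            raise TimeoutError("storage %s is busy" % storage_id)
        try:
            yield
        finally:
            lock.release()


locks = TaskLocks()
counter = {"value": 0}


def sync_task(storage_id):
    # Only one task per storage id runs this block at a time.
    with locks.hold(storage_id):
        current = counter["value"]
        counter["value"] = current + 1


threads = [
    threading.Thread(target=sync_task, args=("storage-1",)) for _ in range(20)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter["value"])
```

The timeout turns a stuck task into a visible `TimeoutError` rather than an indefinite wait.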

Make sync_all API format to sync-all

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug
/kind feature

What happened:
current API format: /v1/storages/sync_all

it should be:
/v1/storages/sync-all

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Dolphin(release/branch) version:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

[task manager] Sync call stuck when rabbit-mq server is not running

Issue/Feature Description:
Sync and sync-all requests get stuck when rabbitmq-server is not installed or not running.

---- log snippet -----

2020-05-22 17:33:22.214 28040 ERROR oslo.messaging._drivers.impl_rabbit [req-46d5d2ad-c0d4-4b7e-a692-659bf2eb138b - - - - -] Connection failed: [Errno 111] ECONNREFUSED (retrying in 2.0 seconds): ConnectionRefusedError: [Errno 111] ECONNREFUSED
2020-05-22 17:33:24.222 28040 ERROR oslo.messaging._drivers.impl_rabbit [req-46d5d2ad-c0d4-4b7e-a692-659bf2eb138b - - - - -] Connection failed: [Errno 111] ECONNREFUSED (retrying in 4.0 seconds): ConnectionRefusedError: [Errno 111] ECONNREFUSED
2020-05-22 17:33:24.982 28040 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 111] ECONNREFUSED (retrying in 6.0 seconds): ConnectionRefusedError: [Errno 111] ECONNREFUSED

Why this issue needs to be fixed / feature is needed (give scenarios or use cases):
The user should get a failure response when rabbitmq-server is not reachable.
How to reproduce, in case of a bug:
stop rabbitmq-server and run /v1/storages//sync
Other Notes / Environment Information: (Please give the env information, log link or any useful information for this issue)
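
One way to return a failure instead of retrying forever is to probe the broker with a short TCP timeout before queuing the sync request. This is a sketch under assumptions, not the project's fix: `BrokerUnavailable` and `ensure_broker_reachable` are hypothetical names, and it uses only the standard `socket` module rather than any oslo.messaging setting.

```python
import socket


class BrokerUnavailable(Exception):
    """Hypothetical error the API layer could map to a failure response."""


def ensure_broker_reachable(host, port, timeout=2.0):
    # Open (and immediately close) a TCP connection to the broker.
    # A refused or timed-out connection raises OSError.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError as exc:
        raise BrokerUnavailable(
            "message broker %s:%s is not reachable: %s" % (host, port, exc)
        ) from exc
```

A caller could run `ensure_broker_reachable("localhost", 5672)` (5672 being the default AMQP port) before handing the request to the task manager, and translate `BrokerUnavailable` into an error response.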

Completed task list

@NajmudheenCT commented on May 13, 2020, 4:03 PM UTC:

[Resource manager] Register a Storage Device -Pravin
[Resource manager] Refactor Registration -Xulin
[Resource Manager] List storages -Naju
[Resource Manager] GET storage/ -Naju
[Project skeleton] Fannie
[Task manager] Sync storage - Pravin
[Crypto-framework] - Liyu
[validation framework] - Xulin
[DB framework] - Naju/Fannie
[Task manager Framework] - Pravin
[Driver manager] skeleton - Liyu
[Alert manager] Trap receiver framework - Sushantha

This issue was moved by kumarashit from sodafoundation/SIM-TempIssues#37.

VMAX volume details for volumes without Storage Group

The first issue is that some VMAX volumes may not have a Storage Group (SG) associated with them. The fields 'original_pool_id' and 'compressed' status, which are collected from the SG, are therefore not available for these volumes. We need to find a method to get these details from the VMAX backend.

The second issue arises when there are multiple SGs for a volume. We need to select the right SG to collect these fields.

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug
/kind feature

What happened:

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Dolphin(release/branch) version:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

[Alert manager] Load all custom mibs from configured path

@sushanthakumar commented on May 11, 2020, 10:03 AM UTC:

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

What happened:
The alert manager should be able to load all custom MIB files from a given path.

What you expected to happen:
The alert manager should be able to load all custom MIB files from a given path.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • NBP version:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

This issue was moved by kumarashit from sodafoundation/SIM-TempIssues#18.
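
As a rough illustration of the first half of this request, the sketch below only discovers candidate MIB files in a configured directory; handing them to an SNMP library is left out. The function name `find_mib_files` and the list of file extensions are assumptions, not taken from the issue.

```python
import os


def find_mib_files(mib_dir, extensions=(".mib", ".my", ".txt")):
    """Return sorted paths of files in mib_dir that look like MIB files."""
    if not os.path.isdir(mib_dir):
        raise FileNotFoundError("MIB path does not exist: %s" % mib_dir)
    found = []
    for name in sorted(os.listdir(mib_dir)):
        path = os.path.join(mib_dir, name)
        # Skip sub-directories and files with unrelated extensions.
        if os.path.isfile(path) and name.lower().endswith(extensions):
            found.append(path)
    return found
```

Raising on a missing directory makes a misconfigured path visible at startup instead of silently loading nothing.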

DB SqlAlchemy Exception handling

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:
Need to handle exceptions from all SQLAlchemy DB transaction queries.
What you expected to happen:
Catch all possible exceptions and handle them.
How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Dolphin(release/branch) version:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

Remove resource name from LIST API response

@NajmudheenCT commented on May 21, 2020, 4:55 AM UTC:

Issue/Feature Description:
Currently the LIST API response (v1/storages, v1/pools) contains the resource name as a key in the response. This needs to be removed to comply with the API spec.

Why this issue needs to be fixed / feature is needed (give scenarios or use cases):
Not aligned with the API spec
How to reproduce, in case of a bug:
GET /v1/pools Query
Other Notes / Environment Information: (Please give the env information, log link or any useful information for this issue)

This issue was moved by skdwriting from sodafoundation/SIM-TempIssues#46.

Create Virtual environment for deploying Dolphin

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

What happened:
Need to run dolphin in an isolated Python virtual environment, so that the dependencies required by the project are kept separate.
What you expected to happen:
Dolphin should work independently even if there are other Python projects that use different modules.
How to reproduce it (as minimally and precisely as possible):
NA

Anything else we need to know?:
NA
Environment:

  • Dolphin(release/branch) version:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

Alert model filling at vmax driver

@sushanthakumar commented on May 10, 2020, 3:08 PM UTC:

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

What happened:
The alert manager invokes the driver manager interface to fill the alert model from the respective driver.
Currently, alert model filling still has to be done for the EMC VMAX driver.

What you expected to happen:
Alert model filling is implemented for the EMC VMAX driver.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • NBP version:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

This issue was moved by skdwriting from sodafoundation/SIM-TempIssues#3.

Update API spec

@NajmudheenCT commented on May 23, 2020, 5:41 AM UTC:

Issue/Feature Description:
There are some changes to the API spec based on the implementation.

  1. Resource name needs to be in the LIST resource response.
  2. Search parameters need to be finalized.
    Suggestions:
  • Remove pool_id from volume search
  • Add original_pool_id, wwn, original_id.
  3. In pool, add 'original_id'.

Why this issue needs to be fixed / feature is needed (give scenarios or use cases):
As of now, the implementation and the API spec are not in sync.
How to reproduce, in case of a bug:

Other Notes / Environment Information: (Please give the env information, log link or any useful information for this issue)

This issue was moved by skdwriting from sodafoundation/SIM-TempIssues#48.

[Alert Manager] Mechanism of driver selection needs to change

@sushanthakumar commented on May 13, 2020, 8:22 AM UTC:

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:
The alert manager needs to identify the driver type from incoming trap information.
Currently the source IP is used, and it is mapped to access info.
But this mechanism is not a proper way of finding the driver.

What you expected to happen:
The alert manager needs to support a source IP configuration parameter for the alert source,
so that the incoming IP can be compared with the configured alert source IP and the use of access_info can be eliminated.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • NBP version:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

This issue was moved by kumarashit from sodafoundation/SIM-TempIssues#30.
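
The lookup proposed in this issue can be pictured as a plain mapping from configured alert source IP to storage. The sketch below is illustrative only: the table `ALERT_SOURCES`, the function `storage_for_trap` and the sample addresses are hypothetical, and a real implementation would read the mapping from the alert source configuration.

```python
class AlertSourceNotFound(Exception):
    """Raised when a trap arrives from an IP with no configured alert source."""


# Hypothetical alert source configuration: trap source IP -> storage id.
ALERT_SOURCES = {
    "192.0.2.10": "storage-1",
    "192.0.2.11": "storage-2",
}


def storage_for_trap(source_ip, alert_sources=ALERT_SOURCES):
    # Compare the incoming trap's source IP with the configured sources;
    # access_info is not consulted at all.
    try:
        return alert_sources[source_ip]
    except KeyError:
        raise AlertSourceNotFound(
            "no alert source configured for %s" % source_ip
        ) from None


print(storage_for_trap("192.0.2.10"))
```

Traps from unknown addresses are rejected explicitly, so they can be logged or dropped rather than matched to the wrong driver.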

[Alert manager] configuration handling at backend

@sushanthakumar commented on May 13, 2020, 10:44 AM UTC:

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

What happened:
The alert manager exposes Swagger APIs to set up the alert source configuration.
This involves user configuration (community, USM user) and trap source configuration.
This configuration needs to be done at the backend as well.

What you expected to happen:
This configuration needs to be done at the backend.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • NBP version:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

This issue was moved by kumarashit from sodafoundation/SIM-TempIssues#32.

Handling errors received from the device and channelising them

If the device throws an error, we should report it to the client/end user in a suitable manner.
Consider an example: the system receives an error saying the device is unreachable due to a network issue. If the end user does not get this error and only sees 'Backend issue', the user may not be able to do RCA and take action.

Ok. My main concern was to catch any exception and propagate it to the upper layer too. If the client wants to do RCA based on these, it will help.

Originally posted by @kumarashit in #70
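
The propagation described above could look roughly like this: the low-level error is wrapped in a specific exception that keeps the original cause, and the API layer turns it into a response the client can act on. All class and function names here (`BackendError`, `BackendUnreachable`, `collect_from_device`, `to_api_error`) and the response keys are hypothetical.

```python
class BackendError(Exception):
    """Hypothetical base error surfaced to API clients."""

    def __init__(self, message, cause=None):
        super().__init__(message)
        self.cause = cause


class BackendUnreachable(BackendError):
    """The device could not be reached over the network."""


def collect_from_device(connect):
    # `connect` stands in for any driver call that talks to the device.
    try:
        return connect()
    except ConnectionError as exc:
        # Keep the specific reason instead of a generic 'Backend issue'.
        raise BackendUnreachable(
            "device is unreachable: %s" % exc, cause=exc
        ) from exc


def to_api_error(exc):
    # Shape of the error body is an assumption for illustration.
    return {"error_code": type(exc).__name__, "error_msg": str(exc)}
```

Because the wrapped exception is raised with `from exc`, the original traceback stays attached for anyone doing root cause analysis from the logs.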

[VMAX Driver] Update Resource status based on common model

@NajmudheenCT commented on May 21, 2020, 2:38 PM UTC:

Issue/Feature Description:
The resource status model needs to be finalized, and all resource statuses need to be mapped according to the SODA resource model.
Why this issue needs to be fixed / feature is needed (give scenarios or use cases):
To make the model generic
How to reproduce, in case of a bug:

Other Notes / Environment Information: (Please give the env information, log link or any useful information for this issue)

This issue was moved by skdwriting from sodafoundation/SIM-TempIssues#47.

Handle multi node use cases in Driver Manager

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

What happened:
The driver manager has to consider the consistency of all interfaces in a multi-node scenario, such as:
1. Registering a device
2. Updating access info
3. Un-registering storage
4. Collecting resources
What you expected to happen:

How to reproduce it (as minimally and precisely as possible):
NA

Anything else we need to know?:
NA
Environment:

  • Dolphin(release/branch) version:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

A mechanism is needed to ensure that delete tasks are executed successfully

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug

/kind feature

What happened:
When we delete a storage device, we also need to delete the pools, volumes and other resources related to this device.
But there is a scenario where deleting the device succeeds and deleting a pool fails, which leaves the pool data unreachable.
So we need a mechanism to ensure that all the delete tasks are executed successfully.

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Dolphin(release/branch) version:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:
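
One possible mechanism, sketched under assumptions rather than taken from the project: delete the dependent resources first, retry each step a few times, and remove the storage record only when every dependent step has succeeded, so nothing is orphaned. `delete_storage`, `_run_with_retry` and the step callables are hypothetical.

```python
import time


def _run_with_retry(step, storage_id, retries, delay):
    # Try one cleanup step up to `retries` times; report whether it succeeded.
    for attempt in range(retries):
        try:
            step(storage_id)
            return True
        except Exception:
            if attempt < retries - 1:
                time.sleep(delay)
    return False


def delete_storage(storage_id, child_steps, delete_record, retries=3, delay=0.0):
    """Delete child resources first; drop the storage record only if all succeeded."""
    failed = []
    for name, step in child_steps:
        if not _run_with_retry(step, storage_id, retries, delay):
            failed.append(name)
    if not failed:
        delete_record(storage_id)
    # A non-empty list tells the caller which cleanups must be retried later.
    return failed
```

If the returned list is not empty, the storage record still exists, so the leftover pools or volumes remain reachable and the delete can be attempted again.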

Exception format is not correct

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug

/kind feature

What happened:
The exception format is not correct, for example:

2020-05-09 18:03:14.387 12876 INFO dolphin.api.common.wsgi [req-87d23f1a-41d5-4794-92a5-fcab7404570e - - - - -] HTTP exception thrown: Storage %(id)s could not be found.

What you expected to happen:
The message should contain the actual storage id instead of the literal %(id)s.

How to reproduce it (as minimally and precisely as possible):
Call db.storage_get(ctxt, id) with a storage id that does not exist.

Anything else we need to know?:

Environment:

  • Dolphin(release/branch) version:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:
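
A literal `%(id)s` in the log usually means the message template was never interpolated with its keyword arguments. The sketch below shows the general pattern with hypothetical classes (`ExampleError`, `StorageNotFound`); it is not the project's exception module, only the message text is taken from the log line above.

```python
class ExampleError(Exception):
    """Hypothetical base class that formats msg_fmt with keyword arguments."""

    msg_fmt = "An unknown exception occurred."

    def __init__(self, **kwargs):
        try:
            # Fill placeholders such as %(id)s from the keyword arguments.
            message = self.msg_fmt % kwargs
        except KeyError:
            # A required placeholder is missing: fall back to the raw template.
            message = self.msg_fmt
        self.msg = message
        super().__init__(message)


class StorageNotFound(ExampleError):
    msg_fmt = "Storage %(id)s could not be found."


print(StorageNotFound(id="87d23f1a"))
# prints: Storage 87d23f1a could not be found.
```

If the exception is raised without `id=...`, or the raw `msg_fmt` is logged instead of the formatted message, the placeholder appears verbatim, which matches the reported output.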

Handle the optimization issues in pool update

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug
/kind feature

What happened:
Handle the optimization issues in pool update:

query = _pool_get_query(context, session)
result = query.filter_by(id=pool_id).update(values)

if not result:
    raise exception.PoolNotFound(id=pool_id)

we can add

result = _pool_get(context, pool_id, session).update(values)

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Dolphin(release/branch) version:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

[Alert Manager] Adding snmp config to engine needs update

@sushanthakumar commented on May 13, 2020, 8:35 AM UTC:

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:
Alert manager needs to add snmp config to snmp engine for receiving traps.
Currently it is added from stub values during the start

What you expected to happen:
During alert config handling, alert manager needs to add snmp config to snmp engine dynamically

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • NBP version:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

This issue was moved by kumarashit from sodafoundation/SIM-TempIssues#31.

Create Indexing for DB to optimize queries

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

What happened:
Indexing is not done in the DB now. Analyze how indexing can help to optimize queries.
What you expected to happen:
Optimized query performance.
How to reproduce it (as minimally and precisely as possible):
NA

Anything else we need to know?:
NA
Environment:

  • Dolphin(release/branch) version:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:
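
To get a feel for what an index changes, the following sketch uses the standard `sqlite3` module with an in-memory database. The `volumes` table, its columns and the index name are assumptions made for illustration; the issue does not say which tables or database engine are involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE volumes (id TEXT PRIMARY KEY, storage_id TEXT, name TEXT)"
)
conn.executemany(
    "INSERT INTO volumes VALUES (?, ?, ?)",
    [("vol-%d" % i, "storage-%d" % (i % 10), "volume %d" % i) for i in range(1000)],
)


def query_plan(connection):
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail.
    rows = connection.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM volumes WHERE storage_id = ?",
        ("storage-3",),
    ).fetchall()
    return " ".join(row[-1] for row in rows)


plan_before = query_plan(conn)
conn.execute("CREATE INDEX idx_volumes_storage_id ON volumes (storage_id)")
plan_after = query_plan(conn)

print(plan_before)
print(plan_after)
```

Before the index exists the plan is a full table scan; afterwards it names `idx_volumes_storage_id`. The exact wording of the plan text differs between SQLite versions.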

Incorrect behaviour of log info message

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug
/kind feature

What happened:

I have two points in mind:
1. In file /dolphin/drivers/api.py, line #59: LOG.info("Storage was found successfully.")

This line will always appear, even when the storage is not found successfully.

2. Please follow the Python convention of a maximum of 79 characters per line.

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Dolphin(release/branch) version:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:
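
The first point could be addressed by logging success only after the result has been checked. This is a sketch with hypothetical names (`discover_storage`, a `driver` object with a `get_storage` method); it does not reproduce the code in /dolphin/drivers/api.py.

```python
import logging

LOG = logging.getLogger("example")


def discover_storage(driver, context):
    # `driver.get_storage` is a stand-in for the real driver call.
    storage = driver.get_storage(context)
    if not storage:
        LOG.error("Storage discovery failed.")
        return None
    # Reached only when the driver actually returned a storage.
    LOG.info("Storage was found successfully.")
    return storage
```

With the check in place, the success message and the failure message can no longer both describe the same call.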
