cuebook / cueobserve
Timeseries Anomaly detection and Root Cause Analysis on data in SQL data warehouses and databases
Home Page: https://cueobserve.cuebook.ai
License: Apache License 2.0
Is your feature request related to a problem? Please describe.
I'd love to give CueObserve a try but our warehouse is currently in MS SQL Server.
Describe the solution you'd like
Add ClickHouse as a supported data source.
I will wait until #52 is resolved and then try to implement a PR.
Add a test case asserting that the number of anomaly objects created for an anomaly run is greater than 0.
There can be scenarios where no anomaly object is created because the % contribution or minimum value condition is not met.
Originally posted by satkalra1 August 5, 2021
Hey,
I was trying CueObserve on the test dataset to understand how it works, but after defining the anomaly I found this particular error:
Traceback (most recent call last):
  File "pandas/_libs/lib.pyx", line 2062, in pandas._libs.lib.maybe_convert_numeric
ValueError: Unable to parse string "null"

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/code/ops/tasks.py", line 49, in anomalyDetectionJob
    dimValsData = prepareAnomalyDataframes(datasetDf, anomalyDefinition.dataset.timestampColumn, anomalyDefinition.metric, anomalyDefinition.dimension, anomalyDefinition.top)
  File "/code/access/utils.py", line 17, in prepareAnomalyDataframes
    datasetDf[metricCol] = pd.to_numeric(datasetDf[metricCol])
  File "/opt/venv/lib/python3.7/site-packages/pandas/core/tools/numeric.py", line 155, in to_numeric
    values, set(), coerce_numeric=coerce_numeric
  File "pandas/_libs/lib.pyx", line 2099, in pandas._libs.lib.maybe_convert_numeric
ValueError: Unable to parse string "null" at position 1715
Could you please tell me where things are going wrong?
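The traceback above shows that the dataset's metric column contains the literal string "null", which pd.to_numeric cannot parse. A minimal sketch of a workaround, cleaning the column before conversion (the column name is illustrative, not from the CueObserve codebase):

```python
import pandas as pd

df = pd.DataFrame({"value": ["10", "null", "12.5"]})

# errors="coerce" turns unparseable strings like "null" into NaN
# instead of raising ValueError
df["value"] = pd.to_numeric(df["value"], errors="coerce")

# drop (or fill) the rows that could not be parsed
clean = df.dropna(subset=["value"])
```

Until something like this lands in `prepareAnomalyDataframes`, the practical fix is to cast or filter out NULLs in the dataset SQL itself.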
Daily anomaly cards currently show 45 days of historical data.
The user should be able to input this interval on the card UI.
There are at least 2 ways to take this user input:
last 90 days
90
Either shows the latest 90 data points. Refer discussion #139.
It would help us debug issues like #147.
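Whichever input form is chosen, both can be normalized to a data-point count before querying. A sketch of that normalization (the function name is hypothetical, not part of CueObserve):

```python
import re

def parse_interval(text: str) -> int:
    """Normalize user input like 'last 90 days' or '90' to a point count.

    Hypothetical helper: extracts the first integer it finds, so both
    input styles on the card UI map to the same value.
    """
    match = re.search(r"\d+", text)
    if not match:
        raise ValueError(f"no number found in {text!r}")
    return int(match.group())
```

For example, `parse_interval("last 90 days")` and `parse_interval("90")` both yield 90.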
Slack webhooks for
Implement RCA for anomalies generated via rules, not just Prophet.
Pick up after #45 is closed.
I can create a connection but get an error when I try to create a dataset.
SELECT
  date_trunc('hour', date_add(day, date_diff(day, toDate('2014-03-23'), today()), EventTime)) AS NewEventTime,
  MobilePhoneModel,
  count(1) AS Hits
FROM hits_v1
GROUP BY
  date_trunc('hour', date_add(day, date_diff(day, toDate('2014-03-23'), today()), EventTime)), MobilePhoneModel
ORDER BY date_trunc('hour', date_add(day, date_diff(day, toDate('2014-03-23'), today()), EventTime))
Add a loader while the connection is being tested; otherwise the user doesn't get any feedback.
An anomaly definition consists of:
Measure
Type
Rule
[Dimension Explosion]

Type = Rule, Prophet

Rule =
Percentage Change >= X, where X is a number >= 1
Value operator Y [and Z], where operator = >, >=, <, <=, between, not between, and Y, Z are of type double
High/Low
Originally posted by jithendra945 August 5, 2021
Whenever an anomaly definition run errors out, the Anomalies page goes blank.
Describe the bug
CueObserve scheduled tasks are getting stuck in the Celery queue and not completing (no error is thrown).
To Reproduce
Steps to reproduce the behavior:
Those tasks are not completing even after 1 day.
For debugging, I tried executing 5 scheduled tasks and 1 scheduled task separately (different cron intervals), and everything works fine. But when there are 6 scheduled tasks with the same cron interval, they get stuck and don't finish.
Expected behavior
It should complete the 6 accepted tasks and pull the next tasks
Thanks
It would actually be good to have the sample dataset used in the demo in the repo or somewhere accessible.
The time it takes to do RCA depends on the number of dimensions in the dataset and the available infra.
User should be able to abort a running RCA.
Originally posted by sdepablos September 15, 2021
Right now CueObserve is using Celery to schedule tasks (plus Redis to save the config). This requires having the system always up for the scheduler to work. In my case, I'd prefer to run this as a "scale to zero" application, either via Cloud Run or App Engine. To that effect, my idea would be to define the schedule in Google Cloud Scheduler, which would then trigger the recalculation without the need to have a system always up. What would be the API endpoint to trigger a task?
Add minimum CPU and memory requirements to the Installation page.
I found one more issue:
I added a schedule,
then added it to an anomaly definition,
afterwards went to Schedules and deleted that schedule.
Now the Anomaly Definitions page is empty; it gets a 500 error because that schedule no longer exists.
Originally posted by @jithendra945 in #61
Screen | Search in Columns
---|---
Anomalies | Dataset, Granularity, Measure, Filters
Anomaly Definitions | Dataset, Granularity, Anomaly Definition
Datasets | Dataset Name, Connection, Granularity
Dataset Granularity
Runtime granularity for RCA
Run data quality check after fetching data for a dataset and before running anomaly detection job.
Metric column must not have any string value. Refer #81.
If a dataset metric contains a string value, throw error in anomaly definition and do not run the anomaly definition.
Handle NULL, NaN values for metrics in pandas dataframe
Better handling of insufficient data
Describe the bug
Facing this issue intermittently where Celery gives the following error on a scheduled run. I believe this is happening because of a race condition due to asyncio. We have used the single-pod setup only; even with that configuration this issue pops up randomly.
Traceback (most recent call last):
  File "/code/ops/tasks/anomalyDetectionTasks.py", line 85, in anomalyDetectionJob
    result = _detectionJobs.get()
  File "/opt/venv/lib/python3.7/site-packages/celery/result.py", line 680, in get
    on_interval=on_interval,
  File "/opt/venv/lib/python3.7/site-packages/celery/result.py", line 799, in join_native
    on_message, on_interval):
  File "/opt/venv/lib/python3.7/site-packages/celery/backends/asynchronous.py", line 150, in iter_native
    for _ in self._wait_for_pending(result, no_ack=no_ack, **kwargs):
  File "/opt/venv/lib/python3.7/site-packages/celery/backends/asynchronous.py", line 267, in _wait_for_pending
    on_interval=on_interval):
  File "/opt/venv/lib/python3.7/site-packages/celery/backends/asynchronous.py", line 54, in drain_events_until
    yield self.wait_for(p, wait, timeout=interval)
  File "/opt/venv/lib/python3.7/site-packages/celery/backends/asynchronous.py", line 63, in wait_for
    wait(timeout=timeout)
  File "/opt/venv/lib/python3.7/site-packages/celery/backends/redis.py", line 152, in drain_events
    message = self._pubsub.get_message(timeout=timeout)
  File "/opt/venv/lib/python3.7/site-packages/redis/client.py", line 3617, in get_message
    response = self.parse_response(block=False, timeout=timeout)
  File "/opt/venv/lib/python3.7/site-packages/redis/client.py", line 3505, in parse_response
    response = self._execute(conn, conn.read_response)
  File "/opt/venv/lib/python3.7/site-packages/redis/client.py", line 3479, in _execute
    return command(*args, **kwargs)
  File "/opt/venv/lib/python3.7/site-packages/redis/connection.py", line 756, in read_response
    raise response
redis.exceptions.ResponseError: wrong number of arguments for 'subscribe' command
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Is there any workaround we can use to avoid this issue? Please help.
docs for issue #48
docs for #45
Analyze anomalous data point for dimension values with minimum X% contribution.
user feedback
First question on top of my mind: how to host it somewhere and schedule these anomaly detection jobs in a self-serve manner.
Once scheduled - someone else can come in and navigate the results super easily. Could not find anything in the documentation but will play around a bit more.
I should be able to run anomaly detection on metrics that are not additive.
Below are a few examples of aggregate functions in a dataset's SQL GROUP BY that can then be supported:
COUNT(DISTINCT)
MIN()
MAX()
AVG()
Custom Percentage calculations
Dataset SQL can have zero or 1+ dimensions. Since data cannot be rolled up, anomaly definition cannot define anomaly explosion. Instead, dataset SQL itself defines the extent of explosion.
e.g. Say a dataset has 2 dimensions and 1 metric - State, Brand, ConversionRate. This means anomaly objects must be created for each state+brand combination. We cannot have an anomaly definition for a single dimension or no dimension.
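The reason the data cannot be rolled up is that averages of averages disagree with the overall average. A small sketch with illustrative numbers, using only the standard library:

```python
from statistics import mean

# Illustrative conversion rates per state+brand cell
cell_rates = {
    "CA+Acme": [0.10, 0.20],
    "NY+Acme": [0.50],
}

# Rolling up: average the per-cell averages
cell_averages = [mean(rates) for rates in cell_rates.values()]
average_of_averages = mean(cell_averages)

# Ground truth: average over all underlying points
all_points = [r for rates in cell_rates.values() for r in rates]
overall_average = mean(all_points)

# The two disagree (0.325 vs ~0.267), which is why non-additive
# metrics like ConversionRate cannot be rolled up across dimensions.
```

This is why the dataset SQL itself must define the extent of explosion for non-additive metrics.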
Analyze anomalous data point for dimension values with minimum X% contribution, where X can be specified by the user.
Currently X = 1.
Is your feature request related to a problem? Please describe.
I'd love to give CueObserve a try but our warehouse is currently in MS SQL Server.
Describe the solution you'd like
Add SQL Server as a supported data source.
Additional context
I'd be interested in making the necessary pull request, but I'd like some high level advice on what might be needed.
Is it as simple as adding the necessary sqlserver.py in https://github.com/cuebook/CueObserve/tree/main/api/dbConnections ?
Anomaly qualification rules to decide whether an anomaly should be published or not.
Should these rules be defined at the global level, dataset level or at the anomaly definition level?
How do we merge duplicate anomalies resulting from multiple anomaly definitions on the same measure?
Support OR / AND when there are multiple rules.
Threshold metrics
Threshold operators = >, >=, <, <=, between, not between
Min % Contribution X, where X is a number between 1 and 100
Min Avg Value Y, where Y is of data type double, i.e. avg(metric) >= Y
We can leverage the existing NODE_ENV for checking if the app is running in development mode, see example code below:
if (process.env.NODE_ENV === "development") {
  // Development settings
  this.host = "http://localhost:8000";
} else {
  // Production settings
  this.host = "";
}
this.base_url = this.host + basePath + "/api/";
Originally posted by pjpringle August 30, 2021
Not everyone has Slack, especially in workplace environments. Provide support to plug in REST calls to notify of anomalies.
related #42
for #64
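A generic REST notification could post a JSON payload to any user-configured webhook URL. A minimal sketch using only the standard library; the payload shape and function names are assumptions for illustration, not a format CueObserve defines:

```python
import json
import urllib.request

def build_anomaly_payload(measure: str, value: float, expected: float) -> bytes:
    """Build a generic JSON payload for an anomaly notification.

    Hypothetical shape; adjust fields to whatever the receiver expects.
    """
    return json.dumps({
        "measure": measure,
        "value": value,
        "expected": expected,
    }).encode("utf-8")

def notify(webhook_url: str, payload: bytes) -> int:
    """POST the payload to an arbitrary REST endpoint, not just Slack."""
    req = urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

The same `notify` call could then back both Slack-style webhooks and plain REST receivers.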
RCA currently executes on the latest anomaly data point. As a user, I should be able to execute RCA on any given data point.
Refer discussion #139.
I tried to configure Postgres as given in the document, but it is on my local system.
When I run it, it gives a 500 internal server error: could not connect to Postgres server.
I can connect to Postgres via psql on my system, but this project is not connecting.
Does anyone know the solution?
Originally posted by @jithendra945 in #55