Coder Social home page Coder Social logo

anomaly-detection-ensemble-learning-'s Introduction

NAB Data Corpus

Data are ordered, timestamped, single-valued metrics. All data files contain anomalies, unless otherwise noted.

alt-text alt-text

Real data

  • realAWSCloudwatch/

    AWS server metrics as collected by the AmazonCloudwatch service. Example metrics include CPU Utilization, Network Bytes In, and Disk Read Bytes.

  • realAdExchange/

    Online advertisement clicking rates, where the metrics are cost-per-click (CPC) and cost per thousand impressions (CPM). One of the files is normal, without anomalies.

  • realKnownCause/

    This is data for which we know the anomaly causes; no hand labeling.

    • ambient_temperature_system_failure.csv: The ambient temperature in an office setting.
    • cpu_utilization_asg_misconfiguration.csv: From Amazon Web Services (AWS) monitoring CPU usage โ€“ i.e. average CPU usage across a given cluster. When usage is high, AWS spins up a new machine, and uses fewer machines when usage is low.
    • ec2_request_latency_system_failure.csv: CPU usage data from a server in Amazon's East Coast datacenter. The dataset ends with complete system failure resulting from a documented failure of AWS API servers. There's an interesting story behind this data in the Numenta blog.
    • machine_temperature_system_failure.csv: Temperature sensor data of an internal component of a large, industrial mahcine. The first anomaly is a planned shutdown of the machine. The second anomaly is difficult to detect and directly led to the third anomaly, a catastrophic failure of the machine.
    • nyc_taxi.csv: Number of NYC taxi passengers, where the five anomalies occur during the NYC marathon, Thanksgiving, Christmas, New Years day, and a snow storm. The raw data is from the NYC Taxi and Limousine Commission. The data file included here consists of aggregating the total number of taxi passengers into 30 minute buckets.
    • rogue_agent_key_hold.csv: Timing the key holds for several users of a computer, where the anomalies represent a change in the user.
    • rogue_agent_key_updown.csv: Timing the key strokes for several users of a computer, where the anomalies represent a change in the user.
  • realRogueAgent/

    This data represents computer usage patterns for different users, where an anomaly may occur with a rogue user of the computer.

  • realTraffic/

    Real time traffic data from the Twin Cities Metro area in Minnesota, collected by the Minnesota Department of Transportation. Included metrics include occupancy, speed, and travel time from specific sensors.

  • realTweets/

    A collection of Twitter mentions of large publicly-traded companies such as Google and IBM. The metric value represents the number of mentions for a given ticker symbol every 5 minutes.

Artificial data

  • artificialNoAnomaly/

    Artifically-generated data without any anomalies.

  • artificialWithAnomaly/

    Artifically-generated data with varying types of anomalies.

anomaly-detection-ensemble-learning-'s People

Contributors

astreak avatar

Stargazers

 avatar  avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.