
stanford-futuredata / macrobase


MacroBase: A Search Engine for Fast Data

Home Page: http://macrobase.stanford.edu/

License: Apache License 2.0

Shell 0.12% Java 83.61% HTML 1.13% JavaScript 1.66% CSS 0.05% Python 8.09% Jupyter Notebook 3.96% ANTLR 0.93% Dockerfile 0.05% TSQL 0.39%

macrobase's Introduction

MacroBase


MacroBase is a data analytics tool that prioritizes attention in large datasets using machine learning.

For tutorials, documentation, papers, and additional information, please refer to our project website: http://macrobase.stanford.edu/.

macrobase's People

Contributors

bryankate, deepakn94, edgan8, fabuzaid21, ferhatelmas, jarnoux, jialinding, kexinrong, kraftp, mamikonyana, pbailis, raininthesun, rockie-yang, sahaana, shroman, vayu-stanford, viveksjain


macrobase's Issues

Tutorial feedback

I just went through the tutorial and got the demo working. Some feedback:

  • Getting started with a postgres server was by far the most difficult and annoying part. If you have a guide you like to use I think a link would be helpful; if not here's what I had to do:
    • Download and install PostgreSQL
    • Add the binaries to my PATH manually: export PATH=/Library/PostgreSQL/9.6/bin:$PATH
    • Create a database cluster folder somewhere - initdb postgres
    • Start the server: pg_ctl -D postgres start
  • Then the Python scripts didn't immediately work:
    • Had to pip install psycopg2, which admittedly is fine but I would've been comforted to see that step called out in the instructions
    • There were difficult-to-debug problems with psycopg2 and libssl.dylib or something, and it took me a while to find that I should do export DYLD_FALLBACK_LIBRARY_PATH=/Library/PostgreSQL/9.6/lib.
  • The image in step 1 for the MacroBase Exploratory GUI didn't match what eventually worked for me, which was Database URL: localhost. Also, the image selects from the table sensor_data, while it should be sensor_data_demo (as in the text above it).

Cheers!

Feature Extraction refactoring discussion thread

Old Pipeline:

Ingest -> Scoping -> Outlier Detection -> Summarization

Basic idea: The outlier detector both:

  1. scores data
  2. marks points as outliers based on a percentile over scores (e.g., the 99th percentile)

New Pipeline:

Ingest -> Feature extraction -> Context selection -> Density estimation -> Summarization

The main change in this pipeline is the separation of Feature extraction from Outlier detection.

Feature Extraction:

Basic idea: performs arbitrary transformation on input data

Input: data point (metrics in R^N, attributes)
Output: data point (metrics in R^M, attributes)

Density Estimation

Basic idea: performs density estimation on the transformed metrics

Input: data point (metrics in R^M, attributes)
Output: data point with probability (metrics in R^M, attributes, probability density estimate)

Code changes required:

  1. Most existing outlier detectors become feature extraction methods.
  2. KDE becomes default density estimation technique; we can also convert the percentile estimators for backwards compatibility.
  3. Will need utility methods for manipulating metric points (see discussion below)

Some discussion points:

  1. Do we want to update point metrics in-place? It may be beneficial to have the original metrics attached, but this may also be expensive, especially if we chain feature extraction. Maybe we make this a method that we can overload later.
  2. If we perform dimensionality reduction (e.g., streaming PCA), do we do it before or after scoping?
  3. What is our upgrade path for batch versus streaming? Do we want to maintain separate code paths? We don't have a streaming version of scoping or KDE right now, so do we deprecate the streaming execution engine for the time being?
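
The proposed split between feature extraction and density estimation could be sketched roughly as below. This is only an illustration: `DataPoint`, `ScoredPoint`, `FeatureExtractor`, `DensityEstimator`, and `PipelineSketch` are hypothetical names, not MacroBase's actual classes. The sketch creates a new point per transformation instead of updating metrics in place, which is one possible answer to discussion point 1.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration only; not MacroBase's real API.
final class DataPoint {
    final double[] metrics;         // point in R^N before extraction, R^M after
    final List<String> attributes;  // attributes are carried through unchanged

    DataPoint(double[] metrics, List<String> attributes) {
        this.metrics = metrics;
        this.attributes = attributes;
    }
}

// A transformed point together with its probability density estimate.
final class ScoredPoint {
    final DataPoint point;
    final double density;

    ScoredPoint(DataPoint point, double density) {
        this.point = point;
        this.density = density;
    }
}

// Arbitrary transformation: metrics in R^N -> metrics in R^M.
interface FeatureExtractor {
    DataPoint extract(DataPoint in);
}

// Density estimation over the transformed metrics.
interface DensityEstimator {
    void train(List<DataPoint> data);
    double density(DataPoint p);
}

final class PipelineSketch {
    // Runs feature extraction, then density estimation, on already-ingested points.
    static List<ScoredPoint> run(List<DataPoint> input,
                                 FeatureExtractor extractor,
                                 DensityEstimator estimator) {
        List<DataPoint> transformed = new ArrayList<>();
        for (DataPoint p : input) {
            transformed.add(extractor.extract(p)); // original point is left untouched
        }
        estimator.train(transformed);
        List<ScoredPoint> scored = new ArrayList<>();
        for (DataPoint p : transformed) {
            scored.add(new ScoredPoint(p, estimator.density(p)));
        }
        return scored;
    }
}
```

Chaining several extractors would only require applying them in sequence before `train`; whether the original metrics should stay attached is left open here, as in the discussion.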

fix FPTree.getSupport()

Corrected code:

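        // Visit every node that holds this item by following the node-link chain.
        while(pathHead != null) {
            FPTreeNode curNode = pathHead;
            int itemsToFind = plist.size();

            // Climb from this node toward the root, counting down as pattern items are found.
            while(curNode != null) {
                log.debug("{} {}", itemsToFind, plist);
                if(pattern.contains(curNode.getItem())) {
                    itemsToFind -= 1;
                }

                // Every item was found on this path, so add the count stored at the current node.
                if(itemsToFind == 0) {
                    count += curNode.count;
                    break;
                }

                curNode = curNode.getParent();
            }
            pathHead = pathHead.getNextLink();
        }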

Address already in use [reopening]

Hello,

Sorry for posting this again, but the fix presented in #170 did not work for me. Here is the error after applying the fix:

* Failed to parse configuration at: server.applicationConnectors.[0]; Could not resolve type id 'http' into a subtype of [simple type, class io.dropwizard.jetty.ConnectorFactory]: known type ids = [ConnectorFactory]
 at [Source: N/A; line: -1, column: -1] (through reference chain: macrobase.conf.MacroBaseConf["server"]->io.dropwizard.server.DefaultServerFactory["applicationConnectors"]->java.util.ArrayList[0])

Update and scoring timing

In batch mode, we can break down the detector time into update and scoring, as we have in the streaming case.

In streaming mode, we can separate the time spent periodically updating the summaries and the time spent updating the models.

"training" and "scoring" are probably better names than what we have now.

Server configuration lacks knobs of batch configuration

Right now, the GUI server config file supports a small subset of the config options of the standalone batch configuration. It would be nice to consolidate these configuration files/classes.

There is also an opportunity to refactor the BaseAnalyzer class.

NullPointerException on cmtDatasetComplexStreaming sometimes

Exception in thread "main" java.lang.NullPointerException
    at macrobase.analysis.summary.itemset.StreamingFPGrowth$StreamingFPTree.sortByNewOrder(StreamingFPGrowth.java:615)
    at macrobase.analysis.summary.itemset.StreamingFPGrowth$StreamingFPTree.access$900(StreamingFPGrowth.java:36)
    at macrobase.analysis.summary.itemset.StreamingFPGrowth.restructureTree(StreamingFPGrowth.java:686)
    at macrobase.analysis.summary.itemset.StreamingFPGrowth.decayAndResetFrequentItems(StreamingFPGrowth.java:709)
    at macrobase.analysis.summary.itemset.ExponentiallyDecayingEmergingItemsets.updateModels(ExponentiallyDecayingEmergingItemsets.java:108)
    at macrobase.analysis.summary.itemset.ExponentiallyDecayingEmergingItemsets.updateModelsNoDecay(ExponentiallyDecayingEmergingItemsets.java:74)
    at macrobase.analysis.periodic.RetrainingProcedure.updatePeriod(RetrainingProcedure.java:29)
    at macrobase.analysis.periodic.TupleBasedRetrainer.updatePeriod(TupleBasedRetrainer.java:35)
    at macrobase.analysis.periodic.AbstractPeriodicUpdater.updateIfNecessary(AbstractPeriodicUpdater.java:12)
    at macrobase.analysis.StreamingAnalyzer.analyzeOnePass(StreamingAnalyzer.java:189)
    at macrobase.runtime.standalone.streaming.MacroBaseStreamingCommand.run(MacroBaseStreamingCommand.java:78)
    at macrobase.runtime.standalone.streaming.MacroBaseStreamingCommand.run(MacroBaseStreamingCommand.java:15)
    at io.dropwizard.cli.ConfiguredCommand.run(ConfiguredCommand.java:77)
    at io.dropwizard.cli.Cli.run(Cli.java:70)
    at io.dropwizard.Application.run(Application.java:80)
    at macrobase.runtime.MacroBaseServer.main(MacroBaseServer.java:26)
    at macrobase.MacroBase.main(MacroBase.java:38)

Abstract trained model of an outlier detector for reuse

I'd like to reuse a model trained for context C1 in context C2, i.e., I don't want to run public abstract void train(List<Datum> data); in public abstract class BatchTrainScore every time for a context.

I propose the following to achieve this:

  • Add an interface, public interface BatchModel, and another training method, public abstract void train(BatchModel batchModel);, to public abstract class BatchTrainScore. BatchModel is meant to contain all the information about a model; for MADDetector, for example, it contains the median and the MAD.
  • For every implementation of BatchTrainScore (MAD, KDE), add an implementation of BatchModel that contains the model specific to it, as well as an implementation of public abstract void train(BatchModel batchModel);.

Here is an example implementation of MAD:

https://github.com/stanford-futuredata/macrobase/blob/trainedModelAbstraction/src/main/java/macrobase/analysis/stats/BatchTrainScore.java

https://github.com/stanford-futuredata/macrobase/blob/trainedModelAbstraction/src/main/java/macrobase/analysis/stats/MAD.java
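
For readers without access to the branch, the idea can be sketched as follows. This is not the code in the linked files: `MADModel`, `MADSketch`, and `getModel()` are hypothetical names, the sketch works on plain doubles rather than `Datum`, and the score is left unscaled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Interface named in the proposal; leaving it empty is an assumption of this sketch.
interface BatchModel {}

// Hypothetical model holder for MAD: the two statistics the proposal mentions.
final class MADModel implements BatchModel {
    final double median;
    final double mad;

    MADModel(double median, double mad) {
        this.median = median;
        this.mad = mad;
    }
}

// Simplified stand-in for a BatchTrainScore implementation.
final class MADSketch {
    private MADModel model;

    // Train from data: compute the median and the median absolute deviation.
    void train(List<Double> data) {
        double median = medianOf(data);
        List<Double> deviations = new ArrayList<>();
        for (double x : data) {
            deviations.add(Math.abs(x - median));
        }
        model = new MADModel(median, medianOf(deviations));
    }

    // Train from a previously fitted model: no pass over the data is needed.
    void train(BatchModel batchModel) {
        model = (MADModel) batchModel;
    }

    MADModel getModel() {
        return model;
    }

    // Distance from the median in units of MAD (infinite or NaN if MAD is zero).
    double score(double x) {
        return Math.abs(x - model.median) / model.mad;
    }

    private static double medianOf(List<Double> values) {
        List<Double> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        int n = sorted.size();
        return n % 2 == 1
                ? sorted.get(n / 2)
                : (sorted.get(n / 2 - 1) + sorted.get(n / 2)) / 2.0;
    }
}
```

A detector for context C2 would then call `train(trainedForC1.getModel())` instead of training on C2's data again.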

ValueError when running scripts/py_analysis/analyze_apriori.py

I have python 2.7.10, pandas 0.18.1, numpy 1.8.0rc1.
When I ran
python scripts/py_analysis/analyze_apriori.py
it said
Traceback (most recent call last):
File "script/py_analysis/analyze_apriori.py", line 4, in <module>
from sklearn import linear_model, cluster
File "/Library/Python/2.7/site-packages/sklearn/__init__.py", line 57, in <module>
from .base import clone
File "/Library/Python/2.7/site-packages/sklearn/base.py", line 11, in <module>
from .utils.fixes import signature
File "/Library/Python/2.7/site-packages/sklearn/utils/__init__.py", line 10, in <module>
from .murmurhash import murmurhash3_32
File "numpy.pxd", line 155, in init sklearn.utils.murmurhash (sklearn/utils/murmurhash.c:5029)
ValueError: numpy.dtype has the wrong size, try recompiling

Is it the version of the module that's causing the error? Which versions am I supposed to use? Thanks!

NullPointerException in StreamingFPTree.sortByNewOrder()

Exception in thread "main" java.lang.NullPointerException
    at macrobase.analysis.summary.itemset.StreamingFPGrowth$StreamingFPTree.sortByNewOrder(StreamingFPGrowth.java:615)
    at macrobase.analysis.summary.itemset.StreamingFPGrowth$StreamingFPTree.access$900(StreamingFPGrowth.java:36)
    at macrobase.analysis.summary.itemset.StreamingFPGrowth.restructureTree(StreamingFPGrowth.java:686)
    at macrobase.analysis.summary.itemset.StreamingFPGrowth.decayAndResetFrequentItems(StreamingFPGrowth.java:709)
    at macrobase.analysis.summary.itemset.ExponentiallyDecayingEmergingItemsets.updateModels(ExponentiallyDecayingEmergingItemsets.java:108)
    at macrobase.analysis.summary.itemset.ExponentiallyDecayingEmergingItemsets.updateModelsNoDecay(ExponentiallyDecayingEmergingItemsets.java:74)
    at macrobase.analysis.periodic.RetrainingProcedure.updatePeriod(RetrainingProcedure.java:29)
    at macrobase.analysis.periodic.TupleBasedRetrainer.updatePeriod(TupleBasedRetrainer.java:35)
    at macrobase.analysis.periodic.AbstractPeriodicUpdater.updateIfNecessary(AbstractPeriodicUpdater.java:12)
    at macrobase.analysis.StreamingAnalyzer.analyzeOnePass(StreamingAnalyzer.java:189)
    at macrobase.runtime.standalone.streaming.MacroBaseStreamingCommand.run(MacroBaseStreamingCommand.java:78)
    at macrobase.runtime.standalone.streaming.MacroBaseStreamingCommand.run(MacroBaseStreamingCommand.java:15)
    at io.dropwizard.cli.ConfiguredCommand.run(ConfiguredCommand.java:77)
    at io.dropwizard.cli.Cli.run(Cli.java:70)
    at io.dropwizard.Application.run(Application.java:80)
    at macrobase.runtime.MacroBaseServer.main(MacroBaseServer.java:26)
    at macrobase.MacroBase.main(MacroBase.java:38)

This is on the cmtDatasetComplexStreaming workload. (same version that is on origin/master)

IndexOutOfBoundsException

Exception in thread "main" java.lang.IndexOutOfBoundsException: toIndex = 1008
        at java.util.ArrayList.subListRangeCheck(ArrayList.java:1004)
        at java.util.ArrayList.subList(ArrayList.java:996)
        at macrobase.analysis.outlier.MinCovDet.findKClosest(MinCovDet.java:169)
        at macrobase.analysis.outlier.MinCovDet.train(MinCovDet.java:217)
        at macrobase.analysis.StreamingAnalyzer.analyzeOnePass(StreamingAnalyzer.java:183)
        at macrobase.runtime.standalone.streaming.MacroBaseStreamingCommand.run(MacroBaseStreamingCommand.java:81)
        at macrobase.runtime.standalone.streaming.MacroBaseStreamingCommand.run(MacroBaseStreamingCommand.java:15)
        at io.dropwizard.cli.ConfiguredCommand.run(ConfiguredCommand.java:77)
        at io.dropwizard.cli.Cli.run(Cli.java:70)
        at io.dropwizard.Application.run(Application.java:80)
        at macrobase.runtime.MacroBaseServer.main(MacroBaseServer.java:26)
        at macrobase.MacroBase.main(MacroBase.java:38)
Exception in thread "main" java.lang.IndexOutOfBoundsException: toIndex = 1004
        at java.util.ArrayList.subListRangeCheck(ArrayList.java:1004)
        at java.util.ArrayList.subList(ArrayList.java:996)
        at macrobase.analysis.outlier.MinCovDet.findKClosest(MinCovDet.java:169)
        at macrobase.analysis.outlier.MinCovDet.train(MinCovDet.java:217)
        at macrobase.analysis.StreamingAnalyzer.analyzeOnePass(StreamingAnalyzer.java:183)
        at macrobase.runtime.standalone.streaming.MacroBaseStreamingCommand.run(MacroBaseStreamingCommand.java:81)
        at macrobase.runtime.standalone.streaming.MacroBaseStreamingCommand.run(MacroBaseStreamingCommand.java:15)
        at io.dropwizard.cli.ConfiguredCommand.run(ConfiguredCommand.java:77)
        at io.dropwizard.cli.Cli.run(Cli.java:70)
        at io.dropwizard.Application.run(Application.java:80)
        at macrobase.runtime.MacroBaseServer.main(MacroBaseServer.java:26)
        at macrobase.MacroBase.main(MacroBase.java:38)
Exception in thread "main" java.lang.IndexOutOfBoundsException: toIndex = 1007
        at java.util.ArrayList.subListRangeCheck(ArrayList.java:1004)
        at java.util.ArrayList.subList(ArrayList.java:996)
        at macrobase.analysis.outlier.MinCovDet.findKClosest(MinCovDet.java:169)
        at macrobase.analysis.outlier.MinCovDet.train(MinCovDet.java:217)
        at macrobase.analysis.StreamingAnalyzer.analyzeOnePass(StreamingAnalyzer.java:183)
        at macrobase.runtime.standalone.streaming.MacroBaseStreamingCommand.run(MacroBaseStreamingCommand.java:81)
        at macrobase.runtime.standalone.streaming.MacroBaseStreamingCommand.run(MacroBaseStreamingCommand.java:15)
        at io.dropwizard.cli.ConfiguredCommand.run(ConfiguredCommand.java:77)
        at io.dropwizard.cli.Cli.run(Cli.java:70)
        at io.dropwizard.Application.run(Application.java:80)
        at macrobase.runtime.MacroBaseServer.main(MacroBaseServer.java:26)
        at macrobase.MacroBase.main(MacroBase.java:38)

On cmtDatasetSimple, alphaMCD = 1.0

Simplify item counting

Basic idea:

Store min count from previous epoch
Item in table? Increment
Item not in table? Min + count

At the end of an epoch: decay the counts, record the min, compute the 1/thresh highest counts, and discard the remaining items.

This is SpaceSaving but with O(1) update and O(items*log(k)) maintenance (normally O(log(k)) update) and unlimited space within an epoch (compare to O(k)).
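
One possible reading of this note, as a rough sketch: `EpochItemCounter` and its methods are hypothetical names, items are plain strings, and the minimum is recorded over the counters that survive pruning (the note does not say which set the minimum is taken over). The end-of-epoch step uses a size-k min-heap, which gives the O(items*log(k)) maintenance cost mentioned above.

```java
import java.util.AbstractMap;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch of the epoch-based counter described above.
final class EpochItemCounter {
    private final Map<String, Double> counts = new HashMap<>();
    private double prevEpochMin = 0.0; // min count recorded at the end of the previous epoch

    // Expected O(1) update: increment known items, seed new items at min + count.
    void observe(String item, double count) {
        Double current = counts.get(item);
        if (current != null) {
            counts.put(item, current + count);
        } else {
            counts.put(item, prevEpochMin + count);
        }
    }

    // End of epoch: decay all counts, keep the k highest, record the smallest kept count.
    void endEpoch(double decay, int k) {
        Comparator<Map.Entry<String, Double>> byCount =
                (a, b) -> Double.compare(a.getValue(), b.getValue());
        PriorityQueue<Map.Entry<String, Double>> topK = new PriorityQueue<>(byCount);
        for (Map.Entry<String, Double> e : counts.entrySet()) {
            topK.add(new AbstractMap.SimpleEntry<>(e.getKey(), e.getValue() * decay));
            if (topK.size() > k) {
                topK.poll(); // evict the current smallest count
            }
        }
        counts.clear();
        prevEpochMin = topK.isEmpty() ? 0.0 : topK.peek().getValue();
        for (Map.Entry<String, Double> e : topK) {
            counts.put(e.getKey(), e.getValue());
        }
    }

    double count(String item) {
        return counts.getOrDefault(item, 0.0);
    }

    int size() {
        return counts.size();
    }
}
```

Here k plays the role of 1/thresh. Within an epoch the table can grow without bound, matching the "unlimited space within an epoch" trade-off.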

Error when running bin/server.sh: relation "sensor_data" does not exist

When I ran bin/server.sh, it gave me:
ERROR [2016-05-28 18:53:09,969] io.dropwizard.jersey.errors.LoggingExceptionMapper: Error handling a request: 885c3f33ae9ed907
! org.postgresql.util.PSQLException: ERROR: relation "sensor_data" does not exist
! Position: 15
! at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2198) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1927) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:562) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:406) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:286) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at macrobase.ingest.SQLIngester.getSchema(SQLIngester.java:100) ~[classes/:na]
! at macrobase.runtime.resources.SchemaResource.getSchema(SchemaResource.java:27) ~[classes/:na]

I'm not quite sure what went wrong. PostgreSQL printed the following:
$ postgres --config_file=/usr/localpath/var/postgres/postgresql.conf
LOG: MultiXact member wraparound protections are now enabled
LOG: database system is ready to accept connections
LOG: autovacuum launcher started
ERROR: relation "sensor_data" does not exist at character 15
STATEMENT: SELECT * from sensor_data LIMIT 1

I suppose the correct query should be SELECT * from sensor_data_demo LIMIT 10;. But why did this code in SQLIngester.java not get the right string?
Statement stmt = connection.createStatement();
String sql = String.format("%s LIMIT 1", removeSqlJunk(removeLimit(baseQuery)));
ResultSet rs = stmt.executeQuery(sql);
Hard-coding the table name didn't help much, since other places that use the SQL query have the same problem.

Thanks!

Issue with demo / Null Pointer Exception

Found two issues with the tutorial:

  1. The tutorial says to run python script/load_demo.py, but load_demo.py is now located under tools/.
  2. I got the following error when trying to run the tutorial:

0.2883.95 Safari/537.36" 2
0:0:0:0:0:0:0:1 - - [16/Jan/2017:00:35:35 +0000] "GET /fonts/glyphicons-halflings-regular.woff2 HTTP/1.1" 304 - "http://localhost:8080/css/bootstrap.min.css" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36" 3
ERROR [2017-01-16 00:35:35,663] macrobase.runtime.resources.SchemaResource: An error occurred while processing a request:
! java.lang.NullPointerException: null
! at macrobase.runtime.resources.SchemaResource.getSchema(SchemaResource.java:42) ~[classes/:na]
! at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_102]
! at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_102]
! at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_102]
! at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_102]
! at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81) [macrobase-assembly-0.1-SNAPSHOT.jar:na]
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144) [macrobase-assembly-0.1-SNAPSHOT.jar:na]
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161) [macrobase-assembly-0.1-SNAPSHOT.jar:na]
! at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:205) [macrobase-assembly-0.1-SNAPSHOT.jar:na]
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99) [macrobase-assembly-0.1-SNAPSHOT.jar:na]
! at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389) [macrobase-assembly-0.1-SNAPSHOT.jar:na]
! at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347) [macrobase-assembly-0.1-SNAPSHOT.jar:na]
! at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102) [macrobase-assembly-0.1-SNAPSHOT.jar:na]
! at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326) [macrobase-assembly-0.1-SNAPSHOT.jar:na]
! at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271) [macrobase-assembly-0.1-SNAPSHOT.jar:na]
! at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267) [macrobase-assembly-0.1-SNAPSHOT.jar:na]
! at org.glassfish.jersey.internal.Errors.process(Errors.java:315) [macrobase-assembly-0.1-SNAPSHOT.jar:na]
! at org.glassfish.jersey.internal.Errors.process(Errors.java:297) [macrobase-assembly-0.1-SNAPSHOT.jar:na]
! at org.glassfish.jersey.internal.Errors.process(Errors.java:267) [macrobase-assembly-0.1-SNAPSHOT.jar:na]
! at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317) [macrobase-assembly-0.1-SNAPSHOT.jar:na]
! at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305) [macrobase-assembly-0.1-SNAPSHOT.jar:na]
! at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154) [macrobase-assembly-0.1-SNAPSHOT.jar:na]
! at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:471) [macrobase-assembly-0.1-SNAPSHOT.jar:na]
! at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:425) [macrobase-assembly-0.1-SNAPSHOT.jar:na]
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:383) [macrobase-assembly-0.1-SNAPSHOT.jar:na]
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:336) [macrobase-assembly-0.1-SNAPSHOT.jar:na]

Static Threshold for different outlier detectors

@pbailis @mamikonyana

I understand we've added more outlier detectors, and each one of them has a score() function that returns a score for every datum. The scores are used by outlier classifiers. With the percentile classifier, this is no problem. With the static threshold classifier, however, I am not sure what the right value of macrobase.analysis.classify.outlierStaticThreshold is for each outlier detector.

For example, for MAD, the recommended threshold is 3. Can we have a recommended threshold for each detector? We can encode the default threshold values in MacroBaseConf.java; or we can simply document it in docs/parameter_desc.md
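
To make the difference concrete, here is a rough sketch of the two classification styles over an array of scores. `ThresholdSketch` and its method names are hypothetical and not MacroBase's classifiers; the percentile cutoff uses the nearest-rank definition, which is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of static-threshold versus percentile classification.
final class ThresholdSketch {
    // Static threshold: the cutoff is fixed, so it must match the detector's score scale.
    static List<Integer> aboveStaticThreshold(double[] scores, double threshold) {
        List<Integer> outliers = new ArrayList<>();
        for (int i = 0; i < scores.length; i++) {
            if (scores[i] > threshold) {
                outliers.add(i);
            }
        }
        return outliers;
    }

    // Percentile: the cutoff is derived from the scores themselves (nearest-rank method).
    // Requires a non-empty array and a percentile in (0, 100].
    static double percentileCutoff(double[] scores, double percentile) {
        double[] sorted = scores.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(percentile * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }
}
```

The percentile cutoff adapts to whatever scale a detector's scores happen to have, which is why only the static threshold needs a per-detector recommendation.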

Set contextual attributes via conf

DataLoader pulls metrics and attributes from the conf object: https://github.com/stanford-futuredata/macrobase/blob/master/src/main/java/macrobase/ingest/DataLoader.java#L32

However, the contextual attributes are passed in via getData (https://github.com/stanford-futuredata/macrobase/blob/master/src/main/java/macrobase/ingest/DataLoader.java#L55).

@raininthesun: is there a reason why this needs to be the case? Can we refactor to load the contextual attributes like the rest?

Refactoring discussion: Configuration

Right now, we have config file spaghetti. We have a large number of config parameters, each of which is set manually for:

This leads to pretty ugly code like:

StreamingAnalyzer analyzer = new StreamingAnalyzer();

Do we want to keep this code as is or should we make the config objects generic and pass them into each analyzer?

In batch, avoid counting twice.

We already count the outlier transactions to find out those with minimum ratio. We should pass this into the FPTree constructor to avoid re-computing the counts.

I can fix. This is a note for me.

Exception when using EM_GMM for contextual outlier detection

@mamikonyana

Exception in thread "main" org.apache.commons.math3.linear.SingularMatrixException: matrix is singular
    at org.apache.commons.math3.linear.EigenDecomposition$Solver.getInverse(EigenDecomposition.java:553)
    at org.apache.commons.math3.distribution.MultivariateNormalDistribution.<init>(MultivariateNormalDistribution.java:131)
    at org.apache.commons.math3.distribution.MultivariateNormalDistribution.<init>(MultivariateNormalDistribution.java:82)
    at macrobase.analysis.stats.distribution.MultivariateNormal.<init>(MultivariateNormal.java:18)
    at macrobase.analysis.stats.mixture.ExpectMaxGMM.trainEM(ExpectMaxGMM.java:97)
    at macrobase.analysis.stats.mixture.ExpectMaxGMM.train(ExpectMaxGMM.java:33)
    at macrobase.analysis.transform.BatchScoreFeatureTransform.consume(BatchScoreFeatureTransform.java:37)
    at macrobase.analysis.contextualoutlier.ContextualOutlierDetector.contextualOutlierDetection(ContextualOutlierDetector.java:436)
    at macrobase.analysis.contextualoutlier.ContextualOutlierDetector.searchContextualOutliers(ContextualOutlierDetector.java:128)
    at macrobase.analysis.pipeline.BasicContextualBatchedPipeline.run(BasicContextualBatchedPipeline.java:58)
    at macrobase.runtime.command.MacroBasePipelineCommand.run(MacroBasePipelineCommand.java:37)
    at macrobase.runtime.command.MacroBasePipelineCommand.run(MacroBasePipelineCommand.java:17)
    at io.dropwizard.cli.ConfiguredCommand.run(ConfiguredCommand.java:77)
    at io.dropwizard.cli.Cli.run(Cli.java:70)
    at io.dropwizard.Application.run(Application.java:80)
    at macrobase.runtime.MacroBaseServer.main(MacroBaseServer.java:20)
    at macrobase.MacroBase.main(MacroBase.java:33)

Benchmark effect of model parameters

Independent Variables

In rough order, we want to sweep each of the following variables:

To evaluate the model behavior:

  • h in MCD in both streaming and batch (e.g., 1, .95, .75, .5, .25, .1, .05, .01)
  • stoppingDelta in MCD in both streaming and batch (e.g., 1e0, 1e-1, 1e-2, 1e-3, 1e-4)
  • sample reservoir size in streaming (e.g., 100, 1000, 10000, 100000)
  • refreshPeriod in streaming (e.g., 10 tuples, 100, 1000, 10000, 100000)

To evaluate the summary behavior:

  • support in both streaming and batch
  • ratio in both streaming and batch
  • outlier percentile in both streaming and batch

(Note: stoppingDelta will require a configuration file change.)

Dependent Variables

The figures of interest we want to measure (and eventually plot):

  • Wall-clock performance for each component and overall
  • Number of itemsets
  • For the model behaviors, several percentiles of outlier scores (e.g., 50th, 75th, 95th, 99th).
    • This allows us to determine how the distribution of scores changes as a result of changes to the underlying model.
    • It's unclear where it's best to implement this. One thought is to do it optionally at the end of a given run.

Exception Running Contextual outlier detection using KDE/TreeKDE/BinnedKDE

@mamikonyana

Exception in thread "main" org.apache.commons.math3.exception.MathUnsupportedOperationException: unsupported operation
    at org.apache.commons.math3.linear.EigenDecomposition.getSquareRoot(EigenDecomposition.java:382)
    at macrobase.analysis.stats.KDE.calculateBandwidthAncillaries(KDE.java:152)
    at macrobase.analysis.stats.KDE.setBandwidth(KDE.java:91)
    at macrobase.analysis.stats.KDE.setBandwidth(KDE.java:139)
    at macrobase.analysis.stats.KDE.train(KDE.java:158)
    at macrobase.analysis.transform.BatchScoreFeatureTransform.consume(BatchScoreFeatureTransform.java:38)
    at macrobase.analysis.contextualoutlier.ContextualOutlierDetector.contextualOutlierDetection(ContextualOutlierDetector.java:437)
    at macrobase.analysis.contextualoutlier.ContextualOutlierDetector.searchContextualOutliers(ContextualOutlierDetector.java:129)
    at macrobase.analysis.pipeline.BasicContextualBatchedPipeline.run(BasicContextualBatchedPipeline.java:55)
    at macrobase.runtime.command.MacroBasePipelineCommand.run(MacroBasePipelineCommand.java:37)
    at macrobase.runtime.command.MacroBasePipelineCommand.run(MacroBasePipelineCommand.java:17)
    at io.dropwizard.cli.ConfiguredCommand.run(ConfiguredCommand.java:77)
    at io.dropwizard.cli.Cli.run(Cli.java:70)
    at io.dropwizard.Application.run(Application.java:80)
    at macrobase.runtime.MacroBaseServer.main(MacroBaseServer.java:20)
    at macrobase.MacroBase.main(MacroBase.java:33)

NullPointerException again :(

Exception in thread "main" java.lang.NullPointerException
        at macrobase.analysis.summary.itemset.StreamingFPGrowth$StreamingFPTree.sortByNewOrder(StreamingFPGrowth.java:628)
        at macrobase.analysis.summary.itemset.StreamingFPGrowth$StreamingFPTree.access$900(StreamingFPGrowth.java:36)
        at macrobase.analysis.summary.itemset.StreamingFPGrowth.restructureTree(StreamingFPGrowth.java:690)
        at macrobase.analysis.summary.itemset.StreamingFPGrowth.decayAndResetFrequentItems(StreamingFPGrowth.java:713)
        at macrobase.analysis.summary.itemset.ExponentiallyDecayingEmergingItemsets.updateModels(ExponentiallyDecayingEmergingItemsets.java:108)
        at macrobase.analysis.summary.itemset.ExponentiallyDecayingEmergingItemsets.updateModelsNoDecay(ExponentiallyDecayingEmergingItemsets.java:74)
        at macrobase.analysis.periodic.RetrainingProcedure.updatePeriod(RetrainingProcedure.java:29)
        at macrobase.analysis.periodic.TupleBasedRetrainer.updatePeriod(TupleBasedRetrainer.java:35)
        at macrobase.analysis.periodic.AbstractPeriodicUpdater.updateIfNecessary(AbstractPeriodicUpdater.java:12)
        at macrobase.analysis.StreamingAnalyzer.analyzeOnePass(StreamingAnalyzer.java:189)
        at macrobase.runtime.standalone.streaming.MacroBaseStreamingCommand.run(MacroBaseStreamingCommand.java:81)
        at macrobase.runtime.standalone.streaming.MacroBaseStreamingCommand.run(MacroBaseStreamingCommand.java:15)
        at io.dropwizard.cli.ConfiguredCommand.run(ConfiguredCommand.java:77)
        at io.dropwizard.cli.Cli.run(Cli.java:70)
        at io.dropwizard.Application.run(Application.java:80)
        at macrobase.runtime.MacroBaseServer.main(MacroBaseServer.java:26)
        at macrobase.MacroBase.main(MacroBase.java:38)

On fedDisbursementsSimple, with inputReservoirSize = 100

MinCovDet should not sort

FindKClosest can be done in a single pass using a heap instead of sorting. O(nlogk) instead of O(nlogn). Easy fix given the right heap implementation, and something we should definitely look into.
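
A minimal sketch of the heap-based approach, using `java.util.PriorityQueue` as the heap. `KClosest.kClosest` is a hypothetical helper, and plain absolute distance on scalars stands in for whatever distance MinCovDet actually uses.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper: single-pass selection of the k values closest to a center.
final class KClosest {
    // Returns the k values nearest to `center`, in no particular order.
    // If k exceeds the number of values, all values are returned.
    static List<Double> kClosest(List<Double> values, double center, int k) {
        if (k <= 0) {
            return new ArrayList<>();
        }
        // Max-heap on distance: the farthest of the current candidates sits at the head.
        Comparator<Double> fartherFirst =
                (a, b) -> Double.compare(Math.abs(b - center), Math.abs(a - center));
        PriorityQueue<Double> heap = new PriorityQueue<>(k, fartherFirst);
        for (double v : values) {
            heap.add(v);
            if (heap.size() > k) {
                heap.poll(); // drop the farthest candidate, keeping the heap at size k
            }
        }
        return new ArrayList<>(heap);
    }
}
```

Each of the n insertions costs O(log k) because the heap never holds more than k + 1 elements, giving O(n log k) overall without sorting the full list.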

Address already in use.

I was recently introduced to MacroBase and wanted to give it a try. I followed the steps in the tutorial, but when I run bin/server.sh I get an exception:


INFO  [2016-10-29 07:09:45,783] io.dropwizard.server.DefaultServerFactory: Registering admin handler with root path prefix: /
INFO  [2016-10-29 07:09:45,837] org.eclipse.jetty.setuid.SetUIDListener: Opened application@435871cb{HTTP/1.1}{0.0.0.0:8080}
WARN  [2016-10-29 07:09:45,841] org.eclipse.jetty.util.component.AbstractLifeCycle: FAILED org.eclipse.jetty.server.Server@45cff11c: java.lang.RuntimeException: java.net.BindException: Address already in use
! java.net.BindException: Address already in use
! at sun.nio.ch.Net.bind0(Native Method) ~[na:1.8.0_111]
! at sun.nio.ch.Net.bind(Net.java:433) ~[na:1.8.0_111]
! at sun.nio.ch.Net.bind(Net.java:425) ~[na:1.8.0_111]
! at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223) ~[na:1.8.0_111]
! at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) ~[na:1.8.0_111]
! at org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:321) ~[macrobase-jar-with-dependencies.jar:na]
! at org.eclipse.jetty.setuid.SetUIDListener.lifeCycleStarting(SetUIDListener.java:200) ~[macrobase-jar-with-dependencies.jar:na]
! ... 8 common frames omitted
! Causing: java.lang.RuntimeException: java.net.BindException: Address already in use
! at org.eclipse.jetty.setuid.SetUIDListener.lifeCycleStarting(SetUIDListener.java:213) ~[macrobase-jar-with-dependencies.jar:na]
! at org.eclipse.jetty.util.component.AbstractLifeCycle.setStarting(AbstractLifeCycle.java:188) ~[macrobase-jar-with-dependencies.jar:na]
! at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:67) ~[macrobase-jar-with-dependencies.jar:na]
! at io.dropwizard.cli.ServerCommand.run(ServerCommand.java:43) [macrobase-jar-with-dependencies.jar:na]
! at io.dropwizard.cli.EnvironmentCommand.run(EnvironmentCommand.java:41) [macrobase-jar-with-dependencies.jar:na]
! at io.dropwizard.cli.ConfiguredCommand.run(ConfiguredCommand.java:77) [macrobase-jar-with-dependencies.jar:na]
! at io.dropwizard.cli.Cli.run(Cli.java:70) [macrobase-jar-with-dependencies.jar:na]
! at io.dropwizard.Application.run(Application.java:80) [macrobase-jar-with-dependencies.jar:na]
! at macrobase.runtime.MacroBaseServer.main(MacroBaseServer.java:21) [classes/:na]
ERROR [2016-10-29 07:09:45,841] io.dropwizard.cli.ServerCommand: Unable to start server, shutting down
! java.net.BindException: Address already in use
! at sun.nio.ch.Net.bind0(Native Method) ~[na:1.8.0_111]
! at sun.nio.ch.Net.bind(Net.java:433) ~[na:1.8.0_111]
! at sun.nio.ch.Net.bind(Net.java:425) ~[na:1.8.0_111]
! at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223) ~[na:1.8.0_111]
! at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) ~[na:1.8.0_111]
! at org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:321) ~[macrobase-jar-with-dependencies.jar:na]
! at org.eclipse.jetty.setuid.SetUIDListener.lifeCycleStarting(SetUIDListener.java:200) ~[macrobase-jar-with-dependencies.jar:na]
! ... 8 common frames omitted
! Causing: java.lang.RuntimeException: java.net.BindException: Address already in use
! at org.eclipse.jetty.setuid.SetUIDListener.lifeCycleStarting(SetUIDListener.java:213) ~[macrobase-jar-with-dependencies.jar:na]
! at org.eclipse.jetty.util.component.AbstractLifeCycle.setStarting(AbstractLifeCycle.java:188) ~[macrobase-jar-with-dependencies.jar:na]
! at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:67) ~[macrobase-jar-with-dependencies.jar:na]
! at io.dropwizard.cli.ServerCommand.run(ServerCommand.java:43) ~[macrobase-jar-with-dependencies.jar:na]
! at io.dropwizard.cli.EnvironmentCommand.run(EnvironmentCommand.java:41) [macrobase-jar-with-dependencies.jar:na]
! at io.dropwizard.cli.ConfiguredCommand.run(ConfiguredCommand.java:77) [macrobase-jar-with-dependencies.jar:na]
! at io.dropwizard.cli.Cli.run(Cli.java:70) [macrobase-jar-with-dependencies.jar:na]
! at io.dropwizard.Application.run(Application.java:80) [macrobase-jar-with-dependencies.jar:na]
! at macrobase.runtime.MacroBaseServer.main(MacroBaseServer.java:21) [classes/:na]
Exception in thread "main" java.lang.RuntimeException: java.net.BindException: Address already in use
	at org.eclipse.jetty.setuid.SetUIDListener.lifeCycleStarting(SetUIDListener.java:213)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.setStarting(AbstractLifeCycle.java:188)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:67)
	at io.dropwizard.cli.ServerCommand.run(ServerCommand.java:43)
	at io.dropwizard.cli.EnvironmentCommand.run(EnvironmentCommand.java:41)
	at io.dropwizard.cli.ConfiguredCommand.run(ConfiguredCommand.java:77)
	at io.dropwizard.cli.Cli.run(Cli.java:70)
	at io.dropwizard.Application.run(Application.java:80)
	at macrobase.runtime.MacroBaseServer.main(MacroBaseServer.java:21)
Caused by: java.net.BindException: Address already in use
	at sun.nio.ch.Net.bind0(Native Method)
	at sun.nio.ch.Net.bind(Net.java:433)
	at sun.nio.ch.Net.bind(Net.java:425)
	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
	at org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:321)
	at org.eclipse.jetty.setuid.SetUIDListener.lifeCycleStarting(SetUIDListener.java:200)
	... 8 more

When I look at localhost:8080 I don't see anything running, but localhost:8081 is being used by McAfee. Could that be the issue?
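For context, Dropwizard by default serves the application on port 8080 and the admin interface on port 8081, so a process already holding 8081 would plausibly produce this BindException. A quick way to check which port is taken is to try binding to it. The sketch below is only a diagnostic helper with a hypothetical name (`PortCheck`); it is not part of MacroBase.

```java
import java.io.IOException;
import java.net.ServerSocket;

final class PortCheck {
    /** Returns true if a server socket can currently be bound to the given port. */
    static boolean isFree(int port) {
        try (ServerSocket socket = new ServerSocket(port)) {
            // Binding succeeded, so nothing else is listening on this port.
            return true;
        } catch (IOException e) {
            // Typically a BindException: the address is already in use.
            return false;
        }
    }
}
```

Calling `PortCheck.isFree(8080)` and `PortCheck.isFree(8081)` before starting the server would show which of the two is occupied. If it is 8081, either stop the other program or move the admin connector to another port through the `server` section of the Dropwizard YAML configuration (assuming the MacroBase config file exposes it).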

Limitation of the Iterator-based pipeline design?

It seems that the iterator-based pipeline design allows the output of a node to be accessed only once?

For example, for the "DataIngester -> FeatureTransform -> OutlierClassifier -> BatchSummarizer" pipeline in BasicBatchedPipeline, once the results of OutlierClassifier are consumed by BatchSummarizer, I cannot access them again by calling its next().
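One possible workaround, sketched below under assumptions: wrap a node's output iterator in an adapter that remembers every element it hands out, so that later consumers can replay them. `ReplayableOutput` is a hypothetical name and this is not how MacroBase pipelines are actually wired; it only illustrates the idea on a plain `java.util.Iterator`.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class ReplayableOutput<T> implements Iterable<T> {
    private final Iterator<T> source;            // one-shot upstream iterator
    private final List<T> seen = new ArrayList<>(); // everything pulled so far

    ReplayableOutput(Iterator<T> source) {
        this.source = source;
    }

    @Override
    public Iterator<T> iterator() {
        return new Iterator<T>() {
            private int pos = 0;

            @Override
            public boolean hasNext() {
                return pos < seen.size() || source.hasNext();
            }

            @Override
            public T next() {
                if (pos == seen.size()) {
                    // Nothing cached at this position yet: pull from upstream and remember it.
                    seen.add(source.next());
                }
                return seen.get(pos++);
            }
        };
    }
}
```

The trade-off is memory: the adapter holds the node's whole output once it has been consumed, which may be acceptable for batch pipelines but not for unbounded streams. The sketch is also not thread-safe.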

Design discussion: how to handle JSON

We're going to need to run over some JSON files in the near future. Say we have a bunch of files of the form:

userLikes1.json
userLikes2.json
userLikes3.json
userComments1.json
userComments2.json
userComments3.json
clicks1.json
clicks2.json
clicks3.json

Each type of file has a different schema, and imagine we have thousands of these files, totaling hundreds of GB.

Topic for discussion: how do we want to parse these and formulate queries?

Re-scale metrics upon ingest.

For consistency across datasets and possibly improved convergence time, re-scale our metrics to a [0, 1] scale upon ingest. To start, let's do (x-min(X))/(max(X)-min(X)).
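A minimal sketch of the proposed min-max rescaling for a single metric column. The class name `MinMaxScaler` is hypothetical, and mapping a constant column (max equal to min) to 0.0 is an assumption made here to avoid division by zero; the issue does not say how that case should be handled.

```java
final class MinMaxScaler {
    /** Rescales values to [0, 1] using (x - min(X)) / (max(X) - min(X)). */
    static double[] rescale(double[] values) {
        if (values.length == 0) {
            return new double[0];
        }
        double min = values[0];
        double max = values[0];
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double range = max - min;
        double[] scaled = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            // Assumption: a constant column has no spread, so map it to 0.0.
            scaled[i] = range == 0.0 ? 0.0 : (values[i] - min) / range;
        }
        return scaled;
    }
}
```

Note that this needs the minimum and maximum before any value can be scaled, so it takes two passes over a batch. For streaming ingest the bounds would have to be known in advance or estimated.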

Error while processing index page request

I installed MacroBase on my Linux box, started the server, and tried to access the GUI. The DB table sensor_data_demo was created and contains the data, but I get the following error on the base page itself.
[Screenshot: macrobase]

fatal org.postgresql.util.PSQLException after clicking "more rows" a few times in the GUI

To reproduce the problem:

  1. open up the sensor_data_demo
  2. click "sample"
  3. click "more rows" a few times (~8)

Here is the stacktrace from the terminal:

ERROR [2016-05-04 04:14:52,113] org.apache.tomcat.jdbc.pool.ConnectionPool: Unable to create initial connections of pool.
! org.postgresql.util.PSQLException: FATAL: remaining connection slots are reserved for non-replication superuser connections
! at org.postgresql.core.v3.ConnectionFactoryImpl.readStartupMessages(ConnectionFactoryImpl.java:589) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:192) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:64) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.postgresql.jdbc2.AbstractJdbc2Connection.<init>(AbstractJdbc2Connection.java:143) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.postgresql.jdbc3.AbstractJdbc3Connection.<init>(AbstractJdbc3Connection.java:29) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.postgresql.jdbc3g.AbstractJdbc3gConnection.<init>(AbstractJdbc3gConnection.java:21) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.postgresql.jdbc4.AbstractJdbc4Connection.<init>(AbstractJdbc4Connection.java:38) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.postgresql.jdbc4.Jdbc4Connection.<init>(Jdbc4Connection.java:24) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.postgresql.Driver.makeConnection(Driver.java:412) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.postgresql.Driver.connect(Driver.java:280) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.apache.tomcat.jdbc.pool.PooledConnection.connectUsingDriver(PooledConnection.java:307) ~[tomcat-jdbc-8.0.28.jar:na]
! at org.apache.tomcat.jdbc.pool.PooledConnection.connect(PooledConnection.java:200) ~[tomcat-jdbc-8.0.28.jar:na]
! at org.apache.tomcat.jdbc.pool.ConnectionPool.createConnection(ConnectionPool.java:708) [tomcat-jdbc-8.0.28.jar:na]
! at org.apache.tomcat.jdbc.pool.ConnectionPool.borrowConnection(ConnectionPool.java:642) [tomcat-jdbc-8.0.28.jar:na]
! at org.apache.tomcat.jdbc.pool.ConnectionPool.init(ConnectionPool.java:464) [tomcat-jdbc-8.0.28.jar:na]
! at org.apache.tomcat.jdbc.pool.ConnectionPool.<init>(ConnectionPool.java:141) [tomcat-jdbc-8.0.28.jar:na]
! at org.apache.tomcat.jdbc.pool.DataSourceProxy.pCreatePool(DataSourceProxy.java:115) [tomcat-jdbc-8.0.28.jar:na]
! at org.apache.tomcat.jdbc.pool.DataSourceProxy.createPool(DataSourceProxy.java:102) [tomcat-jdbc-8.0.28.jar:na]
! at org.apache.tomcat.jdbc.pool.DataSourceProxy.getConnection(DataSourceProxy.java:126) [tomcat-jdbc-8.0.28.jar:na]
! at macrobase.ingest.SQLIngester.initializeConnection(SQLIngester.java:168) [classes/:na]
! at macrobase.ingest.SQLIngester.getRows(SQLIngester.java:116) [classes/:na]
! at macrobase.runtime.resources.RowSetResource.getRows(RowSetResource.java:41) [classes/:na]
! at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_66]
! at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_66]
! at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_66]
! at java.lang.reflect.Method.invoke(Method.java:497) ~[na:1.8.0_66]
! at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81) [jersey-server-2.22.1.jar:na]
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144) [jersey-server-2.22.1.jar:na]
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161) [jersey-server-2.22.1.jar:na]
! at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:205) [jersey-server-2.22.1.jar:na]
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99) [jersey-server-2.22.1.jar:na]
! at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389) [jersey-server-2.22.1.jar:na]
! at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347) [jersey-server-2.22.1.jar:na]
! at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102) [jersey-server-2.22.1.jar:na]
! at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326) [jersey-server-2.22.1.jar:na]
! at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271) [jersey-common-2.22.1.jar:na]
! at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267) [jersey-common-2.22.1.jar:na]
! at org.glassfish.jersey.internal.Errors.process(Errors.java:315) [jersey-common-2.22.1.jar:na]
! at org.glassfish.jersey.internal.Errors.process(Errors.java:297) [jersey-common-2.22.1.jar:na]
! at org.glassfish.jersey.internal.Errors.process(Errors.java:267) [jersey-common-2.22.1.jar:na]
! at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317) [jersey-common-2.22.1.jar:na]
! at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305) [jersey-server-2.22.1.jar:na]
! at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154) [jersey-server-2.22.1.jar:na]
! at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:471) [jersey-container-servlet-core-2.22.1.jar:na]
! at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:425) [jersey-container-servlet-core-2.22.1.jar:na]
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:383) [jersey-container-servlet-core-2.22.1.jar:na]
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:336) [jersey-container-servlet-core-2.22.1.jar:na]
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:223) [jersey-container-servlet-core-2.22.1.jar:na]
! at io.dropwizard.jetty.NonblockingServletHolder.handle(NonblockingServletHolder.java:49) [dropwizard-jetty-0.9.1.jar:0.9.1]
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669) [jetty-servlet-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83) [jetty-servlets-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:300) [jetty-servlets-9.2.13.v20150730.jar:9.2.13.v20150730]
! at io.dropwizard.jetty.BiDiGzipFilter.doFilter(BiDiGzipFilter.java:132) [dropwizard-jetty-0.9.1.jar:0.9.1]
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652) [jetty-servlet-9.2.13.v20150730.jar:9.2.13.v20150730]
! at io.dropwizard.servlets.ThreadNameFilter.doFilter(ThreadNameFilter.java:29) [dropwizard-servlets-0.9.1.jar:0.9.1]
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652) [jetty-servlet-9.2.13.v20150730.jar:9.2.13.v20150730]
! at io.dropwizard.jersey.filter.AllowedMethodsFilter.handle(AllowedMethodsFilter.java:43) [dropwizard-jersey-0.9.1.jar:0.9.1]
! at io.dropwizard.jersey.filter.AllowedMethodsFilter.doFilter(AllowedMethodsFilter.java:38) [dropwizard-jersey-0.9.1.jar:0.9.1]
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652) [jetty-servlet-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585) [jetty-servlet-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127) [jetty-server-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515) [jetty-servlet-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061) [jetty-server-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [jetty-server-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) [jetty-server-9.2.13.v20150730.jar:9.2.13.v20150730]
! at com.codahale.metrics.jetty9.InstrumentedHandler.handle(InstrumentedHandler.java:240) [metrics-jetty9-3.1.2.jar:3.1.2]
! at io.dropwizard.jetty.RoutingHandler.handle(RoutingHandler.java:51) [dropwizard-jetty-0.9.1.jar:0.9.1]
! at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) [jetty-server-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.server.handler.RequestLogHandler.handle(RequestLogHandler.java:95) [jetty-server-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) [jetty-server-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:159) [jetty-server-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) [jetty-server-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.server.Server.handle(Server.java:499) [jetty-server-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310) [jetty-server-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257) [jetty-server-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540) [jetty-io-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635) [jetty-util-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555) [jetty-util-9.2.13.v20150730.jar:9.2.13.v20150730]
! at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66]
ERROR [2016-05-04 04:14:52,122] io.dropwizard.jersey.errors.LoggingExceptionMapper: Error handling a request: 34ebdc5ea696a2c1
! org.postgresql.util.PSQLException: FATAL: remaining connection slots are reserved for non-replication superuser connections
! at org.postgresql.core.v3.ConnectionFactoryImpl.readStartupMessages(ConnectionFactoryImpl.java:589) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:192) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:64) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.postgresql.jdbc2.AbstractJdbc2Connection.<init>(AbstractJdbc2Connection.java:143) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.postgresql.jdbc3.AbstractJdbc3Connection.<init>(AbstractJdbc3Connection.java:29) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.postgresql.jdbc3g.AbstractJdbc3gConnection.<init>(AbstractJdbc3gConnection.java:21) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.postgresql.jdbc4.AbstractJdbc4Connection.<init>(AbstractJdbc4Connection.java:38) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.postgresql.jdbc4.Jdbc4Connection.<init>(Jdbc4Connection.java:24) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.postgresql.Driver.makeConnection(Driver.java:412) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.postgresql.Driver.connect(Driver.java:280) ~[postgresql-9.3-1104-jdbc41.jar:na]
! at org.apache.tomcat.jdbc.pool.PooledConnection.connectUsingDriver(PooledConnection.java:307) ~[tomcat-jdbc-8.0.28.jar:na]
! at org.apache.tomcat.jdbc.pool.PooledConnection.connect(PooledConnection.java:200) ~[tomcat-jdbc-8.0.28.jar:na]
! at org.apache.tomcat.jdbc.pool.ConnectionPool.createConnection(ConnectionPool.java:708) ~[tomcat-jdbc-8.0.28.jar:na]
! at org.apache.tomcat.jdbc.pool.ConnectionPool.borrowConnection(ConnectionPool.java:642) ~[tomcat-jdbc-8.0.28.jar:na]
! at org.apache.tomcat.jdbc.pool.ConnectionPool.init(ConnectionPool.java:464) ~[tomcat-jdbc-8.0.28.jar:na]
! at org.apache.tomcat.jdbc.pool.ConnectionPool.<init>(ConnectionPool.java:141) ~[tomcat-jdbc-8.0.28.jar:na]
! at org.apache.tomcat.jdbc.pool.DataSourceProxy.pCreatePool(DataSourceProxy.java:115) ~[tomcat-jdbc-8.0.28.jar:na]
! at org.apache.tomcat.jdbc.pool.DataSourceProxy.createPool(DataSourceProxy.java:102) ~[tomcat-jdbc-8.0.28.jar:na]
! at org.apache.tomcat.jdbc.pool.DataSourceProxy.getConnection(DataSourceProxy.java:126) ~[tomcat-jdbc-8.0.28.jar:na]
! at macrobase.ingest.SQLIngester.initializeConnection(SQLIngester.java:168) ~[classes/:na]
! at macrobase.ingest.SQLIngester.getRows(SQLIngester.java:116) ~[classes/:na]
! at macrobase.runtime.resources.RowSetResource.getRows(RowSetResource.java:41) ~[classes/:na]
! at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_66]
! at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_66]
! at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_66]
! at java.lang.reflect.Method.invoke(Method.java:497) ~[na:1.8.0_66]
! at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81) ~[jersey-server-2.22.1.jar:na]
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144) ~[jersey-server-2.22.1.jar:na]
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161) ~[jersey-server-2.22.1.jar:na]
! at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:205) ~[jersey-server-2.22.1.jar:na]
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99) ~[jersey-server-2.22.1.jar:na]
! at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389) ~[jersey-server-2.22.1.jar:na]
! at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347) ~[jersey-server-2.22.1.jar:na]
! at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102) ~[jersey-server-2.22.1.jar:na]
! at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326) ~[jersey-server-2.22.1.jar:na]
! at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271) [jersey-common-2.22.1.jar:na]
! at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267) [jersey-common-2.22.1.jar:na]
! at org.glassfish.jersey.internal.Errors.process(Errors.java:315) [jersey-common-2.22.1.jar:na]
! at org.glassfish.jersey.internal.Errors.process(Errors.java:297) [jersey-common-2.22.1.jar:na]
! at org.glassfish.jersey.internal.Errors.process(Errors.java:267) [jersey-common-2.22.1.jar:na]
! at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317) [jersey-common-2.22.1.jar:na]
! at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305) [jersey-server-2.22.1.jar:na]
! at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154) [jersey-server-2.22.1.jar:na]
! at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:471) [jersey-container-servlet-core-2.22.1.jar:na]
! at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:425) [jersey-container-servlet-core-2.22.1.jar:na]
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:383) [jersey-container-servlet-core-2.22.1.jar:na]
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:336) [jersey-container-servlet-core-2.22.1.jar:na]
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:223) [jersey-container-servlet-core-2.22.1.jar:na]
! at io.dropwizard.jetty.NonblockingServletHolder.handle(NonblockingServletHolder.java:49) [dropwizard-jetty-0.9.1.jar:0.9.1]
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669) [jetty-servlet-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83) [jetty-servlets-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:300) [jetty-servlets-9.2.13.v20150730.jar:9.2.13.v20150730]
! at io.dropwizard.jetty.BiDiGzipFilter.doFilter(BiDiGzipFilter.java:132) [dropwizard-jetty-0.9.1.jar:0.9.1]
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652) [jetty-servlet-9.2.13.v20150730.jar:9.2.13.v20150730]
! at io.dropwizard.servlets.ThreadNameFilter.doFilter(ThreadNameFilter.java:29) [dropwizard-servlets-0.9.1.jar:0.9.1]
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652) [jetty-servlet-9.2.13.v20150730.jar:9.2.13.v20150730]
! at io.dropwizard.jersey.filter.AllowedMethodsFilter.handle(AllowedMethodsFilter.java:43) [dropwizard-jersey-0.9.1.jar:0.9.1]
! at io.dropwizard.jersey.filter.AllowedMethodsFilter.doFilter(AllowedMethodsFilter.java:38) [dropwizard-jersey-0.9.1.jar:0.9.1]
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652) [jetty-servlet-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585) [jetty-servlet-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127) [jetty-server-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515) [jetty-servlet-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061) [jetty-server-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [jetty-server-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) [jetty-server-9.2.13.v20150730.jar:9.2.13.v20150730]
! at com.codahale.metrics.jetty9.InstrumentedHandler.handle(InstrumentedHandler.java:240) [metrics-jetty9-3.1.2.jar:3.1.2]
! at io.dropwizard.jetty.RoutingHandler.handle(RoutingHandler.java:51) [dropwizard-jetty-0.9.1.jar:0.9.1]
! at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) [jetty-server-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.server.handler.RequestLogHandler.handle(RequestLogHandler.java:95) [jetty-server-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) [jetty-server-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:159) [jetty-server-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) [jetty-server-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.server.Server.handle(Server.java:499) [jetty-server-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310) [jetty-server-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257) [jetty-server-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540) [jetty-io-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635) [jetty-util-9.2.13.v20150730.jar:9.2.13.v20150730]
! at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555) [jetty-util-9.2.13.v20150730.jar:9.2.13.v20150730]
! at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66]
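The PostgreSQL message means the server has run out of regular connection slots. In the trace, the failure happens inside ConnectionPool.init, reached from SQLIngester.initializeConnection during a getRows request. One reading, not verified against the code, is that a new pool is created per request and the old connections are never released. If that is the cause, the usual remedy is to create the pool once and share it. The sketch below shows that pattern generically; `SharedResource` is a hypothetical helper, not a MacroBase or tomcat-jdbc class.

```java
import java.util.function.Supplier;

final class SharedResource<T> {
    private final Supplier<T> factory;
    private T instance;

    SharedResource(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Creates the resource on first use and returns the same instance afterwards. */
    synchronized T get() {
        if (instance == null) {
            instance = factory.get();
        }
        return instance;
    }
}
```

In this scenario the factory would build the connection pool, and each request would borrow a connection from the shared pool inside a try-with-resources block so that it is returned when the request finishes.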
