
rax-maas / blueflood


A distributed system designed to ingest and process time series data

Home Page: http://www.blueflood.io

License: Apache License 2.0

Java 96.44% Shell 1.40% Python 0.74% Ruby 0.17% JavaScript 0.61% HTML 0.01% Dockerfile 0.02% PowerShell 0.61%

blueflood's Introduction

Blueflood


Discuss - Code - Site

Introduction

Blueflood is a multi-tenant, distributed metric processing system capable of ingesting, rolling up, and serving metrics at massive scale.

Getting Started

The latest code will always be here on GitHub.

git clone https://github.com/rax-maas/blueflood.git
cd blueflood

Building

Blueflood builds and runs on Java 8. Ensure you're using an appropriate JDK before proceeding.

Blueflood builds with Maven. Use typical Maven lifecycle phases:

  • mvn clean removes build artifacts.
  • mvn test runs unit tests.
  • mvn verify runs all tests.
  • mvn package builds a Blueflood package for release.

Important build profiles to know about:

  • skip-unit-tests skips unit tests in all modules.
  • skip-integration-tests skips the integration tests.

Blueflood's main artifact is an 'uber jar', produced by the blueflood-all module.

After compiling, you can also build a Docker image with mvn docker:build. See blueflood-docker for the Docker-related files.

Running

You can easily build a ready-to-run Blueflood jar from source:

mvn package -P skip-unit-tests,skip-integration-tests

However, Blueflood requires Cassandra in order to start, and Elasticsearch for all of its features to work. The best place to start is the 10 minute guide.
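
Once built, a minimal startup might look like the following sketch (the paths are placeholders, not canonical locations; compare the full command quoted in the Zookeeper issue further down):

java -Dblueflood.config=file:///path/to/blueflood.properties \
     -Dlog4j.configuration=file:///path/to/log4j.properties \
     -cp blueflood-all/target/blueflood-all-*-jar-with-dependencies.jar \
     com.rackspacecloud.blueflood.service.BluefloodServiceStarter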

Additional Tools

The Blueflood team maintains a number of tools that are related to the project but are not essential components of it. These tools live in various other repos.

Contributing

First, we welcome bug reports and contributions. If you would like to contribute code, just fork this project and send us a pull request. If you would like to contribute documentation, you should get familiar with our wiki.

Also, we have set up a Google Group to answer questions.

License

Copyright 2013-2017 Rackspace

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

blueflood's People

Contributors

actions-user, chandraaddala, chinmay-gupte, danielni, darxriggs, dlobue, dogild, dreid, fourk, gdusbabek, georgejahad, goru97, itzg, iwebi, izrik, kaustavha, lakshmi-kannan, rampage644, ratanasv, richarxt, robert-chiniquy, rohitsngh27, shintasmith, slevinbe, stackedsax, terryllowery, tilogaat, usnavi, vinnyq, zzantozz


blueflood's Issues

Using blueflood in private cloud/on premise

Blueflood is very interesting. I am currently using Cyanite. I am interested to know whether Blueflood only works at Rackspace, or whether I can use it in an on-premise/private cloud environment?

Thanks in advance

Paginate results from /metrics/search

I develop Rackspace Intelligence, which uses Blueflood extensively via Rackspace Monitoring and Cloud Metrics. I've been debugging a performance problem where we need to find a device that generates metrics so we can display it on a particular graph page, not knowing which device it will be ahead of time.

Currently we call an API in Rackspace Monitoring that delegates to Metrics (which is Blueflood). The Blueflood API does not support any limit on the size of the response beyond the hard-coded limit of 100,000 in the code. I believe this causes response time to grow without a reasonable bound on large accounts. On the customer's account that alerted us to the issue, I saw response times from Blueflood of up to 20 seconds.

We would like to speed up this operation by calling Rackspace Cloud Metrics directly, but we need an upper bound on the latency of this operation for it to make sense. It would be nice to expose the from and size parameters to Elasticsearch as queryable parameters to the /metrics/search API.
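
For example, a paginated search might then look like this (hypothetical; from and size are precisely the parameters this issue asks for and do not exist today):

curl "http://localhost:20000/v2.0/tenant-id/metrics/search?query=*&from=0&size=100"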

Startup time is too slow when zookeeper is enabled.

I had 3 blueflood rollup servers deployed, shard space initially split into 3 regions. I decided to use zookeeper clustering.

I realized that Blueflood startup time became very slow as soon as I enabled Zookeeper clustering. (My deployment environment has a health check of the form "if the given ports don't open within N seconds, consider the deploy failed".)

I enabled DEBUG logging, and I realized the bottleneck was in ZKBasedShardLockManager.prefetchLocksAndScheduleLocksScavenging, specifically in this loop:

for (int shard : shards) {
    worker.submit(lock.acquirer()).get();
    if (lock.isHeld() && ++locksObtained >= maxLocksToPrefetch) {
        break;
    }
}

Each iteration of the loop takes ~1 sec for me, so the entire startup takes about a minute. Relevant log entries:

2014-10-22 22:18:01 DEBUG asedShardLockManager:212 - Initial lock attempt for 30
2014-10-22 22:18:01 DEBUG asedShardLockManager:563 - Trying ZK lock for 30
2014-10-22 22:18:01 DEBUG asedShardLockManager:576 - Acquired ZK lock for 30
...
2014-10-22 22:19:11 DEBUG asedShardLockManager:212 - Initial lock attempt for 38
2014-10-22 22:19:11 DEBUG asedShardLockManager:563 - Trying ZK lock for 38
2014-10-22 22:19:12 DEBUG asedShardLockManager:581 - Acquire ZK failed for 38

I propose two possible solutions for this:

  1. Make this loop parallel; a rough sketch follows below. (One blocker seems to be the ThreadPoolExecutor worker having a max pool size of 1, effectively serializing the background worker pool. Any history behind this?)
  2. Allow the app to start without this task finishing. (I don't know why this logic is a prerequisite for starting the app - any background on this?)

Any thoughts?
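
To make option 1 concrete, a rough sketch of the parallel version (assumptions: a per-shard lock lookup such as locks.get(shard) exists, the worker pool's max size is raised above 1, and the maxLocksToPrefetch early exit is omitted for brevity):

List<Future<?>> attempts = new ArrayList<Future<?>>();
for (int shard : shards) {
    // fan out all acquire attempts instead of blocking on each one in turn
    attempts.add(worker.submit(locks.get(shard).acquirer()));
}
for (Future<?> attempt : attempts) {
    attempt.get(); // total wait is now roughly the slowest single attempt, not the sum
}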

Build error on Ubuntu 14.04

Hi,

I followed the 10 minute guide to build Blueflood from source, but I got the following errors.

[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-assembly-plugin:2.4:single (make-assembly) on project blueflood-all: Execution make-assembly of goal org.apache.maven.plugins:maven-assembly-plugin:2.4:single failed: A required class was missing while executing org.apache.maven.plugins:maven-assembly-plugin:2.4:single: org/apache/commons/lang/StringUtils
[ERROR] -----------------------------------------------------
[ERROR] realm = plugin>org.apache.maven.plugins:maven-assembly-plugin:2.4
[ERROR] strategy = org.codehaus.plexus.classworlds.strategy.SelfFirstStrategy
[ERROR] urls[0] = file:/home/ubuntu/.m2/repository/org/apache/maven/plugins/maven-assembly-plugin/2.4/maven-assembly-plugin-2.4.jar
[ERROR] urls[1] = file:/home/ubuntu/.m2/repository/org/slf4j/slf4j-jdk14/1.5.6/slf4j-jdk14-1.5.6.jar
[ERROR] urls[2] = file:/home/ubuntu/.m2/repository/org/slf4j/slf4j-api/1.5.6/slf4j-api-1.5.6.jar
[ERROR] urls[3] = file:/home/ubuntu/.m2/repository/org/slf4j/jcl-over-slf4j/1.5.6/jcl-over-slf4j-1.5.6.jar
[ERROR] urls[4] = file:/home/ubuntu/.m2/repository/org/apache/maven/reporting/maven-reporting-api/2.2.1/maven-reporting-api-2.2.1.jar
[ERROR] urls[5] = file:/home/ubuntu/.m2/repository/org/apache/maven/doxia/doxia-sink-api/1.1/doxia-sink-api-1.1.jar
[ERROR] urls[6] = file:/home/ubuntu/.m2/repository/org/apache/maven/doxia/doxia-logging-api/1.1/doxia-logging-api-1.1.jar
[ERROR] urls[7] = file:/home/ubuntu/.m2/repository/commons-cli/commons-cli/1.2/commons-cli-1.2.jar
[ERROR] urls[8] = file:/home/ubuntu/.m2/repository/org/codehaus/plexus/plexus-interactivity-api/1.0-alpha-4/plexus-interactivity-api-1.0-alpha-4.jar
[ERROR] urls[9] = file:/home/ubuntu/.m2/repository/backport-util-concurrent/backport-util-concurrent/3.1/backport-util-concurrent-3.1.jar
[ERROR] urls[10] = file:/home/ubuntu/.m2/repository/org/sonatype/plexus/plexus-sec-dispatcher/1.3/plexus-sec-dispatcher-1.3.jar
[ERROR] urls[11] = file:/home/ubuntu/.m2/repository/org/sonatype/plexus/plexus-cipher/1.4/plexus-cipher-1.4.jar
[ERROR] urls[12] = file:/home/ubuntu/.m2/repository/org/apache/maven/shared/maven-common-artifact-filters/1.4/maven-common-artifact-filters-1.4.jar
[ERROR] urls[13] = file:/home/ubuntu/.m2/repository/org/codehaus/plexus/plexus-interpolation/1.15/plexus-interpolation-1.15.jar
[ERROR] urls[14] = file:/home/ubuntu/.m2/repository/org/codehaus/plexus/plexus-archiver/2.2/plexus-archiver-2.2.jar
[ERROR] urls[15] = file:/home/ubuntu/.m2/repository/org/apache/maven/shared/file-management/1.1/file-management-1.1.jar
[ERROR] urls[16] = file:/home/ubuntu/.m2/repository/org/apache/maven/shared/maven-shared-io/1.1/maven-shared-io-1.1.jar
[ERROR] urls[17] = file:/home/ubuntu/.m2/repository/org/apache/maven/shared/maven-filtering/1.1/maven-filtering-1.1.jar
[ERROR] urls[18] = file:/home/ubuntu/.m2/repository/org/sonatype/plexus/plexus-build-api/0.0.4/plexus-build-api-0.0.4.jar
[ERROR] urls[19] = file:/home/ubuntu/.m2/repository/org/codehaus/plexus/plexus-io/2.0.6/plexus-io-2.0.6.jar
[ERROR] urls[20] = file:/home/ubuntu/.m2/repository/org/apache/maven/maven-archiver/2.5/maven-archiver-2.5.jar
[ERROR] urls[21] = file:/home/ubuntu/.m2/repository/junit/junit/3.8.1/junit-3.8.1.jar
[ERROR] urls[22] = file:/home/ubuntu/.m2/repository/org/codehaus/plexus/plexus-utils/3.0.8/plexus-utils-3.0.8.jar
[ERROR] urls[23] = file:/home/ubuntu/.m2/repository/org/apache/maven/shared/maven-repository-builder/1.0-alpha-2/maven-repository-builder-1.0-alpha-2.jar
[ERROR] Number of foreign imports: 1
[ERROR] import: Entry[import from realm ClassRealm[maven.api, parent: null]]
[ERROR]
[ERROR] -----------------------------------------------------: org.apache.commons.lang.StringUtils
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginContainerException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn -rf :blueflood-all

Regards,
Erdinc

Perform batch writes for rollups

Currently, rollups are written to disk one at a time. We should have a way to write rollups in batches, which would reduce the number of Cassandra operations (a rough sketch follows the list below).

Things to keep in mind

  1. Figure out the optimum batch size in terms of performance

  2. Make sure that the slot state is marked as Rolled only after rollups are persisted on disk.
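
For illustration, batching could look roughly like this with Astyanax's MutationBatch (a hypothetical sketch, not the project's actual writer code; RollupWrite, pendingRollups, CF_METRICS_5M, and ttlSeconds are made-up names):

MutationBatch batch = keyspace.prepareMutationBatch();
for (RollupWrite w : pendingRollups) {               // hypothetical buffer of rollups awaiting a flush
    batch.withRow(CF_METRICS_5M, w.getLocatorKey())  // destination column family and row key
         .putColumn(w.getTimestamp(), w.getSerializedRollup(), ttlSeconds);
}
batch.execute();                                     // one Cassandra round trip instead of one per rollup
// Per point 2 above: mark the slot state Rolled only after execute() returns successfully.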

We're using netty 3 but depending on netty 4.

astyanax-cassandra requires netty 3.

When our web services were initially created, the netty 3 classes were inadvertently used.

We need to fix this, but it's going to require some refactoring, as the HttpRequest/Response API changed a good bit between 3 and 4.

Blob format

I'm having trouble reading data directly from Cassandra (from Python).

cqlsh:DATA> DESCRIBE KEYSPACE DATA;
....
CREATE TABLE metrics_5m (
key text,
column1 bigint,
value blob,
PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
bloom_filter_fp_chance=0.010000 AND
caching='KEYS_ONLY' AND
comment='' AND
dclocal_read_repair_chance=0.000000 AND
gc_grace_seconds=864000 AND
read_repair_chance=0.100000 AND
replicate_on_write='true' AND
populate_io_cache_on_flush='false' AND
compaction={'class': 'SizeTieredCompactionStrategy'} AND
compression={'sstable_compression': 'SnappyCompressor'};
....

I see the 'value' is of type 'blob'.

#!/usr/bin/env python

import pycassa
import struct

pool = pycassa.ConnectionPool('DATA', server_list=['c1', 'c2'])
col_fam = pycassa.ColumnFamily(pool, 'metrics_5m')
res = col_fam.get('7023,70fa66a.diskuse.__srv')

for ts, v in res.items():
    # unpack one signed byte per 'b', sized to the value instead of hard-coding 42 b's
    print struct.unpack('%db' % len(v), v)

.... snip ....
(0, 5, 0, 110, 0, 0, 0, 0, 0, 0, 8, 64, 1, 110, 0, 0, 0, 0, 0, 0, 0, 0, 2, 110, 0, 0, 0, 0, 0, 0, 8, 64, 3, 110, 0, 0, 0, 0, 0, 0, 8, 64)
(0, 5, 0, 110, 0, 0, 0, 0, 0, 0, 8, 64, 1, 110, 0, 0, 0, 0, 0, 0, 0, 0, 2, 110, 0, 0, 0, 0, 0, 0, 8, 64, 3, 110, 0, 0, 0, 0, 0, 0, 8, 64)
.... snip ....

I'm sure that I am missing something simple, but what is the recommended way to read this data directly from Cassandra?
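
One hint from the bytes above: 110 is ASCII 'n', and each run of eight bytes following an (index, 110) pair decodes as a little-endian double. A quick check (the overall field layout is a guess, but the arithmetic holds):

import struct
# the eight bytes 0,0,0,0,0,0,8,64 from the output above, as a little-endian double
print struct.unpack('<d', '\x00\x00\x00\x00\x00\x00\x08\x40')[0]  # prints 3.0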

Zookeeper error when running Blueflood

I am trying to run blueflood-all.jar with the following configuration:

log4j.properties:

log4j.appender.console.additionalFields={'environment': 'dev', 'application': 'bf', 'instance_id': '0'}
log4j.appender.console.extractStacktrace=true
log4j.appender.console.addExtendedInformation=true
log4j.appender.console.facilityIsLogger=true
log4j.appender.console.layout=org.apache.log4j.PatternLayout

log4j.logger.httpclient.wire.header=WARN
log4j.logger.httpclient.wire.content=WARN

log4j.category.org.apache.zookeeper.ClientCnxn=WARN
log4j.category.org.apache.zookeeper.client.ZooKeeperSaslClient=ERROR

log4j.logger.org.apache.http.client.protocol=INFO
log4j.logger.org.apache.http.wire=INFO
log4j.logger.org.apache.http.impl=INFO
log4j.logger.org.apache.http.headers=INFO

log4j.rootLogger=INFO, console
Startup command:

java -cp /home/ubuntu/blueflood/blueflood-all/target/blueflood-all-2.0.0-SNAPSHOT-jar-with-dependencies.jar \
-Dblueflood.config=file:///home/ubuntu/blueflood/blueflood-core/src/main/resources/configDefaults/blueflood.properties \
-Dlog4j.configuration=file:///home/ubuntu/blueflood/blueflood-core/log4j.properties \
com.rackspacecloud.blueflood.service.BluefloodServiceStarter

But I am getting these errors:

log4j:ERROR Could not find value for key log4j.appender.console
log4j:ERROR Could not instantiate appender named "console".
log4j:ERROR Could not find value for key log4j.appender.console
log4j:ERROR Could not instantiate appender named "console".
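
The first two log4j errors point at the cause: the properties above configure attributes of an appender named console but never declare the appender class itself. A minimal fix would be a line like the following (ConsoleAppender is only a guess; the additionalFields/extractStacktrace keys suggest a GELF appender was actually intended):

log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout.ConversionPattern=%d{ISO8601} %-5p %c:%L - %m%n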

bf-rollups-delay.py support new version of Cassandra with CQL 1.4.0

status error An attempt was made to connect to each of the servers twice, but none of the attempts succeeded. The last failure was TTransportException: Could not connect to 192.168.1.1:9160
Traceback (most recent call last):
File "blueflood-rollup-delay.py", line 139, in
main(sys.argv[1])
File "blueflood-rollup-delay.py", line 136, in main
raise ex
pycassa.pool.AllServersUnavailable: An attempt was made to connect to each of the servers twice, but none of the attempts succeeded. The last failure was TTransportException: Could not connect to 192.168.1.1:9160

refactor rollup read/write logic

It's gotten pretty complicated with the latest sets of changes (pre-aggregated metrics support, rollup write batching).

11:21 <@gdusbabek> jburkhart, lakspace: after this we may want to take a few steps back and examine
 how rollups are read and written.  I think we could improve our use of threadpools and make the code cleaner.
11:22 <@gdusbabek> it's getting a little difficult to grok.
...
11:23 <@lakspace> We have to revisit everything from LocatorFetchRunnable 
(that's where the confusion starts, I think)

While we're at it, it might make sense to find a way to do batch reads of metrics when doing rollups. We could see performance gains there.

One potential optimization could be to redo the way we shard locators by using org.apache.cassandra.dht.RandomPartitioner, that way we end up with better locality for all reads for a given shard.

That probably only improves things if the number of Cassandra servers divides evenly into the number of shards, but that seems like it would describe the vast majority of deployments.

Refactor ShardStateWorker

The main reason would be to simplify the boilerplate required to push/pull shard state.

One way to do this would be to refactor ShardStateWorker by adding start/stop methods, then creating another class (ShardStateServices?) that holds a reference to each of a push worker and a pull worker so both can be managed together.

Then the boilerplate becomes simply:

new ShardStateServices().start();
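
A rough sketch of the proposed shape (hypothetical; assumes the new start/stop methods exist on ShardStateWorker, and a no-arg constructor could wire up the default push/pull workers internally):

public class ShardStateServices {
    private final ShardStateWorker pusher;
    private final ShardStateWorker puller;

    public ShardStateServices(ShardStateWorker pusher, ShardStateWorker puller) {
        this.pusher = pusher;
        this.puller = puller;
    }

    // manage the push and pull workers as a single unit
    public void start() { pusher.start(); puller.start(); }
    public void stop()  { pusher.stop();  puller.stop();  }
}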

Loading the schema into cassandra fails

I just cloned the blueflood git repo and built the project.
Then I tried to import the schema into cassandra (2.0.16) (also tried on 2.2.3)

Here's the command that I used

cqlsh -f /blueflood/src/cassandra/cli/load.script

But it failed

/blueflood/src/cassandra/cli/load.script:4:Bad Request: Failed parsing statement: [CREATE KEYSPACE DATA
WITH placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
AND strategy_options = {replication_factor:1};] reason: NullPointerException null
/blueflood/src/cassandra/cli/load.script:5:Bad Request: Keyspace 'data' does not exist
/blueflood/src/cassandra/cli/load.script:7:Bad Request: line 1:7 no viable alternative at input 'COLUMN'
/blueflood/src/cassandra/cli/load.script:8:Bad Request: line 1:7 no viable alternative at input 'COLUMN'
/blueflood/src/cassandra/cli/load.script:9:Bad Request: line 1:7 no viable alternative at input 'COLUMN'
/blueflood/src/cassandra/cli/load.script:10:Bad Request: line 1:7 no viable alternative at input 'COLUMN'
/blueflood/src/cassandra/cli/load.script:11:Bad Request: line 1:7 no viable alternative at input 'COLUMN'
/blueflood/src/cassandra/cli/load.script:13:Bad Request: line 1:7 no viable alternative at input 'COLUMN'
/blueflood/src/cassandra/cli/load.script:14:Bad Request: line 1:7 no viable alternative at input 'COLUMN'
/blueflood/src/cassandra/cli/load.script:15:Bad Request: line 1:7 no viable alternative at input 'COLUMN'
/blueflood/src/cassandra/cli/load.script:16:Bad Request: line 1:7 no viable alternative at input 'COLUMN'
/blueflood/src/cassandra/cli/load.script:17:Bad Request: line 1:7 no viable alternative at input 'COLUMN'
/blueflood/src/cassandra/cli/load.script:18:Bad Request: line 1:7 no viable alternative at input 'COLUMN'
/blueflood/src/cassandra/cli/load.script:20:Bad Request: line 1:7 no viable alternative at input 'COLUMN'
/blueflood/src/cassandra/cli/load.script:21:Bad Request: line 1:7 no viable alternative at input 'COLUMN'
/blueflood/src/cassandra/cli/load.script:22:Bad Request: line 1:7 no viable alternative at input 'COLUMN'
/blueflood/src/cassandra/cli/load.script:23:Bad Request: line 1:7 no viable alternative at input 'COLUMN'
/blueflood/src/cassandra/cli/load.script:24:Bad Request: line 1:7 no viable alternative at input 'COLUMN'
/blueflood/src/cassandra/cli/load.script:25:Bad Request: line 1:7 no viable alternative at input 'COLUMN'
/blueflood/src/cassandra/cli/load.script:27:Bad Request: line 1:7 no viable alternative at input 'COLUMN'
/blueflood/src/cassandra/cli/load.script:28:Bad Request: line 1:7 no viable alternative at input 'COLUMN'
/blueflood/src/cassandra/cli/load.script:29:Bad Request: line 1:7 no viable alternative at input 'COLUMN'
/blueflood/src/cassandra/cli/load.script:30:Bad Request: line 1:7 no viable alternative at input 'COLUMN'
/blueflood/src/cassandra/cli/load.script:31:Bad Request: line 1:7 no viable alternative at input 'COLUMN'
Aborted (core dumped)
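
An observation: the repeated no viable alternative at input 'COLUMN' errors suggest load.script is written in the legacy cassandra-cli syntax (CREATE COLUMN FAMILY ...), which cqlsh cannot parse. If so, the intended invocation is presumably the older CLI tool:

cassandra-cli -h localhost -f /blueflood/src/cassandra/cli/load.script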

Request: Graphite backend

I see there's a statsd backend, which is awesome. Requesting a graphite backend so that I can set my carbon-relay to stream metrics to blueflood in the same way we can stream to blueflood-statsd.

Would really allow us (and I'm sure many others) to hook in to blueflood much easier.

We have so many metrics going to graphite (not via statsd) right now that switching to blueflood is going to be super difficult and time consuming. A graphite backend would make this process much easier to get started with, allowing us to slowly convert everything else in the future while still getting the immediate benefits.

Error in running tests on Ubuntu

I get an error under both Java 7 and Java 8 when running mvn test in Ubuntu 15.04:

com.rackspacecloud.blueflood.service.BluefloodServiceStarterTest  Time elapsed: 0.794 sec  <<< ERROR!
java.lang.IllegalStateException: Failed to transform class with name com.rackspacecloud.blueflood.service.BluefloodServiceStarter. Reason: java.io.IOException: invalid constant type: 18
    at javassist.bytecode.ConstPool.readOne(ConstPool.java:1090)
    at javassist.bytecode.ConstPool.read(ConstPool.java:1033)
    at javassist.bytecode.ConstPool.<init>(ConstPool.java:149)
    at javassist.bytecode.ClassFile.read(ClassFile.java:764)
    at javassist.bytecode.ClassFile.<init>(ClassFile.java:108)
    at javassist.CtClassType.getClassFile2(CtClassType.java:190)
    at javassist.CtClassType.subtypeOf(CtClassType.java:303)
    at javassist.CtClassType.subtypeOf(CtClassType.java:318)
    at javassist.compiler.MemberResolver.compareSignature(MemberResolver.java:247)
    at javassist.compiler.MemberResolver.lookupMethod(MemberResolver.java:119)
    at javassist.compiler.MemberResolver.lookupMethod(MemberResolver.java:96)
    at javassist.compiler.TypeChecker.atMethodCallCore(TypeChecker.java:704)
    at javassist.expr.NewExpr$ProceedForNew.setReturnType(NewExpr.java:243)
    at javassist.compiler.JvstTypeChecker.atCallExpr(JvstTypeChecker.java:146)
    at javassist.compiler.ast.CallExpr.accept(CallExpr.java:45)
    at javassist.compiler.TypeChecker.atVariableAssign(TypeChecker.java:248)
    at javassist.compiler.TypeChecker.atAssignExpr(TypeChecker.java:217)
    at javassist.compiler.ast.AssignExpr.accept(AssignExpr.java:38)
    at javassist.compiler.CodeGen.doTypeCheck(CodeGen.java:241)
    at javassist.compiler.CodeGen.atStmnt(CodeGen.java:329)
    at javassist.compiler.ast.Stmnt.accept(Stmnt.java:49)
    at javassist.compiler.CodeGen.atStmnt(CodeGen.java:350)
    at javassist.compiler.ast.Stmnt.accept(Stmnt.java:49)
    at javassist.compiler.CodeGen.atIfStmnt(CodeGen.java:404)
    at javassist.compiler.CodeGen.atStmnt(CodeGen.java:354)
    at javassist.compiler.ast.Stmnt.accept(Stmnt.java:49)
    at javassist.compiler.Javac.compileStmnt(Javac.java:568)
    at javassist.expr.NewExpr.replace(NewExpr.java:206)
    at org.powermock.core.transformers.impl.MainMockTransformer$PowerMockExpressionEditor.edit(MainMockTransformer.java:428)
    at javassist.expr.ExprEditor.loopBody(ExprEditor.java:211)
    at javassist.expr.ExprEditor.doit(ExprEditor.java:90)
    at javassist.CtClassType.instrument(CtClassType.java:1384)
    at org.powermock.core.transformers.impl.MainMockTransformer.transform(MainMockTransformer.java:75)
    at org.powermock.core.classloader.MockClassLoader.loadMockClass(MockClassLoader.java:203)
    at org.powermock.core.classloader.MockClassLoader.loadModifiedClass(MockClassLoader.java:145)
    at org.powermock.core.classloader.DeferSupportingClassLoader.loadClass(DeferSupportingClassLoader.java:65)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at sun.reflect.generics.factory.CoreReflectionFactory.makeNamedType(CoreReflectionFactory.java:114)
    at sun.reflect.generics.visitor.Reifier.visitClassTypeSignature(Reifier.java:125)
    at sun.reflect.generics.tree.ClassTypeSignature.accept(ClassTypeSignature.java:49)
    at sun.reflect.annotation.AnnotationParser.parseSig(AnnotationParser.java:439)
    at sun.reflect.annotation.AnnotationParser.parseClassValue(AnnotationParser.java:420)
    at sun.reflect.annotation.AnnotationParser.parseClassArray(AnnotationParser.java:724)
    at sun.reflect.annotation.AnnotationParser.parseArray(AnnotationParser.java:531)
    at sun.reflect.annotation.AnnotationParser.parseMemberValue(AnnotationParser.java:355)
    at sun.reflect.annotation.AnnotationParser.parseAnnotation2(AnnotationParser.java:286)
    at sun.reflect.annotation.AnnotationParser.parseAnnotations2(AnnotationParser.java:120)
    at sun.reflect.annotation.AnnotationParser.parseAnnotations(AnnotationParser.java:72)
    at java.lang.Class.createAnnotationData(Class.java:3521)
    at java.lang.Class.annotationData(Class.java:3510)
    at java.lang.Class.getAnnotations(Class.java:3446)
    at org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.classAnnotations(PowerMockJUnit44RunnerDelegateImpl.java:163)
    at org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.getDescription(PowerMockJUnit44RunnerDelegateImpl.java:155)
    at org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.run(PowerMockJUnit44RunnerDelegateImpl.java:118)
    at org.powermock.modules.junit4.common.internal.impl.JUnit4TestSuiteChunkerImpl.run(JUnit4TestSuiteChunkerImpl.java:102)
    at org.powermock.modules.junit4.common.internal.impl.AbstractCommonPowerMockRunner.run(AbstractCommonPowerMockRunner.java:53)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)

My Java versions:

java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)

and

java version "1.8.0_72"
Java(TM) SE Runtime Environment (build 1.8.0_72-b15)
Java HotSpot(TM) 64-Bit Server VM (build 25.72-b15, mixed mode)

Have a way of tracking intervals of raw data

This would replace GET_BY_POINTS_ASSUME_INTERVAL and make GetByPoints queries return a number of points closer to the requested amount in cases where data is sent at an interval that does not match GET_BY_POINTS_ASSUME_INTERVAL.

Compatibility with Java 8

Are there any plans to make Blueflood ready for Java 8? The main reason for that decision could be that Java 7 has not been supported since April 2015.
Currently it is impossible to build Blueflood with JDK 8 because of a failing test in the blueflood-core module:

Running com.rackspacecloud.blueflood.io.serializers.IMetricSerializerTest
Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.79 sec <<< FAILURE!
testSetSerialization(com.rackspacecloud.blueflood.io.serializers.IMetricSerializerTest)  Time elapsed: 1.24 sec  <<< FAILURE!
junit.framework.ComparisonFailure: expected:<..."count":9,"hashes":[[746007989,1875251108,98262,103159993,1727114331,-1034140067,98699,1062516268,99644]]}> but was:<..."count":9,"hashes":[[-1034140067,746007989,1875251108,98262,1062516268,1727114331,98699,99644,103159993]]}>
        at junit.framework.Assert.assertEquals(Assert.java:85)
        at junit.framework.Assert.assertEquals(Assert.java:91)
        at com.rackspacecloud.blueflood.io.serializers.IMetricSerializerTest.testSetSerialization(IMetricSerializerTest.java:76)
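
The failure above shows the same nine hashes in a different order, which is consistent with hash-collection iteration order changing between Java 7 and Java 8; any test that asserts the exact serialized form of a HashSet-backed value will trip on this. A minimal illustration (not Blueflood code):

import java.util.Arrays;
import java.util.HashSet;

public class HashOrderDemo {
    public static void main(String[] args) {
        // HashSet iteration order is unspecified and can differ across JDK versions.
        System.out.println(new HashSet<Integer>(Arrays.asList(746007989, 1875251108, 98262, 103159993)));
    }
}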

Also, Blueflood built with JDK 7 throws an exception during startup under a Java 8 JVM:

2015-08-14 14:56:39 INFO  efloodServiceStarter:302 - Starting blueflood services
2015-08-14 14:56:39 INFO  tionPoolMBeanManager:45  - Registering mbean: com.netflix.MonitoredResources:type=ASTYANAX,name=MyConnectionPool,ServiceType=connectionpool
2015-08-14 14:56:39 INFO  onnectionPoolMonitor:239 - AddHost: 127.0.0.1
2015-08-14 14:56:40 INFO  efloodServiceStarter:76  - Shard push and pull services started
2015-08-14 14:56:40 INFO  efloodServiceStarter:98  - Loading ingestion service module com.rackspacecloud.blueflood.service.HttpIngestionService
2015-08-14 14:56:40 INFO  efloodServiceStarter:106 - Starting ingestion service module com.rackspacecloud.blueflood.service.HttpIngestionService with writer: AstyanaxMetricsWriter
2015-08-14 14:56:40 ERROR Instrumentation     :60  - Unable to register mbean for Instrumentation
javax.management.NotCompliantMBeanException: Interface is not public: com.rackspacecloud.blueflood.io.InstrumentationMBean
        at com.sun.jmx.mbeanserver.MBeanAnalyzer.<init>(MBeanAnalyzer.java:114)
        at com.sun.jmx.mbeanserver.MBeanAnalyzer.analyzer(MBeanAnalyzer.java:102)
        at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.getAnalyzer(StandardMBeanIntrospector.java:67)
        at com.sun.jmx.mbeanserver.MBeanIntrospector.getPerInterface(MBeanIntrospector.java:192)
        at com.sun.jmx.mbeanserver.MBeanSupport.<init>(MBeanSupport.java:138)
        at com.sun.jmx.mbeanserver.StandardMBeanSupport.<init>(StandardMBeanSupport.java:60)
        at com.sun.jmx.mbeanserver.Introspector.makeDynamicMBean(Introspector.java:192)
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:898)
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324)
        at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
        at com.rackspacecloud.blueflood.io.Instrumentation.<clinit>(Instrumentation.java:58)
        at com.rackspacecloud.blueflood.io.AstyanaxReader.getShardState(AstyanaxReader.java:198)
        at com.rackspacecloud.blueflood.io.AstyanaxShardStateIO.getShardState(AstyanaxShardStateIO.java:19)
        at com.rackspacecloud.blueflood.service.ShardStatePuller.performOperation(ShardStatePuller.java:42)
        at com.rackspacecloud.blueflood.service.ShardStateWorker.run(ShardStateWorker.java:88)
        at java.lang.Thread.run(Thread.java:745)
2015-08-14 14:56:40 INFO  tricsIngestionServer:103 - Starting metrics listener HTTP server on port 19000
2015-08-14 14:56:40 INFO  tricsIngestionServer:112 - Starting tracker service
2015-08-14 14:56:40 INFO  efloodServiceStarter:109 - Successfully started ingestion service module com.rackspacecloud.blueflood.service.HttpIngestionService with writer: AstyanaxMetricsWriter
2015-08-14 14:56:40 INFO  efloodServiceStarter:128 - Started 1 ingestion services
2015-08-14 14:56:40 INFO  efloodServiceStarter:147 - Loading query service module com.rackspacecloud.blueflood.service.HttpQueryService
2015-08-14 14:56:40 INFO  efloodServiceStarter:152 - Starting query service module com.rackspacecloud.blueflood.service.HttpQueryService
2015-08-14 14:56:40 INFO  etricDataQueryServer:73  - Starting metric data query server (HTTP) on port 20000
2015-08-14 14:56:40 INFO  efloodServiceStarter:154 - Successfully started query service module com.rackspacecloud.blueflood.service.HttpQueryService
2015-08-14 14:56:40 INFO  efloodServiceStarter:173 - Started 1 query services
2015-08-14 14:56:40 INFO  efloodServiceStarter:240 - No event listener modules configured.
2015-08-14 14:56:40 INFO  efloodServiceStarter:308 - All blueflood services started

My environment:

  • Ubuntu 14.04
  • Maven 3.0.5
  • Java 7 build: 1.7.0u80
  • Java 8 build: 1.8.0u51

Documentation is incomplete

Documentation is the most important feature in an open source project.
It's impossible to use this as it is.

Explore bucketing rows

There are several motivations for this:

  1. Cassandra cache tunables after 1.0 are global--row and/or key caches are on everywhere or off everywhere. If we wish to use row cache, we'll need to tend toward smaller rows.
  2. Bucketed rows will allow BF to be a more viable option for high-frequency signals. (The current design was conceived with a 30-second period in mind.)

More details on implementing other ingestion/query protocols

It appears the ingestion protocol layer is pluggable, but the brief mention of Thrift in the readme assumes quite a bit of existing working knowledge of Blueflood internals.

To add a custom ingestion protocol, are these more or less the minimum pieces one must implement?

com.rackspacecloud.blueflood.inputs.handlers:

  • MyProtocolMetricsIngestionServer
  • MyProtocolMetricsServerPipelineFactory : ChannelPipelineFactory
  • MyProtocolMetricsIngestionHandler : ProtocolRequestHandler

com.rackspacecloud.blueflood.inputs.formats:

  • MyEncapsulationMetricsContainer : MetricsContainer

Further, I'm still digging through, but are the ingestion and query layers hard-coded to use the reference HTTP classes, or is there a configuration value?

It would be great to see more details or small examples on how one might implement a sample layer for something small and simple like MessagePack, or plain-text UDP (similar to statsd).
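
For orientation, the pipeline-factory piece against the netty 3 API the codebase uses might look roughly like this sketch (MyProtocolFrameDecoder and MyProtocolMetricsIngestionHandler are hypothetical names echoing the list above):

import org.jboss.netty.channel.ChannelPipeline;
import org.jboss.netty.channel.ChannelPipelineFactory;
import org.jboss.netty.channel.Channels;

public class MyProtocolMetricsServerPipelineFactory implements ChannelPipelineFactory {
    public ChannelPipeline getPipeline() throws Exception {
        ChannelPipeline pipeline = Channels.pipeline();
        pipeline.addLast("decoder", new MyProtocolFrameDecoder());            // hypothetical: bytes -> frames
        pipeline.addLast("handler", new MyProtocolMetricsIngestionHandler()); // hypothetical: frames -> metrics
        return pipeline;
    }
}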

ElasticSearch search queries returning empty list.

ISSUE:

root@cassy1:/home/lakshmi# curl -X GET http://localhost:20000/v2.0/tenant-id/metrics/search?query="*"
[]root@cassy1:/home/lakshmi#

Hit elastic search directly:

root@cassy1:/home/lakshmi# curl -X GET "http://cassy1:9200/metric_metadata/_search?pretty=true&q=tenantId:st2analytics-tester&size=121"
{
  "took" : 19,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 135,
</snip>

DEBUG INFO:
We are using Blueflood master (jar with all dependencies).

  • OS - Ubuntu 14.04
root@cassy1:/home/lakshmi# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.1 LTS"
root@cassy1:/home/lakshmi#
  • Java 8 runtime (jar probably compiled with java 7)
root@cassy1:/home/lakshmi# java -version
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
root@cassy1:/home/lakshmi#

Relevant elastic search config:

root@cassy1:/home/lakshmi# grep "ELASTICSEARCH" /opt/blueflood/blueflood.conf
ELASTICSEARCH_HOSTS=localhost:9300
ELASTICSEARCH_CLUSTERNAME=elasticsearch
root@cassy1:/home/lakshmi#

OTHER INFO

I browsed the code and it looks like the search is performed on this EVENT_INDEX https://github.com/rackerlabs/blueflood/blob/master/blueflood-elasticsearch/src/main/java/com/rackspacecloud/blueflood/io/EventElasticSearchIO.java#L82

but when metrics are indexed, the index used is metric_metadata
https://github.com/rackerlabs/blueflood/blob/master/blueflood-elasticsearch/src/main/java/com/rackspacecloud/blueflood/service/ElasticIOConfig.java#L22

Looking at the Elasticsearch cluster, I just have the metric_metadata index.

root@cassy1:/home/lakshmi# curl 'localhost:9200/_cat/indices?v'
health status index           pri rep docs.count docs.deleted store.size pri.store.size
yellow open   metric_metadata   5   1        135            0     45.3kb         45.3kb
root@cassy1:/home/lakshmi#

This looks like a bug in the configuration settings: the search should use the index from configuration. Am I missing something, or is the fix what I just said? I can make a PR. Just need pointers. Thanks for looking!

Class that manages schema

We need a class that verifies and updates schema on startup. It could start out simple, but we'd want to end up with something that:

  • Works across versions of Cassandra, beginning with 1.1
  • Allows schema to be injected by other modules but protects core parts of the schema.
