Comments (4)
A lighter-weight version of this request is logging that details the processing status of each thread. For example, we just debugged an issue with our R&D cluster where 2 nodes were acting flaky, and I'm pretty sure the Phoenix client was hanging specifically on those two nodes. Through trial and error we figured it out, but it would have been nice to have logs along the lines of:
[timestamp][request-id] Starting query "select count(*) from myTable"
[timestamp][request-id][thread-id] starting against RS-X (for each thread)
[timestamp][request-id][thread-id] ending in XXX (ms)
[timestamp][request-id] Finished in YYY (ms)
What this would have uncovered is that we had N region servers, and N-2 requests were completing.
Icing on the cake is that if a query times out, tell me which RS's it's waiting on.
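A minimal sketch of what that per-thread logging could look like on the client side. Everything here is hypothetical (the class name, the idea of one worker per region server, the log format); it just shows how a missing "ending" line for one thread would immediately finger the region server that is hanging:

```java
import java.util.List;
import java.util.UUID;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;

// Hypothetical sketch: log the start/end of each parallel scan thread so a
// hung region server shows up as a request id with a "starting" line but no
// matching "ending" line.
public class ScanLogger {
    public static String runQuery(String sql, List<String> regionServers) {
        String requestId = UUID.randomUUID().toString().substring(0, 8);
        long queryStart = System.currentTimeMillis();
        System.out.printf("[%d][%s] Starting query \"%s\"%n", queryStart, requestId, sql);

        ExecutorService pool = Executors.newFixedThreadPool(regionServers.size());
        List<Future<?>> futures = regionServers.stream()
            .map(rs -> pool.submit(() -> {
                long threadId = Thread.currentThread().getId();
                long start = System.currentTimeMillis();
                System.out.printf("[%d][%s][%d] starting against %s%n",
                        start, requestId, threadId, rs);
                // ... issue the actual scan against this region server here ...
                System.out.printf("[%d][%s][%d] ending in %d ms%n",
                        System.currentTimeMillis(), requestId, threadId,
                        System.currentTimeMillis() - start);
            }))
            .collect(Collectors.toList());
        for (Future<?> f : futures) {
            try { f.get(); } catch (Exception e) { throw new RuntimeException(e); }
        }
        pool.shutdown();
        System.out.printf("[%d][%s] Finished in %d ms%n",
                System.currentTimeMillis(), requestId,
                System.currentTimeMillis() - queryStart);
        return requestId;
    }

    public static void main(String[] args) {
        runQuery("select count(*) from myTable", List.of("RS-1", "RS-2", "RS-3"));
    }
}
```

With N region servers and only N-2 "ending" lines in the log, the flaky nodes fall straight out.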
from phoenix.
We should look at the Zipkin-based monitoring that Elliot Clark is doing for HBase here. It needs to aggregate/roll up the costs, but if it did that, it would be a sweet way to monitor perf.
I was thinking we could use Hadoop metrics2 to manage the metrics for a given request (both scans and inserts). What you really want in metrics tooling is:
- Async collection
- Non-blocking writes
- Flexible writers
- Dropping extra metrics if the buffer becomes too full
metrics2 gives us all of that. There are also good reference implementations, for instance the Hadoop code itself (here and here) as well as the new HBase metrics system.
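The drop-when-full behavior is the property worth calling out, since it is what keeps instrumentation from ever stalling a query. This is a plain-JDK sketch of that queueing behavior (not the metrics2 API itself, and all names are made up): callers enqueue without blocking, and records are simply dropped once the bounded buffer is full.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the buffering contract we want from the metrics system:
// record() never blocks the caller, and overflow is dropped (and counted)
// rather than applying backpressure to the query path.
public class DroppingMetricsBuffer {
    private final BlockingQueue<String> buffer;
    private final AtomicLong dropped = new AtomicLong();

    public DroppingMetricsBuffer(int capacity) {
        this.buffer = new ArrayBlockingQueue<>(capacity);
    }

    /** Non-blocking: returns immediately whether or not the record fit. */
    public boolean record(String metric) {
        boolean accepted = buffer.offer(metric); // offer() never blocks
        if (!accepted) dropped.incrementAndGet();
        return accepted;
    }

    public long droppedCount() { return dropped.get(); }

    public int queuedCount() { return buffer.size(); }

    public static void main(String[] args) {
        DroppingMetricsBuffer b = new DroppingMetricsBuffer(2);
        b.record("scan.time=12ms");
        b.record("scan.bytes=4096");
        b.record("scan.rows=100"); // buffer full: dropped, caller not blocked
        System.out.println("queued=" + b.queuedCount() + " dropped=" + b.droppedCount());
        // prints: queued=2 dropped=1
    }
}
```

In the real system a background sink thread would drain the buffer and write to the Phoenix table; the key design choice is `offer()` over `put()`, trading completeness of the stats for zero impact on request latency.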
We can then use this to keep stats on Phoenix in Phoenix tables. By instrumenting the Phoenix methods correctly, we can gather things like number of bytes sent, method times, region/region server response times, etc. Then you would publish these metrics to a Phoenix sink that, again, writes back to a Phoenix table (and possibly updates a local stats cache too).
The only interesting bits are then:
- Tracking method calls from the client to the server
- Creating a clean abstraction around dynamic variables
The latter is just good engineering. The former can be solved by tagging each method call with a UUID (similar to how Zipkin tracks the same request). Stats about the whole call would then all eventually end up in the same Phoenix stats table, which is then queryable.
The intelligent bit then becomes updating the stats table with metrics in a way that lets you do a rollup later to reconstruct history. Since you know the query id, you can correlate it between the clients and servers. This also gives you perfect timing, since you know the operation order (and you could get smarter when you parallelize things by having "sub" parts that get their own UUID but correlate to the original request, e.g. UUID 1234 splits into 1234-1, 1234-2, 1234-3).
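The parent/child id scheme above can be sketched in a few lines. This is just an illustration of the proposed convention (the class and method names are made up, not Phoenix APIs): each parallelized sub-part gets `<parent>-<n>`, and the parent id can be recovered later to join client- and server-side rows back into one request.

```java
import java.util.List;
import java.util.UUID;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Sketch of the proposed id scheme: one UUID per request; when the request is
// parallelized, each sub-part is tagged "<parent>-<n>" so every stats row can
// be correlated back to the original request in the stats table.
public class TraceIds {
    public static String newRequestId() {
        return UUID.randomUUID().toString();
    }

    /** Ids for the sub-parts of a parallelized request, e.g. 1234 -> 1234-1, 1234-2, 1234-3. */
    public static List<String> subIds(String requestId, int parts) {
        return IntStream.rangeClosed(1, parts)
                .mapToObj(i -> requestId + "-" + i)
                .collect(Collectors.toList());
    }

    /** Recover the parent request id from a sub-part id (strip the trailing "-<n>"). */
    public static String parentOf(String subId) {
        int dash = subId.lastIndexOf('-');
        return dash < 0 ? subId : subId.substring(0, dash);
    }

    public static void main(String[] args) {
        System.out.println(subIds("1234", 3));  // prints: [1234-1, 1234-2, 1234-3]
        System.out.println(parentOf("1234-2")); // prints: 1234
    }
}
```

Grouping the stats table by `parentOf(id)` then reconstructs the whole request, while the individual sub-ids preserve the per-thread timings.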
I started working through some simple, toy examples of using metrics2 for logging (simple and with dynamic method calls). It's nothing fancy and shouldn't be used directly for Phoenix, but it might be helpful to someone trying to figure out how the metrics2 stuff all works.
A simple prototype is up at github.com/jyates/phoenix/tree/tracing. It traces mutations from the client to the server through the indexing path, writes them to the sink (which writes them to a Phoenix table), and then has a simple reader to rebuild the traces.
See the end-to-end test for a full example.