Comments (6)
Yes, from beginning of table to specified end point of 2013-02-17. Were you expecting something else?
from phoenix.
This is probably too speculative, but could region boundaries be employed to provide a lower bound?
I don't see how region boundaries help you in this case if date is not the leading part of the key. For example, if you have HOST CHAR(2), EVENT_DATE DATE as your schema, then the MAX(date) could be in any region and in any row, since the order depends on the HOST first. For example:
AA date
AB date - 1
AC date + 1
BA date - 2
ZZ date + 3
In the above case, the date could be anything, and the rows would still sort in that order.
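To make that concrete, here is a minimal Python sketch, not Phoenix code. It models rows as `(HOST, EVENT_DATE)` tuples; the reference date and the second data set are made up for illustration. Both tables are valid in row-key order, yet the latest date sits in a different row in each.

```python
from datetime import date, timedelta

base = date(2013, 2, 17)  # arbitrary reference date, for illustration only
day = timedelta(days=1)

# Rows keyed by (HOST, EVENT_DATE), mirroring the example above.
rows = [
    ("AA", base),
    ("AB", base - 1 * day),
    ("AC", base + 1 * day),
    ("BA", base - 2 * day),
    ("ZZ", base + 3 * day),
]

# Same hosts, different (hypothetical) dates: the latest date is now in row 1.
other_rows = [
    ("AA", base),
    ("AB", base + 9 * day),
    ("AC", base - 5 * day),
    ("BA", base + 2 * day),
    ("ZZ", base - 7 * day),
]


def is_in_key_order(table):
    """True if the rows are sorted by (host, date), i.e. by row key."""
    return table == sorted(table)


def index_of_max_date(table):
    """Position of the row holding the latest event date."""
    return max(range(len(table)), key=lambda i: table[i][1])


for table in (rows, other_rows):
    print(is_in_key_order(table), index_of_max_date(table))
```

Because the host dominates the sort order, nothing about a row's position says where the latest date lives.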
Stats help in this case only to balance the load when parallelizing. We couldn't really rely on the max/min in the stats being the answer to a query, because the stats are updated asynchronously at some configurable time interval. A new min or max may have snuck in after the last stats gathering was done.
There is one case I can think of where region boundaries may be helpful. If the HOST value is more like an enum with a few limited values, then the same HOST value might repeat for multiple region boundaries. In that case, for something like MAX(event_date), you could skip regions where HOST repeats until the last one. For example, say the region boundaries are:
NA 11111111
NA 22222222
NA 33333333
TX 11111111
TX 22222222
ZZ 11111111
You could skip the first two NA regions, since you'd know the MAX(event_date) wouldn't be in them. Then you could skip the first TX region too, and so on.
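A rough sketch of that skipping rule in Python, not Phoenix code. The function name `regions_to_scan` is hypothetical, and each boundary is modeled as the `(host, date)` start key of a region. The sketch assumes that when two consecutive regions start with the same host, the later one really holds at least one row for that host; if it does not, the skipped region would have to be revisited.

```python
def regions_to_scan(start_keys):
    """Return the indexes of regions that may hold a host's latest date.

    A region is skipped when the next region starts with the same host,
    because any row for that host in the next region has a later date.
    """
    keep = []
    for i, (host, _) in enumerate(start_keys):
        is_last = i == len(start_keys) - 1
        if is_last or start_keys[i + 1][0] != host:
            keep.append(i)
    return keep


# Region start keys from the example above, as (host, date) pairs.
boundaries = [
    ("NA", "11111111"),
    ("NA", "22222222"),
    ("NA", "33333333"),
    ("TX", "11111111"),
    ("TX", "22222222"),
    ("ZZ", "11111111"),
]

print(regions_to_scan(boundaries))
```

For the boundaries above this keeps the third NA region, the second TX region, and the ZZ region, and skips the other three.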
In this case, the rowkey is led by the date column. So ideally, only one region needs to be scanned, not the entire table up to the specified ceiling.
The only questionable part for me of depending on region boundaries is how to handle retrying speculation that's been made invalid by splits and whatnot.
If date is leading the row key, we can definitely do a better job, but it would really only apply if max is the only thing being selected. Right now for your original query, we'd scan every region up to to_date('2013-02-17 00:00:00'). We really only need to look at the region containing to_date('2013-02-17 00:00:00'). Would be good to generalize this. I'll morph this issue into that.
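As a sketch of what the generalized lookup might do when the date leads the row key (Python, not Phoenix internals): find the region containing the ceiling and walk backward only if that region has no qualifying row. The function name, the in-memory region model, and the inclusive ceiling are all assumptions made for illustration.

```python
from bisect import bisect_right
from datetime import date


def max_date_up_to(region_starts, regions, ceiling):
    """Return (latest date <= ceiling, number of regions scanned).

    region_starts[i] is the lowest key region i can hold; regions[i] is
    the sorted list of dates stored in it. Both are hypothetical stand-ins
    for real region metadata and scans.
    """
    # Index of the region whose key range contains the ceiling.
    i = bisect_right(region_starts, ceiling) - 1
    scanned = 0
    while i >= 0:
        scanned += 1
        rows = regions[i]
        j = bisect_right(rows, ceiling)  # rows[:j] are all <= ceiling
        if j > 0:
            return rows[j - 1], scanned
        i -= 1  # no qualifying row here: fall back to the previous region
    return None, scanned


region_starts = [date.min, date(2013, 1, 1), date(2013, 2, 1), date(2013, 3, 1)]
regions = [
    [date(2012, 12, 30)],
    [date(2013, 1, 5), date(2013, 1, 20)],
    [date(2013, 2, 10), date(2013, 2, 20)],
    [date(2013, 3, 5)],
]

print(max_date_up_to(region_starts, regions, date(2013, 2, 17)))
```

With this made-up data, a ceiling of 2013-02-17 is answered from a single region instead of every region from the start of the table.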