Comments (4)
Thanks for the feedback. Glad to hear that you like HappyBase. :)
I've experimented a bit, but it seems to me everything works as expected. I'll explain below.
Let's start with a test table with only two rows in it:
>>>> list(t.scan())
[('foo', {'cf2:a': 'bar', 'cf1:b': 'bla'}), ('row1', {'cf1:col': 'value'})]
>>>> for k, d in t.scan():
....     print k, d
....
foo {'cf2:a': 'bar', 'cf1:b': 'bla'}
row1 {'cf1:col': 'value'}
When retrieving a single row, both .row() and .rows() work as expected:
>>>> t.row('foo')
{'cf2:a': 'bar', 'cf1:b': 'bla'}
>>>> t.rows(['foo'])
[('foo', {'cf2:a': 'bar', 'cf1:b': 'bla'})]
>>>> t.rows(['row1'])
[('row1', {'cf1:col': 'value'})]
This also seems to work fine when retrieving both existing rows:
>>>> t.rows(['foo', 'row1'])
[('foo', {'cf2:a': 'bar', 'cf1:b': 'bla'}), ('row1', {'cf1:col': 'value'})]
Looking up two existing rows, and one row that does not exist, results in two rows:
>>>> t.rows(['foo', 'row1', 'DOESNOTEXIST'])
[('foo', {'cf2:a': 'bar', 'cf1:b': 'bla'}), ('row1', {'cf1:col': 'value'})]
Let's insert another row, with only values in the 'cf1' column family:
>>>> t.put('row2', {'cf1:c1': 'hi'})
Let's retrieve an old row and the newly inserted row:
>>>> t.rows(['foo', 'row2'])
[('foo', {'cf2:a': 'bar', 'cf1:b': 'bla'}), ('row2', {'cf1:c1': 'hi'})]
If we are only interested in the 'cf1' column family, we can filter on it:
>>>> t.rows(['foo', 'row2'], columns=['cf1'])
[('foo', {'cf1:b': 'bla'}), ('row2', {'cf1:c1': 'hi'})]
The same for the 'cf2' column family:
>>>> t.rows(['foo', 'row2'], columns=['cf2'])
[('foo', {'cf2:a': 'bar'})]
As you can see, only 'foo' was returned here, since the 'row2' row does not have any KeyValues in the 'cf2' column family. This is expected HBase behaviour.
When retrieving rows that only have KeyValues in the 'cf1' column family, we won't get any results when filtering on the 'cf2' family:
>>>> t.rows(['row1', 'row2'], columns=['cf2'])
[]
This is also expected behaviour, since there is no data in the 'cf2' column family for the requested rows. Note that HBase, by design, does not store "null" data, so rows with data in a single column family will not show up in scans/lookups in other column families.
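Since rows() silently omits keys that have no data to return, it can help to compare the result against the requested keys. The following is a minimal sketch, assuming `table` is a HappyBase `Table` (or any object with a compatible `rows()` method); the helper name `missing_keys` is hypothetical and not part of HappyBase.

```python
def missing_keys(table, keys, columns=None):
    """Return the requested keys that rows() did not return.

    A key ends up here either because the row does not exist at all, or
    because it holds no data in the requested columns/column families.
    (Hypothetical helper; not part of the HappyBase API.)
    """
    # rows() yields (row_key, data_dict) tuples only for rows with data.
    returned = set(key for key, _data in table.rows(keys, columns=columns))
    # Preserve the caller's ordering of the keys that came back empty.
    return [key for key in keys if key not in returned]
```

With the test table above, `missing_keys(t, ['foo', 'row2'], columns=['cf2'])` would be expected to report 'row2', since that row has no data in 'cf2'. Note that under Python 3, HappyBase returns row keys as bytes, so the requested keys would need to be bytes as well for the comparison to work.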
Long story short: it seems HappyBase correctly obtains the expected results from HBase in the cases you described. Are you really sure your code isn't behaving as expected?
Regarding your assumption that rows() is faster than individual row() calls: that's correct, but keep in mind that only the overhead between the Python app and the Thrift server is reduced. The Thrift server itself asks HBase to look up each row in turn, possibly resulting in communication with many different region servers. There is no way to look up non-adjacent rows faster than performing one get operation per row. (For adjacent rows, scan() is what you need, but it serves a different purpose.)
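To make the distinction concrete, here is a rough sketch of the two access patterns, assuming `table` is a HappyBase `Table`. The function names `fetch_scattered` and `fetch_adjacent` are made up for illustration; only `rows()` and `scan()` with `row_start`/`row_stop` are HappyBase calls.

```python
def fetch_scattered(table, keys):
    """Look up non-adjacent rows with a single rows() call.

    This saves round trips between the Python app and the Thrift server
    only; the Thrift server still looks up each key in HBase in turn.
    (Hypothetical helper name.)
    """
    # rows() returns a list of (row_key, data_dict) tuples.
    return dict(table.rows(keys))


def fetch_adjacent(table, first_key, stop_key):
    """Read a contiguous key range with scan().

    row_stop is exclusive, so the row stored at stop_key is not returned.
    (Hypothetical helper name.)
    """
    return list(table.scan(row_start=first_key, row_stop=stop_key))
```

For example, `fetch_scattered(t, ['foo', 'row2'])` batches two unrelated lookups, while `fetch_adjacent(t, 'row1', 'row3')` would walk the rows from 'row1' up to, but not including, 'row3'.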
Finally, regarding your question about an "exists()" function: there is no such thing in HBase. If you want a value, just retrieve it. If it's not there, you'll get an empty response. If it is there, you'll get back the value (and timestamp).
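If an existence check is still convenient in application code, it can be approximated on top of row(). This is only a sketch under the assumption that `table` is a HappyBase `Table`, whose row() returns an empty dict when nothing is found; `row_exists` is a hypothetical name, not something HappyBase provides.

```python
def row_exists(table, key, columns=None):
    """Return True if the row has at least one stored cell.

    This simply retrieves the row and checks whether anything came back,
    so it costs as much as a normal lookup. Passing `columns` narrows the
    check (and the data transferred) to those columns or column families.
    (Hypothetical helper; not part of the HappyBase API.)
    """
    return bool(table.row(key, columns=columns))
```

Keep in mind that a row with data only in 'cf1' would count as "not existing" when checked with `columns=['cf2']`, for the same reason as in the examples above.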
Please let me know if you experience problems that are not covered in the examples above. I'll be glad to help.
Note: the above tests were conducted with HappyBase 0.4, HBase 0.94, and PyPy 1.8.0 (Python 2.7.2).
from happybase.
Ping....
Please confirm the issues you reported earlier with some test cases (since my tests work), or better yet, let me know that things work as expected for you now. Thanks! :-)
Hi! Sorry I've been meaning to get back to you.
I made your changes and things suddenly worked as you described. I've been meaning to try to replicate the strange behavior I was seeing, but I haven't had time yet.
So for now, everything is great. I'll try to circle back and figure out exactly what I was doing wrong as soon as I have the chance.
On Nov 9, 2012, at 3:21 PM, Wouter Bolsterlee wrote:
Ping....
Please confirm the issues you reported earlier with some test cases (since my tests work), or better yet, let me know that things work as expected for you now. Thanks! :-)
Glad to hear. I'll close this issue now. Feel free to report any further issues!