wbolster avatar wbolster commented on September 16, 2024

Thanks for the feedback. Glad to hear that you like HappyBase. :)

I've experimented a bit, but it seems to me everything works as expected. I'll explain below.

Let's start with a test table with only two rows in it:

>>>> list(t.scan())
[('foo', {'cf2:a': 'bar', 'cf1:b': 'bla'}), ('row1', {'cf1:col': 'value'})]
>>>> for k, d in t.scan():
....     print k, d
.... 
foo {'cf2:a': 'bar', 'cf1:b': 'bla'}
row1 {'cf1:col': 'value'}

When retrieving a single row, both .row() and .rows() work as expected:

>>>> t.row('foo')
{'cf2:a': 'bar', 'cf1:b': 'bla'}
>>>> t.rows(['foo'])
[('foo', {'cf2:a': 'bar', 'cf1:b': 'bla'})]
>>>> t.rows(['row1'])
[('row1', {'cf1:col': 'value'})]

This also seems to work fine when retrieving both existing rows:

>>>> t.rows(['foo', 'row1'])
[('foo', {'cf2:a': 'bar', 'cf1:b': 'bla'}), ('row1', {'cf1:col': 'value'})]

Looking up two existing rows, and one row that does not exist, results in two rows:

>>>> t.rows(['foo', 'row1', 'DOESNOTEXIST'])
[('foo', {'cf2:a': 'bar', 'cf1:b': 'bla'}), ('row1', {'cf1:col': 'value'})]
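
Since rows() silently leaves out keys that do not exist, a caller who needs to know which keys were missing can compare the requested keys with the returned ones. A minimal sketch, assuming the result is the list of (key, data) tuples shown above; find_missing is a hypothetical helper name, not part of HappyBase:

```python
def find_missing(requested_keys, results):
    """Return the requested keys that are absent from a rows() result.

    `results` is a list of (row_key, data_dict) tuples, the shape that
    t.rows() returns in the session above. Hypothetical helper, not
    part of HappyBase.
    """
    found = set(key for key, _ in results)
    return [key for key in requested_keys if key not in found]


# Result copied from the session above.
results = [('foo', {'cf2:a': 'bar', 'cf1:b': 'bla'}),
           ('row1', {'cf1:col': 'value'})]
missing = find_missing(['foo', 'row1', 'DOESNOTEXIST'], results)
```

Here missing ends up containing only 'DOESNOTEXIST'.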

Let's insert another row, with only values in the 'cf1' column family:

>>>> t.put('row2', {'cf1:c1': 'hi'})

Let's retrieve an old row and the row we just inserted:

>>>> t.rows(['foo', 'row2'])
[('foo', {'cf2:a': 'bar', 'cf1:b': 'bla'}), ('row2', {'cf1:c1': 'hi'})]

If we are only interested in the 'cf1' column family, we can filter:

>>>> t.rows(['foo', 'row2'], columns=['cf1'])
[('foo', {'cf1:b': 'bla'}), ('row2', {'cf1:c1': 'hi'})]

The same for the 'cf2' column family:

>>>> t.rows(['foo', 'row2'], columns=['cf2'])
[('foo', {'cf2:a': 'bar'})]

As you can see, only 'foo' was returned here, since the 'row2' row does not have any KeyValues in the 'cf2' column family. This is expected HBase behaviour.

When retrieving rows that only have KeyValues in the 'cf1' column family, we won't get any results when filtering on the 'cf2' family:

>>>> t.rows(['row1', 'row2'], columns=['cf2'])
[]

This is also expected behaviour, since there is no data in the 'cf2' column family for the requested rows. Note that HBase, by design, does not store "null" data, so rows with data in a single column family will not show up in scans/lookups in other column families.
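
To illustrate why this happens, here is a toy model in plain Python. It is not HBase or HappyBase code, and toy_rows is a made-up name; it only assumes that a table is a collection of stored cells and nothing else. It handles whole column families only, not individual column names:

```python
# Toy model: the table is just the set of cells that were actually stored.
# There is no separate record saying "this row exists".
cells = {
    ('foo', 'cf1:b'): 'bla',
    ('foo', 'cf2:a'): 'bar',
    ('row1', 'cf1:col'): 'value',
    ('row2', 'cf1:c1'): 'hi',
}


def toy_rows(cells, row_keys, families=None):
    """Mimic the lookups above against the toy cell store."""
    results = []
    for row_key in row_keys:
        data = {}
        for (stored_row, column), value in cells.items():
            family = column.split(':', 1)[0]
            if stored_row != row_key:
                continue
            if families is None or family in families:
                data[column] = value
        if data:
            # A row without matching cells simply never shows up.
            results.append((row_key, data))
    return results


only_cf2 = toy_rows(cells, ['foo', 'row2'], families=['cf2'])
nothing = toy_rows(cells, ['row1', 'row2'], families=['cf2'])
```

In this model only_cf2 contains just the 'foo' row and nothing is an empty list, matching the two lookups above.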

Long story short: it seems HappyBase correctly obtains the expected results from HBase in the cases you described. Are you really sure your code isn't behaving as expected?

Regarding your assumption that rows() is faster than individual row() calls: that's correct, but keep in mind that only the overhead between the Python app and the Thrift server is reduced. The Thrift server itself asks HBase to look up each row in turn, possibly resulting in communication with many different region servers. There is no way to look up non-adjacent rows faster than performing one get operation per row. (For adjacent rows, scan() is what you need, but it serves a different purpose.)

Finally, regarding your question about an "exists()" function: there is no such thing in HBase. If you want a value, just retrieve it. If it's not there, you'll get an empty response. If it is there, you'll get back the value (and timestamp).
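
If a yes/no answer is all you need, the "just retrieve it" approach can be wrapped in a small helper. This is only a sketch: row_exists and FakeTable are hypothetical names, not part of HappyBase, and FakeTable is a stand-in so the snippet runs without an HBase cluster. It relies on row() returning an empty dict for a missing row:

```python
def row_exists(table, row_key):
    """Return True if row() finds any data for row_key.

    Hypothetical helper, not part of HappyBase. It relies on row()
    returning an empty dict when the row is not there.
    """
    return bool(table.row(row_key))


class FakeTable(object):
    """Stand-in for a table object so this runs without HBase."""

    def __init__(self, data):
        self._data = data

    def row(self, row_key):
        # Like the lookups above: a missing row yields an empty dict.
        return dict(self._data.get(row_key, {}))


t = FakeTable({'foo': {'cf2:a': 'bar', 'cf1:b': 'bla'}})
foo_exists = row_exists(t, 'foo')
other_exists = row_exists(t, 'DOESNOTEXIST')
```

With a real table this still transfers the whole row, so it is no cheaper than a normal lookup.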

Please let me know if you experience problems that are not covered in the examples above. I'll be glad to help.

Note: the above tests were conducted with HappyBase 0.4, HBase 0.94 and PyPy 1.8.0 (Python 2.7.2)

from happybase.

wbolster avatar wbolster commented on September 16, 2024

Ping....

Please confirm the issues you reported earlier with some test cases (since my tests work), or better yet, let me know that things work as expected for you now. Thanks! :-)


gjlondon avatar gjlondon commented on September 16, 2024

Hi! Sorry I've been meaning to get back to you.

I made your changes and things suddenly worked as you described. I've been meaning to try to replicate the strange behavior I was seeing, but I haven't had time yet.

So for now, everything is great. I'll try to circle back and figure out exactly what I was doing wrong as soon as I have the chance.


wbolster avatar wbolster commented on September 16, 2024

Glad to hear. I'll close this issue now. Feel free to report any further issues!
