Comments (8)
SGTM
from zodb.
On Sat, Sep 10, 2016 at 1:08 PM, Jason Madden wrote:
In general, objects without custom __eq__ and __hash__ methods are going
to be friendlier on the DB and the cache (if you can get away with identity
semantics).

Suppose you have two classes:
    import uuid

    from persistent import Persistent

    class WithHash(Persistent):
        def __init__(self):
            self.id = uuid.uuid4()

        def __eq__(self, other):
            return self.id == other.id

        def __hash__(self):
            return hash(self.id)

    class WithoutHash(Persistent):
        def __init__(self):
            self.id = uuid.uuid4()

Now if you had some dictionaries using those classes as keys:
    conn = db.open()
    conn.root.with_hash_dict = {WithHash(): i for i in range(5000)}
    conn.root.wout_hash_dict = {WithoutHash(): i for i in range(5000)}

Doing something like:
    conn = db.open()
    len(conn.root.with_hash_dict)

is going to unghost 5000 WithHash objects, whereas

    conn = db.open()
    len(conn.root.wout_hash_dict)

isn't going to unghost any objects (because unpickling a dictionary
re-hashes all the keys, and hash() looks up __hash__ on the class, not the
instance, so if the __hash__ method doesn't access any attributes, like
the default one, nothing has to be unghosted).

Even if your object cache is sized appropriately, large-ish dictionaries
can take a long time to unpickle when accessed for the first time in a
particular connection, adding lots of load to the DB and/or cache system;
the same happens when a dictionary of such persistent objects is first
created in memory.

If you can accept identity semantics (and for persistent objects, you
surprisingly often can), it's better to avoid custom __eq__ and __hash__
methods if you'll ever be creating dictionaries or sets of your persistent
objects.

This is a lesson we learned the hard way; coming from a Java background,
almost all of our objects defined custom __eq__ and __hash__ methods, and
that was fine until we started to get a lot of objects, when it became a
performance burden. Now it turns out that many of those objects don't
need these methods.

Worth adding to the docs?
Probably, but I'm not sure where. It's a bit obscure.
Maybe in "Other things you can do, but shouldn't".
Jim
Jim Fulton
http://jimfulton.info
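For anyone who wants to see the effect from the quoted message without setting up a database, here is a minimal pure-Python sketch. It does not use ZODB at all: `LazyRecord`, `LazyWithHash`, `LazyWithoutHash` and `loads_when_used_as_keys` are hypothetical names invented for this illustration, and the "ghost" is simulated by loading state on first attribute access, which only approximates what persistent objects do.

```python
class LazyRecord:
    """Hypothetical stand-in for a persistent object that starts as a ghost."""

    loads = 0  # counts simulated "unghosting" events across all instances

    def __init__(self, stored_id):
        # The real state stays "in the database" until an attribute is needed.
        self._stored = {"id": stored_id}

    def __getattr__(self, name):
        # Called only when normal lookup fails, i.e. while state is not loaded.
        stored = self.__dict__.get("_stored", {})
        if name not in stored:
            raise AttributeError(name)
        LazyRecord.loads += 1
        self.__dict__.update(stored)
        return stored[name]


class LazyWithHash(LazyRecord):
    def __eq__(self, other):
        return self.id == other.id

    def __hash__(self):
        # Touches self.id, so hashing forces the simulated state load.
        return hash(self.id)


class LazyWithoutHash(LazyRecord):
    # Inherits object.__hash__, which never looks at instance attributes.
    pass


def loads_when_used_as_keys(cls, count=1000):
    """Build a dict keyed by `count` instances and report simulated loads."""
    LazyRecord.loads = 0
    mapping = {cls(i): i for i in range(count)}
    assert len(mapping) == count
    return LazyRecord.loads


print(loads_when_used_as_keys(LazyWithoutHash))  # 0
print(loads_when_used_as_keys(LazyWithHash))     # 1000
```

Every insertion into the dictionary calls `__hash__` once, so the class whose `__hash__` reads an attribute pays one simulated load per key, while the identity-hashed class pays none.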
Maybe in "Other things you can do, but shouldn't".
As in, you can add a custom __hash__ but maybe you shouldn't?
Yup.
It occurs to me that it might be nice to have mix-in classes (or maybe just one) that implement identity-based hash and comparison based on OIDs. In the past, I wanted PxBTrees, but maybe IdentityHashablePersistent and IdentityComparablePersistent (or maybe just the latter and maybe with better names :))
Ok, I'll write this up and submit a PR. (I think it could also use something about implementing comparable methods to be used in a BTree, at least a pointer.)
Those sound like pretty good mixin classes. What would you do before the object is assigned an OID, though? Use its id?
I would error. Using its id would be a disaster.
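A rough sketch of what such a mixin might look like, purely as an assumption about the idea discussed here and not an existing ZODB class. `OIDIdentityMixin` and `_oid_or_error` are hypothetical names; the only real API relied on is the `_p_oid` attribute that persistent objects carry (None until an OID is assigned). It ignores which database the object lives in, so it would not be safe as-is across multiple databases.

```python
class OIDIdentityMixin:
    """Hypothetical mixin: equality and hashing based on the ZODB OID."""

    def _oid_or_error(self):
        # _p_oid is None until the object has been added to a connection.
        oid = getattr(self, "_p_oid", None)
        if oid is None:
            raise TypeError(
                "object has no OID yet; add it to a connection before "
                "hashing or comparing it"
            )
        return oid

    def __eq__(self, other):
        if not isinstance(other, OIDIdentityMixin):
            return NotImplemented
        return self._oid_or_error() == other._oid_or_error()

    def __hash__(self):
        return hash(self._oid_or_error())


# Stand-in used only to exercise the mixin without a database; a real class
# would be declared as `class Thing(OIDIdentityMixin, Persistent)`.
class FakePersistent(OIDIdentityMixin):
    _p_oid = None


first = FakePersistent()
first._p_oid = b"\x00" * 7 + b"\x01"
same = FakePersistent()
same._p_oid = b"\x00" * 7 + b"\x01"
unsaved = FakePersistent()

print(first == same)              # True
print(hash(first) == hash(same))  # True
try:
    hash(unsaved)
except TypeError as exc:
    print("error:", exc)
```

Raising instead of falling back to `id()` keeps the hash stable: a value derived from `id()` would change once the OID is assigned, which would corrupt any dictionary or set already holding the object.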
I had cause to remember about zope.keyreference today, which implements something remarkably similar.
Opened PR #118 for this.