Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - What happens when you call `if key in dict`

I have a class (let's call it `myClass`) that implements both `__hash__` and `__eq__`. I also have a dict that maps `myClass` objects to values that take some time to compute.

Over the course of my program, many (on the order of millions) `myClass` objects are instantiated. This is why I use the dict to keep track of those values.

However, sometimes a new `myClass` object might be equivalent to an older one (as defined by the `__eq__` method). So rather than compute the value for that object again, I'd rather just look up the value of the older `myClass` object in the dict. To accomplish this, I do `if myNewMyClassObj in dict`.

Here's my question:

When I use that `in` clause, what gets called, `__hash__` or `__eq__`? The point of using a dict is that it has O(1) lookup time, so `__hash__` must be called. But what if `__hash__` and `__eq__` don't agree with each other? In that case, will I get a false positive for `if myNewMyClassObj in dict`?

Follow-up question:

I want to minimize the number of entries in my dict, so I would ideally like to keep only one of a set of equivalent `myClass` objects in the dict. So again, it seems that `__eq__` needs to be called when evaluating `if myNewMyClassObj in dict`, which would degrade the dict's O(1) lookup time to O(n).



1 Reply


First, `hash(myNewMyClassObj)` is computed, which calls `myNewMyClassObj.__hash__()`. If the dictionary holds no key with the same hash, Python concludes that `myNewMyClassObj` is not in the dictionary. (Note that Python requires that whenever `__eq__` reports two objects as equal, their `__hash__` values must be identical.)

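If some keys with the same hash are found in the dictionary, `__eq__` gets called on them in turn until one compares equal. (CPython first checks whether the stored key is the very same object and skips `__eq__` in that case.) If any of them is equal, `myNewMyClassObj in dict_` returns True; otherwise it returns False. So a hash collision alone never produces a false positive.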
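The order of calls described above can be observed directly. The following is a minimal sketch, not the asker's actual class: `MyClass`, its `key` attribute, and the `calls` log are hypothetical names introduced only for illustration, and the identity shortcut shown in the last case is CPython behavior.

```python
class MyClass:
    """Hypothetical stand-in for the question's class; logs every dunder call."""
    calls = []

    def __init__(self, key):
        self.key = key

    def __hash__(self):
        MyClass.calls.append("hash")
        return hash(self.key)

    def __eq__(self, other):
        MyClass.calls.append("eq")
        if not isinstance(other, MyClass):
            return NotImplemented
        return self.key == other.key


old = MyClass(1)
cache = {old: "expensive value"}  # inserting the key hashes it once

# Case 1: a new object equal to the stored key.
MyClass.calls.clear()
found_equal = MyClass(1) in cache
calls_equal = list(MyClass.calls)   # ['hash', 'eq']

# Case 2: a new object with a different hash; __eq__ is never reached.
MyClass.calls.clear()
found_other = MyClass(2) in cache
calls_other = list(MyClass.calls)   # ['hash']

# Case 3: the very same object; the identity check makes __eq__ unnecessary.
MyClass.calls.clear()
found_same = old in cache
calls_same = list(MyClass.calls)    # ['hash']

print(found_equal, calls_equal)
print(found_other, calls_other)
print(found_same, calls_same)
```

In every case `__hash__` runs exactly once, and `__eq__` runs only when a stored key with a matching hash is a different object.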

Thus, you just need to make sure both __eq__ and __hash__ are fast.

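To your follow-up question: yes, `dict_` stores only one of a set of equivalent `myClass` objects (as defined by `__eq__`), as does `set`. Assigning through an equivalent key keeps the key object that was inserted first and only replaces the value.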
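A short sketch of that deduplication, under stated assumptions: the `Point` class and its `label` attribute are hypothetical, chosen so that two equal objects can still be told apart by identity.

```python
class Point:
    """Hypothetical class: equality and hash depend on x only, not on label."""

    def __init__(self, x, label):
        self.x = x
        self.label = label  # deliberately ignored by __eq__ and __hash__

    def __hash__(self):
        return hash(self.x)

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self.x == other.x


first = Point(1, "first")
second = Point(1, "second")   # equal to `first`, but a distinct object

d = {first: "a"}
d[second] = "b"               # equal key: the value is replaced, no new entry

stored_key = next(iter(d))
print(len(d))                 # 1
print(stored_key.label)       # first  (the original key object is kept)
print(d[first])               # b

s = {first, second}
print(len(s))                 # 1
```

For the caching use case in the question, this means `dict_[myNewMyClassObj]` returns the value stored under the older equivalent object, and the dict never grows beyond one entry per equivalence class.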

Note that `__eq__` is only called on stored keys whose hash equals that of the key being looked up. With a well-distributed `__hash__`, the number of such keys is usually very small (the dict implementation resizes its table to keep collisions rare). So you still have (roughly) O(1) lookup performance.
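To illustrate why hash quality matters here, the sketch below counts `__eq__` calls for a lookup that misses, once with a reasonable hash and once with a deliberately degenerate constant hash. The `Key` class, the `constant_hash` flag, and the helper function are hypothetical and exist only for this demonstration.

```python
class Key:
    """Hypothetical key type that counts how often __eq__ is invoked."""
    eq_calls = 0

    def __init__(self, n, constant_hash):
        self.n = n
        self.constant_hash = constant_hash

    def __hash__(self):
        # A constant hash is legal but forces every key to collide.
        return 0 if self.constant_hash else hash(self.n)

    def __eq__(self, other):
        Key.eq_calls += 1
        if not isinstance(other, Key):
            return NotImplemented
        return self.n == other.n


def eq_calls_for_missing_lookup(n_entries, constant_hash):
    """Build a dict of n_entries keys, then look up a key that is absent."""
    table = {Key(i, constant_hash): i for i in range(n_entries)}
    probe = Key(n_entries, constant_hash)  # equal to none of the stored keys
    Key.eq_calls = 0                       # count only the lookup itself
    found = probe in table
    return found, Key.eq_calls


found_good, good = eq_calls_for_missing_lookup(500, constant_hash=False)
found_bad, bad = eq_calls_for_missing_lookup(500, constant_hash=True)

print("distinct hashes:", good, "__eq__ calls")
print("constant hash:  ", bad, "__eq__ calls")
```

With distinct hashes the lookup needs few or no `__eq__` calls, while the constant hash forces a comparison against every stored key, which is the O(n) behavior the question worries about.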

