Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Compare (assert equality of) two complex data structures containing numpy arrays in unittest

I use Python's unittest module and want to check if two complex data structures are equal. The objects can be lists of dicts with all sorts of values: numbers, strings, Python containers (lists/tuples/dicts) and numpy arrays. The latter are the reason for asking the question, because I cannot just do

self.assertEqual(big_struct1, big_struct2)

because it produces a

ValueError: The truth value of an array with more than one element is ambiguous.
Use a.any() or a.all()

I imagine that I need to write my own equality test for this. It should work for arbitrary structures. My current idea is a recursive function that:

  • tries direct comparison of the current "node" of arg1 to the corresponding node of arg2;
  • if no exception is raised, moves on ("terminal" nodes/leaves are processed here, too);
  • if ValueError is caught, goes deeper until it finds a numpy.array;
  • compares the arrays (e.g. like this).

What seems a little problematic is keeping track of "corresponding" nodes of two structures, but perhaps zip is all I need here.
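A minimal sketch of such a recursive check follows. It dispatches on the node type with isinstance instead of catching ValueError, which is a swapped-in variation on the idea above, and the name assert_deep_equal is hypothetical. It assumes the containers are plain dicts, lists and tuples; zip pairs up the corresponding nodes once the lengths are known to match.

```python
import numpy as np


def assert_deep_equal(a, b, path="root"):
    """Recursively compare two nested structures (hypothetical helper)."""
    if isinstance(a, np.ndarray) or isinstance(b, np.ndarray):
        # Arrays are delegated to NumPy; NaNs in matching positions count as equal.
        np.testing.assert_array_equal(a, b, err_msg=f"at {path}")
    elif isinstance(a, dict) and isinstance(b, dict):
        assert a.keys() == b.keys(), f"key mismatch at {path}"
        for key in a:
            assert_deep_equal(a[key], b[key], f"{path}[{key!r}]")
    elif isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
        assert type(a) is type(b), f"type mismatch at {path}"
        assert len(a) == len(b), f"length mismatch at {path}"
        # zip keeps track of the "corresponding" nodes of both structures.
        for i, (x, y) in enumerate(zip(a, b)):
            assert_deep_equal(x, y, f"{path}[{i}]")
    else:
        # Leaves: numbers, strings and anything else with a sane ==.
        assert a == b, f"{a!r} != {b!r} at {path}"
```

Inside a unittest.TestCase this can be called directly, since a failure surfaces as an AssertionError that names the path of the first mismatch.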

The question is: are there good (simpler) alternatives to this approach? Maybe numpy provides some tools for this? If no alternatives are suggested, I will implement this idea (unless I have a better one) and post it as an answer.

P.S. I have a vague feeling that I might have seen a question addressing this problem, but I can't find it now.

P.P.S. An alternative approach would be a function that traverses the structure and converts all numpy.arrays to lists, but is this any easier to implement? Seems the same to me.
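For comparison, the conversion approach could look roughly like this. It is only a sketch: arrays_to_lists is a hypothetical name, and it assumes plain dicts, lists and tuples (a namedtuple, for example, would not survive the type(obj)(...) call).

```python
import numpy as np


def arrays_to_lists(obj):
    """Return a copy of obj with every numpy array replaced by nested lists."""
    if isinstance(obj, np.ndarray):
        return obj.tolist()
    if isinstance(obj, dict):
        return {key: arrays_to_lists(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        # Rebuild the same container type around the converted items.
        return type(obj)(arrays_to_lists(item) for item in obj)
    return obj
```

After converting both sides, self.assertEqual(arrays_to_lists(big_struct1), arrays_to_lists(big_struct2)) no longer raises the ValueError. The trade-off is that dtype information is lost, and arrays containing NaN compare as unequal, because tolist() produces distinct float objects.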


Edit: Subclassing numpy.ndarray sounds very promising, but obviously I don't have both sides of the comparison hard-coded into a test. One of them, though, is indeed hardcoded, so I can:

  • populate it with custom subclasses of numpy.array;
  • change isinstance(other, SaneEqualityArray) to isinstance(other, np.ndarray) in jterrace's answer;
  • always use it as LHS in comparisons.
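A rough sketch of that plan is shown below. The class name SaneEqualityArray comes from the answer mentioned above, but the body here is an illustration written under my own assumptions, not a copy of that answer.

```python
import numpy as np


class SaneEqualityArray(np.ndarray):
    """ndarray subclass whose == returns a single bool (illustrative sketch)."""

    def __eq__(self, other):
        # Accept any ndarray on the other side, as proposed in the edit above.
        return (
            isinstance(other, np.ndarray)
            and self.shape == other.shape
            and np.array_equal(np.asarray(self), np.asarray(other))
        )

    def __ne__(self, other):
        # ndarray defines its own elementwise !=, so it must be overridden too.
        return not self.__eq__(other)


# The hard-coded side of the test is populated with the subclass via view().
expected = [{"x": np.array([1, 2, 3]).view(SaneEqualityArray)}]
actual = [{"x": np.array([1, 2, 3])}]
print(expected == actual)  # True
```

As far as I understand Python's comparison rules, when the right operand's type is a subclass of the left operand's type, the subclass's method is tried first. So the overridden __eq__ should be used even if the plain ndarray ends up on the left-hand side.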

My questions in this regard are:

  1. Will it work (I mean, it sounds all right to me, but maybe some tricky edge cases will not be handled correctly)? Will my custom object always end up as LHS in the recursive equality checks, as I expect?
  2. Again, are there better ways (given that I get at least one of the structures with real numpy arrays)?

Edit 2: I tried it out, the (seemingly) working implementation is shown in this answer.



1 Reply


Would have commented, but it gets too long...

Fun fact: you cannot use == to test whether two arrays are the same. I would suggest you use np.testing.assert_array_equal instead:

  1. It checks shape and values (dtype and exact shape are only enforced if you pass strict=True, which newer NumPy versions support).
  2. It doesn't fail because of the neat little math of (float('nan') == float('nan')) == False. Normal Python sequence == has an even more fun way of ignoring this sometimes, because it uses PyObject_RichCompareBool, which does a quick "is" identity check first. For NaNs that is incorrect, but for testing it is of course perfect.
  3. There is also assert_allclose, because floating point equality can get very tricky if you do actual calculations. You usually want almost the same values, since the results can become hardware dependent or possibly random, depending on what you do with them.
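A short demonstration of these three points, using only made-up example values:

```python
import numpy as np

a = np.array([1.0, np.nan, 3.0])
b = np.array([1.0, np.nan, 3.0])

# == is elementwise, and NaN != NaN, so even identical-looking arrays "differ".
print((a == b).all())  # False

# assert_array_equal treats NaNs in the same positions as equal, so this passes.
np.testing.assert_array_equal(a, b)

# Python sequences check identity first, so the result depends on object identity.
nan = float("nan")
print([nan] == [nan])                    # True: the very same object on both sides
print([float("nan")] == [float("nan")])  # False: two distinct NaN objects

# Computed floats are rarely exactly equal; assert_allclose uses a tolerance.
x = 0.1 + 0.2
print(x == 0.3)  # False
np.testing.assert_allclose(np.array([x]), np.array([0.3]), rtol=1e-12)
```

The rtol value here is an arbitrary choice for the example; a sensible tolerance depends on the calculation being tested.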

I would almost suggest serializing both structures with pickle and comparing the results if you want something this insanely nested, but that is overly strict (and point 3 is of course completely lost then). For example, the memory layout of your array does not matter for equality, but it does matter for its serialization.
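To illustrate the memory layout caveat, here is a small sketch with made-up values. It assumes that NumPy preserves Fortran order when pickling a Fortran-contiguous array, which is my understanding of its current behaviour.

```python
import pickle

import numpy as np

c_order = np.arange(6).reshape(2, 3)     # row-major (C) layout
f_order = np.asfortranarray(c_order)     # same values, column-major layout

# The arrays are equal element by element...
same_values = np.array_equal(c_order, f_order)

# ...but their serialized bytes are compared here as a stand-in for equality.
same_pickle = pickle.dumps(c_order) == pickle.dumps(f_order)

print(same_values, same_pickle)
```

If the pickles differ while the values agree, a pickle-based comparison would report a failure for two structures that a test should consider equal.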

