Check if list of numpy arrays are equal - python

I have a list of numpy arrays, and want to check if all the arrays are equal. What is the quickest way of doing this?
I am aware of the numpy.array_equal function (https://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.array_equal.html), however as far as I am aware this only applies to two arrays and I want to check N arrays against each other.
I also found this answer to test all elements in a list: check if all elements in a list are identical.
However, when I try each method in the accepted answer I get an exception (ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all())
Thanks,

You could simply adapt a general iterator method for your array comparison:
import numpy as np

def all_equal(iterator):
    try:
        iterator = iter(iterator)
        first = next(iterator)
        return all(np.array_equal(first, rest) for rest in iterator)
    except StopIteration:
        return True
If this returns False, it means that your arrays are not all equal.
Demo:
>>> i = [np.array([1,2,3]),np.array([1,2,3]),np.array([1,2,3])]
>>> print(all_equal(i))
True
>>> j = [np.array([1,2,4]),np.array([1,2,3]),np.array([1,2,3])]
>>> print(all_equal(j))
False

You can use np.array_equal() in a list comprehension to compare each array to the first one:
all([np.array_equal(list_of_arrays[0], arr) for arr in list_of_arrays])

If your arrays are of equal size, this solution using numpy_indexed (disclaimer: I am its author) should work and be very efficient:
import numpy_indexed as npi
npi.all_unique(list_of_arrays)

@jtr's answer is great, but I would like to suggest a slightly different alternative.
First of all, I think using array_equal is not a great idea for arrays of floats: you may end up with very small differences that you are willing to tolerate, but array_equal returns True if and only if the two arrays have the same shape and exactly the same elements. So let's use allclose instead, which lets you choose the absolute and relative tolerances that meet your needs.
Then, I would use the built-in zip function, which makes the code more elegant.
Here is the code:
all(np.allclose(array, array_expected) for array, array_expected in zip(array_list, array_list_expected))
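The zip-based line compares two lists pairwise. To check N arrays against each other with a tolerance, one option is to compare every array to the first one. The sketch below is only an illustration: the name all_close is made up here, and the default rtol and atol simply mirror the defaults of np.allclose.

```python
import numpy as np

def all_close(arrays, rtol=1e-05, atol=1e-08):
    """Return True if every array matches the first one within tolerance."""
    iterator = iter(arrays)
    try:
        first = np.asarray(next(iterator))
    except StopIteration:
        return True  # an empty list is trivially "all equal"
    return all(
        # Check shapes explicitly, because allclose would otherwise broadcast.
        np.shape(other) == first.shape
        and np.allclose(first, other, rtol=rtol, atol=atol)
        for other in iterator
    )

arrays = [np.array([1.0, 2.0, 3.0]),
          np.array([1.0, 2.0, 3.0 + 1e-12]),
          np.array([1.0, 2.0, 3.0])]
print(all_close(arrays))
print(all_close(arrays + [np.array([1.0, 2.0, 4.0])]))
```

Note that closeness is not transitive, so comparing everything to the first array is a design choice, not the only reasonable definition.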

I guess you can use the function unique.
http://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.unique.html#numpy.unique
If all sub-arrays in the array are the same, it should return only one item.
How to use it is described in more detail here:
Find unique rows in numpy.array
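A minimal sketch of the unique-based idea, assuming all arrays have the same shape so they can be stacked, and assuming a NumPy version whose np.unique accepts the axis argument. The helper name all_equal_unique is hypothetical.

```python
import numpy as np

def all_equal_unique(arrays):
    """True if every array in the list holds the same values (same shapes assumed)."""
    # Flatten each array and stack them so that each row is one array.
    stacked = np.stack([np.asarray(arr).ravel() for arr in arrays])
    # With axis=0, np.unique treats every row as a single item.
    unique_rows = np.unique(stacked, axis=0)
    return len(unique_rows) == 1

same = [np.array([1, 2, 3]), np.array([1, 2, 3]), np.array([1, 2, 3])]
different = [np.array([1, 2, 4]), np.array([1, 2, 3]), np.array([1, 2, 3])]
print(all_equal_unique(same))
print(all_equal_unique(different))
```

This sorts the rows internally, so for a plain yes/no answer the array_equal loop is likely to do less work.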

Related

Remove duplicates from a list of tuples containing floats

I have a list of size 2 tuples which have floats in them. Some of the floats are nearly equal and are close enough to be considered equal. numpy isclose() can be used with good effect here. I need to remove the duplicates in the list while always retaining the first value.
import numpy as np
data=zip(C1,C2)
comparray=[]
eval1=np.isclose(data[0],data[1])
comparray.append(eval1[0])
i=0
while i<(len(data)-1):
    eval=np.isclose(data[i],data[i+1])
    print(eval)
    comparray.append(eval[0])
    i+=1
l1=[a for a,b in zip(data,comparray) if not b]
I have this code which does what I need, but it seems really poor. Is there a more pythonic way of doing this?
Thanks for the help.
If I understood correctly, you can do:
out = [a for a, b in zip(data, data[1:]) if not np.allclose(a, b)]
but I can't really test this, as you didn't provide any input/output examples.
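Since the question gives no sample data, here is a small sketch with made-up input. It assumes that near-duplicates sit next to each other in the list and that two tuples count as duplicates when both floats are close. The function name drop_close_neighbours is hypothetical.

```python
import numpy as np

def drop_close_neighbours(pairs):
    """Keep a tuple only if it is not close to the last tuple that was kept."""
    kept = []
    for pair in pairs:
        # np.allclose returns a single bool, unlike np.isclose on a tuple.
        if not kept or not np.allclose(pair, kept[-1]):
            kept.append(pair)
    return kept

# Made-up example data: the second and fourth tuples are near-duplicates.
data = [(1.0, 2.0), (1.0, 2.0 + 1e-10), (3.0, 4.0), (3.0, 4.0)]
print(drop_close_neighbours(data))
```

Because each tuple is compared with the last kept one, the first value of every run is always retained.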
Are you familiar with the structure called a "Set"?
Sets are a collection of unordered unique elements. I believe this structure would save you a lot of overhead and be a much better fit based on your description.
https://docs.python.org/2/library/sets.html
You can use a function like this:
def nearly_equal(a, b, sig_fig=2):
    return (a == b or
            int(a*10**sig_fig) == int(b*10**sig_fig))
>>> print(nearly_equal(3.456, 3.457))
True

Replacing Nones in a python array with zeroes

I've just joined two arrays of unequal length together with the command:
allorders = map(None,todayorders, lastyearorders)
where None is filled in wherever todayorders has no value (as the todayorders array is not as long).
However, when I try to pass the allorders array into a matplotlib bar chart:
p10= plt.bar(ind, allorders[9], width, color='#0000DD', bottom=allorders[8])
..I get the following error:
TypeError: unsupported operand type(s) for +=: 'int' and 'NoneType'
So, is there a way for matplotlib to accept None values? If not, how do I replace the Nones with zeroes in my allorders array?
If you can, as I am a Python newbie (coming over from the R community), please provide detailed code from start to finish that I can use/test.
Use a list comprehension:
allorders = [i if i[0] is not None else (0, i[1]) for i in allorders]
With numpy:
import numpy as np
allorders = np.array(allorders)
This creates an array of objects due to the Nones. We can replace them with zeros:
allorders[allorders == None] = 0
Then convert the array to the proper type (astype returns a new array, so assign the result):
allorders = allorders.astype(int)
Since it sounds like you want this all to be in numpy, the direct answer to your question is really just an aside, and the right answer doesn't begin until the "Of course…" paragraph.
If you think about it, you're using map with a None first parameter as a zip_longest, because Python doesn't have a zip_longest. But it does have one, in itertools—and it allows you to specify a custom fillvalue. So, you can do this all in one step with izip_longest:
>>> import itertools
>>> todayorders = [1, 2]
>>> lastyearorders = [1, 2, 3]
>>> allorders = itertools.izip_longest(todayorders, lastyearorders, fillvalue=0)
>>> list(allorders)
[(1, 1), (2, 2), (0, 3)]
This only fills in 0 for the Nones that show up as extra values for the shorter list; if you want to replace every None with a 0, you have to do it Martijn Pieters's way. But I think this is what you want.
Also, note that list(allorders) at the end: izip_longest, like most things in itertools, returns an iterator, not a list. Or, in terms you might be more familiar with, it returns a "lazy" sequence rather than a "strict" one. If you're just going to iterate over the result, that's actually better, but if you need to use it with some function that requires a list (like printing it out in human-readable form—or accessing allorders[9], as in your example), you need to explicitly convert it first.
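For readers on Python 3, where izip_longest was renamed to zip_longest and map(None, ...) no longer works, the same step might look like the sketch below. The data is the same toy data as in the session above.

```python
from itertools import zip_longest  # called izip_longest in Python 2

todayorders = [1, 2]
lastyearorders = [1, 2, 3]

# zip_longest returns a lazy iterator, so wrap it in list() before indexing.
allorders = list(zip_longest(todayorders, lastyearorders, fillvalue=0))
print(allorders)  # [(1, 1), (2, 2), (0, 3)]
```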
If you actually want a numpy.array rather than a list, you can get there directly, without going through a list first. (If all you're ever going to do with it is matplotlib it, you probably do want an array.) The clearest way is to use np.fromiter instead of list(allorders). Note that np.fromiter requires an explicit dtype and builds a 1-D array, so for an iterator of tuples you need a structured dtype (or you can flatten the iterator first and reshape afterwards). And, if you know the size (which you do: it's max(len(todayorders), len(lastyearorders))), in some cases it's faster or simpler to pass an explicit count as well.
Of course if any of the numpy stuff sounds appealing, you probably should stay within numpy in the first place, instead of using map or izip_longest:
>>> todayorders.resize(lastyearorders.shape)
>>> allorders = np.vstack((todayorders, lastyearorders)).transpose()
Unfortunately, that mutates todayorders, and as far as I know, the equivalent immutable function numpy.resize doesn't give you any way to "zero-extend", but instead repeats the values. Hopefully I'm wrong and someone will suggest the easy way, but otherwise, you have to do it explicitly:
>>> extrazeros = np.zeros(len(lastyearorders) - len(todayorders), dtype=int)
>>> allorders = np.vstack((np.concatenate((todayorders, extrazeros)), lastyearorders))
>>> allorders = allorders.transpose()
>>> allorders
array([[ 1, 1],
[ 2, 2],
[ 0, 3]])
Of course if you do a lot of that, I'd write a zeroextend function that takes a pair of arrays and extends one to match the other (or, if you're not just dealing with 1D, extends the shorter one on each axis to match the other).
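A possible shape for such a helper, restricted to the 1-D case. This is only a sketch: the name zero_extend and the choice of np.pad are assumptions made here, not something the answer prescribes.

```python
import numpy as np

def zero_extend(a, b):
    """Pad the shorter of two 1-D arrays with trailing zeros so both have equal length."""
    a, b = np.asarray(a), np.asarray(b)
    size = max(len(a), len(b))
    # mode="constant" pads with zeros by default; (0, n) pads only at the end.
    a = np.pad(a, (0, size - len(a)), mode="constant")
    b = np.pad(b, (0, size - len(b)), mode="constant")
    return a, b

today, lastyear = zero_extend([1, 2], [1, 2, 3])
# column_stack puts the two 1-D arrays side by side as columns.
allorders = np.column_stack((today, lastyear))
print(allorders)
```

Unlike ndarray.resize, this leaves the input arrays untouched and returns new ones.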
At any rate, aside from being faster and using less temporary memory than using map, izip_longest, etc., this also means that you end up with a final array with the right dtype (int rather than object)—which means your result also uses less long-term memory, and everything you do from then on will also be faster and use less temporary memory.
For completeness: It is possible to have pyplot handle None values, but I don't think it's what you want. For example, you can pass it a Transform object whose transform method converts None to 0. But this will be effectively the same as Martijn Pieters's answer but much more verbose, and there's no advantage at all unless you need to plot tons of such arrays.

Convert array to python scalar

I need big help, please check out this code:
import math
dose = 20.0
a = [[[2,3,4],[5,8,9],[12,56,32]],
     [[25,36,45],[21,65,987],[21,58,89]],
     [[78,21,98],[54,36,78],[23,12,36]]]
PAC = math.exp(-dose*a)
this what I would like to do. However the error I am getting is
TypeError: only length-1 arrays can be converted to Python scalars
If you want to perform mathematical operations on arrays (whatever their dimensions...), you should really consider using NumPy which is designed just for that. In your case, the corresponding NumPy command would be:
import numpy as np
PAC = np.exp(-dose*np.array(a))
If NumPy is not an option, you'll have to loop on each element of a, compute your math.exp, store the result in a list... Really cumbersome and inefficient. That's because the math functions require a scalar as input (as the exception told you), when you're passing a list (of lists). You can combine all the loops in a single list comprehension, though:
PAC = [[[math.exp(-dose*j) for j in elem] for elem in row] for row in a]
but once again, I would strongly recommend NumPy.
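To check that the two approaches agree, here is a short sketch using the data from the question (with the missing commas filled in, which is an assumption about what the asker intended).

```python
import math
import numpy as np

dose = 20.0
a = [[[2, 3, 4], [5, 8, 9], [12, 56, 32]],
     [[25, 36, 45], [21, 65, 987], [21, 58, 89]],
     [[78, 21, 98], [54, 36, 78], [23, 12, 36]]]

# NumPy applies exp to every element of the 3-D array at once.
pac_numpy = np.exp(-dose * np.array(a))

# Pure-Python equivalent: one comprehension level per nesting level.
pac_lists = [[[math.exp(-dose * x) for x in row] for row in block] for block in a]

print(pac_numpy.shape)
print(np.allclose(pac_numpy, pac_lists))
```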
You should really use NumPy for that.
And here is how you could do it using nested loops:
>>> for item in a:
...     for sub in item:
...         for idx, number in enumerate(sub):
...             print(number, math.exp(-dose*number))
...             sub[idx] = math.exp(-dose*number)
This avoids building new lists with append. Using enumerate changes the numbers in place. If you want to keep a copy of a, make a deep copy first (a[:] would only copy the outer list, so the inner lists would still be modified):
import copy
acopy = copy.deepcopy(a)
If you don't have many numbers and NumPy is overkill, the above could be done a tiny bit faster using list comprehensions.
If you want each element of the array to be multiplied by -dose and then passed to math.exp, you need nested loops, one per level of nesting:
new_a = []
for block in a:
    new_block = []
    for row in block:
        new_row = []
        for element in row:
            new_row.append(math.exp(-dose*element))
        new_block.append(new_row)
    new_a.append(new_block)
Alternatively, if you have a MATLAB background, you could look into NumPy, which supports element-wise operations on whole arrays.

replacing values in a whole array

I would like to ask how I can change the values in a whole NumPy array.
For example I want to change every value which is < 1e-15 to be equal to 1e-15.
Assuming you mean a numpy array, and it's pointed to by a variable a:
np.fmax(a, 1e-15, a)
This finds the maximum of the two values given as the first two arguments (a and 1e-15) on a per-element basis, and writes the result back to the array given as the third argument, a.
I had a hard time finding the official docs for this function, but I found this.
If L is a list:
L[:] = [max(x, 10e-15) for x in L]
Assuming you mean a list instead of an array, I'd recommend using a list comprehension:
new_list = [max(x, 1e-15) for x in my_list]
(I also assume you mean 1e-15 == 10. ** (-15) instead of 10e-15 == 1e-14.)
There also exist "arrays" in Python: The class array.array from the standard library, and NumPy arrays.
I like numpy.fmax (which was new to me), but for a possibly more generic case, I often use:
a[a < 1e-15] = 1e-15
(More generic in the sense that you can vary the condition, or that the replacement value is not equal to the comparison value.)
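The NumPy variants mentioned in these answers can be compared side by side. This is a small sketch with made-up values; np.maximum is added here for comparison and is not mentioned in the answers.

```python
import numpy as np

a = np.array([0.5, 1e-20, 0.0, 3.0])  # made-up sample values
floor = 1e-15

# Returns a new array and leaves a unchanged.
clipped = np.maximum(a, floor)

# Boolean-mask assignment on a copy.
masked = a.copy()
masked[masked < floor] = floor

# In-place element-wise maximum, writing the result back into a.
np.fmax(a, floor, out=a)

print(clipped)
print(np.array_equal(clipped, masked), np.array_equal(clipped, a))
```

One difference to keep in mind: np.fmax ignores NaN where it can and returns the non-NaN value, whereas the mask assignment leaves NaN entries as they are, because NaN < 1e-15 is False.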

Counting array elements in Python [duplicate]

How can I count the number of elements in an array? Contrary to what I expected, array.count(string) does not count all the elements in the array; it only counts the occurrences of string.
The built-in function len() returns the number of elements in the list.
Syntax:
len(myArray)
Eg:
myArray = [1, 2, 3]
len(myArray)
Output:
3
len is a built-in function that calls the given container object's __len__ member function to get the number of elements in the object.
Functions encased with double underscores are usually "special methods" implementing one of the standard interfaces in Python (container, number, etc). Special methods are used via syntactic sugar (object creation, container indexing and slicing, attribute access, built-in functions, etc.).
Using obj.__len__() wouldn't be the correct way of using the special method, but I don't see why the others were modded down so much.
If you have a multi-dimensional array, len() might not give you the value you are looking for. For instance:
import numpy as np
a = np.arange(10).reshape(2, 5)
print(len(a) == 2)
This code block prints True, telling you that len() reports a size of 2. However, there are in fact 10 elements in this 2D array. In the case of multi-dimensional arrays, len() gives you the length of the first dimension of the array, i.e.
import numpy as np
len(a) == np.shape(a)[0]
To get the number of elements in a multi-dimensional array of arbitrary shape:
import numpy as np
size = 1
for dim in np.shape(a): size *= dim
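NumPy arrays also expose the total element count directly through the size attribute, which gives the same number as the loop over the shape. A short sketch:

```python
import numpy as np

a = np.arange(10).reshape(2, 5)

print(len(a))   # length of the first dimension: 2
print(a.shape)  # (2, 5)
print(a.size)   # total number of elements: 10
```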
Or,
myArray.__len__()
if you want to be oopy; "len(myArray)" is a lot easier to type! :)
Before I saw this, I thought to myself, "I need to make a way to do this!"
count = 0
for tempVar in arrayName: count += 1
And then I thought, "There must be a simpler way to do this." and I was right.
len(arrayName)
