Sum up identical list elements - python

I have two arrays and want to compute an accuracy by counting the sub-arrays whose elements match in the same order. I'd rather not write a for loop that zips the elements and sums them up. Is there an easier way to do this?
The two arrays and my current code for calculating the sum are below.
yp = [[0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1]]
y = [[0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1]]
sums = np.sum(yp == y)
I am getting an accuracy of zero.

Using your example:
yp = [[0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1]]
y = [[0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1]]
# First make the two arrays in question numpy arrays.
yp = np.array(yp)
y = np.array(y)
array_length = y.shape[1] # store length of sub arrays
equal_elements = yp == y # element-wise comparison (both are already numpy arrays)
sums = np.sum(equal_elements, 1) # sum the number of equal elements in each sub array, use axis 1 as each array/sample is axis 0
equal_arrays = np.where(sums==array_length)[0] # np.where returns a tuple, so index its first element immediately
number_equal_arrays = equal_arrays.shape[0] # how many sub arrays are fully equal
print('Number of equal arrays %d' % number_equal_arrays)
print('Accuracy %0.2f' % (number_equal_arrays/yp.shape[0]))
prints
Number of equal arrays 6
Accuracy 1.00
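The same row-wise comparison can probably be written more compactly with np.all. The following is a minimal sketch, assuming both inputs convert to arrays of the same shape; the data and the names row_matches and accuracy are illustrative and not from the original post.

```python
import numpy as np

# Illustrative data (not from the question): rows 0, 2 and 3 match, row 1 does not
yp = np.array([[0, 1], [0, 1], [1, 0], [0, 1]])
y = np.array([[0, 1], [1, 0], [1, 0], [0, 1]])

# True for each row whose elements all match, position by position
row_matches = np.all(yp == y, axis=1)

number_equal_arrays = int(row_matches.sum())  # count of fully matching rows
accuracy = float(row_matches.mean())          # fraction of fully matching rows

print('Number of equal arrays %d' % number_equal_arrays)
print('Accuracy %0.2f' % accuracy)
```

Taking the mean of a boolean array gives the fraction of True entries, so no separate division by the number of rows is needed.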

Related

Split a matrix in N chunks

Given a matrix, I want to split it into smaller matrices of equal size m x n. If the matrix is not evenly divisible by the given size, the remainder just goes into separate, smaller matrices.
For example, given the matrix below and m=2 and n=2:
[[1, 0, 1],
[0, 0, 0],
[0, 1, 1]]
Result:
[[1, 0],
[0, 0]],
[[1],
[0]],
[[0, 1]],
[[1]],
I was using np.reshape but it fails to split when the numbers don't match, as in the example above.
matrix_size = matrix.shape[0] * matrix.shape[1]
n_matrix = math.ceil(matrix_size / (m * n))
matrix.reshape(n_matrix, m, n)
One way you could do this is using multiple calls to numpy.array_split:
import numpy as np
matrix = [
[1, 0, 1],
[0, 0, 0],
[0, 1, 1],
]
sub_matrices = np.array_split(matrix, 2, axis=0)
sub_matrices = [m for sub_matrix in sub_matrices for m in np.array_split(sub_matrix, 2, axis=1)]
Here the first call to array_split splits the matrix along its rows (axis 0), and the second call splits each resulting piece along its columns (axis 1).
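Note that the integer passed to np.array_split is the number of sections, not the block size, so for the 3x3 example it only coincides with m = n = 2. A minimal sketch that splits by block size with plain slicing might look like the following; split_blocks is a hypothetical helper name, and it assumes a 2-D input.

```python
import numpy as np

def split_blocks(matrix, m, n):
    """Split a 2-D array into m x n blocks; edge blocks keep the remainder."""
    matrix = np.asarray(matrix)
    rows, cols = matrix.shape
    # Slicing past the end of an axis is safe: the slice is simply shorter
    return [
        matrix[r:r + m, c:c + n]
        for r in range(0, rows, m)
        for c in range(0, cols, n)
    ]

matrix = [[1, 0, 1],
          [0, 0, 0],
          [0, 1, 1]]
blocks = split_blocks(matrix, 2, 2)
for block in blocks:
    print(block.tolist())
```

The blocks come out in row-major order: the full 2x2 block first, then the right, bottom, and corner remainders.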

Converting a population on a grid to coordinates, and vice versa

For an ecology project, I need to switch back and forth between two representations of a population on a square grid world:
Representation 1: Simply the grid (a 2d Numpy array), where the value in each cell corresponds to the number of individuals in this cell. For instance, with a 3x3 grid:
grid = np.array(
[[0, 1, 0],
[0, 3, 1],
[0, 0, 0]]
)
Representation 2: A 2d Numpy array with the x,y coordinates of each individual on the grid:
coords = np.array(
[[0, 1],
[1, 1],
[1, 1],
[1, 1],
[1, 2]]
)
As you can see, when a cell has more than 1 individual on it, its coordinates repeat. Therefore, coords has shape (population_size, 2).
The current implementations for grid_to_coords() and coords_to_grid() both involve for loops, as you can see below, which slow down the execution considerably:
def grid_to_coords(grid):
    non_zero_pos = np.nonzero(grid)
    pop_size = grid.sum(keepdims=False)
    coords = np.zeros((int(pop_size), 2))
    offset = 0
    for i in range(len(non_zero_pos[0])):
        n_in_pos = int(grid[non_zero_pos[0][i], non_zero_pos[1][i]])
        for j in range(n_in_pos):
            coords[i + j + offset] = [non_zero_pos[0][i], non_zero_pos[1][i]]
        offset += j
    return coords
def coords_to_grid(coords, grid_dim):
    grid = np.zeros((grid_dim, grid_dim), dtype=np.int32)
    for x, y in coords:
        # Add a particle to the grid, making sure it is actually on the grid!
        x = max(0, min(x, grid_dim - 1))
        y = max(0, min(y, grid_dim - 1))
        grid[x, y] += 1
    return grid
I would need a way to vectorise these two functions. Could you please help?
Many thanks.
import numpy as np
grid = np.array(
[[0, 1, 0],
[0, 3, 1],
[0, 0, 0]]
)
coords = np.array(
[[0, 1],
[1, 1],
[1, 1],
[1, 1],
[1, 2]]
)
def grid_to_coords(grid):
    """
    >>> grid_to_coords(grid)
    array([[0, 1],
           [1, 1],
           [1, 1],
           [1, 1],
           [1, 2]])
    """
    x, y = np.nonzero(grid) # x = [0 1 1]; y = [1 1 2]
    # np.c_[x, y] = [[0 1]
    #                [1 1]
    #                [1 2]]
    # grid[x, y] = [1 3 1]
    return np.c_[x, y].repeat(grid[x, y], axis=0)
def coords_to_grid(coords, grid_dim):
    """
    >>> coords_to_grid(coords, 3)
    array([[0, 1, 0],
           [0, 3, 1],
           [0, 0, 0]])
    """
    unique, counts = np.unique(coords, axis=0, return_counts=True)
    # unique = [[0 1]
    #           [1 1]
    #           [1 2]]
    # counts = [1 3 1]
    ret = np.zeros((grid_dim, grid_dim), dtype=int)
    ret[unique[:, 0], unique[:, 1]] = counts
    return ret
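The loop version in the question also clamps off-grid coordinates to the grid edge, which the np.unique approach does not do. If that behaviour is needed, one possible sketch combines np.clip with np.add.at, which accumulates correctly over repeated indices. The name coords_to_grid_clipped and the extra off-grid point are illustrative assumptions.

```python
import numpy as np

def coords_to_grid_clipped(coords, grid_dim):
    """Hypothetical variant: clamp coordinates to the grid, then count them."""
    coords = np.clip(np.asarray(coords, dtype=int), 0, grid_dim - 1)
    grid = np.zeros((grid_dim, grid_dim), dtype=int)
    # np.add.at adds 1 once per occurrence, even when an index repeats;
    # a plain grid[x, y] += 1 would count each repeated cell only once
    np.add.at(grid, (coords[:, 0], coords[:, 1]), 1)
    return grid

# Coordinates from the question plus one off-grid point (1, 5), clamped to (1, 2)
coords = np.array([[0, 1], [1, 1], [1, 1], [1, 1], [1, 2], [1, 5]])
clipped_grid = coords_to_grid_clipped(coords, 3)
print(clipped_grid)
```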

how to convert n-hot vectors to multi-labels in tensorflow?

I have a multi-classification task, and I have gotten the n-hot type predictions like
n_hot_prediction = [[0, 1, 1],
[0, 1, 0],
[1, 0, 1]]
and another top_k array like
top_k_prediction = [[1, 2],
[0, 1],
[0, 1]]
Firstly, I wish to get a function which works like:
tf.function1(n_hot_prediction) #output: [[1, 2], [1], [0, 2]]
Secondly, I wish to find another function which works like:
tf.function2(top_k_prediction) #output: [[0, 1, 1], [1, 1, 0], [1, 1, 0]]
Are there any functions or methods that work like tf.function1 and tf.function2?
Your second function is pretty simple to implement:
import tensorflow as tf
#tf.function
def multi_hot(x, depth=None):
    x = tf.convert_to_tensor(x)
    if depth is None:
        depth = tf.math.reduce_max(x) + 1
    r = tf.range(tf.dtypes.cast(depth, x.dtype))
    eq = tf.equal(tf.expand_dims(x, axis=-1), r)
    return tf.cast(tf.reduce_any(eq, axis=-2), x.dtype)
x = [[1, 2], [0, 1], [0, 1]]
tf.print(multi_hot(x))
# [[0 1 1]
# [1 1 0]
# [1 1 0]]
For the first one, the result is not a proper tensor, so you can make a ragged tensor instead, masking a tensor of sequential values:
import tensorflow as tf
#tf.function
def as_labels(x):
    mask = tf.dtypes.cast(x, tf.bool)
    s = tf.shape(mask)
    r = tf.reshape(tf.range(s[-1]), tf.concat([tf.ones(tf.rank(x) - 1, tf.int32), [-1]], axis=0))
    r = tf.tile(r, tf.concat([s[:-1], [1]], axis=0))
    return tf.ragged.boolean_mask(r, mask)
x = [[0, 1, 1], [0, 1, 0], [1, 0, 1]]
print(as_labels(x).to_list())
# [[1, 2], [1], [0, 2]]
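For checking the TensorFlow results on small inputs, the same two conversions can be sketched in plain NumPy. This is only an illustrative cross-check for 2-D inputs, not a TensorFlow API; the names multi_hot_np and as_labels_np are hypothetical.

```python
import numpy as np

def multi_hot_np(top_k, depth):
    """Set a 1 at every label index listed in each row of top_k."""
    top_k = np.asarray(top_k)
    out = np.zeros((top_k.shape[0], depth), dtype=int)
    rows = np.arange(top_k.shape[0])[:, None]  # column of row indices, broadcast against top_k
    out[rows, top_k] = 1
    return out

def as_labels_np(n_hot):
    """Return, per row, the indices of the non-zero entries as a plain list."""
    return [np.flatnonzero(row).tolist() for row in np.asarray(n_hot)]

hot = multi_hot_np([[1, 2], [0, 1], [0, 1]], depth=3)
labels = as_labels_np([[0, 1, 1], [0, 1, 0], [1, 0, 1]])
print(hot.tolist())
print(labels)
```

The label lists have different lengths per row, which is why the TensorFlow version returns a ragged tensor and this sketch returns a list of lists.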

Generate NumPy array containing the indices of another NumPy array

I'd like to generate a np.ndarray NumPy array for a given shape of another NumPy array. The former array should contain the corresponding indices for each cell of the latter array.
Example 1
Let's say we have a = np.ones((3,)) which has a shape of (3,). I'd expect
[[0]
[1]
[2]]
since there is a[0], a[1] and a[2] in a which can be accessed by their indices 0, 1 and 2.
Example 2
For a shape of (3, 2) like b = np.ones((3, 2)) there is already very much to write. I'd expect
[[[0 0]
[0 1]]
[[1 0]
[1 1]]
[[2 0]
[2 1]]]
since there are 6 cells in b which can be accessed by the corresponding indices b[0][0], b[0][1] for the first row, b[1][0], b[1][1] for the second row and b[2][0], b[2][1] for the third row. Therefore we get [0 0], [0 1], [1 0], [1 1], [2 0] and [2 1] at the matching positions in the generated array.
Thank you very much for taking the time. Let me know if I can clarify the question in any way.
One way to do it with np.indices and np.stack:
np.stack(np.indices((3,)), -1)
#array([[0],
# [1],
# [2]])
np.stack(np.indices((3,2)), -1)
#array([[[0, 0],
# [0, 1]],
# [[1, 0],
# [1, 1]],
# [[2, 0],
# [2, 1]]])
np.indices returns an array of index grids, with one subarray per axis:
np.indices((3, 2))
#array([[[0, 0],
# [1, 1],
# [2, 2]],
# [[0, 1],
# [0, 1],
# [0, 1]]])
Then np.stack with axis -1 joins these per-axis grids along a new last axis, so that each cell holds its own index along every axis:
np.stack(np.indices((3,2)), -1)
#array([[[0, 0],
# [0, 1]],
# [[1, 0],
# [1, 1]],
# [[2, 0],
# [2, 1]]])
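A quick way to convince yourself of this is to look up a cell of the result and compare it with its own position. The sketch below also shows np.moveaxis as an equivalent way to move the leading axis of np.indices to the end; the variable names are illustrative.

```python
import numpy as np

b = np.ones((3, 2))

# One index vector per cell of b, stored along a new last axis
idx = np.stack(np.indices(b.shape), -1)
idx_shape = idx.shape
cell = idx[2, 1].tolist()  # the entry at position (2, 1) holds the index (2, 1)

# Moving the leading "axis number" dimension to the end gives the same array
alt = np.moveaxis(np.indices(b.shape), 0, -1)
same = np.array_equal(idx, alt)

print(idx_shape, cell, same)
```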

Python - Flatten lists of lists of two different types in one function

As input, I receive two types of lists of lists made of x and y coordinates that represent polygon and multipolygon geometries. In fact, the input follows the GeoJSON standard.
list1 represents the coordinates of a simple polygon geometry and list2 represents a multipolygon geometry:
list1 = [[[0 , 0], [0, 1], [0 ,2]]]
list2 = [[[[0, 0] , [0, 1], [0, 2]], [[1, 0], [1, 1], [1 ,2]]]]
Multipolygon geometries (list2) are represented by a list of lists one level deeper than simple polygon geometries (list1).
I want to flatten those lists in order to get the following output:
if input is list1 type : list1_out = [[0, 0, 0, 1, 0, 2]]
if input is list2 type : list2_out = [[0, 0, 0, 1, 0, 2], [1, 0, 1, 1, 1, 2]]
I am using the following code, which is commonly used to flatten lists; here input can be a list of either type:
[coords for polygon in input for coords in polygon]
With the code above, the output for list1 is correct, but the output for list2 is the following:
[[[0, 0], [0, 1], [0, 2]], [[1, 0], [1, 1], [1, 2]]]
Is there a function that could deeply flatten those two types of lists to get the expected output?
Edit: Performance really matters here, as the lists are really big.
Edit 2: I can use an if statement to distinguish the two types of list.
Try:
for list1
[sum(x, []) for x in list1]
for list2
[sum(x, []) for a in list2 for x in a]
Demo
>>> list1 = [[[0 , 0], [0, 1], [0 ,2]]]
>>> list2 = [[[[0, 0] , [0, 1], [0, 2]], [[1, 0], [1, 1], [1 ,2]]]]
>>> [sum(x, []) for x in list1]
[[0, 0, 0, 1, 0, 2]]
>>> [sum(x, []) for a in list2 for x in a]
[[0, 0, 0, 1, 0, 2], [1, 0, 1, 1, 1, 2]]
>>>
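Since sum(x, []) builds a new list at every step, it can get slow for long rings. A possible alternative, given that the question says the lists are big, is itertools.chain.from_iterable, combined with a depth check so one function handles both input types. This is a sketch under the assumption that the input is non-empty and well-formed; flatten_geometry is a hypothetical name.

```python
from itertools import chain

def flatten_geometry(coords):
    """Flatten each ring of a polygon or multipolygon into one list of numbers."""
    # In a polygon, coords[0][0] is a point [x, y], so coords[0][0][0] is a number.
    # In a multipolygon, there is one more level, so coords[0][0][0] is a point.
    is_multi = isinstance(coords[0][0][0], (list, tuple))
    rings = chain.from_iterable(coords) if is_multi else coords
    return [list(chain.from_iterable(ring)) for ring in rings]

list1 = [[[0, 0], [0, 1], [0, 2]]]
list2 = [[[[0, 0], [0, 1], [0, 2]], [[1, 0], [1, 1], [1, 2]]]]
list1_out = flatten_geometry(list1)
list2_out = flatten_geometry(list2)
print(list1_out)
print(list2_out)
```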
Casting your data to numpy.array, you can use reshape:
import numpy as np
t = np.array([[[[0, 0] , [0, 1], [0, 2]], [[1, 0], [1, 1], [1 ,2]]]])
print(t.shape) # (1, 2, 3, 2)
t = np.reshape(t, (1, 2, 6)) # merging the last 2 coordinates/axes
This flattens each ring of the second list (the result has shape (1, 2, 6)).
Code that works for both lists (since in both cases you want to merge the last two axes) is:
t = np.array(yourList)
newShape = t.shape[:-2] + (t.shape[-2] * t.shape[-1],) # this assumes your
# arrays are always at least 2-dimensional (no need to flatten them otherwise...)
t = t.reshape(newShape)
The key thing is to keep the shape unchanged up to the last two axes (hence t.shape[:-2]), but to merge the last two axes together (using an axis of length t.shape[-2] * t.shape[-1]).
We create the new shape by concatenating these two tuples (hence the trailing comma after the multiplication, which makes it a one-element tuple).
Edit: np.reshape() doc is here. The important parameters are the input array (your list, cast as an array), and a tuple which I've called newShape, which represents the lengths along the new axes.
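To get exactly the output shapes asked for in the question (one row per ring, with no extra outer level), a variant of the same idea is to collapse all leading axes with -1. This is a sketch that assumes every ring has the same number of points, so the data forms a rectangular array; flatten_rings is a hypothetical name.

```python
import numpy as np

def flatten_rings(coords):
    """Return one flat row of coordinates per ring, for either input depth."""
    arr = np.asarray(coords)
    # Assumption: rectangular input (all rings have the same number of points)
    # -1 lets NumPy infer the number of rings from the remaining axes
    return arr.reshape(-1, arr.shape[-2] * arr.shape[-1])

list1 = [[[0, 0], [0, 1], [0, 2]]]
list2 = [[[[0, 0], [0, 1], [0, 2]], [[1, 0], [1, 1], [1, 2]]]]
out1 = flatten_rings(list1).tolist()
out2 = flatten_rings(list2).tolist()
print(out1)
print(out2)
```

Rings of differing lengths would not convert to a regular array, in which case a pure-Python approach is likely the safer choice.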
