1D to 2D array - python

I would like to rearrange data stored as a 1D list into a 2D array.
That is, from
x|y|a
1|1|a(1,1)
2|1|a(2,1)
3|1|a(3,1)
1|2|a(1,2)
...
into:
x\y|1     |2     |3
1  |a(1,1)|a(1,2)|a(1,3)...
2  |a(2,1)|a(2,2)|a(2,3)...
3  |a(3,1)|a(3,2)|a(3,3)...
...
I did it with 2 loops
(rows is the array of x, y, a):
for n in range(len(rows)):
    for k in range(x_len):
        for l in range(y_len):
            if ((a[2, n] == x[0, k]) and (a[3, n] == y[0, l])):
                c[k, l] = a[0, n]
but it takes ages, so my question is whether there is a smart and quick
solution for this in Python.
To clarify what I want to do:
I know the reshape() function; the point is that the data is in random order in array a.
So:
a = np.empty([4, len(rows)])
I read the data into array a from the database, which has 4 columns (1, 2, x, y) and len(rows) rows.
I am interested in column '1'; this is the one I want to put into the new, rearranged array.
x = np.zeros([1, x_len], float)
y = np.zeros([1, y_len], float)
x is a vector of the sorted values of column x from array a, without duplicates, with length x_len
(I read it with the SQL query: select distinct ... )
y is a vector of the sorted values of column y from array a (without duplicates), with length y_len
Then I am making the array:
c = np.zeros([x_len, y_len], float)
and fill it with 3 loops (sorry for the mistake before) with the data from array a:
for n in range(len(rows)):
    for k in range(x_len):
        for l in range(y_len):
            if ((a[2, n] == x[0, k]) and (a[3, n] == y[0, l])):
                c[k, l] = a[0, n]
Example:
Array a
array([[1, 3, 6, 5, 6],
       [1, 2, 5, 5, 6],
       [1, 4, 7, 1, 2],  ## x
       [2, 5, 3, 3, 4]]) ## y
Vectors: x and y
[[1,2,4,7]] ## x with x_len=4
[[2,3,4,5]] ## y with y_len=4
Array c
array([[1, 5, 0, 0],
       [0, 0, 6, 0],
       [0, 0, 0, 3],
       [0, 6, 0, 0]])
The resulting array c looks like this (the values come from the first row, a[0]):
x\y 2|3|4|5
-----------
1   1|5|0|0
2   0|0|6|0
4   0|0|0|3
7   0|6|0|0
I hope I didn't make a mistake in how it is written into array c.
Thanks a lot for any help.

You could use numpy:
>>> import numpy as np
>>> a = np.arange(9)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8])
>>> a.reshape(3,3)
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
#or:
>>> a.reshape(3,3).transpose()
array([[0, 3, 6],
       [1, 4, 7],
       [2, 5, 8]])
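Note that reshape only helps when the rows already arrive in grid order. For data scattered in arbitrary order, as in the question, something along the following lines should avoid the triple loop. This is only a sketch: it assumes a is laid out as in the question's example (values in row 0, x in row 2, y in row 3), and the use of np.unique with return_inverse is a substitution of mine, not something taken from the question.

```python
import numpy as np

# Example data from the question: row 0 holds the values,
# row 2 the x coordinates and row 3 the y coordinates.
a = np.array([[1, 3, 6, 5, 6],
              [1, 2, 5, 5, 6],
              [1, 4, 7, 1, 2],   # x
              [2, 5, 3, 3, 4]])  # y

# Sorted distinct coordinates, plus each sample's position within them.
x, kx = np.unique(a[2], return_inverse=True)
y, ky = np.unique(a[3], return_inverse=True)

c = np.zeros((x.size, y.size), dtype=a.dtype)
c[kx, ky] = a[0]   # one vectorized scatter instead of three nested loops
print(c)
```

Here x and y play the role of the "select distinct" vectors, so they do not need to be fetched separately.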

Related

Numpy, increment values in a 2D array using index represented in another 1D array

Here is an example of what I would like to do:
Assume Array A
A = np.array([[0, 1, 3, 5, 9],
[2, 7, 5, 1, 4]])
And Array B
B = np.array([2, 4])
I am looking for an operation that will increment the element indexed by array B in each row of array A by 1.
So the result A is:
A = np.array([[0, 1, 4, 5, 9],
[2, 7, 5, 1, 5]])
The index 2 of first row is increased by 1, and the index 4 of second row is increased by 1
You can achieve this by using advanced indexing in numpy:
A[np.arange(len(B)), B] += 1
This works because np.arange(len(B)) creates a 1D array of row indices, [0, 1, ..., len(B) - 1], and B supplies the matching column indices. The two index arrays are paired element by element, so A[np.arange(len(B)), B] selects exactly one element per row, and += 1 increments those elements.
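To make the pairing explicit, the short sketch below prints which (row, column) positions the two index arrays select. The names rows and pairs are only for illustration.

```python
import numpy as np

A = np.array([[0, 1, 3, 5, 9],
              [2, 7, 5, 1, 4]])
B = np.array([2, 4])

rows = np.arange(len(B))          # array([0, 1]): one row index per entry of B
pairs = list(zip(rows.tolist(), B.tolist()))
print(pairs)                      # [(0, 2), (1, 4)]
print(A[rows, B])                 # [3 4], the elements about to be incremented

A[rows, B] += 1
print(A)
```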
In numpy you can do this using arange and the shape of the array:
import numpy as np
A = np.array([[0, 1, 3, 5, 9],
[2, 7, 5, 1, 4]])
B = np.array([2, 4])
A[np.arange(A.shape[0]), B] += 1
print(A)
np.arange(A.shape[0]) generates an array of integers from 0 to A.shape[0] - 1, where A.shape[0] is the number of rows.
You can also do it with a loop:
import numpy as np
A = np.array([[0, 1, 3, 5, 9],
[2, 7, 5, 1, 4]])
B = np.array([2, 4])
for i, index in enumerate(B):
    A[i][index] += 1
print(A)

Cross-reference between numpy arrays

I have a 1d array of ids, for example:
a = [1, 3, 4, 7, 9]
Then another 2d array:
b = [[1, 4, 7, 9], [3, 7, 9, 1]]
I would like to have a third array with the same shape of b where each item is the index of the corresponding item from a, that is:
c = [[0, 2, 3, 4], [1, 3, 4, 0]]
What's a vectorized way to do that using numpy?
This may look odd, but you can use np.interp to do that:
import numpy as np
a = [1, 3, 4, 7, 9]
sorting = np.argsort(a)
positions = np.arange(0,len(a))
xp = np.array(a)[sorting]
fp = positions[sorting]
b = [[1, 4, 7, 9], [3, 7, 9, 1]]
c = np.rint(np.interp(b,xp,fp)) # rint is better than astype(int) because floats are tricky.
# but astype(int) should work faster for small len(a) but not recommended.
This should work as long as len(a) is small enough for the indices to be represented exactly in floating point (np.interp works in float64, so up to 2**53), and as long as every element of b actually occurs in a. The algorithm runs in O(n*log(n)) time, or more precisely len(b)*log(len(a)).
Effectively, this solution is a one-liner. The only catch is that you need to reshape the array before you do the one-liner, and then reshape it back again:
import numpy as np
a = np.array([1, 3, 4, 7, 9])
b = np.array([[1, 4, 7, 9], [3, 7, 9, 1]])
original_shape = b.shape
c = np.where(b.reshape(b.size, 1) == a)[1]
c = c.reshape(original_shape)
This results in:
[[0 2 3 4]
 [1 3 4 0]]
Broadcasting to the rescue!
>>> ((np.arange(1, len(a) + 1)[:, None, None]) * (a[:, None, None] == b)).sum(axis=0) - 1
array([[0, 2, 3, 4],
[1, 3, 4, 0]])
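Another option, not taken from the answers above, is np.searchsorted, which avoids building the full len(a) by b.size comparison array. This is a sketch that assumes every element of b occurs in a; for values missing from a, the returned positions are meaningless, and a value larger than max(a) raises an IndexError in the last step.

```python
import numpy as np

a = np.array([1, 3, 4, 7, 9])
b = np.array([[1, 4, 7, 9], [3, 7, 9, 1]])

order = np.argsort(a)                       # handles an unsorted a as well
pos = np.searchsorted(a, b, sorter=order)   # positions within the sorted view of a
c = order[pos]                              # map back to positions in the original a
print(c)
```

The result has the same shape as b, so no reshaping is needed.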

Reorder a square array using a sorted 1D array

Let's say I have a symmetric n-by-n array A and a 1D array x of length n, where the rows/columns of A correspond to the entries of x, and x is ordered. Now assume both A and x are randomly rearranged, so that the rows/columns still correspond but they're no longer in order. How can I manipulate A to recover the correct order?
As an example: x = array([1, 3, 2, 0]) and
A = array([[1, 3, 2, 0],
[3, 9, 6, 0],
[2, 6, 4, 0],
[0, 0, 0, 0]])
so the mapping from x to A in this example is A[i][j] = x[i]*x[j]. x should be sorted like array([0, 1, 2, 3]) and I want to arrive at
A = array([[0, 0, 0, 0],
[0, 1, 2, 3],
[0, 2, 4, 6],
[0, 3, 6, 9]])
I guess that the OP is looking for a flexible way to use indices that sort both the rows and the columns of the mapping at once. What is more, the OP might be interested in doing it in reverse, i.e. recovering the initial view of the mapping if it is lost.
def mapping(x, my_map, return_index=True, return_inverse=True):
    idx = np.argsort(x)
    out = my_map(x[idx], x[idx])
    inv = np.empty_like(idx)
    inv[idx] = np.arange(len(idx))
    return out, idx, inv
x = np.array([1, 3, 2, 0])
a, idx, inv = mapping(x, np.multiply.outer) #sorted mapping
b = np.multiply.outer(x, x) #straight mapping
print(b)
>>> [[1 3 2 0]
[3 9 6 0]
[2 6 4 0]
[0 0 0 0]]
print(a)
>>> [[0 0 0 0]
[0 1 2 3]
[0 2 4 6]
[0 3 6 9]]
np.array_equal(b, a[np.ix_(inv, inv)]) #sorted to straight
>>> True
np.array_equal(a, b[np.ix_(idx, idx)]) #straight to sorted
>>> True
A simple implementation would be
idx = np.argsort(x)
A = A[idx, :]
A = A[:, idx]
Another possibility would be (all credit to @mathfux):
A[np.ix_(idx, idx)]
You can use argsort and fancy indexing:
idx = np.argsort(x)
A2 = A[idx[None], idx[:,None]]
output:
array([[0, 0, 0, 0],
[0, 1, 2, 3],
[0, 2, 4, 6],
[0, 3, 6, 9]])
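One caveat about the last variant: with the index arrays in that order, element (i, j) of the result is A[idx[j], idx[i]], i.e. the transpose of what np.ix_ returns. For the symmetric A of the question the two agree. The sketch below uses a deliberately non-symmetric matrix (M, made up for illustration) to show the difference.

```python
import numpy as np

x = np.array([1, 3, 2, 0])
idx = np.argsort(x)

M = np.arange(16).reshape(4, 4)        # non-symmetric test matrix

via_ix = M[np.ix_(idx, idx)]           # rows and columns both reordered by idx
via_fancy = M[idx[None], idx[:, None]]

print(np.array_equal(via_fancy, via_ix.T))   # True
print(np.array_equal(via_fancy, via_ix))     # False for this M

# Swapping the two index arrays gives the same result as np.ix_:
print(np.array_equal(M[idx[:, None], idx[None]], via_ix))  # True
```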

compare entire row and column in numpy array and delete selected rows and columns

I have 2 square arrays with shape (25, 25), and I want to check whether an entire row is filled with zeros and whether the corresponding column is filled with zeros. If so, I want to remove that row and column from the array.
For example:
array = np.array([[1, 0, 1, 1],
[0, 0, 0, 0],
[1, 0, 1, 1],
[1, 0, 1, 1]])
I want it manipulated to
array=np.array([[1, 1, 1],
[1, 1, 1],
[1, 1, 1]])
I hope you can understand what I am aiming at. In this example, the second row and column have been removed because they contain only zeros.
I could do that by iterating through all of those arrays, but as I have 10 million of them, I would like a pythonic/efficient way to solve this.
The second array is a TensorFlow array; manipulating it should be no problem once I know the indices of the rows and columns I want removed.
Edit:
I have now found the following solution, but it uses for loops:
def removepadding(y_true, y_pred):
    shape = np.shape(y_true)
    y_true_cleaned = []
    for i in range(shape[0]):
        x = y_true[i]
        for n in range(shape[1] - 1, -1, -1):
            if sum(x[n, :]) == 0 and sum(x[:, n]) == 0:
                x = np.delete(np.delete(x, n, 0), n, 1)
        y_true_cleaned.append(x)
    return y_true_cleaned
You can do it in one line:
array[array.any(axis = 1)][:, array.any(axis = 0)]
#array([[1, 1, 1],
# [1, 1, 1],
# [1, 1, 1]])
If there are negative values in the array, a check based on np.sum may fail, because values can cancel out to zero.
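A small illustration of the caveat about sums (the array is made up for the example): a row whose entries cancel sums to zero without being a zero row, whereas any() and np.all(... == 0) test the entries themselves.

```python
import numpy as np

arr = np.array([[1, -1],
                [0,  0]])

print(arr.sum(axis=1) == 0)        # [ True  True]: row 0 is wrongly flagged
print(~arr.any(axis=1))            # [False  True]: only the real zero row
print(np.all(arr == 0, axis=1))    # [False  True]
```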
for 2d array:
import numpy as np
a = np.array([[1,0,2,3,0,4],
[0,0,0,0,0,0],
[0,0,0,0,0,0],
[2,0,3,4,0,5],
[3,0,4,5,0,6],
[4,0,5,6,0,7],
[5,0,6,7,0,8]])
row = np.all(a==0, axis=1)
col = np.all(a==0, axis=0)
a[~row][:,~col]
output
array([[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 6],
[4, 5, 6, 7],
[5, 6, 7, 8]])
for 3d array:
a = np.ones((3,3,3))
a[1,:,1] = 0
a[1,1,:] = 0
a[:,1,1] = 0
z = np.all(a==0, axis=2)
y = np.all(a==0, axis=1)
x = np.all(a==0, axis=0)
Z = ~np.array([z]*a.shape[2])
Y = ~np.array([y]*a.shape[1])
X = ~np.array([x]*a.shape[0])
ZZ, YY, XX = (Z*Y*X).nonzero()
a[ZZ, YY, XX]
You can use np.count_nonzero to count the nonzero entries along each dimension in one step:
nnz_row = np.count_nonzero(array, axis=1)
nnz_col = np.count_nonzero(array, axis=0)
Now you make a mask of where both are zero:
mask = (nnz_row == 0) & (nnz_col == 0)
You can turn the mask into indices and pass it to np.delete:
ind = np.flatnonzero(mask)
array = np.delete(np.delete(array, ind, axis=0), ind, axis=1)
Alternatively, you can compute the positive mask:
pmask = nnz_row.astype(bool) | nnz_col.astype(bool)
This mask can select directly, analogously to what delete did with the negative mask:
array = array[pmask, :][:, pmask]
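Put together on the example from the question, the steps above might look like the following. It is a sketch only: it compares both counts against 0, so index n is dropped only when row n and column n are both empty, and the names deleted and selected are introduced here just to compare the two routes.

```python
import numpy as np

array = np.array([[1, 0, 1, 1],
                  [0, 0, 0, 0],
                  [1, 0, 1, 1],
                  [1, 0, 1, 1]])

nnz_row = np.count_nonzero(array, axis=1)
nnz_col = np.count_nonzero(array, axis=0)

# Index n is dropped only if row n AND column n contain no nonzero entry.
mask = (nnz_row == 0) & (nnz_col == 0)
ind = np.flatnonzero(mask)
deleted = np.delete(np.delete(array, ind, axis=0), ind, axis=1)

# Equivalent selection with the positive mask.
pmask = nnz_row.astype(bool) | nnz_col.astype(bool)
selected = array[pmask, :][:, pmask]

print(deleted)
print(np.array_equal(deleted, selected))   # True
```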
Edit: Thanks to @Mad Physicist, we can use np.flatnonzero. Here's the 2D case:
import numpy as np
a=np.array([[1,0,2,3,0,4],
[0,0,0,0,0,0],
[0,0,0,0,0,0],
[2,0,3,4,0,5],
[3,0,4,5,0,6],
[4,0,5,6,0,7],
[5,0,6,7,0,8]])
cols_to_keep = np.flatnonzero(a.sum(axis=0))
rows_to_keep = np.flatnonzero(a.sum(axis=1))
a = a[:, cols_to_keep]
a = a[rows_to_keep, :]
a
>>>
array([[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 6],
[4, 5, 6, 7],
[5, 6, 7, 8]])
Here's the 3d case:
import numpy as np
a=np.array([
[[1,0,2,3,0,4],
[0,0,0,0,0,0],
[0,0,0,0,0,0],
[2,0,3,4,0,5],
[3,0,4,5,0,6],
[4,0,5,6,0,7],
[5,0,6,7,0,8]],
[[0,0,0,0,0,0],
[0,0,0,0,0,0],
[0,0,0,0,0,0],
[0,0,0,0,0,0],
[0,0,0,0,0,0],
[0,0,0,0,0,0],
[0,0,0,0,0,0]],
[[5,0,5,5,0,5],
[0,0,0,0,0,0],
[0,0,0,0,0,0],
[2,0,3,4,0,5],
[3,0,4,5,0,6],
[4,0,5,6,0,7],
[5,0,6,7,0,8]],
])
ix_keep_axis_0 = np.flatnonzero(a.sum(axis=(1, 2)))
ix_keep_axis_1 = np.flatnonzero(a.sum(axis=(0, 2)))
ix_keep_axis_2 = np.flatnonzero(a.sum(axis=(0, 1)))
a = a[ix_keep_axis_0, :, :]
a = a[:, ix_keep_axis_1, :]
a = a[:, :, ix_keep_axis_2]
a
>>>
array([[[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 6],
[4, 5, 6, 7],
[5, 6, 7, 8]],
[[5, 5, 5, 5],
[2, 3, 4, 5],
[3, 4, 5, 6],
[4, 5, 6, 7],
[5, 6, 7, 8]]])

Python: How to subtract every n-th column from the ones before it in a matrix/data-frame?

Basically, I'm looking for a solution similar to the one posted for R in "R: How to subtract every n-th column from the ones before it in a matrix/data-frame?", but in Python.
Some data
import numpy as np
m = np.matrix([[1, 2, 3, 5], [3, 4, 5, 2], [5, 6, 7, 2]])
I want to subtract every second column from the column before it, so I want to end up with two columns: every row of the first column containing -1, and the second column being [-2, 3, 5].
Thanks in advance!
You can do :
import numpy as np
m = np.matrix([[1, 2, 3, 5], [3, 4, 5, 2], [5, 6, 7, 2]])
a = m[:,0] - m[:,1]
b = m[:,2] - m[:,3]
m2 = np.concatenate((a, b), axis=1)
print(m2)
And for n columns:
import numpy as np
m = np.matrix([[1, 2, 3, 5, 5 , 6], [3, 4, 5, 2, 3, 2], [5, 6, 7, 2, 7, 1]])
shape = np.shape(m)
print(shape)
result = []
if(shape[1] % 2 == 0):
    for i in range(0, shape[1], 2):
        print(i)
        result.append(m[:,i] - m[:,i+1])
    m2 = result[0]
    for i in range(1, len(result)):
        m2 = np.concatenate((m2, result[i]), axis=1)
    print(m2)
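The loop can also be replaced by strided slicing. This is a sketch that assumes an even number of columns and uses a plain ndarray rather than np.matrix (the same slicing works on np.matrix, but NumPy's documentation no longer recommends that class for new code).

```python
import numpy as np

m = np.array([[1, 2, 3, 5, 5, 6],
              [3, 4, 5, 2, 3, 2],
              [5, 6, 7, 2, 7, 1]])

# Columns 0, 2, 4, ... minus columns 1, 3, 5, ...
m2 = m[:, 0::2] - m[:, 1::2]
print(m2)
```

For the six-column example this yields one result column per pair of input columns, without concatenating anything by hand.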
