Modifying Array Giving Back Wrong Output - python

I'm new to numpy and am trying to do some slicing and indexing with arrays. My goal is to take an array, and use slicing and indexing to square the last column, and then subtract the first column from that result. I then want to put the new column back into the old array.
I've been able to figure out how to slice and index the column to get the result I want for the last column. My problem however is that when I try to put it back into my original array, I get the wrong output (as seen below).
theNumbers = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]])
sliceColumnOne = theNumbers[:,0]
sliceColumnThree = theNumbers[:,3]**2
editColumnThree = sliceColumnThree - sliceColumnOne
newArray = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[editColumnThree]])
print("nums:\n{}".format(newArray))
I want the output to be
[[ 1 2 3 15]
[ 5 6 7 59]
[ 9 10 11 135]
[ 13 14 15 243]]
However mine becomes:
[list([1, 2, 3, 4]) list([5, 6, 7, 8]) list([9, 10, 11, 12])
list([array([ 15, 59, 135, 243])])]
Any suggestions on how to fix this?

Just assign the last numpy array row to the new one "theNumbers[3] = editColumnThree"
Code:
import numpy as np
theNumbers = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]])
sliceColumnOne = theNumbers[:,0]
sliceColumnThree = theNumbers[:,3]**2
editColumnThree = sliceColumnThree - sliceColumnOne
theNumbers[3] = editColumnThree
print("nums:\n{}".format(theNumbers))
Output:
[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]
[ 15 59 135 243]]

newArray = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[editColumnThree]])
print("nums:\n{}".format(newArray))
this way, editColumnThree is the last row, not column. You can use
newArray = theNumbers.copy() # if a copy is needed
newArray[:,-1] = editColumnThree # replace last (-1) column

If you just want to stack the vectors on top of eachother, use vstack:
import numpy as np
theNumbers = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]])
newNumbers = np.vstack(theNumbers)
print(newNumbers)
>>>[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]
[13 14 15 16]]
But the issue here isn't just that you need to stack these numbers, you are mixing up columns and rows. You are changing a row instead of a column. To change the column, update the last element in each row:
import numpy as np
theNumbers = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]])
LastColumn = theNumbers[:,3]**2
FirstColumn = theNumbers[:,0]
editColumnThree = LastColumn - FirstColumn
for i in range(4):
theNumbers[i,3] = editColumnThree [i]
print(theNumbers)
>>>[[ 1 2 3 15]
[ 5 6 7 59]
[ 9 10 11 135]
[ 13 14 15 243]]

Related

how to randomize a matrix keeping the rows fixed in python

I'm trying to randomize all the rows of my DataFrame but with no success.
What I want to do is from this matrix
A= [ 1 2 3
4 5 6
7 8 9 ]
to this
A_random=[ 4 5 6
7 8 9
1 2 3 ]
I've tried with np. random.shuffle but it doesn't work.
I'm working in Google Colaboratory environment.
If you want to make this work with np.random.shuffle, then one way would be to extract the rows into an ArrayLike structure, shuffle them in place and then recreate the DataFrame:
A = pandas.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
extracted_rows = A.values.tolist() # Each row is an array element, so rows will remain fixed but their order shuffled
np.random.shuffle(extracted_rows)
A_random = pandas.DataFrame(extracted_rows)

Applying a function along a numpy array using indexes as parameters

I am trying to find a way to run a function through a whole ndarray using the indexes of every value as parameters. A regular loop is quite slow and I can't find a way to make it work using numpy built-in functions. The next example code summarizes what i'm trying to do. Thanks in advance.
import numpy as np
arr = np.arange(10).reshape(2, 5)
print(arr)
def example_func(row, col, value):
return(row + col + value)
for row in range(arr.shape[0]):
for col in range(arr.shape[1]):
arr[row, col] = example_func(row, col, arr[row, col])
print(arr)
[[0 1 2 3 4]
[5 6 7 8 9]]
[[ 0 2 4 6 8]
[ 6 8 10 12 14]]
What you try to do can be done with meshgrid.
Return coordinate matrices from coordinate vectors.
n_rows, n_cols = arr.shape
col_matrix, row_matrix = np.meshgrid(np.arange(n_cols), np.arange(n_rows))
result = arr + col_matrix + row_matrix
print(result)
This returns
[[ 0 2 4 6 8]
[ 6 8 10 12 14]]

Bug (?) with numpy indexing

I was trying to achieve a kind of 2d filter with numpy, and I found something that looks to me like a bug.
In the example below, I'm trying to target the 2nd and 4th columns of the first, second and last lines of my data, ie:
[[ 2 4]
[ 8 10]
[26 28]]
I am aware that the second to last line does return that, but I wouldn't be able to assign anything there (it returns a copy). And this still doesn't explain why the last one fails.
import numpy as np
# create my data: 5x6 array
data = np.arange(0,30).reshape(5,6)
# mask: only keep row 1,2,and 5
mask = np.array([1,1,0,0,1])
mask = mask.astype(bool)
# this is fine
print 'data\n', data, '\n'
# this is good
print 'mask\n', mask, '\n'
# this is nice
print 'data[mask]\n', data[mask], '\n'
# this is great
print 'data[mask, 2]\n', data[mask, 2], '\n'
# this is awesome
print 'data[mask][:,[2,4]]\n', data[mask][:,[2,4]], '\n'
# this fails ??
print 'data[mask, [2,4]]\n', data[mask, [2,4]], '\n'
output:
data
[[ 0 1 2 3 4 5]
[ 6 7 8 9 10 11]
[12 13 14 15 16 17]
[18 19 20 21 22 23]
[24 25 26 27 28 29]]
mask
[ True True False False True]
data[mask]
[[ 0 1 2 3 4 5]
[ 6 7 8 9 10 11]
[24 25 26 27 28 29]]
data[mask, 2]
[ 2 8 26]
data[mask][:,[2,4]]
[[ 2 4]
[ 8 10]
[26 28]]
data[mask, [2,4]]
Traceback (most recent call last):
[...]
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (3,) (2,)
I'm posting this here, because I'm not confident enough in my numpy skills to be sure this is a bug, and file a bug report...
Thanks for your help/feedback !
This is not a bug.
This is an implementation definition
If you read array indexing in section Advanced Indexing you notice that it says
Purely integer array indexing When the index consists of as many
integer arrays as the array being indexed has dimensions, the indexing
is straight forward, but different from slicing. Advanced indexes
always are broadcast and iterated as one:
result[i_1, ..., i_M] == x[ind_1[i_1, ..., i_M], ind_2[i_1, ..., i_M],
..., ind_N[i_1, ..., i_M]]
therefore
print 'data[mask, [2,4]]\n', data[mask, [1,2,4]], '\n'
works and outputs
data[mask, [1,2,4]]
[ 1 8 28]
index length in broadcasting must be the same
Maybe you can achieve what you want using ix_ function. See array indexing
columns = np.array([2, 4], dtype=np.intp)
print data[np.ix_(mask, columns)]
which outputs
[[ 2 4]
[ 8 10]
[26 28]]

How to save a large array in txt file in python

I have 2D array:
import numpy as np
output = np.array([1,1,6])*np.arange(6)[:,None]+1
output
Out[32]:
array([[ 1, 1, 1],
[ 2, 2, 7],
[ 3, 3, 13],
[ 4, 4, 19],
[ 5, 5, 25],
[ 6, 6, 31]])
I tried to use np.savetxt('file1.txt', output, fmt='%10d')
i have got the result in one line only
How can I save it in txt file simillar to :
x y z
1 1 1
2 2 7
3 3 13
4 4 19
5 5 25
6 6 31
3 separate columns, each column has name (x,y,z)
Please note: the original array too large (40000000 rows and 3 columns), I am using Python 3.6
I have tried the solutions in here and here but, it does not work with me
Noor, let me guess - you are using windows notepad to view the file?
I use Notepad++ which is smart enough to understand Unix-style-Lineendings which are used (by default) when creating files by np.savetxt() even when operated under windows.
You might want to explicitly specify newline="\r\n" when calling savetxt.
np.savetxt('file1.txt', output, fmt='%10d' ,header= " x y z", newline="\r\n")
Doku: https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.savetxt.html
I am not sure about your data, but this:
import numpy as np
output = np.array([1,1,6])*np.arange(60)[:,None]+1
print(output)
np.savetxt('file1.txt', output, fmt='%10d' ,header= " x y z")
Produces this output:
# x y z
1 1 1
2 2 7
3 3 13
=== snipped a few lines ===
58 58 343
59 59 349
60 60 355
for me.
for np.arange(1000000) its about 32MB big and similarly formatted...
for np.arange(10000000) its about 322MB big and similarly formatted...
willem-van-onsem 1+Gb was far closer.
I did not account for the spacing of fixed 10 chars per number, my bad.

Creating new numpy matrix from list of rows of existing matrix

I have a 2D numpy array A, and a list x. The elements of x are indices of the rows of A. I want to create a new matrix B, by taking the rows of A as indicated by x. How can I do this?
You can pass x as an argument when indexing A to create your new matrix B as below. See the docs here.
import numpy as np
A = np.arange(25).reshape((5,5))
x = [1, 2, 4]
B = A[x]
print(B)
# [[ 5 6 7 8 9]
# [10 11 12 13 14]
# [20 21 22 23 24]]

Categories

Resources