Python: appending to numpy array at certain indexes and change shape

I have a numpy array like this:
print(pred_galactic_prob.shape)
print(pred_galactic_prob[0:3])
(465, 5)
[[0.05 0.94 0.3 0.01 0.5 ]
[0.01 0.02 0.01 0.85 0.11]
[0.03 0.95 0.3 0.3 0.02]]
I want to append to this and change the shape so there are 13 columns and it would look like this:
[[0.05 0. 0.94 0. 0. 0.3 0. 0. 0.01 0. 0. 0. 0.5 ]
[0.01 0. 0.02 0. 0. 0.01 0. 0. 0.85 0. 0. 0. 0.11]
[0.03 0. 0.95 0. 0. 0.3 0. 0. 0.3 0. 0. 0. 0.02]]
i.e. one column of zeros is added after the first column, two columns of zeros after the second, and so on, as shown above.
I have tried the following:
pred_galactic_prob2 = np.array
for i in pred_galactic_prob:
    pred_galactic_prob2 = np.append(pred_galactic_prob2, [i[0], 0.0, i[1], 0.0, 0.0, i[2], 0.0, 0.0, i[3], 0.0, 0.0, 0.0, i[4]])
but this just turns it into a 1D array.

A "one-line" solution would be
np.concatenate((a[:,:1],
np.lib.stride_tricks.as_strided(0,[len(a),1],[0,0]),
a[:,1:2],
np.lib.stride_tricks.as_strided(0,[len(a),2],[0,0]),
a[:,2:3],
np.lib.stride_tricks.as_strided(0,[len(a),2],[0,0]),
a[:,3:4],
np.lib.stride_tricks.as_strided(0,[len(a),3],[0,0]),
a[:,4:]), -1)
Though it's weird in every sense. Using append would need even more as_strided calls. I believe there should be an append-like function that automatically broadcasts its input, but I'm not sure what it is. Anyway, a better solution is definitely the one @hpaulj mentioned:
b = np.zeros((len(a), 13), a.dtype)
b[:,[0,2,5,8,12]] = a
here a means input, b means output
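As a quick check of the zeros-plus-column-assignment approach, here is a minimal sketch using the three rows shown in the question. The names a and target_cols are only for illustration; a stands in for pred_galactic_prob.

```python
import numpy as np

# Stand-in for pred_galactic_prob (first three rows from the question).
a = np.array([[0.05, 0.94, 0.3, 0.01, 0.5],
              [0.01, 0.02, 0.01, 0.85, 0.11],
              [0.03, 0.95, 0.3, 0.3, 0.02]])

# Positions in the 13-column output where the 5 original columns should land.
target_cols = [0, 2, 5, 8, 12]

b = np.zeros((len(a), 13), dtype=a.dtype)  # all-zero output array
b[:, target_cols] = a                      # scatter the original columns into place

print(b.shape)
print(b[0])
```

Every column not listed in target_cols stays zero, so no appending or looping is needed.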

Related

How to create a rectangular grid with custom start point and step value

I'm working on a project where I need to calibrate two cameras. As you know, one needs to define a planar grid of points in the 3D world and find their correspondences on the image plane. The first camera has the following 3D grid points:
import cv2 as cv
import numpy as np
WPoints_cam1 = np.zeros((9*3,3), np.float64)
WPoints_cam1[:,:2] = np.mgrid[0:9,0:3].T.reshape(-1,2)*0.4
print(WPoints_cam1)
[[0. 0. 0. ]# world coordinate center
[0.4 0. 0. ]
[0.8 0. 0. ]
[1.2 0. 0. ]
[1.6 0. 0. ]
[2. 0. 0. ]
[2.4 0. 0. ]
[2.8 0. 0. ]
[3.2 0. 0. ]
[0. 0.4 0. ]
[0.4 0.4 0. ]
[0.8 0.4 0. ]
[1.2 0.4 0. ]
[1.6 0.4 0. ]
[2. 0.4 0. ]
[2.4 0.4 0. ]
[2.8 0.4 0. ]
[3.2 0.4 0. ]
[0. 0.8 0. ]
[0.4 0.8 0. ]
[0.8 0.8 0. ]
[1.2 0.8 0. ]
[1.6 0.8 0. ]
[2. 0.8 0. ]
[2.4 0.8 0. ]
[2.8 0.8 0. ]
[3.2 0.8 0. ]]
As seen above, the first grid (for the first camera) starts at the reference 3D point (0, 0, 0) and ends at (3.2, 0.8, 0), with a constant spacing of 0.4 and dimensions 9x3.
Note that all Z coordinates are set to Z=0 (Zhengyou Zhang calibration).
Now my question: I need to define a second grid (for the second camera) that refers to the same 3D coordinate center (0, 0, 0). It should start at (3.6, 0, 0) and end at (6.8, 0.8, 0), with the same spacing of 0.4 and dimensions 9x3.
I believe this is easy to do, but as a beginner I can't figure it out.
I would appreciate some help. Thanks in advance.

You can scale each column like this:
np.mgrid[0:8, 0:3].T.reshape(-1,2) * np.array([(7.8 - 3.6) / 7, 0.4]) + np.array([3.6, 0])
or combine it into a scaling matrix like this (and then add a vector for the translation):
np.mgrid[0:8, 0:3].T.reshape(-1,2) @ np.array([[(7.8 - 3.6) / 7, 0], [0, 0.4]]).T + np.array([3.6, 0])
Regarding where (7.8 - 3.6) / 7 comes from: the numerator is the extent of the target grid (end minus start). The denominator is the same quantity for the integer index grid. With 0:8 the max is 7 and the min is 0, so the denominator becomes 7 - 0.
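For the exact numbers in the question (9x3 points, spacing 0.4, starting at (3.6, 0, 0)), the same scale-then-translate idea can be sketched as follows. WPoints_cam2, step and origin are names assumed here for illustration; they do not appear in the original post.

```python
import numpy as np

step = 0.4                      # grid spacing from the question
origin = np.array([3.6, 0.0])   # assumed start point of the second grid (x, y)

# Integer (x, y) indices of a 9x3 grid, with x varying fastest.
idx = np.mgrid[0:9, 0:3].T.reshape(-1, 2)

WPoints_cam2 = np.zeros((9 * 3, 3), np.float64)   # Z column stays 0
WPoints_cam2[:, :2] = idx * step + origin          # scale, then translate

print(WPoints_cam2[0])    # first grid point
print(WPoints_cam2[-1])   # last grid point
```

Because the spacing is unchanged, only the added offset differs from the first camera's grid.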

How to calculate formula for every value in an array?

I'm trying to understand how to use numpy to evaluate a formula at different times. The way the code is written, it only gives the values where y is greater than or equal to 0. I am experimenting with how to get the values for all y's.
Can someone explain the part ft = t * [y >= 0.0 ]? How do I use the expression within the brackets?
from numpy import *
g = 10.0
h0 = 10.0
t = arange(0, 10.1 ,0.1)
y = h0 - 0.5*g*t*t
ft = t * [y >= 0.0 ]
print(ft)
This is the output, but I would like to see all the values calculated. I experimented a bit, but I could not figure out how to do it or how exactly the [y >= 0.0] part works.
[[0. 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. 1.1 1.2 1.3 1.4 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. ]]
If I use [y] instead of [y >= 0.0] I get the following:
[[ 0.000000e+00 9.950000e-01 1.960000e+00 2.865000e+00 3.680000e+00
4.375000e+00 4.920000e+00 5.285000e+00 5.440000e+00 5.355000e+00
5.000000e+00 4.345000e+00 3.360000e+00 2.015000e+00 2.800000e-01
-1.875000e+00 -4.480000e+00 -7.565000e+00 -1.116000e+01 -1.529500e+01
-2.000000e+01 -2.530500e+01 -3.124000e+01 -3.783500e+01 -4.512000e+01
-5.312500e+01 -6.188000e+01 -7.141500e+01 -8.176000e+01 -9.294500e+01
-1.050000e+02 -1.179550e+02 -1.318400e+02 -1.466850e+02 -1.625200e+02
-1.793750e+02 -1.972800e+02 -2.162650e+02 -2.363600e+02 -2.575950e+02
-2.800000e+02 -3.036050e+02 -3.284400e+02 -3.545350e+02 -3.819200e+02
-4.106250e+02 -4.406800e+02 -4.721150e+02 -5.049600e+02 -5.392450e+02
-5.750000e+02 -6.122550e+02 -6.510400e+02 -6.913850e+02 -7.333200e+02
-7.768750e+02 -8.220800e+02 -8.689650e+02 -9.175600e+02 -9.678950e+02
-1.020000e+03 -1.073905e+03 -1.129640e+03 -1.187235e+03 -1.246720e+03
-1.308125e+03 -1.371480e+03 -1.436815e+03 -1.504160e+03 -1.573545e+03
-1.645000e+03 -1.718555e+03 -1.794240e+03 -1.872085e+03 -1.952120e+03
-2.034375e+03 -2.118880e+03 -2.205665e+03 -2.294760e+03 -2.386195e+03
-2.480000e+03 -2.576205e+03 -2.674840e+03 -2.775935e+03 -2.879520e+03
-2.985625e+03 -3.094280e+03 -3.205515e+03 -3.319360e+03 -3.435845e+03
-3.555000e+03 -3.676855e+03 -3.801440e+03 -3.928785e+03 -4.058920e+03
-4.191875e+03 -4.327680e+03 -4.466365e+03 -4.607960e+03 -4.752495e+03
-4.900000e+03]]
I would like to know how I can use numpy to calculate all the outcomes of a formula at once for different times.
Thanks,

y >= 0.0 gives you an array of Booleans containing True/False depending on whether the condition y >= 0.0 is fulfilled. When you enclose it within [] as [y >= 0.0], you get a list which contains a single array of Booleans, as pointed out by @nicola in the comments below.
[array([ True, True, True, True, True, False, False, False,...
... False, False, False, False])]
Now you multiply this by your arange array. The result is 0 wherever the right-hand side of the * operator is False, and the actual value from the arange wherever it is True.

The expression [y >= 0.0] produces an array of booleans, i.e. 1 if y >= 0 and 0 if not. That array of 1's and 0's is then multiplied by t.
It is not clear to me from your question, however, what you are trying to do with it.
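To make the role of the mask concrete, here is a small sketch based on the question's code. The names mask, zeroed, kept, clipped and wrapped are introduced only for illustration. Note that y itself is already computed for every t at once; the mask only decides which entries are zeroed or selected afterwards.

```python
import numpy as np

g = 10.0
h0 = 10.0
t = np.arange(0, 10.1, 0.1)
y = h0 - 0.5 * g * t * t           # evaluated for every t in one step

mask = y >= 0.0                    # boolean array, one entry per t

zeroed = t * mask                  # False acts as 0, True as 1; same shape as t
wrapped = t * [mask]               # the extra [] adds a leading axis of length 1
kept = t[mask]                     # boolean indexing: only the times where y >= 0
clipped = np.where(mask, y, 0.0)   # y where it is non-negative, else 0

print(y.shape, zeroed.shape, wrapped.shape, kept.shape)
```

Multiplying by the mask keeps the array length and zeroes entries, while indexing with it drops the entries where the condition is False.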

2D Array Manipulation (Replacing list elements)

I'm creating a program which outputs a 5x5 matrix of 0's. I then ask the user to input a number between 0-25 which will turn the selected element to a 1
I need the matrix output to show 0s, but really, behind the scenes it needs to be like this:
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
16 17 18 19 20
21 22 23 24 25
For example: User inputs 7. The matrix will then output:
/output/
Please enter a number between 0-25:
0 0 0 0 0
0 1 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
What would be the easiest way to do this?
Current code:
def main():
    grid = [[0 for row in range(5)] for col in range(5)]  # creates a 5x5 matrix
    # prints the matrix
    for row in grid:  # for each row in the grid
        for column in row:  # for each column in the row
            print(column, end=" ")  # print a space at the end of the element
        print()
    player1 = input("Please enter a number between 0-25: ")
main()

The function that you are looking for is numpy.unravel_index, which converts a flat index (0-24 in the 5x5 case) to a shaped index.
Also, note that a 5x5 matrix contains 25 elements, with flat indices ranging from 0 to 24, not 0 to 25.
Below is a piece of code demonstrating how you could do this. Also I have added a check for the user input so the number entered can only be an integer in the flat index range.
import numpy as np
sh = (5,5)
a = np.zeros(sh)
print(a)
while True:
    try:
        player1 = int(input('Number1: '))
        if player1 < 0 or player1 > a.size-1:
            raise ValueError  # this will send it to the print message and back to the input option
        break
    except ValueError:
        print("Invalid integer. The number must be in the range of 0-{}.".format(a.size - 1))
a[np.unravel_index(player1,sh)] = 1
print(a)
This is the output:
[[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]]
Number1: 7
[[0. 0. 0. 0. 0.]
[0. 0. 1. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]]
EDIT (If you really need a list)
To use numpy operating on your list just convert the list to a numpy.array, use numpy.unravel_index and convert it back to a list:
import numpy as np
sh = (5,5)
a_array = np.zeros(sh)
a_list = a_array.tolist() # Here numpy converts a to a list (your starting point)
a_array = np.array(a_list) # Here it converts it to a numpy.array
while True:
    try:
        player1 = int(input('Number1: '))
        if player1 < 0 or player1 > a_array.size-1:
            raise ValueError  # this will send it to the print message and back to the input option
        break
    except ValueError:
        print("Invalid integer. The number must be in the range of 0-{}.".format(a_array.size - 1))
a_array[np.unravel_index(player1,sh)] = 1
a_list = a_array.tolist()
print(a_list)
You will get the same output but in a list form
[[0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0]]
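If the cells are meant to be numbered 1-25 as in the question's diagram (so that 7 lands in the second row, second column), one possible adaptation is to subtract 1 before calling np.unravel_index. This is only a sketch: mark_cell is a hypothetical helper, not part of the answer above, and it takes the number as an argument instead of reading user input.

```python
import numpy as np

def mark_cell(grid, number):
    """Set the cell numbered `number` (1-based, row-major order) to 1."""
    if not 1 <= number <= grid.size:
        raise ValueError("number must be in the range 1-{}".format(grid.size))
    # Convert the 1-based cell number to a 0-based flat index, then to (row, col).
    row, col = np.unravel_index(number - 1, grid.shape)
    grid[row, col] = 1
    return grid

grid = np.zeros((5, 5), dtype=int)
mark_cell(grid, 7)
print(grid)
```

For a 2D grid this is equivalent to divmod(number - 1, n_cols), which may be convenient if the grid is a plain list of lists.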

Pandas Multi-Index DataFrame to Numpy Ndarray

I am trying to convert a multi-index pandas DataFrame into a numpy.ndarray. The DataFrame is below:
s1 s2 s3 s4
Action State
1 s1 0.0 0 0.8 0.2
s2 0.1 0 0.9 0.0
2 s1 0.0 0 0.9 0.1
s2 0.0 0 1.0 0.0
I would like the resulting numpy.ndarray to be the following with np.shape() = (2,2,4):
[[[ 0.0 0.0 0.8 0.2 ]
[ 0.1 0.0 0.9 0.0 ]]
[[ 0.0 0.0 0.9 0.1 ]
[ 0.0 0.0 1.0 0.0]]]
I have tried df.as_matrix() but this returns:
[[ 0. 0. 0.8 0.2]
[ 0.1 0. 0.9 0. ]
[ 0. 0. 0.9 0.1]
[ 0. 0. 1. 0. ]]
How do I return a list of lists for the first level with each list representing an Action records.

You could use the following:
dim1 = len(df.index.get_level_values(0).unique())
result = df.values.reshape((dim1, -1, df.shape[1]))  # -1 lets NumPy infer the size of the second level
print(result)
[[[ 0. 0. 0.8 0.2]
[ 0.1 0. 0.9 0. ]]
[[ 0. 0. 0.9 0.1]
[ 0. 0. 1. 0. ]]]
The first line just finds the number of groups in the first index level, i.e. the groups you would otherwise get from a groupby.
Why this (or groupby) is needed: as soon as you use .values, you lose the dimensionality of the MultiIndex from pandas. So you need to re-pass that dimensionality to NumPy in some way.

One way
In [151]: df.groupby(level=0).apply(lambda x: x.values.tolist()).values
Out[151]:
array([[[0.0, 0.0, 0.8, 0.2],
[0.1, 0.0, 0.9, 0.0]],
[[0.0, 0.0, 0.9, 0.1],
[0.0, 0.0, 1.0, 0.0]]], dtype=object)

Using Divakar's suggestion, np.reshape() worked:
>>> print(P)
s1 s2 s3 s4
Action State
1 s1 0.0 0 0.8 0.2
s2 0.1 0 0.9 0.0
2 s1 0.0 0 0.9 0.1
s2 0.0 0 1.0 0.0
>>> np.reshape(P,(2,2,-1))
[[[ 0. 0. 0.8 0.2]
[ 0.1 0. 0.9 0. ]]
[[ 0. 0. 0.9 0.1]
[ 0. 0. 1. 0. ]]]
>>> np.shape(P)
(2, 2, 4)

Elaborating on Brad Solomon's answer, to get a slightly more generic solution (index levels of different sizes and an arbitrary number of levels), one could do something like this:
def df_to_numpy(df):
    try:
        shape = [len(level) for level in df.index.levels]
    except AttributeError:
        shape = [len(df.index)]
    ncol = df.shape[-1]
    if ncol > 1:
        shape.append(ncol)
    return df.to_numpy().reshape(shape)
If df has missing sub-indexes reshape will not work. One way to add them would be (maybe there are better solutions):
def enforce_df_shape(df):
    try:
        ind = pd.MultiIndex.from_product([level.values for level in df.index.levels])
    except AttributeError:
        return df
    fulldf = pd.DataFrame(-1, columns=df.columns, index=ind)  # remove -1 to fill fulldf with nan
    fulldf.update(df)
    return fulldf

If you are just trying to pull out one column, say s1, and get an array with shape (2,2) you can use the .index.levshape like this:
x = df.s1.to_numpy().reshape(df.index.levshape)
This will give you a (2,2) containing the value of s1.
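The same levshape idea can be extended to the whole frame. The following is a minimal sketch that rebuilds the DataFrame from the question; it assumes the index is complete (every Action/State combination present) and sorted in level order, otherwise the reshape would either fail or mix up rows.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
index = pd.MultiIndex.from_product([[1, 2], ["s1", "s2"]],
                                   names=["Action", "State"])
df = pd.DataFrame(
    [[0.0, 0, 0.8, 0.2],
     [0.1, 0, 0.9, 0.0],
     [0.0, 0, 0.9, 0.1],
     [0.0, 0, 1.0, 0.0]],
    index=index, columns=["s1", "s2", "s3", "s4"])

# One axis per index level, plus a trailing axis for the columns.
arr = df.to_numpy().reshape(*df.index.levshape, df.shape[1])

print(arr.shape)
print(arr[0])   # rows belonging to Action 1
```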

Fastest way to compute upper-triangular matrix of geometric series (Python)

Thanks in advance for the help.
Using Python (mostly numpy), I am trying to compute an upper-triangular matrix where each row holds the leading terms of a geometric series, shifted right by one position per row, with all rows using the same parameter.
For example, if my parameter is B (where abs(B) <= 1, i.e. B in [-1,1]), then row 1 would be [1 B B^2 B^3 ... B^(N-1)], row 2 would be [0 1 B B^2...B^(N-2)] ... row N would be [0 0 0 ... 1].
This computation is key to a Bayesian Metropolis-Gibbs sampler, and so needs to be done thousands of times for new values of "B".
I have currently tried this two ways:
Method 1 - Mostly Vectorized:
B_Matrix = np.triu(np.dot(np.reshape(B**(-1*np.array(range(N))),(N,1)),np.reshape(B**(np.array(range(N))),(1,N))))
Essentially, this is the upper triangle part of a product of an Nx1 and 1xN set of matrices:
upper triangle ([1 B^(-1) B^(-2) ... B^(-(N-1))]' * [1 B B^2 B^3 ... B^(N-1)])
This works great for small N (algebraically it is correct), but for large N it fails, and it also produces errors for B=0 (which should be allowed). I believe this stems from B^(-N) ~ inf for small B and large N.
Method 2:
B_Matrix = np.zeros((N,N))
B_Row_1 = B**(np.array(range(N)))
for n in range(N):
    B_Matrix[n,n:] = B_Row_1[0:N-n]
So that just fills in the matrix row by row, but uses a loop which slows things down.
I was wondering if anyone had run into this before, or had any better ideas on how to compute this matrix in a faster way.
I've never posted on stackoverflow before, but didn't see this question anywhere, and thought I'd ask.
Let me know if there's a better place to ask this, and if I should provide anymore detail.

You could use scipy.linalg.toeplitz:
In [12]: n = 5
In [13]: b = 0.5
In [14]: toeplitz(b**np.arange(n), np.zeros(n)).T
Out[14]:
array([[ 1. , 0.5 , 0.25 , 0.125 , 0.0625],
[ 0. , 1. , 0.5 , 0.25 , 0.125 ],
[ 0. , 0. , 1. , 0.5 , 0.25 ],
[ 0. , 0. , 0. , 1. , 0.5 ],
[ 0. , 0. , 0. , 0. , 1. ]])
If your use of the array is strictly "read only", you can play tricks with numpy strides to quickly create an array that uses only 2*n-1 elements (instead of n^2):
In [55]: from numpy.lib.stride_tricks import as_strided
In [56]: def make_array(b, n):
   ....:     vals = np.zeros(2*n - 1)
   ....:     vals[n-1:] = b**np.arange(n)
   ....:     a = as_strided(vals[n-1:], shape=(n, n), strides=(-vals.strides[0], vals.strides[0]))
   ....:     return a
....:
In [57]: make_array(0.5, 4)
Out[57]:
array([[ 1. , 0.5 , 0.25 , 0.125],
[ 0. , 1. , 0.5 , 0.25 ],
[ 0. , 0. , 1. , 0.5 ],
[ 0. , 0. , 0. , 1. ]])
If you will modify the array in-place, make a copy of the result returned by make_array(b, n). That is, arr = make_array(b, n).copy().
The function make_array2 incorporates the suggestion @Jaime made in the comments:
In [30]: def make_array2(b, n):
   ....:     vals = np.zeros(2*n-1)
   ....:     vals[n-1] = 1
   ....:     vals[n:] = b
   ....:     np.cumprod(vals[n:], out=vals[n:])
   ....:     a = as_strided(vals[n-1:], shape=(n, n), strides=(-vals.strides[0], vals.strides[0]))
   ....:     return a
....:
In [31]: make_array2(0.5, 4)
Out[31]:
array([[ 1. , 0.5 , 0.25 , 0.125],
[ 0. , 1. , 0.5 , 0.25 ],
[ 0. , 0. , 1. , 0.5 ],
[ 0. , 0. , 0. , 1. ]])
make_array2 is more than twice as fast as make_array:
In [35]: %timeit make_array(0.99, 600)
10000 loops, best of 3: 23.4 µs per loop
In [36]: %timeit make_array2(0.99, 600)
100000 loops, best of 3: 10.7 µs per loop
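For completeness, a loop-free variant that avoids the negative powers of Method 1 can be sketched with plain broadcasting. This is not from the answer above: geometric_upper_loop and geometric_upper_broadcast are hypothetical names, and the broadcast version builds the full n x n exponent matrix, so it has not been timed against the strided versions. The loop version is included only as a reference to check against.

```python
import numpy as np

def geometric_upper_loop(b, n):
    """Row-by-row fill, as in Method 2 from the question (reference version)."""
    out = np.zeros((n, n))
    row = b ** np.arange(n)
    for i in range(n):
        out[i, i:] = row[:n - i]
    return out

def geometric_upper_broadcast(b, n):
    """Build the exponent matrix j - i by broadcasting, then mask the lower part."""
    k = np.arange(n)[None, :] - np.arange(n)[:, None]   # k[i, j] = j - i
    # Clamp exponents to >= 0 so no negative powers (and no division by zero
    # for b = 0) are ever evaluated; entries below the diagonal become 0.
    return np.where(k >= 0, float(b) ** np.maximum(k, 0), 0.0)

for b in (0.5, 0.0, -0.9):
    assert np.allclose(geometric_upper_loop(b, 6), geometric_upper_broadcast(b, 6))

print(geometric_upper_broadcast(0.5, 4))
```

Since only non-negative exponents are used, B = 0 and large N do not produce inf values here.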
