I am confused about doing vectorization using numpy.
In particular, I have a matrix of this form:
of type <type 'list'>
[[0.0, 0.0, 0.0, 0.0], [0.02, 0.04, 0.0325, 0.04], [1, 2, 3, 4]]
How do I make it look like the following using numpy?
[[ 0.0 0.0 0.0 0.0 ]
[ 0.02 0.04 0.0325 0.04 ]
[ 1 2 3 4 ]]
Yes, I know I can do it using:
np.array([[0.0, 0.0, 0.0, 0.0], [0.02, 0.04, 0.0325, 0.04], [1, 2, 3, 4]])
But I have a very long matrix, and I can't just type out each row like that. How can I handle the case when I have a very long matrix?
This is not a matrix of type list; it is a list that contains lists. You may think of it as a matrix, but to Python it is just a list:
alist = [[0.0, 0.0, 0.0, 0.0], [0.02, 0.04, 0.0325, 0.04], [1, 2, 3, 4]]
arr = np.array(alist)
works just the same as
arr = np.array([[0.0, 0.0, 0.0, 0.0], [0.02, 0.04, 0.0325, 0.04], [1, 2, 3, 4]])
This creates a 2D array with shape (3, 4) and dtype float:
In [212]: arr = np.array([[0.0, 0.0, 0.0, 0.0], [0.02, 0.04, 0.0325, 0.04], [1, 2, 3, 4]])
In [213]: arr
Out[213]:
array([[ 0. , 0. , 0. , 0. ],
[ 0.02 , 0.04 , 0.0325, 0.04 ],
[ 1. , 2. , 3. , 4. ]])
In [214]: print(arr)
[[ 0. 0. 0. 0. ]
[ 0.02 0.04 0.0325 0.04 ]
[ 1. 2. 3. 4. ]]
Assuming you start with one long flat sequence (called array here), why not split it into rows of the right size (n):
splitted = [array[i:i + n] for i in range(0, len(array), n)]
and make the matrix from that:
np.array(splitted)
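As a rough illustration of the splitting idea above, here is a minimal sketch. The names flat, n, from_rows and reshaped are made up for the example, and it assumes the length of the flat list is a multiple of n. It also shows that reshape gives the same result without building the intermediate list of rows.

```python
import numpy as np

# Hypothetical flat data: 12 values that should become 3 rows of 4.
flat = list(range(12))
n = 4

# Split the flat list into rows of length n, then convert to an array.
rows = [flat[i:i + n] for i in range(0, len(flat), n)]
from_rows = np.array(rows)

# Alternative: convert first, then let reshape infer the row count (-1).
reshaped = np.array(flat).reshape(-1, n)

print(from_rows.shape)
print(np.array_equal(from_rows, reshaped))
```

Note that reshape raises an error when the length is not a multiple of n, whereas the slicing version would silently produce a shorter last row.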
If you're saying you have a list of lists stored in Python object A, all you need to do is call np.array(A) which will return a numpy array using the elements of A. Otherwise, you need to specify what form your data is in right now to clarify how you want to load your data.
I have a NumPy array made of ragged nested sequences such as the following:
arr = np.array((
np.random.random((2, 2, 2)),
np.random.random((4, 4, 4)),
np.random.random((2, 2, 2))
))
I want to resize each of the nested arrays to the shape (4, 4, 4) by filling it with zeros.
I initially looked at this post, numpy - resize array filling with 0, which works for 2D NumPy arrays, but I have struggled to modify it for a 3D NumPy array.
So far I have tried iterating over the individual nested arrays. However, even fairly basic code such as
for i, a in enumerate(arr[0]):
    arr[0][i] = np.hstack([a, np.zeros([a.shape[0], 2])])
still raises an error:
ValueError: could not broadcast input array from shape (2,4) into shape (2,2)
I could create separate variables for every nested array, but this feels very slow and inefficient, and I'd need even messier code to extend it to all 3 dimensions.
An example of a test:
arr = [[[0.1, 0.4],
        [0.3, 0.7]],
       [[0.5, 0.2],
        [0.8, 0.1]]]
If I wanted it to have the shape (2, 3, 4) the output would be the following
[[[0.1, 0.4, 0.0, 0.0],
  [0.3, 0.7, 0.0, 0.0],
  [0.0, 0.0, 0.0, 0.0]],
 [[0.5, 0.2, 0.0, 0.0],
  [0.8, 0.1, 0.0, 0.0],
  [0.0, 0.0, 0.0, 0.0]]]
UPDATE:
Don't even need to use pad then:
def pad_3d(arr: np.ndarray, out_shape: tuple[int, int, int]) -> np.ndarray:
    x, y, z = arr.shape
    output = np.zeros(out_shape, dtype=arr.dtype)
    output[:x, :y, :z] = arr
    return output
test_arr = np.array(
[[[0.1, 0.4],
[0.3, 0.7]],
[[0.5, 0.2],
[0.8, 0.1]]]
)
desired_shape = (2, 3, 4)
expected_output = np.array(
[[[0.1, 0.4, 0.0, 0.0],
[0.3, 0.7, 0.0, 0.0],
[0.0, 0.0, 0.0, 0.0]],
[[0.5, 0.2, 0.0, 0.0],
[0.8, 0.1, 0.0, 0.0],
[0.0, 0.0, 0.0, 0.0]]]
)
assert np.all(expected_output == pad_3d(test_arr, desired_shape)) # True
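The same zeros-then-assign idea should carry over to any number of dimensions. The following is only a sketch under that assumption; pad_to_shape is a hypothetical name, not something from NumPy, and the shape checks are an addition for the example.

```python
import numpy as np

def pad_to_shape(arr, out_shape):
    """Copy arr into the low-index corner of a zero array of out_shape."""
    if arr.ndim != len(out_shape):
        raise ValueError("out_shape must have one entry per array axis")
    if any(size > target for size, target in zip(arr.shape, out_shape)):
        raise ValueError("out_shape must be at least as large as arr.shape")
    output = np.zeros(out_shape, dtype=arr.dtype)
    # One slice per axis, equivalent to output[:x, :y, :z] in the 3D case.
    corner = tuple(slice(0, size) for size in arr.shape)
    output[corner] = arr
    return output

small = np.ones((2, 2, 2))
padded = pad_to_shape(small, (2, 3, 4))
print(padded.shape)
```

Applied to the ragged collection from the question, a list comprehension such as [pad_to_shape(a, (4, 4, 4)) for a in arrays] would bring every element to a common shape.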
Original answer:
It's not entirely clear how you want to fill the resulting arrays with zeros around your data. Only on one side along each axis? Or do you want to essentially "center" your original data amidst the zeros?
Either way, I see no way around creating new arrays. The pad function does what you want, I think. Here is a simplified example for one array, where I "pad around" the data:
import numpy as np
a = np.arange(2*2*2).reshape((2, 2, 2))
x = np.pad(a, 1)  # one layer of zeros on every side: (2, 2, 2) -> (4, 4, 4); a width of 0 would add nothing
If you want to pad on one side with zeros:
x = np.pad(a, (0, 2))
Assuming your arrays are always cubic, i.e. of the shape (n, n, n), you can generalize like this:
def pad_with_zeros(arr, target_size):
    return np.pad(arr, (0, target_size - arr.shape[0]))
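For the "centered" variant raised earlier in this answer, np.pad also accepts one (before, after) pair per axis, which lifts the cubic restriction. This is a minimal sketch; pad_centered is a hypothetical helper, and putting the odd leftover cell after the data is an arbitrary choice made for the example.

```python
import numpy as np

def pad_centered(arr, target_shape):
    """Zero-pad arr so the data sits roughly in the middle of target_shape."""
    widths = []
    for size, target in zip(arr.shape, target_shape):
        extra = target - size
        if extra < 0:
            raise ValueError("target_shape is smaller than the array")
        before = extra // 2
        # Any odd leftover cell goes after the data.
        widths.append((before, extra - before))
    # The default mode of np.pad is 'constant' with a fill value of 0.
    return np.pad(arr, widths)

a = np.ones((2, 2, 2))
centered = pad_centered(a, (4, 4, 4))
print(centered.shape)
```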
IIUC, here is one way to do it:
Assuming your arr is actually a list or a tuple:
arr = (
np.random.random((2, 2, 2)),
np.random.random((4, 4, 4)),
np.random.random((2, 2, 2)),
)
# new shape: max length in each dimension:
shape = np.c_[[x.shape for x in arr]].max(0)
>>> shape
array([4, 4, 4])
# pad all arrays
new = [np.pad(x, np.c_[[0]*len(shape), shape - x.shape]) for x in arr]
>>> new[0].shape
(4, 4, 4)
>>> new[0]
array([[[0.5488135 , 0.71518937, 0. , 0. ],
[0.60276338, 0.54488318, 0. , 0. ],
[0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. ]],
[[0.4236548 , 0.64589411, 0. , 0. ],
[0.43758721, 0.891773 , 0. , 0. ],
[0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. ]],
[[0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. ]],
[[0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. ]]])
I am trying to insert a column into a 2D array.
Currently I have a 2D array generated using itertools.
sample_points=[-1.5, -.8]
base_points = itertools.combinations_with_replacement(sample_points, 3)
base_points_list=list(base_points)
base_points_array=np.asarray(base_points_list)
Then I get an array which looks like this:
>>> base_points_array
array([[-1.5, -1.5, -1.5],
[-1.5, -1.5, -0.8],
[-1.5, -0.8, -0.8],
[-0.8, -0.8, -0.8]])
I want to add a column at the beginning so that the array looks like this:
[[1 -1.5 -1.5 -1.5]
[1 -1.5 -1.5 -0.8]
[1 -1.5 -0.8 -0.8]
[1 -0.8 -0.8 -0.8]]
So I used the command:
np.insert(base_points_array,0,1,1)
Because it should be able to do that using broadcasting.
But I get something completely different. The number of rows has changed:
array([[ 1. , -1.5, -1.5, -1.5, -0.8],
[ 1. , -1.5, -1.5, -0.8, -0.8],
[ 1. , -1.5, -0.8, -0.8, -0.8]])
What am I doing wrong?
You can use np.append. Suppose the column you want to insert is a 1D array:
insert_array= [1, 1, 1, 1]
You need to expand the dimension of your inserting array by 1 first, you can do it with
insert_array= np.expand_dims(insert_array, 1)
And then you can use the append method
base_points_array= np.append(insert_array, base_points_array, 1)
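For comparison, here is a small sketch of two other ways to prepend a column of ones, rebuilt from the setup in the question. The variable names base, ones and with_ones are illustrative only.

```python
import itertools
import numpy as np

sample_points = [-1.5, -0.8]
base = np.asarray(
    list(itertools.combinations_with_replacement(sample_points, 3))
)

# Build an explicit (rows, 1) column and stack it to the left of the data.
ones = np.ones((base.shape[0], 1))
with_ones = np.hstack([ones, base])

# np.insert with a scalar value broadcasts it down the new column.
inserted = np.insert(base, 0, 1, axis=1)

print(with_ones.shape)
print(np.array_equal(with_ones, inserted))
```

Both calls return a new array and leave base untouched, so the result has to be assigned to a name to be kept.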
I have an array like this:
[[0.13, 0.19],
[0.25, 0.6 ],
[0.7 , 0.89]]
I want, given the above array, to create a result like this:
[[0, 0.12],
[0.13, 0.19],
[0.20, 0.24],
[0.25, 0.60],
[0.61, 0.69],
[0.70, 0.89],
[0.90, 1]]
Namely, I want to create the complete matrix of intervals covering [0, 1], given a set of predefined intervals.
This isn't specific to numpy, but maybe it will point you in the correct direction.
Basically, you need to know where to start, where to end, and the 'resolution' (for lack of a better word), that is, how far apart the gaps are. With that you can loop through the existing intervals and fill in the others. You'll want to watch the edge cases where the intervals are already filled in, like one starting at 0 or [0.6, 0.8], [0.9, 0.95], so you don't fill those in twice. This might look something like:
def fill_intervals(existing_intervals, start=0, end=1.0, inc=0.01):
    l2 = []
    # iterate over the argument, not the global list l
    for i in existing_intervals:
        if start < i[0]:
            # gap before this interval: fill it
            l2.append([start, i[0] - inc])
        l2.append(i)
        start = i[1] + inc
    if start < end:
        # trailing gap up to the end point
        l2.append([start, end])
    return l2
l = [
[0.13, 0.19],
[0.25, 0.6 ],
[0.7 , 0.89]
]
fill_intervals(l)
Returning:
[[0, 0.12],
[0.13, 0.19],
[0.2, 0.24],
[0.25, 0.6],
[0.61, 0.69],
[0.7, 0.89],
[0.9, 1.0]]
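Because the gap boundaries come from float arithmetic such as i[0] - inc, the computed values may carry binary floating-point residue instead of clean two-decimal numbers. One way to guard against that is sketched below. fill_gaps and its ndigits parameter are hypothetical names, and rounding to two places assumes the intervals are specified to two decimals.

```python
def fill_gaps(intervals, start=0.0, end=1.0, inc=0.01, ndigits=2):
    """Return intervals plus the gaps between them, rounded to ndigits."""
    result = []
    cursor = start
    for lo, hi in intervals:
        if cursor < lo:
            # Gap before this interval; round away float residue.
            result.append([cursor, round(lo - inc, ndigits)])
        result.append([lo, hi])
        cursor = round(hi + inc, ndigits)
    if cursor < end:
        result.append([cursor, end])
    return result

filled = fill_gaps([[0.13, 0.19], [0.25, 0.6], [0.7, 0.89]])
for row in filled:
    print(row)
```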
You can duplicate items and then make it quite close:
arr = np.array([[0.13, 0.19], [0.25, 0.6 ], [0.7 , 0.89]])
consecutive = np.r_[0, np.repeat(arr, 2), 1]
intervals = consecutive.reshape(-1, 2)
intervals:
array([[0. , 0.13], # required: [0, 0.12]
[0.13, 0.19], # OK
[0.19, 0.25], # required: [0.20, 0.24]
[0.25, 0.6 ], # OK
[0.6 , 0.7 ], # required: [0.61, 0.69]
[0.7 , 0.89], # OK
[0.89, 1. ]])# required: [0.9, 1]
It seems you need to fix alternate intervals so just do:
intervals[2::2,0] = intervals[2::2,0] + 0.01
intervals[:-1:2,1] = intervals[:-1:2,1] - 0.01
intervals:
array([[0. , 0.12],
[0.13, 0.19],
[0.2 , 0.24],
[0.25, 0.6 ],
[0.61, 0.69],
[0.7 , 0.89],
[0.9 , 1. ]])
You can use linspace to create your intervals
import numpy as np
>>> np.linspace(0, 1, num=3, endpoint=False)
array([0. , 0.33333333, 0.66666667])
I have some sparse indices:
[[0 0]
[0 1]
[1 0]
[1 1]
[1 2]
[2 0]]
The corresponding value of each index is:
[[0.1 0.2 0.3]
[0.4 0.5 0.6]
[0.7 0.8 0.9]
[1.0 1.1 1.2]
[1.3 1.4 1.5]
[1.6 1.7 1.8]]
How can I convert the 6x3 value tensor to a 3x3x3 dense tensor in TensorFlow? The value for any index not listed in indices should be the zero vector [0. 0. 0.]. The dense tensor should look like this:
[[[0.1 0.2 0.3]
[0.4 0.5 0.6]
[0.0 0.0 0.0]]
[[0.7 0.8 0.9]
[1.0 1.1 1.2]
[1.3 1.4 1.5]]
[[1.6 1.7 1.8]
[0.0 0.0 0.0]
[0.0 0.0 0.0]]]
You can do that with tf.scatter_nd:
import tensorflow as tf
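# TensorFlow 1.x graph-mode API; under TensorFlow 2.x, Session is available as tf.compat.v1.Session
with tf.Graph().as_default(), tf.Session() as sess:
    indices = tf.constant(
        [[0, 0],
         [0, 1],
         [1, 0],
         [1, 1],
         [1, 2],
         [2, 0]])
    values = tf.constant(
        [[0.1, 0.2, 0.3],
         [0.4, 0.5, 0.6],
         [0.7, 0.8, 0.9],
         [1.0, 1.1, 1.2],
         [1.3, 1.4, 1.5],
         [1.6, 1.7, 1.8]])
    # scatter each row of values to the (i, j) position given by indices
    out = tf.scatter_nd(indices, values, [3, 3, 3])
    print(sess.run(out))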
Output:
[[[0.1 0.2 0.3]
[0.4 0.5 0.6]
[0. 0. 0. ]]
[[0.7 0.8 0.9]
[1. 1.1 1.2]
[1.3 1.4 1.5]]
[[1.6 1.7 1.8]
[0. 0. 0. ]
[0. 0. 0. ]]]
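To see what the scatter step is doing, the same placement can be imitated in plain NumPy with integer-array indexing. This is only an analogue for illustration, not TensorFlow code, and the name dense is made up for the example.

```python
import numpy as np

indices = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [1, 2], [2, 0]])
values = np.array(
    [[0.1, 0.2, 0.3],
     [0.4, 0.5, 0.6],
     [0.7, 0.8, 0.9],
     [1.0, 1.1, 1.2],
     [1.3, 1.4, 1.5],
     [1.6, 1.7, 1.8]]
)

# Start from zeros so unlisted (i, j) positions stay [0, 0, 0].
dense = np.zeros((3, 3, 3), dtype=values.dtype)

# Row k of values is written to dense[indices[k, 0], indices[k, 1], :].
dense[indices[:, 0], indices[:, 1]] = values

print(dense)
```

One difference to keep in mind: tf.scatter_nd accumulates updates that target the same index, while this assignment keeps only one of them. np.add.at would be the closer match if duplicate indices can occur.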
I am not aware of a way to do this in TensorFlow with a reshape-style function alone. I could only think of an iterative solution that builds a list and converts it back to a Tensor. This is perhaps not the most efficient solution, but it might work for your code.
import numpy as np
import tensorflow as tf
# list of indices
idx=[[0,0],[0,1], [1,0],[1,1], [1,2], [2,0]]
# Original Tensor to reshape
dense_tensor=tf.Variable([[0.1, 0.2 ,0.3],[0.4, 0.5, 0.6], [0.7, 0.8, 0.9], [1.0,1.1,1.2],[1.3,1.4,1.5], [1.6,1.7,1.8]])
# creating a temporary list to later convert to Tensor
c=np.zeros([3,3,3]).tolist()
# running position in dense_tensor; must not be reset per row
# (this relies on idx being sorted in row-major order)
count=0
for i in range(3):
    for j in range(3):
        if([i,j] in idx):
            c[i][j]=dense_tensor[count]
            count=count+1
        else:
            c[i][j]=tf.Variable([0,0,0], dtype=tf.float32)
# Convert obtained list to Tensor
converted_tensor = tf.convert_to_tensor(c, dtype=tf.float32)
You can define the ranges depending upon the size of Tensor you want. For your case, I have chosen 3 as you wanted a 3x3x3 Tensor. I hope this helps!
I must build a list of 3x2 value combinations with all possible values between 0.0 and 1.0 by the step size given (for now it’s 1/3).
The output should be [ [[v1, v2], [v3, v4], [v5, v6]], ... ] where every v is a value between 0.0 and 1.0, e.g.:
[ [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]],
[[0.0, 0.0], [0.0, 0.0], [0.0, 0.33]],
[[0.0, 0.0], [0.0, 0.0], [0.0, 0.66]],
[[0.0, 0.0], [0.0, 0.0], [0.0, 1.0]],
[[0.0, 0.0], [0.0, 0.0], [0.33, 0.0]],
[[0.0, 0.0], [0.0, 0.0], [0.33, 0.33]],
...,
[[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]] ]
So far I have:
step = 1.0/3.0
lexica = []
for num1 in numpy.arange(0.0, 1.0, step):
    for num2 in numpy.arange(0.0, 1.0, step):
        for num3 in numpy.arange(0.0, 1.0, step):
            for num4 in numpy.arange(0.0, 1.0, step):
                for num5 in numpy.arange(0.0, 1.0, step):
                    for num6 in numpy.arange(0.0, 1.0, step):
                        lexica.append([[num1, num2],[num3, num4],[num5, num6]])
This doesn't reach 1.0 as the highest value, and knowing Python, there's got to be a better way of writing this.
You can use numpy.mgrid and manipulate it to give you the output you want
np.mgrid[0:1:step, 0:1:step, 0:1:step, 0:1:step, 0:1:step, 0:1:step].T.reshape(-1, 3, 2)
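A float step in arange or mgrid follows the usual half-open convention, so the stop value is left out. As a small sketch of two endpoint-inclusive alternatives (the variable names are illustrative): linspace takes a point count, and mgrid treats an imaginary step such as 4j as "this many points, stop included".

```python
import numpy as np

step = 1.0 / 3.0

# Half-open range: the stop value is not meant to be included.
half_open = np.arange(0.0, 1.0, step)

# Closed range: 4 evenly spaced points from 0.0 to 1.0 inclusive.
closed = np.linspace(0.0, 1.0, 4)

# mgrid with an imaginary step: 4j means 4 points, endpoint included.
grid_axis = np.mgrid[0:1:4j]

print(half_open)
print(closed)
print(grid_axis)
```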
EDIT:
A bit more extensible method that fixes the endpoints:
def myMesh(nSteps, shape = (3, 2)):
    c = np.prod(shape)
    x = np.linspace(0, 1, nSteps + 1)
    return np.array(np.meshgrid(*(x,)*c)).T.reshape((-1, ) + shape)
myMesh(3)
array([[[ 0. , 0. ],
[ 0. , 0. ],
[ 0. , 0. ]],
[[ 0. , 0.33333333],
[ 0. , 0. ],
[ 0. , 0. ]],
[[ 0. , 0.66666667],
[ 0. , 0. ],
[ 0. , 0. ]],
...,
[[ 1. , 0.33333333],
[ 1. , 1. ],
[ 1. , 1. ]],
[[ 1. , 0.66666667],
[ 1. , 1. ],
[ 1. , 1. ]],
[[ 1. , 1. ],
[ 1. , 1. ],
[ 1. , 1. ]]])
This is what you could do without numpy:
from itertools import product
ret = []
for a, b, c, d, e, f in product(range(4), repeat=6):
    ret.append([[a/3, b/3], [c/3, d/3], [e/3, f/3]])
or even as a list comprehension:
ret = [[[a/3, b/3], [c/3, d/3], [e/3, f/3]]
for a, b, c, d, e, f in product(range(4), repeat=6)]
You can use itertools.combinations_with_replacement in order to accomplish that task:
>>> from itertools import combinations_with_replacement as cwr
>>> cwr(cwr(numpy.linspace(0, 1, 4), 2), 3)
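cwr(numpy.linspace(0, 1, 4), 2) creates all unordered combinations of length 2 from the elements of numpy.linspace(0, 1, 4) (which are 0, 1/3, 2/3, 1). The outer call cwr(..., 3) then creates all unordered length-3 tuples from the previous 2-tuples, resulting in 3x2 elements. The call returns a lazy iterator, so wrap it in list(...) to see the items. Note that combinations_with_replacement never yields a reordering of something it has already produced, so a pair such as (1/3, 0) does not appear; if every ordered arrangement is required, itertools.product is the tool to use instead.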
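The difference between the two itertools functions is easy to check on a small case. The sketch below uses the integers 0 to 3 as stand-ins for the four grid values; the variable names are illustrative.

```python
from itertools import combinations_with_replacement, product

values = [0, 1, 2, 3]

# Unordered selections: (0, 1) appears, (1, 0) does not.
pairs_cwr = list(combinations_with_replacement(values, 2))

# Ordered selections: every arrangement appears.
pairs_product = list(product(values, repeat=2))

print(len(pairs_cwr), len(pairs_product))
print((1, 0) in pairs_cwr, (1, 0) in pairs_product)
```

With four values and six slots, product(values, repeat=6) yields 4**6 = 4096 tuples, which matches the six nested loops in the question once the endpoint is included.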