Suppose I have
x = [[1, 2],
     [3, 4]]
and I'd like to have
y = [[1, 1, 2, 2],
     [1, 1, 2, 2],
     [3, 3, 4, 4],
     [3, 3, 4, 4]]
I already did it using repeat, but my question is whether there's a way in pure numpy that is faster and vectorized.
My second question is: how could you efficiently downsample from y back to x?
Thanks and have a nice day :)
Using np.kron(), the Kronecker product:
x = np.array([[1, 2], [3, 4]])
y = np.kron(x, np.ones((2, 2)))  # np.ones((n, n)) where n indicates the number of repetitions
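Note that np.ones defaults to float64, so the kron result above comes out as floats. A small dtype-preserving variant, plus a simple slicing-based downsample for the second question (my sketch, not part of the original answer):
import numpy as np

x = np.array([[1, 2], [3, 4]])
n = 2  # repetition factor

# Passing dtype=x.dtype keeps the integer dtype of x
y = np.kron(x, np.ones((n, n), dtype=x.dtype))

# Every n x n block of y is constant, so taking every n-th row/column
# recovers the original array (this is a view, no copy)
x_back = y[::n, ::n]
print(np.array_equal(x_back, x))  # True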
If you don't care about the output shape, you can do this operation just by changing the metadata
x = np.array([[1, 2], [3, 4]])
np.lib.stride_tricks.as_strided(x, (4,2,2), (8,0,0))
Output:
array([[[1, 1],
        [1, 1]],

       [[2, 2],
        [2, 2]],

       [[3, 3],
        [3, 3]],

       [[4, 4],
        [4, 4]]])
Further, np.block will give the desired output shape (by copying):
x = np.lib.stride_tricks.as_strided(x, (4,2,2), (8,0,0))
np.block([[x[0], x[1]],
          [x[2], x[3]]])
Output:
array([[1, 1, 2, 2],
       [1, 1, 2, 2],
       [3, 3, 4, 4],
       [3, 3, 4, 4]])
The np.block step can also be simulated using as_strided and a reshape:
x = np.lib.stride_tricks.as_strided(x, (1,2,2,2,2), (32,16,0,8,0))
y = x.reshape(4,4)  # copy!
y
Output:
array([[1, 1, 2, 2],
       [1, 1, 2, 2],
       [3, 3, 4, 4],
       [3, 3, 4, 4]])
This will copy memory, though, just like np.block. You can verify this by trying to set the shape directly with x.shape = (4,4):
AttributeError: Incompatible shape for in-place modification. Use `.reshape()` to make a copy with the desired shape.
In the same fashion, the output can be down-sampled just by changing the shape and strides (applied to the 4x4 copy y, not the strided view):
np.lib.stride_tricks.as_strided(y, (2,2), (8*8, 8*2))
Output:
array([[1, 2],
       [3, 4]])
Notice that this is done without any copying. The reason I multiply by 8 is that it is the size in bytes of a 64-bit integer.
Generalizable solution
def upsample(x, k):
    return np.lib.stride_tricks.as_strided(x, (np.prod(x.shape), k, k), (x.dtype.itemsize, 0, 0))
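The helper above returns the (m*n, k, k) block view from the first as_strided example. If you also want the full (m*k, n*k) layout plus the matching downsample, a generalization along the same lines might look like this (my sketch; the names upsample_blocks and downsample_blocks are made up here):
import numpy as np

def upsample_blocks(x, k):
    # Repeat each element of the 2-D array x into a k x k block.
    # The as_strided view is free; the final reshape makes a copy.
    m, n = x.shape
    s0, s1 = x.strides
    v = np.lib.stride_tricks.as_strided(x, (m, k, n, k), (s0, 0, s1, 0))
    return v.reshape(m * k, n * k)

def downsample_blocks(y, k):
    # Inverse of upsample_blocks: every block is constant, so one
    # element per block is enough. This is a view, no copy.
    return y[::k, ::k]

x = np.array([[1, 2], [3, 4]])
y = upsample_blocks(x, 2)
print(np.array_equal(downsample_blocks(y, 2), x))  # True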
You can try this one: it uses a simple broadcasted multiplication followed by transpose and reshape operations.
x = np.array([[1, 2], [3, 4]])
m, n = x.shape
k = 2 # upsampling factor
tmp = x.reshape(-1, 1, 1) * np.ones((1, k, k))
y = tmp.reshape(m, n, k, k).transpose(0, 2, 1, 3).reshape(m*k, n*k)
print(y)
array([[1., 1., 2., 2.],
       [1., 1., 2., 2.],
       [3., 3., 4., 4.],
       [3., 3., 4., 4.]])
To get x back from y, reverse the reshape and transpose operations and do a max-pool kind of operation along the resulting block axes to get back the original m x n shaped array (note the reverse reshape groups the axes as (m, k, n, k), which matters when n != k):
x = y.reshape(m, k, n, k).transpose(0, 2, 1, 3).max(axis = (2, 3))
This method works for any array of shape m x n and upsampling factor k.
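As a quick sanity check of the round trip on a non-square array (my addition, not part of the original answer):
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]])  # m=2, n=3
m, n = x.shape
k = 2

tmp = x.reshape(-1, 1, 1) * np.ones((1, k, k))
y = tmp.reshape(m, n, k, k).transpose(0, 2, 1, 3).reshape(m*k, n*k)
x_back = y.reshape(m, k, n, k).transpose(0, 2, 1, 3).max(axis=(2, 3))
print(np.array_equal(x_back, x))  # True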
Related
I have the following dataframe:
import pandas as pd
import numpy as np
df = pd.DataFrame([{'a': [1,3,2]},{'a': [7,6,5]},{'a': [9,8,8]}])
df
df['a'].to_numpy()
=> array([list([1, 3, 2]), list([7, 6, 5]), list([9, 8, 8])], dtype=object)
How can I get a numpy array of shape (3,3) without writing a for loop?
First create nested lists and then convert to an array; the only requirement is that all the lists have the same length:
arr = np.array(df.a.tolist())
print (arr)
[[1 3 2]
 [7 6 5]
 [9 8 8]]
If the lists always have the same length:
pd.DataFrame(df.a.tolist()).values
array([[1, 3, 2],
       [7, 6, 5],
       [9, 8, 8]])
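If you prefer not to go through a plain Python list first, np.stack will also do the conversion directly (my addition; it likewise requires all the lists to have the same length):
import numpy as np
import pandas as pd

df = pd.DataFrame([{'a': [1, 3, 2]}, {'a': [7, 6, 5]}, {'a': [9, 8, 8]}])

# np.stack iterates over the object array and stacks the equal-length lists
arr = np.stack(df['a'].to_numpy())
print(arr.shape)  # (3, 3)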
All of these answers are focused on a single column rather than an entire DataFrame. If you have multiple columns, where every entry at index ij is a list, you can do this:
df = pd.DataFrame({"A": [[1, 2], [3, 4]], "B": [[5, 6], [7, 8]]})
print(df)
        A       B
0  [1, 2]  [5, 6]
1  [3, 4]  [7, 8]
arrays = df.applymap(lambda x: np.array(x, dtype=np.float32)).to_numpy()
result = np.array(np.stack([np.stack(a) for a in arrays]))
print(result, result.shape)
array([[[1., 2.],
        [5., 6.]],

       [[3., 4.],
        [7., 8.]]], dtype=float32)
I cannot speak to the speed of this, as I use it on very small amounts of data.
I have the following arrays:
a = np.arange(12).reshape((2, 2, 3))
and
b = np.zeros((2, 2))
Now I want to use b to access a, s.t. for each index i, j we take the z-th element of a, if b[i, j] = z.
Meaning for the above example the answer should be [[0, 3], [6, 9]].
I feel this is very related to np.choose, but yet somehow cannot quite manage it.
Can you help me?
Two approaches could be suggested.
With explicit range arrays for advanced-indexing -
m,n = b.shape
out = a[np.arange(m)[:,None],np.arange(n),b.astype(int)]
With np.take_along_axis -
np.take_along_axis(a,b.astype(int)[...,None],axis=2)[...,0]
Sample run -
In [44]: a
Out[44]:
array([[[ 0,  1,  2],
        [ 3,  4,  5]],

       [[ 6,  7,  8],
        [ 9, 10, 11]]])

In [45]: b
Out[45]:
array([[0., 0.],
       [0., 0.]])

In [46]: m,n = b.shape

In [47]: a[np.arange(m)[:,None],np.arange(n),b.astype(int)]
Out[47]:
array([[0, 3],
       [6, 9]])

In [48]: np.take_along_axis(a,b.astype(int)[...,None],axis=2)[...,0]
Out[48]:
array([[0, 3],
       [6, 9]])
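Since b is all zeros in the sample run above, both outputs just pick the first element everywhere. A quick check with a non-constant b (my example, values chosen arbitrarily) shows the indexing at work:
import numpy as np

a = np.arange(12).reshape((2, 2, 3))
b = np.array([[2., 0.],
              [1., 2.]])  # arbitrary indices into the last axis

m, n = b.shape
out1 = a[np.arange(m)[:, None], np.arange(n), b.astype(int)]
out2 = np.take_along_axis(a, b.astype(int)[..., None], axis=2)[..., 0]
print(out1)                        # [[ 2  3]
                                   #  [ 7 11]]
print(np.array_equal(out1, out2))  # True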
There are 2 questions in the title. I am confused by both questions because tensorflow is such a static programming language (I really want to go back to either pytorch or chainer).
I give 2 examples; please answer me in tensorflow code or provide links to the relevant functions.
1) tf.where()
data0 = tf.zeros([2, 3, 4], dtype = tf.float32)
data1 = tf.ones([2, 3, 4], dtype = tf.float32)
cond = tf.constant([[0, 1, 1], [1, 0, 0]])
# cond.shape == (2, 3)
# tf.where() works for 1d condition with 2d data,
# but not for 2d indices with 3d tensor
# currently, what I am doing is:
# cond = tf.stack([cond] * 4, 2)
data = tf.where(cond > 0, data1, data0)
# desired: data[i, j, :] is all 1. where cond[i, j] == 1, else all 0.
(I don't know how to broadcast cond to 3d tensor)
2) change element in 2d tensor
# all dtype == tf.int64
t2d = tf.Variable([[0, 1, 2], [3, 4, 5]])
k, v = tf.constant([[0, 2], [1, 0]]), tf.constant([-2, -3])
# TODO: change values at positions k to v
# I cannot do [t2d.copy()[i] = j for i, j in k, v]
t3d == [[[0, 1, -2], [3, 4, 5]],
        [[0, 1, 2], [-3, 4, 5]]]
Thank you so much in advance. XD
These are two quite different questions, and they should probably have been posted as such, but anyway.
1)
Yes, you need to manually broadcast all the inputs to [tf.where](https://www.tensorflow.org/api_docs/python/tf/where) if they are different. For what it's worth, there is an (old) open issue about it, but so far implicit broadcasting has not been implemented. You can use tf.stack like you suggest, although tf.tile would probably be more obvious (and may save memory, although I'm not sure how it is implemented really):
cond = tf.tile(tf.expand_dims(cond, -1), (1, 1, 4))
Or simply with tf.broadcast_to:
cond = tf.broadcast_to(tf.expand_dims(cond, -1), tf.shape(data1))
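Putting part 1) together, a minimal end-to-end sketch in the same TF1/session style as the question (assuming a TensorFlow version that has tf.broadcast_to, roughly 1.13+):
import tensorflow as tf

data0 = tf.zeros([2, 3, 4], dtype=tf.float32)
data1 = tf.ones([2, 3, 4], dtype=tf.float32)
cond = tf.constant([[0, 1, 1], [1, 0, 0]])

# Broadcast the (2, 3) condition to the (2, 3, 4) data shape
cond3d = tf.broadcast_to(tf.expand_dims(cond > 0, -1), tf.shape(data1))
data = tf.where(cond3d, data1, data0)

with tf.Session() as sess:
    print(sess.run(data))  # data[i, j, :] is all ones where cond[i, j] == 1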
2)
This is one way to do that:
import tensorflow as tf
t2d = tf.constant([[0, 1, 2], [3, 4, 5]])
k, v = tf.constant([[0, 2], [1, 0]]), tf.constant([-2, -3])
# Tile t2d
n = tf.shape(k)[0]
t2d_tile = tf.tile(tf.expand_dims(t2d, 0), (n, 1, 1))
# Add additional coordinate to index
idx = tf.concat([tf.expand_dims(tf.range(n), 1), k], axis=1)
# Make updates tensor
s = tf.shape(t2d_tile)
t2d_upd = tf.scatter_nd(idx, v, s)
# Make updates mask
upd_mask = tf.scatter_nd(idx, tf.ones_like(v, dtype=tf.bool), s)
# Make final tensor
t3d = tf.where(upd_mask, t2d_upd, t2d_tile)
# Test
with tf.Session() as sess:
    print(sess.run(t3d))
Output:
[[[ 0  1 -2]
  [ 3  4  5]]

 [[ 0  1  2]
  [-3  4  5]]]
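If you are on TensorFlow 2.x, the tile + scatter_nd + mask + where combination above can be collapsed into a single tf.tensor_scatter_nd_update call (my sketch, using eager execution rather than a Session):
import tensorflow as tf

t2d = tf.constant([[0, 1, 2], [3, 4, 5]])
k, v = tf.constant([[0, 2], [1, 0]]), tf.constant([-2, -3])

n = tf.shape(k)[0]
t2d_tile = tf.tile(tf.expand_dims(t2d, 0), (n, 1, 1))
idx = tf.concat([tf.expand_dims(tf.range(n), 1), k], axis=1)

# Each row of idx is (copy index, row, column); one call applies all updates
t3d = tf.tensor_scatter_nd_update(t2d_tile, idx, v)
print(t3d)  # eager mode prints the values directly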
I'd like to duplicate a numpy array dimension, but in a way that the sum of the original array and of the array with the duplicated dimension stays the same. For instance, consider an n x m shaped array (a) which I'd like to convert to an n x n x m array (b), so that a[i,j] == b[i,i,j]. Unfortunately np.repeat and np.resize are not suitable for this job. Is there another numpy function I could use, or is this possible with some creative indexing?
>>> import numpy as np
>>> a = np.asarray([1, 2, 3])
>>> a
array([1, 2, 3])
>>> a.shape
(3,)
# This is not what I want...
>>> np.resize(a, (3, 3))
array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]])
In the above example, I would like to get this result:
array([[1, 0, 0],
       [0, 2, 0],
       [0, 0, 3]])
From a 1d to a 2d array, you can use the np.diagflat method, which creates a two-dimensional array with the flattened input as a diagonal:
import numpy as np
a = np.asarray([1, 2, 3])
np.diagflat(a)
#array([[1, 0, 0],
#       [0, 2, 0],
#       [0, 0, 3]])
More generally, you can create a zeros array and assign values in place with advanced indexing:
a = np.asarray([[1, 2, 3], [4, 5, 6]])
result = np.zeros((a.shape[0],) + a.shape)
idx = np.arange(a.shape[0])
result[idx, idx, :] = a
result
#array([[[ 1.,  2.,  3.],
#        [ 0.,  0.,  0.]],
#
#       [[ 0.,  0.,  0.],
#        [ 4.,  5.,  6.]]])
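Note that np.zeros defaults to float64, which is why the result above comes out as floats even though a is integer. If you want to keep the input dtype, the same approach works with an explicit dtype (a minor tweak of the code above, not from the original answer):
import numpy as np

a = np.asarray([[1, 2, 3], [4, 5, 6]])
result = np.zeros((a.shape[0],) + a.shape, dtype=a.dtype)  # keep a's dtype
idx = np.arange(a.shape[0])
result[idx, idx, :] = a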
I need to accomplish the following task:
from:
a = array([[1,3,4],[1,2,3]...[1,2,1]])
(add one element to each row) to:
a = array([[1,3,4,x],[1,2,3,x]...[1,2,1,x]])
I have tried doing stuff like a[n] = array([1,3,4,x])
but numpy complained of shape mismatch. I tried iterating through a and appending element x to each item, but the changes are not reflected.
Any ideas on how I can accomplish this?
Appending data to an existing array is a natural thing to want to do for anyone with python experience. However, if you find yourself regularly appending to large arrays, you'll quickly discover that NumPy doesn't easily or efficiently do this the way a python list will. You'll find that every "append" action requires re-allocation of the array memory and short-term doubling of memory requirements. So, the more general solution to the problem is to try to allocate arrays to be as large as the final output of your algorithm. Then perform all your operations on sub-sets (slices) of that array. Array creation and destruction should ideally be minimized.
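As a rough sketch of that pre-allocation pattern (the sizes and fill values here are made up for illustration):
import numpy as np

n_rows, n_cols = 1000, 4            # assume the final size is known up front
out = np.empty((n_rows, n_cols))    # allocate once

for i in range(n_rows):
    # write into a row slice instead of appending to a growing array
    out[i, :] = np.arange(n_cols) + i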
That said, it's often unavoidable, and the functions that do this are:
for 2-D arrays:
np.hstack
np.vstack
np.column_stack
np.row_stack
for 3-D arrays (the above plus):
np.dstack
for N-D arrays:
np.concatenate
import numpy as np
a = np.array([[1,3,4],[1,2,3],[1,2,1]])
b = np.array([10,20,30])
c = np.hstack((a, np.atleast_2d(b).T))
returns c:
array([[ 1,  3,  4, 10],
       [ 1,  2,  3, 20],
       [ 1,  2,  1, 30]])
One way to do it (may not be the best) is to create another array with the new elements and use column_stack, i.e.:
>>> a = array([[1,3,4],[1,2,3],[1,2,1]])
>>> print(a)
[[1 3 4]
 [1 2 3]
 [1 2 1]]
>>> b = array([1,2,3])
>>> column_stack((a,b))
array([[1, 3, 4, 1],
       [1, 2, 3, 2],
       [1, 2, 1, 3]])
Appending a single scalar can be done a bit more easily, as already shown (and also without converting to float), by expanding the scalar to a Python list:
import numpy as np
a = np.array([[1,3,4],[1,2,3],[1,2,1]])
x = 10
b = np.hstack((a, [[x]] * len(a)))
returns b as:
array([[ 1,  3,  4, 10],
       [ 1,  2,  3, 10],
       [ 1,  2,  1, 10]])
Appending a row could be done by:
c = np.vstack((a, [x] * len(a[0])))
returns c as:
array([[ 1,  3,  4],
       [ 1,  2,  3],
       [ 1,  2,  1],
       [10, 10, 10]])
np.insert can also be used for this purpose:
import numpy as np
a = np.array([[1, 3, 4],
              [1, 2, 3],
              [1, 2, 1]])
x = 5
index = 3 # the position for x to be inserted before
np.insert(a, index, x, axis=1)
array([[1, 3, 4, 5],
       [1, 2, 3, 5],
       [1, 2, 1, 5]])
index can also be a list/tuple
>>> index = [1, 1, 3] # equivalently (1, 1, 3)
>>> np.insert(a, index, x, axis=1)
array([[1, 5, 5, 3, 4, 5],
       [1, 5, 5, 2, 3, 5],
       [1, 5, 5, 2, 1, 5]])
or a slice
>>> index = slice(0, 3)
>>> np.insert(a, index, x, axis=1)
array([[5, 1, 5, 3, 5, 4],
       [5, 1, 5, 2, 5, 3],
       [5, 1, 5, 2, 5, 1]])
If x is just a single scalar value, you could try something like this to ensure the correct shape of the array that is being appended/concatenated to the rightmost column of a:
import numpy as np
a = np.array([[1,3,4],[1,2,3],[1,2,1]])
x = 10
b = np.hstack((a,x*np.ones((a.shape[0],1))))
returns b as:
array([[ 1.,  3.,  4., 10.],
       [ 1.,  2.,  3., 10.],
       [ 1.,  2.,  1., 10.]])
target = []
for line in a.tolist():
    line.append(x)      # list.append modifies the list in place and returns None
    target.append(line)
a = np.array(target)