How to shuffle only row index in numpy arrays? - python

I have two numpy arrays and I want to shuffle their rows simultaneously, so that both end up in the same row order. The two arrays have the same number of rows, so why isn't b shuffled automatically when I shuffle a?
import numpy as np
a = np.arange(100).reshape(10,10)
b = np.arange(10).reshape(10,1)
a.shape[0]==b.shape[0]
np.random.shuffle(a)
print(a)
print(b)

That can't work, because a and b are independent arrays. What you want can be done like this:
import numpy as np
a = np.arange(100).reshape(10,10)
b = np.arange(10).reshape(10,1)
p = np.random.permutation(a.shape[0])
a = a[p]
b = b[p]
print(a)
print(b)

You can't tie two numpy arrays together that way to force them to stay in the same order. What you can do is store the order in a separate array and then index both arrays with it.
import numpy as np
a = np.arange(100).reshape(10,10)
b = np.arange(10).reshape(10,1)
i = np.arange(a.shape[0])
np.random.shuffle(i)  # shuffles i in place and returns None
print(a[i])
print(b[i])

Use numpy.random.permutation:
import numpy as np
a = np.arange(100).reshape(10,10)
b = np.arange(10).reshape(10,1)
perm = np.random.permutation(a.shape[0])
print(a[perm, :])
print(b[perm, :])
Note that numpy.random.shuffle shuffles the array in place, whereas the last two lines of my code only create shuffled copies of a and b. If you check a and b afterwards, they are still the same. So if you want to keep the shuffled version, assign it, for example a = a[perm, :] or c = a[perm, :] (indexing with an index array already returns a copy, so an extra .copy() is not needed).
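A minimal sketch of the shared-permutation approach, written with NumPy's newer Generator API. The seed and the names rng, a_shuffled and b_shuffled are illustrative choices, not part of the answers above.

```python
import numpy as np

# Generator-based API; the seed is arbitrary and only makes the run repeatable.
rng = np.random.default_rng(0)

a = np.arange(100).reshape(10, 10)
b = np.arange(10).reshape(10, 1)

# One permutation of the row indices, applied to both arrays.
perm = rng.permutation(a.shape[0])
a_shuffled = a[perm]
b_shuffled = b[perm]

# Row k of a starts with 10*k and row k of b holds k, so the pairing is easy to check.
print(np.array_equal(a_shuffled[:, 0] // 10, b_shuffled[:, 0]))  # True
# The originals are untouched: indexing with an index array returns a new array.
print(np.array_equal(a, np.arange(100).reshape(10, 10)))  # True
```

Because both arrays are indexed with the same perm, row k of a_shuffled still belongs with row k of b_shuffled.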

Related

How to find the index corresponding to an array in another array? (python)

There are 2 arrays A and B:
import numpy as np
A = np.array([1,2,3,4])
B = np.array([2,5,2,1,1,6])
For each element of A that exists in B, output its indices in B. The ideal output C is:
C = np.array([3,4,0,2])
While a bit ugly, this should work. It uses np.where to find the matching positions and np.concatenate to join them. A placeholder list collects the per-element results before they are recombined; there may be a smoother method, but this should do the trick.
import numpy as np
A = np.array([1,2,3,4])
B = np.array([2,5,2,1,1,6])
preC = []
for i in A:
    idx = np.where(B == i)[0]  # positions in B equal to this element of A
    if len(idx) > 0:
        preC.append(idx)
C = np.concatenate(preC)
print(C)
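One possible loop-free variant, sketched with broadcasting instead of the np.where loop. The name matches is illustrative, and the approach builds a len(A) x len(B) boolean array, so it assumes both arrays are small enough for that.

```python
import numpy as np

A = np.array([1, 2, 3, 4])
B = np.array([2, 5, 2, 1, 1, 6])

# Compare every element of A (as a column) with every element of B:
# matches[i, j] is True where A[i] == B[j]. Shape is (len(A), len(B)).
matches = A[:, None] == B

# np.nonzero scans row by row, so the column indices come out grouped in A's order.
C = np.nonzero(matches)[1]
print(C)  # [3 4 0 2]
```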

How to insert elements from a vector into a matrix based on an array of random indexes

Basically, I'm trying to insert elements from a vector into a matrix at random column indexes:
import numpy as np
import torch
size = 100000
answer_count = 4
num_range = int(1e4)
a = torch.randint(-num_range, num_range, size=(size, ))
b = torch.randint(-num_range, num_range, size=(size, ))
answers = torch.randint(-num_range, num_range, size=(size, answer_count))
for i in range(size):
    answers[i, np.random.randint(answer_count)] = a[i] + b[i]
I tried something like
c = a + b
pos = torch.randint(answer_count, size=(size, ))
answers[:, pos] = c
But I'm clearly doing something wrong.
I think you need to change the last line like this:
answers[np.arange(size), pos] = c
The problem is incorrect use of advanced indexing. To see the difference, print answers[:, pos] and answers[np.arange(size), pos]. answers[np.arange(size), pos] pairs each row with its own entry of pos and so selects one element per row, while answers[:, pos] selects the columns listed in pos for ALL rows. See the NumPy documentation on advanced indexing for more information.
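The difference is easier to see on a small array. The sketch below uses NumPy rather than PyTorch, on the assumption that the same integer-array indexing rules apply to this case; the 4x3 array and the values in pos and c are made up for illustration.

```python
import numpy as np

answers = np.zeros((4, 3), dtype=int)
pos = np.array([0, 2, 1, 0])      # one target column per row
c = np.array([10, 20, 30, 40])    # one value per row

# Slice + index array: every row is combined with every entry of pos.
print(answers[:, pos].shape)               # (4, 4)
# Two index arrays: entries are paired up, giving one element per row.
print(answers[np.arange(4), pos].shape)    # (4,)

answers[np.arange(4), pos] = c
print(answers)
# [[10  0  0]
#  [ 0  0 20]
#  [ 0 30  0]
#  [40  0  0]]
```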

Slicing numpy array taking every nth element

I have a numpy array of shape 24576x25 and I want to extract 3 arrays from it. The first array should contain every 1st, 4th, 7th, 10th, ... element,
the second every 2nd, 5th, 8th, 11th, ... element, and the third every 3rd, 6th, 9th, 12th, ... element.
The output arrays would each have shape 8192x25.
I was doing the following in MATLAB
c = reshape(a,1,[]);
x = c(:,1:3:end);
y = c(:,2:3:end);
z = c(:,3:3:end);
I have tried a[:,0::3] in Python, but this only works if the array's shape is divisible by 3. What can I do?
X,Y = np.mgrid[0:24576:1, 0:25:1]
a = X[:,::3]
b = X[:,1::3]
c = X[:,2::3]
does not work either. I need a,b,c.shape = 8192x25
A simple tweak to your original attempt should yield the results you want:
X,Y = np.mgrid[0:24576:1, 0:25:1]
a = X[0::3,:]
b = X[1::3,:]
c = X[2::3,:]
Alternatively, step through the rows directly:
import numpy as np
a = np.arange(24576*25).reshape((24576,25))
a[::3]
a[::3].shape gives you (8192, 25).
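Putting the row slicing together, a short sketch that produces all three arrays at once. The names x, y, z and the 10-row array small are illustrative; the second part only shows what happens when the row count is not a multiple of 3.

```python
import numpy as np

a = np.arange(24576 * 25).reshape(24576, 25)

# Start at row 0, 1 or 2 and step by 3 along the first axis.
x, y, z = a[0::3], a[1::3], a[2::3]
print(x.shape, y.shape, z.shape)  # (8192, 25) (8192, 25) (8192, 25)

# The row count need not be a multiple of 3; the pieces then differ in length.
small = np.arange(10 * 2).reshape(10, 2)
print([small[k::3].shape for k in range(3)])  # [(4, 2), (3, 2), (3, 2)]
```

These slices are views into a, so call .copy() on them if the pieces must be independent of the original array.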

filling numpy array by index

I have a function which gives me the index for a given value, e.g.:
def F(value):
    index = do_something(value)
    return index
I want to use this index to fill a huge numpy array with 1s. Let's call the array features.
l = [1,4,2,3,7,5,3,6,.....]
NOTE: features.shape[0] = len(l)
for i in range(features.shape[0]):
    idx = F(l[i])
    features[i, idx] = 1
Is there a Pythonic way to do this (the loop takes a lot of time if the array is huge)?
If you can vectorize F(value), you could write something like
indices = np.arange(features.shape[0])
feature_indices = F(l)
features[indices, feature_indices] = 1
Try this:
i = np.arange(features.shape[0]) # rows
j = np.vectorize(F)(np.array(l)) # columns
features[i,j] = 1
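A self-contained sketch of the paired row/column assignment. The real F is not shown in the question, so the F below (value modulo n_cols) is a hypothetical stand-in, as are n_cols and the short list l.

```python
import numpy as np

n_cols = 8  # hypothetical number of feature columns

def F(values):
    # Hypothetical stand-in for the real index function: accepts an array
    # and returns one column index per value.
    return np.asarray(values) % n_cols

l = [1, 4, 2, 3, 7, 5, 3, 6]
features = np.zeros((len(l), n_cols), dtype=int)

rows = np.arange(features.shape[0])
cols = F(l)
features[rows, cols] = 1  # sets exactly one entry per row

print(features.sum(axis=1))  # [1 1 1 1 1 1 1 1]
```

Note that np.vectorize is documented as a convenience wrapper around a Python loop, so the speedup comes from the single indexed assignment and from F itself operating on whole arrays.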

savetxt save only last loop data

Can someone please explain?
import numpy
a = ([1,2,3,4,5])
b = ([6,7,8,9,10])
numpy.savetxt('test.txt',(a,b))
This script saves the data correctly. But when I run it in a loop, it prints every pair but does not save them all. Why?
import numpy
a = ([1,2,3,4,5])
b = ([6,7,8,9,10])
for i,j in zip(a,b):
    print(i, j)
    numpy.savetxt('test.txt',(i,j))
You overwrite the previous data each time you call numpy.savetxt().
A solution, using a temporary buffer list:
import numpy
a = ([1,2,3,4,5])
b = ([6,7,8,9,10])
out = []
for i,j in zip(a,b):
    print(i, j)
    out.append( (i,j) )
numpy.savetxt('test.txt',out)
numpy.savetxt overwrites the previously written file, so you only get the result of the last iteration.
A faster way is to open the file once with open and write to the handle inside the loop:
import numpy
a = ([1,2,3,4,5])
b = ([6,7,8,9,10])
with open('test.txt','wb') as ftext:  # 'wb' creates a new file; use 'ab' to append to an existing one
    for i,j in zip(a,b):
        print(i, j)
        numpy.savetxt(ftext,(i,j))
It will be much faster with a large array.
You should write all the (i,j) pairs together rather than overwriting the previous ones, for example by stacking the two arrays as columns:
import numpy as np
a = np.array([1,2,3,4,5])
b = np.array([6,7,8,9,10])
np.savetxt('test.txt', np.column_stack((a, b)))
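To check that writing through one open handle keeps every pair, here is a small round-trip sketch. It writes to a temporary directory instead of test.txt, and the fmt="%d" format and the 2-D [[i, j]] input are illustrative choices so that each pair lands on one line.

```python
import os
import tempfile
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([6, 7, 8, 9, 10])

path = os.path.join(tempfile.mkdtemp(), "test.txt")  # throwaway location

# Open the file once; every savetxt call then appends to the same handle.
with open(path, "w") as f:
    for i, j in zip(a, b):
        np.savetxt(f, [[i, j]], fmt="%d")  # 2-D input -> one row "i j"

loaded = np.loadtxt(path, dtype=int)
print(loaded.shape)  # (5, 2)
print(np.array_equal(loaded, np.column_stack((a, b))))  # True
```

Passing the 1-D pair (i, j) instead, as in the loops above, makes savetxt write each value on its own line.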
