Access multiple items of list - python

I'm currently trying to implement a replay buffer in which I store 20 numbers in a list and then sample 5 of them at random.
I tried the following with NumPy arrays:
import numpy as np
ac = np.zeros(20, dtype=np.int32)
for i in range(20):
    ac[i] = i + 1
batch = np.random.choice(20, 5, replace=False)
sample = ac[batch]
print(sample)
This works the way it should, but I want the same done with a list instead of a NumPy array.
When I try sample = ac[batch] with a list, I get this error message:
TypeError: only integer scalar arrays can be converted to a scalar index
How can I access multiple elements of a list the way I did with NumPy?

For a list it is quite easy. Just use the sample function from the random module:
import random
ac = [i+1 for i in range(20)]
sample = random.sample(ac, 5)
Also, as a side note: when you want a NumPy array holding a range of numbers, you don't have to create an array of zeros and then fill it in a for loop. That is less convenient and also significantly slower than using the NumPy function arange.
ac = np.arange(1, 21, 1)
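If you really want to create a batch list that contains the indexes you want to access, you will have to use a list comprehension to access them, since you can't index a list with multiple indexes the way you can a NumPy array. Note that random.randint includes both endpoints, so the upper bound must be the last valid index, and that this version can draw the same index more than once.
batch = [random.randint(0, len(ac) - 1) for _ in range(5)]
sample = [ac[i] for i in batch]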
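If the indexes should be distinct, as they are with np.random.choice(..., replace=False), one option is to sample the positions themselves. This is only a minimal sketch; the buffer contents are the same placeholder values as above.

```python
import random

ac = [i + 1 for i in range(20)]  # stand-in for the replay buffer contents

# Draw 5 distinct positions, mirroring np.random.choice(20, 5, replace=False).
batch = random.sample(range(len(ac)), 5)

# A plain list cannot be indexed with several positions at once,
# so each position is looked up individually.
sample = [ac[i] for i in batch]
print(batch, sample)
```

Keeping batch around is useful when the same positions have to be read from several parallel lists (for example states, actions and rewards).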

Related

Frequencies of elements in 2D numpy array

I have a NumPy array output of shape (1000, 4). It contains 1000 quadruples with no repetitions, and each quadruple is ordered (e.g. an element is [0,1,2,3]). I want to count how many times each possible quadruple occurs. In practice, I use the following code:
import itertools
import numpy as np
comb=np.array(list(itertools.combinations(range(32),4)))
def counting(comb, output):
    k=0
    n_output=np.zeros(comb.shape[0])
    for i in range(comb.shape[0]):
        k=0
        for j in range(output.shape[0]):
            if (output[j]==comb[i]).all():
                k+=1
        n_output[i]=k
    return n_output
How can I optimize the code? At the moment it takes 30 s to run
Your current implementation is inefficient for 2 reasons:
the complexity of the algorithm is O(n^2);
it makes use of (slow CPython) loops.
You can write a simple O(n) algorithm using Python sets (still with a loop), since output does not contain any repetitions. Here is the result:
def countingFast(comb, output):
    n_output=np.zeros(comb.shape[0])
    # hash every row of output once so each membership test is O(1) on average
    tmp = set(map(tuple, output))
    for i in range(comb.shape[0]):
        n_output[i] = int(tuple(comb[i]) in tmp)
    return n_output
On my machine, using the described input sizes, the original version takes 55.2 seconds while this implementation takes 0.038 seconds. This is roughly 1400 times faster.
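The set-based version relies on output having no repeated rows. If repeats could occur, a similar single-pass idea should still work with collections.Counter. The following is a sketch under that assumption; the function name counting_with_repeats and the small test arrays are made up for illustration.

```python
import itertools
from collections import Counter

import numpy as np


def counting_with_repeats(comb, output):
    """Count how often each row of comb appears in output (repeats allowed)."""
    # One pass over output builds a row -> count table.
    counts = Counter(map(tuple, output.tolist()))
    # A Counter returns 0 for rows that never occurred.
    return np.array([counts[tuple(row)] for row in comb.tolist()])


# Small illustrative input: all ordered pairs drawn from range(5).
comb = np.array(list(itertools.combinations(range(5), 2)))
output = np.array([[0, 1], [3, 4], [0, 1]])
print(counting_with_repeats(comb, output))
```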
You can generate a boolean array representing if the sequence you want to check is equal to a given row in your array. As numpy's boolean arrays can be summed, you could then use this result to get the total number of matching rows.
A basic approach could look like this (including sample data generation):
import numpy as np
# set seed value of random generator to fixed value for repeatable output
np.random.seed(1234)
# create a random array with 950x4 elements
arr = np.random.rand(950, 4)
# create a 50x4 array with sample sequence
# this is the sequence we want to count in our final array
sequence = [0, 1, 2, 3]
sample = np.array([sequence, ]*50)
# stack arrays to create sample data with 1000x4 elements
arr = np.vstack((arr, sample))
# shuffle array to get a random distribution of random sample data and known sequence
np.random.shuffle(arr)
# compare every element with the sequence, returns a boolean array of shape (1000, 4)
results = np.equal(sequence, arr)
# a row matches only if all four of its elements match, so reduce along axis 1,
# then sum the resulting booleans to count the matching rows
occurences = np.sum(np.all(results, axis=1))
print(occurences)
# --> 50
You need to repeat these lines for each sequence you are interested in, so it is useful to wrap them in a function like this:
def number_of_occurences(data, sequence):
    # a row counts only if every element of it equals the sequence
    results = np.equal(sequence, data)
    return np.sum(np.all(results, axis=1))
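If the counts for every distinct row are needed at once, np.unique with axis=0 and return_counts=True may save calling a function per sequence. A small sketch with made-up data:

```python
import numpy as np

# Illustrative data: the row [0, 1, 2, 3] appears twice.
data = np.array([[0, 1, 2, 3],
                 [4, 5, 6, 7],
                 [0, 1, 2, 3],
                 [1, 2, 3, 4]])

# unique_rows holds each distinct row once (sorted lexicographically),
# counts holds how many times that row occurs in data.
unique_rows, counts = np.unique(data, axis=0, return_counts=True)
for row, count in zip(unique_rows, counts):
    print(row, count)
```

Rows that never occur do not show up in unique_rows, so combinations with a count of zero would still have to be filled in separately.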

How to change an array of a single column to a row of values instead of arrays in Python

I am trying to convert an array of arrays that each contain only one integer to a single array with just the integers.
This is my code below. k=1 after the first for loop, and the next line deletes all the rows except the first one and then transposes the result.
handles.Background = np.zeros(((len(imgY) * len(imgX)),len(imgZ)))
WhereIsBackground = np.zeros((len(imgY), len(imgX)))
k = 0
for i in range(len(imgY)):
    for j in range(len(imgX)):
        if img[i,j,handles.PS_Index] < (handles.PS_Mean_Intensity / 8):
            handles.Background[k,:] = img[i,j,:]
            WhereIsBackground[i,j] = 1
            k = k+1
handles.Background = np.delete(handles.Background,np.s_[k:(len(imgY)*len(imgX))+1],0).T
At this point, I can access data by using handles.Background[n] but this returns an array that contains a single integer. I was trying to convert the handles.Background so that when I do handles.Background[n], it just returns a single integer instead of an array containing that value.
So, I'm getting array([0.]) when I run handles.Background[0], but I want to get just 0 when I run handles.Background[0]
I've observed that int(handles.Background[i]) returns an integer and tried to reassign them using a for loop but the result didn't really change. What would be the best option for me?
for i in range(len(handles.Background)):
    handles.Background[i] = int(handles.Background[i])
If handles.Background[n] returns an array, you can index into that, too, using the same [n] notation.
So you are looking for
handles.Background[n][0]
If you want to unpack the whole array at once, you can use this:
handles.Background = [bg[0] for bg in handles.Background]
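If the result should stay a NumPy array rather than become a Python list, flattening with ravel() is another option. A minimal sketch, using a small made-up array named background in place of handles.Background:

```python
import numpy as np

# Hypothetical stand-in for handles.Background: one value per row, shape (3, 1).
background = np.array([[0.0], [3.0], [7.0]])

# ravel() flattens to one dimension, shape (3,), so indexing yields scalars.
flat = background.ravel()
print(flat.shape)
print(flat[0])
```

The reassignment loop in the question has no visible effect because writing a scalar back into a row of a 2-D array leaves the shape of the array unchanged.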

Trying to create a list of lists of a random noise sample in Python 3.X

I'm trying to create a list of a certain number of arrays, say 10, where each array is comprised of a number of random numbers between two values in a normal distribution. What I have so far is:
noise = abs(np.random.normal(0,0.1,20))
noise1 = []
for i in range(10):
    noise1 = noise.append(range(i,10))
Where 'noise' is an array of 20 random positive values between 0 and 0.1. Now I want to create a list of 10 of these arrays with different random numbers each time called 'noise1'. Using this method I get the error 'TypeError: 'numpy.ndarray' object is not callable', meaning the program tries to use the array as a function, but I don't know how to solve the problem. Would really appreciate some help!
A NumPy array has no append method; you need numpy.concatenate or numpy.append to extend one, which is why your current code fails. However, there is a much simpler way to create the list you want.
noise1 = [abs(np.random.normal(0,0.1,20)) for _ in range(10)]
print(np.shape(noise1))
Output:
(10, 20)
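If a single 2-D array is acceptable instead of a list of arrays, all the values can also be drawn in one call by passing a shape as the size argument. A sketch, assuming NumPy's Generator API; the seed is arbitrary and only there for repeatability:

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only so runs are repeatable

# 10 rows of 20 independent draws from N(0, 0.1), made non-negative with abs.
noise1 = np.abs(rng.normal(0, 0.1, size=(10, 20)))
print(noise1.shape)
```

Each row noise1[i] then plays the role of one of the ten arrays.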

Storing arrays in Python for loop

Let's say I have a function (called numpyarrayfunction) that outputs an array every time I run it. I would like to run the function multiple times and store the resulting arrays. Obviously, the current method that I am using to do this -
numpyarray = np.zeros((5))
for i in range(5):
    numpyarray[i] = numpyarrayfunction
generates an error message since I am trying to store an array within an array.
Eventually, what I would like to do is to take the average of the numbers that are in the arrays, and then take the average of these averages. But for the moment, it would be useful to just know how to store the arrays!
Thank you for your help!
As comments and other answers have already laid out, a good way to do this is to store the arrays being returned by numpyarrayfunction in a normal Python list.
If you want everything to be in a single numpy array (for, say, memory efficiency or computation speed), and the arrays returned by numpyarrayfunction are of a fixed length n, you could make numpyarray multidimensional:
numpyarray = np.empty((5, n))
for i in range(5):
    # call the function; each returned array of length n fills one row
    numpyarray[i, :] = numpyarrayfunction()
Then you could do np.average(numpyarray, axis = 1) to average over the second axis, which would give you back a one-dimensional array with the average of each array you got from numpyarrayfunction. np.average(numpyarray) would be the average over all the elements, or np.average(np.average(numpyarray, axis = 1)) if you really want the average value of the averages.
More on numpy array indexing.
I initially misread what was going on inside the for loop there. The reason you're getting an error is that each slot of a one-dimensional numeric array holds a single number, and numpyarrayfunction is returning something else (from the name, probably another NumPy array). If that function already returns a full NumPy array, then you can do something more like this:
arrays = []
for i in range(5):
    arrays.append(numpyarrayfunction(args))
Then, you can take the average like so:
avgarray = np.zeros((len(arrays[0])))
for array in arrays:
    avgarray += array
avgarray = avgarray/len(arrays)
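Both ideas can be combined by stacking the collected arrays and letting NumPy compute the averages. This is a sketch only: the numpyarrayfunction defined here is a hypothetical stand-in that returns four random values, since the real function is not shown.

```python
import numpy as np


def numpyarrayfunction():
    """Hypothetical stand-in for the asker's function; returns 4 values."""
    return np.random.rand(4)


# Collect the results in a list, then stack them into one (5, 4) array.
arrays = np.stack([numpyarrayfunction() for _ in range(5)])

per_run_mean = arrays.mean(axis=1)      # one average per returned array
mean_of_means = per_run_mean.mean()     # average of those averages
elementwise_mean = arrays.mean(axis=0)  # same result as the avgarray loop
print(per_run_mean, mean_of_means)
```

np.stack requires every returned array to have the same shape; if the lengths vary, keeping a plain list and averaging each array separately is the safer route.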

Numpy array of multiple indices replace with a different matrix

I have an array of 2d indices.
indices = [[2,4], [6,77], [102,554]]
Now, I have a separate 4-dimensional array, arr, and for each pair in indices I want to extract the sub-array at that position (each result is itself an array, since arr is 4-dimensional). It is equivalent to the following code.
for i in range(len(indices)):
    output[i] = arr[indices[i][0], indices[i][1]]
However, I realized that the explicit for loop is slow. Is there a built-in NumPy function that I can use? So far I have tried np.choose, np.put, and np.take, but none of them produced what I wanted. Thank you!
We need to index into the first two axes with the two columns from indices (thinking of it as an array).
Thus, simply convert to array and index, like so -
indices_arr = np.array(indices)
out = arr[indices_arr[:,0], indices_arr[:,1]]
Or we could extract those directly without converting to array and then index -
d0,d1 = [i[0] for i in indices], [i[1] for i in indices]
out = arr[d0,d1]
Another way to extract the elements would be with conversion to tuple, like so -
out = arr[tuple(indices_arr.T)]
If indices is already an array, skip the conversion process and use indices in places where we had indices_arr.
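As a quick sanity check, the indexing approach can be compared against the explicit loop from the question. The array shape and index values below are made up so that the example is small enough to run:

```python
import numpy as np

# Small made-up 4-dimensional array and in-range index pairs.
arr = np.arange(3 * 4 * 2 * 2).reshape(3, 4, 2, 2)
indices = [[0, 1], [2, 3], [1, 0]]

# Index the first two axes with the two columns of indices.
idx = np.array(indices)
out = arr[idx[:, 0], idx[:, 1]]

# Reference result built with the explicit loop from the question.
expected = np.stack([arr[i, j] for i, j in indices])
print(out.shape, np.array_equal(out, expected))
```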
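You can also use NumPy's take function, but it indexes along a single axis, so the index pairs must first be converted to flat positions over the first two axes, and those two axes merged. Your code should be something like:
flat = np.ravel_multi_index(tuple(np.array(indices).T), arr.shape[:2])
outputarray = np.take(arr.reshape((-1,) + arr.shape[2:]), flat, axis=0)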
