Remove elements by index from a string within numpy array - python

I have the following example array with strings:
['100000000' '101010100' '110101010' '111111110']
I would like to remove some characters by index from every string in the array at once. For example, if I remove the 6th and 8th characters (indices 5 and 7 when counting from zero), I should get the following result:
['1000000' '1010110' '1101000' '1111110']
All my attempts have failed so far, maybe because strings are immutable, but I am not sure whether I have to convert them and, if so, to what and how.

import numpy as np
a = np.array(['100000000', '101010100', '110101010', '111111110'])
list(map(lambda s: "".join([c for i, c in enumerate(str(s)) if i not in {5, 7}]), a))
Returns:
['1000000', '1010110', '1101000', '1111110']

Another way to do this is to convert a into a 2D array of single characters, mask out the values you don't want, and then convert back into a 1D array of strings.
import numpy as np
a = np.array(['100000000', '101010100', '110101010', '111111110'])
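# view each 9-character string as a row of single characters: shape (4, 9)
b = a.view('U1').reshape(*a.shape, -1)
# boolean mask over character positions; False marks the positions to drop
mask = np.ones(b.shape[-1], dtype=bool)
mask[[5, 7]] = False
# keep the unmasked columns, flatten, and reinterpret as fixed-width strings
b = b[:, mask].reshape(-1).view(f'U{b.shape[-1] - (~mask).sum()}')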
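The masking approach above can be wrapped in a small helper. This is only a sketch: the name `drop_positions` is made up for illustration, and it assumes a 1-D array with a fixed-width unicode dtype (such as `<U9`), zero-based positions, and at least one character left over after removal.

```python
import numpy as np

def drop_positions(strings, positions):
    """Remove the characters at the given zero-based positions from every string."""
    strings = np.ascontiguousarray(strings)   # .view() below needs contiguous memory
    width = strings.dtype.itemsize // np.dtype('U1').itemsize  # characters per string
    chars = strings.view('U1').reshape(strings.size, width)    # one row per string
    keep = np.ones(width, dtype=bool)
    keep[list(positions)] = False              # False marks characters to drop
    kept = int(keep.sum())
    # reshape(-1) yields a flat, contiguous array that can be re-viewed as strings
    return chars[:, keep].reshape(-1).view(f'U{kept}')

a = np.array(['100000000', '101010100', '110101010', '111111110'])
result = drop_positions(a, [5, 7])
print(result)  # ['1000000' '1010110' '1101000' '1111110']
```

The list-comprehension answer is easier to read; the view-based version mainly pays off when the array is large.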

Related

Access multiple items of list

I'm currently trying to implement a replay buffer in which I store 20 numbers in a list and then sample 5 of them at random.
I tried the following with NumPy arrays:
import numpy as np
ac = np.zeros(20, dtype=np.int32)
for i in range(20):
    ac[i] = i+1
batch = np.random.choice(20, 5, replace=False)
sample = ac[batch]
print(sample)
This works the way it should, but I want to do the same with a list instead of a NumPy array.
When I try sample = ac[batch] with a list, I get this error message:
TypeError: only integer scalar arrays can be converted to a scalar index
How can I access multiple elements of a list like I did with NumPy?
For a list it is quite easy. Just use the sample function from the random module:
import random
ac = [i+1 for i in range(20)]
sample = random.sample(ac, 5)
As a side note: to create a NumPy array holding a range of numbers, you don't need to allocate an array of zeros and fill it in a for loop. That is less convenient and also significantly slower than using numpy.arange:
ac = np.arange(1, 21, 1)
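If you really want to build a batch list containing the indexes you want to access, you will have to use a list comprehension, since you can't index a list with multiple indexes the way you can a NumPy array. Note that random.randint includes both endpoints, so randint(0, 20) could return the out-of-range index 20; random.randrange(len(ac)) avoids that. Drawing indexes one at a time can also produce repeats.
batch = [random.randrange(len(ac)) for _ in range(5)]
sample = [ac[i] for i in batch]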
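To mirror `np.random.choice(20, 5, replace=False)` more closely, the index batch can be drawn without repeats. A minimal sketch using only the standard library; `operator.itemgetter` is used here as an alternative to the list comprehension, not as something the answer above prescribes.

```python
import random
from operator import itemgetter

ac = [i + 1 for i in range(20)]

# Draw 5 distinct positions, like np.random.choice(20, 5, replace=False)
batch = random.sample(range(len(ac)), 5)

# itemgetter with several indices returns a tuple of the selected items
sample = list(itemgetter(*batch)(ac))
print(batch, sample)
```

Keeping `batch` around is useful when the same positions must be read from several parallel lists, as is common in a replay buffer.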

How to convert string formed by numpy.array2string back to array?

I have a string representation of a NumPy array, created with numpy.array2string.
Now I want to get my NumPy array back.
Any suggestions for how I can achieve this?
My Code:
import numpy as np
from PIL import Image
img = Image.open('test.png')
array = np.array(img)
print(array.shape)
array_string = np.array2string(array, precision=2, separator=',', suppress_small=True)
P.S My array is a 3D array not 1D and I am using , separators, not the default blank
This is kind of a hack, but may be the simplest solution.
import numpy as np
array = np.array([[[1,2,3,4]]]) # create a 3D array
array_string = np.array2string(array, precision=2, separator=',', suppress_small=True)
print(array_string) #=> [[[1,2,3,4]]]
# Getting the array back to numpy
new_array = eval('np.array(' + array_string + ')')
Since the string representation of the array matches the argument we pass to build such array, using eval successfully creates the same array.
It is probably best to wrap this in a try/except in case the string format isn't valid. Keep in mind that eval executes arbitrary code, so only use it on strings you produced yourself.
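Because `eval` runs whatever code the string contains, a more cautious variant of the same idea is `ast.literal_eval` from the standard library, which only accepts Python literals. The sketch below assumes the string was produced with `separator=','` and that the array was small enough not to be abbreviated: by default `np.array2string` summarizes large arrays with `...` (see its `threshold` parameter), and such a string cannot be parsed back. The original dtype is not stored in the string either, so it has to be restored separately if it matters.

```python
import ast
import numpy as np

array = np.array([[[1, 2, 3, 4]]])                      # small 3D example
array_string = np.array2string(array, separator=',')

# literal_eval parses the nested-list text without executing arbitrary code
restored = np.array(ast.literal_eval(array_string))

print(restored.shape)                   # (1, 1, 4)
print(np.array_equal(array, restored))  # True
```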
Update: I just tried this and it worked for me:
import numpy as np
from PIL import Image
img = Image.open('2.jpg')
arr = np.array(img)
# get shape and type
array_shape = arr.shape
array_data_type = arr.dtype.name
# converting to string
array_string = arr.tobytes()  # tostring() is a deprecated alias of tobytes()
# converting back to numpy array
new_arr = np.frombuffer(array_string, dtype=array_data_type).reshape(array_shape)
print(new_arr)
To convert the numpy array to a string, I used arr.tobytes() (which yields raw bytes rather than readable text) instead of np.array2string(). After that, converting back to a numpy array works with np.frombuffer().
numpy.array2string() gives an output string such as '[1, 2]', so you need to remove the square brackets to get at the elements, which are then separated only by the separator.
Here is a small example that extracts the elements from the string by removing the brackets and then using np.fromstring(). Since you used ',' as the separator when creating the string, I use the same separator to parse it.
import numpy as np
x = '[1, 2]'
x = x.replace('[','')
x = x.replace(']','')
a = np.fromstring(x, dtype=int, sep=",")
print(a)
#Output: [1 2]
import numpy as np
def fromStringArrayToFloatArray(stringArray):
    # strip the surrounding brackets, then split on any run of whitespace
    array = [float(s) for s in stringArray[1:-1].split()]
    return np.array(array)
x = np.array([1.1, 2.2, 3.3, 4.4])
y = np.array2string(x)
z = fromStringArrayToFloatArray(y)
x == z
You can use a list comprehension to split the string into separate substrings and then convert them to float (or whatever type you need).

How to concatenate an empty array with Numpy.concatenate?

I need to create an array of a specific size mxn filled with empty values so that when I concatenate to that array the initial values will be overwritten with the added values.
My current code:
a = numpy.empty([2,2]) # Make empty 2x2 matrix
b = numpy.array([[1,2],[3,4]]) # Make example 2x2 matrix
myArray = numpy.concatenate((a,b)) # Combine empty and example arrays
Unfortunately, I end up making a 4x2 matrix instead of a 2x2 matrix with the values of b.
Is there any way to make an actually empty array of a certain size, so that when I concatenate to it, its values become my added values instead of the default values plus the added values?
Like Oniow said, concatenate does exactly what you saw.
If you want 'default values' that differ from regular scalar elements, I suggest initializing your array with NaNs (as your 'default value'). If I understand your question, you want to merge matrices so that regular scalars override the 'default value' elements.
In any case, I suggest adding the following:
def get_default(size_x, size_y):
    # returns a new matrix filled with 'default values'
    tmp = np.empty([size_x, size_y])
    tmp.fill(np.nan)
    return tmp
And also:
def merge(a, b):
    # element-wise: take b where a holds the NaN 'default', otherwise keep a
    pick = np.vectorize(lambda x, y: y if np.isnan(x) else x)
    return pick(a, b)
Note that if you merge two matrices and both values are non-'default', the value from the left matrix is taken.
Using NaN as the default value gives the behavior you would expect from a default: for example, every math operation involving it yields 'default' (NaN) again, which signals that you don't really care about that index in the matrix.
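The same element-wise rule can also be written with `np.where`, which avoids the Python-level loop that `np.vectorize` performs. A minimal sketch, assuming float arrays of equal shape; the name `merge_defaults` is illustrative.

```python
import numpy as np

def merge_defaults(a, b):
    """Element-wise: use b where a is NaN, otherwise keep a."""
    return np.where(np.isnan(a), b, a)

a = np.full((2, 2), np.nan)   # 2x2 array of NaN 'default values'
a[0, 0] = 9.0                 # one real value that should survive the merge
b = np.array([[1.0, 2.0], [3.0, 4.0]])

merged = merge_defaults(a, b)
print(merged)
```

Here `merged` keeps 9.0 from `a` in the top-left corner and takes the remaining three values from `b`.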
If I understand your question correctly - concatenate is not what you are looking for. Concatenate does as you saw: joins along an axis.
If you are trying to have an empty matrix that becomes the values of another you could do the following:
import numpy as np
a = np.zeros([2,2])
b = np.array([[1,2],[3,4]])
my_array = a + b
--or--
import numpy as np
my_array = np.zeros([2,2]) # you can use empty here instead in this case.
my_array[0,0] = float(input('Enter first value: ')) # However you get your data to put them into the arrays.
But, I am guessing that is not what you really want as you could just use my_array = b. If you edit your question with more info I may be able to help more.
If you are worried about values adding over time to your array...
import numpy as np
b = np.array([[1, 2], [3, 4]])      # b is some other 2x2 matrix
my_array = b
# ... do stuff ...
new_b = np.array([[5, 6], [7, 8]])  # a new array has appeared (example values)
my_array = new_b  # rebind the name to the new values; nothing is added
# Note: my_array and new_b now refer to the same array, so changes to one show up in the other. To avoid this at the cost of some memory:
my_array = np.copy(new_b)
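If the goal is to keep one preallocated array and overwrite its contents, rather than rebinding a name, slice assignment is another option. A minimal sketch, assuming the incoming array has the same shape as the buffer (the names `buffer` and `incoming` are illustrative):

```python
import numpy as np

buffer = np.empty((2, 2))            # preallocated; contents are arbitrary until written
incoming = np.array([[1, 2], [3, 4]])

buffer[:] = incoming                 # copies the values into the existing array
incoming[0, 0] = 99                  # later changes to incoming do not affect buffer

print(buffer)
```

Unlike `np.concatenate`, this never changes the shape of `buffer`; it only replaces the stored values.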

Efficiently convert a vector of bin counts to a vector of bin indices [duplicate]

Given an array of integer counts c, how can I transform that into an array of integers inds such that np.all(np.bincount(inds) == c) is true?
For example:
>>> c = np.array([1,3,2,2])
>>> inverse_bincount(c) # <-- what I need
array([0,1,1,1,2,2,3,3])
Context: I'm trying to keep track of the location of multiple sets of data, while performing computation on all of them at once. I concatenate all the data together for batch processing, but I need an index array to extract the results back out.
Current workaround:
import numpy as np
from itertools import chain
def inverse_bincount(c):
    return np.array(list(chain.from_iterable([i]*n for i,n in enumerate(c))))
Using numpy.repeat:
np.repeat(np.arange(c.size), c)
No NumPy needed (in Python 3, reduce has to be imported from functools):
from functools import reduce
c = [1,3,2,2]
reduce(lambda x,y: x + [y] * c[y], range(len(c)), [])
The following is about twice as fast on my machine as the currently accepted answer, although I must say I am surprised by how well np.repeat does. I would have expected it to suffer a lot from temporary object creation, but it does pretty well.
import numpy as np
c = np.array([1,3,2,2])
p = np.cumsum(c)
i = np.zeros(p[-1], int)  # np.int was removed in NumPy 1.24; plain int is equivalent
np.add.at(i, p[:-1], 1)
print(np.cumsum(i))
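For reference, the `np.repeat` answer can be checked against the requirement stated in the question. The sketch below wraps it in the `inverse_bincount` name from the question. The second case is an added illustration: when the trailing counts are zero, `np.bincount` needs `minlength` for the round trip to hold.

```python
import numpy as np

def inverse_bincount(counts):
    counts = np.asarray(counts)
    # repeat each bin index as many times as its count
    return np.repeat(np.arange(counts.size), counts)

c = np.array([1, 3, 2, 2])
inds = inverse_bincount(c)
print(inds)                                # [0 1 1 1 2 2 3 3]
print(np.all(np.bincount(inds) == c))      # True

# Trailing zero counts: bincount would otherwise return a shorter array
c0 = np.array([2, 0, 1, 0])
inds0 = inverse_bincount(c0)
round_trip = np.bincount(inds0, minlength=c0.size)
```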

indexing numpy array with logical operator

I have a 2d numpy array, for instance as:
import numpy as np
a1 = np.zeros( (500,2) )
a1[:,0]=np.arange(0,500)
a1[:,1]=np.arange(0.5,1000,2)
# could be also read from txt
Then I want to select the indexes of the rows that match a criterion, such as all values of a1[:,1] within the range (l1,l2):
l1=20.0; l2=900.0; #as example
I'd like to do this in a single condensed expression. However, neither
np.where(a1[:,1]>l1 and a1[:,1]<l2)
(it raises a ValueError and suggests using np.all, which is not clear to me in this case) nor
np.intersect1d(np.where(a1[:,1]>l1),np.where(a1[:,1]<l2))
works (it gives unhashable type: 'numpy.ndarray').
My idea is then to use these indexes to index another array of size (500,n).
Is there any reasonable way to select indexes like this? Or is it necessary to use a mask in such a case?
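This should work. Python's and cannot be applied element-wise to arrays, so use & with parentheses around each comparison: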
np.where((a1[:,1]>l1) & (a1[:,1]<l2))
or
np.where(np.logical_and(a1[:,1]>l1, a1[:,1]<l2))
Does this do what you want?
import numpy as np
a1 = np.zeros( (500,2) )
a1[:,0]=np.arange(0,500)
a1[:,1]=np.arange(0.5,1000,2)
l1=20.0; l2=900.0
c=(a1[:,1]>l1)*(a1[:,1]<l2) # boolean array, True where the item at that position meets the stated criteria, False otherwise
print(a1[c]) # prints all the rows of a1 that meet the criteria
Afterwards you can select the rows you need from your other array (assuming it has dimensions (500,n)) by doing
print(newarray[c,:])
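Putting the two answers together: a boolean mask can be used directly for selection, and `np.flatnonzero` turns it into integer indexes when those are needed. A minimal sketch using the setup from the question; the array `other` and its random contents are placeholders for the (500,n) array mentioned there.

```python
import numpy as np

a1 = np.zeros((500, 2))
a1[:, 0] = np.arange(0, 500)
a1[:, 1] = np.arange(0.5, 1000, 2)
l1, l2 = 20.0, 900.0

# Element-wise AND of the two comparisons gives one boolean per row
mask = (a1[:, 1] > l1) & (a1[:, 1] < l2)

# Integer indexes of the rows where the mask is True
idx = np.flatnonzero(mask)

# Placeholder for another array with the same number of rows
other = np.random.default_rng(0).random((500, 3))
selected = other[mask]          # same rows as other[idx]

print(idx[0], idx[-1], selected.shape)   # 10 449 (440, 3)
```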
