How to convert string formed by numpy.array2string back to array? - python

I have a string representation of a NumPy array, produced with numpy.array2string.
Now I want to get the NumPy array back from that string.
Any suggestions for how I can achieve this?
My Code:
img = Image.open('test.png')
array = np.array(img)
print(array.shape)
array_string = np.array2string(array, precision=2, separator=',',suppress_small=True)
P.S. My array is 3D, not 1D, and I am using ',' as the separator rather than the default space.

This is kind of a hack, but may be the simplest solution.
import numpy as np
array = np.array([[[1,2,3,4]]]) # create a 3D array
array_string = np.array2string(array, precision=2, separator=',', suppress_small=True)
print(array_string) #=> [[[1,2,3,4]]]
# Getting the array back to numpy
new_array = eval('np.array(' + array_string + ')')
Because the string representation has the same form as the nested-list argument that np.array accepts, eval recreates the same array.
It is probably best to wrap this in a try/except in case the string is not in a valid format.
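If the string might come from somewhere you don't fully trust, a more cautious variant of the same idea is to parse it with ast.literal_eval, which only accepts Python literals. This is a minimal sketch, not a drop-in fix: the helper name array_from_string is made up for illustration, and it assumes the string was produced with separator=',' and with threshold raised so that array2string does not abbreviate large arrays with '...'.

```python
import ast
import sys

import numpy as np


def array_from_string(text, dtype=None):
    """Rebuild an array from a comma-separated array2string result.

    Hypothetical helper: literal_eval turns the text into nested lists,
    which np.array then converts back into an ndarray.
    """
    return np.array(ast.literal_eval(text), dtype=dtype)


original = np.arange(24).reshape(2, 3, 4)  # small 3D stand-in for image data

# threshold=sys.maxsize keeps array2string from summarising big arrays
text = np.array2string(original, separator=',', threshold=sys.maxsize)

restored = array_from_string(text, dtype=original.dtype)
print(restored.shape)
```

Values such as nan or inf are not Python literals, so this sketch would raise on them.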

Update: I just tried this and it worked for me:
import numpy as np
from PIL import Image
img = Image.open('2.jpg')
arr = np.array(img)
# get shape and type
array_shape = arr.shape
array_data_type = arr.dtype.name
# converting to string
array_string = arr.tobytes()  # tostring() is the older, deprecated name for tobytes()
# converting back to numpy array
new_arr = np.frombuffer(array_string, dtype=array_data_type).reshape(array_shape)
print(new_arr)
To convert the NumPy array to a string I used arr.tobytes() (arr.tostring() is the older, deprecated name for the same method) instead of np.array2string(). Converting back to a NumPy array then works with np.frombuffer().
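Note that the raw bytes carry neither the shape nor the dtype, which is why the answer above stores them separately. If keeping track of those by hand is a nuisance, one possible alternative (a sketch, not what the answer above does) is to serialise with np.save into an in-memory buffer, since the .npy format records shape and dtype itself. The function names array_to_bytes and array_from_bytes are invented for this example.

```python
import io

import numpy as np


def array_to_bytes(arr):
    """Serialise an array to .npy-formatted bytes (shape and dtype included)."""
    buffer = io.BytesIO()
    np.save(buffer, arr, allow_pickle=False)
    return buffer.getvalue()


def array_from_bytes(payload):
    """Inverse of array_to_bytes."""
    return np.load(io.BytesIO(payload), allow_pickle=False)


# Stand-in for np.array(img): a small uint8 "image" of height 4, width 5, 3 channels
arr = np.arange(60, dtype=np.uint8).reshape(4, 5, 3)

payload = array_to_bytes(arr)
new_arr = array_from_bytes(payload)
print(new_arr.shape, new_arr.dtype)
```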

numpy.array2string() gives an output string such as '[1, 2]' (here with separator=', '), so you need to remove the brackets to get the elements separated only by the separator.
Here is a small example that extracts the elements from the string by removing the brackets and then calling np.fromstring(). Since you used ',' as the separator when creating the string, I use the same separator to parse it.
import numpy as np
x = '[1, 2]'
x = x.replace('[','')
x = x.replace(']','')
a = np.fromstring(x, dtype=int, sep=",")
print(a)
#Output: [1 2]
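The example above handles a 1D string. For the 3D case in the question, the same bracket-stripping idea can be extended if the original shape is known. The following is only a sketch under that assumption; parse_flat_array is a hypothetical helper, and it removes all whitespace first so that the line breaks array2string inserts between rows cannot confuse the parser.

```python
import numpy as np


def parse_flat_array(text, shape, dtype=int, sep=','):
    """Strip brackets and whitespace, parse the flat numbers, restore the shape."""
    cleaned = text.replace('[', '').replace(']', '')
    cleaned = ''.join(cleaned.split())  # drop spaces and newlines
    flat = np.fromstring(cleaned, dtype=dtype, sep=sep)
    return flat.reshape(shape)


original = np.arange(24).reshape(2, 3, 4)
text = np.array2string(original, separator=',')

restored = parse_flat_array(text, original.shape)
print(restored.shape)
```

The shape has to be stored alongside the string, because the flat list of numbers no longer says how they were nested.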

import numpy as np
def fromStringArrayToFloatArray(stringArray):
    # drop the surrounding brackets, split on whitespace, convert each token
    array = [float(s) for s in stringArray[1:-1].split()]
    return np.array(array)
x = np.array([1.1, 2.2, 3.3, 4.4])
y = np.array2string(x)
z = fromStringArrayToFloatArray(y)
print(x == z)  # element-wise comparison
You can use a list comprehension to split the string into separate number strings and then convert them to float (or whatever type you need).

Related

Remove elements by index from a string within numpy array

I have the following example array with strings:
['100000000' '101010100' '110101010' '111111110']
I would like to remove some characters by position from every string in the array at once. For example, if I remove the characters at positions 6 and 8 (counting from 1, i.e. indices 5 and 7 when counting from 0), I should get the following result:
['1000000' '1010110' '1101000' '1111110']
All my attempts have failed so far, perhaps because strings are immutable, but I am not sure whether I need to convert them and, if so, to what and how.
import numpy as np
a = np.array(['100000000', '101010100', '110101010', '111111110'])
list(map(lambda s: "".join([c for i, c in enumerate(str(s)) if i not in {5, 7}]), a))
Returns:
['1000000', '1010110', '1101000', '1111110']
Another way to do this is to convert a into a 2D array of single characters, mask out the values you don't want, and then convert back into a 1D array of strings.
import numpy as np
a = np.array(['100000000', '101010100', '110101010', '111111110'])
b = a.view('U1').reshape(*a.shape, -1)
mask = np.ones(b.shape[-1], dtype=bool)
mask[[5, 7],] = False
b = b[:, mask].reshape(-1).view(f'U{b.shape[-1] - (~mask).sum()}')
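The character-view approach can also be written with np.delete instead of a hand-built mask. This is a sketch only: drop_chars is a made-up name, and it assumes a contiguous 1D array of fixed-width unicode strings that all share the same dtype width (as in the example array).

```python
import numpy as np


def drop_chars(strings, indices):
    """Remove the characters at the given 0-based indices from every string."""
    # View each string as a row of single characters
    chars = strings.view('U1').reshape(len(strings), -1)
    # Delete the unwanted columns; make the result contiguous so it can be re-viewed
    kept = np.ascontiguousarray(np.delete(chars, indices, axis=1))
    # Glue each row of characters back into one string
    return kept.view(f'U{kept.shape[1]}').ravel()


a = np.array(['100000000', '101010100', '110101010', '111111110'])
result = drop_chars(a, [5, 7])
print(result)
```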

Efficient way of creating numpy array of fixed size tuples from bytes

I'm trying to convert bytes to numpy array of fixed size tuples (2 or 3 doubles) and it must be 1d array.
What I managed to get is:
values = np.fromstring(data, (np.double, (n,))) - it gives me 2d array with shape (105107, 2)
array([[0.03171165, 0.03171165],
[0.03171165, 0.03171165],
[0.03020949, 0.03020949],
...,
[0.05559354, 0.16173067],
[0.12667986, 0.04522982],
[0.14062567, 0.11422881]])
values = np.fromstring(data, [('dt', np.double, (n,))]) - it gives me 1d array with shape (105107,), but array contains tuples containing array with two doubles
array([([0.03171165, 0.03171165],), ([0.03171165, 0.03171165],),
([0.03020949, 0.03020949],), ..., ([0.05559354, 0.16173067],),
([0.12667986, 0.04522982],), ([0.14062567, 0.11422881],)],
dtype=[('dt', '<f8', (2,))])
Is there an efficient way to get a 1D array like this?
array([(0.03171165, 0.03171165),
(0.03171165, 0.03171165),
(0.03020949, 0.03020949),
...,
(0.05559354, 0.16173067),
(0.12667986, 0.04522982),
(0.14062567, 0.11422881)])
No, I don't know an efficient way, but as nobody has so far posted any answer at all, here is a way that at least gets you the desired output. However, efficient it is not.
values = np.fromstring(data, (np.double, (n,)))
x = np.empty(values.shape[0], dtype=object)  # np.object was removed in NumPy 1.24; use the builtin object
for i, a in enumerate(values):
    x[i] = tuple(a)
I would add that an array of objects negates the benefits of vectorisation in numpy so thoroughly that you might as well just use a list instead:
values = np.fromstring(data, (np.double, (n,)))
x = [tuple(a) for a in values]
A possible alternative approach to generating the array of tuples -- not sure if it is any faster -- would be to go via such a list, and convert it back into an array in such a way as to deliberately break the conversion to a nice ordinary 2-d array that numpy would otherwise do:
values = np.fromstring(data, (np.double, (n,)))
x = [tuple(a) for a in values]
x.append(None)
y = np.array(x, dtype=object)[:-1]  # since NumPy 1.24 ragged input needs an explicit dtype=object
I already solved the problem using this code:
names = ['d{i}'.format(i=i) for i in range(n)]
value = np.fromstring(data, {
    'names': names,
    'formats': [np.double] * n
})
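For reference, here is a small self-contained version of that structured-dtype approach. It is a sketch with made-up sample data, and it uses np.frombuffer, which is the recommended call for binary input now that binary-mode np.fromstring is deprecated. The field names 'd0', 'd1' follow the naming used above.

```python
import numpy as np

n = 2  # number of doubles per record

# Made-up sample data standing in for the real `data` bytes
pairs = np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]], dtype=np.double)
data = pairs.tobytes()

# One named field per double, as in the solution above
record_dtype = np.dtype([(f'd{i}', np.double) for i in range(n)])

# frombuffer reinterprets the bytes without copying; the result is a 1D array
# of records (read-only, because the underlying bytes object is immutable)
records = np.frombuffer(data, dtype=record_dtype)

print(records.shape)
print(records[1].item())  # a single record as a plain Python tuple
```

Each element prints like a tuple but the storage is still a packed block of doubles, so field access such as records['d0'] stays vectorised.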

Obtain torch.tensor from string of floats

We can convert a 1-dimensional array of floats, stored as space-separated numbers in a text file, into a numpy array or a torch tensor as follows.
line = "1 5 3 7 4"
np_array = np.fromstring(line, dtype='int', sep=" ")
np_array
>> array([1, 5, 3, 7, 4])
And to convert the above numpy array to a torch tensor, we can do the following:
torch_tensor = torch.tensor(np_array)
torch_tensor
>>tensor([1, 5, 3, 7, 4])
How can I convert a string of space-separated numbers into a torch.Tensor directly, without converting them to a numpy array first? We can also do this by first splitting the string at each space, mapping the pieces to int or float, and then feeding the result to torch.tensor. But is there a method in PyTorch like numpy's fromstring?
What about
x = torch.tensor(list(map(float, line.split(' '))), dtype=torch.float32)
PyTorch currently has no function analogous to numpy's fromstring. You can either use the numpy function itself or split and map as you say.

numpy fromfile and structured arrays

I'm trying to use numpy.fromfile to read a structured array (file header) by passing in a user defined data-type. For some reason, my structured array elements are coming back as 2-d Arrays instead of flat 1D arrays:
headerfmt='20i,20f,a80'
dt = np.dtype(headerfmt)
header = np.fromfile(fobj,dtype=dt,count=1)
ints,floats,chars = header['f0'][0], header['f1'][0], header['f2'][0]
#                               ^?               ^?               ^?
How do I modify headerfmt so that it will read them as flat 1D arrays?
If the count will always be 1, just do:
header = np.fromfile(fobj, dtype=dt, count=1)[0]
You'll still be able to index by field name, though the repr of the array won't show the field names.
For example:
import numpy as np
headerfmt='20i,20f,a80'
dt = np.dtype(headerfmt)
# Note the 0-index!
x = np.zeros(1, dtype=dt)[0]
print(x['f0'], x['f1'], x['f2'])
ints, floats, chars = x
It may or may not be ideal for your purposes, but it's simple, at any rate.
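To see the effect end to end, here is a small round trip through a temporary file. It is a sketch with invented header contents; the temporary file stands in for the real fobj, and 'S80' is used as the spelling of the 80-byte string field (the same fixed-width bytes type that 'a80' refers to).

```python
import os
import tempfile

import numpy as np

header_dtype = np.dtype('20i,20f,S80')

# Invented header contents, written out so there is something to read back
written = np.zeros(1, dtype=header_dtype)
written['f0'] = np.arange(20)
written['f2'] = b'demo header'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'header.bin')
    written.tofile(path)
    with open(path, 'rb') as fobj:
        # Indexing with [0] turns the length-1 array into a single record
        header = np.fromfile(fobj, dtype=header_dtype, count=1)[0]

ints, floats, chars = header['f0'], header['f1'], header['f2']
print(ints.shape, floats.shape, chars)
```

With the [0] applied once to the record, each field comes back as a flat 1D array (or a single bytes value for the string field).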

Converting a matrix created with MATLAB to Numpy array with a similar syntax

I'm playing with the code snippets of the course I'm taking which is originally written in MATLAB. I use Python and convert these matrices to Python for the toy examples. For example, for the following MATLAB matrix:
s = [2 3; 4 5];
I use
s = array([[2,3],[4,5]])
It is too time consuming for me to re-write all the toy examples this way because I just want to see how they work. Is there a way to directly give the MATLAB matrix as string to a Numpy array or a better alternative for this?
For example, something like:
s = myMagicalM2ArrayFunction('[2 3; 4 5]')
numpy.matrix can take a string as an argument.
Docstring:
matrix(data, dtype=None, copy=True)
[...]
Parameters
----------
data : array_like or string
    If `data` is a string, it is interpreted as a matrix with commas
    or spaces separating columns, and semicolons separating rows.
In [1]: import numpy as np
In [2]: s = '[2 3; 4 5]'
In [3]: def mag_func(s):
   ...:     return np.array(np.matrix(s.strip('[]')))
In [4]: mag_func(s)
Out[4]:
array([[2, 3],
       [4, 5]])
How about just saving a set of example matrices in MATLAB and loading them directly into Python:
http://docs.scipy.org/doc/scipy/reference/tutorial/io.html
EDIT:
Or (I am not sure how robust this is; I just threw together a simple parser, which could probably be implemented better some other way) something like:
import numpy as np
def myMagicalM2ArrayFunction(s):
    tok = []
    for t in s.strip('[]').split(';'):
        # split() without arguments tolerates repeated spaces between entries
        tok.append('[' + ','.join(t.strip().split()) + ']')
    b = eval('[' + ','.join(tok) + ']')
    return np.array(b)
For 1D arrays, this will create a numpy array with shape (1,N), so you might want to use np.squeeze to get a (N,) shaped array depending on what you are doing.
If you want a NumPy array rather than a NumPy matrix (note that this version expects commas between columns, e.g. '[2,3;4,5]'):
def str_to_mat(x):
    x = x.strip('[]')
    return np.vstack(list(map(lambda r: np.array(r.split(','), dtype=np.float32), x.split(';'))))
