Why does fromiter fail if I want to apply a function over the entire matrix?
>>> aaa = np.matrix([[2],[23]])
>>> np.fromiter( [x/2 for x in aaa], np.float)
array([ 1., 11.])
This works fine, but if the matrix is 2D, I get the following error:
>>> aaa = np.matrix([[2,2],[1,23]])
>>> aaa
matrix([[ 2,  2],
        [ 1, 23]])
>>> np.fromiter( [x/2 for x in aaa], np.float)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: setting an array element with a sequence.
What alternative can I use?
I know I can write two loops over the rows and columns, but that seems slow and not Pythonic.
Thanks in advance.
Iterating over a multidimensional matrix iterates over the rows, not the cells. To iterate over each value, iterate over aaa.flat.
Note that fromiter (as documented) only creates one-dimensional arrays, which is why you have to iterate over the cells and not the rows. If you want to create a new matrix of some other shape, you'll have to reshape the resulting 1d array.
Also, of course, in many cases you don't need to iterate at all. For your example, you can just do aaa/2 to divide every element of the matrix by 2.
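For example, a minimal sketch of the flat-plus-reshape route (assuming Python 3, where / is true division, unlike the Python 2 floor division behind the question's output):
import numpy as np

# fromiter only builds 1-D arrays, so iterate over the cells via .flat
# and reshape the result back to the matrix's shape afterwards
aaa = np.matrix([[2, 2], [1, 23]])
out = np.fromiter((x / 2 for x in aaa.flat), dtype=float).reshape(aaa.shape)
print(out)       # [[ 1.   1. ]
                 #  [ 0.5 11.5]]
print(aaa / 2)   # the same values, with no Python-level loop at all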
I am trying to calculate maxima with the code below, but I am getting this error:
TypeError: only size-1 arrays can be converted to Python scalars
Input_list = [1,9,96,9,7,4,3,77,0,2,3,4,5]
Please help.
import ast,sys
import numpy as np
input_str = sys.stdin.read()
input_list = ast.literal_eval(input_str)
from scipy.signal import argrelextrema
Ar=np.array(input_list)
maximas=argrelextrema(Ar, np.greater) #store your final list here
maximas=[int(x) for x in maximas] #do not change this code, the output should be an integer list for evaluation purposes
print(maximas)
argrelextrema returns a tuple containing the arrays (https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.argrelextrema.html). For your case, you probably want to access the first element of this tuple, so something like
maximas=[int(x) for x in maximas[0]]
According to the docs, the scipy.signal.argrelextrema function returns a tuple of arrays, with one array for each dimension of the input. If the input is one-dimensional, you'll get a 1-tuple back.
I suspect you can fix your code with maximas = argrelextrema(Ar, np.greater)[0], or maybe maximas, = argrelextrema(Ar, np.greater) (note the comma after the variable name, which makes it an unpacking!).
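For example, a minimal sketch of that fix on the list from the question (remember that argrelextrema returns the indices of the maxima, not their values):
import numpy as np
from scipy.signal import argrelextrema

# index the tuple first, then convert the index array to plain Python ints
Ar = np.array([1, 9, 96, 9, 7, 4, 3, 77, 0, 2, 3, 4, 5])
maximas = [int(x) for x in argrelextrema(Ar, np.greater)[0]]
print(maximas)   # [2, 7]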
In [295]: from scipy.signal import argrelextrema
In [296]: Ar = np.array([1,9,96,9,7,4,3,77,0,2,3,4,5])
In [297]: maximas=argrelextrema(Ar, np.greater)
When you ask about an error, you should show all of the error message, or at least enough that both you and we know where it occurs. That's useful information.
In [300]: [int(x) for x in maximas]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-300-0250728a7b76> in <module>
----> 1 [int(x) for x in maximas]
<ipython-input-300-0250728a7b76> in <listcomp>(.0)
----> 1 [int(x) for x in maximas]
TypeError: only size-1 arrays can be converted to Python scalars
Most likely the error is in the int() call, since int is meant to return one integer, either from a string or a numeric input. The error indicates that x was an array with more than one value.
Look at maximas:
In [301]: maximas
Out[301]: (array([2, 7]),)
It's a tuple, containing one array. So iteration on that tuple will produce an array.
We can select the array from the tuple with indexing:
In [302]: maximas[0]
Out[302]: array([2, 7])
In [303]: maximas[0].tolist() # convert it to a list
Out[303]: [2, 7]
In [304]: maximas[0].astype(int).tolist() # ensure it's integers
Out[304]: [2, 7]
According to the docs, the result of argrelextrema is a tuple with one array per dimension of the input. In fact the code shows it returning np.nonzero(results); np.where(results) is the more commonly used alias.
There was no need for int(x) at all: the arrays in maximas already contain integer indices.
You might not need to convert maximas to a list. The result of np.where can often be used directly in indexing, for example:
In [307]: Ar
Out[307]: array([ 1, 9, 96, 9, 7, 4, 3, 77, 0, 2, 3, 4, 5])
In [308]: Ar[maximas]
Out[308]: array([96, 77])
===
Be aware that if the input array is 2d, maximas will be a 2-element tuple.
In [312]: Be = Ar[:12].reshape(6,2) # 2d array
In [313]: argrelextrema(Be, np.greater)
Out[313]: (array([1, 3]), array([0, 1])) # 2 element tuple
In [314]: Be[_] # using the tuple to index Be
Out[314]: array([96, 77])
I have a list that looks as follows:
[array(46), array(0.09), array(5.3), array(4), array(23), array(33), array([0, 1])]
When I try to save it however, as follows:
np.save('model.npy', data)
I get the following error:
ValueError: setting an array element with a sequence.
What is causing the error? Is it the array([0, 1])? Or is it something to do with how the list is formatted?
Thanks.
np.save saves arrays, not lists. So it has to first convert your list to an array. But when I do that:
In [192]: array=np.array
In [193]: data = np.array([array(797.41993), array(0.5880978458210907), array(0.606072
...: 7759272153), array(0.590397955349836), array(0.5688937568615196), array(0.56
...: 70561030951616), array([0, 1])])
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-193-390218c41b83> in <module>()
----> 1 data = np.array([array(797.41993), array(0.5880978458210907), array(0.6060727759272153), array(0.590397955349836), array(0.5688937568615196), array(0.5670561030951616), array([0, 1])])
ValueError: setting an array element with a sequence.
All your sub arrays have single items, but the last has 2. If I remove that, it can create an array.
In [194]: data = np.array([array(797.41993), array(0.5880978458210907), array(0.606072
...: 7759272153), array(0.590397955349836), array(0.5688937568615196), array(0.56
...: 70561030951616)])
In [195]: data
Out[195]:
array([ 7.97419930e+02, 5.88097846e-01, 6.06072776e-01,
        5.90397955e-01, 5.68893757e-01, 5.67056103e-01])
np.array(...) tries to create as high a dimensional array as it can. If the elements vary in size it can't do that. In some cases it will create an object dtype array, but in this case it raised an error.
With the 1d item first, it creates the object dtype array:
In [196]: data = np.array([array([0, 1]), array(797.41993), array(0.5880978458210907),
...: array(0.6060727759272153), array(0.590397955349836), array(0.56889375686151
...: 96), array(0.5670561030951616)])
In [197]: data
Out[197]:
array([array([0, 1]), array(797.41993), array(0.5880978458210907),
       array(0.6060727759272153), array(0.590397955349836),
       array(0.5688937568615196), array(0.5670561030951616)], dtype=object)
Or with hstack, as @wim suggests:
In [198]: data = np.hstack([array(797.41993), array(0.5880978458210907), array(0.60607
...: 27759272153), array(0.590397955349836), array(0.5688937568615196), array(0.5
...: 670561030951616), array([0, 1])])
In [199]: data
Out[199]:
array([ 7.97419930e+02, 5.88097846e-01, 6.06072776e-01,
        5.90397955e-01, 5.68893757e-01, 5.67056103e-01,
        0.00000000e+00, 1.00000000e+00])
ValueError: setting an array element with a sequence
Best explanation of this error I can find is here
The error "setting an array element with a sequence" happens if we try to write something into a single place (an array cell, a matrix entry) of an array and this something is not a scalar value.
This happens
when you try to set a single value in an array to some sort of container object (array, tuple, list, etc)
when you try to process "ragged" datatypes (i.e. constructs that are not n-d rectangular) using methods designed to be used only on arrays (which is pretty much anything in numpy)
In this case, np.save only works on single arrays, so it starts by applying asanyarray() to your input. That sees a list of seven "things", of which the first is a float, and tries to fit them into an array with arr.shape = (7,) and arr.dtype = float. When it gets to the last element, it tries to set arr[6] = array([0, 1]) and throws the error.
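One hedged workaround, if you do want to save the ragged list as-is, is to build the 1-D object array yourself, so that asanyarray never has to guess a (7,) float shape, and let np.save pickle it:
import numpy as np

# the values below mirror the list shown in the question
data = [np.array(46), np.array(0.09), np.array(5.3), np.array(4),
        np.array(23), np.array(33), np.array([0, 1])]

obj = np.empty(len(data), dtype=object)   # a 1-D container of Python objects
for i, item in enumerate(data):           # element-wise assignment, no broadcasting
    obj[i] = item

np.save('model.npy', obj, allow_pickle=True)
loaded = np.load('model.npy', allow_pickle=True)   # object arrays need allow_pickle
print(loaded[-1])   # array([0, 1])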
I have an asymmetric 2d array in numpy, as in some rows are longer than others, such as: [[1, 2], [1, 2, 3], ...]
But numpy doesn't seem to like this:
import numpy as np
foo = np.array([[1], [1, 2]])
foo.mean(axis=1)
Traceback:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/tom/.virtualenvs/nlp/lib/python3.5/site-packages/numpy/core/_methods.py", line 56, in _mean
rcount = _count_reduce_items(arr, axis)
File "/home/tom/.virtualenvs/nlp/lib/python3.5/site-packages/numpy/core/_methods.py", line 50, in _count_reduce_items
items *= arr.shape[ax]
IndexError: tuple index out of range
Is there a nice way to do this or should I just do the maths myself?
We could use an almost vectorized approach based upon np.add.reduceat to handle the irregular-length subarrays whose averages we want. After getting a 1D flattened version of the input with np.concatenate, np.add.reduceat sums the elements in each of those irregular-length intervals. Finally, we divide the sums by the lengths of the subarrays to get the averages.
Thus, the implementation would look something like this -
lens = np.array(list(map(len, foo)))  # thanks to @Kasramvd for this! (list() is needed on Python 3, where map returns an iterator)
vals = np.concatenate(foo)
shift_idx = np.append(0,lens[:-1].cumsum())
out = np.add.reduceat(vals,shift_idx)/lens.astype(float)
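A quick check of this approach on the question's foo (built with dtype=object here so that recent numpy accepts the uneven rows):
import numpy as np

foo = np.array([[1], [1, 2]], dtype=object)

lens = np.array([len(x) for x in foo])
vals = np.concatenate(foo)
shift_idx = np.append(0, lens[:-1].cumsum())
out = np.add.reduceat(vals, shift_idx) / lens.astype(float)
print(out)   # [1.  1.5]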
You could perform the mean for each sub-array of foo using a list comprehension:
mean_foo = np.array( [np.mean(subfoo) for subfoo in foo] )
As suggested by @Kasramvd in another answer's comment, you can also use the map function (wrapped in list() on Python 3):
mean_foo = np.array(list(map(np.mean, foo)))
I have a numpy array which looks like:
myArray = np.array([[1,2],[3]])
But I can not flatten it,
In: myArray.flatten()
Out: array([[1, 2], [3]], dtype=object)
If I change the array so that the rows have the same length, then I can flatten it.
In: myArray2 = np.array([[1,2],[3,4]])
In: myArray2.flatten()
Out: array([1, 2, 3, 4])
My Question is:
Can I use something like myArray.flatten(), regardless of the dimension of the array and the length of its elements, and get the output array([1, 2, 3])?
myArray is a 1-dimensional array of objects. Your list objects will simply remain in the same order with flatten() or ravel(). You can use hstack to stack the arrays in sequence horizontally:
>>> np.hstack(myArray)
array([1, 2, 3])
Note that for these 1-D pieces this is basically equivalent to using concatenate along the first axis (this should make sense intuitively):
>>> np.concatenate(myArray, axis=0)
array([1, 2, 3])
If you don't have this issue however and can merge the items, it is always preferable to use flatten() or ravel() for performance:
In [1]: u = timeit.Timer('np.hstack(np.array([[1,2],[3,4]]))'\
....: , setup = 'import numpy as np')
In [2]: print u.timeit()
11.0124390125
In [3]: u = timeit.Timer('np.array([[1,2],[3,4]]).flatten()'\
....: , setup = 'import numpy as np')
In [4]: print u.timeit()
3.05757689476
Iluengo's answer covers in more detail why you cannot use flatten() or ravel() given your array type.
Well, I agree with the other answers that hstack or concatenate do the job in this case. However, I would like to point out that even if this 'fixes' the problem, the problem itself is not being addressed properly.
The problem is that even though it looks like the second axis has a different length for each row, that is not what is actually happening. If you try:
>>> myArray.shape
(2,)
>>> myArray.dtype
dtype('O') # stands for Object
>>> myArray[0]
[1, 2]
It shows you that your array is not a 2D array with variable row size (as you might think); it is just a 1D array of objects. In your case the elements are lists: the first element of your array is a 2-element list and the second is a 1-element list.
So flatten and ravel won't work, because flattening a 1D array just gives back the same 1D array. With an object numpy array, numpy doesn't care what you put inside; it treats the individual items as unknown objects and can't decide how to merge them.
What you should consider is whether this is the behaviour you want for your application. Numpy arrays are especially efficient with fixed-size numeric matrices. If you are playing with arrays of objects, I don't see why you would want to use numpy instead of regular Python lists.
np.hstack works in this case
In [69]: np.hstack(myArray)
Out[69]: array([1, 2, 3])
I am attempting to utilize numpy to the best of its capabilities, but I am obviously missing some important link in the documentation due to my 'Noob-ness'.
What I want to do is create an array with a certain number of rows and columns and populate it with a sub array. The sub array is incremented by a pair of values as one traverses along the row. For subsequent rows, another pair of values is used to populate the columns. The best I have come up with is to use list comprehensions to generate the desired output. At this stage I can create an array which doesn't have the desired shape...I can deal with that in an awkward fashion, so all is not lost.
Here is what I have so far:
>>> import numpy as np
>>> np.set_printoptions(precision=4,threshold=20,edgeitems=3,linewidth=80) # default print options
>>> sub = np.array([[1,2],[3,4],[5,6]],dtype='float64') # a sub array of floats
>>> sub
array([[ 1.,  2.],
       [ 3.,  4.],
       [ 5.,  6.]])
>>> e = np.empty((3,4),dtype='object') # an empty array of the desired shape
>>> e
array([[None, None, None, None],
       [None, None, None, None],
       [None, None, None, None]], dtype=object)
>>> dX = 1; dY = np.sqrt(3.)/2.0 # values to add to sub array per cell in e
>>> rows,cols = e.shape # rows and columns from 'e' shape
>>> out = [sub + [dX*i,dY*(i%2)] for i in range(0,cols)] # create the first row
>>> for j in range(1,rows): # create the other rows
... out += [out[k] + [0,-dY*2*j] for k in range(cols)]
...
>>> arr = np.array(out)
>>> arr.shape # expect to see ((3,4),3,2)...I think
(12, 3, 2)
>>> arr[0:4] # I will let you try this to see the format
The last line just shows the format of the first 4 elements of the output array. What I was hoping for was a way to populate the empty array, e, that is more elegant than my list comprehension method AND/OR a way to reshape the array properly. Again, unless I am missing links in the documentation, I would have expected a 3x4 array containing 3x2 subarrays...which is not what it is showing me. I would appreciate any help or links to appropriate documentation, since I have spent hours trawling this site and am obviously missing some appropriate numpy terminology.
The first out is a list of 4 (3,2) arrays.
np.array(out) at this stage produces a (4,3,2) array. np.array creates the highest-dimension array that the data allows; in this case it concatenates those 4 arrays along a new dimension.
After the rows loop, out is a list of 12 arrays. out +=... on a list appends them.
So by the same logic, arr = np.array(out) will produce a (12,3,2) array. That could be reshaped: arr = arr.reshape(3,4,3,2).
Subarrays from arr could be copied to e, e.g.:
e[0,0] = arr[0,0]
Which raises the question: why do you want an array like e? What advantage does it have over arr? arr represents 'the best of numpy's capabilities'; e tries to extend them into poorly developed areas.
Your out list can be vectorized with something along these lines:
ii = np.arange(cols)
ixy = np.array([dX*ii, dY*(ii%2)])
arr1 = sub[None,:,:] + ixy.T[:,None,:]
arr1 is a (4,3,2) array, and could be copied to the e[0,:] elements.
This could be cleaned up and extended to the other rows.
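For instance, one hedged sketch of that extension, building a (rows, cols, 2) offset table and broadcasting it against sub (the setup lines just repeat the question's values):
import numpy as np

sub = np.array([[1, 2], [3, 4], [5, 6]], dtype='float64')
dX = 1; dY = np.sqrt(3.)/2.0
rows, cols = 3, 4

ii = np.arange(cols)                  # column index
jj = np.arange(rows)[:, None]         # row index, shaped for broadcasting
offs = np.empty((rows, cols, 2))
offs[..., 0] = dX * ii                        # x offset grows along each row
offs[..., 1] = dY * (ii % 2) - 2 * dY * jj    # y offset alternates by column and drops per row
arr_full = sub[None, None, :, :] + offs[:, :, None, :]
print(arr_full.shape)   # (3, 4, 3, 2), the same values as arr.reshape(3, 4, 3, 2)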
A clean way of iterating over all the elements of e, and assigning the corresponding subarray of arr uses np.ndindex (from the index_tricks module):
for i in np.ndindex(3,4):
e[i]=arr[i]
While it is a Python level iteration, it does not involve copying data. It just copies pointers. I'm a little surprised about this, but e[i,j] points to the same data block as arr[i,j]. This is evident from the .__array_interface__ values, and by modifying entries, e.g.
e[1,1][0,0] = 30
changes the value of arr[1,1,0,0].
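A small check of that view-sharing claim (not from the original answer) using np.shares_memory, after rebuilding arr as a (3,4,3,2) array:
import numpy as np

sub = np.array([[1, 2], [3, 4], [5, 6]], dtype='float64')
dX = 1; dY = np.sqrt(3.)/2.0
rows, cols = 3, 4

# the question's offsets, written directly as a nested comprehension
arr = np.array([[sub + [dX*i, dY*(i % 2) - 2*dY*j] for i in range(cols)]
                for j in range(rows)])

e = np.empty((rows, cols), dtype=object)
for idx in np.ndindex(rows, cols):
    e[idx] = arr[idx]               # each e cell is a view into arr

print(np.shares_memory(e[1, 1], arr[1, 1]))   # True
e[1, 1][0, 0] = 30
print(arr[1, 1, 0, 0])                        # 30.0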