ValueError: setting an array element with a sequence - python

Why do the following code samples:
np.array([[1, 2], [2, 3, 4]])
np.array([1.2, "abc"], dtype=float)
...all give the following error?
ValueError: setting an array element with a sequence.

Possible reason 1: trying to create a jagged array
You may be creating an array from a list that isn't shaped like a multi-dimensional array:
numpy.array([[1, 2], [2, 3, 4]]) # wrong!
numpy.array([[1, 2], [2, [3, 4]]]) # wrong!
In these examples, the argument to numpy.array contains sequences of different lengths. Those will yield this error message because the input list is not shaped like a "box" that can be turned into a multidimensional array.
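As a rough illustration of reason 1, here is a minimal sketch. It assumes dtype=float is passed explicitly so that the behaviour does not depend on the NumPy version (some older releases would otherwise fall back to an object array for ragged input). The helper name try_build is made up for this example.

```python
import numpy as np

def try_build(rows):
    """Return a float array built from rows, or None if NumPy rejects the shape."""
    try:
        return np.array(rows, dtype=float)
    except ValueError as exc:
        print("rejected:", exc)
        return None

boxed = try_build([[1, 2], [3, 4]])      # every row has 2 items: forms a 2x2 box
ragged = try_build([[1, 2], [2, 3, 4]])  # rows of length 2 and 3: no box shape
```

Only the first call returns an array; the second hits the ValueError because the rows cannot be laid out as a rectangle.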
Possible reason 2: providing elements of incompatible types
For example, providing a string as an element in an array of type float:
numpy.array([1.2, "abc"], dtype=float) # wrong!
If you really want to have a NumPy array containing both strings and floats, you could use the dtype object, which allows the array to hold arbitrary Python objects:
numpy.array([1.2, "abc"], dtype=object)
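To make the difference concrete, the sketch below builds the same list three ways. The variable names are only for illustration.

```python
import numpy as np

values = [1.2, "abc"]

try:
    np.array(values, dtype=float)        # "abc" cannot be parsed as a float
    float_ok = True
except ValueError:
    float_ok = False

mixed = np.array(values, dtype=object)   # elements stay as Python objects
as_text = np.array(values)               # no dtype given: everything becomes a string
```

With dtype=object the float stays a float and the string stays a string. Without any dtype, NumPy picks a string dtype and converts 1.2 to text, which is usually not what you want either.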

The Python ValueError:
ValueError: setting an array element with a sequence.
This means exactly what it says: you're trying to cram a sequence of numbers into a single number slot. It can be raised under various circumstances.
1. When you pass a Python tuple or list to be interpreted as a NumPy array element:
import numpy
numpy.array([1, 2, 3])             # good
numpy.array([1, (2, 3)])           # fails: can't convert a tuple into a NumPy array element
numpy.mean([5, (6 + 7)])           # good: (6 + 7) is just the number 13
numpy.mean([5, tuple(range(2))])   # fails: can't convert a tuple into a NumPy array element
def foo():
    return 3
numpy.array([2, foo()])            # good
def foo():
    return [3, 4]
numpy.array([2, foo()])            # fails: can't convert a list into a NumPy array element
2. By trying to cram a NumPy array of length > 1 into a single NumPy array element:
import numpy as np
x = np.array([1, 2, 3])
x[0] = np.array([4])      # good
x = np.array([1, 2, 3])
x[0] = np.array([4, 5])   # fails: can't fit a 2-element array into one array element
In both cases NumPy is filling an array and doesn't know how to cram multi-valued tuples or arrays into single element slots. It expects whatever you give it to evaluate to a single number; if it doesn't, NumPy responds that it can't set an array element with a sequence.
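If the goal was to write several values, one way out is to make the target as wide as the source. The sketch below is only an illustration (the names are arbitrary), and it catches TypeError as well in case a NumPy version reports the failed scalar conversion that way.

```python
import numpy as np

x = np.array([1, 2, 3])
new_values = np.array([4, 5])

try:
    x[0] = new_values            # one slot, two values
    single_slot_ok = True
except (ValueError, TypeError):
    single_slot_ok = False

x[0:2] = new_values              # two slots, two values
x[2] = new_values[0]             # one slot, one scalar
```

The slice assignment works because the left side now selects as many elements as the right side provides.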

In my case, I got this error in TensorFlow. The reason was that I was trying to feed an array containing sequences of different lengths.
Example:
import tensorflow as tf
input_x = tf.placeholder(tf.int32, [None, None])
word_embedding = tf.get_variable('embeddin', shape=[len(vocab_), 110], dtype=tf.float32, initializer=tf.random_uniform_initializer(-0.01, 0.01))
embedding_look = tf.nn.embedding_lookup(word_embedding, input_x)
with tf.Session() as tt:
    tt.run(tf.global_variables_initializer())
    a, b = tt.run([word_embedding, embedding_look], feed_dict={input_x: example_array})
    print(b)
If my array is:
example_array = [[1,2,3],[1,2]]
then I get the error:
ValueError: setting an array element with a sequence.
But if I pad the shorter sequence:
example_array = [[1,2,3],[1,2,0]]
it works.
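The padding step can be automated. The following is a minimal sketch in plain NumPy; pad_rows is a hypothetical helper, and it assumes that 0 is a value you can safely treat as padding in your data.

```python
import numpy as np

def pad_rows(rows, pad_value=0):
    """Right-pad each row with pad_value so all rows share the longest length."""
    width = max(len(row) for row in rows)
    padded = np.full((len(rows), width), pad_value, dtype=np.int32)
    for i, row in enumerate(rows):
        padded[i, :len(row)] = row   # copy the real values, leave the rest padded
    return padded

example_array = pad_rows([[1, 2, 3], [1, 2]])
print(example_array)
```

The result is a regular 2x3 integer array, which can be fed wherever a rectangular batch is expected.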

For those having trouble with similar problems in NumPy, a very simple solution is to pass dtype=object when creating the array you will assign values to. For instance:
out = np.empty_like(lil_img, dtype=object)
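A small sketch of what that buys you, using np.empty instead of np.empty_like so that it runs on its own (lil_img above is the poster's variable and is not defined here):

```python
import numpy as np

out = np.empty(3, dtype=object)  # three slots holding arbitrary Python objects
out[0] = [1, 2, 3]               # a list is stored as one element
out[1] = np.array([4, 5])        # so is a whole array
out[2] = "text"
```

Keep in mind that an object array is essentially a container of Python references, so the usual fast numeric operations no longer apply to it.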

In my case, the problem was different. I was trying to convert a list of lists of ints to an array, and one of the lists had a different length from the others. To check for this, you can run:
print([i for i,x in enumerate(list) if len(x) != 560])
In my case, the length reference was 560.
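If you don't know the reference length in advance, you could take the most common row length as the expected one. This is just a sketch; odd_length_rows is a name invented for the example.

```python
from collections import Counter

def odd_length_rows(rows):
    """Return (expected_length, indices of rows whose length differs from it)."""
    expected, _ = Counter(len(row) for row in rows).most_common(1)[0]
    return expected, [i for i, row in enumerate(rows) if len(row) != expected]

rows = [[1, 2, 3], [4, 5, 6], [7, 8], [9, 10, 11]]
expected, bad = odd_length_rows(rows)
print(expected, bad)
```

Here the third row (index 2) is reported, since it is the only one that does not have 3 items.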

In my case, the problem was with a scatterplot of a dataframe X[]:
ax.scatter(X[:,0],X[:,1],c=colors,
cmap=CMAP, edgecolor='k', s=40) #c=y[:,0],
#ValueError: setting an array element with a sequence.
#Fix with .toarray():
colors = 'br'
y = label_binarize(y, classes=['Irrelevant','Relevant'])
ax.scatter(X[:,0].toarray(),X[:,1].toarray(),c=colors,
cmap=CMAP, edgecolor='k', s=40)

When the shape is not regular or the elements have different data types, the only dtype that np.array can use is object. Since NumPy 1.24 you have to pass dtype=object explicitly for such input; older releases inferred it (the later ones with a deprecation warning).
import numpy as np
# arr1 = np.array([[10, 20.], [30], [40]], dtype=np.float32)  # error
arr2 = np.array([[10, 20.], [30], [40]], dtype=object)  # OK: 1-D array of three lists
arr3 = np.array([[10, 20.], 'hello'], dtype=object)     # OK: 1-D array holding a list and a string

In my case, I had a nested list as the series that I wanted to use as an input.
First check: If
df['nestedList'][0]
outputs a list like [1,2,3], you have a nested list.
Then check if you still get the error when changing to input df['nestedList'][0].
Then your next step is probably to concatenate all nested lists into one unnested list, using
[item for sublist in df['nestedList'] for item in sublist]
This flattening of the nested list is borrowed from How to make a flat list out of list of lists?.
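For reference, the same flattening can be written with itertools.chain. The list below is only a stand-in for df['nestedList'], so the snippet runs without pandas.

```python
from itertools import chain

import numpy as np

nested = [[1, 2, 3], [4, 5], [6]]          # stand-in for df['nestedList']
flat = list(chain.from_iterable(nested))   # same result as the comprehension
flat_array = np.array(flat)                # now a regular 1-D numeric array
```

Once the data is flat, np.array no longer has to guess a shape for rows of different lengths.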

The error is because the dtype argument of the np.array function specifies the data type of the elements in the array, and it can only be set to a single data type that is compatible with all the elements. The value "abc" is not a valid float, so trying to convert it to a float results in a ValueError. To avoid this error, you can either remove the string element from the list, or choose a different data type that can handle both float values and string values, such as object.
numpy.array([1.2, "abc"], dtype=object)


What does "[()]" mean when called upon a numpy array?

I just came across this piece of code:
x = np.load(lc_path, allow_pickle=True)[()]
And I've never seen this pattern before: [()]. What does it do and why is it syntactically correct?
a = np.load(lc_path, allow_pickle=True)
>>> array({'train_ppls': [1158.359413193576, 400.54333992093854, ...],
'val_ppls': [493.0056070137404, 326.53203520368623, ...],
'train_losses': [340.40905952453613, 675.6475067138672, ...],
'val_losses': [217.46258735656738, 438.86770486831665, ...],
'times': [19.488852977752686, 20.147733449935913, ...]}, dtype=object)
So I guess a is a dict wrapped in an array for some reason by the person who saved it
It's a way (the only way) of indexing a 0d array:
In [475]: x=np.array(21)
In [476]: x
Out[476]: array(21)
In [477]: x.shape
Out[477]: ()
In [478]: x[()]
Out[478]: 21
In effect it pulls the element out of the array. item() is another way:
In [479]: x.item()
Out[479]: 21
In [480]: x.ndim
Out[480]: 0
In
x = np.load(lc_path, allow_pickle=True)[()]
most likely np.save was given a non-array object, which it wrapped in a 0d object-dtype array in order to save it. Indexing with [()] is a way of recovering that object.
In [481]: np.save('test.npy', {'a':1})
In [482]: x = np.load('test.npy', allow_pickle=True)
In [483]: x
Out[483]: array({'a': 1}, dtype=object)
In [484]: x.ndim
Out[484]: 0
In [485]: x[()]
Out[485]: {'a': 1}
In general when we index a nd array, e.g. x[1,2] we are really doing x[(1,2)], that is, using a tuple that corresponds to the number of dimensions. If x is 0d, the only tuple that works is an empty one, ().
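The tuple equivalence is easy to check directly. The array names in this sketch are arbitrary.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)
same = grid[1, 2] == grid[(1, 2)]   # the comma form is sugar for a tuple index

scalar = np.array(21)               # 0-d array: its shape is ()
value = scalar[()]                  # empty tuple: one index per dimension, so none
```

A 2-D array takes a 2-tuple and a 0-D array takes a 0-tuple, which is why () is the only full index that fits.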
That's indexing the array with a tuple of 0 indices. For most arrays, this just produces a view of the whole array, but for a 0-dimensional array, it extracts the array's single element as a scalar.
In this case, it looks like someone made the weird choice to dump a non-NumPy object to an array with numpy.save, resulting in NumPy saving a 0-dimensional array of object dtype wrapping the original object. The use of allow_pickle=True and the empty tuple index extracts the object from the 0-dimensional array.
They probably should have picked something other than numpy.save to save this object.
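As a sketch of both routes, the snippet below round-trips a small dict through np.save and, as one possible alternative, through the standard json module. The dict contents and file names are made up, and JSON is only one option among several (it works here because the values are plain lists of floats).

```python
import json
import os
import tempfile

import numpy as np

history = {"train_losses": [340.4, 675.6], "times": [19.5, 20.1]}

with tempfile.TemporaryDirectory() as tmp:
    npy_path = os.path.join(tmp, "history.npy")
    np.save(npy_path, history)                       # dict gets wrapped in a 0-d object array
    wrapped = np.load(npy_path, allow_pickle=True)
    recovered = wrapped[()]                          # unwrap the single element

    json_path = os.path.join(tmp, "history.json")
    with open(json_path, "w") as fh:
        json.dump(history, fh)                       # plain-text alternative, no pickling
    with open(json_path) as fh:
        from_json = json.load(fh)
```

Both routes give back an equal dict, but the JSON file can be read without allow_pickle=True and without the [()] step.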

numpy array indexing: list index and np.array index give different result

I am trying to index an np.array using list and np.array indices, but they give different results.
Here is an illustration:
import numpy as np
x = np.arange(10)
idx = [[0, 1], [1, 2]]
x[np.array(idx)] # returns array([[0, 1], [1, 2]])
but applying the list directly raises an error:
x[idx] # raises IndexError: too many indices for array
I expected this to return the same result as indexing with the np.array.
Any ideas why?
I am using python 3.5 and numpy 1.13.1.
If the index is an array, its shape is interpreted as the shape of the result, and each entry is an index into x. If it is a list, its items are interpreted as the indices along successive dimensions (multi-dimensional array indices).
So the first example (with an array) is equivalent to:
[[x[0], x[1]],
 [x[1], x[2]]]
But the second example (list) is interpreted as:
[x[0, 1], x[1, 2]]
And x[0, 1] gives an IndexError: too many indices for array, because your x has only one dimension.
That's because the list is interpreted as if it were a tuple, which is identical to passing its items in "separately":
x[[0, 1], [1, 2]]
          ^^^^^^----- indices for the second dimension
  ^^^^^^------------- indices for the first dimension
From numpy indexing documentation:
ndarrays can be indexed using the standard Python x[obj] syntax, where x is the array and obj the selection.
...
Basic slicing occurs when obj is a slice object (constructed by
start:stop:step notation inside of brackets), an integer, or a tuple
of slice objects and integers. Ellipsis and newaxis objects can be
interspersed with these as well. In order to remain backward
compatible with a common usage in Numeric, basic slicing is also
initiated if the selection object is any non-ndarray sequence (such as
a list) containing slice objects, the Ellipsis object, or the newaxis
object, but not for integer arrays or other embedded sequences. ...
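To see the two interpretations side by side regardless of NumPy version, you can spell them out explicitly with np.array(idx) and tuple(idx). This is only a sketch. As far as I know, recent NumPy releases treat a bare nested list like an index array, so the explicit forms are the unambiguous ones.

```python
import numpy as np

idx = [[0, 1], [1, 2]]
grid = np.arange(9).reshape(3, 3)

picked = grid[np.array(idx)]   # index array: each entry selects a whole row
paired = grid[tuple(idx)]      # tuple: same as grid[[0, 1], [1, 2]]

x = np.arange(10)
try:
    x[tuple(idx)]              # two index lists, but x has only one dimension
    too_many = False
except IndexError:
    too_many = True
```

On the 2-D grid the tuple form pairs the lists up as (row, column) coordinates, while on the 1-D x it fails for the reason described above.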


Dictionary in a numpy array?

How do I access the dictionary inside the array?
import numpy as np
x = np.array({'x': 2, 'y': 5})
My initial thought:
x['y']
Index Error: not a valid index
x[0]
Index Error: too many indices for array
You have a 0-dimensional array of object dtype. Making this array at all is probably a mistake, but if you want to use it anyway, you can extract the dictionary by indexing the array with a tuple of no indices:
x[()]
or by calling the array's item method:
x.item()
If you add square brackets to the array assignment you will have a 1-dimensional array:
x = np.array([{'x': 2, 'y': 5}])
then you could use:
x[0]['y']
I believe it would make more sense.
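Put together, the three access patterns look like this (a short sketch; the variable names are only for illustration):

```python
import numpy as np

x = np.array({'x': 2, 'y': 5})
# x.ndim is 0 and x.dtype is object: the dict is the array's only element
y_from_tuple = x[()]['y']
y_from_item = x.item()['y']

boxed = np.array([{'x': 2, 'y': 5}])   # 1-D array holding one dict
y_from_boxed = boxed[0]['y']
```

All three expressions reach the same dictionary; they differ only in how the array around it is unwrapped.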

how can I flatten an 2d numpy array, which has different length in the second axis?

I have a numpy array which looks like:
myArray = np.array([[1,2],[3]], dtype=object)  # dtype=object is required for ragged input in NumPy 1.24+
But I can not flatten it,
In: myArray.flatten()
Out: array([[1, 2], [3]], dtype=object)
If I change the array to the same length in the second axis, then I can flatten it.
In: myArray2 = np.array([[1,2],[3,4]])
In: myArray2.flatten()
Out: array([1, 2, 3, 4])
My Question is:
Can I use some thing like myArray.flatten() regardless the dimension of the array and the length of its elements, and get the output: array([1,2,3])?
myArray is a 1-dimensional array of objects. Your list objects will simply remain in the same order with flatten() or ravel(). You can use hstack to stack the arrays in sequence horizontally:
>>> np.hstack(myArray)
array([1, 2, 3])
Note that for 1-D inputs like these, this is basically equivalent to using concatenate along the first axis (axis=0, the default):
>>> np.concatenate(myArray, axis=0)
array([1, 2, 3])
If you don't have this issue however and can merge the items, it is always preferable to use flatten() or ravel() for performance:
In [1]: u = timeit.Timer('np.hstack(np.array([[1,2],[3,4]]))'\
....: , setup = 'import numpy as np')
In [2]: print(u.timeit())
11.0124390125
In [3]: u = timeit.Timer('np.array([[1,2],[3,4]]).flatten()'\
....: , setup = 'import numpy as np')
In [4]: print(u.timeit())
3.05757689476
Iluengo's answer also has you covered for further information as to why you cannot use flatten() or ravel() given your array type.
Well, I agree with the other answers that hstack or concatenate do the job in this case. However, I would like to point out that even though this 'fixes' the symptom, the underlying problem is not addressed.
The problem is that although it looks like the second axis has varying length, this is not what the array actually contains. If you try:
>>> myArray.shape
(2,)
>>> myArray.dtype
dtype('O') # stands for Object
>>> myArray[0]
[1, 2]
This shows that your array is not a 2D array with variable row size (as you might think); it is just a 1D array of objects. In your case the elements are lists: the first element of the array is a 2-element list and the second is a 1-element list.
So flatten and ravel won't help, because transforming a 1D array into a 1D array results in exactly the same 1D array. An object array doesn't care what you put inside it; it treats the individual items as unknown objects and can't decide how to merge them.
What you should consider is whether this is the behaviour you want for your application. NumPy arrays are especially efficient for fixed-size numeric matrices. If you are working with arrays of objects, I don't see why you would want to use NumPy instead of regular Python lists.
np.hstack works in this case
In [69]: np.hstack(myArray)
Out[69]: array([1, 2, 3])
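For completeness, here is a small self-contained sketch of the whole situation. It passes dtype=object explicitly so that it also runs on NumPy versions that refuse to infer it, and it converts each row with np.asarray before concatenating, which is just one way to write the merge.

```python
import numpy as np

my_array = np.array([[1, 2], [3]], dtype=object)   # 1-D object array of two lists

unchanged = my_array.flatten()                     # still two list elements
flat = np.concatenate([np.asarray(row) for row in my_array])
```

flatten() leaves the two lists in place, while the concatenation produces the numeric 1-D array the question asks for.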
