I am trying to convert the values of a dictionary to a 1D array using np.asarray(dict.values()), but when I print the shape of the output array, I get an unexpected result.
My array looks like this:
dict_values([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26])
but the output of array.shape is:
()
whereas I was expecting (27,1) or (27,).
After I changed the code to np.asarray(dict.values()).flatten(), the output of array.shape became
(1,)
I have read the documentation for numpy.ndarray.shape, but I can't work out why the outputs look like this. Can someone explain it to me? Thanks.
This must be Python 3.
From the docs:
The objects returned by dict.keys(), dict.values() and dict.items()
are view objects. They provide a dynamic view on the dictionary’s
entries, which means that when the dictionary changes, the view
reflects these changes.
The issue is that dict.values() returns only a dynamic view of the dictionary's values, not a list, which leads to the behaviour you see.
dict_a = {'1': 1, '2': 2}
res = np.array(dict_a.values())
res.shape #()
res
#Output:
array(dict_values([1, 2]), dtype=object)
Notice that NumPy does not resolve the view object into the actual integers; it just wraps the view itself in a 0-d array with dtype=object, which is why the shape is ().
To avoid this issue, consume the view to get a list, as follows:
dict_a = {'1': 1, '2': 2}
res = np.array(list(dict_a.values()))
res.shape #(2,)
res #array([1, 2])
res.dtype #dtype('int32') or dtype('int64'), depending on the platform
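A minimal sketch of the difference, using a hypothetical 27-entry dictionary `d` built to mirror the one in the question. The `np.fromiter` variant is an alternative that is not part of the answer above, and the `int64` dtype is an assumption about the data:

```python
import numpy as np

# Hypothetical dictionary with 27 integer values, mirroring the question.
d = {str(i): i for i in range(27)}

# Wrapping the view directly yields a 0-d object array holding the view itself.
wrapped = np.asarray(d.values())
print(wrapped.shape)    # ()
print(wrapped.dtype)    # object

# Materialising the view as a list gives the expected 1-D array.
from_list = np.asarray(list(d.values()))
print(from_list.shape)  # (27,)

# Alternative: build the array straight from the iterable, stating the dtype.
from_iter = np.fromiter(d.values(), dtype=np.int64, count=len(d))
print(from_iter.shape)  # (27,)
```

The list() version is usually the simplest; np.fromiter may be worth considering when the dictionary is large and the element type is known in advance.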
I'm trying to create a NumPy npz file whose data keys are positions and metadata. Its output should look like this:
#sample output
['positions', 'metadata'] #data keys when I print file_name.keys()
{'num_pos': 10, 'keypoints': [[4, 5, 6, 10, 11, 12], [1, 2, 3, 13, 14, 15]]} #values of metadata in dictionary when I print file_name['metadata']
I want the same output as above. Below is my Python code to produce the required npz file.
#code sample
positions = [] #this step is working and values are saved in npz file, so I'm just skipping this step, my problem is in metadata key which is given below
metadata = {
'num_pos': 10,
'keypoints': [[4, 5, 6, 10, 11, 12], [1, 2, 3, 13, 14, 15]]
}
positions = np.array(positions).astype(np.float32)
np.savez_compressed('file_name.npz', positions=positions, metadata=metadata)
With the above code I get an npz file that contains the values of positions but not the values of metadata. When I print file_name.keys(), the output is ['positions', 'metadata'], which is fine, but when I print file_name['metadata'] I get the following error:
ValueError: unsupported pickle protocol: 3
Any suggestions would be appreciated.
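One possible workaround, sketched under assumptions rather than confirmed against the asker's setup. A dict passed to np.savez_compressed is stored as a pickled object array, and pickle protocol 3 can only be read by Python 3, so this error typically appears when the file is written under Python 3 and read under Python 2. Serialising the metadata to a JSON string avoids pickling altogether. The sample positions and the temporary path below are made up for illustration:

```python
import json
import os
import tempfile

import numpy as np

# Made-up sample data standing in for the asker's real positions.
positions = np.zeros((2, 3), dtype=np.float32)
metadata = {
    'num_pos': 10,
    'keypoints': [[4, 5, 6, 10, 11, 12], [1, 2, 3, 13, 14, 15]],
}

# Hypothetical output location in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'file_name.npz')

# Store the metadata as a JSON string; it is saved as a plain string
# array, so no pickling is involved.
np.savez_compressed(path, positions=positions, metadata=json.dumps(metadata))

with np.load(path) as data:
    keys = sorted(data.files)
    # .item() turns the 0-d string array back into a Python str.
    restored = json.loads(data['metadata'].item())

print(keys)
print(restored)
```

This only works when the metadata is JSON-serialisable (numbers, strings, lists, dicts). If the dict has to be stored as-is, reading it back needs np.load(..., allow_pickle=True) under a Python version that understands the pickle protocol used to write it.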
Is there a simple way of flattening an xarray dataset into a single 1D numpy array?
For example, flattening the following test dataset:
xr.Dataset({
'a' : xr.DataArray(
data=[10,11,12,13,14],
coords={'x':[0,1,2,3,4]},
dims={'x':5}
),
'b' : xr.DataArray(data=1,coords={'y':0}),
'c' : xr.DataArray(data=2,coords={'y':0}),
'd' : xr.DataArray(data=3,coords={'y':0})
})
to
[10,11,12,13,14,1,2,3]
?
If you're OK with repeated values, you can use .to_array() and then flatten the values in NumPy, e.g.,
>>> ds.to_array().values.ravel()
array([10, 11, 12, 13, 14, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3,
3, 3, 3])
If you don't want repeated values, then you'll need to write something yourself, e.g.,
>>> np.concatenate([v.values.ravel() for v in ds.data_vars.values()])
array([10, 11, 12, 13, 14, 1, 2, 3])
More generally, this sounds somewhat similar to a proposed interface for "stacking" data variables in 2D for machine learning applications: https://github.com/pydata/xarray/issues/1317
As of July 2019, xarray has the methods Dataset.to_stacked_array and DataArray.to_unstacked_dataset, which perform this kind of stacking and unstacking.
Get Dataset from question:
import numpy as np
import xarray as xr

ds = xr.Dataset({
    'a' : xr.DataArray(
        data=[10,11,12,13,14],
        coords={'x':[0,1,2,3,4]},
        dims=['x']  # dims takes the dimension names, not their sizes
    ),
    'b' : xr.DataArray(data=1,coords={'y':0}),
    'c' : xr.DataArray(data=2,coords={'y':0}),
    'd' : xr.DataArray(data=3,coords={'y':0})
})
Get the list of data variables:
variables = ds.data_vars
Use the ndarray.flatten() method to reduce each array to 1D:
arrays = [ ds[i].values.flatten() for i in variables ]
Then flatten the list of 1D arrays into a single list (as detailed in this answer):
arrays = [i for j in arrays for i in j ]
Now convert this list to an array, as requested in the question:
array = np.array(arrays)
I have a dictionary like this:
dict_connected_hosts = {
{'10.0.0.2': [[12564.0, 6844.0, 632711.0, 56589,0, 4856,0], <ryu.controller.controller.Datapath object at 0x7f2b2008a7d0>, '10.0.0.2', '10.0.0.1', 2, datetime.datetime(2017, 9, 26, 2, 24, 12, 301565)]}
{'10.0.0.3': [[3193.0, 621482.0, 6412.0, 2146.0, 98542.0], <ryu.controller.controller.Datapath object at 0x7f2b2008a7d0>, '10.0.0.3', '10.0.0.1', 3, datetime.datetime(2017, 9, 26, 2, 24, 12, 302224)]
{'10.0.0.7': [[4545.0, 51442.0, 325.0, 452.0, 3555.0], <ryu.controller.controller.Datapath object at 0x7f2b2008a7d0>, '10.0.0.7', '10.0.0.1', 3, datetime.datetime(2017, 9, 26, 2, 24, 12, 302250)]
}
How can I sum the first number of each list in the value field? In simple terms, the numbers
`12564.0 + 3193.0 + 4545.0`
Thanks
I have debugged your dictionary structure. The relevant part of it should be:
{
'10.0.0.2': [[12564.0, 6844.0, 632711.0, 56589,0, 4856,0]],
'10.0.0.3': [[3193.0, 621482.0, 6412.0, 2146.0, 98542.0]],
'10.0.0.7': [[4545.0, 51442.0, 325.0, 452.0, 3555.0]]
}
Note: I am ignoring the other elements in the values, as they are not relevant to the question (and they have errors I don't care to debug).
So, to get the sum of the first numbers in the first list of each value, you can use a list comprehension:
#suppose `a` is the dictionary
print([val[0][0] for val in a.values()])
#[12564.0, 3193.0, 4545.0]
print(sum( [val[0][0] for val in a.values()] ))
#20302.0
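As a self-contained sketch of the same idea, the snippet below rebuilds the cleaned-up dictionary (with the name `a` assumed, as in the answer) and passes a generator expression straight to sum(), which avoids building the intermediate list:

```python
# Cleaned-up dictionary: only the first element of each value is kept.
a = {
    '10.0.0.2': [[12564.0, 6844.0, 632711.0, 56589, 0, 4856, 0]],
    '10.0.0.3': [[3193.0, 621482.0, 6412.0, 2146.0, 98542.0]],
    '10.0.0.7': [[4545.0, 51442.0, 325.0, 452.0, 3555.0]],
}

# val[0] is the inner list of numbers; val[0][0] is its first number.
first_numbers = [val[0][0] for val in a.values()]
total = sum(val[0][0] for val in a.values())

print(first_numbers)  # [12564.0, 3193.0, 4545.0]
print(total)          # 20302.0
```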
I am trying to create a list of values that correspond to a string by comparing each character of my string to that of my "alpha_list". This is for an encoding procedure, so that the numerical values can be added later.
I keep getting errors with each of the many different approaches I have tried.
import string
alpha_list = " ABCDEFGHIJKLMNOPQRSTUVWXYZ"
ints = "HELLO WORLD"
myotherlist = []
for idx, val in enumerate(ints):
    myotherlist[idx] = alpha_list.index(val)
print(myotherlist)
This is the error I currently get:
Traceback (most recent call last):
File "C:/Users/Derek/Desktop/Python/test2.py", line 11, in <module>
myotherlist[idx] = alpha_list.index(val)
IndexError: list assignment index out of range
I am pretty new to Python, so if I am making a ridiculously obvious mistake, please feel free to criticize.
The print(myotherlist) output that I am looking for should look something like this:
[8, 5, 12, 12, 15, 0, 23, 15, 18, 12, 4]
Just use append:
for val in ints:
    myotherlist.append(alpha_list.index(val))
print(myotherlist)
myotherlist is an empty list, so you cannot assign to myotherlist[idx]: there is no element 0, 1, and so on yet.
Or just use a list comprehension:
my_other_list = [alpha_list.index(val) for val in ints]
Or a functional approach using map (wrapped in list() because map returns an iterator in Python 3):
list(map(alpha_list.index, ints))
Both output:
In [7]: [alpha_list.index(val) for val in ints]
Out[7]: [8, 5, 12, 12, 15, 0, 23, 15, 18, 12, 4]
In [8]: list(map(alpha_list.index, ints))
Out[8]: [8, 5, 12, 12, 15, 0, 23, 15, 18, 12, 4]
import string: you don't need this. Your code never uses the string module, and a bunch of books say it's better to use the built-in str anyway.
myotherlist[idx] = alpha_list.index(val) is why you are getting the error. It says 'go to index idx and put alpha_list.index(val) there', but since the list is empty, there is no such index to assign to.
So if you replace
for idx, val in enumerate(ints):
    myotherlist[idx] = alpha_list.index(val)
with
for letter in ints: #iterates over the 'HELLO WORLD' string
    index_to_append = alpha_list.index(letter)
    myotherlist.append(index_to_append)
you will get the expected result!
If there is something not clear please let me know!
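Putting the pieces together, here is a small runnable sketch of the failure and the fix. The dictionary `lookup` at the end is a hypothetical extra, not something either answer proposes; it only illustrates a way to avoid calling .index() once per character:

```python
alpha_list = " ABCDEFGHIJKLMNOPQRSTUVWXYZ"
ints = "HELLO WORLD"

# Assigning by index into an empty list fails: index 0 does not exist yet.
myotherlist = []
try:
    myotherlist[0] = alpha_list.index(ints[0])
except IndexError as exc:
    error_message = str(exc)
    print(error_message)  # list assignment index out of range

# append grows the list by one element on each iteration.
for letter in ints:
    myotherlist.append(alpha_list.index(letter))
print(myotherlist)  # [8, 5, 12, 12, 15, 0, 23, 15, 18, 12, 4]

# Hypothetical alternative: precompute a character-to-position table.
lookup = {ch: i for i, ch in enumerate(alpha_list)}
via_lookup = [lookup[ch] for ch in ints]
```

Both approaches raise an exception (ValueError for .index(), KeyError for the table) if the string contains a character that is not in alpha_list, such as a lowercase letter.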
I haven't been able to find anything about this value error online and I am at a complete loss as to why my code is eliciting this response.
I have a large dictionary of around 50 keys. The value associated with each key is a 2D array of many elements of the form [datetime object, some other info]. A sample would look like this:
{'some_random_key': array([[datetime(2010, 10, 26, 11, 5, 28, 157404), 14.1],
[datetime(2010, 10, 26, 11, 5, 38, 613066), 17.2]],
dtype=object),
'some_other_key': array([[datetime(2010, 10, 26, 11, 5, 28, 157404), 'true'],
[datetime(2010, 10, 26, 11, 5, 38, 613066), 'false']],
dtype=object)}
What I want my code to do is to allow a user to select a start and stop date and remove all of the array elements (for all of the keys) that are not within that range.
Placing print statements throughout the code I was able to deduce that it can find the dates that are out of range, but for some reason, the error occurs when it attempts to remove the element from the array.
Here is my code:
def selectDateRange(dictionary, start, stop):
    #Make a clone dictionary to delete values from
    theClone = dict(dictionary)
    starting = datetime.strptime(start, '%d-%m-%Y') #put in datetime format
    ending = datetime.strptime(stop+' '+ '23:59', '%d-%m-%Y %H:%M') #put in datetime format
    #Get a list of all the keys in the dictionary
    listOfKeys = theClone.keys()
    #Go through each key in the list
    for key in listOfKeys:
        print key
        #The value associate with each key is an array
        innerAry = theClone[key]
        #Loop through the array and . . .
        for j, value in enumerate(reversed(innerAry)):
            if (value[0] <= starting) or (value[0] >= ending):
                #. . . delete anything that is not in the specified dateRange
                del innerAry[j]
    return theClone
This is the error message that I get:
ValueError: cannot delete array elements
and it occurs at the line: del innerAry[j]
Please help - perhaps you have the eye to see the problem where I cannot.
Thanks!
If you use NumPy arrays, then use them as arrays and not as lists.
NumPy compares elementwise across the entire array, and the resulting boolean mask can be used to select the relevant subarray. This also removes the need for the inner loop.
>>> a = np.array([[datetime(2010, 10, 26, 11, 5, 28, 157404), 14.1],
[datetime(2010, 10, 26, 11, 5, 30, 613066), 17.2],
[datetime(2010, 10, 26, 11, 5, 31, 613066), 17.2],
[datetime(2010, 10, 26, 11, 5, 32, 613066), 17.2],
[datetime(2010, 10, 26, 11, 5, 33, 613066), 17.2],
[datetime(2010, 10, 26, 11, 5, 38, 613066), 17.2]],
dtype=object)
>>> start = datetime(2010, 10, 26, 11, 5, 28, 157405)
>>> end = datetime(2010, 10, 26, 11, 5, 33, 613066)
>>> (a[:,0] > start)&(a[:,0] < end)
array([False, True, True, True, False, False], dtype=bool)
>>> a[(a[:,0] > start)&(a[:,0] < end)]
array([[2010-10-26 11:05:30.613066, 17.2],
[2010-10-26 11:05:31.613066, 17.2],
[2010-10-26 11:05:32.613066, 17.2]], dtype=object)
Just to make sure we still have datetimes in there:
>>> b = a[(a[:,0] > start)&(a[:,0] < end)]
>>> b[0,0]
datetime.datetime(2010, 10, 26, 11, 5, 30, 613066)
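The masking idea above can be applied to the whole dictionary. The sketch below is one way to do it, not the asker's original function: the name `select_date_range` and the sample data are made up, and it returns a new dictionary of filtered arrays instead of deleting rows in place. It keeps the question's date format and its strict comparison against the start and end of the range:

```python
from datetime import datetime

import numpy as np


def select_date_range(dictionary, start, stop):
    """Return a new dict keeping only rows whose timestamp lies in (start, stop)."""
    starting = datetime.strptime(start, '%d-%m-%Y')
    ending = datetime.strptime(stop + ' 23:59', '%d-%m-%Y %H:%M')
    filtered = {}
    for key, arr in dictionary.items():
        times = arr[:, 0]  # first column holds the datetime objects
        mask = (times > starting) & (times < ending)
        filtered[key] = arr[mask]  # boolean indexing returns a new array
    return filtered


# Made-up sample: one row before, one inside, one after the range.
sample = {
    'some_random_key': np.array(
        [[datetime(2010, 10, 25, 11, 5, 28), 14.1],
         [datetime(2010, 10, 26, 11, 5, 38), 17.2],
         [datetime(2010, 10, 28, 0, 0, 0), 1.0]],
        dtype=object),
}

result = select_date_range(sample, '26-10-2010', '26-10-2010')
print(result['some_random_key'])
```

Because boolean indexing builds a new array, the input dictionary and its arrays are left unchanged.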
NumPy arrays are fixed in size, so del cannot remove individual elements from them. Use lists instead.