Saving a Python dictionary holding a numpy array to a file [duplicate] - python

This question already has answers here:
Saving dictionary of numpy arrays
(3 answers)
Closed 1 year ago.
I need to save a Python dictionary of the form {key: [numpy array]} to a file (it doesn't matter whether the file is human-readable; I just need to be able to load it back into the same format later). I'd prefer to keep the numpy array as a numpy array, as it is very large. The array is also 300-dimensional, so iterating through it would be impractical. I haven't seen any other questions asking this, and it doesn't look like I can use the numpy save methods because I am saving a dictionary. JSON does not work because the dictionary contains a numpy array. Does anyone know how I can do this?
Thanks :)

The pickle module can handle numpy arrays. It is used almost exactly like the json module.
>>> import pickle
>>> import numpy as np
>>> a = np.arange(200).reshape((20,10))
>>> pickle.dump( a, open('xxx.bin','wb') )
... exit and reload ...
>>> import pickle
>>> pickle.load(open('xxx.bin','rb'))
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[ 10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
[ 20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
[ 30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
[ 40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
[ 50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
[ 60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
[ 70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
[ 80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
[ 90, 91, 92, 93, 94, 95, 96, 97, 98, 99],
[100, 101, 102, 103, 104, 105, 106, 107, 108, 109],
[110, 111, 112, 113, 114, 115, 116, 117, 118, 119],
[120, 121, 122, 123, 124, 125, 126, 127, 128, 129],
[130, 131, 132, 133, 134, 135, 136, 137, 138, 139],
[140, 141, 142, 143, 144, 145, 146, 147, 148, 149],
[150, 151, 152, 153, 154, 155, 156, 157, 158, 159],
[160, 161, 162, 163, 164, 165, 166, 167, 168, 169],
[170, 171, 172, 173, 174, 175, 176, 177, 178, 179],
[180, 181, 182, 183, 184, 185, 186, 187, 188, 189],
[190, 191, 192, 193, 194, 195, 196, 197, 198, 199]])
>>>
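The session above pickles a bare array, but the same calls work for a dictionary whose values are arrays. Here is a minimal sketch of that round trip, assuming string keys and 300-element vectors; the names `embeddings`, `from_pickle` and `from_npz` and the file names are made up for illustration. It also shows `np.savez` as an alternative, which is only usable when the keys are strings.

```python
import os
import pickle
import tempfile

import numpy as np

# Hypothetical example data: each key maps to a 300-dimensional vector.
embeddings = {
    "cat": np.random.rand(300),
    "dog": np.random.rand(300),
}

tmp_dir = tempfile.mkdtemp()

# Option 1: pickle the whole dictionary in one call.
pickle_path = os.path.join(tmp_dir, "embeddings.pkl")
with open(pickle_path, "wb") as f:
    pickle.dump(embeddings, f, protocol=pickle.HIGHEST_PROTOCOL)

with open(pickle_path, "rb") as f:
    from_pickle = pickle.load(f)

# Option 2: np.savez stores one array per keyword argument,
# so this only works when the dictionary keys are strings.
npz_path = os.path.join(tmp_dir, "embeddings.npz")
np.savez(npz_path, **embeddings)

with np.load(npz_path) as archive:
    from_npz = {key: archive[key] for key in archive.files}
```

Both variants give back a dictionary of numpy arrays. Keep in mind that unpickling executes code from the file, so pickle files should only be loaded from sources you trust.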

Related

How to apply slicing to a pandas DataFrame? [duplicate]

This question already has answers here:
How to slice a pandas DataFrame by position?
(5 answers)
Closed 24 days ago.
I am trying to replace the following code:
DfInt['Closest Service'] = DfInt[
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58,
59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77,
78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96,
97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111,
112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126,
127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141,
142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156,
157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171,
172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186,
187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201,
202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216,
217, 218, 219, 220, 221, 222, 223]
].idxmin(axis=1)
by something like
DfInt['Closest Service'] = DfInt[[0:224]].idxmin(axis=1)
But this is not working. Does anyone have an idea?
If you need to select by labels, use DataFrame.loc; the first : selects all rows. Label slices include the end label, hence :223:
DfInt['Closest Service'] = DfInt.loc[:, :223].idxmin(axis=1)
To select by position (the first 224 columns), use DataFrame.iloc with :224, because positional slices exclude the end:
DfInt['Closest Service'] = DfInt.iloc[:, :224].idxmin(axis=1)
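To see the difference between the two slices on something small, here is a sketch with a made-up 3x6 frame (the name `df` and its values are assumptions, not the asker's data). With integer column labels 0..5, both slices end up selecting columns 0 to 3.

```python
import pandas as pd

# Hypothetical distance table: 3 rows, integer column labels 0..5.
df = pd.DataFrame(
    [[5, 2, 9, 7, 1, 4],
     [3, 8, 6, 0, 2, 9],
     [7, 7, 1, 5, 0, 3]],
    columns=range(6),
)

# Label-based slice: the end label 3 is included (columns 0, 1, 2, 3).
by_label = df.loc[:, :3].idxmin(axis=1)

# Position-based slice: the end position 4 is excluded (also columns 0..3).
by_position = df.iloc[:, :4].idxmin(axis=1)

print(by_label.tolist())     # [1, 3, 2]
print(by_position.tolist())  # [1, 3, 2]
```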

How to convert an array into a new array based on a lookup dictionary

I'm trying to convert a numpy array into a new array by using each value in the existing array and finding its corresponding key from a dictionary. The new array should consist of the corresponding dictionary keys.
Here is what I have:
# dictionary where values are lists
available_weights = {0.009174311926605505: [7, 14, 21, 25, 31, 32, 35, 45, 52, 82, 83, 96, 112, 119, 142], 0.009523809523809525: [33, 37, 43, 44, 69, 73, 75, 78, 79, 80, 102, 104, 110, 115, 150], 0.1111111111111111: [91], 0.019230769230769232: [36, 50, 127, 139], 0.010869565217391304: [10, 48, 55, 62, 77, 88, 103, 124, 131, 137, 147], 0.014084507042253521: [2, 3, 4, 22, 27, 30, 41, 53, 87, 122, 123, 132, 143], 0.011494252873563218: [20, 34, 99, 125, 135, 138, 141], 0.045454545454545456: [0, 109], 0.01818181818181818: [49, 64, 72, 90, 146, 148], 0.07142857142857142: [106], 0.01282051282051282: [16, 63, 68, 98, 114, 130, 145], 0.010638297872340425: [8, 28, 40, 57, 61, 66, 71, 74, 76, 84, 85, 86, 128, 144], 0.02040816326530612: [6, 65], 0.021739130434782608: [29, 67, 92, 93], 0.02127659574468085: [47, 118, 120], 0.011111111111111112: [1, 13, 19, 24, 42, 54, 70, 89, 94, 107, 117, 126, 129, 140], 0.015625: [38, 60, 101, 133, 134, 136], 0.03333333333333333: [56, 58, 97, 121], 0.016666666666666666: [5, 26, 105, 113], 0.014705882352941176: [17, 46, 95]}
# existing numpy array
train_idx = [134, 45, 137, 140, 79, 98, 128, 80, 99, 71, 145, 35, 94, 122, 77, 23, 113, 44, 68, 21, 20, 125, 74, 139, 29, 109, 25, 34, 6, 81, 22, 114, 12, 95, 150, 106, 84, 19, 58, 59, 88, 143, 136, 43, 72, 132, 117, 13, 65, 111, 39, 14, 56, 11, 26, 90, 119, 112, 27, 57, 46, 147, 123, 16, 36, 100, 141, 38, 62, 32, 75, 146, 89, 37, 31, 40, 64, 87, 3, 103, 102, 104, 78, 53, 1, 142, 47, 130, 105, 4, 93, 52, 42, 10, 9, 115, 76, 54, 49, 116, 69, 5, 86, 66, 101, 107, 96, 110, 8, 73, 121, 138, 67, 124, 108, 97, 120, 2, 148, 127, 135, 18, 149, 82, 41, 144, 129, 118, 51, 126, 33, 85, 24, 0, 61, 92, 70, 15, 17, 50, 83, 30, 28, 91, 60, 48, 133, 55, 63, 7, 131]
So I want to use each value in train_idx to find the corresponding dictionary key in available_weights. The expected output should look like this (with a length of all 150 values):
new_array = [0.015625, 0.009174311926605505, 0.010869565217391304, ... ,0.01282051282051282, 0.009174311926605505, 0.010869565217391304]
Any help would be appreciated!
# For each index in train_idx, find the weight (dictionary key) whose list contains it
result = []
flipped = dict()
for value in train_idx:
    flipped[value] = []
    for key in available_weights:
        if value in available_weights[key]:
            flipped[value].append(key)
            result.append(key)
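The nested loop above rescans every list for every index. An alternative is to invert the dictionary once and then do plain lookups. This is only a sketch on a small stand-in for the question's data; `index_to_weight` is a name introduced here, and filling missing indices with NaN is an assumption about what is wanted when an index appears in no list.

```python
import numpy as np

# Small stand-in for the question's data (hypothetical values).
available_weights = {
    0.5: [0, 3],
    0.25: [1, 4, 5],
    0.125: [2],
}
train_idx = np.array([4, 0, 2, 7, 1])  # 7 appears in no list

# Invert the mapping once: index -> weight.
index_to_weight = {
    idx: weight
    for weight, indices in available_weights.items()
    for idx in indices
}

# Look up every index; indices without a weight become NaN (an assumption).
new_array = np.array([index_to_weight.get(int(i), np.nan) for i in train_idx])
print(new_array)
```

Unlike the loop version, the output here always has the same length as train_idx, so positions stay aligned even when an index has no weight.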

Pytorch Tensor - How to get the index of a tensor given a multidimensional tensor

I have the following tensor, let's call it lookup_table:
tensor([266, 103, 84, 12, 32, 34, 1, 523, 22, 136, 268, 432, 53, 63,
201, 51, 164, 69, 31, 42, 122, 131, 119, 36, 245, 60, 28, 81,
9, 114, 105, 3, 41, 86, 150, 79, 104, 120, 74, 420, 39, 427,
40, 59, 24, 126, 202, 222, 145, 429, 43, 30, 38, 55, 10, 141,
85, 121, 203, 240, 96, 7, 64, 89, 127, 236, 117, 99, 54, 90,
57, 11, 21, 62, 82, 25, 267, 75, 111, 518, 76, 56, 20, 2,
61, 516, 80, 78, 555, 246, 133, 497, 33, 421, 58, 107, 92, 68,
13, 113, 235, 875, 35, 98, 102, 27, 14, 15, 72, 37, 16, 50,
517, 134, 223, 163, 91, 44, 17, 412, 18, 48, 23, 4, 29, 77,
6, 110, 67, 45, 161, 254, 112, 8, 106, 19, 498, 101, 5, 157,
83, 350, 154, 238, 115, 26, 142, 143])
And I have another tensor, let's call it data, which looks like this:
tensor([[517, 235, 236, 76, 81, 25, 110, 59, 245, 39],
[523, 114, 350, 246, 30, 222, 39, 517, 106, 2],
[ 35, 235, 120, 99, 266, 63, 236, 133, 412, 38],
[134, 2, 497, 21, 78, 60, 142, 498, 24, 89],
[ 60, 111, 120, 145, 91, 141, 164, 81, 350, 55]])
Now I want something which looks similar to this:
tensor([[112, 100, ..., 40],
        [  7,  29, ...,  2],
        ...])
I want to use my data tensor to get the index of the lookup table.
Basically I want to vectorize this:
(lookup_table == data).nonzero()
So that this works for multidimensional arrays.
I have read these, but they do not work for my case:
How Pytorch Tensor get the index of specific value
How Pytorch Tensor get the index of elements?
Pytorch tensor - How to get the indexes by a specific tensor
EDIT:
I am basically searching for an optimized/vectorized version of this:
x_data = torch.stack([(lookuptable == data[0][i]).nonzero(as_tuple=False) for i in range(len(data[0]))]).flatten().unsqueeze(0)
print(x_data.size())
for o in range(1, len(data)):
    x_data = torch.cat((x_data, torch.stack([(lookuptable == data[o][i]).nonzero(as_tuple=False) for i in range(len(data[o]))]).flatten().unsqueeze(0)), dim=0)
EDIT 2 Minimal example:
We have the data tensor:
data = torch.Tensor([
[523, 114, 350, 246, 30, 222, 39, 517, 106, 2],
[ 35, 235, 120, 99, 266, 63, 236, 133, 412, 38],
[555, 104, 14, 81, 55, 497, 222, 64, 57, 131]
])
And we have the lookup_table tensor, see above.
If we apply this code to the 2 tensors:
# convert champion keys into index notation
x_data = torch.stack([(lookuptable == data[0][i]).nonzero(as_tuple=False) for i in range(len(data[0]))]).flatten().unsqueeze(0)
for o in range(1, len(data)):
    x_data = torch.cat((x_data, torch.stack([(lookuptable == data[o][i]).nonzero(as_tuple=False) for i in range(len(data[o]))]).flatten().unsqueeze(0)), dim=0)
We get an output of this:
tensor([[ 7, 29, 141, 89, 51, 47, 40, 112, 134, 83],
[102, 100, 37, 67, 0, 13, 65, 90, 119, 52],
[ 88, 36, 106, 27, 53, 91, 47, 62, 70, 21]
])
This output is what I want; as I said above, it's the index at which each value of the tensor data lies in the tensor lookuptable.
The problem is that this is not vectorized,
and I have no idea how to vectorize it.
Using searchsorted:
Scanning the whole lookup_table array for each input element is quite inefficient. How about sorting the lookup table first (this only needs to be done once)
sorted_lookup_table, indexes = torch.sort(lookup_table)
and then using searchsorted
index_into_sorted = torch.searchsorted(sorted_lookup_table, data)
If you need an index into the original lookup_table, you can get it with
index_into_lookup_table = indexes[index_into_sorted]
Another, faster, approach that assumes that all values have a limited range, and are int64 (Here, I also assume that they are non-negative, but this limitation can be worked around):
Prep work:
sorted_lookup_table, indexes = torch.sort(lookup_table)
lut = torch.zeros(size=(sorted_lookup_table[-1]+1,), dtype=torch.int64)
lut[:] = -1 # "not found"
lut[sorted_lookup_table] = indexes
Data processing:
index_into_lookup_table = lut[data]
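Putting both approaches together on a shortened example may make them easier to compare. This is a sketch, not a benchmark: the two tensors are cut-down stand-ins for the ones in the question, and it assumes unique, non-negative int64 values in lookup_table. It builds the dense table with torch.full and int(lookup_table.max()) rather than the lines above, which is just one way to write the same idea.

```python
import torch

# Shortened, hypothetical stand-ins for the question's tensors.
lookup_table = torch.tensor([266, 103, 84, 12, 32, 34, 1, 523])
data = torch.tensor([[523, 12, 1],
                     [103, 266, 34]])

# Approach 1: sort once, binary-search every query, map back to original positions.
sorted_table, order = torch.sort(lookup_table)
pos_in_sorted = torch.searchsorted(sorted_table, data)
idx_searchsorted = order[pos_in_sorted]

# Approach 2: dense table indexed by value (non-negative int64 values only).
lut = torch.full((int(lookup_table.max()) + 1,), -1, dtype=torch.int64)  # -1 = "not found"
lut[lookup_table] = torch.arange(lookup_table.numel())
idx_dense = lut[data]

print(idx_dense)
# tensor([[7, 3, 6],
#         [1, 0, 5]])
```

Note that searchsorted returns an insertion position, so approach 1 is only meaningful when every value in data is actually present in lookup_table. The dense table marks absent values with -1 instead, as long as they fall within the table's range.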

numpy - make repeated random number blocks (noise image) [duplicate]

This question already has answers here:
How to repeat elements of an array along two axes?
(5 answers)
Quick way to upsample numpy array by nearest neighbor tiling [duplicate]
(3 answers)
Closed 4 years ago.
I want to make a "noise" image. If I do
img = np.random.randint(0, 255, (4, 4), dtype=np.uint8)
print(img)
out:
array([[150, 45, 246, 137],
[195, 141, 246, 197],
[206, 126, 188, 76],
[134, 168, 166, 190]])
Every pixel is different. But what if I want larger 'pixels', e.g.:
array([[150, 150, 246, 246],
[150, 150, 246, 246],
[206, 206, 188, 188],
[206, 206, 188, 188]])
How do I do something like this?
You can use np.kron:
>>> np.kron(np.random.randint(0, 256, (4, 4)), np.ones((2, 2), int))
array([[252, 252, 51, 51, 10, 10, 124, 124],
[252, 252, 51, 51, 10, 10, 124, 124],
[161, 161, 137, 137, 8, 8, 89, 89],
[161, 161, 137, 137, 8, 8, 89, 89],
[ 12, 12, 24, 24, 37, 37, 98, 98],
[ 12, 12, 24, 24, 37, 37, 98, 98],
[151, 151, 149, 149, 147, 147, 15, 15],
[151, 151, 149, 149, 147, 147, 15, 15]])
Or np.repeat (once for each dimension):
>>> np.repeat(np.repeat(np.random.randint(0, 256, (4, 4)), 2, 0), 2, 1)
array([[ 41, 41, 29, 29, 103, 103, 67, 67],
[ 41, 41, 29, 29, 103, 103, 67, 67],
[231, 231, 203, 203, 231, 231, 157, 157],
[231, 231, 203, 203, 231, 231, 157, 157],
[ 18, 18, 126, 126, 15, 15, 196, 196],
[ 18, 18, 126, 126, 15, 15, 196, 196],
[198, 198, 152, 152, 74, 74, 211, 211],
[198, 198, 152, 152, 74, 74, 211, 211]])
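For reuse, the np.repeat variant can be wrapped in a small function. The sketch below is one possible shape for it; `block_noise` and its parameters are names invented here, and it uses the newer `np.random.default_rng` generator instead of `np.random.randint`. The upper bound is exclusive in both APIs, so 256 is needed for 255 to be a possible value.

```python
import numpy as np

def block_noise(blocks_shape, block_size, seed=None):
    """Return a uint8 noise image made of block_size x block_size constant tiles."""
    rng = np.random.default_rng(seed)
    # One random value per tile; high=256 is exclusive, so values span 0..255.
    coarse = rng.integers(0, 256, size=blocks_shape, dtype=np.uint8)
    # Repeat each value along rows, then along columns.
    return coarse.repeat(block_size, axis=0).repeat(block_size, axis=1)

img = block_noise((2, 2), block_size=2, seed=0)
print(img.shape)  # (4, 4)
print(img)
```

Passing a seed makes the image reproducible, which is handy when the noise is used in tests.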

Limit elements from a list in python 2.7 [duplicate]

This question already has answers here:
Understanding slicing
(38 answers)
Closed 7 years ago.
I have a python list of 300+ elements. I am trying to limit the print of the elements from the list, in order to show only 10 at a time.
Here is an example with only 5 elements inside a list, and I want to show just 2:
items = ["one", "two", "three", "four", "five"]
max_num = 2
for item in items:
    # here I am not sure how I restrict the number elements from items
    # do something
There are multiple ways of doing it.
1. (Pythonic way)
As @aa333 suggested:
for item in items[:max_num]:
    print(item)
2. Indexing directly (possibly a bit faster, but raises IndexError if the list has fewer than max_num elements):
for i in xrange(max_num):
    print(items[i])
3. Using a while loop:
counter = 0
while counter < max_num:
    print(items[counter])
    counter += 1
Try this.
a = range(300)
limit = 10
for i in range(len(a)):
    i = i * limit
    print a[i:(i + limit)]
    if i > (len(a) - limit - 1):
        break
This is the basic answer. I am sure better logic is available than this.
Output is like:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
[20, 21, 22, 23, 24, 25, 26, 27, 28, 29]......
Hope this is the desired output.
Simple way:
>>> m = 300
>>> k=range(m)
>>> s = 0
>>> e = 10
>>> while m > 0:
...     print k[s:e]
...     s += 10
...     e += 10
...     m -= 10
...
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
[20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
[30, 31, 32, 33, 34, 35, 36, 37, 38, 39]
[40, 41, 42, 43, 44, 45, 46, 47, 48, 49]
[50, 51, 52, 53, 54, 55, 56, 57, 58, 59]
[60, 61, 62, 63, 64, 65, 66, 67, 68, 69]
[70, 71, 72, 73, 74, 75, 76, 77, 78, 79]
[80, 81, 82, 83, 84, 85, 86, 87, 88, 89]
[90, 91, 92, 93, 94, 95, 96, 97, 98, 99]
[100, 101, 102, 103, 104, 105, 106, 107, 108, 109]
[110, 111, 112, 113, 114, 115, 116, 117, 118, 119]
[120, 121, 122, 123, 124, 125, 126, 127, 128, 129]
[130, 131, 132, 133, 134, 135, 136, 137, 138, 139]
[140, 141, 142, 143, 144, 145, 146, 147, 148, 149]
[150, 151, 152, 153, 154, 155, 156, 157, 158, 159]
[160, 161, 162, 163, 164, 165, 166, 167, 168, 169]
[170, 171, 172, 173, 174, 175, 176, 177, 178, 179]
[180, 181, 182, 183, 184, 185, 186, 187, 188, 189]
[190, 191, 192, 193, 194, 195, 196, 197, 198, 199]
[200, 201, 202, 203, 204, 205, 206, 207, 208, 209]
[210, 211, 212, 213, 214, 215, 216, 217, 218, 219]
[220, 221, 222, 223, 224, 225, 226, 227, 228, 229]
[230, 231, 232, 233, 234, 235, 236, 237, 238, 239]
[240, 241, 242, 243, 244, 245, 246, 247, 248, 249]
[250, 251, 252, 253, 254, 255, 256, 257, 258, 259]
[260, 261, 262, 263, 264, 265, 266, 267, 268, 269]
[270, 271, 272, 273, 274, 275, 276, 277, 278, 279]
[280, 281, 282, 283, 284, 285, 286, 287, 288, 289]
[290, 291, 292, 293, 294, 295, 296, 297, 298, 299]
If I understood the question correctly, it could be done in the following way:
from math import floor
from numpy import arange
max_num = 10.
lis = arange(0, 404)
len_num = int(floor(float(len(lis)) / max_num))
for i in range(len_num):
    print lis[i*int(max_num):(i+1)*int(max_num)]
    # raw_input() waits for Enter; in Python 2, input() would try to evaluate what is typed
    wait = raw_input()
print(lis[(i+1)*int(max_num):])
This way, the list is shown slice by slice, and the next slice isn't shown until the user presses Enter.
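The answers above all boil down to slicing in steps of a fixed size. As a sketch, that can be written as a small generator; note that this uses Python 3 syntax rather than the 2.7 of the question, and `chunks` is just a name chosen here, not a built-in.

```python
def chunks(items, size):
    """Yield successive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

items = list(range(25))
for chunk in chunks(items, 10):
    print(chunk)
```

The last chunk is simply shorter when the length is not a multiple of the chunk size, so no separate handling of the remainder is needed.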
