I have a dictionary (named distances) which looks like this :
{0: {0: 122.97560733739029, 1: 208.76062847194152, 2: 34.713109915419565}, 1: {0: 84.463009655114703, 1: 20.83266665599966, 2: 237.6299644405141}, 2: {0: 27.018512172212592, 1: 104.38390680559911, 2: 137.70257804413103}}
What I need to do is find the minimum value for each key and store the index (inner key) of that minimum separately. I have written this code for that:
weights_indexes = {}
for index1 in distances:
    min_dist = min(distances[index1], key=distances[index1].get)
    weights_indexes[index1] = min_dist
The output for this, looks like :
{0: 2, 1: 1, 2: 0}
The issue is that the indexes should always be unique. Say we now have a dictionary like:
{0: {0: 34.713109915419565, 1: 208.76062847194152, 2: 122.97560733739029}, 1: {0: 84.463009655114703, 1: 20.83266665599966, 2: 237.6299644405141}, 2: {0: 27.018512172212592, 1: 104.38390680559911, 2: 137.70257804413103}}
Then the output of finding the minimum indexes for this will be:
{0: 0, 1: 1, 2: 0}
Here, the indexes (values) obtained are not unique. In this case, the values at the duplicated index have to be compared, so 34.713109915419565 and 27.018512172212592 are compared. Since 27.018512172212592 is smaller, its key keeps index 0. Key 0 is then mapped to the index of its next smallest value, which is the index of 122.97560733739029. So the final mapping will look like:
{0: 2, 1: 1, 2: 0}
This should repeat until all the values are unique.
I am not able to figure out how to check for uniqueness and then iteratively keep finding the next minimum to build the mapping.
Here is a workable solution:
test = {0: {0: 12.33334444, 1: 208.76062847194152, 2: 34.713109915419565}, 1: {0: 84.463009655114703, 1: 20.83266665599966, 2: 237.6299644405141}, 2: {0: 27.018512172212592, 1: 104.38390680559911, 2: 137.70257804413103}}
sorted_index_map = {}
for key, value in test.items():
    # inner keys of each row, ordered from smallest to largest value
    sorted_index_map[key] = sorted(value, key=lambda k: value[k])
# position in sorted_index_map that each key currently uses (0 = minimum)
index_of_min_index_map = {key: 0 for key in test}
need_to_check_duplicate = True
while need_to_check_duplicate:
    need_to_check_duplicate = False
    min_index_map = {key: sorted_index_map[key][i] for key, i in index_of_min_index_map.items()}
    index_set = list(min_index_map.values())
    for key, index in min_index_map.items():
        if index_set.count(index) == 1:
            continue
        else:
            for key_to_check, index_to_check in min_index_map.items():
                if key != key_to_check and index == index_to_check:
                    if test[key][index] > test[key_to_check][index_to_check]:
                        # this key loses the index, so move it to its next smallest one
                        index_of_min_index_map[key] += 1
                        need_to_check_duplicate = True
                        break
result = {key: sorted_index_map[key][i] for key, i in index_of_min_index_map.items()}
print(result)
The result:
{0: 0, 1: 1, 2: 2}
Break down:
First, sort the indexes of each key by their values:
test = {0: {0: 12.33334444, 1: 208.76062847194152, 2: 34.713109915419565}, 1: {0: 84.463009655114703, 1: 20.83266665599966, 2: 237.6299644405141}, 2: {0: 27.018512172212592, 1: 104.38390680559911, 2: 137.70257804413103}}
sorted_index_map = {}
for key, value in test.items():
    sorted_index_map[key] = sorted(value, key=lambda k: value[k])
Then, for each key, the index of the minimum value is the first entry in its sorted_index_map list, so every key starts at position 0:
index_of_min_index_map = {key: 0 for key in test}
Now we need to check whether there are any duplicate indexes. If there are, every key that shares an index but does not hold the smallest value for it shifts to its next smallest index, i.e. the next entry in that key's sorted_index_map list. If there are no duplicates, we're done.
need_to_check_duplicate = True
while need_to_check_duplicate:
    need_to_check_duplicate = False
    min_index_map = {key: sorted_index_map[key][i] for key, i in index_of_min_index_map.items()}
    index_set = list(min_index_map.values())
    for key, index in min_index_map.items():
        if index_set.count(index) == 1:
            continue
        else:
            for key_to_check, index_to_check in min_index_map.items():
                if key != key_to_check and index == index_to_check:
                    if test[key][index] > test[key_to_check][index_to_check]:
                        index_of_min_index_map[key] += 1
                        need_to_check_duplicate = True
                        break
Note that you haven't mentioned how to handle the index when two values are identical, so I assume there won't be any.
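The same idea can be packed into a function. This is only a sketch: the name unique_min_indexes is made up for illustration, it assumes there are at least as many inner keys as outer keys (otherwise a key can run out of candidates), and it breaks ties in favour of the smaller outer key, which is an assumption the question does not cover.

```python
def unique_min_indexes(distances):
    # Inner keys of every row, ordered from smallest to largest value.
    ranked = {k: sorted(row, key=row.get) for k, row in distances.items()}
    # Position in `ranked` that each outer key currently uses (0 = minimum).
    pos = {k: 0 for k in distances}
    while True:
        # Group the outer keys by the inner key they currently pick.
        claims = {}
        for k in distances:
            claims.setdefault(ranked[k][pos[k]], []).append(k)
        conflicts = {idx: ks for idx, ks in claims.items() if len(ks) > 1}
        if not conflicts:
            return {k: ranked[k][pos[k]] for k in distances}
        for idx, ks in conflicts.items():
            # Smallest value keeps the index; ties go to the smaller outer key (assumption).
            winner = min(ks, key=lambda k: (distances[k][idx], k))
            for k in ks:
                if k != winner:
                    pos[k] += 1  # loser moves on to its next smallest index


distances = {
    0: {0: 34.713109915419565, 1: 208.76062847194152, 2: 122.97560733739029},
    1: {0: 84.463009655114703, 1: 20.83266665599966, 2: 237.6299644405141},
    2: {0: 27.018512172212592, 1: 104.38390680559911, 2: 137.70257804413103},
}
print(unique_min_indexes(distances))  # {0: 2, 1: 1, 2: 0}
```

Grouping the keys by the index they claim avoids the repeated count() calls, but the result should be the same as the loop above when all values are distinct.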
I have a pickle file which contains the following data stored:
indices = np.load("descritores/indices_pessoa.pickle")
print(indices) # result
{0: 'fotos\\pessoa.1.1.jpg', 1: 'fotos\\pessoa.1.2.jpg', 2: 'fotos\\pessoa.1.3.jpg', 3: 'fotos\\pessoa.2.1.jpg', 4: 'fotos\\pessoa.2.2.jpg', 5: 'fotos\\pessoa.2.3.jpg'}
I would like to get the indexes of all elements that contain "pessoa.1" as a substring, and remove them from the list.
This is what I've tried so far, but it's not working:
r = [i for i in indices if "pessoa.1" in i]
print(r)
From the output in your question, I see a dictionary:
{0: 'fotos\pessoa.1.1.jpg', 1: 'fotos\pessoa.1.2.jpg', 2: 'fotos\pessoa.1.3.jpg', 3: 'fotos\pessoa.2.1.jpg', 4: 'fotos\pessoa.2.2.jpg', 5: 'fotos\pessoa.2.3.jpg'}
If by "index" you mean a dictionary key, just iterate over the items.
indexes = []
for index, value in indices.items():
    if 'pessoa.1' in value:
        indexes.append(index)
for index in indexes:
    del indices[index]
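If you don't need to modify the dictionary in place, a comprehension can do both steps. This is a sketch under that assumption; the names removed_keys and remaining are made up for illustration, and the dictionary is typed in by hand instead of being loaded from the pickle file.

```python
indices = {
    0: 'fotos\\pessoa.1.1.jpg', 1: 'fotos\\pessoa.1.2.jpg', 2: 'fotos\\pessoa.1.3.jpg',
    3: 'fotos\\pessoa.2.1.jpg', 4: 'fotos\\pessoa.2.2.jpg', 5: 'fotos\\pessoa.2.3.jpg',
}

# Keys whose path contains the substring.
removed_keys = [key for key, path in indices.items() if "pessoa.1" in path]

# New dictionary without those entries; the original is left untouched.
remaining = {key: path for key, path in indices.items() if "pessoa.1" not in path}

print(removed_keys)  # [0, 1, 2]
print(remaining)
```

Note that "pessoa.1" would also match a name such as pessoa.10; if that matters for your files, matching "pessoa.1." is one way around it.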
I have a tuple like this,
sample_tuple = ([{1:["hello"], 2: ["this"], 3:["is fun"]},{1:["hi"], 2:["how are you"]}],
[{1: ["this"], 2:[], 3:["that"]}, {1:[], 2:["yes"]}])
From this tuple, I would like to create a dictionary whose values are themselves dictionaries.
Step 1:
Iterate the main big tuple and keep track of the indexes of lists.
Step 2:
Get into the lists inside the tuple and keep track of the index of those big lists.
i.e, index 0 of first list,
[{1:["hello"], 2: ["this"], 3:["is fun"]},{1:["hi"], 2:["how are you"]}]
Step 3:
I want to iterate through the key and values of dictionary inside the list.
i.e first dictionary
{1:["hello"], 2: ["this"], 3:["is fun"]}
Step 4:
While iterating through the dictionary values I want to check and make sure values are not empty and not None.
When this process happens, I want to create a dictionary. For this dictionary,
KEY: indexes of step 2 (each index of each dictionary in the big list).
VALUES: a dictionary whose keys are the step 3 keys (from my dictionary above) and whose values are (tricky part) a list, filled only if the step 3 dictionary's value is not empty. As you can see below, I have an empty list temporary_keyword_list that should record the entries with non-empty values, but I am not getting what I want.
Below is what I tried, what I get, and what my desired output is.
output_1 = {}
for index, each_keyword in enumerate(sample_tuple):
    for ind, each_file in enumerate(each_keyword):
        temporary_dict = {}
        for key, value in each_file.items():
            temporary_keyword_list = []
            # Check if a value is not empty or not None
            if value != [] and value is not None:
                temporary_keyword_list.append(index)  ## Here I want to save the index (tricky part)
            # Start inserting values into the dictionary.
            temporary_dict[key] = temporary_keyword_list
        # Final big dictionary
        output_1[ind] = temporary_dict
My current output_1 dictionary:
{0: {1: [1], 2: [], 3: [1]}, 1: {1: [], 2: [1]}}
Desired output:
{0: {1: [0, 1], 2: [0], 3: [0, 1]}, 1: {1: [0], 2: [0, 1]}}
Since it involves tuples, lists and dictionaries, I tried my best to explain the problem I have. Please let me know in the comments if this doesn't make sense and I'll try to explain further. Any help or suggestion would be awesome.
You probably do not need to create temporary lists or dictionaries here as you can obtain all the indices you need from your for loops. The key here is that your initial tuple contains lists which have a similar structure, so in your code, the structure of your final dictionary is already determined after the first iteration of the first for loop. Consider using defaultdict as well when you create the inner dictionaries as you plan to store lists inside them. Then it is all about correctly handling indices and values. The code below should work.
from collections import defaultdict
sample_tuple = ([{1: ["hello"], 2: ["this"], 3: ["is fun"]},
{1: ["hi"], 2: ["how are you"]}],
[{1: ["this"], 2: [], 3: ["that"]}, {1: [], 2: ["yes"]}])
output_1 = {}
for index, each_keyword in enumerate(sample_tuple):
    for ind, each_file in enumerate(each_keyword):
        if index == 0:
            output_1[ind] = defaultdict(list)
        for key, value in each_file.items():
            if value != [] and value is not None:
                output_1[ind][key].append(index)
print(output_1)
To answer your comment: you can manage without defaultdict, but do you really want to do that?
sample_tuple = ([{1: ["hello"], 2: ["this"], 3: ["is fun"]},
{1: ["hi"], 2: ["how are you"]}],
[{1: ["this"], 2: [], 3: ["that"]}, {1: [], 2: ["yes"]}])
output_1 = {}
for index, each_keyword in enumerate(sample_tuple):
    for ind, each_file in enumerate(each_keyword):
        for key, value in each_file.items():
            if index == 0:
                if key == 1:
                    if value != [] and value is not None:
                        output_1[ind] = {key: [index]}
                else:
                    if value != [] and value is not None:
                        output_1[ind][key] = [index]
            else:
                if value != [] and value is not None:
                    output_1[ind][key].append(index)
print(output_1)
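A middle ground, if you want plain dictionaries without the nested if/else, is dict.setdefault. This is only a sketch of that alternative; the variable name inner is made up for illustration, and `if value:` is used on the assumption that both empty lists and None should be skipped.

```python
sample_tuple = ([{1: ["hello"], 2: ["this"], 3: ["is fun"]},
                 {1: ["hi"], 2: ["how are you"]}],
                [{1: ["this"], 2: [], 3: ["that"]}, {1: [], 2: ["yes"]}])

output_1 = {}
for index, each_keyword in enumerate(sample_tuple):
    for ind, each_file in enumerate(each_keyword):
        # Create the inner dictionary the first time this position is seen.
        inner = output_1.setdefault(ind, {})
        for key, value in each_file.items():
            if value:  # skips both [] and None
                inner.setdefault(key, []).append(index)

print(output_1)
# {0: {1: [0, 1], 2: [0], 3: [0, 1]}, 1: {1: [0], 2: [0, 1]}}
```

Unlike the version above, this does not depend on key 1 being present and non-empty in the first list.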
I have a list, operation = [5,6], and a dictionary, dic = {0: None, 1: None}.
I want to replace each value of dic with the corresponding value of operation.
I tried this, but it doesn't seem to run:
operation = [5,6]
for i in oper and val, key in dic.items():
    dic_op[key] = operation[i]
Does anyone have an idea?
Other option, maybe:
operation = [5,6]
dic = {0: None, 1: None}
for idx, val in enumerate(operation):
    dic[idx] = val
dic #=> {0: 5, 1: 6}
Details for using index here: Accessing the index in 'for' loops?
The built-in zip function will do the job:
operation = [5, 6]
dic = {0: None, 1: None}
for key, op in zip(dic, operation):
    dic[key] = op
print(dic) # {0: 5, 1: 6}
The above solution assumes that the keys of dic iterate in an order that lines up with the positions of the elements in operation (dictionaries preserve insertion order from Python 3.7 on).
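If the keys were not inserted in the order you want, one option is to sort them before zipping. This is a sketch that assumes the smallest key should receive the first element of operation; the out-of-order dic below is made up for illustration.

```python
operation = [5, 6]
dic = {1: None, 0: None}  # hypothetical: keys inserted out of order

# sorted(dic) yields the keys in ascending order, regardless of insertion order.
for key, op in zip(sorted(dic), operation):
    dic[key] = op

print(dic)  # {1: 6, 0: 5}
```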
Using zip in Python 3.7+, you could just do:
operation = [5,6]
dic = {0: None, 1: None}
print(dict(zip(dic, operation)))
# {0: 5, 1: 6}
I have the following code that generates a nested dictionary.
import random
import numpy as np
dict1 = {}
for i in range(0,2):
    dict2 = {}
    for j in range(0,3):
        dict2[j] = random.randint(1,10)
    dict1[i] = dict2
For example it can generate the following content of dict1:
{0: {0: 7, 1: 2, 2: 5}, 1: {0: 3, 1: 10, 2: 10}}
I want to find the sub-key of the minimum value for a fixed key. For example, for the fixed key 0, the minimum value among the nested dictionary values is 2, which belongs to the sub-key 1. Therefore the result should be 1:
result=find_min(dict1[0])
result
1
How can I write such a find_min function?
You can reverse the keys and the values, then obtain the key with the minimum value:
a = {0: {0: 7, 1: 2, 2: 5}, 1: {0: 3, 1: 10, 2: 10}}
dict(zip(a[0].values(),a[0].keys())).get(min(a[0].values()))
Here we create a new dictionary whose keys and values are swapped relative to the original dictionary, e.g.:
dict(zip(a[0].values(),a[0].keys()))
Out[1575]: {7: 0, 2: 1, 5: 2}
Then we take the minimum value of the original dictionary and use it as the key into this reversed dictionary.
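One thing to keep in mind with the reversed dictionary is that repeated values collapse into a single key. The sketch below uses made-up data (not from the question) to show what happens when the minimum value occurs twice.

```python
# Hypothetical sub-dictionary in which the minimum value 2 appears twice.
sub = {0: 2, 1: 2, 2: 5}

# Reversing keeps only the last key seen for each value.
inverted = dict(zip(sub.values(), sub.keys()))
print(inverted)  # {2: 1, 5: 2}

via_inversion = inverted.get(min(sub.values()))
via_key = min(sub, key=sub.get)
print(via_inversion, via_key)  # 1 0
```

So the reversed dictionary returns the last sub-key holding the minimum, while min with a key function returns the first one.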
EDIT
As indicated in the comments, one can simply use the key argument of the min function:
min(a[0], key=a[0].get)
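Wrapped up as the find_min function the question asks for, this could look like the following sketch. It takes the sub-dictionary directly, as in the question's find_min(dict1[0]) call.

```python
def find_min(sub_dict):
    """Return the key with the smallest value; on ties, the first such key in iteration order."""
    return min(sub_dict, key=sub_dict.get)


dict1 = {0: {0: 7, 1: 2, 2: 5}, 1: {0: 3, 1: 10, 2: 10}}
print(find_min(dict1[0]))  # 1
print(find_min(dict1[1]))  # 0
```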
To extract the sub-dict for key 0, just do:
sub_dict = dict1[0]
Then, to find the key corresponding to the minimum value:
min_value, min_key = min((value, key) for key, value in sub_dict.items())
import random
def find_min(d, fixed_key):
    # Given a dictionary of dictionaries d and a fixed_key, get the dictionary associated with that key
    myDict = d[fixed_key]
    # treat the dictionary keys and values as lists:
    # get the position of the minimum value, then use it to look up the key
    sub_key = list(myDict.keys())[list(myDict.values()).index(min(myDict.values()))]
    return sub_key
dict1 = {0: {0: 7, 1: 2, 2: 5}, 1: {0: 3, 1: 10, 2: 10}}
print(dict1)
print(find_min(dict1, 0))
I want to make a histogram of all the intervals between repeated values in a list. I wrote some code that works, but it's using a for loop with if statements. I often find that if one can manage to write a version using clever slicing and/or predefined python (numpy) methods, that one can get much faster Python code than using for loops, but in this case I can't think of any way of doing that. Can anyone suggest a faster or more pythonic way of doing this?
# make a 'histogram'/count of all the intervals between repeated values
def hist_intervals(a):
    values = sorted(set(a))  # get list of which values are in a
    # setup the dict to hold the histogram
    hist, last_index = {}, {}
    for i in values:
        hist[i] = {}
        last_index[i] = -1  # some default value
    # now go through the array and find intervals
    for i in range(len(a)):
        val = a[i]
        if last_index[val] != -1:  # do nothing if it's the first time
            interval = i - last_index[val]
            if interval in hist[val]:
                hist[val][interval] += 1
            else:
                hist[val][interval] = 1
        last_index[val] = i
    return hist
# example list/array
a = [1,2,3,1,5,3,2,4,2,1,5,3,3,4]
histdict = hist_intervals(a)
print("histdict = ",histdict)
# correct answer for this example
answer = { 1: {3:1, 6:1},
2: {2:1, 5:1},
3: {1:1, 3:1, 6:1},
4: {6:1},
5: {6:1}
}
print("answer = ",answer)
Sample output:
histdict = {1: {3: 1, 6: 1}, 2: {5: 1, 2: 1}, 3: {3: 1, 6: 1, 1: 1}, 4: {6: 1}, 5: {6: 1}}
answer = {1: {3: 1, 6: 1}, 2: {2: 1, 5: 1}, 3: {1: 1, 3: 1, 6: 1}, 4: {6: 1}, 5: {6: 1}}
^ note: I don't care about the ordering in the dict, so this solution is acceptable, but I want to be able to run on really large arrays/lists and I'm suspecting my current method will be slow.
You can eliminate the setup loop by a carefully constructed defaultdict. Then you're just left with a single scan over the input list, which is as good as it gets. Here I change the resultant defaultdict back to a regular Dict[int, Dict[int, int]], but that's just so it prints nicely.
from collections import defaultdict
def count_intervals(iterable):
    # setup
    last_seen = {}
    hist = defaultdict(lambda: defaultdict(int))
    # The actual work
    for i, x in enumerate(iterable):
        if x in last_seen:
            hist[x][i-last_seen[x]] += 1
        last_seen[x] = i
    return hist
a = [1,2,3,1,5,3,2,4,2,1,5,3,3,4]
hist = count_intervals(a)
for k, v in hist.items():
    print(k, dict(v))
# 1 {3: 1, 6: 1}
# 3 {3: 1, 6: 1, 1: 1}
# 2 {5: 1, 2: 1}
# 5 {6: 1}
# 4 {6: 1}
There is an obvious change to make in terms of data structures: instead of using a dictionary of dictionaries for hist, use a defaultdict of Counter. This lets the code become:
from collections import defaultdict, Counter
# make a 'histogram'/count of all the intervals between repeated values
def hist_intervals(a):
    # setup the dict to hold the histogram
    hist, last_index = defaultdict(Counter), {}
    # now go through the array and find intervals
    for i, val in enumerate(a):
        if val in last_index:
            interval = i - last_index[val]
            hist[val].update((interval,))
        last_index[val] = i
    return hist
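This may be faster, since the per-interval bookkeeping moves out of explicit if/else branches and into Counter (whose counting helper is implemented in C in CPython), and it is also cleaner.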
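Another way to look at the same computation, sketched here as an alternative and not as a measured speed-up: first collect the positions of each value, then count the gaps between consecutive positions. The function name hist_intervals_from_positions is made up for illustration, and values that never repeat are simply left out of the result.

```python
from collections import Counter, defaultdict


def hist_intervals_from_positions(a):
    # Record every position at which each value occurs (positions are ascending).
    positions = defaultdict(list)
    for i, val in enumerate(a):
        positions[val].append(i)

    hist = {}
    for val, idx in positions.items():
        # Gaps between consecutive occurrences of the same value.
        gaps = Counter(later - earlier for earlier, later in zip(idx, idx[1:]))
        if gaps:  # skip values that occur only once
            hist[val] = dict(gaps)
    return hist


a = [1, 2, 3, 1, 5, 3, 2, 4, 2, 1, 5, 3, 3, 4]
print(hist_intervals_from_positions(a))
```

Whether this beats the single-pass version depends on the data, so it is worth timing both on your real arrays.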