Find element in a list with a substring - python

I have a pickle file which contains the following data stored:
indices = np.load("descritores/indices_pessoa.pickle")
print(indices) # result
{0: 'fotos\\pessoa.1.1.jpg', 1: 'fotos\\pessoa.1.2.jpg', 2: 'fotos\\pessoa.1.3.jpg', 3: 'fotos\\pessoa.2.1.jpg', 4: 'fotos\\pessoa.2.2.jpg', 5: 'fotos\\pessoa.2.3.jpg'}
I would like to get the indexes of all elements that contain "pessoa.1" as a substring, and remove those elements from the dictionary.
I've tried this so far, but not working:
r = [i for i in indices if "pessoa.1" in i]
print(r)

Judging by the output in your question, indices is a dictionary:
{0: 'fotos\pessoa.1.1.jpg', 1: 'fotos\pessoa.1.2.jpg', 2: 'fotos\pessoa.1.3.jpg', 3: 'fotos\pessoa.2.1.jpg', 4: 'fotos\pessoa.2.2.jpg', 5: 'fotos\pessoa.2.3.jpg'}
If by "index" you mean a key of the dictionary, just iterate over its items, collect the matching keys, and delete them afterwards.
indexes = []
for index, value in indices.items():
    if 'pessoa.1' in value:
        indexes.append(index)
for index in indexes:
    del indices[index]
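A dict comprehension is another way to get the same result without deleting from the dictionary in place. This is only a minimal sketch: the data is copied from the question, and the names matching_keys and remaining are illustrative, not part of the original code.

```python
indices = {
    0: 'fotos\\pessoa.1.1.jpg', 1: 'fotos\\pessoa.1.2.jpg',
    2: 'fotos\\pessoa.1.3.jpg', 3: 'fotos\\pessoa.2.1.jpg',
    4: 'fotos\\pessoa.2.2.jpg', 5: 'fotos\\pessoa.2.3.jpg',
}

# Keys whose value (the file path) contains the substring.
matching_keys = [key for key, path in indices.items() if 'pessoa.1' in path]

# New dictionary without those entries; the original stays untouched.
remaining = {key: path for key, path in indices.items() if 'pessoa.1' not in path}

print(matching_keys)
print(remaining)
```

Note that 'pessoa.1' would also match a path such as 'fotos\\pessoa.10.1.jpg'; searching for 'pessoa.1.' is stricter if that matters for your data.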

Related

count how often a key appears in a dataset

I have a pandas DataFrame with three columns; the third is the second one with some string slicing applied.
Every warranty_claim_number has a corresponding key_part_number (the first column).
The DataFrame has a lot of rows.
I also have a separate list, which contains 70 randomly selected warranty_claim_numbers.
I was hoping to find the corresponding key_part_number for each of those 70 claims in my dataset.
Then I would like to create a dictionary with the key_part_number as the key and the corresponding warranty_claim_number as the value.
Finally, I want to count how often each key_part_number appears in this dataset and update the key with that count.
It should look like this:
dicti = {4:'000120648353',10:'000119582589',....}
First of all, you need to change the datatype of the warranty claim numbers to string, or you won't get the leading zeros.
You can then subset your df using that list of claim numbers:
df = df[df["warranty_claim_number"].isin(claimnumberlist)]
This gives you a dataframe with only the rows with those claim numbers.
countofkeyparts = df["key_part_number"].value_counts()
This gives you a pandas Series with the counts, which you can convert to a dict with to_dict():
countofkeyparts = countofkeyparts.to_dict()
The keys in a dict have to be unique, so if you want the count as the key, the value can be a list of all key_part_numbers that share that count:
values = {}
for key, value in countofkeyparts.items():
    values[value] = values.get(value, [])
    values[value].append(key)
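The same inversion can be written with collections.defaultdict, which creates the empty list on first access. This is a small sketch; the part numbers and counts below are made up for illustration and only mimic the shape of value_counts().to_dict().

```python
from collections import defaultdict

# Hypothetical counts, shaped like the output of value_counts().to_dict().
countofkeyparts = {'000120648353': 4, '000119582589': 10, '000100000001': 4}

values = defaultdict(list)
for key_part_number, count in countofkeyparts.items():
    # Group every part number under the number of times it occurred.
    values[count].append(key_part_number)

values = dict(values)  # back to a plain dict for printing
print(values)
```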
Going by your example, you can't use the number of occurrences as the dictionary key directly: keys are unique, and several key_part_numbers may occur the same number of times. It is therefore better to store the result in this format: dicti = {4:['000120648353', '09824091'],10:['000119582589'] ,....}
I'll use randomly generated data as an example
from collections import Counter
import random
lst = [random.randint(1, 10) for i in range(20)]
counter = Counter(lst)
print(counter) # First element, then number of occurrences
nums = set(counter.values()) # All occurrences
res = {item: [val for val in counter if counter[val] == item] for item in nums}
print(res)
# Counter({5: 6, 8: 4, 3: 2, 4: 2, 9: 2, 2: 2, 6: 1, 10: 1})
# {1: [6, 10], 2: [3, 4, 9, 2], 4: [8], 6: [5]}
This does what you want:
# Select the key_part_number of every row whose claim number is in lst:
df_wanted = df.loc[df["warranty_claim_number"].isin(lst), "key_part_number"]
# Count how often each value occurs in that column:
count_values = df_wanted.value_counts()
# Transform to a dictionary:
print(count_values.to_dict())
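Putting the filtering and counting steps together on a tiny frame may make the flow easier to follow. Everything in this sketch is assumed for illustration: the rows, the claim_numbers list, and the column names (taken from the question's wording). The numbers are stored as strings so leading zeros survive.

```python
import pandas as pd

# Made-up sample data; real data would come from your own DataFrame.
df = pd.DataFrame({
    "key_part_number": ["000120648353", "000120648353",
                        "000119582589", "000120648353"],
    "warranty_claim_number": ["0001", "0002", "0003", "0004"],
})
claim_numbers = ["0001", "0002", "0003"]  # stands in for the 70 selected claims

# Keep only the rows whose claim number is in the list.
subset = df[df["warranty_claim_number"].isin(claim_numbers)]

# Count how often each key part number occurs in that subset.
counts = subset["key_part_number"].value_counts().to_dict()
print(counts)
```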

Setting a value in nested dictionary sets it in more than one place

My objective is to add a value to a list that exists in a nested dictionary.
I created a nested dictionary recursively using a method I wrote. I need this method to create the dictionary because it can have any number of "dimensions".
def create_nested_dict(dimensions):
    if len(dimensions) == 1:
        return dict.fromkeys(range(dimensions[0]), [])
    else:
        return dict.fromkeys(range(dimensions[0]), create_nested_dict((dimensions[1:])))
For example, d = create_nested_dict([2, 3]) generates a dictionary d with keys first in range(2) and then in range(3). Like this:
{0: {0: [], 1: [], 2: []}, 1: {0: [], 1: [], 2: []}}
However, now I need to add values to the lists the dictionary contains.
If I try to append a value to the list in position [0][1] like this (d[0][1]).append(3), the value is appended in all the lists:
{0: {0: [3], 1: [3], 2: [3]}, 1: {0: [3], 1: [3], 2: [3]}}
If I use the splat operator instead d[0][1] = [*d[0][1], 3]:
I get this result:
{0: {0: [], 1: [3], 2: []}, 1: {0: [], 1: [3], 2: []}}
which is not what I need either.
The expected result is to get the value appended to just the position [0][1]:
{0: {0: [], 1: [3], 2: []}, 1: {0: [], 1: [], 2: []}}
I don't know if this behavior has something to do with the method I used to create the dictionary or the keys being ints or if I'm just storing incorrectly.
Edit: I tried replacing the ints with strings and the behavior didn't change.
When you do dict.fromkeys(range(dimensions[0]), create_nested_dict((dimensions[1:]))), you're assigning the same dict as the value for all the keys. Since it is the same dict (i.e. it is only 1 Python object under the hood, and each entry in the dict is a reference to this same object), changing one will change them all.
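The sharing is easy to confirm with an identity check. This short sketch uses a flat dictionary (the name shared is just for illustration) to show that dict.fromkeys stores one and the same list object under every key.

```python
# dict.fromkeys evaluates the default once and reuses that single object.
shared = dict.fromkeys(range(3), [])

# All keys point at the very same list.
print(shared[0] is shared[1])

# Appending through one key is therefore visible through all of them.
shared[0].append(3)
print(shared)  # {0: [3], 1: [3], 2: [3]}
```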
You should create a fresh dictionary each time, you can easily do this by using a dict comprehension:
def create_nested_dict(dimensions):
    if len(dimensions) == 1:
        return {k: [] for k in range(dimensions[0])}
    else:
        return {k: create_nested_dict(dimensions[1:]) for k in range(dimensions[0])}
Here is how:
def create_nested_dict(dimensions):
    if len(dimensions) == 1:
        return {k: [] for k in range(dimensions[0])}
    # Recurse on the remaining dimensions; the keys come from the first one.
    return {k: create_nested_dict(dimensions[1:]) for k in range(dimensions[0])}
Note that we don't need an extra else statement in the function. The return statements are enough.
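A quick check with the example from the question shows that a comprehension-based version gives every position its own list. The function is repeated here only so the snippet runs on its own.

```python
def create_nested_dict(dimensions):
    # Base case: one dimension left, so each key gets its own new list.
    if len(dimensions) == 1:
        return {k: [] for k in range(dimensions[0])}
    # Each key gets its own freshly built sub-dictionary.
    return {k: create_nested_dict(dimensions[1:]) for k in range(dimensions[0])}

d = create_nested_dict([2, 3])
d[0][1].append(3)
print(d)  # {0: {0: [], 1: [3], 2: []}, 1: {0: [], 1: [], 2: []}}
```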

How to replace a key in a list/Dictionary?

I have a list that contains a dictionary. How can I replace a key of that dictionary?
a = [{ 1:'1',2:'2',3:'3',4:'4',5:'5',1:'1'}]
for n, i in enumerate(a):
if i == 1:
a[n] = 10
The key 1 has to be replaced with 10. I tried the approach above, but it doesn't work.
The final result I want is
a = [{ 10:'1',2:'2',3:'3',4:'4',5:'5',10:'1'}]
Original:
a = [{ 1:1,2:2,3:3,4:4,5:5,1:1}]
for n, i in enumerate(a):
if i == 1:
a[n] = 10
First thing to know is, arrays start at 0. You have
if i == 1:
and so, your code will never execute.
You also have a duplicate key in your dictionary - 1 is used twice.
Since your dictionary is in a list, it will have to be indexed like:
a[i][j] = ...
where i refers to which element it is in the list, and j refers to which element in the dictionary.
Last, your i and n are reversed - enumerate puts the index in the first variable.
So, if I understand correctly what you want to accomplish, the end result should end up looking more like this:
a = [{1:1,2:2,3:3,4:4,5:5}]
for i, n in enumerate(a):
    if i == 0:
        a[0][1] = 10
print(a)
If you want to change the value for more than 1 key, then I might do something like this:
a = [{1:1,2:2,3:3,4:4,5:5}]
toChange = [[1,10], [4, 76]] # 1 and 4 are the keys, and 10 and 76 are the values to change them to
for i, n in enumerate(a):
    if i == 0:
        for change in toChange:
            a[0][change[0]] = change[1]
print(a)
EDIT: Everything above is still correct, but as you and Tomerikoo pointed out, it does not quite answer the question. My apologies. The following code should work.
a = [{1: 1, 2: 2, 3: 3, 4: 4, 5: 5}]
toChange = [[1, 10], [4, 76]]  # 1 and 4 are the old keys, and 10 and 76 are the keys that replace them
for i, n in enumerate(a):
    if i == 0:
        for change in toChange:
            try:
                oldValue = a[0][change[0]]
                del a[0][change[0]]
                a[0][change[1]] = oldValue
            except KeyError:
                # This likely means you tried to replace a key that isn't in there
                pass  # handle it here
print(a)
I wouldn't use enumerate here. Notice that the first key's value gets overwritten.
a = [{1: '1', 2: '2', 3: '3', 4: '4', 5: '5', 1: '1'}] # duplicate key
print(a) # [{1: '1', 2: '2', 3: '3', 4: '4', 5: '5'}] (got overwritten)
a[0][10] = a[0][1] # copy the value of the key 1 to the key 10
a[0].pop(1) # remove the key 1
print(a) # [{2: '2', 3: '3', 4: '4', 5: '5', 10: '1'}]
Also, note that in the original example the indentation of the if block is wrong.
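If the position of the renamed key matters, rebuilding the dictionary with a comprehension keeps the original insertion order, whereas the pop approach moves the new key to the end. This is a minimal sketch; rename_key is a hypothetical helper name, not something from the question or a library.

```python
def rename_key(d, old, new):
    """Return a new dict with `old` renamed to `new`, keeping insertion order."""
    # If `new` already exists in d, the later entry wins, as with any dict.
    return {(new if key == old else key): value for key, value in d.items()}

a = [{1: '1', 2: '2', 3: '3', 4: '4', 5: '5'}]
a = [rename_key(d, 1, 10) for d in a]
print(a)  # [{10: '1', 2: '2', 3: '3', 4: '4', 5: '5'}]
```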

Create a Dictionary of Dictionary from a Tuple of Dictionary Inside a List

I have a tuple like this,
sample_tuple = ([{1:["hello"], 2: ["this"], 3:["is fun"]},{1:["hi"], 2:["how are you"]}],
[{1: ["this"], 2:[], 3:["that"]}, {1:[], 2:["yes"]}])
From this tuple, I would like to create a dictionary whose values are themselves dictionaries.
Step 1:
Iterate the main big tuple and keep track of the indexes of lists.
Step 2:
Get into the lists inside the tuple and keep track of the index of those big lists.
i.e, index 0 of first list,
[{1:["hello"], 2: ["this"], 3:["is fun"]},{1:["hi"], 2:["how are you"]}]
Step 3:
I want to iterate through the key and values of dictionary inside the list.
i.e first dictionary
{1:["hello"], 2: ["this"], 3:["is fun"]}
Step 4:
While iterating through the dictionary values I want to check and make sure values are not empty and not None.
When this process happens, I want to create a dictionary. For this dictionary,
KEY: indexes of step 2 (each index of each dictionary in the big list).
VALUES: a dictionary that has key from the step 3's keys (from my dictionary above) and values as (tricky part) a list, if the step 3 dictionary's value is not empty. As you can see below, I have an empty list temporary_keyword_list that should save the non empty lists values into a temporary list, but I am not getting what I want.
Below is what I tried, what I get and what my desired output.
output_1 = {}
for index, each_keyword in enumerate(sample_tuple):
    for ind, each_file in enumerate(each_keyword):
        temporary_dict = {}
        for key, value in each_file.items():
            temporary_keyword_list = []
            # Check if a value is not empty or not None
            if value != [] and value is not None:
                temporary_keyword_list.append(index) ## Here I want to save the index (tricky part)
            # Start inserting values into the dictionary.
            temporary_dict[key] = temporary_keyword_list
        # Final big dictionary
        output_1[ind] = temporary_dict
My current output_1 dictionary:
{0: {1: [1], 2: [], 3: [1]}, 1: {1: [], 2: [1]}}
Desired output:
{0: {1: [0, 1], 2: [0], 3: [0, 1]}, 1: {1: [0], 2: [0, 1]}}
Since this involves tuples, lists and dictionaries, I tried my best to explain the problem. Please let me know in the comments if this doesn't make sense and I'll try to clarify. Any help or suggestion would be awesome.
You probably do not need to create temporary lists or dictionaries here as you can obtain all the indices you need from your for loops. The key here is that your initial tuple contains lists which have a similar structure, so in your code, the structure of your final dictionary is already determined after the first iteration of the first for loop. Consider using defaultdict as well when you create the inner dictionaries as you plan to store lists inside them. Then it is all about correctly handling indices and values. The code below should work.
from collections import defaultdict
sample_tuple = ([{1: ["hello"], 2: ["this"], 3: ["is fun"]},
                 {1: ["hi"], 2: ["how are you"]}],
                [{1: ["this"], 2: [], 3: ["that"]}, {1: [], 2: ["yes"]}])
output_1 = {}
for index, each_keyword in enumerate(sample_tuple):
    for ind, each_file in enumerate(each_keyword):
        if index == 0:
            output_1[ind] = defaultdict(list)
        for key, value in each_file.items():
            if value != [] and value is not None:
                output_1[ind][key].append(index)
print(output_1)
To answer your comment, you can manage without defaultdict, but do you really want to do that?
sample_tuple = ([{1: ["hello"], 2: ["this"], 3: ["is fun"]},
                 {1: ["hi"], 2: ["how are you"]}],
                [{1: ["this"], 2: [], 3: ["that"]}, {1: [], 2: ["yes"]}])
output_1 = {}
for index, each_keyword in enumerate(sample_tuple):
    for ind, each_file in enumerate(each_keyword):
        for key, value in each_file.items():
            if index == 0:
                if key == 1:
                    if value != [] and value is not None:
                        output_1[ind] = {key: [index]}
                else:
                    if value != [] and value is not None:
                        output_1[ind][key] = [index]
            else:
                if value != [] and value is not None:
                    output_1[ind][key].append(index)
print(output_1)
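A middle ground between the two versions is dict.setdefault, which avoids both defaultdict and the nested special cases. This is only a sketch under the assumption that empty lists and None should both be skipped; the variable inner is introduced here for readability.

```python
sample_tuple = ([{1: ["hello"], 2: ["this"], 3: ["is fun"]},
                 {1: ["hi"], 2: ["how are you"]}],
                [{1: ["this"], 2: [], 3: ["that"]}, {1: [], 2: ["yes"]}])

output_1 = {}
for index, each_keyword in enumerate(sample_tuple):
    for ind, each_file in enumerate(each_keyword):
        # Create the inner dictionary the first time this position is seen.
        inner = output_1.setdefault(ind, {})
        for key, value in each_file.items():
            if value:  # false for both [] and None
                inner.setdefault(key, []).append(index)

print(output_1)
# {0: {1: [0, 1], 2: [0], 3: [0, 1]}, 1: {1: [0], 2: [0, 1]}}
```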

python find minimum value in dictionary iteratively

I have a dictionary (named distances) which looks like this :
{0: {0: 122.97560733739029, 1: 208.76062847194152, 2: 34.713109915419565}, 1: {0: 84.463009655114703, 1: 20.83266665599966, 2: 237.6299644405141}, 2: {0: 27.018512172212592, 1: 104.38390680559911, 2: 137.70257804413103}}
Now I need to find the minimum value for each key and store its index separately. I have written this code for that:
weights_indexes = {}
for index1 in distances:
    min_dist = min(distances[index1], key=distances[index1].get)
    weights_indexes[index1] = min_dist
The output for this, looks like :
{0: 2, 1: 1, 2: 0}
The issue is that the indexes should always be unique. Suppose we now have a dictionary like:
{0: {0: 34.713109915419565, 1: 208.76062847194152, 2: 122.97560733739029}, 1: {0: 84.463009655114703, 1: 20.83266665599966, 2: 237.6299644405141}, 2: {0: 27.018512172212592, 1: 104.38390680559911, 2: 137.70257804413103}}
so, the output of finding minimum indexes for this will be :
{0: 0, 1: 1, 2: 0}
Here, the indexes (values) obtained are not unique. In this scenario, the distances at the duplicated index have to be compared, so 34.713109915419565 and 27.018512172212592 will be compared. Since 27.018512172212592 is smaller, key 2 keeps index 0. Key 0 is then mapped to the index of its next-smallest value, 122.97560733739029, which is index 2. So the final mapping will look like:
{0: 2, 1: 1, 2: 0}
This should repeat until all the indexes are unique.
I can't figure out how to check for uniqueness and then iteratively keep finding the next minimum to build the mapping.
Here is a workable solution:
test = {0: {0: 12.33334444, 1: 208.76062847194152, 2: 34.713109915419565}, 1: {0: 84.463009655114703, 1: 20.83266665599966, 2: 237.6299644405141}, 2: {0: 27.018512172212592, 1: 104.38390680559911, 2: 137.70257804413103}}
sorted_index_map = {}
for key, value in test.items():
    sorted_index_map[key] = sorted(value, key=lambda k: value[k])
index_of_min_index_map = {key: 0 for key in test}
need_to_check_duplicate = True
while need_to_check_duplicate:
    need_to_check_duplicate = False
    min_index_map = {key: sorted_index_map[key][i] for key, i in index_of_min_index_map.items()}
    index_set = list(min_index_map.values())
    for key, index in min_index_map.items():
        if index_set.count(index) == 1:
            continue
        else:
            for key_to_check, index_to_check in min_index_map.items():
                if key != key_to_check and index == index_to_check:
                    if test[key][index] > test[key_to_check][index_to_check]:
                        index_of_min_index_map[key] += 1
                        need_to_check_duplicate = True
                        break
result = {key: sorted_index_map[key][i] for key, i in index_of_min_index_map.items()}
print(result)
The result:
{0: 0, 1: 1, 2: 2}
Break down:
First, sort each key's indexes by their values:
test = {0: {0: 12.33334444, 1: 208.76062847194152, 2: 34.713109915419565}, 1: {0: 84.463009655114703, 1: 20.83266665599966, 2: 237.6299644405141}, 2: {0: 27.018512172212592, 1: 104.38390680559911, 2: 137.70257804413103}}
sorted_index_map = {}
for key, value in test.items():
    sorted_index_map[key] = sorted(value, key=lambda k: value[k])
Then for each key the min value's index is the first number in the sorted_index_map:
index_of_min_index_map = {key: 0 for key in test}
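To make these first two steps concrete, the sketch below prints the intermediate structures for the same test data. The name first_choice is introduced here purely for illustration; it is the mapping each key would get before any duplicates are resolved.

```python
test = {0: {0: 12.33334444, 1: 208.76062847194152, 2: 34.713109915419565},
        1: {0: 84.463009655114703, 1: 20.83266665599966, 2: 237.6299644405141},
        2: {0: 27.018512172212592, 1: 104.38390680559911, 2: 137.70257804413103}}

# For every key, the inner indexes ordered from smallest to largest distance.
sorted_index_map = {key: sorted(value, key=value.get) for key, value in test.items()}
print(sorted_index_map)  # {0: [0, 2, 1], 1: [1, 0, 2], 2: [0, 1, 2]}

# Position 0 of each list is that key's minimum; keys 0 and 2 collide on index 0.
first_choice = {key: order[0] for key, order in sorted_index_map.items()}
print(first_choice)  # {0: 0, 1: 1, 2: 0}
```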
Now we need to check whether there are any duplicate indexes. If there are, every key that shares an index but does not hold the smallest value for it shifts to its next-smallest index, i.e. the next entry in that key's sorted_index_map list. If there are no duplicates, we're done.
need_to_check_duplicate = True
while need_to_check_duplicate:
    need_to_check_duplicate = False
    min_index_map = {key: sorted_index_map[key][i] for key, i in index_of_min_index_map.items()}
    index_set = list(min_index_map.values())
    for key, index in min_index_map.items():
        if index_set.count(index) == 1:
            continue
        else:
            for key_to_check, index_to_check in min_index_map.items():
                if key != key_to_check and index == index_to_check:
                    if test[key][index] > test[key_to_check][index_to_check]:
                        index_of_min_index_map[key] += 1
                        need_to_check_duplicate = True
                        break
Note that you haven't mentioned how to handle the index if there are two identical values, so I assume there won't be any.
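For comparison, here is a different technique from the loop above: a single greedy pass that sorts all distances and hands out each index to the first key that reaches it. The function name assign_unique_minima is hypothetical, and it assumes there are at least as many inner indexes as keys. It gives the mapping expected in the question for the second example dictionary, but it is not guaranteed to match the iterative procedure on every input.

```python
def assign_unique_minima(distances):
    """Greedy sketch: hand out indexes starting from the smallest distance."""
    # Flatten to (distance, outer_key, inner_index) and sort by distance.
    triples = sorted(
        (dist, key, index)
        for key, row in distances.items()
        for index, dist in row.items()
    )
    assignment = {}
    used_indexes = set()
    for dist, key, index in triples:
        # Skip keys that already have an index and indexes already taken.
        if key in assignment or index in used_indexes:
            continue
        assignment[key] = index
        used_indexes.add(index)
    return assignment

distances = {0: {0: 34.713109915419565, 1: 208.76062847194152, 2: 122.97560733739029},
             1: {0: 84.463009655114703, 1: 20.83266665599966, 2: 237.6299644405141},
             2: {0: 27.018512172212592, 1: 104.38390680559911, 2: 137.70257804413103}}
result = assign_unique_minima(distances)
print(result)
```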
