Using decorators vs iteration to set values? - python

So I have to loop through a list of objects, using some of their values to do computation, and then assign them new values.
Because many of the items in the list will be assigned the same new value, I used a dictionary to hold the list of items that will require the same value. For example:
item_dict = {}
for item in list:
    value = item.value
    if value not in item_dict:
        item_dict[value] = [item]
    else:
        item_dict[value].append(item)
# do some calculations based on values
new_data # some dictionary created by computation
# new data is stored new_data[value] = new_value
for value, new_value in new_data.items():
    items = item_dict[value]
    for item in items:
        item.value = new_value
I was thinking about replacing the for item in items loop with a decorator, since all the new_value(s) for that list are the same. For example:
def dec(item):
    def wrap(value):
        item.value = value
    return wrap
def rec(item, func):
    def wrap(value):
        item.value = value
        func(value)
    return wrap
item_dict = {}
for item in list:
    value = item.value
    if value not in item_dict:
        item_dict[value] = dec(item)
    else:
        item_dict[value] = rec(item, item_dict[value])
# do some calculations based on values
new_data # some dictionary created by computation
# new data is stored new_data[value] = new_value
for value, new_value in new_data.items():
    items = item_dict[value]
    items(new_value)
Would the decorator fashion be more efficient and how much of a memory impact will it have? Are there any better ways of doing this?

A defaultdict works well here:
from collections import defaultdict
item_dict = defaultdict(list)
for item in value_list:
    item_dict[item.value].append(item)
# do some calculations based on values
new_data # some dictionary created by computation
# new data is stored new_data[value] = new_value
for value, new_value in new_data.items():
    for item in item_dict[value]:
        item.value = new_value
I struggle to think of a way the decorator version could be better - for one thing, you have to worry about the recursion limit.
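To make the recursion-limit concern concrete, here is a minimal sketch. The Item class and the chain lengths are assumptions made up for illustration; dec and rec follow the definitions in the question. Each rec wrapper calls the previous one, so a group of N items needs N nested calls.

```python
import sys

class Item:
    """Hypothetical stand-in for the question's objects."""
    def __init__(self, value):
        self.value = value

def dec(item):
    def wrap(value):
        item.value = value
    return wrap

def rec(item, func):
    def wrap(value):
        item.value = value
        func(value)  # every wrapper in the chain adds one stack frame
    return wrap

def chain_overflows(n_items):
    """Return True if calling a chain of n_items closures raises RecursionError."""
    items = [Item(0) for _ in range(n_items)]
    setter = dec(items[0])
    for item in items[1:]:
        setter = rec(item, setter)
    try:
        setter(1)
    except RecursionError:
        return True
    return False

print(chain_overflows(10))
print(chain_overflows(sys.getrecursionlimit() + 100))
```

A plain list of items has no such limit, which is one reason the loop version is the safer choice.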

The get method works well in the first case.
item_dict = {}
for item in list:
    item_dict[item.value] = item_dict.get(item.value, []) + [item]
The key to making this work is to use list addition instead of append, as append returns None.
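A small sketch of why append does not work with get, using made-up keys and values. It also shows dict.setdefault, which is not part of the answer above but is a standard dict method that stores the default list before appending to it.

```python
groups = {}
# get() returns a throwaway list; append mutates it and returns None,
# so nothing is ever stored in the dictionary.
result = groups.get("a", []).append(1)

groups_by_setdefault = {}
# setdefault() stores the empty list under the key on first use,
# so append can safely mutate it in place.
groups_by_setdefault.setdefault("a", []).append(1)
groups_by_setdefault.setdefault("a", []).append(2)

print(result, groups, groups_by_setdefault)
```

Note that the list-addition version builds a new list on every iteration, so for large groups setdefault or defaultdict is likely to be cheaper.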

Related

How to detect last call of a recursive function?

I have a list of complex dictionaries like this:
data = [
    {
        "l_1_k": 1,
        "l_1_ch": [
            {
                "l_2_k": 2,
                "l_2_ch": [...more levels]
            },
            {
                "l_2_k": 3,
                "l_2_ch": [...more levels]
            }
        ]
    },
    ...more items
]
I'm trying to flatten this structure to a list of rows like this:
list = [
    { "l_1_k": 1, "l_2_k": 2, ... },
    { "l_1_k": 1, "l_2_k": 3, ... },
]
I need this list to build a pandas data frame.
So, I'm doing a recursion for each nesting level, and at the last level I'm trying to append to rows list.
def build_dict(d, row_dict, rows):
    # d is the data dictionary at each nesting level
    # row_dict is the final row dictionary
    # rows is the final list of rows
    for key, value in d.items():
        if not isinstance(value, list):
            row_dict[key] = value
        else:
            for child in value:
                build_dict(child, row_dict, rows)
    rows.append(row_dict) # <- How to detect the last recursion and call the append
I'm calling this function like this:
rows = []
for row in data:
    build_dict(d=row, row_dict={}, rows=rows)
My question is how to detect the last call of this recursive function if I do not know how many nesting levels there are. With the current code, the row is duplicated at each nesting level.
Or, is there a better approach to obtain the final result?
After looking up some ideas, the solution I have in mind is this:
Declare the following function, taken from here:
def find_depth(d):
    if isinstance(d, dict):
        return 1 + (max(map(find_depth, d.values())) if d else 0)
    return 0
In your function, increment every time you go deeper as follows:
def build_dict(d, row_dict, rows, depth=0):
    # depth starts at 0 for the top-level call
    for key, value in d.items():
        if not isinstance(value, list):
            row_dict[key] = value
        else:
            for child in value:
                build_dict(child, row_dict, rows, depth + 1)
Finally, test if you reach the maximum depth, if so, at the end of your function you can append it. You will need to add an extra variable which you will call:
def build_dict(d, row_dict, rows, max_depth, depth=0):
    # depth starts at 0 for the top-level call
    for key, value in d.items():
        if not isinstance(value, list):
            row_dict[key] = value
        else:
            for child in value:
                build_dict(child, row_dict, rows, max_depth, depth + 1)
    if depth == max_depth:
        rows.append(row_dict)
Call the function as:
build_dict(d=row, row_dict={}, rows=rows, max_depth=find_depth(data))
Do keep in mind since I don't have a data-set I can use, there might be a syntax error or two in there, but the approach should be fine.
I don't think it is good practice to try to play with mutable default argument in function prototype.
Also, I think that the function in the recursion loop should never be aware of the level it is in. That's the point of the recursion. Instead, you need to think about what the function should return, and when it should exit the recursion loop to climb back to the zeroth level. On the climb back, higher level function calls handle the return value of lower level function calls.
Here is the code that I think will work. I am not sure it is optimal in terms of computing time.
edit: fixed return list of dicts instead of dict only
def build_dict(d):
    """
    returns a list when there is no lowerlevel list of dicts.
    """
    lst = []
    for key, value in d.items():
        if not isinstance(value, list):
            lst.append([{key: value}])
        else:
            lst_lower_levels = []
            for child in value:
                lst_lower_levels.extend(build_dict(child))
            new_lst = []
            for elm in lst:
                for elm_ll in lst_lower_levels:
                    lst_of_dct = elm + elm_ll
                    new_lst.append([{k: v for d in lst_of_dct for k, v in d.items()}])
            lst = new_lst
    return lst
rows = []
for row in data:
    rows.extend(build_dict(d=row))
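As an alternative sketch of the same "return values on the way back up" idea, the function below treats a node with no child entries as the last level and returns one flat row per leaf. The name flatten, the parent parameter, and the sample data (with empty child lists standing in for the elided deeper levels) are assumptions for illustration, not code from the question.

```python
def flatten(node, parent=None):
    """Return a list with one flat row for every leaf below `node`."""
    row = dict(parent or {})          # copy, so sibling rows never share state
    child_lists = []
    for key, value in node.items():
        if isinstance(value, list):
            child_lists.append(value)  # descend only after all scalars are collected
        else:
            row[key] = value
    if not any(child_lists):           # no children at all: this is the last level
        return [row]
    rows = []
    for children in child_lists:
        for child in children:
            rows.extend(flatten(child, row))
    return rows

# Hypothetical sample data shaped like the question's structure.
data = [
    {
        "l_1_k": 1,
        "l_1_ch": [
            {"l_2_k": 2, "l_2_ch": []},
            {"l_2_k": 3, "l_2_ch": []},
        ],
    },
]

rows = []
for item in data:
    rows.extend(flatten(item))
print(rows)
```

The resulting list of dicts can then be passed to the pandas DataFrame constructor.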

Modifying a nested dictionary element by a reference, generated from a list

The code:
def main():
    nested_dict = {'A': {'A_1': 'value_1', 'B_1': 'value_2'},
                   'B': 'value_3'}
    access_pattern = ['A', 'B_1']
    new_value = 'value_4'
    nested_dict[access_pattern] = new_value  # raises TypeError: unhashable type: 'list'
    return nested_dict
Background information:
As can be seen, I have a variable called nested_dict - in reality, it contains hundreds of elements with a different number of sub-elements each (I'm simplifying it for the purpose of the example).
I need to modify the value of some elements inside this dictionary, but it is not predetermined which elements exactly. The specific "path" to the elements that need be modified, will be provided by the access_pattern variable, which will be different every time.
The problem:
I know how to reference the value of the dictionary with this function functools.reduce(dict.get, access_pattern, nested_dict). However, I do not know how to universally modify (regardless of the contained variable type) the value of the access_pattern in the dictionary.
The provided code produces a TypeError that I do not know how to overcome elegantly. I did think of one solution, shown under "Possible solutions".
Possible solutions:
if len(access_pattern) == 1:
    nested_dict[access_pattern[0]] = new_value
elif len(access_pattern) == 2:
    nested_dict[access_pattern[0]][access_pattern[1]] = new_value
...
So on for all len()
This just seems VERY inelegant and painful. Is there a more practical way to achieve this?
Make use of recursion
def edit_from_access_pattern(access_pattern, nested_dict, new_value):
    if len(access_pattern) == 1:
        nested_dict[access_pattern[0]] = new_value
    else:
        return edit_from_access_pattern(access_pattern[1:], nested_dict[access_pattern[0]], new_value)
You can use recursion
def set_value(container, key, value):
    if len(key) == 1:
        container[key[0]] = value
    else:
        set_value(container[key[0]], key[1:], value)
but an explicit loop is probably going to be more efficient
def set_value(container, key, value):
    for i in range(len(key)-1):
        container = container[key[i]]
    container[key[-1]] = value
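Since the question already uses functools.reduce for reading, the same idea can be sketched for writing: walk to the parent of the target with reduce, then assign the last key. The function name set_by_path is an assumption for illustration; operator.getitem is used instead of dict.get so that a missing intermediate key raises KeyError rather than silently yielding None.

```python
from functools import reduce
import operator

def set_by_path(container, path, value):
    """Assign `value` at the nested location described by `path` (a non-empty list of keys)."""
    *parents, last = path
    # Walk down to the container that holds the final key.
    target = reduce(operator.getitem, parents, container)
    target[last] = value

nested_dict = {'A': {'A_1': 'value_1', 'B_1': 'value_2'},
               'B': 'value_3'}
set_by_path(nested_dict, ['A', 'B_1'], 'value_4')
set_by_path(nested_dict, ['B'], 'value_5')
print(nested_dict)
```

Because it only relies on indexing, the same sketch also works when some levels are lists and the path contains integer indices.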

I couldn't understand what these lines of code do

I do not understand what this part of the class does:
for file in os.listdir(path):
    if(os.path.isfile(os.path.join(path,file)) and select in file):
        temp = scipy.io.loadmat(os.path.join(path,file))
        temp = {k:v for k, v in temp.items() if k[0] != '_'}
        for i in range(len(temp[patch_type+"_patches"])):
            self.tensors.append(temp[patch_type+"_patches"][i])
            self.labels.append(temp[patch_type+"_labels"][0][i])
self.tensors = np.array(self.tensors)
self.labels = np.array(self.labels)
especially this line :
temp = {k:v for k, v in temp.items() if k[0] != '_'}
The whole class is as follows:
class Datasets(Dataset):
    def __init__(self,path,train,transform=None):
        if(train):
            select ="Training"
            patch_type = "train"
        else:
            select = "Testing"
            patch_type = "testing"
        self.tensors = []
        self.labels = []
        self.transform = transform
        for file in os.listdir(path):
            if(os.path.isfile(os.path.join(path,file)) and select in file):
                temp = scipy.io.loadmat(os.path.join(path,file))
                temp = {k:v for k, v in temp.items() if k[0] != '_'}
                for i in range(len(temp[patch_type+"_patches"])):
                    self.tensors.append(temp[patch_type+"_patches"][i])
                    self.labels.append(temp[patch_type+"_labels"][0][i])
        self.tensors = np.array(self.tensors)
        self.labels = np.array(self.labels)
    def __len__(self):
        try:
            if len(self.tensors) != len(self.labels):
                raise Exception("Lengths of the tensor and labels list are not the same")
        except Exception as e:
            print(e.args[0])
        return len(self.tensors)
    def __getitem__(self,idx):
        sample = (self.tensors[idx],self.labels[idx])
        # print(self.labels)
        sample = (torch.from_numpy(self.tensors[idx]),torch.from_numpy(np.array(self.labels[idx])).long())
        return sample
        #tuple containing the image patch and its corresponding label
It's a dict comprehension; in this particular case, it creates a new dict from an existing dict temp, but only for items for which the key k does not start with an underscore. That check is performed by the if ... part.
It is equivalent to
new = {}
for k, v in temp.items():
    if k[0] != '_':
        new[k] = v
temp = new
or, slightly different:
new = {}
for key, value in temp.items():
    if not key.startswith('_'):
        new[key] = value
temp = new
You can see that it looks a bit nicer as a single line, since it avoids a temporary dict (new; under the hood, it still creates a nameless temporary dict though).
It is filtering out the underscore-prefixed variables from the loaded MATLAB file. According to the SciPy documentation, scipy.io.loadmat returns a dictionary with the variable names from the loaded file as keys and the matrices as values. The line of code you reference is a dictionary comprehension that copies the dictionary, leaving out the entries that fail the conditional check.
Update
What happens here is roughly this:
Load a MATLAB file (file in your code) as a hashmap (dictionary) where the keys are the variable names from the file and the values are the matrices, and assign it to temp.
Iterate through those key/value pairs and drop the underscore-prefixed ones and reassign the results of that iteration to temp.
Profit
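The filtering step can be tried without a .mat file. In this sketch the dictionary is a hand-written stand-in for what scipy.io.loadmat returns: the three double-underscore keys are the metadata entries loadmat adds, while the train_patches and train_labels values are invented placeholders (plain lists rather than NumPy arrays).

```python
# Hypothetical stand-in for the result of scipy.io.loadmat(...)
loaded = {
    "__header__": b"MATLAB 5.0 MAT-file",
    "__version__": "1.0",
    "__globals__": [],
    "train_patches": [[0.1, 0.2], [0.3, 0.4]],
    "train_labels": [[0, 1]],
}

# Keep only the entries whose key does not start with an underscore.
data_only = {k: v for k, v in loaded.items() if k[0] != '_'}
print(sorted(data_only))
```

After the comprehension only the real variables remain, which is why the class can index temp by patch_type + "_patches" and patch_type + "_labels" without worrying about the metadata entries.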

function would not change the parameter as wanted

Here is my code:
def common_words(count_dict, limit):
    '''
    >>> k = {'you':2, 'made':1, 'me':1}
    >>> common_words(k,2)
    >>> k
    {'you':2}
    '''
    new_list = list(revert_dictionary(count_dict).items())[::-1]
    count_dict = {}
    for number,word in new_list:
        if len(count_dict) + len(word) <= limit:
            for x in word:
                count_dict[x] = number
    print (count_dict)
def revert_dictionary(dictionary):
    '''
    >>> revert_dictionary({'sb':1, 'QAQ':2, 'CCC':2})
    {1: ['sb'], 2: ['CCC', 'QAQ']}
    '''
    reverted = {}
    for key,value in dictionary.items():
        reverted[value] = reverted.get(value,[]) + [key]
    return reverted
count_dict = {'you':2, 'made':1, 'me':1}
common_words(count_dict,2)
print (count_dict)
What I expected is for the count_dict variable to change to {'you':2}.
It works fine in the function's print statement, but not outside the function.
The problem, as others have already written, is that your function assigns a new empty dictionary to count_dict:
count_dict = {}
When you do this you modify the local variable count_dict, but the variable with the same name in the main part of your program continues to point to the original dictionary.
You should understand that you are allowed to modify the dictionary you passed in the function argument; just don't replace it with a new dictionary. To get your code to work without modifying anything else, you can instead delete all elements of the existing dictionary:
count_dict.clear()
This modifies the dictionary that was passed to the function, deleting all its elements, which is what you intended. That said, if you are performing a new calculation it's usually a better idea to create a new dictionary in your function, and return it with return.
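The difference between rebinding the parameter and mutating the passed-in dictionary can be seen in a short sketch. The function names rebind and mutate and the sample data are assumptions for illustration.

```python
def rebind(d):
    d = {}          # the local name d now points at a brand-new dict
    d["new"] = 1    # the caller's dict is not affected

def mutate(d):
    d.clear()       # empties the very dict the caller passed in
    d["new"] = 1

original = {"you": 2, "made": 1}

rebind(original)
after_rebind = dict(original)   # snapshot: still the original contents

mutate(original)
after_mutate = dict(original)   # snapshot: contents were replaced in place

print(after_rebind, after_mutate)
```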
As already mentioned, the problem is that with count_dict = {} you are not changing the passed in dictionary, but you create a new one, and all subsequent changes are done on that new dictionary. The classical approach would be to just return the new dict, but it seems like you can't do that.
Alternatively, instead of adding the values to a new dictionary, you could reverse your condition and delete values from the existing dictionary. You can't use len(count_dict) in the condition, though, and have to use another variable to keep track of the elements already "added" to (or rather, not removed from) the dictionary.
def common_words(count_dict, limit):
    new_list = list(revert_dictionary(count_dict).items())[::-1]
    count = 0
    for number,word in new_list:
        if count + len(word) > limit:
            for x in word:
                del count_dict[x]
        else:
            count += len(word)
Also note that the dict returned from revert_dictionary is not sorted by count (at best it keeps insertion order), so the line new_list = list(revert_dictionary(count_dict).items())[::-1] is not guaranteed to give you the items from the highest count to the lowest. You might want to use sorted here and sort by the count, but I'm not sure if you actually want that.
new_list = sorted(revert_dictionary(count_dict).items(), reverse=True)
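Putting the two suggestions together (deleting from the existing dictionary and sorting by count) gives the following sketch. It reuses revert_dictionary from the question; the variable names kept and words are renamed for readability and are otherwise arbitrary.

```python
def revert_dictionary(dictionary):
    """Map each count to the list of words that have that count."""
    reverted = {}
    for key, value in dictionary.items():
        reverted[value] = reverted.get(value, []) + [key]
    return reverted

def common_words(count_dict, limit):
    """Remove, in place, the word groups that would push the total past `limit`."""
    kept = 0
    # Highest counts first; the counts are unique keys, so only they are compared.
    for number, words in sorted(revert_dictionary(count_dict).items(), reverse=True):
        if kept + len(words) > limit:
            for word in words:
                del count_dict[word]
        else:
            kept += len(words)

k = {'you': 2, 'made': 1, 'me': 1}
common_words(k, 2)
print(k)
```

Because the function only deletes keys and never reassigns count_dict, the caller's dictionary is the one that changes.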
just write
return count_dict
below
print count_dict
in function common_words()
and change
common_words(count_dict,2)
to
count_dict=common_words(count_dict,2)
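So basically you need to return the value from the function and store it in your variable. When you call a function with an argument, the parameter is a local name for the same object; assigning a new dictionary to it inside the function rebinds only that local name and leaves the caller's variable untouched.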

Python: update dictionary key with tuple values

I have a dictionary that has keys with two values each. I need to update the second value as I pass duplicate keys.
Clearly what I'm trying isn't working out.
if value1 not in dict.keys():
    dict.update({key:(value1,value2)})
else:
    dict.update({key:value1,+1)})
This just returned a dictionary with 1s for value 2 instead of incrementing by 1.
The expression +1 doesn't increment anything; it's just the number 1.
Also avoid using dict as a name, because it shadows the Python built-in.
Try structuring your code more like this:
my_dict = {} # some dict
my_key = # something
if my_key not in my_dict:
    new_value = # some new value here
    my_dict[my_key] = new_value
else:
    # First calculate what should be the new value
    # Here I'm doing a trivial new_value = old_value + 1, no tuples
    new_value = my_dict[my_key] + 1
    my_dict[my_key] = new_value
    # For tuples you can e.g. increment the second element only
    # Of course assuming you have at least 2 elements,
    # or old_value[0] and old_value[1] will fail
    old_value = my_dict[my_key] # this is a tuple
    new_value = old_value[0], old_value[1] + 1
    my_dict[my_key] = new_value
There may be shorter or smarter ways to do it, e.g. using the operator +=, but this snippet is written for clarity.
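A runnable version of the tuple case might look like the sketch below. The function name record, the starting count of 1, and the sample keys are assumptions for illustration; the question does not say what value2 starts at.

```python
def record(table, key, value1):
    """Store (value1, 1) for a new key, or bump the second element for a duplicate."""
    if key not in table:
        table[key] = (value1, 1)
    else:
        first, count = table[key]
        # Tuples are immutable, so build a new tuple and store it back.
        table[key] = (first, count + 1)

table = {}
for key, value1 in [("a", "x"), ("b", "y"), ("a", "x")]:
    record(table, key, value1)
print(table)
```

The membership test is on the key itself (key not in table), which also avoids building the keys() view on every call.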
