Python dictionary updating old entries when appending to an array - python

I have been looking at this too long. I am writing code to use known algorithms to increase the size of an ArrayList as I am adding to it. Every time the size of the ArrayList changes I add the new size to an array which is then added to a dictionary.
Here is the code for the ArrayList class I am using
class ArrayList:
    def __init__(self, growfn):
        '''Initializes the empty ArrayList with the specified growth function.'''
        self.size = 0
        self.used = 0
        self.array = []
        self.grow = growfn
    def get(self, index):
        if index < self.size:
            return self.array[index]
        # exception handling
        return None
    def add(self, value):
        if self.used == self.size:
            newSize = self.grow(self.size)
            if newSize <= self.size:
                # exception handling
                return
            newArray = self.array[:]+[None for i in range(self.size,newSize)]
            self.array = newArray
            self.size = newSize
        self.array[self.used] = value
        self.used = self.used + 1
        return None
Here is the code I have written to test this. This function takes 3 parameters: A list containing the max sizes of the array, the function for the algorithm I am using to increase the size of the ArrayList, and the number of tries that it will run for.
def analyzePerformance(nList, gfn, tries):
    data = []
    sizes_array = []
    averages = []
    arr = ArrayList(gfn)
    for num in nList:
        for j in range(tries):
            current_size = 0
            start = time()
            while arr.size < num:
                arr.add(None)
                if arr.size != current_size:
                    sizes_array.append(arr.size)
                    current_size = arr.size
            end = time()
            averages.append(end - start)
        data.append(dict({'grow': gfn, 'N': num, 'seconds': mean(averages), 'sizes': sizes_array}))
    return data
And finally the code where I call to the function
def main():
    for result in analyzePerformance([1,10,100], double, 11):
        print(result)
double is a function that simply doubles the size of the array
Here is my current output:
{'grow': <function double at 0x7f3274eb4f28>, 'N': 1, 'seconds': 3.034418279474432e-07, 'sizes': [1, 2, 4, 8, 16, 32, 64, 128]}
{'grow': <function double at 0x7f3274eb4f28>, 'N': 10, 'seconds': 4.659999500621449e-07, 'sizes': [1, 2, 4, 8, 16, 32, 64, 128]}
{'grow': <function double at 0x7f3274eb4f28>, 'N': 100, 'seconds': 7.369301535866477e-07, 'sizes': [1, 2, 4, 8, 16, 32, 64, 128]}
And here is what I need the output to be
{'grow': <function double at 0x7f3274eb4f28>, 'N': 1, 'seconds': 3.034418279474432e-07, 'sizes': [1]}
{'grow': <function double at 0x7f3274eb4f28>, 'N': 10, 'seconds': 4.659999500621449e-07, 'sizes': [1, 2, 4, 8, 16]}
{'grow': <function double at 0x7f3274eb4f28>, 'N': 100, 'seconds': 7.369301535866477e-07, 'sizes': [1, 2, 4, 8, 16, 32, 64, 128]}
Why is it that when I add new elements to the dictionary with different lengths of sizes arrays it updates the old dictionaries to have the longer arrays as well? Any help is appreciated, thanks!

You may want to look at shallow and deep copy operations. From the Python documentation for the copy module:
Assignment statements in Python do not copy objects, they create bindings between a target and an object. For collections that are mutable or contain mutable items, a copy is sometimes needed so one can change one copy without changing the other.
In your code, the same sizes_array object is reused across iterations, so every dictionary already appended to data still refers to that one list:
data.append(dict({'grow': gfn, 'N': num, 'seconds': mean(averages), 'sizes': sizes_array}))
To fix this, store a copy of the list instead:
data.append(dict({'grow': gfn, 'N': num, 'seconds': mean(averages), 'sizes': sizes_array.copy()}))
Each dictionary then holds its own snapshot of sizes_array, which gives the output you need.
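The difference between storing a reference and storing a copy can be seen in isolation. This is a minimal sketch; the names sizes, shared and snapshot are made up for the illustration and do not appear in the question's code.

```python
# Hypothetical names, used only to illustrate aliasing versus copying.
sizes = [1]
shared = {'sizes': sizes}            # stores a reference to the same list object
snapshot = {'sizes': sizes.copy()}   # stores an independent shallow copy

sizes.append(2)                      # mutate the original list afterwards

print(shared['sizes'])               # [1, 2] -> follows the mutation
print(snapshot['sizes'])             # [1]    -> unaffected
print(shared['sizes'] is sizes)      # True: both names refer to one object
```

A shallow copy is enough here because the list only holds integers; a list of nested mutable objects would need copy.deepcopy to be fully independent.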

Related

Python: Why is this recursion failing?

Why am I getting maximum recursion results of [] in this simple recursion example?
# generate data
df = pd.DataFrame({'id': [1, 2, 2, 3, 4, 5, 6, 7],
                   'parent': [np.nan, 1, 2, 2, np.nan, 1, 1, 5]})
parents = df.parent.dropna().unique().astype(int)
def find_parent(init_parent):
    init_parent = [init_parent] if isinstance(init_parent, int) else [init_parent]
    if len(init_parent) == 0:
        return init_parent
    else:
        return find_parent(df.loc[df['parent'].isin(init_parent)]['id'].tolist())
# max recursion of [] results
find_parent(parents[1])
def find_parent(init_parent):
    init_parent = [init_parent] if isinstance(init_parent, int) else [init_parent]
    if len(init_parent) == 0: # this only returns true on an empty array
        return init_parent # you're getting [] because this return
    else:
        return find_parent(df.loc[df['parent'].isin(init_parent)]['id'].tolist())
# run ops
find_parent(df.loc[df['parent'].isin(init_parent)]['id'].tolist())
Your recursion only ends once there is no parent left. You overwrite init_parent in the first line of the function, check whether it is an empty list, and then return that empty list.
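Two details of the posted code are worth noting: both branches of the conditional wrap the argument in a list, so its length can never be 0, and the sample data contains a row whose id and parent are both 2, which forms a cycle. The following is a rough sketch without pandas, assuming the goal is to collect every descendant of a starting id. The function name find_descendants and the children mapping are hypothetical; the mapping was transcribed by hand from the question's DataFrame.

```python
def find_descendants(children, start_ids, seen=None):
    """Collect ids reachable from start_ids by following parent -> child links."""
    seen = set() if seen is None else seen
    next_level = []
    for parent in start_ids:
        for child in children.get(parent, []):
            if child not in seen:        # guards against cycles such as 2 -> 2
                seen.add(child)
                next_level.append(child)
    if not next_level:                   # base case: nothing new to expand
        return []
    return next_level + find_descendants(children, next_level, seen)

# parent -> children, derived from the question's data (assumption: hand-built)
children = {1: [2, 5, 6], 2: [2, 3], 5: [7]}
print(find_descendants(children, [1]))   # [2, 5, 6, 3, 7]
```

The base case is reached because the list of newly found ids shrinks to empty once every reachable id has been seen.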

How find all pairs equal to N in a list

I have a problem with this algorithm. I have to find pairs in this list:
[4, 8, 9, 0, 12, 1, 4, 2, 12, 12, 4, 4, 8, 11, 12, 0]
that sum to 12. Once a pair has been made, its numbers (elements) cannot be used again.
For now, I have the code below. I have tried to delete numbers from the list after matching, but I suspect this causes an indexing issue.
It looks very easy, but it is still not working. ;/
class Pairs():
    def __init__(self, sum, n, arr ):
        self.sum = sum
        self.n = n
        self.arr = arr
    def find_pairs(self):
        self.n = len(self.arr)
        for i in range(0, self.n):
            for j in range(i+1, self.n):
                if (self.arr[i] + self.arr[j] == self.sum):
                    print("[", self.arr[i], ",", " ", self.arr[j], "]", sep = "")
                    self.arr.pop(i)
                    self.arr.pop(j-1)
                    self.n = len(self.arr)
                    i+=1
def Main():
    sum = 12
    arr = [4, 8, 9, 0, 12, 1, 4, 2, 12, 12, 4, 4, 8, 11, 12, 0]
    n = len(arr)
    obj_Pairs = Pairs(sum, n, arr)
    obj_Pairs.find_pairs()
if __name__ == "__main__":
    Main()
update:
Thank you guys for the fast answers!
I've tried your solutions, and unfortunately, it is still not exactly what I'm looking for. I know that the expected output should look like this: [4, 8], [0, 12], [1, 11], [4, 8], [12, 0]. So in your first solution, there is still an issue with duplicated elements, and in the second one [4, 8] and [12, 0] are missing. Sorry for not giving output at the beginning.
With this problem you need to keep track of what numbers have already been tried. Python has a Counter class that will hold the count of each of the elements present in a given list.
The algorithm I would use is:
create counter of elements in list
iterate list
for each element, check if (target - element) exists in counter and count of that item > 0
decrement count of element and (target - element)
from collections import Counter
class Pairs():
    def __init__(self, target, arr):
        self.target = target
        self.arr = arr
    def find_pairs(self):
        count_dict = Counter(self.arr)
        result = []
        for num in self.arr:
            if count_dict[num] > 0:
                difference = self.target - num
                if difference in count_dict and count_dict[difference] > 0:
                    result.append([num, difference])
                    count_dict[num] -= 1
                    count_dict[difference] -= 1
        return result
if __name__ == "__main__":
    arr = [4, 8, 9, 0, 12, 1, 4, 2, 12, 12, 4, 4, 8, 11, 12, 0]
    obj_Pairs = Pairs(12, arr)
    result = obj_Pairs.find_pairs()
    print(result)
Output:
[[4, 8], [8, 4], [0, 12], [12, 0], [1, 11]]
Demo
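One edge case the Counter approach above does not cover is a number that is exactly half the target (for example 6 with a target of 12): a single occurrence would be paired with itself. The sketch below is one possible way to guard against that; the standalone function find_pairs is an illustrative rewrite, not the answerer's code.

```python
from collections import Counter

def find_pairs(arr, target):
    """Pair up elements summing to target, using each element at most once."""
    counts = Counter(arr)
    pairs = []
    for num in arr:
        other = target - num
        if other == num:
            # a number paired with itself needs two unused occurrences
            available = counts[num] >= 2
        else:
            available = counts[num] >= 1 and counts[other] >= 1
        if available:
            pairs.append([num, other])
            counts[num] -= 1
            counts[other] -= 1
    return pairs

arr = [4, 8, 9, 0, 12, 1, 4, 2, 12, 12, 4, 4, 8, 11, 12, 0]
print(find_pairs(arr, 12))      # [[4, 8], [8, 4], [0, 12], [12, 0], [1, 11]]
print(find_pairs([6, 1, 11], 12))  # [[1, 11]]
```

Looking up a missing key in a Counter returns 0 without inserting it, so no separate membership test is needed.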
Brief
If you have learned about hashmaps and linked lists/deques, you can consider using auxiliary space to map values to their indices.
Pro:
It does make the time complexity linear.
Doesn't modify the input
Cons:
Uses extra space
Uses a different strategy from the original. If this is for a class and you haven't learned about the data structures applied then don't use this.
Code
from collections import deque # two-ended linked list
class Pairs():
    def __init__(self, sum, n, arr ):
        self.sum = sum
        self.n = n
        self.arr = arr
    def find_pairs(self):
        mp = {} # take advantage of a map of values to their indices
        res = [] # resultant pair list
        for idx, elm in enumerate(self.arr):
            if mp.get(elm, None) is None:
                mp[elm] = deque() # index list is actually a two-ended linked list
            mp[elm].append(idx) # insert this element
            comp_elm = self.sum - elm # value that matches
            if mp.get(comp_elm, None) is not None and mp[comp_elm]: # an unused match is available
                # match left->right
                res.append((comp_elm, elm))
                mp[comp_elm].popleft()
                mp[elm].pop()
        for pair in res: # Display
            print("[", pair[0], ",", " ", pair[1], "]", sep = "")
        # in case you want to do further processing
        return res
def Main():
    sum = 12
    arr = [4, 8, 9, 0, 12, 1, 4, 2, 12, 12, 4, 4, 8, 11, 12, 0]
    n = len(arr)
    obj_Pairs = Pairs(sum, n, arr)
    obj_Pairs.find_pairs()
if __name__ == "__main__":
    Main()
Output
$ python source.py
[4, 8]
[0, 12]
[4, 8]
[1, 11]
[12, 0]
To fix your code, a few remarks:
If you iterate over a list in a for loop, you shouldn't change it. Use a while loop if you want to modify the underlying list (you can rewrite this solution to use a while loop).
Because the outer loop visits each element only once, it is enough to record which indices have already been paired instead of popping them.
So the code:
class Pairs():
    def __init__(self, sum, arr ):
        self.sum = sum
        self.arr = arr
        self.n = len(arr)
    def find_pairs(self):
        used = set() # indices that already belong to a pair
        for i in range(0, self.n):
            if i in used:
                continue # arr[i] was consumed as the second half of an earlier pair
            for j in range(i+1, self.n):
                if (self.arr[i] + self.arr[j] == self.sum) and (j not in used):
                    print("[", self.arr[i], ",", " ", self.arr[j], "]", sep = "")
                    used.update((i, j))
                    break # arr[i] is paired now, so move on to the next i
def Main():
    sum = 12
    arr = [4, 8, 9, 0, 12, 1, 4, 2, 12, 12, 4, 4, 8, 11, 12, 0]
    obj_Pairs = Pairs(sum, arr)
    obj_Pairs.find_pairs()
if __name__ == "__main__":
    Main()

Index of an element in a nested list

I have been struggling with an exercise for a few days.
The following nested list was given:
[1, [5, 62, 6], 4, [99, [100, 200, 600, [1000, [2000]]]], [74, 41, 16], 7, [8], [[[400]]]]
And this function signature:
def find_element(liste, find, index = 0):
I have to find an element in the nested list, and the function should return the exact index path of the found element, for example [1,0] for 5 or [3, 1, 3, 1, 0] for 2000.
The function has to be recursive.
My problem is that the function has to return False if the element is not in the list.
This is my code:
def find_element(liste, find, index = 0):
    indexList = []
    if len(liste) == index:
        return indexList
    if liste[index] == find:
        indexList.append(index)
    else:
        if type(liste[index]) == list:
            indexList.extend(find_element(liste[index], find))
        if indexList:
            indexList.insert(0, index)
        else:
            indexList.extend(find_element(liste, find, index + 1))
    return indexList
I tried a second function that returns false when the list is empty or an if condition if the index is 0 and the indexList is empty, but all I got were a RecursionError or a TypeError.
Ajax1234's answer works, but if you need something a little more simple this may be better:
def find_idx(input_list, elem):
    for i in range(len(input_list)):
        if isinstance(input_list[i], list):
            result = find_idx(input_list[i], elem)
            if result:
                return [i] + result
        elif input_list[i] == elem:
            return [i]
    return False
input_list = [1, [5, 62, 6], 4, [99, [100, 200, 600, [1000, [2000]]]], [74, 41, 16], 7, [8], [[[400]]]]
print(find_idx(input_list, 2000))
# Output: [3, 1, 3, 1, 0]
This is basically a DFS (https://en.wikipedia.org/wiki/Depth-first_search). If you think of your data structure as a tree, your list entries are nodes since they themselves can contain other lists, just as a node in a tree can point to other nodes. The magic is in returning False if nothing was found at the very end of the method, but recursively searching all sublists before you get to that point. Also, you have to check whether your list entry is itself a list, but this is just an analogy to the fact that a tree can have nodes that do point to other nodes, and nodes that do not (leaf nodes, or plain old numbers in your case).
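For comparison, the same depth-first traversal can be written with an explicit stack instead of recursion. The exercise itself requires a recursive function, so this is only a sketch to make the tree analogy concrete; the name find_path is hypothetical.

```python
def find_path(nested, target):
    """Depth-first search with an explicit stack; returns the index path or False."""
    stack = [(nested, [])]               # (node, index path leading to it)
    while stack:
        node, path = stack.pop()
        if isinstance(node, list):
            # push children in reverse so the leftmost child is visited first
            for i in reversed(range(len(node))):
                stack.append((node[i], path + [i]))
        elif node == target:
            return path
    return False                         # every node visited, nothing matched

data = [1, [5, 62, 6], 4, [99, [100, 200, 600, [1000, [2000]]]], [74, 41, 16], 7, [8], [[[400]]]]
print(find_path(data, 2000))  # [3, 1, 3, 1, 0]
print(find_path(data, 999))   # False
```

Lists play the role of inner nodes and plain numbers are the leaves, which is why only non-list entries are compared against the target.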
You can use recursion with a generator:
def find_element(l, elem):
    def get_elem(d, c = []):
        for i, a in enumerate(d):
            if a == elem:
                yield c+[i]
            elif isinstance(a, list):
                yield from get_elem(a, c+[i])
    return False if not (r:=list(get_elem(l))) else r[0]
data = [1, [5, 62, 6], 4, [99, [100, 200, 600, [1000, [2000]]]], [74, 41, 16], 7, [8], [[[400]]]]
print(find_element(data, 2000))
Output:
[3, 1, 3, 1, 0]
I agree generators are a nice fit for this problem. I would separate the program logic into two separate functions, dfs and find_element -
def dfs(ls, r = []):
    if isinstance(ls, list):
        for (i, v) in enumerate(ls):
            yield from dfs(v, [*r, i])
    else:
        yield (r, ls)
def find_element(ls, q):
    for (k, v) in dfs(ls):
        if v == q:
            return k
    return None
print(find_element(input, 5))
# [1, 0]
print(find_element(input, 2000))
# [3, 1, 3, 1, 0]
print(find_element(input, 999))
# None
Or you could fix your original program using a fourth parameter, r = [] -
def find_element(ls, q, i = 0, r = []):
    if i >= len(ls):
        return None
    elif isinstance(ls[i], list):
        return find_element(ls[i], q, 0, [*r, i]) \
            or find_element(ls, q, i + 1, r)
    elif ls[i] == q:
        return [*r, i]
    else:
        return find_element(ls, q, i + 1, r)
print(find_element(input, 5))
# [1, 0]
print(find_element(input, 2000))
# [3, 1, 3, 1, 0]
print(find_element(input, 999))
# None

How do you change a specific element of a set?

In this code, I am trying to compare the value of a set that has been looped each time to a value passed (in this case a) in a parameter. What's interesting though is that it shows when I use a for each loop that each element is an integer. How do I get a integer to integer comparison without a console error?
def remove(s,a,b):
    c=set()
    c=s
    for element in c:
        element=int(element)
        if(element<a or element>b):
            c.discard(element)
    return c
def main():
    remove({3, 17, -1, 4, 9, 2, 14}, 1, 10)
main()
Output:
if(element<=a or element>=b):
TypeError: '>=' not supported between instances of 'int' and 'set'
You reassign your local variable b:
def remove(s,a,b):
    b=set() # now b is no longer the b you pass in, but an empty set
    b=s # now it is the set s that you passed as an argument
    # ...
    if(... element > b): # so the comparison must fail: int > set ??
Short implementation using a set comprehension:
def remove(s, a, b):
    return {x for x in s if a <= x <= b}
>>> remove({3, 17, -1, 4, 9, 2, 14}, 1, 10)
{9, 2, 3, 4}
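If the set really has to be changed in place, note that discarding from a set while iterating over that same set typically raises "RuntimeError: Set changed size during iteration" in CPython. A common workaround is to iterate over a snapshot. The sketch below assumes that in-place mutation is wanted; the name remove_outside and the parameters low and high are hypothetical.

```python
def remove_outside(s, low, high):
    """Discard, in place, every element of s outside the range [low, high]."""
    for element in list(s):          # iterate over a snapshot, mutate the real set
        if element < low or element > high:
            s.discard(element)
    return s

values = {3, 17, -1, 4, 9, 2, 14}
remove_outside(values, 1, 10)
print(sorted(values))                # [2, 3, 4, 9]
```

The set comprehension above is usually the simpler choice, since it builds a new set and leaves the argument untouched.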
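If you want an int-to-int comparison, copy s into a list under a different name so that b keeps the integer value you passed in.
def remove(s,a,b):
    kept = list(s) # separate name, so b is still the upper bound
    for element in s:
        element=int(element)
        if(element< a or element > b):
            kept.remove(element)
    return kept
def main():
    remove({3, 17, -1, 4, 9, 2, 14}, 1, 10)
main()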
Come on, why don't we make the code shorter?
Try this:
def remove(s, a, b):
    return s.difference(filter(lambda x: not int(a) < int(x) < int(b), s))
def main():
    new_set = remove({3, 17, -1, 4, 9, 2, 14}, 1, 10)
    # {2, 3, 4, 9}
    print(new_set)
main()

Python - summing and grouping through a list

I have a big list of numbers like so:
a = [133000, 126000, 123000, 108000, 96700, 96500, 93800,
93200, 92100, 90000, 88600, 87000, 84300, 82400, 80700,
79900, 79000, 78800, 76100, 75000, 15300, 15200, 15100,
8660, 8640, 8620, 8530, 2590, 2590, 2580, 2550, 2540, 2540,
2510, 2510, 1290, 1280, 1280, 1280, 1280, 951, 948, 948,
947, 946, 945, 609, 602, 600, 599, 592, 592, 592, 591, 583]
What I want to do is cycle through this list one by one checking if a value is above a certain threshold (for example 40000). If it is above this threshold we put that value in a new list and forget about it. Otherwise we wait until the sum of the values is above the threshold and when it is we put the values in a list and then continue cycling. At the end, if the final values don't sum to the threshold we just add them to the last list.
If I'm not being clear consider the simple example, with the threshold being 15
[20, 10, 9, 8, 8, 7, 6, 2, 1]
The final list should look like this:
[[20], [10, 9], [8, 8], [7, 6, 2, 1]]
I'm really bad at maths and python and I'm at my wits end. I have some basic code I came up with but it doesn't really work:
def sortthislist(list):
    list = a
    newlist = []
    for i in range(len(list)):
        while sum(list[i]) >= 40000:
            newlist.append(list[i])
    return newlist
Any help at all would be greatly appreciated. Sorry for the long post.
The function below will accept your input list and some limit to check and then output the sorted list:
a = [20, 10, 9, 8, 8, 7, 6, 2, 1]
def func(a, lim):
    out = []
    temp = []
    for i in a:
        if i > lim:
            out.append([i])
        else:
            temp.append(i)
            if sum(temp) > lim:
                out.append(temp)
                temp = []
    return out
print(func(a, 15))
# [[20], [10, 9], [8, 8], [7, 6, 2, 1]]
With Python you can iterate over the list itself rather than over its indices, which is why I use for i in a rather than for i in range(len(a)).
Within the function out is the list that you want to return at the end; temp is a temporary list that is populated with numbers until the sum of temp exceeds your lim value, at which point this temp is then appended to out and replaced with an empty list.
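The question also asks that trailing values which never reach the threshold be added to the last list. The function above returns without them when the final temp does not exceed lim. A possible variant that handles this case is sketched below; the name group_by_threshold and the merge-into-last-group behaviour are my reading of the question, not part of the original answer.

```python
def group_by_threshold(values, limit):
    """Group values so each group sums above limit; leftovers join the last group."""
    groups = []
    pending = []
    for value in values:
        if value > limit:
            groups.append([value])       # large values form their own group
            continue
        pending.append(value)
        if sum(pending) > limit:
            groups.append(pending)
            pending = []
    if pending:                          # values that never crossed the limit
        if groups:
            groups[-1].extend(pending)
        else:
            groups.append(pending)
    return groups

print(group_by_threshold([20, 10, 9, 8, 8, 7, 6, 2, 1], 15))
# [[20], [10, 9], [8, 8], [7, 6, 2, 1]]
print(group_by_threshold([20, 10, 9, 3], 15))
# [[20], [10, 9, 3]]
```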
def group(L, threshold):
    answer = []
    start = 0
    sofar = L[0]
    for i,num in enumerate(L[1:],1):
        if sofar >= threshold:
            answer.append(L[start:i])
            sofar = L[i]
            start = i
        else:
            sofar += L[i]
    if i<len(L) and sofar>=threshold:
        answer.append(L[i:])
    return answer
Output:
In [4]: group([20, 10, 9, 8, 8, 7, 6, 2, 1], 15)
Out[4]: [[20], [10, 9], [8, 8], [7, 6, 2]]
Hope this helps :)
vlist = [20, 10,3,9, 7,6,5,4]
threshold = 15
result = []
tmp = []
for v in vlist:
    if v > threshold:
        tmp.append(v)
        result.append(tmp)
        tmp = []
    elif sum(tmp) + v > threshold:
        tmp.append(v)
        result.append(tmp)
        tmp = []
    else:
        tmp.append(v)
if tmp != []:
    result.append(tmp)
Here is the result:
[[20], [10, 3, 9], [7, 6, 5], [4]]
Here's yet another way:
def group_by_sum(a, lim):
    out = []
    group = None
    for i in a:
        if group is None:
            group = []
            out.append(group)
        group.append(i)
        if sum(group) > lim:
            group = None
    return out
print(group_by_sum(a, 15))
We already have plenty of working answers, but here are two other approaches.
We can use itertools.groupby to collect such groups, given a stateful accumulator that understands the contents of the group. We end up with a set of (key,group) pairs, so some additional filtering gets us only the groups. Additionally since itertools provides iterators, we convert them to lists for printing.
from itertools import groupby
class Thresholder:
    def __init__(self, threshold):
        self.threshold=threshold
        self.sum=0
        self.group=0
    def __call__(self, value):
        if self.sum>self.threshold:
            self.sum=value
            self.group+=1
        else:
            self.sum+=value
        return self.group
print([list(g) for k,g in groupby([20, 10, 9, 8, 8, 7, 6, 2, 1], Thresholder(15))])
The operation can also be done as a single reduce call:
from functools import reduce # reduce is no longer a builtin in Python 3
def accumulator(result, value):
    last=result[-1]
    if sum(last)>threshold:
        result.append([value])
    else:
        last.append(value)
    return result
threshold=15
print(reduce(accumulator, [20, 10, 9, 8, 8, 7, 6, 2, 1], [[]]))
This version scales poorly to many values due to the repeated call to sum(), and the global variable for the threshold is rather clumsy. Also, calling it for an empty list will still leave one empty group.
Edit: The question logic demands that values above the threshold get put in their own groups (not sharing with collected smaller values). I did not think of that while writing these versions, but the accepted answer by Ffisegydd handles it. There is no effective difference if the input data is sorted in descending order, as all the sample data appears to be.
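The drawbacks mentioned for the reduce version (repeated sum() calls, a global threshold, an empty group for empty input) can be avoided by tracking a running total in a generator. This is a sketch under the same simplification as the two versions above, namely that large values are not forced into their own groups; the name iter_groups is hypothetical.

```python
def iter_groups(values, threshold):
    """Yield consecutive groups whose running total exceeds threshold."""
    group, total = [], 0
    for value in values:
        group.append(value)
        total += value                   # running total instead of sum(group)
        if total > threshold:
            yield group
            group, total = [], 0
    if group:                            # trailing values below the threshold
        yield group

print(list(iter_groups([20, 10, 9, 8, 8, 7, 6, 2, 1], 15)))
# [[20], [10, 9], [8, 8], [7, 6, 2, 1]]
print(list(iter_groups([], 15)))
# []
```

Each value is added to the total once, so the work grows linearly with the length of the input.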
