Summation from sub list - python

If n = 4 and m = 3, I have to select 4 elements (n elements, in general) from both the start and the end of a list. In the example below, those sublists are [17,12,10,2] and [2,11,20,8].
Then, between these two sublists, I have to select the highest-value element, which is then deleted from the original list.
This step has to be performed m times, taking the sum of the selected highest-value elements.
A = [17,12,10,2,7,2,11,20,8], n = 4, m = 3
O/P: 20+17+12=49
I have written the following code. However, its performance is poor and it times out for larger lists. Could you please help?
A = [17,12,10,2,7,2,11,20,8]
m = 3
n = 4
scoreSum = 0
count = 0
firstGrp = []
lastGrp = []
while count < m:
    firstGrp = A[:n]
    lastGrp = A[-n:]
    maxScore = max(max(firstGrp), max(lastGrp))
    scoreSum = scoreSum + maxScore
    if maxScore in firstGrp:
        A.remove(maxScore)
    else:
        # remove the last occurrence of maxScore (it came from the tail slice)
        ai = len(A) - 1 - A[::-1].index(maxScore)
        A.pop(ai)
    count = count + 1
    firstGrp.clear()
    lastGrp.clear()
print(scoreSum)

I would like to do it this way; you can generalize it later:
a = [17,12,10,2,7,2,11,20,8]
a.sort(reverse=True)
sums = 0
for i in range(3):
    sums += a[i]
print(sums)

If you are concerned about performance, you should use specialized libraries like numpy. This will be much faster!

A = [17,12,10,2,7,2,11,20,8]
n = 4
m = 3
score = 0
for _ in range(m):
    sublist = A[:n] + A[-n:]
    subidx = list(range(n)) + list(range(len(A) - n, len(A)))
    sub = zip(sublist, subidx)
    maxval = max(sub, key=lambda x: x[0])
    score += maxval[0]
    del A[maxval[1]]
print(score)
Your method makes many max() calls. Combining the slices of the front and back lists reduces that to a single max() search per round, plus one index lookup to find where the maximum occurs so it can be removed from the list.
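The same idea can also be packaged as a small function; this is an illustrative sketch (the function name is mine, not from the original) that searches over candidate indices directly, so each round needs only one max() call:

```python
def top_m_from_ends(A, n, m):
    # Repeat m times: consider the first n and last n positions,
    # take the largest value among them, remove it, and accumulate.
    A = list(A)  # work on a copy so the caller's list is untouched
    total = 0
    for _ in range(m):
        candidates = list(range(n)) + list(range(len(A) - n, len(A)))
        best = max(candidates, key=lambda i: A[i])  # index of the largest candidate
        total += A[best]
        del A[best]
    return total

print(top_m_from_ends([17, 12, 10, 2, 7, 2, 11, 20, 8], n=4, m=3))  # 49
```

Deleting from the middle of a list is still O(len(A)) per round, so for very large inputs a different data structure would be needed, but this avoids the repeated slicing and nested max() calls.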


Making the hamming distance between two strings in a list at most 3

I have a randomly generated list of values in a list (z), so what I did is take two adjacent indexes and convert them to separate strings to compare against each other. I need to make it so that the Hamming distance is at most 3 between all strings in the list. I also can't use any modules for this.
Any help would be appreciated.
z = ["AAATCG", "GAGCGT"]
i = 0
s1 = ""
s2 = ""
while i < len(z) - 1:
    s1 = z[i]
    i = i + 1
    s2 = z[i]
after that I'm lost
You're better off using a for loop. In the code below z[0] = "AAATCG" and z[1] = "GAGCGT". The if statement checks whether the letters in the strings differ; if they do, ham_dist is incremented by 1.
# For the Hamming distance
ham_dist = 0
z = ["AAATCG", "GAGCGT"]
for idx in range(len(z[0])):
    if z[0][idx] != z[1][idx]:
        ham_dist += 1
print(ham_dist)
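For reference, the same count can be written in one line with zip, assuming the two strings have equal length:

```python
def hamming(s1, s2):
    # Counts positions where corresponding characters differ;
    # assumes len(s1) == len(s2) (zip silently truncates otherwise).
    return sum(c1 != c2 for c1, c2 in zip(s1, s2))

print(hamming("AAATCG", "GAGCGT"))  # 5
```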
If you want to edit the strings to reduce the Hamming distance to zero, then the following code edits the first string to replicate the second. If you want it the other way around, just reverse the indices.
z = ["AAATCG", "GAGCGT"]
z_0_list = list(z[0])
z_1_list = list(z[1])
orig_ham_dist = 0
new_ham_dist = 0
z_output = []
# Calculate original Hamming distance & edit strings
for idx in range(len(z[0])):
    if z_0_list[idx] != z_1_list[idx]:
        z_0_list[idx] = z_1_list[idx]
        orig_ham_dist += 1
z_output.append("".join(z_0_list))
z_output.append("".join(z_1_list))
# Calculate new Hamming distance
for idx in range(len(z_output[0])):
    if z_output[0][idx] != z_output[1][idx]:
        new_ham_dist += 1
print(orig_ham_dist)
print(z)
print('-------------------------')
print(new_ham_dist)
print(z_output)

Split "weighted" list/array into equal size chunks

I have an array of items with a weight assigned to each item. I want to split it into equal-sized chunks of approximately equal cumulative weight. There is an answer here that does this using numpy: https://stackoverflow.com/a/33555976/10690958
Is there a simple way to accomplish this in pure Python?
Example array:
[ ['bob',12],
  ['jack',6],
  ['jim',33],
  ....
]
or
a, 11
b, 2
c, 5
d, 3
e, 3
f, 2
Here the correct output would be (assuming 2 chunks needed)
[a,11],[b,2] - cumulative weight of 13
and
[c,5],[d,3],[e,3],[f,2] - cumulative weight of 13
To further clarify the question, imagine sorting 100 people into 10 elevators, where we want each elevator to carry approximately the same total weight (the sum of the weights of all people in that elevator). The first list would then be names and weights. It's a load-balancing problem.
You just have to mimic cumsum: build a list of running sums of the weights; the last entry is the total weight. Then scan that cumulative list and start a new chunk each time you reach total_weight/number_of_chunks. Code could be:
def split(w_list, n_chunks):
    # mimic a cumsum
    s = [0, []]
    for i in w_list:
        s[0] += i[1]
        s[1].append(s[0])
    # now scan the list, populating the chunks
    index = 0
    splitted = []
    chunk = 1
    stop = s[0] / n_chunks
    for i in range(len(w_list)):
        # print(stop, s[1][i])  # uncomment for traces
        if s[1][i] >= stop:  # reached a stop?
            splitted.append(w_list[index:i+1])  # register a new chunk
            index = i + 1
            chunk += 1
            if chunk == n_chunks:  # ok, we can stop
                break
            stop = s[0] * chunk / n_chunks  # next stop
    splitted.append(w_list[index:])  # do not forget the last chunk
    return splitted
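To sanity-check the chunking logic, the function can be exercised on the question's second example; it is repeated here in full (with the same algorithm) so the snippet runs standalone:

```python
def split(w_list, n_chunks):
    # running cumulative sum of the weights; `total` ends as the grand total
    total = 0
    cum = []
    for _, w in w_list:
        total += w
        cum.append(total)
    # scan the cumulative list, cutting a chunk each time a stop is reached
    splitted = []
    index = 0
    chunk = 1
    stop = total / n_chunks
    for i in range(len(w_list)):
        if cum[i] >= stop:
            splitted.append(w_list[index:i + 1])
            index = i + 1
            chunk += 1
            if chunk == n_chunks:
                break
            stop = total * chunk / n_chunks
    splitted.append(w_list[index:])  # last chunk takes the remainder
    return splitted

data = [('a', 11), ('b', 2), ('c', 5), ('d', 3), ('e', 3), ('f', 2)]
chunks = split(data, 2)
print(chunks)
# [[('a', 11), ('b', 2)], [('c', 5), ('d', 3), ('e', 3), ('f', 2)]]
```

Each chunk sums to 13, matching the expected output in the question.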
You need something like this split:
array = [['bob', 12],
         ['jack', 6],
         ['jim', 33],
         ['bob2', 1],
         ['jack2', 16],
         ['jim2', 3],
         ['bob3', 7],
         ['jack3', 6],
         ['jim3', 1],
         ]
array = sorted(array, key=lambda pair: pair[1])
summ = sum(pair[1] for pair in array)
chunks = 4
splmitt = summ // chunks
print(array)
print(summ)
print(splmitt)

def split(array, split):
    splarr = []
    tlist = []
    summ = 0
    for pair in array:
        summ += pair[1]
        tlist.append(pair)
        if summ > split:
            splarr.append(tlist)
            tlist = []
            summ = 0
    if tlist:
        splarr.append(tlist)
    return splarr

spl = split(array, splmitt)
import pprint
pprint.pprint(spl)

Summing results from a monte carlo

I am trying to sum the values in the 'Callpayoff' list, but I am unable to do so; print(Callpayoff) returns a vertical list:
0
4.081687878300656
1.6000410648454846
0.5024316862043037
0
so I wonder if it's a special sublist? sum(Callpayoff) does not work, unfortunately. Any help would be greatly appreciated.
import numpy as np
import matplotlib.pyplot as plt
from math import sqrt

def Generate_asset_price(S, v, r, dt):
    return 1 + r * dt + v * sqrt(dt) * np.random.normal(0, 1)

def Call_Poff(S, T):
    return max(stream[-1] - S, 0)  # note: reads the global `stream`

# initial values
S = 100
v = 0.2
r = 0.05
T = 1
N = 2  # number of steps
dt = 0.00396825
simulations = 5
for x in range(simulations):
    stream = [100]
    Callpayoffs = []
    t = 0
    for n in range(N):
        s = stream[t] * Generate_asset_price(S, v, r, dt)
        stream.append(s)
        t += 1
    Callpayoff = Call_Poff(S, T)
    print(Callpayoff)
    plt.plot(stream)
Right now you're not appending values to a list; you're just replacing the value of Callpayoff at each iteration and printing it. Since each value is printed on a new line, it looks like a "vertical list".
What you need to do is use Callpayoffs.append(Call_Poff(S,T)) instead of Callpayoff = Call_Poff(S,T).
Now a new element will be added to Callpayoffs at every iteration of the for loop.
Then you can print the list with print(Callpayoffs) or the sum with print(sum(Callpayoffs))
All in all the for loop should look like this:
for x in range(simulations):
    stream = [100]
    Callpayoffs = []
    t = 0
    for n in range(N):
        s = stream[t] * Generate_asset_price(S, v, r, dt)
        stream.append(s)
        t += 1
        Callpayoffs.append(Call_Poff(S, T))
    print(Callpayoffs, "sum:", sum(Callpayoffs))
Output:
[2.125034975231003, 0] sum: 2.125034975231003
[0, 0] sum: 0
[0, 0] sum: 0
[0, 0] sum: 0
[3.2142923036024342, 4.1390018820809615] sum: 7.353294185683396
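If numpy and matplotlib aren't available, the whole experiment can be sketched with the standard library only. This is an illustrative, self-contained variant (random.gauss stands in for np.random.normal, and the helper names are mine, not from the original):

```python
import math
import random

def step_price(prev, v, r, dt):
    # one step of the same discretised price process as above,
    # with the multiplication by the previous price folded in
    return prev * (1 + r * dt + v * math.sqrt(dt) * random.gauss(0, 1))

def call_payoff(final_price, strike):
    # payoff of a European call: max(S_T - K, 0)
    return max(final_price - strike, 0)

random.seed(42)  # fixed seed for reproducible runs
S, v, r, dt, N, simulations = 100, 0.2, 0.05, 0.00396825, 2, 5
payoffs = []
for _ in range(simulations):
    price = S
    for _ in range(N):
        price = step_price(price, v, r, dt)
    payoffs.append(call_payoff(price, S))
print(payoffs, "sum:", sum(payoffs))
```

Collecting one payoff per simulated path into a plain list makes sum(payoffs) (or an average, for a price estimate) straightforward.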

How do you compare two randomly generated lists to find elements in common

I'm writing a program that writes two lists of five numbers between 0-9. I need the program to compare the lists to find how many numbers the lists have in common, but I'm not sure how.
Edit: I just need the program to write the amount of numbers in common, I don't need the numbers themselves
import random
X = 0
while X < 2000:
    House_deal = [random.randint(0,9) for House_deal in range(5)]
    Player_deal = [random.randint(0,9) for House_deal in range(5)]
    print("House numbers: ", House_deal)
    print("Player numbers: ", Player_deal)
    print(" ")
    X += 1
Assuming we have l_one and l_two for two randomly generated lists.
counter = [0] * 10
answer = 0
for x in l_one:
    counter[x] += 1
for x in l_two:
    if counter[x] > 0:
        counter[x] -= 1
        answer += 1
print(answer)
This algorithm works in O(n) compared to O(n^2) solutions posted before.
A solution similar to others already posted, but using Counter:
import random
from collections import Counter
for _ in range(2000):
    House_deal = [random.randint(0,9) for _ in range(5)]
    Player_deal = [random.randint(0,9) for _ in range(5)]
    hc = Counter(House_deal)
    pc = Counter(Player_deal)
    common = hc.keys() & pc.keys()  # get the intersection of both keys
    counts = 0
    for cel in common:
        counts += min(hc[cel], pc[cel])
    print("House numbers: ", House_deal)
    print("Player numbers: ", Player_deal)
    print("Common numbers: ", counts)
I also changed the while loop into a for loop.
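Incidentally, the per-key minimum in that loop is exactly what Counter's & operator computes (multiset intersection), so the count can also be obtained directly:

```python
from collections import Counter

def common_count(a, b):
    # Counter & Counter keeps each key with the minimum of the two counts,
    # so summing the values counts shared elements with multiplicity.
    return sum((Counter(a) & Counter(b)).values())

print(common_count([1, 1, 2, 3], [1, 2, 2, 4]))  # 2
```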
You could define a function like the ones here and use the len() function to get the number of elements in the list. The functions cast the lists to sets and take their intersection, which contains all the elements in common. That way you can also see the elements that are in common between them if you decide you want to, or you can even modify this to subtract one list from the other to get the elements they don't have in common, like here. Let me know if this is what you're looking for.
You can use a function as follows:
def compare(list1, list2):
    common_list = []
    for i in list1:
        if i in list2:
            common_list.append(i)
    print(len(common_list))  # Number of common elements
It works:
import random

list_one = random.sample(range(9), 4)
list_two = random.sample(range(9), 4)
print(list_one)
print(list_two)
count = 0
for item_one in list_one:
    for item_two in list_two:
        if item_one == item_two:
            count += 1
print(count)
P.S.: The inner loop for item_two in list_two: could be replaced by if item_one in list_two:, but since you seem to be a beginner, I kept it more explicit.
A variant of the previous answer.
counter_1 = {k: 0 for k in l_one}
counter_2 = {k: 0 for k in l_two}
answer = 0
for x in l_one:
    counter_1[x] = counter_1[x] + 1
for x in l_two:
    counter_2[x] = counter_2[x] + 1
for x, v1 in counter_1.items():
    v2 = counter_2.get(x, 0)
    answer = answer + min(v1, v2)
print(answer)
You can use the set data type in Python, and do something like
In [1]: from random import randint
In [2]: import random
In [3]: random.seed(123)
In [4]: x = set([randint(0,9) for x in range(5)])
In [5]: y = set([randint(0,9) for x in range(5)])
In [6]: x
Out[6]: {0, 1, 4, 6}
In [7]: y
Out[7]: {0, 1, 6, 8}
In [8]: cnt = len(x & y) # return the amount of numbers in common, e.g. {0, 1, 6}
In [9]: cnt
Out[9]: 3
Hope it helps.

Efficient algorithm for counting unique elements in "suffixes" of an array

I was doing 368B on CodeForces with Python 3, which basically asks you to print the numbers of unique elements in a series of "suffixes" of a given array. Here's my solution (with some additional redirection code for testing):
import sys

if __name__ == "__main__":
    f_in = open('b.in', 'r')
    original_stdin = sys.stdin
    sys.stdin = f_in
    n, m = [int(i) for i in sys.stdin.readline().rstrip().split(' ')]
    a = [int(i) for i in sys.stdin.readline().rstrip().split(' ')]
    l = [None] * m
    for i in range(m):
        l[i] = int(sys.stdin.readline().rstrip())
    l_sorted = sorted(l)
    l_order = sorted(range(m), key=lambda k: l[k])
    # the ranks of elements in l
    l_rank = sorted(range(m), key=lambda k: l_order[k])
    # unique_elem[i] = non-duplicated elements between l_sorted[i] and l_sorted[i+1]
    unique_elem = [None] * m
    for i in range(m):
        unique_elem[i] = set(a[(l_sorted[i] - 1):(l_sorted[i + 1] - 1)]) if i < m - 1 else set(a[(l_sorted[i] - 1):n])
    # unique_elem_cumulative = non-duplicated elements between l_sorted[i] and a's end
    unique_elem_cumulative = unique_elem[-1]
    # unique_elem_cumulative_count[i] = len of the cumulative set at step i
    unique_elem_cumulative_count = [None] * m
    unique_elem_cumulative_count[-1] = len(unique_elem[-1])
    for i in range(m - 1):
        i_rev = m - i - 2
        unique_elem_cumulative = unique_elem[i_rev] | unique_elem_cumulative
        unique_elem_cumulative_count[i_rev] = len(unique_elem_cumulative)
    with open('b.out', 'w') as f_out:
        for i in range(m):
            idx = l_rank[i]
            f_out.write('%d\n' % unique_elem_cumulative_count[idx])
    sys.stdin = original_stdin
    f_in.close()
The code gives correct results except possibly for the last big test, with n = 81220 and m = 48576 (a simulated input file is here, and an expected output created by a naive solution is here). The time limit is 1 second, within which I can't solve the problem. So is it possible to solve it within 1 second with Python 3? Thank you.
UPDATE: an "expected" output file is added, which is created by the following code:
import sys

if __name__ == "__main__":
    f_in = open('b.in', 'r')
    original_stdin = sys.stdin
    sys.stdin = f_in
    n, m = [int(i) for i in sys.stdin.readline().rstrip().split(' ')]
    a = [int(i) for i in sys.stdin.readline().rstrip().split(' ')]
    with open('b_naive.out', 'w') as f_out:
        for i in range(m):
            l_i = int(sys.stdin.readline().rstrip())
            f_out.write('%d\n' % len(set(a[l_i - 1:])))
    sys.stdin = original_stdin
    f_in.close()
You'll be cutting it close, I think. On my admittedly rather old machine, the I/O alone takes 0.9 seconds per run.
An efficient algorithm, I think, will be to iterate backwards through the array, keeping track of which distinct elements you've found. When you find a new element, add its index to a list. This will therefore be a descending sorted list.
Then for each li, the index of li in this list will be the answer.
For the small sample dataset
10 10
1 2 3 4 1 2 3 4 100000 99999
1
2
3
4
5
6
7
8
9
10
The list would contain [10, 9, 8, 7, 6, 5] since when reading from the right, the first distinct value occurs at index 10, the second at index 9, and so on.
So then if li = 5, it has index 6 in the generated list, so 6 distinct values are found at indices >= li. Answer is 6
If li = 8, it has index 3 in the generated list, so 3 distinct values are found at indices >= li. Answer is 3
It's a little fiddly that the exercise numbers 1-indexed while Python counts 0-indexed.
And to find this index quickly using existing library functions, I've reversed the list and then use bisect.
import timeit
from bisect import bisect_left

def doit():
    f_in = open('b.in', 'r')
    n, m = [int(i) for i in f_in.readline().rstrip().split(' ')]
    a = [int(i) for i in f_in.readline().rstrip().split(' ')]
    found = {}
    indices = []
    for i in range(n - 1, -1, -1):  # scan backwards, including index 0
        if a[i] not in found:
            indices.append(i + 1)
            found[a[i]] = True
    indices.reverse()
    length = len(indices)
    for i in range(m):
        l = int(f_in.readline().rstrip())
        index = bisect_left(indices, l)
        print(length - index)

if __name__ == "__main__":
    print(timeit.timeit('doit()', setup="from __main__ import doit", number=10))
On my machine this outputs 12 seconds for 10 runs. Still too slow.
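To check the idea without the file redirection, the core of that answer can be restated as a self-contained function over an in-memory list (the function name is mine), which makes it easy to verify against the small sample:

```python
from bisect import bisect_left

def suffix_distinct_counts(a, queries):
    # 1-based positions of the first occurrence of each distinct value
    # when scanning from the right, collected in ascending order.
    seen = set()
    firsts = []
    for i in range(len(a) - 1, -1, -1):
        if a[i] not in seen:
            seen.add(a[i])
            firsts.append(i + 1)
    firsts.reverse()
    # For a query l, every recorded position >= l marks one distinct
    # value in the suffix a[l-1:], so count them with a binary search.
    return [len(firsts) - bisect_left(firsts, l) for l in queries]

a = [1, 2, 3, 4, 1, 2, 3, 4, 100000, 99999]
print(suffix_distinct_counts(a, list(range(1, 11))))
# [6, 6, 6, 6, 6, 5, 4, 3, 2, 1]
```

This matches the worked examples above: query l = 5 gives 6 and l = 8 gives 3.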
