The loop goes through a list of strings, each holding three numbers. I want to use map to accumulate, in another list, the sum of every element[0] in its element[0], the sum of every element[1] in its element[1], and so on.
result_list = []
sub1 = sub2 = sub3 = 0  # these 3 should be only indexes 0, 1, 2 of the list above
for item in r:
    l = item.split(';')  # originally l = '34;56;78'
    q = list(map(float, l))  # q is the list of 3 elements
    # instead of the code below I want to have something like
    # result_list = list(map( sum( q(item), result_list)
    sub1 += q[0]
    sub2 += q[1]
    sub3 += q[2]
Input:
l = [['1;2;3'], ['10;20;30'], ['12;34;56']]
result_list must aggregate the sum of all element[0] in each list to result_list[0].
Output
result_list[0] = 1+ 10 + 12
result_list[1] = 2 + 20 + 34
result_list[2] = 3 + 30 + 56
r is the data below; I omit the names and calculate the average of each 'column'.
Bawerman;55;79;50
Baldwin;83;62;72
Owen;94;86;65
Watson;92;79;100
Clifford;33;99;47
Murphy;94;87;53
Shorter;83;61;61
Bishop;27;89;41
This is one approach.
Ex:
l = [["1;2;3"], ["10;20;30"], ["12;34;56"]]
result_list = []
l = [list(map(float, j.split(";"))) for i in l for j in i]
for i in zip(*l):
    result_list.append(sum(i))
print(result_list)
Output:
[23.0, 56.0, 89.0]
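If you want to stay close to the loop in the question, a running total can also be kept with map and operator.add. This is only a sketch: it assumes every row has exactly three fields, and the name rows is made up for the example.

```python
from operator import add

# Hypothetical input: one 'a;b;c' string per row, as in the question.
rows = ['1;2;3', '10;20;30', '12;34;56']

result_list = [0.0, 0.0, 0.0]  # one running total per column
for item in rows:
    q = list(map(float, item.split(';')))
    # add each parsed value to the running total at the same position
    result_list = list(map(add, result_list, q))

print(result_list)  # [23.0, 56.0, 89.0]
```

Passing two iterables to map pairs them up position by position, which is what replaces the separate sub1, sub2 and sub3 counters.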
You could do something like this, assuming each element of l is a list of one string:
l = [['1;2;3'], ['10;20;30'], ['12;34;56']]
numbers = (map(float, e.split(';')) for e, in l)
result = [sum(n) for n in zip(*numbers)]
print(result)
Output
[23.0, 56.0, 89.0]
A one-liner can do the job.
If you need first to parse the strings with the numbers:
l = [[int(i) for i in e[0].split(';')] for e in l]
And after that, just:
result = list(map(sum, zip(*l)))  # list() is needed on Python 3, where map is lazy
csv.reader + zip + statistics.mean
I omit names and calculate average of each 'column'
You don't need to construct a large list of lists from your data. You can use an iterator and use sequence unpacking with zip. To calculate the mean, you can use statistics.mean:
from io import StringIO
from statistics import mean
import csv
x = StringIO("""Bawerman;55;79;50
Baldwin;83;62;72
Owen;94;86;65
Watson;92;79;100
Clifford;33;99;47
Murphy;94;87;53
Shorter;83;61;61
Bishop;27;89;41""")
# replace x with open('file.csv', 'r')
with x as fin:
    reader = csv.reader(fin, delimiter=';')
    zipper = zip(*reader)
    next(zipper)  # ignore labels
    res = [mean(map(float, col)) for col in zipper]
print(res)
# [70.125, 80.25, 61.125]
I have a ListA = [1,2,3,3,2,1,3,3,1,3]
The 3s are in a sequence of 2,2,1 (2 instances, 2 instances, and 1 instance at the end)
I want to generate a ListB = [2,2,1] from the above ListA
I have code:
ls = [1,2,3,3,2,1,3,3,2,3]
pplist = []
pp = 0
for ii in range(0,len(ls)):
    if ls[ii] == 3:
        pp += 1
    else:
        if pp > 0:
            pplist.append(pp)
        pp = 0
print(pplist)
This gives me [2,2]; to capture the last run I have to add an additional if check after the loop.
Is there a way to achieve this without having additional code just for the last element?
(ListA could also end with multiple 3s instead of a single 3)
Thank you
R
There's a pretty easy way to do this with groupby:
from itertools import groupby
nums = [1,2,3,3,2,1,3,3,2,3]
[sum(1 for _ in v) for k, v in groupby(nums) if k == 3]
# [2, 2, 1]
edit: adding a long form version of this to make it easier to understand:
def count_nums(nums, to_count=3):
    res = []
    for num, vals in groupby(nums):  # num is the number, vals is an iterable of the values in the group
        if num == to_count:
            num_vals = sum(1 for _ in vals)  # could also be `len(list(vals))`, I just don't want to create a whole list solely for its length
            res.append(num_vals)
    return res
ls = [1,2,3,3,2,1,3,3,2,3]
pplist = []
pp = 0
for ii in range(0,len(ls)):
    if ls[ii] == 3:
        pp += 1
        if ii == len(ls)-1:
            pplist.append(pp)
    else:
        if pp > 0:
            pplist.append(pp)
        pp = 0
print(pplist)
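Another way to avoid a special case for the last element is to append a sentinel that can never match, so a run at the very end is flushed by the ordinary else branch. A minimal sketch, assuming the target value is never None; the name run_lengths is made up here.

```python
def run_lengths(values, target=3):
    """Return the lengths of consecutive runs of `target` in `values`."""
    lengths = []
    run = 0
    # The appended None is a sentinel that never equals target,
    # so a run at the very end is flushed by the same else branch.
    for value in list(values) + [None]:
        if value == target:
            run += 1
        else:
            if run > 0:
                lengths.append(run)
            run = 0
    return lengths

print(run_lengths([1, 2, 3, 3, 2, 1, 3, 3, 2, 3]))  # [2, 2, 1]
```

The same function also covers a list that ends with several 3s, since the sentinel always comes after them.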
With n = 4 and m = 3, I have to take the first n and the last n elements of a list. In the example below these are [17,12,10,2] and [2,11,20,8].
From these two sublists I have to pick the highest value, which is then deleted from the original list.
This step is repeated m times, and the picked values are summed.
A = [17,12,10,2,7,2,11,20,8], n = 4, m = 3
O/P: 20+17+12=49
I have written the following code. However, it performs poorly and times out on larger lists. Could you please help?
A = [17,12,10,2,7,2,11,20,8]
m = 3
n = 4
scoreSum = 0
count = 0
firstGrp = []
lastGrp = []
while(count<m):
    firstGrp = A[:n]
    lastGrp = A[-n:]
    maxScore = max(max(firstGrp), max(lastGrp))
    scoreSum = scoreSum + maxScore
    if(maxScore in firstGrp):
        A.remove(maxScore)
    else:
        # index of the last occurrence of maxScore in A
        ai = len(A) - 1 - A[::-1].index(maxScore)
        A.pop(ai)
    count = count + 1
    firstGrp.clear()
    lastGrp.clear()
print(scoreSum )
I would like to do that this way, you can generalize it later:
a = [17,12,10,2,7,2,11,20,8]
a.sort(reverse=True)
sums=0
for i in range(3):
    sums +=a[i]
print(sums)
If you are concerned about performance, consider a specialized library such as numpy, which can be much faster on large arrays.
A = [17,12,10,2,7,11,20,8]
n = 4
m = 3
score = 0
for _ in range(m):
    sublist = A[:n] + A[-n:]
    subidx = [x for x in range(n)] + [x for x in range(len(A) - n, len(A))]
    sub = zip(sublist, subidx)
    maxval = max(sub, key=lambda x: x[0])
    score += maxval[0]
    del A[maxval[1]]
print(score)
Your method makes several max() calls per iteration and then searches the list again to find the element to remove. Pairing the combined front and back slices with their indices lets a single max() pass return both the value and the index at which to delete it.
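For large inputs, one possible alternative keeps the two windows in heaps, so nothing is ever deleted from the middle of a list. This is a sketch under assumptions, not a drop-in answer: the function name top_sum is made up, values are negated because heapq is a min-heap, and it has not been run against the original time limit.

```python
import heapq

def top_sum(values, n, m):
    """Sum of m picks, each the largest of the first n / last n remaining items."""
    left, right = [], []          # max-heaps stored as negated values
    i, j = 0, len(values) - 1     # next unclaimed positions from each end
    total = 0
    for _ in range(m):
        # refill each window up to n items from the unclaimed middle
        while len(left) < n and i <= j:
            heapq.heappush(left, -values[i])
            i += 1
        while len(right) < n and i <= j:
            heapq.heappush(right, -values[j])
            j -= 1
        if not left and not right:
            break                 # nothing left to pick
        # a smaller negated value means a larger original value
        if right and (not left or right[0] < left[0]):
            total += -heapq.heappop(right)
        else:
            total += -heapq.heappop(left)
    return total

print(top_sum([17, 12, 10, 2, 7, 2, 11, 20, 8], 4, 3))  # 49
```

Each pick costs a heap push and pop instead of a slice, a scan and a list deletion, so the work per pick grows with log n rather than with the length of the list.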
I would like to append losowanie1[0] to Group1 when x = 0, to Group2 when x = 1, and so on. The same goes for losowanie2[0], which comes from the other list.
import random
List1 = ['AAAA','BBBBB','CCCCC','DDDD','EEEE']
List2 = ['FFFF','GGGG','HHHH','IIIII','JJJJJ']
Group1 = []
Group2 = []
for x in range (len(List1)):
    losowanie1 = random.sample(List1,1)
    losowanie2 = random.sample(List2,1)
    # Here I would like to append losowanie1[0] when x = 0 to Group1 and so on ( losowanie1[0] when x = 1 to Group2...)
    List1.remove(losowanie1[0])
I tried:
('Group' + str(x+1)).append(losowanie1[0])
but obviously I cannot append to a string.
I can do it without a loop, but I wanted to make my code more professional. Thanks for any help.
You can make a dictionary whose keys are built as 'Group' + str(x+1), each mapped to an empty list,
and then append the values to those lists:
import random
List1 = ['AAAA','BBBBB','CCCCC','DDDD','EEEE']
base_name = "Group"
my_dic = dict()
for x in range(len(List1)):
    my_dic[base_name + str(x +1)] = []
for x in range (len(List1)):
    losowanie1 = random.sample(List1,1)
    my_dic[base_name + str(x +1)].append(losowanie1[0])
    List1.remove(losowanie1[0])
print(my_dic)
Result
{'Group3': ['DDDD'], 'Group4': ['BBBBB'], 'Group1': ['EEEE'], 'Group2': ['CCCCC'], 'Group5': ['AAAA']}
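To cover the second list as well, one option is to shuffle both lists once and zip the draws together. This is a sketch only: the lowercase names list1, list2, draw1, draw2 and groups are made up here, and it assumes both lists have the same length.

```python
import random

list1 = ['AAAA', 'BBBBB', 'CCCCC', 'DDDD', 'EEEE']
list2 = ['FFFF', 'GGGG', 'HHHH', 'IIIII', 'JJJJJ']

# random.sample with k=len(...) returns a shuffled copy and leaves the originals intact
draw1 = random.sample(list1, len(list1))
draw2 = random.sample(list2, len(list2))

# Group1 gets the first draw from each list, Group2 the second, and so on
groups = {
    'Group' + str(x + 1): [first, second]
    for x, (first, second) in enumerate(zip(draw1, draw2))
}
print(groups)
```

Because the draws are random, the printed dictionary differs from run to run; every group holds one item from each list.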
This Python function interlocks the characters of two words (e.g., "sho" + "col" -> "school"). word1-char1 + word2-char1 + word1-char2 + ...
def interlock(a,b):
    i = 0
    c = ""
    d = ""
    while (i < len(a) and i < len(b)):  # stop at the end of the shorter word
        c = (a[i]+b[i])
        d = d + c
        i+=1
    return(d)
interlock("sho", "col")
Now, I would like to apply this function to a list of words. The goal is to find out whether any interlock matches an item of the list.
word_list = ["test", "col", "tele", "school", "tel", "sho", "aye"]
To do that, I would first have to create a new list that has all the interlocks in it. This is exactly where I am stuck - I don't know how to iterate over word_list using interlock.
Thanks for your help!
If you want to pass every ordered pair of words from the list to interlock without pairing a word with itself, i.e. we won't get interlock("col", "col"):
def interlock(s1,s2):
    out = ""
    while s1 and s2:  # keep looping until either string is empty
        out += s1[0] + s2[0]
        s1, s2 = s1[1:], s2[1:]
    return out + s1 + s2  # add the remainder of any longer string
word_list = ["test", "col", "tele", "school", "tel", "sho", "aye"]
from itertools import permutations
# get all permutations of len 2 from our word list
perms = permutations(word_list,2)
st = set(word_list)
for a, b in perms:
    res = interlock(a,b)
    if res in st:
        print(res)
school
You can also achieve the same result using itertools.zip_longest using a fillvalue of "" to catch the end of the longer words:
from itertools import permutations, zip_longest
perms = permutations(word_list, 2)
st = set(word_list)
for a, b in perms:
    res = "".join("".join(tup) for tup in zip_longest(a,b,fillvalue=""))
    if res in st:
        print(res)
You can do it using product function from itertools module:
from itertools import product
for a, b in product(word_list, word_list):
    interlock(a, b)
https://docs.python.org/2/library/itertools.html#itertools.product
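Note that product also pairs each word with itself, and the loop above throws the results away. A possible way to collect the interlocks and check them against the word list is sketched below; the names words and found are made up here, and interlock is redefined with zip_longest so the example is self-contained.

```python
from itertools import product, zip_longest

def interlock(a, b):
    # alternate characters; fillvalue="" keeps the tail of the longer word
    return "".join(x + y for x, y in zip_longest(a, b, fillvalue=""))

word_list = ["test", "col", "tele", "school", "tel", "sho", "aye"]
words = set(word_list)

# every ordered pair of different words, kept only if the result is itself a word
found = sorted(
    {interlock(a, b) for a, b in product(word_list, repeat=2) if a != b} & words
)
print(found)  # ['school']
```

Building a set of all interlocks first and intersecting it with the word set keeps the membership test cheap.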
Try this.
def interlockList(A):
    B = A[0] if A else ""
    while len(A) >= 2:
        B = interlock(A[0], A[1])
        del A[:2]  # drop the two words that were just combined
        A.insert(0, B)
    return B
I was doing 368B on CodeForces with Python 3, which basically asks you to print the number of distinct elements in each of a series of "suffixes" of a given array. Here's my solution (with some additional redirection code for testing):
import sys
if __name__ == "__main__":
    f_in = open('b.in', 'r')
    original_stdin = sys.stdin
    sys.stdin = f_in
    n, m = [int(i) for i in sys.stdin.readline().rstrip().split(' ')]
    a = [int(i) for i in sys.stdin.readline().rstrip().split(' ')]
    l = [None] * m
    for i in range(m):
        l[i] = int(sys.stdin.readline().rstrip())
    l_sorted = sorted(l)
    l_order = sorted(range(m), key=lambda k: l[k])
    # the ranks of elements in l
    l_rank = sorted(range(m), key=lambda k: l_order[k])
    # unique_elem[i] = non-duplicated elements between l_sorted[i] and l_sorted[i+1]
    unique_elem = [None] * m
    for i in range(m):
        unique_elem[i] = set(a[(l_sorted[i] - 1): (l_sorted[i + 1] - 1)]) if i < m - 1 else set(a[(l_sorted[i] - 1): n])
    # unique_elem_cumulative[i] = non-duplicated elements between l_sorted[i] and a's end
    unique_elem_cumulative = unique_elem[-1]
    # unique_elem_cumulative_count[i] = #unique_elem_cumulative[i]
    unique_elem_cumulative_count = [None] * m
    unique_elem_cumulative_count[-1] = len(unique_elem[-1])
    for i in range(m - 1):
        i_rev = m - i - 2
        unique_elem_cumulative = unique_elem[i_rev] | unique_elem_cumulative
        unique_elem_cumulative_count[i_rev] = len(unique_elem_cumulative)
    with open('b.out', 'w') as f_out:
        for i in range(m):
            idx = l_rank[i]
            f_out.write('%d\n' % unique_elem_cumulative_count[idx])
    sys.stdin = original_stdin
    f_in.close()
The code gives correct results except, possibly, for the last big test, with n = 81220 and m = 48576 (a simulated input file is here, and an expected output created by a naive solution is here). The time limit is 1 sec, and my solution does not finish within it. So is it possible to solve it within 1 sec with Python 3? Thank you.
UPDATE: an "expected" output file is added, which is created by the following code:
import sys
if __name__ == "__main__":
    f_in = open('b.in', 'r')
    original_stdin = sys.stdin
    sys.stdin = f_in
    n, m = [int(i) for i in sys.stdin.readline().rstrip().split(' ')]
    a = [int(i) for i in sys.stdin.readline().rstrip().split(' ')]
    with open('b_naive.out', 'w') as f_out:
        for i in range(m):
            l_i = int(sys.stdin.readline().rstrip())
            f_out.write('%d\n' % len(set(a[l_i - 1:])))
    sys.stdin = original_stdin
    f_in.close()
You'll be cutting it close, I think. On my admittedly rather old machine, the I/O alone takes 0.9 seconds per run.
An efficient algorithm, I think, will be to iterate backwards through the array, keeping track of which distinct elements you've found. When you find a new element, add its index to a list. This will therefore be a descending sorted list.
Then for each li, the index of li in this list will be the answer.
For the small sample dataset
10 10
1 2 3 4 1 2 3 4 100000 99999
1
2
3
4
5
6
7
8
9
10
The list would contain [10, 9, 8, 7, 6, 5] since when reading from the right, the first distinct value occurs at index 10, the second at index 9, and so on.
So then if li = 5, it has index 6 in the generated list, so 6 distinct values are found at indices >= li. Answer is 6
If li = 8, it has index 3 in the generated list, so 3 distinct values are found at indices >= li. Answer is 3
It's a little fiddly that the exercise is 1-indexed while Python is 0-indexed.
And to find this index quickly using existing library functions, I reverse the list and then use bisect.
import timeit
from bisect import bisect_left
def doit():
    f_in = open('b.in', 'r')
    n, m = [int(i) for i in f_in.readline().rstrip().split(' ')]
    a = [int(i) for i in f_in.readline().rstrip().split(' ')]
    found = {}
    indices = []
    for i in range(n - 1, -1, -1):  # walk back to index 0 inclusive
        if not a[i] in found:
            indices.append(i+1)
            found[a[i]] = True
    indices.reverse()
    length = len(indices)
    for i in range(m):
        l = int(f_in.readline().rstrip())
        index = bisect_left(indices, l)
        print(length - index)
    f_in.close()
if __name__ == "__main__":
    print (timeit.timeit('doit()', setup="from bisect import bisect_left;from __main__ import doit", number=10))
On my machine this takes 12 seconds for 10 runs. Still too slow.
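A variation on the same backwards scan is to store the running count of distinct values for every position, so each query becomes a plain list lookup with no bisect. This is a sketch only: the function name suffix_distinct_counts is made up, the data is the small sample above rather than the real input file, and no claim is made that it fits the 1 sec limit.

```python
def suffix_distinct_counts(a):
    """counts[i] = number of distinct values in a[i:]."""
    counts = [0] * len(a)
    seen = set()
    for i in range(len(a) - 1, -1, -1):  # right to left, index 0 included
        seen.add(a[i])
        counts[i] = len(seen)
    return counts

a = [1, 2, 3, 4, 1, 2, 3, 4, 100000, 99999]
counts = suffix_distinct_counts(a)
queries = [1, 5, 8, 10]  # 1-indexed positions, as in the exercise
print([counts[l - 1] for l in queries])  # [6, 6, 3, 1]
```

The results for li = 5 and li = 8 match the worked example above (6 and 3).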