I have three lists of elements sorted by descending score. I need to use Borda's positional ranking to combine the ranked lists, using the ordinal ranks of the elements in each list. Given lists t1, t2, t3 ... tk, for each candidate c and list ti, the score B_ti(c) is the number of candidates ranked below c in ti.
The total Borda score is B(c) = ∑ B_ti(c), summed over all lists ti.
The candidates are then sorted by descending Borda scores.
I tried this, but it does not give the output needed:
for i in list1, list2, list3:
    borda = (((len(list1)-1) - list1.index(i)) + ((len(list2)-1) - list2.index(i)) + ((len(list3)-1) - list3.index(i)))
    print borda
Can someone help me to implement the above function?
Calling index(i) takes time proportional to the list size, and because you have to call it for every element, the whole thing ends up taking O(N^2) time, where N is the list size. It is much better to iterate over one list at a time, where you already know each index, and add that part of the score to an accumulator in a dict.
def borda_sort(lists):
    scores = {}
    for l in lists:
        # walk each list from its last element, so idx equals the
        # number of candidates ranked below elem in that list
        for idx, elem in enumerate(reversed(l)):
            if elem not in scores:
                scores[elem] = 0
            scores[elem] += idx
    return sorted(scores.keys(), key=lambda elem: scores[elem], reverse=True)

lists = [ ['a', 'c'], ['b', 'd', 'a'], ['b', 'a', 'c', 'd'] ]
print(borda_sort(lists))
# ['b', 'a', 'c', 'd']
The only tricky part here is scanning lists in reverse; this makes sure that if an element was not in one of the lists at all, its score increases by 0 for that list.
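To make the effect of the reverse scan visible, here is a minimal sketch that returns the accumulated scores instead of only the final ordering. The helper name borda_scores is my own and is not part of the answer above.

```python
def borda_scores(lists):
    """Return a dict mapping each candidate to its total Borda score."""
    scores = {}
    for ranked in lists:
        # In the reversed list, the position of an element equals the
        # number of candidates ranked below it in the original list.
        for below, elem in enumerate(reversed(ranked)):
            scores[elem] = scores.get(elem, 0) + below
    return scores


lists = [['a', 'c'], ['b', 'd', 'a'], ['b', 'a', 'c', 'd']]
scores = borda_scores(lists)
print(scores['b'], scores['a'], scores['c'], scores['d'])  # 5 3 1 1
```

Here 'b' is missing from the first list, so that list simply contributes nothing to its total.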
Compare with the other suggestion here:
import itertools
import random

def borda_simple_sort(lists):
    candidates = set(itertools.chain(*lists))
    # returns scores - a bit more work needed to return a list
    return sorted([sum([len(l) - l.index(c) - 1 for l in lists if c in l], 0) for c in candidates], reverse=True)

# make 10 random lists of size 10000
lists = [ random.sample(range(10000), 10000) for x in range(10) ]
%timeit borda_sort(lists)
10 loops, best of 3: 40.9 ms per loop
%timeit borda_simple_sort(lists)
1 loops, best of 3: 30.8 s per loop
That's not a typo :) 40 milliseconds vs 30 seconds, a 750x speedup. The fast algorithm is not significantly more difficult to read in this case, and may even be easier to read. It just relies on an appropriate auxiliary data structure and on going through the data in the right order.
This could work:
sorted([sum([len(l) - l.index(c) - 1 for l in [list1, list2, list3] if c in l], 0) for c in [candidate1, candidate2, candidate3]], reverse=True)
Note that since scores are reordered you'll lose track of which candidate each score belongs to:
>>> list1 = ['a', 'c']
>>> list2 = ['b', 'd', 'a']
>>> list3 = ['b', 'a', 'c', 'd']
>>> candidates = ['a', 'b', 'c', 'd']
>>> sorted([sum([len(l) - l.index(c) - 1 for l in [list1, list2, list3] if c in l], 0) for c in candidates], reverse=True)
[5, 3, 1, 1]
In this case the first element of the list (the winner) is 'b', the second element in the list of candidates.
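If you need to know which candidate each score belongs to, one option is to sort (score, candidate) pairs instead of bare scores. This is only a sketch built on the same expression; the helper name score is an assumption of mine.

```python
list1 = ['a', 'c']
list2 = ['b', 'd', 'a']
list3 = ['b', 'a', 'c', 'd']
candidates = ['a', 'b', 'c', 'd']


def score(c, lists):
    # Same per-list formula as above: number of elements ranked below c.
    return sum(len(l) - l.index(c) - 1 for l in lists if c in l)


ranked = sorted(
    ((score(c, [list1, list2, list3]), c) for c in candidates),
    key=lambda pair: pair[0],  # sort on the score only; ties keep input order
    reverse=True,
)
print(ranked)  # [(5, 'b'), (3, 'a'), (1, 'c'), (1, 'd')]
```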
I know I can do something like below to get number of occurrences of elements in the list:
from collections import Counter
words = ['a', 'b', 'c', 'a']
Counter(words).keys() # equals to list(set(words))
Counter(words).values() # counts the elements' frequency
Outputs:
['a', 'c', 'b']
[2, 1, 1]
But I want to get the count 2 for b and c as b and c occur exactly once in the list.
Is there any way to do this in concise / pythonic way without using Counter or even using above output from Counter?
You could write this yourself. Here is a one-liner (thanks @d.b):
sum(x for x in Counter(words).values() if x == 1)
Or more than one line:
count = 0
for word in words:
    # count only the words that occur exactly once in the list
    if words.count(word) == 1:
        count += 1
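For longer lists, a single pass with a plain dict avoids rescanning the list for every word. This is a sketch without Counter; the names freq and once are my own.

```python
words = ['a', 'b', 'c', 'a']

# Build the frequency table by hand in one pass.
freq = {}
for word in words:
    freq[word] = freq.get(word, 0) + 1

# Count how many distinct words occur exactly once.
once = sum(1 for n in freq.values() if n == 1)
print(once)  # 2
```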
letters = ['a', 'b', 'c']
Assume this is my list. Where for i, letter in enumerate(letters) would be:
0, a
1, b
2, c
How can I instead make it enumerate backwards, as:
2, a
1, b
0, c
This is a great solution and works perfectly:
items = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
for idx, item in enumerate(items, start=-len(items)):
    print(f"reverse index for {item}: {abs(idx)}")
Here is the OUTPUT of the above snippet:
reverse index for a: 7
reverse index for b: 6
reverse index for c: 5
reverse index for d: 4
reverse index for e: 3
reverse index for f: 2
reverse index for g: 1
Here is what is happening in the above snippet:
enumerate's start argument is given a negative value, -len(items).
enumerate always steps forward by one, so idx runs from -len(items) up to -1.
Finally, abs(idx) turns each index into its positive counterpart.
If you want the reverse index to end at zero, use start=-len(items) + 1 to avoid the off-by-one error.
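As a quick check of the zero-based variant, here is a small sketch using the three-letter list from the question; the name pairs is just for illustration.

```python
items = ['a', 'b', 'c']

# start = -len(items) + 1 = -2, so idx takes the values -2, -1, 0.
pairs = [(abs(idx), item) for idx, item in enumerate(items, start=-len(items) + 1)]
print(pairs)  # [(2, 'a'), (1, 'b'), (0, 'c')]
```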
Try this:
letters = ['a', 'b', 'c']
for i, letter in reversed(list(enumerate(reversed(letters)))):
    print(i, letter)
Output:
2 a
1 b
0 c
Try this:
l = len(letters)
for i, letter in enumerate(letters):
    print(l - i - 1, letter)
I would reverse the list first and then use enumerate(). Note that this reverses the letters themselves, not just the indices:
letters = ['a', 'b', 'c']
letters.reverse()
for i, letter in enumerate(letters):
    print(i, letter)
The zip function pairs up the elements of its two arguments, so you can zip a reversed list of indices with the letters:
list(zip([i for i in range(len(letters))][::-1], letters))
letters = ['a', 'b', 'c']
for i, letter in zip(range(len(letters)-1, -1, -1), letters):
    print(i, letter)
prints
2 a
1 b
0 c
Taken from answer in a similar question: Traverse a list in reverse order in Python
tl;dr: size - index - 1
I'll assume you are asking whether the index can be reversed while the item stays the same; for example, 'a' gets the ordering number 2 even though its actual index is 0.
To calculate this, note that each element should get the index of the item that sits at the same "distance" (index-wise) from the end of the collection. A first attempt at this is size - index.
However, many programming languages start arrays with an index of 0. Due to this, we would need to subtract 1 in order to make the indices correspond properly. Consider our last element, with an index of size - 1. In our original equation, we would get size - (size - 1), which is equal to size - size + 1, which is equal to 1. Therefore, we need to subtract 1.
Final equation (for each element): size - index - 1
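Applied to the list from the question, the formula could look like this (a minimal sketch; the variable names are mine):

```python
letters = ['a', 'b', 'c']
size = len(letters)

# Pair every letter with its mirrored index: size - index - 1.
reversed_indices = [(size - index - 1, letter) for index, letter in enumerate(letters)]
print(reversed_indices)  # [(2, 'a'), (1, 'b'), (0, 'c')]
```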
We can define a utility function (in Python 3.3+)
from itertools import count

def enumerate_ext(iterable, start=0, step=1):
    indices = count(start, step)
    yield from zip(indices, iterable)

and use it directly like
letters = ['a', 'b', 'c']
for index, letter in enumerate_ext(letters,
                                   start=len(letters) - 1,
                                   step=-1):
    print(index, letter)
or write a helper
def reverse_enumerate(sequence):
    yield from enumerate_ext(sequence,
                             start=len(sequence) - 1,
                             step=-1)

and use it like
for index, letter in reverse_enumerate(letters):
    print(index, letter)
Suppose I have a list:
l = ['a', 'b', 'c']
And its suffix list:
l2 = ['a_1', 'b_2', 'c_3']
I'd like the desired output to be:
out_l = ['a', 'a_1', 'b', 'b_2', 'c', 'c_3']
The result is the interleaved version of the two lists above.
I can write regular for loop to get this done, but I'm wondering if there's a more Pythonic way (e.g., using list comprehension or lambda) to get it done.
I've tried something like this:
list(map(lambda x: x[1]+'_'+str(x[0]+1), enumerate(l)))
# this only returns ['a_1', 'b_2', 'c_3']
Furthermore, what changes would need to be made for the general case i.e., for 2 or more lists where l2 is not necessarily a derivative of l?
yield
You can use a generator for an elegant solution. At each iteration, yield twice—once with the original element, and once with the element with the added suffix.
The generator will need to be exhausted; that can be done by tacking on a list call at the end.
def transform(l):
    for i, x in enumerate(l, 1):
        yield x
        yield f'{x}_{i}'  # '{}_{}'.format(x, i)

You can also re-write this using the yield from syntax for generator delegation:
def transform(l):
    for i, x in enumerate(l, 1):
        yield from (x, f'{x}_{i}')  # (x, '{}_{}'.format(x, i))

out_l = list(transform(l))
print(out_l)
['a', 'a_1', 'b', 'b_2', 'c', 'c_3']
If you're on a version older than Python 3.6, replace f'{x}_{i}' with '{}_{}'.format(x, i).
Generalising
Consider a general scenario where you have N lists of the form:
l1 = [v11, v12, ...]
l2 = [v21, v22, ...]
l3 = [v31, v32, ...]
...
Which you would like to interleave. These lists are not necessarily derived from each other.
To handle interleaving operations with these N lists, you'll need to iterate over tuples of corresponding elements:
def transformN(*args):
    for vals in zip(*args):
        yield from vals

out_l = list(transformN(l1, l2, l3, ...))
Sliced list.__setitem__
I'd recommend this from the perspective of performance. First allocate space for an empty list, and then assign list items to their appropriate positions using sliced list assignment. l goes into even indexes, and l' (l modified) goes into odd indexes.
out_l = [None] * (len(l) * 2)
out_l[::2] = l
out_l[1::2] = [f'{x}_{i}' for i, x in enumerate(l, 1)]  # ['{}_{}'.format(x, i) ...]
print(out_l)
['a', 'a_1', 'b', 'b_2', 'c', 'c_3']
This is consistently the fastest from my timings (below).
Generalising
To handle N lists (all of the same length), iteratively assign to slices, using the number of lists as the step.
list_of_lists = [l1, l2, ...]
n = len(list_of_lists)
out_l = [None] * len(list_of_lists[0]) * n
for i, l in enumerate(list_of_lists):
    out_l[i::n] = l
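A concrete run with three lists may help here. This sketch assumes at least one list is passed and that all lists have the same length; the function name interleave and the third sample list are my own.

```python
def interleave(*lists):
    """Interleave equally long lists using extended slice assignment."""
    n = len(lists)
    out = [None] * (len(lists[0]) * n)
    for i, lst in enumerate(lists):
        # The i-th list fills positions i, i + n, i + 2n, ...
        out[i::n] = lst
    return out


result = interleave(['a', 'b', 'c'], ['a_1', 'b_2', 'c_3'], ['x', 'y', 'z'])
print(result)
# ['a', 'a_1', 'x', 'b', 'b_2', 'y', 'c', 'c_3', 'z']
```

Note that the slice step has to be the number of lists; with a fixed step of 2 the assignment raises ValueError as soon as there are more than two lists.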
zip + chain.from_iterable
A functional approach, similar to @chrisz's solution. Construct pairs using zip and then flatten them using itertools.chain.
from itertools import chain
# ['{}_{}'.format(x, i) ...]
out_l = list(chain.from_iterable(zip(l, [f'{x}_{i}' for i, x in enumerate(l, 1)])))
print(out_l)
['a', 'a_1', 'b', 'b_2', 'c', 'c_3']
itertools.chain is widely regarded as the pythonic list flattening approach.
Generalising
This is the simplest solution to generalise, and I suspect the most efficient for multiple lists when N is large.
list_of_lists = [l1, l2, ...]
out_l = list(chain.from_iterable(zip(*list_of_lists)))
Performance
Let's take a look at some perf-tests for the simple case of two lists (one list with its suffix). General cases will not be tested since the results vary widely with the data.
Benchmarking code, for reference.
Functions
def cs1(l):
    def _cs1(l):
        for i, x in enumerate(l, 1):
            yield x
            yield f'{x}_{i}'

    return list(_cs1(l))

def cs2(l):
    out_l = [None] * (len(l) * 2)
    out_l[::2] = l
    out_l[1::2] = [f'{x}_{i}' for i, x in enumerate(l, 1)]

    return out_l

def cs3(l):
    return list(chain.from_iterable(
        zip(l, [f'{x}_{i}' for i, x in enumerate(l, 1)])))

def ajax(l):
    return [
        i for b in [[a, '{}_{}'.format(a, i)]
        for i, a in enumerate(l, start=1)]
        for i in b
    ]

def ajax_cs0(l):
    # suggested improvement to ajax solution
    return [j for i, a in enumerate(l, 1) for j in [a, '{}_{}'.format(a, i)]]

def chrisz(l):
    return [
        val
        for pair in zip(l, [f'{k}_{j+1}' for j, k in enumerate(l)])
        for val in pair
    ]
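If you want to reproduce this kind of comparison yourself, a rough harness with the standard timeit module could look like the sketch below. The function names with_slices and with_chain, the input size, and the repeat count are all my own choices, and the numbers you get depend on your machine and Python version.

```python
import timeit
from itertools import chain


def with_slices(l):
    # Pre-allocate, then fill even and odd positions by slice assignment.
    out = [None] * (len(l) * 2)
    out[::2] = l
    out[1::2] = [f'{x}_{i}' for i, x in enumerate(l, 1)]
    return out


def with_chain(l):
    # Build (element, suffixed element) pairs and flatten them.
    return list(chain.from_iterable(
        zip(l, [f'{x}_{i}' for i, x in enumerate(l, 1)])))


data = [str(n) for n in range(1000)]
runs = 200

# Sanity check before timing: both variants must agree.
assert with_slices(data) == with_chain(data)

for fn in (with_slices, with_chain):
    seconds = timeit.timeit(lambda: fn(data), number=runs)
    print(f'{fn.__name__}: {seconds / runs * 1e6:.1f} microseconds per call')
```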
You can use a list comprehension like so:
l=['a','b','c']
new_l = [i for b in [[a, '{}_{}'.format(a, i)] for i, a in enumerate(l, start=1)] for i in b]
Output:
['a', 'a_1', 'b', 'b_2', 'c', 'c_3']
Optional, shorter method:
[j for i, a in enumerate(l, 1) for j in [a, '{}_{}'.format(a, i)]]
You could use zip:
[val for pair in zip(l, [f'{k}_{j+1}' for j, k in enumerate(l)]) for val in pair]
Output:
['a', 'a_1', 'b', 'b_2', 'c', 'c_3']
Here's my simple implementation
l=['a','b','c']
# generate new list with the indices of the original list
new_list=l + ['{0}_{1}'.format(i, (l.index(i) + 1)) for i in l]
# sort the new list in ascending order
new_list.sort()
print(new_list)
# Should display ['a', 'a_1', 'b', 'b_2', 'c', 'c_3']
If you wanted to return [["a","a_1"],["b","b_2"],["c","c_3"]] you could write
new_l=[[x,"{}_{}".format(x,i+1)] for i,x in enumerate(l)]
This isn't what you want; instead you want ["a","a_1"]+["b","b_2"]+["c","c_3"]. This can be made from the result of the operation above using sum(). Since you're summing lists, you need to pass the empty list as the start argument to avoid an error. So that gives
new_l=sum(([x,"{}_{}".format(x,i+1)] for i,x in enumerate(l)),[])
I don't know how this compares speed-wise (probably not well), but I find it easier to understand what's going on than the other list-comprehension based answers.
A very simple solution:
out_l=[]
for i,x in enumerate(l,1):
    out_l.extend([x,f"{x}_{i}"])
Here is an easier list comprehension for this problem as well:
l = ['a', 'b', 'c']
print([ele for index, val in enumerate(l) for ele in (val, val + f'_{index + 1}')])
Output:
['a', 'a_1', 'b', 'b_2', 'c', 'c_3']
Note that this is just a simpler solution for interleaving the two lists; it is not a solution for multiple lists. The reason I use two for clauses is that, at the time of writing, list comprehensions do not support unpacking a tuple into the result.
I have two variables holding a string each and an empty list:
a = 'YBBB'
b = 'RYBB'
x = []
I want to loop through each of the strings and treat each 'B' in the two strings as an independent element (I wish I could just type a.('B') and b.('B')). What I actually want to do is loop through b and ask whether each item in b is also in a. If so, the number of occurrences of that item (say 'B') in a is counted; this should give 3. Then I want to compare the counts of the item in the two strings and push the lesser of the two into the empty list. In this case, only two 'B's will be pushed into x.
You can use a nested list comprehension like the following (set order is arbitrary, so the items may come out in a different order):
>>> [i for i in set(b) for _ in range(min(b.count(i), a.count(i)))]
['B', 'B', 'Y']
If the order is important you can use collections.OrderedDict for creating the unique items from b:
>>> from collections import OrderedDict
>>>
>>> [i for i in OrderedDict.fromkeys(b) for _ in range(min(b.count(i), a.count(i)))]
['Y', 'B', 'B']
import collections
a = 'YBBB'
b = 'RYBB'
x = []
a_counter = collections.Counter(a)
b_counter = collections.Counter(b)
print(a_counter)
print(b_counter)
for ch in b:
    if a_counter[ch]:
        x.append(min(a_counter[ch], b_counter[ch]) * ch)
print(x)
--output:--
Counter({'B': 3, 'Y': 1})
Counter({'B': 2, 'Y': 1, 'R': 1})
['Y', 'BB', 'BB']
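Or, if you only want to step through each unique element in b:
x = []  # start again from an empty list
for ch in set(b):  # set order is arbitrary, so the items may come out in either order
    if a_counter[ch]:
        x.append(min(a_counter[ch], b_counter[ch]) * ch)
print(x)
--output:--
['Y', 'BB']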
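If you want the individual letters rather than strings like 'BB', Counter intersection does the "lesser of the two counts" step for you. This is a sketch using the & operator of collections.Counter; the name common is my own, and the result is sorted only to make the printed order predictable.

```python
from collections import Counter

a = 'YBBB'
b = 'RYBB'

# & keeps each item with min(count in b, count in a); 'R' drops out
# because it does not occur in a.
common = Counter(b) & Counter(a)
x = sorted(common.elements())
print(x)  # ['B', 'B', 'Y']
```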
I have two lists that contain many of the same items, including duplicate items. I want to check which items in the first list are not in the second list. For example, I might have one list like this:
l1 = ['a', 'b', 'c', 'b', 'c']
and one list like this:
l2 = ['a', 'b', 'c', 'b']
Comparing these two lists I would want to return a third list like this:
l3 = ['c']
I am currently using some terrible code that I made a while ago that I'm fairly certain doesn't even work properly shown below.
def list_difference(l1,l2):
    for i in range(0, len(l1)):
        for j in range(0, len(l2)):
            if l1[i] == l1[j]:
                l1[i] = 'damn'
                l2[j] = 'damn'
    l3 = []
    for item in l1:
        if item!='damn':
            l3.append(item)
    return l3
How can I better accomplish this task?
You didn't specify whether the order matters. If it does not, you can do this in Python >= 2.7:
l1 = ['a', 'b', 'c', 'b', 'c']
l2 = ['a', 'b', 'c', 'b']
from collections import Counter
c1 = Counter(l1)
c2 = Counter(l2)
diff = c1-c2
print(list(diff.elements()))
Create Counters for both lists, then subtract one from the other.
from collections import Counter
a = [1,2,3,1,2]
b = [1,2,3,1]
c = Counter(a)
c.subtract(Counter(b))
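Unlike the - operator, subtract() keeps entries whose count has dropped to zero or below, so you still need a step to turn the Counter back into a list. A small sketch of what happens with the values above:

```python
from collections import Counter

a = [1, 2, 3, 1, 2]
b = [1, 2, 3, 1]

c = Counter(a)
c.subtract(Counter(b))

# The keys 1 and 3 are still present, with a count of 0.
print(c[1], c[2], c[3])  # 0 1 0

# elements() skips every item whose count is less than one.
leftover = list(c.elements())
print(leftover)  # [2]
```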
To take into account both duplicates and the order of elements:
from collections import Counter
def list_difference(a, b):
    count = Counter(a)  # count items in a
    count.subtract(b)   # subtract items that are in b
    diff = []
    for x in a:
        if count[x] > 0:
            count[x] -= 1
            diff.append(x)
    return diff
Example
print(list_difference("z y z x v x y x u".split(), "x y z w z".split()))
# -> ['y', 'x', 'v', 'x', 'u']
Python 2.5 version:
from collections import defaultdict
def list_difference25(a, b):
    # count items in a
    count = defaultdict(int)  # item -> number of occurrences
    for x in a:
        count[x] += 1
    # subtract items that are in b
    for x in b:
        count[x] -= 1
    diff = []
    for x in a:
        if count[x] > 0:
            count[x] -= 1
            diff.append(x)
    return diff
Counters are new in Python 2.7.
For a general solution to subtract a from b:
def list_difference(b, a):
    c = list(b)
    for item in a:
        try:
            c.remove(item)
        except ValueError:
            pass  # or maybe you want to keep a values here
    return c
You can try this (it mutates l1 in place and raises ValueError if l2 contains an item that is not in l1):
list(filter(lambda x: l1.remove(x), l2))
print(l1)
Try this one:
from collections import Counter
from typing import Sequence
def duplicates_difference(a: Sequence, b: Sequence) -> Counter:
    """
    >>> duplicates_difference([1,2],[1,2,2,3])
    Counter({2: 1, 3: 1})
    """
    shorter, longer = sorted([a, b], key=len)
    return Counter(longer) - Counter(shorter)