If I have a list of lists, I know I can get the index of the largest item using a solution posted here:
def get_maximum_votes_index(L):
    return max((n,i,j) for i, L2 in enumerate(L) for j, n in enumerate(L2))[1:]
However, if I want to return a sorted list of indices, descending from the maximum, how would I do that?
For example:
L = [[1,2],[4,3]]
Would return:
[(1,0),(1,1),(0,1),(0,0)]
You basically just need to replace the max with sorted:
L = [[1,2],[4,3]]
# step 1: add indices to each list element
L_with_indices = ((n,i,j) for i, L2 in enumerate(L) for j, n in enumerate(L2))
# step 2: sort by value
sorted_L = sorted(L_with_indices, reverse=True)
# step 3: remove the value and keep the indices
result = [tup[1:] for tup in sorted_L]
# result: [(1, 0), (1, 1), (0, 1), (0, 0)]
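If you prefer, the three steps can be collapsed into a single helper (a minimal sketch; the function name is just for illustration):
def get_sorted_votes_indices(L):
    # decorate each value with its (row, column) position, sort by value
    # descending, then strip the value and keep only the indices
    decorated = ((n, i, j) for i, row in enumerate(L) for j, n in enumerate(row))
    return [(i, j) for n, i, j in sorted(decorated, reverse=True)]

print(get_sorted_votes_indices([[1, 2], [4, 3]]))
# [(1, 0), (1, 1), (0, 1), (0, 0)]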
Given some set of tuples (x,y):
set([(1,2),(3,4),(3,2),(1,4)])
How do I find each tuple with the property (1,z) in the set?
In this example, the edges (1,2) and (1,4).
EDIT: Is there some other data structure that would support such a request?
Use a comprehension (set or list):
In [145]: st = set([(1,2),(3,4),(3,2),(1,4)])
In [146]: [(i, j) for i, j in st if i == 1]
Out[146]: [(1, 2), (1, 4)]
In [147]: {(i, j) for i, j in st if i == 1}
Out[147]: {(1, 2), (1, 4)}
Or, if you don't need the result in a container and just want to loop over the matches, you can take a functional approach using the built-in filter function (note that in Python 3 filter returns a lazy iterator, so wrap it in list() if you do need a list):
result = filter(lambda x: x[0] == 1, st)
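To address the EDIT: if you need this kind of lookup often, one data structure that supports it directly is a dict mapping each first element to the tuples that start with it. This is just a sketch of that idea, not a drop-in replacement for the set:
from collections import defaultdict

st = {(1, 2), (3, 4), (3, 2), (1, 4)}

# build an index keyed by the first element of each tuple
by_first = defaultdict(set)
for x, y in st:
    by_first[x].add((x, y))

print(by_first[1])  # the tuples starting with 1: (1, 2) and (1, 4)
Building the index is O(n) once, after which each lookup is O(1) on average instead of scanning the whole set.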
I am trying to create a function which returns the empty slots in this list:
grid = [[0,0,0,4],[0,0,4,2],[2,4,4,2],[0,8,4,2]]
The empty slots in this case are the slots containing zeroes.
This was my program for it:
def empty_slots():
    lst = []
    for i in grid:
        for j in grid:
            if j == 0:
                lst = lst + [(i,j)]
    return lst
However, when I run this program I get an empty list []. The function should output: [(0,0), (0,1), (0,2), (1,0), (1,1), (3,0)]. Note: I'm using Python 2.7.
for i in grid: iterates over the items in grid (the rows); it doesn't iterate over their indices. Likewise, your inner for j in grid: loops over the rows again rather than over the values in row i, so j is always a list and never equal to 0, which is why you get an empty result. You can get the indices as you iterate over the items of an iterable via the built-in enumerate function:
def empty_slots(grid):
    return [(i, j) for i, row in enumerate(grid)
            for j, v in enumerate(row) if not v]
grid = [[0,0,0,4],[0,0,4,2],[2,4,4,2],[0,8,4,2]]
print(empty_slots(grid))
output
[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (3, 0)]
Here's the same thing using "traditional" for loops instead of a list comprehension.
def empty_slots(grid):
    zeros = []
    for i, row in enumerate(grid):
        for j, v in enumerate(row):
            if v == 0:
                zeros.append((i, j))
    return zeros
In this version I use the explicit test v == 0 instead of not v; the latter tests true if v is any "falsey" value, e.g. 0, or an empty string, list, tuple, set or dict.
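A quick illustration of the difference, using a hypothetical row that mixes in other falsey values:
row = [0, '', [], 4]
print([j for j, v in enumerate(row) if not v])   # [0, 1, 2] -- any falsey value matches
print([j for j, v in enumerate(row) if v == 0])  # [0] -- only the integer 0 matches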
You don't need enumerate to do this. You could do this:
def empty_slots(grid):
    zeros = []
    for i in range(len(grid)):
        row = grid[i]
        for j in range(len(row)):
            if row[j] == 0:
                zeros.append((i, j))
    return zeros
However, it is considered more Pythonic to iterate directly over the items in an iterable, so this sort of thing is generally avoided when practical:
for i in range(len(grid)):
Occasionally you will need to do that sort of thing, but usually code like that is a symptom that there's a better way to do it. :)
As a list comprehension:
grid = [[0,0,0,4],[0,0,4,2],[2,4,4,2],[0,8,4,2]]
[(i,j) for i,b in enumerate(grid) for j,a in enumerate(b) if a==0]
Out[81]: [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (3, 0)]
I have a long list (~10 million elements) in which some values appear more than once, and for each repeated value I want to extract all pairs of positions where it occurs, e.g.
R = [1,3,1,6,9,6,1,2,3,0]
should produce the list of position pairs
P = [[e1,e3],[e1,e7],[e3,e7],[e4,e6],[e2,e9]]
What is an efficient algorithm to achieve this for a long list?
Group the indices together based on value, then iterate through pairs of indices using combinations.
from collections import defaultdict
from itertools import combinations
R = [1,3,1,6,9,6,1,2,3,0]
d = defaultdict(list)
for idx, item in enumerate(R, 1):
    d[item].append(idx)

result = []
for indices in d.itervalues():
    result.extend(combinations(indices, 2))

print result
Result:
[(1, 3), (1, 7), (3, 7), (2, 9), (4, 6)]
Populating the defaultdict takes O(len(R)) time on average. Generating the pairs for a group of N indices takes O(N^2) time (it yields N*(N-1)/2 combinations), so the total cost is dominated by the size of the output.
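If you want the pairs in the "e"-labelled form shown in the question, the index pairs can be mapped afterwards (a small follow-up sketch using the result variable from above):
P = [["e%d" % a, "e%d" % b] for a, b in result]
# e.g. [['e1', 'e3'], ['e1', 'e7'], ['e3', 'e7'], ['e2', 'e9'], ['e4', 'e6']]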
My simple solution:
def extract(edges):
    dic = {}
    for i in range(len(edges)):
        if edges[i] in dic.keys():
            dic[edges[i]].append(i+1)
        else:
            dic[edges[i]] = [i+1]
    res = []
    for k in sorted(dic.keys()):
        res += combinations(dic[k])
    return res

def combinations(positions):
    ret = []
    print positions
    for i in range(len(positions)):
        for j in range(i+1, len(positions)):
            ret.append(["e"+str(positions[i]), "e"+str(positions[j])])
    print ret
    return ret
R = [1,3,1,6,9,6,1,2,3,0]
res = extract(R)
print res
Since we can't see your actual input, you might still run into problems if there are many combinations. One thing to try is PyPy, which sometimes gives me a (free) speed-up.
Unless I understood it the wrong way, the simplest and optimal way to do this is to use a dictionary of already encountered values.
elem_dict = {}
output = []
for i, elem in enumerate(R):
    if elem in elem_dict:
        output += [[duplicate, i] for duplicate in elem_dict[elem]]
    else:
        elem_dict[elem] = set()
    elem_dict[elem].add(i)

print output  # [[0, 2], [3, 5], [0, 6], [2, 6], [1, 8]]
Should be O(n) on average, if I'm not mistaken (dict lookups are constant time on average), unless you have a lot of equal values, in which case your output is O(n^2) anyway.
My approach would be to do a pass over the list, grouping the indices of elements that share the same value into lists, then take the values that appear more than once and collect the combinations of their indices:
In [18]: from collections import defaultdict
In [19]: d = defaultdict(list)
In [20]: for i, e in enumerate(R, 1):
   ....:     d[e].append(i)
   ....:
In [21]: from itertools import combinations
In [22]: from itertools import chain
In [23]: list(chain(*[list(combinations(v,2)) for v in d.values() if len(v) > 1]))
Out[23]: [(1, 3), (1, 7), (3, 7), (2, 9), (4, 6)]
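A slightly leaner variant of that last step uses chain.from_iterable with a generator expression, so the intermediate lists are never materialized (a minor variation, not part of the original session):
list(chain.from_iterable(combinations(v, 2) for v in d.values() if len(v) > 1))
# same result as Out[23] above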
What I have now:
d = 0
res = 0
newlist = []
l = [4, 1, 6, 1, 1, 1]
for el in range(len(l)):
    for j in range(len(l)):
        if abs(l[el] - l[j]) <= d and el != j and el not in newlist and j not in newlist:
            newlist.append(el)
            newlist.append(j)
            res += 1

print(res)
It works correctly and returns 2 (the pairs 1,1 and 1,1), but it takes too much time. How can I make it faster? Thanks.
For example, if list = [1, 1, 1, 1] and d = 0 there will be 2 pairs, because you can use each number only once. Using (a, b) and (b, c) is not allowed, and (a, b) and (b, a) count as the same pair...
Sort the list, then walk through it.
Once you have the list sorted, you can just be greedy: take the earliest pair that works, then the next, then the next... and you will end up with the maximum number of valid pairs.
def get_pairs(lst, maxdiff):
    sl = sorted(lst)  # may want to do lst.sort() if you don't mind changing lst
    count = 0
    i = 1
    N = len(sl)
    while i < N:
        # no need for abs -- we know the previous value is not bigger
        if sl[i] - sl[i-1] <= maxdiff:
            count += 1
            i += 2  # these two values are now used
        else:
            i += 1
    return count
And here's some code to benchmark it:
print('generating list...')
from random import randrange, seed
seed(0) # always same contents
l = []
for i in range(1000000):
    l.append(randrange(0, 5000))
print('ok, measuring...')
from time import time
start = time()
print(get_pairs(l, 0))
print('took', time()-start, 'seconds')
And the result (with 1 million values in list):
tmp$ ./test.py
generating list...
ok, measuring...
498784
took 0.6729779243469238 seconds
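As a quick sanity check of the greedy approach, here is a small sketch using the examples from the question itself:
print(get_pairs([4, 1, 6, 1, 1, 1], 0))  # 2, matching the expected answer
print(get_pairs([1, 1, 1, 1], 0))        # 2 -- each number is used at most once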
You may want to compute all the pairs separately and then collect the pairs you want.
def get_pairs(l, difference):
    pairs = []
    # first compute all pairs: n choose 2, which is O(n^2)
    for i in xrange(len(l)):
        for j in xrange(i+1, len(l)):
            pairs.append((l[i], l[j]))
    # collect the pairs you want: O(n^2)
    res = []
    for pair in pairs:
        if abs(pair[0] - pair[1]) <= difference:
            res.append(pair)
    return res
>>> get_pairs([1,2,3,4,2], 0)
[(2, 2)]
>>> get_pairs([1,2,3,4,2], 1)
[(1, 2), (1, 2), (2, 3), (2, 2), (3, 4), (3, 2)]
If you want to remove duplicates from your result, you can convert the res list to a set with set(res) before returning it.
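For example, a small sketch of that last suggestion:
print(set(get_pairs([1, 2, 3, 4, 2], 1)))
# the duplicate (1, 2) collapses to a single entry; the ordering of a set is arbitrary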