print most k frequent numbers of list with rank ties

I was trying to find a way to print the k most frequent numbers in a text file. I was able to sort those numbers into a list of tuples, each pairing a number with its number of appearances in the file.
l = [(0, 7), (3, 4), (-101, 3), (2, 3), (-3, 1), (-2, 1), (-1, 1), (101, 1)] # 0 is the number itself, 7 means it appeared in the file 7 times, etc.
So now I want to print the k most frequent numbers in the file (this should be done RECURSIVELY), but I am struggling with rank ties. For example, if k=3 I want to print:
[(0, 7), (3, 4), (-101, 3), (2, 3)] # top 3 frequencies
I tried doing:
def head(l): return l[0]
def tail(l): return l[1:]
def topk(l,k,e):
    if(len(l)<=1 or k==0):
        return [head(l)[1]]
    elif(head(l)[1]!=e):
        return [head(l)[1]] + topk(tail(l),k-1,head(l)[1])
    else:
        return [head(l)[1]] + topk(tail(l),k,head(l)[1])
l1 = [(0, 7), (3, 4), (-101, 3), (2, 3), (-3, 1), (-2, 1), (-1, 1), (101, 1)]
l2 = [(3.3, 4), (-3.3, 3), (-2.2, 2), (1.1, 1)]
print(topk(l1,3,''))
print(topk(l2,3,''))
l1 prints correctly, but l2 has an extra frequency for some reason.

You can use the built-in sorted function with the key parameter to get the k-th highest frequency (here k = 3), and then use a list comprehension to get all the elements whose frequency is >= that minimum value:
v = sorted(l, key=lambda x: x[1])[-3][1]
[e for e in l if e[1] >= v]
output:
[(0, 7), (3, 4), (-101, 3), (2, 3)]
if you want a recursive version you can use:
def my_f(l, v, top=None, i=0):
    if top is None:
        top = []
    if l[i][1] >= v:
        top.append(l[i])
    if i == len(l) - 1:
        return top
    return my_f(l, v, top, i+1)
def topk(l, k):
    k = min(len(l), k)
    # frequency of the k-th most frequent entry
    v = sorted(l, key=lambda x: x[1])[-k][1]
    return my_f(l, v)
topk(l, 3)
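
For comparison, here is a minimal sketch of a purely recursive variant that stays close to the question's head/tail approach. It assumes the list is already sorted by frequency, highest first, and it counts k distinct frequency values, as the question's code does. The parameter name prev is an assumption introduced for this illustration.

```python
def topk(pairs, k, prev=None):
    """Return the leading (number, frequency) pairs that cover the k highest
    distinct frequencies, keeping every pair tied on a frequency.

    Assumes `pairs` is sorted by frequency in descending order.
    `prev` (hypothetical helper argument) is the frequency of the previous pair.
    """
    if not pairs:
        return []
    head, tail = pairs[0], pairs[1:]
    if head[1] != prev:
        # A new frequency value starts here; stop if the budget is used up.
        if k == 0:
            return []
        k -= 1
    return [head] + topk(tail, k, head[1])


l1 = [(0, 7), (3, 4), (-101, 3), (2, 3), (-3, 1), (-2, 1), (-1, 1), (101, 1)]
l2 = [(3.3, 4), (-3.3, 3), (-2.2, 2), (1.1, 1)]
print(topk(l1, 3))  # [(0, 7), (3, 4), (-101, 3), (2, 3)]
print(topk(l2, 3))  # [(3.3, 4), (-3.3, 3), (-2.2, 2)]
```

Checking k == 0 only when the frequency changes is what lets tied entries through without letting an extra frequency slip in at the end.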

Related

How to iterate over a dictionary of tuples

I have a list of tuples called possible_moves containing possible moves on a board in my game:
[(2, 1), (2, 2), (2, 3), (3, 1), (4, 5), (5, 2), (5, 3), (6, 0), (6, 2), (7, 1)]
Then, I have a dictionary that assigns a value to each cell on the game board:
{(0,0): 10000, (0,1): -3000, (0,2): 1000, (0,3): 800, etc.}
I want to iterate over all possible moves and find the move with the highest value.
my_value = 0
possible_moves = dict(possible_moves)
for move, value in moves_values:
    if move in possible_moves and possible_moves[move] > my_value:
        my_move = possible_moves[move]
        my_value = value
return my_move
The problem is in the part for move, value, because it creates two integer indexes, but I want the index move to be a tuple.

IIUC, you don't even need the list of possible moves. The moves and their scores you care about are already contained in the dictionary.
>>> from operator import itemgetter
>>>
>>> scores = {(0,0): 10000, (0,1): -3000, (0,2): 1000, (0,3): 800}
>>> max_move, max_score = max(scores.items(), key=itemgetter(1))
>>>
>>> max_move
(0, 0)
>>> max_score
10000
edit: turns out I did not understand quite correctly. Assuming that the list of moves, let's call it possible_moves, contains the moves possible right now and that the dictionary scores contains the scores for all moves, even the impossible ones, you can issue:
max_score, max_move = max((scores[move], move) for move in possible_moves)
... or if you don't need the score:
max_move = max(possible_moves, key=scores.get)

You can use max with dict.get:
possible_moves = [(2, 1), (2, 2), (2, 3), (3, 1), (4, 5), (5, 2),
(5, 3), (6, 0), (6, 2), (7, 1), (0, 2), (0, 1)]
scores = {(0,0): 10000, (0,1): -3000, (0,2): 1000, (0,3): 800}
res = max(possible_moves, key=lambda x: scores.get(x, 0)) # (0, 2)
This assumes moves not found in your dictionary have a default score of 0. If you can guarantee that every move is included as a key in your scores dictionary, you can simplify somewhat:
res = max(possible_moves, key=scores.__getitem__)
Note that the [] syntax is syntactic sugar for __getitem__: if the key isn't found you'll get a KeyError.

If d is a dict, iterating over d yields its keys, while d.items() yields key-value pairs. So:
for move, value in moves_values.items():
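
As a rough sketch, the question's loop could be rewritten around .items() like this. The function name best_move and the shortened sample data are assumptions made for the illustration; moves missing from the dictionary are simply skipped.

```python
def best_move(possible_moves, moves_values):
    """Return the allowed move with the highest value, or None if there is none."""
    allowed = set(possible_moves)  # set membership tests are fast
    best, best_value = None, None
    # .items() yields (key, value) pairs, so `move` is the tuple key
    for move, value in moves_values.items():
        if move in allowed and (best_value is None or value > best_value):
            best, best_value = move, value
    return best


possible_moves = [(2, 1), (0, 2), (0, 1), (0, 3)]
moves_values = {(0, 0): 10000, (0, 1): -3000, (0, 2): 1000, (0, 3): 800}
print(best_move(possible_moves, moves_values))  # (0, 2)
```

Starting from None rather than 0 means a board where every allowed move has a negative value still returns a move.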

possibleMoves=[(2, 1), (2, 2), (2, 3), (3, 1), (4, 5), (5, 2),(0, 3),(5, 3), (6, 0), (6, 2), (7, 1),(0,2)]
movevalues={(0,0): 10000, (0,1): -3000, (0,2): 1000, (0,3): 800}
def func():
    my_value=0
    for i in range(len(possibleMoves)):
        for k,v in movevalues.items():
            if possibleMoves[i]==k and v>my_value:
                my_value=v
    return my_value
maxValue=func()
print(maxValue)

get indexes of elements from a zigzag configuration

How could I get the indexes of elements in an n-row array configuration?
The length of a row should be given by a string of length l.
For example:
For a 2-row array configuration with l=7, the elements (X) will have indexes:
elements = [(0, 0), (0, 2), (0, 4), (0, 6), (1, 1), (1, 3), (1, 5), (1, 7)]
[[X - X - X - X],
[- X - X - X -]]
For a 3-rows array with l=8, the elements (X) will have indexes:
elements = [(0, 0), (0, 4), (0, 8), (1, 1), (1, 3), (1, 5), (1, 7), (2, 2), (2, 6)]
[[X - - - X - - - X],
[- X - X - X - X -],
[- - X - - - X - -]]
The idea is to extend this to higher row numbers. Is there an "analytical" way of getting those indexes?
Thanks in advance.
P.S.: By "analytical" I mean an equation or something that I could code

this is my first shot at your problem:
def grid(width, depth):
    assert depth % 2 == 0
    height = depth//2 + 1
    lines = []
    for y in range(height):
        line = ''.join('X' if ((i+y) % depth == 0 or (i-y) % depth == 0)
                       else '-' for i in range(width))
        lines.append(line)
    return '\n'.join(lines)
the depth is the parameter that defines how far apart the Xs are spaced on the first line (the name is poorly chosen); the width is how many characters should be displayed per line.
this will only work for even depths.
with outputs
-> print(grid(width=10, depth=2))
X-X-X-X-X-
-X-X-X-X-X
-> print(grid(width=10, depth=4))
X---X---X-
-X-X-X-X-X
--X---X---
-> print(grid(width=15, depth=6))
X-----X-----X--
-X---X-X---X-X-
--X-X---X-X---X
---X-----X-----
this was mostly trial & error so there is not much to explain...
if you prefer your elements representation - here is what you can do:
def grid_elements(width, depth):
    assert depth % 2 == 0
    height = depth//2 + 1
    elements = []
    for y in range(height):
        elements.extend((y, i) for i in range(width)
                        if ((i+y) % depth == 0 or (i-y) % depth == 0))
    return elements
this creates the results:
-> print(grid_elements(width=10, depth=2))
[(0, 0), (0, 2), (0, 4), (0, 6), (0, 8), (1, 1), (1, 3), (1, 5), (1, 7), (1, 9)]
-> print(grid_elements(width=10, depth=4))
[(0, 0), (0, 4), (0, 8), (1, 1), (1, 3), (1, 5), (1, 7), (1, 9), (2, 2), (2, 6)]
-> print(grid_elements(width=15, depth=6))
[(0, 0), (0, 6), (0, 12), (1, 1), (1, 5), (1, 7), (1, 11), (1, 13), (2, 2),
(2, 4), (2, 8), (2, 10), (2, 14), (3, 3), (3, 9)]
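
The same pattern can also be parameterised by the number of rows directly, which is closer to how the question is phrased. The sketch below is one possible way to do that, not the answer's own method: the function name zigzag_indexes and its parameters are assumptions. It relies on the zigzag repeating every 2 * (n_rows - 1) columns, which corresponds to depth in the code above (height = depth//2 + 1).

```python
def zigzag_indexes(n_rows, width):
    """Return sorted (row, column) indexes of a zigzag over `width` columns."""
    if n_rows == 1:
        # Degenerate case: a single row holds every element.
        return [(0, col) for col in range(width)]
    period = 2 * (n_rows - 1)  # columns needed to go down and back up
    cells = []
    for col in range(width):
        phase = col % period
        # First half of the period walks down, second half walks back up.
        row = phase if phase < n_rows else period - phase
        cells.append((row, col))
    return sorted(cells)


print(zigzag_indexes(3, 9))
# [(0, 0), (0, 4), (0, 8), (1, 1), (1, 3), (1, 5), (1, 7), (2, 2), (2, 6)]
```

Each column holds exactly one element, so the row can be computed from the column with a single modulo, which is about as "analytical" as it gets.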

This is an example of code that can do this.
import numpy as np
nb_row = 3
nb_column = 10
separator_element = '-'
element = 'X'
# Initialise the table; by default, every cell holds the separator element.
table = np.full((nb_row, nb_column), separator_element)
# Loop over each column: the first column has the element in the first row.
# The position then moves down one row per column; at the bottom row it
# switches to going up, and at the top row it switches back to going down.
position_element = 0
go_down = True
for no_column in range(nb_column):
    table[position_element, no_column] = element
    if go_down:
        # Going down: switch direction once the bottom row is reached.
        position_element = (position_element + 1) % nb_row
        go_down = (position_element != (nb_row - 1))
    else:
        # Going up: switch direction once the top row is reached.
        position_element = (position_element - 1) % nb_row
        go_down = (position_element == 0)
print(table)
#[['X' '-' '-' '-' 'X' '-' '-' '-' 'X' '-']
#['-' 'X' '-' 'X' '-' 'X' '-' 'X' '-' 'X']
#['-' '-' 'X' '-' '-' '-' 'X' '-' '-' '-']]

We can use itertools.groupby here to create a dictionary that has the indexes of the sublists as keys and the sublist indexes of interest as values: {0: [0, 2, 4, 6], 1: [1, 3, 5, 7]}. We can then apply this to a list of rows generated with n = 7, modifying each row at the indexes stored under its key.
from itertools import groupby
elements = [(0, 0), (0, 2), (0, 4), (0, 6), (1, 1), (1, 3), (1, 5), (1, 7)]
n = 7
d = {}
for k, g in groupby(elements, key=lambda x: x[0]):
    d[k] = [i[1] for i in g]
lst = [['-']*n for i in d]
for k in d:
    for i, v in enumerate(lst[k]):
        if i in d[k]:
            lst[k][i] = 'X'
    lst[k] = ' '.join(lst[k])
for i in lst:
    print(i)
# X - X - X - X
# - X - X - X -

Sort out pairs with same members but different order from list of pairs

From the list
l =[(3,4),(2,3),(4,3),(3,2)]
I want to sort out all second appearances of pairs with the same members in reverse order. I.e., the result should be
[(3,4),(2,3)]
What's the most concise way to do that in Python?

Alternatively, one might do it in a more verbose way:
l = [(3,4),(2,3),(4,3),(3,2)]
L = []
omega = set([])
for a,b in l:
    key = (min(a,b), max(a,b))
    if key in omega:
        continue
    omega.add(key)
    L.append((a,b))
print(L)

If we want to keep only the first tuple of each pair:
l =[(3,4),(2,3),(4,3),(3,2), (3, 3), (5, 6)]
def first_tuples(l):
    # We could use a list to keep track of the already seen
    # tuples, but checking if they are in a set is faster
    already_seen = set()
    out = []
    for tup in l:
        if set(tup) not in already_seen:
            out.append(tup)
            # As sets can only contain hashables, we use a
            # frozenset here
            already_seen.add(frozenset(tup))
    return out
print(first_tuples(l))
# [(3, 4), (2, 3), (3, 3), (5, 6)]

This ought to do the trick:
[x for i, x in enumerate(l) if any(y[::-1] == x for y in l[i:])]
Out[23]: [(3, 4), (2, 3)]
Expanding the initial list a little bit with different orderings:
l =[(3,4),(2,3),(4,3),(3,2), (1,3), (3,1)]
[x for i, x in enumerate(l) if any(y[::-1] == x for y in l[i:])]
Out[25]: [(3, 4), (2, 3), (1, 3)]
And, depending on whether each tuple is guaranteed to have an accompanying "sister" reversed tuple, the logic may change in order to keep "singleton" tuples:
l = [(3, 4), (2, 3), (4, 3), (3, 2), (1, 3), (3, 1), (10, 11), (10, 12)]
[x for i, x in enumerate(l) if any(y[::-1] == x for y in l[i:]) or not any(y[::-1] == x for y in l)]
Out[35]: [(3, 4), (2, 3), (1, 3), (10, 11), (10, 12)]

IMHO, this should be both shorter and clearer than anything posted so far:
my_tuple_list = [(3,4),(2,3),(4,3),(3,2)]
set((left, right) if left < right else (right, left) for left, right in my_tuple_list)
>>> {(2, 3), (3, 4)}
It simply makes a set of all tuples, whose members are exchanged beforehand if first member is > second member.
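
Note that a plain set neither keeps the original order of the list nor the original orientation of each pair. If either matters, a dictionary keyed on a frozenset is one possible alternative. This is a sketch only: the function name first_of_each_pair is an assumption, and it relies on dictionaries preserving insertion order (Python 3.7+).

```python
def first_of_each_pair(pairs):
    """Keep the first occurrence of each unordered pair, in input order."""
    seen = {}
    for pair in pairs:
        # frozenset ignores member order, so (3, 4) and (4, 3) share a key;
        # setdefault only stores the first pair seen for that key
        seen.setdefault(frozenset(pair), pair)
    return list(seen.values())


print(first_of_each_pair([(3, 4), (2, 3), (4, 3), (3, 2)]))  # [(3, 4), (2, 3)]
print(first_of_each_pair([(4, 3), (2, 3), (3, 4)]))          # [(4, 3), (2, 3)]
```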

Outerzip / zip longest function (with multiple fill values)

Is there a Python function for an "outer-zip", i.e. an extension of zip with a different default value for each iterable?
a = [1, 2, 3] # associate a default value 0
b = [4, 5, 6, 7] # associate b default value 1
zip(a,b) # [(1, 4), (2, 5), (3, 6)]
outerzip((a, 0), (b, 1)) = [(1, 4), (2, 5), (3, 6), (0, 7)]
outerzip((b, 0), (a, 1)) = [(4, 1), (5, 2), (6, 3), (7, 1)]
I can almost replicate this outerzip function using map, but with None as the only default:
map(None, a, b) # [(1, 4), (2, 5), (3, 6), (None, 7)]
Note1: The built-in zip function takes an arbitrary number of iterables, and so should an outerzip function. (e.g. one should be able to calculate outerzip((a,0),(a,0),(b,1)) similarly to zip(a,a,b) and map(None, a, a, b).)
Note2: I say "outer-zip", in the style of this haskell question, but perhaps this is not correct terminology.

It's called izip_longest (zip_longest in python-3.x):
>>> from itertools import zip_longest
>>> a = [1,2,3]
>>> b = [4,5,6,7]
>>> list(zip_longest(a, b, fillvalue=0))
[(1, 4), (2, 5), (3, 6), (0, 7)]
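
zip_longest accepts only a single fillvalue, so on its own it does not cover the per-iterable defaults the question asks for. One way to get there, sketched below under the question's (iterable, default) calling convention, is to fill with a private sentinel and swap in each column's default afterwards. The names outerzip and _MISSING are assumptions made for this illustration.

```python
from itertools import zip_longest

_MISSING = object()  # private sentinel that cannot collide with real data


def outerzip(*pairs):
    """Zip (iterable, default) pairs, padding each column with its own default."""
    iterables = [iterable for iterable, _ in pairs]
    defaults = [default for _, default in pairs]
    for row in zip_longest(*iterables, fillvalue=_MISSING):
        # Replace the sentinel with the default belonging to that column.
        yield tuple(default if value is _MISSING else value
                    for value, default in zip(row, defaults))


a = [1, 2, 3]
b = [4, 5, 6, 7]
print(list(outerzip((a, 0), (b, 1))))  # [(1, 4), (2, 5), (3, 6), (0, 7)]
print(list(outerzip((b, 0), (a, 1))))  # [(4, 1), (5, 2), (6, 3), (7, 1)]
```

Because it only iterates, this also works for generators and other iterables without a length.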

You could modify zip_longest to support your use case for general iterables.
from itertools import chain, repeat
class OuterZipStopIteration(Exception):
    pass
def outer_zip(*args):
    count = len(args) - 1
    def sentinel(default):
        nonlocal count
        if not count:
            raise OuterZipStopIteration
        count -= 1
        yield default
    iters = [chain(p, sentinel(default), repeat(default)) for p, default in args]
    try:
        while iters:
            yield tuple(map(next, iters))
    except OuterZipStopIteration:
        pass
print(list(outer_zip( ("abcd", '!'),
                      ("ef", '#'),
                      (map(int, '345'), '$') )))

This function can be defined by extending each input list to the maximum length and zipping:
def outerzip(*args):
    # args = (a, default_a), (b, default_b), ...
    max_length = max(len(s[0]) for s in args)
    extended_args = [ s[0] + [s[1]]*(max_length-len(s[0])) for s in args ]
    return list(zip(*extended_args))
outerzip((a, 0), (b, 1)) # [(1, 4), (2, 5), (3, 6), (0, 7)]

Get tuples with max value from each key from a list

I have a list of tuples like this:
[(1, 0), (2, 1), (3, 1), (6, 2), (3, 2), (2, 3)]
I want to keep the tuples which have the max first value of every tuple with the same second value. For example (2, 1) and (3, 1) share the same second (key) value, so I just want to keep the one with the max first value -> (3, 1). In the end I would get this:
[(1, 0), (3, 1), (6, 2), (2, 3)]
I don't mind at all if it is not a one-liner but I was wondering about an efficient way to go about this...

from operator import itemgetter
from itertools import groupby
[max(items) for key, items in groupby(L,key = itemgetter(1))]
This assumes that your initial list of tuples is sorted by key values.
groupby creates an iterator that yields objects like (0, <itertools._grouper object at 0x01321330>), where the first value is the key value, the second one is another iterator which gives all the tuples with that key.
max(items) just selects the tuple with the maximum value, and since all the second values of the group are the same (and is also the key), it gives the tuple with the maximum first value.
A list comprehension is used to form an output list of tuples based on the output of these functions.
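
If the input is not guaranteed to be grouped by key, sorting on the key first restores that assumption. A minimal sketch, where the function name max_per_key is an assumption made for the illustration:

```python
from itertools import groupby
from operator import itemgetter


def max_per_key(pairs):
    """For each second element (the key), keep the tuple with the largest first element."""
    # groupby only merges adjacent items, so sort by the key first
    ordered = sorted(pairs, key=itemgetter(1))
    return [max(group) for _, group in groupby(ordered, key=itemgetter(1))]


print(max_per_key([(3, 2), (1, 0), (6, 2), (2, 1), (3, 1), (2, 3)]))
# [(1, 0), (3, 1), (6, 2), (2, 3)]
```

The result comes back ordered by key, which matches the expected output in the question.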

Probably using a dict:
rd = {}
for V,K in my_tuples:
    if V > rd.setdefault(K,V):
        rd[K] = V
result = [ (V,K) for K,V in rd.items() ]

import itertools
import operator
l = [(1, 0), (2, 1), (3, 1), (6, 2), (3, 2), (2, 3)]
result = list(max(v, key=operator.itemgetter(0)) for k, v in itertools.groupby(l, operator.itemgetter(1)))

You could use a dictionary keyed on the second element of the tuple:
l = [(1, 0), (2, 1), (3, 1), (6, 2), (3, 2), (2, 3)]
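d = dict([(t[1], None) for t in l])
for v, k in l:
    # None marks a key with no value yet (None < int raises TypeError in Python 3)
    if d[k] is None or d[k] < v:
        d[k] = v
l2 = [ (v, k) for (k, v) in d.items() if v is not None ]
print(l2)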
