How to iterate over a dictionary of tuples - python

I have a list of tuples called possible_moves containing possible moves on a board in my game:
[(2, 1), (2, 2), (2, 3), (3, 1), (4, 5), (5, 2), (5, 3), (6, 0), (6, 2), (7, 1)]
Then, I have a dictionary that assigns a value to each cell on the game board:
{(0,0): 10000, (0,1): -3000, (0,2): 1000, (0,3): 800, etc.}
I want to iterate over all possible moves and find the move with the highest value.
my_value = 0
possible_moves = dict(possible_moves)
for move, value in moves_values:
    if move in possible_moves and possible_moves[move] > my_value:
        my_move = possible_moves[move]
        my_value = value
return my_move
The problem is in the line for move, value in moves_values: it unpacks each key into two integers, but I want move to be the whole tuple.

IIUC, you don't even need the list of possible moves. The moves and their scores you care about are already contained in the dictionary.
>>> from operator import itemgetter
>>>
>>> scores = {(0,0): 10000, (0,1): -3000, (0,2): 1000, (0,3): 800}
>>> max_move, max_score = max(scores.items(), key=itemgetter(1))
>>>
>>> max_move
(0, 0)
>>> max_score
10000
edit: turns out I did not understand quite correctly. Assuming that the list of moves, let's call it possible_moves, contains the moves possible right now and that the dictionary scores contains the scores for all moves, even the impossible ones, you can issue:
max_score, max_move = max((scores[move], move) for move in possible_moves)
... or if you don't need the score:
max_move = max(possible_moves, key=scores.get)

You can use max with dict.get:
possible_moves = [(2, 1), (2, 2), (2, 3), (3, 1), (4, 5), (5, 2),
                  (5, 3), (6, 0), (6, 2), (7, 1), (0, 2), (0, 1)]
scores = {(0,0): 10000, (0,1): -3000, (0,2): 1000, (0,3): 800}
res = max(possible_moves, key=lambda x: scores.get(x, 0)) # (0, 2)
This assumes moves not found in your dictionary have a default score of 0. If you can guarantee that every move is included as a key in your scores dictionary, you can simplify somewhat:
res = max(possible_moves, key=scores.__getitem__)
Note that the [] syntax is syntactic sugar for __getitem__: if a key isn't found, you'll get a KeyError.
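One caveat with a default of 0: if every scored move is negative, an unscored move wins. A minimal sketch, assuming unscored moves should never beat scored ones, uses float("-inf") as the default instead. The sample data below is made up for illustration:

```python
# Hypothetical sample data: every scored move has a negative score.
possible_moves = [(0, 1), (2, 2), (0, 4)]
scores = {(0, 1): -3000, (0, 4): -50}

# With a default of 0, the unscored move (2, 2) beats both scored moves.
with_zero = max(possible_moves, key=lambda m: scores.get(m, 0))

# With -inf as the default, an unscored move can only win if nothing is scored.
with_neg_inf = max(possible_moves, key=lambda m: scores.get(m, float("-inf")))

print(with_zero)     # (2, 2)
print(with_neg_inf)  # (0, 4)
```

Which default is right depends on how the game treats cells without a score, so treat the choice of -inf as an assumption rather than a rule.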

If d is a dict, iterating over d yields only its keys, while d.items() yields key-value pairs. So:
for move, value in moves_values.items():
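Applied to the loop from the question, this might look like the sketch below. The function name best_move is hypothetical, the sample list is shortened, and starting from float("-inf") instead of 0 is an assumption so that negative scores can still win:

```python
possible_moves = [(2, 1), (2, 2), (0, 2), (0, 1)]
moves_values = {(0, 0): 10000, (0, 1): -3000, (0, 2): 1000, (0, 3): 800}

def best_move(possible_moves, moves_values):
    allowed = set(possible_moves)  # a set gives fast membership tests
    my_move, my_value = None, float("-inf")
    # .items() yields (key, value) pairs, so move is the whole tuple
    for move, value in moves_values.items():
        if move in allowed and value > my_value:
            my_move, my_value = move, value
    return my_move  # None if no possible move has a score

print(best_move(possible_moves, moves_values))  # (0, 2)
```

Unlike the original, the list of possible moves is not passed through dict(), which would turn each (row, column) tuple into a key-value pair.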

possibleMoves=[(2, 1), (2, 2), (2, 3), (3, 1), (4, 5), (5, 2),(0, 3),(5, 3), (6, 0), (6, 2), (7, 1),(0,2)]
movevalues={(0,0): 10000, (0,1): -3000, (0,2): 1000, (0,3): 800}
def func():
    my_value=0
    # compare every possible move against every scored cell
    for i in range(len(possibleMoves)):
        for k,v in movevalues.items():
            # keep the highest score among cells that are possible moves
            if possibleMoves[i]==k and v>my_value:
                my_value=v
    return my_value
maxValue=func()
print(maxValue)

Related

How to find the maximum per group in an rdd?

I'm using PySpark and I have an RDD that looks like this:
[
    ("MovieX", [(1, 100), (2, 20), (3, 50)]),
    ("MovieY", [(1, 100), (2, 250), (3, 100), (4, 120)]),
    ("MovieZ", [(1, 1000), (2, 250)]),
    ("MovieX", [(4, 50), (5, 10), (6, 0)]),
    ("MovieY", [(3, 0), (4, 260)]),
    ("MovieZ", [(5, 180)]),
]
The first element in the tuple represents the week number and the second element represents the number of viewers. I want to find the week with the most views for each movie, but ignoring the first week.
I've tried some things but nothing worked, for example:
stats.reduceByKey(max).collect()
returns:
[('MovieX', [(4, 50), (5, 10), (6, 0)]),
('MovieY', [(5, 180)]),
('MovieC', [(3, 0), (4, 260)])]
so the entire second set.
Also this:
stats.groupByKey().reduce(max)
which returns just this:
('MovieZ', <pyspark.resultiterable.ResultIterable at 0x558f75eeb0>)
How can I solve this?
If you want the most views per movie, ignoring the first week, i.e. [('MovieA', 50), ('MovieC', 250), ('MovieB', 260)], then you'll want your own map function rather than a reduce.
movie_stats = spark.sparkContext.parallelize([
    ("MovieA", [(1, 100), (2, 20), (3, "50")]),
    ("MovieC", [(1, 100), (2, "250"), (3, 100), (4, "120")]),
    ("MovieB", [(1, 1000), (2, 250)]),
    ("MovieA", [(4, 50), (5, "10"), (6, 0)]),
    ("MovieB", [(3, 0), (4, "260")]),
    ("MovieC", [(5, "180")]),
])
def get_views_after_first_week(v):
    values = iter(v)  # iterator over the lists of (week, views) tuples grouped under one key
    result = list()
    for x in values:
        # keep the view counts (cast to int) for every week after the first
        result.extend([int(y[1]) for y in x if y[0] > 1])
    return result
mapped = movie_stats.groupByKey().mapValues(get_views_after_first_week).mapValues(max)
mapped.collect()
To include the week number, i.e. [('MovieA', (3, 50)), ('MovieC', (2, 250)), ('MovieB', (4, 260))]:
def get_max_weekly_views_after_first_week(v):
    values = iter(v)  # iterator over the lists of (week, views) tuples grouped under one key
    max_views = float('-inf')
    max_week = None
    for x in values:
        for t in x:
            week, views = t
            views = int(views)  # some view counts are stored as strings
            if week > 1 and views > max_views:
                max_week = week
                max_views = views
    return (max_week, max_views, )
mapped = movie_stats.groupByKey().mapValues(get_max_weekly_views_after_first_week)
Some code is needed to convert the strings to int and to apply a map function that 1) filters out the week 1 data and 2) finds the week with the most views.
def helper(arr: list):
    max_week = None
    for sub_arr in arr:
        for item in sub_arr:
            if item[0] == 1:
                continue
            count = int(item[1])
            if max_week is None or max_week[1] < count:
                max_week = [item[0], count]
    return max_week
movie_stats.groupByKey().map(lambda x: (x[0], helper(x[1]))).collect()
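The logic of these answers can be checked locally without Spark. The sketch below is plain Python, not PySpark: a defaultdict stands in for groupByKey(), and the names records, grouped and best are illustrative. It also shows why reduceByKey(max) returned whole lists: max on two lists compares them element by element and returns one of the lists.

```python
from collections import defaultdict

records = [
    ("MovieA", [(1, 100), (2, 20), (3, "50")]),
    ("MovieC", [(1, 100), (2, "250"), (3, 100), (4, "120")]),
    ("MovieB", [(1, 1000), (2, 250)]),
    ("MovieA", [(4, 50), (5, "10"), (6, 0)]),
    ("MovieB", [(3, 0), (4, "260")]),
    ("MovieC", [(5, "180")]),
]

# max() on two lists returns one whole list, not the best pair inside them.
print(max([(1, 100), (2, 20)], [(4, 50), (5, 10)]))  # [(4, 50), (5, 10)]

# Plain-Python stand-in for groupByKey(): collect all pairs per movie.
grouped = defaultdict(list)
for movie, weeks in records:
    grouped[movie].extend(weeks)

# Drop week 1, cast views to int, and keep the pair with the most views.
best = {
    movie: max(
        ((week, int(views)) for week, views in pairs if week > 1),
        key=lambda pair: pair[1],
    )
    for movie, pairs in grouped.items()
}
print(best)  # {'MovieA': (3, 50), 'MovieC': (2, 250), 'MovieB': (4, 260)}
```

A movie with data for week 1 only would make max raise ValueError on the empty generator, so real data may need a guard for that case.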

print most k frequent numbers of list with rank ties

I was trying to find a way to print the k most frequent numbers in a text file. I was able to sort those numbers into a list of tuples, each holding a number and how many times it appears in the file.
l =[(0, 7), (3, 4), (-101, 3), (2, 3), (-3, 1), (-2, 1), (-1, 1), (101, 1)] # 0 is the number itself, 7 means it appeared in file 7 times, and etc
So, now I want to print out the k most frequent numbers in the file (this should be done RECURSIVELY), but I am struggling with rank ties. For example, if k=3 I want to print:
[(0, 7), (3, 4), (-101, 3), (2, 3)] # top 3 frequencies
I tried doing:
def head(l): return l[0]
def tail(l): return l[1:]
def topk(l,k,e):
    if(len(l)<=1 or k==0):
        return [head(l)[1]]
    elif(head(l)[1]!=e):
        return [head(l)[1]] + topk(tail(l),k-1,head(l)[1])
    else:
        return [head(l)[1]] + topk(tail(l),k,head(l)[1])
l1 = [(0, 7), (3, 4), (-101, 3), (2, 3), (-3, 1), (-2, 1), (-1, 1), (101, 1)]
l2 = [(3.3, 4), (-3.3, 3), (-2.2, 2), (1.1, 1)]
print(topk(l1,3,''))
print(topk(l2,3,''))
l1 prints correctly, but l2 has an extra frequency for some reason.
You can use the built-in sorted function with the key parameter to find the lowest frequency among the top k, and then use a list comprehension to collect all the elements whose frequency is >= that minimum:
v = sorted(l, key=lambda x: x[1])[-3][1]
[e for e in l if e[1] >= v]
output:
[(0, 7), (3, 4), (-101, 3), (2, 3)]
if you want a recursive version you can use:
def my_f(l, v, top=None, i=0):
    if top is None:
        top = []
    if l[i][1] >= v:
        top.append(l[i])
    if i == len(l) - 1:
        return top
    return my_f(l, v, top, i+1)
def topk(l, k):
    k = min(len(l), k)
    v = sorted(l, key=lambda x: x[1])[-k][1]  # frequency of the k-th most frequent element
    return my_f(l, v)
topk(l, 3)
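Note that indexing the sorted list by position counts elements, not distinct frequencies, so a tie among the higher ranks would shrink the result. A minimal recursive sketch that counts distinct frequencies instead is shown below. The name topk_ties and the prev parameter are hypothetical, and the input is assumed to be sorted by frequency, highest first, as in the question:

```python
def topk_ties(pairs, k, prev=None):
    """Return the leading pairs that cover the k highest distinct frequencies.

    Assumes pairs is already sorted by frequency, highest first.
    prev is the frequency of the previously kept pair.
    """
    if not pairs:
        return []
    head, tail = pairs[0], pairs[1:]
    if head[1] != prev:   # a new frequency rank starts here
        if k == 0:        # all k ranks are already used up
            return []
        k -= 1
    return [head] + topk_ties(tail, k, head[1])

l1 = [(0, 7), (3, 4), (-101, 3), (2, 3), (-3, 1), (-2, 1), (-1, 1), (101, 1)]
l2 = [(3.3, 4), (-3.3, 3), (-2.2, 2), (1.1, 1)]
print(topk_ties(l1, 3))  # [(0, 7), (3, 4), (-101, 3), (2, 3)]
print(topk_ties(l2, 3))  # [(3.3, 4), (-3.3, 3), (-2.2, 2)]
```

The rank counter only decreases when the frequency changes, which is what lets tied elements share a rank.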

Order of For Loops with Python List Comprehension

In this answer, it is claimed that
The best way to remember this is that the order of for loop inside the list comprehension is based on the order in which they appear in traditional loop approach. Outer most loop comes first, and then the inner loops subsequently.
However, this answer, and my own experiment below, seem to show the opposite, i.e. the inner loop coming first.
In my example, I want j to represent the row number and i to represent the column number. I want 5 rows and 4 columns. What am I missing, please?
board = [[(j, i) for i in range(4)] for j in range(5)]
# I believe the above comprehension is equivalent to the nested for loops below
# board = []
# for j in range(5):
#     new_row = []
#     for i in range(4):
#         new_row.append((j,i))
#     board.append(new_row)
for j in range(5):
    for i in range(4):
        print(board[j][i], end="")
    print()
This is the correct way to get the desired output:
board = [(j, i) for i in range(4) for j in range(5)]
Output:
[(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (0, 1), (1, 1), (2, 1), (3, 1), (4, 1), (0, 2), (1, 2), (2, 2), (3, 2), (4, 2), (0, 3), (1, 3), (2, 3), (3, 3), (4, 3)]
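The two claims in the question can be reconciled by separating nested comprehensions from a single comprehension with several for clauses. In the nested form, the inner brackets are simply the expression evaluated once per iteration of the outer for, so the for written last is the outer loop. In the flat form, the for clauses read left to right like nested for loops. A short sketch, with rows, cols, flat and flat_loops as illustrative names:

```python
rows, cols = 5, 4

# Nested comprehension: the outer comprehension (for j) builds the rows,
# the inner one (for i) builds the cells of each row.
board = [[(j, i) for i in range(cols)] for j in range(rows)]

# Flat comprehension: the leftmost for is the outer loop.
flat = [(j, i) for j in range(rows) for i in range(cols)]

# Equivalent explicit loops for the flat version.
flat_loops = []
for j in range(rows):
    for i in range(cols):
        flat_loops.append((j, i))

print(len(board), len(board[0]))  # 5 4
print(flat == flat_loops)         # True
print(flat[:5])                   # [(0, 0), (0, 1), (0, 2), (0, 3), (1, 0)]
```

Seen this way, the nested comprehension in the question already gives 5 rows of 4 columns, with board[j][i] equal to (j, i).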

Number of passengers. Error: list indices must be integers or slices, not list

So, I'm trying to sum the number of passengers at each stop.
The stops variable is a list with one tuple per stop, each containing the number of passengers getting on and off, for example:
stops = [(in1, out1), (in2, out2), (in3, out3), (in4, out4)]
stops = [(10, 0), (4, 1), (3, 5), (3, 4), (5, 1), (1, 5), (5, 8), (4, 6), (2, 3)]
number_passenger_per_stop = []
for i in stops:
    resta = stops[i][0] - stops[i][1]
    number_passenger_per_stop.append(resta)
print(number_passenger_per_stop)
I can do the math like this outside the loop, but I don't understand why it crashes inside the loop:
stops[i][0] - stops[i][1]
i is not the list index, it's the list element itself. You don't need to write stops[i].
resta = i[0] - i[1]
Your code would be correct if you had written
for i in range(len(stops)):
You could also replace the entire thing with a list comprehension:
number_passenger_per_stop = [on - off for on, off in stops]
I just edited the for loop to address each element by its index correctly; you needed to access each element in the list by its position, not by its value:
stops = [(10, 0), (4, 1), (3, 5), (3, 4), (5, 1), (1, 5), (5, 8), (4, 6), (2, 3)]
number_passenger_per_stop = []
for i in range(len(stops)):
    resta = stops[i][0] - stops[i][1]
    number_passenger_per_stop.append(resta)
print(number_passenger_per_stop)
Output:
[10, 3, -2, -1, 4, -4, -3, -2, -1]
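If the goal is the number of passengers on board after each stop rather than the net change per stop (an assumption about the intent of the question), the per-stop differences can be accumulated with itertools.accumulate. A minimal sketch, with net_change and on_board as illustrative names:

```python
from itertools import accumulate

stops = [(10, 0), (4, 1), (3, 5), (3, 4), (5, 1), (1, 5), (5, 8), (4, 6), (2, 3)]

# Net change at each stop: passengers getting on minus passengers getting off.
net_change = [on - off for on, off in stops]

# Running total: passengers on board after each stop.
on_board = list(accumulate(net_change))

print(net_change)  # [10, 3, -2, -1, 4, -4, -3, -2, -1]
print(on_board)    # [10, 13, 11, 10, 14, 10, 7, 5, 4]
```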

Sort out pairs with same members but different order from list of pairs

From the list
l =[(3,4),(2,3),(4,3),(3,2)]
I want to sort out all second appearances of pairs with the same members in reverse order. I.e., the result should be
[(3,4),(2,3)]
What's the most concise way to do that in Python?
Alternatively, one might do it in a more verbose way:
l = [(3,4),(2,3),(4,3),(3,2)]
L = []
omega = set()
for a,b in l:
    key = (min(a,b), max(a,b))
    if key in omega:
        continue
    omega.add(key)
    L.append((a,b))
print(L)
If we want to keep only the first tuple of each pair:
l =[(3,4),(2,3),(4,3),(3,2), (3, 3), (5, 6)]
def first_tuples(l):
    # We could use a list to keep track of the already seen
    # tuples, but checking if they are in a set is faster
    already_seen = set()
    out = []
    for tup in l:
        # As sets can only contain hashables, we use a
        # frozenset here
        key = frozenset(tup)
        if key not in already_seen:
            out.append(tup)
            already_seen.add(key)
    return out
print(first_tuples(l))
# [(3, 4), (2, 3), (3, 3), (5, 6)]
This ought to do the trick:
[x for i, x in enumerate(l) if any(y[::-1] == x for y in l[i:])]
Out[23]: [(3, 4), (2, 3)]
Expanding the initial list a little bit with different orderings:
l =[(3,4),(2,3),(4,3),(3,2), (1,3), (3,1)]
[x for i, x in enumerate(l) if any(y[::-1] == x for y in l[i:])]
Out[25]: [(3, 4), (2, 3), (1, 3)]
And, depending on whether each tuple is guaranteed to have an accompanying "sister" reversed tuple, the logic may change in order to keep "singleton" tuples:
l = [(3, 4), (2, 3), (4, 3), (3, 2), (1, 3), (3, 1), (10, 11), (10, 12)]
[x for i, x in enumerate(l) if any(y[::-1] == x for y in l[i:]) or not any(y[::-1] == x for y in l)]
Out[35]: [(3, 4), (2, 3), (1, 3), (10, 11), (10, 12)]
IMHO, this should be both shorter and clearer than anything posted so far:
>>> my_tuple_list = [(3,4),(2,3),(4,3),(3,2)]
>>> set((left, right) if left < right else (right, left) for left, right in my_tuple_list)
{(2, 3), (3, 4)}
It simply builds a set of all tuples, swapping the members beforehand whenever the first member is greater than the second.
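A set keeps neither the original order of the list nor the orientation of the first occurrence, so (4, 3) followed by (3, 4) would come back as (3, 4). If both matter, a compact alternative is a dict keyed by frozenset. This is a sketch that relies on dicts preserving insertion order (Python 3.7+), and the function name first_of_each_pair is hypothetical:

```python
def first_of_each_pair(pairs):
    first_seen = {}
    for pair in pairs:
        # frozenset ignores member order; setdefault keeps only the first pair seen
        first_seen.setdefault(frozenset(pair), pair)
    return list(first_seen.values())

l = [(3, 4), (2, 3), (4, 3), (3, 2)]
print(first_of_each_pair(l))  # [(3, 4), (2, 3)]
```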
