get indexes of elements from a zigzag configuration - python

How could I get the indexes of elements in an n-row array configuration?
The length of a row should be given by a string of length l.
For example:
For a 2-row array configuration with l=7, the elements (X) will have indexes:
elements = [(0, 0), (0, 2), (0, 4), (0, 6), (1, 1), (1, 3), (1, 5), (1, 7)]
[[X - X - X - X],
[- X - X - X -]]
For a 3-row array configuration with l=8, the elements (X) will have indexes:
elements = [(0, 0), (0, 4), (0, 8), (1, 1), (1, 3), (1, 5), (1, 7), (2, 2), (2, 6)]
[[X - - - X - - - X],
[- X - X - X - X -],
[- - X - - - X - -]]
The idea is to extend this to higher row counts. Is there an "analytical" way of getting those indexes?
Thanks in advance.
P.S.: By "analytical" I mean an equation or something that I could code

This is my first shot at your problem:
def grid(width, depth):
    assert depth % 2 == 0
    height = depth//2 + 1
    lines = []
    for y in range(height):
        line = ''.join('X' if ((i+y) % depth == 0 or (i-y) % depth == 0)
                       else '-' for i in range(width))
        lines.append(line)
    return '\n'.join(lines)
depth is the parameter that defines how far apart the Xs are spaced on the first line (the name is poorly chosen); width is how many characters are displayed per line.
This only works for even depths.
With outputs:
-> print(grid(width=10, depth=2))
X-X-X-X-X-
-X-X-X-X-X
-> print(grid(width=10, depth=4))
X---X---X-
-X-X-X-X-X
--X---X---
-> print(grid(width=15, depth=6))
X-----X-----X--
-X---X-X---X-X-
--X-X---X-X---X
---X-----X-----
This was mostly trial and error, so there is not much to explain...
If you prefer your elements representation, here is what you can do:
def grid_elements(width, depth):
    assert depth % 2 == 0
    height = depth//2 + 1
    elements = []
    for y in range(height):
        elements.extend((y, i) for i in range(width)
                        if ((i+y) % depth == 0 or (i-y) % depth == 0))
    return elements
This creates the results:
-> print(grid_elements(width=10, depth=2))
[(0, 0), (0, 2), (0, 4), (0, 6), (0, 8), (1, 1), (1, 3), (1, 5), (1, 7), (1, 9)]
-> print(grid_elements(width=10, depth=4))
[(0, 0), (0, 4), (0, 8), (1, 1), (1, 3), (1, 5), (1, 7), (1, 9), (2, 2), (2, 6)]
-> print(grid_elements(width=15, depth=6))
[(0, 0), (0, 6), (0, 12), (1, 1), (1, 5), (1, 7), (1, 11), (1, 13), (2, 2),
(2, 4), (2, 8), (2, 10), (2, 14), (3, 3), (3, 9)]
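For the "analytical" form the question asks about, one possible reading (a sketch, not part of the answer above) is that an n-row zigzag repeats every 2*(n-1) columns, which is the role depth plays in grid. The name zigzag_indexes and its parameters n_rows and n_cols are made up for illustration.

```python
def zigzag_indexes(n_rows, n_cols):
    """Return the (row, column) pairs of a zigzag with n_rows rows and n_cols columns."""
    # The pattern repeats every 2*(n_rows-1) columns; with a single row
    # every column holds an element, which a period of 1 reproduces.
    period = max(2 * (n_rows - 1), 1)
    return [(r, c)
            for r in range(n_rows)
            for c in range(n_cols)
            # Row r is hit on the way down (c % period == r)
            # and on the way up (c % period == period - r).
            if c % period in (r, period - r)]


print(zigzag_indexes(3, 9))
# [(0, 0), (0, 4), (0, 8), (1, 1), (1, 3), (1, 5), (1, 7), (2, 2), (2, 6)]
```

Unlike grid, this takes the number of rows directly, so odd and even periods are not treated differently.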

This is an example of code that can do this.
import numpy as np
nb_row = 3
nb_column = 10
separator_element = '-'
element = 'X'
# Initialise the table; by default every cell holds the separator element.
# (np.full with a str fill value is used instead of np.chararray so that
# the cells print as 'X' rather than b'X' on Python 3.)
table = np.full((nb_row, nb_column), separator_element)
# Loop over each column. The first column has its element in the first row;
# the element then moves down one row per column. At the bottom row it
# switches to going up, and at the top row it switches to going down again.
position_element = 0
go_down = True
for no_column in range(nb_column):
    table[position_element, no_column] = element
    if go_down:
        # Going down: switch to going up once the bottom row is reached.
        position_element = (position_element + 1) % nb_row
        go_down = (position_element != (nb_row - 1))
    else:
        # Going up: switch to going down once the top row is reached.
        position_element = (position_element - 1) % nb_row
        go_down = (position_element == 0)
print(table)
#[['X' '-' '-' '-' 'X' '-' '-' '-' 'X' '-']
#['-' 'X' '-' 'X' '-' 'X' '-' 'X' '-' 'X']
#['-' '-' 'X' '-' '-' '-' 'X' '-' '-' '-']]
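The up/down stepping in the loop above can also be written in closed form as a triangle wave over the column number. This is only a sketch in plain Python; the helper name zigzag_row is an assumption for illustration and does not come from the answer.

```python
def zigzag_row(column, nb_row):
    """Row that holds the element in the given column of a zigzag."""
    if nb_row == 1:
        return 0
    bottom = nb_row - 1
    period = 2 * bottom
    # Triangle wave: 0, 1, ..., bottom, bottom - 1, ..., 1, then repeat.
    return bottom - abs(column % period - bottom)


rows = [zigzag_row(c, 3) for c in range(10)]
print(rows)  # [0, 1, 2, 1, 0, 1, 2, 1, 0, 1]
```

Pairing each row with its column number gives the same cells that the loop marks with 'X'.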

We can use itertools.groupby here to create a dictionary that maps each sublist index to the column indexes of interest: {0: [0, 2, 4, 6], 1: [1, 3, 5, 7]}. We can then build one list of n = 7 separator elements per key and replace the entries at those indexes with 'X'.
from itertools import groupby
elements = [(0, 0), (0, 2), (0, 4), (0, 6), (1, 1), (1, 3), (1, 5), (1, 7)]
n = 7
d = {}
for k, g in groupby(elements, key=lambda x: x[0]):
    d[k] = [i[1] for i in g]
lst = [['-']*n for i in d]
for k in d:
    for i, v in enumerate(lst[k]):
        if i in d[k]:
            lst[k][i] = 'X'
    lst[k] = ' '.join(lst[k])
for i in lst:
    print(i)
# X - X - X - X
# - X - X - X -
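One caveat worth keeping in mind: groupby only merges consecutive items with the same key, so the approach above relies on elements being ordered by row. The following sketch uses a made-up unordered list to show the difference, with collections.defaultdict as one possible alternative.

```python
from collections import defaultdict
from itertools import groupby

# Pairs deliberately not ordered by row (hypothetical input).
unordered = [(0, 0), (1, 1), (0, 2), (1, 3)]

# groupby starts a new group whenever the key changes, so row 0 appears twice.
runs = [(k, [c for _, c in g]) for k, g in groupby(unordered, key=lambda x: x[0])]
print(runs)  # [(0, [0]), (1, [1]), (0, [2]), (1, [3])]

# A defaultdict collects the columns per row regardless of input order.
by_row = defaultdict(list)
for row, col in unordered:
    by_row[row].append(col)
print(dict(by_row))  # {0: [0, 2], 1: [1, 3]}
```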

How to find the maximum per group in an rdd?

I'm using PySpark and I have an RDD that looks like this:
[
    ("MovieX", [(1, 100), (2, 20), (3, 50)]),
    ("MovieY", [(1, 100), (2, 250), (3, 100), (4, 120)]),
    ("MovieZ", [(1, 1000), (2, 250)]),
    ("MovieX", [(4, 50), (5, 10), (6, 0)]),
    ("MovieY", [(3, 0), (4, 260)]),
    ("MovieZ", [(5, 180)]),
]
The first element in the tuple represents the week number and the second element represents the number of viewers. I want to find the week with the most views for each movie, but ignoring the first week.
I've tried some things but nothing worked, for example:
stats.reduceByKey(max).collect()
returns:
[('MovieX', [(4, 50), (5, 10), (6, 0)]),
('MovieY', [(3, 0), (4, 260)]),
('MovieZ', [(5, 180)])]
so the entire second set.
Also this:
stats.groupByKey().reduce(max)
which returns just this:
('MovieZ', <pyspark.resultiterable.ResultIterable at 0x558f75eeb0>)
How can I solve this?
If you want the most views per movie, ignoring the first week, i.e. [('MovieA', 50), ('MovieC', 250), ('MovieB', 260)], then you'll want your own map function rather than a reduce.
movie_stats = spark.sparkContext.parallelize([
    ("MovieA", [(1, 100), (2, 20), (3, "50")]),
    ("MovieC", [(1, 100), (2, "250"), (3, 100), (4, "120")]),
    ("MovieB", [(1, 1000), (2, 250)]),
    ("MovieA", [(4, 50), (5, "10"), (6, 0)]),
    ("MovieB", [(3, 0), (4, "260")]),
    ("MovieC", [(5, "180")]),
])
def get_views_after_first_week(v):
    values = iter(v)  # iterator over the lists of tuples, grouped by key
    result = list()
    for x in values:
        result.extend([int(y[1]) for y in x if y[0] > 1])
    return result
mapped = movie_stats.groupByKey().mapValues(get_views_after_first_week).mapValues(max)
mapped.collect()
To include the week number, i.e. [('MovieA', (3, 50)), ('MovieC', (2, 250)), ('MovieB', (4, 260))]:
def get_max_weekly_views_after_first_week(v):
    values = iter(v)  # iterator over the lists of tuples, grouped by key
    max_views = float('-inf')
    max_week = None
    for x in values:
        for t in x:
            week, views = t
            views = int(views)
            if week > 1 and views > max_views:
                max_week = week
                max_views = views
    return (max_week, max_views)
mapped = movie_stats.groupByKey().mapValues(get_max_weekly_views_after_first_week)
Some code is needed to convert the strings into ints, and to apply a map function that 1) filters out the week 1 data and 2) gets the week with the most views.
def helper(arr: list):
    max_week = None
    for sub_arr in arr:
        for item in sub_arr:
            if item[0] == 1:
                continue
            count = int(item[1])
            if max_week is None or max_week[1] < count:
                max_week = [item[0], count]
    return max_week
movie_stats.groupByKey().map(lambda x: (x[0], helper(x[1]))).collect()
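To check the grouping and filtering logic without a Spark session, the same steps can be sketched in plain Python. This is only an approximation of what groupByKey followed by a map does; the name best_week_after_first and the records list are assumptions for illustration.

```python
from collections import defaultdict

# Same records as movie_stats above, as a plain list instead of an RDD.
records = [
    ("MovieA", [(1, 100), (2, 20), (3, "50")]),
    ("MovieC", [(1, 100), (2, "250"), (3, 100), (4, "120")]),
    ("MovieB", [(1, 1000), (2, 250)]),
    ("MovieA", [(4, 50), (5, "10"), (6, 0)]),
    ("MovieB", [(3, 0), (4, "260")]),
    ("MovieC", [(5, "180")]),
]


def best_week_after_first(records):
    """Map each movie to its (week, views) with the most views, skipping week 1."""
    grouped = defaultdict(list)
    for movie, weeks in records:
        # Mirrors groupByKey: merge all week lists that share a movie name,
        # converting string view counts to int and dropping week 1.
        grouped[movie].extend((week, int(views)) for week, views in weeks if week > 1)
    # max with a key keeps the first tuple among equal view counts;
    # movies with nothing after week 1 are left out.
    return {movie: max(weeks, key=lambda t: t[1])
            for movie, weeks in grouped.items() if weeks}


print(best_week_after_first(records))
# {'MovieA': (3, 50), 'MovieC': (2, 250), 'MovieB': (4, 260)}
```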

print most k frequent numbers of list with rank ties

I am trying to find a way to print the k most frequent numbers in a text file. I was able to sort those numbers into a list of tuples, each holding a number and its number of appearances in the file.
l = [(0, 7), (3, 4), (-101, 3), (2, 3), (-3, 1), (-2, 1), (-1, 1), (101, 1)] # 0 is the number itself, 7 means it appeared in the file 7 times, etc.
So now I want to print the k most frequent numbers of the file (this should be done RECURSIVELY), but I am struggling with rank ties. For example, if k=3 I want to print:
[(0, 7), (3, 4), (-101, 3), (2, 3)] # top 3 frequencies
I tried doing:
def head(l): return l[0]
def tail(l): return l[1:]
def topk(l,k,e):
    if(len(l)<=1 or k==0):
        return [head(l)[1]]
    elif(head(l)[1]!=e):
        return [head(l)[1]] + topk(tail(l),k-1,head(l)[1])
    else:
        return [head(l)[1]] + topk(tail(l),k,head(l)[1])
l1 = [(0, 7), (3, 4), (-101, 3), (2, 3), (-3, 1), (-2, 1), (-1, 1), (101, 1)]
l2 = [(3.3, 4), (-3.3, 3), (-2.2, 2), (1.1, 1)]
print(topk(l1,3,''))
print(topk(l2,3,''))
l1 prints correctly, but l2 has an extra frequency for some reason.
You can use the built-in sorted function with the key parameter to get the lowest frequency among the top k, and then use a list comprehension to get all the elements whose frequency is >= that minimum value:
v = sorted(l, key=lambda x: x[1])[-3][1]
[e for e in l if e[1] >= v]
output:
[(0, 7), (3, 4), (-101, 3), (2, 3)]
If you want a recursive version you can use:
def my_f(l, v, top=None, i=0):
    if top is None:
        top = []
    if l[i][1] >= v:
        top.append(l[i])
    if i == len(l) - 1:
        return top
    return my_f(l, v, top, i+1)
def topk(l, k):
    k = min(len(l), k)
    # use k here rather than a hard-coded 3
    v = sorted(l, key=lambda x: x[1])[-k][1]
    return my_f(l, v)
topk(l, 3)
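If k is meant to count distinct frequencies (ranks) rather than elements, a purely recursive variant could look like the sketch below. It assumes the list is already sorted by descending frequency, as l is; the name topk_ranks and the prev parameter are made up for illustration.

```python
def topk_ranks(pairs, k, prev=None):
    """Return the pairs belonging to the k highest distinct frequencies."""
    if not pairs:
        return []
    head, tail = pairs[0], pairs[1:]
    if head[1] != prev:
        # A new frequency starts a new rank; stop once k ranks are used up.
        if k == 0:
            return []
        return [head] + topk_ranks(tail, k - 1, head[1])
    # Same frequency as the previous pair: a tie, so the rank budget is unchanged.
    return [head] + topk_ranks(tail, k, prev)


l1 = [(0, 7), (3, 4), (-101, 3), (2, 3), (-3, 1), (-2, 1), (-1, 1), (101, 1)]
l2 = [(3.3, 4), (-3.3, 3), (-2.2, 2), (1.1, 1)]
print(topk_ranks(l1, 3))  # [(0, 7), (3, 4), (-101, 3), (2, 3)]
print(topk_ranks(l2, 3))  # [(3.3, 4), (-3.3, 3), (-2.2, 2)]
```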

Generating list of position (1, (2, 3), 4, (5, 6)) how?

Hello, I was wondering how to create a list of integers and tuples to use as positions for string slicing.
Same example as in the title; I just need a list in this format:
(1, (2, 3), 4, (5, 6))
# my current attempt:
str_input, decrypted_str = "tester", ""
num_lists = [[x] + [(x + 1, x + 2)] for x in range(0, len(str_input), 4)]
for clist in num_lists:
    for position in clist:
        if isinstance(position, int):
            decrypted_str += str_input[position]
        else:
            decrypted_str += str_input[position[0]:position[1]+1]
print(decrypted_str)
This results in "teser", but the output should be "tester".
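The line before the last one should be
decrypted_str += str_input[position[0]:position[1]+1]
because a slice excludes its end index: 1:5 means 1, 2, 3, 4 but not 5. With that +1 in place the code still prints "teser", because range(0, len(str_input), 4) starts the groups at 0 and 4 and so skips index 3; each group covers three characters, so the step should be 3.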
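A minimal sketch of generating the positions directly, assuming each group is one single index followed by a pair covering the next two characters, so that a new group starts every 3 characters. The positions are 0-based here, whereas the title writes them 1-based; the helper names slice_positions and rebuild are made up for illustration.

```python
def slice_positions(length):
    """Positions in the pattern index, (start, end), index, (start, end), ..."""
    # One single index plus a pair for the next two characters per group,
    # so the range advances in steps of 3.
    return [p for x in range(0, length, 3) for p in (x, (x + 1, x + 2))]


def rebuild(text):
    """Reassemble text by walking the generated positions."""
    result = ""
    for position in slice_positions(len(text)):
        if isinstance(position, int):
            result += text[position]
        else:
            # The slice end is exclusive, hence the + 1.
            result += text[position[0]:position[1] + 1]
    return result


print(slice_positions(6))  # [0, (1, 2), 3, (4, 5)]
print(rebuild("tester"))   # tester
```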

Order of For Loops with Python List Comprehension

In this answer, it is claimed that
The best way to remember this is that the order of for loop inside the list comprehension is based on the order in which they appear in traditional loop approach. Outer most loop comes first, and then the inner loops subsequently.
However, this answer, and my own experiment below, seem to show the opposite, i.e. the inner loop coming first.
In my example, I want j to represent the row number and i to represent the column number. I want 5 rows and 4 columns. What am I missing, please?
board = [[(j, i) for i in range(4)] for j in range(5)]
# I believe the above comprehension is equivalent to the nested for loops below
# board = []
# for j in range(5):
#     new_row = []
#     for i in range(4):
#         new_row.append((j,i))
#     board.append(new_row)
for j in range(5):
    for i in range(4):
        print(board[j][i], end="")
    print()
This is the correct way to get the desired output:
board = [(j, i) for i in range(4) for j in range(5)]
Output:
[(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (0, 1), (1, 1), (2, 1), (3, 1), (4, 1), (0, 2), (1, 2), (2, 2), (3, 2), (4, 2), (0, 3), (1, 3), (2, 3), (3, 3), (4, 3)]
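The apparent contradiction may come from mixing two different forms. In a nested comprehension the outer comprehension is written last, while in a single flat comprehension the for clauses read left to right like nested for statements. A small sketch comparing both (the variable names nested, flat and loops are only for illustration):

```python
# Nested comprehension: the outer comprehension (for j) builds the rows,
# the inner one (for i) builds the cells of each row.
nested = [[(j, i) for i in range(4)] for j in range(5)]

# Single flat comprehension: the for clauses read left to right like
# nested for statements, so "for j" is the outer loop here.
flat = [(j, i) for j in range(5) for i in range(4)]

# The same flat list written with traditional loops.
loops = []
for j in range(5):
    for i in range(4):
        loops.append((j, i))

print(len(nested), len(nested[0]))  # 5 4
print(flat == loops)                # True
print(flat == [cell for row in nested for cell in row])  # True
```

So the quoted rule applies to the flat form, and the board in the question already has 5 rows of 4 columns.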

Lower elements in list given a certain position

I must lower letters in a list if they occupy a certain position given by a previous function I wrote. The function I must program is lower_words.
I'm having an issue: every time I lower an element, the row is repeated.
I don't need to use the list "words" for this. I just left it there so you could better understand what the function must do. Can someone help me?
words = ["PATO", "GATO", "BOI", "CAO"]
grid1 = ["PIGATOS",
         "ANRBKFD",
         "TMCAOXA",
         "OOBBYQU",
         "MACOUIV",
         "EEJMIWL"]
# These are the positions the words occupy, determined with a previous function.
# The first value is the line, the second the column.
positions_words_occupy = ((0, 0), (1, 0), (2, 0), (3, 0), (0, 2), (0, 3), (0, 4), (0, 5), (3, 2), (4, 3), (5, 4), (2, 2), (2, 3), (2, 4))
def lower_words(grid, positions_words_occupy):
    new = []
    for position in positions_words_occupy:
        line = position[0]
        column = position[1]
        row = grid[line]
        element = row[column]
        new.append(row.replace(element, element.lower()))
    return new
Expected output:
['pIgatoS', 'aNRBKFD', 'tMcaoXA', 'oObBYQU', 'MACoUIV', 'EEJMiWL']
Actual output:
['pIGATOS', 'aNRBKFD', 'tMCAOXA', 'ooBBYQU', 'PIgATOS', 'PIGaTOS', 'PIGAtOS', 'PIGAToS', 'OObbYQU', 'MACoUIV', 'EEJMiWL', 'TMcAOXA', 'TMCaOXa', 'TMCAoXA']
Changing the perspective, you can see it lowers the words I have in the list words:
['pIgatoS',
'aNRBKFD',
'tMcaoXA',
'oObBYQU',
'MACoUIV',
'EEJMiWL']
You are very close! You're actually appending to your new list new every time you replace a letter. That is why you are getting so many values in your list. Note also that row.replace lowers every occurrence of the letter in the row, not just the one at the given column.
One way to fix your code is to create a copy of the grid, and then replace the row in that copy every time you lower a letter. Here is a new function implementing these small changes:
def lower_words(grid, positions_words_occupy):
    new = grid.copy()  # copy the argument, not the global grid1
    for position in positions_words_occupy:
        line = position[0]
        column = position[1]
        row = new[line]
        element = row[column]
        # lower only the character at this column
        new_word = row[:column] + element.lower() + row[column+1:]
        new[line] = new_word
    return new
Output running lower_words(grid1, positions_words_occupy):
['pIgatoS', 'aNRBKFD', 'tMcaoXA', 'oObBYQU', 'MACoUIV', 'EEJMiWL']
I would first collect your grid positions in a collections.defaultdict of sets, then rebuild the strings with lowercase letters wherever their positions exist in these sets.
Demo:
from collections import defaultdict
grid1 = ["PIGATOS", "ANRBKFD", "TMCAOXA", "OOBBYQU", "MACOUIV", "EEJMIWL"]
positions_words_occupy = (
    (0, 0),
    (1, 0),
    (2, 0),
    (3, 0),
    (0, 2),
    (0, 3),
    (0, 4),
    (0, 5),
    (3, 2),
    (4, 3),
    (5, 4),
    (2, 2),
    (2, 3),
    (2, 4),
)
d = defaultdict(set)
for grid, pos in positions_words_occupy:
    d[grid].add(pos)
result = []
for grid, pos in d.items():
    result.append(
        "".join(x.lower() if i in pos else x for i, x in enumerate(grid1[grid]))
    )
print(result)
Output:
['pIgatoS', 'aNRBKFD', 'tMcaoXA', 'oObBYQU', 'MACoUIV', 'EEJMiWL']
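The demo above only emits rows that have at least one position, in the order those rows first appear. If every row should be kept in grid order, one possible variation is to walk the grid itself and test each (line, column) pair against a set. This is a sketch; the name lower_positions is made up for illustration.

```python
def lower_positions(grid, positions):
    """Return a copy of grid with the letters at the (line, column) positions lowered."""
    wanted = set(positions)
    return [
        "".join(ch.lower() if (line, col) in wanted else ch
                for col, ch in enumerate(row))
        for line, row in enumerate(grid)
    ]


grid1 = ["PIGATOS", "ANRBKFD", "TMCAOXA", "OOBBYQU", "MACOUIV", "EEJMIWL"]
positions_words_occupy = (
    (0, 0), (1, 0), (2, 0), (3, 0), (0, 2), (0, 3), (0, 4),
    (0, 5), (3, 2), (4, 3), (5, 4), (2, 2), (2, 3), (2, 4),
)
print(lower_positions(grid1, positions_words_occupy))
# ['pIgatoS', 'aNRBKFD', 'tMcaoXA', 'oObBYQU', 'MACoUIV', 'EEJMiWL']
```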
