I need help extracting data from a list in a specific pattern in Python.
For example:
We have a list with 20 different values.
lst = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','r','s','t','w']
mod = 5
roundMod= 3
DESIRED OUTPUT
Round 1 :
1 - a,
2 - b,
3 - c,
4 - d,
5 - e,
Round 2 :
1 - a,
2 - b,
3 - c,
4 - d,
5 - e,
Round 3 :
1 - a,
2 - b,
3 - c,
4 - d,
5 - e,
Round 1:
6 - f,
7 - g,
8 - h,
9 - i,
10 - j,
Round 2 :
6 - f,
7 - g,
8 - h,
9 - i,
10 - j,
Round 3 :
6 - f,
7 - g,
8 - h,
9 - i,
10 - j,
I have mod, which limits each round to at most 5 values, and roundMod, which is the number of rounds to repeat before moving on to the next 5 elements.
IIUC, you want to slice the List with stepwise starting/ending points. Use an integer division (//) for this:
List = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','r','s','t','w']
mod = 5
roundMod= 3
for i in range(6): # not sure how the number of "lines" is defined
    d = i//roundMod
    print(f'{i=}, {d=},', List[d*mod:(d+1)*mod])
output:
i=0, d=0, ['a', 'b', 'c', 'd', 'e']
i=1, d=0, ['a', 'b', 'c', 'd', 'e']
i=2, d=0, ['a', 'b', 'c', 'd', 'e']
i=3, d=1, ['f', 'g', 'h', 'i', 'j']
i=4, d=1, ['f', 'g', 'h', 'i', 'j']
i=5, d=1, ['f', 'g', 'h', 'i', 'j']
If you also want to track the round, use divmod:
List = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','r','s','t','w']
mod = 5
roundMod= 3
for i in range(6):
    d,r = divmod(i, roundMod)
    print(f'Round {r+1}: ', List[d*mod:(d+1)*mod])
output:
Round 1: ['a', 'b', 'c', 'd', 'e']
Round 2: ['a', 'b', 'c', 'd', 'e']
Round 3: ['a', 'b', 'c', 'd', 'e']
Round 1: ['f', 'g', 'h', 'i', 'j']
Round 2: ['f', 'g', 'h', 'i', 'j']
Round 3: ['f', 'g', 'h', 'i', 'j']
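The loops above hard-code range(6), and the comment notes that the number of iterations is unclear. A minimal sketch, assuming every chunk of mod items should be shown roundMod times until the list is used up (the helper name round_slices is made up for illustration):

```python
import math

def round_slices(items, mod=5, round_mod=3):
    """Return (round_number, chunk) pairs covering the whole list."""
    # Assumption: each chunk of `mod` items is repeated `round_mod` times.
    n_chunks = math.ceil(len(items) / mod)
    result = []
    for i in range(n_chunks * round_mod):
        d, r = divmod(i, round_mod)  # d: chunk index, r: round within the chunk
        result.append((r + 1, items[d * mod:(d + 1) * mod]))
    return result

List = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','r','s','t','w']
for rnd, chunk in round_slices(List):
    print(f'Round {rnd}: ', chunk)
```

With 20 items, mod = 5 and roundMod = 3 this gives 4 chunks times 3 rounds, so 12 lines in total.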
This seems like a job for a generator:
def pull(lst, mod = 5, round_mod = 3):
    counter = 0
    while True:
        # index of the current chunk; it advances once every round_mod iterations
        start = counter // round_mod
        if start * mod >= len(lst):
            break
        yield lst[start * mod:(start + 1)*mod]
        counter += 1
puller = pull(lst)
print([x for x in puller])
OUTPUT
[['a', 'b', 'c', 'd', 'e'], ['a', 'b', 'c', 'd', 'e'], ['a', 'b', 'c', 'd', 'e'], ['f', 'g', 'h', 'i', 'j'], ['f', 'g', 'h', 'i', 'j'], ['f', 'g', 'h', 'i', 'j'], ['k', 'l', 'm', 'n', 'o'], ['k', 'l', 'm', 'n', 'o'], ['k', 'l', 'm', 'n', 'o'], ['p', 'r', 's', 't', 'w'], ['p', 'r', 's', 't', 'w'], ['p', 'r', 's', 't', 'w']]
or, to get closer to your desired output (a fresh generator is needed here, because the one above is already exhausted):
for n, x in enumerate(pull(lst)):
    print(f'Round {n + 1}: {", ".join([f"{i + 1} - {v}" for i, v in enumerate(x)])}')
OUTPUT
Round 1: 1 - a, 2 - b, 3 - c, 4 - d, 5 - e
Round 2: 1 - a, 2 - b, 3 - c, 4 - d, 5 - e
Round 3: 1 - a, 2 - b, 3 - c, 4 - d, 5 - e
Round 4: 1 - f, 2 - g, 3 - h, 4 - i, 5 - j
Round 5: 1 - f, 2 - g, 3 - h, 4 - i, 5 - j
Round 6: 1 - f, 2 - g, 3 - h, 4 - i, 5 - j
Round 7: 1 - k, 2 - l, 3 - m, 4 - n, 5 - o
Round 8: 1 - k, 2 - l, 3 - m, 4 - n, 5 - o
Round 9: 1 - k, 2 - l, 3 - m, 4 - n, 5 - o
Round 10: 1 - p, 2 - r, 3 - s, 4 - t, 5 - w
Round 11: 1 - p, 2 - r, 3 - s, 4 - t, 5 - w
Round 12: 1 - p, 2 - r, 3 - s, 4 - t, 5 - w
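The numbering above differs from the question's desired output, where the round counter restarts for each chunk and the item numbers keep counting through the list. A sketch under that reading of the question (format_rounds is a hypothetical helper name, not part of any answer):

```python
def format_rounds(items, mod=5, round_mod=3):
    """Build the output lines: each chunk of `mod` items is repeated `round_mod` times."""
    lines = []
    for start in range(0, len(items), mod):
        chunk = items[start:start + mod]
        for rnd in range(1, round_mod + 1):
            lines.append(f"Round {rnd} :")
            # Item numbers are 1-based positions in the full list.
            for position, value in enumerate(chunk, start=start + 1):
                lines.append(f"{position} - {value},")
    return lines

lst = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','r','s','t','w']
print("\n".join(format_rounds(lst)))
```

Returning the lines instead of printing them directly keeps the formatting easy to check or reuse.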
I'm trying to obtain the combinations of each element in a list within a list. Given this case:
my_list
[['A', 'B'], ['C', 'D', 'E'], ['F', 'G', 'H', 'I']]
The output would be:
   0  1
0  A  B
1  C  D
2  C  E
3  D  E
4  F  G
5  F  H
6  F  I
7  G  H
8  G  I
9  H  I
Or it could also be a new list instead of a DataFrame:
my_new_list
[['A','B'], ['C','D'], ['C','E'],['D','E'], ['F','G'],['F','H'],['F','I'],['G','H'],['G','I'],['H','I']]
This should do it. You have to flatten the result of combinations.
from itertools import combinations
x = [['A', 'B'], ['C', 'D', 'E'], ['F', 'G', 'H', 'I']]
y = [list(combinations(xx, 2)) for xx in x]
z = [list(item) for subl in y for item in subl]
z
[['A', 'B'],
['C', 'D'],
['C', 'E'],
['D', 'E'],
['F', 'G'],
['F', 'H'],
['F', 'I'],
['G', 'H'],
['G', 'I'],
['H', 'I']]
Create combination by itertools.combinations with flatten values in list comprehension:
from itertools import combinations
L = [['A', 'B'], ['C', 'D', 'E'], ['F', 'G', 'H', 'I']]
data = [list(j) for i in L for j in combinations(i, 2)]
print (data)
[['A', 'B'], ['C', 'D'], ['C', 'E'],
['D', 'E'], ['F', 'G'], ['F', 'H'],
['F', 'I'], ['G', 'H'], ['G', 'I'],
['H', 'I']]
And then pass it to the DataFrame constructor:
import pandas as pd
df = pd.DataFrame(data)
print (df)
   0  1
0  A  B
1  C  D
2  C  E
3  D  E
4  F  G
5  F  H
6  F  I
7  G  H
8  G  I
9  H  I
A plain nested-loop version without itertools:
def get_pair( arrs ):
    result = []
    for arr in arrs:
        # pair every element with each element that comes after it
        for i in range(0, len(arr) - 1 ):
            for j in range( i + 1, len(arr) ):
                result.append( [arr[i], arr[j]] )
    return result
arrs = [['A', 'B'], ['C', 'D', 'E'], ['F', 'G', 'H', 'I']]
print( get_pair(arrs) )
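As a quick sanity check, the hand-written loops and the itertools.combinations approach should produce the same pairs in the same order. A small sketch that compares them (pairs_with_itertools is a name chosen here for illustration):

```python
from itertools import combinations

def get_pair(arrs):
    """Nested-loop version: pair each element with every later element."""
    result = []
    for arr in arrs:
        for i in range(len(arr) - 1):
            for j in range(i + 1, len(arr)):
                result.append([arr[i], arr[j]])
    return result

def pairs_with_itertools(arrs):
    """Same pairs, built from itertools.combinations and flattened."""
    return [list(pair) for arr in arrs for pair in combinations(arr, 2)]

arrs = [['A', 'B'], ['C', 'D', 'E'], ['F', 'G', 'H', 'I']]
print(get_pair(arrs) == pairs_with_itertools(arrs))  # True
```

Sub-lists with fewer than two elements contribute no pairs in either version.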
My dataframe looks like this.
import pandas as pd
df = pd.DataFrame({
    'ID': [1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3],
    'text': ['a', 'a', 'b', 'b', 'c', 'c', 'c', 'd', 'd', 'e', 'e', 'e', 'f', 'g'] ,
    'out_text': ['x1', 'x2', 'x3', 'x4', 'x5', 'x6', 'x7', 'x8', 'x9', 'x10', 'x11', 'x12', 'x13', 'x14'] ,
    'Rule_1': ['N', 'N', 'N', 'Y', 'N', 'N', 'N', 'N', 'N', 'N','N', 'N', 'Y', 'Y'],
    'Rule_2': ['Y', 'N', 'N', 'N', 'Y', 'N', 'N', 'N', 'N', 'N','N', 'N', 'Y', 'N'],
    'Rule_3': ['N', 'N', 'N', 'N', 'N', 'N', 'N', 'N', 'N', 'N','N', 'N', 'Y', 'Y']})
    ID  text  out_text  Rule_1  Rule_2  Rule_3
0    1     a        x1       N       Y       N
1    1     a        x2       N       N       N
2    1     b        x3       N       N       N
3    1     b        x4       Y       N       N
4    2     c        x5       N       Y       N
5    2     c        x6       N       N       N
6    2     c        x7       N       N       N
7    2     d        x8       N       N       N
8    2     d        x9       N       N       N
9    2     e       x10       N       N       N
10   2     e       x11       N       N       N
11   2     e       x12       N       N       N
12   3     f       x13       Y       Y       Y
13   3     g       x14       Y       N       Y
I have to aggregate Rule_1, Rule_2 and Rule_3 such that if a combination of ID and text has a 'Y' in any of these columns, the overall result for that combination is 'Y'. In our example, 1-a and 1-b are 'Y' overall, while 2-d and 2-e are 'N'. How do I aggregate multiple columns?
Let's try using max(axis=1) to aggregate the rules within each row, then groupby().any() to check whether any row in a group has a Y:
(df[['Rule_1','Rule_2','Rule_3']].eq('Y')
    .max(axis=1)
    .groupby([df['ID'],df['text']])
    .any()
)
Output:
ID  text
1   a        True
    b        True
2   c        True
    d       False
    e       False
3   f        True
    g        True
dtype: bool
Or if you want Y/N, we can use max in both steps and drop the comparison. This works because strings compare alphabetically, so 'Y' is greater than 'N':
(df[['Rule_1','Rule_2','Rule_3']]
    .max(axis=1)
    .groupby([df['ID'],df['text']])
    .max()
)
Output:
ID  text
1   a       Y
    b       Y
2   c       Y
    d       N
    e       N
3   f       Y
    g       Y
dtype: object
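If the result is needed as a regular DataFrame with a Y/N column rather than a Series, one possible sketch follows. It assumes pandas is installed; the column name overall is chosen here for illustration, and only the columns used by the aggregation are rebuilt:

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3],
    'text': ['a', 'a', 'b', 'b', 'c', 'c', 'c', 'd', 'd', 'e', 'e', 'e', 'f', 'g'],
    'Rule_1': ['N', 'N', 'N', 'Y', 'N', 'N', 'N', 'N', 'N', 'N', 'N', 'N', 'Y', 'Y'],
    'Rule_2': ['Y', 'N', 'N', 'N', 'Y', 'N', 'N', 'N', 'N', 'N', 'N', 'N', 'Y', 'N'],
    'Rule_3': ['N', 'N', 'N', 'N', 'N', 'N', 'N', 'N', 'N', 'N', 'N', 'N', 'Y', 'Y'],
})

rule_cols = ['Rule_1', 'Rule_2', 'Rule_3']
overall = (
    df[rule_cols].eq('Y')
    .any(axis=1)                        # True if any rule is 'Y' in that row
    .groupby([df['ID'], df['text']])
    .any()                              # True if any row of the group qualifies
    .map({True: 'Y', False: 'N'})       # back to the Y/N convention
    .rename('overall')                  # hypothetical column name
    .reset_index()
)
print(overall)
```

Comparing against 'Y' explicitly avoids relying on the alphabetical ordering of the two letters.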
I have 3 lists as follows:
L1 = ['H', 'H', 'T', 'T', 'T', 'H', 'H', 'H', 'H', 'T']
L2 = ['H', 'H', 'T', 'T', 'T', 'H', 'H', 'H', 'H', 'T' , 'T', 'H', 'T', 'T', 'T', 'H', 'H', 'H', 'T']
L3 = ['H', 'T', 'H', 'H']
I would like to count sequential occurrences of 'H' in each list and produce the following table showing the frequencies of these 'H' sequences:
Length | L1 | L2 | L3
----------------------
1 0 1 1
2 1 1 1
3 0 1 0
4 1 1 0
5 0 0 0
I know that the following gives me the lengths of the 'H' sequences in a list:
from itertools import groupby
[len(list(g[1])) for g in groupby(L1) if g[0]=='H']
[2, 4]
But I need an elegant way to extend this to the remaining lists while ensuring that a 0 is reported for unobserved lengths.
You can use collections.Counter to build a frequency dict from a generator expression that yields the lengths of the runs produced by itertools.groupby. Then iterate over the range of possible lengths and look each one up in the dict; a Counter returns 0 for missing keys, so unobserved lengths default to 0.
Using L1 as an example:
from itertools import groupby
from collections import Counter
counts = Counter(sum(1 for _ in g) for k, g in groupby(L1) if k == 'H')
print([counts[length] for length in range(1, 6)])
This outputs:
[0, 1, 0, 1, 0]
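To extend this to all three lists, the same Counter can be built per list and read out row by row. A minimal sketch, assuming lengths 1 to 5 as in the question's table (run_length_counts is a helper name introduced here):

```python
from collections import Counter
from itertools import groupby

def run_length_counts(seq, symbol='H'):
    """Count how often each run length of `symbol` occurs in `seq`."""
    return Counter(sum(1 for _ in g) for k, g in groupby(seq) if k == symbol)

L1 = ['H', 'H', 'T', 'T', 'T', 'H', 'H', 'H', 'H', 'T']
L2 = ['H', 'H', 'T', 'T', 'T', 'H', 'H', 'H', 'H', 'T', 'T', 'H', 'T', 'T', 'T', 'H', 'H', 'H', 'T']
L3 = ['H', 'T', 'H', 'H']

lists = {'L1': L1, 'L2': L2, 'L3': L3}
counts = {name: run_length_counts(seq) for name, seq in lists.items()}

# One row per length; Counter returns 0 for lengths that never occur.
table = {length: [counts[name][length] for name in lists] for length in range(1, 6)}

print('Length | ' + ' | '.join(lists))
for length, row in table.items():
    print(length, *row)
```

The upper bound of 5 is taken from the question; it could instead be derived from the longest run observed.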
You can use itertools.groupby with collections.Counter:
import itertools as it, collections as _col
def scores(l):
    return _col.Counter([len(list(b)) for a, b in it.groupby(l, key=lambda x:x == 'H') if a])
L1 = ['H', 'H', 'T', 'T', 'T', 'H', 'H', 'H', 'H', 'T']
L2 = ['H', 'H', 'T', 'T', 'T', 'H', 'H', 'H', 'H', 'T' , 'T', 'H', 'T', 'T', 'T', 'H', 'H', 'H', 'T']
L3 = ['H', 'T', 'H', 'H']
d = {'L1':scores(L1), 'L2':scores(L2), 'L3':scores(L3)}
r = '\n'.join([f'Length | {" | ".join(d.keys())} ', '-'*20]+[f'{i} {" ".join(str(b.get(i, 0)) for b in d.values())}' for i in range(1, 6)])
print(r)
Output:
Length | L1 | L2 | L3
--------------------
1 0 1 1
2 1 1 1
3 0 1 0
4 1 1 0
5 0 0 0
This might work :
from itertools import groupby
a = [len(list(v)) if k=='H' and v else 0 for k,v in groupby(''.join(L1))]
For a sample L4 = ['T', 'T'] where there is no 'H' item in list, it returns [0].
For L1 it returns [2, 0, 4, 0].
For L2 it returns [2, 0, 4, 0, 1, 0, 3, 0].
For L3 it returns [1, 0, 2].
If you only need the length of the longest run of 'H', try max([len(x) for x in ''.join(y).split('T')]) where y is your list.
def windows(iterable,n,m=1):
    x = iter(iterable)
    l = []
    y = next(x)
    for i in range(n):
        l.append(y)
        y = next(x)
    yield l
    while x:
        for i in range(m):
            l.pop(0)
        for i in range(m):
            l.append(y)
            y = next(x)
        yield l
I need to write a windows generator that takes an iterable and two ints (call them n and m, with m defaulting to 1) as parameters. It produces lists of n values: the first list contains the first n values; every subsequent list drops the first m values from the previous list and adds the next m values from the iterable, until there are fewer than n values to put in the returned list.
For instance:
for i in windows('abcdefghijk', 4,2):
    print(i,end='')
prints ['a','b','c','d'] ['c','d','e','f'] ['e','f','g','h'] ['g','h','i','j'].
When I call the above function, my code prints
[['i', 'j', 'k'], ['i', 'j', 'k'], ['i', 'j', 'k'], ['i', 'j', 'k']]
I cannot figure out the problem. Can someone help me fix it? Thanks in advance.
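You should use slicing to grab n items and have a start value that increases by m. Slicing also creates a new list each time, whereas your code yields the same list object l over and over, so every collected result shows its final state.
def windows(iterable, n, m = 1):
    if m < 1: # otherwise infinite loop
        raise ValueError("Parameter 'm' must be at least 1")
    lst = list(iterable)
    i = 0
    # <= so that a window ending exactly at the last element is included
    while i + n <= len(lst):
        yield lst[i:i + n]
        i += m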
# Output
>>> for i in windows('abcdefghijk', 4, 2):
...     print(i)
...
['a', 'b', 'c', 'd']
['c', 'd', 'e', 'f']
['e', 'f', 'g', 'h']
['g', 'h', 'i', 'j']
Maybe something like this, assuming you aren't working with a lazy iterable (it needs len() and slicing).
def windows(iterable, n, m=1):
    length = len(iterable)
    i = 0
    # <= so that a window ending exactly at the last element is included
    while i + n <= length:
        yield list(iterable[i:i + n])
        i += m
for win in windows('abcdefghijk', 4, 2):
    print(win)
output
['a', 'b', 'c', 'd']
['c', 'd', 'e', 'f']
['e', 'f', 'g', 'h']
['g', 'h', 'i', 'j']
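The slicing approaches above assume the input supports len() and slicing. For a lazy iterable such as a generator, one possible sketch keeps only the current window in a collections.deque and yields a copy each time (windows_lazy is a name chosen here, not from the thread):

```python
from collections import deque
from itertools import islice

def windows_lazy(iterable, n, m=1):
    """Yield lists of n items, advancing m items each time, from any iterable."""
    if n < 1 or m < 1:
        raise ValueError("n and m must be at least 1")
    it = iter(iterable)
    # maxlen=n makes the deque drop old items automatically when extended.
    window = deque(islice(it, n), maxlen=n)
    while len(window) == n:
        yield list(window)          # copy, so earlier results are not mutated
        new_items = list(islice(it, m))
        if len(new_items) < m:      # not enough items left for another window
            return
        window.extend(new_items)

for win in windows_lazy(iter('abcdefghijk'), 4, 2):
    print(win)
```

Because each yielded list is a fresh copy, collecting the results with list() does not show the repeated final window seen in the question.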
This appears to work for the few cases I tried; the generator itself is a one-liner:
def WindGen(astr, n, m = 1):
    if m !=0:
        return (list(astr[i * m : i * m + n]) for i in range((len(astr) - n) // m + 1))
astr = 'abcdefghijk'
n, m = 4, 2
print(*WindGen(astr, n, m), sep='\n')
['a', 'b', 'c', 'd']
['c', 'd', 'e', 'f']
['e', 'f', 'g', 'h']
['g', 'h', 'i', 'j']
The following will also work and will not drop the trailing elements when they do not fill a whole window; the last window is padded with None:
def sliding_window(iterable, window_size, step_size=1):
    i = 0
    end_flag = False
    while not end_flag:
        curr_split = list(iterable[i:i + window_size])
        if not curr_split:
            # stepped past the end; avoid yielding a window made only of None
            break
        if len(curr_split) < window_size:
            end_flag = True
            curr_split.extend([None] * (window_size - len(curr_split)))
        yield curr_split
        i += step_size
iterable = 'abcdefghijk'
window_size = 4
step_size = 2
res = list(sliding_window(iterable, window_size, step_size))
print(res)
Output:
[['a', 'b', 'c', 'd'],
['c', 'd', 'e', 'f'],
['e', 'f', 'g', 'h'],
['g', 'h', 'i', 'j'],
['i', 'j', 'k', None]]
I need help with a small piece of logic in Python.
I have several lists of different sizes. For each list I need to generate 5 different sets (lists) and each set formed must have 5 elements. Each element of the set must be generated randomly.
It is important to note that a set {a, b, c} is equal to the set {b, a, c}.
Example: L1 = {a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s}
Set 1 = {a, c, e, h, i, s}
Set 2 = {a, b, c, d, e, f}
Set 3 = {m, n, o, p, q, r}
Set 4 = {e, n, k, a, i, s}
Set 5 = {h, p, a, i, g, k}
Could anyone help me with the Python logic? Thank you!
Combine random.sample with itertools.combinations:
>>> import itertools
>>> import random
>>> [set(i) for i in random.sample(list(itertools.combinations(L1, 5)), 5)]
[{'c', 'h', 'i', 'l', 'q'},
{'a', 'd', 'o', 'p', 'r'},
{'b', 'h', 'j', 'k', 'l'},
{'b', 'd', 'g', 'n', 's'},
{'d', 'f', 'i', 'l', 's'}]
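Or, if generating all the combinations is too slow, you could keep taking samples until you have 5 different ones:
res = []
population = list(L1)  # random.sample needs a sequence; sets are rejected since Python 3.11
while len(res) < 5:
    samp = set(random.sample(population, 5))
    if samp not in res:
        res.append(samp)
res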
[{'a', 'b', 'd', 'g', 'q'},
{'e', 'g', 'i', 'j', 'q'},
{'h', 'i', 'j', 'k', 'n'},
{'b', 'e', 'j', 'n', 's'},
{'e', 'f', 'g', 'l', 'r'}]
You can use a hit-and-miss approach if you want to generate 5 distinct sets:
import random
L1 = {c for c in 'abcdefghijklmnopqrstuvwxyz'}
sets = set()
items = list(L1)
while len(sets) < 5:
    s = frozenset(random.sample(items,5))
    sets.add(s)
sets = [set(s) for s in sets]
for s in sets: print(s)
Typical output:
{'w', 'b', 'p', 'j', 'r'}
{'b', 'p', 'v', 'l', 'o'}
{'q', 'c', 'i', 'z', 't'}
{'j', 'f', 'h', 'l', 'y'}
{'b', 'h', 'd', 'r', 'c'}
For problems of the given size, this should be adequate. 26 choose 5 = 65780, so collisions should be rare. As the size of L1 gets closer to 5, it becomes harder to get 5 distinct sets, and with exactly 5 elements only one set exists, so the loop would never finish.
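To guard against a loop that cannot terminate, the number of possible subsets can be checked up front with math.comb (Python 3.8+). A hedged sketch of the hit-and-miss idea with that check; the function name and its parameters are illustrative, not from the answers:

```python
import math
import random

def distinct_random_subsets(population, k=5, count=5):
    """Return `count` distinct random k-element sets drawn from `population`."""
    items = sorted(set(population))  # random.sample needs a sequence
    # math.comb returns 0 when k > len(items), so this also covers short inputs.
    if math.comb(len(items), k) < count:
        raise ValueError("not enough distinct subsets exist for this request")
    seen = set()
    while len(seen) < count:
        # frozenset is hashable, so duplicates are discarded by the outer set
        seen.add(frozenset(random.sample(items, k)))
    return [set(s) for s in seen]

for s in distinct_random_subsets('abcdefghijklmnopqrs'):
    print(s)
```

The output differs on every run, so no sample output is shown here.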