Iterate over three lists with different lengths simultaneously - python
I tried to iterate over three lists simultaneously using zip and itertools.cycle in Python 3, but the output was not what I wanted. Suppose that I have
list_a = [0,1,2,3,4,5,6,7,8,9,10,11]
list_b = [0,1,2,3,4,5,6,7,8,9,10,11]
list_c = [0,1,2,3,4,5,6,7,8,9,10,11,
12,13,14,15,16,17,18,19,20,21,22,23,
24,25,26,27,28,29,30,31,32,33,34,35,
36,37,38,39,40,41,42,43,44,45,46,47,
48,49,50,51,52,53,54,55,56,57,58,59,
60,61,62,63,64,65,66,67,68,69,70,71,
72,73,74,75,76,77,78,79,80,81,82,83,
84,85,86,87,88,89,90,91,92,93,94,95,
96,97,98,99,100,101,102,103,104,105,106,107,
108,109,110,111,112,113,114,115,116,117,118,119,
120,121,122,123,124,125,126,127,128,129,130,131,
132,133,134,135,136,137,138,139,140,141,142,143]
I have tried this:
from itertools import cycle

for val_a in list_a:
    for val_b, val_c in zip(cycle(list_b), list_c):
        print(val_a, val_b, val_c)
my output is:
0 0 0
0 1 1
0 2 2
0 3 3
0 4 4
0 5 5
0 6 6
0 7 7
0 8 8
0 9 9
0 10 10
0 11 11
0 0 12
0 1 13
0 2 14
0 3 15
0 4 16
0 5 17
0 6 18
0 7 19
0 8 20
0 9 21
0 10 22
0 11 23
0 0 24
0 1 25
0 2 26
0 3 27
0 4 28
0 5 29
0 6 30
0 7 31
0 8 32
0 9 33
0 10 34
0 11 35
. . .
. . .
. . .
. . .
. . .
and so on...
I expect the output:
0 0 0
0 1 1
0 2 2
0 3 3
0 4 4
0 5 5
0 6 6
0 7 7
0 8 8
0 9 9
0 10 10
0 11 11
1 0 12
1 1 13
1 2 14
1 3 15
1 4 16
1 5 17
1 6 18
1 7 19
1 8 20
1 9 21
1 10 22
1 11 23
2 0 24
2 1 25
2 2 26
2 3 27
2 4 28
2 5 29
2 6 30
2 7 31
2 8 32
2 9 33
2 10 34
2 11 35
. . .
. . .
. . .
. . .
. . .
11 9 141
11 10 142
11 11 143
I have also tried it without itertools.cycle, using itertools.zip_longest, and changing the order in which the lists are iterated. What should I do?
It appears you don't want to cycle through any list at all. Instead, you want to go through every element of list_b for each element of list_a, while list_c keeps advancing.
Turn list_c into an iterator so it keeps its position across the inner loops, and use a nested for loop:
iter_c = iter(list_c)
for val_a in list_a:
    for val_b, val_c in zip(list_b, iter_c):
        print(val_a, val_b, val_c)
Output:
0 0 0
0 1 1
0 2 2
0 3 3
0 4 4
0 5 5
0 6 6
0 7 7
0 8 8
0 9 9
0 10 10
0 11 11
1 0 12
1 1 13
1 2 14
1 3 15
1 4 16
1 5 17
1 6 18
1 7 19
1 8 20
1 9 21
1 10 22
1 11 23
2 0 24
2 1 25
2 2 26
2 3 27
2 4 28
2 5 29
2 6 30
2 7 31
2 8 32
2 9 33
2 10 34
2 11 35
. . .
. . .
. . .
. . .
. . .
11 9 141
11 10 142
11 11 143
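If you'd rather avoid the shared iterator entirely, the same pairing can be derived arithmetically: since list_c here has exactly len(list_a) * len(list_b) elements, divmod on the index of each element of list_c recovers the matching positions in list_a and list_b. A minimal sketch (not from the answer above, and assuming the lengths line up as in the example data):

```python
# Sketch: index arithmetic instead of a shared iterator.
# Assumes len(list_c) == len(list_a) * len(list_b), as in the example data.
list_a = list(range(12))
list_b = list(range(12))
list_c = list(range(144))

triples = []
for i, val_c in enumerate(list_c):
    val_a, val_b = divmod(i, len(list_b))  # which a-row, which b-position
    triples.append((list_a[val_a], list_b[val_b], val_c))

print(triples[0], triples[12], triples[-1])  # (0, 0, 0) (1, 0, 12) (11, 11, 143)
```

This trades the iterator for a length assumption, so the iterator version is more general when list_c is shorter or longer than the product of the other two lengths.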
Related
How to calculate an accumulated value conditionally?
This question is based on this thread. I have the following dataframe:

diff_hours  stage  sensor
         0      0      20
         0      0      21
         0      0      21
         1      0      22
         5      0      21
         0      0      22
         0      1      20
         7      1      23
         0      1      24
         0      3      25
         0      3      28
         6      0      21
         0      0      22

I need to calculate an accumulated value of diff_hours while stage is growing. When stage drops back to 0, the accumulated value acc_hours should restart at 0, even though diff_hours might not be equal to 0. The proposed solution is this one:

blocks = df['stage'].diff().lt(0).cumsum()
df['acc_hours'] = df['diff_hours'].groupby(blocks).cumsum()

Output:

    diff_hours  stage  sensor  acc_hours
0            0      0      20          0
1            0      0      21          0
2            0      0      21          0
3            1      0      22          1
4            5      0      21          6
5            0      0      22          6
6            0      1      20          6
7            7      1      23         13
8            0      1      24         13
9            0      3      25         13
10           0      3      28         13
11           6      0      21          6
12           0      0      22          6

On row 11 the value of acc_hours is 6. I need it to restart at 0, because stage dropped from 3 back to 0 in row 11. The expected output:

    diff_hours  stage  sensor  acc_hours
0            0      0      20          0
1            0      0      21          0
2            0      0      21          0
3            1      0      22          1
4            5      0      21          6
5            0      0      22          6
6            0      1      20          6
7            7      1      23         13
8            0      1      24         13
9            0      3      25         13
10           0      3      28         13
11           6      0      21          0
12           0      0      22          0

How can I implement this logic?
The expected output is unclear; what about a simple mask?

Masking only the value during the change:

m = df['stage'].diff().lt(0)
df['acc_hours'] = (df.groupby(m.cumsum())
                     ['diff_hours'].cumsum()
                     .mask(m, 0)
                   )

Output:

    diff_hours  stage  sensor  acc_hours
0            0      0      20          0
1            0      0      21          0
2            0      0      21          0
3            1      0      22          1
4            5      0      21          6
5            0      0      22          6
6            0      1      20          6
7            7      1      23         13
8            0      1      24         13
9            0      3      25         13
10           0      3      28         13
11           6      0      21          0
12           0      0      22          6
13           3      0      22          9
14           0      0      22          9

Or ignoring the value completely by masking before the groupby:

m = df['stage'].diff().lt(0)
df['acc_hours'] = (df['diff_hours'].mask(m, 0)
                     .groupby(m.cumsum())
                     .cumsum()
                   )

Output:

    diff_hours  stage  sensor  acc_hours
0            0      0      20          0
1            0      0      21          0
2            0      0      21          0
3            1      0      22          1
4            5      0      21          6
5            0      0      22          6
6            0      1      20          6
7            7      1      23         13
8            0      1      24         13
9            0      3      25         13
10           0      3      28         13
11           6      0      21          0
12           0      0      22          0
13           3      0      22          3
14           0      0      22          3
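The mask-before-groupby variant can be checked end to end. Below is a self-contained sketch that rebuilds the question's 13-row frame (the column values come from the question; the rest is mine) and applies that approach:

```python
import pandas as pd

df = pd.DataFrame({
    'diff_hours': [0, 0, 0, 1, 5, 0, 0, 7, 0, 0, 0, 6, 0],
    'stage':      [0, 0, 0, 0, 0, 0, 1, 1, 1, 3, 3, 0, 0],
    'sensor':     [20, 21, 21, 22, 21, 22, 20, 23, 24, 25, 28, 21, 22],
})

m = df['stage'].diff().lt(0)                    # True on rows where stage drops
df['acc_hours'] = (df['diff_hours'].mask(m, 0)  # zero out the drop row itself
                     .groupby(m.cumsum())       # a new block after each drop
                     .cumsum())
print(df['acc_hours'].tolist())
# [0, 0, 0, 1, 6, 6, 6, 13, 13, 13, 13, 0, 0]
```

This reproduces the expected output in the question, including the restart to 0 at row 11.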
pandas: Create a new column by comparing rows of one column of a DataFrame
Assume I have this df:

df = pd.DataFrame({'data': [0,0,0,1,1,1,2,2,2,3,3,4,4,5,5,0,0,0,0,2,2,2,2,4,4,4,4]})

    data
0      0
1      0
2      0
3      1
4      1
5      1
6      2
7      2
8      2
9      3
10     3
11     4
12     4
13     5
14     5
15     0
16     0
17     0
18     0
19     2
20     2
21     2
22     2
23     4
24     4
25     4
26     4

I'm looking for a way to create a new column in df that counts how many times the current value has repeated consecutively so far. For example:

    data  new
0      0    1
1      0    2
2      0    3
3      1    1
4      1    2
5      1    3
6      2    1
7      2    2
8      2    3
9      3    1
10     3    2
11     4    1
12     4    2
13     5    1
14     5    2
15     0    1
16     0    2
17     0    3
18     0    4
19     2    1
20     2    2
21     2    3
22     2    4
23     4    1
24     4    2
25     4    3
26     4    4

My logic was to pull the rows into a Python list, compare them, and build a new list. Is there a simple way to do this?
Example:

df = pd.DataFrame({'data': [0,0,0,1,1,1,2,2,2,3,3,4,4,5,5,0,0,0,0,2,2,2,2,4,4,4,4]})

Code:

grouper = df['data'].ne(df['data'].shift(1)).cumsum()
df['new'] = df.groupby(grouper).cumcount().add(1)
df

    data  new
0      0    1
1      0    2
2      0    3
3      1    1
4      1    2
5      1    3
6      2    1
7      2    2
8      2    3
9      3    1
10     3    2
11     4    1
12     4    2
13     5    1
14     5    2
15     0    1
16     0    2
17     0    3
18     0    4
19     2    1
20     2    2
21     2    3
22     2    4
23     4    1
24     4    2
25     4    3
26     4    4
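A condensed, runnable version of the same grouper idea on a shorter series (the eight-element list is mine, chosen only to keep the output readable): the run boundary falls wherever a value differs from the previous row, and cumcount numbers the rows within each run.

```python
import pandas as pd

df = pd.DataFrame({'data': [0, 0, 0, 1, 1, 2, 2, 2]})

# A new run starts wherever the value differs from the row above.
grouper = df['data'].ne(df['data'].shift(1)).cumsum()
df['new'] = df.groupby(grouper).cumcount().add(1)
print(df['new'].tolist())  # [1, 2, 3, 1, 2, 1, 2, 3]
```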
Cumulative Sum based on a Trigger
I am trying to track cumulative sums of the 'Value' column; a new sum should begin every time a 1 appears in the 'Signal' column. So in the table below I need to obtain three cumulative sums, starting at index values 3, 6, and 9, and each ending at index value 11:

Index  Value  Signal
    0      3       0
    1      8       0
    2      8       0
    3      7       1
    4      9       0
    5     10       0
    6     14       1
    7     10       0
    8     10       0
    9      4       1
   10     10       0
   11     10       0

What would be a way to do it? Expected output:

Index  Value  Signal  Cumsum_1  Cumsum_2  Cumsum_3
    0      3       0         0         0         0
    1      8       0         0         0         0
    2      8       0         0         0         0
    3      7       1         7         0         0
    4      9       0        16         0         0
    5     10       0        26         0         0
    6     14       1        40        14         0
    7     10       0        50        24         0
    8     10       0        60        34         0
    9      4       1        64        38         4
   10     10       0        74        48        14
   11     10       0        84        58        24
You can pivot, bfill, then cumsum:

df.merge(df.assign(id=df['Signal'].cumsum().add(1))
           .pivot(index='Index', columns='id', values='Value')
           .bfill(axis=1).fillna(0, downcast='infer')
           .cumsum()
           .add_prefix('cumsum'),
         left_on='Index', right_index=True
         )

output:

    Index  Value  Signal  cumsum1  cumsum2  cumsum3  cumsum4
0       0      3       0        3        0        0        0
1       1      8       0       11        0        0        0
2       2      8       0       19        0        0        0
3       3      7       1       26        7        0        0
4       4      9       0       35       16        0        0
5       5     10       0       45       26        0        0
6       6     14       1       59       40       14        0
7       7     10       0       69       50       24        0
8       8     10       0       79       60       34        0
9       9      4       1       83       64       38        4
10     10     10       0       93       74       48       14
11     11     10       0      103       84       58       24

older answer

IIUC, you can use groupby.cumsum:

df['cumsum'] = df.groupby(df['Signal'].cumsum())['Value'].cumsum()

output:

    Index  Value  Signal  cumsum
0       0      3       0       3
1       1      8       0      11
2       2      8       0      19
3       3      7       1       7
4       4      9       0      16
5       5     10       0      26
6       6     14       1      14
7       7     10       0      24
8       8     10       0      34
9       9      4       1       4
10     10     10       0      14
11     11     10       0      24
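The single-column variant from the older answer is easy to verify against the question's data. A self-contained sketch (the frame below is rebuilt from the question's table; the 'Index' column is left implicit as the default index):

```python
import pandas as pd

df = pd.DataFrame({'Value':  [3, 8, 8, 7, 9, 10, 14, 10, 10, 4, 10, 10],
                   'Signal': [0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0]})

# Each 1 in Signal opens a new group, so the running sum restarts there.
df['cumsum'] = df.groupby(df['Signal'].cumsum())['Value'].cumsum()
print(df['cumsum'].tolist())  # [3, 11, 19, 7, 16, 26, 14, 24, 34, 4, 14, 24]
```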
How can I create a unique id based on the value in another column?
I want to assign a unique id based on the value in column A. For example, I have a table like this:

df = pd.DataFrame({'A': [0,0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,0,1,1,1,0,0,0,0,1,1,1]})

Eventually I would like my output table to look like this:

    A  id
1   0   1
2   0   1
3   0   1
4   0   1
5   0   1
6   0   1
7   1   2
8   1   2
9   1   2
10  1   2
11  1   2
12  1   2
13  0   3
14  0   3
15  0   3
16  0   3
17  0   3
18  0   3
19  1   4
20  1   4
21  1   4
22  0   5
23  0   5
24  0   5
25  0   5
26  1   6
27  1   6
28  1   6

I tried data.groupby(['a'], sort=False).ngroup() + 1 but it's not working as I want. Any help and guidance will be appreciated, thanks!
diff + cumsum:

df['id'] = df.A.diff().ne(0).cumsum()
df

    A  id
0   0   1
1   0   1
2   0   1
3   0   1
4   0   1
5   0   1
6   0   1
7   1   2
8   1   2
9   1   2
10  1   2
11  1   2
12  1   2
13  0   3
14  0   3
15  0   3
16  0   3
17  0   3
18  0   3
19  1   4
20  1   4
21  1   4
22  0   5
23  0   5
24  0   5
25  0   5
26  1   6
27  1   6
28  1   6
import pdrle

df["id"] = pdrle.get_id(df["A"]) + 1
df
#     A  id
# 0   0   1
# 1   0   1
# 2   0   1
# 3   0   1
# 4   0   1
# 5   0   1
# 6   0   1
# 7   1   2
# 8   1   2
# 9   1   2
# 10  1   2
# 11  1   2
# 12  1   2
# 13  0   3
# 14  0   3
# 15  0   3
# 16  0   3
# 17  0   3
# 18  0   3
# 19  1   4
# 20  1   4
# 21  1   4
# 22  0   5
# 23  0   5
# 24  0   5
# 25  0   5
# 26  1   6
# 27  1   6
# 28  1   6
cumcount() None
I would like to have a new column (not_ordered_in_STREET_x_before_my_car) that counts the None values in my DataFrame up to the row I am in, grouped by x and sorted by x and y.

import numpy as np
import pandas as pd

x_start = 1
y_start = 1
size_city = 10
cars = pd.DataFrame({'x': np.repeat(np.arange(x_start, x_start + size_city), size_city),
                     'y': np.tile(np.arange(y_start, y_start + size_city), size_city),
                     'pizza_ordered': np.repeat([None, None, 1, 6, 3, 7, 5, None, 8, 9, 0, None, None, None, 4, None, 11, 12, 14, 15], 5)})

The first four columns are what I have; the fifth is the one I want.

    x   y pizza_ordered  not_ordered_in_STREET_x_before_my_car
0   1   1          None                                      0
1   1   2          None                                      1
2   1   3             1                                      2
3   1   4             2                                      2
4   1   5             1                                      2
5   1   6             1                                      2
6   1   7             1                                      2
7   1   8          None                                      2
8   1   9             1                                      3
9   1  10             4                                      3
10  2   1             1                                      0
11  2   2          None                                      0
12  2   3          None                                      1
13  2   4          None                                      2
14  2   5             4                                      3
15  2   6          None                                      3
16  2   7             5                                      4
17  2   8             3                                      4
18  2   9             1                                      4
19  2  10             1                                      4

This is what I have tried, but it does not work:

cars = cars.sort_values(['x', 'y'])
cars['not_ordered_in_STREET_x_before_my_car'] = cars.where(cars['pizza_ordered'].isnull()).groupby(['x']).cumcount().add(1)
You can try:

cars["not_ordered_in_STREET_x_before_my_car"] = cars.groupby("x")["pizza_ordered"].transform(
    lambda x: x.isna().cumsum().shift(1).fillna(0).astype(int)
)
print(cars)

Prints:

     x   y pizza_ordered  not_ordered_in_STREET_x_before_my_car
0    1   1          None                                      0
1    1   2          None                                      1
2    1   3          None                                      2
3    1   4          None                                      3
4    1   5          None                                      4
5    1   6          None                                      5
6    1   7          None                                      6
7    1   8          None                                      7
8    1   9          None                                      8
9    1  10          None                                      9
10   2   1             1                                      0
11   2   2             1                                      0
12   2   3             1                                      0
13   2   4             1                                      0
14   2   5             1                                      0
15   2   6             6                                      0
16   2   7             6                                      0
17   2   8             6                                      0
18   2   9             6                                      0
19   2  10             6                                      0
20   3   1             3                                      0
21   3   2             3                                      0
22   3   3             3                                      0
23   3   4             3                                      0
24   3   5             3                                      0
25   3   6             7                                      0
26   3   7             7                                      0
27   3   8             7                                      0
28   3   9             7                                      0
29   3  10             7                                      0
30   4   1             5                                      0
31   4   2             5                                      0
32   4   3             5                                      0
33   4   4             5                                      0
34   4   5             5                                      0
35   4   6          None                                      0
36   4   7          None                                      1
37   4   8          None                                      2
38   4   9          None                                      3
39   4  10          None                                      4
40   5   1             8                                      0
41   5   2             8                                      0
42   5   3             8                                      0
43   5   4             8                                      0
44   5   5             8                                      0
45   5   6             9                                      0
46   5   7             9                                      0
47   5   8             9                                      0
48   5   9             9                                      0
49   5  10             9                                      0
50   6   1             0                                      0
51   6   2             0                                      0
52   6   3             0                                      0
53   6   4             0                                      0
54   6   5             0                                      0
55   6   6          None                                      0
56   6   7          None                                      1
57   6   8          None                                      2
58   6   9          None                                      3
59   6  10          None                                      4
60   7   1          None                                      0
61   7   2          None                                      1
62   7   3          None                                      2
63   7   4          None                                      3
64   7   5          None                                      4
65   7   6          None                                      5
66   7   7          None                                      6
67   7   8          None                                      7
68   7   9          None                                      8
69   7  10          None                                      9
70   8   1             4                                      0
71   8   2             4                                      0
72   8   3             4                                      0
73   8   4             4                                      0
74   8   5             4                                      0
75   8   6          None                                      0
76   8   7          None                                      1
77   8   8          None                                      2
78   8   9          None                                      3
79   8  10          None                                      4
80   9   1            11                                      0
81   9   2            11                                      0
82   9   3            11                                      0
83   9   4            11                                      0
84   9   5            11                                      0
85   9   6            12                                      0
86   9   7            12                                      0
87   9   8            12                                      0
88   9   9            12                                      0
89   9  10            12                                      0
90  10   1            14                                      0
91  10   2            14                                      0
92  10   3            14                                      0
93  10   4            14                                      0
94  10   5            14                                      0
95  10   6            15                                      0
96  10   7            15                                      0
97  10   8            15                                      0
98  10   9            15                                      0
99  10  10            15                                      0
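The key detail in that transform is shift(1): isna().cumsum() counts Nones up to and including the current row, and the shift turns it into a strictly-before count within each street. A small sketch on a toy frame (the six-row frame is mine, not the question's data):

```python
import pandas as pd

cars = pd.DataFrame({
    'x': [1, 1, 1, 2, 2, 2],
    'y': [1, 2, 3, 1, 2, 3],
    'pizza_ordered': [None, 1, None, None, None, 2],
})

# Per street x: Nones seen strictly before the current row.
cars['before'] = cars.groupby('x')['pizza_ordered'].transform(
    lambda s: s.isna().cumsum().shift(1).fillna(0).astype(int))
print(cars['before'].tolist())  # [0, 1, 1, 0, 1, 2]
```

Dropping the shift would count each None row as seeing itself, giving [1, 1, 2, 1, 2, 2] instead.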
cars['not_ordered_in_STREET_x_before_my_car'] = pd.isnull(cars['pizza_ordered']).cumsum()