How to average over next n values within groups - python

With the following data:
import pandas as pd
ex = {'id': {0: 12,
1: 7745,
2: 14190,
3: 12,
4: 7745,
5: 14190,
6: 12,
7: 7745,
8: 14190,
9: 12,
10: 7745,
11: 14190,
12: 12,
13: 7745,
14: 14190,
15: 12,
16: 7745,
17: 14190,
18: 12,
19: 7745,
20: 14190,
21: 12,
22: 7745,
23: 14190,
24: 12,
25: 7745,
26: 14190,
27: 12,
28: 7745,
29: 14190,
30: 12,
31: 7745,
32: 14190,
33: 12,
34: 7745,
35: 14190,
36: 12,
37: 7745,
38: 14190,
39: 12,
40: 7745,
41: 14190,
42: 12,
43: 7745,
44: 14190,
45: 12,
46: 7745,
47: 14190,
48: 12,
49: 7745,
50: 14190,
51: 12,
52: 7745,
53: 14190,
54: 12,
55: 7745,
56: 14190,
57: 12,
58: 7745,
59: 14190},
'id2': {0: 0,
1: 0,
2: 0,
3: 1,
4: 1,
5: 1,
6: 2,
7: 2,
8: 2,
9: 3,
10: 3,
11: 3,
12: 4,
13: 4,
14: 4,
15: 5,
16: 5,
17: 5,
18: 6,
19: 6,
20: 6,
21: 7,
22: 7,
23: 7,
24: 8,
25: 8,
26: 8,
27: 9,
28: 9,
29: 9,
30: 10,
31: 10,
32: 10,
33: 11,
34: 11,
35: 11,
36: 12,
37: 12,
38: 12,
39: 13,
40: 13,
41: 13,
42: 14,
43: 14,
44: 14,
45: 15,
46: 15,
47: 15,
48: 16,
49: 16,
50: 16,
51: 17,
52: 17,
53: 17,
54: 18,
55: 18,
56: 18,
57: 19,
58: 19,
59: 19},
'var1': {0: 60.57423361566744,
1: 58.044840216178606,
2: 51.29251700680272,
3: 60.674455993946225,
4: 58.21241610641044,
5: 51.31371599732972,
6: 60.77849708396439,
7: 58.369465051911966,
8: 51.33611104900928,
9: 60.88625886689413,
10: 58.516561288952005,
11: 51.35969457224551,
12: 60.99764332390786,
13: 58.65427905379941,
14: 51.38445897744256,
15: 61.112552436177864,
16: 58.78319258272294,
17: 51.4103966750045,
18: 61.230888184876434,
19: 58.90387611199144,
20: 51.43750007533549,
21: 61.35255255117588,
22: 59.01690387787371,
23: 51.465761588839634,
24: 61.4774475162485,
25: 59.122850116638496,
26: 51.49517362592107,
27: 61.60547506126665,
28: 59.222289064554694,
29: 51.52572859698392,
30: 61.736537167402595,
31: 59.31579495789107,
32: 51.55741891243228,
33: 61.870535815828646,
34: 59.40394203291643,
35: 51.5902369826703,
36: 62.00737298771711,
37: 59.48730452589962,
38: 51.624175218102074,
39: 62.14695066424032,
40: 59.56645667310938,
41: 51.659226029131744,
42: 62.289170826570604,
43: 59.64197271081458,
44: 51.69538182616348,
45: 62.43393545588018,
46: 59.714426875284005,
47: 51.732635019601275,
48: 62.58114653334144,
49: 59.784393402786435,
50: 51.770978019849345,
51: 62.73070604012664,
52: 59.85244652959075,
53: 51.81040323731179,
54: 62.88251595740815,
55: 59.919160491965705,
56: 51.85090308239276,
57: 63.03647826635822,
58: 59.98510952618012,
59: 51.892469965496346},
'var2': {0: 26.46961208868258,
1: 25.02784060286349,
2: 67.01680672268907,
3: 26.362852053047188,
4: 25.16250452630659,
5: 67.20428262498875,
6: 26.257170717779545,
7: 25.25801378937902,
8: 67.37902432665504,
9: 26.15255739707393,
10: 25.315898046471766,
11: 67.5412758313266,
12: 26.04900140512476,
13: 25.33768695197584,
14: 67.69128114264197,
15: 25.946492056126274,
16: 25.32491016028206,
17: 67.82928426423972,
18: 25.84501866427287,
19: 25.27909732578149,
20: 67.95552919975847,
21: 25.74457054375889,
22: 25.201778102865052,
23: 68.07025995283685,
24: 25.64513700877862,
25: 25.094482145923664,
26: 68.17372052711335,
27: 25.546707373526395,
28: 24.958739109348315,
29: 68.26615492622662,
30: 25.449270952196603,
31: 24.796078647529914,
32: 68.34780715381525,
33: 25.35281705898356,
34: 24.608030414859442,
35: 68.41892121351782,
36: 25.257335008081554,
37: 24.396124065727854,
38: 68.47974110897286,
39: 25.162814113684988,
40: 24.16188925452609,
41: 68.53051084381906,
42: 25.069243689988213,
43: 23.906855635645105,
44: 68.57147442169496,
45: 24.976613051185442,
46: 23.63255286347585,
47: 68.60287584623913,
48: 24.88491151147112,
49: 23.340510592409263,
50: 68.62495912109016,
51: 24.79412838503955,
52: 23.03225847683625,
53: 68.63796824988664,
54: 24.704252986085066,
55: 22.70932617114788,
56: 68.64214723626722,
57: 24.615274628802,
58: 22.373243329735022,
59: 68.6377400838704}}
ex = pd.DataFrame(ex).set_index(['id', 'id2'])
I'd like to calculate, for each value of id, the average of the next n values of var1, where "next" is defined by id2. I know that pd.Series.expanding exists and I could do something like ex.groupby('id')['var1'].transform(lambda x: x.expanding().mean()), but this would involve all 20 elements of each id, whereas I want to limit the average to the next n elements (let's say n = 5). How can this be done?

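This should do the trick. Sorting the index in descending order turns the trailing rolling window into a forward-looking one, and a window of 6 with min_periods=1 averages the current row together with the next 5 (fewer near the end of each group):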
print(ex.sort_index(ascending=False).groupby("id")["var1"].rolling(6, min_periods=1).mean().reset_index(0, drop=True))
Output:
id id2
12 19 63.036478
18 62.959497
17 62.883233
16 62.807712
15 62.732956
14 62.658992
13 62.510738
12 62.364880
11 62.221519
10 62.080750
9 61.942674
8 61.807387
7 61.674987
6 61.545573
5 61.419242
4 61.296093
3 61.176224
2 61.059732
1 60.946716
0 60.837274
7745 19 59.985110
18 59.952135
17 59.918906
16 59.885277
15 59.851107
14 59.816252
13 59.746476
12 59.674500
11 59.599749
10 59.521650
9 59.439627
8 59.353106
7 59.261514
6 59.164276
5 59.060818
4 58.950565
3 58.832944
2 58.707380
1 58.573298
0 58.430126
14190 19 51.892470
18 51.871687
17 51.851259
16 51.831189
15 51.811478
14 51.792129
13 51.753255
12 51.715467
11 51.678772
10 51.643179
9 51.608695
8 51.575327
7 51.543082
6 51.511970
5 51.481997
4 51.453170
3 51.425498
2 51.398987
1 51.373646
0 51.349482
Name: var1, dtype: float64
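As a rough sketch of the same idea on toy data, the forward-looking mean can also be computed per group and aligned back to the original row order. The helper name forward_mean and the toy frame are hypothetical, and the window is assumed to include the current row, as in the rolling(6) call above.

```python
import pandas as pd

def forward_mean(df, group_col, order_col, value_col, window):
    """Mean of the current row and the following window-1 rows, per group."""
    ordered = df.sort_values([group_col, order_col])
    return ordered.groupby(group_col)[value_col].transform(
        # Reverse each group, take a trailing rolling mean, then reverse back.
        lambda s: s[::-1].rolling(window, min_periods=1).mean()[::-1]
    )

# Toy data (not the data from the question).
toy = pd.DataFrame({
    "id":   [1, 1, 1, 2, 2],
    "id2":  [0, 1, 2, 0, 1],
    "var1": [1.0, 2.0, 3.0, 10.0, 20.0],
})
toy["fwd_mean"] = forward_mean(toy, "id", "id2", "var1", window=2)
print(toy)
```

With the data from the question, forward_mean(ex.reset_index(), 'id', 'id2', 'var1', window=6) would be the equivalent call under these assumptions.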

Related

How to set up a while loop based on values in two dataframes (warehouse slotting) in Python?

Warehouse slotting question. I have a 'sorted_location' df, sequenced in the order in which locations should be assigned to products based on their demand:
sorted_location df:
Label Slot_Type Slot_Type_Sequence Cumulative Pallets Sequence to Slot
3-10-36-2 6 Deep Bulk 1 11 0
3-10-35-2 6 Deep Bulk 1 22 1
3-10-34-2 6 Deep Bulk 1 33 2
3-10-33-2 6 Deep Bulk 1 44 3
3-10-32-2 6 Deep Bulk 1 55 4
... ... ... ... ... ...
2-9-5-2 1 Deep Prime 5 2579 760
2-9-4-2 1 Deep Prime 5 2580 761
2-9-3-2 1 Deep Prime 5 2581 762
2-9-2-2 1 Deep Prime 5 2582 763
2-9-1-2 1 Deep Prime 5 2583 764
demand df:
SKU Description Pallets
0 85,072 TITO'S HANDMADE VODKA 1.75LTR 33
1 2,922 VEUVE CLICQUOT BRUT NV 22
2 66,867 DON JULIO 1942 ANEJO TEQ 22
I need to assign locations by 'Slot_Type_Sequence' first (5 different slot types) and then, within each slot type, by location ('Label', 764 locations in total). How do I set up a while loop so that Tito's Vodka gets 3 locations assigned (its demand is 33 pallets, and the first 3 locations together hold 33 pallets)?
Any help appreciated!
Desired output:
Label SKU Description
3-10-36-2 85,072 TITO'S HANDMADE VODKA 1.75LTR
3-10-35-2 85,072 TITO'S HANDMADE VODKA 1.75LTR
3-10-34-2 85,072 TITO'S HANDMADE VODKA 1.75LTR
3-10-33-2 2,922 VEUVE CLICQUOT BRUT NV
3-10-32-2 2,922 VEUVE CLICQUOT BRUT NV
Edited to add raw data
sorted_location df (Top 35 locations)
{'Label': {0: '3-10-36-2',
1: '3-10-35-2',
2: '3-10-34-2',
3: '3-10-33-2',
4: '3-10-32-2',
5: '3-10-31-2',
6: '3-10-30-2',
7: '3-10-29-2',
8: '3-10-28-2',
9: '3-10-27-2',
10: '3-10-25-2',
11: '3-10-24-2',
12: '3-10-23-2',
13: '3-10-22-2',
14: '3-10-21-2',
15: '3-10-20-2',
16: '3-10-19-2',
17: '3-10-18-2',
18: '3-10-17-2',
19: '3-10-16-2',
20: '3-10-14-2',
21: '3-10-13-2',
22: '3-10-12-2',
23: '3-10-11-2',
24: '3-10-10-2',
25: '3-10-9-2',
26: '3-10-8-2',
27: '3-10-6-2',
28: '3-10-5-2',
29: '3-10-4-2',
30: '3-10-3-2',
31: '3-10-2-2',
32: '3-10-1-2',
33: '2-4-2-2',
34: '2-4-3-2'},
'Slot_Type': {0: '6 Deep Bulk',
1: '6 Deep Bulk',
2: '6 Deep Bulk',
3: '6 Deep Bulk',
4: '6 Deep Bulk',
5: '6 Deep Bulk',
6: '6 Deep Bulk',
7: '6 Deep Bulk',
8: '6 Deep Bulk',
9: '6 Deep Bulk',
10: '6 Deep Bulk',
11: '6 Deep Bulk',
12: '6 Deep Bulk',
13: '6 Deep Bulk',
14: '6 Deep Bulk',
15: '6 Deep Bulk',
16: '6 Deep Bulk',
17: '6 Deep Bulk',
18: '6 Deep Bulk',
19: '6 Deep Bulk',
20: '6 Deep Bulk',
21: '6 Deep Bulk',
22: '6 Deep Bulk',
23: '6 Deep Bulk',
24: '6 Deep Bulk',
25: '6 Deep Bulk',
26: '6 Deep Bulk',
27: '6 Deep Bulk',
28: '6 Deep Bulk',
29: '6 Deep Bulk',
30: '6 Deep Bulk',
31: '6 Deep Bulk',
32: '6 Deep Bulk',
33: '4 Deep Prime',
34: '4 Deep Prime'},
'Slot_Type_Sequence': {0: 1,
1: 1,
2: 1,
3: 1,
4: 1,
5: 1,
6: 1,
7: 1,
8: 1,
9: 1,
10: 1,
11: 1,
12: 1,
13: 1,
14: 1,
15: 1,
16: 1,
17: 1,
18: 1,
19: 1,
20: 1,
21: 1,
22: 1,
23: 1,
24: 1,
25: 1,
26: 1,
27: 1,
28: 1,
29: 1,
30: 1,
31: 1,
32: 1,
33: 2,
34: 2},
'Cumulative Pallets': {0: 11,
1: 22,
2: 33,
3: 44,
4: 55,
5: 66,
6: 77,
7: 88,
8: 99,
9: 110,
10: 121,
11: 132,
12: 143,
13: 154,
14: 165,
15: 176,
16: 187,
17: 198,
18: 209,
19: 220,
20: 231,
21: 242,
22: 253,
23: 264,
24: 275,
25: 286,
26: 297,
27: 308,
28: 319,
29: 330,
30: 341,
31: 352,
32: 363,
33: 370,
34: 377},
'Sequence to Slot': {0: 0,
1: 1,
2: 2,
3: 3,
4: 4,
5: 5,
6: 6,
7: 7,
8: 8,
9: 9,
10: 10,
11: 11,
12: 12,
13: 13,
14: 14,
15: 15,
16: 16,
17: 17,
18: 18,
19: 19,
20: 20,
21: 21,
22: 22,
23: 23,
24: 24,
25: 25,
26: 26,
27: 27,
28: 28,
29: 29,
30: 30,
31: 31,
32: 32,
33: 33,
34: 34}}
demand df (Top 40 SKUs)
{'SKU': {0: '85,072',
1: '2,922',
2: '66,867',
3: '100,835',
4: '114,732',
5: '139,672',
6: '120,922',
7: '22,024',
8: '95,340',
9: '144,916',
10: '6,780',
11: '7,154',
12: '90,607',
13: '1,242',
14: '95,527',
15: '1,394',
16: '94,079',
17: '1,497',
18: '95,684',
19: '9,721',
20: '80,587',
21: '1,644',
22: '106,160',
23: '139,228',
24: '77,264',
25: '111,205',
26: '120,920',
27: '86,813',
28: '75,899',
29: '2,064',
30: '3,594',
31: '114,451',
32: '120,756',
33: '1,493',
34: '50,995',
35: '2,023',
36: '95,347',
37: '131,255',
38: '111,125',
39: '2,580'},
'Description': {0: "TITO'S HANDMADE VODKA 1.75LTR",
1: 'VEUVE CLICQUOT BRUT NV',
2: 'DON JULIO 1942 ANEJO TEQ',
3: 'COPPOLA CHARD DIRECTORS',
4: 'GRUET CUVEE 89 SPARKLING BRUT',
5: 'WHIP SHOTS VOD INF VAN CRM 200',
6: 'GRUET CUVEE 89 SPARKLING ROSE',
7: 'KETEL ONE VODKA 1.75',
8: 'LA MARCA PROSECCO',
9: 'DUE WEST CABERNET SAUVIGNON',
10: 'JOHNNIE WALKER BLUE 750',
11: 'MAKERS MARK BOURBON 1.75',
12: 'UNRULY RED',
13: 'BAILEYS IRISH CREAM',
14: 'UNRULY CABERNET',
15: 'PATRON SILVER TEQ',
16: 'DAOU CABERNET SAUVIGNON',
17: 'ABSOLUT VODKA 1.75',
18: 'DON JULIO ANEJO 70TH ANNIV',
19: 'JAMESON IRISH WHISKEY 1.75',
20: "TITO'S HANDMADE VODKA 750ML",
21: 'JACK DAN WHISKEY 1.75L',
22: 'UNRULY CHARDONNAY',
23: 'DOM PERIGNON 12',
24: 'DOLCE VITA PROSECCO',
25: 'CASA DRAGONES BLANCO TEQ',
26: 'NAVIGATOR CAB NAPA VALLEY',
27: 'DONOVAN-PARKE PINOT NOIR',
28: 'TOQUES CLOCHERS CREMANT',
29: 'JAMESON IRISH WHISKEY',
30: 'HENNESSY VS COGNAC 750ML',
31: 'ATHENAEUM CAB SAUV NAPA',
32: 'ENCORE PINOT NOIR MONTEREY',
33: 'ABSOLUT VODKA',
34: 'DON JULIO BLANCO TEQ',
35: 'CROWN ROYAL CANDN WH 1.75',
36: 'BUCCANEER CABERNET',
37: 'NAVIGATOR NAPA SAUVIGNON BLANC',
38: 'UNRULY DARK RED WINE',
39: 'CARNEROS CREEK CHARDONNAY RSV'},
'Pallets': {0: 33,
1: 22,
2: 22,
3: 22,
4: 22,
5: 22,
6: 22,
7: 22,
8: 22,
9: 11,
10: 11,
11: 11,
12: 11,
13: 11,
14: 11,
15: 11,
16: 11,
17: 11,
18: 11,
19: 11,
20: 14,
21: 14,
22: 14,
23: 14,
24: 7,
25: 7,
26: 7,
27: 7,
28: 7,
29: 7,
30: 7,
31: 7,
32: 7,
33: 7,
34: 7,
35: 7,
36: 7,
37: 7,
38: 7,
39: 7}}
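One possible way to set up such a loop is sketched below on a small subset of the data above. It rests on several assumptions that are not stated in the question: sorted_location is already in the desired assignment order, each location holds a single SKU, the capacity of a location is the step in 'Cumulative Pallets', and SKUs are served in the order they appear in demand. The names capacity, loc and assignment are illustrative only.

```python
import pandas as pd

# Illustrative subset of the two frames shown above.
sorted_location = pd.DataFrame({
    "Label": ["3-10-36-2", "3-10-35-2", "3-10-34-2", "3-10-33-2", "3-10-32-2"],
    "Cumulative Pallets": [11, 22, 33, 44, 55],
})
demand = pd.DataFrame({
    "SKU": ["85,072", "2,922"],
    "Description": ["TITO'S HANDMADE VODKA 1.75LTR", "VEUVE CLICQUOT BRUT NV"],
    "Pallets": [33, 22],
})

# Assumption: capacity of each location = step in the cumulative pallet count.
cumulative = sorted_location["Cumulative Pallets"]
capacity = cumulative.diff().fillna(cumulative).tolist()
labels = sorted_location["Label"].tolist()

rows = []
loc = 0  # position of the next free location
for sku, desc, pallets in demand.itertuples(index=False):
    remaining = pallets
    # Keep taking locations until this SKU's demand is covered.
    while remaining > 0 and loc < len(labels):
        rows.append({"Label": labels[loc], "SKU": sku, "Description": desc})
        remaining -= capacity[loc]
        loc += 1

assignment = pd.DataFrame(rows)
print(assignment)
```

Because the frame is assumed to be pre-sorted by 'Slot_Type_Sequence' and 'Sequence to Slot', a single pointer is enough; a SKU whose demand does not fill its last location still occupies that location in this sketch.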

Why does the size of my 3D Plotly Scatterplot randomly change?

I am trying to create an animated 3D scatterplot to represent fish swimming in 3D space. I have 8 fish, and for each fish I have 4 points. I am able to make the graph and animate it; however, the size of the graph changes randomly between time points. I have set the axes' mins and maxes, but the distance between them seems to change. What aspect of the plot do I need to alter in order to keep it stable?
This is the plotly express command that I am using:
fig = px.scatter_3d(df,x="x", y="y", z="z",
color="Fish", animation_frame="Frame", hover_data = ["BodyPart"],
range_x=[-0.25,0.25], range_y=[-0.15,0.15], range_z=[-0.15,0.15],
color_continuous_scale = "rainbow")
These two images show the graph one frame apart from one another. The green square shows stats on one point to show that it is not changing drastically:
I am also including this video for a clearer example.
Edited:
Minimum graphing code:
import pandas as pd
import plotly.express as px
data_dict = {'Fish': {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1, 7: 1, 8: 2, 9: 2, 10: 2, 11: 2, 12: 3, 13: 3, 14: 3, 15: 3, 16: 4, 17: 4, 18: 4, 19: 4, 20: 5, 21: 5, 22: 5, 23: 5, 24: 6, 25: 6, 26: 6, 27: 6, 28: 7, 29: 7, 30: 7, 31: 7, 32: 0, 33: 0, 34: 0, 35: 0, 36: 1, 37: 1, 38: 1, 39: 1, 40: 2, 41: 2, 42: 2, 43: 2, 44: 3, 45: 3, 46: 3, 47: 3, 48: 4, 49: 4, 50: 4, 51: 4, 52: 5, 53: 5, 54: 5, 55: 5, 56: 6, 57: 6, 58: 6, 59: 6, 60: 7, 61: 7, 62: 7, 63: 7}, 'BodyPart': {0: 'head', 1: 'midline2', 2: 'tailbase', 3: 'tailtip', 4: 'head', 5: 'midline2', 6: 'tailbase', 7: 'tailtip', 8: 'head', 9: 'midline2', 10: 'tailbase', 11: 'tailtip', 12: 'head', 13: 'midline2', 14: 'tailbase', 15: 'tailtip', 16: 'head', 17: 'midline2', 18: 'tailbase', 19: 'tailtip', 20: 'head', 21: 'midline2', 22: 'tailbase', 23: 'tailtip', 24: 'head', 25: 'midline2', 26: 'tailbase', 27: 'tailtip', 28: 'head', 29: 'midline2', 30: 'tailbase', 31: 'tailtip', 32: 'head', 33: 'midline2', 34: 'tailbase', 35: 'tailtip', 36: 'head', 37: 'midline2', 38: 'tailbase', 39: 'tailtip', 40: 'head', 41: 'midline2', 42: 'tailbase', 43: 'tailtip', 44: 'head', 45: 'midline2', 46: 'tailbase', 47: 'tailtip', 48: 'head', 49: 'midline2', 50: 'tailbase', 51: 'tailtip', 52: 'head', 53: 'midline2', 54: 'tailbase', 55: 'tailtip', 56: 'head', 57: 'midline2', 58: 'tailbase', 59: 'tailtip', 60: 'head', 61: 'midline2', 62: 'tailbase', 63: 'tailtip'}, 'x': {0: 0.121283071, 1: 0.074230535, 2: 0.096664814, 3: 0.063435668, 4: -0.11843468, 5: -0.133776416, 6: -0.12698166, 7: -0.133996648, 8: 0.154499401, 9: 0.099541555, 10: 0.126525899, 11: 0.086448979, 12: -0.001723707, 13: -0.064203743, 14: -0.033163578, 15: -0.077987938, 16: 0.160456072, 17: 0.175340028, 18: 0.178537856, 19: 0.16438273, 20: -0.151890354, 21: -0.099510254, 22: -0.123827166, 23: -0.08765671, 24: 0.052741099, 25: -0.003778201, 26: 0.022010701, 27: -0.014747641, 28: -0.137528989, 29: -0.078632593, 30: -0.106688178, 31: -0.065274018, 32: 0.12128202, 33: 0.074230379, 34: 
0.096662597, 35: 0.063435699, 36: -0.118412987, 37: -0.133729238, 38: -0.12729935, 39: -0.134238167, 40: 0.154498856, 41: 0.099541572, 42: 0.126525899, 43: 0.086450612, 44: -0.001719156, 45: -0.064209291, 46: -0.033163578, 47: -0.07796947, 48: 0.157094899, 49: 0.175288008, 50: 0.178383788, 51: 0.1643551, 52: -0.153086656, 53: -0.100645272, 54: -0.125700666, 55: -0.089248865, 56: 0.052731775, 57: -0.003778201, 58: 0.022011924, 59: -0.014749184, 60: -0.138954183, 61: -0.079588201, 62: -0.107413558, 63: -0.06588028}, 'y': {0: -0.018777537, 1: -0.017936625, 2: -0.019031854, 3: -0.018688299, 4: 0.031655295, 5: 0.089278103, 6: 0.060434868, 7: 0.102354879, 8: 0.012448659, 9: 0.005374916, 10: 0.008431857, 11: 0.010384436, 12: 0.007394437, 13: 0.002657548, 14: 0.0047918, 15: 0.004216939, 16: -0.061691249, 17: -0.022574622, 18: -0.044862196, 19: -0.015288812, 20: 0.126254494, 21: 0.125420316, 22: 0.127216595, 23: 0.122366769, 24: -0.018798237, 25: -0.026209512, 26: -0.020654802, 27: -0.030922742, 28: 0.100460973, 29: 0.091726762, 30: 0.095608508, 31: 0.089022071, 32: -0.018930378, 33: -0.018313362, 34: -0.019121954, 35: -0.018839649, 36: 0.030465513, 37: 0.087966041, 38: 0.058855924, 39: 0.100617287, 40: 0.012372615, 41: 0.00530059, 42: 0.008431857, 43: 0.009864426, 44: 0.007169236, 45: 0.002524294, 46: 0.0047918, 47: 0.002813216, 48: -0.061409007, 49: -0.024774863, 50: -0.045825365, 51: -0.017002469, 52: 0.125813664, 53: 0.125533354, 54: 0.126988948, 55: 0.121414741, 56: -0.019165739, 57: -0.026209512, 58: -0.020802186, 59: -0.031842627, 60: 0.100213119, 61: 0.091677506, 62: 0.095490242, 63: 0.08724155}, 'z': {0: -0.011584533, 1: -0.005671144, 2: -0.004720913, 3: -0.007099159, 4: 0.048633092, 5: 0.044680886, 6: 0.047755313, 7: 0.047602698, 8: 0.005219131, 9: 0.020195691, 10: 0.013766486, 11: 0.019271016, 12: -0.009086866, 13: 0.005213358, 14: -0.003552202, 15: 0.001820855, 16: -0.039992723, 17: 0.041166976, 18: -0.013040119, 19: 0.048827692, 20: 0.044577227, 21: 
0.043492943, 22: 0.045104437, 23: 0.0399218, 24: 0.007934858, 25: 0.007980119, 26: 0.010593472, 27: 0.006390279, 28: 0.070277892, 29: 0.066889416, 30: 0.070485941, 31: 0.054907996, 32: -0.011559485, 33: -0.005583401, 34: -0.004725084, 35: -0.007089815, 36: 0.048823811, 37: 0.04574317, 38: 0.047201689, 39: 0.043995531, 40: 0.005234299, 41: 0.020211407, 42: 0.013766486, 43: 0.019405438, 44: -0.009034049, 45: 0.005200504, 46: -0.003552202, 47: 0.002061042, 48: -0.035258171, 49: 0.041424053, 50: -0.013317812, 51: 0.048629332, 52: 0.043972705, 53: 0.042581942, 54: 0.046299595, 55: 0.040028712, 56: 0.007931264, 57: 0.007980119, 58: 0.010624531, 59: 0.006616644, 60: 0.068992196, 61: 0.064455916, 62: 0.07226277, 63: 0.056393304}, 'Frame': {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0, 13: 0, 14: 0, 15: 0, 16: 0, 17: 0, 18: 0, 19: 0, 20: 0, 21: 0, 22: 0, 23: 0, 24: 0, 25: 0, 26: 0, 27: 0, 28: 0, 29: 0, 30: 0, 31: 0, 32: 1, 33: 1, 34: 1, 35: 1, 36: 1, 37: 1, 38: 1, 39: 1, 40: 1, 41: 1, 42: 1, 43: 1, 44: 1, 45: 1, 46: 1, 47: 1, 48: 1, 49: 1, 50: 1, 51: 1, 52: 1, 53: 1, 54: 1, 55: 1, 56: 1, 57: 1, 58: 1, 59: 1, 60: 1, 61: 1, 62: 1, 63: 1}}
df = pd.DataFrame(data_dict)
fig = px.scatter_3d(df,x="x", y="y", z="z", color="Fish", animation_frame="Frame", hover_data = ["BodyPart"],
range_x=[-0.25,0.25], range_y=[-0.15,0.15], range_z=[-0.15,0.15], color_continuous_scale = "rainbow")
fig.update_layout(margin=dict(l=0, r=0, b=0, t=0))
fig.show()
This seems related to the aspectratio in fig.layout.scene:
layout.Scene({
'aspectmode': 'auto',
'aspectratio': {'x': 1.7359689116422856, 'y': 0.9924641251101735, 'z':0.5804211635071164},
If you manually set x, y and z in the dict above to something specific, the flinching of the figure between animation frames seems to disappear.
I've tried:
fig.layout.scene.aspectratio = {'x':1, 'y':1, 'z':1}
fig.show()
And the results are promising. Give it a go on your end and let me know how it works out for you.
It also seems, as you've already discovered, to work best in tandem with setting defined ranges through range_x, range_y and range_z. Since your data sample is a bit limited, I've been messing around with px.data.gapminder().
Plot
Complete code
import plotly.express as px
df = px.data.gapminder()
# df
fig = px.scatter_3d(df, x = 'pop', y='lifeExp', z = 'gdpPercap', animation_frame='year',
range_x=[int(df['pop'].min()*0.5),int(df['pop'].max()*1.5)],
range_y=[int(df.lifeExp.min()*0.5),int(df.lifeExp.max()*1.5)],
range_z=[int(df['gdpPercap'].min()*0.5),int(df['gdpPercap'].max()*1.5)]
)
fig.layout.scene.aspectratio = {'x':1, 'y':1, 'z':1}
fig.show()

write dataframe columns past nan values in specific column based on 2 dataframe relationship

I know the question is worded horribly but I can't think of how to word it any better.
I have two dataframes, one containing the original data:
{2016: {1: 88698.0,
2: 86725.0,
3: 80426.0,
4: 74888.0,
5: 71659.0,
6: 67431.0,
7: 63613.0,
8: 60174.0,
9: 59495.0,
10: 59487.0,
11: 59118.0,
12: 59542.0,
13: 61170.0,
14: 63785.0,
15: 65038.0,
16: 67441.0,
17: 68188.0,
18: 69303.0,
19: 70224.0,
20: 70163.0,
21: 71522.0,
22: 73707.0,
23: 75002.0,
24: 76487.0,
25: 78806.0,
26: 81444.0,
27: 84114.0,
28: 84274.0,
29: 86701.0,
30: 87051.0,
31: 89298.0,
32: 91461.0,
33: 93937.0,
34: 96308.0,
35: 96803.0,
36: 98718.0,
37: 99343.0,
38: 100494.0,
39: 101260.0,
40: 101153.0,
41: 99668.0,
42: 97139.0,
43: 97203.0,
44: 95940.0,
45: 96969.0,
46: 98608.0,
47: 96332.0,
48: 94736.0,
49: 90970.0,
50: 87938.0,
51: 82082.0,
52: 79481.0,
53: nan},
2017: {1: 75212.0,
2: 68024.0,
3: 64087.0,
4: 58824.0,
5: 52226.0,
6: 50006.0,
7: 46975.0,
8: 46794.0,
9: 42855.0,
10: 42021.0,
11: 41884.0,
12: 40281.0,
13: 39117.0,
14: 37985.0,
15: 37120.0,
16: 36968.0,
17: 36702.0,
18: 38486.0,
19: 39051.0,
20: 40589.0,
21: 44099.0,
22: 47631.0,
23: 49984.0,
24: 51726.0,
25: 55653.0,
26: 57663.0,
27: 59409.0,
28: 62820.0,
29: 63324.0,
30: 64788.0,
31: 64693.0,
32: 66452.0,
33: 69349.0,
34: 70697.0,
35: 76470.0,
36: 78805.0,
37: 77624.0,
38: 75268.0,
39: 74695.0,
40: 75892.0,
41: 75930.0,
42: 74942.0,
43: 75824.0,
44: 74628.0,
45: 72058.0,
46: 71113.0,
47: 70602.0,
48: 71898.0,
49: 72186.0,
50: 68760.0,
51: 65931.0,
52: 65441.0,
53: nan},
2018: {1: 59224.0,
2: 55546.0,
3: 51355.0,
4: 50126.0,
5: 45962.0,
6: 42438.0,
7: 39840.0,
8: 39370.0,
9: 37844.0,
10: 35470.0,
11: 33731.0,
12: 32671.0,
13: 33416.0,
14: 33039.0,
15: 33260.0,
16: 32937.0,
17: 33599.0,
18: 35737.0,
19: 37453.0,
20: 38314.0,
21: 40159.0,
22: 44152.0,
23: 47971.0,
24: 51381.0,
25: 55825.0,
26: 58905.0,
27: 61242.0,
28: 62724.0,
29: 61766.0,
30: 63514.0,
31: 63533.0,
32: 66825.0,
33: 65732.0,
34: 68240.0,
35: 70572.0,
36: 71835.0,
37: 72966.0,
38: 74556.0,
39: 76592.0,
40: 78223.0,
41: 79895.0,
42: 79209.0,
43: 79793.0,
44: 80800.0,
45: 79795.0,
46: 78203.0,
47: 77027.0,
48: 75356.0,
49: 72124.0,
50: 68584.0,
51: 67402.0,
52: 65576.0,
53: nan},
2019: {1: 63624.0,
2: 62046.0,
3: 58091.0,
4: 54316.0,
5: 51765.0,
6: 52033.0,
7: 48140.0,
8: 46787.0,
9: 44772.0,
10: 43806.0,
11: 44905.0,
12: 45564.0,
13: 46906.0,
14: 48134.0,
15: 50554.0,
16: 51797.0,
17: 53271.0,
18: 54197.0,
19: 57114.0,
20: 60312.0,
21: 60509.0,
22: 63388.0,
23: 66265.0,
24: 69530.0,
25: 70905.0,
26: 72313.0,
27: 72288.0,
28: 73153.0,
29: 74967.0,
30: 76430.0,
31: 79261.0,
32: 82623.0,
33: 86492.0,
34: 90041.0,
35: 92856.0,
36: 93701.0,
37: 96520.0,
38: 95368.0,
39: 96264.0,
40: 96355.0,
41: 95794.0,
42: 95282.0,
43: 94817.0,
44: 95536.0,
45: 92914.0,
46: 89160.0,
47: 88321.0,
48: 86443.0,
49: 88099.0,
50: 85469.0,
51: 82634.0,
52: 82188.0,
53: nan},
2020: {1: 82784.0,
2: 81804.0,
3: 80581.0,
4: 77236.0,
5: 77976.0,
6: 71822.0,
7: 68726.0,
8: 68132.0,
9: 64557.0,
10: 61529.0,
11: 61379.0,
12: 59424.0,
13: 59134.0,
14: 59027.0,
15: 56780.0,
16: 57442.0,
17: 56835.0,
18: 59376.0,
19: 61625.0,
20: 62697.0,
21: 64240.0,
22: 67329.0,
23: 66282.0,
24: 68967.0,
25: 71331.0,
26: 74599.0,
27: 76823.0,
28: 80348.0,
29: 82388.0,
30: 84404.0,
31: 86713.0,
32: 89336.0,
33: 89295.0,
34: 90833.0,
35: 95222.0,
36: 97380.0,
37: 96141.0,
38: 97890.0,
39: 101959.0,
40: 101842.0,
41: 99897.0,
42: 98325.0,
43: 98391.0,
44: 95828.0,
45: 94889.0,
46: 92887.0,
47: 92562.0,
48: 91718.0,
49: 87637.0,
50: 83927.0,
51: 81596.0,
52: 75146.0,
53: 72777.0},
2021: {1: 66048.0,
2: 59818.0,
3: 57610.0,
4: 56053.0,
5: 51545.0,
6: 48649.0,
7: 43491.0,
8: 41246.0,
9: 41199.0,
10: 41029.0,
11: 41269.0,
12: nan,
13: nan,
14: nan,
15: nan,
16: nan,
17: nan,
18: nan,
19: nan,
20: nan,
21: nan,
22: nan,
23: nan,
24: nan,
25: nan,
26: nan,
27: nan,
28: nan,
29: nan,
30: nan,
31: nan,
32: nan,
33: nan,
34: nan,
35: nan,
36: nan,
37: nan,
38: nan,
39: nan,
40: nan,
41: nan,
42: nan,
43: nan,
44: nan,
45: nan,
46: nan,
47: nan,
48: nan,
49: nan,
50: nan,
51: nan,
52: nan,
53: nan}}
and then one which is just the first dataframe.diff():
{2016: {1: nan,
2: -1973.0,
3: -6299.0,
4: -5538.0,
5: -3229.0,
6: -4228.0,
7: -3818.0,
8: -3439.0,
9: -679.0,
10: -8.0,
11: -369.0,
12: 424.0,
13: 1628.0,
14: 2615.0,
15: 1253.0,
16: 2403.0,
17: 747.0,
18: 1115.0,
19: 921.0,
20: -61.0,
21: 1359.0,
22: 2185.0,
23: 1295.0,
24: 1485.0,
25: 2319.0,
26: 2638.0,
27: 2670.0,
28: 160.0,
29: 2427.0,
30: 350.0,
31: 2247.0,
32: 2163.0,
33: 2476.0,
34: 2371.0,
35: 495.0,
36: 1915.0,
37: 625.0,
38: 1151.0,
39: 766.0,
40: -107.0,
41: -1485.0,
42: -2529.0,
43: 64.0,
44: -1263.0,
45: 1029.0,
46: 1639.0,
47: -2276.0,
48: -1596.0,
49: -3766.0,
50: -3032.0,
51: -5856.0,
52: -2601.0,
53: nan},
2017: {1: nan,
2: -7188.0,
3: -3937.0,
4: -5263.0,
5: -6598.0,
6: -2220.0,
7: -3031.0,
8: -181.0,
9: -3939.0,
10: -834.0,
11: -137.0,
12: -1603.0,
13: -1164.0,
14: -1132.0,
15: -865.0,
16: -152.0,
17: -266.0,
18: 1784.0,
19: 565.0,
20: 1538.0,
21: 3510.0,
22: 3532.0,
23: 2353.0,
24: 1742.0,
25: 3927.0,
26: 2010.0,
27: 1746.0,
28: 3411.0,
29: 504.0,
30: 1464.0,
31: -95.0,
32: 1759.0,
33: 2897.0,
34: 1348.0,
35: 5773.0,
36: 2335.0,
37: -1181.0,
38: -2356.0,
39: -573.0,
40: 1197.0,
41: 38.0,
42: -988.0,
43: 882.0,
44: -1196.0,
45: -2570.0,
46: -945.0,
47: -511.0,
48: 1296.0,
49: 288.0,
50: -3426.0,
51: -2829.0,
52: -490.0,
53: nan},
2018: {1: nan,
2: -3678.0,
3: -4191.0,
4: -1229.0,
5: -4164.0,
6: -3524.0,
7: -2598.0,
8: -470.0,
9: -1526.0,
10: -2374.0,
11: -1739.0,
12: -1060.0,
13: 745.0,
14: -377.0,
15: 221.0,
16: -323.0,
17: 662.0,
18: 2138.0,
19: 1716.0,
20: 861.0,
21: 1845.0,
22: 3993.0,
23: 3819.0,
24: 3410.0,
25: 4444.0,
26: 3080.0,
27: 2337.0,
28: 1482.0,
29: -958.0,
30: 1748.0,
31: 19.0,
32: 3292.0,
33: -1093.0,
34: 2508.0,
35: 2332.0,
36: 1263.0,
37: 1131.0,
38: 1590.0,
39: 2036.0,
40: 1631.0,
41: 1672.0,
42: -686.0,
43: 584.0,
44: 1007.0,
45: -1005.0,
46: -1592.0,
47: -1176.0,
48: -1671.0,
49: -3232.0,
50: -3540.0,
51: -1182.0,
52: -1826.0,
53: nan},
2019: {1: nan,
2: -1578.0,
3: -3955.0,
4: -3775.0,
5: -2551.0,
6: 268.0,
7: -3893.0,
8: -1353.0,
9: -2015.0,
10: -966.0,
11: 1099.0,
12: 659.0,
13: 1342.0,
14: 1228.0,
15: 2420.0,
16: 1243.0,
17: 1474.0,
18: 926.0,
19: 2917.0,
20: 3198.0,
21: 197.0,
22: 2879.0,
23: 2877.0,
24: 3265.0,
25: 1375.0,
26: 1408.0,
27: -25.0,
28: 865.0,
29: 1814.0,
30: 1463.0,
31: 2831.0,
32: 3362.0,
33: 3869.0,
34: 3549.0,
35: 2815.0,
36: 845.0,
37: 2819.0,
38: -1152.0,
39: 896.0,
40: 91.0,
41: -561.0,
42: -512.0,
43: -465.0,
44: 719.0,
45: -2622.0,
46: -3754.0,
47: -839.0,
48: -1878.0,
49: 1656.0,
50: -2630.0,
51: -2835.0,
52: -446.0,
53: nan},
2020: {1: nan,
2: -980.0,
3: -1223.0,
4: -3345.0,
5: 740.0,
6: -6154.0,
7: -3096.0,
8: -594.0,
9: -3575.0,
10: -3028.0,
11: -150.0,
12: -1955.0,
13: -290.0,
14: -107.0,
15: -2247.0,
16: 662.0,
17: -607.0,
18: 2541.0,
19: 2249.0,
20: 1072.0,
21: 1543.0,
22: 3089.0,
23: -1047.0,
24: 2685.0,
25: 2364.0,
26: 3268.0,
27: 2224.0,
28: 3525.0,
29: 2040.0,
30: 2016.0,
31: 2309.0,
32: 2623.0,
33: -41.0,
34: 1538.0,
35: 4389.0,
36: 2158.0,
37: -1239.0,
38: 1749.0,
39: 4069.0,
40: -117.0,
41: -1945.0,
42: -1572.0,
43: 66.0,
44: -2563.0,
45: -939.0,
46: -2002.0,
47: -325.0,
48: -844.0,
49: -4081.0,
50: -3710.0,
51: -2331.0,
52: -6450.0,
53: -2369.0}}
What I am trying to do is calculate, for all columns in any row where 2021 is NaN, the next row's value by taking the value in the original dataframe and adding the next value down from the .diff() dataframe. So, for example, 2020 for week 12 would be 61379 (row 11 in the original df) + (-1955.0, row 12 from the .diff() df).
TIA
Same logic as before:
out = df1.mask(df1[2021].notna(),df1+df2.shift(-1),axis=0).fillna(df1[[2021]])
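To see what this expression computes, here is a small trace on made-up numbers. The toy frames below are assumptions for illustration: df1 has one complete year and one partial year, and df2 holds the diffs of the complete year only, mirroring the fact that the .diff() frame above has no 2021 column.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the two frames (values are invented).
df1 = pd.DataFrame(
    {2020: [10.0, 12.0, 15.0, 14.0], 2021: [20.0, 21.0, np.nan, np.nan]},
    index=[1, 2, 3, 4],
)
df2 = df1[[2020]].diff()

# Each row's value plus the diff from the row below it.
shifted_sum = df1 + df2.shift(-1)

# Rows where 2021 is present are replaced by shifted_sum; other rows are kept.
# The 2021 column of shifted_sum is all NaN, so fillna restores it from df1.
out = df1.mask(df1[2021].notna(), shifted_sum, axis=0).fillna(df1[[2021]])
print(out)
```

Note that mask replaces values where the condition is True, so in this trace it is the rows with a 2021 value that receive the shifted sums.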

Stacked bar chart X axis gives wrong order python plotly

Hi, I created a stacked bar chart using Python Plotly, but it gives the wrong x-axis order.
DF :
Day-Shift State seconds
Day 01-05 A 7439
Day 01-05 STOPPED 0
Day 01-05 B 10
Day 01-05 C 35751
Night 01-05 C 43200
Day 01-06 STOPPED 7198
Day 01-06 F 18
Day 01-06 A 14
Day 01-06 A 29301
Day 01-06 STOPPED 6
Day 01-06 A 6663
Night 01-06 A 43200
In df, Day-Shift represents the shift and the date; it goes Day 01-05, Night 01-05, Day 01-06, Night 01-06, and so on.
But the graph shows the wrong order on the x-axis. For example, after Day 01-05 the graph shows Night 01-08 instead of Night 01-05.
Sample df and my code are attached below:
import plotly.express as px
fig = px.bar(df, x="Day-Shift", y="seconds", color="State")
fig.show()
Df as dict:
import pandas as pd
import plotly.express as px
df = pd.DataFrame({'Day-Shift': {0: 'Day 01-05',
1: 'Day 01-05',
2: 'Day 01-05',
3: 'Day 01-05',
4: 'Night 01-05',
5: 'Day 01-06',
6: 'Day 01-06',
7: 'Day 01-06',
8: 'Day 01-06',
9: 'Day 01-06',
10: 'Day 01-06',
11: 'Night 01-06',
12: 'Day 01-07',
13: 'Night 01-07',
14: 'Night 01-07',
15: 'Night 01-07',
16: 'Night 01-07',
17: 'Night 01-07',
18: 'Night 01-08',
19: 'Night 01-08',
20: 'Night 01-08',
21: 'Night 01-08',
22: 'Day 01-08',
23: 'Day 01-08',
24: 'Day 01-08',
25: 'Night 01-09',
26: 'Night 01-09',
27: 'Night 01-09',
28: 'Day 01-09',
29: 'Day 01-09',
30: 'Day 01-09',
31: 'Day 01-09',
32: 'Day 01-10',
33: 'Night 01-10',
34: 'Day 01-11',
35: 'Day 01-11',
36: 'Day 01-11',
37: 'Day 01-11',
38: 'Day 01-11',
39: 'Night 01-11',
40: 'Day 01-12',
41: 'Night 01-12',
42: 'Day 01-13',
43: 'Day 01-13',
44: 'Day 01-13',
45: 'Day 01-13',
46: 'Day 01-13',
47: 'Day 01-13',
48: 'Day 01-13',
49: 'Night 01-13',
50: 'Day 01-14',
51: 'Day 01-14',
52: 'Day 01-14',
53: 'Day 01-14',
54: 'Day 01-14',
55: 'Day 01-14',
56: 'Day 01-14',
57: 'Day 01-14',
58: 'Day 01-14',
59: 'Night 01-14'},
'State': {0: 'D',
1: 'STOPPED',
2: 'B',
3: 'A',
4: 'A',
5: 'A',
6: 'A1',
7: 'A2',
8: 'A3',
9: 'A4',
10: 'B1',
11: 'B1',
12: 'B1',
13: 'B1',
14: 'B2',
15: 'STOPPED',
16: 'RUNNING',
17: 'B',
18: 'STOPPED',
19: 'B',
20: 'RUNNING',
21: 'D',
22: 'STOPPED',
23: 'B',
24: 'RUNNING',
25: 'STOPPED',
26: 'RUNNING',
27: 'B',
28: 'RUNNING',
29: 'STOPPED',
30: 'B',
31: 'D',
32: 'B',
33: 'B',
34: 'B',
35: 'RUNNING',
36: 'STOPPED',
37: 'D',
38: 'A',
39: 'A',
40: 'A',
41: 'A',
42: 'A',
43: 'A1',
44: 'A2',
45: 'A3',
46: 'A4',
47: 'B1',
48: 'B2',
49: 'B2',
50: 'B2',
51: 'B',
52: 'STOPPED',
53: 'A',
54: 'A1',
55: 'A2',
56: 'A3',
57: 'A4',
58: 'B1',
59: 'B1'},
'seconds': {0: 7439,
1: 0,
2: 10,
3: 35751,
4: 43200,
5: 7198,
6: 18,
7: 14,
8: 29301,
9: 6,
10: 6663,
11: 43200,
12: 43200,
13: 5339,
14: 8217,
15: 0,
16: 4147,
17: 1040,
18: 24787,
19: 1500,
20: 14966,
21: 1410,
22: 2499,
23: 1310,
24: 39391,
25: 3570,
26: 17234,
27: 47390,
28: 36068,
29: 270,
30: 6842,
31: 20,
32: 43200,
33: 43200,
34: 2486,
35: 8420,
36: 870,
37: 30,
38: 31394,
39: 43200,
40: 43200,
41: 43200,
42: 36733,
43: 23,
44: 6,
45: 4,
46: 4,
47: 3,
48: 6427,
49: 43200,
50: 620,
51: 0,
52: 4,
53: 41336,
54: 4,
55: 4,
56: 4,
57: 23,
58: 1205,
59: 43200}})
Really appreciate your support !!!
You can use category_orders to set the order of values:
import pandas as pd
import plotly.express as px
df = pd.DataFrame({'Day-Shift': {0: 'Day 01-05', 1: 'Day 01-05', 2: 'Day 01-05', 3: 'Day 01-05', 4: 'Night 01-05', 5: 'Day 01-06', 6: 'Day 01-06', 7: 'Day 01-06', 8: 'Day 01-06', 9: 'Day 01-06', 10: 'Day 01-06', 11: 'Night 01-06', 12: 'Day 01-07', 13: 'Night 01-07', 14: 'Night 01-07', 15: 'Night 01-07', 16: 'Night 01-07', 17: 'Night 01-07', 18: 'Night 01-08', 19: 'Night 01-08', 20: 'Night 01-08', 21: 'Night 01-08', 22: 'Day 01-08', 23: 'Day 01-08', 24: 'Day 01-08', 25: 'Night 01-09', 26: 'Night 01-09', 27: 'Night 01-09', 28: 'Day 01-09', 29: 'Day 01-09', 30: 'Day 01-09', 31: 'Day 01-09', 32: 'Day 01-10', 33: 'Night 01-10', 34: 'Day 01-11', 35: 'Day 01-11', 36: 'Day 01-11', 37: 'Day 01-11', 38: 'Day 01-11', 39: 'Night 01-11', 40: 'Day 01-12', 41: 'Night 01-12', 42: 'Day 01-13', 43: 'Day 01-13', 44: 'Day 01-13', 45: 'Day 01-13', 46: 'Day 01-13', 47: 'Day 01-13', 48: 'Day 01-13', 49: 'Night 01-13', 50: 'Day 01-14', 51: 'Day 01-14', 52: 'Day 01-14', 53: 'Day 01-14', 54: 'Day 01-14', 55: 'Day 01-14', 56: 'Day 01-14', 57: 'Day 01-14', 58: 'Day 01-14', 59: 'Night 01-14'}, 'State': {0: 'D', 1: 'STOPPED', 2: 'B', 3: 'A', 4: 'A', 5: 'A', 6: 'A1', 7: 'A2', 8: 'A3', 9: 'A4', 10: 'B1', 11: 'B1', 12: 'B1', 13: 'B1', 14: 'B2', 15: 'STOPPED', 16: 'RUNNING', 17: 'B', 18: 'STOPPED', 19: 'B', 20: 'RUNNING', 21: 'D', 22: 'STOPPED', 23: 'B', 24: 'RUNNING', 25: 'STOPPED', 26: 'RUNNING', 27: 'B', 28: 'RUNNING', 29: 'STOPPED', 30: 'B', 31: 'D', 32: 'B', 33: 'B', 34: 'B', 35: 'RUNNING', 36: 'STOPPED', 37: 'D', 38: 'A', 39: 'A', 40: 'A', 41: 'A', 42: 'A', 43: 'A1', 44: 'A2', 45: 'A3', 46: 'A4', 47: 'B1', 48: 'B2', 49: 'B2', 50: 'B2', 51: 'B', 52: 'STOPPED', 53: 'A', 54: 'A1', 55: 'A2', 56: 'A3', 57: 'A4', 58: 'B1', 59: 'B1'}, 'seconds': {0: 7439, 1: 0, 2: 10, 3: 35751, 4: 43200, 5: 7198, 6: 18, 7: 14, 8: 29301, 9: 6, 10: 6663, 11: 43200, 12: 43200, 13: 5339, 14: 8217, 15: 0, 16: 4147, 17: 1040, 18: 24787, 19: 1500, 20: 14966, 21: 1410, 22: 2499, 23: 1310, 24: 39391, 25: 3570, 26: 17234, 27: 
47390, 28: 36068, 29: 270, 30: 6842, 31: 20, 32: 43200, 33: 43200, 34: 2486, 35: 8420, 36: 870, 37: 30, 38: 31394, 39: 43200, 40: 43200, 41: 43200, 42: 36733, 43: 23, 44: 6, 45: 4, 46: 4, 47: 3, 48: 6427, 49: 43200, 50: 620, 51: 0, 52: 4, 53: 41336, 54: 4, 55: 4, 56: 4, 57: 23, 58: 1205, 59: 43200}})
fig = px.bar(df, x="Day-Shift", y="seconds", category_orders={'Day-Shift': df['Day-Shift'].to_list()},color="State")
fig.show()
Output:
Setting category_orders = {"Day-Shift": df['Day-Shift'].unique()} will work, but only reliably if your dataset is already in the correct order and only covers a single year. To guarantee the correct order regardless of the original order, and to make it possible to combine data for December 2020 with January 2021, I would suggest that you:
split "Day-Shift" into two separate columns: time of day (tod) and date,
append the year to your dates, like dfs['date2'] = dfs['date'] + '-2021',
turn 'date2' into datetime using dfs['date2'] = pd.to_datetime(dfs['date2']),
sort your values chronologically,
retrieve "Day-Shift" in the now correct order with new_order = list(df['Day-Shift'].unique()), and then
apply the chronologically correct order through category_orders = {'Day-Shift': new_order}
Plot
Complete code:
import pandas as pd
import plotly.express as px
df = pd.DataFrame({'Day-Shift': {0: 'Day 01-05',
1: 'Day 01-05',
2: 'Day 01-05',
3: 'Day 01-05',
4: 'Night 01-05',
5: 'Day 01-06',
6: 'Day 01-06',
7: 'Day 01-06',
8: 'Day 01-06',
9: 'Day 01-06',
10: 'Day 01-06',
11: 'Night 01-06',
12: 'Day 01-07',
13: 'Night 01-07',
14: 'Night 01-07',
15: 'Night 01-07',
16: 'Night 01-07',
17: 'Night 01-07',
18: 'Night 01-08',
19: 'Night 01-08',
20: 'Night 01-08',
21: 'Night 01-08',
22: 'Day 01-08',
23: 'Day 01-08',
24: 'Day 01-08',
25: 'Night 01-09',
26: 'Night 01-09',
27: 'Night 01-09',
28: 'Day 01-09',
29: 'Day 01-09',
30: 'Day 01-09',
31: 'Day 01-09',
32: 'Day 01-10',
33: 'Night 01-10',
34: 'Day 01-11',
35: 'Day 01-11',
36: 'Day 01-11',
37: 'Day 01-11',
38: 'Day 01-11',
39: 'Night 01-11',
40: 'Day 01-12',
41: 'Night 01-12',
42: 'Day 01-13',
43: 'Day 01-13',
44: 'Day 01-13',
45: 'Day 01-13',
46: 'Day 01-13',
47: 'Day 01-13',
48: 'Day 01-13',
49: 'Night 01-13',
50: 'Day 01-14',
51: 'Day 01-14',
52: 'Day 01-14',
53: 'Day 01-14',
54: 'Day 01-14',
55: 'Day 01-14',
56: 'Day 01-14',
57: 'Day 01-14',
58: 'Day 01-14',
59: 'Night 01-14'},
'State': {0: 'D',
1: 'STOPPED',
2: 'B',
3: 'A',
4: 'A',
5: 'A',
6: 'A1',
7: 'A2',
8: 'A3',
9: 'A4',
10: 'B1',
11: 'B1',
12: 'B1',
13: 'B1',
14: 'B2',
15: 'STOPPED',
16: 'RUNNING',
17: 'B',
18: 'STOPPED',
19: 'B',
20: 'RUNNING',
21: 'D',
22: 'STOPPED',
23: 'B',
24: 'RUNNING',
25: 'STOPPED',
26: 'RUNNING',
27: 'B',
28: 'RUNNING',
29: 'STOPPED',
30: 'B',
31: 'D',
32: 'B',
33: 'B',
34: 'B',
35: 'RUNNING',
36: 'STOPPED',
37: 'D',
38: 'A',
39: 'A',
40: 'A',
41: 'A',
42: 'A',
43: 'A1',
44: 'A2',
45: 'A3',
46: 'A4',
47: 'B1',
48: 'B2',
49: 'B2',
50: 'B2',
51: 'B',
52: 'STOPPED',
53: 'A',
54: 'A1',
55: 'A2',
56: 'A3',
57: 'A4',
58: 'B1',
59: 'B1'},
'seconds': {0: 7439,
1: 0,
2: 10,
3: 35751,
4: 43200,
5: 7198,
6: 18,
7: 14,
8: 29301,
9: 6,
10: 6663,
11: 43200,
12: 43200,
13: 5339,
14: 8217,
15: 0,
16: 4147,
17: 1040,
18: 24787,
19: 1500,
20: 14966,
21: 1410,
22: 2499,
23: 1310,
24: 39391,
25: 3570,
26: 17234,
27: 47390,
28: 36068,
29: 270,
30: 6842,
31: 20,
32: 43200,
33: 43200,
34: 2486,
35: 8420,
36: 870,
37: 30,
38: 31394,
39: 43200,
40: 43200,
41: 43200,
42: 36733,
43: 23,
44: 6,
45: 4,
46: 4,
47: 3,
48: 6427,
49: 43200,
50: 620,
51: 0,
52: 4,
53: 41336,
54: 4,
55: 4,
56: 4,
57: 23,
58: 1205,
59: 43200}})
# Split e.g. "Day 01-05" into the shift name ('tod') and the date part ('date')
dfs = df['Day-Shift'].str.extract('([a-zA-Z]+)([^a-zA-Z]+)', expand=True)
dfs.columns = ['tod', 'date']
# The extracted date keeps its leading space, so strip it before appending the year
dfs['date2'] = dfs['date'].str.strip() + '-2021'
dfs['date2'] = pd.to_datetime(dfs['date2'], format='%m-%d-%Y')
df = pd.concat([df, dfs], axis = 1)
# 'Day' sorts before 'Night' alphabetically, so day shifts come first within a date
df = df.sort_values(['date2', 'tod'], ascending = [True, True])
new_order = list(df['Day-Shift'].unique())
# df['Day-Shift'] = pd.Categorical(df['Day-Shift'], categories=new_order, ordered=True)
fig = px.bar(df, x="Day-Shift", y="seconds", color="State",
category_orders = {'Day-Shift': new_order})
fig.update_xaxes(type='category')
fig.show()
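For reference, the same chronological order can also be worked out without pandas. This is only a minimal sketch, assuming every label looks like 'Day 01-05' or 'Night 01-05' (shift name, then MM-DD) and that all rows belong to one year; shift_sort_key, chronological_order and the year parameter are names made up for this illustration. Data spanning two years would need the year stored per row instead.

```python
from datetime import date


def shift_sort_key(label, year=2021):
    """Sort key for labels such as 'Night 01-05' (assumed format)."""
    shift, month_day = label.split()
    month, day = (int(part) for part in month_day.split("-"))
    # Day shifts are placed before night shifts on the same date.
    return (date(year, month, day), 0 if shift == "Day" else 1)


def chronological_order(labels, year=2021):
    """Return the unique labels sorted by date, then by shift."""
    return sorted(set(labels), key=lambda label: shift_sort_key(label, year))
```

The resulting list could then be passed as category_orders = {'Day-Shift': chronological_order(df['Day-Shift'])}.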

Manipulating an image in python

I have this set of data:
{5: 136018, 4: 131402, 6: 113441, 7: 94609, 8: 80752, 9: 69753, 10: 60322, 11: 51388,
12: 44416, 13: 37638, 14: 31524, 15: 26275, 16: 22098, 17: 18458, 18: 15294, 19: 12207,
20: 10209, 21: 8355, 22: 6826, 23: 5657, 24: 4554, 25: 3668, 26: 2907, 27: 2438, 28: 1923,
29: 1609, 30: 1223, 31: 1000, 32: 821, 33: 693, 34: 492, 35: 381, 36: 315, 37: 263, 38: 218,
40: 170, 39: 164, 41: 103, 42: 94, 43: 58, 44: 48, 45: 40, 47: 36, 46: 30, 49: 22, 48: 21,
50: 14, 51: 12, 53: 9, 52: 6, 54: 5, 55: 5, 56: 4, 57: 3, 64: 2, 58: 1, 59: 1, 60: 1,
61: 1, 62: 1, 65: 1, 66: 1}
What I want to do is create an image from this data. I know it's not going to be easy, but basically I want to use something like PIL to create an image that shows a kind of bar graph. I know it would also be a huge image because of the big numbers (like 136018).
So how the heck would I possibly do this with Python and PIL?
The hard way. Use matplotlib instead.
Ignacio's advice is good - but here is a simpler example to get you started:
import pylab # part of the matplotlib package - a simpler interface
data = {
5: 136018, 4: 131402, 6: 113441, 7: 94609, 8: 80752, 9: 69753,
10: 60322, 11: 51388, 12: 44416, 13: 37638, 14: 31524, 15: 26275,
16: 22098, 17: 18458, 18: 15294, 19: 12207, 20: 10209, 21: 8355,
22: 6826, 23: 5657, 24: 4554, 25: 3668, 26: 2907, 27: 2438, 28: 1923,
29: 1609, 30: 1223, 31: 1000, 32: 821, 33: 693, 34: 492, 35: 381,
36: 315, 37: 263, 38: 218, 40: 170, 39: 164, 41: 103, 42: 94, 43: 58,
44: 48, 45: 40, 47: 36, 46: 30, 49: 22, 48: 21, 50: 14, 51: 12,
53: 9, 52: 6, 54: 5, 55: 5, 56: 4, 57: 3, 64: 2, 58: 1, 59: 1, 60: 1,
61: 1, 62: 1, 65: 1, 66: 1
}
xs = range(min(data), max(data)+1)
ys = [data.get(x, 0) for x in xs]
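pylab.bar(xs, ys)
pylab.show()  # or pylab.savefig('bars.png') to write the chart to an image file
This gives you a bar chart of the counts.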
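Try something like this. You'll have to modify the pixels part, naturally:
from PIL import Image  # a bare "import PIL" does not load the Image submodule
image = Image.new('RGBA', (1000, 1000))
pixels = image.load()
x, y = 10, 20  # placeholder coordinates
some_color = (255, 0, 0, 255)  # placeholder RGBA color (opaque red)
print(pixels[x, y])
pixels[x, y] = some_color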
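If you do want to stay with PIL, the image does not have to be huge: the counts can be scaled so that the tallest bar just fills the image height. Below is a minimal sketch of that idea, assuming Pillow is installed and that the image is at least as many pixels wide as there are keys. bar_rectangles and draw_bars are hypothetical helper names, and the 660x400 size is arbitrary.

```python
def bar_rectangles(data, width=660, height=400):
    """Return (x0, y0, x1, y1) pixel boxes, one per non-empty bar."""
    keys = range(min(data), max(data) + 1)
    bar_width = max(1, width // len(keys))
    peak = max(data.values())
    rectangles = []
    for index, key in enumerate(keys):
        # Linear scaling: the largest count maps to the full image height.
        bar_height = data.get(key, 0) * height // peak
        if bar_height == 0:
            continue  # too small to show at this scale
        x0 = index * bar_width
        # Image coordinates start at the top-left corner, so bars hang
        # from (height - bar_height) down to the bottom row.
        rectangles.append((x0, height - bar_height, x0 + bar_width - 1, height - 1))
    return rectangles


def draw_bars(data, width=660, height=400):
    """Draw the bars on a white canvas and return the PIL image."""
    # Imported here so bar_rectangles() stays usable without Pillow.
    from PIL import Image, ImageDraw

    image = Image.new("RGB", (width, height), "white")
    draw = ImageDraw.Draw(image)
    for rectangle in bar_rectangles(data, width, height):
        draw.rectangle(rectangle, fill="blue")
    return image
```

Calling draw_bars(data).save('bars.png') would then write the chart to a file. With a linear scale, counts much smaller than the peak round down to zero pixels and are skipped.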
