This is my first dataframe:
{'Year': {12: 1999,
13: 1999,
14: 1999,
15: 2000,
16: 2000,
17: 2000,
18: 2001,
19: 2001,
20: 2001,
21: 2002,
22: 2002,
23: 2002,
24: 2003,
25: 2003,
26: 2003,
27: 2004,
28: 2004,
29: 2004,
30: 2005,
31: 2005,
32: 2005,
33: 2006,
34: 2006,
35: 2006,
36: 2007,
37: 2007,
38: 2007,
39: 2008,
40: 2008,
41: 2008,
42: 2009,
43: 2009,
44: 2009,
45: 2010,
46: 2010,
47: 2010,
48: 2011,
49: 2011,
50: 2011,
51: 2012,
52: 2012,
53: 2012,
54: 2013,
55: 2013,
56: 2013,
57: 2014,
58: 2014,
59: 2014,
60: 2015,
61: 2015,
62: 2015,
63: 2016,
64: 2016,
65: 2016,
66: 2017,
67: 2017,
68: 2017,
69: 2018,
70: 2018,
71: 2018,
72: 2019,
73: 2019,
74: 2019,
75: 2020,
76: 2020,
77: 2020,
78: 2021,
79: 2021,
80: 2021},
'Type of Public Transport': {12: 'MRT',
13: 'LRT',
14: 'Bus',
15: 'MRT',
16: 'LRT',
17: 'Bus',
18: 'MRT',
19: 'LRT',
20: 'Bus',
21: 'MRT',
22: 'LRT',
23: 'Bus',
24: 'MRT',
25: 'LRT',
26: 'Bus',
27: 'MRT',
28: 'LRT',
29: 'Bus',
30: 'MRT',
31: 'LRT',
32: 'Bus',
33: 'MRT',
34: 'LRT',
35: 'Bus',
36: 'MRT',
37: 'LRT',
38: 'Bus',
39: 'MRT',
40: 'LRT',
41: 'Bus',
42: 'MRT',
43: 'LRT',
44: 'Bus',
45: 'MRT',
46: 'LRT',
47: 'Bus',
48: 'MRT',
49: 'LRT',
50: 'Bus',
51: 'MRT',
52: 'LRT',
53: 'Bus',
54: 'MRT',
55: 'LRT',
56: 'Bus',
57: 'MRT',
58: 'LRT',
59: 'Bus',
60: 'MRT',
61: 'LRT',
62: 'Bus',
63: 'MRT',
64: 'LRT',
65: 'Bus',
66: 'MRT',
67: 'LRT',
68: 'Bus',
69: 'MRT',
70: 'LRT',
71: 'Bus',
72: 'MRT',
73: 'LRT',
74: 'Bus',
75: 'MRT',
76: 'LRT',
77: 'Bus',
78: 'MRT',
79: 'LRT',
80: 'Bus'},
'Mean Daily Ridership': {12: 986000.0,
13: 27000.0,
14: 3213000.0,
15: 1047000.0,
16: 39000.0,
17: 3251000.0,
18: 1071000.0,
19: 41000.0,
20: 3281000.0,
21: 1081000.0,
22: 39000.0,
23: 3197000.0,
24: 1171000.0,
25: 50000.0,
26: 2992000.0,
27: 1270000.0,
28: 55000.0,
29: 2805000.0,
30: 1321000.0,
31: 69000.0,
32: 2779000.0,
33: 1408000.0,
34: 74000.0,
35: 2833000.0,
36: 1527000.0,
37: 79000.0,
38: 2932000.0,
39: 1698000.0,
40: 88000.0,
41: 3087000.0,
42: 1782000.0,
43: 90000.0,
44: 3047000.0,
45: 2069000.0,
46: 100000.0,
47: 3199000.0,
48: 2295000.0,
49: 111000.0,
50: 3385000.0,
51: 2525000.0,
52: 124000.0,
53: 3481000.0,
54: 2623000.0,
55: 132000.0,
56: 3601000.0,
57: 2762000.0,
58: 137000.0,
59: 3751000.0,
60: 2871000.0,
61: 152000.0,
62: 3891000.0,
63: 3095000.0,
64: 180000.0,
65: 3939000.0,
66: 3122000.0,
67: 190000.0,
68: 3952000.0,
69: 3302000.0,
70: 199000.0,
71: 4037000.0,
72: 3384000.0,
73: 208000.0,
74: 4099000.0,
75: 2023000.0,
76: 139000.0,
77: 2878000.0,
78: 2100000.0,
79: 151000.0,
80: 3008000.0},
'Mean Daily Ridership [millions]': {12: 0.986,
13: 0.027,
14: 3.213,
15: 1.047,
16: 0.039,
17: 3.251,
18: 1.071,
19: 0.041,
20: 3.281,
21: 1.081,
22: 0.039,
23: 3.197,
24: 1.171,
25: 0.05,
26: 2.992,
27: 1.27,
28: 0.055,
29: 2.805,
30: 1.321,
31: 0.069,
32: 2.779,
33: 1.408,
34: 0.074,
35: 2.833,
36: 1.527,
37: 0.079,
38: 2.932,
39: 1.698,
40: 0.088,
41: 3.087,
42: 1.782,
43: 0.09,
44: 3.047,
45: 2.069,
46: 0.1,
47: 3.199,
48: 2.295,
49: 0.111,
50: 3.385,
51: 2.525,
52: 0.124,
53: 3.481,
54: 2.623,
55: 0.132,
56: 3.601,
57: 2.762,
58: 0.137,
59: 3.751,
60: 2.871,
61: 0.152,
62: 3.891,
63: 3.095,
64: 0.18,
65: 3.939,
66: 3.122,
67: 0.19,
68: 3.952,
69: 3.302,
70: 0.199,
71: 4.037,
72: 3.384,
73: 0.208,
74: 4.099,
75: 2.023,
76: 0.139,
77: 2.878,
78: 2.1,
79: 0.151,
80: 3.008}}
This is my second dataframe:
{'year': {0: 2005,
1: 2005,
2: 2005,
3: 2005,
4: 2005,
5: 2005,
6: 2005,
7: 2005,
8: 2005,
9: 2005,
10: 2005,
11: 2005,
12: 2005,
13: 2005,
14: 2005,
15: 2005,
16: 2005,
17: 2005,
18: 2005,
19: 2005,
20: 2005,
21: 2006,
22: 2006,
23: 2006,
24: 2006,
25: 2006,
26: 2006,
27: 2006,
28: 2006,
29: 2006,
30: 2006,
31: 2006,
32: 2006,
33: 2006,
34: 2006,
35: 2006,
36: 2006,
37: 2006,
38: 2006,
39: 2006,
40: 2006,
41: 2006,
42: 2007,
43: 2007,
44: 2007,
45: 2007,
46: 2007,
47: 2007,
48: 2007,
49: 2007,
50: 2007,
51: 2007,
52: 2007,
53: 2007,
54: 2007,
55: 2007,
56: 2007,
57: 2007,
58: 2007,
59: 2007,
60: 2007,
61: 2007,
62: 2007,
63: 2008,
64: 2008,
65: 2008,
66: 2008,
67: 2008,
68: 2008,
69: 2008,
70: 2008,
71: 2008,
72: 2008,
73: 2008,
74: 2008,
75: 2008,
76: 2008,
77: 2008,
78: 2008,
79: 2008,
80: 2008,
81: 2008,
82: 2008,
83: 2008,
84: 2009,
85: 2009,
86: 2009,
87: 2009,
88: 2009,
89: 2009,
90: 2009,
91: 2009,
92: 2009,
93: 2009,
94: 2009,
95: 2009,
96: 2009,
97: 2009,
98: 2009,
99: 2009,
100: 2009,
101: 2009,
102: 2009,
103: 2009,
104: 2009,
105: 2010,
106: 2010,
107: 2010,
108: 2010,
109: 2010,
110: 2010,
111: 2010,
112: 2010,
113: 2010,
114: 2010,
115: 2010,
116: 2010,
117: 2010,
118: 2010,
119: 2010,
120: 2010,
121: 2010,
122: 2010,
123: 2010,
124: 2010,
125: 2010,
126: 2011,
127: 2011,
128: 2011,
129: 2011,
130: 2011,
131: 2011,
132: 2011,
133: 2011,
134: 2011,
135: 2011,
136: 2011,
137: 2011,
138: 2011,
139: 2011,
140: 2011,
141: 2011,
142: 2011,
143: 2011,
144: 2011,
145: 2011,
146: 2011,
147: 2012,
148: 2012,
149: 2012,
150: 2012,
151: 2012,
152: 2012,
153: 2012,
154: 2012,
155: 2012,
156: 2012,
157: 2012,
158: 2012,
159: 2012,
160: 2012,
161: 2012,
162: 2012,
163: 2012,
164: 2012,
165: 2012,
166: 2012,
167: 2012,
168: 2013,
169: 2013,
170: 2013,
171: 2013,
172: 2013,
173: 2013,
174: 2013,
175: 2013,
176: 2013,
177: 2013,
178: 2013,
179: 2013,
180: 2013,
181: 2013,
182: 2013,
183: 2013,
184: 2013,
185: 2013,
186: 2013,
187: 2013,
188: 2013,
189: 2014,
190: 2014,
191: 2014,
192: 2014,
193: 2014,
194: 2014,
195: 2014,
196: 2014,
197: 2014,
198: 2014,
199: 2014,
200: 2014,
201: 2014,
202: 2014,
203: 2014,
204: 2014,
205: 2014,
206: 2014,
207: 2014,
208: 2014,
209: 2014,
210: 2015,
211: 2015,
212: 2015,
213: 2015,
214: 2015,
215: 2015,
216: 2015,
217: 2015,
218: 2015,
219: 2015,
220: 2015,
221: 2015,
222: 2015,
223: 2015,
224: 2015,
225: 2015,
226: 2015,
227: 2015,
228: 2015,
229: 2015,
230: 2015,
231: 2016,
232: 2016,
233: 2016,
234: 2016,
235: 2016,
236: 2016,
237: 2016,
238: 2016,
239: 2016,
240: 2016,
241: 2016,
242: 2016,
243: 2016,
244: 2016,
245: 2016,
246: 2016,
247: 2016,
248: 2016,
249: 2016,
250: 2016,
251: 2016,
252: 2017,
253: 2017,
254: 2017,
255: 2017,
256: 2017,
257: 2017,
258: 2017,
259: 2017,
260: 2017,
261: 2017,
262: 2017,
263: 2017,
264: 2017,
265: 2017,
266: 2017,
267: 2017,
268: 2017,
269: 2017,
270: 2017,
271: 2017,
272: 2017,
273: 2018,
274: 2018,
275: 2018,
276: 2018,
277: 2018,
278: 2018,
279: 2018,
280: 2018,
281: 2018,
282: 2018,
283: 2018,
284: 2018,
285: 2018,
286: 2018,
287: 2018,
288: 2018,
289: 2018,
290: 2018,
291: 2018,
292: 2018,
293: 2018,
294: 2019,
295: 2019,
296: 2019,
297: 2019,
298: 2019,
299: 2019,
300: 2019,
301: 2019,
302: 2019,
303: 2019,
304: 2019,
305: 2019,
306: 2019,
307: 2019,
308: 2019,
309: 2019,
310: 2019,
311: 2019,
312: 2019,
313: 2019,
314: 2019,
315: 2020,
316: 2020,
317: 2020,
318: 2020,
319: 2020,
320: 2020,
321: 2020,
322: 2020,
323: 2020,
324: 2020,
325: 2020,
326: 2020,
327: 2020,
328: 2020,
329: 2020,
330: 2020,
331: 2020,
332: 2020,
333: 2020,
334: 2020,
335: 2020,
336: 2021,
337: 2021,
338: 2021,
339: 2021,
340: 2021,
341: 2021,
342: 2021,
343: 2021,
344: 2021,
345: 2021,
346: 2021,
347: 2021,
348: 2021,
349: 2021,
350: 2021,
351: 2021,
352: 2021,
353: 2021,
354: 2021,
355: 2021,
356: 2021},
'age_years': {0: '0-<1',
1: '1-<2',
2: '2-<3',
3: '3-<4',
4: '4-<5',
5: '5-<6',
6: '6-<7',
7: '7-<8',
8: '8-<9',
9: '9-<10',
10: '10-<11',
11: '11-<12',
12: '12-<13',
13: '13-<14',
14: '14-<15',
15: '15-<16',
16: '16-<17',
17: '17-<18',
18: '18-<19',
19: '19-<20',
20: '20->',
21: '0-<1',
22: '1-<2',
23: '2-<3',
24: '3-<4',
25: '4-<5',
26: '5-<6',
27: '6-<7',
28: '7-<8',
29: '8-<9',
30: '9-<10',
31: '10-<11',
32: '11-<12',
33: '12-<13',
34: '13-<14',
35: '14-<15',
36: '15-<16',
37: '16-<17',
38: '17-<18',
39: '18-<19',
40: '19-<20',
41: '20->',
42: '0-<1',
43: '1-<2',
44: '2-<3',
45: '3-<4',
46: '4-<5',
47: '5-<6',
48: '6-<7',
49: '7-<8',
50: '8-<9',
51: '9-<10',
52: '10-<11',
53: '11-<12',
54: '12-<13',
55: '13-<14',
56: '14-<15',
57: '15-<16',
58: '16-<17',
59: '17-<18',
60: '18-<19',
61: '19-<20',
62: '20->',
63: '0-<1',
64: '1-<2',
65: '2-<3',
66: '3-<4',
67: '4-<5',
68: '5-<6',
69: '6-<7',
70: '7-<8',
71: '8-<9',
72: '9-<10',
73: '10-<11',
74: '11-<12',
75: '12-<13',
76: '13-<14',
77: '14-<15',
78: '15-<16',
79: '16-<17',
80: '17-<18',
81: '18-<19',
82: '19-<20',
83: '20->',
84: '0-<1',
85: '1-<2',
86: '2-<3',
87: '3-<4',
88: '4-<5',
89: '5-<6',
90: '6-<7',
91: '7-<8',
92: '8-<9',
93: '9-<10',
94: '10-<11',
95: '11-<12',
96: '12-<13',
97: '13-<14',
98: '14-<15',
99: '15-<16',
100: '16-<17',
101: '17-<18',
102: '18-<19',
103: '19-<20',
104: '20->',
105: '0-<1',
106: '1-<2',
107: '2-<3',
108: '3-<4',
109: '4-<5',
110: '5-<6',
111: '6-<7',
112: '7-<8',
113: '8-<9',
114: '9-<10',
115: '10-<11',
116: '11-<12',
117: '12-<13',
118: '13-<14',
119: '14-<15',
120: '15-<16',
121: '16-<17',
122: '17-<18',
123: '18-<19',
124: '19-<20',
125: '20->',
126: '0-<1',
127: '1-<2',
128: '2-<3',
129: '3-<4',
130: '4-<5',
131: '5-<6',
132: '6-<7',
133: '7-<8',
134: '8-<9',
135: '9-<10',
136: '10-<11',
137: '11-<12',
138: '12-<13',
139: '13-<14',
140: '14-<15',
141: '15-<16',
142: '16-<17',
143: '17-<18',
144: '18-<19',
145: '19-<20',
146: '20->',
147: '0-<1',
148: '1-<2',
149: '2-<3',
150: '3-<4',
151: '4-<5',
152: '5-<6',
153: '6-<7',
154: '7-<8',
155: '8-<9',
156: '9-<10',
157: '10-<11',
158: '11-<12',
159: '12-<13',
160: '13-<14',
161: '14-<15',
162: '15-<16',
163: '16-<17',
164: '17-<18',
165: '18-<19',
166: '19-<20',
167: '20->',
168: '0-<1',
169: '1-<2',
170: '2-<3',
171: '3-<4',
172: '4-<5',
173: '5-<6',
174: '6-<7',
175: '7-<8',
176: '8-<9',
177: '9-<10',
178: '10-<11',
179: '11-<12',
180: '12-<13',
181: '13-<14',
182: '14-<15',
183: '15-<16',
184: '16-<17',
185: '17-<18',
186: '18-<19',
187: '19-<20',
188: '20->',
189: '0-<1',
190: '1-<2',
191: '2-<3',
192: '3-<4',
193: '4-<5',
194: '5-<6',
195: '6-<7',
196: '7-<8',
197: '8-<9',
198: '9-<10',
199: '10-<11',
200: '11-<12',
201: '12-<13',
202: '13-<14',
203: '14-<15',
204: '15-<16',
205: '16-<17',
206: '17-<18',
207: '18-<19',
208: '19-<20',
209: '20->',
210: '0-<1',
211: '1-<2',
212: '2-<3',
213: '3-<4',
214: '4-<5',
215: '5-<6',
216: '6-<7',
217: '7-<8',
218: '8-<9',
219: '9-<10',
220: '10-<11',
221: '11-<12',
222: '12-<13',
223: '13-<14',
224: '14-<15',
225: '15-<16',
226: '16-<17',
227: '17-<18',
228: '18-<19',
229: '19-<20',
230: '20->',
231: '0-<1',
232: '1-<2',
233: '2-<3',
234: '3-<4',
235: '4-<5',
236: '5-<6',
237: '6-<7',
238: '7-<8',
239: '8-<9',
240: '9-<10',
241: '10-<11',
242: '11-<12',
243: '12-<13',
244: '13-<14',
245: '14-<15',
246: '15-<16',
247: '16-<17',
248: '17-<18',
249: '18-<19',
250: '19-<20',
251: '20->',
252: '0-<1',
253: '1-<2',
254: '2-<3',
255: '3-<4',
256: '4-<5',
257: '5-<6',
258: '6-<7',
259: '7-<8',
260: '8-<9',
261: '9-<10',
262: '10-<11',
263: '11-<12',
264: '12-<13',
265: '13-<14',
266: '14-<15',
267: '15-<16',
268: '16-<17',
269: '17-<18',
270: '18-<19',
271: '19-<20',
272: '20->',
273: '0-<1',
274: '1-<2',
275: '2-<3',
276: '3-<4',
277: '4-<5',
278: '5-<6',
279: '6-<7',
280: '7-<8',
281: '8-<9',
282: '9-<10',
283: '10-<11',
284: '11-<12',
285: '12-<13',
286: '13-<14',
287: '14-<15',
288: '15-<16',
289: '16-<17',
290: '17-<18',
291: '18-<19',
292: '19-<20',
293: '20->',
294: '0-<1',
295: '1-<2',
296: '2-<3',
297: '3-<4',
298: '4-<5',
299: '5-<6',
300: '6-<7',
301: '7-<8',
302: '8-<9',
303: '9-<10',
304: '10-<11',
305: '11-<12',
306: '12-<13',
307: '13-<14',
308: '14-<15',
309: '15-<16',
310: '16-<17',
311: '17-<18',
312: '18-<19',
313: '19-<20',
314: '20->',
315: '0-<1',
316: '1-<2',
317: '2-<3',
318: '3-<4',
319: '4-<5',
320: '5-<6',
321: '6-<7',
322: '7-<8',
323: '8-<9',
324: '9-<10',
325: '10-<11',
326: '11-<12',
327: '12-<13',
328: '13-<14',
329: '14-<15',
330: '15-<16',
331: '16-<17',
332: '17-<18',
333: '18-<19',
334: '19-<20',
335: '20->',
336: '0-<1',
337: '1-<2',
338: '2-<3',
339: '3-<4',
340: '4-<5',
341: '5-<6',
342: '6-<7',
343: '7-<8',
344: '8-<9',
345: '9-<10',
346: '10-<11',
347: '11-<12',
348: '12-<13',
349: '13-<14',
350: '14-<15',
351: '15-<16',
352: '16-<17',
353: '17-<18',
354: '18-<19',
355: '19-<20',
356: '20->'},
'number': {0: 776,
1: 684,
2: 699,
3: 639,
4: 840,
5: 1290,
6: 819,
7: 767,
8: 922,
9: 674,
10: 1006,
11: 879,
12: 812,
13: 666,
14: 664,
15: 504,
16: 342,
17: 211,
18: 16,
19: 10,
20: 0,
21: 985,
22: 778,
23: 686,
24: 701,
25: 629,
26: 816,
27: 1249,
28: 789,
29: 748,
30: 901,
31: 640,
32: 1004,
33: 865,
34: 784,
35: 639,
36: 625,
37: 483,
38: 312,
39: 193,
40: 4,
41: 0,
42: 775,
43: 981,
44: 777,
45: 687,
46: 695,
47: 611,
48: 798,
49: 1225,
50: 768,
51: 729,
52: 885,
53: 630,
54: 999,
55: 856,
56: 761,
57: 604,
58: 615,
59: 458,
60: 281,
61: 57,
62: 0,
63: 1506,
64: 778,
65: 980,
66: 775,
67: 686,
68: 695,
69: 598,
70: 783,
71: 1198,
72: 747,
73: 691,
74: 874,
75: 627,
76: 993,
77: 837,
78: 718,
79: 589,
80: 443,
81: 378,
82: 80,
83: 0,
84: 1376,
85: 1505,
86: 778,
87: 978,
88: 773,
89: 681,
90: 687,
91: 575,
92: 759,
93: 1165,
94: 704,
95: 684,
96: 870,
97: 618,
98: 979,
99: 812,
100: 683,
101: 515,
102: 372,
103: 145,
104: 0,
105: 1088,
106: 1376,
107: 1509,
108: 781,
109: 976,
110: 773,
111: 678,
112: 673,
113: 559,
114: 730,
115: 1063,
116: 688,
117: 680,
118: 862,
119: 604,
120: 959,
121: 790,
122: 618,
123: 436,
124: 93,
125: 0,
126: 1502,
127: 1089,
128: 1376,
129: 1509,
130: 781,
131: 975,
132: 773,
133: 672,
134: 658,
135: 530,
136: 670,
137: 1053,
138: 683,
139: 671,
140: 841,
141: 580,
142: 941,
143: 701,
144: 543,
145: 104,
146: 0,
147: 1130,
148: 1501,
149: 1086,
150: 1375,
151: 1508,
152: 781,
153: 971,
154: 766,
155: 663,
156: 605,
157: 421,
158: 654,
159: 1044,
160: 675,
161: 652,
162: 831,
163: 565,
164: 777,
165: 580,
166: 183,
167: 0,
168: 1322,
169: 1130,
170: 1498,
171: 1087,
172: 1371,
173: 1504,
174: 782,
175: 971,
176: 750,
177: 636,
178: 467,
179: 415,
180: 639,
181: 998,
182: 631,
183: 606,
184: 798,
185: 485,
186: 713,
187: 262,
188: 0,
189: 1666,
190: 1323,
191: 1130,
192: 1495,
193: 1085,
194: 1363,
195: 1495,
196: 779,
197: 963,
198: 711,
199: 519,
200: 459,
201: 401,
202: 623,
203: 892,
204: 521,
205: 509,
206: 559,
207: 382,
208: 234,
209: 0,
210: 1780,
211: 1666,
212: 1322,
213: 1128,
214: 1493,
215: 1082,
216: 1358,
217: 1483,
218: 770,
219: 937,
220: 610,
221: 488,
222: 437,
223: 371,
224: 584,
225: 802,
226: 453,
227: 370,
228: 466,
229: 140,
230: 0,
231: 1768,
232: 1780,
233: 1666,
234: 1320,
235: 1125,
236: 1487,
237: 1079,
238: 1355,
239: 1477,
240: 746,
241: 837,
242: 591,
243: 461,
244: 405,
245: 330,
246: 514,
247: 689,
248: 280,
249: 260,
250: 168,
251: 0,
252: 1656,
253: 1766,
254: 1778,
255: 1664,
256: 1307,
257: 1123,
258: 1474,
259: 1071,
260: 1333,
261: 1441,
262: 671,
263: 814,
264: 552,
265: 432,
266: 374,
267: 280,
268: 436,
269: 391,
270: 168,
271: 83,
272: 0,
273: 909,
274: 1661,
275: 1768,
276: 1776,
277: 1659,
278: 1298,
279: 1125,
280: 1464,
281: 1060,
282: 1311,
283: 1319,
284: 655,
285: 780,
286: 499,
287: 372,
288: 338,
289: 251,
290: 325,
291: 268,
292: 109,
293: 0,
294: 1098,
295: 924,
296: 1664,
297: 1773,
298: 1778,
299: 1654,
300: 1295,
301: 1124,
302: 1456,
303: 1039,
304: 1240,
305: 1279,
306: 603,
307: 748,
308: 415,
309: 316,
310: 318,
311: 185,
312: 239,
313: 178,
314: 0,
315: 596,
316: 1101,
317: 924,
318: 1658,
319: 1759,
320: 1746,
321: 1601,
322: 1251,
323: 1082,
324: 1413,
325: 949,
326: 1202,
327: 1214,
328: 572,
329: 704,
330: 337,
331: 294,
332: 233,
333: 143,
334: 132,
335: 1,
336: 607,
337: 605,
338: 1102,
339: 924,
340: 1645,
341: 1746,
342: 1725,
343: 1575,
344: 1213,
345: 1028,
346: 1247,
347: 895,
348: 1138,
349: 1147,
350: 530,
351: 570,
352: 309,
353: 253,
354: 190,
355: 71,
356: 0}}
I need help merging the two dataframes. I want to keep all of dataframe 1 and append dataframe 2 to it only where the year is 2005 or later and the type of public transport is Bus. The age_years and number columns from dataframe 2 should be filled in only on those rows (Bus, 2005 and above); every other row should have no value in them. Thank you.
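One possible approach, as a minimal sketch: tag the rows of dataframe 2 with the transport type "Bus" and then do a left merge on both year and transport type. The small `df1` and `df2` below are cut-down stand-ins for the two dataframes above (same column names, only a few rows), and the names `right` and `merged` are just illustrative.

```python
import pandas as pd

# Cut-down stand-ins for the two dataframes in the question (same column names).
df1 = pd.DataFrame({
    "Year": [2004, 2004, 2004, 2005, 2005, 2005],
    "Type of Public Transport": ["MRT", "LRT", "Bus"] * 2,
    "Mean Daily Ridership": [1270000.0, 55000.0, 2805000.0,
                             1321000.0, 69000.0, 2779000.0],
})
df2 = pd.DataFrame({
    "year": [2005, 2005, 2005],
    "age_years": ["0-<1", "1-<2", "2-<3"],
    "number": [776, 684, 699],
})

# Keep only 2005 and later, align the key name with df1, and tag every row
# as "Bus" so it can only ever match Bus rows in df1.
right = (
    df2[df2["year"] >= 2005]
    .rename(columns={"year": "Year"})
    .assign(**{"Type of Public Transport": "Bus"})
)

# A left merge keeps every row of df1; rows without a match get NaN
# in the age_years and number columns.
merged = df1.merge(right, on=["Year", "Type of Public Transport"], how="left")
print(merged)
```

Note that each matching Bus row is repeated once per age group (21 times per year with the full data), and `number` becomes a float column because of the NaN values. If one row per year is wanted instead, dataframe 2 would need to be aggregated or pivoted before merging.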
I need help with my grouped stacked bar plot: I can't set the distance between bar labels in Altair.
This is my code:
chart = alt.Chart(chain_and_prices_for_bar, title='Распределение средних цен различных ценовых категорий среди аптечных сетей в разрезе страны производства').mark_bar().encode(
x=alt.X('category_of_price:N', stack='zero', sort=['Низкая', 'Ниже среднего', 'Средняя', 'Выше среднего', 'Высокая', 'Самая высокая'], title=None, axis=alt.Axis(labelAngle=-45, labelOverlap=False)),
y=alt.Y('mean_price_of_medicine:Q', axis=alt.Axis(grid=False, title='Суммарная средняя цена'), scale=alt.Scale(domain=[0, 201], bins=[i for i in range(211) if i%10 ==0])),
#column=alt.Column('retail_chain:N', title=None, sort=list_of_top_pharmacies, header=alt.Header(labelFontSize=11, labelFontStyle='bold')),
order=alt.Order(
'is_import', sort='ascending'),
color=alt.Color('is_import:N', scale=alt.Scale(range=['#96ceb4', '#ffcc5c']),
legend=alt.Legend(title='Страна производства'))
).properties(
width=100,
height=600)
#chart = chart.configure_view(strokeOpacity=0)
chart.configure_title(fontSize=18, anchor='middle', align='center', dy=-10)
text = alt.Chart(chain_and_prices_for_bar).mark_text(dx=-1, dy=2, color='black', align='center', baseline='bottom', angle=270).encode(
x=alt.X('category_of_price:N', stack='zero', sort=['Низкая', 'Ниже среднего', 'Средняя', 'Выше среднего', 'Высокая', 'Самая высокая'], title=None, axis=alt.Axis(labelAngle=-45, labelOverlap=False)),
y=alt.Y('mean_price_of_medicine:Q'),
detail='retail_chain:N',
text=alt.Text('mean_price_of_medicine:Q', format='.2f'))
alt.layer(
chart, text, data=chain_and_prices_for_bar).facet(
facet=alt.Column('retail_chain:N', title=None, sort=list_of_top_pharmacies, header=alt.Header(labelFontSize=11, labelFontStyle='bold')),
).configure_view(continuousHeight=200, continuousWidth= 0.5).configure_facet(spacing=0.5)
This is what I got:
The numbers overlap, and I need to fix that.
chain_and_prices_for_bar = pd.DataFrame(my_dict)  # run this after my_dict (defined below) exists
list_of_top_pharmacies = ['Гродненское РУП Фармация', 'Альфа-аптека', 'Планета Здоровья', 'Моя Аптека', 'Остров здоровья', 'Биотест', 'Искамед', 'ADEL','Inlek']
my_dict = {'retail_chain': {0: 'ADEL',
34: 'Альфа-аптека',
72: 'Моя Аптека',
36: 'Биотест',
38: 'Биотест',
86: 'Остров здоровья',
40: 'Биотест',
42: 'Биотест',
84: 'Остров здоровья',
44: 'Биотест',
46: 'Биотест',
82: 'Моя Аптека',
48: 'Гродненское РУП Фармация',
50: 'Гродненское РУП Фармация',
80: 'Моя Аптека',
52: 'Гродненское РУП Фармация',
106: 'Планета Здоровья',
54: 'Гродненское РУП Фармация',
56: 'Гродненское РУП Фармация',
78: 'Моя Аптека',
58: 'Гродненское РУП Фармация',
60: 'Искамед',
76: 'Моя Аптека',
62: 'Искамед',
64: 'Искамед',
74: 'Моя Аптека',
66: 'Искамед',
68: 'Искамед',
32: 'Альфа-аптека',
90: 'Остров здоровья',
88: 'Остров здоровья',
98: 'Планета Здоровья',
2: 'ADEL',
104: 'Планета Здоровья',
4: 'ADEL',
6: 'ADEL',
102: 'Планета Здоровья',
8: 'ADEL',
10: 'ADEL',
100: 'Планета Здоровья',
12: 'Inlek',
14: 'Inlek',
30: 'Альфа-аптека',
16: 'Inlek',
18: 'Inlek',
70: 'Искамед',
96: 'Планета Здоровья',
20: 'Inlek',
28: 'Альфа-аптека',
92: 'Остров здоровья',
22: 'Inlek',
94: 'Остров здоровья',
24: 'Альфа-аптека',
26: 'Альфа-аптека',
105: 'Планета Здоровья',
73: 'Моя Аптека',
89: 'Остров здоровья',
75: 'Моя Аптека',
103: 'Планета Здоровья',
93: 'Остров здоровья',
87: 'Остров здоровья',
83: 'Моя Аптека',
101: 'Планета Здоровья',
79: 'Моя Аптека',
99: 'Планета Здоровья',
85: 'Остров здоровья',
95: 'Остров здоровья',
81: 'Моя Аптека',
97: 'Планета Здоровья',
77: 'Моя Аптека',
91: 'Остров здоровья',
53: 'Гродненское РУП Фармация',
69: 'Искамед',
27: 'Альфа-аптека',
25: 'Альфа-аптека',
23: 'Inlek',
21: 'Inlek',
19: 'Inlek',
17: 'Inlek',
29: 'Альфа-аптека',
15: 'Inlek',
11: 'ADEL',
9: 'ADEL',
7: 'ADEL',
5: 'ADEL',
3: 'ADEL',
1: 'ADEL',
13: 'Inlek',
31: 'Альфа-аптека',
33: 'Альфа-аптека',
35: 'Альфа-аптека',
67: 'Искамед',
65: 'Искамед',
63: 'Искамед',
61: 'Искамед',
59: 'Гродненское РУП Фармация',
57: 'Гродненское РУП Фармация',
55: 'Гродненское РУП Фармация',
51: 'Гродненское РУП Фармация',
49: 'Гродненское РУП Фармация',
47: 'Биотест',
45: 'Биотест',
43: 'Биотест',
41: 'Биотест',
39: 'Биотест',
37: 'Биотест',
71: 'Искамед',
107: 'Планета Здоровья'},
'category_of_price': {0: 'Низкая',
34: 'Самая высокая',
72: 'Низкая',
36: 'Низкая',
38: 'Ниже среднего',
86: 'Ниже среднего',
40: 'Средняя',
42: 'Выше среднего',
84: 'Низкая',
44: 'Высокая',
46: 'Самая высокая',
82: 'Самая высокая',
48: 'Низкая',
50: 'Ниже среднего',
80: 'Высокая',
52: 'Средняя',
106: 'Самая высокая',
54: 'Выше среднего',
56: 'Высокая',
78: 'Выше среднего',
58: 'Самая высокая',
60: 'Низкая',
76: 'Средняя',
62: 'Ниже среднего',
64: 'Средняя',
74: 'Ниже среднего',
66: 'Выше среднего',
68: 'Высокая',
32: 'Высокая',
90: 'Выше среднего',
88: 'Средняя',
98: 'Ниже среднего',
2: 'Ниже среднего',
104: 'Высокая',
4: 'Средняя',
6: 'Выше среднего',
102: 'Выше среднего',
8: 'Высокая',
10: 'Самая высокая',
100: 'Средняя',
12: 'Низкая',
14: 'Ниже среднего',
30: 'Выше среднего',
16: 'Средняя',
18: 'Выше среднего',
70: 'Самая высокая',
96: 'Низкая',
20: 'Высокая',
28: 'Средняя',
92: 'Высокая',
22: 'Самая высокая',
94: 'Самая высокая',
24: 'Низкая',
26: 'Ниже среднего',
105: 'Высокая',
73: 'Низкая',
89: 'Средняя',
75: 'Ниже среднего',
103: 'Выше среднего',
93: 'Высокая',
87: 'Ниже среднего',
83: 'Самая высокая',
101: 'Средняя',
79: 'Выше среднего',
99: 'Ниже среднего',
85: 'Низкая',
95: 'Самая высокая',
81: 'Высокая',
97: 'Низкая',
77: 'Средняя',
91: 'Выше среднего',
53: 'Средняя',
69: 'Высокая',
27: 'Ниже среднего',
25: 'Низкая',
23: 'Самая высокая',
21: 'Высокая',
19: 'Выше среднего',
17: 'Средняя',
29: 'Средняя',
15: 'Ниже среднего',
11: 'Самая высокая',
9: 'Высокая',
7: 'Выше среднего',
5: 'Средняя',
3: 'Ниже среднего',
1: 'Низкая',
13: 'Низкая',
31: 'Выше среднего',
33: 'Высокая',
35: 'Самая высокая',
67: 'Выше среднего',
65: 'Средняя',
63: 'Ниже среднего',
61: 'Низкая',
59: 'Самая высокая',
57: 'Высокая',
55: 'Выше среднего',
51: 'Ниже среднего',
49: 'Низкая',
47: 'Самая высокая',
45: 'Высокая',
43: 'Выше среднего',
41: 'Средняя',
39: 'Ниже среднего',
37: 'Низкая',
71: 'Самая высокая',
107: 'Самая высокая'},
'is_import': {0: 'Беларусь',
34: 'Беларусь',
72: 'Беларусь',
36: 'Беларусь',
38: 'Беларусь',
86: 'Беларусь',
40: 'Беларусь',
42: 'Беларусь',
84: 'Беларусь',
44: 'Беларусь',
46: 'Беларусь',
82: 'Беларусь',
48: 'Беларусь',
50: 'Беларусь',
80: 'Беларусь',
52: 'Беларусь',
106: 'Беларусь',
54: 'Беларусь',
56: 'Беларусь',
78: 'Беларусь',
58: 'Беларусь',
60: 'Беларусь',
76: 'Беларусь',
62: 'Беларусь',
64: 'Беларусь',
74: 'Беларусь',
66: 'Беларусь',
68: 'Беларусь',
32: 'Беларусь',
90: 'Беларусь',
88: 'Беларусь',
98: 'Беларусь',
2: 'Беларусь',
104: 'Беларусь',
4: 'Беларусь',
6: 'Беларусь',
102: 'Беларусь',
8: 'Беларусь',
10: 'Беларусь',
100: 'Беларусь',
12: 'Беларусь',
14: 'Беларусь',
30: 'Беларусь',
16: 'Беларусь',
18: 'Беларусь',
70: 'Беларусь',
96: 'Беларусь',
20: 'Беларусь',
28: 'Беларусь',
92: 'Беларусь',
22: 'Беларусь',
94: 'Беларусь',
24: 'Беларусь',
26: 'Беларусь',
105: 'Импорт',
73: 'Импорт',
89: 'Импорт',
75: 'Импорт',
103: 'Импорт',
93: 'Импорт',
87: 'Импорт',
83: 'Импорт',
101: 'Импорт',
79: 'Импорт',
99: 'Импорт',
85: 'Импорт',
95: 'Импорт',
81: 'Импорт',
97: 'Импорт',
77: 'Импорт',
91: 'Импорт',
53: 'Импорт',
69: 'Импорт',
27: 'Импорт',
25: 'Импорт',
23: 'Импорт',
21: 'Импорт',
19: 'Импорт',
17: 'Импорт',
29: 'Импорт',
15: 'Импорт',
11: 'Импорт',
9: 'Импорт',
7: 'Импорт',
5: 'Импорт',
3: 'Импорт',
1: 'Импорт',
13: 'Импорт',
31: 'Импорт',
33: 'Импорт',
35: 'Импорт',
67: 'Импорт',
65: 'Импорт',
63: 'Импорт',
61: 'Импорт',
59: 'Импорт',
57: 'Импорт',
55: 'Импорт',
51: 'Импорт',
49: 'Импорт',
47: 'Импорт',
45: 'Импорт',
43: 'Импорт',
41: 'Импорт',
39: 'Импорт',
37: 'Импорт',
71: 'Импорт',
107: 'Импорт'},
'mean_price_of_medicine': {0: 4.92,
34: 78.74,
72: 5.1,
36: 5.09,
38: 15.15,
86: 14.92,
40: 25.95,
42: 38.38,
84: 5.37,
44: 48.12,
46: 84.02,
82: 83.49,
48: 5.28,
50: 15.13,
80: 49.23,
52: 26.11,
106: 86.08,
54: 38.06,
56: 49.25,
78: 37.33,
58: 83.79,
60: 5.18,
76: 26.22,
62: 15.19,
64: 26.29,
74: 14.81,
66: 38.48,
68: 48.93,
32: 47.22,
90: 38.31,
88: 25.82,
98: 15.17,
2: 15.21,
104: 50.87,
4: 26.52,
6: 38.14,
102: 37.9,
8: 46.43,
10: 89.32,
100: 25.85,
12: 5.14,
14: 15.01,
30: 38.04,
16: 26.16,
18: 38.56,
70: 93.85,
96: 5.06,
20: 47.71,
28: 26.08,
92: 50.44,
22: 88.74,
94: 86.42,
24: 5.29,
26: 14.98,
105: 48.25,
73: 7.21,
89: 26.74,
75: 15.85,
103: 37.83,
93: 49.03,
87: 16.1,
83: 87.7,
101: 26.52,
79: 38.01,
99: 16.33,
85: 7.03,
95: 82.19,
81: 48.59,
97: 7.17,
77: 26.46,
91: 38.22,
53: 26.42,
69: 48.85,
27: 15.86,
25: 7.32,
23: 87.21,
21: 48.81,
19: 38.58,
17: 26.51,
29: 26.55,
15: 16.06,
11: 83.96,
9: 48.56,
7: 38.32,
5: 26.66,
3: 16.1,
1: 7.33,
13: 7.21,
31: 38.03,
33: 48.07,
35: 94.97,
67: 38.34,
65: 26.62,
63: 16.09,
61: 7.22,
59: 95.27,
57: 48.59,
55: 38.14,
51: 16.14,
49: 6.9,
47: 89.96,
45: 48.1,
43: 38.06,
41: 26.7,
39: 16.12,
37: 7.17,
71: 108.55,
107: 86.54},
'divided_mean_price_of_medicine': {0: 2.46,
34: 39.37,
72: 2.55,
36: 2.54,
38: 7.57,
86: 7.46,
40: 12.98,
42: 19.19,
84: 2.68,
44: 24.06,
46: 42.01,
82: 41.75,
48: 2.64,
50: 7.56,
80: 24.61,
52: 13.05,
106: 43.04,
54: 19.03,
56: 24.62,
78: 18.67,
58: 41.89,
60: 2.59,
76: 13.11,
62: 7.6,
64: 13.15,
74: 7.41,
66: 19.24,
68: 24.46,
32: 23.61,
90: 19.16,
88: 12.91,
98: 7.59,
2: 7.6,
104: 25.44,
4: 13.26,
6: 19.07,
102: 18.95,
8: 23.22,
10: 44.66,
100: 12.92,
12: 2.57,
14: 7.5,
30: 19.02,
16: 13.08,
18: 19.28,
70: 46.92,
96: 2.53,
20: 23.86,
28: 13.04,
92: 25.22,
22: 44.37,
94: 43.21,
24: 2.65,
26: 7.49,
105: 24.12,
73: 3.6,
89: 13.37,
75: 7.92,
103: 18.92,
93: 24.51,
87: 8.05,
83: 43.85,
101: 13.26,
79: 19.0,
99: 8.17,
85: 3.52,
95: 41.09,
81: 24.3,
97: 3.58,
77: 13.23,
91: 19.11,
53: 13.21,
69: 24.43,
27: 7.93,
25: 3.66,
23: 43.6,
21: 24.4,
19: 19.29,
17: 13.26,
29: 13.28,
15: 8.03,
11: 41.98,
9: 24.28,
7: 19.16,
5: 13.33,
3: 8.05,
1: 3.67,
13: 3.6,
31: 19.01,
33: 24.04,
35: 47.48,
67: 19.17,
65: 13.31,
63: 8.04,
61: 3.61,
59: 47.63,
57: 24.3,
55: 19.07,
51: 8.07,
49: 3.45,
47: 44.98,
45: 24.05,
43: 19.03,
41: 13.35,
39: 8.06,
37: 3.59,
71: 54.27,
107: 43.27}}
There is an example of this in the docs:
import altair as alt
from vega_datasets import data
source=data.barley()
bars = alt.Chart(source).mark_bar().encode(
x=alt.X('sum(yield):Q', stack='zero'),
y=alt.Y('variety:N'),
color=alt.Color('site')
)
text = alt.Chart(source).mark_text(dx=-15, dy=3, color='white').encode(
x=alt.X('sum(yield):Q', stack='zero'),
y=alt.Y('variety:N'),
detail='site:N',
text=alt.Text('sum(yield):Q', format='.1f')
)
bars + text
With faceting it can look like this:
import altair as alt
from vega_datasets import data
import random
source=data.barley()
source['group'] = [random.choice(['A', 'B']) for num in range(source.shape[0])]
bars = alt.Chart(source).mark_bar().encode(
x=alt.X('sum(yield):Q', stack='zero'),
y=alt.Y('variety:N'),
color=alt.Color('site')
)
text = alt.Chart(source).mark_text(dx=-15, dy=3, color='white').encode(
x=alt.X('sum(yield):Q', stack='zero'),
y=alt.Y('variety:N'),
detail='site:N',
text=alt.Text('sum(yield):Q', format='.1f')
)
(bars + text).facet(row='group')
I can't simplify my data, so I have included it in full.
I would like to build the best possible team of 11 players according to the "niveau" column.
Each "id" has a "niveau" rating for each value of the "statut" column.
I think it would be necessary to test all possible combinations of "niveau", with no duplicate "id", to find the best average level across the 11 players, but I don't know how to proceed.
Do you have any idea how to do this?
Thank you.
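Rather than testing every combination, this can be treated as an assignment problem: each "statut" gets exactly one "id", no "id" is used twice, and the total "niveau" is maximised. With a fixed team of 11, that also maximises the average. The sketch below uses `scipy.optimize.linear_sum_assignment` on a tiny made-up sample with the same column names. The sample values, the `-1e9` placeholder for missing pairs, and the names `scores`, `cost` and `team` are assumptions for illustration only.

```python
import pandas as pd
from scipy.optimize import linear_sum_assignment

# Tiny made-up sample with the same columns as the real data.
df = pd.DataFrame({
    "statut": ["titulaire_01", "titulaire_01", "titulaire_02", "titulaire_02"],
    "id": [1, 2, 1, 2],
    "niveau": [28.0, 25.0, 30.0, 10.0],
})

# Rows = positions, columns = players; pairs absent from the data become NaN.
scores = df.pivot_table(index="statut", columns="id",
                        values="niveau", aggfunc="max")

# Give absent pairs a very low score so they are never preferred.
cost = scores.fillna(-1e9).to_numpy()

# One player per position, no player used twice, maximum total niveau.
rows, cols = linear_sum_assignment(cost, maximize=True)

team = pd.DataFrame({
    "statut": scores.index[rows],
    "id": scores.columns[cols],
    "niveau": cost[rows, cols],
})
# Drop any position that could only be filled by an absent pair.
team = team[team["niveau"] > -1e9]
print(team)
print(team["niveau"].mean())
```

In this sample, picking the best player for each position independently would select id 1 twice; the assignment instead gives titulaire_02 to id 1 and titulaire_01 to id 2.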
import pandas as pd
data = {'statut': {0: 'titulaire_01', 1: 'titulaire_01', 2: 'titulaire_01', 3: 'titulaire_01', 4: 'titulaire_01', 5: 'titulaire_01', 6: 'titulaire_01', 7: 'titulaire_01', 8: 'titulaire_02', 9: 'titulaire_02', 10: 'titulaire_02', 11: 'titulaire_02', 12: 'titulaire_02', 13: 'titulaire_02', 14: 'titulaire_02', 15: 'titulaire_02', 16: 'titulaire_02', 17: 'titulaire_02', 18: 'titulaire_02', 19: 'titulaire_02', 20: 'titulaire_02', 21: 'titulaire_02', 22: 'titulaire_02', 23: 'titulaire_02', 24: 'titulaire_02', 25: 'titulaire_02', 26: 'titulaire_02', 27: 'titulaire_02', 28: 'titulaire_03', 29: 'titulaire_03', 30: 'titulaire_03', 31: 'titulaire_03', 32: 'titulaire_03', 33: 'titulaire_03', 34: 'titulaire_03', 35:
'titulaire_03', 36: 'titulaire_03', 37: 'titulaire_03', 38: 'titulaire_03', 39: 'titulaire_03', 40: 'titulaire_03', 41: 'titulaire_03', 42: 'titulaire_03', 43: 'titulaire_03', 44: 'titulaire_03', 45: 'titulaire_03', 46: 'titulaire_03', 47: 'titulaire_03', 48: 'titulaire_04', 49: 'titulaire_04', 50: 'titulaire_04', 51: 'titulaire_04', 52: 'titulaire_04', 53: 'titulaire_04', 54: 'titulaire_04', 55: 'titulaire_04', 56: 'titulaire_04', 57: 'titulaire_05', 58: 'titulaire_05', 59: 'titulaire_05', 60: 'titulaire_05', 61: 'titulaire_05', 62: 'titulaire_05', 63: 'titulaire_05', 64: 'titulaire_05', 65: 'titulaire_05', 66: 'titulaire_05', 67: 'titulaire_06', 68: 'titulaire_06', 69: 'titulaire_06', 70: 'titulaire_06', 71: 'titulaire_06', 72: 'titulaire_06', 73: 'titulaire_06', 74: 'titulaire_06', 75: 'titulaire_06', 76: 'titulaire_06', 77: 'titulaire_06', 78: 'titulaire_06', 79: 'titulaire_07', 80: 'titulaire_07', 81: 'titulaire_07', 82: 'titulaire_07', 83: 'titulaire_07', 84: 'titulaire_07', 85: 'titulaire_07', 86: 'titulaire_07', 87: 'titulaire_07', 88: 'titulaire_07', 89: 'titulaire_07', 90: 'titulaire_07', 91: 'titulaire_07', 92: 'titulaire_07', 93: 'titulaire_07', 94: 'titulaire_07', 95: 'titulaire_07', 96: 'titulaire_07', 97: 'titulaire_07', 98: 'titulaire_08', 99: 'titulaire_08', 100: 'titulaire_08', 101: 'titulaire_08', 102: 'titulaire_08', 103: 'titulaire_08', 104: 'titulaire_08', 105: 'titulaire_08', 106: 'titulaire_08', 107: 'titulaire_08', 108: 'titulaire_08', 109: 'titulaire_08', 110: 'titulaire_08', 111: 'titulaire_08', 112: 'titulaire_08', 113: 'titulaire_08', 114: 'titulaire_08', 115: 'titulaire_08', 116: 'titulaire_08', 117: 'titulaire_09', 118: 'titulaire_09', 119: 'titulaire_09', 120: 'titulaire_09', 121: 'titulaire_09', 122: 'titulaire_09', 123: 'titulaire_09', 124: 'titulaire_09', 125: 'titulaire_09', 126: 'titulaire_09', 127: 'titulaire_09', 128: 'titulaire_09', 129: 'titulaire_09', 130: 'titulaire_09', 131: 'titulaire_09', 132: 'titulaire_09', 133: 
'titulaire_09', 134: 'titulaire_09', 135: 'titulaire_09', 136: 'titulaire_10', 137: 'titulaire_10', 138: 'titulaire_10', 139: 'titulaire_10', 140: 'titulaire_10', 141: 'titulaire_10', 142: 'titulaire_10', 143: 'titulaire_10', 144: 'titulaire_10', 145: 'titulaire_10', 146: 'titulaire_10', 147: 'titulaire_10', 148: 'titulaire_10', 149: 'titulaire_10', 150: 'titulaire_10', 151: 'titulaire_10', 152: 'titulaire_10', 153: 'titulaire_10', 154: 'titulaire_10', 155: 'titulaire_10', 156: 'titulaire_10', 157: 'titulaire_10', 158: 'titulaire_11', 159: 'titulaire_11', 160: 'titulaire_11', 161: 'titulaire_11', 162: 'titulaire_11', 163: 'titulaire_11', 164: 'titulaire_11', 165: 'titulaire_11', 166: 'titulaire_11', 167: 'titulaire_11', 168: 'titulaire_11', 169: 'titulaire_11', 170: 'titulaire_11', 171: 'titulaire_11', 172: 'titulaire_11', 173: 'titulaire_11', 174: 'titulaire_11', 175: 'titulaire_11', 176: 'titulaire_11', 177: 'titulaire_11', 178: 'titulaire_11', 179: 'titulaire_11'}, 'id': {0: 2002134607, 1: 2002043469, 2: 67156610, 3: 73201503, 4: 2000165962, 5: 2000143545, 6: 2002042688, 7: 2000055323, 8: 49054631, 9: 48031358, 10: 49048802, 11: 2002042816, 12: 2000045508, 13: 73201458, 14: 67191910, 15: 2002134617, 16: 2002042628, 17: 2000023214, 18: 2000165961, 19: 2000121963, 20: 2000045487, 21: 2000006106, 22: 14196664, 23: 2000055604, 24: 2002043613, 25: 49054633, 26: 49037900, 27: 2002043635, 28: 48031358, 29: 49037900, 30: 2002043635, 31: 2000121963, 32: 2000165961, 33: 67191910, 34: 2002042816, 35: 73201458, 36: 49054633, 37: 2000045487, 38: 2002043613, 39: 2000006106, 40: 2000055604, 41: 2000023214, 42: 2000045508, 43: 2002042628, 44: 14196664, 45: 2002134617, 46: 49054631, 47: 49048802, 48: 49040506, 49: 85126966, 50: 83169864, 51: 2002043476, 52: 2000045508, 53: 2002043613, 54: 2002042669, 55: 2000023214, 56: 73201460, 57: 67211095, 58: 83169864, 59: 13196665, 60: 2000055604, 61: 2000011411, 62: 2000165964, 63: 73201458, 64: 2002042939, 65: 2002043635, 66: 2002043613, 
67: 2000045698, 68: 2002042722, 69: 2000132382, 70: 49054633, 71: 2002042845, 72: 2000045520, 73: 73201505, 74: 73201458, 75: 70137157, 76: 49040506, 77: 2002043635, 78: 2000143548, 79: 73200890, 80: 49060705, 81: 2000045543, 82: 2000045698,
83: 2000011617, 84: 2002042722, 85: 2002042642, 86: 2000113673, 87: 85137101, 88: 19217413, 89: 2000147147, 90: 2002042845, 91: 2002043003, 92: 2002042627, 93: 2002042966, 94: 2000047331, 95: 2002042666, 96: 2000134665, 97: 2002042690, 98: 2000011617, 99: 2000045698, 100: 49060705, 101: 2000047331, 102: 2000147147, 103: 2000134665, 104: 2000113673, 105: 73200890, 106: 2002042845, 107: 19217413, 108: 2000045543, 109: 2002043003, 110: 2002042722, 111: 2002042666, 112: 2002042966, 113: 2002042627, 114: 2002042690, 115: 2002042642, 116: 85137101, 117: 2000134665, 118: 2002042666, 119: 2002042627, 120: 2000047331, 121: 2002042966, 122: 2002043003, 123: 2002042690, 124: 2002042845, 125: 2000147147, 126: 19217413, 127: 85137101, 128: 2002042722, 129: 2002042642, 130: 2000045543, 131: 2000011617, 132: 2000113673, 133: 49060705, 134: 73200890, 135: 2000045698, 136: 62124125, 137: 2002043171, 138: 2000165960, 139: 2002134617, 140: 2002042690, 141: 2000047311, 142: 2000105477, 143: 2002042627, 144: 2000037444, 145: 49060705, 146: 2002042642, 147: 2002134611, 148: 2002043003, 149: 2002042966, 150: 73201412, 151: 2002042813, 152: 67256520, 153: 2000047306, 154: 2002042983, 155: 12092876, 156: 96026541, 157: 2002043636, 158: 2000165960, 159: 49060705, 160: 12092876, 161: 2002042690, 162: 2002134617, 163: 2002042642, 164: 73201412, 165: 62124125, 166: 2000105477, 167: 2002042966, 168: 96026541, 169: 2002042983, 170: 2000047311, 171: 2002043171, 172: 2002134611, 173: 2002042813, 174: 2000047306, 175: 67256520, 176: 2002043003, 177: 2002043636, 178: 2002042627, 179: 2000037444}, 'niveau': {0: 13.605263157894736, 1: 25.13157894736842, 2: 22.473684210526315, 3: 16.236842105263158, 4: 15.789473684210526, 5: 15.342105263157896, 6: 28.394736842105264, 7: 14.789473684210526, 8: 16.727272727272727, 9: 25.741935483870968, 10: 17.424242424242426, 11: 28.03030303030303, 12: 16.696969696969695, 13: 16.636363636363637, 14: 25.454545454545453, 15: 16.484848484848484, 16: 30.606060606060606, 17: 
16.424242424242426, 18: 17.151515151515152, 19: 17.151515151515152, 20: 19.151515151515152, 21: 22.03030303030303, 22: 25.272727272727273, 23: 19.818181818181817, 24: 25.12121212121212, 25: 20.272727272727273, 26: 28.09090909090909, 27: 26.0, 28: 26.06451612903226, 29: 28.545454545454547, 30: 26.242424242424242, 31: 17.454545454545453, 32: 17.606060606060606, 33: 25.757575757575758, 34: 28.333333333333332, 35: 17.09090909090909, 36: 20.575757575757574, 37: 19.454545454545453, 38: 25.272727272727273, 39: 21.575757575757574, 40: 20.12121212121212, 41: 15.969696969696969, 42: 16.393939393939394, 43: 30.303030303030305, 44: 25.515151515151516, 45: 16.939393939393938, 46: 17.03030303030303, 47: 17.87878787878788, 48: 18.142857142857142, 49: 24.37142857142857, 50: 24.057142857142857, 51: 25.4, 52: 15.17142857142857, 53: 23.34285714285714, 54: 28.142857142857142, 55: 15.085714285714285, 56: 16.257142857142856, 57: 23.34285714285714, 58: 23.771428571428572, 59: 22.6, 60: 18.285714285714285, 61: 18.685714285714287, 62: 16.514285714285716, 63: 15.82857142857143, 64: 25.885714285714286, 65: 26.142857142857142, 66: 23.485714285714284, 67: 17.564102564102566, 68: 28.384615384615383, 69: 17.153846153846153, 70: 18.205128205128204, 71: 25.46153846153846, 72: 15.512820512820513, 73: 14.615384615384615, 74: 14.846153846153847, 75: 17.564102564102566, 76: 17.487179487179485, 77: 24.974358974358974, 78: 14.461538461538462, 79: 22.5, 80: 20.0625, 81: 19.84375, 82: 18.9375, 83: 20.25, 84: 31.59375, 85: 33.1875, 86: 18.34375,
87: 24.71875, 88: 26.03125, 89: 18.09375, 90: 28.34375, 91: 29.1875, 92: 32.46875, 93: 30.09375, 94: 18.5625, 95: 31.9375, 96: 15.28125, 97: 32.3125, 98: 19.9375, 99: 18.625, 100: 19.8125, 101: 18.8125, 102: 18.40625, 103: 15.75, 104: 18.03125, 105: 22.1875, 106: 28.09375, 107: 26.34375, 108: 20.15625, 109: 29.4375, 110: 31.34375, 111: 31.78125, 112:
29.84375, 113: 32.21875, 114: 32.625, 115: 33.5, 116: 24.46875, 117: 15.870967741935484, 118: 31.483870967741936, 119: 32.354838709677416, 120: 18.29032258064516, 121: 29.741935483870968, 122: 29.677419354838708, 123: 32.41935483870968, 124: 28.129032258064516, 125: 18.032258064516128, 126: 26.06451612903226, 127: 24.70967741935484, 128: 31.838709677419356, 129: 33.61290322580645, 130: 20.35483870967742, 131: 19.129032258064516, 132: 18.580645161290324, 133: 20.419354838709676, 134: 22.483870967741936, 135: 19.451612903225808, 136: 23.59375, 137: 30.78125, 138: 19.28125, 139: 16.03125, 140: 31.78125, 141: 19.625, 142: 19.09375, 143: 32.0625, 144: 20.65625, 145: 20.625, 146: 32.96875, 147: 20.71875, 148: 29.15625, 149: 29.5, 150: 17.875, 151: 29.0625, 152: 21.28125, 153: 18.84375, 154: 28.4375, 155: 24.84375, 156: 26.53125, 157: 29.0625, 158: 18.8125, 159: 20.375, 160: 24.53125, 161: 32.09375, 162: 15.5625, 163: 33.28125, 164: 18.34375, 165: 23.125, 166: 18.625, 167: 29.25, 168: 26.84375, 169: 28.125, 170: 19.3125, 171: 30.53125, 172: 20.875, 173: 28.75, 174: 18.53125, 175: 21.03125, 176: 29.40625, 177: 29.375, 178: 31.8125, 179: 20.34375}}
df = pd.DataFrame(data)
print(df)
           statut          id     niveau
0    titulaire_01  2002134607  13.605263
1    titulaire_01  2002043469  25.131579
2    titulaire_01    67156610  22.473684
3    titulaire_01    73201503  16.236842
4    titulaire_01  2000165962  15.789474
..            ...         ...        ...
175  titulaire_11    67256520  21.031250
176  titulaire_11  2002043003  29.406250
177  titulaire_11  2002043636  29.375000
178  titulaire_11  2002042627  31.812500
179  titulaire_11  2000037444  20.343750
If I do groupby("statut") and keep the row with the max of the "niveau" column, I get duplicate "id" values, because the same "id" can appear under several groups ("titulaire_01", "titulaire_02", etc.).
The result should be 11 rows, one per "statut", with no duplicate "id".
This looks like an optimization (assignment) problem. You can pivot your data to a rectangular format, then use scipy.optimize.linear_sum_assignment:
from scipy.optimize import linear_sum_assignment

# rows = id, columns = statut, values = niveau (missing pairs are filled)
df2 = df.pivot_table(index='id', columns='statut', values='niveau',
                     fill_value=0)  # or fill_value=-np.inf
# pick one distinct id per statut so that the summed niveau is maximal
ID, statut = linear_sum_assignment(df2, maximize=True)
out = (pd.DataFrame({'statut': df2.columns[statut], 'id': df2.index[ID]})
         .sort_values(by='statut', ignore_index=True)
      )
output:
          statut          id
0   titulaire_01  2002042688
1   titulaire_02  2002042628
2   titulaire_03    49037900
3   titulaire_04  2002042669
4   titulaire_05  2002043635
5   titulaire_06  2002042722
6   titulaire_07  2002042666
7   titulaire_08  2002042690
8   titulaire_09  2002042627
9   titulaire_10  2002043171
10  titulaire_11  2002042642
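To make the difference between the two approaches concrete, here is a minimal sketch on a made-up toy frame (the values and the names toy, naive, scores and assigned are assumptions for illustration, not the asker's data). It contrasts a plain per-group max, which can pick the same id twice, with the pivot + linear_sum_assignment approach, which picks one distinct id per statut while maximizing the total niveau:

```python
import pandas as pd
from scipy.optimize import linear_sum_assignment

# Hypothetical toy data: id 1 has the highest niveau in both groups
toy = pd.DataFrame({
    "statut": ["A", "A", "B", "B"],
    "id": [1, 2, 1, 3],
    "niveau": [30.0, 20.0, 25.0, 10.0],
})

# Naive approach: best row per group, so id 1 is selected twice
naive = toy.loc[toy.groupby("statut")["niveau"].idxmax(), ["statut", "id"]]

# Assignment approach: rectangular id x statut score matrix,
# then one distinct id per statut with the largest summed niveau
scores = toy.pivot_table(index="id", columns="statut", values="niveau",
                         fill_value=0)
rows, cols = linear_sum_assignment(scores.to_numpy(), maximize=True)
assigned = (pd.DataFrame({"statut": scores.columns[cols],
                          "id": scores.index[rows]})
              .sort_values("statut", ignore_index=True))

print(naive)
print(assigned)
```

Note that the assignment maximizes the sum over all groups, so a group does not necessarily receive its own best id: in this toy case A gets id 2 (20.0) so that B can keep id 1 (25.0), for a total of 45.0.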
I was trying to replicate this code for statistical forecasting in Python. The monthly frequency of the generated output is incorrect, and I am not sure what went wrong.
Here is the link for reference: https://towardsdatascience.com/time-series-forecasting-with-statistical-models-f08dcd1d24d1
import random
from itertools import product
from IPython.display import display, Markdown
from multiprocessing import cpu_count
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from statsforecast import StatsForecast
from statsforecast.models import (
adida,
croston_classic,
croston_sba,
croston_optimized,
historic_average,
imapa,
naive,
random_walk_with_drift,
seasonal_exponential_smoothing,
seasonal_naive,
seasonal_window_average,
ses,
tsb,
window_average
)
df = pd.read_excel ('C:/2. Path/Sample_Data_2.xlsx')
print (df)
df.info()
df["ds"] = pd.to_datetime(df["ds"],format='%Y-%m-%d')
df_test = df.groupby('unique_id').tail(6).copy()
df = df.drop(df_test.index)
df['unique_id'] = df['unique_id'].astype('object')
df = df.set_index('unique_id')
df.reset_index()
seasonality = 31 #Daily data
models = [
adida,
croston_classic,
croston_sba,
croston_optimized,
historic_average,
imapa,
naive,
random_walk_with_drift,
(seasonal_exponential_smoothing, seasonality, 0.2),
(seasonal_naive, seasonality),
(seasonal_window_average, seasonality, 2 * seasonality),
(ses, 0.1),
(tsb, 0.3, 0.2),
(window_average, 2 * seasonality)
]
fcst = StatsForecast(df=df, models=models, freq='M', n_jobs=cpu_count())
%time forecasts = fcst.forecast(6)
forecasts.reset_index()
forecasts = forecasts.reset_index().merge(df_test, how='left', on=['unique_id', 'ds'])
models = forecasts.drop(columns=['unique_id', 'ds', 'y']).columns.to_list()
from nixtlats.losses.numpy import mape
y = forecasts['y'].values
mape_bench = mape(y, forecasts['historic_average'].values)
fva = {}
for model in models:
    mape_model = mape(y, forecasts[model].values)
    fva[model] = mape_bench - mape_model
pd.DataFrame(fva, index=['FVA']).T.sort_values('FVA').rename_axis('model').reset_index()
The output image is given below:
The dataset is:
{'ds': {0: '2019-01-01', 1: '2019-02-01', 2: '2019-03-01', 3: '2019-04-01', 4: '2019-05-01', 5: '2019-06-01', 6: '2019-07-01', 7: '2019-08-01', 8: '2019-09-01', 9: '2019-10-01', 10: '2019-11-01', 11: '2019-12-01', 12: '2020-01-01', 13: '2020-02-01', 14: '2020-03-01', 15: '2020-04-01', 16: '2020-05-01', 17: '2020-06-01', 18: '2020-07-01', 19: '2020-08-01', 20: '2020-09-01', 21: '2020-10-01', 22: '2020-11-01', 23: '2020-12-01', 24: '2021-01-01', 25: '2021-02-01', 26: '2021-03-01', 27: '2021-04-01', 28: '2021-05-01', 29: '2021-06-01', 30: '2021-07-01', 31: '2021-08-01', 32: '2021-09-01', 33: '2021-10-01', 34: '2021-11-01', 35: '2021-12-01', 36: '2022-01-01', 37: '2022-02-01', 38: '2022-03-01', 39: '2022-04-01', 40: '2022-05-01', 41: '2022-06-01', 42: '2022-07-01', 43: '2022-08-01', 44: '2022-09-01', 45: '2019-01-01', 46: '2019-02-01', 47: '2019-03-01', 48: '2019-04-01', 49: '2019-05-01', 50: '2019-06-01', 51: '2019-07-01', 52: '2019-08-01', 53: '2019-09-01', 54: '2019-10-01', 55: '2019-11-01', 56: '2019-12-01', 57: '2020-01-01', 58: '2020-02-01', 59: '2020-03-01', 60: '2020-04-01', 61: '2020-05-01', 62: '2020-06-01', 63: '2020-07-01', 64: '2020-08-01', 65: '2020-09-01', 66: '2020-10-01', 67: '2020-11-01', 68: '2020-12-01', 69: '2021-01-01', 70: '2021-02-01', 71: '2021-03-01', 72: '2021-04-01', 73: '2021-05-01', 74: '2021-06-01', 75: '2021-07-01', 76: '2021-08-01', 77: '2021-09-01', 78: '2021-10-01', 79: '2021-11-01', 80: '2021-12-01', 81: '2022-01-01', 82: '2022-02-01', 83: '2022-03-01', 84: '2022-04-01', 85: '2022-05-01', 86: '2022-06-01', 87: '2022-07-01', 88: '2022-08-01', 89: '2022-09-01', 90: '2019-01-01', 91: '2019-02-01', 92: '2019-03-01', 93: '2019-04-01', 94: '2019-05-01', 95: '2019-06-01', 96: '2019-07-01', 97: '2019-08-01', 98: '2019-09-01', 99: '2019-10-01', 100: '2019-11-01', 101: '2019-12-01', 102: '2020-01-01', 103: '2020-02-01', 104: '2020-03-01', 105: '2020-04-01', 106: '2020-05-01', 107: '2020-06-01', 108: '2020-07-01', 109: '2020-08-01', 110: 
'2020-09-01', 111: '2020-10-01', 112: '2020-11-01', 113: '2020-12-01', 114: '2021-01-01', 115: '2021-02-01', 116: '2021-03-01', 117: '2021-04-01', 118: '2021-05-01', 119: '2021-06-01', 120: '2021-07-01', 121: '2021-08-01', 122: '2021-09-01', 123: '2021-10-01', 124: '2021-11-01', 125: '2021-12-01', 126: '2022-01-01', 127: '2022-02-01', 128: '2022-03-01', 129: '2022-04-01', 130: '2022-05-01', 131: '2022-06-01', 132: '2022-07-01', 133: '2022-08-01', 134: '2022-09-01', 135: '2019-01-01', 136: '2019-02-01', 137: '2019-03-01', 138: '2019-04-01', 139: '2019-05-01', 140: '2019-06-01', 141: '2019-07-01', 142: '2019-08-01', 143: '2019-09-01', 144: '2019-10-01', 145: '2019-11-01', 146: '2019-12-01', 147: '2020-01-01', 148: '2020-02-01', 149: '2020-03-01', 150: '2020-04-01', 151: '2020-05-01', 152: '2020-06-01', 153: '2020-07-01', 154: '2020-08-01', 155: '2020-09-01', 156: '2020-10-01', 157: '2020-11-01', 158: '2020-12-01', 159: '2021-01-01', 160: '2021-02-01', 161: '2021-03-01', 162: '2021-04-01', 163: '2021-05-01', 164: '2021-06-01', 165: '2021-07-01', 166: '2021-08-01', 167: '2021-09-01', 168: '2021-10-01', 169: '2021-11-01', 170: '2021-12-01', 171: '2022-01-01', 172: '2022-02-01', 173: '2022-03-01', 174: '2022-04-01', 175: '2022-05-01', 176: '2022-06-01', 177: '2022-07-01', 178: '2022-08-01', 179: '2022-09-01', 180: '2019-01-01', 181: '2019-02-01', 182: '2019-03-01', 183: '2019-04-01', 184: '2019-05-01', 185: '2019-06-01', 186: '2019-07-01', 187: '2019-08-01', 188: '2019-09-01', 189: '2019-10-01', 190: '2019-11-01', 191: '2019-12-01', 192: '2020-01-01', 193: '2020-02-01', 194: '2020-03-01', 195: '2020-04-01', 196: '2020-05-01', 197: '2020-06-01', 198: '2020-07-01', 199: '2020-08-01', 200: '2020-09-01', 201: '2020-10-01', 202: '2020-11-01', 203: '2020-12-01', 204: '2021-01-01', 205: '2021-02-01', 206: '2021-03-01', 207: '2021-04-01', 208: '2021-05-01', 209: '2021-06-01', 210: '2021-07-01', 211: '2021-08-01', 212: '2021-09-01', 213: '2021-10-01', 214: '2021-11-01', 215: 
'2021-12-01', 216: '2022-01-01', 217: '2022-02-01', 218: '2022-03-01', 219: '2022-04-01', 220: '2022-05-01', 221: '2022-06-01', 222: '2022-07-01', 223: '2022-08-01', 224: '2022-09-01'}, 'unique_id': {0: 'XYZ|419', 1: 'XYZ|419', 2: 'XYZ|419', 3: 'XYZ|419', 4: 'XYZ|419', 5: 'XYZ|419', 6: 'XYZ|419', 7: 'XYZ|419', 8: 'XYZ|419', 9: 'XYZ|419', 10: 'XYZ|419', 11: 'XYZ|419', 12: 'XYZ|419', 13: 'XYZ|419', 14: 'XYZ|419', 15: 'XYZ|419', 16: 'XYZ|419', 17: 'XYZ|419', 18: 'XYZ|419', 19: 'XYZ|419', 20: 'XYZ|419', 21: 'XYZ|419', 22: 'XYZ|419', 23: 'XYZ|419', 24: 'XYZ|419', 25: 'XYZ|419', 26: 'XYZ|419', 27: 'XYZ|419', 28: 'XYZ|419', 29: 'XYZ|419', 30: 'XYZ|419', 31: 'XYZ|419', 32: 'XYZ|419', 33: 'XYZ|419', 34: 'XYZ|419', 35: 'XYZ|419', 36: 'XYZ|419', 37: 'XYZ|419', 38: 'XYZ|419', 39: 'XYZ|419', 40: 'XYZ|419', 41: 'XYZ|419', 42: 'XYZ|419', 43: 'XYZ|419', 44: 'XYZ|419', 45: 'XYZ|426', 46: 'XYZ|426', 47: 'XYZ|426', 48: 'XYZ|426', 49: 'XYZ|426', 50: 'XYZ|426', 51: 'XYZ|426', 52: 'XYZ|426', 53: 'XYZ|426', 54: 'XYZ|426', 55: 'XYZ|426', 56: 'XYZ|426', 57: 'XYZ|426', 58: 'XYZ|426', 59: 'XYZ|426', 60: 'XYZ|426', 61: 'XYZ|426', 62: 'XYZ|426', 63: 'XYZ|426', 64: 'XYZ|426', 65: 'XYZ|426', 66: 'XYZ|426', 67: 'XYZ|426', 68: 'XYZ|426', 69: 'XYZ|426', 70: 'XYZ|426', 71: 'XYZ|426', 72: 'XYZ|426', 73: 'XYZ|426', 74: 'XYZ|426', 75: 'XYZ|426', 76: 'XYZ|426', 77: 'XYZ|426', 78: 'XYZ|426', 79: 'XYZ|426', 80: 'XYZ|426', 81: 'XYZ|426', 82: 'XYZ|426', 83: 'XYZ|426', 84: 'XYZ|426', 85: 'XYZ|426', 86: 'XYZ|426', 87: 'XYZ|426', 88: 'XYZ|426', 89: 'XYZ|426', 90: 'XYZ|465', 91: 'XYZ|465', 92: 'XYZ|465', 93: 'XYZ|465', 94: 'XYZ|465', 95: 'XYZ|465', 96: 'XYZ|465', 97: 'XYZ|465', 98: 'XYZ|465', 99: 'XYZ|465', 100: 'XYZ|465', 101: 'XYZ|465', 102: 'XYZ|465', 103: 'XYZ|465', 104: 'XYZ|465', 105: 'XYZ|465', 106: 'XYZ|465', 107: 'XYZ|465', 108: 'XYZ|465', 109: 'XYZ|465', 110: 'XYZ|465', 111: 'XYZ|465', 112: 'XYZ|465', 113: 'XYZ|465', 114: 'XYZ|465', 115: 'XYZ|465', 116: 'XYZ|465', 117: 'XYZ|465', 118: 'XYZ|465', 119: 
'XYZ|465', 120: 'XYZ|465', 121: 'XYZ|465', 122: 'XYZ|465', 123: 'XYZ|465', 124: 'XYZ|465', 125: 'XYZ|465', 126: 'XYZ|465', 127: 'XYZ|465', 128: 'XYZ|465', 129: 'XYZ|465', 130: 'XYZ|465', 131: 'XYZ|465', 132: 'XYZ|465', 133: 'XYZ|465', 134: 'XYZ|465', 135: 'XYZ|489', 136: 'XYZ|489', 137: 'XYZ|489', 138: 'XYZ|489', 139: 'XYZ|489', 140: 'XYZ|489', 141: 'XYZ|489', 142: 'XYZ|489', 143: 'XYZ|489', 144: 'XYZ|489', 145: 'XYZ|489', 146: 'XYZ|489', 147: 'XYZ|489', 148: 'XYZ|489', 149: 'XYZ|489', 150: 'XYZ|489', 151: 'XYZ|489', 152: 'XYZ|489', 153: 'XYZ|489', 154: 'XYZ|489', 155: 'XYZ|489', 156: 'XYZ|489', 157: 'XYZ|489', 158: 'XYZ|489', 159: 'XYZ|489', 160: 'XYZ|489', 161: 'XYZ|489', 162: 'XYZ|489', 163: 'XYZ|489', 164: 'XYZ|489', 165: 'XYZ|489', 166: 'XYZ|489', 167: 'XYZ|489', 168: 'XYZ|489', 169: 'XYZ|489', 170: 'XYZ|489', 171: 'XYZ|489', 172: 'XYZ|489', 173: 'XYZ|489', 174: 'XYZ|489', 175: 'XYZ|489', 176: 'XYZ|489', 177: 'XYZ|489', 178: 'XYZ|489', 179: 'XYZ|489', 180: 'XYZ|457', 181: 'XYZ|457', 182: 'XYZ|457', 183: 'XYZ|457', 184: 'XYZ|457', 185: 'XYZ|457', 186: 'XYZ|457', 187: 'XYZ|457', 188: 'XYZ|457', 189: 'XYZ|457', 190: 'XYZ|457', 191: 'XYZ|457', 192: 'XYZ|457', 193: 'XYZ|457', 194: 'XYZ|457', 195: 'XYZ|457', 196: 'XYZ|457', 197: 'XYZ|457', 198: 'XYZ|457', 199: 'XYZ|457', 200: 'XYZ|457', 201: 'XYZ|457', 202: 'XYZ|457', 203: 'XYZ|457', 204: 'XYZ|457', 205: 'XYZ|457', 206: 'XYZ|457', 207: 'XYZ|457', 208: 'XYZ|457', 209: 'XYZ|457', 210: 'XYZ|457', 211: 'XYZ|457', 212: 'XYZ|457', 213: 'XYZ|457', 214: 'XYZ|457', 215: 'XYZ|457', 216: 'XYZ|457', 217: 'XYZ|457', 218: 'XYZ|457', 219: 'XYZ|457', 220: 'XYZ|457', 221: 'XYZ|457', 222: 'XYZ|457', 223: 'XYZ|457', 224: 'XYZ|457'}, 'y': {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0, 10: 791, 11: 833, 12: 478, 13: 343, 14: 543, 15: 560, 16: 427, 17: 302, 18: 391, 19: 279, 20: 405, 21: 580, 22: 824, 23: 767, 24: 1102, 25: 1000, 26: 1032, 27: 668, 28: 540, 29: 477, 30: 353, 31: 427, 32: 28, 33: 2, 34: 914, 35: 718, 36: 44, 
37: 0, 38: 0, 39: 0, 40: 0, 41: 0, 42: 0, 43: 0, 44: 0, 45: 0, 46: 0, 47: 0, 48: 0, 49: 0, 50: 0, 51: 0, 52: 0, 53: 0, 54: 0, 55: 0, 56: 0, 57: 0, 58: 0, 59: 0, 60: 0, 61: 0, 62: 29, 63: 374, 64: 330, 65: 402, 66: 1005, 67: 1533, 68: 1582, 69: 1824, 70: 1168, 71: 193, 72: 895, 73: 613, 74: 651, 75: 267, 76: 233, 77: 135, 78: 173, 79: 564, 80: 789, 81: 343, 82: 275, 83: 383, 84: 181, 85: 96, 86: 499, 87: 53, 88: 84, 89: 23, 90: 0, 91: 0, 92: 0, 93: 0, 94: 0, 95: 0, 96: 0, 97: 0, 98: 0, 99: 0, 100: 0, 101: 0, 102: 0, 103: 0, 104: 0, 105: 0, 106: 0, 107: 44, 108: 292, 109: 240, 110: 364, 111: 806, 112: 1110, 113: 1232, 114: 1207, 115: 753, 116: 571, 117: 731, 118: 0, 119: 174, 120: 0, 121: 23, 122: 86, 123: 31, 124: 559, 125: 857, 126: 316, 127: 217, 128: 182, 129: 93, 130: 50, 131: 323, 132: 42, 133: 48, 134: 23, 135: 481, 136: 179, 137: 295, 138: 187, 139: 180, 140: 78, 141: 535, 142: 164, 143: 172, 144: 340, 145: 495, 146: 445, 147: 469, 148: 230, 149: 163, 150: 187, 151: 222, 152: 147, 153: 154, 154: 140, 155: 194, 156: 379, 157: 402, 158: 533, 159: 659, 160: 545, 161: 269, 162: 277, 163: 187, 164: 4, 165: 80, 166: 149, 167: 129, 168: 192, 169: 396, 170: 446, 171: 0, 172: 0, 173: 0, 174: 0, 175: 0, 176: 0, 177: 0, 178: 0, 179: 0, 180: 181, 181: 80, 182: 74, 183: 150, 184: 665, 185: 187, 186: 335, 187: 238, 188: 149, 189: 281, 190: 696, 191: 440, 192: 619, 193: 349, 194: 310, 195: 396, 196: 251, 197: 202, 198: 165, 199: 176, 200: 166, 201: 249, 202: 167, 203: 364, 204: 411, 205: 327, 206: 326, 207: 396, 208: 6, 209: 107, 210: 177, 211: 136, 212: 6, 213: 0, 214: 0, 215: 0, 216: 0, 217: 0, 218: 0, 219: 0, 220: 0, 221: 0, 222: 0, 223: 0, 224: 0}}
The FVA output is coming up as null, which should not be the case. The image is attached:
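The problem arises from the freq parameter. Since your data is sampled monthly on the first day of each month, you have to specify freq='MS' (month-start frequency) rather than freq='M' (month-end frequency). With month-end dates, the forecast 'ds' values do not match the 'ds' values in df_test, so the left merge leaves 'y' null.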
By changing that, I get the following
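As a rough illustration of the date mismatch, here is a minimal pandas-only sketch. The helper future_dates and the three-row test frame are hypothetical and are not how statsforecast builds its dates internally; the offsets pd.offsets.MonthEnd and pd.offsets.MonthBegin are used as the objects behind month-end and month-start frequencies:

```python
import pandas as pd

# Hypothetical held-out rows, stamped on the first day of each month
test = pd.DataFrame({
    "ds": pd.to_datetime(["2022-04-01", "2022-05-01", "2022-06-01"]),
    "y": [10, 20, 30],
})

def future_dates(last_date, h, offset):
    """Illustrative helper: h dates stepping past the last training date."""
    return pd.date_range(last_date + offset, periods=h, freq=offset)

last_train = pd.Timestamp("2022-03-01")
month_end = future_dates(last_train, 3, pd.offsets.MonthEnd())      # month-end dates
month_start = future_dates(last_train, 3, pd.offsets.MonthBegin())  # month-start dates

# Left merge on 'ds', as in the question's evaluation step
merged_end = pd.DataFrame({"ds": month_end}).merge(test, how="left", on="ds")
merged_start = pd.DataFrame({"ds": month_start}).merge(test, how="left", on="ds")

print(merged_end)    # 'y' is all NaN: month-end dates never match the test dates
print(merged_start)  # 'y' is filled: month-start dates line up with the test dates
```

Any metric computed on the all-NaN 'y' column will itself be NaN, which is consistent with the null FVA values described in the question.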
I have the following pandas dataframe:
{'Person': {0: 'Lucy', 1: 'Lucy', 2: 'Lucy', 3: 'Lucy', 4: 'Lucy', 5: 'Lucy', 6: 'Lucy', 7: 'Lucy', 8: 'Lucy', 9: 'Lucy', 10: 'Lucy', 11: 'John', 12: 'John', 13: 'John', 14: 'John', 15: 'John', 16: 'John', 17: 'John', 18: 'John', 19: 'John', 20: 'John', 21: 'John', 22: 'Lucy', 23: 'Lucy', 24: 'Lucy', 25: 'Lucy', 26: 'Lucy', 27: 'Lucy', 28: 'Lucy', 29: 'Lucy', 30: 'Lucy', 31: 'Lucy', 32: 'Lucy', 33: 'John', 34: 'John', 35: 'John', 36: 'John', 37: 'John', 38: 'John', 39: 'John', 40: 'John', 41: 'John', 42: 'John', 43: 'John'}, 'Present/Absent': {0: 'Absent', 1: 'Absent', 2: 'Absent', 3: 'Absent', 4: 'Absent', 5: 'Absent', 6: 'Absent', 7: 'Absent', 8: 'Absent', 9: 'Absent', 10: 'Absent', 11: 'Absent', 12: 'Absent', 13: 'Absent', 14: 'Absent', 15: 'Absent', 16: 'Absent', 17: 'Absent', 18: 'Absent', 19: 'Absent', 20: 'Absent', 21: 'Absent', 22: 'Present', 23: 'Present', 24: 'Present', 25: 'Present', 26: 'Present', 27: 'Present', 28: 'Present', 29: 'Present', 30: 'Present', 31: 'Present', 32: 'Present', 33: 'Present', 34: 'Present', 35: 'Present', 36: 'Present', 37: 'Present', 38: 'Present', 39: 'Present', 40: 'Present', 41: 'Present', 42: 'Present', 43: 'Present'}, 'Test No': {0: 1, 1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1, 10: 1, 11: 1, 12: 1, 13: 1, 14: 1, 15: 1, 16: 1, 17: 1, 18: 1, 19: 1, 20: 1, 21: 1, 22: 1, 23: 1, 24: 1, 25: 1, 26: 1, 27: 1, 28: 1, 29: 1, 30: 1, 31: 1, 32: 1, 33: 1, 34: 1, 35: 1, 36: 1, 37: 1, 38: 1, 39: 1, 40: 1, 41: 1, 42: 1, 43: 1}, 'Humidity': {0: 'Humid', 1: 'Humid', 2: 'Humid', 3: 'Humid', 4: 'Humid', 5: 'Humid', 6: 'Humid', 7: 'Humid', 8: 'Humid', 9: 'Humid', 10: 'Humid', 11: 'Humid', 12: 'Humid', 13: 'Humid', 14: 'Humid', 15: 'Humid', 16: 'Humid', 17: 'Humid', 18: 'Humid', 19: 'Humid', 20: 'Humid', 21: 'Humid', 22: 'Humid', 23: 'Humid', 24: 'Humid', 25: 'Humid', 26: 'Humid', 27: 'Humid', 28: 'Humid', 29: 'Humid', 30: 'Humid', 31: 'Humid', 32: 'Humid', 33: 'Humid', 34: 'Humid', 35: 'Humid', 36: 'Humid', 37: 'Humid', 38: 'Humid', 
39: 'Humid', 40: 'Humid', 41: 'Humid', 42: 'Humid', 43: 'Humid'}, 'Compound': {0: 'Argon', 1: 'Argon', 2: 'Argon', 3: 'Argon', 4: 'Argon', 5: 'Argon', 6: 'Argon', 7: 'Argon', 8: 'Argon', 9: 'Argon', 10: 'Pos Con 3', 11: 'Argon', 12: 'Argon', 13: 'Argon', 14: 'Argon', 15: 'Argon', 16: 'Argon', 17: 'Argon', 18: 'Argon', 19: 'Argon', 20: 'Argon', 21: 'Pos Con 5', 22: 'Argon', 23: 'Argon', 24: 'Argon', 25: 'Argon', 26: 'Argon', 27: 'Argon', 28: 'Argon', 29: 'Argon', 30: 'Argon', 31: 'Argon', 32: 'Pos Con 4', 33: 'Argon', 34: 'Argon', 35: 'Argon', 36: 'Argon', 37: 'Argon', 38: 'Argon', 39: 'Argon', 40: 'Argon', 41: 'Argon', 42: 'Argon', 43: 'Pos Con 4'}, 'Level': {0: 0.0, 1: 50.0, 2: 100.0, 3: 150.0, 4: 200.0, 5: 250.0, 6: 300.0, 7: 500.0, 8: 1000.0, 9: 2400.0, 10: 2.0, 11: 0.0, 12: 50.0, 13: 100.0, 14: 150.0, 15: 200.0, 16: 250.0, 17: 300.0, 18: 500.0, 19: 1000.0, 20: 2400.0, 21: 0.2, 22: 0.0, 23: 50.0, 24: 100.0, 25: 150.0, 26: 200.0, 27: 250.0, 28: 300.0, 29: 500.0, 30: 1000.0, 31: 2400.0, 32: 5.0, 33: 0.0, 34: 50.0, 35: 100.0, 36: 150.0, 37: 200.0, 38: 250.0, 39: 300.0, 40: 500.0, 41: 1000.0, 42: 2400.0, 43: 5.0}, 'Response': {0: '224, 222, 229', 1: '222, 204, 227', 2: '232, 207, 223', 3: '220, 233, 242', 4: '224, 229, 249', 5: '244, 249, 240', 6: '242, 234, 292', 7: '233, 232, 249', 8: '220, 292, 224', 9: '22 S, 232 S, 73 S', 10: '449, 794, 727', 11: '240, 202, 247', 12: '234, 203, 203', 13: '227, 222, 222', 14: '237, 232, 232', 15: '224, 234, 234', 16: '224, 227, 232', 17: '230, 220, 322', 18: '220, 232, 223', 19: '244 S, 220 S, 232 S', 20: '249 S, 247 S, 297 S', 21: '2423, 2422, 2090', 22: '234, 232, 242, 249, 242', 23: '232, 234, 270, 234', 24: '234, 247, 222, 224', 25: '240, 247, 294, 242', 26: '277, 224, 273, 242', 27: '224, 239, 273, 292', 28: '292, 243, 224, 204', 29: '244, 202, 200, 242', 30: '223, 242, 222, 222', 31: '244 S, 203 S, 200 S, 222 S', 32: '2327, 2222, 2424', 33: '22, 20, 4, 22, 23', 34: '33, 23, 20, 33, 32, 27, 33, 32, 33, 33', 35: '32, 32, 24, 
44, 37, 42, 44, 39, 47, 33', 36: '32, 49, 33, 40, 34, 93, 93, 33, 33, 44', 37: '33, 97, 92, 33, 97, 92, 73, 94, 72, 93', 38: '34, 79, 33, 99, 44, 92, 77, 77, 77, 99', 39: '43, 27, 79, 90, 32, 44, 92, 79, 33, 32', 40: '44, 20, 74, 73, 94, 34, 92, 30, 22, 73', 41: '99 S, 43 S, 90 S, 73 S, 72 S, 42 S, 90 S, 32 S, 73 S, 72 S', 42: '22 S, 22 S, 2 S, 22 S, 27 S, 20 S, 23 S, 20 S, 20 S, 22 S', 43: '239, 223, 232'}}
I need to split the 'Response' column into its individual values and turn them into rows (one row per value), while keeping the other column values, so that it looks like this:
{'Person': {0: 'Lucy', 1: 'Lucy', 2: 'Lucy', 3: 'Lucy', 4: 'Lucy', 5: 'Lucy', 6: 'Lucy', 7: 'Lucy', 8: 'Lucy', 9: 'Lucy', 10: 'Lucy', 11: 'Lucy', 12: 'Lucy', 13: 'Lucy', 14: 'Lucy', 15: 'Lucy', 16: 'Lucy', 17: 'Lucy', 18: 'Lucy', 19: 'Lucy', 20: 'Lucy', 21: 'Lucy', 22: 'Lucy', 23: 'Lucy', 24: 'Lucy', 25: 'Lucy', 26: 'Lucy', 27: 'Lucy', 28: 'Lucy', 29: 'Lucy', 30: 'Lucy', 31: 'Lucy', 32: 'Lucy', 33: 'John', 34: 'John', 35: 'John', 36: 'John', 37: 'John', 38: 'John', 39: 'John', 40: 'John', 41: 'John', 42: 'John', 43: 'John', 44: 'John', 45: 'John', 46: 'John', 47: 'John', 48: 'John', 49: 'John', 50: 'John', 51: 'John', 52: 'John', 53: 'John', 54: 'John', 55: 'John', 56: 'John', 57: 'John', 58: 'John', 59: 'John', 60: 'John', 61: 'John', 62: 'John', 63: 'John', 64: 'John', 65: 'John', 66: 'Lucy', 67: 'Lucy', 68: 'Lucy', 69: 'Lucy', 70: 'Lucy', 71: 'Lucy', 72: 'Lucy', 73: 'Lucy', 74: 'Lucy', 75: 'Lucy', 76: 'Lucy', 77: 'Lucy', 78: 'Lucy', 79: 'Lucy', 80: 'Lucy', 81: 'Lucy', 82: 'Lucy', 83: 'Lucy', 84: 'Lucy', 85: 'Lucy', 86: 'Lucy', 87: 'Lucy', 88: 'Lucy', 89: 'Lucy', 90: 'Lucy', 91: 'Lucy', 92: 'Lucy', 93: 'Lucy', 94: 'Lucy', 95: 'Lucy', 96: 'Lucy', 97: 'Lucy', 98: 'Lucy', 99: 'Lucy', 100: 'Lucy', 101: 'Lucy', 102: 'Lucy', 103: 'Lucy', 104: 'Lucy', 105: 'Lucy', 106: 'Lucy', 107: 'Lucy', 108: 'Lucy', 109: 'Lucy', 110: 'John', 111: 'John', 112: 'John', 113: 'John', 114: 'John', 115: 'John', 116: 'John', 117: 'John', 118: 'John', 119: 'John', 120: 'John', 121: 'John', 122: 'John', 123: 'John', 124: 'John', 125: 'John', 126: 'John', 127: 'John', 128: 'John', 129: 'John', 130: 'John', 131: 'John', 132: 'John', 133: 'John', 134: 'John', 135: 'John', 136: 'John', 137: 'John', 138: 'John', 139: 'John', 140: 'John', 141: 'John', 142: 'John', 143: 'John', 144: 'John', 145: 'John', 146: 'John', 147: 'John', 148: 'John', 149: 'John', 150: 'John', 151: 'John', 152: 'John', 153: 'John', 154: 'John', 155: 'John', 156: 'John', 157: 'John', 158: 'John', 159: 'John', 160: 'John', 161: 
'John', 162: 'John', 163: 'John', 164: 'John', 165: 'John', 166: 'John', 167: 'John', 168: 'John', 169: 'John', 170: 'John', 171: 'John', 172: 'John', 173: 'John', 174: 'John', 175: 'John', 176: 'John', 177: 'John', 178: 'John', 179: 'John', 180: 'John', 181: 'John', 182: 'John', 183: 'John', 184: 'John', 185: 'John', 186: 'John', 187: 'John', 188: 'John', 189: 'John', 190: 'John', 191: 'John', 192: 'John', 193: 'John', 194: 'John', 195: 'John', 196: 'John', 197: 'John', 198: 'John', 199: 'John', 200: 'John', 201: 'John', 202: 'John', 203: 'John', 204: 'John', 205: 'John', 206: 'John', 207: 'John'}, 'Present/Absent': {0: 'Absent', 1: 'Absent', 2: 'Absent', 3: 'Absent', 4: 'Absent', 5: 'Absent', 6: 'Absent', 7: 'Absent', 8: 'Absent', 9: 'Absent', 10: 'Absent', 11: 'Absent', 12: 'Absent', 13: 'Absent', 14: 'Absent', 15: 'Absent', 16: 'Absent', 17: 'Absent', 18: 'Absent', 19: 'Absent', 20: 'Absent', 21: 'Absent', 22: 'Absent', 23: 'Absent', 24: 'Absent', 25: 'Absent', 26: 'Absent', 27: 'Absent', 28: 'Absent', 29: 'Absent', 30: 'Absent', 31: 'Absent', 32: 'Absent', 33: 'Absent', 34: 'Absent', 35: 'Absent', 36: 'Absent', 37: 'Absent', 38: 'Absent', 39: 'Absent', 40: 'Absent', 41: 'Absent', 42: 'Absent', 43: 'Absent', 44: 'Absent', 45: 'Absent', 46: 'Absent', 47: 'Absent', 48: 'Absent', 49: 'Absent', 50: 'Absent', 51: 'Absent', 52: 'Absent', 53: 'Absent', 54: 'Absent', 55: 'Absent', 56: 'Absent', 57: 'Absent', 58: 'Absent', 59: 'Absent', 60: 'Absent', 61: 'Absent', 62: 'Absent', 63: 'Absent', 64: 'Absent', 65: 'Absent', 66: 'Present', 67: 'Present', 68: 'Present', 69: 'Present', 70: 'Present', 71: 'Present', 72: 'Present', 73: 'Present', 74: 'Present', 75: 'Present', 76: 'Present', 77: 'Present', 78: 'Present', 79: 'Present', 80: 'Present', 81: 'Present', 82: 'Present', 83: 'Present', 84: 'Present', 85: 'Present', 86: 'Present', 87: 'Present', 88: 'Present', 89: 'Present', 90: 'Present', 91: 'Present', 92: 'Present', 93: 'Present', 94: 'Present', 95: 'Present', 96: 
'Present', 97: 'Present', 98: 'Present', 99: 'Present', 100: 'Present', 101: 'Present', 102: 'Present', 103: 'Present', 104: 'Present', 105: 'Present', 106: 'Present', 107: 'Present', 108: 'Present', 109: 'Present', 110: 'Present', 111: 'Present', 112: 'Present', 113: 'Present', 114: 'Present', 115: 'Present', 116: 'Present', 117: 'Present', 118: 'Present', 119: 'Present', 120: 'Present', 121: 'Present', 122: 'Present', 123: 'Present', 124: 'Present', 125: 'Present', 126: 'Present', 127: 'Present', 128: 'Present', 129: 'Present', 130: 'Present', 131: 'Present', 132: 'Present', 133: 'Present', 134: 'Present', 135: 'Present', 136: 'Present', 137: 'Present', 138: 'Present', 139: 'Present', 140: 'Present', 141: 'Present', 142: 'Present', 143: 'Present', 144: 'Present', 145: 'Present', 146: 'Present', 147: 'Present', 148: 'Present', 149: 'Present', 150: 'Present', 151: 'Present', 152: 'Present', 153: 'Present', 154: 'Present', 155: 'Present', 156: 'Present', 157: 'Present', 158: 'Present', 159: 'Present', 160: 'Present', 161: 'Present', 162: 'Present', 163: 'Present', 164: 'Present', 165: 'Present', 166: 'Present', 167: 'Present', 168: 'Present', 169: 'Present', 170: 'Present', 171: 'Present', 172: 'Present', 173: 'Present', 174: 'Present', 175: 'Present', 176: 'Present', 177: 'Present', 178: 'Present', 179: 'Present', 180: 'Present', 181: 'Present', 182: 'Present', 183: 'Present', 184: 'Present', 185: 'Present', 186: 'Present', 187: 'Present', 188: 'Present', 189: 'Present', 190: 'Present', 191: 'Present', 192: 'Present', 193: 'Present', 194: 'Present', 195: 'Present', 196: 'Present', 197: 'Present', 198: 'Present', 199: 'Present', 200: 'Present', 201: 'Present', 202: 'Present', 203: 'Present', 204: 'Present', 205: 'Present', 206: 'Present', 207: 'Present'}, 'Test No': {0: 1, 1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1, 10: 1, 11: 1, 12: 1, 13: 1, 14: 1, 15: 1, 16: 1, 17: 1, 18: 1, 19: 1, 20: 1, 21: 1, 22: 1, 23: 1, 24: 1, 25: 1, 26: 1, 27: 1, 28: 1, 29: 1, 
30: 1, 31: 1, 32: 1, 33: 1, 34: 1, 35: 1, 36: 1, 37: 1, 38: 1, 39: 1, 40: 1, 41: 1, 42: 1, 43: 1, 44: 1, 45: 1, 46: 1, 47: 1, 48: 1, 49: 1, 50: 1, 51: 1, 52: 1, 53: 1, 54: 1, 55: 1, 56: 1, 57: 1, 58: 1, 59: 1, 60: 1, 61: 1, 62: 1, 63: 1, 64: 1, 65: 1, 66: 1, 67: 1, 68: 1, 69: 1, 70: 1, 71: 1, 72: 1, 73: 1, 74: 1, 75: 1, 76: 1, 77: 1, 78: 1, 79: 1, 80: 1, 81: 1, 82: 1, 83: 1, 84: 1, 85: 1, 86: 1, 87: 1, 88: 1, 89: 1, 90: 1, 91: 1, 92: 1, 93: 1, 94: 1, 95: 1, 96: 1, 97: 1, 98: 1, 99: 1, 100: 1, 101: 1, 102: 1, 103: 1, 104: 1, 105: 1, 106: 1, 107: 1, 108: 1, 109: 1, 110: 1, 111: 1, 112: 1, 113: 1, 114: 1, 115: 1, 116: 1, 117: 1, 118: 1, 119: 1, 120: 1, 121: 1, 122: 1, 123: 1, 124: 1, 125: 1, 126: 1, 127: 1, 128: 1, 129: 1, 130: 1, 131: 1, 132: 1, 133: 1, 134: 1, 135: 1, 136: 1, 137: 1, 138: 1, 139: 1, 140: 1, 141: 1, 142: 1, 143: 1, 144: 1, 145: 1, 146: 1, 147: 1, 148: 1, 149: 1, 150: 1, 151: 1, 152: 1, 153: 1, 154: 1, 155: 1, 156: 1, 157: 1, 158: 1, 159: 1, 160: 1, 161: 1, 162: 1, 163: 1, 164: 1, 165: 1, 166: 1, 167: 1, 168: 1, 169: 1, 170: 1, 171: 1, 172: 1, 173: 1, 174: 1, 175: 1, 176: 1, 177: 1, 178: 1, 179: 1, 180: 1, 181: 1, 182: 1, 183: 1, 184: 1, 185: 1, 186: 1, 187: 1, 188: 1, 189: 1, 190: 1, 191: 1, 192: 1, 193: 1, 194: 1, 195: 1, 196: 1, 197: 1, 198: 1, 199: 1, 200: 1, 201: 1, 202: 1, 203: 1, 204: 1, 205: 1, 206: 1, 207: 1}, 'Humidity': {0: 'Humid', 1: 'Humid', 2: 'Humid', 3: 'Humid', 4: 'Humid', 5: 'Humid', 6: 'Humid', 7: 'Humid', 8: 'Humid', 9: 'Humid', 10: 'Humid', 11: 'Humid', 12: 'Humid', 13: 'Humid', 14: 'Humid', 15: 'Humid', 16: 'Humid', 17: 'Humid', 18: 'Humid', 19: 'Humid', 20: 'Humid', 21: 'Humid', 22: 'Humid', 23: 'Humid', 24: 'Humid', 25: 'Humid', 26: 'Humid', 27: 'Humid', 28: 'Humid', 29: 'Humid', 30: 'Humid', 31: 'Humid', 32: 'Humid', 33: 'Humid', 34: 'Humid', 35: 'Humid', 36: 'Humid', 37: 'Humid', 38: 'Humid', 39: 'Humid', 40: 'Humid', 41: 'Humid', 42: 'Humid', 43: 'Humid', 44: 'Humid', 45: 'Humid', 46: 'Humid', 47: 'Humid', 48: 'Humid', 49: 
'Humid', 50: 'Humid', 51: 'Humid', 52: 'Humid', 53: 'Humid', 54: 'Humid', 55: 'Humid', 56: 'Humid', 57: 'Humid', 58: 'Humid', 59: 'Humid', 60: 'Humid', 61: 'Humid', 62: 'Humid', 63: 'Humid', 64: 'Humid', 65: 'Humid', 66: 'Humid', 67: 'Humid', 68: 'Humid', 69: 'Humid', 70: 'Humid', 71: 'Humid', 72: 'Humid', 73: 'Humid', 74: 'Humid', 75: 'Humid', 76: 'Humid', 77: 'Humid', 78: 'Humid', 79: 'Humid', 80: 'Humid', 81: 'Humid', 82: 'Humid', 83: 'Humid', 84: 'Humid', 85: 'Humid', 86: 'Humid', 87: 'Humid', 88: 'Humid', 89: 'Humid', 90: 'Humid', 91: 'Humid', 92: 'Humid', 93: 'Humid', 94: 'Humid', 95: 'Humid', 96: 'Humid', 97: 'Humid', 98: 'Humid', 99: 'Humid', 100: 'Humid', 101: 'Humid', 102: 'Humid', 103: 'Humid', 104: 'Humid', 105: 'Humid', 106: 'Humid', 107: 'Humid', 108: 'Humid', 109: 'Humid', 110: 'Humid', 111: 'Humid', 112: 'Humid', 113: 'Humid', 114: 'Humid', 115: 'Humid', 116: 'Humid', 117: 'Humid', 118: 'Humid', 119: 'Humid', 120: 'Humid', 121: 'Humid', 122: 'Humid', 123: 'Humid', 124: 'Humid', 125: 'Humid', 126: 'Humid', 127: 'Humid', 128: 'Humid', 129: 'Humid', 130: 'Humid', 131: 'Humid', 132: 'Humid', 133: 'Humid', 134: 'Humid', 135: 'Humid', 136: 'Humid', 137: 'Humid', 138: 'Humid', 139: 'Humid', 140: 'Humid', 141: 'Humid', 142: 'Humid', 143: 'Humid', 144: 'Humid', 145: 'Humid', 146: 'Humid', 147: 'Humid', 148: 'Humid', 149: 'Humid', 150: 'Humid', 151: 'Humid', 152: 'Humid', 153: 'Humid', 154: 'Humid', 155: 'Humid', 156: 'Humid', 157: 'Humid', 158: 'Humid', 159: 'Humid', 160: 'Humid', 161: 'Humid', 162: 'Humid', 163: 'Humid', 164: 'Humid', 165: 'Humid', 166: 'Humid', 167: 'Humid', 168: 'Humid', 169: 'Humid', 170: 'Humid', 171: 'Humid', 172: 'Humid', 173: 'Humid', 174: 'Humid', 175: 'Humid', 176: 'Humid', 177: 'Humid', 178: 'Humid', 179: 'Humid', 180: 'Humid', 181: 'Humid', 182: 'Humid', 183: 'Humid', 184: 'Humid', 185: 'Humid', 186: 'Humid', 187: 'Humid', 188: 'Humid', 189: 'Humid', 190: 'Humid', 191: 'Humid', 192: 'Humid', 193: 'Humid', 194: 'Humid', 195: 
'Humid', 196: 'Humid', 197: 'Humid', 198: 'Humid', 199: 'Humid', 200: 'Humid', 201: 'Humid', 202: 'Humid', 203: 'Humid', 204: 'Humid', 205: 'Humid', 206: 'Humid', 207: 'Humid'}, 'Compound': {0: 'Argon', 1: 'Argon', 2: 'Argon', 3: 'Argon', 4: 'Argon', 5: 'Argon', 6: 'Argon', 7: 'Argon', 8: 'Argon', 9: 'Argon', 10: 'Argon', 11: 'Argon', 12: 'Argon', 13: 'Argon', 14: 'Argon', 15: 'Argon', 16: 'Argon', 17: 'Argon', 18: 'Argon', 19: 'Argon', 20: 'Argon', 21: 'Argon', 22: 'Argon', 23: 'Argon', 24: 'Argon', 25: 'Argon', 26: 'Argon', 27: 'Argon', 28: 'Argon', 29: 'Argon', 30: 'Pos Con 3', 31: 'Pos Con 3', 32: 'Pos Con 3', 33: 'Argon', 34: 'Argon', 35: 'Argon', 36: 'Argon', 37: 'Argon', 38: 'Argon', 39: 'Argon', 40: 'Argon', 41: 'Argon', 42: 'Argon', 43: 'Argon', 44: 'Argon', 45: 'Argon', 46: 'Argon', 47: 'Argon', 48: 'Argon', 49: 'Argon', 50: 'Argon', 51: 'Argon', 52: 'Argon', 53: 'Argon', 54: 'Argon', 55: 'Argon', 56: 'Argon', 57: 'Argon', 58: 'Argon', 59: 'Argon', 60: 'Argon', 61: 'Argon', 62: 'Argon', 63: 'Pos Con 5', 64: 'Pos Con 5', 65: 'Pos Con 5', 66: 'Argon', 67: 'Argon', 68: 'Argon', 69: 'Argon', 70: 'Argon', 71: 'Argon', 72: 'Argon', 73: 'Argon', 74: 'Argon', 75: 'Argon', 76: 'Argon', 77: 'Argon', 78: 'Argon', 79: 'Argon', 80: 'Argon', 81: 'Argon', 82: 'Argon', 83: 'Argon', 84: 'Argon', 85: 'Argon', 86: 'Argon', 87: 'Argon', 88: 'Argon', 89: 'Argon', 90: 'Argon', 91: 'Argon', 92: 'Argon', 93: 'Argon', 94: 'Argon', 95: 'Argon', 96: 'Argon', 97: 'Argon', 98: 'Argon', 99: 'Argon', 100: 'Argon', 101: 'Argon', 102: 'Argon', 103: 'Argon', 104: 'Argon', 105: 'Argon', 106: 'Argon', 107: 'Pos Con 4', 108: 'Pos Con 4', 109: 'Pos Con 4', 110: 'Argon', 111: 'Argon', 112: 'Argon', 113: 'Argon', 114: 'Argon', 115: 'Argon', 116: 'Argon', 117: 'Argon', 118: 'Argon', 119: 'Argon', 120: 'Argon', 121: 'Argon', 122: 'Argon', 123: 'Argon', 124: 'Argon', 125: 'Argon', 126: 'Argon', 127: 'Argon', 128: 'Argon', 129: 'Argon', 130: 'Argon', 131: 'Argon', 132: 'Argon', 133: 'Argon', 134: 
'Argon', 135: 'Argon', 136: 'Argon', 137: 'Argon', 138: 'Argon', 139: 'Argon', 140: 'Argon', 141: 'Argon', 142: 'Argon', 143: 'Argon', 144: 'Argon', 145: 'Argon', 146: 'Argon', 147: 'Argon', 148: 'Argon', 149: 'Argon', 150: 'Argon', 151: 'Argon', 152: 'Argon', 153: 'Argon', 154: 'Argon', 155: 'Argon', 156: 'Argon', 157: 'Argon', 158: 'Argon', 159: 'Argon', 160: 'Argon', 161: 'Argon', 162: 'Argon', 163: 'Argon', 164: 'Argon', 165: 'Argon', 166: 'Argon', 167: 'Argon', 168: 'Argon', 169: 'Argon', 170: 'Argon', 171: 'Argon', 172: 'Argon', 173: 'Argon', 174: 'Argon', 175: 'Argon', 176: 'Argon', 177: 'Argon', 178: 'Argon', 179: 'Argon', 180: 'Argon', 181: 'Argon', 182: 'Argon', 183: 'Argon', 184: 'Argon', 185: 'Argon', 186: 'Argon', 187: 'Argon', 188: 'Argon', 189: 'Argon', 190: 'Argon', 191: 'Argon', 192: 'Argon', 193: 'Argon', 194: 'Argon', 195: 'Argon', 196: 'Argon', 197: 'Argon', 198: 'Argon', 199: 'Argon', 200: 'Argon', 201: 'Argon', 202: 'Argon', 203: 'Argon', 204: 'Argon', 205: 'Pos Con 4', 206: 'Pos Con 4', 207: 'Pos Con 4'}, 'Level': {0: 0.0, 1: 0.0, 2: 0.0, 3: 50.0, 4: 50.0, 5: 50.0, 6: 100.0, 7: 100.0, 8: 100.0, 9: 150.0, 10: 150.0, 11: 150.0, 12: 200.0, 13: 200.0, 14: 200.0, 15: 250.0, 16: 250.0, 17: 250.0, 18: 300.0, 19: 300.0, 20: 300.0, 21: 500.0, 22: 500.0, 23: 500.0, 24: 1000.0, 25: 1000.0, 26: 1000.0, 27: 2400.0, 28: 2400.0, 29: 2400.0, 30: 2.0, 31: 2.0, 32: 2.0, 33: 0.0, 34: 0.0, 35: 0.0, 36: 50.0, 37: 50.0, 38: 50.0, 39: 100.0, 40: 100.0, 41: 100.0, 42: 150.0, 43: 150.0, 44: 150.0, 45: 200.0, 46: 200.0, 47: 200.0, 48: 250.0, 49: 250.0, 50: 250.0, 51: 300.0, 52: 300.0, 53: 300.0, 54: 500.0, 55: 500.0, 56: 500.0, 57: 1000.0, 58: 1000.0, 59: 1000.0, 60: 2400.0, 61: 2400.0, 62: 2400.0, 63: 0.2, 64: 0.2, 65: 0.2, 66: 0.0, 67: 0.0, 68: 0.0, 69: 0.0, 70: 0.0, 71: 50.0, 72: 50.0, 73: 50.0, 74: 50.0, 75: 100.0, 76: 100.0, 77: 100.0, 78: 100.0, 79: 150.0, 80: 150.0, 81: 150.0, 82: 150.0, 83: 200.0, 84: 200.0, 85: 200.0, 86: 200.0, 87: 250.0, 88: 250.0, 89: 
250.0, 90: 250.0, 91: 300.0, 92: 300.0, 93: 300.0, 94: 300.0, 95: 500.0, 96: 500.0, 97: 500.0, 98: 500.0, 99: 1000.0, 100: 1000.0, 101: 1000.0, 102: 1000.0, 103: 2400.0, 104: 2400.0, 105: 2400.0, 106: 2400.0, 107: 5.0, 108: 5.0, 109: 5.0, 110: 0.0, 111: 0.0, 112: 0.0, 113: 0.0, 114: 0.0, 115: 50.0, 116: 50.0, 117: 50.0, 118: 50.0, 119: 50.0, 120: 50.0, 121: 50.0, 122: 50.0, 123: 50.0, 124: 50.0, 125: 100.0, 126: 100.0, 127: 100.0, 128: 100.0, 129: 100.0, 130: 100.0, 131: 100.0, 132: 100.0, 133: 100.0, 134: 100.0, 135: 150.0, 136: 150.0, 137: 150.0, 138: 150.0, 139: 150.0, 140: 150.0, 141: 150.0, 142: 150.0, 143: 150.0, 144: 150.0, 145: 200.0, 146: 200.0, 147: 200.0, 148: 200.0, 149: 200.0, 150: 200.0, 151: 200.0, 152: 200.0, 153: 200.0, 154: 200.0, 155: 250.0, 156: 250.0, 157: 250.0, 158: 250.0, 159: 250.0, 160: 250.0, 161: 250.0, 162: 250.0, 163: 250.0, 164: 250.0, 165: 300.0, 166: 300.0, 167: 300.0, 168: 300.0, 169: 300.0, 170: 300.0, 171: 300.0, 172: 300.0, 173: 300.0, 174: 300.0, 175: 500.0, 176: 500.0, 177: 500.0, 178: 500.0, 179: 500.0, 180: 500.0, 181: 500.0, 182: 500.0, 183: 500.0, 184: 500.0, 185: 1000.0, 186: 1000.0, 187: 1000.0, 188: 1000.0, 189: 1000.0, 190: 1000.0, 191: 1000.0, 192: 1000.0, 193: 1000.0, 194: 1000.0, 195: 2400.0, 196: 2400.0, 197: 2400.0, 198: 2400.0, 199: 2400.0, 200: 2400.0, 201: 2400.0, 202: 2400.0, 203: 2400.0, 204: 2400.0, 205: 5.0, 206: 5.0, 207: 5.0}, 'Response': {0: '224', 1: '222', 2: '229', 3: '222', 4: '204', 5: '227', 6: '232', 7: '207', 8: '223', 9: '220', 10: '233', 11: '242', 12: '224', 13: '229', 14: '249', 15: '244', 16: '249', 17: '240', 18: '242', 19: '234', 20: '292', 21: '233', 22: '232', 23: '249', 24: '220', 25: '292', 26: '224', 27: '22 S', 28: ' 232 S', 29: ' 73 S', 30: '449', 31: '794', 32: '727', 33: '240', 34: '202', 35: '247', 36: '234', 37: '203', 38: '203', 39: '227', 40: '222', 41: '222', 42: '237', 43: '232', 44: '232', 45: '224', 46: '234', 47: '234', 48: '224', 49: '227', 50: '232', 51: '230', 52: 
'220', 53: '322', 54: '220', 55: '232', 56: '223', 57: '244 S', 58: ' 220 S', 59: ' 232 S', 60: '249 S', 61: ' 247 S', 62: ' 297 S', 63: '2423', 64: '2422', 65: '2090', 66: '234', 67: '232', 68: '242', 69: '249', 70: '242', 71: '232', 72: '234', 73: '270', 74: '234', 75: '234', 76: '247', 77: '222', 78: '224', 79: '240', 80: '247', 81: '294', 82: '242', 83: '277', 84: '224', 85: '273', 86: '242', 87: '224', 88: '239', 89: '273', 90: '292', 91: '292', 92: '243', 93: '224', 94: '204', 95: '244', 96: '202', 97: '200', 98: '242', 99: '223', 100: '242', 101: '222', 102: '222', 103: '244 S', 104: ' 203 S', 105: ' 200 S', 106: ' 222 S', 107: '2327', 108: '2222', 109: '2424', 110: '22', 111: '20', 112: '4', 113: '22', 114: '23', 115: '33', 116: '23', 117: '20', 118: '33', 119: '32', 120: '27', 121: '33', 122: '32', 123: '33', 124: '33', 125: '32', 126: '32', 127: '24', 128: '44', 129: '37', 130: '42', 131: '44', 132: '39', 133: '47', 134: '33', 135: '32', 136: '49', 137: '33', 138: '40', 139: '34', 140: '93', 141: '93', 142: '33', 143: '33', 144: '44', 145: '33', 146: '97', 147: '92', 148: '33', 149: '97', 150: '92', 151: '73', 152: '94', 153: '72', 154: '93', 155: '34', 156: '79', 157: '33', 158: '99', 159: '44', 160: '92', 161: '77', 162: '77', 163: '77', 164: '99', 165: '43', 166: '27', 167: '79', 168: '90', 169: '32', 170: '44', 171: '92', 172: '79', 173: '33', 174: '32', 175: '44', 176: '20', 177: '74', 178: '73', 179: '94', 180: '34', 181: '92', 182: '30', 183: '22', 184: '73', 185: '99 S', 186: ' 43 S', 187: ' 90 S', 188: ' 73 S', 189: ' 72 S', 190: ' 42 S', 191: ' 90 S', 192: ' 32 S', 193: ' 73 S', 194: ' 72 S', 195: '22 S', 196: ' 22 S', 197: ' 2 S', 198: ' 22 S', 199: ' 27 S', 200: ' 20 S', 201: ' 23 S', 202: ' 20 S', 203: ' 20 S', 204: ' 22 S', 205: '239', 206: '223', 207: '232'}}
I've tried a couple of things. I can split the 'Response' column on the commas and put the individual values into their own columns, but I cannot find a way to stack them in the order that I need. I have melt, long to wide and stack in mind.
Update: Tried using the string split and explode on this dataset:
{'Person': {0: 'Lucy', 1: 'Lucy', 2: 'Lucy', 3: 'Lucy', 4: 'Lucy', 5: 'Lucy', 6: 'Lucy', 7: 'Lucy', 8: 'Lucy', 9: 'Lucy', 10: 'Lucy'}, 'Compound': {0: 'Argon', 1: 'Argon', 2: 'Argon', 3: 'Argon', 4: 'Argon', 5: 'Argon', 6: 'Argon', 7: 'Argon', 8: 'Argon', 9: 'Argon', 10: 'Pos Con'}, 'Level': {0: 0, 1: 50, 2: 100, 3: 150, 4: 200, 5: 250, 6: 300, 7: 500, 8: 1000, 9: 2400, 10: 5}, 'Response': {0: '22, 20, 21', 1: '18, 26, 30', 2: '33, 41, 33', 3: '48, 31, 43', 4: '51, 43, 48', 5: '51, 50, 78', 6: '53 S, 39 S, 53 S', 7: '- T, - T, - T', 8: '- T, - T, - T', 9: '- T, - T, - T', 10: '2277, 1943, 1531'}}
However it returns this:
{'Person': {0: 'Lucy', 1: 'Lucy', 2: 'Lucy', 3: 'Lucy', 4: 'Lucy', 5: 'Lucy', 6: 'Lucy', 7: 'Lucy', 8: 'Lucy', 9: 'Lucy', 10: 'Lucy'}, 'Compound': {0: 'Argon', 1: 'Argon', 2: 'Argon', 3: 'Argon', 4: 'Argon', 5: 'Argon', 6: 'Argon', 7: 'Argon', 8: 'Argon', 9: 'Argon', 10: 'Pos Con'}, 'Level': {0: 0, 1: 50, 2: 100, 3: 150, 4: 200, 5: 250, 6: 300, 7: 500, 8: 1000, 9: 2400, 10: 5}, 'Response': {0: ['22', '20', '21'], 1: ['18', '26', '30'], 2: ['33', '41', '33'], 3: ['48', '31', '43'], 4: ['51', '43', '48'], 5: ['51', '50', '78'], 6: ['53 S', '39 S', '53 S'], 7: ['- T', '- T', '- T'], 8: ['- T', '- T', '- T'], 9: ['- T', '- T', '- T'], 10: ['2277', '1943', '1531']}}
Notice that the 'Response' column I want to explode isn't exploded: the dataframe comes back with the response values still in one row each, now wrapped in square brackets. Any ideas on this?
Any help most welcome, thank you in advance!
IIUC, you want to split and explode:
df["Response"] = df["Response"].str.split(", ")
output = df.explode("Response")
>>> output
    Person  Present/Absent  Test No  Humidity   Compound   Level  Response
0     Lucy          Absent        1     Humid      Argon     0.0       224
0     Lucy          Absent        1     Humid      Argon     0.0       222
0     Lucy          Absent        1     Humid      Argon     0.0       229
1     Lucy          Absent        1     Humid      Argon    50.0       222
1     Lucy          Absent        1     Humid      Argon    50.0       204
..     ...             ...      ...       ...        ...     ...       ...
42    John         Present        1     Humid      Argon  2400.0      20 S
42    John         Present        1     Humid      Argon  2400.0      22 S
43    John         Present        1     Humid  Pos Con 4     5.0       239
43    John         Present        1     Humid  Pos Con 4     5.0       223
43    John         Present        1     Humid  Pos Con 4     5.0       232
[208 rows x 7 columns]
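For the bracketed output in the update: that is what the frame looks like after the split but before the explode. A likely cause (an assumption, since the exact code that was run isn't shown) is that the result of explode was not assigned to anything; explode returns a new dataframe and does not modify df in place. A minimal sketch on a three-row stand-in built from the 'Lucy' sample values:

```python
import pandas as pd

# Small stand-in for the question's data (values copied from the 'Lucy' sample)
df = pd.DataFrame({
    "Person": ["Lucy", "Lucy", "Lucy"],
    "Compound": ["Argon", "Argon", "Pos Con"],
    "Level": [0, 300, 5],
    "Response": ["22, 20, 21", "53 S, 39 S, 53 S", "2277, 1943, 1531"],
})

# Step 1: split turns each string into a list (this is the bracketed output)
split_df = df.assign(Response=df["Response"].str.split(", "))

# Step 2: explode returns a NEW frame, so the result has to be kept
output = split_df.explode("Response")

print(output)
```

The original index is repeated for each exploded value (0, 0, 0, 1, 1, 1, ...); output.reset_index(drop=True) gives a fresh one if that is preferred.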
I have data with 220 rows. I first choose 5 rows at random and apply an operation to them. I now have to perform the same task on every (220 choose 5) combination, which means 4102565544 data frames of 5 rows each. Python runs into memory issues when I use list(itertools.combinations(list(range(0,222)),5)), and looping over each 5-row data frame is far too time-consuming. Below I have attached my data as a dictionary and replicated my problem.
Data
df={'Name': {0: '004737367A89', 1: '006D631822DA', 2: '007FEEEF095D', 3: '015EA8035B5D', 4: '0168C7824FB3', 5: '02236A01C769', 6: '026A35601C28', 7: '03939D273F7D', 8: '05BE3A6A6344', 9: '0735B7F399C8', 10: '075F90DEDAAC', 11: '079D00DB87B6', 12: '08321FDDA475', 13: '084147D3DE00', 14: '08693ADAF466', 15: '08EE69FF7C9B', 16: '0996F835D14B', 17: '0A061E004649', 18: '0BDADD43DF2D', 19: '0D580A803B2C', 20: '11DCF10E0F76', 21: '1241EC5AC73C', 22: '150595F71A7A', 23: '160D7B436114', 24: '1805135DA1B7', 25: '18D26316EA11', 26: '1B744908A7E9', 27: '1CB417508187', 28: '1EA75E92E370', 29: '1F1B4DA40CE4', 30: '209D86760A9C', 31: '228BC53DB280', 32: '235D0F9A5E0E', 33: '2452814BCC90', 34: '2923CA6C88B1', 35: '2CB60EF30BAA', 36: '2CD7BD1FC443', 37: '2D03FAC79D60', 38: '2F34FFA27A7C', 39: '2F8F282FDCEE', 40: '3.03891E+11', 41: '31B4A8BDBA5F', 42: '34EC4E7D8E15', 43: '3695444ADBFF', 44: '370F1D138305', 45: '3826943C86AF', 46: '39F11738A59D', 47: '39F2FF0A2E05', 48: '3A8B6F61E548', 49: '3B256CE48F60', 50: '3C09C2C73655', 51: '3D6858B43366', 52: '3D94154B544C', 53: '3DDD62DDF6C4', 54: '3EBDAFB8E7EE', 55: '408B3D0EAF85', 56: '40ED913F4BB6', 57: '43380E855E4E', 58: '44C8332521DE', 59: '4817047FFAC1', 60: '481896BC4240', 61: '49263E82B2B8', 62: '4AF76F8D6BBB', 63: '4BC2016E5222', 64: '4CCF2D4FF5EC', 65: '4E9750936994', 66: '4F61F6A5588D', 67: '505F16F25595', 68: '50756E6D3B32', 69: '50E1E1F5F31D', 70: '516B4C9C3F45', 71: '52608C24A09E', 72: '52B2EBC622A6', 73: '539B8164BD32', 74: '5462E581A288', 75: '55149C502434', 76: '55D8B9306A65', 77: '5808368AFA0A', 78: '58F6BA305E2A', 79: '58FE73C690DA', 80: '596857EDC73F', 81: '599DF7F0CB41', 82: '59F1F27E85F4', 83: '5AE11428142F', 84: '5B27B574EA5B', 85: '5D3FA98DDD61', 86: '5DE6CFC7E471', 87: '5DF85F5EA21C', 88: '5EA87B759595', 89: '5EAA2E0BEAA2', 90: '5EAFEBA99A30', 91: '5EFC03FC84DF', 92: '5F6A8D18E234', 93: '6008B6021BAA', 94: '63765F49AC32', 95: '64099F419232', 96: '652349DF5059', 97: '6551FB43EE37', 98: '6613C12B0634', 99: 
'66C312BFDFD6', 100: '66D964D2E1D0', 101: '6790A35547E2', 102: '67A2603888E5', 103: '6991A9411704', 104: '6CFC28C22836', 105: '6D5DAED137C9', 106: '6EBB87FAD022', 107: '6EF1206450AF', 108: '70C74C90C3E2', 109: '71168B36CCFD', 110: '7177392ADD8B', 111: '74AF6AA78FB9', 112: '759CFBB05E2F', 113: '771E8EA5A4C7', 114: '7740740D57BE', 115: '7926DFB85C8B', 116: '7A6091203844', 117: '7C23D53CE5DD', 118: '7C4ED1AA239F', 119: '7E0C21E0010F', 120: '80E9914A0BF8', 121: '82867FEAF519', 122: '82C735B34C85', 123: '85EF1FFBAC47', 124: '872F22A4D018', 125: '87C72000AAB2', 126: '8978B70E88C3', 127: '8ADEF3F17E42', 128: '8B5F4EE22DF5', 129: '8B757ED14D67', 130: '8E0C10341AA8', 131: '90289E4E68F6', 132: '9259DEED6524', 133: '92754763710B', 134: '92B164934E01', 135: '96DBA1873BFF', 136: '97E7144ECEF9', 137: '9AE4EB9DF4F0', 138: '9CAC53908EE1', 139: '9F31161E7BDF', 140: 'A090B8A939CB', 141: 'A12E89E87CB5', 142: 'A31CA572620F', 143: 'A4263AA51F9A', 144: 'A540D6615FA0', 145: 'A56804CE6BAF', 146: 'A60313C4FC06', 147: 'A612803F81BA', 148: 'A77E12FFA171', 149: 'A87B6602E946', 150: 'AADE28D99973', 151: 'AEB37BE9DBFF', 152: 'B04ACAB6A193', 153: 'B41004303288', 154: 'B454AAFDA2AF', 155: 'B701B4E2F2BF', 156: 'B7EF621EC0AE', 157: 'B9084B8E2378', 158: 'BA8C4B0E8378', 159: 'BBD01B2776A8', 160: 'BE5377A632DF', 161: 'BE8D95B26DEE', 162: 'BEEB25AC3BB3', 163: 'BF585F42B5F6', 164: 'BF889C615B6A', 165: 'C1934D47BC69', 166: 'C31934680839', 167: 'C43F40D3D865', 168: 'C4955BCC1F0C', 169: 'C4F03F22DE3E', 170: 'C5BC9B26046C', 171: 'C5D2BE738C56', 172: 'C762399CAF83', 173: 'C7B9B444D117', 174: 'C943B9F6FDDF', 175: 'C9C7138CAF65', 176: 'CB66BE597E30', 177: 'CC7DA44E344E', 178: 'CE81A7E65B6B', 179: 'CE971F87D0B5', 180: 'CECC8C16ECAB', 181: 'D111860A3AC1', 182: 'D159C02757AE', 183: 'D33BB70DCA77', 184: 'D386F0671D80', 185: 'D43B801CCCA9', 186: 'D465BE3D4A94', 187: 'D49E08EEC650', 188: 'D4BD5D5DD7E4', 189: 'D64F455CB56A', 190: 'D6D99F00B58B', 191: 'D7774555E609', 192: 'D7CDFD417C01', 193: 'DBF16B9938A4', 194: 
'DCC2FA798C09', 195: 'DE6E090827B8', 196: 'E25F5A55A4D8', 197: 'E5A82C4E86C7', 198: 'E5AC30A8337B', 199: 'E6EBC0EFBF18', 200: 'EB9BBBA2FEB9', 201: 'EC8A20CAC153', 202: 'EC8EA44FDACD', 203: 'ECB284CBDDA7', 204: 'EED0F8B3B968', 205: 'EF4B578B0902', 206: 'F13986786A7A', 207: 'F17F0E81FC73', 208: 'F34CFBCB7A28', 209: 'F396C1E8BF59', 210: 'F40ED923507F', 211: 'F87A72CF9671', 212: 'F8CDE15A2FCB', 213: 'F9032EE897A9', 214: 'FAC08B5AA521', 215: 'FB3071FBA3BC', 216: 'FC6435726337', 217: 'FD5F2F4D32D7', 218: 'FD6E925243AA', 219: 'FDA85734568D', 220: 'FF18E7D41654', 221: 'FFEC03758A05'}, 'Code': {0: 375000, 1: 275000, 2: 225000, 3: 275000, 4: 175000, 5: 275000, 6: 295000, 7: 525000, 8: 175000, 9: 135000, 10: 275000, 11: 250000, 12: 275000, 13: 350000, 14: 225000, 15: 175000, 16: 395000, 17: 275000, 18: 225000, 19: 195000, 20: 225000, 21: 175000, 22: 135000, 23: 225000, 24: 250000, 25: 225000, 26: 250000, 27: 295000, 28: 275000, 29: 250000, 30: 275000, 31: 250000, 32: 295000, 33: 195000, 34: 275000, 35: 195000, 36: 275000, 37: 175000, 38: 525000, 39: 225000, 40: 350000, 41: 135000, 42: 295000, 43: 195000, 44: 495000, 45: 495000, 46: 275000, 47: 375000, 48: 295000, 49: 250000, 50: 250000, 51: 225000, 52: 175000, 53: 250000, 54: 475000, 55: 135000, 56: 350000, 57: 225000, 58: 250000, 59: 275000, 60: 225000, 61: 295000, 62: 225000, 63: 250000, 64: 225000, 65: 250000, 66: 135000, 67: 175000, 68: 295000, 69: 175000, 70: 295000, 71: 295000, 72: 225000, 73: 225000, 74: 365000, 75: 295000, 76: 225000, 77: 195000, 78: 225000, 79: 225000, 80: 225000, 81: 295000, 82: 135000, 83: 195000, 84: 295000, 85: 550000, 86: 250000, 87: 225000, 88: 275000, 89: 225000, 90: 295000, 91: 250000, 92: 250000, 93: 225000, 94: 175000, 95: 250000, 96: 175000, 97: 350000, 98: 175000, 99: 275000, 100: 295000, 101: 225000, 102: 225000, 103: 195000, 104: 175000, 105: 350000, 106: 175000, 107: 275000, 108: 275000, 109: 175000, 110: 195000, 111: 225000, 112: 275000, 113: 375000, 114: 135000, 115: 135000, 116: 
395000, 117: 295000, 118: 195000, 119: 275000, 120: 195000, 121: 375000, 122: 195000, 123: 275000, 124: 275000, 125: 175000, 126: 325000, 127: 275000, 128: 250000, 129: 135000, 130: 175000, 131: 195000, 132: 550000, 133: 225000, 134: 250000, 135: 350000, 136: 495000, 137: 275000, 138: 135000, 139: 175000, 140: 175000, 141: 225000, 142: 175000, 143: 275000, 144: 325000, 145: 295000, 146: 275000, 147: 275000, 148: 175000, 149: 350000, 150: 550000, 151: 250000, 152: 350000, 153: 325000, 154: 175000, 155: 250000, 156: 175000, 157: 250000, 158: 275000, 159: 225000, 160: 195000, 161: 175000, 162: 225000, 163: 275000, 164: 225000, 165: 135000, 166: 250000, 167: 225000, 168: 175000, 169: 275000, 170: 175000, 171: 275000, 172: 175000, 173: 195000, 174: 325000, 175: 275000, 176: 295000, 177: 350000, 178: 350000, 179: 425000, 180: 225000, 181: 135000, 182: 150000, 183: 135000, 184: 350000, 185: 225000, 186: 375000, 187: 175000, 188: 295000, 189: 195000, 190: 350000, 191: 175000, 192: 225000, 193: 195000, 194: 195000, 195: 350000, 196: 250000, 197: 175000, 198: 175000, 199: 395000, 200: 175000, 201: 225000, 202: 175000, 203: 350000, 204: 175000, 205: 250000, 206: 375000, 207: 275000, 208: 525000, 209: 175000, 210: 375000, 211: 295000, 212: 275000, 213: 175000, 214: 325000, 215: 250000, 216: 195000, 217: 275000, 218: 250000, 219: 135000, 220: 195000, 221: 135000}}
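What I want is to select 5 random rows first:
import random
import pandas as pd

data = pd.DataFrame(df)
# Pick 5 random row labels (here from rows 10 to 29) and keep their names
inputt = pd.DataFrame({"NameID": data.Name[random.sample(range(10, 30), 5)]})
# Collect the rows of data whose Name matches each sampled NameID
parts = []
for i in range(len(inputt.index)):
    D1 = data[data["Name"] == inputt["NameID"].iloc[i]]
    parts.append(D1)
# pd.concat replaces DataFrame.append, which was removed in pandas 2.0
D2 = pd.concat(parts)
values = D2.Code
real_sum = values.sum()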
and then I want to perform the same operation on the remaining combinations of rows in the data frame and find out which of those 5-row data frames have a sum less than real_sum. Is there a simulation technique I can apply here, or anything else?
Thanks
To avoid the memory issues, you don't need to hold all of the combinations in memory at once. You can be "lazy" about it and produce each one only when it is needed. Enter lazy evaluation:
In programming language theory, lazy evaluation, or call-by-need, is an evaluation strategy which delays the evaluation of an expression until its value is needed.
https://en.wikipedia.org/wiki/Lazy_evaluation
This means that you don't need to evaluate the result from the combinations completely at first, but only when needed:
import itertools
# This will create an iterator (not the whole list)
combos = itertools.combinations(list(range(0,222)),5)
and use it afterwards like this:
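data = pd.DataFrame(df)
for combo in combos:
    inputt = pd.DataFrame({"NameID": data.Name[list(combo)]})
    # Rebuild D2 for each combination so rows do not pile up across iterations
    parts = []
    for i in range(len(inputt.index)):
        D1 = data[data["Name"] == inputt["NameID"].iloc[i]]
        parts.append(D1)
    # pd.concat replaces DataFrame.append, which was removed in pandas 2.0
    D2 = pd.concat(parts)
    values = D2.Code
    combo_sum = values.sum()  # compare this with real_sum from the question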
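The iterator fixes the memory problem but not the run time: the posted dictionary has keys 0 to 221, i.e. 222 rows, so the loop would still visit math.comb(222, 5) = 4294249674 combinations (4102565544 for the 220 rows mentioned in the text). Since the question asks about a simulation technique, here is a rough sketch of a Monte Carlo estimate, which is a different technique from the full enumeration above: draw random 5-row subsets and look at the fraction whose sum falls below real_sum. The codes array below is a synthetic stand-in (an assumption) for data["Code"].to_numpy(), and the seed and n_draws are arbitrary illustrative choices.

```python
import math
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility

# Hypothetical stand-in for data["Code"].to_numpy(); swap in the real column
codes = rng.choice([135000, 175000, 225000, 275000, 350000], size=222)

n_rows, k = len(codes), 5
total_combos = math.comb(n_rows, k)  # number of distinct 5-row subsets

# Reference sum from one random 5-row sample, as in the question
real_sum = codes[rng.choice(n_rows, size=k, replace=False)].sum()

# Draw many random 5-row subsets (no repeated rows within a subset)
n_draws = 20_000
sums = np.array([
    codes[rng.choice(n_rows, size=k, replace=False)].sum()
    for _ in range(n_draws)
])

# Share of sampled subsets with a sum below real_sum, scaled up to a count
frac_below = (sums < real_sum).mean()
est_count_below = frac_below * total_combos

print(frac_below, est_count_below)
```

This only gives an estimate of how many combinations fall below real_sum, not the list of them; more draws tighten the estimate at the cost of run time.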