I am trying to display a band between an upper and a lower curve, but I keep getting an error. I am using matplotlib.pyplot's fill_between.
Error Message:
ValueError: x and y must have same first dimension, but have shapes (1,) and (448, 1)
plt.xlim([0, 45.5])
plt.ylim([-16.50, 15.000])
plt.plot(len(X)-1, y_predicted, 'b', linewidth=1)
plt.fill_between(X, y_predit_lower, y_predit_upper, alpha=0.1, color='green')
plt.plot(X, y_predit_upper, 'g', linewidth=1, linestyle='--')
plt.plot(X, y_predit_lower, 'g', linewidth=1, linestyle='--')
plt.axhline(y=10.70, color='r', linestyle='-.',linewidth=2)
plt.xticks(np.arange(0, 45.5, 1.0), fontsize=8)
plt.yticks(np.arange(-16.50, 15.00, 0.50), fontsize=8)
plt.title("Pressure Gradient Valve Size (27mm)")
plt.xlabel("Time (sec)")
plt.ylabel("Pressure (mmHg)")
plt.grid()
plt.show()
For my x I am using the values from a column of a DataFrame:
X = df_train['Time'].to_numpy('float')
This is the line of code that gives me the error:
plt.fill_between(X, y_predit_lower, y_predit_upper, alpha=0.1, color='green')
Error Message I get:
ValueError: x and y must have same first dimension, but have shapes (1,) and (448, 1)
In: print(X.shape)
Out: (448,)
In: print(y_predit_upper.shape)
Out: (448,)
In: print(y_predit_lower.shape)
Out: (448,)
In: print(X)
Out:
[ 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. 1.1 1.2 1.3 1.4
1.5 1.6 1.7 1.8 1.9 2. 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8
2.9 3. 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 4. 4.1 4.2
4.3 4.4 4.5 4.6 4.7 4.8 4.9 5. 5.1 5.2 5.3 5.4 5.5 5.6
5.7 5.8 5.9 6. 6.1 6.2 6.3 6.4 6.5 6.6 6.7 6.8 6.9 7.
7.1 7.2 7.3 7.4 7.5 7.6 7.7 7.8 7.9 8. 8.1 8.2 8.3 8.4
8.5 8.6 8.7 8.8 8.9 9. 9.1 9.2 9.3 9.4 9.5 9.6 9.7 9.8
9.9 10. 10.1 10.2 10.3 10.4 10.5 10.6 10.7 10.8 10.9 11. 11.1 11.2
11.3 11.4 11.5 11.6 11.7 11.8 11.9 12. 12.1 12.2 12.3 12.4 12.5 12.6
12.7 12.8 12.9 13. 13.1 13.2 13.3 13.4 13.5 13.6 13.7 13.8 13.9 14.
14.1 14.2 14.3 14.4 14.5 14.6 14.7 14.8 14.9 15. 15.1 15.2 15.3 15.4
15.5 15.6 15.7 15.8 15.9 16. 16.1 16.2 16.3 16.4 16.5 16.6 16.7 16.8
16.9 17. 17.1 17.2 17.3 17.4 17.5 17.6 17.7 17.8 17.9 18. 18.1 18.2
18.3 18.4 18.5 18.6 18.7 18.8 18.9 19. 19.1 19.2 19.3 19.4 19.5 19.6
19.7 19.8 19.9 20. 20.1 20.2 20.3 20.4 20.5 20.6 20.7 20.8 20.9 21.
21.1 21.2 21.3 21.4 21.5 21.6 21.7 21.8 21.9 22. 22.1 22.2 22.3 22.4
22.5 22.6 22.7 22.8 22.9 23. 23.1 23.2 23.3 23.4 23.5 23.6 23.7 23.8
23.9 24. 24.1 24.2 24.3 24.4 24.5 24.6 24.7 24.8 24.9 25. 25.1 25.2
25.3 25.4 25.5 25.6 25.7 25.8 25.9 26. 26.1 26.2 26.3 26.4 26.5 26.6
26.7 26.8 26.9 27. 27.1 27.2 27.3 27.4 27.5 27.6 27.7 27.8 27.9 28.
28.1 28.2 28.3 28.4 28.5 28.6 28.7 28.8 28.9 29. 29.1 29.2 29.3 29.4
29.5 29.6 29.7 29.8 29.9 30. 30.1 30.2 30.3 30.4 30.5 30.6 30.7 30.8
30.9 31. 31.1 31.2 31.3 31.4 31.5 31.6 31.7 31.8 31.9 32. 32.1 32.2
32.3 32.4 32.5 32.6 32.7 32.8 32.9 33. 33.1 33.2 33.3 33.4 33.5 33.6
33.7 33.8 33.9 34. 34.1 34.2 34.3 34.4 34.5 34.6 34.7 34.8 34.9 35.
35.1 35.2 35.3 35.4 35.5 35.6 35.7 35.8 35.9 36. 36.1 36.2 36.3 36.4
36.5 36.6 36.7 36.8 36.9 37. 37.1 37.2 37.3 37.4 37.5 37.6 37.7 37.8
37.9 38. 38.1 38.2 38.3 38.4 38.5 38.6 38.7 38.8 38.9 39. 39.1 39.2
39.3 39.4 39.5 39.6 39.7 39.8 39.9 40. 40.1 40.2 40.3 40.4 40.5 40.6
40.7 40.8 40.9 41. 41.1 41.2 41.3 41.4 41.5 41.6 41.7 41.8 41.9 42.
42.1 42.2 42.3 42.4 42.5 42.6 42.7 42.8 42.9 43. 43.1 43.2 43.3 43.4
43.5 43.6 43.7 43.8 43.9 44. 44.1 44.2 44.3 44.4 44.5 44.6 44.7 44.8]
In: print(y_predit_upper)
Out:
[-10.920185 -10.730879 -10.395649 -9.781197 -8.639384
-6.6007776 -3.5282364 -0.28529644 2.1445403 3.7071989
4.679699 5.2941465 5.695036 5.9663973 6.157011
6.2958107 6.400445 6.4820027 6.5476484 6.6021276
6.648663 6.6894903 6.72619 6.7599044 6.7914734
6.821528 6.8505487 6.8789124 6.906917 6.934807
6.962783 6.991015 7.01965 7.0488157 7.0786242
7.1091766 7.140561 7.1728573 7.2061357 7.2404547
7.275866 7.31241 7.3501153 7.389001 7.4290714
7.4703183 7.512725 7.5562544 7.600858 7.646476
7.6930337 7.740444 7.7886114 7.8374276 7.886779
7.9365463 7.986604 8.036831 8.087101 8.137291
8.187288 8.236979 8.286261 8.335035 8.383221
8.430737 8.477519 8.52351 8.568661 8.61293
8.65629 8.698717 8.740196 8.780713 8.82027
8.858864 8.8965 8.933188 8.96894 9.003771
9.037693 9.070728 9.102894 9.134211 9.1647
9.19438 9.223274 9.251405 9.2787895 9.305452
9.331414 9.356693 9.381311 9.405285 9.428637
9.451382 9.47354 9.495129 9.516164 9.536662
9.556641 9.576112 9.595093 9.613596 9.631638
9.649229 9.666382 9.683111 9.69943 9.715345
9.730873 9.746021 9.760802 9.775225 9.7893
9.803036 9.8164425 9.829529 9.842304 9.854776
9.866953 9.878841 9.890451 9.901789 9.912861
9.923676 9.9342375 9.944555 9.954636 9.964481
9.974102 9.983503 9.992689 10.001665 10.010438
10.019011 10.0273905 10.03558 10.043587 10.051413
10.059064 10.066545 10.07386 10.081012 10.088005
10.094845 10.101532 10.108074 10.114471 10.120729
10.126851 10.132839 10.138698 10.144428 10.150038
10.155523 10.1608925 10.166145 10.171288 10.176319
10.181244 10.186064 10.190783 10.1954 10.199922
10.204349 10.20868 10.212924 10.217076 10.221144
10.225128 10.229028 10.232847 10.236588 10.2402525
10.243841 10.247356 10.2508 10.254173 10.257479
10.2607155 10.263888 10.266995 10.270041 10.273026
10.275951 10.278817 10.281626 10.284379 10.287078
10.289722 10.292316 10.294858 10.297348 10.299793
10.302189 10.304538 10.30684 10.309098 10.311312
10.313485 10.315615 10.317703 10.319752 10.321762
10.323734 10.325667 10.327566 10.329426 10.331253
10.333044 10.334803 10.336527 10.338222 10.339883
10.341513 10.343113 10.344685 10.346227 10.34774
10.349226 10.350685 10.352116 10.353521 10.354902
10.356257 10.357589 10.358896 10.36018 10.36144
10.362679 10.363894 10.365089 10.366262 10.367414
10.368546 10.369658 10.370752 10.371826 10.372882
10.373919 10.374937 10.375938 10.376922 10.37789
10.3788395 10.379774 10.380693 10.381596 10.3824835
10.383356 10.384214 10.385057 10.385887 10.3867035
10.387505 10.388294 10.3890705 10.3898325 10.390584
10.391323 10.39205 10.392763 10.393466 10.394157
10.394838 10.395506 10.396166 10.396815 10.397452
10.398081 10.398699 10.399307 10.399906 10.400494
10.401075 10.401647 10.402208 10.402762 10.403307
10.403845 10.404373 10.404894 10.405405 10.4059105
10.406408 10.4068985 10.40738 10.407856 10.408323
10.408785 10.409239 10.409686 10.410128 10.410563
10.410991 10.411414 10.411829 10.412239 10.412644
10.413042 10.413435 10.413822 10.414204 10.414579
10.414951 10.415317 10.415678 10.416034 10.416384
10.416731 10.417071 10.417408 10.41774 10.418068
10.418391 10.41871 10.4190235 10.419333 10.41964
10.419942 10.42024 10.420534 10.420824 10.42111
10.421393 10.421673 10.421947 10.422218 10.422487
10.422752 10.423013 10.423271 10.423527 10.423778
10.4240265 10.424272 10.424515 10.424753 10.42499
10.425224 10.425454 10.425682 10.425907 10.426128
10.426348 10.426565 10.426779 10.4269905 10.4272
10.427407 10.427612 10.427814 10.428013 10.428211
10.428406 10.428598 10.428788 10.428976 10.429163
10.429345 10.429527 10.4297085 10.429886 10.430061
10.430235 10.430407 10.430576 10.430744 10.43091
10.431075 10.431237 10.431398 10.431557 10.431714
10.4318695 10.432024 10.432177 10.432327 10.432476
10.432623 10.43277 10.432913 10.433057 10.433199
10.433338 10.433476 10.433613 10.433748 10.433882
10.434015 10.434147 10.434278 10.434406 10.434532
10.43466 10.434786 10.434908 10.43503 10.435152
10.435272 10.43539 10.435509 10.435625 10.435741
10.435854 10.435968 10.43608 10.436192 10.436301
10.43641 10.436517 10.436625 10.43673 10.436835
10.436939 10.437042 10.437143 10.437244 10.4373455
10.437443 10.437542 10.437639 10.437736 10.437831
10.437925 10.43802 10.438112 10.438204 10.438295
10.438386 10.438476 10.438565 10.438652 10.438741
10.438827 10.438912 10.438997 10.438997 10.438997
10.438997 10.438997 10.438997 10.438997 10.438997
10.438997 10.438997 10.438997 ]
In: print(y_predit_lower)
Out:
[-1.4231827e+01 -1.4231827e+01 -1.4231827e+01 -1.4231827e+01
-1.4231827e+01 -1.4231827e+01 -1.4231827e+01 -1.4231827e+01
-1.4231827e+01 -1.4231827e+01 -1.4231827e+01 -1.4229019e+01
-1.4225172e+01 -1.4219831e+01 -1.4212313e+01 -1.4201557e+01
-1.4185902e+01 -1.4162625e+01 -1.4127128e+01 -1.4071282e+01
-1.3980150e+01 -1.3825264e+01 -1.3550985e+01 -1.3048252e+01
-1.2114041e+01 -1.0446091e+01 -7.9321928e+00 -5.2788787e+00
-3.2908306e+00 -2.0122919e+00 -1.2166102e+00 -7.1388006e-01
-3.8587976e-01 -1.6385698e-01 -7.9002380e-03 1.0566306e-01
1.9127297e-01 2.5800228e-01 3.1171203e-01 3.5628605e-01
3.9436054e-01 4.2776465e-01 4.5779157e-01 4.8537636e-01
5.1120520e-01 5.3579545e-01 5.5953956e-01 5.8274651e-01
6.0565925e-01 6.2847805e-01 6.5136743e-01 6.7446661e-01
6.9789529e-01 7.2175813e-01 7.4614692e-01 7.7114415e-01
7.9682255e-01 8.2324672e-01 8.5047436e-01 8.7855363e-01
9.0752649e-01 9.3742609e-01 9.6827579e-01 1.0000916e+00
1.0328760e+00 1.0666242e+00 1.1013203e+00 1.1369352e+00
1.1734290e+00 1.2107525e+00 1.2488456e+00 1.2876358e+00
1.3270454e+00 1.3669858e+00 1.4073639e+00 1.4480829e+00
1.4890399e+00 1.5301342e+00 1.5712638e+00 1.6123290e+00
1.6532354e+00 1.6938915e+00 1.7342124e+00 1.7741199e+00
1.8135443e+00 1.8524213e+00 1.8906975e+00 1.9283261e+00
1.9652677e+00 2.0014882e+00 2.0369649e+00 2.0716777e+00
2.1056142e+00 2.1387653e+00 2.1711292e+00 2.2027068e+00
2.2335000e+00 2.2635174e+00 2.2927685e+00 2.3212667e+00
2.3490210e+00 2.3760500e+00 2.4023676e+00 2.4279904e+00
2.4529357e+00 2.4772196e+00 2.5008607e+00 2.5238762e+00
2.5462823e+00 2.5680971e+00 2.5893383e+00 2.6100211e+00
2.6301632e+00 2.6497784e+00 2.6688843e+00 2.6874938e+00
2.7056236e+00 2.7232871e+00 2.7404976e+00 2.7572689e+00
2.7736154e+00 2.7895460e+00 2.8050761e+00 2.8202147e+00
2.8349762e+00 2.8493690e+00 2.8634038e+00 2.8770909e+00
2.8904424e+00 2.9034648e+00 2.9161687e+00 2.9285626e+00
2.9406557e+00 2.9524565e+00 2.9639721e+00 2.9752111e+00
2.9861798e+00 2.9968877e+00 3.0073400e+00 3.0175438e+00
3.0275064e+00 3.0372334e+00 3.0467329e+00 3.0560083e+00
3.0650678e+00 3.0739164e+00 3.0825577e+00 3.0909996e+00
3.0992465e+00 3.1073031e+00 3.1151748e+00 3.1228657e+00
3.1303816e+00 3.1377258e+00 3.1449032e+00 3.1519175e+00
3.1587734e+00 3.1654744e+00 3.1720252e+00 3.1784282e+00
3.1846886e+00 3.1908092e+00 3.1967940e+00 3.2026453e+00
3.2083673e+00 3.2139630e+00 3.2194352e+00 3.2247872e+00
3.2300220e+00 3.2351422e+00 3.2401505e+00 3.2450500e+00
3.2498436e+00 3.2545323e+00 3.2591219e+00 3.2636104e+00
3.2680025e+00 3.2723012e+00 3.2765074e+00 3.2806249e+00
3.2846537e+00 3.2885976e+00 3.2924585e+00 3.2962365e+00
3.2999353e+00 3.3035574e+00 3.3071012e+00 3.3105736e+00
3.3139715e+00 3.3172994e+00 3.3205595e+00 3.3237495e+00
3.3268752e+00 3.3299356e+00 3.3329334e+00 3.3358698e+00
3.3387456e+00 3.3415632e+00 3.3443227e+00 3.3470273e+00
3.3496766e+00 3.3522720e+00 3.3548145e+00 3.3573065e+00
3.3597488e+00 3.3621411e+00 3.3644867e+00 3.3667846e+00
3.3690367e+00 3.3712454e+00 3.3734097e+00 3.3755307e+00
3.3776112e+00 3.3796487e+00 3.3816485e+00 3.3836088e+00
3.3855305e+00 3.3874140e+00 3.3892622e+00 3.3910737e+00
3.3928509e+00 3.3945937e+00 3.3963022e+00 3.3979788e+00
3.3996229e+00 3.4012361e+00 3.4028187e+00 3.4043717e+00
3.4058938e+00 3.4073887e+00 3.4088535e+00 3.4102931e+00
3.4117036e+00 3.4130898e+00 3.4144497e+00 3.4157834e+00
3.4170928e+00 3.4183784e+00 3.4196396e+00 3.4208779e+00
3.4220929e+00 3.4232869e+00 3.4244580e+00 3.4256082e+00
3.4267378e+00 3.4278469e+00 3.4289360e+00 3.4300056e+00
3.4310555e+00 3.4320869e+00 3.4331002e+00 3.4340954e+00
3.4350724e+00 3.4360318e+00 3.4369760e+00 3.4379010e+00
3.4388113e+00 3.4397054e+00 3.4405851e+00 3.4414482e+00
3.4422965e+00 3.4431300e+00 3.4439492e+00 3.4447536e+00
3.4455457e+00 3.4463229e+00 3.4470873e+00 3.4478393e+00
3.4485774e+00 3.4493046e+00 3.4500179e+00 3.4507203e+00
3.4514103e+00 3.4520888e+00 3.4527564e+00 3.4534125e+00
3.4540586e+00 3.4546938e+00 3.4553175e+00 3.4559317e+00
3.4565368e+00 3.4571309e+00 3.4577150e+00 3.4582901e+00
3.4588556e+00 3.4594121e+00 3.4599595e+00 3.4604993e+00
3.4610300e+00 3.4615517e+00 3.4620652e+00 3.4625711e+00
3.4630685e+00 3.4635587e+00 3.4640403e+00 3.4645157e+00
3.4649830e+00 3.4654431e+00 3.4658966e+00 3.4663415e+00
3.4667811e+00 3.4672141e+00 3.4676394e+00 3.4680586e+00
3.4684715e+00 3.4688792e+00 3.4692798e+00 3.4696741e+00
3.4700637e+00 3.4704461e+00 3.4708233e+00 3.4711957e+00
3.4715614e+00 3.4719224e+00 3.4722786e+00 3.4726286e+00
3.4729748e+00 3.4733148e+00 3.4736505e+00 3.4739819e+00
3.4743066e+00 3.4746289e+00 3.4749455e+00 3.4752569e+00
3.4755650e+00 3.4758687e+00 3.4761677e+00 3.4764628e+00
3.4767542e+00 3.4770417e+00 3.4773250e+00 3.4776039e+00
3.4778790e+00 3.4781504e+00 3.4784188e+00 3.4786835e+00
3.4789438e+00 3.4792008e+00 3.4794540e+00 3.4797049e+00
3.4799519e+00 3.4801960e+00 3.4804363e+00 3.4806743e+00
3.4809079e+00 3.4811401e+00 3.4813676e+00 3.4815922e+00
3.4818144e+00 3.4820347e+00 3.4822516e+00 3.4824648e+00
3.4826765e+00 3.4828849e+00 3.4830904e+00 3.4832940e+00
3.4834948e+00 3.4836936e+00 3.4838891e+00 3.4840827e+00
3.4842734e+00 3.4844618e+00 3.4846487e+00 3.4848323e+00
3.4850144e+00 3.4851933e+00 3.4853716e+00 3.4855466e+00
3.4857192e+00 3.4858909e+00 3.4860601e+00 3.4862275e+00
3.4863930e+00 3.4865561e+00 3.4867177e+00 3.4868770e+00
3.4870348e+00 3.4871902e+00 3.4873438e+00 3.4874964e+00
3.4876461e+00 3.4877954e+00 3.4879422e+00 3.4880881e+00
3.4882312e+00 3.4883738e+00 3.4885139e+00 3.4886527e+00
3.4887900e+00 3.4889264e+00 3.4890614e+00 3.4891939e+00
3.4893255e+00 3.4894552e+00 3.4895840e+00 3.4897108e+00
3.4898376e+00 3.4899626e+00 3.4900851e+00 3.4902072e+00
3.4903274e+00 3.4904480e+00 3.4905648e+00 3.4906826e+00
3.4907985e+00 3.4909120e+00 3.4910259e+00 3.4911375e+00
3.4912481e+00 3.4913583e+00 3.4914665e+00 3.4915743e+00
3.4916811e+00 3.4917865e+00 3.4918904e+00 3.4919934e+00
3.4920964e+00 3.4921975e+00 3.4922967e+00 3.4923968e+00
3.4924951e+00 3.4925923e+00 3.4926887e+00 3.4927831e+00
3.4928784e+00 3.4929719e+00 3.4930644e+00 3.4931564e+00
3.4932475e+00 3.4933367e+00 3.4934263e+00 3.4935131e+00
3.4936023e+00 3.4936886e+00 3.4937744e+00 3.4938593e+00
3.4939427e+00 3.4940267e+00 3.4941092e+00 3.4941907e+00
3.4942718e+00 3.4943528e+00 3.4944315e+00 3.4945107e+00
3.4945889e+00 3.4946666e+00 3.4947433e+00 3.4948187e+00]
I don't know what I am missing here with the data structure.
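One thing worth checking, judging only from the shapes in the message: (1,) and (448, 1) do not match the three arrays passed to fill_between, which all print as (448,). They do match the earlier call plt.plot(len(X)-1, y_predicted, ...), where len(X)-1 is a single number and y_predicted appears to be a (448, 1) array. The sketch below uses made-up stand-in data (the tanh curve and the +/- 2.0 band are assumptions, not the real predictions) to reproduce the message and then plot with the full X array instead.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Stand-in data (assumed): 448 time steps and a model output of shape (448, 1).
X = np.arange(1, 449) / 10.0
y_predicted = (10.0 * np.tanh(X - 3.0)).reshape(-1, 1)
y_predit_upper = y_predicted.ravel() + 2.0
y_predit_lower = y_predicted.ravel() - 2.0

# len(X) - 1 is one number, so Matplotlib treats x as an array of shape (1,).
try:
    plt.plot(len(X) - 1, y_predicted, 'b', linewidth=1)
    message = None
except ValueError as err:
    message = str(err)
print(message)

# Passing the whole X array (and a flattened y) gives matching first dimensions.
fig, ax = plt.subplots()
ax.plot(X, y_predicted.ravel(), 'b', linewidth=1)
band = ax.fill_between(X, y_predit_lower, y_predit_upper, alpha=0.1, color='green')
```

If the traceback in the real script points at the plt.plot line rather than at fill_between, replacing len(X)-1 with X is probably the only change needed.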
I get the data when I use print, but when I build a Pandas DataFrame the result is: Empty DataFrame, Columns: [], Index: []
Script:
from bs4 import BeautifulSoup
import requests
import re
import json
import pandas as pd
url='http://financials.morningstar.com/finan/financials/getFinancePart.html?&callback=jsonp1640132253903&t=XNAS:AAPL'
req=requests.get(url).text
#print(req)
data=re.search(r'jsonp1640132253903\((\{.*\})\)',req).group(1)
json_data=json.loads(data)['componentData']
#print(json_data)
# with open('index.html','w') as f:
# f.write(json_data)
soup=BeautifulSoup(json_data,'lxml')
for tr in soup.select('tr'):
    row_data=[td.get_text(strip=True) for td in tr.select('td,th') if td.text]
    if not row_data:
        continue
    if len(row_data) < 12:
        row_data = ['Particulars'] + row_data
    #print(row_data)
df=pd.DataFrame(row_data)
print(df)
Print result:
['Particulars', '2012-09', '2013-09', '2014-09', '2015-09', '2016-09', '2017-09', '2018-09', '2019-09', '2020-09', '2021-09', 'TTM']
['RevenueUSD Mil', '156,508', '170,910', '182,795', '233,715', '215,639', '229,234', '265,595', '260,174', '274,515', '365,817', '365,817']
['Gross Margin %', '43.9', '37.6', '38.6', '40.1', '39.1', '38.5', '38.3', '37.8', '38.2', '41.8', '41.8']
['Operating IncomeUSD Mil', '55,241', '48,999', '52,503', '71,230', '60,024', '61,344', '70,898', '63,930', '66,288', '108,949', '108,949']
['Operating Margin %', '35.3', '28.7', '28.7', '30.5', '27.8', '26.8', '26.7', '24.6', '24.1', '29.8', '29.8']
['Net IncomeUSD Mil', '41,733', '37,037', '39,510', '53,394', '45,687', '48,351', '59,531', '55,256', '57,411', '94,680', '94,680']
['Earnings Per ShareUSD', '1.58', '1.42', '1.61', '2.31', '2.08', '2.30', '2.98', '2.97', '3.28', '5.61', '5.61']
Expected output:
2012-09 2013-09 2014-09 2015-09 2016-09 2017-09 2018-09 2019-09 2020-09 2021-09 TTM
Revenue USD Mil 156,508 170,910 182,795 233,715 215,639 229,234 265,595 260,174 274,515 365,817 365,817
Gross Margin % 43.9 37.6 38.6 40.1 39.1 38.5 38.3 37.8 38.2 41.8 41.8
Operating Income USD Mil 55,241 48,999 52,503 71,230 60,024 61,344 70,898 63,930 66,288 108,949 108,949
Operating Margin % 35.3 28.7 28.7 30.5 27.8 26.8 26.7 24.6 24.1 29.8 29.8
Net Income USD Mil 41,733 37,037 39,510 53,394 45,687 48,351 59,531 55,256 57,411 94,680 94,680
Earnings Per Share USD 1.58 1.42 1.61 2.31 2.08 2.30 2.98 2.97 3.28 5.61 5.61
Dividends USD 0.09 0.41 0.45 0.49 0.55 0.60 0.68 0.75 0.80 0.85 0.85
Payout Ratio % * — 27.4 28.5 22.3 24.8 26.5 23.7 25.1 23.7 16.3 15.2
Shares Mil 26,470 26,087 24,491 23,172 22,001 21,007 20,000 18,596 17,528 16,865 16,865
Book Value Per Share * USD 4.25 4.90 5.15 5.63 5.93 6.46 6.04 5.43 4.26 3.91 3.85
Operating Cash Flow USD Mil 50,856 53,666 59,713 81,266 65,824 63,598 77,434 69,391 80,674 104,038 104,038
Cap Spending USD Mil -9,402 -9,076 -9,813 -11,488 -13,548 -12,795 -13,313 -10,495 -7,309 -11,085 -11,085
Free Cash Flow USD Mil 41,454 44,590 49,900 69,778 52,276 50,803 64,121 58,896 73,365 92,953 92,953
Free Cash Flow Per Share * USD 1.58 1.61 1.93 2.96 2.24 2.41 2.88 3.07 4.04 5.57 —
Working Capital USD Mil 19,111 29,628 5,083 8,768 27,863 27,831 14,473 57,101 38,321 9,355
Expected columns:
'Particulars', '2012-09', '2013-09', '2014-09', '2015-09', '2016-09', '2017-09', '2018-09', '2019-09', '2020-09', '2021-09', 'TTM'
@QHarr's answer is by far the most straightforward, but in case you are wondering what is wrong with your code: you are overwriting the variable row_data on every iteration of the loop, so only the last row is left when the DataFrame is built.
To make your code work, you can instead store each row as an element in a list. Then to build a DataFrame, you can pass this list of rows and the column names to pd.DataFrame:
data = []
soup=BeautifulSoup(json_data,'lxml')
for tr in soup.select('tr'):
    row_data=[td.get_text(strip=True) for td in tr.select('td,th') if td.text]
    if not row_data:
        continue
    elif len(row_data) < 12:
        columns = ['Particulars'] + row_data
    else:
        data.append(row_data)
df=pd.DataFrame(data, columns=columns)
Result:
>>> df
Particulars 2012-09 2013-09 2014-09 2015-09 2016-09 2017-09 2018-09 2019-09 2020-09 2021-09 TTM
0 RevenueUSD Mil 156,508 170,910 182,795 233,715 215,639 229,234 265,595 260,174 274,515 365,817 365,817
1 Gross Margin % 43.9 37.6 38.6 40.1 39.1 38.5 38.3 37.8 38.2 41.8 41.8
2 Operating IncomeUSD Mil 55,241 48,999 52,503 71,230 60,024 61,344 70,898 63,930 66,288 108,949 108,949
3 Operating Margin % 35.3 28.7 28.7 30.5 27.8 26.8 26.7 24.6 24.1 29.8 29.8
4 Net IncomeUSD Mil 41,733 37,037 39,510 53,394 45,687 48,351 59,531 55,256 57,411 94,680 94,680
5 Earnings Per ShareUSD 1.58 1.42 1.61 2.31 2.08 2.30 2.98 2.97 3.28 5.61 5.61
6 DividendsUSD 0.09 0.41 0.45 0.49 0.55 0.60 0.68 0.75 0.80 0.85 0.85
7 Payout Ratio % * — 27.4 28.5 22.3 24.8 26.5 23.7 25.1 23.7 16.3 15.2
8 SharesMil 26,470 26,087 24,491 23,172 22,001 21,007 20,000 18,596 17,528 16,865 16,865
9 Book Value Per Share *USD 4.25 4.90 5.15 5.63 5.93 6.46 6.04 5.43 4.26 3.91 3.85
10 Operating Cash FlowUSD Mil 50,856 53,666 59,713 81,266 65,824 63,598 77,434 69,391 80,674 104,038 104,038
11 Cap SpendingUSD Mil -9,402 -9,076 -9,813 -11,488 -13,548 -12,795 -13,313 -10,495 -7,309 -11,085 -11,085
12 Free Cash FlowUSD Mil 41,454 44,590 49,900 69,778 52,276 50,803 64,121 58,896 73,365 92,953 92,953
13 Free Cash Flow Per Share *USD 1.58 1.61 1.93 2.96 2.24 2.41 2.88 3.07 4.04 5.57 —
14 Working CapitalUSD Mil 19,111 29,628 5,083 8,768 27,863 27,831 14,473 57,101 38,321 9,355 —
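The same collect-then-build pattern can be tried without any network access. The sketch below hard-codes a few rows and columns copied from the printed output in the question (truncated to three years for brevity). The names rows, header and body are illustrative, and converting the strings to numbers is an optional extra step that is not part of the answer above.

```python
import pandas as pd

# A few rows copied from the printed output; the first one is the header row.
rows = [
    ['Particulars', '2012-09', '2013-09', '2014-09'],
    ['RevenueUSD Mil', '156,508', '170,910', '182,795'],
    ['Gross Margin %', '43.9', '37.6', '38.6'],
]

# Keep every row in a list, then build the DataFrame once after the loop.
header, *body = rows
df = pd.DataFrame(body, columns=header).set_index('Particulars')

# Optional: strip the thousands separators and convert the text to numbers.
df = df.apply(lambda col: pd.to_numeric(col.str.replace(',', '', regex=False)))
print(df)
```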
Use pd.read_html to create the DataFrame, then drop the rows that are entirely NaN:
json_data=json.loads(data)['componentData']
pd.read_html(json_data)[0].dropna(axis=0, how='all')
I have a Series like this:
{lag1mid_quoteDiff: 1.51
lag1TradeDirection: 2.12
lag2mid_quoteDiff: 1.53
lag2TradeDirection: 2.18
lag3mid_quoteDiff: 1.59
lag3TradeDirection: 2.10}
I need a DataFrame with two columns, lagmid_quoteDiff and lagTradeDirection, and 3 rows indexed 1, 2, 3:
lagmid_quoteDiff lagTradeDirection
1 1.51 2.12
2 1.53 2.18
3 1.59 2.10
How can I do this?
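Try crosstab after extracting the lag number from the Series index (note that str.replace needs regex=True in pandas 2.0 and later, where the default is a literal match):
s = pd.Series(data)
s1 = s.index.str.extract(r'(\d+)')[0]
out = pd.crosstab(index = s1, columns = s.index.str.replace(r'\d+', '', regex=True), values = s.values, aggfunc = 'sum')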
out
col_0 lagTradeDirection lagmid_quoteDiff
0
1 2.12 1.51
2 2.18 1.53
3 2.10 1.59
If you can ensure that the order of your dictionary entries is consistent:
import pandas as pd
import numpy as np
data = {"lag1mid_quoteDiff": 1.51,
"lag1TradeDirection": 2.12,
"lag2mid_quoteDiff": 1.53,
"lag2TradeDirection": 2.18,
"lag3mid_quoteDiff": 1.59,
"lag3TradeDirection": 2.10}
data = np.array(list(data.values()))
df = pd.DataFrame(data.reshape(-1, 2), columns=["lagmid_quoteDiff", "lagTradeDirection"])
print(df)
lagmid_quoteDiff lagTradeDirection
0 1.51 2.12
1 1.53 2.18
2 1.59 2.10
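The question asks for an index of 1, 2, 3, while the reshape above yields 0, 1, 2. A minimal sketch of one way to get the requested index, under the same assumption that the dictionary order is consistent (the variable name values is illustrative):

```python
import numpy as np
import pandas as pd

data = {"lag1mid_quoteDiff": 1.51,
        "lag1TradeDirection": 2.12,
        "lag2mid_quoteDiff": 1.53,
        "lag2TradeDirection": 2.18,
        "lag3mid_quoteDiff": 1.59,
        "lag3TradeDirection": 2.10}

# Two values per lag, in the order mid_quoteDiff then TradeDirection.
values = np.array(list(data.values())).reshape(-1, 2)

# Number the rows from 1 so that the index matches the lag number.
df = pd.DataFrame(values,
                  columns=["lagmid_quoteDiff", "lagTradeDirection"],
                  index=np.arange(1, len(values) + 1))
print(df)
```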
If you cannot guarantee the order of your dictionary entries, try this:
import pandas as pd
import numpy as np
data = {"lag1mid_quoteDiff": 1.51,
"lag1TradeDirection": 2.12,
"lag2mid_quoteDiff": 1.53,
"lag2TradeDirection": 2.18,
"lag3mid_quoteDiff": 1.59,
"lag3TradeDirection": 2.10}
df = pd.DataFrame(data, index=[0]).melt()
df = (df["variable"].str.extract(r"(?P<lag_n>\d+)(?P<type>\w+)")
      .join(df)
      .pivot(index="lag_n", columns="type", values="value")
      .rename_axis(columns=None))
print(df)
TradeDirection mid_quoteDiff
lag_n
1 2.12 1.51
2 2.18 1.53
3 2.10 1.59
Not quite as elegant, but gets the job done:
import re
import pandas as pd

series = pd.Series({
    'lag1mid_quoteDiff': 1.51,
    'lag1TradeDirection': 2.12,
    'lag2mid_quoteDiff': 1.53,
    'lag2TradeDirection': 2.18,
    'lag3mid_quoteDiff': 1.59,
    'lag3TradeDirection': 2.10
})
unpack = {}
# Series.iteritems() was removed in pandas 2.0; items() is the equivalent.
for k, v in series.items():
    re_match = re.match(r'(lag)(\d+)(.*)', k)
    try:
        index_num = int(re_match.group(2))
        col = re_match.group(1)+re_match.group(3)
        if unpack.get(col):
            unpack[col][index_num] = v
        else:
            unpack[col] = {index_num: v}
    except Exception as e:
        raise ValueError(f"Provided key is incorrect format: {k}") from e
df = pd.DataFrame(unpack)
lagmid_quoteDiff lagTradeDirection
1 1.51 2.12
2 1.53 2.18
3 1.59 2.10
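For comparison, a more compact variant of the same parse-the-key idea, offered as a sketch rather than a drop-in replacement: split each key into a lag number and a column name, use the two as a MultiIndex, and unstack. The group names lag and name and the explicit column order are choices made for this example.

```python
import pandas as pd

s = pd.Series({
    'lag1mid_quoteDiff': 1.51,
    'lag1TradeDirection': 2.12,
    'lag2mid_quoteDiff': 1.53,
    'lag2TradeDirection': 2.18,
    'lag3mid_quoteDiff': 1.59,
    'lag3TradeDirection': 2.10,
})

# Split 'lag<N><rest>' into the lag number and the remainder of the key.
parts = s.index.str.extract(r'lag(?P<lag>\d+)(?P<name>.+)')

# Index the values by (lag number, column name), then pivot the names into columns.
s.index = pd.MultiIndex.from_arrays([parts['lag'].astype(int), 'lag' + parts['name']])
out = s.unstack()

# unstack sorts the columns alphabetically, so restore the requested order.
out = out[['lagmid_quoteDiff', 'lagTradeDirection']].rename_axis(index=None, columns=None)
print(out)
```

Because the lag number is read from each key, this does not depend on the order of the entries.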