Issues Querying and Downloading Sentinel-3 OLCI Data with Sentinelsat - python

I am working with Sentinel-3 OLCI Level-2 data products through the Sentinelsat API and am having trouble with my query and with exceeding my data download quota. Ultimately, I would like to write a program that accepts a date range and a specific geographic location, then downloads a dataframe of all values in the "Oa04_radiance" band within the specified dates for that location. This is what I have so far:
from sentinelsat import SentinelAPI, read_geojson, geojson_to_wkt
from datetime import date
from geojson import Feature, Point, Polygon
api = SentinelAPI('user', 'password', 'https://apihub.copernicus.eu/apihub')
lon = -123.312383
lat = 49.319269
my_point = Point((lon, lat))
footprint = geojson_to_wkt(my_point)
products = api.query(footprint,
                     date=(date(2021, 1, 1), date(2021, 6, 15)),
                     platformname='Sentinel-3',
                     producttype='OL_2_LRR___',
                     cloudcoverpercentage=(0, 80))
products_df = api.to_dataframe(products)
api.download_all(products_df.index)
Error Output:
Traceback (most recent call last):
  File "C:/Users/t7dej/Desktop/Turbid Time Local/SentSat/SenSat_mdl.py", line 48, in <module>
    api.download_all(products_df_sorted.index)
  File "E:\Software\Anaconda\lib\site-packages\sentinelsat\sentinel.py", line 723, in download_all
    is_online = not self.trigger_offline_retrieval(pid)
  File "E:\Software\Anaconda\lib\site-packages\sentinelsat\sentinel.py", line 636, in trigger_offline_retrieval
    raise LTAError(msg, r)
sentinelsat.exceptions.LTAError: HTTP status 403 Forbidden: User quota exceeded: MediaRegulationException : An exception occured while creating a stream: Maximum number of 4 concurrent flows achieved by the user
Even when I pass limit=1 to api.query(), I receive this error message. The products_df is 173 MB and has a geometry column with the following value:
MULTIPOLYGON (((-146.081 -49.2196, -145.768 -48.2668, -145.201 -46.4727, -144.658 -44.6765, -144.135 -42.8782, -143.63 -41.0787, -143.142 -39.2767, -142.667 -37.4733, -142.204 -35.6694, -141.753 -33.863, -141.312 -32.0559, -140.878 -30.2474, -140.453 -28.4377, -140.033 -26.6268, -139.62 -24.8151, -139.211 -23.0024, -138.806 -21.1887, -138.404 -19.3744, -138.006 -17.5593, -137.609 -15.7434, -137.213 -13.927, -136.819 -12.1101, -136.425 -10.2928, -136.031 -8.47512, -135.636 -6.65724, -135.24 -4.83889, -134.843 -3.02078, -134.443 -1.2025, -134.04 0.61575, -133.634 2.43352, -133.224 4.25148, -132.81 6.06894, -132.391 7.88578, -131.965 9.702120000000001, -131.534 11.5179, -131.095 13.3329, -130.649 15.1468, -130.194 16.9598, -129.729 18.7716, -129.253 20.582, -128.767 22.3916, -128.267 24.1992, -127.753 26.0043, -127.224 27.8086, -126.678 29.6103, -126.113 31.4098, -125.527 33.2067, -124.919 35.0007, -124.286 36.7919, -123.624 38.5795, -122.932 40.3637, -122.205 42.1436, -121.44 43.9188, -120.631 45.6892, -119.773 47.4541, -118.861 49.2124, -117.886 50.9639, -116.841 52.7073, -115.714 54.4414, -114.495 56.1652, -113.168 57.8769, -111.717 59.5747, -110.12 61.256, -108.352 62.9179, -106.382 64.5568, -104.173 66.16800000000001, -101.677 67.7456, -98.8378 69.2821, -95.5855 70.7672, -91.83369999999999 72.1889, -87.4884 73.5312, -82.44029999999999 74.7715, -76.5813 75.8849, -69.83199999999999 76.83880000000001, -62.1773 77.59399999999999, -53.7161 78.1165, -44.6976 78.3725, -35.5051 78.3451, -26.5677 78.03619999999999, -18.2457 77.46639999999999, -10.7612 76.6694, -4.18714 75.6829, 3.64587789990069e-15 74.8448206506109, 1.50843 74.5429, 6.41354 73.28100000000001, 10.6367 71.9226, 14.2845 70.4875, 17.4519 68.99160000000001, 20.2206 67.4469, 21.6077 67.77589999999999, 23.0264 68.09050000000001, 24.4837 68.3925, 25.9603 68.679, 27.4944 68.9546, 29.0662 69.2161, 30.6951 69.46680000000001, 32.3417 69.6985, 34.024 69.9145, 35.7316 70.10760000000001, 37.4774 70.2903, 39.2531 
70.4558, 41.0752 70.6046, 42.9054 70.7343, 44.7579 70.8455, 46.6297 70.938, 48.5174 71.01139999999999, 50.4176 71.0655, 52.3019 71.10290000000001, 52.0832 72.8877, 51.8771 74.6721, 51.6876 76.4562, 51.5212 78.2398, 51.3914 80.0231, 51.3212 81.80629999999999, 51.3582 83.58880000000001, 51.57959163346614 85.05115000000001, 3.911836325497215e-15 85.05115000000001, -133.5599156744917 85.05115000000001, -133.35 83.92870000000001, -133.276 82.14530000000001, -133.33 80.3614, -133.453 78.57859999999999, -133.611 76.7949, -133.799 75.0104, -134.002 73.226, -134.218 71.4409, -134.445 69.6553, -134.678 67.8693, -134.918 66.0826, -135.163 64.2954, -135.413 62.5074, -135.666 60.719, -135.923 58.93, -136.183 57.1401, -136.447 55.3495, -136.712 53.5582, -136.982 51.7664, -137.253 49.9738, -137.528 48.1799, -137.805 46.3858, -138.085 44.5911, -138.368 42.7954, -138.653 40.9991, -138.942 39.2019, -139.234 37.4039, -139.528 35.6058, -139.826 33.8069, -140.128 32.0072, -140.433 30.207, -140.741 28.4055, -141.054 26.6057, -141.37 24.8042, -141.691 23.0012, -142.016 21.1992, -142.346 19.3968, -142.68 17.5942, -143.02 15.7915, -143.365 13.9887, -143.716 12.1859, -144.073 10.3832, -144.436 8.58104, -144.806 6.77937, -145.183 4.97715, -145.567 3.17639, -145.96 1.37623, -146.361 -0.423407, -146.77 -2.22189, -147.19 -4.01968, -147.62 -5.81594, -148.06 -7.61083, -148.512 -9.40438, -148.976 -11.1962, -149.454 -12.9862, -149.946 -14.7742, -150.453 -16.5599, -150.975 -18.343, -151.516 -20.1234, -152.076 -21.901, -152.655 -23.6751, -153.257 -25.4461, -153.883 -27.2129, -154.534 -28.9754, -155.214 -30.7335, -155.923 -32.487, -156.667 -34.2331, -157.447 -35.9749, -158.267 -37.7106, -159.131 -39.4374, -160.043 -41.1566, -161.008 -42.867, -162.033 -44.5672, -162.604 -45.4664, -161.805 -45.7119, -160.989 -45.9615, -160.165 -46.2051, -159.335 -46.4427, -158.497 -46.6742, -157.652 -46.8994, -156.79 -47.1211, -155.932 -47.3333, -155.068 -47.539, -154.207 -47.742, -153.329 -47.9343, -152.444 -48.1198, 
-151.538 -48.3011, -150.642 -48.4725, -149.738 -48.6368, -148.842 -48.7908, -147.928 -48.9408, -147.008 -49.0835, -146.081 -49.2196)))
I have specified a GeoJSON Point object in the products query and am wondering why it returns such a large MultiPolygon object in products_df. I suspect this is why products_df is so large and my quota is exceeded. Does anyone have any recommendations for this? Also, is it possible to query only the specific band 'Oa04_radiance' before downloading, since I do not need any of the other bands from the Sentinel-3 OLCI Level-2 data products?
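The traceback above is raised inside trigger_offline_retrieval, which suggests that the 403 is triggered while requesting archived (offline) products rather than by the size of products_df. A minimal sketch under that assumption: cap the number of products and download only those the hub reports as online. The helper name download_online_only and its max_products parameter are hypothetical; api is assumed to be a SentinelAPI instance providing is_online() and download().

```python
def download_online_only(api, product_ids, max_products=1):
    """Download at most `max_products` products that are already online.

    `api` is assumed to behave like sentinelsat.SentinelAPI, i.e. to provide
    is_online(product_id) and download(product_id). Offline (archived)
    products are skipped instead of being requested from the archive.
    """
    downloaded, skipped = [], []
    for product_id in product_ids:
        if len(downloaded) >= max_products:
            break
        if api.is_online(product_id):
            api.download(product_id)
            downloaded.append(product_id)
        else:
            # Do not trigger an archive retrieval for this product.
            skipped.append(product_id)
    return downloaded, skipped
```

It could be called as download_online_only(api, products_df.index, max_products=1) in place of api.download_all(products_df.index). Whether this avoids the quota error depends on the hub's limits, which are not verified here.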

Related

List of coordinates to WKB hex

I am having some trouble loading GeoJSON data into Athena. I am trying to load world boundary data (from this link).
For some reason, AWS Athena only lets me load Polygon country data, not MultiPolygon (loading done based on this). As a result, only the first Polygon of each MultiPolygon is loaded.
Because of this, I am looking for another way to load the data. I have thought of the following:
Build a DataFrame with the name of the country, the list of coordinates and the corresponding type (Polygon or MultiPolygon). Depending on the type, generate a new column in the WKB hex format accepted by Athena. Then generate a partitioned Parquet file and try to load it into Athena.
Any thoughts on how to do this?
Coordinates are coming in this format:
Polygon:
[[[-17.102499999999907, 32.823330000000055], [-17.05305999999996, 32.80944000000005], [-17.029999999999916, 32.810560000000066], [-17.013199999999927, 32.81346000000008], [-16.959999999999923, 32.830820000000074], [-16.913329999999917, 32.83916000000005], [-16.902219999999943, 32.83791000000008], [-16.715559999999925, 32.758890000000065], [-16.720839999999953, 32.74528000000004], [-16.819579999999917, 32.64611000000008], [-16.82188999999994, 32.64435000000009], [-16.83916999999991, 32.638610000000085], [-16.94360999999992, 32.637500000000045], [-16.988329999999905, 32.65527000000009], [-17.067779999999914, 32.67694000000006], [-17.102499999999907, 32.68333000000007], [-17.15805999999992, 32.709160000000054], [-17.19596999999993, 32.72888000000006], [-17.206389999999942, 32.73750000000007], [-17.232779999999934, 32.77000000000004], [-17.239169999999945, 32.77833000000004], [-17.254519999999957, 32.81284000000005], [-17.190699999999936, 32.86861000000005], [-17.169719999999927, 32.87028000000004], [-17.15888999999993, 32.865270000000066], [-17.151949999999943, 32.85750000000007], [-17.13333999999992, 32.838880000000074], [-17.12527999999992, 32.83194000000009], [-17.102499999999907, 32.823330000000055]]]
MultiPolygon:
[[[[111.41152000000005, 2.376390000000071], [111.36804000000006, 2.359580000000051], [111.35081000000008, 2.371940000000052], [111.30393000000004, 2.450000000000045], [111.30554000000006, 2.466390000000046], [111.31184000000007, 2.497220000000084], [111.30497000000003, 2.594440000000077], [111.29525000000007, 2.6808300000000713], [111.29332000000005, 2.73333000000008], [111.29540000000003, 2.748470000000054], [111.30359000000004, 2.767640000000085], [111.31248000000005, 2.775830000000041], [111.32895000000008, 2.780350000000055], [111.34539000000007, 2.771810000000073], [111.35443000000004, 2.760000000000047], [111.37776000000008, 2.708050000000071], [111.38165000000004, 2.698330000000055], [111.38388000000003, 2.680000000000062], [111.38388000000003, 2.652500000000088], [111.38026000000008, 2.615000000000066], [111.37693000000007, 2.536110000000064], [111.37692000000004, 2.495280000000093], [111.37831000000006, 2.482780000000048], [111.38109000000003, 2.471940000000074], [111.39943000000005, 2.408050000000059], [111.40583000000004, 2.393890000000056], [111.41152000000005, 2.376390000000071]]], [[[104.21191000000005, 2.7113900000000513], [104.16775000000007, 2.705280000000073], [104.15554000000003, 2.710550000000069], [104.14165000000003, 2.730830000000082], [104.12941000000006, 2.753890000000069], [104.12497000000008, 2.770000000000038], [104.12329000000005, 2.781670000000076], [104.16948000000008, 2.892500000000041], [104.18385000000006, 2.879440000000045], [104.18794000000003, 2.86779000000007], [104.18941000000007, 2.866390000000081], [104.19912000000005, 2.841940000000079], [104.20415000000003, 2.826670000000092], [104.21719000000007, 2.785560000000089], [104.21970000000005, 2.774440000000083], [104.22073000000006, 2.721870000000081], [104.21191000000005, 2.7113900000000513]]], [[[117.6869200000001, 4.168340000000057], [117.64373, 4.212990000000047], [117.63890000000004, 4.22861000000006], [117.64972, 4.238050000000044], [117.68581000000006, 
4.259880000000066], [117.72388000000001, 4.260000000000048], [117.7460900000001, 4.258330000000058], [117.7602700000001, 4.255280000000084], [117.76971000000003, 4.25111000000004], [117.78554000000008, 4.239440000000059], [117.90356000000008, 4.174040000000048], [117.87944000000005, 4.173050000000046], [117.83971000000008, 4.171940000000063], [117.72664000000009, 4.169720000000041], [117.6869200000001, 4.168340000000057]]], [[[118.59604000000002, 4.638330000000053], [118.5681800000001, 4.599440000000072], [118.52970000000005, 4.600280000000055], [118.34987000000001, 4.67205000000007], [118.39462000000003, 4.676050000000089], [118.47887000000003, 4.689170000000047], [118.57640000000004, 4.650830000000042], [118.59604000000002, 4.638330000000053]]], [[[100.31762000000003, 5.335420000000056], [100.28704000000005, 5.254440000000045], [100.20386000000008, 5.271110000000078], [100.19609000000003, 5.293330000000083], [100.17968000000008, 5.427500000000066], [100.18580000000003, 5.462290000000053], [100.24774000000008, 5.466670000000079], [100.26110000000006, 5.466940000000079], [100.27275000000003, 5.464720000000057], [100.29413000000005, 5.458050000000071], [100.30525000000006, 5.451940000000093], [100.30827000000005, 5.4461200000000645], [100.32025000000004, 5.427780000000041], [100.32469000000003, 5.38083000000006], [100.31762000000003, 5.335420000000056]]], [[[99.86954000000003, 6.419300000000078], [99.89386000000007, 6.4019400000000815], [99.91545000000008, 6.387080000000083], [99.92302000000007, 6.333610000000078], [99.87162000000006, 6.288190000000043], [99.73956000000004, 6.248890000000074], [99.65694000000008, 6.360830000000078], [99.65221000000008, 6.369720000000086], [99.64695000000006, 6.385000000000048], [99.64193000000006, 6.407220000000052], [99.64222000000007, 6.422080000000051], [99.70332000000008, 6.426110000000051], [99.71666000000005, 6.425000000000068], [99.72747000000004, 6.4222200000000385], [99.73220000000003, 6.413050000000055], 
[99.74942000000004, 6.407780000000059], [99.79413000000005, 6.411670000000072], [99.85295000000008, 6.464150000000075], [99.85913000000005, 6.436390000000074], [99.86954000000003, 6.419300000000078]]], [[[102.09523000000007, 6.236140000000091], [102.12302000000005, 6.218050000000062], [102.16666000000004, 6.193610000000092], [102.18588000000005, 6.205730000000074], [102.22165000000007, 6.217500000000086], [102.31303000000003, 6.189440000000047], [102.33388000000008, 6.175550000000044], [102.35860000000008, 6.151670000000081], [102.38540000000006, 6.116530000000068], [102.41193000000004, 6.070830000000058], [102.43274000000008, 6.020000000000039], [102.48803000000004, 5.902640000000076], [102.50166000000007, 5.882080000000087], [102.53972000000005, 5.853170000000091], [102.58136000000007, 5.828610000000083], [102.60524000000004, 5.812220000000082], [102.62345000000005, 5.795690000000093], [102.64609000000007, 5.763470000000041], [102.66573000000005, 5.729860000000087], [102.84220000000005, 5.589300000000037], [102.87690000000003, 5.56833000000006], [102.92135000000007, 5.547430000000077], [102.96087000000006, 5.537010000000066], [103.03775000000007, 5.47694000000007], [103.06442000000004, 5.44805000000008], [103.09607000000005, 5.410280000000057], [103.12191000000007, 5.377360000000067], [103.18192000000005, 5.282780000000059], [103.20832000000007, 5.240550000000042], [103.22803000000005, 5.205550000000073], [103.24025000000006, 5.176940000000059], [103.24802000000005, 5.157500000000084], [103.26554000000004, 5.113330000000076], [103.27913000000007, 5.086110000000076], [103.35608000000008, 4.949170000000038], [103.40998000000008, 4.858050000000048], [103.44107000000008, 4.765280000000075], [103.45497000000006, 4.626940000000047], [103.45380000000006, 4.484790000000089], [103.46498000000003, 4.411670000000072], [103.47635000000008, 4.374720000000082], [103.48553000000004, 4.349860000000092], [103.49371000000008, 4.308750000000089], [103.48650000000004, 
4.280550000000062], [103.46637000000004, 4.234440000000063], [103.44656000000003, 4.16644000000008], [103.41414000000003, 4.150760000000048], [103.39595000000003, 4.111390000000085], [103.39413000000008, 4.085000000000093], [103.40107000000006, 4.057780000000093], [103.40912000000003, 4.032220000000052], [103.41385000000008, 3.958890000000053], [103.37636000000003, 3.863060000000075], [103.33636000000007, 3.74410000000006], [103.37108000000006, 3.642780000000073], [103.38498000000004, 3.622220000000084], [103.42497000000003, 3.57278000000008], [103.44830000000007, 3.548060000000077], [103.46303000000006, 3.531810000000064], [103.47643000000005, 3.498680000000092], [103.45775000000003, 3.472780000000057], [103.44163000000003, 3.440140000000042], [103.42636000000005, 3.392500000000041], [103.42580000000004, 3.371670000000051], [103.42719000000005, 3.329720000000065], [103.43065000000007, 3.309720000000084], [103.43941000000007, 3.280560000000093], [103.44442000000004, 3.258330000000057], [103.45137000000005, 3.218330000000037], [103.45442000000003, 3.180970000000059], [103.45137000000005, 3.148610000000076], [103.44551000000007, 3.126530000000059], [103.43469000000005, 3.052780000000041], [103.43247000000008, 2.962220000000059], [103.43802000000005, 2.925830000000076], [103.45135000000005, 2.888190000000065], [103.46942000000007, 2.857780000000048], [103.48246000000006, 2.8366700000000833], [103.50165000000004, 2.808050000000094], [103.52246000000008, 2.780830000000094], [103.55469000000005, 2.7443100000000413], [103.59220000000005, 2.706110000000079], [103.61885000000007, 2.680560000000071], [103.63702000000006, 2.6640900000000443], [103.65331000000003, 2.660830000000089], [103.72775000000007, 2.640000000000043], [103.76526000000007, 2.625830000000064], [103.82025000000004, 2.575900000000047], [103.83386000000007, 2.517360000000053], [103.82706000000007, 2.476390000000037], [103.83637000000004, 2.455140000000085], [103.89832000000007, 2.385830000000055], 
[103.97746000000006, 2.243050000000039], [104.01969000000003, 2.136670000000038], [104.05830000000003, 2.059170000000051], [104.11580000000004, 1.966110000000071], [104.18830000000003, 1.8050000000000632], [104.22331000000003, 1.7179200000000492], [104.25359000000003, 1.633330000000057], [104.29329000000007, 1.437780000000088], ...]], [[[117.4333200000001, 6.628330000000062], [117.41235000000006, 6.6258300000000645], [117.35068000000001, 6.6409700000000385], [117.33859000000007, 6.649440000000084], [117.33527000000004, 6.658890000000042], [117.34274000000005, 6.672530000000052], [117.3986000000001, 6.676940000000059], [117.41278, 6.680000000000064], [117.42720000000008, 6.686110000000042], [117.43776000000003, 6.69264000000004], [117.44456000000002, 6.706110000000081], [117.43859000000009, 6.718330000000037], [117.42302000000007, 6.721670000000074], [117.40332000000001, 6.723120000000051], [117.40012000000002, 6.736800000000073], [117.4104000000001, 6.7470800000000395], [117.4679000000001, 6.761390000000063], [117.49192000000005, 6.744300000000067], [117.51486, 6.705830000000049], [117.51555000000008, 6.689440000000047], [117.51277000000005, 6.678610000000049], [117.50074000000006, 6.663330000000087], [117.46693000000005, 6.642500000000041], [117.45388000000003, 6.635000000000048], [117.44414000000006, 6.631110000000092], [117.4333200000001, 6.628330000000062]]], [[[117.59206000000006, 4.1698200000000725], [117.5288700000001, 4.175280000000043], [117.49775, 4.178890000000081], [117.47554000000002, 4.183610000000044], [117.45276000000001, 4.18861000000004], [117.43943000000002, 4.195830000000058], [117.42442000000005, 4.214720000000057], [117.4202600000001, 4.224720000000047], [117.41081000000008, 4.242500000000064], [117.40179, 4.254580000000089], [117.25179000000003, 4.353890000000092], [117.23944000000006, 4.35833000000008], [117.22360000000003, 4.358750000000043], [117.21111000000008, 4.354440000000068], [117.2027700000001, 4.345970000000079], 
[117.19179000000008, 4.336250000000064], [117.17665000000011, 4.335550000000069], [117.04387000000008, 4.336940000000084], [116.9369200000001, 4.350830000000087], [116.92442000000005, 4.352220000000045], [116.91053000000011, 4.352220000000045], [116.88916000000006, 4.349300000000085], [116.87693000000002, 4.345280000000059], [116.83985000000007, 4.33042000000006], [116.68877000000009, 4.33075000000008], [116.64194000000009, 4.334720000000061], [116.53360000000009, 4.330830000000049], [116.52762000000007, 4.319300000000055], [116.46582000000001, 4.294030000000077], [116.45416, 4.294580000000053], [116.4416500000001, 4.299440000000061], [116.42581000000007, 4.310000000000059], [116.31470000000002, 4.357780000000048], [116.15332000000001, 4.355000000000075], [116.1402700000001, 4.337220000000059], [116.07416, 4.283610000000067], [116.06110000000001, 4.2779200000000515], [116.04762000000005, 4.281110000000069], [116.03088000000002, 4.300760000000082], [116.0041500000001, 4.330000000000041], [115.98665000000005, 4.339720000000057], [115.97609, 4.343050000000062], [115.88804000000005, 4.36812000000009], [115.87248, 4.361110000000053], [115.85755000000006, 4.344100000000083], [115.85054000000002, 4.32167000000004], [115.7702700000001, 4.244720000000086], [115.76139, 4.239720000000091], [115.69081000000006, 4.180280000000039], [115.68331, 4.167360000000087], [115.6597200000001, 4.108600000000081], [115.66110000000003, 4.097780000000057], [115.66249000000005, 4.078330000000051], [115.61638000000005, 3.854170000000067], [115.58528000000001, 3.741670000000056], [115.57693000000006, 3.708610000000078], [115.57139000000006, 3.66611000000006], [115.57138000000009, 3.612780000000043], [115.57416, 3.594720000000052], [115.57887000000005, 3.585550000000069], [115.60193000000004, 3.539170000000069], [115.62608, 3.45778000000007], [115.62769000000003, 3.434300000000064], [115.61387000000002, 3.420420000000092], [115.59693000000004, 3.424720000000093], [115.58297000000005, 
3.427990000000079], [115.57083, 3.4177800000000502], [115.55887000000007, 3.389170000000092], [115.55331000000001, 3.373890000000074], [115.53888000000006, 3.333610000000078], [115.53415000000007, 3.31833000000006], [115.52998000000002, 3.301670000000058], [115.51193, 3.2100000000000932], [115.50998000000004, 3.198330000000055], [115.51305000000002, 3.184030000000064], [115.52859000000001, 3.176110000000051], [115.4988800000001, 3.050280000000043], [115.49553000000003, 3.040000000000077], [115.48720000000003, 3.027500000000088], [115.37804000000006, 2.991800000000069], [115.31832000000009, 2.987780000000043], [115.31026000000008, 2.997780000000091], [115.30554000000006, 3.006670000000042], [115.30138000000011, 3.016390000000058], [115.24971000000005, 3.010830000000055], [115.15208000000007, 2.922080000000051], [115.13971000000004, 2.9061100000000692], [115.11832000000004, 2.851110000000062], [115.08236000000011, 2.613610000000051], [115.09206000000006, 2.600550000000055], [115.10762, 2.59722000000005], [115.12318000000005, 2.605550000000049], [115.1336, 2.611530000000073], [115.16944000000001, 2.605140000000062], [115.18525, 2.596940000000074], [115.19914000000006, 2.583050000000071], [115.2360900000001, 2.52965000000006], [115.23082000000011, 2.5080600000000572], [115.21568000000002, 2.492780000000039], [115.20387000000005, 2.487220000000093], ...]], [[[117.28298000000007, 7.319440000000043], [117.27832000000001, 7.250000000000057], [117.2763900000001, 7.238610000000051], [117.27304000000004, 7.228330000000085], [117.25054, 7.179170000000056], [117.19274000000007, 7.173330000000078], [117.17221000000006, 7.173330000000078], [117.1599900000001, 7.171940000000063], [117.14943000000005, 7.168610000000058], [117.1208200000001, 7.155550000000062], [117.09818000000007, 7.13722000000007], [117.0894300000001, 7.118890000000078], [117.07929000000001, 7.105000000000075], [117.06645000000003, 7.105070000000069], [117.05193000000008, 7.17083000000008], [117.06248000000005, 
7.271670000000086], [117.07068000000004, 7.284030000000087], [117.08805000000007, 7.294170000000065], [117.0932600000001, 7.293450000000064], [117.14512000000002, 7.334030000000041], [117.22221000000002, 7.3527800000000525], [117.26818000000003, 7.343890000000044], [117.27832000000001, 7.333750000000066], [117.28298000000007, 7.319440000000043]]], [[[116.87471000000005, 7.221800000000087], [116.85694000000001, 7.183890000000076], [116.85234000000003, 7.188330000000064], [116.8787400000001, 7.274170000000083], [116.88666, 7.280280000000062], [117.00088000000005, 7.35292000000004], [117.00943000000007, 7.342220000000054], [117.01749000000007, 7.308330000000069], [117.01804000000004, 7.266740000000084], [116.9580400000001, 7.241670000000056], [116.90818000000002, 7.233470000000068], [116.87471000000005, 7.221800000000087]]]]
Thank you in advance,
Best regards.
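The WKB step of the idea above can be sketched with the standard library alone. This is a minimal sketch, assuming 2D coordinates in GeoJSON nesting (as in the samples above) and little-endian WKB; the function name coords_to_wkb_hex is hypothetical, and whether Athena accepts the resulting hex string as-is has not been verified here.

```python
import struct

# OGC WKB geometry type codes.
WKB_POLYGON = 3
WKB_MULTIPOLYGON = 6


def _header(geom_type):
    """Byte-order flag (1 = little-endian) followed by the geometry type code."""
    return struct.pack("<BI", 1, geom_type)


def _polygon_body(rings):
    """Pack the ring count, then each ring as a point count plus x/y doubles."""
    out = struct.pack("<I", len(rings))
    for ring in rings:
        out += struct.pack("<I", len(ring))
        for x, y in ring:
            out += struct.pack("<dd", x, y)
    return out


def coords_to_wkb_hex(geom_type, coordinates):
    """Convert GeoJSON-style coordinates to an upper-case WKB hex string."""
    if geom_type == "Polygon":
        wkb = _header(WKB_POLYGON) + _polygon_body(coordinates)
    elif geom_type == "MultiPolygon":
        wkb = _header(WKB_MULTIPOLYGON) + struct.pack("<I", len(coordinates))
        for polygon in coordinates:
            # Each member polygon is a complete WKB polygon with its own header.
            wkb += _header(WKB_POLYGON) + _polygon_body(polygon)
    else:
        raise ValueError(f"unsupported geometry type: {geom_type}")
    return wkb.hex().upper()
```

With a DataFrame as described above, this could be applied row by row, for example df.apply(lambda r: coords_to_wkb_hex(r["type"], r["coordinates"]), axis=1), where the column names "type" and "coordinates" are assumptions.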

TypeError when fitting Statsmodels OLS with standard errors clustered 2 ways

Context
Building on "How to run Panel OLS regressions with 3+ fixed-effect and errors clustering?", and notably Josef's third comment, I am trying to adapt the "OLS Coefficients and Standard Errors Clustered by Firm and Year" section of this example notebook, shown below:
cluster_2ways_ols = sm.ols(formula='y ~ x', data=df).fit(cov_type='cluster',
                                                         cov_kwds={'groups': np.array(df[['firmid', 'year']])},
                                                         use_t=True)
to my own example dataset.
Note that I am able to reproduce this example (and it works). I can also add fixed-effects, by using 'y ~ x + C(firmid) + C(year)' as formula instead.
Problem
However, trying to port the same command to my example dataset (see code below), I'm getting the following error:
>>> model = sm.OLS.from_formula("gdp ~ population + C(year_publication) + C(country)", df)
>>> result = model.fit(
...     cov_type='cluster',
...     cov_kwds={'groups': np.array(df[['country', 'year_publication']])},
...     use_t=True
... )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/path/venv/lib64/python3.10/site-packages/statsmodels/regression/linear_model.py", line 343, in fit
    lfit = OLSResults(
  File "/path/venv/lib64/python3.10/site-packages/statsmodels/regression/linear_model.py", line 1607, in __init__
    self.get_robustcov_results(cov_type=cov_type, use_self=True,
  File "/path/venv/lib64/python3.10/site-packages/statsmodels/regression/linear_model.py", line 2568, in get_robustcov_results
    res.cov_params_default = sw.cov_cluster_2groups(
  File "/path/venv/lib64/python3.10/site-packages/statsmodels/stats/sandwich_covariance.py", line 591, in cov_cluster_2groups
    combine_indices(group)[0],
  File "/path/venv/lib64/python3.10/site-packages/statsmodels/tools/grouputils.py", line 55, in combine_indices
    groups_ = groups.view([('', groups.dtype)] * groups.shape[1])
  File "/path/venv/lib64/python3.10/site-packages/numpy/core/_internal.py", line 549, in _view_is_safe
    raise TypeError("Cannot change data-type for object array.")
TypeError: Cannot change data-type for object array.
I have tried to manually cast the year_publication to string/object using np.array(df[['country', 'year_publication']].astype("str")), but it doesn't solve the issue.
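The failing line from the traceback can be reproduced in isolation with NumPy alone. The sketch below uses a small made-up array (not the dataset from this question) and shows that the groups.view(...) call raises on an object-dtype array, which is typically what np.array() returns for a DataFrame holding strings, including columns cast with .astype("str"). The same view works on an integer array.

```python
import numpy as np

# Object-dtype array, as typically produced by np.array() on DataFrame
# columns that hold Python strings (made-up values for illustration).
groups_obj = np.array([["Angola", 2020], ["Benin", 2021]], dtype=object)

try:
    # Same operation as the combine_indices line shown in the traceback.
    groups_obj.view([("", groups_obj.dtype)] * groups_obj.shape[1])
    view_error = None
except TypeError as exc:
    view_error = str(exc)

print(view_error)

# The identical view succeeds on an integer array: each row becomes one
# structured element holding both group labels.
groups_int = np.array([[0, 2020], [1, 2021]])
structured = groups_int.view([("", groups_int.dtype)] * groups_int.shape[1])
print(structured.shape)
```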
Questions
What is the cause of the TypeError?
How can I adapt the example command to my dataset?
Minimal Working Example
from io import StringIO
import numpy as np
import pandas as pd
import statsmodels.api as sm
DATA = """
"continent","country","source","year_publication","year_data","population","gdp"
"Africa","Angola","OECD",2020,2018,972,52.69
"Africa","Angola","OECD",2020,2019,986,802.7
"Africa","Angola","OECD",2020,2020,641,568.74
"Africa","Angola","OECD",2021,2018,438,168.83
"Africa","Angola","OECD",2021,2019,958,310.57
"Africa","Angola","OECD",2021,2020,270,144.02
"Africa","Angola","OECD",2022,2018,528,359.71
"Africa","Angola","OECD",2022,2019,974,582.98
"Africa","Angola","OECD",2022,2020,835,820.49
"Africa","Angola","IMF",2020,2018,168,148.85
"Africa","Angola","IMF",2020,2019,460,236.21
"Africa","Angola","IMF",2020,2020,360,297.15
"Africa","Angola","IMF",2021,2018,381,249.13
"Africa","Angola","IMF",2021,2019,648,128.05
"Africa","Angola","IMF",2021,2020,206,179.05
"Africa","Angola","IMF",2022,2018,282,150.29
"Africa","Angola","IMF",2022,2019,125,23.42
"Africa","Angola","IMF",2022,2020,410,247.35
"Africa","Angola","WorldBank",2020,2018,553,182.06
"Africa","Angola","WorldBank",2020,2019,847,698.87
"Africa","Angola","WorldBank",2020,2020,844,126.61
"Africa","Angola","WorldBank",2021,2018,307,239.76
"Africa","Angola","WorldBank",2021,2019,659,510.73
"Africa","Angola","WorldBank",2021,2020,548,331.89
"Africa","Angola","WorldBank",2022,2018,448,122.76
"Africa","Angola","WorldBank",2022,2019,768,761.41
"Africa","Angola","WorldBank",2022,2020,324,163.57
"Africa","Benin","OECD",2020,2018,513,196.9
"Africa","Benin","OECD",2020,2019,590,83.7
"Africa","Benin","OECD",2020,2020,791,511.09
"Africa","Benin","OECD",2021,2018,799,474.43
"Africa","Benin","OECD",2021,2019,455,234.21
"Africa","Benin","OECD",2021,2020,549,238.83
"Africa","Benin","OECD",2022,2018,235,229.33
"Africa","Benin","OECD",2022,2019,347,46.51
"Africa","Benin","OECD",2022,2020,532,392.13
"Africa","Benin","IMF",2020,2018,138,137.05
"Africa","Benin","IMF",2020,2019,978,239.82
"Africa","Benin","IMF",2020,2020,821,33.41
"Africa","Benin","IMF",2021,2018,453,291.93
"Africa","Benin","IMF",2021,2019,526,381.88
"Africa","Benin","IMF",2021,2020,467,313.57
"Africa","Benin","IMF",2022,2018,948,555.23
"Africa","Benin","IMF",2022,2019,323,289.91
"Africa","Benin","IMF",2022,2020,421,62.35
"Africa","Benin","WorldBank",2020,2018,983,271.69
"Africa","Benin","WorldBank",2020,2019,138,23.55
"Africa","Benin","WorldBank",2020,2020,636,623.65
"Africa","Benin","WorldBank",2021,2018,653,534.99
"Africa","Benin","WorldBank",2021,2019,564,368.8
"Africa","Benin","WorldBank",2021,2020,741,312.02
"Africa","Benin","WorldBank",2022,2018,328,292.11
"Africa","Benin","WorldBank",2022,2019,653,429.21
"Africa","Benin","WorldBank",2022,2020,951,242.73
"Africa","Chad","OECD",2020,2018,176,95.06
"Africa","Chad","OECD",2020,2019,783,425.34
"Africa","Chad","OECD",2020,2020,885,461.6
"Africa","Chad","OECD",2021,2018,673,15.87
"Africa","Chad","OECD",2021,2019,131,74.46
"Africa","Chad","OECD",2021,2020,430,61.58
"Africa","Chad","OECD",2022,2018,593,211.34
"Africa","Chad","OECD",2022,2019,647,550.37
"Africa","Chad","OECD",2022,2020,154,105.65
"Africa","Chad","IMF",2020,2018,160,32.41
"Africa","Chad","IMF",2020,2019,654,27.84
"Africa","Chad","IMF",2020,2020,616,468.92
"Africa","Chad","IMF",2021,2018,996,22.4
"Africa","Chad","IMF",2021,2019,126,93.18
"Africa","Chad","IMF",2021,2020,879,547.87
"Africa","Chad","IMF",2022,2018,663,520
"Africa","Chad","IMF",2022,2019,681,544.76
"Africa","Chad","IMF",2022,2020,101,55.6
"Africa","Chad","WorldBank",2020,2018,786,757.22
"Africa","Chad","WorldBank",2020,2019,599,593.69
"Africa","Chad","WorldBank",2020,2020,641,529.84
"Africa","Chad","WorldBank",2021,2018,343,287.89
"Africa","Chad","WorldBank",2021,2019,438,340.83
"Africa","Chad","WorldBank",2021,2020,762,594.67
"Africa","Chad","WorldBank",2022,2018,430,128.69
"Africa","Chad","WorldBank",2022,2019,260,242.59
"Africa","Chad","WorldBank",2022,2020,607,216.1
"Europe","Denmark","OECD",2020,2018,114,86.75
"Europe","Denmark","OECD",2020,2019,937,373.29
"Europe","Denmark","OECD",2020,2020,866,392.93
"Europe","Denmark","OECD",2021,2018,296,41.04
"Europe","Denmark","OECD",2021,2019,402,32.67
"Europe","Denmark","OECD",2021,2020,306,7.88
"Europe","Denmark","OECD",2022,2018,540,379.51
"Europe","Denmark","OECD",2022,2019,108,26.72
"Europe","Denmark","OECD",2022,2020,752,307.2
"Europe","Denmark","IMF",2020,2018,157,24.24
"Europe","Denmark","IMF",2020,2019,303,79.04
"Europe","Denmark","IMF",2020,2020,286,122.36
"Europe","Denmark","IMF",2021,2018,569,69.32
"Europe","Denmark","IMF",2021,2019,808,642.67
"Europe","Denmark","IMF",2021,2020,157,5.58
"Europe","Denmark","IMF",2022,2018,147,112.21
"Europe","Denmark","IMF",2022,2019,414,311.16
"Europe","Denmark","IMF",2022,2020,774,230.46
"Europe","Denmark","WorldBank",2020,2018,695,350.03
"Europe","Denmark","WorldBank",2020,2019,511,209.84
"Europe","Denmark","WorldBank",2020,2020,181,29.27
"Europe","Denmark","WorldBank",2021,2018,503,176.89
"Europe","Denmark","WorldBank",2021,2019,710,609.02
"Europe","Denmark","WorldBank",2021,2020,264,165.78
"Europe","Denmark","WorldBank",2022,2018,670,638.99
"Europe","Denmark","WorldBank",2022,2019,651,354.6
"Europe","Denmark","WorldBank",2022,2020,632,623.94
"Europe","Estonia","OECD",2020,2018,838,263.67
"Europe","Estonia","OECD",2020,2019,638,533.95
"Europe","Estonia","OECD",2020,2020,898,638.73
"Europe","Estonia","OECD",2021,2018,262,98.16
"Europe","Estonia","OECD",2021,2019,569,552.54
"Europe","Estonia","OECD",2021,2020,868,252.48
"Europe","Estonia","OECD",2022,2018,927,264.65
"Europe","Estonia","OECD",2022,2019,205,150.6
"Europe","Estonia","OECD",2022,2020,828,752.61
"Europe","Estonia","IMF",2020,2018,841,176.31
"Europe","Estonia","IMF",2020,2019,614,230.55
"Europe","Estonia","IMF",2020,2020,500,41.19
"Europe","Estonia","IMF",2021,2018,510,169.68
"Europe","Estonia","IMF",2021,2019,765,401.85
"Europe","Estonia","IMF",2021,2020,751,319.6
"Europe","Estonia","IMF",2022,2018,314,58.81
"Europe","Estonia","IMF",2022,2019,155,2.24
"Europe","Estonia","IMF",2022,2020,734,187.6
"Europe","Estonia","WorldBank",2020,2018,332,160.17
"Europe","Estonia","WorldBank",2020,2019,466,385.33
"Europe","Estonia","WorldBank",2020,2020,487,435.06
"Europe","Estonia","WorldBank",2021,2018,461,249.19
"Europe","Estonia","WorldBank",2021,2019,932,763.38
"Europe","Estonia","WorldBank",2021,2020,650,463.91
"Europe","Estonia","WorldBank",2022,2018,570,549.97
"Europe","Estonia","WorldBank",2022,2019,909,80.48
"Europe","Estonia","WorldBank",2022,2020,523,242.22
"Europe","Finland","OECD",2020,2018,565,561.64
"Europe","Finland","OECD",2020,2019,646,161.62
"Europe","Finland","OECD",2020,2020,194,133.69
"Europe","Finland","OECD",2021,2018,529,39.76
"Europe","Finland","OECD",2021,2019,800,680.12
"Europe","Finland","OECD",2021,2020,418,399.19
"Europe","Finland","OECD",2022,2018,591,253.12
"Europe","Finland","OECD",2022,2019,457,272.58
"Europe","Finland","OECD",2022,2020,157,105.1
"Europe","Finland","IMF",2020,2018,860,445.03
"Europe","Finland","IMF",2020,2019,108,47.72
"Europe","Finland","IMF",2020,2020,523,500.58
"Europe","Finland","IMF",2021,2018,560,81.47
"Europe","Finland","IMF",2021,2019,830,664.64
"Europe","Finland","IMF",2021,2020,903,762.62
"Europe","Finland","IMF",2022,2018,179,167.73
"Europe","Finland","IMF",2022,2019,137,98.98
"Europe","Finland","IMF",2022,2020,666,524.86
"Europe","Finland","WorldBank",2020,2018,319,146.01
"Europe","Finland","WorldBank",2020,2019,401,219.56
"Europe","Finland","WorldBank",2020,2020,711,45.35
"Europe","Finland","WorldBank",2021,2018,828,20.97
"Europe","Finland","WorldBank",2021,2019,180,66.3
"Europe","Finland","WorldBank",2021,2020,682,92.57
"Europe","Finland","WorldBank",2022,2018,254,81.2
"Europe","Finland","WorldBank",2022,2019,619,159.08
"Europe","Finland","WorldBank",2022,2020,191,184.4
"""
df = pd.read_csv(StringIO(DATA))
model = sm.OLS.from_formula("gdp ~ population + C(year_publication) + C(country)", df)
result = model.fit(
cov_type='cluster',
cov_kwds={'groups': np.array(df[['country', 'year_publication']])},
use_t=True
)
print(result.summary())
I realized that the groups must be an array of integers rather than of objects/strings.
Label-encoding the string column as follows:
df["country"] = df["country"].astype("category")
df["country_id"] = df.country.cat.codes
and using country_id to cluster the standard errors solves the issue:
result = model.fit(
cov_type='cluster',
cov_kwds={'groups': np.array(df[['country_id', 'year_publication']])},
use_t=True
)
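As a side note, here is a minimal sketch of the same label-encoding idea using pd.factorize instead of the category codes. This is an alternative I am assuming would work equally well, not part of the original fix, and the tiny data frame is made up for illustration. It only shows how to build an all-integer groups array; the resulting array would be passed as cov_kwds={'groups': groups} exactly as above.

```python
import numpy as np
import pandas as pd

# Small made-up frame standing in for the real data (illustrative only).
df = pd.DataFrame({
    "country": ["Chad", "Angola", "Chad", "Benin"],
    "year_publication": [2020, 2020, 2021, 2021],
})

# factorize returns (integer codes, unique labels);
# codes are assigned in order of first appearance.
country_codes, country_labels = pd.factorize(df["country"])

# Two integer columns, one per clustering dimension.
groups = np.column_stack([country_codes, df["year_publication"].to_numpy()])

print(groups.dtype)
print(groups)
```

Note that factorize numbers the labels in order of appearance, while cat.codes follows the sorted category order; for clustering only the grouping matters, not the particular integers.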
Fully working example:
from io import StringIO
import numpy as np
import pandas as pd
import statsmodels.api as sm
DATA = """
"continent","country","source","year_publication","year_data","population","gdp"
"Africa","Angola","OECD",2020,2018,972,52.69
"Africa","Angola","OECD",2020,2019,986,802.7
"Africa","Angola","OECD",2020,2020,641,568.74
"Africa","Angola","OECD",2021,2018,438,168.83
"Africa","Angola","OECD",2021,2019,958,310.57
"Africa","Angola","OECD",2021,2020,270,144.02
"Africa","Angola","OECD",2022,2018,528,359.71
"Africa","Angola","OECD",2022,2019,974,582.98
"Africa","Angola","OECD",2022,2020,835,820.49
"Africa","Angola","IMF",2020,2018,168,148.85
"Africa","Angola","IMF",2020,2019,460,236.21
"Africa","Angola","IMF",2020,2020,360,297.15
"Africa","Angola","IMF",2021,2018,381,249.13
"Africa","Angola","IMF",2021,2019,648,128.05
"Africa","Angola","IMF",2021,2020,206,179.05
"Africa","Angola","IMF",2022,2018,282,150.29
"Africa","Angola","IMF",2022,2019,125,23.42
"Africa","Angola","IMF",2022,2020,410,247.35
"Africa","Angola","WorldBank",2020,2018,553,182.06
"Africa","Angola","WorldBank",2020,2019,847,698.87
"Africa","Angola","WorldBank",2020,2020,844,126.61
"Africa","Angola","WorldBank",2021,2018,307,239.76
"Africa","Angola","WorldBank",2021,2019,659,510.73
"Africa","Angola","WorldBank",2021,2020,548,331.89
"Africa","Angola","WorldBank",2022,2018,448,122.76
"Africa","Angola","WorldBank",2022,2019,768,761.41
"Africa","Angola","WorldBank",2022,2020,324,163.57
"Africa","Benin","OECD",2020,2018,513,196.9
"Africa","Benin","OECD",2020,2019,590,83.7
"Africa","Benin","OECD",2020,2020,791,511.09
"Africa","Benin","OECD",2021,2018,799,474.43
"Africa","Benin","OECD",2021,2019,455,234.21
"Africa","Benin","OECD",2021,2020,549,238.83
"Africa","Benin","OECD",2022,2018,235,229.33
"Africa","Benin","OECD",2022,2019,347,46.51
"Africa","Benin","OECD",2022,2020,532,392.13
"Africa","Benin","IMF",2020,2018,138,137.05
"Africa","Benin","IMF",2020,2019,978,239.82
"Africa","Benin","IMF",2020,2020,821,33.41
"Africa","Benin","IMF",2021,2018,453,291.93
"Africa","Benin","IMF",2021,2019,526,381.88
"Africa","Benin","IMF",2021,2020,467,313.57
"Africa","Benin","IMF",2022,2018,948,555.23
"Africa","Benin","IMF",2022,2019,323,289.91
"Africa","Benin","IMF",2022,2020,421,62.35
"Africa","Benin","WorldBank",2020,2018,983,271.69
"Africa","Benin","WorldBank",2020,2019,138,23.55
"Africa","Benin","WorldBank",2020,2020,636,623.65
"Africa","Benin","WorldBank",2021,2018,653,534.99
"Africa","Benin","WorldBank",2021,2019,564,368.8
"Africa","Benin","WorldBank",2021,2020,741,312.02
"Africa","Benin","WorldBank",2022,2018,328,292.11
"Africa","Benin","WorldBank",2022,2019,653,429.21
"Africa","Benin","WorldBank",2022,2020,951,242.73
"Africa","Chad","OECD",2020,2018,176,95.06
"Africa","Chad","OECD",2020,2019,783,425.34
"Africa","Chad","OECD",2020,2020,885,461.6
"Africa","Chad","OECD",2021,2018,673,15.87
"Africa","Chad","OECD",2021,2019,131,74.46
"Africa","Chad","OECD",2021,2020,430,61.58
"Africa","Chad","OECD",2022,2018,593,211.34
"Africa","Chad","OECD",2022,2019,647,550.37
"Africa","Chad","OECD",2022,2020,154,105.65
"Africa","Chad","IMF",2020,2018,160,32.41
"Africa","Chad","IMF",2020,2019,654,27.84
"Africa","Chad","IMF",2020,2020,616,468.92
"Africa","Chad","IMF",2021,2018,996,22.4
"Africa","Chad","IMF",2021,2019,126,93.18
"Africa","Chad","IMF",2021,2020,879,547.87
"Africa","Chad","IMF",2022,2018,663,520
"Africa","Chad","IMF",2022,2019,681,544.76
"Africa","Chad","IMF",2022,2020,101,55.6
"Africa","Chad","WorldBank",2020,2018,786,757.22
"Africa","Chad","WorldBank",2020,2019,599,593.69
"Africa","Chad","WorldBank",2020,2020,641,529.84
"Africa","Chad","WorldBank",2021,2018,343,287.89
"Africa","Chad","WorldBank",2021,2019,438,340.83
"Africa","Chad","WorldBank",2021,2020,762,594.67
"Africa","Chad","WorldBank",2022,2018,430,128.69
"Africa","Chad","WorldBank",2022,2019,260,242.59
"Africa","Chad","WorldBank",2022,2020,607,216.1
"Europe","Denmark","OECD",2020,2018,114,86.75
"Europe","Denmark","OECD",2020,2019,937,373.29
"Europe","Denmark","OECD",2020,2020,866,392.93
"Europe","Denmark","OECD",2021,2018,296,41.04
"Europe","Denmark","OECD",2021,2019,402,32.67
"Europe","Denmark","OECD",2021,2020,306,7.88
"Europe","Denmark","OECD",2022,2018,540,379.51
"Europe","Denmark","OECD",2022,2019,108,26.72
"Europe","Denmark","OECD",2022,2020,752,307.2
"Europe","Denmark","IMF",2020,2018,157,24.24
"Europe","Denmark","IMF",2020,2019,303,79.04
"Europe","Denmark","IMF",2020,2020,286,122.36
"Europe","Denmark","IMF",2021,2018,569,69.32
"Europe","Denmark","IMF",2021,2019,808,642.67
"Europe","Denmark","IMF",2021,2020,157,5.58
"Europe","Denmark","IMF",2022,2018,147,112.21
"Europe","Denmark","IMF",2022,2019,414,311.16
"Europe","Denmark","IMF",2022,2020,774,230.46
"Europe","Denmark","WorldBank",2020,2018,695,350.03
"Europe","Denmark","WorldBank",2020,2019,511,209.84
"Europe","Denmark","WorldBank",2020,2020,181,29.27
"Europe","Denmark","WorldBank",2021,2018,503,176.89
"Europe","Denmark","WorldBank",2021,2019,710,609.02
"Europe","Denmark","WorldBank",2021,2020,264,165.78
"Europe","Denmark","WorldBank",2022,2018,670,638.99
"Europe","Denmark","WorldBank",2022,2019,651,354.6
"Europe","Denmark","WorldBank",2022,2020,632,623.94
"Europe","Estonia","OECD",2020,2018,838,263.67
"Europe","Estonia","OECD",2020,2019,638,533.95
"Europe","Estonia","OECD",2020,2020,898,638.73
"Europe","Estonia","OECD",2021,2018,262,98.16
"Europe","Estonia","OECD",2021,2019,569,552.54
"Europe","Estonia","OECD",2021,2020,868,252.48
"Europe","Estonia","OECD",2022,2018,927,264.65
"Europe","Estonia","OECD",2022,2019,205,150.6
"Europe","Estonia","OECD",2022,2020,828,752.61
"Europe","Estonia","IMF",2020,2018,841,176.31
"Europe","Estonia","IMF",2020,2019,614,230.55
"Europe","Estonia","IMF",2020,2020,500,41.19
"Europe","Estonia","IMF",2021,2018,510,169.68
"Europe","Estonia","IMF",2021,2019,765,401.85
"Europe","Estonia","IMF",2021,2020,751,319.6
"Europe","Estonia","IMF",2022,2018,314,58.81
"Europe","Estonia","IMF",2022,2019,155,2.24
"Europe","Estonia","IMF",2022,2020,734,187.6
"Europe","Estonia","WorldBank",2020,2018,332,160.17
"Europe","Estonia","WorldBank",2020,2019,466,385.33
"Europe","Estonia","WorldBank",2020,2020,487,435.06
"Europe","Estonia","WorldBank",2021,2018,461,249.19
"Europe","Estonia","WorldBank",2021,2019,932,763.38
"Europe","Estonia","WorldBank",2021,2020,650,463.91
"Europe","Estonia","WorldBank",2022,2018,570,549.97
"Europe","Estonia","WorldBank",2022,2019,909,80.48
"Europe","Estonia","WorldBank",2022,2020,523,242.22
"Europe","Finland","OECD",2020,2018,565,561.64
"Europe","Finland","OECD",2020,2019,646,161.62
"Europe","Finland","OECD",2020,2020,194,133.69
"Europe","Finland","OECD",2021,2018,529,39.76
"Europe","Finland","OECD",2021,2019,800,680.12
"Europe","Finland","OECD",2021,2020,418,399.19
"Europe","Finland","OECD",2022,2018,591,253.12
"Europe","Finland","OECD",2022,2019,457,272.58
"Europe","Finland","OECD",2022,2020,157,105.1
"Europe","Finland","IMF",2020,2018,860,445.03
"Europe","Finland","IMF",2020,2019,108,47.72
"Europe","Finland","IMF",2020,2020,523,500.58
"Europe","Finland","IMF",2021,2018,560,81.47
"Europe","Finland","IMF",2021,2019,830,664.64
"Europe","Finland","IMF",2021,2020,903,762.62
"Europe","Finland","IMF",2022,2018,179,167.73
"Europe","Finland","IMF",2022,2019,137,98.98
"Europe","Finland","IMF",2022,2020,666,524.86
"Europe","Finland","WorldBank",2020,2018,319,146.01
"Europe","Finland","WorldBank",2020,2019,401,219.56
"Europe","Finland","WorldBank",2020,2020,711,45.35
"Europe","Finland","WorldBank",2021,2018,828,20.97
"Europe","Finland","WorldBank",2021,2019,180,66.3
"Europe","Finland","WorldBank",2021,2020,682,92.57
"Europe","Finland","WorldBank",2022,2018,254,81.2
"Europe","Finland","WorldBank",2022,2019,619,159.08
"Europe","Finland","WorldBank",2022,2020,191,184.4
"""
df = pd.read_csv(StringIO(DATA))
df["country"] = df["country"].astype("category")
df["country_id"] = df.country.cat.codes
model = sm.OLS.from_formula("gdp ~ population + C(year_publication) + C(country)", df)
result = model.fit(
cov_type='cluster',
cov_kwds={'groups': np.array(df[['country_id', 'year_publication']])},
use_t=True
)
print(result.summary())

how to solve IndexError : single positional indexer is out-of-bounds

CODE:-
from datetime import date
from datetime import timedelta
from nsepy import get_history
import pandas as pd
import datetime
# import matplotlib.pyplot as mp
end1 = date.today()
start1 = end1 - timedelta(days=365)
stock = [
'RELIANCE','HDFCBANK','INFY','ICICIBANK','HDFC','TCS','KOTAKBANK','LT','SBIN','HINDUNILVR','AXISBANK','ITC','BAJFINANCE','BHARTIARTL','ASIANPAINT','HCLTECH','MARUTI','TITAN','BAJAJFINSV','TATAMOTORS',
'TECHM','SUNPHARMA','TATASTEEL','M&M','WIPRO','ULTRACEMCO','POWERGRID','HINDALCO','NTPC','NESTLEIND','GRASIM','ONGC','JSWSTEEL','HDFCLIFE','INDUSINDBK','SBILIFE','DRREDDY','ADANIPORTS','DIVISLAB','CIPLA',
'BAJAJ-AUTO','TATACONSUM','UPL','BRITANNIA','BPCL','EICHERMOT','HEROMOTOCO','COALINDIA','SHREECEM','IOC','VEDL','ADANIENT', 'APOLLOHOSP', 'TATAPOWER', 'PIDILITIND', 'SRF', 'NAUKRI', 'ICICIGI', 'DABUR',
'GODREJCP', 'HAVELLS', 'PEL', 'VOLTAS', 'AUBANK', 'LTI', 'CHOLAFIN', 'AMBUJACEM', 'MARICO', 'SRTRANSFIN','GAIL', 'MCDOWELL-N', 'MPHASIS', 'MINDTREE', 'PAGEIND', 'ZEEL', 'BEL', 'TRENT', 'CROMPTON', 'JUBLFOOD',
'DLF', 'SBICARD', 'SIEMENS', 'BANDHANBNK', 'IRCTC', 'LAURUSLABS', 'PIIND', 'INDIGO', 'INDUSTOWER','ICICIPRULI', 'MOTHERSON', 'AARTIIND', 'FEDERALBNK', 'BANKBARODA', 'PERSISTENT', 'HINDPETRO', 'ACC',
'AUROPHARMA', 'COLPAL', 'GODREJPROP', 'MFSL', 'LUPIN', 'BIOCON', 'ASHOKLEY', 'BHARATFORG', 'BERGEPAINT','JINDALSTEL', 'ASTRAL', 'IEX', 'NMDC', 'CONCOR', 'INDHOTEL', 'BALKRISIND', 'PETRONET', 'CANBK', 'ALKEM',
'DIXON', 'DEEPAKNTR', 'DALBHARAT', 'TVSMOTOR', 'ATUL', 'HDFCAMC', 'TATACOMM', 'MUTHOOTFIN', 'TATACHEM','SAIL', 'IDFCFIRSTB', 'PFC', 'BOSCHLTD', 'MRF', 'NAVINFLUOR', 'CUMMINSIND', 'IGL', 'IPCALAB', 'COFORGE',
'ESCORTS', 'TORNTPHARM', 'LTTS', 'RECLTD', 'LICHSGFIN', 'BATAINDIA', 'HAL', 'PNB', 'GUJGASLTD', 'UBL','3MINDIA','ABB','AIAENG','APLAPOLLO','AARTIDRUGS','AAVAS','ABBOTINDIA','ADANIGREEN','ATGL','ABCAPITAL',
'ABFRL','ABSLAMC','ADVENZYMES','AEGISCHEM','AFFLE','AJANTPHARM','ALKYLAMINE','ALLCARGO','AMARAJABAT','AMBER','ANGELONE','ANURAS','APTUS','ASAHIINDIA','ASTERDM','ASTRAZEN','AVANTIFEED','DMART','BASF',
'BSE','BAJAJELEC','BAJAJHLDNG','BALAMINES','BALRAMCHIN','BANKINDIA','MAHABANK','BAYERCROP','BDL','BEL','BHEL','BIRLACORPN','BSOFT','BLUEDART','BLUESTARCO','BORORENEW','BOSCHLTD','BRIGADE','BCG','MAPMYINDIA'
]
target_stocks_list = []
target_stocks = pd.DataFrame()
for stock in stock:
    vol = get_history(symbol=stock,
                      start=start1,
                      end=end1)
    d_vol = pd.concat([vol['Deliverable Volume']])
    symbol_s = pd.concat([vol['Symbol']])
    close = pd.concat([vol['Close']])
    df = pd.DataFrame(symbol_s)
    df['D_vol'] = d_vol
    # print(df)
    cond = df['D_vol'].iloc[-1] > max(df['D_vol'].iloc[-91:-1])
    if(cond):
        target_stocks_list.append(stock)
        target_stocks = pd.concat([target_stocks, df])
print(target_stocks_list)
file_name = f'{datetime.datetime.now().day}-{datetime.datetime.now().month}-{datetime.datetime.now().year}.csv'
target_stocks.to_csv(f'D:/HUGE VOLUME SPURTS/first 250/SEP 2022/{file_name}')
pd.set_option('display.max_columns',10)
pd.set_option('display.max_rows',2000)
print(target_stocks)
ERROR:-
C:\python\Python310\python.exe "C:/Users/Yogesh_PC/PycharmProjects/future oi data analysis/trial2.py"
Traceback (most recent call last):
File "C:\Users\Yogesh_PC\PycharmProjects\future oi data analysis\trial2.py", line 64, in <module>
cond = df['D_vol'].iloc[-1] > max(df['D_vol'].iloc[-91:-1])
File "C:\python\Python310\lib\site-packages\pandas\core\indexing.py", line 967, in __getitem__
return self._getitem_axis(maybe_callable, axis=axis)
File "C:\python\Python310\lib\site-packages\pandas\core\indexing.py", line 1520, in _getitem_axis
self._validate_integer(key, axis)
File "C:\python\Python310\lib\site-packages\pandas\core\indexing.py", line 1452, in _validate_integer
raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds
Process finished with exit code 1
The code above fetches historical data for the Indian stock market. The data is updated on the website after the market closes, around 8:00 PM to 9:00 PM daily, and I run my code after that. On most days the code runs without any error, but it frequently throws the error shown above.
There are around 150-200 stocks in my code. The error occurs because the exchange sometimes does not update the data for one or two stocks in the list.
Please post code that skips the one or two stocks that have not been updated and still gives the output for all the remaining stocks.
For example: stocks = ['DLF', 'SBICARD', 'SIEMENS', 'BANDHANBNK', 'IRCTC', 'LAURUSLABS', 'PIIND',
'INDIGO', 'INDUSTOWER','ICICIPRULI', 'MOTHERSON']
Suppose the exchange did not update the data for 'IRCTC' while all the other stocks above are up to date. Because of 'IRCTC', my code throws the error and does not show the data that was updated.
Thank you.
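The "out-of-bounds" error means you are asking for a position that does not exist in the series. The traceback ends in _validate_integer, which handles single integer positions, so the most likely trigger is df['D_vol'].iloc[-1] on an empty series, that is, a stock for which get_history returned no rows. Too few rows is also a problem for the rest of that line, which expects 90 earlier values from
df['D_vol'].iloc[-91:-1]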
Edit:
add a length check before the offending line:
if df['D_vol'].size > 90:
    cond = df['D_vol'].iloc[-1] > max(df['D_vol'].iloc[-91:-1])
    if(cond):
        target_stocks_list.append(stock)
        target_stocks = pd.concat([target_stocks, df])
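The length check can also be pulled into a small helper so the loop simply skips stocks with missing or too little data. The sketch below is only an illustration: has_volume_spurt and the sample dictionary are hypothetical names and data, not part of nsepy, and the helper takes any sequence of volumes (for example df['D_vol']) so it can be tried without a network call.

```python
def has_volume_spurt(d_vol, lookback=90):
    """Return True if the latest value exceeds the max of the previous `lookback` values.

    Returns False when there are not enough rows, so stocks whose data was
    not updated (empty or short series) are skipped instead of raising.
    """
    values = list(d_vol)
    if len(values) <= lookback:
        return False
    latest = values[-1]
    previous = values[-(lookback + 1):-1]
    return latest > max(previous)


# Made-up volumes standing in for the downloaded data (illustrative only).
sample_volumes = {
    "DLF": list(range(100)),   # enough rows, latest value is the largest
    "IRCTC": [],               # exchange did not update: no rows at all
    "SIEMENS": [500] * 95,     # enough rows, but no spurt
}

target_stocks_list = [name for name, vols in sample_volumes.items()
                      if has_volume_spurt(vols)]
print(target_stocks_list)
```

In the original loop the same call would replace the cond line, e.g. if has_volume_spurt(df['D_vol']): followed by the two append/concat lines.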

Django List Comprehension - Trouble Comparing Datetime Objects. TypeError: unorderable types: datetime.date() <= str()

In my models, I have the following method:
def _bags_remaining(self):
    current_set = SortingRecords.objects.values().filter(~Q(id=self.id), tag=self.tag)
    sorted = [SortingRecords['bags_sorted'] for SortingRecords in current_set if
              SortingRecords['date'] <= self.date]
    remaining = self.tag.pieces - sum(sorted) - self.bags_sorted
    return remaining
bags_remaining = property(_bags_remaining)
It is designed to find the amount of bags that have been sorted so far under the tag associated with the record, and deduct that amount (along with the amount sorted under this record) from the total bags.
It works great! The appropriate amounts are successfully passed to the templates.
However, I was dismayed that it threw off my unit tests.
======================================================================
ERROR: test_sorting_records_bags_remaining_calculation (AlmondKing.InventoryLogs.tests.test_views.test_purchase_details.DetailsPageTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "C:\Projects\AlmondKing\AlmondKing\InventoryLogs\tests\test_views\test_purchase_details.py", line 155, in test_sorting_records_bags_remaining_calculation
self.assertEqual(self.sortrecord1.bags_remaining, 79)
File "C:\Projects\AlmondKing\AlmondKing\InventoryLogs\models.py", line 115, in _bags_remaining
sorted = [SortingRecords['bags_sorted'] for SortingRecords in current_set if
File "C:\Projects\AlmondKing\AlmondKing\InventoryLogs\models.py", line 116, in <listcomp>
SortingRecords['date'] <= self.date]
TypeError: unorderable types: datetime.date() <= str()
----------------------------------------------------------------------
Ran 22 tests in 0.203s
FAILED (errors=2)
Destroying test database for alias 'default'...
It seems to be interpreting my date object as a string. The model it draws from is a DateField. If I call type on it it reports it as:
Here's the model where it is housed:
class SortingRecords(models.Model):
    tag = models.ForeignKey(Purchase, related_name='sorting_record')
    date = models.DateField()
    bags_sorted = models.IntegerField()
    turnout = models.IntegerField()
    objects = models.Manager()
    def __str__(self):
        return "%s [%s]" % (self.date, self.tag.tag)
This is the test I am running.
# Sorting Records should calculate bags remaining for each entry.
def test_sorting_records_bags_remaining_calculation(self):
    self.assertEqual(self.sortrecord1.bags_remaining, 79)
    self.assertEqual(self.sortrecord2.bags_remaining, 39)
    self.assertEqual(self.sortrecord3.bags_remaining, 9)
Again, it works in real life, but fails while running the test. Any ideas why?
EDIT TO ADD DETAILS:
Database employed is Postgres.
Here is my test setUpTestData():
class DetailsPageTest(TestCase):
    @classmethod
    def setUpTestData(cls):
        cls.product1 = ProductGroup.objects.create(
            product_name="Almonds"
        )
        cls.variety1 = Variety.objects.create(
            product_group = cls.product1,
            variety_name = "non pareil",
            husked = False,
            finished = False,
        )
        cls.supplier1 = Supplier.objects.create(
            company_name = "Acme",
            company_location = "Acme Acres",
            contact_info = "Call me!"
        )
        cls.shipment1 = Purchase.objects.create(
            tag=9,
            shipment_id=9999,
            supplier_id = cls.supplier1,
            purchase_date='2015-01-09',
            purchase_price=9.99,
            product_name=cls.variety1,
            pieces=99,
            kgs=999,
            crackout_estimate=99.9
        )
        cls.shipment2 = Purchase.objects.create(
            tag=8,
            shipment_id=8888,
            supplier_id=cls.supplier1,
            purchase_date='2015-01-08',
            purchase_price=8.88,
            product_name=cls.variety1,
            pieces=88,
            kgs=888,
            crackout_estimate=88.8
        )
        cls.shipment3 = Purchase.objects.create(
            tag=7,
            shipment_id=7777,
            supplier_id=cls.supplier1,
            purchase_date='2014-01-07',
            purchase_price=7.77,
            product_name=cls.variety1,
            pieces=77,
            kgs=777,
            crackout_estimate=77.7
        )
        cls.sortrecord1 = SortingRecords.objects.create(
            tag=cls.shipment1,
            date="2015-02-05",
            bags_sorted=20,
            turnout=199,
        )
        cls.sortrecord2 = SortingRecords.objects.create(
            tag=cls.shipment1,
            date="2015-02-07",
            bags_sorted=40,
            turnout=399,
        )
        cls.sortrecord3 = SortingRecords.objects.create(
            tag=cls.shipment1,
            date='2015-02-09',
            bags_sorted=30,
            turnout=299,
        )
Thanks to @bruno desthuilliers I observed the issue.
My setUpTestData() method was populating the fields with strings instead of datetime objects. It works properly after converting them to the appropriate input:
purchase_date=datetime.date(2015,1,9)
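To illustrate why the comparison failed, here is a small sketch in plain Python with no Django involved. It rests on an assumption: the instance built in the test kept the string it was created with, while the rows read back through values() carried real date objects. The variable names are made up for the example.

```python
from datetime import date

# Assumed shape of the two sides of the comparison in _bags_remaining:
stored = date(2015, 2, 5)    # a DateField value as read back from the database
in_memory = "2015-02-07"     # what the test instance holds when created with a string

# Comparing a date with a str is not supported and raises TypeError.
try:
    stored <= in_memory
    comparison_failed = False
except TypeError:
    comparison_failed = True

# Converting the string to a date (or passing a date in the first place) fixes it.
parsed = date.fromisoformat(in_memory)
comparison_ok = stored <= parsed

print(comparison_failed, comparison_ok)
```

This would also explain why the code worked outside the tests: there the records were presumably loaded from the database, so self.date was already a date.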

R DTW multivariate series with asymmetric step fails to compute alignment

I'm using the DTW implementation found in R, along with the Python bindings, to check the effect of changing different parameters (such as the local constraint, the local distance function and others) on my data. The data consists of feature vectors (MFCC) output by an audio processing frontend. Because of this I am dealing with multivariate time series; each feature vector has a size of 8. The problem I'm facing is that when I try to use certain local constraints (or step patterns) I get the following error:
Error in if (is.na(gcm$distance)) { : argument is of length zero
Traceback (most recent call last):
File "r_dtw_simplified.py", line 32, in <module>
alignment = R.dtw(canDist, rNull, "Euclidean", stepPattern, "none", True, False, True, False )
File "D:\Python27\lib\site-packages\rpy2\robjects\functions.py", line 86, in __call__
return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
File "D:\Python27\lib\site-packages\rpy2\robjects\functions.py", line 35, in __call__
res = super(Function, self).__call__(*new_args, **new_kwargs)
rpy2.rinterface.RRuntimeError: Error in if (is.na(gcm$distance)) { : argument is of length zero
Because the process of generating and adapting the input data is complicated, I only made a simplified script to illustrate the error I'm receiving.
#data works
#reference = [[-0.126678, -1.541763, 0.29985, 1.719757, 0.755798, -3.594681, -1.492798, 3.493042], [-0.110596, -1.638184, 0.128174, 1.638947, 0.721085, -3.247696, -0.920013, 3.763977], [-0.022415, -1.643539, -0.130692, 1.441742, 1.022064, -2.882172, -0.952225, 3.662842], [0.071259, -2.030411, -0.531891, 0.835114, 1.320419, -2.432281, -0.469116, 3.871094], [0.070526, -2.056702, -0.688293, 0.530396, 1.962128, -1.681915, -0.368973, 4.542419], [0.047745, -2.005127, -0.798203, 0.616028, 2.146988, -1.895874, 0.371597, 4.090881], [0.013962, -2.162796, -1.008545, 0.363495, 2.062866, -0.856613, 0.543884, 4.043335], [0.066757, -2.152969, -1.087097, 0.257263, 2.592697, -0.422424, -0.280533, 3.327576], [0.123123, -2.061035, -1.012863, 0.389282, 2.50206, 0.078186, -0.887711, 2.828247], [0.157455, -2.060425, -0.790344, 0.210419, 2.542114, 0.016983, -0.959274, 1.916504], [0.029648, -2.128204, -1.047318, 0.116547, 2.44899, 0.166534, -0.677551, 2.49231], [0.158554, -1.821365, -1.045044, 0.374207, 2.426712, 0.406952, -1.055084, 2.543762], [0.077026, -1.863235, -1.14827, 0.277069, 2.669067, 0.362549, -1.294342, 1.66748], [0.101822, -1.800293, -1.126801, 0.364594, 2.503815, 0.294846, -0.881302, 1.281616], [0.166138, -1.627762, -0.866013, 0.494476, 2.450668, 0.569, -1.392868, 0.651184], [0.225006, -1.596069, -1.07634, 0.550049, 2.167435, 0.554123, -1.432983, 1.166931], [0.114777, -1.462769, -0.793167, 0.565704, 2.183792, 0.345978, -1.410919, 0.708679], [0.144028, -1.444458, -0.831985, 0.536652, 2.222366, 0.330368, -0.715149, 0.517212], [0.147888, -1.450577, -0.809372, 0.479584, 2.271378, 0.250763, -0.540359, -0.036072], [0.090714, -1.485474, -0.888153, 0.268768, 2.001221, 0.412537, -0.698868, 0.17157], [0.11972, -1.382767, -0.890457, 0.218414, 1.666519, 0.659592, -0.069641, 0.914307], [0.189774, -1.18428, -0.785797, 0.106659, 1.429977, 0.195236, 0.627029, 0.503296], [0.194702, -1.098068, -0.956818, 0.020386, 1.369247, 0.10437, 0.641724, 0.410767], [0.215134, -1.069092, -1.11644, 
0.283234, 1.313507, 0.110962, 0.600861, 0.752869], [0.216766, -1.065338, -1.047974, 0.080231, 1.500702, -0.113388, 0.712646, 0.914307], [0.259933, -0.964386, -0.981369, 0.092224, 1.480667, -0.00238, 0.896255, 0.665344], [0.265991, -0.935257, -0.93779, 0.214966, 1.235275, 0.104782, 1.33754, 0.599487], [0.266098, -0.62619, -0.905792, 0.131409, 0.402908, 0.103363, 1.352814, 1.554688], [0.273468, -0.354691, -0.709579, 0.228027, 0.315125, -0.15564, 0.942123, 1.024292], [0.246429, -0.272522, -0.609924, 0.318604, -0.007355, -0.165756, 1.07019, 1.087708], [0.248596, -0.232468, -0.524887, 0.53009, -0.476334, -0.184479, 1.088089, 0.667358], [0.074478, -0.200455, -0.058411, 0.662811, -0.111923, -0.686462, 1.205154, 1.271912], [0.063065, -0.080765, 0.065552, 0.79071, -0.569946, -0.899506, 0.875687, 0.095215], [0.117706, -0.270584, -0.021027, 0.723694, -0.200073, -0.365158, 0.892624, -0.152466], [0.00148, -0.075348, 0.017761, 0.757507, 0.719299, -0.355362, 0.749329, 0.315247], [0.035034, -0.110794, 0.038559, 0.949677, 0.478699, 0.005951, 0.097305, -0.388245], [-0.101944, -0.392487, 0.401886, 1.154938, 0.199127, 0.117371, -0.070007, -0.562439], [-0.083282, -0.388657, 0.449066, 1.505951, 0.46405, -0.566208, 0.216293, -0.528076], [-0.152054, -0.100113, 0.833054, 1.746857, 0.085861, -1.314102, 0.294632, -0.470947], [-0.166672, -0.183777, 0.988373, 1.925262, -0.202057, -0.961441, 0.15242, 0.594421], [-0.234573, -0.227707, 1.102112, 1.802002, -0.382492, -1.153336, 0.29335, 0.074036], [-0.336426, 0.042435, 1.255096, 1.804535, -0.610153, -0.810745, 1.308441, 0.599854], [-0.359344, 0.007248, 1.344543, 1.441559, -0.758286, -0.800079, 1.0233, 0.668213], [-0.321823, 0.027618, 1.1521, 1.509827, -0.708267, -0.668152, 1.05722, 0.710571], [-0.265335, 0.012344, 1.491501, 1.844971, -0.584137, -1.042419, -0.449188, 0.5354], [-0.302399, 0.049698, 1.440643, 1.674866, -0.626633, -1.158554, -0.906937, 0.405579], [-0.330276, 0.466675, 1.444153, 0.855499, -0.645447, -0.352158, 0.730423, 0.429932], 
[-0.354721, 0.540207, 1.570786, 0.626648, -0.897446, -0.007416, 0.174042, 0.100525], [-0.239609, 0.669983, 0.978851, 0.85321, -0.156784, 0.107986, 0.915054, 0.114197], [-0.189346, 0.930756, 0.824295, 0.516083, -0.339767, -0.206314, 0.744049, -0.36377]]
#query = [[0.387268, -1.21701, -0.432266, -1.394104, -0.458984, -1.469788, 0.12764, 2.310059], [0.418091, -1.389526, -0.150146, -0.759155, -0.578003, -2.123199, 0.276001, 3.022339], [0.264694, -1.526886, -0.238907, -0.511108, -0.90683, -2.699249, 0.692032, 2.849854], [0.246628, -1.675171, -0.533432, 0.070007, -0.392151, -1.739227, 0.534485, 2.744019], [0.099335, -1.983826, -0.985291, 0.428833, 0.26535, -1.285583, -0.234451, 2.4729], [0.055893, -2.108063, -0.401825, 0.860413, 0.724106, -1.959137, -1.360458, 2.350708], [-0.131592, -1.928314, -0.056213, 0.577698, 0.859146, -1.812286, -1.21669, 2.2052], [-0.162796, -2.149933, 0.467239, 0.524231, 0.74913, -1.829498, -0.741913, 1.616577], [-0.282745, -1.971008, 0.837616, 0.56427, 0.198288, -1.826935, -0.118027, 1.599731], [-0.497223, -1.578705, 1.277298, 0.682983, 0.055084, -2.032562, 0.64151, 1.719238], [-0.634232, -1.433258, 1.760513, 0.550415, -0.053787, -2.188568, 1.666687, 1.611938], [-0.607498, -1.302826, 1.960556, 1.331726, 0.417633, -2.271973, 2.095001, 0.9823], [-0.952957, -0.222076, 0.772064, 2.062256, -0.295258, -1.255371, 3.450974, -0.047607], [-1.210587, 1.00061, 0.036392, 1.952209, 0.470123, 0.231628, 2.670502, -0.608276], [-1.213287, 0.927002, -0.414825, 2.104065, 1.160126, 0.088898, 1.32959, -0.018311], [-1.081558, 1.007751, -0.337509, 1.7146, 0.653687, 0.297089, 1.916733, -0.772461], [-1.064804, 1.284302, -0.393585, 2.150635, 0.132294, 0.443298, 1.967575, 0.775513], [-0.972366, 1.039734, -0.588135, 1.413818, 0.423813, 0.781494, 1.977509, -0.556274], [-0.556381, 0.591309, -0.678314, 1.025635, 1.094284, 2.234711, 1.504013, -1.71875], [-0.063477, 0.626129, 0.360489, 0.149902, 0.92804, 0.936493, 1.203018, 0.264282], [0.162003, 0.577698, 0.956863, -0.477051, 1.081161, 0.817749, 0.660843, -0.428711], [-0.049515, 0.423615, 0.82489, 0.446228, 1.323853, 0.562775, -0.144196, 1.145386], [-0.146851, 0.171906, 0.304871, 0.320435, 1.378937, 0.673004, 0.188416, 0.208618], [0.33992, -2.072418, -0.447968, 0.526794, 
-0.175858, -1.400299, -0.452454, 1.396606], [0.226089, -2.183441, -0.301071, -0.475159, 0.834961, -2.191864, -1.092361, 2.434814], [0.279556, -2.073181, -0.517639, -0.766479, 0.974808, -2.070374, -2.003891, 2.706421], [0.237961, -1.9245, -0.708435, -0.582153, 1.285934, -1.75882, -2.146164, 2.369995], [0.149658, -1.703705, -0.539749, -0.215332, 1.369705, -1.484802, -1.506256, 1.04126], [0.078735, -1.719543, 0.157013, 0.382385, 1.100998, -0.223755, 0.021683, -0.545654], [0.106003, -1.404358, 0.372345, 1.881165, -0.292511, -0.263855, 1.579529, -1.426025], [0.047729, -1.198608, 0.600769, 1.901123, -1.106949, 0.128815, 1.293701, -1.364258], [0.110748, -0.894348, 0.712601, 1.728699, -1.250381, 0.674377, 0.812302, -1.428833], [0.085754, -0.662903, 0.794312, 1.102844, -1.234283, 1.084442, 0.986938, -1.10022], [0.140823, -0.300323, 0.673508, 0.669983, -0.551453, 1.213074, 1.449326, -1.567261], [0.03743, 0.550293, 0.400909, -0.174622, 0.355301, 1.325867, 0.875854, 0.126953], [-0.084885, 1.128906, 0.292099, -0.248779, 0.722961, 0.873871, -0.409515, 0.470581], [0.019684, 0.947754, 0.19931, -0.306274, 0.176849, 1.431702, 1.091507, 0.701416], [-0.094162, 0.895203, 0.687378, -0.229065, 0.549088, 1.376953, 0.892303, -0.642334], [-0.727692, 0.626495, 0.848877, 0.521362, 1.521912, -0.443481, 1.247238, 0.197388], [-0.82048, 0.117279, 0.975174, 1.487244, 1.085281, -0.567993, 0.776093, -0.381592], [-0.009827, -0.553009, -0.213135, 0.837341, 0.482712, -0.939423, 0.140884, 0.330566], [-0.018127, -1.362335, -0.199265, 1.260742, 0.005188, -1.445068, -1.159653, 1.220825], [0.186172, -1.727814, -0.246552, 1.544128, 0.285416, 0.081848, -1.634003, -0.47522], [0.193649, -1.144043, -0.334854, 1.220276, 1.241302, 1.554382, 0.57048, -1.334961], [0.344604, -1.647461, -0.720749, 0.993774, 0.585709, 0.953522, -0.493042, -1.845703], [0.37471, -1.989471, -0.518555, 0.555908, -0.025787, 0.148132, -1.463425, -0.844849], [0.34523, -1.821625, -0.809418, 0.59137, -0.577927, 0.037903, -2.067764, -0.519531], 
[0.413193, -1.503876, -0.752243, 0.280396, -0.236206, 0.429932, -1.684097, -0.724731], [0.331299, -1.349243, -0.890121, -0.178589, -0.285721, 0.809875, -2.012329, -0.157227], [0.278946, -1.090057, -0.670441, -0.477539, -0.267105, 0.446045, -1.95668, 0.501343], [0.127304, -0.977112, -0.660324, -1.011658, -0.547409, 0.349182, -1.357574, 1.045654], [0.217728, -0.793182, -0.496262, -1.259949, -0.128937, 0.38855, -1.513306, 1.863647], [0.240143, -0.891541, -0.619995, -1.478577, -0.361481, 0.258362, -1.630585, 1.841064], [0.241547, -0.758453, -0.515442, -1.370605, -0.428238, 0.23996, -1.469406, 1.307617], [0.289948, -0.714661, -0.533798, -1.574036, 0.017929, -0.368317, -1.290283, 0.851563], [0.304916, -0.783752, -0.459915, -1.523621, -0.107651, -0.027649, -1.089905, 0.969238], [0.27179, -0.795593, -0.352432, -1.597656, -0.001678, -0.06189, -1.072495, 0.637329], [0.301956, -0.823578, -0.152115, -1.637634, 0.2034, -0.214508, -1.315491, 0.773071], [0.282486, -0.853271, -0.162094, -1.561096, 0.15686, -0.289307, -1.076874, 0.673706], [0.299881, -0.97052, -0.051086, -1.431152, -0.074692, -0.32428, -1.385452, 0.684326], [0.220886, -1.072266, -0.269531, -1.038269, 0.140533, -0.711273, -1.7453, 1.090332], [0.177628, -1.229126, -0.274292, -0.943481, 0.483246, -1.214447, -2.026321, 0.719971], [0.176987, -1.137543, -0.007645, -0.794861, 0.965118, -1.084717, -2.37677, 0.598267], [0.135727, -1.36795, 0.09462, -0.776367, 0.946655, -1.157959, -2.794403, 0.226074], [0.067337, -1.648987, 0.535721, -0.665833, 1.506119, -1.348755, -3.092728, 0.281616], [-0.038101, -1.437347, 0.983917, -0.280762, 1.880722, -1.351318, -3.002258, -0.599609], [-0.152573, -1.146027, 0.717545, -0.60321, 2.126541, -0.59198, -2.282028, -1.048584], [-0.113525, -0.629669, 0.925323, 0.465393, 2.368698, -0.352661, -1.969391, -0.915161], [-0.140121, -0.311951, 0.884262, 0.809021, 1.557693, -0.552429, -1.776062, -0.925537], [-0.189423, -0.117767, 0.975174, 1.595032, 1.284485, -0.698639, -2.007202, -1.307251], [-0.048874, 
-0.176941, 0.820679, 1.306519, 0.584259, -0.913147, -0.658066, -0.630981], [-0.127594, 0.33313, 0.791336, 1.400696, 0.685577, -1.500275, -0.657959, -0.207642], [-0.044128, 0.653351, 0.615326, 0.476685, 1.099625, -0.902893, -0.154449, 0.325073], [-0.150223, 1.059845, 1.208405, -0.038635, 0.758667, 0.458038, -0.178909, -0.998657], [-0.099854, 1.127197, 0.789871, -0.013611, 0.452805, 0.736176, 0.948273, -0.236328], [-0.250275, 1.188568, 0.935989, 0.34314, 0.130463, 0.879913, 1.669037, 0.12793], [-0.122818, 1.441223, 0.670029, 0.389526, -0.15274, 1.293549, 1.22908, -1.132568]]
#this one doesn't
reference = [[-0.453598, -2.439209, 0.973587, 1.362091, -0.073654, -1.755112, 1.090057, 4.246765], [-0.448502, -2.621201, 0.723282, 1.257324, 0.26619, -1.375351, 1.328735, 4.46991], [-0.481247, -2.29718, 0.612854, 1.078033, 0.309708, -2.037506, 1.056305, 3.181702], [-0.42482, -2.306702, 0.436157, 1.529907, 0.50708, -1.930069, 0.653198, 3.561768], [-0.39032, -2.361343, 0.589294, 1.965607, 0.611801, -2.417084, 0.035675, 3.381104], [-0.233444, -2.281525, 0.703171, 2.17868, 0.519257, -2.474442, -0.502808, 3.569153], [-0.174652, -1.924591, 0.180267, 2.127075, 0.250626, -2.208527, -0.396591, 2.565552], [-0.121078, -1.53801, 0.234344, 2.221039, 0.845367, -1.516205, -0.174149, 1.298645], [-0.18631, -1.047806, 0.629654, 2.073303, 0.775024, -1.931076, 0.382706, 2.278442], [-0.160477, -0.78743, 0.694214, 1.917572, 0.834885, -1.574707, 0.780045, 2.370422], [-0.203659, -0.427246, 0.726486, 1.548767, 0.465698, -1.185379, 0.555206, 2.619629], [-0.208298, -0.393707, 0.771881, 1.646484, 0.612946, -0.996277, 0.658539, 2.499146], [-0.180679, -0.166656, 0.689209, 1.205994, 0.3918, -1.051483, 0.771072, 1.854553], [-0.1978, 0.082764, 0.723541, 1.019104, 0.165405, -0.127533, 1.0522, 0.552368], [-0.171127, 0.168533, 0.529541, 0.584839, 0.702011, -0.36525, 0.711792, 1.029114], [-0.224243, 0.38765, 0.916031, 0.45108, 0.708923, -0.059326, 1.016312, 0.437561], [-0.217072, -0.981766, 1.67363, 1.864014, 0.050812, -2.572815, -0.22937, 0.757996], [-0.284714, -0.784927, 1.720383, 1.782379, -0.093414, -2.492111, 0.623398, 0.629028], [-0.261169, -0.427979, 1.680038, 1.585358, 0.067093, -1.8181, 1.276291, 0.838989], [-0.183075, -0.08197, 1.094147, 1.120392, -0.117752, -0.86142, 1.94194, 0.966858], [-0.188919, 0.121521, 1.277664, 0.90979, 0.114288, -0.880875, 1.920517, 0.95752], [-0.226868, 0.338455, 0.78067, 0.803009, 0.347092, -0.387955, 0.641296, 0.374634], [-0.206329, 0.768158, 0.759537, 0.264099, 0.15979, 0.152618, 0.911636, -0.011597], [-0.230453, 0.495941, 0.547165, 0.137604, 0.36377, 0.594406, 
1.168839, 0.125916], [0.340851, -0.382736, -1.060455, -0.267792, 1.1306, 0.595047, -1.544922, -1.6828], [0.341492, -0.325836, -1.07164, -0.215607, 0.895645, 0.400177, -0.773956, -1.827515], [0.392075, -0.305389, -0.885422, -0.293427, 0.993225, 0.66655, -1.061218, -1.730713], [0.30191, -0.339005, -0.877853, 0.153992, 0.986588, 0.711823, -1.100525, -1.648376], [0.303574, -0.491241, -1.000183, 0.075378, 0.686295, 0.752792, -1.192123, -1.744568], [0.315781, -0.629456, -0.996063, 0.224731, 1.074173, 0.757736, -1.170807, -2.08313], [0.313675, -0.804688, -1.00325, 0.431641, 0.685883, 0.538879, -0.988373, -2.421326], [0.267181, -0.790329, -0.726974, 0.853027, 1.369629, -0.213638, -1.708023, -1.977844], [0.304459, -0.935257, -0.778061, 1.042633, 1.391861, -0.296768, -1.562164, -2.014099], [0.169754, -0.792953, -0.481842, 1.404236, 0.766983, -0.29805, -1.587265, -1.25531], [0.15918, -0.9814, -0.197662, 1.748718, 0.888367, -0.880234, -1.64949, -1.359802], [0.028244, -0.772934, -0.186172, 1.594238, 0.863571, -1.224701, -1.153183, -0.292664], [-0.020401, -0.461578, 0.368088, 1.000366, 1.079636, -0.389603, -0.144409, 0.651733], [0.018555, -0.725418, 0.632599, 1.707336, 0.535049, -1.783859, -0.916122, 1.557007], [-0.038971, -0.797668, 0.820419, 1.483093, 0.350494, -1.465073, -0.786453, 1.370361], [-0.244888, -0.469513, 1.067978, 1.028809, 0.4879, -1.796585, -0.77887, 1.888977], [-0.260193, -0.226593, 1.141754, 1.21228, 0.214005, -1.200943, -0.441177, 0.532715], [-0.165283, 0.016129, 1.263016, 0.745514, -0.211288, -0.802368, 0.215698, 0.316406], [-0.353134, 0.053787, 1.544189, 0.21106, -0.469086, -0.485367, 0.767761, 0.849548], [-0.330215, 0.162704, 1.570053, 0.304718, -0.561172, -0.410294, 0.895126, 0.858093], [-0.333847, 0.173904, 1.56958, 0.075531, -0.5569, -0.259552, 1.276764, 0.749084], [-0.347107, 0.206665, 1.389832, 0.50473, -0.721664, -0.56955, 1.542618, 0.817444], [-0.299057, 0.140244, 1.402924, 0.215363, -0.62767, -0.550461, 1.60788, 0.506958], [-0.292084, 0.052063, 
1.463348, 0.290497, -0.462875, -0.497452, 1.280609, 0.261841], [-0.279877, 0.183548, 1.308609, 0.305756, -0.6483, -0.374771, 1.647781, 0.161865], [-0.28389, 0.27916, 1.148636, 0.466736, -0.724442, -0.21991, 1.819901, -0.218872], [-0.275528, 0.309753, 1.192856, 0.398163, -0.828781, -0.268066, 1.763672, 0.116089], [-0.275284, 0.160019, 1.200623, 0.718628, -0.925552, -0.026596, 1.367447, 0.174866], [-0.302795, 0.383438, 1.10556, 0.441833, -0.968323, -0.137375, 1.851791, 0.357971], [-0.317078, 0.22876, 1.272217, 0.462219, -0.855789, -0.294296, 1.593994, 0.127502], [-0.304932, 0.207718, 1.156189, 0.481506, -0.866776, -0.340027, 1.670105, 0.657837], [-0.257217, 0.155655, 1.041428, 0.717926, -0.761597, -0.17244, 1.114151, 0.653503], [-0.321426, 0.292358, 0.73848, 0.422607, -0.850754, -0.057907, 1.462357, 0.697754], [-0.34642, 0.361526, 0.69722, 0.585175, -0.464508, -0.26651, 1.860596, 0.106201], [-0.339844, 0.584229, 0.542603, 0.184937, -0.341263, 0.085648, 1.837311, 0.160461], [-0.32338, 0.661224, 0.512833, 0.319702, -0.195572, 0.004028, 1.046799, 0.233704], [-0.346329, 0.572388, 0.385986, 0.118988, 0.057556, 0.039001, 1.255081, -0.18573], [-0.383392, 0.558395, 0.553391, -0.358612, 0.443573, -0.086014, 0.652878, 0.829956], [-0.420395, 0.668991, 0.64856, -0.021271, 0.511475, 0.639221, 0.860474, 0.463196], [-0.359039, 0.748672, 0.522964, -0.308899, 0.717194, 0.218811, 0.681396, 0.606812], [-0.323914, 0.942627, 0.249069, -0.418365, 0.673599, 0.797974, 0.162674, 0.120361], [-0.411301, 0.92775, 0.493332, -0.286346, 0.165054, 0.63446, 1.085571, 0.120789], [-0.346191, 0.632309, 0.635056, -0.402496, 0.143814, 0.785614, 0.952164, 0.482727], [-0.203812, 0.789261, 0.240433, -0.47699, -0.12912, 0.91832, 1.145493, 0.052002], [-0.048203, 0.632095, 0.009583, -0.53833, 0.232727, 1.293045, 0.308151, 0.188904], [-0.062393, 0.732315, 0.06694, -0.697144, 0.126221, 0.864578, 0.581635, -0.088379]]
query = [[-0.113144, -3.316223, -1.101563, -2.128418, 1.853867, 3.61972, 1.218185, 1.71228], [-0.128952, -3.37915, -1.152237, -2.033081, 1.860199, 4.008179, 0.445938, 1.665894], [-0.0392, -2.976654, -0.888245, -1.613953, 1.638641, 3.849518, 0.034073, 0.768188], [-0.146042, -2.980713, -1.044113, -1.44397, 0.954514, 3.20929, -0.232422, 1.050781], [-0.155029, -2.997192, -1.064438, -1.369873, 0.67688, 2.570709, -0.855347, 1.523438], [-0.102341, -2.686401, -1.029648, -1.00531, 0.950089, 1.933228, -0.526367, 1.598633], [-0.060272, -2.538727, -1.278259, -0.65332, 0.630875, 1.459717, -0.264038, 1.872925], [0.064087, -2.592682, -1.112823, -0.775024, 0.848618, 0.810883, 0.298965, 2.312134], [0.111557, -2.815277, -1.203506, -1.173584, 0.54863, 0.46756, -0.023071, 3.029053], [0.266068, -2.624786, -1.089066, -0.864136, 0.055389, 0.619446, -0.160965, 2.928589], [0.181488, -2.31073, -1.307785, -0.720276, 0.001297, 0.534668, 0.495499, 2.989502], [0.216202, -2.25354, -1.288193, -0.902039, -0.152283, -0.060791, 0.566315, 2.911621], [0.430084, -2.0289, -1.099594, -1.091736, -0.302505, -0.087799, 0.955963, 2.677002], [0.484253, -1.412842, -0.881882, -1.087158, -1.064072, -0.145935, 1.437683, 2.606567], [0.339081, -1.277222, -1.24498, -1.048279, -0.219498, 0.448517, 1.168625, 0.563843], [0.105728, 0.138275, -1.01413, -0.489868, 1.319275, 1.604645, 1.634003, -0.94812], [-0.209061, 1.025665, 0.180405, 0.955566, 1.527405, 0.91745, 1.951233, -0.40686], [-0.136993, 1.332275, 0.639862, 1.277832, 1.277313, 0.361267, 0.390717, -0.728394], [-0.217758, 1.416718, 1.080002, 0.816101, 0.343933, -0.154175, 1.10347, -0.568848]]
reference = np.array( reference )
query = np.array( query )
rpy2.robjects.numpy2ri.activate()
# Set up our R namespaces
R = rpy2.robjects.r
rNull = R("NULL")
rprint = rpy2.robjects.globalenv.get("print")
rplot = rpy2.robjects.r('plot')
distConstr = rpy2.robjects.r('proxy::dist')
DTW = importr('dtw')
stepName = "asymmetricP05"
stepPattern = rpy2.robjects.r( stepName )
canDist = distConstr( reference, query, "Euclidean" )  # local cost matrix: rows follow reference, columns follow query
alignment = R.dtw(canDist, rNull, "Euclidean", stepPattern, "none", True, False, True, False )
For some series the script runs without error, but for others it fails; see the commented lines for examples. Notably, the error does not appear with the classic constraint. I suspect I have not set something up correctly, but I am no expert in Python or R, so I am hoping that others who have used the R dtw package can help. Apologies for the long lines for reference and query (the data are the MFCCs of a 2-second WAV file).
One of the two series is too short to be compatible with the slope-limited step pattern you chose (asymmetricP05). Start with the common symmetric2 pattern, which does not restrict slopes, before trying the more exotic ones.
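To see why length matters, here is a rough pure-Python sketch that checks whether any warping path can exist for two series of given lengths. It is not the dtw package's code. The move lists are assumptions: `P05_MOVES` is my reading of the Sakoe-Chiba P = 1/2 table (net slopes between 1/3 and 3), and `path_exists` is a hypothetical helper; printing the step pattern in R shows what the package really uses. The series posted here have 70 and 19 frames, and since the cost matrix is built as dist(reference, query), I assume the 70 frames run along the rows.

```python
# Net (rows advanced, columns advanced) per step.
# ASSUMPTION: mirrors the Sakoe-Chiba P = 1/2 slope limits, not copied from the dtw package.
P05_MOVES = [(1, 3), (1, 2), (1, 1), (2, 1), (3, 1)]
# symmetric2 allows diagonal, horizontal and vertical steps, so slopes are unrestricted.
SYMMETRIC2_MOVES = [(1, 1), (0, 1), (1, 0)]


def path_exists(n_rows, n_cols, moves, open_end=False):
    """Return True if some path from cell (0, 0) covers all rows using only `moves`.

    With open_end=True the path may stop in any column of the last row;
    otherwise it must end in the bottom-right cell.
    """
    reachable = [[False] * n_cols for _ in range(n_rows)]
    reachable[0][0] = True
    for i in range(n_rows):
        for j in range(n_cols):
            if reachable[i][j]:
                continue
            # A cell is reachable if any allowed move arrives from a reachable cell.
            for di, dj in moves:
                pi, pj = i - di, j - dj
                if pi >= 0 and pj >= 0 and reachable[pi][pj]:
                    reachable[i][j] = True
                    break
    if open_end:
        return any(reachable[n_rows - 1])
    return reachable[n_rows - 1][n_cols - 1]


if __name__ == "__main__":
    # Frame counts of the reference and query posted in the question.
    print("slope-limited:", path_exists(70, 19, P05_MOVES, open_end=True))
    print("symmetric2:   ", path_exists(70, 19, SYMMETRIC2_MOVES, open_end=True))
```

Under these assumed moves, each step advances at least one column and at most three rows, so 19 columns can cover at most 55 rows and the 70 by 19 case has no path even with an open end. With the symmetric2 moves a path always exists. In the script above that corresponds to setting stepName to "symmetric2".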
