Trying to pass columns into function, getting keyerror (Pandas)

Trying to pass columns into function, getting keyerror (Pandas) - python

I have this code block:
def euc_dist(x,y):
return ((x[0] - y[0])**2 +(x[1] - y[1])**2 )**(1/2)
def dist(s1,s2):
distances = [euc_dist(s1[i],s2[i]) for i in range(s1.shape[0])]
return pd.Series(distances)
distances_df = tracking_data.loc[:,tracking_data[['away_player10_point', 'away_player9_point', 'away_player8_point', 'away_player7_point', 'away_player6_point', 'away_player5_point', 'away_player4_point', 'away_player3_point', 'away_player2_point', 'away_player1_point', 'away_player11_point', 'home_player1_point', 'home_player2_point', 'home_player3_point', 'home_player4_point', 'home_player5_point', 'home_player6_point', 'home_player7_point', 'home_player8_point', 'home_player9_point', 'home_player10_point', 'home_player11_point']].apply(tuple, axis = 1)].apply(dist, args = (tracking_data["ball_point"]))
tracking_data["closest"] = distances_df.idxmin(axis = 1).apply(lambda x: str(x)[:-6])
I am getting this error when running:
KeyError: "None of [Index([
((-22.06, -8.32), (-0.12, 21.38), (-1.49, -9.62), (-0.26, -28.52),
(-19.32, 16.22), (-15.11, 0.43), (-7.69, 32.87), (0.45, -0.25),
(-9.88, 7.67), (-47.29, -0.14), (-18.1, -25.42), (0.46, -19.84),
(7.58, 4.82), (15.33, -23.38), (21.08, 6.57), (14.98, 20.7), (8.14,
-4.27), (21.36, -9.06), (46.92, 0.01), (0.29, 9.88), (0.67, 22.24), (-0.06, -9.07)),\n
((-22.06, -8.32), (-0.07, 21.39), (-1.47, -9.64), (-0.23, -28.51),
(-19.31, 16.22), (-15.1, 0.42), (-7.68, 32.88), (0.46, -0.26), (-9.87,
7.7), (-47.3, -0.15), (-18.09, -25.41), (0.43, -19.83), (7.5600000000000005, 4.83), (15.31, -23.38), (21.06, 6.57), (14.97,
20.72), (8.12, -4.28), (21.33, -9.04), (46.91, 0.02), (0.25, 9.85), (0.67, 22.24), (-0.11, -9.05)),\n
((-22.06, -8.33), (-0.03, 21.39), (-1.43, -9.67), (-0.2, -28.5),
(-19.29, 16.24), (-15.09, 0.42), (-7.66, 32.9), (0.47000000000000003,
-0.27), (-9.85, 7.72), (-47.31, -0.16), (-18.08, -25.4), (0.39, -19.83), (7.55, 4.85), (15.28, -23.38), (21.03, 6.57), (14.95, 20.74), (8.09, -4.28), (21.28, -9.02), (46.91, 0.03), (0.2, 9.82), (0.66,
22.24), (-0.16, -9.02)),\n ((-22.06, -8.34), (0.01, 21.39), (-1.3900000000000001, -9.7), (-0.16, -28.5), (-19.28,
16.25), (-15.08, 0.42), (-7.64, 32.92), (0.49, -0.27), (-9.84, 7.75), (-47.32, -0.16), (-18.07, -25.4), (0.3500000000000...
Please reference this notebook to see my table as it is too large to put here. The work pertaining to this question is at the bottom.
https://github.com/piercepatrick/Articles_EDA/blob/main/nashSCProject.ipynb
I have been trying to work out this problem in my previous question: Pandas Dataframe: Find the column with the closest coordinate point to another columns coordinate point
I have a hunch that this issue lies in the source data since I originally loaded it in as JSON?

First, reset the index with
tracking_data = tracking_data.reset_index()
Then change
distances_df = tracking_data.loc[:,tracking_data[['away_player10_point', 'away_player9_point', 'away_player8_point', 'away_player7_point', 'away_player6_point', 'away_player5_point', 'away_player4_point', 'away_player3_point', 'away_player2_point', 'away_player1_point', 'away_player11_point', 'home_player1_point', 'home_player2_point', 'home_player3_point', 'home_player4_point', 'home_player5_point', 'home_player6_point', 'home_player7_point', 'home_player8_point', 'home_player9_point', 'home_player10_point', 'home_player11_point']].apply(tuple, axis = 1)].apply(dist, args = (tracking_data["ball_point"]))
for
distances_df = tracking_data[['away_player10_point', 'away_player9_point', 'away_player8_point', 'away_player7_point', 'away_player6_point', 'away_player5_point', 'away_player4_point', 'away_player3_point', 'away_player2_point', 'away_player1_point', 'away_player11_point', 'home_player1_point', 'home_player2_point', 'home_player3_point', 'home_player4_point', 'home_player5_point', 'home_player6_point', 'home_player7_point', 'home_player8_point', 'home_player9_point', 'home_player10_point', 'home_player11_point']].apply(dist, args = (tracking_data["ball_point"],))

Related

Max of values from 3 columns in a dataframe?

I would like to calculate the maximum of 3 columns value.
import pandas as pd
import pandas_datareader.data as web
data = web.DataReader("^NSEI" , data_source="yahoo",start = "1/4/2016",end ="6/4/2018")
data=pd.DataFrame(data)
data["High-Low"] = data["High"] - data["Low"]
data["Close-low"] = abs(data["Close"].shift(1) - data["Low"])
data["Close-High"] = abs(data["Close"].shift(1) - data["High"])
data["True_Range"] = data[[data["High-Low"], data["Close-low"],data["Close-High"]]].max(axis=1)
In data["True _Range"] column, I want to take the maximum of value in data["High-low"],data["Close-low"] and data["Close-high"] columns. On this, it is giving a name error.
KeyError: "None of [Index([(156.44970703125, 67.9501953125, 79.75, 118.35009765625, 53.05029296875, 110.75, 100.5, 165.150390625, 161.0, 139.2001953125, 127.25, 98.60009765625, 229.39990234375, 148.7001953125, 105.7998046875, 65.94970703125, 58.19970703125, 59.25, 172.85009765625, 59.2001953125, 148.25, 69.10009765625, 91.099609375, 96.5, 149.349609375, 48.30029296875, 94.10009765625, 248.69970703125, 165.7998046875, 126.0, 166.94970703125, 163.05029296875, 87.25, 80.89990234375, 51.69970703125, 151.0, 81.0498046875, 72.80029296875, 67.7998046875, 268.80029296875, 200.39990234375, 72.2001953125, 77.900390625, 61.7998046875, 85.0, 114.7001953125, 99.7001953125, 83.35009765625, 68.650390625, 92.400390625, 102.85009765625, 105.89990234375, 95.7001953125, 95.849609375, 84.400390625, 56.25, 161.69970703125, 70.64990234375, 98.5, 75.60009765625, 74.0498046875, 60.05029296875, 147.64990234375, 46.89990234375, 94.89990234375, 42.64990234375, 161.94970703125, 54.0498046875, 92.599609375, 77.85009765625, 72.85009765625, 94.35009765625, 50.0, 84.0, 151.9501953125, 50.4501953125, 157.5498046875, 100.349609375, 52.5, 155.10009765625, 51.75, 70.69970703125, 60.5498046875, 120.10009765625, 59.19970703125, 112.2001953125, 66.39990234375, 96.7998046875, 101.75, 60.39990234375, 71.2998046875, 109.400390625, 76.64990234375, 98.39990234375, 45.75, 131.900390625, 134.5, 87.150390625, 49.2001953125, 79.2998046875, ...), (nan, 28.0498046875, 63.44970703125, 184.39990234375, 12.75, 107.0, 76.05029296875, 84.5, 118.60009765625, 109.5, 101.39990234375, 13.14990234375, 193.60009765625, 59.2998046875, 50.80029296875, 1.25, 16.44970703125, 28.14990234375, 21.85009765625, 22.2998046875, 127.900390625, 105.25, 4.150390625, 2.64990234375, 125.89990234375, 112.10009765625, 120.4501953125, 255.75, 107.35009765625, 75.849609375, 125.25, 87.60009765625, 19.39990234375, 45.7998046875, 10.0498046875, 143.849609375, 99.7998046875, 57.30029296875, 14.5, 203.9501953125, 48.05029296875, 85.85009765625, 37.19970703125, 31.5, 43.2001953125, 61.0, 84.39990234375, 25.5498046875, 4.849609375, 85.9501953125, 55.4501953125, 19.35009765625, 5.35009765625, 13.35009765625, 60.4501953125, 44.2998046875, 128.7998046875, 32.85009765625, 46.4501953125, 33.2001953125, 72.2998046875, 8.64990234375, 170.14990234375, 11.4501953125, 78.5, 19.75, 38.35009765625, 8.0498046875, 63.25, 7.7001953125, 37.150390625, 30.64990234375, 38.69970703125, 72.2998046875, 32.5, 22.10009765625, 145.44970703125, 58.5498046875, 72.5, 70.75, 49.75, 0.30029296875, 57.14990234375, 20.099609375, 28.349609375, 106.89990234375, 0.7998046875, 116.19970703125, 42.75, 18.9501953125, 80.0, 103.35009765625, 47.64990234375, 27.5, 15.25, 60.44970703125, 13.60009765625, 7.39990234375, 5.85009765625, 44.2001953125, ...)], dtype='object')] are in the [columns]"

Try:
data["True_Range"] = data[["High-Low","Close-low","Close-High"]].max(axis=1)
Instead of:
data["True_Range"] = data[[data["High-Low"], data["Close-low"],data["Close-High"]]].max(axis=1)

Your error signifies that you have input the data instead of the nested columnnames into your view of the dataframe, which produced the error.
Since the data from the columns themselves are not in the columnnames - you received the error.
One way of avoiding the error is:
data["True_Range"] = data[["High-Low", "Close-low", "Close-High"]].max(axis=1)

pandas will not let me reindex?

I am creating a function to more easily manipulate similar data sets, but for some reason the function is not reindexing my data frame. Could someone tell me what is going on? I am trying to figure out how to reindex and interpolate the data and am wondering why it stops there.
CODE:
import pandas as pd
data2.rename(columns={'DATE':'DATE','DGS20':'Yd'},inplace = True)
data.rename(columns={'DATE':'DATE','DGS10':'Yd'},inplace = True)
def func(dat):
dat.DATE = pd.to_datetime(dat.DATE)
dat.Yd = pd.to_numeric(dat.Yd,errors = "coerce")
dat.index = dat.DATE
dat.drop('DATE',axis = 1,inplace = True)
scale = pd.date_range(start = data.index[0],end = data.index[3774],freq = 'D')
dat = dat.reindex(scale) <--- THIS LINE IS NOT EXECUTING
dat.interpolate(method = 'time',inplace = True)
RESULT:
The function works, but the manipulation is stopping at the line I have pointed out above.
SAMPLE OF DATA:
DATE,DGS5
2004-01-02,3.36
2004-01-05,3.39
2004-01-06,3.26
2004-01-07,3.25
2004-01-08,3.24
2004-01-09,3.05
2004-01-12,3.04
2004-01-13,2.98
2004-01-14,2.96
2004-01-15,2.97
2004-01-16,3.03
2004-01-19,.
2004-01-20,3.05
2004-01-21,3.02
2004-01-22,2.96
2004-01-23,3.06
2004-01-26,3.13
2004-01-27,3.07
2004-01-28,3.22
2004-01-29,3.22
2004-01-30,3.17
2004-02-02,3.18
2004-02-03,3.12
2004-02-04,3.15
2004-02-05,3.21
2004-02-06,3.12
2004-02-09,3.08
2004-02-10,3.13
2004-02-11,3.03
2004-02-12,3.07
2004-02-13,3.01
2004-02-16,.
2004-02-17,3.02
2004-02-18,3.03
2004-02-19,3.02
2004-02-20,3.08
2004-02-23,3.03
2004-02-24,3.01
2004-02-25,2.98
2004-02-26,3.01
2004-02-27,3.01
2004-03-01,2.98
2004-03-02,3.04
2004-03-03,3.06
2004-03-04,3.02
2004-03-05,2.81
2004-03-08,2.74
2004-03-09,2.68
2004-03-10,2.71
2004-03-11,2.72
2004-03-12,2.73
2004-03-15,2.74
2004-03-16,2.65
2004-03-17,2.66
2004-03-18,2.72
2004-03-19,2.75
2004-03-22,2.69
2004-03-23,2.69
2004-03-24,2.68
2004-03-25,2.70
2004-03-26,2.81
2004-03-29,2.86
2004-03-30,2.86
2004-03-31,2.80
2004-04-01,2.87
2004-04-02,3.15
2004-04-05,3.24
2004-04-06,3.19
2004-04-07,3.19
2004-04-08,3.22
2004-04-09,.
2004-04-12,3.26
2004-04-13,3.37
2004-04-14,3.44
2004-04-15,3.45
2004-04-16,3.39
2004-04-19,3.42
2004-04-20,3.45
2004-04-21,3.52
2004-04-22,3.46
2004-04-23,3.58
2004-04-26,3.57
2004-04-27,3.52
2004-04-28,3.60
2004-04-29,3.66
2004-04-30,3.63
2004-05-03,3.63
2004-05-04,3.66
2004-05-05,3.71
2004-05-06,3.72
2004-05-07,3.96
2004-05-10,3.95
2004-05-11,3.94
2004-05-12,3.96
2004-05-13,4.01
2004-05-14,3.92
2004-05-17,3.83
2004-05-18,3.87
2004-05-19,3.93
2004-05-20,3.86
2004-05-21,3.91
2004-05-24,3.90
2004-05-25,3.89
2004-05-26,3.81
2004-05-27,3.74
2004-05-28,3.81
2004-05-31,.
2004-06-01,3.86
2004-06-02,3.91
2004-06-03,3.89
2004-06-04,3.97
2004-06-07,3.95
2004-06-08,3.96
2004-06-09,4.01
2004-06-10,4.00
2004-06-11,.
2004-06-14,4.10
2004-06-15,3.90
2004-06-16,3.96
2004-06-17,3.93
2004-06-18,3.94
2004-06-21,3.91
2004-06-22,3.92
2004-06-23,3.90
2004-06-24,3.85
2004-06-25,3.85
2004-06-28,3.97
2004-06-29,3.92
2004-06-30,3.81
2004-07-01,3.74
2004-07-02,3.62
2004-07-05,.
2004-07-06,3.65

From the v0.23.4 docs:
DataFrame.reindex supports two calling conventions
(index=index_labels, columns=column_labels, ...)
(labels, axis={'index', 'columns'}, ...)
We highly recommend using keyword arguments to clarify your intent.
EDIT: The following code works for me. I added a return statement in my function.
import pandas as pd
raw_series = {'Yd': [3.36, 3.39, 3.26, 3.25, 3.24, 3.05, 3.04, 2.98, 2.96, 2.97, 3.03, '.']}
raw_index = ['2004-01-02', '2004-01-05', '2004-01-06', '2004-01-07', '2004-01-08', '2004-01-09', '2004-01-12', '2004-01-13', '2004-01-14', '2004-01-15', '2004-01-16', '2004-01-19']
dat = pd.DataFrame(raw_series, index=raw_index)
def func(dat):
dat.loc[:, 'Yd'] = pd.to_numeric(dat['Yd'], errors="coerce")
dat.index = pd.to_datetime(dat.index)
scale = pd.date_range(raw_index[0], raw_index[-1], freq='D')
reindexed = dat.reindex(index=scale)
return reindexed.interpolate(method='time')
Output:
Yd
2004-01-02 3.360000
2004-01-03 3.370000
2004-01-04 3.380000
2004-01-05 3.390000
2004-01-06 3.260000
2004-01-07 3.250000
2004-01-08 3.240000
2004-01-09 3.050000
2004-01-10 3.046667
2004-01-11 3.043333
2004-01-12 3.040000
2004-01-13 2.980000
2004-01-14 2.960000
2004-01-15 2.970000
2004-01-16 3.030000
2004-01-17 3.035000
2004-01-18 3.040000
2004-01-19 3.045000
2004-01-20 3.050000
verify the data types:
>>>func(dat).reset_index().dtypes
index datetime64[ns]
Yd float64
dtype: object

Syntax for interp2d or RectBivariateSpline

I have a data set of points, logR, logT, and logX, where X is a function of R and T. It's only a data set, I have no defined function for X. The data is listed in a table, where logR corresponds to columns and logT corresponds to logT. I am attempting to use an interpolation function to evaluate this grid at two inputs of logR and logT. I found my situation most related to this post:
How to pass arrays into Scipy Interpolate RectBivariateSpline?
But I could not arrange things so my function could be evaluated at my inputs. Here are my attempts:
import numpy as np
from scipy.interpolate import RectBivariateSpline, interp2d
op_r = np.array([1e-8, 3.1622e-8, 1e-7, 3.1622e-7, 1e-6, 3.1622e-6, 1e-5,
3.1622e-5, 1e-4, 3.1622e-4, 1e-3, 3.1622e-3, 1e-2, 3.1622e-2, 0.1, .31622, 1,
3.1622, 10])
op_T = np.array([17782.794, 19952.623, 22387.211, 25118.864, 28183.829,
31622.777, 35481.339, 39810.717, 44668.359, 50118.723, 56234.133, 63095.734,
79432.823, 89125.094])
log_op_val = np.array([[-0.598, -0.593, -0.583, -0.568, -0.539, -0.477,
-0.353, -0.142, 0.168, 0.558, 0.990, 1.443, 1.915, 2.407, 2.866, 3.239, 3.517,
3.725, 3.896], [-0.597, -0.592, -0.580, -0.561, -0.532, -0.474, -0.362,
-0.165, 0.138, 0.539, 1.001, 1.476, 1.942, 2.426, 2.912, 3.352, 3.702, 3.968,
4.175], [-0.588, -0.588, -0.578, -0.555, -0.520, -0.462, -0.357, -0.171,
0.124, 0.529, 1.009, 1.507, 2.001, 2.487, 2.979, 3.453, 3.856, 4.176, 4.422],
[-0.545, -0.559, -0.563, -0.546, -0.506, -0.442, -0.338, -0.159, 0.132, 0.538,
1.015, 1.525, 2.051, 2.565, 3.072, 3.563, 3.996, 4.356, 4.634], [-0.520,
-0.521, -0.519, -0.509, -0.475, -0.409, -0.301, -0.122, 0.167, 0.571, 1.052,
1.570, 2.106, 2.642, 3.176, 3.684, 4.136, 4.517, 4.822], [-0.518, -0.514,
-0.504, -0.478, -0.425, -0.344, -0.232, -0.056, 0.226, 0.629, 1.111, 1.631,
2.169, 2.719, 3.276, 3.804, 4.275, 4.672, 4.990], [-0.517, -0.513, -0.504,
-0.479, -0.417, -0.297, -0.129, 0.074, 0.353, 0.734, 1.202, 1.715, 2.250,
2.800, 3.364, 3.907, 4.394, 4.802, 5.127], [-0.518, -0.514, -0.505, -0.484,
-0.429, -0.311, -0.104, 0.185, 0.521, 0.894, 1.329, 1.818, 2.341, 2.883,
3.441, 3.986, 4.481, 4.894, 5.218], [-0.517, -0.514, -0.507, -0.490, -0.443,
-0.337, -0.142, 0.169, 0.588, 1.039, 1.480, 1.936, 2.431, 2.955, 3.496, 4.031,
4.521, 4.934, 5.253], [-0.516, -0.513, -0.507, -0.492, -0.453, -0.361, -0.184,
0.103, 0.510, 1.009, 1.531, 2.022, 2.502, 3.002, 3.519, 4.035, 4.513, 4.920,
5.235], [-0.515, -0.511, -0.506, -0.493, -0.460, -0.381, -0.225, 0.036, 0.409,
0.877, 1.415, 1.973, 2.502, 3.005, 3.505, 4.002, 4.468, 4.868, 5.183],
[-0.515, -0.511, -0.503, -0.490, -0.462, -0.394, -0.257, -0.022, 0.321, 0.759,
1.269, 1.827, 2.403, 2.949, 3.458, 3.948, 4.405, 4.802, 5.113], [-0.516,
-0.512, -0.502, -0.487, -0.460, -0.400, -0.279, -0.066, 0.254, 0.672, 1.164,
1.701, 2.278, 2.851, 3.388, 3.889, 4.347, 4.741, 5.047], [-0.517, -0.512,
-0.503, -0.485, -0.454, -0.397, -0.287, -0.092, 0.211, 0.620, 1.101, 1.628,
2.190, 2.762, 3.322, 3.841, 4.305, 4.695, 4.989], [-0.516, -0.512, -0.503,
-0.484, -0.449, -0.388, -0.283, -0.099, 0.192, 0.596, 1.071, 1.596, 2.148,
2.714, 3.281, 3.811, 4.280, 4.661, 4.937]])
T_1a = 22100.
R_a = rho_ta /(((T_1a)/(1e6))**3)
gri_chi_a = RectBivariateSpline(op_r, op_T, op_val)
chi_a = RectBivariateSpline(R_a, T_1a)
print chi_a
And this is the error I get:
Traceback (most recent call last):
File "model.py", line 279, in <module>
gri_chi_a = RectBivariateSpline(op_r, op_T, op_val)
"/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/scipy/interpolate/fitpack2.py", line 882, in __init__
raise TypeError('x dimension of z must have same number of '
TypeError: x dimension of z must have same number of elements as x
It is the same as if I use the interp2d function. Any help would be appreciated.

Copy-n-pasting your arrays I get:
In [391]: op_r.shape
Out[391]: (19,)
In [393]: op_T.shape
Out[393]: (14,)
In [395]: log_op_val.shape
Out[395]: (15, 19)
15 does not equal 14 or 19!

Writing numbers into file python

I was trying to extract all the elements of the my data points (x,y) tuples, and put them into list of x values and y list, and transfer them to two columns in excel spreadsheet. It seems writing numbers into file is quite difficult. Can anyone shed a light on this problem?
Current state:
xlist=[list[i][0] for i in range(len(list))]
ylist=[list[i][1] for i in range(len(list))]
fob=open('c:/test/a.txt','w')
fob.write(xlist[i] for i in range(len(xlist))
i want to write down a column of numbers in notepad so that I can highlight and copy into spread sheet directly .
Below are my data.
list = [(0.496, 12.49), (0.531, 12.40), (0.578, 12.18), (0.615,
11.96), (0.657, 11.75), (0.731, 11.28), (0.785, 10.85), (0.812,
10.61), (0.883, 9.92), (0.930, 9.40), (0.979, 8.77), (1.026,
8.10), (1.081, 7.23), (1.134, 6.33), (1.189, 5.39), (1.220,
4.85), (1.273, 3.92), (1.332, 2.91), (1.364, 2.55), (1.418,
2.16), (1.467, 1.65), (1.523, 1.17), (1.569, 0.82), (1.626,
0.47), (1.678, 0.21), (1.723, 0.01), (1.776, 0.19), (1.814,
0.28), (1.869, 0.36), (1.933, 0.36), (1.972, 0.31), (2.021,
0.18), (2.081, 0.13), (2.129, 0.46), (2.169, 0.79), (2.219,
1.24), (2.280, 1.84), (2.306, 2.11), (2.358, 2.67), (2.414,
3.37), (2.471, 4.05), (2.505, 4.51), (2.562, 5.22), (2.613,
5.84), (2.652, 6.31), (2.712, 7.01), (2.758, 7.52), (2.802,
7.99), (2.869, 8.63), (2.930, 9.16), (2.971, 9.57), (3.043,
10.35), (3.078, 10.69), (3.119, 11.00), (3.174, 11.26), (3.217,
11.40), (3.261, 11.53), (3.307, 11.55), (3.371, 11.51), (3.432,
11.40), (3.479, 11.26), (3.507, 11.20), (3.557, 11.00), (3.623,
10.55), (3.663, 10.28), (3.729, 9.79), (3.768, 9.57), (3.825,
9.24), (3.880, 8.85), (3.944, 8.41), (3.969, 8.04), (4.014,
7.55), (4.086, 6.67), (4.105, 6.37), (4.166, 5.50), (4.212,
4.88), (4.266, 4.20), (4.311, 3.69), (4.364, 3.06), (4.401,
2.65), (4.453, 2.09), (4.497, 1.68), (4.556, 1.18), (4.602,
0.85), (4.644, 0.57), (4.695, 0.29), (4.754, 0.04), (4.799,
0.11), (4.847, 0.17), (4.918, 0.11), (4.959, 0.04), (4.992,
0.19), (5.063, 0.64), (5.098, 0.90), (5.157, 1.40), (5.201,
1.79), (5.245, 2.20), (5.291, 2.65), (5.326, 3.00), (5.387,
3.65), (5.420, 4.02), (5.469, 4.62), (5.538, 5.44), (5.579,
5.96), (5.629, 6.57), (5.674, 7.14), (5.724, 7.73), (5.798,
8.60), (5.823, 8.88), (5.888, 9.62), (5.919, 9.94), (5.963,
10.41), (6.009, 10.85), (6.050, 11.22), (6.115, 11.71), (6.153,
11.99), (6.222, 12.39), (6.263, 12.61), (6.302, 12.77), (6.377,
12.99), (6.414, 13.03), (6.454, 13.02), (6.522, 12.89), (6.558,
12.74), (6.626, 12.41), (6.677, 12.05), (6.729, 11.64), (6.791,
11.00), (6.832, 10.58), (6.887, 9.92), (6.949, 9.13), (6.996,
8.48), (7.028, 8.09), (7.094, 7.13), (7.123, 6.70), (7.161,
6.16), (7.213, 5.35), (7.250, 4.81), (7.332, 3.61), (7.382,
2.93), (7.420, 2.45), (7.474, 1.88), (7.514, 1.40), (7.576,
0.71), (7.600, 0.50), (7.662, 0.12), (7.725, 0.16), (7.768,
0.26), (7.810, 0.30), (7.858, 0.26), (7.904, 0.18), (7.980,
0.10), (8.021, 0.29), (8.078, 0.65), (8.133, 1.06), (8.165,
1.33), (8.218, 1.83), (8.267, 2.31), (8.321, 2.87), (8.355,
3.27), (8.413, 3.91), (8.473, 4.61), (8.519, 5.22), (8.553,
5.65), (8.643, 6.74), (8.678, 7.23), (8.734, 7.94), (8.760,
8.27), (8.803, 8.81), (8.851, 9.35), (8.905, 9.94), (8.961,
10.45), (9.009, 10.92), (9.053, 11.34), (9.106, 11.75), (9.166,
12.14), (9.228, 12.48), (9.292, 12.71), (9.340, 12.86), (9.384,
13.01), (9.412, 13.05), (9.452, 13.03), (9.472, 13.00)]
Cheers

Export it into a CSV file. Your use case is very simple and you should be able to do it using standard Python.
with open('output.csv', 'w') as f:
for x, y in l:
f.write("%s, %s\n" % (x, y))
Note: list is a reserved word in python and you should not be using it.

Use openpyxl to write .xslx files from Python:
import openpyxl
my_list = [(0.496, 12.49), (0.531, 12.40), (0.578, 12.18), (0.615,
11.96), (0.657, 11.75), (0.731, 11.28), (0.785, 10.85), (0.812,
10.61), (0.883, 9.92), (0.930, 9.40), (0.979, 8.77), (1.026,
8.10), (1.081, 7.23), (1.134, 6.33), (1.189, 5.39), (1.220,
4.85), (1.273, 3.92), (1.332, 2.91), (1.364, 2.55), (1.418,
2.16), (1.467, 1.65), (1.523, 1.17), (1.569, 0.82), (1.626,
0.47), (1.678, 0.21), (1.723, 0.01), (1.776, 0.19), (1.814,
0.28), (1.869, 0.36), (1.933, 0.36), (1.972, 0.31), (2.021,
0.18), (2.081, 0.13), (2.129, 0.46), (2.169, 0.79), (2.219,
1.24), (2.280, 1.84), (2.306, 2.11), (2.358, 2.67), (2.414,
3.37), (2.471, 4.05), (2.505, 4.51), (2.562, 5.22), (2.613,
5.84), (2.652, 6.31), (2.712, 7.01), (2.758, 7.52), (2.802,
7.99), (2.869, 8.63), (2.930, 9.16), (2.971, 9.57), (3.043,
10.35), (3.078, 10.69), (3.119, 11.00), (3.174, 11.26), (3.217,
11.40), (3.261, 11.53), (3.307, 11.55), (3.371, 11.51), (3.432,
11.40), (3.479, 11.26), (3.507, 11.20), (3.557, 11.00), (3.623,
10.55), (3.663, 10.28), (3.729, 9.79), (3.768, 9.57), (3.825,
9.24), (3.880, 8.85), (3.944, 8.41), (3.969, 8.04), (4.014,
7.55), (4.086, 6.67), (4.105, 6.37), (4.166, 5.50), (4.212,
4.88), (4.266, 4.20), (4.311, 3.69), (4.364, 3.06), (4.401,
2.65), (4.453, 2.09), (4.497, 1.68), (4.556, 1.18), (4.602,
0.85), (4.644, 0.57), (4.695, 0.29), (4.754, 0.04), (4.799,
0.11), (4.847, 0.17), (4.918, 0.11), (4.959, 0.04), (4.992,
0.19), (5.063, 0.64), (5.098, 0.90), (5.157, 1.40), (5.201,
1.79), (5.245, 2.20), (5.291, 2.65), (5.326, 3.00), (5.387,
3.65), (5.420, 4.02), (5.469, 4.62), (5.538, 5.44), (5.579,
5.96), (5.629, 6.57), (5.674, 7.14), (5.724, 7.73), (5.798,
8.60), (5.823, 8.88), (5.888, 9.62), (5.919, 9.94), (5.963,
10.41), (6.009, 10.85), (6.050, 11.22), (6.115, 11.71), (6.153,
11.99), (6.222, 12.39), (6.263, 12.61), (6.302, 12.77), (6.377,
12.99), (6.414, 13.03), (6.454, 13.02), (6.522, 12.89), (6.558,
12.74), (6.626, 12.41), (6.677, 12.05), (6.729, 11.64), (6.791,
11.00), (6.832, 10.58), (6.887, 9.92), (6.949, 9.13), (6.996,
8.48), (7.028, 8.09), (7.094, 7.13), (7.123, 6.70), (7.161,
6.16), (7.213, 5.35), (7.250, 4.81), (7.332, 3.61), (7.382,
2.93), (7.420, 2.45), (7.474, 1.88), (7.514, 1.40), (7.576,
0.71), (7.600, 0.50), (7.662, 0.12), (7.725, 0.16), (7.768,
0.26), (7.810, 0.30), (7.858, 0.26), (7.904, 0.18), (7.980,
0.10), (8.021, 0.29), (8.078, 0.65), (8.133, 1.06), (8.165,
1.33), (8.218, 1.83), (8.267, 2.31), (8.321, 2.87), (8.355,
3.27), (8.413, 3.91), (8.473, 4.61), (8.519, 5.22), (8.553,
5.65), (8.643, 6.74), (8.678, 7.23), (8.734, 7.94), (8.760,
8.27), (8.803, 8.81), (8.851, 9.35), (8.905, 9.94), (8.961,
10.45), (9.009, 10.92), (9.053, 11.34), (9.106, 11.75), (9.166,
12.14), (9.228, 12.48), (9.292, 12.71), (9.340, 12.86), (9.384,
13.01), (9.412, 13.05), (9.452, 13.03), (9.472, 13.00)]
book = openpyxl.Workbook()
sheet = book.active
for i, value in enumerate(my_list):
sheet.cell(row=i+1, column=1).value = value[0]
sheet.cell(row=i+1, column=2).value = value[1]
book.save('test.xlsx')

When you have data like numbers or objects in memory, it's generally not correct to dump that data directly into disk, you'll want to serialize it.
The easiest way to serialize it is with print which automatically calls the "serialization" method __str__. The problem with this serialization method is that's not always easy to deserialize.
When you have a data structure, like the matrix you describe, you'll want a serialization method that will preserve the structure and allow to reconstruct it in memory. In this case you can use CSV (through the csv module), JSON (through the json module) or many others.
Use CSV.

Retrieving results from dictionary of a range

I have a ENTIREMAP which has a mapping of all names and toy prices. What I want to try and do is create a function getResults like below:
#################
def getResults(name, price):
# where name could be 'Laura' and price is 0.02
# then from the ENTIREMAP find the key 'Laura' and if 0.02 is in the range
# of the prices in the map i.e since 0.02 is between (0.0,0.05) then return
# ('PEN', 'BLUE')
prices = [d[name] for d in ENTIRELIST if name in d]
if prices:
print prices[0]
###################
GIRLTOYPRICES = {(0.0,0.05):('PEN', 'BLUE'),
(0.05,0.08):('GLASSES', 'DESIGNER'),
(0.08,0.12):('TOP', 'STRIPY'),
}
BOYTOYPRICES = {(0.0,0.10):('BOOK', 'HARRY POTTER'),
(0.10,0.15):('BLANKET', 'SOFT'),
(0.15,0.40):('GBA', 'GAMES'),
}
GIRLS = ['Laura', 'Samantha']
BOYS = ['Mike','Fred']
GIRLLIST = [{girl: GIRLTOYPRICES} for girl in GIRLS]
BOYLIST = [{boy: BOYTOYPRICES} for boy in BOYS]
ENTIRELIST = GIRLMAP + BOYMAP
print ENTIRELIST
[{'Laura': {(0.0, 0.05): ('PEN', 'BLUE'), (0.08, 0.12): ('TOP', 'STRIPY'), (0.05, 0.08): ('GLASSES', 'DESIGNER')}}, {'Samantha': {(0.0, 0.05): ('PEN', 'BLUE'), (0.08, 0.12): ('TOP', 'STRIPY'), (0.05, 0.08): ('GLASSES', 'DESIGNER')}}, {'Mike': {(0.0, 0.1): ('BOOK', 'HARRY POTTER'), (0.15, 0.4): ('GBA', 'GAMES'), (0.1, 0.15): ('BLANKET', 'SOFT')}}, {'Fred': {(0.0, 0.1): ('BOOK', 'HARRY POTTER'), (0.15, 0.4): ('GBA', 'GAMES'), (0.1, 0.15): ('BLANKET', 'SOFT')}}]
Any help would be appreciated.

Kind of a weird data structure, but:
for person in ENTIRELIST:
person_name, toys = person.items()[0]
if person_name != name: # inverted to reduce nesting
continue
for (price_min, price_max), toy in toys.items():
if price_min <= price < price_max:
return toy
This is simpler (and more effective):
GIRLMAP = {girl: GIRLTOYPRICES for girl in GIRLS}
BOYMAP = {boy: BOYTOYPRICES for boy in BOYS}
ENTIREMAP = dict(GIRLMAP, **BOYMAP)
for (price_min, price_max), toy in ENTIREMAP[name].items():
if price_min <= price < price_max:
return toy

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Trying to pass columns into function, getting keyerror (Pandas) - python

Related

Max of values from 3 columns in a dataframe?

pandas will not let me reindex?

Syntax for interp2d or RectBivariateSpline

Writing numbers into file python

Retrieving results from dictionary of a range

Categories

Resources