I am trying to export the summary of my multiple regression models in a table.
results = {'A': result.summary(),
           'B': result1.summary(), 'C': result2.summary(), 'D': result3.summary(), 'E': result4.summary()}
df2 = pd.DataFrame({'Model':[], 'Param':[], 'Value':[]})
for mod in results.keys():
    for col in results[mod].tables[0].columns:
        if col % 2 == 0:
            df2 = df2.append(pd.DataFrame({'Model': [mod]*results[mod].tables[0][col].size,
                                           'Param': results[mod].tables[0][col].values,
                                           'Value': results[mod].tables[0][col+1].values}))
print(df2)
When I run the code, it gives me this error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-280-952fff354224> in <module>
3 df2 = pd.DataFrame({'Model':[], 'Param':[], 'Value':[]})
4 for mod in results.keys():
----> 5 for col in results[mod].tables[0].column:
6 if col % 2 == 0:
7 df2 = df2.append(pd.DataFrame({'Model': [mod]*results[mod].tables[0][col].size,
AttributeError: 'SimpleTable' object has no attribute 'column'
The SimpleTable in this context is statsmodels.iolib.table.SimpleTable. We can use pandas.DataFrame.from_records to convert it to a DataFrame. From there, you can access the columns easily.
Assume this SimpleTable is accessed through a variable named "t":
import pandas as pd
df = pd.DataFrame.from_records(t.data)
header = df.iloc[0] # grab the first row for the header
df = df[1:] # take the data less the header row
df.columns = header
print(df.shape)
col = df['your_col_name'] # select the column you need (use return only inside a function)
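As a minimal sketch of the conversion above, the snippet below uses a hand-written list of rows in place of t.data. It assumes that t.data is a list of rows whose first row is the header; the helper name rows_to_frame and the sample values are hypothetical and only there for illustration.

```python
import pandas as pd


def rows_to_frame(rows):
    """Turn a list of rows (first row = header) into a DataFrame."""
    df = pd.DataFrame.from_records(rows)
    header = df.iloc[0]                  # first row holds the column names
    df = df[1:].reset_index(drop=True)   # remaining rows are the data
    df.columns = list(header)
    return df


# Hypothetical stand-in for t.data
rows = [
    ["", "coef", "std err"],
    ["const", "1.50", "0.20"],
    ["x1", "0.75", "0.10"],
]

table = rows_to_frame(rows)
print(table.shape)
print(table["coef"].tolist())
```

With a real SimpleTable you would pass t.data instead of rows, provided it has the layout assumed here.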
It's hard to tell without seeing how you're creating result.summary() and the others, but it's likely that the SimpleTable API follows similar pandas APIs, in which case you're looking for the columns attribute (note the plural 's'; your traceback shows column).
a=["ExpNCCIFactor","Requestid","EffDate","TransresposnseDate","QuoteEffDate","ApplicationID","PortUrl","UQuestion","DescriptionofOperations","Error"]
d = [ExpNCCIFactor,Requestid,EffDate,TransresposnseDate,QuoteEffDate,ApplicationID,PortUrl,UQuestion,DescriptionofOperations,Error]
df2 = pd.DataFrame(data = d , columns = a)
I got this error:
Traceback (most recent call last):
File "C:\Users\praka\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\internals\construction.py", line 982, in _finalize_columns_and_data
columns = _validate_or_indexify_columns(contents, columns)
File "C:\Users\praka\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\internals\construction.py", line 1030, in _validate_or_indexify_columns
raise AssertionError(
AssertionError: 10 columns passed, passed data had 11 columns
You are creating a DataFrame from a list.
In your case, the data argument should be a list of lists, one inner list per row:
a=["ExpNCCIFactor","Requestid","EffDate","TransresposnseDate","QuoteEffDate","ApplicationID","PortUrl","UQuestion","DescriptionofOperations","Error"]
d = [[ExpNCCIFactor,Requestid,EffDate,TransresposnseDate,QuoteEffDate,ApplicationID,PortUrl,UQuestion,DescriptionofOperations,Error]]
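To see why the extra brackets matter, here is a small sketch with made-up scalar values (the shortened column list and the values are hypothetical). Each inner list is treated as one row, so a single row has to be wrapped in an outer list.

```python
import pandas as pd

columns = ["Requestid", "EffDate", "Error"]
row = [101, "2022-01-01", "none"]   # hypothetical scalar values

# One inner list per row: the outer list holds a single row here.
df_one_row = pd.DataFrame(data=[row], columns=columns)
print(df_one_row.shape)
```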
It seems that one of the items in your variable d has a different shape from the others, probably two columns instead of one.
Try this code to iterate over them and print their shapes:
import numpy as np
c = 0
for i in d:
    print('Shape of column {} is {}'.format(c, np.shape(i)))
    c += 1
df = pd.DataFrame({'a': ['Anakin Ana', 'Anakin Ana, Chris Cannon', 'Chris Cannon', 'Bella Bold'],
                   'b': ['Bella Bold, Chris Cannon', 'Donald Deakon', 'Bella Bold', 'Bella Bold'],
                   'c': ['Chris Cannon', 'Chris Cannon, Donald Deakon', 'Chris Cannon', 'Anakin Ana, Bella Bold']},
                  index=[0, 1, 2, 3])
Hi everyone,
I'm trying to count how many names two columns have in common in each row.
Above is an example of what my data looks like. At first, I got the error 'float' object has no attribute 'split'. I did some searching, and it seems the error comes from my missing data, which is read as float. But even when I convert the column to string, I keep getting an error.
Below is my code.
import pandas as pd
import csv
filepath = "C:/Users/data/Untitled Folder/creditdata2.csv"
df = pd.read_csv(filepath,encoding='utf-8')
df['word_overlap'] = [set(x[8].astype(str).split(",")) & set(x[10].astype(str).split(",")) for x in df.values]
df['overlap_count'] = df['word_overlap'].str.len()
df.to_csv('creditdata3.csv',mode='a',index=False)
And here is the error
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-21-b85ac8637aae> in <module>
4 df = pd.read_csv(filepath,encoding='utf-8')
5
----> 6 df['word_overlap'] = [set(x[8].astype(str).split(",")) & set(x[10].astype(str).split(",")) for x in df.values]
7 df['overlap_count'] = df['word_overlap'].str.len()
8
<ipython-input-21-b85ac8637aae> in <listcomp>(.0)
4 df = pd.read_csv(filepath,encoding='utf-8')
5
----> 6 df['word_overlap'] = [set(x[8].astype(str).split(",")) & set(x[10].astype(str).split(",")) for x in df.values]
7 df['overlap_count'] = df['word_overlap'].str.len()
8
AttributeError: 'float' object has no attribute 'astype'
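astype is a method of pandas and NumPy objects such as DataFrame, Series, and ndarray. Here x[8] is just a plain Python float (a missing value read as NaN), because you've already indexed into the row x, so it has no astype method.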
Try this:
df['word_overlap'] = [set(str(x[8]).split(",")) & set(str(x[10]).split(",")) for x in df.values]
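One caveat with str(): a missing value becomes the string 'nan', so two missing cells would appear to share a name, and splitting on "," alone leaves a leading space on every name after the first. The following is a minimal sketch that handles both; the helper name split_names and the sample frame are hypothetical, and the columns are addressed by name rather than by position.

```python
import pandas as pd


def split_names(value):
    """Return the set of comma-separated names in a cell; empty set if missing."""
    if pd.isna(value):
        return set()
    return {name.strip() for name in str(value).split(",") if name.strip()}


# Hypothetical sample data with one missing cell
df = pd.DataFrame({
    "a": ["Anakin Ana", "Anakin Ana, Chris Cannon", None],
    "b": ["Bella Bold, Chris Cannon", "Chris Cannon", "Bella Bold"],
})

df["word_overlap"] = [split_names(x) & split_names(y) for x, y in zip(df["a"], df["b"])]
df["overlap_count"] = df["word_overlap"].apply(len)
print(df[["word_overlap", "overlap_count"]])
```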
import pandas as pd
import csv
filepath = "C:/data/Untitled Folder/creditdata2.csv"
df = pd.read_csv(filepath,encoding='utf-8')
def f(columns):
    f_desc, f_def = str(columns[6]), str(columns[7])
    common = set(f_desc.split(",")).intersection(set(f_def.split(",")))
    return common, len(common)
df[['word_overlap', 'word_count']] = df.apply(f, axis=1, raw=True).apply(pd.Series)
df.to_csv('creditdata3.csv',mode='a',index=False)
I found another way to do it. Thank you, everyone!
Below is the problem, the code and the error that arises. top_10_movies has two columns, which are rating and name.
import babypandas as bpd
top_10_movies = bpd.DataFrame().assign(
    Rating = top_10_movie_ratings,
    Name = top_10_movie_names
)
top_10_movies
You can use the assign method to add a column to an already-existing
table, too. Create a new DataFrame called with_ranking by adding a
column named "Ranking" to the table in top_10_movies
import babypandas as bpd
Ranking = my_ranking
with_ranking = top_10_movies.assign(Ranking)
TypeError Traceback (most recent call last)
<ipython-input-41-a56d9c05ae19> in <module>
1 import babypandas as bpd
2 Ranking = my_ranking
----> 3 with_ranking = top_10_movies.assign(Ranking)
TypeError: assign() takes 1 positional argument but 2 were given
assign accepts only keyword arguments, and the keyword becomes the name of the new column. Since the column should be named "Ranking", you can do:
with_ranking = top_10_movies.assign(Ranking = Ranking)
Here's a simple example to check:
df = pd.DataFrame({'col': ['a','b']})
ranks = [1, 2]
df.assign(ranks) # causes the same error
df.assign(rank = ranks) # works
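The same check can be written so that it runs from top to bottom without stopping at the error. This is a sketch using plain pandas rather than babypandas, on the assumption that assign behaves the same way in both; the movie data is made up.

```python
import pandas as pd

# Hypothetical data standing in for top_10_movies and my_ranking
top_10 = pd.DataFrame({"Name": ["Movie A", "Movie B"], "Rating": [9.2, 9.0]})
my_ranking = [1, 2]

try:
    top_10.assign(my_ranking)      # positional argument: rejected
    positional_ok = True
except TypeError:
    positional_ok = False

# The keyword becomes the column name; the original frame is left unchanged.
with_ranking = top_10.assign(Ranking=my_ranking)
print(positional_ok)
print(list(with_ranking.columns))
```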
I have a list of part numbers that I want to use to extract a list of prices on a website.
However I'm getting the below error when running the code:
Traceback (most recent call last):
File "C:/Users/212677036/.PyCharmCE2019.1/config/scratches/scratch_1.py", line 13, in
data = {"partOptionFilter": {"PartNumber": PN(i), "AlternativeOemId": "17155"}}
TypeError: 'DataFrame' object is not callable
Process finished with exit code 1
import requests
import pandas as pd
df = pd.read_excel(r'C:\Users\212677036\Documents\Copy of MIC Parts Review - July 26 19.xlsx')
PN = pd.DataFrame(df, columns = ['Product code'])
#print(PN)
i = 0
Total_rows = PN.shape[0]
while i < Total_rows:
    data = {"partOptionFilter": {"PartNumber": PN(i), "AlternativeOemId": "17155"}}
    r = requests.post('https://www.partsfinder.com/Catalog/Service/GetPartOptions', json=data).json()
    print(r['Data']['PartOptions'][0]['YourPrice'])
    i=i+1
You are calling PN(i). That is why it says
TypeError: 'DataFrame' object is not callable
Parentheses after a name are call syntax, so PN(i) tries to call the DataFrame like a function.
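I am not sure what your df looks like or what you want to extract, but you have to index the DataFrame with square brackets, like this:
PN[i] # selects the column labeled i
or
PN.loc[i, 'columnname'] # row labeled i, column by name
or
PN.iloc[i, 0] # row at position i, first column
or something else, depending on your df.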
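As a small sketch of the positional variant, the loop below builds the request payloads without sending anything over the network. The sample frame is a hypothetical stand-in for the Excel sheet; only the column name 'Product code' and the payload layout come from the question.

```python
import pandas as pd

# Hypothetical stand-in for the data read from the Excel file
df = pd.DataFrame({"Product code": ["A100", "B200", "C300"], "Other": [1, 2, 3]})
PN = pd.DataFrame(df, columns=["Product code"])

payloads = []
for i in range(PN.shape[0]):
    part_number = PN.iloc[i, 0]   # square brackets index; PN(i) would try to call PN
    payloads.append(
        {"partOptionFilter": {"PartNumber": part_number, "AlternativeOemId": "17155"}}
    )

print(payloads[0])
```

Each payload could then be passed to requests.post as in the original loop.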
I am working with a CSV in a format that I cannot change. It contains a multiindex. The raw file looks like this:
I use the following code to build the multiindex, then stack, and then reset the index. It works.
import pandas as pd
myfile = 'c:/temp/myfile.csv'
df = pd.read_csv(myfile, header=[0, 1], tupleize_cols=True)
df.columns = [c for _, c in df.columns[:3]] + [c for c in df.columns[3:]]
df = df.set_index(list(df.columns[:3]), append = True)
df.columns = pd.MultiIndex.from_tuples(df.columns, names = ['hour', 'field'])
df.stack(level=['hour'])
df2 = df.reset_index().copy()
df2
Sometimes the "Zone" field is left blank, though.
Putting the file through the same code gives me this error:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-15-8e51ff24c0c4> in <module>()
6 df.columns = pd.MultiIndex.from_tuples(df.columns, names = ['hour', 'field'])
7 df.stack(level=['hour'])
----> 8 df2 = df.reset_index().copy()
9 df2
C:\Anaconda3\lib\site-packages\pandas\core\frame.py in reset_index(self, level, drop, inplace, col_level, col_fill)
2832
2833 # to ndarray and maybe infer different dtype
-> 2834 level_values = _maybe_casted_values(lev, lab)
2835 if level is None or i in level:
2836 new_obj.insert(0, col_name, level_values)
C:\Anaconda3\lib\site-packages\pandas\core\frame.py in _maybe_casted_values(index, labels)
2796 if labels is not None:
2797 mask = labels == -1
-> 2798 values = values.take(labels)
2799 if mask.any():
2800 values, changed = com._maybe_upcast_putmask(values,
IndexError: cannot do a non-empty take from an empty axes.
Ideally, I would like to keep the NaNs in the df post reset.
I ran into the same problem. This is my hack:
# Loop through the index columns
for clmNm in df_w_idx.index.names:
    print(clmNm)
    # Make a new column in the dataframe
    df_w_idx[clmNm] = df_w_idx.index.get_level_values(clmNm)
# Now you can reset the index
df_w_idx = df_w_idx.reset_index(drop=True).copy()
df_w_idx
Below is fully reproducible code. I am sure there are better ways.
import pandas as pd
import numpy as np
import random
import string
# Create 12 random strings 3 char long
rndm_strgs = [''.join(random.SystemRandom().choice(string.ascii_uppercase + string.digits) for _ in range(3)) for i in range(12)]
rndm_strgs[0] = None
rndm_strgs[5] = None
# Make Dataframe
df = pd.DataFrame({'A' : list('pandasisgood'),
                   'B' : np.nan,
                   'C' : rndm_strgs,
                   'D' : np.random.rand(12)})
# Set an Index -> Columns have Nans
df_w_idx = df.set_index(['A','B','C'])
for clmNm in df_w_idx.index.names:
    print(clmNm)
    df_w_idx[clmNm] = df_w_idx.index.get_level_values(clmNm)
df_w_idx = df_w_idx.reset_index(drop=True).copy()
df_w_idx
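The same hack can be wrapped in a small function so it is easy to reuse. This is only a sketch of the workaround above, not a fix for the underlying issue; the function name index_levels_to_columns and the three-row sample frame are hypothetical.

```python
import numpy as np
import pandas as pd


def index_levels_to_columns(frame):
    """Copy every index level into a regular column, then drop the index."""
    out = frame.copy()
    for name in frame.index.names:
        # get_level_values keeps NaN entries, so missing values survive
        out[name] = frame.index.get_level_values(name).to_numpy()
    return out.reset_index(drop=True)


# Hypothetical sample: level 'B' is entirely NaN, level 'C' has one missing value
df = pd.DataFrame({"A": list("abc"),
                   "B": np.nan,
                   "C": ["x", None, "z"],
                   "D": [1.0, 2.0, 3.0]})
indexed = df.set_index(["A", "B", "C"])

flat = index_levels_to_columns(indexed)
print(flat)
```

Note that the former index levels end up after the data columns, so reorder the columns afterwards if the original order matters.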
Also see issue 6322 on GitHub. It appears to be closed.