Error in using profiling object is not callable - python

I am trying to use pandas profiling to generate a profile report, but I get the error "'module' object is not callable". How can I fix this in a Jupyter notebook from Anaconda?
My code:
import pandas as pd
import pandas_profiling
df = pd.read_csv(r'C:\Users\tai.phan\Desktop\Pythone training\Data\titanic.csv')
pandas_profiling.profile_report(df)
The error:
TypeError: 'module' object is not callable

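pandas_profiling.profile_report is a module, not a function, which is why calling it raises this error. Import the module and call something defined inside it instead: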
import pandas as pd
import pandas_profiling.profile_report as report
df = pd.read_csv(r'C:\Users\tai.phan\Desktop\Pythone training\Data\titanic.csv')
report.whateverfunctionyouwant(df)
Look at the documentation to understand this module better:
https://pandas-profiling.github.io/pandas-profiling/docs/master/index.html
You can also pick from the following list of names available for profile_report:
clear_config
description_set
df_hash
get_description
get_duplicates
get_rejected_variables
get_sample
html
json
preprocess
report
set_variable
set_variables
title
to_app
to_file
to_html
to_json
to_notebook_iframe
to_widgets
widgets

Try df.profile_report(); it worked for me.
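
The difference between a module and something callable inside it can be reproduced with the standard library alone. This is only an analogy: json.decoder stands in for pandas_profiling.profile_report, and JSONDecoder stands in for whatever callable that module defines.

```python
import json
import types

# json.decoder is a submodule, much as pandas_profiling.profile_report appears to be
is_module = isinstance(json.decoder, types.ModuleType)

try:
    json.decoder('{"a": 1}')  # calling the module itself fails
except TypeError as exc:
    error_message = str(exc)

# The callable lives inside the module: here, the JSONDecoder class
decoded = json.decoder.JSONDecoder().decode('{"a": 1}')

print(is_module)
print(error_message)
print(decoded)
```

The same pattern applies to any package: when the traceback says 'module', look for the class or function defined inside that module and call that instead.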

Related

TypeError with all my print functions: "TypeError: 'tuple' object is not callable"

I just started out with Python and wanted to use a print function on my code in Colab. However, all my print functions are now giving the same error: "TypeError: 'tuple' object is not callable".
Therefore I tried a simple
print("Hello")
and even that function is giving the same error. What happened? Everything was working well yesterday.
Below is the code I used to upload my table.
```
#upload weather data
import pandas as pd
from google.colab import files
uploaded = files.upload()
weer = pd.read_csv("weather_netherlands.csv")
df = pd.DataFrame(weer)
```
When I opened a new notebook, the same print function did work.
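
One common way to end up in this state, offered here as an assumption because the offending cell is not shown, is an earlier cell that assigned a tuple to the name print. The kernel keeps that binding until it is removed or the kernel restarts, which would also explain why a new notebook works. A minimal sketch of the effect and one way to undo it:

```python
import builtins

# Hypothetical earlier cell: a stray assignment rebinds `print` to a tuple
print = ("Hello", "world")

try:
    print("Hello")  # now this calls the tuple, not the built-in function
except TypeError as exc:
    error_message = str(exc)

# Point the name back at the built-in function
# (in a notebook, `del print` or restarting the runtime has the same effect)
print = builtins.print
print(error_message)
```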

AttributeError: module 'imp' has no attribute 'fit_transform' in Jupyter Notebook

import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
df = pd.read_csv('palmerpenguins.csv')
dfc = df.copy() # we keep a copy in case we need the original data set
from sklearn.impute import KNNImputer
from sklearn.preprocessing import LabelEncoder
encode = LabelEncoder()
impute = KNNImputer()
from sklearn.preprocessing import MinMaxScaler
sca = MinMaxScaler()
df_sca = pd.DataFrame(sca.fit_transform(df),columns=dfc.columns)
df_sca.head()
new_df = pd.DataFrame(imp.fit_transform(df_sca), columns=dfc.columns)
new_df.isnull().sum()
When I execute this code in Jupyter Notebook, the second line from the bottom raises this error:
AttributeError: module 'imp' has no attribute 'fit_transform'
How can I fix this?
The name imp does not refer to your imputer. The error message shows that imp is a module, probably one imported somewhere earlier in the session, and a module has no fit_transform. You created the imputer as impute, so you most likely meant impute.fit_transform(df_sca).
I would also recommend using a Pipeline object to chain multiple operations on your data.
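
As a rough sketch of the Pipeline suggestion, the scaling and imputation steps could be chained as follows. The small DataFrame is hypothetical data standing in for the numeric penguin columns, and n_neighbors=2 is chosen only because the example is tiny.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Hypothetical numeric data with missing values (not the real data set)
df = pd.DataFrame({
    "bill_length_mm": [39.1, 39.5, np.nan, 36.7, 40.3],
    "body_mass_g": [3750.0, np.nan, 3250.0, 3450.0, 3650.0],
})

# Each step is fitted and applied in order: scale first, then impute
pipeline = Pipeline([
    ("scale", MinMaxScaler()),
    ("impute", KNNImputer(n_neighbors=2)),
])

new_df = pd.DataFrame(pipeline.fit_transform(df), columns=df.columns)
print(new_df.isnull().sum())
```

Because each step has an explicit name inside the pipeline, there is no separate variable such as imp or impute to mistype.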

How to solve "Can't pickle local object" error in using pandarallel

I am trying to use the pandarallel module to speed up my apply functions in pandas. When I run the example provided on the GitHub page of pandarallel, I get the following error: AttributeError: Can't pickle local object 'prepare_worker.<locals>.closure.<locals>.wrapper'
This is my code:
from pandarallel import pandarallel
import pandas as pd
import numpy as np
import math
pandarallel.initialize(nb_workers=4)
df_size = int(5e6)
df = pd.DataFrame(dict(a=np.random.randint(1, 8, df_size),
                       b=np.random.rand(df_size)))
def func(x):
    return x
res_parallel = df.parallel_apply(func, axis=1)
It is a known issue: https://github.com/nalepae/pandarallel/issues/72
There is also a solution posted by C1ARKGABLE: use a new virtual environment with Python 3.7.3 and the following packages:
numpy 1.17.4
pandarallel 1.4.6
pandas 0.25.3
Personal remark: pickle and pandas do not play well together. Avoid if possible.
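
The error message itself comes from the standard pickle module, which cannot serialize a function defined inside another function. The sketch below reproduces that behaviour without pandarallel; prepare_worker and wrapper are hypothetical names borrowed from the traceback, not the library's actual code.

```python
import pickle

def prepare_worker():
    # `wrapper` is a local object: it exists only inside prepare_worker
    def wrapper(x):
        return x
    return wrapper

local_error = None
try:
    pickle.dumps(prepare_worker())
except (AttributeError, pickle.PicklingError) as exc:
    # The exception type depends on the Python version
    local_error = exc

print(type(local_error).__name__, local_error)
```

Whether this limitation is hit depends on how a library passes functions to its worker processes, which may be why pinning specific package versions avoids it.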

pandas_profiling main method not working correctly on Windows 10... Constructor works but not method

df.profile_report() fails immediately after installation using
import pandas_profiling
The package is installed properly, because I can generate a report in Jupyter by importing and using just the constructor ProfileReport(df). However, the syntax df.profile_report() does not work.
When I run df.profile_report() I get the error message below:
```
AttributeError Traceback (most recent call last)
in
----> 1 df.profile_report()
C:\Anaconda3\envs\quantecon\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
5065 if self._info_axis._can_hold_identifiers_and_holds_name(name):
5066 return self[name]
-> 5067 return object.__getattribute__(self, name)
5068
5069 def __setattr__(self, name, value):
AttributeError: 'DataFrame' object has no attribute 'profile_report'
```
Version information:
Python 3.7.1
pandas==0.24.2
Windows 10 2022H2
import numpy as np
import pandas as pd
from pandas_profiling import ProfileReport
# The dataframe is the same as the tutorial example given by the author.
df = pd.DataFrame(np.random.rand(100, 5), columns=['a', 'b', 'c', 'd', 'e'])
df.profile_report() # this fails.
What else I've tried that does work is as follows:
from pandas_profiling import ProfileReport
...steps to create dataframe df
ProfileReport(df)
Using the constructor ProfileReport(df) by itself at least gets me a report in my Jupyter Notebook, so I know the package is installed and working. However, the object.method() route to the report doesn't work, and many other methods rely on that syntax.
I cannot get any dataframe to work with the df.profile_report() method.
```
import numpy as np
import pandas as pd
from pandas_profiling import ProfileReport
# The dataframe is the same as the tutorial example given by the author.
df = pd.DataFrame(
np.random.rand(100, 5),
columns=['a', 'b', 'c', 'd', 'e']
)
df.profile_report() # this fails.
ProfileReport(df) # this works, but `df.profile_report()` does not work.
```
My guess as to what's wrong:
Since the pandas error points to generic.py in the pandas core DataFrame code, and the error is "no attribute 'profile_report'", perhaps the problem lies in the decorator that wraps the DataFrame object and gives it the extra .profile_report() method. That is only a guess. I don't know what's causing the error, since the report works when I "peek under the covers" and use the constructor directly. I just cannot use the other methods that rely on the object.method() syntax.
The .profile_report() syntax was introduced in pandas_profiling version 2.
You can install this version via pip: pip install pandas-profiling.
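
To see which version is currently installed before reinstalling, a small check along these lines may help. It assumes Python 3.8 or later (importlib.metadata is not available in Python 3.7), and installed_version is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(distribution_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None

# "pandas-profiling" is the distribution name used in the pip command
print(installed_version("pandas-profiling"))
```

If this prints a version starting with 1, the .profile_report() syntax is not expected to be available, and upgrading would be the next step.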
EDIT
The way to import the package is:
import pandas_profiling
in contrast to your current approach
from pandas_profiling import ProfileReport
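
The reason an import can matter here is that a package is able to attach a new method to the DataFrame class when it is loaded. The sketch below shows that general mechanism with a hypothetical method named row_count_report; it is an illustration of the idea, not pandas_profiling's actual code.

```python
import pandas as pd

def row_count_report(self):
    """Hypothetical stand-in for a report method: a one-line summary."""
    return f"{len(self)} rows, {len(self.columns)} columns"

df = pd.DataFrame({"a": [1, 2, 3]})
has_method_before = hasattr(df, "row_count_report")

# Attaching the function to the class makes it available on every DataFrame,
# including ones created before the assignment
pd.DataFrame.row_count_report = row_count_report

summary = df.row_count_report()
print(has_method_before, summary)
```

If the code that performs this kind of attachment never runs, or the installed version does not contain it, the DataFrame raises the AttributeError shown in the question.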
This will work for google colab
!pip uninstall -y pandas-profiling
!pip install -U pandas-profiling
Try this:
import pandas_profiling
pandas_profiling.describe_df(data_df)
html_str_output = pandas_profiling.ProfileReport(data_df)

Python 3 Attribute Error: Statsmodels has no attribute 'tools'

I'm trying to use the following code (example):
import pandas as pd
# import statsmodels.api as sm  (tried adding this, no luck)
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
x = pd.DataFrame(...)  # imports a vector
plot_acf(x)
There is some code in between, but the problem arises when Python tries to plot the autocorrelation using statsmodels, and returns the following error:
File "/Users/user/anaconda/lib/python3.6/site-packages/statsmodels/iolib/foreign.py",
line 20, in <module>
import statsmodels.tools.data as data_util
AttributeError: module 'statsmodels' has no attribute 'tools'
I tried reinstalling multiple libraries, but nothing seems to get me past this error. Could this be a statsmodels-side bug?
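
Before concluding that the library is at fault, it may be worth checking which file Python actually loads for a given module name, since a stray local file or a broken install can produce this kind of error. This is a generic diagnostic under that assumption, not a confirmed cause; locate_module is a hypothetical helper name.

```python
import importlib.util

def locate_module(name):
    """Return the file path Python would load for `name`, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin

# Shown with a standard-library module; substitute "statsmodels" when debugging
print(locate_module("json"))
```

A path outside the environment's site-packages directory would suggest that something other than the installed package is being imported.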
