I am new to Python and I am trying to understand why I am not able to draw a plot. First I imported all of the libraries I have been using in the program:
import pandas as pd
import statsmodels.formula.api as sm
import numpy as np
import seaborn as sns
import scipy as stats
import matplotlib.pyplot as plt
And when I run the following code:
sns.distplot(financials.residual,kde=False,fit=stats.norm)
I get the following error:
AttributeError: module 'scipy' has no attribute 'norm'
I believe it might be because I am not importing the correct module from scipy, but I can't find the way to get it right.
Thanks for your help
norm is in scipy.stats, not in scipy.
Importing "scipy as stats" simply imports scipy and renames it to stats; it doesn't import the stats submodule inside scipy.
Do
from scipy.stats import norm
as in the example on the official website
or
from scipy import stats
stats.norm(...)
Note:
When "importing something as somethingelse", be careful not to shadow other names and, if possible, follow conventions (like import numpy as np).
For scipy, as explained in this answer, the convention is to never "import scipy as ..." since all the interesting functions in scipy are actually located in the submodules, which are not automatically imported.
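To see the difference without needing scipy installed, here is a minimal sketch that builds a throwaway package on disk. The names demo_sci and demo_sci/stats.py are made up stand-ins for scipy and scipy.stats (the real scipy layout is more involved); the point is only that aliasing a package does not load its submodules.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: demo_sci/__init__.py (empty) and demo_sci/stats.py
    package_dir = Path(tmp, "demo_sci")
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")
    (package_dir / "stats.py").write_text("norm = 'normal distribution'\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # Aliasing the package only renames it; norm is not an attribute of it.
        import demo_sci as stats
        alias_has_norm = hasattr(stats, "norm")

        # Importing the submodule itself gives access to norm.
        from demo_sci import stats as real_stats
        norm_value = real_stats.norm
    finally:
        sys.path.remove(tmp)

print(alias_has_norm)  # False
print(norm_value)      # normal distribution
```

The first import mirrors "import scipy as stats" and the second mirrors "from scipy import stats".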
Do not import scipy as stats. scipy already has a submodule called stats. By giving scipy itself the name stats, you make stats refer to the top-level package rather than to that submodule. Then stats.norm essentially becomes scipy.norm, which is not what you want.
First of all, let's see how to make proper use of import:
1) Importing a library:
The first method is to import the library directly and access the names under it with a '.'. This is illustrated here:
import pandas
import numpy
pandas.read_csv('helloworld.csv')
pandas is a library and read_csv is a function inside it.
2) Importing a function directly
The second method is to import the function directly as follows:
from pandas import read_csv
read_csv('helloworld.csv')
Use this method when you know that you'll only be using the read_csv function of pandas and nothing else from it.
3) Importing every public name from a library
from pandas import *
This imports all the public names of the pandas library so that you can use them directly.
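As a small runnable illustration of the first two forms, here is a sketch that uses the standard library's json module as a stand-in for pandas (an assumption made only so the snippet runs without third-party packages). It also shows that the imported name is a function living inside a module.

```python
import inspect
import json             # form 1: import the library, then use json.loads
from json import loads  # form 2: import one name directly

text = '{"a": 1}'
parsed_via_module = json.loads(text)
parsed_directly = loads(text)

# json is a module; loads is a function defined inside it.
print(inspect.ismodule(json))     # True
print(inspect.isfunction(loads))  # True
print(json.loads is loads)        # True
```

Both spellings call the very same function object; only the name you type differs.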
Now coming to your question:
norm lives inside the stats submodule of scipy. When you use as, you are only giving the library or module an alias inside your code. So try one of these instead:
from scipy.stats import norm
norm(...)
or
from scipy import stats
stats.norm(...)
Hope this helps!
Related
I'm pretty new to Python and I am trying to understand how import works. I have a doubt when it comes to read_csv.
We generally use the following lines when we call the read_csv function.
import pandas as pd
...
...
file=pd.read_csv(Filename)
The read_csv function is defined in the module pandas.io.parsers. Why don't we mention the entire path before accessing read_csv? I mean, why not this:
import pandas.io.parsers as pd
...
...
file=pd.read_csv(Filename)
If we can access a function without giving the entire path, why do we use
import matplotlib.pyplot as plt
...
...
plt.show()
when we can just write
import matplotlib as plt
...
...
plt.show()
What I mean to ask is: are the imports used in Python code just conventions (can the .pyplot in matplotlib.pyplot be omitted?) or are there specific rules? Do we use the entire location when there's a chance of a clash with other functions of the same name? Do modules in a package contain non-unique function names then?
This depends on whether the package chooses to expose some of the "deeper" functions to the package namespace. This is done through a file called __init__.py. This gives flexibility, because it allows the developer to keep chunks of the source code organized in multiple folders, but easily accessible by the user importing the package as a whole.
This also means that the "rules" for importing - whether you import matplotlib or matplotlib.pyplot - depend on the package's maintainers. matplotlib is a package, because it has an __init__.py file in its folder in the source code; matplotlib.pyplot is a submodule of it that this __init__.py does not import automatically, which is why it has to be imported explicitly. What exactly you get by writing import matplotlib or import matplotlib.pyplot depends on the contents of these files.
For example, you can see in the pandas source code that there is a file called __init__.py at the source root, meaning it will import some names whenever you import pandas in your code. It includes the import for read_csv at around line 150:
from pandas.io.api import (
    # ...
    read_csv,
    # ...
)
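The re-export mechanism can be tried out on a toy package. The following is only a sketch: the package name demo_frames, its io/parsers.py module and the read_table function are all hypothetical, chosen to mimic the pandas layout described above rather than to reproduce it.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: demo_frames/__init__.py and demo_frames/io/parsers.py
    io_dir = Path(tmp, "demo_frames", "io")
    io_dir.mkdir(parents=True)
    (io_dir / "__init__.py").write_text("")
    (io_dir / "parsers.py").write_text(
        "def read_table(text):\n"
        "    return [line.split(',') for line in text.splitlines()]\n"
    )
    # The top-level __init__.py re-exports the deeply nested function.
    Path(tmp, "demo_frames", "__init__.py").write_text(
        "from demo_frames.io.parsers import read_table\n"
    )

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import demo_frames as df_lib
        from demo_frames.io import parsers
    finally:
        sys.path.remove(tmp)

# Both paths lead to the same function object.
same_function = df_lib.read_table is parsers.read_table
rows = df_lib.read_table("a,b\n1,2")
print(same_function)  # True
print(rows)           # [['a', 'b'], ['1', '2']]
```

Without the line in the top-level __init__.py, df_lib.read_table would not exist and the full path would be required.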
Most of my jupyter notebooks usually begin with a long list of common packages to import. E.g.,
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
...
Is there any way I can call all of the same packages using a function defined in another python file?
For example, I tried putting the following in util.py:
def import_usual_packages():
    import pandas as pd
    import numpy as np
    import seaborn as sns
    import matplotlib.pyplot as plt
And in the main notebook:
import util
util.import_usual_packages()
so that the end effect is I can still call the usual packages without using additional namespaces, e.g.
pd.DataFrame()
At the moment this wouldn't work with what I have above. E.g. pd isn't defined.
EDIT: It's similar but not exactly the same as other questions asking about how to do a from util import *. I'm trying to put the import statements inside functions of the utility python file, not simply at the top of the file. I'm aware that I can simply put e.g. import pandas as pd at the top of the util file and then run from util import *, but that is not what I'm looking for. Putting it within functions gives me additional control. For example, I can have 2 different functions called import_usual_packages() and import_plotting_packages() within the same util file that calls different groups of packages, so that in my notebook I can simply call
import_usual_packages()
import_plotting_packages()
instead of having 10+ lines calling the same stuff every time. This is purely for personal use, so I don't care if other people don't understand what's going on (in fact, that might be a good thing in some circumstances).
With some slight modifications your method can work. In util.py:
def import_usual_packages():
    global pd, np  # Make the names pd & np global within util
    import pandas as pd
    import numpy as np
And in the main notebook:
import util
util.import_usual_packages()
util.pd.DataFrame()  # access via the util namespace
This is definitely not the cleanest approach though.
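Another option, sketched here on the assumption that you are willing to pass the notebook's namespace in explicitly, is to have the helper write the aliases into a dictionary such as globals(). The USUAL_PACKAGES table is hypothetical, and standard-library modules stand in for pandas and numpy so that the sketch runs anywhere.

```python
import importlib

# Hypothetical alias table: alias -> module name (stdlib stand-ins for pandas/numpy).
USUAL_PACKAGES = {"js": "json", "it": "itertools"}


def import_usual_packages(namespace):
    """Import each module and bind it under its alias in the given namespace."""
    for alias, module_name in USUAL_PACKAGES.items():
        namespace[alias] = importlib.import_module(module_name)


# In a notebook, globals() is the notebook's own namespace.
import_usual_packages(globals())

print(js.dumps([1, 2]))              # [1, 2]
print(list(it.islice("abcdef", 2)))  # ['a', 'b']
```

This gives the "no extra namespace" effect asked for, at the cost of hiding where the names come from.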
Over time I have built up a collection of utility functions for various things.
I would like to put them all in a package, with a bit more structure than just a single file containing all the functions.
Some of these functions are written assuming certain packages have been imported e.g. I have several numpy and pandas utility functions that assume something like import numpy as np
Obviously I will not use this hypothetical package like from <pkg> import * but I do not want to hinder performance either.
So if I have a numpy utility function, should I add this to every function
# mypkg.np.utils
import sys
def np_util_fn(...):
    if 'np' not in sys.modules: import numpy as np
    # rest of func
or
# mypkg.np.utils
import sys
if 'np' not in sys.modules: import numpy as np
def np_util_fn(...):
    # rest of func
which is more performant if I use a different part of this package? e.g. from pkg.other.utils import fn
Ok, let's analyze your issue. Assume you have a file module.py:
print("Module got imported")
and a file test.py with:
import module
import module
If you now execute test.py, you will get
Module got imported
Please note that this line is not printed twice. This means that Python already checks whether a module has been imported before importing it again. So your check if 'np' not in sys.modules: import numpy as np is not needed; it only delays things, as it results in a double check.
In case you want to re-execute a module, you need importlib.reload(module) (in Python 3, reload lives in importlib). So if you have
import importlib
import module
import module
importlib.reload(module)
in code.py, you will see the line Module got imported two times.
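The behaviour can be checked in a single script. This is a sketch under a few assumptions: the module is written to a temporary directory under the made-up name demo_module, and its output is captured so that the number of executions can be counted.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical module whose body prints a line each time it is executed.
    Path(tmp, "demo_module.py").write_text('print("Module got imported")\n')
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()

    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            import demo_module            # executes the module body
            import demo_module            # cached: nothing is executed
            importlib.reload(demo_module)  # executes the module body again
    finally:
        sys.path.remove(tmp)

output_lines = buffer.getvalue().splitlines()
print(len(output_lines))  # 2
```

Two lines are captured: one from the first import and one from the reload.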
This means that
import numpy as np
is sufficient. There is no need to check whether it already got imported via:
if 'np' not in sys.modules: import numpy as np
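There is a further reason to drop that check: sys.modules is keyed by the real module name, never by the alias, so 'np' would not be found there even after numpy has been loaded. A minimal sketch, using the standard library's json as a stand-in for numpy and the made-up alias jsn:

```python
import sys
import json as jsn  # stand-in for "import numpy as np"

name_is_cached = "json" in sys.modules
alias_is_cached = "jsn" in sys.modules

# A repeated import is a cache lookup that returns the same module object.
import json as jsn_again

print(name_is_cached)    # True
print(alias_is_cached)   # False
print(jsn_again is jsn)  # True
```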
Whether it is better to do import numpy as np at the very beginning of your script or inside a function depends on how the function is used. If the function is executed many times, it is better to import once at the very beginning; otherwise every call repeats the import statement and its lookup of the already loaded module. In contrast, if the function is not called too often, or is not necessarily executed at all (e.g. because it depends on user input), then importing the module only inside the function may be advantageous from the point of view of speed.
I normally don't use any import statements in functions as I always have the feeling that they blow up the function body and thus reduce readability.
Google's style guide says, about imports, that modules might be aliased with import xyz as x when x is a common abbreviation for xyz.
What are the standard abbreviations for the most common modules?
I'm looking for a list that is as exhaustive as possible, including modules from the standard library as well as third-party niche packages that are frequently used in their respective fields.
For instance, numpy is always imported as np, and tkinter, when hopefully not imported with from module import *, is generally imported as tk.
Here are the names I see most of the time for the modules I frequently use.
This list is not meant to become an absolute reference, but I hope it will help provide some guidelines.
Please feel free to complete it, or to change whatever you think needs to be changed.
The import statements follow the conventions established by Google's Python style guide, namely:
Use import x for importing packages and modules.
Use from x import y where x is the package prefix and y is the module name with no prefix.
Use from x import y as z if two modules named y are to be imported or if y is an inconveniently long name.
Use import y as z only when z is a standard abbreviation (e.g., np for numpy).
MODULE             ALIAS  IMPORT STATEMENT
datetime           dt     import datetime as dt
matplotlib.pyplot  plt    from matplotlib import pyplot as plt
multiprocessing    mp     import multiprocessing as mp
numpy              np     import numpy as np
pandas             pd     import pandas as pd
seaborn            sns    import seaborn as sns
tensorflow         tf     import tensorflow as tf
tkinter            tk     import tkinter as tk
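As a short usage example for one row of the table, here is how the dt alias reads in practice. The dates are arbitrary; the only point is that the alias keeps the module (dt) visibly separate from the class of the same name (dt.datetime).

```python
import datetime as dt

# With the alias, the module (dt) and the class (dt.datetime) stay distinct.
start = dt.datetime(2024, 1, 31, 9, 30)
end = start + dt.timedelta(days=1, hours=2)

print(end.isoformat())  # 2024-02-01T11:30:00
```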
I would like to create a function that contains all my import statements:
def imports():
    import pandas as pd
    import numpy as np
    # etc.
save it in a .py file as a module and call that function from my Jupyter Notebook.
This is simply to unclutter the Notebook. However, it seems it doesn't work to create a function containing import statements (I'm receiving the error NameError: name 'pd' is not defined). Does anyone know why?
Instead, put into your module all the import statements you want and as you'd normally put them
contents_of_your_module.py
import pandas as pd
import numpy as np
import itertools
import seaborn as sns
Then you import from Jupyter
from contents_of_your_module import *
Or, you can create a namespace for your module and do this
import contents_of_your_module as radar
And then you can access all the modules via your name space
radar.pd.DataFrame
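Here is a runnable sketch of the namespace variant. It assumes a throwaway module named demo_common_imports written to a temporary directory, with standard-library modules standing in for pandas and seaborn so that nothing needs to be installed.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical module holding the shared imports (stdlib stand-ins).
    Path(tmp, "demo_common_imports.py").write_text(
        "import json as js\n"
        "import itertools\n"
    )
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import demo_common_imports as radar
    finally:
        sys.path.remove(tmp)

# The aliases defined in the module are attributes of the namespace.
loaded = radar.js.loads("[1, 2, 3]")
chained = list(radar.itertools.chain([1], [2, 3]))
print(loaded)   # [1, 2, 3]
print(chained)  # [1, 2, 3]
```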
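import pandas as pd creates a variable pd. In your case, the variable is local to the function and not visible outside. Declare the shortcuts to the imported modules as global variables. Note that global makes them global to the module in which imports() is defined, so if that function lives in a separate .py file the names end up in that module, not in the notebook:
def imports():
    global pd, np
    import pandas as pd
    import numpy as np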
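The scope of that global declaration can be checked with a small sketch. A module object is built in memory under the made-up name demo_helper to play the role of the separate .py file, and the standard library's json (aliased as jmod) stands in for pandas.

```python
import types

# Build a stand-in for a separate helper module without touching the disk.
helper = types.ModuleType("demo_helper")
source = (
    "def imports():\n"
    "    global jmod\n"
    "    import json as jmod\n"
)
exec(source, helper.__dict__)

helper.imports()

alias_in_helper = hasattr(helper, "jmod")
alias_in_caller = "jmod" in globals()
print(alias_in_helper)  # True: the alias is global in the helper module
print(alias_in_caller)  # False: the caller's namespace is unchanged
```

So after calling the function from a notebook, the alias would be reachable as helper.jmod, not as a bare jmod.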
In general, I would say that it's a very bad practice to do what you want to do. If you want to do exactly what you describe you can't do it in a pythonic way.
But this is a solution that probably would work:
Pretend you have a module called imports.py, that should handle all your imports:
File: imports.py:
import numpy as np
import pandas as pd
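Then you can pull everything in at the top level of your program. Python 3 only allows import * at module level, so it cannot be wrapped in a function such as do_funky_import_from():
from imports import *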
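To check how Python 3 treats a star import placed inside a function, the function can be compiled from a string without running it. This is only a sketch; the module name imports is taken from the answer above and is never actually loaded.

```python
source = (
    "def do_funky_import_from():\n"
    "    from imports import *\n"
)

try:
    compile(source, "<demo>", "exec")
    star_import_allowed = True
except SyntaxError as error:
    star_import_allowed = False
    message = error.msg  # explanation given by the compiler

print(star_import_allowed)  # False
```

The compiler rejects the function body, which is why the star import has to stay at module level.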