Import statement affects script performance - python

I was reviewing code and updating the import statements based on general guidelines, changing “from xxx import *” to “from xxx import m, n, p”. The difference in script execution time, however, was noticeable:
From collections import OrderedDict
From definitions import *
Average script time 22ms
From collections import OrderedDict
From definitions import a, b, c, d, e, f
Average script time 48ms
The topic of import performance has been taken up several times on SE, and these results seem to run counter to some answers. Why would the import statement in this case cause such a significant difference in the script performance?
A follow-up question: This package uses a "definitions.py" to store the package general purpose (mostly static) classes and functions. What is the best way to import all classes from a module, without needing to prefix with "definitions." every time they are used?
EDIT More information... Curiouser and Curiouser
Script timing is done using time.clock() over >50 iterations
It turns out that OrderedDict is already imported in definitions, so when I import it from there then the script time is back down:
From definitions import a, b, c, d, e, f, OrderedDict
Average script time 22ms
Just to thicken the plot, there was also a "from System import Array" statement though this has no effect on the script time. The way that the script imports OrderedDict appears to be the issue.

Related

Performance impact in having a single import per line

does anybody know if there is a performance difference between having all imports from one module in a single line vs one per line.
For example, having:
from a import A, B, C, D, E, F, G
instead of:
from a import A
from a import B
from a import C
from a import D
from a import E
from a import F
from a import G
I'm trying to convince my team to use reorder-python-imports in our pre-commit hooks and this doubt is the only obstacle that prevents me from adding it.
Combining imports is technically faster but there should not be a noticeable performance difference in real world usage. Python modules execute only once, on their first time being imported. After being initialized, you only incur the negligible cost of the additional import statements themselves.
As long as you only do top-level imports, your imports will only ever execute during startup anyway. By combining them you might at best manage to shave off a negligible amount of milliseconds during startup. Read the Python docs on the import mechanism.
Here is my machine's performance after a million repetitions each:
test1.py
from timeit import timeit
print(timeit("""
from socket import socket
from socket import create_connection
from socket import has_dualstack_ipv6
from socket import getaddrinfo
from socket import gethostbyaddr
"""))
# Prints 2.8450163
test2.py
from timeit import timeit
print(timeit("""
from socket import socket, create_connection, has_dualstack_ipv6, getaddrinfo, gethostbyaddr
"""))
# Prints 0.6992155

Import statements in modules in Python

Let's say I have a module that defines a function:
# import time
def printTime():
import time
print 'the time is ' + time.time()
I my main app, I load the function like this:
from modules.Help import printTime
if somecondition:
# from modules.Help import printTime
printTime()
In modules, should I do all imports at the top of the file?
Or, since the module is only required under certain conditions, should I do the import in the function?
What is more economical, or is there no difference?
In PHP, there is the require_once. Are the import statements at the top of submodules optimised in the same way?
I have some duplicate imports (e.g. time) in both the parent and submodule. Should I worry about the number of packages that I import in submodules?

Python : 'import module' vs 'import module as'

Is there any differences among the following two statements?
import os
import os as os
If so, which one is more preferred?
It is just used for simplification,like say
import random
print random.randint(1,100)
is same as:
import random as r
print r.randint(1,100)
So you can use r instead of random everytime.
The below syntax will help you in understanding the usage of using "as" keyword while importing modules
import NAMES as RENAME from MODULE searching HOW
Using this helps developer to make use of user specific name for imported modules.
Example:
import random
print random.randint(1,100)
Now I would like to introduce user specific module name for random module thus
I can rewrite the above code as
import random as myrand
print myrand.randint(1,100)
Now coming to your question; Which one is preferred?
The answer is your choice; There will be no performance impact on using "as" as part of importing modules.
Is there any differences among the following two statements?
No.
If so, which one is more preferred?
The first one (import os), because the second one does the exact same thing but is longer and repeats itself for no reason.
If you want to use name f for imported module foo, use
import foo as f
# other examples
import numpy as np
import pandas as pd
In your case, use import os
The import .... as syntax was designed to limit errors.
This syntax allows us to give a name of our choice to the package or module we are importing—theoretically this could lead to name clashes, but in practice the as syntax is used to avoid them.
Renaming is particularly useful when experimenting with different implementations of a module.
Example: if we had two modules ModA and ModB that had the same API we could write import ModA as MyMod in a program, and later on switch to using import MoB as MyMod.
In answering your question, there is no preferred syntax. It is all up to you to decide.

import module_name Vs __import__('module_name')

I am writing a python module and I am using many imports of other different modules.
I am bit confused that whether I should import all the necessary dependent modules in the opening of the file or shall I do it when necessary.
I also wanted to know the implications of both.
I come from C++ back ground so I am really thrilled with this feature and does not see any reason of not using __import__(), importing the modules only when needed inside my function.
Kindly throw some light on this.
To write less code, import a module at the first lines of the script, e.g.:
#File1.py
import os
#use os somewhere:
os.path.chdir(some_dir)
...
...
#use os somewhere else, you don't need to "import os" everywhere
os.environ.update(some_dict)
While sometimes you may need to import a module locally (e.g., in a function):
abc=3
def foo():
from some_module import abc #import inside foo avoids you from naming conflicts
abc(...) #call the function, nothing to do with the variable "abc" outside "foo"
Don't worry about the time consumption when calling foo() multiple times, since import statements loads modules/functions only one time. Once a module/function is imported, the object is stored in dictionary sys.modules, which is a lookup table for speedup when running the same import statement.
As #bruno desthuilliers mentioned, importing insede functions may not be that pythonic, it violates PEP8, here's a discussion I found, you should stick to importing at the top of the file most of the time.
First, __import__ isn't usually needed anywhere. It's main purpose is to support dynamic importing of things that you don't know ahead of time (think plug-ins). You can easily use the import statement inside your function:
import sys
def foo():
import this
if __name__ == "__main__":
print sys.version_info
foo()
The main advantage to importing everything up-front is that it is most customary. That's where people reading your code will go to see if something is imported or not. Also, you don't need to write import os in every function that uses os. The main downsides of this approach are that:
you can get yourself into unresolvable import loops (A imports B which imports A)
that you pull everything into memory even if you aren't going to use it.
The second problem isn't typically an issue -- very rarely do you notice the performance or memory impact of an import.
If you run into the first problem, it's likely a symptom of poorly grouped code and the common stuff should be factored into a new module C which both A and B can use.
Firstly, it's a violation of PEP8 using imports inside functions.
Calling import it's an expensive call EVEN if the module is already loaded, so if your function is gonna being called many times this will not compensate the performance gain.
Also when you call "import test" python do this:
dataFile = __ import__('test')
The only downside of imports at the top of file it's the namespace that get polluted very fast depending on complexity of the file, but if your file it's too complex it's a signal of bad design.

Python Module Import: Single-line vs Multi-line

When importing modules in Python, what is the difference between this:
from module import a, b, c, d
and this
from module import a
from module import b
from module import c
from module import d
To me it makes sense always to condense code and use the first example, but I've been seeing some code samples out there dong the second. Is there any difference at all or is it all in the preference of the programmer?
There is no difference at all. They both function exactly the same.
However, from a stylistic perspective, one might be more preferable than the other. And on that note, the PEP-8 for imports says that you should compress from module import name1, name2 onto a single line and leave import module1 on multiple lines:
Yes: import os
import sys
No: import sys, os
Ok: from subprocess import Popen, PIPE
In response to #teewuane's comment (repeated here in case the comment gets deleted):
#inspectorG4dget What if you have to import several functions from one
module and it ends up making that line longer than 80 char? I know
that the 80 char thing is "when it makes the code more readable" but I
am still wondering if there is a more tidy way to do this. And I don't
want to do from foo import * even though I am basically importing
everything.
The issue here is that doing something like the following could exceed the 80 char limit:
from module import func1, func2, func3, func4, func5
To this, I have two responses (I don't see PEP8 being overly clear about this):
Break it up into two imports:
from module import func1, func2, func3
from module import func4, func5
Doing this has the disadvantage that if module is removed from the codebase or otherwise refactored, then both import lines will need to be deleted. This could prove to be painful
Split the line:
To mitigate the above concern, it may be wiser to do
from module import func1, func2, func3, \
func4, func5
This would result in an error if the second line is not deleted along with the first, while still maintaining the singular import statement
To add to some of the questions raised from inspectorG4dget's answer, you can also use tuples to do multi-line imports when folder structures start getting deeply nested or you have modules with obtuse names.
from some.module.submodule.that_has_long_names import (
first_item,
second_item,
more_imported_items_with_really_enormously_long_names_that_might_be_too_descriptive,
that_would_certainly_not_fit,
on_one_line,
)
This also works, though I'm not a fan of this style:
from module import (a_ton, of, modules, that_seem, to_keep, needing,
to_be, added, to_the_list, of_required_items)
I would suggest not to follow PEP-8 blindly. When you have about half screen worth of imports, things start becoming uncomfortable and PEP-8 is then in conflicts with PEP-20 readability guidelines.
My preference is,
Put all built-in imports on one line such as sys, os, time etc.
For other imports, use one line per package (not module)
Above gives you good balance because the reader can still quickly glance the dependencies while achieving reasonable compactness.
For example,
My Preference
# one line per package
import os, json, time, sys, math
import numpy as np
import torch, torch.nn as nn, torch.autograd, torch.nn.functional as F
from torchvision models, transforms
PEP-8 Recommandation
# one line per module or from ... import statement
import os
import json
import time
import sys
import math
import numpy as np
import torch
from torch import nn as nn, autograd, nn.functional as F
from torchvision import models, transforms
A concern not mentioned by other answers is git merge conflicts.
Let's say you start with this import statement:
import os
If you change this line to import os, sys in one branch and import json, os in another branch, you will get this conflict when you attempt to merge them:
<<<<<<< HEAD
import os, sys
=======
import json, os
>>>>>>> branch
But if you add import sys and import json on separate lines, you get a nice merge commit with no conflicts:
--- a/foo.py
+++ b/foo.py
### -1,2 -1,2 +1,3 ###
+ import json
import os
+import sys
You will still get a conflict if the two imports were added at the same location, as git doesn't know which order they should appear in. So if you had imported time instead of json, for example:
import os
<<<<<<< HEAD
import sys
=======
import time
>>>>>>> branch
Still, it can be worth sticking with this style for the occasions where it does avoid merge conflicts.
Imports should usually be on separate lines as per PEP 8 guidelines.
# Wrong Use
import os, sys
# Correct Use
import os
import sys
For more import based PEP 8 violations and fixes please check this out https://ayush-raj-blogs.hashnode.dev/making-clean-pr-for-open-source-contributors-pep-8-style.
Both are same.
Use from module import a, b, c, d.
If you want to import only one part of a module, use:
from module import a
If u want to import multiple codes from same module, use:
from module import a,b,c,d
No need to write all in separate lines when both are same.

Categories

Resources