What is happening in m5/objects/__init__.py file gem5


I am new to the gem5 simulator. I was reading the documentation (http://www.m5sim.org/Configuration_/_Simulation_Scripts) to understand how everything is implemented. When it describes the Python classes, it says the following:
gem5 provides a collection of Python object classes that correspond to its C++ simulation object classes. These Python classes are defined in a Python module called "m5.objects". The Python class definitions for these objects can be found in .py files in src, typically in the same directory as their C++ definitions.
To make the Python classes visible, the configuration file must first import the class definitions from the m5 module
In the m5/objects directory there is only one file "__init__.py". This is the code:
from __future__ import print_function
from __future__ import absolute_import
from m5.internal import params
from m5.SimObject import *
try:
    modules = __loader__.modules
except NameError:
    modules = { }
for module in modules.keys():
    if module.startswith('m5.objects.'):
        exec("from %s import *" % module)
Normally I don't program with Python so perhaps that is the problem, but I haven't fully understood what is going on here. In this other post Python's __loader__, what is it? they speak about what loader means but I feel I am missing something. Any help would be appreciated. Thanks in advance.

The __loader__
Consider the following code:
import sys
class FooImporter:
    def find_module(self, module_name, package_path):
        return self if module_name == 'foo' else None
    def load_module(self, module_name):
        print('FooImporter is working.')
        sys.modules[module_name] = __import__('sys')
# This activates the importer
sys.meta_path.append(FooImporter())
# This should trigger our importer to import 'foo'
import foo
# Show what we've just got
print(foo)
Provided you do not have a Python module named foo on your PYTHONPATH, this prints:
FooImporter is working.
<module 'sys' (built-in)>
Python import hooks (PEP 302) let us customize the behavior of import. In the example above, FooImporter claims the module foo and handles loading it. Note that this importer creates the module foo as an alias of sys. A complete importer, unlike the simplified one shown here, is also responsible for setting the __loader__ attribute of the imported module to the importer itself.
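The find_module/load_module pair above is the legacy PEP 302 protocol, which, as far as I know, recent Python 3 releases (3.12 and later) no longer honor on sys.meta_path. Below is a minimal sketch of the same idea using the newer find_spec/exec_module interface. The module name foo_demo and the greeting attribute are made up for illustration; this is not taken from gem5.

```python
import importlib.abc
import importlib.util
import sys


class FooImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Finder and loader in one object, like the simplified importer above."""

    def find_spec(self, fullname, path, target=None):
        # Claim only the hypothetical module name 'foo_demo'.
        if fullname == "foo_demo":
            return importlib.util.spec_from_loader(fullname, self)
        return None

    def create_module(self, spec):
        # Returning None asks the import system for a default empty module.
        return None

    def exec_module(self, module):
        # Populate the new module; a real loader would execute source code here.
        module.greeting = "loaded by FooImporter"


# Appending keeps the standard finders first, so a real foo_demo on the
# path (if one existed) would still win.
sys.meta_path.append(FooImporter())

import foo_demo

print(foo_demo.greeting)
print(foo_demo.__spec__.loader)
```

Here the import system itself records the loader on the module's spec, which is the modern counterpart of the __loader__ bookkeeping described above.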
Gem5's import hook
Back to your question: gem5 uses the same mechanism to load SimObjects, as part of its modular design. The importer class is CodeImporter, defined in src/python/importer.py.
When the module m5.objects is imported, for example with
from m5.objects import Root
the CodeImporter handles the import and sets the __loader__ attribute of the imported module (in this case m5.objects). If you print __loader__ in m5/objects/__init__.py, you'll get something like:
<importer.CodeImporter object at 0x7f4f58941d60>
__loader__.modules is a dictionary of the SimObject modules that gem5 maintains; each item is added by an addModule() call from src/sim/init.cc.
Whenever a SimObject's C++ counterpart calls the constructor of EmbeddedPython, it is added to a list so that gem5's initialization remembers to register it with the CodeImporter instance. For example, you should be able to find a Root.py.cc file in the build folder that registers the Root object. The loop at the end of m5/objects/__init__.py simply imports every SimObject module known through this mechanism.
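To make the mechanism concrete, here is a small self-contained sketch of an importer that serves modules from an in-memory dictionary, with a package whose __init__ code loops over __loader__.modules just like the gem5 file in the question. It is only an illustration of the pattern: DictImporter, add_module, and the demo_objects package are hypothetical names, and gem5's real CodeImporter differs in its details.

```python
import importlib.abc
import importlib.util
import sys


class DictImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Serves modules from a dict mapping module name -> source text."""

    def __init__(self):
        self.modules = {}

    def add_module(self, fullname, source):
        # Hypothetical stand-in for the registration step described above.
        self.modules[fullname] = source

    def find_spec(self, fullname, path, target=None):
        if fullname not in self.modules:
            return None
        # Treat a name as a package if any registered module lives below it.
        is_pkg = any(n.startswith(fullname + ".") for n in self.modules)
        return importlib.util.spec_from_loader(fullname, self, is_package=is_pkg)

    def create_module(self, spec):
        return None  # use the default module object

    def exec_module(self, module):
        exec(self.modules[module.__name__], module.__dict__)


importer = DictImporter()
# The package body mirrors the loop in m5/objects/__init__.py.
importer.add_module(
    "demo_objects",
    "for _name in __loader__.modules:\n"
    "    if _name.startswith('demo_objects.'):\n"
    "        exec('from %s import *' % _name)\n",
)
importer.add_module("demo_objects.Root", "class Root:\n    pass\n")
sys.meta_path.append(importer)

from demo_objects import Root

print(Root)
```

Because the package body star-imports every registered submodule, the class Root ends up directly in the demo_objects namespace, which is why a single from ... import line is enough for the caller.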
I think this should be sufficient to give someone a picture of the underlying magic and (hopefully) resolve their curiosity.

Related

Importing modules in python (3 modules)

Let's say I have three modules in the same directory (module1, module2, module3).
Suppose module2 imports module3. If I then import module2 in module1, does that automatically import module3 into module1?
Thanks

No. An import only binds names inside the module that executes it. You can verify that with a small test.
For example:
# module1
import module2
# module2
import module3
# in module1
module3.foo() # oops: NameError, the name module3 is not bound in module1
This is reasonable if you think about the reverse: if imports caused a chain of importing, it would be hard to tell which function came from which module, and naming conflicts would become complex.

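No, the name module3 will not be available in module1 unless you explicitly tell Python to bring it in, like so:
from module2 import *
This copies every public name from module2 into module1, including the name module3 bound by its import statement (assuming module2 does not define __all__).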

What importing does conceptually is outlined below.
import some_module
The statement above is equivalent to:
module_variable = import_module("some_module")
All we have done so far is bind some object to a variable name.
When it comes to the implementation of import_module it is also not that hard to grasp.
def import_module(module_name):
    if module_name in sys.modules:
        module = sys.modules[module_name]
    else:
        filename = find_file_for_module(module_name)
        python_code = open(filename).read()
        module = create_module_from_code(python_code)
        sys.modules[module_name] = module
    return module
First, we check whether the module has been imported before. If it has, it is available in the global table of all modules (sys.modules) and is simply reused. If it is not available, we create it from the code. Once the function returns, the module is assigned to the variable name you have chosen. As you can see, the process is not inefficient or wasteful: all you are doing is creating an alias for your module. In most cases transparency is preferred, so a quick look at the top of a file tells you what resources are available to you. Otherwise, you might end up wondering where a given resource comes from. That is why modules are not implicitly "imported" for you.
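The conceptual function above can be turned into something runnable. The sketch below is a deliberately simplified stand-in for the real import machinery: the search_dir parameter and the toy_mod module are invented for the demonstration, and details such as packages, bytecode caches, and finders are ignored.

```python
import os
import sys
import tempfile
import types


def import_module(module_name, search_dir):
    """Toy import: reuse the cached module, or build it from source once."""
    if module_name in sys.modules:
        return sys.modules[module_name]
    filename = os.path.join(search_dir, module_name + ".py")
    with open(filename) as f:
        source = f.read()
    module = types.ModuleType(module_name)
    module.__file__ = filename
    # Register before executing, so the cache entry exists during execution.
    sys.modules[module_name] = module
    exec(compile(source, filename, "exec"), module.__dict__)
    return module


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "toy_mod.py"), "w") as f:
        f.write("print('top-level code runs')\nvalue = 1\n")
    first = import_module("toy_mod", tmp)   # executes the top-level code
    second = import_module("toy_mod", tmp)  # served from sys.modules

print(first is second)
```

The top-level print in toy_mod runs only on the first call; the second call returns the very same module object from the cache.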
Resource:
Python doc on importing

how to import my own package from ipython

I have my own repository created in BitBucket.
In that repository, I have a file named core.py and an __init__.py file
I tried to import the core module, and I fixed all the requirements that were needed.
Now I am finally able to import the module, which is just one big class, using ipython, but when I make the call:
obj = MyClass()
I get an error:
name 'MyClass()' is not defined
even though it seems the module was imported.
Let me know if more information is needed.

As you stated in your comment, you are importing core.py:
from mintigocloudstorage import core
That means, you also have to tell your script where to find your class:
obj = core.MyClass()
If the import was successful, as you say, Python should now be able to locate your class definition.
Alternatively you can also import your class:
from mintigocloudstorage.core import MyClass
obj = MyClass()

Python Importing - explanation

Similar Question: Understanding A Chain of Imports in Python
NB: I'm using Python 3.3
I have set up the following two files in the same directory to explain importing to myself, but I still don't understand exactly what it is doing. I understand that function and class definitions are statements that need to run.
untitled.py:
import string
class testing:
    def func(self):
        try:
            print(string.ascii_lowercase)
        except:
            print('not imported')
class second:
    x=1
print('print statement in untitled executed')
stuff.py:
from untitled import testing
try:
    t=testing()
    t.func()
except NameError:
    print('testing not imported')
try:
    print(string.ascii_uppercase)
except NameError:
    print('string not imported')
try:
    print(untitled.string.ascii_uppercase)
except NameError:
    print('string not imported in untitled')
try:
    s=second()
    print(s.x)
except NameError:
    print('second not imported')
This is the output I get from running stuff.py:
print statement in untitled executed
abcdefghijklmnopqrstuvwxyz
string not imported
string not imported in untitled
second not imported
The print statement in untitled.py is executed even though the import in stuff.py names only the testing class. Moreover, what is the string module's relation to stuff.py, given that it can be used from within the testing class but not from outside?
Could somebody please explain this behaviour to me: what exactly does a "from import" statement do (what does it run)?

You can think of python modules as namespaces. Keep in mind that imports are not includes:
modules are only imported once
the first time, the top level code is executed
any imports, variable, function or class declarations affect only the module's local namespace
Suppose you have a module called foo.py:
import eggs
bar = "Let's drink, it's a bar"
So when you do from foo import bar in another module, you make bar available in the current namespace. The module eggs will be available as foo.eggs if you do import foo. If you do from foo import *, then eggs, bar and everything else in the module namespace will also be in the current namespace, but never do that: wildcard imports are frowned upon in Python.
If you do import foo and then import eggs, the top-level code of eggs is executed only once and the module namespace is stored in the module cache: if another module imports it, the information is pulled from this cache. If you are going to use a module, import it; there is no need to worry about multiple imports executing the top-level code multiple times.
Python programmers are very fond of namespaces; I always try to use import foo and then foo.bar instead of from foo import bar if possible. It keeps the namespace clean and prevents name clashes.
That said, the import mechanism is hackable: you can make the Python import statement work even with files that are not Python.

The from statement is no different from import with regard to loading behaviour. The top-level code is always executed when the module is loaded. from just controls which parts of the loaded module are added to the current scope (the first point is the most important):
The from form uses a slightly more complex process:
find the module specified in the from clause loading and initializing it if necessary;
for each of the identifiers specified in the import clauses:
check if the imported module has an attribute by that name
if not, attempt to import a submodule with that name and then check the imported module again for that attribute
if the attribute is not found, ImportError is raised.
otherwise, a reference to that value is bound in the local namespace, using the name in the as clause if it is present, otherwise using the attribute name
Thus you can access the contents of a module partially imported with from using this inelegant trick:
import sys
print(sys.modules['untitled'].string.ascii_uppercase)
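The binding rules quoted above can be checked without touching the file system. In this sketch a hypothetical in-memory module untitled_demo stands in for untitled.py, and a separate dictionary ns plays the role of stuff.py's namespace; both names are assumptions made for the demonstration.

```python
import sys
import types

# Build a stand-in for untitled.py in memory and register it as a module.
source = (
    "import string\n"
    "class testing:\n"
    "    def func(self):\n"
    "        return string.ascii_lowercase\n"
    "class second:\n"
    "    x = 1\n"
)
module = types.ModuleType("untitled_demo")
sys.modules["untitled_demo"] = module
exec(source, module.__dict__)

# Run the importing side in its own namespace, standing in for stuff.py.
ns = {}
exec("from untitled_demo import testing", ns)

print("testing" in ns)        # True: the requested name is bound
print("second" in ns)         # False: not requested, so not bound
print("string" in ns)         # False: lives only in untitled_demo's namespace
print("untitled_demo" in ns)  # False: the from form does not bind the module
print(ns["testing"]().func())                  # works: func sees its own module's globals
print(sys.modules["untitled_demo"].second.x)   # the whole module is still in the cache
```

The method can use string because functions look up globals in the module where they were defined, not in the module that calls them.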

In your first file (untitled.py), when the Python compiler parses this file (because you named it in the import), it creates two class code objects and executes the print statement. Note that it also prints if you run untitled.py from the command line.
In your second file (stuff.py), to add to @Paulo's comments, you have imported only the testing class into your namespace, so of the two code objects from untitled.py only that one is available.
However if you just say
import untitled
your 3rd "try" statement will work, since it will have untitled in its namespace.
Next thing: try importing untitled.testing :)

Recursive version of 'reload'

When I'm developing Python code, I usually test it in an ad-hoc way in the interpreter. I'll import some_module, test it, find a bug, fix the bug and save, and then use the built-in reload function to reload(some_module) and test again.
However, suppose that in some_module I have import some_other_module, and while testing some_module I discover a bug in some_other_module and fix it. Now calling reload(some_module) won't recursively re-import some_other_module. I have to either manually reimport the dependency (by doing something like reload(some_module.some_other_module), or import some_other_module; reload(some_other_module)), or, if I've changed a whole bunch of dependencies and lost track of what I need to reload, restart the entire interpreter.
What'd be more convenient is if there were some recursive_reload function, and I could just do recursive_reload(some_module) and have Python not only reload some_module, but also recursively reload every module that some_module imports (and every module that each of those modules imports, and so on) so that I could be sure that I wasn't using an old version of any of the other modules upon which some_module depends.
I don't think there's anything built in to Python that behaves like the recursive_reload function I describe here, but is there an easy way to hack such a thing together?

I've run up against the same issue, and you inspired me to actually solve the problem.
from types import ModuleType
try:
    from importlib import reload # Python 3.4+
except ImportError:
    # Needed for Python 3.0-3.3; harmless in Python 2.7 where imp.reload is just an
    # alias for the builtin reload.
    from imp import reload
def rreload(module):
    """Recursively reload modules."""
    reload(module)
    for attribute_name in dir(module):
        attribute = getattr(module, attribute_name)
        if type(attribute) is ModuleType:
            rreload(attribute)
Or, if you are using IPython, just use dreload or pass --deep-reload on startup.

I've run into the same issue and I've built on @Mattew's and @osa's answers.
from types import ModuleType
import os, sys
def rreload(module, paths=None, mdict=None):
    """Recursively reload modules."""
    if paths is None:
        paths = ['']
    if mdict is None:
        mdict = {}
    if module not in mdict:
        # modules reloaded from this module
        mdict[module] = []
    reload(module)
    for attribute_name in dir(module):
        attribute = getattr(module, attribute_name)
        if type(attribute) is ModuleType:
            if attribute not in mdict[module]:
                if attribute.__name__ not in sys.builtin_module_names:
                    if os.path.dirname(attribute.__file__) in paths:
                        mdict[module].append(attribute)
                        rreload(attribute, paths, mdict)
    reload(module)
    #return mdict
There are three differences:
In the general case, reload(module) has to be called at the end of the function as well, as @osa pointed out.
With circular import dependencies the code posted earlier would loop forever, so I've added a dictionary of lists to keep track of the modules loaded by other modules. While circular dependencies are not cool, Python allows them, so this reload function deals with them as well.
I've added a list of paths (default is ['']) from which reloading is allowed. Some modules don't like being reloaded the normal way (as shown here).

The code worked great for dependency modules imported just as import another_module, but it failed when the module imported functions with from another_module import some_func.
I expanded on @redsk's answer to try to be smart about these functions. I've also added a blacklist because unfortunately typing and importlib don't appear in sys.builtin_module_names (maybe there are more). I also wanted to prevent reloading of some dependencies I knew about.
I also track the reloaded module names and return them.
Tested on Python 3.7.4 Windows:
import os
import sys
from importlib import reload
from types import ModuleType
def rreload(module, paths=None, mdict=None, base_module=None, blacklist=None, reloaded_modules=None):
    """Recursively reload modules."""
    if paths is None:
        paths = [""]
    if mdict is None:
        mdict = {}
    if module not in mdict:
        # modules reloaded from this module
        mdict[module] = []
    if base_module is None:
        base_module = module
    if blacklist is None:
        blacklist = ["importlib", "typing"]
    if reloaded_modules is None:
        reloaded_modules = []
    reload(module)
    reloaded_modules.append(module.__name__)
    for attribute_name in dir(module):
        attribute = getattr(module, attribute_name)
        if type(attribute) is ModuleType and attribute.__name__ not in blacklist:
            if attribute not in mdict[module]:
                if attribute.__name__ not in sys.builtin_module_names:
                    if os.path.dirname(attribute.__file__) in paths:
                        mdict[module].append(attribute)
                        reloaded_modules = rreload(attribute, paths, mdict, base_module, blacklist, reloaded_modules)
        elif callable(attribute) and attribute.__module__ not in blacklist:
            if attribute.__module__ not in sys.builtin_module_names and f"_{attribute.__module__}" not in sys.builtin_module_names:
                if sys.modules[attribute.__module__] != base_module:
                    if sys.modules[attribute.__module__] not in mdict:
                        mdict[sys.modules[attribute.__module__]] = [attribute]
                        reloaded_modules = rreload(sys.modules[attribute.__module__], paths, mdict, base_module, blacklist, reloaded_modules)
    reload(module)
    return reloaded_modules
Some notes:
I don't know why some builtin_module_names are prefixed with an underscore (for example, collections is listed as _collections), so I have to do the double string check.
callable() returns True for classes. I guess that's expected, but it was one of the reasons I had to blacklist extra modules.
At least now I am able to deep reload a module at runtime and from my tests I was able to go multiple levels deep with from foo import bar and see the result at each call to rreload()
(Apologies for the long and ugly depth, but the black formatted version doesn't look so readable on SO)
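The failure with from another_module import some_func mentioned at the start of this answer comes from how reload works: it re-executes the module and rebinds the names inside it, but names that were already copied elsewhere keep pointing at the old objects. A small sketch of that effect, using a throwaway module dep_demo written to a temporary directory (the module name and its contents are made up for the demonstration):

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    sys.path.insert(0, tmp)
    path = os.path.join(tmp, "dep_demo.py")
    with open(path, "w") as f:
        f.write("def some_func():\n    return 'old'\n")

    import dep_demo
    from dep_demo import some_func  # copies a reference to the current function

    # Edit the source; the new text has a different length, so a cached
    # bytecode file from the first import cannot be mistaken for current.
    with open(path, "w") as f:
        f.write("def some_func():\n    return 'updated'\n")

    importlib.reload(dep_demo)

    stale = some_func()           # name copied before the reload
    fresh = dep_demo.some_func()  # attribute looked up on the reloaded module
    sys.path.remove(tmp)

print(stale)
print(fresh)
```

The copied name still returns the old value, which is why the function above has to track down the defining module of each callable and reload that module, and then reload the importing module again so its from-imports are re-executed.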

Wouldn't it be simpler to actually write some test cases and run them every time you finish modifying your module?
What you are doing is cool (you are in essence using TDD, test-driven development), but you are doing it wrong.
Consider that with written unit tests (using the default Python unittest module, or better yet nose) you get tests that are reusable and stable, and that help you detect inconsistencies in your code much faster and better than testing your module in the interactive environment.

I found the idea to just clear all the modules and then reimport your module here, which suggested to just do this:
import sys
sys.modules.clear()
This would mess up modules loaded that you don't want to reload (if you only want to reload your own modules). My idea is to only clear the modules that include your own folders. Something like this:
import sys
import importlib
def reload_all():
    delete_folders = ["yourfolder", "yourotherfolder"]
    for module in list(sys.modules.keys()):
        if any(folder in module for folder in delete_folders):
            del sys.modules[module]
    # And then you can reimport the file that you are running.
    importlib.import_module("yourfolder.entrypoint")
Reimporting your entry point will reimport all of its imports since the modules were cleared and it's automatically recursive.

Technically, in each file you could put a reload command, to ensure that it reloads each time it imports
a.py:
def testa():
    print 'hi!'
b.py:
import a
reload(a)
def testb():
    a.testa()
Now, interactively:
import b
b.testb()
#hi!
#<modify a.py>
reload(b)
b.testb()
#hello again!

I found the answer of redsk very useful.
I propose a simplified (for the user, not as code) version where the path to the module is automatically gathered and recursion works for an arbitrary number of levels.
Everything is self-contained in a single function.
Tested on Python 3.4. I guess for python 3.3 one must import reload from imp instead of ... from importlib.
It also checks if the __file__ file is present, which might be false if the coder forgets to define an __init__.py file in a submodule. In such case, an exception is raised.
def rreload(module):
    """
    Recursive reload of the specified module and (recursively) the used ones.
    Mandatory! Every submodule must have an __init__.py file
    Usage:
        import mymodule
        rreload(mymodule)
    :param module: the module to load (the module itself, not a string)
    :return: nothing
    """
    import os.path
    import sys
    def rreload_deep_scan(module, rootpath, mdict=None):
        from types import ModuleType
        from importlib import reload
        if mdict is None:
            mdict = {}
        if module not in mdict:
            # modules reloaded from this module
            mdict[module] = []
        # print("RReloading " + str(module))
        reload(module)
        for attribute_name in dir(module):
            attribute = getattr(module, attribute_name)
            # print ("for attr "+attribute_name)
            if type(attribute) is ModuleType:
                # print ("typeok")
                if attribute not in mdict[module]:
                    # print ("not int mdict")
                    if attribute.__name__ not in sys.builtin_module_names:
                        # print ("not a builtin")
                        # If the submodule is a python file, it will have a __file__ attribute
                        if not hasattr(attribute, '__file__'):
                            raise BaseException("Could not find attribute __file__ for module '"+str(attribute)+"'. Maybe a missing __init__.py file?")
                        attribute_path = os.path.dirname(attribute.__file__)
                        if attribute_path.startswith(rootpath):
                            # print ("in path")
                            mdict[module].append(attribute)
                            rreload_deep_scan(attribute, rootpath, mdict)
    rreload_deep_scan(module, rootpath=os.path.dirname(module.__file__))

For Python 3.6+ you can use:
from types import ModuleType
import sys
import importlib
def deep_reload(m: ModuleType):
    name = m.__name__  # get the name that is used in sys.modules
    name_ext = name + '.'  # support finding sub modules or packages
    def compare(loaded: str):
        return (loaded == name) or loaded.startswith(name_ext)
    all_mods = tuple(sys.modules)  # prevent changing iterable while iterating over it
    sub_mods = filter(compare, all_mods)
    for pkg in sorted(sub_mods, key=lambda item: item.count('.'), reverse=True):
        importlib.reload(sys.modules[pkg])  # reload packages, beginning with the most deeply nested

Below is the recursive reload function that I use, including a magic function for ipython/jupyter.
It does a depth-first search through all sub-modules and reloads them in the correct order of dependence.
import logging
from importlib import reload, import_module
from types import ModuleType
from IPython.core.magic import register_line_magic
logger = logging.getLogger(__name__)
def reload_recursive(module, reload_external_modules=False):
    """
    Recursively reload a module (in order of dependence).
    Parameters
    ----------
    module : ModuleType or str
        The module to reload.
    reload_external_modules : bool, optional
        Whether to reload all referenced modules, including external ones which
        aren't submodules of ``module``.
    """
    _reload(module, reload_external_modules, set())
@register_line_magic('reload')
def reload_magic(module):
    """
    Reload module on demand.
    Examples
    --------
    >>> %reload my_module
    reloading module: my_module
    """
    reload_recursive(module)
def _reload(module, reload_all, reloaded):
    if isinstance(module, ModuleType):
        module_name = module.__name__
    elif isinstance(module, str):
        module_name, module = module, import_module(module)
    else:
        raise TypeError(
            "'module' must be either a module or str; "
            f"got: {module.__class__.__name__}")
    for attr_name in dir(module):
        attr = getattr(module, attr_name)
        check = (
            # is it a module?
            isinstance(attr, ModuleType)
            # has it already been reloaded?
            and attr.__name__ not in reloaded
            # is it a proper submodule? (or just reload all)
            and (reload_all or attr.__name__.startswith(module_name))
        )
        if check:
            _reload(attr, reload_all, reloaded)
    logger.debug(f"reloading module: {module.__name__}")
    reload(module)
    reloaded.add(module_name)

It is a tricky thing to do. I have a working example in this answer:
how to find list of modules which depend upon a specific module in python

from <module> import ... in __init__.py makes module name visible?

Take the following code example:
File package1/__init__.py:
from moduleB import foo
print moduleB.__name__
File package1/moduleB.py:
def foo(): pass
Then from the current directory:
>>> import package1
package1.moduleB
This code works in CPython. What surprises me is that the from ... import statement in __init__.py makes the moduleB name visible. According to the Python documentation, this should not be the case:
The from form does not bind the module name
Could someone please explain why CPython works that way? Is there any documentation describing this in detail?

The documentation misled you as it is written to describe the more common case of importing a module from outside of the parent package containing it.
For example, using "from example import submodule" in my own code, where "example" is some third party library completely unconnected to my own code, does not bind the name "example". It does still import both the example/__init__.py and example/submodule.py modules, create two module objects, and assign example.submodule to the second module object.
But, "from..import" of names from a submodule must set the submodule attribute on the parent package object. Consider if it didn't:
1. package/__init__.py executes when package is imported.
2. That __init__ does "from submodule import name".
3. At some point later, other completely different code does "import package.submodule".
At step 3, either sys.modules["package.submodule"] doesn't exist, in which case loading it again will give you two different module objects in different scopes; or sys.modules["package.submodule"] will exist but "submodule" won't be an attribute of the parent package object (sys.modules["package"]), and "import package.submodule" will do nothing. However, if it does nothing, the code using the import cannot access submodule as an attribute of package!
Theoretically, how importing a submodule works could be changed if the rest of the import machinery was changed to match.
If you just need to know what importing a submodule S from package P will do, then in a nutshell:
1. Ensure P is imported, or import it otherwise. (This step recurses to handle "import A.B.C.D".)
2. Execute S.py to get a module object. (Skipping details of .pyc files, etc.)
3. Store module object in sys.modules["P.S"].
4. setattr(sys.modules["P"], "S", sys.modules["P.S"])
5. If that import was of the form "import P.S", bind "P" in local scope.
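The steps above can be observed with a standard-library package. This sketch uses json and its submodule json.decoder purely as a convenient example, and runs the import in a separate dictionary ns so that the result does not depend on what the surrounding session has already imported.

```python
import sys

# Execute a from-import of a name that lives in a submodule.
ns = {}
exec("from json.decoder import JSONDecoder", ns)

print("JSONDecoder" in ns)  # the requested name is bound
print("json" in ns)         # the package name is not bound by the from form

# Both module objects are cached, and the submodule was set as an
# attribute on its parent package.
print(sys.modules["json"].decoder is sys.modules["json.decoder"])
```

The last line is the setattr step: even though the importing namespace never received the name json, the package object itself gained (or already had) the attribute decoder, which is exactly what makes moduleB visible inside package1/__init__.py in the question.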

This is because __init__.py is itself the package1 module object at runtime, so every .py file in the package is defined as a submodule, and rewriting __all__ makes no difference. If you create another file, e.g. example.py, containing the same code as __init__.py, it raises NameError.
I think the CPython runtime uses a special lookup for variables in __init__.py that differs from other Python files, perhaps like this:
looking for variable named "moduleB"
if not found:
    if __file__ == '__init__.py': # don't raise NameError, look for a file named moduleB.py
        if current dir contains file named "moduleB.py":
            import moduleB
    else:
        raise NameError
