I am new to the gem5 simulator. I was reading the documentation (http://www.m5sim.org/Configuration_/_Simulation_Scripts) trying to understand how everything is implemented. When the docs describe the Python classes, they say the following:
gem5 provides a collection of Python object classes that correspond to its C++ simulation object classes. These Python classes are defined in a Python module called "m5.objects". The Python class definitions for these objects can be found in .py files in src, typically in the same directory as their C++ definitions.
To make the Python classes visible, the configuration file must first import the class definitions from the m5 module
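In practice, the first lines of a typical gem5 configuration script look like this:
import m5
from m5.objects import *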
In the m5/objects directory there is only one file "__init__.py". This is the code:
from __future__ import print_function
from __future__ import absolute_import

from m5.internal import params
from m5.SimObject import *

try:
    modules = __loader__.modules
except NameError:
    modules = { }

for module in modules.keys():
    if module.startswith('m5.objects.'):
        exec("from %s import *" % module)
Normally I don't program in Python, so perhaps that is the problem, but I haven't fully understood what is going on here. In this other post, "Python's __loader__, what is it?", they discuss what a loader means, but I feel I am missing something. Any help would be appreciated. Thanks in advance.
The __loader__
Consider the following code:
import sys

class FooImporter:
    def find_module(self, module_name, package_path):
        return self if module_name == 'foo' else None

    def load_module(self, module_name):
        print('FooImporter is working.')
        sys.modules[module_name] = __import__('sys')

# This activates the importer
sys.meta_path.append(FooImporter())

# This should trigger our importer to import 'foo'
import foo

# Show what we've just got
print(foo)
This will result in output:
FooImporter is working.
<module 'sys' (built-in)>
That is, as long as you do not have a Python module named foo on your PYTHONPATH.
Python's import hook mechanism (PEP 302) allows us to customize the behavior of import. In the example above, the module foo is claimed to be found, and handled, by the FooImporter. Note that this importer creates the module foo as an alias of sys. A complete importer (unlike the simplified one we've seen) is also responsible for setting the __loader__ attribute of the imported module to the importer itself.
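For illustration, a slightly more complete load_module, with the bookkeeping PEP 302 expects, might look like the following sketch (it uses the old imp API to match the example above; FooImporter and the name 'foo' are the same made-up names):
import imp
import sys

class FooImporter:
    def find_module(self, module_name, package_path):
        return self if module_name == 'foo' else None

    def load_module(self, module_name):
        # Reload support: return the already-imported module if present
        if module_name in sys.modules:
            return sys.modules[module_name]
        module = imp.new_module(module_name)
        module.__loader__ = self          # mark the module with its loader
        sys.modules[module_name] = module
        return module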
Gem5's import hook
Back to your question: gem5 uses the same mechanism for loading SimObjects, by design of its modularization. You can find the importer class at src/python/importer.py, under the class name CodeImporter.
When the module m5.objects is being imported, say via
from m5.objects import Root
the CodeImporter is responsible for handling the import, and it sets the __loader__ attribute on the imported module (in this case m5.objects). If you try printing __loader__ in m5/objects/__init__.py, you'll get something like:
<importer.CodeImporter object at 0x7f4f58941d60>
__loader__.modules is a dictionary containing the gem5-maintained SimObject modules, where each item is added by an addModule() call from src/sim/init.cc.
Whenever a SimObject's C++ counterpart has constructed an EmbeddedPython object, it is added to a list, so that gem5 initialization remembers to register it with the CodeImporter instance. For example, one should be able to find a Root.py.cc file in the build folder that registers the Root object. The loop at the end of m5/objects/__init__.py is just importing the list of SimObjects known through this mechanism.
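As a quick sanity check, something along these lines should list the registered SimObject modules from a config script (the attribute names come from the description above; treat this as a sketch against gem5 internals, not a stable API):
import m5.objects

loader = m5.objects.__loader__        # the CodeImporter instance
for name in sorted(loader.modules):
    if name.startswith('m5.objects.'):
        print(name)                   # e.g. m5.objects.Root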
I think this should be sufficient to give someone a picture of the underlying magic and (hopefully) resolve their curiosity.
I have my own repository created in BitBucket.
In that repository, I have a file named core.py and an __init__.py file
I tried to import the core module, and I fixed all the requirements that were needed.
Now that I am finally able to import the module (which is just one big class) using IPython, when I make the call:
obj = MyClass()
I get an error:
name 'MyClass()' is not defined
even though it seems the module was imported.
Let me know if more information is needed.
As you stated in your comment, you are importing core.py:
from mintigocloudstorage import core
That means, you also have to tell your script where to find your class:
obj = core.MyClass()
If the import was successful, as you say, Python should now be able to locate your class definition.
Alternatively you can also import your class:
from mintigocloudstorage.core import MyClass
obj = MyClass()
I have a problem loading objects via numpy.load after renaming a module.
Here's a simple example showing the problem.
Imagine having a class defined in mymodule.py:
class MyClass(object):
    a = "ciao"
    b = [1, 2, 3]

    def __init__(self, value=2):
        self.value = value
From a Python session I can simply create an instance and save it:
import numpy as np
import mymodule
instance = mymodule.MyClass()
np.save("dump.npy", instance)
Loading the file works nicely (even from a fresh session started in the same folder):
np.load("dump.npy")
If I now rename the module:
mv mymodule.py mymodule2.py
the loading fails. This is expected, but I was hoping that by importing the module before loading:
import mymodule2 as mymodule
the object definition could be found ... but it does not work.
This means that:
1. I do not understand how it works
2. I am forced to keep a symbolic link to the renamed file in a project I am partially refactoring.
Is there anything else I can do to avoid the symbolic link solution? And to avoid having the same problem in the future?
Thanks a lot,
marco
[this is my first question here, sorry If I am doing something wrong]
NumPy uses pickle for arrays with objects, but adds a header on top of it. Therefore, you'll need to do a bit more than just write a custom Unpickler:
import pickle
from numpy.lib.format import read_magic, _check_version, _read_array_header

class RenamingUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        if module == 'mymodule':
            module = 'mymodule2'
        return super().find_class(module, name)

with open('dump.npy', 'rb') as fp:
    version = read_magic(fp)
    _check_version(version)
    dtype = _read_array_header(fp, version)[2]
    assert dtype.hasobject
    print(RenamingUnpickler(fp).load())
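Alternatively, if the rename is a one-off, you can alias the old module name in sys.modules before loading; pickle resolves classes by importing the module named in the stream, and __import__ consults sys.modules first. (Note that newer NumPy also requires allow_pickle=True for object arrays.)
import sys
import numpy as np
import mymodule2

sys.modules['mymodule'] = mymodule2  # let pickle find mymodule.MyClass again
print(np.load('dump.npy', allow_pickle=True))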
Following the suggestion here, my package (or the directory containing my modules) is located at C:/Python34/Lib/site-packages. The directory contains an __init__.py and sys.path contains a path to the directory as shown.
Still I am getting the following error:
Traceback (most recent call last):
  File "C:/Python34/Lib/site-packages/toolkit/window.py", line 6, in <module>
    from catalogmaker import Catalog
  File "C:\Python34\Lib\site-packages\toolkit\catalogmaker.py", line 1, in <module>
    from patronmaker import Patron
  File "C:\Python34\Lib\site-packages\toolkit\patronmaker.py", line 4, in <module>
    class Patron:
  File "C:\Python34\Lib\site-packages\toolkit\patronmaker.py", line 11, in Patron
    patrons = pickle.load(f)
ImportError: No module named 'Patron'
I have a class in patronmaker.py named 'Patron', but no module named Patron, so I am not sure what the last statement in the error message means. I very much appreciate your thoughts on what I am missing.
Python version 3.4.1 on a 32-bit Windows machine.
You are saving all patron instances (i.e. self) to the Patron class attribute Patron.patrons. Then you are trying to pickle a class attribute from within the class. This can choke pickle; however, I believe dill should be able to handle it. Is it really necessary to save all the class instances to a list in Patron? It's a bit of an odd thing to do…
pickle serializes classes by reference and doesn't play well with __main__ for many objects. With dill, you don't have to serialize classes by reference, and it handles issues with __main__ much better. Get dill here: https://github.com/uqfoundation
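To see "by reference" concretely, here is a minimal sketch (Foo is just a throwaway class):
import pickle
import dill

class Foo:
    pass

p1 = pickle.dumps(Foo())  # records only the reference "__main__.Foo"
p2 = dill.dumps(Foo())    # dill serializes the class definition itself

# In a *fresh* interpreter, pickle.loads(p1) fails with an AttributeError
# because __main__.Foo no longer exists, while dill.loads(p2) succeeds.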
Edit:
I tried your code (with one minor change) and it worked.
dude@hilbert>$ python patronmaker.py
Then start python…
>>> import dill
>>> f = open('patrons.pkl', 'rb')
>>> p = dill.load(f)
>>> p
[Julius Caeser, Kunte Kinta, Norton Henrich, Mother Teresa]
The only change I made was to uncomment the lines at the end of patronmaker.py so that it saved some patrons…. and I also replaced import pickle with import dill as pickle everywhere.
So, even by downloading and running your code, I can't produce an error with dill. I'm using the latest dill from github.
Additional Edit:
Your traceback above is from an ImportError. Did you install your module? If you didn't use setup.py to install it, or if you don't have your module on your PYTHONPATH, then you won't find your module regardless of how you are serializing things.
Even more edits:
Looking at your code, you should be using the singleton pattern for patrons… the list should not live inside the class Patron. The block of code at class level that loads the patrons into Patron.patrons is sure to cause problems… and is probably the source of some form of error. I also see that you are pickling the attribute Patron.patrons (not even the class itself) from inside the Patron class -- this is madness -- don't do it. Also notice that when you are trying to obtain the patrons, you use Patron.patrons… this goes through the class object and not an instance. Move patrons outside of the class, and use the singleton directly as a list of patrons. Also, you should typically be working with a patron instance, so if you wanted each patron to know who all the other patrons are, p = Patron('Joe', 'Blow'), then p.patrons would give you all patrons… but you'd need to write a load method that reads the singleton list of patrons; you could also use a property to make load give you something that looks like an attribute.
If you build a singleton of patrons (as a list)… or a "registry" of patrons (as a dict) if you like, then just check if a patrons pickle file exists… to load to the registry… and don't do it from inside the Patron class… things should go much better. Your code currently is trying to load a class instance on a class definition while it builds that class object. That's bad...
Also, don't expect people to go downloading your code and debugging it for you, when you don't present a minimal test case or sufficient info for how the traceback was created.
You may have hit on a valid pickling error in dill for some dark corner case, but I can't tell b/c I can't reproduce your error. However, I can tell that you need some refactoring.
And just to be explicit:
Move your patrons-initializing mess from Patron into a new file, patrons.py:
import os
import dill as pickle

# Initialize patrons with saved pickle data
if os.path.isfile('patrons.pkl'):
    with open('patrons.pkl', 'rb') as f:
        patrons = pickle.load(f)
else:
    patrons = []
Then in patronmaker.py, and everywhere else you need the singleton…
import dill as pickle
import os.path
import patrons as the

class Patron:
    def __init__(self, lname, fname):
        self.lname = lname.title()
        self.fname = fname.title()
        self.terrCheckedOutHistory = {}
        # Add any created Patron to the patrons list
        the.patrons.append(self)
        # Preserve this person via pickle
        with open('patrons.pkl', 'wb') as f:
            pickle.dump(the.patrons, f)
And you should be fine, unless your code is hitting one of the cases where attributes on modules can't be serialized because they were added dynamically (see https://github.com/uqfoundation/dill/pull/47), which should definitely make pickle fail, and in some cases dill too… probably with an AttributeError on the module. I just can't reproduce this… and I'm done.
When I'm developing Python code, I usually test it in an ad-hoc way in the interpreter. I'll import some_module, test it, find a bug, fix the bug and save, and then use the built-in reload function to reload(some_module) and test again.
However, suppose that in some_module I have import some_other_module, and while testing some_module I discover a bug in some_other_module and fix it. Now calling reload(some_module) won't recursively re-import some_other_module. I have to either manually re-import the dependency (by doing something like reload(some_module.some_other_module), or import some_other_module; reload(some_other_module)), or, if I've changed a whole bunch of dependencies and lost track of what I need to reload, I need to restart the entire interpreter.
What'd be more convenient is if there were some recursive_reload function, and I could just do recursive_reload(some_module) and have Python not only reload some_module, but also recursively reload every module that some_module imports (and every module that each of those modules imports, and so on) so that I could be sure that I wasn't using an old version of any of the other modules upon which some_module depends.
I don't think there's anything built in to Python that behaves like the recursive_reload function I describe here, but is there an easy way to hack such a thing together?
I've run up against the same issue, and you inspired me to actually solve the problem.
from types import ModuleType

try:
    from importlib import reload  # Python 3.4+
except ImportError:
    # Needed for Python 3.0-3.3; harmless in Python 2.7 where imp.reload is just an
    # alias for the builtin reload.
    from imp import reload

def rreload(module):
    """Recursively reload modules."""
    reload(module)
    for attribute_name in dir(module):
        attribute = getattr(module, attribute_name)
        if type(attribute) is ModuleType:
            rreload(attribute)
Or, if you are using IPython, just use dreload or pass --deep-reload on startup.
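For reference, dreload lives in IPython.lib.deepreload, so you can also call it from a plain script (my_module is a placeholder for your own module):
from IPython.lib.deepreload import reload as dreload
import my_module

dreload(my_module)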
I've run up against the same issue, and I've built on @Mattew's and @osa's answers.
from types import ModuleType
from importlib import reload  # Python 3; on Python 2, reload is a builtin
import os, sys

def rreload(module, paths=None, mdict=None):
    """Recursively reload modules."""
    if paths is None:
        paths = ['']
    if mdict is None:
        mdict = {}
    if module not in mdict:
        # modules reloaded from this module
        mdict[module] = []
    reload(module)
    for attribute_name in dir(module):
        attribute = getattr(module, attribute_name)
        if type(attribute) is ModuleType:
            if attribute not in mdict[module]:
                if attribute.__name__ not in sys.builtin_module_names:
                    if os.path.dirname(attribute.__file__) in paths:
                        mdict[module].append(attribute)
                        rreload(attribute, paths, mdict)
    reload(module)
    #return mdict
There are three differences:
In the general case, reload(module) has to be called at the end of the function as well, as @osa pointed out.
With circular import dependencies the code posted earlier would loop forever, so I've added a dictionary of lists to keep track of the set of modules loaded by other modules. While circular dependencies are not cool, Python allows them, so this reload function deals with them as well.
I've added a list of paths (the default is ['']) from which reloading is allowed, since some modules don't like being reloaded the normal way (as shown here). A short usage sketch follows this list.
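For example (mypackage is a placeholder), to reload mypackage together with only those dependencies whose files live in mypackage's own directory:
import os
import mypackage

rreload(mypackage, paths=[os.path.dirname(mypackage.__file__)])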
The code worked great for dependency modules imported just as import another_module, but it failed when the module imported functions with from another_module import some_func.
I expanded on @redsk's answer to try and be smart about these functions. I've also added a blacklist, because unfortunately typing and importlib don't appear in sys.builtin_module_names (maybe there are more). Also, I wanted to prevent reloading of some dependencies I knew about.
I also track the reloaded module names and return them.
Tested on Python 3.7.4 Windows:
import os
import sys
from types import ModuleType
from importlib import reload

def rreload(module, paths=None, mdict=None, base_module=None, blacklist=None, reloaded_modules=None):
    """Recursively reload modules."""
    if paths is None:
        paths = [""]
    if mdict is None:
        mdict = {}
    if module not in mdict:
        # modules reloaded from this module
        mdict[module] = []
    if base_module is None:
        base_module = module
    if blacklist is None:
        blacklist = ["importlib", "typing"]
    if reloaded_modules is None:
        reloaded_modules = []
    reload(module)
    reloaded_modules.append(module.__name__)
    for attribute_name in dir(module):
        attribute = getattr(module, attribute_name)
        if type(attribute) is ModuleType and attribute.__name__ not in blacklist:
            if attribute not in mdict[module]:
                if attribute.__name__ not in sys.builtin_module_names:
                    if os.path.dirname(attribute.__file__) in paths:
                        mdict[module].append(attribute)
                        reloaded_modules = rreload(attribute, paths, mdict, base_module, blacklist, reloaded_modules)
        elif callable(attribute) and attribute.__module__ not in blacklist:
            if attribute.__module__ not in sys.builtin_module_names and f"_{attribute.__module__}" not in sys.builtin_module_names:
                if sys.modules[attribute.__module__] != base_module:
                    if sys.modules[attribute.__module__] not in mdict:
                        mdict[sys.modules[attribute.__module__]] = [attribute]
                        reloaded_modules = rreload(sys.modules[attribute.__module__], paths, mdict, base_module, blacklist, reloaded_modules)
    reload(module)
    return reloaded_modules
Some notes:
I don't know why some builtin_module_names are prefixed with an underscore (for example, collections is listed as _collections), so I have to do the double string check.
callable() returns True for classes; I guess that's expected, but that was one of the reasons I had to blacklist extra modules.
At least now I am able to deep reload a module at runtime and from my tests I was able to go multiple levels deep with from foo import bar and see the result at each call to rreload()
(Apologies for the long and ugly depth, but the black formatted version doesn't look so readable on SO)
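The underscore prefix in the first note is easy to check directly; on CPython the accelerated C implementation is the builtin, not the pure-Python wrapper:
import sys

print('collections' in sys.builtin_module_names)   # False
print('_collections' in sys.builtin_module_names)  # True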
Wouldn't it be simpler to actually write some test cases and run them every time you are done modifying your module?
What you are doing is cool (you are, in essence, using TDD (test-driven development)), but you are doing it wrong.
Consider that with written unit tests (using the default Python unittest module, or better yet nose) you get tests that are reusable and stable, and that help you detect inconsistencies in your code much, much faster and better than testing your module in the interactive environment.
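A minimal sketch of that workflow (mymodule and its add function are made-up names standing in for your own code):
import unittest
import mymodule

class TestAdd(unittest.TestCase):
    def test_add(self):
        self.assertEqual(mymodule.add(1, 2), 3)

if __name__ == '__main__':
    unittest.main()
Run python -m unittest after every change instead of reloading by hand.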
I found the idea of just clearing all the modules and then reimporting your module here, which suggested simply doing this:
import sys
sys.modules.clear()
This would, however, mess up modules that you don't want to reload (if you only want to reload your own modules). My idea is to clear only the modules whose names include your own folders. Something like this:
import sys
import importlib

def reload_all():
    delete_folders = ["yourfolder", "yourotherfolder"]
    for module in list(sys.modules.keys()):
        if any(folder in module for folder in delete_folders):
            del sys.modules[module]

# And then you can reimport the file that you are running.
importlib.import_module("yourfolder.entrypoint")
Reimporting your entry point will reimport all of its imports since the modules were cleared and it's automatically recursive.
Technically, in each file you could put a reload command, to ensure that it reloads each time it is imported:
a.py:

def testa():
    print 'hi!'

b.py:

import a
reload(a)

def testb():
    a.testa()
Now, interactively:
import b
b.testb()
#hi!
#<modify a.py>
reload(b)
b.testb()
#hello again!
I found redsk's answer very useful.
I propose a simplified (for the user, not as code) version where the path to the module is automatically gathered and recursion works for an arbitrary number of levels.
Everything is self-contained in a single function.
Tested on Python 3.4. I guess for Python 3.3 one must import reload from imp instead of from importlib.
It also checks whether the __file__ attribute is present, which might not be the case if the coder forgot to define an __init__.py file in a submodule. In such a case, an exception is raised.
def rreload(module):
    """
    Recursive reload of the specified module and (recursively) the used ones.
    Mandatory! Every submodule must have an __init__.py file
    Usage:
        import mymodule
        rreload(mymodule)
    :param module: the module to load (the module itself, not a string)
    :return: nothing
    """
    import os.path
    import sys

    def rreload_deep_scan(module, rootpath, mdict=None):
        from types import ModuleType
        from importlib import reload
        if mdict is None:
            mdict = {}
        if module not in mdict:
            # modules reloaded from this module
            mdict[module] = []
        # print("RReloading " + str(module))
        reload(module)
        for attribute_name in dir(module):
            attribute = getattr(module, attribute_name)
            # print("for attr " + attribute_name)
            if type(attribute) is ModuleType:
                # print("type ok")
                if attribute not in mdict[module]:
                    # print("not in mdict")
                    if attribute.__name__ not in sys.builtin_module_names:
                        # print("not a builtin")
                        # If the submodule is a python file, it will have a __file__ attribute
                        if not hasattr(attribute, '__file__'):
                            raise BaseException("Could not find attribute __file__ for module '" + str(attribute) + "'. Maybe a missing __init__.py file?")
                        attribute_path = os.path.dirname(attribute.__file__)
                        if attribute_path.startswith(rootpath):
                            # print("in path")
                            mdict[module].append(attribute)
                            rreload_deep_scan(attribute, rootpath, mdict)

    rreload_deep_scan(module, rootpath=os.path.dirname(module.__file__))
For Python 3.6+ you can use:
from types import ModuleType
import sys
import importlib

def deep_reload(m: ModuleType):
    name = m.__name__  # get the name that is used in sys.modules
    name_ext = name + '.'  # support finding submodules or packages

    def compare(loaded: str):
        return (loaded == name) or loaded.startswith(name_ext)

    all_mods = tuple(sys.modules)  # prevent changing iterable while iterating over it
    sub_mods = filter(compare, all_mods)
    for pkg in sorted(sub_mods, key=lambda item: item.count('.'), reverse=True):
        importlib.reload(sys.modules[pkg])  # reload packages, beginning with the most deeply nested
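A quick usage sketch (mypackage is a placeholder); the sort by dot count ensures the most deeply nested submodules are reloaded before the packages that import them:
import mypackage.util   # fills sys.modules with 'mypackage' and 'mypackage.util'

deep_reload(mypackage)  # reloads mypackage.util first, then mypackage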
Below is the recursive reload function that I use, including a magic function for ipython/jupyter.
It does a depth-first search through all sub-modules and reloads them in the correct order of dependence.
import logging
from importlib import reload, import_module
from types import ModuleType

from IPython.core.magic import register_line_magic

logger = logging.getLogger(__name__)

def reload_recursive(module, reload_external_modules=False):
    """
    Recursively reload a module (in order of dependence).

    Parameters
    ----------
    module : ModuleType or str
        The module to reload.
    reload_external_modules : bool, optional
        Whether to reload all referenced modules, including external ones which
        aren't submodules of ``module``.
    """
    _reload(module, reload_external_modules, set())

@register_line_magic('reload')
def reload_magic(module):
    """
    Reload module on demand.

    Examples
    --------
    >>> %reload my_module
    reloading module: my_module
    """
    reload_recursive(module)

def _reload(module, reload_all, reloaded):
    if isinstance(module, ModuleType):
        module_name = module.__name__
    elif isinstance(module, str):
        module_name, module = module, import_module(module)
    else:
        raise TypeError(
            "'module' must be either a module or str; "
            f"got: {module.__class__.__name__}")

    for attr_name in dir(module):
        attr = getattr(module, attr_name)
        check = (
            # is it a module?
            isinstance(attr, ModuleType)
            # has it already been reloaded?
            and attr.__name__ not in reloaded
            # is it a proper submodule? (or just reload all)
            and (reload_all or attr.__name__.startswith(module_name))
        )
        if check:
            _reload(attr, reload_all, reloaded)

    logger.debug(f"reloading module: {module.__name__}")
    reload(module)
    reloaded.add(module_name)
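A quick usage sketch (my_module is a placeholder). In IPython, once the magic is registered:
# %reload my_module
From plain Python, either a module object or its name works:
import my_module

reload_recursive(my_module)        # or: reload_recursive('my_module')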
It is a tricky thing to do; I have a working example in this answer:
how to find list of modules which depend upon a specific module in python