Loading files into variables - python

I am trying to write a small function that gets a variable name, check if it exists, and if not loads it from a file (using pickle) to the global namespace.
I tried using this in a file:
import cPickle
#
# Load if neccesary
#
def loadfile(variable, filename):
if variable not in globals():
cmd = "%s = cPickle.load(file('%s','r'))" % (variable, filename)
print cmd
exec(cmd) in globals()
But it doesn't work - the variable don't get defined. What am I doing wrong?

Using 'globals' has the problem that it only works for the current module. Rather than passing 'globals' around, a better way is to use the 'setattr' builtin directly on a namespace. This means you can then reuse the function on instances as well as modules.
import cPickle
#
# Load if neccesary
#
def loadfile(variable, filename, namespace=None):
if module is None:
import __main__ as namespace
setattr(namespace, variable, cPickle.load(file(filename,'r')))
# From the main script just do:
loadfile('myvar','myfilename')
# To set the variable in module 'mymodule':
import mymodule
...
loadfile('myvar', 'myfilename', mymodule)
Be careful about the module name: the main script is always a module main. If you are running script.py and do 'import script' you'll get a separate copy of your code which is usually not what you want.

You could alway avoid exec entirely:
import cPickle
#
# Load if neccesary
#
def loadfile(variable, filename):
g=globals()
if variable not in g:
g[variable]=cPickle.load(file(filename,'r'))
EDIT: of course that only loads the globals into the current module's globals.
If you want to load the stuff into the globals of another module you'd be best to pass in them in as a parameter:
import cPickle
#
# Load if neccesary
#
def loadfile(variable, filename, g=None):
if g is None:
g=globals()
if variable not in g:
g[variable]=cPickle.load(file(filename,'r'))
# then in another module do this
loadfile('myvar','myfilename',globals())

Related

Import pyarmor obfuscated code using importlib

Suppose I have 2 modules - one has been obfuscated by PyArmor. The other imports the obfuscated module and uses it:
# obfuscated.py
def run_task(conn):
conn.send_msg("Here you go")
print(conn.some_val + 55)
return 0
# Non obfuscated (user) code
import importlib.util
class conn:
some_val = 5
def send_msg(msg):
print(msg)
def main():
# import obfuscated # This works...but I need to dynamically load it:
# This does not:
spec = importlib.util.spec_from_file_location("module.name", r'c:\Users\me\obfuscated.py')
obfuscated = importlib.util.module_from_spec(spec)
spec.loader.exec_module(swdl)
ret = obfuscated.run_task(conn)
print("from main: ", ret)
if __name__ == "__main__":
main()
If I import the obfuscated file using import it is fine. But I need to use importlib to dynamically import the obfuscated file. The importlib does not work - I get:
AttributeError: module 'module.name' has no attribute 'obfuscated'
The idea is that the user can write a script using the API available within obfuscated.py but need to load the module from wherever it resides on their system.
Is there anyway to achieve this?
I think I have a method based on what I read here: https://pyarmor.readthedocs.io/en/latest/mode.html#restrict-mode
I use a proxy between the user code and the obfuscated code.
User code may or may not be obfuscated
The obfuscated code is obviously obfuscated!
The proxy must not be obfuscated (for simplicity, I obfuscated everything then copied the original proxy.py over the obfuscated one)
So, now user code imports the proxy.py using importlib instead of the obfuscated.py.
And the proxy merely imports the obfuscated.py:
# proxy.py
import obfuscated
I managed to import modules dynamically in this way:
code = open('c:\Users\me\obfuscated.py','r').read()
spec = importlib.util.spec_from_loader(package_name,loader=None)
module = importlib.util.module_from_spec(spec)
module.__file__ = 'c:\Users\me\obfuscated.py'
globals_dict = {"__file__":module.__file__}
exec(code, globals_dict)
for item in [x for x in globals_dict["__builtins__"] if not x.startswith("_")]:
setattr(module,item,globals_dict["__builtins__"].get(item))
It reads code from a file, initiates a module, and eventually puts variables in a dictionary. You can find the module's functions in globals_dict["__builtins__"]

python create and register module via compile function [duplicate]

I have some code in the form of a string and would like to make a module out of it without writing to disk.
When I try using imp and a StringIO object to do this, I get:
>>> imp.load_source('my_module', '', StringIO('print "hello world"'))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: load_source() argument 3 must be file, not instance
>>> imp.load_module('my_module', StringIO('print "hello world"'), '', ('', '', 0))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: load_module arg#2 should be a file or None
How can I create the module without having an actual file? Alternatively, how can I wrap a StringIO in a file without writing to disk?
UPDATE:
NOTE: This issue is also a problem in python3.
The code I'm trying to load is only partially trusted. I've gone through it with ast and determined that it doesn't import anything or do anything I don't like, but I don't trust it enough to run it when I have local variables running around that could get modified, and I don't trust my own code to stay out of the way of the code I'm trying to import.
I created an empty module that only contains the following:
def load(code):
# Delete all local variables
globals()['code'] = code
del locals()['code']
# Run the code
exec(globals()['code'])
# Delete any global variables we've added
del globals()['load']
del globals()['code']
# Copy k so we can use it
if 'k' in locals():
globals()['k'] = locals()['k']
del locals()['k']
# Copy the rest of the variables
for k in locals().keys():
globals()[k] = locals()[k]
Then you can import mymodule and call mymodule.load(code). This works for me because I've ensured that the code I'm loading does not use globals. Also, the global keyword is only a parser directive and can't refer to anything outside of the exec.
This really is way too much work to import the module without writing to disk, but if you ever want to do this, I believe it's the best way.
Here is how to import a string as a module (Python 2.x):
import sys,imp
my_code = 'a = 5'
mymodule = imp.new_module('mymodule')
exec my_code in mymodule.__dict__
In Python 3, exec is a function, so this should work:
import sys,imp
my_code = 'a = 5'
mymodule = imp.new_module('mymodule')
exec(my_code, mymodule.__dict__)
Now access the module attributes (and functions, classes etc) as:
print(mymodule.a)
>>> 5
To ignore any next attempt to import, add the module to sys:
sys.modules['mymodule'] = mymodule
imp.new_module is deprecated since python 3.4, but it still works as of python 3.9
imp.new_module was replaced with importlib.util.module_from_spec
importlib.util.module_from_spec
is preferred over using types.ModuleType to create a new module as
spec is used to set as many import-controlled attributes on the module
as possible.
importlib.util.spec_from_loader
uses available loader APIs, such as InspectLoader.is_package(), to
fill in any missing information on the spec.
these module attributes are __builtins__ __doc__ __loader__ __name__ __package__ __spec__
import sys, importlib.util
def import_module_from_string(name: str, source: str):
"""
Import module from source string.
Example use:
import_module_from_string("m", "f = lambda: print('hello')")
m.f()
"""
spec = importlib.util.spec_from_loader(name, loader=None)
module = importlib.util.module_from_spec(spec)
exec(source, module.__dict__)
sys.modules[name] = module
globals()[name] = module
# demo
# note: "if True:" allows to indent the source string
import_module_from_string('hello_module', '''if True:
def hello():
print('hello')
''')
hello_module.hello()
You could simply create a Module object and stuff it into sys.modules and put your code inside.
Something like:
import sys
from types import ModuleType
mod = ModuleType('mymodule')
sys.modules['mymodule'] = mod
exec(mycode, mod.__dict__)
If the code for the module is in a string, you can forgo using StringIO and use it directly with exec, as illustrated below with a file named dynmodule.py.
Works in Python 2 & 3.
from __future__ import print_function
class _DynamicModule(object):
def load(self, code):
execdict = {'__builtins__': None} # optional, to increase safety
exec(code, execdict)
keys = execdict.get(
'__all__', # use __all__ attribute if defined
# else all non-private attributes
(key for key in execdict if not key.startswith('_')))
for key in keys:
setattr(self, key, execdict[key])
# replace this module object in sys.modules with empty _DynamicModule instance
# see Stack Overflow question:
# https://stackoverflow.com/questions/5365562/why-is-the-value-of-name-changing-after-assignment-to-sys-modules-name
import sys as _sys
_ref, _sys.modules[__name__] = _sys.modules[__name__], _DynamicModule()
if __name__ == '__main__':
import dynmodule # name of this module
import textwrap # for more readable code formatting in sample string
# string to be loaded can come from anywhere or be generated on-the-fly
module_code = textwrap.dedent("""\
foo, bar, baz = 5, 8, 2
def func():
return foo*bar + baz
__all__ = 'foo', 'bar', 'func' # 'baz' not included
""")
dynmodule.load(module_code) # defines module's contents
print('dynmodule.foo:', dynmodule.foo)
try:
print('dynmodule.baz:', dynmodule.baz)
except AttributeError:
print('no dynmodule.baz attribute was defined')
else:
print('Error: there should be no dynmodule.baz module attribute')
print('dynmodule.func() returned:', dynmodule.func())
Output:
dynmodule.foo: 5
no dynmodule.baz attribute was defined
dynmodule.func() returned: 42
Setting the '__builtins__' entry to None in the execdict dictionary prevents the code from directly executing any built-in functions, like __import__, and so makes running it safer. You can ease that restriction by selectively adding things to it you feel are OK and/or required.
It's also possible to add your own predefined utilities and attributes which you'd like made available to the code thereby creating a custom execution context for it to run in. That sort of thing can be useful for implementing a "plug-in" or other user-extensible architecture.
you could use exec or eval to execute python code as a string. see here, here and here
The documentation for imp.load_source says (my emphasis):
The file argument is the source file, open for reading as text, from the beginning. It must currently be a real file object, not a user-defined class emulating a file.
... so you may be out of luck with this method, I'm afraid.
Perhaps eval would be enough for you in this case?
This sounds like a rather surprising requirement, though - it might help if you add some more to your question about the problem you're really trying to solve.

How can a string be imported as a Python module bound to a local name? [duplicate]

I have some code in the form of a string and would like to make a module out of it without writing to disk.
When I try using imp and a StringIO object to do this, I get:
>>> imp.load_source('my_module', '', StringIO('print "hello world"'))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: load_source() argument 3 must be file, not instance
>>> imp.load_module('my_module', StringIO('print "hello world"'), '', ('', '', 0))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: load_module arg#2 should be a file or None
How can I create the module without having an actual file? Alternatively, how can I wrap a StringIO in a file without writing to disk?
UPDATE:
NOTE: This issue is also a problem in python3.
The code I'm trying to load is only partially trusted. I've gone through it with ast and determined that it doesn't import anything or do anything I don't like, but I don't trust it enough to run it when I have local variables running around that could get modified, and I don't trust my own code to stay out of the way of the code I'm trying to import.
I created an empty module that only contains the following:
def load(code):
# Delete all local variables
globals()['code'] = code
del locals()['code']
# Run the code
exec(globals()['code'])
# Delete any global variables we've added
del globals()['load']
del globals()['code']
# Copy k so we can use it
if 'k' in locals():
globals()['k'] = locals()['k']
del locals()['k']
# Copy the rest of the variables
for k in locals().keys():
globals()[k] = locals()[k]
Then you can import mymodule and call mymodule.load(code). This works for me because I've ensured that the code I'm loading does not use globals. Also, the global keyword is only a parser directive and can't refer to anything outside of the exec.
This really is way too much work to import the module without writing to disk, but if you ever want to do this, I believe it's the best way.
Here is how to import a string as a module (Python 2.x):
import sys,imp
my_code = 'a = 5'
mymodule = imp.new_module('mymodule')
exec my_code in mymodule.__dict__
In Python 3, exec is a function, so this should work:
import sys,imp
my_code = 'a = 5'
mymodule = imp.new_module('mymodule')
exec(my_code, mymodule.__dict__)
Now access the module attributes (and functions, classes etc) as:
print(mymodule.a)
>>> 5
To ignore any next attempt to import, add the module to sys:
sys.modules['mymodule'] = mymodule
imp.new_module is deprecated since python 3.4, but it still works as of python 3.9
imp.new_module was replaced with importlib.util.module_from_spec
importlib.util.module_from_spec
is preferred over using types.ModuleType to create a new module as
spec is used to set as many import-controlled attributes on the module
as possible.
importlib.util.spec_from_loader
uses available loader APIs, such as InspectLoader.is_package(), to
fill in any missing information on the spec.
these module attributes are __builtins__ __doc__ __loader__ __name__ __package__ __spec__
import sys, importlib.util
def import_module_from_string(name: str, source: str):
"""
Import module from source string.
Example use:
import_module_from_string("m", "f = lambda: print('hello')")
m.f()
"""
spec = importlib.util.spec_from_loader(name, loader=None)
module = importlib.util.module_from_spec(spec)
exec(source, module.__dict__)
sys.modules[name] = module
globals()[name] = module
# demo
# note: "if True:" allows to indent the source string
import_module_from_string('hello_module', '''if True:
def hello():
print('hello')
''')
hello_module.hello()
You could simply create a Module object and stuff it into sys.modules and put your code inside.
Something like:
import sys
from types import ModuleType
mod = ModuleType('mymodule')
sys.modules['mymodule'] = mod
exec(mycode, mod.__dict__)
If the code for the module is in a string, you can forgo using StringIO and use it directly with exec, as illustrated below with a file named dynmodule.py.
Works in Python 2 & 3.
from __future__ import print_function
class _DynamicModule(object):
def load(self, code):
execdict = {'__builtins__': None} # optional, to increase safety
exec(code, execdict)
keys = execdict.get(
'__all__', # use __all__ attribute if defined
# else all non-private attributes
(key for key in execdict if not key.startswith('_')))
for key in keys:
setattr(self, key, execdict[key])
# replace this module object in sys.modules with empty _DynamicModule instance
# see Stack Overflow question:
# https://stackoverflow.com/questions/5365562/why-is-the-value-of-name-changing-after-assignment-to-sys-modules-name
import sys as _sys
_ref, _sys.modules[__name__] = _sys.modules[__name__], _DynamicModule()
if __name__ == '__main__':
import dynmodule # name of this module
import textwrap # for more readable code formatting in sample string
# string to be loaded can come from anywhere or be generated on-the-fly
module_code = textwrap.dedent("""\
foo, bar, baz = 5, 8, 2
def func():
return foo*bar + baz
__all__ = 'foo', 'bar', 'func' # 'baz' not included
""")
dynmodule.load(module_code) # defines module's contents
print('dynmodule.foo:', dynmodule.foo)
try:
print('dynmodule.baz:', dynmodule.baz)
except AttributeError:
print('no dynmodule.baz attribute was defined')
else:
print('Error: there should be no dynmodule.baz module attribute')
print('dynmodule.func() returned:', dynmodule.func())
Output:
dynmodule.foo: 5
no dynmodule.baz attribute was defined
dynmodule.func() returned: 42
Setting the '__builtins__' entry to None in the execdict dictionary prevents the code from directly executing any built-in functions, like __import__, and so makes running it safer. You can ease that restriction by selectively adding things to it you feel are OK and/or required.
It's also possible to add your own predefined utilities and attributes which you'd like made available to the code thereby creating a custom execution context for it to run in. That sort of thing can be useful for implementing a "plug-in" or other user-extensible architecture.
you could use exec or eval to execute python code as a string. see here, here and here
The documentation for imp.load_source says (my emphasis):
The file argument is the source file, open for reading as text, from the beginning. It must currently be a real file object, not a user-defined class emulating a file.
... so you may be out of luck with this method, I'm afraid.
Perhaps eval would be enough for you in this case?
This sounds like a rather surprising requirement, though - it might help if you add some more to your question about the problem you're really trying to solve.

How do I change the name of an imported library?

I am using jython with a third party application. The third party application has some builtin libraries foo. To do some (unit) testing we want to run some code outside of the application. Since foo is bound to the application we decided to write our own mock implementation.
However there is one issue, we implemented our mock class in python while their class is in java. Thus to use their code one would do import foo and foo is the mock class afterwards. However if we import the python module like this we get the module attached to the name, thus one has to write foo.foo to get to the class.
For convenience reason we would love to be able to write from ourlib.thirdparty import foo to bind foo to the foo-class. However we would like to avoid to import all the classes in ourlib.thirdparty directly, since the loading time for each file takes quite a while.
Is there any way to this in python? ( I did not get far with Import hooks I tried simply returning the class from load_module or overwriting what I write to sys.modules (I think both approaches are ugly, particularly the later))
edit:
ok: here is what the files in ourlib.thirdparty look like simplified(without magic):
foo.py:
try:
import foo
except ImportError:
class foo
....
Actually they look like this:
foo.py:
class foo
....
__init__.py in ourlib.thirdparty
import sys
import os.path
import imp
#TODO: 3.0 importlib.util abstract base classes could greatly simplify this code or make it prettier.
class Importer(object):
def __init__(self, path_entry):
if not path_entry.startswith(os.path.join(os.path.dirname(__file__), 'thirdparty')):
raise ImportError('Custom importer only for thirdparty objects')
self._importTuples = {}
def find_module(self, fullname):
module = fullname.rpartition('.')[2]
try:
if fullname not in self._importTuples:
fileObj, self._importTuples[fullname] = imp.find_module(module)
if isinstance(fileObj, file):
fileObj.close()
except:
print 'backup'
path = os.path.join(os.path.join(os.path.dirname(__file__), 'thirdparty'), module+'.py')
if not os.path.isfile(path):
return None
raise ImportError("Could not find dummy class for %s (%s)\n(searched:%s)" % (module, fullname, path))
self._importTuples[fullname] = path, ('.py', 'r', imp.PY_SOURCE)
return self
def load_module(self, fullname):
fp = None
python = False
print fullname
if self._importTuples[fullname][1][2] in (imp.PY_SOURCE, imp.PY_COMPILED, imp.PY_FROZEN):
fp = open( self._importTuples[fullname][0], self._importTuples[fullname][1][1])
python = True
try:
imp.load_module(fullname, fp, *self._importTuples[fullname])
finally:
if python:
module = fullname.rpartition('.')[2]
#setattr(sys.modules[fullname], module, getattr(sys.modules[fullname], module))
#sys.modules[fullname] = getattr(sys.modules[fullname], module)
if isinstance(fp, file):
fp.close()
return getattr(sys.modules[fullname], module)
sys.path_hooks.append(Importer)
As others have remarked, it is such a plain thing in Python that the import statement iself has a syntax for that:
from foo import foo as original_foo, for example -
or even import foo as module_foo
Interesting to note is that the import statemente binds a name to the imported module or object ont he local context - however, the dictionary sys.modules (on the moduels sys of course), is a live reference to all imported modules, using their names as a key. This mechanism plays a key role in avoding that Python re-reads and re-executes and already imported module , when running (that is, if various of yoru modules or sub-modules import the samefoo` module, it is just read once -- the subsequent imports use the reference stored in sys.modules).
And -- besides the "import...as" syntax, modules in Python are just another object: you can assign any other name to them in run time.
So, the following code would also work perfectly for you:
import foo
original_foo = foo
class foo(Mock):
...

Dynamically import a method in a file, from a string

I have a string, say: abc.def.ghi.jkl.myfile.mymethod. How do I dynamically import mymethod?
Here is how I went about it:
def get_method_from_file(full_path):
if len(full_path) == 1:
return map(__import__,[full_path[0]])[0]
return getattr(get_method_from_file(full_path[:-1]),full_path[-1])
if __name__=='__main__':
print get_method_from_file('abc.def.ghi.jkl.myfile.mymethod'.split('.'))
I am wondering if the importing individual modules is required at all.
Edit: I am using Python version 2.6.5.
From Python 2.7 you can use the importlib.import_module() function. You can import a module and access an object defined within it with the following code:
from importlib import import_module
p, m = name.rsplit('.', 1)
mod = import_module(p)
met = getattr(mod, m)
met()
You don't need to import the individual modules. It is enough to import the module you want to import a name from and provide the fromlist argument:
def import_from(module, name):
module = __import__(module, fromlist=[name])
return getattr(module, name)
For your example abc.def.ghi.jkl.myfile.mymethod, call this function as
import_from("abc.def.ghi.jkl.myfile", "mymethod")
(Note that module-level functions are called functions in Python, not methods.)
For such a simple task, there is no advantage in using the importlib module.
For Python < 2.7 the builtin method __ import__ can be used:
__import__('abc.def.ghi.jkl.myfile.mymethod', fromlist=[''])
For Python >= 2.7 or 3.1 the convenient method importlib.import_module has been added. Just import your module like this:
importlib.import_module('abc.def.ghi.jkl.myfile.mymethod')
Update: Updated version according to comments (I must admit I didn't read the string to be imported till the end and I missed the fact that a method of a module should be imported and not a module itself):
Python < 2.7 :
mymethod = getattr(__import__("abc.def.ghi.jkl.myfile", fromlist=["mymethod"]))
Python >= 2.7:
mymethod = getattr(importlib.import_module("abc.def.ghi.jkl.myfile"), "mymethod")
from importlib import import_module
name = "file.py".strip('.py')
# if Path like : "path/python/file.py"
# use name.replaces("/",".")
imp = import_module(name)
# get Class From File.py
model = getattr(imp, "classNameImportFromFile")
NClass = model() # Class From file
It's unclear what you are trying to do to your local namespace. I assume you want just my_method as a local, typing output = my_method()?
# This is equivalent to "from a.b.myfile import my_method"
the_module = importlib.import_module("a.b.myfile")
same_module = __import__("a.b.myfile")
# import_module() and __input__() only return modules
my_method = getattr(the_module, "my_method")
# or, more concisely,
my_method = getattr(__import__("a.b.myfile"), "my_method")
output = my_method()
While you only add my_method to the local namespace, you do load the chain of modules. You can look at changes by watching the keys of sys.modules before and after the import. I hope this is clearer and more accurate than your other answers.
For completeness, this is how you add the whole chain.
# This is equivalent to "import a.b.myfile"
a = __import__("a.b.myfile")
also_a = importlib.import_module("a.b.myfile")
output = a.b.myfile.my_method()
# This is equivalent to "from a.b import myfile"
myfile = __import__("a.b.myfile", fromlist="a.b")
also_myfile = importlib.import_module("a.b.myfile", "a.b")
output = myfile.my_method()
And, finally, if you are using __import__() and have modified you search path after the program started, you may need to use __import__(normal args, globals=globals(), locals=locals()). The why is a complex discussion.
This website has a nice solution: load_class. I use it like this:
foo = load_class(package.subpackage.FooClass)()
type(foo) # returns FooClass
As requested, here is the code from the web link:
import importlib
def load_class(full_class_string):
"""
dynamically load a class from a string
"""
class_data = full_class_string.split(".")
module_path = ".".join(class_data[:-1])
class_str = class_data[-1]
module = importlib.import_module(module_path)
# Finally, we retrieve the Class
return getattr(module, class_str)
Use importlib (2.7+ only).
The way I tend to to this (as well as a number of other libraries, such as pylons and paste, if my memory serves me correctly) is to separate the module name from the function/attribute name by using a ':' between them. See the following example:
'abc.def.ghi.jkl.myfile:mymethod'
This makes the import_from(path) function below a little easier to use.
def import_from(path):
"""
Import an attribute, function or class from a module.
:attr path: A path descriptor in the form of 'pkg.module.submodule:attribute'
:type path: str
"""
path_parts = path.split(':')
if len(path_parts) < 2:
raise ImportError("path must be in the form of pkg.module.submodule:attribute")
module = __import__(path_parts[0], fromlist=path_parts[1])
return getattr(module, path_parts[1])
if __name__=='__main__':
func = import_from('a.b.c.d.myfile:mymethod')
func()
How about this :
def import_module(name):
mod = __import__(name)
for s in name.split('.')[1:]:
mod = getattr(mod, s)
return mod

Categories

Resources