Execute bytecode .pyc from python code? - python

I have a bytecode file that declares functions and a logo. I also have a .py file from which I call the bytecode to output the logo and the strings in the functions. How do I go about actually executing the bytecode? I was able to disassemble it and see the assembly code. How can I actually run it?
question.py
import dis
import logo
def work_here():
    # execute the bytecode
    pass
def main():
    work_here()
if __name__ == '__main__':
    main()

Try something like:
import dis
code = 'some byte code'  # placeholder: a source string or a code object
b_code = dis.Bytecode(code)
exec(b_code.codeobj)

To import a .pyc file, you just do the same thing you do with a .py file: import spam will find an appropriately-placed spam.pyc (or rather, something like __pycache__/spam.cpython-36.pyc) just as it will find an appropriately-placed spam.py. Its top-level code gets run, any functions and classes get defined so you can call them, etc., exactly the same as with a .py file; the only difference is that there isn't source text to show for things like tracebacks or debugger stepping.
If you want to programmatically import a .pyc file by explicit path, or execute one without importing it, you again do the same thing you do with a .py file.
Look at the Examples in importlib. For example:
import importlib.util
path = 'bytecoderepo/myfile.pyc'
spec = importlib.util.spec_from_file_location('myfile', path)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
And now, the code in bytecoderepo/myfile.pyc has been executed, and the resulting module is available in the variable mod, but it isn't in sys.modules or stored as a global.
If you actually need to dig into the .pyc format and, e.g., extract the bytecode of some function so you can exec it (or build a function object out of it) without executing the main module code, the details are only documented in the source, and subject to change between Python versions. Start with importlib; being able to (validate and) skip over the header and marshal.loads the body may be as far as you need to learn, but probably not (since ultimately, that's what the module loader already does for you in the sample code above, so if that's not good enough, you need to get deeper into the internals).
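As a rough sketch of the "skip the header and marshal.loads the body" route: the 16-byte header size below is an assumption that holds for CPython 3.7 and later (older versions used shorter headers), and the logo module contents are made up purely for the demo.

```python
import importlib.util
import marshal
import os
import py_compile
import tempfile

HEADER_SIZE = 16  # assumption: CPython 3.7+ header (magic, flags, then mtime+size or hash)

def load_code_from_pyc(path):
    """Return the top-level code object stored in a .pyc file without running it."""
    with open(path, "rb") as f:
        data = f.read()
    # The first four bytes must match the running interpreter's magic number.
    if data[:4] != importlib.util.MAGIC_NUMBER:
        raise ImportError("%s was compiled by a different Python version" % path)
    return marshal.loads(data[HEADER_SIZE:])

# Demo: compile a throwaway module (hypothetical contents), then load it by hand.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "logo.py")
    with open(src, "w") as f:
        f.write("LOGO = '<logo>'\ndef greet():\n    return 'hello from bytecode'\n")
    pyc = py_compile.compile(src, cfile=os.path.join(tmp, "logo.pyc"))
    code = load_code_from_pyc(pyc)

namespace = {}
exec(code, namespace)  # runs the module body, which defines LOGO and greet
print(namespace["greet"]())
```

The code objects of the individual functions sit among code.co_consts, which is where you would look if you wanted one function without running the module body.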

Related

Get Path To File of Caller From Within Library

I want to be able to get the path to the file which is importing my python library. How can I do that?
For example:
The user creates a file at C:\Users\Bob\bobsproject\main.py. From within the library, I want to be able to get the path to the file, and read it as a txt. How can I do that?
If you want to get the name of the driver script that is (possibly indirectly) loading your library, you can use the fact that python runs a script under the name __main__. You can get it from sys.modules just like any other module and access its __file__ attribute if it exists:
import sys
try:
    print(sys.modules['__main__'].__file__)
except KeyError:
    print('library not loaded from script')
except AttributeError:
    print('script not loaded from file')
The KeyError is unlikely to ever occur (not even if you run the script with python -m), but it's useful to be safe. The AttributeError is much more likely, and can easily be demonstrated with something like python -c.
If you want something more complex, like the file containing the code that actually called your library function, you will likely have to use the inspect module or similar. This will be even less robust as a matter of course, but may still suit your needs:
import inspect
module = inspect.getmodule(inspect.stack()[1][0])
try:
    print(module.__file__)
except AttributeError:
    print(f'module "{module.__name__}" not loaded from file')
Notice that inspect.getmodule explicitly uses the word "guess" in its official documentation, while inspect.stack can be a fidgety beast sometimes.
Code for second part referenced from here: https://stackoverflow.com/a/1095621/2988730.
Remember that there are two options here. If you place this code directly in your library module, it will be executed exactly once, when the module is first imported. If you place it in a function that the user can call directly, you will see the printouts every time. If you place the second snippet in a utility function that you then call from your public module functions, don't forget to increment the frame index to reflect that:
module = inspect.getmodule(inspect.stack()[2][0])
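To make the frame arithmetic concrete, here is a small sketch of the utility-function arrangement. The names _caller_file and public_api are made up for illustration, and it reads the filename from the stack entry directly instead of going through inspect.getmodule.

```python
import inspect

def _caller_file(depth=2):
    """Return the filename of the frame `depth` levels above this helper."""
    # stack()[0] is this helper, [1] is the public function, [2] is its caller.
    return inspect.stack()[depth].filename

def public_api():
    """Hypothetical library function that wants to know who called it."""
    return _caller_file()

print(public_api())
```

If another layer of wrapping is added between the public function and the helper, the depth has to grow by one again.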

How can I get the directory from a script called by another script in python via a function imported [duplicate]

When writing throwaway scripts it's often needed to load a configuration file, image, or some such thing from the same directory as the script. Preferably this should continue to work correctly regardless of the directory the script is executed from, so we may not want to simply rely on the current working directory.
Something like this works fine if defined within the same file you're using it from:
from os.path import abspath, dirname, join
def prepend_script_directory(s):
    here = dirname(abspath(__file__))
    return join(here, s)
It's not desirable to copy-paste or rewrite this same function into every module, but there's a problem: if you move it into a separate library, and import as a function, __file__ is now referencing some other module and the results are incorrect.
We could perhaps use this instead, but it seems like the sys.argv may not be reliable either.
import sys
def prepend_script_directory(s):
    here = dirname(abspath(sys.argv[0]))
    return join(here, s)
How to write prepend_script_directory robustly and correctly?
I would personally just os.chdir into the script's directory whenever I execute it. It is just:
import os
os.chdir(os.path.split(__file__)[0])
However if you did want to refactor this thing into a library, you are in essence wanting a function that is aware of its caller's state. You thus have to make it
prepend_script_directory(__file__, blah)
If you just wanted to write
prepend_script_directory(blah)
you'd have to do cpython-specific tricks with stack frames:
import inspect
def getCallerModule():
    # gets globals of module called from, and prints out __file__ global
    print(inspect.currentframe().f_back.f_globals['__file__'])
I think the reason it doesn't smell right is that $PYTHONPATH (or sys.path) is the proper general mechanism to use.
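For the explicit-argument variant mentioned above, where the caller passes its own __file__ in, a minimal sketch could look like this. The example path and settings.ini are placeholders, not anything from the question.

```python
import os

def prepend_script_directory(caller_file, name):
    """Join `name` onto the directory that contains `caller_file`."""
    here = os.path.dirname(os.path.abspath(caller_file))
    return os.path.join(here, name)

# In a real script the call would be prepend_script_directory(__file__, "settings.ini").
# A made-up path stands in for __file__ here so the snippet runs anywhere.
fake_script = os.path.join(os.sep, "projects", "demo", "main.py")
example = prepend_script_directory(fake_script, "settings.ini")
print(example)
```

This costs one extra argument at every call site, but it needs no stack inspection and behaves the same on any Python implementation.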
You want pkg_resources
import pkg_resources
foo_fname = pkg_resources.resource_filename(__name__, "foo.txt")

python, trouble with calling functions from a module

I imported a module as below:
filename = "email"
mymodule = __import__('actions.'+filename)
The problem I have with this is that the file is immediately executed, and I would much rather execute a specific function from the file (that way I can pass variables to it).
I am basically working with plugins, so it works.
Edit:
for the time being, I am not concerned with whether or not the script executes when I add the line below:
mymodule = __import__('actions.'+filename)
but when I add the line below, I would like the function to execute. Instead I get an error that the module doesn't have that function, even though it exists in the script.
mymodule.dosomething(n)
Edit:
I personally don't think that the function has anything to do with it, but here is one of the Python files that I am trying to open.
import webbrowser
def OpenEmail():
    handle = webbrowser.get()
    handle.open('http://gmail.google.com')
OpenEmail()
print "Your email has been opened"
The functions don't exist unless the module executes. You can't have it both ways. Perhaps you need to add a main stanza to the module.
The problem is that __import__ returns the top-level actions package, not the submodule. Try this:
mymodule = __import__('actions.'+filename)
for submodule in filename.split('.'):
    mymodule = getattr(mymodule, submodule)
This happens whenever you import a submodule this way: for module.something.somethingelse, you get module returned.
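An alternative to walking the attributes by hand is importlib.import_module, which returns the leaf module directly. The sketch below uses the standard library's json.decoder as a stand-in for the actions.email plugin, since that package is not available here.

```python
import importlib

# __import__ with a dotted name returns the top-level package.
package_result = __import__("json.decoder")
print(package_result.__name__)

# importlib.import_module returns the submodule that was actually named.
leaf = importlib.import_module("json.decoder")
print(leaf.__name__)

# Look a callable up by name, as a plugin loader might do.
decoder_class = getattr(leaf, "JSONDecoder")
print(decoder_class().decode("[1, 2]"))
```

With the plugin layout from the question, the same pattern would be importlib.import_module('actions.' + filename) followed by a getattr for the function to call.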

How to concatenate multiple Python source files into a single file?

(Assume that: application start-up time is absolutely critical; my application is started a lot; my application runs in an environment in which importing is slower than usual; many files need to be imported; and compilation to .pyc files is not available.)
I would like to concatenate all the Python source files that define a collection of modules into a single new Python source file.
I would like the result of importing the new file to be as if I imported one of the original files (which would then import some more of the original files, and so on).
Is this possible?
Here is a rough, manual simulation of what a tool might produce when fed the source files for modules 'bar' and 'baz'. You would run such a tool prior to deploying the code.
import sys
import types
__file__ = 'foo.py'
def create_module(name):
    mod = types.ModuleType(name)
    mod.__file__ = __file__
    sys.modules[name] = mod
    return mod
def _bar_module():
    def hello():
        print('Hello World! BAR')
    mod = create_module('foo.bar')
    mod.hello = hello
    return mod
bar = _bar_module()
del _bar_module
def _baz_module():
    def hello():
        print('Hello World! BAZ')
    mod = create_module('foo.bar.baz')
    mod.hello = hello
    return mod
baz = _baz_module()
del _baz_module
And now you can:
from foo.bar import hello
hello()
This code doesn't take account of things like import statements and dependencies. Is there any existing code that will assemble source files using this, or some other technique?
This is a very similar idea to the tools used to assemble and optimise JavaScript files before sending them to the browser, where the latency of multiple HTTP requests hurts performance. In this Python case, it's the latency of importing hundreds of Python source files at startup that hurts.
If this is on google app engine as the tags indicate, make sure you are using this idiom
def main():
    # do stuff
    pass
if __name__ == '__main__':
    main()
Because GAE doesn't restart your app every request unless the .py has changed, it just runs main() again.
This trick lets you write CGI style apps without the startup performance hit
AppCaching
"If a handler script provides a main() routine, the runtime environment also caches the script. Otherwise, the handler script is loaded for every request."
I think that due to the precompilation of Python files and some system caching, the speed up that you'll eventually get won't be measurable.
Doing this is unlikely to yield any performance benefits. You're still importing the same amount of Python code, just in fewer modules - and you're sacrificing all modularity for it.
A better approach would be to modify your code and/or libraries to only import things when needed, so that a minimum of required code is loaded for each request.
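Importing only when needed usually means moving the import statement into the function that uses it. A tiny sketch, where export_report is a made-up function and json stands in for whatever heavy dependency you have:

```python
def export_report(data):
    """Serialise `data` to JSON; json is imported on first call, not at module load."""
    import json  # deferred import: the cost is paid only when this function is used
    return json.dumps(data, sort_keys=True)

print(export_report({"status": "ok"}))
```

After the first call the module is already in sys.modules, so repeated calls only pay for a dictionary lookup.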
Setting aside the question of whether this technique would actually speed things up in your environment (say you are right), here is what I would do.
I would make a list of all my modules e.g.
my_files = ['foo', 'bar', 'baz']
I would then use the os.path utilities to read all lines in all files under the source directory and write them into a new file, filtering out all import foo|bar|baz lines since all the code is now within a single file.
Of course, finish by adding the main() from __init__.py (if there is one) at the tail of the file.
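A very naive sketch of that approach follows. The function name concatenate_modules and the two demo modules are invented for illustration, and it only handles the simple case where the modules import names from each other with from ... import.

```python
import os
import re
import tempfile

def concatenate_modules(src_dir, module_names, out_path):
    """Naively join the given modules into one file, dropping imports of each other."""
    # Matches "import foo" / "from foo import x" only for the listed module names.
    names = "|".join(re.escape(n) for n in module_names)
    internal_import = re.compile(r"^\s*(?:import|from)\s+(?:%s)\b" % names)
    with open(out_path, "w") as out:
        for name in module_names:
            with open(os.path.join(src_dir, name + ".py")) as src:
                for line in src:
                    if not internal_import.match(line):
                        out.write(line)
            out.write("\n")

# Demo with two tiny hypothetical modules, listed in dependency order.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "bar.py"), "w") as f:
        f.write("def hello():\n    return 'bar'\n")
    with open(os.path.join(tmp, "baz.py"), "w") as f:
        f.write("from bar import hello\n\ndef greet():\n    return hello() + '!'\n")
    combined_path = os.path.join(tmp, "combined.py")
    concatenate_modules(tmp, ["bar", "baz"], combined_path)
    with open(combined_path) as f:
        combined_source = f.read()

combined_ns = {}
exec(combined_source, combined_ns)
print(combined_ns["greet"]())
```

Code that refers to a module by name (import bar followed by bar.hello()) would break once the import line is removed, and name clashes between modules are not detected, so treat this as a starting point only.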

python refresh/reload

This is a very basic question - but I haven't been able to find an answer by searching online.
I am using python to control ArcGIS, and I have a simple python script, that calls some pre-written code.
However, when I make a change to the pre-written code, it does not appear to result in any change. I import this module, and have tried refreshing it, but nothing happens.
I've even moved the file it calls to another location, and the script still works fine. One thing I did yesterday was add the folder where all my Python files are to the sys path (using sys.path.append('path')), and I wonder if that made a difference.
Thanks in advance, and sorry for the sloppy terminology.
It's unclear what you mean by "refresh", but the normal behavior of Python is that you need to restart the software for it to reread a changed Python module.
If your changes aren't picked up even after a restart, then this is due to one of two errors:
The timestamp on the pyc-file is incorrect and some time in the future.
You are actually editing the wrong file.
You can also re-read a module without restarting the software by calling reload(). Note that any name imported from the module will need to be re-imported after the reload. Something like this:
import themodule
from themodule import AClass
reload(themodule)
from themodule import AClass
One way to do this is to call reload.
Example: Here is the contents of foo.py:
def bar():
return 1
In an interactive session, I can do:
>>> import foo
>>> foo.bar()
1
Then in another window, I can change foo.py to:
def bar():
return "Hello"
Back in the interactive session, calling foo.bar() still returns 1, until I do:
>>> reload(foo)
<module 'foo' from 'foo.py'>
>>> foo.bar()
'Hello'
Calling reload is one way to ensure that your module is up-to-date even if the file on disk has changed. It's not necessarily the most efficient (you might be better off checking the last modification time on the file or using something like pyinotify before you reload), but it's certainly quick to implement.
One reason that Python doesn't read from the source module every time is that loading a module is (relatively) expensive -- what if you had a 300kb module and you were just using a single constant from the file? Python loads a module once and keeps it in memory, until you reload it.
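On Python 3 the built-in reload is gone and importlib.reload takes its place. The following self-contained sketch reproduces the foo.py session above with a temporary file. The module name reload_demo_foo is made up, and bytecode writing is switched off only so the demo does not depend on cached .pyc files.

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True  # keep the demo independent of cached .pyc files

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "reload_demo_foo.py")  # hypothetical module name
    with open(path, "w") as f:
        f.write("def bar():\n    return 1\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()

    import reload_demo_foo
    from reload_demo_foo import bar as old_bar
    first = reload_demo_foo.bar()

    # Change the file on disk, then ask Python to re-read it.
    with open(path, "w") as f:
        f.write("def bar():\n    return 'Hello'\n")
    importlib.reload(reload_demo_foo)

    second = reload_demo_foo.bar()   # sees the new definition
    stale = old_bar()                # still the function object from before the reload
    sys.path.remove(tmp)

print(first, second, stale)
```

The last value shows why names pulled in with from ... import have to be imported again after a reload: they keep pointing at the old objects.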
If you are running in an IPython shell, then there are some magic commands that exist.
The IPython docs cover this feature called the autoreload extension.
Originally, I found this solution from Jonathan March's blog posting on this very subject (see point 3 from that link).
Basically all you have to do is the following, and changes you make are reflected automatically after you save:
In [1]: %load_ext autoreload
In [2]: %autoreload 2
In [3]: import MODULE
In [4]: my_class = MODULE.SomeClass()
   ...: my_class.printham()
Out[4]: ham
In [5]: # make changes to printham and save
In [6]: my_class.printham()
Out[6]: hamlet
I used the following when importing all objects from within a module to ensure web2py was using my current code:
import buttons
import table
reload(buttons)
reload(table)
from buttons import *
from table import *
I'm not really sure that is what you mean, so don't hesitate to correct me. You are importing a module - let's call it mymodule.py - in your program, but when you change its contents, you don't see the difference?
Python will not look for changes in mymodule.py each time it is used; it loads it the first time, compiles it to bytecode and keeps it in memory. It will normally also save the compiled bytecode (mymodule.pyc). The next time you start your program, it checks whether mymodule.py is more recent than mymodule.pyc and recompiles it if necessary.
If you need to, you can reload the module explicitly:
import mymodule
[... some code ...]
if userAskedForRefresh:
    reload(mymodule)
Of course, it is more complicated than that, and you may get side effects depending on what your program does with the other module, for example if variables depend on classes defined in mymodule.
Alternatively, you could use the execfile function (or exec(), eval(), compile())
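Note that execfile only exists in Python 2. On Python 3 roughly the same effect can be had by combining compile and exec, as in this sketch. The helper name run_file and the temporary script are made up for the demo.

```python
import os
import tempfile

def run_file(path, namespace=None):
    """Compile and execute the file at `path`, returning the namespace it ran in."""
    if namespace is None:
        namespace = {"__name__": "__main__", "__file__": path}
    with open(path) as f:
        code = compile(f.read(), path, "exec")
    exec(code, namespace)
    return namespace

# Demo: the file is read from disk on every call, so edits are always picked up.
with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "settings_script.py")
    with open(script, "w") as f:
        f.write("ANSWER = 6 * 7\n")
    result = run_file(script)

print(result["ANSWER"])
```

Unlike import, nothing is cached in sys.modules here, which is exactly why the changes show up, and also why it is slower if called often.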
I had the exact same issue creating a geoprocessing script for ArcGIS 10.2. I had a python toolbox script, a tool script and then a common script. I have a parameter for Dev/Test/Prod in the tool that would control which version of the code was run. Dev would run the code in the dev folder, test from test folder and prod from prod folder. Changes to the common dev script would not run when the tool was run from ArcCatalog. Closing ArcCatalog made no difference. Even though I selected Dev or Test it would always run from the prod folder.
Adding reload(myCommonModule) to the tool script resolved this issue.
The details differ between versions of Python.
The following shows an example for Python 3.4 or above:
import hello
from hello import hello_world
# Calls the hello_world function
hello_world()
HI !!
# Now changes are made and a reload is needed
import importlib
importlib.reload(hello)
from hello import hello_world
hello_world()
How are you?
For earlier Python versions like 2.x, use the built-in reload function as stated above.
Better still, use ipython3, as it provides the autoreload feature.
