There are lots of tutorials/instructions on how to embed Python in an application, but nothing (that I've seen) on the overall design: how the embedded interpreter should be used and how it should interact with the application.
The only idea I could think of would be to simply give the user a method (menu option, etc.) of executing scripts in the program. So certain classes, functions, objects, etc. would be exported to Python, some script would do something, then said script could be run from the program.
Would such a design be "safe"? Meaning, is it feasible for a malicious/poorly-written script to "damage" the program and/or computer? I assume it's possible, depending on the functions available to the script (e.g. it could try to overwrite some important files, etc.). How might one prevent that from happening? (e.g. script certification, program design, etc.)
This is implementation specific, but is it possible/feasible to have the effects of the script persist after it's done running? Meaning, if a script computes something, will the result be available to the program after execution of the script has finished? I think this is possible if the program were set up to interact with a specific script, but the program will be released before most scripts are written, and such a setup seems like a misuse of embedding a scripting language. Are there actually cases where you would want the result of a script's execution to be available, or is this a contrived situation that doesn't really occur?
Are there any other designs for embedding python?
What about using python in a way similar to a plugin architecture?
The only idea I could think of would be to simply give the user a method (menu option, etc) of executing scripts in the program.
Correct.
So certain classes, functions, objects, etc. would be exported to python, some script would do something, then said script could be run from the program.
Correct.
Would such a design be "safe?"
Yes. Unless your users are malicious, psychotic sociopaths. They want to make your program do useful things. They bought/downloaded the software in the first place. They think it has value.
They trusted your software. Why not trust them?
Meaning if a script computes something, will the result be available to the program after execution of the script has finished?
Programs like Apache do this all the time. You screw up the configuration ("script"), it crashes. Lesson learned? Don't screw up the configuration.
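To make the persistence question concrete, here's a minimal sketch (the names are invented for illustration) of the usual pattern: the application owns a namespace dict, runs each script against it with exec, and reads results back out afterwards:

script_env = {"exported_value": 42}  # objects the app exposes to scripts

def run_user_script(source):
    code = compile(source, "<user script>", "exec")  # nicer error messages
    exec(code, script_env)  # the script reads and writes the shared namespace

run_user_script("result = exported_value * 2")
print(script_env["result"])  # 84 - still available after the script finished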
Related
It is easy to create highly integrated code in Python, so the need to quit and restart an application when its code is changed is understandable.
However, there must surely exist some strategies and models for isolating parts of the code so it can be updated on the fly, without the need to quit and restart.
For the applications I am working on, many features will be independent background tasks that the main application will talk to: it will present their status information and also instruct them to perform tasks based on the current state. In many ways these background tasks can be seen as independent programs, just that they share some of the codebase with the main application and other tools, tasks, etc. built on top of it.
While it is probably hard to make the whole shebang live-updateable, I'm sure there must be ways to roll out updates and have the running code notice and update itself as needed.
Since I'm also keen on taking advantage of multithreading and asyncio (in Python 3.5), as well as exploring making things stateless, it seems logically possible to do some interesting stuff here that at least avoids some forced hard restarts when rolling out new code.
I would be very grateful for tips and pointers to information about how to make this work.
There's importlib.reload() (the built-in reload() in Python 2) that will reload a module, but it's easy to mess up. You'd have to be very careful about what holds references to objects created by the old version of the module, and then be sure to replace those when you reload the module.
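A minimal sketch of that pitfall (mymodule and Handler are made-up names):

import importlib
import mymodule

handler = mymodule.Handler()  # object created by the old version

importlib.reload(mymodule)    # the module's code is replaced...

# ...but handler is still an instance of the OLD Handler class, so
# isinstance(handler, mymodule.Handler) is now False. References like
# this one must be rebuilt after the reload:
handler = mymodule.Handler()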
I use Django a lot, and when it's running its web server in debug mode, it reloads the entire web server process whenever a source file changes. It's a nice compromise between a manual restart and reloading a single module.
I haven't used it, but watchdog might be helpful to monitor the file system for changes to trigger a reload.
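Again, I haven't used it, but based on watchdog's documented API a sketch might look like this (mymodule is hypothetical):

import importlib
from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

import mymodule  # the module we want to hot-reload

class ReloadOnChange(FileSystemEventHandler):
    def on_modified(self, event):
        if event.src_path.endswith("mymodule.py"):
            importlib.reload(mymodule)  # subject to the caveats above

observer = Observer()
observer.schedule(ReloadOnChange(), path=".", recursive=False)
observer.start()  # watches the file system in a background thread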
I have to write a daemon program that constantly runs in the background and performs some simple tasks. The logic is not complicated at all, however it has to run for extended periods of time and be stable.
I think C++ would be a good choice for writing this kind of application, however I'm also considering Python since it's easier to write and test something quickly in it.
The problem that I have with Python is that I'm not sure how its runtime environment is going to behave over extended periods of time. Can it eat up more and more memory because of some GC quirks? Can it crash unexpectedly? I've never written daemons in Python before, so if anyone here did, please share your experience. Thanks!
I've written a number of daemons in Python for my last company. The short answer is, it works just fine. As long as the code itself doesn't have some huge memory bomb, I've never seen any gradual degradation or memory hogging. Be mindful of anything in the global or class scopes, because they'll live on, so use del more liberally than you might normally. Otherwise, like I said, no issues I can personally report.
And in case you're wondering, they ran for months and months (let's say 6 months usually) between routine reboots with zero problems.
Yes it can leak. Yes it can crash unexpectedly. Anything can.
I'd say you're far more likely to end up accidentally leaking in an environment with manual memory management (e.g. C++) than you are with something like Python.
As for crashing unexpectedly, well, chances are an arbitrary lump of Python might be more likely to crash unexpectedly than an arbitrary lump of Java, because the latter benefits from static typing, where you can catch at compile time a whole load of errors that Python, with its duck typing and other forms of flexibility, cannot.
Realistically, Python sounds like a perfectly reasonable choice for what you want to do. Take a look at something like Twisted for a decent engine to build things around, or at least for an idea of structure. (Your question sounds like some sort of school assignment, so I'm not sure how much freedom of implementation you get.)
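For a flavour of what a Twisted-based daemon skeleton looks like, a minimal periodic task might be sketched like this (do_work is just a placeholder):

from twisted.internet import task, reactor

def do_work():
    print("performing the daemon's simple task...")  # placeholder job

loop = task.LoopingCall(do_work)
loop.start(60.0)  # call do_work every 60 seconds
reactor.run()     # run the event loop until stopped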
I've written many things in C/C++ and Perl that are started when a Linux box boots, launching them using rc.d.
I've also written a couple of Java and Python scripts that are started the same way, but I needed a little shell script (.sh file) to launch them, and I used rc.5.
Let me tell you that your concerns about their runtime environments are completely valid; you will have to be careful about which runlevel you use (only from rc.2 to rc.5, because rc.1 and rc.6 are for the system).
If the runlevel is too low, the Python runtime might not be up by the time you launch your program, and it could flop. E.g.: on a LAMP server, MySQL and Apache are started in rc.3, where the network is already available.
I think your best shot is to write your script in Python and launch it using a .sh file from rc.5.
Good luck!
I'm making a wxPython app that I will compile with the various freezing utilities out there to create an executable for multiple platforms.
The program will be a map editor for a tile-based game engine.
In this app I want to provide a scripting system so that advanced users can modify the behavior of the program, such as modifying project data, exporting the project to a different format, etc.
I want the system to work like so:
The user will place the Python script they wish to run into a styled textbox and then press a button to execute the script.
I'm good with this so far; that's all really simple stuff.
Obtain the script from the textbox as a string, compile it to a code object with the built-in function compile(), then execute the script with an exec statement:
script = textbox.text                     # store the script source as a string
code = compile(script, "script", "exec")  # make the code object
exec(code, globals())                     # run it
The thing is, I want to make sure that this feature can't cause any errors or bugs.
Say there is an import statement in the script: will this cause any problems, taking into account that the code has been compiled with something like py2exe or py2app?
How do I make sure that the user can't break critical parts of the program, like modifying part of the GUI, while still allowing them to modify the project data (the data is held in global properties in its own module)? I think this would mean modifying the globals dict that is passed to the eval function.
How do I make sure that this eval can't cause the program to hang due to a long or infinite loop?
How do I make sure that an error raised inside the user's code can't crash the whole app?
Basically, how do I avoid all those problems that can arise when allowing the user to run their own code?
EDIT: Concerning the answers given
I don't feel like any of the answers so far have really answered my questions.
Yes, they have been answered in part, but not completely. I'm well aware that it is impossible to completely stop unsafe code. People are just too clever for one man (or even a team) to think of all the ways to get around a security system and prevent them.
In fact, I don't really care if they do. I'm more worried about someone unintentionally breaking something they didn't know about. If someone really wanted to, they could tear the app to shreds with the scripting functionality, but I couldn't care less. It will be their instance, and all the problems they create will be gone when they restart the app, unless they have messed with files on the HD.
I want to prevent the problems that arise when the user does something stupid:
things like IOErrors, SyntaxErrors, InfiniteLoopErrors, etc.
Now, the part about scope has been answered. I now understand how to define what functions and globals can be accessed from the eval function.
But is there a way to make sure that the execution of their code can be stopped if it is taking too long?
A green-thread system, perhaps? (Green because it would be evil to make users worry about thread safety.)
Also, if a user uses an import statement to load a module, even from the standard library, that isn't used in the rest of the app, could this cause problems with the app being frozen by Py2exe, Py2app, or Freeze? What if they import a module from outside the standard library? Would it be enough for the module to be present in the same directory as the frozen executable?
I would like to get these answers without creating a new question, but I will if I must.
Easy answer: don't.
You can forbid certain keywords (import) and operations, and accesses to certain data structures, but ultimately you're giving your power users quite a bit of power. Since this is for a rich client that runs on the user's machine, a malicious user can crash or even trash the whole app if they really feel like it. But it's their instance to crash. Document it well and tell people what not to touch.
That said, I've done this sort of thing for web apps that execute user input and yes, call eval like this:
eval(code, {"__builtins__": None}, safe_functions)
where safe_functions is a dictionary containing {"name": func} type pairs of functions you want your users to be able to access. If there's some essential data structure that you're positive your users will never want to poke at, just pop it out of globals before passing them in.
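A runnable sketch of that pattern (the exposed functions here are just examples):

safe_functions = {"len": len, "max": max}  # the whitelist you choose

result = eval("max(len('hello'), 3)", {"__builtins__": None}, safe_functions)
print(result)  # 5

# Anything outside the whitelist is simply not there; this would raise
# NameError: name 'open' is not defined:
# eval("open('/etc/passwd')", {"__builtins__": None}, safe_functions)

# As stressed above, this is NOT a real sandbox - determined users can
# still escape it - but it stops casual poking at things you hide.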
Incidentally, Guido addressed this issue on his blog a while ago. I'll see if I can find it.
Edit: found.
Short Answer: No
Is using eval in Python a bad practice?
Other related posts:
Safety of Python 'eval' For List Deserialization
It is not easy to create a safety net. There are too many details, and clever hacks are around:
Python: make eval safe
On your design goals:
It seems you are trying to build an extensible system by allowing users to modify a lot of behavior and logic.
The easiest option is to ask them to write a script which you can evaluate (eval) during the program run.
However, a good design describes and scopes the flexibility, and provides scripting mechanisms through various design schemes, ranging from configuration and plugins to full scripting capabilities. A well-defined scripting API can provide more meaningful extensibility. It is safer, too.
I'd suggest providing some kind of plug-in API and allowing users to provide plug-ins in the form of text files. You can then import them as modules into their own namespace, catching syntax errors in the process, and call the various functions defined in the plug-in module, again checking for errors. You can provide an API module that defines the functions/classes from your program that the plug-in module has access to. That gives you the freedom to make changes to your application's architecture without breaking plug-ins, since you can just adapt the API module to expose the functionality in the same way.
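A sketch of that approach with importlib (the paths, names, and export hook are invented for illustration):

import importlib.util
import traceback

def load_plugin(path, name):
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    try:
        spec.loader.exec_module(module)  # runs the plug-in's top level
    except Exception:                    # includes SyntaxError
        traceback.print_exc()            # report it, don't crash the app
        return None
    return module

plugin = load_plugin("plugins/exporter.py", "exporter")
if plugin is not None and hasattr(plugin, "export"):
    plugin.export()  # call the hook the plug-in defines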
If you have the option to switch to Tkinter, you can use the bundled Tcl interpreter to process your script. For that matter, you can probably do that with a wxPython app if you don't start the Tk event loop; just use the Tcl interpreter without creating any windows.
Since the Tcl interpreter is a separate thing, it should be nearly impossible to crash the Python interpreter if you are careful about what commands you expose to Tcl. Plus, Tcl makes creating DSLs very easy.
Python - the only scripting language with a built-in scripting engine :-).
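A minimal sketch of driving that bundled Tcl interpreter from Python (the exposed command is made up, and arguments arrive from Tcl as strings):

from tkinter import Tcl

tcl = Tcl()  # a Tcl interpreter with no Tk windows created

def get_tile(x, y):
    return "grass"  # hypothetical hook into the app's map data

tcl.createcommand("get_tile", get_tile)  # expose a Python function to Tcl
print(tcl.eval("get_tile 1 2"))          # user script runs in Tcl: grass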
I am basically from the world of C language programming, now delving into the world of scripting languages like Ruby and Python.
I am wondering how to do debugging.
At present the steps I follow are:
1. I complete a large script,
2. Comment everything but the portion I want to check,
3. Execute the script.
Though this works, I am not able to debug the way I would in, say, a VC++ environment or something like that.
My question is, is there any better way of debugging?
Note: I guess it may be a duplicate question; if so, please point me to the answer.
Your sequence seems entirely backwards to me. Here's how I do it:
I write a test for the functionality I want.
I start writing the script, executing bits and verifying test results.
I review what I'd done to document and publish.
Specifically, I execute before I complete. Waiting until the script is complete is way too late.
There are debuggers, of course, but with good tests and good design, I've almost never needed one.
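As a minimal illustration of that workflow in Python (the function and its behaviour are invented for the example):

import unittest

def normalize(name):
    # 2. the bit of functionality I'm currently writing
    return name.strip().lower()

class TestNormalize(unittest.TestCase):
    # 1. the test I wrote first, stating what I want
    def test_strips_and_lowers(self):
        self.assertEqual(normalize("  Alice "), "alice")

if __name__ == "__main__":
    unittest.main()  # 3. execute bits and verify results as I go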
Here's a screencast on ruby debugging with ruby-debug.
Seems like the problem here is that your environment (Visual Studio) doesn't support these languages, not that these languages don't support debuggers in general.
Perl, Python, and Ruby all have fully-featured debuggers; you can find other IDEs that help you, too. For Ruby, there's RubyMine; for Perl, there's Komodo. And that's just off the top of my head.
There is a nice gentle introduction to the Python debugger here
If you're working with Python then you can find a list of debugging tools here to which I just want to add Eclipse with the Pydev extension, which makes working with breakpoints etc. also very simple.
"My question is, is there any better way of debugging?"
Yes.
Your approach, "1. I complete a large script, 2. Comment everything but the portion I want to check, 3. Execute the script" is not really the best way to write any software in any language (sorry, but that's the truth.)
Do not write a large anything. Ever.
Do this.
1. Decompose your problem into classes of objects.
2. For each class, write the class by:
2a. Outline the class, focus on the external interface, not the implementation details.
2b. Write tests to prove that interface works.
2c. Run the tests. They'll fail, since you only outlined the class.
2d. Fix the class until it passes the test.
2e. At some point, you'll realize your class designs aren't optimal. Refactor your design, ensuring your tests still pass.
3. Now, write your final script. It should be short. All the classes have already been tested.
3a. Outline the script. Indeed, you can usually write the script.
3b. Write some test cases that prove the script works.
3c. Run the tests. They may pass. You're done.
3d. If the tests don't pass, fix things until they do.
Write many small things. It works out much better in the long run than writing a large thing and commenting parts of it out.
Scripting languages are no different from other languages in the sense that you still have to break your problem into manageable pieces -- that is, functions. So, instead of testing the whole script after finishing the whole script, I prefer to test the small functions before integrating them. TDD always helps.
There's a SO question on Ruby IDEs here - and searching for "ruby IDE" offers more.
I complete a large script
That's what caught my eye: "complete", to me, means "done", "finished", "released". Whether or not you write tests before writing the functions that pass them, or whether or not you write tests at all (and I recommend that you do), you should not be writing code that can't be run (which is a test in itself) until it's become large. Ruby and Python offer a multitude of ways to write small, individually testable (or executable) pieces of code, so that you don't have to wait for (?) days before you can run the thing.
I'm building a (Ruby) database translation/transformation script at the moment - it's up to about 1000 lines and still not done. I seldom go more than 5 minutes without running it, or at least running the part on which I'm working. When it breaks (I'm not perfect, it breaks a lot ;-p) I know where the problem must be - in the code I wrote in the last 5 minutes. Progress is pretty fast.
I'm not asserting that IDEs/debuggers have no place: some problems don't surface until a large body of code is released: it can be really useful on occasion to drop the whole thing into a debugging environment to find out what is going on. When third-party libraries and frameworks are involved it can be extremely useful to debug into their code to locate problems (which are usually - but not always - related to faulty understanding of the library function).
You can debug your Python scripts using the included pdb module. If you want a visual debugger, you can download winpdb - don't be put off by that "win" prefix, winpdb is cross-platform.
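For reference, the two basic ways to use pdb (the script and function names are invented):

# Run a whole script under the debugger:
#     python -m pdb myscript.py
# Or drop into the debugger at a specific point in the code:
import pdb

def compute(values):
    total = 0
    for v in values:
        total += v
    pdb.set_trace()  # pauses here: inspect total, step with n, continue with c
    return total

compute([1, 2, 3])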
The debugging method you described is perfect for a static language like C++, but given how different the language is, the coding methods are similarly different. One of the most important things in a dynamic language such as Python or Ruby is the interactive toplevel (what you get by typing, say, python on the command line). This means that running a part of your program is very easy.
Even if you've written a large program before testing (which is a bad idea), it is hopefully separated into many functions. So, open up your interactive toplevel, do an import thing (for whatever thing happens to be), and then you can easily start testing your functions one by one, just calling them on the toplevel.
Of course, for a more mature project, you probably want to write out an actual test suite, and most languages have a method to do that (in Python, this is doctest and nose; I don't know about other languages). At first, though, when you're writing something not particularly formal, just remember a few simple rules of debugging dynamic languages:
Start small. Don't write large programs and test them. Test each function as you write it, at least cursorily.
Use the toplevel. Running small pieces of code in a language like Python is extremely lightweight: fire up the toplevel and run it. Compare with writing a complete program and then compile-running it in, say, C++. Use the fact that you can quickly check the correctness of any function.
Debuggers are handy. But often, so are print statements. If you're only running a single function, debugging with print statements isn't that inconvenient, and also frees you from dragging along an IDE.
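Since doctest came up above, here is what a doctest looks like; the examples live in the docstring and double as documentation:

def area(width, height):
    """Return the area of a rectangle.

    >>> area(3, 4)
    12
    """
    return width * height

if __name__ == "__main__":
    import doctest
    doctest.testmod()  # silent if all the examples pass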
There's a lot of good advice here; I recommend going through some best practices:
http://github.com/edgecase/ruby_koans
http://blog.rubybestpractices.com/
http://on-ruby.blogspot.com/2009/01/ruby-best-practices-mini-interview-2.html
(and read Greg Brown's book, it's superb)
You talk about large scripts. A lot of my workflow is working out logic in irb or the Python shell, then capturing it into a cascade of small, single-task-focused methods, with appropriate tests (not 100% coverage; more focus on edge and corner cases).
http://binstock.blogspot.com/2008/04/perfecting-oos-small-classes-and-short.html
I'm responsible for developing a large Python/Windows/Excel application used by a financial institution which has offices all round the world. Recently the regulations in one country have changed, and as a result we have been told that we need to create a "locked-down" version of our distribution.
After some frustrating conversations with my foreign counterpart, it seems that they are concerned that somebody might misuse the python interpreter on their computer to generate non-standard applications which might be used to circumvent security.
My initial suggestion was just to take away execute-rights on python.exe and pythonw.exe: Our application works as an Excel plugin which only uses the Python DLL. Those exe files are never actually used.
My counterpart remained concerned that somebody could make calls against the Python DLL: a hacker could exploit the "exec" function, for example, from another programming language or virtual machine capable of calling functions in Windows DLLs, such as VBA.
Is there something we can do to prevent the DLL we want installed from being abused? At this point I ran out of ideas. I need to find a way to ensure that Python will only run our authorized programs.
Of course, there is an element of absurdity to this question: Since the computers all have Excel and Word they all have VBA which is a well-known scripting language somewhat equivalent in capability to Python.
It obviously does not make sense to worry about Python when Excel's VBA is wide open; however, this is corporate politics, and it's my team who are proposing to use Python, so we need to prove that our stuff can be made reasonably safe.
"reasonably safe" defined arbitrarily as "safer than Excel and VBA".
You can't win that fight. Because the fight is over the wrong thing. Anyone can use any EXE or DLL.
You need to define "locked down" differently -- in a way that you can succeed.
You need to define "locked down" as "cannot change the .PY files" or "we can audit all changes to the .PY files". If you shift the playing field to something you can control, you stand a small chance of winning.
Get the regulations -- don't trust anyone else to interpret them for you.
Be absolutely sure what the regulations require -- don't listen to someone else's interpretation.
Sadly, Python is not very safe in this way, and this is quite well known. Its dynamic nature combined with the very thin layer over standard OS functionality makes it hard to lock things down. Usually the best option is to ensure the user running the app has limited rights, but I expect that is not practical. Another is to run it in a virtual machine where potential damage is at least limited to that environment.
Failing that, someone with knowledge of Python's C API could build you a bespoke .dll that explicitly limits the scope of that particular Python interpreter, vetting imports and file writes etc., but that would require quite a thorough inspection of what functionality you need and don't need and some equally thorough testing after the fact.
One idea is to run IronPython inside an AppDomain on .NET. See David W.'s comment here.
Also, there is a module you can import to restrict the interpreter from writing files. It's called safelite.
You can use that, and point out that even the inventor of Python was unable to break the security of that module! (More than twice.)
I know that writing files is not the only security you're worried about. But what you are being asked to do is stupid, so hopefully the person asking you will see "safe Python" and be satisfied.
Follow the geordi way.
Each suspect program is run in its own jailed account. Monitor system calls from the process (I know how to do this in Windows, but it is beyond the scope of this question) and kill the process if needed.
I agree that you should examine the regulations to see what they actually require. If you really do need a "locked down" Python DLL, here's a way that might do it. It will probably take a while and won't be easy to get right. BTW, since this is theoretical, I'm waving my hands over work that varies from massive to trivial :-).
The idea is to modify Python.DLL so it only imports .py modules that have been signed by your private key. It verifies this by using the public key to check a code signature stashed in a variable that your code-signing tool adds to each .py you trust.
Thus you have:
Build a private build from the Python source.
In your build, change the import statement to check for a code signature on every module when imported.
Create a code-signing tool that signs all the modules you use and assert are safe (aka trusted code). Something like __code_signature__ = "ABD03402340" (but cryptographically secure) in each module's __init__.py file.
Create a private/public key pair for signing the code and guard the private key.
You probably don't want to sign any of the modules with exec() or eval() type capabilities.
.NET's code signing model might be helpful here if you can use IronPython.
I'm not sure it's a great idea to pursue, but in the end you would be able to assert that your build of Python.DLL would only run the code that you have signed with your private key.
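A toy sketch of the verification step, using an HMAC as a stand-in for real public-key signing (all names are invented; a production version would need asymmetric crypto and a carefully guarded key):

import hashlib
import hmac

SIGNING_KEY = b"guard-this-key"  # stand-in for the private/public key pair

def sign_module(source_bytes):
    # what the code-signing tool would compute and stash in __code_signature__
    return hmac.new(SIGNING_KEY, source_bytes, hashlib.sha256).hexdigest()

def verify_module(source_bytes, claimed_signature):
    # what the patched import machinery would do before executing a module
    if not hmac.compare_digest(sign_module(source_bytes), claimed_signature):
        raise ImportError("code signature check failed; refusing to load")

source = b"def trusted():\n    return 42\n"
signature = sign_module(source)   # done once, by the signing tool
verify_module(source, signature)  # done at import time, by the patched DLL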