I have Sage 4.7.1 installed and have run into an odd problem. Many of my older scripts that use functions like deepcopy() and uniq() no longer recognize them as global names. I have been able to fix this by importing the Python modules one by one, but this is quite tedious. Yet when I start the command-line Sage interface, I can type "list2=deepcopy(list1)" without importing the copy module, and it works fine. How is it possible that command-line Sage recognizes the global name 'deepcopy', but a script I load that uses the same name does not?
Oops, sorry, not familiar with Stack Overflow yet. I type "sage_4.7.1/sage" to start the command-line interface; then I type "load jbom.py" to load up all the functions I defined in a Python script. When I use one of the functions from the script, it runs for a few seconds (complex function), then hits a spot where I use some function that Sage normally has as a global name (deepcopy, uniq, etc.), but for some reason the script I loaded does not know what the function is. And to reiterate, my script jbom.py used to work the last time I was working on this particular research, just as I described.
It also makes no difference whether I use 'load jbom.py' or 'import jbom'. Both methods get the functions I defined in my script (though I have to prefix them with jbom. in the second case) and both give the same error about 'deepcopy' not being a global name.
REPLY TO DSM: I have been sloppy about describing the problem, for which I am sorry. I have created a new script 'experiment.py' that has "import jbom" as its first line. Executing the function in experiment.py recognizes the functions in jbom.py but deepcopy is not recognized. I tried loading jbom.py as "load jbom.py" and I can use the functions just like I did months ago. So, is this all just a problem of layering scripts without proper usage of import/load etc?
SOLVED: I added "from sage.all import *" to the beginning of jbom.py and now I can load experiment.py and execute the functions calling jbom.py functions without any problems. From the Sage doc on import/load I can't really tell what I was doing wrong exactly.
Okay, here's what's going on:
You can only import files ending with .py (ignoring .py[co]). These are standard Python files and aren't preparsed, so 1/3 == int(0), not QQ(1)/QQ(3), and you don't have the equivalent of a from sage.all import * to play with.
You can load and attach both .py and .sage files (as well as .pyx and .spyx and .m). Both have access to Sage definitions but the .py files aren't preparsed (so y=17 makes y a Python int) while the .sage files are (so y=17 makes y a Sage Integer).
So import jbom here works just like it would in Python, and you don't get access to what Sage has put in scope. load etc. are handy but they don't scale up so well to larger programs. I've proposed improving this in the past and making .sage scripts less of second-class citizens, but there hasn't yet been the mix of agreement on what to do and energy to do it. In the meantime your best bet is to import from sage.all.
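For example, a minimal sketch of what the top of jbom.py needs to look like (the helper function below is just a placeholder for your own code):
# jbom.py -- a plain .py file, so it is not preparsed and gets no Sage globals
# This line pulls in everything Sage normally puts in scope (deepcopy, uniq, ...).
from sage.all import *
def copy_of(list1):          # hypothetical helper, just for illustration
    return deepcopy(list1)   # deepcopy now resolves, whether the file is loaded or imported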
I am aware that there are multiple libraries for both languages (R/Python) to call modules from the other one. I am looking for a way to have the backend of my code running in Python, mainly because of .pyc files and speed, and the front end running in R so I can have a Shiny app. I couldn't find a way to set up a Python engine for the backend. If anyone knows how to do it in R/RStudio, please respond.
I don't have any good benchmarks on its speed, but the reticulate package is the best way I know of to pass data to and from a Python script without using a webserver. It lets you import Python objects into R, where they act like R objects, accepting arguments and returning values.
I had a few wonky problems when I just wanted to run functions from a single file: it ran into trouble with import statements and with multiple functions that called each other. What worked well was to run the import statements separately (see the sapply() statement below) and to merge all the code in my Python script into a single object. This worked nicely and seemed about as fast as running it in Python normally (though I haven't done any real benchmarking).
library(reticulate)
use_python(python = '/usr/bin/python') # optionally specify python location
# Import statements are here, not in the file
sapply(c("import mysql.connector", "import re"), py_run_string)
# File contains only the definition of class MismatchFinder
source_python("python_script.py")
# Now we can call on that python object from R
result <- MismatchFinder()$find_mismatch(arg1, arg2)
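For reference, a hypothetical sketch of what python_script.py contains in this setup (the class and method names match the R call above; the body is just illustrative):
# python_script.py -- only the class definition lives here; the imports it
# needs (mysql.connector, re) are run separately from R via py_run_string.
class MismatchFinder:
    def find_mismatch(self, arg1, arg2):
        # illustrative body: report the items of arg1 that are missing from arg2
        return [item for item in arg1 if item not in arg2]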
My impression is that it might be simpler if you make your python code into a module and load it with: py_module <- import_from_path('my_python_module', path = 'PATH') but I didn't try that.
Hope this helps!
I believe what you are looking for is the code below. It will run a python script in R.
system('python3 file_name.py')
My script loads a lot of variables (which takes about a minute) and uses them globally in many functions. Every time I call that script in IPython, it loads them all again, which takes time.
I tried to take these calls to load and populate functions out of that script, but then these global variables are not available to the functions in the script.
It gives a NameError: name 'clf' is not defined error message.
Is there a good way to refactor this code so that these globals stay in memory and the script can use them? The script loads many variables like these, and uses them in other functions as globals.
(vectorizer_title, vectorizer_desc, clf,
 df_instance, vocab, all_tokens, df_dist_all,
 df_soc2class_proba, dict_p2s,
 dict_f2m, token_pattern, cleanup_pattern,
 excluded_words) = load_data_and_model(lang)
(dict_token2idx_all, dict_token2idx_instance,
 dist_array, token_dist_to_instance_min,
 dict_bigram_by_instance, denominate,
 similar_threshold) = populate_data(1)
I had asked this question after trying
from depended_library import *
which had not worked in IPython.
But when run with plain Python, and when used in a Flask web API, it works.
Importing the library with the from statement also executes the module-level code in depended_library (the code outside of functions), in addition to defining the functions.
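A minimal sketch of that layout, assuming a structure like mine (the loader body here is only a stand-in for the real, slow one):
# depended_library.py
def load_data_and_model(lang):
    # stands in for the real loading work that takes about a minute
    return "title-vectorizer", "desc-vectorizer", "classifier"

# Module-level code: runs exactly once, the first time the module is imported,
# and binds the globals that the functions below rely on.
vectorizer_title, vectorizer_desc, clf = load_data_and_model("en")

def classify(text):
    # functions defined here see the module-level globals, so clf is available
    return (clf, text)

# In the Flask app, `from depended_library import *` runs the lines above once
# per process, so later calls find clf already loaded in memory.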
(If someone explains the problem with iPython and suggest a solution, I shall select it as answer.)
Following up on this question and its answer, I am trying to create an SPSS custom function in Python, to use in SPSS syntax.
I have a program that works well in a syntax:
begin program.
import spss, spssaux, sys
def CustomFunction():
    pass  # function code here
CustomFunction()
end program.
But I want CustomFunction() to be available in "normal" SPSS syntax.
Standard syntax does not use Python code nor does it add Python functions to, say, the transformation system. If you have a Python module, its contents can be used directly by just importing that module. And don't worry about importing it multiple times as Python is smart about ignoring redundant imports.
If you want to use Python code as regular syntax, you need to create an extension command. These have standard-style syntax but are implemented in Python (or R or Java). There are many of these either installed with Statistics or installable from the Utilities menu (or the Extensions menu in V24). Information on how to write these is in the Python help doc and related topics in the Help system.
As an example, the SPSSINC TRANS extension command makes Python functions available as transformation code. If CustomFunction, say, creates or transforms a variable casewise and is stored in mymodule.py, it might be invoked like this.
SPSSINC TRANS RESULT=newvar
/FORMULA "mymodule.CustomFunction(x,y,z)".
Extension commands are automatically loaded when Statistics starts, so there is no need for a startup script.
I'm sure someone has come across this before, but it was hard thinking of how to search for it.
Suppose I have a file generate_data.py and another plot_utils.py which contains a function for plotting this data.
Of note, generate_data.py takes a long time to run and I would like to only have to run it once. However, I haven't finished working out the kinks in plot_utils.py, so I have to run this a bunch of times.
It seems in Spyder that when I run generate_data (be it in the current console or in a new dedicated Python interpreter), modifying plot_utils.py and then calling "from plot_utils import plotter" in the console doesn't pick up the changes -- I mean there's no error, but it's clear the changes haven't been made.
I guess I kind of want cell mode between different .py files.
EDIT: After being forced to formulate exactly what I want, I think I got around this by putting "from plot_utils import plotter" followed by "plotter(foo)" inside a cell in generate_data.py. I am now wondering if there is a more elegant solution.
SECOND EDIT: Actually, the method mentioned in the edit above does not work as I said it did. Still looking for a method.
You need to reload the module (reload works on modules, not on functions imported from them), then re-import the function:
# Python 2.7
import plot_utils
plot_utils = reload(plot_utils)
from plot_utils import plotter
or
# Python 3.x
from importlib import reload
import plot_utils
plot_utils = reload(plot_utils)
from plot_utils import plotter
The reason I want to do this is that I want to use the tool pyobfuscate to obfuscate my Python code. But pyobfuscate can only obfuscate one file.
I've answered your direct question separately, but let me offer a different solution to what I suspect you're actually trying to do:
Instead of shipping obfuscated source, just ship bytecode files. These are the .pyc files that get created, cached, and used automatically, but you can also create them manually by just using the compileall module in the standard library.
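For example, something along these lines byte-compiles a whole source tree (the directory name is a placeholder; on Python 3 the legacy flag keeps the .pyc files next to where the .py files were, which is what a sourceless distribution needs):
import compileall
# Compile every .py under myproject/ to .pyc, rewriting even up-to-date files.
compileall.compile_dir('myproject', force=True, legacy=True)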
A .pyc file with its .py file missing can be imported just fine. It's not human-readable as-is. It can of course be decompiled into Python source, but the result is… basically the same result you get from running an obfuscator on the original source. So, it's slightly better than what you're trying to do, and a whole lot easier.
You can't compile your top-level script this way, but that's easy to work around. Just write a one-liner wrapper script that does nothing but import the real top-level script. If you have if __name__ == '__main__': code in there, you'll also need to move that to a function, and the wrapper becomes a two-liner that imports the module and calls the function… but that's as hard as it gets. Alternatively, you could run pyobfuscate on just the top-level script, but really, there's no reason to do that.
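The wrapper really can be this small (realmain and main are placeholder names for your actual top-level module and its entry function):
# run.py -- the only file shipped as plain source; everything else is .pyc
import realmain
realmain.main()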
In fact, many of the packager tools can optionally do all of this work for you automatically, except for writing the trivial top-level wrapper. For example, a default py2app build will stick compiled versions of your own modules, along with stdlib and site-packages modules you depend on, into a pythonXY.zip file in the app bundle, and set up the embedded interpreter to use that zipfile as its stdlib.
There are definitely ways to turn a tree of modules into a single module. But it's not going to be trivial. The simplest thing I can think of is this:
First, you need a list of modules. This is easy to gather with the find command or a simple Python script that does an os.walk.
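An os.walk version might look like this (the project directory name is a placeholder):
import os
# Collect every .py file under the project root.
py_files = []
for dirpath, dirnames, filenames in os.walk('myproject'):
    for name in filenames:
        if name.endswith('.py'):
            py_files.append(os.path.join(dirpath, name))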
Then you need to use grep or Python re to get all of the import statements in each file, and use that to topologically sort the modules. If you only do absolute flat import foo statements at the top level, this is a trivial regex. If you also do absolute package imports, or from foo import bar (or from foo import *), or import at other levels, it's not much trickier. Relative package imports are a bit harder, but not that big of a deal. Of course if you do any dynamic importing, use the imp module, install import hooks, etc., you're out of luck here, but hopefully you don't.
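Under the simplest assumption (flat, top-level import foo only), the dependency scan and the topological sort could be sketched like this:
import re

IMPORT_RE = re.compile(r'^\s*import\s+(\w+)', re.MULTILINE)

def find_imports(path):
    # Return the names of modules imported with a flat "import foo" statement.
    with open(path) as f:
        return IMPORT_RE.findall(f.read())

def topo_sort(deps):
    # deps maps module name -> list of module names it imports (your own modules only).
    # Returns the modules with dependencies first; assumes there are no cycles.
    ordered, seen = [], set()
    def visit(name):
        if name in seen or name not in deps:
            return
        seen.add(name)
        for dep in deps[name]:
            visit(dep)
        ordered.append(name)
    for name in deps:
        visit(name)
    return ordered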
Next you need to replace the actual import statements. With the same assumptions as above, this can be done with a simple sed or re.sub, something like import\s+(\w+) with \1 = sys.modules['\1'].
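In Python that replacement is a one-liner (same flat-import assumption as above):
import re

def rewrite_imports(source):
    # Turn "import foo" into "foo = sys.modules['foo']" in the module's source.
    return re.sub(r'import\s+(\w+)', r"\1 = sys.modules['\1']", source)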
Now, for the hard part: you need to transform each module into something that creates an equivalent module object dynamically. I think what you want to do is to escape the entire module code so that it can be put into a triple-quoted string, then do this:
import sys
import types
mod_globals = {}
exec('''
# escaped version of original module source goes here
''', mod_globals)
mod = types.ModuleType(module_name)  # module_name is the module's name as a string
mod.__dict__.update(mod_globals)
sys.modules[module_name] = mod
Now just concatenate all of those transformed modules together. The result will be almost equivalent to your original code, except that it's doing the equivalent of import foo; del foo for all of your modules (in dependency order) right at the start, so the startup time could be a little slower.
You can make a tool that:
Reads through your source files and puts all identifiers in a set.
Subtracts all identifiers from recursively searched standard- and third party modules from that set (modules, classes, functions, attributes, parameters).
Subtracts some explicitly excluded identifiers from that set as well, since they may be used with getattr/setattr/exec/eval.
Replaces the remaining identifiers with gibberish (a rough sketch of these steps follows this list).
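Here is a rough, hypothetical sketch of those steps using only the standard tokenize module; it skips step 2 (scanning imported standard and third-party modules) and is nowhere near production quality:
import builtins
import io
import keyword
import tokenize

def obfuscate(source, excluded=()):
    # Collect identifiers, protect keywords/builtins/explicitly excluded names,
    # and consistently rename everything else to gibberish.
    protected = set(keyword.kwlist) | set(dir(builtins)) | set(excluded)
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    seen = {t.string for t in tokens if t.type == tokenize.NAME}
    mapping, out = {}, []
    for tok in tokens:
        text = tok.string
        if tok.type == tokenize.NAME and text not in protected:
            if text not in mapping:
                candidate = "_obf_%d" % len(mapping)
                while candidate in seen or candidate in protected:
                    candidate += "_"   # avoid clashing with names already in the source
                mapping[text] = candidate
            text = mapping[text]
        out.append((tok.type, text))
    return tokenize.untokenize(out)

print(obfuscate("def greet(who):\n    return 'hello ' + who\n"))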
Or you can use this tool I wrote that does exactly that.
To obfuscate multiple files, use it as follows:
For safety, backup your source code and valuable data to an off-line medium.
Put a copy of opy_config.txt in the top directory of your project.
Adapt it to your needs according to the remarks in opy_config.txt.
This file only contains plain Python and is exec’ed, so you can do anything clever in it.
Open a command window, go to the top directory of your project and run opy.py from there.
If the top directory of your project is e.g. ../work/project1 then the obfuscation result will be in ../work/project1_opy.
Further adapt opy_config.txt until you’re satisfied with the result.
Type ‘opy ?’ or ‘python opy.py ?’ (without the quotes) on the command line to display a help text.
I think you can try using the find command with the -exec option.
You can execute all Python scripts in a directory with the following command:
find . -name "*.py" -exec python {} ';'
Hope this helps.
EDIT:
Oh sorry, I overlooked that if you obfuscate the files separately they may not run properly, because the tool renames functions to different names in different files.