Running Python from R - python

I am aware that there are multiple libraries for both languages (R/Python) to call modules from the other one. I am looking for a way to have the backend of my code running in python mainly because of .pyc and speed, and also the front end running in R so I can have a Shiny app. I couldn't find a way to make python machine for the backend. If anyone knows how to do it in R/Rstudio please respond.

I don't have any good benchmarks on it's speed, but the reticulate package is the best way I know of to pass data to and from a python script without using a webserver. It lets you import python objects into R where they will act like R objects, accepting arguments and returning values.
There were a few wonky problems that I had when I just wanted to run functions from a single file. It ran into problems with import statements and multiple functions that called on each other. What worked well was to run the import statements separately (see the sapply() statement below) and to merge all the code in my python script into a single object. This worked nicely and seemed about as fast as running it in python normally (though I haven't done any real benchmarking)
library(reticulate)
use_python(python = '/usr/bin/python') # optionally specify python location
# Import statements are here, not in the file
sapply(c("import mysql.connector", "import re"), py_run_string)
# File contains only the definition of class MismatchFinder
source_python("python_script.py")
# Now we can call on that python object from R
result <- MismatchFinder()$find_mismatch(arg1, arg2)
My impression is that it might be simpler if you make your python code into a module and load it with: py_module <- import_from_path('my_python_module', path = 'PATH') but I didn't try that.
Hope this helps!

I believe what you are looking for is the code below. It will run a python script in R.
system('python3 file_name.py')

Related

python script calling another script

I wrote a python script that works. The first line of my script is reading an hdf5 file
readFile = h5py.File('FileName_00','r')
After reading the file, my script does several mathematical operations, successfully working. In the output I got function F.
Now, I want to repeat the same script for different files. Basically, I only need to modify FileName_00 by FimeName_01 or ....FileName_10. I was thinking to create a script that call this script!
I never wrote a script that call another script, so any advice would be appreciable.
One option: turn your existing code into a function which takes a filename as an argument:
def myfunc(filename):
h5py.file(filename, 'r')
...
Now, after your existing code, call your function with the filenames you want to input:
myfunc('Filename_00')
myfunc('Filename_01')
myfunc('Filename_02')
...
Even more usefully, I definitely recommend looking into
if(__name__ == '__main__')
and argparse (https://docs.python.org/3/library/argparse.html) as jkr noted.
Also, if you put your algorithm in a function like this, you can import it and use it in another Python script. Very useful!
Although there are certainly many ways to achieve what you want without multiple python scripts, as other answerers have shown, here's how you could do it.
In python we have this function os.system (learn more about it here: https://docs.python.org/3/library/os.html#os.system). Simply put, you can use it like this:
os.system("INSERT COMMAND HERE")
Replacing INSERT COMMAND HERE with the command you use to run your python script. For example, with a script named script.py you could conceivably (depending on your environment) include the following line of code in a secondary python script:
os.system("python script.py")
Running the secondary python script would run script.py as well. FWIW, I don't necessarily think this is the best way to accomplish your goal -- I tend to agree with DraftyHat's solution in most circumstances. But in case you were curious, this is certainly an option in python. I've used this functionality in the past, albeit not to run other python scripts, but to execute commands in the shell. Hope this helps!

How to pass an R variable to a Python variable in Pycharm?

I am new to Pycharm; however, I want to take advantage of my R and Python knowledge. I am a big fan of both languages, and I am constantly learning more about them.
I am hoping to pass an R variable to a Python variable, similar to Jupyter Notebook.
I could not find any example code anywhere of doing this.
R code
x <- 5
python code
# Some conversion method needs to be added
print(x)
Python Console
>>>5
This is possible because Jupyter provides its own Kernel that code runs in. This kernel is able to translate variables between languages.
Pycharm does not provide a kernel, and instead executes Python code directly through an interpreter on your system. It's similar to doing python my_script.py. AFAIK vanilla Pycharm does not execute R at all.
There are plugins for Pycharm that support R and Jupyter notebooks. You might be able to find one that does what you want.
I usually solve this problem by simply adhering to the Unix philosophy:
Rscript foo.R | python bar.py
where foo.R prints the data to STDOUT and bar.py expects to read it from STDIN (you could also read/write from files). This approach of course requires you to define a clear interface of data between the scripts, rather than passing all the variables indiscriminately, which I don't personally mind because I do that anyway as a matter of design best practices.
There are also packages like reticulate which will allow you to call one language from the other. This might be more useful if you like to switch languages a lot.
Thanks for the above answer! That is great. I did find solution that could help other people that use pycharm.
If you have installed the R plug in option, you can use the following code.
Python code to save to feather file:
```
import pyarrow.feather as feather
data = pd.read_csv('C:/Users/data.csv')
feather.write_feather(data, 'C:/Users/data.feather')
```
R code to retrieve feather file
```
library(tidyverse)
library(feather)
library(arrow)
data <- arrow::read_arrow('C:/Users/data.feather')
print(data)
```
However, this process seems very similar to writing a file to csv in Python, then uploading the csv into R. There could be some type of lightweight storage difference, speed processing, and language variable agnostic/class translation. Below is the official documentation: Git: https://github.com/wesm/feather
Apache Arrow: https://arrow.apache.org/docs/python/install.html
RStudio: https://www.rstudio.com/blog/feather/

Python calling function from DLL (WinDLL)

I trying a lot of methods, but any method work strange. Some return error, some return value, but not what I need to be returned ("0" rather than string).
So, I try to use Everything in python code. I make example tool, but when I start using API I stacked. Simply I can't call function or call it wrong.
My code you may see on Github (with comments).
To run it need dll, that you can also find on Github, Everything itself and python 3.5.
Run it in cmd by using command: fs [path] (i.e. C:\, but not used right now as I can't call function)
Code started at line 73 and ended at 126.
Rather than struggling with calling "Everything" I would suggest that you take a look at the python built-in library functions os.walk, fnmatch.fnmatch and glob.glob - they work very well to do what everything does with no external dependencies and a simple pythonic user calling interface.
Additionally, if you write your code using them, you code will work on every platform that supports python.

how to have one file work on objects generated by another in spyder

I'm sure someone has come across this before, but it was hard thinking of how to search for it.
Suppose I have a file generate_data.py and another plot_utils.py which contains a function for plotting this data.
Of note, generate_data.py takes a long time to run and I would like to only have to run it once. However, I haven't finished working out the kinks in plot_utils.py, so I have to run this a bunch of times.
It seems in spyder that when I run generate_data (be it in the current console or in a new dedicated python interpreter) that it doesn't allow me to modify plot_utils.py and call "from plot_utils import plotter" in the command line. -- I mean it doesn't have an error, but it's clear the changes haven't been made.
I guess I kind of want cell mode between different .py files.
EDIT: After being forced to formulate exactly what I want, I think I got around this by putting "from plot_utils import plotter" \n "plotter(foo)" inside a cell in generate_data.py. I am now wondering if there is a more elegant solution.
SECOND EDIT: actually the method mentioned above in the edit does not work as I said it did. Still looking for a method.
You need to reload it:
# Python 2.7
plotter = reload(plotter)
or
# Python 3.x
from imp import reload
plotter = reload(plotter)

python modules missing in sage

I have Sage 4.7.1 installed and have run into an odd problem. Many of my older scripts that use functions like deepcopy() and uniq() no longer recognize them as global names. I have been able to fix this by importing the python modules one by one, but this is quite tedious. But when I start the command-line Sage interface, I can type "list2=deepcopy(list1)" without importing the copy module, and this works fine. How is it possible that the command line Sage can recognize global name 'deepcopy' but if I load my script that uses the same name it doesn't recognize it?
oops, sorry, not familiar with stackoverflow yet. I type: 'sage_4.7.1/sage" to start the command line interface; then, I type "load jbom.py" to load up all the functions I defined in a python script. When I use one of the functions from the script, it runs for a few seconds (complex function) then hits a spot where I use some function that Sage normally has as a global name (deepcopy, uniq, etc) but for some reason the script I loaded does not know what the function is. And to reiterate, my script jbom.py used to work the last time I was working on this particular research, just as I described.
It also makes no difference if I use 'load jbom.py' or 'import jbom'. Both methods get the functions I defined in my script (but I have to use jbom. in the second case) and both get the same error about 'deepcopy' not being a global name.
REPLY TO DSM: I have been sloppy about describing the problem, for which I am sorry. I have created a new script 'experiment.py' that has "import jbom" as its first line. Executing the function in experiment.py recognizes the functions in jbom.py but deepcopy is not recognized. I tried loading jbom.py as "load jbom.py" and I can use the functions just like I did months ago. So, is this all just a problem of layering scripts without proper usage of import/load etc?
SOLVED: I added "from sage.all import *" to the beginning of jbom.py and now I can load experiment.py and execute the functions calling jbom.py functions without any problems. From the Sage doc on import/load I can't really tell what I was doing wrong exactly.
Okay, here's what's going on:
You can only import files ending with .py (ignoring .py[co]) These are standard Python files and aren't preparsed, so 1/3 == int(0), not QQ(1)/QQ(3), and you don't have the equivalent of a from sage.all import * to play with.
You can load and attach both .py and .sage files (as well as .pyx and .spyx and .m). Both have access to Sage definitions but the .py files aren't preparsed (so y=17 makes y a Python int) while the .sage files are (so y=17 makes y a Sage Integer).
So import jbom here works just like it would in Python, and you don't get the access to what Sage has put in scope. load etc. are handy but they don't scale up to larger programs so well. I've proposed improving this in the past and making .sage scripts less second-class citizens, but there hasn't yet been the mix of agreement on what to do and energy to do it. In the meantime your best bet is to import from sage.all.

Categories

Resources