How to pass an R variable to a Python variable in PyCharm? - python

I am new to PyCharm; however, I want to take advantage of my R and Python knowledge. I am a big fan of both languages, and I am constantly learning more about them.
I am hoping to pass an R variable to a Python variable, similar to what Jupyter Notebook allows.
I could not find any example code anywhere showing how to do this.
R code
x <- 5
python code
# Some conversion method needs to be added
print(x)
Python Console
>>>5

This is possible because Jupyter provides its own kernel that the code runs in. This kernel is able to translate variables between languages.
PyCharm does not provide a kernel; instead it executes Python code directly through an interpreter on your system, similar to running python my_script.py. AFAIK vanilla PyCharm does not execute R at all.
There are plugins for PyCharm that support R and Jupyter notebooks. You might be able to find one that does what you want.
I usually solve this problem by simply adhering to the Unix philosophy:
Rscript foo.R | python bar.py
where foo.R prints the data to STDOUT and bar.py expects to read it from STDIN (you could also read/write from files). This approach of course requires you to define a clear interface of data between the scripts, rather than passing all the variables indiscriminately, which I don't personally mind because I do that anyway as a matter of design best practices.
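As a minimal sketch of that pattern (the file names come from the command above; assume foo.R ends with something like cat(x, sep = "\n")), bar.py just reads whatever arrives on STDIN:
```
# bar.py -- read one numeric value per line from STDIN, as printed by foo.R
import sys

values = [float(line) for line in sys.stdin if line.strip()]
print(values)
```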
There are also packages like reticulate which will allow you to call one language from the other. This might be more useful if you like to switch languages a lot.

Thanks for the above answer! That is great. I did find a solution that could help other people who use PyCharm.
If you have installed the R plugin, you can use the following code.
Python code to save the data to a Feather file:
```
import pandas as pd
import pyarrow.feather as feather

data = pd.read_csv('C:/Users/data.csv')
feather.write_feather(data, 'C:/Users/data.feather')
```
R code to read the Feather file back:
```
library(tidyverse)
library(feather)
library(arrow)
data <- arrow::read_feather('C:/Users/data.feather')  # read_feather() reads Feather files
print(data)
```
However, this process is very similar to writing a CSV from Python and then loading that CSV into R; the main advantages of Feather are lighter-weight storage, faster reads and writes, and column types/classes that survive the round trip between languages. Official documentation:
GitHub: https://github.com/wesm/feather
Apache Arrow: https://arrow.apache.org/docs/python/install.html
RStudio: https://www.rstudio.com/blog/feather/

Related

Transferring data from a C buffer to Python for plotting with Matplotlib in Visual Studio 2019

I have C code successfully running in Visual Studio 2019 that fills a buffer with real time data from an FPGA. It is declared as...
unsigned char *pBuffer;
...in C. I can see that the data is correct in a memory watch window, at the address of pBuffer.
I have also installed Python 3.7 in Visual Studio and am successfully plotting with Matplotlib, using Python.h, from a Python array.
So my question is: how do I transfer the data from the C buffer to a Python array for plotting?
I have looked at ctypes and since my main code is in C, it does not make much sense to go to Python from C then call C again. I have looked at bytes(), bytearray() and memoryview(). Memoryview appeals to me, because it does not copy data and this plotting needs to be very fast as the data comes in very fast. (think oscilloscope) But it does not seem to be real physical addresses that it works with, rather some kind of identifier that does not correspond to any memory location where my data is. I simply want to plot the data that I know exists in a C buffer (1D array) at a specific address. Python seems to be very restrictive.
I can't seem to get anything to do what I want to do, since Python disallows reading data from some specific memory location apparently. This being the case, I wonder how it might examine memory to display the content in any way, let alone transfer the content to a Python array. How is this done? And yes I am a Pythonic Newbie. Thanks for any help or suggestions.
My first idea would be to pass the data in a file. It will add a few milliseconds to the interchange, but that might not even be perceptible after loading the Python interpreter and the matplotlib module.
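As a rough sketch of the file approach (assuming the C side dumps the buffer verbatim as raw unsigned bytes into a file; the file name and dtype here are assumptions):
```
# Read a raw byte dump written by the C program and plot it.
import numpy as np
import matplotlib.pyplot as plt

data = np.fromfile("capture.bin", dtype=np.uint8)  # one element per unsigned char
plt.plot(data)
plt.show()
```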
If you needed more immediate response, maybe a local loopback network connection would be useful. That's more trouble on the C side than on the Python side, especially on Windows.
As for reading directly from memory, you're right. Standard Python won't do that. You could use C-based modules to do whatever you're doing now from a C main program to get data out of your FPGA. If it's all stock Windows API stuff, you might want to take a look at Mark Hammond's PyWin32 module. That can be installed from the Python Package Index using pip install pywin32 from an elevated command prompt. It might support the APIs you need. If it does, then you might not need a separate C program to get the data out.
PS: Another option for interprocess communication is a named pipe. Windows named pipes can be opened with PyWin32. See the top-voted answer to this SO question.
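A minimal sketch of the Python end of such a pipe with PyWin32 (the pipe name is hypothetical; the C program would create the pipe with CreateNamedPipe and write the buffer into it):
```
# Read a chunk of bytes from a Windows named pipe created by the C program.
import win32file

handle = win32file.CreateFile(
    r"\\.\pipe\fpga_data",        # hypothetical pipe name chosen by the C side
    win32file.GENERIC_READ,
    0, None,
    win32file.OPEN_EXISTING,
    0, None)
hr, chunk = win32file.ReadFile(handle, 4096)  # read up to 4096 bytes
win32file.CloseHandle(handle)
```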

How do I convert code of another language into Python?

I am wondering how I can convert Stata code into Python code.
For example, my Stata code looks like
if ("`var1'"=="") {
local l_QS "select distinct CountryName from `l_tableName'"
ODBCLoad, exec("`l_QS'") dsn("`dsn'") clear
}
And I want to convert it to Python code such as
if var1 == "":
    l_QS = f"select distinct CountryName from {l_tableName}"
    SQL_read(l_QS, dsn=dsn)
I am new to coding so I don't know what branch of computer science knowledge or what tools/techniques are relevant. I suppose knowledge about compilers and/or using regular expressions may help so I put those tags on my question. Any high-level pointers are appreciated, and specific code examples would be even better. Thanks in advance.
A very simple workaround would be to use the subprocess module included with Python and write a basic command-line wrapper around your existing Stata scripts so you can keep using their functionality, then build your new code in Python from now on.
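For example, a thin wrapper could call Stata in batch mode with subprocess (the executable name and batch flags below are assumptions and vary by platform and Stata edition, e.g. something like "StataMP-64.exe /e do ..." on Windows):
```
# Run an existing Stata do-file from Python, then continue the workflow in Python.
import subprocess

subprocess.run(["stata", "-b", "do", "legacy_analysis.do"], check=True)
```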
You could also look into possible API functionality in Stata if you have a whole lot of Stata code and it would take forever to convert it manually to python. This would require you to have access to a server and could be potentially costly, but would be cleaner than the subprocess module and wouldn't require the source code to be contained on your local machine. Also note that it's possible that Stata does not have tools to build an API.
As far as I am aware there are no projects that will directly parse a file from an arbitrary language and convert it into Python. This would be a huge project, although maybe with machine learning or AI it would be possible, though still very difficult. There are libraries for wrapping code written in C and C++ (and others too, I'm sure; those are just the ones I know are available), but I can't find anything for Stata.

Running Python from R

I am aware that there are multiple libraries in both languages (R/Python) for calling modules from the other one. I am looking for a way to have the backend of my code running in Python, mainly because of .pyc and speed, and the front end running in R so I can have a Shiny app. I couldn't find a way to set up such a Python backend for R. If anyone knows how to do it in R/RStudio, please respond.
I don't have any good benchmarks on its speed, but the reticulate package is the best way I know of to pass data to and from a Python script without using a webserver. It lets you import Python objects into R, where they will act like R objects, accepting arguments and returning values.
There were a few wonky problems when I just wanted to run functions from a single file: it ran into trouble with import statements and with multiple functions that called on each other. What worked well was to run the import statements separately (see the sapply() statement below) and to merge all the code in my Python script into a single object. This worked nicely and seemed about as fast as running it in Python normally (though I haven't done any real benchmarking).
library(reticulate)
use_python(python = '/usr/bin/python') # optionally specify python location
# Import statements are here, not in the file
sapply(c("import mysql.connector", "import re"), py_run_string)
# File contains only the definition of class MismatchFinder
source_python("python_script.py")
# Now we can call on that python object from R
result <- MismatchFinder()$find_mismatch(arg1, arg2)
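For context, a hypothetical python_script.py matching that call might look like the sketch below (the class and method names simply mirror the R snippet above, and the body is a placeholder; the imports were run separately from R via py_run_string):
```
# python_script.py -- hypothetical sketch; contains only the MismatchFinder class.
class MismatchFinder:
    def find_mismatch(self, arg1, arg2):
        # placeholder logic: items present in one input but not the other
        return sorted(set(arg1) ^ set(arg2))
```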
My impression is that it might be simpler if you make your python code into a module and load it with: py_module <- import_from_path('my_python_module', path = 'PATH') but I didn't try that.
Hope this helps!
I believe what you are looking for is the code below. It will run a Python script from R.
system('python3 file_name.py')

Dangerous Python Keywords?

I am about to get a bunch of python scripts from an untrusted source.
I'd like to be sure that no part of the code can hurt my system, meaning:
(1) the code is not allowed to import ANY MODULE
(2) the code is not allowed to read or write any data, connect to the network etc
(the purpose of each script is to loop through a list, compute some data from input given to it and return the computed value)
Before I execute such code, I'd like to have a script 'examine' it and make sure that there's nothing dangerous there that could hurt my system.
I thought of using the following approach: check that the word 'import' is not used (so we are guaranteed that no modules are imported).
Yet, it would still be possible for the user (if desired) to write code to read/write files etc. (say, using open).
Then here comes the question:
(1) where can I get a 'global' list of python methods (like open)?
(2) Is there some code that I could add to each script that is sent to me (at the top) that would make some 'global' methods invalid for that script (for example, any use of the keyword open would lead to an exception)?
I know that there are some solutions of python sandboxing. but please try to answer this question as I feel this is the more relevant approach for my needs.
EDIT: Suppose that I make sure that no import is in the file, and that no potentially harmful methods (such as open, eval, etc.) are in it. Can I conclude that the file is SAFE? (Can you think of any other 'dangerous' ways that built-in methods can be run?)
This point hasn't been made yet, and should be:
You are not going to be able to secure arbitrary Python code.
A VM is the way to go unless you want security issues up the wazoo.
You can still obfuscate import without using eval:
s = '__imp'
s += 'ort__'
f = globals()['__builtins__'].__dict__[s]
** BOOM **
The official Python documentation lists all built-in functions and all keywords.
Note that you'll need to do things like look for both "file" and "open", as both can open files.
Also, as others have noted, this isn't 100% certain to stop someone determined to insert malicious code.
An approach that should work better than string matching is to use the ast module: parse the Python code, do your whitelist filtering on the tree (e.g. allow only basic operations), then compile and run the tree.
See this nice example by Andrew Dalke on manipulating ASTs.
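A minimal sketch of that idea (the whitelist below is purely illustrative and, as other answers here stress, not a vetted security boundary):
```
# Reject scripts containing anything outside a small AST whitelist, then run them.
import ast

ALLOWED_NODES = (
    ast.Module, ast.Expr, ast.Assign, ast.Name, ast.Load, ast.Store,
    ast.Constant, ast.BinOp, ast.Add, ast.Sub, ast.Mult, ast.Div,
)

def run_if_safe(source):
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError("disallowed construct: %s" % type(node).__name__)
    exec(compile(tree, "<untrusted>", "exec"), {"__builtins__": {}})

run_if_safe("x = 1 + 2")       # runs fine
# run_if_safe("import os")     # raises ValueError: disallowed construct: Import
```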
Built-in functions/keywords to watch for:
eval
exec
__import__
open
file
input
execfile
print can be dangerous if you have one of those dumb shells that execute code on seeing certain output
stdin
__builtins__
globals() and locals() must be blocked otherwise they can be used to bypass your rules
There's probably tons of others that I didn't think about.
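To illustrate the __builtins__ point: you can exec the code with a stripped-down builtins mapping, which blocks naive uses of open/eval/__import__ (the whitelist here is illustrative), but as the exploit shown just below demonstrates, it is still not a real security boundary:
```
# Run untrusted code with only a whitelisted set of builtins available.
untrusted = "result = sum(i * i for i in range(10))"

safe_builtins = {"sum": sum, "range": range, "len": len}   # illustrative whitelist
namespace = {"__builtins__": safe_builtins}
exec(untrusted, namespace)
print(namespace["result"])   # 285
```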
Unfortunately, crap like this is possible...
object().__reduce__()[0].__globals__["__builtins__"]["eval"]("open('/tmp/l0l0l0l0l0l0l','w').write('pwnd')")
So it turns out that keywords, import restrictions, and in-scope-by-default symbols alone are not enough to cover this; you need to verify the entire object graph...
Use a Virtual Machine instead of running it on a system that you are concerned about.
Without a sandboxed environment, it is impossible to prevent a Python file from doing harm to your system aside from not running it.
It is easy to create a Cryptominer, delete/encrypt/overwrite files, run shell commands, and do general harm to your system.
If you are on Linux, you should be able to use docker to sandbox your code.
For more information, see this GitHub issue: https://github.com/raxod502/python-in-a-box/issues/2.
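As a rough illustration of the docker route (the image name, mount path, and resource limits are assumptions, not a hardened configuration):
```
# Run an untrusted script in a throwaway container with no network access
# and a read-only mount. Flags are illustrative, not a complete sandbox.
import subprocess

subprocess.run([
    "docker", "run", "--rm",
    "--network", "none",            # no network access
    "--memory", "256m",             # cap memory usage
    "--pids-limit", "64",           # cap the number of processes
    "-v", "/path/to/scripts:/code:ro",
    "python:3.11-slim",
    "python", "/code/untrusted_script.py",
], check=True)
```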
I did come across this on GitHub, so something like it could be used, but that has a lot of limits.
Another approach would be to create another Python file which parses the original one, removes the bad code, and runs the file. However, that would still be hit-and-miss.

Run a C# application from python script

I've just about finished coding a decently sized disease transmission model in C#. However, I'm fairly new to .NET and am unsure how to proceed. Currently I just double-click on the .exe file and the model imports config setting from text files, does its thing, and outputs the results into a text file.
What I would like to do next is write a Python script to do the following:
Run the simulation N times (N > 1000)
After each run rename the output file and store (i.e. ./output.txt -> ./acc/outputN.txt)
Aggregate, parse, and analyze the outputs
Output the result in some clean format (possibly excel)
The majority of my programming experience to date has been in C/C++ on linux. I'm fairly confident about the last two items; however, I have no idea how to proceed for the first two. Here are some specific questions I'd like advice on:
What is the easiest/best way to run my C# .exe from a python script?
Does anyone have advice on the best way to do filesystem operations in Python on a Windows system?
Thanks!
As of Python 2.6+ you should be using the subprocess module: (Docs)
import subprocess
import shutil

for v in range(1000):
    cmdLine = r"c:\path\to\my\app.exe"
    subprocess.Popen(cmdLine).wait()                       # launch the simulation and wait for it to finish
    shutil.move("output.txt", "./acc/output-%d.txt" % v)   # store each run's output under a new name
The answer to your problems can be found in the os module in the Python standard library. Documentation for doing various operations, such as handling files and starting processes, can be found here.
Process management (Running your C# program) can be found here and file operations are here.
EDIT: Actually, instead of the above process link, you should use the subprocess module.
