Not able to import pandas in R - python

I am calling a python script from R/shiny as:
system("python /Users/Downloads/Untitled3.py EMEA regulatory '10% productivity saves SOW'")
When run this way, the script is not able to import pandas.
But when I call the script directly from the terminal as:
python /Users/Downloads/Untitled3.py EMEA regulatory '10% productivity saves SOW'
it is able to import pandas. I suspect a Python version issue. I have Anaconda installed. Can anyone help me fix this?
Although it is probably not needed, the script starts with:
import pandas as pd
import numpy as np
import sys
from difflib import SequenceMatcher
##### More code#########

Problem
You have two Python installations: the default system Python and the Anaconda distribution.
A bare python command run from R resolves to the default system Python, which doesn't have the required packages.
Fix
Assuming Anaconda's Python is at /Users/<username>/anaconda/bin/python (the default Mac installation folder),
the R command that you should run is:
system("/Users/<username>/anaconda/bin/python /Users/Downloads/Untitled3.py EMEA regulatory '10% productivity saves SOW'")
This ensures that you are explicitly using Anaconda's Python binary, which will pick up pandas and the other libraries installed there.
Hope that helps!
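To confirm which interpreter each call actually uses, a small check script can help. This is a minimal sketch, not part of the original answer: the function name describe_interpreter is made up for illustration, and it only uses the standard library. Run it once through R's system() and once from the terminal, then compare the output.

```python
import shutil
import sys


def describe_interpreter():
    """Return the running interpreter and what a bare `python` resolves to."""
    return {
        # Full path of the interpreter executing this script.
        "running": sys.executable,
        # First `python` found on PATH, or None if there is none.
        "on_path": shutil.which("python"),
    }


if __name__ == "__main__":
    info = describe_interpreter()
    print("Running interpreter:", info["running"])
    print("First python on PATH:", info["on_path"])
```

If the two runs print different paths, the R session and the terminal are resolving python differently, which matches the diagnosis above.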

Related

Inconsistent results on Jupyter notebook and IntelliJ IDE: Python

I am trying to compute medcouple using robustats module in python.
https://github.com/FilippoBovo/robustats
The results in my Jupyter notebook and in the IntelliJ IDE do not match.
Result in my Jupyter notebook:
from robustats import medcouple
import numpy as np
x = np.array([9325.06, 6206.00, 10000.00, 9569.78])
print(medcouple(x))
-0.6442066420664204
Result in the IntelliJ IDE:
from robustats import medcouple
import numpy as np
x = np.array([9325.06, 6206.00, 10000.00, 9569.78])
print(medcouple(x))
nan
Has anyone come across this strange behaviour? Please let me know if I have to change a setting in the IDE.
I have made sure both are running against the same virtual env.
It is not reproducible for me in 2021.1.3 with robustats 0.1.7, Python 3.6, and a venv.
Could you please provide me with your setup details?
Please submit an issue at https://youtrack.jetbrains.com/issues/PY
with logs folder zipped from Help | Collect logs and Diagnostic Data and a screencast presenting the behaviour.
Information on how to use YouTrack: https://intellij-support.jetbrains.com/hc/en-us/articles/207241135-How-to-follow-YouTrack-issues-and-receive-notifications
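One way to gather the setup details requested above is to print the same report from both the notebook and the IDE and compare them. This is only a sketch: collect_setup is a hypothetical helper, and it assumes Python 3.8 or newer, because importlib.metadata does not exist in earlier versions.

```python
import platform
import sys
from importlib import metadata  # standard library from Python 3.8 onward


def collect_setup(packages):
    """Gather interpreter details and installed versions for the given packages."""
    details = {
        "executable": sys.executable,
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            details[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            details[name] = None  # not installed in this environment
    return details


if __name__ == "__main__":
    for key, value in collect_setup(["robustats", "numpy"]).items():
        print(key, "=", value)
```

Any line that differs between the two reports (interpreter path, Python version, or package version) is a candidate explanation for the differing results.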

Problems with reticulate in R studio and importing python modules

I'm trying to run reticulate and import Python modules within RStudio (specifically R Markdown). The R code chunk seems to do what is expected (i.e. install the Python modules) and does not produce any errors, but the Python code chunk does not do what is expected (i.e. import the installed packages). It produces no output and no errors, which is somewhat strange.
I've tried a fresh install of reticulate, the devtools version of reticulate, a fresh install of RStudio, and using the full conda path rather than the name, but none of these work. I'm at a loss to figure out what is going wrong. I've also searched Stack Overflow and tried various suggestions, but nothing seems to work. Additionally, I have miniconda and Python installed (and Python packages and scripts run perfectly fine). If anyone is able to help, that would be fantastic.
(Apologies for formatting, the last backticks indicating the end of the code chunks aren't showing up properly)
## R code chunk
```
```{r}
library("reticulate")
#devtools::install_github("rstudio/reticulate")
conda_create("my_project_env")
py_install(packages = c("numpy","pandas","scikit-learn","matplotlib","seaborn","statsmodels"))
py_install(packages = c("IPython"))
# Either of these seem to "work" for installation
#conda_install(packages = c("numpy","pandas","scikit-learn","matplotlib","seaborn","statsmodels"))
#conda_install(packages = c("IPython"))
conda_list()
use_condaenv("my_project_env")
```
The python code chunk below seems to "run" but does not produce any output or errors (such as the python module could not be found) and I am unable to use the modules.
## Python code chunk
```
```{python}
# Main packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
```
It seems like the solution was to run the imports in an r studio code chunk in the following manner :
library("reticulate")
conda_create("my_project_env")
py_install(packages = c("numpy","pandas","scikit-learn","matplotlib","seaborn","statsmodels"))
conda_list()
use_condaenv("full_path_to_python_for my_project_env")
py_run_string('import numpy as np')
py_run_string('import pandas as pd')
py_run_string('import matplotlib.pyplot as plt')
py_run_string('import seaborn as sns')
From there, I was able to run the python functions in the python code chunks without issue.
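To see whether the Python session that reticulate started can actually find the packages, a check like the following can be pasted into a Python chunk. It is a sketch under assumptions: check_modules is a made-up helper name, and the module list simply mirrors the imports in the question.

```python
import importlib.util
import sys


def check_modules(names):
    """Map each top-level module name to True if this interpreter can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    for name, found in check_modules(["numpy", "pandas", "matplotlib", "seaborn"]).items():
        print(name, "found" if found else "NOT found")
```

If the interpreter path is not inside my_project_env, the chunk is running against a different Python than the one the packages were installed into.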

How to know if a script built in Python 3.8 works in Python 3.4

Short Question:
Is there a way to know whether a script written in Python 3.8 works the same in 3.4?
Long Question:
I wrote a script that uses a lot of if and else statements. When I built it in Python 3.8.2, I tested each line myself. But now I need to use py2exe (which only supports up to 3.4). So I want to uninstall my Python and install 3.4, but I'm not sure whether all my code will work with it. I started using Python with version 3.8, so I don't know what changed between 3.4 and 3.8.2.
These are the imports I use:
import pandas as pd
import eel
import bottle_websocket
import tkinter.filedialog
import xlrd
from configparser import ConfigParser
import os
import time
import xlsxwriter
import json
Because I have used so many if statements, I don't know whether some lines inside an if will be executed or not.
You can maintain two different Python environments on the same system, so there is no need to uninstall 3.8.
Try running the code in both environments and see if you get any warnings or errors.
Generally, not much will change for the imports you have shared.
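Running the code in both environments only exercises the branches that happen to execute. As a complement, a static scan can flag some newer syntax even inside untested if branches. The sketch below is an illustration, not a complete compatibility check: find_newer_syntax is a hypothetical helper, it must itself be run on Python 3.8 or newer, it covers only three syntax additions, and it says nothing about changed library behaviour or whether the third-party packages support 3.4.

```python
import ast

# Syntax node types added after Python 3.4, with the version that introduced them.
NEWER_SYNTAX = {
    ast.JoinedStr: "f-string (3.6+)",
    ast.AnnAssign: "variable annotation (3.6+)",
    ast.NamedExpr: "assignment expression := (3.8+)",
}


def find_newer_syntax(source):
    """Return sorted (line, description) pairs for constructs Python 3.4 cannot parse."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        label = NEWER_SYNTAX.get(type(node))
        if label is not None:
            findings.append((node.lineno, label))
    return sorted(findings)


if __name__ == "__main__":
    sample = 'name = "x"\nprint(f"hello {name}")\nif (n := len(name)) > 0:\n    pass\n'
    for line, label in find_newer_syntax(sample):
        print("line", line, ":", label)
```

To scan a real script, read its contents from the file and pass the text to find_newer_syntax.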

How to import pandas using R studio

So, just to be clear, I'm very new to python coding... so I'm not exactly sure what's going wrong.
Yesterday, whilst following a tutorial on calling python from R, I successfully installed and used several python packages (e.g., NumPy, pandas, matplotlib etc).
But today, when trying to run the exact same code, I'm getting an error when trying to import pandas (NumPy is importing without any errors). The error states:
ModuleNotFoundError: No module named 'pandas'
I'm not sure what's going on!?
I'm using R-Studio (running on a Mac)... here's a code snippet of how I'm doing it:
library(reticulate)
os <- import("os") # Setting directory
os$getcwd()
repl_python() #used to make it interactive
import numpy as np # Load numpy package
import pandas as pd # Load pandas package
At this point, it's throwing me an error. I've tried googling the answer and searching here, but to no avail.
Any suggestions as to how I'd fix this problem, or what is going on?
Thanks
Possibly your python path for reticulate changed upon reloading Rstudio. Here is how to set the path manually (filepath for Linux or Mac):
library(reticulate)
path_to_python <- "~/anaconda3/bin/python"
use_python(path_to_python)
https://stackoverflow.com/a/45891929/4549682
You can check your Python path with py_config(): https://rstudio.github.io/reticulate/articles/versions.html#configuration-info
I recommend using Anaconda for your Python distribution (you might have to use Anaconda anyway for reticulate, not sure). Download it from here: https://www.anaconda.com/distribution/#download-section
Then you can create the environment for reticulate to use:
conda_create('r-reticulate', packages = "python=3.5")
I use Python 3.5 for some specific packages, but you can change that version or leave it as just 'python' for the latest version.
https://www.rdocumentation.org/packages/reticulate/versions/1.10/topics/conda-tools
Then you want to install the packages you need (if they aren't already) with
conda_install('r-reticulate', packages = 'numpy')
The way I use something like numpy is
np <- import('numpy')
np$arange(10)
You need to set the second argument of use_python, required. For example:
use_python("/users/my_user/Anaconda3/python.exe", required = TRUE)
DON'T forget required = TRUE

Python on Windows error: python25.dll not found

I am running Python 2.7 on Windows 7 (on parallels on a Mac running Mountain Lion) and getting a strange error. It has happened both using Python(x,y) and the Enthought Python Distribution (paid version - 64-bit).
Running python from the command line initially works fine (and always does after rebooting the machine).
But, when I try to run my code at the command line as
python the_script.py
On the first try, I get this error window:
After that, I get the same error just from typing python at the command line.
If I specify the path as c:\python27\python the_script.py it works fine.
Here are all the modules I'm loading in my scripts:
import numpy as np
import subprocess as sub
import parallel_condor_Jacobian as pcj
import os
import shutil
In parallel_condor_Jacobian the following modules are loaded:
import numpy as np
import os
import subprocess as sub
Nothing really out of the ordinary I think!
Is one of these packages somehow dependent on python25.dll?
Fixes I have tried include totally removing python 2.7, reinstalling, and removing all python path stuff from my PATH environment variable and replacing them with c:\python27.
I'm really at a loss here. Happy to provide more relevant information.
remove the python.exe in the local folder ... and tell your colleagues to upgrade to at least 2.6 :P
and also tell them that the python exe is not portable :P
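To locate a stray interpreter like the one described above, it can help to list every candidate in the order the shell would try them. This is a rough sketch, run with the full interpreter path that still works (for example c:\python27\python): find_python_executables is a made-up helper, and putting the current folder first is an assumption that mirrors how the Windows command prompt checks the working directory before PATH.

```python
import os


def find_python_executables(names=("python.exe", "python")):
    """List candidate interpreters in lookup order: current folder, then PATH entries."""
    # Assumption: the current directory is searched before PATH, as cmd.exe does.
    folders = [os.getcwd()] + os.environ.get("PATH", "").split(os.pathsep)
    hits = []
    for folder in folders:
        for name in names:
            candidate = os.path.join(folder, name)
            if folder and os.path.isfile(candidate) and candidate not in hits:
                hits.append(candidate)
    return hits


if __name__ == "__main__":
    for path in find_python_executables():
        print(path)
```

The first path printed is the one a bare python command is most likely to run. If it is not c:\python27\python.exe, that file is the one to remove or rename.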
