How to import pandas using R studio - python

So, just to be clear, I'm very new to python coding... so I'm not exactly sure what's going wrong.
Yesterday, whilst following a tutorial on calling python from R, I successfully installed and used several python packages (e.g., NumPy, pandas, matplotlib etc).
But today, when trying to run the exact same code, I'm getting an error when trying to import pandas (NumPy is importing without any errors). The error states:
ModuleNotFoundError: No module named 'pandas'
I'm not sure what's going on!?
I'm using R-Studio (running on a Mac)... here's a code snippet of how I'm doing it:
library(reticulate)
os <- import("os") # Setting directory
os$getcwd()
repl_python() #used to make it interactive
import numpy as np. # Load numpy package
import pandas as pd # Load pandas package
At this point, it's throwing me an error. I've tried googling the answer and searching here, but to no avail.
Any suggestions as to how I'd fix this problem, or what is going on?
Thanks

Possibly your python path for reticulate changed upon reloading Rstudio. Here is how to set the path manually (filepath for Linux or Mac):
library(reticulate)
path_to_python <- "~/anaconda3/bin/python"
use_python(path_to_python)
https://stackoverflow.com/a/45891929/4549682
You can check your Python path with py_config(): https://rstudio.github.io/reticulate/articles/versions.html#configuration-info
I recommend using Anaconda for your Python distribution (you might have to use Anaconda anyway for reticulate, not sure). Download it from here: https://www.anaconda.com/distribution/#download-section
Then you can create the environment for reticulate to use:
conda_create('r-reticulate', packages = "python=3.5")
I use Python 3.5 for some specific packages, but you can change that version or leave it as just 'python' for the latest version.
https://www.rdocumentation.org/packages/reticulate/versions/1.10/topics/conda-tools
Then you want to install the packages you need (if they aren't already) with
conda_install('re-reticulate', packages = 'numpy')
The way I use something like numpy is
np <- import('numpy')
np$arange(10)

You need to set the second argument of the function use_python, so it should be:
For example, use_python("/users/my_user/Anaconda3/python.exe",required = TRUE)
DON'T forget required = TRUE

Related

netCDF4 has no module Dataset using Anaconda

I am using Anaconda to manage my environments, and I have a strange problem with netCDF4.
I have several Jupyter notebooks in my environment that I've been using with netCDF4, no problem at all. I'm only interested in reading NetCDF files, so I'm only really using the Dataset.
Now I'm implementing the algorithm from my Jupyter notebooks in a Python package, and I get this error (in VS Code):
No name 'Dataset' in module 'netCDF4'
I can see that it's installed in Anaconda Navigator and if I try to do a pip install it reports that netcdf4 is already installed and all dependencies are met.
I've read similar-sounding posts here and they do not solve my problem.
In response to a comment, the error is where I import Dataset:
from netCDF4 import Dataset
This also gives the error:
import netCDF4 as nc
salinity_data = nc.Dataset(<file name etc...>)
The code completion does not show anything in the netCDF4 package other than some "_" prefixed variables.
I'm using Python 3.8.12 and I am using the correct virtual environment that I set up with Anaconda.
The error message is coming from pylint, not the Python interpreter (see comments above).
The code will run fine, so the issue is with pylint and the configuration. I can suppress the error by:
from netCDF4 import Dataset #pylint: disable=no-name-in-module
This will do for now, but at some point, I'd like to figure out why pylint is reporting this.
I have also found a package that works better for what I want to do wth the netCDF files:
https://github.com/h5netcdf/h5netcdf
This doesn't have all the hidden dependencies that netCDF4 does, and has a "legacyapi" that is a drop-in replacement for the netcdf package:
import hfnetcdf.legacyapi as nc
my_data = nc.Dataset('my_data_file.nc', 'r')

Problems with reticulate in R studio and importing python modules

I'm trying to run reticulate and import python modules within r studio (specifically R-markdown). The R code chunk seems to do what is expected (i.e. install the python modules) and not seem to produce any errors, but the python code chunk does not seem to do what is expected (i.e. import the installed packages). It does not produce any output (or errors) which is somewhat strange.
I've tried a fresh install of reticulate, using the devtools version of reticulate a fresh install of R Studio and using the full conda path rather than the name, but neither seem to be working. I'm at a loss to figure out what is going wrong. I've also searched stackoverflow for various answers already and have tried various suggestions, but nothing seems to be working). Additionally, I have miniconda installed and python installed as well (and python packages and scripts run perfectly fine). If anyone was able to help, that would be fantastic.
(Apologies for formatting, the last backticks indicating the end of the code chunks aren't showing up properly)
## R code chunk
```
```{r}
library("reticulate")
#devtools::install_github("rstudio/reticulate")
conda_create("my_project_env")
py_install(packages = c("numpy","pandas","scikit-learn","matplotlib","seaborn","statsmodels"))
py_install(packages = c("IPython"))
# Either of these seem to "work" for installation
#conda_install(packages = c("numpy","pandas","scikit-learn","matplotlib","seaborn","statsmodels"))
#conda_install(packages = c("IPython"))
conda_list()
use_condaenv("my_project_env")
```
The python code chunk below seems to "run" but does not produce any output or errors (such as the python module could not be found) and I am unable to use the modules.
## Python code chunk
```
```{python}
# Main packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
```
It seems like the solution was to run the imports in an r studio code chunk in the following manner :
library("reticulate")
conda_create("my_project_env")
py_install(packages = c("numpy","pandas","scikit-learn","matplotlib","seaborn","statsmodels"))
conda_list()
use_condaenv("full_path_to_python_for my_project_env")
py_run_string('import numpy as np')
py_run_string('import pandas as pd')
py_run_string('import matplotlib.pyplot as plt')
py_run_string('import seaborn as sns')
From there, I was able to run the python functions in the python code chunks without issue.

So i installed numpy . But when i call it in a program an error occurs. Any method to solve it permanently in windows 10

Here is the error
import numpy
Exception has occurred: ModuleNotFoundError
No module named 'numpy'
File "C:\path\to\file\32.py", line 1, in <module>
import numpy
Let me know how did you install the NumPy package; using pip or something else?
If you have multiple python versions, i.e. 2.x and 3.x at the same time, please make sure your interpreter for the 32.py file is the version that you installed NumPy on.
To possibly fix your problem, you should first try installing it and see if there are any errors. You should also check the version of Python you are running on Windows 10, because when you update Python it sometimes switches names between py and python
As you can see, the version of Python has changed between py and python so you should try changing that first.
If this does not work, you should try finding the directory for NumPy and adding it to the system PATH in your script. The installer usually shows you the location by doing the following:
import sys
sys.path.append("<insert numpy location here>")
import NumPy
This should manually force it into finding the package. If none of this works, please tell us and we should be able to find a different solution.
Happy Coding!
If you're using a code editor like PyCharm, you could install it by clicking on
file then settings then the project interpreter setting and install new module! You can search for the module and install.
Make sure that the python version that you want to use is a Windows Environmental Variable. You can test this by running this line in your command line.
> python --version
If you get some other python version that is not the one that you wish to use you can set the python version you want by finding where exactly your Python folder is located and go into settings and set the path as a new variable (I can leave a tutorial for you). If that is too much of a hassle, the Python installers can set the python that you will install as an environmental variable for you. (You could uninstall and reinstall and make sure that you allow it to make it an environmental variable.
After that you should be able to import whatever external packages you want using pip
for example:
pip install numpy

ImportError: cannot import name 'moduleTNC' - python

I'm having a problem in python when I try to import linear_model from sklearn library: from sklearn import linear_model. I have just installed it via pip simply in this way: pip install sklearn. I know that to avoid this error suffices uninstall and reinstall sklearn, but it didn't work. I also installed it via conda but opening the idle (is that correct?) it gives the same error.
How to avoid it?
NB: If I use jupyter from conda it works well as it should.
I had this same problem and resolved it with:
conda remove scipy scikit-learn -y
conda install scipy scikit-learn -y
I saw it here, where many other people said it solved their problems too.
With regards to the following error:
ImportError: cannot import name 'moduleTNC'
It can be solved by renaming moduletnc.cp36-win_amd64.pyd with moduleTNC.cp36-win_amd64.pyd in:
AppData\Roaming\Python\Python36\site-packages\scipy\optimize
I can't mark as possible duplicate, so I'm just pasting here. If that's a wrong behaviour, I'm sorry:
Import module works in Jupyter notebook but not in IDLE
The reason is that your pip/conda installed library paths are not accessible by python IDLE. You have to add those library paths to your environment variable(PATH). To do this open my computer > properties > advanced system settings > system.
Under environment variables look for PATH and at the end add the location of installed libraries. Refer this for more details on how to add locations in path variable. Once you do these you will be able to import the libraries. In order to know what locations python searches for libraries you can use
import sys
print sys.path
This will give you a list of locations where python searches for libraries. Once you edit the PATH variable those locations will be reflected here.
Refer this also in order to know how to add python library path.
Note: The tutorial is a reference on how to edit PATH variable. I encourage you to find the location of installed libraries and follow the steps to edit the same.

Packages not working, using Anaconda

I have installed Anaconda for Windows. It's on my work PC, so I chose the option "Just for Me" as I don't have admin rights.
Anaconda is installed on the following directory:
c:\Users\huf069\AppData\Local\Continuum\Anaconda
The Windows installer has added this directory (+ the Anaconda\Scripts directory) to the System path.
I can launch Python but trying to run
x = randn(100,100)
gives me a Name Error: name 'randn' is not defined,
whereas, as I understood, this command should work when using Anaconda, as the numpy package is included.
it works fine if I do:
import numpy
numpy.random.randn(100,100)
Anyone understand what could be happening ?
I can launch Python, but trying to run x = randn(100,100) gives me a Name Error: name 'randn' is not defined, whereas, as I understood, this command should work when using Anaconda, as the numpy package is included
The Anaconda distribution comes with the numpy package included, but still you'll need to import the package. If you want to use the randn() function without having to call the complete name, you can import it to your local namespace:
from numpy.random import randn
x = randn(100,100)
Otherwise, the call numpy.random.randn is your way to go.
You might want tot take a look at the Modules section of the Python Tutorial.

Categories

Resources