How to install Pandas to a specific python installation


I am trying to install Pandas in a version of Python that is bundled with an API. In an IDE bundled with the API I run the following:
Code
import sys
type(sys.path)
for path in sys.path:
    print(path)
Output
C:\Users\priper\AppData\Local\Programs\Python\Python37\python37.zip
C:\Users\priper\AppData\Local\Programs\Python\Python37\DLLs
C:\Users\priper\AppData\Local\Programs\Python\Python37\Lib
C:\Users\priper\AppData\Local\Programs\Python\Python37
C:\Users\priper\AppData\Local\Programs\Python\Python37\Lib\site-packages
C:\Windows\Microsoft.NET\Framework64\v4.0.30319\
C:\Users\priper\Desktop\HF_AutoCal
I have pointed the API to a Python 3.7 installation, but when I try to import pandas I get an error. When I try to install pandas, pip reports that the requirement is already satisfied, because pandas is installed under Python 3.8. I cannot point the API to Python 3.8 because the API does not work with it. Is there a way to get pandas and NumPy into the Python 3.7 installation?
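One common way to handle this is to run pip through the exact interpreter the API uses, so the packages land in that interpreter's site-packages. The following is a minimal sketch, not a confirmed fix for this setup: the helper names are made up for illustration, and the python.exe path is an assumption based on the folder shown in the output above.

```python
import subprocess


def pip_install_command(python_exe, packages):
    """Build a command that runs pip under one specific interpreter."""
    # "<python> -m pip" ties pip to that interpreter, whatever "pip" is on PATH.
    return [python_exe, "-m", "pip", "install", *packages]


def install(python_exe, packages):
    """Run the install and raise CalledProcessError if pip fails."""
    subprocess.run(pip_install_command(python_exe, packages), check=True)


if __name__ == "__main__":
    # Assumed location of the Python 3.7 interpreter; adjust to your machine.
    target = r"C:\Users\priper\AppData\Local\Programs\Python\Python37\python.exe"
    # Only print the command here; call install(target, [...]) to run it.
    print(" ".join(pip_install_command(target, ["pandas", "numpy"])))
```

The same command can be typed directly in a terminal. Inside an embedded interpreter, sys.executable may point at the host application rather than python.exe, which is why the path is spelled out here.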

Related

VSCode - import errors and interpreter errors

I have several imports in my current code:
from flask import Flask
import datetime as dt
import numpy as np
import pandas as pd
When I run this in VS Code I get this error:
However, running it in a Jupyter notebook causes no problems. When I looked online, the advice was to select the Python interpreter, but when I try to do that I get this error:
And another error:
The Anaconda Prompt says the modules/packages are installed, but when I run pip install in the default Windows terminal, it says pip has no module:

For the first problem, delete and reinstall the Python extension, as suggested for the same problem on GitHub.
The second problem is related to the location of your Python interpreter: you need to select the correct one. I still recommend using one Python version per environment. You can use other Python versions in virtual environments, which avoids confusion.
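To see which interpreter a given tool (VS Code, Jupyter, a terminal) is actually running, and whether the packages are visible to it, a small check like the one below can be run in each of them. This is a sketch; the helper name missing_modules is made up for illustration.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The path printed here is the interpreter that needs the packages.
    print("Interpreter:", sys.executable)
    print("Missing:", missing_modules(["flask", "numpy", "pandas"]))
```

If the interpreter path differs between VS Code and Jupyter, the packages were most likely installed for only one of them.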

Having trouble with opening Excel file with Pandas

I am working with pandas for the first time and don't know much about it.
While trying to read an Excel file, Visual Studio Code shows the error "missing dependency xlrd". I don't know what to do.
Info:
Anaconda and VS Code are installed on the same drive. The Excel file is also on the same drive. I am using Windows 10 64-bit.

The description is very short; a little more detail would help. Try installing the module:
pip install xlrd
If using python3 then:
pip3 install xlrd
If you are using conda:
conda install -c anaconda xlrd
There may be multiple Python versions on the system, with the requirement satisfied for one and not for the other. I faced this problem, and invoking pip through python3 rather than using pip3 worked for me:
python3 -m pip install xlrd
That should work. If it does not, upgrade both packages:
pip3 install --upgrade pandas
pip3 install --upgrade xlrd
I hope this will work.
import xlrd
import pandas as pd
sp = pd.ExcelFile("data.xlsx")
print(sp.parse(sp.sheet_names[0]))
If it doesn't work even after the upgrade, my guess is that there is another problem that is not known from your description. (Please include the full error message in the description as a code block, not in image format.)
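One thing worth checking that the thread does not mention: xlrd 2.0 and later reads only the legacy .xls format, and recent pandas releases read .xlsx files through openpyxl instead. The sketch below picks an engine name from the file extension under that assumption. excel_engine_for is a hypothetical helper; its result would be passed to pandas as pd.read_excel(path, engine=...).

```python
from pathlib import Path


def excel_engine_for(path):
    """Suggest a pandas Excel engine name based on the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix == ".xls":
        return "xlrd"       # legacy binary workbooks
    if suffix in (".xlsx", ".xlsm"):
        return "openpyxl"   # Office Open XML workbooks
    raise ValueError(f"unrecognised Excel extension: {suffix!r}")


if __name__ == "__main__":
    print(excel_engine_for("data.xlsx"))
```

Whichever engine is suggested still has to be installed for the interpreter that runs the script.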

First make sure you have all the required libraries installed.
pip install pandas
Pandas also requires the NumPy library
pip install numpy
In order to work with Pandas in your script, you will need to import it into your code. This is done with one line of code:
import pandas as pd
To work with Excel using Pandas, you need an additional object named ExcelFile. ExcelFile is built into the Pandas ecosystem, so you import directly from Pandas:
from pandas import ExcelFile
Note the path to your Excel file, for example: /Users/Desktop/file.xlsx
Rather than writing the path inside the read_excel call, keep the code clean by storing the path in a variable:
file_path = '/Users/Desktop/file.xlsx'
The read_excel function takes the file path of an Excel workbook and returns a DataFrame object with the contents.
Put it all together and assign the DataFrame object to a variable named "df":
df = pd.read_excel(file_path)
Lastly, to view the DataFrame, print it. Add a print statement to the end of your script, using the DataFrame variable as the argument:
print(df)

ImportError: cannot import name 'lzip'

A script I'm running is failing at one of my imports, and I can't figure out why.
The offending line of code is:
from pandas.compat import lzip
Error message: ImportError: cannot import name 'lzip'
This script runs in a conda virtual environment, with dependencies installed through conda. In particular, I have pandas and pandas-compat installed, also through conda, running on Python 3.6. (I use 3.6 because pandas-compat proved incompatible with 3.7 in an earlier attempt with a Python 3.7 environment.)
Help please, or point me to where I may find the solution?
I am not sure why lzip wouldn't be available, apart from the warning at the top of the pandas API reference documentation ("Warning: The pandas.core, pandas.compat, and pandas.util top-level modules are PRIVATE. Stable functionality in such modules is not guaranteed."). The first solution on this page didn't work for me.
Current pandas version: 0.25.3
Current pandas-compat: 0.1.1
Current python: 3.6.7
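A possible explanation, not confirmed anywhere in this thread: the Python 2 compatibility helpers in pandas.compat, lzip among them, appear to have been dropped around pandas 0.25, which would match the version listed above. If that is the cause, the helper is easy to replace locally, since it was a thin wrapper that returned the result of zip as a list. A sketch under that assumption:

```python
def lzip(*iterables):
    """Local stand-in for the old pandas.compat.lzip: zip, but as a list."""
    return list(zip(*iterables))


if __name__ == "__main__":
    pairs = lzip(["a", "b"], [1, 2])
    print(pairs)
```

Defining the function in the script and removing the import avoids relying on a private pandas module.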

ImportError: No module named 'pandas.core.internals.managers'; 'pandas.core.internals' is not a package

When I tried to read a pickle file that was saved by a former version of pandas, it raised an ImportError.
ImportError: No module named 'pandas.core.internals.managers';
'pandas.core.internals' is not a package
There was no hit on Stack Overflow, so I would like to share my solution for this particular problem.

This error arises from how the previously saved pickle file was encoded. If you have updated pandas to a newer version since saving the file, reading it produces this import error.

I faced the same error when using pandas version 0.23.4.
I installed pandas 0.24.1 explicitly with:
pip3 install pandas==0.24.1
This solved my problem. (I was using Python 3.5.)

I had the same problem, but for me it seemed to come from the pickle package and its interaction with pandas.
I had pandas version 0.23.4.
I saved some pickle files with pandas.DataFrame.to_pickle, using Python 3.6.6 and pandas 0.23.4.
Then I upgraded to Python 3.7.2 (still pandas 0.23.4) and was unable to read those pickle files with pandas.read_pickle.
Next, I upgraded pandas to 0.24.1, and it worked for me. I can read those files again.
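Since the answers here hinge on which pandas version wrote the pickle and which one reads it, it can help to check the installed version before loading. Below is a minimal sketch using only the standard library (importlib.metadata needs Python 3.8 or later). The helper names are made up, and the 0.24 threshold comes from the reports above rather than from the pandas documentation.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '0.24.1' into (0, 24, 1); assumes purely numeric components."""
    return tuple(int(part) for part in version.split("."))


if __name__ == "__main__":
    found = installed_version("pandas")
    if found is None:
        print("pandas is not installed for this interpreter")
    else:
        # Compare only major.minor so suffixes such as 'rc1' are ignored.
        major_minor = version_tuple(".".join(found.split(".")[:2]))
        print(found, "is at least 0.24:", major_minor >= (0, 24))
```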

Updating pandas would be the best solution for most cases. However if you have limitations updating your pandas version, and you need to consume pandas objects produced and pickled in a higher version, you can add class location map as below.
from pandas.compat.pickle_compat import _class_locations_map
_class_locations_map.update({
    ('pandas.core.internals.managers', 'BlockManager'): ('pandas.core.internals', 'BlockManager')
})

If you use the conda package manager:
conda update pandas

How to import pandas using R studio

So, just to be clear, I'm very new to python coding... so I'm not exactly sure what's going wrong.
Yesterday, whilst following a tutorial on calling python from R, I successfully installed and used several python packages (e.g., NumPy, pandas, matplotlib etc).
But today, when trying to run the exact same code, I'm getting an error when trying to import pandas (NumPy is importing without any errors). The error states:
ModuleNotFoundError: No module named 'pandas'
I'm not sure what's going on!?
I'm using R-Studio (running on a Mac)... here's a code snippet of how I'm doing it:
library(reticulate)
os <- import("os") # Setting directory
os$getcwd()
repl_python() #used to make it interactive
import numpy as np # Load numpy package
import pandas as pd # Load pandas package
At this point, it's throwing me an error. I've tried googling the answer and searching here, but to no avail.
Any suggestions as to how I'd fix this problem, or what is going on?
Thanks

Possibly the Python path used by reticulate changed when you reloaded RStudio. Here is how to set the path manually (file path for Linux or Mac):
library(reticulate)
path_to_python <- "~/anaconda3/bin/python"
use_python(path_to_python)
https://stackoverflow.com/a/45891929/4549682
You can check your Python path with py_config(): https://rstudio.github.io/reticulate/articles/versions.html#configuration-info
I recommend using Anaconda for your Python distribution (you might have to use Anaconda anyway for reticulate, not sure). Download it from here: https://www.anaconda.com/distribution/#download-section
Then you can create the environment for reticulate to use:
conda_create('r-reticulate', packages = "python=3.5")
I use Python 3.5 for some specific packages, but you can change that version or leave it as just 'python' for the latest version.
https://www.rdocumentation.org/packages/reticulate/versions/1.10/topics/conda-tools
Then you want to install the packages you need (if they aren't already) with
conda_install('r-reticulate', packages = 'numpy')
The way I use something like numpy is
np <- import('numpy')
np$arange(10)

You need to set the second argument of use_python, required. For example:
use_python("/users/my_user/Anaconda3/python.exe", required = TRUE)
Don't forget required = TRUE.
