Pandas and Spreadsheet


I am using Ubuntu, so I cannot use MS Excel. I have created a spreadsheet (an .ods file) and want to use it in my Python program.
Here is the program, followed by the error it produces:
import pandas as pd
df=pd.read_excel("/home/files/file1.ods")
df.head()
Traceback (most recent call last):
File "spreadsheet.py", line 2, in <module>
df=pd.read_excel("/home/files/file1.ods")
File "/usr/lib/python3/dist-packages/pandas/io/excel.py", line 163, in read_excel
io = ExcelFile(io, engine=engine)
File "/usr/lib/python3/dist-packages/pandas/io/excel.py", line 187, in __init__
import xlrd # throw an ImportError if we need to
ImportError: No module named 'xlrd'
Does this mean I have to use MS Excel, or have I misunderstood something? Either way, your help will be highly appreciated.

Recent versions of pandas (0.25 and later) can read OpenDocument spreadsheets.
Install the odfpy package (for example with pip install odfpy), then call pandas' read_excel() function (despite its name) with the engine='odf' option. For example:
pd.read_excel('path_to_file.ods', engine='odf')
See https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#opendocument-spreadsheets.
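If the same script has to handle both .ods and Excel files, the engine can be chosen from the file extension. The following is only a sketch: pick_engine and read_spreadsheet are hypothetical helper names, not part of pandas, and each engine still needs its own package installed (odfpy, openpyxl or xlrd).

```python
from pathlib import Path

# Hypothetical mapping from file extension to a pandas read_excel engine name.
ENGINE_BY_SUFFIX = {
    ".ods": "odf",        # requires the odfpy package
    ".xlsx": "openpyxl",  # requires the openpyxl package
    ".xls": "xlrd",       # requires the xlrd package
}


def pick_engine(path):
    """Return the read_excel engine name for the given spreadsheet path."""
    suffix = Path(path).suffix.lower()
    if suffix not in ENGINE_BY_SUFFIX:
        raise ValueError("unsupported spreadsheet type: %r" % suffix)
    return ENGINE_BY_SUFFIX[suffix]


def read_spreadsheet(path):
    """Read a spreadsheet into a DataFrame using the matching engine."""
    import pandas as pd  # imported here so pick_engine works without pandas

    return pd.read_excel(path, engine=pick_engine(path))


print(pick_engine("/home/files/file1.ods"))
```

With this in place, read_spreadsheet("/home/files/file1.ods") would call pd.read_excel(..., engine='odf') as described above.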

Related

Error Import awswrangler: AttributeError: module 'multiprocessing' has no attribute 'connection'

I have a Python script that uses the awswrangler library. Today my script started failing on the import of the library, and I don't know what is happening.
I'm running the script in a Docker container built from the python:3.8 image.
Example:
import awswrangler as wr
print(wr.__version__)
Error:
Traceback (most recent call last):
File "src/avec/automation/TaskBaseUserPass.py", line 1, in <module>
from awswrangler.pandas import Pandas
File "/usr/local/lib/python3.8/site-packages/awswrangler/__init__.py", line 17, in <module>
from awswrangler.pandas import Pandas # noqa
File "/usr/local/lib/python3.8/site-packages/awswrangler/pandas.py", line 45, in <module>
class Pandas:
File "/usr/local/lib/python3.8/site-packages/awswrangler/pandas.py", line 273, in Pandas
def _read_csv_once_remote(send_pipe: mp.connection.Connection, session_primitives: "SessionPrimitives",
AttributeError: module 'multiprocessing' has no attribute 'connection'

I ran into the same issue today when trying to import awswrangler. For me, downgrading the following dependencies helped:
pip install fsspec==0.6.3 PyAthena==1.10.2 s3fs==0.4.0
It seems that one or more of them were causing the problem.

If your code uses multiprocessing.connection.Listener or multiprocessing.connection.Client, then you should use:
import multiprocessing.connection
If you just use
import multiprocessing
.. then your code might raise an AttributeError, or it might not. It depends on other modules. If another module has already imported multiprocessing.connection, the attribute exists and your code works.
You probably don't want behavior that depends on import order, which is why you should import multiprocessing.connection explicitly.
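As a minimal sketch of the explicit import, the snippet below imports the submodule directly and then sends one message through an in-process pipe. The echo_once function is a hypothetical name used only for illustration.

```python
import multiprocessing
import multiprocessing.connection  # explicit import: the attribute is guaranteed to exist


def echo_once(message):
    """Send a message through a one-way pipe and read it back."""
    # With duplex=False, the first end can only receive and the second can only send.
    receiver, sender = multiprocessing.Pipe(duplex=False)
    try:
        sender.send(message)
        return receiver.recv()
    finally:
        sender.close()
        receiver.close()


result = echo_once("ping")
# Safe to access now, regardless of what other modules have imported.
has_listener = hasattr(multiprocessing.connection, "Listener")
print(result, has_listener)
```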

I managed to get it running with version 3.6; the library has a problem with mp.connection.Connection in current Python versions.

Getting several Traceback errors in python

New to Python. I'm trying to understand how to import the pandas module. I imported it through PyCharm, then ran the basic script below. I get several errors that I'm not sure how to interpret.
import pandas
x = input("Enter your age")
print(x)
I receive this error message.
Traceback (most recent call last):
File "C:/Users/leeb/PycharmProjects/HelloWorld/app.py", line 1, in <module>
import pandas
File "C:\Users\leeb\PycharmProjects\HelloWorld\venv\lib\site-packages\pandas\__init__.py", line 11, in <module>
__import__(dependency)
File "C:\Users\leeb\PycharmProjects\HelloWorld\venv\lib\site-packages\numpy\__init__.py", line 150, in <module>
from . import random
File "C:\Users\leeb\PycharmProjects\HelloWorld\venv\lib\site-packages\numpy\random\__init__.py", line 181, in <module>
from . import _pickle
File "C:\Users\leeb\PycharmProjects\HelloWorld\venv\lib\site-packages\numpy\random\_pickle.py", line 1, in <module>
from .mtrand import RandomState
File "type.pxd", line 9, in init numpy.random.mtrand
ValueError: builtins.type size changed, may indicate binary incompatibility. Expected 440 from C header, got 432 from PyObject

This error tends to happen when you have an older version of NumPy installed.
You should upgrade it, as follows:
pip install numpy --upgrade
If that doesn't work, try to use a specific version of numpy, as follows:
pip uninstall numpy
pip install numpy==1.15.1
Or, if you're using anaconda, try:
conda update numpy
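Before and after upgrading, it can help to confirm which versions are actually installed in the active environment. This is a small sketch using the standard library's importlib.metadata (available from Python 3.8); installed_version is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the two packages involved in the traceback above.
for name in ("numpy", "pandas"):
    print(name, installed_version(name))
```

Because it reads package metadata instead of importing the packages, this works even when import pandas itself fails.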

ImportError: C extension: No module named 'parsing' not built

I have been trying to find a solution to this import error from the pandas library, which says there is no module named 'parsing'. Every library should be installed correctly in the interpreter, and they are all the latest version.
This is what the console returns:
Traceback (most recent call last):
File "C:\Users\shaya\PycharmProjects\NEA\venv\lib\site-packages\pandas\__init__.py", line 26, in <module>
from pandas._libs import (hashtable as _hashtable,
File "C:\Users\shaya\PycharmProjects\NEA\venv\lib\site-packages\pandas\_libs\__init__.py", line 4, in <module>
from .tslib import iNaT, NaT, Timestamp, Timedelta, OutOfBoundsDatetime
File "pandas\_libs\tslibs\conversion.pxd", line 11, in init pandas._libs.tslib
File "pandas\_libs\tslibs\conversion.pyx", line 40, in init pandas._libs.tslibs.conversion
ModuleNotFoundError: No module named 'parsing'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/shaya/PycharmProjects/NEA/Main.py", line 4, in <module>
import pandas_datareader.data as data
File "C:\Users\shaya\PycharmProjects\NEA\venv\lib\site-packages\pandas_datareader\__init__.py", line 2, in <module>
from .data import (DataReader, Options, get_components_yahoo,
File "C:\Users\shaya\PycharmProjects\NEA\venv\lib\site-packages\pandas_datareader\data.py", line 7, in <module>
from pandas_datareader.av.forex import AVForexReader
File "C:\Users\shaya\PycharmProjects\NEA\venv\lib\site-packages\pandas_datareader\av\__init__.py", line 3, in <module>
from pandas_datareader.base import _BaseReader
File "C:\Users\shaya\PycharmProjects\NEA\venv\lib\site-packages\pandas_datareader\base.py", line 7, in <module>
import pandas.compat as compat
File "C:\Users\shaya\PycharmProjects\NEA\venv\lib\site-packages\pandas\__init__.py", line 35, in <module>
"the C extensions first.".format(module))
ImportError: C extension: No module named 'parsing' not built. If you want to import pandas from the source directory, you may need to run 'python setup.py build_ext --inplace --force' to build the C extensions first.
All of the above tracebacks are from PyCharm.
OS: Windows.
I am using pip to install packages
Python version: 3.7.1, pandas version: 0.23.4

Do you have Python added to your PATH? To test this, open a command prompt and type python. If it is on your PATH, you should see the version of Python you are running (assuming you are using a Windows machine). If so, you can simply run the command from the error message. If not, navigate to the location where Python is installed and try to run the command python setup.py build_ext --inplace --force from there.
If this doesn't work, try reinstalling pandas with pip install --upgrade --force-reinstall pandas
If this also fails, you could take the more rigorous route and simply create a new environment and install pandas there. Side note: it is probably better to install pandas with the conda package manager. Pandas has portions of its code written in C to make it run faster, so if you tried to install pandas manually you would need to build it.
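To check which interpreter is running and where a module would be loaded from, something like the following sketch may help. The locate function is a hypothetical helper; it only looks the module up and does not import it, so it still works when the import itself is broken.

```python
import importlib.util
import sys


def locate(module_name):
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin


# The interpreter PyCharm (or the command prompt) is actually using.
print(sys.executable)
# Where pandas would be imported from in this environment, if it is installed.
print(locate("pandas"))
```

If the printed paths point at a different environment than the one pip installed into, that mismatch is worth fixing before rebuilding anything.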

I had the same problem under the same circumstances. I went through the code of some of the pandas files and saw that there is indeed a module named 'parsing' in the tslib folder of my pandas directory, yet for some reason it's not able to call it. I just reinstalled python and now it's working for me. If you find any other alternative, please let me know.

module 'pandas' has no attribute 'read_csv'

import pandas as pd
df = pd.read_csv('FBI-CRIME11.csv')
print(df.head())
Running this simple code gives me the error:
Traceback (most recent call last):
File "C:/Users/Dita/Desktop/python/lessons/python.data/csv.py", line 1, in <module>
import pandas as pd
File "C:\python\lib\site-packages\pandas-0.19.1-py3.5-win-amd64.egg\pandas\__init__.py", line 37, in <module>
import pandas.core.config_init
File "C:\python\lib\site-packages\pandas-0.19.1-py3.5-win-amd64.egg\pandas\core\config_init.py", line 18, in <module>
from pandas.formats.format import detect_console_encoding
File "C:\python\lib\site-packages\pandas-0.19.1-py3.5-win-amd64.egg\pandas\formats\format.py", line 33, in <module>
from pandas.io.common import _get_handle, UnicodeWriter, _expand_user
File "C:\python\lib\site-packages\pandas-0.19.1-py3.5-win-amd64.egg\pandas\io\common.py", line 5, in <module>
import csv
File "C:\Users\Dita\Desktop\python\lessons\python.data\csv.py", line 4, in <module>
df = pd.read_csv('FBI-CRIME11.csv')
AttributeError: module 'pandas' has no attribute 'read_csv'

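Try renaming your csv.py to something else, like csv_test.py. Your script has the same name as the standard library's csv module, so when pandas runs import csv it loads your file instead (the traceback shows pandas\io\common.py importing your csv.py).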

Make sure you don't have a file called pandas.py in the directory you are executing your python file in.
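A quick way to spot such files is to scan the script's directory for names that collide with modules you import. This is a rough sketch: find_shadowing_files and the SHADOW_RISK set are hypothetical, and the set should be extended with whatever modules your own script uses. The demo runs in a temporary directory so it does not touch real files.

```python
import tempfile
from pathlib import Path

# Hypothetical list of module names that are easy to shadow by accident.
SHADOW_RISK = frozenset({"csv", "pandas", "numpy", "matplotlib"})


def find_shadowing_files(directory, names=SHADOW_RISK):
    """Return the .py files in `directory` whose names match a module in `names`."""
    return sorted(p.name for p in Path(directory).glob("*.py") if p.stem in names)


# Demo: one offending file and one harmless file in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "csv.py").write_text("# accidentally shadows the standard library\n")
    Path(tmp, "analysis.py").write_text("# harmless name\n")
    demo_result = find_shadowing_files(tmp)

print(demo_result)  # ['csv.py']
```

In practice you would call find_shadowing_files on the folder that contains your script.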

I had checked for the presence of csv.py and made sure there was no file of this name. I also tried pip uninstall pandas and then pip install pandas. I still got the same error.
What worked for me was: pip-autoremove.
First install it using pip install pip-autoremove.
Then, remove pandas using pip-autoremove pandas -y.
Next, reinstall it using pip install pandas.
The reason why this is necessary is that sometimes, when using uninstall, the package folder may still be present.

I'm using Jupyter Notebook, and in my case I imported and used pandas as below.
import pandas as pd
df = pd.read_csv('canada_per_capita_income.csv')
But it kept throwing this error:
'DataFrame' object has no attribute 'read_csv'
Then I restarted the kernel and renamed the CSV file to
income.csv
That solved my issue. I hope this helps someone.

You need to make sure there are no files named pandas.py, numpy.py or matplotlib.py in your working directory, since they would shadow the real packages.

openpyxl for Python 3 gives StringIO error

When I try to load a workbook in Python 3.3 using openpyxl, I get a "No module named 'StringIO'" error:
In [5]: load_workbook(FileName)
Traceback (most recent call last):
File "<ipython-input-6-c4f3bc35f522>", line 1, in <module>
load_workbook(FileName)
File "C:\WinPython-64bit-3.3.2.3\python-3.3.2.amd64\lib\site-packages\openpyxl-1.6.2-py3.3.egg\openpyxl\reader\excel.py", line 112, in load_workbook
f = repair_central_directory(filename, is_file_instance)
File "C:\WinPython-64bit-3.3.2.3\python-3.3.2.amd64\lib\site-packages\openpyxl-1.6.2-py3.3.egg\openpyxl\reader\excel.py", line 59, in repair_central_directory
from StringIO import StringIO
ImportError: No module named 'StringIO'
I am aware that StringIO is not available in Python 3, but then again, I am using the py3.3 version of openpyxl (or so I think...?). However, when I try to find the file that calls StringIO, it is not in the directory noted in the traceback. Have I installed something incorrectly? Or is something else going on here?
Thanks in advance.

This should have been resolved in openpyxl 1.7
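For background, the StringIO module from Python 2 was folded into the io module in Python 3. Code that must run on both usually guards the import as in the sketch below. This illustrates the general pattern only and is not taken from the openpyxl source.

```python
try:
    from StringIO import StringIO  # Python 2 location
except ImportError:
    from io import StringIO  # Python 3 location

# The in-memory text buffer behaves the same either way.
buffer = StringIO()
buffer.write("hello")
text = buffer.getvalue()
print(text)
```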
