Python Pandas: Error when exporting DataFrame to CSV file - python

I am getting the following error when trying to export a pandas DataFrame to csv.
Traceback (most recent call last):
File "C:/Users/riley/PycharmProjects/EarlyPaidLoanReport/EarlyPaidOff.py", line 91, in <module>
LastTransactionDate.to_csv(LastTransactionDate, 'example.csv')
File "C:\Users\riley\Anaconda3\lib\site-packages\pandas\core\frame.py", line 1344, in to_csv
formatter.save()
File "C:\Users\riley\Anaconda3\lib\site-packages\pandas\formats\format.py", line 1526, in save
compression=self.compression)
File "C:\Users\riley\Anaconda3\lib\site-packages\pandas\io\common.py", line 424, in _get_handle
f = open(path, mode, errors='replace')
TypeError: invalid file: AutoNumber LoanAgreementID \
I'm not sure why I am getting this error. I've been writing to csv using pandas many times in the past. Could someone please help to fix this error?
LastTransactionDate.to_csv(LastTransactionDate, 'example.csv')

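Your syntax is wrong. to_csv is a method of the DataFrame, so the frame should not be passed again as an argument. In your call pandas treats the DataFrame itself as the file path, which is why open() fails with "invalid file". Unless I am missing something, just do this: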
LastTransactionDate.to_csv('example.csv')
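To see the corrected call in isolation, here is a minimal sketch. The small frame and the temporary output path are hypothetical stand-ins for the asker's LastTransactionDate data, not something taken from the question.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the asker's LastTransactionDate frame.
last_transaction_date = pd.DataFrame(
    {"AutoNumber": [1, 2], "LoanAgreementID": [101, 102]}
)

# Write into a temporary directory so the example leaves no files behind.
out_path = os.path.join(tempfile.mkdtemp(), "example.csv")

# to_csv is a bound method: the frame is already `self`, so the first
# positional argument is the output path, not the frame.
last_transaction_date.to_csv(out_path, index=False)

# Read the file back to confirm what was written.
round_trip = pd.read_csv(out_path)
print(round_trip)
```

Passing index=False is optional; it only keeps the row index out of the file.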

Related

xlrd assertion error when opening a .xls file (converted from .xlsx): assert _unused_i == nstrings - 1

I have a script that uses the xlrd library to read and write .xls files. The program works for most .xls files, but I found that after I convert an .xlsx to .xls and try to open the workbook, I get the following assertion error:
Traceback (most recent call last):
File "/home/troublebucket/Projects/script.py", line 51, in <module>
wb = xlrd.open_workbook(bom_table_wb)
File "/home/troublebucket/.local/lib/python3.10/site-packages/xlrd/__init__.py", line 172, in open_workbook
bk = open_workbook_xls(
File "/home/troublebucket/.local/lib/python3.10/site-packages/xlrd/book.py", line 104, in open_workbook_xls
bk.parse_globals()
File "/home/troublebucket/.local/lib/python3.10/site-packages/xlrd/book.py", line 1211, in parse_globals
self.handle_sst(data)
File "/home/troublebucket/.local/lib/python3.10/site-packages/xlrd/book.py", line 1178, in handle_sst
self._sharedstrings, rt_runlist = unpack_SST_table(strlist, uniquestrings)
File "/home/troublebucket/.local/lib/python3.10/site-packages/xlrd/book.py", line 1472, in unpack_SST_table
assert _unused_i == nstrings - 1
AssertionError
I tried commenting out the assertion in book.py's unpack_SST_table(), and the script ran without errors but didn't actually read from the workbook. Any advice would be appreciated!

ValueError: Worksheet index 0 is invalid, 0 worksheets found. Cannot open xlsx with pandas in python

Could someone help me figure out why my files don't open?
import pandas as pd
file = "C://Dev//20211103_logfile Box 2.8.xlsx"
temp=pd.read_excel(file)
Here is the full error!
PS C:\Dev> & C:/Users/keyur/AppData/Local/Programs/Python/Python39/python.exe c:/Dev/test_excel.py
C:\Users\keyur\AppData\Local\Programs\Python\Python39\lib\site-packages\openpyxl\reader\workbook.py:88:
UserWarning: File contains an invalid specification for 20211103_logfile. This will be removed
warn(msg)
Traceback (most recent call last):
File "c:\Dev\test_excel.py", line 6, in <module>
temp=pd.read_excel(file)
File "C:\Users\keyur\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\util\_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "C:\Users\keyur\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\excel\_base.py", line 372, in read_excel
data = io.parse(
File "C:\Users\keyur\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\excel\_base.py", line 1272, in parse
return self._reader.parse(
File "C:\Users\keyur\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\excel\_base.py", line 537, in parse
sheet = self.get_sheet_by_index(asheetname)
File "C:\Users\keyur\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\excel\_openpyxl.py", line 546, in get_sheet_by_index
self.raise_if_bad_sheet_by_index(index)
File "C:\Users\keyur\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\excel\_base.py", line 468, in raise_if_bad_sheet_by_index
raise ValueError(
ValueError: Worksheet index 0 is invalid, 0 worksheets found
PS C:\Dev>
There is a problem with your Excel file.
Try creating a new workbook, copy and paste all the data into it, then try again. This method worked for me.
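Before rebuilding the workbook, it may help to check what the file actually contains. An .xlsx file is a zip archive, and worksheets are conventionally stored under xl/worksheets/. The sketch below lists those entries using only the standard library. The helper name worksheet_parts and the in-memory demo archive are illustrative assumptions, and the xl/worksheets/ location is a convention rather than a guarantee, so treat an empty result as a hint, not proof.

```python
import io
import zipfile


def worksheet_parts(xlsx_file):
    """List archive entries that look like worksheet parts (hypothetical helper)."""
    with zipfile.ZipFile(xlsx_file) as archive:
        return [
            name
            for name in archive.namelist()
            if name.startswith("xl/worksheets/") and name.endswith(".xml")
        ]


# Demo on a tiny in-memory archive that mimics the usual .xlsx layout.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("xl/workbook.xml", "<workbook/>")
    archive.writestr("xl/worksheets/sheet1.xml", "<worksheet/>")
buffer.seek(0)

found = worksheet_parts(buffer)
print(found)
```

For a real file, pass its path instead of the buffer. If the list is empty or the entries have unusual names, the file was probably written by a tool that does not follow the usual layout, which would fit the advice to re-save the data in a fresh workbook.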

pd.to_datetime error after saving csv file without doing anything

When I use pd.to_datetime, my code looks like this:
rate = pd.read_csv('P2training.csv', header=0)
rate['Date'] = pd.to_datetime(rate['Date'], format='%Y-%m-%d')
rate.set_index('Date', inplace=True, drop=True)
rate.tail(10)
print(rate)
In P2training.csv, the first column is 'Date', and this code ran fine when I first downloaded the P2training dataset. However, after I opened the csv file and saved it without changing anything, the code started to raise the errors below. If I replace the saved file with the originally downloaded one, the code runs properly again.
C:\Users\yaojia\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\compat\pandas.py:56: FutureWarning: The pandas.core.datetools module is deprecated and will be removed in a future version. Please use the pandas.tseries module instead.
from pandas.core import datetools
Traceback (most recent call last):
File "C:\Users\yaojia\AppData\Roaming\Python\Python36\site-packages\pandas\core\tools\datetimes.py", line 444, in _convert_listlike
values, tz = tslib.datetime_to_datetime64(arg)
File "pandas_libs\tslib.pyx", line 1810, in pandas._libs.tslib.datetime_to_datetime64 (pandas_libs\tslib.c:33275)
TypeError: Unrecognized value type:
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/yaojia/.PyCharmEdu4.0/config/scratches/scratch_7.py", line 23, in
rate['Date'] = pd.to_datetime(rate['Date'], format='%Y-%m-%d')
File "C:\Users\yaojia\AppData\Roaming\Python\Python36\site-packages\pandas\core\tools\datetimes.py", line 509, in to_datetime
values = _convert_listlike(arg._values, False, format)
File "C:\Users\yaojia\AppData\Roaming\Python\Python36\site-packages\pandas\core\tools\datetimes.py", line 447, in _convert_listlike
raise e
File "C:\Users\yaojia\AppData\Roaming\Python\Python36\site-packages\pandas\core\tools\datetimes.py", line 435, in _convert_listlike
require_iso8601=require_iso8601
File "pandas_libs\tslib.pyx", line 2355, in pandas._libs.tslib.array_to_datetime (pandas_libs\tslib.c:46617)
File "pandas_libs\tslib.pyx", line 2484, in pandas._libs.tslib.array_to_datetime (pandas_libs\tslib.c:44616)
ValueError: time data '12/31/1979' doesn't match format specified
Process finished with exit code 1
Could anyone give any hint what's going wrong?
I guess you opened the csv with Excel? If so, Excel recognized that the 'Date' column contains dates, parsed it, and saved it back in its own date format (in your case month/day/year, as the '12/31/1979' in the error shows) while your code expects year-month-day.
I suggest opening and saving your csv with a text editor, or changing the default Excel date format.
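If re-saving the file is not an option, the parsing side can be made tolerant of both layouts. The following is a minimal sketch, assuming the column only ever contains the original year-month-day strings or the month/day/year strings seen in the error. The helper name parse_dates and the sample values are made up for illustration.

```python
import pandas as pd


def parse_dates(values: pd.Series) -> pd.Series:
    """Parse ISO dates first, then fall back to month/day/year (hypothetical helper)."""
    # errors="coerce" turns strings that do not match the format into NaT
    # instead of raising, so each pass only fills in what it can parse.
    iso = pd.to_datetime(values, format="%Y-%m-%d", errors="coerce")
    us = pd.to_datetime(values, format="%m/%d/%Y", errors="coerce")
    return iso.fillna(us)


# One value in each layout: as downloaded, and as re-saved by Excel.
dates = pd.Series(["1979-12-31", "12/31/1979"])
parsed = parse_dates(dates)
print(parsed)
```

Any value that matches neither format stays NaT, so it is worth checking parsed.isna().sum() before setting the column as the index.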

IOError when using dataframe.to_csv and dataframe.save to write data to files

I was trying to save dataframe for later use in pandas.
However, I had the error below.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/source/Linux/pkg/python-2.7.3/lib/python2.7/site-packages/pandas-0.11.0-py2.7-linux-x86_64.egg/pandas/core/series.py", line 2881, in to_csv
encoding=encoding)
File "/source/Linux/pkg/python-2.7.3/lib/python2.7/site-packages/pandas-0.11.0-py2.7-linux-x86_64.egg/pandas/core/frame.py", line 1393, in to_csv
formatter.save()
File "/source/Linux/pkg/python-2.7.3/lib/python2.7/site-packages/pandas-0.11.0-py2.7-linux-x86_64.egg/pandas/core/format.py", line 963, in save
f.close()
IOError: [Errno 5] Input/output error
dataframe.save fails even for a simple object a = DataFrame({'a':[1,3,4],'b':[3,4,5]}).
The save method is deprecated. You should use to_pickle instead. It looks like you're using pandas 0.11, which is quite old. The latest version is 0.16.
You could also consider saving it to csv or HDF5.
http://pandas.pydata.org/pandas-docs/stable/io.html
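A minimal round trip with to_pickle and read_pickle could look like the sketch below. It reuses the small frame from the question; the temporary path is an assumption made for the example. Note that this does not address the underlying [Errno 5] error, which may come from the filesystem rather than from pandas.

```python
import os
import tempfile

import pandas as pd

# The simple frame from the question.
a = pd.DataFrame({"a": [1, 3, 4], "b": [3, 4, 5]})

# Hypothetical output location inside a fresh temporary directory.
pickle_path = os.path.join(tempfile.mkdtemp(), "frame.pkl")

# to_pickle replaces the deprecated DataFrame.save.
a.to_pickle(pickle_path)

# read_pickle restores the frame, including dtypes and index.
restored = pd.read_pickle(pickle_path)
print(restored.equals(a))
```

If the same IOError appears here too, writing to a different directory or disk is a quick way to check whether the target location is the problem.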

pandas HDFStore - how to reopen?

I created a file by using:
store = pd.HDFStore('/home/.../data.h5')
and stored some tables using:
store['firstSet'] = df1
store.close()
I closed down python and reopened in a fresh environment.
How do I reopen this file?
When I go:
store = pd.HDFStore('/home/.../data.h5')
I get the following error.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/misc/apps/linux/python-2.6.1/lib/python2.6/site-packages/pandas-0.10.0-py2.6-linux-x86_64.egg/pandas/io/pytables.py", line 207, in __init__
self.open(mode=mode, warn=False)
File "/misc/apps/linux/python-2.6.1/lib/python2.6/site-packages/pandas-0.10.0-py2.6-linux-x86_64.egg/pandas/io/pytables.py", line 302, in open
self.handle = _tables().openFile(self.path, self.mode)
File "/apps/linux/python-2.6.1/lib/python2.6/site-packages/tables/file.py", line 230, in openFile
return File(filename, mode, title, rootUEP, filters, **kwargs)
File "/apps/linux/python-2.6.1/lib/python2.6/site-packages/tables/file.py", line 495, in __init__
self._g_new(filename, mode, **params)
File "hdf5Extension.pyx", line 317, in tables.hdf5Extension.File._g_new (tables/hdf5Extension.c:3039)
tables.exceptions.HDF5ExtError: HDF5 error back trace
File "H5F.c", line 1582, in H5Fopen
unable to open file
File "H5F.c", line 1373, in H5F_open
unable to read superblock
File "H5Fsuper.c", line 334, in H5F_super_read
unable to find file signature
File "H5Fsuper.c", line 155, in H5F_locate_signature
unable to find a valid file signature
End of HDF5 error back trace
Unable to open/create file '/home/.../data.h5'
What am I doing wrong here? Thank you.
In my hands, following approach works best:
df = pd.DataFrame(...)
# write
with pd.HDFStore('test.h5', mode='w') as store:
    store.append('df', df, data_columns=df.columns, format='table')
# read
with pd.HDFStore('test.h5', mode='r') as newstore:
    df_restored = newstore.select('df')
You could try doing instead:
store = pd.io.pytables.HDFStore('/home/.../data.h5')
df1 = store['firstSet']
or use the read method directly:
df1 = pd.read_hdf('/home/.../data.h5', 'firstSet')
Either way, you should have pandas 0.12.0 or higher...
I had the same problem and finally fixed it by installing the pytables module (next to the pandas modules which I was using):
conda install pytables
which got me numexpr-2.4.3 and pytables-3.2.0
After that it worked. I am using pandas 0.16.2 under python 2.7.9
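The "unable to find a valid file signature" line in the traceback suggests that the file on disk is not a valid HDF5 file, for example because it is empty or was truncated. HDF5 files carry an 8-byte signature at offset 0, or at offset 512, 1024, 2048 and so on when a user block is present. The sketch below checks for it with the standard library only. The helper name looks_like_hdf5, the search limit, and the two demo files are assumptions made for the example.

```python
import os
import tempfile

# The 8-byte HDF5 format signature.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path, max_offset=1 << 20):
    """Return True if the HDF5 signature sits at offset 0, 512, 1024, ... (hypothetical helper)."""
    size = os.path.getsize(path)
    offset = 0
    with open(path, "rb") as handle:
        while offset < size and offset <= max_offset:
            handle.seek(offset)
            if handle.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return True
            # Candidate offsets are 0, then 512 doubled each time.
            offset = 512 if offset == 0 else offset * 2
    return False


# Demo files: one that starts with the signature and one that is empty.
demo_dir = tempfile.mkdtemp()
signed_path = os.path.join(demo_dir, "signed.h5")
empty_path = os.path.join(demo_dir, "empty.h5")
with open(signed_path, "wb") as handle:
    handle.write(HDF5_SIGNATURE + b"\x00" * 16)
open(empty_path, "wb").close()

print(looks_like_hdf5(signed_path), looks_like_hdf5(empty_path))
```

A passing check only means the header looks right, not that the whole file is intact. If the check fails on the real data.h5, the file most likely needs to be written again.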
