I am getting the following error when trying to export a pandas DataFrame to csv.
Traceback (most recent call last):
File "C:/Users/riley/PycharmProjects/EarlyPaidLoanReport/EarlyPaidOff.py", line 91, in <module>
LastTransactionDate.to_csv(LastTransactionDate, 'example.csv')
File "C:\Users\riley\Anaconda3\lib\site-packages\pandas\core\frame.py", line 1344, in to_csv
formatter.save()
File "C:\Users\riley\Anaconda3\lib\site-packages\pandas\formats\format.py", line 1526, in save
compression=self.compression)
File "C:\Users\riley\Anaconda3\lib\site-packages\pandas\io\common.py", line 424, in _get_handle
f = open(path, mode, errors='replace')
TypeError: invalid file: AutoNumber LoanAgreementID \
I'm not sure why I am getting this error. I've been writing to csv using pandas many times in the past. Could someone please help to fix this error?
LastTransactionDate.to_csv(LastTransactionDate, 'example.csv')
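Your syntax is wrong: to_csv is already called on the DataFrame, so its first argument should be the file path, not the DataFrame again. Here the DataFrame is passed as the path, which is why open() reports an invalid file. Unless I am missing something, just do this: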
LastTransactionDate.to_csv('example.csv')
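As a minimal sketch of the corrected call, the snippet below writes a small DataFrame to a temporary CSV and reads it back. The data, the snake_case variable name, and the temporary output path are assumptions made for illustration; only the column names are loosely borrowed from the error message.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data standing in for LastTransactionDate.
last_transaction_date = pd.DataFrame(
    {"AutoNumber": [1, 2], "LoanAgreementID": [101, 102]}
)

# The first positional argument of to_csv is the output path (or buffer).
out_path = os.path.join(tempfile.mkdtemp(), "example.csv")
last_transaction_date.to_csv(out_path, index=False)

# Read the file back to confirm it was written as expected.
round_trip = pd.read_csv(out_path)
print(round_trip)
```

Passing index=False is optional; it only keeps the row index out of the file.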
Related
I have a script that uses the xlrd library to read and write .xls files. The program works for most .xls files, but I found that after I convert an .xlsx to .xls and try to open the workbook, I get the following assertion error:
Traceback (most recent call last):
File "/home/troublebucket/Projects/script.py", line 51, in <module>
wb = xlrd.open_workbook(bom_table_wb)
File "/home/troublebucket/.local/lib/python3.10/site-packages/xlrd/__init__.py", line 172, in open_workbook
bk = open_workbook_xls(
File "/home/troublebucket/.local/lib/python3.10/site-packages/xlrd/book.py", line 104, in open_workbook_xls
bk.parse_globals()
File "/home/troublebucket/.local/lib/python3.10/site-packages/xlrd/book.py", line 1211, in parse_globals
self.handle_sst(data)
File "/home/troublebucket/.local/lib/python3.10/site-packages/xlrd/book.py", line 1178, in handle_sst
self._sharedstrings, rt_runlist = unpack_SST_table(strlist, uniquestrings)
File "/home/troublebucket/.local/lib/python3.10/site-packages/xlrd/book.py", line 1472, in unpack_SST_table
assert _unused_i == nstrings - 1
AssertionError
I tried commenting out the assertion in book.py's unpack_SST_table(), and the script ran without errors but didn't actually read from the workbook. Any advice would be appreciated!
Could someone help me figure out why my files don't open?
import pandas as pd
file = "C://Dev//20211103_logfile Box 2.8.xlsx"
temp=pd.read_excel(file)
Here is the full error!
PS C:\Dev> & C:/Users/keyur/AppData/Local/Programs/Python/Python39/python.exe c:/Dev/test_excel.py
C:\Users\keyur\AppData\Local\Programs\Python\Python39\lib\site-packages\openpyxl\reader\workbook.py:88:
UserWarning: File contains an invalid specification for 20211103_logfile. This will be removed
warn(msg)
Traceback (most recent call last):
File "c:\Dev\test_excel.py", line 6, in <module>
temp=pd.read_excel(file)
File "C:\Users\keyur\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\util\_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "C:\Users\keyur\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\excel\_base.py", line 372, in read_excel
data = io.parse(
File "C:\Users\keyur\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\excel\_base.py", line 1272, in parse
return self._reader.parse(
File "C:\Users\keyur\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\excel\_base.py", line 537, in parse
sheet = self.get_sheet_by_index(asheetname)
File "C:\Users\keyur\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\excel\_openpyxl.py", line 546, in get_sheet_by_index
self.raise_if_bad_sheet_by_index(index)
File "C:\Users\keyur\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\excel\_base.py", line 468, in raise_if_bad_sheet_by_index
raise ValueError(
ValueError: Worksheet index 0 is invalid, 0 worksheets found
PS C:\Dev>
There is a problem with your Excel file.
Try creating a new Excel file, copy and paste all the data into it, then try again. This method worked for me.
I am using pd.to_datetime, and my code is as follows:
rate = pd.read_csv('P2training.csv', header=0)
rate['Date'] = pd.to_datetime(rate['Date'], format='%Y-%m-%d')
rate.set_index('Date', inplace=True, drop=True)
rate.tail(10)
print(rate)
In P2training.csv, the first column is 'Date', and this code ran well when I first downloaded the P2training dataset. However, after I opened the csv file and saved it without doing anything else, the code started to report the errors below. If I replace the 'saved' file with the originally downloaded file, the code runs properly again.
C:\Users\yaojia\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\compat\pandas.py:56: FutureWarning: The pandas.core.datetools module is deprecated and will be removed in a future version. Please use the pandas.tseries module instead.
from pandas.core import datetools
Traceback (most recent call last):
File "C:\Users\yaojia\AppData\Roaming\Python\Python36\site-packages\pandas\core\tools\datetimes.py", line 444, in _convert_listlike
values, tz = tslib.datetime_to_datetime64(arg)
File "pandas_libs\tslib.pyx", line 1810, in pandas._libs.tslib.datetime_to_datetime64 (pandas_libs\tslib.c:33275)
TypeError: Unrecognized value type:
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/yaojia/.PyCharmEdu4.0/config/scratches/scratch_7.py", line 23, in <module>
rate['Date'] = pd.to_datetime(rate['Date'], format='%Y-%m-%d')
File "C:\Users\yaojia\AppData\Roaming\Python\Python36\site-packages\pandas\core\tools\datetimes.py", line 509, in to_datetime
values = _convert_listlike(arg._values, False, format)
File "C:\Users\yaojia\AppData\Roaming\Python\Python36\site-packages\pandas\core\tools\datetimes.py", line 447, in _convert_listlike
raise e
File "C:\Users\yaojia\AppData\Roaming\Python\Python36\site-packages\pandas\core\tools\datetimes.py", line 435, in _convert_listlike
require_iso8601=require_iso8601
File "pandas_libs\tslib.pyx", line 2355, in pandas._libs.tslib.array_to_datetime (pandas_libs\tslib.c:46617)
File "pandas_libs\tslib.pyx", line 2484, in pandas._libs.tslib.array_to_datetime (pandas_libs\tslib.c:44616)
ValueError: time data '12/31/1979' doesn't match format specified
Process finished with exit code 1
Could anyone give any hint what's going wrong?
I guess you opened the csv with Excel? If so, Excel recognizes that the 'Date' column contains dates, parses it, and saves it back in its own date format (in your case 'month/day/year', as the '12/31/1979' in the error shows), while your code expects 'year-month-day'.
I suggest opening and saving your csv with a text editor, or changing the default Excel date format...
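If the re-saved file has to be used as it is, another option is to make the format string match what is now in the file. The sketch below assumes the dates were rewritten as month/day/year, as the '12/31/1979' in the error suggests; the in-memory CSV text and the 'Rate' column are hypothetical stand-ins for P2training.csv.

```python
import io

import pandas as pd

# Hypothetical stand-in for the re-saved P2training.csv.
# Only the 'Date' column name comes from the question.
csv_text = "Date,Rate\n12/30/1979,1.0\n12/31/1979,1.1\n"

rate = pd.read_csv(io.StringIO(csv_text), header=0)

# Match the format actually present in the file: month/day/four-digit year.
rate['Date'] = pd.to_datetime(rate['Date'], format='%m/%d/%Y')
rate.set_index('Date', inplace=True, drop=True)
print(rate.tail(10))
```

This only helps if every row uses the same format; a file with mixed formats would still need to be cleaned up first.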
I was trying to save dataframe for later use in pandas.
However, I had the error below.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/source/Linux/pkg/python-2.7.3/lib/python2.7/site-packages/pandas-0.11.0-py2.7-linux-x86_64.egg/pandas/core/series.py", line 2881, in to_csv
encoding=encoding)
File "/source/Linux/pkg/python-2.7.3/lib/python2.7/site-packages/pandas-0.11.0-py2.7-linux-x86_64.egg/pandas/core/frame.py", line 1393, in to_csv
formatter.save()
File "/source/Linux/pkg/python-2.7.3/lib/python2.7/site-packages/pandas-0.11.0-py2.7-linux-x86_64.egg/pandas/core/format.py", line 963, in save
f.close()
IOError: [Errno 5] Input/output error
dataframe.save fails even for a simple object a = DataFrame({'a':[1,3,4],'b':[3,4,5]}).
The save method is deprecated. You should use to_pickle instead. It looks like you're using pandas 0.11, which is quite old. The latest version is 0.16.
You could also consider saving it to csv or HDF5.
http://pandas.pydata.org/pandas-docs/stable/io.html
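A minimal sketch of the to_pickle route, assuming a pandas version that provides to_pickle and read_pickle. The DataFrame is the one from the question; the temporary file path is an assumption made for illustration.

```python
import os
import tempfile

import pandas as pd

# The simple object from the question.
a = pd.DataFrame({'a': [1, 3, 4], 'b': [3, 4, 5]})

# Hypothetical output location; any writable path works.
pickle_path = os.path.join(tempfile.mkdtemp(), "a.pkl")

# Save the DataFrame for later use, then load it again.
a.to_pickle(pickle_path)
restored = pd.read_pickle(pickle_path)
print(restored)
```

If the Input/output error comes from the file system itself rather than from pandas, writing to a different directory first is a quick way to tell the two apart.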
I created a file by using:
store = pd.HDFStore('/home/.../data.h5')
and stored some tables using:
store['firstSet'] = df1
store.close()
I closed down python and reopened in a fresh environment.
How do I reopen this file?
When I go:
store = pd.HDFStore('/home/.../data.h5')
I get the following error.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/misc/apps/linux/python-2.6.1/lib/python2.6/site-packages/pandas-0.10.0-py2.6-linux-x86_64.egg/pandas/io/pytables.py", line 207, in __init__
self.open(mode=mode, warn=False)
File "/misc/apps/linux/python-2.6.1/lib/python2.6/site-packages/pandas-0.10.0-py2.6-linux-x86_64.egg/pandas/io/pytables.py", line 302, in open
self.handle = _tables().openFile(self.path, self.mode)
File "/apps/linux/python-2.6.1/lib/python2.6/site-packages/tables/file.py", line 230, in openFile
return File(filename, mode, title, rootUEP, filters, **kwargs)
File "/apps/linux/python-2.6.1/lib/python2.6/site-packages/tables/file.py", line 495, in __init__
self._g_new(filename, mode, **params)
File "hdf5Extension.pyx", line 317, in tables.hdf5Extension.File._g_new (tables/hdf5Extension.c:3039)
tables.exceptions.HDF5ExtError: HDF5 error back trace
File "H5F.c", line 1582, in H5Fopen
unable to open file
File "H5F.c", line 1373, in H5F_open
unable to read superblock
File "H5Fsuper.c", line 334, in H5F_super_read
unable to find file signature
File "H5Fsuper.c", line 155, in H5F_locate_signature
unable to find a valid file signature
End of HDF5 error back trace
Unable to open/create file '/home/.../data.h5'
What am I doing wrong here? Thank you.
In my hands, the following approach works best:
df = pd.DataFrame(...)
# write
with pd.HDFStore('test.h5', mode='w') as store:
    store.append('df', df, data_columns=df.columns, format='table')
# read
with pd.HDFStore('test.h5', mode='r') as newstore:
    df_restored = newstore.select('df')
You could try doing instead:
store = pd.io.pytables.HDFStore('/home/.../data.h5')
df1 = store['firstSet']
or use the read method directly:
df1 = pd.read_hdf('/home/.../data.h5', 'firstSet')
Either way, you should have pandas 0.12.0 or higher...
I had the same problem and finally fixed it by installing the pytables module (alongside the pandas modules I was already using):
conda install pytables
which got me numexpr-2.4.3 and pytables-3.2.0
After that it worked. I am using pandas 0.16.2 under python 2.7.9
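Since the traceback in the question ends with "unable to find a valid file signature", it may also be worth checking whether the file on disk is an HDF5 file at all. The sketch below uses only the standard library and compares the first 8 bytes with the HDF5 signature. The helper name looks_like_hdf5 and the demo files are hypothetical, and the check assumes the signature sits at offset 0, which is the usual case but not the only one the format allows.

```python
import os
import tempfile

# The 8-byte signature that starts an HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the file begins with the HDF5 signature."""
    with open(path, "rb") as handle:
        return handle.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE


# Demo on two hypothetical files: one with the signature, one empty.
demo_dir = tempfile.mkdtemp()
good_path = os.path.join(demo_dir, "good.h5")
empty_path = os.path.join(demo_dir, "empty.h5")

with open(good_path, "wb") as handle:
    handle.write(HDF5_SIGNATURE + b"\x00" * 16)
with open(empty_path, "wb") as handle:
    pass  # an empty file, e.g. one that was never written properly

print(looks_like_hdf5(good_path))
print(looks_like_hdf5(empty_path))
```

A False result for data.h5 would suggest the file was never written correctly in the first session, in which case reopening it cannot work regardless of how the store is opened.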