pd.to_datetime error after saving csv file without doing anything

I am using pd.to_datetime, and my code is below:
rate = pd.read_csv('P2training.csv', header=0)
rate['Date'] = pd.to_datetime(rate['Date'], format='%Y-%m-%d')
rate.set_index('Date', inplace=True, drop=True)
rate.tail(10)
print(rate)
In P2training.csv, the first column is 'Date', and this code ran well when I first downloaded the P2training dataset. However, after I opened the CSV file and saved it without doing anything else, the code started to report the errors below. If I replace the 'saved' file with the original downloaded file, the code runs properly again.
C:\Users\yaojia\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\compat\pandas.py:56: FutureWarning: The pandas.core.datetools module is deprecated and will be removed in a future version. Please use the pandas.tseries module instead.
from pandas.core import datetools
Traceback (most recent call last):
File "C:\Users\yaojia\AppData\Roaming\Python\Python36\site-packages\pandas\core\tools\datetimes.py", line 444, in _convert_listlike
values, tz = tslib.datetime_to_datetime64(arg)
File "pandas_libs\tslib.pyx", line 1810, in pandas._libs.tslib.datetime_to_datetime64 (pandas_libs\tslib.c:33275)
TypeError: Unrecognized value type:
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/yaojia/.PyCharmEdu4.0/config/scratches/scratch_7.py", line 23, in
rate['Date'] = pd.to_datetime(rate['Date'], format='%Y-%m-%d')
File "C:\Users\yaojia\AppData\Roaming\Python\Python36\site-packages\pandas\core\tools\datetimes.py", line 509, in to_datetime
values = _convert_listlike(arg._values, False, format)
File "C:\Users\yaojia\AppData\Roaming\Python\Python36\site-packages\pandas\core\tools\datetimes.py", line 447, in _convert_listlike
raise e
File "C:\Users\yaojia\AppData\Roaming\Python\Python36\site-packages\pandas\core\tools\datetimes.py", line 435, in _convert_listlike
require_iso8601=require_iso8601
File "pandas_libs\tslib.pyx", line 2355, in pandas._libs.tslib.array_to_datetime (pandas_libs\tslib.c:46617)
File "pandas_libs\tslib.pyx", line 2484, in pandas._libs.tslib.array_to_datetime (pandas_libs\tslib.c:44616)
ValueError: time data '12/31/1979' doesn't match format specified
Process finished with exit code 1
Could anyone give any hint what's going wrong?

I guess you opened the CSV with Excel? If so, Excel recognizes that the 'Date' column contains dates, parses it, and saves it in its own date format (in your case 'month/day/year', as the '12/31/1979' in the error message shows), while your code expects 'year-month-day'.
I suggest opening and saving your CSV with a text editor, or changing the default Excel date format.
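If re-saving the file is not an option, the parsing side can also be made tolerant of both layouts. The following is only a sketch: the in-memory CSV text, the 'Rate' column, and the helper name parse_dates are made up for illustration, and it assumes the re-saved file uses month/day/year as the error message suggests.

```python
import io

import pandas as pd

# Hypothetical stand-in for the re-saved P2training.csv (contents are invented).
csv_text = "Date,Rate\n12/31/1979,1.5\n01/02/1980,1.6\n"
rate = pd.read_csv(io.StringIO(csv_text), header=0)


def parse_dates(column):
    """Try the original year-month-day layout, then fall back to month/day/year."""
    try:
        return pd.to_datetime(column, format="%Y-%m-%d")
    except ValueError:
        # Assumption: the spreadsheet program rewrote the dates as month/day/year.
        return pd.to_datetime(column, format="%m/%d/%Y")


rate["Date"] = parse_dates(rate["Date"])
rate = rate.set_index("Date")
print(rate.tail(10))
```

Keeping an explicit format in both branches avoids silently guessing between day-first and month-first dates.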

Related

Want to read an HDF file into a DataFrame

gss = pd.read_hdf('gss.hdf5', 'gs')
This is the code I used in VS Code, and I got this:
Traceback (most recent call last):
File "d:\pthon_txt\t.py", line 4, in <module>
gss = pd.read_hdf('gss.hdf5', 'gs')
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Mohammed\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\io\pytables.py", line 442, in read_hdf
return store.select(
^^^^^^^^^^^^^
File "C:\Users\Mohammed\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\io\pytables.py", line 847, in select
raise KeyError(f"No object named {key} in the file")
KeyError: 'No object named gs in the file'
PS D:\pthon_txt>
I want to load this HDF file into a pandas DataFrame.

To find out which keys are stored in your HDF file, use the following code:
with pd.HDFStore('gss.hdf5') as store:
    print(store.keys())
After that, you will be able to load your data with the correct key:
gss = pd.read_hdf('gss.hdf5', <KEY>)

The error is saying that the key gs doesn't exist in the file. If there's only one key, you can use read_hdf without the key parameter, e.g.:
df = pd.read_hdf('gss.hdf5')

Pycharm getting column names

I'm new to this coding world (about 2 weeks in), so I just ran into a problem. I was following a tutorial, like most of us did in the beginning. The task was to add a new column called "Month". To do that, they suggest taking the first 2 characters from the column called "Order Date". I copied the code from the tutorial letter for letter; the only difference is that I am using PyCharm and they use Jupyter Notebook. I like PyCharm, so maybe someone knows how to solve this.
The code is the following:
import pandas as pd
import os
files = [file for file in os.listdir("./Files")]
allmonths = pd.DataFrame()
for file in files:
    df = pd.read_csv("./Files/" + file)
    allmonths = pd.concat([allmonths,df])
alldata = pd.read_csv("allmonths.csv")
### Month Column addition
alldata["Month"] = alldata["Order Date"].str[0:2]
allmonths['Month']
print(alldata.head())
The Traceback:
Traceback (most recent call last):
File "D:\Coding\Sales_Data\venv\lib\site-packages\pandas\core\indexes\base.py", line 3621, in get_loc
return self._engine.get_loc(casted_key)
File "pandas_libs\index.pyx", line 136, in pandas._libs.index.IndexEngine.get_loc
File "pandas_libs\index.pyx", line 163, in pandas._libs.index.IndexEngine.get_loc
File "pandas_libs\hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas_libs\hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Order Date'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "D:\Coding\Sales_Data\Sales Anal.py", line 11, in
alldata["Month"] = alldata["Order Date"].str[0:2]
File "D:\Coding\Sales_Data\venv\lib\site-packages\pandas\core\frame.py", line 3505, in __getitem__
indexer = self.columns.get_loc(key)
File "D:\Coding\Sales_Data\venv\lib\site-packages\pandas\core\indexes\base.py", line 3623, in get_loc
raise KeyError(key) from err
KeyError: 'Order Date'
I know the problem has something to do with the column names; maybe PyCharm can't read them from the CSV file. But I don't know how to solve it.
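The traceback only says that no column is named exactly 'Order Date' in the frame that was read; it does not say why. One way to narrow it down is to print the column names pandas actually parsed. The sketch below uses an in-memory stand-in for allmonths.csv with invented rows, and the stray spaces around the header are purely an assumption chosen to show one possible cause.

```python
import io

import pandas as pd

# Hypothetical stand-in for allmonths.csv; rows and header whitespace are invented.
csv_text = (
    "Order ID, Order Date ,Product\n"
    "1,04/19/19 08:46,USB-C Cable\n"
    "2,12/07/19 22:30,Monitor\n"
)
alldata = pd.read_csv(io.StringIO(csv_text))

# Inspect the exact names, including any leading or trailing whitespace.
print(alldata.columns.tolist())

# Normalize the names, then build the Month column as in the tutorial.
alldata.columns = alldata.columns.str.strip()
alldata["Month"] = alldata["Order Date"].str[0:2]
print(alldata.head())
```

If the printed list shows no date column at all, the CSV on disk is probably not the file the tutorial expects, and the IDE is unlikely to be the cause.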

ValueError: Worksheet index 0 is invalid, 0 worksheets found. Cannot open xlsx with pandas in python

Could someone help me figure out why my files don't open?
import pandas as pd
file = "C://Dev//20211103_logfile Box 2.8.xlsx"
temp=pd.read_excel(file)
Here is the full error!
PS C:\Dev> & C:/Users/keyur/AppData/Local/Programs/Python/Python39/python.exe c:/Dev/test_excel.py
C:\Users\keyur\AppData\Local\Programs\Python\Python39\lib\site-packages\openpyxl\reader\workbook.py:88:
UserWarning: File contains an invalid specification for 20211103_logfile. This will be removed
warn(msg)
Traceback (most recent call last):
File "c:\Dev\test_excel.py", line 6, in <module>
temp=pd.read_excel(file)
File "C:\Users\keyur\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\util\_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "C:\Users\keyur\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\excel\_base.py", line 372, in read_excel
data = io.parse(
File "C:\Users\keyur\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\excel\_base.py", line 1272, in parse
return self._reader.parse(
File "C:\Users\keyur\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\excel\_base.py", line 537, in parse
sheet = self.get_sheet_by_index(asheetname)
File "C:\Users\keyur\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\excel\_openpyxl.py", line 546, in get_sheet_by_index
self.raise_if_bad_sheet_by_index(index)
File "C:\Users\keyur\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\excel\_base.py", line 468, in raise_if_bad_sheet_by_index
raise ValueError(
ValueError: Worksheet index 0 is invalid, 0 worksheets found
PS C:\Dev>

There is a problem with your Excel file.
Try creating a new Excel file, copy and paste all the data into it, then try again. This method worked for me.

Python Pandas: Error when exporting DataFrame to CSV file

I am getting the following error when trying to export a pandas DataFrame to csv.
Traceback (most recent call last):
File "C:/Users/riley/PycharmProjects/EarlyPaidLoanReport/EarlyPaidOff.py", line 91, in <module>
LastTransactionDate.to_csv(LastTransactionDate, 'example.csv')
File "C:\Users\riley\Anaconda3\lib\site-packages\pandas\core\frame.py", line 1344, in to_csv
formatter.save()
File "C:\Users\riley\Anaconda3\lib\site-packages\pandas\formats\format.py", line 1526, in save
compression=self.compression)
File "C:\Users\riley\Anaconda3\lib\site-packages\pandas\io\common.py", line 424, in _get_handle
f = open(path, mode, errors='replace')
TypeError: invalid file: AutoNumber LoanAgreementID \
I'm not sure why I am getting this error. I've been writing to csv using pandas many times in the past. Could someone please help to fix this error?
LastTransactionDate.to_csv(LastTransactionDate, 'example.csv')

Your syntax is wrong. Unless I am missing something, just do this:
LastTransactionDate.to_csv('example.csv')
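The reason, most likely: to_csv is a method, so the DataFrame it is called on is already passed implicitly, and the first explicit argument is taken as the output path. Passing the DataFrame a second time makes pandas try to open the frame itself as a file, which matches the "invalid file" message in the traceback. A small sketch with a made-up frame (the values and the temporary directory are assumptions for illustration):

```python
import os
import tempfile

import pandas as pd

# Invented sample data; the column names are borrowed from the error message.
frame = pd.DataFrame({"AutoNumber": [1, 2], "LoanAgreementID": [101, 102]})

# Write into a temporary directory so the example leaves no stray files behind.
path = os.path.join(tempfile.mkdtemp(), "example.csv")

# The frame is the implicit first argument; only the path is passed explicitly.
frame.to_csv(path, index=False)

# Read the file back to confirm that it was written as expected.
round_trip = pd.read_csv(path)
print(round_trip)
```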

pandas HDFStore - how to reopen?

I created a file by using:
store = pd.HDFStore('/home/.../data.h5')
and stored some tables using:
store['firstSet'] = df1
store.close()
I closed down python and reopened in a fresh environment.
How do I reopen this file?
When I go:
store = pd.HDFStore('/home/.../data.h5')
I get the following error.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/misc/apps/linux/python-2.6.1/lib/python2.6/site-packages/pandas-0.10.0-py2.6-linux-x86_64.egg/pandas/io/pytables.py", line 207, in __init__
self.open(mode=mode, warn=False)
File "/misc/apps/linux/python-2.6.1/lib/python2.6/site-packages/pandas-0.10.0-py2.6-linux-x86_64.egg/pandas/io/pytables.py", line 302, in open
self.handle = _tables().openFile(self.path, self.mode)
File "/apps/linux/python-2.6.1/lib/python2.6/site-packages/tables/file.py", line 230, in openFile
return File(filename, mode, title, rootUEP, filters, **kwargs)
File "/apps/linux/python-2.6.1/lib/python2.6/site-packages/tables/file.py", line 495, in __init__
self._g_new(filename, mode, **params)
File "hdf5Extension.pyx", line 317, in tables.hdf5Extension.File._g_new (tables/hdf5Extension.c:3039)
tables.exceptions.HDF5ExtError: HDF5 error back trace
File "H5F.c", line 1582, in H5Fopen
unable to open file
File "H5F.c", line 1373, in H5F_open
unable to read superblock
File "H5Fsuper.c", line 334, in H5F_super_read
unable to find file signature
File "H5Fsuper.c", line 155, in H5F_locate_signature
unable to find a valid file signature
End of HDF5 error back trace
Unable to open/create file '/home/.../data.h5'
What am I doing wrong here? Thank you.

In my experience, the following approach works best:
df = pd.DataFrame(...)
# write
with pd.HDFStore('test.h5', mode='w') as store:
    store.append('df', df, data_columns=df.columns, format='table')
# read
with pd.HDFStore('test.h5', mode='r') as newstore:
    df_restored = newstore.select('df')

You could try doing instead:
store = pd.io.pytables.HDFStore('/home/.../data.h5')
df1 = store['firstSet']
or use the read method directly:
df1 = pd.read_hdf('/home/.../data.h5', 'firstSet')
Either way, you should have pandas 0.12.0 or higher...

I had the same problem and finally fixed it by installing the pytables module (next to the pandas modules which I was using):
conda install pytables
which got me numexpr-2.4.3 and pytables-3.2.0
After that it worked. I am using pandas 0.16.2 under python 2.7.9
