How to export .csv file from python and using pandas DataFrame - python

I am trying to export some filtered data from Python using Pandas DF to .csv file (Personal Learning project)
Code : df5.to_csv(r'/C:/Users/j/Downloads/data1/export.csv')
Error:
Traceback (most recent call last):
File "C:\Users\jansa\PycharmProjects\bbb\main.py", line 62, in <module>
df5.to_csv(r'/C:/Users/jansa/Downloads/data1/export.csv')
File "C:\Users\jansa\PycharmProjects\bbb\venv\lib\site-packages\pandas\core\generic.py", line 3551, in to_csv
return DataFrameRenderer(formatter).to_csv(
File "C:\Users\jansa\PycharmProjects\bbb\venv\lib\site-packages\pandas\io\formats\format.py", line 1180, in to_csv
csv_formatter.save()
File "C:\Users\jansa\PycharmProjects\bbb\venv\lib\site-packages\pandas\io\formats\csvs.py", line 241, in save with get_handle(
File "C:\Users\jansa\PycharmProjects\bbb\venv\lib\site-packages\pandas\io\common.py", line 697, in get_handle
check_parent_directory(str(handle))
File "C:\Users\jansa\PycharmProjects\bbb\venv\lib\site-packages\pandas\io\common.py", line 571, in check_parent_directory
raise OSError(rf"Cannot save file into a non-existent directory: '{parent}'")
OSError: Cannot save file into a non-existent directory: '\C:\Users\jansa\Downloads\data1'
I am researching, but cannot pinpoint the error.

Try
df.to_csv(r'C:\path\to\directory\filename.csv')

Generally, in Linux/Mac environment path separator is '/' but in windows, it is '\'. Also, the absolute path starts with '/' in Linux/Mac, while in windows, it starts with / So, using arguments in to_csv with C:\Users\j\Downloads\data1\export.csv' will resolve your issue.
In addition, if you want to get rid of such situations, you can do this:
import os
path = os.path.join('.', 'export.csv') #will save the file in current directory
Also, this returns the os path separator:
print(os.sep)

Related

python giving error in extracting a file from the file

Python code:
import tarfile,os
import sys
def extract_system_report (tar_file_path):
extract_path = ""
tar = tarfile.open(sys.argv[1])
for member in tar.getmembers():
print ("member_name is: ")
print (member.name)
tar.extract(member.name, "./crachinfo/")
extract_system_report(sys.argv[1])
while extracting the file getting below error:
>> python tar_read.py /nobackup/deepakhe/POLARIS_POJECT_09102019/hackathon_2021/a.tar.gz
member_name is:
/bootflash/.prst_sync/reload_info
Traceback (most recent call last):
File "tar_read.py", line 38, in <module>
extract_system_report(sys.argv[1])
File "tar_read.py", line 10, in extract_system_report
tar.extract(member.name, "./crachinfo/")
File "/auto/pysw/cel8x/python64/3.6.10/lib/python3.6/tarfile.py", line 2052, in extract
numeric_owner=numeric_owner)
File "/auto/pysw/cel8x/python64/3.6.10/lib/python3.6/tarfile.py", line 2114, in _extract_member
os.makedirs(upperdirs)
File "/auto/pysw/cel8x/python64/3.6.10/lib/python3.6/os.py", line 210, in makedirs
makedirs(head, mode, exist_ok)
File "/auto/pysw/cel8x/python64/3.6.10/lib/python3.6/os.py", line 220, in makedirs
mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/bootflash'
I am specifying the folder to extract but still it seems trying to create a folder in the root directory. is this behavior expected? I can extract this tar file fine in file manager. is there a way to handle this in python?
I am specifying the folder to extract but still it seems trying to create a folder in the root directory. is this behavior expected?
Yes. The path provided is just a "current directory", it's not a "jail". The documentation even specifically warns about it:
It is possible that files are created outside of path, e.g. members that have absolute filenames starting with "/" or filenames with two dots "..".
I can extract this tar file fine in file manager. is there a way to handle this in python?
Use extractfile to get a pseudo-file handle on the contents of the archived file, then copy that wherever you want.

How to write on top of pandas HDF5 'read-only mode' files?

I am storing data using pandas built-in HDF5 methods.
Somehow, these HDF5 files were turned into 'read-only' files, and I am getting a lot of Opening xxx in read-only mode messages when I open those files in write mode and I can't write them, which is something I really need to do.
The thing I really don't understand so far is how come those files turned into read-only, as I am not aware of a piece of code that I wrote that may result in that behavior. (I have tried to check if the data stored in the HDF5 is corrupt, but I am able to read it and manipulate it, so it seems to be working just fine)
I have 2 questions:
How can I append data to those 'read-only mode' HDF5 files? (Can I convert them back to write mode or any other clever solution?)
Is there any pandas method that would change the HDF5 file to a 'read-only mode' by default so I can avoid turning those files into read-only in the first place?
Code:
The piece of code that is raising this issue is, which is the piece I use to save the output I generated:
with pd.HDFStore('data/observer/' + self._currency + '_' + str(ts)) as hdf:
hdf.append(key='observers', value=df, format='table', data_columns=True)
I also use this piece of code to manipulate the outputs that were generated previously:
for the_file in list_dir:
if currency in the_file:
temp_df = pd.read_hdf(folder + the_file)
...
I use some select commands as well to get specific columns from the data files:
with pd.HDFStore('data/observer/' + self.currency + '_' + timestamp) as hdf:
df = hdf.select(key='observers', columns=[x, y])
Error Traceback:
File ".../data_processing/observer_data.py", line 52, in save_obs_to_pandas
hdf.append(key='observers', value=df, format='table', data_columns=True)
File ".../venv/lib/python3.5/site-packages/pandas/io/pytables.py", line 963, in append
**kwargs)
File ".../venv/lib/python3.5/site-packages/pandas/io/pytables.py", line 1341, in _write_to_group
s.write(obj=value, append=append, complib=complib, **kwargs)
File ".../venv/lib/python3.5/site-packages/pandas/io/pytables.py", line 3930, in write
self.set_info()
File ".../venv/lib/python3.5/site-packages/pandas/io/pytables.py", line 3163, in set_info
self.attrs.info = self.info
File ".../venv/lib/python3.5/site-packages/tables/attributeset.py", line 464, in __setattr__
nodefile._check_writable()
File ".../venv/lib/python3.5/site-packages/tables/file.py", line 2119, in _check_writable
raise FileModeError("the file is not writable")
tables.exceptions.FileModeError: the file is not writable
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File ".../general_manager.py", line 144, in <module>
gm.run()
File ".../general_manager.py", line 114, in run
list_of_observer_managers = self.load_all_observer_managers()
File ".../general_manager.py", line 64, in load_all_observer_managers
observer = currency_pool.map(self.load_observer_manager, list_of_currencies)
File "/usr/lib/python3.5/multiprocessing/pool.py", line 260, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/usr/lib/python3.5/multiprocessing/pool.py", line 608, in get
raise self._value
tables.exceptions.FileModeError: the file is not writable
The issue at hand was that I messed up with OS file permissions. The file I was trying to read belonged to the root (as I had run the code that generated those files with the root) and I was trying to access them with a user account.
I am running debian, and the following command (as root) solved my issues:
chown -R user.user folder
This commands recursively changes permissions of all files inside that folder to user.user.

How to turn a comma seperated value TXT into a CSV for machine learning

How do I turn this format of TXT file into a CSV file?
Date,Open,high,low,close
1/1/2017,1,2,1,2
1/2/2017,2,3,2,3
1/3/2017,3,4,3,4
I am sure you can understand? It already has the comma -eparated values.
I tried using numpy.
>>> import numpy as np
>>> table = np.genfromtxt("171028 A.txt", comments="%")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\Smith\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\lib\npyio.py", line 1551, in genfromtxt
fhd = iter(np.lib._datasource.open(fname, 'rb'))
File "C:\Users\Smith\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\lib\_datasource.py", line 151, in open
return ds.open(path, mode)
File "C:\Users\Smith\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\lib\_datasource.py", line 501, in open
raise IOError("%s not found." % path)
OSError: 171028 A.txt not found.
I have (S&P) 500 txt files to do this with.
You can use csv module. You can find more information here.
import csv
txt_file = 'mytext.txt'
csv_file = 'mycsv.csv'
in_txt = csv.reader(open(txt_file, "r"), delimiter=',')
out_csv = csv.writer(open(csv_file, 'w+'))
out_csv.writerows(in_txt)
Per #dclarke's comment, check the directory from which you run the code. As you coded the call, the file must be in that directory. When I have it there, the code runs without error (although the resulting table is a single line with four nan values). When I move the file elsewhere, I reproduce your error quite nicely.
Either move the file to be local, add a local link to the file, or change the file name in your program to use the proper path to the file (either relative or absolute).

pandas HDFStore - how to reopen?

I created a file by using:
store = pd.HDFStore('/home/.../data.h5')
and stored some tables using:
store['firstSet'] = df1
store.close()
I closed down python and reopened in a fresh environment.
How do I reopen this file?
When I go:
store = pd.HDFStore('/home/.../data.h5')
I get the following error.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/misc/apps/linux/python-2.6.1/lib/python2.6/site-packages/pandas-0.10.0-py2.6-linux-x86_64.egg/pandas/io/pytables.py", line 207, in __init__
self.open(mode=mode, warn=False)
File "/misc/apps/linux/python-2.6.1/lib/python2.6/site-packages/pandas-0.10.0-py2.6-linux-x86_64.egg/pandas/io/pytables.py", line 302, in open
self.handle = _tables().openFile(self.path, self.mode)
File "/apps/linux/python-2.6.1/lib/python2.6/site-packages/tables/file.py", line 230, in openFile
return File(filename, mode, title, rootUEP, filters, **kwargs)
File "/apps/linux/python-2.6.1/lib/python2.6/site-packages/tables/file.py", line 495, in __init__
self._g_new(filename, mode, **params)
File "hdf5Extension.pyx", line 317, in tables.hdf5Extension.File._g_new (tables/hdf5Extension.c:3039)
tables.exceptions.HDF5ExtError: HDF5 error back trace
File "H5F.c", line 1582, in H5Fopen
unable to open file
File "H5F.c", line 1373, in H5F_open
unable to read superblock
File "H5Fsuper.c", line 334, in H5F_super_read
unable to find file signature
File "H5Fsuper.c", line 155, in H5F_locate_signature
unable to find a valid file signature
End of HDF5 error back trace
Unable to open/create file '/home/.../data.h5'
What am I doing wrong here? Thank you.
In my hands, following approach works best:
df = pd.DataFrame(...)
"write"
with pd.HDFStore('test.h5', mode='w') as store:
store.append('df', df, data_columns= df.columns, format='table')
"read"
with pd.HDFStore('test.h5', mode='r') as newstore:
df_restored = newstore.select('df')
You could try doing instead:
store = pd.io.pytables.HDFStore('/home/.../data.h5')
df1 = store['firstSet']
or use the read method directly:
df1 = pd.read_hdf('/home/.../data.h5', 'firstSet')
Either way, you should have pandas 0.12.0 or higher...
I had the same problem and finally fixed it by installing the pytables module (next to the pandas modules which I was using):
conda install pytables
which got me numexpr-2.4.3 and pytables-3.2.0
After that it worked. I am using pandas 0.16.2 under python 2.7.9

Error when trying to convert .sxw file to .rml in OpenERP 6.0.3

[2012-06-01 15:33:10,638][molisamples] ERROR:web-services:Uncaught exception
Traceback (most recent call last):
File "osv\osv.pyo", line 122, in wrapper
File "osv\osv.pyo", line 176, in execute
File "osv\osv.pyo", line 167, in execute_cr
File "C:\Program Files (x86)\OpenERP 6.0\Server\addons\base_report_designer\base_report_designer.py", line 42, in sxwtorml
File "C:\Program Files (x86)\OpenERP 6.0\Server\addons\base_report_designer\openerp_sxw2rml\openerp_sxw2rml.py", line 309, in sxw2rml
File "C:\Program Files (x86)\OpenERP 6.0\Server\addons\base_report_designer\openerp_sxw2rml\openerp_sxw2rml.py", line 294, in unpackNormalize
File "C:\Program Files (x86)\OpenERP 6.0\Server\addons\base_report_designer\openerp_sxw2rml\openerp_sxw2rml.py", line 269, in oo_read
File "zipfile.pyo", line 346, in init
File "zipfile.pyo", line 366, in _GetContents
File "zipfile.pyo", line 378, in _RealGetContents
BadZipfile: File is not a zip file
I get the above error when I try to convert a report I just designed to .rml (using Open Office Writer). Please what could be the issue. I am seriously confused here
You can convert your .sxw to .rml using base_report_designer module.
Try following steps:
Open terminal -> go to openerp_sxw2rml folder like this:
cd addons/base_report_designer/openerp_sxw2rml
Then run this command: python openerp_sxw2rml.py absolute path of sxw > absolute path of rml
Like this:
python openerp_sxw2rml.py /home/arya/my_module/report/my_report.sxw > /home/arya/my_module/report/my_report.rml
This will convert sxw file into rml and you can find your file at given path of rml.
Thank you.
The error says that the file is not a zip file, so it's probably expecting the compressed format of sxw file. Any chance you saved the file in OpenOffice's uncompressed format?
Make sure when you save in Openoffice Writer, you select the old format, the one with SXW extension.
Don't just type .sxw, make sure the program puts it there by itself by selecting the correct entry in the fileformat selectionbox (i forget the full title and cannot check atm)
I figured it out. I had some errors in the python parser file for the report. That's what was causing the issue. It's been fixed now. Thanks y'all for the help

Categories

Resources