Python -- Pandas can't find file but numpy can - python

I am at a complete loss here. I am trying to open a txt file in pandas and have tried several different approaches, but I receive the same error message every time: 'no such file'...
What is strange is that this...
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter
full_file = np.loadtxt('2_Feature_Test.txt', delimiter=',')
...works completely fine, however this...
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter
full_file = pd.read_csv('2_Feature_Test.txt', sep=',')
...does not.
It doesn't matter whether I use the full path, backslashes or forward slashes, or an r prefix for a raw string. Is the problem something to do with pandas and numpy being in different locations? I have no clue. Please, if you have any ideas I am all ears and would love nothing more than to get to the bottom of this. Thanks everyone.
If it helps, this is the full error message I receive...
Traceback (most recent call last):
File "C:\Users\Pat Oaks\Documents\txt_files\Thonny\lib\site-packages\thonny\workbench.py", line 1449, in event_generate
handler(event)
File "C:\Users\Pat Oaks\Documents\txt_files\Thonny\lib\site-packages\thonny\assistance.py", line 138, in handle_toplevel_response
self._explain_exception(msg["user_exception"])
File "C:\Users\Pat Oaks\Documents\txt_files\Thonny\lib\site-packages\thonny\assistance.py", line 178, in _explain_exception
+ _error_helper_classes["*"]
File "C:\Users\Pat Oaks\Documents\txt_files\Thonny\lib\site-packages\thonny\assistance.py", line 176, in <listcomp>
for helper_class in (
File "C:\Users\Pat Oaks\Documents\txt_files\Thonny\lib\site-packages\thonny\plugins\stdlib_error_helpers.py", line 555, in __init__
super().__init__(error_info)
File "C:\Users\Pat Oaks\Documents\txt_files\Thonny\lib\site-packages\thonny\assistance.py", line 478, in __init__
self.last_frame_module_source = read_source(self.last_frame.filename)
File "C:\Users\Pat Oaks\Documents\txt_files\Thonny\lib\site-packages\thonny\common.py", line 252, in read_source
with tokenize.open(filename) as fp:
File "C:\Users\Pat Oaks\Documents\txt_files\Thonny\lib\tokenize.py", line 447, in open
buffer = _builtin_open(filename, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'pandas\\_libs\\index.pyx'
UPDATE: due to a more patient man than I actually reading the error message, I realize the issue is most likely with the pandas installation. Install of pandas via conda install pandas failed saying 'the specified procedure could not be found'. Might this have something to do with the issue? Anybody seen this before?

As the comments have said, clearly the missing file is one of pandas', not the file you are trying to read.
Try forcing a reinstall of pandas:
pip install -I pandas
or, if using Anaconda:
conda install pandas --force-reinstall
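Before reinstalling, it can help to check which installation Python would actually import each package from, since the question asks whether pandas and numpy might live in different locations. The following is a minimal sketch using only the standard library; `locate_package` is a hypothetical helper name, not part of pandas or numpy, and it only reports a location without importing the package.

```python
import importlib.util


def locate_package(name):
    """Return the file a top-level package would be loaded from, or None if not found.

    Hypothetical helper for illustration; it does not import the package itself.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # Compare where the two libraries from the question would be loaded from.
    for pkg in ("numpy", "pandas"):
        print(pkg, "->", locate_package(pkg))
```

If the two paths point into different environments, the reinstall command probably needs to be run with the same interpreter that the editor uses.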

Related

How to export .csv file from python and using pandas DataFrame

I am trying to export some filtered data from Python using Pandas DF to .csv file (Personal Learning project)
Code : df5.to_csv(r'/C:/Users/j/Downloads/data1/export.csv')
Error:
Traceback (most recent call last):
File "C:\Users\jansa\PycharmProjects\bbb\main.py", line 62, in <module>
df5.to_csv(r'/C:/Users/jansa/Downloads/data1/export.csv')
File "C:\Users\jansa\PycharmProjects\bbb\venv\lib\site-packages\pandas\core\generic.py", line 3551, in to_csv
return DataFrameRenderer(formatter).to_csv(
File "C:\Users\jansa\PycharmProjects\bbb\venv\lib\site-packages\pandas\io\formats\format.py", line 1180, in to_csv
csv_formatter.save()
File "C:\Users\jansa\PycharmProjects\bbb\venv\lib\site-packages\pandas\io\formats\csvs.py", line 241, in save
with get_handle(
File "C:\Users\jansa\PycharmProjects\bbb\venv\lib\site-packages\pandas\io\common.py", line 697, in get_handle
check_parent_directory(str(handle))
File "C:\Users\jansa\PycharmProjects\bbb\venv\lib\site-packages\pandas\io\common.py", line 571, in check_parent_directory
raise OSError(rf"Cannot save file into a non-existent directory: '{parent}'")
OSError: Cannot save file into a non-existent directory: '\C:\Users\jansa\Downloads\data1'
I am researching, but cannot pinpoint the error.
Try
df.to_csv(r'C:\path\to\directory\filename.csv')
Generally, the path separator is '/' on Linux/Mac and '\' on Windows. An absolute path starts with '/' on Linux/Mac, while on Windows it starts with a drive letter such as C:\. The leading '/' in your path is why pandas looks for the non-existent directory '\C:\Users\jansa\Downloads\data1'. So passing r'C:\Users\j\Downloads\data1\export.csv' to to_csv will resolve your issue.
To avoid this kind of problem, you can build paths with os.path.join:
import os
path = os.path.join('.', 'export.csv') #will save the file in current directory
Also, this returns the os path separator:
print(os.sep)
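To see why the leading slash matters, here is a small sketch that applies Windows path rules to both spellings of the path. It uses the standard-library `ntpath` module, which can be imported on any operating system; the variable names are illustrative only.

```python
import ntpath  # Windows path rules, importable on any OS

bad = r'/C:/Users/j/Downloads/data1/export.csv'   # path from the question
good = r'C:/Users/j/Downloads/data1/export.csv'   # same path without the leading slash

# splitdrive separates the drive letter (if any) from the rest of the path.
bad_drive, bad_rest = ntpath.splitdrive(bad)
good_drive, good_rest = ntpath.splitdrive(good)

print(repr(bad_drive))   # no drive is recognised when the path starts with '/'
print(repr(good_drive))  # the drive letter is recognised
```

With the leading slash, no drive is recognised, so "C:" is treated as an ordinary folder name under the root of the current drive.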

Add world street map to basemap

I am having problems with basemap - arcgisimage function. Sample code below
...
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
from PIL import Image
m = Basemap(
llcrnrlat=40.361369, llcrnrlon=-80.0955278,
urcrnrlat=40.501368, urcrnrlon=-79.865723,
epsg = 2272
)
#m.arcgisimage(service='ESRI_StreetMap_World_2D', xpixels=7000, verbose=True)
m.arcgisimage(service='World_Physical_Map', xpixels=7000, ypixels=None, dpi=96,verbose=True)
#m.arcgisimage(service='ESRI_Imagery_World_2D', xpixels=7000, verbose=True)
plt.show()
...
When I run this, the arcgisimage() function crashes in PIL with this error message:
Traceback (most recent call last):
File "C:\Machine Learning\Geospatial\pittsburgh_map.py", line 11, in <module>
m.arcgisimage(service='World_Physical_Map', xpixels=7000, ypixels=None, dpi=96,verbose=True)
File "C:\Users\peter\AppData\Local\Programs\Python\Python38\lib\site-packages\mpl_toolkits\basemap\__init__.py", line 4263, in arcgisimage
return self.imshow(imread(urlopen(basemap_url)),ax=ax,
File "C:\Users\peter\AppData\Local\Programs\Python\Python38\lib\site-packages\matplotlib\image.py", line 1490, in imread
with img_open(fname) as image:
File "C:\Users\peter\AppData\Local\Programs\Python\Python38\lib\site-packages\PIL\ImageFile.py", line 121, in __init__
self._open()
File "C:\Users\peter\AppData\Local\Programs\Python\Python38\lib\site-packages\PIL\PngImagePlugin.py", line 692, in _open
cid, pos, length = self.png.read()
File "C:\Users\peter\AppData\Local\Programs\Python\Python38\lib\site-packages\PIL\PngImagePlugin.py", line 162, in read
pos = self.fp.tell()
io.UnsupportedOperation: seek
I had the same problem; most probably you installed basemap using conda. I uninstalled the basemap module and reinstalled it with pip, and then everything worked normally.
If you followed the basemap installation instructions, then any issues you have with running basemap will probably remain unfixed, as it is deprecated in favor of cartopy: https://github.com/matplotlib/basemap.
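The traceback ends in io.UnsupportedOperation: seek, raised when PIL calls tell() on the object returned by urlopen, which suggests the image reader was handed a stream that cannot seek. A general workaround for that kind of error, not confirmed by either answer above, is to read the bytes into an in-memory buffer first. The sketch below is standard library only; `NonSeekableStream` and `to_seekable` are hypothetical names, and the stream class merely stands in for a network response.

```python
import io


class NonSeekableStream(io.RawIOBase):
    """Hypothetical stand-in for a network response: readable but not seekable."""

    def __init__(self, payload):
        super().__init__()
        self._inner = io.BytesIO(payload)

    def readable(self):
        return True

    def seekable(self):
        return False

    def readinto(self, buffer):
        # Delegate the actual reading to the in-memory payload.
        return self._inner.readinto(buffer)


def to_seekable(stream):
    """Copy everything from a stream into a seekable in-memory buffer."""
    return io.BytesIO(stream.read())


raw = NonSeekableStream(b"image bytes")
try:
    raw.tell()
    tell_failed = False
except io.UnsupportedOperation:
    tell_failed = True  # same exception type as in the traceback

buffered = to_seekable(NonSeekableStream(b"image bytes"))
print(tell_failed, buffered.seekable(), buffered.tell())
```

In the basemap case the call sits inside the library itself, so this only illustrates the mechanism; the practical fix reported above was reinstalling with pip.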

Python Pandas df is not defined

I have a problem with a script I wrote a while back. A couple of months ago it worked fine without problems; however, since then the OS has been updated.
The script works fine until it tries to create a dataframe with pandas:
import os
import pandas as pd
import matplotlib.pyplot as plt
dir_input = '/home/xxx/xxx/xxx/Script/input/'
osdir = []
alldir = []
for all_files in os.listdir(dir_input):
    alldir.append(all_files)
for file in os.listdir(dir_input): #Adds all the specified files to the list osdir
    if file.endswith('.xlsx'):
        osdir.append(file)
        print("Found {0}".format(file))
for filename in osdir:
    (fileroot, extension) = os.path.splitext(filename)
    print 'Processing file...'
    print fileroot
    print ''
    # pandas works with so called dataframes to import the data. Since I dont need all the columns we only use column d,f and j
    df = pd.read_excel(dir_input+filename,parse_cols="D,F,J", index=df.index)
...
The error I get using spyder
Traceback (most recent call last):
File "<ipython-input-5-2cf9c86bcb8c>", line 1, in <module>
runfile('/home/xxx/python_scripts/xpos-frame-mean_batch_v1.1.py', wdir='/home/cdoering/python_scripts')
File "/home/xxx/anaconda/lib/python2.7/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 682, in runfile
execfile(filename, namespace)
File "/home/xxx/anaconda/lib/python2.7/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 78, in execfile
builtins.execfile(filename, *where)
File "/home/xxx/python_scripts/script.py", line 54, in <module>
df = pd.read_excel(dir_input+filename,parse_cols="D,F,J", index=df.index)
NameError: name 'df' is not defined
My feeling is there is something wrong with pandas, maybe? I uninstalled it using conda and reinstalled it. Tried uninstalling with pip, but never used pip to install it so it couldn't find it. I am at a loss.
As @EdChum said in their comment, the problem is 'referencing the index prior to creation'. Specifically, index=df.index refers to the index attribute of df, but the arguments of a call are evaluated before its result is assigned, so at that point the name df does not exist yet. Removing the index=df.index argument from the call should avoid the NameError.
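The evaluation order can be shown without pandas. In this minimal sketch, `read_rows` is a hypothetical stand-in for pd.read_excel and `table` plays the role of df; neither name comes from the question.

```python
def read_rows(text, index=None):
    """Hypothetical stand-in for pd.read_excel: split comma-separated lines into rows."""
    rows = [line.split(",") for line in text.splitlines()]
    if index is None:
        index = list(range(len(rows)))  # default index, one entry per row
    return {"rows": rows, "index": index}


data = "a,b\nc,d"

# Broken: the argument table["index"] is evaluated before `table` is assigned.
try:
    table = read_rows(data, index=table["index"])
    error_message = None
except NameError as exc:
    error_message = str(exc)

# Working: create the object first; its index exists only afterwards.
table = read_rows(data)

print(error_message)
print(table["index"])
```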

pandas HDFStore - how to reopen?

I created a file by using:
store = pd.HDFStore('/home/.../data.h5')
and stored some tables using:
store['firstSet'] = df1
store.close()
I closed down python and reopened in a fresh environment.
How do I reopen this file?
When I go:
store = pd.HDFStore('/home/.../data.h5')
I get the following error.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/misc/apps/linux/python-2.6.1/lib/python2.6/site-packages/pandas-0.10.0-py2.6-linux-x86_64.egg/pandas/io/pytables.py", line 207, in __init__
self.open(mode=mode, warn=False)
File "/misc/apps/linux/python-2.6.1/lib/python2.6/site-packages/pandas-0.10.0-py2.6-linux-x86_64.egg/pandas/io/pytables.py", line 302, in open
self.handle = _tables().openFile(self.path, self.mode)
File "/apps/linux/python-2.6.1/lib/python2.6/site-packages/tables/file.py", line 230, in openFile
return File(filename, mode, title, rootUEP, filters, **kwargs)
File "/apps/linux/python-2.6.1/lib/python2.6/site-packages/tables/file.py", line 495, in __init__
self._g_new(filename, mode, **params)
File "hdf5Extension.pyx", line 317, in tables.hdf5Extension.File._g_new (tables/hdf5Extension.c:3039)
tables.exceptions.HDF5ExtError: HDF5 error back trace
File "H5F.c", line 1582, in H5Fopen
unable to open file
File "H5F.c", line 1373, in H5F_open
unable to read superblock
File "H5Fsuper.c", line 334, in H5F_super_read
unable to find file signature
File "H5Fsuper.c", line 155, in H5F_locate_signature
unable to find a valid file signature
End of HDF5 error back trace
Unable to open/create file '/home/.../data.h5'
What am I doing wrong here? Thank you.
In my hands, the following approach works best:
df = pd.DataFrame(...)
# write
with pd.HDFStore('test.h5', mode='w') as store:
    store.append('df', df, data_columns= df.columns, format='table')
# read
with pd.HDFStore('test.h5', mode='r') as newstore:
    df_restored = newstore.select('df')
You could try doing instead:
store = pd.io.pytables.HDFStore('/home/.../data.h5')
df1 = store['firstSet']
or use the read method directly:
df1 = pd.read_hdf('/home/.../data.h5', 'firstSet')
Either way, you should have pandas 0.12.0 or higher...
I had the same problem and finally fixed it by installing the pytables module (next to the pandas modules which I was using):
conda install pytables
which got me numexpr-2.4.3 and pytables-3.2.0
After that it worked. I am using pandas 0.16.2 under python 2.7.9
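The error in the question, "unable to find a valid file signature", suggests that the file on disk is not recognised as HDF5 at all (for example because it is empty or was not written completely). One way to check this independently of pandas and PyTables is to look at the first bytes of the file. The sketch below is a diagnostic idea only, not part of either library; `looks_like_hdf5` is a hypothetical helper, and it checks only offset 0 even though the format also allows the signature to appear after a user block.

```python
import os
import tempfile

HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # first 8 bytes of a standard HDF5 file


def looks_like_hdf5(path):
    """Return True if the file starts with the HDF5 signature.

    Simplification: only offset 0 is checked.
    """
    with open(path, "rb") as fh:
        return fh.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE


# Demonstration with two throwaway files (contents are made up for illustration).
with tempfile.TemporaryDirectory() as tmp:
    valid = os.path.join(tmp, "valid.h5")
    invalid = os.path.join(tmp, "invalid.h5")
    with open(valid, "wb") as fh:
        fh.write(HDF5_SIGNATURE + b"\x00" * 16)
    with open(invalid, "wb") as fh:
        fh.write(b"not an hdf5 file")
    valid_result = looks_like_hdf5(valid)
    invalid_result = looks_like_hdf5(invalid)

print(valid_result, invalid_result)
```

If the check fails on the real data.h5, the file itself is likely the problem rather than the way it is being reopened.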

bundle_files = 1 fails with py2exe using matplotlib

I am trying to create a standalone application using py2exe that depends on matplotlib and numpy. The code of the application is this:
import numpy as np
import pylab as plt
plt.figure()
a = np.random.random((16,16))
plt.imshow(a,interpolation='nearest')
plt.show()
The setup code for py2exe (modified from http://www.py2exe.org/index.cgi/MatPlotLib) is this:
from distutils.core import setup
import py2exe
import sys
sys.argv.append('py2exe')
opts = {
'py2exe': {"bundle_files" : 3,
"includes" : [ "matplotlib.backends",
"matplotlib.backends.backend_qt4agg",
"pylab", "numpy",
"matplotlib.backends.backend_tkagg"],
'excludes': ['_gtkagg', '_tkagg', '_agg2',
'_cairo', '_cocoaagg',
'_fltkagg', '_gtk', '_gtkcairo', ],
'dll_excludes': ['libgdk-win32-2.0-0.dll',
'libgobject-2.0-0.dll']
}
}
setup(console=[{"script" : "matplotlib_test.py"}],
zipfile=None,options=opts)
Now, when bundle_files is set = 3 or is absent, all works fine, but the resulting exe cannot be distributed to a machine that is not configured with the same version of Python, etc. If I set bundle_files = 1, it creates a suitably large exe file that must have everything bundled, but it fails to run locally or distributed. In this case, I'm creating everything on a Windows 7 machine with Python 2.6.6 and trying to run locally and on an XP machine with Python 2.6.4 installed.
The errors I get when running on the XP machine seem strange since, without bundling, I get no errors on Win 7. With bundling, Win 7 does not report the traceback information, so I cannot be sure the errors are the same. In any case, here's the error message on XP:
Traceback (most recent call last):
File "matplotlib_test.py", line 2, in <module>
File "zipextimporter.pyc", line 82, in load_module
File "pylab.pyc", line 1, in <module>
File "zipextimporter.pyc", line 82, in load_module
File "matplotlib\__init__.pyc", line 709, in <module>
File "matplotlib\__init__.pyc", line 627, in rc_params
File "matplotlib\__init__.pyc", line 565, in matplotlib_fname
File "matplotlib\__init__.pyc", line 240, in wrapper
File "matplotlib\__init__.pyc", line 439, in _get_configdir
RuntimeError: Failed to create C:\Documents and Settings\mnfienen/.matplotlib; consider setting MPLCONFIGDIR to a writable directory for matplotlib configuration data
Many thanks in advance if anyone can point me in a direction that will fix this!
EDIT 1:
I followed William's advice and fixed the problem with MPLCONFIGDIR, but now get a new error:
Traceback (most recent call last):
File "matplotlib\__init__.pyc", line 479, in _get_data_path
RuntimeError: Could not find the matplotlib data files
EDIT 2:
I fixed the data files problem by using:
data_files=matplotlib.get_py2exe_datafiles()
This leads to a new error:
Traceback (most recent call last):
File "matplotlib_test.py", line 5, in <module>
import matplotlib.pyplot as plt
File "matplotlib\pyplot.pyc", line 78, in <module>
File "matplotlib\backends\__init__.pyc", line 25, in pylab_setup
ImportError: No module named backend_wxagg
I had the same problem. I think the problem was caused by pylab in matplotlib, py2exe seemed to have trouble finding and getting all the backends associated with pylab.
I got around the problem by changing all my embedded plots to use matplotlib.figure instead of pylab. Here's a simple example on how to make a plot with matplotlib.figure:
import matplotlib.figure as fg
import numpy as np
fig = fg.Figure()
ax = fig.add_subplot(111)
lines = ax.plot(range(10), np.random.randn(10), range(10), np.random.randn(10))
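You cannot use fig.show() directly with this, but it can be embedded in GUIs. I used Tkinter:
from matplotlib.backends.backend_tkagg import FigureCanvasTkAgg
canvas = FigureCanvasTkAgg(fig, canvas_master)  # canvas_master is the parent Tk widget
canvas.show()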
Well Misha Fienen, I guess it seems to be failing to write to your user folder, which you probably already knew. Just a stab in the dark but have you tried testing what happens if you follow the advice and change MPLCONFIGDIR to something a bit more basic (eg. "C:\matlibplotcfg\")?
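One way to follow the MPLCONFIGDIR advice is to set the environment variable from the script itself before matplotlib is imported. This is a minimal sketch under the assumption that any writable directory is acceptable; a fresh temporary directory is used purely for illustration, and the name `config_dir` is not from the question.

```python
import os
import tempfile

# Assumption: any writable directory will do; a temp directory is used here for illustration.
config_dir = tempfile.mkdtemp(prefix="mplconfig_")
os.environ["MPLCONFIGDIR"] = config_dir

# matplotlib looks up MPLCONFIGDIR when it is imported, so the import must come afterwards:
# import matplotlib

print(os.environ["MPLCONFIGDIR"])
```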
There are two ways of solving the problem.
1.- In your matplotlibrc file use:
backend : TkAgg
2.- Alternatively, in your setup.py "includes" key add:
"matplotlib.backends.backend_wxagg"
Both ways produce the test figure in Python 2.6 on Windows XP.
