pandas read_csv working only as root user - python

I am reading a csv file using pandas. It works fine if I run script as root user. But when I try to run it with different user it does not read data and gives:
error : KeyError: 'no item named 0'
it appears at:
dt = pd.read_csv('rt.csv', header=None).fillna('').set_index(0).to_dict()[1]
Btw, I am working on Ubuntu 12.02 and using anaconda, which is installed in root user and other user as well (which is giving error)
Please help.

You like have different pandas versions installed as user and root.
I get the same error with version 0.16.2 when I use the wrong delimiter.
Have a look at your data in rt.csv.
For example, this would work for a whitespace-delimited file:
dt = pd.read_csv('rt.csv', header=None,
delim_whitespace=True).fillna('').set_index(0).to_dict()[1]
Check the file and adapt the delimiter accordingly.

Related

ERROR:zygote_host_impl_linux.cc(89) - Chartify

I'm trying the new library(Chartify) provided by Spotify Team. On running the code below, I'm receiving the following error:
import chartify
import pandas as pd
file = "./data/Social_Network_Ads.csv"
data = pd.read_csv(file, sep = ',')
chart = chartify.Chart(blank_labels=True, y_axis_type='categorical', x_axis_type='linear')
chart.plot.scatter(
data_frame=data,
categorical_columns='Gender',
numeric_column='EstimatedSalary',
color_column='EstimatedSalary')
chart.style.color_palette.reset_palette_order()
chart.set_title("Scatter Plot w.r.t. Salaries of different Gender")
chart.set_subtitle("Labels for specific observations.")
chart.show()
[9643:9643:1127/175201.738360:ERROR:zygote_host_impl_linux.cc(89)] Running as root without --no-sandbox is not supported. See https://crbug.com/638180.
The HTML is being created though, but on opening the HTML, It gives a blank page.
An old question, but if you are facing similar issues... That message is related to the OS while executing X tool.
As a workaround, this helped me to solve the issue on a CentOS 7.7 while trying to execute different binaries!
export QTWEBENGINE_DISABLE_SANDBOX=1

pandas.read_csv cant find my path error

So I tried to run this code.
import pandas
i = input("hi input a csv file..")
df = pandas.read_csv(i)
and I got an error saying
FileNotFoundError: File b'"C:\\Users\\thomas.swenson\\Downloads\\hi.csv"' does not exist
but then if I hard code that path that 'doesn't exist' into my program it works fine.
import pandas
df = pandas.read_csv("C:\\Users\\thomas.swenson\\Downloads\\hi.csv")
it works just fine.
Anyone know why this may be happening?
I'm running python 3.6 and using a virtualenv
looks like the input function was placing another set of quotes around the input.
so ill just have to remove them and it works fine.

Failing to open an Excel file with Python

I'm on a Debian GNU/Linux computer, working with Python 2.7.9.
As a part of my job, I have been making python scripts that read inputs in various formats (e.g. Excel, Csv, Txt) and parse the information to more standarized files. It's not my first time opening or working with Excel files.
There's a particular file which is giving me problems, I just can't open it. When I tried with xlrd (version 0.9.3), it gave me the following error:
xlrd.open_workbook('sample.xls')
XLRDError: Unsupported format, or corrupt file: BOF not
workbook/worksheet: op=0x0009 vers=0x0002 strm=0x000a build=0 year=0
-> BIFF21
I tried to investigate the matter on my own, found a couple of answers in StackOverflow but I couldn't open it anyway. This particular answer I found may be the problem (the second explanation), but it doesn't include a workaround: https://stackoverflow.com/a/16518707/4345659
A tool that could conert the file to csv/txt would also solve the problem.
I already tried with:
xlrd
openpyxl
xlsx2csv (the shell tool)
A sample file is available here:
https://ufile.io/r4m6j
As a side note, I can open it with LibreOffice Calc and MS Excel, so I could eventually change it to csv that way. The thing is, I need to do it all with a python script.
Thanks in advance!
It seems like MS Problem. The xls file is very strange, maybe you should contact xlrd support.
But I have a crazy workaround for you: xls2ods. It works for me even though xls2csv doesn't (SiC!).
So, install catdoc first:
$sudo apt-get install catdoc
Then convert your xls file to ods and open ods using pyexcel_ods or whatever you prefer. To use pyexcel_ods install it first using pip install pyexcel_ods.
import subprocess
from pyexcel_ods import get_data
file_basename = 'sample'
returncode = subprocess.call(['xls2ods', '{}.xls'.format(file_basename)])
if returnecode > 0:
# consider to use subprocess.Popen if you need more control on stderr
exit(returncode)
data = get_data('{}.ods'.format(file_basename))
print(data)
I'm getting following output:
OrderedDict([(u'sample',
[[u'labo',
u'codfarm',
u'farmacia',
u'direccion',
u'localidad',
u'nom_medico',
u'matricula',
u'troquel',
u'producto',
u'cant_total']])])
Here is a kludge I would use:
Assuming you have LibreOffice on Debian, you could either convert all your *.xls files into *.csv using:
import os
os.system("libreoffice --headless --convert-to csv *.xls")
#or use os.call
... and then work consistently with csv.
Or you could convert only the corrupted file(s) when needed using a try/except block:
import os
try:
xlrd.open_workbook('sample.xls')
except XLRDError:
os.system("libreoffice --headless --convert-to csv sample.xls")
# mycsv = open("sample.csv", "r")
# for line in mycsv.readlines():
# ...
# ...
OBS: Keep LibreOffice closed while running the script.
Alternatively there are other tools out there to do the conversion. Here is one (which I have not tested): https://github.com/dilshod/xlsx2csv
If you are targeting windows, if you have Excel installed, and if you are familiar with Excel VBA, you will have a quick solution using the comtypes package:
http://pythonhosted.org/comtypes/
You will have direct access to Excel by its COM interfaces.
This code open an xls file and saves it as a cvs file, using the comtypes package:
import comtypes.client as cl
progId = "Excel.Application.15"
xl = cl.CreateObject(progId)
wb = xl.Workbooks.Open(r"C:\Users\aUser\Desktop\thermoList.xls")
wb.SaveAs(r"C:\Users\aUser\Desktop\thermoList.csv",FileFormat=6)
xl.DisplayAlerts = False
xl.Quit()
I could not test it with "sample.xls" which is corrupt.
Your could try with another file.
You might need to adjust the progId according to your version of Excel.
It's a file format issue. I'm not sure what file type is it but it's not Excel. I just open and saved the file with sample2.xls name and compare the types:
How are you creating this file?
If you need to get the words as a list of strings:
text_file = open("sample.xls", "r")
lines = text_file.read().replace(chr(200), '').replace(chr(0), '').replace(chr(1), '').replace(chr(5), '').replace(chr(2), '').replace(chr(3), '').replace(chr(4), '').replace(chr(6), '').replace(chr(7), '').replace(chr(8), '').replace(chr(9), '').replace(chr(10), '').replace(chr(12), '').replace(chr(15), '').replace(chr(16), '').replace(chr(17), '').replace(chr(18), '').replace(chr(49), '').replace('Arial', '')
for line in lines.split(chr(128)):
print(line)
the output:
The file you provided is corrupted, so there is no way for other responders to test it and recommend a good solution. And exception you posted confirming that.
As a solution you can try to debug some things, please see some steps below:
You mentioned you tried the xlrd library. Try to check if your xlrd module is upto date by executing this:
Python 2.7.9
>>> import xlrd
>>> xlrd.__VERSION
update to the latest official version if needed
Try to open any other *.xls file and see if it works with Python version you're using and current library.
Check module documentation it's pretty good, and there are some different things described how to use this module on various platforms( Win vs. Linux)http://xlrd.readthedocs.io/en/latest/dates.html
You always can rich out to the community (there is still a chance that you might be getting into some weird state or bug) the link is here https://github.com/python-excel/xlrd/issues
Hope that helps.
Unable to open your Excel either. Just as yadayada said, I think it is the problem of data source. If you really want to figure out the reason, I suggest you ask questions about the excel instead of python.
It's always work for me with any xls or xlsx files:
def csv_from_excel(filename_xls, filename_csv):
wb = xlrd.open_workbook(filename_xls, encoding_override='YOUR_ENCODING_HERE (f.e. "cp1251"')
sh = wb.sheet_by_index(0)
your_csv_file = open(filename_csv, 'wb')
wr = unicodecsv.writer(your_csv_file)
for rownum in xrange(sh.nrows):
wr.writerow(sh.row_values(rownum))
your_csv_file.close()
So, i don't work directly with excel file before convert them to csv. Mb it will help you

Tableau SDK TableException (40200)

Issue: Error being thrown: tableausdk.Exceptions.TableauException: TableauException (40200): The system cannot find the path specified.
- OS::mkdir(CreateDirectory path="C:\PATH\Tableau-SDK\tdetmp2A0E0E5E")
I am attempting to to create a tableau extract from oracle data using python and the tableauSDK.
The code seems to run correctly if the extract already exists. (although the produced tde is unreadable)
According to the Tableau community I should be able to create an extract from any source data without the extract already existing...
Any idea on why this is occuring?
tde_path = r'C:\PATH\test.tde'
tde_file = Extract(path=tde_path) ## ERROR Thrown here
The reason now seems obvious...
The error had the answer :
OS::mkdir(CreateDirectory path="C:\PATH\Tableau-SDK\tdetmp2A0E0E5E")
To solve the issue :
The Directory C:\PATH\Tableau-SDK\ did not exist.
Created the Directory and the code ran without error.

Error: Line magic function

I'm trying to read a file using python and I keep getting this error
ERROR: Line magic function `%user_vars` not found.
My code is very basic just
names = read_csv('Combined data.csv')
names.head()
I get this for anytime I try to read or open a file. I tried using this thread for help.
ERROR: Line magic function `%matplotlib` not found
I'm using enthought canopy and I have IPython version 2.4.1. I made sure to update using the IPython installation page for help. I'm not sure what's wrong because it should be very simple to open/read files. I even get this error for opening text files.
EDIT:
I imported traceback and used
print(traceback.format_exc())
But all I get is none printed. I'm not sure what that means.
Looks like you are using Pandas. Try the following (assuming your csv file is in the same path as the your script lib) and insert it one line at a time if you are using the IPython Shell:
import pandas as pd
names = pd.read_csv('Combined data.csv')
names.head()

Categories

Resources