Converting xls to csv in Python 3 using xlrd

I'm using Python 3.3 with xlrd and csv modules to convert an xls file to csv. This is my code:
import xlrd
import csv

def csv_from_excel():
    wb = xlrd.open_workbook('MySpreadsheet.xls')
    sh = wb.sheet_by_name('Sheet1')
    your_csv_file = open('test_output.csv', 'wb')
    wr = csv.writer(your_csv_file, quoting=csv.QUOTE_ALL)
    for rownum in range(sh.nrows):
        wr.writerow(sh.row_values(rownum))
    your_csv_file.close()
With that I am receiving this error: TypeError: 'str' does not support the buffer interface
I tried changing the encoding and replaced the line within the loop with this:
wr.writerow(bytes(sh.row_values(rownum),'UTF-8'))
But I get this error: TypeError: encoding or errors without a string argument
Anyone know what may be going wrong?

Try opening the output file in text mode ('w') instead of binary mode ('wb'):
import xlrd
import csv

def csv_from_excel():
    wb = xlrd.open_workbook('MySpreadsheet.xls')
    sh = wb.sheet_by_name('Sheet1')
    # Text mode: csv.writer writes str, not bytes, in Python 3
    your_csv_file = open('output.csv', 'w', encoding='utf8', newline='')
    wr = csv.writer(your_csv_file, quoting=csv.QUOTE_ALL)
    for rownum in range(sh.nrows):
        wr.writerow(sh.row_values(rownum))
    your_csv_file.close()

I recommend using the pandas library for this task:
import pandas as pd
xls = pd.ExcelFile('file.xlsx')
# The keyword is sheet_name in current pandas (older releases used sheetname)
df = xls.parse(sheet_name="Sheet1", index_col=None, na_values=['NA'])
df.to_csv('file.csv')

Your problem is basically that you open your file with Python 2 semantics. In Python 3, csv.writer writes str rather than bytes, so a file opened with 'wb' rejects its output. Python 3 is also locale-aware, so if you just want to write text to this file (and you do), open it as a text file with the right options:
your_csv_file = open('test_output.csv', 'w', encoding='utf-8', newline='')
The encoding parameter specifies the output encoding (it does not have to be utf-8), and the Python 3 documentation for csv expressly says that you should specify newline='' for csv file objects.
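
The difference between the two modes can be reproduced without a spreadsheet. The following is a minimal sketch, assuming a temporary directory and made-up row values (the file name demo.csv and the rows are illustrative, not taken from the question). It shows that csv.writer fails on a binary-mode file in Python 3 and works on a text-mode file opened with newline=''.

```python
import csv
import os
import tempfile

# Illustrative rows standing in for sh.row_values(rownum); not real data.
rows = [["Name", "Amount"], ["Zoë", 12.5]]

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "demo.csv")  # hypothetical output name

# Binary mode: csv.writer produces str, which a bytes file rejects.
binary_mode_failed = False
with open(path, "wb") as f:
    try:
        csv.writer(f).writerow(rows[0])
    except TypeError:
        binary_mode_failed = True

# Text mode with an explicit encoding and newline='', as the csv docs advise.
with open(path, "w", encoding="utf-8", newline="") as f:
    csv.writer(f, quoting=csv.QUOTE_ALL).writerows(rows)

# Read the file back to confirm what was written.
with open(path, encoding="utf-8", newline="") as f:
    written = list(csv.reader(f))
```

Note that the float 12.5 comes back as the string '12.5', since csv stores everything as text.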

A quicker way to do it with pandas:
import pandas as pd
xls_file = pd.read_excel('MySpreadsheet.xls', sheet_name="Sheet1")
# index=False keeps pandas from writing the DataFrame's row index as an extra first column
xls_file.to_csv('MySpreadsheet.csv', index=False)
You can read more in the pandas.read_excel documentation.

Related

Converting CSV file to .xlsx file

I am trying to convert a CSV file to a .xlsx file, where the source CSV file is saved on my Desktop. I want the output file to be saved to my Desktop.
I have tried the below code. However, I am getting a 'file not found' error and 'create the parser' error. I do not know what these errors mean.
I seek:
Help to fix the script and
Help understanding the causes of the problem.
import pandas as pd
read_file = pd.read_csv(r'C:\Users\anthonyedwards\Desktop\credit_card_input_data.csv')
read_file.to_excel(r'C:\Users\anthonyedwards\Desktop\credit_card_output_data.xlsx', index = None, header=True)

Here's an example using xlsxwriter:
import os
import glob
import csv
from xlsxwriter.workbook import Workbook

for csvfile in glob.glob(os.path.join('.', 'file.csv')):
    workbook = Workbook(csvfile[:-4] + '.xlsx')
    worksheet = workbook.add_worksheet()
    with open(csvfile, 'rt', encoding='utf8') as f:
        reader = csv.reader(f)
        for r, row in enumerate(reader):
            for c, col in enumerate(row):
                worksheet.write(r, c, col)
    workbook.close()
FYI, there is also a package called openpyxl, that can read/write Excel 2007 xlsx/xlsm files.

What's the equivalent of reader = csv.reader(...) for xlsx sheets?

I have a script with
path = r"mypath\myfile.xlsx"
with open(path) as f:
    reader = csv.reader(f)
but it won't work because the code is trying to open an xlsx file with a module made for csv files.
So, does an expression equivalent for xlsx files exist?

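The closest equivalent for xlsx sheets is pandas.read_excel. Pass it the path directly rather than a file opened in text mode; it returns a DataFrame instead of a row iterator:
import pandas as pd
path = r"mypath\myfile.xlsx"
reader = pd.read_excel(path)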

Converting xlsm to csv

I have this simple snippet which used to work well until today. It converts xlsm files into csv.
import csv
import xlrd

workbook = xlrd.open_workbook('T:/DataDump/3.26.17.xlsm')
for sheet in workbook.sheets():
    with open('{}.csv'.format(sheet.name), 'w') as f:
        writer = csv.writer(f)
        writer.writerows(sheet.row_values(row) for row in range(sheet.nrows))
print("CSV converted")
xlsm file:
Name Date Status
Python 12/15/2014 Manager
Pandas 10/17/2014 Senior
csv file:
Name Date Status
Python 12/15/2014 Manager
Pandas 10/17/2014 Senior
This snippet is providing me with the csv but with double spaces between the rows. How can I fix this please?

In Python 3, open f with the additional parameter newline='':
    with open('{}.csv'.format(sheet.name), 'w', newline='') as f:
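
The reason this helps is that csv.writer already ends each row with '\r\n', and a text file opened without newline='' may translate the '\n' a second time. The sketch below is an illustration under assumptions: the rows are made up, the helper raw_bytes_written is hypothetical, and newline='\r\n' is used to mimic the default translation on Windows so the effect shows on any platform.

```python
import csv
import os
import tempfile

rows = [["Name", "Status"], ["Python", "Manager"]]  # illustrative data

def raw_bytes_written(newline):
    """Write rows with csv.writer and return the exact bytes on disk."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        with open(path, "w", newline=newline) as f:
            csv.writer(f).writerows(rows)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)

# newline='\r\n' mimics the translation Windows applies by default:
# each '\n' in the writer's '\r\n' terminator becomes '\r\n' again.
translated = raw_bytes_written("\r\n")
# newline='' turns translation off, so the terminator is written once.
untranslated = raw_bytes_written("")
```

Here translated contains '\r\r\n' after every row, which spreadsheet programs and editors typically show as an empty line between rows, while untranslated contains a single '\r\n'.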

how to convert xlsx to tab delimited files

I have quite a lot of xlsx files, and converting them one by one to tab-delimited files is a pain.
I would like to know if there is a way to do this with Python. Here is what I found and what I tried, without success.
I found this and tried the solution, but it did not work: Mass Convert .xls and .xlsx to .txt (Tab Delimited) on a Mac
I also tried converting a single file to see how it works, but with no success:
#!/usr/bin/python
import xlrd
import csv

def main():
    # I open the xlsx file
    myfile = xlrd.open_workbook('myfile.xlsx')
    # I don't know the name of sheet
    mysheet = myfile.sheet_by_index(0)
    # I open the output csv
    myCsvfile = open('my.csv', 'wb')
    # I write the file into it
    wr = csv.writer(myCsvfile, delimiter="\t")
    for rownum in xrange(mysheet.nrows):
        wr.writerow(mysheet.row_values(rownum))
    myCsvfile.close()

if __name__ == '__main__':
    main()

No real need for the main function.
And not sure about your indentation problems, but this is how I would write what you have. (And should work, according to first comment above)
#!/usr/bin/python
import xlrd
import csv

# open the output csv
with open('my.csv', 'wb') as myCsvfile:
    # define a writer
    wr = csv.writer(myCsvfile, delimiter="\t")
    # open the xlsx file
    myfile = xlrd.open_workbook('myfile.xlsx')
    # get a sheet
    mysheet = myfile.sheet_by_index(0)
    # write the rows
    for rownum in xrange(mysheet.nrows):
        wr.writerow(mysheet.row_values(rownum))

Why go to so much pain when you can do it in 3 lines:
import pandas as pd
file = pd.read_excel('myfile.xlsx')
file.to_csv('myfile.txt', sep="\t", index=False)  # write to a new file, not over the xlsx input
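
If pandas is not available, the csv module alone can produce tab-delimited output from any iterable of rows. This is a minimal sketch: the helper name write_tab_delimited, the sample rows, and the choice of lineterminator="\n" are assumptions made for illustration. It also shows that a cell containing a tab gets quoted, so the file can be read back unchanged.

```python
import csv
import io

def write_tab_delimited(rows, out_file):
    """Write an iterable of rows to an open text file, tab-separated."""
    writer = csv.writer(out_file, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)

# Illustrative rows; with xlrd these could come from
# (sheet.row_values(i) for i in range(sheet.nrows)).
sample_rows = [["Name", "Note"], ["Pandas", "has a\ttab"]]

buffer = io.StringIO()
write_tab_delimited(sample_rows, buffer)
tsv_text = buffer.getvalue()

# Reading the text back with the same delimiter recovers the original cells.
round_trip = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
```

To convert many workbooks, the same helper could be called once per file inside a loop over glob.glob('*.xlsx'), opening each output with newline=''.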

Trying to convert XLS to CSV in Python

I'm trying to convert .xls to .csv, but when I run the code below nothing happens.
import xlrd
import csv

def csv_from_excel():
    wb = xlrd.open_workbook('d://Documents and Settings//tdrub//Desktop//TreinamentoPython XLS-CSV//Teste.xls')
    sh = wb.sheet_by_name('Sheet1')
    Agencia = open('d://Documents and Settings//tdrub//Desktop//Agencia.csv', 'wb')
    wr = csv.writer(Agencia, quoting=csv.QUOTE_ALL)
    for rownum in xrange(sh.nrows):
        wr.writerow(sh.row_values(rownum))
    Agencia.close()
The directory is correct and the sheet name is correct, but when I run the code no .csv file is created.
I would appreciate it if someone could help me :)

import xlrd
import csv

book = xlrd.open_workbook("F.xls")
# Python 2 needs binary mode here; on Python 3 use open('out.csv', 'w', newline='')
with open('out.csv', 'wb') as out_file:
    wr = csv.writer(out_file, quoting=csv.QUOTE_ALL)
    # Write the rows of every sheet into the same csv file
    for sheet in book.sheets():
        for row in range(sheet.nrows):
            wr.writerow(sheet.row_values(row))
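
One likely reason the script in the question appears to do nothing, assuming it is shown in full, is that csv_from_excel is defined but never called. The sketch below illustrates the pattern with a hypothetical convert function and in-memory data standing in for the workbook; nothing is written until the function is actually called.

```python
import csv
import io

def convert(rows, out_file):
    """Hypothetical stand-in for csv_from_excel: write rows as quoted CSV."""
    wr = csv.writer(out_file, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for row in rows:
        wr.writerow(row)

out = io.StringIO()
# Defining convert() above writes nothing; output appears only after the call.
before_call = out.getvalue()
convert([["Agencia", 1]], out)
after_call = out.getvalue()
```

In the original script, the equivalent fix would be to add a call to csv_from_excel() at the bottom of the file.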
