I'm trying to convert .xls to .csv, but when I run the code below nothing happens.
import xlrd
import csv
def csv_from_excel():
    wb = xlrd.open_workbook('d://Documents and Settings//tdrub//Desktop//TreinamentoPython XLS-CSV//Teste.xls')
    sh = wb.sheet_by_name('Sheet1')
    Agencia = open('d://Documents and Settings//tdrub//Desktop//Agencia.csv', 'wb')
    wr = csv.writer(Agencia, quoting=csv.QUOTE_ALL)
    for rownum in xrange(sh.nrows):
        wr.writerow(sh.row_values(rownum))
    Agencia.close()
The directory is correct and the sheet name is correct, but when I run the code no .csv file is created.
I would appreciate it if someone could help me :)
import xlrd
import csv

# Python 2: the csv module expects the output file in binary mode
out_file = open('out.csv', 'wb')
wr = csv.writer(out_file, quoting=csv.QUOTE_ALL)
book = xlrd.open_workbook("F.xls")
# Write the rows of every sheet in the workbook to the same CSV file
for sheet in book.sheets():
    for row in range(sheet.nrows):
        wr.writerow(sheet.row_values(row))
out_file.close()
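A likely reason nothing happens in the question's code: as posted, csv_from_excel() is defined but never called, so its body never runs. The sketch below shows the pattern with the csv module alone, written for Python 3. The function name csv_from_rows, the sample rows and the output file name are made up for illustration.

```python
import csv
import os
import tempfile


def csv_from_rows(rows, csv_path):
    """Write rows (a list of lists) to csv_path, quoting every field."""
    with open(csv_path, 'w', newline='') as f:
        csv.writer(f, quoting=csv.QUOTE_ALL).writerows(rows)


# Defining the function above does nothing on its own;
# the file only appears once the function is actually called.
if __name__ == '__main__':
    out_path = os.path.join(tempfile.gettempdir(), 'Agencia_demo.csv')
    csv_from_rows([['a', 'b'], [1, 2]], out_path)
    print('created:', os.path.exists(out_path))
```

In the original script the same idea would mean adding a csv_from_excel() call at the bottom of the file.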
I am trying to convert a CSV file to a .xlsx file, where the source CSV file is saved on my Desktop. I want the output file to be saved to my Desktop.
I have tried the below code. However, I am getting a 'file not found' error and 'create the parser' error. I do not know what these errors mean.
I seek:
Help to fix the script and
Help understanding the causes of the problem.
import pandas as pd
read_file = pd.read_csv(r'C:\Users\anthonyedwards\Desktop\credit_card_input_data.csv')
read_file.to_excel(r'C:\Users\anthonyedwards\Desktop\credit_card_output_data.xlsx', index = None, header=True)
Here's an example using xlsxwriter:
import os
import glob
import csv
from xlsxwriter.workbook import Workbook
for csvfile in glob.glob(os.path.join('.', 'file.csv')):
    workbook = Workbook(csvfile[:-4] + '.xlsx')
    worksheet = workbook.add_worksheet()
    with open(csvfile, 'rt', encoding='utf8') as f:
        reader = csv.reader(f)
        for r, row in enumerate(reader):
            for c, col in enumerate(row):
                worksheet.write(r, c, col)
    workbook.close()
FYI, there is also a package called openpyxl, that can read/write Excel 2007 xlsx/xlsm files.
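On the 'file not found' part of the question: that error means Python could not find a file at the path given, so it is worth checking the path before calling pandas. A minimal sketch using only pathlib; the helper name describe_path and its messages are made up for illustration.

```python
from pathlib import Path


def describe_path(path_str):
    """Return a short message saying whether path_str points at an existing file."""
    path = Path(path_str)
    if path.is_file():
        return "found: " + str(path)
    if path.parent.is_dir():
        # The folder exists, so the file name itself is probably wrong
        return "folder exists but file is missing: " + path.name
    return "folder does not exist: " + str(path.parent)
```

Calling describe_path on the input CSV path shows whether the folder or the file name is the part that does not match what is on disk.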
I have 200 files with dates in their file names. I would like to add the date from each file name as a new column in that file.
I created a macro in Python:
import pandas as pd
import os
import openpyxl
import csv
os.chdir(r'\\\\\\\')
for file_name in os.listdir(r'\\\\\\'):
    with open(file_name,'r') as csvinput:
        reader = csv.reader(csvinput)
        all = []
        row = next(reader)
        row.append('FileName')
        all.append(row)
        for row in reader:
            row.append(file_name)
            all.append(row)
with open(file_name, 'w') as csvoutput:
    writer = csv.writer(csvoutput, lineterminator='\n')
    writer.writerows(all)
if file_name.endswith('.csv'):
    workbook = openpyxl.load_workbook(file_name)
    workbook.save(file_name)
csv_filename = pd.read_csv(r'\\\\\\')
csv_data= pd.read_csv(csv_filename, header = 0)
csv_data['filename'] = csv_filename
Right now I get "InvalidFileException: File is not a zip file", and only the first file has the added column with the file name.
Can you please advise what I am doing wrong? BTW, I'm using Python 3.4.
Many thanks,
Lukasz
First problem, this section:
with open(file_name, 'w') as csvoutput:
    writer = csv.writer(csvoutput, lineterminator='\n')
    writer.writerows(all)
should be indented, to be included in the for loop. Now it is only executed once after the loop. This is why you only get one output file.
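As a rough sketch of that first fix, here is the read-and-rewrite step with the write moved inside the loop. The function name add_filename_column and the folder argument are made up for illustration, the .csv filter is my addition, and the openpyxl lines are left out.

```python
import csv
import os


def add_filename_column(folder):
    """Append a 'FileName' column to every .csv file in folder, in place."""
    for file_name in sorted(os.listdir(folder)):
        if not file_name.endswith('.csv'):
            continue
        path = os.path.join(folder, file_name)
        # Read all rows first, because the same file is overwritten below
        with open(path, 'r', newline='') as csvinput:
            rows = list(csv.reader(csvinput))
        if not rows:
            continue  # nothing to do for an empty file
        rows[0].append('FileName')
        for row in rows[1:]:
            row.append(file_name)
        # The write now happens inside the for loop, once per file
        with open(path, 'w', newline='') as csvoutput:
            csv.writer(csvoutput, lineterminator='\n').writerows(rows)
```

Extracting just the date from file_name would be one more step before row.append, depending on how the names are laid out.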
Second problem: the exception is probably caused by openpyxl.load_workbook(file_name). openpyxl can presumably only open actual Excel files (which are .zip archives with a different extension), not CSV files. Why do you want to open and save the file at all? I think you can just remove those three lines.
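To see the point about .zip files for yourself, the standard library's zipfile.is_zipfile can tell the two kinds of file apart. A small sketch; the helper name is_zip_container is hypothetical, and a True result only means "zip archive", not necessarily a valid workbook.

```python
import zipfile


def is_zip_container(path):
    """Return True if the file at path is a zip archive, as .xlsx files are."""
    return zipfile.is_zipfile(path)
```

For a plain CSV file this returns False, which matches the "File is not a zip file" message in the question.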
I have a script with
path=r"mypath\myfile.xlsx"
with open(path) as f:
    reader = csv.reader(f)
but it won't work because the code is trying to open an xlsx file with a module made for csv files.
So, does an expression equivalent for xlsx files exist?
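The equivalent of the highlighted code for xlsx sheets is:
import pandas as pd
path = r"mypath\myfile.xlsx"
# read_excel takes the path directly, so no open() call is needed;
# it returns a DataFrame rather than a row iterator
reader = pd.read_excel(path)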
I have a file which can be either a CSV file or an xlsx file. How can this file be converted to a PSV file through Robot Framework or Python scripting?
From a CSV file to "psv" file through direct Python scripting using the csv module:
import csv
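# newline='' lets the csv module handle line endings itself (the old 'rU' mode is gone from recent Python 3)
with open('input.csv', newline='') as infile, open('output.psv', 'w', newline='') as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile, delimiter='|')
    writer.writerows(reader)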
From .xlsx file to "psv" using the xlrd package (xlrd 2.0 and later reads only .xls files, so this requires an older xlrd release):
import csv
import xlrd
workbook = xlrd.open_workbook('input.xlsx')
sheet = workbook.sheet_by_index(0) # assume that the data is in the first sheet
with open('output.psv', 'w', newline='') as outfile:
    writer = csv.writer(outfile, delimiter='|')
    for i in range(sheet.nrows):
        writer.writerow([cell.value for cell in sheet.row(i)])
I'm using Python 3.3 with xlrd and csv modules to convert an xls file to csv. This is my code:
import xlrd
import csv
def csv_from_excel():
    wb = xlrd.open_workbook('MySpreadsheet.xls')
    sh = wb.sheet_by_name('Sheet1')
    your_csv_file = open('test_output.csv', 'wb')
    wr = csv.writer(your_csv_file, quoting=csv.QUOTE_ALL)
    for rownum in range(sh.nrows):
        wr.writerow(sh.row_values(rownum))
    your_csv_file.close()
With that I am receiving this error: TypeError: 'str' does not support the buffer interface
I tried changing the encoding and replaced the line within the loop with this:
wr.writerow(bytes(sh.row_values(rownum),'UTF-8'))
But I get this error: TypeError: encoding or errors without a string argument
Anyone know what may be going wrong?
Try this:
import xlrd
import csv
def csv_from_excel():
    wb = xlrd.open_workbook('MySpreadsheet.xlsx')
    sh = wb.sheet_by_name('Sheet1')
    your_csv_file = open('output.csv', 'w', encoding='utf8')
    wr = csv.writer(your_csv_file, quoting=csv.QUOTE_ALL)
    for rownum in range(sh.nrows):
        wr.writerow(sh.row_values(rownum))
    your_csv_file.close()
I recommend using the pandas library for this task:
import pandas as pd
xls = pd.ExcelFile('file.xlsx')
# the keyword is sheet_name in current pandas (older releases spelled it sheetname)
df = xls.parse(sheet_name="Sheet1", index_col=None, na_values=['NA'])
df.to_csv('file.csv')
Your problem is basically that you open your file with Python2 semantics. Python3 is locale-aware, so if you just want to write text to this file (and you do), open it as a text file with the right options:
your_csv_file = open('test_output.csv', 'w', encoding='utf-8', newline='')
The encoding parameter specifies the output encoding (it does not have to be utf-8) and the Python3 documentation for csv expressly says that you should specify newline='' for csv file objects.
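To make the difference concrete, here is a small sketch that tries both ways of opening the output file under Python 3, using only the standard library. The helper write_rows, the sample rows and the temporary file location are made up for illustration.

```python
import csv
import os
import tempfile


def write_rows(path, rows, mode):
    """Write rows to path with csv.writer, opening the file in the given mode."""
    # Text mode gets an explicit encoding and newline=''; binary mode takes neither
    kwargs = {} if 'b' in mode else {'encoding': 'utf-8', 'newline': ''}
    with open(path, mode, **kwargs) as f:
        csv.writer(f, quoting=csv.QUOTE_ALL).writerows(rows)


rows = [['name', 'city'], ['Ana', 'São Paulo']]
path = os.path.join(tempfile.mkdtemp(), 'test_output.csv')

try:
    write_rows(path, rows, 'wb')   # Python 2 habit: binary mode
except TypeError as exc:
    print('binary mode failed:', exc)

write_rows(path, rows, 'w')        # Python 3: text mode with newline=''
with open(path, encoding='utf-8', newline='') as f:
    print(f.read())
```

The binary-mode attempt raises a TypeError because csv.writer hands str objects to a file that only accepts bytes; the exact wording of the message depends on the Python version.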
A quicker way to do it with pandas:
import pandas as pd
xls_file = pd.read_excel('MySpreadsheet.xls', sheet_name="Sheet1")
xls_file.to_csv('MySpreadsheet.csv', index = False)
# index=False leaves out the DataFrame's row index, which to_csv would otherwise write as the first column.
You can read more in the pandas.read_excel documentation.