How to write CSV files into XLSX using Python Pandas?

I have several .csv files and I want to write them into one .xlsx file, each as its own sheet.
I've loaded these .csv files into pandas DataFrames using the following code:
df1 = pandas.read_csv('my_file1.csv')
df2 = pandas.read_csv('my_file2.csv')
......
df5 = pandas.read_csv('my_file5.csv')
But I couldn't find any function in pandas that can write these DataFrames into one .xlsx file as separate sheets.
Can anyone help me with this?

With a recent enough pandas, call DataFrame.to_excel() with an existing ExcelWriter object and pass a sheet name for each DataFrame:
from pandas.io.excel import ExcelWriter
import pandas
csv_files = ['my_file1.csv', 'my_file2.csv', ..., 'my_file5.csv']
with ExcelWriter('my_excel.xlsx') as ew:
    for csv_file in csv_files:
        pandas.read_csv(csv_file).to_excel(ew, sheet_name=csv_file)
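As a minimal sketch of the same idea, the snippet below derives each sheet name from the file name without its extension and truncates it to 31 characters, the maximum length Excel allows for a sheet name. The helper names sheet_name_for and csvs_to_workbook are hypothetical, and writing .xlsx assumes an Excel engine such as openpyxl or xlsxwriter is installed.

```python
from pathlib import Path

import pandas as pd

MAX_SHEET_NAME = 31  # Excel's limit on sheet-name length


def sheet_name_for(csv_path):
    """Hypothetical helper: file name without extension, cut to Excel's limit."""
    return Path(csv_path).stem[:MAX_SHEET_NAME]


def csvs_to_workbook(csv_paths, xlsx_path):
    """Write each CSV file to its own sheet of a single workbook."""
    with pd.ExcelWriter(xlsx_path) as writer:
        for csv_path in csv_paths:
            df = pd.read_csv(csv_path)
            df.to_excel(writer, sheet_name=sheet_name_for(csv_path), index=False)


# Usage (assumes the files exist):
# csvs_to_workbook(['my_file1.csv', 'my_file2.csv'], 'my_excel.xlsx')
```

Sheet names must stay unique within a workbook, so very long file names that share their first 31 characters would need a different naming scheme.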

Related

How to append all sheets in multiple Excel files into One Excel file (not consolidating or combining them into one sheet)

I want to combine multiple Excel files/sheets into one Excel file with multiple sheets without changing any formatting. Basically, it is to append all sheets in multiple Excel files into One Excel file with multiple sheets.
For example,
File1 with Sheet1
File2 with Sheet2, Sheet3
File3 with Sheet4, Sheet5
Outcome would be File0 with Sheet1, Sheet2, Sheet3, Sheet4, Sheet5 (as one Excel file).
Here is my code:
from pandas import ExcelWriter
import glob
import os
import pandas as pd
writer = ExcelWriter("File0.xlsx")
for filename in glob.glob("File*.xlsx"):
    excel_file = pd.ExcelFile(filename)
    #(_, f_name) = os.path.split(filename)
    #(f_short_name, _) = os.path.splitext(f_name)
    for sheet_name in excel_file.sheet_names:
        # read each sheet into a DataFrame, then write it to the combined workbook
        df_excel = pd.read_excel(filename, sheet_name=sheet_name)
        df_excel.to_excel(writer, sheet_name=sheet_name, index=False)
# writer.save() was removed in pandas 2.0; close() writes the file
writer.close()
The code works, but it rewrites every sheet from a DataFrame, so I lose all the cell formatting. Is there another way to append all sheets into one Excel file without consolidating them or losing the formatting?
Thank you.
Try loading all the sheets into a list, then write each one to the output file as a sheet with a different name:
import glob
import pandas as pd
list_of_sheets = []
for filename in glob.glob("File*.xlsx"):
    excel_file = pd.ExcelFile(filename)
    # parse every sheet of every workbook into its own DataFrame
    for sheet_name in excel_file.sheet_names:
        list_of_sheets.append(excel_file.parse(sheet_name))
# now add them as different sheets in the same Excel file
with pd.ExcelWriter('multiple.xlsx', engine='xlsxwriter') as writer:
    for i, df in enumerate(list_of_sheets):
        df.to_excel(writer, sheet_name='Sheet{}'.format(i))
# The result is one workbook called multiple.xlsx whose sheets are named Sheet0, Sheet1, and so on.

Is there a way to export individual sheets in a excel workbook to separate csv files using pandas?

I have 5 sheets in an excel workbook. I would like to export each sheet to csv using python libraries.
This is a sheet showing sales in 2019. I have named the sheets according to the year they represent, as shown here.
I have read the Excel spreadsheet using pandas. I used a for loop since I am interested in saving each csv file as the_sheet_name.csv. This is my code in a Jupyter notebook:
import pandas as pd
df = pd.DataFrame()
myfile = 'sampledata.xlsx'
xl = pd.ExcelFile(myfile)
for sheet in xl.sheet_names:
    df_tmp = xl.parse(sheet)
    print(df_tmp)
    df = df.append(df_tmp, ignore_index=True,sort=False)
csvfile = f'{sheet_name}.csv'
df.to_csv(csvfile, index=False)
Executing the code produces just one csv file that contains the data from all the sheets. I would like to know if there is a way to change my code so that it produces one csv file per sheet, e.g. sales2011.csv, sales2012.csv and so on.
Passing sheet_name=None to read_excel returns a dictionary of DataFrames keyed by sheet name:
dfs = pd.read_excel('file.xlsx', sheet_name=None)
for sheet_name, data in dfs.items():
    data.to_csv(f"{sheet_name}.csv")
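A slightly fuller sketch of that loop, wrapped in a hypothetical helper named export_sheets_to_csv that takes the name-to-DataFrame dictionary and an output folder. The sample data below is made up and stands in for the dictionary that read_excel returns with sheet_name=None.

```python
from pathlib import Path
import tempfile

import pandas as pd


def export_sheets_to_csv(sheets, out_dir):
    """Write each DataFrame in `sheets` (a name -> DataFrame dict) to <name>.csv."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for sheet_name, data in sheets.items():
        csv_path = out_dir / f"{sheet_name}.csv"
        data.to_csv(csv_path, index=False)  # index=False omits the row-number column
        written.append(csv_path)
    return written


# Stand-in for pd.read_excel('sampledata.xlsx', sheet_name=None)
sheets = {
    "sales2011": pd.DataFrame({"item": ["a", "b"], "amount": [10, 20]}),
    "sales2012": pd.DataFrame({"item": ["c"], "amount": [30]}),
}
paths = export_sheets_to_csv(sheets, tempfile.mkdtemp())
```

Because each sheet is written on its own iteration and nothing is accumulated across iterations, every csv file holds only the rows of its own sheet.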

Pasting a DataFrame into separate Excel files while keeping all existing data in Python

I have 10 Excel files with one sheet in each file:
Sheet Name = Report Output 1
I have created a DataFrame from the 10 Excel files by importing them all with glob and pandas.
import glob
import pandas as pd
frames = []
for f in glob.glob('filename*.xlsx'):
    info = pd.read_excel(f, sheet_name='Report Output 1')
    frames.append(info)
# DataFrame.append was removed in pandas 2.0; concatenate once instead
df = pd.concat(frames)
I then did some filtering, merging and calculations as required.
Now I have one consolidated DataFrame, final_df, which holds the data for all 10 files after my calculations.
I want to write final_df back to each of the 10 files as a new sheet, splitting it with groupby on the Source column (which has a unique value in each file), while leaving the original data in the existing sheet (Report Output 1) untouched.
I know openpyxl can do this through dataframe_to_rows, but how do I write the code that copies the DataFrame to the separate sheets?
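One possible approach, offered as a sketch rather than a tested solution for these particular files: split final_df with groupby, then open each workbook with pandas' ExcelWriter in append mode so that the existing sheet is left in place. The helper names split_by_source and append_sheet, the sheet name Result, and the source_to_file mapping are all hypothetical. Append mode assumes the openpyxl engine and pandas 1.3 or later (for if_sheet_exists).

```python
import pandas as pd


def split_by_source(final_df, column="Source"):
    """Return a {source value: sub-DataFrame} dict, one entry per unique value."""
    return {key: group for key, group in final_df.groupby(column)}


def append_sheet(xlsx_path, df, sheet_name="Result"):
    """Add `df` as a new sheet to an existing workbook, keeping its other sheets.

    mode="a" requires the openpyxl engine; if_sheet_exists="replace" overwrites
    only a sheet with the same name left over from an earlier run.
    """
    with pd.ExcelWriter(
        xlsx_path, mode="a", engine="openpyxl", if_sheet_exists="replace"
    ) as writer:
        df.to_excel(writer, sheet_name=sheet_name, index=False)


# Hypothetical mapping from each Source value to the file it came from:
# source_to_file = {"A": "filename1.xlsx", "B": "filename2.xlsx"}
# for source, part in split_by_source(final_df).items():
#     append_sheet(source_to_file[source], part)

# Made-up data to show the split on its own
final_df = pd.DataFrame({"Source": ["A", "B", "A"], "value": [1, 2, 3]})
parts = split_by_source(final_df)
```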

using pandas to_csv to write the result to a csv in different sheets

The following code creates a csv named assetinventory.csv using data from the DataFrame:
data.append([x, instancename, names])
pd.DataFrame(data, columns=['InstanceID','InstanceName','AppCode']).to_csv('assetinventory.csv', mode='a',header=False,index=False)
Now I want to create a Sheet2 when I write to the same CSV file again. How can I do that?
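A CSV file is plain text and has no notion of sheets, so writing again with mode='a' only appends more rows to the same table. A minimal sketch of two alternatives follows: one CSV file per logical sheet, or a real .xlsx workbook (shown commented out, since it needs an Excel engine such as openpyxl or xlsxwriter). The rows and the assetinventory_Sheet1.csv style file names are made up for illustration.

```python
import os
import tempfile

import pandas as pd

columns = ['InstanceID', 'InstanceName', 'AppCode']
# Made-up rows standing in for the `data` list in the question
sheet1 = pd.DataFrame([['i-1', 'web', 'APP1']], columns=columns)
sheet2 = pd.DataFrame([['i-2', 'db', 'APP2']], columns=columns)

out_dir = tempfile.mkdtemp()

# Option 1: one CSV file per "sheet"
for name, frame in {'Sheet1': sheet1, 'Sheet2': sheet2}.items():
    frame.to_csv(os.path.join(out_dir, f'assetinventory_{name}.csv'), index=False)

# Option 2 (requires openpyxl or xlsxwriter): real sheets in one workbook
# with pd.ExcelWriter(os.path.join(out_dir, 'assetinventory.xlsx')) as writer:
#     sheet1.to_excel(writer, sheet_name='Sheet1', index=False)
#     sheet2.to_excel(writer, sheet_name='Sheet2', index=False)

csv_files = sorted(os.listdir(out_dir))
```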

Loading only one sheet to dataframe

I am trying to read an Excel sheet into a DataFrame using the pandas read_excel method. The Excel file contains 6-7 different sheets, of which 2-3 are very large. I only want to read one sheet from the file.
If I copy that sheet out into its own file and read that instead, the read time drops by 90%.
I have read that xlrd that is used by pandas always loads the whole sheet to memory. I cannot change the format of the input.
Can you please suggest a way to improve the performance?
It's quite simple. Just do this.
import pandas as pd
xls = pd.ExcelFile('C:/users/path_to_your_excel_file/Analysis.xlsx')
df1 = pd.read_excel(xls, 'Sheet1')
print(df1)
# etc.
df2 = pd.read_excel(xls, 'Sheet2')
print(df2)
import pandas as pd
df = pd.read_excel('YourFile.xlsx', sheet_name = 'YourSheet_Name')
Just put in the name of whichever sheet you want to read and the path to your Excel file.
Use openpyxl in read-only mode. See http://openpyxl.readthedocs.io/en/default/pandas.html
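A rough sketch of that suggestion: openpyxl's read-only mode is designed to stream the rows of the requested sheet instead of building the whole workbook in memory, and the rows can then be handed to pandas. The helper name read_sheet_read_only is hypothetical, the first row is assumed to be the header, and the small workbook is made-up demo data. Whether this is faster for a given file is worth measuring.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook, load_workbook


def read_sheet_read_only(xlsx_path, sheet_name):
    """Load one sheet via openpyxl's read-only mode; first row is the header."""
    wb = load_workbook(xlsx_path, read_only=True)
    try:
        # values_only=True yields plain tuples of cell values, row by row
        rows = wb[sheet_name].iter_rows(values_only=True)
        header = next(rows)
        return pd.DataFrame(list(rows), columns=list(header))
    finally:
        wb.close()  # read-only workbooks keep the file open until closed


# Build a small two-sheet workbook to demonstrate (made-up data)
path = os.path.join(tempfile.mkdtemp(), 'Analysis.xlsx')
demo = Workbook()
ws1 = demo.active
ws1.title = 'Sheet1'
ws1.append(['a', 'b'])
ws1.append([1, 2])
demo.create_sheet('Sheet2').append(['unused'])
demo.save(path)

df1 = read_sheet_read_only(path, 'Sheet1')
```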
