Formatting Excel sheets from Pandas - python

I'm using the following code to write a dataframe to an Excel file:
writer = pd.ExcelWriter('dataframe.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='dataframe')
writer.save()
But my df is about 200 columns wide (20 columns of 10 categories) and only 5 rows deep.
Is there any way to tell pandas where to write particular columns in the Excel file?
E.g. write columns 1-10 starting at row 1 of the sheet, columns 11-20 starting at row 6, and so on.
Really I'm just trying to do the layout of the Excel file in pandas rather than having to rearrange the sheet afterwards.

One solution might be to transpose the dataset using .T:
writer = pd.ExcelWriter('dataframe.xlsx', engine='xlsxwriter')
df.T.to_excel(writer, sheet_name='dataframe')
writer.close()  # writer.save() was removed in pandas 2.0; close() also saves
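If transposing is not the layout you want, to_excel also accepts a startrow argument, so each block of columns can be written further down the same sheet. A minimal sketch, assuming blocks of 10 columns stacked directly under each other; the helper names column_blocks and write_stacked are made up for illustration and are not part of pandas:

```python
import pandas as pd


def column_blocks(df, width, gap=0):
    """Yield (startrow, block) pairs for consecutive groups of `width` columns.

    Each block takes one header row plus len(df) data rows,
    followed by `gap` blank rows.
    """
    rows_per_block = len(df) + 1 + gap
    for i, start in enumerate(range(0, df.shape[1], width)):
        yield i * rows_per_block, df.iloc[:, start:start + width]


def write_stacked(df, path, sheet_name="dataframe", width=10):
    """Write the column blocks one under another on a single sheet."""
    # Assumes an Excel engine such as xlsxwriter or openpyxl is installed.
    with pd.ExcelWriter(path) as writer:
        for startrow, block in column_blocks(df, width):
            block.to_excel(writer, sheet_name=sheet_name, startrow=startrow)
```

For a 5-row dataframe, write_stacked(df, 'dataframe.xlsx') would write the first block at startrow=0 and the second at startrow=6 (startrow is zero-based). Pass gap=1 to column_blocks if you want a blank row between blocks.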

Related

Best way to sort multiple ranges in Excel via Python

I'm looking to try to sort some "tables" in Excel via Python as if I were doing it manually (that is selecting the range and then clicking sort). These are the tables I'm talking about:
Every cell in the tables, excluding the names, contains a formula such as a VLOOKUP into a different sheet that holds all of the data, such as how many sales John Doe has made today and how many he has made overall.
I would like to sort these tables via Python by sales from highest to lowest and, where sales numbers are equal, by conversion rate from highest to lowest.
This is what I want to end up with:
I would like to keep the conditional formatting if possible or if possible add it into the code.
I have experimented with sorting in Pandas, but it deleted the conditional formatting and changed my conversion percentages into decimals.
This is the Pandas code I experimented with.
Sheet1 is where the tables are located, Sheet2 is where the call numbers are pulled from, and Sheet3 is where the sales numbers are pulled from.
import pandas as pd
xl = pd.ExcelFile('file.xlsx')
df = xl.parse("Sheet1")
df = df.sort_values(by=["Sales", "Conversion"])
df2 = xl.parse("Sheet2")
df3 = xl.parse("Sheet3")
writer = pd.ExcelWriter('file.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1', index=False)
df2.to_excel(writer, sheet_name='Sheet2', index=False)
df3.to_excel(writer, sheet_name='Sheet3', index=False)
writer.save()
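Note that sort_values sorts ascending by default, so the code above puts the lowest sales first. A minimal sketch of the ordering described in the question, using made-up data and the column names from the code above (Sales, Conversion). It only covers the sort order; writing the result back with pandas still stores plain values, not formulas or conditional formatting.

```python
import pandas as pd

# Made-up sample data; the real values would come from the workbook.
df = pd.DataFrame({
    "Name": ["John Doe", "Jane Roe", "Sam Poe"],
    "Sales": [5, 7, 5],
    "Conversion": [0.25, 0.10, 0.50],
})

# Highest sales first; ties on Sales are broken by highest conversion rate.
ranked = df.sort_values(
    by=["Sales", "Conversion"],
    ascending=[False, False],
).reset_index(drop=True)
```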

Choose A Specific Sheet In Excel Containing a String Pandas

I'm currently creating a dataframe from an Excel spreadsheet in Pandas. Most of the files contain only one sheet. In some of the files, however, the sheet I need is not the first sheet. The sheets I need have the same name format in every file: 'ITD_XXX_XXXX'. Is there a way to tell pandas to select the sheet whose name has that form? Something like:
df = pd.read_excel(path, sheet_name=contains('ITD_'))
Here pandas would only select data from the sheet whose name starts with the string 'ITD_'.
Cheers.
I think the answer here would probably give you what you need.
Open the file as a pd.ExcelFile before reading it into a dataframe. Get its sheet_names, then pick out the sheet name that starts with 'ITD_'.
excel = pd.ExcelFile("your_excel.xlsx")
excel.sheet_names
# ["Sheet1", "Sheet2"]
for n in excel.sheet_names:
    if n.startswith('ITD_'):
        sheetname = n
        break
df = excel.parse(sheetname)
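One caveat with the loop above: if no sheet name starts with 'ITD_', sheetname is never assigned and the parse call fails with a NameError. A small sketch of the same lookup with an explicit error; find_sheet is a hypothetical helper name, not a pandas function:

```python
def find_sheet(sheet_names, prefix="ITD_"):
    """Return the first sheet name starting with `prefix`, or raise ValueError."""
    match = next((name for name in sheet_names if name.startswith(prefix)), None)
    if match is None:
        raise ValueError(f"no sheet name starts with {prefix!r}")
    return match
```

It would be used as df = excel.parse(find_sheet(excel.sheet_names)).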

Convert Excel sheets to Pandas df's

I have an Excel file with one sheet named "info", as follows:
Name Number
S1 50
S2 100
S3 400
This sheet gives info about the other sheets, which I need to convert into pandas dataframes.
But when I read this sheet and loop over it to create the other dataframes, my code also looks for a sheet named "Name" and breaks. Is there any way to avoid this?
Use a header row or skip the first row as mentioned in the comments.
df_info = pd.read_excel('file.xlsx', sheet_name='info', header=0)
sheets = {}
for sheet_name in df_info['Name']:
    sheets[sheet_name] = pd.read_excel('file.xlsx', sheet_name=sheet_name, header=None)
Pandas Read Excel Documentation

Excel Writer Python Separate Sheet For Each Row/Index In DataFrame

I have a dataframe with 14 columns and about 300 rows. What I want to do is create an xlsx with multiple sheets, each sheet holding a single row of the main dataframe. I'm setting it up like this because I want to append to these individual sheets every day for a new instance of the same row to see how the column values for the unique rows change over time. Here is some code.
tracks_df = pd.read_csv('final_outputUSA.csv')
writer2 = pd.ExcelWriter('please.xlsx', engine='xlsxwriter')
for track in tracks_df:
    tracks_df.to_excel(writer2, sheet_name="Tracks", index=False, header=True)
writer2.save()
writer2.close()
Right now this just outputs the exact same format as the csv that I'm reading in. I know that I'm going to need to change the sheet_name dynamically based on an indexed value; I would like each sheet to be named after that row's value in 'Col1'. How do I output an xlsx with a separate sheet for each row in my dataframe?
Try this:
writer2 = pd.ExcelWriter('please.xlsx', engine='xlsxwriter')
# x is one row as a Series; str() works on any scalar, unlike .astype('str')
tracks_df.apply(lambda x: x.to_frame().T.to_excel(writer2, sheet_name=str(x['Col1']), index=True, header=True), axis=1)
writer2.close()  # close() also saves the file
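Because the sheet names come straight from the data, it may be worth cleaning them first: Excel limits sheet names to 31 characters and rejects the characters \ / ? * [ ] : in them. A minimal sketch with an explicit loop; safe_sheet_name and write_row_sheets are hypothetical helper names, and 'Col1' is the column name assumed in the question:

```python
import re

import pandas as pd


def safe_sheet_name(value):
    """Replace characters Excel forbids in sheet names and trim to 31 characters."""
    name = re.sub(r"[\\/?*\[\]:]", "_", str(value))
    return name[:31]


def write_row_sheets(df, path, name_col="Col1"):
    """Write each row of `df` to its own sheet, named after `name_col`."""
    # Assumes the cleaned names are unique and non-empty; rows that share
    # a name would be written to the same sheet.
    with pd.ExcelWriter(path) as writer:
        for _, row in df.iterrows():
            row.to_frame().T.to_excel(
                writer, sheet_name=safe_sheet_name(row[name_col]), index=False
            )
```

It would be called as write_row_sheets(tracks_df, 'please.xlsx').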

How to write Pandas Dataframe Horizontally in to Excel sheet from python openpyxl (elements in same row, consecutive columns)

I am able to write a Pandas dataframe into Excel using openpyxl, and it works fine when the dataframe is written vertically (same column, consecutive rows),
but I want to write my dataframe horizontally, i.e. elements in the same row, consecutive columns.
My dataframe is one-dimensional, like [10, 9, 8, 7, 6].
writer = pd.ExcelWriter(fn, engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
#df.to_excel(writer, sheet_name='Sheet1', header=None, index=False)
df_2.to_excel(writer, sheet_name='Sheet3', header=None, index=False,
startcol=1,startrow=1)
writer.save()
My question is:
Can we specify in this code snippet whether the dataframe should be written vertically or horizontally, the same way we specify startrow and startcol?
I've searched everywhere but could not find anything.
This should solve your problem:
df_2 = df_2.transpose()
df_2.to_excel(writer, sheet_name='Sheet3', header=None, index=False,
startcol=1,startrow=1)
As mentioned here
Have you tried transposing the dataframe before writing it?
e.g.
df_2.T.to_excel(writer, sheet_name='Sheet3', header=None, index=False, startcol=1,startrow=1)
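One thing to check, since the question calls the data one-dimensional: if df_2 is actually a Series rather than a DataFrame, .T returns the Series unchanged, so it needs to be turned into a frame first. A minimal sketch, assuming the values from the question:

```python
import pandas as pd

values = pd.Series([10, 9, 8, 7, 6])

# to_frame() gives a 5x1 column; .T turns it into a 1x5 row.
as_row = values.to_frame().T
```

as_row can then be passed to to_excel with header=None and index=False as in the snippets above.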
