Write Pivot Table to Excel Workbook Range using XLSX Writer - python

I have a series of tables I would like to write to the same worksheet. The only other post similar to this is here. I also looked here but didn't see a solution.
I was hoping for a similar situation to SAS ODS Output codes that send proc freq results to an excel file. My thought was turning the table results into a new data frame and then stacking the output results to a worksheet.
pd.value_counts(df['name'])
df.groupby('name').aggregate({'Id': lambda x: x.unique()})
If I know the number of rows corresponding to the table, I should ideally know the appropriate range of cells to write to.
I am using:
import xlsxwriter
workbook = xlsxwriter.Workbook('demo.xlsx')
worksheet = workbook.add_worksheet()
tableone = pd.value_counts(df['name'])
tabletwo = df.groupby('name').aggregate({'Id': lambda x: x.unique()})
worksheet.write('B2:C15', tableone)
worksheet.write('D2:E15', tabletwo)
workbook.close()
EDIT: Include view of tableone
TableOne:
Name | Freq
A    | 5
B    | 1
C    | 6
D    | 11

import pandas as pd
import xlsxwriter

workbook = xlsxwriter.Workbook('demo.xlsx')
worksheet = workbook.add_worksheet()
tableone = pd.value_counts(df['name'])
tabletwo = df.groupby('name').aggregate({'Id': lambda x: x.unique()})

row, col = 1, 1  # This is cell B2 (rows and columns are zero-indexed)
for value in tableone:
    if col == 16:  # wrap to the next row after column P
        row += 1
        col = 1
    worksheet.write(row, col, value)
    col += 1

row, col = 1, 3  # This is cell D2
for value in tabletwo:
    if col == 16:
        row += 1
        col = 1
    worksheet.write(row, col, value)
    col += 1

workbook.close()  # the file is only written when the workbook is closed
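One way to make the "known range" idea from the question concrete is to track a running row offset: write one table, note where it ends, and start the next table below it. The following is a minimal sketch under that assumption, not the poster's method. `write_table` and `stack_tables` are hypothetical helpers, and they rely only on the `worksheet.write(row, col, value)` call already used above, with zero-based indices.

```python
def write_table(worksheet, start_row, start_col, header, rows):
    """Write a header and data rows; return the first free row below the table.

    `worksheet` is assumed to expose write(row, col, value) with zero-based
    indices, as xlsxwriter worksheets do. Hypothetical helper for illustration.
    """
    rows = list(rows)
    for offset, label in enumerate(header):
        worksheet.write(start_row, start_col + offset, label)
    for r, record in enumerate(rows, start=start_row + 1):
        for c, value in enumerate(record, start=start_col):
            worksheet.write(r, c, value)
    return start_row + 1 + len(rows)


def stack_tables(worksheet, tables, start_row=1, start_col=1, gap=1):
    """Write (header, rows) tables one below another with `gap` blank rows between."""
    row = start_row
    for header, rows in tables:
        row = write_table(worksheet, row, start_col, header, rows) + gap
    return row
```

With pandas, the frequency table could be passed as `[(name, int(count)) for name, count in tableone.items()]` with the header `["Name", "Freq"]`. The return value of `write_table` is the row where the next table can start.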

Related

Inserting a row above pandas column headers to save a title name in the first cell of excel sheet

I have multiple dataframes that look like this, the data is irrelevant.
I want it to look like this, i want to insert a title above the column headers.
I want to combine them into multiple tabs in an excel file.
Is it possible to add another row above the column headers and insert a Title into the first cell before saving the file to excel.
I am currently doing it like this.
with pd.ExcelWriter('merged_file.xlsx', engine='xlsxwriter') as writer:
    for filename in os.listdir(directory):
        if filename.endswith('xlsx'):
            print(filename)
            if 'brands' in filename:
                ...  # some function
            elif 'share' in filename:
                ...  # some function
            else:
                ...  # some function
            df.to_excel(writer, sheet_name=f'{filename[:-5]}', index=True, index_label=True)
# No explicit writer.close() is needed: the with block closes the writer.
But the sheet_name is too long, that's why I want to add the title above the column headers.
I tried this code,
columns = df.columns
columns = list(zip([f'{filename[:-5]}'] * len(df.columns), columns))
columns = pd.MultiIndex.from_tuples(columns)
df2 = pd.DataFrame(df,index=df.index,columns=columns)
df2.to_excel(writer,sheet_name=f'{filename[0:3]}',index=True,index_label=True)
But it ends up looking like this with all the data gone,
It should look like this
You can write the data starting from the second row, then write your title text into the first cell:
df = pd.DataFrame({'col': list('abc'), 'col1': list('def')})
print (df)
  col col1
0   a    d
1   b    e
2   c    f
writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1', startrow = 1, index=False)
workbook = writer.book
worksheet = writer.sheets['Sheet1']
text = 'sometitle'
worksheet.write(0, 0, text)
writer.close()  # writer.save() was removed in pandas 2.0
To read the title and the data back:
title = pd.read_excel('test.xlsx', nrows=0).columns[0]
print (title)
sometitle
df = pd.read_excel('test.xlsx', skiprows=1)
print (df)
  col col1
0   a    d
1   b    e
2   c    f
You can also use a MultiIndex. Here is an example:
import pandas as pd
df = pd.read_excel('data.xls')
header = pd.MultiIndex.from_product([['Title'], list(df.columns)])
pd.DataFrame(df.to_numpy(), index=None, columns=header)
Also, I can share with you my solution with real data in Deepnote (my favorite tool). Feel free to duplicate and play with your own .xls:
https://deepnote.com/publish/3cfd4171-58e8-48fd-af21-930347e8e713
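Since the question's underlying problem is that the file names are too long to serve as sheet names, the two pieces can be combined: keep the full title in cell A1 and derive a short tab name from it. Excel limits worksheet names to 31 characters and treats them case-insensitively. The helper below is a hypothetical sketch of that idea; it only truncates and de-duplicates, and does not strip characters that Excel forbids in sheet names.

```python
MAX_SHEET_NAME = 31  # Excel's limit on worksheet name length


def short_sheet_name(title, used):
    """Return a name of at most 31 characters that is not already in `used`.

    Hypothetical helper: truncates `title` and, on a clash, replaces the tail
    with a numeric suffix such as "_2". The chosen name is recorded in `used`
    (a set of lower-cased names, because Excel compares names case-insensitively).
    """
    name = title[:MAX_SHEET_NAME]
    counter = 2
    while name.lower() in used:
        suffix = f"_{counter}"
        name = title[:MAX_SHEET_NAME - len(suffix)] + suffix
        counter += 1
    used.add(name.lower())
    return name
```

In the loop from the question, the short name would go to `sheet_name=` with `startrow=1`, and the full `filename[:-5]` would be written to the first cell with `worksheet.write(0, 0, title)`, as in the first answer.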

Python: Print output on multiple Excel sheets

I have 4 lists each having 33 values each and wish to print the combination in excel. Excel limits the number of rows in each sheet to 1,048,576 and the number of combinations exceeds the sheet limit by 137,345 values.
How should I continue printing the result in next sheet in the same workbook?
a = [100, 101, 102,...,133]
b = [250, 251, 252,...,283]
c = [300, 301, 302,...,333]
d = [430, 431, 432,...,463]
list_combined = [(p, q, r, s)
                 for p in a
                 for q in b
                 for r in c
                 for s in d]
import xlsxwriter
workbook = xlsxwriter.Workbook('combined.xlsx')
worksheet = workbook.add_worksheet()
for row, group in enumerate(list_combined):
    for col in range(5):
        worksheet.write (row, col, group[col])
workbook.close()
You could set an upper limit and switch to a new worksheet once you get to the limit.
Here is an example with a lower limit than the limit supported by Excel for testing:
import xlsxwriter

workbook = xlsxwriter.Workbook('test.xlsx')
worksheet = workbook.add_worksheet()

# Simulate a big list
biglist = range(1, 1001)

# Set max_row to your required limit. Zero indexed.
max_row = 100
row_num = 0

for data in biglist:
    # If we hit the upper limit then create and switch to a new worksheet
    # and reset the row counter.
    if row_num == max_row:
        worksheet = workbook.add_worksheet()
        row_num = 0

    worksheet.write(row_num, 0, data)
    row_num += 1

workbook.close()
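First, a style point: by convention (PEP 8), the opening parenthesis goes immediately after the function name. Python accepts the space, but it is discouraged:
worksheet.write (row, col, group[col])  # works, but not idiomatic
worksheet.write(row, col, group[col])   # preferred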
Second, to write into multiple sheets, you need to do as follows:
Example taken from this SO answer
import xlsxwriter
list_name = ["first sheet", "second sheet", "third sheet"]
workbook = xlsxwriter.Workbook(<Your full path>)
for sheet_name in list_name:
    worksheet = workbook.add_worksheet(sheet_name)
    worksheet.write('A1', sheet_name)
workbook.close()
If you do not want to pass any name to the sheet, remove the sheet_name argument, and a default name will be given.
To split data into sheets you can easily adapt the code into:
for piece in iterable_data_set:
    # consider "piece" a piece of data you want to put into each sheet
    # `piece` must be an nxm matrix that contains dumpable data.
    worksheet = workbook.add_worksheet()
    for i in range(len(piece)):
        for j in range(len(piece[i])):
            worksheet.write(i, j, piece[i][j])
I recommend searching for existing answers before asking, to avoid duplicate questions. If none of them solves your problem, go ahead and ask, and explain how your problem differs from the questions you found.
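For the original problem (more combinations than one sheet can hold), the row-limit idea can be wrapped in a small function. This is a sketch under assumptions rather than a definitive implementation: `write_across_sheets` is a hypothetical helper, and it expects a workbook object with `add_worksheet()` whose sheets have `write(row, col, value)`, as in xlsxwriter. Enumerating each record also avoids hard-coding the column count; the question's `range(5)` would index past the end of a 4-tuple.

```python
EXCEL_MAX_ROWS = 1_048_576  # rows per worksheet, as stated in the question


def write_across_sheets(workbook, records, max_rows=EXCEL_MAX_ROWS):
    """Write each record as one row, starting a new sheet every `max_rows` rows.

    `workbook` is assumed to expose add_worksheet(), and each returned sheet
    write(row, col, value) with zero-based indices. Returns the sheet count.
    """
    sheets = 0
    worksheet = None
    for index, record in enumerate(records):
        row = index % max_rows
        if row == 0:  # first record overall, or the previous sheet is full
            worksheet = workbook.add_worksheet()
            sheets += 1
        for col, value in enumerate(record):
            worksheet.write(row, col, value)
    return sheets
```

Because `records` can be any iterable, passing `itertools.product(a, b, c, d)` instead of a prebuilt list would generate the combinations lazily rather than holding them all in memory at once.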

Python / Openpyxl - Find Strings in column and return row number

I'm a fairly new Python user and I have an excel sheet that contains multiple unformatted tables. I am trying to iterate through column B with Python and openpyxl in order to find the headers of the respective table. When I find the header, I would like to save the row number in a variable. Unfortunately, the code is not running as intended and I am not receiving any error message. Below you can find a screenshot of a sample excel sheet as well as my code. Thank you for your help!
start = 1
end = 14
sheet = wb[('Positioning')]
for col in sheet.iter_cols(min_col=2, max_col=2, min_row=start, max_row=end):
    for cell in col:
        if cell.value == 'Table 1':
            table1 = cell.row
        elif cell.value == 'Table 2':
            table2 = cell.row
Screenshot - Excel Example
Here are ways to search for a string in a column or a row. In openpyxl terms, column denotes the column letter and col_idx the column number of an Excel sheet.
from openpyxl import load_workbook

def search_value_in_column(ws, search_string, column="A"):
    for row in range(1, ws.max_row + 1):
        coordinate = "{}{}".format(column, row)
        if ws[coordinate].value == search_string:
            return column, row
    return column, None

def search_value_in_col_idx(ws, search_string, col_idx=1):
    for row in range(1, ws.max_row + 1):
        # ws.cell() takes 1-based row and column numbers
        if ws.cell(row=row, column=col_idx).value == search_string:
            return col_idx, row
    return col_idx, None

def search_value_in_row_index(ws, search_string, row=1):
    for cell in ws[row]:
        if cell.value == search_string:
            return cell.column, row
    return None, row

wb = load_workbook("workbook.xlsx")  # placeholder filename; the original call omitted it
ws = wb.active
col_idx, row = search_value_in_col_idx(ws, "_my_search_string_")
This can be done in multiple ways; here is one, assuming that 'Table' appears only in the table headers and nowhere else in the tables.
from openpyxl import load_workbook

wb = load_workbook('Test.xlsx')  # Load the workbook
ws = wb['Sheet1']  # Load the worksheet

# ws['B'] returns all cells in column B down to the sheet's last used row
for cell in ws['B']:
    # Skip empty and non-text cells, then check whether the text contains 'Table'
    if isinstance(cell.value, str) and 'Table' in cell.value:
        # cell.column_letter needs openpyxl 2.6+; there, cell.column is a number
        print('Found header with name: {} at row: {} and column: {}. In cell {}'.format(
            cell.value, cell.row, cell.column_letter, cell))
Which prints:
Found header with name: Table 1 at row: 2 and column: B. In cell <Cell 'Sheet1'.B2>
Found header with name: Table 2 at row: 10 and column: B. In cell <Cell 'Sheet1'.B10>
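To store the row numbers rather than print them, as the question asks, the matches can be collected into a dictionary keyed by header text. The helper below is a hypothetical sketch: it accepts any iterable of objects with `.value` and `.row` attributes (which openpyxl cells have), and it matches whole header strings after trimming whitespace, which is an assumption about how the headers are typed.

```python
def find_header_rows(cells, headers):
    """Map each header text to the row of the first cell containing it.

    `cells` is any iterable of objects with `.value` and `.row` attributes,
    such as the cells of one openpyxl column. Surrounding whitespace is
    ignored. Headers that are not found are absent from the result.
    """
    wanted = set(headers)
    found = {}
    for cell in cells:
        if not isinstance(cell.value, str):
            continue  # skip empty and numeric cells
        text = cell.value.strip()
        if text in wanted and text not in found:
            found[text] = cell.row
    return found
```

With a worksheet loaded as above, `rows = find_header_rows(ws['B'], ['Table 1', 'Table 2'])` would give a dictionary, and `rows.get('Table 1')` returns `None` instead of raising if that header is missing.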

Loop through an excel file, find certain cell values and write into text files

I am making text files out of information from an existing excel file. Not all of the information in the excel file is supposed to be written in the text files, but I want my code to loop through the file and choose information from rows according to certain cell values.
Here is my code so far, however, when I run the code nothing happens and the text files are not generated. I do not get an error message either. Does anyone know what I am missing?
import xlrd
xlsfilename = 'Myexcelfile.xls'
book = xlrd.open_workbook(xlsfilename)
book.sheet_by_index(0)
number_rows = 275
number_lines = 1
for row in range(number_rows):
    for col in 1, :  # Column where the cell value decides whether or not information in the row should be added to the text file.
        value = book.sheets()[0].cell(row, col).value
        if value == 19:  # Rows where 19 is the cell value in column 1 are to be focused on.
            txtfilename = 'Mytextfile' + str(row) + '.txt'
            with open(txtfilename, "w") as f:
                d = {}  # Creating a dictionary for Subject number (see later in the code)
                for line in range(number_lines):
                    f.write("Subject number{1}")  # The subject number should change for each row containing 19 that is added to the text file.
                    f.write('Text\n')
                    f.write('Newtext'.ljust(1))
                    for col in 3,:
                        val = book.sheets()[0].cell(row, col).value
                        s1 = str(val).ljust(1)
                        f.write(s1)
                    f.write('Moretext'.ljust(1))
                    for col in 9, 10:
                        val = book.sheets()[0].cell(row, col).value
                        s2 = str(val).ljust(1)
                        f.write(s2)
        else:
            pass
Any help is greatly appreciated!
I am working in Python3.4.1
To write all the rows that include the string "19" in column 1 into the same text file, you can do:
import xlrd
xlsfilename = 'Myexcelfile.xls'
book = xlrd.open_workbook(xlsfilename)
book.sheet_by_index(0)
number_rows = book.sheets()[0].nrows
number_lines = 1
column_target = 1
txtfilename = 'Mytextfile.txt'
with open(txtfilename, "w") as f:
    for row in range(number_rows):
        value = book.sheets()[0].cell(row, column_target).value
        if value == "19":  # Rows where 19 is the cell value in column 1 are to be focused on.
            d = {}  # Creating a dictionary for Subject number (see later in the code)
            for line in range(number_lines):
                f.write("Subject number{1}")  # The subject number should change for each row containing 19 that is added to the text file.
                f.write('Text\n')
                f.write('Newtext'.ljust(1))
                for col in 3,:
                    val = book.sheets()[0].cell(row, col).value
                    s1 = str(val).ljust(1)
                    f.write(s1)
                f.write('Moretext'.ljust(1))
                for col in 9, 10:
                    val = book.sheets()[0].cell(row, col).value
                    s2 = str(val).ljust(1)
                    f.write(s2)
        else:
            pass
Basically, just move the with open(...) block outside the loop so that the same file is reused for all matching rows.
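One possible reason nothing is written is a type mismatch: xlrd returns numeric cells as Python floats, so a cell showing 19 arrives as `19.0`, while a text cell arrives as the string `'19'`. The comparison `value == 19` matches only the former and `value == "19"` only the latter. The sketch below accepts both and numbers the subjects as rows are matched. `is_target` and `subject_lines` are hypothetical helpers, and the output line format is an assumption, not the asker's required layout.

```python
def is_target(value, target=19):
    """Return True if a cell value equals `target`, stored as number or as text."""
    try:
        return float(value) == float(target)
    except (TypeError, ValueError):
        return False  # empty cells, None, or non-numeric text


def subject_lines(rows, key_col=1, text_cols=(3, 9, 10), target=19):
    """Build one text line per matching row, numbering subjects from 1.

    `rows` is an iterable of row value lists (zero-based columns).
    """
    lines = []
    for row in rows:
        if not is_target(row[key_col], target):
            continue
        fields = " ".join(str(row[col]) for col in text_cols)
        lines.append(f"Subject number {len(lines) + 1}: {fields}")
    return lines
```

With xlrd, the rows could come from `sheet = book.sheet_by_index(0)` followed by `[sheet.row_values(r) for r in range(sheet.nrows)]`, and the resulting lines can then be written to a single file opened once.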

Loop through and compare spreadsheet cells in Python

Please excuse the crude code and I'm sure there are better ways to accomplish this but I am new to programming. Basically I have an excel file with 2 sheets, sheet 1 is populated in column A, sheet 2 is populated in A, B, and C. I want to run through all of the cells in sheet 1 column A searching for a match in sheet 2 column A and copy the info from B and C to sheet 1 if found. The code below kind of works, it copies some data and populates it but it doesn't really match up correctly and it seems to skip a lot of cells if they are the same value as the previous cell. Any help would be greatly appreciated.
import openpyxl
wb = openpyxl.load_workbook('spreadsheet.xlsx')
sheet1 = wb.get_sheet_by_name('Sheet1')
sheet2 = wb.get_sheet_by_name('Sheet2')
for row in sheet1['A1':'A200']:
    for cell in row:
        obj1 = cell.value
        for row2 in sheet2['A1':'A2000']:
            for cell2 in row2:
                obj2 = cell2.value
                if obj1 == obj2:
                    row = str(cell2.row)
                    site = 'B' + row
                    tic = 'C' + row
                    sheet1[site] = sheet2[site].value
                    sheet1[tic] = sheet2[tic].value
wb.save('spreadsheet2.xlsx')
Your question is a little unclear but if I understand you correctly this should help:
import openpyxl
wb = openpyxl.load_workbook('spreadsheet.xlsx')
sheet1 = wb['Sheet1']  # get_sheet_by_name() is deprecated in favor of wb[name]
sheet2 = wb['Sheet2']
# Compare row i of Sheet1 with the same row i of Sheet2
for i in range(1, 201):
    if sheet1.cell(row=i, column=1).value == sheet2.cell(row=i, column=1).value:
        sheet1.cell(row=i, column=2).value = sheet2.cell(row=i, column=2).value
        sheet1.cell(row=i, column=3).value = sheet2.cell(row=i, column=3).value
wb.save('spreadsheet2.xlsx')
I was able to clean up the code by using the .cell() method. If this isn't what you need, just comment and tell me exactly what you are trying to do. Hope this helps!
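The answer above compares row i of Sheet1 with the same row i of Sheet2. If the goal is instead, as the question describes, to search all of Sheet2 for each Sheet1 value, one option is to build a dictionary from Sheet2 once and look each Sheet1 value up in it. This is a sketch of that alternative, not the answerer's method. `build_lookup` and `match_rows` are hypothetical helpers, and letting the first occurrence of a duplicate key win is an assumption.

```python
def build_lookup(rows):
    """Map column-A values to their (B, C) values; the first occurrence wins.

    `rows` is an iterable of (a, b, c) tuples, for example the rows of Sheet2.
    Rows whose column-A value is empty (None) are skipped.
    """
    lookup = {}
    for a, b, c in rows:
        if a is not None and a not in lookup:
            lookup[a] = (b, c)
    return lookup


def match_rows(keys, lookup):
    """Yield (row_number, b, c) for each 1-based position in `keys` with a match."""
    for row_number, key in enumerate(keys, start=1):
        if key in lookup:
            b, c = lookup[key]
            yield row_number, b, c
```

With openpyxl, the lookup could be built from `sheet2.iter_rows(min_row=1, max_row=2000, max_col=3, values_only=True)`, and each yielded `row_number` then identifies the Sheet1 row whose columns 2 and 3 should receive `b` and `c`. Writing to the Sheet1 row, rather than to the Sheet2 row as in the question's code, keeps the copied values next to the key that matched.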
