Is there any way to download the output dataframe of a Streamlit app as an Excel file using a Streamlit button?
I suggest you edit your question to include a minimal reproducible example so that it's easier for people to understand your question and to help you.
If I understand you correctly, here is an answer. It provides two ways to download your dataframe df: as CSV or as XLSX.
IMPORTANT: You need to install the xlsxwriter package for this to work.
import streamlit as st
import pandas as pd
import io
# buffer to use for excel writer
buffer = io.BytesIO()
data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45],
    "random1": [5, 12, 1],
    "random2": [230, 23, 1]
}
df = pd.DataFrame(data)
@st.cache_data  # on Streamlit versions before 1.18, use @st.cache instead
def convert_to_csv(df):
    # IMPORTANT: Cache the conversion to prevent computation on every rerun
    return df.to_csv(index=False).encode('utf-8')
csv = convert_to_csv(df)
# display the dataframe on streamlit app
st.write(df)
# download button 1 to download dataframe as csv
download1 = st.download_button(
    label="Download data as CSV",
    data=csv,
    file_name='large_df.csv',
    mime='text/csv'
)
# download button 2 to download dataframe as xlsx
with pd.ExcelWriter(buffer, engine='xlsxwriter') as writer:
    # Write each dataframe to a different worksheet.
    df.to_excel(writer, sheet_name='Sheet1', index=False)
# Leaving the with block closes the writer and writes the Excel file to the buffer,
# so no explicit writer.save() is needed (that method was removed in pandas 2.0).
download2 = st.download_button(
    label="Download data as Excel",
    data=buffer,
    file_name='large_df.xlsx',
    mime='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'
)
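If you prefer to keep the Excel conversion in one place, here is a minimal sketch of a helper that returns the workbook as bytes. The function name dataframe_to_xlsx_bytes is my own (it is not part of Streamlit or pandas), and I am assuming at least one .xlsx writer engine (xlsxwriter or openpyxl) is installed.

```python
import io
import zipfile

import pandas as pd


def dataframe_to_xlsx_bytes(df: pd.DataFrame, sheet_name: str = "Sheet1") -> bytes:
    """Serialize a dataframe to the bytes of an .xlsx workbook (hypothetical helper)."""
    buffer = io.BytesIO()
    # With no engine given, pandas picks an installed .xlsx writer
    # (xlsxwriter or openpyxl); leaving the with block finalizes the file.
    with pd.ExcelWriter(buffer) as writer:
        df.to_excel(writer, sheet_name=sheet_name, index=False)
    return buffer.getvalue()


df = pd.DataFrame({"calories": [420, 380, 390], "duration": [50, 40, 45]})
xlsx_bytes = dataframe_to_xlsx_bytes(df)

# An .xlsx file is a ZIP archive, so the payload can be sanity-checked cheaply.
print(zipfile.is_zipfile(io.BytesIO(xlsx_bytes)))
```

In the app you would then pass data=xlsx_bytes to st.download_button in place of the buffer.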
I have written some content to an xlsx file using xlsxwriter:
workbook = xlsxwriter.Workbook(file_name)
worksheet = workbook.add_worksheet()
worksheet.write(row, col, value)
workbook.close()
I'd like to add a dataframe after the existing rows in this file using to_excel:
df.to_excel(file_name,
            startrow=len(existing_content),
            engine='xlsxwriter')
However, this does not seem to work: the dataframe is not inserted into the file. Does anyone know why?
Since the question does not show the exact content being written, let's walk through to_excel and XlsxWriter with an example.
Using xlsxwriter:
import xlsxwriter
# Create a new Excel file and add a worksheet
workbook = xlsxwriter.Workbook('example.xlsx')
worksheet = workbook.add_worksheet()
# Add some data to the worksheet
worksheet.write('A1', 'Language')
worksheet.write('B1', 'Score')
worksheet.write('A2', 'Python')
worksheet.write('B2', 100)
worksheet.write('A3', 'Java')
worksheet.write('B3', 98)
worksheet.write('A4', 'Ruby')
worksheet.write('B4', 88)
# Save the file
workbook.close()
The code above saves a table like the one below to an Excel file.
Language    Score
Python      100
Java        98
Ruby        88
Next, suppose we want to add rows from a dataframe with DataFrame.to_excel.
Using to_excel:
import pandas as pd
# Load an existing Excel file
existing_file = pd.read_excel('example.xlsx')
# Create a new DataFrame to append
df = pd.DataFrame({
    'Language': ['C++', 'Javascript', 'C#'],
    'Score': [78, 97, 67]
})
# Append the new DataFrame to the existing file
result = pd.concat([existing_file, df])
# Write the combined DataFrame to the existing file
result.to_excel('example.xlsx', index=False)
The reasons for using pandas concat:
To append directly to an existing file, you would need pandas.ExcelWriter in append mode (mode='a'), but the XlsxWriter engine does not support append mode.
DataFrame.append() could also have done the job, but it was deprecated and then removed in pandas 2.0, so we use concat instead.
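To see the concat step on its own, without any files involved, here is a small in-memory sketch using the same data as above. The ignore_index=True argument is my addition; it is not needed when writing with index=False, but it keeps the row labels tidy if you inspect the result.

```python
import pandas as pd

existing = pd.DataFrame({"Language": ["Python", "Java", "Ruby"], "Score": [100, 98, 88]})
new_rows = pd.DataFrame({"Language": ["C++", "Javascript", "C#"], "Score": [78, 97, 67]})

# ignore_index=True renumbers the rows 0..n-1 instead of repeating 0, 1, 2.
result = pd.concat([existing, new_rows], ignore_index=True)
print(result)
```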
The OP is using xlsxwriter in the engine parameter. Per XlsxWriter documentation "XlsxWriter is designed only as a file writer. It cannot read or modify an existing Excel file." (link to XlsxWriter Docs).
Below I've provided a fully reproducible example of how you can go about modifying an existing .xlsx workbook using the openpyxl module (link to Openpyxl Docs).
For demonstration purposes, I'll first create a workbook called test.xlsx using pandas:
import pandas as pd
df = pd.DataFrame({'Col_A': [1,2,3,4],
                   'Col_B': [5,6,7,8],
                   'Col_C': [0,0,0,0],
                   'Col_D': [13,14,15,16]})
df.to_excel('test.xlsx', index=False)
This is the expected output at this point:
Col_A  Col_B  Col_C  Col_D
1      5      0      13
2      6      0      14
3      7      0      15
4      8      0      16
With openpyxl you can load the existing workbook ('test.xlsx') and overwrite the third column with data from a new dataframe while preserving the rest of the existing data. For simplicity, this example updates it with a one-column dataframe, but you could extend it to update or add more data.
from openpyxl import load_workbook
import pandas as pd
df_new = pd.DataFrame({'Col_C': [9, 10, 11, 12]})
wb = load_workbook('test.xlsx')
ws = wb['Sheet1']
for index, row in df_new.iterrows():
    # +2 because Excel rows are 1-based and row 1 holds the header
    cell = 'C%d' % (index + 2)
    ws[cell] = row.iloc[0]
wb.save('test.xlsx')
With the expected output at the end:
Col_A  Col_B  Col_C  Col_D
1      5      9      13
2      6      10     14
3      7      11     15
4      8      12     16
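Since the original question was about adding a dataframe below the existing rows, here is a rough sketch of that variant with openpyxl. The file lives in a temporary directory and the column values are made up for the demo; it assumes openpyxl is installed and that the sheet is called Sheet1 (the pandas default).

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows

# Work in a temporary directory so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "test.xlsx")
pd.DataFrame({"Col_A": [1, 2], "Col_B": [5, 6]}).to_excel(path, index=False)

df_extra = pd.DataFrame({"Col_A": [3, 4], "Col_B": [7, 8]})

wb = load_workbook(path)
ws = wb["Sheet1"]
# ws.append() writes each row directly below the last used row.
for values in dataframe_to_rows(df_extra, index=False, header=False):
    ws.append(values)
wb.save(path)

check = pd.read_excel(path)
print(check)
```

The header is skipped (header=False) because the sheet already has one.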
Experts, I want to remove the very first row from an Excel file using Python. I am sharing a screenshot of my source Excel file.
I want the output as:
I am using the Python code below to remove the first row, but when I read the file as a dataframe and print it, the data is read as shown in the screenshot below.
The code I am using is:
import pandas as pd
import os
def remove_header():
    file_name = "AV Clients.xlsx"
    os.chmod(file_name, 0o777)
    df = pd.read_excel(file_name) #Read Excel file as a DataFrame
    #df = df.drop([0])
    print(df)
    #df.to_excel("AV_Clients1.xlsx", index=False)
remove_header()
Please suggest how I can remove the very first row from the Excel file whose screenshot I shared at the top.
Thanks in advance
Kawaljeet
Just add the skiprows argument when reading the Excel file.
import pandas as pd
import os
def remove_header():
    file_name = "AV Clients.xlsx"
    os.chmod(file_name, 0o777)
    df = pd.read_excel(file_name, skiprows = 1) #Read Excel file as a DataFrame
    print(df)
    df.to_excel("AV_Clients1.xlsx", index=False)
remove_header()
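To see what skiprows=1 does without needing the original file, here is a small self-contained sketch. The workbook is built in memory with openpyxl, and the title text, column names, and client rows are all invented for the demo.

```python
import io

import pandas as pd
from openpyxl import Workbook

# Build a small workbook in memory whose first row is an unwanted title.
wb = Workbook()
ws = wb.active
ws.append(["Client report (unwanted title row)"])
ws.append(["Client", "City"])
ws.append(["Acme", "Pune"])
ws.append(["Globex", "Delhi"])
buffer = io.BytesIO()
wb.save(buffer)
buffer.seek(0)

# skiprows=1 drops the first row, so the second row becomes the header.
df = pd.read_excel(buffer, skiprows=1)
print(df)
```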
I have an Excel workbook with 5 tabs that I need to copy and paste text files into. I know how to write to a certain cell of a normal Excel file with one tab, but I have no idea how to copy and paste each text file into the correct tab, as there are formulas in each one.
Any ideas?
You can achieve this using Python pandas:
import pandas as pd
# Create some Pandas dataframes from some data.
df1 = pd.DataFrame({'Data': [11, 12, 13, 14]})
df2 = pd.DataFrame({'Data': [21, 22, 23, 24]})
df3 = pd.DataFrame({'Data': [31, 32, 33, 34]})
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('pandas_multiple.xlsx', engine='xlsxwriter')
# Write each dataframe to a different worksheet.
df1.to_excel(writer, sheet_name='Sheet1')
df2.to_excel(writer, sheet_name='Sheet2')
df3.to_excel(writer, sheet_name='Sheet3')
# Close the Pandas Excel writer and output the Excel file.
# (writer.save() was removed in pandas 2.0; use close() instead.)
writer.close()
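With more than a couple of sheets, the same idea can be written as a loop. This is only a sketch: the frames dict is a hypothetical stand-in for your own data, and an in-memory buffer replaces the output path so the example has no side effects. Reading the sheet names back assumes openpyxl is installed.

```python
import io

import pandas as pd
from openpyxl import load_workbook

# Hypothetical mapping of sheet name -> dataframe.
frames = {
    "Sheet1": pd.DataFrame({"Data": [11, 12, 13, 14]}),
    "Sheet2": pd.DataFrame({"Data": [21, 22, 23, 24]}),
    "Sheet3": pd.DataFrame({"Data": [31, 32, 33, 34]}),
}

buffer = io.BytesIO()  # stands in for a file path such as 'pandas_multiple.xlsx'
# The with block closes the writer automatically, so no explicit close() is needed.
with pd.ExcelWriter(buffer) as writer:
    for sheet_name, frame in frames.items():
        frame.to_excel(writer, sheet_name=sheet_name, index=False)

buffer.seek(0)
sheet_names = load_workbook(buffer).sheetnames
print(sheet_names)
```

Note that this creates a fresh workbook, so it will not keep formulas that live in an existing file.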
I'm working on a project where I have to take an Excel file, make changes to the data, and save it.
from pandas import ExcelWriter
import pandas as pd
dfs = pd.read_excel("infile.xlsx")
# manipulate data
writer = ExcelWriter('outfile.xlsx')
dfs.to_excel(writer, sheet_name='Sheet5')
writer.close()
The problem is that the newly saved Excel file does not have the same formatting (cell width, bold text, borders) as the input file. What can I do to solve this issue?
You can't preserve the formatting because pandas throws away all that information upon import. You would need to specify the formatting options you want in the output with the ExcelWriter object. If you use the option engine='xlsxwriter' you can then use all the xlsxwriter formatting options before writing the final file. You can find more details in the XlsxWriter documentation.
Example:
import pandas as pd
# This removes the default header style so we can override it later
import pandas.io.formats.excel
pandas.io.formats.excel.header_style = None
# Create a Pandas dataframe from some data.
df = pd.DataFrame({'Data1': [10, 20, 30, 20, 15, 30, 45],
                   'Data2': [90, 80, 30, 15, 88, 34, 41]})
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('pandas_conditional.xlsx', engine='xlsxwriter')
# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')
# Get the xlsxwriter workbook and worksheet objects.
workbook = writer.book
worksheet = writer.sheets['Sheet1']
# Create Format objects to apply to sheet
# https://xlsxwriter.readthedocs.io/format.html#format-methods-and-format-properties
red_bold = workbook.add_format({'bold': True, 'font_color': 'red'})
border = workbook.add_format({'border':5, 'border_color':'blue'})
# Apply formatting to sheet (set_column takes column ranges such as 'A:A')
worksheet.set_column('C:C', None, red_bold)
worksheet.set_column('A:A', None, border)
# Apply a conditional format to a cell range.
worksheet.conditional_format('B2:B8', {'type': '3_color_scale'})
# Close the Pandas Excel writer and output the Excel file.
# (writer.save() was removed in pandas 2.0; use close() instead.)
writer.close()
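If the goal is to keep the formatting that is already in the input file, a different route (not the pandas/XlsxWriter approach above) is to edit the cell values in place with openpyxl. Here is a rough sketch under that assumption; the workbook is built in memory, and the sheet contents are invented for the demo. openpyxl keeps cell styles and column widths, though it may not round-trip every Excel feature.

```python
import io

from openpyxl import Workbook, load_workbook
from openpyxl.styles import Font

# Build a small formatted workbook in memory (stands in for 'infile.xlsx').
wb = Workbook()
ws = wb.active
ws.title = "Sheet5"
ws.append(["Item", "Price"])
ws.append(["Widget", 10])
ws["A1"].font = Font(bold=True)
ws.column_dimensions["A"].width = 30
source = io.BytesIO()
wb.save(source)
source.seek(0)

# Reopen the workbook and change only the cell values; styles are left alone.
wb2 = load_workbook(source)
ws2 = wb2["Sheet5"]
ws2["B2"] = ws2["B2"].value * 2
result = io.BytesIO()  # stands in for 'outfile.xlsx'
wb2.save(result)
result.seek(0)

check = load_workbook(result)["Sheet5"]
print(check["B2"].value, check["A1"].font.bold, check.column_dimensions["A"].width)
```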
import pandas as pd
df1=pd.read_csv('out.csv')
df2=pd.read_excel('file.xls')
df2['Location']=df1['Location']
df2['Sublocation']=df1['Sublocation']
df2['Zone']=df1['Zone']
df2['Subnet Type']=df1['Subnet Type']
df2['Description']=df1['Description']
newfile = input("Enter a name for the combined xlsx file: ")
print('Saving to new xlsx file...')
writer = pd.ExcelWriter(newfile)
df2.to_excel(writer, index=False)
writer.close()
Basically, it reads a CSV file with 5 columns and an XLS file with existing columns, then creates an XLSX file that combines the two, with the 5 CSV columns added as new columns.
It works, but only for 4999 rows; the last 10 rows don't have the 5 new columns in the new XLSX file.
I am a little confused about the problem, so I came up with 2 options:
1. Append df1 to df2
2. Merge df1 into df2 (adds new columns to the existing df)
I think that in your case the CSV and Excel files don't have the same number of rows, which is why the last 10 rows have no values in the output.
import numpy as np
import pandas as pd
df1 = pd.DataFrame(np.array([
    ['a', 51, 61],
    ['b', 52, 62],
    ['c', 53, 63]]),
    columns=['name', 'attr11', 'attr12'])
df2 = pd.DataFrame(np.array([
    ['a', 31, 41],
    ['b', 32, 42],
    ['c', 33, 43],
    ['d', 34, 44]]),
    columns=['name', 'attr21', 'attr22'])
# Option 1: append (DataFrame.append was removed in pandas 2.0, so use pd.concat)
df3 = pd.concat([df1, df2])
print(df3)
# Option 2: merge on the shared 'name' column
print(pd.merge(df1, df2, on='name', how='right'))
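To illustrate why a row-count mismatch leaves the last rows empty, here is a tiny sketch of what column assignment does. The two frames and their values are made-up stand-ins for the CSV and Excel data in the question.

```python
import pandas as pd

# Hypothetical stand-ins: the "CSV" frame has one row fewer than the "Excel" frame.
df_csv = pd.DataFrame({"Location": ["NY", "LA"]})
df_xls = pd.DataFrame({"Subnet": ["10.0.0.0", "10.0.1.0", "10.0.2.0"]})

# Column assignment aligns on the index, so rows without a match become NaN.
df_xls["Location"] = df_csv["Location"]

print(len(df_csv), len(df_xls))
print(df_xls)
missing = int(df_xls["Location"].isna().sum())
print(missing)
```

Comparing len(df1) with len(df2) before assigning the columns is a quick way to confirm whether this is what happens with your files.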
Most likely there is a way to do what you want within pandas, but in case there isn't, you can use lower-level packages to accomplish your task.
To read the CSV file, use the csv module that comes with Python. The following code loads all the data into a Python list, where each element of the list is a row in the CSV. Note that this code is not as compact as an experienced Python programmer would write. I've tried to strike a balance between being readable for Python beginners and being "idiomatic":
import csv
# In Python 3, open CSV files in text mode with newline='' (not 'rb')
with open('input1.csv', newline='') as f:
    reader = csv.reader(f)
    csvdata = []
    for row in reader:
        csvdata.append(row)
To read the .xls file, use xlrd, which should already be installed since pandas uses it, but you can install it separately if needed. Again, the following code is not the shortest possible, but is hopefully easy to understand:
import xlrd
wb = xlrd.open_workbook('input2.xls')
ws = wb.sheet_by_index(0) # use the first sheet
xlsdata = []
for rx in range(ws.nrows):
    xlsdata.append(ws.row_values(rx))
Finally, write out the combined data to a .xlsx file using XlsxWriter. This is another package that may already be installed if you've used pandas to write Excel files, but can be installed separately if needed. Once again, I've tried to stick to relatively simple language features. For example, I've avoided zip(), whose workings might not be obvious to Python beginners:
import xlsxwriter
wb = xlsxwriter.Workbook('output.xlsx')
ws = wb.add_worksheet()
assert len(csvdata) == len(xlsdata) # we expect the same number of rows
for rx in range(len(csvdata)):
    ws.write_row(rx, 0, xlsdata[rx])
    ws.write_row(rx, len(xlsdata[rx]), csvdata[rx])
wb.close()
Note that write_row() lets you choose the destination cell of the leftmost data element. So I've used it twice for each row: once to write the .xls data at the far left, and once more to write the CSV data with a suitable offset.
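For readers who are comfortable with zip(), the row pairing can be sketched as below. The two small lists are hypothetical stand-ins for xlsdata and csvdata, and each combined row could then be written with a single write_row() call. Keep in mind that zip() stops at the shorter list, so the length check above is still worth keeping.

```python
# Hypothetical stand-ins for the rows loaded from the .xls and CSV files.
xlsdata = [["id", "name"], [1, "a"], [2, "b"]]
csvdata = [["Location", "Zone"], ["NY", "east"], ["LA", "west"]]

# zip() pairs the first rows of both lists, then the second rows, and so on.
combined = [xls_row + csv_row for xls_row, csv_row in zip(xlsdata, csvdata)]

for row in combined:
    print(row)
```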
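I think you should append the data:
import pandas as pd
df1=pd.read_csv('out.csv')
df2=pd.read_excel('file.xls')
# DataFrame.append was removed in pandas 2.0 and never modified df2 in place,
# so use pd.concat and keep the result
df2 = pd.concat([df2, df1], ignore_index=True)
newfile = input("Enter a name for the combined xlsx file: ")
print('Saving to new xlsx file...')
writer = pd.ExcelWriter(newfile)
df2.to_excel(writer, index=False)
writer.close()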