How to save an Excel file with all its existing parameters/settings - Python

I have an Excel file (.xlsx) with many sheets, each containing many rows and columns with different column colors, sizes, and so on. I only need to add some new rows (that is, work with the data, not with what I think is called "conditional formatting").
I use pandas to import and save the Excel file. I'm also quite new to Python, and I have not found the answer yet.
So my question is: is it possible to open an Excel file and update it while keeping all its existing parameters, such as colors and font sizes?
I'm trying to open and save it like this:
my_excel_file = pd.read_excel(r'my_file.xlsx')
my_excel_file.to_excel('my_file.xlsx', index=False)
After this, all the sheets but one are gone, and so are the colors, font sizes, and so on.

From what it appears, you are opening one tab/sheet with pandas and then saving that dataframe to a single sheet as well. Keep in mind that pandas reads only the cell values, so colors and font sizes are not carried over.
If you want to load all the data from the various tabs of your Excel workbook, first load each tab into its own named dataframe and then save each of those dataframes to a specific tab, for example:
import pandas as pd
df1 = pd.read_excel('file_name.xlsx', engine='openpyxl', sheet_name='tab1')
df2 = pd.read_excel('file_name.xlsx', engine='openpyxl', sheet_name='tab2')
Thereafter, write these dataframes to different worksheets through a single pd.ExcelWriter. Calling to_excel twice with the same file path would overwrite the file each time and leave only the last sheet:
with pd.ExcelWriter('name_of_excel_file.xlsx') as writer:
    df1.to_excel(writer, sheet_name='Sheet_name_1', index=False)
    df2.to_excel(writer, sheet_name='Sheet_name_2', index=False)
Repeat as many times as you need to. You can even write a function or loop to do it for you if it is repetitive.
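If the goal is only to add rows while keeping the existing formatting, another option is to edit the workbook in place with openpyxl rather than round-tripping through pandas. The following is a minimal sketch, assuming the new rows go at the bottom of a sheet; the file name, the sheet names and the append_rows helper are made up for illustration:

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook
from openpyxl.styles import Font


def append_rows(path, sheet_name, rows):
    """Append rows to one sheet of an existing workbook and save it in place."""
    wb = load_workbook(path)      # loads every sheet together with its cell styles
    ws = wb[sheet_name]
    for row in rows:
        ws.append(row)            # writes below the last used row
    wb.save(path)


# Demo on a throwaway workbook (hypothetical file and sheet names)
demo_path = os.path.join(tempfile.mkdtemp(), "my_file.xlsx")
demo_wb = Workbook()
demo_ws = demo_wb.active
demo_ws.title = "tab1"
demo_ws.append(["name", "value"])
demo_ws["A1"].font = Font(bold=True)   # some formatting to preserve
demo_wb.create_sheet("tab2")
demo_wb.save(demo_path)

append_rows(demo_path, "tab1", [["a", 1], ["b", 2]])

# Reload to confirm that both sheets and the bold header are still there
check = load_workbook(demo_path)
print(check.sheetnames, check["tab1"].max_row, check["tab1"]["A1"].font.bold)
```

openpyxl keeps cell styles and the other sheets on save, but some content (for example images or charts) may not survive the round trip, so it is worth trying this on a copy of the file first.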

Related

I have a CSV file with many columns and many rows. How do I create a one column one Excel sheet from Python?

This is my database:
https://archive.ics.uci.edu/ml/datasets/Parkinson+Speech+Dataset+with++Multiple+Types+of+Sound+Recordings
This database consists of training data and test data. The training data consists of many features; one column is one feature. I intend to convert each column into a separate Excel sheet.
The following is the Python code I wrote to convert the entire text file into a CSV. What I actually want is to convert the entire text file into Excel sheets. For example, if the text file contains 10 columns, I want to create 10 Excel sheets, with each column in its own sheet. Can anyone guide me on how to do it? I am completely new to Python, so I hope someone can help me.
import pandas as pd
read_file = pd.read_csv(r'C://Users/RichardStone/Pycharm/Project/train_data.txt')
read_file.to_csv(r'C://Users/RichardStone/Pycharm/Project/train_data.csv', index=None)
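Try this. It writes each column to its own .xlsx file (Sheet1.xlsx, Sheet2.xlsx, and so on):
sheetnames = list()
for i in range(len(read_file.columns)):
    sheetnames.append('Sheet' + str(i+1))
for i in range(len(read_file.columns)):
    # Column i goes to its own workbook named after sheetnames[i]
    read_file.iloc[:, i].to_excel(sheetnames[i] + '.xlsx', index = False)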
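If the columns should instead end up as separate sheets inside one workbook, a single pd.ExcelWriter can be shared across the loop. This is only a sketch under that assumption; the columns_to_sheets helper, the output file name and the small demo frame standing in for train_data.txt are all made up:

```python
import os
import tempfile

import pandas as pd


def columns_to_sheets(df, path):
    """Write each column of df to its own sheet of a single workbook."""
    with pd.ExcelWriter(path) as writer:
        for i in range(len(df.columns)):
            # iloc with a list keeps a one-column DataFrame, header included
            df.iloc[:, [i]].to_excel(writer, sheet_name=f"Sheet{i + 1}", index=False)


# Small made-up frame standing in for the data read from train_data.txt
demo = pd.DataFrame({"f1": [1, 2], "f2": [3, 4], "f3": [5, 6]})
out_path = os.path.join(tempfile.mkdtemp(), "train_data.xlsx")
columns_to_sheets(demo, out_path)

# sheet_name=None reads every sheet back as {sheet name: DataFrame}
sheets = pd.read_excel(out_path, sheet_name=None)
print(list(sheets))
```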

I want to sort the data in each sheet of an Excel file by a column. (The Excel file has multiple sheets)

Excel Data
This is the data I have in an Excel file. There are 10 sheets containing different data, and I want to sort the data in each sheet by the 'BA_Rank' column in descending order.
After sorting, I have to write the sorted data to an Excel file
(e.g. the data from sheet1 of the unsorted file should be written to sheet1 of the sorted file, and so on).
If I remove the heading from the first row, I can use the pandas sort_values() function to sort the data in the first sheet and save it to another file,
like this:
import pandas as pd
import xlrd
doc = xlrd.open_workbook('without_sort.xlsx')
xl = pd.read_excel('without_sort.xlsx')
length = doc.nsheets
#print(length)
#for i in range(0,length):
#sheet = xl.parse(i)
result = xl.sort_values('BA_Rank', ascending = False)
result.to_excel('SortedData.xlsx')
print(result)
So is there any way to sort the data without removing the header from the first row?
And how can I iterate over the sheets so as to sort the data in each of them?
(Note: All the sheets contain the same columns, and I need to sort every sheet by 'BA_Rank' in descending order.)
First, you don't need to call xlrd yourself when using pandas; reading the file is handled under the hood.
Second, the read_excel method is really flexible. You can (and in my opinion should) specify the sheet you're pulling data from. You can also set rows to skip, say where the header row is, or tell it to ignore the header (and then set column names manually). Check the docs; they are quite extensive.
If the "10 sheets" figure is only an example, you could use something like xlrd to get the number of sheets in the workbook and work by index (or extract the sheet names directly).
The sorting looks right to me.
Finally, if you want to save it all in the same workbook, I would use openpyxl or a similar library (there are many others, such as pyexcelerate for large files).
This procedure pretty much always looks like:
1. Create/open the destination file (often it's the same method)
2. Write the data, sheet by sheet
3. Close/save the file
If all the data is to be written to the same sheet, pd.concat(all_dataframes).to_excel("path_to_store") should get it done, where all_dataframes is a list of dataframes.
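The open, write sheet by sheet, save procedure can also be done with pandas alone. A rough sketch, assuming every sheet has its header in the first row and a 'BA_Rank' column; the sort_all_sheets helper and the demo data are invented for illustration:

```python
import os
import tempfile

import pandas as pd


def sort_all_sheets(src, dst, column="BA_Rank"):
    """Sort every sheet of src by column (descending) and write them all to dst."""
    # sheet_name=None returns {sheet name: DataFrame}; by default the first
    # row of each sheet is used as the header, so it does not need removing.
    sheets = pd.read_excel(src, sheet_name=None)
    with pd.ExcelWriter(dst) as writer:
        for name, frame in sheets.items():
            frame.sort_values(column, ascending=False).to_excel(
                writer, sheet_name=name, index=False
            )


# Demo with a made-up two-sheet workbook
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "without_sort.xlsx")
dst = os.path.join(workdir, "SortedData.xlsx")
with pd.ExcelWriter(src) as writer:
    pd.DataFrame({"Name": ["a", "b", "c"], "BA_Rank": [2, 9, 5]}).to_excel(
        writer, sheet_name="Sheet1", index=False
    )
    pd.DataFrame({"Name": ["d", "e"], "BA_Rank": [1, 7]}).to_excel(
        writer, sheet_name="Sheet2", index=False
    )

sort_all_sheets(src, dst)
result = pd.read_excel(dst, sheet_name=None)
print(result["Sheet1"])
```

Reading with sheet_name=None avoids having to know the number of sheets in advance, and the sheet names are reused so that sheet1 of the input lands in sheet1 of the output.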

Loop through list of pandas dataframes and write them to different tabs in one Excel file (from Jupyter notebook)

I have a dataframe in my Jupyter notebook that I can successfully write to an Excel file with pandas ExcelWriter, but I'd rather split the dataframe into smaller dataframes (based on its index), then loop through them to write each to a different sheet in one Excel file. This seems syntactically correct but my code cell just runs without ever finishing:
path = r'/root/notebooks/my_file.xlsx'
writer = ExcelWriter(path)
sheets = df.index.unique().tolist()
for sheet in sheets:
    df.loc[sheet].to_excel(writer, sheet_name=sheet, index=False)
writer.save()
I've tried a few different approaches without any luck. Am I missing something simple?
It is hard to determine the issue in your system without the error message (as you have said, you have an infinite loop). You might check the size of your dataset as you are putting only one row for each excel sheet. If you have plenty of rows, then you will have that many sheets.
However, I tried your code with my own dataset and there are some errors that can be fixed anyway.
import pandas as pd
path = 'raw/test_so.xlsx'
sheets = df.index.unique().tolist()
# The context manager saves and closes the file on exit
# (writer.save() was removed in pandas 2.0)
with pd.ExcelWriter(path) as writer:
    for sheet in sheets:
        df.loc[[sheet]].to_excel(writer, sheet_name=str(sheet), index=False)
Note the df.loc[[sheet]] for each sheet: the double brackets always return a dataframe, so the column headers are still written to Excel.
If your dataframe index is an integer, make sure that you use sheet_name=str(sheet), as the sheet name cannot be an integer.
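An alternative to looking up each index value with df.loc is to let groupby do the splitting. This is a sketch rather than a drop-in fix; the write_groups helper and the demo frame are hypothetical:

```python
import os
import tempfile

import pandas as pd


def write_groups(df, path):
    """Write the rows sharing each index value to their own sheet."""
    with pd.ExcelWriter(path) as writer:   # saved and closed on exit
        # level=0 groups on the index; sort=False keeps first-seen order
        for key, group in df.groupby(level=0, sort=False):
            # Sheet names must be strings, and Excel limits them to 31 characters
            group.to_excel(writer, sheet_name=str(key)[:31], index=False)


# Made-up frame with a repeated integer index
demo = pd.DataFrame({"value": [10, 20, 30]}, index=[1, 1, 2])
out_path = os.path.join(tempfile.mkdtemp(), "my_file.xlsx")
write_groups(demo, out_path)

written = pd.read_excel(out_path, sheet_name=None)
print(list(written))
```

Each group is already a dataframe, so the header row is written without the double-bracket trick.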

How to copy the data of a column with the same name from multiple sheets and paste it into another Excel file

I have two different Excel files. One is the master that I need to update, and the second contains the data I need to copy.
This is what the master Excel file I want to update looks like:
Excel Master
The other Excel file contains the column data I need to copy into the master file. It has three sheets called GSM_CDDData, UMTS_CDDData, and LTE_CDDData, and all three contain a column with the same name, CELLNAME, whose data I need to copy and add to the master file.
That file looks like this:
Source
After copying this data, I want to paste it into the master file, in the column called CELL.
Any ideas how to do that?
As far as I know, the available libraries are pandas and openpyxl.
If you want to copy and paste the data itself, you might consider using the pywin32 package (if you are on Windows and have Excel installed):
from win32com.client import Dispatch
# file_name, sheet_1, sheet_2 and index_number are placeholders to fill in
xl = Dispatch("Excel.Application")
wb1 = xl.Workbooks.Open(Filename=file_name)
ws1 = wb1.Worksheets(sheet_1)
ws2 = wb1.Worksheets(sheet_2)
# Copy one column of ws1 onto the same column of ws2
ws1.Columns(index_number).Copy(ws2.Columns(index_number))
wb1.Save()
If you don't want to use pywin32, you could import the relevant columns with pd.read_excel() and concatenate the resulting objects with pd.concat. From there you can save the new dataframe using to_excel.
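The pandas route could look roughly like the sketch below, with openpyxl doing the final write so that the rest of the master file stays as it is. It assumes the master sheet has its headers in row 1 and that the new values should go below the existing rows; the copy_cellnames helper, the file names and the "Master" sheet name are invented for the demo:

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook, load_workbook

SOURCE_SHEETS = ["GSM_CDDData", "UMTS_CDDData", "LTE_CDDData"]


def copy_cellnames(source_path, master_path, master_sheet):
    """Append CELLNAME values from the source sheets under the master's CELL column."""
    # A list of sheet names returns {sheet name: DataFrame}
    frames = pd.read_excel(source_path, sheet_name=SOURCE_SHEETS, usecols=["CELLNAME"])
    names = pd.concat([frames[s] for s in SOURCE_SHEETS], ignore_index=True)["CELLNAME"]

    wb = load_workbook(master_path)
    ws = wb[master_sheet]
    headers = [cell.value for cell in ws[1]]   # assumes headers are in row 1
    col = headers.index("CELL") + 1            # openpyxl columns are 1-based
    start = ws.max_row + 1                     # first row below the existing data
    for offset, name in enumerate(names.tolist()):
        ws.cell(row=start + offset, column=col, value=name)
    wb.save(master_path)


# Demo files (made-up contents)
workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, "source.xlsx")
master_path = os.path.join(workdir, "master.xlsx")
demo_data = {"GSM_CDDData": ["G1", "G2"], "UMTS_CDDData": ["U1"], "LTE_CDDData": ["L1"]}
with pd.ExcelWriter(source_path) as writer:
    for sheet, values in demo_data.items():
        pd.DataFrame({"CELLNAME": values, "OTHER": 0}).to_excel(
            writer, sheet_name=sheet, index=False
        )
master = Workbook()
master.active.title = "Master"
master.active.append(["ID", "CELL"])
master.active.append([1, "X1"])
master.save(master_path)

copy_cellnames(source_path, master_path, "Master")
cells = [row[1] for row in load_workbook(master_path)["Master"].iter_rows(values_only=True)]
print(cells)
```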

Using pandas to replace data in excel sheet

I came up with a way to copy data from a sheet in an Excel file:
import pandas as pd
origionalFile = pd.ExcelFile('AnnualReport-V5.0.xlsx')
Transfers = pd.read_excel(origionalFile, 'Sheet1')
I have another Excel file, named 'AnnualReport-V6.0.xlsx', which has existing data in a sheet named 'Transfers'. I want to use the dataframe I created earlier to replace the data in the 'Transfers' sheet of 'AnnualReport-V6.0.xlsx' from column B onward, leaving column A as it is.
I did a few searches; the closest to what I want is this:
Modifying an excel sheet in a excel book with pandas
but it does not let me keep column A of the original sheet (column A has some equations that I want to keep). Any idea how to do it? Thanks.
Would reading column A and inserting it to the fresh data you want to write solve your problem?
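One caveat with that idea is that pandas reads cell values rather than formulas, so the equations in column A would not come back as equations. A different approach is to write the dataframe into the existing sheet with openpyxl, starting at column B and never touching column A. This is a sketch under the assumption that the headers stay in row 1 and the data starts in row 2; the write_from_column_b helper and the demo workbook are made up:

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook, load_workbook


def write_from_column_b(df, path, sheet_name, first_row=2):
    """Overwrite cells from column B onward with df, leaving column A untouched."""
    wb = load_workbook(path)      # by default formulas are loaded as formulas
    ws = wb[sheet_name]
    for r, row in enumerate(df.itertuples(index=False), start=first_row):
        for c, value in enumerate(row, start=2):   # 2 is column B
            ws.cell(row=r, column=c, value=value)
    wb.save(path)


# Demo workbook standing in for AnnualReport-V6.0.xlsx
path = os.path.join(tempfile.mkdtemp(), "report.xlsx")
demo_wb = Workbook()
demo_ws = demo_wb.active
demo_ws.title = "Transfers"
demo_ws.append(["Total", "x", "y"])
demo_ws.append(["=SUM(B2:C2)", 1, 2])
demo_wb.save(path)

Transfers = pd.DataFrame({"x": [10], "y": [20]})
write_from_column_b(Transfers, path, "Transfers")

ws = load_workbook(path)["Transfers"]
print(ws["A2"].value, ws["B2"].value, ws["C2"].value)
```

Rows of old data beyond the length of the dataframe are not cleared by this sketch, so they would need to be blanked separately if the new data is shorter.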
