Writing dataframe values to google sheet through gspread_dataframe - python

I am working on an automation that sends the values of a dataframe to a Google Sheet. The following is my code for a sample dataframe, which is similar to the one I am working on:
import pandas as pd
# Dictionary holding one value per column (a single row) for the sample dataframe
col = {'id':["1"],'name':["Juan"], 'code':["1563"], 'group':["3"], 'class':["A"]}
# Create the pandas dataframe
df = pd.DataFrame(col)
df
I need to send only the dataframe values to the Google Sheet, without the header. This is just a sample of the data I am working with. I still need the header in the dataframe itself, because the data comes from an API and I apply some column transformations before sending it to Sheets.
This is the code to send the dataframe to google sheet:
import gspread
from gspread_dataframe import set_with_dataframe
gc = gspread.service_account(filename='API_creds.json')
sheet = gc.open_by_key('SHEET_ID')
# Send the values of the aimleap dataframe to the Google Sheet
row=1
col=1
worksheet = sheet.get_worksheet(0)
set_with_dataframe(worksheet,df,row,col)
After sending the dataframe with set_with_dataframe(worksheet,df,row,col), the sheet is updated with the dataframe including the header. I only need the sheet to receive the values of the dataframe. How could I modify the parameters of set_with_dataframe() to achieve this?
This is how it looks when sending the dataframe:

You should be able to do this by setting the include_column_header argument to False.
set_with_dataframe(worksheet,df,row,col, include_column_header=False)
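A minimal sketch of that call in context, assuming gspread_dataframe is installed and that `worksheet` is an authorized gspread worksheet. The helper names `values_only` and `send_values_only` are hypothetical; `values_only` is only a local preview of the rows that should end up in the sheet, so it runs without Google credentials.

```python
import pandas as pd


def values_only(df):
    """Return the DataFrame's rows as plain lists, without the header."""
    return df.to_numpy().tolist()


def send_values_only(worksheet, df, row=1, col=1):
    """Write df to the worksheet starting at (row, col), skipping header and index.

    gspread_dataframe is imported lazily so the preview helper above
    can be used on a machine without the Google libraries.
    """
    from gspread_dataframe import set_with_dataframe

    set_with_dataframe(
        worksheet,
        df,
        row=row,
        col=col,
        include_index=False,
        include_column_header=False,
    )


df = pd.DataFrame(
    {"id": ["1"], "name": ["Juan"], "code": ["1563"], "group": ["3"], "class": ["A"]}
)
print(values_only(df))
```

With the question's setup this would be called as `send_values_only(sheet.get_worksheet(0), df)`.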

Related

Pasting a dataframe into excel using openpyxl

I have a table which is the output of a SQL query, and I want to paste this table into a specific cell of an Excel sheet (say B10). Using openpyxl, I used to do
ws_hi["c31"].value = output_total.iloc[0]
(ws_hi is the Excel sheet and output_total is the variable which holds the data I want to copy)
which works fine for a single value but not for a table. I need help exporting output_total when it is a 4x5 table.
FYI, output_total is a dataframe obtained by
output_total = pd.read_sql_query(text(query), engine1)
thanks!
Since your data is stored in a pandas DataFrame, you can use pd.DataFrame.to_excel and specify the upper left cell where you want to dump your data with startrow and startcol.
In your case this would be something like the following for the B10 cell (startrow and startcol are zero-based, so row 10 is startrow=9 and column B is startcol=1):
output_total.to_excel('excel/file/path.xlsx', startrow=9, startcol=1, header=False, index=False)
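To check where the data lands, here is a small sketch that writes a stand-in 4x5 dataframe to a fresh file and reads it back with openpyxl. The sample data and the file name `b10_demo.xlsx` are hypothetical, and openpyxl is assumed to be installed. Note that `to_excel` with a file path creates (or overwrites) the workbook, so this does not preserve an existing file's contents.

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook

# Hypothetical stand-in for the 4x5 result of pd.read_sql_query
output_total = pd.DataFrame(
    [[r * 10 + c for c in range(5)] for r in range(4)],
    columns=list("abcde"),
)

path = os.path.join(tempfile.mkdtemp(), "b10_demo.xlsx")

# Zero-based offsets: row 10 -> startrow=9, column B -> startcol=1
output_total.to_excel(
    path, engine="openpyxl", startrow=9, startcol=1, header=False, index=False
)

ws = load_workbook(path).active
# Top-left and bottom-right corners of the pasted block
print(ws["B10"].value, ws["F13"].value)
```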

How to export python dataframe into existing excel sheet and retain formatting?

I am trying to export a dataframe I've generated in Pandas to an Excel Workbook. I have been able to get that part working, but unfortunately no matter what I try, the dataframe goes into the workbook as a brand new worksheet.
What I am ultimately trying to do here is create a program that pulls API data from a website and imports it into an existing Excel sheet in order to create some sort of "live updating Excel workbook". This means that the worksheet already has proper formatting, VBA, and other calculated columns applied, and all of this would ideally stay the same except for the basic data in the dataframe I'm importing.
Is there any way to go about this? Any direction at all would be quite helpful. Thanks.
Here is my current code:
file = 'testbook.xlsx'
writer = pd.ExcelWriter(file, engine='xlsxwriter')
df.to_excel(writer, sheet_name="Sheet1")
workbook = writer.book
worksheet = writer.sheets["Sheet1"]
writer.close()  # writer.save() on pandas versions before 2.0
If your existing Excel file and your DataFrame have the same format, you can simply read the existing file into another DataFrame, concatenate the two, and save the result to a new Excel file or over the existing one.
df1 = pd.read_excel('testbook.xlsx')  # data already in the workbook
df2 = df  # your DataFrame
df = pd.concat([df1, df2])
df.to_excel('testbook.xlsx', index=False)  # index=False avoids adding an extra index column on every run
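There are multiple ways of doing this; if you want to do it entirely with the pandas library, this will work. Note that to_excel writes a fresh workbook, so formatting in the existing file is not carried over.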
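If keeping the existing sheet matters, one alternative (not part of the answer above) is to open the workbook in append mode and overlay the values onto the existing sheet. This is a sketch assuming pandas 1.4 or later with the openpyxl engine; the helper name `overlay_dataframe` and its defaults are hypothetical.

```python
import pandas as pd


def overlay_dataframe(path, df, sheet_name="Sheet1", startrow=1, startcol=0):
    """Write df's values into an existing sheet without recreating the sheet.

    startrow/startcol are zero-based, so the default startrow=1 leaves the
    sheet's first row (assumed here to be an existing header) untouched.
    """
    with pd.ExcelWriter(
        path, engine="openpyxl", mode="a", if_sheet_exists="overlay"
    ) as writer:
        df.to_excel(
            writer,
            sheet_name=sheet_name,
            startrow=startrow,
            startcol=startcol,
            header=False,
            index=False,
        )
```

Usage would look like `overlay_dataframe('testbook.xlsx', df)`. Cells outside the written range are left alone. Macro-enabled workbooks are a separate concern: openpyxl only keeps VBA when the file is loaded with `keep_vba=True`, so test this on a copy first.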

Multiple Spreadsheets with Gspread

I'm really new with Python, and I’m working with gspread and Google Sheets. I have several spreadsheets I would like to pull data from. They all have the same name with an appended numerical value (e.g., SpreadSheet(1), SpreadSheet(2), SpreadSheet(3), etc.)
I would like to parse through each spreadsheet, pull the data, and generate a single data frame with the data. I can do this quite easily with a single spreadsheet, but I’m having trouble doing it with several.
I can create a list of the spreadsheet titles with the code below, but I'm not sure if that's the right direction.
titles_list = []
for spreadsheet in client.openall():
    titles_list.append(spreadsheet.title)
Using a mix of your starting code and @Tanaike's answer, here is a snippet of code that does what you expect.
# Create an authorized client
client = gspread.authorize(credentials)
# Create a list to hold the values
values = []
# Get all spreadsheets
for spreadsheet in client.openall():
    # Get spreadsheet's worksheets
    worksheets = spreadsheet.worksheets()
    for ws in worksheets:
        # Append the values of the worksheet to values
        values.extend(ws.get_all_values())
# create df from values
df = pd.DataFrame(values)
print(df)
Hope I was clear.
I believe your goal is as follows.
You want to merge the values retrieved from all sheets in a Google Spreadsheet.
You want to convert the retrieved values to the dataframe.
Each sheet has 4 columns, 100 rows and no header rows.
You want to achieve this using gspread with python.
You have already been able to get and put values for Google Spreadsheet using Sheets API.
For this, how about this answer?
Flow:
Retrieve all sheets in the Google Spreadsheet using worksheets().
Retrieve all values from all sheets using get_all_values() and merge the values.
Convert the retrieved values to the dataframe.
Sample script:
spreadsheetId = "###" # Please set the Spreadsheet ID.
client = gspread.authorize(credentials)
spreadsheet = client.open_by_key(spreadsheetId)
worksheets = spreadsheet.worksheets()
values = []
for ws in worksheets:
    values.extend(ws.get_all_values())
df = pd.DataFrame(values)
print(df)
References:
worksheets()
get_all_values()
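Since the question mentions several spreadsheets sharing a name with a numeric suffix, the two answers can be combined into one helper that filters by title. This is only a sketch: the function name `spreadsheets_to_dataframe` and the `title_prefix` parameter are hypothetical, `client` is assumed to be an authorized gspread client, and the sheets are assumed to have no header rows.

```python
import pandas as pd


def spreadsheets_to_dataframe(client, title_prefix):
    """Stack the values of every worksheet in every spreadsheet whose
    title starts with title_prefix into a single DataFrame."""
    values = []
    for spreadsheet in client.openall():
        if not spreadsheet.title.startswith(title_prefix):
            continue  # skip unrelated files the account can also see
        for ws in spreadsheet.worksheets():
            # get_all_values() returns a list of rows (lists of strings)
            values.extend(ws.get_all_values())
    return pd.DataFrame(values)
```

For the naming scheme in the question, this would be called as `spreadsheets_to_dataframe(client, "SpreadSheet(")`.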

Using pandas to replace data in excel sheet

I tried to come up with a way to copy data from a sheet in an excel file as
import pandas as pd
origionalFile = pd.ExcelFile('AnnualReport-V5.0.xlsx')
Transfers = pd.read_excel(origionalFile, 'Sheet1')
I have another Excel file, named 'AnnualReport-V6.0.xlsx', which has existing data in the sheet named 'Transfers'. I tried to use the dataframe I created earlier to replace the data in the 'Transfers' sheet of 'AnnualReport-V6.0.xlsx' starting from column B, leaving column A as it is.
I did a few searches; the closest to what I want is this:
Modifying an excel sheet in a excel book with pandas
but it does not let me keep column A of the original sheet (column A has some formulas that I want to keep). Any idea how to do it? Thanks
Would reading column A and inserting it to the fresh data you want to write solve your problem?
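Another option is to write the new values cell by cell with openpyxl, starting at column B, so that column A is never touched. The sketch below uses an in-memory workbook for illustration; the helper name `write_from_column_b`, the sample data, and the assumption that data starts in row 2 are all hypothetical.

```python
import pandas as pd
from openpyxl import Workbook


def write_from_column_b(ws, df, first_row=2, first_col=2):
    """Write df's values into ws starting at column B, leaving column A alone."""
    for r, row in enumerate(df.itertuples(index=False), start=first_row):
        for c, value in enumerate(row, start=first_col):
            ws.cell(row=r, column=c, value=value)


# In-memory demo; for the real file, use load_workbook('AnnualReport-V6.0.xlsx')
wb = Workbook()
ws = wb.active
ws.title = "Transfers"
ws["A2"] = "=B2*2"      # a formula in column A that must survive
ws["B2"] = "old value"  # existing data that gets replaced

transfers = pd.DataFrame({"amount": [10, 20], "note": ["x", "y"]})
write_from_column_b(ws, transfers)
print(ws["A2"].value, ws["B2"].value, ws["C3"].value)
```

On a real workbook, finish with `wb.save(...)`. This only overwrites the cells covered by the dataframe, so leftover rows from a longer previous dataset would need to be cleared separately.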

Append a pandas dataframe to an existing excel table

I need some help with the following.
I currently use python pandas to open a massive spreadsheet every day (this spreadsheet is a report, hence every day the data inside the spreadsheet is different). Pandas dataframe allows me to quickly crunch the data and generate a final output table, with much less data than the initial excel file.
Now, on day 1, I would need to add this output dataframe (3 rows 10 columns) to a new excel sheet (let's say sheet 1).
On day 2, I would need to take the new output of the dataframe and append it to the existing sheet 1. So at the end of day 2, the table in sheet1 would have 6 rows and 10 columns.
On day 3, same thing. I will launch my python pandas tool, read data from the excel report, generate an output dataframe 3x10 and append it again to my excel file.
I can't find a way to append to an existing excel table.
Could anybody help?
Many thanks in advance,
Andrea
If you use openpyxl's utilities for dataframes then you should be able to do everything you need with the existing workbook, assuming this fits into memory.
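from openpyxl import load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows
wb = load_workbook(r"C:\Andrea\master_file.xlsx")  # raw string keeps the backslashes literal
ws = wb[SHEETNAME]
# index=False, header=False appends only the data rows; drop them to include index and header
for row in dataframe_to_rows(dt_today, index=False, header=False):
    ws.append(row)
wb.save(r"C:\Andrea\master_file.xlsx")  # write the appended rows back to disk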
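The day-by-day flow from the question can be sketched with an in-memory workbook standing in for the real file. The helper name `append_dataframe`, its `include_header` flag, and the sample data are hypothetical; the idea is to write the header only on the first day and plain data rows afterwards.

```python
import pandas as pd
from openpyxl import Workbook
from openpyxl.utils.dataframe import dataframe_to_rows


def append_dataframe(ws, df, include_header=False):
    """Append df's rows below whatever is already in the worksheet."""
    for row in dataframe_to_rows(df, index=False, header=include_header):
        ws.append(row)


wb = Workbook()  # stands in for load_workbook(...) on the real file
ws = wb.active

day1 = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
day2 = pd.DataFrame({"a": [7, 8, 9], "b": [10, 11, 12]})

append_dataframe(ws, day1, include_header=True)  # day 1: header plus 3 rows
append_dataframe(ws, day2)                       # day 2: 3 more rows
print(ws.max_row)
```

On the real workbook, each daily run would load the file, call the helper once, and save it again.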
