google spreadsheets gspread append_row issue - python

I'm working on a program that generates a dynamic google spreadsheet report.
Sometimes, when I add a new row of data to the spreadsheet with gspread's append_row function, it doesn't work as expected and no exception is thrown: the new row is added, but it contains no data.
Example code below:
#!/usr/bin/python
import gspread
# report line data
report_line = ['name', 'finished <None or int>', 'duration <str>', 'id']
connection = gspread.login('email@google.com', 'password')
spreadsheet = connection.open('report_name')
sheet = spreadsheet.sheet1
sheet.append_row(report_line)
Am I missing something? Is this a known issue?
How can I be certain that the append_row function completes successfully?

It appends a new row after the last row in the sheet. A sheet has 1000 rows by default so you should find your appended row at index=1001.
Try to resize the sheet to the number of rows that are present:
sheet.resize(1)
You should now be able to append rows at the end of your data rather than at the end of the sheet. The number of rows has to be >= 1.

I'm adding to @BFTM's answer:
Make sure you call sheet.resize(1) only once, not every time you append a row (it deletes all the rows you wrote beyond row 1).
To get the number of rows dynamically, in case you can't do it manually, you can look at this answer (get the values from one column and find the length).
If you have direct access to the spreadsheet, you can just delete the empty rows once and then use append_row() as usual.
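The "get the values from one column and find the length" idea could be sketched roughly like this. The helper name trim_to_data is made up for illustration, and it assumes the sheet object behaves like a gspread worksheet (col_values() returning the column's cells down to the last populated one, and resize() accepting a rows argument):

```python
def trim_to_data(sheet, column=1):
    """Resize `sheet` so that it ends at the last populated cell of `column`.

    `sheet` is assumed to behave like a gspread worksheet, i.e. to offer
    col_values(col) and resize(rows=...). Returns the new row count.
    """
    # Assumption: col_values() stops at the last non-empty cell of the column.
    filled = len(sheet.col_values(column))
    # A sheet has to keep at least one row.
    rows = max(filled, 1)
    sheet.resize(rows=rows)
    return rows
```

Call it once before you start appending, for example trim_to_data(sheet) followed by sheet.append_row(report_line), rather than before every append.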

Related

Adding a new row of data to an existing ods sheet

I'm running a python script to automate some of my day-to-day tasks at work. One task I'm trying to do is simply add a row to an existing ods sheet that I usually open via LibreOffice.
This file has multiple sheets and depending on what my script is doing, it will add data to different sheets.
The thing is, I'm having trouble finding a simple and easy way to just add some data to the first unpopulated row of the sheet.
Reading about odslib3, pyexcel and other packages, it seems that to write a row I need to specify the row number and column explicitly, and opening the ods file just to see which cell to write to and then telling the Python script seems unproductive.
Is there a way to easily add a row of data to an ods sheet without specifying the row number and column?
If I understand the question, I believe that using .remove() and .append() will do the trick. It will create and populate data on the last row (can't say it's the most efficient, though).
Example:
from pyexcel_ods3 import get_data, save_data
data = get_data("info.ods")
print(data["Sheet1"])
# [['first_row', 'first_row'], []]
if [] in data["Sheet1"]:
    data["Sheet1"].remove([])  # remove the unpopulated row
data["Sheet1"].append(["second_row", "second_row"])  # add the new row
print(data["Sheet1"])
# [['first_row', 'first_row'], ['second_row', 'second_row']]
save_data("info.ods", data)  # write the change back to the file
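One caveat with .remove([]) is that it removes the first empty row it finds, which may be a deliberate blank line in the middle of the sheet. A minimal sketch of an alternative that only drops trailing empty rows, working on the plain list of rows that get_data returns for a sheet (the helper name is hypothetical):

```python
def append_after_last_populated(rows, new_row):
    """Append `new_row` right after the last populated row of `rows`.

    `rows` is a list of lists, one inner list per sheet row. The list is
    modified in place and also returned.
    """
    # Drop only trailing rows that are empty or hold nothing but blank cells.
    while rows and all(cell in ("", None) for cell in rows[-1]):
        rows.pop()
    rows.append(list(new_row))
    return rows
```

With the example above this would be append_after_last_populated(data["Sheet1"], ["second_row", "second_row"]), followed by saving the data back to the file.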

How to gather row id's of sheet that is being imported?

I am currently trying to move rows from a sheet I import, into another existing sheet. As of right now I am using result.data.id to return the sheet id that will be associated with the imported sheet. For further clarification here is the line of code I am using to move the rows.
response = smart.Sheets.move_rows(
    result.data.id,
    smart.models.CopyOrMoveRowDirective({
        'row_ids': [**IDK**],
        'to': smart.models.CopyOrMoveRowDestination({'sheet_id': 1174866712913796})
    })
)
I run into my issue when it prompts me to enter the row ids that I am trying to move into the existing sheet. As the imported sheet has not been created yet, I can not reference the row ids within the smart sheet. My question is: Is there something similar to result.data.id but for row ids? Or is there something else I can enter for row_ids that will just move all the rows in a sheet?
If you want to copy all rows from the sheet you've just imported, you should be able to use the Copy Sheet operation. As the documentation shows:
You specify the id of the sheet you want to copy, and information about the container (e.g., folder) where you want the new sheet to be created.
You also have the option of including additional parameters that specify which types of things you want to be included in the copy operation (e.g., attachments, discussions, forms, etc.)
Here's an example request in Python:
response = smart.Sheets.copy_sheet(
    4583173393803140,  # sheet_id
    smartsheet.models.ContainerDestination({
        'destination_type': 'folder',  # folder, workspace, or home
        'destination_id': 9283173393803140,  # folder_id
        'new_name': 'newSheetName'
    })
)
UPDATE (list rows):
Since I now understand that you want to copy all rows from the sheet you import into another sheet that already exists, the Copy Sheet approach that I've described above isn't appropriate.
Instead, after you've imported the sheet, you can use the Get Sheet operation to get data about the sheet you've imported. Here's an example of the Get Sheet operation in Python:
sheet = smartsheet_client.Sheets.get_sheet(
    4583173393803140)  # sheet_id
The response will contain (amongst other things), a rows property, which is an array of Row objects, representing all of the rows in the imported sheet. Each Row object contains an id property that specifies the ID of the row.
You'll need to iterate through that array of Row objects, adding the value of each id property to a comma-delimited list of IDs -- this will give you the list of IDs that the Move Rows to Another Sheet operation requires.
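Collecting the IDs could look roughly like the sketch below. It only assumes what is described above, namely that the object returned by Get Sheet has a rows property whose items each have an id property; the function name collect_row_ids is made up for illustration:

```python
def collect_row_ids(sheet):
    """Return the ID of every row in `sheet`, in sheet order.

    `sheet` is assumed to expose a `rows` sequence whose items each have
    an `id` attribute, as described for the Get Sheet response.
    """
    return [row.id for row in sheet.rows]
```

The resulting list is what you would pass as 'row_ids' in the CopyOrMoveRowDirective from your question.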

How to update a cell value in excel in Python

I have a spreadsheet with the following columns:
TestID TestData ExpectedOutput ActualOutput Result
I have a separate Python script for each test ID. Each script needs to read the row corresponding to its test ID and, after execution, update the result in the same spreadsheet. I am not able to update that result value. Can someone please help?
I read the spreadsheet using Pandas.
For example, a row in the spreadsheet:
TestID TestData ExpectedOutput ActualOutput Result
Testid-1 Min_freq=5,Max_freq=60, Drive started Drive started Pass
My script would search for this test ID and read the test data. After execution, it would compare the output with the expected output and update the Result cell accordingly. I don't understand how to update the result value.
Please help me.
Of the commonly used packages, openpyxl is the one that can modify an existing Excel (.xlsx) file.
You can read a file with xlrd, but you cannot modify it with xlwt or xlsxwriter, which can only create and write new xls and xlsx files.
Also, if the file is open in another program (such as Excel) while your script edits it, the two are not working on the same copy, so be sure to save and close it there before letting Python read it, and vice versa.
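A minimal sketch of updating the Result cell with openpyxl might look like this. The function name, the assumption that row 1 holds the headers, and the column positions (TestID in column 1, Result in column 5, following the layout in the question) are illustrative choices, not something taken from your actual file:

```python
from openpyxl import load_workbook


def update_result(path, test_id, result, id_col=1, result_col=5):
    """Write `result` into the Result cell of the row whose TestID is `test_id`.

    Column numbers are 1-based and follow the layout in the question.
    Returns True if a matching row was found and saved, False otherwise.
    """
    workbook = load_workbook(path)
    sheet = workbook.active
    # Row 1 is assumed to hold the headers, so the search starts at row 2.
    for row in range(2, sheet.max_row + 1):
        if sheet.cell(row=row, column=id_col).value == test_id:
            sheet.cell(row=row, column=result_col).value = result
            workbook.save(path)  # overwrite the same file with the change
            return True
    return False
```

Each test script could then call something like update_result("tests.xlsx", "Testid-1", "Pass") after comparing the actual and expected output.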

How to write to an existing excel file without over-writing existing data using pandas

I know similar questions have been posted before, but I haven't found anything that works for this case. I hope you can help.
Here is a summary of the issue:
I'm writing web scraping code using Selenium (for an assignment).
The code uses a for loop to go from one page to another.
The output of the code is a dataframe for each page number, which is exported to Excel (basically a table).
The dataframes from all the web pages should be captured in a single Excel sheet (not multiple sheets within the Excel file).
Each web page has the same data format (i.e. the number of columns and the column headers are the same, but the row values vary).
For info, I'm using pandas as it helps convert the output from the website to Excel.
The problem I'm facing is that when the dataframe is exported to Excel, it overwrites the data from the previous iteration. Hence, when I run the code and scraping is completed, I only get the data from the last for-loop iteration.
Please advise which line(s) of code I need to add so that all the iterations are captured in the Excel sheet; more specifically, each iteration should export its data to Excel starting from the first empty row.
Here is an extract from the code:
for i in range(50, 60):
    url = urlA + str(i)  # this is the URL generator; urlA is the main link excluding pagination
    driver.get(url)
    time.sleep(random.randint(3, 7))
    text = driver.find_element_by_xpath('/html/body/pre').text
    data = pd.DataFrame(eval(text))
    export_excel = data.to_excel(xlpath)
Thanks Dijkgraaf. Your proposal worked.
Here is the full code for others (for future reference).
Apologies for the formatting, I couldn't set it properly. Anyway, I hope the code below is of some use to someone in the future.
import pandas as pd
xlpath = "c:/projects/excelfile.xlsx"
frames = []  # collects one dataframe per page (empty before the for loop starts)
urlA = "www.your website.com"  # placeholder for the main link excluding pagination
for i in range(1, 10):
    url = urlA + str(i)  # this is the URL generator for pagination (to loop through the pages)
    driver.get(url)
    text = driver.find_element_by_xpath('/html/body/pre').text  # gets the text from the site
    data = pd.DataFrame(eval(text))  # evaluates the extracted text and converts it to a pandas dataframe
    frames.append(data)  # keeps this page's dataframe
df = pd.concat(frames)  # consolidates all pages (DataFrame.append was removed in pandas 2.0)
export_excel = df.to_excel(xlpath)  # exports the consolidated dataframe (df) to Excel

How do I read multiple tables on a worksheet from Excel to DataFrame where the tables have non-deterministic locations?

What is the easiest way to read the highlighted table in the screenshot below from Excel into a Pandas DataFrame? Suppose I have thousands of worksheets like this. The area I want to read has "Col4" at the top-left corner and doesn't have an entire blank row or column. "Col4" can appear at any (row, column) on the worksheet.
I suppose I can always go with the brute-force approach, where I read the entire sheet first, find the position of "Col4", and then extract the part I want. But I am wondering if there is an easier way to do it.
Also, I have only worked with Pandas so far. I know there are many other packages besides Pandas such as xlwings or xlrd. If you know any of these packages can be helpful please let me know and it will be very appreciated as well.
Note that this question is not a duplicate of pandas read_excel multiple tables on the same sheet, because the solution in that post only handles the case where the row offset is known beforehand.
The business problem behind this I am solving is to read many spreadsheets created by non-engineer crews (HR, accounting, etc.) in my company, and unfortunately they didn't create the spreadsheets in a consistent and programming-friendly way.
Python is pretty powerful, but I don't think you're going to get the kind of flexibility that you are looking for using it with Excel. Maybe someone will post a solution, but if not, you can use VBA for this task, and when everything is aggregated into one single sheet, use Python to read from that single source.
Sub CopyRangeFromMultipleSheets()
    'Declaring variables
    Dim Source As Worksheet
    Dim Destination As Worksheet
    Dim SourceLastRow As Long, DestLastRow As Long
    Application.ScreenUpdating = False
    'Looping through all sheets to check whether the "Master" sheet exists
    For Each Source In ThisWorkbook.Worksheets
        If Source.Name = "Master" Then
            MsgBox "Master sheet already exists"
            Exit Sub
        End If
    Next
    'Inserting a new sheet after "Sheet1"
    Set Destination = Worksheets.Add(after:=Sheets("Sheet1"))
    Destination.Name = "Master"
    'Looping through all the sheets in the workbook
    For Each Source In ThisWorkbook.Worksheets
        'Preventing consolidation of data from the "Master" sheet itself
        If Source.Name <> "Master" Then
            SourceLastRow = Source.Range("A1").SpecialCells(xlLastCell).Row
            Source.Activate
            If Source.UsedRange.Count > 1 Then
                DestLastRow = Sheets("Master").Range("A1").SpecialCells(xlLastCell).Row
                If DestLastRow = 1 Then
                    'Copying data from the source sheet to the destination sheet
                    Source.Range("D1", Range("T1").SpecialCells(xlLastCell)).Copy Destination.Range("A" & DestLastRow)
                Else
                    Source.Range("D1", Range("T1").SpecialCells(xlCellTypeLastCell)).Copy Destination.Range("A" & (DestLastRow + 1))
                End If
            End If
        End If
    Next
    Destination.Activate
    Application.ScreenUpdating = True
End Sub
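For comparison, the brute-force approach mentioned in the question can be sketched in plain Python on a grid of cell values. This is only an illustration under two assumptions: blank cells are None or empty strings, and the table's header row has no blank cells, so the table's width can be read off the header. The function name extract_table is made up:

```python
def extract_table(grid, anchor):
    """Return the contiguous block of `grid` whose top-left cell equals `anchor`.

    `grid` is a sequence of equal-length rows; blank cells are None or "".
    Returns None if `anchor` does not occur in the grid.
    """
    def blank(value):
        return value is None or value == ""

    # Locate the anchor cell by scanning the whole sheet.
    for top, row in enumerate(grid):
        if anchor in row:
            left = row.index(anchor)
            break
    else:
        return None

    # Extend right along the header row until the first blank cell.
    right = left
    while right < len(grid[top]) and not blank(grid[top][right]):
        right += 1

    # Extend down until a row that is entirely blank within those columns.
    bottom = top
    while bottom < len(grid) and not all(blank(v) for v in grid[bottom][left:right]):
        bottom += 1

    return [r[left:right] for r in grid[top:bottom]]
```

The grid could come from, for example, openpyxl's sheet.iter_rows(values_only=True), which yields None for empty cells, and the returned rows can then be turned into a DataFrame with the first row as the header.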
