Batch inserts into spreadsheet using python-gdata - python

I'm trying to use python-gdata to populate a worksheet in a spreadsheet. The problem is, updating individual cells is woefully slow. (By doing them one at a time, each request takes about 500ms!) Thus, I'm attempting to use the batch mechanism built into gdata to speed things up.
The problem is, I can't seem to insert new cells. I've scoured the web for examples, but I couldn't find any. This is my code, which I've adapted from an example in the documentation. (The documentation does not actually say how to insert cells, but it does show how to update cells. Since this is a new worksheet, it has no cells.)
Furthermore, with debugging enabled I can see that my requests return HTTP 200 OK.
import time
import gdata.spreadsheet
import gdata.spreadsheet.service
import gdata.spreadsheets.data
email = '<snip>'
password = '<snip>'
spreadsheet_key = '<snip>'
worksheet_id = 'od6'
spr_client = gdata.spreadsheet.service.SpreadsheetsService()
spr_client.email = email
spr_client.password = password
spr_client.source = 'Example Spreadsheet Writing Application'
spr_client.ProgrammaticLogin()
# create a cells feed and batch request
cells = spr_client.GetCellsFeed(spreadsheet_key, worksheet_id)
batchRequest = gdata.spreadsheet.SpreadsheetsCellsFeed()
# create a cell entry
cell_entry = gdata.spreadsheet.SpreadsheetsCell()
cell_entry.cell = gdata.spreadsheet.Cell(inputValue="foo", text="bar", row='1', col='1')
# add the cell entry to the batch request
batchRequest.AddInsert(cell_entry)
# submit the batch request
updated = spr_client.ExecuteBatch(batchRequest, cells.GetBatchLink().href)
My hunch is that I'm simply misunderstanding the API, and that this should work with a few changes. Any help is much appreciated.

I recently ran across this as well (when trying to delete) but per the docs here it doesn't appear that batch insert or delete operations are supported:
A number of batch operations can be combined into a single request.
The two types of batch operations supported are query and update.
insert and delete are not supported because the cells feed cannot be
used to insert or delete cells. Remember that the worksheets feed must
be used to do that.
I'm not sure of your use case, but would using the ListFeed help at all? It still won't let you batch operations, so there will be the associated latency, but it may be more tolerable than what you're dealing with now (or were at the time).

As of Google I/O 2016, the latest Google Sheets API (v4) supports batch cell updates (and reads). Be aware, however, that GData is now deprecated, along with most GData-based APIs, so your sample above will not carry over: the new API is not GData-based. Also, putting email addresses and passwords in plain text in code is a security risk, so new(er) Google APIs use OAuth2 for authorization. You need to get the latest Google APIs Client Library for Python. It's as easy as pip install -U google-api-python-client [or pip3 for Python 3].
As far as batch insert goes, here's a simple code sample. Assume you have multiple rows of data in rows. To mass-inject this into a Sheet, say with file ID SHEET_ID & starting at the upper-left in cell A1, you'd make one call like this:
SHEETS.spreadsheets().values().update(spreadsheetId=SHEET_ID, range='A1',
body={'values': rows}, valueInputOption='RAW').execute()
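If the data belongs in several separate ranges, the v4 API also offers values().batchUpdate, which sends them all in one request. The following is only a minimal sketch: it assumes SHEETS is an already-authorized service object as in the call above, and the helper names build_batch_body and write_ranges are made up for illustration.

```python
def build_batch_body(ranges_to_rows, value_input_option='RAW'):
    """Build the request body for spreadsheets.values.batchUpdate.

    ranges_to_rows maps an A1-notation range (e.g. 'A1') to a list of
    rows, where each row is a list of cell values.
    """
    return {
        'valueInputOption': value_input_option,
        'data': [
            {'range': a1_range, 'values': rows}
            for a1_range, rows in ranges_to_rows.items()
        ],
    }


def write_ranges(sheets, sheet_id, ranges_to_rows):
    """Send every range in a single HTTP request.

    `sheets` is assumed to be an authorized Sheets v4 service object.
    """
    body = build_batch_body(ranges_to_rows)
    return sheets.spreadsheets().values().batchUpdate(
        spreadsheetId=sheet_id, body=body).execute()


# Building the body needs no network access, so it can be inspected first.
body = build_batch_body({
    'A1': [['name', 'qty'], ['foo', 1]],
    'D1': [['total'], [1]],
})
```

With a real service object, write_ranges(SHEETS, SHEET_ID, {...}) would perform the write; keeping the body construction separate makes it easy to check before anything is sent.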
If you want a longer example, see the first video below where those rows are read out of a relational database. For those new to this API, here's one code sample from the official docs to help get you kickstarted. For slightly longer, more "real-world" examples, see these videos & blog posts:
Migrating SQL data to a Sheet plus code deep dive post
Formatting text using the Sheets API plus code deep dive post
Generating slides from spreadsheet data plus code deep dive post
The latest Sheets API provides features not available in older releases, namely giving developers programmatic document-oriented access to a Sheet as if you were using the user interface (create frozen rows, perform cell formatting, resizing rows/columns, adding pivot tables, creating charts, etc.)
However, to perform file-level access on Sheets, such as import/export, copy, move, rename, etc., you'd use the Google Drive API. Examples of using the Drive API:
Exporting a Google Sheet as CSV (blogpost)
"Poor man's plain text to PDF" converter (blogpost) (*)
(*) - TL;DR: upload plain text file to Drive, import/convert to Google Docs format, then export that Doc as PDF. Post above uses Drive API v2; this follow-up post describes migrating it to Drive API v3, and here's a developer video combining both "poor man's converter" posts.

Related

Using smartsheet-python-sdk to put data back into Smartsheet

I used the smartsheet-python-sdk (with a unique API key from Smartsheet) to automatically pull data from Smartsheet into my Python script along with other data sources (from Tableau) to create new feature-engineered columns. I now want to put these new columns I've created back into the same Smartsheet file I initially pulled from. Is there an automatic way to put these new columns I created back into the same Smartsheet I initially pulled data from using the smartsheet-python-sdk? Thank you!
Yes, the REST API documentation shows how to do it via the Python API in the examples on the right sidebar.
I find the Python API woefully under-documented, but a) it closely parallels the REST API with a few differences, like parameters being in Pythonic snake_case instead of JS camelCase; and b) the examples are usually enough to get you there.

Database in Excel using win32com or xlrd Or Database in mysql

I have developed a website where the pages are simply html tables. I have also developed a server by expanding on python's SimpleHTTPServer. Now I am developing my database.
Most of the table contents on each page are static and don't need to be touched. However, there is one column per table (i.e. page) that needs to be editable and stored. The values are simply text that the user can enter. The user enters the text via html textareas that are appended to the tables via javascript.
The database is to store key/value pairs where the value is the user entered text (for now at least).
Current situation
Because the original format of my webpages was xlsx files I opted to use an excel workbook as my database that basically just mirrors the displayed web html tables (pages).
I hook up to the excel workbook through win32com. Every time the table (page) loads, javascript iterates through the html textareas and sends an individual request to the server to load in its respective text from the database.
Currently this approach works but is terribly slow. I have tried to optimize everything as much as I can and I believe the speed limitation is a direct consequence of win32com.
Thus, I see four possible ways to go:
1. Replace my current win32com functionality with xlrd
2. Try to load all the html textareas for a table (page) at once through one server call to the database using win32com
3. Switch to something like sql (probably use mysql since it's simple and robust enough for my needs)
4. Use xlrd but make a single call to the server for each table (page) as in (2)
My schedule to build this functionality is around two days.
Does anyone have any thoughts on the tradeoffs in time-spent-coding versus speed of these approaches? If anyone has any better/more streamlined methods in mind please share!
Probably not the answer you were looking for, but your post is very broad, and I've used win32com and Excel a fair bit and don't see those as good tools towards your goal. An easier strategy is this:
for the server, use Flask: it is a Python HTTP server that makes it crazy easy to respond to HTTP requests via Python code and HTML templates. You'll have a fully capable server running in 5 minutes, then you will need a bit of time to create code to get data from your DB and render from templates (which are really easy to use).
for the database, use SQLite (there is far more overhead integrating with MySQL). Because you only have 2 days, there are also lighter options:
you could also use a simple CSV file, since the API (Python has a CSV file read/write module) is much simpler, less ramp up time. One CSV per user, easy to manage. You don't worry about insertion of rows for a user, you just append; and you don't implement remove of rows for a user, you just mark as inactive (a column for active/inactive in your CSV). In processing GET request from client, as you read from the CSV, you can count how many certain rows are inactive, and do a re-write of the CSV, so once in a while the request will be a little slower to respond to client.
even simpler yet you could use in-memory data structure of your choice if you don't need persistence across restarts of the server. If this is for a demo this should be acceptable limitation.
for the client side, use jQuery on top of javascript -- maybe you are doing that already. Makes it super easy to manipulate the DOM and use effects like slide-in/out etc. Get yourself the book "Learning jQuery", you'll be able to make good use of jQuery in just a couple hours.
If you only have two days it might be a little tight, but you will probably need more than 2 days to get around the issues you are facing with your current strategy, and issues you will face imminently.
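As a rough illustration of the SQLite suggestion, here is a minimal sketch of a key/value store that returns all the text for a page in one query, so the page needs a single server call instead of one per textarea. The table name, column names, and function names are all assumptions made for this example, not part of the original setup.

```python
import sqlite3


def open_store(path=':memory:'):
    """Open (or create) the notes table; ':memory:' is for demos only."""
    conn = sqlite3.connect(path)
    conn.execute(
        'CREATE TABLE IF NOT EXISTS notes ('
        ' page TEXT NOT NULL,'
        ' cell_key TEXT NOT NULL,'
        ' cell_text TEXT NOT NULL,'
        ' PRIMARY KEY (page, cell_key))'
    )
    return conn


def save_note(conn, page, cell_key, cell_text):
    """Store the text for one textarea, overwriting any earlier value."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            'INSERT OR REPLACE INTO notes (page, cell_key, cell_text)'
            ' VALUES (?, ?, ?)',
            (page, cell_key, cell_text),
        )


def load_page(conn, page):
    """Return every key/text pair for a page using a single query."""
    rows = conn.execute(
        'SELECT cell_key, cell_text FROM notes WHERE page = ?', (page,))
    return dict(rows.fetchall())


conn = open_store()
save_note(conn, 'inventory', 'row1', 'needs review')
save_note(conn, 'inventory', 'row2', 'ok')
save_note(conn, 'inventory', 'row1', 'reviewed')  # replaces the first value
notes = load_page(conn, 'inventory')
```

A Flask view could return the result of load_page as JSON, and the page's javascript could then fill in all textareas from that one response.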

Writing to a Google Document With Python

I have some data that I want to write to a simple multi-column table in Google Docs. Is this way too cumbersome to even begin attempting? I would just render it in XHTML, but my client has a very specific work flow set up on Google Docs that she doesn't want me to interfere with. The Google Docs API seems more geared toward updating metadata and making simple changes to the file format than it is toward undertaking the task of writing entire documents using only raw data and a few formatting rules. Am I missing something, or are there any other libraries that might be able to achieve this?
From the docs:
What can this API do?
The Google Documents List API allows developers to create, retrieve,
update, and delete Google Docs (including but not limited to text
documents, spreadsheets, presentations, and drawings), files, and
collections.
So maybe you could save your data to an excel worksheet temporarily and then upload and convert this.
Saving to excel can be done with xlwt:
import xlwt
workbook = xlwt.Workbook()
sheet = workbook.add_sheet('Sheet 1')
sheet.write(0,1,'First text')
workbook.save('test.xls')

Importing and processing a spreadsheet on Google App Engine

I have a spreadsheet full of information and one time only I need to put this data into the Datastore by reading the rows and creating model entities out of all of the data. Each row is an entity and each column is a different property.
I am a little confused how exactly I can put the data into a form that is process-able by GAE and then what I should use to process the spreadsheet in python. I can easily move my data, which is currently in Excel, to Google Docs if that makes things easier but I am still not sure what to do from there.
An easy way is to publish the spreadsheet. See this blogpost: http://blog.pamelafox.org/2010/08/importing-data-from-spreadsheets-to-app.html
Another method is to download the spreadsheet as a CSV and upload this CSV with the App Engine bulkloader.
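If the CSV is processed by hand instead of through the bulkloader, the row-to-entity mapping described in the question can be sketched with the standard csv module. This is only an illustration: the function name rows_to_properties and the sample columns are invented, and the final step of constructing and saving a Datastore model is left as a comment because it depends on the model class.

```python
import csv
import io


def rows_to_properties(csv_text):
    """Turn CSV text into one dict of properties per spreadsheet row.

    The header row supplies the property names. Blank cells are dropped
    so the corresponding properties can stay unset.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    entities = []
    for row in reader:
        # Skip blank or missing cells and any overflow columns (name None).
        props = {name: value for name, value in row.items() if name and value}
        if props:  # ignore rows that are completely empty
            entities.append(props)
    return entities


sample = 'name,city,age\nAda,London,36\nGrace,,45\n'
entities = rows_to_properties(sample)
# Each dict could then be passed to a model constructor, for example
# MyModel(**props).put(), where MyModel is a hypothetical Datastore model.
```

All values come back as strings, so numeric or date properties would need converting before the entity is saved.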

Set cell format in Google Sheets spreadsheet using its API & Python

I am using gd_client.UpdateCell to update values in a Google spreadsheet and am looking for a way to update the formatting of the cell, i.e. color, lines, text size, etc.
Can this be done?
Are there examples or other documentation available?
Anything to get me going would be great
(Feb 2017) As of Google I/O 2016, developers can now format cells in Google Sheets using the latest API (v4). Here's a short Python example that bolds the 1st row (assuming the file ID is SHEET_ID and SHEETS is the API service endpoint):
DATA = {'requests': [
    {'repeatCell': {
        'range': {'endRowIndex': 1},
        'cell': {'userEnteredFormat': {'textFormat': {'bold': True}}},
        'fields': 'userEnteredFormat.textFormat.bold',
    }}
]}
SHEETS.spreadsheets().batchUpdate(
    spreadsheetId=SHEET_ID, body=DATA).execute()
I also made a developer video on this subject if that helps (see below). BTW, you're not limited to Python, you can use any language supported by the Google APIs Client Libraries.
The latest Sheets API provides features not available in older releases, namely giving developers programmatic access to a Sheet as if you were using the user interface (frozen rows, cell formatting[!], resizing rows/columns, adding pivot tables, creating charts, etc.). If you're new to the API, I've created a few videos with somewhat more "real-world" examples:
Migrating SQL data to a Sheet plus code deep dive post
Formatting cells using the Sheets API plus code deep dive post
Generating slides from spreadsheet data plus code deep dive post
As you can tell, the Sheets API is primarily for document-oriented functionality as described above, but to perform file-level access such as import/export, copy, move, rename, etc., use the Google Drive API instead.
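Since the question also asks about color and text size, the same repeatCell request can carry those properties. The sketch below only builds the request body; the helper name header_format_request, the grey color, and the font size are arbitrary choices for illustration, and the result would be sent with the same batchUpdate call shown in the bold example.

```python
def header_format_request(sheet_id=0, num_columns=None):
    """Build a repeatCell request that styles the first row.

    sheet_id is the numeric ID of the tab inside the spreadsheet (often 0
    for the first tab), not the spreadsheet's file ID.
    """
    grid_range = {'sheetId': sheet_id, 'startRowIndex': 0, 'endRowIndex': 1}
    if num_columns is not None:
        # Limit the formatting to the first num_columns columns.
        grid_range['startColumnIndex'] = 0
        grid_range['endColumnIndex'] = num_columns
    return {
        'repeatCell': {
            'range': grid_range,
            'cell': {'userEnteredFormat': {
                # Color channels are floats between 0 and 1.
                'backgroundColor': {'red': 0.9, 'green': 0.9, 'blue': 0.9},
                'textFormat': {'bold': True, 'fontSize': 12},
            }},
            # Only the fields listed here are overwritten.
            'fields': ('userEnteredFormat.backgroundColor,'
                       'userEnteredFormat.textFormat.bold,'
                       'userEnteredFormat.textFormat.fontSize'),
        }
    }


DATA = {'requests': [header_format_request(num_columns=3)]}
```

Cell borders are handled by a different request type (updateBorders), so they would go into the same 'requests' list as a separate entry.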
I don't think this can be done currently. The current APIs (both the List and Cell API) allow changing data, but not formatting.
The entire APIs are described here. Nothing about formatting:
http://code.google.com/apis/spreadsheets/data/3.0/reference.html
http://code.google.com/apis/documents/docs/3.0/reference.html
Many people have requested this in the groups but never got an answer from Google:
http://groups.google.com/group/Google-Docs-Data-APIs/browse_thread/thread/14aef72447ba48ce/9c2143fb4c8a3000?lnk=gst&q=color#9c2143fb4c8a3000
http://www.mail-archive.com/google-docs-data-apis@googlegroups.com/msg02569.html
This helped me enormously: https://cloud.google.com/blog/products/application-development/formatting-cells-with-the-google-sheets-api
Google Apps Script looks like it might bring some of this functionality, in the Range class specifically:
https://developers.google.com/apps-script/reference/spreadsheet/range
Importantly, I have not figured out how to bind (and/or execute) a Google Apps Script to the sheet that I'm currently creating and filling using the Google Drive API and Google Sheets API.
I'm not suggesting porting your app to Google Apps Script, but I'm seriously considering it myself at this point. I'm hoping maybe someone else has some thoughts on the last missing piece of hooking the API up with the Google Apps Script.
