I have some data that I want to write to a simple multi-column table in Google Docs. Is this way too cumbersome to even begin attempting? I would just render it in XHTML, but my client has a very specific workflow set up on Google Docs that she doesn't want me to interfere with. The Google Docs API seems more geared toward updating metadata and making simple changes to the file format than toward writing entire documents from only raw data and a few formatting rules. Am I missing something, or are there any other libraries that might be able to achieve this?
From the docs:
What can this API do?
The Google Documents List API allows developers to create, retrieve,
update, and delete Google Docs (including but not limited to text
documents, spreadsheets, presentations, and drawings), files, and
collections.
So maybe you could save your data to an Excel worksheet temporarily, then upload it and convert it.
Saving to Excel can be done with xlwt:
import xlwt
workbook = xlwt.Workbook()
sheet = workbook.add_sheet('Sheet 1')
# write(row, column, value); indices are zero-based, so this fills cell B1
sheet.write(0, 1, 'First text')
workbook.save('test.xls')
I am using the Google Spreadsheets API in Python, and I can't get the column headers for the ListView of a worksheet.
Does anyone know how to get those titles? I can only find the underscore-prefixed hashes.
Take a look at this library I created. It's a simple wrapper around the Spreadsheets API, designed to make cases like the one you are facing much simpler to deal with.
I have a spreadsheet full of information and one time only I need to put this data into the Datastore by reading the rows and creating model entities out of all of the data. Each row is an entity and each column is a different property.
I am a little confused about how exactly I can put the data into a form that GAE can process, and about what I should use to process the spreadsheet in Python. I can easily move my data, which is currently in Excel, to Google Docs if that makes things easier, but I am still not sure what to do from there.
An easy way is to publish the spreadsheet. See this blog post: http://blog.pamelafox.org/2010/08/importing-data-from-spreadsheets-to-app.html
Another method is to download the spreadsheet as a CSV and upload this CSV with the App Engine bulkloader.
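For the "each row is an entity, each column is a property" mapping, here is a minimal sketch using only the standard library csv module. The sample data and the rows_to_entities helper are hypothetical, and the step that actually saves to the Datastore is left as a comment because it depends on your model class.

```python
import csv
import io

# Hypothetical sample: the text of a spreadsheet downloaded as CSV.
CSV_TEXT = """name,age,city
Alice,30,Paris
Bob,25,Berlin
"""

def rows_to_entities(csv_text):
    """Return one dict per data row, keyed by the header row's column names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]

entities = rows_to_entities(CSV_TEXT)
for entity in entities:
    # In an App Engine app, this is where a model instance would be built
    # and saved, e.g. Person(**entity).put() for a hypothetical Person model.
    print(entity)
```

Note that every value comes back as a string, so numeric or date properties would need converting before they are assigned to a model.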
I'm trying to use python-gdata to populate a worksheet in a spreadsheet. The problem is, updating individual cells is woefully slow. (By doing them one at a time, each request takes about 500ms!) Thus, I'm attempting to use the batch mechanism built into gdata to speed things up.
The problem is, I can't seem to insert new cells. I've scoured the web for examples, but I couldn't find any. This is my code, which I've adapted from an example in the documentation. (The documentation does not actually say how to insert cells, but it does show how to update cells. Since this is a new worksheet, it has no cells.)
Furthermore, with debugging enabled I can see that my request returns HTTP 200 OK.
import time
import gdata.spreadsheet
import gdata.spreadsheet.service
import gdata.spreadsheets.data
email = '<snip>'
password = '<snip>'
spreadsheet_key = '<snip>'
worksheet_id = 'od6'
spr_client = gdata.spreadsheet.service.SpreadsheetsService()
spr_client.email = email
spr_client.password = password
spr_client.source = 'Example Spreadsheet Writing Application'
spr_client.ProgrammaticLogin()
# create a cells feed and batch request
cells = spr_client.GetCellsFeed(spreadsheet_key, worksheet_id)
batchRequest = gdata.spreadsheet.SpreadsheetsCellsFeed()
# create a cell entry
cell_entry = gdata.spreadsheet.SpreadsheetsCell()
cell_entry.cell = gdata.spreadsheet.Cell(inputValue="foo", text="bar", row='1', col='1')
# add the cell entry to the batch request
batchRequest.AddInsert(cell_entry)
# submit the batch request
updated = spr_client.ExecuteBatch(batchRequest, cells.GetBatchLink().href)
My hunch is that I'm simply misunderstanding the API, and that this should work with a few changes. Any help is much appreciated.
I recently ran across this as well (when trying to delete) but per the docs here it doesn't appear that batch insert or delete operations are supported:
A number of batch operations can be combined into a single request.
The two types of batch operations supported are query and update.
insert and delete are not supported because the cells feed cannot be
used to insert or delete cells. Remember that the worksheets feed must
be used to do that.
I'm not sure of your use case, but would using the ListFeed help at all? It still won't let you batch operations, so there will be the associated latency, but it may be more tolerable than what you're dealing with now (or were at the time).
As of Google I/O 2016, the latest Google Sheets API supports batch cell updates (and reads). Be aware, however, that GData is now deprecated, along with most GData-based APIs, so your sample above will need to be rewritten: the new API is not GData. Also, putting email addresses and passwords in plain text in code is a security risk, so new(er) Google APIs use OAuth2 for authorization. You need to get the latest Google APIs Client Library for Python. It's as easy as pip install -U google-api-python-client [or pip3 for Python 3].
As far as batch insert goes, here's a simple code sample. Assume you have multiple rows of data in rows. To mass-inject this into a Sheet, say with file ID SHEET_ID & starting at the upper-left in cell A1, you'd make one call like this:
SHEETS.spreadsheets().values().update(spreadsheetId=SHEET_ID, range='A1',
body={'values': rows}, valueInputOption='RAW').execute()
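If the cells you want to write are scattered rather than one contiguous block, the same API also has a values().batchUpdate() method that takes several ranges in one request. The sketch below only builds the request body, so it runs without credentials; to_a1 and build_batch_body are hypothetical helper names, and the commented-out call assumes the same authorized SHEETS service object and SHEET_ID as above.

```python
def to_a1(row, col):
    """Convert 1-based (row, col) indices to A1 notation, e.g. (1, 27) -> 'AA1'."""
    letters = ''
    while col > 0:
        col, rem = divmod(col - 1, 26)
        letters = chr(ord('A') + rem) + letters
    return letters + str(row)

def build_batch_body(cells, value_input_option='RAW'):
    """Build a values().batchUpdate() body from (row, col, value) triples."""
    return {
        'valueInputOption': value_input_option,
        'data': [
            {'range': to_a1(row, col), 'values': [[value]]}
            for row, col, value in cells
        ],
    }

body = build_batch_body([(1, 1, 'foo'), (2, 3, 'bar')])
# With an authorized service object, the request would be sent like this:
# SHEETS.spreadsheets().values().batchUpdate(
#     spreadsheetId=SHEET_ID, body=body).execute()
```

A range without a sheet name, such as 'A1', refers to the first visible sheet; prefix it with the sheet title (for example 'Sheet1!A1') to target another one.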
If you want a longer example, see the first video below where those rows are read out of a relational database. For those new to this API, here's one code sample from the official docs to help get you kickstarted. For slightly longer, more "real-world" examples, see these videos & blog posts:
Migrating SQL data to a Sheet plus code deep dive post
Formatting text using the Sheets API plus code deep dive post
Generating slides from spreadsheet data plus code deep dive post
The latest Sheets API provides features not available in older releases, namely giving developers programmatic, document-oriented access to a Sheet as if they were using the user interface (create frozen rows, format cells, resize rows/columns, add pivot tables, create charts, etc.).
However, to perform file-level access on Sheets, such as import/export, copy, move, rename, etc., you'd use the Google Drive API. Examples of using the Drive API:
Exporting a Google Sheet as CSV (blogpost)
"Poor man's plain text to PDF" converter (blogpost) (*)
(*) - TL;DR: upload plain text file to Drive, import/convert to Google Docs format, then export that Doc as PDF. Post above uses Drive API v2; this follow-up post describes migrating it to Drive API v3, and here's a developer video combining both "poor man's converter" posts.
Generally I work with CSV files but for this project I need to support XLS too. Does anyone have experience reading XLS files on GAE with Python?
Two possible alternatives I am considering:
xlrd
Google Docs API
xlrd saves you the network round-trip implied by the use of Google Docs; if you don't need to keep the document stored (which would be a substantial plus for Google Docs), this might incline you towards xlrd. I believe they're both high-quality.
However, for both speed and accuracy of "translation", there's really no alternative to benchmarking them both on a range of files reflecting your specific needs and interests.
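A benchmark along those lines could look roughly like the sketch below. The two parse functions are hypothetical stand-ins (one simulates a network round-trip with a short sleep); in practice you would replace them with a function that calls xlrd.open_workbook(path) and one that goes through Google Docs, and point sample_files at real files that reflect your workload.

```python
import time

def benchmark(parser, paths, repeats=3):
    """Return the best total wall-clock time (seconds) to parse all paths."""
    best = float('inf')
    for _ in range(repeats):
        start = time.perf_counter()
        for path in paths:
            parser(path)
        best = min(best, time.perf_counter() - start)
    return best

# Hypothetical stand-ins for the two approaches being compared.
def parse_locally(path):
    return len(path)

def parse_remotely(path):
    time.sleep(0.001)  # simulates a network round-trip
    return len(path)

sample_files = ['a.xls', 'b.xls']  # hypothetical file names
results = {
    'local': benchmark(parse_locally, sample_files),
    'remote': benchmark(parse_remotely, sample_files),
}
for name, seconds in sorted(results.items(), key=lambda item: item[1]):
    print('%s: %.4f s' % (name, seconds))
```

Taking the best of several repeats reduces noise from one-off delays; for accuracy of "translation" you would additionally compare the parsed values themselves, which this timing harness does not do.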