Writing to a Google Document With Python - python

I have some data that I want to write to a simple multi-column table in Google Docs. Is this way too cumbersome to even begin attempting? I would just render it in XHTML, but my client has a very specific workflow set up on Google Docs that she doesn't want me to interfere with. The Google Docs API seems more geared toward updating metadata and making simple changes to the file format than toward writing entire documents from raw data and a few formatting rules. Am I missing something, or are there any other libraries that might be able to achieve this?

From the docs:
What can this API do?
The Google Documents List API allows developers to create, retrieve,
update, and delete Google Docs (including but not limited to text
documents, spreadsheets, presentations, and drawings), files, and
collections.
So maybe you could save your data to an Excel worksheet temporarily, then upload it and have it converted.
Saving to Excel can be done with xlwt:
import xlwt
workbook = xlwt.Workbook()
sheet = workbook.add_sheet('Sheet 1')
# write(row, column, value): indices are zero-based, so this fills cell B1
sheet.write(0, 1, 'First text')
workbook.save('test.xls')
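The upload-and-convert step might look roughly like the sketch below. It assumes the Drive API v3 and the google-api-python-client library, neither of which the answer above names; `drive` stands for an already-authorized service object, and the helper names are hypothetical.

```python
# Sketch only: assumes Drive API v3 via google-api-python-client.
XLS_MIME = 'application/vnd.ms-excel'
SHEET_MIME = 'application/vnd.google-apps.spreadsheet'


def conversion_metadata(title):
    """File metadata asking Drive to convert the upload to a Google Sheet."""
    return {'name': title, 'mimeType': SHEET_MIME}


def upload_as_sheet(drive, path, title):
    """Upload the .xls file at `path` and return the new file's ID.

    `drive` is assumed to come from
    googleapiclient.discovery.build('drive', 'v3', ...).
    """
    # Imported here so the metadata helper works without the client library.
    from googleapiclient.http import MediaFileUpload

    media = MediaFileUpload(path, mimetype=XLS_MIME)
    created = drive.files().create(
        body=conversion_metadata(title), media_body=media, fields='id'
    ).execute()
    return created['id']
```

Calling upload_as_sheet(drive, 'test.xls', 'Report') would produce a Google Sheets file rather than a Google Docs text document, so whether this fits the client's workflow depends on how she uses the table.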

Related

Python Excel - OpenPyxl Limitation

I recently started to automate a report at work using Python. Since my data was provided to me in the form of an Excel sheet, I felt the best way to do this was to use an Excel module for Python. My module of choice was openpyxl. It worked great: I've used it to perform calculations and organise my data ready to plot charts. Now here's the problem...
I know that you cannot update existing charts using openpyxl so that option went out the window.
What I then tried to do was link the data in my openpyxl spreadsheet to another spreadsheet containing the charts (which is in turn linked to the Word document where the charts are displayed). After doing this I ran my script and, to my annoyance, the data links between my openpyxl spreadsheet and the charts spreadsheet had been severed. I guess this is because openpyxl creates a new spreadsheet when you save with the save function, so the links are lost.
My question is: are there any ways to maintain the data links?
It is currently not possible to maintain links between files. I think it would be possible to keep them as metadata but, for fairly obvious reasons, it won't necessarily be possible to validate them. The best way for this to happen would be through a pull request.
If you're on Windows you might look at using the Python for Windows extensions (pywin32), which will allow you to remote-control the applications.

Importing and processing a spreadsheet on Google App Engine

I have a spreadsheet full of information and one time only I need to put this data into the Datastore by reading the rows and creating model entities out of all of the data. Each row is an entity and each column is a different property.
I am a little confused about how exactly I can put the data into a form that GAE can process, and what I should then use to process the spreadsheet in Python. I can easily move my data, which is currently in Excel, to Google Docs if that makes things easier, but I am still not sure what to do from there.
An easy way is to publish the spreadsheet. See this blogpost: http://blog.pamelafox.org/2010/08/importing-data-from-spreadsheets-to-app.html
Another method is to download the spreadsheet as a CSV and upload this CSV with the App Engine bulkloader.
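Whichever route delivers the CSV, the "each row is an entity, each column is a property" mapping can be sketched with the standard library alone. The function name and sample data below are made up for illustration, and the actual Datastore write is left out because it depends on your model classes.

```python
import csv
import io


def rows_to_entities(csv_text):
    """Turn CSV text into one dict per row, keyed by the header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


# Hypothetical sample data standing in for the exported spreadsheet.
sample = "name,age\nAda,36\nGrace,45\n"
entities = rows_to_entities(sample)
```

Every value comes back as a string, so numeric or date properties would need converting before each dict is handed to a model constructor.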

Batch inserts into spreadsheet using python-gdata

I'm trying to use python-gdata to populate a worksheet in a spreadsheet. The problem is, updating individual cells is woefully slow. (By doing them one at a time, each request takes about 500ms!) Thus, I'm attempting to use the batch mechanism built into gdata to speed things up.
The problem is, I can't seem to insert new cells. I've scoured the web for examples, but I couldn't find any. This is my code, which I've adapted from an example in the documentation. (The documentation does not actually say how to insert cells, but it does show how to update cells. Since this is a new worksheet, it has no cells.)
Furthermore, with debugging enabled I can see that my requests return HTTP 200 OK.
import time
import gdata.spreadsheet
import gdata.spreadsheet.service
import gdata.spreadsheets.data
email = '<snip>'
password = '<snip>'
spreadsheet_key = '<snip>'
worksheet_id = 'od6'
spr_client = gdata.spreadsheet.service.SpreadsheetsService()
spr_client.email = email
spr_client.password = password
spr_client.source = 'Example Spreadsheet Writing Application'
spr_client.ProgrammaticLogin()
# create a cells feed and batch request
cells = spr_client.GetCellsFeed(spreadsheet_key, worksheet_id)
batchRequest = gdata.spreadsheet.SpreadsheetsCellsFeed()
# create a cell entry
cell_entry = gdata.spreadsheet.SpreadsheetsCell()
cell_entry.cell = gdata.spreadsheet.Cell(inputValue="foo", text="bar", row='1', col='1')
# add the cell entry to the batch request
batchRequest.AddInsert(cell_entry)
# submit the batch request
updated = spr_client.ExecuteBatch(batchRequest, cells.GetBatchLink().href)
My hunch is that I'm simply misunderstanding the API, and that this should work with changes. Any help is much appreciated.
I recently ran across this as well (when trying to delete) but per the docs here it doesn't appear that batch insert or delete operations are supported:
A number of batch operations can be combined into a single request.
The two types of batch operations supported are query and update.
insert and delete are not supported because the cells feed cannot be
used to insert or delete cells. Remember that the worksheets feed must
be used to do that.
I'm not sure of your use case, but would using the ListFeed help at all? It still won't let you batch operations, so there will be the associated latency, but it may be more tolerable than what you're dealing with now (or were at the time).
As of Google I/O 2016, the latest Google Sheets API supports batch cell updates (and reads). Be aware, however, that GData is now deprecated, along with most GData-based APIs, including the one your sample above uses; the new API is not GData-based. Also, putting email addresses and passwords in plain text in code is a security risk, so newer Google APIs use OAuth2 for authorization. You need to get the latest Google APIs Client Library for Python. It's as easy as pip install -U google-api-python-client (or pip3 for Python 3).
As far as batch insert goes, here's a simple code sample. Assume you have multiple rows of data in rows. To mass-inject this into a Sheet, say with file ID SHEET_ID & starting at the upper-left in cell A1, you'd make one call like this:
SHEETS.spreadsheets().values().update(
    spreadsheetId=SHEET_ID, range='A1',
    body={'values': rows}, valueInputOption='RAW').execute()
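If the data belongs in several separate ranges, the same API also offers values().batchUpdate(), which takes a list of range/values pairs in one request. The sketch below only builds the request body; the helper name, sheet name, and sample rows are illustrative assumptions, and the commented-out call presumes the same authorized SHEETS service and SHEET_ID as above.

```python
def batch_values_body(blocks, input_option='RAW'):
    """Build the request body for spreadsheets().values().batchUpdate().

    `blocks` maps an A1-notation range to a list of rows.
    """
    return {
        'valueInputOption': input_option,
        'data': [
            {'range': a1_range, 'values': rows}
            for a1_range, rows in blocks.items()
        ],
    }


# Hypothetical data: two blocks written to different parts of one sheet.
body = batch_values_body({
    'Sheet1!A1': [['Name', 'Qty'], ['Widget', 3]],
    'Sheet1!D1': [['Updated'], ['2017-02-01']],
})
# With an authorized service object:
# SHEETS.spreadsheets().values().batchUpdate(
#     spreadsheetId=SHEET_ID, body=body).execute()
```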
If you want a longer example, see the first video below where those rows are read out of a relational database. For those new to this API, here's one code sample from the official docs to help get you kickstarted. For slightly longer, more "real-world" examples, see these videos & blog posts:
Migrating SQL data to a Sheet plus code deep dive post
Formatting text using the Sheets API plus code deep dive post
Generating slides from spreadsheet data plus code deep dive post
The latest Sheets API provides features not available in older releases, namely giving developers programmatic document-oriented access to a Sheet as if you were using the user interface (create frozen rows, perform cell formatting, resizing rows/columns, adding pivot tables, creating charts, etc.)
However, to perform file-level access on Sheets, such as import/export, copy, move, rename, etc., you'd use the Google Drive API. Examples of using the Drive API:
Exporting a Google Sheet as CSV (blogpost)
"Poor man's plain text to PDF" converter (blogpost) (*)
(*) - TL;DR: upload plain text file to Drive, import/convert to Google Docs format, then export that Doc as PDF. Post above uses Drive API v2; this follow-up post describes migrating it to Drive API v3, and here's a developer video combining both "poor man's converter" posts.
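As a rough illustration of the file-level side, exporting a Sheet as CSV through the Drive API v3 can be wrapped as below. `drive` is assumed to be an authorized Drive service object built with google-api-python-client, and the function name is made up for this sketch.

```python
def export_sheet_as_csv(drive, file_id):
    """Return a Google Sheets file exported as CSV content.

    `drive` is assumed to be an authorized Drive API v3 service object.
    """
    return drive.files().export(
        fileId=file_id, mimeType='text/csv').execute()
```

The result can be written straight to disk in binary mode. As far as I know, a CSV export covers only the first sheet of the file, so multi-sheet files would need a different approach.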

Set cell format in Google Sheets spreadsheet using its API & Python

I am using gd_client.UpdateCell to update values in a google spreadsheet and am looking for a way to update the formatting of the cell i.e. color, lines, textsize.....
Can this be done?
Are there examples or other documentation available?
Anything to get me going would be great
(Feb 2017) As of Google I/O 2016, developers can now format cells in Google Sheets using the latest API (v4). Here's a short Python example that bolds the 1st row (assuming the file ID is SHEET_ID and SHEETS is the API service endpoint):
DATA = {'requests': [
    {'repeatCell': {
        'range': {'endRowIndex': 1},
        'cell': {'userEnteredFormat': {'textFormat': {'bold': True}}},
        'fields': 'userEnteredFormat.textFormat.bold',
    }}
]}
SHEETS.spreadsheets().batchUpdate(
    spreadsheetId=SHEET_ID, body=DATA).execute()
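The same repeatCell request can carry other formatting, such as a background colour, as long as the fields mask lists each property being set. The helper below is a hypothetical sketch that only builds the request dictionary; the sheet ID of 0 and the grey colour are assumptions for illustration.

```python
def header_format_request(sheet_id=0, rows=1, bold=True, background=None):
    """Build a repeatCell request that formats the first `rows` rows.

    `background` is an optional (red, green, blue) tuple of floats in 0..1.
    """
    cell_format = {'textFormat': {'bold': bold}}
    fields = ['userEnteredFormat.textFormat.bold']
    if background is not None:
        red, green, blue = background
        cell_format['backgroundColor'] = {
            'red': red, 'green': green, 'blue': blue}
        fields.append('userEnteredFormat.backgroundColor')
    return {'repeatCell': {
        'range': {'sheetId': sheet_id, 'startRowIndex': 0,
                  'endRowIndex': rows},
        'cell': {'userEnteredFormat': cell_format},
        'fields': ','.join(fields),
    }}


# Bold header row on a light grey background.
DATA = {'requests': [header_format_request(background=(0.9, 0.9, 0.9))]}
```

The resulting DATA would be passed to spreadsheets().batchUpdate() in the same way as the bold-only request.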
I also made a developer video on this subject if that helps (see below). BTW, you're not limited to Python, you can use any language supported by the Google APIs Client Libraries.
The latest Sheets API provides features not available in older releases, namely giving developers programmatic access to a Sheet as if you were using the user interface (frozen rows, cell formatting[!], resizing rows/columns, adding pivot tables, creating charts, etc.). If you're new to the API, I've created a few videos with somewhat more "real-world" examples:
Migrating SQL data to a Sheet plus code deep dive post
Formatting cells using the Sheets API plus code deep dive post
Generating slides from spreadsheet data plus code deep dive post
As you can tell, the Sheets API is primarily for document-oriented functionality as described above, but to perform file-level access such as import/export, copy, move, rename, etc., use the Google Drive API instead.
I don't think this can be done currently. The current APIs (both the List and Cell API) allow changing data, but not formatting.
The entire APIs are described here. Nothing about formatting:
http://code.google.com/apis/spreadsheets/data/3.0/reference.html
http://code.google.com/apis/documents/docs/3.0/reference.html
Many people have requested this in the groups but never got an answer from Google:
http://groups.google.com/group/Google-Docs-Data-APIs/browse_thread/thread/14aef72447ba48ce/9c2143fb4c8a3000?lnk=gst&q=color#9c2143fb4c8a3000
http://www.mail-archive.com/google-docs-data-apis@googlegroups.com/msg02569.html
This helped me enormously: https://cloud.google.com/blog/products/application-development/formatting-cells-with-the-google-sheets-api
Google Apps Script looks like it might bring some of this functionality, in the Range class specifically:
https://developers.google.com/apps-script/reference/spreadsheet/range
Importantly, I have not figured out how to bind (and/or execute) a Google Apps Script to the sheet that I'm currently creating and filling using the Google Drive API and Google Sheets API.
I'm not suggesting porting your app to Google Apps Script, but I'm seriously considering it myself at this point. I'm hoping maybe someone else has some thoughts on the last missing piece of hooking the API up with the Google Apps Script.

how to read an excel file on google app engine

Generally I work with CSV files but for this project I need to support XLS too. Does anyone have experience reading XLS files on GAE with Python?
Two possible alternatives I am considering:
xlrd
Google Docs API
xlrd saves you the network round-trip implied by the use of Google Docs; if you don't need to keep the document stored (which would be a substantial plus for Google Docs), this might incline you towards xlrd. I believe they're both high-quality.
However, for both speed and accuracy of "translation", there's really no alternative to benchmarking them both on a range of files reflecting your specific needs and interests.
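A benchmark along those lines needs very little scaffolding. The sketch below times any set of reader callables over the same files; the two readers shown are stand-ins invented for illustration and would be replaced by real functions based on xlrd and on the Google Docs API.

```python
import time


def benchmark(readers, paths, repeats=3):
    """Return the best wall-clock time per reader over all `paths`.

    `readers` maps a label to a callable that takes a file path.
    """
    results = {}
    for label, read in readers.items():
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            for path in paths:
                read(path)
            timings.append(time.perf_counter() - start)
        results[label] = min(timings)
    return results


# Stand-in readers: replace with real xlrd and Google Docs based functions.
def read_locally(path):
    return len(path)


def read_via_network(path):
    time.sleep(0.001)  # pretend network round-trip
    return len(path)


timings = benchmark(
    {'local': read_locally, 'network': read_via_network},
    ['a.xls', 'b.xls'],
)
```

Timing alone says nothing about translation accuracy, so the parsed output of both readers would still need comparing on files that reflect your real data.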
