Access "Shared with me folder" in Google Drive via Python - python

This is the problem I'm facing; I'll be as clear as possible.
My clients send daily reports of their sales via an Excel file on Google Drive. I am only a viewer on this file, and it sits in my "Shared with me" section of Google Drive. Ideally, I would like to load this file into Python so I can post-process and analyze it. A few notes:
I have seen solutions here on Stack Overflow that suggest adding a shortcut to My Drive in order to import it, but I have two problems with that:
After adding it, I see a .gsheet file that I do not know how to load in Python.
I am not sure that the file would be dynamically updated on a daily basis.
I cannot use gdown since I am only a viewer on that file, unfortunately!
Let me know if you have other ideas/approaches. Thanks!

You can list all the files shared with you using the Drive API.
We will need to use the following method:
Files.list (Drive API, https://developers.google.com/drive/api/v3/reference/files/list) to list all the files you have access to.
You can use the API explorer available on most documentation pages, and once you have a better grasp of the API's behaviour, start experimenting with this code sample: https://developers.google.com/drive/api/quickstart/python. This quickstart produces a simple file listing with Python.
I recommend you use the following flow:
Call the Files.list method with the following parameters:
{
  "q": "not 'me' in owners",
  "fields": "nextPageToken, files(id, name, mimeType, size, owners)"
}
This will return only the files that are shared with you (files you can access but do not own). You will not handle a .gsheet file as a regular file, because it is not one; instead, use the Google Sheets API (https://developers.google.com/sheets/api/reference/rest) to fetch the data inside the Google Sheet. The same is true for Google Docs and Google Slides: each has its own API you can use to access or manipulate the data in those files.
If you look closely at the parameters we are using: q filters the results so that you only list files you don't own but can access (you can also filter for files owned by a particular email address). The fields parameter makes the response much shorter; since you won't use every property of a file, it yields a simplified response that takes less time for the server to process and less bandwidth. Adjust fields if you need more or less data.
Finally, pay attention to the nextPageToken property in the fields parameter. The API response is paginated, meaning you will receive up to a certain number of files per response. To retrieve the next page of results, make the same call again, passing the nextPageToken from the previous response as the pageToken parameter of the new request. This is explained in this documentation article: https://developers.google.com/calendar/api/guides/pagination.
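To tie this together, here is a minimal sketch of that flow (list plus pagination). It assumes you already have an authorized creds object from the quickstart linked above and the google-api-python-client package installed.
from googleapiclient.discovery import build

# `creds` is assumed to come from the Drive API Python quickstart
# (https://developers.google.com/drive/api/quickstart/python).
drive = build("drive", "v3", credentials=creds)

shared_files = []
page_token = None
while True:
    response = drive.files().list(
        q="not 'me' in owners",
        fields="nextPageToken, files(id, name, mimeType, size, owners)",
        pageToken=page_token,
    ).execute()
    shared_files.extend(response.get("files", []))
    page_token = response.get("nextPageToken")
    if page_token is None:
        break

for f in shared_files:
    print(f["id"], f["mimeType"], f["name"])
Once you have located the spreadsheet (its mimeType will be application/vnd.google-apps.spreadsheet), its id is what you would pass to the Sheets API.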
Note: If you need clarification on how to execute certain actions on a Google Sheet file, I recommend you submit a new question, since additional tasks with other APIs are outside the scope of this question and would make this answer much longer than it needs to be.

Related

Using temporary files and folders in Web2py app

I am relatively new to web development and very new to using Web2py. The application I am currently working on is intended to take in a CSV upload from a user, then generate a PDF file based on the contents of the CSV, then allow the user to download that PDF. As part of this process I need to generate and access several intermediate files that are specific to each individual user (these files would be images, other pdfs, and some text files). I don't need to store these files in a database since they can be deleted after the session ends, but I am not sure the best way or place to store these files and keep them separate based on each session. I thought that maybe the subfolders in the sessions folder would make sense, but I do not know how to dynamically get the path to the correct folder for the current session. Any suggestions pointing me in the right direction are appreciated!
I was getting the error "TypeError: expected string or Unicode object, NoneType found", and I ended up storing in the session only a link to the uploaded document in the db (or, in your case, perhaps in the upload folder). I would store the upload so processing can proceed normally, and then clear out the values and the file if it is not 'approved'.
If the information is not confidential, in similar circumstances I simply write the temporary files under /tmp.
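If it helps, here is a rough sketch of the per-session idea in plain Python (it is not web2py-specific; the response.session_id attribute and the 'myapp_' prefix are assumptions for illustration):
import os
import shutil
import tempfile

def session_tempdir(session_id, base=None):
    # Create (or reuse) a scratch directory tied to this session id.
    base = base or tempfile.gettempdir()
    path = os.path.join(base, 'myapp_%s' % session_id)
    os.makedirs(path, exist_ok=True)
    return path

def cleanup_session_tempdir(session_id, base=None):
    # Remove the session's scratch directory once the files are no longer needed.
    shutil.rmtree(session_tempdir(session_id, base), ignore_errors=True)

# In a web2py controller this might look like (response.session_id is an assumption):
# workdir = session_tempdir(response.session_id)
# pdf_path = os.path.join(workdir, 'report.pdf')
The tempfile module can also hand you unique directories via tempfile.mkdtemp() if you prefer to store the generated path in the session instead of deriving it from the session id.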

How to add a list of hyperlinks in a CSV field to a cell in Google Sheets?

I'm generating a CSV file (with thousands of rows) on my local PC. From a Google account, I would like to do a manual Google Sheets > Import to upload the file for my book club group. The data is collected from HTML tables on multiple pages, if that matters.
One of the fields, named "shelves", is essentially tags; it contains a list of (name, url) tuples. I'd like to modify my Python program to make a list along the lines of
[=HYPERLINK(url, name), =HYPERLINK(url, name), ..., =HYPERLINK(url, name)]
but I can't find any syntax clues. I also tried
['=HYPERLINK("url", "name"), =HYPERLINK("url", "name")', '=HYPERLINK("url", "name"), =HYPERLINK("url", "name")', ...]
Can something like this work when importing a CSV file into Google Sheets, or not, as of Aug 2022?
Here's a sample CSV row:
,title,title_url,author,author_url,shelves,date_started,date_finished,member_name,member_url,date_added,group_activity,group_book_id_url'
'29,"Luck in the Shadows (Nightrunner, #1)",http://goodreads.com/book/show/74270.Luck_in_the_Shadows,"Flewelling, Lynn",http://goodreads.com/author/show/42110.Lynn_Flewelling,"[('http://goodreads.com/group/bookshelf/group?shelf=read', 'read'), ('http://goodreads.com/group/bookshelf/group?shelf=1-book-of-the-month', '1-book-of-the-month'), ('http://goodreads.com/group/bookshelf/group?shelf=char-royalty-nobility', 'char-royalty-nobi...'), ('http://goodreads.com/group/bookshelf/group?shelf=genre-action-adventure', 'genre-action-adve...'), ('http://goodreads.com/group/bookshelf/group?shelf=genre-epic', 'genre-epic'), ('http://goodreads.com/group/bookshelf/group?shelf=genre-fantasy', 'genre-fantasy'), ('http://goodreads.com/group/bookshelf/group?shelf=genre-romance', 'genre-romance'), ('http://goodreads.com/group/bookshelf/group?shelf=profession-mage-witch-wizard', 'profession-mage-w...'), ('http://goodreads.com/group/bookshelf/group?shelf=theme-cross-dressing', 'theme-cross-dressing'), ('http://goodreads.com/group/bookshelf/group?shelf=theme-nautical', 'theme-nautical'), ('http://goodreads.com/group/bookshelf/group?shelf=theme-on-the-run', 'theme-on-the-run'), ('http://goodreads.com/group/bookshelf/group?shelf=time-historical', 'time-historical')]",1/1/2021,1/31/2021,Marianne ,http://goodreads.com/user/show/marianne,"group activity for 536628',http://goodreads.com/group/show_book/group?group_book_id=536628
So shelves is the field I'm working on. As you can see, it is a long list (edited for brevity):
[('http://goodreads.com/group/bookshelf/group?shelf=read', 'read'), ('http://goodreads.com/group/bookshelf/group?shelf=genre-action-adventure', 'genre-action-adve...'), ('http://goodreads.com/group/bookshelf/group?shelf=genre-epic', 'genre-epic'), ('http://goodreads.com/group/bookshelf/group?shelf=genre-fantasy', 'genre-fantasy'), ('http://goodreads.com/group/bookshelf/group?shelf=genre-romance', 'genre-romance'), ('http://goodreads.com/group/bookshelf/group?shelf=profession-mage-witch-wizard', 'profession-mage-w...'), ('http://goodreads.com/group/bookshelf/group?shelf=theme-on-the-run', 'theme-on-the-run'), ('http://goodreads.com/group/bookshelf/group?shelf=time-historical', 'time-historical')]
I would like to have a CSV-type file that can be manually imported into Google Sheets, with a single cell containing the shelves list in the following fashion:
[=HYPERLINK('http://goodreads.com/group/bookshelf/group?shelf=read', 'read'), =HYPERLINK('http://goodreads.com/group/bookshelf/group?shelf=genre-action-adventure', 'genre-action-adve...')]
So that when it is uploaded to Google, it displays similarly to an HTML table cell.
Before I go through a ton of iterations of that, I wanted to see if it would even work. All the research I've done has turned up mostly 2020-era information saying this is only possible in the Google Apps Script environment, or by writing a function in the spreadsheet. I did sign up and try the Google Apps Script environment, but got stuck setting up credentials.
If not, is there a best approach to somehow accomplish this?
If it is possible, I could use some help on the syntax. Thank you!
I believe your goal is as follows.
You want to upload CSV data from your local PC to your Google Drive.
Here, from your question, the data to upload is as follows.
[('http://goodreads.com/group/bookshelf/group?shelf=read', 'read'), ('http://goodreads.com/group/bookshelf/group?shelf=genre-action-adventure', 'genre-action-adve...'), ('http://goodreads.com/group/bookshelf/group?shelf=genre-epic', 'genre-epic'), ('http://goodreads.com/group/bookshelf/group?shelf=genre-fantasy', 'genre-fantasy'), ('http://goodreads.com/group/bookshelf/group?shelf=genre-romance', 'genre-romance'), ('http://goodreads.com/group/bookshelf/group?shelf=profession-mage-witch-wizard', 'profession-mage-w...'), ('http://goodreads.com/group/bookshelf/group?shelf=theme-on-the-run', 'theme-on-the-run'), ('http://goodreads.com/group/bookshelf/group?shelf=time-historical', 'time-historical')]
You want to create a new Spreadsheet and put formulas like =HYPERLINK("url", "name") in column "A" of the new Spreadsheet.
You want to achieve this using a Python script. And, when the data is uploaded, you don't want to have to authorize the scopes.
Unfortunately, when data is uploaded to Google Drive, authorization is required. So, in this answer, as a workaround to achieve your goal, I used Web Apps as a wrapper API. When Web Apps is used, the authorization is done when the Web App is deployed. This way, when the script accesses the Web App, the data can be uploaded without additional authorization. In this case, how about the following method?
Usage:
1. Create a Google Apps Script project.
In order to use Web Apps, please create a new Google Apps Script project. You can do this by accessing https://script.google.com/home and creating a new project.
2. Sample script.
Please copy and paste the following script to the script editor of the created Google Apps Script project.
function doPost(e) {
  // Parse the JSON array of [url, name] pairs sent in the POST body.
  const data = JSON.parse(e.postData.contents);
  // Create a new Spreadsheet and write one =HYPERLINK() formula per row in column A.
  const newSS = SpreadsheetApp.create("sample");
  const formulas = data.map(([a, b]) => [`=HYPERLINK("${a}", "${b}")`]);
  newSS.getSheets()[0].getRange(1, 1, formulas.length, 1).setFormulas(formulas);
  // Return the URL of the created Spreadsheet as the response body.
  return ContentService.createTextOutput(newSS.getUrl());
}
In this case, as a sample, a new Spreadsheet is created in the root folder.
If you want to achieve your goal without using =HYPERLINK(), please modify as follows.
From
const formulas = data.map(([a, b]) => [`=HYPERLINK("${a}", "${b}")`]);
newSS.getSheets()[0].getRange(1, 1, formulas.length, 1).setFormulas(formulas);
To
const rValues = data.map(([a, b]) => [SpreadsheetApp.newRichTextValue().setText(b).setLinkUrl(a).build()]);
newSS.getSheets()[0].getRange(1, 1, rValues.length).setRichTextValues(rValues);
3. Deploy Web Apps.
The detailed information can be seen at the official document.
Please set this using the new IDE of the script editor.
On the script editor, at the top right, please click "Deploy" -> "New deployment".
Please click "Select type" -> "Web App".
Please input the information about the Web App in the fields under "Deployment configuration".
Please select "Me" for "Execute as".
Please select "Anyone" for "Who has access".
Please click "Deploy" button.
Copy the URL of the Web App. It's like https://script.google.com/macros/s/###/exec.
4. Testing:
In order to test the above Web App, a Python script using the data from your question is used. The sample script is as follows. Before you use it, please set url.
import json
import requests
url = "https://script.google.com/macros/s/###/exec" # Please replace this with your Web Apps URL.
# This data is from your question.
data = [('http://goodreads.com/group/bookshelf/group?shelf=read', 'read'), ('http://goodreads.com/group/bookshelf/group?shelf=genre-action-adventure', 'genre-action-adve...'), ('http://goodreads.com/group/bookshelf/group?shelf=genre-epic', 'genre-epic'), ('http://goodreads.com/group/bookshelf/group?shelf=genre-fantasy', 'genre-fantasy'), ('http://goodreads.com/group/bookshelf/group?shelf=genre-romance', 'genre-romance'), ('http://goodreads.com/group/bookshelf/group?shelf=profession-mage-witch-wizard', 'profession-mage-w...'), ('http://goodreads.com/group/bookshelf/group?shelf=theme-on-the-run', 'theme-on-the-run'), ('http://goodreads.com/group/bookshelf/group?shelf=time-historical', 'time-historical')]
res = requests.post(url, json.dumps(data))
print(res.text)
From your question, I understood that you already have this data in your script, so I used it.
When this script is run, your data is sent to the Web App without authorization, because authorization was already done when the Web App was deployed. The uploaded data is parsed and set as formula values, and the URL of the newly created Spreadsheet is returned.
Note:
When you modify the Google Apps Script, please deploy it as a new version. This way, the modified script is reflected in the Web App. Please be careful about this.
You can see the details of this in the report "Redeploying Web Apps without Changing URL of Web Apps for new IDE".
References:
Web Apps
Taking advantage of Web Apps with Google Apps Script
Google Sheets' Import function cannot automatically translate a field containing a list of URLs into a single cell containing a list of hyperlinks, regardless of the URL format used in the CSV field.
The user must write a Google Sheets extension using Extensions > Apps Script in the relevant Sheet. Thanks to Tanaike's answer, which supplies comprehensive directions.
The user can then copy that script to their other Google Sheets that import similarly formatted files.

How to prevent clutter and save api calls in a different file? (python)

For a web app project I'm making multiple API calls. Each is a long link, so my file is looking pretty cluttered.
I tried making a separate file in which to store my API calls, but the problem I then encountered was that the API calls get made when the file is loaded. This is obviously something I don't want.
What I think will work is making a function for each API call and storing these functions in the other file, but to me this seems a bit overkill and not very elegant.
Is there some better way I don't know about?
Edit: I can't just save the string, because the API call is id-dependent.
Sample API call:
video = requests.get(f'https://api.themoviedb.org/3/movie/{id}/videos?api_key=e7319c984bf89b3efa98e5a0106691b7&language=en-US')
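One common pattern, shown as a sketch below, is to keep a thin wrapper function per endpoint in a separate module, so nothing is requested at import time; the module name tmdb_api.py, the function names, and the placeholder key are all hypothetical:
# tmdb_api.py -- hypothetical helper module; nothing here runs at import time
import requests

API_KEY = "YOUR_API_KEY"  # placeholder
BASE_URL = "https://api.themoviedb.org/3"

def get_videos(movie_id):
    # The request is only made when the function is called, not when the module is imported.
    return requests.get(
        f"{BASE_URL}/movie/{movie_id}/videos",
        params={"api_key": API_KEY, "language": "en-US"},
    )

def get_movie(movie_id):
    return requests.get(
        f"{BASE_URL}/movie/{movie_id}",
        params={"api_key": API_KEY, "language": "en-US"},
    )

# In the web app:
# from tmdb_api import get_videos
# video = get_videos(movie_id)
This keeps the long URLs out of your main file while still letting each call take the id it depends on.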

Automatically generated sitemap on Google App Engine

Okay, so I know there are already some questions about this topic, but I don't find any of them specific enough. I want a script on my site that will automatically generate my sitemap.xml (or save it somewhere). I know how to upload files, and I have my site set up at http://sean-behan.appspot.com with Python 2.7. How do I set up a script that will generate the sitemap? If possible, please reference the code. Just ask if you need more info. :) Thanks.
You can have outside services automatically generate them for you by traversing your site.
One such service is at http://www.xml-sitemaps.com/details-sean-behan.appspot.com.html
Alternatively, you can serve your own XML file based on the URLs you want to appear in your sitemap. In that case, see Tim Hoffman's answer.
I can't point you to code, as I don't know how your site is structured, what templating environment you use, whether your site includes static pages, etc.
The basics are: if you have code that can pull together a list of dictionaries containing the metadata about each page you want in your sitemap, then you are halfway there.
Then use a templating language (or straight Python) to generate an XML file per the sitemaps.org spec.
Now you have two choices: dynamically serve this output as requested, or store it (in the datastore if it is less than 1 MB when compressed, or in Google Cloud Storage) and serve its contents when /sitemap.xml is requested. You would then set up a cron task to regenerate your cached sitemap once a day (or whatever frequency is appropriate).
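As a rough sketch of that idea in plain Python (no templating engine; the page dictionaries and the handler wiring are placeholders you would supply from your own app):
from xml.sax.saxutils import escape

def build_sitemap(pages):
    # pages: list of dicts such as
    # {"loc": "https://example.com/about", "lastmod": a datetime.date, "changefreq": "daily"}
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
    for page in pages:
        lines.append('  <url>')
        lines.append('    <loc>%s</loc>' % escape(page["loc"]))
        if "lastmod" in page:
            lines.append('    <lastmod>%s</lastmod>' % page["lastmod"].isoformat())
        if "changefreq" in page:
            lines.append('    <changefreq>%s</changefreq>' % page["changefreq"])
        lines.append('  </url>')
    lines.append('</urlset>')
    return "\n".join(lines)

# A /sitemap.xml handler would return build_sitemap(pages) with
# Content-Type: application/xml, or a cron job could write it to Cloud Storage.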

Batch inserts into spreadsheet using python-gdata

I'm trying to use python-gdata to populate a worksheet in a spreadsheet. The problem is, updating individual cells is woefully slow. (By doing them one at a time, each request takes about 500ms!) Thus, I'm attempting to use the batch mechanism built into gdata to speed things up.
The problem is, I can't seem to insert new cells. I've scoured the web for examples, but I couldn't find any. This is my code, which I've adapted from an example in the documentation. (The documentation does not actually say how to insert cells, but it does show how to update cells. Since this is a new worksheet, it has no cells.)
Furthermore, with debugging enabled, I can see that my requests return HTTP 200 OK.
import time
import gdata.spreadsheet
import gdata.spreadsheet.service
import gdata.spreadsheets.data
email = '<snip>'
password = '<snip>'
spreadsheet_key = '<snip>'
worksheet_id = 'od6'
spr_client = gdata.spreadsheet.service.SpreadsheetsService()
spr_client.email = email
spr_client.password = password
spr_client.source = 'Example Spreadsheet Writing Application'
spr_client.ProgrammaticLogin()
# create a cells feed and batch request
cells = spr_client.GetCellsFeed(spreadsheet_key, worksheet_id)
batchRequest = gdata.spreadsheet.SpreadsheetsCellsFeed()
# create a cell entry
cell_entry = gdata.spreadsheet.SpreadsheetsCell()
cell_entry.cell = gdata.spreadsheet.Cell(inputValue="foo", text="bar", row='1', col='1')
# add the cell entry to the batch request
batchRequest.AddInsert(cell_entry)
# submit the batch request
updated = spr_client.ExecuteBatch(batchRequest, cells.GetBatchLink().href)
My hunch is that I'm simply misunderstanding the API, and that this should work with changes. Any help is much appreciated.
I recently ran across this as well (when trying to delete) but per the docs here it doesn't appear that batch insert or delete operations are supported:
A number of batch operations can be combined into a single request. The two types of batch operations supported are query and update. Insert and delete are not supported because the cells feed cannot be used to insert or delete cells. Remember that the worksheets feed must be used to do that.
I'm not sure of your use case, but would using the ListFeed help at all? It still won't let you batch operations, so there will be the associated latency, but it may be more tolerable than what you're dealing with now (or were at the time).
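For what it's worth, a list-feed version with the same (long-deprecated) gdata library would look roughly like the sketch below; if I recall the old API correctly, the dictionary keys must match your header row (lowercased, with spaces removed), and each InsertRow call is still a separate round trip:
# Reuses spr_client, spreadsheet_key and worksheet_id from the code above.
rows = [
    {'name': 'foo', 'value': 'bar'},  # keys = header names, lowercased, no spaces
    {'name': 'baz', 'value': 'qux'},
]
for row in rows:
    # One HTTP request per row -- no batching, so the latency still adds up.
    spr_client.InsertRow(row, spreadsheet_key, worksheet_id)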
As of Google I/O 2016, the latest Google Sheets API supports batch cell updates (and reads). Be aware, however, that GData is now deprecated, along with most GData-based APIs, including the one in your sample above; the new API is not GData-based. Also, putting email addresses and passwords in plain text in code is a security risk, so new(er) Google APIs use OAuth2 for authorization. You need to get the latest Google APIs Client Library for Python. It's as easy as pip install -U google-api-python-client (or pip3 for Python 3).
As far as batch insert goes, here's a simple code sample. Assume you have multiple rows of data in rows. To mass-inject them into a Sheet, say with file ID SHEET_ID and starting at the upper-left cell A1, you'd make one call like this:
SHEETS.spreadsheets().values().update(spreadsheetId=SHEET_ID, range='A1',
body={'values': rows}, valueInputOption='RAW').execute()
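For context, here is a minimal sketch of how the SHEETS service object above might be constructed; it assumes you already have OAuth2 credentials (creds) from the client library's auth flow, as shown in the official Python quickstart. Reading values back is symmetric to the update call:
from googleapiclient.discovery import build

# `creds` is assumed to come from the OAuth2 flow in the Sheets API Python quickstart;
# SHEET_ID and rows are defined by you.
SHEETS = build('sheets', 'v4', credentials=creds)

result = SHEETS.spreadsheets().values().get(
    spreadsheetId=SHEET_ID, range='A1:Z999').execute()
rows_read = result.get('values', [])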
If you want a longer example, see the first video below, where those rows are read out of a relational database. For those new to this API, here's one code sample from the official docs to help get you kick-started. For slightly longer, more "real-world" examples, see these videos & blog posts:
Migrating SQL data to a Sheet plus code deep dive post
Formatting text using the Sheets API plus code deep dive post
Generating slides from spreadsheet data plus code deep dive post
The latest Sheets API provides features not available in older releases, namely giving developers programmatic, document-oriented access to a Sheet as if they were using the user interface (creating frozen rows, performing cell formatting, resizing rows/columns, adding pivot tables, creating charts, etc.).
However, to perform file-level access on Sheets, such as import/export, copy, move, rename, etc., you'd use the Google Drive API. Examples of using the Drive API:
Exporting a Google Sheet as CSV (blogpost)
"Poor man's plain text to PDF" converter (blogpost) (*)
(*) - TL;DR: upload a plain text file to Drive, import/convert it to Google Docs format, then export that Doc as PDF. The post above uses Drive API v2; this follow-up post describes migrating it to Drive API v3, and here's a developer video combining both "poor man's converter" posts.
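As a small illustration of that file-level access, exporting a Google Sheet as CSV with Drive API v3 looks roughly like this (DRIVE is an authorized Drive service object built the same way as SHEETS above, and FILE_ID is a placeholder for the Sheet's file ID):
from googleapiclient.discovery import build

DRIVE = build('drive', 'v3', credentials=creds)
# files().export() converts a Google-editor file to the requested MIME type
# (here CSV) and returns the content as bytes.
data = DRIVE.files().export(fileId=FILE_ID, mimeType='text/csv').execute()
with open('sheet.csv', 'wb') as f:
    f.write(data)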
