Getting the download link for a public Google Docs file - python

Reading the Google Docs API documentation, I found this:
Downloading
Files cannot be downloaded in a format other than
the one in which they were originally uploaded. The download URL for
files looks something like this:
https://doc-04-20-docs.googleusercontent.com/docs/secure/m7an0emtau/WJm12345/YzI2Y2ExYWVm?h=16655626&e=download&gd=true
Given a public Google Documents file URL, say,
https://docs.google.com/open?id=0B1-vl-dPgKm_NTNhZjZkMWMtZjQxOS00MGE1LTg2MjItNGVjYzdmZjYxNmQ5
How can I turn it into a download link?

Hi nightcracker, try this:
https://docs.google.com/uc?export=download&id=DOCIDGOESHERE
I've only tried it with one PDF and it worked OK, so experimenting with that may help.
All the best,
Dave
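As a rough sketch of the suggestion above, the id parameter can be pulled out of the open?id=... URL from the question and placed into the uc?export=download form. The function name to_download_link is made up for illustration, and whether the resulting link actually downloads depends on the file type and its sharing settings.

```python
from urllib.parse import urlparse, parse_qs, urlencode


def to_download_link(share_url):
    """Turn a docs.google.com/open?id=... URL into the uc?export=download form."""
    # parse_qs maps each query parameter to a list of values
    ids = parse_qs(urlparse(share_url).query).get("id")
    if not ids:
        raise ValueError("no id parameter found in URL")
    query = urlencode({"export": "download", "id": ids[0]})
    return "https://docs.google.com/uc?" + query


share_url = ("https://docs.google.com/open"
             "?id=0B1-vl-dPgKm_NTNhZjZkMWMtZjQxOS00MGE1LTg2MjItNGVjYzdmZjYxNmQ5")
print(to_download_link(share_url))
```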

The method suggested by dkcwd didn't work for a published document.
I found the following method on this page. Given a published document URL, such as
https://docs.google.com/document/pub?id=[ID]
The download link is
https://docs.google.com/document/export?format=[FORMAT]&id=[ID]
where [FORMAT] can be one of these values: pdf, doc, docx, oo, rtf, txt.
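A minimal sketch of building that export link in Python, assuming only the URL pattern and format values listed above. The names export_link and ALLOWED_FORMATS are hypothetical, and the ID used in the example is a placeholder, not a real document.

```python
from urllib.parse import urlencode

# Format values taken from the answer above
ALLOWED_FORMATS = {"pdf", "doc", "docx", "oo", "rtf", "txt"}


def export_link(doc_id, fmt="pdf"):
    """Build the export URL for a published document ID."""
    if fmt not in ALLOWED_FORMATS:
        raise ValueError("unsupported format: " + fmt)
    query = urlencode({"format": fmt, "id": doc_id})
    return "https://docs.google.com/document/export?" + query


print(export_link("PLACEHOLDER_ID", "txt"))
```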

Related

downloading csv files from a specific site using python

Goal: I want to automate the download of various .csv files from https://wyniki.tge.pl/en/wyniki/archiwum/2/?date_to=2018-03-21&date_from=2018-02-19&data_scope=contract&market=rtee&data_period=3 using Python (this is not the main issue, though).
Specifics: in particular, I am trying to download the csv file for "Settlement price" and "BASE Year".
Problem: when I view the source code of this web page, I see references to the "Upload" button, but I don't see references to the csv file (to be fair, I am not very good at reading source code). As I am using Python (urllib), I need to know the URL of the csv file, but I don't know how to get it.
This is not a question about Python per se, but about how to find the URL of a .csv file that can be downloaded from a web page. Hence, no code is provided.
If you inspect the source code from that webpage in particular, you will see that the form to obtain the csv file has 3 main inputs:
file_type
fields
contracts
So, to obtain the csv file for the "Settlement price" and "BASE Year", you would simply do a POST request to that same URL, passing these as the payload:
file_type=2&fields=4&contracts=4
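Since the question mentions urllib, here is a sketch of that POST request using only the standard library. The helper names build_csv_request and download_csv are made up for illustration, and the field values are simply the ones quoted above; the site may have changed since, so treat this as a starting point rather than a guaranteed recipe.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

URL = ("https://wyniki.tge.pl/en/wyniki/archiwum/2/"
       "?date_to=2018-03-21&date_from=2018-02-19"
       "&data_scope=contract&market=rtee&data_period=3")


def build_csv_request(url, file_type=2, fields=4, contracts=4):
    """Prepare a POST request carrying the three form inputs."""
    payload = urlencode({"file_type": file_type,
                         "fields": fields,
                         "contracts": contracts})
    # Passing data makes urllib send a POST instead of a GET
    return Request(url, data=payload.encode("ascii"),
                   headers={"Content-Type": "application/x-www-form-urlencoded"})


def download_csv(url, path):
    """Send the request and write the response body to a file."""
    with urlopen(build_csv_request(url)) as response, open(path, "wb") as out:
        out.write(response.read())


request = build_csv_request(URL)
print(request.get_method(), request.data)
```

Calling download_csv(URL, "settlement.csv") would perform the actual download; the file name here is arbitrary.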
I would recommend using the wget command with Python. wget is a command for downloading files. Once you have downloaded the file with wget, you can manipulate the csv file using another library.
I found this wget library for Python:
https://pypi.python.org/pypi/wget
Regards.
Eduardo Estevez.

Getting file from URL that triggers a download in Python

I have a URL in a web analytics reporting platform that basically triggers a download/export of the report you're looking at. The downloaded file itself is a CSV, and the link that triggers the download uses several attached parameters to define things like the fields in the report. What I am looking to do is download the CSV that the link triggers a download of.
I'm using Python 3.6, and I've been told that the server I'll be deploying on does not support Selenium or any webkits like PhantomJS. Has anyone successfully accomplished this?
If the file is a CSV file, you might want to consider downloading its content directly using the requests module, something like this:
import requests
session = requests.Session()
information = session.get(url)  # url: the link that triggers the download
Then you can decode the information and read the contents as you wish using the csv module, something like this (note that the csv module must be imported):
import csv
decoded_information = information.content.decode('utf-8')
data = decoded_information.splitlines()
data = csv.DictReader(data)
You can use a for loop to access each row in the data as you wish using the column headings as dictionary keys like so:
for row in data:
    itemdate = row['Date']
    ...
Or you can save the decoded contents by writing them to a file with something like this:
decoded_information = information.content.decode('utf-8')
# the with block closes the file automatically
with open("filename.csv", "w") as file:
    file.write(decoded_information)
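The decoding and parsing steps can be tried without any network access. The sketch below feeds a small made-up CSV payload (standing in for information.content) through csv.DictReader; the helper name parse_csv_bytes and the Date and Visits columns are assumptions for illustration only.

```python
import csv
import io


def parse_csv_bytes(content, encoding="utf-8"):
    """Decode raw response bytes and return the rows as a list of dicts."""
    text = content.decode(encoding)
    # newline="" leaves line endings untouched so the csv module can handle them
    return list(csv.DictReader(io.StringIO(text, newline="")))


# Made-up sample standing in for the bytes returned by the server
sample = b"Date,Visits\r\n2018-03-01,120\r\n2018-03-02,98\r\n"
rows = parse_csv_bytes(sample)
for row in rows:
    print(row["Date"], row["Visits"])
```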
A couple of links to documentation on the csv module are provided here in case you haven't used it before:
https://docs.python.org/2/library/csv.html
http://www.pythonforbeginners.com/systems-programming/using-the-csv-module-in-python/
Hope this helps!

How to retrieve wav file from API?

I am working with Python 3.4.4.
I tried to use a Merriam-Webster API, and here is an example link:
http://www.dictionaryapi.com/api/v1/references/collegiate/xml/purple?key=bf534d02-bf4e-49bc-b43f-37f68a0bf4fd
There is a file under the wav tag, which you will see after you open the URL.
I am wondering how I can retrieve that wav file, because it is just a string to me.
Thank you very much!
Okay, I just sorted it out.
Usually you need to look at the documentation for the API. I looked it up on the official website, and it tells you how to retrieve the file. In this case you go to another URL, and voilà.
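For the first half of that, getting the file name string out of the XML, a sketch with the standard library might look like the following. It assumes the element holding the file name is called wav, and the sample XML is a made-up stand-in, not the real API response. The URL the file name then has to be combined with is described in the API's own documentation and is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Made-up sample; the real response layout may differ
SAMPLE_XML = """<entry_list>
  <entry id="purple">
    <sound><wav>example01.wav</wav></sound>
  </entry>
</entry_list>"""


def wav_filenames(xml_text):
    """Return the text of every wav element found in the document."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter("wav") if el.text]


print(wav_filenames(SAMPLE_XML))
```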

Python 3.4 - Downloading newly uploaded text files from pastebin.com

I want to download text files from pastebin.com.
Once I start the program it should look for text files that are being uploaded and "download" them once they're uploaded.
I know how to "download" them but not how to tell Python to click on one of the public files on http://pastebin.com/archive and then click on the "raw"-button to open a new tab that contains the "raw" content.
I googled a lot but literally nothing came up that would help me.
Thanks
Well, a program doesn't know how to "click" anything :). To retrieve information from a page, you simply need to send a GET request to the correct URL. In your case, that would be http://pastebin.com/raw/4ffLHviP, or the same URL with the code of any other paste you want to download. You can collect the codes manually, or by applying text parsers (regex, BeautifulSoup...) to the archive page.
Note that there is an API for scraping Pastebin (see http://pastebin.com/scraping). If you want to extract a substantial amount of content, using it is strongly recommended. It is more "polite", may offer better service, and helps you avoid being blacklisted.
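A small sketch of the regex approach, under the assumption that links to pastes on the archive page look like href="/XXXXXXXX" with an eight-character alphanumeric code. The sample HTML is made up, and the real page markup may differ, so the pattern would need checking against the actual page.

```python
import re

# Assumption: paste links are of the form href="/" followed by 8 alphanumeric characters
PASTE_LINK_RE = re.compile(r'href="/([A-Za-z0-9]{8})"')


def paste_codes(archive_html):
    """Return the paste codes found in the archive page HTML."""
    return PASTE_LINK_RE.findall(archive_html)


def raw_url(code):
    """Build the raw-content URL for a paste code."""
    return "http://pastebin.com/raw/" + code


# Made-up fragment standing in for the archive page
SAMPLE = ('<a href="/4ffLHviP">first</a> '
          '<a href="/archive">Archive</a> '
          '<a href="/B8A6L7Zt">second</a>')

for code in paste_codes(SAMPLE):
    print(raw_url(code))
```

Each resulting URL can then be fetched with an ordinary GET request.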
To get the content of a file, you simply do the following:
Visit the link of the file, e.g. http://pastebin.com/B8A6L7Zt
The raw content is already on that page, namely inside <textarea id='paste_code'>...</textarea>. So you just extract this content, using a regex for example.
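Extracting the textarea content with a regex could look roughly like this. The sample page is made up, and the sketch assumes the textarea id is paste_code as stated above; the function name extract_paste is hypothetical.

```python
import html
import re

# Matches the textarea with id paste_code, quoted with either ' or "
TEXTAREA_RE = re.compile(
    r"""<textarea[^>]*\bid=['"]paste_code['"][^>]*>(.*?)</textarea>""",
    re.DOTALL,
)


def extract_paste(page_html):
    """Return the paste text from the page, or None if it is not found."""
    match = TEXTAREA_RE.search(page_html)
    if match is None:
        return None
    # Turn HTML entities such as &lt; back into plain characters
    return html.unescape(match.group(1))


# Made-up page standing in for a real paste page
SAMPLE_PAGE = ("<html><body><textarea id='paste_code'>"
               "print(1 &lt; 2)\nprint('done')"
               "</textarea></body></html>")
print(extract_paste(SAMPLE_PAGE))
```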

Downloading public files in Google Drive (Python)

Suppose that someone gives me a link that enables me to download a public file in Google Drive.
I want to write a program that can read the link and then download it as a text file.
For example, https://docs.google.com/document/d/1yJVXtabsP7KrJXSu3XyOh-F2cFoP8Lftr14PtXCLEVU/edit is one of files in my Google Drive.
Everyone can access this file.
But how can I write a Python program that downloads the text file given the above link?
Could someone share some sample code?
It seems that a Google Drive SDK could be useful, but is there any way to do it without using an SDK?
First you need to write a program that extracts the file ID from the link you were given.
For example, in the link that you gave:
https://docs.google.com/document/d/1yJVXtabsP7KrJXSu3XyOh-F2cFoP8Lftr14PtXCLEVU/edit
the ID is 1yJVXtabsP7KrJXSu3XyOh-F2cFoP8Lftr14PtXCLEVU
Save it in a variable, say file_id.
Now build the download link by putting the value of file_id after id=:
https://docs.google.com/uc?export=download&id=file_id
This link will download the file.
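The ID extraction described above can be sketched with a regex that grabs the path segment after /d/. The function names drive_file_id and uc_download_link are made up for illustration, and the pattern assumes IDs consist of letters, digits, hyphens and underscores.

```python
import re


def drive_file_id(url):
    """Extract the file ID that follows /d/ in a Google Docs URL."""
    # Assumption: IDs contain only letters, digits, "-" and "_"
    match = re.search(r"/d/([A-Za-z0-9_-]+)", url)
    if match is None:
        raise ValueError("no file ID found in URL")
    return match.group(1)


def uc_download_link(url):
    """Build the uc?export=download link from a sharing URL."""
    return "https://docs.google.com/uc?export=download&id=" + drive_file_id(url)


link = ("https://docs.google.com/document/d/"
        "1yJVXtabsP7KrJXSu3XyOh-F2cFoP8Lftr14PtXCLEVU/edit")
print(drive_file_id(link))
print(uc_download_link(link))
```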
If the above answer doesn't work for you, use the following links.
To save as a .txt file:
https://docs.google.com/document/d/1yJVXtabsP7KrJXSu3XyOh-F2cFoP8Lftr14PtXCLEVU/export?format=txt
To save as a .docx file:
https://docs.google.com/document/d/1yJVXtabsP7KrJXSu3XyOh-F2cFoP8Lftr14PtXCLEVU/export?format=docx
In general, the trick is to replace edit at the end of the URL with export?format=txt. Hope it helps.
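That replacement can be done on the URL path rather than with plain string substitution, which avoids touching an "edit" that happens to appear elsewhere in the URL. This is only a sketch; the function name export_url is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def export_url(edit_url, fmt="txt"):
    """Swap a trailing /edit segment for /export?format=<fmt>."""
    parts = urlsplit(edit_url)
    path = parts.path.rstrip("/")
    head, _, last = path.rpartition("/")
    # Drop the last segment only when it really is "edit"
    base = head if last == "edit" else path
    return urlunsplit((parts.scheme, parts.netloc, base + "/export",
                       "format=" + fmt, ""))


edit_link = ("https://docs.google.com/document/d/"
             "1yJVXtabsP7KrJXSu3XyOh-F2cFoP8Lftr14PtXCLEVU/edit")
print(export_url(edit_link))
print(export_url(edit_link, "docx"))
```

The resulting URL can be fetched with urllib.request.urlopen or any other HTTP client, provided the document is publicly accessible.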
