Download files from personal OneDrive using Python


I have a Python script that is running periodically on an AWS EC2 Ubuntu machine.
This script reads data from some files and sometimes changes data in them.
I want to download these files from OneDrive, do my own thing with them, and upload them back to OneDrive.
I want this to be done automatically, without the need for a user to approve any login or credentials. I'm ok with doing it once (i.e. approving the login on the first run) but the rest has to run automatically, without asking ever again for approvals (unless the permissions change, of course).
What is the best way to do this?
I've been reading the documentation on the Microsoft Graph API but I'm struggling with the authentication part. I've created an application in Azure AD, granted it the sample permissions (for testing) and created a client secret.

I managed to do it. I'm not sure if it's the best way but it is working now. It's running automatically every hour and I don't need to touch it.
I followed the information on https://learn.microsoft.com/en-gb/azure/active-directory/develop/v2-oauth2-auth-code-flow
This is what I did.
Azure Portal
Create an application. Azure Active Directory -> App Registrations -> Applications from personal account
In Supported account types, choose the one that has personal Microsoft accounts.
In Redirect URI, choose Public client/native. We'll add the specific URI later.
In the application details, in the section Overview, take note of the Application (client) ID. We'll need this later.
In the section Authentication, click Add a Platform and choose Desktop + devices. You can use your own, I chose one of the suggested: https://login.microsoftonline.com/common/oauth2/nativeclient
In the section API permissions, you have to add all the permissions that your app will use. I added User.Read, Files.ReadWrite and offline_access. The offline_access is to be able to get the refresh token, which will be crucial to keep the app running without asking the user to login.
I did not create any Certificate or Secret.
Web
It looks like getting a token for the first time requires a browser, or something that emulates one.
There must be a programmatic way to do this, but I had no idea how to do it. I also thought about using Selenium for this, but since it's only one time and my app will request tokens every hour (keeping the tokens fresh), I dropped that idea.
If we add new permissions, the tokens that we have will become invalid and we have to do this manual part again.
Open a browser and go to the URL below. Use the Scopes and the Redirect URI that you set up in Azure Portal.
https://login.microsoftonline.com/common/oauth2/v2.0/authorize?client_id=your_app_client_id&response_type=code&redirect_uri=https%3A%2F%2Flogin.microsoftonline.com%2Fcommon%2Foauth2%2Fnativeclient&response_mode=query&scope=User.Read%20offline_access%20Files.ReadWrite
That URL will redirect you to the Redirect URI that you set up and with a code=something in the URL. Copy that something.
Send a POST request with a form-URL-encoded body (Content-Type: application/x-www-form-urlencoded). I used https://reqbin.com/ for this.
Endpoint: https://login.microsoftonline.com/common/oauth2/v2.0/token
Form body: grant_type=authorization_code&client_id=your_app_client_id&code=use_the_code_returned_on_previous_step
This will return an Access Token and a Refresh Token. Store the Refresh Token somewhere. I'm saving it in a file.
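The manual steps above can also be sketched in Python with only the standard library. This is an illustration, not the answer author's code: the function names are hypothetical, and urllib stands in for the browser-based REST tool. One assumption to flag: Microsoft's documentation lists redirect_uri as a parameter of the token request, so the sketch sends it even though the form body above omits it.

```python
import json
import urllib.parse
import urllib.request

AUTHORITY = "https://login.microsoftonline.com/common/oauth2/v2.0"
REDIRECT_URI = "https://login.microsoftonline.com/common/oauth2/nativeclient"
SCOPES = "User.Read offline_access Files.ReadWrite"


def build_authorize_url(client_id):
    """Return the URL to open in a browser for the one-time login."""
    query = urllib.parse.urlencode({
        "client_id": client_id,
        "response_type": "code",
        "redirect_uri": REDIRECT_URI,
        "response_mode": "query",
        "scope": SCOPES,
    }, quote_via=urllib.parse.quote)  # encode spaces as %20, not '+'
    return AUTHORITY + "/authorize?" + query


def extract_code(redirected_url):
    """Pull the code=... value out of the URL the browser ended up on."""
    query = urllib.parse.urlparse(redirected_url).query
    values = urllib.parse.parse_qs(query).get("code")
    if not values:
        raise ValueError("no 'code' parameter in " + redirected_url)
    return values[0]


def exchange_code(client_id, code):
    """POST the code to the token endpoint and return the parsed JSON reply."""
    form = urllib.parse.urlencode({
        "grant_type": "authorization_code",
        "client_id": client_id,
        "code": code,
        # Assumption: sent because Microsoft's docs list it for this request
        "redirect_uri": REDIRECT_URI,
    }).encode("ascii")
    request = urllib.request.Request(AUTHORITY + "/token", data=form)
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)
```

Typical use would be to print build_authorize_url(client_id), sign in, paste the resulting address into extract_code, and pass the result to exchange_code; the reply should contain both access_token and refresh_token.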
Python
import requests

# Build the POST parameters
params = {
    'grant_type': 'refresh_token',
    'client_id': your_app_client_id,
    'refresh_token': refresh_token_that_you_got_in_the_previous_step
}
response = requests.post('https://login.microsoftonline.com/common/oauth2/v2.0/token', data=params)
response.raise_for_status()  # stop here if the token request failed
tokens = response.json()
access_token = tokens['access_token']
new_refresh_token = tokens['refresh_token']
# ^ Save the new refresh token somewhere.
# I just overwrite the file with the new one.
# This new one will be used next time.
header = {'Authorization': 'Bearer ' + access_token}
# Download the file
response = requests.get('https://graph.microsoft.com/v1.0/me/drive/root:' +
                        PATH_TO_FILE + '/' + FILE_NAME + ':/content', headers=header)
# Save the file to disk
with open(FILE_NAME, 'wb') as file:
    file.write(response.content)
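The question also asks about uploading the files back, which the code above does not cover. A minimal sketch, assuming Graph's simple upload (a PUT to the same :/content path used for the download) fits the file size; larger files would need an upload session. The helper names are hypothetical, and urllib is used instead of requests only to keep the sketch free of third-party dependencies.

```python
import urllib.parse
import urllib.request

GRAPH_ROOT = "https://graph.microsoft.com/v1.0/me/drive/root:"


def content_url(folder, file_name):
    """Build the :/content URL for a file, percent-encoding spaces and the like."""
    path = folder.rstrip("/") + "/" + file_name
    return GRAPH_ROOT + urllib.parse.quote(path) + ":/content"


def upload_file(access_token, folder, file_name, local_path):
    """PUT the local file's bytes to OneDrive, replacing the remote copy."""
    with open(local_path, "rb") as handle:
        body = handle.read()
    request = urllib.request.Request(
        content_url(folder, file_name),
        data=body,
        method="PUT",
        headers={
            "Authorization": "Bearer " + access_token,
            "Content-Type": "application/octet-stream",
        },
    )
    with urllib.request.urlopen(request) as reply:
        return reply.status  # HTTP status code of the upload
```

The same content_url helper can be reused for the download, which avoids broken URLs when a file name contains spaces.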
So basically, I have the Refresh Token always updated.
I call the Token endpoint using that Refresh Token, and the API gives me an Access Token to use during the current session and a new Refresh Token.
I use this new Refresh Token the next time I run the program, and so on.
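Since everything depends on the stored refresh token surviving between runs, it may be worth writing it atomically so a crash mid-write cannot leave an empty file. A small sketch under that assumption; the function names and the one-token-per-file layout are hypothetical, not part of the original answer.

```python
import os
import tempfile
from pathlib import Path


def load_refresh_token(path):
    """Read the stored refresh token, dropping the trailing newline."""
    return Path(path).read_text(encoding="utf-8").strip()


def save_refresh_token(path, token):
    """Write the token to a temp file, then atomically swap it into place."""
    target = Path(path)
    # mkstemp creates the file readable and writable by the owner only
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, prefix=target.name)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(token + "\n")
        os.replace(tmp_name, target)  # atomic rename over the old file
    except BaseException:
        os.unlink(tmp_name)  # clean up the temp file if anything failed
        raise
```

Each hourly run would call load_refresh_token before the token request and save_refresh_token with the new value right after it.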

I've just published a repo which does this. Contributions and pull requests welcome:
https://github.com/stevemurch/onedrive-download

Related

Microsoft Graph API Read Mail with Python

I'm trying to create a python script that continuously reads mail from a service account in my organization. I'm attempting to use the Microsoft Graph API, but the more I read, the more confused I get. I have registered an app in Azure Portal and have my client id, client secret, etc, then it's my understanding you have to use those, call the API that requires you to paste a url into your browser to log in to consent access, and that provides a token that only lasts an hour? How can I do this programmatically?
I guess my question is, has anyone had any luck doing this with the graph api? How can I do this without having to do the browser handshake every hour? I would like to be able to just run this script and let it run without worrying about needing to refresh a token ever so often. Am I just dumb, or is this way too complicated lol. Any python examples on how people are authenticating to the graph api and staying authenticated would be greatly appreciated!

I was just working on something similar today. (Microsoft recently deprecated basic authentication for exchange, and I can no longer send mail using a simple username/password from a web application I support.)
Using the microsoft msal python library https://github.com/AzureAD/microsoft-authentication-library-for-python, and the example in sample/device_flow_sample.py, I was able to build a user-based login that retrieves an access token and refresh token in order to stay logged in (using "device flow authentication"). The msal library handles storing and reloading the token cache, as well as refreshing the token whenever necessary.
Below is the code for logging in the first time
# see https://github.com/AzureAD/microsoft-authentication-library-for-python/blob/dev/sample/device_flow_sample.py
import sys
import json
import logging
import os
import atexit
import requests
import msal
# logging
logging.basicConfig(level=logging.DEBUG)  # Enable DEBUG log for entire script
logging.getLogger("msal").setLevel(logging.INFO)  # Optionally disable MSAL DEBUG logs
# config
config = dict(
    authority = "https://login.microsoftonline.com/common",
    client_id = 'YOUR CLIENT ID',
    scope = ["User.Read"],
    username = 'user@domain',
    cache_file = 'token.cache',
    endpoint = 'https://graph.microsoft.com/v1.0/me'
)
# cache
cache = msal.SerializableTokenCache()
if os.path.exists(config["cache_file"]):
    cache.deserialize(open(config["cache_file"], "r").read())
atexit.register(lambda:
    open(config["cache_file"], "w").write(cache.serialize())
    if cache.has_state_changed else None)
# app
app = msal.PublicClientApplication(
    config["client_id"], authority=config["authority"],
    token_cache=cache)
# exists?
result = None
accounts = app.get_accounts()
if accounts:
    logging.info("found accounts in the app")
    for a in accounts:
        print(a)
        if a["username"] == config["username"]:
            result = app.acquire_token_silent(config["scope"], account=a)
            break
else:
    logging.info("no accounts in the app")
# initiate
if result:
    logging.info("found a token in the cache")
else:
    logging.info("No suitable token exists in cache. Let's get a new one from AAD.")
    flow = app.initiate_device_flow(scopes=config["scope"])
    if "user_code" not in flow:
        raise ValueError(
            "Fail to create device flow. Err: %s" % json.dumps(flow, indent=4))
    print(flow["message"])
    sys.stdout.flush()  # Some terminal needs this to ensure the message is shown
    # Ideally you should wait here, in order to save some unnecessary polling
    input("Press Enter after signing in from another device to proceed, CTRL+C to abort.")
    result = app.acquire_token_by_device_flow(flow)  # By default it will block
    # You can follow this instruction to shorten the block time
    # https://msal-python.readthedocs.io/en/latest/#msal.PublicClientApplication.acquire_token_by_device_flow
    # or you may even turn off the blocking behavior,
    # and then keep calling acquire_token_by_device_flow(flow) in your own customized loop.
if result and "access_token" in result:
    # Calling graph using the access token
    graph_data = requests.get(  # Use token to call downstream service
        config["endpoint"],
        headers={'Authorization': 'Bearer ' + result['access_token']},).json()
    print("Graph API call result: %s" % json.dumps(graph_data, indent=2))
else:
    print(result.get("error"))
    print(result.get("error_description"))
    print(result.get("correlation_id"))  # You may need this when reporting a bug
You'll need to fix up the config, and update the scope for the appropriate privileges.
All the magic is in here:
result = app.acquire_token_silent(config["scope"], account=a)
and putting the Authorization access_token in the requests headers:
graph_data = requests.get(  # Use token to call downstream service
    config["endpoint"],
    headers={'Authorization': 'Bearer ' + result['access_token']},).json()
As long as you call acquire_token_silent before you invoke any Graph APIs, the tokens will stay up to date. The refresh token is valid for 90 days by default and is replaced with a fresh one each time it is used. Once you log in, the tokens are updated and stored in the cache (and persisted to a file), and stay alive more or less indefinitely (some things can invalidate them on the server side).
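For a long-running script, the "call acquire_token_silent first" rule can be wrapped in one helper that every Graph call goes through. A minimal sketch, assuming app is an msal.PublicClientApplication built with a persisted token cache as in the script above; the helper name and the error handling are hypothetical.

```python
def get_access_token(app, scopes, username):
    """Return a valid access token from the cache, refreshing it if needed.

    `app` is expected to be an msal.PublicClientApplication (or anything
    exposing the same get_accounts / acquire_token_silent methods).
    """
    for account in app.get_accounts(username=username):
        # MSAL returns a cached token, or silently redeems the refresh token
        result = app.acquire_token_silent(scopes, account=account)
        if result and "access_token" in result:
            return result["access_token"]
    raise RuntimeError(
        "No usable cached token for %s; run the interactive login again" % username)
```

Calling this right before each request keeps the Authorization header current without any further user interaction, as long as the cached refresh token is still valid.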
Unfortunately, I'm still having problems because it's an unverified multi-tenant application. I successfully added the user as a guest in my tenant, and the login works, but as soon as I request more interesting privileges in the scope, the user can't log in. I'll either have to get my MPN (Microsoft Partner Network) ID verified, or get the admin at my client's third-party IT provider to grant permission for this app in their tenant. If I had admin privileges for their tenant, I'd probably be looking at the daemon authentication method instead of the user-based one.
(to be clear, the code above is the msal example almost verbatim, with config and persistence tweaks)
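For reference, the daemon (client credentials) approach mentioned above needs no user login at all. A rough sketch, assuming the app has been granted an application permission such as Mail.Read with admin consent in the tenant; tenant_id, the mailbox address and the helper names are placeholders, not taken from the answer.

```python
import json
import urllib.parse
import urllib.request

GRAPH_SCOPE = ["https://graph.microsoft.com/.default"]


def build_daemon_app(tenant_id, client_id, client_secret):
    """Create a confidential client bound to one tenant (needs `pip install msal`)."""
    import msal  # imported lazily so the helpers below work without it
    return msal.ConfidentialClientApplication(
        client_id,
        authority="https://login.microsoftonline.com/" + tenant_id,
        client_credential=client_secret,
    )


def messages_url(mailbox, top=10):
    """URL listing the newest messages of a given mailbox."""
    return ("https://graph.microsoft.com/v1.0/users/"
            + urllib.parse.quote(mailbox, safe="@") + "/messages?$top=%d" % top)


def read_mail(app, mailbox):
    """Get an app-only token and fetch the mailbox's messages."""
    result = app.acquire_token_for_client(scopes=GRAPH_SCOPE)
    if "access_token" not in result:
        raise RuntimeError(result.get("error_description", "token request failed"))
    request = urllib.request.Request(
        messages_url(mailbox),
        headers={"Authorization": "Bearer " + result["access_token"]})
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)["value"]
```

Because the token is issued to the application rather than a user, there is no refresh token to look after; the script simply requests a new token whenever it runs.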

python: how to redirect from desktop app to url, wait user to accept the authorization and get authorization code

I'm working on an app using the Spotify API but I'm a bit new to all of this. I'm trying to get the Authorization Code with Proof Key for Code Exchange (PKCE) (https://developer.spotify.com/documentation/general/guides/authorization-guide/#authorization-code-flow-with-proof-key-for-code-exchange-pkce)
My problem is how do I redirect the user to the query where he has to ACCEPT the authorization and make my app to wait until the user clicks on ACCEPT. When he does this, the user will be redirected and that new URL (as the docs said) will contain the authorization code that I need to then exchange it for an authorization token.
This is my function so far to get that authorization code:
def get_auth_code(self):
    code_challenge = self.get_code_challenge_PKCE()
    scopes_needed = "user-read-email%20user-read-private%20playlist-read-collaborative%20playlist-modify-public%20playlist-read-private%20playlist-modify-private%20user-library-modify%20user-library-read"
    endpoint = "https://accounts.spotify.com/authorize"
    query = f"{endpoint}?client_id={self.client_ID}&response_type=code&redirect_uri={self.redirect_uri}&scope={scopes_needed}&code_challenge_method=S256&code_challenge={code_challenge}"
    webbrowser.open(query)

Set up a web server.
To programmatically extract the access tokens you need a web server to handle the redirection after the user logs in on Spotify (which you redirected them to). This "server" can be the user pasting the URI into an input field on a terminal, but obviously that isn't ideal for user experience. It leaves room for lots of mistakes.
I've authored a Spotify Web API client, whose internals might be useful for you to examine. For example, you can use Flask to construct the server. The main principle is using one endpoint (e.g. /login) to redirect the user (code 307 worked for me, as browsers won't remember it) to a callback (e.g. /callback) which receives the code parameter, with which you can request an access token.
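If pulling in Flask feels heavy for a desktop app, the callback half of this idea can be sketched with the standard library's http.server instead. This is a swapped-in technique, not what the library mentioned above does; the port, the /callback path and the function names are assumptions, and the redirect URI registered with Spotify would have to match them (for example http://127.0.0.1:8888/callback).

```python
import urllib.parse
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_callback(path):
    """Return the first value of each query parameter in a request path."""
    query = urllib.parse.urlparse(path).query
    return {key: values[0] for key, values in urllib.parse.parse_qs(query).items()}


class CallbackHandler(BaseHTTPRequestHandler):
    """Handle the redirect request and stash its parameters on the server."""

    def do_GET(self):
        self.server.captured = parse_callback(self.path)  # code, state, error...
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.end_headers()
        self.wfile.write(b"Login complete. You can close this tab.")

    def log_message(self, format, *args):
        pass  # keep the terminal quiet


def wait_for_callback(port=8888):
    """Serve exactly one request on localhost and return its query parameters."""
    server = HTTPServer(("127.0.0.1", port), CallbackHandler)
    server.captured = {}
    try:
        server.handle_request()  # blocks until the browser is redirected here
    finally:
        server.server_close()
    return server.captured
```

The app would call webbrowser.open(query) and then wait_for_callback(), which returns once the user has accepted and the browser has followed the redirect; the returned dict should contain the code (and the state, if one was sent).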
OAuth2 can be a bit of a pain to implement locally, I know. My library also has a function similar to the one you are constructing with webbrowser, but it does have the manual copy-pasting quirk. Using helper functions that you can define yourself (omitted here for brevity), the gist of it is:
verifier = secrets.token_urlsafe(32) # for PKCE, not in my library yet
url = user_authorisation_url(scope, state, verifier)
# Communicate with the user
print('Opening browser for Spotify login...')
webbrowser.open(url)
redirected = input('Please paste redirect URL: ').strip()
code = parse_code_from_url(redirected)
state_back = parse_state_from_url(redirected)
assert state == state_back # For that added security juice
token = request_user_token(code, verifier)
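The PKCE part itself is small. A sketch of how the verifier and the S256 challenge relate, per RFC 7636; make_pkce_pair is a hypothetical name, and this is one plausible shape for what a helper like get_code_challenge_PKCE in the question would compute.

```python
import base64
import hashlib
import secrets


def make_pkce_pair():
    """Return (code_verifier, code_challenge) for the S256 method."""
    verifier = secrets.token_urlsafe(32)  # 43 URL-safe characters
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # base64url without the trailing '=' padding, as RFC 7636 requires
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge
```

The challenge goes into the authorization URL, while the verifier must be kept and sent along with the code when exchanging it for a token.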

python linkedin oauth2 - where is http_api.py?

I'm trying to get the example from https://github.com/ozgur/python-linkedin to work, using the author's own code. When I run it, I don't get the RETURN_URL and authorization_code mentioned in the example. I'm not sure why; I think it is because I'm not setting up the HTTP API example correctly. I can't find http_api.py, and when I visit http://localhost:8080, I get a "this site can't be reached" error.
from linkedin import linkedin
API_KEY = 'wFNJekVpDCJtRPFX812pQsJee-gt0zO4X5XmG6wcfSOSlLocxodAXNMbl0_hw3Vl'
API_SECRET = 'daJDa6_8UcnGMw1yuq9TjoO_PMKukXMo8vEMo7Qv5J-G3SPgrAV0FqFCd0TNjQyG'
RETURN_URL = 'http://localhost:8000'
authentication = linkedin.LinkedInAuthentication(API_KEY, API_SECRET, RETURN_URL, linkedin.PERMISSIONS.enums.values())
# Optionally one can send custom "state" value that will be returned from OAuth server
# It can be used to track your user state or something else (it's up to you)
# Be aware that this value is sent to OAuth server AS IS - make sure to encode or hash it
#authorization.state = 'your_encoded_message'
print authentication.authorization_url # open this url on your browser
application = linkedin.LinkedInApplication(authentication)

http_api.py is one of the examples provided in the package. This is an HTTP server that will handle the response from LinkedIn's OAuth end point, so you'll need to boot it up for the example to work.
As stated in the guide, you'll need to execute that example file to get the server working. Note you'll also need to supply the following environment variables: LINKEDIN_API_KEY and LINKEDIN_API_SECRET.
You can run the example file by downloading the repo and calling LINKEDIN_API_KEY=yourkey LINKEDIN_API_SECRET=yoursecret python examples/http_api.py. Note you'll need Python 3.4 for it to work.

YouTube API without user OAuth process

I am trying to fetch captions from YouTube video using YouTube Data API (v3)
https://developers.google.com/youtube/v3/guides/implementation/captions
So, first I tried to retrieve a captions list using this url:
https://www.googleapis.com/youtube/v3/captions?part=snippet&videoId=KK9bwTlAvgo&key={My API KEY}
I could retrieve the caption id that I'd like to download (jEDP-pmNCIqoB8QGlXWQf4Rh3faalD_l) from the above link.
Then, I followed this instruction to download the caption:
https://developers.google.com/youtube/v3/docs/captions/download
However, even though I input the caption id and my api key correctly, it shows "Login Required" error.
I suppose I need OAuth authentication, but what I am trying to do is not related to my users' accounts; I simply want to download public caption data automatically.
My question is: Is there any way to process OAuth authentication just once to get an access token of my own YouTube account and then reuse it whenever I need it in my application?

I can't speak to the permissions needed for the captions API in particular, but in general, yes, you can OAuth to your app once using your own account and use the access and refresh tokens to make subsequent OAuth'd requests to the API. You can find the details of generating tokens here:
https://developers.google.com/youtube/v3/guides/auth/server-side-web-apps#Obtaining_Access_Tokens
To perform the steps manually (fortunately, you only need to do this once):
If access has already been granted for an app, it needs to be removed so that new auth credentials can be established. Go to https://security.google.com/settings/security/permissions (while logged into your account) and remove access to the app. If the client ID or secret change (or you need to create one), find them at https://console.developers.google.com under API Manager.
To grant access and receive a temporary code, enter this URL in a browser:
https://accounts.google.com/o/oauth2/auth?
client_id=<client_id>&
redirect_uri=http://www.google.com&
scope=https://www.googleapis.com/auth/youtube.force-ssl&
response_type=code&
access_type=offline&
approval_prompt=force
Follow the prompt to grant access to the app.
This will redirect to google.com with a code parameter (e.g.,
https://www.google.com/?code=4/ux5gNj-_mIu4DOD_gNZdjX9EtOFf&gws_rd=ssl#). Save the code.
Send a POST request (e.g., via Postman Chrome plugin) to https://accounts.google.com/o/oauth2/token with the following in the request body:
code=<code>&
client_id=<client_id>&
client_secret=<client_secret>&
redirect_uri=http://www.google.com&
grant_type=authorization_code
The response will contain both an access token and refresh token. Save both, but particularly the refresh token (because the access token will expire in 1 hour).
You can then use the access token to send an OAuth'd request manually, following one of the options here, essentially:
curl -H "Authorization: Bearer ACCESS_TOKEN" https://www.googleapis.com/youtube/v3/captions/<id>
or
curl https://www.googleapis.com/youtube/v3/captions/<id>?access_token=ACCESS_TOKEN
(When I tried the second option for captions, however, I got the message: "The OAuth token was received in the query string, which this API forbids for response formats other than JSON or XML. If possible, try sending the OAuth token in the Authorization header instead.")
You can also use the refresh token in your code to create the credential needed when building your YouTube object. In Java, this looks like the following:
String clientId = "<your client ID>";
String clientSecret = "<your client secret>";
String refreshToken = "<refresh token>";
HttpTransport transport = new NetHttpTransport();
JsonFactory jsonFactory = new JacksonFactory();
GoogleCredential credential = new GoogleCredential.Builder()
    .setTransport(transport)
    .setJsonFactory(jsonFactory)
    .setClientSecrets(clientId, clientSecret)
    .build()
    .setRefreshToken(refreshToken);
try {
    // Exchange the refresh token for a fresh access token
    credential.refreshToken();
} catch (IOException e) {
    e.printStackTrace();
}
youtube = new YouTube.Builder(transport, jsonFactory, credential).build();
I imagine you can do something similar in Python with the API Client Libraries, although I haven't tried Python.
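For what it's worth, the refresh step needs no client library at all: it is a form POST to the token endpoint with grant_type=refresh_token. A minimal Python sketch, assuming the same token endpoint as in the manual steps above (Google's current documentation also lists https://oauth2.googleapis.com/token); the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Token endpoint as used in the manual steps above
TOKEN_URL = "https://accounts.google.com/o/oauth2/token"


def refresh_request_body(client_id, client_secret, refresh_token):
    """Form-encode the parameters of a refresh_token grant."""
    return urllib.parse.urlencode({
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
        "grant_type": "refresh_token",
    }).encode("ascii")


def refresh_access_token(client_id, client_secret, refresh_token):
    """Trade the long-lived refresh token for a new short-lived access token."""
    request = urllib.request.Request(
        TOKEN_URL,
        data=refresh_request_body(client_id, client_secret, refresh_token))
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)["access_token"]
```

The returned access token then goes into the Authorization: Bearer header exactly as in the curl example.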

How do I access onedrive in an automated fashion without user interaction?

I am trying to access my own docs & spreadsheets via onedrive's api. I have:
import requests
client_id = 'my_id'
client_secret = 'my_secret'
scopes = 'wl.offline_access%20wl.signin%20wl.basic'
response_type = 'token' # also have tried "code"
redirect_uri = 'https://login.live.com/oauth20_desktop.srf'
base_url = 'https://apis.live.net/v5.0/'
r = requests.get('https://login.live.com/oauth20_authorize.srf?client_id=%s&scope=%s&response_type=%s&redirect_uri=%s' % (client_id, scopes, response_type, redirect_uri))
print r.text
(For my client I've also tried both "Mobile or desktop client app:" set to "Yes" and "No")
This will return the html for the user to manually click on. Since the user is me and it's my account how do I access the API without user interaction?
EDIT #1:
For those confused on what I'm looking for it would be the equivalent of Google's Service Account (OAuth2): https://console.developers.google.com/project

You cannot "bypass" the user interaction.
However, you are very close to getting it to work. If you want to obtain an access token in Python, you have to do it through the browser. You can use the webbrowser module to open the default web browser. It will look something like this (your app must be a desktop app):
import webbrowser
webbrowser.open("https://login.live.com/oauth20_authorize.srf?client_id=foo&scope=bar&response_type=code&redirect_uri=https://login.live.com/oauth20_desktop.srf")
This will bring you to the auth page. Sign in and agree to the terms (they will differ depending on the scope). You will then be directed to a page whose URL looks like:
https://login.live.com/oauth20_desktop.srf?code=<THISISTHECODEYOUWANT>&lc=foo
Copy this code from the browser and have your python script take it as input.
You can then make a request as described here using the code you received from the browser.
You will receive a response described here
