Storing OAuth Token in Python Library - python

I have a Python service which imports a library that talks to the PayPal API. There is a config file that is passed into the library __init__() which contains the PayPal API username and password.
Calling the PayPal API token endpoint with the username and password will return a token used to authenticate during the pay call. However, this token lasts for 90 minutes and should be reused.
There are multiple instances of this service running on different servers and they need to all share this one secret token.
What would the best way of storing this 90-minute token be?

While you could persist this in a database, since it's only valid for 90 minutes, you might consider using an in-memory data store like Redis. It's very simple to set up and there are various Python clients available.
Redis in particular supports expiration time when setting a value, so you can make sure it'll only be kept for a set amount of time. Of course, you should still have exception handling in place in case for some reason the key is invalidated early.
While this may introduce a software dependency if you're not already using a key-value store, it's not clear from your question how this library is intended to be used and thus whether this is an issue.
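As a rough sketch of the Redis approach, the helper below reads the token from a shared key and only requests a new one when the key is missing or has expired. It assumes a client object with redis-py style get(name) and set(name, value, ex=seconds) methods; the key name, the TTL and the fetch_token callable are hypothetical placeholders, not part of the PayPal library.

```python
TOKEN_KEY = "paypal:token"  # hypothetical key name shared by all instances
TOKEN_TTL = 85 * 60         # a little under 90 minutes, in seconds


def get_shared_token(client, fetch_token):
    """Return the cached token, or fetch a new one and cache it.

    client      -- object with redis-py style get(name) and
                   set(name, value, ex=seconds) methods
    fetch_token -- callable that requests a fresh token from the API
    """
    cached = client.get(TOKEN_KEY)
    if cached is not None:
        # redis-py returns bytes unless decode_responses=True is set.
        return cached.decode() if isinstance(cached, bytes) else cached

    token = fetch_token()
    # ex= tells Redis to delete the key by itself after TOKEN_TTL seconds.
    client.set(TOKEN_KEY, token, ex=TOKEN_TTL)
    return token


# Possible usage, assuming the redis package and a reachable server:
#   import redis
#   client = redis.Redis(host="localhost", port=6379)
#   token = get_shared_token(client, request_token_from_paypal)
```

Two instances may still race and both request a token when the key has just expired; for a 90-minute token that is usually harmless, since either token remains valid.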
If installing other software is not an option, you could use a temporary file. However, because Python's tempfile doesn't seem to support directly setting a temporary file's name, you might have to handle file management manually. For example:
import os
import time
import tempfile

# 90 minutes in seconds. Setting this a little lower would
# probably be better to account for network latency.
MAX_AGE = 90 * 60

# /tmp/libname/ needs to exist for this to work; creating it
# if necessary shouldn't give you much trouble.
TOKEN_PATH = os.path.join(
    tempfile.gettempdir(),
    'libname',
    'paypal.token',
)

def get_paypal_token():
    token = None
    if os.path.isfile(TOKEN_PATH):
        token_age = time.time() - os.path.getmtime(TOKEN_PATH)
        if token_age < MAX_AGE:
            with open(TOKEN_PATH, 'r') as infile:
                # You might consider a test API call to establish token validity here.
                token = infile.read()
    if not token:
        # Get a token from the PayPal API and write it to TOKEN_PATH.
        token = 'dummy'
        with open(TOKEN_PATH, 'w') as outfile:
            outfile.write(token)
    return token
Depending on the environment, you would probably want to look into restricting permissions on this temp file. Regardless of how you persist the token, though, this code should be a useful example. I wouldn't be thrilled about sticking something like this on the file system, but if you already have the PayPal credentials used to request a token on disk, writing the token to temporary storage probably won't be a big deal.
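On the permissions point, here is one possible way to write the file so that only the owning user can read it. This is a sketch, not a hardened implementation: write_private is a hypothetical helper name, and it relies on tempfile.mkstemp creating files that are readable and writable only by the creating user, then renames the file into place so readers never see a half-written token.

```python
import os
import tempfile


def write_private(path, data):
    """Write data to path atomically, readable only by the current user."""
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)

    # mkstemp creates the file with owner-only read/write permissions.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as outfile:
            outfile.write(data)
        # Rename into place; the temp file lives in the same directory,
        # so this stays on one filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```

Because the target's modification time is that of the freshly written file, an age check based on os.path.getmtime keeps working with this helper.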

You could store the token as an environment variable.
import os
# Store token
os.environ['PAYPAL_API_TOKEN'] = <...>
# Retrieve token
token = os.environ['PAYPAL_API_TOKEN']
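Be aware of the security implications though: Other processes could read the token. Note also that a value assigned through os.environ is only visible to the current process and to child processes started afterwards, so on its own it does not share the token between instances on different servers.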

Related

How to make a large file accessible to external APIs?

I'm new to webdev, and I have this use case where a user sends a large file (e.g., a video file) to the API, and then this file needs to be accessible to other APIs (which could possibly be on different servers) for further processing.
I'm using FastAPI for the backend, defining a file parameter with a type of UploadFile to receive and store the files. But what would be the best way to make this file accessible to other APIs? Is there a way I can get a publicly accessible URL out of the saved file, which other APIs can use to download the file?
Returning a File Response
First, to return a file that is saved on disk from a FastAPI backend, you could use FileResponse (in case the file was already fully loaded into memory, see here). For example:
from fastapi import FastAPI
from fastapi.responses import FileResponse

some_file_path = "large-video-file.mp4"
app = FastAPI()

@app.get("/")
def main():
    return FileResponse(some_file_path)
In case the file is too large to fit into memory—as you may not have enough memory to handle the file data, e.g., if you have 16GB of RAM, you can’t load a 100GB file—you could use StreamingResponse. That way, you don't have to read it all first in memory, but, instead, load it into memory in chunks, thus processing the data one chunk at a time. Example is given below. If you find yield from f being rather slow when using StreamingResponse, you could instead create a custom generator, as described in this answer.
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

some_file_path = "large-video-file.mp4"
app = FastAPI()

@app.get("/")
def main():
    def iterfile():
        with open(some_file_path, mode="rb") as f:
            yield from f
    return StreamingResponse(iterfile(), media_type="video/mp4")
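A custom generator of the kind mentioned above could look roughly like the following. Iterating a binary file with yield from f splits it on newline bytes, which gives irregular chunk sizes for video data; reading fixed-size blocks is one alternative. The function name and the 1 MiB default are assumptions for illustration.

```python
def iter_file_chunks(path, chunk_size=1024 * 1024):
    """Yield the file's contents in blocks of at most chunk_size bytes."""
    with open(path, mode="rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            yield chunk


# Possible usage inside the route shown earlier:
#   return StreamingResponse(iter_file_chunks(some_file_path),
#                            media_type="video/mp4")
```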
Exposing the API to the public
As for exposing your API to the public—i.e., external APIs, users, developers, etc.—you can use ngrok (or expose, as suggested in this answer).
Ngrok is a cross-platform application that enables developers to expose a local development server to the Internet with minimal effort. To embed the ngrok agent into your FastAPI application, you could use pyngrok—as suggested here (see here for a FastAPI integration example). If you would like to run and expose your FastAPI app through Google Colab (using ngrok), instead of your local machine, please have a look at this answer (plenty of tutorials/examples can also be found on the web).
If you are looking for a more permanent solution, you may want to have a look at cloud platforms—more specifically, a Platform as a Service (PaaS)—such as Heroku. I would strongly recommend you thoroughly read FastAPI's Deployment documentation. Have a closer look at About HTTPS and Deployments Concepts.
Important to note
By exposing your API to the outside world, you are also exposing it to various forms of attack. Before exposing your API to the public—even if it’s for free—you need to make sure you are offering secure access (use HTTPS), as well as authentication (verify the identity of a user) and authorisation (verify their access rights; in other words, verify what specific routes, files and data a user has access to)—take a look at 1. OAuth2 and JWT tokens, 2. OAuth2 scopes, 3. Role-Based Access Control (RBAC), 4. Get Current User and How to Implement Role based Access Control With FastAPI.
Additionally, if you are exposing your API to be used publicly, you may want to limit the usage of the API because of expensive computation, limited resources, DDoS attacks, brute-force attacks, web scraping, or simply due to monthly cost for a fixed amount of requests. You can do that at the application level using, for instance, slowapi (related post here), or at the platform level by setting the rate limit through your hosting service (if permitted). Furthermore, you would need to make sure that the files uploaded by users have the permitted file extension, e.g., .mp4, and are not files with, for instance, a .exe extension that are potentially harmful to your system. Finally, you would also need to ensure that the uploaded files do not exceed a predefined MAX_FILE_SIZE limit (based on your needs and system's resources), so that authenticated users, or an attacker, would be prevented from uploading extremely large files that would consume server resources to the point where the application may crash. You shouldn't rely, though, on the Content-Length header being present in the request to do that, as this might be easily altered, or even removed, by the client. You should rather use an approach similar to this answer (have a look at the "Update" section) that uses request.stream() to process the incoming data in chunks as they arrive, instead of loading the entire file into memory first. By using a simple counter, e.g., total_len += len(chunk), you can check if the file size has exceeded the MAX_FILE_SIZE, and if so, raise an HTTPException with HTTP_413_REQUEST_ENTITY_TOO_LARGE status code (see this answer as well, for more details and code examples).
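The counter idea can be sketched independently of the framework. The coroutine below consumes any async iterator of byte chunks (request.stream() yields such chunks), writes each chunk to a file-like object as it arrives, and stops as soon as the running total passes the limit. FileTooLarge, save_limited and the 10 MiB limit are hypothetical names and values; inside a FastAPI route you would raise an HTTPException with a 413 status instead of the custom exception.

```python
import asyncio

MAX_FILE_SIZE = 10 * 1024 * 1024  # hypothetical limit: 10 MiB


class FileTooLarge(Exception):
    """Raised when the incoming data exceeds the allowed size."""


async def save_limited(chunks, out, max_size=MAX_FILE_SIZE):
    """Write chunks to out, aborting once max_size bytes are exceeded.

    chunks -- async iterator yielding bytes objects
    out    -- file-like object opened for binary writing
    Returns the total number of bytes written.
    """
    total_len = 0
    async for chunk in chunks:
        total_len += len(chunk)
        if total_len > max_size:
            raise FileTooLarge(f"upload exceeds {max_size} bytes")
        out.write(chunk)
    return total_len
```

Only one chunk is held in memory at a time, so the limit is enforced without trusting the Content-Length header.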
Read more on FastAPI's Security documentation and API Security on Cloudflare.

How can I configure a Flask API to allow for token-protected user-specific responses?

Say I have a server endpoint that does a simple task, such as initializing an API with a token that I generate client-side and then printing the user's account info.
Can I initialize() the API globally so the user can do other tasks after printing account info?
and
How does that affect other users calling initialize() and printing info if they do it at the same time?
I don't understand how this server works and any help would be great. Thank you!
If I'm understanding you correctly, it sounds like the easiest way to accomplish what you're trying to do is to use the flask-login module.
You will want to create an endpoint / Flask route (for example, '/login') that the user will send a POST request to with a username (and usually also a password). When the login is successful, the user's browser will have a cookie set with a token that will allow them to access Flask routes that have a @login_required decorator attached.
In your Python code for those routes, you will be able to access the "current_user", which will allow you to tailor your response to the particular user sending the request.
This way of doing things will allow you to deal with your first question, which seems to be about saving a user's identity across requests, and it will also allow you to deal with your second question, which seems to be about the problem of having different users accessing the same endpoint while expecting different responses.
Nathan's answer points you in the right direction, but I'll make a comment about state and scope because you asked about multiple users.
Firstly, the normal way to do this would be for the server to generate the random token and send it back to the user.
client request 1 > init > create token > store along with init data > return token to user
client request 2 > something(token) > find data related to token > return data to user
When you run Flask (and most webapps) you try to make it so that each request is independent. In your case, the user calls an endpoint and identifies themselves by passing a token. If you make a record of that token somewhere (normally in a database) then you can identify that user's data by looking up the same token they pass on subsequent requests.
Where you choose to store that information is important. As I say a database is the normal recommended approach as it's built to be accessed safely by multiple people at the same time.
You might be tempted to not do the database thing and actually store the token / data information in a global variable inside python. Here's the reason why that's (probably) not going to work:
Flask is a WSGI application, and how it behaves when up and running depends on how the WSGI server is configured. I generally use uWSGI with several processes. Each process has its own state, which the other processes can't see. In this (standard / common) configuration, you don't know which process is going to receive any given request.
request 1 > init(token) > process 1 > store token in global vars
request 2 > other(token) > process 2 > can't find token in global vars
That's why we use a database to store all shared information:
request 1 > init(token) > process 1 > store token in db
request 2 > other(token) > process 2 > can find token in db
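The flow above can be sketched with the standard-library sqlite3 module standing in for the database. The table layout and function names are assumptions for illustration; a real deployment would more likely use the project's existing database.

```python
import secrets
import sqlite3


def open_store(path):
    """Open the shared database file and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sessions ("
        "token TEXT PRIMARY KEY, data TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def init_session(conn, data):
    """request 1: create a random token, store it with the data, return it."""
    token = secrets.token_urlsafe(32)
    conn.execute(
        "INSERT INTO sessions (token, data) VALUES (?, ?)", (token, data)
    )
    conn.commit()
    return token


def find_session(conn, token):
    """request 2: look up the data belonging to a token, or None."""
    row = conn.execute(
        "SELECT data FROM sessions WHERE token = ?", (token,)
    ).fetchone()
    return row[0] if row else None
```

Each worker process opens its own connection to the same file, so a token stored by one process can be found by another.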

Is it possible to store authentication token in a separate file?

I am building a command-line tool in Python that interfaces with a RESTful API. The API uses OAuth2 for authentication. Rather than asking for an access_token every time the user runs the tool, can I store the access_token in some way so that I can use it for its whole lifespan? If so, how safe is that?
You can store the access token in a file on your user's desktop.
You can do so using a storage. Assuming you use oauth2client:
import oauth2client.file
from oauth2client import client

# Reading credentials
store = oauth2client.file.Storage(cred_path)
credentials = store.get()

# Writing credentials
creds = client.AccessTokenCredentials(access_token, user_agent)
creds.access_token = access_token
creds.refresh_token = refresh_token
creds.client_id = client_id
creds.client_secret = client_secret
# For some reason it does not save all the credentials,
# so write them to a json file manually instead
with open(cred_path, "w") as f:
    f.write(creds.to_json())
In terms of security, I would not see much of a threat here as these access tokens will be on a user's desktop. If someone wants to get their access token, they would need to have read access to that file for that time frame. However, if they can already do that, they most likely also can use your script to send them a copy of the user's access token every time it is authenticated. But take my word lightly as I'm not a professional in that area. See information security stack exchange.
A post in information security stack exchange did talk about this:
these tokens give access to some fairly privileged information about your users.
However, the question was addressed to a database instead.
In conclusion, you can keep it in a file. (But take my word with a grain of salt)
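If you would rather not depend on oauth2client, a minimal sketch using only the standard library is to keep the token and its expiry time in a small JSON file that only the current user can read. The function names, the file layout and the 60-second leeway are assumptions for illustration.

```python
import json
import os
import time


def save_token(path, access_token, expires_in):
    """Store the token together with the time at which it expires."""
    payload = {
        "access_token": access_token,
        "expires_at": time.time() + expires_in,
    }
    # Mode 0o600 (owner read/write only) applies when the file is created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        json.dump(payload, f)


def load_token(path, leeway=60):
    """Return the stored token, or None if it is missing or about to expire."""
    try:
        with open(path) as f:
            payload = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return None
    if payload.get("expires_at", 0) - leeway <= time.time():
        return None
    return payload.get("access_token")
```

The tool would call load_token first and only run the OAuth flow (followed by save_token) when it returns None.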
Do you want to store it on the service side or locally?
Since your tool interfaces with a RESTful API, which is stateless (no information is stored between requests to the API), you need to provide the access token every time your client calls any of the REST endpoints. I may be missing some details of your design, but access tokens should be used only for authorization, since a user who holds a token has already been authenticated. This is why tokens are valid only for a certain amount of time, usually 1 hour.
So you need to keep state either in a cookie (web interface) or by storing the token locally (which is what you meant). However, you should trigger the entire OAuth flow every time a user logs in to your client (authenticating the user and issuing a new auth token); otherwise you are not getting the benefits of OAuth.

Django, global variables and tokens

I'm using django to develop a website. On the server side, I need to transfer some data that must be processed on the second server (on a different machine). I then need a way to retrieve the processed data. I figured that the simplest would be to send back to the Django server a POST request, that would then be handled on a view dedicated for that job.
But I would like to add some minimum security to this process: When I transfer the data to the other machine, I want to join a randomly generated token to it. When I get the processed data back, I expect to also get back the same token, otherwise the request is ignored.
My problem is the following: How do I store the generated token on the Django server?
I could use a global variable, but I had the impression browsing here and there on the web, that global variables should not be used for safety reason (not that I understand why really).
I could store the token on disk/database, but it seems to be an unjustified waste of performance (even if in practice it would probably not change much).
Is there third solution, or a canonical way to do such a thing using Django?
You can store your token in the Django cache; in most cases it will be faster than database or disk storage.
Another approach is to use redis.
You can also calculate your token:
Save some random secret token in the settings of both servers.
Calculate the token from the current timestamp rounded to 10 seconds, for example using:
import hashlib
# secret_token must be bytes; the timestamp is encoded to bytes as well
token = hashlib.sha1(secret_token)
token.update(str(rounded_timestamp).encode())
token = token.hexdigest()
If the token generated on the remote server when POSTing the request matches the token generated on the local server when receiving it, the request is valid and can be processed.
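A slightly fuller sketch of the calculated-token idea follows. It swaps the plain SHA-1 concatenation for HMAC-SHA256 from the standard library, and it also accepts the token from the previous time window, since the two servers will rarely compute it within the same 10 seconds. The function names and the window handling are assumptions, not something Django provides.

```python
import hashlib
import hmac
import time

WINDOW = 10  # seconds; timestamps are rounded down to this granularity


def make_token(secret, now=None, window=WINDOW):
    """Derive a token from the shared secret (bytes) and the rounded time."""
    if now is None:
        now = time.time()
    rounded = int(now // window) * window
    return hmac.new(secret, str(rounded).encode(), hashlib.sha256).hexdigest()


def check_token(secret, token, now=None, window=WINDOW):
    """Accept a token made in the current or the previous window."""
    if now is None:
        now = time.time()
    for offset in (0, -window):
        expected = make_token(secret, now + offset, window)
        # compare_digest avoids leaking information through timing.
        if hmac.compare_digest(expected.encode(), token.encode()):
            return True
    return False
```

If the remote processing takes longer than a couple of windows, the window would need to be widened accordingly, which also widens the period during which a captured token can be replayed.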
The simple obvious solution would be to store the token in your database. Other possible solutions are Redis or something similar. Finally, you can have a look at distributed async tasks queues like Celery...

demystify Flask app.secret_key

If app.secret_key isn't set, Flask will not allow you to set or access the session dictionary.
This is all that the flask user guide has to say on the subject.
I am very new to web development and I have no idea how/why any security stuff works. I would like to understand what Flask is doing under the hood.
Why does Flask force us to set this secret_key property?
How does Flask use the secret_key property?
The answer below pertains primarily to Signed Cookies, an implementation of the concept of sessions (as used in web applications). Flask offers both, normal (unsigned) cookies (via request.cookies and response.set_cookie()) and signed cookies (via flask.session). The answer has two parts: the first describes how a Signed Cookie is generated, and the second is presented as a series of Question/Answer that address different aspects of the scheme. The syntax used for the examples is Python3, but the concepts apply also to previous versions.
What is SECRET_KEY (or how to create a Signed Cookie)?
Signing cookies is a preventive measure against cookie tampering. During the process of signing a cookie, the SECRET_KEY is used in a way similar to how a "salt" would be used to muddle a password before hashing it. Here's a (widely) simplified description of the concept. The code in the examples is meant to be illustrative. Many of the steps have been omitted and not all of the functions actually exist. The goal here is to provide a general understanding of the main idea, but practical implementations will likely be a bit more involved. Also, keep in mind that Flask already provides most of this for you in the background. So, besides setting values to your cookie (via the session API) and providing a SECRET_KEY, it's not only ill-advised to re-implement this yourself, but there's no need to do so:
A poor man's cookie signature
Before sending a Response to the browser:
( 1 ) First a SECRET_KEY is established. It should only be known to the application and should be kept relatively constant during the application's life cycle, including through application restarts.
# choose a salt, a secret string of bytes
>>> SECRET_KEY = 'my super secret key'.encode('utf8')
( 2 ) create a cookie
>>> cookie = make_cookie(
...     name='_profile',
...     content='uid=382|membership=regular',
...     ...
...     expires='July 1 2030...'
... )
>>> print(cookie)
name: _profile
content: uid=382|membership=regular...
...
...
expires: July 1 2030, 1:20:40 AM UTC
( 3 ) to create a signature, append (or prepend) the SECRET_KEY to the cookie byte string, then generate a hash from that combination.
# encode and salt the cookie, then hash the result
>>> cookie_bytes = str(cookie).encode('utf8')
>>> signature = sha1(cookie_bytes+SECRET_KEY).hexdigest()
>>> print(signature)
7ae0e9e033b5fa53aa....
( 4 ) Now affix the signature at one end of the content field of the original cookie.
# include signature as part of the cookie
>>> cookie.content = cookie.content + '|' + signature
>>> print(cookie)
name: _profile
content: uid=382|membership=regular|7ae0e9... <--- signature
domain: .example.com
path: /
send for: Encrypted connections only
expires: July 1 2030, 1:20:40 AM UTC
and that's what is sent to the client.
# add cookie to response
>>> response.set_cookie(cookie)
# send to browser -->
Upon receiving the cookie from the browser:
( 5 ) When the browser returns this cookie back to the server, strip the signature from the cookie's content field to get back the original cookie.
# Upon receiving the cookie from browser
>>> cookie = request.get_cookie()
# pop the signature out of the cookie
>>> (cookie.content, popped_signature) = cookie.content.rsplit('|', 1)
( 6 ) Use the original cookie with the application's SECRET_KEY to recalculate the signature using the same method as in step 3.
# recalculate signature using SECRET_KEY and original cookie
>>> cookie_bytes = str(cookie).encode('utf8')
>>> calculated_signature = sha1(cookie_bytes+SECRET_KEY).hexdigest()
( 7 ) Compare the calculated result with the signature previously popped out of the just received cookie. If they match, we know that the cookie has not been messed with. But if even just a space has been added to the cookie, the signatures won't match.
# if both signatures match, your cookie has not been modified
>>> good_cookie = popped_signature==calculated_signature
( 8 ) If they don't match then you may respond with any number of actions, log the event, discard the cookie, issue a fresh one, redirect to a login page, etc.
>>> if not good_cookie:
...     security_log(cookie)
Hash-based Message Authentication Code (HMAC)
The type of signature generated above that requires a secret key to ensure the integrity of some contents is called in cryptography a Message Authentication Code or MAC.
I specified earlier that the example above is an oversimplification of that concept and that it wasn't a good idea to implement your own signing. That's because the algorithm used to sign cookies in Flask is called HMAC and is a bit more involved than the above simple step-by-step. The general idea is the same, but due to reasons beyond the scope of this discussion, the series of computations are a tad bit more complex.
If you're still interested in crafting a DIY, as it's usually the case, Python has some modules to help you get started :) here's a starting block:
import hmac
import hashlib

def create_signature(secret_key, msg, digestmod=None):
    if digestmod is None:
        digestmod = hashlib.sha1
    mac = hmac.new(secret_key, msg=msg, digestmod=digestmod)
    return mac.digest()
The documentation for hmac and hashlib.
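To tie the eight steps together, here is a small round trip that signs a content string and verifies it again using the standard-library hmac module. It is still an illustration of the idea rather than what Flask does internally: the function names, the '|' separator and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
import hmac

SECRET_KEY = b"my super secret key"  # illustration only, never hard-code a real key


def sign(content, key=SECRET_KEY):
    """Return the content with its HMAC signature appended after a '|'."""
    signature = hmac.new(key, content.encode("utf8"), hashlib.sha256).hexdigest()
    return content + "|" + signature


def unsign(signed, key=SECRET_KEY):
    """Return the original content if the signature checks out, else None."""
    content, _, signature = signed.rpartition("|")
    expected = hmac.new(key, content.encode("utf8"), hashlib.sha256).hexdigest()
    # Constant-time comparison instead of ==, to avoid timing side channels.
    if not hmac.compare_digest(signature.encode("utf8"), expected.encode("utf8")):
        return None
    return content
```

Changing any character of the content, or verifying with a different key, makes unsign return None.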
The "Demystification" of SECRET_KEY :)
What's a "signature" in this context?
It's a method to ensure that some content has not been modified by anyone other than a person or an entity authorized to do so.
One of the simplest forms of signature is the "checksum", which simply verifies that two pieces of data are the same. For example, when installing software from source it's important to first confirm that your copy of the source code is identical to the author's. A common approach to do this is to run the source through a cryptographic hash function and compare the output with the checksum published on the project's home page.
Let's say for instance that you're about to download a project's source in a gzipped file from a web mirror. The SHA1 checksum published on the project's web page is 'eb84e8da7ca23e9f83....'
# so you get the code from the mirror
download https://mirror.example-codedump.com/source_code.tar.gz
# you calculate the hash as instructed
sha1(source_code.tar.gz)
> eb84e8da7c....
If both hashes are the same, you know that you have an identical copy.
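In Python, the checksum step could be sketched with hashlib as follows. The helper name is hypothetical; it reads the file in blocks so that large archives do not have to fit in memory, and the algorithm defaults to SHA-1 only to match the example above.

```python
import hashlib


def file_checksum(path, algorithm="sha1", chunk_size=65536):
    """Return the hex digest of the file at path."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Possible usage: compare against the value published by the project.
#   file_checksum("source_code.tar.gz") == published_checksum
```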
What's a cookie?
An extensive discussion on cookies would go beyond the scope of this question. I provide an overview here since a minimal understanding can be useful to have a better understanding of how and why SECRET_KEY is useful. I highly encourage you to follow up with some personal readings on HTTP Cookies.
A common practice in web applications is to use the client (web browser) as a lightweight cache. Cookies are one implementation of this practice. A cookie is typically some data added by the server to an HTTP response by way of its headers. It's kept by the browser which subsequently sends it back to the server when issuing requests, also by way of HTTP headers. The data contained in a cookie can be used to emulate what's called statefulness, the illusion that the server is maintaining an ongoing connection with the client. Only, in this case, instead of a wire to keep the connection "alive", you simply have snapshots of the state of the application after it has handled a client's request. These snapshots are carried back and forth between client and server. Upon receiving a request, the server first reads the content of the cookie to reestablish the context of its conversation with the client. It then handles the request within that context and before returning the response to the client, updates the cookie. The illusion of an ongoing session is thus maintained.
What does a cookie look like?
A typical cookie would look like this:
name: _profile
content: uid=382|status=genie
domain: .example.com
path: /
send for: Encrypted connections only
expires: July 1 2030, 1:20:40 AM UTC
Cookies are trivial to peruse from any modern browser. On Firefox for example go to Preferences > Privacy > History > remove individual cookies.
The content field is the most relevant to the application. Other fields carry mostly meta instructions to specify various scopes of influence.
Why use cookies at all?
The short answer is performance. Using cookies minimizes the need to look things up in various data stores (memory caches, files, databases, etc.), thus speeding things up on the server application's side. Keep in mind that the bigger the cookie, the heavier the payload over the network, so what you save in database lookups on the server you might lose over the network. Consider carefully what to include in your cookies.
Why would cookies need to be signed?
Cookies are used to keep all sorts of information, some of which can be very sensitive. They're also by nature not safe and require that a number of auxiliary precautions be taken to be considered secure in any way for both parties, client and server. Signing cookies specifically addresses the problem that they can be tinkered with in attempts to fool server applications. There are other measures to mitigate other types of vulnerabilities, I encourage you to read up more on cookies.
How can a cookie be tampered with?
Cookies reside on the client in text form and can be edited with no effort. A cookie received by your server application could have been modified for a number of reasons, some of which may not be innocent. Imagine a web application that keeps permission information about its users on cookies and grants privileges based on that information. If the cookie is not tinker-proof, anyone could modify theirs to elevate their status from "role=visitor" to "role=admin" and the application would be none the wiser.
Why is a SECRET_KEY necessary to sign cookies?
Verifying cookies is a tad bit different than verifying source code the way it's described earlier. In the case of the source code, the original author is the trustee and owner of the reference fingerprint (the checksum), which will be kept public. What you don't trust is the source code, but you trust the public signature. So to verify your copy of the source you simply want your calculated hash to match the public hash.
In the case of a cookie however the application doesn't keep track of the signature, it keeps track of its SECRET_KEY. The SECRET_KEY is the reference fingerprint. Cookies travel with a signature that they claim to be legit. Legitimacy here means that the signature was issued by the owner of the cookie, that is the application, and in this case, it's that claim that you don't trust and you need to check the signature for validity. To do that you need to include an element in the signature that is only known to you, that's the SECRET_KEY. Someone may change a cookie, but since they don't have the secret ingredient to properly calculate a valid signature they cannot spoof it. As stated a bit earlier this type of fingerprinting, where on top of the checksum one also provides a secret key, is called a Message Authentication Code.
What about Sessions?
Sessions in their classical implementation are cookies that carry only an ID in the content field, the session_id. The purpose of sessions is exactly the same as signed cookies, i.e. to prevent cookie tampering. Classical sessions have a different approach though. Upon receiving a session cookie the server uses the ID to look up the session data in its own local storage, which could be a database, a file, or sometimes a cache in memory. The session cookie is typically set to expire when the browser is closed. Because of the local storage lookup step, this implementation of sessions typically incurs a performance hit. Signed cookies are becoming a preferred alternative and that's how Flask's sessions are implemented. In other words, Flask sessions are signed cookies, and to use signed cookies in Flask just use its Session API.
Why not also encrypt the cookies?
Sometimes the contents of cookies can be encrypted before also being signed. This is done if they're deemed too sensitive to be visible from the browser (encryption hides the contents). Simply signing cookies however, addresses a different need, one where there's a desire to maintain a degree of visibility and usability to cookies on the browser, while preventing that they'd be meddled with.
What happens if I change the SECRET_KEY?
By changing the SECRET_KEY you're invalidating all cookies signed with the previous key. When the application receives a request with a cookie that was signed with a previous SECRET_KEY, it will try to calculate the signature with the new SECRET_KEY, and both signatures won't match, this cookie and all its data will be rejected, it will be as if the browser is connecting to the server for the first time. Users will be logged out and their old cookie will be forgotten, along with anything stored inside. Note that this is different from the way an expired cookie is handled. An expired cookie may have its lease extended if its signature checks out. An invalid signature just implies a plain invalid cookie.
So unless you want to invalidate all signed cookies, try to keep the SECRET_KEY the same for extended periods.
What's a good SECRET_KEY?
A secret key should be hard to guess. The documentation on Sessions has a good recipe for random key generation:
>>> import os
>>> os.urandom(24)
b'\xfd{H\xe5<\x95\xf9\xe3\x96.5\xd1\x01O<!\xd5\xa2\xa0\x9fR"\xa1\xa8'
You copy the key and paste it in your configuration file as the value of SECRET_KEY.
Short of using a key that was randomly generated, you could use a complex assortment of words, numbers, and symbols, perhaps arranged in a sentence known only to you, encoded in byte form.
Do not set the SECRET_KEY directly with a function that generates a different key each time it's called. For example, don't do this:
# this is not good
SECRET_KEY = random_key_generator()
Each time your application is restarted it will be given a new key, thus invalidating the previous.
Instead, open an interactive python shell and call the function to generate the key, then copy and paste it to the config.
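As one possible variation, the key can be generated once with the standard-library secrets module (Python 3.6+) and then supplied to the application through its environment instead of being pasted into source control. The environment variable name APP_SECRET_KEY and both function names are hypothetical.

```python
import os
import secrets


def generate_secret_key(nbytes=32):
    """Return a random key as a hex string (two characters per byte)."""
    return secrets.token_hex(nbytes)


def load_secret_key(env_var="APP_SECRET_KEY"):
    """Read the key from the environment; fail loudly if it is missing."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    return key


if __name__ == "__main__":
    # Run once, then store the printed value in your configuration.
    print(generate_secret_key())
```

The application would then set its SECRET_KEY from load_secret_key() at startup, so the key stays the same across restarts.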
Anything that requires encryption (for safe-keeping against tampering by attackers) requires the secret key to be set. For just Flask itself, that 'anything' is the Session object, but other extensions can make use of the same secret.
secret_key is merely the value set for the SECRET_KEY configuration key, or you can set it directly.
The Sessions section in the Quickstart has good, sane advice on what kind of server-side secret you should set.
Encryption relies on secrets; if you didn't set a server-side secret for the encryption to use, everyone would be able to break your encryption; it's like the password to your computer. The secret plus the data-to-sign are used to create a signature string, a hard-to-recreate value using a cryptographic hashing algorithm; only if you have the exact same secret and the original data can you recreate this value, letting Flask detect if anything has been altered without permission. Since the secret is never included with data Flask sends to the client, a client cannot tamper with session data and hope to produce a new, valid signature.
Flask uses the itsdangerous library to do all the hard work; sessions use the itsdangerous.URLSafeTimedSerializer class with a customized JSON serializer.
