Recommended approach to externalize "secret" configuration in app engine - python

I am planning to use a JWT implementation on app engine (python, if it makes a difference) which will require me to have a secret string to have data signed. At some point I would like to have the source (configuration files included) available in a public repository. What is the best approach to externalize that secret string, without making the value public too?
There are 3 options I can think of, but none seem great:
1. Set an environment variable in the GAE console. (doesn't exist)
2. Keep a separate repo of private things and mix them in with a script at deploy time. (seems clunky)
3. Create my own "environment var" entity and keep the data in the datastore. (I don't see a console screen to manually put data into the datastore)
Right now option #3 seems the most reasonable. Is there a better or recommended approach to what I'm trying to do here?

You really want to separate the secret from the data. If you are signing the data, I would recommend #1 (or a hybrid of it). You can define an env_variables section in your app.yaml: https://cloud.google.com/appengine/docs/python/config/appconfig#Python_app_yaml_Defining_environment_variables
env_variables:
APPLICATION_SECRET: 'secret_goes_here'
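Reading the value back at runtime is then a plain environment lookup. A minimal sketch (the variable name matches the app.yaml snippet above; the helper function is illustrative):

```python
import os

def get_secret(name='APPLICATION_SECRET'):
    """Fetch a secret injected via the env_variables section of app.yaml."""
    value = os.environ.get(name)
    if value is None:
        # Fail fast rather than silently signing with an empty secret.
        raise RuntimeError('%s is not set; check app.yaml' % name)
    return value
```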
You'll then want to prohibit code downloads for the application as an extra step to protect your secret: https://cloud.google.com/appengine/docs/python/tools/uploadinganapp#Python_Downloading_source_code.
Finally, you could use a method similar to the client_secrets.json that was done for the Google APIs client library if you really don't want to use environment variables: https://developers.google.com/api-client-library/python/guide/aaa_client_secrets

Related

How to secure passwords and secret keys in Django settings file

A django settings file includes sensitive information such as the secret_key, password for database access etc which is unsafe to keep hard-coded in the setting file. I have come across various suggestions as to how this information can be stored in a more secure way including putting it into environment variables or separate configuration files. The bottom line seems to be that this protects the keys from version control (in addition to added convenience when using in different environments) but that in a compromised system this information can still be accessed by a hacker.
Is there any extra benefit from a security perspective if sensitive settings are kept in a data vault / password manager and then retrieved at run-time when settings are loaded?
For example, to include in the settings.py file (when using pass):
import subprocess
SECRET_KEY = subprocess.check_output("pass SECRET_KEY", shell=True).strip().decode("utf-8")
This spawns a new shell process and returns output to Django. Is this more secure than setting through environment variables?
I think a data vault/password manager solution just transfers the responsibility; the risk is still there. When deploying Django in production, the server should be treated with the same care as a data vault: firewall in place, fail2ban running, OS up to date, and so on. Then, in my opinion, there is nothing wrong or less secure about having a settings.py that uses a config parser to read a config.ini file (listed in your .gitignore!) where all your sensitive information lives.
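The config-parser approach the answer describes might look like this sketch (the section and key names in the assumed config.ini are illustrative):

```python
import configparser

def load_secrets(path='config.ini'):
    """Read sensitive settings from an ini file kept out of version control.

    The file (listed in .gitignore) is expected to look like:
        [django]
        secret_key = change-me
        db_password = change-me-too
    """
    parser = configparser.ConfigParser()
    if not parser.read(path):
        # parser.read returns the list of files it managed to parse.
        raise RuntimeError('could not read %s' % path)
    return dict(parser['django'])
```

In settings.py you would then do something like `SECRET_KEY = load_secrets()['secret_key']`.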

Storing REST API credentials safely for access in a Python environment

I have a number of REST APIs from a software program (ex: Tradingview)
I would like to store the API credentials (e.g. keys, secrets) safely.
I had thought about placing them in a Database table - but - I am not totally fond of placing clear text in a table.
I already know about using OS Environment Variables:
[... snip ...]
import os
import sys
import logging
[... snip ...]
LD_API_KEY = os.getenv("BINANCE_APIKEY")
LD_API_SECRET = os.getenv("BINANCE_API_SECRET")
where keys are stored in a file - but - as mentioned before, I have a number of API keys.
Just leaving them on a server (in clear text) - even though the file is hidden - is not sitting well with me.
Is there any other way to store API Keys?
There are a number of articles on this topic, a quick web search for "Storing API keys" will net you some interesting and informative reads, so I'll just talk about my experience here.
Really, it's all up to preference, the requirements of your project, and the level of security you need. I personally have run through a few solutions. Here's how my project has evolved over time.
Each key stored in environment variables
Simple enough, just had to use os.environ for every key. This very quickly became a management headache, especially when deploying to multiple environments, or setting up an environment for a new project contributor.
All keys stored in a local file
This started as just a file outside source control, with an environment variable pointing to the file. I started with a simple JSON file in the following structure (the date field is there to track key rotations):
[
  {
    "name": "Service Name",
    "date": "1970-01-01",
    "key": "1234abcd",
    "secret_key": "abcd1234"
  }
]
This evolved into a class that accessed this file for me and returned the desired key so I didn't have to repeat json.load() or import os in every script that accessed APIs. This got a little more complex when I started needing to store OAuth tokens.
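A minimal version of such a class might look like the following (field names follow the JSON structure above; the KEYSTORE_PATH variable name is illustrative):

```python
import json
import os

class KeyStore(object):
    """Looks up API credentials from a JSON file kept outside source control.

    The file path comes from an environment variable so each environment
    (dev machine, CI, production) can point at its own copy.
    """

    def __init__(self, path=None):
        path = path or os.environ['KEYSTORE_PATH']
        with open(path) as f:
            # Index entries by service name for quick lookup.
            self._entries = {entry['name']: entry for entry in json.load(f)}

    def credentials(self, service):
        entry = self._entries[service]
        return entry['key'], entry['secret_key']
```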
I eventually moved this file to a private, encrypted (git-secret), local-only git repo so team members could also use the keys in their environments.
Use a secret management service
The push to remote work forced me to create a system for remote API key access and management. My team debated a number of solutions, but we eventually fell on AWS Secrets Manager. The aforementioned custom class was pointed at AWS instead of a local file, and we gained a significant increase in security and flexibility over the local-only solution.
There are a number of cloud-based secret management solutions, but my project is already heavily AWS-integrated, so this made the most sense given the costs and constraints. It also means that each team member now only needs AWS permissions and uses their account's AWS API key for access.

Why is App Engine Returning the Wrong Application ID?

The App Engine Dev Server documentation says the following:
The development server simulates the production App Engine service. One way in which it does this is to prepend a string (dev~) to the APPLICATION_ID environment variable. Google recommends always getting the application ID using get_application_id.
In my application, I use different resources locally than I do on production. As such, I have the following for when I startup the App Engine instance:
import logging
from google.appengine.api.app_identity import app_identity
# ...
# other imports
# ...
DEV_IDENTIFIER = 'dev~'
application_id = app_identity.get_application_id()
is_development = DEV_IDENTIFIER in application_id
logging.info("The application ID is '%s'", application_id)
if is_development:
    logging.warning("Using development configuration")
    # ...
    # set up application for development
    # ...
# ...
Nevertheless, when I start my local dev server via the command line with dev_appserver.py app.yaml, I get the following output in my console:
INFO: The application ID is 'development-application'
WARNING: Using development configuration
Evidently, the dev~ identifier that the documentation claims will be prepended to my application ID is absent. I have also tried to use the App Engine Launcher UI to see if that changed anything, but it did not.
Note that 'development-application' is the name of my actual application, but I expected it to be 'dev~development-application'.
Google recommends always getting the application ID using get_application_id
But, that's if you cared about the application ID -- you don't: you care about the partition. Check out the source -- it's published at https://code.google.com/p/googleappengine/source/browse/trunk/python/google/appengine/api/app_identity/app_identity.py .
get_application_id uses os.getenv('APPLICATION_ID'), then passes that to the internal function _ParseFullAppId -- which splits it on _PARTITION_SEPARATOR = '~' (thus stripping off again the dev~ prefix that dev_appserver.py prepended to the environment variable). The piece before the '~' comes back as the "partition", which get_application_id ignores, returning only the application ID in the strict sense.
Unfortunately, there is no architected way to get just the partition (which is in fact all you care about).
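The parsing the answer describes amounts to roughly this (a simplified sketch, not the actual SDK source):

```python
_PARTITION_SEPARATOR = '~'

def parse_full_app_id(app_id):
    """Split 'dev~my-app' into ('dev', 'my-app'); no separator means no partition."""
    if _PARTITION_SEPARATOR in app_id:
        partition, app_id = app_id.split(_PARTITION_SEPARATOR, 1)
    else:
        partition = ''
    return partition, app_id
```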
I would recommend that, to distinguish whether you're running locally or "in production" (i.e, on Google's servers at appspot.com), in order to access different resources in each case, you take inspiration from the way Google's own example does it -- specifically, check out the app.py example at https://cloud.google.com/appengine/docs/python/cloud-sql/#Python_Using_a_local_MySQL_instance_during_development .
In that example, the point is to access a Cloud SQL instance if you're running in production, but a local MySQL instance instead if you're running locally. But that's secondary -- let's focus instead on, how does Google's own example tell which is the case? The relevant code is...:
if (os.getenv('SERVER_SOFTWARE') and
        os.getenv('SERVER_SOFTWARE').startswith('Google App Engine/')):
    ...snipped: what to do if you're in production!...
else:
    ...snipped: what to do if you're in the local server!...
So, this is the test I'd recommend you use.
Well, as a Python guru, I'm actually slightly embarrassed that my colleagues are using this slightly-inferior Python code (with two calls to os.getenv) -- me, I'd code it as follows...:
in_prod = os.getenv('SERVER_SOFTWARE', '').startswith('Google App Engine/')
if in_prod:
    ...whatever you want to do if we're in production...
else:
    ...whatever you want to do if we're in the local server...
but, this is exactly the same semantics, just expressed in more elegant Python (exploiting the second optional argument to os.getenv to supply a default value).
I'll be trying to get this small Python improvement into that example and to also place it in the doc page you were using (there's no reason anybody just needing to find out if their app is being run in prod or locally should ever have looked at the docs about Cloud SQL use -- so, this is a documentation goof on our part, and, I apologize). But, while I'm working to get our docs improved, I hope this SO answer is enough to let you proceed confidently.
That documentation seems wrong, when I run the commands locally it just spits out the name from app.yaml.
That being said, we use
import os
os.getenv('SERVER_SOFTWARE', '').startswith('Dev')
to check if it is the dev appserver.
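Putting the two answers together, the check can be wrapped in a single helper (the SERVER_SOFTWARE prefixes are the ones quoted in the answers above; the function name is illustrative):

```python
import os

def runtime_environment():
    """Classify where the app is running, based on SERVER_SOFTWARE."""
    server = os.getenv('SERVER_SOFTWARE', '')
    if server.startswith('Google App Engine/'):
        return 'production'
    if server.startswith('Dev'):
        return 'development'
    return 'unknown'
```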

Python App Engine debug/dev mode

I'm working on an App Engine project (Python) where we'd like to make certain changes to the app's behavior when debugging/developing (most often locally). For example, when debugging, we'd like to disable our rate-limiting decorators, turn on the debug param in the WSGIApplication, maybe add some asserts.
As far as I can tell, App Engine doesn't naturally have any concept of a global dev-mode or debug-mode, so I'm wondering how best to implement such a mode. The options I've been able to come up with so far:
Use google.appengine.api.app_identity.get_default_version_hostname() to get the hostname and check if it begins with localhost. This seems... unreliable, and doesn't allow for using the debug mode in a deployed app instance.
Use os.environ.get('APPLICATION_ID') to get the application id, which according to this page is automatically prepended with dev~ by the development server. Worryingly, the very source of this information is in a box warning:
Do not get the App ID from the environment variable. The development
server simulates the production App Engine service. One way in which
it does this is to prepend a string (dev~) to the APPLICATION_ID
environment variable, which is similar to the string prepended in
production for applications using the High Replication Datastore. You
can modify this behavior with the --default_partition flag, choosing a
value of "" to match the master-slave option in production. Google
recommends always getting the application ID using get_application_id,
as described above.
Not sure if this is an acceptable use of the environment variable. Either way it's probably equally hacky, and suffers the same problem of only working with a locally running instance.
Use a custom app-id for development (locally and deployed), use the -A flag in dev_appserver.py, and use google.appengine.api.app_identity.get_application_id() in the code. I don't like this for a number of reasons (namely having to have two separate app engine projects).
Use a dev app engine version for development and detect with os.environ.get('CURRENT_VERSION_ID').split('.')[0] in code. When deployed this is easy, but I'm not sure how to make dev_appserver.py use a custom version without modifying app.yaml. I suppose I could sed app.yaml to a temp file in /tmp/ with the version replaced and the relative paths resolved (or just create a persistent dev-app.yaml), then pass that into dev_appserver.py. But that seems also kinda dirty and prone to error/sync issues.
Am I missing any other approaches? Any considerations I failed to acknowledge? Any other advice?
In regards to "detecting" localhost development we use the following in our applications settings / config file.
IS_DEV_APPSERVER = 'development' in os.environ.get('SERVER_SOFTWARE', '').lower()
That used in conjunction with the debug flag should do the trick for you.
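For instance, the rate-limiting decorators the question mentions could key off that flag and become no-ops locally. A sketch (the decorator and its enforcement logic are hypothetical, not from the original post):

```python
import functools
import os

IS_DEV_APPSERVER = 'development' in os.environ.get('SERVER_SOFTWARE', '').lower()

def rate_limited(func):
    """Hypothetical rate-limiting decorator that is a no-op on the dev server."""
    if IS_DEV_APPSERVER:
        return func

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # ... enforce the production rate limit here ...
        return func(*args, **kwargs)
    return wrapper
```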

Password storage (for a set of scripts)

I have a system (actually it is a set of shell scripts) which has a lot of instances on different servers in different test stages (dev, uat, prd). Scripts need use some passwords for authorization in for example database (btw each environment has its own passwords).
I have a deployment system, therefore I'm able to hold passwords in repository to not to update them each time manually.
But it's completely unacceptable from security point of view to store them as plain text.
I could develop a solution myself using gpg (to hold each password in gpg encrypted file with pub certificate of target environment), but I'm not sure it's the best way.
Is there any existing opensource solutions for password storage which are better than own solution with gpg?
It seems you are looking for Password Store (pass). You could also have a look at Vault 0.2.
PyCrypto.Blowfish should be very nice for that purpose:
https://www.dlitz.net/software/pycrypto/api/current/Crypto.Cipher.Blowfish.BlowfishCipher-class.html
Although you'd obviously have to specify the key manually on each startup of your "password server". PyCrypto is a well-known and mature library for this kind of thing, and should do what you are looking for.
