Testing with LocalStack in a dockerized Python environment

When developing a dockerized AWS application locally, it's common practice to simulate Amazon's services using LocalStack. One way to get the Python application talking to LocalStack at test time is to monkeypatch the boto3 Client and ServiceResource, and use the "link" feature in the docker-compose file.
Unfortunately the Docker Compose reference manual advises against this. It seems that Docker might remove the link feature. Instead, they recommend using the internal networks feature of a docker-compose file. This means that instead of accessing the LocalStack services via localhost (e.g. http://localhost:4566), it will be via something like http://localstack:4566, provided the service name is "localstack".
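For reference, a minimal docker-compose sketch of that recommended setup might look like the following (the service name "localstack" matches the hostname above, but the image tag and network name are illustrative assumptions):

version: "3"
services:
  app:
    build: .
    depends_on:
      - localstack    # tests inside "app" reach LocalStack at http://localstack:4566
    networks:
      - backend
  localstack:
    image: localstack/localstack
    networks:
      - backend
networks:
  backend: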
Is there a way to change the monkey-patch configuration so that this works? This is the standard monkey-patch code:
import boto3
import localstack_client.session
import pytest

@pytest.fixture(autouse=True)
def boto3_localstack_patch(monkeypatch):
    session_ls = localstack_client.session.Session()
    monkeypatch.setattr(boto3, "client", session_ls.client)
    monkeypatch.setattr(boto3, "resource", session_ls.resource)
There's no obvious way to indicate that the tests ought to use a different hostname, so how do I do this?

Found an answer:
import boto3
import localstack_client.session
import pytest

@pytest.fixture(autouse=True)
def boto3_localstack_patch(monkeypatch):
    session_ls = localstack_client.session.Session(localstack_host="localstack")
    monkeypatch.setattr(boto3, "client", session_ls.client)
    monkeypatch.setattr(boto3, "resource", session_ls.resource)
The localstack_client.session.Session() constructor takes a localstack_host argument, which can be used to specify how to connect to LocalStack when it is not on localhost.
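With that fixture in place, tests can keep using plain boto3 calls, which are transparently routed to the LocalStack container; a minimal sketch (the queue name is illustrative):

import boto3

def test_sqs_roundtrip():
    # boto3.client is patched by the autouse fixture above,
    # so this talks to the LocalStack container, not to AWS.
    sqs = boto3.client("sqs")
    queue = sqs.create_queue(QueueName="example-queue")
    sqs.send_message(QueueUrl=queue["QueueUrl"], MessageBody="hello")
    received = sqs.receive_message(QueueUrl=queue["QueueUrl"])
    assert received["Messages"][0]["Body"] == "hello"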

Related

Discovering peer instances in Azure Virtual Machine Scale Set

Problem: Given N instances launched as part of VMSS, I would like my application code on each azure instance to discover the IP address of the other peer instances. How do I do this?
The overall intent is to cluster the instances so as to provide active-passive HA or to keep the configuration in sync.
It seems there is some support for REST-API-based querying: https://learn.microsoft.com/en-us/rest/api/virtualmachinescalesets/
I would like to know any other way to do it, i.e. either a Python SDK or an instance metadata URL, etc.
The REST API you mentioned has a Python SDK: the "azure-mgmt-compute" client.
https://learn.microsoft.com/python/api/azure.mgmt.compute.compute.computemanagementclient
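As a rough sketch of that SDK approach (the NIC listing actually lives in the companion azure-mgmt-network package; the resource group and scale set names are placeholders, and the exact credential/client classes vary between SDK versions), something like this lists the private IP of every instance in a scale set:

from azure.common.credentials import ServicePrincipalCredentials
from azure.mgmt.network import NetworkManagementClient

credentials = ServicePrincipalCredentials(
    client_id='...', secret='...', tenant='...')
network_client = NetworkManagementClient(credentials, '<subscription-id>')

# Every VMSS instance has a NIC; list them all and print each private IP.
nics = network_client.network_interfaces.list_virtual_machine_scale_set_network_interfaces(
    '<resource-group>', '<scale-set-name>')
for nic in nics:
    for ip_config in nic.ip_configurations:
        print(ip_config.private_ip_address)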
One way to do this would be to use instance metadata. Right now instance metadata only shows information about the VM it's running on, e.g.
curl -H Metadata:true "http://169.254.169.254/metadata/instance/compute?api-version=2017-03-01"
{"compute":
{"location":"westcentralus","name":"imdsvmss_0","offer":"UbuntuServer","osType":"Linux","platformFaultDomain":"0","platformUpdateDomain":"0",
"publisher":"Canonical","sku":"16.04-LTS","version":"16.04.201703300","vmId":"e850e4fa-0fcf-423b-9aed-6095228c0bfc","vmSize":"Standard_D1_V2"},
"network":{"interface":[{"ipv4":{"ipaddress":[{"ipaddress":"10.0.0.4","publicip":"52.161.25.104"}],"subnet":[{"address":"10.0.0.0","dnsservers":[],"prefix":"24"}]},
"ipv6":{"ipaddress":[]},"mac":"000D3AF8BECE"}]}}
You could do something like have each VM send the info to a listener on VM#0 or to an external service, or you could combine this with Azure Files and have each VM write to a common share. There's an Azure template proof of concept here which outputs information from each VM to an Azure File share: https://github.com/Azure/azure-quickstart-templates/tree/master/201-vmss-azure-files-linux - every VM has a mountpoint which contains info written by every VM.
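In Python, the same metadata lookup as the curl example might look like this (a sketch; it must run on the VM itself, since 169.254.169.254 is only reachable from inside the VM):

import requests

# Ask the Azure Instance Metadata Service for this VM's own details.
resp = requests.get(
    "http://169.254.169.254/metadata/instance?api-version=2017-03-01",
    headers={"Metadata": "true"},
)
metadata = resp.json()
# Field names match the structure of the sample response above.
print(metadata["network"]["interface"][0]["ipv4"]["ipaddress"][0]["ipaddress"])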

Troubleshooting Websockets with EC2 on AWS using Django

I am using Django-Channels to try to get real time features such as chat/messaging, notifications, etc. Right now, I have gotten everything to work fine on my laptop using the settings described in the docs here: http://channels.readthedocs.io/en/latest/. I use a local redis server for testing purposes.
However, when I deploy to my Amazon EC2 Elastic Beanstalk server (using an AWS ElastiCache Redis), the WebSocket functionality fails. From what I have read, I think this is because Amazon's HTTPS listener does not support WebSockets, so I need to switch to Secure TCP.
I tried doing that with:
https://blog.jverkamp.com/2015/07/20/configuring-websockets-behind-an-aws-elb/
and
https://medium.com/@Philmod/load-balancing-websockets-on-ec2-1da94584a5e9#.ak2jh5h0q
but to no avail.
Has anyone had any success implementing WebSockets with CentOS/Apache and Django on AWS EB? The Django-Channels package is fairly new, so I was wondering if anyone has experienced and/or overcome this hurdle.
Thanks in advance
AWS has launched the new Application Load Balancer, which supports WebSockets. Change your ELB to an Application Load Balancer and that will fix your issue.
https://aws.amazon.com/blogs/aws/new-aws-application-load-balancer/
As described here it's possible to run Django Channels on Elastic Beanstalk using an Application Load Balancer.
In a simplified form, it's basically:
Create an ALB.
Add 2 target groups: one that points to port 80, and one that points to the Daphne port, i.e. 8080.
Create 2 path rules. Let the default rule point to target group 1 (port 80), and set the second to use a relative path, i.e. /ws/, and point it to target group 2.
Add Daphne and its workers to supervisord (or another init system); see the config sketch after this list.
DONE! Access Daphne/websockets through the relative URL ws://example.com/ws/.
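A minimal supervisord sketch for the Daphne step might look like this (the project name, paths, and port are assumptions; the commands match the Channels 1.x era the question is about):

[program:daphne]
command=/srv/venv/bin/daphne -b 0.0.0.0 -p 8080 myproject.asgi:channel_layer
directory=/srv/myproject
autostart=true
autorestart=true

[program:worker]
command=/srv/venv/bin/python manage.py runworker
directory=/srv/myproject
autostart=true
autorestart=true
numprocs=2
process_name=%(program_name)s_%(process_num)02d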
I suppose the ALB is the only way. The reason is that with the SSL protocol listener on the Classic LB, session stickiness and the X-Forwarded headers won't be forwarded, which results in a redirect loop at the proxy server. The doc is here:
http://docs.aws.amazon.com/elasticloadbalancing/latest/classic/elb-listener-config.html
I'll update this answer if I find a way with the existing Classic Load Balancer.

Why is App Engine Returning the Wrong Application ID?

The App Engine Dev Server documentation says the following:
The development server simulates the production App Engine service. One way in which it does this is to prepend a string (dev~) to the APPLICATION_ID environment variable. Google recommends always getting the application ID using get_application_id.
In my application, I use different resources locally than I do on production. As such, I have the following for when I startup the App Engine instance:
import logging

from google.appengine.api.app_identity import app_identity
# ...
# other imports
# ...

DEV_IDENTIFIER = 'dev~'

application_id = app_identity.get_application_id()
is_development = DEV_IDENTIFIER in application_id

logging.info("The application ID is '%s'", application_id)

if is_development:
    logging.warning("Using development configuration")
    # ...
    # set up application for development
    # ...

# ...
Nevertheless, when I start my local dev server via the command line with dev_appserver.py app.yaml, I get the following output in my console:
INFO: The application ID is 'development-application'
WARNING: Using development configuration
Evidently, the dev~ identifier that the documentation claims will be prepended to my application ID is absent. I have also tried using the App Engine Launcher UI to see if that changed anything, but it did not.
Note that 'development-application' is the name of my actual application, but I expected it to be 'dev~development-application'.
Google recommends always getting the application ID using get_application_id
But, that's if you cared about the application ID -- you don't: you care about the partition. Check out the source -- it's published at https://code.google.com/p/googleappengine/source/browse/trunk/python/google/appengine/api/app_identity/app_identity.py .
get_application_id uses os.getenv('APPLICATION_ID') and then passes that to the internal function _ParseFullAppId -- which splits it on _PARTITION_SEPARATOR = '~' (thus removing again the dev~ prefix that dev_appserver.py prepended to the environment variable). The part before the '~' is returned as the "partition" to get_application_id (which ignores it, returning only the application ID in the strict sense).
Unfortunately, there is no architected way to get just the partition (which is in fact all you care about).
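If you did want the partition anyway, a hypothetical workaround (not an official API, and explicitly discouraged by the docs quoted above) is to split the raw environment variable yourself:

import os

def get_partition():
    # APPLICATION_ID looks like 'dev~my-app' on dev_appserver and, for
    # example, 's~my-app' in production; everything before the '~' is
    # the partition. Hypothetical helper, not an official API.
    app_id = os.environ.get('APPLICATION_ID', '')
    partition, sep, _app_name = app_id.partition('~')
    return partition if sep else ''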
I would recommend that, to distinguish whether you're running locally or "in production" (i.e, on Google's servers at appspot.com), in order to access different resources in each case, you take inspiration from the way Google's own example does it -- specifically, check out the app.py example at https://cloud.google.com/appengine/docs/python/cloud-sql/#Python_Using_a_local_MySQL_instance_during_development .
In that example, the point is to access a Cloud SQL instance if you're running in production, but a local MySQL instance if you're running locally. But that's secondary -- let's focus instead on how Google's own example tells which is the case. The relevant code is...:
if (os.getenv('SERVER_SOFTWARE') and
        os.getenv('SERVER_SOFTWARE').startswith('Google App Engine/')):
    ...snipped: what to do if you're in production!...
else:
    ...snipped: what to do if you're in the local server!...
So, this is the test I'd recommend you use.
Well, as a Python guru, I'm actually slightly embarrassed that my colleagues are using this slightly-inferior Python code (with two calls to os.getenv) -- me, I'd code it as follows...:
in_prod = os.getenv('SERVER_SOFTWARE', '').startswith('Google App Engine/')
if in_prod:
    ...whatever you want to do if we're in production...
else:
    ...whatever you want to do if we're in the local server...
but, this is exactly the same semantics, just expressed in more elegant Python (exploiting the second optional argument to os.getenv to supply a default value).
I'll be trying to get this small Python improvement into that example, and also to place it in the doc page you were using (there's no reason anybody who just needs to find out whether their app is running in prod or locally should ever have to look at the docs about Cloud SQL use -- so this is a documentation goof on our part, and I apologize). But while I'm working to get our docs improved, I hope this SO answer is enough to let you proceed confidently.
That documentation seems wrong; when I run the commands locally, it just spits out the name from app.yaml.
That being said, we use
import os
os.getenv('SERVER_SOFTWARE', '').startswith('Dev')
to check if it is the dev appserver.

Python App Engine debug/dev mode

I'm working on an App Engine project (Python) where we'd like to make certain changes to the app's behavior when debugging/developing (most often locally). For example, when debugging, we'd like to disable our rate-limiting decorators, turn on the debug param in the WSGIApplication, maybe add some asserts.
As far as I can tell, App Engine doesn't naturally have any concept of a global dev-mode or debug-mode, so I'm wondering how best to implement such a mode. The options I've been able to come up with so far:
Use google.appengine.api.app_identity.get_default_version_hostname() to get the hostname and check if it begins with localhost. This seems... unreliable, and doesn't allow for using the debug mode in a deployed app instance.
Use os.environ.get('APPLICATION_ID') to get the application id, which according to this page is automatically prepended with dev~ by the development server. Worryingly, the very source of this information is in a box warning:
Do not get the App ID from the environment variable. The development server simulates the production App Engine service. One way in which it does this is to prepend a string (dev~) to the APPLICATION_ID environment variable, which is similar to the string prepended in production for applications using the High Replication Datastore. You can modify this behavior with the --default_partition flag, choosing a value of "" to match the master-slave option in production. Google recommends always getting the application ID using get_application_id, as described above.
Not sure if this is an acceptable use of the environment variable. Either way it's probably equally hacky, and suffers the same problem of only working with a locally running instance.
Use a custom app-id for development (locally and deployed), use the -A flag in dev_appserver.py, and use google.appengine.api.app_identity.get_application_id() in the code. I don't like this for a number of reasons (namely having to have two separate app engine projects).
Use a dev app engine version for development and detect it with os.environ.get('CURRENT_VERSION_ID').split('.')[0] in code. When deployed this is easy, but I'm not sure how to make dev_appserver.py use a custom version without modifying app.yaml. I suppose I could sed app.yaml to a temp file in /tmp/ with the version replaced and the relative paths resolved (or just create a persistent dev-app.yaml), then pass that into dev_appserver.py. But that also seems kinda dirty and prone to error/sync issues.
Am I missing any other approaches? Any considerations I failed to acknowledge? Any other advice?
Regarding "detecting" localhost development, we use the following in our application's settings/config file.
IS_DEV_APPSERVER = 'development' in os.environ.get('SERVER_SOFTWARE', '').lower()
That used in conjunction with the debug flag should do the trick for you.
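That flag can then gate the behaviors mentioned in the question; for example, a rate-limiting decorator could no-op itself on the dev server. A sketch (the decorator and its internals are hypothetical):

import os

IS_DEV_APPSERVER = 'development' in os.environ.get('SERVER_SOFTWARE', '').lower()

def rate_limited(max_per_minute):
    # Hypothetical decorator factory: disable rate limiting entirely
    # when running under the local dev appserver.
    def decorator(func):
        if IS_DEV_APPSERVER:
            return func
        def wrapper(*args, **kwargs):
            # ... real rate-limiting bookkeeping would go here ...
            return func(*args, **kwargs)
        return wrapper
    return decorator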

Launch Openstack Instances using python-boto

I am trying to launch instances on an OpenStack setup that has multiple networks configured, using python-boto.
But I get the following error:
EC2ResponseError: EC2ResponseError: 400 Bad Request
<?xml version="1.0"?>
<Response>
  <Errors>
    <Error>
      <Code>NetworkAmbiguous</Code>
      <Message>Multiple possible networks found, use a Network ID to be more specific.</Message>
    </Error>
  </Errors>
  <RequestID>req-28b5a4e8-3838-4111-95db-337c5048716d</RequestID>
</Response>
My code looks like this:
from boto import ec2
ostack = ec2.connection.EC2Connection(
    ec2_access_key, ec2_secret_key, is_secure=False, port=8773, region='nova',
    path='/services/Cloud'
)
ostack.run_instances('ami-xxxxx', key_name='BotoTest')
The above works fine when a single network is configured in OpenStack.
Note: run_instances doesn't have a keyword argument for a network ID.
Where did I make a mistake, or how do I fix this? Or is it a bug in python-boto?
Thanks in advance.
I believe this isn't a bug in boto, which was built to communicate with the AWS API. While most of the EC2 AWS functionality works well against the EC2 OpenStack API, some features are not implemented and are answered with an HTTP 500 or 400 error.
AWS uses a VPC (Virtual Private Cloud) as the network and an Availability Zone as the subnet. Both have a default setting, which is used if there is no further specification when creating a new instance. But in OpenStack I can't see a way to mark a network and a subnet as default.
In my attempts, neither private_ip_address nor subnet_id worked to specify a network/subnet in run_instances() when there is more than one in OpenStack.
Edit: if you only have one network/subnet, the following code works fine with boto at trystack.org:
import boto
conn = boto.connect_ec2_endpoint("http://8.21.28.222:8773/services/Cloud",aws_access_key_id='...',aws_secret_access_key='...')
new_instance = conn.run_instances("ami-00000020", key_name="trystack", security_groups=["default"], instance_type="m1.small")
Have you tried:
from boto import ec2
ostack = ec2.connection.EC2Connection(
    ec2_access_key, ec2_secret_key, is_secure=False, port=8773, region='nova',
    path='/services/Cloud', debug=1
)
then
ostack.run_instances('ami-xxxxx', subnet_id='your network id', key_name='BotoTest')
Amazon uses subnet_id for VPC networks. Are you running a VPC?
