What is the easiest way to deploy a Python app with only two .py files but keep it on one dyno? My files are friend.py and foe.py, and my Procfile looks like this:
worker: python friend.py
worker: python foe.py
But when deployed to Heroku, the only dyno I get is foe.py. I've read other similar questions, but they seem too complicated and I don't yet understand the inner workings of a Python web application.
If they are distinct processes working in parallel, the most direct path is two dynos with different names in the Procfile (friend and foe would work fine as process names, e.g. friend: python friend.py and foe: python foe.py). Right now you are using the name worker twice, so foe.py wins because it's the last one defined. Two things to keep in mind -
The names in the Procfile can be arbitrary; as far as I know, the only "special" name is web, which tells Heroku to expect that process to bind to a port and accept HTTP traffic from the routing mesh. worker isn't special; it's just a convention people tend to use for "something running other than the web dyno".
A dyno is closer to a Docker container than a virtual machine, so the general best practice is one kind of process per container.
If you really need only one dyno (cost?), you could write a third script whose sole job is to spawn friend.py and foe.py as subprocesses. In that case, everything comes up and down as a unit; you can't manage friend and foe independently.
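A minimal sketch of such a launcher, assuming nothing beyond the standard library (the filename launcher.py and the shutdown handling are my own choices, adjust as needed):

# launcher.py - run friend.py and foe.py as child processes of a single dyno
import subprocess
import sys
import time

def main():
    # start both scripts with the same Python interpreter the dyno is using
    procs = [subprocess.Popen([sys.executable, script]) for script in ("friend.py", "foe.py")]
    try:
        # if either child exits, fall through and take the other one down too,
        # so Heroku restarts the dyno as a unit
        while all(p.poll() is None for p in procs):
            time.sleep(1)
    finally:
        for p in procs:
            if p.poll() is None:
                p.terminate()
    sys.exit(max(p.wait() for p in procs))

if __name__ == "__main__":
    main()

The Procfile then needs only a single entry, worker: python launcher.py.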
Hope that helps.
A similar question was asked here; however, the solution does not give the shell access to the same environment as the deployment. If I inspect os.environ from within the shell, none of the environment variables appear.
Is there a way to run the manage.py shell with the deployment's environment?
PS: As a little side question, I know the mantra for Elastic Beanstalk is to stop using eb ssh, but then how would you run one-off management scripts (that you don't want to run on every deploy)?
One of the cases where you have to run something exactly once is database schema migrations. Migration state is usually stored in the database itself, so you can use the database to synchronize and ensure that something is triggered only once.
Personally I have nothing against using eb ssh, but I do see problems with it: if you want CI/CD, that kind of manual operation is against the rules.
It looks like you are referring to the web/API part of Beanstalk. If you need something that runs quite frequently, maybe a worker environment is more suitable? The problem there is that if the API gets deployed first, it would be running against the wrong schema.
Under the hood you are running on EC2, so it's the instance's user data that stores what spins up your service, and that's where you can put your "stuff". You still need to synchronize it yourself. The Beanstalk docs have more information on how to do that.
Edit
Beanstalk is essentially instrumentation on top of EC2, so there must be a way to work with it, since you have access to the user data of those EC2 instances. But no worries, you don't need to dig that deep: there is a good way of instrumenting your servers called ebextensions. It can be used to put files on the server, trigger commands, set up cron jobs, whatever you want.
You can create an ebextension with container_commands this time (see the Python Configuration Namespaces section of the docs). Those commands are executed on each deployment. The problem remains that you need to synchronize, since more than one instance can be deploying at the same time; container_commands supports a leader_only flag for exactly that. The good part is that they run with your application's environment, so you can set the env the way you want.
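To make the "use the db to ensure it runs only once" idea concrete, here is a minimal sketch of a Django management command guarded by a database flag; the OneOffTask model, its fields and the command name are all assumptions for illustration. You would call it from a container_command (ideally with leader_only: true) instead of eb ssh:

# myapp/management/commands/run_once.py - hypothetical one-off task guarded by a DB flag
from django.core.management.base import BaseCommand
from django.db import transaction

from myapp.models import OneOffTask  # assumed model: name = CharField(unique=True), done = BooleanField(default=False)

class Command(BaseCommand):
    help = "Run a named one-off task at most once across all instances and deployments"

    def add_arguments(self, parser):
        parser.add_argument("task_name")

    def handle(self, *args, **options):
        name = options["task_name"]
        with transaction.atomic():
            # row lock so concurrent deployments don't run the task twice
            task, _ = OneOffTask.objects.select_for_update().get_or_create(name=name)
            if task.done:
                self.stdout.write("%s already ran, skipping" % name)
                return
            # ... do the actual one-off work here ...
            task.done = True
            task.save()
        self.stdout.write("%s completed" % name)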
I have no problem accessing the environment variables. How did you run into that problem? Try rendering a page that dumps the os.environ map to see what is actually set.
I am dockerizing a Python webapp using the https://hub.docker.com/r/tiangolo/uwsgi-nginx image, which uses supervisor to control the uWSGI instance.
My app actually requires an additional supervisor-mediated process to run (LibreOffice headless, with which I generate documents through the appy module), and I'm wondering what is the proper pattern to implement it.
The way I see it, I could extend the above image with the extra supervisor config for my needs (along with all the necessary OS-level install steps), but this would contradict the general principle of running as few distinct processes as possible in a given container. However, since my Python app is designed to talk to LibreOffice only locally, I'm not sure how I could achieve that with a more containerized approach. Thanks for any help or suggestion.
The recommendation for one process per container is sound - Docker only monitors the process it starts when the container runs, so any extra processes aren't watched by Docker. It's also a better design - you have lightweight, focused containers with single responsibilities, and you can manage them independently.
user2105103 is right though, the image you're using already loses that benefit because it runs Python and Nginx, and you could extend it with LibreOffice headless and package your whole app without changing code.
If you move to a more "best practice" approach, you'd have a distributed app running across three containers in a Docker network:
nginx - the web proxy and public entry point to the app. Nginx can do routing, caching, SSL termination, rate limiting, etc.;
app - your Python app, only visible inside the Docker network. Receives requests from nginx and uses libreoffice for document manipulation;
libreoffice - running in headless mode with the API exposed, but only available within the Docker network.
You'd need code changes for this, bringing in something like PyOO to use the LibreOffice API remotely from the app container.
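As a rough illustration of what that code change could look like (the container hostname, port and document path are assumptions, and the exact PyOO calls should be checked against its documentation):

# app container side: talk to headless LibreOffice running in another container.
# Assumes the libreoffice container starts soffice with its UNO API listening, e.g.
#   soffice --headless --accept="socket,host=0.0.0.0,port=2002;urp;"
import pyoo

# "libreoffice" is the service name resolvable on the Docker network
desktop = pyoo.Desktop("libreoffice", 2002)

doc = desktop.open_spreadsheet("/data/template.ods")  # path visible to the LibreOffice container
doc.sheets[0][0, 0].value = "generated from the app container"
doc.save()
doc.close()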
You've already blown the "one process per container" rule -- just add another process. It's not a hard rule, or even one that everybody agrees with.
Extend away, or better yet author your own custom container. That way you own it, you understand it, and it's optimized for your purpose.
I just came across Docker a couple days ago and have been doing some research, but one thing is still a bit unclear to me.
In a video I watched by the creator of Docker, he likened this utility to a shipping container so that you could guarantee that your stack works as intended once set up inside.
But I'm seeing a lot of container images which are just a single part of the stack, i.e. an nginx image or a uwsgi image.
Basically I want to run a web server using python, flask, nginx, and uwsgi. They're all part of the stack so should they go in a single container or should certain parts be in their own container?
I'll have a MySQL server as well, and this seems more logical to run in its own container.
Apologies if this is a matter of opinion, but to me it feels like there is only one right way to go about this.
Docker containers are treated as deployment units - which means you package an application (or part of an application) and all its dependencies into a docker container that can be deployed independently. Your application could be monolithic where your entire application fits into one container just exposing HTTP end points for the browser to access, or an application that is composed of sub-components that can be independently deployed and managed - something like microservices - that when put together form the complete application. In such a case, each independent sub-component would reside in a container of its own. So, the decision on how many containers and how many processes within a container depends on the composition of your application and the kind of scalability you want to achieve.
Docker containers are meant to run single processes, but of course you can work around that by running process management tools like supervisord. I'm not very familiar with the Python stack you are talking about, but I can explain this in terms of a stack comprising Nginx + Node + Redis. I have elaborated on a sample Docker workflow with this stack in my blog post as well: http://anandmanisankar.com/posts/docker-container-nginx-node-redis-example/
In the example used in the blog post, Nginx, Node and Redis run in separate containers, for the following reasons:
I want to be able to scale my Node application depending on the load, so it makes sense to run it in a separate container that I can scale independently.
I run Redis in a separate container, which acts as a shared data store for my Node containers.
To load balance my Node containers I run Nginx - again in a separate container - which can dynamically balance the load across the scaled-out Node containers. The load-balancing configuration can also be updated dynamically based on the state/availability/health of the Node containers. It would be ideal to have a service discovery mechanism that dynamically generates the Nginx configuration based on the availability of the containers. Then scaling up would just be a matter of adding additional containers, and fault tolerance (failure of some Node containers) would be handled automatically as well.
You can find the code behind this Docker workflow in my GitHub repo: https://github.com/msanand/docker-workflow
You could try to draw an analogy from this to any other web architecture stack. Hope this helps!
I think you might like this: I made a public (and open source) Docker image with all the bells and whistles that you can use to build a Python Flask web application.
It has uWSGI for running the application, Nginx to serve HTTP and Supervisord to control them, so you don't have to learn how to install and configure all those to build your Python Flask web app.
It seems that uWSGI with Nginx is one of the more robust (and better-performing) ways to deploy a Python web app. Here are some benchmarks: http://nichol.as/benchmark-of-python-web-servers.
There are even some template projects you can use to bootstrap your own. Also, you don't have to clone the full project; you can just use it as a base image.
Docker Hub: https://hub.docker.com/r/tiangolo/uwsgi-nginx-flask/
GitHub: https://github.com/tiangolo/uwsgi-nginx-flask-docker
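For illustration, the application code itself can stay tiny; something like the module below is the kind of entry point such a base image expects (the filename main.py and the variable name app are assumptions here, check the image's README for its exact conventions):

# main.py - minimal Flask app served by uWSGI behind Nginx in the base image
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    return "Hello from Flask behind uWSGI and Nginx"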
And about the "one process per container" debate, some say that this is one of the key misconceptions when you look at it from a microservices point of view: https://valdhaus.co/writings/docker-misconceptions/
As others here have said, officially it is recommended to have one process per container, see
https://docs.docker.com/articles/dockerfile_best-practices/
However, I think there is much debate about how much process isolation you need. One example that is closer to the full stack is the phusion passenger image (phusion/baseimage and phusion/passenger-docker)
that bundles, amongst other things, nginx, ruby and passenger. Some people hate this, others think there is a place for such images. Opinions expressed about this particular image and a linked article discussing it can be found here: https://news.ycombinator.com/item?id=7258009. I think that you can generalize a lot of what is said there to your case and that the variety of arguments supports the variety of image types you have observed.
Personally, I think the full-stack vs. single-process debate comes down to the requirements of what you are trying to achieve. If you worry about scalability, the single-process paradigm might be better for you. If you care about quickly bringing up a dev environment, it could be more straightforward to create or take a container that feels a bit more like a virtual machine.
I am working on a web application that uses a permanent object, MyService. Using a web interface I dynamically update its state and monitor its behavior. Now I would like to periodically call one of its methods. I was thinking of using Celery's PeriodicTask but ran into some scope issues. It seems I need to execute three different processes:
python manage.py runserver
python manage.py celery worker
python manage.py celerybeat
The problem is that even if I ensure that MyService is a singleton that can be safely used by more than one thread, Celery creates its own fresh copy of the object. Is there a way I could share this object between the Django server and the Celery main process? I tried to find a way to start Celery from within a Django script, but so far without success. I would appreciate any help.
If you need to share something between multiple processes, or maybe even multiple machines (e.g. your workers could run on a separate machine), the best (and probably easiest) practice is to share the information via an external service.
In the simplest case you could use Django's database, but if that turns out not to be suitable, for example because you have a heavy write load, you can use something like Redis or Memcached (which you can also talk to via Django's caching API). These can handle a big write load, and you can use Redis as the queue (broker) for Celery as well.
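A minimal sketch of that idea, assuming a Redis- or Memcached-backed Django cache and that the relevant MyService state is serializable (the key and function names are made up for illustration):

# tasks.py - both processes go through the shared cache instead of sharing a Python object
from celery import shared_task
from django.core.cache import cache

STATE_KEY = "myservice:state"  # illustrative key name

def update_state_from_view(new_state):
    # call this from the Django view when the user changes MyService through the web UI
    cache.set(STATE_KEY, new_state, timeout=None)

@shared_task
def poll_myservice():
    # scheduled via celery beat; reads the latest state instead of holding its own copy
    state = cache.get(STATE_KEY)
    if state is None:
        return
    # ... rebuild MyService from the shared state and call its method here ...

You would register poll_myservice in your celery beat schedule; the important part is that neither process holds the authoritative object in memory.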
I am coming from a Java/Tomcat background and was wondering if there is anything out there which could be similar to the Tomcat manager application?
I'm imagining a webapp that I can use to easily deploy and un-deploy Flask based webapps. I guess an analogy to Tomcat would be a WSGI server with a web based manager.
Unfortunately, the deployment story for Python / WSGI is not quite as neat as Java's WAR-file-based deployment. (And while Python is not Java, that doesn't mean WAR-file deployments aren't nice.) So you don't have anything that will quite match your expectations there - but you may be able to cobble together something similar.
First, you'll want a web server that can easily load and unload WSGI applications without requiring a server restart - the one that immediately jumps to mind is uwsgi in emperor mode (and here's an example setup).
Second, you need a consistent way to lay out your applications so the WSGI file can be picked up / generated. Something as simple as always having a root-level app.wsgi file that can be copied to the directory being watched by uWSGI will do.
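For example, app.wsgi can be as small as this (the package and factory names are placeholders for whatever your application actually exposes):

# app.wsgi - root-level entry point copied/symlinked into the directory uWSGI watches
from myapp import create_app  # placeholder import for your application factory

# uWSGI looks for a module-level callable named "application" by default
application = create_app()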
Third, you'll need a script that can take a web application folder / virtualenv and move / symlink it into the "available applications" folder. You'll need another one that can add / symlink, touch (to restart) and remove (to shut down) the app.wsgi files in the directory(ies) that uWSGI is watching for new vassal applications. If you need to run it across multiple machines (or even just one remote machine), you could use Fabric.
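A rough sketch of what that could look like with Fabric 1.x (the paths are assumptions, and note that many emperor setups watch .ini vassal configs rather than the .wsgi files themselves):

# fabfile.py - illustrative enable/restart/disable tasks for emperor-managed apps
from fabric.api import run, task

AVAILABLE = "/srv/wsgi/available"  # where deployed apps / virtualenvs live
ENABLED = "/srv/wsgi/enabled"      # the directory the uWSGI emperor watches

@task
def enable(name):
    # symlinking the app's file into the watched directory brings it up
    run("ln -sf %s/%s/app.wsgi %s/%s.wsgi" % (AVAILABLE, name, ENABLED, name))

@task
def restart(name):
    # touching the watched file tells the emperor to reload that vassal
    run("touch %s/%s.wsgi" % (ENABLED, name))

@task
def disable(name):
    # removing the watched file shuts the app down
    run("rm -f %s/%s.wsgi" % (ENABLED, name))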
Fourth and finally, you'll need a little web application to enable you to manage the WSGI files for these available applications without using the command line. Since you just spent all this time building some infrastructure for it, why not use Flask and deploy it on itself to make sure everything works?
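Sticking with the placeholder paths above, the manager itself could start out as small as this sketch (the routes and the shell-out to fab are purely illustrative):

# manager.py - tiny Flask front-end over the enable/restart/disable operations
import os
import subprocess

from flask import Flask, redirect, url_for

AVAILABLE = "/srv/wsgi/available"  # assumed layout, same as in the fabfile sketch
app = Flask(__name__)

@app.route("/")
def index():
    apps = sorted(os.listdir(AVAILABLE))
    return "<br>".join(
        '%s - <a href="%s">restart</a>' % (name, url_for("restart", name=name))
        for name in apps
    )

@app.route("/restart/<name>")
def restart(name):
    # Fabric 1.x task invocation syntax: fab taskname:argument
    subprocess.call(["fab", "restart:%s" % name])
    return redirect(url_for("index"))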
It's not a pre-built solution, but hopefully this at least points you in the right direction.