The current workflow I have is that I created many images of different Python setups, which people can pull from if they want to test a Python script with a certain configuration. They then build a container from the image and transfer their scripts, data, etc. from their local machine into the container. Next they run everything in the container, and finally they transfer the results back to their local machine.
I'm new to Docker, so is there a better way to go about this? Something I have in mind that would be convenient is a central machine or Docker container where people could save the Python scripts and data they need for their tests, run them in the Python-environment image of their choice, and save the results. Is this possible? I've been reading about volumes and think I could maybe do something with those, but I'm not sure.
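To make the volume idea concrete, here is a minimal sketch using the docker-py SDK (pip install docker); the image name my-python-env:3.11, the shared directory path, and the script name are all made-up placeholders:

```python
def run_kwargs(image, shared_dir, script):
    """Arguments for docker-py's client.containers.run(): bind-mount the
    shared directory at /work and run the script from there, so scripts,
    data, and results never need to be copied in or out of the container."""
    return {
        "image": image,
        "command": ["python", f"/work/{script}"],
        "volumes": {shared_dir: {"bind": "/work", "mode": "rw"}},
        "working_dir": "/work",
        "remove": True,  # delete the container once the script exits
    }

# With the docker SDK installed and a daemon running:
#   import docker
#   client = docker.from_env()
#   client.containers.run(**run_kwargs("my-python-env:3.11",
#                                      "/srv/shared/alice", "test.py"))
```

Anything the script writes under /work inside the container lands directly in the shared host directory, so there is no copy-results-back step.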
I'm developing a Python application for machine learning models; you can see my docker-compose file here: https://github.com/Quilograma/IES_Project/blob/main/docker-compose.yml.
The problem is that while developing the application, every time I change a line of Python code I have to kill all active containers and drop their respective images, then call docker-compose up to see the change I've made. It takes roughly 5 minutes to pull all the Docker images and install the Python libraries again, which significantly slows down development.
Is there any workaround for this? I really want to keep using containers. Thanks!
You do not need to remove any images; you only need to rebuild your image. All unchanged layers (FROM python:<tag>, RUN pip install <packages...>) will come from the build cache, so only the layers after your code change are rebuilt.
The alternative solution (which works because Python is an interpreted language) is to mount your module as a volume. Then, whenever you save a file on your host filesystem, it is automatically updated inside the container.
Personal example with Flask server and Kafka connection
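A minimal sketch of the bind-mount approach in a compose file (the service name web and the ./app:/app paths are placeholders; adjust them to your project layout):

```yaml
services:
  web:
    build: .
    volumes:
      - ./app:/app   # host ./app shadows the image's /app; edits show up live
```

With this, a code change only needs a container restart at most, not an image rebuild.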
G'day all,
well, I have JupyterLab running inside a Docker container (in fact I use this image) and so far it has been great. Now that I have started capturing data for my research, I have a lot of data and need to upload it to my MySQL database (which is also in Docker), but uploading the many files (2k per sample) into SQL takes about 4 hours. So I want to run the IPython scripts from the CLI inside the Docker container, so they can run overnight or over the weekend without the browser open or my laptop on.
I know that if I installed Anaconda on Windows, or Ubuntu on a separate computer, I could leave it running there, but I want to use what I already have working.
Thanks a lot, and I'm not sure if this is possible with JupyterLab inside Docker.
P.S. I don't want to keep creating new images, as I will get new data daily and want to upload it overnight. If this can't be done, no worries; I can install Anaconda on an Ubuntu desktop and use it there, but as I said before, I would prefer to keep what I already have. Thanks again.
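Something like this sketch is what I have in mind, using the docker-py SDK (pip install docker); the container name jupyterlab and the script path are made up, yours will differ:

```python
def exec_kwargs(script_path):
    """Arguments for docker-py's Container.exec_run(): start the upload
    script in the background inside the already-running container."""
    return {
        "cmd": ["python", script_path],
        "detach": True,  # return immediately; the upload keeps running
    }

# With the docker SDK installed and the JupyterLab container already up:
#   import docker
#   box = docker.from_env().containers.get("jupyterlab")
#   box.exec_run(**exec_kwargs("/home/jovyan/work/upload_to_mysql.py"))
# The container (and the machine hosting Docker) must stay running, but the
# browser can be closed.
```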
I'm using venv for my Python repo on GitHub, and I want to run the same code on 10+ EC2 instances (each instance will have a cron job that runs the same code on the same schedule).
Any recommendations on how best to achieve this, and to make sure all instances keep getting the latest release branches from GitHub? I'd like to automate any configuration I need, so that I'm not doing this:
Create one EC2 instance and set up all the configuration I need (e.g. download the latest Python version), then git clone the repo and install the Python packages I need using venv. Verify the code works on this instance.
Repeat for the remaining 10+ EC2 instances.
Whenever someone releases a new master branch, SSH into every EC2 instance, git pull the correct branch, re-apply any new configuration, and repeat for all remaining 10+ EC2 instances.
Ideally I could just run a script that pushes everything needed to make the code work to all EC2 instances. I have little experience with this type of thing, but from reading around, this is the approach I'm considering. Am I on the right track?
Create a script that SSHes into all my EC2 instances and git clones/updates to the correct branch.
Use Docker to make sure all EC2 instances are set up properly so the Python code works (is this the right use case for Docker?). The script above would run the necessary Docker commands.
Do something similar with venv, reading the requirements.txt file so that all EC2 instances have the right Python packages and versions.
Depending on your app and requirements (is EC2 100% necessary?), I can recommend the following:
Capistrano-like SSH deployments (https://github.com/dlapiduz/fabistrano) if your fleet is static and you need fast deployments. Not a best practice and not terribly secure, but you mentioned a similar scheme in your post.
Using AWS Image Builder (https://aws.amazon.com/image-builder/) or Packer (https://www.packer.io/) to build a new release image and then replace the old image with the new one in your EC2 Auto Scaling group.
Build a Docker image of your app and use ECS or EKS to host it. I would recommend this approach if you're not married to running code directly on EC2 hosts.
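For the first (SSH) option, a minimal sketch of the fan-out script; the hostnames, ec2-user login, and ~/repo path are placeholders for your setup:

```python
import subprocess

HOSTS = ["ec2-host-1", "ec2-host-2"]  # placeholder hostnames
DEPLOY = ("cd ~/repo && git fetch && git checkout master && git pull "
          "&& venv/bin/pip install -r requirements.txt")

def ssh_command(host, remote_cmd=DEPLOY):
    """One ssh invocation per instance; key-based auth is assumed."""
    return ["ssh", f"ec2-user@{host}", remote_cmd]

# Run the same update on every instance, stopping on the first failure:
# for host in HOSTS:
#     subprocess.run(ssh_command(host), check=True)
```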
I am using the Docker Python SDK (docker-py) to create a script that can start one or more containers (depending on a program argument, e.g. script.py --all or script.py --specific_container), and it has to be possible to start each container with its own configuration (image, container_name, etc.), just like in a typical docker-compose.yml file.
So basically, I'm trying to do the same thing docker-compose does, just with the Python Docker SDK.
I've read that some people stick with docker-compose by calling it through subprocess, but that is not recommended and I would like to avoid it.
I've searched for existing libraries for this but haven't found anything yet. Do you know of anything I could use?
Another option would be to store configuration files for the "specific_container" profiles and for the "all" profile as JSON (?), then parse them and use them to populate the Docker SDK's containers.run() method, which accepts all the options you can set in a docker-compose file.
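A rough sketch of that JSON idea (the file name profiles.json, the profile names, and the commented-out docker-py call are all invented for illustration):

```python
import json

def load_profiles(path):
    """Parse a JSON file mapping profile names to containers.run() kwargs."""
    with open(path) as f:
        return json.load(f)

def select(profiles, name):
    """Return the run() kwargs for one profile, or for all of them."""
    if name == "all":
        return list(profiles.values())
    return [profiles[name]]

# profiles.json (made-up example):
#   {"web": {"image": "python:3.11", "name": "web", "detach": true},
#    "db":  {"image": "mysql:8",     "name": "db",  "detach": true}}
#
# With the docker SDK (pip install docker):
#   import docker
#   client = docker.from_env()
#   for kwargs in select(load_profiles("profiles.json"), "all"):
#       client.containers.run(**kwargs)
```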
Maybe someone knows another, better solution?
Thanks in advance guys.
I have a server and a front end. I would like to take Python code from the user in the front end and execute it securely on the back end.
I read this article explaining the problematic aspects of each approach, e.g. the PyPy sandbox, Docker, etc.
I can define the inputs the code needs and the outputs, and can put all of them in a directory. What's important to me is that this code must not be able to harm my filesystem or exhaust disk, memory, or CPU (I need to be able to set timeouts, and to prevent it from accessing files that are not devoted to it).
What is the best practice here? Docker? Kubernetes? Is there a Python module for this?
Thanks.
You can run a Python Docker container with a volume mount: the mounted volume is a directory on the local system holding the code, which is then also available inside the container. That way, the user-supplied code is isolated to the container while it runs.
Scan your Python container against the CIS Docker Benchmark for better security.
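As a sketch with docker-py (the image name, paths, and the exact limits are placeholders to tune for your workload):

```python
def sandbox_kwargs(image, code_dir):
    """containers.run() arguments that cap what untrusted code can do.
    The mounted directory is the only writable location it can touch."""
    return {
        "image": image,
        "command": ["python", "/sandbox/main.py"],
        "volumes": {code_dir: {"bind": "/sandbox", "mode": "rw"}},
        "mem_limit": "256m",          # cap memory
        "nano_cpus": 1_000_000_000,   # cap CPU at one core
        "network_disabled": True,     # no network access
        "read_only": True,            # root filesystem is read-only
        "user": "nobody",             # drop privileges
        "detach": True,               # so a timeout can be enforced below
    }

# With the docker SDK (pip install docker) and a daemon running:
#   import docker
#   c = docker.from_env().containers.run(
#       **sandbox_kwargs("python:3.11-slim", "/srv/jobs/42"))
#   try:
#       c.wait(timeout=30)   # kill runaway code after 30 s
#   except Exception:
#       c.kill()
```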