I'm starting with Python 3 on Raspbian (derived from Debian), using virtualenv. I understand how to create and use a virtualenv to "sandbox" different Python projects; however, I'm a bit unclear on whether one should also set up a different Linux user for each project (assuming that the project/virtualenv will be used to create and then run a daemon process on the Linux box).
So when creating separate Python environments, the question I think is: should I be
creating a new Linux user account for each daemon/script I'm working on, so that both the Python virtual environment and the Python project code area can live under directories owned by this user?
perhaps just create one new non-administrator account at the beginning, and then just use this account for each project/virtual environment?
create everything under the initial admin user I first log in with on Raspbian (e.g. the "pi" user) - assume NO for this option, but I'm putting it in for completeness.
TL;DR: 1. no 2. yes 3. no
creating a new Linux user account for each daemon/script I'm working on, so that both the Python virtual environment and the Python project code area can live under directories owned by this user?
No. Creating many user accounts for this adds unnecessary complexity with no real benefit. Note that one user can be logged into multiple sessions and run multiple processes at the same time.
perhaps just create one new non-administrator account at the beginning, and then just use this account for each project/virtual environment
Yes, and use sudo from the non-admin account if/when you need to escalate privilege.
create everything under the initial admin user I first log in with on Raspbian (e.g. the "pi" user) - assume NO for this option, but I'm putting it in for completeness.
No. It's better to create a regular user than to run everything as root or under the default admin account. Using a non-root administrator account would be OK, though.
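As a rough sketch of what option 2 can look like in practice (assuming Python 3.4 or later so the standard-library venv module is available; the account name, paths, and project names below are made-up examples), a single non-admin account can hold one virtual environment per project:

    # Sketch: one non-admin account holding a separate virtual environment
    # per project. Paths and project names are hypothetical.
    import venv
    from pathlib import Path

    projects = ["sensor_daemon", "web_scraper"]   # example project names
    base = Path.home() / "venvs"                  # e.g. /home/appuser/venvs
    base.mkdir(exist_ok=True)

    for name in projects:
        env_dir = base / name
        if not env_dir.exists():
            # with_pip=True bootstraps pip inside the new environment
            venv.EnvBuilder(with_pip=True).create(str(env_dir))
            print("created", env_dir)

Each daemon can then be launched with its own interpreter (e.g. ~/venvs/sensor_daemon/bin/python), so none of this needs root at all.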
It depends on what you're trying to achieve. From virtualenv's perspective you could do any of those.
#1 makes sense to me if you have multiple services that are publicly accessible and want to isolate them.
If you're running trusted code on an internal network, but don't want the dependencies clashing then #2 sounds reasonable.
Given that the Pi is often used for a specific purpose (not as a general-purpose desktop, say) and the default account goes largely unused, using that account would be fine. Make sure to change the default password.
In the general case, there is no need to create a separate account just for a virtualenv.
There can be reasons to create a separate account, but they are distinct from, and to some extent at odds with, virtual environments. (If you have a dedicated account for a service, there is no real need to put it in a virtualenv -- you might still want one if it has dependencies you want to be able to upgrade easily, etc., but the account already provides a level of isolation similar to what a virtualenv provides within an account.)
Reasons to use a virtual environment:
Make it easy to run things with different requirements under the same account.
Make it easy to install things for yourself without any privileges.
Reasons to use a separate account:
Fine-grained access control to privileged resources.
Properly isolating the private resources of the account.
With this question I would like to gain some insight and verify that I'm on the right track with my thinking.
The request is as follows: I would like to create a database on a server. This database should be updated periodically by adding information that is present in a certain folder on a different computer. Both the server and that computer will be on the same network (though I may run into some firewall issues).
So the method I am thinking of using is as follows. Create a tunnel between the two systems. I will run a script that periodically (hourly or daily) searches through the specified directory, converts the files to data, and adds it to the database. I am planning to use Python, which I am fairly familiar with.
Note: I don't think I will be able to install Python on the PC with the files.
Is this at all doable? Is my approach solid? Please let me know if additional information is required.
Create a tunnel between the two systems.
If you mean setting up the firewall between the two machines to allow a connection, then yes. Just open the PostgreSQL port; check postgresql.conf for the port number in case it isn't the default (5432), and make sure listen_addresses in postgresql.conf allows connections from outside localhost. Also add the correct entry to pg_hba.conf so the computer's IP can connect to it.
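Before writing the loader, it may help to confirm from the client machine that the database port is actually reachable through the firewall; a tiny check (the host and port below are placeholders) might be:

    # Sketch: quick check that the firewall lets us reach the PostgreSQL port.
    # Host and port are placeholders.
    import socket

    try:
        socket.create_connection(("192.168.1.50", 5432), timeout=5).close()
        print("port reachable - now make sure pg_hba.conf allows this client")
    except OSError as exc:
        print("cannot reach the database port:", exc)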
I will run a script that periodically (hourly or daily) searches through the specified directory, converts the files to data, and adds it to the database. I am planning to use Python, which I am fairly familiar with.
Yeah, that's pretty standard. No problem.
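A minimal sketch of such a loader, to be scheduled hourly or daily with cron or Task Scheduler (the watch directory, connection details, table name, and CSV column layout below are assumptions purely for illustration):

    # Sketch: scan a folder for CSV files and load their rows into PostgreSQL.
    # Directory, connection details, table and columns are hypothetical.
    import csv
    from pathlib import Path

    import psycopg2

    WATCH_DIR = Path("/mnt/shared/incoming")      # folder being watched
    PROCESSED = WATCH_DIR / "processed"
    PROCESSED.mkdir(exist_ok=True)

    conn = psycopg2.connect(host="192.168.1.50", dbname="filedata",
                            user="loader", password="secret")

    for path in sorted(WATCH_DIR.glob("*.csv")):
        with path.open(newline="") as fh, conn, conn.cursor() as cur:
            for row in csv.DictReader(fh):
                cur.execute(
                    "INSERT INTO measurements (taken_at, value) VALUES (%s, %s)",
                    (row["timestamp"], row["value"]),
                )
        path.rename(PROCESSED / path.name)        # avoid loading a file twice
    conn.close()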
Note: I don't think I will be able to install Python on the PC with the files.
On Windows you can install Anaconda for all users or just for the current user. The latter doesn't require admin privileges, so that may help.
If you can't install Python, you can use tools such as PyInstaller to turn your Python program into a standalone executable that bundles all the libraries, so you just have to drop it into a folder on that computer and run it.
If you absolutely cannot install anything or execute any program, then you'll have to create a scheduled task that copies the data over the network to a computer that has Python, and run the Python script there, but that's extra complication.
If the source computer is automatically backed up to a server, you can also use the backup as a data source, but there will be a delay depending on how often the backup runs.
I would like to provision an AWS EC2 server using Python and, among other things, set a default user and password. The idea is that the developer selects from our site's menu the items they would like to have installed, e.g. MySQL, Nginx, etc. When they click submit, I use boto to create the EC2 server, and now I would like to install the software and set the default user credentials so that they can be mailed to the user.
I would like to make the above self-service, i.e. everything is well automated and one can customize it as needed without a systems engineer getting involved. Note: I don't want to share the AWS keypairs; those will be left to the server admin.
I'm thinking of using Fabric for the above, but it seems it will require a lot of code and configuration. Is there a best or recommended way to provision a Linux server and do the above? How do providers like DigitalOcean set default root passwords during creation? I would like to keep these details as unique as possible.
Use the User Data field to pass a #cloud-config setup to the Amazon EC2 instance; a sketch of doing this from Python appears after the list below. Some of the things it can do include:
Adding users and groups, including defining user passwords
Writing out arbitrary files
Adding a yum/apt repository
Installing via Chef or Puppet
Running commands on first boot
Installing arbitrary packages
See: Cloud config examples
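A rough sketch of what that can look like (shown here with boto3 rather than the older boto; the AMI ID, instance type, key name, and generated password are placeholders):

    # Sketch: launch an EC2 instance whose first boot is configured by a
    # #cloud-config block passed as user data. All identifiers are placeholders.
    import boto3

    user_data = """#cloud-config
    users:
      - name: appuser
        groups: sudo
        shell: /bin/bash
        lock_passwd: false
    chpasswd:
      list: |
        appuser:GeneratedPassw0rd
      expire: true
    packages:
      - nginx
      - mysql-server
    """

    ec2 = boto3.client("ec2", region_name="us-east-1")
    resp = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",   # placeholder AMI
        InstanceType="t2.micro",
        MinCount=1,
        MaxCount=1,
        KeyName="admin-only-keypair",      # keypair stays with the server admin
        UserData=user_data,                # cloud-init reads this on first boot
    )
    print(resp["Instances"][0]["InstanceId"])

cloud-init on the instance then creates the user, sets the generated password (forcing a change on first login because of expire: true), and installs the requested packages, so the credentials can be generated in Python and mailed to the developer without handing out the keypair.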
As a fledgling Django developer, I was wondering if it is customary, or indeed possible, to create a site with Django and then transfer the complete file structure to a different machine where it would "go live".
Thanks,
~Caitlin
You could use Git, Mercurial, or another version control system to put the site structure on a central server. After that you could deploy the site to multiple servers, for example with Fabric. For the deployment process you should also consider using virtualenv to isolate the project from the global Python packages and requirements.
Of course that's possible, and in fact it's the only way to "go live". You don't want to develop on your live server, do you? And it's true for any platform, not just Django.
If I understood your question correctly, you need a system to push your development code to live.
Use a version control system: Git, SVN, Mercurial, etc.
Identify environment-specific code, like settings/config files, and keep a separate instance of it for each environment.
Create a testing/staging/pre-production environment that has live or live-like data, and deploy your code there before pushing it to live.
To avoid downtime during the deployment process, a symbolic link is usually created that points to the current code folder. When a new release is to be pushed, a new folder is created with the new code; once the other dependencies are in place (like settings and database changes), the symlink is pointed to the new folder.
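A small sketch of that symlink switch (the /srv/app layout and release name are assumptions for illustration):

    # Sketch: point the "current" symlink at a freshly deployed release folder.
    # The /srv/app layout is an assumption.
    import os

    RELEASES = "/srv/app/releases"
    CURRENT = "/srv/app/current"

    def activate(release_name):
        target = os.path.join(RELEASES, release_name)
        tmp_link = CURRENT + ".new"
        # Build the new link under a temporary name, then rename it over the
        # old one; os.replace on the same filesystem is atomic, so there is no
        # moment where "current" is missing or half-updated.
        if os.path.lexists(tmp_link):
            os.remove(tmp_link)
        os.symlink(target, tmp_link)
        os.replace(tmp_link, CURRENT)

    activate("2016-03-01_1420")   # example release directory name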
I have a directory of Python programs, classes, and packages that I currently distribute to 5 servers. It seems I'm continually going to be adding more servers, and right now I'm just doing a basic rsync from my local box to the servers.
What would a better approach be for distributing code across n servers?
thanks
I use Mercurial with Fabric to deploy all the source code. Fabric's written in Python, so it'll be easy for you to get started. Updating the production service is as simple as fab production deploy, which ends up doing something like this:
Shut down all the services and put up an "Upgrade in Progress" page.
Update the source code directory.
Run all migrations.
Start up all services.
It's pretty awesome seeing this all happen automatically.
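A stripped-down fabfile along those lines might look like the following (written against the classic Fabric 1.x API; the hosts, paths, and service commands are placeholders, and the "Upgrade in Progress" page is left out for brevity):

    # Sketch of a Fabric 1.x fabfile implementing the steps above.
    # Hosts, paths and service names are placeholders.
    from fabric.api import cd, env, run, sudo, task

    @task
    def production():
        env.hosts = ["app1.example.com", "app2.example.com"]

    @task
    def deploy():
        sudo("service myapp stop")                    # 1. stop the service
        with cd("/srv/myapp"):
            run("hg pull && hg update")               # 2. update the source
            run("venv/bin/python manage.py migrate")  # 3. run the migrations
        sudo("service myapp start")                   # 4. start it back up

Running fab production deploy then applies the same sequence to every host in env.hosts.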
First, make sure to keep all code under revision control (if you're not already doing that), so that you can check out new versions of the code from a repository instead of having to copy it to the servers from your workstation.
With revision control in place you can use a tool such as Capistrano to automatically check out the code on each server without having to log in to each machine and do a manual checkout.
With such a setup, deploying a new version to all servers can be as simple as running
$ cap deploy
from your local machine.
While I also use version control to do this, another approach you might consider is to package up the source using whatever package management your host systems use (for example RPMs or dpkgs), and set up the systems to use a custom repository. Then an "apt-get upgrade" or "yum update" will update the software on the systems, and you could use something like "mussh" to run the stop/update/start commands on all the machines.
Ideally, you'd push it to a "testing" repository first, have your staging systems install it, and once that testing was signed off you could move it to the production repository.
It's very similar to the recommendations of using fabric or version control in general, just another alternative which may suit some people better.
The downside to using packages is that you're probably using version control anyway, and you do have to manage version numbers of these packages. I do this using revision tags within my version control, so I could just as easily do an "svn update" or similar on the destination systems.
In either case, you may need to consider the migration from one version to the next. If a user loads a page that contains references to other elements, and you then do the update and those elements go away, what happens? You may wish to handle this either within your deployment scripting or within your code: first push out a version with the new page that still keeps the old referenced elements, deploy that, and then remove the referenced elements in a later deploy.
In this way users won't see broken elements within the page.
I have a web service to which users upload Python scripts that are run on a server. Those scripts process files that are on the server, and I want them to be able to see only a certain hierarchy of the server's filesystem (ideally: a temporary folder into which I copy the files I want processed and the scripts).
The server will ultimately be a Linux-based one, but if a solution is also possible on Windows it would be nice to know how.
What I thought of is creating a user with restricted access to folders of the filesystem - ultimately only the folder containing the scripts and files - and launching the Python interpreter as this user.
Can someone give me a better alternative? Relying only on this makes me feel insecure; I would like a real sandboxing or virtual-filesystem feature where I could safely run untrusted code.
Either a chroot jail or a higher-order security mechanism such as SELinux can be used to restrict access to specific resources.
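A minimal sketch of that idea (the parent process must start as root to call chroot; the jail path, uid/gid, and script location are placeholders, and the jail has to contain a Python interpreter and its libraries):

    # Sketch: run an untrusted script inside a chroot jail as an unprivileged
    # user. Jail path, uid/gid and script path are placeholders.
    import os
    import subprocess

    JAIL = "/srv/sandbox/job-42"     # holds the script, its input files and
                                     # a minimal Python installation
    UID, GID = 10001, 10001          # an unprivileged account created for this

    def jail_and_drop():
        os.chroot(JAIL)              # the filesystem root becomes the jail
        os.chdir("/")
        os.setgid(GID)               # drop group privileges first,
        os.setuid(UID)               # then user privileges

    subprocess.run(["/usr/bin/python3", "/work/user_script.py"],
                   preexec_fn=jail_and_drop, timeout=60)

On its own this only restricts filesystem visibility; it does nothing about network access, memory, or CPU, which is why the other answers here suggest layering it with a VM, resource limits, or a hardened kernel.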
You are probably best to use a virtual machine like VirtualBox or VMware (perhaps even creating one per user/session).
That will allow you some control over other resources such as memory and network, as well as disk.
The only Python that I know of that has such features built in is the one on Google App Engine. That may be a workable alternative for you too.
This is inherently insecure software. By letting users upload scripts you are introducing a remote code execution vulnerability. You have more to worry about than just modified files: what's stopping the Python script from accessing the network or other resources?
To solve this problem you need to use a sandbox. To better harden the system you can use a layered security approach.
The first layer, and the most important one, is a Python sandbox: user-supplied scripts are executed within it, which gives you the fine-grained limitations you need. Then the entire Python app should run within its own dedicated chroot. I highly recommend using the grsecurity kernel patches, which improve the strength of any chroot; for instance, a grsecurity-hardened chroot cannot be broken out of unless the attacker can rip a hole into kernel land, which is very difficult to do these days. Make sure your kernel is up to date.
The end result is that you are trying to limit the resources that an attacker's script has. Layers are a proven approach to security, as long as the layers are different enough that the same attack won't break both of them. You want to isolate the script from the rest of the system as much as possible. Any resources that are shared are also paths for an attacker.
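One concrete, cheap layer to add on top of the chroot is per-process resource limits via the standard-library resource module (the specific numbers below are arbitrary examples), so even a hostile script cannot burn unlimited CPU or memory:

    # Sketch: cap CPU time, address space and process count for the untrusted
    # script before it starts. The limit values are arbitrary examples.
    import resource
    import subprocess

    def apply_limits():
        resource.setrlimit(resource.RLIMIT_CPU, (10, 10))      # 10 s of CPU time
        lim = 256 * 1024 * 1024                                # 256 MiB
        resource.setrlimit(resource.RLIMIT_AS, (lim, lim))     # address space cap
        resource.setrlimit(resource.RLIMIT_NPROC, (20, 20))    # no fork bombs

    subprocess.run(["python3", "user_script.py"],
                   preexec_fn=apply_limits, timeout=30)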