If you're building a system that has three databases (dev, testing and live) and you write a module to connect to the database, how do you make sure that a developer cannot connect to the live database?
For example, you have different database clusters which are identified by variables. How do you stop the developer from simply replacing the "dev" variable with the "live" one?
Some potential solutions:
Use an environment variable (but couldn't the developer just change the environment variable on their local machine?)
Use some sort of config file which you then replace with a script when deploying to production. This makes sense and is quite simple...but is it the way these things are normally done?
The assumptions in this question are:
The database connection module is just part of the codebase, so any developer can see it and use it, and potentially change it. (Is this in itself bad?)
It would be great to know how to approach this issue - the stack is a Python server connecting to a Cassandra DB cluster, but where the cluster changes depending on whether it's dev, testing or live.
Many thanks in advance.
The two most common solutions for this (normally used together) are:
Firewall. Production servers should have strict access rules: apart from intra-cluster communication, which might be unrestricted, all other external channels need to be pre-approved, so that, for example, only IPs assigned to the devops team or DBAs can even try to reach the machines.
Use credentials. Most frameworks support some form of application.properties/application.yaml/application.conf. Spring Boot, for example, can read a file named application.properties in the same folder as the jar and use those values to override the ones bundled into the jar. That way you can override the database user and password in the production environment, which the dev should not have access to.
So:
The dev has access through the firewall to the dev server and also has its user/password. This way they can experiment and develop with no problems. Depending on the organization, though, they might use a local database, so this may not apply.
When you go up to test/preprod/prod, the application should be configured to read the connection details from a file or from startup parameters, so that the administrator or devops team can make them environment-specific. This is also important so that you don't have the same credentials across all DBs.
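In a Python + Cassandra setup that could look roughly like the sketch below, which assumes the cassandra-driver package and a config file (the /etc/myapp/database.ini path and the section/key names are made up) that only operators can write on test/prod machines:

# Sketch: connection details come from an environment-specific file that
# developers don't have on production hosts. Path and keys are assumptions.
import configparser

from cassandra.auth import PlainTextAuthProvider
from cassandra.cluster import Cluster

def get_session(config_path="/etc/myapp/database.ini"):
    cfg = configparser.ConfigParser()
    cfg.read(config_path)
    db = cfg["cassandra"]

    auth = PlainTextAuthProvider(username=db["username"], password=db["password"])
    cluster = Cluster(db["contact_points"].split(","), auth_provider=auth)
    return cluster.connect(db["keyspace"])

The dev machine gets a file pointing at the dev cluster; on test/prod the file is written by the deploy or ops tooling and carries credentials developers never see.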
For specific info regarding authentication on Cassandra, you can start here: docs
Hope this helped,
Cheers!
Related
I'm developing an Android app and I want users to upload some images to an EC2 instance, process them with some fancy Python code, and then return the final images to the user. I'm new to servers and AWS; it's a little confusing, and there are too many things that seem related, but I'm not sure where to look. What can I do to achieve my goal? Thanks.
Long story short, I'm going to put things into a list in sequential order to make things easier for you.
Spin up an EC2 instance (instance class as per your CPU/network/memory requirements).
Install the required dependencies and publish your code to your EC2 instance (via GitHub or CI/CD; GitHub will be the more straightforward way for you as a beginner).
Run your code (a rough sketch of such an upload endpoint follows this list).
Configure security groups to allow the internet to communicate with your instance. Security groups let you define which port to open to the internet and make accessible from your Android app.
Connect to your code via the EC2 instance's public IP or the public DNS name exposed by every EC2 instance (if configured with the public IP option).
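As a rough sketch of step 3, assuming Flask is installed on the instance and that process_image stands in for your own processing code (the route, port and field name are illustrative and must match what the Android app sends):

# Minimal upload-and-process endpoint; field name "image" and port 8080
# are assumptions.
import io

from flask import Flask, request, send_file

app = Flask(__name__)

def process_image(data: bytes) -> bytes:
    # Placeholder for the "fancy python code" that processes the image.
    return data

@app.route("/process", methods=["POST"])
def process():
    uploaded = request.files["image"]
    result = process_image(uploaded.read())
    return send_file(io.BytesIO(result), mimetype="image/png")

if __name__ == "__main__":
    # Bind to 0.0.0.0 so the port you open in the security group is reachable.
    app.run(host="0.0.0.0", port=8080)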
These are the basic things you need to do for a simple DIY set-up as a beginner. This will be a good start but definitely not recommended for production usage. For production usage, you need to familiarise yourself with the concept of VPCs, Subnets, NATs, IGWs, Security Groups, Load Balancers, AutoScaling Groups, ACM for SSL certificate management and Cloudwatch for alerting and logging.
I hope this will help you to kickstart.
Note: If you are a beginner and this is just a hobby learning project then I would recommend not to get into advanced concepts like VPCs, subnets etc. and start with humble beginnings. Once you achieve that, look into more security + availability of your existing setup. Do things in smaller increments rather than everything all at once. Happy Coding.
I'm creating a pygtk app that needs a mysql connection from a remote db.
db = MySQLdb.connect("remotehost", "username", "password", "database", charset='utf8')
The app is almost complete and about to be published. The problem is that if anyone decompiles this script they can easily read the above credentials, which is a security issue. So how can I protect this code, or is there any way I can strongly compile this file?
Database connections are generally made from trusted computers inside a trusted network, for a variety of reasons:
As you've seen, the client needs to store access credentials to the DB.
Most of the time, such connections are made with no transport security (unencrypted), so any eavesdropper can observe and mangle requests/responses.
Latency in the path to the DB is usually an issue, so you want to minimize it, which means placing the client near the DB.
Violating this common practice means you'll have to deal with these problems.
It's very common to have an intermediary service using some other protocol (for example, HTTP/REST) that exposes an API which indirectly modifies the database. You keep the service on a host in your trusted computing base, and only that one host accesses the DB.
In this architecture, you can (and should) perform authentication and mandatory access control in the intermediary service. In turn, having different credentials for each client that accesses that service will help keep things secure.
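As a rough sketch of what such an intermediary service could look like in Python, assuming Flask and MySQLdb are available; the route, table, option-file path and API keys below are all made up for illustration:

# Sketch of an intermediary HTTP service: DB credentials stay on this host,
# clients only ever hold an API key for the service. All names are assumptions.
import MySQLdb
from flask import Flask, abort, jsonify, request

app = Flask(__name__)
API_KEYS = {"client-1-key", "client-2-key"}   # one key per client, revocable

def db():
    # Credentials live only on the trusted host, in a file clients never see.
    return MySQLdb.connect(read_default_file="/etc/myservice/my.cnf", charset="utf8")

@app.route("/items/<int:item_id>")
def get_item(item_id):
    if request.headers.get("X-Api-Key") not in API_KEYS:
        abort(403)
    conn = db()
    try:
        cur = conn.cursor()
        cur.execute("SELECT name, price FROM items WHERE id = %s", (item_id,))
        row = cur.fetchone()
    finally:
        conn.close()
    if row is None:
        abort(404)
    return jsonify(name=row[0], price=float(row[1]))

The pygtk client would then call this HTTP API with its own key instead of opening a MySQL connection, so the database credentials never leave the trusted host.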
If you can't rewrite your application at this point, you should follow patriciasz's suggestion of granting the least privilege possible. You may also be interested in techniques to make it harder (but not impossible) to obtain the credentials.
There is no way to protect your code (compiled or not) from the owner of the machine it runs on.
In this case, they will effectively have the same access your application's SQL user has.
There is no good way to protect your code, but you can use the read_default_file option when calling connect. The connection arguments will then be read from the file specified with read_default_file.
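For instance (the option-file path and contents below are just an illustration):

import os
import MySQLdb

# The path is an example; the option file would contain something like:
#   [client]
#   host = remotehost
#   user = username
#   password = secret
#   database = mydb
db = MySQLdb.connect(
    read_default_file=os.path.expanduser("~/.my.cnf"),
    charset="utf8",
)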
NOTE: This in no way secures your username and password, since anyone with access to the cnf file can get the information.
Build an interface between the database and the application. Only the interface will get true database access.
Give the app credentials to access the interface, and only the interface, then let the interface interact with the database. This adds a second layer to boost security and helps protect database integrity.
In the future, develop with separate logic from the start. The app does not need to access the database; it needs data from the database.
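A rough sketch of the app's side of that separation, assuming the requests library and a made-up interface URL and token:

import requests

# The app holds credentials for the interface only, never for the database.
API_URL = "https://api.example.com"    # hypothetical interface endpoint
API_TOKEN = "per-install-token"        # issued per app/client, revocable

def fetch_record(record_id):
    resp = requests.get(
        f"{API_URL}/records/{record_id}",
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()                 # the interface decides what data to expose

What data the app can see is then decided entirely by the interface, not by database permissions on the client.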
Also, as a rule of database security, avoid putting credentials on the client side. If you have n apps then n apps can access your database, and controlling access points is a big part of database logic.
Separating program logic is the real deal; credentials don't need to reside on the client's machine, just as chip said.
I'm working on a Flask application in which we will have multiple clients (10-20), each with their own configuration (for the DB, client-specific settings, etc.). Each client will have a subdomain, like www.client1.myapp.com, www.client2.myapp.com. I'm using uWSGI as the middleware and nginx as the proxy server.
There are a couple of ways to deploy this, I suppose: one would be to use application dispatching and a single instance of uWSGI; another would be to run a separate uWSGI instance for each client and have nginx forward the traffic to the right app based on subdomain. Does anyone know the pros and cons of each scenario? Just curious, how do applications like Jira handle this?
I would recommend having multiple instances, forwarded to by nginx. I'm doing something similar with a PHP application, and it works very well.
The reason for, and benefit of, doing it this way is that you can keep everything completely separate, and if one client's setup goes screwy, you can re-instance it and there's no problem for anyone else. Also, no user, even if they manage to break the application-level security, can access any other user's data.
I keep all clients on their own databases (one MySQL instance, multiple DBs), so I can do a complete sqldump (if using MySQL, etc.), or, for another application which uses SQLite rather than MySQL, copy the .sqlite database completely for backup.
Going this way means you can also easily set up a 'test' version of a client's site as well as a live one. Then you can swap which one is actually live just by changing your nginx settings. Say, for upgrades, you can upgrade the testing one, check it's all OK, then swap. (Also, for some applications, the client may like having their own 'testing' version, which they can break to their heart's content, knowing they (or you) can re-instance it in moments without harming their 'real' data.)
Going with application dispatching, you cannot easily get nginx to serve separate client upload directories without having a separate nginx config per client (and if you're doing that, then why not go for individual uWSGI instances anyway?). Likewise for individual SSL certificates (if you want them...).
Each subdomain (or separate domain entirely for some) has its own logging, so if a certain client is being DoS'd or otherwise hacked, it's easy to see which one.
You can set up filesystem-level size quotas per user, so that if one client starts uploading GBs of video, your server doesn't get filled up.
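For contrast, the "application dispatching" route from the question would mean a single WSGI entry point choosing a per-client app by subdomain, roughly like this sketch (create_app and the client names are assumptions):

# One uWSGI process, one callable, many Flask apps keyed by subdomain.
from werkzeug.wrappers import Response

def make_dispatcher(create_app, client_names):
    apps = {name: create_app(name) for name in client_names}  # one app per client

    def dispatch(environ, start_response):
        host = environ.get("HTTP_HOST", "").split(":")[0]
        subdomain = host.split(".")[0]            # client1.myapp.com -> client1
        app = apps.get(subdomain)
        if app is None:
            return Response("Unknown client", status=404)(environ, start_response)
        return app(environ, start_response)

    return dispatch

# uWSGI would point at a single module-level callable, e.g.:
# application = make_dispatcher(create_app, ["client1", "client2"])

Everything above about per-client logs, quotas and OS users is harder to achieve with this single-process layout, which is why separate instances are recommended here.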
The way I'm working is to use ansible to provision and set up the server how I want it, with the client-specific details kept in a separate host_vars file. So my inventory is:
[servers]
myapp.com #or whatever...
[application_clients]
apples
pears
elephants
[application_clients:vars]
ansible_address=myapp.com
then host_vars/apples:
url=apples.myapp.com
db_user=apples
db_pass=secret
Then in the provisioning, I set up two new users & one group for each client. For instance: apples and web.apples as the two users, and the group simply apples (which both are in).
This way, all the application code is owned by the apples user, but the PHP-FPM instance (or uWSGI instance in your case) is run by web.apples. The permissions on all the code are rwXr-X---, and the permissions on the uploads & static directories are rwXrwXr-X. Nginx runs as its own user, so it can access ONLY the upload/static directories, which it can serve as straight files. Any private files which you want served by the uWSGI app can be set up that way easily. The web user can read the code and execute it, but cannot edit it. The actual user itself can read and write the code, but isn't normally used except for updates, installing plugins, etc.
I can give a client an SFTP user which is chroot'd to their uploads directory if they want to upload outside of the application interface.
Using ansible, or another provisioning system, means there's very little work needed to create a new client setup, and if a client (for whatever reason) wants to move over to their own server, it's just a couple of lines to change in the provisioning details, and re-run the scripts. It also means I can keep a development server installed with the exact same provisioning as the main server, and also I can keep a backup amazon instance on standby which is ready to take over if ever I need it to.
I realise this doesn't exactly answer your question about the pros and cons of each way, but it may be helpful anyway. Multiple instances of uWSGI or any other WSGI server (mainly I use waitress, but there are plenty of good ones) are very simple to set up and, if done logically with a good provisioning system, easy to administer.
I'm trying to write a script which will monitor packets (using pypcap) and redirect certain URLs/IPs to something I choose. I know I could just edit the hosts file, but that won't work because I'm not an admin.
I'm thinking that CGI might be useful, but this one has really got me confused.
EDIT:
Sorry if it sounded malicious or like a MITM attack. The reason I need this is that I have an (old) application which grabs a page from a site, but the domain has changed recently, so it no longer works. I didn't write the application, so I can't just change the domain it accesses.
I basically need to accomplish what can be done by editing the hosts file without having access to it.
pypcap needs administrative rights, so this is not an option.
And you don't have access to the PC's internals, to the source code, or to the webserver.
There are a few options left:
Modify the hostname in the application's files with a hex editor and disassembler.
Modify the loaded application in memory with Cheat Engine and other memory tools.
Start the application in a virtual environment which can modify OS API calls. A modified Wine might be able to do this.
Modify the requests between the PC and the webserver with a (transparent) proxy / modified router.
If the application supports the use of proxies, the easiest solution might be to set up a local Squid with a redirector.
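If the application does honor an HTTP proxy setting, even a tiny local rewriting proxy can stand in for Squid; the sketch below handles plain-HTTP GET only, and the hostnames and port are assumptions:

# Minimal local rewriting proxy: requests for the old domain are fetched from
# the new one instead. Plain HTTP GET only; hostnames/port are placeholders.
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

OLD_HOST = "old.example.com"
NEW_HOST = "new.example.com"

class Redirector(BaseHTTPRequestHandler):
    def do_GET(self):
        # A proxy receives absolute URLs, e.g. http://old.example.com/page
        url = self.path.replace(OLD_HOST, NEW_HOST)
        with urlopen(url) as upstream:
            body = upstream.read()
            status = upstream.status
            content_type = upstream.headers.get("Content-Type", "application/octet-stream")
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# Point the application at 127.0.0.1:8080 as its HTTP proxy.
HTTPServer(("127.0.0.1", 8080), Redirector).serve_forever()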
I wanted to know if there is a way I can get my Python script, located on a shared web hosting provider, to read a folder on my desktop and list out its contents.
Can this be done using tempfiles?
Server-side web scripts have no access to the client other than through requests. If you can somehow break through the browser's protection settings to get JavaScript, Java, or Flash to read the contents of the client then you stand a fighting chance. But doing so will make many people angry and is generally considered a bad idea.
Unless your desktop computer has a public, accessible IP, neither your app running on a shared web hosting provider, nor any other app and host on the internet, can get information from your desktop computer. Does your desktop computer fall within the tiny minority that does have such a public, accessible IP?
If not, and if you're willing to run the obvious risks involved of course, you can try turning the (probably dynamically assigned) IP address that your ISP gives you into a resolvable domain name, by working with such DNS providers as DynDNS -- it can be done for free.
Once you're past the hurdle of public accessibility, you need to run on your computer some server that can respond to properly authenticated requests by supplying the information you desire. For example, you could run a web server such as Apache (which is powerful indeed but perhaps a bit hard for you to set up), or the like -- and a custom app on top of it to check authentication and provide the specific information you want to make available.
If you have no privacy worry (i.e., you don't mind that any hacker in the world can look at that folder's contents), you can skip the authentication, which is the really delicate and potentially fragile part (given that there's really no way for your app, running on a shared web hosting provider, to hold "secrets" very effectively).
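As a rough sketch of that "custom app" idea, with a shared token standing in for real authentication (the folder path, port and token are all assumptions):

# Tiny local server that lists one folder when the request carries a shared
# secret. Not production-grade auth: anyone who learns the token gets the list.
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

FOLDER = os.path.expanduser("~/Desktop/shared")   # folder to expose
TOKEN = "change-me"                               # shared secret

class ListingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.headers.get("X-Token") != TOKEN:
            self.send_error(403)
            return
        body = json.dumps(sorted(os.listdir(FOLDER))).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

HTTPServer(("0.0.0.0", 8000), ListingHandler).serve_forever()

The script on the shared host would then request http://your-dyndns-name:8000/ with the matching X-Token header.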
If you can clarify each of these issues, then we can help pinpoint the best approach (what to install and how on both your desktop computer, and that shared web hosting provider).