I'm pretty new to BDD and Lettuce, and I came across an issue which I'm not sure how best to handle.
I want to create a Lettuce test suite which I can then run against different environments, where some parameters in the scenario would be different for each environment.
So following the Lettuce documentation I have this example scenario:
Scenario: Create correct config
Given I have IP "127.0.0.0:8000"
And I specify username "myuser" and password "mypassword"
When I connect to the server
Then I get return code 200
In this case I would have to change the IP, user and password for each environment. But this is not practical, and I want to be able to have a config file for each environment containing the values for these parameters.
I found out about terrain.py and saw that you can set variables in this file which you can access from your steps.py using world.
So it would be possible to re-word the scenario like this:
Scenario: Create correct config
Given I have a correct IP
And I specify correct credentials
When I connect to the server
Then I get return code 200
Now in the step definition for "I have a correct IP" you can use world.correctIP, which will be defined in terrain.py.
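For example, something like this (a minimal sketch; the variable names are my own, just to illustrate):
# terrain.py -- Lettuce loads this before the scenarios run
from lettuce import world

# these values would come from an environment-specific config file
world.correctIP = "127.0.0.0:8000"
world.username = "myuser"
world.password = "mypassword"
And then in the step definition:
# steps.py
from lettuce import step, world

@step(u'I have a correct IP')
def have_correct_ip(step):
    world.ip = world.correctIP  # pick up the environment-specific value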
This would work in the way I need it to, but I'm not convinced this is the correct way to do it, or whether terrain.py was intended to be used like this... or is there a different way to handle this situation?
I would say that hiding the implementation details is a good approach. That is, "I have a correct IP" is the better way to go, with the actual value kept in a property file.
BDD is all about communication. If it is enough to know that you use the correct IP, then there is no need to know which IP when you read the example.
I'm currently using the azure-cosmos module in Python to connect to a database on Azure. I want to fetch the data, make a few transformations, and then push it to a new container.
You need the key and client ID to connect to the database, which I've used as variables in my code for now, as follows:
url = 'https://xyz.azure.com:443/'
key = 'randomlettersandnumbers=='
client = CosmosClient(url, credential=key)
This intuitively seems like bad practice, and especially once I push this to Git, anyone could gain access to my database. So what's the most secure way to do this?
I'm coming from a non-SWE background, so apologies if this question is dumb.
Thanks!
The way I deal with this kind of problem is by using environment variables:
import os

# read the connection settings from the environment
url = os.environ.get("COSMOS_URL")
key = os.environ.get("COSMOS_KEY")
client = CosmosClient(url, credential=key)
You can set them in your SSH shell like this (note that shell variable names cannot contain hyphens, so names like url-endpoint won't work; use underscores or uppercase names instead):
export COSMOS_URL="https://xyz.azure.com:443/"
export COSMOS_KEY="randomlettersandnumbers=="
Or you can put them in a bash script, envs.sh:
export COSMOS_URL="https://xyz.azure.com:443/"
export COSMOS_KEY="randomlettersandnumbers=="
And then load them with the source command:
source envs.sh
There's a good article about storing sensitive data using environment variables here.
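If you'd rather keep the values in a file that never gets committed, the python-dotenv package (a third-party library, my suggestion rather than anything azure-cosmos requires) loads a .env file into the environment. A .env file next to your code, listed in .gitignore:
COSMOS_URL=https://xyz.azure.com:443/
COSMOS_KEY=randomlettersandnumbers==
And then in Python:
import os
from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads .env and populates os.environ
url = os.environ.get("COSMOS_URL")
key = os.environ.get("COSMOS_KEY")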
For a new machine learning course we're looking to design a series of coding assignments in which students get some starter code, and make improvements until the unit tests pass. Then they commit and push their code back to the remote where an autograding workflow runs more tests to see if they did adequate work.
What we'd like to do would be to give the students some tests that they can look into, to see what the general programming goal is; but to also have a secret unit test to try their code on data that the students have never seen. On this unseen test data they'd have to reach at least a certain accuracy score to get a passing grade.
The question is: can this be done in GitHub Classroom? It seems that the default setup is to give all the tests openly in the starter code repository. But we want to have some tests that the students can't see, so that we can check whether they're merely writing to the visible tests or actually writing a properly generic solution.
If this isn't directly possible, is there a workaround strategy?
No idea if this could or would work, but maybe try the top answer from here:
"
GitHubPages (like Bitbucket Pages and GitLab Pages) only serve static pages, so the only solution is something client side (Javascript).
A solution could be, instead of using real authentication, just to share only a secret (password) with all the authorized persons and implement one of the following scheme:
put all the private files in a (not listed) subdirectory and name that with the hash of the chosen password. The index page asks you (with Javascript) for the password and build the correct start link calculating the hash.
See for example: https://github.com/matteobrusa/Password-protection-for-static-pages
PRO: Very simple approach protecting a whole subdirectory tree
CONS:
possible attack: sniffing the following requests to obtain the name of the subdirectory
the admins on the hosting site have access to the full contents
crypt the page with password and decrypt on the fly with javascript
see for example: https://github.com/robinmoisson/staticrypt
PRO: no plaintext page code around (decrypting happens on the client side)
CONS:
just a single page, and need to reinsert the password on every refresh
an admin could change your Javascript code to obtain the password when you insert it"
The context is testing a web app with Selenium while using a number of virtual user accounts we created for this very purpose. And so the testing process needs to access our sites and log on with the virtual users' IDs and passwords.
None of these accounts are critical and they are flagged as testing accounts so no damage can be done. Still, it would probably be a good idea to encrypt the passwords and decrypt them prior to use.
If it matters, our test app is written in Python and Django and uses PostgreSQL for the database. It runs on a small Linode instance.
What might best practices be for something like this?
EDIT 1
The other thought I had was to store the credentials on a second machine and access them through an API, while only allowing that access from a known server's non-public IP. In other words, get two instances at Linode and create a private machine-to-machine connection within the data center.
In this scenario, access to the first machine would allow someone to potentially make requests to the second machine if they are able to de-obfuscate the API code. If someone really wants the data they can certainly get it.
We could add two factor authentication as a way to gate the tests. In other words, even if you had our unencrypted test_users table you couldn't do anything with them because of the 2FA mechanism in place just for these users.
Being that this is for testing purposes only, I am starting to think the best solution might very well be to populate the test_users table with valid passwords only while running a test. We could keep the data safe elsewhere and have a script that uploads it to the test server when we want to run a test suite. Someone with access to this table could not do anything with it, because all the passwords would be invalid. In fact, we could probably use this to detect such a breach.
I just hate the idea of storing unencrypted passwords even if it is for test users that can't really do any damage to the actual app (their transactions being virtual).
EDIT 2
An improvement to that would be to go ahead and encrypt the data and keep it on the test server. However, every time the tests are run the system would reach out to us for the crypto key. And, perhaps, after the test is run the data would be re-encrypted with a new key. A little convoluted, but it would allow for encrypted passwords (and even user IDs, just to make it harder) on the test server. The all-important key would be nowhere near the server, and it would self-destruct after each use.
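A sketch of that cycle, using the Fernet recipe from the cryptography package (one possible choice; the key-fetching helper here is hypothetical):
from cryptography.fernet import Fernet  # pip install cryptography

# before a test run: fetch the key from the separate, trusted machine
key = fetch_key_from_trusted_host()  # hypothetical helper, not a real API
f = Fernet(key)
password = f.decrypt(encrypted_password_from_db).decode('utf8')

# ... run the Selenium tests with the decrypted credentials ...

# afterwards: re-encrypt under a fresh key and discard the old one
new_key = Fernet.generate_key()
new_token = Fernet(new_key).encrypt(password.encode('utf8'))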
What is generally done in a case like this is to put the password through a cryptographic hash function, and store the hashed password.
To verify a login, hash the provided password and compare the calculated hash to the stored version.
The idea behind this is that it is considered impossible to reverse a good cryptographic hash function. So it doesn't matter if an attacker could read the hashed passwords.
Example in Python3:
In [1]: import hashlib
In [2]: hashlib.sha256('This is a test'.encode('utf8')).hexdigest()
Out[2]: 'c7be1ed902fb8dd4d48997c6452f5d7e509fbcdbe2808b16bcf4edce4c07d14e'
In [3]: hashlib.sha256('This is a tist'.encode('utf8')).hexdigest()
Out[3]: 'f80b4162fc28f1f67d1a566da60c6c5c165838a209e89f590986333d62162cba'
In [4]: hashlib.sha256('This is a tst.'.encode('utf8')).hexdigest()
Out[4]: '1133d07c24ef5f46196ff70026b68c4fa703d25a9f12405ff5384044db4e2adf'
(for Python2, just leave out the encode.)
As you can see, even one-letter changes lead to a big change in the hash value.
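The compare step then looks something like this (a minimal sketch using hashlib, to match the example above):
import hashlib
import hmac

def hash_password(password):
    return hashlib.sha256(password.encode('utf8')).hexdigest()

def verify_login(provided_password, stored_hash):
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(hash_password(provided_password), stored_hash)

stored = hash_password('This is a test')
print(verify_login('This is a test', stored))   # True
print(verify_login('This is a tist', stored))   # False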
I'm not seeing much documentation on this. I'm trying to get an image uploaded onto the server from a URL. Ideally I'd like to keep things simple, but I'm in two minds as to whether using an ImageField is the best way, or whether it's simpler to store the file on the server and display it as a static file. I'm not uploading any files, so I need to fetch them in. Can anyone suggest any decent code examples before I try to re-invent the wheel?
Given a URL, say http://www.xyx.com/image.jpg, I'd like to download that image to the server and put it into a suitable location after renaming it. My question is general, as I'm looking for examples of what people have already done. So far I only see examples relating to uploading images, but that doesn't apply. This should be a simple case, and I'm looking for a canonical example that might help.
This is for uploading an image from the user: Django: Image Upload to the Server
So are there any examples out there that just deal with fetching an image and storing it on the server, with or without an ImageField?
Well, just fetching an image and storing it in a file is straightforward:
import urllib2

# open in binary mode, since image data is not text
with open('/path/to/storage/' + make_a_unique_name(), 'wb') as f:
    f.write(urllib2.urlopen(your_url).read())
Then you need to configure your Web server to serve files from that directory.
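If you'd rather go through an ImageField, the same bytes can be handed to Django's storage layer. A sketch, assuming a model Photo with an ImageField named image (both names are made up for the example):
from django.core.files.base import ContentFile

data = urllib2.urlopen(your_url).read()
photo = Photo()
# saves the file through the field's storage and persists the model
photo.image.save(make_a_unique_name(), ContentFile(data), save=True)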
But this comes with security risks.
A malicious user could come along and type a URL that points nowhere. Or that points to their own evil server, which accepts your connection but never responds. This would be a typical denial of service attack.
A naive fix could be:
urllib2.urlopen(your_url, timeout=5)
But then the adversary could build a server that accepts a connection and writes out a line every second indefinitely, never stopping. The timeout doesn’t cover that.
So a proper solution is to run a task queue, also with timeouts, and a carefully chosen number of workers, all strictly independent of your Web-facing processes.
Another kind of attack is to point your server at something private. Suppose, for the sake of example, that you have an internal admin site that is running on port 8000, and it is not accessible to the outside world, but it is accessible to your own processes. Then I could type http://localhost:8000/path/to/secret/stats.png and see all your valuable secret graphs, or even modify something. This is known as server-side request forgery or SSRF, and it’s not trivial to defend against. You can try parsing the URL and checking the hostname against a blacklist, or explicitly resolving the hostname and making sure it doesn’t point to any of your machines or networks (including 127.0.0.0/8).
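A sketch of the resolve-and-check idea (Python 3 here, for the ipaddress module; the set of rejected ranges is a starting point, not a complete defense):
import socket
import ipaddress
from urllib.parse import urlparse

def is_url_safe(url):
    hostname = urlparse(url).hostname
    if hostname is None:
        return False
    # resolve the hostname and inspect every address it can point to
    for info in socket.getaddrinfo(hostname, None):
        ip = ipaddress.ip_address(info[4][0])
        if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
            return False
    return True
Note that the result can go stale between the check and the actual request (DNS rebinding), which is part of why SSRF is not trivial to defend against.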
Then of course, there is the problem of validating that the file you receive is actually an image, not an HTML file or a Windows executable. But this is common to the upload scenario as well.
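For that check, one common approach is to let Pillow try to parse the bytes (assuming Pillow is installed; verify() is a cheap integrity check, not a full decode):
import io
from PIL import Image  # pip install Pillow

def looks_like_an_image(data):
    try:
        Image.open(io.BytesIO(data)).verify()
        return True
    except Exception:
        return False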
I've been asked to set up an FTP server using Python that different users can log in to, and depending on their login it will display a different file structure.
Part of the structure will be read-only, and another part will allow read, write, create and delete.
The file structure and files won't exist on the server, and will have to be built in a lazy way as the user expands folders by querying external servers.
The server needs to, I guess, mimic the FTP interface/protocol from the outside, but work completely differently internally.
I was wondering how big or difficult a job this would be as I need to provide some type of time scale for getting this working.
Is there anything like this out there already? Has anyone done something similar before?
Are there any obvious problems of trying to implement this kind of model?
The Twisted project would be the obvious place to start; the following example starts a simple FTP server that authenticates users against a password file, but also allows anonymous access:
from twisted.protocols.ftp import FTPFactory, FTPRealm
from twisted.cred.portal import Portal
from twisted.cred.checkers import AllowAnonymousAccess, FilePasswordDB
from twisted.internet import reactor

# the realm maps authenticated users to filesystem views; credentials
# are checked against pass.dat, and anonymous logins are also allowed
p = Portal(FTPRealm('./'),
           [AllowAnonymousAccess(), FilePasswordDB("pass.dat")])
f = FTPFactory(p)
reactor.listenTCP(21, f)
reactor.run()
You can easily expand from there. How you implement 'files' and 'directories' is completely up to you.
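For instance, a sketch of a per-user view (the directory layout here is made up; a lazily built virtual filesystem would implement IFTPShell itself instead of reusing the disk-backed FTPShell):
from zope.interface import implementer
from twisted.cred.portal import IRealm
from twisted.protocols.ftp import FTPShell, IFTPShell
from twisted.python.filepath import FilePath

@implementer(IRealm)
class PerUserRealm(object):
    def requestAvatar(self, avatarId, mind, *interfaces):
        if IFTPShell not in interfaces:
            raise NotImplementedError("only FTP access is supported")
        # each authenticated user is rooted in their own subtree
        root = FilePath('/srv/ftp').child(avatarId)
        return IFTPShell, FTPShell(root), lambda: None
Pass PerUserRealm() to the Portal instead of FTPRealm('./') and each login gets its own root.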
Why Python? I mean, what does Python have to do with it? I'd look for a PAM module able to create a user-specific virtual filesystem structure on login, and if there's no ready-made one, consider modifying something like pam_mount:
http://pam-mount.sourceforge.net