How to pass a failed testsuite run (junit.xml parsing)? - python

We are working in an automated continuous integration/deployment pipeline. We have hundreds of test cases and 4 stages (Red, Orange, Yellow, Green).
The issue I'm facing is that a single test can fail (bug, timing, stuck process, etc.) and that failure fails the entire regression run.
I think we need some sort of weighting to determine how many passed/failed tests should count as a 'failed' build.
Any ideas? Is this something you have built into your own pipeline?
Thanks,
-M

A failed build does not always reflect the quality of the product, especially if the failures are caused by testing-infrastructure issues.
Reduce the risk of unwanted failures that are not related to application bugs (timing, stuck processes) by building a strong, stable framework that can be easily maintained and extended.
When it comes to failures caused by application bugs, the type of tests that failed matters more than the number of failures. This is defect severity. You can have 3 trivial failures with little impact, and you can also have a single failure that is critical. You need to flag your tests accordingly.
In addition, there is a Jenkins plugin that creates an easy-to-follow test run history, where you can see which tests have failed most often in recent runs.
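As a rough illustration of the weighting idea, the sketch below parses a junit.xml report and fails the build only when a weighted failure score crosses a threshold. The severity-in-the-test-name convention, the weights, and the threshold are all hypothetical placeholders; adapt them to however your tests are actually flagged.

import sys
import xml.etree.ElementTree as ET

# Hypothetical convention: severity is encoded in the test name,
# e.g. test_login_critical, test_tooltip_trivial.
WEIGHTS = {"critical": 10, "major": 3, "trivial": 1}
THRESHOLD = 10  # mark the build as failed once the score reaches this

def weighted_failures(junit_path):
    score = 0
    root = ET.parse(junit_path).getroot()
    # junit.xml nests <testcase> elements inside one or more <testsuite> elements
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            severity = next((s for s in WEIGHTS if s in case.get("name", "")), "major")
            score += WEIGHTS[severity]
    return score

if __name__ == "__main__":
    score = weighted_failures(sys.argv[1] if len(sys.argv) > 1 else "junit.xml")
    print("weighted failure score:", score)
    sys.exit(1 if score >= THRESHOLD else 0)

Your CI stage can then call a script like this after the test run and use its exit code as the build verdict instead of the raw junit result.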

Related

Continuously gather metrics in the background of a pytest run

I'm creating some automated tests for my app using pytest. Every test consists of some test actions and an assertion - nothing special. However, some of these tests are intentionally a bit disruptive. I would like to gather some metrics from different resources of my app and from my test environment while the tests are running, and put those metrics into a log file. I'm not interested in failing the tests based on these metrics; I just want them to help me understand my system better.
I'm thinking about writing a script that gathers the information I want and creating a fixture that runs it in the background of my tests using subprocess.Popen(). I have also thought about writing a function to gather the data and running it in parallel to my test code using multiprocessing. I don't know if there are other options.
I would like to know if there is a standard, simple way to do this. I want to avoid unnecessary complexity at all costs.
Thanks!
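A minimal sketch of the subprocess.Popen fixture idea from the question, assuming a hypothetical gather_metrics.py script that appends readings to a log file until it is terminated:

# conftest.py (sketch)
import subprocess
import sys

import pytest

@pytest.fixture(autouse=True, scope="session")
def background_metrics():
    # Start the (hypothetical) collector once for the whole test session.
    proc = subprocess.Popen([sys.executable, "gather_metrics.py", "--out", "metrics.log"])
    yield
    # Stop it when the session ends.
    proc.terminate()
    proc.wait(timeout=10)

A session-scoped autouse fixture keeps the collector entirely out of the individual tests; a multiprocessing.Process started and joined in the same fixture would work much the same way.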

Test cases for Redhat OpenStack?

I am working on a RedHat OpenStack project and I need to know good test cases for reliability, performance, and functional testing of RedHat OpenStack. I have already looked at Tempest, but I'm asking whether there are any other tests I can follow.
I realize that you mention that you've already looked at Tempest, but I would strongly encourage you to take a second look. I understand that the documentation is a little underwhelming and tailoring the tempest configuration to your deployment can be a significant time investment. Beyond its documentation, it's a well-maintained OpenStack project and running sanity checks doesn't take too long to configure. The results can be truly revealing.
Create a tempest workspace and conduct sanity checks with --smoke or -s
Create a workspace with tempest init myworkspace. This will create the directory structure for you based on what exists in /etc/tempest. If you've already configured your /etc/tempest, you're a step ahead; otherwise, you'll need to configure your myworkspace/etc/tempest.conf before running any tests.
Once your workspace is configured for your deployment, execute tempest run --smoke from the workspace directory. This will execute ~100 smoke tests for basic cloud functionality and sanity testing. With my modest deployment, this doesn't take more than 3-5 minutes to get some worthwhile results.
Results from --subunit
Continuing with the myworkspace directory, running your smoke tests with the --subunit flag (tempest run --smoke --subunit) produces the html-exportable subunit docs at myworkspace/.stestr/$iteration, where $iteration is a 0-indexed count of the tempest run invocations you've executed.
For example, after your first iteration, run subunit2html .stestr/0 to generate a well-formatted results.html for your review.
Beyond Smoketesting
If you start here and iterate, I think it naturally progresses into running the full gamut of tests. The workflow is a bit different from the smoke testing:
Generally, start with tempest cleanup --init-saved-state, which records a pre-test state of your cloud: a snapshot of the resources you do not want cleaned up afterwards. The state is stored in saved_state.json.
Run your tests with options tailored to your deployment; in the most basic case, tempest run.
After analyzing your results, a run of tempest cleanup will destroy resources that do not exist in the saved_state.json file.
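If you end up repeating that workflow often, it can be scripted. The sketch below simply shells out to the tempest commands described above; the workspace path is a placeholder, and it assumes tempest and subunit2html are on the PATH and that the workspace's etc/tempest.conf is already configured.

import subprocess
from pathlib import Path

WORKSPACE = Path("myworkspace")  # placeholder: an already-initialized, configured workspace

def run(cmd):
    # Run a command inside the workspace and stop on the first failure.
    print("+", " ".join(cmd))
    subprocess.run(cmd, cwd=WORKSPACE, check=True)

# Record the resources that existed before testing (saved_state.json).
run(["tempest", "cleanup", "--init-saved-state"])

# Smoke tests with subunit output for later HTML export.
run(["tempest", "run", "--smoke", "--subunit"])

# Export the most recent .stestr iteration to results.html.
latest = max(int(p.name) for p in (WORKSPACE / ".stestr").iterdir() if p.name.isdigit())
run(["subunit2html", ".stestr/{}".format(latest), "results.html"])

# Destroy anything created during the run that is not in saved_state.json.
run(["tempest", "cleanup"])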

how to handle time-consuming tests

I have tests which have a huge variance in their runtime. Most will take much less than a second, some maybe a few seconds, some of them could take up to minutes.
Can I somehow specify that in my Nosetests?
In the end, I want to be able to run only the subset of my tests that take, e.g., less than 1 second (based on an expected-runtime estimate I specify).
Have a look at this write-up about the attribute plugin for nose tests, which lets you manually tag tests with @attr('slow') and @attr('fast'). You can then run nosetests -a '!slow' to run only your quick tests.
It would be great if you could do this automatically, but I'm afraid you would have to write additional code to do it on the fly. If you are into rapid development, I would run nose with xunit XML output enabled (which records the runtime of each test). Your test module can then dynamically read the XML output from previous runs and set the attributes on your tests accordingly to filter out the quick tests. That way you don't have to do it manually, albeit with more work up front (and you have to run all tests at least once).
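As a variant of the idea above, instead of setting attributes on the fly you can use a previous run's xunit XML (nosetests --with-xunit) to build a list of selectors for the fast tests. This is only a rough sketch; in particular, the classname-to-selector mapping may need adjusting for your project layout.

# select_fast.py - print selectors for tests that previously ran in under 1 second.
import xml.etree.ElementTree as ET

THRESHOLD = 1.0  # seconds

root = ET.parse("nosetests.xml").getroot()
for case in root.iter("testcase"):
    if float(case.get("time", "0")) < THRESHOLD:
        # classname is typically "package.module" or "package.module.TestClass";
        # nose expects selectors like module:TestClass.test_method, so tweak as needed.
        print("{}:{}".format(case.get("classname"), case.get("name")))

You could then run something like nosetests $(python select_fast.py) to execute only the quick tests.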

django test coverage with black box testing?

We are testing a Django application with a black-box (functional integration) testing approach, where a client performs tests with REST API calls to the Django application. The client is running on a different VM, so we cannot use the typical coverage.py (I think).
Is there a way to compute the coverage of these black box tests? Can I somehow instruct Django to start and stop in test coverage mode and then report test coverage?
Coverage for functional integration tests is really a different layer of abstraction from unit-test coverage, which counts the lines of code executed. In a true black-box test you likely care more about coverage of use cases.
But if you are looking for code coverage anyway (and there are certainly reasons why you might want it), it looks like you should be able to use coverage.py as long as you have access to the server to set up the test scenarios. You will need to implement a way to end the Django process cleanly so that coverage.py can write the coverage report.
From https://coverage.readthedocs.io/en/coverage-4.3.4/howitworks.html#execution:
"At the end of execution, coverage.py writes the data it collected to a data file."
This indicates that the Python process must come to completion naturally. Killing the process manually would also take out the coverage.py wrapper, preventing the write.
One idea for ending Django: a command that stops the Django process via sys.exit().
See:
https://docs.djangoproject.com/en/1.10/topics/testing/advanced/#integration-with-coverage-py
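One possible arrangement, assuming you can edit the server's entry point: start coverage programmatically when the Django process starts and write the data when the process exits normally. The source module below is a placeholder, and a graceful shutdown (e.g. SIGINT on the dev server, or the sys.exit() idea above) is what allows the atexit hook to run; a hard kill skips it.

# added to manage.py or wsgi.py (sketch)
import atexit
import coverage

cov = coverage.Coverage(data_file=".coverage.blackbox", source=["myproject"])  # placeholder source
cov.start()

def _write_coverage():
    # Runs on normal interpreter exit; a killed process never gets here.
    cov.stop()
    cov.save()

atexit.register(_write_coverage)

Once the black-box run from the client VM is finished and the server has shut down, running coverage report or coverage html on the server produces the usual line-coverage report.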

Making Nose fail slow tests

I want my tests to fail if they take longer than a certain time to run (say 500ms), because it sucks when a load of slightly slow tests mounts up and suddenly you have a big delay every time you run the test suite. Are there any plugins or anything for Nose that do this already?
For cases where timing is important (e.g. real-time requirements):
http://nose.readthedocs.org/en/latest/testing_tools.html
nose.tools.timed(limit)
Test must finish within specified time limit to pass.
Example use:
from nose.tools import timed
import time

@timed(.1)
def test_that_fails():
    time.sleep(.2)
I respectfully suggest that changing the meaning of "broken" is a bad idea.
The meaning of a failed/"red" test should never be anything other than "this functionality is broken". To do anything else risks diluting the value of the tests.
If you implement this and then next week a handful of tests fail, would it be an indicator that
Your tests are running slowly?
The code is broken?
Both of the above at the same time?
I suggest it would be better to gather management information (MI) from your build process and monitor it in order to spot slow tests building up, but let red mean "broken functionality" rather than "broken functionality and/or slow test."
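If you go the monitoring route, the xunit XML that nose already produces (nosetests --with-xunit) records per-test timings, so a small report like this sketch can flag tests creeping past a budget without turning the build red:

# report_slow.py - list tests whose recorded runtime exceeds a budget.
import xml.etree.ElementTree as ET

BUDGET = 0.5  # seconds, matching the 500ms figure in the question

root = ET.parse("nosetests.xml").getroot()
slow = [(float(c.get("time", "0")), c.get("classname"), c.get("name"))
        for c in root.iter("testcase")
        if float(c.get("time", "0")) > BUDGET]

for seconds, classname, name in sorted(slow, reverse=True):
    print("{:6.2f}s  {}.{}".format(seconds, classname, name))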
