My team has puzzled over this issue on and off for weeks now. We have a test suite using LiveServerTestCase that runs all the Selenium-based tests that we have. One test in particular will seemingly fail at random: I could change a comment in a different file and the test would fail, and changing some other comment would fix it again. We are using the Firefox webdriver for the Selenium tests:
self.driver = Firefox()
Testing locally inside our Docker container never reproduces the error. This is most likely because when tests.py is run outside of Travis CI, a different web driver than Firefox() is used. The web driver instead is:
self.driver = WebDriver("http://selenium:4444/wd/hub", desired_capabilities={'browserName':'firefox'})
For local testing, we use a Selenium container.
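The selection between the two drivers is roughly along these lines (a simplified sketch; the TRAVIS environment-variable check is an assumption for illustration, not our exact condition):

import os

from selenium.webdriver import Firefox, Remote

if os.environ.get("TRAVIS"):
    # On Travis CI we drive a local Firefox directly
    driver = Firefox()
else:
    # Locally we talk to the Selenium container over the remote WebDriver protocol
    driver = Remote(
        command_executor="http://selenium:4444/wd/hub",
        desired_capabilities={"browserName": "firefox"},
    )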
The test that fails is a series of sub-tests, each of which exercises a filtering search feature that we have; each sub-test uses a different filter query. The sequence of each sub-test is as follows (a rough sketch of one sub-test appears after the list):
Find the filter search bar element
Send the filter query (a string, e.g. something like "function = int main()")
Simulate the browser click to execute the query
Assert that the number of returned results matches what is expected for that specific filter (the set of data is consistent throughout the sub-tests)
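In rough outline, each sub-test does something like this (a simplified sketch; the page URL, element locators, and results selector are placeholders, not our real markup):

import time

from selenium.webdriver.common.by import By

# Simplified sketch of one sub-test; locators and the expected count are placeholders.
def run_filter_subtest(self, query, expected_count):
    self.driver.get(self.live_server_url + "/search/")                # hypothetical page
    search_bar = self.driver.find_element(By.ID, "filter-search")     # hypothetical element id
    search_bar.clear()
    search_bar.send_keys(query)                                       # e.g. 'function = int main()'
    self.driver.find_element(By.ID, "filter-submit").click()          # simulate the click
    time.sleep(2)                                                     # the sleep mentioned below
    results = self.driver.find_elements(By.CLASS_NAME, "result-row")  # hypothetical row class
    self.assertEqual(len(results), expected_count)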
Very often this test will pass when run in Travis CI, and as noted before, this test always passes when run locally. The error cannot be reproduced when interacting with the site manually in a web browser. However, once in a while, this sort of error will appear in the test output in Travis CI:
- Broken pipe from ('127.0.0.1', 39000)
- Broken pipe from ('127.0.0.1', 39313)
The numbers (39000 and 39313 here) change every time a new Travis CI build is run. They look like port numbers, though I'm not really sure what they actually are.
We have time.sleep(sec) lines right before fetching the list of results for a filter. Increasing the sleep time usually correlates with a temporary fix of the broken pipe error. However, the test is very fickle, and changing the sleep time probably has little to do with fixing the error; there have been times when the sleep was reduced or removed from a sub-test and the test still passed. In any case, as a result of the broken pipe, the filter never gets executed and the assertion fails.
One potentially interesting detail is that regardless of the order of subtests, it is always the first subtest that fails if the broken pipe error occurs. If, however, the first subtest passes, then all subtests will always pass.
So, my question is: what on earth is going on here and how do we make sure that this random error stops happening? Apologies if this is a vague/confusing question, but unfortunately that is the nature of the problem.
It looks like your issue may be similar to what this fellow was running into. It's perhaps an issue with your timeouts. You may want to use an explicit wait, or try waiting for a specific element to load before comparing the data. I had a similar issue where my Selenium test would poll for an image before the page had finished loading. Like I say, this may not be the same issue, but it could potentially help. Good luck!
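For example, a minimal sketch of an explicit wait (the CSS selector is a placeholder for whatever your results render into):

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait up to 10 seconds for at least one result row to appear before asserting anything.
# The selector below is a placeholder for your own results markup.
wait = WebDriverWait(driver, 10)
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, ".result-row")))
results = driver.find_elements(By.CSS_SELECTOR, ".result-row")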
I just ran into this myself; it is caused by Django's built-in server not using Python's logging system. This has been fixed in Django 1.10, which is not yet released at the time of writing. In my case it is acceptable to leave the messages in the log until it is time to upgrade; that's better than adding timeouts and increasing build time.
Django ticket on the matter
Code that's causing the issue in 1.9.x
Related
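Once you are on 1.10+, the server output goes through the 'django.server' logger, so (if I understand the change correctly) you should be able to quiet these messages with a LOGGING entry roughly like this in your settings; treat it as a sketch rather than the exact configuration you need:

# settings.py (Django 1.10+): route/silence the runserver / LiveServerTestCase output
LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'loggers': {
        'django.server': {
            'handlers': [],        # drop the messages entirely, or attach your own handler
            'level': 'ERROR',      # raise the level to hide the "Broken pipe" noise
            'propagate': False,
        },
    },
}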
I'm writing a simple unit test which runs:
botocore.client.IoT.search_index(queryString='connectivity.connected:true')
My unit test simply connects a device, subscribes to MQTT, and sends and receives a test message. This gives me reason to trust the device is truly online.
Sometimes my unit test passes, sometimes it fails. When I drop into a debugger and run the search_index command repeatedly, I see inconsistent results between calls. Sometimes the device I just connected shows as online, sometimes it doesn't; after roughly 20 seconds the device appears to be consistently online.
I believe I'm probably getting responses from different servers and the propagation of the connected state between servers is simply delayed on the AWS side.
If my assessment is correct, then I want to know if there's anything I can do to force a consistent state between calls. Coding around this kind of inconsistent behavior is extremely error-prone and almost certain to introduce very hard-to-track bugs. It also makes me distrust whether many of the other requests I'm making to AWS IoT are safe to rely on. In short, I'm not going to do it; I'll find a better solution if there's no way to force AWS IoT to provide a consistent state between calls.
Given that you are unit testing your code, what you can do is mock the botocore.client.IoT.search_index response. Have a look at the patch function from the unittest.mock library (https://docs.python.org/3/library/unittest.mock.html). For your case it would be something like:
from unittest.mock import patch

@patch('botocore.client.IoT')
def test(iotMockedClass):
    # every call to search_index now returns this canned value instead of hitting AWS
    iotMockedClass.search_index.return_value = 'a fixed value that you must define'
    # ... your unit test case
It's also important to mention that unit tests should not depend on the environment; external dependencies should be replaced with mocks or stubs.
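As a slightly fuller sketch, assuming the code under test lives in a hypothetical module my_device_module that holds its own module-level IoT client, the test could look roughly like this (the module name, the iot_client attribute, the helper function, and the response shape are all assumptions for illustration):

import unittest
from unittest.mock import patch

import my_device_module  # hypothetical module under test


class SearchIndexTest(unittest.TestCase):

    # Patch the client where the code under test looks it up, not where it is defined.
    @patch('my_device_module.iot_client')
    def test_device_reported_connected(self, mock_iot_client):
        # Canned response; match the shape search_index actually returns for you.
        mock_iot_client.search_index.return_value = {
            'things': [{'thingName': 'my-device', 'connectivity': {'connected': True}}]
        }
        # is_device_connected() is a hypothetical helper that calls search_index internally.
        self.assertTrue(my_device_module.is_device_connected('my-device'))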
First: this looks like the same problem as this thread, but it is not: An unknown error has occurred in Cloud Function: GCP Python
I have deployed Cloud Functions a couple of times and they are still working fine. Nevertheless, since last week, following the same procedure, I can deploy correctly, but when testing them I get the error "An unknown error has occurred in Cloud Functions. The attempted action failed. Please try again, send feedback".
Run outside of Cloud Functions, the script works perfectly and writes to Cloud Storage.
My Cloud Function is a zip with a Python script that loads a CSV into Cloud Storage.
The CSV weighs 160 kB and the Python script 5 kB, so I allocated 128 MiB of memory.
The execution time is 38 secs, almost half of the default timeout.
It is configured to allow only traffic within the project.
Env variables are not the problem.
It's triggered by Pub/Sub, and what I want is to schedule it once I can make it work.
I'm quite puzzled. I'm so out of ideas right now that I've started to think everything works fine and it's Google's testing method that fails... Nevertheless, when I trigger the Pub/Sub topic from Cloud Scheduler, it produces the error log without much info. Has anyone had the same problem by any chance?
Thanks
Answer from my past self:
Finally "solved". I'm a processing a csv in the CF of 160kB, in my computer the execution time lasts 38 seconds. For some reason in the CF I need 512MB of Allocated Memory and a timeout larger than 60 secs.
Answer from my more recent past self:
Don't test a CF using the test button, because sometimes it takes longer than the maximum available timeout to finish, and you'll get errors.
If you want to test it easily:
Write prints after milestones in your code to check how the script is progressing.
Use the logs interface; the prints will be displayed there ;)
Also, the logs show valuable info (sometimes even readable).
Also, if you're sending output to buckets, for example, check them after the CF has finished; you may get a surprise.
To sum up, don't trust the testing button blindly.
Answer from my present self (already regretting the prints thing):
There are nice Python libraries for logging; don't use prints for that (if you have time).
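For example, a minimal sketch of a Pub/Sub-triggered (1st gen, background) function that uses the standard logging module instead of prints; the entry point name and payload handling are placeholders:

import base64
import logging

logging.basicConfig(level=logging.INFO)  # make sure INFO-level messages are emitted


def main(event, context):
    # These log lines show up in the logs interface, with no print() needed.
    logging.info("Triggered by event %s", context.event_id)
    if event.get("data"):
        payload = base64.b64decode(event["data"]).decode("utf-8")
        logging.info("Pub/Sub payload: %s", payload)
    # ... load and process the CSV, write it to Cloud Storage ...
    logging.info("Finished writing the CSV to Cloud Storage")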
I am automating a web application using the Squish tool. In the suite I have two test case files.
But when I run it as a suite, the browser gets closed and relaunched for the next test case.
def launchAxisApplication(self):
    Wrapper.fixateResultContext()
    # only start the browser if one is not already open
    if not isBrowserOpen():
        Wrapper.startBrowser(AxisUrl)
    Wrapper.restoreResultContext()
This method call happens in a statement which belongs to the first test case file. So what change do I have to make so that the application is not relaunched?
Squish will always close the AUT (Application Under Test) or browser started in that test case, at the end of the test case (should the AUT/browser still be running).
To avoid this with Squish for Web you have to attach to the running browser, which is explained in Attaching to a running Web Browser.
Since the setup steps are different and a bit lengthy for each browser (and may be subject to change), I am not duplicating that information here (at the risk of downvotes, or of this not being accepted as a proper, self-contained answer).
Then, once the browser is configured to be attachable and running, one can use the attachToBrowser() function to attach to this browser.
PS: You will have to consider how to handle situations where the previous test case aborted with an error, leaving the browser in an "unknown" state, rather than the expected state for the next test case.
Exact error I get is here:
{'trace': "(Error) ('08S01', '[08S01] [FreeTDS][SQL Server]Write to the server failed (20006) (SQLExecDirectW)')"}
I get this the first time I run a query in my Pyramid application, for any query I run (in my case, a web search form that returns info from a database).
The entire application is read-only, as is the account used to connect to the db. I don't know what it would be writing that would fail. And like I said, if I re-run the exact same thing (or refresh the page) it runs just fine without error.
Edit: Emphasis on the "first try of the day". If there are no queries for some amount of time, I get this write error again, and then it will work. It's almost as if the connection has fallen asleep and that first query wakes it up.
I would guess that there's a pool of DB connections that is kept open for some time, T. The server, however, terminates open connections after some time, S, which is less than T.
The first connection of the day (or after S elapses in general) would give you this error.
Try to look for a way to change the "timeout" of the connections in the pool to be less than S and that should fix the problem.
Edit: These times (T and S) are dependent on configs or default values for the server and libraries you use. I've experienced a similar issue with a Flask+SQLAlchemy+MySQL app in the past and I had to change the connection timeouts, etc.
Edit 2: T might be "keep connections open forever" or a very high value
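If your Pyramid app happens to use SQLAlchemy (as my Flask app did), the knob I mean looks roughly like this; the connection URL is a placeholder and 280 seconds is just an example value that should be chosen to be smaller than S:

from sqlalchemy import create_engine

# Sketch only: the URL is a placeholder; pick a recycle time below the server-side idle timeout S.
engine = create_engine(
    "mssql+pyodbc://user:password@my_freetds_dsn",
    pool_recycle=280,     # discard pooled connections older than this before reuse
    pool_pre_ping=True,   # test each connection with a lightweight query before handing it out
)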
I'm writing a Selenium script in Python. Something I found out is that when Selenium gets a 404 status code, it crashes. What is the best way to deal with it?
I had a similar problem. Sometimes a server we were using throughout our tests (i.e., not the main server we were testing, only a "sub-server") would crash. I added a small sanity check to see whether the server was up before the main tests ran. That is, I performed a simple GET request to the server, surrounded it with try/except, and if that passed I continued with the tests. Let me stress this point: before I even started Selenium, I would perform a GET request using Python's urllib2. It's not the best of solutions, but it's fast and it was enough for me.
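In outline it was something like this (shown with urllib.request, the Python 3 counterpart of the urllib2 I actually used; the URL is a placeholder):

import sys
import urllib.request  # urllib2 in the original Python 2 code

BASE_URL = "http://localhost:8000/"  # placeholder for the server you depend on


def server_is_up(url, timeout=5):
    """Return True if a plain GET succeeds, False on any HTTP or network error."""
    try:
        urllib.request.urlopen(url, timeout=timeout)
        return True
    except Exception:
        return False


if not server_is_up(BASE_URL):
    sys.exit("Server is not reachable; skipping the Selenium tests.")
# ...otherwise start the webdriver and run the real tests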