I have some unit tests for my Django API using Django Rest Framework APIClient.
Different endpoints of the API return custom error messages, some with formatted strings like: 'Geometry type "{}" is not supported'.
I'm asserting the status code from the client responses and error message keys, but there are cases where I'd like to figure out which error message is returned, to make sure nothing else has caused that error.
So I'd like to validate the returned error message against the original unformatted string too. For example if I receive an error message like 'Geometry type "Point" is not supported', I'd like to check if it matches the original unformatted message, i.e. 'Geometry type "{}" is not supported'.
The solutions I've thought of so far:
First: replacing the braces in the original string with a regex pattern and checking whether the result matches the response.
Second (the cool idea, but it might fail in some cases): using difflib.SequenceMatcher and testing whether the similarity ratio is above some threshold, for example 90%; see the sketch below.
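For reference, the difflib idea would look roughly like this; the 0.8 threshold is just an assumption and would need tuning per message:
from difflib import SequenceMatcher

template = 'Geometry type "{}" is not supported'
received = 'Geometry type "Point" is not supported'

# ratio() compares the two strings character by character; for this pair it is roughly 0.9
ratio = SequenceMatcher(None, template, received).ratio()
assert ratio > 0.8, ratio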
UPDATE
Here's an example:
There's a dict of error messages from which each error picks the relevant message, adds the format arguments if needed, and raises its error:
ERROR_MESSAGES = {
    'ERROR_1': 'Error message 1: {}. Do something about it',
    'ERROR_2': 'Something went wrong',
    'ERROR_3': 'Check your args: {}. Here is an example: {}',
}
Now an error happens in my DRF serializer during processing a request and it raises an error:
try:
    some_validation()
except SomeError as e:
    raise serializers.ValidationError({
        'field1': [ERROR_MESSAGES['ERROR_N1'], ERROR_MESSAGES['ERROR_N2']],
        'field2': [ERROR_MESSAGES['ERROR_N3']],
    })
Now in a specific test, I'd like to make sure a certain error message is there:
class SomeTestCases(TestCase):
    def test_something(self):
        response = self.client.post(...)
        self.assertThisMessageIsInResponse(response.data, ERROR_MESSAGES['ERROR_K'])
response.data can be just a string, or a dict or list of errors; i.e. whatever can go into a ValidationError.
Pointing to the error message's location within response.data for each test case is no problem; the concern of this question is the comparison between formatted and unformatted strings.
So far the easiest approach has been regex. I'm mostly curious whether there's a built-in assertion for this and what other solutions can be used.
You are looking for assertRegex():
class SomeTestCases(TestCase):
    def test_something(self):
        response = self.client.post(...)
        self.assertRegex(response.data, r'^your regex here$')
See also assertNotRegex.
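Since assertRegex() expects a string, if response.data is a dict of lists (as in the ValidationError example above) you would index into it first. A rough sketch, reusing the field name and message from the question:
class SomeTestCases(TestCase):
    def test_something(self):
        response = self.client.post(...)
        # response.data['field1'] is a list of error strings here,
        # so assert against the specific one you care about:
        self.assertRegex(
            response.data['field1'][0],
            r'^Error message 1: .+\. Do something about it$'
        )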
Seems like regex would be the easiest solution here:
import re
msg = 'Geometry type "point" is not supported'
assert re.match(r'^Geometry type ".+" is not supported$', msg)
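To avoid writing the pattern by hand for every message, the unformatted template itself can be turned into a regex: escape it, then swap each {} placeholder for a wildcard group. A small sketch of that first idea:
import re

def template_to_regex(template):
    # Escape regex metacharacters, then replace each escaped '{}' placeholder
    # with a non-greedy "match anything" group.
    escaped = re.escape(template)
    return '^' + escaped.replace(re.escape('{}'), '(.+?)') + '$'

pattern = template_to_regex('Geometry type "{}" is not supported')
assert re.match(pattern, 'Geometry type "Point" is not supported')
assert not re.match(pattern, 'Some other error about "Point"')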
Related
I have some code that I would like to test. It is a fairly vanilla GET request wrapper, but the implementation requests data from the API multiple times with different IDs.
Adding mock JSON responses for the tests is problematic as there are hundreds of calls with these IDs and we want to test against one fixed response.
The target URI looks like https://someurl.com/api/v1/id/1234/data?params
The issue we are having is not wanting to add a line of code for every mock endpoint.
E.g. rather than having:
mocker.get('https://someurl.com/api/v1/id/1234/data?params',
           json={},
           status_code=200)
mocker.get('https://someurl.com/api/v1/id/5678/data?params',
           json={},
           status_code=200)
I would like to implement some sort of wildcard matching, like this:
mocker.get(re.compile('https://someurl.com/api/v1/id/*/data?params'),
           json={},
           status_code=200)
This should be possible if I understand the docs correctly, but it returns an error:
Failed: [undefined]requests_mock.exceptions.NoMockAddress: No mock address: GET https://someurl.com/api/v1/id/1234/data?params
That's because * and ? are quantifiers in regular expression syntax. Once you adjust the pattern to escape the question mark (\?) and turn the bare star into a greedy match-anything pattern (.*), things should work as expected:
>>> requests_mock.register_uri(
... 'GET',
... re.compile(r'https://someurl.com/api/v1/id/.*/data\?params'),
... json={},
... status_code=200
... )
>>> requests.get('https://someurl.com/api/v1/id/1234/data?params').status_code
200
>>> requests.get('https://someurl.com/api/v1/id/lorem-ipsum-dolor-sit-amet/data?params').status_code
200
I'm experiencing a strange issue that seems inconsistent with Google's Gmail API:
If you look here, you can see that Gmail's representation of an email has keys "snippet" and "id", among others. Here's some code that I use to generate the complete list of all my emails:
response = service.users().messages().list(userId='me').execute()
messageList = []
messageList.extend(response['messages'])

while 'nextPageToken' in response:
    pagetoken = response['nextPageToken']
    response = service.users().messages().list(userId='me', pageToken=pagetoken).execute()
    messageList.extend(response['messages'])

for message in messageList:
    if 'snippet' in message:
        print(message['snippet'])
    else:
        print("FALSE")
The code works!... Except for the fact that I get output "FALSE" for every single one of the emails. 'snippet' doesn't exist! However, if I run the same code with "id" instead of snippet, I get a whole bunch of ids!
I decided to just print out the 'message' objects/dicts themselves, and each one only had an "id" and a "threadId", even though the API claims there should be more in the object... What gives?
Thanks for your help!
As @jedwards said in his comment, just because a message can contain all of the fields specified in the documentation doesn't mean it will. list provides the bare minimum of information for each message, because it returns a lot of messages and wants to be as lazy as possible. For individual messages that I want to know more about, I'd then use messages.get with the id that I got from list.
Running get for each email in your inbox seems very expensive, but to my knowledge there's no way to run a batch 'get' command.
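To illustrate, a follow-up get for each id in the question's messageList might look roughly like this (service being the authorized Gmail client from the question; the format value is one of the documented options):
for message in messageList:
    # Fetch the full resource for this id; 'metadata' or 'minimal' also work if you need less.
    full = service.users().messages().get(userId='me',
                                           id=message['id'],
                                           format='full').execute()
    print(full.get('snippet', 'no snippet'))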
So, I am getting this error:
TypeError: 'in <string>' requires string as left operand, not QuerySet
I have a method which has:
error_val = self.error_object
for p in self.output:
    request = requests.get(p, timeout=settings.REQUESTS_TIMEOUT, verify=False)
    for req in request:
        if error_val in req:
            print 'error Found in' + req
This error is happening because of the error_val in the if condition.
In layman's terms, this is basically saying (if I'm not mistaken): "Whoa, I'm getting an object value with strings, but I can't compare it to another object value."
req - basically the HTML output of a page, e.g. <html><body><!--html content here--></body></html>
error_val - a variable holding the values of an object (the results from a Django query)
My question: how can I rework this method so I can use the error_val variable against each req (request)?
Any help, comments, suggestions are really helpful. Thank you.
self.error_object holds an instance of the QuerySet class. And no, you can't check whether an object of this type is inside a string.
QuerySet is a wrapper class for a Django ORM query (or queries). It implements the iterable protocol, so you can iterate over it to get the matching model instances one by one.
You can then access the fields of those instances as normal object attributes. If one of them is a string, you can check whether it's a substring of req.
It's hard to say what exactly you are trying to do, but here's a guess:
for model_instance in self.error_object:
    for req in request:
        if model_instance.some_string_field in req:
            print 'error Found in' + req
If you're trying to check whether any of several objects' string representations appear in your response (I can't figure out what other behavior "queryset in string" might be intended to have), you want something like:
error_strings = [str(val) for val in self.error_object]
...
# then in your loop
if any(val in req for val in error_strings):
You might also profile creating a single OR'd-together regexp of the error strings.
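That OR'd regexp could be built roughly like this, reusing the error_strings list from above; re.escape keeps any regex metacharacters in the messages from being interpreted:
import re

# Compile one pattern that matches any of the error strings.
error_pattern = re.compile('|'.join(re.escape(s) for s in error_strings))

# then in your loop, instead of the any() check:
if error_pattern.search(req):
    print 'error Found in' + req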
I use httplib and a string template.
With a normal string as the message it works; with a Template I get:
Error 'must be convertible to a buffer, not Template'
message=str(SMessage.substitute(...
webservice = httplib.HTTP(host)
webservice.putrequest("POST", url)
....
webservice.send(message)
Do I need to convert my template somehow?
The Template.substitute() method returns a string, so you can drop the str() around the call. That also means either there is another assignment to message, not shown in your code snippet, that changes message from a str to a Template, or the error is raised in a different place (a full traceback would show where).
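A quick standalone check of that, with made-up template content just for illustration:
from string import Template

SMessage = Template('<request><value>$value</value></request>')  # made-up payload, just an example
message = SMessage.substitute(value='42')

assert isinstance(message, str)  # already a plain string, no str() wrapper needed before send()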
I started playing recently with Twython and the twitter API. The auth was a bit cumbersome to deal with but it's now working perfectly with a little Bottle webserver included in my script.
I'm trying to do something very simple: track a hashtag with the streaming filter API.
It seemed to work well at first but now I can see many errors in my log:
Error!
200 Unable to decode response, not valid JSON
It happens only for some of the tweets. I thought it could be linked to coordinates, but that's not it. I just tested, and it seems to be caused by encoding issues with accented characters (éèêàâ...).
How can I fix this?
My streamer code is very basic:
class QMLStreamer(TwythonStreamer):
    def on_success(self, data):
        if 'text' in data:
            if 'coordinates' in data and data['coordinates'] and 'coordinates' in data['coordinates']:
                tweetlog("[%s](%s) - %s" % (data['created_at'], data['coordinates']['coordinates'], data['text'].encode('utf-8')))
            else:
                tweetlog("[%s] - %s" % (data['created_at'], data['text'].encode('utf-8')))

    def on_error(self, status_code, data):
        print colored('Error !', 'red', attrs=['bold'])
        print status_code, data
The error happens in your code. You shouldn't be using .encode() here.
It is a bit counter-intuitive, but on_error() will be called if on_success() raises an exception, which is probably what happens here (a UnicodeDecodeError). That's why you're seeing an error code 200 ("HTTP OK").
Twython is returning data as unicode objects, so you can just do:
print(u"[%s](%s) - %s" % (data['created_at'], data['coordinates']['coordinates'], data['text']))
You should probably add your own try...except block in on_success() for further debugging.
Also, I'm not sure what your tweetlog() function does, but be aware that on Windows print() might have trouble writing some code points, as it will try to convert them to the terminal's code page.
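A rough sketch of that try/except, so the real exception shows up as a traceback instead of silently bouncing into on_error() (tweetlog is the function from the question):
import traceback
from twython import TwythonStreamer

class QMLStreamer(TwythonStreamer):
    def on_success(self, data):
        try:
            if 'text' in data:
                tweetlog(u"[%s] - %s" % (data['created_at'], data['text']))
        except Exception:
            traceback.print_exc()  # shows the actual UnicodeDecodeError / KeyError, if any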
Not a perfect answer, but you can try printing a normalized version of the text using unicodedata:
import unicodedata
...
tweetlog("[%s](%s) - %s" % (data['created_at'],
                            data['coordinates']['coordinates'],
                            unicodedata.normalize('NFD', data['text']).encode('ascii', 'ignore')))