I'm using the requests module to collect some data from a website. This application runs once every day. The number of rows of data I get changes every time; per request I can get a maximum of 250 rows. If there are more than 250 rows, the API gives me a follow-up link which can be used to get rows 251-500, etc.
Now I have a problem: sometimes there are fewer than 250 rows, which means there is no follow-up link to use, and that's exactly where my program gives the following error:
KeyError: #odata.nextLink
This is a piece of the application:
proxies = {'https': 'proxy.***.***.com:8080'}
headers = {"grant_type": "password",
"username": "****",
"password": "****",
"persistent": "true",
"device": '{"DeviceUniqueId":"b680c452","Name":"Chrome","DeviceVersion":"36","PlatformType":"Browser"}'}
url_1 = 'https://****-***.com/odata/Results'
params_1 = (
('$filter', mod_date),
('$count', 'true'),
('$select', 'Status'),
('$expand', 'Result($select=ResultId),Specification($select=Name), SpecificationItem($select=Name,MinimumValue, MaximumValue)\n\n'),)
response_1 = requests.get(url_1, headers=headers, proxies=proxies, params=params_1)
q_1 = response_1.json()
next_link_1 = q_1['#odata.nextLink']
q_1 = [tuple(q_1.values())]
while next_link_1:
    new_response_1 = requests.get(next_link_1, headers=headers, proxies=proxies)
    new_data_1 = new_response_1.json()
    q_1.append(tuple(new_data_1.values()))
    next_link_1 = new_data_1.get('#odata.nextLink', None)
Now I actually want Python to only read the variable next_link_1 if it's available; otherwise it should just ignore it and collect what is available...
You only want to enter the while loop when q_1 has the key '#odata.nextLink'. Inside the while loop, this is already accomplished by the line next_link_1 = new_data_1.get('#odata.nextLink', None). You could use the same approach (setting next_link_1 to None if there is no next link) before the while loop:
next_link_1 = q_1.get('#odata.nextLink', None)
This can be simplified to
next_link_1 = q_1.get('#odata.nextLink')
as None is already the default value returned by dict.get() when the key is missing.
NB: The question title is wrong. The variable always exists, as you are setting it. Only the existence of the key #odata.nextLink is fragile. So, what you actually want to do is check the existence of a key in a dictionary. To understand what is going on, you should familiarize yourself with the dict.get() method.
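To make the difference between indexing and dict.get() concrete, here is a minimal sketch. The `page` dictionary is a made-up stand-in for the last page of results, not real API output.

```python
# Hypothetical response payload for the last page: no next-link key present.
page = {'value': [{'Status': 'OK'}]}

# Indexing raises KeyError when the key is missing.
try:
    page['#odata.nextLink']
    raised = False
except KeyError:
    raised = True

# dict.get() returns None (or a supplied default) instead of raising.
next_link = page.get('#odata.nextLink')

# The `in` operator tests for the key without fetching the value.
has_more = '#odata.nextLink' in page
```

Either `get()` or an `in` check works here; `get()` is convenient because the result can drive the `while` condition directly.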
There is also some obvious refactoring possible here, getting rid of the repetition of the first iteration, and moving it into the loop:
proxies = {'https': 'proxy.***.***.com:8080'}
headers = {
'grant_type': 'password',
'username': '****',
'password': '****',
'persistent': 'true',
'device': '{"DeviceUniqueId":"b680c452","Name":"Chrome","DeviceVersion":"36","PlatformType":"Browser"}'
}
params = (
('$filter', mod_date),
('$count', 'true'),
('$select', 'Status'),
('$expand', 'Result($select=ResultId),Specification($select=Name), SpecificationItem($select=Name,MinimumValue, MaximumValue)\n\n'),
)
url = 'https://****-***.com/odata/Results'
data = []
while url:
    response = requests.get(
        url,
        headers=headers,
        proxies=proxies,
        params=params,
    )
    response_data = response.json()
    data.append(tuple(response_data.values()))
    url = response_data.get('#odata.nextLink')
    params = tuple()
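The same loop can be tried out without touching the network by passing in the function that fetches a page. This is only a sketch under assumptions: `collect_pages`, `fetch`, `fake_api` and the example.invalid URLs are hypothetical names introduced for illustration, and `fetch` is assumed to return the decoded JSON body as a dict.

```python
def collect_pages(first_url, fetch, params=None, link_key='#odata.nextLink'):
    """Follow next links until a page has none; return all decoded pages.

    `fetch(url, params)` is a hypothetical callable returning the decoded
    JSON body as a dict, e.g. a thin wrapper around requests.get(...).json().
    """
    pages = []
    url = first_url
    while url:
        page = fetch(url, params)
        pages.append(page)
        url = page.get(link_key)  # None on the last page ends the loop
        params = None             # query parameters go with the first request only
    return pages


# Stand-in for the API: two pages, the second one without a next link.
fake_api = {
    'https://example.invalid/Results': {
        'value': [1, 2],
        '#odata.nextLink': 'https://example.invalid/Results?page=2',
    },
    'https://example.invalid/Results?page=2': {'value': [3]},
}
calls = []


def fake_fetch(url, params):
    calls.append((url, params))
    return fake_api[url]


pages = collect_pages('https://example.invalid/Results', fake_fetch,
                      params={'$count': 'true'})
```

With a single page (no next link in the first response) the loop runs exactly once, which is the case that raised the KeyError in the question.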
Use get in both places. Better yet, restructure your loop so that you only need one call.
proxies = {'https': 'proxy.***.***.com:8080'}
headers = {...}
url = 'https://****-***.com/odata/Results'
params = (...)
qs = []
next_link = url
get_args = {'headers': headers, 'proxies': proxies, 'params': params}
while True:
    response = requests.get(next_link, **get_args)
    q = response.json()
    qs.append(tuple(q.values()))
    if (next_link := q.get('#odata.nextLink', None)) is None:
        break
    if 'params' in get_args:
        del get_args['params']  # Only needed in the first iteration
(I'm not terribly excited about how we ensure params is used only on the first iteration, but I think it's better than duplicating the process of defining next_link before the loop starts. Maybe something like this would be an improvement?
get_args = {...} # As above
new_get_args = dict(headers=..., proxies=...) # Same, but without params
while True:
    ...
    if (next_link := ...) is None:
        break
    get_args = new_get_args
Repeated assignment to get_args is probably cheaper than repeatedly testing for and deleting the params key, at the cost of having a second dict in memory. You could even drop that after the first iteration by adding a second assignment new_get_args = get_args to the end of the loop, which would result in a pair of do-nothing assignments for later iterations.)
Related
I have a web service that returns a list of docs. I call this web service via get_doc_list,
but when I pass two values to id__in, it returns only one mapping object.
def get_doc_list(self, id__in):
    config = self.configurer.doc
    params = {
        "id__in": id__in,
    }
    response = self._make_request(
        token=self.access_token,
        method='get',
        proxies=self.proxies,
        url=config.service_url,
        params=params,
        module_name=self.module_name,
        finalize_response=False
    )
    return response
How can I fix it?!
You can add these two lines at the top of get_doc_list, before params is built:
string_id_in = [str(i) for i in id__in]
id__in = ",".join(string_id_in)
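As a small illustration of that conversion, the sketch below joins a list of ids into one comma-separated string. Whether the service accepts this format is the assumption made in the answer above; `to_csv_param` is a hypothetical helper name.

```python
def to_csv_param(values):
    """Join an iterable of ids into one comma-separated string."""
    return ",".join(str(v) for v in values)


# Integers and strings are both handled, since each item goes through str().
id__in = to_csv_param([12, 34])
params = {"id__in": id__in}
```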
I can't seem to easily access a list value from within a dictionary response from an API.
data = {
'room_id': room,
'how_many': 1
}
response_url = 'https://api.clickmeeting.com/v1/conferences/'+ str(room) +'/tokens'
response1 = requests.post(response_url, headers=headers, data=data)
response1.raise_for_status()
# access JSON content
jsonResponse = response1.json()
print(jsonResponse)
the response is:
{'access_tokens': [{'token': 'C63GJS', 'sent_to_email': None, 'first_use_date': None}]}
I'm looking to assign the token value to a variable.
Any ideas?
If the list in access_tokens is always of length 1, you can do something like this:
token = jsonResponse["access_tokens"][0]["token"]
If there's a potential for more than one item in access_tokens, then something similar:
access_tokens = jsonResponse["access_tokens"]
tokens = [at["token"] for at in access_tokens if "token" in at]
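Here is a self-contained sketch of the extraction, using the response body shown in the question. The fallback to None for an empty list is an added assumption about what the caller wants in that case.

```python
# Response body copied from the question.
json_response = {
    'access_tokens': [
        {'token': 'C63GJS', 'sent_to_email': None, 'first_use_date': None},
    ]
}

# Collect every token; entries without a 'token' key are skipped.
tokens = [
    entry['token']
    for entry in json_response.get('access_tokens', [])
    if 'token' in entry
]

# First token, or None when the list is empty.
first_token = tokens[0] if tokens else None
```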
I have looked at How to mock REST API and read the answers, but I still can't get my head around how to deal with a method that executes multiple GET and POST requests. Here is some of my code below.
I have a class, UserAliasGroups(). Its __init__() method executes requests.post() to log in to the external REST API. My unit test has this code to handle the mocking of the login, and it works as expected.
@mock.patch('aliases.user_alias_groups.requests.get')
@mock.patch('aliases.user_alias_groups.requests.post')
def test_user_alias_groups_class(self, mock_post, mock_get):
    init_response = {
        'HID-SessionData': 'token==',
        'errmsg': '',
        'success': True
    }
    mock_response = Mock()
    mock_response.json.return_value = init_response
    mock_response.status_code = status.HTTP_201_CREATED
    mock_post.return_value = mock_response
    uag = UserAliasGroups(auth_user='TEST_USER.gen',
                          auth_pass='FakePass',
                          groups_api_url='https://example.com')
    self.assertEqual(uag.headers, {'HID-SessionData': 'token=='})
I also have defined several methods like obtain_request_id(), has_group_been_deleted(), does_group_already_exists() and others. I also define a method called create_user_alias_group() that calls obtain_request_id(), has_group_been_deleted(), does_group_already_exists() and others.
I also have code in my unit test to mock a GET request to the REST API to test my has_group_been_deleted() method that looks like this:
has_group_been_deleted_response = {
    'error_code': 404,
    'error_message': 'A group with this ID does not exist'
}
mock_response = Mock()
mock_response.json.return_value = has_group_been_deleted_response
mock_response.status_code = status.HTTP_404_NOT_FOUND
mock_get.return_value = mock_response
Now I can get to my question. Below is the pertinent part of my code.
class UserAliasGroups:
    def __init__(
        self,
        auth_user=settings.GENERIC_USER,
        auth_pass=settings.GENERIC_PASS,
        groups_api_url=settings.GROUPS_API_URL
    ):
        """ __init__() does the login to groups. """
        self.auth_user = auth_user
        self.auth_pass = auth_pass
        self.headers = None
        self.groups_api_url = groups_api_url
        # Initializes a session with the REST API service. Each login session times out after 5 minutes of inactivity.
        self.login_url = f'{self.groups_api_url}/api/login'
        response = requests.post(self.login_url, json={}, headers={'Content-type': 'application/json'},
                                 auth=(auth_user, auth_pass))
        # Compare status codes with != / ==; `is` tests object identity, not value.
        if response.status_code != 201:
            try:
                json = response.json()
            except:
                json = "Could not decode json."
            raise self.UserAliasGroupsException(f"Error: User {self.auth_user}, failed to login into "
                                                f"{self.login_url} {json}")
        response_json = response.json()
        self.headers = {'HID-SessionData': response_json['HID-SessionData']}

    def obtain_request_id(self, request_reason):
        payload = {'request_reason': request_reason}
        url = f'{self.groups_api_url}/api/v1/session/requests'
        response = requests.post(url=url, json=payload, headers=self.headers)
        if response.status_code != status.HTTP_200_OK:
            try:
                json = response.json()
            except:
                json = "Could not decode json."
            msg = f'obtain_request_id() Error url={url} {response.status_code} {json}.'
            raise self.UserAliasGroupsException(msg)
        request_id = response.json().get('request_id')
        return request_id

    def has_group_been_deleted(self, group_name):
        url = f'{self.groups_api_url}/api/v1/groups/{group_name}/attributes/RESATTR_GROUP_DELETED_ON'
        response = requests.get(url=url, headers=self.headers)
        return response.status_code == status.HTTP_200_OK

    def does_group_already_exists(self, group_name):
        url = f'{self.groups_api_url}/api/v1/groups/{group_name}'
        response = requests.get(url=url, headers=self.headers)
        if response.status_code == status.HTTP_200_OK:
            # check if the group has been "deleted".
            return not self.has_group_been_deleted(group_name=group_name)
        return False

    def create_user_alias_group(
        self,
        ... long list of params omitted for brevity ...
    ):
        if check_exists:
            # Check if group already exists or not.
            if self.does_group_already_exists(group_name):
                msg = f'Cannot create group {group_name}. Group already exists.'
                raise self.UserAliasGroupsException(msg)
        ... more code omitted for brevity ...
My question is: how do I write my unit test to deal with multiple calls to requests.post() and requests.get(), all resulting in different responses, in my create_user_alias_group() method?
I want to call create_user_alias_group() in my unit test, so I have to figure out how to mock multiple requests.get() and requests.post() calls.
Do I have to use multiple decorators like this:
@mock.patch('aliases.user_alias_groups.obtain_request_id.requests.post')
@mock.patch('aliases.user_alias_groups.does_group_already_exists.requests.get')
@mock.patch('aliases.user_alias_groups.has_group_been_deleted.requests.get')
def test_user_alias_groups_class(self, mock_post, mock_get):
    ...
?
Thanks for looking at my long question :)
You can use mock.side_effect which takes an iterable. Then different calls will return different values:
mock = Mock()
mock.side_effect = ['a', 'b', 'c']
This way the first call to mock returns "a", then the next one "b" and so on. (In your case, you'll set mock_get.side_effect).
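Applied to HTTP responses, this could look roughly like the sketch below. It is an illustration under assumptions: `make_response` is a hypothetical helper, the payloads are invented, and `mock_get` here is a plain Mock standing in for the argument that @mock.patch would inject into the test.

```python
from unittest.mock import Mock


def make_response(status_code, payload):
    """Build a stand-in for a requests.Response (hypothetical helper)."""
    response = Mock()
    response.status_code = status_code
    response.json.return_value = payload
    return response


# Stand-in for the patched requests.get; in a real test this would be the
# mock_get argument injected by @mock.patch.
mock_get = Mock()
mock_get.side_effect = [
    make_response(200, {'name': 'group-a'}),  # first GET: the group exists
    make_response(404, {'error_code': 404}),  # second GET: no "deleted" attribute
]

# Each call consumes the next item of side_effect, in order.
first = mock_get('https://example.com/api/v1/groups/group-a')
second = mock_get('https://example.com/api/v1/groups/group-a/attributes/RESATTR_GROUP_DELETED_ON')
```

The responses are handed out in call order, so the list has to match the order in which the method under test issues its requests. Calling the mock more times than there are items raises StopIteration, which is a useful signal that the code made an unexpected extra request.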
I'm running a Python script which uses a list of values as query parameters for HTTP requests to an API endpoint. Here is a snippet:
df = pd.read_excel('grp.xlsx', sheet_name='Sheet1', usecols="A")
for item in df.PLACE:
    df.PLACE.head()
    #1st level request
    def wbsearchentities_q(**kwargs):
        params = {
            'action': 'wbsearchentities',
            'format': 'json',
            'language': 'en',
            'search': item
        }
        params.update(kwargs)
        response = requests.get(API_ENDPOINT, params=params)
        return response
    r = wbsearchentities_q(ids=item)
    item_id = (r.json()['search'][0]['id'])
    item_label = (r.json()['search'][0]['label'])
I'm getting this error: IndexError: list index out of range, which means that some items from my list are not recognized by the API endpoint.
I would just like to skip those items and continue the loop. I tried to fix it using this, without result.
Thanks in advance.
You can try:
for item in df.PLACE:
    try:
        ... your code ...
    except:
        pass
To handle only that specific error (recommended, so that other errors are not silently swallowed) and continue to the next item in the df:
try:
    item_id = (r.json()['search'][0]['id'])
    item_label = (r.json()['search'][0]['label'])
except IndexError:
    continue
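An alternative to catching the exception is to check whether the result list is empty before indexing it. The sketch below assumes the decoded response has a 'search' list as in the question; `first_match` and the sample payloads are hypothetical.

```python
def first_match(payload):
    """Return (id, label) of the first search hit, or None if there is none."""
    hits = payload.get('search') or []
    if not hits:
        return None
    return hits[0]['id'], hits[0]['label']


# Hypothetical decoded responses: one with a hit, one with an empty result list.
payloads = [
    {'search': [{'id': 'Q90', 'label': 'Paris'}]},
    {'search': []},
]

results = []
for payload in payloads:
    match = first_match(payload)
    if match is None:
        continue  # nothing found for this item; move on to the next one
    results.append(match)
```

This also decodes the JSON once per response instead of calling r.json() twice.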
I have a script that uses a dictionary stored in the "my_dict" variable, with values for url, user and password, along with the "id" variable. The script then does an HTTP GET call to the url with the headers built from those values. How do I create a Python function which is equivalent to this? And how do I call the function later for another set of url, user, password, etc.?
import urllib, urllib2, base64, json
my_dict = {'server': {'user': 'patrick', 'url': 'http://192.168.0.1/tasks', 'password': 'secret'}}
id = "8d4lkf8kjhla8EnsdAjkjFjkdb6lklne"
for value in my_dict.keys():
    url = my_dict[value]['url']
    # `pass` is a reserved keyword in Python, so the variable is named `password`
    password = my_dict[value]['password']
    authKey = base64.b64encode("patrick:"+str(password))
    headers = {"login-session": id, "Content-Type": "application/json", "Authorization": "Basic " + authKey}
    data = {"param": "value"}
    request = urllib2.Request(url)
    for key, value in headers.items():
        request.add_header(key, value)
    response = json.load(urllib2.urlopen(request))
    print response
Your question has a lot of subquestions, such as: which values should be variables and which constants? Do you know the syntax? How will your function be used? And so on.
Therefore the best way for you to get an answer is to learn some basics, like here (free) for the function syntax in Python. After that you may very well have the answers to your question.
A simple function like this will work, if I understood you correctly:
def send_request(my_dict, id):
    for value in my_dict.keys():
        url = my_dict[value]['url']
        user = my_dict[value]['user']
        # `pass` is a reserved keyword in Python, so the variable is named `password`
        password = my_dict[value]['password']
        authKey = base64.b64encode(user + ":" + str(password))
        headers = {"login-session": id, "Content-Type": "application/json", "Authorization": "Basic " + authKey}
        data = {"param": "value"}
        request = urllib2.Request(url)
        for key, value in headers.items():
            request.add_header(key, value)
        response = json.load(urllib2.urlopen(request))
        print response
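The code above is Python 2 (urllib2, print statement). For Python 3, a rough equivalent using only the standard library might look like the sketch below. The function names `build_request` and `send_request`, the "backup" server entry and its credentials are assumptions made for illustration; building the request is kept separate from sending it so the headers can be inspected without a network call.

```python
import base64
import json
import urllib.request


def build_request(url, user, password, session_id):
    """Build (but do not send) a GET request with Basic auth and a session header."""
    credentials = f"{user}:{password}".encode("utf-8")
    auth_key = base64.b64encode(credentials).decode("ascii")
    headers = {
        "login-session": session_id,
        "Content-Type": "application/json",
        "Authorization": "Basic " + auth_key,
    }
    return urllib.request.Request(url, headers=headers)


def send_request(url, user, password, session_id):
    """Send the request and decode the JSON body."""
    request = build_request(url, user, password, session_id)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Reusing the function for several servers (the second entry is a placeholder).
servers = {
    "server": {"user": "patrick", "url": "http://192.168.0.1/tasks", "password": "secret"},
    "backup": {"user": "alice", "url": "http://192.168.0.2/tasks", "password": "hunter2"},
}
session_id = "8d4lkf8kjhla8EnsdAjkjFjkdb6lklne"
requests_to_send = [
    build_request(cfg["url"], cfg["user"], cfg["password"], session_id)
    for cfg in servers.values()
]
```

Calling send_request(cfg["url"], cfg["user"], cfg["password"], session_id) for each entry would then perform the actual GET and return the decoded JSON.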