I make requests in a loop, and unfortunately a server timeout sometimes occurs. That's why I want to check the status code first, and if it is not 200, repeat the last request until the status code is 200. An example of the code looks like this:
for i in range(0, len(table)):
    var = table.iloc[i]
    url = f'http://example.request.com/{var}/'
    response = requests.get(url)
    if response.status_code == 200:
        data = response.json()
    else:
        pass  # "go back to response" and repeat the request
I am appending the response data of every i, so I would like to keep retrying as long as the response code is not 200, and then move on to the next i in the loop.
Is there any easy solution?
I believe you want to do something like this:
for i in range(0, len(table)):
    var = table.iloc[i]
    url = f'http://example.request.com/{var}/'
    response = requests.get(url)
    while response.status_code != 200:
        response = requests.get(url)
    data = response.json()
I made a small example using an infinite loop and break to demonstrate stopping once the status code is 200:
while True:
    url = 'https://stackoverflow.com/'
    response = requests.get(url)
    if response.status_code == 200:
        # found code
        print('found example')
        break
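A bare while True retry can spin forever if the server keeps failing. As a hedged alternative to the answers above (not part of the original posts), requests can delegate retrying to urllib3's Retry through an HTTPAdapter; the URL and retry settings below are placeholders:

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry failed requests automatically; total and backoff_factor are example values.
retry = Retry(
    total=5,                               # give up after 5 attempts
    backoff_factor=1,                      # exponential backoff between attempts
    status_forcelist=[500, 502, 503, 504]  # also retry on these server errors
)
session = requests.Session()
session.mount('http://', HTTPAdapter(max_retries=retry))
session.mount('https://', HTTPAdapter(max_retries=retry))

response = session.get('https://example.com/')  # placeholder URL
print(response.status_code)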
I am trying to capture the HTTP status code 3XX/302 for a redirect URL, but I cannot get it because it returns a 200 status code.
Here is the code:
import requests
r = requests.get('http://goo.gl/NZek5')
print(r.status_code)
I suppose this should give either 301 or 302 because it redirects to another page. I have tried a few redirecting URLs (e.g. http://fb.com), but it keeps returning 200. What should be done to capture the redirect code properly?
requests handles redirects for you; see the documentation on redirection and history.
Set allow_redirects=False if you don't want requests to follow redirects, or inspect the redirect responses collected in the r.history list.
Demo:
>>> import requests
>>> url = 'https://httpbin.org/redirect-to'
>>> params = {"status_code": 301, "url": "https://stackoverflow.com/q/22150023"}
>>> r = requests.get(url, params=params)
>>> r.history
[<Response [301]>, <Response [302]>]
>>> r.history[0].status_code
301
>>> r.history[0].headers['Location']
'https://stackoverflow.com/q/22150023'
>>> r.url
'https://stackoverflow.com/questions/22150023/http-redirection-code-3xx-in-python-requests'
>>> r = requests.get(url, params=params, allow_redirects=False)
>>> r.status_code
301
>>> r.url
'https://httpbin.org/redirect-to?status_code=301&url=https%3A%2F%2Fstackoverflow.com%2Fq%2F22150023'
So if allow_redirects is True, the redirects are followed and the response returned is the final page reached after following them. If allow_redirects is False, the first response is returned, even if it is a redirect.
requests.get allows for an optional keyword argument allow_redirects which defaults to True. Setting allow_redirects to False will disable automatically following redirects, as follows:
In [1]: import requests
In [2]: r = requests.get('http://goo.gl/NZek5', allow_redirects=False)
In [3]: print(r.status_code)
301
This solution identifies the redirect, displays the history of redirects, and handles common errors. It asks for your URL in the console.
import requests


def init():
    console = input("Type the URL: ")
    get_status_code_from_request_url(console)


def get_status_code_from_request_url(url, do_restart=True):
    try:
        r = requests.get(url)
        if len(r.history) < 1:
            print("Status Code: " + str(r.status_code))
        else:
            print("Status Code: 301. Below are the redirects")
            h = r.history
            i = 0
            for resp in h:
                print(" " + str(i) + " - URL " + resp.url + " \n")
                i += 1
        if do_restart:
            init()
    except requests.exceptions.MissingSchema:
        print("You forgot the protocol. http://, https://, ftp://")
    except requests.exceptions.ConnectionError:
        print("Sorry, but I couldn't connect. There was a connection problem.")
    except requests.exceptions.Timeout:
        print("Sorry, but I couldn't connect. I timed out.")
    except requests.exceptions.TooManyRedirects:
        print("There were too many redirects. I can't count that high.")


init()
I am trying to test AWS WAF using Python 3. The idea is to send 500 to 1000 HTTP requests in N seconds/minutes and print the status code of each request. The timing needs to be specified by the user, e.g. send 1000 requests in 5 minutes, or send all 1000 requests in 1 second.
I am not able to implement the timing.
How can I make this work? Below is my code so far:
import requests
import time
import threading

countRequests = 0
countSuccessRequests = 0
countFailedRequests = 0
countAllowRequests = 0
countBlockedRequests = 0
countErrors = 0


class Requester(threading.Thread):
    def __init__(self, url, w):
        threading.Thread.__init__(self)
        self.url = url
        self.w = w

    def run(self):
        global countRequests
        global countSuccessRequests
        global countFailedRequests
        global countAllowRequests
        global countBlockedRequests
        global countErrors
        try:
            r = requests.get(self.url)
            countSuccessRequests += 1
            if r.status_code == 200:
                countAllowRequests += 1
                print(f"countRequest: {countAllowRequests}, statusCode: {r.status_code}")
            if r.status_code == 403:
                countBlockedRequests += 1
                print(f"Blocked: {countBlockedRequests}, statusCode: {r.status_code}")
            if r.status_code != 200 and r.status_code != 403:
                print(f'Failed: code= {r.status_code}, sample: {self.w}')
        except (requests.Timeout, requests.ConnectionError, requests.HTTPError) as err:
            countFailedRequests += 1


if __name__ == '__main__':
    url = 'URL-HERE'
    print(f"url: {url}")
    requ = 150
    for w in range(1, requ, +1):
        req = Requester(url, w)
        req.start()
        time.sleep(0)
I tried searching on Stack Overflow, but I was not able to build the logic for it.
I would appreciate someone pointing me to the correct thread or helping me build one.
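No answer is attached to this question here, but as a hedged sketch of the timing part only: if the user wants count requests spread over duration seconds, sleeping duration / count between thread starts paces the launches roughly evenly. The numbers below are placeholders, and 'URL-HERE' is the same placeholder used in the question.

import threading
import time
import requests

def hit(url, i):
    # fire one request and report its status code
    try:
        r = requests.get(url, timeout=10)
        print(f"request {i}: status {r.status_code}")
    except requests.RequestException as err:
        print(f"request {i}: failed ({err})")

url = 'URL-HERE'          # placeholder target
count = 100               # e.g. 100 requests...
duration = 60             # ...spread over 60 seconds
interval = duration / count

threads = []
for i in range(count):
    t = threading.Thread(target=hit, args=(url, i))
    t.start()
    threads.append(t)
    time.sleep(interval)  # pace the launches evenly across the window

for t in threads:
    t.join()              # wait for every request to finish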
I am working on a website and the code works fine, but sometimes the response text contains a specific string, 'error happened'. If that string appears, I need to send the request for that item again.
Here is my attempt, but I still get some results containing that 'error happened' string:
for item in mylist:
    while True:
        response = requests.get(f'myurl/{item}', headers=headers)
        res_text = response.text
        if 'SUCESSFUL EXECUTION' in res_text:
            scraped_item = (item, 'PAY IT')
        else:
            json_data = json.loads(res_text)
            scraped_item = (item, json_data['errorMsg'])
        print(scraped_item)
        results.append(scraped_item)
        if 'error happened' not in res_text:
            break
I solved it using the while True trick:
for item in mylist:
    while True:
        response = requests.get(f'myurl/{item}', headers=headers)
        res_text = response.text
        if 'error happened' not in res_text:
            break
    if 'SUCESSFUL EXECUTION' in res_text:
        scraped_item = (item, 'PAY IT')
    else:
        json_data = json.loads(res_text)
        scraped_item = (item, json_data['errorMsg'])
    print(scraped_item)
    results.append(scraped_item)
    time.sleep(1)
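The while True version above never moves on if the server keeps returning 'error happened' for an item. A hedged variant (not from the original post; MAX_RETRIES is an assumed limit, and mylist, headers and results come from the surrounding snippet) caps the retries per item and records the failure instead:

import json
import time
import requests

MAX_RETRIES = 5  # assumed limit; tune as needed

for item in mylist:
    for attempt in range(MAX_RETRIES):
        response = requests.get(f'myurl/{item}', headers=headers)
        res_text = response.text
        if 'error happened' not in res_text:
            break
        time.sleep(1)  # brief pause before retrying
    else:
        # every retry still contained 'error happened'; record it and move on
        results.append((item, 'error happened'))
        continue
    if 'SUCESSFUL EXECUTION' in res_text:
        scraped_item = (item, 'PAY IT')
    else:
        json_data = json.loads(res_text)
        scraped_item = (item, json_data['errorMsg'])
    print(scraped_item)
    results.append(scraped_item)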
I have a Python program which makes some API calls and prints output depending on the API's response. I use the flush=True argument in the print() calls. Sometimes the program looks hung (it won't print anything to the terminal), but when I press CTRL-C, some output is printed and the program appears to resume. I have not written any signal handler for SIGINT. Why does this happen?
UPDATE (Added python code)
import requests
import config
import json
from ipaddress import IPv4Address, IPv4Network

ip_list = []
filepath = "ip_address_test.txt"
with open(filepath) as fp:
    line = fp.readline()
    ip_list.append(line.rstrip())
    while line:
        line = fp.readline()
        ip_list.append(line.rstrip())

print(ip_list)
del ip_list[-1]

sysparm_offset = 0
sysparm_limit = 2
flag = 1

for ip in ip_list:
    hosts = []
    while flag == 1:
        print("ip is "+str(ip), flush=True)

        # Set the request parameters
        url = 'https://xxx.service-now.com/api/now/table/u_cmdb_ci_subnet?sysparm_query=u_lifecycle_status!=end of life^GOTOname>='+ip+'&sysparm_limit='+str(sysparm_limit)+'&sysparm_offset='+str(sysparm_offset)

        # Eg. User name="username", Password="password" for this code sample.
        user = config.asknow_username
        pwd = config.asknow_password
        headers = {"Accept": "application/json"}

        # Do the HTTP request
        response = requests.get(url, auth=(user, pwd), headers=headers, verify=False)

        # Check for HTTP codes other than 200
        if response.status_code != 200:
            print('Status:', response.status_code, 'Headers:', response.headers, 'Error Response:', response.content)
            exit()

        # Decode the json response into a dictionary and use the data
        result = json.loads(response.content)['result']

        iter = 0
        while iter < sysparm_limit:
            print("iter = "+str(iter), flush=True)
            print("checking "+result[iter]["name"], flush=True)
            id = result[iter]["u_guid"]
            url = "https://xxx.service-now.com/api/now/cmdb/instance/u_cmdb_ci_subnet/" + str(id)
            response = requests.get(url, auth=(user, pwd), headers=headers, verify=False)

            # Check for HTTP codes other than 200
            if response.status_code != 200:
                print('Status:', response.status_code, 'Headers:', response.headers, 'Error Response:', response.content)
                exit()

            result1 = json.loads(response.content)['result']
            for relation in result1["outbound_relations"]:
                if relation["type"]["display_value"] == "IP Connection::IP Connection":
                    if IPv4Address(relation["target"]["display_value"]) in IPv4Network(ip):
                        print("appending host", flush=True)
                        hosts.append(relation["target"]["display_value"])
                    else:
                        print("ip not in subnet", flush=True)
                        flag = 0
                        break

            if flag == 0:
                break

            iter = iter+1

        sysparm_offset = sysparm_offset + 2

    with open('output.txt', "a+") as output_file:
        for value in hosts:
            output_file.write(str(value)+"\n")

    print("completed "+ip, flush=True)
    flag = 1
I am not sure if my issue is the same, since you didn't mention clicking inside the command prompt or whether it also resumes when pressing something besides CTRL-C, but it sounds a lot like what I had.
A little more googling led me to an insight I wish I had found as an answer to your question a while ago:
https://www.davici.nl/blog/the-mystery-of-hanging-batch-script-until-keypress
I had no idea a QuickEdit mode existed, but besides following the steps in the article, you can also simply open the command prompt, right-click the title bar, choose "Defaults", and then under "Edit Options" disable "QuickEdit Mode".
I hope this helps you (or someone else) as much as it just helped me.
I created the following script to download images from an API endpoint, and it works as intended. The thing is that it is rather slow, as all the requests have to wait on each other. What is the correct way to keep the steps sequential for each item I want to fetch, but make the fetching parallel across items? The data comes from an online service called servicem8.
So what I hope to achieve is:
fetch all possible job IDs => keep the name and other info
fetch the name of the customer
fetch each attachment of a job
These three steps should be done for each job, so I could make things parallel per job, as the jobs do not have to wait on each other.
Update:
What I do not understand is how to bundle, for example, the three calls per item into one unit, since it is only per item that I can do things in parallel. For example, when I want to
fetch item (fetch name => fetch description => fetch id)
it is the fetch item step that I want to make parallel?
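One way to read that is to keep the per-job calls sequential inside a single function and run that function for many jobs concurrently, e.g. with concurrent.futures. The sketch below is hedged: it reuses the endpoints and test credentials that appear in the code further down, and max_workers=10 is an arbitrary choice.

import requests
from concurrent.futures import ThreadPoolExecutor

user = "test#test.com"  # same placeholder credentials as in the question
passw = "test"

def process_job(item):
    # the calls stay sequential *within* one job...
    url_customer = "https://api.servicem8.com/api_1.0/Company/{}.json".format(item['company_uuid'])
    cus_name = requests.get(url_customer, auth=(user, passw)).json()['name']
    url_attachments = "https://api.servicem8.com/api_1.0/Attachment.json?%24filter=related_object_uuid%20eq%20{}".format(item['uuid'])
    attachments = requests.get(url_attachments, auth=(user, passw)).json()
    return item['uuid'], cus_name, attachments

# ...but many jobs are processed at the same time
jobs = requests.get("https://api.servicem8.com/api_1.0/job.json", auth=(user, passw)).json()
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(process_job, jobs))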
The current code I have is working but rather slow:
import requests
import dateutil.parser
import shutil
import os

user = "test#test.com"
passw = "test"

print("Read json")
url = "https://api.servicem8.com/api_1.0/job.json"
r = requests.get(url, auth=(user, passw))
print("finished reading jobs.json file")

scheduled_jobs = []

if r.status_code == 200:
    for item in r.json():
        scheduled_date = item['job_is_scheduled_until_stamp']
        try:
            parsed_date = dateutil.parser.parse(scheduled_date)
            if parsed_date.year == 2016:
                if parsed_date.month == 10:
                    if parsed_date.day == 10:
                        url_customer = "https://api.servicem8.com/api_1.0/Company/{}.json".format(item['company_uuid'])
                        c = requests.get(url_customer, auth=(user, passw))
                        cus_name = c.json()['name']
                        scheduled_jobs.append(
                            [item['uuid'], item['generated_job_id'], cus_name])
        except ValueError:
            pass

for job in scheduled_jobs:
    print("fetch for job {}".format(job))
    url = "https://api.servicem8.com/api_1.0/Attachment.json?%24filter=related_object_uuid%20eq%20{}".format(job[0])
    r = requests.get(url, auth=(user, passw))
    if r.json() == []:
        pass
    for attachment in r.json():
        if attachment['active'] == 1 and attachment['file_type'] != '.pdf':
            print("fetch for attachment {}".format(attachment))
            url_staff = "https://api.servicem8.com/api_1.0/Staff.json?%24filter=uuid%20eq%20{}".format(
                attachment['created_by_staff_uuid'])
            s = requests.get(url_staff, auth=(user, passw))
            for staff in s.json():
                tech = "{}_{}".format(staff['first'], staff['last'])
                url = "https://api.servicem8.com/api_1.0/Attachment/{}.file".format(attachment['uuid'])
                r = requests.get(url, auth=(user, passw), stream=True)
                if r.status_code == 200:
                    creation_date = dateutil.parser.parse(
                        attachment['timestamp']).strftime("%d.%m.%y")
                    if not os.path.exists(os.getcwd() + "/{}/{}".format(job[2], job[1])):
                        os.makedirs(os.getcwd() + "/{}/{}".format(job[2], job[1]))
                    path = os.getcwd() + "/{}/{}/SC -O {} {}{}".format(
                        job[2], job[1], creation_date, tech.upper(), attachment['file_type'])
                    print("writing file to path {}".format(path))
                    with open(path, 'wb') as f:
                        r.raw.decode_content = True
                        shutil.copyfileobj(r.raw, f)
                else:
                    print(r.text)
Update [14/10]
I updated the code in the following way with some of the hints given. Thanks a lot for that. The only thing I could still optimize, I guess, is the attachment downloading, but it works fine now. A funny thing I learned is that you cannot create a CON folder on a Windows machine :-) I did not know that.
I use pandas as well, just to try to avoid some loops over my list of dicts, but I am not sure it is as performant as it could be. The longest part is actually reading in the full JSON files. I read them in fully because I could not find an API way of telling the API to return only the jobs from September 2016. The API query function seems to work on eq/lt/gt.
import requests
import dateutil.parser
import shutil
import os
import pandas as pd

user = ""
passw = ""

FOLDER = os.getcwd()
headers = {"Accept-Encoding": "gzip, deflate"}

import grequests
urls = [
    'https://api.servicem8.com/api_1.0/job.json',
    'https://api.servicem8.com/api_1.0/Attachment.json',
    'https://api.servicem8.com/api_1.0/Staff.json',
    'https://api.servicem8.com/api_1.0/Company.json'
]
#Create a set of unsent Requests:
print("Read json files")
rs = (grequests.get(u, auth=(user, passw), headers=headers) for u in urls)
#Send them all at the same time:
jobs, attachments, staffs, companies = grequests.map(rs)

#create dataframes
df_jobs = pd.DataFrame(jobs.json())
df_attachments = pd.DataFrame(attachments.json())
df_staffs = pd.DataFrame(staffs.json())
df_companies = pd.DataFrame(companies.json())

#url_customer = "https://api.servicem8.com/api_1.0/Company/{}.json".format(item['company_uuid'])
#c = requests.get(url_customer, auth=(user, passw))

#url = "https://api.servicem8.com/api_1.0/job.json"
#jobs = requests.get(url, auth=(user, passw), headers=headers)

#print("Reading attachments json")
#url = "https://api.servicem8.com/api_1.0/Attachment.json"
#attachments = requests.get(url, auth=(user, passw), headers=headers)

#print("Reading staff.json")
#url_staff = "https://api.servicem8.com/api_1.0/Staff.json"
#staffs = requests.get(url_staff, auth=(user, passw))

scheduled_jobs = []

if jobs.status_code == 200:
    print("finished reading json file")
    for job in jobs.json():
        scheduled_date = job['job_is_scheduled_until_stamp']
        try:
            parsed_date = dateutil.parser.parse(scheduled_date)
            if parsed_date.year == 2016:
                if parsed_date.month == 9:
                    cus_name = df_companies[df_companies.uuid == job['company_uuid']].iloc[0]['name'].upper()
                    cus_name = cus_name.replace('/', '')
                    scheduled_jobs.append([job['uuid'], job['generated_job_id'], cus_name])
        except ValueError:
            pass

print("{} jobs to fetch".format(len(scheduled_jobs)))

for job in scheduled_jobs:
    print("fetch for job attachments {}".format(job))
    #url = "https://api.servicem8.com/api_1.0/Attachment.json?%24filter=related_object_uuid%20eq%20{}".format(job[0])
    if attachments == []:
        pass
    for attachment in attachments.json():
        if attachment['related_object_uuid'] == job[0]:
            if attachment['active'] == 1 and attachment['file_type'] != '.pdf' and attachment['attachment_source'] != 'INVOICE_SIGNOFF':
                for staff in staffs.json():
                    if staff['uuid'] == attachment['created_by_staff_uuid']:
                        tech = "{}_{}".format(
                            staff['first'].split()[-1].strip(), staff['last'])
                creation_timestamp = dateutil.parser.parse(
                    attachment['timestamp'])
                creation_date = creation_timestamp.strftime("%d.%m.%y")
                creation_time = creation_timestamp.strftime("%H_%M_%S")
                path = FOLDER + "/{}/{}/SC_-O_D{}_T{}_{}{}".format(
                    job[2], job[1], creation_date, creation_time, tech.upper(), attachment['file_type'])
                # fetch attachment
                if not os.path.isfile(path):
                    url = "https://api.servicem8.com/api_1.0/Attachment/{}.file".format(attachment['uuid'])
                    r = requests.get(url, auth=(user, passw), stream=True)
                    if r.status_code == 200:
                        if not os.path.exists(FOLDER + "/{}/{}".format(job[2], job[1])):
                            os.makedirs(
                                FOLDER + "/{}/{}".format(job[2], job[1]))
                        print("writing file to path {}".format(path))
                        with open(path, 'wb') as f:
                            r.raw.decode_content = True
                            shutil.copyfileobj(r.raw, f)
                    else:
                        print(r.text)
                else:
                    print("file already exists")
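One remaining pain point mentioned in the update is having to read the full job.json. Since the $filter parameter already appears above with eq, and the update notes it also seems to accept lt/gt, a server-side date filter might avoid downloading every job. This is only a hedged guess at the query, not confirmed against the ServiceM8 docs; user, passw and headers are the ones defined in the updated code.

import requests

# Hypothetical server-side filter so the whole job.json never has to be downloaded.
# The gt operator is assumed from the observation that $filter "seems to work on eq/lt/gt".
url = "https://api.servicem8.com/api_1.0/job.json"
params = {"$filter": "job_is_scheduled_until_stamp gt '2016-09-01'"}
r = requests.get(url, params=params, auth=(user, passw), headers=headers)
jobs_sept = r.json() if r.status_code == 200 else []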
The general idea is to use asynchronous URL requests, and there is a Python module named grequests for that: https://github.com/kennethreitz/grequests
From the documentation:
import grequests
urls = [
'http://www.heroku.com',
'http://python-tablib.org',
'http://httpbin.org',
'http://python-requests.org',
'http://fakedomain/',
'http://kennethreitz.com'
]
#Create a set of unsent Requests:
rs = (grequests.get(u) for u in urls)
#Send them all at the same time:
grequests.map(rs)
And the response:
[<Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, None, <Response [200]>]
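Note the None in that list: the request to http://fakedomain/ failed, and grequests returns None for it by default. As a hedged refinement, grequests.map also accepts an exception_handler callback and a size argument to cap concurrency:

import grequests

def on_error(request, exception):
    # called for any request that raised instead of returning a response
    print("failed: {} ({})".format(request.url, exception))

urls = ['http://www.heroku.com', 'http://httpbin.org', 'http://fakedomain/']
rs = (grequests.get(u) for u in urls)
# size=5 caps how many requests are in flight at once
responses = grequests.map(rs, size=5, exception_handler=on_error)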