Python post request display message if response taking longer than x seconds

I have the following Python code that fetches data from a remote JSON file. Processing the remote JSON file can sometimes be quick and sometimes take a little while, so I put a "please wait" print message before the post request. This works fine. However, for the requests that are quick, the "please wait" is pointless. Is there a way I can display the "please wait" message only if the request is taking longer than x seconds?
try:
    print("Please wait")
    r = requests.post(url="http://localhost/test.php")
    r_data = r.json()

You can do it using a second thread as follows:
import threading
from time import sleep

import requests

isDone = False  # variable to track the request status

def th():
    sleep(2)  # if the download takes more than 2 seconds
    if not isDone:
        print("Please wait...")

dl_thread = threading.Thread(target=th)  # create a new thread that runs th when started
dl_thread.start()  # start the thread

r = requests.post(url="http://localhost/test.php")
isDone = True
r_data = r.json()
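A plain shared boolean works, but a threading.Event makes the "finished" signal explicit and avoids timing a separate sleep. A minimal sketch of the same idea, using the placeholder URL from the question:
import threading

import requests

done = threading.Event()

def show_wait_message():
    # wait(timeout=2) returns False if the event was not set within 2 seconds
    if not done.wait(timeout=2):
        print("Please wait...")

threading.Thread(target=show_wait_message, daemon=True).start()

r = requests.post(url="http://localhost/test.php")
done.set()  # signal the helper thread that the request completed
r_data = r.json()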

Related

How to resend GET request with requests library until I get desired response? (Python)

I am working on a web project where I need to 1) create a user and 2) log in with that user's credentials. The problem is that there is an unspecified amount of time before the user gets added to the database, so I need to wait for that amount of time.
I want to create an explicit wait that would send GET requests every n seconds until the response contains the added user credentials.
from time import sleep
from requests import Session

session = Session()
url = 'app_url_with_user_list_endpoint'

resp = session.get(url=url)
while resp.json()[-1]["email"] != "new_use#samplemail.fake":
    sleep(0.5)
    resp = session.get(url=url)
Here, I tried to update resp until it contains the new user's email, but I just created an infinite loop.
So, how do I wait for the response to hold the desired email? Optional: how do I specify max number of retries?
Does the following work? (Limited to 10 tries, which here amounts to a timeout of 5 seconds.)
from time import sleep
from requests import Session

session = Session()
url = 'app_url_with_user_list_endpoint'
match = "new_use#samplemail.fake"

success = False
for _ in range(10):
    resp = session.get(url=url)
    if resp.json()[-1]["email"] == match:
        success = True
        break
    sleep(0.5)

if success:
    print("Success: User was added")
else:
    print("Timeout: Failed to add user")

How to send many requests in minimal time with Python 3

My task is to send 30-100 POST requests to one URL at one precise moment in time, for example at 13:00:00.550, with an accuracy of a few milliseconds.
The requests differ from each other (there are several types, for example 10), and each type must be sent 5 times.
My problem is sending the HTTP requests quickly enough. Is there a fast way to send 30-100 POST requests in minimal time?
I tried to use asyncio and httpx.AsyncClient to do it.
Here is the relevant part of the code:
from datetime import datetime
import asyncio
import httpx

async def async_post(request_data):
    time_to_sleep = 0.005
    action_time = '13:00:00'
    time_microseconds = 550000
    async with httpx.AsyncClient(cookies=request_data['cookies']) as client:
        # wait until the target second
        while True:
            now_time_second = datetime.now().strftime('%H:%M:%S')
            if action_time == now_time_second:
                break
            await asyncio.sleep(0.05)
        # wait until the target microsecond within that second
        while True:
            now_time_microsecond = int(datetime.now().strftime('%f'))
            if now_time_microsecond >= time_microseconds:
                break
            await asyncio.sleep(0.003)
        for _ in range(5):
            response = await client.post(request_data['url'],
                                         headers=request_data['headers'],
                                         params=request_data['params'],
                                         data=request_data['data'],
                                         timeout=60)
            logger.info('Time: ' + str(datetime.now().strftime('%H:%M:%S.%f')))
            logger.info('Text: ' + str(response.text))
            logger.info('Response time: ' + str(response.headers['Date']))
            await asyncio.sleep(time_to_sleep)

def main():
    loop = asyncio.get_event_loop()
    loop.run_until_complete(
        asyncio.gather(*[async_post(request_data) for request_data in all_requests_data]))
all_requests_data - list of all request types.
request_data - dict that contains the data for one request.
As a result, the time between requests can reach 70-200 ms. That is a lot and does not work for me.
It is not server lag: I tried another application and could see that the server can answer within a few milliseconds, so the problem is not on the server side.
How do I send the requests faster?
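No answer is reproduced for this question here, but one likely source of the 70-200 ms jitter is polling formatted time strings in a loop and opening a separate AsyncClient per request type. A minimal sketch, assuming the same request_data dicts as above (cookies omitted), that sleeps the exact remaining delta to the target time and reuses one shared client:
import asyncio
from datetime import datetime

import httpx

async def fire_at(client, target, request_data):
    # sleep exactly until the target timestamp instead of polling strftime output
    delay = (target - datetime.now()).total_seconds()
    if delay > 0:
        await asyncio.sleep(delay)
    return await client.post(request_data['url'],
                             headers=request_data.get('headers'),
                             data=request_data.get('data'),
                             timeout=60)

async def send_all(all_requests_data):
    # today at 13:00:00.550 (assumed target time from the question)
    target = datetime.now().replace(hour=13, minute=0, second=0, microsecond=550000)
    # one shared client so all requests reuse the same connection pool
    async with httpx.AsyncClient() as client:
        return await asyncio.gather(
            *[fire_at(client, target, rd)
              for rd in all_requests_data for _ in range(5)])

# results = asyncio.run(send_all(all_requests_data))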

Waiting for API response in python3

(background)
I have an ERP application that is managed from a WebLogic console. Recently we noticed that the same activities we perform from the console can be performed using the vendor-provided REST API calls, so we wanted to use this approach programmatically and build some automation.
This is the page from which we can control one of the instances (console screenshot not reproduced here).
The same button acts as Stop and Start to manage the start and stop instance.
Both the start and stop have different API calls which makes sense.
The complete API doc is at : https://docs.oracle.com/cd/E61420_01/doc.92/e80710/smcrestapis.htm#BABFHBJI
(Now)
I wrote a program in Python using the requests library to call these APIs, and it works fine.
The API response can take anywhere between 20 and 30 seconds when I use the stopInstance API,
and normally takes 60 to 90 seconds when I use the startInstance API. But if there is an issue when starting the instance, it takes more than 300 seconds and goes into an indefinite wait.
My problem is that, while starting an instance, I want to wait at most 100 seconds for the response. If it takes more than 100 seconds, the program should display a message like "Instance was not able to start in 100 seconds".
This is my program. I am taking input from a text file and all the values present there have been verified.
import requests
import json
import importlib.machinery
import importlib.util
import numpy
import time
import sys

loader = importlib.machinery.SourceFileLoader('SM', 'sm_details.txt')
spec = importlib.util.spec_from_loader(loader.name, loader)
mod = importlib.util.module_from_spec(spec)
loader.exec_module(mod)

username = str(mod.username)
password = str(mod.password)
hostname = str(mod.servermanagerHostname)
portnum = str(mod.servermanagerPort)
instanceDetails = numpy.array(mod.instanceName)

authenticationAPI = "http://"+hostname+":"+portnum+"/manage/mgmtrestservice/authenticate"
startInstanceAPI = "http://"+hostname+":"+portnum+"/manage/mgmtrestservice/startinstance"

headers = {
    'Content-Type': 'application/json',
    'Cache-Control': 'no-cache',
}

data = {}
data['username'] = username
data['password'] = password

instanceNameDict = {'instanceName': ''}

# Authentication request and storing token
response = requests.post(authenticationAPI, data=json.dumps(data), headers=headers)
token = response.headers['TOKEN']
head2 = {}
head2['TOKEN'] = token

def start(instance):
    print(f'\nTrying to start instance : '+instance['instanceName'])
    # this is where the program is stuck and it does not move to the time.sleep step
    startInstanceResponse = requests.post(startInstanceAPI, data=json.dumps(instance), headers=head2)
    time.sleep(100)
    if startInstanceResponse.status_code == 200:
        print('Instance '+instance['instanceName']+' started.')
    else:
        print('Could not start instance in 100 seconds')
        sys.exit(1)
I would suggest you use the timeout parameter in requests:
requests.post(startInstanceAPI,data=json.dumps(instance), headers=head2, timeout=100.0)
You can tell Requests to stop waiting for a response after a given number of seconds with the timeout parameter. Nearly all production code should use this parameter in nearly all requests. Failure to do so can cause your program to hang indefinitely. (Source)
Here's the requests timeout documentation, where you will also find more details, including exception handling.
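To print the exact message from the question when the 100 seconds pass, the resulting Timeout exception can be caught. A minimal sketch, assuming the same start() function, startInstanceAPI, and head2 shown above, and that the server sends nothing back until the instance has actually started:
def start(instance):
    print('\nTrying to start instance : ' + instance['instanceName'])
    try:
        startInstanceResponse = requests.post(startInstanceAPI,
                                              data=json.dumps(instance),
                                              headers=head2,
                                              timeout=100.0)
    except requests.exceptions.Timeout:
        # no response arrived within 100 seconds
        print('Instance was not able to start in 100 seconds')
        sys.exit(1)
    if startInstanceResponse.status_code == 200:
        print('Instance ' + instance['instanceName'] + ' started.')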

Run Parallel Request session in python

I am trying to open multiple web sessions and save the data into CSV files. I have written my code using a for loop and requests.get, but it takes too long to access the 90 web locations. Can anyone let me know how to run the whole process in parallel for loc_var?
The code is working fine; the only issue is that it runs one loc_var at a time, which takes a long time.
I want to access all the loc_var URLs in the for loop in parallel and then write the CSV files.
Below is the Code:
import pandas as pd
import numpy as np
import os
import requests
import datetime
import zipfile

t = datetime.date.today() - datetime.timedelta(2)
server = [("A", "web1", ":5000", "username=usr&password=p7Tdfr")]

'''List of all web_ips'''
web_1 = ["Web1","Web2","Web3","Web4","Web5","Web6","Web7","Web8","Web9","Web10","Web11","Web12","Web13","Web14","Web15"]

'''List of all locations'''
loc_var = ["post1","post2","post3","post4","post5","post6","post7","post8","post9","post10","post11","post12","post13","post14","post15","post16","post17","post18"]

for s, web, port, usr in server:
    login_url = 'http://'+web+port+'/api/v1/system/login/?'+usr
    print(login_url)
    s = requests.session()
    login_response = s.post(login_url)
    print("login Response", login_response)
    # Start accessing the web for each loc_var
    for mkt in loc_var:
        # output is a CSV file
        com_actions_url = 'http://'+web+port+'/api/v1/3E+date(%5C%22'+str(t)+'%5C%22)and+location+%3D%3D+%27'+mkt+'%27%22&page_size=-1&format=%22csv%22'
        print("com_action_url", com_actions_url)
        r = s.get(com_actions_url)
        print("action", r)
        if r.ok == True:
            with open(os.path.join("/home/Reports_DC/", "relation_%s.csv" % mkt), 'wb') as f:
                f.write(r.content)
        # If loc is not accessible, try another entry from the web_1 list
        if r.ok == False:
            while r.ok == False:
                for web_2 in web_1:
                    login_url = 'http://'+web_2+port+'/api/v1/system/login/?'+usr
                    com_actions_url = 'http://'+web_2+port+'/api/v1/3E+date(%5C%22'+str(t)+'%5C%22)and+location+%3D%3D+%27'+mkt+'%27%22&page_size=-1&format=%22csv%22'
                    login_response = s.post(login_url)
                    print("login Response", login_response)
                    print("com_action_url", com_actions_url)
                    r = s.get(com_actions_url)
                    if r.ok == True:
                        with open(os.path.join("/home/Reports_DC/", "relation_%s.csv" % mkt), 'wb') as f:
                            f.write(r.content)
                        break
There are multiple approaches you can take to make concurrent HTTP requests. Two that I've used are (1) multiple threads with concurrent.futures.ThreadPoolExecutor and (2) sending the requests asynchronously with asyncio/aiohttp.
To use a thread pool to send your requests in parallel, you would first generate a list of the URLs you want to fetch (in your case, a list of login_urls and com_action_urls), and then request all of them concurrently as follows:
from concurrent.futures import ThreadPoolExecutor
import requests

def fetch(url):
    page = requests.get(url)
    return page.text
    # Catch HTTP errors/exceptions here

pool = ThreadPoolExecutor(max_workers=5)

urls = ['http://www.google.com', 'http://www.yahoo.com', 'http://www.bing.com']  # Create a list of urls

for page in pool.map(fetch, urls):
    # Do whatever you want with the results ...
    print(page[0:100])
Using asyncio/aiohttp is generally faster than the threaded approach above, but the learning curve is steeper. Here is a simple example (Python 3.7+):
import asyncio
import aiohttp

urls = ['http://www.google.com', 'http://www.yahoo.com', 'http://www.bing.com']

async def fetch(session, url):
    async with session.get(url) as resp:
        return await resp.text()
        # Catch HTTP errors/exceptions here

async def fetch_concurrent(urls):
    loop = asyncio.get_event_loop()
    async with aiohttp.ClientSession() as session:
        tasks = []
        for u in urls:
            tasks.append(loop.create_task(fetch(session, u)))
        for result in asyncio.as_completed(tasks):
            page = await result
            # Do whatever you want with the results
            print(page[0:100])

asyncio.run(fetch_concurrent(urls))
But unless you are going to be making a huge number of requests, the threaded approach will likely be sufficient (and way easier to implement).
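A hedged sketch of how the thread-pool approach might map onto the per-location CSV loop in the question, assuming the same web, port, t, s (session), and loc_var variables as above; error handling and the web_1 fallback are omitted:
from concurrent.futures import ThreadPoolExecutor
import os

def download_location(mkt):
    # build the per-location URL exactly as in the original loop
    com_actions_url = ('http://' + web + port + '/api/v1/3E+date(%5C%22' + str(t) +
                       '%5C%22)and+location+%3D%3D+%27' + mkt +
                       '%27%22&page_size=-1&format=%22csv%22')
    r = s.get(com_actions_url)
    if r.ok:
        with open(os.path.join("/home/Reports_DC/", "relation_%s.csv" % mkt), 'wb') as f:
            f.write(r.content)
    return mkt, r.ok

with ThreadPoolExecutor(max_workers=10) as pool:
    for mkt, ok in pool.map(download_location, loc_var):
        print(mkt, "ok" if ok else "failed")
Note that sharing one requests session across threads is a common pattern but is not officially guaranteed to be thread-safe; creating one session per thread is the conservative alternative.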

Correct greenlet termination

I am using gevent to download some HTML pages.
Some websites are way too slow, and some stop serving requests after a period of time. That is why I had to limit the total time for a group of requests I make. For that I use the gevent Timeout.
timeout = Timeout(10)
timeout.start()

def downloadSite():
    # code to download the site's urls one by one
    url1 = downloadUrl()
    url2 = downloadUrl()
    url3 = downloadUrl()

try:
    gevent.spawn(downloadSite).join()
except Timeout:
    print 'Lost state here'
But the problem with it is that I lose all the state when the exception fires.
Imagine I crawl the site 'www.test.com'. I have managed to download 10 urls right before the site admins decide to switch the webserver for maintenance. In such a case I will lose the information about the crawled pages when the exception fires.
The question is - how do I save state and process the data even if Timeout happens ?
Why not try something like:
def downloadSite(url):
    # each download gets its own 10-second timeout
    with Timeout(10):
        downloadUrl(url)

urls = ["url1", "url2", "url3"]

workers = []
limit = 5
counter = 0
for i in urls:
    # limit to 5 URL requests at a time
    if counter < limit:
        workers.append(gevent.spawn(downloadSite, i))
        counter += 1
    else:
        gevent.joinall(workers)
        workers = [gevent.spawn(downloadSite, i)]
        counter = 1
gevent.joinall(workers)
You could also save a status in a dict or something for every URL, or append the ones that fail in a different array, to retry later.
A self-contained example:
import gevent
from gevent import monkey
from gevent import Timeout

gevent.monkey.patch_all()
import urllib2

def get_source(url):
    req = urllib2.Request(url)
    data = None
    # False: do not raise on timeout, just leave the block, so data stays None
    with Timeout(2, False):
        response = urllib2.urlopen(req)
        data = response.read()
    return data

N = 10
urls = ['http://google.com' for _ in xrange(N)]
getlets = [gevent.spawn(get_source, url) for url in urls]
gevent.joinall(getlets)
contents = [g.get() for g in getlets]

print contents[5]
It implements one timeout for each request. In this example, contents contains 10 times the HTML source of google.com, each retrieved in an independent request. If one of the requests had timed out, the corresponding element in contents would be None. If you have questions about this code, don't hesitate to ask in the comments.
I saw your last comment. Defining one timeout per request is definitely not wrong from a programming point of view. If you need to throttle traffic to the website, then just don't spawn 100 greenlets simultaneously. Spawn 5, wait until they have returned, then optionally wait for a given amount of time and spawn the next 5 (already shown in the other answer by Gabriel Samfira, as I see now). For my code above, this would mean that you would have to repeatedly call
N = 10
urls = ['http://google.com' for _ in xrange(N)]
getlets = [gevent.spawn(get_source, url) for url in urls]
gevent.joinall(getlets)
contents = [g.get() for g in getlets]
where N should not be too high.
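A minimal sketch of that repeated call in batch form, assuming the get_source function above; each batch of 5 finishes before the next one is spawned, and the batch size and pause are placeholders:
batch_size = 5
all_urls = ['http://google.com' for _ in xrange(20)]
contents = []
for start in xrange(0, len(all_urls), batch_size):
    batch = all_urls[start:start + batch_size]
    getlets = [gevent.spawn(get_source, url) for url in batch]
    gevent.joinall(getlets)
    contents.extend(g.get() for g in getlets)
    gevent.sleep(1)  # optional pause between batches to throttle traffic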
