MultiThreading Python Issue (Segmentation fault) - python

I am trying to run the following code with multithreading, however I keep getting "Segmentation fault (core dumped)". Please advise what I am doing wrong:
import concurrent.futures
import json
from datetime import datetime

import requests

def insert_api(r):
    url = url_responses + '/' + str(r[0])
    response = requests.get(url, headers={'api-key': APIToken, 'Content-Type': 'application/json'})
    if response.status_code == 200:
        dd = json.loads(response.content)
        InsertTable('API_Response', str(r[0]), str(r[1]), json.dumps(dd['result']))
    else:
        logMsg('', 'HTTP request for ' + url_responses + ' failed. HTTP response code is: ' + str(response.status_code), 'failure')
        subject = 'API Request failed ********* ' + datetime.now().strftime("%Y%m%d-%H%M")
        Body = 'This email is to notify that the API request for the URL: ' + str(url) + ' failed at ' + datetime.now().strftime("%Y%m%d-%H%M")
        email_notifier(subject, Body)

with concurrent.futures.ThreadPoolExecutor() as executor:
    executor.map(insert_api, response_list)
InsertTable is a function that inserts records into the table (API_Response) passed as a parameter, along with the other values. email_notifier is a function that sends emails in case of exceptions. Since I have 95k+ records in the API, I am trying to implement the multithreading logic.
Thanks
Samy!!

I was able to resolve it by adding lock logic around my InsertTable call.
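For reference, a minimal sketch of that fix, assuming InsertTable (or the database driver underneath it) is not thread-safe; the lock name and placement are illustrative:

import threading

insert_lock = threading.Lock()  # hypothetical module-level lock guarding InsertTable

def insert_api(r):
    url = url_responses + '/' + str(r[0])
    response = requests.get(url, headers={'api-key': APIToken, 'Content-Type': 'application/json'})
    if response.status_code == 200:
        dd = json.loads(response.content)
        # Serialize the insert: many DB client libraries crash (often with a
        # segfault rather than a Python exception) when shared across threads.
        with insert_lock:
            InsertTable('API_Response', str(r[0]), str(r[1]), json.dumps(dd['result']))

The HTTP calls still run in parallel; only the database write is serialized, which is usually where the crash originates when a connection object is shared across threads.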

Related

Kafka confluent proxy api - send message - Internal server error

I'm trying to wrap the Confluent Kafka proxy API in one class that will handle producing and consuming.
Following this link: https://docs.confluent.io/platform/current/kafka-rest/api.html I tried to implement it as follows:
def send(self, topic, data):
    try:
        r = requests.post(self._url('/topics/' + topic), json=data, headers=headers_v2)
        if not r.ok:
            raise Exception("Error: ", r.reason)
    except Exception as e:
        print(" ")
        print('Event streams send request failed')
        print(Exception, e)
        print(" ")
        return e
but I ended up working with two versions of the API (v2/v3), because I couldn't find some endpoints in one version and vice versa.
For example, I didn't find how to create a topic in v2, so I implemented that with v3.
My issue now is with the send method: I'm getting an Internal Server Error and I can't find out why.
Maybe it's because the topic was created with v3 and I'm trying to produce messages with v2.
I changed the data payload for the send to look like:
data = {"records": [{"value": data}]}
and send passed. poll also passed, using:
r = requests.get(self._url('/consumers/' + self.consumer_group + '/instances/' + self.consumer + '/records'), headers={'Accept': 'application/vnd.kafka.json.v2+json'})
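For completeness, a minimal sketch of the working v2-style send, assuming self._url and the JSON embedded format from the question (Avro and binary payloads use different vnd.kafka content types):

def send(self, topic, data):
    # v2 expects the records envelope and the v2 JSON content type.
    payload = {"records": [{"value": data}]}
    headers = {"Content-Type": "application/vnd.kafka.json.v2+json"}
    r = requests.post(self._url('/topics/' + topic), json=payload, headers=headers)
    r.raise_for_status()  # raise on 4xx/5xx instead of returning the exception
    return r.json()

Pinning the Content-Type explicitly per API version is safer than sharing one headers dict across v2 and v3 calls.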

Python requests.Session() randomly returns 401 even after authenticating

I've been using requests.Session() to make web requests with authentication. Maybe 70% of the time I'll get a status_code of 200, but I also sporadically get 401.
Since I'm using a session - I'm absolutely positive that the credentials are correct - given that the same exact request when repeated may return 200.
Some further details:
I'm working with the SharePoint REST API
I'm using NTLM Authentication
To circumvent the problem, I've tried writing a loop that will sleep for a few seconds and retry the request. The odd thing here is that I haven't seen this actually recover - instead if the first request fails, then all subsequent requests will fail too. But if I just try again - the request may succeed on the first try.
Please note that I've already reviewed this question, but the suggestion is to use requests.Session(), which I'm already doing and still receiving 401s.
Here's some code to demonstrate what I've tried so far.
import time

import requests
from requests_ntlm import HttpNtlmAuth
from urllib.parse import quote

# Establish requests session
s = requests.Session()
s.auth = HttpNtlmAuth(username, password)

# Update the request header to request JSON formatted output
s.headers.update({'Content-Type': 'application/json; odata=verbose',
                  'accept': 'application/json;odata=verbose'})

def RetryLoop(req, max_tries=5):
    '''Takes in a request object and will retry the request
    upon failure up to the specified number of maximum
    retries.

    Used because error codes occasionally surface even though the
    REST API call is formatted correctly. Exception returns status code
    and text. Success returns request object.

    Default max_tries = 5
    '''
    # Call fails sometimes - allow 5 retries
    counter = 0
    # Initialize loop
    while True:
        # Hit the URL
        r = req
        # Return request object on success
        if r.status_code == 200:
            return r
        # If limit reached then raise exception
        counter += 1
        if counter == max_tries:
            print(f"Failed to connect. \nError code = {r.status_code}\nError text: {r.text}")
        # Message for failed retry
        print(f'Failed request. Error code: {r.status_code}. Trying again...')
        # Spacing out the requests in case of a connection problem
        time.sleep(5)

r = RetryLoop(s.get("https://my_url.com"))
I've additionally tried creating a new session within the retry loop - but that hasn't seemed to help either. And I thought 5 seconds of sleep should be sufficient if it's a temporary block from the site, because I've retried myself in much less time and gotten the expected 200. I would expect to see a failure or two, and then a success.
Is there an underlying problem that I'm missing? And is there a more proper way that I can re-attempt the request given a 401?
EDIT: @Swadeep pointed out the issue - by passing the request into the function, the request is only executed once. Updated code that works properly:
def RetryLoop(req, max_tries=5):
    '''Takes in a request object and will retry the request
    upon failure up to the specified number of maximum
    retries.

    Used because error codes occasionally surface even though the
    REST API call is formatted correctly. Exception returns status code
    and text. Success returns request object.

    Default max_tries = 5
    '''
    # Call fails sometimes - allow 5 retries
    counter = 0
    # Initialize loop
    while True:
        # Return request object on success
        if req.status_code == 200:
            return req
        # If limit reached then raise exception
        counter += 1
        if counter == max_tries:
            print(f"Failed to connect. \nError code = {req.status_code}\nError text: {req.text}")
        # Message for failed retry
        print(f'Failed request. Error code: {req.status_code}. Trying again...')
        # Spacing out the requests in case of a connection problem
        time.sleep(1)
        # Re-issue the request so each iteration actually hits the server
        req = s.get(req.url)
This is what I propose.
import time

import requests
from requests_ntlm import HttpNtlmAuth
from urllib.parse import quote

# Establish requests session
s = requests.Session()
s.auth = HttpNtlmAuth(username, password)

# Update the request header to request JSON formatted output
s.headers.update({'Content-Type': 'application/json; odata=verbose', 'accept': 'application/json;odata=verbose'})

def RetryLoop(s, max_tries=5):
    '''Takes in a session object and will retry the request
    upon failure up to the specified number of maximum
    retries.

    Used because error codes occasionally surface even though the
    REST API call is formatted correctly. Exception returns status code
    and text. Success returns request object.

    Default max_tries = 5
    '''
    # Call fails sometimes - allow 5 retries
    counter = 0
    # Initialize loop
    while True:
        # Hit the URL - the request is re-issued on every iteration
        r = s.get("https://my_url.com")
        # Return request object on success
        if r.status_code == 200:
            return r
        # If limit reached then print the failure (note: this only prints;
        # add a return or raise here to actually stop the loop)
        counter += 1
        if counter == max_tries:
            print(f"Failed to connect. \nError code = {r.status_code}\nError text: {r.text}")
        # Message for failed retry
        print(f'Failed request. Error code: {r.status_code}. Trying again...')
        # Spacing out the requests in case of a connection problem
        time.sleep(5)

r = RetryLoop(s)
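A slightly more general shape for the same idea (a sketch, with illustrative names): pass a zero-argument callable instead of an already-evaluated response, so the request is built lazily and genuinely re-sent on every attempt, and the loop terminates once the retry budget is spent.

import time

def retry_request(make_request, max_tries=5, delay=5):
    '''Call make_request() until it returns 200 or max_tries is exhausted.'''
    for attempt in range(1, max_tries + 1):
        r = make_request()
        if r.status_code == 200:
            return r
        print(f'Attempt {attempt} failed with {r.status_code}. Trying again...')
        time.sleep(delay)
    return r  # last response, so the caller can still inspect status/text

r = retry_request(lambda: s.get("https://my_url.com"))

The lambda keeps the fix from the edit above while avoiding a URL hard-coded inside the retry function.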

python web scraping asp.net site returns internal server error after initial success

I am scraping an ASP.NET site by submitting form data, using a thread pool to send 4 parallel requests. The first set of parallel requests gets processed correctly and I get the desired response, which I process as needed.
But from the next request onward I get a runtime error (Description: An exception occurred while processing your request. Additionally, another exception occurred while executing the custom error page for the first exception. The request has been terminated.) as the response, so I am unable to process more than the first 4 requests. Any suggestions to improve the code snippet below are welcome.
import urllib.parse

import requests
from bs4 import BeautifulSoup
from multiprocessing.pool import ThreadPool

def EpicNo_Search(EpicNo):
    print(EpicNo)
    global headers, formData, url
    choice = '21'
    # This intermediate request is made to get the eventvalidation/viewstate tokens
    session = requests.session()
    res = session.get(url, headers=headers)
    soup = BeautifulSoup(res.text, 'lxml')
    formData['__EVENTVALIDATION'], formData['__VIEWSTATE'] = extract_form_hiddens(soup)
    formData['ctl00$ContentPlaceHolder1$gr1'] = 'RadioButton2'
    formData['__EVENTTARGET'] = 'ctl00$ContentPlaceHolder1$RadioButton2'
    res = session.post(url, urllib.parse.urlencode(formData), headers=headers)
    if "Server Error" in res.text:
        filename = 'zidlist'
        with open('./{}.txt'.format(filename), mode='at', encoding='utf-8') as file:
            file.write(EpicNo)
    else:
        # Final request
        soup = BeautifulSoup(res.text, 'lxml')
        formData['__EVENTVALIDATION'], formData['__VIEWSTATE'] = extract_form_hiddens(soup)
        formData['ctl00$ContentPlaceHolder1$gr1'] = 'RadioButton2'
        formData['ctl00$ContentPlaceHolder1$Drop4'] = choice
        formData['__EVENTTARGET'] = ''
        formData['ctl00$ContentPlaceHolder1$TextBox4'] = EpicNo
        formData['ctl00$ContentPlaceHolder1$Button3'] = 'Search'
        res = session.post(url, formData, headers=headers)
        if 'No Record Found , Please Fill Form 6' in res.text:
            write_csv('No Match', 'output.csv', epicno=EpicNo)
        else:
            write_csv(res.text.encode('utf-8'), 'output.csv')

# We make 4 parallel requests to the website for faster result consolidation
pool = ThreadPool(4)
pool.map(EpicNo_Search, epicnolist)
My request header has User-Agent info, Cache-Control (max-age=0) and Connection (keep-alive).
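One thing worth checking (my assumption; the question does not confirm it): formData is a shared global dict that all four worker threads mutate between the two POSTs, so one thread's __VIEWSTATE/__EVENTVALIDATION tokens can overwrite another's mid-flight, and the server then rejects the mismatched postback. A minimal sketch giving each thread its own copy:

def EpicNo_Search(EpicNo):
    global headers, url
    # Per-thread copy: concurrent workers no longer clobber each other's tokens
    form = dict(formData)
    session = requests.session()
    res = session.get(url, headers=headers)
    soup = BeautifulSoup(res.text, 'lxml')
    form['__EVENTVALIDATION'], form['__VIEWSTATE'] = extract_form_hiddens(soup)
    form['ctl00$ContentPlaceHolder1$gr1'] = 'RadioButton2'
    form['__EVENTTARGET'] = 'ctl00$ContentPlaceHolder1$RadioButton2'
    res = session.post(url, urllib.parse.urlencode(form), headers=headers)
    # ... the rest of the flow is unchanged, using `form` instead of formData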

AWS Lambda/SNS Publish ignore invalid endpoints

I'm sending Apple push notifications via AWS SNS from Lambda with Boto3 and Python.
from __future__ import print_function
import boto3

def lambda_handler(event, context):
    client = boto3.client('sns')
    for record in event['Records']:
        if record['eventName'] == 'INSERT':
            rec = record['dynamodb']['NewImage']
            competitors = rec['competitors']['L']
            for competitor in competitors:
                if competitor['M']['confirmed']['BOOL'] == False:
                    endpoints = competitor['M']['endpoints']['L']
                    for endpoint in endpoints:
                        print(endpoint['S'])
                        response = client.publish(
                            #TopicArn='string',
                            TargetArn=endpoint['S'],
                            Message='test message'
                            #Subject='string',
                            #MessageStructure='string',
                        )
Everything works fine! But when an endpoint is invalid for some reason (at the moment this happens every time I run a development build on my device, since I then get a different endpoint; it will be either not found or deactivated), the Lambda function fails and gets called all over again. In this particular case, if for example the second endpoint fails, it will send the push over and over again to endpoint 1, to infinity.
Is it possible to ignore invalid endpoints and just keep going with the function?
Thank you
Edit:
Thanks to your help I was able to solve it with:
try:
    response = client.publish(
        #TopicArn='string',
        TargetArn=endpoint['S'],
        Message='test message'
        #Subject='string',
        #MessageStructure='string',
    )
except Exception as e:
    print(e)
    continue
AWS Lambda on failure retries the function until the event expires from the stream.
In your case, since the exception on the 2nd endpoint is not handled, the retry mechanism causes the re-execution of the publish to the first endpoint.
If you handle the exception and ensure the function ends successfully even when there is a failure, the retries will not happen.
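A narrower variant (my suggestion, not from the answer above): boto3 exposes modeled exceptions on the SNS client, so you can skip only endpoint-related failures instead of swallowing every error:

for endpoint in endpoints:
    try:
        response = client.publish(
            TargetArn=endpoint['S'],
            Message='test message'
        )
    except (client.exceptions.EndpointDisabledException,
            client.exceptions.InvalidParameterException) as e:
        # Disabled or malformed endpoint: log it and move on
        print('Skipping endpoint {}: {}'.format(endpoint['S'], e))
        continue

Anything else (throttling, permissions) still fails loudly, which preserves Lambda's retry behaviour for genuinely transient errors.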

Why can't I decode a JSON message in the following situation?

I am trying to send a JSON message from one computer to another via a POST request.
The script which sends the message is the following:
message = {'station':'turn on'}
res = rest.send( 'POST', server_addr + "/newstation", json.dumps(message), {'Content-Type': 'application/json'} )
The rest.send(...) method should be correct as I used it before and it worked fine.
The PC which sends the POST request runs Linux, while the receiving one runs Windows 8, in case that matters.
On the receiving machine I have the following:
@app.route('/newstation', methods=['POST'])
def new_station():
    j_data = request.get_json()
    d = decode_data(j_data)
where decode_data(j_data) is the following
def decode_data(j_data):
    d = json.loads(j_data)
    return d
My problem is: whenever I try to send the POST request from the first machine, the response is "Internal server error", and on the server machine the error raised is "TypeError: expected string or buffer".
Now I am thinking that it may be a matter of string encoding.
The POST request is received and I can print the JSON content without problems; the issue arises when I try to decode.
I fixed the issue, it was a mistake on my part (of course). I misunderstood the documentation.
@app.route('/newstation', methods=['POST'])
def new_station():
    j_data = request.get_json()
    #d = decode_data(j_data)
request.get_json() already returns a dictionary, so the decode_data function isn't actually needed. I already have the result without the need for json.loads().
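To make the distinction concrete, a minimal sketch (assuming a standard Flask app): get_json() hands you a parsed object, while json.loads() is only needed if you start from the raw request bytes.

from flask import Flask, request
import json

app = Flask(__name__)

@app.route('/newstation', methods=['POST'])
def new_station():
    d = request.get_json()  # already a dict parsed from the JSON body
    # Equivalent manual route, only needed when working with raw bytes:
    # d = json.loads(request.data.decode('utf-8'))
    return 'received: ' + d['station']

Calling json.loads() on the dict that get_json() returns is exactly what raises the "expected string or buffer" TypeError.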
