Python script stops working when I put it inside a function

Little bit of background: I'm using Python 2.7.12 on a Windows 10 computer.
This is by far one of the oddest problems I have ever encountered with Python.
I have written a script that makes a GET request to an API, with the correct headers, and gets some XML data back. For the record, when I paste the script as-is into a Python file and run it via CMD, it works perfectly fine.
But it stops working as soon as I wrap it inside a function. Nothing else changes: I just wrap it in a function and use

if __name__ == '__main__':
    my_new_function()

to run it from CMD. The script still runs, but the API says I have wrong auth credentials, and thus I don't get any data back.
I went over every string in this code, and it's all ASCII. I also checked the timestamps, and they are all correct.
This is my script:
import base64
import hashlib
import hmac
import time

import requests

SECRET_KEY = 'YYY'
PUBLIC_KEY = 'XXX'

content_type = 'application/xml'
date = time.strftime('%a, %d %b %Y %H:%M:%S GMT', time.gmtime())
method = 'GET'
uri = '/uri'

msg = """{method}
{content_type}
{date}
x-bol-date:{date}
{uri}""".format(content_type=content_type,
                date=date,
                method=method,
                uri=uri)

h = hmac.new(
    SECRET_KEY,
    msg, hashlib.sha256)
b64 = base64.b64encode(h.digest())
signature = PUBLIC_KEY + b':' + b64

headers = {'Content-Type': content_type,
           'X-BOL-Date': date,
           'X-BOL-Authorization': signature}

r = requests.get('example.com/uri', headers=headers)
The same code inside a function (imports unchanged):
def get_orders():
    SECRET_KEY = 'XXX'
    PUBLIC_KEY = 'YYY'

    content_type = 'application/xml'
    date = time.strftime('%a, %d %b %Y %H:%M:%S GMT', time.gmtime())
    method = 'GET'
    uri = '/uri'

    msg = """{method}
    {content_type}
    {date}
    x-bol-date:{date}
    {uri}""".format(content_type=content_type,
                    date=date,
                    method=method,
                    uri=uri)

    h = hmac.new(
        SECRET_KEY,
        msg, hashlib.sha256)
    b64 = base64.b64encode(h.digest())
    signature = PUBLIC_KEY + b':' + b64

    headers = {'Content-Type': content_type,
               'X-BOL-Date': date,
               'X-BOL-Authorization': signature}

    r = requests.get('example.com/uri', headers=headers)

if __name__ == '__main__':
    get_orders()

I think your multi-line string is picking up spaces when you indent it inside the function. A triple-quoted string preserves the leading whitespace on every line, so each line of your signed message now starts with four spaces, the HMAC no longer matches what the API computes, and it rejects your credentials. Concatenate the message on each line instead and it should work.
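For example, a minimal sketch of that fix (same variables as in the question; joining the parts explicitly means no indentation can leak into the signed message):

# Build the message with an explicit join instead of a triple-quoted
# string, so the function's indentation never becomes part of it.
msg = '\n'.join([method,
                 content_type,
                 date,
                 'x-bol-date:' + date,
                 uri])

textwrap.dedent can also work, but only if every line of the literal (including the first) carries the same indentation.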

Related

Storing a variable from a try statement

I have a program that calls into an API every 60 seconds and stores the data. The program runs on a cellular modem that uses Python 2.6. What I'm trying to do is have the variables StartTimeConv and EndTimeConv from the try statement stored so that if the try statement fails, the except statement can reference them. I've had them declared outside of the try statement, but that generated a "referenced before assignment" error. What I'm ultimately trying to accomplish is that if there's a cell-signal issue or the API service isn't reachable, the start and stop times can still be referenced and the digital io triggers can still function.
import base64
import threading
import time
import urllib2
import xml.etree.ElementTree as ET
from datetime import datetime
from time import mktime

def Client():
    threading.Timer(60, Client).start()
    # Request Session ID
    request = urllib2.Request(url)
    b64auth = base64.standard_b64encode("%s:%s" % (username, password))
    request.add_header("Authorization", "Basic %s" % b64auth)
    result = urllib2.urlopen(request)
    # Parse and store Session ID
    tree = ET.parse(result)
    xml_data = tree.getroot()
    sessionid = xml_data[1].text
    # Dispatch Event Request
    url1 = "SiteURL".format(sessionid)
    request1 = urllib2.Request(url1)
    result1 = urllib2.urlopen(request1)
    # Read and store sys time
    sys_time = time.localtime()
    # Convert sys time to datetime object
    dt = datetime.fromtimestamp(mktime(sys_time))
    # Parse and store Dispatch Event, start and stop time
    try:
        tree1 = ET.parse(result1)
        xml_data1 = tree1.getroot()
        dispatchEvent = xml_data1[0][0][2].text
        EventStartTime = xml_data1[0][0][14].text
        EventEndTime = xml_data1[0][0][1].text
        # Convert string time to datetime object
        StartTimeConv = datetime.strptime(xml_data1[0][0][14].text, "%a %B %d, %Y %H:%M")
        EndTimeConv = datetime.strptime(xml_data1[0][0][1].text, "%a %B %d, %Y %H:%M")
        print(dispatchEvent)
        print(StartTimeConv)
        print(EndTimeConv)
        print(dt)
    except:
        print("No Event")
        pass
    else:
        if dispatchEvent is not None and dt >= StartTimeConv:
            set_digital_io('D0', 'on')
        elif dispatchEvent is not None and dt <= EndTimeConv:
            set_digital_io('D0', 'off')
        else:
            set_digital_io('D0', 'off')
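A minimal sketch of one way to do this (the helper name parse_event_times is mine, just for illustration): give the variables module-level defaults and only overwrite them on a successful parse, so a failed API call leaves the previous run's values intact.

from datetime import datetime

# Last-known-good values survive across calls because they live at
# module scope; each successful parse overwrites them.
StartTimeConv = None
EndTimeConv = None

def parse_event_times(start_text, end_text):
    global StartTimeConv, EndTimeConv
    try:
        StartTimeConv = datetime.strptime(start_text, "%a %B %d, %Y %H:%M")
        EndTimeConv = datetime.strptime(end_text, "%a %B %d, %Y %H:%M")
    except (TypeError, ValueError):
        # Parse failed (no signal, API unreachable): StartTimeConv and
        # EndTimeConv keep the previous run's values, so the digital io
        # logic can still reference them.
        pass
    return StartTimeConv, EndTimeConv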

Python SIEM log collector hangs randomly

I am trying to troubleshoot a script for Mimecast's API. The script runs fine for the most part, but a few times I have noticed that it stops pulling logs and generally looks like a hung process. After restarting the script and manually pushing logs to the syslog server, it starts working again without issue. I am not able to reproduce the issue at will.
The script is supposed to do the following:
- Authenticate against Mimecast's API
- Sign responses
- Download, extract, and save log files to the log dir
- Use a tokenized header to determine which file was downloaded in the last request, saving the token ID in a file in the checkpoint directory
- Push files to the remote syslog server
- Output any errors and info to the console
Below is the sample code from Mimecast.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import logging.handlers
import json
import os
import requests
import base64
import uuid
import datetime
import hashlib
import shutil
import hmac
import time
from zipfile import ZipFile
import io
# Set up variables
APP_ID = "YOUR DEVELOPER APPLICATION ID"
APP_KEY = "YOUR DEVELOPER APPLICATION KEY"
URI = "/api/audit/get-siem-logs"
EMAIL_ADDRESS = 'EMAIL ADDRESS OF YOUR ADMINISTRATOR'
ACCESS_KEY = 'ACCESS KEY FOR YOUR ADMINISTRATOR'
SECRET_KEY = 'SECRET KEY FOR YOUR ADMINISTRATOR'
LOG_FILE_PATH = "FULLY QUALIFIED PATH TO FOLDER TO WRITE LOGS"
CHK_POINT_DIR = 'FULLY QUALIFIED PATH TO FOLDER TO WRITE PAGE TOKEN'
# Set True to output to syslog, false to only save to file
syslog_output = False
# Enter the IP address or hostname of your syslog server
syslog_server = 'localhost'
# Change this to override default port
syslog_port = 514
# delete files after fetching
delete_files = True
# Set threshold in number of files in log file directory
log_file_threshold = 10000
# Set up logging (in this case to terminal)
log = logging.getLogger(__name__)
log.root.setLevel(logging.DEBUG)
log_formatter = logging.Formatter('%(levelname)s %(message)s')
log_handler = logging.StreamHandler()
log_handler.setFormatter(log_formatter)
log.addHandler(log_handler)
# Set up syslog output
syslog_handler = logging.handlers.SysLogHandler(address=(syslog_server, syslog_port))
syslog_formatter = logging.Formatter('%(message)s')
syslog_handler.setFormatter(syslog_formatter)
syslogger = logging.getLogger(__name__)
syslogger = logging.getLogger('SysLogger')
syslogger.addHandler(syslog_handler)
# Supporting methods
def get_hdr_date():
    return datetime.datetime.utcnow().strftime("%a, %d %b %Y %H:%M:%S UTC")

def read_file(file_name):
    try:
        with open(file_name, 'r') as f:
            data = f.read()
            return data
    except Exception as e:
        log.error('Error reading file ' + file_name + '. Cannot continue. Exception: ' + str(e))
        quit()

def write_file(file_name, data_to_write):
    if '.zip' in file_name:
        try:
            byte_content = io.BytesIO(data_to_write)
            zip_file = ZipFile(byte_content)
            zip_file.extractall(LOG_FILE_PATH)
        except Exception as e:
            log.error('Error writing file ' + file_name + '. Cannot continue. Exception: ' + str(e))
            quit()
    else:
        try:
            with open(file_name, 'w') as f:
                f.write(data_to_write)
        except Exception as e:
            log.error('Error writing file ' + file_name + '. Cannot continue. Exception: ' + str(e))
            quit()
def get_base_url(email_address):
    # Create post body for request
    post_body = dict()
    post_body['data'] = [{}]
    post_body['data'][0]['emailAddress'] = email_address
    # Create variables required for request headers
    request_id = str(uuid.uuid4())
    request_date = get_hdr_date()
    headers = {'x-mc-app-id': APP_ID, 'x-mc-req-id': request_id, 'x-mc-date': request_date}
    # Send request to API
    log.debug('Sending request to https://api.mimecast.com/api/discover-authentication with request Id: ' +
              request_id)
    try:
        r = requests.post(url='https://api.mimecast.com/api/login/discover-authentication',
                          data=json.dumps(post_body), headers=headers)
        # Handle Rate Limiting
        if r.status_code == 429:
            log.warning('Rate limit hit. sleeping for ' + str(r.headers['X-RateLimit-Reset'] * 1000))
            time.sleep(r.headers['X-RateLimit-Reset'] * 1000)
    except Exception as e:
        log.error('Unexpected error getting base url. Cannot continue.' + str(e))
        quit()
    # Handle error from API
    if r.status_code != 200:
        log.error('Request returned with status code: ' + str(r.status_code) + ', response body: ' +
                  r.text + '. Cannot continue.')
        quit()
    # Load response body as JSON
    resp_data = json.loads(r.text)
    # Look for api key in the region object to get the base url
    if 'region' in resp_data["data"][0]:
        base_url = resp_data["data"][0]["region"]["api"].split('//')
        base_url = base_url[1]
    else:
        # Handle no region found; likely the email address was entered incorrectly
        log.error(
            'No region information returned from API, please check the email address. '
            'Cannot continue')
        quit()
    return base_url
def post_request(base_url, uri, post_body, access_key, secret_key):
    # Create variables required for request headers
    request_id = str(uuid.uuid4())
    request_date = get_hdr_date()
    unsigned_auth_header = '{date}:{req_id}:{uri}:{app_key}'.format(
        date=request_date,
        req_id=request_id,
        uri=uri,
        app_key=APP_KEY
    )
    hmac_sha1 = hmac.new(
        base64.b64decode(secret_key),
        unsigned_auth_header.encode(),
        digestmod=hashlib.sha1).digest()
    sig = base64.encodebytes(hmac_sha1).rstrip()
    headers = {
        'Authorization': 'MC ' + access_key + ':' + sig.decode(),
        'x-mc-app-id': APP_ID,
        'x-mc-date': request_date,
        'x-mc-req-id': request_id,
        'Content-Type': 'application/json'
    }
    try:
        # Send request to API
        log.debug('Sending request to https://' + base_url + uri + ' with request Id: ' + request_id)
        r = requests.post(url='https://' + base_url + uri, data=json.dumps(post_body), headers=headers)
        # Handle Rate Limiting
        if r.status_code == 429:
            log.warning('Rate limit hit. sleeping for ' + str(r.headers['X-RateLimit-Reset'] * 1000))
            time.sleep(r.headers['X-RateLimit-Reset'] * 1000)
            r = requests.post(url='https://' + base_url + uri, data=json.dumps(post_body), headers=headers)
    # Handle errors
    except Exception as e:
        log.error('Unexpected error connecting to API. Exception: ' + str(e))
        return 'error'
    # Handle errors from API
    if r.status_code != 200:
        log.error('Request to ' + uri + ' with request id: ' + request_id + ' returned with status code: ' +
                  str(r.status_code) + ', response body: ' + r.text)
        return 'error'
    # Return response body and response headers
    return r.content, r.headers
def get_mta_siem_logs(checkpoint_dir, base_url, access_key, secret_key):
    uri = "/api/audit/get-siem-logs"
    # Set checkpoint file name to store page token
    checkpoint_filename = os.path.join(checkpoint_dir, 'get_mta_siem_logs_checkpoint')
    # Build post body for request
    post_body = dict()
    post_body['data'] = [{}]
    post_body['data'][0]['type'] = 'MTA'
    post_body['data'][0]['compress'] = True
    if os.path.exists(checkpoint_filename):
        post_body['data'][0]['token'] = read_file(checkpoint_filename)
    # Send request to API
    resp = post_request(base_url, uri, post_body, access_key, secret_key)
    now = datetime.datetime.now().strftime("%a %b %d %H:%M:%S %Y")
    # Process response
    if resp != 'error':
        resp_body = resp[0]
        resp_headers = resp[1]
        content_type = resp_headers['Content-Type']
        # End if response is JSON as there is no log file to download
        if content_type == 'application/json':
            log.info('No more logs available')
            return False
        # Process log file
        elif content_type == 'application/octet-stream':
            file_name = resp_headers['Content-Disposition'].split('=\"')
            file_name = file_name[1][:-1]
            # Save files to LOG_FILE_PATH
            write_file(os.path.join(LOG_FILE_PATH, file_name), resp_body)
            # Save mc-siem-token page token to checkpoint directory
            write_file(checkpoint_filename, resp_headers['mc-siem-token'])
            try:
                if syslog_output is True:
                    for filename in os.listdir(LOG_FILE_PATH):
                        file_creation_time = time.ctime(os.path.getctime(LOG_FILE_PATH + "/" + filename))
                        if now < file_creation_time or now == file_creation_time:
                            log.info('Loading file: ' + filename + ' to output to ' + syslog_server + ':' + str(syslog_port))
                            with open(file=os.path.join(LOG_FILE_PATH, filename), mode='r', encoding='utf-8') as log_file:
                                lines = log_file.read().splitlines()
                                for line in lines:
                                    syslogger.info(line)
                            log.info('Syslog output completed for file ' + filename)
            except Exception as e:
                log.error('Unexpected error writing to syslog. Exception: ' + str(e))
            # return true to continue loop
            return True
    else:
        # Handle errors
        log.error('Unexpected response')
        for header in resp_headers:
            log.error(header)
        return False
def run_script():
    # discover base URL
    try:
        base_url = get_base_url(email_address=EMAIL_ADDRESS)
    except Exception as e:
        log.error('Error discovering base url for ' + EMAIL_ADDRESS + '. Exception: ' + str(e))
        quit()
    # Request log data in a loop until there are no more logs to collect
    try:
        log.info('Getting MTA log data')
        while get_mta_siem_logs(checkpoint_dir=CHK_POINT_DIR, base_url=base_url, access_key=ACCESS_KEY,
                                secret_key=SECRET_KEY) is True:
            log.info('Getting more MTA log files')
    except Exception as e:
        log.error('Unexpected error getting MTA logs ' + (str(e)))
    file_number = len([name for name in os.listdir(LOG_FILE_PATH) if os.path.isfile(name)])
    if delete_files or file_number >= log_file_threshold:
        for filename in os.listdir(LOG_FILE_PATH):
            file_path = os.path.join(LOG_FILE_PATH, filename)
            try:
                if os.path.isfile(file_path) or os.path.islink(file_path):
                    os.unlink(file_path)
                elif os.path.isdir(file_path):
                    shutil.rmtree(file_path)
            except Exception as e:
                print('Failed to delete %s. Reason: %s' % (file_path, e))
        quit()

# Run script
run_script()
It seems like it could be a race condition, but I am not sure how to confirm that since I can't reproduce it. I notice that SumoLogic has a modified version of this script as well, with a different methodology for managing the files/paths. I haven't had any issues with it yet. If that script works better than the main sample script above, would anybody be able to explain WHY?
https://github.com/SumoLogic/sumologic-content/blob/master/MimeCast/SumoLogic-Mimecast-Data-Collection/siem_collection.py
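One detail worth checking, though it is my observation rather than anything from Mimecast or the thread: requests has no default timeout, so a connection that stalls mid-read blocks the calling thread forever, which looks exactly like a hung collector. A minimal sketch of adding one to the post_request call above (reusing base_url, uri, post_body, headers, and log from the listing):

    try:
        # timeout=(connect, read) in seconds; without it, requests waits
        # indefinitely for a server that has stopped responding.
        r = requests.post(url='https://' + base_url + uri,
                          data=json.dumps(post_body),
                          headers=headers,
                          timeout=(10, 120))
    except requests.exceptions.Timeout:
        log.error('Request to ' + uri + ' timed out')
        return 'error'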

JSON body is truncated using non-English characters in AWS Lambda function

I am using an API Gateway and an AWS Lambda function as a proxy to my company's API (C# Web API 2.0).
The Lambda function is written in Python 2.7, and I am using Python's urllib2 to pass the HTTP request to the API.
I encountered a strange issue when sending a JSON body containing Hebrew characters: the JSON is being cut in the middle. I am making sure that the JSON sent from the Lambda is complete, but the body received by the API is truncated somewhere along the way.
This is the Lambda function:
from __future__ import print_function
import json
import urllib2
import HTMLParser
base = "http://xxxxxx/api"
hparser = HTMLParser.HTMLParser()
def lambda_handler(event, context):
print("Got event\n" + json.dumps(event, indent=2))
# Form URL
url = base + event['queryStringParameters']['rmt']
print('URL = %s' % url)
req = urllib2.Request(url)
if 'body' in event:
if event['body']:
print('BODY = %s' % json.dumps(event['body'], ensure_ascii=False, encoding='utf8') )
req.add_data(json.dumps(event['body'], ensure_ascii=False, encoding='utf8'))
# Copy only some headers
if 'headers' in event:
if event['headers']:
copy_headers = ('Accept', 'Content-Type', 'content-type')
for h in copy_headers:
if h in event['headers']:
print('header added = %s' % event['headers'][h])
req.add_header(h, event['headers'][h])
# Build response
out = {}
headersjsonstr = ('Access-Control-Allow-Origin', '')
response_header = {}
try:
print('Trying here...')
resp = urllib2.urlopen(req)
out['statusCode'] = resp.getcode()
out['body'] = resp.read()
for head in resp.info().headers:
keyval = head.split(':')
if any(keyval[0] in h for h in headersjsonstr):
response_header[keyval[0]] = keyval[1].replace('\r','').replace('\n','').strip()
print('response_header = %s' % response_header )
out['headers'] = response_header
print('status = %s' % out['statusCode'] )
except urllib2.HTTPError as e:
out['statusCode'] = e.getcode()
out['body'] = e.read()
out['headers'] = e.headers
print('status = %s' % out['statusCode'] )
return out
This is the POST request's raw JSON body:
{"company":"שלום","guests":[{"fullname":"אבי","carno":"67"}],"fromdate":"2018-10-10","todate":"2018-10-10","fromtime":"07:31","totime":"07:31","comments":null,"Employee":{"UserId":"ink1445"}}
And this is what I am getting on the API:
"{\"company\":\"שלום\",\"guests\":[{\"fullname\":\"אבי\",\"carno\":\"67\"}],\"fromdate\":\"2018-10-10\",\"todate\":\"2018-10-10\",\"fromtime\":\"07:31\",\"totime\":\"07:31\",\"comments\":null,\"Employee\":{\"UserId\":\"ink1
Again, when I am sending only English letters, everything is fine.
Please help! Thanks.
Very likely your JSON buffer is too small and you are getting overflow truncation.
The size was probably set assuming ASCII or UTF-8 encoding, and your Unicode characters are wider (consume more bytes).
Depending on which JSON package you are using, you may be able to set an option for Unicode, or you might need to adjust the buffer size manually.
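For what it's worth, here is one concrete way this kind of truncation happens in Python 2 (my illustration, not part of the answer above): json.dumps(..., ensure_ascii=False) returns a unicode string, and any Content-Length derived from its character count undercounts the Hebrew characters, which are two bytes each in UTF-8, so the tail of the body gets cut off.

# -*- coding: utf-8 -*-
import json

body = {u"company": u"שלום"}

text = json.dumps(body, ensure_ascii=False)  # unicode string
data = text.encode('utf-8')                  # byte string

print(len(text))  # 19 characters
print(len(data))  # 23 bytes: each Hebrew letter is 2 bytes in UTF-8

# Sending `text` with a declared length of 19 drops the last 4 bytes.
# Passing the encoded `data` to the request avoids the mismatch.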

Python - compare last modified date of two files, local and remote

I'm trying to implement a Python script that will compare the last-modified dates of a local and a remotely hosted file.
If the remote file is newer, it should:
- delete the local file
- download the remote file with the last-modified date intact
The closest answer I've found to this is Last Modified of file downloaded does not match its HTTP header; however, I believe that approach downloads the whole file, so it doesn't save much resource/time.
What I'd like to do is just review the remote file's headers rather than download the whole file, which I believe should be quicker.
Here's my current code, which is very messy and noobish (see the string replace, etc.). I'm sure there's a better/quicker way - what can you suggest?
import datetime
import os
import urllib
from os import path
from time import mktime, strptime
from urllib import urlretrieve

remote_source = 'http://example.com/somefile.xml'
local_source = 'path/to/myfile.xml'

if path.exists(local_source):
    local_source_last_modified = os.path.getmtime(local_source)
    local_source_last_modified = datetime.datetime.fromtimestamp(local_source_last_modified).strftime('(%Y, %m, %d, %H, %M, %S)')
    conn = urllib.urlopen(remote_source)
    remote_source_last_modified = conn.info().getdate('last-modified')
    remote_source_last_modified = str(remote_source_last_modified)
    remote_source_last_modified = remote_source_last_modified.replace(", 0, 1, 0)", ")")
    if local_source_last_modified < remote_source_last_modified:
        pass
    else:
        headers = urlretrieve(remote_source, local_source)[1]
        lmStr = headers.getheader("Last-Modified")
        remote_source_last_modified = mktime(strptime(lmStr, "%a, %d %b %Y %H:%M:%S GMT"))
        os.utime(local_source, (remote_source_last_modified, remote_source_last_modified))
else:
    headers = urlretrieve(remote_source, local_source)[1]
    lmStr = headers.getheader("Last-Modified")
    remote_source_last_modified = mktime(strptime(lmStr, "%a, %d %b %Y %H:%M:%S GMT"))
    os.utime(local_source, (remote_source_last_modified, remote_source_last_modified))
Just in case anybody reads this, here's what I ended up with:
import datetime
import os
import time
import requests
from urllib import urlretrieve
from urllib2 import HTTPError, URLError

def syncCheck(file_path):
    remote_source = 'http://example.com/' + os.path.basename(file_path)
    local_source = file_path
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36'}
    response = requests.head(remote_source, headers=headers)
    remote_source_last_modified = response.headers["last-modified"]
    remote_source_last_modified = time.mktime(datetime.datetime.strptime(remote_source_last_modified[:-4], "%a, %d %b %Y %H:%M:%S").timetuple())
    try:
        if os.path.exists(local_source):
            local_source_last_modified = os.path.getmtime(local_source)
            if local_source_last_modified == remote_source_last_modified:
                return
            else:
                try:
                    os.remove(local_source)
                except:
                    return
                urlretrieve(remote_source, local_source)
                os.utime(local_source, (remote_source_last_modified, remote_source_last_modified))
        else:
            urlretrieve(remote_source, local_source)
            os.utime(local_source, (remote_source_last_modified, remote_source_last_modified))
    except HTTPError, e:
        print("HTTP Error: " + str(e.fp.read()))
    except URLError, e:
        print("URL Error: " + str(e.reason))

Images caching in browser - app-engine-patch application

I have a little problem with caching images in the browser for my app-engine application.
I'm sending Last-Modified, Expires, and Cache-Control headers, but the image is loaded from the server every time.
Here is the header part of the code:
response['Content-Type'] = 'image/jpg'
response['Last-Modified'] = current_time.strftime('%a, %d %b %Y %H:%M:%S GMT')
response['Expires'] = current_time + timedelta(days=30)
response['Cache-Control'] = 'public, max-age=2592000'
Here is an example of the code for my fix:
def view_image(request, key):
    data = memcache.get(key)
    if data is not None:
        if request.META.get('HTTP_IF_MODIFIED_SINCE') >= data['Last-Modified']:
            data.status_code = 304
        return data
    else:
        image_content_blob = ...  # some code to get the image from the data store
        current_time = datetime.utcnow()
        response = HttpResponse()
        last_modified = current_time - timedelta(days=1)
        response['Content-Type'] = 'image/jpg'
        response['Last-Modified'] = last_modified.strftime('%a, %d %b %Y %H:%M:%S GMT')
        response['Expires'] = current_time + timedelta(days=30)
        response['Cache-Control'] = 'public, max-age=315360000'
        response['Date'] = current_time
        response.content = image_content_blob
        memcache.add(key, response, 86400)
        return response
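One thing that stands out (my sketch, not a posted answer): HTTP_IF_MODIFIED_SINCE and Last-Modified are date strings, so >= compares them lexicographically, and 'Wed' sorts after 'Mon' regardless of the actual dates. Parsing both sides before comparing is safer:

from datetime import datetime

HTTP_DATE_FORMAT = '%a, %d %b %Y %H:%M:%S GMT'

def not_modified(if_modified_since, last_modified):
    # True if the client's cached copy is still current.
    if not if_modified_since:
        return False
    try:
        client_time = datetime.strptime(if_modified_since, HTTP_DATE_FORMAT)
        server_time = datetime.strptime(last_modified, HTTP_DATE_FORMAT)
    except ValueError:
        return False  # unparseable header: fall back to a full response
    return client_time >= server_time

If it returns True, reply with status 304 and no body. Note also that the Expires header in the code above is set to a raw datetime rather than a formatted HTTP date string, which may serialize in a format browsers ignore.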
