Recreating python mechanize script in R - python

I'd like to recreate the Python script below, which uses mechanize and http.cookiejar, in R. I thought it would be straightforward using rvest, but I was unable to do so. Any insight on which packages to use and how to apply them would be extremely helpful. I realize reticulate may be a possibility, but I figure there has to be a straightforward way to do this in R.
import mechanize
import http.cookiejar
b = mechanize.Browser()
b.set_handle_refresh(True)
b.set_debug_redirects(True)
b.set_handle_redirect(True)
b.set_debug_http(True)
cj = http.cookiejar.CookieJar()
b.set_cookiejar(cj)
b.addheaders = [
('User-agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.94 Safari/537.36'),
('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8'),
('Host', 'www.fangraphs.com'),
('Referer', 'https://www.fangraphs.com/auctiontool.aspx?type=pit&proj=atc&pos=1,1,1,1,5,1,1,0,0,1,5,5,0,18,0&dollars=400&teams=12&mp=5&msp=5&mrp=5&mb=1&split=&points=c|0,1,2,3,4,5|0,1,2,3,4,5&lg=MLB&rep=0&drp=0&pp=C,SS,2B,3B,OF,1B&players=')
]
b.open("https://www.fangraphs.com/auctiontool.aspx?type=pit&proj=atc&pos=1,1,1,1,5,1,1,0,0,1,5,5,0,18,0&dollars=400&teams=12&mp=5&msp=5&mrp=5&mb=1&split=&points=c|0,1,2,3,4,5|0,1,2,3,4,5&lg=MLB&rep=0&drp=0&pp=C,SS,2B,3B,OF,1B&players=")
def is_form1_form(form):
    return "id" in form.attrs and form.attrs['id'] == "form1"
b.select_form(predicate=is_form1_form)
b.form.find_control(name='__EVENTTARGET').readonly = False
b.form.find_control(name='__EVENTARGUMENT').readonly = False
b.form['__EVENTTARGET'] = 'AuctionBoard1$cmdCSV'
b.form['__EVENTARGUMENT'] = ''
print(b.submit().read())
The R code I used to try to recreate this with rvest is below. The comments indicate the main source of my confusion. In particular, the fields that the Python code sets did not show up when I grabbed the form with rvest, and when I tried to insert them manually I got a Connection Refused error on submitting.
library(rvest)
atc.pitcher.link = "https://www.fangraphs.com/auctiontool.aspx?type=pit&proj=atc&pos=1,1,1,1,5,1,1,0,0,1,5,5,0,18,0&dollars=400&teams=12&mp=5&msp=5&mrp=5&mb=1&split=&points=c|0,1,2,3,4,5|0,1,2,3,4,5&lg=MLB&rep=0&drp=0&pp=C,SS,2B,3B,OF,1B&players="
proj.data = html_session(atc.pitcher.link)
form.unfilled = proj.data %>% html_node("form") %>% html_form()
# note: I am surprised "__EVENTTARGET" and "__EVENTARGUMENT" are not included as fields of the unfilled form. I can select them in the posted Python script.
# If I try to create them with the appropriate values I get a Connection Refused error.
form.unfilled[[5]]$`__EVENTTARGET` = form.unfilled[[5]]$`__VIEWSTATE`
form.unfilled[[5]]$`__EVENTARGUMENT`= form.unfilled[[5]]$`__VIEWSTATE`
form.unfilled[[5]]$`__EVENTTARGET`$readonly = FALSE
form.unfilled[[5]]$`__EVENTTARGET`$value = "AuctionBoard1$cmdCSV"
form.unfilled[[5]]$`__EVENTARGUMENT`$value = ""
form.unfilled[[5]]$`__EVENTARGUMENT`$readonly = FALSE
form.filled = form.unfilled
session = submit_form(proj.data, form.filled)

Here is a way to do it using RSelenium, setting Chrome to be headless and enabling remote download to your working directory. It automatically brings up a headless browser and then lets the code drive it.
I believe that to do the equivalent in rvest you would need to write some native phantomjs.
library(RSelenium)
library(wdman)
eCaps <- list(
  chromeOptions = list(
    args = c('--headless','--disable-gpu', '--window-size=1280,800'),
    prefs = list(
      "profile.default_content_settings.popups" = 0L,
      "download.prompt_for_download" = FALSE,
      "download.default_directory" = getwd()
    )
  )
)
cDrv <- wdman::chrome()
rD <- RSelenium::rsDriver(extraCapabilities = eCaps)
remDr <- rD$client
remDr$queryRD(
  ipAddr = paste0(remDr$serverURL, "/session/", remDr$sessionInfo[["id"]], "/chromium/send_command"),
  method = "POST",
  qdata = list(
    cmd = "Page.setDownloadBehavior",
    params = list(
      behavior = "allow",
      downloadPath = getwd()
    )
  )
)
atc.pitcher.link= "http://www.fangraphs.com/auctiontool.aspx?type=pit&proj=atc&pos=1,1,1,1,5,1,1,0,0,1,5,5,0,18,0&dollars=400&teams=12&mp=5&msp=5&mrp=5&mb=1&split=&points=c|0,1,2,3,4,5|0,1,2,3,4,5&lg=MLB&rep=0&drp=0&pp=C,SS,2B,3B,OF,1B&players="
remDr$navigate(atc.pitcher.link)
# sleep to be nice and give things time to load
Sys.sleep(8)
# find the button on the page we want to click
option <- remDr$findElement('id', 'AuctionBoard1_cmdCSV')
#click it
option$clickElement()
list.files(getwd(),pattern = 'sysdata')
remDr$closeall()
cDrv$stop()
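For comparison with the browser-driven approach, the postback that the mechanize script in the question performs can be sketched in plain Python: collect the input fields of the form with id "form1", then set __EVENTTARGET to the control name. This is only an illustration. The HTML snippet, the class name FormFieldCollector and the function build_postback are made up here; the real page would have to be fetched over HTTP and carries its own __VIEWSTATE value.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class FormFieldCollector(HTMLParser):
    """Collect name/value pairs of <input> tags inside the form with a given id."""

    def __init__(self, form_id):
        super().__init__()
        self.form_id = form_id
        self.in_form = False
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and attrs.get("id") == self.form_id:
            self.in_form = True
        elif tag == "input" and self.in_form and "name" in attrs:
            # Buttons are only sent by a browser when they are clicked, so skip them.
            if attrs.get("type", "").lower() in ("submit", "button", "image"):
                return
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def build_postback(html, form_id, event_target, event_argument=""):
    """Return the form's fields with the postback target filled in."""
    collector = FormFieldCollector(form_id)
    collector.feed(html)
    fields = dict(collector.fields)
    fields["__EVENTTARGET"] = event_target
    fields["__EVENTARGUMENT"] = event_argument
    return fields


# Hypothetical, heavily shortened stand-in for the real page.
SAMPLE_HTML = """
<form id="form1" method="post">
  <input type="hidden" name="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__EVENTTARGET" value="" />
  <input type="hidden" name="__EVENTARGUMENT" value="" />
  <input type="submit" name="btnOther" value="Go" />
</form>
"""

payload = build_postback(SAMPLE_HTML, "form1", "AuctionBoard1$cmdCSV")
body = urlencode(payload)  # form-encoded string that would be sent as the POST body
print(body)
```

Whether the site accepts such a request without a real browser is not something this sketch can confirm; it only shows which fields the postback is made of.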

Related

Request post method doesn't return valid response

I am trying to work on a website that has simple captcha. Here's the link.
Steps:
One is supposed to type a case number, e.g. 200078510, then type the numbers shown in the captcha, then click the Search button.
Progress:
I could solve the captcha part, but when I use the POST method of the requests library, I don't get a valid response. I get the string حدث خطأ ما, which means "Something went wrong". A successful response would include the case number, e.g. 200078510.
Question:
myCaptcha is correct about 90% of the time, so I think the problem is with the POST request. Can anyone see what is wrong with it?
I provide a working VBA example at the end as additional info, in case that helps.
Here's the code I have so far:
import requests
import cv2
import numpy as np
import pytesseract
from PIL import Image
sNumber = 'Number.png'
sTemp = 'Temp.png'
pytesseract.pytesseract.tesseract_cmd=r'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
def getCaptcha():
    response = requests.get("https://eservices.moj.gov.kw/captcha/imgCaptcha.jsp")
    with open(sNumber, "wb") as f:
        f.write(response.content)
    img = cv2.imread(sNumber)
    lower = np.array([0, 0, 0])
    upper = np.array([46, 46, 255])
    thresh = cv2.inRange(img, lower, upper)
    thresh = 255 - thresh
    cv2.imwrite(sTemp, thresh)
    img=Image.open(sTemp)
    text=pytesseract.image_to_string(img, lang='eng',config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')
    return text
myCaptcha = getCaptcha()
print(myCaptcha)
payload = {'txtCaseNo': '200078510', 'txtCaptcha2': myCaptcha, 'searchType': '0'}
r = requests.post("https://eservices.moj.gov.kw/viewResults/validateCase.jsp", data=payload)
print(r.url)
print(r.text)
I even tried sending headers like this, with the same result:
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.83 Safari/537.36',
'Content-Type':'application/x-www-form-urlencoded'}
payload = {'txtCaseNo': '200078510', 'txtCaptcha2': myCaptcha, 'searchType': '0'}
r = requests.post("https://eservices.moj.gov.kw/viewResults/validateCase.jsp",headers = headers, data=payload)
In short, I need to use the POST method of the requests package to send the right arguments, and then navigate to the multiple sections related to the searched number.
Supplementary information (working reference example in VBA):
I have working VBA code for the entire process. It navigates to the URL, enters a case number and enters the numbers from the captcha. Here's the code:
Public vCaptcha
Sub Test()
Dim wsIndex As Worksheet, wsData As Worksheet, http As New XMLHTTP60, html As New HTMLDocument, htmlData As New HTMLDocument, postCasePane As Object, oTables As Object, postTable As Object, postWrongSec As Object, strArg As String, xTemp As String, sTemp As String, r As Long, lr As Long, i As Long, ii As Long, vMAX As Long, cnt As Long
Set wsIndex = ThisWorkbook.Worksheets("Index")
Set wsData = ThisWorkbook.Worksheets("Data")
wsData.Range("A1").CurrentRegion.Offset(1).ClearContents
For r = 2 To wsIndex.Cells(Rows.Count, 1).End(xlUp).Row
If r Mod 10 = 0 Then ThisWorkbook.Save
lr = wsData.Cells(Rows.Count, 1).End(xlUp).Row + 1
If wsIndex.Cells(r, 1).Value = "" Then GoTo Skipper
sPoint:
Application.StatusBar = "Case Number: " & wsIndex.Cells(r, 1).Value & " ------- Row " & r
DecryptCaptcha
strArg = "txtCaseNo=" & wsIndex.Cells(r, 1).Value & "&txtCaptcha2=" & vCaptcha & "&searchType=0"
With http
.Open "POST", "https://eservices.moj.gov.kw/viewResults/validateCase.jsp", False
.setRequestHeader "Content-type", "application/x-www-form-urlencoded"
.send strArg
html.body.innerHTML = .responseText
Set postWrongSec = html.querySelector("span[lang='AR-KW']")
If Not postWrongSec Is Nothing Then
If postWrongSec.innerText = "عفوا: رمز الحماية غير صحيح !!!" Then ' "Sorry: the security code is incorrect !!!"
cnt = cnt + 1
Debug.Print "Wrong Captcha " & cnt: GoTo sPoint
End If
End If
Set postCasePane = html.querySelector("#caseViewPane span h4")
If postCasePane Is Nothing Then wsData.Range("A" & lr).Value = wsIndex.Cells(r, 1).Value: wsData.Range("C" & lr).Value = "رقم القضية غير صحيح": GoTo Skipper ' "The case number is incorrect"
.Open "POST", "https://eservices.moj.gov.kw/viewResults/viewLastEvents.jsp", False
.setRequestHeader "Content-type", "application/x-www-form-urlencoded"
.send
html.body.innerHTML = .responseText
End With
Set html = Nothing: Set htmlData = Nothing
Skipper:
Application.Wait Now + TimeValue("00:00:05")
Next r
Application.StatusBar = Empty
MsgBox "Done...", 64
End Sub
And this is the part that is responsible for the captcha:
Private Sub DecryptCaptcha()
Dim res, sDestFolder As String, strFile As String, sURL As String
sDestFolder = ThisWorkbook.Path & "\"
strFile = "Number.png"
sURL = "https://eservices.moj.gov.kw/captcha/imgCaptcha.jsp"
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", sURL, False
.send
res = .responseBody
End With
With CreateObject("ADODB.Stream")
.Type = 1
.Open
.write res
.SaveToFile sDestFolder & strFile, 2
End With
vCaptcha = CleanNumber(ScriptFile(sDestFolder & strFile))
End Sub
Function ScriptFile(strImage As String) As String
Dim wshShell As Object, sOutput As String, strCommand As String
sOutput = ThisWorkbook.Path & "\OutputNumber.txt"
strCommand = "Powershell.exe -File ""C:\Users\" & Environ("USERNAME") & "\Desktop\ConvertImage.ps1"" " & strImage
Set wshShell = CreateObject("WScript.Shell")
wshShell.Run strCommand, 0, True
ScriptFile = CreateObject("Scripting.FileSystemObject").OpenTextFile(sOutput).ReadAll
End Function
Function CleanNumber(ByVal strText As String) As String
With CreateObject("VBScript.RegExp")
.IgnoreCase = True
.Global = True
.Pattern = "[^0-9]"
If .Test(strText) Then
CleanNumber = WorksheetFunction.Trim(.Replace(strText, vbNullString))
Else
CleanNumber = strText
End If
End With
End Function
As for the PowerShell file, these are its contents:
$image=$args[0]
$desktop= (Join-Path $env:USERPROFILE 'Desktop')
$imagefile=(Join-Path $desktop 'NumberNew.png')
$textfile=(Join-Path $desktop 'OutputNumber')
cd (Join-Path $desktop '\')
magick convert $image -resize 300x160 -density 300 -quality 100 $imagefile
magick convert $imagefile -negate -lat 300x160+40% -negate $imagefile
tesseract.exe $imagefile $textfile -l eng
Of course, the code requires tesseract to be installed, and also ImageMagick to manipulate the image. The code works in VBA, but I would like to use Python for this to improve my skills. Now I am stuck and out of ideas. Thanks in advance for any help.
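One direction that may be worth testing, and which is not verified against this site: the Python code fetches the captcha with requests.get and submits the search with a separate requests.post, so any cookie the server sets alongside the captcha image is never sent back. If the server ties the captcha to a session cookie (an assumption), routing both calls through a single requests.Session would keep them connected. A minimal sketch, where search_case and solve_captcha are hypothetical names:

```python
CAPTCHA_URL = "https://eservices.moj.gov.kw/captcha/imgCaptcha.jsp"
SEARCH_URL = "https://eservices.moj.gov.kw/viewResults/validateCase.jsp"


def search_case(session, case_no, solve_captcha):
    """Fetch the captcha and submit the search through the same session.

    `session` is expected to behave like requests.Session, so cookies set by
    the captcha request are sent again with the POST. `solve_captcha` is a
    hypothetical callable that turns the image bytes into a string of digits.
    """
    captcha = session.get(CAPTCHA_URL)
    payload = {
        "txtCaseNo": case_no,
        "txtCaptcha2": solve_captcha(captcha.content),
        "searchType": "0",
    }
    return session.post(SEARCH_URL, data=payload)


# Possible usage (not run here, needs network access and an OCR function):
#   import requests
#   with requests.Session() as s:
#       r = search_case(s, "200078510", my_ocr_function)
#       print(r.text)
```

With this shape, getCaptcha from the question would need to accept the image bytes instead of downloading the image itself.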

Example of a working OWASP Zap script with authenticated scan using API

Can someone please show a script that is capable of doing the above? I have found a good amount of instruction on the web and tried a lot of different things, but I still can't get ZAP to log in to the page to perform a full scan.
The best I get is something like this:
'http://XXX',
'http://XXX/robots.txt',
'http://XXX/sitemap.xml',
'http://XXX/webui',
'http://XXX/webui/index.html',
'http://XXX/webui/index.html?Password=ZAP&Username=ZAP',
'http://XXX/webui/login',
'http://XXX/webui/login/assets',
'http://XXX/webui/login/assets/images',
'http://XXX/webui/login/assets/images/companylogo.png',
'http://XXX/webui/login/assets/styles',
'http://XXX/webui/login/assets/styles/login.css',
'http://XXX/webui/login/login.js',
'http://XXX/webui/login/redirect.js',
'http://XXX/webui?Password=ZAP&Username=ZAP'
Many thanks
from zapv2 import ZAPv2
from random import randint
import socket
import time
zap_ip = 'zap' #name of a Docker container running Zap
port = 8080 # assumption: ZAP's default proxy port; the original script never defines port
target = 'http://example.com'
auth_url = target + "/webui/index.html"
scanners = ['90020', '90029']
# authorized Web UI user
username = 'test'
password = 'test'
auth_data = 'password={%password%}&username={%username%}'
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
zap = ZAPv2(proxies={'http': 'http://' + zap_ip + ':' + str(port),
                     'https': 'http://' + zap_ip + ':' + str(port)})
new_context = randint(1, 100000000000)
session = zap.core.session_location
session_name = 'session_1.session' if zap.core.session_location == \
'session_0.session' else 'session_0.session'
zap.core.new_session(name=session_name)
zap.core.load_session(session_name)
context_id = zap.context.new_context(new_context)
zap.context.include_in_context(new_context, '.*')
zap.ascan.disable_all_scanners()
for scanner in scanners:
    zap.ascan.enable_scanners(scanner)
all_rules = [scanner for scanner in \
             zap.ascan.scanners() if scanner['enabled'] == 'true']
start_url = auth_url if auth_url else target
zap.urlopen(start_url)
auth_method_name = 'formBasedAuthentication'
authmethod_configparams = 'loginUrl=%s&loginRequestData=%s' % (auth_url, auth_data)
authcred_configparams = 'username=%s&password=%s' % (username, password)
zap.authentication.set_authentication_method(contextid=context_id,
                                             authmethodname=auth_method_name,
                                             authmethodconfigparams=authmethod_configparams)
user_id = zap.users.new_user(contextid=context_id, name=username)
zap.users.set_authentication_credentials(contextid=context_id,
                                         userid=user_id,
                                         authcredentialsconfigparams=authcred_configparams)
zap.users.set_user_enabled(contextid=context_id, userid=user_id, enabled=True)
zap.forcedUser.set_forced_user(context_id, user_id)
zap.forcedUser.set_forced_user_mode_enabled('true')
spider = zap.spider.scan_as_user(url=target, contextid=context_id,
                                 userid=user_id, recurse='false')
# scan_as_user returns the scan id, which status() uses to report progress
while (int(zap.spider.status(spider)) < 100):
    time.sleep(2)
zap.ascan.scan(target)
zap.ascan.remove_all_scans()
zap.core.delete_all_alerts()
zap.context.remove_context(new_context)
Authentication is, in general, a pain. There are so many different ways authentication can be implemented that it's really difficult to provide anything other than very generic advice.
However, the fact that you've got a URL like 'http://XXX/webui?Password=ZAP&Username=ZAP' implies you have not configured something correctly, as these are the default values supplied by the ZAP spider.
If you can supply more details about what your application appears to expect and what you are doing, then we should be able to help some more.
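One detail that is easy to get wrong when configuring form-based authentication through the API is the encoding of the config strings: the login URL and the login request data are themselves values inside a query-style string, so they generally need to be URL-encoded before being joined. The sketch below only builds the strings and does not talk to ZAP. The URL and the field names are taken from the question and are assumptions about the application; note that the spidered URL above shows the fields as Password and Username, so the names in the request data may need to match that spelling.

```python
from urllib.parse import quote

login_url = "http://example.com/webui/index.html"
# {%username%} and {%password%} are ZAP's placeholders for the user's credentials.
login_request_data = "username={%username%}&password={%password%}"

# Encode each value so the '&' and '=' inside it are not mistaken for separators.
auth_method_config = (
    "loginUrl=" + quote(login_url, safe="")
    + "&loginRequestData=" + quote(login_request_data, safe="")
)
auth_cred_config = (
    "username=" + quote("test", safe="")
    + "&password=" + quote("test", safe="")
)

print(auth_method_config)
print(auth_cred_config)
```

These two strings are what would be passed as authmethodconfigparams and authcredentialsconfigparams in the script from the question.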

python requests enable cookies/javascript

I am trying to download an Excel file from a specific website. On my local computer it works perfectly:
>>> r = requests.get('http://www.health.gov.il/PublicationsFiles/IWER01_2004.xls')
>>> r.status_code
200
>>> r.content
b'\xd0\xcf\x11\xe0\xa1\xb1...\x00\x00' # Long binary string
But when I connect to a remote Ubuntu server, I get a message related to enabling cookies/JavaScript.
>>> r = requests.get('http://www.health.gov.il/PublicationsFiles/IWER01_2004.xls')
>>> r.status_code
200
>>> r.content
b'<HTML>\n<head>\n<script>\nChallenge=141020;\nChallengeId=120854618;\nGenericErrorMessageCookies="Cookies must be enabled in order to view this page.";\n</script>\n<script>\nfunction test(var1)\n{\n\tvar var_str=""+Challenge;\n\tvar var_arr=var_str.split("");\n\tvar LastDig=var_arr.reverse()[0];\n\tvar minDig=var_arr.sort()[0];\n\tvar subvar1 = (2 * (var_arr[2]))+(var_arr[1]*1);\n\tvar subvar2 = (2 * var_arr[2])+var_arr[1];\n\tvar my_pow=Math.pow(((var_arr[0]*1)+2),var_arr[1]);\n\tvar x=(var1*3+subvar1)*1;\n\tvar y=Math.cos(Math.PI*subvar2);\n\tvar answer=x*y;\n\tanswer-=my_pow*1;\n\tanswer+=(minDig*1)-(LastDig*1);\n\tanswer=answer+subvar2;\n\treturn answer;\n}\n</script>\n<script>\nclient = null;\nif (window.XMLHttpRequest)\n{\n\tvar client=new XMLHttpRequest();\n}\nelse\n{\n\tif (window.ActiveXObject)\n\t{\n\t\tclient = new ActiveXObject(\'MSXML2.XMLHTTP.3.0\');\n\t};\n}\nif (!((!!client)&&(!!Math.pow)&&(!!Math.cos)&&(!![].sort)&&(!![].reverse)))\n{\n\tdocument.write("Not all needed JavaScript methods are supported.<BR>");\n\n}\nelse\n{\n\tclient.onreadystatechange = function()\n\t{\n\t\tif(client.readyState == 4)\n\t\t{\n\t\t\tvar MyCookie=client.getResponseHeader("X-AA-Cookie-Value");\n\t\t\tif ((MyCookie == null) || (MyCookie==""))\n\t\t\t{\n\t\t\t\tdocument.write(client.responseText);\n\t\t\t\treturn;\n\t\t\t}\n\t\t\t\n\t\t\tvar cookieName = MyCookie.split(\'=\')[0];\n\t\t\tif (document.cookie.indexOf(cookieName)==-1)\n\t\t\t{\n\t\t\t\tdocument.write(GenericErrorMessageCookies);\n\t\t\t\treturn;\n\t\t\t}\n\t\t\twindow.location.reload(true);\n\t\t}\n\t};\n\ty=test(Challenge);\n\tclient.open("POST",window.location,true);\n\tclient.setRequestHeader(\'X-AA-Challenge-ID\', ChallengeId);\n\tclient.setRequestHeader(\'X-AA-Challenge-Result\',y);\n\tclient.setRequestHeader(\'X-AA-Challenge\',Challenge);\n\tclient.setRequestHeader(\'Content-Type\' , \'text/plain\');\n\tclient.send();\n}\n</script>\n</head>\n<body>\n<noscript>JavaScript must be enabled in order to view 
this page.</noscript>\n</body>\n</HTML>'
Locally I run macOS with Chrome installed (I'm not actively using it for the script, but maybe it's related?); on the remote I run Ubuntu on DigitalOcean without any GUI browser installed.
The behavior of requests has nothing to do with what browsers are installed on the system; it does not depend on or interact with them in any way.
The problem here is that the resource you are requesting has some kind of "bot mitigation" mechanism enabled to prevent just this kind of access. It returns some javascript with logic that needs to be evaluated, and the results of that logic are then used for an additional request to "prove" you're not a bot.
Luckily, it appears that this specific mitigation mechanism has been solved before, and I was able to quickly get this request working utilizing the challenge-solving functions from that code:
from math import cos, pi, floor
import requests

URL = 'http://www.health.gov.il/PublicationsFiles/IWER01_2004.xls'

def parse_challenge(page):
    """
    Parse a challenge given by mmi and mavat's web servers, forcing us to solve
    some math stuff and send the result as a header to actually get the page.
    This logic is pretty much copied from https://github.com/R3dy/jigsaw-rails/blob/master/lib/breakbot.rb
    """
    top = page.split('<script>')[1].split('\n')
    challenge = top[1].split(';')[0].split('=')[1]
    challenge_id = top[2].split(';')[0].split('=')[1]
    return {'challenge': challenge, 'challenge_id': challenge_id, 'challenge_result': get_challenge_answer(challenge)}

def get_challenge_answer(challenge):
    """
    Solve the math part of the challenge and get the result
    """
    arr = list(challenge)
    last_digit = int(arr[-1])
    arr.sort()
    min_digit = int(arr[0])
    subvar1 = (2 * int(arr[2])) + int(arr[1])
    subvar2 = str(2 * int(arr[2])) + arr[1]
    power = ((int(arr[0]) * 1) + 2) ** int(arr[1])
    x = (int(challenge) * 3 + subvar1)
    y = cos(pi * subvar1)
    answer = x * y
    answer -= power
    answer += (min_digit - last_digit)
    answer = str(int(floor(answer))) + subvar2
    return answer

def main():
    s = requests.Session()
    r = s.get(URL)
    if 'X-AA-Challenge' in r.text:
        challenge = parse_challenge(r.text)
        r = s.get(URL, headers={
            'X-AA-Challenge': challenge['challenge'],
            'X-AA-Challenge-ID': challenge['challenge_id'],
            'X-AA-Challenge-Result': challenge['challenge_result']
        })
        yum = r.cookies
        r = s.get(URL, cookies=yum)
    print(r.content)

if __name__ == '__main__':
    main()
You can use this code to avoid the block (HTMLSession comes from the requests_html package):
from requests_html import HTMLSession
url = 'your url goes here'
s = HTMLSession()
s.headers['user-agent'] = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36'
r = s.get(url)
r.html.render(timeout=8000)
print(r.status_code)
print(r.content)

Changing text variable from another imported script

So we have two scripts, the first being AdidasStock.py and the second being StockWindow.py. I am trying to replace the base URL in getVarientStock from StockWindow.py. Once again, my apologies, I am really new to Python.
I am getting an error:
aulocale1() takes exactly 2 arguments (1 given)
class AdidasStock:
    def __init__(self, clientId, sku):
        self.session = requests.session()
        self.headers = {"User-Agent" : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36",
                        "Accept-Language" : "REPLACETHISPLIZZZ"}
        self.locale = ''
        self.clientId = clientId
        self.sku = sku
        self.skus = []

    def getVarientStock(self, sku, base):
        base = "http://www.adidas.com.au/on/demandware.store/Sites-adidas-AU-Site/en_AU"
        urlVariantStock = base + '/Product-GetVariants?pid=' + sku
        r = requests.get(urlVariantStock, headers=self.headers)
Here is how I am trying to change base, self.locale, and a portion of self.headers from the code above. I am using a Tkinter Checkbutton to trigger this function.
Checkbutton:
aulocale = IntVar()
aucheck = Checkbutton(self.master, variable=aulocale, onvalue=1, offvalue=0, text="AU",command=self.aulocale1)
These are the functions:
def aulocale1(self,base):
    base.replace = "http://www.adidas.com.au/on/demandware.store/Sites-adidas-AU-Site/en_AU"
    self.locale.replace = ('','AU')
    self.headers.replace = ('REPLACETHISPLIZZZ','en-AU,en;q=0.8')

def uklocale1(self,base):
    base.replace = "www.adidas.co.uk/on/demandware.store/Sites-adidas-GB-Site/en_GB"
    self.locale.replace = ('','GB')
    self.headers.replace = ('REPLACETHISPLIZZZ','en-GB,en;q=0.8')
The function def aulocale1(self, base): expects one argument, base, but when you assign it to the Checkbutton using command=self.aulocale1, the Checkbutton executes it without arguments: it runs self.aulocale1().
You can assign a function with arguments to command using lambda:
command=lambda:self.aulocale1("argument")
(BTW: if you use lambda in a for loop you will run into other problems ;) )
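The "other problems" are most likely Python's late binding of closures: a lambda created in a loop looks up the loop variable when it is called, not when it is created. A small illustration without Tkinter, where the list of locales is made up for the example:

```python
locales = ["AU", "GB", "US"]

late = []
bound = []
for loc in locales:
    # Looks up `loc` at call time; after the loop has finished, that is "US".
    late.append(lambda: loc)
    # A default argument is evaluated now, so each lambda keeps its own value.
    bound.append(lambda loc=loc: loc)

print([f() for f in late])   # ['US', 'US', 'US']
print([f() for f in bound])  # ['AU', 'GB', 'US']
```

The same default-argument trick applies to command=lambda loc=loc: ... when creating several Checkbuttons in a loop.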
base is a local variable, so you can't change it from outside the function... but you can call the function with base as an argument, and give that argument a default value:
def getVarientStock(self, sku, base="http://www.adidas.com.au/ ..."):
    urlVariantStock = base + '/Product-GetVariants?pid=' + sku
    r = requests.get(urlVariantStock, headers=self.headers)
If you run it without base
getVarientStock("XX")
then it uses "http://www.adidas.com.au/ ..." as base
but if you run it with second argument
getVarientStock("XX", "http://stackoverflow.com")
then it uses "http://stackoverflow.com" as base
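If the checkbox should change the base for all later calls, another option is to keep the base on the instance and let the callback update it. The class below is a hypothetical, reduced stand-in for AdidasStock, not the original code; the "http://" scheme on the GB URL is added here as an assumption, since the question's version omits it.

```python
class StockChecker:
    """Hypothetical stand-in for AdidasStock, reduced to the locale handling."""

    # locale -> (base URL, Accept-Language header value), taken from the question
    LOCALES = {
        "AU": ("http://www.adidas.com.au/on/demandware.store/Sites-adidas-AU-Site/en_AU",
               "en-AU,en;q=0.8"),
        "GB": ("http://www.adidas.co.uk/on/demandware.store/Sites-adidas-GB-Site/en_GB",
               "en-GB,en;q=0.8"),
    }

    def __init__(self):
        self.locale = ""
        self.base = ""
        self.headers = {"Accept-Language": ""}

    def set_locale(self, locale):
        # Assign new values; strings are immutable, so there is nothing to "replace" in place.
        self.locale = locale
        self.base, self.headers["Accept-Language"] = self.LOCALES[locale]

    def variant_stock_url(self, sku):
        return self.base + "/Product-GetVariants?pid=" + sku


checker = StockChecker()
# This is the kind of zero-argument callable that command= expects.
on_au_checked = lambda: checker.set_locale("AU")
on_au_checked()
print(checker.variant_stock_url("XX"))
```

Because the state lives on the object, the callback needs no base argument at all, which also avoids the "takes exactly 2 arguments" error.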

Header Check in Python (GAE)

I was wondering how I would go about checking HTTP headers to determine whether the request is valid or malformed. How can I do this in Python, more specifically, how can I do this in GAE?
For debugging and viewing the request with its headers, I use the following DDTHandler class.
import cgi
import wsgiref.handlers
import webapp2
class DDTHandler(webapp2.RequestHandler):

    def __start_display(self):
        self.response.out.write("<!--\n")

    def __end_display(self):
        self.response.out.write("-->\n")

    def __show_dictionary_items(self,dictionary,title):
        if (len(dictionary) > 0):
            request = self.request
            out = self.response.out
            out.write("\n" + title + ":\n")
            for key, value in dictionary.iteritems():
                out.write(key + " = " + value + "\n")

    def __show_request_members(self):
        request = self.request
        out = self.response.out
        out.write(request.url+"\n")
        out.write("Query = "+request.query_string+"\n")
        out.write("Remote = "+request.remote_addr+"\n")
        out.write("Path = "+request.path+"\n\n")
        out.write("Request payload:\n")
        if (len(request.arguments()) > 0):
            for argument in request.arguments():
                value = cgi.escape(request.get(argument))
                out.write(argument+" = "+value+"\n")
        else:
            out.write("Empty\n")
        self.__show_dictionary_items(request.headers, "Headers")
        self.__show_dictionary_items(request.cookies, "Cookies")

    def view_request(self):
        self.__start_display()
        self.__show_request_members()
        self.__end_display()

    def view(self, aString):
        self.__start_display()
        self.response.out.write(aString+"\n")
        self.__end_display()
Example:
class RootPage(DDTHandler):
    def get(self):
        self.view_request()
This will output the request, including the headers.
So check the code and take what you need. Though, as said, a malformed "invalid" request probably won't hit your app.
<!--
http://localhost:8081/
Query =
Remote = 127.0.0.1
Path = /
Request payload:
Empty
Headers:
Referer = http://localhost:8081/_ah/login?continue=http%3A//localhost%3A8081/
Accept-Charset = ISO-8859-7,utf-8;q=0.7,*;q=0.3
Cookie = hl=en_US; dev_appserver_login="test#example.com:False:185804764220139124118"
User-Agent = Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.52 Safari/537.17
Host = localhost:8081
Accept = text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language = en-US,en;q=0.8,el;q=0.6
Cookies:
dev_appserver_login = test#example.com:False:185804764220139124118
hl = en_US
-->
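Once the headers are visible, a check on them can be a plain function over a name-to-value mapping. The sketch below is an assumption about what "valid" might mean (certain headers present and non-empty); the function name check_headers and the list of required headers are hypothetical and would need to match the application's own rules.

```python
def check_headers(headers, required=("Host", "User-Agent")):
    """Return a list of problems found in a mapping of header names to values."""
    # Header names are case-insensitive, so compare them in lower case.
    present = {name.lower(): value for name, value in headers.items()}
    problems = []
    for name in required:
        value = present.get(name.lower())
        if value is None:
            problems.append("missing " + name)
        elif not value.strip():
            problems.append("empty " + name)
    return problems


print(check_headers({"host": "localhost:8081", "User-Agent": "Mozilla/5.0"}))  # []
print(check_headers({"Host": "localhost:8081", "User-Agent": " "}))  # ['empty User-Agent']
```

Inside a handler such as RootPage, the mapping would be self.request.headers, and a non-empty result could be answered with an HTTP 400 response.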
