Editing dictionary in Python

I'm still very new to coding (I've been coding for a week), so I am struggling with a very basic task.
I am trying to log into a website using Python, but I am having a hard time handling the Set-Cookie response header.
See my current code below:
import requests

targetURL = "http://hostip/v2_Website/aspx/login.aspx"
headers = {
    "Host": "*host IP*",
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"}
response = requests.get(url=targetURL,
                        proxies=proxies,  # proxies dict defined elsewhere
                        headers=headers)
response_headers = response.headers
When I print the response.headers I get the following:
{'Cache-Control': 'no-cache, no-store', 'Pragma': 'no-cache,no-cache', 'Content-Length': '15671', 'Content-Type': 'text/html; charset=utf-8', 'Expires': '-1', 'Server': 'Microsoft-IIS/7.5', 'X-AspNet-Version': '2.0.50727', 'Set-Cookie': 'ASP.NET_SessionId=vq5q4lzlrqiiebbmxw341yic; path=/; HttpOnly, CookieLoginAttempts=5; expires=Tue, 14-Aug-2018 17:14:09 GMT; path=/', 'X-Powered-By': 'ASP.NET', 'Date': 'Tue, 14 Aug 2018 07:14:10 GMT', 'Connection': 'close'}
Obviously, when I reuse these headers in my HTTP POST it fails, because the POST then carries a Set-Cookie header instead of a Cookie header.
My objectives are as follows:
Update/change the Set-Cookie key to Cookie
Then I would like to remove values that are not needed in the Cookie key
Add other keys and values
Ultimately I would like to change the headers to the following, so I can use them for my POST in order to pass my login credentials:
POST /Test server/aspx/Login.aspx?function=Welcome HTTP/1.1
Host: *Host IP*
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Referer: http://*HostIP*/v2_Website/aspx/main.aspx?function=Welcome
Cookie: ASP.NET_SessionId=3vy0fy55xsmffhbotikrwh55; CookieLoginAttempts=5; Admin=false
Connection: close
Upgrade-Insecure-Requests: 1
Content-Type: application/x-www-form-urlencoded
Content-Length: 220
Is my objective above even possible? If so, how does one achieve it? I don't understand the process of modifying a dictionary I can't see.
Again, please note that I am still very green in the world of coding and trying to "think like a coder", so keeping responses a little less technical would be highly appreciated, just so I can understand your response and advice. Any help would be great!
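Since response.headers behaves like an ordinary Python dict, the three objectives above can be sketched directly. This is a minimal, hedged example: the session ID and the Admin=false pair are placeholder values copied from the question, and the string splitting is only a quick way to pull one value out of the folded Set-Cookie header:

```python
# Placeholder copy of the relevant response headers from the question
headers = {
    "Set-Cookie": ("ASP.NET_SessionId=vq5q4lzlrqiiebbmxw341yic; path=/; HttpOnly, "
                   "CookieLoginAttempts=5; expires=Tue, 14-Aug-2018 17:14:09 GMT; path=/"),
    "Content-Type": "text/html; charset=utf-8",
}

# Objective 1: remove the Set-Cookie key (keeping its value to mine for pairs)
set_cookie = headers.pop("Set-Cookie")

# Objective 2: keep only the name=value pair we need, dropping path/expires/HttpOnly
session_id = set_cookie.split("ASP.NET_SessionId=")[1].split(";")[0]

# Objective 3: add a Cookie key with the trimmed pairs plus anything extra
headers["Cookie"] = "ASP.NET_SessionId=" + session_id + "; CookieLoginAttempts=5; Admin=false"
print(headers["Cookie"])
```

This only edits a local dict, of course; the session-based approach in the answer below is the more robust route, because the server issues a fresh session ID on every visit.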

I was able to find the answer with quite a bit of research.
Instead of trying to edit the headers manually, I did the following:
import requests

session_requests = requests.session()
login_url = "http://*host ip*/v2_Website/aspx/Login.aspx"
result = session_requests.get(login_url)
result = session_requests.post(login_url,
                               headers=dict(referer=login_url))
This pulls through the needed cookie and adds everything as needed. My headers come back as follows:
POST /_v2_Website/aspx/Login.aspx HTTP/1.1
Host: *host IP*
User-Agent: python-requests/2.18.4
Accept-Encoding: gzip, deflate
Accept: */*
Connection: close
referer: http://*hostIP*/v2_Website/aspx/Login.aspx
Cookie: ASP.NET_SessionId=3crqoo45hnn21anuqulmmr55; CookieLoginAttempts=5
Content-Length: 79
Content-Type: application/x-www-form-urlencoded
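For anyone curious why this works: a requests.Session stores cookies from each response and attaches them to later requests automatically. You can inspect the merged Cookie header without any network traffic by preparing a request by hand; the host and cookie values below are placeholders:

```python
import requests

s = requests.Session()
# Pretend an earlier GET set these cookies (placeholder values)
s.cookies.set("ASP.NET_SessionId", "3crqoo45hnn21anuqulmmr55")
s.cookies.set("CookieLoginAttempts", "5")

req = requests.Request("POST", "http://example.com/v2_Website/aspx/Login.aspx",
                       data={"user": "x"},
                       headers={"referer": "http://example.com/v2_Website/aspx/Login.aspx"})
# prepare_request merges the session's cookies into the final headers
prepped = s.prepare_request(req)
print(prepped.headers["Cookie"])
```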

Related

Unable to log in to a site requiring unconventional payload to be sent with post requests

I'm trying to log in to a website using the requests module. While creating a script to do so, I noticed that the payload used there is completely different from the conventional approach. This is exactly what the payload looks like: +åEMAIL"PASSWORD(0. The content type is content-type: application/grpc-web+proto.
The following is what I see in dev tools when I log in to that site manually:
General
--------------------------------------------------------
Request URL: https://grips-web.aboutyou.com/checkout.CheckoutV1/logInWithEmail
Request Method: POST
Status Code: 200
Remote Address: 104.18.9.228:443
Response Headers
--------------------------------------------------------
Referrer Policy: strict-origin-when-cross-origin
access-control-allow-credentials: true
access-control-allow-origin: https://www.aboutyou.cz
access-control-expose-headers: Content-Encoding, Vary, Access-Control-Allow-Origin, Access-Control-Allow-Credentials, Date, Content-Type, grpc-status, grpc-message
cf-cache-status: DYNAMIC
cf-ray: 67d009674f604a4d-SIN
content-encoding: gzip
content-type: application/grpc-web+proto
date: Wed, 11 Aug 2021 08:19:04 GMT
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
server: cloudflare
set-cookie: __cf_bm=a45185d4acac45725b46236884673503104a9473-1628669944-1800-Ab2Aos6ocz7q8B8v53oEsSK5QiImY/zqlTba/Y0FqpdsaQt2c10FJylcwTacmdovm6tjGd8hLdy/LidfFCtOj70=; path=/; expires=Wed, 11-Aug-21 08:49:04 GMT; domain=.aboutyou.com; HttpOnly; Secure; SameSite=None
vary: Origin
Request Headers
--------------------------------------------------------
:authority: grips-web.aboutyou.com
:method: POST
:path: /checkout.CheckoutV1/logInWithEmail
:scheme: https
accept: */*
accept-encoding: gzip, deflate, br
accept-language: en-US,en;q=0.9
cache-control: no-cache
content-length: 48
content-type: application/grpc-web+proto
origin: https://www.aboutyou.cz
pragma: no-cache
referer: https://www.aboutyou.cz/
sec-ch-ua: "Chromium";v="92", " Not A;Brand";v="99", "Google Chrome";v="92"
sec-ch-ua-mobile: ?0
sec-fetch-dest: empty
sec-fetch-mode: cors
sec-fetch-site: cross-site
user-agent: Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.131 Safari/537.36
x-grpc-web: 1
Request Payload
--------------------------------------------------------
+åEMAIL"PASSWORD(0
This is what I've created so far (can't find any way to fill in the payload):
import requests
from bs4 import BeautifulSoup

start_url = 'https://www.aboutyou.cz/'
post_link = 'https://grips-web.aboutyou.com/checkout.CheckoutV1/logInWithEmail'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.131 Safari/537.3',
    'content-type': 'application/grpc-web+proto',
    'origin': 'https://www.aboutyou.cz',
    'referer': 'https://www.aboutyou.cz/',
    'x-grpc-web': '1'
}
payload = {
}
with requests.Session() as s:
    s.headers.update(headers)
    r = s.post(post_link, data=payload)
    print(r.status_code)
    print(r.url)
How can I log in to that site using the requests module?
I don't think that you'll be able to use Python Requests to log in to your target site.
Your post_link url:
post_link = 'https://grips-web.aboutyou.com/checkout.CheckoutV1/logInWithEmail'
indicates that it is a gRPC endpoint. gRPC requires HTTP/2, and Python Requests sends HTTP/1.1 requests only.
Additionally, I noted that the target site also uses Cloudflare, which is difficult to bypass with Python, especially when using Python Requests. The response headers include:
'Set-Cookie': '__cf_bm=11d867459fe0951da4157b475cf88eb3ab7658fb-1629229293-1800-AeFomlmROcmUYcRosxxcSnoJkGOW/WXjUe1WxK6SkM2eXIbnAqXRlpwOkpvOfONrbApJd4Qwj+a8+kOzLAfpHIE=; path=/; expires=Tue, 17-Aug-21 20:11:33 GMT; domain=.aboutyou.com; HttpOnly; Secure; SameSite=None', 'Vary': 'Accept-Encoding', 'Server': 'cloudflare', 'CF-RAY': '6805616b8facf1b2-ATL', 'Content-Encoding': 'gzip'}
Here are previous Stack Overflow questions on Python Requests with gRPC
Can't make gRPC work with python requests rest api call
Send plain JSON to a gRPC server using python
I looked through the GitHub repository for Python Requests and saw that HTTP/2 has been a requested feature for almost 7 years.
During my research, I discovered HTTPX, an HTTP client for Python 3 that provides sync and async APIs and support for both HTTP/1.1 and HTTP/2. The documentation states that the package is stable, but it is still considered beta at this point. I would recommend trying HTTPX to see if it solves your issue with logging into your target site.

Log in to website using requests

I am currently trying to get data off http://www.spotrac.com/ that requires being signed in. My current attempt uses this code (which I got by going through a bunch of other Stack Overflow questions on similar topics):
from bs4 import BeautifulSoup as bs
from requests import session

payload = {
    'id': 'contactForm',
    'cmd': 'http://www.spotrac.com/signin/submit/',
    'email': '*****',
    'password': '*****'
}
with session() as c:
    r_login = c.post('http://www.spotrac.com/signin/', data=payload)
    print(r_login.headers)
    response = c.get('http://www.spotrac.com/nba/cleveland-cavaliers/lebron-james')
    print(response.cookies)
    soup = bs(response.text, 'html.parser')
    with open('ex.html', 'w') as f:
        f.write(soup.prettify())
My current code does everything right, except I am not logged in when I'm making the request.
Thanks
You're sending the POST request to the wrong URL, and with an incorrect payload as well.
POST http://www.spotrac.com/signin/submit/ HTTP/1.1
Host: www.spotrac.com
Connection: keep-alive
Content-Length: 86
Cache-Control: max-age=0
Origin: http://www.spotrac.com
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36
Content-Type: application/x-www-form-urlencoded
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Referer: http://www.spotrac.com/signin/
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.8
Cookie: cisession=a%3A5%3A%7Bs%3A10%3A%22session_id%22%3Bs%3A32%3A%2206021e191bdbbaf955f111f67b961056%22%3Bs%3A10%3A%22ip_address%22%3Bs%3A11%3A%22119.9.105.6%22%3Bs%3A10%3A%22user_agent%22%3Bs%3A108%3A%22Mozilla%2F5.0+%28Windows+NT+6.1%3B+WOW64%29+AppleWebKit%2F537.36+%28KHTML%2C+like+Gecko%29+Chrome%2F55.0.2883.87+Safari%2F537.36%22%3Bs%3A13%3A%22last_activity%22%3Bi%3A1485487245%3Bs%3A9%3A%22user_data%22%3Bs%3A0%3A%22%22%3B%7Dd6089620b21ecce6837161605055ae04; _ga=GA1.2.910256341.1481865346; _gali=contactForm
redirect=http%3A%2F%2Fwww.spotrac.com%2F&email=sdfs%40gmail.com&password=lkasjdflksjad
HTTP/1.1 302 Found
Server: nginx
Date: Fri, 27 Jan 2017 04:21:16 GMT
Content-Type: text/html
Content-Length: 0
Connection: keep-alive
Set-Cookie: cisession=a%3A5%3A%7Bs%3A10%3A%22session_id%22%3Bs%3A32%3A%22badb1275aee1cdad6736a6b4bb1ce809%22%3Bs%3A10%3A%22ip_address%22%3Bs%3A11%3A%22119.9.105.6%22%3Bs%3A10%3A%22user_agent%22%3Bs%3A108%3A%22Mozilla%2F5.0+%28Windows+NT+6.1%3B+WOW64%29+AppleWebKit%2F537.36+%28KHTML%2C+like+Gecko%29+Chrome%2F55.0.2883.87+Safari%2F537.36%22%3Bs%3A13%3A%22last_activity%22%3Bi%3A1485490876%3Bs%3A9%3A%22user_data%22%3Bs%3A0%3A%22%22%3B%7Dad486866c32cac526487707cea85b8a9; expires=Fri, 10-Feb-2017 04:21:16 GMT; path=/
Location: http://www.spotrac.com/register/
X-Powered-By: PleskLin
MS-Author-Via: DAV
As you can see from the session above, the correct URL should be http://www.spotrac.com/signin/submit/, and the payload string is redirect=http%3A%2F%2Fwww.spotrac.com%2F&email=sdfs%40gmail.com&password=lkasjdflksjad, which is basically:
payload = {'redirect': 'http://www.spotrac.com/',
           'email': mail_address,
           'password': password}
Also make sure to simulate the headers with the correct parameters; then you're good to go.
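If the URL-encoded payload string looks opaque, the standard library can decode it. This quick check (a sketch, using the captured string verbatim) confirms it is exactly the three-field form above:

```python
from urllib.parse import parse_qs

# The payload string captured in the browser session above
raw = "redirect=http%3A%2F%2Fwww.spotrac.com%2F&email=sdfs%40gmail.com&password=lkasjdflksjad"
decoded = parse_qs(raw)
print(decoded)
# {'redirect': ['http://www.spotrac.com/'], 'email': ['sdfs@gmail.com'], 'password': ['lkasjdflksjad']}
```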

Get Python to print Location or Get

I'm scraping the URL 'http://goldfilmesonline.com/jack-reacher-sem-retorno-legendado-online/' to try to get one link, but there are two that work for me. I think the site redirects when you play the video; one thing I know is that it sends two media links when I play the movie. Those are the links I want to get, because they work in VLC or Kodi. My first bit of code gets the embedded URL, and it works fine:
import requests
from bs4 import BeautifulSoup

a = requests.get('http://goldfilmesonline.com/jack-reacher-sem-retorno-legendado-online/')
soup = BeautifulSoup(a.content, 'html')
links = soup.find_all('iframe')
for i in links:
    x = i['src']
    if 'openload' in x:
        print x
This is the result:
https://openload.co/embed/BQgJDIUtZ_w/
Here is where I don't know what to do. I used Fiddler and could get the request and response headers, but I don't know what to parse or which params to pass. So I took x and tried to request this URL to get the links I want, but it didn't work. Here is what I tried:
import requests
url = 'https://openload.co/embed/BQgJDIUtZ_w/'
data = {'mime':'true'}
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36'}
#response = requests.post(url,params=data,headers=headers)
#b = response.status_code
#x = response.text
#c = requests.get('https://openload.co/embed/BQgJDIUtZ_w',params=data,headers=headers)
#d = response.headers['Location']
#print d
headers2= {'Transfer-Encoding': 'chunked', 'Set-Cookie': '__cfduid=d11e26a118392f7f08c5df1e88e15b3f71479421090; expires=Fri, 17-Nov-17 22:18:10 GMT; path=/; domain=.openload.co; HttpOnly, _csrf=bf05bbd254877ef6c354a8fb3d4001938ce56f8141ab4704460eb96f946f790ca%3A2%3A%7Bi%3A0%3Bs%3A5%3A%22_csrf%22%3Bi%3A1%3Bs%3A32%3A%22QWePPbK2IdPoeI3nfzzW0cf1Xlu2xgVv%22%3B%7D; path=/; HttpOnly, _olbknd=w6; path=/', 'Server': 'cloudflare-nginx', 'Connection': 'keep-alive',
'Cache-Control': 'private', 'Date': 'Thu, 17 Nov 2016 22:18:10 GMT', 'CF-RAY': '30368e96ed2a5a56-BOS', 'Content-Type': 'text/html; charset=UTF-8'}
f = requests.post(url,params=data,headers=headers)
print f
With this last code I get a 400 response, but it should be 302 Found.
Here are the Fiddler results.
These are the request headers:
GET /stream/BQgJDIUtZ_w~1479434968~73.248.0.0~BduDRWUg?mime=true HTTP/1.1
Host: openload.co
Connection: keep-alive
Accept-Encoding: identity;q=1, *;q=0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36(KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36
Accept: */*
Referer: https://openload.co/embed/BQgJDIUtZ_w/
Accept-Language: en-US,en;q=0.8,pt-BR;q=0.6,pt;q=0.4
Range: bytes=0-
These are the response headers:
HTTP/1.1 302 Found
Date: Thu, 17 Nov 2016 02:09:42 GMT
Content-Type: video/mp4
Transfer-Encoding: chunked
Connection: keep-alive
Set-Cookie: __cfduid=d9ff8e31a73be207f66c4717480c6afe41479348582;expires=Fri, 17-Nov-17 02:09:42 GMT; path=/; domain=.openload.co; HttpOnly
Cache-Control: private
Access-Control-Allow-Origin: *
Location:https://1j8b54.oloadcdn.net/dl/l/zHA2Jp0IKTcUySV8/BQgJDIUtZ_w/Jack.Reacher.Never.Go.Back.HDTS.XviD-TOM.avi.mp4?mime=true
Set-Cookie: _olbknd=w5; path=/
Server: cloudflare-nginx
CF-RAY: 302fa46268585a68-BOS
I'm using Python 2.7 with requests and BeautifulSoup, and I want Python to print one or both of these two links. The first is the one in the request headers:
GET /stream/BQgJDIUtZ_w~1479434968~73.248.0.0~BduDRWUg?mime=true
The second is the one in the response headers:
Location: https://1j8b54.oloadcdn.net/dl/l/zHA2Jp0IKTcUySV8/BQgJDIUtZ_w/Jack.Reacher.Never.Go.Back.HDTS.XviD-TOM.avi.mp4?mime=true
Can someone help me?

Issue with submitting an HTTP POST request

I am having an issue with submitting an HTTP POST request. My purpose with this program is to scrape the lyrics off a website and then use that string in a text summarizer. I am having an issue submitting the POST request on the summarizer's website. Currently, with the code below, it does not submit the request; it just returns the page. I think it may be due to the content type being different, but I am not sure.
My code:
import requests

def summarize(lyrics):
    url = 'http://www.freesummarizer.com'
    values = {'text': lyrics,
              'maxsentences': '1',
              'maxtopwords': '40',
              'email': 'your#email.com'}
    headers = {'User-Agent': 'Mozilla/5.0'}
    cookies = {'_jsuid': '777245265', '_ga':'GA1.2.164138903.1423973625', '__smToken':'elPdHJINsP5LvAYhia6OAA68', '__smListBuilderShown':'true', '_first_pageview':'1', '_gat':'1', '_eventqueue':'%7B%22heatmap%22%3A%5B%7B%22type%22%3A%22heatmap%22%2C%22href%22%3A%22%252F%22%2C%22x%22%3A324%2C%22y%22%3A1800%2C%22w%22%3A640%7D%5D%2C%22events%22%3A%5B%5D%7D', 'PHPSESSID':'28b0843d49700e134530fbe32ea62923', '__smSmartbarShown':'true'}  # note: defined but never passed to the POST
    r = requests.post(url, data=values, headers=headers)
    print(r.text)
My Response:
'transfer-encoding': 'chunked'
'set-cookie': 'PHPSESSID=1f10ec11e6f9040cbb5a81e16bfcdf7f; path=/',
'expires': 'Thu, 19 Nov 1981 08:52:00 GMT'
'keep-alive': 'timeout=5, max=100'
'server': 'Apache'
'connection': 'Keep-Alive'
'pragma': 'no-cache'
'cache-control': 'no-store, no-cache, must-revalidate, post-check=0, pre-check=0'
'date': 'Fri, 27 Feb 2015 18:38:41 GMT'
'content-type': 'text/html'
A successful request on this website:
Host: freesummarizer.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:35.0) Gecko/20100101 Firefox/35.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Referer: http://freesummarizer.com/
Cookie: _jsuid=777245265; _ga=GA1.2.164138903.1423973625; __smToken=elPdHJINsP5LvAYhia6OAA68; __smListBuilderShown=true; _first_pageview=1; _gat=1; _eventqueue=%7B%22heatmap%22%3A%5B%7B%22type%22%3A%22heatmap%22%2C%22href%22%3A%22%252F%22%2C%22x%22%3A324%2C%22y%22%3A1800%2C%22w%22%3A640%7D%5D%2C%22events%22%3A%5B%5D%7D; PHPSESSID=28b0843d49700e134530fbe32ea62923; __smSmartbarShown=true
Connection: keep-alive
Content-Type: application/x-www-form-urlencoded
Content-Length: 6044
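One way to test the content-type hunch without sending anything is to prepare the request locally and inspect what requests would put on the wire. Passing a dict via data= already yields the form-encoded content type seen in the successful request above, which suggests the headers are not the problem:

```python
import requests

# Placeholder values mirroring the question's form fields
values = {'text': 'some lyrics here', 'maxsentences': '1',
          'maxtopwords': '40', 'email': 'your#email.com'}
prepped = requests.Request('POST', 'http://www.freesummarizer.com',
                           data=values, headers={'User-Agent': 'Mozilla/5.0'}).prepare()
print(prepped.headers['Content-Type'])  # application/x-www-form-urlencoded
print(prepped.body)
```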
Everything seems to be working just fine with requests.
But I think the issue here is that you are using the wrong tool for the job.
The tool I believe you are looking for is Selenium.
Selenium automates browsers. That's it! What you do with that power is entirely up to you. Primarily, it is for automating web applications for testing purposes, but is certainly not limited to just that. Boring web-based administration tasks can (and should!) also be automated as well.
You should absolutely take a look at this tool.
Selenium docs

How to get mechanize requests to look like they originate from a real browser

OK, here's the header info (just an example) I got from Live HTTP Headers while logging into an account:
http://example.com/login.html
POST /login.html HTTP/1.1
Host: example.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8 GTB7.1 (.NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Referer: http://example.com
Cookie: blahblahblah; blah = blahblah
Content-Type: application/x-www-form-urlencoded
Content-Length: 39
username=shane&password=123456&do=login
HTTP/1.1 200 OK
Date: Sat, 18 Dec 2010 15:41:02 GMT
Server: Apache/2.2.3 (CentOS)
X-Powered-By: PHP/5.2.14
Set-Cookie: blah = blahblah_blah; expires=Sun, 18-Dec-2011 15:41:02 GMT; path=/; domain=.example.com; HttpOnly
Set-Cookie: blah = blahblah; expires=Sun, 18-Dec-2011 15:41:02 GMT; path=/; domain=.example.com; HttpOnly
Set-Cookie: blah = blahblah; expires=Sun, 18-Dec-2011 15:41:02 GMT; path=/; domain=.example.com; HttpOnly
Cache-Control: private, no-cache="set-cookie"
Expires: 0
Pragma: no-cache
Content-Encoding: gzip
Vary: Accept-Encoding
Content-Length: 4135
Keep-Alive: timeout=10, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8
Normally I would code like this:
import mechanize
import urllib2
MechBrowser = mechanize.Browser()
LoginUrl = "http://example.com/login.html"
LoginData = "username=shane&password=123456&do=login"
LoginHeader = {"User-Agent": "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8 GTB7.1 (.NET CLR 3.5.30729)", "Referer": "http://example.com"}
LoginRequest = urllib2.Request(LoginUrl, LoginData, LoginHeader)
LoginResponse = MechBrowser.open(LoginRequest)
The above code works fine. My question is: do I also need to add the following lines (and more from the header info above) to LoginHeader to make it really look like Firefox surfing, not mechanize?
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
What parts of the header info, and how many of them, need to be spoofed to make it look "real"?
It depends on what you're trying to 'fool'. You can try some online services that do simple User Agent sniffing to gauge your success:
http://browserspy.dk/browser.php
http://www.browserscope.org (look for 'We think you're using...')
http://www.browserscope.org/ua
http://panopticlick.eff.org/ -> will help you to pick some 'too common to track' options
http://networking.ringofsaturn.com/Tools/browser.php
I believe a determined programmer could detect your game, but many log parsers and tools wouldn't once you echo what your real browser sends.
One thing you should consider is that lack of JS might raise red flags, so capture sent headers with JS disabled too.
Here's how you set the user agent for all requests made by mechanize.Browser
br = mechanize.Browser()
br.addheaders = [('User-agent', 'your user agent string here')]
Mechanize can fill in forms as well
br.open('http://yoursite.com/login')
br.select_form(nr=1)  # select second form on the page (0-indexed)
br['username'] = 'yourUserName'  # fills the form field named 'username'
br['password'] = 'yourPassword'
response = br.submit()
if 'Welcome yourUserName' in response.get_data():
    pass  # login was successful
else:
    # something went wrong
    print response.get_data()
See the mechanize examples for more info
If you were paranoid about keeping bots/scripts/non-real browsers out, you'd look for things like the order of HTTP requests, or have one resource added using JavaScript. If that resource is not requested, or is requested before the JavaScript, then you know it's a "fake" browser.
You could also look at number of requests per connection (keep-alive), or simply verify that all CSS files of the first page (given that they're at the top of the HTML) gets loaded.
YMMV but it can become pretty cumbersome to simulate enough to make some "fake" browser pass as a "real" one (used by humans).
