I'm start learning python with requests library and i get facebook for example.
This is my code:
import requests
get_response = requests.get(url='https://www.facebook.com/login/identify?
ctx=recover')
post_data = {'email':'mycorrectemailaddress'}
post_response = requests.post(url="https://www.facebook.com/login/identify?ctx=recover/POST", data=post_data)
print(post_response.text)
And my script not going to the next page, i don't know where is my fault.
You didn't pass all the required data.
payload = {lsd:AVpg3ZvY
email:your#email.com
did_submit:Search
__user:0
__a:1
__dyn:7AzHK4GgN1t2u6XgmwCwRAKGzEy4S-C11xG12wAxu13wIwHx27QdwPG2iuUG4XzEa8uwh9UcU88lwIwHwJwnoCcxG48hwv9FovgeFUuzUhws82BxCqUkguy99UK
__af:iw
__req:5
__be:-1
__pc:PHASED:DEFAULT
__rev:2929740}
Sadly, I don't know what are these args. But you could parse it from a get requrest on the page or use selenium for the informations, and pass cookies to the requests module, and then make the post request.
Related
I am new in Python and web scraping, but I keep learning. I have managed to get some exciting results using BeautifulSoup and Requests libraries and my next goal is to log into a website that allows remote access to my heating system to do some web scraping and maybe extend its capabilities further.
Unfortunately, I got stuck. I have used Mozilla's Web Dev Tools to see the url that the form posts to, and the name attributes of the username and password fields. The webpage url is https://emodul.pl/login and the Request payload looks as follows:
{"username":"my_username","password":"my_password","rememberMe":false,"languageId":"en","remote":false}
I am using requests.Session() instance to make a post request to the login url and using the above-mentioned payload:
import requests
url = 'https://emodul.pl/login'
payload = {'username':'my_username','password':'my_password','rememberMe':False,'languageId':'en','remote':False}
with requests.Session() as s:
p = s.post(url, data=payload)
print(p.text)
Apparently I'm doing something wrong because I'm getting the "error":"Sorry, something went wrong. Try again." response.
Any advice will be much appreciated.
Hi I am new with python requests and would like to have some help.
When I try to use python requests and get the session cookie, use the following command:
session_req = requests.session()
result = session_req.get(
get_url
)
after execute GET from requests, I use the '.cookies' property ant the respective key I want to send at the POST Header, I get the value successfully, but the POST action is not working.
session_req.cookies['IFCSHOPSESSID']
but when I get the request from the same API via POSTMAN and try to get the cookie property (exporting the code as python requests) I found some differences, and if I use this same cookie exported from POSTMAN it works.
POSTMAN EXAMPLE
'cookie': 'IFCSHOPSESSID=hrthhiqdeg0dvf4ecooc83lui3; nikega=GA1.4.831513767.1599354095; nikega_gid=GA1.4.1839484382.1599354095; _ga=GA1.3.831513767.1599354095; _gid=GA1.3.733956911.1599354099; chaordic_browserId=0-fv_3j6NdVlbNFFwPRzUGQVse7e1bbqga-3OS1599354098234702; chaordic_anonymousUserId=anon-0-fv_3j6NdVlbNFFwPRzUGQVse7e1bbqga-3OS1599354098234702; chaordic_testGroup=%7B%22experiment%22%3Anull%2C%22group%22%3Anull%2C%22testCode%22%3Anull%2C%22code%22%3Anull%2C%22session%22%3Anull%7D; user_unic_ac_id=bec863cf-4e06-0ab1-d881-b566595d3e8f; _gcl_au=1.1.1305519862.1599354100; _fbp=fb.2.1599354100232.504934336; smeventsclear_16df2784b41e46129645c2417f131191=true; smViewOnSite=true; __pr.cvh=4ftsyf8x16; _gaexp=GAX1.3.tupm6REJTMeD-piAakRDMA.18557.0; blueID=75a502b6-e7c2-4eb3-8442-75aea5d95fdc; _cm_ads_activation_retry=false; sback_client=5816989a58791059954e4c52; sback_partner=false; sb_days=1599356617672; sback_refresh_wp=no; smClickOnSite=true; smClickOnSite_652c0aaee02549a3a6ea89988778d3fc=true; _rtbhouse_source_=socialminer; RKT=false; dedup=socialminer; lmd_cj=socialminer; advcake_url=https%3A%2F%2Fwww.nike.com.br%2Flancamentos%3Futm_source%3Dsocialminer%26utm_medium%3Dsocialminer_onsitedesktop%26utm_campaign%3Dsocialminer_onsitedesktop_lancamentos_desk%26smid%3D3-17; advcake_trackid=dd7e2ef0-dd50-889a-aeea-559a0d8bcd22; advcake_utm_content=socialminer_onsitedesktop_lancamentos_desk; advcake_utm_campaign=socialminer; Campanha=; Parceiro=; Midia=; AMCVS_F0935E09512D2C270A490D4D%40AdobeOrg=1; s_cc=true; lmd_orig=direct; SIZEBAY_SESSION_ID=0AC1A70CB19F4f03610665d04bb088ef3b9af0942fc8; sback_customer_w=true; sback_browser=0-87718800-1599408894bff13e290b9fee5fc2b430382f639b87dd9cf25112334287575f550afed62983-14051381-17920887216,13017640152-1599408894; sback_access_token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJhcGkuc2JhY2sudGVjaCIsImlhdCI6MTU5OTQwODg5NSwiZXhwIjoxNTk5NDk1Mjk1LCJhcGkiOiJ2MiIsImRhdGEiOnsiY2xpZW50X2lkIjoiNTgxNjk4OWE1ODc5MTA1OTk1NGU0YzUyIiwiY2xpZW50X2RvbWFpbiI6Im5pa2UuY29tLmJyIiwiY3VzdG9tZXJfaWQiOiI1ZjU0M2VjODA5ZjFkMDkzMmQzMjQ2OTUiLCJjdXN0b21lcl9hbm9ueW1vdXMiOmZhbHNlLCJjb25uZWN0aW9uX2lkIjoiNWY1NDNlYzgwOWYxZDA5MzJkMzI0Njk2IiwiYWNjZXNzX2xldmVsIjoiY3VzdG9tZXIifX0.K6FYVBasHjMg_PLbT1yZfrnIp97USqijoMObF4eUSms.WrWrDrHeHezRqBiYiYHeDr; sback_customer=$2gSxATWYdVYOVGMI10bUdkW2pWeoZERU1kc1YWWhd1SNR0aMJ0QUVzTHpHdJZERnpVS6FTSkRUTOBjMys2bUdnT2$12; sback_pageview=false; ak_bmsc=B6177778CB59637165F7EC43342C1559C9063147DA220000234E555F8D78F831~plACNrc4cNxoHZNcO7aF4o+U0KQNKjzPECGSfb42NdayPvdNkBWwUT9QOhGjuLJJ3vStuFIRkiI/35wsHEyUE3/h2guphhaEy71BnfekvDtb/6F84hS+fWhPxxVG5RAlph8WzGpYMn6NZESNVcgnZYfH4HoZ/IzBPR6AMG9UGn6W4xm/j/j9kOfef8v/fZf2pXw4mxJuiN5Cxc7g2sV4nCdoEW98Q4AgqplzxWZjpamZk=; bm_sz=6586256DDAFC895D740341E4214D0D40~YAAQRzEGybYT7yN0AQAAfDw5ZQnXjJtKI2SxkwQFV9vLZpF5mACXNUtUFDSkidKuYM2fac5sQgRozU9fA3+017dht/PUtH+wtibATtTmoVOlpKnW+V76+1rySk3HK6q83Q9rtQc/LaaQ8VYtK/tDi0VOc7/0wLyKy/+Z4OLtgUpySYZZcEX4k8/46no8rFD6OQ==; AMCV_F0935E09512D2C270A490D4D%40AdobeOrg=359503849%7CMCIDTS%7C18512%7CMCMID%7C56897587165425478193529762442649463163%7CMCAAMLH-1600030892%7C4%7CMCAAMB-1600030892%7CRKhpRz8krg2tLO6pguXWp5olkAcUniQYPHaMWWgdJ3xzPWQmdj0y%7CMCOPTOUT-1599433292s%7CNONE%7CMCSYNCSOP%7C411-18519%7CvVersion%7C5.0.1; sback_total_sessions=3; sback_session=5f554e3c73a63da56d739d87; lmd_traf=direct-1599402359608&direct-1599408890286&direct-1599414284313&direct-1599427194077; chaordic_realUserId=2962653; chaordic_session=1599429266491-0.4343169041143473; _st_ses=49222273669791505; _st_cart_script=helper_nike.js; _st_cart_url=/; _sptid=1592; _spcid=1592; _st_id=cnVkc29ucmFtb25AZ21haWwuY29t; _st_idb=cnVkc29ucmFtb25AZ21haWwuY29t; lx_sales_channel=%5B%222%22%5D; sback_cart=5f555ba24f507d767721c387; CSRFtoken=1ac8a198f88ac1ccc1f8555ab41c8a95; gpv_v70=nikecombr%3Echeckout%3Eaddress; pv_templateName=CHECKOUT; gptype_v60=checkout%3Aaddress; stc119288=env:1599429270%7C20201007215430%7C20200906222939%7C5%7C1088071:20210906215939|uid:1599354102799.1149977977.6705985.119288.1871143352:20210906215939|srchist:1088071%3A1599429270%3A20201007215430:20210906215939|tsa:1599429270805.1898407973.364911.7635034620790062.2:20200906222939; bm_sv=C9C3A8C6B2F6CB232317BB794ADC0497~ZnoksXquh4Yrh4uN87gycXdh+ixzU+xMFsb94sO9uE5JMLyZz9eJPp5odX7vx944KIXG1nvOxuq8pdrQUDjBrchRJLC4yiD1yWX0h4BjWhZwbfHPtnzaT3ASbIZnf2Ts1TRt+ZAescJJwrNPs4oV2If7vyiWi2AYILFvCstCTS8=; _uetsid=a9a0bfd4fe4e4db52bcd4ca66850a785; _uetvid=9ba47ed116a48f496f6b1a9844e21c95; __udf_j=f08aeb668454efbf6ddc83dd9d4b7a8385abde9f9fbd92526f1de0441da2126ec40330dfc36d0b9c3eae98557c94447d; _spl_pv=40; s_sq=lojanike-new-production%252Clojanike-nikebr%3D%2526c.%2526a.%2526activitymap.%2526page%253Dnikecombr%25253Echeckout%25253Eaddress%2526link%253DSeguir%252520para%252520pagamento%2526region%253Didentificacao-form%2526pageIDType%253D1%2526.activitymap%2526.a%2526.c%2526pid%253Dnikecombr%25253Echeckout%25253Eaddress%2526pidt%253D1%2526oid%253DSeguir%252520para%252520pagamento%2526oidt%253D3%2526ot%253DSUBMIT; RT="z=1&dm=nike.com.br&si=92b42534-25ee-4155-aa1a-e7d127581869&ss=kermvxyl&sl=9&tt=17e8&bcn=%2F%2F173e2544.akstat.io%2F"; _abck=F6E1C280C3F9D735A2B1AB62443DB479~-1~YAAQVjEGycno+iJ0AQAAmtRxZQT8kxLFalTup4dkYT5+cq/PavPcY4/0zAeJv4GoSQQwYVj4EWydkfxbJR3Rgaa4k6ma+5O72J/lsiajATrx0oaZJuB5b/FIP6RymanPRVGlb3kLJXpBQDkCmVv62kkxLKxySrlAYDCg0ORCpSXlTCbFBVEchC9ih5t094egSeVdM6VjfQSO9uDKISBoP4923qkJMTpbk9B1nOoiylKK+y+FGFu8pzEpQqZYj7tIMTJVpqe0OpXaQ8m8nPyp0K+PmBcAndIHcBMTZUEqma9/72Enx8yvGbKXrYbAzNDw6ZtKY9OAbNuVeqprza/Af0aUkinm0l3JqxjTH1LpglNxNN4=~-1~-1~-1; CSRFtoken=20a208bad599aa3ead0bbe944b27a368; bm_sv=C9C3A8C6B2F6CB232317BB794ADC0497~ZnoksXquh4Yrh4uN87gycXdh+ixzU+xMFsb94sO9uE5JMLyZz9eJPp5odX7vx944KIXG1nvOxuq8pdrQUDjBrchRJLC4yiD1yWX0h4BjWhbSXhHWWrgkUsOTt9033P5Wxu1qmo5M6w0VAWeAzBaCN7yZC2Ll7DiGq0CwpjxlOW4=; _abck=F6E1C280C3F9D735A2B1AB62443DB479~-1~YAAQVjEGyRKO+iJ0AQAA+4U9ZQSNIWTEz/60Uk5gz2tnzVtbMbX0hpaMbkbeJxSYSMD1xo7TTedXnJ0UuTLxxcHhLVrRRCrZfSjZ+yH00Ld6FLIajmYFefKPehzA6GgwjnLyucI1O6nDw2ZU1CV0WJLeWGgcmX7sinsLr3DVtmoGJyNR1Q9EWpvq71/W1Ys4Bqhq1628YKEz/0Z1Ic1bWMujcG03064ZZYYXTSTz9jrkxHKaEoJQNQgyUg9NXQhv4EFoMSESy/AIKRy+hVCULLJscbkpH8WakuvYQ1raghVfheks/Xra9AmiUoOqAbWAPXOij1nWQ9PSV2hxQZfkibD0+YP14pTXPoCAUA9jCQHRJIw=~0~-1~-1'
session_req.cookies['IFCSHOPSESSID'] EXAMPLE
qnabtagl4pu7gm2jg3sij03cu6
Other curious thing is that when I use the '.cookies' property, my POST call return sucess even without update the cart where it should be inserting a new register.
As I am trying to develop one site bot, I would like to generate this same cookie via python requests code. Can anyone try to help me on it?
This is an example with python 3. You can customize it.
import requests
data ="param_1=value_1¶m_2=value_2&.....¶m_n=value_n"; #your request parameters.
cookie = "cookie_name=xxxxxxxx;....." #define cookie
url_endpoint = "htpps://........." # your url endpoint
# add cookies to endpoints
resp = requests.get(url_endpoint, data=data.encode('utf-8'),cookies=cookie)
if(resp.status_code==200):
print("success ")
else:
print("error ")
I have a task to crawl all Pulitzer Winner, and I found this page has all I want: https://www.pulitzer.org/prize-winners-by-year/2018.
But I got the following problems,
Problem 1: How to crawl a dynamic page? I use python/urllib2.urlopen, to get the page's content, but this dynamic page doesn't return the real content from this.
Problem 2: I then found an API URL from devtool: https://www.pulitzer.org/cache/api/1/winners/year/166/raw.json. But when I sent a GET request from urllib2.urlopen, I always get null. How does it happen? Or how can I handle with it?
If this is too naive for you, please name some words so that I can learn it from Google.
Thanks in advance!
One way to handle is to create a session using requests module. This way, it passes necessary session details required for next api call, you also have to pass one more parameter Referer to the header. This differentiates which year you are looking for in the api call.
import requests
s = requests.session()
url = "https://www.pulitzer.org/prize-winners-by-year/2017"
resp1 = s.get(url)
headers = {'Referer': 'https://www.pulitzer.org/prize-winners-by-year/2017'}
api = "https://www.pulitzer.org/cache/api/1/winners/year/166/raw.json"
data = s.get(api,headers=headers)
now you can extract the data from the response in data.
trying to authenticate to a website and fill out a form using requests lib
import requests
payload = {"name":"someone", "password":"somepass", "submit":"Submit"}
s = requests.Session()
s.post("https://someurl.com", data=payload)
next_payload = {"item1":"something", "item2":"something", "submit":"Submit"}
r = s.post("https://someurl.com", data=next_payload)
print r.text
authentication works and i verified that i can post to forms but this one i am having problem with gives The action could not be completed, perhaps because your session had expired. Please try again
Attempted in urllib2 and same result -- dont think its an issue with a cookie.
I am wondering if javascript on this page has something to do with giving session error? other form page doesnt have any javascripts.
Thanks for your input...
I am using cookielib and some times opening a url in browser downloads many other files by browser making many other requests. Can I replicate the same behaviour using cookie lib or any other python library?
For example: To get all the required information from page https://applicant.keybank.com/psp/hrsappl/EMPLOYEE/HRMS/c/HRS_HRAM.HRS_CE.GBL?Page=HRS_CE_HM_PRE&Action=A&SiteId=1
I have to make more than 1 GET requests from my python script. I got the request urls of all the requests browser makes by analysing the network requests when I opened the page.
I am seeing if there is any way I can just make 1 request and it fetches all the related requests by itself like browser.
I am not very much interested in the js or css but the main html.
I tried with the following code but it couldn't download whole page
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
response = opener.open('https://applicant.keybank.com/psp/hrsappl/EMPLOYEE/HRMS/c/HRS_HRAM.HRS_CE.GBL?Page=HRS_CE_HM_PRE&Action=A&SiteId=1')
html = response.read()
but when I fetched 3 other GET urls in sequence it is able to give me the required html in the third GET response. I got these urls by examining network tab of the browser
'https://applicant.keybank.com/psc/hrsappl/EMPLOYEE/EMPL/s/WEBLIB_PT_NAV.ISCRIPT1.FieldFormula.IScript_UniHeader_Frame?c=NNTCgkqGs001AcPaisqGbYpTu%2fbGx4jx&Page=HRS_CE_HM_PRE&Action=A&SiteId=1&PortalActualURL=https%3a%2f%2fhronline.keybank.com%2fpsc%2fhrshrm%2fEMPLOYEE%2fHRMS%2fc%2fHRS_HRAM.HRS_CE.GBL%3fPage%3dHRS_CE_HM_PRE%26Action%3dA%26SiteId%3d1&PortalContentURL=https%3a%2f%2fhronline.keybank.com%2fpsc%2fhrshrm%2fEMPLOYEE%2fHRMS%2fc%2fHRS_HRAM.HRS_CE.GBL%3fPage%3dHRS_CE_HM_PRE%26Action%3dA%26SiteId%3d1&PortalContentProvider=HRMS&PortalRegistryName=EMPLOYEE&PortalServletURI=https%3a%2f%2fapplicant.keybank.com%2fpsp%2fhrsappl%2f&PortalURI=https%3a%2f%2fapplicant.keybank.com%2fpsc%2fhrsappl%2f&PortalHostNode=EMPL&PortalIsPagelet=true&NoCrumbs=yes')
'https://applicant.keybank.com/psc/hrsappl/EMPLOYEE/EMPL/s/WEBLIB_PTPPB.ISCRIPT2.FieldFormula.IScript_TemplatePageletBuilder?PTPPB_PAGELET_ID=KC_LNAV_APPLICANT&target=KCNV_KC_LNAV_APPLICANT_TMPL&Page=HRS_CE_HM_PRE&Action=A&SiteId=1&PortalActualURL=https%3a%2f%2fhronline.keybank.com%2fpsc%2fhrshrm%2fEMPLOYEE%2fHRMS%2fc%2fHRS_HRAM.HRS_CE.GBL%3fPage%3dHRS_CE_HM_PRE%26Action%3dA%26SiteId%3d1&PortalContentURL=https%3a%2f%2fhronline.keybank.com%2fpsc%2fhrshrm%2fEMPLOYEE%2fHRMS%2fc%2fHRS_HRAM.HRS_CE.GBL%3fPage%3dHRS_CE_HM_PRE%26Action%3dA%26SiteId%3d1&PortalContentProvider=HRMS&PortalRegistryName=EMPLOYEE&PortalServletURI=https%3a%2f%2fapplicant.keybank.com%2fpsp%2fhrsappl%2f&PortalURI=https%3a%2f%2fapplicant.keybank.com%2fpsc%2fhrsappl%2f&PortalHostNode=EMPL&PortalIsPagelet=true&NoCrumbs=yes&PortalTargetFrame=TargetContent'
'https://hronline.keybank.com/psc/hrshrm/EMPLOYEE/HRMS/c/HRS_HRAM.HRS_CE.GBL?Page=HRS_CE_HM_PRE&Action=A&SiteId=1&PortalActualURL=https%3a%2f%2fhronline.keybank.com%2fpsc%2fhrshrm%2fEMPLOYEE%2fHRMS%2fc%2fHRS_HRAM.HRS_CE.GBL%3fPage%3dHRS_CE_HM_PRE%26Action%3dA%26SiteId%3d1&PortalContentURL=https%3a%2f%2fhronline.keybank.com%2fpsc%2fhrshrm%2fEMPLOYEE%2fHRMS%2fc%2fHRS_HRAM.HRS_CE.GBL%3fPage%3dHRS_CE_HM_PRE%26Action%3dA%26SiteId%3d1&PortalContentProvider=HRMS&PortalCRefLabel=Careers&PortalRegistryName=EMPLOYEE&PortalServletURI=https%3a%2f%2fapplicant.keybank.com%2fpsp%2fhrsappl%2f&PortalURI=https%3a%2f%2fapplicant.keybank.com%2fpsc%2fhrsappl%2f&PortalHostNode=EMPL&NoCrumbs=yes&PortalKeyStruct=yes'
and following is the complete code for the other fetches I am making
response = opener.open('https://applicant.keybank.com/psc/hrsappl/EMPLOYEE/EMPL/s/WEBLIB_PT_NAV.ISCRIPT1.FieldFormula.IScript_UniHeader_Frame?c=NNTCgkqGs001AcPaisqGbYpTu%2fbGx4jx&Page=HRS_CE_HM_PRE&Action=A&SiteId=1&PortalActualURL=https%3a%2f%2fhronline.keybank.com%2fpsc%2fhrshrm%2fEMPLOYEE%2fHRMS%2fc%2fHRS_HRAM.HRS_CE.GBL%3fPage%3dHRS_CE_HM_PRE%26Action%3dA%26SiteId%3d1&PortalContentURL=https%3a%2f%2fhronline.keybank.com%2fpsc%2fhrshrm%2fEMPLOYEE%2fHRMS%2fc%2fHRS_HRAM.HRS_CE.GBL%3fPage%3dHRS_CE_HM_PRE%26Action%3dA%26SiteId%3d1&PortalContentProvider=HRMS&PortalRegistryName=EMPLOYEE&PortalServletURI=https%3a%2f%2fapplicant.keybank.com%2fpsp%2fhrsappl%2f&PortalURI=https%3a%2f%2fapplicant.keybank.com%2fpsc%2fhrsappl%2f&PortalHostNode=EMPL&PortalIsPagelet=true&NoCrumbs=yes')
response.read()
response = opener.open('https://applicant.keybank.com/psc/hrsappl/EMPLOYEE/EMPL/s/WEBLIB_PTPPB.ISCRIPT2.FieldFormula.IScript_TemplatePageletBuilder?PTPPB_PAGELET_ID=KC_LNAV_APPLICANT&target=KCNV_KC_LNAV_APPLICANT_TMPL&Page=HRS_CE_HM_PRE&Action=A&SiteId=1&PortalActualURL=https%3a%2f%2fhronline.keybank.com%2fpsc%2fhrshrm%2fEMPLOYEE%2fHRMS%2fc%2fHRS_HRAM.HRS_CE.GBL%3fPage%3dHRS_CE_HM_PRE%26Action%3dA%26SiteId%3d1&PortalContentURL=https%3a%2f%2fhronline.keybank.com%2fpsc%2fhrshrm%2fEMPLOYEE%2fHRMS%2fc%2fHRS_HRAM.HRS_CE.GBL%3fPage%3dHRS_CE_HM_PRE%26Action%3dA%26SiteId%3d1&PortalContentProvider=HRMS&PortalRegistryName=EMPLOYEE&PortalServletURI=https%3a%2f%2fapplicant.keybank.com%2fpsp%2fhrsappl%2f&PortalURI=https%3a%2f%2fapplicant.keybank.com%2fpsc%2fhrsappl%2f&PortalHostNode=EMPL&PortalIsPagelet=true&NoCrumbs=yes&PortalTargetFrame=TargetContent')
response.read()
response = opener.open('https://hronline.keybank.com/psc/hrshrm/EMPLOYEE/HRMS/c/HRS_HRAM.HRS_CE.GBL?Page=HRS_CE_HM_PRE&Action=A&SiteId=1&PortalActualURL=https%3a%2f%2fhronline.keybank.com%2fpsc%2fhrshrm%2fEMPLOYEE%2fHRMS%2fc%2fHRS_HRAM.HRS_CE.GBL%3fPage%3dHRS_CE_HM_PRE%26Action%3dA%26SiteId%3d1&PortalContentURL=https%3a%2f%2fhronline.keybank.com%2fpsc%2fhrshrm%2fEMPLOYEE%2fHRMS%2fc%2fHRS_HRAM.HRS_CE.GBL%3fPage%3dHRS_CE_HM_PRE%26Action%3dA%26SiteId%3d1&PortalContentProvider=HRMS&PortalCRefLabel=Careers&PortalRegistryName=EMPLOYEE&PortalServletURI=https%3a%2f%2fapplicant.keybank.com%2fpsp%2fhrsappl%2f&PortalURI=https%3a%2f%2fapplicant.keybank.com%2fpsc%2fhrsappl%2f&PortalHostNode=EMPL&NoCrumbs=yes&PortalKeyStruct=yes')
required_html = response.read()
requests can handle cookies, as you can see here.
It's a great library, far more powerful that urllib2, and yet simpler-looking.
>>> import requests
>>> r = requests.get('https://applicant.keybank.com/psp/hrsappl/EMPLOYEE/HRMS/c/HRS_HRAM.HRS_CE.GBL?Page=HRS_CE_HM_PRE&Action=A&SiteId=1')
>>> r.cookies
Edit: This answer dos not really address the problem, I read too fast. Sorry about that.
As suggested by #J.F.Sebastian, I'm adding a link to a python webkit client, Ghost.py, that could emulate a browser, as you requested.