How to make a request in Python instead of curl

Here is the curl command:
curl 'http://ecard.bupt.edu.cn/User/ConsumeInfo.aspx'
-H 'Cookie: ASP.NET_SessionId=4mzmi0whscqg4humcs5fx0cf; .NECEID=1; .NEWCAPEC1=$newcapec$:zh-CN_CAMPUS; .ASPXAUTSSM=A91571BB81F620690376AF422A09EEF8A7C4F6C09978F36851B8DEDFA56C19C96A67EBD8BBD29C385C410CBC63D38135EFAE61BF49916E0A6B9AB582C9CB2EEB72BD9DE20D64A51C09474E676B188D937B84E601A981C02F51AA9C85A08EABCC1D30D7B9299E02A45D427B08A44AEBB0'
-H 'Origin: http://ecard.bupt.edu.cn'
-H 'Accept-Encoding: gzip, deflate'
-H 'Accept-Language: zh-CN,zh;q=0.8,en;q=0.6,zh-TW;q=0.4'
-H 'Upgrade-Insecure-Requests: 1'
-H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.116 Safari/537.36'
-H 'Content-Type: application/x-www-form-urlencoded'
-H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8'
-H 'Cache-Control: max-age=0' -H 'Referer: http://ecard.bupt.edu.cn/User/ConsumeInfo.aspx'
-H 'Connection: keep-alive'
--data '__EVENTTARGET=&__EVENTARGUMENT=&__VIEWSTATE=%2FwEPDwUINDEyMzA5NDkPFgIeCFNvcnRUeXBlBQNBU0MWAmYPZBYCAgMPZBYCAgMPZBYCAgQPPCsAEQMADxYEHgtfIURhdGFCb3VuZGceC18hSXRlbUNvdW50ZmQBEBYAFgAWAAwUKwAAFgJmD2QWAmYPZBYCAgEPZBYCAgEPDxYEHghDc3NDbGFzcwUKU29ydEJ0X0FzYx4EXyFTQgICZGQYAQUiY3RsMDAkQ29udGVudFBsYWNlSG9sZGVyMSRncmlkVmlldw88KwAMAQhmZBb11RlvPBlQn5fbuqe6uh8BRZJ2jUZ5U17xEnRM%2BnbF&__VIEWSTATEGENERATOR=D99777FC&__EVENTVALIDATION=%2FwEdAAgtbl4l0SdCQqYpyy3Ex39BEoHXgPeD%2Fa3eEKPr2bG0rY8WyuyQajjUbeopYM%2Fne68pyn2l0BPEWPPnyd6MfoDZVQGbFRIKjb%2FOTPkkCDIIaJ3X6W3VKO%2FV036Pdf6jf06OTFulzGhY80%2FZ1HbCJJ6LR5Lg6mLp72ibUCB3VipRJuK11qmW%2BSDe3IOvbK4oNUdV5%2FxmkXw25tTwfJ8P8OS2&ctl00%24ContentPlaceHolder1%24rbtnType=0&ctl00%24ContentPlaceHolder1%24txtStartDate=2016-09-17&ctl00%24ContentPlaceHolder1%24txtEndDate=2016-09-24&ctl00%24ContentPlaceHolder1%24btnSearch=%E6%9F%A5++%E8%AF%A2' --compressed
I want to use requests to get the response so that I can analyse it, but I'm not sure how to translate this command.
Thanks!

You can use the Python requests module. Because the curl command passes --data, it sends a POST, so the answer would be something like:
import requests
response = requests.post(
    'http://ecard.bupt.edu.cn/User/ConsumeInfo.aspx',
    headers={'Origin': 'http://ecard.bupt.edu.cn'},
    # Form fields from --data, URL-decoded; cookie values should be strings
    data={'__EVENTTARGET': '', 'ctl00$ContentPlaceHolder1$rbtnType': '0'},
    cookies={'.NECEID': '1'},
)
I didn't include every parameter, but I think this example is enough to work out how to send exactly the same request. As for the --compressed flag: as far as I understand from the docs, it is specific to curl. It asks the server for a compressed response and decompresses it, which requests already does automatically for gzip and deflate.
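For the full command, it can be less error-prone to let Python split the Cookie header and the --data string into dictionaries than to retype them by hand. The sketch below is one way to do that, assuming the body is ordinary URL-encoded form data with unique field names. The helper names (`parse_cookie_header`, `parse_form_body`, `send_form`) are made up for illustration, and the sample strings are shortened versions of the values in the question.

```python
from urllib.parse import parse_qsl


def parse_cookie_header(header):
    """Split a 'name=value; name2=value2' Cookie header into a dict."""
    cookies = {}
    for pair in header.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep:  # skip empty fragments such as a trailing ';'
            cookies[name] = value
    return cookies


def parse_form_body(body):
    """Decode a curl --data string into a dict of form fields.

    keep_blank_values=True preserves empty fields such as __EVENTTARGET=.
    """
    return dict(parse_qsl(body, keep_blank_values=True))


def send_form(url, cookie_header, body, extra_headers=None):
    """POST the decoded form; requests URL-encodes the dict again."""
    import requests  # third-party; imported here so the parsers above need only the stdlib

    return requests.post(
        url,
        headers=extra_headers or {},
        cookies=parse_cookie_header(cookie_header),
        data=parse_form_body(body),
    )


# Shortened sample values taken from the curl command in the question.
cookies = parse_cookie_header("ASP.NET_SessionId=4mzmi0whscqg4humcs5fx0cf; .NECEID=1")
fields = parse_form_body(
    "__EVENTTARGET=&ctl00%24ContentPlaceHolder1%24rbtnType=0"
    "&ctl00%24ContentPlaceHolder1%24txtStartDate=2016-09-17"
)
print(cookies)
print(fields)
```

With the complete strings from the curl command, `send_form('http://ecard.bupt.edu.cn/User/ConsumeInfo.aspx', cookie_header, body)` would send the POST and return the response object.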

Here is a useful website for converting a curl command to requests code:
curl to request
Curl Command:
curl 'http://js.t.sinajs.cn/open/api/js/api/bundle.js?version=20150130.02' -H 'Pragma: no-cache' -H 'Accept-Encoding: gzip, deflate, sdch' -H 'Accept-Language: zh-CN,zh;q=0.8,en;q=0.6,zh-TW;q=0.4' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.87 Safari/537.36' -H 'Accept: */*' -H 'Referer: http://www.liaoxuefeng.com/wiki/001374738125095c955c1e6d8bb493182103fac9270762a000/001375502200090e998439175cc4268b0ea703b3b4ed55e000' -H 'Connection: keep-alive' -H 'Cache-Control: no-cache' --compressed
Python requests:
import requests
headers = {
'Pragma': 'no-cache',
'Accept-Encoding': 'gzip, deflate, sdch',
'Accept-Language': 'zh-CN,zh;q=0.8,en;q=0.6,zh-TW;q=0.4',
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.87 Safari/537.36',
'Accept': '*/*',
'Referer': 'http://www.liaoxuefeng.com/wiki/001374738125095c955c1e6d8bb493182103fac9270762a000/001375502200090e998439175cc4268b0ea703b3b4ed55e000',
'Connection': 'keep-alive',
'Cache-Control': 'no-cache',
}
requests.get('http://js.t.sinajs.cn/open/api/js/api/bundle.js?version=20150130.02', headers=headers)

Related

Getting 302 with Curl, but 200 with python requests

The site I am trying to scrape responds with a 302 redirect, as seen in the browser's network tools. I copied the network request as curl and it works fine, but when I convert it to Python requests it just returns 200.
Curl:
curl -v 'https://my-site.com/CardAuthentication.aspx' \
-H 'Connection: keep-alive' \
-H 'Pragma: no-cache' \
-H 'Cache-Control: no-cache' \
-H 'Upgrade-Insecure-Requests: 1' \
-H 'Origin: https://my-site.com' \
-H 'Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryBN2XwBbcAwvRWZzk' \
-H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36' \
-H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' \
-H 'Sec-GPC: 1' \
-H 'Sec-Fetch-Site: same-origin' \
-H 'Sec-Fetch-Mode: navigate' \
-H 'Sec-Fetch-User: ?1' \
-H 'Sec-Fetch-Dest: document' \
-H 'Referer: https://my-site.com/CardAuthentication.aspx' \
-H 'Accept-Language: en-US,en;q=0.9' \
-H 'Cookie: ASP.NET_SessionId=...; TS0192fa71=...; ServiceProvider=UID=...; .ASPXAUTH=...' \
--data-raw $'------WebKitFormBoundaryBN2XwBbcAwvRWZzk\r\ DATA \r\n' \
--compressed
this is returning:
* Trying IP:443...
........
........
< HTTP/1.1 302 Found
< Cache-Control: no-cache
< Pragma: no-cache
< Content-Type: text/html; charset=utf-8
< Expires: -1
< Location: my-site.com/MemberDetails.aspx
< Date: Mon, 07 Feb 2022 09:54:49 GMT
< Content-Length: 61211
< Set-Cookie: TS0192fa71=...; Path=/; Domain=.my-site.com; Secure
Python:
import requests
import logging
logging.basicConfig(level=logging.DEBUG)
cookies = {
"ASP.NET_SessionId": ".....",
"TS0192fa71": ".....",
"ServiceProvider": ".....",
".ASPXAUTH": ".....",
}
headers = {
......
"Content-Type": "multipart/form-data; boundary=----WebKitFormBoundaryBN2XwBbcAwvRWZzk",
"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36",
"Referer": "https://my-site.com/CardAuthentication.aspx",
"Accept-Language": "en-US,en;q=0.9",
}
data = {
"------WebKitFormBoundaryBN2XwBbcAwvRWZzk\r\nContent-Disposition: form-data; name": '"__EVENTTARGET"\r\n\r\n\r\n-- DATA
}
response = requests.post(
"https://my-site.com/CardAuthentication.aspx",
headers=headers,
cookies=cookies,
data=data,
)
The status code is 200, and the response history is empty.
Is there something wrong with the requests library, or with the way it is processing the data? How can I solve this?
This is the default requests behavior: redirects are followed automatically. If you do not want a 3xx response to be followed, set allow_redirects to False, for example:
import requests
r = requests.get("http://github.com", allow_redirects=False)
print(r.status_code) # 301
If you want to know more, read the requests.request docs.
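To check whether a redirect was actually followed, it can help to list every hop that requests recorded. The following is a small sketch rather than part of the original answer: `redirect_chain` and `post_without_following` are hypothetical helper names, and the code relies only on the `history`, `status_code` and `url` attributes of a requests response.

```python
def redirect_chain(response):
    """Return (status_code, url) for every hop, ending with the final response.

    response.history holds the intermediate responses that requests
    followed; it is empty when no redirect was followed.
    """
    hops = [(r.status_code, r.url) for r in response.history]
    hops.append((response.status_code, response.url))
    return hops


def post_without_following(url, **kwargs):
    """POST once and return the first response, even if it is a 3xx."""
    import requests  # third-party; imported lazily so redirect_chain has no dependency

    return requests.post(url, allow_redirects=False, **kwargs)
```

With `allow_redirects=False`, a redirecting server should produce a chain of length one whose status is the 3xx itself, with the target in `response.headers['Location']`. If the chain is a single 200 either way, the server did not redirect this particular request, which would suggest that the body or headers differ from what curl sent.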

cURL request into Swift vs Python

I am trying to implement a cURL request into Swift App, however, the server is not accepting my request for some reason.
cURL Request:
curl 'https://test.com' \
-H 'Connection: keep-alive' \
-H 'Accept: application/json, text/javascript, */*; q=0.01' \
-H 'cache-control: no-cache' \
-H 'X-Requested-With: XMLHttpRequest' \
-H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36' \
-H 'Content-Type: application/json' \
-H 'Origin: https://yyy.com' \
-H 'Sec-Fetch-Site: same-origin' \
-H 'Sec-Fetch-Mode: cors' \
-H 'Sec-Fetch-Dest: empty' \
-H 'Referer: https://zzz.com' \
-H 'Accept-Language: en-GB,en-US;q=0.9,en;q=0.8' \
--data-binary '{"method":"NewUser","params":{"Username":"James","Password":"secret"}}' \
--compressed
I have managed to do this in Python and it worked:
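import requests
from requests.structures import CaseInsensitiveDict
url = "https://test.com"
# Instantiate the dict; the bare class cannot store header values
headers = CaseInsensitiveDict()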
headers["Connection"] = "keep-alive"
headers["Accept"] = "application/json, text/javascript, */*; q=0.01"
headers["cache-control"] = "no-cache"
headers["X-Requested-With"] = "XMLHttpRequest"
headers["User-Agent"] = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
headers["Content-Type"] = "application/json"
headers["Origin"] = "https://yyy.com"
headers["Sec-Fetch-Site"] = "same-origin"
headers["Sec-Fetch-Mode"] = "cors"
headers["Sec-Fetch-Dest"] = "empty"
headers["Referer"] = "https://zzz.com"
headers["Accept-Language"] = "en-GB,en-US;q=0.9,en;q=0.8"
data = '{"method":"NewUser","params":{"Username":"James","Password":"Secret"}}'
raw_response = requests.post(url, headers=headers, data=data)
What I am trying to do in Swift is:
let url = URL(string: "https://test.com")!
var request = URLRequest(url: url)
request.httpMethod = "POST"
request.addValue("application/json", forHTTPHeaderField: "Content-Type")
request.addValue("keep-alive", forHTTPHeaderField: "Connection")
request.addValue("application/json, text/javascript, */*; q=0.01", forHTTPHeaderField: "Accept")
request.addValue("no-cache", forHTTPHeaderField: "cache-control")
request.addValue("XMLHttpRequest", forHTTPHeaderField: "X-Requested-With")
request.addValue("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36", forHTTPHeaderField: "User-Agent")
request.addValue("https://yyy.com", forHTTPHeaderField: "Origin")
request.addValue("same-origin", forHTTPHeaderField: "Sec-Fetch-Site")
request.addValue("cors", forHTTPHeaderField: "Sec-Fetch-Mode")
request.addValue("empty", forHTTPHeaderField: "Sec-Fetch-Dest")
request.addValue("https://zzz.com", forHTTPHeaderField: "Referer")
request.addValue("en-GB,en-US;q=0.9,en;q=0.8", forHTTPHeaderField: "Accept-Language")
let json: [String: Any] = ["method": "NewUser","params": ["Username":"James", "Password":"Secret"]]
let jsonData = try! JSONSerialization.data(withJSONObject: json, options: [])
request.httpBody = jsonData
let task = URLSession.shared.dataTask(with: url) {(data, response, error) in
guard let data = data else { return }
print(String(data: data, encoding: .utf8)!)
}
task.resume()
Any idea what the difference is between the Python and Swift code that makes the Swift version fail?
Thanks.

Get response from Curl but not Python Requests

I ran the curl command in the terminal and got a JSON response. But when I converted the command to Python code and ran the script, it returned an HTML page, which looks like a captcha page.
How can they detect the difference? Is there any way I can bypass that?
curl --location --request GET 'https://www.99.co/api/v2/web/search/listings?query_type=city&property_segments=residential&listing_type=sale&rental_type=unit&page_size=1&page_num=100&zoom=11&show_future_mrts=true&show_cluster_preview=true&show_internal_linking=true' \
--header 'user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:89.0) Gecko/20100101 Firefox/89.0' \
--header 'accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8' \
--header 'accept-language: en-US,en;q=0.5' \
--header 'if-none-match: W/"5cb28158470b6eb63b0db1f5dd2adda7d10c328c"' \
--header 'cookie: _xsrf=2|6be6c7ee|de091e966dba87e0b2c8637a63127715|1616479899' \
--header 'cookie: lotameid=undefined' \
--header 'cookie: started_since=1616480838' \
--header 'cookie: country=SG' \
--header 'cookie: nid=2de067d4-fed8-4c35-b7db-14f479997f8a' \
--header 'cookie: g_state={"i_p":1627551635034,"i_l":1}' \
--header 'upgrade-insecure-requests: 1' \
--header 'te: trailers' \
--header 'sec-gpc: 1'
import requests
url = "https://www.99.co/api/v2/web/search/listings?query_type=city&property_segments=residential&listing_type=sale&rental_type=unit&page_size=1&page_num=100&zoom=11&show_future_mrts=true&show_cluster_preview=true&show_internal_linking=true"
payload={}
headers = {
'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:89.0) Gecko/20100101 Firefox/89.0',
'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
'accept-language': 'en-US,en;q=0.5',
'if-none-match': 'W/"5cb28158470b6eb63b0db1f5dd2adda7d10c328c"',
'cookie': '_xsrf=2|6be6c7ee|de091e966dba87e0b2c8637a63127715|1616479899',
'cookie': 'lotameid=undefined',
'cookie': 'started_since=1616480838',
'cookie': 'country=SG',
'cookie': 'nid=2de067d4-fed8-4c35-b7db-14f479997f8a',
'cookie': 'g_state={"i_p":1627551635034,"i_l":1}',
'upgrade-insecure-requests': '1',
'te': 'trailers',
'sec-gpc': '1'
}
response = requests.request("GET", url, headers=headers, data=payload)
print(response.text)
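One difference is visible in the Python code itself, whether or not it is what triggers the captcha: a dict literal keeps only the last value for a repeated key, so of the six `'cookie'` entries only the final one is actually sent. Here is a minimal sketch of the effect and one way around it, using three of the cookies from the curl command (the variable names are illustrative):

```python
# A repeated key in a dict literal silently overwrites the earlier entries.
headers = {
    'cookie': 'started_since=1616480838',
    'cookie': 'country=SG',
    'cookie': 'lotameid=undefined',
}
print(headers)  # only the last 'cookie' entry survives

# Keeping each cookie under its own name avoids the collision.
cookies = {
    'started_since': '1616480838',
    'country': 'SG',
    'lotameid': 'undefined',
}

# Equivalent single Cookie header, if you prefer to send it by hand.
cookie_header = '; '.join(f'{name}={value}' for name, value in cookies.items())
print(cookie_header)
```

The `cookies` dict can be passed directly, as in `requests.get(url, headers=headers, cookies=cookies)`, once the `'cookie'` entries are removed from `headers`.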

Curl requests works but fails on python-requests

I'm currently getting a forbidden response from a site I am trying to visit using the requests library.
Interestingly, it works fine when using curl like this:
curl 'https://www.zim.com/tools/track-a-shipment?consnumber=ZIMUHKG83103991' \
-H 'Connection: keep-alive' \
-H 'Pragma: no-cache' \
-H 'Cache-Control: no-cache' \
-H 'sec-ch-ua: "Google Chrome";v="87", " Not;A Brand";v="99", "Chromium";v="87"' \
-H 'sec-ch-ua-mobile: ?0' \
-H 'Upgrade-Insecure-Requests: 1' \
-H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36' \
-H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' \
-H 'Sec-Fetch-Site: same-origin' \
-H 'Sec-Fetch-Mode: navigate' \
-H 'Sec-Fetch-User: ?1' \
-H 'Sec-Fetch-Dest: document' \
-H 'Referer: https://www.zim.com/tools/track-a-shipment?consnumber=ZIMUHKG83103991' \
-H 'Accept-Language: en-US,en;q=0.9' \
-H 'Cookie: bm_sz=A9168FBFA535588B6FA41B389B0E0FC1~YAAQ5zLFF9vG5S52AQAA4DyEaQroeTnfUz1nzswWQ4xFrGAXCczEwMD0MIS+Fr9nA5Zs5CDuB/6MOUP/Q8maKyJJN1yTDoeodUjZt+5T/loHrw8YlbK3a0vO+29JspP04oee8T5lU9qct2gtHC60vbG94f3quPmuSVBERJbEnOG/Ju5QoQx9jsDMPQ==; AKA_A2=A; ak_bmsc=F05A9B88D5CE9314962A4FB53AC066FE17C532E756050000F179D95F94CDFE71~plTDPprwDhfZXi2vSQbthcJGHAoK6c1KvvX8PW2BvX9/io3C7PDJwH1q8MiYF2HN4xoooBKQWo3LvCfAKOWQaTOhy3Q+2auRG/2mn2fLiJxvIuju5c4huzOMGsxmM1BrVtWKf+fR4nnfdtodyimozv+SyFazbEbuHuq18ReiUyoP5KyXXlrsRkp3ReYMkve37Mm6n4x/8s7ZWSa7scVVthWDn02d9Dtp4buykLxcZCz1s=; rxVisitor=16080880493048D6AKB20IDEQSEGR1UGPHIO59J02C95B; uniqueUserId=73aa42f6-ec0a-454b-864c-53c929c69acc; stg_traffic_source_priority=1; country=US; _fbp=fb.1.1608088050866.874878324; _ga=GA1.2.1801289879.1608088051; _gid=GA1.2.1440505168.1608088051; _pk_ses.923228e7-d119-45eb-a025-e4d1b1af6e1b.ad39=*; OptanonAlertBoxClosed=2020-12-16T03:07:34.685Z; dtCookie=v_4_srv_1_sn_5RCLUTEDABIFNF5FLA270N8NCOHUOMRF_perc_100000_ol_0_mul_1_app-3A25bc709b3f2362bb_0; TS0128f558=01bbeceaf7fc47a22dbac4f704c3f650720bf639f1b2b937dd7133075e6b113a96aa9c1b98d65581f9a2ca694795bd9b5c3403231723523f99d38545aaef94cf6cc57266cf; LPVID=MwZTliY2MwYjcwMDFmOTdi; LPSID-53134936=JThIkUaOQaGDwAiVk_mZWg; TS01f7aa8d=01bbeceaf7e716ede662d1632457f3471645d66c1808274488c7c06b899e5092889d3e65ade3ba07fd262b9063dd8f63ed770f2bef; stg_externalReferrer=https://www.zim.com/tools/track-a-shipment?consnumber=ZIMUHKG83103991; _abck=7AE32D3F4822078BCDBCB745422E19AE~0~YAAQ5zLFF4XV5S52AQAAOkGOaQUBzX68zlTJTRs1RP+1RohmAGbj05cNVwmJZIYznbQcOmT6NR0ccyXvbkmcczETVxvbyg4h9qiNQ9TNDh9aZrnqxQxttgA96/eKOY5iysj2qIM5Fg+RCBqlSS96tse+yduqQ+JV2OKFCgQC0CR+5Fvr8nngGjIGVqsV/wC4vNGBQX6oqf1YHMxx8rN36mEzTfOpQvUAVU1iN0fuK98Rpu16LFgJU05GRxd/trtTYKDx8AX6SSdjpGyNIvJWOKg30wO7ALYMjALr3pPiWh23sF6McV3nOJVONoGZpBrK6bhEOfQAeUnqkYT4aySK1JebTA==~-1~-1~-1; dtLatC=37; 
OptanonConsent=isIABGlobal=false&datestamp=Tue+Dec+15+2020+19%3A18%3A19+GMT-0800+(Pacific+Standard+Time)&version=5.14.0&landingPath=NotLandingPage&groups=1%3A1%2C2%3A1&AwaitingReconsent=false; _pk_id.923228e7-d119-45eb-a025-e4d1b1af6e1b.ad39=e4e8f967bd0568d5.1608088052.1.1608088700.1608088052.; bm_sv=DA4D33FF58BFFFC86A150B9B747988EC~oo1jX8ygs4LrnU4muVv/GaJaPJk2Jiolx5z0YcxGR4jlocdv9gImCEfeD3qrIbuauRTjMQYhGgV9HfY6u+d2QkpVguGGBIc92tToxRY4Ht3kXG+UqE5NpoBKiBhweyNtt/LIFRgTDDA1v4zQc8EkBw==; rxvt=1608090500507|1608088049306; dtPC=1$88698827_281h-vKTCJGFJVHKAFAPRGCSPMAVPCDVLMHPUJ-0e8; dtSa=true%7CC%7C-1%7CTrack%20Shipment%7C-%7C1608089388256%7C88698827_281%7Chttps%3A%2F%2Fwww.zim.com%2Ftools%2Ftrack-a-shipment%3Fconsnumber%3DZIMUHKG83103991%7CTrack%20Shipment%5Ec%20Container%20Tracing%20%5Ep%20ZIM%7C1608088700148%7C%7C; RT="sl=4&ss=1608088049148&tt=11039&obo=0&sh=1608088700486%3D4%3A0%3A11039%2C1608088668289%3D3%3A0%3A9023%2C1608088063737%3D2%3A0%3A5494%2C1608088051968%3D1%3A0%3A2818&dm=zim.com&si=puidioyg8d8&rl=1&ld=1608088700486&nu=&cl=1608089388285&r=https%3A%2F%2Fwww.zim.com%2Ftools%2Ftrack-a-shipment%3F201a15c26295707f7d9507ad6c103a3f&ul=1608089388315"; stg_last_interaction=Wed%2C%2016%20Dec%202020%2003:29:48%20GMT; stg_returning_visitor=Wed%2C%2016%20Dec%202020%2003:29:48%20GMT' \
--compressed
When I do exactly the same thing in requests, making sure I pass every header correctly, the site still gives me a forbidden error. I understand this is done for security on their side, but since it works fine with curl, what could be different?
My Python requests code looks like this:
import requests
headers = {
'Connection': 'keep-alive',
'Pragma': 'no-cache',
'Cache-Control': 'no-cache',
'sec-ch-ua': '"Google Chrome";v="87", " Not;A Brand";v="99", "Chromium";v="87"',
'sec-ch-ua-mobile': '?0',
'Upgrade-Insecure-Requests': '1',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36',
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
'Sec-Fetch-Site': 'same-origin',
'Sec-Fetch-Mode': 'navigate',
'Sec-Fetch-User': '?1',
'Sec-Fetch-Dest': 'document',
'Referer': 'https://www.zim.com/tools/track-a-shipment',
'Accept-Language': 'en-US,en;q=0.9',
'Cookie':'bm_sz=A9168FBFA535588B6FA41B389B0E0FC1~YAAQ5zLFF9vG5S52AQAA4DyEaQroeTnfUz1nzswWQ4xFrGAXCczEwMD0MIS+Fr9nA5Zs5CDuB/6MOUP/Q8maKyJJN1yTDoeodUjZt+5T/loHrw8YlbK3a0vO+29JspP04oee8T5lU9qct2gtHC60vbG94f3quPmuSVBERJbEnOG/Ju5QoQx9jsDMPQ==; AKA_A2=A; ak_bmsc=F05A9B88D5CE9314962A4FB53AC066FE17C532E756050000F179D95F94CDFE71~plTDPprwDhfZXi2vSQbthcJGHAoK6c1KvvX8PW2BvX9/io3C7PDJwH1q8MiYF2HN4xoooBKQWo3LvCfAKOWQaTOhy3Q+2auRG/2mn2fLiJxvIuju5c4huzOMGsxmM1BrVtWKf+fR4nnfdtodyimozv+SyFazbEbuHuq18ReiUyoP5KyXXlrsRkp3ReYMkve37Mm6n4x/8s7ZWSa7scVVthWDn02d9Dtp4buykLxcZCz1s=; rxVisitor=16080880493048D6AKB20IDEQSEGR1UGPHIO59J02C95B; uniqueUserId=73aa42f6-ec0a-454b-864c-53c929c69acc; stg_traffic_source_priority=1; country=US; _fbp=fb.1.1608088050866.874878324; _ga=GA1.2.1801289879.1608088051; _gid=GA1.2.1440505168.1608088051; _pk_ses.923228e7-d119-45eb-a025-e4d1b1af6e1b.ad39=*; OptanonAlertBoxClosed=2020-12-16T03:07:34.685Z; dtCookie=v_4_srv_1_sn_5RCLUTEDABIFNF5FLA270N8NCOHUOMRF_perc_100000_ol_0_mul_1_app-3A25bc709b3f2362bb_0; TS0128f558=01bbeceaf7fc47a22dbac4f704c3f650720bf639f1b2b937dd7133075e6b113a96aa9c1b98d65581f9a2ca694795bd9b5c3403231723523f99d38545aaef94cf6cc57266cf; LPVID=MwZTliY2MwYjcwMDFmOTdi; LPSID-53134936=JThIkUaOQaGDwAiVk_mZWg; TS01f7aa8d=01bbeceaf7e716ede662d1632457f3471645d66c1808274488c7c06b899e5092889d3e65ade3ba07fd262b9063dd8f63ed770f2bef; stg_externalReferrer=https://www.zim.com/tools/track-a-shipment?consnumber=ZIMUHKG83103991; _abck=7AE32D3F4822078BCDBCB745422E19AE~0~YAAQ5zLFF4XV5S52AQAAOkGOaQUBzX68zlTJTRs1RP+1RohmAGbj05cNVwmJZIYznbQcOmT6NR0ccyXvbkmcczETVxvbyg4h9qiNQ9TNDh9aZrnqxQxttgA96/eKOY5iysj2qIM5Fg+RCBqlSS96tse+yduqQ+JV2OKFCgQC0CR+5Fvr8nngGjIGVqsV/wC4vNGBQX6oqf1YHMxx8rN36mEzTfOpQvUAVU1iN0fuK98Rpu16LFgJU05GRxd/trtTYKDx8AX6SSdjpGyNIvJWOKg30wO7ALYMjALr3pPiWh23sF6McV3nOJVONoGZpBrK6bhEOfQAeUnqkYT4aySK1JebTA==~-1~-1~-1; dtLatC=37; 
OptanonConsent=isIABGlobal=false&datestamp=Tue+Dec+15+2020+19%3A18%3A19+GMT-0800+(Pacific+Standard+Time)&version=5.14.0&landingPath=NotLandingPage&groups=1%3A1%2C2%3A1&AwaitingReconsent=false; _pk_id.923228e7-d119-45eb-a025-e4d1b1af6e1b.ad39=e4e8f967bd0568d5.1608088052.1.1608088700.1608088052.; bm_sv=DA4D33FF58BFFFC86A150B9B747988EC~oo1jX8ygs4LrnU4muVv/GaJaPJk2Jiolx5z0YcxGR4jlocdv9gImCEfeD3qrIbuauRTjMQYhGgV9HfY6u+d2QkpVguGGBIc92tToxRY4Ht3kXG+UqE5NpoBKiBhweyNtt/LIFRgTDDA1v4zQc8EkBw==; rxvt=1608090500507|1608088049306; dtPC=1$88698827_281h-vKTCJGFJVHKAFAPRGCSPMAVPCDVLMHPUJ-0e8; dtSa=true%7CC%7C-1%7CTrack%20Shipment%7C-%7C1608089388256%7C88698827_281%7Chttps%3A%2F%2Fwww.zim.com%2Ftools%2Ftrack-a-shipment%3Fconsnumber%3DZIMUHKG83103991%7CTrack%20Shipment%5Ec%20Container%20Tracing%20%5Ep%20ZIM%7C1608088700148%7C%7C; RT="sl=4&ss=1608088049148&tt=11039&obo=0&sh=1608088700486%3D4%3A0%3A11039%2C1608088668289%3D3%3A0%3A9023%2C1608088063737%3D2%3A0%3A5494%2C1608088051968%3D1%3A0%3A2818&dm=zim.com&si=puidioyg8d8&rl=1&ld=1608088700486&nu=&cl=1608089388285&r=https%3A%2F%2Fwww.zim.com%2Ftools%2Ftrack-a-shipment%3F201a15c26295707f7d9507ad6c103a3f&ul=1608089388315"; stg_last_interaction=Wed%2C%2016%20Dec%202020%2003:29:48%20GMT; stg_returning_visitor=Wed%2C%2016%20Dec%202020%2003:29:48%20GMT'
}
params = (
('consnumber', 'ZIMUHKG83103991'),
)
response = requests.get('https://www.zim.com/tools/track-a-shipment', headers=headers, params=params)
I even made sure to pass exactly the same cookie, but it still fails.
You can change
params = (
    ('consnumber', 'ZIMUHKG83103991'),
)
to a dict:
params = {
    'consnumber': 'ZIMUHKG83103991'
}
and try again.
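Before relying on this change, it may be worth checking what each form turns into. requests accepts either a dict or a sequence of key/value pairs for `params`; the sketch below uses the standard library's `urlencode` as a stand-in for that encoding step, which is an assumption about equivalence for this simple case and not a trace of requests internals.

```python
from urllib.parse import urlencode

params_tuple = (
    ('consnumber', 'ZIMUHKG83103991'),
)
params_dict = {
    'consnumber': 'ZIMUHKG83103991',
}

# Encode both forms so the resulting query strings can be compared.
print(urlencode(params_tuple))
print(urlencode(params_dict))
```

If both lines print the same string, the request URL is the same either way, and a difference in the server's response would have to come from something else.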

How could I query the result without selenium on Python or Ruby

I'm trying to track ticket trends.
Currently, I'm using Selenium to simulate submitting the forms.
As you know, Selenium is slow and consumes a lot of memory.
However, when you submit the form, it redirects you to a new URL, http://makeabooking.flyscoot.com/Flight/Select
So I don't know how to do this without Selenium,
because I can't put the query into the URL, like http://makeabooking.flyscoot.com/Flight/from={TPE}&to={NYK}&date={2015-10-12}, to fetch the result.
Is there a way to do this in Ruby or Python, with SSL proxy and HTTP proxy support?
Sample website: http://www.flyscoot.com/index.php/en/
You can easily get the curl request from Chrome and reuse it:
F12 > Network > request > Right Click > Copy As cURL
curl 'http://makeabooking.flyscoot.com/Flight/Select' -H 'Accept-Encoding: gzip, deflate, sdch' -H 'Accept-Language: en-US,en;q=0.8,tr;q=0.6' -H 'Upgrade-Insecure-Requests: 1' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,/;q=0.8' -H 'Referer: http://www.flyscoot.com/index.php/en/' -H 'Cookie: optimizelyEndUserId=oeu1444666692081r0.12463579000905156; __utmt=1; granify.lasts#1345=1444666699786; ASP.NET_SessionId=lql5yzv1l3yatkh1lcumg2e5; dotrez=1209262602.20480.0000; optimizelySegments=%7B%222335550040%22%3A%22gc%22%2C%222344180004%22%3A%22referral%22%2C%222354350067%22%3A%22false%22%2C%222355380121%22%3A%22none%22%7D; optimizelyBuckets=%7B%223025070068%22%3A%223020800213%22%7D; __utma=185425846.733949751.1444666694.1444666694.1444666694.1; __utmb=185425846.2.10.1444666694; __utmc=185425846; __utmz=185425846.1444666694.1.1.utmcsr=stackoverflow.com|utmccn=(referral)|utmcmd=referral|utmcct=/questions/33084039/how-could-i-query-the-result-without-selenium-on-python-or-ruby; granify.uuid=68b0d8e8-d068-40d8-9068-3098e870b858; granify.session#1345=1444666699786; granify.flags#1345=8; _gr_ep_sent=1; _gr_er_sent=1; granify.session_init#1345=2; optimizelyPendingLogEvents=%5B%5D' -H 'Connection: keep-alive' -H 'X-FirePHP-Version: 0.0.6' -H 'Cache-Control: max-age=0' --compressed
If you set the headers and cookies correctly, you can use Python requests. To convert the curl command to Python requests, you can use this link. This way you can simulate the browser. Here is the Python requests version:
import requests
cookies = {
'optimizelyEndUserId': 'oeu1444666692081r0.12463579000905156',
'__utmt': '1',
'granify.lasts#1345': '1444666699786',
'ASP.NET_SessionId': 'lql5yzv1l3yatkh1lcumg2e5',
'dotrez': '1209262602.20480.0000',
'optimizelySegments': '%7B%222335550040%22%3A%22gc%22%2C%222344180004%22%3A%22referral%22%2C%222354350067%22%3A%22false%22%2C%222355380121%22%3A%22none%22%7D',
'optimizelyBuckets': '%7B%223025070068%22%3A%223020800213%22%7D',
'__utma': '185425846.733949751.1444666694.1444666694.1444666694.1',
'__utmb': '185425846.2.10.1444666694',
'__utmc': '185425846',
'__utmz': '185425846.1444666694.1.1.utmcsr=stackoverflow.com|utmccn=(referral)|utmcmd=referral|utmcct=/questions/33084039/how-could-i-query-the-result-without-selenium-on-python-or-ruby',
'granify.uuid': '68b0d8e8-d068-40d8-9068-3098e870b858',
'granify.session#1345': '1444666699786',
'granify.flags#1345': '8',
'_gr_ep_sent': '1',
'_gr_er_sent': '1',
'granify.session_init#1345': '2',
'optimizelyPendingLogEvents': '%5B%5D',
}
headers = {
'Accept-Encoding': 'gzip, deflate, sdch',
'Accept-Language': 'en-US,en;q=0.8,tr;q=0.6',
'Upgrade-Insecure-Requests': '1',
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36',
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
'Referer': 'http://www.flyscoot.com/index.php/en/',
'Connection': 'keep-alive',
'X-FirePHP-Version': '0.0.6',
'Cache-Control': 'max-age=0',
}
requests.get('http://makeabooking.flyscoot.com/Flight/Select', headers=headers, cookies=cookies)
If you save the result, you can see that it matches what the browser shows (open stack1.html):
r = requests.get('http://makeabooking.flyscoot.com/Flight/Select', headers=headers, cookies=cookies)
# r.content is bytes, so write the file in binary mode
with open("stack1.html", "wb") as f:
    f.write(r.content)
I think this answer https://stackoverflow.com/a/1196151/1033953 is what you're looking for.
You'll need to inspect the parameters on that form to make sure you're posting the right values, but after that you just need Ruby's net/http to send the HTTP POST.
I'm sure Python has something similar. Or you could use curl to post, as shown in this answer: https://superuser.com/a/149335
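On the Python side, the standard library's urllib.request plays roughly the role of Ruby's net/http. Here is a minimal sketch of posting a form through a proxy. The field names (`from`, `to`, `date`) are borrowed from the URL the question wishes for and are not the real form fields, and the proxy address is a placeholder, so both need to be replaced with values read from the actual form and your own proxy.

```python
from urllib.parse import urlencode
from urllib.request import ProxyHandler, Request, build_opener

FORM_URL = 'http://makeabooking.flyscoot.com/Flight/Select'

# Hypothetical field names; inspect the real form to find the actual ones.
form_fields = {'from': 'TPE', 'to': 'NYK', 'date': '2015-10-12'}

# Supplying a body makes urllib send a POST instead of a GET.
request = Request(
    FORM_URL,
    data=urlencode(form_fields).encode('ascii'),
    headers={'Content-Type': 'application/x-www-form-urlencoded'},
)

# Placeholder proxy address; one entry per URL scheme.
proxies = {
    'http': 'http://127.0.0.1:8080',
    'https': 'http://127.0.0.1:8080',
}
opener = build_opener(ProxyHandler(proxies))


def submit():
    """Send the form through the proxy and return the response body as bytes."""
    with opener.open(request, timeout=30) as response:
        return response.read()
```

Nothing is sent until `submit()` is called, so the request can be inspected first, for example with `request.get_method()` and `request.data`.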
