Python HTTP Header Content-Type boundary

Here is my code:
headers = {
    'Host': 'cafe.upphoto.naver.com',
    'Content-Length': '879990',
    'Accept': '*/*',
    'Origin': 'http://cafe.upphoto.naver.com',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36',
    'Content-Type': content,
    'Content-Type': 'multipart/form-data;',  # boundary=----WebKitFormBoundary3oLjjtLvU7AzQqTF',
    'Referer': write,
    'Accept-Language': 'ko-KR,ko;q=0.8,en-US;q=0.6,en;q=0.4',
}
files = {
    'image': ('test.jpg', open('C:\\Users\\Public\\Pictures\\Sample Pictures\\test.jpg', 'rb'), 'Content-Type: image/jpeg'),
    'filename': (None, 'test.jpg'),
    'autorotate': (None, 'true'),
    'extractAnimatedCnt': (None, 'true'),
    'userId': (None, 'beg1995'),
}
resp = self.post(url2 + 'upload/0', files=files, headers=headers)
When you run this code, the following packet is created:
POST http://cafe.upphoto.naver.com/MjAxNzA3MDcwMTExNDAHMTQ5OTM1ODQzNjkyNwdjYWZlMgdiZWcxOTk1BzAHMgdhODA1MzhiZmMyMGMyYTFlYTlhODE1NGY5OTc1ZDRkZA/upload/0 HTTP/1.1
Host: cafe.upphoto.naver.com
Proxy-Connection: keep-alive
Content-Length: 879990
Accept: */*
Origin: http://cafe.upphoto.naver.com
User-Agent: Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36
Content-Type: multipart/form-data; boundary=----WebKitFormBoundary9xhUsyQOPYJrPr3R
Referer: http://cafe.upphoto.naver.com/MjAxNzA3MDcwMTExNDAHMTQ5OTM1ODQzNjkyNwdjYWZlMgdiZWcxOTk1BzAHMgdhODA1MzhiZmMyMGMyYTFlYTlhODE1NGY5OTc1ZDRkZA/startup?mode=base&width=960
Accept-Language: ko-KR,ko;q=0.8,en-US;q=0.6,en;q=0.4
--e2f306a6b5a3485fb70bc2f7f1af2e9a
Content-Disposition: form-data; name="image"; filename="test.jpg"
Content-Type: Content-Type: image/jpeg
ÿØÿà
Content-Disposition: form-data; name="filename"
test.jpg
--e2f306a6b5a3485fb70bc2f7f1af2e9a
Content-Disposition: form-data; name="autorotate"
true
--e2f306a6b5a3485fb70bc2f7f1af2e9a
Content-Disposition: form-data; name="extractAnimatedCnt"
true
--e2f306a6b5a3485fb70bc2f7f1af2e9a
Content-Disposition: form-data; name="userId"
beg1995
--e2f306a6b5a3485fb70bc2f7f1af2e9a-
Notice that the boundary declared in the Content-Type header differs from the boundary actually used in the body.
What is the problem?

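I assume you are using the requests library. It does not let you choose the boundary when you pass files=; it generates one on the fly and uses it in the body. A boundary that you hard-code in your own Content-Type header will therefore not match. Leave Content-Type (and Content-Length) out of your headers and let requests fill them in.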
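The behaviour can be checked without sending anything over the network. The sketch below prepares a multipart request and compares the generated header with the body. The URL and the in-memory "image" are placeholders chosen for illustration, not values from the question.

```python
import io
import requests

# Placeholder endpoint and an in-memory stand-in for the JPEG file (assumptions).
url = "http://example.com/upload/0"
fake_jpeg = io.BytesIO(b"\xff\xd8\xff\xe0 not a real image")

files = {
    # The third tuple item is the bare MIME type, without a "Content-Type:" prefix.
    "image": ("test.jpg", fake_jpeg, "image/jpeg"),
    "filename": (None, "test.jpg"),
    "autorotate": (None, "true"),
}

# No Content-Type or Content-Length here: requests adds both itself.
headers = {"Accept": "*/*"}

# prepare() builds the request exactly as it would be sent, but sends nothing.
prepared = requests.Request("POST", url, files=files, headers=headers).prepare()

content_type = prepared.headers["Content-Type"]
boundary = content_type.split("boundary=", 1)[1]

print(content_type)          # multipart/form-data; boundary=<generated value>
print(prepared.body[:60])    # the body starts with "--" followed by the same boundary
```

Because the header is generated from the same boundary as the body, the two always agree as long as you do not override Content-Type yourself.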

Related

Making the right post request

I need your help putting together a POST request.
The output I get is HTML, but I expected the JSON that the browser receives.
Below is all the data for the desired request:
General
Request URL: https://dgslivebetting.betonline.ag/ngwbet.aspx/gvFrameHtml
Request Method: POST
Status Code: 200
Remote Address: 104.17.64.19:443
Referrer Policy: strict-origin-when-cross-origin
Response Headers
cache-control: no-cache
cf-cache-status: DYNAMIC
cf-ray: 76800ae95afc35b3-DME
content-encoding: br
content-type: application/json; charset=utf-8
date: Thu, 10 Nov 2022 16:07:42 GMT
expires: -1
pragma: no-cache
server: cloudflare
set-cookie: server_persistent=!zk3OrErnBetHZkiKJcby5Il79pzHsf7dxKD0PcVuB54Z2dznuEbqgGAVDWLDvoqpVSDnVq+Jtf91LHo=; path=/; Httponly; Secure
x-newrelic-app-data: PxQFUFRTDQMHR1NRBQkOVVABDhFORDQHUjZKA1ZLVVFHDFYPHjZWADdTRRcPAF0cXgMWAFJFaAcXQU4cBRAlEFEPXSpMVVgQH1UXUR1RHVBUAA9QVloUHgFIQ1YCAg9fAAgFAFZXUFYDUQBAFF5VXkAAZA==
Request Headers
:authority: dgslivebetting.betonline.ag
:method: POST
:path: /ngwbet.aspx/gvFrameHtml
:scheme: https
accept: */*
accept-encoding: gzip, deflate, br
accept-language: ru-RU,ru;q=0.9,en-US;q=0.8,en;q=0.7
content-length: 12
content-type: application/json; charset=UTF-8
cookie: _xpid=574830729; _xpkey=K_F3GRHECOTdjT306mOafHByLTxopGhY; LPVID=MxZmQyM2Q5OTFlOTU0ZTJk; _hjSessionUser_2115245=eyJpZCI6IjQ3MzAxYmQwLTQ4ODgtNWNjMC1hZGZjLWJlZDBmNDgwZDJjZCIsImNyZWF0ZWQiOjE2NjY1NTY0MjQwOTIsImV4aXN0aW5nIjp0cnVlfQ==; CT.CONTENT.NA.STATUS=1; _gid=GA1.2.1666042031.1667883501; PreviousUrlNav=%2Fsportsbook%2Flive-betting; chQuickBet=undefined; inputAmount=100.00; kameleoonVisitorCode=_js_ti27yqxpj7dd4k1x; DD-LINK-NAREDIRECT=0; ASP.NET_SessionId=5acflzzgqtjdvsnjc5wtwuys; tz=Eastern%20Standard%20Time; btpdb.1PR3l09.dGZjLjY2ODI2ODU=U0VTU0lPTg; oddsfmt=dec; _hjSession_2115245=eyJpZCI6Ijk2NzBiMjNkLWY4MGQtNDM5OS1hYWNhLWQyODBjNmZlYzNkMSIsImNyZWF0ZWQiOjE2NjgwOTM2NzY4OTUsImluU2FtcGxlIjpmYWxzZX0=; _hjAbsoluteSessionInProgress=0; _hjIncludedInSessionSample=0; LPSID-90263191=bLgFHbiuTjOcwCg1FgR16g; __cf_bm=5LozQOf4P4COCn1rVD5emsVzukFSNbWdS7kvBVodzJ4-1668096251-0-AQ+nY5HeihIwV+gAI1oaFKJJxOtgXWs5czIr198Ffrh18P1q4nriEcszp/j7dwjuDjVuki1jlT6IByy2ewOCcXSUWavF+3MCcBF4Yb8sfDPVkvoSufxJ46feYuPiCiPcw0eW9oTUnrmZNcEkZ1732RDx6LWq1OElUvT0Uk6sk1n1; _gat_UA-190679354-1=1; _ga_KC6V6402HY=GS1.1.1668096234.18.1.1668096460.0.0.0; _ga=GA1.1.1142263304.1666556424; server_persistent=!Tdbrpsz3tJ8jlNmKJcby5Il79pzHsfLVz91fFnDrXObiJE45d6idCUAVcW4Qmd/g598vNFaqTVuVRvk=
origin: https://dgslivebetting.betonline.ag
referer: https://dgslivebetting.betonline.ag/ngwbet.aspx
sec-ch-ua: "Google Chrome";v="107", "Chromium";v="107", "Not=A?Brand";v="24"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "Windows"
sec-fetch-dest: empty
sec-fetch-mode: cors
sec-fetch-site: same-origin
user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36
x-newrelic-id: VgcFUVNTDxACV1NaDgIDVlw=
x-requested-with: XMLHttpRequest
Please help me figure out how I can get what I want.
My code:
import requests
import cloudscraper
scraper = cloudscraper.create_scraper()
url = 'https://dgslivebetting.betonline.ag/ngwbet.aspx/gvFrameHtml'
data = {"gameID":0}
headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36",
'Referer': "https://dgslivebetting.betonline.ag/ngwbet.aspx/gvFrameHtml"
}
r = requests.post(url, data=data, headers=headers)
print(r.text)

To get JSON back, you need to add the Content-Type header to your request.
Your current example shows that you are only sending these headers:
headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36",
'Referer': "https://dgslivebetting.betonline.ag/ngwbet.aspx/gvFrameHtml"
}
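At the very least, you'll need to add Content-Type: application/json; charset=UTF-8 to the request. Without it, requests sends the dict passed to data= as an application/x-www-form-urlencoded form post, which is why this site returns HTML instead of JSON. The body itself must also be JSON: either pass the dict with json= instead of data=, or serialize it yourself with json.dumps.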
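The difference between data= and json= can be seen by preparing both requests without sending them. This is a minimal sketch that reuses the URL and payload from the question; whether the server needs further headers or cookies is not something the example can confirm.

```python
import json
import requests

url = "https://dgslivebetting.betonline.ag/ngwbet.aspx/gvFrameHtml"
payload = {"gameID": 0}

# data= with a dict: requests form-encodes the body.
form_request = requests.Request("POST", url, data=payload).prepare()

# json= with a dict: requests serializes to JSON and sets Content-Type itself.
json_request = requests.Request("POST", url, json=payload).prepare()

# Equivalent manual variant that also carries the charset the browser sent.
manual_request = requests.Request(
    "POST",
    url,
    data=json.dumps(payload),
    headers={"Content-Type": "application/json; charset=UTF-8"},
).prepare()

for name, req in [("data=", form_request), ("json=", json_request), ("manual", manual_request)]:
    print(name, req.headers["Content-Type"], req.body)
```

To actually send the request, pass the same arguments to requests.post (or to the cloudscraper session created in the question) instead of building a prepared request.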

Fatal error in POST using requests module in Python 3

I'm working on a web scraper built in Python. So far I have written the following code:
import requests
headers = {
'authority': 'truegamedata.com',
'accept': '*/*',
'x-requested-with': 'XMLHttpRequest',
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.152 Safari/537.36',
'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
'sec-gpc': '1',
'origin': 'https://truegamedata.com',
'sec-fetch-site': 'same-origin',
'sec-fetch-mode': 'cors',
'sec-fetch-dest': 'empty',
'referer': 'https://truegamedata.com/weapon_builder.php',
'accept-language': 'pt-BR,pt;q=0.9,en-US;q=0.8,en;q=0.7',
}
data = {
'weapon_name': '^%^5B^%^22Kilo 141^%^22^%^2C^%^22wz^%^22^%^5D'
}
response = requests.post('https://truegamedata.com/SQL_calls/base_data.php', headers=headers, data=data)
print(response.text)
For some reason, I get the following error:
<br />
<b>Fatal error</b>: Uncaught Error: Call to a member function execute() on bool in /home/customer/www/truegamedata.com/public_html/SQL_calls/base_data.php:29
Stack trace:
#0 {main}
thrown in <b>/home/customer/www/truegamedata.com/public_html/SQL_calls/base_data.php</b> on line <b>29</b><br />
Does anyone know why this is happening, and how I can get the expected response?
Here is the request from Chrome DevTools:
Request URL: https://truegamedata.com/SQL_calls/base_data.php
Request Method: POST
Status Code: 200
Remote Address: 127.0.0.1:61696
Referrer Policy: strict-origin-when-cross-origin
cache-control: no-store, no-cache, must-revalidate
content-encoding: br
content-type: text/html; charset=UTF-8
date: Fri, 12 Feb 2021 20:08:45 GMT
expires: Thu, 19 Nov 1981 08:52:00 GMT
host-header: 8441280b0c35cbc1147f8ba998a563a7
pragma: no-cache
server: nginx
vary: Accept-Encoding
x-httpd-modphp: 1
x-proxy-cache-info: DT:1
:authority: truegamedata.com
:method: POST
:path: /SQL_calls/base_data.php
:scheme: https
accept: */*
accept-encoding: gzip, deflate, br
accept-language: pt-BR,pt;q=0.9
content-length: 42
content-type: application/x-www-form-urlencoded; charset=UTF-8
cookie: PHPSESSID=375e8ebdfa9174d6db5eb8c1cda4411b; game=wz
origin: https://truegamedata.com
referer: https://truegamedata.com/weapon_builder.php
sec-fetch-dest: empty
sec-fetch-mode: cors
sec-fetch-site: same-origin
sec-gpc: 1
user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.152 Safari/537.36
x-requested-with: XMLHttpRequest
weapon_name: ["FR 5.56","wz"]
I tried to give as much information as possible. If anything is missing, let me know.

Issue in passing session information for scraping

I went to this website:
www4.fmovies.to
Then I clicked on a movie, checked its CDN URL via Inspect -> Network,
and got the details below:
https://cdn.mcloud.to/stream/sf:i0:q2:h3:p23:l1/LR6ljfLn3hrEjSfrOp19wg/1542603600/i/f/2/nr69r8/hls/480/480-0013.ts
:authority: cdn.mcloud.to
:method: GET
:path: /stream/sf:i0:q2:h3:p23:l1/LR6ljfLn3hrEjSfrOp19wg/1542603600/i/f/2/nr69r8/hls/480/480-0001.ts
:scheme: https
accept: */*
accept-encoding: gzip, deflate, br
accept-language: en-US,en;q=0.9
cookie: __cfduid=d0847f9ac6d9a8da1dd131d1a0a91ea991542533053; _ga=GA1.2.485859786.1542533055; _gid=GA1.2.1916946057.1542533055; _gat=1
origin: https://mcloud.to
referer: https://mcloud.to/embed/#P#O8SE2916SEOA5?sub.file=https%253A%252F%252Fstatic1.akacdn.ru%252Fsubtitle%252F40039.vtt%253Fv1&ui=oAhi567w9OQEhJWEdbl0s%40Ep0Ir2VvG1xiK9JqKx
user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36
I created a headers dict from the above information and then ran:
request = requests.get(url, headers=headers)
But I am getting a 403 (Forbidden) response. What is the issue?

You need to pass a referer header whose value is the src attribute of the iframe that hosts the video, which looks like this:
<iframe src="https://mcloud.to/embed/#9#4ZS04Z10SWOE5?ui=pwxi4Kjr6%40wHmIqHcrl0yeFfpYqUUIW1wCKlJr6x" allow="autoplay; fullscreen" scrolling="no" allowfullscreen="yes" style="width: 100%; height: 100%;" frameborder="no"></iframe>
The code looks like this:
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:60.0)Gecko/20100101 Firefox/60.', 'pragma': 'no-cache', 'connection': 'keep-alive', 'cache-control': 'no-cache', 'referer': 'https://mcloud.to/embed/#9#4ZS04Z10SWOE5?ui=pwxi4Kjr6%40wHmIqHcrl0yeFfpYqUUIW1wCKlJr6x'}
requests.get('https://cdn.mcloud.to/stream/sf:i0:q2:h2:p24:l1/WjLDZuCBHmtyv63lT-RoVQ/1542603600/g/c/0/rj0m0m/hls/480/480-0000.ts', headers=headers)

Not able to upload tar.gz file using Python Request Module

Here is what my XHR data looks like when captured in Chrome.
Request Headers
POST my_url?X-Progress-ID=ee821652321919bc7ae61fbe0b625990&userpkgname=file_name.tar.gz HTTP/1.1
Host: 10.110.134.28
Connection: keep-alive
Content-Length: 17461
Accept: application/json, text/plain, */*
Origin: https://10.110.134.28
X-XSRF-TOKEN: bGIwdfFE-oaL_1yVrCzw0iHvv4yUHLC28xjw
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.116 Safari/537.36
Content-Type: multipart/form-data; boundary=----WebKitFormBoundarybfK9jSdLoc2Mpj0i
Referer: https://10.110.134.28/
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.8
Cookie: XSRF-TOKEN=bGIwdfFE-oaL_1yVrCzw0iHvv4yUHLC28xjw; sid=s%3AXglfJNLQ9zzp3eHjQ2QOpk19kFKDDMvJ.ZMKyZd1Gx13lz2MnJgty5WncnilySzfoThGktkhlk4w
Payload
------WebKitFormBoundarybfK9jSdLoc2Mpj0i
Content-Disposition: form-data; name="package"; filename="file_name.tar.gz"
Content-Type: application/x-gzip
------WebKitFormBoundarybfK9jSdLoc2Mpj0i--
And this is how I am building my request:
files = {'package': (<file_name>, open(config_path, 'rb'), 'application/x-gzip')}
request.post(url, files=files)
This is what my request headers look like:
{
'Content-Length' : '17449',
'Accept-Encoding' : 'gzip, deflate',
'Accept' : '*/*',
'User-Agent' : 'python-requests/2.10.0',
'Connection' : 'keep-alive',
'Cookie' : 'XSRF-TOKEN=zmklLEL0-gDJOfNBk113MuTpBkLo0j6MAzw0; sid=s%3AC2JZDCfpg_CgkU7qSlS5YTvWXwpgMX35.5nU7W02TPNYtMkIQ4W%2B1bjd87A7KyJbh3shoNqqADXE',
'Content-Type' : 'multipart/form-data; boundary=270d9e02bf214dc7a09c3081cba5b0e0',
'XSRF-TOKEN' : 'zmklLEL0-gDJOfNBk113MuTpBkLo0j6MAzw0'
}
When I make the request, I get a 502 Bad Gateway response, and only after a few seconds, whereas in Chrome I get 200 OK instantly.
So most probably I am not building my request correctly. Any suggestions?

How to specify the "Content-Type" and "Accept" on FormRequest?

Using FormRequest, I need to specify that the Content-Type is application/json; charset=UTF-8 and that Accept is */*.
How can I do this?
Currently, my code looks like this:
yield scrapy.FormRequest(url='...',
formdata={
...
},
cookies={...},
callback=self.parse_second)
In the browser, the request is:
POST /PaginasPublicas/_SBC.aspx/pesquisaLoteIntegracaoTPCL HTTP/1.1
Host: geosampa.prefeitura.sp.gov.br
Connection: keep-alive
Content-Length: 118
Accept: */*
Origin: http://geosampa.prefeitura.sp.gov.br
X-Requested-With: XMLHttpRequest
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36
Content-Type: application/json; charset=UTF-8
Referer: http://geosampa.prefeitura.sp.gov.br/PaginasPublicas/_SBC.aspx
Accept-Encoding: gzip, deflate
Accept-Language: pt-BR,pt;q=0.8,en-US;q=0.6,en;q=0.4,ar;q=0.2,de;q=0.2,es;q=0.2,fr;q=0.2,it;q=0.2,ja;q=0.2,pl;q=0.2,tr;q=0.2,zh-TW;q=0.2
Cookie: ASP.NET_SessionId=bvvghxvsxgwzuyaudsqn5m5q

Your request should look like this (charset is a parameter of the Content-Type value, not a separate header):
yield FormRequest(..., headers={'Content-Type': 'application/json; charset=UTF-8', 'Accept': '*/*'})

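Scrapy's Request has a headers argument that is used to define explicit headers. Note that charset belongs inside the Content-Type value rather than in a header of its own:
yield scrapy.FormRequest(url='...',
formdata={
...
},
cookies={...}, headers={'Content-Type': 'application/json; charset=UTF-8', 'Accept': '*/*'},
callback=self.parse_second)
Keep in mind that FormRequest still URL-encodes formdata. If the endpoint expects a JSON body, use scrapy.Request with method='POST' and body=json.dumps(...) instead.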
