So I'm trying to consume an API; I have this URL: http://www.ventamovil.com.mx:9092/service.asmx?op=Check_Balance
There you can enter {"User":"6144135400","Password":"Prueba$$"} in the input field and you get a response.
https://i.stack.imgur.com/RTEii.png
Response
But when I try to consume this API in Python I just can't; I don't know exactly how to consume it correctly:
My Code
As you can see, I get a different response with my code; I should be getting the same response as in the "Response" image.
To save yourself some time, you can use their request to build the Python code automatically. All you have to do is:
Just as you did at first, enter the JSON in the input field and invoke.
Open the network tab and copy the POST request the page made as curl:
curl 'http://www.ventamovil.com.mx:9092/service.asmx/Check_Balance' -H 'Connection: keep-alive' -H 'Cache-Control: max-age=0' -H 'Upgrade-Insecure-Requests: 1' -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36' -H 'Origin: http://www.ventamovil.com.mx:9092' -H 'Content-Type: application/x-www-form-urlencoded' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' -H 'Referer: http://www.ventamovil.com.mx:9092/service.asmx?op=Check_Balance' -H 'Accept-Language: en-US,en;q=0.9,ar;q=0.8,pt;q=0.7' --data 'jrquest=%7B%22User%22%3A6144135400%2C+%22Password%22%3A+%22Prueba%24%24%22%7D' --compressed --insecure
Go to Postman and import the curl command, then click Code and select Python, and there you go: you have all the right headers needed.
import requests

url = "http://www.ventamovil.com.mx:9092/service.asmx/Check_Balance"

# The body is a single form field, jrquest, whose value is the URL-encoded JSON credentials.
payload = 'jrquest=%7B%22User%22%3A6144135400%2C+%22Password%22%3A+%22Prueba%24%24%22%7D'
headers = {
    'Upgrade-Insecure-Requests': '1',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
}

response = requests.post(url, headers=headers, data=payload)
print(response.text.encode('utf8'))
As you can see, they accept their input as a form-encoded payload.
You need to parameterize this request with the user/password you want each time you call it.
By the way, the output of this Python code is:
b'<?xml version="1.0" encoding="utf-8"?>\r\n<string xmlns="http://www.ventamovil.com.mx/ws/">{"Confirmation":"00","Saldo_Inicial":"10000","Compras":"9360","Ventas":"8416","Comision":"469","Balance":"10345.92"}</string>'
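To parameterize it, here is a minimal sketch (my own rewrite, not Postman output; note the curl above sends User as a bare number while this sends it as a quoted string, which I'm assuming the service accepts either way):

import json
import xml.etree.ElementTree as ET

import requests

def check_balance(user, password):
    url = "http://www.ventamovil.com.mx:9092/service.asmx/Check_Balance"
    # Passing a dict via data= makes requests form-encode it, so there is
    # no need to hand-build the jrquest=%7B...%7D string.
    form = {"jrquest": json.dumps({"User": user, "Password": password})}
    resp = requests.post(url, data=form)
    # The service wraps its JSON answer in an XML <string> element; parse
    # the XML from bytes (ElementTree rejects str input that carries an
    # encoding declaration) and decode the inner JSON.
    return json.loads(ET.fromstring(resp.content).text)

print(check_balance("6144135400", "Prueba$$"))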
Related
I can't parse the transcript of a video from https://www.ted.com/talks/alexis_nikole_nelson_a_flavorful_field_guide_to_foraging/transcript
The response requests gets back doesn't contain the span class where the text actually is. What could be the problem?
import requests
url = 'https://www.ted.com/talks/alexis_nikole_nelson_a_flavorful_field_guide_to_foraging/transcript'
page = requests.get(url)
print(page.content)
Is there any way to reach the transcript? Thank you.
I need to reach this
no attribute found
That's because the data is not loaded via the link you're using, but via a call to their GraphQL instance.
Using curl, you can fetch the data like so:
curl 'https://www.ted.com/graphql?operationName=Transcript&variables=%7B%22id%22%3A%22alexis_nikole_nelson_a_flavorful_field_guide_to_foraging%22%2C%22language%22%3A%22en%22%7D&extensions=%7B%22persistedQuery%22%3A%7B%22version%22%3A1%2C%22sha256Hash%22%3A%2218f8e983b84c734317ae9388c83a13bc98702921b141c2124b3ce4aeb6c48ef6%22%7D%7D' -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:99.0) Gecko/20100101 Firefox/99.0' -H 'Accept: */*' -H 'Accept-Language: en-US,en;q=0.5' -H 'Accept-Encoding: gzip, deflate, br' -H 'Referer: https://www.ted.com/talks/alexis_nikole_nelson_a_flavorful_field_guide_to_foraging/transcript' -H 'content-type: application/json' -H 'client-id: Zenith production' -H 'x-operation-name: Transcript' --output - | gzip -d
Note that the URL is URL-encoded. You can use quote() from urllib.parse to URL-encode a string in Python.
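For example, to reproduce the variables parameter from the URL above:

from urllib.parse import quote

variables = '{"id":"alexis_nikole_nelson_a_flavorful_field_guide_to_foraging","language":"en"}'
print(quote(variables, safe=''))
# %7B%22id%22%3A%22alexis_nikole_nelson_a_flavorful_field_guide_to_foraging%22%2C%22language%22%3A%22en%22%7D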
So simply translate the above curl command to Python: there's no magic, just set the correct headers.
If you're lazy, you can also use an online curl-to-Python converter.
This produces:
import requests
from requests.structures import CaseInsensitiveDict
url = "https://www.ted.com/graphql?operationName=Transcript&variables=%7B%22id%22%3A%22alexis_nikole_nelson_a_flavorful_field_guide_to_foraging%22%2C%22language%22%3A%22en%22%7D&extensions=%7B%22persistedQuery%22%3A%7B%22version%22%3A1%2C%22sha256Hash%22%3A%2218f8e983b84c734317ae9388c83a13bc98702921b141c2124b3ce4aeb6c48ef6%22%7D%7D"
headers = CaseInsensitiveDict()
headers["User-Agent"] = "Mozilla/5.0 (X11; Linux x86_64; rv:99.0) Gecko/20100101 Firefox/99.0"
headers["Accept"] = "*/*"
headers["Accept-Language"] = "en-US,en;q=0.5"
headers["Accept-Encoding"] = "gzip, deflate, br"
headers["Referer"] = "https://www.ted.com/talks/alexis_nikole_nelson_a_flavorful_field_guide_to_foraging/transcript"
headers["content-type"] = "application/json"
headers["client-id"] = "Zenith production"
headers["x-operation-name"] = "Transcript"
resp = requests.get(url, headers=headers)
print(resp.content)
Output:
b'{"data":{"translation":{"id":"209255","language" ...
There is a website I need to scrape. It has a long list of available job positions, which are folded by default:
They unfold when a user clicks on them:
When a user unfolds one, the page sends a POST request to the website with the position id.
I tried to imitate this request (see the code below); it doesn't fail (status == 200) but it doesn't return anything. I suspect that is because of CORS. Is there any way to still collect the data?
import requests

url = "https://econjobmarket.org/positions/recordClick"
payload = 'posid=7026'
headers = {
    'Accept': '*/*',
    'X-CSRF-TOKEN': HERE_GOES_THE_TOKEN,
    'X-Requested-With': 'XMLHttpRequest',
    'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
    'Cookie': HERE_GOES_THE_COOKIE
}

response = requests.post(url, headers=headers, data=payload)
print(response.text.encode('utf8'))
I don't see any additional requests sent to fetch the expanded data. All the data (in both the folded and expanded states) is already in the page source:
response = requests.get('https://econjobmarket.org/positions').text
print("Post-Doc, Computational Marketing" in response)
True
The recordClick URL you are seeing simply records the click for web analytics. As Parolla said, what you are looking for is already in the page source. Your best bet is to do an HTTP GET on the website and parse the HTML with BeautifulSoup.
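A minimal sketch of that approach (the choice of <a> tags is an assumption; inspect the real markup and adjust the selector):

import requests
from bs4 import BeautifulSoup

html = requests.get("https://econjobmarket.org/positions").text
soup = BeautifulSoup(html, "html.parser")

# Hypothetical selector: position titles are assumed to sit in links.
for link in soup.find_all("a"):
    title = link.get_text(strip=True)
    if "Post-Doc" in title:
        print(title, link.get("href"))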
You can reduce the site's ability to track you, and potentially to block your scraping, by dropping the token and cookies from the request headers.
A quick test in curl shows the response is still complete without them:
curl -i -s -k -X $'GET' \
-H $'Host: econjobmarket.org' -H $'Connection: close' -H $'Cache-Control: max-age=0' -H $'DNT: 1' -H $'Upgrade-Insecure-Requests: 1' -H $'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.101 Safari/537.36' -H $'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' -H $'Sec-GPC: 1' -H $'Sec-Fetch-Site: cross-site' -H $'Sec-Fetch-Mode: navigate' -H $'Sec-Fetch-User: ?1' -H $'Sec-Fetch-Dest: document' -H $'Accept-Encoding: gzip, deflate' -H $'Accept-Language: en-GB,en-US;q=0.9,en;q=0.8' \
$'https://econjobmarket.org/positions'
J and Parolla are correct that the POST is just recording your actions on the website.
I have a CSV file that I want to send via curl to a Django REST Framework APIView I have developed. The CSV file itself contains only foo,bar, and this is the curl command I'm using:
curl URL -H 'Accept: application/json' \
  -H 'Referer: http://localhost:5011/' \
  -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36' \
  -H 'Authorization: JWT TOKEN' \
  -H 'Content-Type: multipart/form-data' \
  -F upload=@testcsv.csv
When it hits my APIView in Django and I run
request.data.get('file').read()
the output is some metadata around the file and not the file contents itself:
(Pdb) b'--------------------------001ec735bfc1a929\r\nContent-Disposition: form-data; name="upload"; filename="testcsv.csv"\r\nContent-Type: application/octet-stream\r\n\r\n\r\n--------------------------001ec735bfc1a929--\r\n'
How can I access the actual file itself through this method? My APIView is using:
class FileUploadView(APIView):
parser_classes = (MultiPartParser,)
Thanks!
The uploaded file is available in request.FILES, keyed by the form field name:
f = request.FILES["filefield_name"]
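In this question the field name is upload, from -F upload=@testcsv.csv. A minimal sketch of the view (the post handler and the response shape are my assumptions, not the asker's code):

from rest_framework.views import APIView
from rest_framework.parsers import MultiPartParser
from rest_framework.response import Response

class FileUploadView(APIView):
    parser_classes = (MultiPartParser,)

    def post(self, request, format=None):
        # Keyed by the multipart field name from curl: -F upload=@testcsv.csv
        f = request.FILES["upload"]
        contents = f.read()  # b'foo,bar' -- the raw CSV bytes
        return Response({"size": len(contents)})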
I am using flask-restful as an API server and am constructing the first PUT method. Using request imported from flask, I am able to access request.form data without issue with the following cURL command:
curl http://127.0.0.1:5000/api/v1/system/account -X PUT -d username=asdas -d email=asdasd@test.com
My PUT method logs out both the username and email without issue:
def put(self):
print 'SystemAccountPut'
print request.form['username']
print request.form['email']
return
Output:
SystemAccountPut
asdas
asdasd@test.com
I have an app that uses axios to make API calls. When axios attempts to PUT form data, request.form no longer works. Here is the axios call, converted to a cURL command via the Chrome Dev Console:
curl 'http://127.0.0.1:5000/api/v1/system/account' -X PUT -H 'Pragma: no-cache' -H 'Origin: http://127.0.0.1:5000' -H 'Accept-Encoding: gzip, deflate, sdch' -H 'Accept-Language: en-US,en;q=0.8' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36' -H 'Content-Type: application/json;charset=UTF-8' -H 'Accept: application/json, text/plain, */*' -H 'Cache-Control: no-cache' -H 'Referer: http://127.0.0.1:5000/settings' -H 'Connection: keep-alive' --data-binary '{"username":"asdas","email":"asdasd@test.com"}' --compressed
With the same method above, request.form['username'] and request.form['email'] are empty. request.data, however, has the form data in it, and request.get_json() will also return the form data as JSON.
My question is: what should I be using to retrieve the form data in this case? The first cURL command is clean, with request.form holding the data I need but request.data empty. The second cURL command leaves request.form empty but does populate request.data. Is there a best practice for retrieving the form data in both cURL cases?
I figured out the issue after learning more about incoming forms, with some insight from davidism. The first cURL example has the Content-Type application/x-www-form-urlencoded; the second has application/json;charset=UTF-8. Unsurprisingly, the first cURL command's body lands in request.form, while the second is interpreted as raw data and can be retrieved via request.data or request.get_json(). For my needs I want the form data either way, so in my put method I have the following:
data = request.get_json() or request.form
print data['email']
print data['username']
This gives me the email and username in both cURL examples.
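As a sketch, the whole handler (silent=True is my addition: it makes get_json() return None instead of raising when the body isn't JSON, so the request.form fallback still fires):

from flask import request

def put(self):
    # JSON body -> get_json(); form-encoded body -> request.form
    data = request.get_json(silent=True) or request.form
    print(data['email'])
    print(data['username'])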
I have a Python script using the Requests library that is of this form:
uhash = '1234567abcdefg'
cookies = {
'uhash':uhash
}
payload = {
'action':'trade.bump',
'hash':uhash,
'tradeid':'12345678'
}
r = requests.post(
'http://www.target_url.com/api/core',
cookies=cookies,
params=payload
)
Above is my Python attempt at recreating the following cURL request (written in bash):
HASH="1234567abcdefg"
TRADEID="12345678"
curl 'http://www.target_url.com/api/core' -H "Cookie: uhash=$HASH" --data "action=trade.bump&hash=$HASH&tradeid=$TRADEID"
In summary, both scripts contain:
The cookie - uhash
Three data parameters called action, hash, and tradeid
My current issue is that the bash script works; the server response when I use it is this:
{"meta":{"code":200},"data":{"bumped":true,"count":15}}
However, if I use the Python script, with the SAME cookie and parameter values as the bash script, I get:
{"meta":{"code":301},"data":{"message":"You can't bump a trade that doesn't exist ;_;"}}
The above error tells me the trade doesn't exist, even though that tradeid exists and is exactly the same as in my bash script's parameters.
To debug, I used Firefox's convenient copy-as-cURL tool, which is how I made the bash script in the first place. But once I translated it to the Python script, I got the aforementioned error. Maybe I am using the Requests library incorrectly and am missing something.
Attached is the full cURL request taken from Firefox (don't worry, the parameters are sanitized; they're not the real values):
curl 'http://www.tf2outpost.com/api/core' -H 'Host: www.tf2outpost.com' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:35.0) Gecko/20100101 Firefox/35.0' -H 'Accept: application/json, text/javascript, */*; q=0.01' -H 'Accept-Language: en-US,en;q=0.5' --compressed -H 'Content-Type: application/x-www-form-urlencoded; charset=UTF-8' -H 'X-Requested-With: XMLHttpRequest' -H 'Referer: http://www.tf2outpost.com/trades' -H 'Cookie: __qca=P0-6517545-1420724809746; __utma=5135382.11011755.14224810.14331180.14180489.7; __utmz=51353782.1420724810.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); uhash=abcdefg12345678; mb_uid2=3211475230616776; CTag61=14338638870; __utmb=513532.9.10.14180489; __utmc=513782; __utmt=1; __utmt_b=1; __utmt_c=1; OX_plg=sl|qt|pm; HIRO_COOKIE=data=&newSession=true&id=2237524293×tamp=1433506185; HIRO_CLIENT_ID=67751187' -H 'Connection: keep-alive' -H 'Pragma: no-cache' -H 'Cache-Control: no-cache' --data 'action=trade.bump&hash=abcdefg12345678&tradeid=12345678'
Not quite sure why that is happening.
Try using the data (or json) keyword instead of params: params appends the fields to the URL as a query string, while curl's --data sends them form-encoded in the request body. Passing the dict to data lets Requests form-encode it for you; use json.dumps(payload) with data only if the server expects a raw JSON body.
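A sketch of the fixed script (same values as above; the only change is data= instead of params=):

import requests

uhash = '1234567abcdefg'

cookies = {'uhash': uhash}
payload = {
    'action': 'trade.bump',
    'hash': uhash,
    'tradeid': '12345678',
}

# data= form-encodes the dict into the request body, matching
# curl's --data 'action=trade.bump&hash=...&tradeid=...';
# params= would have put the fields in the URL query string instead.
r = requests.post(
    'http://www.target_url.com/api/core',
    cookies=cookies,
    data=payload,
)
print(r.text)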