Python requests to online store not checking out - python

I am new here, so bear with me if I break the etiquette for this forum. I've been working on a Python project for a while and I'm nearing the end, but I've been stuck on the same problem for a couple of days and I can't figure out what the issue is.
I'm using Python and the requests module to send a POST request to the checkout page of an online store. The response I get back is the page where you enter your information, not the page that says your order was confirmed.
At first I thought it could be the form data I was sending, and I was partly right. I checked what it was supposed to be in the Network tab in Chrome and saw that I was sending 'Visa' when it was supposed to be 'visa', but it still didn't work after I fixed that. Then I thought it could be the encoding, but I have no clue how to check which encoding the site expects.
Do any of you have any ideas of what could be preventing this from working? Thanks.
EDIT: I realized that I wasn't sending a Cookie in the request headers, so I fixed that and it's still not working. I set up a server script that prints the request on another computer and posted to that instead and the requests are exactly the same, both headers and body. I have no clue what it could possibly be.
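One thing that may be worth ruling out: checkout pages often embed hidden form fields (a CSRF or session token, for example) that must be sent back with the POST. Below is a minimal sketch, using only the standard library, that collects the hidden inputs from a form and merges them with the values you fill in yourself. The HTML and the field names (authenticity_token, card_type) are made up for illustration and are not taken from the store in question.

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_payload(form_html, visible_fields):
    """Merge the page's hidden fields with the values you supply yourself."""
    collector = HiddenInputCollector()
    collector.feed(form_html)
    payload = dict(collector.fields)
    payload.update(visible_fields)
    return payload


# Hypothetical checkout form; a real page would come from a GET request.
sample_html = """
<form action="/checkout" method="post">
  <input type="hidden" name="authenticity_token" value="abc123">
  <input type="text" name="card_type">
</form>
"""

payload = build_payload(sample_html, {"card_type": "visa"})
print(payload)
```

With requests, the page would typically be fetched and the payload posted through the same requests.Session, so that cookies set by the GET are sent along with the POST.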

Related

Python login with requests library

After a whole day of searching, I unfortunately have not succeeded.
I would like to create a script that logs in and fills in a form with Python (I'm using the requests library).
To summarize my problem: I would just like to make a POST request to this form, from Postman for example, but it always gives me a 500 error, even when I copy all the request headers and the whole request payload. I don't understand why.
At this point my dream is to get a 400 error :D
(This is only a summary, but I have been trying to get this working all day.)
The url : https://wallet.esw.esante.gouv.fr/auth/?response_type=code&scope=openid%20identity&client_id=portail-ecps&state=0t0YIzB4ar33dpuU5IK1yN8v6t8&redirect_uri=https:%2F%2Fwallet.esw.esante.gouv.fr%2Fredirect_uri&nonce=0KbgFrUIWvR-mrU63ZX6aViGEWz1tSnlGEP0XE4ZiFQ&acr_values=eidas2
Disable JavaScript in your browser and view the page source: you will get nothing. The requests module does not execute JavaScript, so it does not work well with JavaScript-driven websites. Take a look at Selenium.

Logging into instagram using Python requests

I'm trying to log into Instagram using Python Requests. I figured it would be as simple as creating a requests.Session object and then sending a post request i.e.
session.post(login_url, data={'username':****, 'password':****})
This didn't work. I didn't know why, so I tried manually entering the browser's headers (I used Chrome dev tools to see the headers of the POST request) and passing them along with the request (headers={...}), even though I figured the session would deal with that. I also tried sending a GET request to the login URL first in order to get a cookie (and a CSRF token, I think), then repeating the steps above. None of this worked.
I don't have much experience with this type of thing and I just don't understand what differentiates my POST requests from Google Chrome's (I must be doing something wrong). Thanks
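A common difference between a script and the browser is that the page's JavaScript copies the CSRF cookie into a request header before posting. A rough sketch of that flow follows. The function name, the cookie name csrftoken and the header name X-CSRFToken are assumptions (they are Django's defaults), so check the browser's dev tools for the names the site actually uses. The session argument is assumed to behave like requests.Session.

```python
def login_with_csrf(session, login_page_url, login_post_url, credentials,
                    cookie_name="csrftoken", header_name="X-CSRFToken"):
    """GET the login page to receive cookies, then POST with the token in a header.

    `session` is assumed to behave like requests.Session: it needs get(),
    post() and a `cookies` mapping. The cookie and header names are guesses.
    """
    # The first request lets the server set its cookies on the session.
    session.get(login_page_url)

    # Echo the CSRF cookie back as a header, as browser-side scripts often do.
    token = session.cookies.get(cookie_name)
    headers = {"Referer": login_page_url}
    if token:
        headers[header_name] = token

    return session.post(login_post_url, data=credentials, headers=headers)
```

Even with the right token, a site may still refuse the login if it expects the password to be transformed by JavaScript first or applies other checks, so treat this only as a starting point for comparing requests.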

Python Webscraper Breaking, Not sure Why

I am trying to access a 3rd party ticketing site via API through a web scraper.
I know this is vague but I am new to python and I am not exactly sure how to figure out my error below:
My code breaks on this line:
roken_response =r.json
I get this error
Can anyone tell why exactly my code is breaking?
Using the requests library (which you seem to be using), .json is a convenience method that decodes the response as JSON. If your response was not JSON, then you will get a JSONDecodeError, as you show in your screenshot.
So the webserver probably answered your request with some HTML or something instead of JSON.
Also it sounds like you are violating the ToS of that poor ticketing site :(
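To see what the server actually sent, it can help to catch the decode error and look at the status code, content type and the start of the body. A small sketch, with a hypothetical helper name; with requests you would pass in r.status_code, r.headers.get("Content-Type") and r.text.

```python
import json


def parse_json_or_explain(status_code, content_type, text):
    """Return (data, None) if the body is JSON, otherwise (None, reason)."""
    try:
        return json.loads(text), None
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError.
        preview = text[:80].replace("\n", " ")
        reason = "HTTP {} with Content-Type {!r}; body starts with: {!r}".format(
            status_code, content_type, preview)
        return None, reason


# Example with an HTML body, as a server might return for a login page.
data, reason = parse_json_or_explain(200, "text/html", "<html>Login required</html>")
print(data, reason)
```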

Big requests issue: GET doesnt release/reset TCP connections, loop crashes

I'm using Python 3.3 and the requests module to scrape links from an arbitrary webpage. My program works as follows: I have a list of URLs which in the beginning has just the starting URL in it.
The program loops over that list and passes each URL to a procedure GetLinks, where I use requests.get and BeautifulSoup to extract all links. Before that procedure appends links to my URL list, it passes them to another procedure, testLinks, to see whether each one is an internal, external or broken link. In testLinks I also use requests.get, to be able to handle redirects etc.
The program has worked really well so far. I tested it with quite a few websites and was able to get all the links of sites with around 2000 pages. But yesterday, while watching the Kaspersky Network Monitor, I ran into a problem on one page. On this page some TCP connections just don't reset. It seems to me that in this case the connection for the initial request to my first URL doesn't get reset; its connection time is as long as my program runs.
OK so far. My first try was to use requests.head instead of .get in my testLinks procedure, and then everything works fine! The connections are released as wanted. The problem is that the information I get from requests.head is not sufficient: I'm not able to see the redirected URL or how many redirects took place.
Then I tried requests.head with
allow_redirects=True
But unfortunately this is not a real .head request; it behaves like a usual .get request, so I got the same problem. I also tried setting the parameter
keep_alive=False
but it didn't work either. I even tried urllib.request.urlopen(url).geturl() in my testLinks to deal with the redirects, but the same problem occurs there: the TCP connections don't get reset.
I tried so much to avoid this problem. I used requests sessions, but they had the same problem. I also tried a requests.post with the header Connection: close, but it didn't work.
I analyzed some links where I think it gets stuck, and so far I believe it has something to do with redirects like 301->302. But I'm really not sure, because all the other websites I tested must have had such redirects too; they are quite common.
I hope someone can help me. For information, I'm using a VPN connection to be able to see all websites, because the country I'm in right now blocks some pages that are interesting to me. But of course I tested it without the VPN and had the same problem.
Maybe there's a workaround, because requests.head in testLinks would be sufficient if, in case of redirects, I were able to see the final URL and maybe the number of redirects.
If the text is not easy to follow, I will provide a schematic of my code.
Thanks a lot!
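One possible workaround is to follow the redirects yourself, one HEAD request at a time, reading the Location header at each hop. The sketch below is only that: the function name is made up, and whether it avoids the lingering connections on the problematic site is untested. The head argument is assumed to be callable like requests.head and to return an object with status_code and headers.

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


def follow_redirects_with_head(head, url, max_hops=10):
    """Follow redirects manually with HEAD requests.

    `head` is assumed to work like requests.head(url, allow_redirects=False).
    Returns (final_url, number_of_redirects).
    """
    for hops in range(max_hops + 1):
        response = head(url, allow_redirects=False)
        location = response.headers.get("Location")
        if response.status_code not in REDIRECT_CODES or not location:
            return url, hops
        # The Location header may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    raise RuntimeError("more than {} redirects".format(max_hops))
```

In testLinks this could be called as follow_redirects_with_head(requests.head, link), which gives both the final URL and the redirect count without ever issuing a GET.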

Direct link to comments that are being loaded asynchronously?

I am playing around with change.org and trying to download a couple of comments on a petition. For this, I would like to know where the comments are being pulled from when the user clicks on "load more reasons". For an example, look here:
http://www.change.org/petitions/tell-usda-to-stop-using-pink-slime-in-school-food
Looking at the XHR requests in Chrome, I see requests being sent to http://www.change.org/petitions/tell-usda-to-stop-using-pink-slime-in-school-food/opinions?page=2&role=comments. Of course, the page number increases each time more comments are loaded.
However, this link leads to a blank page when I try it in a browser. Is this because of some missing data in the url or is this a result of some authentication step within the javascript that makes the request in the first place?
Any pointers will be appreciated. Thanks!
EDIT: Thanks to the first response, I see that the data is being received when I use the console. How do I receive the same data when making the request from a Python script? Do I have to mimic the browser, or is there a way to just use urllib?
They must be validating the source of the request. If you go to the site, open the console and run this:
$.get('http://www.change.org/petitions/tell-usda-to-stop-using-pink-slime-in-school-food/opinions?page=2&role=comments',{},function(data){console.log(data);});
You will see the data come back.
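From Python, one thing to try with plain urllib is sending the headers that the in-page call sends. jQuery adds X-Requested-With: XMLHttpRequest to same-origin Ajax requests, and the browser adds a Referer. Whether the server checks either of these is an assumption, so this is only a sketch; the function name is made up, and the request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

PETITION_URL = ("http://www.change.org/petitions/"
                "tell-usda-to-stop-using-pink-slime-in-school-food")


def build_comments_request(page):
    """Build (but do not send) a request that resembles the page's own XHR call."""
    query = urlencode({"page": page, "role": "comments"})
    return Request(
        PETITION_URL + "/opinions?" + query,
        headers={
            # jQuery adds this header to same-origin Ajax requests.
            "X-Requested-With": "XMLHttpRequest",
            # Assumption: the server may also look at the referring page.
            "Referer": PETITION_URL,
        },
    )


request = build_comments_request(2)
print(request.full_url)
```

To actually fetch the data, pass the request to urllib.request.urlopen and read the response body. If that still returns a blank page, the server is probably checking something else, such as cookies.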
