I am trying to access a 3rd party ticketing site via API through a web scraper.
I know this is vague, but I am new to Python and not exactly sure how to interpret the error below:
My code breaks on this line:
roken_response = r.json()
I get a JSONDecodeError (screenshot attached).
Can anyone tell me why exactly my code is breaking?
In the requests library (which you seem to be using), .json() is a convenience method that decodes the response body as JSON. If the response is not valid JSON, you get a JSONDecodeError, as your screenshot shows.
So the web server probably answered your request with HTML (e.g. an error page) instead of JSON.
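One way to make this failure mode explicit is to check the Content-Type before decoding. A stdlib-only sketch: with requests you would pass it `r.headers.get("Content-Type")` and `r.text`.

```python
import json

def decode_json(content_type, body):
    """Decode an HTTP response body as JSON, but fail with a readable
    message when the server sent something else (e.g. an HTML error page)."""
    if "application/json" not in (content_type or ""):
        raise ValueError(f"Expected JSON, got {content_type!r}: {body[:200]}")
    return json.loads(body)

# With a requests response you would call it as:
#   payload = decode_json(r.headers.get("Content-Type"), r.text)
```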
Also it sounds like you are violating the ToS of that poor ticketing site :(
After a whole day of searching, I unfortunately did not succeed.
I would like to write a script that logs in and fills out a form with Python (I'm using the requests library).
To summarize the problem: I just want to reproduce the POST request for this form (from Postman, for example), but it always gives me a 500 error, even after copying all of the request headers and the entire request payload... I don't understand why.
At this point my dream is to get a 400 error :D
(This is a summary, but I have spent the whole day trying to achieve it.)
The url : https://wallet.esw.esante.gouv.fr/auth/?response_type=code&scope=openid%20identity&client_id=portail-ecps&state=0t0YIzB4ar33dpuU5IK1yN8v6t8&redirect_uri=https:%2F%2Fwallet.esw.esante.gouv.fr%2Fredirect_uri&nonce=0KbgFrUIWvR-mrU63ZX6aViGEWz1tSnlGEP0XE4ZiFQ&acr_values=eidas2
Disable JavaScript in your browser and try to view the page source: you will see you get almost nothing. The requests module is not good with JavaScript-heavy websites. Take a look at Selenium.
There is a site, https://www.flashscore.ru/. I need to parse the 'Odds' category from it, but I do not know how: the site loads its data via AJAX. When I look at the AJAX traffic I can't find the request I need; all the requests appear to be encoded. Can someone tell me how to decode these requests, or where to read about it? I should say right away that I can't open a browser, because the script will run on a server (so no Selenium-type libraries). If there is a ready-made solution, I will be very happy.
I am new here so bear with me if I break the etiquette for this forum. Anyway, I've been working on a python project for a while now and I'm nearing the end but I've been dealing with the same problem for a couple of days now and I can't figure out what the issue is.
I'm using Python and the requests module to send a POST request to the checkout page of an online store. The response I get back is the page where you put in your information, not the page that says your order was confirmed.
At first I thought it could be the form data I was sending, and I was partly right: in the network tab in Chrome I saw I was sending 'Visa' when it was supposed to be 'visa', but it still didn't work after fixing that. Then I thought it could be the encoding, but I have no clue how to check what kind the site expects.
Do any of you have any ideas of what could be preventing this from working? Thanks.
EDIT: I realized that I wasn't sending a Cookie in the request headers, so I fixed that and it's still not working. I set up a server script that prints the request on another computer and posted to that instead and the requests are exactly the same, both headers and body. I have no clue what it could possibly be.
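The "server script that prints the request" trick described in that edit can be done with the standard library alone. A sketch (the port is arbitrary): run this on the second machine, point your POST at it, and compare what it prints with what the browser sends.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class EchoHandler(BaseHTTPRequestHandler):
    """Print every incoming POST (request line, headers, body) for comparison."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        print(self.requestline)   # e.g. POST /checkout HTTP/1.1
        print(self.headers)       # all headers the client actually sent
        print(body.decode("utf-8", errors="replace"))
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

def run(port=8000):
    HTTPServer(("0.0.0.0", port), EchoHandler).serve_forever()
```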
I need to input text into the text box on this website:
http://www.link.cs.cmu.edu/link/submit-sentence-4.html
I then need the HTML of the resulting page.
I have looked at other solutions, but I am aware that there is no one-size-fits-all answer.
I have seen Selenium, but I do not understand its documentation or how to apply it here.
Please help me out, thanks.
BTW, I have some experience with BeautifulSoup, if that helps.
Check out the requests module. It is super easy to use for any kind of HTTP request and gives you complete control over any extra headers or form payload data you would need to POST to the website.
P.S. If all else fails, perform the request you want in a web browser and copy it as a cURL command using the network inspector. Then you can run that command from a Python script (installing curl on your system if you don't have it) with the parameters from the copied request.
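One snag with POSTing to a form like the CMU page above is that your payload keys must match the names declared in the form's HTML. As a stdlib-only sketch, this helper pulls the action URL and the input/textarea names out of the first form on a page, so you can fill in the text field and hand the dict to requests.post():

```python
from html.parser import HTMLParser

class FormParser(HTMLParser):
    """Collect the action URL and the named fields of the first <form>."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}
        self._in_form = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and self.action is None:
            self.action = attrs.get("action", "")
            self._in_form = True
        elif self._in_form and tag in ("input", "textarea"):
            name = attrs.get("name")
            if name:
                self.fields[name] = attrs.get("value", "")

    def handle_endtag(self, tag):
        if tag == "form":
            self._in_form = False

def parse_form(html):
    parser = FormParser()
    parser.feed(html)
    return parser.action, parser.fields
```

You would fetch the form page with requests first, run parse_form on r.text, set the text field, and POST the dict to urljoin(page_url, action).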
I've had a look at many tutorials regarding cookiejar, but my problem is that the webpage I want to scrape creates the cookie using JavaScript, and I can't seem to retrieve the cookie. Does anybody have a solution to this problem?
If all pages have the same JavaScript, then maybe you could parse the HTML to find that piece of code and, from it, work out the value the cookie would be set to?
That would make your scraping quite vulnerable to changes in the third party website, but that's most often the case while scraping. (Please bear in mind that the third-party website owner may not like that you're getting the content this way.)
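As an illustration of the parse-the-script idea (the cookie name and pattern here are assumptions — the real page's JavaScript may build the value differently):

```python
import re

def extract_js_cookie(html, name):
    """Find a value set via `document.cookie = "name=value"` in page source.
    Returns None when no such assignment is present."""
    pattern = r'document\.cookie\s*=\s*["\']' + re.escape(name) + r'=([^"\';]+)'
    match = re.search(pattern, html)
    return match.group(1) if match else None
```

You could then send the value back on subsequent requests, e.g. requests.get(url, cookies={"session": value}).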
I responded to your other question as well: take a look at mechanize. It's probably the most fully featured scraping module I know of; if the cookie is sent, I'm sure you can get at it with this module.
Maybe you can execute the JavaScript code in a JavaScript engine with Python bindings (like python-spidermonkey or PyV8) and then retrieve the cookie. Or, since the JavaScript is executed client-side anyway, you may be able to port the cookie-generating code to Python.
You could access the page using a real browser, via PAMIE, win32com, or similar; then the JavaScript will run in its native environment.