Issue using Python requests module - python

Good morning,
Since yesterday, I'm getting timeouts doing requests to the eBay website. The code is simple:
import requests

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36"
}
html = requests.get("https://www.ebay.es", headers=headers).text
Tested with Google and it works. This is the response I receive from eBay:
'\nGateway Timeout - In read \n\nGateway Timeout\nThe proxy server did not receive a timely response from the upstream server.\nReference #1.477f1602.1645295618.7675ccad\n\n'
What happened or changed? How could I solve it?

Removing the headers should work. Perhaps they don't like that user agent (an outdated Chrome 70 string) for some reason.
import requests

# headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36"}
headers = {}  # no custom User-Agent; requests will send its default
url = "https://www.ebay.es"
response = requests.get(url, headers=headers)
html_text = response.text
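Independent of the header change, it can also help to pass an explicit timeout so a stalled request fails fast instead of hanging. A minimal defensive sketch (not part of the original answer):

import requests

url = "https://www.ebay.es"
try:
    # give up after 10 seconds instead of waiting on a stalled upstream proxy
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # raise on 4xx/5xx such as the gateway timeout above
    html_text = response.text
except requests.exceptions.RequestException as exc:
    print(f"Request failed: {exc}")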

Related

No response from `requests.get()` for NASDAQ webpage

I am able to open this URL in a browser and see the response in JSON format. However, when I use the requests module, there is no response from the method.
import requests
response = requests.get('https://api.nasdaq.com/api/calendar/earnings?date=2021-02-23')
What is wrong here?
This worked for me:
import requests

url = 'https://api.nasdaq.com/api/calendar/earnings?date=2021-02-23'
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36'}
response = requests.get(url, headers=headers)
Explanation
The site is blocking requests that identify themselves as coming from Python (the default python-requests User-Agent).
When you add the request headers shown when inspecting the call in Chrome's developer tools, the request works fine in Python:
import requests

headers = {
    "authority": "api.nasdaq.com",
    "scheme": "https",
    "path": "/api/calendar/earnings?date=2021-02-23",
    "pragma": "no-cache",
    "cache-control": "no-cache",
    "accept": "application/json, text/plain, */*",
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.182 Safari/537.36",
    "origin": "https://www.nasdaq.com",
    "sec-fetch-site": "same-site",
    "sec-fetch-mode": "cors",
    "sec-fetch-dest": "empty",
    "referer": "https://www.nasdaq.com/",
    "accept-encoding": "gzip, deflate, br",
    "accept-language": "en-US,en;q=0.9,es;q=0.8,nl;q=0.7",
}
response = requests.get("https://api.nasdaq.com/api/calendar/earnings?date=2021-02-23", headers=headers)
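As a quick sanity check (a sketch; it assumes only that the endpoint returns a JSON object), you can confirm the call succeeded and inspect the decoded body:

print(response.status_code)  # expect 200
payload = response.json()    # decode the JSON body
print(list(payload.keys()))  # inspect the top-level keys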

Python Requests error 403 even with user agent

I'm trying to parse an auction website with Python and requests.
But so far it returns error 403: forbidden access, Cloudflare protection.
Below is my code:
import requests
url = "https://www.interencheres.com"
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36'}
response = requests.get(url, headers=headers)
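A common first thing to try (a sketch, on the assumption that the 403 comes from an incomplete browser-like fingerprint; Cloudflare's bot checks may still block plain requests, in which case a real browser or browser-automation tool is usually needed):

import requests

url = "https://www.interencheres.com"
# assumption: a fuller browser-like header set than just the User-Agent
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.51 Safari/537.36",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
    "Referer": "https://www.google.com/",
}
with requests.Session() as session:  # a Session keeps any cookies the site sets
    response = session.get(url, headers=headers, timeout=10)
    print(response.status_code)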

Web scraping request stopped working, showing "Response [401]" in python?

import requests
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36'}
url = 'https://www.nseindia.com/api/chart-databyindex?index=ACCEQN'
r = requests.get(url, headers=headers)
data = r.json()
print(data)
prices = data['grapthData']  # note: 'grapthData' (sic) is how the API spells the key
print(prices)
It was working fine, but now it shows "Response [401]".
Well, it's all about the site's authentication requirements: the endpoint now demands a level of authorization that a bare requests call doesn't provide.
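One workaround often suggested for nseindia.com (a sketch, and an assumption rather than part of the original answer) is to visit the homepage first with a requests.Session so the cookies the server sets are carried over to the API call:

import requests

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36'}

with requests.Session() as s:
    # hit the homepage first so the session picks up the cookies the API expects
    s.get('https://www.nseindia.com', headers=headers, timeout=10)
    r = s.get('https://www.nseindia.com/api/chart-databyindex?index=ACCEQN',
              headers=headers, timeout=10)
    print(r.status_code)
    if r.ok:
        data = r.json()
        print(data['grapthData'])  # key name as the API returns it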

response 503 python requests

Hi, I'm trying to make a request to https://www.playerup.com/ with the code below:
import requests
header = {"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36"}
r = requests.get("https://www.playerup.com/",headers=header)
print(r.status_code)
but it gives me a 503 error.
I tried a timeout of 5 seconds and that also didn't work.
How should I fix it?
Sometimes using a referer and the Google cache can help you avoid these failures. So your code would be:
import requests

# here we added a referer
header = {"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36", "referer": "https://www.google.com/"}
# now use the google cache
r = requests.get("http://webcache.googleusercontent.com/search?q=cache:www.playerup.com/", headers=header)
Now see the status code:
>>> r
<Response [200]>

How can I handle "The page has expired due to inactivity."

Currently, I am trying to make a user generator for a website, and there is a fundamental problem I've been facing. The code below runs, but what it prints out is
The page has expired due to inactivity. Please refresh and try again
I have seen some solutions, including using the XSRF token, but either I am doing something wrong or the problem is not related to the token.
import bs4
import requests

with requests.Session() as s:
    s.get('http://www.watchill.org/register')
    token = s.cookies["XSRF-TOKEN"]
    agent = {"User-Agent": "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36 OPR/62.0.3331.116",
             "XSRF-TOKEN": token}
    r = s.post('http://www.watchill.org/register', headers=agent)
    print(bs4.BeautifulSoup(r.content, "html.parser"))
The problem is with your CSRF token, which is invalid.
I didn't check whether this code does what you're aiming for, but it does not return the page-expired message:
import requests
from bs4 import BeautifulSoup

def getXsrf(cookies):
    for cookie in cookies:  # iterate over the cookie jar passed in
        if cookie.name == 'XSRF-TOKEN':
            return cookie.value

with requests.Session() as s:
    s.get('http://www.watchill.org/register')
    xsrf = getXsrf(s.cookies)
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36 OPR/62.0.3331.116"}
    headers['X-XSRF-TOKEN'] = xsrf
    r = s.post('http://www.watchill.org/register', headers=headers)
    print(BeautifulSoup(r.content, "html.parser"))
