I have a simple Get request I'd like to make using Python's Request library.
import requests
HEADERS = {'user-agent': ('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5)'
'AppleWebKit/537.36 (KHTML, like Gecko)'
'Chrome/45.0.2454.101 Safari/537.36'),
'referer': 'http://stats.nba.com/scores/'}
url = 'http://stats.nba.com/stats/playbyplayv2?EndPeriod=10&EndRange=55800&GameID=0021500281&RangeType=2&Season=2016-17&SeasonType=Regular+Season&StartPeriod=1&StartRange=0'
response = requests.get(url, timeout=5, headers=HEADERS)
However, when I make the requests.get call, I get the error requests.exceptions.ReadTimeout: HTTPConnectionPool(host='stats.nba.com', port=80): Read timed out. (read timeout=5). But I am able to copy/paste that url into my browser and view the resulting JSON. Why is requests not able to get the result?
Your HEADERS format is wrong. I tried with this code and it worked without any issues:
import requests
HEADERS = {
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36',
}
url = 'http://stats.nba.com/stats/playbyplayv2?EndPeriod=10&EndRange=55800&GameID=0021500281&RangeType=2&Season=2016-17&SeasonType=Regular+Season&StartPeriod=1&StartRange=0'
response = requests.get(url, timeout=5, headers=HEADERS)
print(response.text)
Related
I'm using python and trying to read the metadata from a token on solscan.
I am looking for the name, image, etc from metadata.
I am currently using JSON request which seems to work (ie not fail), but it only returns me:
{"holder":0}
Process finished with exit code 0
I am doing several other requests to website, so I think my request is correct.
I tried looking at the documentation on https://public-api.solscan.io/docs and I believe I am requesting the correct info, but I dont get it.
Here is my current code:
import requests
headers = {
'accept': 'application/jsonParsed',
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'
}
params = (
('tokenAddress', 'EArf8AxBi44QxFVnSab9gZpXTxVGiAX2YCLokccr1UsW'),
)
response = requests.get('https://public-api.solscan.io/token/meta', headers=headers, params=params)
#response = requests.get('https://arweave.net/viPcoBnO9OjXvnzGMXGvqJ2BEgl25BMtqGaj-I1tkCM', headers=headers)
print(response.content.decode())
Any help appreciated!
This code sample works:
import requests
headers = {
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'
}
params = {
'address': 'EArf8AxBi44QxFVnSab9gZpXTxVGiAX2YCLokccr1UsW',
}
response = requests.get('https://api.solscan.io/account', headers=headers, params=params)
print(response.content.decode())
I use another URL and parameters in my sample: https://api.solscan.io/account used instead of https://public-api.solscan.io/token/meta and address param instead of tokenAddress.
I'm doing a GET request on POSTMAN. The request is just this URL https://www.google.com/search?q=pip+google-images&tbm=isch and I get a 853639 characters long response(Which is the response I want).
So I want to do the same thing with Python. I used the Postman's GENERATE CODE SNIPPETS and copied the code for Python Requests, I pasted it into my own python script and ran it.
But the response I got was only 22490 characters long(The response I don't want).
Why is it happening?
Python's code:
import requests
url = "https://www.google.com/search"
querystring = {"q":"pip google-images","tbm":"isch"}
headers = {
'cache-control': "no-cache",
'postman-token': "6b5e997f-6651-2178-1371-5d6a555984a7"
}
response = requests.request("GET", url, headers=headers, params=querystring)
print(response.text)
You need to set the User-Agent, e.g.
headers = {
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'
}
Include this in your headers.
I'm trying to retrieve the timetable from this site using Requests.
I make the post sending the right parameters and get back the empty HTML skeleton, but instead I would like to get the json file returned.
Here is what I see when inspecting the page and highlighted you can see the file I want to retrieve.
Here is my code so far:
url = "https://alilauro-tickets.certusonline.com/"
headers = {'user-agent':'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.3'}
data = "msg=TimeTable&req=%7B%22getAvailability%22%3A%22Y%22%2C%22getBasicPrice%22%3A%22Y%22%2C%22getRouteAnalysis%22%3A%22Y%22%2C%22directOnly%22%3A%22Y%22%2C%22legs%22%3A%224%22%2C%22pax%22%3A1%2C%22origin%22%3A%22BEV%22%2C%22destination%22%3A%22ISC%22%2C%22tripRequest%22%3A%5B%7B%22tripfrom%22%3A%22BEV%22%2C%22tripto%22%3A%22ISC%22%2C%22tripdate%22%3A%222020-03-19%22%2C%22tripleg%22%3A0%7D%2C%7B%22tripfrom%22%3A%22ISC%22%2C%22tripto%22%3A%22BEV%22%2C%22tripdate%22%3A%222020-03-19%22%2C%22tripleg%22%3A1%7D%2C%7B%22tripfrom%22%3A%22BEV%22%2C%22tripto%22%3A%22FOR%22%2C%22tripdate%22%3A%222020-03-19%22%2C%22tripleg%22%3A2%7D%2C%7B%22tripfrom%22%3A%22FOR%22%2C%22tripto%22%3A%22BEV%22%2C%22tripdate%22%3A%222020-03-19%22%2C%22tripleg%22%3A3%7D%5D%7D"
r = requests.post(url, data=data, headers=headers, timeout=20)
The request should be as below:
url = 'https://alilauro-tickets.certusonline.com/php/proxy.php'
headers = {'user-agent':'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.3'}
data = {
'msg': 'TimeTable',
'req': '{"getAvailability":"Y","getBasicPrice":"Y","getRouteAnalysis":"Y","directOnly":"Y","legs":1,"pax":1,"origin":"BEV","destination":"FOR","tripRequest":[{"tripfrom":"BEV","tripto":"FOR","tripdate":"2020-03-18","tripleg":0}]}'
}
response = requests.post(url, headers=headers, data=data)
Image path --> http://markinternational.info/data/out/366/221983609-black-hd-desktop-wallpaper.jpg
Code I am using
import urllib
urllib.urlretrieve("https://markinternational.info/data/out/366/221983609-black-hd-desktop-wallpaper.jpg" , "photu.jpg")
What it returns (returns same thing for successful or unsuccessful attempts)
('photu.jpg', <httplib.HTTPMessage instance at 0x7fe3cfb27d88>)
Can someone help?
You need to fake the user-agent to bypass this restriction by the web server.
I used Python3 and requests library, I managed to get the picture:
import requests
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
url = 'https://markinternational.info/data/out/366/221983609-black-hd-desktop-wallpaper.jpg'
res = requests.get(url, headers=headers)
with open('photo.jpg', 'wb') as W:
W.write(res.content)
This might help.
import urllib
f = open('photu.jpg','wb')
f.write(urllib.urlopen('https://markinternational.info/data/out/366/221983609-black-hd-desktop-wallpaper.jpg').read())
f.close()
Since you're sending a raw http request without any User-Agent header, the server is not allowing the request to pass through. You can mock it with a defined User-Agent in header and it'll work as if it works on browser.
url = "https://markinternational.info/data/out/366/221983609-black-hd-desktop-wallpaper.jpg"
req = urllib.request.Request(
url,
data=None,
headers={
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'
}
)
with open('image.jpg', 'wb') as img_file:
img_file.write(urllib.request.urlopen(req).read())
i would like to take the response data about a specific website.
I have this site:
https://enjoy.eni.com/it/milano/map/
and if i open the browser debuger console i can see a posr request that give a json response:
how in python i can take this response by scraping the website?
Thanks
Apparently the webservice has a PHPSESSID validation so we need to get it first using proper user agent:
import requests
import json
headers = {
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.76 Safari/537.36'
}
r = requests.get('https://enjoy.eni.com/it/milano/map/', headers=headers)
session_id = r.cookies['PHPSESSID']
headers['Cookie'] = 'PHPSESSID={};'.format(session_id)
res = requests.post('https://enjoy.eni.com/ajax/retrieve_vehicles', headers=headers, allow_redirects=False)
json_obj = json.loads(res.content)