Ruby vs Python get request - python

I'm trying to make a GET request to the NBA API in Ruby. I've tried the unirest library, but the request hangs and times out after about 5 minutes.
When I use Python's requests library for the same URL and endpoint, it works. Here's what I'm trying for each:
require "unirest"
base_url = "http://stats.nba.com/stats"
team_endpoint = "/teamgamelog"
params = {:TeamID => "1610612739", :Season => "2016-17", :SeasonType => "Playoffs"}
HEADERS = {
  'user-agent' => 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36',
  'Accept' => "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
}
response = Unirest.get "#{base_url}#{team_endpoint}", headers: HEADERS, parameters: params
The above code doesn't work (the connection eventually times out). The following Python code works:
import requests
base_url = 'http://stats.nba.com/stats'
HEADERS = {'user-agent': ('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) '
'AppleWebKit/537.36 (KHTML, like Gecko) '
'Chrome/45.0.2454.101 Safari/537.36'),}
end_point = '/teamgamelog'
PARAMS = {'TeamID': '1610612739', 'Season': '2016-17', 'SeasonType': 'Playoffs'}
r = requests.get(base_url+end_point, headers=HEADERS, params=PARAMS)
Any advice on solving the ruby case?
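One way to narrow this down is to confirm that both clients request the same URL. The sketch below uses only Python's standard library to build the full URL from the same base URL, endpoint, and parameters as the working call, so it can be compared with what the Ruby client sends. The helper name build_url is hypothetical, and nothing here contacts the server.

```python
from urllib.parse import urlencode


def build_url(base_url, end_point, params):
    """Join base URL, endpoint, and query parameters into one URL string."""
    return base_url + end_point + "?" + urlencode(params)


base_url = "http://stats.nba.com/stats"
end_point = "/teamgamelog"
PARAMS = {"TeamID": "1610612739", "Season": "2016-17", "SeasonType": "Playoffs"}

# Build the URL without sending anything over the network.
full_url = build_url(base_url, end_point, PARAMS)
print(full_url)
```

If the Ruby side logs a different URL, for example with the parameters missing from the query string, that difference is a reasonable first thing to investigate.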

Related

Read JSON metadata for a token from Solscan

I'm using Python and trying to read the metadata for a token on Solscan.
I am looking for the name, image, etc. from the metadata.
I am currently using a JSON request, which seems to work (i.e. it does not fail), but it only returns:
{"holder":0}
Process finished with exit code 0
I am making several other requests to the website, so I think my request is correct.
I looked at the documentation at https://public-api.solscan.io/docs and I believe I am requesting the correct info, but I don't get it back.
Here is my current code:
import requests
headers = {
'accept': 'application/jsonParsed',
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'
}
params = (
('tokenAddress', 'EArf8AxBi44QxFVnSab9gZpXTxVGiAX2YCLokccr1UsW'),
)
response = requests.get('https://public-api.solscan.io/token/meta', headers=headers, params=params)
#response = requests.get('https://arweave.net/viPcoBnO9OjXvnzGMXGvqJ2BEgl25BMtqGaj-I1tkCM', headers=headers)
print(response.content.decode())
Any help appreciated!
This code sample works:
import requests
headers = {
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'
}
params = {
'address': 'EArf8AxBi44QxFVnSab9gZpXTxVGiAX2YCLokccr1UsW',
}
response = requests.get('https://api.solscan.io/account', headers=headers, params=params)
print(response.content.decode())
My sample uses a different URL and parameter: https://api.solscan.io/account instead of https://public-api.solscan.io/token/meta, and the address parameter instead of tokenAddress.
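When comparing endpoints like this, it may help to check each response body for the fields you expect instead of reading the raw text. The following is a minimal sketch under assumptions: missing_keys is a hypothetical helper, and the key names "name" and "image" are guesses taken from the question's wording, not confirmed field names of either API.

```python
import json


def missing_keys(body, wanted):
    """Return the wanted keys that are absent from a JSON object string."""
    data = json.loads(body)
    return [key for key in wanted if key not in data]


# The body reported in the question, checked for the fields the asker wants.
body = '{"holder":0}'
print(missing_keys(body, ["name", "image"]))
```

An empty list would mean every expected field is present at the top level of the response.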

Python request yields status code 500 even though the website is available

I'm trying to use Python to check whether or not a list of websites is online. However, on several sites, requests yields the wrong status code. For example, the status code I get for https://signaturehound.com/ is 500 even though the website is online and in the Chrome developer tools the response code 200 is shown.
My code looks as follows:
import requests
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}
def url_ok(url):
    r = requests.head(url, timeout=5, allow_redirects=True, headers=headers)
    status_code = r.status_code
    return status_code
print(url_ok("https://signaturehound.com/"))
As suggested by @CaptainDaVinci in the comments, the solution is to replace head with get in the code:
import requests
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}
def url_ok(url):
    r = requests.get(url, timeout=5, allow_redirects=True, headers=headers)
    status_code = r.status_code
    return status_code
print(url_ok("https://signaturehound.com/"))
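If you would rather keep the cheaper HEAD request for sites that handle it correctly, one option is to fall back to GET only when HEAD reports an error. This is a rough sketch, not the method from the answer: it uses the standard library's urllib.request instead of requests so that it has no third-party dependency, and the names status_of and the fetch parameter are hypothetical.

```python
import urllib.error
import urllib.request


def status_of(url, method, headers, timeout=5):
    """Send one request with the given HTTP method and return its status code."""
    req = urllib.request.Request(url, headers=headers, method=method)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx responses; the code is still available.
        return err.code


def url_ok(url, headers, fetch=status_of):
    """Try HEAD first; retry with GET if the server answers HEAD with an error."""
    code = fetch(url, "HEAD", headers)
    if code >= 400:
        code = fetch(url, "GET", headers)
    return code
```

Calling url_ok("https://signaturehound.com/", headers) with the headers dictionary from the question would exercise it against the real site. Connection failures (urllib.error.URLError) are deliberately left uncaught here.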

Python Requests Get not Working

I have a simple GET request I'd like to make using Python's Requests library.
import requests
HEADERS = {'user-agent': ('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5)'
'AppleWebKit/537.36 (KHTML, like Gecko)'
'Chrome/45.0.2454.101 Safari/537.36'),
'referer': 'http://stats.nba.com/scores/'}
url = 'http://stats.nba.com/stats/playbyplayv2?EndPeriod=10&EndRange=55800&GameID=0021500281&RangeType=2&Season=2016-17&SeasonType=Regular+Season&StartPeriod=1&StartRange=0'
response = requests.get(url, timeout=5, headers=HEADERS)
However, when I make the requests.get call, I get the error requests.exceptions.ReadTimeout: HTTPConnectionPool(host='stats.nba.com', port=80): Read timed out. (read timeout=5). But I am able to copy/paste that url into my browser and view the resulting JSON. Why is requests not able to get the result?
Your HEADERS format is wrong. I tried with this code and it worked without any issues:
import requests
HEADERS = {
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36',
}
url = 'http://stats.nba.com/stats/playbyplayv2?EndPeriod=10&EndRange=55800&GameID=0021500281&RangeType=2&Season=2016-17&SeasonType=Regular+Season&StartPeriod=1&StartRange=0'
response = requests.get(url, timeout=5, headers=HEADERS)
print(response.text)
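One visible difference in the question's HEADERS is that the user-agent is built from adjacent string literals with no spaces between the pieces. Python joins such literals exactly as written, so the pieces run together. The sketch below only demonstrates that string behavior; whether this is what the server objects to is an assumption, not something the thread confirms.

```python
# Adjacent string literals are concatenated exactly as written, with no space added.
joined = ('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5)'
          'AppleWebKit/537.36 (KHTML, like Gecko)'
          'Chrome/45.0.2454.101 Safari/537.36')

# A trailing space on each piece keeps the parts of the user-agent separated.
spaced = ('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) '
          'AppleWebKit/537.36 (KHTML, like Gecko) '
          'Chrome/45.0.2454.101 Safari/537.36')

print(joined)
print(spaced)
```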

Able to see image in browser, but urllib.urlretrieve() fails to download it. How can I download it?

Image path --> http://markinternational.info/data/out/366/221983609-black-hd-desktop-wallpaper.jpg
Code I am using
import urllib
urllib.urlretrieve("https://markinternational.info/data/out/366/221983609-black-hd-desktop-wallpaper.jpg" , "photu.jpg")
What it returns (returns same thing for successful or unsuccessful attempts)
('photu.jpg', <httplib.HTTPMessage instance at 0x7fe3cfb27d88>)
Can someone help?
You need to fake the user-agent to bypass this restriction by the web server.
Using Python 3 and the requests library, I managed to get the picture:
import requests
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
url = 'https://markinternational.info/data/out/366/221983609-black-hd-desktop-wallpaper.jpg'
res = requests.get(url, headers=headers)
with open('photo.jpg', 'wb') as W:
    W.write(res.content)
This might help.
import urllib
f = open('photu.jpg','wb')
f.write(urllib.urlopen('https://markinternational.info/data/out/366/221983609-black-hd-desktop-wallpaper.jpg').read())
f.close()
Since you're sending a raw HTTP request without a browser-like User-Agent header (urllib sends its own default, Python-urllib/x.y), the server is not letting the request through. You can set a User-Agent header that mimics a browser, and the request will then work as it does in the browser.
import urllib.request
url = "https://markinternational.info/data/out/366/221983609-black-hd-desktop-wallpaper.jpg"
# Attach a browser-like User-Agent to the request.
req = urllib.request.Request(
    url,
    data=None,
    headers={
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'
    }
)
with open('image.jpg', 'wb') as img_file:
    img_file.write(urllib.request.urlopen(req).read())
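As the question notes, the return value of urlretrieve looks the same whether or not the download succeeded, so it can be useful to check the saved bytes directly. A minimal sketch, assuming the file is expected to be a JPEG (JPEG data begins with the bytes FF D8 FF). The helper name looks_like_jpeg and the sample byte strings are made up for illustration.

```python
JPEG_MAGIC = b"\xff\xd8\xff"


def looks_like_jpeg(data):
    """Return True if the bytes start with the JPEG signature."""
    return data[:3] == JPEG_MAGIC


# Illustrative inputs: a JPEG-like header and an HTML error page.
print(looks_like_jpeg(b"\xff\xd8\xff\xe0" + b"\x00" * 16))
print(looks_like_jpeg(b"<html>403 Forbidden</html>"))
```

In practice you would pass in the bytes read from the response, or from the saved file opened in 'rb' mode.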

How can I recreate a urllib.request call in Python 2.7?

I'm crawling some web pages and parsing data from them, but one of the sites seems to be blocking my requests. The version of the code using Python 3 with urllib.request works fine. My problem is that I need to use Python 2.7, and I can't get a response using urllib2.
Shouldn't these requests be identical?
Python 3 version:
import urllib.request
def fetch_title(url):
    req = urllib.request.Request(
        url,
        data=None,
        headers={
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'
        }
    )
    html = urllib.request.urlopen(req).read().encode('unicode-escape').decode('ascii')
    return html
Python 2.7 version:
import urllib2
opener = urllib2.build_opener()
opener.addheaders = [(
'User-Agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'
)]
response = opener.open('http://website.com')
print response.read()
The following code should work. In Python 2.7 you can put the desired headers in a dictionary, pass it to urllib2.Request, and then open that request with urllib2.urlopen.
import urllib2
def fetch_title(url):
    my_headers = {
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36"
    }
    return urllib2.urlopen(urllib2.Request(url, headers=my_headers)).read()
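To keep the two versions from drifting apart, the request can be built by one piece of code that imports Request from whichever module is available. The sketch below is one way to do that, not part of the original answer; build_request is a hypothetical name, and the request is only constructed, never sent.

```python
try:
    from urllib.request import Request  # Python 3
except ImportError:
    from urllib2 import Request  # Python 2.7

USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/35.0.1916.47 Safari/537.36"
)


def build_request(url):
    """Create a Request carrying the same User-Agent on either Python version."""
    return Request(url, headers={"User-Agent": USER_AGENT})


req = build_request("http://website.com")
# Request stores header names capitalized, so the key to look up is "User-agent".
print(req.get_header("User-agent"))
```

Printing the stored header is a quick way to confirm that both interpreters attach the same value before the request is opened.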
