Ruby vs Python get request

Ruby vs Python get request - python

I'm trying to make a GET request to the NBA API in ruby. I've tried both the unirest library but the request hangs after 5 minutes.
When I use python's requests library for the same url + endpoint it works. Here's what I'm trying for each:
require "unirest"
base_url = "http://stats.nba.com/stats"
team_endpoint = "/teamgamelog"
params = {:TeamID => "1610612739", :Season => "2016-17", :SeasonType => "Playoffs"}
HEADERS = {'user-agent'=> 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36',
'Accept'=> "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
}
response = Unirest.get "#{base_url}#{team_endpoint}", headers: HEADERS,
parameters: params
The above code doesn't work (the connection eventually times-out). The following python code works:
import requests
base_url = 'http://stats.nba.com/stats'
HEADERS = {'user-agent': ('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) '
'AppleWebKit/537.36 (KHTML, like Gecko) '
'Chrome/45.0.2454.101 Safari/537.36'),}
end_point = '/teamgamelog'
PARAMS = {'TeamID': '1610612739', 'Season': '2016-17', 'SeasonType': 'Playoffs'}
r = requests.get(base_url+end_point, headers=HEADERS, params=PARAMS)
Any advice on solving the ruby case?

Related

Read JSON metadata for a token from Solscan

I'm using python and trying to read the metadata from a token on solscan.
I am looking for the name, image, etc from metadata.
I am currently using JSON request which seems to work (ie not fail), but it only returns me:
{"holder":0}
Process finished with exit code 0
I am doing several other requests to website, so I think my request is correct.
I tried looking at the documentation on https://public-api.solscan.io/docs and I believe I am requesting the correct info, but I dont get it.
Here is my current code:
import requests
headers = {
'accept': 'application/jsonParsed',
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'
}
params = (
('tokenAddress', 'EArf8AxBi44QxFVnSab9gZpXTxVGiAX2YCLokccr1UsW'),
)
response = requests.get('https://public-api.solscan.io/token/meta', headers=headers, params=params)
#response = requests.get('https://arweave.net/viPcoBnO9OjXvnzGMXGvqJ2BEgl25BMtqGaj-I1tkCM', headers=headers)
print(response.content.decode())
Any help appreciated!

This code sample works:
import requests
headers = {
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'
}
params = {
'address': 'EArf8AxBi44QxFVnSab9gZpXTxVGiAX2YCLokccr1UsW',
}
response = requests.get('https://api.solscan.io/account', headers=headers, params=params)
print(response.content.decode())
I use another URL and parameters in my sample: https://api.solscan.io/account used instead of https://public-api.solscan.io/token/meta and address param instead of tokenAddress.

Python request yields status code 500 even though the website is available

I'm trying to use Python to check whether or not a list of websites is online. However, on several sites, requests yields the wrong status code. For example, the status code I get for https://signaturehound.com/ is 500 even though the website is online and in the Chrome developer tools the response code 200 is shown.
My code looks as follows:
import requests
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}
def url_ok(url):
r = requests.head(url,timeout=5,allow_redirects=True,headers=headers)
status_code = r.status_code
return status_code
print(url_ok("https://signaturehound.com/"))

As suggested by #CaptainDaVinci in the comments, the solution is to replace head by get in the code:
import requests
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}
def url_ok(url):
r = requests.get(url,timeout=5,allow_redirects=True,headers=headers)
status_code = r.status_code
return status_code
print(url_ok("https://signaturehound.com/"))

Python Requests Get not Working

I have a simple Get request I'd like to make using Python's Request library.
import requests
HEADERS = {'user-agent': ('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5)'
'AppleWebKit/537.36 (KHTML, like Gecko)'
'Chrome/45.0.2454.101 Safari/537.36'),
'referer': 'http://stats.nba.com/scores/'}
url = 'http://stats.nba.com/stats/playbyplayv2?EndPeriod=10&EndRange=55800&GameID=0021500281&RangeType=2&Season=2016-17&SeasonType=Regular+Season&StartPeriod=1&StartRange=0'
response = requests.get(url, timeout=5, headers=HEADERS)
However, when I make the requests.get call, I get the error requests.exceptions.ReadTimeout: HTTPConnectionPool(host='stats.nba.com', port=80): Read timed out. (read timeout=5). But I am able to copy/paste that url into my browser and view the resulting JSON. Why is requests not able to get the result?

Your HEADERS format is wrong. I tried with this code and it worked without any issues:
import requests
HEADERS = {
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36',
}
url = 'http://stats.nba.com/stats/playbyplayv2?EndPeriod=10&EndRange=55800&GameID=0021500281&RangeType=2&Season=2016-17&SeasonType=Regular+Season&StartPeriod=1&StartRange=0'
response = requests.get(url, timeout=5, headers=HEADERS)
print(response.text)

Able to see image on browser, but urllib.urlretrieve() fails to downlad it. How can I download it?

Image path --> http://markinternational.info/data/out/366/221983609-black-hd-desktop-wallpaper.jpg
Code I am using
import urllib
urllib.urlretrieve("https://markinternational.info/data/out/366/221983609-black-hd-desktop-wallpaper.jpg" , "photu.jpg")
What it returns (returns same thing for successful or unsuccessful attempts)
('photu.jpg', <httplib.HTTPMessage instance at 0x7fe3cfb27d88>)
Can someone help?

You need to fake the user-agent to bypass this restriction by the web server.
I used Python3 and requests library, I managed to get the picture:
import requests
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
url = 'https://markinternational.info/data/out/366/221983609-black-hd-desktop-wallpaper.jpg'
res = requests.get(url, headers=headers)
with open('photo.jpg', 'wb') as W:
W.write(res.content)

This might help.
import urllib
f = open('photu.jpg','wb')
f.write(urllib.urlopen('https://markinternational.info/data/out/366/221983609-black-hd-desktop-wallpaper.jpg').read())
f.close()

Since you're sending a raw http request without any User-Agent header, the server is not allowing the request to pass through. You can mock it with a defined User-Agent in header and it'll work as if it works on browser.
url = "https://markinternational.info/data/out/366/221983609-black-hd-desktop-wallpaper.jpg"
req = urllib.request.Request(
url,
data=None,
headers={
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'
}
)
with open('image.jpg', 'wb') as img_file:
img_file.write(urllib.request.urlopen(req).read())

How can I recreate a urllib.requests in Python 2.7?

I'm crawling some web-pages and parsing through some data on them, but one of the sites seems to be blocking my requests. The version of the code using Python 3 with urllib.requests works fine. My problem is that I need to use Python 2.7, and I can't get a response using urllib2
Shouldn't these requests be identical?
Python 3 version:
def fetch_title(url):
req = urllib.request.Request(
url,
data=None,
headers={
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'
}
)
html = urllib.request.urlopen(req).read().encode('unicode-escape').decode('ascii')
return html
Python 2.7 version:
import urllib2
opener = urllib2.build_opener()
opener.addheaders = [(
'User-Agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'
)]
response = opener.open('http://website.com')
print response.read()

The following code should work, essentially with python 2.7 you can create a dictionary with your desired headers and format your request in a way that it will work properly with urllib2.urlopen using urllib2.Request.
import urllib2
def fetch_title(url):
my_headers = {
"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36"
}
return urllib2.urlopen(urllib2.Request(url, headers=my_headers)).read()

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Ruby vs Python get request - python

Related

Read JSON metadata for a token from Solscan

Python request yields status code 500 even though the website is available

Python Requests Get not Working

Able to see image on browser, but urllib.urlretrieve() fails to downlad it. How can I download it?

How can I recreate a urllib.requests in Python 2.7?

Categories

Resources