Url unshortner python and urilib Request - python

Im traying to unshort a few urls that have been shortened. For such thing I'm using https://pypi.org/project/fake-headers/ to create the headers
but i cant get it to work I get 400 and 406 status codes.
url = 'http:// bit.ly /377n4o6' #i had to add spaces end url is https://es.banggood.com/Xiaomi-Mi-A1-MiA1-Dual-Rear-Camera-5_5-inch-4GB-RAM-64GB-Snapdragon-625-Octa-core-4G-Smartphone-p-1196064.html
parsed = urllib.parse.urlparse(url)
if parsed.scheme == '':
url = 'http://' + url
parsed = urllib.parse.urlparse(url)
h = http.client.HTTPConnection(parsed.netloc, timeout=5)
path = quote(parsed.path, safe='')
header = Headers(
browser="chrome", # Generate only Chrome UA
os="win", # Generate ony Windows platform
headers=True # generate misc headers
)
#request is asking to have a parans, so I got one from another code.
params = urlencode({'#number': 12524, '#type': 'issue', '#action': 'show'})
h.request('HEAD', path,params, header.generate())
response = h.getresponse()
print(response.status)
what could be wrong with this that fails?
thanks

Related

How to search for books that have spaces in their title using Google books API

When i search for books with a single name(e.g bluets) my code works fine, but when I search for books that have two names or spaces (e.g white whale) I got an error(jinja2 synatx) how do I solve this error?
#app.route("/book", methods["GET", "POST"])
def get_books():
api_key =
os.environ.get("API_KEY")
if request.method == "POST":
book = request.form.get("book")
url =f"https://www.googleapis.com/books/v1/volumes?q={book}:keyes&key={api_key}"
response =urllib.request.urlopen(url)
data = response.read()
jsondata = json.loads(data)
return render_template ("book.html", books=jsondata["items"]
I tried to search for similar cases, and just found one solution, but I didn't understand it
Here is my error message
http.client.InvalidURL
http.client.InvalidURL: URL can't contain control characters. '/books/v1/volumes?q=white whale:keyes&key=AIzaSyDtjvhKOniHFwkIcz7-720bgtnubagFxS8' (found at least ' ')
Some chars in url need to be encoded - in your situation you have to use + or %20 instead of space.
This url has %20 instead of space and it works for me. If I use + then it also works
import urllib.request
import json
url = 'https://www.googleapis.com/books/v1/volumes?q=white%20whale:keyes&key=AIzaSyDtjvhKOniHFwkIcz7-720bgtnubagFxS8'
#url = 'https://www.googleapis.com/books/v1/volumes?q=white+whale:keyes&key=AIzaSyDtjvhKOniHFwkIcz7-720bgtnubagFxS8'
response = urllib.request.urlopen(url)
text = response.read()
data = json.loads(text)
print(data)
With requests you don't even have to do it manually because it does it automatically
import requests
url = 'https://www.googleapis.com/books/v1/volumes?q=white whale:keyes&key=AIzaSyDtjvhKOniHFwkIcz7-720bgtnubagFxS8'
r = requests.get(url)
data = r.json()
print(data)
You may use urllib.parse.urlencode() to make sure all chars are correctly encoded.
import urllib.request
import json
payload = {
'q': 'white whale:keyes',
'key': 'AIzaSyDtjvhKOniHFwkIcz7-720bgtnubagFxS8',
}
query = urllib.parse.urlencode(payload)
url = 'https://www.googleapis.com/books/v1/volumes?' + query
response = urllib.request.urlopen(url)
text = response.read()
data = json.loads(text)
print(data)
and the same with requests - it also doesn't need encoding
import requests
payload = {
'q': 'white whale:keyes',
'key': 'AIzaSyDtjvhKOniHFwkIcz7-720bgtnubagFxS8',
}
url = 'https://www.googleapis.com/books/v1/volumes'
r = requests.get(url, params=payload)
data = r.json()
print(data)

Requests lib changing characters to URL-encoded representation

I am trying to send a get request to an api that includes timestamps. The URL is getting changed and instead of : %253A is inserted and I get a 500 error code.
url = 'https://www.fleetTrackingSimplicity.com/REST6/api/vehiclehistory/2'
start = '2020-12-01T00:00:00Z'
end = '2020-12-02T00:00:00Z'
param = {'startdate': start, 'enddate': end, 'count':'500'}
r_auth = str(a_json['TransactionId'])
headers2 = dict (Authorization = r_auth, Accept = 'application/json')
r = requests.get(url, headers = headers2, params=param)
When I print r.url I get https://www.fleettrackingsimplicity.com/REST6/api/vehiclehistory/2?startdate=2020-12-01T00%253A00%253A00Z&enddate=2020-12-02T00%253A00%253A00Z&count=500

Python 301 POST

So basically I'm trying to make a request to this website - https://panel.talonro.com/login/ which is supposed to be 301 redirect.
I send data as I should but in the end there is no Location header in my request and status code is 200 instead of 301.
I can't figure out what I am doing wrong. Please help
def do_request():
req = requests.get('https://panel.talonro.com/login/').text
soup = BeautifulSoup(req, 'html.parser')
csrf = soup.find('input', {'name':'csrfKey'}).get('value')
ref = soup.find('input', {'name':'ref'}).get('value')
post_data = {
'auth':'mylogin',
'password':'mypassword',
'login__standard_submitted':'1',
'csrfKey':csrf,
'ref':ref,
'submit':'Go'
}
post = requests.post(url = 'https://forum.talonro.com/login/', data = post_data, headers = {'referer':'https://panel.talonro.com/login/'})
Right now push_data is in do_request(), so you cannot access it outside of that function.
Instead, try this where you return that info and then pass it in:
import requests
from bs4 import BeautifulSoup
def do_request():
req = requests.get('https://panel.talonro.com/login/').text
soup = BeautifulSoup(req, 'html.parser')
csrf = soup.find('input', {'name':'csrfKey'}).get('value')
ref = soup.find('input', {'name':'ref'}).get('value')
post_data = {
'auth':'mylogin',
'password':'mypassword',
'login__standard_submitted':'1',
'csrfKey':csrf,
'ref':ref,
'submit':'Go'
}
return post_data
post = requests.post(url = 'https://forum.talonro.com/login/', data = do_request(), headers = {'referer':'https://panel.talonro.com/login/'})

oauth1, rsa-sha1, header params

I have been using requests-oauthlib, which do have rsa-sha1 signature algorithm, but there can't seem to be a way to add extra params to the authorization header, and if i do modify the header manually before making the request i get
"The request contains an incomplete or invalid oauth_signature."
because it had already created the oauth_signature when i have added the new oauth_body_hash to the authorization header.
i have pasted the code, the first call using oauth passes correctly, but the second one is giving the error.
any possible way to workaround this ? another library that has both options to add extra params and provide rsa-sha1 signature ?
oauth = OAuth1(
consumer_key,
signature_method='RSA-SHA1',
signature_type='auth_header',
rsa_key=key,
callback_uri=callback_uri
)
oauth.client.realm = 'eWallet'
r = requests.post(url=request_token_url, auth=oauth)
credentials = parse_qs(r.content)
# end of request
# post shopingcart
root = ET.fromstring(shopping_cart_xml)
root.find('OAuthToken').text = credentials['oauth_token'][0]
xml = ET.tostring(root, encoding="us-ascii", method="xml")
sha1 = hashlib.sha1()
sha1.update(xml)
oauth_body_hash = sha1.digest()
oauth_body_hash = base64.b64encode(oauth_body_hash)
oauth.client.oauth_body_hash = oauth_body_hash
req = Request('POST', shopping_cart_url, data=xml, auth=oauth)
prepped = req.prepare()
auth_header = prepped.headers['Authorization'] + ', oauth_body_hash="%s"' % (oauth_body_hash)
headers = {'Content-Type': 'application/xml', 'Authorization': auth_header}
r = requests.post(url=shopping_cart_url, headers=headers, data=xml)

Get Ajax Json Output with Python Requests Library

I am requesting an Ajax Web site with a Python script and fetching cities and branch offices of http://www.yurticikargo.com/bilgi-servisleri/Sayfalar/en-yakin-sube.aspx
I completed the first step with posting
{cityID: 34} to this url and fetc the JSON output.
http://www.yurticikargo.com/_layouts/ArikanliHolding.YurticiKargo.WebSite/ajaxproxy-sswservices.aspx/GetTownByCity
But I can not retrive the JSON output with Python although i get succesfully with Chrome Advanced Rest Client Extension, posting {cityID:54,townID:5416,unitOnDutyFlag:null,closestFlag:2}
http://www.yurticikargo.com/_layouts/ArikanliHolding.YurticiKargo.WebSite/ajaxproxy-unitservices.aspx/GetUnit
All of the source code is here
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import requests
import json
class Yurtici(object):
baseUrl = 'http://www.yurticikargo.com/'
ajaxRoot = '_layouts/ArikanliHolding.YurticiKargo.WebSite/ajaxproxy-sswservices.aspx/'
getTown = 'GetTownByCity'
getUnit = 'GetUnit'
urlGetTown = baseUrl + ajaxRoot + getTown
urlGetUnit = baseUrl + ajaxRoot + getUnit
headers = {'content-type': 'application/json','encoding':'utf-8'}
def __init__(self):
pass
def ilceler(self, plaka=34): # Default testing value
payload = {'cityId':plaka}
url = self.urlGetTown
r = requests.post(url, data=json.dumps(payload), headers=self.headers)
return r.json() # OK
def subeler(self, ilceNo=5902): # Default testing value
# 5902 Çerkezköy
payload= {'cityID':59,'townID':5902,'unitOnDutyFlag':'null','closestFlag':0}
url = self.urlGetUnit
headers = {'content-type': 'application/json','encoding':'utf-8'}
r = requests.post(url, data=json.dumps(payload), headers=headers)
print r.status_code, r.raw.read()
if __name__ == '__main__':
a = Yurtici()
print a.ilceler(37) # OK
print a.subeler() # NOT OK !!!
Your code isn't posting to the same url you're using in your text example.
Let's walk through this backwards. First, let's look at the failing POST.
url = self.urlGetUnit
headers = {'content-type': 'application/json','encoding':'utf-8'}
r = requests.post(url, data=json.dumps(payload), headers=headers)
So we're posting to a URL that is equal to self.urlGetUnit. Ok, let's look at how that's defined:
baseUrl = 'http://www.yurticikargo.com/'
ajaxRoot = '_layouts/ArikanliHolding.YurticiKargo.WebSite/ajaxproxy-sswservices.aspx/'
getUnit = 'GetUnit'
urlGetUnit = baseUrl + ajaxRoot + getUnit
If you do the work in urlGetUnit, you get that the URL will be http://www.yurticikargo.com/_layouts/ArikanliHolding.YurticiKargo.WebSite/ajaxproxy-sswservices.aspx/GetUnit. Let's put this alongside the URL you used in Chrome to compare the differences:
http://www.yurticikargo.com/_layouts/ArikanliHolding.YurticiKargo.WebSite/ajaxproxy-sswservices.aspx/GetUnit
http://www.yurticikargo.com/_layouts/ArikanliHolding.YurticiKargo.WebSite/ajaxproxy-unitservices.aspx/GetUnit
See the difference? ajaxRoot is not the same for both URLs. Sort that out and you'll get back a JSON response.

Categories

Resources