python URL handling issue - python

I am trying to write a Python web app that takes some SQL and a few other parameters and returns a JSON file. That part is not the issue and is not in the script yet. The issue is that the URL being passed is UTF-8 encoded and then URL encoded,
turning our example
query: SELECT + ;
test: 2
into
test=2&query=SELECT+%2B+%3B
This seems to be OK,
but the receiving GET handler seems to expand the percent codes back into characters before my code sees them,
and it receives
test=2&query=SELECT+++;
This is then URL decoded a second time, which chops off the semicolon, and I want to keep the semicolon!
It also turns the +'s that really are spaces into spaces, but because of the earlier expansion the encoded plus (%2B) has already become a literal plus, so it is turned into a space as well:
{'test': '2', 'query': 'SELECT '}
The code is as follows:
#!/usr/bin/python
import web
import psycopg2
import re
import urllib
import urlparse
urls = (
    '/query', 'query',
    '/data/(.*)', 'data'
)
app = web.application(urls, globals())
render = web.template.render('templates/')
class query:
    def GET(self):
        return render.query()
    def POST(self):
        i = web.input()
        data = {}
        data['query'] = i.sql.encode('utf-8')
        data['test'] = '2'
        murl = urllib.urlencode(data)
        return "go!"
class data:
    def GET(self, urlEncodedDict):
        print "raw type:", type(urlEncodedDict)
        print "raw:", urlEncodedDict
        urlEncodedDict = urlEncodedDict.encode('ascii', 'ignore')
        print "ascii type:", type(urlEncodedDict)
        print "ascii:", urlEncodedDict
        data = dict(urlparse.parse_qsl(urlEncodedDict, 1)) #bad bit
        print "dict:", data
        print "element:", data['query']
        if ( re.match('SELECT [^;]+ ;', data['query'])):
            return 'good::'+data['query']
        else:
            return 'Bad::'+data['query']
if __name__ == "__main__":
    app.run()
The URL generated from my test form is:
http://localhost:8080/data/test=2&query=SELECT+%2B+%3B
Output is as follows:
raw type: <type 'unicode'>
raw: test=2&query=SELECT+++;
ascii type: <type 'str'>
ascii: test=2&query=SELECT+++;
dict: {'test': '2', 'query': 'SELECT '}
element: SELECT
127.0.0.1:53272 - - [16/Nov/2012 11:05:44] "HTTP/1.1 GET /data/test=2&query=SELECT+++;" - 200 OK
127.0.0.1:53272 - - [16/Nov/2012 11:05:44] "HTTP/1.1 GET /favicon.ico" - 404 Not Found
I want to get the same dict out of the GET that I encoded in the first place.

If you want to pass data in a GET request, use the query string syntax, with the question mark character (?) separating the path from the parameters.
The URL should be:
http://localhost:8080/data/?test=2&query=SELECT+%2B+%3B
After that, just call web.input() to get a dictionary with all the arguments already decoded.
urls = (
    '/query', 'query',
    '/data/', 'data'
)
[...]
class data:
    def GET(self):
        data = web.input()
        print "dict:", data
        print "element:", data['query']
        if ( re.match('SELECT [^;]+ ;', data['query'])):
            return 'good::'+data['query']
        else:
            return 'Bad::'+data['query']
Result:
dict: <Storage {'test': u'2', 'query': u'SELECT + ;'}>
element: SELECT + ;
127.0.0.1:44761 - - [16/Nov/2012 15:06:06] "HTTP/1.1 GET /data/" - 200 OK
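The round trip the question asks for can also be checked without web.py. The following is a minimal sketch, assuming Python 3 (where the question's `urllib.urlencode` and `urlparse.parse_qsl` live in `urllib.parse`); the variable names are only illustrative. It suggests that encoding once and decoding once returns the original dict, while unquoting before parsing, which is effectively what happened to the path-based URL, does not.

```python
from urllib.parse import parse_qsl, unquote, urlencode

original = {"test": "2", "query": "SELECT + ;"}

# Encode exactly once: spaces become '+', '+' becomes %2B, ';' becomes %3B.
encoded = urlencode(original)

# Decode exactly once: the original values come back unchanged.
decoded = dict(parse_qsl(encoded))

# Unquoting before parsing turns %2B into a literal '+', which the
# parser then reads as a space, so the real plus sign is lost.
unquoted_too_early = unquote(encoded)
decoded_twice = dict(parse_qsl(unquoted_too_early))

print(encoded)
print(decoded == original)
print(decoded_twice == original)
```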

Related

Check query string on the request using requests_mock

How can I check whether a request mocked by requests_mock added some query parameters to a URL?
I have a function func that does an HTTP POST to the URL with a query string appended, and I want to check that it was called with this query string.
This is my attempt, but it fails:
query is an empty string and qs is an empty dict.
I am sure that my func is appending the query string to the request.
with requests_mock.Mocker() as mock:
    mock.post(url, text=xml)
    func() # This function will call url + query string
    history = mock.request_history[0]
    assert history.method == "POST" # OK
    assert history.query is None # Returns an empty string, AssertionError: assert '' is None
    assert history.qs is None # Returns an empty dict, assert {} is None
My func:
def credilink():
    url = settings["url"]
    params = settings["params"]
    params["CC"] = query
    response = requests.post(url, params=params)
    # ...
I tried to reproduce your problem and was unable to...
Here is the code I'm running:
import requests
import requests_mock
url = "http://example.com"
settings = dict(url=url, params=dict(a=1))
query = "some-query"
xml = "some-xml"
def credilink():
    url = settings["url"]
    params = settings["params"]
    params["CC"] = query
    response = requests.post(url, params=params)
    return response.text
    # ...
def test():
    with requests_mock.Mocker() as mock:
        mock.post(url, text=xml)
        data = credilink() # This function will call url + query string
        history = mock.request_history[0]
        assert history.method == "POST" # OK
        assert history.qs == dict(a=['1'], cc=[query])
        assert history.query == f"a=1&cc={query}"
        assert data == xml
The assertions pass in this snippet.
Maybe it's some version problem? I used requests==2.25.1 and requests-mock==1.8.0.
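For reference, the shape of the `qs` value in the assertions above can be reproduced with the standard library. This is only a sketch: `query_as_dict` is a hypothetical helper, and the lowercasing is an assumption inferred from the passing assertions (the `CC` parameter shows up as `cc`), not a copy of how requests_mock implements it.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

def query_as_dict(url):
    """Return the lowercased query string of url as a dict of lists."""
    return parse_qs(urlsplit(url).query.lower())

# Same parameters as in the snippet above.
params = {"a": 1, "CC": "some-query"}
full_url = "http://example.com/?" + urlencode(params)
parsed = query_as_dict(full_url)
print(parsed)
```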
In my case, the problem was the mock:// URL scheme, as used in the requests-mock samples:
session.get('mock://test.com/path')
But the requests library skips query arguments for non-"http" URLs. Here is the comment from the source code:
# Don't do any URL preparation for non-HTTP schemes like `mailto`,
# `data` etc to work around exceptions from `url_parse`, which
# handles RFC 3986 only.
if ':' in url and not url.lower().startswith('http'):
    self.url = url
    return
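The effect of that check can be illustrated in isolation. The sketch below copies only the condition from the quoted source into a hypothetical helper, `skips_url_preparation`; it is not part of requests, and it assumes the quoted snippet matches the requests version in use.

```python
def skips_url_preparation(url):
    """Mirror the quoted check: True when the URL is left untouched."""
    return ":" in url and not url.lower().startswith("http")

for candidate in ("mock://test.com/path",
                  "http://test.com/path",
                  "HTTPS://test.com/path"):
    print(candidate, skips_url_preparation(candidate))
```

Under that assumption, registering the mock with an http:// URL instead of mock:// should let the params reach the recorded request.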

KeyError: ' http' Url Parsing Error

I am currently working with the Shopware API. When I parse a URL like
http://192.168.0.100/shopware531/api
it gives me this error:
connection_type = SCHEME_TO_CONNECTION[scheme]
KeyError: u' http'
I build the URL using:
def buildHttpQuery(self, taxonomy, parameters):
    if taxonomy.startswith('/'):
        taxonomy = taxonomy[1:]
    if not self.baseurl.endswith('/'):
        self.baseurl += '/'
    url = urljoin(self.baseurl, taxonomy)
    url_parts = list(urlparse(url))
    query = dict(parse_qsl(url_parts[4]))
    query.update(parameters)
    url_parts[4] = urlencode(query)
    url = urlunparse(url_parts)
    return url
and the URL returned is: http://192.168.0.100/shopware531/api
I had a similar problem, but with bytes. I had a link like b"https://google.com" and called httplib2.request(str(link)), because the request wants a string instead of bytes. Later, in the debugger, I saw that str() turns b'https://google.com' into the string "b'https://google.com'" (the b prefix and the quotes become part of the text), which causes the KeyError. After using b'https://google.com'.decode('utf-8') it works.
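Both failure modes can be seen in a few lines. This is a minimal sketch, assuming Python 3; the idea that the original URL carried a stray leading space is an assumption based on the key shown in the error (u' http'), and `base_url` is a hypothetical stand-in for the configured value.

```python
link = b"https://google.com"

as_repr = str(link)             # text of the bytes literal, b prefix included
as_text = link.decode("utf-8")  # the actual URL as a str

# Hypothetical base URL with a stray leading space, as the key in
# KeyError: u' http' suggests; strip() removes it before parsing.
base_url = " http://192.168.0.100/shopware531/api"
cleaned = base_url.strip()

print(repr(as_repr))
print(repr(as_text))
print(repr(cleaned))
```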

Python Cherrypy: dictionary not behaving like a dictionary

I'm trying to scrape some data using the Spotify API. The code below works and returns a lot of text when I search for the track name 'if i can't'. The beginning of the output from the API prints on my website and looks like this:
It looks like a dictionary except for the funny b' at the start. Also I can't access it like a dictionary. If I try
return raw_data['info']
it throws an error. Similarly, if I try to find its type (so return type(raw_data) instead of return raw_data), the page comes up blank.
Is there some way to save the output from data.read() in the form of a dictionary? Using
raw_data = ast.literal_eval(raw_data)
throws up an error.
#!/usr/local/bin/python3.2
# -*- coding: utf-8 -*-
import cherrypy
import numpy as np
import urllib.request
class Root(object):
    @cherrypy.expose
    def index(self):
        a_query = Query()
        text = a_query.search()
        return '''<html>
Welcome to Spoti.py! %s
</html>''' %text
class Query():
    def __init__(self):
        self.qstring = '''if i can't'''
    def space_to_plus(self):
        '''takes the instance var qstring
        replaces ' ' with '+'
        -----------------------
        returns nothing'''
        self.qstring = self.qstring.replace(' ', '+')
    def search(self):
        self.space_to_plus()
        url = 'http://ws.spotify.com/search/1/track.json?q=' + self.qstring
        data = urllib.request.urlopen(url)
        raw_data = data.read()
        #return raw_data['info']
        #return type(raw_data)
        return raw_data
cherrypy.config.update({
    'environment': 'production',
    'log.screen': False,
    'server.socket_host': '127.0.0.1',
    'server.socket_port': 15850,
    #'tools.encode.on': True,
    #'tools.encode.encoding': 'utf-8',
})
cherrypy.config.update({'tools.sessions.on': True})
cherrypy.quickstart(Root())
What you have there is a JSON string. The b at the beginning indicates that you are printing a byte string literal. What you have to do is parse the JSON. Decode the bytes first, since json.loads() only accepts str on older Python 3 versions such as 3.2:
import json
...
info_dict = json.loads(raw_data.decode('utf-8'))
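A short, self-contained sketch of that step follows. The payload is a made-up stand-in for what data.read() returns (only the 'info' key comes from the question), and using quote_plus in place of the question's space_to_plus method is an optional suggestion, not part of the answer.

```python
import json
from urllib.parse import quote_plus

# Made-up stand-in for the bytes returned by data.read().
raw_data = b'{"info": {"num_results": 1, "query": "if i can\'t"}}'

# Decode the bytes to text, then parse the JSON into a dict.
info_dict = json.loads(raw_data.decode("utf-8"))
print(info_dict["info"]["query"])

# quote_plus escapes the apostrophe as well as the spaces.
print(quote_plus("if i can't"))
```

Once parsed, info_dict['info'] can be indexed like any other dictionary.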

Convert string to correct charset

I'm trying to save Unicode data to an external web service.
When I try to save æ-ø-å, it gets saved as æ-ø-Ã¥ in the external system.
Edit:
(My firstname value is Jørn) (Value from django J\\xf8rn)
firstname.value=user_firstname = Jørn
Here is my result if I try to use encode:
firstname.value=user_firstname.encode('ascii', 'replace') = J?rn
firstname.value=user_firstname.encode('ascii', 'xmlcharrefreplace') = J&#248;rn
firstname.value=user_firstname.encode('ascii', 'backslashreplace') = J\xf8rn
firstname.value=user_firstname.encode('ascii', 'ignore') = I get a unicode error using ignore.
My form for updating a user:
def show_userform(request):
    if request.method == 'POST':
        form = UserForm(request.POST, request.user)
        if form.is_valid():
            u = UserProfile.objects.get(username = request.user)
            firstname = form.cleaned_data['first_name']
            lastname = form.cleaned_data['last_name']
            tasks.update_webservice.delay(user_firstname=firstname, user_lastname=lastname)
            return HttpResponseRedirect('/thank-you/')
    else:
        form = UserForm(instance=request.user) # An unbound form
    return render(request, 'myapp/form.html', {
        'form': form,
    })
Here is my task:
from suds.client import Client
@task()
def update_webservice(user_firstname, user_lastname):
    membermap = client.factory.create('ns2:Map')
    firstname = client.factory.create('ns2:mapItem')
    firstname.key="Firstname"
    firstname.value=user_firstname
    lastname = client.factory.create('ns2:mapItem')
    lastname.key="Lastname"
    lastname.value=user_lastname
    membermap.item.append(firstname)
    membermap.item.append(lastname)
    d = dict(CustomerId='xxx', Password='xxx', PersonId='xxx', ContactData=membermap)
    try:
        #Send updates to SetPerson function
        result = client.service.SetPerson(**d)
    except WebFault, e:
        print e
What do I need to do to make the data save correctly?
Your external system is interpreting your UTF-8 as if it were Latin-1, or maybe Windows-1252. That's bad.
Encoding or decoding ASCII is not going to help. Your string is definitely not plain ASCII.
If you're lucky, it's just that you're missing some option in that web service's API, with which you could tell it that you're sending it UTF-8.
If not, you've got quite a maintenance headache on your hands, but you can still fix what you get back. The web service took the string you encoded as UTF-8 and decoded it as Latin-1, so you just need to do the exact reverse of that:
user_firstname = user_firstname.encode('latin-1').decode('utf-8')
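The garbling and its reversal can be reproduced in a few lines. This is a minimal sketch, assuming Python 3 string semantics (the question's code is Python 2) and assuming the service really decodes as Latin-1.

```python
name = "Jørn"

# What the web service appears to do: read UTF-8 bytes as Latin-1.
garbled = name.encode("utf-8").decode("latin-1")

# The reverse step from the answer restores the original text.
repaired = garbled.encode("latin-1").decode("utf-8")

print(garbled)
print(repaired)
```

If the service decodes as Windows-1252 instead, the same round trip with the 'cp1252' codec would be the one to try.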
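Use the encode and decode methods: encode turns text into bytes in a given encoding, and decode turns bytes back into text.
For example:
x = "this is a test" # plain ASCII text
x = x.encode("utf-8") # encoded as UTF-8 bytes
x = x.decode("utf-8") # decoded back into text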

Python: issue in displaying as a list httplib.HTTPMessage

I have a small problem here. I am writing some calls for a well-known REST API. Everything is going well, except that I want the whole response to be displayed as a list (which is easier for me to manipulate). My function is this:
import sys, httplib
HOST = "api.sugarsync.com"
API_URL = "https://api.sugarsync.com"
def do_request(xml_location):
    request = open(xml_location,"r").read()
    webservice = httplib.HTTPS(HOST)
    webservice.putrequest("POST", "authorization", API_URL)
    webservice.putheader("Host", HOST)
    webservice.putheader("User-Agent","Python post")
    webservice.putheader("Content-type", "application/xml")
    webservice.putheader("Content-type", "application/xml")
    webservice.putheader("Accept", "*/*")
    webservice.putheader("Content-length", "%d" % len(request))
    webservice.endheaders()
    webservice.send(request)
    statuscode, statusmessage, header = webservice.getreply()
    result = webservice.getfile().read()
    return statuscode, statusmessage, header
    return result
do_request('C://Users/my_user/Documents/auth.xml')
I am used to using split(), but in this case the result is this:
[201, 'Created', <httplib.HTTPMessage instance at 0x0000000001F68AC8>]
I also need the third object (<httplib.HTTPMessage instance at 0x0000000001F68AC8>) to be displayed as a list, so that I can extract some of the data in it.
Thanks in advance!
httplib.HTTPMessage behaves much like a dict. Here is a sample:
import httplib
from cStringIO import StringIO
h = httplib.HTTPMessage(StringIO(""))
h["Content-Type"] = "text/plain"
h["Content-Length"] = "1234"
print h.items()
Just call its items() method; it returns a list of the headers.
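For readers on Python 3, where httplib became http.client, a roughly equivalent sketch is shown below. It assumes Python 3 and builds the message by hand purely for illustration; in real code the object would come from the HTTP response.

```python
from http.client import HTTPMessage

headers = HTTPMessage()
headers["Content-Type"] = "text/plain"
headers["Content-Length"] = "1234"

header_list = headers.items()    # list of (name, value) tuples
header_dict = dict(header_list)  # plain dict for lookups by name

print(header_list)
print(header_dict["Content-Type"])
```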
