I want to perform a simple GET request against a public API using the Python requests package.
I am using requests 2.25.1 and Python 3.6.
Unfortunately, an extra & is prepended to the URL parameters, and I cannot figure out where it comes from. Example code below, with the wrong URL and the correct URL (ampersand removed).
import requests
import json

URL = "https://search.rcsb.org/rcsbsearch/v1/query?json="
JSON = {
    "query": {
        "type": "terminal",
        "service": "text",
        "parameters": {"value": "thymidine kinase"}
    },
    "return_type": "entry"
}

r = requests.get(url=URL, params=json.dumps(JSON, separators=(',', ':')))
r.url then is
https://search.rcsb.org/rcsbsearch/v1/query?json=&%7B%22query%22:%7B%22type%22:%22terminal%22,%22service%22:%22text%22,%22parameters%22:%7B%22value%22:%22thymidine%20kinase%22%7D%7D,%22return_type%22:%22entry%22%7D
which produces a 500 error.
If one changes json=&%7B to json=%7B, the request works. How can I get rid of the extra ampersand? The correct URL:
https://search.rcsb.org/rcsbsearch/v1/query?json=%7B%22query%22:%7B%22type%22:%22terminal%22,%22service%22:%22text%22,%22parameters%22:%7B%22value%22:%22thymidine%20kinase%22%7D%7D,%22return_type%22:%22entry%22%7D
The problem is that you're both using the params keyword argument to requests.get and building the parameter string yourself. Because of the ? in your URL, the underlying URL manipulation code assumes that there are already parameters and appends the new ones with & (as in someurl?param1=foo&param2=bar).
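You can reproduce the joining behavior directly (example.com is just a placeholder here):

import requests

# the trailing '?json=' makes requests treat the query string as non-empty,
# so the string passed via params= is appended after an '&'
req = requests.Request("GET", "https://example.com/query?json=", params="{}").prepare()
print(req.url)  # https://example.com/query?json=&%7B%7D -- note the stray '&'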
Pick one mechanism or the other, e.g.:
import json
import requests

URL = "https://search.rcsb.org/rcsbsearch/v1/query"
JSON = {
    "query": {
        "type": "terminal",
        "service": "text",
        "parameters": {"value": "thymidine kinase"}
    },
    "return_type": "entry"
}

r = requests.get(url=URL, params={'json': json.dumps(JSON, separators=(',', ':'))})
print(r)
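To confirm the fix, inspect the prepared URL and the status code; the JSON now appears percent-encoded as the value of the json parameter, with no stray ampersand:

print(r.url)          # .../query?json=%7B%22query%22... (no '&' after 'json=')
print(r.status_code)  # 200 instead of 500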
When I send a PUT request to my API endpoint from Python with a JSON request body, the server receives an empty request body, because the body sometimes contains special characters that are not supported by JSON.
How can I sanitize my JSON before sending my request?
I've tried stringifying and parsing the JSON before sending my request:
profile = json.loads(json.dumps(profile))
My example invalid json is:
{
    "url": "https://www.example.com/edmund-chand/",
    "name": "Edmund Chand",
    "current_location": "FrankfurtAmMainArea, Germany",
    "education": [],
    "skills": []
}
and my expected validated JSON should be:
{
    "url": "https://www.example.com/edmund-chand/",
    "name": "Edmund Chand",
    "current_location": "Frankfurt Am Main Area, Germany",
    "education": [],
    "skills": []
}
If you're looking for something quick to sanitize JSON data for specific fields, e.g. current_location, you can try something like the following:
def sanitize(profile):
    # split on commas, strip stray whitespace from each part, and rejoin
    profile['current_location'] = ', '.join([val.strip() for val in profile['current_location'].split(',')])
    return profile

profile = sanitize(profile)
The idea here is that you would sanitize each field in that function before sending the data to your API, or raise an exception if a field is invalid, etc.
For more robust validation, you can consider using the jsonschema package. More details here.
With that package you can validate strings and JSON schemas more flexibly.
Example taken from the package readme:
from jsonschema import validate

# A sample schema, like what we'd get from json.load()
schema = {
    "type": "object",
    "properties": {
        "url": {"type": "string", "format": "uri"},
        "current_location": {"type": "string", "maxLength": 25, "pattern": "your_regex_pattern"},
    },
}

# If no exception is raised by validate(), the instance is valid.
validate(instance=profile, schema=schema)
You can find more info on the available validation types for strings here.
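If you want the "raise an exception if invalid" behavior mentioned above, here is a minimal sketch of catching the failure (ValidationError is exported by the jsonschema package):

from jsonschema import validate
from jsonschema.exceptions import ValidationError

try:
    validate(instance=profile, schema=schema)
except ValidationError as err:
    # reject or repair the profile here instead of sending bad data to the API
    print("Invalid profile: %s" % err.message)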
Thank you @Rithin for your solution, but that one seems coupled to a single field of the whole JSON.
I found a solution, shown in the example below, that works for any field:
profile = json.loads(json.dumps(profile).replace("\t", " "))
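If the offending characters can appear in nested fields too, here is a recursive variant (my own sketch, not from the original answers) that normalizes tabs in every string value without a round-trip through a JSON string:

def normalize_strings(obj):
    # walk the structure and replace tab characters in every string value
    if isinstance(obj, str):
        return obj.replace("\t", " ")
    if isinstance(obj, list):
        return [normalize_strings(item) for item in obj]
    if isinstance(obj, dict):
        return {key: normalize_strings(value) for key, value in obj.items()}
    return obj

profile = normalize_strings(profile)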
I am using the following filters in Postman to make a POST request to a Web API, but I am unable to make the same simple POST request in Python with the requests library.
First, I am sending a POST request to this URL (http://10.61.202.98:8081/T/a/api/rows/cat/ect/tickets) with the following filters applied to the Body in Postman, with the raw and JSON (application/json) options selected.
Filters in Postman
{
    "filter": {
        "filters": [
            {
                "field": "RCA_Assigned_Date",
                "operator": "gte",
                "value": "2017-05-31 00:00:00"
            },
            {
                "field": "RCA_Assigned_Date",
                "operator": "lte",
                "value": "2017-06-04 00:00:00"
            },
            {
                "field": "T_Subcategory",
                "operator": "neq",
                "value": "Temporary Degradation"
            },
            {
                "field": "Issue_Status",
                "operator": "neq",
                "value": "Queued"
            }
        ],
        "logic": "and"
    }
}
The database where the data is stored is Cassandra, and according to the following links (Cassandra not equal operator, Cassandra OR operator, Cassandra Between order by operators), Cassandra does not support the NOT EQUAL TO, OR, or BETWEEN operators, so there is no way I can filter the URL with these operators except AND.
Second, I am using the following code to apply a simple filter with the requests library.
import requests
payload = {'field':'T_Subcategory','operator':'neq','value':'Temporary Degradation'}
url = requests.post("http://10.61.202.98:8081/T/a/api/rows/cat/ect/tickets",data=payload)
But what I got was the complete set of tickets instead of only those that are not Temporary Degradation.
Third, the system is actually working, but we are experiencing a delay of 2-3 minutes before the data appears. The logic goes as follows: we have 8 users, and we want to see all the tickets per user that are not Temporary Degradation, so we do:
def get_json():
    if user_name == "user 001":
        with urllib.request.urlopen(
                "http://10.61.202.98:8081/T/a/api/rows/cat/ect/tickets?user_name=user&001",
                timeout=15) as url:
            complete_data = json.loads(url.read().decode())
    elif user_name == "user 002":
        with urllib.request.urlopen(
                "http://10.61.202.98:8081/T/a/api/rows/cat/ect/tickets?user_name=user&002",
                timeout=15) as url:
            complete_data = json.loads(url.read().decode())
    return complete_data
def get_tickets_not_temp_degradation(start_date, end_date, complete_data):
    return Counter([k['user_name'] for k in complete_data
                    if start_date < dateutil.parser.parse(k.get('DateTime')) < end_date
                    and k['T_subcategory'] != 'Temporary Degradation'])
Basically, we get the whole set of tickets from the current and previous year, and then we let Python filter the complete set by user; so far there are only 10 users, which means this process is repeated 10 times, so it is no surprise that we see the delay.
My question is: how can I fix this problem with the requests library? I am using the following link, Requests library documentation, as a tutorial to get it working, but it just seems that my payload is not being read.
Your Postman request is a JSON body. Just reproduce that same body in Python; your Python code is sending neither JSON nor the same data as your Postman sample.
For starters, sending a dictionary via the data argument encodes that dictionary as application/x-www-form-urlencoded form data, not JSON. Secondly, you appear to be sending only a single filter.
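You can see the difference by preparing both kinds of request (example.com is just a placeholder here):

import requests

payload = {'field': 'T_Subcategory', 'operator': 'neq', 'value': 'Temporary Degradation'}
form_req = requests.Request("POST", "http://example.com", data=payload).prepare()
json_req = requests.Request("POST", "http://example.com", json=payload).prepare()

print(form_req.headers["Content-Type"])  # application/x-www-form-urlencoded
print(form_req.body)                     # field=T_Subcategory&operator=neq&value=Temporary+Degradation
print(json_req.headers["Content-Type"])  # application/json
print(json_req.body)                     # b'{"field": "T_Subcategory", "operator": "neq", ...}'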
The following code replicates your Postman post exactly:
import requests

filters = {
    "filter": {
        "filters": [{
            "field": "RCA_Assigned_Date",
            "operator": "gte",
            "value": "2017-05-31 00:00:00"
        }, {
            "field": "RCA_Assigned_Date",
            "operator": "lte",
            "value": "2017-06-04 00:00:00"
        }, {
            "field": "T_Subcategory",
            "operator": "neq",
            "value": "Temporary Degradation"
        }, {
            "field": "Issue_Status",
            "operator": "neq",
            "value": "Queued"
        }],
        "logic": "and"
    }
}

url = "http://10.61.202.98:8081/T/a/api/rows/cat/ect/tickets"
response = requests.post(url, json=filters)
Note that filters is a Python data structure here, and that it is passed to the json keyword argument. Using the latter does two things:
Encode the Python data structure to JSON (producing the exact same JSON value as your raw Postman body value).
Set the Content-Type header to application/json (as you did in your Postman configuration by picking the JSON option in the dropdown menu after picking raw for the body).
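After the POST, you can check for errors and decode the JSON reply in the usual way:

response.raise_for_status()  # raises an HTTPError for 4xx/5xx responses
tickets = response.json()    # decodes the JSON response body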
requests is otherwise just an HTTP API; it can't make Cassandra do any more than any other HTTP library can. The urllib.request.urlopen code sends GET requests, which are trivially translated to requests with:
def get_json():
    url = "http://10.61.202.98:8081/T/a/api/rows/cat/ect/tickets"
    response = requests.get(url, params={'user_name': user}, timeout=15)
    return response.json()
I removed the if branching and replaced that with using the params argument, which translates a dictionary of key-value pairs to a correctly encoded URL query (passing in the user name as the user_name key).
Note the json() call on the response; this takes care of decoding the JSON data coming back from the server. This will still take a long time, though, because you are not filtering the Cassandra data down much here.
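If the endpoint accepts a user_name filter in the same POST body as the other filters (an assumption on my part; check your API), you could push the per-user filtering to the server instead of fetching everything:

def get_user_tickets(user_name):
    # hypothetical: assumes the API supports an 'eq' filter on user_name,
    # just like the neq/gte/lte filters shown above
    body = {"filter": {
        "filters": [
            {"field": "user_name", "operator": "eq", "value": user_name},
            {"field": "T_Subcategory", "operator": "neq", "value": "Temporary Degradation"},
        ],
        "logic": "and",
    }}
    url = "http://10.61.202.98:8081/T/a/api/rows/cat/ect/tickets"
    response = requests.post(url, json=body, timeout=15)
    return response.json()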
I would recommend using the json argument instead of data. It handles the dumping for you, and it sets the Content-Type header automatically, so the explicit headers below are optional.
import requests
data = {'user_name':'user&001'}
headers = {'Content-Type': 'application/json', 'Accept': 'application/json'}
url = "http://10.61.202.98:8081/T/a/api/rows/cat/ect/tickets/"
r = requests.post(url, headers=headers, json=data)
Update, answer for question 3: is there a reason you are using urllib? I'd use Python requests for this request as well.
import requests

def get_json():
    r = requests.get("http://10.61.202.98:8081/T/a/api/rows/cat/ect/tickets",
                     params={"user_name": user_name.replace(" ", "&")})
    return r.json()

# not sure what you're doing here, more context/code example would help
def get_tickets_not_temp_degradation(start_date, end_date, complete_data):
    return Counter([k['user_name'] for k in complete_data
                    if start_date < dateutil.parser.parse(k.get('DateTime')) < end_date
                    and k['T_subcategory'] != 'Temporary Degradation'])
Also, is the username really supposed to be user+001 and not user&001 or user 001?
I think you can use the requests library as follows:
import requests
import json

payload = {'field': 'T_Subcategory', 'operator': 'neq', 'value': 'Temporary Degradation'}
response = requests.post("http://10.61.202.98:8081/T/a/api/rows/cat/ect/tickets", data=json.dumps(payload))
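Note that data=json.dumps(payload) sends the JSON text but does not set a Content-Type header, which some endpoints will reject; passing the dictionary via json= does both in one step:

response = requests.post("http://10.61.202.98:8081/T/a/api/rows/cat/ect/tickets", json=payload)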
You are sending the user in the URL; send it through the POST body instead, though this depends on how the endpoints are implemented. You can try the code below:
import requests
from json import dumps
data = {'user_name':'user&001'}
headers = {'Content-Type': 'application/json', 'Accept': 'application/json'}
url = "http://10.61.202.98:8081/T/a/api/rows/cat/ect/tickets/"
r = requests.post(url, headers=headers, data=dumps(data))
I have a backend that gives me a JSON response like this:
{
    "compiler": {
        "type": "GCC",
        "version": "5.4"
    },
    "cpu": {
        "architecture": "x86_64",
        "count": 4
    }
}
I need to visualize this response in the form of a tree. What should I do?
Maybe try to transform it into a Django model? Or something else?
If you just want it printed with indentation, the json module can already do this using dumps(), as shown here. Alternatively, you can use pprint.
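For example, with the response above loaded into a dict:

import json
from pprint import pprint

data = {
    "compiler": {"type": "GCC", "version": "5.4"},
    "cpu": {"architecture": "x86_64", "count": 4},
}

print(json.dumps(data, indent=4))  # pretty-printed JSON text, one level per line
pprint(data)                       # pretty-printed Python literal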
I recently migrated from AWS Elasticsearch Service (which used Elasticsearch 1.5.2) to Elastic Cloud (currently using Elasticsearch 5.1.2). I'm glad I did it, but with that change comes a newer version of Elasticsearch and newer APIs, and I'm struggling to get my head around the new way of requesting things. Formerly, I could more or less copy/paste from Kibana's "Elasticsearch Request Body", adjust a few things, run elasticsearch.Elasticsearch.search(), and get what I expected.
Here's my Elasticsearch Request Body from Kibana (for brevity, I removed some of the extraneous stuff that Kibana usually inserts):
{
    "size": 500,
    "sort": [
        {
            "Time.ISO8601": {
                "order": "desc",
                "unmapped_type": "boolean"
            }
        }
    ],
    "query": {
        "bool": {
            "must": [
                {
                    "query_string": {
                        "query": "Message\\ ID: 2003",
                        "analyze_wildcard": true
                    }
                },
                {
                    "range": {
                        "Time.ISO8601": {
                            "gte": 1484355455678,
                            "lte": 1484359055678,
                            "format": "epoch_millis"
                        }
                    }
                }
            ],
            "must_not": []
        }
    },
    "stored_fields": [
        "*"
    ],
    "script_fields": {}
}
Now I want to use elasticsearch-dsl to do it, since that seems to be the recommended method (instead of using elasticsearch-py). How would I translate the above into elasticsearch-dsl?
Here's what I have so far:
from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search, Q

client = Elasticsearch(
    hosts=['HASH.REGION.aws.found.io/elasticsearch'],
    use_ssl=True,
    port=443,
    http_auth=('USER', 'PASS')
)

s = Search(using=client, index="emp*")
s = s.query("query_string", query="Message\ ID:2003", analyze_wildcards=True)
s = s.query("range", **{"Time.ISO8601": {"gte": 1484355455678, "lte": 1484359055678, "format": "epoch_millis"}})
s = s.sort("Time.ISO8601")
response = s.execute()

for hit in response:
    print '%s %s' % (hit['Time']['ISO8601'], hit['Message ID'])
My code as written above is not giving me what I expect: I'm getting results that include items that don't match "Message\ ID:2003", and it's also giving me results outside the requested range of Time.ISO8601.
Totally new to elasticsearch-dsl and ES 5.1.2's way of doing things, so I know I've got lots to learn. What am I doing wrong? Thanks in advance for the help!
I don't have Elasticsearch running right now, but the query looks like what you wanted (you can always see the query produced by looking at s.to_dict()), with the exception of escaping the \ sign. In the original query it was escaped, yet in Python the result might be different due to different escaping rules.
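For example, to compare what elasticsearch-dsl will send with your Kibana request body:

import json
print(json.dumps(s.to_dict(), indent=2))  # the exact query body sent to Elasticsearch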
I would strongly advise against having spaces in your fields, and also recommend a more structured query than query_string:
s = Search(using=client, index="emp*")
s = s.filter("term", message_id=2003)
s = s.query("range", Time__ISO8601={"gte": 1484355455678, "lte": 1484359055678, "format": "epoch_millis"})
s = s.sort("Time.ISO8601")
Note that I also changed query() to filter() for a slight speedup and used __ instead of . in the field name keyword argument. elasticsearch-dsl will automatically expand that to ..
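You can then execute the search and loop over the hits exactly as before:

response = s.execute()
for hit in response:
    print(hit['Time']['ISO8601'])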
Hope this helps...
I have written this Python code to retrieve all Facebook pages that are written in Arabic:
import facebook  # pip install facebook-sdk
import json
import codecs
from prettytable import PrettyTable
from collections import Counter

# A helper function to pretty-print Python objects as JSON
def pp(o):
    print json.dumps(o, indent=1)

# Create a connection to the Graph API with your access token
ACCESS_TOKEN = ''  # my access token
g = facebook.GraphAPI(ACCESS_TOKEN)

s = g.request('search', {'q': '&',
                         'type': 'page',
                         'limit': 5000,
                         'locale': 'ar_AR'})
pp(s)
The locale parameter should return all pages written in Arabic. However, as the output below shows, I get results that contain English. What am I doing incorrectly?
{
    "paging": {
        "data": [
            {
                "category": "\u0628\u0636\u0627\u0626\u0639 \u0627\u0644\u0628\u064a\u0639 \u0628\u0627\u0644\u062a\u062c\u0632\u0626\u0629 \u0648\u0628\u0636\u0627\u0626\u0639 \u0627\u0644\u0645\u0633\u062a\u0647\u0644\u0643\u064a\u0646",
                "name": "Stop & Shop",
                "category_list": [
                    {
                        "id": "169207329791658",
                        "name": "\u0645\u062d\u0644 \u0628\u0642\u0627\u0644\u0629"
                    }
                ],
                "id": "170000993071234"
            },
            {
                "category": "\u0628\u0636\u0627\u0626\u0639 \u0627\u0644\u0628\u064a\u0639 \u0628\u0627\u0644\u062a\u062c\u0632\u0626\u0629 \u0648\u0628\u0636\u0627\u0626\u0639 \u0627\u0644\u0645\u0633\u062a\u0647\u0644\u0643\u064a\u0646",
                "name": "C&A",
                "category_list": [
                    {
                        "id": "186230924744328",
                        "name": "\u0645\u062a\u062c\u0631 \u0645\u0644\u0627\u0628\u0633"
                    }
                ],
                "id": "109345009145382"
            },
Your query is 100% correct and should only return Arabic posts. Unfortunately, this is a known Facebook Graph Search API bug. It looks like it flips back and forth from working to not working.
See the discussions, https://developers.facebook.com/bugs/294623187324442 and https://developers.facebook.com/bugs/409365862525282
I had similar issues working with the Facebook Graph API; it never seems to work quite right.
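Until the bug is fixed, one workaround (a sketch of my own; adjust the lookup path to the actual shape of your response) is to filter the returned pages client-side for Arabic text:

import re

ARABIC = re.compile(u'[\u0600-\u06FF]')  # basic Arabic Unicode block

# the output above shows the pages nested under 'paging' -> 'data';
# adjust this path if your response differs
pages = s.get('paging', {}).get('data', [])
arabic_pages = [page for page in pages if ARABIC.search(page.get('name', u''))]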