Converting a JSON string to a dict in Python

I am getting a value in request.body that looks like this:
a = '[data={"vehicle":"rti","action_time":"2015-04-21 14:18"}]'
type(a) == str
I want to convert this str to a dict. I tried:
b = json.loads(a)
But I get this error:
ValueError: No JSON object could be decoded

The data you are receiving is not properly formatted JSON. You're going to have to do some parsing or data transformation before you can convert it using the json module.
If you know that the data always begins with the literal string '[data=' and always ends with the literal string ']', and that the rest of the data is valid json, you can simply strip off the problematic characters:
b = json.loads(a[6:-1])
If the data can't be guaranteed to be in precisely that format, you'll have to learn what the actual format is, and do more intelligent parsing.
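If the wrapper might vary slightly, a more defensive sketch is to check for the expected prefix and suffix before slicing, rather than trusting fixed offsets. The helper name `parse_wrapped_json` and its default `prefix`/`suffix` arguments are illustrative assumptions, not part of any library:

```python
import json

def parse_wrapped_json(raw, prefix="[data=", suffix="]"):
    """Strip a known wrapper and parse the JSON payload inside it."""
    raw = raw.strip()
    if not (raw.startswith(prefix) and raw.endswith(suffix)):
        # Fail loudly instead of slicing off the wrong characters.
        raise ValueError("unexpected wrapper around payload: %r" % raw[:20])
    return json.loads(raw[len(prefix):len(raw) - len(suffix)])

a = '[data={"vehicle":"rti","action_time":"2015-04-21 14:18"}]'
b = parse_wrapped_json(a)
print(b["vehicle"])
```

Input that lacks the wrapper raises a ValueError up front, which is usually easier to debug than a JSON decoding error on a mis-sliced string.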

What you are receiving is not valid JSON.
A valid form of the same data would be:
'{"data":{"vehicle":"rti","action_time":"2015-04-21 14:18"}}'

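import json
a = '[data={"vehicle":"rti","action_time":"2015-04-21 14:18"}]'
# Split the "[data" key from the payload at the first "=" only
key, payload = a.split("=", 1)
key = key.replace("[", "")
payload = payload[:-1]  # drop only the trailing "]"
# Rebuild a valid JSON object: {"data": {...}}
d = '{"%s":%s}' % (key, payload)
dp = json.loads(d)
print(dp)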

Related

how to print after the keyword from python?

I have the following string in Python:
b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a","persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],"name":"waqas"}'
I want to print the text that follows the key "name", so that my output is:
waqas
Note that waqas can change to any other value, so I want to print whatever name follows the key name, using string operations or a regex.
First you need to decode the string, since it is a bytes object (the b prefix). Then use ast.literal_eval to build the dictionary, and access the value by key:
>>> s = b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a","persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],"name":"waqas"}'
>>> import ast
>>> ast.literal_eval(s.decode())['name']
'waqas'
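One caveat with ast.literal_eval: it parses Python literals, not JSON, so it only works here because this particular payload happens to be valid as both. A short sketch of the difference, using a made-up payload (the `active` field is an assumption added for illustration) that contains a JSON `true`:

```python
import ast
import json

payload = b'{"name": "waqas", "active": true}'  # hypothetical payload
text = payload.decode("utf-8")

# json.loads understands JSON's true/false/null.
print(json.loads(text)["name"])

# ast.literal_eval rejects them, because `true` is not a Python literal.
try:
    ast.literal_eval(text)
    literal_eval_ok = True
except (ValueError, SyntaxError):
    literal_eval_ok = False
print(literal_eval_ok)
```

For data that is known to be JSON, json.loads is therefore the safer default.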
It is likely you should be reading your data into your program in a different manner than you are doing now.
If I assume your data is inside a JSON file, try something like the following, using the built-in json module:
import json
with open(filename) as fp:
    data = json.load(fp)
print(data['name'])
If you want a more algorithmic way to extract the value of name:
s = b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a",\
"persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],\
"name":"waqas"}'
s = s.decode("utf-8")
key = '"name":"'
# Index of the first character of the value
start = s.find(key) + len(key)
# Index of the closing quote (searching from start also handles an empty name)
stop = s.find('"', start)
extracted_string = s[start:stop]
print(extracted_string)
Output:
waqas
You can convert the string into a dictionary with json.loads():
import json
mystring = b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a","persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],"name":"waqas"}'
# json.loads accepts bytes directly on Python 3.6+; on older versions, decode first
mydict = json.loads(mystring)
print(mydict["name"])
# Output: waqas
The b prefix is not part of the data: it marks the value as a bytes object, so slicing off the first character would remove the opening brace instead. Decode the bytes to a str first, then parse it. Suppose you have a variable x:
import json
x = x.decode("utf-8")
data = json.loads(x)  # convert the JSON string into a dictionary
print(data["name"])

How to convert string to a list of json in python?

I have a string in the form of
[[sourceId:111, clientId:12345, clientName:testclient, module:test,source:Request, userName:Michelle Jackson],[sourceId:112, clientId:1233, clientName:testclient2, module:test, source:Request, userName:Michelle Jackson]]
How do I convert it into a valid python list of json ?
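Although I never recommend doing this, here's the code (s is the string from the question):
import re
arr = []
for x in s.split('],['):
    # Drop the remaining brackets from this record
    kv = re.sub(r'\[|\]', '', x)
    # Split each "key:value" pair once, trimming the space left after each comma
    arr.append(dict((p.strip() for p in kvi.split(':', 1)) for kvi in kv.split(',')))
NOTE: If the string is system generated, it's better to get it in JSON format in the first place.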
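For a version that tolerates whitespace between records, here is a minimal sketch. The function name `parse_records` is hypothetical, the sample string is shortened from the question, and the code assumes that values never contain a comma or a square bracket:

```python
import re

def parse_records(s):
    """Parse '[[k:v, k:v],[k:v, ...]]' into a list of dicts of strings."""
    records = []
    # Each innermost [...] group is one record.
    for body in re.findall(r"\[([^\[\]]*)\]", s):
        record = {}
        for pair in body.split(","):
            # partition splits at the first ':' only, so values may contain ':'
            key, _, value = pair.partition(":")
            record[key.strip()] = value.strip()
        records.append(record)
    return records

s = ("[[sourceId:111, clientId:12345, userName:Michelle Jackson],"
     "[sourceId:112, clientId:1233, userName:Michelle Jackson]]")
print(parse_records(s))
```

All values come back as strings; passing the result to json.dumps then yields valid JSON.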

python: how do I parse a stream of json arrays with ijson library

The incoming data resembles the following:
[{
"foo": "bar"
}]
[{
"bar": "baz"
}]
[{
"baz": "foo"
}]
As you can see, these are arrays of objects strung together: JSON-ish, but not a single valid JSON document.
ijson is able to handle the first array, and then I get:
ijson.common.JSONError: Additional data
when it hits the subsequent arrays. How do I get around this?
Here's a first cut at the problem that at least has a working regex substitution to turn the full string into valid JSON. It only works if you're OK with reading the full input stream before parsing it.
import json
import re
# Read the whole stream into a single string
raw = ''.join(inputStream)
# e.g. raw == '[{"foo": "bar"}]\n[{"bar": "baz"}]\n[{"baz": "foo"}]'
# Wrap in [] and put a comma between each ] and [, allowing whitespace between them
sanitizedInput = re.sub(r"\]\s*\[", "],[", "[%s]" % raw)
# sanitizedInput == '[[{"foo": "bar"}],[{"bar": "baz"}],[{"baz": "foo"}]]'
# Then parse sanitizedInput
parsed = json.loads(sanitizedInput)
print(parsed)  # => [[{'foo': 'bar'}], [{'bar': 'baz'}], [{'baz': 'foo'}]]
Note: since you've read the whole thing into a string anyway, you can use json instead of ijson.
You can use json.JSONDecoder.raw_decode to walk through the string. Its documentation indeed says:
This can be used to decode a JSON document from a string that may have extraneous data at the end.
The following code sample assumes all the JSON values are in one big string:
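import json

def json_elements(string):
    decoder = json.JSONDecoder()  # raw_decode must be called on an instance
    string = string.lstrip()
    while string:
        try:
            # Returns the decoded value and the index where it ended
            element, position = decoder.raw_decode(string)
        except ValueError:
            break
        yield element
        # raw_decode does not skip leading whitespace, so strip it here
        string = string[position:].lstrip()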
To avoid dealing with raw_decode yourself and to be able to parse a stream chunk by chunk, I would recommend a library I made for this exact purpose: streamcat.
import json
import streamcat

def json_elements(stream):
    decoder = json.JSONDecoder()
    yield from streamcat.stream_to_iterator(stream, decoder)
This works for any concatenation of JSON values regardless of how many white-space characters are used within them or between them.
If you have control over how your input stream is encoded, you may want to consider using line-delimited JSON, which makes parsing easier.
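As a rough illustration of why line-delimited JSON is easier, here is a minimal sketch that parses one JSON value per line using only the standard library. The function name `iter_json_lines` is hypothetical, and `io.StringIO` stands in for the real input stream:

```python
import io
import json

def iter_json_lines(stream):
    """Yield one parsed JSON value per non-empty line of a text stream."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines between records
            yield json.loads(line)

# io.StringIO stands in for the real input stream here.
stream = io.StringIO('[{"foo": "bar"}]\n[{"bar": "baz"}]\n\n[{"baz": "foo"}]\n')
items = list(iter_json_lines(stream))
print(items)
```

Because each record ends at a newline, the reader never has to buffer more than one line or guess where a value stops.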

Python: remove double quotes from JSON dumps

I have a database which returns the rows as lists in following format:
data = ['(1000,"test value",0,0.00,0,0)', '(1001,"Another test value",0,0.00,0,0)']
After that, I use json_str = json.dumps(data) to get a JSON string. After applying json.dumps(), I get the following output:
json_str = ["(1000,\"test value\",0,0.00,0,0)", "(1001,\"Another test value\",0,0.00,0,0)"]
However, I need the JSON string in the following format:
json_str = [(1000,\"test value\",0,0.00,0,0), (1001,\"Another test value\",0,0.00,0,0)]
So basically, I want to remove the surrounding double quotes. I tried to accomplish this with json_str = json_str.strip('"') but this doesn't work. Then, I tried json_str = json_str.replace('"', '') but this also removes the escaped quotes.
Does anybody know a way to accomplish this, or is there a function in Python similar to json.dumps() which produces the same result, but without the surrounding double quotes?
You are dumping a list of strings, so json.dumps does exactly what you ask of it. A rather ugly solution to your problem could be something like this:
import json

def split_and_convert(s):
    # Drop the surrounding parentheses, then split on commas.
    # Note: a plain split breaks if a value itself contains a comma.
    bits = s[1:-1].split(',')
    return (
        int(bits[0]), bits[1], float(bits[2]),
        float(bits[3]), float(bits[4]), float(bits[5])
    )

data_to_dump = [split_and_convert(s) for s in data]
json.dumps(data_to_dump)
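If the quoted values can contain commas, one option is to let the csv module do the splitting. This is a sketch under assumptions: the rows are taken to follow CSV quoting rules, the function name `parse_row` is hypothetical, and the second sample row has been altered to include a comma for illustration:

```python
import csv
import json

def parse_row(s):
    """Turn '(1000,"test value",0,0.00,0,0)' into a list of typed values."""
    fields = next(csv.reader([s[1:-1]]))  # drop the parentheses, then split
    row = []
    for field in fields:
        for cast in (int, float):
            try:
                row.append(cast(field))
                break
            except ValueError:
                continue
        else:
            row.append(field)  # not numeric: keep as text
    return row

data = ['(1000,"test value",0,0.00,0,0)', '(1001,"Another, test value",0,0.00,0,0)']
json_str = json.dumps([parse_row(s) for s in data])
print(json_str)
```

One trade-off: csv discards the quotes, so a quoted field that looks numeric would also be converted to a number.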

Python: json.loads returns items prefixing with 'u'

I'll be receiving a JSON encoded string from Objective-C, and I am decoding a dummy string (for now) like the code below. My output comes out with character 'u' prefixing each item:
[{u'i': u'imap.gmail.com', u'p': u'aaaa'}, {u'i': u'333imap.com', u'p': u'bbbb'}...
How is JSON adding this Unicode character? What's the best way to remove it?
import json
import sys

mail_accounts = []
da = {}
try:
    s = '[{"i":"imap.gmail.com","p":"aaaa"},{"i":"imap.aol.com","p":"bbbb"},{"i":"333imap.com","p":"ccccc"},{"i":"444ap.gmail.com","p":"ddddd"},{"i":"555imap.gmail.com","p":"eee"}]'
    jdata = json.loads(s)
    for d in jdata:
        for key, value in d.iteritems():
            if key not in da:
                da[key] = value
            else:
                da = {}
                da[key] = value
        mail_accounts.append(da)
except Exception, err:
    sys.stderr.write('Exception Error: %s' % str(err))
print mail_accounts
The u- prefix just means that you have a Unicode string. When you really use the string, it won't appear in your data. Don't be thrown by the printed output.
For example, try this:
print mail_accounts[0]["i"]
You won't see a u.
Everything is cool, man. The 'u' is a good thing, it indicates that the string is of type Unicode in python 2.x.
http://docs.python.org/2/howto/unicode.html#the-unicode-type
The d3 print below is the one you are looking for (which is the combination of dumps and loads) :)
Having:
import json
d = """{"Aa": 1, "BB": "blabla", "cc": "False"}"""
d1 = json.loads(d) # Produces a dictionary out of the given string
d2 = json.dumps(d) # Produces a string out of a given dict or string
d3 = json.dumps(json.loads(d)) # 'dumps' gets the dict from 'loads' this time
print "d1: " + str(d1)
print "d2: " + d2
print "d3: " + d3
Prints:
d1: {u'Aa': 1, u'cc': u'False', u'BB': u'blabla'}
d2: "{\"Aa\": 1, \"BB\": \"blabla\", \"cc\": \"False\"}"
d3: {"Aa": 1, "cc": "False", "BB": "blabla"}
The 'u' characters prefixed to the strings signify that they are Unicode strings.
If you want to remove those 'u' prefixes from your object, you can do this:
import json, ast
jdata = ast.literal_eval(json.dumps(jdata)) # Removing uni-code chars
Let's check it in the Python shell:
>>> import json, ast
>>> jdata = [{u'i': u'imap.gmail.com', u'p': u'aaaa'}, {u'i': u'333imap.com', u'p': u'bbbb'}]
>>> jdata = ast.literal_eval(json.dumps(jdata))
>>> jdata
[{'i': 'imap.gmail.com', 'p': 'aaaa'}, {'i': '333imap.com', 'p': 'bbbb'}]
Unicode is an appropriate type here. The JSONDecoder documentation describes the conversion table and states that JSON strings are decoded into unicode objects.
From 18.2.2. Encoders and Decoders:
JSON            Python
==================================
object          dict
array           list
string          unicode
number (int)    int, long
number (real)   float
true            True
false           False
null            None
"encoding determines the encoding used to interpret any str objects decoded by this instance (UTF-8 by default)."
The u prefix means that those strings are unicode rather than 8-bit strings. The best way to not show the u prefix is to switch to Python 3, where strings are unicode by default. If that's not an option, the str constructor will convert from unicode to 8-bit, so simply loop recursively over the result and convert unicode to str. However, it is probably best just to leave the strings as unicode.
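The recursive conversion mentioned above could look roughly like the sketch below. The function name `convert_strings` and its parameters are hypothetical. It applies a caller-supplied function to every text string in a decoded JSON structure, so the same code runs on both Python 2 and Python 3:

```python
def convert_strings(obj, convert, text_type=type(u"")):
    """Return a copy of obj with convert() applied to every text string."""
    if isinstance(obj, text_type):
        return convert(obj)
    if isinstance(obj, dict):
        return {convert_strings(k, convert): convert_strings(v, convert)
                for k, v in obj.items()}
    if isinstance(obj, list):
        return [convert_strings(item, convert) for item in obj]
    return obj  # numbers, booleans and None pass through unchanged

jdata = [{u"i": u"imap.gmail.com", u"p": u"aaaa"}]
# On Python 2 this yields plain 8-bit str objects; on Python 3 it yields bytes.
encoded = convert_strings(jdata, lambda s: s.encode("utf-8"))
print(encoded)
```

On Python 2 the printed result has no u prefixes. As noted above, it is usually still better to leave the strings as unicode.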
I kept running into this problem when trying to capture JSON data in the log with the Python logging library, for debugging and troubleshooting purposes. Getting the u character is a real nuisance when you want to copy the text and paste it into your code somewhere.
As everyone will tell you, this is because it is a Unicode representation, and it could come from the fact that you’ve used json.loads() to load in the data from a string in the first place.
If you want the JSON representation in the log, without the u prefix, the trick is to use json.dumps() before logging it out. For example:
import json
import logging
# Prepare the data
json_data = json.loads('{"key": "value"}')
# Log normally and get the Unicode indicator
logging.warning('data: {}'.format(json_data))
# WARNING:root:data: {u'key': u'value'}
# Dump to a string before logging and get clean output!
logging.warning('data: {}'.format(json.dumps(json_data)))
# WARNING:root:data: {"key": "value"}
Try this, which encodes a single string value (a dict itself has no encode method):
mail_accounts[0]["i"].encode("ascii")
Just replace the u' with a single quote in the printed representation (this is fragile if a value itself contains the text u'):
print(str(mail_accounts).replace("u'", "'"))
