I am extracting a postgres table as json. The output file contains lines like:
{"data": {"test": 1, "hello": "I have \" !"}, "id": 4}
Now I need to load them in my python code using json.loads, but I get this error:
Traceback (most recent call last):
File "test.py", line 33, in <module>
print json.loads('''{"id": 4, "data": {"test": 1, "hello": "I have \" !"}}''')
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 365, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 381, in raw_decode
obj, end = self.scan_once(s, idx)
ValueError: Expecting , delimiter: line 1 column 50 (char 49)
I figured out the fix is to add another \ to \". So, if I pass
{"data": {"test": 1, "hello": "I have \\" !"}, "id": 4}
to json.loads, I get this:
{u'data': {u'test': 1, u'hello': u'I have " !'}, u'id': 4}
Is there a way to do this without adding the extra \? Like passing a parameter to json.loads or something?
You can use so-called “raw strings”:
>>> print r'{"data": {"test": 1, "hello": "I have \" !"}, "id": 4}'
{"data": {"test": 1, "hello": "I have \" !"}, "id": 4}
Raw strings leave backslashes untouched.
Regular string literals turn \" into ", so that you can put " characters inside strings that are themselves delimited by double quotes:
>>> "foo\"bar"
'foo"bar'
So the transformation from \" to " is not done by json.loads, but by Python itself.
Try this:
json.loads(r'{"data": {"test": 1, "hello": "I have \" !"}, "id": 4}')
If the string is already in a variable (for example, a line read from the exported file), the backslash is still there and no replacement is needed:
json.loads(data)
Hope it helps!
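To make the point about literals concrete, here is a small Python 3 sketch. It compares a raw literal with a regular one and then parses a line as it would arrive from a file. The io.StringIO object is only a stand-in for the exported file, which is an assumption for illustration.

```python
import io
import json

# A raw literal keeps the backslash, so the JSON parser sees \" as an escaped quote.
raw = r'{"data": {"test": 1, "hello": "I have \" !"}, "id": 4}'
parsed = json.loads(raw)
print(parsed["data"]["hello"])

# A regular literal loses the backslash before json.loads ever runs.
plain = '{"data": {"test": 1, "hello": "I have \" !"}, "id": 4}'
print(len(raw) - len(plain))  # the regular literal is one character shorter

try:
    json.loads(plain)
    plain_parses = True
except ValueError:
    plain_parses = False

# Text read from a file is never processed as a Python literal,
# so each exported line can be passed to json.loads unchanged.
fake_file = io.StringIO(raw + "\n")
rows = [json.loads(line) for line in fake_file]
print(rows[0]["id"])
```

The escaping problem therefore only shows up when the JSON is pasted into source code; data read at run time should not need any backslash handling.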
Try a raw triple-quoted string (r"""..."""); then you do not have to worry about escaping the backslashes.
json_string = r"""
{
"jsonObj": []
}
"""
data = json.loads(json_string)
Try source.replace('""', '') (or the re.sub equivalent), because a doubled "" in the source makes it impossible for json.loads(source) to tell where a string ends.
In my case, I wrote:
STRING.replace("': '", '": "').replace("', '", '", "').replace("{'", '{"').replace("'}", '"}').replace("': \"", '": "').replace("', \"", '", "').replace("\", '", '", "').replace("'", '\\"')
and it works like a charm.
I have the following string I need to deserialize:
{
"bw": 20,
"center_freq": 2437,
"channel": 6,
"essid": "DIRECT-sB47" Philips 6198",
"freq": 2437
}
This is almost a correct JSON, except for the quote in the value DIRECT-sB47" Philips 6198 which prematurely ends the string, breaking the rest of the JSON.
Is there a way to deserialize elements which have the pattern
"key": "something which includes a quote",
or should I try to first pre-process the string with a regex to remove that quote (I do not care about it, nor about any other weird characters in the keys or values)?
UPDATE: sorry for not posting the code (it is a standard deserialization via json). The code is also available at repl.it
import json
data = '''
{
"bw": 20,
"center_freq": 2437,
"channel": 6,
"essid": "DIRECT-sB47" Philips 6198",
"freq": 2437
}
'''
trans = json.loads(data)
print(trans)
The traceback:
Traceback (most recent call last):
File "main.py", line 12, in <module>
trans = json.loads(data)
File "/usr/local/lib/python3.6/json/__init__.py", line 354, in loads
return _default_decoder.decode(s)
File "/usr/local/lib/python3.6/json/decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/lib/python3.6/json/decoder.py", line 355, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 6 column 26 (char 79)
The same code without the quote works fine
import json
data = '''
{
"bw": 20,
"center_freq": 2437,
"channel": 6,
"essid": "DIRECT-sB47 Philips 6198",
"freq": 2437
}
'''
trans = json.loads(data)
print(trans)
COMMENT: I realize that the provider of the JSON should fix their code (I opened a bug report with them). In the meantime, until the bug is fixed (if it is) I would like to try a workaround.
I ended up analyzing the exception which includes the place of the faulty character, removing it and deserializing again (in a loop).
Worst case the whole data string is swallowed, which in my case is better than crashing.
import json
import re
data = '''
{
"bw": 20,
"center_freq": 2437,
"channel": 6,
"essid": "DIRECT-sB47" Philips 6198",
"freq": 2437
}
'''
while True:
    try:
        trans = json.loads(data)
    except json.decoder.JSONDecodeError as e:
        s = int(re.search(r"\.*char (\d+)", str(e)).group(1))-2
        print(f"incorrect character at position {s}, removing")
        data = data[:s] + data[(s + 1):]
    else:
        break
print(trans)
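A bounded variant of the same workaround could look like the sketch below. It is not the code from the answer: instead of parsing the error message and subtracting a fixed offset, it reads the failure index from the exception's pos attribute and deletes the nearest double quote before it. The function name and the max_attempts parameter are hypothetical, and the heuristic assumes the stray character is always a double quote.

```python
import json


def loads_dropping_stray_quotes(text, max_attempts=10):
    """Parse text as JSON, deleting the quote just before each failure position."""
    for _ in range(max_attempts):
        try:
            return json.loads(text)
        except json.JSONDecodeError as exc:
            # exc.pos is the index where parsing failed; assume the culprit
            # is the closest double quote before that index.
            bad = text.rfind('"', 0, exc.pos)
            if bad == -1:
                raise  # nothing left that this heuristic can repair
            text = text[:bad] + text[bad + 1:]
    raise ValueError("gave up after %d parse attempts" % max_attempts)


data = '{"channel": 6, "essid": "DIRECT-sB47" Philips 6198", "freq": 2437}'
trans = loads_dropping_stray_quotes(data)
print(trans["essid"])
```

Capping the number of attempts avoids an endless loop on input the heuristic cannot fix. As with the original loop, a repair can silently change the data, so this is only reasonable as a stopgap until the producer emits valid JSON.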
I'm having an issue I can't understand. I am sending JSON data as a string via Redis (used as a queue), and the receiver throws the following error:
[ERROR JSON (in queue)] - {"ip": null, "domain": "somedomain.com", "name": "Some user name", "contact_id": 12345, "signature":
"6f496a4eaba2c1ea4e371ea2c4951ad92f41ddf45ff4949ffa761b0648a22e38"} => end is out of bounds
The code that throws the exception is the following :
try:
    item = json.loads(item[1])
except ValueError as e:
    sys.stderr.write("[ERROR JSON (in queue)] - {1} => {0}\n".format(str(e), str(item)))
    return None
What is really odd, is that if I open a python console and do the following :
>>> import json
>>> s = '{"ip": null, "domain": "somedomain.com", "name": "Some user name", "contact_id": 12345, "signature": "6f496a4eaba2c1ea4e371ea2c4951ad92f41ddf45ff4949ffa761b0648a22e38"}'
>>> print s
I have no issue: the string (copied and pasted into the Python console) yields no errors at all, but my original code throws one!
Do you have any idea about what is causing the issue?
You are loading item[1], which is the second character of the string item:
>>> json.loads('"')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 365, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 381, in raw_decode
obj, end = self.scan_once(s, idx)
ValueError: end is out of bounds
You should write:
item = json.loads(item)
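The difference between indexing the string and parsing it whole can be checked with a short Python 3 sketch. The payload below is a shortened, made-up version of the message in the question, and the exact error text differs between Python versions, so it is not reproduced here.

```python
import json

payload = '{"ip": null, "domain": "somedomain.com", "contact_id": 12345}'

# Indexing a str returns a single character, here the opening quote of "ip".
second_char = payload[1]
print(repr(second_char))

try:
    json.loads(second_char)
    single_char_parses = True
except ValueError:
    # A lone double quote is not a complete JSON document.
    single_char_parses = False

# Passing the whole string works as expected.
item = json.loads(payload)
print(item["domain"])
```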
I have the following strings:
json_string = '{u"favorited": false, u"contributors": null}'
json_string1 = '{"favorited": false, "contributors": null}'
The following json.loads call works fine:
json.loads(json_string1)
But the following call gives me a ValueError. How can I fix this?
json.loads(json_string)
ValueError: Expecting property name: line 1 column 2 (char 1)
I faced the same problem with strings I received from a customer. The strings arrived with u's. I found a workaround using the ast package:
import ast
import json
my_str='{u"favorited": false, u"contributors": null}'
my_str=my_str.replace('"',"'")
my_str=my_str.replace(': false',': False')
my_str=my_str.replace(': null',': None')
my_str = ast.literal_eval(my_str)
my_dumps=json.dumps(my_str)
my_json=json.loads(my_dumps)
Note the replacement of "false" and "null" by "False" and "None": literal_eval only recognizes Python literal structures. This means that you may need more replacements in your code (for example "true" to "True"), depending on the strings you receive.
You could remove the u prefix from the strings with a regular expression and then load the JSON:
import re
s = '{u"favorited": false, u"contributors": null}'
json_string = re.sub(r'(\W)\s*u"', r'\1"', s)
json.loads(json_string)
Use json.dumps to convert a Python dictionary to a string, not str. Then you can expect json.loads to work:
Incorrect:
>>> D = {u"favorited": False, u"contributors": None}
>>> s = str(D)
>>> s
"{u'favorited': False, u'contributors': None}"
>>> json.loads(s)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "D:\dev\Python27\lib\json\__init__.py", line 339, in loads
return _default_decoder.decode(s)
File "D:\dev\Python27\lib\json\decoder.py", line 364, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "D:\dev\Python27\lib\json\decoder.py", line 380, in raw_decode
obj, end = self.scan_once(s, idx)
ValueError: Expecting property name: line 1 column 2 (char 1)
Correct:
>>> D = {u"favorited": False, u"contributors": None}
>>> s = json.dumps(D)
>>> s
'{"favorited": false, "contributors": null}'
>>> json.loads(s)
{u'favorited': False, u'contributors': None}
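The same contrast in Python 3, as a minimal sketch: str() produces a Python literal, json.dumps() produces JSON, and only the latter is accepted by json.loads. The ast.literal_eval call at the end is an addition for the case where you are stuck with text that was already written via str(); it is not part of the answer above.

```python
import ast
import json

record = {"favorited": False, "contributors": None}

as_repr = str(record)         # Python literal syntax
as_json = json.dumps(record)  # JSON syntax
print(as_repr)   # {'favorited': False, 'contributors': None}
print(as_json)   # {"favorited": false, "contributors": null}

try:
    json.loads(as_repr)
    repr_is_json = True
except ValueError:
    # Single quotes, False and None are not valid JSON.
    repr_is_json = False

round_tripped = json.loads(as_json)

# Text produced by str() can still be read back as a Python literal.
recovered = ast.literal_eval(as_repr)
```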
I am doing this in order to read a file
f = subprocess.Popen(["../../../abc/def/run_script.sh", "cat", "def/data/ex/details.json"], stdout=subprocess.PIPE)
out = f.stdout.readline()
The contents of the file being read look like this:
{
"def1": {
"val1": 31.6, "val2" : 10
},
"def2": {
"9": {
"val1": 20.1, "val2": 22
}
}
}
How should I go about it? When there was only one "val1" and one "val2", I did a simple regex search and saved the values. Now that there are two of each, I need to know which one I am dealing with. Is there an easy way out?
File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python2.7/json/decoder.py", line 382, in raw_decode
obj, end = self.scan_once(s, idx)
ValueError: Expecting object: line 1 column 3 (char 2)
Use the json module to parse JSON.
Python 3 code:
import json
import subprocess
with subprocess.Popen(["cat", "/tmp/foo.json"], stdout=subprocess.PIPE) as f:
    j = f.stdout.read()
o = json.loads(j.decode("UTF-8"))
print(o)
print(o["def1"]["val1"])
print(o["def2"]["9"]["val1"])
Output:
{'def1': {'val1': 31.6, 'val2': 10}, 'def2': {'9': {'val1': 20.1, 'val2': 22}}}
31.6
20.1
Edit:
For Python 2, use this instead.
import json
import subprocess
f = subprocess.Popen(["cat", "/tmp/foo.json"], stdout=subprocess.PIPE)
j = f.stdout.read()
o = json.loads(j.decode("UTF-8"))
print o
print o["def1"]["val1"]
print o["def2"]["9"]["val1"]
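The traceback in the question is what you get when only the first line of a multi-line document is parsed: readline() returns just the opening brace, while read() returns everything. The Python 3 sketch below shows the difference with subprocess.run. The child process is a stand-in for the real script (an assumption made so the example runs anywhere): it simply prints the same data as indented JSON.

```python
import json
import subprocess
import sys

document = {"def1": {"val1": 31.6, "val2": 10},
            "def2": {"9": {"val1": 20.1, "val2": 22}}}

# Stand-in for the real command: a child process that prints multi-line JSON.
child_code = "import json; print(json.dumps(%r, indent=2))" % (document,)
result = subprocess.run([sys.executable, "-c", child_code],
                        stdout=subprocess.PIPE, check=True)
text = result.stdout.decode("utf-8")

# What readline() would have returned: only the opening brace.
first_line = text.splitlines()[0]
try:
    json.loads(first_line)
    first_line_parses = True
except ValueError:
    first_line_parses = False

# Parsing the complete output gives nested dictionaries to index into.
parsed = json.loads(text)
print(parsed["def1"]["val1"], parsed["def2"]["9"]["val1"])
```

Once the data is a dictionary, the two "val1" entries are distinguished by their path, so no regular expression is needed.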
I am trying to load a JSON file that I created by copying the content of another JSON file, but I keep getting ValueError: Expecting property name: line 1 column 1 (char 1) when I read the data back from the copy. My JSON data has this format:
{
"server": {
"ipaddress": "IP_Sample",
"name": "name_Sample",
"type": "type_Sample",
"label": "label_Sample",
"keyword": "kwd_Sample",
"uid": "uid_Sample",
"start_time": "start_Sample",
"stop_time": "stop_Sample"
}
}
And my load and write methods are
def load(self, filename):
    inputfile = open(filename,'r')
    self.data = json.loads(inputfile.read())
    print (self.data)
    inputfile.close()
    return
def write(self, filename):
    file = open(filename, "w")
    tempObject = self.data
    print type(tempObject)
    #json.dump(filename, self.data)
    print self.data["server"]
    print >> file, self.data
    file.close()
    return
I cannot figure out where I am going wrong, can anybody help me with that..
To save and load JSON to and from a file, use an open file object. Your commented-out json.dump(filename, self.data) call has its arguments swapped: json.dump takes the object first and an open file object second, and a filename is not a file object.
The following code works:
def write(self, filename):
    with open(filename, 'w') as output:
        json.dump(self.data, output)
def load(self, filename):
    with open(filename, 'r') as input:
        self.data = json.load(input)
I use the open files as context managers, to ensure they are closed when done reading or writing.
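A self-contained round trip can be sketched as follows in Python 3. The Store class is a hypothetical stand-in for whatever class owns load() and write() in the question, and the temporary directory is used only so the example cleans up after itself.

```python
import json
import os
import tempfile


class Store:
    """Hypothetical stand-in for the class that owns load() and write()."""

    def __init__(self, data=None):
        self.data = data

    def write(self, filename):
        # json.dump takes the object first, then an open file object.
        with open(filename, "w") as output:
            json.dump(self.data, output, indent=2)

    def load(self, filename):
        with open(filename) as source:
            self.data = json.load(source)


original = Store({"server": {"ipaddress": "IP_Sample", "name": "name_Sample"}})

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "server.json")
    original.write(path)
    restored = Store()
    restored.load(path)

print(restored.data == original.data)
```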
Your other attempt, print >> file, self.data, simply prints the python representation to the file, not JSON:
>>> print example
{u'server': {u'uid': u'uid_Sample', u'keyword': u'kwd_Sample', u'ipaddress': u'IP_Sample', u'start_time': u'start_Sample', u'label': u'label_Sample', u'stop_time': u'stop_Sample', u'type': u'type_Sample', u'name': u'name_Sample'}}
which, when read back from the file would give the error message you indicated:
>>> json.loads("{u'server': {u'uid': u'uid_Sample', u'keyword': u'kwd_Sample', u'ipaddress': u'IP_Sample', u'start_time': u'start_Sample', u'label': u'label_Sample', u'stop_time': u'stop_Sample', u'type': u'type_Sample', u'name': u'name_Sample'}}")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/__init__.py", line 307, in loads
return _default_decoder.decode(s)
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/decoder.py", line 319, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/decoder.py", line 336, in raw_decode
obj, end = self._scanner.iterscan(s, **kw).next()
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/scanner.py", line 55, in iterscan
rval, next_pos = action(m, context)
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/decoder.py", line 171, in JSONObject
raise ValueError(errmsg("Expecting property name", s, end))
ValueError: Expecting property name: line 1 column 1 (char 1)
You'd have to print the json.dumps() output instead:
>>> print json.dumps(example)
{"server": {"uid": "uid_Sample", "keyword": "kwd_Sample", "ipaddress": "IP_Sample", "start_time": "start_Sample", "label": "label_Sample", "stop_time": "stop_Sample", "type": "type_Sample", "name": "name_Sample"}}