I have the following string I need to deserialize:
{
"bw": 20,
"center_freq": 2437,
"channel": 6,
"essid": "DIRECT-sB47" Philips 6198",
"freq": 2437
}
This is almost valid JSON, except for the quote in the value DIRECT-sB47" Philips 6198, which prematurely ends the string and breaks the rest of the document.
Is there a way to deserialize elements which have the pattern
"key": "something which includes a quote",
or should I try to first pre-process the string with a regex to remove that quote (I do not care about it, nor about any other weird characters in the keys or values)?
UPDATE: sorry for not posting the code (it is a standard deserialization via json). The code is also available at repl.it
import json
data = '''
{
"bw": 20,
"center_freq": 2437,
"channel": 6,
"essid": "DIRECT-sB47" Philips 6198",
"freq": 2437
}
'''
trans = json.loads(data)
print(trans)
The traceback:
Traceback (most recent call last):
File "main.py", line 12, in <module>
trans = json.loads(data)
File "/usr/local/lib/python3.6/json/__init__.py", line 354, in loads
return _default_decoder.decode(s)
File "/usr/local/lib/python3.6/json/decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/lib/python3.6/json/decoder.py", line 355, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 6 column 26 (char 79)
The same code without the quote works fine
import json
data = '''
{
"bw": 20,
"center_freq": 2437,
"channel": 6,
"essid": "DIRECT-sB47 Philips 6198",
"freq": 2437
}
'''
trans = json.loads(data)
print(trans)
COMMENT: I realize that the provider of the JSON should fix their code (I opened a bug report with them). In the meantime, until the bug is fixed (if it is) I would like to try a workaround.
I ended up analyzing the exception which includes the place of the faulty character, removing it and deserializing again (in a loop).
Worst case the whole data string is swallowed, which in my case is better than crashing.
import json
data = '''
{
"bw": 20,
"center_freq": 2437,
"channel": 6,
"essid": "DIRECT-sB47" Philips 6198",
"freq": 2437
}
'''
while True:
    try:
        trans = json.loads(data)
    except json.decoder.JSONDecodeError as e:
        # e.pos is the index where parsing failed; in this sample the
        # stray quote sits two characters before it
        s = e.pos - 2
        print(f"incorrect character at position {s}, removing")
        data = data[:s] + data[(s + 1):]
    else:
        break
print(trans)
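As for the pre-processing route mentioned in the question, here is a rough sketch rather than a general repair tool. It assumes each string-valued pair sits on its own line in the form "key": "value", as in the sample, and escapes any quote found inside the value. The helper name escape_inner_quotes and the pattern are made up for this illustration.

```python
import json
import re

# Assumption: one "key": "value" pair per line, as in the sample above.
# Group 1 is everything up to the opening quote of the value,
# group 2 is the value itself, group 3 is the closing quote and comma.
PAIR_LINE = re.compile(r'^(\s*"[^"]+"\s*:\s*")(.*)("\s*,?\s*)$')

def escape_inner_quotes(text):
    fixed = []
    for line in text.splitlines():
        m = PAIR_LINE.match(line)
        if m:
            head, value, tail = m.groups()
            # Escape only quotes that are not already escaped.
            value = re.sub(r'(?<!\\)"', r'\\"', value)
            line = head + value + tail
        fixed.append(line)
    return "\n".join(fixed)

data = '''
{
"bw": 20,
"center_freq": 2437,
"channel": 6,
"essid": "DIRECT-sB47" Philips 6198",
"freq": 2437
}
'''
trans = json.loads(escape_inner_quotes(data))
print(trans["essid"])
```

Unlike the retry loop, this keeps the quote in the value instead of dropping it, but it will not help if the producer ever emits several pairs on one line.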
I have the code below which should convert a JSON file to a CSV file
import json
import csv
infractions = open("C:\\Users\\Alan\\Downloads\\open.json","r")
infractions_parsed = json.loads(infractions)
infractions_data = infractions_parsed['infractions']
# open a file for writing
csv_data = open('Data.csv', 'w')
# create the csv writer object
csvwriter = csv.writer(csv_data)
count = 0
for inf in infractions_data:
    if count == 0:
        header = inf.keys()
        csvwriter.writerow(header)
        count += 1
    csvwriter.writerow(inf.values())
csv_data.close()
However, I get this error. Any reason why this should be?
C:\Users\Alan\Desktop>python monkeytennis.py
Traceback (most recent call last):
File "monkeytennis.py", line 5, in <module>
infractions_parsed = json.loads(infractions)
File "C:\Python27\lib\json\__init__.py", line 339, in loads
return _default_decoder.decode(s)
File "C:\Python27\lib\json\decoder.py", line 364, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
TypeError: expected string or buffer
JSON is in format:
{
"count": 666,
"query": "righthere",
"infractions": [{
"status": "open",
"severity": 2.0,
"title": "Blah blah blah",
"coals": [1, 1],
"date": "2017-04-22T23:10:07",
"name": "Joe Bloggs"
},...
infractions is a file object, which can't be passed directly to json.loads(). Either read it first:
infractions_parsed = json.loads(infractions.read())
or use json.load (without the 's'), which expects a file-like object rather than a string:
infractions_parsed = json.load(infractions)
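Putting that together, the whole conversion might look roughly like the sketch below. It is written for Python 3 (the question uses 2.7), assumes the "infractions" list is non-empty and that every record has the same keys, and uses made-up sample data in a temporary directory so it can run on its own. The function name json_to_csv is only for illustration.

```python
import csv
import json
import os
import tempfile

def json_to_csv(json_path, csv_path, key="infractions"):
    # json.load reads straight from the open file object.
    with open(json_path, "r", encoding="utf-8") as src:
        records = json.load(src)[key]
    # newline="" is what the csv module expects for files it writes.
    with open(csv_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=list(records[0].keys()))
        writer.writeheader()
        writer.writerows(records)
    return len(records)

# Self-contained demo with made-up data.
sample = {
    "count": 1,
    "infractions": [{"status": "open", "severity": 2.0, "name": "Joe Bloggs"}],
}
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "open.json")
    dst_path = os.path.join(tmp, "Data.csv")
    with open(src_path, "w", encoding="utf-8") as f:
        json.dump(sample, f)
    rows_written = json_to_csv(src_path, dst_path)
    with open(dst_path, newline="", encoding="utf-8") as f:
        csv_rows = list(csv.reader(f))
print(csv_rows)
```

The with blocks close both files even if an exception is raised, so no explicit close() call is needed.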
I have the following strings:
json_string = '{u"favorited": false, u"contributors": null}'
json_string1 = '{"favorited": false, "contributors": null}'
The following json load works fine.
json.loads(json_string1 )
But the following call raises a ValueError. How can I fix this?
json.loads(json_string)
ValueError: Expecting property name: line 1 column 2 (char 1)
I faced the same problem with strings I received from a customer. The strings arrived with u's. I found a workaround using the ast package:
import ast
import json
my_str='{u"favorited": false, u"contributors": null}'
my_str=my_str.replace('"',"'")
my_str=my_str.replace(': false',': False')
my_str=my_str.replace(': null',': None')
my_str = ast.literal_eval(my_str)
my_dumps=json.dumps(my_str)
my_json=json.loads(my_dumps)
Note the replacement of "false" and "null" by "False" and "None", since literal_eval only recognizes Python literals. This means you may need more replacements in your code (for example "true" to "True"), depending on the strings you receive.
You could remove the u prefix from the string using a regex and then load the JSON:
import json
import re
s = '{u"favorited": false, u"contributors": null}'
json_string = re.sub(r'(\W)\s*u"', r'\1"', s)
json.loads(json_string)
Use json.dumps to convert a Python dictionary to a string, not str. Then you can expect json.loads to work:
Incorrect:
>>> D = {u"favorited": False, u"contributors": None}
>>> s = str(D)
>>> s
"{u'favorited': False, u'contributors': None}"
>>> json.loads(s)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "D:\dev\Python27\lib\json\__init__.py", line 339, in loads
return _default_decoder.decode(s)
File "D:\dev\Python27\lib\json\decoder.py", line 364, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "D:\dev\Python27\lib\json\decoder.py", line 380, in raw_decode
obj, end = self.scan_once(s, idx)
ValueError: Expecting property name: line 1 column 2 (char 1)
Correct:
>>> D = {u"favorited": False, u"contributors": None}
>>> s = json.dumps(D)
>>> s
'{"favorited": false, "contributors": null}'
>>> json.loads(s)
{u'favorited': False, u'contributors': None}
I am doing this in order to read a file
f = subprocess.Popen(["../../../abc/def/run_script.sh", "cat", "def/data/ex/details.json"], stdout=subprocess.PIPE)
out = f.stdout.readline()
The contents of the file being read look like this:
{
"def1": {
"val1": 31.6, "val2" : 10
},
"def2": {
"9": {
"val1": 20.1, "val2": 22
}
}
}
How should I go about it? When there was only one "val1" and one "val2", I did a simple regex search and saved the values. Now that there are two, I need to know which one I am dealing with. Is there an easy way out? Passing the output to json.loads fails with:
File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python2.7/json/decoder.py", line 382, in raw_decode
obj, end = self.scan_once(s, idx)
ValueError: Expecting object: line 1 column 3 (char 2)
Use the json module to parse JSON.
Python 3 code:
import json
import subprocess
with subprocess.Popen(["cat", "/tmp/foo.json"], stdout=subprocess.PIPE) as f:
    # read() returns the whole output as bytes
    j = f.stdout.read()
o = json.loads(j.decode("UTF-8"))
print(o)
print(o["def1"]["val1"])
print(o["def2"]["9"]["val1"])
Output:
{'def1': {'val1': 31.6, 'val2': 10}, 'def2': {'9': {'val1': 20.1, 'val2': 22}}}
31.6
20.1
Edit:
For Python 2, use this instead.
import json
import subprocess
f = subprocess.Popen(["cat", "/tmp/foo.json"], stdout=subprocess.PIPE)
j = f.stdout.read()
o = json.loads(j.decode("UTF-8"))
print o
print o["def1"]["val1"]
print o["def2"]["9"]["val1"]
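The key difference from the code in the question is read() versus readline(): readline() returns only the first line of the output, which here is just the opening brace, so the parser fails at char 2. A small sketch of that effect, using io.StringIO as a stand-in for f.stdout (an assumption made only so the example runs without a subprocess):

```python
import io
import json

text = (
    '{\n'
    '"def1": {"val1": 31.6, "val2": 10},\n'
    '"def2": {"9": {"val1": 20.1, "val2": 22}}\n'
    '}\n'
)

stream = io.StringIO(text)  # stands in for f.stdout

# readline() yields only "{\n", which is not a complete JSON document.
first_line = stream.readline()
try:
    json.loads(first_line)
    first_line_ok = True
except ValueError as exc:  # JSONDecodeError is a subclass of ValueError
    first_line_ok = False
    print("readline() only:", exc)

# read() yields the whole document, which parses fine.
stream.seek(0)
parsed = json.loads(stream.read())
print(parsed["def2"]["9"]["val1"])
```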
I have a database with lots of tweets that I crawled using the Twitter API. Those tweets are in a JSON format, which should allow me to convert them into dictionaries, right? So here we go: I'm trying to convert these strings into dictionaries using the json package. But every time I try to run json.loads(string), it gives me an error:
ValueError: Expecting property name: line 1 column 2 (char 1).
Here's an example of the "json strings" I have in my database.
{u"contributors": "None", u"truncated": False, u"text": u"RT #WhoScored: Most clear-cut chances missed at World Cup 2014: M\xfcller / Higua\xedn / Benzema (5), de Vrij / \xd6zil / Ronaldo (4)", u"in_reply_to_status_id": "None", u"id": 487968527395983360, u"favorite_count": 0, u"source": u"Twitter for BlackBerry\xae", u"retweeted": False, u"coordinates": "None", u"entities": {u"user_mentions": [{u"id": 99806132, u"indices": [3, 13], u"id_str": u"99806132", u"screen_name": u"WhoScored", u"name": u"WhoScored.com"}], u"symbols": [], u"trends": [], u"hashtags": [], u"urls": []}, u"in_reply_to_screen_name": "None", u"id_str": u"487968527395983360", u"retweet_count": 0, u"in_reply_to_user_id": "None", u"favorited": False, u"retweeted_status": {u"contributors": "None", u"truncated": False, u"text": u"Most clear-cut chances missed at World Cup 2014: M\xfcller / Higua\xedn / Benzema (5), de Vrij / \xd6zil / Ronaldo (4)", u"in_reply_to_status_id": "None", u"id": 487955847025143808, u"favorite_count": 17, u"source": u"TweetDeck", u"retweeted": False, u"coordinates": "None", u"entities": {u"user_mentions": [], u"symbols": [], u"trends": [], u"hashtags": [], u"urls": []}, u"in_reply_to_screen_name": "None", u"id_str": u"487955847025143808", u"retweet_count": 59, u"in_reply_to_user_id": "None", u"favorited": False, u"user": {u"follow_request_sent": "None", u"profile_use_background_image": True, u"default_profile_image": False, u"id": 99806132, u"verified": True, u"profile_image_url_https": u"https://pbs.twimg.com/profile_images/477005408557486083/9MVR7GdF_normal.jpeg", u"profile_sidebar_fill_color": u"DDEEF6", u"profile_text_color": u"333333", u"followers_count": 425860, u"profile_sidebar_border_color": u"C0DEED", u"id_str": u"99806132", u"profile_background_color": u"272727", u"listed_count": 3245, u"profile_background_image_url_https": u"https://pbs.twimg.com/profile_background_images/439356280/123abc.jpg", u"utc_offset": 3600, u"statuses_count": 24118, u"description": u"The 
largest detailed football statistics website, covering Europe"s top leagues and more. Follow #WSTipster for betting tips. Powered by Opta data.", u"friends_count": 67, u"location": u"London", u"profile_link_color": u"0084B4", u"profile_image_url": u"http://pbs.twimg.com/profile_images/477005408557486083/9MVR7GdF_normal.jpeg", u"following": "None", u"geo_enabled": False, u"profile_banner_url": u"https://pbs.twimg.com/profile_banners/99806132/1402565693", u"profile_background_image_url": u"http://pbs.twimg.com/profile_background_images/439356280/123abc.jpg", u"name": u"WhoScored.com", u"lang": u"en", u"profile_background_tile": False, u"favourites_count": 250, u"screen_name": u"WhoScored", u"notifications": "None", u"url": u"http://whoscored.com", u"created_at": u"Sun Dec 27 23:22:45 +0000 2009", u"contributors_enabled": False, u"time_zone": u"London", u"protected": False, u"default_profile": False, u"is_translator": False}, u"geo": "None", u"in_reply_to_user_id_str": "None", u"possibly_sensitive": False, u"lang": u"ro", u"created_at": u"Sat Jul 12 13:45:14 +0000 2014", u"filter_level": u"low", u"in_reply_to_status_id_str": "None", u"place": "None"}, u"user": {u"follow_request_sent": "None", u"profile_use_background_image": True, u"default_profile_image": False, u"id": 498676612, u"verified": False, u"profile_image_url_https": u"https://pbs.twimg.com/profile_images/485720258934603776/BmUaZHax_normal.jpeg", u"profile_sidebar_fill_color": u"DDEEF6", u"profile_text_color": u"333333", u"followers_count": 192, u"profile_sidebar_border_color": u"C0DEED", u"id_str": u"498676612", u"profile_background_color": u"C0DEED", u"listed_count": 1, u"profile_background_image_url_https": u"https://pbs.twimg.com/profile_background_images/654833637/l9fp65m0xqzsmoneg8pz.jpeg", u"utc_offset": "None", u"statuses_count": 6468, u"description": u"Garut 10 July | Asda islamic school | Farmasi | #judikajude | path : Dicky Darul Majid", u"friends_count": 153, u"location": u"", 
u"profile_link_color": u"B30000", u"profile_image_url": u"http://pbs.twimg.com/profile_images/485720258934603776/BmUaZHax_normal.jpeg", u"following": "None", u"geo_enabled": True, u"profile_banner_url": u"https://pbs.twimg.com/profile_banners/498676612/1404927261", u"profile_background_image_url": u"http://pbs.twimg.com/profile_background_images/654833637/l9fp65m0xqzsmoneg8pz.jpeg", u"name": u"DICKY ", u"lang": u"en", u"profile_background_tile": False, u"favourites_count": 74, u"screen_name": u"DickyDarulMajid", u"notifications": "None", u"url": "None", u"created_at": u"Tue Feb 21 09:21:59 +0000 2012", u"contributors_enabled": False, u"time_zone": "None", u"protected": False, u"default_profile": False, u"is_translator": False}, u"geo": "None", u"in_reply_to_user_id_str": "None", u"possibly_sensitive": False, u"lang": u"ro", u"created_at": u"Sat Jul 12 14:35:37 +0000 2014", u"filter_level": u"medium", u"in_reply_to_status_id_str": "None", u"place": "None"}
And here is the code:
import sys, codecs, json
encode = sys.stdin.encoding
all_entries = Tweet.objects.all()[1427:1435]
for entry in all_entries:
    tweet = entry.tweet.encode(encode)
    json_acceptable_string = tweet.replace("'", "\"")
    json_acceptable_string = json_acceptable_string.replace("None", "\"None\"")
    data = json.loads(json_acceptable_string)
    print data
Traceback:
Traceback (most recent call last):
File "/home/kiko/workspace/SA_WorldCup/main.py", line 6, in <module>
util.tweets_count()
File "/home/kiko/workspace/SA_WorldCup/util/__init__.py", line 25, in tweets_count
data = json.loads(json_acceptable_string, object_hook=JSONObject)
File "/usr/lib/python2.7/json/__init__.py", line 351, in loads
return cls(encoding=encoding, **kw).decode(s)
File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python2.7/json/decoder.py", line 382, in raw_decode
obj, end = self.scan_once(s, idx)
ValueError: Expecting property name: line 1 column 2 (char 1)
I have tried many times without success. Can you help me out? Thanks a lot.
It looks like your "json" is actually Python literal syntax. In that case, it might be easier to ast.literal_eval the string [1].
As a side benefit, you then don't have to do any (possibly sketchy) replacements of None with "None". (Consider a tweet which says "None of your base are belong to us".)
[1] Although it would probably be even better to make sure that proper JSON is being dumped into the database to begin with.
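A minimal sketch of that suggestion, assuming the database holds the untouched repr of a Python dict (before any quote or None replacements were applied). The sample string is shortened and partly made up for illustration, and the code is written for Python 3, which still accepts the u prefix.

```python
import ast
import json

# Assumed raw form: the repr of a dict, with u-prefixed strings and bare None/False.
raw = (
    "{u'contributors': None, u'truncated': False, "
    "u'text': u'None of your base', u'id': 487968527395983360}"
)

# literal_eval only accepts literals (dicts, lists, strings, numbers,
# True/False/None), so it does not execute arbitrary code the way eval does.
tweet = ast.literal_eval(raw)
print(tweet["text"])

# From here on the data can be stored as real JSON.
as_json = json.dumps(tweet)
print(as_json)
```

Note that the word None inside the tweet text survives untouched, which a blanket string replacement would not guarantee.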
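Use eval(), but only on input you trust, since eval() executes arbitrary code (ast.literal_eval is the safe choice for plain literals):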
In [13]: a = "[{'start_city': '1', 'end_city': 'aaa', 'number': 1},\
...: {'start_city': '2', 'end_city': 'bbb', 'number': 1},\
...: {'start_city': '3', 'end_city': 'ccc', 'number': 1}]"
In [14]: eval(a)
Out[14]:
[{'end_city': 'aaa', 'number': 1, 'start_city': '1'},
{'end_city': 'bbb', 'number': 1, 'start_city': '2'},
{'end_city': 'ccc', 'number': 1, 'start_city': '3'}