Error in parsing .json file using Python - python

I am getting the following error. What does it mean?
AttributeError: 'bool' object has no attribute 'decode'
in code line : writer.writerow({k:v.decode('utf8') for k,v in dictionary.iteritems()})
My code looks like :
import json
import csv
def make_csv(data):
    fname = "try.csv"
    with open(fname,'wb') as outf:
        dic_list = data['bookmarks']
        dictionary = dic_list[0]
        writer = csv.DictWriter(outf,fieldnames = sorted(dictionary.keys()), restval = "None", extrasaction = 'ignore')
        writer.writeheader()
        for dictionary in dic_list:
            writer.writerow({k:v.decode('utf8') for k,v in dictionary.iteritems()})
    return
def main():
    fil = "readability.json"
    f = open(fil,'rb')
    data = json.loads(f.read())
    print type(data)
    make_csv(data)
The json file looks like :
{ "bookmarks" : [{..},{..} ..... {..}],
"recommendations" : [{..},{..}...{..}]
}
where [..] = list and {..} = dictionary
EDIT :
The above problem was solved, but when I ran the code, the generated CSV file had some discrepancies: some rows ended up under the wrong headers in the .csv file. Any suggestions?

Somewhere in your readability.json file you have an entry that's a boolean value, like true or false (in JSON), translated to the Python True and False objects.
You should not be using decode() in the first place, however, as json.loads() already produces Unicode values for strings.
Since this is Python 2, you want to encode your data to UTF-8 instead. Convert your objects to unicode first:
writer.writerow({
    k: unicode(v).encode('utf8')
    for k, v in dictionary.iteritems()
})
Converting existing Unicode strings to unicode is a no-op, but for integers, floating point values, None and boolean values you'll get a nice Unicode representation that can be encoded to UTF-8:
>>> unicode(True).encode('utf8')
'True'
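For comparison, here is a minimal Python 3 sketch of the same loop, assuming the bookmarks are a list of flat dictionaries. The function name write_bookmarks and the sample data are made up for illustration. In Python 3 the csv module works on text, so there is no encode step, and str() is enough to handle booleans and numbers:

```python
import csv
import io


def write_bookmarks(bookmarks, outf):
    """Write a list of flat dicts to an open text file as CSV."""
    # Use the keys of the first entry as the header, as in the question.
    fieldnames = sorted(bookmarks[0].keys())
    writer = csv.DictWriter(outf, fieldnames=fieldnames,
                            restval="None", extrasaction="ignore")
    writer.writeheader()
    for entry in bookmarks:
        # str() turns True/False, numbers and None into text.
        writer.writerow({k: str(v) for k, v in entry.items()})


# Hypothetical sample data standing in for data['bookmarks'].
sample = [{"title": "Caf\u00e9", "archive": True},
          {"title": "x", "archive": False}]
buf = io.StringIO()
write_bookmarks(sample, buf)
print(buf.getvalue())
```

When writing to a real file, open it with open(fname, "w", newline="", encoding="utf-8") so the csv module controls the line endings.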

Related

How to add string in specific positions in geojson object

I have the GeoJSON posted in the geojson_1 section below. I want to add "geometry":{ and } to it, so that it appears as follows:
{"geometry":{"type":"Polygon","coordinates":[[[1216374.67364018,6563498.44078949],[1216387.86261675,6563523.87797899],[1216397.66970116,6563548.2905649],[1216424.17569103,6563588.32082324],[1216458.19258303,6563622.16452455],[1216498.32084288,6563648.42909789],[1216542.90943577,6563666.03380959],[1216590.12376481,6563674.25425166],[1216638.02117068,6563672.7521636],[1216684.63088244,6563661.58935797],[1216728.03512655,6563641.225175],[1216752.29181681,6563626.67066235],[1216787.17700448,6563601.12371718],[1216816.83970763,6563569.63465531],[1216831.39332728,6563551.03748989],[1216838.2508451,6563541.8226918],[1216897.47283376,6563458.0765492],[1216918.74007329,6563421.44644481],[1216933.156564,6563381.60258193],[1216940.26085904,6563339.82061228],[1216939.82562707,6563297.43819918],[1216931.86491641,6563255.81218836],[1216907.60647856,6563170.91644364],[1216887.20280767,6563121.46139137],[1216856.24799209,6563077.86160203],[1216821.48046704,6563039.0529759],[1216799.23490474,6563017.28929875],[1216753.95673639,6562978.48086898],[1216737.29066155,6562965.4435638],[1216673.22488836,6562919.79826372],[1216644.73178636,6562899.22061724],[1216601.13622245,6562874.31962206],[1216562.32695185,6562857.3410734],[1216556.56069412,6562854.90900462],[1216549.97837146,6562852.23502385],[1216545.77480552,6562849.58841453],[1216504.75306873,6562829.03095075],[1216487.0229317,6562822.21187019],[1216482.65368148,6562820.3796627],[1216478.79578194,6562814.49158384],[1216462.95127963,6562793.04723497],[1216450.44559886,6562777.97698661],[1216448.65520854,6562774.19751598],[1216429.39331353,6562740.84427663],[1216404.99213055,6562711.06486155],[1216382.3528849,6562687.61801865],[1216357.97638417,6562665.64947943],[1216339.38004804,6562651.09634054],[1216299.24217837,6562625.743469],[1216254.86196793,6562608.94292463],[1216208.02902037,6562601.37212065],[1216160.63182011,6562603.33631786],[1216114.58159971,6562614.75632195],[1216071.73529105,6562635.17167463],[1216033.82066317,6562663.7
5921023],[1216002.36666225,6562699.36623124],[1215978.64176027,6562740.55696848],[1215963.60279766,6562785.67045603],[1215957.85638339,6562832.88749107],[1215961.63441126,6562880.30398198],[1215963.25125904,6562889.19806131],[1215964.03213898,6562893.28933659],[1215968.1319511,6562913.79137984],[1215972.03222389,6562937.19708669],[1215977.28745991,6563016.79645952],[1215971.45390521,6563048.0427099],[1215969.39172682,6563061.09023781],[1215963.73069651,6563104.75131293],[1215962.06592533,6563123.22251112],[1215960.00953034,6563163.6421848],[1215954.35640427,6563213.80082903],[1215954.63230125,6563269.34572034],[1215960.29082704,6563315.43307246],[1215970.70253119,6563361.57391952],[1215982.82982632,6563397.95907391],[1216001.20120538,6563439.38870108],[1216027.10825421,6563476.54992802],[1216059.60193364,6563508.08124916],[1216097.4918555,6563532.82739871],[1216139.38989254,6563549.8816932],[1216183.76104002,6563558.61926572],[1216228.97966418,6563558.71997125],[1216273.38907516,6563550.18012236],[1216287.1346647,6563546.13717455],[1216332.92682121,6563527.24586381],[1216373.78586258,6563499.1986745],[1216374.67364018,6563498.44078949]]]}}
To simplify it even more: I want to add "geometry":{ right after the first curly bracket, and the } at the very end.
I attempted the following:
asString = asString[:2] + "geometry:" + asString[2:]
asString = asString[:len(asString)] + "}" + asString[len(asString):]
but I am not getting the expected results.
geojson_1:
{"type":"Polygon","coordinates":[[[1216374.67364018,6563498.44078949],[1216387.86261675,6563523.87797899],[1216397.66970116,6563548.2905649],[1216424.17569103,6563588.32082324],[1216458.19258303,6563622.16452455],[1216498.32084288,6563648.42909789],[1216542.90943577,6563666.03380959],[1216590.12376481,6563674.25425166],[1216638.02117068,6563672.7521636],[1216684.63088244,6563661.58935797],[1216728.03512655,6563641.225175],[1216752.29181681,6563626.67066235],[1216787.17700448,6563601.12371718],[1216816.83970763,6563569.63465531],[1216831.39332728,6563551.03748989],[1216838.2508451,6563541.8226918],[1216897.47283376,6563458.0765492],[1216918.74007329,6563421.44644481],[1216933.156564,6563381.60258193],[1216940.26085904,6563339.82061228],[1216939.82562707,6563297.43819918],[1216931.86491641,6563255.81218836],[1216907.60647856,6563170.91644364],[1216887.20280767,6563121.46139137],[1216856.24799209,6563077.86160203],[1216821.48046704,6563039.0529759],[1216799.23490474,6563017.28929875],[1216753.95673639,6562978.48086898],[1216737.29066155,6562965.4435638],[1216673.22488836,6562919.79826372],[1216644.73178636,6562899.22061724],[1216601.13622245,6562874.31962206],[1216562.32695185,6562857.3410734],[1216556.56069412,6562854.90900462],[1216549.97837146,6562852.23502385],[1216545.77480552,6562849.58841453],[1216504.75306873,6562829.03095075],[1216487.0229317,6562822.21187019],[1216482.65368148,6562820.3796627],[1216478.79578194,6562814.49158384],[1216462.95127963,6562793.04723497],[1216450.44559886,6562777.97698661],[1216448.65520854,6562774.19751598],[1216429.39331353,6562740.84427663],[1216404.99213055,6562711.06486155],[1216382.3528849,6562687.61801865],[1216357.97638417,6562665.64947943],[1216339.38004804,6562651.09634054],[1216299.24217837,6562625.743469],[1216254.86196793,6562608.94292463],[1216208.02902037,6562601.37212065],[1216160.63182011,6562603.33631786],[1216114.58159971,6562614.75632195],[1216071.73529105,6562635.17167463],[1216033.82066317,6562663.75921023],[12
16002.36666225,6562699.36623124],[1215978.64176027,6562740.55696848],[1215963.60279766,6562785.67045603],[1215957.85638339,6562832.88749107],[1215961.63441126,6562880.30398198],[1215963.25125904,6562889.19806131],[1215964.03213898,6562893.28933659],[1215968.1319511,6562913.79137984],[1215972.03222389,6562937.19708669],[1215977.28745991,6563016.79645952],[1215971.45390521,6563048.0427099],[1215969.39172682,6563061.09023781],[1215963.73069651,6563104.75131293],[1215962.06592533,6563123.22251112],[1215960.00953034,6563163.6421848],[1215954.35640427,6563213.80082903],[1215954.63230125,6563269.34572034],[1215960.29082704,6563315.43307246],[1215970.70253119,6563361.57391952],[1215982.82982632,6563397.95907391],[1216001.20120538,6563439.38870108],[1216027.10825421,6563476.54992802],[1216059.60193364,6563508.08124916],[1216097.4918555,6563532.82739871],[1216139.38989254,6563549.8816932],[1216183.76104002,6563558.61926572],[1216228.97966418,6563558.71997125],[1216273.38907516,6563550.18012236],[1216287.1346647,6563546.13717455],[1216332.92682121,6563527.24586381],[1216373.78586258,6563499.1986745],[1216374.67364018,6563498.44078949]]]}
I'm going to assume that geojson_1 is available as a string in which case:
import json
output = {'geometry': json.loads(geojson_1)}
...will give you a dictionary with the structure you need.
It looks like plain JSON data, or the string representation of a dict (they wouldn't be any different in this case). Did you consider wrapping the returned data in a new dict rather than manipulating it as a string?
import json
# Assume this returns the geojson as text
geojson = json.loads(get_geojson())
geojson = {"geometry": geojson}
print(json.dumps(geojson))
I get the expected result using the following:
'{"geometry":' + d + "}"
This prepends the string {"geometry": to the string d and appends } at the end.
The variable d is:
d = '{"type":"Polygon","co (rest of json) ,6563498.44078949]]]}'
Or you can use the json library for this:
import json
data = json.loads(d)  # d is the same string as above; it can also come from a file via json.load(FILE)
# Create your new object:
result = {'geometry': data}
# Print your new JSON:
print(json.dumps(result, indent=2))
edit: If you instead build the string like this:
'"geometry":{' + d + "}"
you get a string that starts with "geometry":{ followed directly by the { from your input JSON. That is neither a valid dictionary nor valid JSON.
Result:
'"geometry":{{"type":"Polygon", ... ,6563498.44078949]]]}}'
(The dots stand for the rest of your original JSON.)
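As a small sketch of the json-based approach, the snippet below wraps a shortened polygon (made-up coordinates, not the data from the question) and serializes it with compact separators, so the output has the same no-whitespace layout as the input:

```python
import json

# Shortened stand-in for geojson_1; the coordinates are illustrative only.
geojson_1 = '{"type":"Polygon","coordinates":[[[0.0,0.0],[1.0,0.0],[1.0,1.0],[0.0,0.0]]]}'

# Parse, wrap in a new dict, and serialize without extra whitespace.
wrapped = json.dumps({"geometry": json.loads(geojson_1)},
                     separators=(",", ":"))
print(wrapped)
```

Going through the parser instead of slicing the string means the input is validated, and the result is always well-formed JSON.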

how to print after the keyword from python?

I have the following string in Python:
b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a","persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],"name":"waqas"}'
I want to print the text that follows the keyword "name", so that my output is
waqas
Note that waqas can be any other value, so I want to print whatever name follows the "name" key, using string operations or a regex.
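First you need to decode the value, since it is a bytes object (the b prefix). Then use ast.literal_eval to build the dictionary, after which you can access the value by key. This works here because the data contains no true, false, or null, which are not Python literals: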
>>> s = b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a","persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],"name":"waqas"}'
>>> import ast
>>> ast.literal_eval(s.decode())['name']
'waqas'
It is likely you should be reading your data into your program in a different manner than you are doing now.
If I assume your data is inside a JSON file, try something like the following, using the built-in json module:
import json
with open(filename) as fp:
    data = json.load(fp)
print(data['name'])
If you want a more algorithmic way to extract the value of name:
s = b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a",\
"persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],\
"name":"waqas"}'
s = s.decode("utf-8")
key = '"name":"'
start = s.find(key) + len(key)
# Search for the closing quote from start itself, so an empty name is handled too
stop = s.find('"', start)
extracted_string = s[start:stop]
print(extracted_string)
output
waqas
You can convert the string into a dictionary with json.loads() (since Python 3.6 it accepts bytes directly):
import json
mystring = b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a","persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],"name":"waqas"}'
mydict = json.loads(mystring)
print(mydict["name"])
# output: waqas
First convert the bytes into a text string. The b prefix is not part of the data, so it cannot be sliced off; decode the bytes instead. Suppose the value is in a variable x:
import json
x = x.decode("utf-8")
data = json.loads(x)  # convert the JSON string into a dictionary
print(data["name"])
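Since the question also asks about a regex, here is a minimal sketch of that route. The pattern is an assumption about the data: it expects the name value to be a plain string without escaped quotes. For anything less regular, parsing with json.loads is the safer choice.

```python
import re

s = b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a","persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],"name":"waqas"}'

# Capture everything between the quotes that follow the "name" key.
match = re.search(r'"name"\s*:\s*"([^"]*)"', s.decode("utf-8"))
name = match.group(1) if match else None
print(name)
```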

Convert csv file to json with no quotes around float values

I have some csv files that I need to convert to json. Some of the float values in the csv are numeric strings (to maintain trailing zeros). When converting to json, all keys and values are wrapped in double quotes. I need the numeric string float values to not have quotes, but maintain the trailing zeros.
Here is a sample of the input csv file:
ACCOUNTNAMEDENORM,DELINQUENCYSTATUS,RETIRED,INVOICEDAYOFWEEK,ID,BEANVERSION,ACCOUNTTYPE,ORGANIZATIONTYPEDENORM,HIDDENTACCOUNTCONTAINERID,NEWPOLICYPAYMENTDISTRIBUTABLE,ACCOUNTNUMBER,PAYMENTMETHOD,INVOICEDELIVERYTYPE,DISTRIBUTIONLIMITTYPE,CLOSEDATE,FIRSTTWICEPERMTHINVOICEDOM,HELDFORINVOICESENDING,FEINDENORM,COLLECTING,ACCOUNTNUMBERDENORM,CHARGEHELD,PUBLICID
John Smith,2.0000000000,0.0000000000,5.0000000000,1234567.0000000000,69.0000000000,1.0000000000,,4321987.0000000000,1,000-000-000-00,10012.0000000000,10002.0000000000,3.0000000000,,1.0000000000,0,,0,000-000-000-00,0,bc:1234346
The json output I am getting is:
{"ACCOUNTNAMEDENORM":"John Smith","DELINQUENCYSTATUS":"2.0000000000","RETIRED":"0.0000000000","INVOICEDAYOFWEEK":"5.0000000000","ID":"1234567.0000000000","BEANVERSION":"69.0000000000","ACCOUNTTYPE":"1.0000000000","ORGANIZATIONTYPEDENORM":null,"HIDDENTACCOUNTCONTAINERID":"4321987.0000000000","NEWPOLICYPAYMENTDISTRIBUTABLE":"1","ACCOUNTNUMBER":"000-000-000-00","PAYMENTMETHOD":"12345.0000000000","INVOICEDELIVERYTYPE":"98765.0000000000","DISTRIBUTIONLIMITTYPE":"3.0000000000","CLOSEDATE":null,"FIRSTTWICEPERMTHINVOICEDOM":"1.0000000000","HELDFORINVOICESENDING":"0","FEINDENORM":null,"COLLECTING":"0","ACCOUNTNUMBERDENORM":"000-000-000-00","CHARGEHELD":"0","PUBLICID":"xx:1234346"}
Here is the code I am using:
import csv
import json
csvfile = open('output2.csv', 'r')
jsonfile = open('output2.json', 'w')
readHeaders = csv.reader(csvfile)
fieldnames = next(readHeaders)
reader = csv.DictReader(csvfile, fieldnames)
for row in reader:
    json.dump(row, jsonfile, separators=(',', ':'))
    jsonfile.write('\n')
I would like the output to have no quotes around float values, similar to the following:
{"ACCOUNTNAMEDENORM":"John Smith","DELINQUENCYSTATUS":2.0000000000,"RETIRED":0.0000000000,"INVOICEDAYOFWEEK":5.0000000000,"ID":1234567.0000000000,"BEANVERSION":69.0000000000,"ACCOUNTTYPE":1.0000000000,"ORGANIZATIONTYPEDENORM":null,"HIDDENTACCOUNTCONTAINERID":4321987.0000000000,"NEWPOLICYPAYMENTDISTRIBUTABLE":"1","ACCOUNTNUMBER":"000-000-000-00","PAYMENTMETHOD":12345.0000000000,"INVOICEDELIVERYTYPE":98765.0000000000,"DISTRIBUTIONLIMITTYPE":3.0000000000,"CLOSEDATE":null,"FIRSTTWICEPERMTHINVOICEDOM":1.0000000000,"HELDFORINVOICESENDING":"0","FEINDENORM":null,"COLLECTING":"0","ACCOUNTNUMBERDENORM":"000-000-000-00","CHARGEHELD":"0","PUBLICID":"xx:1234346"}
Now that I understand your question better from your comments, here's a completely different answer. Note that it doesn't use the json module and just does the needed processing "manually". Although it could probably be done using the module, getting it to format the Python data types it recognizes by default differently can be fairly involved (I know from experience) compared to the relatively simple logic used below.
Another note: Like your code, this converts each row of the csv file into a valid JSON object and writes each one to the file on a separate line. However, the contents of the resulting file technically won't be valid JSON, because the individual objects would need to be comma-separated and enclosed in [ ] brackets (thereby becoming a valid JSON array).
import csv
with open('output2.csv', 'r', newline='') as csvfile, \
        open('output2.json', 'w') as jsonfile:
    for row in csv.DictReader(csvfile):
        newfmt = []
        for field, value in row.items():
            field = '"{}"'.format(field)
            try:
                float(value)
            except ValueError:
                value = 'null' if value == '' else '"{}"'.format(value)
            else:
                # Avoid changing integer values to float.
                try:
                    int(value)
                except ValueError:
                    pass
                else:
                    value = '"{}"'.format(value)
            newfmt.append((field, value))
        json_repr = '{' + ','.join(':'.join(pair) for pair in newfmt) + '}'
        jsonfile.write(json_repr + '\n')
This is the JSON written to the file:
{"ACCOUNTNAMEDENORM":"John Smith","DELINQUENCYSTATUS":2.0000000000,"RETIRED":0.0000000000,"INVOICEDAYOFWEEK":5.0000000000,"ID":1234567.0000000000,"BEANVERSION":69.0000000000,"ACCOUNTTYPE":1.0000000000,"ORGANIZATIONTYPEDENORM":null,"HIDDENTACCOUNTCONTAINERID":4321987.0000000000,"NEWPOLICYPAYMENTDISTRIBUTABLE":"1","ACCOUNTNUMBER":"000-000-000-00","PAYMENTMETHOD":12345.0000000000,"INVOICEDELIVERYTYPE":98765.0000000000,"DISTRIBUTIONLIMITTYPE":3.0000000000,"CLOSEDATE":null,"FIRSTTWICEPERMTHINVOICEDOM":1.0000000000,"HELDFORINVOICESENDING":"0","FEINDENORM":null,"COLLECTING":"0","ACCOUNTNUMBERDENORM":"000-000-000-00","CHARGEHELD":"0","PUBLICID":"bc:1234346"}
Shown again below with added whitespace:
{"ACCOUNTNAMEDENORM": "John Smith",
"DELINQUENCYSTATUS": 2.0000000000,
"RETIRED": 0.0000000000,
"INVOICEDAYOFWEEK": 5.0000000000,
"ID": 1234567.0000000000,
"BEANVERSION": 69.0000000000,
"ACCOUNTTYPE": 1.0000000000,
"ORGANIZATIONTYPEDENORM": null,
"HIDDENTACCOUNTCONTAINERID": 4321987.0000000000,
"NEWPOLICYPAYMENTDISTRIBUTABLE": "1",
"ACCOUNTNUMBER": "000-000-000-00",
"PAYMENTMETHOD": 12345.0000000000,
"INVOICEDELIVERYTYPE": 98765.0000000000,
"DISTRIBUTIONLIMITTYPE": 3.0000000000,
"CLOSEDATE": null,
"FIRSTTWICEPERMTHINVOICEDOM": 1.0000000000,
"HELDFORINVOICESENDING": "0",
"FEINDENORM": null,
"COLLECTING": "0",
"ACCOUNTNUMBERDENORM": "000-000-000-00",
"CHARGEHELD": "0",
"PUBLICID": "bc:1234346"}
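Might be a bit of overkill, but with pandas it would be pretty simple:
import pandas as pd
data = pd.read_csv('output2.csv')
data.to_json('output2.json')
Note that pandas parses the numeric columns as floats, so the trailing zeros are not preserved in the output.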
One solution is to use a regular expression to see if the string value looks like a float, and convert it to a float if it is.
import re
null = None
j = {"ACCOUNTNAMEDENORM":"John Smith","DELINQUENCYSTATUS":"2.0000000000",
"RETIRED":"0.0000000000","INVOICEDAYOFWEEK":"5.0000000000",
"ID":"1234567.0000000000","BEANVERSION":"69.0000000000",
"ACCOUNTTYPE":"1.0000000000","ORGANIZATIONTYPEDENORM":null,
"HIDDENTACCOUNTCONTAINERID":"4321987.0000000000",
"NEWPOLICYPAYMENTDISTRIBUTABLE":"1","ACCOUNTNUMBER":"000-000-000-00",
"PAYMENTMETHOD":"12345.0000000000","INVOICEDELIVERYTYPE":"98765.0000000000",
"DISTRIBUTIONLIMITTYPE":"3.0000000000","CLOSEDATE":null,
"FIRSTTWICEPERMTHINVOICEDOM":"1.0000000000","HELDFORINVOICESENDING":"0",
"FEINDENORM":null,"COLLECTING":"0","ACCOUNTNUMBERDENORM":"000-000-000-00",
"CHARGEHELD":"0","PUBLICID":"xx:1234346"}
for key in j:
    if j[key] is not None:
        if re.match(r"^\d+?\.\d+?$", j[key]):
            j[key] = float(j[key])
I used null = None here to deal with the "null"s that show up in the JSON. But you can replace 'j' here with each CSV row you're reading, then use this to update the row before writing it back with the floats replacing the strings.
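If you're OK with converting any numerical string into a float, then you can skip the regular expression (the re.match() call) and simply attempt float(j[key]), catching ValueError for values that aren't numeric. Note that j[key].isnumeric() is not a substitute here: it returns False for strings that contain a decimal point, so it only detects integer-like strings.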
EDIT: I don't think floats in Python handle the "precision" in a way you might think. It may look like 2.0000000000 is being "truncated" to 2.0, but I think this is more of a formatting and display issue, rather than losing information. Consider the following examples:
>>> float(2.0000000000)
2.0
>>> float(2.00000000001)
2.00000000001
>>> float(1.00) == float(1.000000000)
True
>>> float(3.141) == float(3.140999999)
False
>>> float(3.141) == float(3.1409999999999999)
True
>>> print('%.10f' % 3.14)
3.1400000000
It's possible though to get the JSON to have those zeroes, but in that case it comes down to treating the number as a string, namely a formatted one.
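One way to keep the zeros, sketched below, is to never convert the value at all: strings that look like floats are written into the output verbatim, and everything else goes through json.dumps so that quoting and escaping are handled properly. The helper name row_to_json and the float pattern are assumptions made for this illustration; integer-like strings such as "1" stay quoted, as in the desired output.

```python
import json
import re

# Assumed definition of "looks like a float": digits, a dot, digits.
FLOAT_RE = re.compile(r"-?\d+\.\d+")


def row_to_json(row):
    """Render one CSV row (a dict of strings) as a compact JSON object."""
    parts = []
    for field, value in row.items():
        if value is None or value == "":
            rendered = "null"
        elif FLOAT_RE.fullmatch(value):
            rendered = value  # written as-is, trailing zeros included
        else:
            rendered = json.dumps(value)  # quoted and escaped
        parts.append(json.dumps(field) + ":" + rendered)
    return "{" + ",".join(parts) + "}"


row = {"ACCOUNTNAMEDENORM": "John Smith",
       "DELINQUENCYSTATUS": "2.0000000000",
       "ORGANIZATIONTYPEDENORM": "",
       "NEWPOLICYPAYMENTDISTRIBUTABLE": "1"}
line = row_to_json(row)
print(line)
```

The result is still valid JSON, but any consumer that parses it into a float will see 2.0, so the zeros only survive in the text itself.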
Interesting: I was looking for the opposite result, i.e. output with the quotes.
Actually it's very easy to remove them automatically: just remove the parameter separators=(',', ':').
For me, just adding this parameter was enough.

converting json string to dict in python

I am getting a value in request.body, it is like :
a = '[data={"vehicle":"rti","action_time":"2015-04-21 14:18"}]'
type(a) == str
I want to convert this str to a dict. I have tried doing this:
b=json.loads(a)
But I am getting this error:
ValueError: No JSON object could be decoded
The data you are receiving is not properly formatted JSON. You're going to have to do some parsing or data transformation before you can convert it using the json module.
If you know that the data always begins with the literal string '[data=' and always ends with the literal string ']', and that the rest of the data is valid json, you can simply strip off the problematic characters:
b = json.loads(a[6:-1])
If the data can't be guaranteed to be in precisely that format, you'll have to learn what the actual format is, and do more intelligent parsing.
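A slightly more defensive variant of the slicing approach is sketched below. It assumes the wrapper really is the literal '[data=' ... ']' seen in the question, and fails loudly if it is not. The function name parse_wrapped is made up for this example.

```python
import json


def parse_wrapped(body, prefix="[data=", suffix="]"):
    """Strip the assumed wrapper and parse the JSON payload inside it."""
    if not (body.startswith(prefix) and body.endswith(suffix)):
        raise ValueError("unexpected request body format")
    return json.loads(body[len(prefix):-len(suffix)])


a = '[data={"vehicle":"rti","action_time":"2015-04-21 14:18"}]'
b = parse_wrapped(a)
print(b["vehicle"])
```

Checking the prefix and suffix first turns a silent mis-slice into an explicit error when the format changes.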
What you are receiving is not valid JSON.
A valid equivalent would be:
'{"data":{"vehicle":"rti","action_time":"2015-04-21 14:18"}}'
You can rebuild the string into that form before parsing:
import json
a = '[data={"vehicle":"rti","action_time":"2015-04-21 14:18"}]'
r = a.split("=", 1)
r[:] = r[0].replace("[", ""), r[1].replace("]", "")
d = '{"%s":%s}' % (r[0], r[1])
dp = json.loads(d)
print(dp)

Python: json.loads returns items prefixing with 'u'

I'll be receiving a JSON encoded string from Objective-C, and I am decoding a dummy string (for now) like the code below. My output comes out with character 'u' prefixing each item:
[{u'i': u'imap.gmail.com', u'p': u'aaaa'}, {u'i': u'333imap.com', u'p': u'bbbb'}...
How is JSON adding this Unicode character? What's the best way to remove it?
import json
import sys
mail_accounts = []
da = {}
try:
    s = '[{"i":"imap.gmail.com","p":"aaaa"},{"i":"imap.aol.com","p":"bbbb"},{"i":"333imap.com","p":"ccccc"},{"i":"444ap.gmail.com","p":"ddddd"},{"i":"555imap.gmail.com","p":"eee"}]'
    jdata = json.loads(s)
    for d in jdata:
        for key, value in d.iteritems():
            if key not in da:
                da[key] = value
            else:
                da = {}
                da[key] = value
        mail_accounts.append(da)
except Exception, err:
    sys.stderr.write('Exception Error: %s' % str(err))
print mail_accounts
The u- prefix just means that you have a Unicode string. When you really use the string, it won't appear in your data. Don't be thrown by the printed output.
For example, try this:
print mail_accounts[0]["i"]
You won't see a u.
Everything is cool, man. The 'u' is a good thing, it indicates that the string is of type Unicode in python 2.x.
http://docs.python.org/2/howto/unicode.html#the-unicode-type
The d3 print below is the one you are looking for (which is the combination of dumps and loads) :)
Having:
import json
d = """{"Aa": 1, "BB": "blabla", "cc": "False"}"""
d1 = json.loads(d) # Produces a dictionary out of the given string
d2 = json.dumps(d) # Produces a string out of a given dict or string
d3 = json.dumps(json.loads(d)) # 'dumps' gets the dict from 'loads' this time
print "d1: " + str(d1)
print "d2: " + d2
print "d3: " + d3
Prints:
d1: {u'Aa': 1, u'cc': u'False', u'BB': u'blabla'}
d2: "{\"Aa\": 1, \"BB\": \"blabla\", \"cc\": \"False\"}"
d3: {"Aa": 1, "cc": "False", "BB": "blabla"}
The 'u' characters prefixed to an object signify that the object is a Unicode string.
If you want to remove those 'u' characters from your object, you can do this:
import json, ast
jdata = ast.literal_eval(json.dumps(jdata)) # Removing the Unicode prefixes
Let's check it in the Python shell:
>>> import json, ast
>>> jdata = [{u'i': u'imap.gmail.com', u'p': u'aaaa'}, {u'i': u'333imap.com', u'p': u'bbbb'}]
>>> jdata = ast.literal_eval(json.dumps(jdata))
>>> jdata
[{'i': 'imap.gmail.com', 'p': 'aaaa'}, {'i': '333imap.com', 'p': 'bbbb'}]
Unicode is an appropriate type here. The JSONDecoder documentation describes the conversion table and states that JSON string objects are decoded into Unicode objects.
From 18.2.2. Encoders and Decoders:
JSON             Python
==================================
object           dict
array            list
string           unicode
number (int)     int, long
number (real)    float
true             True
false            False
null             None
"encoding determines the encoding used to interpret any str objects decoded by this instance (UTF-8 by default)."
The u prefix means that those strings are unicode rather than 8-bit strings. The best way to not show the u prefix is to switch to Python 3, where strings are unicode by default. If that's not an option, the str constructor will convert from unicode to 8-bit, so simply loop recursively over the result and convert unicode to str. However, it is probably best just to leave the strings as unicode.
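The recursive conversion mentioned above could look roughly like the sketch below. The helper name to_native is made up, and UTF-8 is an assumed target encoding. It is written so that it also runs on Python 3, where it simply returns an equal structure because str is already Unicode.

```python
import json

try:
    text_type = unicode  # Python 2: the separate unicode type
    PY2 = True
except NameError:
    text_type = str      # Python 3: str is already Unicode
    PY2 = False


def to_native(obj):
    """Recursively convert Python 2 unicode strings to 8-bit str (UTF-8)."""
    if isinstance(obj, dict):
        return {to_native(k): to_native(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [to_native(item) for item in obj]
    if PY2 and isinstance(obj, text_type):
        return obj.encode("utf-8")
    return obj


jdata = json.loads('[{"i": "imap.gmail.com", "p": "aaaa"}]')
converted = to_native(jdata)
print(converted)
```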
I kept running into this problem when trying to capture JSON data in the log with the Python logging library, for debugging and troubleshooting purposes. Getting the u character is a real nuisance when you want to copy the text and paste it into your code somewhere.
As everyone will tell you, this is because it is a Unicode representation, and it could come from the fact that you’ve used json.loads() to load in the data from a string in the first place.
If you want the JSON representation in the log, without the u prefix, the trick is to use json.dumps() before logging it out. For example:
import json
import logging
# Prepare the data
json_data = json.loads('{"key": "value"}')
# Log normally and get the Unicode indicator
logging.warning('data: {}'.format(json_data))
# WARNING:root:data: {u'key': u'value'}
# Dump to a string before logging and get clean output!
logging.warning('data: {}'.format(json.dumps(json_data)))
# WARNING:root:data: {"key": "value"}
Try this:
mail_accounts[0]["i"].encode("ascii")
Just replace the u' with a single quote...
print(str(mail_accounts).replace("u'", "'"))
