multiple json dictionaries python - python

I have a json file with multiple json dictionaries:
the format is as following
{"x":"1", "y":"2", "z": "3"}{"x": "2","y":"3", "z":"4"}{"x":"3", "y":"4", "z":"5"}
How can I convert this to one json dictionary format as following:
{"items":[{"x":"1", "y":"2", "z": "3"},{"x": "2","y":"3", "z":"4"},{"x":"3", "y":"4", "z":"5"}]}

Already mostly answered here: Importing wrongly concatenated JSONs in python
That shows you how to pick up each JSON element from a concatenated list of them (which is not valid JSON) using json.JSONDecoder method raw_decode(). The rest is a simple matter of concatenating strings '{"items":[', element1, ",", element2, ... "]" or alternatively, accumulating them as a list, wrapping that with a one-item dict and if required, dumping that json as a string with json.dumps
OK expanding on that
import json
d = json.JSONDecoder()
# get your concatenated json into x, for example maybe
x = open('myfile.json','r').read()
jlist=[]
while True:
try:
j,n = d.raw_decode(x)
jlist.append(j)
except ValueError:
break
x=x[n:]
# my original thought was
result = '{"items":[' +
','.join( [ str(s) for s in jlist] ) +
']'
# but this is neater
result = json.dumps( { "items":jlist })
# and of course if you want it for programmatic use, just
result_as_json = dict( items=jlist )

Related

How to add string in specific positions in geojson object

i have the below posted geojson mentioned in geojson_1 section below. i want to add to it "geometry":{ and }, so that to appear as follows
{"geometry":{"type":"Polygon","coordinates":[[[1216374.67364018,6563498.44078949],[1216387.86261675,6563523.87797899],[1216397.66970116,6563548.2905649],[1216424.17569103,6563588.32082324],[1216458.19258303,6563622.16452455],[1216498.32084288,6563648.42909789],[1216542.90943577,6563666.03380959],[1216590.12376481,6563674.25425166],[1216638.02117068,6563672.7521636],[1216684.63088244,6563661.58935797],[1216728.03512655,6563641.225175],[1216752.29181681,6563626.67066235],[1216787.17700448,6563601.12371718],[1216816.83970763,6563569.63465531],[1216831.39332728,6563551.03748989],[1216838.2508451,6563541.8226918],[1216897.47283376,6563458.0765492],[1216918.74007329,6563421.44644481],[1216933.156564,6563381.60258193],[1216940.26085904,6563339.82061228],[1216939.82562707,6563297.43819918],[1216931.86491641,6563255.81218836],[1216907.60647856,6563170.91644364],[1216887.20280767,6563121.46139137],[1216856.24799209,6563077.86160203],[1216821.48046704,6563039.0529759],[1216799.23490474,6563017.28929875],[1216753.95673639,6562978.48086898],[1216737.29066155,6562965.4435638],[1216673.22488836,6562919.79826372],[1216644.73178636,6562899.22061724],[1216601.13622245,6562874.31962206],[1216562.32695185,6562857.3410734],[1216556.56069412,6562854.90900462],[1216549.97837146,6562852.23502385],[1216545.77480552,6562849.58841453],[1216504.75306873,6562829.03095075],[1216487.0229317,6562822.21187019],[1216482.65368148,6562820.3796627],[1216478.79578194,6562814.49158384],[1216462.95127963,6562793.04723497],[1216450.44559886,6562777.97698661],[1216448.65520854,6562774.19751598],[1216429.39331353,6562740.84427663],[1216404.99213055,6562711.06486155],[1216382.3528849,6562687.61801865],[1216357.97638417,6562665.64947943],[1216339.38004804,6562651.09634054],[1216299.24217837,6562625.743469],[1216254.86196793,6562608.94292463],[1216208.02902037,6562601.37212065],[1216160.63182011,6562603.33631786],[1216114.58159971,6562614.75632195],[1216071.73529105,6562635.17167463],[1216033.82066317,6562663.75921023],[1216002.36666225,6562699.36623124],[1215978.64176027,6562740.55696848],[1215963.60279766,6562785.67045603],[1215957.85638339,6562832.88749107],[1215961.63441126,6562880.30398198],[1215963.25125904,6562889.19806131],[1215964.03213898,6562893.28933659],[1215968.1319511,6562913.79137984],[1215972.03222389,6562937.19708669],[1215977.28745991,6563016.79645952],[1215971.45390521,6563048.0427099],[1215969.39172682,6563061.09023781],[1215963.73069651,6563104.75131293],[1215962.06592533,6563123.22251112],[1215960.00953034,6563163.6421848],[1215954.35640427,6563213.80082903],[1215954.63230125,6563269.34572034],[1215960.29082704,6563315.43307246],[1215970.70253119,6563361.57391952],[1215982.82982632,6563397.95907391],[1216001.20120538,6563439.38870108],[1216027.10825421,6563476.54992802],[1216059.60193364,6563508.08124916],[1216097.4918555,6563532.82739871],[1216139.38989254,6563549.8816932],[1216183.76104002,6563558.61926572],[1216228.97966418,6563558.71997125],[1216273.38907516,6563550.18012236],[1216287.1346647,6563546.13717455],[1216332.92682121,6563527.24586381],[1216373.78586258,6563499.1986745],[1216374.67364018,6563498.44078949]]]}}
to simpify it even more, i want to add "geometry":{ right after the the first curly bracket, and the } at the very end
i attmepted the following:
asString = asString[:2] + "geometry:" + asString[2:]
asString = asString[:len(asString)] + "}" + asString[len(asString):]
but i am not getting the expected results
geojson_1:
{"type":"Polygon","coordinates":[[[1216374.67364018,6563498.44078949],[1216387.86261675,6563523.87797899],[1216397.66970116,6563548.2905649],[1216424.17569103,6563588.32082324],[1216458.19258303,6563622.16452455],[1216498.32084288,6563648.42909789],[1216542.90943577,6563666.03380959],[1216590.12376481,6563674.25425166],[1216638.02117068,6563672.7521636],[1216684.63088244,6563661.58935797],[1216728.03512655,6563641.225175],[1216752.29181681,6563626.67066235],[1216787.17700448,6563601.12371718],[1216816.83970763,6563569.63465531],[1216831.39332728,6563551.03748989],[1216838.2508451,6563541.8226918],[1216897.47283376,6563458.0765492],[1216918.74007329,6563421.44644481],[1216933.156564,6563381.60258193],[1216940.26085904,6563339.82061228],[1216939.82562707,6563297.43819918],[1216931.86491641,6563255.81218836],[1216907.60647856,6563170.91644364],[1216887.20280767,6563121.46139137],[1216856.24799209,6563077.86160203],[1216821.48046704,6563039.0529759],[1216799.23490474,6563017.28929875],[1216753.95673639,6562978.48086898],[1216737.29066155,6562965.4435638],[1216673.22488836,6562919.79826372],[1216644.73178636,6562899.22061724],[1216601.13622245,6562874.31962206],[1216562.32695185,6562857.3410734],[1216556.56069412,6562854.90900462],[1216549.97837146,6562852.23502385],[1216545.77480552,6562849.58841453],[1216504.75306873,6562829.03095075],[1216487.0229317,6562822.21187019],[1216482.65368148,6562820.3796627],[1216478.79578194,6562814.49158384],[1216462.95127963,6562793.04723497],[1216450.44559886,6562777.97698661],[1216448.65520854,6562774.19751598],[1216429.39331353,6562740.84427663],[1216404.99213055,6562711.06486155],[1216382.3528849,6562687.61801865],[1216357.97638417,6562665.64947943],[1216339.38004804,6562651.09634054],[1216299.24217837,6562625.743469],[1216254.86196793,6562608.94292463],[1216208.02902037,6562601.37212065],[1216160.63182011,6562603.33631786],[1216114.58159971,6562614.75632195],[1216071.73529105,6562635.17167463],[1216033.82066317,6562663.75921023],[1216002.36666225,6562699.36623124],[1215978.64176027,6562740.55696848],[1215963.60279766,6562785.67045603],[1215957.85638339,6562832.88749107],[1215961.63441126,6562880.30398198],[1215963.25125904,6562889.19806131],[1215964.03213898,6562893.28933659],[1215968.1319511,6562913.79137984],[1215972.03222389,6562937.19708669],[1215977.28745991,6563016.79645952],[1215971.45390521,6563048.0427099],[1215969.39172682,6563061.09023781],[1215963.73069651,6563104.75131293],[1215962.06592533,6563123.22251112],[1215960.00953034,6563163.6421848],[1215954.35640427,6563213.80082903],[1215954.63230125,6563269.34572034],[1215960.29082704,6563315.43307246],[1215970.70253119,6563361.57391952],[1215982.82982632,6563397.95907391],[1216001.20120538,6563439.38870108],[1216027.10825421,6563476.54992802],[1216059.60193364,6563508.08124916],[1216097.4918555,6563532.82739871],[1216139.38989254,6563549.8816932],[1216183.76104002,6563558.61926572],[1216228.97966418,6563558.71997125],[1216273.38907516,6563550.18012236],[1216287.1346647,6563546.13717455],[1216332.92682121,6563527.24586381],[1216373.78586258,6563499.1986745],[1216374.67364018,6563498.44078949]]]}
I'm going to assume that geojson_1 is available as a string in which case:
import json
output = {'geometry': json.loads(geojson_1)}
...will give you a dictionary with the structure you need.
It looks like plain json data, or a string representation of a dict (they wouldn't be any different in this case), did you consider wrapping the returned data in a new dict rather than manipulating it as a string?
import json
# Assume this returns the geojson as text
geojson = json.loads(get_geojson())
geojson = {"geometry": geojson}
print(json.dumps(geojson))
I get the expected result using the following:
'{"geometry":' + d + "}"
It adds the string {"geometry": to the string d and at the end }.
The variable dis:
d = '{"type":"Polygon","co (rest of json) ,6563498.44078949]]]}'
Or you can use the json library for this:
import json
data = json.loads(d) # note d is the same string as above, this can also be from a file or read file using json.load(FILE)
# Create your new object:
result = {'geometry': data}
# print you new json:
print( json.dumps(result, indent=2))
edit:
'"geometry":{' + d + "}"
Note that you get a string starting with geometry and a { and directly another { from you input json. This is not a correct dictionary nor a proper json format.
Result:
'"geometry":{{"type":"Polygon", ... ,6563498.44078949]]]}}'
(the dots are just the rest of your original json.

how to print after the keyword from python?

i have following string in python
b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a","persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],"name":"waqas"}'
I want to print the all alphabet next to keyword "name" such that my output should be
waqas
Note the waqas can be changed to any number so i want print any name next to keyword name using string operation or regex?
First you need to decode the string since it is binary b. Then use literal eval to make the dictionary, then you can access by key
>>> s = b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a","persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],"name":"waqas"}'
>>> import ast
>>> ast.literal_eval(s.decode())['name']
'waqas'
It is likely you should be reading your data into your program in a different manner than you are doing now.
If I assume your data is inside a JSON file, try something like the following, using the built-in json module:
import json
with open(filename) as fp:
data = json.load(fp)
print(data['name'])
if you want a more algorithmic way to extract the value of name:
s = b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a",\
"persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],\
"name":"waqas"}'
s = s.decode("utf-8")
key = '"name":"'
start = s.find(key) + len(key)
stop = s.find('"', start + 1)
extracted_string = s[start : stop]
print(extracted_string)
output
waqas
You can convert the string into a dictionary with json.loads()
import json
mystring = b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a","persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],"name":"waqas"}'
mydict = json.loads(mystring)
print(mydict["name"])
# output 'waqas'
First you need to convert the string into a proper JSON Format by removing b from the string using substring in python suppose you have a variable x :
import json
x = x[1:];
dict = json.loads(x) //convert JSON string into dictionary
print(dict["name"])

How to remove numbers from json in python

I having some json format like
json= 5843080158430803{"name":"NAME", "age":"56",}
So, how i get {"name":"NAME", "age":"56",} Using regex/split (which one is bets method for it) in Python.
Thanks in Advance...
Split the first occurance of { into an array, and get the second element in the array.
We also have to add the { again because its removed by the split function
json = '5843080158430803{"name":"NAME", "age":"56",}'
json = '{' + json.split('{', 1)[1]
print(json)
Result: {"name":"NAME", "age":"56",}
perhaps you could split at at the first { and then replace the part prior to it.
I am assuming the json you have above is actually a string. Then you could do:
json_prefix = json.split("{")
json = json.replace(json_prefix, "")

JSON from streamed data in Python

I am simply trying to keep the following input and resulting JSON string in order.
Here is the input string and code:
import json
testlist=[]
# we create a list as a tuple so the dictionary order stays correct
testlist=[({"header":{"stream":2,"function":3,"reply":True},"body": [({"format": "A", "value":"This is some text"})]})]
print 'py data string: '
print testlist
data_string = json.dumps(testlist)
print 'json string: '
print data_string
Here is the output string:
json string:
[{"body": [{"format": "A", "value": "This is some text"}], "header": {"stream": 2, "function": 3, "reply": true}}]
I am trying to keep the order of the output the same as the input.
Any help would be great. I can't seem to figure this one point.
As Laurent wrote your question is not very clear, but I give it a try:
OrderedDict.update adds in the above case the entries of databody to the dictionary.
What you seem to want to do is something like data['body'] = databody where databody is this list
[{"format":"A","value":"This is a text\nthat I am sending\n to a file"},{"format":"U6","value":5},{"format":"Boolean","value":true}, "format":"F4", "value":8.10}]
So build first this list end then add it to your dictionary plus what you wrote in your post is that the final variable to be parse into json is a list so do data_string = json.dumps([data])

python: how do I parse a stream of json arrays with ijson library

The incoming data resembles the following:
[{
"foo": "bar"
}]
[{
"bar": "baz"
}]
[{
"baz": "foo"
}]
as you see, arrays of objects strung together. JSON-ish
ijson is able to handle the first array, and then I get:
ijson.common.JSONError: Additional data
when it hits the subsequent arrays. How do I get around this?
Here's a first cut at the problem that at least has a working regex substitution to turn a full string into valid json. It only works if you're ok with reading the full input stream before parsing as json.
import re
input = ''
for line in inputStream:
input = input + line
# input == '[{"foo": "bar"}][{"bar": "baz"}][{"baz": "foo"}]'
# wrap in [] and put commas between each ][
sanitizedInput = re.sub(r"\]\[", "],[", "[%s]" % input)
# sanitizedInput == '[[{"foo": "bar"}],[{"bar": "baz"}],[{"baz": "foo"}]]'
# then parse sanitizedInput
parsed = json.loads(sanitizedInput)
print parsed #=> [[{u'foo': u'bar'}], [{u'bar': u'baz'}], [{u'baz': u'foo'}]]
Note: since you're read the whole thing as a string, you can use json instead of ijson
You can use json.JSONDecoder.raw_decode to walk through the string. Its documentation indeed says:
This can be used to decode a JSON document from a string that may have extraneous data at the end.
The following code sample assumes all the JSON values are in one big string:
def json_elements(string):
while True:
try:
(element, position) = json.JSONDecoder.raw_decode(string)
yield element
string = string[position:]
except ValueError:
break
To avoid dealing with raw_decode yourself and to be able to parse a stream chunk by chunk, I would recommend a library I made for this exact purpose: streamcat.
def json_elements(stream)
decoder = json.JSONDecoder()
yield from streamcat.stream_to_iterator(stream, decoder)
This works for any concatenation of JSON values regardless of how many white-space characters are used within them or between them.
If you have control over how your input stream is encoded, you may want to consider using line-delimited JSON, which makes parsing easier.

Categories

Resources