Extracting data from a nested JSON structure in Python

My json file looks like this:
{"07/01/2015-08/01/2015":
{"ABC": [
["12015618727", "2015-07-29 02:32:01"],
["12024079732", "2015-07-24 13:04:01"],
["12024700142", "2015-07-02 00:00:00"]
]
}
}
I want to extract the numbers 12015618727, 12024079732, 12024700142 from here in python.
I wrote this code:
import json
numbers=set()
input_file=open('filename', 'r')
json_decode=json.load(input_file)
for item in json_decode["07/01/2015-08/01/2015"]["ABC"]:
    for j in item:
        numbers.add(j[0])
print " ".join(str(x) for x in numbers)
But this doesn't print the numbers.

Python has a json parsing library, see https://docs.python.org/2/library/json.html for details.
Usage:
import json
text = open("file.txt", "r").read()
obj = json.loads(text)
where obj is a native Python dict containing nested lists and dicts.
Edit:
This is the code you want.
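import json
numbers = set()
# The with block closes the file once parsing is done.
with open('filename.json', 'r') as input_file:
    json_decode = json.load(input_file)
# Each item is a [number, timestamp] pair, so item[0] is the number.
for item in json_decode["07/01/2015-08/01/2015"]["ABC"]:
    numbers.add(item[0])
print(" ".join(str(x) for x in numbers))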
Your inner loop iterated over the two strings inside each item and added the first character of each string, hence 1 and 2. Next time, please provide the output you got.
Also, you should attempt to debug your code first. I added a print at the beginning of each loop, and that made the problem pretty clear.
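To see the difference without a file, here is a minimal sketch that parses the question's JSON from a string and compares the nested loop with the single loop. The names raw, rows, and first_chars are hypothetical and only used for illustration:

```python
import json

# The JSON from the question, held in a string so no file is needed.
raw = '''{"07/01/2015-08/01/2015": {"ABC": [
    ["12015618727", "2015-07-29 02:32:01"],
    ["12024079732", "2015-07-24 13:04:01"],
    ["12024700142", "2015-07-02 00:00:00"]]}}'''

data = json.loads(raw)
rows = data["07/01/2015-08/01/2015"]["ABC"]

# Nested loop: j is a string, so j[0] is only its first character.
first_chars = {j[0] for row in rows for j in row}

# Single loop: row is a [number, timestamp] pair, so row[0] is the number.
numbers = [row[0] for row in rows]

print(sorted(first_chars))
print(" ".join(numbers))
```

Using a list instead of a set keeps the numbers in file order; switch back to a set if duplicates should be dropped.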

Related

Python - Encoding items in a list

I have a script that works fine in Python2 but I can't get it to work in Python3. I want to base64 encode each item in a list and then write it to a json file. I know I can't use map the same way in Python3 but when I make it a list I get a different error.
import base64
import json
list_of_numbers = ['123456', '234567', '345678']
file = open("orig.json", "r")
json_object = json.load(file)
list = ["[{\"number\":\"" + str(s) + "\"}]" for s in list_of_numbers]
base64_bytes = map(base64.b64encode, list)
json_object["conditions"][1]["value"] = base64_bytes
rule = open("new.json", "w")
json.dump(json_object, rule, indent=2, sort_keys=True)
rule.close()
I'm not sure if your error is related to this, but here's what I think might be the problem. In Python 3, map returns a lazy map object rather than a list. To get the results as a list again, pass the map object to list() after mapping your function. In other words (with your variable renamed to the hypothetical encoded_items, for the reason given below):
base64_bytes = list(map(base64.b64encode, encoded_items))
P.S. Avoid list as a variable name, since it shadows the built-in list. While the name is rebound to your data, calling list(...) raises a TypeError.
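Two further Python 3 details are likely to matter here: base64.b64encode expects bytes rather than str, and json.dump cannot serialize the bytes it returns. The following is a minimal sketch under those assumptions, not a drop-in fix. The names payloads and encoded are hypothetical, UTF-8 is assumed, and an in-memory dict with an assumed structure stands in for the contents of orig.json:

```python
import base64
import json

list_of_numbers = ['123456', '234567', '345678']

# Build the JSON fragments as text (hypothetical name; avoids shadowing list).
payloads = ['[{"number":"' + str(s) + '"}]' for s in list_of_numbers]

# b64encode takes bytes and returns bytes, so encode first and decode after.
encoded = [base64.b64encode(p.encode("utf-8")).decode("ascii") for p in payloads]

# Stand-in for the object loaded from orig.json (structure assumed).
json_object = {"conditions": [{}, {"value": None}]}
json_object["conditions"][1]["value"] = encoded

json_text = json.dumps(json_object, indent=2, sort_keys=True)
print(json_text)
```

Decoding the Base64 bytes to ASCII text is what makes the values serializable; whatever reads the file can call base64.b64decode on each string to recover the original fragment.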

Finding multiple occurrences of a word in a string in Python

I have a string that contains the data from
https://myanimelist.net/animelist/domis1/load.json?status=2&offset=0.
I want to find every 'anime_id' and put the values (numbers only) into a list.
I tried find('anime_id'), but that doesn't handle multiple occurrences in the string.
Here is an example of how to extract anime_id from a JSON file called test.json, using the built-in json module:
import json
with open('test.json') as f:
    data = json.load(f)
# Create generator and search for anime_id
gen = (i['anime_id'] for i in data)
# If needed, iterate over generator and create a list
gen_list = list(gen)
# Print list on console
print(gen_list)
Your string is in JSON format, so you can parse it with the built-in json module.
import json
data = json.loads(your_string)
for d in data:
    print(d["anime_id"])
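Both answers can be condensed into a list comprehension. This is a small sketch with made-up sample data: the string your_string below is hypothetical and only assumes that the real payload is a JSON array of objects that each carry an anime_id key.

```python
import json

# Hypothetical sample in the assumed shape: a JSON array of objects.
your_string = '[{"anime_id": 1, "anime_title": "A"}, {"anime_id": 20, "anime_title": "B"}]'

data = json.loads(your_string)

# Collect every anime_id into a list, in the order they appear.
anime_ids = [entry["anime_id"] for entry in data]
print(anime_ids)
```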

how to print after the keyword from python?

I have the following string in Python:
b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a","persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],"name":"waqas"}'
I want to print the text that follows the key "name", so that my output is
waqas
Note that waqas can change to any other value, so I want to print whatever name follows the key "name", using string operations or a regex.
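First you need to decode the value, since it is a bytes object (the b prefix). Then use ast.literal_eval to build the dictionary, after which you can access the value by key. This works here because the payload also happens to be a valid Python literal; JSON containing true, false, or null would need json.loads instead.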
>>> s = b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a","persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],"name":"waqas"}'
>>> import ast
>>> ast.literal_eval(s.decode())['name']
'waqas'
It is likely you should be reading your data into your program in a different manner than you are doing now.
If I assume your data is inside a JSON file, try something like the following, using the built-in json module:
import json
with open(filename) as fp:
    data = json.load(fp)
print(data['name'])
If you want a more algorithmic way to extract the value of name:
s = b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a",\
"persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],\
"name":"waqas"}'
s = s.decode("utf-8")
key = '"name":"'
start = s.find(key) + len(key)
stop = s.find('"', start)  # search from start so an empty name is handled too
extracted_string = s[start : stop]
print(extracted_string)
output
waqas
You can convert the string into a dictionary with json.loads(), which accepts bytes directly on Python 3.6+ (decode to str first on older versions):
import json
mystring = b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a","persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],"name":"waqas"}'
mydict = json.loads(mystring)
print(mydict["name"])
# output: waqas
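First you need to turn the bytes into a proper JSON string. The b is not part of the data; it only marks a bytes literal, so slicing it off with x[1:] would drop the opening brace instead. Decode the value, then parse it. Suppose you have a variable x:
import json
x = x.decode("utf-8")  # bytes -> str
data = json.loads(x)  # convert the JSON string into a dictionary
print(data["name"])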
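Since the question also asks about a regex, here is a minimal sketch of that route using the string from the question. It assumes the name value contains no escaped quotes; json.loads remains the more robust choice for arbitrary JSON.

```python
import re

s = b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a","persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],"name":"waqas"}'

# Decode the bytes first so a str pattern can be applied.
text = s.decode("utf-8")

# Capture everything between the quotes that follow "name": (optional spaces allowed).
match = re.search(r'"name"\s*:\s*"([^"]*)"', text)
name = match.group(1) if match else None
print(name)
```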

JSON from streamed data in Python

I am simply trying to keep the following input and resulting JSON string in order.
Here is the input string and code:
import json
testlist=[]
# we create a list as a tuple so the dictionary order stays correct
testlist=[({"header":{"stream":2,"function":3,"reply":True},"body": [({"format": "A", "value":"This is some text"})]})]
print 'py data string: '
print testlist
data_string = json.dumps(testlist)
print 'json string: '
print data_string
Here is the output string:
json string:
[{"body": [{"format": "A", "value": "This is some text"}], "header": {"stream": 2, "function": 3, "reply": true}}]
I am trying to keep the order of the output the same as the input.
Any help would be great. I can't seem to figure this one out.
As Laurent wrote, your question is not very clear, but I'll give it a try:
In the above case, OrderedDict.update adds the entries of databody to the dictionary.
What you seem to want to do is something like data['body'] = databody, where databody is this list:
[{"format":"A","value":"This is a text\nthat I am sending\n to a file"},{"format":"U6","value":5},{"format":"Boolean","value":True},{"format":"F4","value":8.10}]
So first build this list and then add it to your dictionary. Also, you wrote in your post that the final variable to be parsed into JSON is a list, so do data_string = json.dumps([data]).
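One way to keep the header-before-body order from the question is to build the structure from collections.OrderedDict, which json.dumps serializes in insertion order as long as sort_keys is not set. This is a sketch using the values from the question; the names header, body, and data are hypothetical.

```python
import json
from collections import OrderedDict

# Each OrderedDict is built from (key, value) pairs, so the order is explicit.
header = OrderedDict([("stream", 2), ("function", 3), ("reply", True)])
body = [OrderedDict([("format", "A"), ("value", "This is some text")])]
data = OrderedDict([("header", header), ("body", body)])

# The final value to serialize is a list, as in the question.
data_string = json.dumps([data])
print(data_string)
```

On Python 3.7 and later, plain dict literals also preserve insertion order, so OrderedDict is mainly needed on older interpreters such as the Python 2 used in the question. When reading JSON back, json.loads(text, object_pairs_hook=OrderedDict) keeps the order of the input.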

Write a list of objects in JSON using Python

I am trying to output the following JSON from my python (2.7) script:
[
{
"id": "1002-00001",
"name": "Name 1"
},
{
"id": "1002-00002",
"display": "Name 2"
},
]
What data structure in Python will output this when using json.dumps?
The outermost item is a python list, but what should be the type of items inside the list? It looks like a dictionary with no keys?
Hopefully this clarifies the notes in comments that are not clear for you. It's achieved by appending (in this case small) dictionaries into a list.
import json
# Added an extra entry with an integer type. Doesn't have to be a string.
full_list = [['1002-00001', 'Name 1'],
             ['1002-00002', 'Name 2'],
             ['1002-00003', 2]]
output_list = []
for item in full_list:
    sub_dict = {}
    sub_dict['id'] = item[0]  # key-value pair defined
    sub_dict['name'] = item[1]
    output_list.append(sub_dict)  # Just put the mini dictionary into a list
# See Python data structure
print(output_list)
# Specifically using json.dumps as requested in question.
# Automatically adds double quotes to strings for json formatting in printed
# output but keeps ints (unquoted)
json_object = json.dumps(output_list)
print(json_object)
# Writing to a file
with open('SO_jsonout.json', 'w') as outfile:
    json.dump(output_list, outfile)
# What I think you are confused about with the "keys" is achieved with an
# outer dictionary (but isn't necessary to make a valid data structure, just
# one that you might be more used to seeing)
outer_dict = {}
outer_dict['so_called_missing_key'] = output_list
print(outer_dict)
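The same list of dictionaries can be built more compactly with a list comprehension. This is a sketch only: the names rows and records are hypothetical, and indent=2 is used just to produce the multi-line layout shown in the question.

```python
import json

# (id, name) pairs taken from the question's expected output.
rows = [("1002-00001", "Name 1"), ("1002-00002", "Name 2")]

# Each JSON object corresponds to one Python dict; the outer array is a list.
records = [{"id": record_id, "name": name} for record_id, name in rows]

json_text = json.dumps(records, indent=2)
print(json_text)
```

Note that json.dumps never emits a trailing comma after the last element, since that would not be valid JSON.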
