I'm new to JSON and Python; any help on this would be greatly appreciated.
I read about json.loads but am confused.
How do I read a file into Python using json.loads?
Below is my JSON file format:
{
    "header": {
        "platform":"atm"
        "version":"2.0"
    }
    "details":[
        {
            "abc":"3"
            "def":"4"
        },
        {
            "abc":"5"
            "def":"6"
        },
        {
            "abc":"7"
            "def":"8"
        }
    ]
}
My requirement is to read the values of all the "abc" and "def" keys in details and add them to a new list like this: [(1,2),(3,4),(5,6),(7,8)]. The new list will be used to create a Spark DataFrame.
Open the file and get a file handle:
fh = open('thefile.json')
https://docs.python.org/2/library/functions.html#open
Then pass the file handle into json.load() (don't use json.loads; that one is for parsing strings):
import json
data = json.load(fh)
https://docs.python.org/2/library/json.html#json.load
From there, you can easily deal with a Python dictionary that represents your JSON-encoded data.
new_list = [(detail['abc'], detail['def']) for detail in data['details']]
Note that your JSON format is also wrong. You will need comma delimiters in many places, but that's not the question.
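Putting it together, a minimal sketch (assuming the file is named thefile.json and the missing commas have been added) might look like this:
import json

# Open the file and parse it into a Python dictionary
with open('thefile.json') as fh:
    data = json.load(fh)

# Pull the "abc"/"def" pairs out of the "details" list,
# converting the string values to integers
new_list = [(int(detail['abc']), int(detail['def'])) for detail in data['details']]
print(new_list)  # [(3, 4), (5, 6), (7, 8)]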
I'm trying to understand your question as best I can, but it looks like it was formatted poorly.
First off, your JSON blob is not valid JSON; it is missing quite a few commas. This is probably what you are looking for:
{
    "header": {
        "platform": "atm",
        "version": "2.0"
    },
    "details": [
        {
            "abc": "3",
            "def": "4"
        },
        {
            "abc": "5",
            "def": "6"
        },
        {
            "abc": "7",
            "def": "8"
        }
    ]
}
Now, assuming you are trying to parse this in Python, you will have to do the following.
import json
json_blob = '{"header": {"platform": "atm","version": "2.0"},"details": [{"abc": "3","def": "4"},{"abc": "5","def": "6"},{"abc": "7","def": "8"}]}'
json_obj = json.loads(json_blob)
final_list = []
for single in json_obj['details']:
    final_list.append((int(single['abc']), int(single['def'])))
print(final_list)
This will print the following: [(3, 4), (5, 6), (7, 8)]
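Since the question mentions building a Spark DataFrame from this list, a minimal sketch (pyspark assumed to be installed; the column names are illustrative, not taken from the question) could be:
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
# Column names are illustrative; use whatever fits your schema
df = spark.createDataFrame(final_list, ['abc', 'def'])
df.show()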
I am using the Python requests library to execute a GraphQL mutation. I need to pass the requests library a query parameter containing a string constructed from a Python list of dictionaries.
The list of dictionaries looks like this:
my_list_of_dicts = [{"custom_module_id": "23", "answer": "some text 2", "user_id": "111"},
{"custom_module_id": "24", "answer": "a", "user_id": "111"}]
Now I need to convert this list of dictionaries into a string so it looks like this:
my_list_of_dicts = [{custom_module_id: "23", answer: "some text 2", user_id: "111"},
                    {custom_module_id: "24", answer: "a", user_id: "111"}]
Basically, I need to get a string that looks like a Python list of dictionaries, except that the dictionary keys do not have quotation marks around them. I did this and it works:
my_query_string = json.dumps(my_list_of_dicts).replace("\"custom_module_id\"", "custom_module_id")
my_query_string = my_query_string.replace("\"answer\"", "answer")
my_query_string = my_query_string.replace("\"user_id\"", "user_id")
But I was wondering whether there is a better way to achieve this? By "better" I mean some function call that will turn the JSON/dictionary into a ready-to-use GraphQL string.
I think this may help you find your final answer. Follow this article; the idea is to pass your data as GraphQL variables rather than splicing it into the query string:
gq = """
mutation ReorderProducts($id: ID!, $moves: [MoveInput!]!) {
collectionReorderProducts(id: $id, moves: $moves) {
job {
id
}
userErrors {
field
message
}
}
}
"""
resp = self.sy_graphql_client.execute(
    query=gq,
    variables={
        "id": before_collection_meta.coll_meta.id,
        "moves": list(map(lambda mtc: {
            "id": mtc.id, "newPosition": mtc.new_position
        }, move_to_commands))
    }
)
reorder_job_id = resp["data"]["collectionReorderProducts"]["job"]["id"]
self.sy_graphql_client.wait_for_job(reorder_job_id)
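Applied to the question's data, a minimal sketch with requests (the endpoint URL, mutation name, and variable names below are placeholders, not a real API) might look like this:
import requests

mutation = """
mutation SaveAnswers($answers: [AnswerInput!]!) {
  saveAnswers(answers: $answers) {
    ok
  }
}
"""

# The list of dicts is sent as-is under "variables";
# no stripping of quotes around keys is needed.
resp = requests.post(
    "https://example.com/graphql",
    json={"query": mutation, "variables": {"answers": my_list_of_dicts}},
)
print(resp.json())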
I am trying to read some JSON with the following format. A simple pd.read_json() returns ValueError: Trailing data. Adding lines=True returns ValueError: Expected object or value. I've tried various combinations of readlines() and load()/loads() so far, without success.
Any ideas how I could get this into a DataFrame?
{
    "content": "kdjfsfkjlffsdkj",
    "source": {
        "name": "jfkldsjf"
    },
    "title": "dsldkjfslj",
    "url": "vkljfklgjkdlgj"
}
{
    "content": "djlskgfdklgjkfgj",
    "source": {
        "name": "ldfjkdfjs"
    },
    "title": "lfsjdfklfldsjf",
    "url": "lkjlfggdflkjgdlf"
}
The sample you have above isn't valid JSON. To be valid JSON, these objects need to be inside a JSON array ([]) and be comma-separated, as follows:
[
    {
        "content": "kdjfsfkjlffsdkj",
        "source": {
            "name": "jfkldsjf"
        },
        "title": "dsldkjfslj",
        "url": "vkljfklgjkdlgj"
    },
    {
        "content": "djlskgfdklgjkfgj",
        "source": {
            "name": "ldfjkdfjs"
        },
        "title": "lfsjdfklfldsjf",
        "url": "lkjlfggdflkjgdlf"
    }
]
I just tried it on my machine; when formatted correctly, it works:
>>> pd.read_json('data.json')
content source title url
0 kdjfsfkjlffsdkj {'name': 'jfkldsjf'} dsldkjfslj vkljfklgjkdlgj
1 djlskgfdklgjkfgj {'name': 'ldfjkdfjs'} lfsjdfklfldsjf lkjlfggdflkjgdlf
Here is another solution if you do not want to reformat your files.
Assuming your JSON is in a string called my_json, you could do:
import json
import pandas as pd
splitted = my_json.split('\n\n')
my_list = [json.loads(e) for e in splitted]
df = pd.DataFrame(my_list)
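If the records live in a file rather than a string, a small sketch to obtain my_json first (the filename data.json is an assumption) would be:
# Read the whole file into one string, then reuse the snippet above
with open('data.json') as f:
    my_json = f.read()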
Thanks for the ideas, internet. None quite solved the problem in the way I needed (I had lots of newline characters inside the strings themselves, which meant I couldn't split on them), but they helped point the way. In case anyone has a similar problem, this is what worked for me:
import json
import pandas as pd

with open('path/to/original.json', 'r') as f:
    data = f.read()

# Split on the closing brace followed by a newline, then re-attach the brace
data = data.split("}\n")
data = [d.strip() + "}" for d in data]
# Drop any stray entries that are now just "}"
data = list(filter(("}").__ne__, data))
data = [json.loads(d) for d in data]

with open('path/to/reformatted.json', 'w') as f:
    json.dump(data, f)

df = pd.read_json('path/to/reformatted.json')
If you can use jq, then the solution is simpler:
jq -s '.' path/to/original.json > path/to/reformatted.json
Can someone tell me what I am doing wrong? I am getting this error.
I went through earlier posts about a similar error but couldn't understand them.
import json
import re
import requests
import subprocess
res = requests.get('https://api.tempura1.com/api/1.0/recipes', auth=('12345','123'), headers={'App-Key': 'some key'})
data = res.text
extracted_recipes = []
for recipe in data['recipes']:
    extracted_recipes.append({
        'name': recipe['name'],
        'status': recipe['status']
    })
print extracted_recipes
TypeError: string indices must be integers
data contains the following:
{
    "recipes": {
        "47635": {
            "name": "Desitnation Search",
            "status": "SUCCESSFUL",
            "kitchen": "eu",
            "active": "YES",
            "created_at": 1501672231,
            "interval": 5,
            "use_legacy_notifications": false
        },
        "65568": {
            "name": "Validation",
            "status": "SUCCESSFUL",
            "kitchen": "us-west",
            "active": "YES",
            "created_at": 1522583593,
            "interval": 5,
            "use_legacy_notifications": false
        },
        "47437": {
            "name": "Gateday",
            "status": "SUCCESSFUL",
            "kitchen": "us-west",
            "active": "YES",
            "created_at": 1501411588,
            "interval": 10,
            "use_legacy_notifications": false
        }
    },
    "counts": {
        "total": 3,
        "limited": 3,
        "filtered": 3
    }
}
You are not converting the text to JSON. Try
data = json.loads(res.text)
or
data = res.json()
Apart from that, you probably need to change the for loop to iterate over the values instead of the keys. Change it to something like the following:
for recipe in data['recipes'].values():
There are two problems with your code, which you could have found out by yourself by doing a very minimal amount of debugging.
The first problem is that you don't parse the response contents from JSON into a native Python object. Here:
data = res.text
data is a string (JSON-formatted, but still a string). You need to parse it to turn it into its Python representation (in this case a dict). You can do that using the stdlib's json.loads() (the general solution) or, since you're using python-requests, just by calling the Response.json() method:
data = res.json()
Then you have this:
for recipe in data['recipes']:
    # ...
Now that we have turned data into a proper dict, we can access the data['recipes'] subdict, but iterating directly over a dict actually iterates over the keys, not the values, so in your for loop above recipe will be a string ("47635", "65568", etc.). If you want to iterate over the values, you have to ask for them explicitly:
for recipe in data['recipes'].values():
    # now `recipe` is the dict you expected
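Putting both fixes together, a minimal sketch of the corrected code (the URL, credentials, and header are copied from the question and are placeholders) might look like this:
import requests

res = requests.get('https://api.tempura1.com/api/1.0/recipes',
                   auth=('12345', '123'),
                   headers={'App-Key': 'some key'})
data = res.json()  # parse the JSON body into a dict

extracted_recipes = []
for recipe in data['recipes'].values():  # iterate over the recipe dicts, not the keys
    extracted_recipes.append({
        'name': recipe['name'],
        'status': recipe['status']
    })

print(extracted_recipes)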
I am new to Python. I am writing code where I need to read a JSON file and dump some of its data to a new JSON file.
Following is my code:
if vmName=='IGW':
    with open(APIJSONPath+NwIntfJSONFileName) as f:
        data=json.load(f)
    for item in data['Interfaces']:
        jdata=item
    with open(NwIntfJSONPath+vmName+'.json','w') as c:
        json.dump(jdata,c,indent=2)
Following is a small part of my JSON file data, from which this code is supposed to retrieve the interface details (interface name, IPAddress, PrefixLength, DefaultGateway) of eth0 and eth1:
{
    "Interfaces": [
        {
            "Name": "eth0",
            "IPConfigurations": [{
                "IPAddress": "10.0.3.7",
                "PrefixLength": 24,
                "DefaultGateway": "10.0.3.1",
                "IsPrimary": true
            }],
            "Description0": "connected to cloudsimple network",
            "IsPrimary": true
        },
        {
            "Name": "eth1",
            "IPConfigurations": [{
                "IPAddress": "10.0.3.8",
                "PrefixLength": 24,
                "DefaultGateway": "10.0.3.1",
                "IsPrimary": true
            }],
            "Description1": "connected to internet"
        }
    ]
}
But the data that is getting dumped to the new JSON file is:
{
    "Name": "eth1",
    "IPConfigurations": [
        {
            "PrefixLength": 24,
            "IsPrimary": true,
            "IPAddress": "10.0.3.8",
            "DefaultGateway": "10.0.3.1"
        }
    ],
    "Description1": "connected to internet"
}
Only the eth1 details are getting dumped, not eth0. The dumped data is also in an unordered manner.
Can someone please help me figure out where I am going wrong and how to fix these two issues in my code? Thanks in advance.
If you need all of the content of data['Interfaces'] in your output JSON, use the snippet below.
if vmName=='IGW':
    with open(APIJSONPath+NwIntfJSONFileName) as f:
        data=json.load(f)
    with open(NwIntfJSONPath+vmName+'.json','w') as c:
        json.dump(data['Interfaces'],c,indent=2)
In your example you are looping through data['Interfaces'], so after the loop jdata holds only the last value of the list. That is why you are only getting the last element in the output JSON.
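If only the fields mentioned in the question (Name, IPAddress, PrefixLength, DefaultGateway) are needed, a minimal sketch (field names taken from the sample JSON, the same path variables assumed, and the trimmed list name is my own) could build a reduced list before dumping:
import json

if vmName=='IGW':
    with open(APIJSONPath+NwIntfJSONFileName) as f:
        data = json.load(f)

    trimmed = []
    for item in data['Interfaces']:
        ipconf = item['IPConfigurations'][0]  # first IP configuration per interface
        trimmed.append({
            'Name': item['Name'],
            'IPAddress': ipconf['IPAddress'],
            'PrefixLength': ipconf['PrefixLength'],
            'DefaultGateway': ipconf['DefaultGateway']
        })

    with open(NwIntfJSONPath+vmName+'.json','w') as c:
        json.dump(trimmed, c, indent=2)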
I have a list of JSON objects that I would like to write to a JSON file. An example of my data is as follows:
{
    "_id": "abc",
    "resolved": false,
    "timestamp": "2017-04-18T04:57:41 366000",
    "timestamp_utc": {
        "$date": 1492509461366
    },
    "sessionID": "abc",
    "resHeight": 768,
    "time_bucket": ["2017-year", "2017-04-month", "2017-16-week", "2017-04-18-day", "2017-04-18 16-hour"],
    "referrer": "Standalone",
    "g_event_id": "abc",
    "user_agent": "abc"
    "_id": "abc",
} {
    "_id": "abc",
    "resolved": false,
    "timestamp": "2017-04-18T04:57:41 366000",
    "timestamp_utc": {
        "$date": 1492509461366
    },
    "sessionID": "abc",
    "resHeight": 768,
    "time_bucket": ["2017-year", "2017-04-month", "2017-16-week", "2017-04-18-day", "2017-04-18 16-hour"],
    "referrer": "Standalone",
    "g_event_id": "abc",
    "user_agent": "abc"
}
I would like to write this to a JSON file. Here's the code that I am using for this purpose:
with open("filename", 'w') as outfile1:
for row in data:
outfile1.write(json.dumps(row))
But this gives me a file with only one long row of data. I would like to have a row for each JSON object in my original data. I know there are other Stack Overflow questions addressing a somewhat similar situation (by externally inserting '\n', etc.), but that hasn't worked in my case for some reason. I believe there has to be a Pythonic way to do this.
How do I achieve this?
The format of the file you are trying to create is called JSON Lines.
It seems you are asking why the JSON objects are not separated by newlines: the write method does not append a newline.
If you want implicit newlines, you are better off using the print function:
with open("filename", 'w') as outfile1:
for row in data:
print(json.dumps(row), file=outfile1)
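As a follow-up, a file written in this JSON Lines format can be read back into a DataFrame with pandas (using the same placeholder filename as above):
import pandas as pd

# lines=True tells pandas to expect one JSON object per line
df = pd.read_json("filename", lines=True)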
Use the indent argument to output JSON with extra whitespace. The default is not to output line breaks or extra spaces.
with open('filename.json', 'w') as outfile1:
    json.dump(data, outfile1, indent=4)
https://docs.python.org/3/library/json.html#basic-usage