This is my first attempt at something like this. How do I get the desired output?
The desired JSON looks like this:
{
"cluster": "ABC,DEF,GHI,SMU,RTR,DIP-OLM"
}
Following various answers here, this is all I have so far:
import pandas as pd
file = 'my_template.xlsx'
ans = pd.read_excel(file, sheet_name='clusters')
f = ans.to_json(indent=2)
print(f)
which gives
{
"service_name":{
"0":"ABC",
"1":"DEF",
"2":"GHI",
"3":"SMU",
"4":"MDS",
"5":"APS_OLM",
"6":"RTR",
"7":"DIP-OLM",
"8":"LMS-JSX"
}
}
Some help would open the way for me to get a better understanding of this. Thanks in advance. Would xlrd be better suited for this? There are a couple more sheets that have to be nested inside this JSON.
INFO: the complete JSON is something like this:
{
"cluster": "ABC,DEF,GHI,SMU,RTR,DIP-OLM",
"version": "07.04.00",
"ABC": {
"microservices": "ABC - HA (07.04.00)",
"maps": {
"pam_bld_nr": "21",
"pam_bld_nr_frontend": "21",
"pam_bld_nr_backend": "21",
"pam_bt_dk_size": "200"
},
"new_block": {
"additional_param": "additional_param_value",
"additional_param": "additional_param_value2"
}
},
The rest of the values need to come from the other two sheets in the same file.
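For the "cluster" key, one possible approach is to join the service_name column into a single comma-separated string. The sketch below is only an illustration under assumptions: build_cluster and the wanted list are hypothetical names, and because the question does not say why MDS, APS_OLM and LMS-JSX are missing from the desired output, the filter is just a hand-written allow-list. With pandas, the names could come from ans["service_name"].tolist().

```python
import json


def build_cluster(names, wanted=None):
    """Join service names into the comma-separated "cluster" string.

    `wanted` is a hypothetical allow-list; pass None to keep every name.
    """
    if wanted is not None:
        keep = set(wanted)
        names = [n for n in names if n in keep]
    return {"cluster": ",".join(names)}


# Stand-in for ans["service_name"].tolist() from the question
service_names = ["ABC", "DEF", "GHI", "SMU", "MDS",
                 "APS_OLM", "RTR", "DIP-OLM", "LMS-JSX"]
# Assumption: the services to keep are listed by hand
wanted = ["ABC", "DEF", "GHI", "SMU", "RTR", "DIP-OLM"]

cluster = build_cluster(service_names, wanted)
print(json.dumps(cluster, indent=2))
```

The nested blocks from the other sheets could then be added as further keys on the same dictionary before calling json.dumps.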
I would like to populate a json by iterating through a list of list with python.
Currently the list of list looks like this:
bookmark_apps = [['Google App','https://google.com'], ['Yahoo App','https://yahoo.com'], ['Espn App','https://espn.com']]
I would like to populate the JSON that looks like this:
{
"name": **Populate App Name Here**,
"label": **Populate App Name Here**,
"signOnMode": "BOOKMARK",
"settings": {
"app": {
"requestIntegration": false,
"url": **Populate App URL Here**
}
}
}
I'm confident that there is a better way to approach this than the way I'm trying. I tried breaking it down into two lists and iterating through them with zip, like this:
app_name_label = []
for sub_list in bookmark_apps:
    app_name_label.append(sub_list[0])
bookmark_url = []
for sub_list2 in bookmark_apps:
    bookmark_url.append(sub_list2[1])
for i, j in zip(app_name_label, bookmark_url):
    ...  # loop body was not included in the question
Any suggestions or approaches to get to the solution of populating the json will be greatly appreciated. Thanks.
Use a list comprehension that creates dictionaries from the nested lists.
result = [{
    "name": name,
    "label": name,
    "signOnMode": "BOOKMARK",
    "settings": {
        "app": {
            "requestIntegration": False,
            "url": url
        }
    }
} for name, url in bookmark_apps]
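To turn those dictionaries into actual JSON text (with false rather than Python's False), they can be passed through json.dumps. This is a minimal sketch; the build_payloads helper name is an assumption made for illustration, not something from the question.

```python
import json

bookmark_apps = [
    ["Google App", "https://google.com"],
    ["Yahoo App", "https://yahoo.com"],
    ["Espn App", "https://espn.com"],
]


def build_payloads(apps):
    """Build one bookmark-app dictionary per [name, url] pair."""
    return [
        {
            "name": name,
            "label": name,
            "signOnMode": "BOOKMARK",
            "settings": {"app": {"requestIntegration": False, "url": url}},
        }
        for name, url in apps
    ]


payloads = build_payloads(bookmark_apps)
# json.dumps converts Python's False to JSON's false
body = json.dumps(payloads[0], indent=2)
print(body)
```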
I am trying to read some json with the following format. A simple pd.read_json() returns ValueError: Trailing data. Adding lines=True returns ValueError: Expected object or value. I've tried various combinations of readlines() and load()/loads() so far without success.
Any ideas how I could get this into a dataframe?
{
"content": "kdjfsfkjlffsdkj",
"source": {
"name": "jfkldsjf"
},
"title": "dsldkjfslj",
"url": "vkljfklgjkdlgj"
}
{
"content": "djlskgfdklgjkfgj",
"source": {
"name": "ldfjkdfjs"
},
"title": "lfsjdfklfldsjf",
"url": "lkjlfggdflkjgdlf"
}
The sample you have above isn't valid JSON. To be valid JSON, these objects need to be inside a JSON array ([]) and be comma-separated, as follows:
[{
"content": "kdjfsfkjlffsdkj",
"source": {
"name": "jfkldsjf"
},
"title": "dsldkjfslj",
"url": "vkljfklgjkdlgj"
},
{
"content": "djlskgfdklgjkfgj",
"source": {
"name": "ldfjkdfjs"
},
"title": "lfsjdfklfldsjf",
"url": "lkjlfggdflkjgdlf"
}]
I just tried it on my machine. When formatted correctly, it works:
>>> pd.read_json('data.json')
content source title url
0 kdjfsfkjlffsdkj {'name': 'jfkldsjf'} dsldkjfslj vkljfklgjkdlgj
1 djlskgfdklgjkfgj {'name': 'ldfjkdfjs'} lfsjdfklfldsjf lkjlfggdflkjgdlf
Another solution if you do not want to reformat your files.
Assuming your JSON is in a string called my_json you could do:
import json
import pandas as pd
splitted = my_json.split('\n\n')
my_list = [json.loads(e) for e in splitted]
df = pd.DataFrame(my_list)
Thanks for the ideas, internet. None quite solved the problem in the way I needed (I had lots of newline characters in the strings themselves, which meant I couldn't split on them), but they helped point the way. In case anyone has a similar problem, this is what worked for me:
import json
import pandas as pd

with open('path/to/original.json', 'r') as f:
    data = f.read()
# Split after every closing brace that ends a line, then put the brace back
data = data.split("}\n")
data = [d.strip() + "}" for d in data]
# Drop fragments that are just "}" (e.g. from a trailing newline at the end of the file)
data = list(filter(("}").__ne__, data))
data = [json.loads(d) for d in data]
with open('path/to/reformatted.json', 'w') as f:
    json.dump(data, f)
df = pd.read_json('path/to/reformatted.json')
If you can use jq, the solution is simpler:
jq -s '.' path/to/original.json > path/to/reformatted.json
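Another option that avoids splitting on newlines or braces is the standard library's json.JSONDecoder.raw_decode, which parses one value and reports where it stopped. The sketch below is an illustration only; iter_json_objects is a hypothetical helper name, and the sample text is taken from the question.

```python
import json


def iter_json_objects(text):
    """Yield each top-level JSON value from a string of concatenated values."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so do it by hand
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


sample = """{
"content": "kdjfsfkjlffsdkj",
"source": {
"name": "jfkldsjf"
},
"title": "dsldkjfslj",
"url": "vkljfklgjkdlgj"
}
{
"content": "djlskgfdklgjkfgj",
"source": {
"name": "ldfjkdfjs"
},
"title": "lfsjdfklfldsjf",
"url": "lkjlfggdflkjgdlf"
}
"""

records = list(iter_json_objects(sample))
print(len(records))
```

The resulting list of dictionaries can be handed to pd.DataFrame(records) in the same way as in the answers above.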
I am new to Python and would like to extract data from JSON with Pandas.
The nested JSON structure is as follows:
{
    "idDriver": "100001",
    "defaultTripType": "private",
    "fleetManagerRole": null,
    "identifications": [
        {
            "code": "90-00-00-77-20",
            "from": "2019-08-08T10:38:15Z",
            "rawId": "",
            "vehicle": {
                "isBusinessCar": "0",
                "id": "10000",
                "licensePlate": "ABCD",
                "class": "Suziki 1.6 CDTI"
            }
        }
    ]
}
As output I need a single line with 'idDriver' from level 0 and 'licensePlate' from the identifications/vehicle node.
What I have tried so far (after loading the data from the API, which works fine):
json_data = json.loads(myResponse.text)
#only unwrapping 'identifications' – works 100% fine
workdata = json_normalize(json_data, record_path= ['identifications'],
meta=['idDriver'])
#unwrapping 'identifications'\'vehicle' - is NOT working
workdata = json_normalize(json_data, record_path= ['identifications','vehicle'],
meta=['idDriver'])
I would appreciate any hint on that.
Kind Regards,
Arek
I would go for rebuilding your dictionary like this:
New_Data = {
"id" : [],
"licensePlate" : []
}
New_Data["id"].append(data["idDriver"])
New_Data["licensePlate"].append(data["identifications"][0]["vehicle"]["licensePlate"])
If you have many entries in data["identifications"], you can easily loop over them; if you have many drivers, you can do the same.
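The looping idea can be sketched roughly as follows. This is an illustration under assumptions: driver_plates is a hypothetical helper, and the input is assumed to be a list of driver objects shaped like the one in the question. As a side note, json_normalize's record_path is meant to point at a list of records, whereas vehicle is a single object, which is probably why the second call in the question fails.

```python
def driver_plates(drivers):
    """Return one row per (driver, identification) pair."""
    rows = []
    for driver in drivers:
        for ident in driver.get("identifications", []):
            # Tolerate identifications that have no vehicle attached
            vehicle = ident.get("vehicle") or {}
            rows.append({
                "idDriver": driver["idDriver"],
                "licensePlate": vehicle.get("licensePlate"),
            })
    return rows


# Sample data shaped like the object in the question
drivers = [{
    "idDriver": "100001",
    "defaultTripType": "private",
    "fleetManagerRole": None,
    "identifications": [{
        "code": "90-00-00-77-20",
        "from": "2019-08-08T10:38:15Z",
        "rawId": "",
        "vehicle": {
            "isBusinessCar": "0",
            "id": "10000",
            "licensePlate": "ABCD",
            "class": "Suziki 1.6 CDTI",
        },
    }],
}]

rows = driver_plates(drivers)
print(rows)
```

The rows list can be passed straight to pd.DataFrame(rows) to get the two columns side by side.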
Your first snippet works fine for me; if necessary, just remove the vehicle. prefix from the column names:
json_data = {
"idDriver": "100001",
"defaultTripType": "private",
"fleetManagerRole": None,
"identifications": [
{
"code": "90-00-00-77-20",
"from": "2019-08-08T10:38:15Z",
"rawId": "",
"vehicle": {
"isBusinessCar": "0",
"id": "10000",
"licensePlate": "ABCD",
"class": "Suziki 1.6 CDTI",
}
}
]
}
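from pandas import json_normalize  # pandas >= 1.0; older versions: from pandas.io.json import json_normalize
workdata = json_normalize(json_data, record_path=['identifications'], meta=['idDriver'])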
print (workdata)
code from rawId vehicle.isBusinessCar \
0 90-00-00-77-20 2019-08-08T10:38:15Z 0
vehicle.id vehicle.licensePlate vehicle.class idDriver
0 10000 ABCD Suziki 1.6 CDTI 100001
# Literal (non-regex) replacement of the "vehicle." prefix
workdata.columns = workdata.columns.str.replace('vehicle.', '', regex=False)
print (workdata)
code from rawId isBusinessCar id \
0 90-00-00-77-20 2019-08-08T10:38:15Z 0 10000
licensePlate class idDriver
0 ABCD Suziki 1.6 CDTI 100001
I recently wrote a package to make tasks like this easy; it's called cherrypicker. I think the following snippet would achieve your task with CherryPicker:
from cherrypicker import CherryPicker
json_data = json.loads(myResponse.text)
picker = CherryPicker(json_data)
flat_data = picker.flatten['idDriver', 'identifications_0_vehicle_licensePlate'].get()
flat_data would then look like this (I'm assuming that your data is actually a list of objects like the one you described above):
[['100001', 'ABCD'], ...]
You can then load this into a dataframe as follows:
import pandas as pd
df = pd.DataFrame(flat_data, columns=["idDriver", "licensePlate"])
If you want to flatten your data in slightly different ways (e.g. you want every license plate/driver ID combination, not just the first license plate for each driver), then you should be able to do this too, although it may require two or three lines rather than just one. Check out the docs for examples of other ways of using it: https://cherrypicker.readthedocs.io.
To install cherrypicker, it's just pip install --user cherrypicker.
I am writing a python lambda function that reads in a json file from s3 and then will take one of the nodes and send it to another lambda function. Here is my code:
The json snippet I want
"jobstreams": [
{
"jobname": "team-summary",
"bucket": "aaa-bbb",
"key": "team-summary.json"
}
step 1 – convert JSON to python objects for processing
note: these I got from another Stack Overflow guru - thanks!!
def _json_object_hook(d): return namedtuple('X', d.keys())(*d.values())
def json2obj(data): return json.loads(data, object_hook=_json_object_hook)
routes = json2obj(jsonText)
step 2 - I then traverse the python objects and find the json I need and dump it
for jobstream in jobstreams:
    x = json.dumps(jobstream, ensure_ascii=False)
However, when I print it out, I only get the values, not the attribute names. Why is that?
print(json.dumps(jobstream, ensure_ascii=False))
yields
["team-summary", "aaa-bbb", "team-summary.json"]
I'm assuming your full json file looks somewhat like what I have in my example
import json
js = {"jobstreams": [
{
"jobname": "team-summary",
"bucket": "aaa-bbb",
"key": "team-summary.json"
},
{
"jobname": "team-2222",
"bucket": "aaa-2222",
"key": "team-222.json"
}
]}
def extract_by_jobname(jobname):
    for d in js['jobstreams']:
        if d['jobname'] == jobname:
            return d
json.dumps(extract_by_jobname("team-summary"))
# '{"jobname": "team-summary", "bucket": "aaa-bbb", "key": "team-summary.json"}'
I ended up creating a new dictionary from the list that json.dumps gave me:
["team-summary", "aaa-bbb", "team-summary.json"]
Once I had the new (flat) dictionary, I converted that to JSON. Probably not the most efficient approach, but I have other fish to fry. Thanks to all for your help!
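For anyone wondering why only the values appeared: a namedtuple is a tuple subclass, so json.dumps serializes it as a JSON array. Calling _asdict() on the namedtuple first keeps the field names. The sketch below reuses the object hook from the question; the json_text string is an assumed, shortened stand-in for the real file contents.

```python
import json
from collections import namedtuple


def _json_object_hook(d):
    return namedtuple('X', d.keys())(*d.values())


def json2obj(data):
    return json.loads(data, object_hook=_json_object_hook)


# Assumed stand-in for the JSON text read from S3
json_text = (
    '{"jobstreams": [{"jobname": "team-summary", '
    '"bucket": "aaa-bbb", "key": "team-summary.json"}]}'
)

routes = json2obj(json_text)
jobstream = routes.jobstreams[0]

# A namedtuple is a tuple, so it is dumped as an array of values
as_array = json.dumps(jobstream, ensure_ascii=False)
# _asdict() restores the field names before dumping
as_object = json.dumps(jobstream._asdict(), ensure_ascii=False)

print(as_array)
print(as_object)
```

Note that _asdict() only converts the top level, so a namedtuple that itself contains nested namedtuples would need to be converted recursively.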
I'm new to JSON and Python, any help on this would be greatly appreciated.
I read about json.loads but am confused
How do I read a file into Python using json.loads?
Below is my JSON file format:
{
"header": {
"platform":"atm"
"version":"2.0"
}
"details":[
{
"abc":"3"
"def":"4"
},
{
"abc":"5"
"def":"6"
},
{
"abc":"7"
"def":"8"
}
]
}
My requirement is to read the values of all "abc" and "def" keys in details and add them to a new list like this: [(1,2),(3,4),(5,6),(7,8)]. The new list will be used to create a Spark data frame.
Open the file, and get a filehandle:
fh = open('thefile.json')
https://docs.python.org/2/library/functions.html#open
Then pass the file handle into json.load() (don't use loads; that's for strings):
import json
data = json.load(fh)
https://docs.python.org/2/library/json.html#json.load
From there, you can easily deal with a python dictionary that represents your json-encoded data.
new_list = [(detail['abc'], detail['def']) for detail in data['details']]
Note that your JSON format is also wrong. You will need comma delimiters in many places, but that's not the question.
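Put together, the steps above might look like the following sketch. It assumes the file has already been corrected to valid JSON; read_pairs is a hypothetical helper name, and the temporary file exists only so the example runs on its own. A with block is used so that the file handle is closed automatically.

```python
import json
import os
import tempfile


def read_pairs(path):
    """Return (abc, def) integer tuples from the "details" list in a JSON file."""
    # The with block closes the file even if parsing fails
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    return [(int(d["abc"]), int(d["def"])) for d in data["details"]]


# Demo: write a corrected copy of the sample to a temporary file, then read it
sample = {
    "header": {"platform": "atm", "version": "2.0"},
    "details": [
        {"abc": "3", "def": "4"},
        {"abc": "5", "def": "6"},
        {"abc": "7", "def": "8"},
    ],
}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "thefile.json")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(sample, fh)
    pairs = read_pairs(path)

print(pairs)
```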
I'm trying to understand your question as best as I can, but it looks like it was formatted poorly.
First off, your JSON blob is not valid JSON; it is missing quite a few commas. This is probably what you are looking for:
{
"header": {
"platform": "atm",
"version": "2.0"
},
"details": [
{
"abc": "3",
"def": "4"
},
{
"abc": "5",
"def": "6"
},
{
"abc": "7",
"def": "8"
}
]
}
Now assuming you are trying to parse this in python you will have to do the following.
import json
json_blob = '{"header": {"platform": "atm","version": "2.0"},"details": [{"abc": "3","def": "4"},{"abc": "5","def": "6"},{"abc": "7","def": "8"}]}'
json_obj = json.loads(json_blob)
final_list = []
for single in json_obj['details']:
    final_list.append((int(single['abc']), int(single['def'])))
print(final_list)
This will print the following: [(3, 4), (5, 6), (7, 8)]