I have some json data similar to this...
{
"people": [
{
"name": "billy",
"age": "12"
...
...
},
{
"name": "karl",
"age": "31"
...
...
},
...
...
]
}
At the moment I can do this to get a entry from the people list...
wantedPerson = "karl"
for person in people:
if person['name'] == wantedPerson:
* I have the persons entry *
break
Is there a better way of doing this? Something similar to how we can .get('key') ?
Thanks,
Chris
Assuming you load that json data using the standard library for it, you're fairly close to optimal, perhaps you were looking for something like this:
from json import loads
text = '{"people": [{"name": "billy", "age": "12"}, {"name": "karl", "age": "31"}]}'
data = loads(text)
people = [p for p in data['people'] if p['name'] == 'karl']
If you frequently need to access this data, you might just do something like this:
all_people = {p['name']: p for p in data['people']}
print(all_people['karl'])
That is, all_people becomes a dictionary that uses the name as a key, so you can access any person in it quickly by accessing them by name. This assumes however that there are no duplicate names in your data.
First, there's no problem with your current 'naive' approach - it's clear and efficient since you can't find the value you're looking for without scanning the list.
It seems that you refer to better as shorter, so if you want a one-liner solution, consider the following:
next((person for person in people if person.name == wantedPerson), None)
It gets the first person in the list that has the required name or None if no such person was found.
similarly
ps = {
"people": [
{
"name": "billy",
"age": "12"
},
{
"name": "karl",
"age": "31"
},
]
}
print([x for x in ps['people'] if 'karl' in x.values()])
For possible alternatives or details see e.g. # Get key by value in dictionary
Related
I am using Python requests library to execute GraphQL mutation. I need to pass requests library a query parameter which should contain a string which should be constructed from the Python list of Python dictionaries.
Python list of dictionaries looks like:
my_list_of_dicts = [{"custom_module_id": "23", "answer": "some text 2", "user_id": "111"},
{"custom_module_id": "24", "answer": "a", "user_id": "111"}]
Now I need to convert this list of dictionaries in a string so it should look like this:
my_list_of_dicts = [{custom_module_id: "23", answer: "some text 2", user_id: "111"},
{custom_module_id: "24", answer: "a", user_id: "111"}]
Basically I need to get the string that looks like a Python list of dictionaries except that keys of the dictionaries does not have quotations around dictionary key names. I did this and it works:
my_query_string = json.dumps(my_list_of_dicts).replace("\"custom_module_id\"", "custom_module_id")
my_query_string = my_query_string.replace("\"answer\"", "answer")
my_query_string = my_query_string.replace("\"user_id\"", "user_id")
But I was wondering maybe there is better way to achieve this? By "better" I mean some function call that will prepare json/dictionary format for ready to be used GraphQL string.
I think this may help you find your final answer.
Follow this article
gq = """
mutation ReorderProducts($id: ID!, $moves: [MoveInput!]!) {
collectionReorderProducts(id: $id, moves: $moves) {
job {
id
}
userErrors {
field
message
}
}
}
"""
resp = self.sy_graphql_client.execute(
query=gq,
variables={
"id": before_collection_meta.coll_meta.id,
"moves": list(map(lambda mtc:
{
"id": mtc.id, "newPosition": mtc.new_position
}, move_to_commands))
}
)
reorder_job_id = resp["data"]["collectionReorderProducts"]["job"]["id"]
self.sy_graphql_client.wait_for_job(reorder_job_id)
I have a .json file with many entries looking like this:
{
"name": "abc",
"time": "20220607T190731.442",
"id": "123",
"relatedIds": [
{
"id": "456",
"source": "sourceA"
},
{
"id": "789",
"source": "sourceB"
}
],
}
I am saving each entry in a python object, however, I only need the related ID from source A. Problem is, the related ID from source A is not always first place in that nested list.
So data['relatedIds'][0]['id'] is not reliable to yield the right Id.
Currently I am solving the issue like this:
import json
with open("filepath", 'r') as file:
data = json.load(file)
for value in data['relatedIds']:
if(value['source'] == 'sourceA'):
id_from_a = value['id']
entry = Entry(data['name'], data['time'], data['id'], id_from_a)
I don't think this approach is the optimal solution though, especially if relatedIds list gets longer and more entries appended to the JSON file.
Is there a more sophisticated way of singling out this 'id' value from a specified source without looping through all entries in that nested list?
For a cleaner solution, you could try using python's filter() function with a simple lambda:
import json
with open("filepath", 'r') as file:
data = json.load(file)
filtered_data = filter(lambda a : a["source"] == "sourceA", data["relatedIds"])
id_from_a = next(filtered_data)['id']
entry = Entry(data['name'], data['time'], data['id'], id_from_a)
Correct me if I misunderstand how your json file looks, but it seems to work for me.
One step at a time, in order to get to all entries:
>>> data["relatedIds"]
[{'id': '789', 'source': 'sourceB'}, {'id': '456', 'source': 'sourceA'}]
Next, in order to get only those entries with source=sourceA:
>>> [e for e in data["relatedIds"] if e["source"] == "sourceA"]
[{'id': '456', 'source': 'sourceA'}]
Now, since you don't want the whole entry, but just the ID, we can go a little further:
>>> [e["id"] for e in data["relatedIds"] if e["source"] == "sourceA"]
['456']
From there, just grab the first ID:
>>> [e["id"] for e in data["relatedIds"] if e["source"] == "sourceA"][0]
'456'
Can you get whatever generates your .json file to produce the relatedIds as an object rather than a list?
{
"name": "abc",
"time": "20220607T190731.442",
"id": "123",
"relatedIds": {
"sourceA": "456",
"sourceB": "789"
}
}
If not, I'd say you're stuck looping through the list until you find what you're looking for.
Appreciate if you could help me for the best way to transform a result into json as below.
We have a result like below, where we are getting an information on the employees and the companies. In the result, somehow, we are getting a enum like T, but not for all the properties.
[ {
"T.id":"Employee_11",
"T.category":"Employee",
"node_id":["11"]
},
{
"T.id":"Company_12",
"T.category":"Company",
"node_id":["12"],
"employeecount":800
},
{
"T.id":"id~Employee_11_to_Company_12",
"T.category":"WorksIn",
},
{
"T.id":"Employee_13",
"T.category":"Employee",
"node_id":["13"]
},
{
"T.id":"Parent_Company_14",
"T.category":"ParentCompany",
"node_id":["14"],
"employeecount":900,
"childcompany":"Company_12"
},
{
"T.id":"id~Employee_13_to_Parent_Company_14",
"T.category":"Contractorin",
}]
We need to transform this result into a different structure and grouping based on the category, if category in Employee, Company and ParentCompany, then it should be under the node_properties object, else, should be in the edge_properties. And also, apart from the common properties(property_id, property_category and node), different properties to be added if the category is company and parent company. There are few more logic also where we have to get the from and to properties of the edge object based on the 'to' . the expected response is,
"node_properties":[
{
"property_id":"Employee_11",
"property_category":"Employee",
"node":{node_id: "11"}
},
{
"property_id":"Company_12",
"property_category":"Company",
"node":{node_id: "12"},
"employeecount":800
},
{
"property_id":"Employee_13",
"property_category":"Employee",
"node":{node_id: "13"}
},
{
"property_id":"Company_14",
"property_category":"ParentCompany",
"node":{node_id: "14"},
"employeecount":900,
"childcompany":"Company_12"
}
],
"edge_properties":[
{
"from":"Employee_11",
"to":"Company_12",
"property_id":"Employee_11_to_Company_12",
},
{
"from":"Employee_13",
"to":"Parent_Company_14",
"property_id":"Employee_13_to_Parent_Company_14",
}
]
In java, we have used the enhanced for loop, switch etc. How we can write the code in the python to get the structure as above from the initial result structure. ( I am new to python), thank you in advance.
Regards
Here is a method that I quickly made, you can adjust it to your requirements. You can use regex or your own function to get the IDs of the edge_properties then assign it to an object like the way I did. I am not so sure of your full requirements but if that list that you gave is all the categories then this will be sufficient.
def transform(input_list):
node_properties = []
edge_properties = []
for input_obj in input_list:
# print(obj)
new_obj = {}
if input_obj['T.category'] == 'Employee' or input_obj['T.category'] == 'Company' or input_obj['T.category'] == 'ParentCompany':
new_obj['property_id'] = input_obj['T.id']
new_obj['property_category'] = input_obj['T.category']
new_obj['node'] = {input_obj['node_id'][0]}
if "employeecount" in input_obj:
new_obj['employeecount'] = input_obj['employeecount']
if "childcompany" in input_obj:
new_obj['childcompany'] = input_obj['childcompany']
node_properties.append(new_obj)
else: # You can do elif == to as well based on your requirements if there are other outliers
# You can use regex or whichever method here to split the string and add the values like above
edge_properties.append(new_obj)
return [node_properties, edge_properties]
I'm trying to loop through a JSON file using Python and return the name of the object and associated modules for it.
Right now I can basically get the output I want hardcoding the indexes. However, this obviously isn't the right way to do it (the JSON file can vary in length).
Whenever I try to use a loop, I get errors like:
TypeError: string indices must be integers
My JSON file looks like this:
{
"name": "gaming_companies",
"columns": [{
"name": "publisher",
"type": "string",
"cleansing": ["clean_string"]
},
{
"name": "genre",
"type": "string",
"cleansing": ["match_genre", "clean_string"]
},
{
"name": "sales",
"type": "int",
"cleansing": []
}
]
}
My Python code which is 'working' looks like:
import json as js
def cleansing(games_json):
print (games_json['columns'][0]['name'] + " - cleansing:")
[print(i) for i in games_json['columns'][0]['cleansing'] ]
print (games_json['columns'][1]['name'] + " - cleansing:")
[print(i) for i in games_json['columns'][1]['cleansing'] ]
print (games_json['columns'][2]['name'] + " - cleansing:")
[print(i) for i in games_json['columns'][2]['cleansing'] ]
with open(r'C:\Desktop\gamefolder\jsonfiles\games.json') as input_json:
games_json = js.load(input_json)
cleansing(games_json)
The output I'm trying to return is:
publisher
cleansing:
clean_string
genre
cleansing:
match_genre
clean_string
sales
cleansing:
My attempt to loop through them like this:
for x in games_json:
for y in games_json['columns'][x]:
print (y)
Results in:
TypeError: list indices must be integers or slices, not str
games_json shows as a Dict.
Columns shows as a list of dictionaries.
Each object's cleansing attribute shows as a list.
I think this is where my problem is, but I'm not able to get over the hurdle.
The problem with your attempt is using an iterator as a string.
The x in for y in games_json['columns'][x]: is an iterator object and not the strings ['name', 'cleansing'].
You can learn more about python iterators here
As for the case - you might want to iterate over the columns as a separate list.
This code should work
for item in f["columns"]:
print(item["name"])
print("cleansing:")
print(item["cleansing"])
Output-
publisher
cleansing:
['clean_string']
genre
cleansing:
['match_genre', 'clean_string']
sales
cleansing:
[]
This can be one of working solutions as you want to iterate array's elements.
import json
for x in games_json['columns']:
print(x)
print(x['name'])
x = """{
"name": "gaming_companies",
"columns": [{
"name": "publisher",
"type": "string",
"cleansing": ["clean_string"]
},
{
"name": "genre",
"type": "string",
"cleansing": ["match_genre", "clean_string"]
},
{
"name": "sales",
"type": "int",
"cleansing": []
}
]
}"""
x = json.loads(x)
for i in x['columns']:
print(i['name'])
print("cleansing:")
for j in i["cleansing"]:
print(j)
print('\n')
Output
publisher
cleansing:
clean_string
genre
cleansing:
match_genre
clean_string
sales
cleansing:
with open(r'C:\Desktop\gamefolder\jsonfiles\games.json') as input_json:
games_json = js.load(input_json)
for i in games_json['columns']:
print(i['name'])
print("cleansing:")
for j in i["cleansing"]:
print(j)
print('\n')
I'm trying to Query data using python pandas library. here is an example json of the data...
[
{
"name": "Bob",
"city": "NY",
"status": "Active"
},
{
"name": "Jake",
"city": "SF",
"status": "Active"
},
{
"name": "Jill",
"city": "NY",
"status": "Lazy"
},
{
"name": "Steve",
"city": "NY",
"status": "Lazy"
}]
My goal is to query the data where city == NY and status == Lazy.
One way using pandas DataFrame is to do...
df = df[(df.status == "Lazy") & (df.city == "NY")]
This is working fine but i wanted this to be more abstract.
This there way I can use **kwargs to filter the data? so far i've had trouble using Pandas documentation.
so far I've done.....
def main(**kwargs):
readJson = pd.read_json(sys.argv[1])
for key,value in kwargs.iteritems():
print(key,value)
readJson = readJson[readJson[key] == value]
print readJson
if __name__ == '__main__':
main(status="Lazy",city="NY")
again...this works just fine, but I wonder if there is some better way to do it.
I don't really see anything wrong with your approach. If you wanted to use df.query you could do something like this, although I'd argue it's less readable.
expr = " and ".join(k + "=='" + v + "'" for (k,v) in kwargs.items())
readJson = readJson.query(expr)
**Kwargs is nothing really to do with Pandas, it is a basic Python thing, you simply need to make a function that accepts Kwargs and substitute the variable Kwargs into the pandas Df query statement (inside the function). Don't have the time to code it for you but reading the Python docs should get you going. Pandas is but one great part of the Python system, when you start to combine multiple parts you will need to get familiar with those pieces.