Parse Json file and save specific values [duplicate] - python

This question already has answers here:
Getting a list of values from a list of dicts
(10 answers)
Closed 5 years ago.
I have this JSON file where the number of ids sometimes changes (more ids will be added):
{
"maps": [
{
"id": "blabla1",
"iscategorical": "0"
},
{
"id": "blabla2",
"iscategorical": "0"
},
{
"id": "blabla3",
"iscategorical": "0"
},
{
"id": "blabla4",
"iscategorical": "0"
}
]
}
I have this python code that has to print all the values of ids:
import json
data = json.load(open('data.json'))
variable1 = data["maps"][0]["id"]
print(variable1)
variable2 = data["maps"][1]["id"]
print(variable2)
variable3 = data["maps"][2]["id"]
print(variable3)
variable4 = data["maps"][3]["id"]
print(variable4)
I have to use variables because I want to show the values in a dropdown menu. Is it possible to save the id values in a more efficient way? And how can I find out how many ids the JSON file contains (4 in the example)?

You can get the number of ids (which is the number of elements) by checking the length of data['maps']:
number_of_ids = len(data['maps'])
A clean way to get all the id values is storing them in a list.
You can achieve this in a pythonic way like this:
list_of_ids = [map['id'] for map in data['maps']]
Using this approach you don't even need to store the number of elements in the original json, because you iterate through all of them using a foreach approach, essentially.
If the list comprehension looks unfamiliar, you can achieve the same thing with a classic for-each loop:
list_of_ids = []
for map in data['maps']:
    list_of_ids.append(map['id'])
Or you can use an index-based for loop, which is where you really need the length:
number_of_ids = len(data['maps'])
list_of_ids = []
for i in range(0, number_of_ids):
    list_of_ids.append(data['maps'][i]['id'])
This last one is the classic way, but I suggest you take one of the other approaches in order to leverage the advantages Python offers you!
You can find more on this stuff here!
Happy coding!

data['maps'] is a simple list, so you can iterate over it as such:
for map in data['maps']:
    print(map['id'])
To store them in a variable, you'll need to output them to a list. Storing them each in a separate variable is not a good idea, because like you said, you don't have a way to know how many there are.
ids = []
for map in data['maps']:
    ids.append(map['id'])
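Putting the pieces above together, a minimal self-contained sketch might look like the following. The inline `raw` string is only a stand-in for the contents of data.json, and the names `raw` and `dropdown_options` are illustrative, not part of the original code.

```python
import json

# Stand-in for the contents of data.json (assumption: same shape as in the question).
raw = '{"maps": [{"id": "blabla1", "iscategorical": "0"}, {"id": "blabla2", "iscategorical": "0"}]}'

data = json.loads(raw)

# One list holds every id, however many entries "maps" contains.
dropdown_options = [entry["id"] for entry in data["maps"]]

print(dropdown_options)       # ['blabla1', 'blabla2']
print(len(dropdown_options))  # 2
```

The list can then be handed to whatever widget builds the dropdown, and its length answers the "how many ids" question without hard-coding a number.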

Related

Python: Trying to extract a value from a list of dictionaries that is stored as a string

I am getting data from an API and storing it in json format. The data I pull is in a list of dictionaries. I am using Python. My task is to only grab the information from the dictionary that matches the ticker symbol.
This is a shortened version of my data, printed with json.dumps:
[
{
"ticker": "BYDDF.US",
"name": "BYD Co Ltd-H",
"price": 25.635,
"change_1d_prc": 9.927101200686117
},
{
"ticker": "BYDDY.US",
"name": "BYD Co Ltd ADR",
"price": 51.22,
"change_1d_prc": 9.843448423761526
},
{
"ticker": "TSLA.US",
"name": "Tesla Inc",
"price": 194.7,
"change_1d_prc": 7.67018746889343
}
]
The task is to get only the dictionary for ticker = TSLA.US and, if possible, only the price associated with this ticker.
I don't know how to reference "ticker" or loop through all of them to get the one I need.
I tried the following, but it says that it's a string, so it doesn't work:
if "ticker" == "TESLA.US":
    print(i)
Try (mylist is your list of dictionaries):
for entry in mylist:
    print(entry['ticker'])
Then try this to get what you want:
for entry in mylist:
    if entry['ticker'] == 'TSLA.US':
        print(entry)
This is a solution that I've seen divide the python community. Some say that it's a feature and "very pythonic"; others say that it's a bad design choice we're stuck with now, and bad practice. I'm personally not a fan, but it is a way to solve this problem, so do with it what you will. :)
Python for loops don't create a new scope; the loop variable persists even after the loop ends. So this is one solution, assuming that your list of dictionaries is stored as json_dict:
for target_dict in json_dict:
    if target_dict["ticker"] == "TSLA.US":
        break
At this point, target_dict will be the dictionary you want.
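One caveat with the break approach: if no entry matches, the loop variable is simply left pointing at the last item in the list. A hedged sketch of one way to guard against that uses the `else` clause of a `for` loop, which runs only when the loop finishes without hitting `break`. The `json_dict` list here is a shortened copy of the data in the question.

```python
# Shortened sample data (assumption: same keys as in the question).
json_dict = [
    {"ticker": "BYDDF.US", "price": 25.635},
    {"ticker": "TSLA.US", "price": 194.7},
]

for target_dict in json_dict:
    if target_dict["ticker"] == "TSLA.US":
        break
else:
    # Reached only if the loop never hit break, i.e. nothing matched.
    target_dict = None

print(target_dict)
```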
It is possible to iterate through a list of dictionaries using a for loop.
def find_price(stocks):
    for stock in stocks:
        if stock["ticker"] == "TSLA.US":
            return stock["price"]
This essentially loops through every item in the list, and for each item (which is a dictionary) looks for the key "ticker" and checks if its value is equal to "TSLA.US". If it is, then it returns the value associated with the "price" key.
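The same search can also be written with the built-in `next` and a generator expression, which stops at the first match and lets you supply a default when nothing is found. This is only a sketch; the helper name `price_for` and the shortened `stocks` list are assumptions made for illustration.

```python
# Shortened sample data (assumption: same keys as in the question).
stocks = [
    {"ticker": "BYDDY.US", "name": "BYD Co Ltd ADR", "price": 51.22},
    {"ticker": "TSLA.US", "name": "Tesla Inc", "price": 194.7},
]


def price_for(stocks, ticker):
    """Return the price for ticker, or None if it is not in the list."""
    match = next((s for s in stocks if s["ticker"] == ticker), None)
    return match["price"] if match is not None else None


print(price_for(stocks, "TSLA.US"))  # 194.7
print(price_for(stocks, "AAPL.US"))  # None
```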

Python search and replace whilst caching [duplicate]

This question already has answers here:
How can I make a dictionary (dict) from separate lists of keys and values?
(21 answers)
Closed 4 months ago.
I'm attempting to search and replace using information from two lists, while caching any replacements that have already been made so that the same name always gets the same corresponding value.
For example, I have the following -
names = ["Mark","Steve","Mark","Chrome","192.168.0.1","Mark","Chrome","192.168.0.1","192.168.0.2"]
type = ["user","user","user","process","address","user","process","address","address"]
And I'm hoping to get the following output -
{
"Mark":"user1",
"Steve":"user2",
"Chrome":"process1",
"192.168.0.1":"address1",
"192.168.0.2":"address2"
}
So I'm trying to use the type in the second list to determine the corresponding value for each item in the first list.
Hope this makes sense, is this possible? Any help would be appreciated.
I would recommend you use a dictionary personally.
names = {
"Mark": "user1",
"Steve": "user2",
"Chrome": "process1",
"192.168.0.1": "address1",
"192.168.0.2": "address2"
}
print(names["Mark"])
With this dictionary you can look up the information for exactly the name you want. It is also a little more readable.
To form a dictionary from said values you can iterate the range and access values with the same index:
output = {names[i]: types[i] for i in range(len(names))}
Also, refrain from using type as a variable name, because it shadows Python's built-in type.
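As a small sketch of the same pairing without explicit indexing, `zip` combined with `dict` gives an equivalent result. Note that this only maps each name to its raw type (not the numbered "user1"-style values), and that for repeated names the last pair wins. The lists are copied from the question, with the second one renamed to `types`.

```python
names = ["Mark", "Steve", "Mark", "Chrome", "192.168.0.1", "Mark", "Chrome", "192.168.0.1", "192.168.0.2"]
types = ["user", "user", "user", "process", "address", "user", "process", "address", "address"]

# zip pairs the items position by position; dict keeps one entry per name.
output = dict(zip(names, types))

print(output)
# {'Mark': 'user', 'Steve': 'user', 'Chrome': 'process', '192.168.0.1': 'address', '192.168.0.2': 'address'}
```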
Looks like you're also trying to store / retrieve the count of the types (i.e. "user1", "user2", "address1", etc.). Hence, we need another data structure to keep count of the types already registered in our "hashmap" (dictionary in Python). In the solution below, we use type_cache for that.
The code should work as is.
from collections import defaultdict
names = ["Mark", "Steve", "Mark", "Chrome", "192.168.0.1", "Mark", "Chrome", "192.168.0.1", "192.168.0.2"]
types = ["user", "user", "user", "process", "address", "user", "process", "address", "address"]
expected = {
"Mark": "user1",
"Steve": "user2",
"Chrome": "process1",
"192.168.0.1": "address1",
"192.168.0.2": "address2"
}
def process_names_and_types(names, types):
    result = {}
    type_cache = defaultdict(int)
    for name, type_ in zip(names, types):
        if name not in result:
            type_cache[type_] += 1
            result[name] = f"{type_}{type_cache[type_]}"
    return result
if __name__ == "__main__":
    actual = process_names_and_types(names, types)
    assert actual == expected, f"Expected: {expected}, Actual: {actual}"

Parsing JSON output efficiently in Python?

The block of code below works, but I doubt it is optimal; given my limited understanding of JSON, I can't figure out a more efficient method.
The steam_game_db is like this:
{
"applist": {
"apps": [
{
"appid": 5,
"name": "Dedicated Server"
},
{
"appid": 7,
"name": "Steam Client"
},
{
"appid": 8,
"name": "winui2"
},
{
"appid": 10,
"name": "Counter-Strike"
}
]
}
}
and my Python code so far is
i = 0
x = 570
req_name_from_id = requests.get(steam_game_db)
j = req_name_from_id.json()
while j["applist"]["apps"][i]["appid"] != x:
    i += 1
returned_game = j["applist"]["apps"][i]["name"]
print(returned_game)
Instead of looping through the entire app list, is there a smarter way to search for it? Ideally the elements in the data structure with 'appid' and 'name' would be numbered the same as their corresponding 'appid',
i.e.
appid 570 in the list is Dota2.
However, element 570 in the data structure is appid 5069, Red Faction.
Also what type of data structure is this? Perhaps it has limited my searching ability for this answer already. (I.e. seems like a dictionary of 'appid' and 'element' to me for each element?)
EDIT: Changed to a for loop as suggested
# returned_id string for appid from another query
req_name_from_id = requests.get(steam_game_db)
j_2 = req_name_from_id.json()
for app in j_2["applist"]["apps"]:
    if app["appid"] == int(returned_id):
        returned_game = app["name"]
        print(returned_game)
The most convenient way to access things by a key (like the app ID here) is to use a dictionary.
You pay a little extra performance cost up-front to fill the dictionary, but after that pulling out values by ID is basically free.
However, it's a trade-off. If you only want to do a single look-up during the life-time of your Python program, then paying that extra performance cost to build the dictionary won't be beneficial, compared to a simple loop like you already did. But if you want to do multiple look-ups, it will be beneficial.
# build dictionary
app_by_id = {}
for app in j["applist"]["apps"]:
    app_by_id[app["appid"]] = app["name"]
# use it (the keys are integers, exactly as they appear in the JSON)
print(app_by_id[570])
Also think about caching the JSON file on disk. This will save time during your program's startup.
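A minimal sketch of such an on-disk cache could look like this. The helper name `load_cached`, its `fetch` parameter, and the cache path are hypothetical; `fetch` is assumed to be any callable that returns the parsed JSON.

```python
import json
import os


def load_cached(path, fetch):
    """Return parsed JSON from path, calling fetch() only when no cache file exists."""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    data = fetch()
    # Save the freshly fetched data so the next start-up can skip the download.
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(data, handle)
    return data
```

With the code from the question, the call might be `load_cached("apps.json", lambda: requests.get(steam_game_db).json())`. This sketch never refreshes the cache; deleting the file forces a new download.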
It's better to keep the JSON file on disk: you can load it straight into a dictionary and build your lookup table from it. As an example, I've tried to keep your logic while using a dict for lookups. Don't forget to specify the encoding when opening the file, since the JSON has special characters in it.
Setup:
import json
apps = {}
# open the file once, with an explicit encoding; the with block closes it again
with open('bigJson.json', encoding="utf-8") as handle:
    dictdump = json.load(handle)
for item in dictdump['applist']['apps']:
    apps.setdefault(item['appid'], item['name'])
Usage 1:
That's the way you have used it
for appid in range(0, 570):
    if appid in apps:
        print(appid, apps[appid].encode("utf-8"))
Usage 2: That's how you can query a key; using get instead of [] prevents a KeyError exception if the appid isn't recorded.
print(apps.get(570, 0))

Python: TypeError in referencing item in JSON feed

First, here is a sample JSON feed that I want to read in Python 2.7 with either simplejson or the built in JSON decoder. I am loading the .json file in Python and then searching for a key like "Apple" or "Orange" and when that key is found, I want to bring in the information for it like the types and quantities.
Right now there are only 3 items, but I want to be able to search a feed that may have up to 1000 items. Here is the feed:
{
"fruits": [
{
"Apple": [
{
"type": "Gala",
"quant": 5
},
{
"type": "Honeycrisp",
"quant": 10
},
{
"type": "Red Delicious",
"quant": 4
}
]
},
{
"Banana": [
{
"type": "Plantain",
"quant": 5
}
]
},
{
"Orange": [
{
"type": "Blood",
"quant": 3
},
{
"type": "Navel",
"quant": 20
}
]
}
]
}
My sample Python code is as follows:
import simplejson as json
# Open file
fjson = open('/home/teg/projects/test/fruits.json', 'rb')
f = json.loads(fjson.read())
fjson.close()
# Search for fruit
if 'Orange' in json.dumps(f):
    fruit = f['fruits']['Orange']
    print(fruit)
else:
    print('Orange does not exist')
But whenever I test it out, it gives me this error:
TypeError: list indices must be integers, not str
Was it wrong to have me do a json.dumps and instead should I have just checked the JSON feed as-is from the standard json.loads? I am getting this TypeError because I am not specifying the list index, but what if I don't know the index of that fruit?
Do I have to first search for a fruit and if it is there, get the index and then reference the index before the fruit like this?
fruit = f['fruits'][2]['Orange']
If so, how would I get the index of that fruit if it is found so I could then pull in the information? If you think the JSON is in the wrong format as well and is causing this issue, then I am up for that suggestion as well. I'm stuck on this and any help you guys have would be great. :-)
f is a dictionary, and f['fruits'] is a list of dictionaries, each holding a list of sub-dictionaries.
if 'Orange' in json.dumps(f): does not look at each item; it only checks whether the text Orange appears anywhere in the re-serialized JSON string.
The problem is that f['fruits'] is a list, so it expects an integer index (a position)
and not a dictionary key like ['Orange'].
I think you should check your structure like @kindall said; if you still want to extract Orange, this code will do the trick:
for value in f['fruits']:
    if 'Orange' in value:
        print(value['Orange'])
The problem is that the data structure has a list enclosing the dictionaries. If you have any control over the data source, that's the place to fix it. Otherwise, the best course is probably to post-process the data after parsing it to eliminate these extra list structures and merge the dictionaries in each list into a single dictionary. If you use an OrderedDict you can even retain the ordering of the items (which is probably why the list was used).
The square bracket in the line "fruits": [ should tell you that the item associated with fruits is (in Python parlance) a list rather than a dict and so cannot be indexed directly with a string like 'Orange'. It sounds like you want to create a dict of fruits instead. You could do this by reformatting the input.
Or, if the input format is fixed: each item in your fruits list currently has a very specific format. Each item is a dict with exactly one key, and those keys are not duplicated between items. If those rules can be relied upon, it's pretty easy to write a small search routine—or the following code will convert a list-of-dicts into a dict:
# list(...) keeps this working on Python 3, where items() returns a view
fruits = dict(sum([list(x.items()) for x in f['fruits']], []))
print(fruits['Orange'])
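The "small search routine" mentioned above could be sketched roughly as follows. It also returns the position of the fruit, since the question asks how to find the index. The function name `find_fruit` is an assumption, and the data is a shortened copy of the feed from the question.

```python
def find_fruit(fruit_list, name):
    """Return (index, varieties) for the first item keyed by name, or (None, None)."""
    for index, item in enumerate(fruit_list):
        if name in item:
            return index, item[name]
    return None, None


# Shortened copy of the feed from the question.
f = {
    "fruits": [
        {"Apple": [{"type": "Gala", "quant": 5}]},
        {"Banana": [{"type": "Plantain", "quant": 5}]},
        {"Orange": [{"type": "Blood", "quant": 3}, {"type": "Navel", "quant": 20}]},
    ]
}

index, varieties = find_fruit(f["fruits"], "Orange")
print(index)      # 2
print(varieties)
```

Because the function checks each item's keys directly, there is no need for the json.dumps test from the question.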

Create nested JSON from flat csv

Trying to create a 4 deep nested JSON from a csv based upon this example:
Region,Company,Department,Expense,Cost
Gondwanaland,Bobs Bits,Operations,nuts,332
Gondwanaland,Bobs Bits,Operations,bolts,254
Gondwanaland,Maureens Melons,Operations,nuts,123
At each level I would like to sum the costs and include it in the outputted JSON at the relevant level.
The structure of the outputted JSON should look something like this:
{
"id": "aUniqueIdentifier",
"name": "usually a nodes name",
"data": [
{
"key": "some key",
"value": "some value"
},
{
"key": "some other key",
"value": "some other value"
}
],
"children": [/* other nodes or empty */ ]
}
(REF: http://blog.thejit.org/2008/04/27/feeding-json-tree-structures-to-the-jit/)
Thinking along the lines of a recursive function in python but have not had much success with this approach so far... any suggestions for a quick and easy solution greatly appreciated?
UPDATE:
Gradually giving up on the idea of the summarised costs because I just can't figure it out :( (I'm not much of a python coder yet)! Simply being able to generate the formatted JSON would be good enough and I can plug in the numbers later if I have to.
Have been reading, googling and reading for a solution and on the way have learnt a lot but still no success in creating my nested JSON files from the above CSV structure. Must be a simple solution somewhere on the web? Maybe somebody else has had more luck with their search terms?
Here are some hints.
Parse the input to a list of lists with csv.reader:
>>> rows = list(csv.reader(source.splitlines()))
Loop over the list to build up your dictionary and summarize the costs. Depending on the structure you're looking to create, the build-up might look something like this:
>>> summary = {}
>>> for region, company, department, expense, cost in rows[1:]:
...     summary.setdefault(region, {}).setdefault(company, {}).setdefault(department, []).append((expense, cost))
Write the result out with json.dump:
>>> with open('dest.json', 'w') as dest:
...     json.dump(summary, dest)
Hopefully, the recursive function below will help get you started. It builds a tree from the input. Be aware of what type you want your leaves (the "cost" values) to be. You'll need to elaborate on the function to build up the exact structure you intend:
import csv, itertools, json
def cluster(rows):
    result = []
    # groupby only merges adjacent rows, so rows must already be sorted by key
    for key, group in itertools.groupby(rows, key=lambda r: r[0]):
        group_rows = [row[1:] for row in group]
        if len(group_rows[0]) == 2:
            result.append({key: dict(group_rows)})
        else:
            result.append({key: cluster(group_rows)})
    return result
if __name__ == '__main__':
    s = '''\
Gondwanaland,Bobs Bits,Operations,nuts,332
Gondwanaland,Bobs Bits,Operations,bolts,254
Gondwanaland,Maureens Melons,Operations,nuts,123
'''
    rows = list(csv.reader(s.splitlines()))
    r = cluster(rows)
    print(json.dumps(r, indent=4))
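For the summed costs the question originally asked for, one possible sketch builds nodes in the id / name / data / children shape shown in the question and stores the total cost of each subtree under "data". The function name `build_node`, the "cost" key, the "root" node, and the slash-separated id scheme are all assumptions made for illustration, not something the target format prescribes.

```python
import csv
import json


def build_node(name, rows, node_id):
    """Return one {id, name, data, children} node summarising rows.

    Each row is a list whose last element is the cost; the elements
    before it are the remaining grouping levels.
    """
    node = {
        "id": node_id,
        "name": name,
        "data": [{"key": "cost", "value": sum(float(row[-1]) for row in rows)}],
        "children": [],
    }
    if rows and len(rows[0]) > 1:
        # Group by the next level; dicts keep insertion order in Python 3.7+.
        groups = {}
        for row in rows:
            groups.setdefault(row[0], []).append(row[1:])
        for child_name, child_rows in groups.items():
            child_id = node_id + "/" + child_name  # hypothetical id scheme
            node["children"].append(build_node(child_name, child_rows, child_id))
    return node


source = """\
Region,Company,Department,Expense,Cost
Gondwanaland,Bobs Bits,Operations,nuts,332
Gondwanaland,Bobs Bits,Operations,bolts,254
Gondwanaland,Maureens Melons,Operations,nuts,123
"""
rows = list(csv.reader(source.splitlines()))[1:]  # skip the header row
tree = build_node("root", rows, "root")
print(json.dumps(tree, indent=4))
```

Unlike the groupby version, this grouping does not depend on the rows being sorted, at the price of holding each level's groups in a temporary dictionary.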
