Extract out nested keys into a list [duplicate] - python

I have a dictionary raw_data['list'] whose values are structured like so:
for k, v in sorted(raw_data['list'].items()):
    print(k, v)
    break
1001473688 {'resolved_id': '1001473688', 'item_id': '1001473688', 'word_count': '149', 'excerpt': '“The fraudulence paradox was that the more time and effort you put into trying to appear impressive or attractive to other people, the less impressive or attractive you felt inside — you were a fraud.', 'time_favorited': '0', 'favorite': '0', 'given_url': 'http://www.goodreads.com/quotes/564841-the-fraudulence-paradox-was-that-the-more-time-and-effort', 'is_index': '0', 'status': '0', 'sort_id': 3795, 'authors': {'3445796': {'author_id': '3445796', 'item_id': '1001473688', 'name': 'David Foster Wallace', 'url': 'http://www.goodreads.com/author/show/4339.David_Foster_Wallace'}}, 'time_read': '0', 'has_image': '0', 'has_video': '0', 'given_title': 'Quote by David Foster Wallace: “The fraudulence paradox was that the more t', 'resolved_title': '“The fraudulence paradox was that the more time and effort you put into trying to appear impressive or attractive to other people, the less impressive or attractive you felt inside — you were a fraud. And the more of a fraud you felt like, the harder you tried to convey an impressive or likable image of yourself so that other people wouldn’t find out what a hollow, fraudulent person you really were. Logically, you would think that the moment a supposedly intelligent nineteen-year-old became aware of this paradox, he’d stop being a fraud and just settle for being himself (whatever that was) because he’d figured out that being a fraud was a vicious infinite regress that ultimately resulted in being frightened, lonely, alienated, etc. But here was the other, higher-order paradox, which didn’t even have a form or name — I didn’t, I couldn’t.”', 'resolved_url': 'http://www.goodreads.com/quotes/564841-the-fraudulence-paradox-was-that-the-more-time-and-effort', 'time_added': '1438693251', 'time_updated': '1439849583', 'is_article': '1'}
Some of the values within the raw_data['list'] dictionary have a 'tags' key like so:
{'excerpt': '',
'favorite': '0',
'given_title': 'carlcheo.com/wp-content/uploads/2014/12/which-programming-language-should-i',
'given_url': 'http://carlcheo.com/wp-content/uploads/2014/12/which-programming-language-should-i-learn-first-pdf.pdf',
'has_image': '0',
'has_video': '0',
'is_article': '0',
'is_index': '0',
'item_id': '999554490',
'resolved_id': '999554490',
'resolved_title': '',
'resolved_url': 'http://carlcheo.com/wp-content/uploads/2014/12/which-programming-language-should-i-learn-first-pdf.pdf',
'sort_id': 3026,
'status': '0',
'tags': {'programming': {'item_id': '999554490', 'tag': 'programming'}},
'time_added': '1454096378',
'time_favorited': '0',
'time_read': '0',
'time_updated': '1454096385',
'word_count': '0'}
I need to extract all the keys of the 'tags' values (that is, the tag names) into a list. I don't have much experience with nested dictionaries and am struggling to figure out how I should write a nested for loop (if that is the most elegant way to solve this). Please let me know your thoughts. Thanks!

The following nested comprehension should work:
tags = [tag for v in raw_data['list'].values() for tag in v.get('tags', {})]
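A minimal sketch of how that comprehension behaves, using a made-up two-item stand-in for raw_data (the item IDs and the second tag name are hypothetical, not taken from the real data). Items without a 'tags' key contribute nothing because v.get('tags', {}) falls back to an empty dict:

```python
# Hypothetical stand-in for the real raw_data; only the shape matters.
raw_data = {
    'list': {
        '1': {'item_id': '1', 'word_count': '149'},  # no 'tags' key
        '2': {'item_id': '2',
              'tags': {'programming': {'item_id': '2', 'tag': 'programming'},
                       'python': {'item_id': '2', 'tag': 'python'}}},
    }
}

# Iterating over a dict yields its keys, so the inner loop collects tag names.
tags = [tag for v in raw_data['list'].values() for tag in v.get('tags', {})]
print(tags)

# If the same tag can appear on several items, a set removes the duplicates.
unique_tags = sorted(set(tags))
print(unique_tags)
```

If the same tag is attached to several items, the plain list will contain it once per item, which is why the set step may be useful.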

Related

How to print a dictionary based on a value

This is a list of dictionaries.
It is sample data; the real list has more items.
I want to get one of the dictionaries based on one of its values.
[{'status_id': '153080620724_10157915294545725', 'status_message': 'Beautiful evening in Wisconsin- THANK YOU for your incredible support tonight! Everyone get out on November 8th - and VOTE! LETS MAKE AMERICA GREAT AGAIN! -DJT', 'link_name': 'Timeline Photos', 'status_type': 'photo', 'status_link': 'https://www.facebook.com/DonaldTrump/photos/a.488852220724.393301.153080620724/10157915294545725/?type=3', 'status_published': '10/17/2016 20:56:51', 'num_reactions': '6813', 'num_comments': '543', 'num_shares': '359', 'num_likes': '6178', 'num_loves': '572', 'num_wows': '39', 'num_hahas': '17', 'num_sads': '0', 'num_angrys': '7'},
{'status_id': '153080620724_10157914483265725', 'status_message': "The State Department's quid pro quo scheme proves how CORRUPT our system is. Attempting to protect Crooked Hillary, NOT our American service members or national security information, is absolutely DISGRACEFUL. The American people deserve so much better. On November 8th, we will END this RIGGED system once and for all!", 'link_name': '', 'status_type': 'video', 'status_link': 'https://www.facebook.com/DonaldTrump/videos/10157914483265725/', 'status_published': '10/17/2016 18:00:41', 'num_reactions': '33768', 'num_comments': '3644', 'num_shares': '17653', 'num_likes': '26649', 'num_loves': '487', 'num_wows': '1155', 'num_hahas': '75', 'num_sads': '191', 'num_angrys': '5211'},
{'status_id': '153080620724_10157913199155725', 'status_message': "Crooked Hillary's State Department colluded with the FBI and the DOJ in a DISGRACEFUL quid pro quo exchange where her staff promised FBI agents more overseas positions if the FBI would alter emails that were classified. This is COLLUSION at its core and Crooked Hillary's super PAC, the media, is doing EVERYTHING they can to cover it up. It's a RIGGED system and we MUST not let her get away with this -- our country deserves better! Vote on Nov. 8 and let's take back the White House FOR the people and BY the people! #AmericaFirst! #RIGGED http://www.politico.com/story/2016/10/fbi-state-department-clinton-email-229880", 'link_name': '', 'status_type': 'video', 'status_link': 'https://www.facebook.com/DonaldTrump/videos/10157913199155725/', 'status_published': '10/17/2016 15:34:46', 'num_reactions': '85627', 'num_comments': '8810', 'num_shares': '32594', 'num_likes': '73519', 'num_loves': '2943', 'num_wows': '1020', 'num_hahas': '330', 'num_sads': '263', 'num_angrys': '7552'},
{'status_id': '153080620724_10157912962325725', 'status_message': 'JournoCash: Media gives $382,000 to Clinton, $14,000 Trump, 27-1 margin:', 'link_name': 'JournoCash: Media gives $382,000 to Clinton, $14,000 Trump, 27-1 margin', 'status_type': 'link', 'status_link': 'http://www.washingtonexaminer.com/journocash-media-gives-382000-to-clinton-14000-trump-27-1-margin/article/2604736', 'status_published': '10/17/2016 14:17:24', 'num_reactions': '22696', 'num_comments': '3665', 'num_shares': '5082', 'num_likes': '14029', 'num_loves': '122', 'num_wows': '2091', 'num_hahas': '241', 'num_sads': '286', 'num_angrys': '5927'}
]
I want to find the dictionary with the highest 'num_likes' and print its status_id. I also want to understand the method or process for implementing this. Currently I collect the values into a list and then find the maximum; is there any other way to do it?
The output should be just the status_id.
Here I'm declaring your list of dictionaries as the variable list_of_objs.
Since the num_likes value is a string, use int(obj['num_likes']) to convert it to an integer; passing those integers to max returns max_likes.
list_of_objs = [{..}, {..}, {..}]
max_likes = max([int(obj['num_likes']) for obj in list_of_objs if 'num_likes' in obj.keys()])
print(max_likes)
max_likes_objs = [obj for obj in list_of_objs if 'num_likes' in obj and int(obj['num_likes']) == max_likes]
print(max_likes_objs)
The last line prints a list of all the dictionaries that have the maximum value of num_likes.
You can try this:
# num_likes is stored as a string, so convert to int before comparing
k = max(int(i['num_likes']) for i in d)
[i['status_id'] for i in d if int(i['num_likes']) == k][0]
Using a simpler list as example:
l = [
{'likes': 5, 'id': 1},
{'likes': 2, 'id': 2},
{'likes': 7, 'id': 3},
{'likes': 1, 'id': 4},
]
result = list(filter(lambda item: item['likes'] == max([item['likes'] for item in l]), l))
print(result)
This will print [{'likes': 7, 'id': 3}]. Note that more than one item can share the maximum number of likes, which is why the result is a list. To print all of the IDs you can do:
print([item['id'] for item in result])
If you are sure that there is no more than one such item, or you need exactly one anyway (say, the first), you can do:
result = list(filter(lambda item: item['likes'] == max([item['likes'] for item in l]), l))
result = result[0]['id']
print(result)
which will print 3 in the example.
Now how to approach this problem: first you need the maximum number of likes:
max([item['likes'] for item in l])
Call it maxLikes. Then you need to take all the items with this likes value:
filter(lambda item: item['likes'] == maxLikes, l)
This is a filter applied to the list l (the last argument on the right), with a lambda function that can be read as "all items whose 'likes' property equals maxLikes".
Then you turn the result into a list with list().
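The two steps described above can be sketched as follows, computing the maximum once rather than inside the lambda. The names max_likes, top_items, top_ids and best are chosen here for illustration. max() with a key function is shown as an alternative when a single item is enough; it returns the first item with the highest value:

```python
l = [
    {'likes': 5, 'id': 1},
    {'likes': 2, 'id': 2},
    {'likes': 7, 'id': 3},
    {'likes': 1, 'id': 4},
]

# Step 1: find the maximum number of likes once.
max_likes = max(item['likes'] for item in l)

# Step 2: keep every item that reaches that maximum.
top_items = [item for item in l if item['likes'] == max_likes]
top_ids = [item['id'] for item in top_items]
print(top_ids)

# Alternative: max() with a key function returns one item directly.
best = max(l, key=lambda item: item['likes'])
print(best['id'])
```

For the data in the question, where num_likes is stored as a string, the key would presumably need a conversion, for example key=lambda item: int(item['num_likes']).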
Declare list_of_status_ids = [{}, {} ...]
Iterate over list_of_status_ids and build a dict whose keys are num_likes (as integers) and whose values are lists of status_id.
Then take the max of num_likes and get all the status_id values stored under it.
from collections import defaultdict
status_id_map = defaultdict(list)
for obj in list_of_status_ids:
    # Convert to int so that max() compares numbers, not strings
    status_id_map[int(obj['num_likes'])].append(obj['status_id'])
print(status_id_map[max(status_id_map)])

Finding the Corresponding Value of a Key in Nested Dictionary

I have a nested dictionary, and I am trying to print the value that corresponds to a nested key without having to supply the outermost numeric key.
For example: if a room name is found in one of the inner dictionaries, print that room's area.
and my dictionary is set up like below:
d = {0: {'RoomName': 'PSC', 'MinArea': '28', 'MinRoomDim': 'null', 'MinDoorWidth': '900', 'MinDoorHeight': '2100', 'NoofDoorLeaves': '1', 'DoorMaterial': 'Glass', 'ReferenceLocation': 'LTA ADC SECTION 3.1 CLAUSE 5.1', 'RoomSpecificInfo': 'Refer to PSC design guidelines'},
1: {'RoomName': 'SMR', 'MinArea': '8', 'MinRoomDim': 'null', 'MinDoorWidth': '900', 'MinDoorHeight': '2100', 'NoofDoorLeaves': '1', 'DoorMaterial': 'Glass', 'ReferenceLocation': 'LTA ADC SECTION 3.1 CLAUSE 5.2', 'RoomSpecificInfo': 'null'},
2: {'RoomName': 'FIRST AID RM', 'MinArea': '7.5', 'MinRoomDim': '3.0m x 2.5m', 'MinDoorWidth': '1000', 'MinDoorHeight': '2100', 'NoofDoorLeaves': '1', 'DoorMaterial': 'null', 'ReferenceLocation': 'LTA ADC SECTION 3.1 CLAUSE 5.3', 'RoomSpecificInfo': 'null'},...
So far all the solutions I have managed to find are for flat dictionaries, and they do not work for nested dictionaries like the one above. Any help would be appreciated.
You can print nested dictionaries like this:
for i, j in d.items():
    for m, n in j.items():
        print(m, n)
where m is the key and n is the value of the nested dictionary.
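To look up a room by name without knowing the outer numeric key, one option is to scan the inner dictionaries. A minimal sketch, assuming a trimmed-down copy of d from the question; the helper name find_room is hypothetical:

```python
# Trimmed-down copy of the question's dictionary (only two fields kept).
d = {
    0: {'RoomName': 'PSC', 'MinArea': '28'},
    1: {'RoomName': 'SMR', 'MinArea': '8'},
    2: {'RoomName': 'FIRST AID RM', 'MinArea': '7.5'},
}

def find_room(rooms, name):
    """Return the first inner dict whose 'RoomName' equals name, or None."""
    for room in rooms.values():
        if room.get('RoomName') == name:
            return room
    return None

room = find_room(d, 'SMR')
if room is not None:
    print(room['MinArea'])
```

If many lookups are needed, building an index once, such as {room['RoomName']: room for room in d.values()}, avoids rescanning the dictionary each time.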
If the data comes from a JSON file, parse it with json.loads and then chain get() calls to reach the required value in the nested data. For example, to get RoomName under key "1" (JSON object keys are always strings), where x is the JSON text:
json.loads(x).get("1", {}).get("RoomName")

Replace sublist with another sublist - python [closed]

I have a list:
Online = [['Robot1', '23.9', 'None', '0'], ['Robot2', '25.9', 'None', '0']]
and I want to replace a sublist whenever I receive different values for it:
NewSublist1 = ['Robot1', '30.9', 'Sending', '440']
NewSublist2 = ['Robot2', '50']
And I want:
Online = [['Robot1', '30.9', 'Sending', '440'], ['Robot2', '50']]
The number of elements in a sublist can change. The only thing that stays the same is the robot ID. So I want to search for the robot ID in the Online list and, if it is there, replace that sublist with the new one.
You can create a dictionary mapping the robot IDs in your new sublists to the actual new sublists, then look up the existing robot IDs in that dict and replace accordingly.
>>> Online = [['Robot1', '23.9', 'None', '0'], ['Robot3', 'has no replacement'], ['Robot2', '25.9', 'None', '0']]
>>> NewSublists = [['Robot1', '30.9', 'Sending', '440'], ['Robot2', '50'], ['Robot4', 'new entry']]
>>> newsub_dict = {sub[0]: sub for sub in NewSublists}
>>> [newsub_dict.get(sub[0], sub) for sub in Online]
[['Robot1', '30.9', 'Sending', '440'],
['Robot3', 'has no replacement'],
['Robot2', '50']]
Building the dict takes O(k), k being the number of new sublists, and the list comprehension then loops over each element of Online once, which is O(n), n being the number of elements in the Online list. If, instead, you make Online also a dictionary mapping robot IDs to sublists, the replacement itself comes down to O(k).
If you also want to add elements from NewSublists to Online when they are not yet present, you should definitely convert Online to a dict as well; then you can simply update the dict and get the values. If order matters, make sure to use a collections.OrderedDict or Python 3.7+, where regular dicts preserve insertion order.
>>> online_dict = {sub[0]: sub for sub in Online}
>>> online_dict.update(newsub_dict)
>>> list(online_dict.values())
[['Robot1', '30.9', 'Sending', '440'],
['Robot3', 'has no replacement'],
['Robot2', '50'],
['Robot4', 'new entry']]
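The dict-based approach can be wrapped in a small function. This is only a sketch: the name merge_sublists is made up, and it assumes every sublist starts with a unique robot ID and that Python 3.7+ is used so insertion order is kept:

```python
def merge_sublists(online, new_sublists):
    """Replace sublists in online by robot ID and append unseen IDs."""
    # Key each sublist by its first element (the robot ID).
    merged = {sub[0]: sub for sub in online}
    # update() overwrites existing IDs and adds new ones at the end.
    merged.update({sub[0]: sub for sub in new_sublists})
    return list(merged.values())

online = [['Robot1', '23.9', 'None', '0'], ['Robot2', '25.9', 'None', '0']]
new = [['Robot1', '30.9', 'Sending', '440'], ['Robot2', '50']]
result = merge_sublists(online, new)
print(result)
```

With the lists from the question this yields the two replaced sublists in their original positions.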

trying to separate two sets of information from csv [duplicate]

I've taken some draft code and adapted it to suit my program, but I don't know how to split the data. The question might seem a bit vague, so I'll try to help by showing what I want.
My code:
import csv
with open("scores.csv") as csv_data:
    reader = csv.reader(csv_data, delimiter=",")
    number_sorted = sorted(reader, key=lambda x: int(x[0]), reverse=True)
    print(number_sorted)
I get the output:
[['25356767', 'tom'], ['443388', 'jin'], ['6744', 'trev'], ['4666', 'ryan'],
['2445', 'jones'], ['536', 'sue'], ['34', 'bob'], ['8', 'hera'], ['1',
'bill'], ['0', 'v']]
but I want the print to look like this so it looks like a leader board:
[['25356767', 'tom'],
['443388', 'jin'],
['6744', 'trev'],
['4666', 'ryan'],
['2445', 'jones'],
['536', 'sue'],
['34', 'bob'],
['8', 'hera'],
['1', 'bill'],
['0', 'v']]
I hope that explains my question.
Can't do that, sorry.
What you're asking to do is, essentially, altering the way the console outputs lists. Simpler (and, probably, prettier) would be to define a function to pretty print the lists.
If you did something like:
def pretty(inlist):
    for score, name in inlist:
        # Right-align the name in a 10-character column, then show the score
        print(name.rjust(10) + "| " + score)
Then it would print every line in a pretty format.
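The standard library's pprint module can also put one sublist on each line when the whole list does not fit within the given width. A small sketch with a shortened copy of the scores; width=40 is an arbitrary choice for this example:

```python
from pprint import pformat

number_sorted = [['25356767', 'tom'], ['443388', 'jin'],
                 ['6744', 'trev'], ['4666', 'ryan']]

# pformat returns the text that pprint would print; the whole list is wider
# than 40 characters, so each sublist lands on its own line.
text = pformat(number_sorted, width=40)
print(text)
```

Calling pprint.pprint(number_sorted, width=40) prints the same text directly instead of returning it.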

Comparing values in dictionaries python

I have 2 nested dictionaries in Python that have this format:
1166869: {'probL2': '0.000', 'probL1': '0.000', 'pronNDiff_site': '1.000', 'StateBin': '0', 'chr': 'chrX', 'rangehist': '59254000-59255000', 'start_bin': '59254000', 'countL2': '4', 'countL1': '0'}
1166870: {'probL2': '0.148', 'probL1': '0.000', 'pronNDiff_site': '0.851', 'StateBin': '0', 'chr': 'chr2', 'rangehist': '59254000-59255000', 'start_bin': '59255000', 'countL2': '5', 'countL1': '15'}
1166871: {'probL2': '0.000', 'probL1': '0.000', 'pronNDiff_site': '1.000', 'StateBin': '0', 'chr': 'chrY', 'rangehist': '59290000-59291000', 'start_bin': '59290000', 'countL2': '1', 'countL1': '2'}
where 1166869, 1166870 and 1166871 represent a line in a file from where I read the data, and the rest of the keys are the data itself.
Now I want to make a list where I store all the different values in the key "chr" because there are some repeated ones.
How can I go through the dictionary and make the comparison between the 2 values? This code is not working:
for k in range(len(file_dict)):
    for j in range(len(file_dict)-1):
        if (file_dict[j]["chr"] != file_dict[k]["chr"]):
            list_chr.append(file_dict[j]["chr"])
Use a set, and just add all the items in one go:
chr = { v['chr'] for v in file_dict.itervalues() }
This uses a set comprehension to generate your set in one line of code.
Set comprehensions were introduced in Python 2.7; in earlier versions use:
chr = set(v['chr'] for v in file_dict.itervalues())
In Python 3, you'd need to replace .itervalues() by .values().
Your own code doesn't work because python dictionaries are not lists; you don't retrieve values by index, but by key. You'd have to change it to:
for key in file_dict:
    for other_key in file_dict:
        if key == other_key:
            continue
        if file_dict[key]['chr'] != file_dict[other_key]['chr']:
            list_chr.append(file_dict[key]['chr'])
but that is really inefficient, not to mention incorrect.
How about something along the lines of:
list_chr = list(set([val['chr'] for val in file_dict.values()]))
How does this work?
First, a list comprehension gets all the chr entries from the inner dicts.
These are then converted to a set, so that there are no duplicate entries.
The set is then converted to a list, if that's the format you prefer.
Please note that you may really want to keep the set: a membership test on a set takes O(1) on average instead of O(n) for a list.
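A set does not keep the order in which the values first appeared. If that order matters, dict.fromkeys is one way to drop duplicates while keeping it (on Python 3.7+). A sketch with a shortened version of file_dict; the fourth entry is made up here to show a repeated value:

```python
file_dict = {
    1166869: {'chr': 'chrX', 'countL2': '4'},
    1166870: {'chr': 'chr2', 'countL2': '5'},
    1166871: {'chr': 'chrY', 'countL2': '1'},
    1166872: {'chr': 'chrX', 'countL2': '3'},  # made-up repeat
}

# Unordered distinct values.
chr_set = {v['chr'] for v in file_dict.values()}

# Distinct values in first-seen order: dict keys are unique and ordered.
list_chr = list(dict.fromkeys(v['chr'] for v in file_dict.values()))
print(list_chr)
```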
