This is a list of dictionaries.
It is just sample data; the real list has more items.
I want to get a dictionary from the list based on one of its values.
[{'status_id': '153080620724_10157915294545725', 'status_message': 'Beautiful evening in Wisconsin- THANK YOU for your incredible support tonight! Everyone get out on November 8th - and VOTE! LETS MAKE AMERICA GREAT AGAIN! -DJT', 'link_name': 'Timeline Photos', 'status_type': 'photo', 'status_link': 'https://www.facebook.com/DonaldTrump/photos/a.488852220724.393301.153080620724/10157915294545725/?type=3', 'status_published': '10/17/2016 20:56:51', 'num_reactions': '6813', 'num_comments': '543', 'num_shares': '359', 'num_likes': '6178', 'num_loves': '572', 'num_wows': '39', 'num_hahas': '17', 'num_sads': '0', 'num_angrys': '7'}
{'status_id': '153080620724_10157914483265725', 'status_message': "The State Department's quid pro quo scheme proves how CORRUPT our system is. Attempting to protect Crooked Hillary, NOT our American service members or national security information, is absolutely DISGRACEFUL. The American people deserve so much better. On November 8th, we will END this RIGGED system once and for all!", 'link_name': '', 'status_type': 'video', 'status_link': 'https://www.facebook.com/DonaldTrump/videos/10157914483265725/', 'status_published': '10/17/2016 18:00:41', 'num_reactions': '33768', 'num_comments': '3644', 'num_shares': '17653', 'num_likes': '26649', 'num_loves': '487', 'num_wows': '1155', 'num_hahas': '75', 'num_sads': '191', 'num_angrys': '5211'}
{'status_id': '153080620724_10157913199155725', 'status_message': "Crooked Hillary's State Department colluded with the FBI and the DOJ in a DISGRACEFUL quid pro quo exchange where her staff promised FBI agents more overseas positions if the FBI would alter emails that were classified. This is COLLUSION at its core and Crooked Hillary's super PAC, the media, is doing EVERYTHING they can to cover it up. It's a RIGGED system and we MUST not let her get away with this -- our country deserves better! Vote on Nov. 8 and let's take back the White House FOR the people and BY the people! #AmericaFirst! #RIGGED http://www.politico.com/story/2016/10/fbi-state-department-clinton-email-229880", 'link_name': '', 'status_type': 'video', 'status_link': 'https://www.facebook.com/DonaldTrump/videos/10157913199155725/', 'status_published': '10/17/2016 15:34:46', 'num_reactions': '85627', 'num_comments': '8810', 'num_shares': '32594', 'num_likes': '73519', 'num_loves': '2943', 'num_wows': '1020', 'num_hahas': '330', 'num_sads': '263', 'num_angrys': '7552'}
{'status_id': '153080620724_10157912962325725', 'status_message': 'JournoCash: Media gives $382,000 to Clinton, $14,000 Trump, 27-1 margin:', 'link_name': 'JournoCash: Media gives $382,000 to Clinton, $14,000 Trump, 27-1 margin', 'status_type': 'link', 'status_link': 'http://www.washingtonexaminer.com/journocash-media-gives-382000-to-clinton-14000-trump-27-1-margin/article/2604736', 'status_published': '10/17/2016 14:17:24', 'num_reactions': '22696', 'num_comments': '3665', 'num_shares': '5082', 'num_likes': '14029', 'num_loves': '122', 'num_wows': '2091', 'num_hahas': '241', 'num_sads': '286', 'num_angrys': '5927'}
]
I want to find the highest value of 'num_likes' and print the status_id of the dictionary that has it. I would also like to understand the method or process for implementing this. Currently I collect the values into a list and then find the maximum; is there any other way to do it?
The output should be just the status_id.
Here I'm declaring your list of dictionaries as the variable list_of_objs.
Since each num_likes value is a string, int(obj['num_likes']) converts it to an integer; passing those integers to max returns max_likes.
list_of_objs = [{..}, {..}, {..}]
max_likes = max(int(obj['num_likes']) for obj in list_of_objs if 'num_likes' in obj)
print(max_likes)
max_likes_objs = [obj for obj in list_of_objs if 'num_likes' in obj and int(obj['num_likes']) == max_likes]
print(max_likes_objs)
The last line prints the list of all the dictionaries that have the maximum num_likes value.
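You can try this (num_likes is stored as a string, so convert it with int before comparing; as strings, '999' would rank above '73519'):
k = max(int(i['num_likes']) for i in d)
[i['status_id'] for i in d if int(i['num_likes']) == k][0]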
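As a hedged alternative sketch: max accepts a key function, so the dictionary itself can be selected in a single pass without first building a list of values. The list posts below is a made-up, shortened stand-in for the real data (only status_id and num_likes are kept, and the ids are invented).

```python
# Hypothetical, shortened stand-in for the real list of status dictionaries.
posts = [
    {'status_id': 'a_1', 'num_likes': '6178'},
    {'status_id': 'a_2', 'num_likes': '26649'},
    {'status_id': 'a_3', 'num_likes': '73519'},
    {'status_id': 'a_4', 'num_likes': '14029'},
]

# Compare by the integer value of num_likes, not by the raw string.
top_post = max(posts, key=lambda post: int(post['num_likes']))
print(top_post['status_id'])  # a_3
```

If several posts tie for the maximum, max returns the first one it encounters.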
Using a simpler list as example:
l = [
{'likes': 5, 'id': 1},
{'likes': 2, 'id': 2},
{'likes': 7, 'id': 3},
{'likes': 1, 'id': 4},
]
result = list(filter(lambda item: item['likes'] == max([item['likes'] for item in l]), l))
print(result)
This will print [{'likes': 7, 'id': 3}]. The catch is that more than one item can share the maximum number of likes, which is why the result is a list. To print all of the IDs you can do:
print([item['id'] for item in result])
If you are sure there is only one such item, or you need exactly one anyway (say, the first), you can do:
result = list(filter(lambda item: item['likes'] == max([item['likes'] for item in l]), l))
result = result[0]['id']
print(result)
which will print 3 in the example.
Now how to approach this problem: first you need the maximum number of likes:
max([item['likes'] for item in l])
Call it maxLikes. Then you need to take all the items with this likes value:
filter(lambda item: item['likes'] == maxLikes, l)
This is a filter applied to the list l (the last argument), with a lambda function that can be read as "keep every item whose 'likes' value equals maxLikes".
Because filter returns a lazy iterator in Python 3, you then turn the result into a list with list().
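The steps above can be sketched as follows. Unlike the one-liner, this computes the maximum once instead of recomputing it inside the lambda for every item. The names max_likes and ids are illustrative choices, not part of the original answer.

```python
l = [
    {'likes': 5, 'id': 1},
    {'likes': 2, 'id': 2},
    {'likes': 7, 'id': 3},
    {'likes': 1, 'id': 4},
]

# Step 1: find the maximum number of likes once.
max_likes = max(item['likes'] for item in l)

# Step 2: keep every item whose likes equal that maximum.
result = list(filter(lambda item: item['likes'] == max_likes, l))

# Step 3: extract the IDs (a list, because ties are possible).
ids = [item['id'] for item in result]
print(ids)  # [3]
```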
Declare list_of_status_ids = [{}, {} ...]
Iterate over list_of_status_ids and build a dict whose keys are the num_likes values and whose values are lists of status_id.
Then take the max of the keys and get all the status_id values stored under it.
from collections import defaultdict
status_id_map = defaultdict(list)
for obj in list_of_status_ids:
    # Convert to int so the keys compare numerically, not as strings.
    status_id_map[int(obj['num_likes'])].append(obj['status_id'])
print(status_id_map[max(status_id_map)])
I am currently attempting to extract information from the following data set formatted as a list where each news article is its own dictionary:
news_data = [{'source': {'id': 'the-verge', 'name': 'The Verge'}, 'author': 'Emma Roth', 'title': "Judge rules Tesla can't hide behind arbitration in sexual harassment case - The Verge", 'description': 'A lawsuit accusing Tesla of fostering a work environment with “rampant” sexual harassment will continue in court after a judge blocked Tesla’s request for arbitration.', 'url': 'https://www.theverge.com/2022/5/24/23140051/judge-rules-tesla-hide-behind-arbitration-sexual-harassment-case-elon-musk', 'urlToImage': 'https://cdn.vox-cdn.com/thumbor/t3DT8qyznxCW4ahGTwGCSC4l56s=/0x146:2040x1214/fit-in/1200x630/cdn.vox-cdn.com/uploads/chorus_asset/file/10752835/acastro_180430_1777_tesla_0001.jpg', 'publishedAt': '2022-05-24T22:16:03Z', 'content': 'Tesla cant dismiss this case so easily\r\nIllustration by Alex Castro / The Verge\r\nA lawsuit that accuses Tesla of fostering a workplace with rampant sexual harassment will continue in court after a Ca… [+2274 chars]'}, {'source': {'id': 'bloomberg', 'name': 'Bloomberg'}, 'author': None, 'title': 'Elon Musk Drops Out of $200 Billion Club Again as Tesla Tumbles - Bloomberg', 'description': None, 'url': 'https://www.bloomberg.com/tosv2.html?vid=&uuid=38abbf6a-dbe7-11ec-9ad9-767145594c47&url=L25ld3MvYXJ0aWNsZXMvMjAyMi0wNS0yNC9lbG9uLW11c2stZHJvcHMtb3V0LW9mLTIwMC1iaWxsaW9uLWNsdWItYWdhaW4tYXMtdGVzbGEtdHVtYmxlcw==', 'urlToImage': None, 'publishedAt': '2022-05-24T21:11:29Z', 'content': "To continue, please click the box below to let us know you're not a robot."}]
Specifically, I want to extract only the keys 'title' and 'description', saving them into a list for use later.
I've attempted to do this with the following list comprehension:
news_info = [((k,v) for (k,v) in article if k in ['title', 'description']) for article in news_data]
However, if I print the result, I am simply informed:
[<generator object <listcomp>.<genexpr> at 0x102c27840>, <generator object <listcomp>.<genexpr> at 0x102c26dc0>]
Furthermore, if I attempt to access information (e.g. print(news_info[0]['title'])) there is a "TypeError: 'generator' object is not subscriptable".
I was wondering how I can go about printing and accessing/using the information that is saved in the list.
I'm not 100% sure of the intent, but from what you've tried it looks like you want the result this produces:
news_info = [{'title': article['title'], 'description': article['description']} for article in news_data]
print(news_info[0]['title'])
I'd recommend using a function to create the dictionary though.
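A minimal sketch of what such a function could look like. The name pick_fields and its fields parameter are hypothetical, and the sample news_data is a shortened stand-in for the real articles.

```python
def pick_fields(article, fields=('title', 'description')):
    """Return a new dict holding only the requested keys of one article."""
    # .get() yields None for a missing key instead of raising KeyError.
    return {field: article.get(field) for field in fields}


# Hypothetical, shortened stand-in for news_data.
news_data = [
    {'author': 'Emma Roth', 'title': 'First title', 'description': 'First description'},
    {'author': None, 'title': 'Second title', 'description': None},
]

news_info = [pick_fields(article) for article in news_data]
print(news_info[0]['title'])  # First title
```

Keeping the selection in one function means the list of wanted keys is stated in a single place.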
((k,v) for (k,v) in article if k in ['title', 'description']) is a generator expression, which is why you are getting generator objects.
Another mistake is that iterating over a dictionary yields only its keys, not (k, v) pairs; use article.items() for that.
Taking both into account, this is what it should look like (note that it produces one flat list of (key, value) tuples across all articles):
news_info = [(k,v) for article in news_data for (k,v) in article.items() if k in ['title', 'description']]
The outer () inside your comprehension creates a generator. If you expect a tuple to be created there, just put the word tuple in front of it:
news_info = [tuple((k,v) for (k,v) in article.items() ...
You also need to iterate over the article's items rather than over the article itself.
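Putting those two fixes together gives one tuple of (key, value) pairs per article. This is a hedged sketch: the sample news_data is a shortened stand-in, and the name wanted is illustrative. The last lines show that dict() turns such a tuple into a mapping if lookups by key are needed.

```python
# Hypothetical, shortened stand-in for news_data.
news_data = [
    {'author': 'Emma Roth', 'title': 'First title', 'description': 'First description'},
    {'author': None, 'title': 'Second title', 'description': None},
]
wanted = ('title', 'description')

# tuple(...) consumes the inner generator, so each element is a real tuple.
news_info = [
    tuple((k, v) for k, v in article.items() if k in wanted)
    for article in news_data
]
print(news_info[0])
# (('title', 'First title'), ('description', 'First description'))

# A tuple of pairs cannot be indexed by key; convert it when that is needed.
print(dict(news_info[0])['title'])  # First title
```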
I have a dictionary which looks like this and is already part of a constructor :
self.__books = {
1001: {'title': 'Introduction to Programming', 'author': 'Farell', 'copies': 2},
1002: {'title': 'Database Concepts', 'author': 'Morris', 'copies': 5},
1003: {'title': 'Object Oriented Analysis', 'author': 'Farell', 'copies': 4},
1004: {'title': 'Linux Operating System', 'author': 'Palmer', 'copies': 2},
1005: {'title': 'Data Science using Python', 'author': 'Russell', 'copies': 4},
1006: {'title': 'Functional Programming with Python', 'author': 'Babcock', 'copies': 6}
}
What I am trying to do is print the dictionary but in a string format
( for example -
1001 : title: Introduction to Programming, author: Farell, copies: 2 (\n)
1002 : .... ( Hope you get the idea))
As per the problem given, I have to create a property called books that gets the value of 'self.__books' so it can be displayed and referenced w/o changing the original value of 'self.__books' in the future, so that the original can be referenced, if needed ( I also have to practice adding and removing from the list hence the need for the property and not just an attribute - correct me if I misunderstood the idea please.) I have looked up a bunch of resources that explain how to iterate through a dictionary using the key,value method like so -
def books(self):
    for (key,value) in self.__books :
        return "{0} : {1}".format(key,value)
And all I am getting is an error message that says the compiler 'cannot unpack non-iterable int object'
I do understand that the 1001 is an int, hence the problem when the process starts, but I cannot figure out how to reference the value of the item at that index when the key is an integer ( the key was already given to differentiate the index values - I do not know what such a key would be called i.e, like a 'index key' or 'indexer' maybe?) Maybe the formatting of the original dictionary 'self.__books' stops it somehow ?
Any help would be appreciated and thank you in advance for taking the time to read this.
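Iterating over a dict yields only its keys, so look each value up by key. Also, a return inside the loop exits after the first entry, so collect the lines and join them at the end:
lines = []
for key in self.__books:
    lines.append("{0} : {1}".format(key, self.__books[key]))
return "\n".join(lines)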
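def books(self):
    lines = []
    for (key, value) in self.__books.items():
        # Strip the braces and quotes from the inner dict's string form.
        lines.append("{0} : {1}".format(key, str(value).replace('{','').replace('\'','').replace('}','')))
    return "\n".join(lines)
I think this will do the trick.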
Dicts have keys() and items() methods for iteration.
In your case each item inside the dict is another dict, so try something like this:
for k in self.__books.keys():
    ls = self.__books[k]
    ss = ""
    i = 0
    for j in ls.keys():
        if i > 0:
            ss = ss + ", "
        ss = ss + str(j) + ": " + str(ls[j])
        i += 1
    print("{0} : {1}".format(k, ss))
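For the property the question asks about, a hedged sketch might look like the following. The class name Library is hypothetical and only two of the books are included; the property builds a new string and never modifies self.__books.

```python
class Library:  # hypothetical class name
    def __init__(self):
        self.__books = {
            1001: {'title': 'Introduction to Programming', 'author': 'Farell', 'copies': 2},
            1002: {'title': 'Database Concepts', 'author': 'Morris', 'copies': 5},
        }

    @property
    def books(self):
        """Return one formatted line per book, leaving self.__books untouched."""
        lines = []
        for book_id, details in self.__books.items():
            # Join the inner dict as "key: value" pairs separated by commas.
            fields = ", ".join("{0}: {1}".format(k, v) for k, v in details.items())
            lines.append("{0} : {1}".format(book_id, fields))
        return "\n".join(lines)


library = Library()
print(library.books)
```

The first printed line is 1001 : title: Introduction to Programming, author: Farell, copies: 2, matching the format requested in the question.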
Say I have a list of dictionaries.
Each dict in the list has 3 elements:
name, id and status.
list_of_dicts = [{'id':1, 'name':'Alice', 'status':0},{'id':2, 'name':'Bob', 'status':0},{'id':3, 'name':'Robert', 'status':1}]
so I get:
In[20]: print list_of_dicts
Out[20]:
[{'id': 1, 'name': 'Alice', 'status': 0},
{'id': 2, 'name': 'Bob', 'status': 0},
{'id': 3, 'name': 'Robert', 'status': 1}]
If I receive a name, how can I get its status without iterating over the list?
E.g. I get 'Robert' and I want to output 1. Thank you.
For example, you can use pandas:
import pandas as pd
list_of_dicts = [{'id':1, 'name':'Alice', 'status':0},{'id':2, 'name':'Bob', 'status':0},{'id':3, 'name':'Robert', 'status':1}]
a = pd.DataFrame(list_of_dicts)
a.loc[a['name'] == 'Robert']
Then work with the DataFrame: it is fast because the core of pandas is implemented in C and Cython on top of NumPy, and it is easy to use (much like SQL queries).
As you found, you have to iterate (unless you are able to change your data structure to an enclosing dict), so why not just do it?
>>> [d['status'] for d in list_of_dicts if d['name']=='Robert']
[1]
Despite this, I recommend considering a map type (like dict) every time you see some 'id' field in a proposed data structure. If it's there you probably want to use it for general identification, instead of carrying dicts around. They can be used for relations also, and transfer easily into a relational database if you need it later.
I don't think you can do what you ask without iterating through the list.
Best case, someone will suggest a method that hides the iteration.
If what really concerns you is speed, you can break out of the loop as soon as you find the first valid result:
for iteration, elements in enumerate(list_of_dicts):
    if elements['name'] == "Robert":
        print("Elements id:", elements['id'])
        break
print("Iterations:", iteration)
# OUTPUT: Elements id: 3, Iterations: 2 (the zero-based index of the match)
Note that the number of iterations depends on where the match sits in the list, and if you have more than one "Robert", only the first one's id will be printed.
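A compact way to express the same early exit is next() over a generator expression, shown here as a hedged sketch. It still iterates internally, but stops at the first match; the None default and the name 'Zed' are illustrative choices.

```python
list_of_dicts = [
    {'id': 1, 'name': 'Alice', 'status': 0},
    {'id': 2, 'name': 'Bob', 'status': 0},
    {'id': 3, 'name': 'Robert', 'status': 1},
]

# The generator is consumed lazily: next() stops at the first match.
# The second argument (None) is returned when no name matches.
status = next((d['status'] for d in list_of_dicts if d['name'] == 'Robert'), None)
print(status)  # 1

missing = next((d['status'] for d in list_of_dicts if d['name'] == 'Zed'), None)
print(missing)  # None
```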
It's not possible to do this without iteration.
However, you can transform your list into a different data structure, such as a dictionary where names are the keys:
new_dict = {person["name"]: {k: v for k, v in person.items() if k != "name"} for person in list_of_dicts}
Then you can get the status like so:
new_dict["Robert"]["status"]
# 1
Additionally, as @tobias_k mentions in the comments, you can keep the internal dictionary the same:
{person["name"]: person for person in list_of_dicts}
The only issue with the above approaches is that they can't handle duplicate names. You can instead include the unique id in the key to differentiate between them:
new_dict = {(person["name"], person["id"]): person["status"] for person in list_of_dicts}
Which can be called like this:
new_dict["Robert", 3]
# 1
Even though it takes extra computation (only once) to create these data structures, the lookups afterwards are O(1) on average, instead of iterating over the list every time you want to search for a name.
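If the id is not known at lookup time, another hedged option is to group people by name, so that duplicate names map to a list. This sketch uses collections.defaultdict; the second 'Robert' entry and the name by_name are made up for illustration.

```python
from collections import defaultdict

list_of_dicts = [
    {'id': 1, 'name': 'Alice', 'status': 0},
    {'id': 2, 'name': 'Bob', 'status': 0},
    {'id': 3, 'name': 'Robert', 'status': 1},
    {'id': 4, 'name': 'Robert', 'status': 0},  # hypothetical duplicate name
]

# One pass over the list builds the index: name -> list of matching dicts.
by_name = defaultdict(list)
for person in list_of_dicts:
    by_name[person['name']].append(person)

# .get() avoids creating an empty entry for names that are not present.
statuses = [p['status'] for p in by_name.get('Robert', [])]
print(statuses)  # [1, 0]
```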
Your list_of_dicts cannot be searched without a loop, so for what you want the data should be reorganised slightly, as one dict holding several parallel lists (note that list.index still scans the list internally):
list_of_dicts_modified = {'name':['Alice', 'Bob', 'Robert'],'id':[1, 2, 3], 'status': [0, 0, 1]}
index = list_of_dicts_modified['name'].index(input().strip())
print('Name: {0} ID: {1} Status: {2}'.format(list_of_dicts_modified['name'][index], list_of_dicts_modified['id'][index], list_of_dicts_modified['status'][index]))
Output:
C:\Users\Documents>py test.py
Alice
Name: Alice ID: 1 Status: 0
I have a dictionary of dictionaries called data_dict. This is how it looks:
{'UMANOFF ADAM S': {'total_stock_value': 'NaN', 'loans': 'NaN', 'salary': 288589},
'YEAP SOON': {'total_stock_value': 192758, 'loans': 'NaN', 'salary': 'NaN'},
'PIPER GREGORY F': {'total_stock_value': 880290, 'loans': 1452356, 'salary': 19791},
'Jack S': {'total_stock_value': 88000, 'loans': 'NaN', 'salary': 288589}
}
Basically it is of the format
{Person Name : Dictionary of that person's attributes}
I am trying to find the names of the people whose salary is a certain X.
Specifically, in the above example, let's say I am trying to find the people whose salary is 288589. I expect all the names whose salary is 288589.
I have written the following generalised function, which takes a search key and value and returns the names of the people for which that key/value pair holds.
def search_person_by_attribute(attribute, value):
    person_names = []
    for person, attributes_dict in data_dict.items():
        if attributes_dict[attribute] == value:
            person_names.append(person)
    return person_names
This method runs successfully
results = search_person_by_attribute("salary", 288589)
print(results)
and prints
['UMANOFF ADAM S','Jack S']
But somehow I feel this is quite a long way to write it. Is there a better/shorter/more Pythonic way to do it?
If you can also mention the efficiency (in terms of time complexity) of my solution as well as yours, that would be a great bonus.
I would suggest something like this, which I think is not just shorter, but more readable than your version:
def search_person_by_attribute(d, attribute, value):
    return [name for name in d if d[name][attribute] == value]
It works exactly like yours, but requires the dictionary as an additional parameter, because I think that's better style:
>>> search_person_by_attribute(d, "salary", 288589)
['UMANOFF ADAM S', 'Jack S']
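On the efficiency question: both versions visit every person once, so each call is O(n) in the number of people. If the same attribute is searched many times, one hedged alternative is to build a reverse index once (also O(n)) and then answer each lookup in O(1) on average. The helper name build_index is hypothetical.

```python
from collections import defaultdict

data_dict = {
    'UMANOFF ADAM S': {'total_stock_value': 'NaN', 'loans': 'NaN', 'salary': 288589},
    'YEAP SOON': {'total_stock_value': 192758, 'loans': 'NaN', 'salary': 'NaN'},
    'PIPER GREGORY F': {'total_stock_value': 880290, 'loans': 1452356, 'salary': 19791},
    'Jack S': {'total_stock_value': 88000, 'loans': 'NaN', 'salary': 288589},
}


def build_index(d, attribute):
    """Map each value of `attribute` to the list of names that have it."""
    index = defaultdict(list)
    for name, attributes in d.items():
        index[attributes[attribute]].append(name)
    return index


salary_index = build_index(data_dict, 'salary')
print(salary_index.get(288589, []))  # ['UMANOFF ADAM S', 'Jack S']
```

The index must be rebuilt (or updated) whenever data_dict changes, so it only pays off for repeated lookups.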
Hoping someone can help me out. I've spent the past couple hours trying to solve this, and fair warning, I'm still fairly new to python.
This is a repost of a question I recently deleted. I've misinterpreted my code in the last example. The correct example is:
I have a list of dictionaries that looks similar to:
dic = [
    {
        'name': 'john',
        'items': ['pants_1', 'shirt_2', 'socks_3']
    },
    {
        'name': 'bob',
        'items': ['jacket_1', 'hat_1']
    }
]
I'm using .append for both 'name', and 'items', which adds the dic values into two new lists:
for x in dic:
    dic_name.append(dic['name'])
    dic_items.append(dic['items'])
I need to split the item value using '_' as the delimiter, so I've also split the values by doing:
name, items = ([i if i is None else i.split('_')[0] for i in dic_name],
               [i if i is None else i.split('_')[0] for i in chain(*dic_items)])
None is used in case there is no value. This provides me with a new list for name, items, with the delimiter used. Disregard the fact that I used '_' split for names in this example.
When I use this, the indexes for name and items no longer match. Do I need to put the listed items in an array to match the name index, and if so, how?
Ideally, I want name[0] (which is john), to also match items[0] (as an array of the items in the list, so pants, shirt, socks). This way when I refer to index 0 for name, it also grabs all the values for items as index 0. The same thing regarding the index used for bob [1], which should match his items with the same index.
@avinash-raj, thanks for your patience, as I've had to update my question to reflect the code I'm working with more closely.
I'm reading a little bit between the lines but are you trying to just collapse the list and get rid of the field names, e.g.:
>>> dic = [{'name': 'john', 'items':['pants_1','shirt_2','socks_3']},
{'name': 'bob', 'items':['jacket_1','hat_1']}]
>>> data = {d['name']: dict(i.split('_') for i in d['items']) for d in dic}
>>> data
{'bob': {'hat': '1', 'jacket': '1'},
'john': {'pants': '1', 'shirt': '2', 'socks': '3'}}
Now the data is directly related, rather than indirectly related via a common index into 2 lists. If you want the dictionary split out, you can always do:
>>> dic_name, dic_items = zip(*data.items())
>>> dic_name
('bob', 'john')
>>> dic_items
({'hat': '1', 'jacket': '1'}, {'pants': '1', 'shirt': '2', 'socks': '3'})
You need a list of dictionaries because the duplicate keys name and items are overwritten:
items = [[i.split('_')[0] for i in d['items']] for d in your_list]
names = [d['name'] for d in your_list] # then grab names from list
Alternatively, you can do this in one line with the built-in zip function and a generator expression, like so:
names, items = zip(*((i['name'], [j.split('_')[0] for j in i['items']]) for i in dic))
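To check that the indexes line up as the question wants, here is a small sketch that runs that one-liner on the question's data (with the missing quotes around 'items' restored).

```python
dic = [
    {'name': 'john', 'items': ['pants_1', 'shirt_2', 'socks_3']},
    {'name': 'bob', 'items': ['jacket_1', 'hat_1']},
]

# Build (name, [item, ...]) pairs, then transpose them with zip(*...).
names, items = zip(*((i['name'], [j.split('_')[0] for j in i['items']]) for i in dic))

print(names[0], items[0])  # john ['pants', 'shirt', 'socks']
print(names[1], items[1])  # bob ['jacket', 'hat']
```

Because each person's items stay together in their own list, names[n] and items[n] always refer to the same person.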
From Looping Techniques in the Tutorial:
for name, item_list in dic.items():
    names.append(name)
    items.append(item_list)
That will work if your dict is structured as
{'name': [item1]}
In the loop body of
for x in dic:
    dic_name.append(dic['name'])
    dic_items.append(dic['items'])
you'll probably want to access x (the loop variable, to which each item of dic is assigned in turn) rather than dic.
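A minimal sketch of the corrected loop, assuming dic is the list of dictionaries from the question and that dic_name and dic_items start out as empty lists:

```python
dic = [
    {'name': 'john', 'items': ['pants_1', 'shirt_2', 'socks_3']},
    {'name': 'bob', 'items': ['jacket_1', 'hat_1']},
]

dic_name = []
dic_items = []
for x in dic:
    # x is one dictionary from the list; dic itself is the whole list.
    dic_name.append(x['name'])
    dic_items.append(x['items'])

print(dic_name)      # ['john', 'bob']
print(dic_items[1])  # ['jacket_1', 'hat_1']
```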