List of objects with a unique attribute - python

I have a list of objects that each have a specific attribute. That attribute is not unique, and I would like to end up with a subset of the list in which every value of that attribute appears exactly once.
For example, if I have four objects:
object1.thing = 1
object2.thing = 2
object3.thing = 3
object4.thing = 2
I would want to end up with either
[object1, object2, object3]
or
[object1, object3, object4]
The exact objects that wind up in the final list are not important, only that the values of that attribute are unique across the list.
EDIT: To clarify, essentially what I want is a set that is keyed off of that specific attribute.

You can use a list comprehension and set:
objects = (object1,object2,object3,object4)
seen = set()
unique = [obj for obj in objects if obj.thing not in seen and not seen.add(obj.thing)]
The above code is equivalent to:
seen = set()
unique = []
for obj in objects:
    if obj.thing not in seen:
        unique.append(obj)
        seen.add(obj.thing)
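As a minimal runnable sketch of the loop above: the `Item` class and its `name` field are hypothetical stand-ins for the objects in the question, added only so the result can be inspected.

```python
class Item:
    """Hypothetical stand-in for the objects in the question."""
    def __init__(self, name, thing):
        self.name = name
        self.thing = thing

objects = [Item("object1", 1), Item("object2", 2),
           Item("object3", 3), Item("object4", 2)]

seen = set()
unique = []
for obj in objects:
    if obj.thing not in seen:      # first time this attribute value appears
        unique.append(obj)
        seen.add(obj.thing)

unique_names = [obj.name for obj in unique]
print(unique_names)  # ['object1', 'object2', 'object3']
```

This version keeps the first object seen for each value and preserves the original order.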

You could create a dict whose keys are each object's thing and whose values are the objects themselves.
d = {}
for obj in object_list:
    d[obj.thing] = obj
# In Python 3, d.values() is a view; wrap it in list() to get a real list.
desired_list = list(d.values())
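One detail worth knowing about the dict approach: plain assignment keeps the last object seen for each key, while `dict.setdefault` keeps the first. The sketch below assumes Python 3.7+ (where dicts preserve insertion order) and uses `types.SimpleNamespace` objects with a made-up `name` field as stand-ins for the real objects.

```python
from types import SimpleNamespace

# Hypothetical sample objects: object1..object4 with thing = 1, 2, 3, 2
object_list = [SimpleNamespace(name="object%d" % i, thing=t)
               for i, t in enumerate([1, 2, 3, 2], start=1)]

last_wins = {}
for obj in object_list:
    last_wins[obj.thing] = obj             # later duplicates overwrite earlier ones

first_wins = {}
for obj in object_list:
    first_wins.setdefault(obj.thing, obj)  # only stored if the key is new

last_names = [o.name for o in last_wins.values()]
first_names = [o.name for o in first_wins.values()]
print(last_names)   # ['object1', 'object4', 'object3']
print(first_names)  # ['object1', 'object2', 'object3']
```

Either result satisfies the question, since it does not matter which duplicate survives.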

Related

Why there is multiple values for dictionary key?

I'm watching a python course and I saw a line of code that I don't understand
books_dict[title] = [author,subject,year]
what I see from this line is the key of books_dict is the title and there are multiple values for it?
You can print the type of books_dict[title] with the type() function. It tells you that it's a list, so there is only one object. A list is a container, so it can contain other objects. In your dictionary there is only one value for that key: whenever you access that key you get that one list, not the individual items inside it. Handing back the individual items instead would be problematic!
If you have:
d = {}
d["key1"] = [1, 2, 3]
There is only one value, and that value is a list. In your case, the list is [author, subject, year].
In addition to what others have already stated, a dictionary holds key-value pairs: one key maps to one value. However, the objects used as the key and the value can both be containers holding more than one element.
For example,
books_dict[title] = [author,subject,year]
is the same as
temp = [author, subject, year]
books_dict[title] = temp
The key can also be a container such as a tuple, but it must be hashable, which for the built-in types means immutable.
books_dict[(title, author)] = [subject, year]
which is the same as
key = (title, author)
value = [subject, year]
books_dict[key] = value
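A short sketch of the hashability rule, using made-up sample values for title, author, subject and year: a tuple works as a key, while a list raises TypeError.

```python
books_dict = {}
# Sample data chosen only for illustration
title, author, subject, year = "Dune", "Frank Herbert", "Science fiction", 1965

# A tuple is hashable, so it can be used as a key.
books_dict[(title, author)] = [subject, year]
value = books_dict[("Dune", "Frank Herbert")]

# A list is mutable and unhashable, so using it as a key fails.
try:
    books_dict[[title, author]] = [subject, year]
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

print(value)             # ['Science fiction', 1965]
print(list_key_allowed)  # False
```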

Python Remove Duplicate Dict

I am trying to find a way to remove duplicates from a list of dicts. I don't have to test the entire object contents because the "name" value in a given object is enough to identify duplication (i.e., duplicate name = duplicate object). My current attempt is this:
newResultArray = []
for i in range(0, len(resultArray)):
    for j in range(0, len(resultArray)):
        if(i != j):
            keyI = resultArray[i]['name']
            keyJ = resultArray[j]['name']
            if(keyI != keyJ):
                newResultArray.append(resultArray[i])
which is wildly incorrect. I'd be grateful for any suggestions. Thank you.
If name is unique, you should just use a dictionary to store your inner dictionaries, with name being the key. Then you won't even have the issue of duplicates, and you can look up or remove an entry by name in O(1) time on average.
Since I don't have access to the code that populates resultArray, I'll simply show how you can convert it into a dictionary in linear time, although the best option would be to use a dictionary instead of resultArray in the first place, if possible.
new_dictionary = {}
for item in resultArray:
    new_dictionary[item['name']] = item
If you must have a list in the end, then you can convert the dictionary back into a list like this:
new_list = [v for k,v in new_dictionary.items()]
Since "name" provides uniqueness... and assuming "name" is a hashable object, you can build an intermediate dictionary keyed by "name". Any like-named dicts will simply overwrite their predecessor in the dict, giving you a list of unique dictionaries.
tmpDict = {result["name"]:result for result in resultArray}
newArray = list(tmpDict.values())
del tmpDict
You could shrink that down to
newArray = list({result["name"]:result for result in resultArray}.values())
which may be a bit obscure.
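The dictionary approaches above keep the last dict seen for each name. If the first occurrence should win instead, a set of seen names can be used. This is a sketch with made-up sample data; the "value" field is hypothetical.

```python
# Hypothetical sample input with one duplicated name
resultArray = [
    {"name": "a", "value": 1},
    {"name": "b", "value": 2},
    {"name": "a", "value": 3},
]

seen = set()
newResultArray = []
for item in resultArray:
    if item["name"] not in seen:   # keep only the first dict per name
        seen.add(item["name"])
        newResultArray.append(item)

# For comparison: the dict-based version keeps the last dict per name.
last_wins = list({item["name"]: item for item in resultArray}.values())

print(newResultArray)  # [{'name': 'a', 'value': 1}, {'name': 'b', 'value': 2}]
print(last_wins)       # [{'name': 'a', 'value': 3}, {'name': 'b', 'value': 2}]
```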

Understanding Python Lists and Dictionaries using the example

Using the following example:
For each human in the world I would like to create my own list which I can iterate over.
persons = []
attributes = {}
for human in world:
    attributes['name'] = human['name']
    attributes['eye_color'] = human['eyes']
    persons.append(attributes)
Now when I try to print out each name in my own list:
for item in persons:
    print(item['name'])
They are all the same, why?
You are reusing the same dictionary over and over again. persons.append(attributes) adds a reference to that dictionary to the list, it does not create a copy.
Create a new dictionary in your loop:
persons = []
for human in world:
    attributes = {}
    attributes['name'] = human['name']
    attributes['eye_color'] = human['eyes']
    persons.append(attributes)
Alternatively, use dict.copy() to create a shallow copy of the dictionary.
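To see the difference between appending a reference and appending a copy, here is a small sketch. The contents of `world` are made-up sample data, and the `shared` and `copied` lists are names introduced only for this illustration.

```python
# Hypothetical sample data standing in for `world`
world = [{"name": "Ann", "eyes": "blue"}, {"name": "Bob", "eyes": "brown"}]

attributes = {}
shared = []   # every entry refers to the same dict object
copied = []   # every entry is an independent shallow copy

for human in world:
    attributes['name'] = human['name']
    attributes['eye_color'] = human['eyes']
    shared.append(attributes)
    copied.append(attributes.copy())

shared_names = [p['name'] for p in shared]
copied_names = [p['name'] for p in copied]
print(shared_names)  # ['Bob', 'Bob']
print(copied_names)  # ['Ann', 'Bob']
```

A shallow copy is enough here because the values are strings; nested mutable values would still be shared between copies.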

Insert only unique objects into list

I have a queryset of people:
people = Person.objects.all()
and I have a list un_people = [], meant to hold the people with unique names.
There can be more than one person with the same name. I want to filter for this and then insert into the list so that it only contains Person objects with unique names.
I tried:
for person in people:
    if person.name in un_people:
        #... ?
but the list contains objects, not names. How can I check for objects with the same name and then insert into the list?
Use a dict to enforce uniqueness, then take the values, e.g.:
uniq_names = {person.name:person for person in people}
uniq_people = uniq_names.values() # use list(uniq_names.values()) for Py 3.x
You can use the set data structure:
un_people = set(people)
If your elements are not hashable, as JonClemens suggests, you can build a set of names first (note that this gives you the unique names, not the Person objects):
un_people = set([p.name for p in people])
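A caveat on set(people), sketched with a plain Python class rather than a Django model: the `Person` class below is a hypothetical stand-in with made-up fields, and it relies on Python's default identity-based hashing. Under that assumption a set does not merge people who share a name, whereas a dict keyed by name does.

```python
class Person:
    """Hypothetical plain-Python stand-in, not a Django model."""
    def __init__(self, name, age):
        self.name = name
        self.age = age

people = [Person("Ann", 30), Person("Bob", 25), Person("Ann", 41)]

# Default hashing is by identity, so all three objects are distinct.
by_identity = set(people)

# Keying a dict by name keeps one Person per name (the last one seen).
by_name = {p.name: p for p in people}

print(len(by_identity))    # 3
print(len(by_name))        # 2
print(by_name["Ann"].age)  # 41
```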

Pythonic way to parse list of dictionaries for a specific attribute?

I want to cross reference a dictionary and django queryset to determine which elements have unique dictionary['name'] and djangoModel.name values, respectively. The way I'm doing this now is to:
Create a list of the dictionary['name'] values
Create a list of djangoModel.name values
Generate the list of unique values by checking for inclusion in those lists
This looks as follows:
alldbTests = dbp.test_set.exclude(end_date__isnull=False) #django queryset
vctestNames = [vctest['name'] for vctest in vcdict['tests']] #from dictionary
dbtestNames = [dbtest.name for dbtest in alldbTests] #from django model
# Compare tests in protocol in fortytwo's db with protocol from vc
obsoleteTests = [dbtest for dbtest in alldbTests if dbtest.name not in vctestNames]
newTests = [vctest for vctest in vcdict if vctest['name'] not in dbtestNames]
It feels unpythonic to have to generate the intermediate list of names (lines 2 and 3 above), just to be able to check for inclusion immediately after. Am I missing anything? I suppose I could put two list comprehensions in one line like this:
obsoleteTests = [dbtest for dbtest in alldbTests if dbtest.name not in [vctest['name'] for vctest in vcdict['tests']]]
But that seems harder to follow.
Edit:
Think of the initial state like this:
# alldbTests is a queryset of django models where the following are all true
alldbTests[0].name == 'test1'
alldbTests[1].name == 'test2'
alldbTests[2].name == 'test4'
dict1 = {'name':'test1', 'status':'pass'}
dict2 = {'name':'test2', 'status':'pass'}
dict3 = {'name':'test5', 'status':'fail'}
vcdict = [dict1, dict2, dict3]
I can't convert to sets and take the difference unless I strip things down to just the name string, but then I lose access to the rest of the model/dictionary, right? Sets only would work here if I had the same type of object in both cases.
vctestNames = dict((vctest['name'], vctest) for vctest in vcdict['tests'])
dbtestNames = dict((dbtest.name, dbtest) for dbtest in alldbTests)
# obsolete: in the database but no longer in vc
obsoleteTests = [dbtestNames[key]
                 for key in set(dbtestNames.keys()) - set(vctestNames.keys())]
# new: in vc but not yet in the database
newTests = [vctestNames[key]
            for key in set(vctestNames.keys()) - set(dbtestNames.keys())]
You're working with basic set operations here. You could convert the names to sets and just take the difference (think Venn diagrams); this gives you the obsolete names rather than the objects:
obsoleteTests = list(set([a.name for a in alldbTests]) - set(vctestNames))
Sets are really useful when comparing two lists of objects:
set(a) - set(b)    # elements in a but not in b (difference)
set(a) | set(b)    # elements in a or in b (union)
set(a) & set(b)    # elements in both a and b (intersection)
The intersection and difference operations on sets should help you solve your problem more elegantly.
But as you're originally dealing with dicts, these examples and discussion may provide some inspiration: http://code.activestate.com/recipes/59875-finding-the-intersection-of-two-dicts
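Putting the set-difference idea together with the sample state from the question, here is a sketch that keeps access to the full objects. It assumes Python 3, where dict key views support set operations directly, and uses `types.SimpleNamespace` as a hypothetical stand-in for the Django model instances; `db_by_name` and `vc_by_name` are names introduced for this example.

```python
from types import SimpleNamespace

# Stand-ins for the Django model instances (assumption: only .name matters here)
alldbTests = [SimpleNamespace(name=n) for n in ("test1", "test2", "test4")]
vcdict = [
    {'name': 'test1', 'status': 'pass'},
    {'name': 'test2', 'status': 'pass'},
    {'name': 'test5', 'status': 'fail'},
]

# Index both collections by name so the original objects stay reachable.
db_by_name = {dbtest.name: dbtest for dbtest in alldbTests}
vc_by_name = {vctest['name']: vctest for vctest in vcdict}

# dict.keys() returns a set-like view, so "-" computes the difference.
obsolete_names = db_by_name.keys() - vc_by_name.keys()
new_names = vc_by_name.keys() - db_by_name.keys()

obsoleteTests = [db_by_name[name] for name in sorted(obsolete_names)]
newTests = [vc_by_name[name] for name in sorted(new_names)]

print([t.name for t in obsoleteTests])  # ['test4']
print(newTests)                         # [{'name': 'test5', 'status': 'fail'}]
```

The sorted() calls are only there to make the output order deterministic, since sets are unordered.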
