Searching a list of objects in Python - python

Let's assume I'm creating a simple class to work similar to a C-style struct, to just hold data elements. I'm trying to figure out how to search a list of objects for objects with an attribute equaling a certain value. Below is a trivial example to illustrate what I'm trying to do.
For instance:
class Data:
    pass
myList = []
for i in range(20):
    data = Data()
    data.n = i
    data.n_squared = i * i
    myList.append(data)
How would I go about searching the myList list to determine if it contains an element with n == 5?
I've been Googling and searching the Python docs, and I think I might be able to do this with a list comprehension, but I'm not sure. I might add that I'm having to use Python 2.4.3 by the way, so any new gee-whiz 2.6 or 3.x features aren't available to me.

You can get a list of all matching elements with a list comprehension:
[x for x in myList if x.n == 30] # list of all elements with .n==30
If you simply want to determine whether the list contains any matching element, and do it (relatively) efficiently by stopping at the first match, you can do
def contains(items, predicate):
    for x in items:
        if predicate(x):
            return True
    return False
if contains(myList, lambda x: x.n == 3):  # True if any element has .n==3
    pass  # do stuff

Simple, Elegant, and Powerful:
A generator expression in conjunction with a builtin… (python 2.5+)
any(x.n == 10 for x in myList)
Uses the Python any() builtin, which is defined as follows:
any(iterable) ->
Return True if any element of the iterable is true. Equivalent to:
def any(iterable):
    for element in iterable:
        if element:
            return True
    return False
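To try the short-circuiting any() form against the question's data, here is a minimal self-contained sketch. It rebuilds myList as in the question; has_n is a hypothetical helper name, not something from the answers above.

```python
class Data(object):
    pass

# Rebuild the list from the question: n runs from 0 to 19.
myList = []
for i in range(20):
    data = Data()
    data.n = i
    data.n_squared = i * i
    myList.append(data)

def has_n(items, value):
    # Hypothetical helper: True as soon as one element has .n == value.
    # any() stops consuming the generator at the first True.
    return any(x.n == value for x in items)

found_5 = has_n(myList, 5)
found_30 = has_n(myList, 30)
```

Testing the attribute inside the generator (x.n == value) rather than yielding the objects themselves means the result does not depend on whether the objects happen to be truthy.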

Just for completeness, let's not forget the Simplest Thing That Could Possibly Work:
for i in myList:
    if i.n == 5:
        # do something with it
        print("YAY! Found one!")

[x for x in myList if x.n == 30] # list of all matches
[x.n_squared for x in myList if x.n == 30] # property of matches
any(x.n == 30 for x in myList) # if there are any matches
[i for i,x in enumerate(myList) if x.n == 30] # indices of all matches
def first(iterable, default=None):
    for item in iterable:
        return item
    return default
first(x for x in myList if x.n == 30) # the first match, if any

filter(lambda x: x.n == 5, myList)

You can use in to look for an item in a collection, and a list comprehension to extract the field you are interested in. This works for lists, sets, tuples, and anything that defines __contains__ or __getitem__.
if 5 in [data.n for data in myList]:
    print("Found it")
See also:
Contains Method
In operation

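Another way you could do it is using the next() function (Python 2.6+). It returns the first match and raises StopIteration if nothing matches, unless you pass a default as the second argument.
matched_obj = next(x for x in myList if x.n == 10)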
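As a hedged illustration of next() with a default, the sketch below uses a small Data class with a constructor (an assumption made for brevity; the question sets the attributes after construction).

```python
class Data(object):
    def __init__(self, n):
        self.n = n
        self.n_squared = n * n

myList = [Data(i) for i in range(20)]

# With a default as the second argument, next() returns it
# instead of raising StopIteration when nothing matches.
match = next((x for x in myList if x.n == 10), None)
missing = next((x for x in myList if x.n == 30), None)
```

The generator expression needs its own parentheses here because it is no longer the only argument to next().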

You should add an __eq__ and a __hash__ method to your Data class. __eq__ could check that the __dict__ attributes are equal (same properties) and then that their values are equal, too.
If you did that, you can use
test = Data()
test.n = 5
found = test in myList
The in keyword checks if test is in myList.
If you only want an n property in Data, you could use:
class Data(object):
    __slots__ = ['n']
    def __init__(self, n):
        self.n = n
    def __eq__(self, other):
        if not isinstance(other, Data):
            return False
        if self.n != other.n:
            return False
        return True
    def __hash__(self):
        return hash(self.n)
myList = [ Data(1), Data(2), Data(3) ]
Data(2) in myList #==> True
Data(5) in myList #==> False

Consider using a dictionary:
myDict = {}
for i in range(20):
    myDict[i] = i * i
print(5 in myDict)
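If the objects themselves are needed rather than only the squares, one option is to index them by the attribute. This is a minimal sketch, assuming n is unique per object; by_n is a hypothetical name and the Data constructor is added for brevity.

```python
class Data(object):
    def __init__(self, n):
        self.n = n
        self.n_squared = n * n

myList = [Data(i) for i in range(20)]

# Build the index once; later lookups are dictionary lookups, not list scans.
# If several objects shared the same n, the last one would win.
by_n = {}
for data in myList:
    by_n[data.n] = data

has_5 = 5 in by_n
obj = by_n.get(5)       # the Data object with n == 5, or None if absent
absent = by_n.get(30)
```

This pays off mainly when the same list is searched many times; for a single lookup, a plain scan is just as good.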

Use the following list comprehension in combination with the index method. Note that index raises ValueError when no element matches:
data_n = 5
j = [data.n for data in myList].index(data_n)
print(myList[j].n == data_n)

Related

Is there a better way to compare two lists in Python, stopping at the shorter list?

I'm trying to compare two lists with short-circuiting logic such that, if one list is shorter than the other, the comparison stops at the end of the shorter list and returns True. I'd like to know if what I have is sufficiently Pythonic or if there is a better way to do it.
def compareLists(list1,list2):
    # Comparison invalid if either list is empty
    if not list1 or not list2:
        return False
    equalList = True # initialize as true
    for (l1,l2) in zip(list1,list2):
        if l1 != l2:
            equalList = False
            break
    return equalList
ipdb> list1 = [1,2,3]
ipdb> list2 = [1,2,3,4]
ipdb> compareLists(list1,list2)
True
A more pythonic way would be to use all:
def compare_lists(list1, list2):
    if not list1 or not list2:
        return False
    else:
        return all(x1 == x2 for x1, x2 in zip(list1, list2))
Also note that the naming convention in Python is not camel case, but snake case.
You don't need a loop. You can use the min function to get the length of the shorter list (L), then use the == operator to compare the first L elements of each list, like so:
def compareLists(list1, list2):
    L=min(len(list1), len(list2))
    return list1[:L]==list2[:L]
The following returns True:
lista=[1,3,4,6,7,8,4,6]
listb=[1,3,4,6,7,8,4,6,3,7,5,2,4]
print (compareLists(lista, listb))
Whereas, the following returns False:
lista=[1,3,4,6,7,-8,4,6]
listb=[1,3,4,6,7,8,4,6,3,7,5,2,4]
print (compareLists(lista, listb))
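The two answers agree on non-empty input but differ when one list is empty, which the question treats as an invalid comparison. The sketch below puts both variants side by side; the function names are hypothetical, chosen only to tell them apart.

```python
def compare_with_all(list1, list2):
    # Variant from the all() answer: an empty list makes the comparison invalid.
    if not list1 or not list2:
        return False
    return all(x1 == x2 for x1, x2 in zip(list1, list2))

def compare_with_slices(list1, list2):
    # Variant from the slicing answer: compare the common-length prefixes.
    shorter = min(len(list1), len(list2))
    return list1[:shorter] == list2[:shorter]

same_prefix = (compare_with_all([1, 2, 3], [1, 2, 3, 4]),
               compare_with_slices([1, 2, 3], [1, 2, 3, 4]))
empty_case = (compare_with_all([], [1, 2]),
              compare_with_slices([], [1, 2]))
```

With an empty list the slicing variant compares two empty prefixes and so reports equality, while the all() variant keeps the question's rule and returns False.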

How To Tell if An Element In A List Is Itself A List? [Python]

As an exercise, I'm writing code which recursively sums the elements of a list, including lists within lists.
Sample list: a = [1,2,[3,4],[5,6]]
Sample result: 21
My code is as follows:
a = [1,2,[3,4],[5,6]]
def sum(list):
    if len(list[0]) == 1 and len(list) ==1: # case where list is [x]
        return list[0]
    elif len(list[0]) == 1 and len(list) > 1: # case where list is [x, etc]
        return list[0] + sum(list[1:])
    elif len(list[0]) > 1 and len(list == 1): # case where list is [[x,etc]]
        return sum(list[0])
    else: # case where list is [[x, etc], etc]
        return sum(list[0]) + sum(list[1:])
print (sum(a))
However, when I try to run this I get the error "object of type 'int' has no length." I was under the impression that list[0] would just have a length of 1 if it's not a list in itself, but obviously, that doesn't work. What should I be doing instead to check whether an element in a list is itself a list?
What should I be doing instead to check whether an element in a list is itself a list?
if isinstance(x, list):
Common gotchas: you shouldn't name your own variables list, because that shadows the built-in list type (and its constructor, list()).
Similarly for your function sum(): you wouldn't want to shadow the built-in sum() function, which already handles a flat list of numbers.
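Putting the isinstance check and the naming advice together, a recursive sum could look roughly like the sketch below. nested_sum is a hypothetical name chosen to avoid shadowing the built-in sum(), and the sketch assumes the input contains only numbers and (possibly nested) lists.

```python
def nested_sum(items):
    # Hypothetical name, chosen to avoid shadowing the built-in sum().
    total = 0
    for item in items:
        if isinstance(item, list):
            total += nested_sum(item)   # recurse into the sublist
        else:
            total += item               # plain number
    return total

result = nested_sum([1, 2, [3, 4], [5, 6]])
```

Looping over the elements removes the need for the four head/tail cases in the question, and an empty list simply sums to 0.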
Or another way would be hasattr(lst, "__len__")
from numbers import Number
def recur_sum(lst):
    # Empty or None input sums to 0
    if not lst or lst is None:
        return 0
    # A bare number is returned immediately
    if isinstance(lst, Number):
        return lst
    head = None
    tail = False
    if hasattr(lst, "__len__"):
        _size = len(lst)
        if _size <= 0: # case where list is []
            head = 0
        elif _size >= 1:
            if hasattr(lst[0], "__len__"): # case where list is [[x,etc]]
                head = recur_sum(lst[0])
            else: # case where list is [x]
                head = lst[0]
        tail = (_size > 1)
    if not tail:
        return head
    else:
        return head + recur_sum(lst[1:])
In short, you may use the isinstance function to check if an element is a list or not.
The following code may serve as an example.
a = [1,2,[3,4],[5,6]]
isinstance(a, list)     # True
isinstance(a[0], list)  # False
isinstance(a[2], list)  # True
For an item in the list you may change the if condition to if type(item) is list:
a = [1,2,[3,4],[5,6]]
for item in a:
    if type(item) is list:
        # do_something_with(item)
        print(item)
and the output will look like this:
[3, 4]
[5, 6]
Make sure the built-in name list is not overwritten by a variable!

Function Definition: Printing a list of even integers

I'm trying to define a function that returns a list of even integers from a list of integers
def print_even_numbers(n: list):
    '''Return a list of even numbers given a list of integers'''
    for x in list:
        if x % 2 == 0:
            return(x)
When I tried the code above, the error says that the type isn't iterable
First, list is the name of the list type, and you cannot iterate over a type; you should use n. Second, your return is indented wrong. It should be at the top level of the function, because return exits the function. Then you need to collect the result somewhere.
def print_even_numbers(n):
    '''Return a list of even numbers given a list of integers'''
    result = []
    for x in n:
        if x % 2 == 0:
            result.append(x)
    return result
This can be written more briefly as a list comprehension:
def print_even_numbers(n):
    '''Return a list of even numbers given a list of integers'''
    return [x for x in n if x % 2 == 0]
Invalid syntax in Python 2 (function annotations exist only in Python 3):
def print_even_numbers(n: list):
You don't need parentheses:
return(x)
Wrong indentation, and the wrong thing to iterate over. (And don't use built-in names such as list for your own variables.)
for x in list:
To summarize:
def print_even_numbers(n):
    '''Return a list of even numbers given a list of integers'''
    result = []
    for x in n:
        if x % 2 == 0:
            result.append(x)
    return result
print(print_even_numbers(range(10)))
# prints [0, 2, 4, 6, 8]
And finally, a more Pythonic way is to use yield to implement the desired behaviour:
def gen_even_numbers(n):
    for x in n:
        if x % 2 == 0:
            yield x
print(list(gen_even_numbers(range(10))))
# prints [0, 2, 4, 6, 8]
You can use the filter built-in:
def print_even_numbers(lst):
    return list(filter(lambda x: not x%2, lst))
Note: on Python 2, filter already returns a list, so there is no need to convert it:
def print_even_numbers(lst):
    return filter(lambda x: not x%2, lst)
By the way, the function is named print_even_numbers but it does not print anything ;)
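One way to act on that remark is to keep the filtering and the printing in separate functions. This is only a sketch; even_numbers is a hypothetical name, not one used in the question.

```python
def even_numbers(numbers):
    # Returns a new list of the even values; does not print.
    return [x for x in numbers if x % 2 == 0]

def print_even_numbers(numbers):
    # Printing lives in its own function, so the name matches the behaviour.
    for x in even_numbers(numbers):
        print(x)

evens = even_numbers(range(10))
```

Callers that need the values use even_numbers, and only code that actually wants console output calls print_even_numbers.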

tuple checking in python

I've written a small program:
def check(xrr):
    """ goes through the list and returns True if the list
    does not contain common pairs, IE ([a,b,c],[c,d,e]) = true
    but ([a,b,c],[b,a,c]) = false, note the lists can be longer than 2 tuples"""
    x = xrr[:]
    #sorting the tuples
    sorted(map(sorted,x))
    for i in range(len(x)-1):
        for j in range(len(x)-1):
            if [x[i]] == [x[i+1]] and [x[j]] == [x[j+1]]:
                return False
    return True
But it doesn't seem to work right. This is probably something extremely basic, but after a couple of days trying on and off, I can't really seem to get my head around where the error is.
Thanks in advance
There are so many problems with your code as others have mentioned. I'll try to explain how I would implement this function.
It sounds like what you want to do is actually this: You generate a list of pairs from the input sequences and see if there are any duplicates among the pairs. When you formulate the problem like this it gets much easier to implement.
First we need to generate the pairs. It can be done in many ways, the one you would probably do is:
def pairs( seq ):
    ret = []
    # go to the 2nd last item of seq
    for k in range(len(seq)-1):
        # append a pair
        ret.append((seq[k], seq[k+1]))
    return ret
Now we want to see (a,b) and (b,a) as the same tuple, so we simply sort the tuples:
def sorted_pairs( seq ):
    ret = []
    for k in range(len(seq)-1):
        x,y = (seq[k], seq[k+1])
        if x <= y:
            ret.append((x,y))
        else:
            ret.append((y,x))
    return ret
Now solving the problem is pretty straightforward. We just need to generate all these tuples and add them to a set. Once we see a pair twice we are done:
def has_common_pairs( *seqs ):
    """ checks if there are any common pairs among any of the seqs """
    # store all the pairs we've seen
    seen = set()
    for seq in seqs:
        # generate pairs for each seq in seqs
        pair_seq = sorted_pairs(seq)
        for pair in pair_seq:
            # have we seen the pair before?
            if pair in seen:
                return True
            seen.add(pair)
    return False
Now the function you were trying to implement is quite simple:
def check(xxr):
    return not has_common_pairs(*xxr)
PS: You can generalize the sorted_pairs function to work on any kind of iterable, not only those that support indexing. For completeness' sake I'll paste it below, but you don't really need it here and it's harder to understand:
def sorted_pairs( seq ):
    """ yield pairs (fst, snd) generated from seq
    where fst <= snd for all fst, snd"""
    it = iter(seq)
    try:
        fst = next(it)
    except StopIteration:
        return  # empty input: there are no pairs to yield
    for snd in it:
        if fst <= snd:
            yield fst, snd
        else:
            yield snd, fst
        fst = snd
I would recommend using a set for this:
def check(xrr):
    s = set()
    for t in xrr:
        u = tuple(sorted(t))
        if u in s:
            return False
        s.add(u)
    return True
This way, you don't need to sort the whole list and you stop when the first duplicate is found.
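To check the set-based approach against the two cases from the question's docstring, here is a small self-contained sketch. It repeats the function with more descriptive variable names; the string elements are placeholders for the question's a, b, c values.

```python
def check(xrr):
    seen = set()
    for t in xrr:
        key = tuple(sorted(t))   # order-insensitive key for each tuple
        if key in seen:
            return False         # a reordered duplicate was already seen
        seen.add(key)
    return True

no_common = check([("a", "b", "c"), ("c", "d", "e")])
reordered = check([("a", "b", "c"), ("b", "a", "c")])
```

Sorting each tuple requires its elements to be mutually comparable, and converting the result to a tuple makes it hashable so it can go into the set.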
There are several errors in your code. One is that sorted returns a new list, and you just drop the return value. Another one is that you have two nested loops over your data where you would need only one. Here is the code that makes your approach work:
def check(xrr):
    x = sorted(map(sorted,xrr))
    for i in range(len(x)-1):
        if x[i] == x[i+1]:
            return False
    return True
This could be shortened to
def check(xrr):
    x = sorted(map(sorted,xrr))
    return all(a != b for a, b in zip(x[:-1], x[1:]))
But note that the first code I gave will be more efficient.
BTW, a list in Python is [1, 2, 3], while a tuple is (1, 2, 3).
sorted doesn't alter the source; it returns a new list.
def check(xrr):
    xrrs = list(map(sorted, xrr))  # list() so len() and slicing also work on Python 3
    for i in range(len(xrrs)):
        if xrrs[i] in xrrs[i+1:]:
            return False
    return True
I'm not sure that's what's being asked, but if I understood it correctly, I'd write:
def check(lst):
    return any(not set(seq).issubset(lst[0]) for seq in lst[1:])
print(check([(1, 2, 3), (2, 3, 5)])) # True
print(check([(1, 2, 3), (3, 2, 1)])) # False
Here is a more general solution. Note that it finds duplicates rather than 'non-duplicates'; that is clearer than negating the result with not.
def has_duplicates(seq):
    seen = set()
    for item in seq:
        if hasattr(item, '__iter__'):
            item = tuple(sorted(item))
        if item in seen:
            return True
        seen.add(item)
    return False
This is a more general solution that returns the duplicates themselves:
def get_duplicates(seq):
    seen = set()
    duplicates = set()
    for item in seq:
        item = tuple(sorted(item))
        if item in seen:
            duplicates.add(item)
        else:
            seen.add(item)
    return duplicates
Also, it is better to look for duplicates than for 'not duplicates'; it saves a lot of confusion. You're better off using a general and readable solution than single-purpose functions.
