def process_filter_description(filter, images, ial):
    '''Return a new list containing only items from list images that pass
    the description filter (a str). ial is the related image association list.
    Matching is done in a case insensitive manner.
    '''
    images = []
    for items in ial:
Those are the only two lines of code I have so far. What is troubling me is the filter in the function. I really don't know what the filter is supposed to do or how to use it.
In no way am I asking for the full code. I just want help with what the filter is supposed to do and how I can use it.
Like I said in my comment, this is really vague. But I'll try to explain a little about the concept of a filter in python, specifically the filter() function.
The prototype of filter is: iterable <- filter(function, iterable).
iterable is something that can be iterated over. You can look up this term in the docs for a more exact explanation, but for your question, just know that a list is iterable.
function is a function that accepts a single element of the iterable you specify (in this case, an element of the list) and returns a boolean specifying whether the element should exist in the iterable that is returned. If the function returns True, the element will appear in the result; if it returns False, it will not.
Here's a short example, showing how you can use the filter() function to filter out all even numbers (which I should point out, is the same as "filtering in" all odd numbers)
def is_odd(i): return i%2
l = [1,2,3,4,5] # This is a list
fl = list(filter(is_odd, l)) # list() is needed in Python 3, where filter() returns an iterator
print(fl) # This will display [1, 3, 5]
You should convince yourself that is_odd works first. It will return 1 (=True) for odd numbers and 0 (=False) for even numbers.
In practice, you usually use a lambda function instead of defining a single-use top-level function, but you shouldn't worry about that, as this is just fine.
But anyway, you should be able to do something similar to accomplish your goal.
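As a rough sketch of the lambda form mentioned above (written for Python 3, where filter() returns a lazy iterator rather than a list, hence the list() call; the variable names are only illustrative):

```python
numbers = [1, 2, 3, 4, 5]

# Same test as is_odd, written inline as a lambda.
odd_numbers = list(filter(lambda i: i % 2 == 1, numbers))
print(odd_numbers)  # [1, 3, 5]

# A list comprehension expresses the same filter without calling filter() at all.
also_odd = [i for i in numbers if i % 2 == 1]
```

Either form is fine; the comprehension is often considered easier to read when the test is short.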
Well it says in the description line:
Return a new list containing only items from list images that pass the description filter (a str)
...
Matching is done in a case insensitive manner
So I'm guessing the filter is just a string. Do you have any kind of text associated with the images? Some kind of description or name that could be matched against the filter string?
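If that guess is right, the matching could look roughly like the sketch below. Two things here are assumptions, not something the question states: that ial is a dict mapping each image to its description string, and that "pass the filter" means the filter text appears somewhere in the description. The parameter is renamed filter_text only to avoid shadowing the builtin filter().

```python
def process_filter_description(filter_text, images, ial):
    """Return the images whose description contains filter_text, ignoring case.

    Assumption (not from the question): ial is a dict mapping each image
    to its description string.
    """
    needle = filter_text.lower()
    # Lowercasing both sides makes the substring test case insensitive.
    return [image for image in images if needle in ial.get(image, "").lower()]


# Hypothetical data, for illustration only.
ial = {"a.png": "Sunset over the Lake", "b.png": "City skyline", "c.png": "lake house"}
matches = process_filter_description("LAKE", ["a.png", "b.png", "c.png"], ial)
print(matches)  # ['a.png', 'c.png']
```

If ial turns out to be a list of pairs or a list of objects, only the lookup inside the comprehension would need to change.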
I have a set with multiple tuples: set1 = {(1,1),(2,1)} for example.
Now I want to pass each tuple of the set to a method with this signature: process_tuple(self, tuple).
I am doing it with a for loop like this:
for tuple in set1:
    process_tuple(tuple)
Is there a better way to do it?
Your question is basically "how can I loop without using a loop". While it's possible to do what you're asking without an explicit for loop, the loop is by far the clearest and best way to go.
There are some alternatives, but mostly they're just changing how the loop looks, not preventing it in the first place. If you want to collect the return values from the calls to your function in a list, you can use a list comprehension to build the list at the same time as you loop:
results = [process_tuple(tuple) for tuple in set1]
You can also do set or dict comprehensions if those seem useful to your specific needs. For example, you could build a dictionary mapping from the tuples in your set to their processed results with:
results_dict = {tuple: process_tuple(tuple) for tuple in set1}
If you don't want to write out for tuple in set1 at all, you could use the builtin map function to do the looping and passing of values for you. It returns an iterator, which you'll need to fully consume to run the function over the full input. Passing the map object to list sometimes makes sense, for instance, to convert inputs into numbers:
user_numbers = list(map(int, input("Enter space-separated integers: ").split()))
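To make the "returns an iterator" point concrete, here is a small sketch. The process_tuple function below is a hypothetical stand-in that records its calls, so you can see that nothing runs until the map object is consumed:

```python
seen = []

def process_tuple(pair):
    # Hypothetical stand-in for the real method: record the call, return the sum.
    seen.append(pair)
    return pair[0] + pair[1]

set1 = {(1, 1), (2, 1)}

lazy = map(process_tuple, set1)
print(len(seen))  # 0: map() has not called process_tuple yet

results = list(lazy)  # consuming the iterator is what runs the calls
print(len(seen))  # 2
print(sorted(results))  # [2, 3]
```

This is why a bare map(process_tuple, set1) with nothing consuming it silently does no work, which is one more argument for the plain for loop when you only care about side effects.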
But I'd also strongly encourage you to think of your current code as perhaps the best solution. Just because you can change it to something else, doesn't mean you should.
Is there a way of passing a field of an array of collections into a function so that it can still be used to access an element in the collection in Python? I am attempting to search through an array of collections to locate a particular item by comparing it with an identifier. This identifier and the field being compared will change as the function is called in different stages of the program. Is there a way of passing the field to the function, to access the required element for comparison?
This is the code that I have tried thus far:
code ...
In your code, M_work is a list. Lists are accessed using an index and this syntax: myList[index]. So that would translate to M_work[place] in your case. Then you say that M_work stores objects which have fields, and you want to access one of these fields by name. To do that, use getattr like this: getattr(M_work[place], field). You can compare the return value to identifier.
Other mistakes in the code you show:
place is misspelled pace at one point.
True is misspelled true at one point.
The body of your loop always returns at the first iteration: there is a return in both the if found == True and else branches. I don't think this is what you want.
You could improve your code by:
noticing that if found == True is equivalent to if found.
finding how you don't actually need the found variable.
looking at Python's for...in loop.
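Putting the getattr suggestion and the improvements listed above together, a minimal sketch might look like this. The function name find_by_field and the sample objects are hypothetical, since the original code is not shown:

```python
from types import SimpleNamespace

def find_by_field(items, field, identifier):
    """Return the first item whose attribute `field` equals identifier, else None."""
    for item in items:
        if getattr(item, field) == identifier:
            return item
    return None  # reached only after every item has been checked

# Hypothetical stand-in for the M_work list from the question.
M_work = [
    SimpleNamespace(name="drill", code=17),
    SimpleNamespace(name="lathe", code=42),
]

print(find_by_field(M_work, "code", 42).name)  # lathe
print(find_by_field(M_work, "name", "saw"))    # None
```

Note that the return None sits after the loop, not in an else branch inside it, so the function no longer gives up on the first iteration, and no found variable is needed.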
I have a Python list of objects that could be pretty long. At particular times, I'm interested in all of the elements in the list that have a certain attribute, say flag, that evaluates to False. To do so, I've been using a list comprehension, like this:
objList = list()
# ... populate list
[x for x in objList if not x.flag]
Which seems to work well. After forming the sublist, I have a few different operations that I might need to do:
1. Subscript the sublist to get the element at index ind.
2. Calculate the length of the sublist (i.e. the number of elements that have flag == False).
3. Search the sublist for the first instance of a particular object (i.e. using the list's .index() method).
I've implemented these using the naive approach of just forming the sublist and then using its methods to get at the data I want. I'm wondering if there are more efficient ways to go about these. #1 and #3 at least seem like they could be optimized, because in #1 I only need the first ind + 1 matching elements of the sublist, not necessarily the entire result set, and in #3 I only need to search through the sublist until I find a matching element.
Is there a good Pythonic way to do this? I'm guessing I might be able to use the () syntax in some way to get a generator instead of creating the entire list, but I haven't happened upon the right way yet. I obviously could write loops manually, but I'm looking for something as elegant as the comprehension-based method.
If you need to do any of these operations more than a couple of times, the overhead of the other methods will be higher, and building the list is the best way. It's also probably the clearest, so if memory isn't a problem, then I'd recommend just going with it.
If memory/speed is a problem, then there are alternatives - note that speed-wise, these might actually be slower, depending on the common case for your software.
For your scenarios:
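from itertools import dropwhile, islice

def nth(iterable, n, default=None):
    # nth() recipe from the itertools docs
    return next(islice(iterable, n, None), default)

# value = sublist[n]
value = nth((x for x in objList if not x.flag), n)
# value = len(sublist)
value = sum(not x.flag for x in objList)
# value = sublist.index(target); enumerate() supplies the position that index() would return
value = next(dropwhile(lambda pair: pair[1] != target, enumerate(x for x in objList if not x.flag)))[0]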
Using itertools.dropwhile() and the nth() recipe from the itertools docs.
I'm going to assume you might do any of these three things, and you might do them more than once.
In that case, what you want is basically to write a lazily evaluated list class. It would keep two pieces of data, a real list cache of evaluated items, and a generator of the rest. You could then do ll[10] and it would evaluate up to the 10th item, ll.index('spam') and it would evaluate until it finds 'spam', and then len(ll) and it would evaluate the rest of the list, all the while caching in the real list what it sees so nothing is done more than once.
Constructing it would look like this:
LazyList(x for x in obj_list if not x.flag)
But nothing would actually be computed until you actually start using it as above.
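A minimal sketch of such a class is below. LazyList is not a standard library type; this is one hypothetical way to write it, supporting only integer indexing, len() and index(), and assuming the underlying sequence does not change while it is in use.

```python
class LazyList:
    """Hypothetical lazily evaluated list: caches items pulled from an iterable."""

    def __init__(self, iterable):
        self._source = iter(iterable)  # the part not yet evaluated
        self._cache = []               # the part already evaluated

    def _fill_to(self, index):
        # Pull items until the cache holds position `index` or the source ends.
        while len(self._cache) <= index:
            try:
                self._cache.append(next(self._source))
            except StopIteration:
                break

    def _fill_all(self):
        self._cache.extend(self._source)

    def __getitem__(self, index):
        if index < 0:
            self._fill_all()  # negative indices need the full length
        else:
            self._fill_to(index)
        return self._cache[index]  # raises IndexError if out of range

    def __len__(self):
        self._fill_all()
        return len(self._cache)

    def index(self, target):
        # Look through what is already cached, then keep pulling until found.
        for i, item in enumerate(self._cache):
            if item == target:
                return i
        for item in self._source:
            self._cache.append(item)
            if item == target:
                return len(self._cache) - 1
        raise ValueError(f"{target!r} is not in list")


ll = LazyList(n * n for n in range(1000))
print(ll[3])         # 9, after evaluating only the first four squares
print(ll.index(49))  # 7
print(len(ll))       # 1000, which forces the rest to be evaluated
```

Each item is computed at most once, but the cache eventually holds everything that has been touched, so this trades memory for avoiding repeated work.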
Since you commented that your objList can change, if you don't also need to index or search objList itself, then you might be better off just storing two different lists, one with .flag = True and one with .flag = False. Then you can use the second list directly instead of constructing it with a list comprehension each time.
If this works in your situation, it is likely the most efficient way to do it.
Okay I concede that I didn't ask the question very well. I will update my question to be more precise.
I am writing a function that takes a list as an argument. I want to check the length of the list so I can loop through the list.
The problem that I have is when the list has only one entry, len(myList) returns the length of that entry (the length of the string) and not the length of the list which should be == 1.
I can fix this if I force the argument to be passed as a single-value list ['val']. But I would prefer my API to allow the user to pass either a value or a list of values.
example:
def myMethod(self,dataHandle, data,**kwargs):
    comment = kwargs.get('comment','')
    _dataHandle= list()
    _data = list()
    _dataHandle.append(dataHandle)
    _data.append(data)
    for i in range(_dataHandle):
        # do stuff.
I would like to be able to call my method either by
myMethod('ed', ed.spectra,comment='down welling irradiance')
or by
myMethod(['ed','lu'] , [ed.spectra,lu.spectra] , comments = ['downwelling', 'upwelling radiance'])
Any help would be greatly appreciated. It might not seem like a big deal to pass ['ed'], but it breaks the consistency of my API so far.
The proper python syntax for a list consisting of a single item is [ 'ed' ].
What you're doing with list('ed') is asking python to convert 'ed' to a list. This is a consistent metaphor in python: when you want to convert something to a string, you say str(some_thing). Any hack you'd use to make list('ed') return a list with just the string 'ed' would break python's internal metaphors.
When python sees list(x), it will try to convert x to a list. If x is iterable, it does something more or less equivalent to this:
def make_list(x):
    ret_val = []
    for item in x:
        ret_val.append(item)
    return ret_val
Because your string 'ed' is iterable, python will convert it to a list of length two: [ 'e', 'd' ].
The cleanest idiomatic python in this case might be to have your function accept a variable number of arguments, so instead of this
def my_func(itemList):
    ...
you'd do this
def my_func(*items):
    ...
And instead of calling it like this
my_func(['ed','lu','lsky'])
You'd call it like this:
my_func('ed', 'lu', 'lsky')
In this way you can accept any number of arguments, and your API will be nice and clean.
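As a small illustration of the variable-argument form (the body of my_func here is made up, just to have something to return):

```python
def my_func(*items):
    # items is always a tuple, even when a single argument is passed.
    return [item.upper() for item in items]

print(my_func('ed'))                # ['ED']
print(my_func('ed', 'lu', 'lsky'))  # ['ED', 'LU', 'LSKY']

# A caller who already has a list can unpack it with *.
handles = ['ed', 'lu']
print(my_func(*handles))            # ['ED', 'LU']
```

Because items is a tuple in every case, len(items) is 1 for a single string, which avoids the character-by-character surprise.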
You can ask if your variable is a list:
def my_method(my_var):
    if isinstance(my_var, list):
        for my_elem in my_var:
            pass  # do stuff with my_elem
    else:  # my_var is not a list
        pass  # do stuff with my_var
EDIT: Another option is to try iterating over it, and if that fails (raises an exception), you assume it is a single element:
def my_method(my_var):
    try:
        for my_elem in my_var:
            pass  # do stuff with my_elem
    except TypeError:  # my_var is not iterable
        pass  # do stuff with my_var
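The good thing about this second option is that it works not only for lists, like the first one, but for anything that is iterable (sets, dicts, tuples, etc.). Keep in mind that strings are iterable too, so a single string passed this way is looped over character by character rather than treated as one element.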
You do actually need to put your string in a list if you want it to be treated like a list.
EDIT
I see that at some point there was a list in front of the string. list, contrary to what you may think, doesn't create a list of one item. It calls __iter__ on the string object and iterates over each item. Thus it makes a list of characters.
Hopefully this makes it clearer:
>>> print(list('abc'))
['a', 'b', 'c']
>>> print(list(('abc',)))
['abc']
list('ed') does not create a list containing a single element, 'ed'. list(x) in general does not create a list containing a single element, x. In fact, if you had been using numbers rather than strings (or anything else non-iterable), this would have been blindingly obvious to you:
>>> list('ed')
['e', 'd']
>>> list(3)
Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    list(3)
TypeError: 'int' object is not iterable
>>>
So you are in fact passing a list with multiple elements to your method, which is why len is returning greater than 1. It's not returning the length of the first element of the list.
For your method to allow passing either a single item or a list, you'd have to do some checking to see if it's a single item first, and if it is create a list containing it with myVar = [myVar], then run your loop.
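A minimal sketch of that check-then-wrap step is below. The helper name as_list is hypothetical, and treating only list and tuple as "collections" is an assumption chosen so that strings count as single items:

```python
def as_list(value):
    """Wrap a single item in a list; copy lists and tuples into a new list."""
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]  # strings and every other type count as one item

print(as_list('ed'))          # ['ed']
print(as_list(['ed', 'lu']))  # ['ed', 'lu']
print(len(as_list('ed')))     # 1
```

Calling this once at the top of the method lets the rest of the body loop over a real list in both cases.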
However, this sort of API is tricky to implement and use, and I would not recommend it. The most natural way to check if you've been given a collection or an item is to see if myVar is iterable. However, this fails for strings, which are iterable. Strings unfortunately straddle the boundary between a collection and an individual data item; we very often use them as data items containing a "chunk of text", but they are also collections of characters, and Python allows them to be used as such.
Such an API also is likely to cause you to one day accidentally pass a list that you're thinking of as a single thing and expecting the method to treat it as a single thing. But it's a list, so suddenly the code will behave differently.
It also raises questions about what you do with other data types. A dictionary is not a list, but it can be iterated. If you pass a dictionary as myVar, will it be treated as a list containing a single dictionary, or will it iterate over the keys of the dictionary? How about a tuple? What about a custom class implementing __iter__? What if the custom class implementing __iter__ is trying to be "string-like" rather than "list-like"?
All these questions lead to surprises if the caller guesses/remembers wrongly. Surprises when programming lead to bugs. IMHO, it's better to just live with the extra two characters of typing ([ and ]), and have your API be clean and simple.
I run into this same problem frequently. Building a list from an empty list, as you are doing with the "_dataHandle= list()" line, is common in Python because we don't reserve memory in advance. Therefore, it is often the case that the state of the list will transition from empty, to one element, to multiple elements. Note that range() expects an integer, so range(_dataHandle) raises a TypeError when it is handed the list itself. If you iterate over the list directly, the solution can be simple. Instead of:
for i in range(_dataHandle):
use:
for myvar in _dataHandle:
In this case, if there is only one element, the loop only iterates once as you would expect.
I'm fairly new to python and have found that I need to query a list about whether it contains a certain item.
The majority of the postings I have seen on various websites (including this similar stackoverflow question) have all suggested something along the lines of
for i in list:
    if i == thingIAmLookingFor:
        return True
However, I have also found from one lone forum that
if thingIAmLookingFor in list:
    # do work
works.
I am wondering if the if thing in list method is shorthand for the for i in list method, or if it is implemented differently.
I would also like to know which, if either, is preferred.
In your simple example it is of course better to use in.
However... in the question you link to, in doesn't work (at least not directly) because the OP does not want to find an object that is equal to something, but an object whose attribute n is equal to something.
One answer does mention using in on a list comprehension, though I'm not sure why a generator expression wasn't used instead:
if 5 in (data.n for data in myList):
    print("Found it")
But this is hardly much of an improvement over the other approaches, such as this one using any:
if any(data.n == 5 for data in myList):
    print("Found it")
The "if x in thing:" format is strongly preferred, not just because it takes less code, but because it also works on other data types and is (to me) easier to read.
I'm not sure how it's implemented, but I'd expect it to be quite a lot more efficient on datatypes that are stored in a more searchable form. eg. sets or dictionary keys.
The if thing in somelist is the preferred and fastest way.
Under-the-hood that use of the in-operator translates to somelist.__contains__(thing) whose implementation is equivalent to: any((x is thing or x == thing) for x in somelist).
Note the condition tests identity and then equality.
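One place where the identity test makes a visible difference is NaN, which is not equal to itself. A short illustration:

```python
nan = float('nan')

print(nan == nan)    # False: NaN never compares equal, even to itself
print(nan in [nan])  # True: the identity check (x is thing) succeeds first

other_nan = float('nan')  # a different NaN object
print(other_nan in [nan])  # False: neither identical nor equal
```

For ordinary values the two tests agree, so this only matters for objects with unusual __eq__ behaviour.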
for i in list:
    if i == thingIAmLookingFor:
        return True
The above is a terrible way to test whether an item exists in a collection. It returns True from the function, so if you need the test as part of some code you'd need to move this into a separate utility function, or add thingWasFound = False before the loop and set it to True in the if statement (and then break), either of which is several lines of boilerplate for what could be a simple expression.
Plus, if you just use thingIAmLookingFor in list, this might execute more efficiently by doing fewer Python level operations (it'll need to do the same operations, but maybe in C, as list is a builtin type). But even more importantly, if list is actually bound to some other collection like a set or a dictionary thingIAmLookingFor in list will use the hash lookup mechanism such types support and be much more efficient, while using a for loop will force Python to go through every item in turn.
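To see the difference between scanning and hash lookup, here is a rough sketch using a hypothetical Counted class that counts how many times its __eq__ is called during a membership test:

```python
class Counted:
    """Hypothetical wrapper that counts how often __eq__ is called."""
    calls = 0

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        Counted.calls += 1
        return isinstance(other, Counted) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


items = [Counted(i) for i in range(1000)]
target = Counted(999)  # equal to the last item, but a different object

Counted.calls = 0
found_in_list = target in items  # scans item by item
list_calls = Counted.calls

as_set = set(items)
Counted.calls = 0
found_in_set = target in as_set  # jumps to candidates via the hash
set_calls = Counted.calls

print(found_in_list, found_in_set)
print("comparisons in list:", list_calls)
print("comparisons in set:", set_calls)
```

The list test has to compare against the items one at a time until it reaches a match, while the set test only compares against entries whose hash matches, which is why the second count comes out far smaller.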
Obligatory post-script: list is a terrible name for a variable that contains a list as it shadows the list builtin, which can confuse you or anyone who reads your code. You're much better off naming it something that tells you something about what it means.