Okay I concede that I didn't ask the question very well. I will update my question to be more precise.
I am writing a function that takes a list as an argument. I want to check the length of the list so I can loop through the list.
The problem that I have is when the list has only one entry, len(myList) returns the length of that entry (the length of the string) and not the length of the list which should be == 1.
I can fix this if I force the argument to be passed as a single-value list ['val']. But I would prefer my API to allow the user to pass either a value or a list of values.
example:
def myMethod(self, dataHandle, data, **kwargs):
    comment = kwargs.get('comment', '')
    _dataHandle = list()
    _data = list()
    _dataHandle.append(dataHandle)
    _data.append(data)
    for i in range(_dataHandle):
        # do stuff.
I would like to be able to call my method either by
myMethod('ed', ed.spectra,comment='down welling irradiance')
or by
myMethod(['ed','lu'] , [ed.spectra,lu.spectra] , comments = ['downwelling', 'upwelling radiance'])
Any help would be greatly appreciated. It might not seem like a big deal to pass ['ed'], but it breaks the consistency of my API so far.
The proper python syntax for a list consisting of a single item is [ 'ed' ].
What you're doing with list('ed') is asking python to convert 'ed' to a list. This is a consistent metaphor in python: when you want to convert something to a string, you say str(some_thing). Any hack you'd use to make list('ed') return a list with just the string 'ed' would break python's internal metaphors.
When python sees list(x), it will try to convert x to a list. If x is iterable, it does something more or less equivalent to this:
def make_list(x):
    ret_val = []
    for item in x:
        ret_val.append(item)
    return ret_val
Because your string 'ed' is iterable, python will convert it to a list of length two: [ 'e', 'd' ].
The cleanest idiomatic python in this case might be to have your function accept a variable number of arguments, so instead of this
def my_func(itemList):
    ...
you'd do this
def my_func(*items):
    ...
And instead of calling it like this
my_func(['ed','lu','lsky'])
You'd call it like this:
my_func('ed', 'lu', 'lsky')
In this way you can accept any number of arguments, and your API will be nice and clean.
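As a minimal sketch of the variable-argument approach (the upper-casing is only placeholder processing, not something from the question), the function always receives a tuple, whether it is called with one value or several:

```python
def my_func(*items):
    """Collect all positional arguments into the tuple `items`."""
    # Placeholder processing: upper-case each name.
    return [item.upper() for item in items]

single = my_func('ed')                  # one argument; `items` is a 1-tuple
several = my_func('ed', 'lu', 'lsky')   # three arguments
names = ['ed', 'lu']
unpacked = my_func(*names)              # unpack an existing list at the call site
```

A caller who already has a list can still use the function by unpacking it with `*`, so neither calling style is lost.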
You can ask if your variable is a list:
def my_method(my_var):
    if isinstance(my_var, list):
        for my_elem in my_var:
            pass  # do stuff with my_elem
    else:  # my_var is not a list
        pass  # do stuff with my_var
EDIT: Another option is to try iterating over it, and if that fails (raises an exception) you assume it is a single element:
def my_method(my_var):
    try:
        for my_elem in my_var:
            pass  # do stuff with my_elem
    except TypeError:  # my_var is not iterable
        pass  # do stuff with my_var
The good thing about this second option is that it works not only for lists, like the first one, but for anything that is iterable (sets, dicts, tuples, etc.). Note that strings are iterable too, so a single string would be looped over character by character.
You do actually need to put your string in a list if you want it to be treated like a list
EDIT
I see that at some point there was a list in front of the string. list, contrary to what you may think, doesn't create a list of one item. It calls __iter__ on the string object and iterates over each item. Thus it makes a list of characters.
Hopefully this makes it clearer:
>>> print(list('abc'))
['a', 'b', 'c']
>>> print(list(('abc',)))
['abc']
list('ed') does not create a list containing a single element, 'ed'. list(x) in general does not create a list containing a single element, x. In fact, if you had been using numbers rather than strings (or anything else non-iterable), this would have been blindingly obvious to you:
>>> list('ed')
['e', 'd']
>>> list(3)
Traceback (most recent call last):
File "<pyshell#0>", line 1, in <module>
list(3)
TypeError: 'int' object is not iterable
>>>
So you are in fact passing a list with multiple elements to your method, which is why len is returning greater than 1. It's not returning the length of the first element of the list.
For your method to allow passing either a single item or a list, you'd have to do some checking to see if it's a single item first, and if it is create a list containing it with myVar = [myVar], then run your loop.
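The check described above could be sketched as a small helper. The name `as_list` is hypothetical, and treating only lists and tuples as "collections" is an assumption; strings are deliberately wrapped rather than iterated:

```python
def as_list(value):
    """Return a list: a copy of `value` if it is a list or tuple,
    otherwise a one-element list containing `value`."""
    if isinstance(value, (list, tuple)):
        return list(value)
    # Strings, numbers and everything else count as a single item.
    return [value]

one = as_list('ed')            # a single string becomes a one-element list
many = as_list(['ed', 'lu'])   # a list is passed through as a list
number = as_list(3)            # non-iterables are wrapped too
```

With this in place, `len(as_list('ed'))` is 1 rather than 2.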
However, this sort of API is tricky to implement and use, and I would not recommend it. The most natural way to check whether you've been given a collection or an item is to see if myVar is iterable. However, this fails for strings, which are iterable. Strings unfortunately straddle the boundary between a collection and an individual data item; we very often use them as data items containing a "chunk of text", but they are also collections of characters, and Python allows them to be used as such.
Such an API also is likely to cause you to one day accidentally pass a list that you're thinking of as a single thing and expecting the method to treat it as a single thing. But it's a list, so suddenly the code will behave differently.
It also raises questions about what you do with other data types. A dictionary is not a list, but it can be iterated. If you pass a dictionary as myVar, will it be treated as a list containing a single dictionary, or will it iterate over the keys of the dictionary? How about a tuple? What about a custom class implementing __iter__? What if the custom class implementing __iter__ is trying to be "string-like" rather than "list-like"?
All these questions lead to surprises if the caller guesses/remembers wrongly. Surprises when programming lead to bugs. IMHO, it's better to just live with the extra two characters of typing ([ and ]), and have your API be clean and simple.
I run into this same problem frequently. Building a list up from an empty one, as you are doing with the "_dataHandle= list()" line, is common in Python because we don't reserve memory in advance. The list therefore often transitions from empty, to one element, to multiple elements. As you found, code that relies on indexing can behave differently for one element than for several. If you can iterate over the list directly instead of indexing into it, the solution can be simple. Instead of:
for i in range(_dataHandle):
use:
for myvar in _dataHandle:
In this case, if there is only one element, the loop only iterates once as you would expect.
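Note that `range(_dataHandle)` would raise a TypeError in any case, because `range` expects an integer, not a list. A rough sketch of direct iteration follows; the nested lists are stand-ins for `ed.spectra` and `lu.spectra`, which are not shown in the question. `zip` walks the two parallel lists together, and `enumerate` supplies an index when one is really needed:

```python
data_handles = ['ed', 'lu']
data = [[1.0, 2.0], [3.0, 4.0]]   # stand-ins for ed.spectra and lu.spectra

# Walk both lists in parallel without any index arithmetic.
pairs = []
for handle, values in zip(data_handles, data):
    pairs.append((handle, len(values)))

# If the position is needed as well, enumerate provides it.
indexed = [(i, handle) for i, handle in enumerate(data_handles)]
```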
I have a set with multiple tuples: set1 = {(1,1),(2,1)} for example.
Now I want to pass each tuple of the set to a method with this signature: process_tuple(self, tuple).
I am doing it with a for loop like this:
for tuple in set1:
    process_tuple(tuple)
Is there a better way to do it?
Your question is basically "how can I loop without using a loop". While it's possible to do what you're asking without an explicit for loop, the loop is by far the clearest and best way to go.
There are some alternatives, but mostly they're just changing how the loop looks, not preventing it in the first place. If you want to collect the return values from the calls to your function in a list, you can use a list comprehension to build the list at the same time as you loop:
results = [process_tuple(tuple) for tuple in set1]
You can also do set or dict comprehensions if those seem useful to your specific needs. For example, you could build a dictionary mapping from the tuples in your set to their processed results with:
results_dict = {tuple: process_tuple(tuple) for tuple in set1}
If you don't want to write out for tuple in set1 at all, you could use the builtin map function to do the looping and passing of values for you. It returns an iterator, which you'll need to fully consume to run the function over the full input. Passing the map object to list sometimes makes sense, for instance, to convert inputs into numbers:
user_numbers = list(map(int, input("Enter space-separated integers: ").split()))
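To illustrate the point about consuming the iterator, here is a small sketch in Python 3. The `process_tuple` below is a hypothetical stand-in that records its calls; nothing runs until the map object is consumed:

```python
calls = []

def process_tuple(pair):
    """Hypothetical stand-in that records each tuple it receives."""
    calls.append(pair)
    return sum(pair)

set1 = {(1, 1), (2, 1)}

lazy = map(process_tuple, set1)   # no call has happened yet
calls_before = len(calls)
results = list(lazy)              # consuming the iterator runs the calls
```

If the return values are not needed, building a throwaway list just to drive the calls is one more reason to prefer the plain for loop.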
But I'd also strongly encourage you to think of your current code as perhaps the best solution. Just because you can change it to something else, doesn't mean you should.
I am learning lists and trying to create a list and add data to it.
mylist=[]
mylist[0]="hello"
This generates an error.
Why can't we add members to lists like this, as we do with arrays in JavaScript?
After all, lists are also dynamic, and we can add as many members of any data type as we like.
In javascript this works:
var ar=[];
ar[0]=333;
Why doesn't this work in Python, where we can only use append() to add to a list?
mylist[0] = 'hello' is syntactic sugar for mylist.__setitem__(0, 'hello').
As per the docs for object.__setitem__(self, key, value):
The same exceptions should be raised for improper key values as for
the __getitem__() method.
The docs for __getitem__ state specifically what leads to IndexError:
if value outside the set of indexes for the sequence (after any
special interpretation of negative values), IndexError should be
raised.
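A short sketch of that behaviour: assigning to an index that does not exist raises IndexError, while assigning to an existing index works as expected.

```python
mylist = []
try:
    mylist[0] = "hello"       # equivalent to mylist.__setitem__(0, "hello")
except IndexError as exc:
    message = str(exc)        # index 0 does not exist in an empty list

mylist.append("hello")        # grow the list by one element
mylist[0] = "goodbye"         # index 0 now exists, so assignment succeeds
```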
As to the purpose behind this design decision, one can write several chapters to explain why list has been designed in this way. You should familiarise yourself with Python list indexing and slicing before making judgements on its utility.
Lists in Python are fundamentally different to arrays in languages like C. You do not create a list of a fixed size and assign elements to indexes in it. Instead you either create an empty list and append elements to it, or use a list-comprehension to generate a list from a type of expression.
In your case, you want to add to the end, so you must use the .append method:
mylist.append('hello')
#["hello"]
And an example of a list comprehension:
squares = [x**2 for x in range(10)]
#[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
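If assignment by index is really what you want, two possible workarounds (offered as a sketch, not as the only options) are to pre-fill the list with placeholders, or to use a dict when the indexes may be sparse:

```python
# Option 1: pre-fill a list so that the indexes already exist.
slots = [None] * 3
slots[0] = "hello"
slots[2] = 333

# Option 2: a dict accepts any key, so gaps between indexes are fine.
sparse = {}
sparse[0] = "hello"
sparse[10] = 333
```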
I have a list of computer nodes called node_names, and I want to find the amount of free ram in each node, and store that in a second list. I then want to combine these lists into a dictionary.
I have:
for i in range(0, number_of_nodes):
    sys_output = [commands.getoutput('ssh %s \'free -m\'' % node_names[i])]
    free_memory = [x.split()[9] for x in sys_output]
    print free_memory
For 4 nodes, this returns [mem1],[mem2],[mem3],[mem4].
How can I combine each memory value into a single list? I'm having difficulty assigning free_memory as a list instead of a string which is replaced after each loop iteration.
Once I have a memory list, I should be able to combine it with the node_names list to make a dictionary file and do any necessary sorting.
I would recommend just building the dictionary directly:
import commands
node_free_mem = {}
for n in node_names:
    sys_output = commands.getoutput("ssh %s 'free -m'" % n)
    free_memory = sys_output.split()[9]
    node_free_mem[n] = int(free_memory)
Here's code that does exactly what you asked: it builds a list, then uses the list to make a dictionary. Discussion after the code.
import commands
def get_free_mem(node_name):
    sys_output = commands.getoutput('ssh %s \'free -m\'' % node_name)
    free_memory = sys_output.split()[9]
    return int(free_memory)
free_list = [get_free_mem(n) for n in node_names]
node_free_mem = dict(zip(node_names, free_list))
Note that in both code samples I simply iterate over the list of node names, rather than using a range() to get index numbers and indexing the list. It's simplest and best in Python to just ask for what you want: you want the names, so ask for those.
I made a helper function for the code to get free memory. Then a simple list comprehension builds a parallel list of free memory values.
The only tricky bit is building the dict. This use of zip() is actually pretty common in Python and is discussed here:
Map two lists into a dictionary in Python
For large lists in Python 2.x you might want to use itertools.izip() instead of the built-in zip(), but in Python 3.x you just use the built-in zip().
EDIT: cleaned up the code; it should work now.
commands.getoutput() returns a string. There is no need to package up the string inside a list, so I removed the square brackets. In turn, there is no need for a list comprehension to get the free_memory value; just split the string. Now we have a simple string that can be passed to int() to convert it to an integer.
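The `commands` module no longer exists in Python 3, so a rough Python 3 equivalent might look like the sketch below. It assumes `subprocess.run` as the replacement, keeps the same token index as the code above, and uses made-up `free -m` output and node names purely for illustration; the ssh call itself is defined but not executed here.

```python
import subprocess

def parse_free_mem(free_output):
    """Return the 10th whitespace-separated token as an int,
    the same position the code above reads with split()[9]."""
    return int(free_output.split()[9])

def get_free_mem(node_name):
    """Run `free -m` on a node over ssh (not called in this sketch)."""
    result = subprocess.run(
        ["ssh", node_name, "free -m"],
        capture_output=True, text=True, check=True,
    )
    return parse_free_mem(result.stdout)

# Made-up sample output, for illustration only.
sample = (
    "       total   used   free   shared   buff/cache   available\n"
    "Mem:    7821   2310   1024      120         4487        5100\n"
    "Swap:   2047      0   2047\n"
)
free_mb = parse_free_mem(sample)

# Combining two parallel lists into a dictionary with zip().
node_names = ["node1", "node2"]   # hypothetical names
free_list = [1024, 2048]          # hypothetical values
node_free_mem = dict(zip(node_names, free_list))
```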
def process_filter_description(filter, images, ial):
    '''Return a new list containing only items from list images that pass
    the description filter (a str). ial is the related image association list.
    Matching is done in a case insensitive manner.
    '''
    images = []
    for items in ial:
Those are the only two lines of code I have so far. What is troubling me is the filter in the function. I really don't know what the filter is supposed to do or how to use it.
In no way am I asking for the full code. I just want help with what the filter is supposed to do and how I can use it.
Like I said in my comment, this is really vague. But I'll try to explain a little about the concept of a filter in python, specifically the filter() function.
The prototype of filter is: iterable <- filter(function, iterable).
iterable is something that can be iterated over. You can look up this term in the docs for a more exact explanation, but for your question, just know that a list is iterable.
function is a function that accepts a single element of the iterable you specify (in this case, an element of the list) and returns a boolean specifying whether the element should exist in the iterable that is returned. If the function returns True, the element will appear in the returned list, if False, it will not.
Here's a short example, showing how you can use the filter() function to filter out all even numbers (which I should point out, is the same as "filtering in" all odd numbers)
def is_odd(i): return i%2
l = [1,2,3,4,5] # This is a list
fl = filter(is_odd, l)
print fl # This will display [1,3,5]
You should convince yourself that is_odd works first. It will return 1 (=True) for odd numbers and 0 (=False) for even numbers.
In practice, you usually use a lambda function instead of defining a single-use top-level function, but you shouldn't worry about that, as this is just fine.
But anyway, you should be able to do something similar to accomplish your goal.
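For reference, here is a sketch of the same idea in Python 3, where `filter()` returns a lazy iterator that has to be wrapped in `list()` to see the values. The second part is only a guess at what a "description filter" might mean: the sample descriptions are invented, and case-insensitive substring matching is an assumption.

```python
numbers = [1, 2, 3, 4, 5]

# In Python 3, filter() is lazy; list() collects the results.
odds = list(filter(lambda i: i % 2, numbers))

# Assumed interpretation: keep descriptions that contain the filter
# string, ignoring case. These descriptions are invented examples.
descriptions = ["Sunset over LAKE", "city skyline", "Lake at dawn"]
wanted = "lake"
matches = [d for d in descriptions if wanted.lower() in d.lower()]
```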
Well it says in the description line:
Return a new list containing only items from list images that pass the description filter (a str)
...
Matching is done in a case insensitive manner
So I'm guessing the filter is just a string. Do you have any kind of text associated with the images, such as a description or name that could be matched against the filter string?
In sum: I need to write a List Comprehension in which i refer to list that is being created by the List Comprehension.
This might not be something you need to do every day, but i don't think it's unusual either.
Maybe there's no answer here--still, please don't tell me i ought to use a for loop. That might be correct, but it's not helpful. The reason is the problem domain: this line of code is part of an ETL module, so performance is relevant, and so is the need to avoid creating a temporary container--hence my wish to code this step in a L/C. If a for loop would work for me here, i would just code one.
In any event, i am unable to write this particular list comprehension. The reason: the expression i need to write has this form:
[ some_function(s) for s in raw_data if s not in this_list ]
In that pseudo-code, "this_list" refers to the list created by evaluating that list comprehension. And that's why i'm stuck--because this_list isn't built until my list comprehension is evaluated, and because this list isn't yet built by the time i need to refer to it, i don't know how to refer to it.
What i have considered so far (and which might be based on one or more false assumptions, though i don't know exactly where):
doesn't the python interpreter have to give this list-under-construction a name? i think so
that temporary name is probably taken from some bound method used to build my list ('sum'?)
but even if i went to the trouble to find that bound method and assuming that it is indeed the temporary name used by the python interpreter to refer to the list while it is under construction, i am pretty sure you can't refer to bound methods directly; i'm not aware of such an explicit rule, but those methods (at least the few that i've actually looked at) are not valid python syntax. I'm guessing one reason why is so that we do not write them into our code.
so that's the chain of my so-called reasoning, and which has led me to conclude, or at least guess, that i have coded myself into a corner. Still i thought i ought to verify this with the Community before turning around and going a different direction.
There used to be a way to do this using the undocumented fact that while the list was being built its value was stored in a local variable named _[1].__self__. However that quit working in Python 2.7 (maybe earlier, I wasn't paying close attention).
You can do what you want in a single list comprehension if you set up an external data structure first. Since all your pseudo code seemed to be doing with this_list was checking it to see if each s was already in it -- i.e. a membership test -- I've changed it into a set named seen as an optimization (checking for membership in a list can be very slow if the list is large). Here's what I mean:
raw_data = [c for c in 'abcdaebfc']
seen = set()
def some_function(s):
    seen.add(s)
    return s
print [ some_function(s) for s in raw_data if s not in seen ]
# ['a', 'b', 'c', 'd', 'e', 'f']
If you don't have access to some_function, you could put a call to it in your own wrapper function that added its return value to the seen set before returning it.
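A wrapper along those lines might look like the sketch below, written for Python 3. The upper-casing `some_function` is a hypothetical stand-in for a function you cannot modify. One deliberate difference from the description above: this version records the raw input in `seen`, not the return value, because the comprehension's condition tests `s` itself.

```python
def some_function(s):
    """Hypothetical stand-in for a function that cannot be edited."""
    return s.upper()

seen = set()

def tracked(s):
    """Call some_function and remember that `s` has been handled."""
    result = some_function(s)
    seen.add(s)   # record the input, since the filter checks `s not in seen`
    return result

raw_data = list('abcdaebfc')
unique = [tracked(s) for s in raw_data if s not in seen]
```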
Even though it wouldn't be a list comprehension, I'd encapsulate the whole thing in a function to make reuse easier:
def some_function(s):
    # do something with or to 's'...
    return s
def add_unique(function, data):
    result = []
    seen = set(result) # init to empty set
    for s in data:
        if s not in seen:
            t = function(s)
            result.append(t)
            seen.add(t)
    return result
print add_unique(some_function, raw_data)
# ['a', 'b', 'c', 'd', 'e', 'f']
In either case, I find it odd that the list being built in your pseudo code that you want to reference isn't comprised of a subset of raw_data values, but rather the result of calling some_function on each of them -- i.e. transformed data -- which naturally makes one wonder what some_function does such that its return value might match an existing raw_data item's value.
I don't see why you need to do this in one go. Either iterate through the initial data first to eliminate duplicates - or, even better, convert it to a set as KennyTM suggests - then do your list comprehension.
Note that even if you could reference the "list under construction", your approach would still fail because s is not in the list anyway - the result of some_function(s) is.
As far as I know, there is no way to access a list comprehension as it's being built.
As KennyTM mentioned (and if the order of the entries is not relevant), then you can use a set instead. If you're on Python 2.7/3.1 and above, you even get set comprehensions:
{ some_function(s) for s in raw_data }
Otherwise, a for loop isn't that bad either (although it will scale terribly)
l = []
for s in raw_data:
    item = some_function(s)
    if item not in l:
        l.append(item)
Why don't you simply do: [ some_function(s) for s in set(raw_data) ]
That should do what you are asking for, unless you need to preserve the order of the original list.
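If order does matter, one possible approach in Python 3.7 and later is `dict.fromkeys`, which drops duplicates while keeping first-seen order, because dicts preserve insertion order. The `some_function` here is again a hypothetical stand-in:

```python
def some_function(s):
    """Hypothetical stand-in for the real transformation."""
    return s.upper()

raw_data = list('abcdaebfc')

# dict keys are unique and keep insertion order (Python 3.7+).
unique_inputs = list(dict.fromkeys(raw_data))
results = [some_function(s) for s in unique_inputs]
```

Like the `set(raw_data)` version, this removes duplicate inputs, not duplicate results.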