Tricky to word the title well.
I want to create a list of values that correspond to the variables of a list of objects. It can be done inelegantly like this:
class Example:
    def __init__(self, x):
        self.x = x

objlist = [Example(i) for i in range(10)]
DESIRED_OUTCOME = [obj.x for obj in objlist]
But this seems unpythonic and cumbersome, so I was wondering if there is a way of indexing all the values out at one time.
I'm wondering if there is a syntax that lets me take all the variables out at once, like pulling a first-axis array from a 2D array:
ex = example2darray[:,1] #2d array syntax
OUTCOME = objlist[:, objlist.x] #Is there something like this that exists?
>>> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
I hope this question makes sense
Nothing unpythonic about that, IMO, but if you really want to iterate over the x values of your instances 'directly' instead of obtaining them from the objects themselves, you can map operator.attrgetter over them:
import operator
objlist = [Example(i) for i in range(10)]
DESIRED_OUTCOME = map(operator.attrgetter("x"), objlist)
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Beware that on Python 3.x map() returns an iterator, so if you want a list result make sure to turn it into one. Also, unless you construct Example in a special way, pretty much anything will be slower than the good old list comprehension loop which you consider 'inelegant'.
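For example, a quick sketch of the Python 3 version:

DESIRED_OUTCOME = list(map(operator.attrgetter("x"), objlist))  # materialize the lazy map object
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]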
Related
In a list comprehension with a condition that has a function call in it, does Python (specifically CPython 3.9.4) call the function each time, or does it calculate the value once and then use it?
For example if you have:
list_1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
list_2 = [x for x in list_1 if x > np.average(list_1)]
Will Python actually calculate np.average(list_1) len(list_1) times? So would it be more efficient to write
list_1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
np_avg = np.average(list_1)
list_2 = [x for x in list_1 if x > np_avg]
instead? Or does Python already "know" to just calculate the average beforehand?
Python has to call the function each time. It cannot optimize that part, because successive calls of the function might return different results (for example because of side effects). There is no easy way for Python’s compiler to be sure that this can’t happen.
Therefore, if you (the programmer) know that the result will always be the same – like in this case – it is probably advisable to calculate the result of the function in advance and use it inside the list comprehension.
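To illustrate why the compiler cannot assume the call is pure, here is a minimal sketch (threshold is a made-up function, not anything from the question):

import random

def threshold():
    # A side-effecting / nondeterministic call: each invocation may return a
    # different value, so hoisting it out of the comprehension would change the result.
    return random.randint(0, 10)

list_1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
list_2 = [x for x in list_1 if x > threshold()]  # threshold() is called once per element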
Assuming standard CPython - Short answer: Yes. Your second snippet is more efficient.
A function call in the filter part of a list comprehension will be called for each element.
We can test this quite easily with a trivial example:
def f(value):
    """Allow even values only."""
    print('function called')
    return value % 2 == 0

mylist = [x for x in range(5) if f(x)]
# 'function called' will be printed 5 times
The above is somewhat equivalent to doing:
mylist = []
for x in range(5):
    if f(x):
        mylist.append(x)
Since you're comparing against the same average each time, you can indeed just calculate it beforehand and use the same value as you did in your second code snippet.
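If you want to measure the difference yourself, a rough timeit sketch might look like this (the list size and repetition count are arbitrary, and the numbers will vary by machine):

import timeit
import numpy as np

list_1 = list(range(1000))

# Calls np.average once per element via the filter condition
repeated = timeit.timeit(
    "[x for x in list_1 if x > np.average(list_1)]",
    globals=globals(), number=100)

# Hoists the average out of the comprehension
precomputed = timeit.timeit(
    "avg = np.average(list_1); [x for x in list_1 if x > avg]",
    globals=globals(), number=100)

print(repeated, precomputed)  # expect the second to be much smaller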
I have a list that contains many elements, where each element represents an input file, and I want to dynamically subset it using the values contained in another list. For example, I have some code that dynamically generates lists that I want to use to define the sub-samples, such as
[0, 1, 2, 3]
and
[1, 2, 3, 4]
But I want to use the start and end elements of each of these lists to define a slice range to be applied to another list. In other words, I want the two above lists to be converted into slices that look like this
[0:3]
and [1:4]
But I don't know how to do this, and to be honest I'm not even sure of the correct terminology to use to search for this. I have tried searching Stack Overflow for 'dynamically generate slices from lists' or even 'dynamically generate data slice' (and variants that I can think of along those lines) without any success.
Here are a few more details:
thislist = ['2019/12/26/fjjd', '2019/12/26/defg', '2020/01/09/qpfd', '2020/01/09/tosf', '2020/01/16/zpqr', '2020/01/15/zpqr', '2020/01/15/juwi']
where someIndexSlice is
[0:3]
and is generated from a list that looks like this
[0,1,2,3]
thislist[someIndexSlice] = ['2019/12/26/fjjd', '2019/12/26/defg', '2020/01/09/qpfd', '2020/01/09/tosf']
So my questions are:
How can I accomplish this?
What sort of terminology should I use to describe what I am trying to accomplish?
Thanks
You can use the built-in slice function:
>>> lst = [0, 1, 2, 3]
>>> as_slice = slice(lst[0], lst[-1], lst[1] - lst[0])
>>> as_slice
slice(0, 3, 1) # which is same as [0:3]
And then to check if it works correctly:
>>> test = [1, 5, 3, 7, 8]
>>> test[as_slice]
[1, 5, 3]
>>> test[0:3]
[1, 5, 3]
NOTE:
This implementation assumes your list is evenly spaced (equidistant) and sorted.
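Applied to the list from the question, note that a Python slice's stop is exclusive; if you want the four-element output shown in the question, you would add 1 to the last index (this adjustment is my own, not part of the answer above):

thislist = ['2019/12/26/fjjd', '2019/12/26/defg', '2020/01/09/qpfd', '2020/01/09/tosf', '2020/01/16/zpqr', '2020/01/15/zpqr', '2020/01/15/juwi']
indices = [0, 1, 2, 3]
someIndexSlice = slice(indices[0], indices[-1] + 1)  # slice(0, 4), i.e. [0:4]
thislist[someIndexSlice]
# ['2019/12/26/fjjd', '2019/12/26/defg', '2020/01/09/qpfd', '2020/01/09/tosf']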
In my program I have many lines where I need to both iterate over something and modify it in that same for loop.
However, I know that modifying the thing you're iterating over is bad because it may, and probably will, lead to an undesired result.
So I've been doing something like this:
for el_idx, el in enumerate(theList):
    if theList[el_idx].IsSomething() is True:
        theList[el_idx].SetIt(False)
Is this the best way to do this?
This is a conceptual misunderstanding.
It is dangerous to modify the list itself from within the iteration, because of the way Python translates the loop to lower-level code. This can cause unexpected side effects during the iteration; there's a good example here:
https://unspecified.wordpress.com/2009/02/12/thou-shalt-not-modify-a-list-during-iteration/
But modifying mutable objects stored in the list is acceptable, and common practice.
I suspect you're thinking that because the list is made up of those objects, modifying those objects modifies the list. This is understandable, but it's just not how it's normally thought of. If it helps, consider that the list only really contains references to those objects. When you modify the objects within the loop, you are merely using the list to modify the objects, not modifying the list itself.
What you should not do is add or remove items from the list during the iteration.
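For what it's worth, since the list itself never changes size here, the loop from the question can be written without indexing at all (a sketch reusing the question's IsSomething/SetIt methods):

for el in theList:
    if el.IsSomething():
        el.SetIt(False)  # mutates the object, not the list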
Your problem is a bit unclear to me, but if we are talking about the harm of modifying a list during a for-loop iteration in Python, I can think of two scenarios.
First, you modify some elements in the list that are supposed to be used, with their original values, in a later round of the computation.
For example, you want to write a program with inputs and outputs like these.
Input:
[1, 2, 3, 4]
Expected output:
[1, 3, 6, 10] #[1, 1 + 2, 1 + 2 + 3, 1 + 2 + 3 + 4]
But... you write the code this way:
#!/usr/bin/env python
mylist = [1, 2, 3, 4]
for idx, n in enumerate(mylist):
    mylist[idx] = sum(mylist[:idx + 1])
print(mylist)
Result is:
[1, 3, 7, 15] # undesired result
Second, you change the size of the list during a for-loop iteration.
For example, from python-delete-all-entries-of-a-value-in-list:
>>> s=[1,4,1,4,1,4,1,1,0,1]
>>> for i in s:
... if i ==1: s.remove(i)
...
>>> s
[4, 4, 4, 0, 1]
The example shows the undesired result that arises from the side effect of changing the list's size. It demonstrates that a for loop in Python cannot properly handle a list whose size changes while it is being iterated. Below is a simple way to overcome this problem:
#!/usr/bin/env python
s = [1, 4, 1, 4, 1, 4, 1, 1, 0, 1]
list_size = len(s)
i = 0
while i != list_size:
    if s[i] == 1:
        del s[i]
        list_size = len(s)
    else:
        i = i + 1
print(s)
Result:
[4, 4, 4, 0]
Conclusion: it's not harmful to modify elements of a list during a loop iteration, as long as you don't 1) change the size of the list or 2) introduce your own computational side effects.
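As a side note (this is my own sketch, not part of the answer above), a common way to avoid resizing a list while iterating over it is to build a new, filtered list instead:

s = [1, 4, 1, 4, 1, 4, 1, 1, 0, 1]
s = [x for x in s if x != 1]  # no mutation during iteration
print(s)
# [4, 4, 4, 0]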
You could collect the indices first:
idx = [ el_idx for el_idx, el in enumerate(theList) if el.IsSomething() ]
[ theList[i].SetIt(False) for i in idx ]
Is there a more Pythonic way to tell the list which parts of it have to stay and which parts have to be removed?
li = [1,2,3,4,5,6,7]
Wanted list:
[1,2,3,6,7]
I can do it this way:
wl = li[:-4]+li[-2:]
I'm looking for something like li[:-4,-2:] (in one statement/command)
Of course I can use remove, but this kind of thing comes up in many situations, for example:
Wanted list:
[3,4,5,6,7]
I can do del li[0:2]
But it's more common to do:
li[2:]
Unlike regular Python lists, NumPy arrays can be indexed by other sequence-like objects (other than tuples), e.g. by regular Python lists or by another NumPy array.
import numpy as np
li = np.arange(1, 8)
# array([1, 2, 3, 4, 5, 6, 7])
wl = li[[0,1,2,5,6]]
# array([1, 2, 3, 6, 7])
Of course, this leaves you with the problem of creating the index sequence (the regular Python list [0,1,2,5,6] in this example), which puts you back at square one. (Unless you need to access several NumPy arrays at the same indices, in which case you create this index list once and then re-use it.)
You should probably only consider this if you have additional reasons to use NumPy in general or specifically NumPy arrays.
If you want the output list to follow a certain logic, you can use the filter function.
filter(lambda x: x > 2, li)
or maybe
filter(lambda x: x < 4 or x > 5, li)
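For example, wrapping the calls in list() (needed on Python 3, where filter returns an iterator):

li = [1, 2, 3, 4, 5, 6, 7]
list(filter(lambda x: x > 2, li))           # [3, 4, 5, 6, 7]
list(filter(lambda x: x < 4 or x > 5, li))  # [1, 2, 3, 6, 7]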
Say I have an array with a couple hundred elements. I need to iterate over the array and replace one or more items with some other item. Which strategy is more efficient in Python in terms of speed (I'm not worried about memory)?
For example: I have an array
my_array = [1,2,3,4,5,6]
I want to replace the first 3 elements with one element with the value 123.
Option 1 (inline):
my_array = [1,2,3,4,5,6]
my_array.remove(0,3)
my_array.insert(0,123)
Option2 (new array creation):
my_array = [1,2,3,4,5,6]
my_array = my_array[3:]
my_array.insert(0,123)
Both of the above options will give a result of:
>>> [123,4,5,6]
Any comments would be appreciated, especially if there are options I have missed.
If you want to replace an item or a set of items in a list, you should never use your first option. Removing and adding to a list in the middle is slow (reference). Your second option is also fairly inefficient, since you're doing two operations for a single replacement.
Instead, just do slice assignment, as eiben's answer instructs. This will be significantly faster and more efficient than either of your methods:
>>> my_array = [1,2,3,4,5,6]
>>> my_array[:3] = [123]
>>> my_array
[123, 4, 5, 6]
arr[0] = x
replaces the 0th element with x. You can also replace whole slices.
>>> arr = [1, 2, 3, 4, 5, 6]
>>> arr[0:3] = [8, 9, 99]
>>> arr
[8, 9, 99, 4, 5, 6]
>>>
And generally it's unclear what you're trying to achieve. Please provide more information or an example.
OK, as for your update. The remove method doesn't work (remove needs one argument). But the slicing I presented works for your case too:
>>> arr
[8, 9, 99, 4, 5, 6]
>>> arr[0:3] = [4]
>>> arr
[4, 4, 5, 6]
I would guess it's the fastest method, but do try it with timeit. According to my tests it's twice as fast as your "new array" approach.
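A rough timeit sketch to check this yourself (the repetition count is arbitrary and the exact numbers will differ per machine):

import timeit

slice_assign = timeit.timeit(
    "a = [1, 2, 3, 4, 5, 6]; a[0:3] = [123]", number=100_000)
new_list = timeit.timeit(
    "a = [1, 2, 3, 4, 5, 6]; a = a[3:]; a.insert(0, 123)", number=100_000)

print(slice_assign, new_list)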
If you're looking for speed efficiency and are manipulating series of integers, you should use the standard array module instead:
>>> import array
>>> my_array = array.array('i', [1,2,3,4,5,6])
>>> my_array = my_array[3:]
>>> my_array.insert(0,123)
>>> my_array
array('i', [123, 4, 5, 6])
The key thing is to avoid moving large numbers of list items more than you absolutely have to. Slice assignment, as far as I'm aware, still involves moving the items around the slice, which is bad news.
How do you recognise when you have a sequence of items which need to be replaced? I'll assume you have a function like:
def replacement(objects, startIndex):
    """Returns a pair (numberOfObjectsToReplace, replacementObject), or None if there should be no replacement."""
I'd then do:
def replaceAll(objects):
    src = 0
    dst = 0
    while src < len(objects):
        replacementInfo = replacement(objects, src)
        if replacementInfo is not None:
            numberOfObjectsToReplace, replacementObject = replacementInfo
        else:
            numberOfObjectsToReplace = 1
            replacementObject = objects[src]
        objects[dst] = replacementObject
        src = src + numberOfObjectsToReplace
        dst = dst + 1
    del objects[dst:]
This code still does a few more loads and stores than it absolutely has to, but not many.
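For example, with a hypothetical replacement rule of my own that collapses the run [1, 2, 3] into the single value 123, matching the question's example:

def replacement(objects, startIndex):
    # Made-up rule: replace the three consecutive items [1, 2, 3] with one item, 123
    if objects[startIndex:startIndex + 3] == [1, 2, 3]:
        return (3, 123)
    return None

my_array = [1, 2, 3, 4, 5, 6]
replaceAll(my_array)
print(my_array)
# [123, 4, 5, 6]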