When it comes to lists, we all know and love good old pop, which removes the last item from the list and returns it:
>>> x = list(range(3))
>>> last_element = x.pop()
>>> last_element
2
>>> x
[0, 1]
But suppose I'm using a one-dimensional numpy array to hold my items, because I'm doing a lot of elementwise computations. What then is the most efficient way for me to achieve a pop?
Of course I can do
>>> import numpy as np
>>> x = np.arange(3)
>>> last_element = x[-1]
>>> x = np.delete(x, -1) # Or x = x[:-1]
>>> last_element
2
>>> x
array([0, 1])
And, really, when it comes down to it, this is fine. But is there a one-liner for arrays I'm missing that removes the last item and returns it at the same time?
And I'm not asking for
>>> last_element, x = x[-1], x[:-1]
I'm not counting this as a one-liner, because it's two distinct assignments achieved by two distinct operations. Syntactic sugar is what puts it all on one line. It's a sugary way to do what I've already done above. (Ha, I was sure someone would rush to give this as the answer, and, indeed, someone has. This answer is the equivalent of my asking, "What's a faster way to get to the store than walking?" and someone answering, "Walk, but walk faster." Uh . . . thanks. I already know how to walk.)
There is no such one-liner for numpy (unless you write your own). numpy is meant to work on fixed-size arrays (or arrays whose size changes infrequently). So by that metric a regular old Python list is better for popping.
You are correct that element-wise operations are better with numpy. You're going to have to profile your code, see which performs better, and make a design decision.
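If you do want to "write your own", a minimal sketch could look like the following. The name pop_last is hypothetical and not part of numpy. Because an array cannot shrink in place, the helper returns the last item together with a view of the remaining items, and the caller rebinds the name.

```python
import numpy as np

def pop_last(arr):
    """Return (last_item, remaining) for a non-empty 1-D array.

    Hypothetical helper, not a NumPy API. `remaining` is a view of
    `arr` (no copy is made); the original array is left unchanged.
    """
    if arr.size == 0:
        raise IndexError("pop from empty array")
    return arr[-1], arr[:-1]

x = np.arange(3)
last_element, x = pop_last(x)
print(last_element)  # 2
print(x)             # [0 1]
```

Since the result is a view, this stays cheap for large arrays, but it is still a rebind rather than a true in-place pop.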
I learned from a web search that numpy.arange takes less space than Python's range function, but when I tried the code below, it gave me a different result.
import sys
import numpy as np
x = range(1,10000)
print(sys.getsizeof(x)) # --> Output is 48
a = np.arange(1,10000,1,dtype=np.int8)
print(sys.getsizeof(a)) # --> Output is 10095
Could anyone please explain?
In PY3, range is an object that can generate a sequence of numbers; it is not the actual sequence. You may need to brush up on some basic Python reading, paying attention to things like lists and generators, and their differences.
In [359]: x = range(3)
In [360]: x
Out[360]: range(0, 3)
We have to use something like list or a list comprehension to actually create those numbers:
In [361]: list(x)
Out[361]: [0, 1, 2]
In [362]: [i for i in x]
Out[362]: [0, 1, 2]
A range is often used in a for i in range(3): print(i) kind of loop.
arange is a numpy function that produces a numpy array:
In [363]: arr = np.arange(3)
In [364]: arr
Out[364]: array([0, 1, 2])
We can iterate on such an array, but it is slower than [362]:
In [365]: [i for i in arr]
Out[365]: [0, 1, 2]
But for doing math, the array is much better:
In [366]: arr * 10
Out[366]: array([ 0, 10, 20])
The array can also be created from the list [361] (and for compatibility with earlier Py2 usage from the range itself):
In [376]: np.array(list(x)) # np.array(x)
Out[376]: array([0, 1, 2])
But this is slower than using arange directly (that's an implementation detail).
Despite the similarity in names, these shouldn't be seen as simple alternatives. Use range in basic Python constructs such as for loops and comprehensions. Use arange when you need an array.
An important innovation in Python (compared to earlier languages) is that we can iterate directly on a list. We don't have to step through indices. And if we need indices along with the values, we can use enumerate:
In [378]: alist = ['a','b','c']
In [379]: for i in range(3): print(alist[i]) # index iteration
a
b
c
In [380]: for v in alist: print(v) # iterate on list directly
a
b
c
In [381]: for i,v in enumerate(alist): print(i,v) # index and values
0 a
1 b
2 c
Thus you might not see range used that much in basic Python code.
The range type constructor creates range objects, which represent sequences of integers with a start, stop, and step in a space-efficient manner, calculating the values on the fly.
The np.arange function returns a numpy.ndarray object, which is essentially a wrapper around a primitive array. This is a fast and relatively compact representation compared to a Python list such as list(range(N)). But range objects are more space efficient still: they take constant space, so for all practical purposes range(a) is the same size as range(b) for any integers a and b.
As an aside, take care when interpreting the results of sys.getsizeof; you must understand what it is measuring. Do not naively compare the sizes of Python lists and numpy.ndarray objects, for example.
Perhaps whatever you read was referring to Python 2, where range returned a list. List objects do require more space than numpy.ndarray objects, generally.
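To make the two points above concrete, here is a small standard-library sketch. It checks that a range object's size does not depend on its length, and that sys.getsizeof on a list counts only the list's own storage, not the integer objects it refers to. The variable names are illustrative only, and exact byte counts vary by Python build, so none are shown.

```python
import sys

small = range(10)
large = range(10**9)
# A range stores only start, stop and step, so its size is constant.
same_size = sys.getsizeof(small) == sys.getsizeof(large)
print(same_size)  # True

as_list = list(range(1000))
# getsizeof on a list covers the list object and its pointer array only.
list_shallow = sys.getsizeof(as_list)
# Adding the size of each referenced int gives a rough "deep" estimate.
list_deep = list_shallow + sum(sys.getsizeof(v) for v in as_list)
print(list_deep > list_shallow)  # True
```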
arange stores each individual value of the array, while range stores only three values (start, stop and step). That's why the arange result takes more space than the range.
Since the question is about size, that is the answer.
But numpy arrays and arange have many advantages over Python lists in terms of speed, space and efficiency.
This is such a simple issue that I don't know what I'm doing wrong. Basically I want to iterate through the items in an empty list and increase each one according to some criteria. This is an example of what I'm trying to do:
list1 = []
for i in range(5):
    list1[i] = list1[i] + 2*i
This fails with a list index out of range error and I'm stuck. The expected result (what I'm aiming at) would be a list with values:
[0, 2, 4, 6, 8]
Just to be more clear: I'm not after producing that particular list. The question is about how can I modify items of an empty list in a recursive way. As gnibbler showed below, initializing the list was the answer. Cheers.
Ruby (for example) lets you assign items beyond the end of the list. Python doesn't - you would have to initialise list1 like this
list1 = [0] * 5
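As a quick check, here is the loop from the question run against a pre-sized list. This is only a sketch; the length n is an illustrative variable, not something from the question.

```python
n = 5
list1 = [0] * n                  # n slots, all starting at 0
for i in range(n):
    list1[i] = list1[i] + 2 * i  # reading list1[i] now succeeds
print(list1)  # [0, 2, 4, 6, 8]
```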
Since each value depends only on i, you can do the math on i directly; there is no need to read what is already in the list. So just use a list comprehension:
list1 = [2*i for i in range(5)]
Since you say that your real case is more complex, skip the list comprehension and edit your for loop like this (with list1 initialised to the right length first):
list1 = [0] * 5
for i in range(5):
    x = 2*i
    list1[i] = x
This way you can keep computing until you finally have the outcome you want, store it in a variable, and assign it. You could also start from an empty list and use list1.append(x), which I actually prefer because it works however the list was built and needs no pre-sizing.
Edit: Since you want to be able to manipulate the array like you do, I would suggest using numpy! There is this great thing called vectorize so you can actually apply a function to a 1D array:
import numpy as np
list1 = range(5)
def my_func(x):
    y = x * 2
    return y  # without a return, vfunc would produce an array of None
vfunc = np.vectorize(my_func)
vfunc(list1)
# array([0, 2, 4, 6, 8])
I would advise only using this for more complex functions, because you can use numpy broadcasting for easy things like multiplying by two.
Your list is empty, so when you try to read an element of the list (right hand side of this line)
list1[i] = list1[i] + 2*i
it doesn't exist, so you get the error message.
You may also wish to consider using numpy. The multiplication operation is overloaded to be performed on each element of the array. Depending on the size of your list and the operations you plan to perform on it, using numpy very well may be the most efficient approach.
Example:
>>> import numpy
>>> 2 * numpy.arange(5)
array([0, 2, 4, 6, 8])
I would instead write
for i in range(5):
    list1.append(2*i)
Yet another way to do this is to use the append method on your list. The reason you're getting an out of range error is because you're saying:
list1 = []
list1.__getitem__(0)
and then trying to manipulate that item, but the item does not exist because you made an empty list.
Proof of concept:
list1 = []
list1[1]
IndexError: list index out of range
We can, however, append new stuff to this list like so:
list1 = []
for i in range(5):
    list1.append(i * 2)
Is there a better way to remove the last N elements of a list?
for i in range(0,n):
    lst.pop()
Works for n >= 1
>>> L = [1,2,3, 4, 5]
>>> n=2
>>> del L[-n:]
>>> L
[1, 2, 3]
if you wish to remove the last n elements, in other words, keep first len - n elements:
lst = lst[:len(lst)-n]
Note: This is not an in-place operation. It creates a shallow copy.
As Vincenzooo correctly says, the pythonic lst[:-n] does not work when n==0.
The following works for all n>=0:
lst = lst[:-n or None]
I like this solution because it is kind of readable in English too: "return a slice omitting the last n elements or none (if none needs to be omitted)".
This solution works because of the following:
x or y evaluates to x when x is logically true (e.g., when it is not 0, "", False, None, ...) and to y otherwise. So -n or None is -n when n!=0 and None when n==0.
When slicing, None is equivalent to omitting the value, so lst[:None] is the same as lst[:] (see here).
As noted by @swK, this solution creates a new list (but immediately discards the old one unless it's referenced elsewhere) rather than editing the original one. This is often not a problem in terms of performance, as creating a new list in one go is often faster than removing one element at a time (unless n<<len(lst)). It is also often not a problem in terms of space, as usually the members of the list take more space than the list itself (unless it's a list of small objects like bytes or the list has many duplicated entries). Please also note that this solution is not exactly equivalent to the OP's: if the original list is referenced by other variables, this solution will not modify (shorten) the list those variables refer to, unlike the OP's code.
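The difference in how other references behave can be seen in a short sketch. The names other, lst2 and other2 are illustrative only.

```python
n = 2

# Slice-and-rebind: the other reference keeps the full list.
lst = [1, 2, 3, 4, 5]
other = lst
lst = lst[:-n or None]
print(lst, other)    # [1, 2, 3] [1, 2, 3, 4, 5]

# Repeated pop (the question's approach): both names see the change.
lst2 = [1, 2, 3, 4, 5]
other2 = lst2
for _ in range(n):
    lst2.pop()
print(lst2, other2)  # [1, 2, 3] [1, 2, 3]
```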
A possible solution (in the same style as my original one) that works for n>=0 but: a) does not create a copy of the list; and b) also affects other references to the same list, could be the following:
lst[-n:n and None] = []
This is definitely not readable and should not be used. Actually, even my original solution requires too much understanding of the language to be quickly read and unambiguously understood by everyone. I wouldn't use either in any real code and I think the best solution is that by @wonder.mice: a[len(a)-n:] = [].
Just try to del it like this.
del list[-n:]
I see this was asked a long ago, but none of the answers did it for me; what if we want to get a list without the last N elements, but keep the original one: you just do list[:-n]. If you need to handle cases where n may equal 0, you do list[:-n or None].
>>> a = [1,2,3,4,5,6,7]
>>> b = a[:-4]
>>> b
[1, 2, 3]
>>> a
[1, 2, 3, 4, 5, 6, 7]
As simple as that.
Should be using this:
a[len(a)-n:] = []
or this:
del a[len(a)-n:]
It's much faster, since it really removes items from the existing list. The alternative (a = a[:len(a)-1]) creates a new list object and is less efficient.
>>> import timeit
>>> timeit.timeit("a = a[:len(a)-1]\na.append(1)", setup="a=list(range(100))", number=10000000)
6.833014965057373
>>> timeit.timeit("a[len(a)-1:] = []\na.append(1)", setup="a=list(range(100))", number=10000000)
2.0737061500549316
>>> timeit.timeit("a[-1:] = []\na.append(1)", setup="a=list(range(100))", number=10000000)
1.507638931274414
>>> timeit.timeit("del a[-1:]\na.append(1)", setup="a=list(range(100))", number=10000000)
1.2029790878295898
If 0 < n you can use a[-n:] = [] or del a[-n:] which is even faster.
This is one of the cases in which being pythonic doesn't work for me and can give hidden bugs or mess.
None of the solutions above works for the case n=0.
Using l[:len(l)-n] works in the general case:
l = list(range(4))
for n in [2,1,0]: # test values for the number of points to cut
    print(n, l[:len(l)-n])
This is useful for example inside a function to trim edges of a vector, where you want to leave the possibility not to cut anything.
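The same len-based idea also works for in-place removal. Below is a minimal sketch; remove_last is a hypothetical helper name. It is followed by the n=0 pitfall with a negative slice start: because -0 is just 0, del a[-0:] deletes everything.

```python
def remove_last(lst, n):
    """Remove the last n items of lst in place (hypothetical helper)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # max(..., 0) keeps the start index non-negative when n > len(lst);
    # for n == 0 the slice is empty, so nothing is deleted.
    del lst[max(len(lst) - n, 0):]

values = list(range(4))
remove_last(values, 0)
print(values)   # [0, 1, 2, 3]
remove_last(values, 2)
print(values)   # [0, 1]

# The pitfall: -0 == 0, so this slice covers the whole list.
cleared = list(range(4))
del cleared[-0:]
print(cleared)  # []
```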
What is the easiest and cleanest way to get the first AND the last elements of a sequence? E.g., I have a sequence [1, 2, 3, 4, 5], and I'd like to get [1, 5] via some kind of slicing magic. What I have come up with so far is:
l = len(s)
result = s[0:l:l-1]
I actually need this for a bit more complex task. I have a 3D numpy array, which is cubic (i.e. is of size NxNxN, where N may vary). I'd like an easy and fast way to get a 2x2x2 array containing the values from the vertices of the source array. The example above is an oversimplified, 1D version of my task.
Use this:
result = [s[0], s[-1]]
Since you're using a numpy array, you may want to use fancy indexing:
a = np.arange(27)
indices = [0, -1]
b = a[indices] # array([0, 26])
For the 3d case:
vertices = [(0,0,0),(0,0,-1),(0,-1,0),(0,-1,-1),(-1,-1,-1),(-1,-1,0),(-1,0,0),(-1,0,-1)]
indices = tuple(zip(*vertices)) # One index sequence per axis (must be a tuple for numpy). Can store this for later use.
a = np.arange(27).reshape((3,3,3)) #dummy array for testing. Can be any shape size :)
vertex_values = a[indices].reshape((2,2,2))
I first write down all the vertices (although I am willing to bet there is a clever way to do it using itertools which would let you scale this up to N dimensions ...). The order you specify the vertices is the order they will be in the output array. Then I "transpose" the list of vertices (using zip) so that all the x indices are together and all the y indices are together, etc. (that's how numpy likes it). At this point, you can save that index array and use it to index your array whenever you want the corners of your box. You can easily reshape the result into a 2x2x2 array (although the order I have it is probably not the order you want).
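One possible itertools version of the vertex list is sketched below. It uses itertools.product to generate every combination of first (0) and last (-1) index, so it should scale to any number of dimensions. The corner order follows product's ordering, which may differ from the hand-written list above.

```python
from itertools import product
import numpy as np

a = np.arange(27).reshape((3, 3, 3))  # dummy array for testing

# Every combination of first (0) and last (-1) index along each axis.
vertices = list(product((0, -1), repeat=a.ndim))
# Transpose into one index sequence per axis; a tuple is needed so that
# NumPy treats each entry as the index for a separate axis.
indices = tuple(zip(*vertices))
vertex_values = a[indices].reshape((2,) * a.ndim)
print(vertex_values.tolist())
# [[[0, 2], [6, 8]], [[18, 20], [24, 26]]]
```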
This would give you a list of the first and last element in your sequence:
result = [s[0], s[-1]]
Alternatively, this would give you a tuple
result = s[0], s[-1]
With the particular case of a (N,N,N) ndarray X that you mention, would the following work for you?
s = slice(0,N,N-1)
X[s,s,s]
Example
>>> N = 3
>>> X = np.arange(N*N*N).reshape(N,N,N)
>>> s = slice(0,N,N-1)
>>> print(X[s,s,s])
[[[ 0  2]
  [ 6  8]]

 [[18 20]
  [24 26]]]
>>> from operator import itemgetter
>>> first_and_last = itemgetter(0, -1)
>>> first_and_last([1, 2, 3, 4, 5])
(1, 5)
Why do you want to use a slice? Getting each element with
result = [s[0], s[-1]]
is better and more readable.
If you really need to use the slice, then your solution is the simplest working one that I can think of.
This also works for the 3D case you've mentioned.
I'd like to achieve the following effect:
a=[11, -1, -1, -1]
msg=['one','two','tree','four']
msg[where a<0]
['two','tree','four']
in a similarly simple fashion (without nasty loops).
PS. For the curious: this kind of conditional indexing works natively in one of the functional languages.
//EDIT
I know that the text below differs from the requirements above, but I've found what I wanted to achieve.
I don't want to spam another answer in my own thread, so here is the nice solution I found:
filter(lambda x: not x.endswith('one'),msg)
You can use list comprehensions for this. You need to match the items from the two lists, for which the zip function is used. This will generate a list of tuples, where each tuple contains one item from each of the original lists (i.e., [(11, 'one'), ...]). Once you have this, you can iterate over the result, check if the first element is below 0 and return the second element. See the linked Python docs for more details about the syntax.
[y for (x, y) in zip(a, msg) if x < 0]
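If you would rather avoid writing the pairing yourself, the standard library's itertools.compress does the same selection from a sequence of true/false flags. This is offered as an alternative sketch, not as what the comprehension above does internally; the name selected is illustrative.

```python
from itertools import compress

a = [11, -1, -1, -1]
msg = ['one', 'two', 'tree', 'four']

# Keep each item of msg whose matching entry in a is negative.
selected = list(compress(msg, (x < 0 for x in a)))
print(selected)  # ['two', 'tree', 'four']
```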
The actual problem seems to be about finding items in the msg list that don't contain the string "one". This can be done directly:
[m for m in msg if "one" not in m]
[m for m, i in zip(msg, a) if i < 0]
The answers already posted are good, but if you want an alternative you can look at numpy and at its arrays.
>>> import numpy as np
>>> a = np.array([11, -1, -1, -1])
>>> msg = np.array(['one','two','tree','four'])
>>> a < 0
array([False,  True,  True,  True])
>>> msg[a < 0]
array(['two', 'tree', 'four'], dtype='<U4')
I don't know how array indexing is implemented in numpy, but it is usually fast and probably written in C. Compared to the other solutions, this should be more readable, but it requires numpy.
I think [msg[i] for i in range(len(a)) if a[i]<0] will work for you.