I want to set specific elements of my array to a specific value with low overhead.
For example, if I have a = numpy.array([1,2,3,0,4,0]), I want to change every 0 value to 10, so that in the end I have [1, 2, 3, 10, 4, 10].
In Matlab you can do this easily with a(a==0) = 10; is there an equivalent in numpy?
Remarkably similar to Matlab:
>>> a[a == 0] = 10
>>> a
array([ 1, 2, 3, 10, 4, 10])
There's a really nice "NumPy for Matlab Users" guide at the SciPy website.
I should note, this doesn't work on regular Python lists. NumPy arrays are a different datatype that work a lot more like a Matlab matrix than a Python list in terms of access and math operators.
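To illustrate the difference (a quick sketch, not from the original answer): on a plain list the masked assignment doesn't even raise an error, it silently does the wrong thing, because the comparison collapses to a single False, which indexes as 0.
>>> a = [1, 2, 3, 0, 4, 0]
>>> a == 0                   # comparing a whole list to an int is just False
False
>>> a[a == 0] = 10           # False indexes as 0, so this sets a[0]!
>>> a
[10, 2, 3, 0, 4, 0]
>>> import numpy
>>> numpy.array([1, 2, 3, 0, 4, 0]) == 0   # elementwise on an ndarray
array([False, False, False,  True, False,  True])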
A little more pythonic way would be like this, I guess:
import numpy
a = numpy.array([1,2,3,0,4,0])
for k, v in enumerate(a):
    if v == 0:
        a[k] = 10
print a
An even more Pythonic way (provided by @mtrw):
[10 if k == 0 else k for k in a]
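For reference, on the example array this yields a plain Python list rather than an ndarray; wrap it in numpy.array() if you need an array back:
>>> [10 if k == 0 else k for k in a]
[1, 2, 3, 10, 4, 10]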
Related
I have a numpy array, whose elements are unique, for example:
b = np.array([5, 4, 6, 8, 1, 2])
(Edit 2: b can contain large numbers and floats; the above example is kept simple.)
I receive numbers that are elements of b.
I want to find their index in b, meaning I want a reverse mapping, from value to index, in b.
I could do
for number in input:
    ind = np.where(number == b)
which would iterate over the entire array every call to where.
I could also create a dictionary,
d = {}
for i, element in enumerate(list(b)):
    d[element] = i
I could create this dictionary at "preprocessing" time, but still I would be left with a strange looking dictionary, in a mostly numpy code, which seems (to me) not how numpy is meant to be used.
How can I do this reverse mapping in numpy?
Desired usage (O(1) time and memory per lookup):
print("index of 8 is: ", foo(b, 8))
Edit 1: not a duplicate of this
Using in1d as explained here doesn't solve my problem. Using their example:
b = np.array([1, 2, 3, 10, 4])
I want to be able to find for example 10's index in b, at runtime, in O(1).
Doing a pre-processing step
mapping = np.in1d(b, b).nonzero()[0]
# array([0, 1, 2, 3, 4])
(which could equally be accomplished using np.arange(len(b)))
doesn't really help, because when 10 comes in as input, it is not possible to tell its index in O(1) time with this method.
It's simpler than you think, by exploiting numpy's advanced indexing.
What we do is make a target array and assign to it using b as an index; the values we assign are the indices themselves, generated with arange.
>>> t = np.zeros((np.max(b) + 1,))
>>> t[b] = np.arange(0, b.size)
>>> t
array([0., 4., 5., 0., 1., 0., 2., 0., 3.])
You might use nans or -1 instead of zeros to construct the target to help detect invalid lookups.
Performance: lookups are handled entirely by numpy, so they are fast, but note that the table uses O(max(b)) memory, which can be wasteful if b contains large values.
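Lookups then reduce to ordinary (fancy) indexing. Continuing the example above:
>>> t[8]                 # index of the value 8 in b
3.0
>>> t[np.array([5, 2])]  # several values can be looked up at once
array([0., 5.])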
If you can tolerate collisions, you can implement a poor man's hash table. Suppose the values are currency amounts (i.e. have two decimal places), for example:
h = np.int32(b * 100.0) % 101 # Typically some prime number
t = np.zeros((101,))
t[h] = np.arange(0, h.size)
# Retrieving a value v; keep in mind v can be an ndarray itself.
t[np.int32(v * 100.0) % 101]
You can do any other steps to munge the address if you know what your dataset looks like.
This is about the limit of what's useful to do with numpy.
Solution
If you want constant time (i.e. O(1)), then you'll need to precompute a lookup table of some sort. If you want to build that lookup table as another Numpy array, it will effectively have to be a sparse array, in which most values are "empty". Here's a workable approach in which empty values are marked as -1:
b = np.array([5, 4, 6, 8, 1, 2])
_b_ix = np.array([-1]*(b.max() + 1))
_b_ix[b] = np.arange(b.size)
# _b_ix: array([-1, 4, 5, -1, 1, 0, 2, -1, 3])
def foo(*val):
    return _b_ix[list(val)]
Test:
print("index of 8 is: %s" % foo(8))
print("index of 0,5,1,8 is: %s" % foo(0,5,1,8))
Output:
index of 8 is: [3]
index of 0,5,1,8 is: [-1 0 4 3]
Caveat
In production code, you should definitely use a dictionary to solve this problem, as other answerers have pointed out. Why? Well, for one thing, say that your array b contains float values, or any non-int value. Then a Numpy-based lookup table won't work at all.
Thus, you should use the above answer only if you have a deep-seated philosophical opposition to using a dictionary (e.g. a dict ran over your pet cat).
Here's a nice way to generate a reverse lookup dict:
ix = {k:v for v,k in enumerate(b.flat)}
You can use dict, zip and numpy.arange to create your reverse lookup:
import numpy as np
b = np.array([5, 4, 6, 8, 1, 2])
d = dict(zip(b, np.arange(0,len(b))))
print(d)
gives:
{5: 0, 4: 1, 6: 2, 8: 3, 1: 4, 2: 5}
If you want to do multiple lookups, you can do these in O(1) after an initial O(n) traversal to create a lookup dictionary.
b = np.array([5, 4, 6, 8, 1, 2])
lookup_dict = {e:i for i,e in enumerate(b)}
def foo(element):
    return lookup_dict[element]
And this works for your test:
>>> print('index of 8 is:', foo(8))
index of 8 is: 3
Note that if there is a possibility that b may have changed since the last foo() call, we must re-create the dictionary.
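If that is a real concern, here is a hedged sketch (my own addition, not from the original answer) that fingerprints b's contents and rebuilds the dictionary only when they change:
_cache_key = None
_lookup = {}

def foo(element):
    global _cache_key, _lookup
    key = b.tobytes()  # cheap fingerprint of b's current contents
    if key != _cache_key:  # rebuild the lookup only when b has changed
        _lookup = {e: i for i, e in enumerate(b)}
        _cache_key = key
    return _lookup[element]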
I have two arrays: A describes the start positions of 'blocks' of data, and B describes absolute positions of things of interest in the non-blocked, raw data.
I want to generate, for each position in B, the index into A of the block that contains it.
e.g.
import numpy as np
A = np.array([0,10,13,25,27,33,100])
B = np.array([3, 3, 5, 21, 27, 32, 74])
I want to return an array that looks like:
array([0, 0, 0, 2, 4, 4, 5])
That is, the array that describes the index-position, in terms of A, of the elements in B.
I could write a loop, something like:
list_holder = []
for e in B:
    list_holder.append(np.where(A > e)[0][0] - 1)
np.array(list_holder)
But it turns out that for large arrays this becomes rather slow. Are there any functional or numpy tricks that will perform this relatively simple operation as a one-liner?
Your solution is O(N²) overall. But if both arrays are sorted, you can do this in O(N) simply by iterating over both lists in one pass, like so. I'm not a Python guy, so if this code isn't "pythonic", that's why.
def digitize_sorted(a, b):
    j = 0
    c = np.zeros(len(b))
    for i in range(len(b)):
        while j < len(a) and a[j] <= b[i]:
            j += 1
        c[i] = j - 1
    return c
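Note that this assumes both a and b are sorted (b is swept left to right while j only ever advances). On the example arrays it returns the expected indices, as a float array:
>>> digitize_sorted(A, B)
array([0., 0., 0., 2., 4., 4., 5.])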
try searchsorted():
A = np.array([0,10,13,25,27,33,100])
B = np.array([3, 3, 5, 21, 27, 32, 74])
np.searchsorted(A, B, side="right") - 1
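On the example arrays this one-liner produces exactly the desired output:
>>> np.searchsorted(A, B, side="right") - 1
array([0, 0, 0, 2, 4, 4, 5])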
I've been writing a program to brute-force search a sequence of numbers for Euler bricks, but the method I came up with involves a triple loop. Since nested Python loops are notoriously slow, I was wondering whether there was a better way to create the array of values I need using numpy.
# x = max side length of brick. User input.
for t in range(3, x):
    a = []; b = []; c = []
    for u in range(2, t):
        for v in range(1, u):
            a.append(t)
            b.append(u)
            c.append(v)
    a = np.array(a)
    b = np.array(b)
    c = np.array(c)
    ...
Is there a better way to generate the array of values using numpy commands?
Thanks.
Example:
If x=10, when t=3 I want to get:
a=[3]
b=[2]
c=[1]
the first time through the loop. After that, when t=4:
a=[4, 4, 4]
b=[2, 3, 3]
c=[1, 1, 2]
The third time (t=5) I want:
a=[5, 5, 5, 5, 5, 5]
b=[2, 3, 3, 4, 4, 4]
c=[1, 1, 2, 1, 2, 3]
and so on, up to max side lengths around 5000 or so.
EDIT: Solution
from numpy import array, arange, empty, hstack

a = array(3)
b = array(2)
c = array(1)
for i in range(4, x):  # Removing the (3,2,1) check from the code does not affect results.
    foo = arange(1, i - 1)
    foo2 = empty(len(foo))
    foo2.fill(i - 1)
    c = hstack((c, foo))
    b = hstack((b, foo2))
    a = empty(len(b))
    a.fill(i)
    ...
Works many times faster now. Thanks all.
Try using numpy.empty and ndarray.fill (http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.fill.html).
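A minimal sketch of that pattern:
import numpy as np
arr = np.empty(6)  # allocate an uninitialized array of the right length
arr.fill(5)        # then fill every slot with the same value
# arr is now array([5., 5., 5., 5., 5., 5.])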
There are a couple of things which could help, but probably only for large values of x. For starters, use xrange instead of range; that saves creating a list you never need. You could also create empty numpy arrays of the correct length and fill them with values as you go, instead of appending to a list and then converting it into a numpy array.
I believe this code will work (no python access right this second):
for t in xrange(3, x):
    size = (t - 1) * (t - 2) // 2  # number of (u, v) pairs with 2 <= u < t, 1 <= v < u
    a = np.zeros(size)
    b = np.zeros(size)
    c = np.zeros(size)
    idx = 0
    for u in xrange(2, t):
        for v in xrange(1, u):
            a[idx] = t
            b[idx] = u
            c[idx] = v
            idx += 1
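Going further, a fully vectorized sketch (my own suggestion, not from the answers above): for a fixed t, the (u, v) pairs with 1 <= v < u < t are exactly the strictly-lower-triangular index pairs of a (t-1) x (t-1) matrix, shifted up by one, so np.tril_indices can generate them without any Python loop:
import numpy as np

def triples_for(t):
    i, j = np.tril_indices(t - 1, k=-1)  # all pairs with 0 <= j < i <= t-2
    u, v = i + 1, j + 1                  # shift into 2 <= u < t, 1 <= v < u
    a = np.full(u.size, t)
    return a, u, v

# triples_for(4) returns ([4 4 4], [2 3 3], [1 1 2]), matching the example.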
I have been searching for this for a while. Basically, I am trying to conditionally increment the elements of one list by another list, element-wise...
My code follows, but is there a better way to do it? A list comprehension, map?
I think an element-wise operator like ~+= from http://www.python.org/dev/peps/pep-0225/ would be really good, but why was it deferred?
for i in range(1, len(s)):
    if s[i] < s[0]:
        s[i] += p[i]
Based on some good feedback from you guys, I have recoded it to the following:
i = s < s[0]
s[i] += p[i]
where s and p are both numpy arrays.
P.S. It's still about 5 times slower than Matlab for one of my programs.
Here is a quick version:
# sample data
s = [10, 5, 20]
p = [2,2,2]
# As a one-liner (Python 2: tuple parameter unpacking in the lambda; you could factor out the lambda).
s = map(lambda (si, pi): si + pi if si < s[0] else si, zip(s, p))
# s is now [10, 7, 20]
This assumes that len(s) <= len(p)
Hope this helps. Let me know. Good luck. :-)
If you don't want to create a new array, then your options are:
What you proposed (though you might want to use xrange depending on the python version)
Use Numpy arrays for s and p. Then you can do something like s[s < s[0]] += p[s < s[0]] if s and p are the same length (see the sketch after this list).
Use Cython to speed up what you've proposed.
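For the second option, a minimal runnable sketch (computing the mask once instead of twice):
import numpy as np

s = np.array([10, 5, 20])
p = np.array([2, 2, 2])
mask = s < s[0]     # boolean mask of elements to increment
s[mask] += p[mask]
# s is now array([10, 7, 20])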
Check this SO question:
Merging/adding lists in Python
Basically, something like this (note that it keeps only the sums of pairs that satisfy the condition):
[sum(a) for a in zip(s, p) if a[0] < 0]
Example:
>>> [sum(a) for a in zip(*[[1, 2, 3], [10, 20, 30]]) if a[0] > 2]
[33]
To clarify, here's what zip does:
>>> zip(*[[1, 2, 3], [4, 5, 6]])
[(1, 4), (2, 5), (3, 6)]
It pairs up the elements of two (or more) lists into a list of tuples. You can then test conditions on the elements of each tuple.
s = [s[i] + p[i]*(s[i] < s[0]) for i in range(1, len(s))]  # note: this rebuilds s without its first element
What is the best way to access two consecutive values in a numpy array?
example:
npdata = np.array([13,15,20,25])
for i in range(len(npdata)):
    print npdata[i] - npdata[i+1]
This looks really messy, and it additionally needs exception handling for the last iteration of the loop (npdata[i+1] runs past the end of the array).
any ideas?
Thanks!
numpy provides a function, diff, for this basic use case:
>>> import numpy
>>> x = numpy.array([1, 2, 4, 7, 0])
>>> numpy.diff(x)
array([ 1, 2, 3, -7])
Your snippet computes something closer to -numpy.diff(x).
How about range(len(npdata) - 1) ?
Here's code (using a simple array, but it doesn't matter):
>>> ar = [1, 2, 3, 4, 5]
>>> for i in range(len(ar) - 1):
...     print ar[i] + ar[i + 1]
...
3
5
7
9
As you can see, it successfully prints the sums of all consecutive pairs in the array, without any exception handling for the last iteration.
You can use ediff1d to get differences of consecutive elements. More generally, a[1:] - a[:-1] will give the differences of consecutive elements and can be used with other operators as well.
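A quick illustration of both on the question's data (the slicing form reproduces the question's npdata[i] - npdata[i+1] directly):
>>> import numpy as np
>>> x = np.array([13, 15, 20, 25])
>>> np.ediff1d(x)
array([2, 5, 5])
>>> x[:-1] - x[1:]
array([-2, -5, -5])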