Convert a binary string into signed integer - Python - python

I have a binary matrix that I created with NumPy. The matrix has 6 rows and 8 columns.
array([[1, 0, 1, 1, 1, 0, 1, 1],
[1, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 1, 0, 0, 1, 1, 1],
[1, 0, 1, 1, 0, 1, 1, 0],
[0, 1, 0, 0, 1, 0, 1, 1],
[0, 1, 0, 1, 1, 1, 0, 0]])
The first column is the sign bit of each number.
Example:
1, 0, 1, 1, 1, 0, 1, 1 -> 1 0111011 -> -59
When I use int(str, base=2), I get 187, but the value should be -59.
>>> int(''.join(map(str, array[0])), 2)
187
How can I convert the string into the signed integer?

Python doesn't know that the first bit is supposed to represent the sign (compare with bin(-59)), so you have to handle that yourself. For example, if A contains the array:
num = int(''.join(map(str, A[0,1:])), 2)
if A[0,0]:
    num *= -1
Here's a more Numpy-ish way to do it, for the whole array at once:
num = np.packbits(A).astype(np.int8)
num[num<0] = -128 - num[num<0]
Finally, a code-golf version:
(A[:,:0:-1]<<range(7)).sum(1)*(1-2*A[:,0])
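The same sign-and-magnitude decoding can also be spelled out with explicit place values, which may be easier to read than the golfed expression. This is only a sketch: the helper name sign_magnitude_to_int is made up for illustration, and it assumes every row holds one sign bit followed by the magnitude bits, most significant first.

```python
import numpy as np

def sign_magnitude_to_int(bits):
    """Decode rows of [sign, magnitude bits...] (hypothetical helper name)."""
    bits = np.asarray(bits)
    n_value_bits = bits.shape[1] - 1
    # Place values for the magnitude bits, most significant first: 64, 32, ..., 1
    weights = 1 << np.arange(n_value_bits - 1, -1, -1)
    magnitude = bits[:, 1:] @ weights
    # A sign bit of 1 negates the magnitude; 0 leaves it unchanged
    return np.where(bits[:, 0] == 1, -magnitude, magnitude)

A = np.array([[1, 0, 1, 1, 1, 0, 1, 1],
              [1, 1, 1, 1, 1, 1, 0, 0],
              [0, 0, 1, 0, 0, 1, 1, 1],
              [1, 0, 1, 1, 0, 1, 1, 0],
              [0, 1, 0, 0, 1, 0, 1, 1],
              [0, 1, 0, 1, 1, 1, 0, 0]])

print(sign_magnitude_to_int(A).tolist())
# [-59, -124, 39, -54, 75, 92]
```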

You could split each row into a sign and a value variable. Then, if the sign bit is set, multiply the value by -1.
row = array[0]
sign, value = row[0], row[1:]
int(''.join(map(str, value)), 2) if sign == 0 else int(''.join(map(str, value)), 2) * -1

First of all, it looks like a NumPy array rather than a NumPy matrix.
There are a couple of options I can think of. A pretty straightforward way looks like this:
def rowToSignedDec(arr, row):
    # The magnitude comes from every bit after the sign bit
    res = int(''.join(str(x) for x in arr[row][1:].tolist()), 2)
    if arr[row][0] == 1:
        return -res
    else:
        return res
print(rowToSignedDec(arr, 0))
-59
That is clearly not the most efficient approach. Here is a one-liner, though not the shortest possible:
int(''.join(str(x) for x in arr[0][1:].tolist()),2) - 2*int(arr[0][0])*int(''.join(str(x) for x in arr[0][1:].tolist()),2)
Where arr is the above-mentioned array.

Related

Python - Replacing Values Leading Up To 1s in an Array

Pretend I have a pandas Series that consists of 0s and 1s, but this can work with numpy arrays or any iterable. I would like to create a function that takes an array and an input n and returns a new series that contains 1s at the n indices leading up to every 1 in the original series. Here is an example:
array = np.array([0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1])
> preceding_indices_function(array, 2)
np.array([0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1])
For each time there is a 1 in the input array, the two indices preceding it are filled in with 1 regardless of whether there is a 0 or 1 in that index in the original array.
I would really appreciate some help on this. Thanks!
Use a convolution with np.convolve:
N = 2
# craft a custom kernel
kernel = np.ones(2*N+1)
kernel[-N:] = 0
# array([1, 1, 1, 0, 0])
out = (np.convolve(array, kernel, mode='same') != 0).astype(int)
Output:
array([0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1])
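To see why that kernel works: np.convolve flips the kernel, so the leading ones end up looking ahead, and each output position sums the current element and the n elements after it. Below is a sketch that wraps the idea in a function, assuming the input holds only 0s and 1s and is at least 2*n + 1 elements long (with mode='same', a shorter input would produce an output of the kernel's length).

```python
import numpy as np

def preceding_indices_function(array, n):
    array = np.asarray(array)
    # Kernel of length 2n+1: ones in the first n+1 slots, zeros in the last n.
    # After the flip done by convolution, out[i] = array[i] + ... + array[i+n].
    kernel = np.zeros(2 * n + 1)
    kernel[: n + 1] = 1
    hits = np.convolve(array, kernel, mode="same")
    return (hits != 0).astype(int)

array = np.array([0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1])
result = preceding_indices_function(array, 2)

# Brute-force check: position i is 1 if any of array[i : i + n + 1] is 1
brute = np.array([int(array[i : i + 3].any()) for i in range(len(array))])
print(np.array_equal(result, brute))
# True
```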
Unless you don't want to use numpy, mozway's solution is the best one.
But since several iterative solutions have been given, here is my itertools-based one:
import itertools
[a or b or c for a,b,c in itertools.zip_longest(array, array[1:], array[2:], fillvalue=0)]
zip_longest works like the built-in zip, but when the iterables have different lengths it keeps going until the longest one is exhausted; exhausted iterables yield None unless you pass a fillvalue argument.
So here itertools.zip_longest(array, array[1:], array[2:], fillvalue=0) yields triplets (a, b, c) of three consecutive elements: a is the current element, b the next one, and c the one after that, with b and c being 0 when there is no such element.
From there, a simple comprehension builds a list of a or b or c, which is 1 if any of a, b or c is 1, and 0 otherwise.
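The zip_longest idea is written for n = 2, but it could be extended to an arbitrary n by building n + 1 shifted copies of the input. This is a sketch rather than the answerer's code; it assumes the input is a finite sequence of 0s and 1s.

```python
from itertools import zip_longest

def preceding_indices_function(array, n):
    values = list(array)
    # shifted[k] starts k positions later, so zipping the copies lines up
    # each element with the n elements that follow it
    shifted = [values[k:] for k in range(n + 1)]
    # fillvalue=0 pads the shorter copies, so the result keeps the input length
    return [int(any(window)) for window in zip_longest(*shifted, fillvalue=0)]

data = [0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1]
print(preceding_indices_function(data, 2))
# [0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1]
```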
import numpy as np
array = np.array([0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1])
array = np.array([a or array[idx+1] or array[idx+2] for idx, a in enumerate(array[:-2])] + [array[-2] or array[-1]] + [array[-1]])
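This function works if array is a list, and should work with other mutable sequences such as NumPy arrays as well. Note that it modifies array in place:
def preceding_indices_function(array, n):
    for i in range(len(array)):
        if array[i] == 1:
            # Fill the n positions before i, stopping at the start of the array
            for j in range(n):
                if i-j-1 >= 0:
                    array[i-j-1] = 1
    return array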
I got a solution that is similar to the other one but slightly simpler in my opinion:
>>> [1 if (array[i+1] == 1 or array[i+2] == 1) else x for i,x in enumerate(array) if i < len(array) - 2]
[0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1]

cumulative logical or within bins

Problem
I want to identify when I've encountered a true value and maintain that value for the rest of the array... for a particular bin. From a Numpy perspective it would be like a combination of numpy.logical_or.accumulate and numpy.logical_or.at.
Example
Consider the truth values in a, the bins in b and the expected output in c.
I've used 0 for False and 1 for True then converted to bool in order to align the array values.
a = np.array([0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0]).astype(bool)
b = np.array([0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 2, 3, 3, 0, 1, 2, 3])
# zeros       ↕  ↕  ↕              ↕  ↕  ↕           ↕
# ones                 ↕  ↕  ↕  ↕                       ↕
# twos                                      ↕              ↕
# threes                                       ↕  ↕           ↕
c = np.array([0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1]).astype(bool)
#             ╰─────╯     ↑           ↑           ↑        ↑
# zero bin no True yet    │           │           │ two never had a True
#                one bin first True   │ three bin first True
#                            zero bin first True
What I've Tried
I can loop through each value and track whether the associated bin has seen a True value yet.
tracker = np.zeros(4, bool)
result = np.zeros(len(b), bool)
for i, (truth, bin_) in enumerate(zip(a, b)):
    tracker[bin_] |= truth
    result[i] = tracker[bin_]
result * 1
array([0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1])
But I was hoping for an O(n) time Numpy solution. I have the option of using a JIT wrapper like Numba but I'd rather keep it just Numpy.
O(n) solution
def cumulative_linear_seen(seen, bins):
    """
    Tracks whether or not a value has been observed as
    True in a 1D array, and marks all future values as
    True for each individual value.
    Parameters
    ----------
    seen: ndarray
        One-hot array marking an occurrence of a value
    bins: ndarray
        Array of bins to which occurrences belong
    Returns
    -------
    One-hot array indicating if the corresponding bin has
    been observed at a point in time
    """
    # zero indexing won't work with logical and, need to 1-index
    one_up = bins + 1
    # Next step is finding where each unique value is seen
    occ = np.flatnonzero(seen)
    v_obs = one_up[seen]
    # We can fill another mapping array with these occurrences,
    # then map by corresponding index
    i_obs = np.full(one_up.max() + 1, seen.shape[0] + 1)
    i_obs[v_obs] = occ
    # Finally, we create the map and compare to an array of
    # indices from the original seen array
    seen_idx = i_obs[one_up]
    return (seen_idx <= np.arange(seen_idx.shape[0])).astype(int)
array([0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1])
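One caveat with the mapping step: when a bin contains more than one True, the plain fancy-index assignment writes to the same slot several times, and NumPy does not guarantee which write wins. A possible variant, sketched here under the assumptions that bins holds non-negative integers and the input is non-empty, records the earliest True per bin with np.minimum.at. The function name cumulative_seen_by_bin is made up for illustration.

```python
import numpy as np

def cumulative_seen_by_bin(seen, bins):
    seen = np.asarray(seen, dtype=bool)
    bins = np.asarray(bins)
    n = seen.shape[0]
    # Sentinel n means "never seen"; it is larger than any valid index
    first_true = np.full(bins.max() + 1, n)
    # Unbuffered minimum: repeated bins keep their earliest True index
    np.minimum.at(first_true, bins[seen], np.flatnonzero(seen))
    # A position is marked once its bin's first True is at or before it
    return first_true[bins] <= np.arange(n)

a = np.array([0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0]).astype(bool)
b = np.array([0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 2, 3, 3, 0, 1, 2, 3])
print(cumulative_seen_by_bin(a, b).astype(int).tolist())
# [0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1]
```

Like the loop in the question, this does a constant amount of work per element plus one pass over the bins.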
PiR's contribution
Based on insights above
r = np.arange(len(b))
one_hot = np.eye(b.max() + 1, dtype=bool)[b]
np.logical_or.accumulate(one_hot & a[:, None], axis=0)[r, b] * 1
array([0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1])
Older attempts
Just to get things started, here is a solution that, while vectorized, is not O(n). I believe an O(n) solution similar to this exists, I'll work on the complexity :-)
Attempt 1
from scipy import sparse
q = b + 1
u = sparse.csr_matrix(
    (a, q, np.arange(a.shape[0] + 1)), (a.shape[0], q.max()+1)
)
# toarray() gives the dense array (the same data the .A shorthand returned)
m = np.maximum.accumulate(u.toarray()) * np.arange(u.shape[1])
r = np.where(m[:, 1:] == 0, np.nan, m[:, 1:])
(r == q[:, None]).any(1).view(np.int8)
array([0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1], dtype=int8)
Attempt 2
q = b + 1
m = np.logical_and(a, q)
r = np.flatnonzero(m)
t = q[m]
f = np.zeros((a.shape[0], q.max()))
f[r, t-1] = 1
v = np.maximum.accumulate(f) * np.arange(1, q.max()+1)
(v == q[:, None]).any(1).view(np.int8)
array([0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1], dtype=int8)

hexadecimal array to binary array

I want to convert this hexadecimal array :
[7,3,2,0,1,9,0,4]
into this one
[0,1,1,1,0,0,1,1,0,0,1,0,0,0,0,0,0,0,0,1,1,0,0,1,0,0,0,0,0,1,0,0]
where you can see that the first 4 integers equal 7 in binary format (0111), and so on.
I tried to use format(x, '04b') but the result is in string format :
['0111','0011','0010','0000','0001','1001','0000','0100']
Consequently, I can't use the result as a binary array. How can I do that?
This one-liner will return a list of integers as you want:
hex_digits = [7,3,2,0,1,9,0,4]  # renamed from hex to avoid shadowing the built-in
list(map(int,"".join([format(x, '04b') for x in hex_digits])))
You can use bitwise operations:
h = [7,3,2,0,1,9,0,4]
[i >> b & 1 for i in h for b in range(3, -1, -1)]
This returns:
[0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0]
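If the digits already live in a NumPy array, the same shift-and-mask idea can be broadcast over all digits at once. This is a sketch that assumes every value is a single hex digit in the range 0 to 15.

```python
import numpy as np

h = np.array([7, 3, 2, 0, 1, 9, 0, 4])
# Shift amounts for the four bits of a nibble, most significant first
shifts = np.arange(3, -1, -1)
# Broadcasting gives one row of 4 bits per digit; ravel flattens them in order
bits = ((h[:, np.newaxis] >> shifts) & 1).ravel()
print(bits.tolist())
# [0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0]
```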
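arr = [7,3,2,0,1,9,0,4]
hexa = ''.join(str(e) for e in arr)
print(bin(int(hexa,16))[2:])
This joins the digits into a hexadecimal string and prints its binary representation. Note that the result is a string rather than a list, and bin() drops the leading zeros.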

Simultaneous changing of python numpy array elements

I have a vector of integers from range [0,3], for example:
v = [0,0,1,2,1,3, 0,3,0,2,1,1,0,2,0,3,2,1].
I know that I can replace a specific value in the vector with another value using the following:
v[v == 0] = 5
which changes all occurrences of 0 in vector v to the value 5.
But I would like to do something a little bit different - I want to change all values of 0 (let's call them target values) to 1, and all values different from 0 to 0, thus I want to obtain the following:
v = [1,1,0,0,0,0,1,0,1,0,0,0,1,0,1,0,0,0]
However, I cannot call the substitution code (which I used above) as follows:
v[v==0] = 1
v[v!=0] = 0
because this obviously leads to a vector of zeros.
Is it possible to do the above substitution in a parallel way, to obtain the desired vector? (I want a universal technique that I can still use if I change my target value.) Any suggestions will be very helpful!
You can check if v is equal to zero and then convert the boolean array to int, and so if the original value is zero, the boolean is true and converts to 1, otherwise 0:
v = np.array([0,0,1,2,1,3, 0,3,0,2,1,1,0,2,0,3,2,1])
(v == 0).astype(int)
# array([1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0])
Or use numpy.where:
np.where(v == 0, 1, 0)
# array([1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0])
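Since the question asks for something that still works when the target value changes, the comparison can be wrapped in a small function. The names mark_target and mark_targets below are made up for illustration; the second one is a sketch for the case of several target values, using np.isin.

```python
import numpy as np

def mark_target(v, target):
    # The comparison is evaluated on the original values in one step,
    # so no element is overwritten before it has been tested
    return (np.asarray(v) == target).astype(int)

def mark_targets(v, targets):
    # Same idea for several target values at once
    return np.isin(v, targets).astype(int)

v = np.array([0, 0, 1, 2, 1, 3, 0, 3, 0, 2, 1, 1, 0, 2, 0, 3, 2, 1])
print(mark_target(v, 0).tolist())
# [1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0]
```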

Slicing different rows of a numpy array differently

I'm working on a Monte Carlo radiative transfer code, which simulates firing photons through a medium and statistically modelling their random walk. It runs slowly firing one photon at a time, so I'd like to vectorize it and run perhaps 1000 photons at once.
I have divided my slab through which the photons are passing into nlayers slices between optical depth 0 and depth. Effectively, that means that I have nlayers + 2 regions (nlayers plus the region above the slab and the region below the slab). At each step, I have to keep track of which layers each photon passes through.
Let's suppose that I already know that two photons start in layer 0. One takes a step and ends up in layer 2, and the other takes a step and ends up in layer 6. This is represented by an array pastpresent that looks like this:
[[ 0 2]
[ 0 6]]
I want to generate an array traveled_through with (nlayers + 2) columns and 2 rows, describing whether photon i passed through layer j (endpoint-inclusive). It would look something like this (with nlayers = 10):
[[ 1 1 1 0 0 0 0 0 0 0 0 0]
[ 1 1 1 1 1 1 1 0 0 0 0 0]]
I could do this by iterating over the photons and generating each row of traveled_through individually, but that's rather slow, and sort of defeats the point of running many photons at once, so I'd rather not do that.
I tried to define the array as follows:
traveled_through = np.zeros((2, nlayers)).astype(int)
traveled_through[ : , np.min(pastpresent, axis = 1) : np.max(pastpresent, axis = 1) + 1 ] = 1
The idea was that in a given photon's row, the indices from the starting layer through and including the ending layer would be set to 1, with all others remaining 0. However, I get the following error:
traveled_through[ : , np.min(pastpresent, axis = 1) : np.max(pastpresent, axis = 1) + 1 ] = 1
IndexError: invalid slice
My best guess is that numpy does not allow different rows of an array to be indexed differently using this method. Does anyone have suggestions for how to generate traveled_through for an arbitrary number of photons and an arbitrary number of layers?
If the two photons always start at 0, you could perhaps construct your array as follows.
First setting the variables...
>>> pastpresent = np.array([[0, 2], [0, 6]])
>>> nlayers = 10
...and then constructing the array:
>>> (pastpresent[:,1][:,np.newaxis] + 1 > np.arange(nlayers+2)).astype(int)
array([[1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]])
Or if the photons have an arbitrary starting layer:
>>> pastpresent2 = np.array([[1, 7], [3, 9]])
>>> ((pastpresent2[:,0][:,np.newaxis] < np.arange(nlayers+2)) &
...  (pastpresent2[:,1][:,np.newaxis] + 1 > np.arange(nlayers+2))).astype(int)
array([[0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0]])
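Note that the strict < in the arbitrary-start version leaves the starting layer itself unmarked. If both endpoints should count, as the question describes, and a photon may also step backwards, one possible sketch takes the row-wise minimum and maximum first. The function name traveled_through mirrors the array name from the question and is otherwise an assumption.

```python
import numpy as np

def traveled_through(pastpresent, nlayers):
    pastpresent = np.asarray(pastpresent)
    # Lowest and highest layer touched by each photon, as column vectors
    lo = pastpresent.min(axis=1)[:, np.newaxis]
    hi = pastpresent.max(axis=1)[:, np.newaxis]
    layers = np.arange(nlayers + 2)
    # Broadcasting compares every photon's range against every layer index
    return ((layers >= lo) & (layers <= hi)).astype(int)

print(traveled_through([[0, 2], [0, 6]], 10).tolist())
# [[1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]]
```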
A little trick I kind of like for this kind of thing involves the accumulate method of the logical_xor ufunc:
>>> a = np.zeros(10, dtype=int)
>>> b = [3, 7]
>>> a[b] = 1
>>> a
array([0, 0, 0, 1, 0, 0, 0, 1, 0, 0])
>>> np.logical_xor.accumulate(a, out=a)
array([0, 0, 0, 1, 1, 1, 1, 0, 0, 0])
Note that this sets to 1 the entries between the positions in b, first index inclusive, last index exclusive, so you have to handle off by 1 errors depending on what exactly you are after.
With several rows, you could make it work as:
>>> a = np.zeros((3, 10), dtype=int)
>>> b = np.array([[1, 7], [0, 4], [3, 8]])
>>> b[:, 1] += 1 # handle the off by 1 error
>>> a[np.arange(len(b))[:, None], b] = 1
>>> a
array([[0, 1, 0, 0, 0, 0, 0, 0, 1, 0],
[1, 0, 0, 0, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0, 0, 1]])
>>> np.logical_xor.accumulate(a, axis=1, out=a)
array([[0, 1, 1, 1, 1, 1, 1, 1, 0, 0],
[1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 1, 1, 1, 1, 1, 0]])
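One edge case with this trick: if a range ends at the last column, the end marker at index + 1 falls outside the array. A possible workaround, sketched here with a made-up helper name fill_between and assuming start <= end in every row, is to allocate one spare column and drop it after accumulating.

```python
import numpy as np

def fill_between(bounds, ncols):
    bounds = np.asarray(bounds)
    rows = np.arange(len(bounds))
    # One spare column so an end marker at index ncols still fits
    marks = np.zeros((len(bounds), ncols + 1), dtype=bool)
    marks[rows, bounds[:, 0]] = True        # toggle on at the start index
    marks[rows, bounds[:, 1] + 1] = True    # toggle off just past the end index
    # Running XOR is True between the two markers; drop the spare column
    return np.logical_xor.accumulate(marks, axis=1)[:, :ncols].astype(int)

print(fill_between([[1, 7], [0, 4], [3, 8]], 10).tolist())
# [[0, 1, 1, 1, 1, 1, 1, 1, 0, 0], [1, 1, 1, 1, 1, 0, 0, 0, 0, 0], [0, 0, 0, 1, 1, 1, 1, 1, 1, 0]]
```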
