How to change values in a matrix? - python

I have a matrix filled with 0s. In the first row I want to put a 1 at a random position a and another 1 at position a+1. Then I want to do the same with b and b+1 in the next row, and so on.
How can I do it?
import random
w, h = 10, 3
Matrix = [[0 for x in range(w)] for y in range(h)]
a = random.randint(0,9)
b = random.randint(0,9)
c = random.randint(0,9)
print(a, b, c)
EXAMPLE:
a = 5 b = 2 c = 1
0000011000
0011000000
0110000000

You should reduce the upper bound of randint to 8 (more generally w-2), or alternatively use randrange(w-1), which excludes its end value, so that the +1 does not run past the end of the row.
Then just loop over the rows, generate a number for each one, and change that row using the number as an index:
import random
w, h = 10, 3
matrix = [[0 for x in range(w)] for y in range(h)]
for row in matrix:
    i = random.randrange(w-1)
    print(i)
    row[i:i+2] = [1, 1]
print(*matrix, sep='\n')
This will give, for example:
0
8
2
[1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
[0, 0, 1, 1, 0, 0, 0, 0, 0, 0]

With a list of lists in Python, Matrix[i] gives you row i and Matrix[i][j] gives you the item in column j of that row.
So if you want the third item in the second row, it would be Matrix[1][2]; note that we're counting from 0.
Getting back to your question, it would look something like this,
Matrix[0][a] = 1
Matrix[0][a + 1] = 1
And so on for b and c, but you could also use a loop:
positions = [a, b, c]  # not named random, which would shadow the random module
for row in range(h):
    Matrix[row][positions[row]] = 1
    Matrix[row][positions[row] + 1] = 1
In this loop we go through each row of the matrix (h rows in your code). For the first row we take a, set the item at index a to 1, and then set the item at index a + 1 in the same row to 1.
For the next row we take b and do the same thing, and so on.
Note: this raises an IndexError if a, b or c is 9, because adding 1 takes you past the end of the row, whose valid indexes are 0 to 9.
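The out-of-range case in the note above can be avoided by drawing the start index from 0 to w - 2 only. A minimal sketch, assuming a helper named place_pairs (a hypothetical name, not from either answer):

```python
import random


def place_pairs(width, height):
    """Return a height x width grid of 0s with one adjacent pair of 1s per row.

    place_pairs is a hypothetical helper name used only for this sketch.
    """
    grid = [[0] * width for _ in range(height)]
    for row in grid:
        # randrange(width - 1) yields 0 .. width - 2, so start + 1 is always valid
        start = random.randrange(width - 1)
        row[start] = 1
        row[start + 1] = 1
    return grid


grid = place_pairs(10, 3)
for row in grid:
    print("".join(map(str, row)))
```

Because the positions are random, the printed rows differ from run to run, but every row contains exactly two neighbouring 1s.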

Related

Is there a way to simplify the creation of all possible (length x height) grids?

Here's my code for a 4x4 grid to better explain my problem:
#The "Duct-Tape" solution
totalGrids = []
for box0 in range(0,2):
    for box1 in range(0,2):
        for box2 in range(0,2):
            for box3 in range(0,2):
                for box4 in range(0,2):
                    for box5 in range(0,2):
                        for box6 in range(0,2):
                            for box7 in range(0,2): #0 = OutBag, 1 = InBag
                                for box8 in range(0,2):
                                    for box9 in range(0,2):
                                        for box10 in range(0,2):
                                            for box11 in range(0,2):
                                                for box12 in range(0,2):
                                                    for box13 in range(0,2):
                                                        for box14 in range(0,2):
                                                            for box15 in range(0,2):
                                                                totalGrids.append([[box0,box1,box2,box3],
                                                                                   [box4,box5,box6,box7],
                                                                                   [box8,box9,box10,box11],
                                                                                   [box12,box13,box14,box15]])
What's a way to make something like this for a length x height size grid?
This is another way to do it with fewer for loops by using binary arithmetic:
totalGrids = []
for i in range(0, 1 << 16):
    totalGrids.append(
        [
            [(i >> j) & 1 for j in range(0, 4)],
            [(i >> j) & 1 for j in range(4, 8)],
            [(i >> j) & 1 for j in range(8, 12)],
            [(i >> j) & 1 for j in range(12, 16)]
        ])
print(totalGrids[0])
print(totalGrids[1])
print(totalGrids[2])
print()
print(totalGrids[-3])
print(totalGrids[-2])
print(totalGrids[-1])
Output (first 3 and last 3 elements):
[[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
[[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
[[0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
[[1, 0, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]
[[0, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]
[[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]
To generalize this from 4 x 4 to height x width, something like this should work:
height = 3
width = 5
totalGrids = []
for i in range(0, 1 << (height * width)):
    totalGrids.append(
        [[(i >> j) & 1 for j in range(k * width, (k + 1) * width)] for k in range(0, height)]
    )
Here is an explanation of the above.
The matrix, which has height x width elements, is to be filled with every possible combination of 0s and 1s across these elements. As an example, if height = 2 and width = 4, then there are 8 elements in total, and one ordering of the required combinations of 0s and 1s is:
0 0 0 0 0 0 0 0 (this is 0 in binary)
0 0 0 0 0 0 0 1 (this is 1 in binary)
0 0 0 0 0 0 1 0 (this is 2 in binary)
0 0 0 0 0 0 1 1 (this is 3 in binary)
...
0 0 0 0 1 1 1 1 (this is 15 in binary)
0 0 0 1 0 0 0 0 (this is 16 in binary)
0 0 0 1 0 0 0 1
0 0 0 1 0 0 1 0
0 0 0 1 0 0 1 1 (EXAMPLE VALUE USED BELOW)
...
0 0 1 0 0 0 0 0 (this is 32 in binary)
...
0 0 1 1 0 0 0 0 (this is 48 in binary)
...
1 1 1 1 1 1 1 1 (this is 255 = 2**8 - 1 in binary)
These are just the binary values from 0 to 2**8 - 1 which can be expressed as Python integers in range(0, 2**8). They are exactly what is needed, and now the only question is how to populate a Python list of lists of size height x width.
The answer is to use binary arithmetic. Let's look at 0 0 0 1 0 0 1 1 as an example. We can specify this in Python as an integer, namely i = 19.
For the 1st slot of 8, we want to use the rightmost binary bit in our example, which is 1. We can extract this using Python's bitwise & operation by taking value = i & 1. Applying & 1 to any integer effectively masks off all but the binary ones-place digit.
For the 2nd slot, we need to add an additional step:
First we slide the bits to the right by 1 position (allowing the rightmost bit to fall off the edge, which is fine since we have already processed it and won't need it again) using Python's right shift operation >> as follows: value = i >> 1. In binary, this yields 0 0 0 0 1 0 0 1, which is the integer 9. The right-shift operator has moved the bit that was in the binary twos-place rightward into the binary ones-place.
Next, we can use the same technique as we did for the 1st slot to mask off all but the ones-place bit: value = value & 1.
Rather than do the above as two separate statements, we can simply write: value = (i >> 1) & 1.
In general, for the slot at index j (counting from 0, so the 1st slot is j = 0), we can extract the corresponding bit from our example integer by writing: value = (i >> j) & 1.
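The shift-and-mask steps described above can be checked on the example value. This is only a small sketch; the variable names first, second and bits are chosen here for illustration:

```python
i = 19  # 0 0 0 1 0 0 1 1 in binary, the example value from the text

first = i & 1           # 1st slot: mask off everything but the ones-place bit
second = (i >> 1) & 1   # 2nd slot: shift right once, then mask
bits = [(i >> j) & 1 for j in range(8)]  # all 8 slots, slot 0 first

print(i >> 1)           # 9
print(first, second)    # 1 1
print(bits)             # [1, 1, 0, 0, 1, 0, 0, 0]
```

Note that the list reads the binary digits from right to left, because slot 0 takes the rightmost bit.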
Now let's look at the key logic within the loop:
[[(i >> j) & 1 for j in range(k * width, (k + 1) * width)] for k in range(0, height)]
This uses a nested list comprehension to loop first over k in range(0, height) and then over j in range(k * width, (k + 1) * width), and to put the result of the above bitwise expression (i >> j) & 1 into each successive element in our matrix (or list of lists).
Finally, let's look again at the very outer loop in the code:
for i in range(0, 1 << (height * width)):
This uses Python's bitwise left shift operation <<, which does the opposite of what right shift (>>) does, namely to shift the bits of 1 to the left by (height * width) binary positions. Because each shift to the left causes a number to double in value, our left shift expression gives the same result as 2 ** (height * width), which is exactly the number of 0/1 combinations that your question is seeking.
So, by iterating from 0 to 2 ** (height * width) - 1, then extracting and collating the bits of each value into the corresponding matrix elements for that iteration's matrix, and appending that matrix to the totalGrids variable, we ultimately construct a list of matrices with the required properties.
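The whole procedure can also be wrapped in a function. The sketch below assumes a generator named grids (a hypothetical name, not part of the answer) so that the grids are produced one at a time instead of all being stored in a list:

```python
def grids(height, width):
    """Yield every height x width grid of 0s and 1s, one grid per integer i.

    grids is a hypothetical helper name used only for this sketch.
    """
    cells = height * width
    for i in range(1 << cells):  # i runs over 0 .. 2**cells - 1
        yield [
            [(i >> j) & 1 for j in range(k * width, (k + 1) * width)]
            for k in range(height)
        ]


all_2x2 = list(grids(2, 2))
print(len(all_2x2))              # 16
example = list(grids(2, 4))[19]  # the grid built from i = 19
print(example)                   # [[1, 1, 0, 0], [1, 0, 0, 0]]
```

The number of grids doubles with every extra cell, so a generator only helps with memory, not with running time.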

Python - issue with dimension of sequence

I want to create the following sequence of zeros and ones in Python:
{0, 1,1,1,1, 0,0, 1,1,1, 0,0,0, 1,1, 0,0,0,0, 1}
So there is first 1 zero and 4 ones, then 2 zeros and 3 ones, then 3 zeros and 2 ones, and finally 4 zeros and 1 one. The final array is supposed to have dimension 20x1, but my code gives me the dimension 4x2. Does anyone know how I can fix this?
Here's my code:
import numpy as np
seq = [ (np.ones(n), np.zeros(5-n) ) for n in range(1,5)]
Many thanks in advance!
For each iteration you create a tuple of two things, hence the 4x2 result. You can bring it to the form you want by concatenating the array elements all together, but there is a pattern to your sequence; you can take advantage that it looks like a triangular matrix of 1s and 0s, which you can then flatten.
n = 5
ones = np.ones((n, n), dtype=int)
seq = np.triu(ones)[1:].flatten()
Output:
array([0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1])
You can use flatten:
import numpy as np
l = np.array([[0] * n + [1] * (5 - n) for n in range(1, 5)]).flatten()
print(l)
# >>> [0 1 1 1 1 0 0 1 1 1 0 0 0 1 1 0 0 0 0 1]
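The concatenation route mentioned in the first answer could look roughly like this. It is a sketch only; note that it puts the zeros before the ones in each group, which is the order the target sequence needs and the reverse of the order in the question's code:

```python
import numpy as np

parts = []
for n in range(1, 5):
    parts.append(np.zeros(n, dtype=int))     # n zeros first
    parts.append(np.ones(5 - n, dtype=int))  # then 5 - n ones
seq = np.concatenate(parts)                  # one flat array instead of 4 tuples

print(seq.shape)  # (20,)
print(seq)        # [0 1 1 1 1 0 0 1 1 1 0 0 0 1 1 0 0 0 0 1]
```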

How to replace the N smallest elements in each row of numpy array?

I would like to replace the N smallest elements in each row with 0, while the resulting array keeps the order and shape of the original array.
Specifically, if the original numpy array is:
import numpy as np
x = np.array([[0,50,20],[2,0,10],[1,1,0]])
And N = 2, I would like for the result to be the following:
x = np.array([[0,50,0],[0,0,10],[0,1,0]])
I tried the following, but in the last row it replaces 3 elements instead of 2 (because it replaces both 1s and not only one)
import numpy as np
N = 2
x = np.array([[0,50,20],[2,0,10],[1,1,0]])
x_sorted = np.sort(x , axis = 1)
x_sorted[:,N:] = 0
replace = x_sorted.copy()
final = np.where(np.isin(x,replace),0,x)
Note that this is a small example and I would like the solution to work for a much bigger matrix.
Thanks for your time!
One way using numpy.argsort:
N = 2
x[x.argsort().argsort() < N] = 0
Output:
array([[ 0, 50, 0],
[ 0, 0, 10],
[ 0, 1, 0]])
Use numpy.argpartition to find the index of N smallest elements, and then use the index to replace values:
N = 2
idy = np.argpartition(x, N, axis=1)[:, :N]
x[np.arange(len(x))[:,None], idy] = 0
x
array([[ 0, 50, 0],
[ 0, 0, 10],
[ 1, 0, 0]])
Note that if there are ties, which of the tied values get replaced is not determined; it depends on the algorithm used.
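If the tie behaviour matters, one option is to ask for a stable sort explicitly, so that among equal values the leftmost ones are picked first. This is a sketch of an alternative, not what either answer above does; it uses np.argsort with kind="stable" and np.put_along_axis:

```python
import numpy as np

N = 2
x = np.array([[0, 50, 20], [2, 0, 10], [1, 1, 0]])

# Column indexes of the N smallest values per row; a stable sort keeps
# tied values in their original left-to-right order.
idx = np.argsort(x, axis=1, kind="stable")[:, :N]

result = x.copy()                          # leave x itself untouched
np.put_along_axis(result, idx, 0, axis=1)  # write 0 at those positions
print(result)
# [[ 0 50  0]
#  [ 0  0 10]
#  [ 0  1  0]]
```

A full sort costs more than argpartition on wide rows, so this trades some speed for a predictable result.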

Numpy: how to convert observations to probabilities?

I have a feature matrix and the corresponding targets, which are ones or zeroes:
# raw observations
features = np.array([[1, 1, 0],
                     [1, 1, 0],
                     [0, 1, 0],
                     [0, 1, 0],
                     [0, 1, 0],
                     [0, 0, 1]])
targets = np.array([1, 0, 1, 1, 0, 0])
As you can see, the same feature row may appear with both one and zero targets. I need to convert my raw observation matrix to a probability matrix, where each unique feature row is mapped to the probability of seeing one as its target:
[1 1 0] -> 0.5
[0 1 0] -> 0.67
[0 0 1] -> 0
I have constructed a quite straight-forward solution:
import numpy as np
from collections import Counter
# raw observations
features = np.array([[1, 1, 0],
                     [1, 1, 0],
                     [0, 1, 0],
                     [0, 1, 0],
                     [0, 1, 0],
                     [0, 0, 1]])
targets = np.array([1, 0, 1, 1, 0, 0])
def convert_obs_to_proba(features, targets):
    features_ = []
    targets_ = []
    # compute unique rows (idx will point to some representative)
    b = np.ascontiguousarray(features).view(np.dtype((np.void, features.dtype.itemsize * features.shape[1])))
    _, idx = np.unique(b, return_index=True)
    idx = idx[::-1]
    zeros = Counter()
    ones = Counter()
    # collect row-wise number of one and zero targets
    for i, row in enumerate(features[:]):
        if targets[i] == 0:
            zeros[tuple(row)] += 1
        else:
            ones[tuple(row)] += 1
    # iterate over unique features and compute probabilities
    for k in idx:
        unique_row = features[k]
        zero_count = zeros[tuple(unique_row)]
        one_count = ones[tuple(unique_row)]
        proba = float(one_count) / float(zero_count + one_count)
        features_.append(unique_row)
        targets_.append(proba)
    return np.array(features_), np.array(targets_)
features_, targets_ = convert_obs_to_proba(features, targets)
print(features_)
print(targets_)
which:
extracts unique features;
counts number of zero and one observations targets for each unique feature;
computes probability and constructs the result.
Could it be solved in a prettier way using some advanced numpy magic?
Update: the previous code was pretty inefficient, O(n^2). I converted it to the more performance-friendly version above. Old code:
import numpy as np
# raw observations
features = np.array([[1, 1, 0],
                     [1, 1, 0],
                     [0, 1, 0],
                     [0, 1, 0],
                     [0, 1, 0],
                     [0, 0, 1]])
targets = np.array([1, 0, 1, 1, 0, 0])
def convert_obs_to_proba(features, targets):
    features_ = []
    targets_ = []
    # compute unique rows (idx will point to some representative)
    b = np.ascontiguousarray(features).view(np.dtype((np.void, features.dtype.itemsize * features.shape[1])))
    _, idx = np.unique(b, return_index=True)
    idx = idx[::-1]
    # calculate ZERO class occurrences and ONE class occurrences
    for k in idx:
        unique_row = features[k]
        zeros = 0
        ones = 0
        for i, row in enumerate(features[:]):
            if np.array_equal(row, unique_row):
                if targets[i] == 0:
                    zeros += 1
                else:
                    ones += 1
        proba = float(ones) / float(zeros + ones)
        features_.append(unique_row)
        targets_.append(proba)
    return np.array(features_), np.array(targets_)
features_, targets_ = convert_obs_to_proba(features, targets)
print(features_)
print(targets_)
It's easy using Pandas:
import pandas as pd
df = pd.DataFrame(features)
df['targets'] = targets
Now you have:
   0  1  2  targets
0  1  1  0        1
1  1  1  0        0
2  0  1  0        1
3  0  1  0        1
4  0  1  0        0
5  0  0  1        0
Now, the fancy part:
df.groupby([0,1,2]).targets.mean()
Gives you:
0  1  2
0  0  1    0.000000
   1  0    0.666667
1  1  0    0.500000
Name: targets, dtype: float64
Pandas doesn't print the 0 at the leftmost part of the 0.666 row, but if you inspect the value there, it is indeed 0.
np.sum(np.reshape([targets[f] if tuple(features[f])==tuple(i) else 0 for i in np.vstack(set(map(tuple,features))) for f in range(features.shape[0])],features.shape[::-1]),axis=1)/np.sum(np.reshape([1 if tuple(features[f])==tuple(i) else 0 for i in np.vstack(set(map(tuple,features))) for f in range(features.shape[0])],features.shape[::-1]),axis=1)
Here you go, numpy magic! Although it is unnecessarily dense, this could probably be cleaned up using some boring variables ;)
(And this is probably far from optimal.)
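For comparison, here is a shorter NumPy-only sketch. It is an alternative technique, not a cleaned-up version of the one-liner above: it groups identical rows with np.unique(..., axis=0, return_inverse=True) and counts with np.bincount. The names unique_rows, inverse, ones, totals and proba are chosen here for illustration:

```python
import numpy as np

features = np.array([[1, 1, 0],
                     [1, 1, 0],
                     [0, 1, 0],
                     [0, 1, 0],
                     [0, 1, 0],
                     [0, 0, 1]])
targets = np.array([1, 0, 1, 1, 0, 0])

# inverse[i] is the position of features[i] among the unique rows
unique_rows, inverse = np.unique(features, axis=0, return_inverse=True)
inverse = inverse.ravel()  # make sure it is 1-D, whatever the NumPy version

ones = np.bincount(inverse, weights=targets)  # number of 1 targets per unique row
totals = np.bincount(inverse)                 # number of observations per unique row
proba = ones / totals

print(unique_rows)  # rows come back in sorted order: [0 0 1], [0 1 0], [1 1 0]
print(proba)
```

The unique rows are returned in lexicographic order, so proba lists the probabilities for [0 0 1], [0 1 0] and [1 1 0] in that order.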

Why does the loop behave differently in just one iteration?

I have this code:
gs = open("graph.txt", "r")
gp = gs.readline()
gp_splitIndex = gp.find(" ")
gp_nodeCount = int(gp[0:gp_splitIndex])
gp_edgeCount = int(gp[gp_splitIndex+1:-1])
matrix = [] # predeclare the array
for i in range(0, gp_nodeCount):
    matrix.append([])
    for y in range(0, gp_nodeCount):
        matrix[i].append(0)
for i in range(0, gp_edgeCount-1):
    gp = gs.readline()
    gp_splitIndex = gp.find(" ") # get the index of space, dividing the 2 numbers on a row
    gp_from = int(gp[0:gp_splitIndex])
    gp_to = int(gp[gp_splitIndex+1:-1])
    matrix[gp_from][gp_to] = 1
print(matrix)
The file graph.txt contains this:
5 10
0 1
1 2
2 3
3 4
4 0
0 3
3 1
1 4
4 2
2 0
The first two numbers tell me that the graph has 5 nodes and 10 edges. The following number pairs describe the edges between nodes. For example, "1 4" means an edge from node 1 to node 4.
Problem is, the output should be this:
[[0, 1, 0, 1, 0], [0, 0, 1, 0, 1], [1, 0, 0, 1, 0], [0, 1, 0, 0, 1], [1, 0, 1, 0, 0]]
But instead of that, I get this:
[[0, 1, 0, 1, 0], [0, 0, 1, 0, 1], [0, 0, 0, 1, 0], [0, 1, 0, 0, 1], [1, 0, 1, 0, 0]]
Only one number is different and I can't understand why this is happening. The edge "3 1" is not present. Can someone explain where the problem is?
Change for i in range(0, gp_edgeCount-1): to
for i in range(0, gp_edgeCount):
The range() function already excludes the end value, so there is no need to subtract 1 yourself: range(0, 3) yields 0, 1, 2.
Also, it is not the "3 1" edge that is missing but the "2 0" edge, which is the last one in the file. Rows and columns of the matrix are counted from 0, so the entry that differs is matrix[2][0].
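The off-by-one can be seen directly by counting the iterations. A tiny sketch, with edge_count standing in for gp_edgeCount from the question:

```python
edge_count = 10  # stands in for gp_edgeCount, the value read from graph.txt

print(list(range(0, 3)))              # [0, 1, 2]
print(len(range(0, edge_count - 1)))  # 9, so the last edge line is never read
print(len(range(0, edge_count)))      # 10, one iteration per edge
```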
Matthias has it; you don't need edgeCount - 1 since the range function doesn't include the end value in the iteration.
There are several other things you can do to clean up your code:
The with statement is preferred for opening files, since it closes them automatically for you.
You don't need to call find and slice manually; split already does what you want.
You can convert and assign directly to a pair of numbers using a generator expression and iterable unpacking.
You can call range with just an end value; the 0 start is implicit.
The multiplication operator is handy for initializing lists.
With all of those changes:
with open('graph.txt', 'r') as graph:
    node_count, edge_count = (int(n) for n in graph.readline().split())
    matrix = [[0]*node_count for _ in range(node_count)]
    for i in range(edge_count):
        src, dst = (int(n) for n in graph.readline().split())
        matrix[src][dst] = 1
print(matrix)
# [[0, 1, 0, 1, 0], [0, 0, 1, 0, 1], [1, 0, 0, 1, 0], [0, 1, 0, 0, 1], [1, 0, 1, 0, 0]]
Here is a version that keeps your code and style (of course, it could be much more readable):
gs = open("graph.txt", "r")
gp = gs.readline()
gp_splitIndex = gp.split(" ")
gp_nodeCount = int(gp_splitIndex[0])
gp_edgeCount = int(gp_splitIndex[1])
matrix = [] # predeclare the array
for i in range(0, gp_nodeCount):
    matrix.append([])
    for y in range(0, gp_nodeCount):
        matrix[i].append(0)
for i in range(0, gp_edgeCount):
    gp = gs.readline()
    gp_Index = gp.split(" ") # split the row into its 2 numbers
    gp_from = int(gp_Index[0])
    gp_to = int(gp_Index[1])
    matrix[gp_from][gp_to] = 1
print(matrix)
It is exactly the last line that was not used, the "2 0" from your file, hence the missing 1. Have a nice day!
The other answers are correct. Here is another version, similar to tzaman's:
with open('graph.txt', mode='r') as txt_file:
    lines = [l.strip() for l in txt_file.readlines()]
number_pairs = [[int(n) for n in line.split(' ')] for line in lines]
header = number_pairs[0]
edge_pairs = number_pairs[1:]
num_nodes, num_edges = header
edges = [[0] * num_nodes for _ in range(num_nodes)]
for edge_start, edge_end in edge_pairs:
    edges[edge_start][edge_end] = 1
print(edges)
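To try the fix without creating graph.txt, the same parsing can be run against an in-memory copy of the file. This is a sketch for Python 3; read_adjacency and GRAPH_TEXT are hypothetical names introduced here, and io.StringIO stands in for the opened file:

```python
import io

# In-memory copy of graph.txt from the question
GRAPH_TEXT = """5 10
0 1
1 2
2 3
3 4
4 0
0 3
3 1
1 4
4 2
2 0
"""


def read_adjacency(stream):
    """Build the adjacency matrix from a header line plus one line per edge."""
    node_count, edge_count = (int(n) for n in stream.readline().split())
    matrix = [[0] * node_count for _ in range(node_count)]
    for _ in range(edge_count):  # edge_count iterations, no "- 1"
        src, dst = (int(n) for n in stream.readline().split())
        matrix[src][dst] = 1
    return matrix


matrix = read_adjacency(io.StringIO(GRAPH_TEXT))
print(matrix)
# [[0, 1, 0, 1, 0], [0, 0, 1, 0, 1], [1, 0, 0, 1, 0], [0, 1, 0, 0, 1], [1, 0, 1, 0, 0]]
```

Passing an open file object instead of the StringIO works the same way, since both provide readline.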
