I have a 2D NumPy array of size 768 x 1024 that contains the class values of a segmented image.
I have detected pedestrians/vehicles within this array and have the top-left and bottom-right coordinates of the bounding box, say (381,254) and (387,257).
(381,254) (381,255) ............... (381,257)
(382,254)
.
.
.
(387,254) .................................(387,257)
Each cell within those coordinates has a specific class value (a number from 1 to 22). The ones that interest me are 4 and 10, which indicate that the bounding box contains a pedestrian or a vehicle, respectively.
How do I iterate through each element individually (all the elements in row 381 from column 254 to 257, then on to the next row, and so on until the bottom-right coordinate (387,257)) and check whether that particular cell contains the number 4 or 10?
I tried using a nested for loop, but I'm not able to figure out the logic.
x_1 = 381
x_2 = 387
y_1 = 254
y_2 = 257
ROW = []
COL = []
four = 0
ten = 0
other = 0
for rows in range(x_1, x_2):
    ROW.append(rows)
    for cols in range(y_1, y_2):
        COL.append(cols)
        if array[rows][cols] == 4:
            four += 1
        elif array[rows][cols] == 10:
            ten += 1
        else:
            print('random number')
            other += 1
Any help would be appreciated! Thanks.
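Try using this instead:
x_1 = 381
x_2 = 387
y_1 = 254
y_2 = 257
ROW = []
COL = []
four = 0
ten = 0
other = 0
# range() excludes its end value, so add 1 to include row x_2 and column y_2
for rows in range(x_1, x_2 + 1):
    ROW.append(rows)
    for cols in range(y_1, y_2 + 1):
        COL.append(cols)
        if array[rows, cols] == 4:
            four += 1
        elif array[rows, cols] == 10:
            ten += 1
        else:
            print('random number')
            other += 1
It checks whether array[rows, cols] contains the number 4 or 10. For example, if array[381, 254] == 4 then four += 1. Without the + 1, the last row (387) and last column (257) of the bounding box would be skipped.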
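The same counts can also be obtained without explicit loops by slicing the bounding box out of the array. The following is only a sketch: the helper name count_classes and the stand-in array are made up for illustration, and it assumes the coordinates are (row, column) pairs with an inclusive bottom-right corner.

```python
import numpy as np

def count_classes(array, top_left, bottom_right, classes=(4, 10)):
    """Count how often each class value occurs inside an inclusive bounding box."""
    (r1, c1), (r2, c2) = top_left, bottom_right
    # Slices exclude their end index, so add 1 to keep the last row/column.
    box = array[r1:r2 + 1, c1:c2 + 1]
    return {c: int(np.count_nonzero(box == c)) for c in classes}

# Hypothetical stand-in for the segmented image (all cells are class 1 at first)
array = np.ones((768, 1024), dtype=int)
array[381:385, 254:258] = 4    # 4 rows x 4 columns of class 4
array[385:388, 254:256] = 10   # 3 rows x 2 columns of class 10

counts = count_classes(array, (381, 254), (387, 257))
print(counts)
```

Slicing creates a view rather than a copy, so this stays cheap even if there are many bounding boxes.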
Title. Essentially, I want to replace characters that are randomly generated in a 2D list I made. Here's an example: I'd like to replace 0 with 9.
0 0 0 0
1 1 1 1
2 2 2 2
3 3 3 3
9 9 9 9
1 1 1 1
2 2 2 2
3 3 3 3
But I am not entirely familiar with how I am supposed to do this, because I am rather new to programming.
My first attempt was to make a function with a parameter that would identify any characters that were 0, and then make a separate function that would return it and replace it with 9, but it wouldn't even start, and I have no idea where to go from there. Any help would be appreciated.
code:
import random

SIZE = 4
EMPTY = " "
PERSON = "P"
PET = "T"
POOP = "O"
ERROR = "!"
CLEANED = "."
MAX_RANDOM = 10

def clean(world,endRow,endColumn):
    if world == POOP, POOP == CLEAN:
        print("Scooping the poop")

# #args(reference to a list of characters,integer,integer)
# Function will count the number of occurrences of the character specified by the user up to the specified
# end point (row/column specified by the user).
# #return(character,integer)
def count(world,endRow,endColumn):
    print("Counting number of occurrences of a character")
    number = 0
    element = input("Enter character: ")
    return(element,number)

# Randomly generates an 'occupant' for a location in the world
# #args(none)
# Randomly generates and returns a character according to the specified probabilities.
# *50% probability empty space
# *20% probability of a person
# *10% probability of a pet
# *20% probability of fecal matter
# #return(character)
def createElement():
    tempNum = random.randrange(MAX_RANDOM)+1
    # 50% chance empty
    if ((tempNum >= 1) and (tempNum <= 5)):
        tempElement = EMPTY
    # 20% of a person
    elif ((tempNum >= 6) and (tempNum <= 7)):
        tempElement = PERSON
    # 10% chance of pet
    elif (tempNum == 8):
        tempElement = PET
    # 20% chance of poop in that location (lots in this world)
    elif ((tempNum >= 9) and (tempNum <= 10)):
        tempElement = POOP
    # In case there's a bug in the random number generator
    else:
        tempElement = ERROR
    return(tempElement)

# Creates the SIZExSIZE world. Randomly populates it with the
# return values from function createElement
# #args(none)
# Randomly populates the 2D list which is returned to the caller.
# #return(reference to a list of characters)
def createWorld():
    world = [] # Create a variable that refers to a 1D list.
    r = 0
    # Outer 'r' loop traverses the rows.
    # Each iteration of the outer loop creates a new 1D list.
    while (r < SIZE):
        tempRow = []
        world.append(tempRow) # Create new empty row
        c = 0
        # The inner 'c' loop traverses the columns of the newly
        # created 1D list creating and initializing each element
        # to the value returned by createElement()
        while (c < SIZE):
            element = createElement()
            tempRow.append(element)
            c = c + 1
        r = r + 1
    return(world)

# Shows the elements of the world. All the elements in a row will
# appear on the same line.
# #args(reference to a list of characters)
# Displays the 2D list with each row on a separate line.
# #return(nothing)
def display(world):
    print("OUR WORLD")
    print("========")
    r = 0
    while (r < SIZE): # Each iteration accesses a single row
        c = 0
        while (c < SIZE): # Each iteration accesses an element in a row
            print(world[r][c], end="")
            c = c + 1
        print() # Done displaying row, move output to the next line
        r = r + 1

# #args(none)
# #return(integer,integer)
def getEndPoint():
    #Declaring local variables
    endRow = -1
    endColumn = -1
    return(endRow,endColumn)

# Starting execution point for the program.
def start():
    world = createWorld()
    display(world)
    endRow,endColumn = getEndPoint()
    element,number = count(world,endRow,endColumn)
    print("# occurrences of %s=%d" %(element,number))
    clean(world,endRow,endColumn)
    display(world)

start()
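One likely reason the program "wouldn't even start" is the line `if world == POOP, POOP == CLEAN:` in clean(), which is not valid Python (and CLEAN is not defined; the constant is called CLEANED). A minimal sketch of what clean() might look like follows. It assumes that "cleaning" means replacing every POOP with CLEANED in the rows 0..endRow and columns 0..endColumn, inclusive, which the question does not state explicitly; the sample world is made up for illustration.

```python
POOP = "O"
CLEANED = "."

def clean(world, endRow, endColumn):
    """Replace POOP with CLEANED from (0, 0) up to and including (endRow, endColumn)."""
    print("Scooping the poop")
    for r in range(endRow + 1):
        for c in range(endColumn + 1):
            # Compare one element at a time, not the whole list
            if world[r][c] == POOP:
                world[r][c] = CLEANED

# Hypothetical 4x4 world for demonstration
world = [["O", " ", "P", "O"],
         ["T", "O", " ", " "],
         ["O", "O", "P", " "],
         [" ", "T", "O", "O"]]
clean(world, 1, 2)
for row in world:
    print("".join(row))
```

Because lists are mutable, the function changes the caller's world in place and does not need to return anything. Note that with the placeholder end point of (-1, -1) from getEndPoint(), both loops run zero times and nothing is cleaned.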
I made this program to solve a problem: reversing a number. For example, when 123 is the input, 321 should be the output.
#function to swap number positions on the array
def swapPositions(list, pos1, pos2):
    i = list[pos1]
    list[pos1] = list[pos2]
    list[pos2] = i

myList = []
theNum = int(input("enter the value"))
theNumInString = str(theNum)
#loop to separate numbers on the integer into each position of the array
for char in theNumInString:
    myList.append(char)
#this variable is to know how many times we should swap the positions
numofSwaps = len(myList) % 2
posi1 = 0
posi2 = len(myList) - 1
while numofSwaps != 0:
    swapPositions(myList, posi1, posi2)
    #I add one and subtract one from the positions so they move further to the middle to swap other positions
    posi1 += 1
    posi2 -= 1
    numofSwaps -= 1
number = "".join(myList)
print(number)
When I run the code with, for example, 123, it returns 321 as expected.
But here comes the problem: when I input 12345, the output is 52341, which only swaps the outer two digits.
This can be done without converting the number to a string, for example:
# note: this example works for positive numbers only
def reverseNum(x):
    y = 0
    while x > 0:
        y = y*10 + x%10
        x //= 10
    return y
>>> reverseNum(3124)
4213
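As for why the string-based attempt only swaps the outer digits: `len(myList) % 2` is the remainder of the length divided by 2, so it is 1 for any odd length (one swap) and 0 for any even length (no swaps at all). The number of swaps needed is half the length, rounded down, which is `len(myList) // 2`. A small sketch of the corrected approach (the function name reverse_digits is just for illustration):

```python
def reverse_digits(number):
    """Reverse the digits of a non-negative integer by swapping from both ends."""
    digits = list(str(number))
    left, right = 0, len(digits) - 1
    # Half the length (rounded down) is the number of swaps needed;
    # the middle digit of an odd-length number stays where it is.
    for _ in range(len(digits) // 2):
        digits[left], digits[right] = digits[right], digits[left]
        left += 1
        right -= 1
    return "".join(digits)

print(reverse_digits(12345))  # 54321
```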
As a newbie to Python, I've come across a task that I'm having trouble completing. I am supposed to create a new matrix based on an original one entered by the user, where each element is the number of adjacent elements that the corresponding element in the original matrix is greater than or equal to. Since English is not my native language, I might not have presented it properly, so here is an example:
input:
3x3 matrix
9 14 13
3 0 7
8 15 15
output:
3x3 matrix
2 5 2
1 0 1
2 5 3
So, if it isn't clear, the new matrix records, for each element of the original, how many of its adjacent elements it is greater than or equal to. 9 is greater than 3 and 0, so that results in "2"; 14 is greater than all the adjacent ones, so it prints out "5"; and so on. It also takes diagonals into consideration, of course.
So far, I've got the input of the original matrix working, but I'm unsure how to proceed further. I do not have access to university materials, professors or peer help, and my experimentation and online searching have been futile so far. I do not need a complete solution, but rather pointers and an explanation of the concepts.
This is the code so far:
# matrix input
rOne = int(input("Number of rows:"))
cOne = int(input("Number of columns:"))
# initialize matrix
matrixOne = []
print("Enter elements rowwise:")
# user input
for i in range(rOne): # for loop over the rows
    a =[]
    for j in range(cOne): # for loop over the columns
        a.append(int(input()))
    matrixOne.append(a)
# print matrix one
for i in range(rOne):
    for j in range(cOne):
        print(matrixOne[i][j], end = " ")
    print()
Here is a function that does exactly that:
def adj_matrix(matrixOne, rOne, cOne):
    new_lst = []
    for i in range(rOne):
        a = []
        for j in range(cOne):
            count = 0
            x, y = (i, j) # matrix values here
            cells = list(starmap(lambda a,b: (x+a, y+b), product((0,-1,+1), (0,-1,+1)))) # this will find all the adjacent indexes
            filtered_cell = [p for p in cells if (sum(p)>=0 and prod(p)>=0)] # this filters out all the negative indexes
            filtered_cell = [p for p in filtered_cell if p[0]<rOne and p[1]<cOne] # this filters out indexes that are beyond the matrix
            for z in filtered_cell:
                if matrixOne[i][j] >= matrixOne[z[0]][z[1]]:
                    count += 1
            a.append(count-1) # subtract 1 because the element was also compared with itself
        new_lst.append(a)
    return new_lst
It also needs these imports (math.prod requires Python 3.8 or later):
from itertools import product, starmap
from math import prod
Actually managed to solve the thing, feels amazing :D
def get_neighbor_elements(current_row, current_col, total_rows, total_cols):
    neighbors = []
    for i in [-1, 0, 1]: # previous row, same row, next row
        for j in [-1, 0, 1]: # previous column, same column, next column
            row = current_row + i
            col = current_col + j
            if row < 0 or col < 0 or row >= total_rows or col >= total_cols: # skip if outside the matrix bounds
                continue
            if row == current_row and col == current_col: # if the row and column are the same, skip (it is the element itself)
                continue
            neighbor = [row, col]
            neighbors.append(neighbor)
    return neighbors

def make_new_matrix(old_matrix):
    new_matrix = []
    for i in range(len(old_matrix)):
        new_matrix.append([])
    for i in range(len(old_matrix)): # iterate over the rows of the old matrix
        for j in range(len(old_matrix[i])): # iterate over the columns of the old matrix
            neighbors = get_neighbor_elements(i, j, len(old_matrix), len(old_matrix[i])) # get the neighbors
            count = 0
            for neighbor in neighbors: # now check whether the current element is greater than or equal to each neighbor
                if old_matrix[i][j] >= old_matrix[neighbor[0]][neighbor[1]]:
                    count += 1
            new_matrix[i].append(count)
    return new_matrix

def print_matrix(matrix):
    for i in range(len(matrix)):
        for j in range(len(matrix[i])):
            print(str(matrix[i][j]) + " ", end='')
        print()

if __name__ == '__main__':
    matrix = [[12, 10], [2, 10]]
    print("Old matrix: ")
    print_matrix(matrix)
    new_matrix = make_new_matrix(matrix)
    print("New matrix")
    print_matrix(new_matrix)
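For comparison, the same neighbor-counting idea can be written more compactly and checked against the 3x3 example from the question. This is only a sketch; the function name count_dominated_neighbors is made up here, and it assumes a non-empty rectangular matrix.

```python
def count_dominated_neighbors(matrix):
    """For each cell, count the adjacent cells (including diagonals) it is >= to."""
    rows, cols = len(matrix), len(matrix[0])
    result = []
    for i in range(rows):
        result_row = []
        for j in range(cols):
            count = sum(
                1
                for di in (-1, 0, 1)
                for dj in (-1, 0, 1)
                if (di, dj) != (0, 0)          # skip the cell itself
                and 0 <= i + di < rows         # stay inside the matrix
                and 0 <= j + dj < cols
                and matrix[i][j] >= matrix[i + di][j + dj]
            )
            result_row.append(count)
        result.append(result_row)
    return result

example = [[9, 14, 13], [3, 0, 7], [8, 15, 15]]
print(count_dominated_neighbors(example))
# [[2, 5, 2], [1, 0, 1], [2, 5, 3]]
```

The bounds checks come before the comparison in the `and` chain, so an out-of-range index is never used.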
I am trying to order data using multiple ranges. Let's suppose I have some data in an array tt:
n= 50
b = 20
r = 3
tt = np.array([[[3]*r]*b]*n)
and some other values in a list:
z = (np.arange(0,5,0.1)).tolist()
Now I need to sort the data from tt according to the range each value of z falls into: the first range is between 0 and 1, the next between 1 and 2, the next between 2 and 3, and so on.
My attempt so far was to create an array holding the length of each range, and to use those lengths to cut data from tt. It looks something like this:
za = []
za2 = []
za3 = []
za4 = []
za5 = []
za6 = []
za7 = []
for y in range(50):
    if 0 <= int(z[y]) < 1:
        za.append(z[y])
        zi = np.array([int(len(za))])
    if 1 <= int(z[y]) < 2:
        za2.append(z[y])
        zi2 = np.array([int(len(za2))])
    if 2 <= int(z[y]) < 3:
        za3.append(z[y])
        zi3 = np.array([int(len(za3))])
    if 3 <= int(z[y]) < 4:
        za4.append(z[y])
        zi4 = np.array([int(len(za4))])
    if 4 <= int(z[y]) < 5:
        za5.append(z[y])
        zi5 = np.array([int(len(za5))])
    if 5 <= int(z[y]) < 6:
        za6.append(z[y])
        zi6 = np.array([int(len(za6))])
    if 6 <= int(z[y]) < 7:
        za7.append(z[y])
        zi7 = np.array([int(len(za7))])
till = np.concatenate((np.array(zi), np.array(zi2), np.array(zi3), np.array(zi4), np.array(zi5), np.array(zi6), np.array(zi7)))
ttn = []
for p in range(50):
    #if hour_lenght[p] != []
    tt_h = np.empty(np.shape(tt[0:till[p],:,:]))
    tt_h[:] = np.nan
    tt_h = tt_h[np.newaxis,:,:,:]
    tt_h[np.newaxis,:,:,:] = tt[0:till[p],:,:]
    ttn.append(tt_h)
As you can guess, I get the error "name 'zi6' is not defined", since there is no data in that range. But at least it does the job for the parts that do exist :D. However, if I add an else statement after the if and do something like:
for y in range(50):
    if 0 <= int(z[y]) < 1:
        za.append(z[y])
        zi = np.array([int(len(za))])
    else:
        zi = np.array([np.nan])
my initial zi from the first part gets overwritten with nan.
I should also point out that the ultimate goal is to load multiple files that have a similar shape to tt (the last two dimensions are always the same while the first one changes), e.g.:
tt.shape
(50, 20, 3)
and some other tt2 has the shape:
tt2.shape
(55, 20, 3)
with z2 having values between 5 and 9:
z2 = (np.arange(5,9,0.1)).tolist()
So in the end I should end up with an array ttn where
ttn[0] is filled with values from tt in range between 0 and 1,
ttn[1] should be filled with values from tt in between 1 and 2 and so on.
I very much appreciate suggestions and possible solutions on this issue.
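One possible way to approach this is with boolean masks instead of one variable per range. This is only a sketch under two assumptions that the question does not spell out: each entry of z belongs to the slice of tt with the same index along the first axis, and the ranges are the unit intervals [k, k+1). The names split_by_unit_ranges and n_bins are made up for illustration.

```python
import numpy as np

n, b, r = 50, 20, 3
tt = np.array([[[3] * r] * b] * n)
z = np.arange(0, 5, 0.1)

def split_by_unit_ranges(data, values, n_bins):
    """Group slices of `data` (along axis 0) by the unit range [k, k+1) of `values`."""
    values = np.asarray(values)
    bins = np.floor(values).astype(int)   # 0.3 -> 0, 1.7 -> 1, ...
    # A range with no values yields an empty array instead of an undefined name
    return [data[bins == k] for k in range(n_bins)]

ttn = split_by_unit_ranges(tt, z, 7)
for k, group in enumerate(ttn):
    print(k, group.shape)
```

Ranges that contain no data (here 5 to 6 and 6 to 7) produce arrays of shape (0, 20, 3), so nothing needs to be filled with nan and no name is left undefined. The same function could be called with tt2 and z2 for another file.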
I am currently learning python. I do not want to use Biopython, or really any imported modules, other than maybe regex so I can understand what the code is doing.
From a genetic sequence alignment, I would like to find the location of the start and end positions of gaps/indels "-" that are next to each other within my sequences, the number of gap regions, and calculate the length of gap regions. For example:
>Seq1
ATC----GCTGTA--A-----T
I would like an output that may look something like this:
Number of gaps = 3
Index Position of Gap region 1 = 3 to 6
Length of Gap region 1 = 4
Index Position of Gap region 2 = 13 to 14
Length of Gap region 2 = 2
Index Position of Gap region 3 = 16 to 20
Length of Gap region 3 = 5
I have tried to figure this out on larger sequence alignments but I have not been able to even remotely figure out how to do this.
What you want is to use a regular expression to find each gap (one or more dashes, which translates to '-+'; the plus sign means one or more):
import re

seq = 'ATC----GCTGTA--A-----T'
matches = list(re.finditer('-+', seq))
print('Number of gaps =', len(matches))
print()
for region_number, match in enumerate(matches, 1):
    print('Index Position of Gap region {} = {} to {}'.format(
        region_number,
        match.start(),
        match.end() - 1))
    print('Length of Gap region {} = {}'.format(
        region_number,
        match.end() - match.start()))
    print()
Notes
matches is a list of match objects.
In order to get the region number, I used the function enumerate. You can look it up to see how it works.
The match object has many methods, but we are interested in .start(), which returns the start index, and .end(), which returns the end index. Note that the end index here is one more than what you want, thus I subtracted 1 from it.
Here is my suggestion of code, quite straightforward, short and easy to understand, with no imported package other than re:
import re

def findGaps(aSeq):
    # Get and print the list of gaps present in the sequence
    gaps = re.findall('[-]+', aSeq)
    print('Number of gaps = {0} \n'.format(len(gaps)))
    # Get and print start index, end index and length for each gap
    for i,gap in enumerate(gaps,1):
        startIndex = aSeq.index(gap)
        endIndex = startIndex + len(gap) - 1
        print('Index Position of Gap region {0} = {1} to {2}'.format(i, startIndex, endIndex))
        print('Length of Gap region {0} = {1} \n'.format(i, len(gap)))
        # Mask the gap just reported so that index() finds the next one
        aSeq = aSeq.replace(gap,'*' * len(gap), 1)

findGaps("ATC----GCTGTA--A-----T")
A bit of a longer-winded way about this than with regex, but you could find the indices of the hyphens and group them by using first differences:
>>> import numpy as np
>>> def get_seq_gaps(seq):
...     gaps = np.array([i for i, el in enumerate(seq) if el == '-'])
...     diff = np.cumsum(np.append([False], np.diff(gaps) != 1))
...     un = np.unique(diff)
...     yield len(un)
...     for i in un:
...         subseq = gaps[diff == i]
...         yield i + 1, len(subseq), subseq.min(), subseq.max()
>>> def report_gaps(seq):
...     gaps = get_seq_gaps(seq)
...     print('Number of gaps = %s\n' % next(gaps), sep='')
...     for (i, l, mn, mx) in gaps:
...         print('Index Position of Gap region %s = %s to %s' % (i, mn, mx))
...         print('Length of Gap Region %s = %s\n' % (i, l), sep='')
>>> seq = 'ATC----GCTGTA--A-----T'
>>> report_gaps(seq)
Number of gaps = 3
Index Position of Gap region 1 = 3 to 6
Length of Gap Region 1 = 4
Index Position of Gap region 2 = 13 to 14
Length of Gap Region 2 = 2
Index Position of Gap region 3 = 16 to 20
Length of Gap Region 3 = 5
First, this forms an array of the indices at which you have hyphens:
>>> gaps
array([ 3, 4, 5, 6, 13, 14, 16, 17, 18, 19, 20])
Places where the first differences are not 1 indicate breaks, and the cumulative sum of those breaks labels each group. A False is prepended to maintain the length.
>>> diff
array([0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 2])
Now take the unique elements of these groups, constrain gaps to the corresponding indices, and find its min/max.
This is my take on this problem:
import itertools

nucleotide='ATC----GCTGTA--A-----T'
# group the repeated positions
gaps = [(k, sum(1 for _ in vs)) for k, vs in itertools.groupby(nucleotide)]
# text formatting
summary_head = "Number of gaps = {0}"
summary_gap = """
Index Position of Gap region {0} = {2} to {3}
Length of Gap region {0} = {1}
"""
# Print output
print(summary_head.format(len([g for g in gaps if g[0]=="-"])))
gcount = 1 # this will count the gap number
position = 0 # this will make sure we know the position in the sequence
for i, g in enumerate(gaps):
    if g[0] == "-":
        gini = position # start position current gap
        gend = position + g[1] - 1 # end position current gap
        print(summary_gap.format(gcount, g[1], gini, gend))
        gcount+=1
    position += g[1]
This generates your expected output:
# Number of gaps = 3
# Index Position of Gap region 1 = 3 to 6
# Length of Gap region 1 = 4
# Index Position of Gap region 2 = 13 to 14
# Length of Gap region 2 = 2
# Index Position of Gap region 3 = 16 to 20
# Length of Gap region 3 = 5
EDIT: ALTERNATIVE WITH PANDAS
import itertools
import pandas as pd
nucleotide='ATC----GCTGTA--A-----T'
# group the repeated positions
gaps = pd.DataFrame([(k, sum(1 for _ in vs)) for k, vs in itertools.groupby(nucleotide)])
gaps.columns = ["type", "length"]
gaps["ini"] = gaps["length"].cumsum() - gaps["length"]
gaps["end"] = gaps["ini"] + gaps["length"] - 1
gaps = gaps[gaps["type"] == "-"]
gaps.index = range(1, gaps.shape[0] + 1)
summary_head = "Number of gaps = {0}"
summary_gap = """
Index Position of Gap region {0} = {1[ini]} to {1[end]}
Length of Gap region {0} = {1[length]}
"""
print(summary_head.format(gaps.shape[0]))
for index, row in gaps.iterrows():
    print(summary_gap.format(index, row))
This alternative has the benefit that if you are analyzing multiple sequences you can add the sequence identifier as an extra column and have all the data from all your sequences in a single data structure; something like this:
import itertools
import pandas as pd
nucleotides=['>Seq1\nATC----GCTGTA--A-----T',
'>Seq2\nATCTCC---TG--TCGGATG-T']
all_gaps = []
for nucleoseq in nucleotides:
    seqid, nucleotide = nucleoseq[1:].split("\n")
    gaps = pd.DataFrame([(k, sum(1 for _ in vs)) for k, vs in itertools.groupby(nucleotide)])
    gaps.columns = ["type", "length"]
    gaps["ini"] = gaps["length"].cumsum() - gaps["length"]
    gaps["end"] = gaps["ini"] + gaps["length"] - 1
    gaps = gaps[gaps["type"] == "-"]
    gaps.index = range(1, gaps.shape[0] + 1)
    gaps["seqid"] = seqid
    all_gaps.append(gaps)
all_gaps = pd.concat(all_gaps)
print(all_gaps)
will generate a data container with:
  type  length  ini  end seqid
1    -       4    3    6  Seq1
2    -       2   13   14  Seq1
3    -       5   16   20  Seq1
1    -       3    6    8  Seq2
2    -       2   11   12  Seq2
3    -       1   20   20  Seq2
that you can format afterwards like:
for k in all_gaps["seqid"].unique():
    seqg = all_gaps[all_gaps["seqid"] == k]
    print(">{}".format(k))
    print(summary_head.format(seqg.shape[0]))
    for index, row in seqg.iterrows():
        print(summary_gap.format(index, row))
which can look like:
>Seq1
Number of gaps = 3
Index Position of Gap region 1 = 3 to 6
Length of Gap region 1 = 4
Index Position of Gap region 2 = 13 to 14
Length of Gap region 2 = 2
Index Position of Gap region 3 = 16 to 20
Length of Gap region 3 = 5
>Seq2
Number of gaps = 3
Index Position of Gap region 1 = 6 to 8
Length of Gap region 1 = 3
Index Position of Gap region 2 = 11 to 12
Length of Gap region 2 = 2
Index Position of Gap region 3 = 20 to 20
Length of Gap region 3 = 1
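Since the question asked to avoid imported modules where possible, the gap regions can also be found with a plain loop and no imports at all. This is a minimal sketch, not taken from any of the answers above; the function name find_gap_regions is made up for illustration.

```python
def find_gap_regions(seq, gap_char="-"):
    """Return a list of (start, end, length) tuples using inclusive 0-based indices."""
    regions = []
    start = None  # index where the current gap began, or None if not inside a gap
    for i, ch in enumerate(seq):
        if ch == gap_char:
            if start is None:
                start = i
        elif start is not None:
            # The gap ended at the previous character
            regions.append((start, i - 1, i - start))
            start = None
    if start is not None:
        # The sequence ended while still inside a gap
        regions.append((start, len(seq) - 1, len(seq) - start))
    return regions

regions = find_gap_regions("ATC----GCTGTA--A-----T")
print("Number of gaps =", len(regions))
for number, (start, end, length) in enumerate(regions, 1):
    print("Index Position of Gap region {} = {} to {}".format(number, start, end))
    print("Length of Gap region {} = {}".format(number, length))
```

Returning the regions as tuples, rather than printing inside the function, keeps the scanning logic reusable for larger alignments.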