flip and swap alternate column in 2d array - python

This is my 2d list
array = ([a,b, c,d, e,f ,g,h], [s,t, u,v, w,x ,y,z])
I would like the output to be
array = ([g,h, e,f, c,d, a,b] ,[y,z, w,x, u,v, s,t])
I was using the line below to flip the array, but I am stuck on how to do the alternate swap. Can anyone help? Thanks.
array = np.flip(array,axis=1)

You can play with reshape:
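import numpy as np
array = np.array([['a','b','c','d','e','f','g','h'],
                  ['s','t','u','v','w','x','y','z']])
x, y = array.shape
# group the columns into pairs, flip the order of the pairs, then restore the shape
np.flip(array.reshape((x, -1, 2)), axis=1).reshape(x, y)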
output:
array([['g', 'h', 'e', 'f', 'c', 'd', 'a', 'b'],
['y', 'z', 'w', 'x', 'u', 'v', 's', 't']], dtype='<U1')
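To see why the reshape works, it may help to wrap the same steps in a small helper. This is only a sketch: `reverse_blocks` and its `block` parameter are names made up here for illustration, not part of NumPy.

```python
import numpy as np

def reverse_blocks(arr, block=2):
    """Reverse the order of fixed-width column blocks, keeping each block intact."""
    rows, cols = arr.shape
    if cols % block:
        raise ValueError("number of columns must be a multiple of block")
    # (rows, cols) -> (rows, n_blocks, block): axis 1 now indexes whole blocks
    grouped = arr.reshape(rows, -1, block)
    # Flip only the block axis, then flatten back to the original shape
    return np.flip(grouped, axis=1).reshape(rows, cols)

array = np.array([list("abcdefgh"), list("stuvwxyz")])
result = reverse_blocks(array)
print(result)
```

With `block=2` this reproduces the output above; a different block width should work the same way as long as it divides the number of columns.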

Related

How to add vectors to an ascii grid?

route001 = (3, 12, 'S', 'S', 'W', 'S', 'S', 'S', 'E', 'E', 'E', 'S', 'S', 'W',
'W', 'S', 'E', 'E', 'E', 'E', 'N', 'N', 'N', 'N', 'W', 'N', 'N',
'E', 'E', 'S', 'E', 'S', 'E', 'S', 'S', 'W', 'S', 'S', 'S', 'S',
'S', 'E', 'N', 'E', 'E')
start = [route001[0]] + [route001[1]]
directions = route001[2:]
coordinates = {"N": [0, 1], 'E': [1, 0], 'S': [0, -1], 'W': [-1, 0]}
vector_list = []
for d in directions:
    dx, dy = coordinates[d]
    start[0] += dx
    start[1] += dy
    vector_list.append(start.copy())
start_route = vector_list[0]
end_route = vector_list[-1]
My aim is to have route001 plotted on an ASCII grid. The start of my code above turns N, E, S, W into vectors that are sequentially added to the variable vector_list, i.e. if I print vector_list it returns [3, 12], [3, 11], etc.
My question is: how do I now get these vectors to display on a simple ASCII grid such as the following?
for x in range(10):
    print('- : ' * 10)
I would like it to either A) cycle through my vectors and plot them one at a time, i.e. display the first vector on the screen, wait a second, then plot the second but not the first, then the third but not the second, and so forth; or B) do the same as A but keep the previous vectors, i.e. plot the first, then the first and second, then the first, second and third, and so on.
I think there might have to be some sort of loop that iterates through the ASCII grid, inputting the values from vector_list in order. I also suspect there is a much more elegant way to create the grid.
Please be patient; the above code has taken me a long time to write, with a lot of help from SO and trial and error. Many thanks.
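One possible way to approach options A and B is sketched below. It is an assumption-laden illustration rather than an accepted answer: `render_grid` and `animate` are hypothetical helper names, the grid is taken to have (0, 0) at the bottom-left so that N points up, and the width and height must be chosen large enough to cover the route's coordinates (points outside the grid are simply not drawn).

```python
import time

def render_grid(points, width, height, empty="-", mark="X"):
    """Return the grid as a string; (0, 0) is assumed to be the bottom-left cell."""
    marked = {tuple(p) for p in points}
    rows = []
    for y in range(height - 1, -1, -1):  # build the top row first so north is up
        rows.append(" ".join(mark if (x, y) in marked else empty
                             for x in range(width)))
    return "\n".join(rows)

def animate(vector_list, width, height, cumulative=False, delay=1.0):
    """cumulative=False is option A (one point per frame); True is option B (keep the trail)."""
    for i, point in enumerate(vector_list):
        shown = vector_list[:i + 1] if cumulative else [point]
        print(render_grid(shown, width, height))
        print()
        time.sleep(delay)

# Small made-up route, used only to show a single frame
demo = [[0, 2], [0, 1], [1, 1]]
frame = render_grid(demo, width=3, height=3)
print(frame)
```

Calling `animate(vector_list, width, height)` would print one frame per second for option A, and passing `cumulative=True` would give option B.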

How to reduce memory consumption when performing a cartesian product?

Given a 2d matrix such as [[a,b,c],[d,e,f],...], I want to perform a Cartesian product of the matrix so I can determine all the possible combinations.
For this particular constraint, when I am using a 2d matrix with 12 different subsets, it uses more than the 16 megabytes of allotted memory I have. There are three values in each subset, so I would have 3^12 = 531,441 different combinations.
The cartesian product function that I am using is:
def cartesian_iterative(pools):
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    return result
I would like to know how I could reduce memory consumption without using any external libraries. An example 2d array I would be working with is [['G', 'H', 'I'], ['M', 'N', 'O'], ['D', 'E', 'F'], ['D', 'E', 'F'], ['P', 'R', 'S'], ['D', 'E', 'F'], ['M', 'N', 'O'], ['D', 'E', 'F'], ['D', 'E', 'F'], ['M', 'N', 'O'], ['A', 'B', 'C'], ['D', 'E', 'F']]
EDIT:
For reference, a link to the problem statement can be found here Problem Statement. Here is the link to the file of possible names Acceptable Names.
The final code:
with open('namenum.in','r') as fin:
    num = str(fin.readline().strip()) #the number being used to determine all combinations
numCount = []
for i in range(len(num)):
    numCount.append(dicti[num[i]]) #creates a 2d array where each number in the initial 'num' has a group of three letters
def cartesian_iterative(pools): #returns the product of a 2d array
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    return result
pos = set() #set of possible names
if len(num) == 12: #only uses more than the allocated memory when the num is 12 digits long.
    '''
    This optimization allows the product to only calculate 2 * 3^6 values, instead of 3**12. This saves a lot of memory
    '''
    rights = cartesian_iterative(numCount[6:])
    for left in cartesian_iterative(numCount[:6]):
        for right in rights:
            a = ''.join(left+right)
            if a in names:
                pos.add(a) #adding name to set
else: #if len(num) < 12, you do not need any other optimizations and can just return normal product
    for i in cartesian_iterative(numCount):
        a = ''.join(i)
        if a in names:
            pos.add(a)
pos = sorted(pos)
with open('namenum.out','w') as fout: #outputting all possible names
    if len(pos) > 0:
        for i in pos:
            fout.write(i)
            fout.write('\n')
    else:
        fout.write('NONE\n')
You could use that function on the left and right halves separately. Then you'd only hold 2×3^6 = 1,458 combinations in memory instead of 3^12 = 531,441. Each of them is also only half as long, which roughly cancels that factor of 2.
for left in cartesian_iterative(pools[:6]):
    for right in cartesian_iterative(pools[6:]):
        print(left + right)
Output:
['G', 'M', 'D', 'D', 'P', 'D', 'M', 'D', 'D', 'M', 'A', 'D']
['G', 'M', 'D', 'D', 'P', 'D', 'M', 'D', 'D', 'M', 'A', 'E']
['G', 'M', 'D', 'D', 'P', 'D', 'M', 'D', 'D', 'M', 'A', 'F']
['G', 'M', 'D', 'D', 'P', 'D', 'M', 'D', 'D', 'M', 'B', 'D']
...
To be faster, compute the right combinations only once:
rights = cartesian_iterative(pools[6:])
for left in cartesian_iterative(pools[:6]):
    for right in rights:
        print(left + right)
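Another option, offered here only as a sketch, is to avoid building any full list and yield one combination at a time with a generator. `cartesian_lazy` is a name invented for this example; it trades some extra computation (the tails are regenerated for every leading item) for keeping only one combination in memory at a time, and it uses no libraries.

```python
def cartesian_lazy(pools):
    """Yield combinations one by one instead of returning the whole product."""
    if not pools:
        yield []
        return
    first, rest = pools[0], pools[1:]
    for item in first:
        # Recurse on the remaining pools; each tail is produced lazily
        for tail in cartesian_lazy(rest):
            yield [item] + tail

pools = [['G', 'H', 'I'], ['M', 'N', 'O'], ['D', 'E', 'F']]
combos = cartesian_lazy(pools)
first_three = [next(combos) for _ in range(3)]
print(first_three)
```

The combinations come out in the same order as with cartesian_iterative, so each one can be joined and checked against the set of names as soon as it is produced.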

How to convert multiple fasta lines in a matrix in python?

I have a file (txt or fasta) like this. Each sequence is located only in a single line.
>Line1
ATCGCGCTANANAGCTANANAGCTAGANCACGATAGAGAGAGACTATAGC
>Line2
ATTGCGCTANANAGCTANANCGATAGANCACGAAAGAGATAGACTATAGC
>Line3
ATCGCGCTANANAGCTANANGGCTAGANCNCGAAAGNGATAGACTATAGC
>Line4
ATTGCGCTANANAGCTANANGGATAGANCACGAGAGAGATAGACTATAGC
>Line5
ATTGCGCTANANAGCTANANCGATAGANCACGATNGAGATAGACTATAGC
I have to get a matrix in which each position corresponds to one of the letters (nucleotides) of the sequences. In this case, that is a 5x50 matrix.
I've been trying numpy methods without success. I hope someone can help me.
If you are working with DNA sequence data in python, I would recommend using the Biopython library. You can install it with pip install biopython.
Here is how you would achieve your desired result:
from Bio import SeqIO
import os
import numpy as np
pathToFile = os.path.join("C:\\","Users","Kevin","Desktop","test.fasta") #windows machine
allSeqs = []
for seq_record in SeqIO.parse(pathToFile, "fasta"):
    allSeqs.append(seq_record.seq)
seqMat = np.array(allSeqs)
But in the for loop, each seq_record.seq is a Seq object, giving you the flexibility to perform operations on them.
In [5]: seqMat.shape
Out[5]: (5L, 50L)
You can slice your seqMat array however you like.
In [6]: seqMat[0]
Out[6]: array(['A', 'T', 'C', 'G', 'C', 'G', 'C', 'T', 'A', 'N', 'A', 'N', 'A',
'G', 'C', 'T', 'A', 'N', 'A', 'N', 'A', 'G', 'C', 'T', 'A', 'G',
'A', 'N', 'C', 'A', 'C', 'G', 'A', 'T', 'A', 'G', 'A', 'G', 'A',
'G', 'A', 'G', 'A', 'C', 'T', 'A', 'T', 'A', 'G', 'C'],
dtype='|S1')
Highly recommend checking out the tutorial though!
I hope this short bit of code helps. You basically need to split the string into a character array. After that you just put everything into a matrix.
import numpy as np
Line1 = "ATGC"
Line2 = "GCTA"
# list(s) splits a string into single characters; both rows go into one outer list
Matr1 = np.array([list(Line1), list(Line2)])
Matr1[0,0] will return the first element in your matrix ('A').
One way of achieving the matrix is to read the content of the file and convert it into a list where each element is the sequence found on one line. You can then access your matrix as a 2D data structure.
Ex: [ATCGCGCTANANAGCTANANAGCTAGANCACGATAGAGAGAGACTATAGC, ATCGCGCTANANAGCTANANAGCTAGANCACGATAGAGAGAGACTATAGC, ATCGCGCTANANAGCTANANAGCTAGANCACGATAGAGAGAGACTATAGC, ATCGCGCTANANAGCTANANAGCTAGANCACGATAGAGAGAGACTATAGC, ATCGCGCTANANAGCTANANAGCTAGANCACGATAGAGAGAGACTATAGC]
filePath = "file path containing the sequence"
# List that stores the sequences as a matrix (empty lines and ">" header lines are skipped)
listFasta = [line for line in open(filePath).read().split("\n") if line and not line.startswith(">")]
for seq in listFasta:
    for charac in seq:
        print(charac)
# Another way to access each element of your matrix
for seq in range(len(listFasta)):
    for ch in range(len(listFasta[seq])):
        print(listFasta[seq][ch])
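If a real numpy array is wanted without Biopython, the same idea can be sketched as follows. `fasta_to_matrix` is a hypothetical helper name, and the code assumes each sequence sits on a single line and all sequences have the same length, as in the question; an in-memory `io.StringIO` stands in for the file.

```python
import io
import numpy as np

def fasta_to_matrix(handle):
    """Build an (n_sequences, sequence_length) character array from single-line FASTA."""
    sequences = [line.strip() for line in handle
                 if line.strip() and not line.startswith(">")]
    # list(seq) splits each string into single characters, giving one row per sequence
    return np.array([list(seq) for seq in sequences])

# Shortened made-up sample; with a real file, pass open(path) instead
sample = io.StringIO(">Line1\nATCG\n>Line2\nATTG\n")
matrix = fasta_to_matrix(sample)
print(matrix.shape)
```

For the five 50-letter sequences in the question this should give a 5x50 array that can be indexed as `matrix[row, column]`.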

Subtracting arrays from arrays with sub-arrays in Python

In my Python programming course, we are discussing how to manipulate (add,subtract,etc.) arrays and sub-arrays. An example from class was if we had
ourArray = [['a','b','c'],['e','f','g','h'],['i','j'],['k','l','m','n','o','p']...]
and Array = ['a','e','i','k',...], would something like ourArray-Array be possible?
I tried
for w in ourArray:
    w[0] - Array[0]
In the end, what I would like is
[['a','b','c'],['e','f','g','h'],['i','j'],['k','l','m','n','o','p']...] - ['a','e','i','k',...] = [['b','c'],['f','g','h'],['j'],['l','m','n','o','p']...].
Also, I am using Python 3 in Windows.
How about this Pythonic one-line list comprehension:
>>> ourArray = [['a','b','c'],['e','f','g','h'],['i','j'],['k','l','m','n','o','p']]
>>> Array = ['a','e','i','k']
>>> [[item for item in arr if item not in Array] for arr in ourArray]
[['b', 'c'], ['f', 'g', 'h'], ['j'], ['l', 'm', 'n', 'o', 'p']]
For each array in ourArray, take only the items that are not in Array.
You can always do the brute force method
>>> ourArray = [['a','b','c'],['e','f','g','h'],['i','j'],['k','l','m','n','o','p']]
>>> Array = ['a','e','i','k']
>>> for i in ourArray:
...     for j in i[:]:  # iterate over a copy, because i is modified inside the loop
...         if j in Array:
...             i.remove(j)
...
>>> ourArray
[['b', 'c'], ['f', 'g', 'h'], ['j'], ['l', 'm', 'n', 'o', 'p']]
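For longer lists, a variation on the list comprehension is to convert the items to remove into a set first, so each membership test does not scan the whole list. The following is a small sketch of that idea; `subtract_items` is a name chosen here for illustration.

```python
def subtract_items(nested, to_remove):
    """Return new sub-lists without any item that appears in to_remove."""
    removed = set(to_remove)  # set lookups avoid scanning a list for every item
    return [[item for item in sub if item not in removed] for sub in nested]

our_array = [['a', 'b', 'c'], ['e', 'f', 'g', 'h'], ['i', 'j'],
             ['k', 'l', 'm', 'n', 'o', 'p']]
result = subtract_items(our_array, ['a', 'e', 'i', 'k'])
print(result)
```

Unlike the loop with remove(), this leaves the original lists unchanged and builds new ones.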

Unpack an array filled with lists

I started with something like this:
[a,b,s,d]
[k,e,f,s,d]
[o,w,g]
Then I wanted to rearrange them by length in descending order so that I get this:
[k,e,f,s,d]
[a,b,s,d]
[o,w,g]
However, to do that, I appended each of those into an array as such:
arr = [[a,b,s,d], [k,e,f,s,d], [o,w,g]]
so that I could just use:
sorted(arr, key=len).reverse()
But now I can't unpack arr to just get:
[k,e,f,s,d]
[a,b,s,d]
[o,w,g]
Ideas?
Thanks.
reverse() is an in-place function:
arr = [['a','b','s','d'], ['k','e','f','s','d'], ['o','w','g']]
a = sorted(arr, key=len)
print(a)
# [['o', 'w', 'g'], ['a', 'b', 's', 'd'], ['k', 'e', 'f', 's', 'd']]
print(a.reverse())
# None
print(a)
# [['k', 'e', 'f', 's', 'd'], ['a', 'b', 's', 'd'], ['o', 'w', 'g']]
reverse() returns None, but it does reverse the list in place.
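As a short sketch of an alternative, `sorted` accepts a `reverse=True` argument, so the descending order can be obtained in one call and each sub-list printed on its own line. The name `by_length` is just illustrative.

```python
arr = [['a', 'b', 's', 'd'], ['k', 'e', 'f', 's', 'd'], ['o', 'w', 'g']]

# sorted() returns a new list, so its result can be assigned and used directly
by_length = sorted(arr, key=len, reverse=True)

# "Unpack" the outer list by printing one sub-list per line
for sub in by_length:
    print(sub)
```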
