How to iterate over positions through NumPy - Python

I am wondering if there is a way to iterate over the individual positions in a list of sequences using NumPy. For example, if I had a list of sequences:
a = ['AGHT','THIS','OWKF']
The function would go through each individual character by position. So the first sequence, 'AGHT', would be broken down into 'A','G','H','T'. The ultimate goal is to create individual grids based on character abundance in each of these sequences. So far I have only been able to write a loop that goes through each character, but I need this in NumPy:
import numpy as np
b = np.array(a)
for c in b:
    for d in c:
        print(d)
I would prefer this in NumPy, but if there are other ways I would like to know as well. Thanks!

list() expands a string into a list of its characters:
In [406]: a = ['AGHT','THIS','OWKF']
In [407]: [list(item) for item in a]
Out[407]: [['A', 'G', 'H', 'T'], ['T', 'H', 'I', 'S'], ['O', 'W', 'K', 'F']]

You can use join() to join the list into a single string of characters, then iterate over each character or print it like this:
>>> a = ['AGHT','THIS','OWKF']
>>> print(''.join(a))
AGHTTHISOWKF
Or to turn it into an array of individual characters:
>>> out = ''.join(a)
>>> b = np.array(list(out))
>>> b
array(['A', 'G', 'H', 'T', 'T', 'H', 'I', 'S', 'O', 'W', 'K', 'F'],
      dtype='<U1')
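Since the stated goal is a grid based on character abundance, it may help to keep the positions as a 2D array rather than a flat one. The following is only a sketch under the assumption that all sequences have the same length; the names grid and abundance are illustrative and not from the question.

```python
import numpy as np

a = ['AGHT', 'THIS', 'OWKF']

# One row per sequence, one column per position
# (this assumes every sequence has the same length).
grid = np.array([list(seq) for seq in a])

# Iterating over the transposed grid visits one position at a time.
for position, column in enumerate(grid.T):
    print(position, column)

# Count how often each character occurs across all sequences.
letters, counts = np.unique(grid, return_counts=True)
abundance = dict(zip(letters.tolist(), counts.tolist()))
print(abundance)
```

Here grid has shape (3, 4), so grid[:, 0] holds the first character of every sequence and grid[0] holds the characters of the first sequence.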

Related

Why does Python's map() function swap values position in each return?

I'm studying Python and trying to learn how to use the map() function.
I had the idea of replacing each letter in a string with the next letter in the alphabet, e.g. abc -> bcd.
I wrote the following code:
m = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
def func(s):
    return m[m.index(s) + 1]
l = "abc"
print(set(map(func, l)))
But every execution returns a different order for the letters.
I got the expected answer by using:
l2 = [func(i) for i in l]
print(l2)
But I wanted to understand the map() function and how it works. I tried to read the documentation but could not understand much.
Sorry about my bad English and my lack of experience in Python :/
It is because you are converting to set in set(map(func, l)) and set is an unordered collection in Python.
From docs:
A set object is an unordered collection of distinct hashable objects. ... Being an unordered collection, sets do not record element position or order of insertion. Accordingly, sets do not support indexing, slicing, or other sequence-like behavior.
If you replace print(set(map(func, l))) with print(list(map(func, l))), you will no longer see this behavior, because a list keeps its elements in order.
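A minimal sketch of the difference, collecting the same map() result once as a list and once as a set. The helper shift and the wrap-around from 'z' to 'a' are assumptions added for illustration; the func in the question would raise an IndexError on 'z'.

```python
import string

alphabet = string.ascii_lowercase

def shift(letter):
    # Wrap around so that 'z' maps back to 'a' (an assumption, not in the question).
    return alphabet[(alphabet.index(letter) + 1) % len(alphabet)]

as_list = list(map(shift, "abc"))  # elements stay in input order
as_set = set(map(shift, "abc"))    # same elements, but no guaranteed order

print(as_list)                     # ['b', 'c', 'd']
print(''.join(map(shift, "abc")))  # bcd
```

map() itself yields results in the order of its input; the order is only lost once the results are put into a set.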

Split list of lines into 2d array

I have a set of sequences in a list which looks like this:
[agghd,gjg,tomt]
How can I split it so that my output looks like the following:
[[a,g,g,h,d],[g,j,g],[t,o,m,t]]
I have written the following code so far:
agghd
gjh
tomt
list2=[]
list2 = [str(sequences.seq).split() for sequences in family]
You can split a string into characters by calling list() on it:
list1 = ['agghd', 'gjg', 'tomt']
list2 = [list(string) for string in list1]
# output: [['a', 'g', 'g', 'h', 'd'], ['g', 'j', 'g'], ['t', 'o', 'm', 't']]
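For context, here is a small sketch of why the split() call in the question does not produce single characters: with no argument, str.split() splits on whitespace only, so a word without spaces comes back as a single item. The variable names are illustrative.

```python
word = 'agghd'

# split() with no argument splits on whitespace, so the word stays whole.
whole = word.split()
print(whole)   # ['agghd']

# list() iterates over the string and yields one item per character.
chars = list(word)
print(chars)   # ['a', 'g', 'g', 'h', 'd']
```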
You can try:
[list(str(sequences.seq)) for sequences in family]

How can I split a list in two unique lists in Python?

Hi, I have a list as follows:
listt = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o']
It has 15 members.
I want to turn it into 3 lists. I used this code and it worked, but I want unique lists; this gives me 3 lists that share members.
import random
listt = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o']
print(random.sample(listt,5))
print(random.sample(listt,5))
print(random.sample(listt,5))
Try this:
from random import shuffle
def randomise():
    listt = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o']
    shuffle(listt)
    return listt[:5], listt[5:10], listt[10:]
print(randomise())
This will print (for example, since it is random):
(['i', 'k', 'c', 'b', 'a'], ['d', 'j', 'h', 'n', 'f'], ['e', 'l', 'o', 'g', 'm'])
If it doesn't matter to you which items go in each list, then you're better off partitioning the list into thirds:
In [23]: L = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o']
In [24]: size = len(L)
In [25]: L[:size//3]
Out[25]: ['a', 'b', 'c', 'd', 'e']
In [26]: L[size//3:2*size//3]
Out[26]: ['f', 'g', 'h', 'i', 'j']
In [27]: L[2*size//3:]
Out[27]: ['k', 'l', 'm', 'n', 'o']
If you want them to have random elements from the original list, you'll just need to shuffle the input first:
random.shuffle(L)
Instead of sampling your list three times, which will always give you three independent results where individual members may be selected for more than a single list, you could just shuffle the list once and then split it in three parts. That way, you get three random subsets that will not share any items:
>>> random.shuffle(listt)
>>> listt[0:5]
['b', 'a', 'f', 'e', 'h']
>>> listt[5:10]
['c', 'm', 'g', 'j', 'o']
>>> listt[10:15]
['d', 'l', 'i', 'n', 'k']
Note that random.shuffle will shuffle the list in place, so the original list is modified. If you don’t want to modify the original list, you should make a copy first.
If your list is larger than the desired result set, then of course you can also sample your list once with the combined result size and then split the result accordingly:
>>> sample = random.sample(listt, 5 * 3)
>>> sample[0:5]
['h', 'm', 'i', 'k', 'd']
>>> sample[5:10]
['a', 'b', 'o', 'j', 'n']
>>> sample[10:15]
['c', 'l', 'f', 'e', 'g']
This solution will also avoid modifying the original list, so you will not need a copy if you want to keep it as it is.
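The sample-once-then-slice idea can be wrapped in a small helper. This is only a sketch: the function name split_into_groups and its group_size parameter are hypothetical, not part of any answer above.

```python
import random

def split_into_groups(items, group_size):
    # random.sample with k == len(items) returns a shuffled copy,
    # so the caller's list is left untouched.
    shuffled = random.sample(items, len(items))
    # Slice the shuffled copy into consecutive, non-overlapping chunks.
    return [shuffled[i:i + group_size]
            for i in range(0, len(shuffled), group_size)]

listt = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o']
groups = split_into_groups(listt, 5)
for group in groups:
    print(group)
```

Because every element of the shuffled copy lands in exactly one slice, the groups cannot share members. If the list length is not a multiple of group_size, the last group is simply shorter.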
Use [:] to slice all members out of the list, which copies everything into a new list object. Alternatively, use list(<list>), which also makes a copy:
print(random.sample(listt[:],5))
In case you want to sample only once, store the result in a variable and copy it later:
output = random.sample(listt,5)
first = output[:]
second = output[:]
print(first is second, first is output) # False False
The original list can then be modified without first or second being modified.
For nested lists you might want to use copy.deepcopy().
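A short sketch of why nested lists need copy.deepcopy(): a slice copy creates a new outer list but still shares the inner lists. The variable names are illustrative.

```python
import copy

nested = [['a', 'b'], ['c']]

shallow = nested[:]           # new outer list, same inner list objects
deep = copy.deepcopy(nested)  # inner lists are copied as well

nested[0].append('z')

print(shallow[0])  # ['a', 'b', 'z'] - the inner list is shared
print(deep[0])     # ['a', 'b'] - the deep copy is unaffected
```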

Change the way sorted works in Python (different than alphanumeric)

I'm representing cards in poker as letters (lower and uppercase) in order to store them efficiently. I basically now need a custom sorting function to allow calculations with them.
What is the fastest way to sort letters in Python using
['a', 'n', 'A', 'N', 'b', 'o', ....., 'Z']
as the ranks rather than
['A', 'B', 'C', 'D', 'E', 'F', ....., 'z']
which is the default?
Note, this sorting is derived from:
import string
c = string.ascii_letters[:13]    # clubs
d = string.ascii_letters[13:26]  # diamonds
h = string.ascii_letters[26:39]  # hearts
s = string.ascii_letters[39:]    # spades
# 'a' = 2 of clubs
# 'n' = 2 of diamonds
# 'A' = 2 of hearts
# 'N' = 2 of spades
# etc.
You can provide a key function to sorted(). This function is called for each element in the iterable, and its return value is used for sorting instead of the element's value.
In this case it might look something like the following:
order = ['a', 'n', 'A', 'N', 'b', 'o', ....., 'Z']
sorted_list = sorted(some_list, key=order.index)
Here is a brief example to illustrate this:
>>> order = ['a', 'n', 'A', 'N']
>>> sorted(['A', 'n', 'N', 'a'], key=order.index)
['a', 'n', 'A', 'N']
Note that to make this more efficient you may want to use a dictionary lookup for your key function instead of order.index, for example:
order = ['a', 'n', 'A', 'N', 'b', 'o', ....., 'Z']
order_dict = {x: i for i, x in enumerate(order)}
sorted_list = sorted(some_list, key=order_dict.get)
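As a sketch, the full ranking could be generated rather than typed out. This assumes the order interleaves the four suit strings rank by rank, which is inferred from the first entries shown in the question ('a', 'n', 'A', 'N', 'b', 'o') and not confirmed by the asker; sort_cards is a hypothetical helper name.

```python
import string

letters = string.ascii_letters  # 'a'..'z' followed by 'A'..'Z'
clubs, diamonds = letters[:13], letters[13:26]
hearts, spades = letters[26:39], letters[39:]

# Assumed ranking: for each rank, clubs, diamonds, hearts, then spades.
order = [card
         for rank in zip(clubs, diamonds, hearts, spades)
         for card in rank]

# Dictionary lookup is O(1) per card, unlike order.index.
order_dict = {card: i for i, card in enumerate(order)}

def sort_cards(cards):
    return sorted(cards, key=order_dict.__getitem__)

print(sort_cards(['A', 'n', 'N', 'a']))  # ['a', 'n', 'A', 'N']
```

Using order_dict.__getitem__ instead of order_dict.get makes an unknown card raise a KeyError instead of silently sorting by None.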
Use numbers ordered by value internally, and only convert them to letters when displaying them.
Edit: if 1-byte values are required, you can keep the cards as characters with codes in the range 1 to 52 and, again, convert them to the proper letters when displaying and storing them.

Split words in a nested list into letters

I was wondering how I can split words in a nested list into their individual letters such that
[['ANTT'], ['XSOB']]
becomes
[['A', 'N', 'T', 'T'], ['X', 'S', 'O', 'B']]
Use a list comprehension:
[list(l[0]) for l in mylist]
Your input list simply contains nested lists with 1 element each, so we need to use l[0] on each element. list() on a string creates a list of the individual characters:
>>> mylist = [['ANTT'], ['XSOB']]
>>> [list(l[0]) for l in mylist]
[['A', 'N', 'T', 'T'], ['X', 'S', 'O', 'B']]
If you ever fix your code to produce a straight up list of strings (so without the single-element nested lists), you only need to remove the [0]:
>>> mylist = ['ANTT', 'XSOB']
>>> [list(l) for l in mylist]
[['A', 'N', 'T', 'T'], ['X', 'S', 'O', 'B']]
You could use a functional approach (though I would still prefer list comprehensions, as in Martijn Pieters' answer). In Python 3, map() returns an iterator, so wrap it in list() to see the result:
>>> from operator import itemgetter
>>> delisted = list(map(itemgetter(0), [['ANTT'], ['XSOB']])) # -> ['ANTT', 'XSOB']
>>> splited = list(map(list, delisted)) # -> [['A', 'N', 'T', 'T'], ['X', 'S', 'O', 'B']]
Or, as a one-liner:
>>> list(map(list, map(itemgetter(0), [['ANTT'], ['XSOB']])))
[['A', 'N', 'T', 'T'], ['X', 'S', 'O', 'B']]
>>> list(map(lambda s: list(map(list, s))[0], [['ANTT'], ['XSOB']]))
[['A', 'N', 'T', 'T'], ['X', 'S', 'O', 'B']]
All answers given so far include a somewhat suspicious 0 literal. At first glance this seems to be a somewhat artificial addition, since the original task doesn't include any integer literal.
But the input has - of course - some special properties besides being just a list. It's a list of lists with exactly one string literal in it. In other words, a package of individually boxed objects (strings). If we rephrase the original question and the accepted answer, we can make the relation between [x] and [0] more obvious:
package = [['ANTT'], ['XSOB']]
result = [list(box[0]) for box in package] # 0 indicates first (and only) element in the box
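If the 0 literal is a concern, one possible alternative (a sketch, not taken from the answers above) is to unpack each one-element box directly in the loop target. This also raises a ValueError if a box does not contain exactly one item.

```python
package = [['ANTT'], ['XSOB']]

# (word,) unpacks a box that holds exactly one string.
result = [list(word) for (word,) in package]
print(result)  # [['A', 'N', 'T', 'T'], ['X', 'S', 'O', 'B']]
```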
