Generating a lettering in python? [closed] - python

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question appears to be off-topic because it lacks sufficient information to diagnose the problem. Describe your problem in more detail or include a minimal example in the question itself.
Closed 8 years ago.
How can I write a function (a generator) that takes three letters (l1, l2, l3) and three numbers (n1, n2, n3) and yields all the possible arrangements in which l1 occurs n1 times, l2 n2 times and l3 n3 times?
For example:
for i in function('a', 2, 'b', 1, 'c', 0):
    print(i)
gives:
aab
baa
aba

Use itertools.permutations, all you need is a thin wrapper around it:
from itertools import permutations
def l_n_times(l1, n1, l2, n2, l3, n3):
    return permutations(l1*n1 + l2*n2 + l3*n3)
Demo:
>>> for item in set(l_n_times('a', 2, 'b', 1, 'c', 0)):
...     print(''.join(item))
...
baa
aba
aab
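permutations already returns an iterator, so you don't have to use yield yourself. It does repeat an arrangement whenever a letter occurs more than once, which is why the demo wraps the call in set().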
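Wrapping the call in set() builds every arrangement before anything is printed. A minimal sketch of a lazy alternative follows; the helper name unique_arrangements is hypothetical and not part of the answer. It filters duplicates as they are produced, so it still walks all n! permutations and only avoids yielding repeats:

```python
from itertools import permutations

def unique_arrangements(l1, n1, l2, n2, l3, n3):
    """Yield each distinct arrangement once, as a string (hypothetical helper)."""
    seen = set()  # arrangements already yielded
    for item in permutations(l1 * n1 + l2 * n2 + l3 * n3):
        word = ''.join(item)
        if word not in seen:
            seen.add(word)
            yield word

print(list(unique_arrangements('a', 2, 'b', 1, 'c', 0)))
```

The order of the results follows the order in which permutations emits them, so it is deterministic, unlike iterating over a set.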

It doesn't seem to me that itertools would help a lot here; a recursive implementation (written here for Python 3) may look like this:
def combine(l1, n1, l2, n2, l3, n3):
    counters = {l1: n1, l2: n2, l3: n3}  # remaining characters to use
    buf = []  # string under construction
    def recur(depth):
        if not depth:  # we've reached the bottom
            yield ''.join(buf)
            return
        # choosing next character
        for s, c in counters.items():
            if not c:  # this character is exhausted
                continue
            counters[s] -= 1
            buf.append(s)
            for val in recur(depth - 1):
                # going down recursively
                yield val
            # restore the state before trying next character
            buf.pop()
            counters[s] += 1
    length = sum(counters.values())
    return recur(length)
for s in combine('a', 2, 'b', 1, 'c', 0):
    print(s)

Let's assume you have a data structure like:
letters = {'a': 2, 'b': 1, 'c': 0}
A recursive function would be:
def r(letters, prefix=''):
    for k, v in letters.items():
        if v > 0:
            d = dict(letters)
            d[k] = v - 1
            for val in r(d, prefix + k):
                yield val
    if all(v == 0 for _, v in letters.items()):
        yield prefix
No duplicates, and it does use a generator. Quite heavy compared to a simple itertools call.

The docs for itertools have this to say:
The code for combinations() can be also expressed as a subsequence of permutations() after filtering entries where the elements are not in sorted order (according to their position in the input pool):
Since we want all arrangements with no duplicates, we'll just enforce strict ordering (i.e. only yield values that are greater than the greatest one so far).
This would seem to do just that:
import itertools as it
def dfunc(l, n):
    old = ()  # greatest permutation yielded so far
    pool = ''.join(a * b for a, b in sorted(zip(l, n)))
    # permutations of a sorted pool come out in non-decreasing order
    for i in it.permutations(pool):
        if i > old:
            old = i
            yield i
>>> dfunc(['b','c','a'],[1,0,2])
<generator object dfunc at 0x10ba055a0>
>>> list(dfunc(['b','c','a'],[1,0,2]))
[('a', 'a', 'b'), ('a', 'b', 'a'), ('b', 'a', 'a')]
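As a sanity check on any of the approaches above, the number of distinct arrangements should equal the multinomial coefficient (n1 + n2 + n3)! / (n1! n2! n3!). A small sketch, where expected_count is a hypothetical helper name introduced only for this check:

```python
from itertools import permutations
from math import factorial

def expected_count(*counts):
    """Multinomial coefficient: (sum of counts)! / product of each count!."""
    total = factorial(sum(counts))
    for c in counts:
        total //= factorial(c)  # each intermediate division is exact
    return total

# Brute-force count of distinct arrangements for 'a' x2, 'b' x1, 'c' x0.
distinct = set(permutations('a' * 2 + 'b' * 1 + 'c' * 0))
print(len(distinct), expected_count(2, 1, 0))
```

Both numbers are 3 for the example in the question, matching the three strings listed there.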

Related

How to use enumerate in a list comprehension with two lists?

I just started using list comprehensions and I'm struggling with them. In this case, I need the index n that the iteration has reached in each list (sequence_0 and sequence_1) at each step. How can I do that?
The idea is to get the longest sequence of equal nucleotides (a motif) shared by the two sequences. Once a matching pair is found, the program should continue with the next nucleotides of the sequences, checking whether they are also equal and, if so, elongating the motif with them. The final output should be a list of all the motifs found.
The problem is that, to continue with the next nucleotides once a pair is found, I need the position of the pair in both sequences. The index function does not work in this case, which is why I need enumerate.
Also, I don't understand exactly why x and y are placed between (); it would be good to understand that too :)
Just to explain, the lists contain DNA sequences, so each one is basically something like:
sequence_1 = ['A', 'T', 'C', 'A', 'C']
def find_shared_motif(arq):
    data = fastaread(arq)
    seqs = [list(sequence) for sequence in data.values()]
    motifs = [[]]
    i = 0
    sequence_0, sequence_1 = seqs[0], seqs[1]  # just to simplify
    for x, y in [(x, y) for x in zip(sequence_0[::], sequence_0[1::]) for y in zip(sequence_1[::], sequence_1[1::])]:
        print(f'Pairs {"".join(x)} and {"".join(y)} being analyzed...')
        if x == y:
            print(f'Pairs {"".join(x)} and {"".join(y)} match!')
            motifs[i].append(x[0]), motifs[i].append(x[1])
            k = sequence_0.index(x[0]) + 2  # THIS IS NOT RETURNING THE RIGHT NUMBER
            u = sequence_1.index(y[0]) + 2
            print(k, u)
            # Determines if the rest of the sequence is compatible
            print(f'Starting to elongate the motif {x}...')
            for j, m in enumerate(sequence_1[u::]):
                try:
                    # Checks if the nucleotide is equal for both of the sequences
                    print(f'Analyzing the pair {sequence_0[k + j]}, {m}')
                    if m == sequence_0[k + j]:
                        motifs[i].append(m)
                        print(f'The pair {sequence_0[k + j]}, {m} is equal!')
                    # Stop in the first nonequal residue
                    else:
                        print(f'The pair {sequence_0[k + j]}, {m} is not equal.')
                        break
                except IndexError:
                    print('IndexError, end of the string')
        else:
            i += 1
            motifs.append([])
    return motifs
...
One way to go with it is to start zipping both lists:
a = ['A', 'T', 'C', 'A', 'C']
b = ['A', 'T', 'C', 'C', 'T']
c = list(zip(a,b))
In that case, c will have the list of tuples below
c = [('A','A'), ('T','T'), ('C','C'), ('A','C'), ('C','T')]
Then, you can go with list comprehension and enumerate:
d = [(i, t) for i, t in enumerate(c)]
This will bring something like this to you:
d = [(0, ('A','A')), (1, ('T','T')), (2, ('C','C')), ...]
Of course you can go for a one-liner, if you want:
d = [(i, t) for i, t in enumerate(zip(a,b))]
>>> [(0, ('A','A')), (1, ('T','T')), (2, ('C','C')), ...]
Now, you have to deal with the nested tuples. Focus on the internal ones. It is obvious that what you want is to compare the first element of the tuples with the second ones. But, also, you will need the position where the difference resides (that lies outside). So, let's build a function for it. Inside the function, i will capture the positions, and t will capture the inner tuples:
def compare(a, b):
    d = [(i, t) for i, t in enumerate(zip(a,b))]
    for i, t in d:
        if t[0] != t[1]:
            return i
    return -1
In that way, if you get -1 at the end, it means that all elements in both lists are equal, side by side. Otherwise, you will get the position of the first difference between them.
It is important to notice that, in the case of two lists with different sizes, the zip function will bring a list of tuples with the size matching the smaller of the lists. The extra elements of the other list will be ignored.
Ex.
list(zip([1,2], [3,4,5]))
>>> [(1,3), (2,4)]
You can use the function compare with your code to get the positions where the lists differ, and use that to build your motifs.
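As a rough illustration of that last step, here is a sketch that applies compare to the two sample lists and slices off the shared leading motif. It unpacks the inner tuple directly in the for statement, which is also what the parentheses around x and y do in the question. The variable names first_diff and motif are assumptions made for this example:

```python
def compare(a, b):
    """Return the index of the first mismatch, or -1 if zip finds none."""
    for i, (x, y) in enumerate(zip(a, b)):  # (x, y) unpacks each zipped pair
        if x != y:
            return i
    return -1

a = ['A', 'T', 'C', 'A', 'C']
b = ['A', 'T', 'C', 'C', 'T']

first_diff = compare(a, b)
# Everything before the first mismatch is shared by both sequences.
motif = a[:first_diff] if first_diff != -1 else a[:min(len(a), len(b))]
print(first_diff, motif)
```

For these inputs the first mismatch is at index 3, so the shared leading motif is ['A', 'T', 'C']. This only covers motifs that start at the same position in both lists; motifs at different offsets would need the comparison repeated for each pair of start positions.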

Going through a list in pairs, except for first element, Python [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I have a list myList = ['a', 'b', 'c', 'd', 'e'] and some function that is not relevant to the current problem. I want to access the list in pairs, such as:
value['a'] = function(myList[:1])
value['b'] = function(myList[:2]) - function(myList[:1])
value['c'] = function(myList[:3]) - function(myList[:2])
...
value['e'] = function(myList) - function(myList[:len(myList)-1])
where value can be a dictionary.
To be more explicit, the above code should look like:
value['a'] = function(['a'])
value['b'] = function(['a', 'b']) - function(['a'])
value['c'] = function(['a', 'b', 'c']) - function(['a', 'b'])
...
value['e'] = function(['a', 'b', 'c', 'd', 'e']) - function(['a', 'b', 'c', 'd'])
I am trying to do this with a simple for loop:
value = {}
for idx in range(len(myList)):
    if not value[myList[idx]]:
        value[myList[idx]] = function[myList[:idx]] - function[myList[:idx-1]]
    else:
        temp = function[myList[:idx]] - function[myList[:idx-1]]
        value[myList[idx]] += [temp]  # end-up with a list of values for every element.
I have two problems with this:
It does not work well for the beginning and end of the list.
The code looks very cumbersome and I am wondering if there is a more pythonic way of writing it.
I don't know why you want to put the results into a dict with int keys starting from 1. That is a list represented as a dict, which I find useless in this case (maybe you need it in later code, I don't know).
To form a list with the results, you can use the following list comprehension:
values = [function(myList[:1])] + [function(myList[:idx]) - function(myList[:idx - 1]) for idx in range(2, len(myList) + 1)]
This is a shortened (and slightly more efficient) version of the following code:
values = [function(myList[:1])]
for idx in range(2, len(myList) + 1):  # range(1, len(myList))
    first_slice = myList[:idx]  # myList[: idx + 1]
    second_slice = myList[:idx - 1]  # myList[: idx]
    values.append(function(first_slice) - function(second_slice))
If you still want the results in a dict, you can create it with the following code:
dict_result = dict(zip(range(1, len(values) + 1), values))
Certainly the most efficient way:
# Test data
l = ['a', 'b', 'c', 'd']
# Your `function`
def f(args):
    print(args)
    return len(args)
# BEGIN OF ANSWER
from itertools import accumulate
# prev = (value for the current key, result of the previous call to f);
# carrying the previous result caches it, so f runs only once per prefix
def func(prev, x):
    cur = f(x)
    val, prev_x = prev
    return cur - prev_x, cur
# cache the initial call
init = f(l[:1])
value = dict(
    zip(
        l,
        # keep only the difference, dropping the cached call result
        (
            val
            for val, _ in accumulate(
                (l[:k] for k in range(2, len(l) + 1)),
                func,
                initial=(init, init),
            )
        ),
    )
)
EDIT: the previous version requires Python 3.8 or later. In earlier versions, accumulate has no initial keyword argument, so you have to stick with a reduce hack:
# Test data
l = ['a', 'b', 'c', 'd']
# Your function
def f(args):
    print(args)
    return len(args)
# BEGIN OF ANSWER
from functools import reduce
# reduce function: prev = (list_of_final_values, previous function call result)
def func(prev, x):
    cur = f(x)
    vals, prev_x = prev
    vals.append(cur - prev_x)
    return vals, cur
# initial value cached too
init = f(l[:1])
value = dict(
    zip(
        l,
        reduce(
            func,
            (l[:k] for k in range(2, len(l) + 1)),
            ([init], init),
        )[0],
    )
)
...Same result, except it's not lazy...
References :
https://docs.python.org/3/library/itertools.html#itertools.accumulate
https://docs.python.org/3/library/functools.html#functools.reduce
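The same caching idea can also be written as a plain loop, which may be easier to follow than accumulate or reduce. This is only a sketch: prefix_differences is a hypothetical name, and it assumes the results of function support subtraction and that the items are hashable and unique (they become dict keys):

```python
def prefix_differences(items, function):
    """Map each item to function(prefix ending at it) minus function(previous prefix)."""
    value = {}
    prev = None  # result for the previous prefix; None before the first call
    for idx, item in enumerate(items, start=1):
        cur = function(items[:idx])  # one call per prefix
        # The first item has no previous prefix, so it keeps the raw result.
        value[item] = cur if prev is None else cur - prev
        prev = cur
    return value

print(prefix_differences(['a', 'b', 'c', 'd', 'e'], len))
```

With len standing in for function, every prefix is one item longer than the last, so each key maps to 1.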
Let's say you have a list
lis = [1, 2, 3, 4, 5]
lis[:2] will give you [1, 2], i.e. the elements at indices 0 and 1: the slice stops one index before the parameter passed and does NOT include the element at index 2. So, your requirement of "pairing" will not be fulfilled by
function(lis[:2]) - function(lis[:1])
as this will only give the second element of the list. This answers your first question of "not working well with the beginning and end of the list"
For the second part of your question, what I can infer is that you want to access the list as {'a':'b', 'c':'d'}, i.e. as non-overlapping pairs (with an odd number of items, the last one has no partner and is skipped). For this,
myDict = {}
for i in range(0, len(myList) - 1, 2):
    myDict.update({myList[i] : myList[i + 1]})
If you want to access the list as {'a':'b', 'b':'c', 'c':'d', 'd':'e'}, do
myDict = {}
for i in range(0, len(myList)):
    if i != len(myList) - 1:
        myDict.update({myList[i] : myList[i + 1]})

Replace one item in a string with one item from a list

I have a string and a list:
seq = '01202112'
l = [(0,1,0),(1,1,0)]
I would like a pythonic way of replacing each '2' with the value at the corresponding index in each tuple of the list l, such that I obtain two new strings:
list_seq = ['01001110', '01101110']
By using .replace(), I could iterate through l, but I wondered whether there is a more pythonic way to get list_seq.
I might do something like this:
out = [''.join(c if c != '2' else str(next(f, c)) for c in seq) for f in map(iter, l)]
The basic idea is that we call iter to turn the tuples in l into iterators. At that point every time we call next on them, we get the next element we need to use instead of the '2'.
If this is too compact, the logic might be easier to read as a function:
def replace(seq, to_replace, fill):
    fill = iter(fill)
    for element in seq:
        if element != to_replace:
            yield element
        else:
            yield next(fill, element)
giving
In [32]: list(replace([1,2,3,2,2,3,1,2,4,2], to_replace=2, fill="apple"))
Out[32]: [1, 'a', 3, 'p', 'p', 3, 1, 'l', 4, 'e']
Thanks to @DanD in the comments for noting that I had assumed I'd always have enough characters to fill from! We'll follow his suggestion to keep the original characters if we run out, but modifying this approach to behave differently is straightforward and left as an exercise for the reader. :-)
[''.join([str(next(digit, 0)) if x == '2' else x for x in seq])
 for digit in map(iter, l)]
I don't know if this solution is 'more pythonic' but:
def my_replace(s, c=None, *other):
    return s if c is None else my_replace(s.replace('2', str(c), 1), *other)
seq = '01202112'
l = [(0,1,0),(1,1,0)]
list_req = [my_replace(seq, *x) for x in l]
seq = '01202112'
li = [(0,1,0),(1,1,0)]
def grunch(s, tu):
    it = map(str,tu)
    return ''.join(next(it) if c=='2' else c for c in s)
list_seq = [grunch(seq,tu) for tu in li]
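Another way to express the same idea is re.sub with a callable replacement, which is called once per match from left to right. This is a sketch only; fill_twos is a hypothetical helper name, and like the first answer it leaves a '2' in place if the fill values run out:

```python
import re

def fill_twos(seq, fill):
    """Replace each '2' in seq with the next item of fill (hypothetical helper)."""
    it = iter(fill)
    # match.group() is the matched '2' itself, used as the fallback
    # when fill has fewer items than there are '2's in seq.
    return re.sub('2', lambda match: str(next(it, match.group())), seq)

seq = '01202112'
l = [(0, 1, 0), (1, 1, 0)]
list_seq = [fill_twos(seq, fill) for fill in l]
print(list_seq)
```

For the inputs from the question this gives the two strings '01001110' and '01101110'.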

Run Length Encoding in Python with List Comprehension

I have a more basic Run Length Encoding question compared to many of the questions about this topic that have already been answered. Essentially, I'm trying to take the string
string = 'aabccccaaa'
and have it return
a2b1c4a3
I thought that if I can manage to get all the information into a list like I have illustrated below, I would easily be able to return a2b1c4a3
test = [['a','a'], ['b'], ['c','c','c','c'], ['a','a','a']]
I came up with the following code so far, but was wondering if someone would be able to help me figure out how to make it create the output I illustrated above.
def string_compression():
    for i in xrange(len(string)):
        prev_item, current_item = string[i-1], string[i]
        print prev_item, current_item
        if prev_item == current_item:
            <HELP>
If anyone has any additional comments regarding more efficient ways to go about solving a question like this I am all ears!
You can use itertools.groupby():
from itertools import groupby
grouped = [list(g) for k, g in groupby(string)]
This will produce your per-letter groups as a list of lists.
You can turn that into a RLE in one step:
rle = ''.join(['{}{}'.format(k, sum(1 for _ in g)) for k, g in groupby(string)])
Each k is the letter being grouped, and each g is an iterator producing that letter N times; the sum(1 for _ in g) expression counts the items without building an intermediate list.
Demo:
>>> from itertools import groupby
>>> string = 'aabccccaaa'
>>> [list(g) for k, g in groupby(string)]
[['a', 'a'], ['b'], ['c', 'c', 'c', 'c'], ['a', 'a', 'a']]
>>> ''.join(['{}{}'.format(k, sum(1 for _ in g)) for k, g in groupby(string)])
'a2b1c4a3'
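For comparison with the loop the question started from, here is a sketch of the same encoding without itertools. The function name run_length_encode is an assumption for this example; it tracks the current character and its count, and writes out a run each time the character changes:

```python
def run_length_encode(text):
    """Encode runs of equal characters as <char><count> (plain-loop sketch)."""
    if not text:
        return ''
    pieces = []
    current, count = text[0], 1
    for ch in text[1:]:
        if ch == current:
            count += 1  # still inside the same run
        else:
            pieces.append(f'{current}{count}')  # close the finished run
            current, count = ch, 1
    pieces.append(f'{current}{count}')  # flush the final run
    return ''.join(pieces)

print(run_length_encode('aabccccaaa'))
```

Starting the loop at the second character avoids the string[i-1] lookup at i = 0 in the question's attempt, which wraps around and compares the first character with the last one.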
Consider using the more_itertools.run_length tool.
Demo
import more_itertools as mit
iterable = "aabccccaaa"
list(mit.run_length.encode(iterable))
# [('a', 2), ('b', 1), ('c', 4), ('a', 3)]
Code
"".join(f"{x[0]}{x[1]}" for x in mit.run_length.encode(iterable)) # Python 3.6+ (f-strings)
# 'a2b1c4a3'
"".join(x[0] + str(x[1]) for x in mit.run_length.encode(iterable))
# 'a2b1c4a3'
Alternative itertools/functional style (requires import itertools as it):
"".join(map(str, it.chain.from_iterable(mit.run_length.encode(iterable))))
# 'a2b1c4a3'
Note: more_itertools is a third-party library that is installable via pip install more_itertools.
I'm a Python beginner and this is what I wrote for RLE.
from itertools import groupby
s = 'aabccccaaa'
grouped_d = [(k, len(list(g))) for k, g in groupby(s)]
result = ''
for key, count in grouped_d:
    result += key + str(count)
print(f'result = {result}')

Recursive Python function to produce a list of anagrams

After a lot of head scratching and googling I still can't figure this out. I'm very new to Python and I'm struggling with the syntax. Conceptually I think I have a pretty decent idea of what I want to do and how to do so recursively. Technically, however, coding it in Python is proving to be a nightmare.
Basically I want to add all of the permutations of a word to list (no duplicate characters allowed), which can then be called by another program or function.
The return command and how to handle white space is really confusing me. I want the recursive function to "return" something once it unwinds but I don't want it to stop the function until all of the characters have iterated and all the permutations have been recursively generated within those iterations. When I run the code below nothing seems to happen.
def permutations(A, B = ''):
    assert len(A) >= 0
    assert len(A) == len(set(A))
    res = []
    if len(A) == 0: res = res.extend(B)
    else:
        for i in range(len(A)):
            permutations(A[0:i] + A[i+1:], B + A[i])
    return res
permutations('word')
If I run the code below it prints out OK to my display pane, but I can't figure out how to get it into an output format that can be used by other program like a list.
def permutations(A, B = ''):
    assert len(A) >= 0
    assert len(A) == len(set(A))
    if len(A) == 0: print(B)
    else:
        for i in range(len(A)):
            permutations(A[0:i] + A[i+1:], B + A[i])
permutations('word')
Please could someone advise me on this, while I have some hair left! Very gratefully received.
Thank you
Jon
Basically your mistake is in
res = res.extend(B)
.extend() doesn't return a new list; it modifies the list in place and returns None, so res ends up as None.
Another problem is that you don't use the return value from your recursive calls.
Here is one way to fix your code:
def permutations(A, B = ''):
    assert len(A) >= 0
    assert len(A) == len(set(A))
    if len(A) == 0:
        return [B]
    else:
        res = []
        for i in range(len(A)):
            res.extend(permutations(A[0:i] + A[i+1:], B + A[i]))
        return res
print(permutations('word'))
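The question also asks how to "return" something without stopping the function. That is what a generator does, so the fixed function can be sketched with yield instead of building lists. The name iter_permutations is hypothetical, and yield from needs Python 3.3 or later:

```python
def iter_permutations(remaining, prefix=''):
    """Yield every ordering of the characters in remaining, one at a time."""
    if not remaining:
        yield prefix  # nothing left to place: prefix is a complete word
        return
    for i, ch in enumerate(remaining):
        # Recurse on the string with character i removed.
        yield from iter_permutations(remaining[:i] + remaining[i + 1:], prefix + ch)

words = list(iter_permutations('word'))
print(len(words), words[:3])
```

Calling list() on the generator gives the same 24 strings as the list-based fix, while a for loop over it consumes them one by one without storing them all.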
Like this?
from itertools import permutations
a = [x for x in permutations('word')]
print(a)
Output:
>>[('w', 'o', 'r', 'd'), ('w', 'o', 'd', 'r'), ('w', 'r', 'o', 'd'),
>>('w', 'r', 'd', 'o'), ('w', 'd', 'o', 'r'), ('w', 'd', 'r', 'o'),
>>('o', 'w', 'r', 'd'), ..............
EDIT:
I just realized you said no duplicate characters allowed. It does not really matter for 'word', but let's say you have 'wordwwwdd'. Then you could do:
[x for x in permutations(''.join(set('wordwwwdd')))]
But it will mess up the order because of using set, so it will look like:
>> [('r', 'o', 'w', 'd'), ('r', 'o', 'd', 'w'), ('r', 'w', 'o', 'd')....
I would do it like this:
import itertools
def permute_nondupe_letters_to_words(iterable):
    return (''.join(i) for i in itertools.permutations(set(iterable)))
And to use it:
word = 'word'
permutation_generator = permute_nondupe_letters_to_words(word)
bucket_1, bucket_2 = [], []
for i in permutation_generator:
    bucket_1.append(i)
    if i == 'owdr':
        break
for i in permutation_generator:
    bucket_2.append(i)
And
print(len(bucket_1), len(bucket_2))
prints:
(10, 14)
Here is another way to approach this problem:
it is Python 2.7 and 3.3 compatible (have not yet tested with other versions)
it will accept input containing duplicate items, and only return unique output
(ie permutations("woozy") will only return "oowzy" once)
it returns output in sorted order (and will allow you to specify sort key and ascending or descending order)
it returns string output on string input
it runs as a generator, ie does not store all combinations in memory. If that's what you want, you have to explicitly say so (example shown below)
Edit: it occurred to me that I had omitted a length parameter, so I added one. You can now ask for things like all unique 4-letter permutations from a six-letter string.
Without further ado:
from collections import Counter
import sys
if sys.hexversion < 0x3000000:
    # Python 2.x
    dict_items_list = lambda d: d.items()
    is_string = lambda s: isinstance(s, basestring)
    rng = xrange
else:
    # Python 3.x
    dict_items_list = lambda d: list(d.items())
    is_string = lambda s: isinstance(s, str)
    rng = range
def permutations(lst, length=None, key=None, reverse=False):
    """
    Generate all unique permutations of lst in sorted order
      lst      list of items to permute
      length   number of items to pick for each permutation (defaults to all items)
      key      sort-key for items in lst
      reverse  sort in reverse order?
    """
    # this function is basically a shell, setting up the values
    # for _permutations, which actually does most of the work
    if length is None:
        length = len(lst)
    elif length < 1 or length > len(lst):
        return []  # no possible answers
    # 'woozy' => [('w', 1), ('o', 2), ('z', 1), ('y', 1)]  # unknown order
    items = dict_items_list(Counter(lst))
    # => [('o', 2), ('w', 1), ('y', 1), ('z', 1)]  # now in sorted order
    items.sort(key=key, reverse=reverse)
    if is_string(lst):
        # if input was string, return generator of string
        return (''.join(s) for s in _permutations(items, length))
    else:
        # return generator of list
        return _permutations(items, length)
def _permutations(items, length):
    if length == 1:
        for item,num in items:
            yield [item]
    else:
        for ndx in rng(len(items)):
            # pick an item to start with
            item, num = items[ndx]
            # make new list of remaining items
            if num == 1:
                remaining_items = items[:ndx] + items[ndx+1:]
            else:
                remaining_items = items[:ndx] + [(item, num-1)] + items[ndx+1:]
            # recurse against remaining items
            for perm in _permutations(remaining_items, length-1):
                yield [item]+perm
# test run!
words = list(permutations("woozy"))
results in
['oowyz',
'oowzy',
'ooywz',
'ooyzw',
'oozwy',
'oozyw',
'owoyz',
# ...
'zwooy',
'zwoyo',
'zwyoo',
'zyoow',
'zyowo',
'zywoo'] # 60 items = 5!/2!, as expected
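To cross-check output like the above on short inputs, a brute-force sketch using only itertools can help. The name unique_permutations_bruteforce is hypothetical; unlike the generator above, it builds every permutation in memory first, so it is only suitable for small inputs:

```python
from itertools import permutations

def unique_permutations_bruteforce(items, length=None):
    """Sorted list of distinct permutations of the given length (brute force)."""
    perms = set(permutations(items, length))  # length=None means full length
    if isinstance(items, str):
        return sorted(''.join(p) for p in perms)
    return sorted(perms)

words = unique_permutations_bruteforce("woozy")
print(len(words), words[0], words[-1])
```

For "woozy" this reports 60 items, starting with 'oowyz' and ending with 'zywoo', in line with the listing above.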
