What is the python equivalent to perl "a".."azc" - python

In Perl, to get a list of all strings from "a" to "azc", the only thing to do is use the range operator:
perl -le 'print "a".."azc"'
What I want is a list of strings:
["a", "b", ..., "z", "aa", ..., "az" ,"ba", ..., "azc"]
I suppose I can use ord and chr and loop over and over. This is simple for "a" to "z", e.g.:
>>> [chr(c) for c in range(ord("a"), ord("z") + 1)]
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
But my case is a bit more complex.
Thanks for any help!

Generator version:
from string import ascii_lowercase
from itertools import product
def letterrange(last):
    for k in range(len(last)):
        for x in product(ascii_lowercase, repeat=k+1):
            result = ''.join(x)
            yield result
            if result == last:
                return
EDIT: @ihightower asks in the comments:
I have no idea what I should do if I want to print from 'b' to 'azc'.
So you want to start with something other than 'a'. Just discard anything before the start value:
def letterrange(first, last):
    for k in range(len(last)):
        for x in product(ascii_lowercase, repeat=k+1):
            result = ''.join(x)
            if first:
                if first != result:
                    continue
                else:
                    first = None
            yield result
            if result == last:
                return

A suggestion purely based on iterators:
import string
import itertools
def string_range(letters=string.ascii_lowercase, start="a", end="z"):
    return itertools.takewhile(end.__ne__, itertools.dropwhile(start.__ne__, (x for i in itertools.count(1) for x in itertools.imap("".join, itertools.product(letters, repeat=i)))))
print list(string_range(end="azc"))
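The snippet above is Python 2 (itertools.imap, print statement), and takewhile stops just before the end value, so "azc" itself is not returned. A minimal Python 3 sketch of the same iterator idea with an inclusive end might look like this; the reordered parameters are my own choice, not the original author's:

```python
import string
from itertools import count, dropwhile, product

def string_range(start="a", end="z", letters=string.ascii_lowercase):
    """Yield strings from start to end (inclusive), shortest strings first."""
    # All strings of length 1, then length 2, and so on, with no upper bound.
    candidates = ("".join(chars)
                  for n in count(1)
                  for chars in product(letters, repeat=n))
    # Skip everything before start, then emit until end has been yielded.
    for s in dropwhile(lambda c: c != start, candidates):
        yield s
        if s == end:
            return

print(list(string_range(end="e")))
```

Like the original, this never terminates if start or end cannot be built from letters, or if end comes before start.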

Use the product call in itertools, and ascii_letters from string.
from string import ascii_letters
from itertools import product
if __name__ == '__main__':
    values = []
    for i in xrange(1, 4):
        values += [''.join(x) for x in product(ascii_letters[:26], repeat=i)]
    print values

Here's a better way to do it, though you need a conversion function:
for i in xrange(int('a', 36), int('azd', 36)):
    if base36encode(i).isalpha():
        print base36encode(i, lower=True)
And here's your function (thank you Wikipedia):
def base36encode(number, alphabet='0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ', lower=False):
    '''
    Convert positive integer to a base36 string.
    '''
    if lower:
        alphabet = alphabet.lower()
    if not isinstance(number, (int, long)):
        raise TypeError('number must be an integer')
    if number < 0:
        raise ValueError('number must be positive')
    # Special case for small numbers
    if number < 36:
        return alphabet[number]
    base36 = ''
    while number != 0:
        number, i = divmod(number, 36)
        base36 = alphabet[i] + base36
    return base36
I tacked on the lowercase conversion option, just in case you wanted that.

I generalized the accepted answer to be able to start middle and to use other than lowercase:
from string import ascii_lowercase, ascii_uppercase
from itertools import product
def letter_range(first, last, letters=ascii_lowercase):
    # k + 1 runs from len(first) to len(last), so strings as short as first are included
    for k in range(len(first) - 1, len(last)):
        for x in product(letters, repeat=k+1):
            result = ''.join(x)
            if len(x) != len(first) or result >= first:
                yield result
            if result == last:
                return
print list(letter_range('a', 'zzz'))
print list(letter_range('BA', 'DZA', ascii_uppercase))

from string import ascii_lowercase
from itertools import product
def strrange(end):
    values = []
    for i in range(1, len(end) + 1):
        values += [''.join(x) for x in product(ascii_lowercase, repeat=i)]
    return values[:values.index(end) + 1]

Related

How to count sequentially using letters instead of numbers?

Is there a simple way to count using letters in Python? Meaning, 'A' will be used as 1, 'B' as 2 and so on, and after 'Z' will be 'AA', 'AB' and so on. So below code would generate:
def get_next_letter(last_letter):
    return last_letter += 1 # pseudo
>>> get_next_letter('a')
'b'
>>> get_next_letter('b')
'c'
>>> get_next_letter('c')
'd'
...
>>> get_next_letter('z')
'aa'
>>> get_next_letter('aa')
'ab'
>>> get_next_letter('ab')
'ac'
...
>>> get_next_letter('az')
'ba'
>>> get_next_letter('ba')
'bb'
...
>>> get_next_letter('zz')
'aaa'
Based on @Charlie Clark's implementation of the openpyxl util get_column_letter, we can have:
def get_number_letter(n):
    letters = []
    while n > 0:
        n, remainder = divmod(n, 26)
        # check for exact division and borrow if needed
        if remainder == 0:
            remainder = 26
            n -= 1
        letters.append(chr(remainder+64))
    return ''.join(reversed(letters))
This gives the letter representation of a number. Now, to increment, we need the reverse. Based on that logic (and the general number base logic), I wrote:
def number_from_string(letters):
    n = 0
    for i, c in enumerate(reversed(letters)):
        n += (ord(c)-64)*26**i
    return n
And now we can combine them to:
def get_next_letter(letters):
    return get_number_letter(number_from_string(letters)+1)
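The functions above work on uppercase letters (chr(remainder+64)), while the question uses lowercase. As a sketch only, the same bijective base-26 idea can be written with a configurable first letter; the names number_to_letters, letters_to_number, next_letters and the first parameter are hypothetical, not part of openpyxl:

```python
def number_to_letters(n, first="a"):
    """Convert a positive integer to letters: 1 -> 'a', 26 -> 'z', 27 -> 'aa'."""
    letters = []
    while n > 0:
        # Subtracting 1 maps the digits 1..26 onto 0..25, so no borrow step is needed.
        n, remainder = divmod(n - 1, 26)
        letters.append(chr(ord(first) + remainder))
    return "".join(reversed(letters))

def letters_to_number(letters, first="a"):
    """Inverse of number_to_letters: 'a' -> 1, 'z' -> 26, 'aa' -> 27."""
    n = 0
    for c in letters:
        n = n * 26 + (ord(c) - ord(first) + 1)
    return n

def next_letters(letters, first="a"):
    """Return the string that follows letters in the counting order."""
    return number_to_letters(letters_to_number(letters, first) + 1, first)

print(next_letters("z"), next_letters("az"), next_letters("zz"))
```

Passing first="A" gives the uppercase behaviour of the answer above, without the 'ZZZ' limit of the openpyxl helpers.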
Original answer:
This kind of "counting" is very similar to how Excel indexes its columns. Therefore it is possible to take advantage of the openpyxl package, which has two utility functions: get_column_letter and column_index_from_string:
from openpyxl.utils import get_column_letter, column_index_from_string
def get_next_letter(letters):
    return get_column_letter(column_index_from_string(letters)+1)
NOTE: as this is based on Excel, it can only count up to 'ZZZ'; calling the function with 'ZZZ' raises an exception.
Output example for both implementations:
>>> get_next_letter('A')
'B'
>>> get_next_letter('Z')
'AA'
>>> get_next_letter('BD')
'BE'
Let's start with the simple special case of getting just the single-character strings.
from string import ascii_lowercase
def population():
    yield from ascii_lowercase
Then
>>> x = population()
>>> list(x)
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
>>> x = population()
>>> next(x)
'a'
>>> next(x)
'b'
So we'd like to add the two-character sequences next:
from string import ascii_lowercase
from itertools import product
def population():
    yield from ascii_lowercase
    yield from map(''.join, product(ascii_lowercase, repeat=2))
Note that the single-character strings are just a special case of the product with repeat=1, so we could have written
from string import ascii_lowercase
from itertools import product
def population():
    yield from map(''.join, product(ascii_lowercase, repeat=1))
    yield from map(''.join, product(ascii_lowercase, repeat=2))
We can write this with a loop:
def population():
    for k in range(1, 3):
        yield from map(''.join, product(ascii_lowercase, repeat=k))
but we don't necessarily want an artificial upper limit on what strings we can produce; we want, in theory, to produce all of them. For that, we replace range with itertools.count.
from string import ascii_lowercase
from itertools import product, count
def population():
    for k in count(1):
        yield from map(''.join, product(ascii_lowercase, repeat=k))
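Because this generator never ends, the caller has to decide where to stop. A small sketch of two ways to do that follows; the helper name up_to is an assumption of mine, not part of the answer:

```python
from itertools import count, islice, product
from string import ascii_lowercase

def population():
    # Infinite stream: 'a'..'z', 'aa'..'zz', 'aaa'.. and so on.
    for k in count(1):
        yield from map(''.join, product(ascii_lowercase, repeat=k))

def up_to(last):
    """Yield strings from population() until last has been produced (inclusive)."""
    for s in population():
        yield s
        if s == last:
            return

# Take a fixed number of items from the infinite stream.
first_thirty = list(islice(population(), 30))
print(first_thirty[-4:])

# Or stop at a given string, as in the Perl range "a".."azc".
print(len(list(up_to("azc"))))
```

islice bounds the stream by count, while up_to bounds it by value; up_to loops forever if last contains anything other than lowercase ASCII letters.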
All the proposed solutions are just way too complicated.
I came up with the version below, which uses a recursive call:
def getNextLetter(previous_letter):
    """
    'increments' the provided string to the next letter recursively
    raises TypeError if previous_letter is not a string
    returns "a" if provided previous_letter was an empty string
    """
    if not isinstance(previous_letter, str):
        raise TypeError("the previous letter should be a letter, doh")
    if previous_letter == '':
        return "a"
    for letter_location in range(len(previous_letter) - 1, -1, -1):
        if previous_letter[letter_location] == "z":
            return getNextLetter(previous_letter[:-1])+"a"
        else:
            return (previous_letter[:-1])+chr(ord(previous_letter[letter_location])+1)
# EOF

'Clumping' a list in python

I've been trying to 'clump' a list.
By that I mean putting items together depending on the item in between, so ['d','-','g','p','q','-','a','v','i'] becomes ['d-g','p','q-a','v','i'] when 'clumped' around any '-'.
Here's my attempt:
def clump(List):
    box = []
    for item in List:
        try:
            if List[List.index(item) + 1] == "-":
                box.append("".join(List[List.index(item):List.index(item)+3]))
            else:
                box.append(item)
        except:
            pass
    return box
However, it outputs (for the example above)
['d-g', '-', 'g', 'p', 'q-a', '-', 'a', 'v']
because I have no idea how to skip the next two items.
Also, the code is a complete mess, mainly because of the try/except statement (without it I get an IndexError when the loop reaches the last item).
How can it be fixed (or completely rewritten)?
Thanks
Here's an O(n) solution that maintains a flag determining whether or not you are currently clumping. It then manipulates the last item in the list based on this condition:
def clump(arr):
    started = False
    out = []
    for item in arr:
        if item == '-':
            started = True
            out[-1] += item
        elif started:
            out[-1] += item
            started = False
        else:
            out.append(item)
    return out
In action:
In [53]: clump(x)
Out[53]: ['d-g', 'p', 'q-a', 'v', 'i']
This solution will fail if the first item in the list is a dash, but that seems like it should be an invalid input.
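If a leading dash should be tolerated rather than rejected, one possible variant is sketched below. It assumes, as my own choice, that a leading separator simply starts a clump with the item that follows it; the sep parameter is also hypothetical:

```python
def clump(items, sep="-"):
    """Join each separator with the item before it and the item after it."""
    out = []
    glue_next = False  # True right after a separator has been seen
    for item in items:
        if item == sep:
            if out:
                out[-1] += item   # attach the separator to the previous item
            else:
                out.append(item)  # leading separator: start a new clump
            glue_next = True
        elif glue_next:
            out[-1] += item       # attach the item that follows the separator
            glue_next = False
        else:
            out.append(item)
    return out

print(clump(['d', '-', 'g', 'p', 'q', '-', 'a', 'v', 'i']))
print(clump(['-', 'a', 'b']))
```

It still makes a single pass over the list, so it remains O(n) in the number of items.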
Here is a solution using re.sub
>>> import re
>>> l = ['d','-','g','p','q','-','a','v','i']
>>> re.sub(':-:', '-', ':'.join(l)).split(':')
['d-g', 'p', 'q-a', 'v', 'i']
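And here is another attempt using itertools.zip_longest. Note that, as the output shows, it does not quite give the desired result: it keeps 'g' and 'a' as separate items and drops 'p'.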
>>> from itertools import zip_longest
>>> l = ['d','-','g','p','q','-','a','v','i']
>>> [x+y+z if y=='-' else x for x,y,z in zip_longest(l, l[1:], l[2:], fillvalue='') if '-' not in [x,z]]
['d-g', 'g', 'q-a', 'a', 'v', 'i']

How to get certain number of alphabets from a list?

I have a list of 26 numbers. I want to print a list of letters according to those numbers. For example, given this list (26 numbers taken from input):
[0,0,0,0,2,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0]
I'd like the output to be:
[e,e,l,s]
'e' appears twice because index 4 corresponds to 'e' in the English alphabet and the number at index 4 is 2. Likewise, 'l' is at index 11 and its number is 1, and the same goes for 's'. The other letters don't appear because their numbers are zero.
As another example, given this 26-number input:
[1,2,2,3,4,0,3,4,4,1,3,1,4,4,1,0,0,0,0,0,4,2,3,2,2,1]
The output should be:
[a,b,b,c,c,d,d,d,e,e,e,e,g,g,g,h,h,h,h,i,i,i,i,j,k,k,k,l,m,m,m,m,n,n,n,n,o,u,u,u,u,v,v,w,w,w,x,x,y,y,z]
Is there any way to do this in Python 3?
You can use chr(97 + item_index) to get the letter for each index and then multiply it by the count at that index:
In [40]: [j * chr(97 + i) for i, j in enumerate(lst) if j]
Out[40]: ['ee', 'l', 's']
If you want them as separate items, you can use the itertools module:
In [44]: from itertools import repeat, chain
In [45]: list(chain.from_iterable(repeat(chr(97 + i), j) for i, j in enumerate(lst) if j))
Out[45]: ['e', 'e', 'l', 's']
Yes, it is definitely possible in Python 3.
First, define an example list of numbers (as you did) and an empty list to store the resulting letters.
The index is linked to a letter with chr(97 + index): ord("a") is 97, so chr(97) is "a". The first index is 0, so 97 stays as it is; as the index increases, so does the letter.
Next, use a nested for loop: the outer loop iterates over the list of numbers, and the inner loop appends the same letter as many times as the number says.
We could do result.append(chr(97 + i) * my_list[i]) in the first loop itself, but that wouldn't yield every letter separately ([a,b,b,c,c,d,d,d...]); it would look like [a,bb,cc,ddd...] instead.
my_list = [1,2,2,3,4,0,3,4,4,1,3,1,4,4,1,0,0,0,0,0,4,2,3,2,2,1]
result = []
for i in range(len(my_list)):
    if my_list[i] > 0:
        for j in range(my_list[i]):
            result.append(chr(97 + i))
    else:
        pass
print(result)
An alternative to the wonderful answer by @Kasramvd:
import string
n = [0,0,0,0,2,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0]
res = [i * c for i, c in zip(n, string.ascii_lowercase) if i]
print(res) # -> ['ee', 'l', 's']
Your second example produces:
['a', 'bb', 'cc', 'ddd', 'eeee', 'ggg', 'hhhh', 'iiii', 'j', 'kkk', 'l', 'mmmm', 'nnnn', 'o', 'uuuu', 'vv', 'www', 'xx', 'yy', 'z']
Splitting the strings ('bb' to 'b', 'b') can be done with the standard schema:
[x for y in something for x in y]
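As a concrete illustration of that schema, here is a short sketch applied to the first example; the variable names counts, clumped and flat are placeholders of mine:

```python
import string

counts = [0,0,0,0,2,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0]

# Repeat each letter by its count, skipping zero counts: ['ee', 'l', 's']
clumped = [n * c for n, c in zip(counts, string.ascii_lowercase) if n]

# Flatten: iterate over each string, then over each character in it.
flat = [x for y in clumped for x in y]
print(flat)  # ['e', 'e', 'l', 's']
```

The outer loop (for y in clumped) comes first and the inner loop (for x in y) second, in the same order as the equivalent nested for statements.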
Using a slightly different approach, which gives the characters individually as in your example:
import string
import numpy as np
a = [0,0,0,0,2,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0]
alphabet_lookup = np.repeat(np.arange(len(a)), a)
letter_lookup = np.array(list(string.ascii_lowercase))
res = letter_lookup[alphabet_lookup]
print(res)
To get
['e' 'e' 'l' 's']

Removing an element from python string recursively?

I'm trying to figure out how to write a program that would remove a given element from a python string recursively. Here's what I have so far:
def remove(x,s):
    if x == s[0]:
        return ''
    else:
        return s[0] + remove(x,s[1:])
When testing this code on the input remove('t', 'wait a minute'), it seems to work up until it reaches the first 't', but the code then terminates instead of continuing to go through the string. Does anyone have any ideas of how to fix this?
In your code, you return '' when you run into the character you're removing.
This will drop the rest of the string.
You want to keep going through the string instead (also pass x in recursive calls and add a base case):
def remove(x, s):
    if not s:
        return ''
    if x == s[0]:
        return remove(x, s[1:])
    else:
        return s[0] + remove(x, s[1:])
Also, in case you didn't know, you can use str.replace() to achieve this:
>>> 'wait a minute'.replace('t', '')
'wai a minue'
def Remove(s, e):
    # list() is needed in Python 3, where filter returns a lazy iterator
    return list(filter(lambda x: x != e, s))
Here is an example for your test:
sequence = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
RemoveElement = ['d','c']
print(list(filter(lambda x: x not in RemoveElement, sequence)))
#['a', 'b', 'e', 'f', 'g', 'h']
If you are just removing a character like 't', you could use a generator expression with str.join:
s = 'wait a minute'
xs = ''.join(x for x in s if x != 't')

Returning the value of an index in a python list based on other values

I have put the letters a-z in a list. How would I find the value of an item in the list depending on what the user typed?
For example if they type the letter a it would return c, f would return h and x would return z.
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
newletters = []
offset = 2
userInput = input('type a string')
newvalue = chr(ord(userInput)+offset)
split = list(newvalue)
print split
The above works for a single character but not for a string. Help?!
You can try this:
>>> offset = 2
>>> aString = raw_input("digit a letter: ")
>>> aString
'a'
>>> chr(ord(aString)+offset)
'c'
documentation:
https://docs.python.org/2/library/functions.html#chr
https://docs.python.org/2/library/functions.html#ord
If you want to iterate over an entire string, a simple way is using a for loop. I assume the input string is always lowercase.
EDIT2: I improved the solution to handle letters such as 'y' or 'z', which without "rotation" would become a non-alphabetic character, e.g.:
# adding only the offset returns a non-alphabetic character
>>> chr(ord('z')+2)
'|'
# with rotation, 'z' returns the letter 'b'
>>> letter = "z"
>>> ord_letter = ord(letter)+offset
>>> ord_letter_rotated = ((ord_letter - 97) % 26) + 97
>>> chr(ord_letter_rotated)
'b'
The code solution:
offset = 2
aString = raw_input("digit the string to convert: ")
#aString = "abz"
newString = ""
for letter in aString:
    ord_letter = ord(letter)+offset
    ord_letter_rotated = ((ord_letter - 97) % 26) + 97
    newString += chr(ord_letter_rotated)
print newString
The output of this code for the entire lowercase alphabet:
cdefghijklmnopqrstuvwxyzab
Note: you can also get the lowercase alphabet for free this way (string.lowercase exists only in Python 2; Python 3 uses string.ascii_lowercase):
>>> import string
>>> string.lowercase
'abcdefghijklmnopqrstuvwxyz'
See the wikipedia page to learn something about ROT13:
https://en.wikipedia.org/wiki/ROT13
What should happen for z? Should it become b?
You can use Python's maketrans and translate functions to do this as follows:
import string
def rotate(text, by):
    s_from = string.ascii_lowercase
    s_to = string.ascii_lowercase[by:] + string.ascii_lowercase[:by]
    cypher_table = string.maketrans(s_from, s_to)
    return text.translate(cypher_table)
user_input = raw_input('type a string: ').lower()
print rotate(user_input, 2)
This works on the whole string as follows:
type a string: abcxyz
cdezab
How does it work?
If you print s_from and s_to they look as follows:
abcdefghijklmnopqrstuvwxyz
cdefghijklmnopqrstuvwxyzab
maketrans creates a mapping table to map characters in s_from to s_to. translate then applies this mapping to your string.
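The code above targets Python 2 (string.maketrans, raw_input, print statement). In Python 3 the table is built with the str.maketrans static method instead. A minimal sketch of the same rotation, with the by % 26 normalisation added as my own assumption:

```python
import string

def rotate(text, by):
    """Shift lowercase ASCII letters forward by `by` places, wrapping after 'z'."""
    by %= 26  # assumption: treat any integer offset as a rotation
    s_from = string.ascii_lowercase
    s_to = s_from[by:] + s_from[:by]
    # str.maketrans maps each character of s_from to the one at the same position in s_to.
    cypher_table = str.maketrans(s_from, s_to)
    # Characters that are not in the table (digits, spaces, uppercase) pass through unchanged.
    return text.translate(cypher_table)

print(rotate("abcxyz", 2))  # cdezab
```

Building the table once and calling translate avoids the per-character ord/chr arithmetic of the loop-based answer.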
