I would like to know how, in Python, I can iterate through every string that meets a set of conditions:
the string has 2-6 lowercase alphabetic or numeric characters
the first character is always a number
So a short progression would be:
1a
1b
1c
...
1aa
1ab
1ac
...
2aaa
2aab
2aac
etc.
A horrible example that can do the first two is
## Loop through 1a-1z, then 10-19
start = '1'
l = 97
while l < 123:
    num = start
    num += chr(l)
    print(num)
    l += 1
l = 48
while l < 58:
    num = start
    num += chr(l)
    print(num)
    l += 1
I found itertools but can't find good examples to go off of.
You can do this using itertools.product and itertools.chain. First define strings of the numbers and letters:
numbers = '0123456789'
alnum = numbers + 'abcdefghijklmnopqrstuvwxyz'
Using itertools.product, you can get tuples with the characters for the strings of various length:
len2 = itertools.product(numbers, alnum) # length 2
len3 = itertools.product(numbers, alnum, alnum) # length 3
...
Chain the iterators for all the lengths together, joining the tuples into strings. I'd do it with a list comprehension:
[''.join(p) for p in itertools.chain(len2, len3, len4, len5, len6)]
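The pieces above can be combined into a minimal runnable sketch. The function name `iter_codes` and its `min_len`/`max_len` parameters are made up for illustration. It yields the strings lazily rather than building a list, since lengths 2-6 together produce roughly 620 million strings.

```python
import itertools
import string

DIGITS = string.digits
ALNUM = string.digits + string.ascii_lowercase


def iter_codes(min_len=2, max_len=6):
    """Lazily yield one digit followed by (length - 1) alphanumerics."""
    per_length = (
        # product(DIGITS, ALNUM, ALNUM, ...) with length - 1 copies of ALNUM
        itertools.product(DIGITS, *[ALNUM] * (length - 1))
        for length in range(min_len, max_len + 1)
    )
    for chars in itertools.chain.from_iterable(per_length):
        yield "".join(chars)


# Look at the first few values without generating the rest.
print(list(itertools.islice(iter_codes(), 3)))
```

Because the result is a generator, `itertools.islice` or a plain `for` loop with a `break` can stop early without paying for the full sequence.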
I would go with the product function from itertools.
import itertools
digits = '0123456789'
alphanum = 'abcdefghijklmnopqrstuvwxyz' + digits  # all the letters and digits
for i in range(1, 6):  # number of characters after the leading digit
    # pass alphanum i times so product yields flat tuples of characters
    for chars in itertools.product(digits, *[alphanum] * i):
        tok = ''.join(chars)
        # do whatever you want with this token `tok` here.
You can think of this problem in base 26 (ignoring the leading digit, which we handle as a separate case). The letter part ranges from 'a' to 'zzzzz', which is 26 + 26^2 + 26^3 + 26^4 + 26^5 words. Numbering those words 1, 2, 3, ... gives a bijection between numbers and words, so we just need a function that takes us from a number to a word:
letters = 'abcdefghijklmnopqrstuvwxyz'
def num_to_word(num):
    # bijective base 26: 1 -> 'a', 26 -> 'z', 27 -> 'aa'
    res = ''
    while num:
        num, rem = divmod(num - 1, 26)
        res = letters[rem] + res
    return res
Now we write the function that enumerates the strings:
def generator():
    for num in range(10):
        for letter_num in range(1, sum(26 ** i for i in range(1, 6)) + 1):
            tok = str(num) + num_to_word(letter_num)
            yield tok
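One way to check that the number-to-word mapping really is a bijection is to write its inverse and round-trip a range of values. The sketch below uses bijective base 26 (digits 1-26 instead of 0-25), because in ordinary base 26 'a' would act as a zero digit and 'a' and 'aa' would collapse to the same number. `word_to_num` is a hypothetical helper added only for this check.

```python
import string

LETTERS = string.ascii_lowercase


def num_to_word(num):
    """Bijective base 26: 1 -> 'a', 26 -> 'z', 27 -> 'aa'."""
    word = ""
    while num > 0:
        num, rem = divmod(num - 1, 26)
        word = LETTERS[rem] + word
    return word


def word_to_num(word):
    """Inverse of num_to_word (hypothetical helper, used only for checking)."""
    num = 0
    for ch in word:
        num = num * 26 + LETTERS.index(ch) + 1
    return num


# Number of words that have between 1 and 5 letters.
total = sum(26 ** i for i in range(1, 6))
print(total, num_to_word(1), num_to_word(27), num_to_word(total))
```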
Let's do this with a breadth-first-search type of algorithm,
starting from the root.
Root:
The root has 10 children, i = 0, 1, ..., 9,
so the root needs an iterator 'i':
the outermost loop iterates 'i' from 0 to 9.
i:
For each 'i' there are 5 children (ix, ixx, ixxx, ixxxx, ixxxxx),
one per number of characters after the digit,
so each 'i' needs its own iterator 'j' for the number of characters:
the loop inside the root's loop iterates 'j' from 1 to 5.
j:
Each 'j' has 'j' children (1 -> x, 2 -> xx, ..., 5 -> xxxxx),
so each 'j' needs its own iterator 'k', one per character position:
'k' is iterated inside the 'j' loop, from 1 to j
(i=2, j=4, k=3 focuses on the 'A' in the string "2xxAx").
k:
Each 'k' stands for one character position,
so it needs an iterator (value) 'c' that runs from 'a' to 'z' (or 97 to 122).
I think this makes more sense than what I wanted to show you earlier. :)
If you don't get the idea, please tell me. By the way, it's an interesting question. :)
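The levels described above map onto nested loops. Here is a rough sketch, assuming (as the outline does) that only the letters 'a' to 'z' follow the digit. The name `walk_tree` and the `max_tail` parameter are invented for illustration, and `itertools.product` stands in for the 'k' and 'c' levels, since it assigns every letter to every character position.

```python
import itertools
import string


def walk_tree(max_tail=5):
    """Yield digit + 1..max_tail letters, one tree level per loop."""
    for i in string.digits:                   # root level: the leading digit
        for j in range(1, max_tail + 1):      # i level: how many characters follow
            # k and c levels: each of the j positions takes each letter a-z
            for tail in itertools.product(string.ascii_lowercase, repeat=j):
                yield i + "".join(tail)


print(list(itertools.islice(walk_tree(), 3)))
```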
I'm working on a cs50/pset6/dna project. I'm struggling with finding a way to analyze a sequence of strings, and gather the maximum number of times a certain sequence of characters repeats consecutively. Here is an example:
String: JOKHCNHBVDBVDBVDJHGSBVDBVD
Sequence of characters I should look for: BVD
Result: My function should return 3, because at one point the characters BVD repeat three times consecutively. Even though they repeat again later (twice in a row), I should report the longest run.
It's a bit lame, but one "brute-force"ish way would be to just check for the presence of the longest substring possible. As soon as a substring is found, break out of the loop:
EDIT - Using a function might be more straightforward:
def get_longest_repeating_pattern(string, pattern):
    if not pattern:
        return ""
    # try the largest possible number of repetitions first
    for i in range(len(string)//len(pattern), 0, -1):
        current_pattern = pattern * i
        if current_pattern in string:
            return current_pattern
    return ""
string = "JOKHCNHBVDBVDBVDJHGSBVDBVD"
pattern = "BVD"
longest_repeating_pattern = get_longest_repeating_pattern(string, pattern)
# length of the run in characters; divide by len(pattern) for the repeat count
print(len(longest_repeating_pattern))
EDIT - explanation:
First, just a simple for-loop that starts at a larger number and goes down to a smaller number. For example, we start at 5 and go down to 0 (but not including 0), with a step size of -1:
>>> for i in range(5, 0, -1):
...     print(i)
...
5
4
3
2
1
>>>
If string = "JOKHCNHBVDBVDBVDJHGSBVDBVD", then len(string) is 26; if pattern = "BVD", then len(pattern) is 3.
Back to my original code:
for i in range(len(string)//len(pattern), 0, -1):
Plugging in the numbers:
for i in range(26//3, 0, -1):
26//3 is an integer division which yields 8, so this becomes:
for i in range(8, 0, -1):
So, it's a for-loop that goes from 8 down to 1 (remember, it doesn't go down to 0). i takes on a new value for each iteration: first 8, then 7, and so on.
In Python, you can "multiply" strings, like so:
>>> pattern = "BVD"
>>> pattern * 1
'BVD'
>>> pattern * 2
'BVDBVD'
>>> pattern * 3
'BVDBVDBVD'
>>>
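To tie the countdown loop and string multiplication together, here is a small trace of which repeat counts get tried for the example string. `trace_search` is a hypothetical helper written only for this illustration; it records each attempt and stops at the first hit, as the function above does.

```python
def trace_search(string, pattern):
    """Return the (i, found) pairs tried, stopping at the first hit."""
    tried = []
    for i in range(len(string) // len(pattern), 0, -1):
        found = pattern * i in string
        tried.append((i, found))
        if found:
            break
    return tried


trace = trace_search("JOKHCNHBVDBVDBVDJHGSBVDBVD", "BVD")
for i, found in trace:
    print(i, "BVD" * i, found)
```

The last pair in the trace carries the answer: its `i` is the maximum number of consecutive repetitions.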
A slightly less bruteforcey solution:
string = 'JOKHCNHBVDBVDBVDJHGSBVDBVD'
key = 'BVD'
len_k = len(key)
max_l = 0
passes = 0
curr_len = 0
prev_s = None  # initialised so the comparison below cannot raise NameError
for i in range(len(string) - len_k + 1):  # slide a window of the same length as key
    if passes > 0:  # key was just matched, so skip the overlapping windows ('VD.' and 'D..' for 'BVD')
        passes -= 1
        continue
    s = string[i:i+len_k]
    if s == key:
        curr_len += 1
        if curr_len > max_l:
            max_l = curr_len
        passes = len(key) - 1
        if prev_s == key:
            if curr_len > max_l:
                max_l = curr_len
    else:
        curr_len = 0
    prev_s = s
print(max_l)
You can do that very easily, elegantly and efficiently using a regex.
We look for all sequences of at least one repetition of your search string. Then, we just need to take the maximum length of these sequences, and divide by the length of the search string.
The regex we use is '(?:<your_sequence>)+': at least one repetition (the +) of the group (<your_sequence>). The ?: is just here to make the group non-capturing, so that findall returns the whole match, and not just the group.
In case there is no match, we use the default parameter of the max function to return 0.
The code is very short, then:
import re
def max_consecutive_repetitions(search, data):
    # re.escape keeps any regex metacharacters in `search` literal
    search_re = re.compile('(?:' + re.escape(search) + ')+')
    return max((len(seq) for seq in search_re.findall(data)), default=0) // len(search)
Sample run:
print(max_consecutive_repetitions("BVD", "JOKHCNHBVDBVDBVDJHGSBVDBVD"))
# 3
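To see why the group has to be non-capturing, it can help to compare what `re.findall` returns in each case. This is a small illustration on the sample data, not part of the solution itself.

```python
import re

data = "JOKHCNHBVDBVDBVDJHGSBVDBVD"

# Non-capturing group: findall returns each whole match.
non_capturing = re.findall(r"(?:BVD)+", data)

# Capturing group: findall returns only the group's last repetition,
# so the length of each run is lost.
capturing = re.findall(r"(BVD)+", data)

print(non_capturing)
print(capturing)
```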
This is my contribution, I'm not a professional but it worked for me (sorry for bad English)
results = {}
# Loops through all the STRs
for i in range(1, len(reader.fieldnames)):
    STR = reader.fieldnames[i]
    j = 0
    s = 0
    pre_s = 0
    # Loops through all the characters in sequence.txt
    # (<= so that a match at the very end of the sequence is also checked)
    while j <= (len(sequence) - len(STR)):
        # checks whether the current character equals the first STR character
        if STR[0] == sequence[j]:
            # while the substring from j to j + STR length equals STR, I call this a streak
            while sequence[j:(j + len(STR))] == STR:
                # j skips to the end of the substring
                j += len(STR)
                # streak counter
                s += 1
            # s > 0 means that STR matched the sequence at least once
            if s > 0:
                # save the largest streak as pre_s
                if s > pre_s:
                    pre_s = s
                # reset the streak counter to continue exploring the sequence
                s = 0
        j += 1
    # stores pre_s in a dictionary with the current STR as key
    results[STR] = pre_s
print(results)
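The snippet above relies on `reader` and `sequence` from the rest of the CS50 program, so it cannot be run on its own. As a self-contained sketch of the same streak-counting idea, the function below takes the sequence and one STR as arguments. The name `longest_run` is hypothetical, and the empty-STR guard is an added assumption.

```python
def longest_run(sequence, STR):
    """Return the longest number of back-to-back copies of STR in sequence."""
    if not STR:
        return 0  # assumption: an empty STR never matches
    best = 0
    j = 0
    while j <= len(sequence) - len(STR):
        streak = 0
        # Count consecutive copies of STR starting at position j.
        while sequence[j:j + len(STR)] == STR:
            streak += 1
            j += len(STR)
        best = max(best, streak)
        if streak == 0:
            j += 1  # no match here, move on by one character
    return best


print(longest_run("JOKHCNHBVDBVDBVDJHGSBVDBVD", "BVD"))
```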
I would like to make an alphabetical list for an application similar to an Excel worksheet.
A user would input the number of cells and I would like to generate the list.
For example, a user needs 54 cells. Then I would generate
'a','b','c',...,'z','aa','ab','ac',...,'az', 'ba','bb'
I can generate the single-letter list with
from string import ascii_lowercase
L = list(ascii_lowercase)
How do I stitch it together?
A similar question for PHP has been asked here. Does someone have the Python equivalent?
Use itertools.product.
from string import ascii_lowercase
import itertools
def iter_all_strings():
    for size in itertools.count(1):
        for s in itertools.product(ascii_lowercase, repeat=size):
            yield "".join(s)
for s in iter_all_strings():
    print(s)
    if s == 'bb':
        break
Result:
a
b
c
d
e
...
y
z
aa
ab
ac
...
ay
az
ba
bb
This has the added benefit of going well beyond two-letter combinations. If you need a million strings, it will happily give you three and four and five letter strings.
Bonus style tip: if you don't like having an explicit break inside the bottom loop, you can use islice to make the loop terminate on its own:
for s in itertools.islice(iter_all_strings(), 54):
    print(s)
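If a list is needed rather than printed output, the same `islice` call can feed `list()`. A minimal sketch follows; the helper name `first_labels` is made up for illustration.

```python
from itertools import count, islice, product
from string import ascii_lowercase


def iter_all_strings():
    """Yield 'a'..'z', then 'aa'..'zz', then 'aaa'.., without end."""
    for size in count(1):
        for s in product(ascii_lowercase, repeat=size):
            yield "".join(s)


def first_labels(n):
    """Hypothetical helper: the first n labels as a list."""
    return list(islice(iter_all_strings(), n))


labels = first_labels(54)
print(labels[0], labels[25], labels[26], labels[-1])
```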
You can use a list comprehension.
from string import ascii_lowercase
L = list(ascii_lowercase) + [letter1+letter2 for letter1 in ascii_lowercase for letter2 in ascii_lowercase]
Following @Kevin's answer:
from string import ascii_lowercase
import itertools
# define the generator itself
def iter_all_strings():
    size = 1
    while True:
        for s in itertools.product(ascii_lowercase, repeat=size):
            yield "".join(s)
        size += 1
The code below produces one string per call, which can be used to generate unique labels, for example.
# define the generator handler
gen = iter_all_strings()
def label_gen():
    # take the next string from the shared generator
    return next(gen)
# call it whenever needed
print(label_gen())
print(label_gen())
print(label_gen())
I've ended up doing my own.
I think it can create any number of letters.
def AA(n, s):
    r = n % 26
    r = r if r > 0 else 26
    n = (n - r) // 26  # integer division, so chr() still gets an int in Python 3
    s = chr(64 + r) + s
    if n > 26:
        s = AA(n, s)
    elif n > 0:
        s = chr(64 + n) + s
    return s
n = quantity | r = remaining (26 letters A-Z) | s = string
To print the list:
def uprint(nc):
    for x in range(1, nc + 1):
        print(AA(x, '').lower())
I wrote it in VBA before converting it to Python:
Function AA(n, s)
r = n Mod 26
r = IIf(r > 0, r, 26)
n = (n - r) / 26
s = Chr(64 + r) & s
If n > 26 Then
s = AA(n, s)
ElseIf n > 0 Then
s = Chr(64 + n) & s
End If
AA = s
End Function
Using neo's insight on a while loop.
This works for a given iterable of chars in ascending order, such as 'abcd...'.
n is the position of the label in the sequence, with 1 as the first position.
def char_label(n, chars):
    indexes = []
    while n:
        residual = n % len(chars)
        if residual == 0:
            residual = len(chars)
        indexes.append(residual)
        n = (n - residual)
        n = n // len(chars)
    indexes.reverse()
    label = ''
    for i in indexes:
        label += chars[i-1]
    return label
Later you can print a list of the range n of the 'labels' you need using a for loop:
my_chrs = 'abc'
n = 15
for i in range(1, n+1):
    print(char_label(i, my_chrs))
or build a list comprehension etc...
Print an Excel-style cell range, with lowercase or uppercase characters.
Uppercase:
from string import ascii_uppercase
import itertools
def iter_range_strings(start_colu):
    for size in itertools.count(1):
        for string in itertools.product(ascii_uppercase, repeat=size):
            yield "".join(string)
input_colume_range = ['A', 'B']
input_row_range = [1, 2]
for row in iter_range_strings(input_colume_range[0]):
    for colum in range(int(input_row_range[0]), int(input_row_range[1]+1)):
        print(str(row) + str(colum))
    if row == input_colume_range[1]:
        break
Result:
A1
A2
B1
B2
In two lines (plus an import):
from string import ascii_uppercase as ABC
count = 100
ABC+=' '
[(ABC[x[0]] + ABC[x[1]]).strip() for i in range(count) if (x:= divmod(i-26, 26))]
Wrap it in a function/lambda if you need to reuse.
code:
alphabet = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"]
for i in range(len(alphabet)):
    for a in range(len(alphabet)):
        print(alphabet[i] + alphabet[a])
result:
aa
ab
ac
ad
ae
af
ag
ah
ai
aj
ak
al
am
...
My prof wants me to create a function that returns the sum of the numbers in a string, but without using any lists or list methods.
The function should look like this when operating:
>>> sum_numbers('34 3 542 11')
590
Usually a function like this would be easy to create when using lists and list methods. But trying to do so without using them is a nightmare.
I tried the following code, but it doesn't work:
>>> def sum_numbers(s):
...     for i in range(len(s)):
...         int(i)
...         total = s[i] + s[i]
...         return total
...
>>> sum_numbers('1 2 3')
'11'
Instead of getting 1, 2, and 3 all converted into integers and added together, I instead get the string '11'. In other words, the numbers in the string still have not been converted to integers.
I also tried using a map() function but I just got the same results:
>>> def sum_numbers(s):
...     for i in range(len(s)):
...         map(int, s[i])
...         total = s[i] + s[i]
...         return total
...
>>> sum_numbers('1 2 3')
'11'
Totally silly of course, but for fun:
s = '34 3 542 11'
n = ""; total = 0
for c in s:
    if c == " ":
        total = total + int(n)
        n = ""
    else:
        n = n + c
# add the last number
total = total + int(n)
print(total)
> 590
This assumes that all characters (apart from single spaces between numbers) are digits.
You've clearly put some effort in here, but one part of your approach won't work as-is: you're iterating over the characters in the string, but you keep trying to treat each character as its own number. I've written a (very commented) method that accomplishes what you want without using any lists or list methods:
def sum_numbers(s):
    """
    Convert a string of numbers into a sum of those numbers.
    :param s: A string of numbers, e.g. '1 -2 3.3 4e10'.
    :return: The floating-point sum of the numbers in the string.
    """
    def convert_s_to_val(s):
        """
        Convert a string into a number. Will handle anything that
        Python could convert to a float.
        :param s: A number as a string, e.g. '123' or '8.3e-18'.
        :return: The float value of the string.
        """
        if s:
            return float(s)
        else:
            return 0
    # These will serve as placeholders.
    # (`total` rather than `sum`, to avoid shadowing the built-in)
    total = 0
    current = ''
    # Iterate over the string character by character.
    for c in s:
        # If the character is a space, we convert the current `current`
        # into its numeric representation.
        if c.isspace():
            total += convert_s_to_val(current)
            current = ''
        # For anything else, we accumulate into `current`.
        else:
            current = current + c
    # Add `current`'s last value to the total and return.
    total += convert_s_to_val(current)
    return total
Personally, I would use this one-liner, but it uses str.split():
def sum_numbers(s):
    return sum(map(float, s.split()))
No lists were used (nor harmed) in the production of this answer:
def sum_string(string):
    total = 0
    if len(string):
        # index just past the first space, or len(string) if there is no space
        j = string.find(" ") % len(string) + 1
        total += int(string[:j]) + sum_string(string[j:])
    return total
If the string is noisier than the OP indicates, then this should be more robust:
import re
def sum_string(string):
    pattern = re.compile(r"[-+]?\d+")
    total = 0
    match = pattern.search(string)
    while match:
        total += int(match.group())
        match = pattern.search(string, match.end())
    return total
EXAMPLES
>>> sum_string('34 3 542 11')
590
>>> sum_string(' 34 4 ')
38
>>> sum_string('lksdjfa34adslkfja4adklfja')
38
>>> # and I threw in signs for fun
...
>>> sum_string('34 -2 45 -8 13')
82
>>>
If you want to be able to handle floats and negative numbers:
def sum_numbers(s):
    sm = i = 0
    while i < len(s):
        t = ""
        while i < len(s) and not s[i].isspace():
            t += s[i]
            i += 1
        if t:
            sm += float(t)
        else:
            i += 1
    return sm
Which works for cases like these:
In [9]: sum_numbers('34 3 542 11')
Out[9]: 590.0
In [10]: sum_numbers('1.93 -1 23.12 11')
Out[10]: 35.05
In [11]: sum_numbers('')
Out[11]: 0
In [12]: sum_numbers('123456')
Out[12]: 123456.0
Or a variation taking slices:
def sum_numbers(s):
    prev = sm = i = 0
    while i < len(s):
        while i < len(s) and not s[i].isspace():
            i += 1
        if i > prev:
            sm += float(s[prev:i])
            prev = i
        i += 1
    return sm
You could also use itertools.groupby which uses no lists, using a set of allowed chars to group by:
from itertools import groupby
def sum_numbers(s):
    allowed = set("0123456789-.")
    return sum(float("".join(v)) for k,v in groupby(s, key=allowed.__contains__) if k)
which gives you the same output:
In [14]: sum_numbers('34 3 542 11')
Out[14]: 590.0
In [15]: sum_numbers('1.93 -1 23.12 11')
Out[15]: 35.05
In [16]: sum_numbers('')
Out[16]: 0
In [17]: sum_numbers('123456')
Out[17]: 123456.0
If you only have to consider positive ints, you could just use str.isdigit as the key:
def sum_numbers(s):
    return sum(int("".join(v)) for k,v in groupby(s, key=str.isdigit) if k)
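To make the grouping step visible, the sketch below shows the runs that `groupby` produces for the sample string when keyed on `str.isdigit`. It builds a list purely for display; the one-liner above consumes the groups lazily and never needs one.

```python
from itertools import groupby

s = "34 3 542 11"

# Each run is a (key, text) pair: key is True for runs of digits
# and False for the runs of non-digits between them.
runs = [(is_digit, "".join(chars)) for is_digit, chars in groupby(s, key=str.isdigit)]
print(runs)

# Keeping only the digit runs and converting them gives the sum.
total = sum(int(text) for is_digit, text in runs if is_digit)
print(total)
```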
Try this:
def sum_numbers(s):
    sum = 0
    # This string will represent each number
    number_str = ''
    for i in s:
        if i == ' ':
            # a space means that we have a complete number,
            # so we increase the sum
            sum += int(number_str)
            number_str = ''
            continue
        number_str += i
    else:
        # add the last number
        sum += int(number_str)
    return sum
You could write a generator:
def nums(s):
    idx=0
    while idx<len(s):
        ns=''
        while idx<len(s) and s[idx].isdigit():
            ns+=s[idx]
            idx+=1
        yield int(ns)
        while idx<len(s) and not s[idx].isdigit():
            idx+=1
>>> list(nums('34 3 542 11'))
[34, 3, 542, 11]
Then just sum that:
>>> sum(nums('34 3 542 11'))
590
Or you could use re.finditer with a regular expression and a generator expression:
>>> import re
>>> sum(int(m.group(1)) for m in re.finditer(r'(\d+)', '34 3 542 11'))
590
No lists used...
def sum_numbers(s):
    total=0
    gt=0 #grand total
    l=len(s)
    for i in range(l):
        if(s[i]!=' '):#find each number
            total = int(s[i])+total*10
        if(s[i]==' ' or i==l-1):#adding to the grand total and also add the last number
            gt+=total
            total=0
    return gt
print(sum_numbers('1 2 3'))
Here each substring is converted to a number digit by digit and added to the grand total.
If we ignore the fact that eval is evil, we can solve the problem with it.
def sum_numbers(s):
    s = s.replace(' ', '+')
    return eval(s)
Yes, that simple. But I wouldn't put that thing in production.
And of course we need to test it:
from hypothesis import given
import hypothesis.strategies as st
@given(list_num=st.lists(st.integers(), min_size=1))
def test_that_thing(list_num):
    assert sum_numbers(' '.join(str(i) for i in list_num)) == sum(list_num)
test_that_thing()
And it raises nothing.
What's a simple way to grow a string up to an arbitrary integer length x? For example, 'a' goes to 'z', then 'aa' goes to 'zz', then 'aaa', etc.
That should do the trick:
import string
def iterate_strings(n):
    if n <= 0:
        yield ''
        return
    for c in string.ascii_lowercase:
        for s in iterate_strings(n - 1):
            yield c + s
It returns a generator.
You can iterate over it with a for loop:
for s in iterate_strings(5):
    print(s)
Or get a list of the strings:
list(iterate_strings(5))
If you want to iterate over shorter strings too, you can use this function:
def iterate_strings(n):
    yield ''
    if n <= 0:
        return
    for c in string.ascii_lowercase:
        for s in iterate_strings(n - 1):
            yield c + s
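One thing to be aware of with the second version: it yields strings in prefix order ('', 'a', 'aa', 'ab', ...), not shortest-first as in the question. A possible workaround, assuming n is small enough to hold every string in memory, is a stable sort by length. This is only a sketch of that idea.

```python
import string


def iterate_strings(n):
    """Yield every lowercase string of length 0..n, in prefix order."""
    yield ''
    if n <= 0:
        return
    for c in string.ascii_lowercase:
        for s in iterate_strings(n - 1):
            yield c + s


# sorted() is stable, so strings of equal length keep their
# alphabetical order while shorter strings move to the front.
by_length = sorted(iterate_strings(2), key=len)
print(by_length[:4], by_length[26:29])
```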
Here's my solution, similar to Adam's, except it's not recursive. :]
from itertools import product
from string import ascii_lowercase
def letter_generator(limit):
    for length in range(1, limit+1):
        for letters in product(ascii_lowercase, repeat=length):
            yield ''.join(letters)
And it returns a generator, so you can use a for loop to iterate over it:
for letters in letter_generator(5):
    ...  # use `letters` here
Have fun!
(This is the second time today I found itertools.product() useful. Woot.)
You can multiply a string by an integer.
For example
>>> 'a' * 2
'aa'
>>> 'a' * 4
'aaaa'
>>> 'z' * 3
'zzz'
>>> 'az' * 3
'azazaz'
Define x. I am using x = 5 for this example.
x = 5
import string
for n in range(1,x+1):
    for letter in string.ascii_lowercase:
        print(letter*n)