Let's say you have a list which only has two types of values and goes something like ['t','r','r','r','t','t','r','r','t'] and you want to find the length of the smallest sequence number of 'r's which have 't's at both ends.
In this case the smallest sequence of 'r' has a length of 2, because there is first t,r,r,r,t and then t,r,r,t, and the latter has the smallest number of 'r's in a row surrounded by 't' and the number of 'r's is 2.
How would I code for finding that number?
This is from a problem of trying of going to a play with your friend, and you want to sit as close as possible with your friend, so you are trying to find the smallest amount of taken seats in between two free seats at a play. "#" is a taken seat and a "." is a free seat. you are given the amount of seats, and the seating arrangement (free seats and taken seats), and they are all in one line.
An example of an input is:
5
#.##.
where there are two taken seats(##) in between two free seats.
Here is my code which is not working for inputs that I don't know, but working for inputs I throw at it.
import sys
seats = int(input())
configuration = input()
seatsArray = []
betweenSeats = 1
betweenSeatsMin = 1
checked = 0
theArray = []
dotCount = 0
for i in configuration:
seatsArray.append(i)
for i in range(len(seatsArray)):
if i == len(seatsArray) - 1:
break
if seatsArray[i] == "." and seatsArray[i+1] == ".":
print(0)
sys.exit()
for i in range(0,len(seatsArray)):
if i > 0:
if checked == seats:
break
checked += 1
if seatsArray[i] == "#":
if i > 0:
if seatsArray[i-1] == "#":
betweenSeats += 1
if seatsArray[i] == ".":
dotCount += 1
if dotCount > 1:
theArray.append(betweenSeats)
betweenSeats = 1
theArray = sorted(theArray)
if theArray.count(1) > 0:
theArray.remove(1)
theArray = list(dict.fromkeys(theArray))
print(theArray[0])
This is a noob and a !optimal approach to your problem using a counter for the minimum and maximum sequence where ew compare both and return the minimum.
''' create a funciton that
will find min sequence
of target char
in a list'''
def finder(a, target):
max_counter = 0
min_counter = 0
''' iterate through our list
and if the element is the target
increase our max counter by 1
'''
for i in x:
if i == target:
max_counter += 1
'''min here is 0
so it will always be less
so we overwrite it's value
with the value of max_counter'''
if min_counter < max_counter:
min_counter = max_counter
'''at last iteration max counter will be less than min counter
so we overwrite it'''
if max_counter < min_counter:
min_counter = max_counter
else:
max_counter = 0
return min_counter
x = ['t','r','r','r','t','t','r','r','t','t','t','r','t']
y = 'r'
print(finder(x,y))
Create a string from list and then search for pattern required and then count r in the found matches and then take min of it
Code:
import re
lst = ['t','r','r','r','t','t','r','r','t']
text = ''.join(lst)
pattern = '(?<=t)r+(?=t)'
smallest_r_seq = min(match.group().count('r') for match in re.finditer(pattern, text))
print(smallest_r_seq)
Output:
2
I am trying to find the count of occurrence of fixed word from any given string.
Fixed word = 'hackerearth'
Random string may be s = 'aahkcreeatrhaaahkcreeatrha'
Now from string we can generate 2-times hackerearth.
I have written some code to find the count of (h,a,e,r,c,k,t) letters in string:
Code:
word = list(raw_input())
print word
h = word.count('h')
a = word.count('a')
c = word.count('c')
k = word.count('k')
e = word.count('e')
r = word.count('r')
t = word.count('t')
if (h >= 2 and a >= 2 and e >= 2 and r >=2) and (c >= 1 and k >= 1 and t >=1 ):
hc = h/2
ac = a/2
ec = e/2
rc = r/2
num_words = []
num_words.append(hc)
num_words.append(ac)
num_words.append(ec)
num_words.append(rc)
num_words.append(c)
num_words.append(k)
num_words.append(t)
print num_words
Output:
[2, 4, 2, 2, 2, 2, 2]
From above output list, I want to calculate the total occurrence of word.
How can I get total count of fixed word and any other way to make this code easier?
You could utilize Counter:
from collections import Counter
s = 'aahkcreeatrhaaahkcreeatrha'
word = 'hackerearth'
wd = Counter(word)
sd = Counter(s)
print(min((sd.get(c, 0) // wd[c] for c in wd), default=0))
Output:
2
Above code will create two dict like counters where letters are keys and their occurrence are values. Then it will use generator expression to iterate over the letters found in the word and for each letter generate the ratio. min will pick the lowest ratio and default value of 0 is used for case where word is empty string.
When looking for a substring, you need to account for the character order, and not just the counts
something like this should work:
def subword(lookup,whole):
if len(whole)<len(lookup):
return 0
if lookup==whole:
return 1
if lookup=='':
return 1
if lookup[0]==whole[0]:
return subword(lookup[1:],whole[1:])+subword(lookup,whole[1:])
return subword(lookup,whole[1:])
For example:
In [21]: subword('hello','hhhello')
Out[21]: 3
Because you can choose each of the 3 hs and construct the word hello with the remainder
How would I count consecutive characters in Python to see the number of times each unique digit repeats before the next unique digit?
At first, I thought I could do something like:
word = '1000'
counter = 0
print range(len(word))
for i in range(len(word) - 1):
while word[i] == word[i + 1]:
counter += 1
print counter * "0"
else:
counter = 1
print counter * "1"
So that in this manner I could see the number of times each unique digit repeats. But this, of course, falls out of range when i reaches the last value.
In the example above, I would want Python to tell me that 1 repeats 1, and that 0 repeats 3 times. The code above fails, however, because of my while statement.
How could I do this with just built-in functions?
Consecutive counts:
You can use itertools.groupby:
s = "111000222334455555"
from itertools import groupby
groups = groupby(s)
result = [(label, sum(1 for _ in group)) for label, group in groups]
After which, result looks like:
[("1": 3), ("0", 3), ("2", 3), ("3", 2), ("4", 2), ("5", 5)]
And you could format with something like:
", ".join("{}x{}".format(label, count) for label, count in result)
# "1x3, 0x3, 2x3, 3x2, 4x2, 5x5"
Total counts:
Someone in the comments is concerned that you want a total count of numbers so "11100111" -> {"1":6, "0":2}. In that case you want to use a collections.Counter:
from collections import Counter
s = "11100111"
result = Counter(s)
# {"1":6, "0":2}
Your method:
As many have pointed out, your method fails because you're looping through range(len(s)) but addressing s[i+1]. This leads to an off-by-one error when i is pointing at the last index of s, so i+1 raises an IndexError. One way to fix this would be to loop through range(len(s)-1), but it's more pythonic to generate something to iterate over.
For string that's not absolutely huge, zip(s, s[1:]) isn't a a performance issue, so you could do:
counts = []
count = 1
for a, b in zip(s, s[1:]):
if a==b:
count += 1
else:
counts.append((a, count))
count = 1
The only problem being that you'll have to special-case the last character if it's unique. That can be fixed with itertools.zip_longest
import itertools
counts = []
count = 1
for a, b in itertools.zip_longest(s, s[1:], fillvalue=None):
if a==b:
count += 1
else:
counts.append((a, count))
count = 1
If you do have a truly huge string and can't stand to hold two of them in memory at a time, you can use the itertools recipe pairwise.
def pairwise(iterable):
"""iterates pairwise without holding an extra copy of iterable in memory"""
a, b = itertools.tee(iterable)
next(b, None)
return itertools.zip_longest(a, b, fillvalue=None)
counts = []
count = 1
for a, b in pairwise(s):
...
A solution "that way", with only basic statements:
word="100011010" #word = "1"
count=1
length=""
if len(word)>1:
for i in range(1,len(word)):
if word[i-1]==word[i]:
count+=1
else :
length += word[i-1]+" repeats "+str(count)+", "
count=1
length += ("and "+word[i]+" repeats "+str(count))
else:
i=0
length += ("and "+word[i]+" repeats "+str(count))
print (length)
Output :
'1 repeats 1, 0 repeats 3, 1 repeats 2, 0 repeats 1, 1 repeats 1, and 0 repeats 1'
#'1 repeats 1'
Totals (without sub-groupings)
#!/usr/bin/python3 -B
charseq = 'abbcccdddd'
distros = { c:1 for c in charseq }
for c in range(len(charseq)-1):
if charseq[c] == charseq[c+1]:
distros[charseq[c]] += 1
print(distros)
I'll provide a brief explanation for the interesting lines.
distros = { c:1 for c in charseq }
The line above is a dictionary comprehension, and it basically iterates over the characters in charseq and creates a key/value pair for a dictionary where the key is the character and the value is the number of times it has been encountered so far.
Then comes the loop:
for c in range(len(charseq)-1):
We go from 0 to length - 1 to avoid going out of bounds with the c+1 indexing in the loop's body.
if charseq[c] == charseq[c+1]:
distros[charseq[c]] += 1
At this point, every match we encounter we know is consecutive, so we simply add 1 to the character key. For example, if we take a snapshot of one iteration, the code could look like this (using direct values instead of variables, for illustrative purposes):
# replacing vars for their values
if charseq[1] == charseq[1+1]:
distros[charseq[1]] += 1
# this is a snapshot of a single comparison here and what happens later
if 'b' == 'b':
distros['b'] += 1
You can see the program output below with the correct counts:
➜ /tmp ./counter.py
{'b': 2, 'a': 1, 'c': 3, 'd': 4}
You only need to change len(word) to len(word) - 1. That said, you could also use the fact that False's value is 0 and True's value is 1 with sum:
sum(word[i] == word[i+1] for i in range(len(word)-1))
This produces the sum of (False, True, True, False) where False is 0 and True is 1 - which is what you're after.
If you want this to be safe you need to guard empty words (index -1 access):
sum(word[i] == word[i+1] for i in range(max(0, len(word)-1)))
And this can be improved with zip:
sum(c1 == c2 for c1, c2 in zip(word[:-1], word[1:]))
If we want to count consecutive characters without looping, we can make use of pandas:
In [1]: import pandas as pd
In [2]: sample = 'abbcccddddaaaaffaaa'
In [3]: d = pd.Series(list(sample))
In [4]: [(cat[1], grp.shape[0]) for cat, grp in d.groupby([d.ne(d.shift()).cumsum(), d])]
Out[4]: [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('a', 4), ('f', 2), ('a', 3)]
The key is to find the first elements that are different from their previous values and then make proper groupings in pandas:
In [5]: sample = 'abba'
In [6]: d = pd.Series(list(sample))
In [7]: d.ne(d.shift())
Out[7]:
0 True
1 True
2 False
3 True
dtype: bool
In [8]: d.ne(d.shift()).cumsum()
Out[8]:
0 1
1 2
2 2
3 3
dtype: int32
This is my simple code for finding maximum number of consecutive 1's in binaray string in python 3:
count= 0
maxcount = 0
for i in str(bin(13)):
if i == '1':
count +=1
elif count > maxcount:
maxcount = count;
count = 0
else:
count = 0
if count > maxcount: maxcount = count
maxcount
There is no need to count or groupby. Just note the indices where a change occurs and subtract consecutive indicies.
w = "111000222334455555"
iw = [0] + [i+1 for i in range(len(w)-1) if w[i] != w[i+1]] + [len(w)]
dw = [w[i] for i in range(len(w)-1) if w[i] != w[i+1]] + [w[-1]]
cw = [ iw[j] - iw[j-1] for j in range(1, len(iw) ) ]
print(dw) # digits
['1', '0', '2', '3', '4']
print(cw) # counts
[3, 3, 3, 2, 2, 5]
w = 'XXYXYYYXYXXzzzzzYYY'
iw = [0] + [i+1 for i in range(len(w)-1) if w[i] != w[i+1]] + [len(w)]
dw = [w[i] for i in range(len(w)-1) if w[i] != w[i+1]] + [w[-1]]
cw = [ iw[j] - iw[j-1] for j in range(1, len(iw) ) ]
print(dw) # characters
print(cw) # digits
['X', 'Y', 'X', 'Y', 'X', 'Y', 'X', 'z', 'Y']
[2, 1, 1, 3, 1, 1, 2, 5, 3]
A one liner that returns the amount of consecutive characters with no imports:
def f(x):s=x+" ";t=[x[1] for x in zip(s[0:],s[1:],s[2:]) if (x[1]==x[0])or(x[1]==x[2])];return {h: t.count(h) for h in set(t)}
That returns the amount of times any repeated character in a list is in a consecutive run of characters.
alternatively, this accomplishes the same thing, albeit much slower:
def A(m):t=[thing for x,thing in enumerate(m) if thing in [(m[x+1] if x+1<len(m) else None),(m[x-1] if x-1>0 else None)]];return {h: t.count(h) for h in set(t)}
In terms of performance, I ran them with
site = 'https://web.njit.edu/~cm395/theBeeMovieScript/'
s = urllib.request.urlopen(site).read(100_000)
s = str(copy.deepcopy(s))
print(timeit.timeit('A(s)',globals=locals(),number=100))
print(timeit.timeit('f(s)',globals=locals(),number=100))
which resulted in:
12.528256356999918
5.351301653001428
This method can definitely be improved, but without using any external libraries, this was the best I could come up with.
In python
your_string = "wwwwweaaaawwbbbbn"
current = ''
count = 0
for index, loop in enumerate(your_string):
current = loop
count = count + 1
if index == len(your_string)-1:
print(f"{count}{current}", end ='')
break
if your_string[index+1] != current:
print(f"{count}{current}",end ='')
count = 0
continue
This will output
5w1e4a2w4b1n
#I wrote the code using simple loops and if statement
s='feeekksssh' #len(s) =11
count=1 #f:0, e:3, j:2, s:3 h:1
l=[]
for i in range(1,len(s)): #range(1,10)
if s[i-1]==s[i]:
count = count+1
else:
l.append(count)
count=1
if i == len(s)-1: #To check the last character sequence we need loop reverse order
reverse_count=1
for i in range(-1,-(len(s)),-1): #Lopping only for last character
if s[i] == s[i-1]:
reverse_count = reverse_count+1
else:
l.append(reverse_count)
break
print(l)
Today I had an interview and was asked the same question. I was struggling with the original solution in mind:
s = 'abbcccda'
old = ''
cnt = 0
res = ''
for c in s:
cnt += 1
if old != c:
res += f'{old}{cnt}'
old = c
cnt = 0 # default 0 or 1 neither work
print(res)
# 1a1b2c3d1
Sadly this solution always got unexpected edge cases result(is there anyone to fix the code? maybe i need post another question), and finally timeout the interview.
After the interview I calmed down and soon got a stable solution I think(though I like the groupby best).
s = 'abbcccda'
olds = []
for c in s:
if olds and c in olds[-1]:
olds[-1].append(c)
else:
olds.append([c])
print(olds)
res = ''.join([f'{lst[0]}{len(lst)}' for lst in olds])
print(res)
# [['a'], ['b', 'b'], ['c', 'c', 'c'], ['d'], ['a']]
# a1b2c3d1a1
Here is my simple solution:
def count_chars(s):
size = len(s)
count = 1
op = ''
for i in range(1, size):
if s[i] == s[i-1]:
count += 1
else:
op += "{}{}".format(count, s[i-1])
count = 1
if size:
op += "{}{}".format(count, s[size-1])
return op
data_input = 'aabaaaabbaaaaax'
start = 0
end = 0
temp_dict = dict()
while start < len(data_input):
if data_input[start] == data_input[end]:
end = end + 1
if end == len(data_input):
value = data_input[start:end]
temp_dict[value] = len(value)
break
if data_input[start] != data_input[end]:
value = data_input[start:end]
temp_dict[value] = len(value)
start = end
print(temp_dict)
PROBLEM: we need to count consecutive characters and return characters with their count.
def countWithString(input_string:str)-> str:
count = 1
output = ''
for i in range(1,len(input_string)):
if input_string[i]==input_string[i-1]:
count +=1
else:
output += f"{count}{input_string[i-1]}"
count = 1
# Used to add last string count (at last else condition will not run and data will not be inserted to ouput string)
output += f"{count}{input_string[-1]}"
return output
countWithString(input)
input:'aaabbbaabbcc'
output:'3a3b2a2b2c'
Time Complexity: O(n)
Space Complexity: O(1)
temp_str = "aaaajjbbbeeeeewwjjj"
def consecutive_charcounter(input_str):
counter = 0
temp_list = []
for i in range(len(input_str)):
if i==0:
counter+=1
elif input_str[i]== input_str[i-1]:
counter+=1
if i == len(input_str)-1:
temp_list.extend([input_str[i - 1], str(counter)])
else:
temp_list.extend([input_str[i-1],str(counter)])
counter = 1
print("".join(temp_list))
consecutive_charcounter(temp_str)
I have an assignment to do. The problem is something like this. You give a number, say x. The program calculates the square of the numbers starting from 1 and prints it only if it's a palindrome. The program continues to print such numbers till it reaches the number x provided by you.
I have solved the problem. It works fine for uptil x = 10000000. Works fine as in executes in a reasonable amount of time. I want to improve upon the efficiency of my code. I am open to changing the entire code, if required. My aim is to make a program that could execute 10^20 within around 5 mins.
limit = int(input("Enter a number"))
def palindrome(limit):
count = 1
base = 1
while count < limit:
base = base * base #square the number
base = list(str(base)) #convert the number into a list of strings
rbase = base[:] #make a copy of the number
rbase.reverse() #reverse this copy
if len(base) > 1:
i = 0
flag = 1
while i < len(base) and flag == 1:
if base[i] == rbase[i]: #compare the values at the indices
flag = 1
else:
flag = 0
i += 1
if flag == 1:
print(''.join(base)) #print if values match
base = ''.join(base)
base = int(base)
base = count + 1
count = count + 1
palindrome(limit)
He're my version:
import sys
def palindrome(limit):
for i in range(limit):
istring = str(i*i)
if istring == istring[::-1]:
print(istring,end=" ")
print()
palindrome(int(sys.argv[1]))
Timings for your version on my machine:
pu#pumbair: ~/Projects/Stackexchange time python3 palin1.py 100000
121 484 676 10201 12321 14641 40804 44944 69696 94249 698896 1002001 1234321
4008004 5221225 6948496 100020001 102030201 104060401 121242121 123454321 125686521
400080004 404090404 522808225 617323716 942060249
real 0m0.457s
user 0m0.437s
sys 0m0.012s
and for mine:
pu#pumbair: ~/Projects/Stackexchange time python3 palin2.py 100000
0 1 4 9
121 484 676 10201 12321 14641 40804 44944 69696 94249 698896 1002001 1234321
4008004 5221225 6948496 100020001 102030201 104060401 121242121 123454321 125686521
400080004 404090404 522808225 617323716 942060249
real 0m0.122s
user 0m0.104s
sys 0m0.010s
BTW, my version gives more results (0, 1, 4, 9).
Surely something like this will perform better (avoiding the unnecessary extra list operations) and is more readable:
def palindrome(limit):
base = 1
while base < limit:
squared = str(base * base)
reversed = squared[::-1]
if squared == reversed:
print(squared)
base += 1
limit = int(input("Enter a number: "))
palindrome(limit)
I think we can do it a little bit easier.
def palindrome(limit):
count = 1
while count < limit:
base = count * count # square the number
base = str(base) # convert the number into a string
rbase = base[::-1] # make a reverse of the string
if base == rbase:
print(base) #print if values match
count += 1
limit = int(input("Enter a number: "))
palindrome(limit)
String into number and number into string conversions were unnecessary. Strings can be compared, this is why you shouldn't make a loop.
You can keep a list of square palindromes upto a certain limit(say L) in memory.If the Input number x is less than sqrt(L) ,you can simply iterate over the list of palindromes and print them.This way you wont have to iterate over every number and check if its square is palindrome .
You can find a list of square palindromes here : http://www.fengyuan.com/palindrome.html
OK, here's my program. It caches valid suffixes for squares (i.e. the values of n^2 mod 10^k for a fixed k), and then searches for squares which have both that suffix and start with the suffix reversed. This program is very fast: in 24 seconds, it lists all the palindromic squares up to 10^24.
from collections import defaultdict
# algorithm will print palindromic squares x**2 up to x = 10**n.
# efficiency is O(max(10**k, n*10**(n-k)))
n = 16
k = 6
cache = defaultdict(list)
print 0, 0 # special case
# Calculate everything up to 10**k; these will be the prefix/suffix pairs we use later
tail = 10**k
for i in xrange(tail):
if i % 10 == 0: # can't end with 0 and still be a palindrome
continue
sq = i*i
s = str(sq)
if s == s[::-1]:
print i, s
prefix = int(str(sq % tail).zfill(k)[::-1])
cache[prefix].append(i)
prefixes = sorted(cache)
# Loop through the rest, but only consider matching prefix/suffix pairs
for l in xrange(k*2+1, n*2+1):
for p in prefixes:
low = (p * 10**(l-k))**.5
high = ((p+1) * 10**(l-k))**.5
low = int(low / tail) * tail
high = (int(high / tail) + 1) * tail
for n in xrange(low, high, tail):
for suf in cache[p]:
x = n + suf
s = str(x*x)
if s == s[::-1]:
print x, s
Sample output:
0 0
1 1
2 4
3 9
11 121
22 484
26 676
101 10201
111 12321
121 14641
202 40804
212 44944
<snip>
111010010111 12323222344844322232321
111100001111 12343210246864201234321
111283619361 12384043938083934048321
112247658961 12599536942224963599521
128817084669 16593841302620314839561
200000000002 40000000000800000000004