python look-and-say sequence improved - python

First, let me introduce the look-and-say sequence. It goes like a = {1, 11, 21, 1211, 111221, ...}.
Each term is built by reading the previous term aloud: count each run of identical digits, then write the count followed by the digit.
1 = one 1 (so = 11)
11 = two 1 (so = 21)
21 = one 2 one 1 (so = 1211)
As a rule of the sequence, no count can go beyond 3, so a translation table would work. But that is not semantic, and I don't like it.
What I want is a script that evaluates the given value and returns a look-and-say-style string.
However, to go beyond those limits, I want it to evaluate arbitrary characters too, so it can also work with strings such as 1A2b41.
I have been trying to make it work for hours, but the logic went wrong and I am having a brain freeze at the moment.
Here is the script. It doesn't actually work (it returns wrong results), but it should at least give you the idea.
def seq(a):
    k,last,result,a = 1,'','',str(a)
    for i in range(len(a)):
        if last==a[i]:k+=1
        else:
            result = result+str(k)+a[i]
            k=1
        last = a[i]
    return result

You can use groupby, it's just what you want:
from itertools import groupby
def lookandsay(n):
    return ''.join(str(len(list(g))) + k for k, g in groupby(n))
>>> lookandsay('1')
'11'
>>> lookandsay('1A2b41')
'111A121b1411'
>>> lookandsay(lookandsay('1A2b41'))
'311A1112111b111421'
groupby returns consecutive keys and groups from an iterable. Each key is the result of a key function applied to an element, or the element itself if no function is given (as above). Each group is an iterator, and a new group starts whenever the key changes. So, for instance, according to the documentation:
# [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
# [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
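As a minimal sketch of how this ties back to the original sequence, the groupby-based function can be applied repeatedly to produce successive terms. The helper look_and_say_terms and its parameters are hypothetical names introduced here for illustration; they are not part of the answer above.

```python
from itertools import groupby

def lookandsay(s):
    # For each run of identical characters, emit the run length followed by the character
    return ''.join(str(len(list(g))) + k for k, g in groupby(s))

def look_and_say_terms(seed, n):
    # Hypothetical helper: collect the first n terms, starting from seed
    terms = []
    term = str(seed)
    for _ in range(n):
        terms.append(term)
        term = lookandsay(term)
    return terms

print(look_and_say_terms(1, 5))
# ['1', '11', '21', '1211', '111221']
```

Because groupby compares characters rather than digits, the same function handles mixed input such as '1A2b41' without any changes.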

I can see two issues with your code:
The result is extended with k and a[i], although the counter k counts occurrences of last, not of a[i]. Replace a[i] with last here (and you may not want to add anything in the first round).
After the loop you have to add the last value of the counter together with the last character again (this was not yet done), i.e. add another result = result+str(k)+last after the loop.
Altogether, it looks like this:
def seq(a):
    a = str(a)
    k,last,result = 1,a[0],''
    for i in range(1,len(a)):
        if last==a[i]:k+=1
        else:
            result = result+str(k)+last
            k=1
        last = a[i]
    result = result+str(k)+last
    return result

I think part of why you got stumped is your use of meaningless variable names. You described the problem quite well and called it by name, but didn't even use that name for your function.
If you think of the string you start with as "look", and the one you end up with as "say", that is a start. result is probably fine but a and k have confused you. last is, I think, misleading, because it can mean either previous or final.
Also, Python's for is really foreach for a reason -- you're taking each character in the "look" one at a time, so do it explicitly in the loop.
def looksay(look):
    look = str(look)
    prev, count, say = look[0], 1, ''
    for char in look[1:]:
        if char == prev:
            count += 1
            continue
        say += str(count) + prev
        prev = char
        count = 1
    return say + str(count) + prev
The spacing is less important, but Python does have a standard coding style, and it does help readability to use it. The less mental time you have to spend parsing your code, the more focus you have for the problem.

Related

HackerRank Game of Thrones

I am trying to solve this problem on HackerRank and I am having an issue with my logic. I can't work out what I'm doing wrong, and I feel stuck.
Question link: https://www.hackerrank.com/challenges/game-of-thrones/
I created a dictionary of letters with each value set to 0, and then counted the number of times each letter appears in the string. If more than one letter occurs exactly once in the string, then obviously that string cannot become a palindrome. That's my logic; however, it only passes 10/21 test cases.
Here's my code:
def gameOfThrones(s):
    alpha_dict = {chr(x): 0 for x in range(97,123)}
    counter = 0
    for i in s:
        if i in alpha_dict:
            alpha_dict[i] += 1
    for key in alpha_dict.values():
        if key == 1:
            counter += 1
    if counter <= 1:
        return 'YES'
    else:
        return 'NO'
Any idea where I'm going wrong?
Explanation
The issue is that the code doesn't really look for palindromes. Let's step through it with a sample text based on a valid one that they gave: aaabbbbb (the only difference between this and their example is that there is an extra b).
Your first for loop counts how many times each letter appears in the string. In this case, 3 a and 5 b, with all the other characters showing up 0 times (quick aside: the end of range is exclusive, so range(97,123) stops at 122, and since chr(122) is 'z', every lowercase letter is covered).
The next for loop counts how many characters show up exactly once in the string. This string is made up of multiple a and b characters, so neither count equals 1 and the if key == 1 check never triggers. Since the counter stays at 0, which is <= 1, the function returns YES and exits. However, aaabbbbb cannot be rearranged into a palindrome, because both a and b occur an odd number of times.
Suggestion
To fix it, I would suggest having more than just one function so you can break down exactly what you need. For example, you can have a function that would return a list of all the unscrambled possibilities.
def allUnscrambled(string)->list:
    # find all possible iterations of the string
    # if given 'aabb', return 'aabb', 'abab', 'abba', 'bbaa', 'baba', 'baab'
    return lstOfStrings
After this, create a palindrome checker. You can use the one shown by Dmitriy or create your own.
def checkIfPalindrome(string)->bool:
    # determine if the given string is a palindrome
    return isOrNotPalindrome
Put the two together to get a function that will, given a list of strings, determine if at least one of them is a palindrome. If it is, that means the original string is an anagrammed palindrome.
def palindromeInList(lst)->bool:
    # given the list of strings from allUnscrambled(str), is any of them a palindrome?
    return isPalindromeInList
Your function gameOfThrones(s) can then call this palindromeInList( allUnscrambled(s) ) and then return YES or NO depending on the result. Breaking it up into smaller pieces and delegating tasks is usually a good way to handle these problems.
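A minimal sketch of how the three pieces described above could be filled in, assuming itertools.permutations is acceptable for generating the rearrangements. The snake_case names are hypothetical stand-ins for the stubs above. The number of permutations grows factorially, so this is only practical for very short strings and is meant to illustrate the decomposition, not to pass large test cases.

```python
from itertools import permutations

def all_unscrambled(string):
    # Every distinct ordering of the characters (factorial growth: short inputs only)
    return {''.join(p) for p in permutations(string)}

def check_if_palindrome(string):
    # A palindrome reads the same forwards and backwards
    return string == string[::-1]

def palindrome_in_list(candidates):
    # True if at least one candidate is a palindrome
    return any(check_if_palindrome(c) for c in candidates)

def game_of_thrones_bruteforce(s):
    return 'YES' if palindrome_in_list(all_unscrambled(s)) else 'NO'

print(game_of_thrones_bruteforce('aabb'))      # YES ('abba' is a palindrome)
print(game_of_thrones_bruteforce('aaabbbbb'))  # NO
```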
I corrected the logic in my solution. I was only checking key == 1 instead of checking for every odd count.
So the corrected code looks like:
for key in alpha_dict.values():
    if key % 2 == 1:
        counter += 1
It passes all the test cases on the HackerRank website.
The property that you have to check on the input string is that the number of characters with an odd number of repetitions must be at most 1. So, the main ingredients for your recipe are:
a counter for each character
a hash map to store the counters, having the characters as keys
an iteration over the input string
A plain implementation could be:
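def gameOfThrones(s):
    counters = {}
    for c in s:
        counters[c] = counters.get(c, 0) + 1
    # count the characters that occur an odd number of times
    n_odd_characters = sum(v % 2 for v in counters.values())
    return 'YES' if n_odd_characters <= 1 else 'NO'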
Using a functional approach, based on reduce from functools (the dict | operator used here requires Python 3.9 or later):
from functools import reduce
def gamesOfThrones(s):
    return ['NO', 'YES'][len(reduce(
        lambda x, y: (x | {y: 1}) if y not in x else (x.pop(y) and x),
        s,
        {}
    )) <= 1]
If you want, you can use the Counter class from collections to make your code more concise:
from collections import Counter
def gamesOfThrones(s):
    return ['NO', 'YES'][sum(v % 2 for v in Counter(s).values()) <= 1]

How does comparing two chars (within a string) work in Python

I am starting to learn Python and looked at the following website: https://www.w3resource.com/python-exercises/string/
I am working on #4, which is "Write a Python program to get a string from a given string where all occurrences of its first char have been changed to '$', except the first char itself."
str="restart"
char=str[0]
print(char)
strcpy=str
i=1
for i in range(len(strcpy)):
    print(strcpy[i], "\n")
    if strcpy[i] is char:
        strcpy=strcpy.replace(strcpy[i], '$')
print(strcpy)
I would expect "resta$t" but the actual result is: $esta$t
Thank you for your help!
There are two issues. First, you are not starting the iteration where you think you are:
i = 1 # great, i is 1
for i in range(5):
    print(i)
0
1
2
3
4
i has been overwritten by the value tracking the loop.
Second, is does not test value equality; that is what the == operator is for. is tests whether two names refer to the same object. Simpler types such as int and str can make it seem like is compares values, because CPython caches small integers and some strings, but other types do not behave this way:
a, b = 5, 5
a is b
True
a, b = "5", "5"
a is b
True
a==b
True
### This doesn't work
a, b = [], []
a is b
False
a == b
True
As @Kevin pointed out in the comments, 99% of the time, is is not the operator you want.
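To make the distinction concrete, here is a small sketch contrasting identity and equality. The variable names are illustrative only. The string case is deliberately checked with == alone, since whether two equal strings share one object is an implementation detail that should not be relied on.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # a separate list with equal contents
c = a           # a second name for the same list object

print(a == b)   # True: the values are equal
print(a is b)   # False: they are two different objects
print(a is c)   # True: both names refer to one object

# Strings built at runtime may or may not share an object, so compare with ==
first = 'restart'[0]
built = ''.join(['r'])
print(first == built)   # True
```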
As far as your code goes, str.replace will replace all instances of its first argument with the second, unless you give it an optional count of instances to replace. To avoid replacing the first character, grab the first char separately, like val = somestring[0], then replace the rest using a slice, with no need for iteration:
somestr = 'restart' # don't use str as a variable name
val = somestr[0] # val is 'r'
# somestr[1:] gives 'estart'
x = somestr[1:].replace(val, '$')
print(val+x)
# resta$t
If you still want to iterate, you can do that over the slice as well:
# collect your letters into a list
letters = []
char = somestr[0]
for letter in somestr[1:]: # No need to track an index here
    if letter == char: # don't use is, use == for value comparison
        letter = '$' # change letter to a different value if it is equal to char
    letters.append(letter)
# Then use join to concatenate back to a string
print(char + ''.join(letters))
# resta$t
Your code needs some modification.
You can modify it as shown below. Note that this version replaces every occurrence of the first character, including the first one itself.
strcpy="restart"
i=1
for i in range(len(strcpy)):
    strcpy=strcpy.replace(strcpy[0], '$')[:]
print(strcpy)
# $esta$t
Also, it is good practice in Python to put code in a function. You can modify your code as shown below, or use this function directly.
def charreplace(s):
    return s.replace(s[0],'$')[:]
charreplace("restart")
#'$esta$t'
Hope this helps.

How to implement a brute force solution to "Finding first unique character in a string"

As described here:
https://leetcode.com/problems/first-unique-character-in-a-string/description/
I attempted one here but couldn't quite finish:
https://paste.pound-python.org/show/JuPLgdgqceMQYh5kk0Sf/
#Given a string, find the first non-repeating character in it and return its index. If it doesn't exist, return -1.
#Examples:
#s = "leetcode"
#return 0.
#s = "loveleetcode",
#return 2.
#Note: You may assume the string contains only lowercase letters.
class Solution(object):
    def firstUniqChar(self, s):
        """
        :type s: str
        :rtype: int
        """
        for i in range(len(s)):
            for j in range(i+1,len(s)):
                if s[i] == s[j]:
                    break
                #But now what. let's say i have complete loop of j where there's no match with i, how do I return i?
I'm ONLY interested in the brute force N^2 solution, nothing fancier. The idea in the above solution is to start a double loop, where the inner loop searches for a match with the outer loop's char, and if there's a match, break the inner loop and continue on to the next char in the outer loop.
But the question is, how do I handle the case where there's NO match, which is when I need to return the outer loop's index as the first unique one?
I can't quite figure out a graceful way to do it that also handles edge cases like a single-char string.
Iterate over each char, and check if it appears in any of the following chars. We need to keep track of the characters we've already seen, to avoid falling into edge cases. Try this, it's an O(n^2) solution:
def firstUniqChar(s):
    # store already seen chars
    seen = []
    for i, c in enumerate(s):
        # return if char not previously seen and not in rest
        if c not in seen and c not in s[i+1:]:
            return i
        # mark char as seen
        seen.append(c)
    # no unique chars were found
    return -1
For completeness' sake, here's an O(n) solution:
def firstUniqChar(s):
    # build frequency table
    freq = {}
    for i, c in enumerate(s):
        if c not in freq:
            # store [frequency, index]
            freq[c] = [1, i]
        else:
            # update frequency
            freq[c][0] += 1
    # find leftmost char with frequency == 1
    # it's more efficient to traverse the freq table
    # instead of the (potentially big) input string
    leftidx = float('+inf')
    for f, i in freq.values():
        if f == 1 and i < leftidx:
            leftidx = i
    # handle edge case: no unique chars were found
    return leftidx if leftidx != float('+inf') else -1
For example:
firstUniqChar('cc')
=> -1
firstUniqChar('ccdd')
=> -1
firstUniqChar('leetcode')
=> 0
firstUniqChar('loveleetcode')
=> 2
Add an else to the for loop where you return.
for j ...:
    ...
else:
    return i
I'd first like to note that your current algorithm for finding unique characters doesn't work correctly. That's because you can't assume the character at index i is unique just because none of the indexes j found the same character later in the string. The character at index i could be a repeat of an earlier character (which you'd have skipped when the previous j was equal to the current i).
You could fix the algorithm by letting j iterate over the whole range of indexes, and adding an extra check to ignore the matches when the indexes are the same to your if:
for i in range(len(s)):
    for j in range(len(s)):
        if i != j and s[i] == s[j]:
            break
As Ignacio Vazquez-Abrams suggests in his answer, you can then add an else block to the inner for loop to make the code return when no match was found:
    else: # this line should be indented to match the "for j" loop
        return i
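Putting the two fragments together, a minimal sketch of the complete brute-force function could look like the following. The name first_uniq_char and the final return -1 (taken from the problem statement) are choices made here for illustration.

```python
def first_uniq_char(s):
    for i in range(len(s)):
        for j in range(len(s)):
            # any other position holding the same character disqualifies i
            if i != j and s[i] == s[j]:
                break
        else:
            # the inner loop finished without break, so s[i] is unique
            return i
    # every character repeats somewhere (or the string is empty)
    return -1

print(first_uniq_char('leetcode'))      # 0
print(first_uniq_char('loveleetcode'))  # 2
print(first_uniq_char('redder'))        # -1
```

The else clause of a for loop runs only when the loop ends without hitting break, which is exactly the "no match found" case the question asks about. A single-character string is handled too: the inner loop never breaks, so index 0 is returned.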
There are also a few ways you can solve this problem more simply if you use the builtin functions and types available in Python.
For instance, you can implement an O(n^2) solution equivalent to the one above using only one explicit loop, and using str.count to replace the inner one:
def firstUniqChar(s):
    for i, c in enumerate(s):
        if s.count(c) == 1:
            return i
    return -1 # the problem statement asks for -1 when nothing is unique
I'm also using enumerate to get the character values and indexes together in one step, rather than iterating over a range and indexing later.
There's also a very easy way to make an O(n) solution using collections.Counter, which can do all the counting in one pass before you start checking the characters in order to try to find the first one that is unique:
from collections import Counter
def firstUniqChar(s):
    count = Counter(s)
    for i, c in enumerate(s):
        if count[c] == 1:
            return i
    return -1 # the problem statement asks for -1 when nothing is unique
I'm not sure your approach will work on an even palindrome, e.g. "redder" (note the second d). Try this instead:
s1 = "leetcode"
s2 = "loveleetcode"
s3 = "redder"
def unique_index(s):
    ahead, behind = list(s), set()
    for idx, char in enumerate(s):
        ahead = ahead[1:]
        if (char not in ahead) and (char not in behind):
            return idx
        behind.add(s[idx])
    return -1
assert unique_index(s1) == 0
assert unique_index(s2) == 2
assert unique_index(s3) == -1
For each character, we look ahead and behind. Only characters disjoint from both groups will return an index. As iteration progresses, the list of what is observed ahead shortens, while what is seen behind extends. The default is -1 as stated in the actual leetcode challenge.
A second list is not required. @Óscar López's answer is the simplified answer.

How to remove duplicate characters in a string and print according to the longest occurrence

I've been trying to solve this problem, but I have been unable to.
x="abcaa" # sample input
x="bca" # sample output
I have tried this:
from collections import OrderedDict
def f(x):
    print ''.join(OrderedDict.fromkeys(x))
t=input()
for i in range(t):
    x=raw_input()
    f(x)
The above code is giving:
x="abcaa" # Sample input
x="abc" # sample output
More Details:
Sample Input:
abc
aaadcea
abcdaaae
Sample Output:
abc
adce
bcdae
In the first case, the string is "abcaa"; here 'a' has its longest run of repetitions at the end, so it is placed last, resulting in "bca". In the other case, "aaadcea", 'a' has its longest run at the start, so it is placed first, resulting in "adce".
The OrderedDict isn't helping you at all, because the order you're preserving isn't the one you want.
If I understand your question (and I'm not at all sure I do…) the order you want is a sorted order, using the number of times the character appears as the sorting key, so the most frequent characters appear last.
So, this means you need to associate each character with a count in some way. You could do that with an explicit loop and d.setdefault(char, 0) and so on, but if you look in the collections docs, you'll see something named Counter right next to OrderedDict, which is a:
dict subclass for counting hashable objects
That's exactly what you want:
>>> import collections
>>> x = 'abcaa'
>>> c = collections.Counter(x)
>>> c
Counter({'a': 3, 'b': 1, 'c': 1})
And now you just need to sort with a key function:
>>> ''.join(sorted(c, key=c.__getitem__))
'bca'
If you want this to be a stable sort, so that elements with the same counts are shown in the order they first appear, or the order they first reach that count, then you will need OrderedDict. How do you get both OrderedDict behavior and Counter behavior? There's a recipe in the docs that shows how to do it. (And you actually don't even need that much; the __repr__ and __reduce__ are irrelevant for your use, so you can just inherit from Counter and OrderedDict and pass for the body.)
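A minimal sketch of the combined class mentioned above, written for Python 3. The helper by_frequency is a hypothetical name used only for this example. On Python 3.7 and later a plain Counter already preserves insertion order, so the subclass mainly matters on older versions.

```python
from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
    # Counter that remembers the order in which elements are first seen
    pass

def by_frequency(s):
    counts = OrderedCounter(s)
    # sorted() is stable, so characters with equal counts keep first-seen order
    return ''.join(sorted(counts, key=counts.__getitem__))

print(by_frequency('abcaa'))  # bca
```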
Taking a different guess at what you want:
For each character, you want to find the position at which it has the most repetitions.
That means that, as you go along, you need to keep track of two things for each character: the position at which it has the most repetitions so far, and how many. And you also need to keep track of the current run of characters.
In that case, the OrderedDict is necessary, it's just not sufficient. You need to add characters to the OrderedDict as you find them, and remove them and re-add them when you find a longer run, and you also need to store a count in the value for each key rather than just using the OrderedDict as an OrderedSet. Like this:
d = collections.OrderedDict()
lastch, runlength = None, 0
for ch in x:
    if ch == lastch:
        runlength += 1
    else:
        # only move the character when this run beats its best so far
        if runlength > d.get(lastch, 0):
            try:
                del d[lastch]
            except KeyError:
                pass
            d[lastch] = runlength
        lastch, runlength = ch, 1
# handle the final run the same way
if runlength > d.get(lastch, 0):
    try:
        del d[lastch]
    except KeyError:
        pass
    d[lastch] = runlength
x = ''.join(d)
You may notice that there's a bit of repetition here, and a lot of verbosity. You can simplify the problem quite a bit by breaking it into two steps: first compress the string into runs, then just keep track of the largest run for each character. Thanks to the magic of iterators, this doesn't even have to be done in two passes, the first step can be done lazily.
Also, because you're still using Python 2.7 and therefore don't have OrderedDict.move_to_end, we have to do that silly delete-then-add shuffle, but we can use pop to make that more concise.
So:
d = collections.OrderedDict()
for key, group in itertools.groupby(x):
    runlength = len(list(group))
    if runlength > d.get(key, 0):
        d.pop(key, None)
        d[key] = runlength
x = ''.join(d)
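For readers on Python 3, where OrderedDict.move_to_end is available, the delete-then-add step can be replaced by a single call. This is a sketch under that assumption; the function name longest_run_order is hypothetical.

```python
from collections import OrderedDict
from itertools import groupby

def longest_run_order(x):
    best = OrderedDict()
    for key, group in groupby(x):
        runlength = len(list(group))
        # only a strictly longer run moves the character
        if runlength > best.get(key, 0):
            best[key] = runlength
            best.move_to_end(key)  # re-position the key at the end
    return ''.join(best)

print(longest_run_order('abcaa'))     # bca
print(longest_run_order('aaadcea'))   # adce
print(longest_run_order('abcdaaae'))  # bcdae
```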
A different way to solve this would be to use a plain-old dict, and store the runlength and position for each character, then sort the results in position order. This means we no longer need to do the move-to-end shuffle, we're just updating the position as part of the value:
d = {}
for i, (key, group) in enumerate(itertools.groupby(x)):
    runlength = len(list(group))
    if runlength > d.get(key, (None, 0))[1]:
        d[key] = (i, runlength)
x = ''.join(sorted(d, key=d.__getitem__))
However, I'm not sure this change actually improves the readability, so I'd go with the second version above.
This is an inelegant, ugly, inefficient, and almost certainly non-Pythonic solution but I think it does what you're looking for.
t = raw_input('Write your string here: ')
# Create a dict to store the longest run seen for each character
seen = dict()
# Make sure we actually have a string
if len(t) < 1:
    print ""
else:
    prevChar = t[0]
    count = 0
    for char in t:
        if char == prevChar:
            count = count + 1
        else:
            # Check if the substring we just finished is the longest
            if count > seen.get(prevChar, 0):
                seen[prevChar] = count
            # Characters differ, restart
            count = 1
            prevChar = char
    # Record the last run, but only if it is the longest for that character
    if count > seen.get(prevChar, 0):
        seen[prevChar] = count
    # Now let's build the string, appending the character when we find the longest version
    count = 0
    prevChar = t[0]
    finalString = ""
    for char in t:
        if char in finalString:
            # Make sure we don't append a char twice, append the first time we find the longest subsequence
            continue
        if char == prevChar:
            count = count + 1
        else:
            # Check if the substring we just finished is the longest
            if count == seen.get(prevChar, 0):
                finalString = finalString + prevChar
            # Characters differ, restart
            count = 1
            prevChar = char
    # Check the last character
    if count == seen[prevChar]:
        finalString= finalString + prevChar
    print finalString

Python - packing/unpacking by letters

I'm just starting to learn python and I have this exercise that's puzzling me:
Create a function that can pack or unpack a string of letters.
So aaabb would be packed a3b2 and vice versa.
For the packing part of the function, I wrote the following
def packer(s):
    if s.isalpha(): # Defines if unpacked
        stack = []
        for i in s:
            if s.count(i) > 1:
                if (i + str(s.count(i))) not in stack:
                    stack.append(i + str(s.count(i)))
            else:
                stack.append(i)
        print "".join(stack)
    else:
        print "Something's not quite right.."
        return False
packer("aaaaaaaaaaaabbbccccd")
This seems to work properly. But the assignment says that if the input has (for example) the letter a after b or c, then it should later be unpacked into its original form.
So "aaabbkka" should become a3b2k2a, not a4b2k2.
I hence figured that I cannot use the "count()" command, since that counts all occurrences of the item in the whole string, correct?
What would be my options here then?
On to the unpacking -
I've thought through the basics of what my code needs to do:
1. Between the "if s.isalpha():" and else, I should add an elif that
checks whether or not the string has digits in it. (I figured this would be
enough to determine whether it's the packed version or unpacked.)
2. Create a for loop and, inside it, an if statement which then checks for every element:
2.1. If it has a number after it > Return (or add to an empty stack) the element repeated that number of times
2.2. If it has no number following it > Return just the element.
Big question number 2 - how do I check whether it's a number or just another
alphabetical element following an element in the list? I guess this must be done with
slicing, but those only take integers. Could this be achieved with the index command?
Also - if this is of any relevance - so far I've basically covered lists, strings, if and for
and I've been told this exercise is doable with just those (...so if you wouldn't mind keeping this really basic)
All help appreciated for the newbie enthusiast!
SOLVED:
def packer(s):
    if s.isalpha(): # Defines if unpacked
        groups= []
        last_char = None
        for c in s:
            if c == last_char:
                groups[-1].append(c)
            else:
                groups.append([c])
            last_char = c
        return ''.join('%s%s' % (g[0], len(g)>1 and len(g) or '') for g in groups)
    else: # Seems to be packed
        stack = ""
        for i in range(len(s)):
            if s[i].isalpha():
                if i+1 < len(s) and s[i+1].isdigit():
                    digit = s[i+1]
                    char = s[i]
                    i += 2
                    while i < len(s) and s[i].isdigit():
                        digit +=s[i]
                        i+=1
                    stack += char * int(digit)
                else:
                    stack+= s[i]
            else:
                ""
        return "".join(stack)
print (packer("aaaaaaaaaaaabbbccccd"))
print (packer("a4b19am4nmba22"))
So this is my final code. Almost managed to pull it all off with just for loops and if statements.
In the end though I had to bring in the while loop to solve reading the multiple-digit numbers issue. I think I still managed to keep it simple enough. Thanks a ton millimoose and everyone else for chipping in!
A straightforward solution:
If a char is different, make a new group. Otherwise append it to the last group. Finally count all groups and join them.
def packer(s):
    groups = []
    last_char = None
    for c in s:
        if c == last_char:
            groups[-1].append(c)
        else:
            groups.append([c])
        last_char = c
    return ''.join('%s%s'%(g[0], len(g)) for g in groups)
Another approach is using re.
Regex r'(.)\1+' matches runs of two or more identical consecutive characters. And with re.sub you can easily encode it:
import re
regex = re.compile(r'(.)\1+')
def replacer(match):
    return match.group(1) + str(len(match.group(0)))
regex.sub(replacer, 'aaabbkka')
#=> 'a3b2k2a'
I think you can use the itertools.groupby function,
for example:
import itertools
data = 'aaassaaasssddee'
groupped_data = ((c, len(list(g))) for c, g in itertools.groupby(data))
result = ''.join(c + (str(n) if n > 1 else '') for c, n in groupped_data)
Of course, one can make this code more readable by using a generator function instead of generator expressions.
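As a sketch of that idea, the run-length step can be moved into a small generator function. The names runs and pack are hypothetical and chosen only for this example.

```python
from itertools import groupby

def runs(data):
    # Yield (character, run length) for each run of identical characters
    for char, group in groupby(data):
        yield char, len(list(group))

def pack(data):
    # Omit the count for runs of length 1
    return ''.join(c + (str(n) if n > 1 else '') for c, n in runs(data))

print(pack('aaassaaasssddee'))  # a3s2a3s3d2e2
print(pack('aaabbkka'))         # a3b2k2a
```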
This is an implementation of the algorithm I outlined in the comments:
from itertools import takewhile, count, islice, izip
def consume(items):
    from collections import deque
    deque(items, maxlen=0)
def ilen(items):
    result = count()
    consume(izip(items, result))
    return next(result)
def pack_or_unpack(data):
    start = 0
    result = []
    while start < len(data):
        if data[start].isdigit():
            # `data` is packed, bail
            return unpack(data)
        run = run_len(data, start)
        # append the character that might repeat
        result.append(data[start])
        if run > 1:
            # append the length of the run of characters
            result.append(str(run))
        start += run
    return ''.join(result)
def run_len(data, start):
    """Return the length of the run of identical characters starting at
    `start`"""
    return ilen(takewhile(lambda c: c == data[start],
                          islice(data, start, None)))
def unpack(data):
    result = []
    for i in range(len(data)):
        if data[i].isdigit():
            # skip digits, we'll look for them below
            continue
        # packed character
        c = data[i]
        # number of repetitions
        n = 1
        if (i+1) < len(data) and data[i+1].isdigit():
            # if the next character is a digit, grab all the digits in the
            # substring starting at i+1
            n = int(''.join(takewhile(str.isdigit, data[i+1:])))
        # append the repeated character
        result.append(c*n) # multiplying a string with a number repeats it
    return ''.join(result)
print pack_or_unpack('aaabbc')
print pack_or_unpack('a3b2c')
print pack_or_unpack('a10')
print pack_or_unpack('b5c5')
print pack_or_unpack('abc')
A regex-flavoured version of unpack() would be:
import re
UNPACK_RE = re.compile(r'(?P<char> [a-zA-Z]) (?P<count> \d+)?', re.VERBOSE)
def unpack_re(data):
    matches = UNPACK_RE.finditer(data)
    pairs = ((m.group('char'), m.group('count')) for m in matches)
    return ''.join(char * (int(count) if count else 1)
                   for char, count in pairs)
This code demonstrates the most straightforward (or "basic") approach of implementing that algorithm. It's not particularly elegant or idiomatic or necessarily efficient. (It would be if written in C, but Python has the caveats such as: indexing a string copies the character into a new string, and algorithms that seem to copy data excessively might be faster than trying to avoid this if the copying is done in C and the workaround was implemented with a Python loop.)
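One way to sanity-check any pack/unpack pair is a round trip: unpacking a packed string should give back the original. The following Python 3 sketch uses two regular expressions in the spirit of the regex answers in this thread; the names PACK_RE, pack, and unpack are assumptions made for illustration, and the input is assumed to contain only letters.

```python
import re

PACK_RE = re.compile(r'(.)\1+')
UNPACK_RE = re.compile(r'([a-zA-Z])(\d+)?')

def pack(data):
    # Replace each run of 2+ identical characters with the character and the run length
    return PACK_RE.sub(lambda m: m.group(1) + str(len(m.group(0))), data)

def unpack(data):
    # A letter without a trailing count stands for a single occurrence
    return ''.join(char * (int(count) if count else 1)
                   for char, count in UNPACK_RE.findall(data))

for s in ['aaabbkka', 'abc', 'aaaaaaaaaaaabbbccccd']:
    assert unpack(pack(s)) == s

print(pack('aaabbkka'))  # a3b2k2a
```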
