decode string algorithm implementation for advice

I am working on the algorithm puzzle below, which asks how many ways a string of digits can be decoded into letters. The full problem statement and my reference code follow. All the solutions I found decode from back to front, but I think decoding from front to back should work just as well. Is there any particular benefit or consideration that makes back-to-front decoding better for this problem? Thanks.
A message containing letters from A-Z is being encoded to numbers using the following mapping:
'A' -> 1
'B' -> 2
...
'Z' -> 26
Given an encoded message containing digits, determine the total number of ways to decode it.
For example,
Given encoded message "12", it could be decoded as "AB" (1 2) or "L" (12).
The number of ways decoding "12" is 2.
public class Solution {
    public int numDecodings(String s) {
        int n = s.length();
        if (n == 0) return 0;
        int[] memo = new int[n+1];
        memo[n] = 1;
        memo[n-1] = s.charAt(n-1) != '0' ? 1 : 0;
        for (int i = n - 2; i >= 0; i--)
            if (s.charAt(i) == '0') continue;
            else memo[i] = (Integer.parseInt(s.substring(i,i+2))<=26) ? memo[i+1]+memo[i+2] : memo[i+1];
        return memo[0];
    }
}
thanks in advance,
Lin

There will be no difference whether you decode the string from front-to-back or back-to-front if you break it into sub-strings and store their results.
This implements the front-to-back approach:
def decode_string(st):
    result_dict = {st[0]:1}
    for i in xrange(2,len(st)+1):
        if int(st[i-1]) == 0:
            if int(st[i-2]) not in [1,2]:
                return "Not possible to decode"
            result_dict[st[:i]] = 0
        else:
            result_dict[st[:i]] = result_dict[st[:i-1]]
        if int(st[i-2:i]) < 27 and st[i-2] != '0':
            result_dict[st[:i]] = result_dict[st[:i]] + result_dict.get(st[:i-2],1)
    return result_dict[st]
print decode_string("125312")
result_dict holds the number of decodings for each incremental prefix of the string. It is initialized with the first character.
The '0' character needs a special check because the only valid codes containing 0 are 10 and 20, so the function returns early if a 0 follows any other digit.
Then, for each index, check whether the current digit combined with the previous one forms a valid letter code (a value below 27 with no leading zero). If so, add the count for the prefix that ends two characters earlier.
Store the result for each incremental prefix in the dictionary.
Result:
The result_dict contains values like this:
{'12': 2, '12531': 3, '1': 1, '125312': 6, '125': 3, '1253': 3}
So result_dict[st] gives the required answer
Using Lists is a better idea
def decode_string(st):
    result_list = [1]
    for i in xrange(2,len(st)+1):
        if int(st[i-1]) == 0:
            if int(st[i-2]) not in [1,2]:
                return "Not possible to decode"
            result_list.append(0)
        else:
            result_list.append(result_list[i-2])
        if int(st[i-2:i]) < 27 and st[i-2] != '0':
            if i>2:
                result_list[i-1] = result_list[i-1] + result_list[i-3]
            else:
                result_list[i-1] = result_list[i-1] + 1
    print result_list
    return result_list[-1]
print decode_string("125312")
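To make the claim about direction concrete, here is a minimal Python 3 sketch that counts decodings both ways over the same table layout. The function names are hypothetical, and the handling of the empty string (returning 0) is an assumption borrowed from the Java code in the question.

```python
def count_front_to_back(s):
    """ways[i] holds the number of decodings of the prefix s[:i]."""
    if not s:
        return 0
    ways = [0] * (len(s) + 1)
    ways[0] = 1  # the empty prefix has exactly one (empty) decoding
    for i in range(1, len(s) + 1):
        # the last digit on its own is a letter unless it is '0'
        if s[i - 1] != "0":
            ways[i] += ways[i - 1]
        # the last two digits form a letter when they read 10..26
        if i >= 2 and "10" <= s[i - 2:i] <= "26":
            ways[i] += ways[i - 2]
    return ways[len(s)]


def count_back_to_front(s):
    """ways[i] holds the number of decodings of the suffix s[i:]."""
    if not s:
        return 0
    n = len(s)
    ways = [0] * (n + 1)
    ways[n] = 1  # the empty suffix has exactly one (empty) decoding
    for i in range(n - 1, -1, -1):
        if s[i] == "0":
            continue  # no letter starts with '0', so ways[i] stays 0
        ways[i] = ways[i + 1]
        # s[i] is non-zero here, so the pair is at least "10"
        if i + 1 < n and s[i:i + 2] <= "26":
            ways[i] += ways[i + 2]
    return ways[0]
```

Both functions fill one table entry per character and read only the two neighbouring entries, so the direction changes the indexing but not the result or the cost.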

Related

Python: How to find all ways to decode a string?

I'm trying to solve this problem but it fails with input "226".
Problem:
A message containing letters from A-Z is being encoded to numbers using the following mapping:
'A' -> 1
'B' -> 2
...
'Z' -> 26
Given a non-empty string containing only digits, determine the total number of ways to decode it.
My Code:
class Solution:
    def numDecodings(self, s: str) -> int:
        decode =[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26]
        ways = []
        for d in decode:
            for i in s:
                if str(d) == s or str(d) in s:
                    ways.append(d)
                if int(i) in decode:
                    ways.append(str(i))
        return len(ways)
My code returns 2. It only takes care of combinations (22,6) and (2,26).
It should be returning 3, so I'm not sure how to take care of the (2,2,6) combination.

This problem breaks down into overlapping subproblems, so it can be solved recursively.
Subproblem 1: the last digit of the string is valid (i.e. non-zero). In that case, recur on the remaining n-1 digits.
if s[n-1] > "0":
    count = number_of_decodings(s,n-1)
Subproblem 2: the last two digits form a valid number (less than 27, with a leading 1 or 2). In that case, recur on the remaining n-2 digits.
if (s[n - 2] == '1' or (s[n - 2] == '2' and s[n - 1] < '7') ) :
    count += number_of_decodings(s, n - 2)
Base case: the length of the string is 0 or 1.
if n == 0 or n == 1 :
    return 1
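The fragments above can be assembled into one runnable function. This is a sketch under a few assumptions rather than the answer's exact code: it takes only the string (the inner helper `count` carries the length), it uses `functools.lru_cache` so that each prefix length is solved once, and it uses a single base case of `n == 0` so that an input starting with '0' yields 0 instead of 1.

```python
from functools import lru_cache


def number_of_decodings(s):
    """Count decodings of the digit string s by recursing on prefix length."""

    @lru_cache(maxsize=None)
    def count(n):
        # count(n) is the number of decodings of the first n digits
        if n == 0:
            return 1  # nothing left to decode: one way
        total = 0
        # Subproblem 1: the last digit stands alone if it is non-zero
        if s[n - 1] > "0":
            total += count(n - 1)
        # Subproblem 2: the last two digits read as 10..26
        if n >= 2 and (s[n - 2] == "1" or (s[n - 2] == "2" and s[n - 1] < "7")):
            total += count(n - 2)
        return total

    return count(len(s)) if s else 0


print(number_of_decodings("226"))
```

Without the cache the two recursive calls would recompute the same prefixes repeatedly. Very long inputs can still hit Python's recursion limit, which is one reason the iterative table version is usually preferred.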
EDIT: After a quick search online, I found another (more interesting) way to solve this particular problem, which uses dynamic programming:
# A dynamic-programming function
# to count decodings
def countDecodingDP(digits, n):
    count = [0] * (n + 1)  # a table to store
                           # results of subproblems
    count[0] = 1
    count[1] = 1
    for i in range(2, n + 1):
        count[i] = 0
        # If the last digit is not 0, then it can be
        # decoded on its own, so carry over count[i - 1]
        if (digits[i - 1] > '0'):
            count[i] = count[i - 1]
        # If the second-to-last digit is 1, or it is 2
        # and the last digit is smaller than 7, then the
        # last two digits form a valid character
        if (digits[i - 2] == '1' or
                (digits[i - 2] == '2' and
                 digits[i - 1] < '7')):
            count[i] += count[i - 2]
    return count[n]
The solution above runs in O(n) time and uses the same approach as computing Fibonacci numbers: each entry depends only on the two entries before it.
source: https://www.geeksforgeeks.org/count-possible-decodings-given-digit-sequence/
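Because each table entry depends only on the previous two, the table itself can be dropped, exactly as in the usual iterative Fibonacci computation. The sketch below is a variation, not the code from the linked source: the function name is hypothetical, and the early return for an empty string or a leading '0' is an added assumption.

```python
def count_decodings_constant_space(digits):
    """Same recurrence as the table version, keeping only two counts."""
    if not digits or digits[0] == "0":
        return 0  # assumption: empty or leading-zero input has no decoding
    prev2, prev1 = 1, 1  # counts for the prefixes of length i-2 and i-1
    for i in range(2, len(digits) + 1):
        current = 0
        # last digit decodes on its own when it is non-zero
        if digits[i - 1] > "0":
            current = prev1
        # last two digits decode together when they read 10..26
        if digits[i - 2] == "1" or (digits[i - 2] == "2" and digits[i - 1] < "7"):
            current += prev2
        prev2, prev1 = prev1, current
    return prev1
```

For a string of all ones the counts follow the Fibonacci sequence directly, since every digit can stand alone or pair with its neighbour.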

This seemed like a natural for recursion. Since I was bored, and the first answer didn't use recursion and didn't return the actual decodings, I thought there was room for improvement. For what it's worth...
def encodings(str, prefix = ''):
    encs = []
    if len(str) > 0:
        es = encodings(str[1:], (prefix + ',' if prefix else '') + str[0])
        encs.extend(es)
    if len(str) > 1 and int(str[0:2]) <= 26:
        es = encodings(str[2:], (prefix + ',' if prefix else '') + str[0:2])
        encs.extend(es)
    return encs if len(str) else [prefix]
This returns a list of the possible decodings. To get the count, you just take the length of the list. Here is a sample run:
encs = encodings("123")
print("{} {}".format(len(encs), encs))
with result:
3 ['1,2,3', '1,23', '12,3']
Another sample run:
encs = encodings("123123")
print("{} {}".format(len(encs), encs))
with result:
9 ['1,2,3,1,2,3', '1,2,3,1,23', '1,2,3,12,3', '1,23,1,2,3', '1,23,1,23', '1,23,12,3', '12,3,1,2,3', '12,3,1,23', '12,3,12,3']

Converting C# to Python base64 encoding

I am trying to convert a function from C# to python.
My C# code:
static string Base64Encode(string plainText)
{
    char[] arr = plainText.ToCharArray();
    List<byte> code16 = new List<byte>();
    int i = 1;
    string note = "";
    foreach (char row in arr)
    {
        if (i == 1)
        {
            note += "0x" + row;
        }
        else if (i == 2)
        {
            note += row;
            code16.Add(Convert.ToByte(note, 16));
            note = "";
            i = 0;
        }
        i++;
    }
    return System.Convert.ToBase64String(code16.ToArray());
}
My Python code:
import base64

def Base64Ecode(plainText):
    code16 = []
    i = 1
    note = ''
    for row in plainText:
        if i == 1:
            note += '0x' + row
        elif i == 2:
            note += row
            code16.append(int(note, 16))
            note = ''
            i = 0
        i += 1
    test = ''
    for blah in code16:
        test += chr(blah)
    print(base64.b64encode(test.encode()))
The code16 values are the same in both versions, but I run into a problem when I base64-encode the data.
C# takes a byte array, while Python takes a string, and I get two different results.

string.encode() uses UTF-8 by default, which encodes every character above 0x7F as more than one byte, and that is not what you want here.
Use string.encode("latin1") to map each character from 00 to FF to a single byte.
That said, there is an easier method in python to convert a Hex-String to a bytearray (or bytes object):
base64.b64encode(bytes.fromhex(plainText))
gives the same result as your function.
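A small sketch of the difference, using a hypothetical two-byte input ("ff00") chosen so that one byte lies above 0x7F:

```python
import base64

hex_text = "ff00"  # hypothetical input: the bytes 0xFF and 0x00

raw = bytes.fromhex(hex_text)            # b'\xff\x00'
as_text = "".join(chr(b) for b in raw)   # what the question's loop builds

utf8_result = base64.b64encode(as_text.encode())            # UTF-8 default
latin1_result = base64.b64encode(as_text.encode("latin1"))  # one byte per char
direct_result = base64.b64encode(raw)                       # no string detour

print(utf8_result, latin1_result, direct_result)
```

The UTF-8 path turns the character U+00FF into two bytes before encoding, so its base64 output differs from the other two, which agree with each other.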

I just need to turn solution of JavaScript problem into Python code using the same approach

I'm stuck on turning this JS anagram solution into a Python solution using the same approach.
Here is the problem:
Here is the JavaScript solution:
function validAnagram(first, second) {
    if (first.length !== second.length) {
        return false;
    }
    const lookup = {};
    for (let i = 0; i < first.length; i++) {
        let letter = first[i];
        // if letter exists, increment, otherwise set to 1
        lookup[letter] ? (lookup[letter] += 1) : (lookup[letter] = 1);
    }
    for (let i = 0; i < second.length; i++) {
        let letter = second[i];
        // can't find letter or letter is zero then it's not an anagram
        if (!lookup[letter]) {
            return false;
        } else {
            lookup[letter] -= 1;
        }
    }
    return true;
}
console.log(validAnagram('anagram', 'nagaram'));
And here is my Python code using the same approach:
def valid_anagram(first, second):
    if len(first) != len(second):
        return False
    lookup = {}
    for char in first:
        letter = first[char]
        if lookup[letter]:
            lookup[letter] += 1
        else:
            lookup[letter] = 1
    for char in second:
        letter = second[char]
        if not lookup[letter]:
            return False
        else:
            lookup[letter] -= 1
    return True
print(valid_anagram("anagram", "nagaram"))
This is the error I'm getting when I run my Python solution:
letter = first[char] TypeError: string indices must be integers

Here's the same solution, using a dict (here a Counter, which is a dict subclass) to count letters like your JavaScript code:
from collections import Counter
def valid_anagram( str1, str2 ) :
    return Counter(str1) == Counter(str2)
testing:
>>> valid_anagram('anagram', 'nagaram')
True
>>>
As I wrote below, and repeat here, the whole point of using Python is to avoid reinventing the wheel and to use existing libraries that make the code compact, fast and easy to understand.
Take for example your code:
for char in first:
    letter = first[char]
    if lookup[letter]:
        lookup[letter] += 1
    else:
        lookup[letter] = 1
This can be rewritten as:
lookup = dict()
for letter in first:
    if letter in lookup:
        lookup[letter] += 1
    else:
        lookup[letter] = 1
Or, better yet:
lookup = Counter()
for letter in first:
    lookup[letter] += 1
Or, even better:
lookup = Counter( first )
So why waste time and space....

You are indexing the string with a character instead of an integer. Iterating over a string yields its characters, not their positions:
first = "hello"
for char in first:
    print(char)
Output:
h
e
l
l
o
To get the index use this:
for char in range(len(first)):
    print(char)
Output:
0
1
2
3
4
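Applied to the function in the question, the fix is to use the loop variable as the letter itself. The sketch below keeps the lookup-table approach of the JavaScript version; using `dict.get` with a default is my own choice here, to avoid the KeyError that a plain dict raises for a missing letter.

```python
def valid_anagram(first, second):
    if len(first) != len(second):
        return False
    lookup = {}
    for letter in first:
        # if the letter exists, increment, otherwise start at 1
        lookup[letter] = lookup.get(letter, 0) + 1
    for letter in second:
        # a missing letter or a count of zero means it is not an anagram
        if not lookup.get(letter):
            return False
        lookup[letter] -= 1
    return True


print(valid_anagram("anagram", "nagaram"))
```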
Here is a simpler solution
def valid_anagram(str1, str2):
    list_str1 = list(str1)
    list_str1.sort()
    list_str2 = list(str2)
    list_str2.sort()
    return (list_str1 == list_str2)

Without using built-in functions, a function should reverse a string without changing the '$' position

I need a Python function which gives reversed string with the following conditions.
$ position should not change in the reversed string.
Should not use Python built-in functions.
Function should be an efficient one.
Example : 'pytho$n'
Result : 'nohty$p'
I have already tried with this code:
list = "$asdasdas"
list1 = []
position = ''
for index, i in enumerate(list):
    if i == '$':
        position = index
    elif i != '$':
        list1.append(i)
reverse = []
for index, j in enumerate( list1[::-1] ):
    if index == position:
        reverse.append( '$' )
    reverse.append(j)
print reverse
Thanks in advance.

Recognise that it's a variation on the partitioning step of the Quicksort algorithm, using two pointers (array indices) thus:
data = list("foo$barbaz$$")
i, j = 0, len(data) - 1
while i < j:
    while i < j and data[i] == "$": i += 1
    while i < j and data[j] == "$": j -= 1
    data[i], data[j] = data[j], data[i]
    i, j = i + 1, j - 1
"".join(data)
'zab$raboof$$'
P.S. it's a travesty to write this in Python!
A Pythonic solution could look like this:
def merge(template, data):
    for c in template:
        yield c if c == "$" else next(data)
data = "foo$barbaz$$"
"".join(merge(data, reversed([c for c in data if c != "$"])))
'zab$raboof$$'
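For reuse, the two-pointer idea can be wrapped in a function. This is a sketch with a hypothetical name and a hypothetical `fixed` parameter for the character that must stay in place. It still relies on `list` and `str.join`, so whether it satisfies the "no built-in functions" condition depends on how strictly that is read.

```python
def reverse_keep_fixed(text, fixed="$"):
    """Reverse text in place, leaving every `fixed` character where it is."""
    chars = list(text)
    i, j = 0, len(chars) - 1
    while i < j:
        if chars[i] == fixed:
            i += 1          # skip a fixed character on the left
        elif chars[j] == fixed:
            j -= 1          # skip a fixed character on the right
        else:
            chars[i], chars[j] = chars[j], chars[i]
            i += 1
            j -= 1
    return "".join(chars)


print(reverse_keep_fixed("pytho$n"))
```

Each character is visited at most once, so the function runs in linear time with one extra list.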

Wrote this without using any inbuilt functions. Hope it fulfils your criteria -
string = "zytho$n"
def reverse(string):
    string_new = string[::-1]
    i = 0
    position = 0
    position_new = 0
    # find the position of '$' in the original string
    for char in string:
        if char=="$":
            position = i
            break
        else:
            i = i + 1
    j = 0
    # find the position of '$' in the reversed string
    for char in string_new:
        if char=="$":
            position_new = j
            break
        else:
            j = j + 1
    final_string = string_new[:position_new]+string_new[position_new+1:position+1]+"$"+string_new[position+1:]
    return(final_string)
string_new = reverse(string)
print(string_new)
The output of this is-
nohty$z
To explain the code: first I used [::-1], which walks the string from the last position to the first and so reverses it. Then I found the position of the $ in both the original and the reversed string. I took for granted that only one $ is present, so each search stops at the first match. Finally I stitched the string back together from four parts: the part of the reversed string up to the $ sign, the part of the reversed string from just after the $ sign to the position of the $ sign in the original string, the $ sign itself, and then the rest of the reversed string.

How to improve python dict performance?

I recently coded a Python solution using dictionaries, which got a TLE verdict. The solution is essentially the same as a multiset solution in C++ that passes. So we are sure the logic is correct, but the implementation is not up to the mark.
The problem description for understanding below code (http://codeforces.com/contest/714/problem/C):
For each number we need to get a string of 0s and 1s such that i'th digit is 0/1 if respective ith digit in number is even/odd.
We need to maintain the count of number that have the same mapping that is given by above described point.
Any hints/pointer to improve the performance of below code? It gave TLE (Time Limit Exceeded) for a large test case(http://codeforces.com/contest/714/submission/20594344).
from collections import defaultdict
def getPattern(s):
    return ''.join(list(s.zfill(19)))
def getSPattern(s):
    news = s.zfill(19)
    patlist = [ '0' if (int(news[i])%2 == 0) else '1' for i in range(19) ]
    return "".join(patlist)
t = int(raw_input())
pat = defaultdict(str) # holds strings as keys and int as value
for i in range(0, t):
    oper, num = raw_input().strip().split(' ')
    if oper == '+' :
        pattern = getSPattern(str(num))
        if pattern in pat:
            pat[pattern] += 1
        else:
            pat[pattern] = 1
    elif oper == '-' :
        pattern = getSPattern(str(num))
        pat[pattern] = max( pat[pattern] - 1, 0)
    elif oper == '?' :
        print pat.get(getPattern(num) , 0 )

I see lots of small problems with your code but can't say if they add up to significant performance issues:
You've set up, and used, your defaultdict() incorrectly:
pat = defaultdict(str)
...
if pattern in pat:
    pat[pattern] += 1
else:
    pat[pattern] = 1
The argument to the defaultdict() constructor should be the type of the values, not the keys. Once you've set up your defaultdict properly, you can simply do:
pat = defaultdict(int)
...
pat[pattern] += 1
As the value will now default to zero if the pattern isn't there already.
Since the specification says:
- ai — delete a single occurrence of non-negative integer ai from the multiset. It's guaranteed, that there is at least one ai in the multiset.
Then this:
pat[pattern] = max( pat[pattern] - 1, 0)
can simply be this:
pat[pattern] -= 1
You're working with 19 character strings but since the specification says the numbers will be less than 10 ** 18, you can work with 18 character strings instead.
getSPattern() does a zfill() and then processes the string, it should do it in the reverse order, process the string and then zfill() it, as there's no need to run the logic on the leading zeros.
We don't need the overhead of int() to convert the characters to numbers:
(int(news[i])%2 == 0)
Consider using ord() instead as the ASCII values of the digits have the same parity as the digits themselves: ord('4') -> 52
And you don't need to loop over the indexes, you can simply loop over the characters.
Below is my rework of your code with the above changes, see if it still works (!) and gains you any performance:
from collections import defaultdict
def getPattern(string):
    return string.zfill(18)
def getSPattern(string):
    # pattern_list = (('0', '1')[ord(character) % 2] for character in string)
    pattern_list = ('0' if ord(character) % 2 == 0 else '1' for character in string)
    return ("".join(pattern_list)).zfill(18)
patterns = defaultdict(int) # holds strings as keys and ints as values
text = int(raw_input())
for _ in range(text):
    operation, number = raw_input().strip().split()
    if operation == '+':
        pattern = getSPattern(number)
        patterns[pattern] += 1
    elif operation == '-':
        pattern = getSPattern(number)
        patterns[pattern] -= 1
    elif operation == '?':
        print patterns.get(getPattern(number), 0)
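To check the logic without feeding standard input, the same bookkeeping can be sketched in Python 3 as a function over a list of operations. The names `parity_pattern` and `run`, the `(operation, number)` tuple format, and the `width` parameter are assumptions for illustration, and the sketch says nothing about whether the result fits the time limit.

```python
from collections import defaultdict


def parity_pattern(number, width=18):
    """Map each digit to '0' (even) or '1' (odd), then left-pad with zeros."""
    # ord() keeps the parity of the digit because ord('0') is 48, an even number
    bits = "".join("1" if ord(ch) % 2 else "0" for ch in number)
    return bits.zfill(width)


def run(operations, width=18):
    """Apply (operation, number) pairs and collect the answers to '?' queries."""
    counts = defaultdict(int)
    answers = []
    for operation, number in operations:
        if operation == "+":
            counts[parity_pattern(number, width)] += 1
        elif operation == "-":
            counts[parity_pattern(number, width)] -= 1
        elif operation == "?":
            # the query is already a 0/1 pattern, so it only needs padding
            answers.append(counts.get(number.zfill(width), 0))
    return answers


print(run([("+", "1"), ("+", "241"), ("?", "1"),
           ("+", "361"), ("-", "241"), ("?", "0101")]))
```

Using `counts.get` for queries avoids inserting a new zero entry into the defaultdict for every pattern that is only ever queried.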

With the explanation already given by @cdlane, I just need to add my rewrite of getSPattern, where I think the bulk of the time is spent. As per my initial comment, this is available at https://eval.in/641639
def getSPattern(s):
    patlist = ['0' if c in ['0', '2', '4', '6', '8'] else '1' for c in s]
    return "".join(patlist).zfill(19)
Using zfill(18) might marginally spare you some time.
