Efficient solution for adding Binary Strings - python

Problem: Given two binary strings, return their sum (also a binary string).
For example, add_binary_strings('11', '1') should return '100'.
Implementation 1:
def addBinary(a, b):
    """
    :type a: str
    :type b: str
    :rtype: str
    """
    a = a[::-1]
    b = b[::-1]
    carry = '0'
    result = ''
    # Pad the strings to make their size equal
    if len(a) < len(b):
        for i in range(len(a), len(b)):
            a += '0'
    elif len(a) > len(b):
        for i in range(len(b), len(a)):
            b += '0'
    n = len(a)
    carry = 0
    s = ''
    for i in range(n):
        l, m, c = int(a[i]), int(b[i]), carry
        s += str(l^m^c) # sum is XOR of three bits
        carry = (l&m) | (m&c) | (c&l) # carry is pairwise AND of three bits
    if carry == 1:
        s += str(carry)
    return s[::-1]
Implementation 2:
def addBinary(self, a, b):
    """
    :type a: str
    :type b: str
    :rtype: str
    """
    a = a[::-1]
    b = b[::-1]
    m = min(len(a), len(b))
    carry = '0'
    result = ''
    for i in range(m):
        r, carry = add_digit(a[i], b[i], carry=carry)
        result += r
    larger, shorter = a, b
    if len(a) < len(b):
        larger, shorter = b, a
    for i in range(len(shorter), len(larger)):
        if carry != '0':
            r, carry = add_digit(larger[i], carry)
            result += r
        else:
            result += larger[i]
    if carry != '0':
        result += carry
    return result[::-1]

def add_digit(digit1, digit2, carry=None):
    if carry is None:
        carry = '0'
    d1, d2, c = int(digit1), int(digit2), int(carry)
    s = d1 + d2 + c
    return str(s%2), str(s//2)
According to an online judge, the first implementation is faster. However, I find it a bit too verbose, because I always have to pad both strings to the same size.
What is the time complexity of creating a new string of length n? I would like to know the space and time complexities of these implementations and how I can improve the code.
What are the tradeoffs between the implementations, and when should I avoid one of them?
For example, I should prefer the 2nd implementation over the 1st when the input strings generally differ considerably in size.

If the problem is defined as:
Given two binary strings, return their sum (also a binary string).
For example, add_binary_strings('11', '1') should return '100'.
Then you just need to do:
def add_binary_strings(a, b):
    return '{:b}'.format(int(a,2) + int(b,2))

print(add_binary_strings('11', '1'))
This solution should be faster than the one you found.

I would modify your solution to:
def addBinary(a, b):
    max_len = max(len(a), len(b))
    a = a.zfill(max_len)
    b = b.zfill(max_len)
    c = ''
    reminder = 0
    for i in range(max_len-1, -1, -1):
        c = str((int(a[i]) + int(b[i]) + reminder) % 2) + c
        reminder = 1 if int(a[i]) + int(b[i])+ reminder > 1 else 0
    c = str(reminder) + c
    return c
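This saves the string reversals used in your solutions, which should save a bit of time. Note that the last assignment to c always prepends the final carry, so the result has a leading 0 when there is no carry out of the top bit. You can omit that assignment if you do not want the output to be longer than the inputs, though the sum may then overflow!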
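As a middle ground between the two implementations in the question, here is a minimal sketch that avoids both the explicit padding and the repeated string concatenation. It assumes plain binary strings with no sign or prefix; the name add_binary_zip is made up for this illustration. itertools.zip_longest supplies '0' for the shorter input, and the digits are collected in a list and joined once at the end.

```python
from itertools import zip_longest


def add_binary_zip(a, b):
    """Add two binary strings digit by digit, least significant digit first."""
    digits = []
    carry = 0
    # Walk both strings from the right; the shorter one is padded with '0'.
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue='0'):
        total = int(x) + int(y) + carry
        digits.append(str(total % 2))
        carry = total // 2
    if carry:
        digits.append('1')
    # Digits were produced in reverse order; fall back to '0' for empty inputs.
    return ''.join(reversed(digits)) or '0'


print(add_binary_zip('11', '1'))  # 100
```

Joining once at the end keeps the work proportional to the length of the longer input, whereas building the result with repeated concatenation may copy the partial string on every step.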

Related

Algorithm not passing tests even when I get correct results on my end

The question is mostly about base conversion. Here's the question.
1. Start with a random minion ID n, which is a nonnegative integer of length k in base b
2. Define x and y as integers of length k. x has the digits of n in descending order, and y has the digits of n in ascending order
3. Define z = x - y. Add leading zeros to z to maintain length k if necessary
4. Assign n = z to get the next minion ID, and go back to step 2
For example, given minion ID n = 1211, k = 4, b = 10, then x = 2111, y = 1112 and z = 2111 - 1112 = 0999. Then the next minion ID will be n = 0999 and the algorithm iterates again: x = 9990, y = 0999 and z = 9990 - 0999 = 8991, and so on.
Depending on the values of n, k (derived from n), and b, at some point the algorithm reaches a cycle, such as by reaching a constant value. For example, starting with n = 210022, k = 6, b = 3, the algorithm will reach the cycle of values [210111, 122221, 102212] and it will stay in this cycle no matter how many times it continues iterating. Starting with n = 1211, the routine will reach the integer 6174, and since 7641 - 1467 is 6174, it will stay as that value no matter how many times it iterates.
Given a minion ID as a string n representing a nonnegative integer of length k in base b, where 2 <= k <= 9 and 2 <= b <= 10, write a function solution(n, b) which returns the length of the ending cycle of the algorithm above starting with n. For instance, in the example above, solution(210022, 3) would return 3, since iterating on 102212 would return to 210111 when done in base 3. If the algorithm reaches a constant, such as 0, then the length is 1.
Here's my code
def solution(n, b): #n(num): str, b(base): int
    #Your code here
    num = n
    k = len(n)
    resList = []
    resIdx = 0
    loopFlag = True
    while loopFlag:
        numX = "".join(x for x in sorted(num, reverse=True))
        numY = "".join(y for y in sorted(num))
        xBaseTen, yBaseTen = getBaseTen(numX, b), getBaseTen(numY, b)
        xMinusY = xBaseTen - yBaseTen
        num = getBaseB(xMinusY, b, k)
        resListLen = len(resList)
        for i in range(resListLen - 1, -1, -1):
            if resList[i] == num:
                loopFlag = False
                resIdx = resListLen - i
                break
        if loopFlag:
            resList.append(num)
        if num == 0:
            resIdx = 1
            break
    return resIdx

def getBaseTen(n, b): #n(number): str, b(base): int -> int
    nBaseTenRes = 0
    n = str(int(n)) # Shave prepending zeroes
    length = len(n) - 1
    for i in range(length + 1):
        nBaseTenRes += int(n[i]) * pow(b, length - i)
    return nBaseTenRes

def getBaseB(n, b, k): #(number): int, b(base): int, k:(len): int -> str
    res = ""
    r = 0 # Remainder
    nCopy = n
    while nCopy > 0:
        r = nCopy % b
        nCopy = floor(nCopy / b)
        res += str(r)
    res = res[::-1]
    resPrependZeroesLen = k - len(res)
    if resPrependZeroesLen > 0:
        for i in range(resPrependZeroesLen):
            res = "0" + res
    return res
The two tests that are available to me, ('1211', 10) and ('210022', 3), are not passing, even though I get the right answers for them (1 and 3) on my machine.
Why am I failing? Is the algorithm wrong? Am I hitting the time limit?
The problem arose from differences between the execution environments.
When I ran this on my machine under Python 3.7,
r = nCopy % b
gave me an int.
Foobar runs Python 2.7, where the same expression gives a float (most likely because floor, assuming it is math.floor, returns a float in Python 2, so nCopy is a float after the first pass through the loop).
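One way to sidestep the int/float difference is to use divmod (or //), which keeps integers as integers on both Python 2.7 and Python 3. The sketch below is not the original submission: cycle_length, next_id and to_base are names invented here, and it leans on the built-in int(text, base) for the conversion to an integer.

```python
def to_base(value, base, width):
    """Render a non-negative int in the given base, left-padded to width."""
    digits = []
    while value > 0:
        value, remainder = divmod(value, base)  # both results stay ints
        digits.append(str(remainder))
    return ''.join(reversed(digits)).zfill(width)


def next_id(n, b):
    """One step of the algorithm: z = x - y, padded back to len(n) digits."""
    x = int(''.join(sorted(n, reverse=True)), b)
    y = int(''.join(sorted(n)), b)
    return to_base(x - y, b, len(n))


def cycle_length(n, b):
    """Length of the cycle eventually reached when iterating next_id."""
    seen = {}  # minion ID -> step at which it first appeared
    step = 0
    while n not in seen:
        seen[n] = step
        n = next_id(n, b)
        step += 1
    return step - seen[n]


print(cycle_length('1211', 10))   # 1
print(cycle_length('210022', 3))  # 3
```

Storing the step at which each ID first appears also replaces the backwards scan over the result list with a dictionary lookup.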

Fibonacci series in bit string

I am working on Fibonacci series but in bit string which can be represented as:
f(0)=0;
f(1)=1;
f(2)=10;
f(3)=101;
f(4)=10110;
f(5)=10110101;
Secondly, I have a pattern, for example '10', and I want to count how many times it occurs in a particular string of the series. For example, the Fibonacci string for n=5 is '10110101', so '10' occurs 3 times.
My code runs correctly without errors, but it cannot handle values of n above 45, and I want to run it for n=100.
Can anyone help? I only need the count of occurrences.
n=5
fibonacci_numbers = ['0', '1']
for i in range(1,n):
    fibonacci_numbers.append(fibonacci_numbers[i]+fibonacci_numbers[i-1])
    #print(fibonacci_numbers[-1])
print(fibonacci_numbers[-1])
nStr = str (fibonacci_numbers[-1])
pattern = '10'
count = 0
flag = True
start = 0
while flag:
    a = nStr.find(pattern, start)
    if a == -1:
        flag = False
    else:
        count += 1
        start = a + 1
print(count)
This is a fun one! The trick is that you don't actually need that giant bit string, just the number of 10s it contains and the edges. This solution runs in O(n) time and O(1) space.
from typing import NamedTuple

class FibString(NamedTuple):
    """First digit, last digit, and the number of 10s in between."""
    first: int
    tens: int
    last: int

def count_fib_string_tens(n: int) -> int:
    """Count the number of 10s in a n-'Fibonacci bitstring'."""

    def combine(b: FibString, a: FibString) -> FibString:
        """Combine two FibStrings."""
        tens = b.tens + a.tens
        # mind the edges!
        if b.last == 1 and a.first == 0:
            tens += 1
        return FibString(b.first, tens, a.last)

    # First two values are 0 and 1 (tens=0 for both)
    a, b = FibString(0, 0, 0), FibString(1, 0, 1)
    for _ in range(1, n):
        a, b = b, combine(b, a)
    return b.tens  # tada!
I tested this against your original implementation and sure enough it produces the same answers for all values that the original function is able to calculate (but it's about eight orders of magnitude faster by the time you get up to n=40). The answer for n=100 is 218922995834555169026 and it took 0.1ms to calculate using this method.
The nice thing about the Fibonacci sequence that will solve your issue is that you only need the last two values of the sequence. 10110 is made by combining 101 and 10. After that 10 is no longer needed. So instead of appending, you can just keep the two values. Here is what I've done:
n=45
fibonacci_numbers = ['0', '1']
for i in range(1,n):
    temp = fibonacci_numbers[1]
    fibonacci_numbers[1] = fibonacci_numbers[1] + fibonacci_numbers[0]
    fibonacci_numbers[0] = temp
Note that it still uses a decent amount of memory, but it didn't give me a memory error (it does take a bit of time to run though).
I also wasn't able to print the full string as I got an OSError [Errno 5] Input/Output error but it can still count and print that output.
For larger numbers, storing as a string is going to quickly cause a memory issue. In that case, I'd suggest doing the fibonacci sequence with plain integers and then converting to bits. See here for tips on binary conversion.
While regular Fibonacci addition doesn't work directly, consider that 10 is 2 and 101 is 5. 5 + 2 doesn't give the right result: you want 10110, i.e. the OR operation 10100 | 10, which yields 22. So if you shift one value by the bit length of the other, you get the result. See for example
x = 5
y = 2
(x << 2) | y
>> 22
Shifting x by the number of bits representing y and then doing a bitwise or with | solves the issue. Python summarizes these bitwise operations well here. All that's left for you to do is determine how many bits to shift and implement this into your for loop!
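A minimal sketch of that idea, assuming the strings are built as in the question (f(n) is f(n-1) followed by f(n-2)). The length of each string is tracked separately, because f(0) = '0' occupies one character even though the integer 0 has no set bits. fib_bits and fib_string are names chosen here for illustration.

```python
def fib_bits(n):
    """Return (value, length) of the n-th Fibonacci bit string as an int."""
    if n == 0:
        return 0, 1
    prev_val, prev_len = 0, 1  # f(0) = '0'
    cur_val, cur_len = 1, 1    # f(1) = '1'
    for _ in range(1, n):
        # Concatenation: shift the newer string left by the older one's length.
        new_val = (cur_val << prev_len) | prev_val
        new_len = cur_len + prev_len
        prev_val, prev_len = cur_val, cur_len
        cur_val, cur_len = new_val, new_len
    return cur_val, cur_len


def fib_string(n):
    """Render the n-th Fibonacci bit string, keeping leading zeros."""
    value, length = fib_bits(n)
    return format(value, '0{}b'.format(length))


print(fib_bits(4))    # (22, 5)
print(fib_string(5))  # 10110101
```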
For really large n you will still run into a memory issue (the plot that illustrated this is not available).
Finally I got the answer, but can someone briefly explain why it works?
def count(p, n):
    count = 0
    i = n.find(p)
    while i != -1:
        n = n[i + 1:]
        i = n.find(p)
        count += 1
    return count

def occurence(p, n):
    a1 = "1"
    a0 = "0"
    lp = len(p)
    i = 1
    if n <= 5:
        return count(p, atring(n))
    while lp > len(a1):
        temp = a1
        a1 += a0
        a0 = temp
        i += 1
        if i >= n:
            return count(p, a1)
    fn = a1[:lp - 1]
    if -lp + 1 < 0:
        ln = a1[-lp + 1:]
    else:
        ln = ""
    countn = count(p, a1)
    a1 = a1 + a0
    i += 1
    if -lp + 1 < 0:
        lnp1 = a1[-lp + 1:]
    else:
        lnp1 = ""
    k = 0
    countn1 = count(p, a1)
    for j in range(i + 1, n + 1):
        temp = countn1
        countn1 += countn
        countn = temp
        if k % 2 == 0:
            string = lnp1 + fn
        else:
            string = ln + fn
        k += 1
        countn1 += count(p, string)
    return countn1

def atring(n):
    a0 = "0"
    a1 = "1"
    if n == 0 or n == 1:
        return str(n)
    for i in range(2, n + 1):
        temp = a1
        a1 += a0
        a0 = temp
    return a1

def fn():
    a = 100
    p = '10'
    print( occurence(p, a))

if __name__ == "__main__":
    fn()
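The reason this kind of approach works: f(n) is f(n-1) followed by f(n-2), so every occurrence of the pattern lies entirely inside f(n-1), entirely inside f(n-2), or straddles the join. A straddling occurrence can only touch the last len(p)-1 characters of f(n-1) and the first len(p)-1 characters of f(n-2), so the full strings are never needed. The following is a simplified sketch of that recurrence rather than a line-by-line rewrite of the code above; it assumes a non-empty pattern, and the names count_overlapping and count_in_fib_string are made up here.

```python
def count_overlapping(text, pattern):
    """Count occurrences of pattern in text, overlaps included."""
    count = start = 0
    while True:
        idx = text.find(pattern, start)
        if idx == -1:
            return count
        count += 1
        start = idx + 1


def count_in_fib_string(pattern, n):
    """Count pattern in f(n) while keeping only a count, a prefix and a suffix."""
    m = len(pattern) - 1  # characters that can take part in a straddling match

    def summarize(s):
        return count_overlapping(s, pattern), s[:m], s[max(0, len(s) - m):]

    older, newer = summarize('0'), summarize('1')
    if n == 0:
        return older[0]
    for _ in range(1, n):
        # Matches across the join of f(k-1) + f(k-2); each side is shorter
        # than the pattern, so every match found here really straddles it.
        straddling = count_overlapping(newer[2] + older[1], pattern)
        total = newer[0] + older[0] + straddling
        prefix = (newer[1] + older[1])[:m]
        tail = newer[2] + older[2]
        suffix = tail[max(0, len(tail) - m):]
        older, newer = newer, (total, prefix, suffix)
    return newer[0]


print(count_in_fib_string('10', 5))    # 3
print(count_in_fib_string('10', 100))  # 218922995834555169026
```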

Ordered ranking for a list of permutations

I am trying to develop a method for finding the ordered rank of a particular sequence in the following lists.
a = list(sorted(itertools.combinations(range(0,5),3)))
b = list(sorted(itertools.permutations(range(0,5),3)))
a represents a list of elements of a combinatorial number system, so the formula for rank is pretty straightforward.
What I need are two functions, magic_rank_1 and magic_rank_2, which have the following definitions
def magic_rank_1(perm_item,permutation_list_object):
    ## permutation_list_object is actually b
    return permutation_list_object.index(perm_item)
def magic_rank_2(perm_item,permutation_list_object,combination_list_object):
    ## permutation_list_object is actually b and combination_list_object is actually a
    return combination_list_object.index(tuple(sorted(perm_item)))
So basically magic_rank_2((0,1,2),b,a) = magic_rank_2((2,0,1),b,a)
Sounds easy, but I have a few restrictions:
- I can't use the index function, because I cannot afford to search lists of more than 100000000 items for every item.
- I need magic_rank_1 and magic_rank_2 to be purely mathematical, without using any sort, comparison, or search function. All the information I will have is the tuple whose rank needs to be identified and the last letter of my alphabet (in this case that will be 5).
- magic_rank_2 need not be a number between 0 and k-1, where k = len(a), as long as it is a unique number between 0 and 2^(ceiling((max_alphabet/2)+1)).
I know magic_rank_1 can be calculated by something similar to this, but there is a difference: there, every letter of the input alphabet is used, whereas in my case it is a subset.
Lastly, yes, this is supposed to be a substitute for a hashing function. I am currently using a hashing function, but I am not taking advantage of the fact that magic_rank_2((0,1,2),b,a) = magic_rank_2((2,0,1),b,a). If I can, it will reduce my storage requirements significantly: my sequences actually have length 5, so a method for magic_rank_2 would cut my storage requirement to 1% of the current one.
UPDATE
- For magic_rank_2 there should be no comparison operation between elements of the tuple, i.e. no sorting, min, max, etc.; that only makes the algorithm less efficient than regular hashing.
The following two functions will rank a combination and permutation, given a word and an alphabet (or in your case, a tuple and a list).
import itertools
import math

def rank_comb(word, alph, depth=0):
    if not word: return 0
    if depth == 0:
        word = list(word)
        alph = sorted(alph)
    pos = 0
    for (i,c) in enumerate(alph):
        if c == word[0]:
            # Recurse
            new_word = [x for x in word if x != c]
            new_alph = [x for x in alph if x > c]
            return pos + rank_comb(new_word, new_alph, depth+1)
        else:
            num = math.factorial(len(alph)-i-1)
            den = math.factorial(len(alph)-i-len(word)) * math.factorial(len(word)-1)
            pos += num // den

def rank_perm(word, alph, depth=0):
    if not word: return 0
    if depth == 0:
        word = list(word)
        alph = sorted(alph)
    pos = 0
    for c in alph:
        if c == word[0]:
            # Recurse
            new_word = [x for x in word if x != c]
            new_alph = [x for x in alph if x != c]
            return pos + rank_perm(new_word, new_alph, depth+1)
        else:
            num = math.factorial(len(alph)-1)
            den = math.factorial(len(alph)-len(word))
            pos += num // den

#== Validation =====================================================================
# Params
def get_alph(): return range(8)
r = 6
a = list(sorted(itertools.combinations(get_alph(), r)))
b = list(sorted(itertools.permutations(get_alph(), r)))
# Tests
PASS_COMB = True
PASS_PERM = True
for (i,x) in enumerate(a):
    j = rank_comb(x, get_alph())
    if i != j:
        PASS_COMB = False
        print("rank_comb() FAIL:", i, j)
for (i,x) in enumerate(b):
    j = rank_perm(x, get_alph())
    if i != j:
        PASS_PERM = False
        print("rank_perm() FAIL:", i, j)
print("rank_comb():", "PASS" if PASS_COMB else "FAIL")
print("rank_perm():", "PASS" if PASS_PERM else "FAIL")
The functions are similar, but there are a few differences:
- new_alph is filtered differently.
- The calculations of num and den are different.
Update:
rank_comb2 doesn't require sorting the input word (a 3-tuple):
import itertools
import math

def rank_comb2(word, alph, depth=0):
    if not word: return 0
    if depth == 0:
        word = list(word)
        alph = sorted(alph)
    pos = 0
    for (i,c) in enumerate(alph):
        if c == min(word):
            # Recurse
            new_word = [x for x in word if x != c]
            new_alph = [x for x in alph if x > c]
            return pos + rank_comb2(new_word, new_alph, depth+1)
        else:
            num = math.factorial(len(alph)-i-1)
            den = math.factorial(len(alph)-i-len(word)) * math.factorial(len(word)-1)
            pos += num // den

r1 = rank_comb2([2,4,1], range(5))
r2 = rank_comb2([1,4,2], range(5))
r3 = rank_comb2([4,1,2], range(5))
print(r1, r2, r3) # 7 7 7
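For comparison, the combination rank can also be written directly with binomial coefficients instead of recursion. This is a sketch under a few assumptions: the alphabet is range(n), math.comb is available (Python 3.8+), and the input is sorted internally, so it does not meet the "no comparison" restriction from the update. rank_comb_closed is a hypothetical name.

```python
import itertools
import math


def rank_comb_closed(word, n):
    """Lexicographic rank of a k-subset of range(n) among all k-subsets."""
    items = sorted(word)
    k = len(items)
    rank = 0
    prev = -1
    for pos, c in enumerate(items):
        # Every smaller choice at this position fixes one element and leaves
        # k - pos - 1 elements to pick from the values above it.
        for smaller in range(prev + 1, c):
            rank += math.comb(n - smaller - 1, k - pos - 1)
        prev = c
    return rank


# Cross-check against the enumeration order used in the question.
combos = list(itertools.combinations(range(5), 3))
assert all(rank_comb_closed(c, 5) == i for i, c in enumerate(combos))
print(rank_comb_closed((2, 4, 1), 5))  # 7
```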

Evenly intermix two lists of elements (load balance)

Let's say I have two strings made with only 1 character:
'aaaaaaa'
'bbb'
I'd like to find an algorithm to produce a combined string of:
'aabaabaaba'
The two are merged so that there is the fewest # of consecutive characters from either list (in this case that # is 2). The length of each string is arbitrary, and I'd like for it to be symmetrical. Bonus points for extending it to more than just 2 strings.
I am doing this in python, but the language doesn't matter. This is for a load balancing problem I'm working on.
You can take the elements alternately and insert an extra letter from the longer string where necessary. You can determine whether an additional letter is due with integer arithmetic: a fraction tells you how many letters come between each letter pair. You accumulate this fraction and take letters from the longer string as long as the accumulated fraction is larger than ½:
# Python 2 code: print is a statement and denom / 2 is integer division
def intertwine(a, b):
    """ Return a combination of string with fewest number of
        consecutive elements from one string
    """
    if len(b) > len(a):
        return intertwine(b, a)
    if not b:
        return a
    a = list(a)
    b = list(b)
    num = len(a) - len(b)
    denom = len(b)
    acc = 0
    res = []
    while a or b:
        acc += num
        while acc >= denom / 2:
            if a: res += a.pop(0)
            acc -= num
        if a: res += a.pop(0)
        if b: res += b.pop(0)
    return "".join(res)

print intertwine("aaabaaa", "bbb") # "aababbaaba"
print intertwine("aaaaaaa", "b") # "aaabaaaa"
print intertwine("aaaaaa", "b") # "aaabaaa"
print intertwine("aa", "bbbbbb") # "bbabbabb"
print intertwine("", "bbbbbb") # "bbbbbb"
print intertwine("", "") # ""
import itertools

def intermix(*containers):
    mix = []
    for c in sorted(containers, key=lambda c: len(c)):
        if len(c) >= len(mix):
            bigger, smaller = c, mix
        else:
            bigger, smaller = mix, c
        ratio, remainder = divmod(len(bigger), len(smaller) + 1)
        chunk_sizes = (ratio + (1 if i < remainder else 0) for i in range(len(smaller) + 1))
        chunk_offsets = itertools.accumulate(chunk_sizes)
        off_start = 0
        new_mix = []
        for i, off in enumerate(chunk_offsets):
            new_mix.extend(bigger[off_start:off])
            if i == len(smaller):
                break
            new_mix.append(smaller[i])
            off_start = off
        mix = new_mix
    return mix
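To compare candidate outputs against the goal stated in the question (the fewest consecutive elements from one source), a small helper can measure the longest run. This is only a checking aid, not part of either answer; longest_run is a name chosen here, and it treats equal adjacent elements as coming from the same source.

```python
from itertools import groupby


def longest_run(seq):
    """Length of the longest block of equal adjacent elements (0 if empty)."""
    return max((sum(1 for _ in group) for _, group in groupby(seq)), default=0)


print(longest_run("aabaabaaba"))  # 2
print(longest_run("aaabaaaa"))    # 4
```

For seven a's and three b's, the b's split the a's into at most four blocks, so some block must hold at least two a's; 'aabaabaaba' reaches that bound.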

Check if a list is a rotation of another list that works with duplicates

I have this function for determining if a list is a rotation of another list:
def isRotation(a,b):
    if len(a) != len(b):
        return False
    c=b*2
    i=0
    while a[0] != c[i]:
        i+=1
    for x in a:
        if x!= c[i]:
            return False
        i+=1
    return True
e.g.
>>> a = [1,2,3]
>>> b = [2,3,1]
>>> isRotation(a, b)
True
How do I make this work with duplicates? e.g.
a = [3,1,2,3,4]
b = [3,4,3,1,2]
And can it be done in O(n) time?
The following meta-algorithm will solve it.
1. Build a concatenation of a with itself, e.g., a = [3,1,2,3,4] => aa = [3,1,2,3,4,3,1,2,3,4].
2. Run any list adaptation of a string-matching algorithm, e.g., Boyer-Moore, to find b in aa.
One particularly easy implementation, which I would try first, is to use Rabin-Karp as the underlying algorithm. In this, you would
- calculate the Rabin fingerprint for b
- calculate the Rabin fingerprint for aa[: len(b)], aa[1: len(b) + 1], ..., and compare the lists only when the fingerprints match
Note that
- The Rabin fingerprint for a sliding window can be calculated iteratively very efficiently (read about it in the Rabin-Karp link)
- If your list is of integers, you actually have a slightly easier time than for strings, as you don't need to think about what the numerical hash value of a letter is
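The steps above could be sketched roughly as follows. This is an illustrative rolling-hash version, not a reference implementation: the base and modulus are arbitrary choices, the elements are assumed to be hashable, and is_rotation_rk is a made-up name. Matching fingerprints are always confirmed with a real list comparison, so a hash collision can cost time but cannot produce a wrong answer.

```python
def is_rotation_rk(a, b, base=257, mod=(1 << 61) - 1):
    """Check whether b is a rotation of a using a Rabin-Karp style search."""
    n = len(a)
    if n != len(b):
        return False
    if n == 0:
        return True
    aa = a + a
    # Fingerprints of b and of the first window aa[0:n].
    hb = hw = 0
    for i in range(n):
        hb = (hb * base + hash(b[i])) % mod
        hw = (hw * base + hash(aa[i])) % mod
    high = pow(base, n - 1, mod)  # weight of the element leaving the window
    for start in range(n):
        if hw == hb and aa[start:start + n] == b:
            return True
        # Slide the window: drop aa[start], append aa[start + n].
        hw = ((hw - hash(aa[start]) * high) * base + hash(aa[start + n])) % mod
    return False


print(is_rotation_rk([3, 1, 2, 3, 4], [3, 4, 3, 1, 2]))  # True
print(is_rotation_rk([3, 1, 2, 3, 4], [3, 4, 3, 2, 1]))  # False
```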
You can do it in O(n) time and O(1) space using a modified version of a maximal suffixes algorithm:
From Jewels of Stringology:
Cyclic equality of words
A rotation of a word u of length n is any word of the form u[k + 1..n]u[1..k], denoted u^(k). Let u, w be two words of the same length n. They are said to be cyclic-equivalent if u^(i) = w^(j) for some i, j.
If words u and w are written as circles, they are cyclic-equivalent if the circles coincide after appropriate rotations.
There are several linear-time algorithms for testing the cyclic-equivalence of two words. The simplest one is to apply any string matching algorithm to pattern pat = u and text = ww because words u and w are cyclic-equivalent if pat occurs in text.
Another algorithm is to find maximal suffixes of uu and ww and check if they are identical on prefixes of size n. We have chosen this problem because there is a simpler, interesting algorithm, working in linear time and constant space simultaneously, which deserves presentation.
Algorithm Cyclic-Equivalence(u, w)
{ checks cyclic equality of u and w of common length n }
    x := uu; y := ww;
    i := 0; j := 0;
    while (i < n) and (j < n) do begin
        k := 1;
        while x[i + k] = y[j + k] do k := k + 1;
        if k > n then return true;
        if x[i + k] > y[j + k] then i := i + k else j := j + k;
        { invariant }
    end;
    return false;
Which translated to Python becomes:
def cyclic_equiv(u, v):
    n, i, j = len(u), 0, 0
    if n != len(v):
        return False
    while i < n and j < n:
        k = 1
        while k <= n and u[(i + k) % n] == v[(j + k) % n]:
            k += 1
        if k > n:
            return True
        if u[(i + k) % n] > v[(j + k) % n]:
            i += k
        else:
            j += k
    return False
Running a few examples:
In [4]: a = [3,1,2,3,4]
In [5]: b =[3,4,3,1,2]
In [6]: cyclic_equiv(a,b)
Out[6]: True
In [7]: b =[3,4,3,2,1]
In [8]: cyclic_equiv(a,b)
Out[8]: False
In [9]: b =[3,4,3,2]
In [10]: cyclic_equiv(a,b)
Out[10]: False
In [11]: cyclic_equiv([1,2,3],[1,2,3])
Out[11]: True
In [12]: cyclic_equiv([3,1,2],[1,2,3])
Out[12]: True
A more naive approach would be to use a collections.deque to rotate the elements:
def rot(l1,l2):
    from collections import deque
    if l1 == l2:
        return True
    # if length is different we cannot get a match
    if len(l2) != len(l1):
        return False
    # if any elements are different we cannot get a match
    if set(l1).difference(l2):
        return False
    l2,l1 = deque(l2),deque(l1)
    for i in range(len(l1)):
        l2.rotate() # same as l2.appendleft(l2.pop())
        if l1 == l2:
            return True
    return False
I think you could use something like this:
a1 = [3,4,5,1,2,4,2]
a2 = [4,5,1,2,4,2,3]
# Array a2 is rotation of array a1 if it's sublist of a1+a1
def is_rotation(a1, a2):
    if len(a1) != len(a2):
        return False
    double_array = a1 + a1
    return check_sublist(double_array, a2)

def check_sublist(a1, a2):
    if len(a1) < len(a2):
        return False
    # Compare a2 against every window of a1; a single running counter that
    # resets on mismatch would miss matches such as [1,2,1] in [1,1,2,1,1,2].
    n = len(a2)
    for i in range(len(a1) - n + 1):
        if a1[i:i + n] == a2:
            return True
    return False
Just common sense if we are talking about interview questions:
- Remember that the solution should be easy to code and to describe.
- Do not try to memorize a solution for the interview. It's better to remember the core principle and re-implement it.
Alternatively (I couldn't get the b in aa solution to work), you can 'rotate' your list and check if the rotated list is equal to b:
def is_rotation(a, b):
    for n in range(len(a)):
        c = a[-n:] + a[:-n]
        if b == c:
            return True
    return False
Although there is only one for loop, each iteration slices and compares lists of length n, so the worst case is O(n^2) rather than O(n). Hope it helps.
This seems to work.
def func(a,b):
    if len(a) != len(b):
        return False
    elif a == b:
        return True
    indices = [i for i, x in enumerate(b) if x == a[0] and i > 0]
    for i in indices:
        if a == b[i:] + b[:i]:
            return True
    return False
And this also:
def func(a, b):
    length = len(a)
    if length != len(b):
        return False
    i = 0
    while i < length:
        if a[0] == b[i]:
            j = i
            for x in a:
                if x != b[j]:
                    break
                j = (j + 1) % length
            else:
                # no mismatch found: a equals b rotated by i
                return True
        i += 1
    return False
You could try testing the performance of just using the rotate() function in the deque collection:
from collections import deque

def is_rotation(a, b):
    if len(a) == len(b):
        da = deque(a)
        db = deque(b)
        for offset in range(len(a)):
            if da == db:
                return True
            da.rotate(1)
    return False
In terms of performance, do you need to make this calculation many times for small arrays, or a few times on very large arrays? This would determine whether or not special case testing would speed it up.
If you can represent these as strings instead, just do:
def cyclically_equivalent(a, b):
    return len(a) == len(b) and a in 2 * b
Otherwise, one should get a sublist searching algorithm, such as Knuth-Morris-Pratt (Google gives some implementations) and do
def cyclically_equivalent(a, b):
    return len(a) == len(b) and sublist_check(a, 2 * b)
The Knuth-Morris-Pratt algorithm is a string search algorithm that runs in O(n), where n is the length of the text S, assuming a preconstructed table T, which is built in O(m), where m is the length of the search string. All in all it is O(n+m).
You could do a similar pattern matching algorithm inspired by KMP.
1. Concatenate a list to itself, like a+a or b+b. This is the searched text/list with 2*n elements.
2. Build the table T based on the other list (be it b or a). This is done in O(n).
3. Run the KMP-inspired algorithm. This is done in O(2*n) (because you concatenated a list to itself).
Overall time complexity is O(2*n+n) = O(3*n), which is in O(n).
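A minimal sketch of those three steps for plain Python lists. The function names are invented for this example, and the table is the usual KMP failure (longest proper prefix-suffix) table.

```python
def build_failure_table(pattern):
    """table[i] = length of the longest proper prefix of pattern[:i + 1]
    that is also a suffix of it."""
    table = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def kmp_contains(text, pattern):
    """Return True if pattern occurs as a contiguous sublist of text."""
    if not pattern:
        return True
    table = build_failure_table(pattern)
    k = 0  # number of pattern elements matched so far
    for item in text:
        while k > 0 and item != pattern[k]:
            k = table[k - 1]
        if item == pattern[k]:
            k += 1
        if k == len(pattern):
            return True
    return False


def is_rotation_kmp(a, b):
    """b is a rotation of a iff the lengths match and b occurs in a + a."""
    return len(a) == len(b) and kmp_contains(a + a, b)


print(is_rotation_kmp([3, 1, 2, 3, 4], [3, 4, 3, 1, 2]))  # True
print(is_rotation_kmp([3, 1, 2, 3, 4], [3, 4, 3, 2, 1]))  # False
```

Unlike the rolling-hash variant, this needs only equality comparisons between elements, so it also works for unhashable items.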
