Find the nth character of an increasing sequence - python

Recently I saw a competitive coding question where the brute-force approach doesn't meet the time limit. Is there another solution?
Question:
An expanding sequence is given which starts with 'a'. Each character is replaced in the following way:
a=>ab
b=>cd
c=>cd
d=>ab
Therefore it will look like this in each iteration:
a
ab
abcd
abcdcdab
abcdcdabcdababcd
.......
A number n will be given as input, and the function should return the character at the nth position.
I have tried the brute-force approach of forming the full string and returning the char at n, but the time limit was exceeded.
I have tried the following:
dictionary={
    'a':'ab',
    'b':'cd',
    'c':'cd',
    'd':'ab'
}
string="a"
n=128
while len(string)<n:
    new_string=''
    for i in string:
        new_string+=dictionary[i]
    string=new_string
print(string[n-1])

The solution to problems like this is never to actually generate all the strings.
Here's a fast solution that descends directly through the tree of substitutions:
dictionary={
    'a':['a','b'],
    'b':['c','d'],
    'c':['c','d'],
    'd':['a','b']
}
def nth_char(n):
    # Determine how many levels of substitution are required
    # to produce the nth character.
    # Remember the size of the last level
    levels = 1
    totalchars = 1
    lastlevelsize = 1
    while totalchars < n:
        levels += 1
        lastlevelsize *= 2
        totalchars += lastlevelsize
    # position of the target char in the last level
    pos = (n-1) - (totalchars - lastlevelsize)
    # start at char 1, and find the path to the target char
    # through the levels
    current = 'a'
    while levels > 1:
        levels -= 1
        # next iteration, we'll go to the left or right subtree
        totalchars -= lastlevelsize
        # half of the last level size is the last level size in the next iteration
        lastlevelsize = lastlevelsize//2
        # is the target char a child of the left or right substitution product?
        # each corresponds to a contiguous part of the last level
        if pos < lastlevelsize:
            #left - take the left part of the last level
            current = dictionary[current][0]
        else:
            #right - take the right part of the last level
            current = dictionary[current][1]
            pos -= lastlevelsize
    return current
print(nth_char(17))
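As far as I can tell, nth_char above treats n as a position in all the levels written one after another (a, ab, abcd, ...), whereas the brute-force code in the question indexes into a single expanded string. If the second reading is the intended one, the same tree descent can be sketched as below. Each level is a prefix of the next (because 'a' expands to 'ab'), so the character at 0-based position p is child p % 2 of the character at position p // 2 one level up. RULES and nth_char_in_string are names made up for this sketch.

```python
# Substitution rules from the question: each character expands to two children.
RULES = {
    'a': 'ab',
    'b': 'cd',
    'c': 'cd',
    'd': 'ab',
}

def nth_char_in_string(n):
    """Return the n-th character (1-based) of the expanded string."""
    pos = n - 1  # 0-based position
    current = 'a'
    # Walk the bits of pos from most significant to least significant:
    # a 0 bit selects the left child, a 1 bit selects the right child.
    for shift in range(pos.bit_length() - 1, -1, -1):
        bit = (pos >> shift) & 1
        current = RULES[current][bit]
    return current

print(nth_char_in_string(17))  # c
```

This takes O(log n) steps and never builds the string, so it should agree with the brute-force loop from the question for any n that loop can handle.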

Related

Python Leetcode 3: Time limit exceeded

I am solving LeetCode problem https://leetcode.com/problems/longest-substring-without-repeating-characters/:
Given a string s, find the length of the longest substring without repeating characters.
Constraints:
0 <= s.length <= 5 * 10^4
s consists of English letters, digits, symbols and spaces.
I used this sliding window algorithm:
def lengthOfLongestSubstring(str):
    # define base case
    if (len(str) < 2): return len(str)
    # define pointers and frequency counter
    left = 0
    right = 0
    freqCounter = {} # used to store the character count
    maxLen = 0
    while (right < len(str)):
        # adds the character count into the frequency counter dictionary
        if (str[right] not in freqCounter):
            freqCounter[str[right]] = 1
        else:
            freqCounter[str[right]] += 1
        # print (freqCounter)
        # runs the while loop if we have a key-value with value greater than 1.
        # this means that there are repeated characters in the substring.
        # we want to move the left pointer by 1 until that value decreases to 1 again. E.g., {'a':2,'b':1,'c':1} to {'a':1,'b':1,'c':1}
        while (len(freqCounter) != right-left+1):
        # while (freqCounter[str[right]] > 1): ## Time Limit Exceeded Error
            print(len(freqCounter), freqCounter)
            freqCounter[str[left]] -= 1
            # remove the key-value if value is 0
            if (freqCounter[str[left]] == 0):
                del freqCounter[str[left]]
            left += 1
        maxLen = max(maxLen, right-left+1)
        # print(freqCounter, maxLen)
        right += 1
    return maxLen
print(lengthOfLongestSubstring("abcabcbb")) # 3 'abc'
I got the error "Time Limit Exceeded" when I submitted with this while loop:
while (freqCounter[str[right]] > 1):
instead of
while (len(freqCounter) != right-left+1):
I thought the first is accessing an element in a dictionary, which has a time complexity of O(1). Not sure why this would be significantly slower than the second version. This seems to mean my approach is not optimal in either case. I thought sliding window would be the most efficient algorithm; did I implement it wrong?
Your algorithm's running time is close to the timeout limit for some tests; I even got the time-out with the len(freqCounter) version. The two conditions you tried cannot differ by much, so I would look into more drastic ways to improve the efficiency of the algorithm:
Instead of counting the frequency of letters, you could store the index of where you last found the character. This allows you to update left in one go, avoiding a second loop where you had to decrease frequencies at each unit step.
Performing a del is really not necessary.
You can also use some more pythonic looping, like with enumerate
Here is the update of your code applying those ideas (the first one is the most important one):
class Solution(object):
    def lengthOfLongestSubstring(self, s):
        lastpos = {}
        left = 0
        maxLen = 0
        for right, ch in enumerate(s):
            if lastpos.setdefault(ch, -1) >= left:
                left = lastpos[ch] + 1
            else:
                maxLen = max(maxLen, right - left + 1)
            lastpos[ch] = right
        return maxLen
Another boost can be achieved when you work with ASCII codes instead of characters, as then you can use a list instead of a dictionary. As the code challenge guarantees the characters are from a small set of basic characters, we don't need to take other character codes into consideration:
class Solution(object):
    def lengthOfLongestSubstring(self, s):
        lastpos = [-1] * 128
        left = 0
        maxLen = 0
        for right, asc in enumerate(map(ord, s)):
            if lastpos[asc] >= left:
                left = lastpos[asc] + 1
            else:
                maxLen = max(maxLen, right - left + 1)
            lastpos[asc] = right
        return maxLen
When submitting this, it scored very well in terms of running time.

Time complexity of finding range of target in sorted array - Is this solution O(N) in the worst case?

I was going through LeetCode problem 34. Find First and Last Position of Element in Sorted Array, which says:
Given an array of integers nums sorted in non-decreasing order, find the starting and ending position of a given target value.
If target is not found in the array, return [-1, -1].
You must write an algorithm with O(log n) runtime complexity.
Since the question wanted logn run-time, I implemented the binary-search logic. But I am not sure, and think that, with the extra-while loop inside the base condition, I actually go to O(n) in the worst case. Is that true?
class Solution(object):
    def searchRange(self, nums, target):
        """
        :type nums: List[int]
        :type target: int
        :rtype: List[int]
        """
        left = 0
        right = len(nums) - 1
        pos = [-1,-1]
        while left <= right:
            middle = (left + right) // 2
            """
            This is pure binary search until we hit the target. Once
            we have hit the target, we expand towards left and right
            until we find the number equal to the target.
            """
            if nums[middle] == target:
                rIndex = middle
                while rIndex + 1 < len(nums) and nums[rIndex + 1] == target:
                    rIndex += 1
                pos[1] = rIndex
                lIndex = middle
                while lIndex - 1 >= 0 and nums[lIndex - 1] == target:
                    lIndex -= 1
                pos[0] = lIndex
                break
            elif target > nums[middle]:
                left = middle + 1
            else:
                right = middle - 1
        return pos
Here is what I think for an example array that looks like:
input = [8,8,8,8,8,8,8] , target = 8
When the base condition nums[middle] == target hits, I will need to iterate over the complete array, which makes the run-time complexity O(n), right?
Interestingly, this solution is faster than 95% of the submissions!! But I think there is some issue with LeetCode!!!
Yes, you are right, the loop degrades the worst case time complexity. You rightly identified what happens when the input array has only duplicates of the target value, and no other value.
The solution is to perform two binary searches: one that prefers going to the left side, and one that prefers to go to the right side of the target value.
If the test cases do not thoroughly test this O(n) behaviour, this O(n) solution will not come out as a bad one.
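The two-binary-search idea could look roughly like the sketch below. The helper name find_bound and its prefer_left flag are made up for this illustration. On a match, the search records the index and keeps narrowing toward the preferred side instead of stopping.

```python
def find_bound(nums, target, prefer_left):
    """Binary search returning the first (or last) index of target, or -1."""
    lo, hi = 0, len(nums) - 1
    found = -1
    while lo <= hi:
        mid = (lo + hi) // 2
        if nums[mid] == target:
            found = mid
            # Keep searching on the preferred side for another occurrence.
            if prefer_left:
                hi = mid - 1
            else:
                lo = mid + 1
        elif nums[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return found

def search_range(nums, target):
    return [find_bound(nums, target, True), find_bound(nums, target, False)]

print(search_range([8, 8, 8, 8, 8, 8, 8], 8))  # [0, 6]
```

Each call halves the range on every step, so the all-duplicates input from the question also stays at O(log n).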

Optimal brute force solution for finding longest palindromic substring

This is my current approach:
def isPalindrome(s):
    if (s[::-1] == s):
        return True
    return False
def solve(s):
    l = len(s)
    ans = ""
    for i in range(l):
        subStr = s[i]
        for j in range(i + 1, l):
            subStr += s[j]
            if (j - i + 1 <= len(ans)):
                continue
            if (isPalindrome(subStr)):
                ans = max(ans, subStr, key=len)
    return ans if len(ans) > 1 else s[0]
print(solve(input()))
My code exceeds the time limit according to the auto scoring system. I've already spent some time looking this up on Google; all of the solutions I found either use the same idea with no optimization or use dynamic programming, but sadly I must use only brute force to solve this problem. I tried to break out of the loop earlier by skipping all the substrings that are shorter than the longest palindromic string found so far, but I still fail to meet the time requirement. Is there any other way to break out of these loops earlier, or a more time-efficient approach than the above?
With subStr += s[j], a new string is created over the length of the previous subStr. And with s[::-1], the substring from the previous offset j is copied over and over again. Both are inefficient because strings are immutable in Python and have to be copied as a new string for any string operation. On top of that, the string comparison in s[::-1] == s is also inefficient because you've already compared all of the inner substrings in the previous iterations and need to compare only the outermost two characters at the current offset.
You can instead keep track of just the index and the offset of the longest palindrome so far, and only slice the string upon return. To account for palindromes of both odd and even lengths, you can either increase the index by 0.5 at a time, or double the length to avoid having to deal with float-to-int conversions:
def solve(s):
    length = len(s) * 2
    index_longest = offset_longest = 0
    for index in range(length):
        offset = 0
        for offset in range(1 + index % 2, min(index, length - index), 2):
            if s[(index - offset) // 2] != s[(index + offset) // 2]:
                offset -= 2
                break
        if offset > offset_longest:
            index_longest = index
            offset_longest = offset
    return s[(index_longest - offset_longest) // 2: (index_longest + offset_longest) // 2 + 1]
Solved by using the approach "Expand Around Center", thanks @Maruthi Adithya
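For readers who have not seen "Expand Around Center" spelled out, here is a minimal sketch of the idea. It is not the code from any answer here, and the names longest_palindrome and expand are made up. Every position is tried as the center of an odd-length palindrome, and every gap between neighbours as the center of an even-length one.

```python
def longest_palindrome(s):
    """Return the longest palindromic substring of s (first one found on ties)."""
    if not s:
        return ""
    best_start, best_len = 0, 1

    def expand(left, right):
        # Grow outwards while the two ends match, then return the
        # half-open bounds [start, end) of the palindrome that was found.
        while left >= 0 and right < len(s) and s[left] == s[right]:
            left -= 1
            right += 1
        return left + 1, right

    for center in range(len(s)):
        # Odd-length palindromes are centered on a character,
        # even-length ones between two neighbouring characters.
        for start, end in (expand(center, center), expand(center, center + 1)):
            if end - start > best_len:
                best_start, best_len = start, end - start
    return s[best_start:best_start + best_len]

print(longest_palindrome("xyzabccbalmn"))  # abccba
```

Only index comparisons happen inside the loops, and the string is sliced once at the end. This is still O(n^2) comparisons in the worst case.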
This modification of your code should improve performance. You can stop as soon as the longest possible remaining substring is no longer than your already computed answer. Also, you should start your second loop at i + len(ans) + 1 instead of i + 1 to avoid useless iterations:
def solve(s):
    l = len(s)
    ans = ""
    for i in range(l):
        # stop once the remaining suffix cannot beat the current answer
        if (l - i <= len(ans)):
            break
        # only consider substrings strictly longer than the current answer
        for j in range(i + len(ans) + 1, l + 1):
            subStr = s[i:j]
            if (isPalindrome(subStr)):
                ans = subStr
    return ans if len(ans) > 1 else s[0]
This is a solution that has a time complexity greater than the solutions provided.
Note: This post is to think about the problem better and does not specifically answer the question. I have taken a mathematical approach to find a time complexity greater than 2^L (where L is size of input string)
Note: This is a post to discuss potential algorithms. You will not find the answer here. And the logic shown here has not been proven extensively.
Do let me know if there is something that I haven't considered.
Approach: Create the set of possible substrings. Compare and find the maximum pair* from this set that has the longest possible palindrome.
Example case with input string: "abc".
In this example, substring set has: "a","b","c","ab","ac","bc","abc".
7 elements.
Comparing each element with all other elements will involve: 7^2 = 49 calculations.
Hence, input size is 3 & no of calculations is 49.
Time Complexity:
First compute time complexity for generating the substring set:
$$\sum_{a=1}^{L} \left( C_{a}^{L} \right)$$
Here, we are adding all the different substring size combination from the input size L.
To make it clear: In the above example input size is 3. So we find all the pairs with size =1 (i.e: "a","b","c"). Then size =2 (i.e: "ab","ac","bc") and finally size = 3 (i.e: "abc").
So choosing 1 character from input string = combination of taking L things 1 at a time without repetition.
In our case number of combinations = 3.
This can be mathematically shown as (where a = 1):
$$C_{a}^{L}$$
Similarly choosing 2 char from input string = 3
Choosing 3 char from input string = 1
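As a worked check of the counts above, assuming the post's enumeration (picking any a of the L characters, which includes non-contiguous picks such as "ac"), the binomial theorem gives a closed form for the size of the generated set:

```latex
\sum_{a=1}^{L} C_{a}^{L} = 2^{L} - 1,
\qquad \text{e.g. } L = 3:\quad 3 + 3 + 1 = 7 = 2^{3} - 1
```

This matches the 7 elements listed for "abc".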
Finding time complexity of palindrome pair from generated set with maximum length:
Size of generated set: N
For this we have to compare each string in set with all other strings in set.
So N*N, or 2 for loops. Hence the final time complexity is:
$$\sum_{a=1}^{L} \left( C_{a}^{L} \right)^{2}$$
This is a diverging function greater than 2^L for L > 1.
However, there can be multiple optimizations applied to this. For example: there is no need to compare "a" with "abc" as "a" will also be compared with "a". Even if this optimization is applied, it will still have a time complexity > 2^L (For the most cases).
Hope this gave you a new perspective to the problem.
PS: This is my first post.
You should not search from the beginning of the string; instead, start from the middle of a candidate palindrome and expand outwards.
For example, for the string xyzabccbalmn, your solution will cost ~ 6 * 11 comparisons, but searching from the middle will cost ~ 11 * 2 + 2 operations.
But in any case, brute-forcing will never ensure that your solution runs fast enough for an arbitrary string.
Try this:
def solve(s):
    if len(s)==1:
        print(0)
        return '1'
    if len(s)<=2 and not(isPalindrome(s)):
        print (0)
        return '1'
    elif isPalindrome(s):
        print( len(s))
        return '1'
    elif isPalindrome(s[0:len(s)-1]) or isPalindrome(s[1:len(s)]):
        print (len(s)-1)
        return '1'
    elif len(s)>=2:
        solve(s[0:len(s)-1])
        return '1'
    return 0

Biased Binary Sort Return First Index of Insertion

I am trying to make a way of returning the first possible location where I can insert a new term in an increasing list. My general attempt is to use binary sort until the condition arises that the term before is less than my inserting term, and the next term is equal to or greater than my inserting term:
lis = [1,1,2,2,2,4,5,6,7,8,9]
def binary_sort(elem, lis, left, right):
    mid = (left + right)//2
    if (lis[mid-1] < elem and elem <= lis[mid]):
        return mid
    elif (lis[mid] > mid):
        return binary_sort(elem, lis, mid, right)
    else:
        return binary_sort(elem, lis, left, mid)
This is not working... where is the issue with my code or my general strategy?
I would suggest to take a look at the following code.
def binary_insertion_search(elem, items, left, right):
    if elem > items[right]:
        return right + 1
    while left < right:
        mid = (left + right) // 2
        if elem > items[mid]:
            left = mid + 1
        elif elem <= items[mid]:
            right = mid
    return left
I rewrote your function a little bit. Now for the explanation:
I assumed that the index to return is the first position that any content can be placed at, which in turn would move all following items to the right.
Since we can not incorporate indices outside of the range of the list by design, we have to check if the element is larger than the item at the end of the list. We then return the next possible index len(lis).
To avoid recursion altogether, I used a while loop. The loop runs as long as left is less than right; once left is equal to or greater than right, we have found the index to put the element at.
Your calculation of the mid value was good, so I just kept it.
If our element is greater than the item at mid, the only possible next option is to select the next unchecked position, which is mid + 1.
Now for the interesting part. To find the leftmost item, we set the right boundary to mid rather than mid - 1, so that mid itself stays in the range as a candidate. Note that we check whether the element is smaller than or equal to the item at mid.
This guarantees us that when we find a candidate that is equal to the searched element, we run the search again (with reduced ranged from right) to possibly find a smaller index. This stops, when left == right is true, ending the loop.
I hope this answers your question and points out the differences in the code!
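One way to gain confidence in a function like this is to compare it against the standard library's bisect.bisect_left, which also returns the leftmost insertion point. The sketch below repeats the function from this answer so that it runs on its own; the test values are arbitrary.

```python
from bisect import bisect_left

def binary_insertion_search(elem, items, left, right):
    # Same logic as in the answer: larger than everything goes at the end.
    if elem > items[right]:
        return right + 1
    while left < right:
        mid = (left + right) // 2
        if elem > items[mid]:
            left = mid + 1   # insertion point is strictly right of mid
        else:
            right = mid      # mid is still a candidate
    return left

lis = [1, 1, 2, 2, 2, 4, 5, 6, 7, 8, 9]
for elem in (0, 1, 2, 3, 9, 10):
    ours = binary_insertion_search(elem, lis, 0, len(lis) - 1)
    # Both should agree on the first index where elem can be inserted.
    assert ours == bisect_left(lis, elem)
    print(elem, ours)
```

The function indexes items[right], so it needs a non-empty list and a valid right index. bisect_left has no such restriction.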

Find the index of the first 1 in an array which contains only 1s and 0s, all the 0s being to the left side of the array, and all the 1s to the right?

The task is to find the index of the first 1 in an array which contains only 1s and 0s with all the 0s being to the left side of the array, and all the 1s to the right.
For instance, if the list was, [0,0,0,0,1,1], the answer would be 4.
The time taken must be logarithmic.
I tried implementing the logic that if the middle number was 0, we only look at the second half of the list. If on the other hand, the middle number was 1, we look at only the first half of the list. We keep doing this till we only have one number left.
def first1(lst):
    start_val=0
    end_val=len(lst)
    midpoint=(end_val+start_val)//2
    while end_val-start_val>1:
        if lst[midpoint]==1:
            endval=midpoint-1
            midpoint=start_val+end_val//2
        else:
            startval=midpoint+1
            midpoint=start_val+end_val//2
    return midpoint
This gives me an infinite loop. I don't understand what I am doing wrong.
You'll kick yourself when you read this. You initialize variables start_val and end_val, but you assign midpoint + 1 and midpoint - 1 to startval and endval (no underscore in the variable name).
With that fixed, it returns the right answer for your example. I didn't see it immediately, so I put a print statement in the loop along with a 1 second delay so I could see what was going on. Also, I formatted your code a bit to make it more readable, mostly following the recommendations for putting spaces between variables and operators, and added the missing parentheses in the midpoint calculation.
import time
def first1(lst):
    start_val = 0
    end_val = len(lst)
    midpoint = (end_val + start_val) // 2
    while end_val - start_val > 1:
        if lst[midpoint] == 1:
            end_val = midpoint - 1
            midpoint = (start_val + end_val) // 2
        else:
            start_val = midpoint + 1
            midpoint = (start_val + end_val) // 2
        print(start_val, midpoint, end_val)
        time.sleep(1)
    return midpoint
print(first1([0,0,0,0,1,1]))
The reason you are having this issue is that on the line
endval=midpoint-1
and the line
startval=midpoint+1
you are missing the underscores in start_val and end_val.
However, due to the method you are using, this will not work if the length is not a power of 2.
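One way to sidestep the boundary problems, whatever the list length, is to keep a half-open search range and never discard an index that could still be the answer. The following is a sketch under that assumption, not the original poster's method, and first_one is a made-up name.

```python
def first_one(lst):
    """Return the index of the first 1, or len(lst) if there is none."""
    lo, hi = 0, len(lst)  # the answer always lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if lst[mid] == 1:
            hi = mid      # mid could be the first 1, so keep it in range
        else:
            lo = mid + 1  # everything up to and including mid is 0
    return lo

print(first_one([0, 0, 0, 0, 1, 1]))  # 4
```

Setting hi = mid (rather than mid - 1) is what keeps a candidate from being skipped, and lo < hi guarantees the range shrinks on every step.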
You can consider your array as sorted and use a binary search algorithm. This will give you an O(log(N)) time complexity.
For example:
from bisect import bisect_left
a = [0,0,0,0,1,1]
p = bisect_left(a,1)
print(p) # 4
If you're not allowed to use Python modules, you can write your own binary search:
def find1(array):
    lo,hi = 0,len(array)-1
    while hi>=lo:
        mid = (hi+lo)//2
        if array[mid]: hi = mid-1
        else: lo = mid+1
    return hi+1
find1([0,0,0,0,1,1]) # 4
