Complexity of reverse sentence algorithm - python

I was working on a data-structure problem in Python where I have to reverse the order of the words in an array in the most efficient manner. I came up with the following solution to the problem:
def reverse(arr, st, end):
    while st < end:
        arr[st], arr[end] = arr[end], arr[st]
        end -= 1
        st += 1
def reverse_arr(arr):
    arr = arr[::-1]
    st_index = 0
    length = len(arr)
    for i, val in enumerate(arr):
        if val == ' ':
            end_index = i-1
            reverse(arr, st_index, end_index)
            st_index = end_index + 2
        if i == length - 1:
            reverse(arr, st_index, length-1)
    return arr
If the arr is:
arr = [ 'p', 'e', 'r', 'f', 'e', 'c', 't', ' ',
'm', 'a', 'k', 'e', 's', ' ',
'p', 'r', 'a', 'c', 't', 'i', 'c', 'e' ]
It returns:
['p', 'r', 'a', 'c', 't', 'i', 'c', 'e', ' ',
'm', 'a', 'k', 'e', 's', ' ',
'p', 'e', 'r', 'f', 'e', 'c', 't']
The solution works fine but I don't understand how the complexity of this algorithm is O(n). It's written that traversing the array twice with a constant number of actions for each item is linear i.e. O(n) where n is the length of the array.
I think it should be more than O(n) as according to me the length of each word is not fixed and time complexity to reverse each word depends on the length of the word. Can someone explain this in a better way?

reverse will get called once for each word. During that call, it will do a constant amount of work per character.
You can either represent this in terms of the number of words and average length of words (i.e. O(wordCount*averageWordLength)), or in terms of the total number of characters in the array. If you do the latter, it's easy to see that you're still doing a constant amount of work per character (since both reverse and reverse_arr do a constant amount of work per character, and no two reverse calls will include the same character), leading to O(characterCount) complexity.
I would not assume that "the length of the array" in the explanation refers to the number of words. It more likely refers to the number of characters, or the explanation assumes that the word length has a fixed upper bound (in which case the complexity is indeed O(wordCount)).
TL;DR: n in O(n) is characterCount, not wordCount.
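To make the "constant work per character" argument concrete, here is a minimal sketch that counts the swaps performed while reversing each word. The helper names reverse_counting and count_word_swaps are hypothetical and not part of the original code; the sketch assumes words are separated by single spaces. The three inputs have the same total length but very different word lengths, and the swap count never exceeds half the number of characters.

```python
def reverse_counting(arr, st, end):
    """Reverse arr[st..end] in place and return the number of swaps made."""
    swaps = 0
    while st < end:
        arr[st], arr[end] = arr[end], arr[st]
        st += 1
        end -= 1
        swaps += 1
    return swaps


def count_word_swaps(text):
    """Return the total swaps needed to reverse every space-separated word."""
    arr = list(text)
    total = 0
    start = 0
    for i, ch in enumerate(arr):
        if ch == ' ':
            total += reverse_counting(arr, start, i - 1)
            start = i + 1
    # Reverse the final word, which is not followed by a space.
    total += reverse_counting(arr, start, len(arr) - 1)
    return total


# Same total length (23 characters), different word lengths.
samples = [
    " ".join(["aaa"] * 6),     # six 3-letter words
    " ".join(["a" * 11] * 2),  # two 11-letter words
    "a" * 23,                  # one 23-letter word
]
for text in samples:
    print(len(text), count_word_swaps(text))
```

Each character takes part in at most one swap, so the total is bounded by n/2 no matter how the characters are grouped into words.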

def reverse(arr, st, end):
    while st < end:
        arr[st], arr[end] = arr[end], arr[st]
        end -= 1
        st += 1
def reverse_Cha(arr):
    arr = arr[::-1]
    st_index = 0
    length = len(arr)
    for i, val in enumerate(arr):
        if val == ' ':
            end_index = i-1
            reverse(arr, st_index, end_index)
            st_index = end_index + 2
        if i == length - 1:
            reverse(arr, st_index, length-1)
    return arr
def reverse_Jon(arr):
    r = [ch for word in ' '.join(''.join(arr).split()[::-1]) for ch in word]
    return r
def reverse_Nua(arr):
    rev_arr = list(' '.join(''.join(arr).split()[::-1]))
    return rev_arr
Consider the three proposed solutions: yours as reverse_Cha, Jon Clements' as reverse_Jon, and mine as reverse_Nua.
Note that [::-1] is O(n), as is examining each element of a list of length n.
reverse_Cha uses [::-1], then examines each element twice (once to read it and once to swap it), so the complexity depends on the total number of elements: O(3n+c), which we write as O(n) (the +c comes from the O(1) operations).
reverse_Jon uses [::-1], then examines each element twice (each character of each word), so the complexity depends on the total number of elements and the number of words: O(3n+m), which we write as O(n+m), with m the number of words. Since m is at most n, this is still O(n).
reverse_Nua uses [::-1], then sticks to Python's built-in list and string functions, so the complexity still depends only on the total number of elements (simply O(n) this time).
In terms of performance (1e6 loops), we got reverse_Cha: 2.785867s; reverse_Jon: 4.11845s (due to the for loops); reverse_Nua: 1.185973s.

I assume this is a purely theoretical question, because in real world applications you would probably rather split your list into one-word sublists, then rejoin the sublists in reverse order - that requires more memory, but is much faster.
Having said that, I'd like to point out that the algorithm you've shown is, indeed, O(n): it depends on the total length of your words, not on the lengths of individual words. In other words, it will take the same time for 20 3-letter words, 6 10-letter words, 10 6-letter words… You always go through every letter only twice: once during the reversal of the whole array (the arr[::-1] slice in reverse_arr) and once during the reversal of the individual words (the calls to reverse).
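The split-and-rejoin approach mentioned at the start of this answer could be sketched roughly as follows. The function name reverse_words_by_splitting is hypothetical, and the sketch assumes the input is a list of single characters with words separated by ' '. It builds new lists instead of swapping in place, which is where the extra memory goes.

```python
def reverse_words_by_splitting(chars):
    """Return a new list with the space-separated words in reverse order."""
    # Split the flat character list into one sublist per word.
    words = [[]]
    for ch in chars:
        if ch == ' ':
            words.append([])
        else:
            words[-1].append(ch)

    # Rejoin the sublists in reverse order, putting a space between words.
    result = []
    for k, word in enumerate(reversed(words)):
        if k > 0:
            result.append(' ')
        result.extend(word)
    return result


arr = list("perfect makes practice")
print(''.join(reverse_words_by_splitting(arr)))  # practice makes perfect
```

This is still linear in the number of characters: every character is appended once while splitting and once while rejoining.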

Related

How to make a simple pattern detector in python

Let's say I have some code: list=["r","s","r","s"]
I want to print the next element of the sequence. The expected output would, of course, be "r". Is there any way to do this in Python?
I tried a couple of programs I found online, but none of them helped.
Assuming that your pattern starts at the beginning of your array, here is a way to find the next element:
def repeat(pattern, length):
    return (length//len(pattern))*pattern + pattern[:length%len(pattern)]
def find_pattern(array):
    # we successively try longer and longer patterns, starting with length 1
    for len_attempt, _ in enumerate(array, 1):
        pattern = array[:len_attempt]
        if repeat(pattern, len(array)) == array:
            return repeat(pattern, len(array)+1)[-1]
Here is the output of this function for various patterns:
arr = ['r', 's', 'r', 's']
print(find_pattern(arr))
>>> r
arr = ['r', 's', 'w', 'r', 's']
print(find_pattern(arr))
>>> w
arr = ['r', 's', 'w', 'w', 's']
print(find_pattern(arr))
>>> r # considering a pattern of length 5
Explanation:
First of all, we define a repeat function which will be useful later. It repeats a pattern to a given length. For example, if we give ['r', 's'] as a pattern and a length of 5, it will return ['r', 's', 'r', 's', 'r'].
Then, we try patterns of length 1, 2, 3... until repeating the pattern reproduces the original array. At this point we know that this is the shortest pattern that works, and we return the next predicted element. In the worst case, the program will consider a pattern of length len(array), in which case it will just return the first element of the array.
You can easily tweak this program to give:
not only the next element of the array, but the nth one.
The length of the pattern
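As a rough sketch of those two tweaks, the functions below return the shortest repeating pattern and the predicted element at an arbitrary index. The names shortest_pattern and nth_element are hypothetical, and the sketch assumes a non-empty array whose pattern starts at the beginning.

```python
def repeat(pattern, length):
    """Repeat pattern until it is exactly `length` items long."""
    return (length // len(pattern)) * pattern + pattern[:length % len(pattern)]


def shortest_pattern(array):
    """Return the shortest prefix whose repetition reproduces array."""
    for size in range(1, len(array) + 1):
        pattern = array[:size]
        if repeat(pattern, len(array)) == array:
            return pattern
    return []  # only reached when array is empty


def nth_element(array, n):
    """Predict the element at 0-based index n by extending the pattern."""
    pattern = shortest_pattern(array)
    return pattern[n % len(pattern)]


arr = ['r', 's', 'w', 'r', 's']
print(len(shortest_pattern(arr)))  # 3
print(nth_element(arr, len(arr)))  # w (the next element)
print(nth_element(arr, 10))        # s
```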
If the pattern doesn't necessarily start at the beginning of your array, it shouldn't be too difficult to make this program work for this case too. (hint: remove the n first elements of the array and find a pattern that ends with these elements.)
I hope this is what you are looking for!

Correct word generation without repetitions

How many five-letter words can you make from a 26-letter alphabet (no repetitions)?
I am writing a program that generates names (just words) of 5 letters in the format consonant_vowel_consonant_vowel_consonant. Only 5 letters, in the Latin alphabet. I just want to understand how many times I have to run the loop for generation. At 65780, for example, repetitions already begin. Can you please tell me how to do it correctly?
import random
import xlsxwriter
consonants = ['B', 'C', 'D', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N', 'P', 'Q',
              'R', 'S', 'T', 'V', 'W', 'X', 'Z']
vowels = ['A', 'E', 'I', 'O', 'U', 'Y']
workbook = xlsxwriter.Workbook('GeneratedNames.xlsx')
worksheet = workbook.add_worksheet()
def names_generator(size=5, chars=consonants + vowels):
    for y in range(65780):
        toggle = True
        _id = ""
        for i in range(size):
            if toggle:
                toggle = False
                _id += random.choice(consonants)
            else:
                toggle = True
                _id += random.choice(vowels)
        worksheet.write(y, 0, _id)
        print(_id)
    workbook.close()
names_generator()
You can use itertools.combinations to get 3 different consonants and 2 different vowels and get the permutations of those to generate all possible "names".
from itertools import combinations, permutations
names = [a+b+c+d+e for cons in combinations(consonants, 3)
                   for a, c, e in permutations(cons)
                   for vow in combinations(vowels, 2)
                   for b, d in permutations(vow)]
There are only 205,200 = 20x19x18x6x5 in total, so this will take no time at all for 5 letters, but will quickly take longer for more. That is, if by "no repetitions" you mean that no letter should occur more than once. If, instead, you just want that no consecutive letters are repeated (which is already guaranteed by alternating consonants and vowels), or that no names are repeated (which is guaranteed by constructing them without randomness), you can just use itertools.product instead, for a total of 288,000 = 20x20x20x6x6 names:
from itertools import product
names = [a+b+c+d+e for a, c, e in product(consonants, repeat=3)
                   for b, d in product(vowels, repeat=2)]
If you want to generate them in random order, you could just random.shuffle the list afterwards, or if you want just a few such names, you can use random.sample or random.choice on the resulting list.
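A minimal sketch of that last suggestion, assuming the same consonants and vowels as in the question: build the full list once, then draw a handful of distinct names from it with random.sample. The sample size of 10 is an arbitrary choice for illustration.

```python
import random
from itertools import product

consonants = list("BCDFGHJKLMNPQRSTVWXZ")
vowels = list("AEIOUY")

# Every consonant-vowel-consonant-vowel-consonant name, each exactly once.
names = [a + b + c + d + e
         for a, c, e in product(consonants, repeat=3)
         for b, d in product(vowels, repeat=2)]

# random.sample draws without replacement, so the picks are all different.
picked = random.sample(names, 10)
print(picked)
```

Because the list itself contains no duplicates, sampling without replacement cannot produce a repeated name, unlike calling random.choice per letter in a loop.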
If you want to avoid duplicates, you shouldn't use randomness but simply generate all such IDs:
from itertools import product
C = consonants
V = vowels
for id_ in map(''.join, product(C, V, C, V, C)):
    print(id_)
or
from itertools import cycle, islice, product
for id_ in map(''.join, product(*islice(cycle((consonants, vowels)), 5))):
    print(id_)
itertools allows for non repetitive permutations https://docs.python.org/3/library/itertools.html
import itertools, re
names = list(itertools.product(consonants + vowels, repeat=5))
consonants_regex = "(" + "|".join(consonants) + ")"
vowels_regex = "(" + "|".join(vowels) + ")"
search_string = consonants_regex + vowels_regex + consonants_regex + vowels_regex + consonants_regex
names_format = ["".join(name) for name in names if re.match(search_string, "".join(name))]
Output:
>>> len(names)
11881376
>>> len(names_format)
288000
I want to make sure to answer your question, "I just want to understand how many times I have to run the cycle for generation", since I think it is important to get a better intuition about the problem.
You have 20 consonants and 6 vowels and in total that yields 20x6x20x6x20 = 288000 different combinations for words. Since it is sequential, you can split it up to make that easier to understand. You have 20 different consonants you can put as the 1st letter and for each one 6 vowels you can attach afterwards = 20x6 = 120. Then you can keep going and say for those 120 combinations you can add 20 consonants for each = 120x20 = 2400 ... and so on.
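The step-by-step multiplication above can be written out as a small loop. This is only an illustration of the counting argument; the variable names are made up for the sketch.

```python
# Number of choices at each of the five positions: C, V, C, V, C.
choices_per_position = [20, 6, 20, 6, 20]

running_total = 1
for position, choices in enumerate(choices_per_position, start=1):
    running_total *= choices
    print(position, running_total)
# prints 20, 120, 2400, 14400 and finally 288000 as the running totals
```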

How to get certain number of alphabets from a list?

I have a 26-digit list. I want to print out a list of alphabets according to the numbers. For example, I have a list(consisting of 26-numbers from input):
[0,0,0,0,2,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0]
I'd like the output to be like this:
[e,e,l,s]
'e' appears twice in the output because index 4 corresponds to 'e' in the English alphabet and the number at index 4 is 2. The same goes for 'l', since it is at index 11 and its number is 1, and likewise for 's'. The other letters don't appear because their numbers are zero.
For example, I give another 26-digit input. Like this:
[1,2,2,3,4,0,3,4,4,1,3,1,4,4,1,0,0,0,0,0,4,2,3,2,2,1]
The output should be:
[a,b,b,c,c,d,d,d,e,e,e,e,g,g,g,h,h,h,h,i,i,i,i,j,k,k,k,l,m,m,m,m,n,n,n,n,o,u,u,u,u,v,v,w,w,w,x,x,y,y,z]
Is there any way to do this in Python 3?
You can use chr(97 + item_index) to get the respective items and then multiply by the item itself:
In [40]: [j * chr(97 + i) for i, j in enumerate(lst) if j]
Out[40]: ['ee', 'l', 's']
If you want them separate you can utilize itertools module:
In [44]: from itertools import repeat, chain
In [45]: list(chain.from_iterable(repeat(chr(97 + i), j) for i, j in enumerate(lst) if j))
Out[45]: ['e', 'e', 'l', 's']
Yes, it is definitely possible in Python 3.
Firstly, define an example list (as you did) of numbers and an empty list to store the alphabetical results.
The link between index and letter is chr(97 + index): ord("a") is 97, so conversely chr(97) is "a". The first index is 0, so 97 stays as it is, and as the loop advances the index increases and so does the letter.
Next, a nested for-loop to iterate over the list of numbers and then another for-loop to append the same alphabet multiple times according to the number list.
We could do this -> result.append(chr(97 + i) * my_list[i]) in the first loop itself but it wouldn't yield every alphabet separately [a,b,b,c,c,d,d,d...] rather it would look like [a,bb,cc,ddd...].
my_list = [1,2,2,3,4,0,3,4,4,1,3,1,4,4,1,0,0,0,0,0,4,2,3,2,2,1]
result = []
for i in range(len(my_list)):
    if my_list[i] > 0:
        for j in range(my_list[i]):
            result.append(chr(97 + i))
    else:
        pass
print(result)
An alternative to the wonderful answer by @Kasramvd
import string
n = [0,0,0,0,2,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0]
res = [i * c for i, c in zip(n, string.ascii_lowercase) if i]
print(res) # -> ['ee', 'l', 's']
Your second example produces:
['a', 'bb', 'cc', 'ddd', 'eeee', 'ggg', 'hhhh', 'iiii', 'j', 'kkk', 'l', 'mmmm', 'nnnn', 'o', 'uuuu', 'vv', 'www', 'xx', 'yy', 'z']
Splitting the strings ('bb' to 'b', 'b') can be done with the standard schema:
[x for y in something for x in y]
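Applied to the first example, that flattening step might look like the sketch below. The variable names grouped and flat are chosen for illustration only.

```python
import string

n = [0,0,0,0,2,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0]

# One string per letter, repeated according to its count: ['ee', 'l', 's']
grouped = [i * c for i, c in zip(n, string.ascii_lowercase) if i]

# Iterating over a string yields its characters, so this splits 'ee' into 'e', 'e'.
flat = [x for y in grouped for x in y]
print(flat)  # ['e', 'e', 'l', 's']
```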
Using a slightly different approach, which gives the characters individually as in your example:
import string
import numpy as np
a = [0,0,0,0,2,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0]
alphabet_lookup = np.repeat(np.arange(len(a)), a)
letter_lookup = np.array(list(string.ascii_lowercase))
res = letter_lookup[alphabet_lookup]
print(res)
To get
['e' 'e' 'l' 's']

Combinations of a String

The following was a job interview question which I struggled with.
(My attempt switched unnecessarily between list and set, took too many steps, and when I tested it I realised it was missing an expected output.)
If possible, looking for the proper answer or maybe a guide on how I should have tackled the problem. Thank you.
Question: Given a string, find all possible combinations from it (Front and Reverse).
Print all combinations and total count of combinations. Order doesn't matter.
Example s = 'polo'
Front Answer = 'p', 'po', 'pol', 'polo', 'ol',
'olo', 'lo', 'o', 'l'.
Reverse Answer: 'o', 'ol', 'olo', 'olop',
'lop', 'op', 'p', 'l'.
My answer:
count = 0
count2 = -1
length = len(s)
my_list = []
for i in s:
    temp = s[count:]
    temp2 = s[:count2]
    my_list.append(i)
    my_list.append(temp)
    my_list.append(temp2)
    count += 1
    count2 -= 1
my_set = set(my_list)
for f in my_set:
    print(f)
print(len(my_set)) # Answer for front
new_list = []
for f in my_set:
    new_list.append(f[::-1])
print('Reverse Result:')
for f in new_list:
    print(f)
print(len(new_list)) # Answer for reverse
You can do this with two nested for-loops. One will loop through the start indexes and the nested one loops through the end indexes (starting from the start + 1 going to the length of s +1 to reach the very end).
With these two indexes (start and end), we can use string slicing to append that combination to the list forward. This gives you all the combinations as you see below.
To get the reversed ones, you could do a for-loop as you have done reversing the order of the forward ones, but to save the space, in the code below, we just append the same index but sliced from the reversed s (olop).
s = "polo"
forward = []
backward = []
for start in range(len(s)):
    for end in range(start+1, len(s)+1):
        forward.append(s[start:end])
        backward.append(s[::-1][start:end])
print(forward)
print(backward)
print(len(forward) + len(backward))
which outputs:
['p', 'po', 'pol', 'polo', 'o', 'ol', 'olo', 'l', 'lo', 'o']
['o', 'ol', 'olo', 'olop', 'l', 'lo', 'lop', 'o', 'op', 'p']
20
If you really wanted to make the code clean and short, you could do the same thing in a list-comprehension. The logic remains the same, but we just compress it down to 1 line:
s = "polo"
forward = [s[start:end] for start in range(len(s)) for end in range(start+1, len(s)+1)]
backward = [c[::-1] for c in forward]
print(forward)
print(backward)
print(len(forward) + len(backward))
which gives the same substrings as before (the backward list comes out in a different order, which is fine since order doesn't matter).
Hope this helps!
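The example answer in the question lists each substring only once, whereas the lists above contain 'o' twice. If duplicates should be counted once (an assumption about what the interviewer wanted), a set comprehension is one way to sketch it:

```python
s = "polo"

# Set comprehensions drop duplicate substrings such as the second 'o'.
forward = {s[start:end] for start in range(len(s)) for end in range(start + 1, len(s) + 1)}
backward = {sub[::-1] for sub in forward}

print(sorted(forward))
print(sorted(backward))
print(len(forward), len(backward))  # 9 9
```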
Try this:
string = "polo"
x = [string[i:j] for i in range(len(string)) for j in range(len(string),0,-1) if j > i ]
x.sort()
print("Forward:",x[::-1])
print("Reverse:",x[::])

Longest repeating substring using for-loops and if-statements

I'm in an introductory level programming class that teaches python. I was introduced to a longest repeating substring problem for a project and I can't seem to crack it. I've looked on here for a solution, but I haven't learned suffix trees yet so I wouldn't be able to use them. So far, I've gotten here:
msg = "kalhfdlakdhfklajdf"  # (anything)
reps = []
for i in range(len(msg)):
    if msg[i] == msg[i + 1]:
        reps.append(msg[i])
What this does is scan my string, msg, and check whether the character at the counter matches the next character in sequence. If the characters match, it appends msg[i] to the list "reps". My problem is that:
a) The loop I created always appends one less than the repetition amount, and
b) my program always crashes due to msg[i+1] going out of bounds once it reaches the last character of the string.
In essence, I want my program to find repeats, append them to a list where the highest repeating character is counted and returned to the user.
You need to use len(msg)-1 as your range, but your condition will then omit one character. To get rid of that, you can add another condition to your code that checks the preceding character too:
With your condition you'll have 8 h in reps although there are 9 in msg:
>>> msg = "kalhfdlakdhhhhhhhhhfklajdf"
>>> reps = []
>>> for i in range(len(msg)-1):
...     if msg[i] == msg[i + 1]:
...         reps.append(msg[i])
...
>>> reps
['h', 'h', 'h', 'h', 'h', 'h', 'h', 'h']
And with another condition:
>>> reps=[]
>>> for i in range(len(msg)-1):
...     if msg[i] == msg[i + 1] or msg[i] == msg[i - 1]:
...         reps.append(msg[i])
...
>>> reps
['h', 'h', 'h', 'h', 'h', 'h', 'h', 'h', 'h']
For the groupby answer I alluded to on @Kasra's excellent response:
from itertools import groupby
msg = "kalhfdlakdhhhhhhhhhfklajdf"
maxcount = 0
# groupby yields (character, iterator over that run of equal characters)
for lett, group in groupby(msg):
    count = len(list(group))
    if count > maxcount:
        maxcountlett = lett
        maxcount = count
result = [maxcountlett] * maxcount
But note that this only works for substrings of length 1. msg = 'hahahaha' should give ['ha', 'ha', 'ha', 'ha'] by my understanding.
a) Think about what is happening when it makes the first match.
For example, given abcdeeef it sees that msg[4] matches msg[5]. It then goes and appends msg[4] to reps. Then msg[5] matches msg[6] and it appends msg[5] to reps. However, msg[6] does not match msg[7] so it does not append msg[6]. You are one short.
In order to fix this you need to append one extra for each string of matches. A good way to do this is to check whether the character you're currently matching already exists in reps. If it does, append it once. If it does not, append it twice.
if msg[i] == msg[i+1]:
    if msg[i] in reps:
        reps.append(msg[i])
    else:
        reps.append(msg[i])
        reps.append(msg[i])
b) You need to ensure that you do not exceed your boundaries. This can be accomplished by taking 1 off of your range.
for i in range(len(msg) - 1):
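Putting both points together, here is one possible sketch that uses only a for-loop and if-statements to find the longest run of a single repeated character. The function name longest_run is hypothetical, and, like the answers above, it only handles repeats of one character (not substrings such as 'ha'). It tracks the current run instead of appending to reps, which avoids both the off-by-one count and the out-of-bounds index.

```python
def longest_run(msg):
    """Return the longest run of one repeated character as a list."""
    if not msg:
        return []
    best_char, best_len = msg[0], 1
    run_char, run_len = msg[0], 1
    # Start at index 1 and look backwards, so no index ever goes out of bounds.
    for i in range(1, len(msg)):
        if msg[i] == run_char:
            run_len += 1
        else:
            run_char, run_len = msg[i], 1
        if run_len > best_len:
            best_char, best_len = run_char, run_len
    return [best_char] * best_len


print(longest_run("abcdeeef"))  # ['e', 'e', 'e']
```

When two runs tie for the longest, this sketch keeps the first one it encounters.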
