How to write in with its fundamental brute steps? - python

I was looking to solve the longest consecutive sequence question on Leetcode and this is the provided solution.
The question lies in the inner loop right after # how to rewrite this part without in?
class Solution:
    def longestConsecutive(self, nums):
        longest_streak = 0
        for num in nums:
            current_num = num
            current_streak = 1
            # how to rewrite this part without in?
            while current_num + 1 in nums:
                current_num += 1
                current_streak += 1
            longest_streak = max(longest_streak, current_streak)
        return longest_streak
I wrote a version of the inner loop that doesn't use in, like this:
while j < n and i != j:
    if nums[j] == currentSequenceNumber + 1:
        currentSequenceLength += 1
        currentSequenceNumber = nums[j]
    j += 1
I realized after running pdb that this approach only works for 2 consecutive numbers but not more. How could I rewrite my original portion to keep checking without using in? I have a feeling that in uses a similar approach to find when it comes to sequences. I have seen this link for find in strings, but it is not the brute-force approach that I would like to write out.
I think seeing how this can be rewritten would clarify why the time complexity is O(n^3), as the solution states. I currently can't understand why from their explanation.

Would write this as a comment but I lack rep (!). I think time complexity of in depends on the structure it is applied to, for example if it is done on a list it has to search every item in the list in series but for a set or dict then it can lookup by hash.
So I can't tell you how to write in unless I know what type nums is (a good use case for type hints here also). However:
list: average O(n)
set: average O(1) worst O(n)
BTW, sets are awesome and a great reason to use Python. Oftentimes people use lists all over the place and wonder why their code doesn't scale...
EDIT: due to stupid rules I still can't comment, so leaving this here. No, I'm not going to write a for, if, break loop for you, it's absolutely trivial.
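For readers who do want to see the loop spelled out, here is a minimal sketch, assuming nums is a plain list. The helper name contains and the function name longest_consecutive are hypothetical, chosen only for illustration. It shows that a list membership test is itself a full scan, so the brute-force solution ends up with three nested loops, which is where a cubic time bound comes from.

```python
def contains(values, target):
    """Linear search: roughly what `target in values` does for a list."""
    for value in values:
        if value == target:
            return True
    return False


def longest_consecutive(nums):
    """Brute-force longest run of consecutive integers, without `in`."""
    longest_streak = 0
    for num in nums:                                # loop 1: every start value
        current_num = num
        current_streak = 1
        while contains(nums, current_num + 1):      # loop 2: extend the streak
            current_num += 1                        # (loop 3 is inside contains)
            current_streak += 1
        longest_streak = max(longest_streak, current_streak)
    return longest_streak


print(longest_consecutive([100, 4, 200, 1, 3, 2]))
```

Unlike the attempt in the question, the scan restarts from the beginning of the list for every new value being searched, which is why it keeps working past two consecutive numbers.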

Related

How to make this leetcode method more efficient, without using count variable or perhaps another way?

This is for leetcode problem: https://leetcode.com/problems/majority-element
There is something wrong with the way I create solutions, and I'm not sure how to stop doing it. Basically, the problem is that I always create a count variable. Here it is called greatest_count. For the if statement, I create a conditional, which I think is fine, but I feel like I don't need the additional greatest_count variable here, and I'm not sure of a better way to write it. I always seem to think I need to count it and check it against the previous counts. Do I need this? How can I write this without needing the count variable, or without using greatest_unique? Any ways to optimize this would be great to know.
Problem area:
if unique_count > greatest_count:
    greatest_count = unique_count
    greatest_unique = i
Here is the full code:
class Solution:
    def majorityElement(self, nums):
        unique_nums = set(nums)
        greatest_unique = 0
        greatest_count = 0
        for i in unique_nums:
            unique_count = nums.count(i)
            if unique_count > greatest_count:
                greatest_count = unique_count
                greatest_unique = i
        return greatest_unique
Thank you
In order to get this to work in O(n) time and O(1) space, you would need a different approach. For example, find the majority bit for each of the 32 bits of the numbers and build the answer from the collected bits that are present in more than half the numbers:
def majorityElement(nums):
    m = 0
    for b in range(32):                       # go through all 32 bits
        c = sum(n&(2**b)!=0 for n in nums)    # count numbers with bit b set
        if c>len(nums)//2: m |= 2**b          # more than half, keep that bit
    return m if m<2**31 else m-2**32          # turn Python's int to 32 bit signed
majorityElement([3,2,3])  # 3
majorityElement([-3,-3,1,1,1,-3,-3]) # -3
This is O(n) (linear) time because it runs through the list a fixed number of times. It is O(1) space because it does not use memory proportionally to the size of the list.
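A different technique that is often mentioned for this problem, and not the one used in the answer above, is Boyer-Moore majority voting. The sketch below assumes, as the LeetCode problem states, that a majority element always exists; the function name majority_element is hypothetical. It still keeps a counter, but it needs no set, no nums.count call, and only one pass.

```python
def majority_element(nums):
    """Boyer-Moore voting; assumes some value fills more than half of nums."""
    candidate = None
    count = 0
    for n in nums:
        if count == 0:
            candidate = n          # adopt a new candidate when votes cancel out
        count += 1 if n == candidate else -1
    return candidate


print(majority_element([3, 2, 3]))
print(majority_element([-3, -3, 1, 1, 1, -3, -3]))
```

If the input might not contain a majority element, the returned candidate would need a second pass to verify it.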

Space complexity of dictionary python

It's a LeetCode question:
https://leetcode.com/problems/find-the-duplicate-number/
It says:
You must not modify the array (assume the array is read only).
You must use only constant, O(1) extra space.
Your runtime complexity should be less than O(n^2).
There is only one duplicate number in the array, but it could be repeated more than once.
In my code I am creating a dictionary using collections.Counter in Python.
How does my code satisfy the line "You must use only constant, O(1) extra space.", and what do they mean by it? Are they talking about space complexity? Below is my code, which passes all test cases.
from collections import Counter
from typing import List  # needed for the List[int] annotation below
class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        dict1=Counter(nums)
        for i in dict1:
            if(dict1[i]>1):
                return(i)
Please help. Thanks in advance.
Generally, a dictionary always has a space complexity of O(N), because its size depends on the number of elements in your array.
A space complexity of O(1) means that you use the same amount of extra memory (for example, the same number of pointers) regardless of the array size. For instance, if you use a single boolean variable in your search algorithm to get your duplicate, this would imply O(1).
Side note:
Another thing is the runtime complexity, which for a dictionary lookup is O(1) on average, since dictionaries are based on hash tables where you only need a key to get the value. In contrast, finding a particular value in a list is O(N), since in the worst case you have to iterate over all the elements.
Dictionaries take O(n) space, so your solution takes O(n) space and violates the O(1) space requirement.
This is an old LeetCode problem, from when LeetCode's focus was on job interviews, where such requirements can come up and be discussed (and used to be discussed in LeetCode's forum). It was never enforced by the LeetCode system; that's why your solution gets accepted despite violating the requirement. By now LeetCode is competition-focused and has become just like any other coding challenges site: it only matters whether you get your solution accepted, not how. They still don't (can't?) enforce such space requirements, and I think their new questions don't ask for something like that anymore. I miss the old days.
Your main question has been answered already. For this problem, we'd binary search:
class Solution:
    def findDuplicate(self, nums):
        lo, hi = 0, len(nums) - 1
        mid = (lo + hi) // 2
        while hi - lo > 1:
            count = 0
            for num in nums:
                if mid < num <= hi:
                    count += 1
            if count > hi - mid:
                lo = mid
            else:
                hi = mid
            mid = (lo + hi) // 2
        return hi
References
For additional details, you can see the Discussion Board. There are plenty of accepted solutions with a variety of languages and explanations, efficient algorithms, as well as asymptotic time/space complexity analysis in there.
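Another approach that is commonly cited for this problem, and not taken from the answers above, is Floyd's cycle detection. It is sketched here only to show what "O(1) extra space" can look like in practice: two integer variables, no dictionary, and no modification of the array. It assumes the problem's guarantee that nums holds n + 1 values in the range 1..n; the function name find_duplicate is hypothetical.

```python
def find_duplicate(nums):
    """Treat i -> nums[i] as a linked list; the duplicate is the cycle entry."""
    # Phase 1: move a slow and a fast pointer until they meet inside the cycle.
    slow = nums[0]
    fast = nums[nums[0]]
    while slow != fast:
        slow = nums[slow]
        fast = nums[nums[fast]]
    # Phase 2: restart one pointer from index 0; they meet at the cycle entry.
    slow = 0
    while slow != fast:
        slow = nums[slow]
        fast = nums[fast]
    return slow


print(find_duplicate([1, 3, 4, 2, 2]))
```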

Python : Counting execution of recursive call

I am using Euler problems to test my understanding as I learn Python 3.x. After I cobble together a working solution to each problem, I find the posted solutions very illuminating and I can "absorb" new ideas after I have struggled myself.
I am working on Euler 024 and I am trying a recursive approach. Now, in no way do I believe my approach is the most efficient or most elegant; however, I successfully generate a full set of permutations, increasing in value (because I start with a sorted tuple), which is one of the outputs I want. In addition, in order to find the millionth in the list (which is the other output I want, but can't yet get), I am trying to count how many there are each time I create a permutation, and that's where I get stuck. In other words, what I want to do is count the number of recursive calls each time I reach the base case, i.e. a completed permutation, not the total number of recursive calls.
I have found on StackOverflow some very clear examples of counting the number of executions of recursive calls, but I am having no luck applying the idea to my code. Essentially, my problems in my attempts so far are about "passing back" the count of the "completed" permutation using a return statement. I think I need to do that because of the way my for loop creates the "stem" and "tail" tuples. At a high level, either I can't get the counter to increment (so it always comes out as "1" or "5") or the "nested return" just terminates the code after the first permutation is found, depending on where I place the return. Can anyone help insert the counting into my code?
First the "counting" code I found in SO that I am trying to use:
def recur(n, count=0):
    if n == 0:
        return "Finished count %s" % count
    return recur(n-1, count+1)
print(recur(15))
Next is my permutation code with no counting in it. I have tried lots of approaches, but none of them work. So the following has no "counting" in it, just a comment at which point in the code I believe the counter needs to be incremented.
#
# euler 024 : Lexicographic permutations
#
import time
startTime= time.time()
#
def splitList(listStem,listTail):
    for idx in range(0,len(listTail)):
        tempStem =((listStem) + (listTail[idx],))
        tempTail = ((listTail[:idx]) + (listTail[1+idx:]))
        splitList(tempStem,tempTail)
    if len(listTail) ==0:
        #
        # I want to increment counter only when I am here
        #
        print("listStem=",listStem,"listTail=",listTail)
#
inStem = ()
#inTail = ("0","1","2","3","4","5","6","7","8","9")
inTail = ("0","1","2","3")
testStem = ("0","1")
testTail = ("2","3","4","5")
splitList(inStem,inTail)
#
print('Code execution duration : ',time.time() - startTime,' seconds')
Thanks in advance,
Clive
Since it seems you've understood the basic problem but just want to count the completed permutations, all you need to do is pass along a mutable counter that every recursive call shares. You can add a 3rd argument to your function (a one-element list) and increment it each time the base case is reached:
def splitList(listStem, listTail, count):
    for idx in range(0,len(listTail)):
        ...
        splitList(tempStem, tempTail, count)
    if len(listTail) == 0:
        count[0] += 1
        print('Count:', count)
        ...
Now, call this function like this (same as before, plus the counter):
splitList(inStem, inTail, [0])
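To connect the counter to the "find the millionth" goal, here is one possible sketch, not the answerer's code. The function name split_list and the extra target parameter are hypothetical additions. The shared one-element list is incremented only at the base case, and the matching permutation is passed back up through the recursion with return, stopping the search early.

```python
def split_list(stem, tail, count, target):
    """Return the target-th completed permutation (1-based), or None."""
    if not tail:
        count[0] += 1                     # one more completed permutation
        return stem if count[0] == target else None
    for idx in range(len(tail)):
        found = split_list(stem + (tail[idx],),
                           tail[:idx] + tail[idx + 1:],
                           count, target)
        if found is not None:             # pass the hit back up and stop early
            return found
    return None


digits = ("0", "1", "2", "3")
print("".join(split_list((), digits, [0], 10)))
```

With the full ten-digit tuple and target set to 1000000, the same call would walk the permutations in increasing order until it reaches the millionth one.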
Why don't you write a generator for this?
Then you can just stop on the nth item ("drop while i < n").
My solution uses itertools, but you can use your own permutations generator: just yield the next sequence member instead of printing it.
from itertools import permutations as perm, dropwhile as dw
print(''.join(dw(
    lambda x: x[0]<1000000,
    enumerate(perm('0123456789'),1)
).__next__()[1]))
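The "use your own permutations generator" suggestion could look roughly like the sketch below. The names permutations_in_order and nth_permutation are hypothetical, and itertools.islice is used here in place of dropwhile to skip ahead to the nth item.

```python
from itertools import islice


def permutations_in_order(stem, tail):
    """Yield completed permutations instead of printing them."""
    if not tail:
        yield stem
        return
    for idx in range(len(tail)):
        yield from permutations_in_order(stem + (tail[idx],),
                                         tail[:idx] + tail[idx + 1:])


def nth_permutation(symbols, n):
    """Return the n-th permutation (1-based) of the sorted symbols as a string."""
    gen = permutations_in_order((), tuple(symbols))
    return "".join(next(islice(gen, n - 1, None)))


print(nth_permutation("0123", 10))
```

Because the generator is lazy, nothing past the requested permutation is ever built.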

Python IndentationError - How to refactor?

I am doing a Project Euler question for programming practice in order to teach myself. I know perfectly well how to do the question mathematically, as well as how to do it programmatically.
However, I have come up with some insane code to do it: 100 nested loops. Python hilariously raises this error, probably rightfully so, on 100 levels of indentation:
IndentationError: too many levels of indentation
tally = 0
ceiling = 100
for integer_1 in range(0, 100, 1):
    for integer_2 in range(0, 100 - integer_1, 2):
        for integer_3 in range(0, 100 - integer_1 - integer_2, 3):
            for integer_4 ....
                for integer_5 ....
                    etc.
                        etc.
                            all the way to integer_100
I have looked through google for solutions but this issue is so rare it has almost no literature on the subject and I could only find this other stack overflow question ( Python IndentationError: too many levels of indentation ) which I could not find much useful in for my question.
My question is - is there a way to take my solution and find some workaround or refactor it in a way that has it work? I am truly stumped.
EDIT:
Thanks to nneonneo's answer, I was able to solve the question. My code is here just for future reference of people looking for ways to properly refactor their code.
from time import time
t = time()
count_rec_dict = {}
# for finding ways to sum to 100
def count_rec(cursum, level):
    global count_rec_dict
    # 99 is the last integer that we could be using,
    # so prevent the algorithm from going further.
    if level == 99:
        if cursum == 100:
            return 1
        else:
            return 0
    res = 0
    for i in xrange(0, 101-cursum, level+1):
        # fetch branch value from the dictionary
        if (cursum+i, level+1) in count_rec_dict:
            res += count_rec_dict[(cursum+i, level+1)]
        # add branch value to the dictionary
        else:
            count_rec_dict[(cursum+i, level+1)] = count_rec(cursum+i, level+1)
            res += count_rec_dict[(cursum+i, level+1)]
    return res
print count_rec(0, 0)
print time() - t
which runs in an astonishing 0.041 seconds on my computer. WOW!!!!! I learned some new things today!
A recursive solution should do nicely, though I'm certain there is an entirely different solution to the problem that doesn't require this kind of manipulation.
def count_rec(cursum, level):
    if level == 100:
        return 1
    res = 0
    for i in xrange(0, 100-cursum, level+1):
        res += count_rec(cursum+i, level+1)
    return res
print count_rec(0, 0)
Interestingly enough, if you memoize this function, it will actually have a reasonable running time (such is the power of dynamic programming). Have fun!
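As a rough illustration of the memoization idea, here is a Python 3 sketch that uses functools.lru_cache instead of a hand-written dictionary. It follows the termination check from the asker's edited version (the sum must hit the target exactly) rather than the shorter function above. The wrapper name count_partitions and its parameters target and max_part are hypothetical.

```python
from functools import lru_cache


def count_partitions(target, max_part):
    """Count ways to write target as a sum of parts drawn from 1..max_part."""

    @lru_cache(maxsize=None)
    def count_rec(cursum, level):
        if level == max_part:              # every part size has been considered
            return 1 if cursum == target else 0
        part = level + 1                   # this level adds multiples of `part`
        return sum(count_rec(cursum + i, level + 1)
                   for i in range(0, target - cursum + 1, part))

    return count_rec(0, 0)


print(count_partitions(5, 4))
```

Each (cursum, level) pair is computed once and then served from the cache, so the number of distinct calls stays around target times max_part instead of growing exponentially.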
One way to avoid the indentation error is to put the loops in separate functions, each one nested only one level deep.
Alternatively, you could use recursion to call a function over and over again, each time with a smaller range and higher nesting level.
That being said, your algorithm will have an impossibly long running time no matter how you code it. You need a better algorithm :-)
To do this using exactly your algorithm (restricting each next number to one that can possibly fit in the required sum), you really do need recursion - but the true brute force method can be a one-liner:
sum(sum(i) == 100 for i in itertools.product(xrange(100), repeat=100))
Naturally, this will be a fair bit slower than a true refactoring of your algorithm (in fact, as mentioned in the comments, it turns out to be intractable).
The most effective solution is based on the idea of arithmetic carrying.
You have lists of maximum values and steps,
and also a list of current values. For each time you want to update those 100 variables, you do this:
inc_index = -1
currentvalue[inc_index] += stepval[inc_index]
# I use >= rather than > here to replicate range()s behaviour that range(0,100) generates numbers from 0 to 99.
while currentvalue[inc_index] >= maxval[inc_index]:
    currentvalue[inc_index] = 0
    inc_index -= 1
    currentvalue[inc_index] += stepval[inc_index]
# now regenerate maxes for all subordinate indices
while inc_index < -1:
    maxval[inc_index + 1] = 100 - sum (currentvalue[:inc_index])
    inc_index += 1
When an IndexError is raised, you've finished looping (run out of 'digits' to carry into.)
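The carrying idea can also be packaged as a generator. The sketch below is one possible reading of it, not the answerer's exact code: it recomputes each position's upper bound on the fly instead of keeping a separate list of maxima, and it stops with a plain return instead of relying on IndexError. The name nested_ranges and its parameters total and steps are hypothetical.

```python
def nested_ranges(total, steps):
    """Emulate nested loops of the form
    for v_k in range(0, total - sum(v_0 .. v_{k-1}), steps[k])
    with a single flat loop, odometer style."""
    if total <= 0:
        return
    n = len(steps)
    current = [0] * n
    while True:
        yield tuple(current)
        idx = n - 1                        # start with the innermost "digit"
        while idx >= 0:
            current[idx] += steps[idx]
            if current[idx] < total - sum(current[:idx]):
                break                      # still within this loop's range
            current[idx] = 0               # overflow: reset and carry outward
            idx -= 1
        if idx < 0:                        # carried past the outermost loop
            return


print(sum(1 for _ in nested_ranges(5, (1, 2, 3))))
```

This only removes the indentation problem; as noted above, the number of combinations for 100 levels still makes the brute-force enumeration impractical.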

creating a hash-based sorting algorithm

For experimental and learning purposes, I was trying to create a sorting algorithm from a hash function that gives a value based on the alphabetical sequence of the string; it would then ideally place the string in the right position from that hash. I tried looking for a hash-based sorting function but only found one for integers, and it would be a memory hog if adapted for my purposes.
The reasoning is that, theoretically, if done right this algorithm can achieve O(n) speeds or nearly so.
So here is what I have worked out in Python so far:
letters = {'a':0,'b':1,'c':2,'d':3,'e':4,'f':5,'g':6,'h':7,'i':8,'j':9,
           'k':10,'l':11,'m':12,'n':13,'o':14,'p':15,'q':16,'r':17,
           's':18,'t':19,'u':20,'v':21,'w':22,'x':23,'y':24,'z':25,
           'A':0,'B':1,'C':2,'D':3,'E':4,'F':5,'G':6,'H':7,'I':8,'J':9,
           'K':10,'L':11,'M':12,'N':13,'O':14,'P':15,'Q':16,'R':17,
           'S':18,'T':19,'U':20,'V':21,'W':22,'X':23,'Y':24,'Z':25}
def sortlist(listToSort):
    listLen = len(listToSort)
    newlist = []
    for i in listToSort:
        k = letters[i[0]]
        for j in i[1:]:
            k = (k*26) + letters[j]
        norm = k/pow(26,len(i)) # get a float hash that is normalized(i think thats what it is called)
        # 2nd part
        idx = int(norm*len(newlist)) # get a general of where it should go
        if newlist: #find the right place from idx
            if norm < newlist[idx][1]:
                while norm < newlist[idx][1] and idx > 0: idx -= 1
                if norm > newlist[idx][1]: idx += 1
            else:
                while norm > newlist[idx][1] and idx < (len(newlist)-1): idx += 1
                if norm > newlist[idx][1]: idx += 1
        newlist.insert(idx,[i,norm])# put it in the right place with the "norm" to ref later when sorting
    return newlist
I think that the 1st part is good, but the 2nd part needs help. So the questions would be: what would be the best way to do something like this, and is it even possible to get O(n) time (or near that) out of this?
The testing I did with an 88,000-word list took probably about 5 min; 10,000 words took about 30 sec. It got a lot worse as the list count went up.
If this idea actually works out, then I would recode it in C to get some real speed and optimizations.
The 2nd part is there only because it works, even if slowly, and I can't think of a better way to do it for the life of me. I would like to replace it with something that would not have to do the other loops, if at all possible.
Thanks for any advice or ideas that you could give.
On sorting in O(n): you can't do it generally for all inputs, period. It is simply, fundamentally, mathematically impossible.
Here's the nice, short information-theoretic proof of impossibility for comparison-based sorting: to sort, you have to be able to distinguish among the n! possible orderings of the input; to do so, you have to get log2(n!) bits of data; to do that, you need at least log2(n!) comparisons, which is Ω(n log n). Any sorting algorithm that claims to run in O(n) is either running on specialized data (e.g. data with a fixed number of bits), or is not correct.
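As an illustration of the "specialized data" case, here is a minimal counting sort sketch. It assumes the inputs are non-negative integers no larger than a known max_value; both the function name counting_sort and that parameter are hypothetical. It never compares two elements with each other, which is how it sidesteps the bound above, at the price of memory proportional to the value range.

```python
def counting_sort(values, max_value):
    """Sort integers in 0..max_value in O(n + max_value) time, no comparisons."""
    counts = [0] * (max_value + 1)
    for v in values:
        counts[v] += 1                     # tally how often each value occurs
    result = []
    for v, c in enumerate(counts):
        result.extend([v] * c)             # emit each value as often as it was seen
    return result


print(counting_sort([3, 1, 2, 3, 0], 3))
```

Applied to strings of arbitrary length, as in the question, the range of possible keys grows exponentially with the string length, which is the memory problem the asker ran into.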
Implementing a sorting algorithm is a good learning exercise, but you may want to stick to existing algorithms until you are comfortable with the concepts and methods commonly employed. It might be rather frustrating otherwise if the algorithm doesn't work.
Have fun learning!
P.S. Python's built-in timsort algorithm is really good on a lot of real-world data. So, if you need a general sorting algorithm for production code, you can usually rely on .sort/sorted to be fast enough for your needs. (And, if you can understand timsort, you'll do better than 90% of the Python-wielding population :)
