How to make this code execute faster? - python

def lookfor(alist, number):
    if number in alist:
        return alist.index(number)
    else:
        return "no"
So basically I input hundreds of thousands of numbers, and I have to send each one of them through "lookfor" to get either the index of "number" in "alist" or "no" if the number isn't there.
It works fine when I input only a few numbers, but it takes several minutes when I input xx,xxx-xxx,xxx numbers.
Any suggestions?

Your code iterates through the list until it finds the number you seek (or until it reaches the end), and if it does find the number, it has to iterate the exact same amount to return the index. Why not take advantage of the behavior of the .index method? Just keep in mind that it raises a ValueError if the number is not present in the list.
def lookfor(alist, number):
    try:
        return alist.index(number)
    except ValueError:
        return "no"
Afterword: use the timeit module to find the most efficient solution, but be sure to use a variety of inputs so that you can find the overall fastest solution.
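A minimal sketch of such a comparison, assuming a list of 10,000 integers and three kinds of input (a hit near the start, a hit near the end, and a miss). The names lookfor_in and lookfor_try are made up here to tell the two versions apart, and the timings will vary by machine:

```python
import timeit

def lookfor_in(alist, number):
    # Original version: membership test, then a second scan in index().
    if number in alist:
        return alist.index(number)
    return "no"

def lookfor_try(alist, number):
    # try/except version: a single scan.
    try:
        return alist.index(number)
    except ValueError:
        return "no"

alist = list(range(10_000))  # assumed test data

for label, number in [("near start", 5), ("near end", 9_995), ("missing", -1)]:
    t_in = timeit.timeit(lambda: lookfor_in(alist, number), number=100)
    t_try = timeit.timeit(lambda: lookfor_try(alist, number), number=100)
    print(f"{label:>10}: in+index {t_in:.4f}s, try/except {t_try:.4f}s")
```

Both versions are still linear in the length of the list, so a comparison like this mostly shows how much the second scan costs for each kind of input.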

def index_on(lst):
    index = {val:i for i,val in enumerate(lst)}
    def lookup(val):
        return index.get(val, 'no')
    return lookup
search = index_on(alist)
search('123-4567') # => 293 (index in alist)
search('123-4500') # => 'no' (not found)

Your code currently needs to search through the entire list for each call to lookfor. This can be very slow if alist is big enough.
Instead, you should create a dictionary that maps each element to its index in alist. For example, for alist = [7,4,88], you'd have: indexmap = {7:0, 4:1, 88:2}. Then you can search the dictionary with:
def lookfor(indexmap, number):
    return indexmap.get(number, "no")
If alist is constant, you can create indexmap during initialization:
indexmap = {number: index for index,number in enumerate(alist)}
If alist changes over time, you can maintain this dictionary together with alist. For example, if you normally add items with append, you can use:
alist.append(number)
if number not in indexmap:
    indexmap[number] = len(alist) - 1
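One detail to watch if alist can contain duplicates: a plain dict comprehension keeps the last index of each value, while alist.index returns the first. A minimal sketch that keeps the first index instead (build_index is a name made up for this example):

```python
def build_index(alist):
    """Map each value to the index of its FIRST occurrence in alist."""
    indexmap = {}
    for index, number in enumerate(alist):
        # setdefault only stores the index if the value has not been seen yet.
        indexmap.setdefault(number, index)
    return indexmap

def lookfor(indexmap, number):
    return indexmap.get(number, "no")

indexmap = build_index([7, 4, 7, 88])
print(lookfor(indexmap, 7))   # first occurrence, same as [7, 4, 7, 88].index(7)
print(lookfor(indexmap, 5))   # not present
```

Building the dictionary is a single pass over the list, and each lookup afterwards takes constant time on average.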

Related

Execution Timed Out on codewars when finding the least used element in an array. Needs to be optimized but dont know how

Instructions on codewars:
There is an array with some numbers. All numbers are equal except for one. Try to find it!
find_uniq([ 1, 1, 1, 2, 1, 1 ]) == 2
find_uniq([ 0, 0, 0.55, 0, 0 ]) == 0.55
It’s guaranteed that array contains at least 3 numbers.
The tests contain some very huge arrays, so think about performance.
This is the code I wrote:
def find_uniq(arr):
    for n in arr:
        if arr.count(n) == 1:
            return n
            exit()
It works as follows:
For every character in the array, if that character appears only once, it returns said character and exits the code. If the character appears more than once, it does nothing
When attempting the code on codewars, I get the following error:
STDERR
Execution Timed Out (12000 ms)
I am a beginner so I have no idea how to further optimize the code in order for it to not time out
The first version of my code looked like this:
def find_uniq(arr):
    arr.sort()
    rep = str(arr)
    for character in arr:
        cantidad = arr.count(character)
        if cantidad > 1:
            rep = rep.replace(str(character), "")
    rep = rep.replace("[", "")
    rep = rep.replace("]", "")
    rep = rep.replace(",", "")
    rep = rep.replace(" ", "")
    rep = float(rep)
    n = rep
    return n
After getting timed out, I assumed it was due to the repetitive replace functions and the fact that the code had to go through every element even if it had already found the correct one, since the code was deleting the incorrect ones, instead of just returning the correct one
After some iterations that I didn't save we got to the current code, which checks if the character is only once in the array, returns that and exits
def find_uniq(arr):
    for n in arr:
        if arr.count(n) == 1:
            return n
            exit()
I have no clue how to further optimize this
.count() iterates over the entire array every time that you call it. If your array has n elements, it will iterate over the array n times, which is quite slow.
You can use collections.Counter as Unmitigated suggests, but if you're not familiar with the module, it might seem overkill for this problem. Since in this case you know that there's only two unique elements in the array, you can get all of the unique elements using set(), and then check the frequency of each unique element:
def find_uniq(arr):
    for n in set(arr):
        if arr.count(n) == 1:
            return n
You can use a dict or collections.Counter to get the frequency of each element with linear time complexity. Then return the element with a frequency of one.
def find_uniq(l):
    from collections import Counter
    return Counter(l).most_common()[-1][0]
Compare the first two numbers. If they match, find the one in the array that doesn't match (longest solution). Otherwise, return the one that doesn't match the third. Coded:
def find_uniq(arr):
    if arr[0]==arr[1]:
        target=arr[0]
        for i in range(2,len(arr)):
            if arr[i] != target:
                return arr[i]
    else:
        if arr[0]==arr[2]:
            return arr[1]
        else:
            return arr[0]
In your original code:
def find_uniq(arr):
    for n in arr:
        if arr.count(n) == 1:
            return n
            exit() # note: this line does nothing because you already returned
you're calling arr.count once for each element in the array (assuming the worst case scenario where the unique element is at the very end). Each call to arr.count(n) scans through the entire array counting up n -- so you're iterating over the entire array of N elements N times, which makes this O(N^2) -- very slow if N is big!
The second version of your code has the same problem, but it adds a huge amount of extra complexity by turning the list into a string and then trying to parse the string -- don't do that!
The way to make this fast is to iterate over the entire list once and keep track of the count of each item as you go. This is easiest to do with the built in collections.Counter class:
from collections import Counter
def find_uniq(arr):
    return next(i for i, c in Counter(arr).items() if c == 1)
Given the constraint that there are only two different values in the array and exactly one of them is unique, you can make this more efficient (such that you don't even need to iterate over the entire array in all cases) by breaking it into two possibilities: either the first two items are identical and you just need to look for the item that's not equal to those, or they're different and you just need to return the one that's not equal to the third.
def find_uniq(arr):
    if arr[0] == arr[1]:
        # First two items are the same, iterate through
        # the rest of the array to find the unique one.
        return next(i for i in arr if i != arr[0])
    # Otherwise, either arr[0] or arr[1] is unique.
    return arr[0] if arr[1] == arr[2] else arr[1]
In this approach, you only ever need to iterate through the array as far as the unique item (or exactly one item past it in the case where it's one of the first two items). In the specific case where the unique item is toward the start of a very long array, this will be much faster than an approach that iterates over the entire array no matter what. In the worst case scenario, you will still have only a single iteration.

Find the number from the set that is not present in the array

I was given this question in an interview: You are given a set of numbers {1..N} and an array A[N-1]. Find the number from the set that is not present in the array. Below is the code and pseudocode I have so far, that doesn't work.
I am assuming that there is one (and only one) number in the set that isn’t in the array
loop through each element in the set
    loop through each element in the array O(n)
        check to see if the number is in the array
            if it is, do nothing
            else, early return the number
def findMissingNo(arr, s):
    for num in s: #loop through each element in the set
        for num2 in arr: ##loop through each element in the array O(n)
            if (num == num2): #if the number in the set is in the array, break
                break
            print (num)
            return num #if the number in the set is not in the array, early return the number
    return -1 #return -1 if there is no missing element
s1 = {1,4,5}
arr1 = [1,4]
findMissingNo(arr1, s1)
By definition, we have a set of the numbers from 1 to N and an array of size N-1 that contains numbers from 1 to N with one number missing, and we have to find that number.
Since only one number is missing, the set has N elements and the array has N-1 elements. So the array is a subset of the set, lacking only the missing element. That means
all_number_of_set = all_number_of_array + missing_number
also
sum_of_all_number_of_set = sum_of_array_number + missing_number
which implies
missing_number = sum_of_all_number_of_set - sum_of_array_number
pseudo code
def findMissingNo(set_, arr_ ):
    return sum(set_) - sum(arr_)
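If the set really is exactly {1..N}, as in the interview statement, its sum has the closed form N(N+1)/2, so the set does not need to be passed in at all. A minimal sketch under that assumption (find_missing is a name made up for this example; it would not work for an arbitrary set such as {1,4,5}):

```python
def find_missing(arr):
    """Return the one number from 1..N missing from arr, where len(arr) == N - 1."""
    n = len(arr) + 1              # the full set has one more element than the array
    expected = n * (n + 1) // 2   # sum of 1..N
    return expected - sum(arr)

print(find_missing([1, 2, 4, 5]))
```

This makes one pass over the array and uses constant extra memory.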
If I understood your question correctly, you are looking for an efficient way to find the number in the set that does not exist in the list. I see you are using a nested loop, which is O(n^2). I would suggest building a dict from the list, which is O(n), and then looping over the set, O(n), doing an O(1) dictionary lookup for each element. In code:
def findMissingNo(arr_list, s_list):
    d = dict()
    for el in arr_list:
        d.update({el: el})
    for s in s_list:
        try:
            d[s]
            pass
        except KeyError:
            return s
    return -1
s1 = {1,4,5}
arr1 = [1,4]
findMissingNo(arr1, s1)
Hope it helps:)
Your function is quadratic, because it has to check the whole list for each item in the set.
It's important that you don't iterate over the set. Yes, that can work, but you're showing that you don't know the time complexity advantages that you can get from a set or dict in python (or hashtables in general). But you can't iterate over the list either, because the missing item is ... missing. So you won't find it there.
Instead, you build a set from the list, and use the difference function. Or better, symmetric_difference (^) see https://docs.python.org/3.8/library/stdtypes.html#set
def findMissingNo(arr, s):
    d = set(arr) ^ s # symmetric difference
    if 1 == len(d):
        for item in d:
            return item
print (findMissingNo([1,4], {1,4,5}))
5
I took a few shortcuts because I knew we wanted one item, and I knew which container it was supposed to be in. I decided to return None if no item was found, but I didn't check for multiple items.
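For comparison, a minimal sketch of the plain difference variant mentioned above, assuming the array only contains values that are also in the set. The name find_missing_no is made up here, and like the version above it returns None when nothing is missing:

```python
def find_missing_no(arr, s):
    """Return an element of set s that is not in arr, or None if there is none."""
    missing = s - set(arr)            # elements of s that are absent from arr
    return next(iter(missing), None)  # take one element, defaulting to None

print(find_missing_no([1, 4], {1, 4, 5}))
```

Unlike the symmetric difference, the plain difference ignores values that appear only in the array.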
What about something like:
def findMissingNo(arr, s):
    for num in s: # loop through each element in the set
        if num in arr:
            pass
        else:
            return num # if the number in the set is not in the array, early return the number
    return -1 # return -1 if there is no missing element

sub-sum from a list without loops

So I'm studying recursion and have to write some code using no loops.
For a part of my code I want to check if I can sum up a subset of a list to a specific number, and if so return the indexes of those numbers in the list.
For example, if the list is [5,40,20,20,20] and I send it with the number 60, I want my output to be [1,2] since 40+20=60.
In case I can't get to the number, the output should be an empty list.
I started with
def find_sum(num,lst,i,sub_lst_sum,index_lst):
    if num == sub_lst_sum:
        return index_lst
    if i == len(lst): ## finished going over the list without getting to the sum
        return []
    if sub_lst_sum+lst[i] > num:
        return find_sum(num,lst,i+1,sub_lst_sum,index_lst)
    return ?..
index_lst = find_sum(60,[5,40,20,20,20],0,0,[])
num is the number I want to sum up to,
lst is the list of numbers.
The last return should cover both options: counting the current number in the list and not counting it (otherwise, in the example, it will take the five and there will be no solution).
I'm not sure how to do this.
Here's a hint. Perhaps the simplest way to go about it is to consider the following inductive reasoning to guide your recursion.
If
index_list = find_sum(num,lst,i+1)
Then
index_list = find_sum(num,lst,i)
That is, if a list of indices can be used to construct a sum num using elements from position i+1 onwards, then it is also a solution when using elements from position i onwards. That much should be clear. The second piece of inductive reasoning is,
If
index_list = find_sum(num-lst[i],lst,i+1)
Then
[i]+index_list = find_sum(num,lst,i)
That is, if a list of indices can be used to return a sum num-lst[i] using elements from position i+1 onwards, then you can use it to build a list of indices whose respective elements sum to num by prepending i.
These two bits of inductive reasoning can be translated into two recursive calls to solve the problem. Also the first one I wrote should be used for the second recursive call and not the first (question: why?).
Also, you might want to rethink using an empty list for the base case where there is no solution. That can work, but you're returning as a solution a list that is not a solution. In Python I think None would be the standard idiomatic choice (but you might want to double-check that with someone more well-versed in Python than me).
Fill in the blanks
def find_sum(num,lst,i):
    if num == 0 :
        return []
    elif i == len(lst) :
        return None
    else :
        ixs = find_sum(???,lst,i+1)
        if ixs is not None :
            return ???
        else :
            return find_sum(???,lst,i+1)
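For readers who want to check their answer, here is one possible way to fill in the blanks. It is only a sketch of the two inductive steps described above, with i given a default of 0 for convenience (that default is an addition, not part of the original template):

```python
def find_sum(num, lst, i=0):
    """Return indices (from position i on) whose elements sum to num, or None."""
    if num == 0:
        return []            # nothing left to reach: the empty selection works
    if i == len(lst):
        return None          # ran out of elements without reaching num
    # First try including lst[i]: the rest must then sum to num - lst[i].
    ixs = find_sum(num - lst[i], lst, i + 1)
    if ixs is not None:
        return [i] + ixs
    # Otherwise skip lst[i] and look for the full sum further along.
    return find_sum(num, lst, i + 1)

print(find_sum(60, [5, 40, 20, 20, 20]))
```

Because each element is either taken or skipped, the worst case explores on the order of 2^n combinations, which is fine for short lists like the one in the question.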

Delete elements of an integer recursively

My parameter, n is a phone number as an integer.
Using recursion I want to return the first three numbers in the integer.
I've turned the integer into a list of individual number characters and I'm attempting to delete the last number over and over again until I'm left with the last three, but I'm stuck on how to repeat it.
def areaCodes(n):
    n = str(n)
    n = list(n)
    del n[-1]
    #n = reduce(opperator.add, n)
    n = ''.join(n)
    n = int(n)
    return n
I know I'm supposed to repeat the name in the return somehow, but because n isn't an integer that I can use to repeat. What do I do?
How about something like this?
def areaCodes(n):
    # if n is less than 1000, what does it mean about the number of digits?
    if n < 1000:
        return # fill...
    # otherwise, if n is greater than 1000, how can we alter n to remove the last
    # digit? (hint: there's an operation similar to division called f...r division)
    return areaCodes( # operate on n somehow...)
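One possible completion of that outline, as a sketch that assumes n is a non-negative integer with no leading zeros (area_code is a name chosen here so it does not clash with the original function):

```python
def area_code(n):
    """Return the first three digits of a non-negative integer n."""
    if n < 1000:
        # Three digits or fewer: nothing left to strip.
        return n
    # Floor division by 10 drops the last digit.
    return area_code(n // 10)

print(area_code(5551234567))
```

This stays with integers throughout, so no conversion to a string or list is needed.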
I assume that this is an exercise where recursion is necessary. If so, try this (there are better ways to accomplish your end goal, but I tried to modify your existing code as little as possible):
def areaCodes(n):
    n_lst = list(str(n))
    del n_lst[-1]
    n_str = ''.join(n_lst)
    n_int = int(n_str)
    if len(n_lst) > 3:
        return areaCodes(n_int)
    return n_int
This will call the function again if the length of the number is greater than three, and return the number otherwise. Basically, the only part you were missing in your original function was the following, which is the recursive part:
    if len(n_lst) > 3:
        return areaCodes(n_int)
Remember that for a function to be recursive, it will have two main attributes:
1. It will at some point call itself. (this is what makes it 'repeat')
2. It will have some stopping condition (or base case).
You mentioned #1 when you wrote that you're supposed to use "the name in the return," so that's great! You just need to write that in your code:
return areaCodes(n), Where n is the updated phone number with a digit removed.
As you can see, each recursive call should do some work towards the solution, and should pass its mini-solution to the next recursive call.
Along with #2 above, you need to specify a base case, where the recursion will cease. So, since you're taking away a digit each time you call your function, you should include some kind of check to see if the current input is the length you want yet.
If it is the right length, you're done, and you should return the current number (not another recursive call).
Otherwise, you aren't done with the recursion yet.
def areaCodes(n):
    # Create a list of digit characters
    myList = list(str(n))
    # Delete the last element
    del myList[-1]
    # Join the remaining characters into a string
    myListStr = ''.join(myList)
    # Convert the string back to an int
    myListInt = int(myListStr)
    # Check whether more than three digits remain
    if len(myList) > 3:
        # Recursively call the function again
        return areaCodes(myListInt)
    # Return the result when the recursion completes
    return myListInt
n = 12345
print(areaCodes(n))

Count the number of occurrences of a given item in a (sorted) list?

I'm asked to create a method that returns the number of occurrences of a given item in a list. I know how to write code to find a specific item, but how can I code it so that it counts the number of occurrences of an arbitrary item?
For example, if I have a list [4, 6, 4, 3, 6, 4, 9] and I type something like
s1.count(4), it should return 3, or s1.count(6) should return 2.
I'm not allowed to use any built-in functions though.
In a recent assignment, I was asked to count the number of occurrences that sub string "ou" appeared in a given string, and I coded it
def count_pattern(astr):
    if len(astr) < 2:
        return 0
    else:
        return (astr[:2] == "ou") + count_pattern(astr[1:])
Would something like this work??
def count(self, item):
    num=0
    for i in self.s_list:
        if i in self.s_list:
            num[i] +=1
def __str__(self):
    return str(self.s_list)
If this list is already sorted, the "most efficient" method -- in terms of Big-O -- would be to perform a binary search with a count-forward/count-backward if the value was found.
However, for an unsorted list as in the example, then the only way to count the occurrences is to go through each item in turn (or sort it first ;-). Here is some pseudo-code, note that it is simpler than the code presented in the original post (there is no if x in list or count[x]):
set count to 0
for each element in the list:
    if the element is what we are looking for:
        add one to count
Happy coding.
If I told you to count the number of fours in the following list, how would you do it?
1 4 2 4 3 8 2 1 4 2 4 9 7 4
You would start by remembering no fours yet, and add 1 for each element that equals 4. To traverse a list, you can use a for statement. Given an element of the list el, you can check whether it is four like this:
if el == 4:
    # TODO: Add 1 to the counter here
In response to your edit:
You're currently testing if i in self.s_list:, which doesn't make any sense since i is an element of the list and therefore always present in it.
When adding to a number, you simply write num += 1. Brackets are only necessary if you want to access the values of a list or dictionary.
Also, don't forget to return num at the end of the function so that somebody calling it gets the result back.
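Putting those three fixes together, the method might look like the sketch below. The class name SortedList and its constructor are assumptions made only so the example runs on its own; the question shows just the two methods and the s_list attribute:

```python
class SortedList:  # hypothetical class name, not given in the question
    def __init__(self, items):
        self.s_list = items

    def count(self, item):
        num = 0
        for element in self.s_list:
            if element == item:   # compare against the item we are counting
                num += 1          # num is a plain number, so no brackets
        return num                # hand the result back to the caller

    def __str__(self):
        return str(self.s_list)

s1 = SortedList([4, 6, 4, 3, 6, 4, 9])
print(s1.count(4), s1.count(6))
```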
Actually the most efficient method in terms of Big-O would be O(log n). @pst's method would result in O(log n + s), where s is the number of occurrences, which could become linear if the array is made up of equal elements.
The way to achieve O(log n) would be to use 2 binary searches (which gives O(2log n), but we discard constants, so it is still O(log n)) that are modified to not have an equality test, therefore making all searches unsuccessful. However, on an unsuccessful search (low > high) we return low.
In the first search, if the middle is greater than your search term, recurse into the lower part of the array, else recurse into the higher part. In the second search, reverse the binary comparison.
The first search yields the right boundary of the equal element and the second search yields the left boundary. Simply subtract to get the amount of occurrences.
Based on algorithm described in Skiena.
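A minimal iterative sketch of the two-boundary idea, assuming the list is sorted in ascending order. The names count_sorted and boundary are made up for this example, and it uses loops rather than recursion, so it illustrates the technique rather than reproducing the book's code:

```python
def count_sorted(items, target):
    """Count occurrences of target in an ascending sorted list in O(log n)."""

    def boundary(strict):
        # Binary search with no equality test: it always "fails" and
        # returns low, the index where the search window collapsed.
        low, high = 0, len(items) - 1
        while low <= high:
            mid = (low + high) // 2
            if strict:
                go_left = items[mid] > target    # finds the right boundary
            else:
                go_left = items[mid] >= target   # finds the left boundary
            if go_left:
                high = mid - 1
            else:
                low = mid + 1
        return low

    # Right boundary: first index holding a value greater than target.
    # Left boundary: first index holding a value not less than target.
    return boundary(True) - boundary(False)

print(count_sorted([3, 4, 4, 4, 6, 6, 9], 4))
```

The standard library's bisect module offers bisect_left and bisect_right for the same two boundaries, which is the usual choice when built-ins are allowed.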
This seems like a homework... anyways. Try list.count(item). That should do the job.
Third or fourth element here:
http://docs.python.org/tutorial/datastructures.html
Edit:
try something else like:
bukket = dict()
for elem in astr:
    if elem not in bukket.keys():
        bukket[elem] = 1
    else:
        bukket[elem] += 1
You can now get all the distinct elements of the list with bukket.keys() and the corresponding number of occurrences with bukket[key].
So you can test it:
import random
l = []
for i in range(0,200):
    l.append(random.randint(0,20))
print(l)
l.sort()
print(l)
bukket = dict()
for elem in l:
    if elem not in bukket.keys():
        bukket[elem] = 1
    else:
        bukket[elem] += 1
print(bukket)
