finding duplicates function

finding duplicates function - python

I need help incorporating a find_duplicates function in a program. It needs to traverse the list checking every element to see if it matches the target. Keep track of how many matches are found and print out a sentence summary.
Below is the code that I have written thus far:
myList = [69, 1 , 99, 82, 17]
#Selection Sort Function
def selection_sort_rev(aList):
totalcomparisions = 0
totalexchanges = 0
p=0
print("Original List:" , aList)
print()
n = len(aList)
for end in range(n, 1, -1):
comparisions = 0
exchanges = 1
p= p + 1
#Determine Largest Value
max_position = 0
for i in range(1, end):
comparisions = comparisions + 1
if aList[i] < aList[max_position]:
max_position = i
#Passes and Exchanges
exchnages = exchanges + 1
temp = aList [end - 1]
aList [end - 1] = aList [max_position]
aList [max_position] = temp
print ("Pass", p,":", "Comparsions:", comparisions, "\tExchanges:" , exchanges)
print("\t", aList, "\n")
totalcomparisions = totalcomparisions + comparisions
totalexchanges = totalexchanges + exchanges
print("\tTotal Comparisons:", totalcomparisions, "\tTotal Exchanges:", totalexchanges)
return
The find_duplicates function has to have:
Input paramaters: List, target (value to find)
Search for the target in the list and count how many times it occurs
Return value: None
I'm sure there are more efficient ways of doing this but I am a beginner to programming and I would like to know how to do this in the most simple way possible. PLEASE HELP!!!

import collections
def find_duplicates(L, target):
i_got_this_from_stackoverflow = collections.Counter(L)
if target not in i_got_this_from_stackoverflow:
print "target not found"
else:
print i_got_this_from_stackoverflow[target], "occurrences of the target were found"

If you are looking to improve your programming skills this is a great discussion. If you just want the answer to your problem, use the list.count() method, which was created to answer precisely that question:
>>> myList = [69, 1 , 99, 82, 17, 1, 82]
>>> myList.count(82)
2
Also, your test code does not contain any duplicates in the list. This is a necessary test case, but you also need to try your code on at least one list with duplicates to test that is works when they are present.
When you are learning and starting to achieve mastery you can save yourself an awful lot of time by learning how the various components work. If this is an assignment, however, keep on truckin'!

Related

Timeout Issue in Looping Through Hackerrank Example

can anyone explain why my code for a hacker rank example is timing out. I'm new to whole idea of efficiency of code based on processing time. The code seems to work on small sets, but once I start testing cases using large datasets it times out. I've provided a brief explanation of the method and its purpose for context. But if you could provide any tips if you notice functions I'm using that might consume a large amount of runtime that would be great.
Complete the migratoryBirds function below.
Params: arr: an array of tallies of species of birds sighted by index.
For example. arr = [Type1 = 1, Type2 = 4, Type3 = 4, Type4 = 4, Type5 = 5, Type6 = 3]
Return the lowest type of the the mode of sightings. In this case 4 sightings is the
mode. Type2 is the lowest type that has the mode. So return integer 2.
def migratoryBirds(arr):
# list of counts of occurrences of birds types with the same
# number of sightings
bird_count_mode = []
for i in range(1, len(arr) + 1):
occurr_count = arr.count(i)
bird_count_mode.append(occurr_count)
most_common_count = max(bird_count_mode)
common_count_index = bird_count_mode.index(most_common_count) + 1
# Find the first occurrence of that common_count_index in arr
# lowest_type_bird = arr.index(common_count_index) + 1
# Expect Input: [1,4,4,4,5,3]
# Expect Output: [1 0 1 3 1 0], 3, 4
return bird_count_mode, most_common_count, common_count_index
P.S. Thank you for the edit Chris Charley. I just tried to edit it at the same time

Use collections.Counter() to create a dictionary that maps species to their counts. Get the maximum count from this, then get all the species with that count. Then search the list for the first element of one of those species.
import collections
def migratoryBirds(arr):
species_counts = collections.Counter(arr)
most_common_count = max(species_counts.values())
most_common_species = {species for species, count in species_counts if count = most_common_count}
for i, species in arr:
if species in most_common_species:
return i

Function which eliminates specific elements from a list in an efficient way

I want to create a function (without using libraries) which takes as input three integer numbers (>0) (a,b,c) , for example:
a = 6
b = 6
c = 3
and returns a list containing c elements (so in this case the returned list should contain 3 elements) taken from a list of a numbers (so in this case the initial list is [1,2,3,4,5,6]). The c elements of the returned list have to be the ones that managed to remain in the initial list after removing a number every b positions from the list of a elements until len(return_list) = c.
So for a = 6, b = 6 and c = 3 the function should do something like this:
1) initial_list = [1,2,3,4,5,6]
2) first_change = [2,3,4,5,6] #the number after the 6th (**b**) is 1 because after the last element you keep counting returning to the first one, so 1 is canceled
3) second_change = [2,4,5,6] #you don't start to count positions from the start but from the first number after the eliminated one, so new number after the 6th is 3 and so 3 is canceled
4) third_change = [2,4,5] #now the number of elements in the list is equal to **c**
Notice that if, when counting, you end up finishing the elements from the list, you keep the counting and return to the first element of the list.
I made this function:
def findNumbers(a,b,c):
count = 0
dictionary = {}
countdown = a
for x in range(1,a+1):
dictionary[x] = True
while countdown > c:
for key,value in dictionary.items():
if value == True:
count += 1
if count == b+1:
dictionary[key] = False
count = 0
countdown -= 1
return [key for key in dictionary.keys() if dictionary[key] == True]
It works in some cases, like the above example. But it doesn't work everytime.
For example:
findNumbers(1000,1,5)
returns:
[209, 465, 721, 977] #wrong result
instead of:
[209, 465, 721, 849, 977] #right result
and for bigger numbers, like:
findNumbers(100000, 200000, 5)
it takes too much time to even do its job, I don't know if the problem is the inefficiency of my algorithm or because there's something in the code that gives problems to Python. I would like to know a different approach to this situation which could be both more efficient and able to work in every situation. Can anyone give me some hints/ideas?
Thank you in advance for your time. And let me know if you need more explanations and/or examples.

You can keep track of the index of the last list item deleted like this:
def findNumbers(a, b, c):
l = list(range(1, a + 1))
i = 0
for n in range(a - c):
i = (i + b) % (a - n)
l.pop(i)
return l
so that findNumbers(6, 6, 3) returns:
[2, 4, 5]
and findNumbers(1000, 1, 5) returns:
[209, 465, 721, 849, 977]
and findNumbers(100000, 200000, 5) returns:
[10153, 38628, 65057, 66893, 89103]

I thought I could be recursive about the problem, so I wrote this:
def func(a,b,c):
d = [i+1 for i in range(a)]
def sub(d,b,c):
if c == 0: return d
else:
k = b % len(d)
d.pop(k)
d = d[k:] + d[:k]
return sub(d,b,c-1)
return sub(d,b,a-c)
so that func(6,6,3) returns: [2, 4, 5] successfully and func(1000,1,5) returns: [209, 465, 721, 849, 977] unfortunately with an error.
It turns out that for values of a > 995, the below flag is raised:
RecursionError: maximum recursion depth exceeded while calling a Python object
There was no need to try func(100000,200000,5) - lesson learnt.
Still, rather than dispose of the code, I decided to share it. It could serve as a recursive thinking precautionary.

More on dynamic programming

Two weeks ago I posted THIS question here about dynamic programming. User Andrea Corbellini answered precisely what I wanted, but I wanted to take the problem one more step further.
This is my function
def Opt(n):
if len(n) == 1:
return 0
else:
return sum(n) + min(Opt(n[:i]) + Opt(n[i:])
for i in range(1, len(n)))
Let's say you would call
Opt( [ 1,2,3,4,5 ] )
The previous question solved the problem of computing the optimal value. Now,
instead of the computing the optimum value 33 for the above example, I want to print the way we got to the most optimal solution (path to the optimal solution). So, I want to print the indices where the list got cut/divided to get to the optimal solution in the form of a list. So, the answer to the above example would be :
[ 3,2,1,4 ] ( Cut the pole/list at third marker/index, then after second index, then after first index and lastly at fourth index).
That is the answer should be in the form of a list. The first element of the list will be the index where the first cut/division of the list should happen in the optimal path. The second element will be the second cut/division of the list and so on.
There can also be a different solution:
[ 3,4,2,1 ]
They both would still lead you to the correct output. So, it doesn't matter which one you printed. But, I have no idea how to trace and print the optimal path taken by the Dynamic Programming solution.
By the way, I figured out a non-recursive solution to that problem that was solved in my previous question. But, I still can't figure out to print the path for the optimal solution. Here is the non-recursive code for the previous question, it might be helpful to solve the current problem.
def Opt(numbers):
prefix = [0]
for i in range(1,len(numbers)+1):
prefix.append(prefix[i-1]+numbers[i-1])
results = [[]]
for i in range(0,len(numbers)):
results[0].append(0)
for i in range(1,len(numbers)):
results.append([])
for j in range(0,len(numbers)):
results[i].append([])
for i in range(2,len(numbers)+1): # for all lenghts (of by 1)
for j in range(0,len(numbers)-i+1): # for all beginning
results[i-1][j] = results[0][j]+results[i-2][j+1]+prefix[j+i]-prefix[j]
for k in range(1,i-1): # for all splits
if results[k][j]+results[i-2-k][j+k+1]+prefix[j+i]-prefix[j] < results[i-1][j]:
results[i-1][j] = results[k][j]+results[i-2-k][j+k+1]+prefix[j+i]-prefix[j]
return results[len(numbers)-1][0]

Here is one way of printing the selected :
I used the recursive solution using memoization provided by #Andrea Corbellini in your previous question. This is shown below:
cache = {}
def Opt(n):
# tuple objects are hashable and can be put in the cache.
n = tuple(n)
if n in cache:
return cache[n]
if len(n) == 1:
result = 0
else:
result = sum(n) + min(Opt(n[:i]) + Opt(n[i:])
for i in range(1, len(n)))
cache[n] = result
return result
Now, we have the cache values for all the tuples including the selected ones.
Using this, we can print the selected tuples as shown below:
selectedList = []
def printSelected (n, low):
if len(n) == 1:
# No need to print because it's
# already printed at previous recursion level.
return
minVal = math.Inf
minTupleLeft = ()
minTupleRight = ()
splitI = 0
for i in range(1, len(n)):
tuple1ToI = tuple (n[:i])
tupleiToN = tuple (n[i:])
if (cache[tuple1ToI] + cache[tupleiToN]) < minVal:
minVal = cache[tuple1ToI] + cache[tupleiToN]
minTupleLeft = tuple1ToI
minTupleRight = tupleiToN
splitI = low + i
print minTupleLeft, minTupleRight, minVal
print splitI # OP just wants the split index 'i'.
selectedList.append(splitI) # or add to the list as requested by OP
printSelected (list(minTupleLeft), low)
printSelected (list(minTupleRight), splitI)
You call the above method like shown below:
printSelected (n, 0)

Longest common prefix using buffer?

If I have an input string and an array:
s = "to_be_or_not_to_be"
pos = [15, 2, 8]
I am trying to find the longest common prefix between the consecutive elements of the array pos referencing the original s. I am trying to get the following output:
longest = [3,1]
The way I obtained this is by computing the longest common prefix of the following pairs:
s[15:] which is _be and s[2:] which is _be_or_not_to_be giving 3 ( _be )
s[2:] which is _be_or_not_to_be and s[8:] which is _not_to_be giving 1 ( _ )
However, if s is huge, I don't want to create multiple copies when I do something like s[x:]. After hours of searching, I found the function buffer that maintains only one copy of the input string but I wasn't sure what is the most efficient way to utilize it here in this context. Any suggestions on how to achieve this?

Here is a method without buffer which doesn't copy, as it only looks at one character at a time:
from itertools import islice, izip
s = "to_be_or_not_to_be"
pos = [15, 2, 8]
length = len(s)
for start1, start2 in izip(pos, islice(pos, 1, None)):
pref = 0
for pos1, pos2 in izip(xrange(start1, length), xrange(start2, length)):
if s[pos1] == s[pos2]:
pref += 1
else:
break
print pref
# prints 3 1
I use islice, izip, and xrange in case you're talking about potentially very long strings.
I also couldn't resist this "One Liner" which doesn't even require any indexing:
[next((i for i, (a, b) in
enumerate(izip(islice(s, start1, None), islice(s, start2, None)))
if a != b),
length - max((start1, start2)))
for start1, start2 in izip(pos, islice(pos, 1, None))]
One final method, using os.path.commonprefix:
[len(commonprefix((buffer(s, n), buffer(s, m)))) for n, m in zip(pos, pos[1:])]

>>> import os
>>> os.path.commonprefix([s[i:] for i in pos])
'_'
Let Python to manage memory for you. Don't optimize prematurely.
To get the exact output you could do (as #agf suggested):
print [len(commonprefix([buffer(s, i) for i in adj_indexes]))
for adj_indexes in zip(pos, pos[1:])]
# -> [3, 1]

I think your worrying about copies is unfounded. See below:
>>> s = "how long is a piece of string...?"
>>> t = s[12:]
>>> print t
a piece of string...?
>>> id(t[0])
23295440
>>> id(s[12])
23295440
>>> id(t[2:20]) == id(s[14:32])
True
Unless you're copying the slices and leaving references to the copies hanging around, I wouldn't think it could cause any problem.
edit: There are technical details with string interning and stuff that I'm not really clear on myself. But I'm sure that a string slice is not always a copy:
>>> x = 'google.com'
>>> y = x[:]
>>> x is y
True
I guess the answer I'm trying to give is to just let python manage its memory itself, to begin with, you can look at memory buffers and views later if needed. And if this is already a real problem occurring for you, update your question with details of what the actual problem is.

One way of doing using buffer this is give below. However, there could be much faster ways.
s = "to_be_or_not_to_be"
pos = [15, 2, 8]
lcp = []
length = len(pos) - 1
for index in range(0, length):
pre = buffer(s, pos[index])
cur = buffer(s, pos[index+1], pos[index+1]+len(pre))
count = 0
shorter, longer = min(pre, cur), max(pre, cur)
for i, c in enumerate(shorter):
if c != longer[i]:
break
else:
count += 1
lcp.append(count)
print
print lcp

counting odd numbers in a list python

This is a part of my homework assignment and im close to the final answer but not quite yet. I need to write a function that counts odd numbers in a list.
Create a recursive function count_odd(l) which takes as its only argument a list of integers. The function will return a count of the number of list elements that are odd, i.e., not evenly divisible by 2.\
>>> print count_odd([])
0
>>> print count_odd([1, 3, 5])
3
>>> print count_odd([2, 4, 6])
0
>>> print count_odd([0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144])
8
Here is what i have so far:
#- recursive function count_odd -#
def count_odd(l):
"""returns a count of the odd integers in l.
PRE: l is a list of integers.
POST: l is unchanged."""
count_odd=0
while count_odd<len(l):
if l[count_odd]%2==0:
count_odd=count_odd
else:
l[count_odd]%2!=0
count_odd=count_odd+1
return count_odd
#- test harness
print count_odd([])
print count_odd([1, 3, 5])
print count_odd([2, 4, 6])
print count_odd([0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144])
Can u help explain what im missing. The first two test harness works fine but i cant get the final two. Thanks!

Since this is homework, consider this pseudo-code that just counts a list:
function count (LIST)
if LIST has more items
// recursive case.
// Add one for the current item we are counting,
// and call count() again to process the *remaining* items.
remaining = everything in LIST except the first item
return 1 + count(remaining)
else
// base case -- what "ends" the recursion
// If an item is removed each time, the list will eventually be empty.
return 0
This is very similar to what the homework is asking for, but it needs to be translate to Python and you must work out the correct recursive case logic.
Happy coding.

def count_odd(L):
return (L[0]%2) + count_odd(L[1:]) if L else 0

Are slices ok? Doesn't feel recursive to me, but I guess the whole thing is kind of against usual idioms (i.e. - recursion of this sort in Python):
def countOdd(l):
if l == list(): return 0 # base case, empty list means we're done
return l[0] % 2 + countOdd(l[1:]) # add 1 (or don't) depending on odd/even of element 0. recurse on the rest
x%2 is 1 for odds, 0 for evens. If you are uncomfortable with it or just don't understand it, use the following in place of the last line above:
thisElement = l[0]
restOfList = l[1:]
if thisElement % 2 == 0: currentElementOdd = 0
else: currentElementOdd = 1
return currentElementOdd + countOdd(restOfList)
PS - this is pretty recursive, see what your teacher says if you turn this in =P
>>> def countOdd(l):
... return fold(lambda x,y: x+(y&1),l,0)
...
>>> def fold(f,l,a):
... if l == list(): return a
... return fold(f,l[1:],f(a,l[0]))

All of the prior answers are subdividing the problem into subproblems of size 1 and size n-1. Several people noted that the recursive stack might easily blow out. This solution should keep the recursive stack size at O(log n):
def count_odd(series):
l = len(series) >> 1
if l < 1:
return series[0] & 1 if series else 0
else:
return count_odd(series[:l]) + count_odd(series[l:])

The goal of recursion is to divide the problem into smaller pieces, and apply the solution to the smaller pieces. In this case, we can check if the first number of the list (l[0]) is odd, then call the function again (this is the "recursion") with the rest of the list (l[1:]), adding our current result to the result of the recursion.

def count_odd(series):
if not series:
return 0
else:
left, right = series[0], series[1:]
return count_odd(right) + (1 if (left & 1) else 0)

Tail recursion
def count_odd(integers):
def iter_(lst, count):
return iter_(rest(lst), count + is_odd(first(lst))) if lst else count
return iter_(integers, 0)
def is_odd(integer):
"""Whether the `integer` is odd."""
return integer % 2 != 0 # or `return integer & 1`
def first(lst):
"""Get the first element from the `lst` list.
Return `None` if there are no elements.
"""
return lst[0] if lst else None
def rest(lst):
"""Return `lst` list without the first element."""
return lst[1:]
There is no tail-call optimization in Python, so the above version is purely educational.
The call could be visualize as:
count_odd([1,2,3]) # returns
iter_([1,2,3], 0) # could be replaced by; depth=1
iter_([2,3], 0 + is_odd(1)) if [1,2,3] else 0 # `bool([1,2,3])` is True in Python
iter_([2,3], 0 + True) # `True == 1` in Python
iter_([2,3], 1) # depth=2
iter_([3], 1 + is_odd(2)) if [2,3] else 1
iter_([3], 1 + False) # `False == 0` in Python
iter_([3], 1) # depth=3
iter_([], 1 + is_odd(3)) if [3] else 1
iter_([], 2) # depth=4
iter_(rest([]), 2 + is_odd(first([])) if [] else 2 # bool([]) is False in Python
2 # the answer
Simple trampolining
To avoid 'max recursion depth exceeded' errors for large arrays all tail calls in recursive functions can be wrapped in lambda: expressions; and special trampoline() function can be used to unwrap such expressions. It effectively converts recursion into iterating over a simple loop:
import functools
def trampoline(function):
"""Resolve delayed calls."""
#functools.wraps(function)
def wrapper(*args):
f = function(*args)
while callable(f):
f = f()
return f
return wrapper
def iter_(lst, count):
#NOTE: added `lambda:` before the tail call
return (lambda:iter_(rest(lst), count+is_odd(first(lst)))) if lst else count
#trampoline
def count_odd(integers):
return iter_(integers, 0)
Example:
count_odd([1,2,3])
iter_([1,2,3], 0) # returns callable
lambda:iter_(rest(lst), count+is_odd(first(lst))) # f = f()
iter_([2,3], 0+is_odd(1)) # returns callable
lambda:iter_(rest(lst), count+is_odd(first(lst))) # f = f()
iter_([3], 1+is_odd(2)) # returns callable
lambda:iter_(rest(lst), count+is_odd(first(lst))) # f = f()
iter_([], 1+is_odd(3))
2 # callable(2) is False

I would write it like this:
def countOddNumbers(numbers):
sum = 0
for num in numbers:
if num%2!=0:
sum += numbers.count(num)
return sum

not sure if i got your question , but as above something similar:
def countOddNumbers(numbers):
count=0
for i in numbers:
if i%2!=0:
count+=1
return count

Generator can give quick result in one line code:
sum((x%2 for x in nums))

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

finding duplicates function - python

import collections def find_duplicates(L, target): i_got_this_from_stackoverflow = collections.Counter(L) if target not in i_got_this_from_stackoverflow: print "target not found" else: print i_got_this_from_stackoverflow[target], "occurrences of the target were found"

Related

Timeout Issue in Looping Through Hackerrank Example

Function which eliminates specific elements from a list in an efficient way

More on dynamic programming

Longest common prefix using buffer?

counting odd numbers in a list python

Categories

Resources