Counting number of list entries that occur 1 time - python

I'm trying to write a Python function that counts the number of entries in a list that occur exactly once.
For example, given the list [17], this function would return 1. Or given [3,3,-22,1,-22,1,3,0], it would return 1.
Restriction: I cannot import anything into my program.
The incorrect code that I've written so far: I'm going the double-loop route, but the index math is getting over-complicated.
def count_unique(x):
    if len(x) == 1:
        return 1
    i = 0
    j = 1
    for i in range(len(x)):
        for j in range(j,len(x)):
            if x[i] == x[j]:
                del x[j]
                j+1
        j = 0
    return len(x)

Since you apparently can't use collections.Counter or sorted/itertools.groupby (one of which would usually be my go-to solution, depending on whether the inputs are hashable or sortable), just simulate roughly what a Counter does: count every element, then count how many elements appeared exactly once:
def count_unique(x):
    if len(x) <= 1:
        return len(x)
    counts = {}
    for val in x:
        counts[val] = counts.get(val, 0) + 1
    return sum(1 for count in counts.values() if count == 1)

lst = [3,3,-22,1,-22,1,3,0]
# In Python 3, filter and map return iterators, so wrap them in list() before calling len
len(list(filter(lambda z: z[0] == 1,
                map(lambda x: (len(list(filter(lambda y: y == x, lst))), x), lst))))
sorry :)
Your solution doesn't work because you are doing something weird: you delete things from a list while iterating over it, and j+1 on its own does nothing because the result is discarded. Try adding elements that are found to be unique to a new list and then counting the number of things in it. Then figure out what my solution does.
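The suggestion above (collect the entries that turn out to be unique in a new list, then count them) could be sketched roughly like this. It is only one possible reading of that advice: the name count_unique_entries is made up for illustration, the approach is quadratic, and it stays within the no-imports restriction.

```python
def count_unique_entries(values):
    """Count entries that occur exactly once, without mutating `values`."""
    unique = []  # entries found to occur only once
    for index, value in enumerate(values):
        # Compare against every other position instead of deleting in place.
        others = values[:index] + values[index + 1:]
        if value not in others:
            unique.append(value)
    return len(unique)


print(count_unique_entries([3, 3, -22, 1, -22, 1, 3, 0]))  # prints 1
print(count_unique_entries([17]))  # prints 1
```

Because nothing is deleted during iteration, the index bookkeeping from the original attempt disappears entirely.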
Here is the O(n) solution btw:
lst = [3,3,-22,1,-22,1,3,0,37]
cnts = {}
for n in lst:
    if n in cnts:
        cnts[n] = cnts[n] + 1
    else:
        cnts[n] = 1
count = 0
for k, v in cnts.items():  # .iteritems() only exists in Python 2
    if v == 1:
        count += 1
print(count)

A simpler and more understandable solution:
l = [3, 3, -22, 1, -22, 1, 3, 0]
counter = 0
for el in l:
    if l.count(el) == 1:
        counter += 1
It's pretty simple. You iterate over the items of the list, check whether each element occurs exactly once in the list, and if so add 1 to the counter. You can improve the code (use list comprehensions, lambda expressions and so on), but this is the idea behind it all and the most understandable, imo.
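As a small sketch of the "improve the code" remark, the same loop can be condensed into a generator expression passed to sum. It is still quadratic, since list.count scans the whole list once per element.

```python
l = [3, 3, -22, 1, -22, 1, 3, 0]

# Emit a 1 for every element that occurs exactly once, then add them up.
counter = sum(1 for el in l if l.count(el) == 1)
print(counter)  # prints 1
```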

You are making this overly complicated. Try using a dictionary where the key is the element in your list. That way each distinct element appears as a key only once, and its value can hold the count.
To add to this: it is probably the best method in terms of complexity. An `in` lookup on a dictionary is considered O(1) and the for loop is O(n), so the total time complexity is O(n), which is desirable. Using count() on a list searches the whole list for every element, which is O(n^2). That's bad.
from collections import defaultdict
count_hash_table = defaultdict(int) # a dictionary whose missing keys default to int(), i.e. 0
elements = [3,3,-22,1,-22,1,3,0]
for element in elements:
    count_hash_table[element] += 1 # the default of 0 lets us add 1 without checking for the key first
print(sum(1 for c in count_hash_table.values() if c == 1))

There is a method on lists called count. From this you can go further, I guess.
For example:
for el in l:
    if l.count(el) > 1:
        continue
    else:
        print("found {0}".format(el))


Finding the count of how many elements of list A appear before than in the similar but mixed list B

A=[2,3,4,1] B=[1,2,3,4]
I need to find how many elements of list A appear at an earlier position than the same element does in list B. In this case those are the values 2, 3 and 4, so the expected return value would be 3.
def count(a, b):
    muuttuja = 0
    for i in range(0, len(a)-1):
        if a[i] != b[i] and a[i] not in b[:i]:
            muuttuja += 1
    return muuttuja
I have tried this kind of solution, but it is very slow on lists with a large number of values. I would appreciate suggestions for alternative ways of doing the same thing more efficiently. Thank you!
If both lists have unique elements, you can build a map from element (key) to index (value) using a Python dictionary. Since a dictionary lookup takes O(1) time on average, this code has a time complexity of O(n):
A=[2,3,4,1]
B=[1,2,3,4]
d = {}
count = 0
for i,ele in enumerate(A) :
    d[ele] = i
for i,ele in enumerate(B) :
    if i > d[ele] :
        count+=1
Use a set of already seen B-values.
def count(A, B):
    result = 0
    seen = set()
    for a, b in zip(A, B):
        seen.add(b)
        if a not in seen:
            result += 1
    return result
This only works if the values in your lists are hashable.
Your method is slow because it has a time complexity of O(N²): checking if an element exists in a list of length N is O(N), and you do this N times. We can do better by using up some more memory instead of time.
First, iterate over b and create a dictionary mapping the values to the first index that value occurs at:
b_map = {}
for index, value in enumerate(b):
    if value not in b_map:
        b_map[value] = index
b_map is now {1: 0, 2: 1, 3: 2, 4: 3}
Next, iterate over a, counting how many elements have an index less than that element's value in the dictionary we just created:
result = 0
for index, value in enumerate(a):
    if index < b_map.get(value, -1):
        result += 1
Which gives the expected result of 3.
b_map.get(value, -1) is used to protect against the situation when a value in a doesn't occur in b, and you don't want to count it towards the total: .get returns the default value of -1, which is guaranteed to be less than any index. If you do want to count it, you can replace the -1 with len(a).
The second snippet can be replaced by a single call to sum:
result = sum(index < b_map.get(value, -1)
             for index, value in enumerate(a))
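Put together, the two steps could be wrapped in one function along these lines. This is a sketch only; the function name count_earlier_in_a is an assumption made for illustration, and dict.setdefault is used as a shorthand for the "store only the first index" check.

```python
def count_earlier_in_a(a, b):
    """Count values that sit at a smaller index in `a` than in `b`."""
    # Step 1: map each value of b to the first index at which it occurs.
    b_map = {}
    for index, value in enumerate(b):
        b_map.setdefault(value, index)  # keeps the first index, ignores repeats

    # Step 2: True counts as 1 in sum; values missing from b get -1,
    # which is smaller than every index, so they are never counted.
    return sum(index < b_map.get(value, -1) for index, value in enumerate(a))


print(count_earlier_in_a([2, 3, 4, 1], [1, 2, 3, 4]))  # prints 3
```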
You can make a prefix-count of A, which is an array where for each index you keep track of the number of occurrences of each element before the index.
You can use this to efficiently look-up the prefix-counts when looping over B:
import collections
A=[2,3,4,1]
B=[1,2,3,4]
prefix_count = [collections.defaultdict(int) for _ in range(len(A))]
prefix_count[0][A[0]] += 1
for i, n in enumerate(A[1:], start=1):
    prefix_count[i] = collections.defaultdict(int, prefix_count[i-1])
    prefix_count[i][n] += 1
prefix_count_b = sum(prefix_count[i][n] for i, n in enumerate(B))
print(prefix_count_b)
This outputs 3.
This could still be O(N²), because initializing each entry of prefix_count copies the dictionary from the previous index. If someone knows a better way to do this, please let me know.
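One possible way around the copying, offered as a sketch rather than a vetted replacement: position i of B only ever consults the counts for A[0..i], so a single running dictionary updated in step with the loop should give the same sum in O(N). The function name prefix_count_sum is made up here, and the lists are assumed to have equal length.

```python
def prefix_count_sum(a, b):
    """Sum, over each index i, how often b[i] occurs in a[0..i]."""
    running = {}  # counts of the values of a seen so far
    total = 0
    for a_val, b_val in zip(a, b):
        # Extend the prefix by one element, then query it for b's value.
        running[a_val] = running.get(a_val, 0) + 1
        total += running.get(b_val, 0)
    return total


print(prefix_count_sum([2, 3, 4, 1], [1, 2, 3, 4]))  # prints 3
```

No per-index copies are needed because the queries arrive in the same order in which the prefix grows.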

Use list comprehensions to make a list of count of elements smaller than the element in an array

I was solving this leetcode problem - https://leetcode.com/problems/how-many-numbers-are-smaller-than-the-current-number/
I solved it easily using nested for loops, but list comprehensions have always intrigued me. I've spent a lot of time trying to make that one-liner work, but I always get some syntax error.
Here's the solution:
count = 0
ans = []
for i in nums:
    for j in nums:
        if i > j:
            count = count + 1
    ans.append(count)
    count = 0
return ans
These are the ones so far that I think should've worked:
return [count = count + 1 for i in nums for j in nums if i > j]
return [count for i in nums for j in nums if i > j count = count + 1]
return [count:= count + 1 for i in nums for j in nums if i > j]
I'd also be happy if there's some resource or similar that puts it together. I've been searching the Python docs but didn't find anything that helps.
I will transform the code step by step in order to show the thought process.
First: we don't care what the value of count is afterward, but we need it to be 0 at the start of each inner loop. So it is simpler logically to set it there, rather than outside and then also at the end of the inner loop:
ans = []
for i in nums:
    count = 0
    for j in nums:
        if i > j:
            count = count + 1
    ans.append(count)
return ans
Next, we focus on the contents of the loop:
count = 0
for j in nums:
    if i > j:
        count = count + 1
ans.append(count)
A list comprehension is not good at math; it is good at producing a sequence of values from a source sequence. The transformation we need to do here is to put the actual elements into our "counter" variable [1], and then figure out how many there are (in order to append to ans). Thus:
smaller = []
for j in nums:
    if i > j:
        smaller.append(j)
ans.append(len(smaller))
Now that the creation of smaller has the right form, we can replace it with a list comprehension, in a mechanical, rule-based way. It becomes:
smaller = [j for j in nums if i > j]
#          ^ ^^^^^^^^^^^^^ ^^^^^^^^
#          | \- the rest of the parts are in the same order
#          \- this moves from last to first
# and then we use it the same as before
ans.append(len(smaller))
We notice that we can just fold that into one line. The brackets have to stay, because len needs an actual list to measure [2]:
ans.append(len([j for j in nums if i > j]))
Good. Now, let's put that back in the original context:
ans = []
for i in nums:
    ans.append(len([j for j in nums if i > j]))
return ans
We notice that the same technique applies: we have the desired form already. So we repeat the procedure:
ans = [len([j for j in nums if i > j]) for i in nums]
return ans
And of course:
return [len([j for j in nums if i > j]) for i in nums]
[1] Another popular trick is to put a 1 in the output for each original element, and then sum them. It's about the same either way; last I checked the performance is about the same and I don't think either is clearer than the other.
[2] Dropping the brackets would produce a generator expression instead. Normally, these would be surrounded with () instead of [], but a special syntax rule lets you drop the extra pair of () when calling a function with a single argument that is a generator expression. This is especially convenient for the built-in functions sum, any, all, max, min and (if you don't need a custom sort order) sorted. It does not work for len, though: a generator has no length, so len(j for j in nums if i > j) raises a TypeError.
Hmm, three people write sum solutions but every single one does sum(1 for ...). I prefer this:
[sum(j < i for j in nums) for i in nums]
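The one-liner above works because bool is a subclass of int in Python, so each True produced by j < i adds 1 to the sum and each False adds 0. A short illustration, where the contents of nums are just an assumed sample input:

```python
nums = [8, 1, 2, 2, 3]  # sample input, chosen only for illustration

print(True + True)  # prints 2, since True behaves like the integer 1

# For each i, add up the truth values of "j < i" over all j in nums.
smaller_counts = [sum(j < i for j in nums) for i in nums]
print(smaller_counts)  # prints [4, 0, 1, 1, 3]
```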
Instead of trying to advance an external counter, try adding ones to your list and then sum it:
for example:
nums = [1,2,3,4,5]
target = 3
print(sum(1 for n in nums if n < target))
Using a counter inside the list comprehension creates the challenge of resetting its value on each iteration of the first loop.
This can be avoided by filtering, and summing, in the second loop:
You use the first loop to iterate over the values of the nums array.
return [SECOND_LOOP for i in nums]
You use the second loop to iterate over all elements of the nums array. You keep the elements that are smaller than i, the current element in the first loop, with if i > j, and evaluate 1 for each of them. Finally, you sum all the 1s generated:
sum(1 for j in nums if i > j)
You get the number of values that meet the requirements, by the list comprehension of the first loop:
return [sum(1 for j in nums if i > j) for i in nums]
This solution has been checked & validated in LeetCode.
You need a slightly different approach for the inner loop than a list comprehension. Instead of repeatedly appending a value to a list you need to repeatedly add a value to a variable.
This can be done in a functional way by using sum and a generator expression:
count = 0
# ...
for j in nums:
    if i > j:
        count = count + 1
can be replaced by
count = sum(1 for j in nums if i > j)
So that we now have this:
ans = []
for i in nums:
    count = sum(1 for j in nums if i > j)
    ans.append(count)
return ans
This pattern can in fact be replaced by a list comprehension:
return [sum(1 for j in nums if i > j) for i in nums]
Alternative Solution
We can also use the Counter from collections:
import collections

class Solution:
    def smallerNumbersThanCurrent(self, nums):
        count_map = collections.Counter(nums)
        smallers = []
        for index in range(len(nums)):
            count = 0
            for key, value in count_map.items():
                if key < nums[index]:
                    count += value
            smallers.append(count)
        return smallers

Code challenge: finding the divisible in a list

I am working on a code challenge. Simply put, the problem is:
Given a list L (maximum length on the order of 1000) containing positive integers, find the number of "Lucky Triples", i.e. triples where L[i] divides L[j] and L[j] divides L[k].
For example, [1,2,3,4,5,6] should give the answer 3 because of [1,2,4], [1,2,6] and [1,3,6].
My attempt:
Sort the list (say there are n elements).
Use 3 for loops: i, j, k (i from 1 to n-2), (j from i+1 to n-1), (k from j+1 to n).
The k loop is executed only if L[j] % L[i] == 0.
The algorithm seems to give the correct answer, but the challenge said that my code exceeded the time limit. I tried it on my computer for the list [1,2,3,...,2000] and got count = 40888 (I guess it is correct). It took around 5 seconds.
Is there any faster way to do that?
This is the code I have written in Python.
def answer(l):
    l.sort()
    cnt = 0
    if len(l) == 2:
        return cnt
    for i in range(len(l)-2):
        for j in range(1,len(l)-1-i):
            if (l[i+j]%l[i] == 0):
                for k in range(1,len(l)-j-i):
                    if (l[i+j+k]%l[i+j] == 0):
                        cnt += 1
    return cnt
You can use additional space to help yourself. After you sort the input list, build a map/dict where each key is an element of the list and each value is the list of other elements that the key divides. You would end up with something like this.
Assume the sorted list is [1,2,3,4,5,6]; your map would be
1 -> [2,3,4,5,6]
2 -> [4,6]
3 -> [6]
4 -> []
5 -> []
6 -> []
Now for every key in the map you find what it divides, and then you find what those values divide. For example, you know that
1 divides 2 and 2 divides 4 and 6; similarly 1 divides 3 and 3 divides 6.
The complexity of sorting should be O(n log n), and that of constructing the map should be better than O(n^2) (but I am not sure about this part). I am also not sure about the complexity of actually checking for multiples, but I think this should be much faster than a brute-force O(n^3).
If someone could help me figure out the time complexity of this, I would really appreciate it.
EDIT:
You can make the map creation faster by stepping by X (and not 1), where X is the number in the list you are currently on, since the list is sorted.
Thank you all for your suggestions. They are brilliant. But it seems that I either still can't pass the speed test or can't handle duplicated elements.
After discussing with my friend, I have come up with another solution. It should be O(n^2), and I passed the speed test. Thanks all!!
def answer(lst):
    lst.sort()
    count = 0
    if len(lst) == 2:
        return count
    # for each middle element, count the divisors at the front and the multiples at the back. Then multiply them.
    for i, middle in enumerate(lst[1:len(lst)-1], start = 1):
        countfirst = 0
        countthird = 0
        for first in (lst[0:i]):
            if middle % first == 0:
                countfirst += 1
        for third in (lst[i+1:]):
            if third % middle == 0:
                countthird += 1
        count += countfirst*countthird
    return count
I guess sorting the list is pretty inefficient. I would rather try to iteratively reduce the number of candidates. You could do that in two steps.
At first filter all numbers that do not have a divisor.
from itertools import combinations
candidates = [max(pair) for pair in combinations(l, 2) if max(pair)%min(pair) == 0]
After that, count the number of remaining candidates, that do have a divisor.
result = sum(max(pair)%min(pair) == 0 for pair in combinations(candidates, 2))
Your original code, for reference.
def answer(l):
    l.sort()
    cnt = 0
    if len(l) == 2:
        return cnt
    for i in range(len(l)-2):
        for j in range(1,len(l)-1-i):
            if (l[i+j]%l[i] == 0):
                for k in range(1,len(l)-j-i):
                    if (l[i+j+k]%l[i+j] == 0):
                        cnt += 1
    return cnt
There are a number of missteps here, and with just a few tweaks we can probably get this running much faster. Let's start:
def answer(lst):  # I prefer not to use `l` because it looks like `1`
    lst.sort()
    count = 0  # use whole words here. No reason not to.
    if len(lst) == 2:
        return count
    for i, first in enumerate(lst):
        # using `enumerate` here means you can avoid ugly ranges and
        # saves you from a look up on the list afterwards. Not really a
        # performance hit, but definitely looks and feels nicer.
        for j, second in enumerate(lst[i+1:], start=i+1):
            # this is the big savings. Since you sorted the list, lst[i]
            # can't divide a smaller element that comes before it, so there
            # is no need to look at positions up to i. Enumerating over
            # `lst[i+1:]` cuts out a lot of unnecessary burden.
            if second % first == 0:
                # see how using enumerate makes that look nicer?
                for third in lst[j+1:]:
                    if third % second == 0:
                        count += 1
    return count
I bet that on its own will pass your speed test, but if not, you can check for membership instead. In fact, using a set here is probably a great idea!
def answer2(lst):
    s = set(lst)
    limit = max(s)  # we'll never have a valid product higher than this
    multiples = {}  # accumulator for our mapping
    for n in sorted(s):
        max_prod = limit // n  # n * (max_prod+1) > limit
        multiples[n] = [n*k for k in range(2, max_prod+1) if n*k in s]
    # in [1,2,3,4,5,6]:
    # multiples = {1: [2, 3, 4, 5, 6],
    #              2: [4, 6],
    #              3: [6],
    #              4: [],
    #              5: [],
    #              6: []}
    # multiples is now a mapping you can use a Depth- or Breadth-first-search on
    triples = sum(1 for j in multiples
                  for k in multiples.get(j, [])
                  for l in multiples.get(k, []))
    # This basically just looks up each starting value as j, then grabs
    # each valid multiple and assigns it to k, then grabs each valid
    # multiple of k and assigns it to l. For every possible combination there,
    # it adds 1 more to the result of `triples`
    return triples
I'll give you just an idea; the implementation should be up to you:
1. Initialize the global counter to zero.
2. Sort the list, starting with the smallest number.
3. Create a list of integers (one entry per number, with the same index).
4. Iterate through each number (index i), and do the following:
   a. Check for divisors at positions 0 to i-1.
   b. Store the number of divisors in the list at position i.
   c. Fetch the number of divisors from the list for each divisor, and add each number to the global counter.
   d. Unless you have finished, repeat for the next number.
5. Your result should be in the global counter.
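Since that answer leaves the implementation open, here is one possible sketch of the steps as I read them. The function name count_lucky_triples and the variable names are assumptions for illustration, not part of the original answer. It runs in O(n^2) time, and because it works on positions rather than values, duplicates are counted per position.

```python
def count_lucky_triples(values):
    """Count triples where the first divides the second and the second the third."""
    nums = sorted(values)              # step 2: smallest number first
    divisor_counts = [0] * len(nums)   # step 3: one entry per number
    total = 0                          # step 1: the global counter

    for i, current in enumerate(nums):         # step 4
        for j in range(i):                     # candidates at positions 0..i-1
            if current % nums[j] == 0:
                divisor_counts[i] += 1         # nums[j] divides current
                # Every divisor of nums[j] found earlier completes a triple
                # (that divisor, nums[j], current).
                total += divisor_counts[j]
    return total


print(count_lucky_triples([1, 2, 3, 4, 5, 6]))  # prints 3
```

Reading divisor_counts[j] inside the inner loop is safe because position j was fully processed in an earlier outer iteration.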

Looping through all item except itself

I am trying to find the item in a list with the highest number of occurrences.
For this, I am trying to compare every item in the list with all other items in the list, increasing the count's value by 1 each time a match is found.
def findInt(array):
    count = []
    count = [1 for i in range(0,len(array))]
    for i,item in enumerate(array):
        if (array[i] == array[i:]): #How can I compare it with all items except itself?
            count[i]+=1
    return max(count), count.index(max(count))
findInt(array=[1,2,3])
My question is "how do I compare the item with all other items except itself"?
Use collections.Counter, which has a most_common method.
import collections
def findInt(array):
    c = collections.Counter(array)
    return c.most_common(1)
DEMO
>>> import collections
>>> array=[1,2,3,1,2,3,2]
>>> c = collections.Counter(array)
>>> c.most_common(1)
[(2, 3)]
DOC
class collections.Counter([iterable-or-mapping])
A Counter is a dict subclass for counting hashable objects. It is an unordered collection where elements are stored as dictionary keys and their counts are stored as dictionary values. Counts are allowed to be any integer value including zero or negative counts. The Counter class is similar to bags or multisets in other languages.
most_common([n])
Return a list of the n most common elements and their counts from the most common to the least. If n is omitted or None, most_common() returns all elements in the counter. Elements with equal counts are ordered arbitrarily:
Whilst there exist many better ways of solving this problem, for instance as indicated in @zwer's comment to your question, here's how I would solve exactly what you're asking:
# O(n ** 2)
def find_int(array):
    n = len(array)
    count = [1 for i in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j: continue
            if array[i] == array[j]:
                count[i] += 1
    return max(count), count.index(max(count))
# Still O(n ** 2), with extra copying for the slices
def find_int_using_slice(array):
    n = len(array)
    count = [1 for i in range(n)]
    for i in range(n):
        for a_j in array[0:i] + array[i+1:]:
            if array[i] == a_j:
                count[i] += 1
    return max(count), count.index(max(count))
print(find_int_using_slice([1,2,3,1,2,3,2]))
We're using a nested for-loop here and using continue to skip the iteration when the two indexes are the same.
Unless specifically for the purpose of learning, please consider using built-ins for common tasks like this, as they are well implemented, tested, optimised, etc.
There are many potential solutions, but here are the two I'd recommend, depending on your application's requirements: 1) sort and count in a single pass from left to right: O(n * log(n)) and losing the original ordering, or 2) use a dictionary to maintain the counts, requiring only a single pass from left to right: O(n) but using more memory. Of course the better decision would be to use in-built methods which are highly optimised, but that's your call
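The first of those two suggestions (sort, then count runs in one left-to-right pass) might look roughly like the sketch below. The function name most_common_sorted is hypothetical, and the sketch assumes a non-empty list of mutually comparable items. It works on a sorted copy, so the caller's list keeps its order.

```python
def most_common_sorted(items):
    """Return (item, count) for the most frequent item of a non-empty list."""
    ordered = sorted(items)            # O(n log n); equal items become adjacent
    best_item, best_count = ordered[0], 0
    run_item, run_count = ordered[0], 0

    for item in ordered:               # single pass over the sorted copy
        if item == run_item:
            run_count += 1             # still inside the same run
        else:
            run_item, run_count = item, 1   # a new run starts here
        if run_count > best_count:
            best_item, best_count = run_item, run_count
    return best_item, best_count


print(most_common_sorted([1, 2, 3, 1, 2, 3, 2]))  # prints (2, 3)
```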
Updated answer to reflect OP not wanting to use collections.Counter
Using setdefault to prime the counter for first occurrences, then increment the counter. Then you can use max with a key to find the most common item.
def most_common(ar):
    y = {}
    for item in ar:
        y.setdefault(item, 0)
        y[item] += 1
    return max(y.items(), key=lambda x: x[1])
array = [1, 2, 1, 1, 2, 1, 3, 3, 1]
most_common(array)
# (1, 5)  -> (most common item, occurrences of item)
def findInt(array):
    count = []
    for i in range(len(array)):
        count.append(array.count(array[i]))
    return max(count), count.index(max(count))
print(findInt(array=[1,2,3,1,2,3,2]))
Fine, I'll bite. Given that memory is cheap, hashing is preferred over looping. I'd reckon one of the most performant ways would be to use a temporary registry:
def findInt(array):
    occurrences = dict.fromkeys(array, 0)
    for element in array:
        occurrences[element] += 1
    # dict views can't be indexed in Python 3, so materialize them as lists
    keys = list(occurrences)
    counts = list(occurrences.values())
    max_occurrences = max(counts)
    return keys[counts.index(max_occurrences)], max_occurrences
Returns a tuple of the element that occurs the most, and the number of times it occurs.
Actually, let's optimize it even more. Here's a pure O(N) solution with no extra list building and searching:
def findInt(array):
    occurrences = dict.fromkeys(array, 0)
    candidate = array[0]
    occurs = 0
    for element in array:
        value = occurrences[element] + 1
        occurrences[element] = value
        if value > occurs:
            candidate = element
            occurs = value
    return candidate, occurs
Counter is ideal for counting frequencies of items in an iterable. Alternatively, you can loop once with a defaultdict.
import operator as op
import collections as ct
def findInt(array):
    dd = ct.defaultdict(int)
    for item in array:
        dd[item] += 1
    return dd
# Frequencies
array = [1, 2, 1, 1, 2, 1, 3, 3, 1]
freq = findInt(array)
freq
# Out: defaultdict(int, {1: 5, 2: 2, 3: 2})
# Maximum key-value pair (2 options)
{k:v for k,v in freq.items() if k == max(freq, key=lambda x: freq[x])}
# Out: {1: 5}
max({k:v for k,v in freq.items()}.items(), key=op.itemgetter(-1))
# Out: (1, 5)

Sort a list efficiently which contains only 0 and 1 without using any builtin python sort function?

What is the most efficient way to sort a list such as [0,0,1,0,1,1,0], whose elements are only 0 and 1, without using the builtin sort(), sorted() or count() functions? It should run in O(n) or better.
>>> lst = [0,0,1,0,1,1,0]
>>> l, s = len(lst), sum(lst)
>>> result = [0] * (l - s) + [1] * s
>>> result
[0, 0, 0, 0, 1, 1, 1]
There are many different general sorting algorithms that can be used. However, in this case, the most important consideration is that all the elements to sort belong to the set {0, 1}.
As other contributors answered, there is a trivial implementation.
def radix_sort(a):
    slist = [[],[]]
    for elem in a:
        slist[elem].append(elem)
    return slist[0] + slist[1]
print(radix_sort([0,0,1,0,1,1,0]))
It must be noted that this is a particular (single-pass) case of radix sort, essentially a bucket sort. It can be extended easily if the elements of the list to be sorted belong to a defined limited set.
def radix_sort(a, elems):
    slist = {}
    for elem in elems:
        slist[elem] = []
    for elem in a:
        slist[elem].append(elem)
    nslist = []
    for elem in elems:
        nslist += slist[elem]
    return nslist
print(radix_sort([2,0,0,1,3,0,1,1,0],[0,1,2,3]))
No sort() or sorted() or count() function. O(n)
This one is O(n) (you can't get less):
old = [0,0,1,0,1,1,0]
zeroes = old.count(0) #you gotta count them somehow!
new = [0]*zeroes + [1]*(len(old) - zeroes)
As there are no Python loops, this may be the fastest you can get in pure Python...
def sort_arr_with_zero_one():
    main_list = [0,0,1,0,1,1,0]
    zero_list = []
    one_list = []
    for i in main_list:
        if i:
            one_list.append(i)
        else:
            zero_list.append(i)
    return zero_list + one_list
You have only two values, so you know in advance the precise structure of the output: it will be divided into two regions of varying lengths.
I'd try this:
b = [0,0,1,0,1,1,0]
def odd_sort(a):
    zeroes = a.count(0)
    # range replaces the Python 2-only xrange
    return [0 for i in range(zeroes)] + [1 for i in range(len(a) - zeroes)]
You could walk the list with two pointers, one from the start (i) and one from the end (j), compare the values one by one, and swap them if necessary:
def sort_binary_values(l):
    i, j = 0, len(l)-1
    while i < j:
        # skip 0 values from the beginning
        while i < j and l[i] == 0:
            i = i+1
        if i >= j: break
        # skip 1 values from the end
        while i < j and l[j] == 1:
            j = j-1
        if i >= j: break
        # since all in-sequence values have been skipped and i and j did not reach each other,
        # we encountered a pair that is out of order and needs to be swapped
        l[i], l[j] = l[j], l[i]
        j = j-1
        i = i+1
    return l
I like the answer by JBernado, but will throw in another monstrous option. I've not done any profiling on it, and it's not particularly extensible because it relies on the iteration order of the Counter (in Python 3.7+ that is first-seen order, so it only yields sorted output when a 0 comes first). Here bits is the input list:
from itertools import chain, repeat
from collections import Counter
list(chain.from_iterable(map(repeat, *zip(*Counter(bits).items()))))
Or, slightly less convoluted (filter replaces the Python 2-only itertools.ifilter):
from itertools import repeat, chain, islice
from operator import not_
list(islice(chain(filter(not_, bits), repeat(1)), len(bits)))
This should keep everything at the C level, so it should be fairly optimal.
All you need to know is how long the original sequence is and how many ones are in it.
old = [0,0,1,0,1,1,0]
ones = sum(1 for b in old if b)
new = [0]*(len(old)-ones) + [1]*ones
Here is a Python solution in O(n) time and O(1) extra space.
It sorts in place, so there is no need to create new lists. The i < j checks in the inner loops keep the indexes in bounds when the list is all zeros or all ones:
def sort01(arr):
    i = 0
    j = len(arr)-1
    while i < j:
        while i < j and arr[i] == 0:
            i += 1
        while i < j and arr[j] == 1:
            j -= 1
        if i<j:
            arr[i] = 0
            arr[j] = 1
    return arr
