I am working on the following question:
Given a set of numbers that might contain duplicates, find all of its distinct subsets.
Here are two examples:
Example 1:
Input: [1, 3, 3]
Output: [], [1], [3], [1,3], [3,3], [1,3,3]
Example 2:
Input: [1, 5, 3, 3]
Output: [], [1], [5], [3], [1,5], [1,3], [5,3], [1,5,3], [3,3],
[1,3,3], [3,3,5], [1,5,3,3]
My approach is:
class Solution:
    def distinct_subset(self, nums):
        n = len(nums)
        previousEnd = 0
        output = []
        for i in range(n):
            # judge if the current element is equal to the previous element
            # if so, only update the elements generated in the previous iteration
            if i > 0 and nums[i] == nums[i-1]:
                previousStart = previousEnd + 1
            else:
                previousStart = 0
            perviousEnd = len(output)
            # create a temp array to store the output from the previous iteration
            temp = list(output[previousStart:previousEnd])
            # add current element to all the array generated by the previous iteration
            output += [j + [nums[i]] for j in temp]
        return output


def main():
    print("Here is the list of subsets: " + str(Solution().distinct_subset([1, 3, 3])))
    print("Here is the list of subsets: " + str(Solution().distinct_subset([1, 5, 3, 3])))


main()
However, my approach only returns []:
Here is the list of subsets: []
Here is the list of subsets: []
Process finished with exit code 0
I am not sure where I went wrong. The algorithm is supposed to update the output in each iteration, but it fails to do so.
Please feel free to share your ideas. Thanks in advance for your help.
Yes, I ran your code, and the function always returns an empty list no matter the input: output starts out as [], so the slice copied into temp is always empty and nothing is ever added to output.
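For comparison, here is a minimal sketch of how the iterative approach from the question could be repaired. It rests on three assumptions of mine: output is seeded with the empty subset, the input is sorted so that duplicates sit next to each other, and the misspelled perviousEnd assignment is corrected. The name distinct_subsets and the snake_case variable names are my own, not from the question.

```python
def distinct_subsets(nums):
    nums = sorted(nums)      # assumption: sort so equal values are adjacent
    output = [[]]            # seed with the empty subset
    previous_end = 0
    for i, value in enumerate(nums):
        if i > 0 and nums[i] == nums[i - 1]:
            # duplicate value: extend only the subsets added in the previous iteration
            previous_start = previous_end
        else:
            previous_start = 0
        previous_end = len(output)
        output += [subset + [value] for subset in output[previous_start:previous_end]]
    return output


print(distinct_subsets([1, 3, 3]))
# [[], [1], [3], [1, 3], [3, 3], [1, 3, 3]]
```

With [1, 5, 3, 3] this sketch yields 12 subsets, the same count as in Example 2.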
Forgive me, but I had to look up what 'all distinct subsets' meant, and I stumbled across this code, which seems to do exactly what you are asking.
# Python3 program to find all subsets of
# given set. Any repeated subset is
# considered only once in the output
def printPowerSet(arr, n):

    # Function to find all subsets of given set.
    # Any repeated subset is considered only
    # once in the output
    _list = []

    # Run counter i from 000..0 to 111..1
    for i in range(2**n):
        subset = ""

        # consider each element in the set
        for j in range(n):

            # Check if jth bit in the i is set.
            # If the bit is set, we consider
            # jth element from set
            if (i & (1 << j)) != 0:
                subset += str(arr[j]) + "|"

        # if subset is encountered for the first time
        # If we use set<string>, we can directly insert
        if subset not in _list and len(subset) > 0:
            _list.append(subset)

    # consider every subset
    for subset in _list:

        # split the subset and print its elements
        arr = subset.split('|')
        for string in arr:
            print(string, end = " ")
        print()

# Driver Code
if __name__ == '__main__':
    arr = [10, 12, 12, 17]
    n = len(arr)
    printPowerSet(arr, n)
However, as you can see, the above code does not use classes, just a single function. If that works, great. If you are required to use a class, let me know, because the code above will need to change.
I assume the below is what you are looking for:
[1, 3, 3] to [1,3]
[1, 5, 3, 3] to [1,5,3]
The set(list) function will do that for you easily; however, it doesn't handle compound data structures well, because nested lists are not hashable.
The code below works for compound data one level deep:
[[1, 1], [0, 1], [0, 1], [0, 0], [1, 0], [1, 1], [1, 1]]
to:
[[1, 1], [0, 1], [0, 0], [1, 0]]
code:
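def get_unique(items):
    # membership is tested against a list, so unhashable items (such as lists) work
    temp = []
    for i in items:
        if i not in temp:
            temp.append(i)
            yield i


data = [[1, 1], [0, 1], [0, 1], [0, 0], [1, 0], [1, 1], [1, 1]]
print(*get_unique(data))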
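If the inner lists only contain hashable values, a shorter variant is possible. The following is a sketch under that assumption; unique_rows is a name I made up. It converts each inner list to a tuple and relies on dict preserving insertion order (Python 3.7+).

```python
def unique_rows(rows):
    """Drop repeated inner lists while keeping first-seen order."""
    # tuples are hashable, so they can serve as dict keys
    return [list(row) for row in dict.fromkeys(map(tuple, rows))]


print(unique_rows([[1, 1], [0, 1], [0, 1], [0, 0], [1, 0], [1, 1], [1, 1]]))
# [[1, 1], [0, 1], [0, 0], [1, 0]]
```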
I've trimmed the above code to give you your desired outputs. It is still not in a class, though. Is this okay?
def distinct_subset(user_input):
    n = len(user_input)
    output = []
    for i in range(2 ** n):
        subset = ""
        for j in range(n):
            if (i & (1 << j)) != 0:
                subset += str(user_input[j]) + ", "
        if subset[:-2] not in output and len(subset) > 0:
            output.append(subset[:-2])
    return output


def main():
    print("Here is the list of subsets: " + str(distinct_subset([1, 3, 3])))
    print("Here is the list of subsets: " + str(distinct_subset([1, 5, 3, 3])))


main()
You're looking for distinct combinations of the powerset of your list.
Using itertools to generate the combinations and a set to eliminate duplicates, you could write the function like this:
from itertools import combinations

def uniqueSubsets(A):
    A = sorted(A)
    return [*map(list,{subset for size in range(len(A)+1)
                              for subset in combinations(A,size)})]

print(uniqueSubsets([1,3,3]))
# [[1, 3], [3, 3], [1], [3], [], [1, 3, 3]]

print(uniqueSubsets([1,5,3,3]))
# [1, 3] [3, 3] [1] [3] [3, 3, 5] [1, 3, 5] [1, 5] [5] [] [1, 3, 3, 5] [1, 3, 3] [3, 5]
If you have a lot of duplicates, it may be more efficient to filter them out as you go. Here is a recursive generator function that short-circuits the expansion when a combination has already been seen. It generates combinations by removing one element at a time (starting from the full size) and recursing to get shorter combinations.
def uniqueSubsets(A,seen=None):
    if seen is None: seen,A = set(),sorted(A)
    for i in range(len(A)):                    # for each position in the list
        subset = (*A[:i],*A[i+1:])             # combination without that position
        if subset in seen: continue            # that has not been seen before
        seen.add(subset)
        yield from uniqueSubsets(subset,seen)  # get shorter combinations
    yield list(A)

print(*uniqueSubsets([1,3,3]))
# [] [3] [3, 3] [1] [1, 3] [1, 3, 3]

print(*uniqueSubsets([1,5,3,3]))
# [] [3] [3, 3] [5] [5, 3] [5, 3, 3] [1] [1, 3] [1, 3, 3] [1, 5] [1, 5, 3] [1, 5, 3, 3]
In both cases we are sorting the list in order to ensure that the combinations will always present the values in the same order for the set() to recognize them. (otherwise lists such as [3,3,1,3] could still produce duplicates)
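To illustrate why the sort matters, here is a small sketch (count_distinct is a hypothetical helper, not part of the functions above) that counts the distinct combination tuples with and without sorting:

```python
from itertools import combinations


def count_distinct(values):
    """Count distinct tuples over all combination sizes of values."""
    return len({combo
                for size in range(len(values) + 1)
                for combo in combinations(values, size)})


print(count_distinct([3, 1, 3]))          # 7: (3, 1) and (1, 3) both survive
print(count_distinct(sorted([3, 1, 3])))  # 6: the expected number of distinct subsets
```

Without sorting, the same subset can appear as two differently ordered tuples, which the set treats as different.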
a1=[1,2,3,4,5,6]
b1=[[1,2,3], [4,5,6]]
With np.shape, list a1 returns (6,) and b1 returns (2, 3).
If NumPy is forbidden, how can I get the shape of list a1?
I am mainly confused about how to let the Python program know that a1 has only one dimension. Is there a good method?
>>>a = [1,2,3,4,5,6]
>>>print (len(a))
6
For one-dimensional lists, the above method can be used: len(list_name) returns the number of elements in the list.
>>>a = [[1,2,3],[4,5,6]]
>>>nrow = len(a)
>>>ncol = len(a[0])
>>>nrow
2
>>>ncol
3
The above gives the dimensions of the list. len(a) returns the number of rows. len(a[0]) returns the number of elements in a[0], which is the number of columns.
Here's a link to original answer.
The question clearly states 'without using numpy'. However, if anybody reached here looking for a solution without that restriction, consider the one below. It works for a balanced (non-ragged) list.
import numpy as np
b1=[[1,2,3], [4,5,6]]
print(np.asarray(b1).shape)  # (2, 3)
this is a recursive attempt at solving your problem. it will only work if all the lists on the same depth have the same length. otherwise it will raise a ValueError:
from collections.abc import Sequence


def get_shape(lst, shape=()):
    """
    returns the shape of nested lists similarly to numpy's shape.
    :param lst: the nested list
    :param shape: the shape up to the current recursion depth
    :return: the shape including the current depth
             (finally this will be the full depth)
    """
    if not isinstance(lst, Sequence):
        # base case
        return shape

    # peek ahead and assure all lists in the next depth
    # have the same length
    if isinstance(lst[0], Sequence):
        l = len(lst[0])
        if not all(len(item) == l for item in lst):
            msg = 'not all lists have the same length'
            raise ValueError(msg)

    shape += (len(lst), )

    # recurse
    shape = get_shape(lst[0], shape)

    return shape
given your input (and the inputs from the comments) these are the results:
a1=[1,2,3,4,5,6]
b1=[[1,2,3],[4,5,6]]
print(get_shape(a1)) # (6,)
print(get_shape(b1)) # (2, 3)
print(get_shape([[0,1], [2,3,4]])) # raises ValueError
print(get_shape([[[1,2],[3,4]],[[5,6],[7,8]]])) # (2, 2, 2)
not sure if the last result is what you wanted.
UPDATE
as pointed out in the comments by mkl the code above will not catch all the cases where the shape of the nested list is inconsistent; e.g. [[0, 1], [2, [3, 4]]] will not raise an error.
this is a shot at checking whether or not the shape is consistent (there might be a more efficient way to do this...)
from collections.abc import Sequence, Iterator
from itertools import tee, chain


def is_shape_consistent(lst: Iterator):
    """
    check if all the elements of a nested list have the same
    shape.
    first check the 'top level' of the given lst, then flatten
    it by one level and recursively check that.
    :param lst:
    :return:
    """
    lst0, lst1 = tee(lst, 2)

    try:
        item0 = next(lst0)
    except StopIteration:
        return True
    is_seq = isinstance(item0, Sequence)

    if not all(is_seq == isinstance(item, Sequence) for item in lst0):
        return False

    if not is_seq:
        return True

    return is_shape_consistent(chain(*lst1))
which could be used this way:
lst0 = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
lst1 = [[0, 1, 2], [3, [4, 5]], [7, [8, 9]]]
assert is_shape_consistent(iter(lst0))
assert not is_shape_consistent(iter(lst1))
Here is a great recursive example from the book "Ten Essays on Fizz Buzz" by Joel Grus.
from typing import List, Tuple, Union


def shape(ndarray: Union[List, float]) -> Tuple[int, ...]:
    if isinstance(ndarray, list):
        # More dimensions, so make a recursive call
        outermost_size = len(ndarray)
        row_shape = shape(ndarray[0])
        return (outermost_size, *row_shape)
    else:
        # No more dimensions, so we're done
        return ()
Example:
three_d = [
    [[0, 0, 0], [1, 1, 1], [2, 2, 2]],
    [[0, 0, 0], [1, 1, 1], [2, 2, 2]],
    [[0, 0, 0], [1, 1, 1], [2, 2, 2]],
    [[0, 0, 0], [1, 1, 1], [2, 2, 2]],
    [[0, 0, 0], [1, 1, 1], [2, 2, 2]],
]
result = shape(three_d)
print(result)
# (5, 3, 3)
Depending on the level of thoroughness required, I would recommend using tail recursion. Build up the shape from the innermost to the outermost list. That will allow you to check that all the sizes match up at every depth and index.
def shape(lst):
    def ishape(lst):
        shapes = [ishape(x) if isinstance(x, list) else [] for x in lst]
        shape = shapes[0]
        if shapes.count(shape) != len(shapes):
            raise ValueError('Ragged list')
        shape.append(len(lst))
        return shape
    return tuple(reversed(ishape(lst)))
Here is a demo on IDEOne: https://ideone.com/HJRwlC
shapes.count(shape) != len(shapes) is a neat trick to determine if all the shapes up to a given level are identical, taken from https://stackoverflow.com/a/3844948/2988730.
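In isolation, the counting trick could be sketched as follows. The helper name all_equal is my own, and treating an empty list as uniform is an assumption of this sketch rather than something the answer specifies.

```python
def all_equal(items):
    """Return True when every element compares equal to the first one."""
    # assumption: an empty list counts as trivially uniform
    return not items or items.count(items[0]) == len(items)


print(all_equal([[3], [3], [3]]))  # True
print(all_equal([[3], [2], [3]]))  # False
```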
If your only goal is to determine whether the list is one dimensional or not, just run a single all on the outermost list:
is_1d = all(not isinstance(x, list) for x in lst)
OR
is_1d = not any(isinstance(x, list) for x in lst)
The following function keeps track of the first item of each dimension of a list. It doesn't matter how many dimensions it has.
def list_shape(input):
    shape = []
    a = len(input)
    shape.append(a)
    b = input[0]
    while a > 0:
        try:
            a = len(b)
            shape.append(a)
            b = b[0]
        except:
            break
    return shape
list1 = [[[123], [231]], [[345], [231]]]
print(list_shape(list1))
Output:
[2, 2, 1]
Note: Only works for symmetrical lists and numeral data within the list.
My proposal for the problem is the following relatively complex code, which delivers the maximum shape as a tuple.
The function can be fed any kind of nested iterable object that has a __len__ method. I use itertools.zip_longest() to calculate the shape.
import itertools


def get_max_shape(value):
    if hasattr(value, '__len__'):  # add here `and not hasattr(value, 'capitalize')` to exclude strings
        s0 = len(value)
        if s0 == 0:
            return ()
        if s0==1 and hasattr(value,'capitalize'):  # handle string like objects
            return ()
        s1 = 0
        sub_shape = []
        # via zip_longest we iterate over the longest sub_list item
        for v2 in itertools.zip_longest(*(v if hasattr(v, '__len__') else tuple() for v in value), fillvalue=0):
            s1 += 1  # calc the length of second dimension
            for v3 in v2:
                sub_shape_new = get_max_shape(v3)  # recursive measurement of lower dimensions
                # compare new shapes with old shapes (find max)
                for i, new in enumerate(sub_shape_new):
                    if i >= len(sub_shape):
                        # new items
                        sub_shape.extend(sub_shape_new[i:])
                    else:
                        # is max?
                        if sub_shape[i] < new:
                            sub_shape[i] = new
        if len(sub_shape):
            return (s0, s1) + tuple(sub_shape)
        elif s1:
            return (s0, s1)
        else:
            return (s0,)
    else:
        return ()
The execution delivers:
>>> print(get_max_shape([]))
()
>>> print(get_max_shape([1, 2, 3]))
(3,)
>>> print(get_max_shape([1, 2, [2, 3]]))
(3, 2)
>>> print(get_max_shape([[[2, 3, 4, 5], 1, 2], [3, [2, 3, 4], 5, 6]]))
(2, 4, 4)
>>> print(get_max_shape([[[2, 3, 4, [4, [5, 6]]], 1, 2], [3, [2, [3, 4, 5, 6, 7], 4], 5, 6]]))
(2, 4, 4, 5, 2)
>>> print(get_max_shape(['123456789', []])) # strings are also counted as lists
(2, 9)
The code could be improved, but in general I think it works. It does not iterate over all items to get the dimensions; it uses len() where possible.
Let's say I have a list of lists, for example:
[[0, 2], [0, 1], [2, 3], [4, 5, 7, 8], [6, 4]]
If at least one value in a list also appears in a different list, I would like to unite those lists, so for the example the final result would be:
[[0, 1, 2, 3], [4, 5, 6, 7, 8]]
I really don't care about the order of the values inside a list: [0, 1, 2, 3] and [0, 2, 1, 3] are both fine.
I tried to do it but it doesn't work. Do you have any ideas? Thanks.
Edit (sorry for not posting the code that I tried before):
What I tried to do was the following:
for p in llista:
    for q in p:
        for k in llista:
            if p==k:
                llista.remove(k)
            else:
                for h in k:
                    if p!=k:
                        if q==h:
                            k.remove(h)
                            for t in k:
                                if t not in p:
                                    p.append(t)
llista_final = [x for x in llista if x != []]
Where llista is the list of lists.
I have to admit this is a tricky problem. I'm really curious what this problem represents and/or where you found it...
I initially thought this was just a graph connected-components problem, but I wanted to take a shortcut and avoid creating an explicit representation of the graph, running BFS, etc.
The idea of the solution is this: for every sublist, check whether it has an element in common with any other sublist, and if so, replace that other sublist with their union.
Not very pythonic, but here it is:
def merge(l):
    l = list(map(tuple, l))
    for i, h in enumerate(l):
        sh = set(h)
        for j, k in enumerate(l):
            if i == j: continue
            sk = set(k)
            if sh & sk:  # h and k have some element in common
                l[j] = tuple(sh | sk)
    return list(map(list, set(l)))
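For reference, the connected-components view mentioned above can also be sketched with a small union-find (disjoint-set) structure instead of an explicit graph and BFS. This is a different technique from the merge function, and the names merge_connected, find and union are my own. It assumes the values are hashable.

```python
def merge_connected(lists):
    """Unite sublists that share at least one value, via union-find."""
    parent = {}

    def find(x):
        # register unseen values as their own root
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for sub in lists:
        for item in sub:
            find(item)               # make sure every value is registered
        for item in sub[1:]:
            union(sub[0], item)      # all values of one sublist belong together

    groups = {}
    for item in list(parent):
        groups.setdefault(find(item), []).append(item)
    return [sorted(group) for group in groups.values()]


print(merge_connected([[0, 2], [0, 1], [2, 3], [4, 5, 7, 8], [6, 4]]))
# [[0, 1, 2, 3], [4, 5, 6, 7, 8]]
```

Sorting each group is only for readable output; the question says the order inside a list does not matter.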
Here is a function that does what you want. I tried to use self-documenting variable names and comments to help you understand how this code works. As far as I can tell, the code is pythonic. I used sets to speed up and simplify some of the operations. The downside of that is that the items in your input list-of-lists must be hashable, but your example uses integers which works perfectly well.
def cliquesfromlistoflists(inputlistoflists):
    """Given a list of lists, return a new list of lists that unites
    the old lists that have at least one element in common.
    """
    listofdisjointsets = []
    for inputlist in inputlistoflists:
        # Update the list of disjoint sets using the current sublist
        inputset = set(inputlist)
        unionofsetsoverlappinginputset = inputset.copy()
        listofdisjointsetsnotoverlappinginputset = []
        for aset in listofdisjointsets:
            # Unite set if overlaps the new input set, else just store it
            if aset.isdisjoint(inputset):
                listofdisjointsetsnotoverlappinginputset.append(aset)
            else:
                unionofsetsoverlappinginputset.update(aset)
        listofdisjointsets = (listofdisjointsetsnotoverlappinginputset
                              + [unionofsetsoverlappinginputset])
    # Return the information in a list-of-lists format
    return [list(aset) for aset in listofdisjointsets]

print(cliquesfromlistoflists([[0, 2], [0, 1], [2, 3], [4, 5, 7, 8], [6, 4]]))
# printout is [[0, 1, 2, 3], [4, 5, 6, 7, 8]]
This solution modifies the generic breadth-first search to gradually diminish the initial deque and update a result list with either a combination should a match be found or a list addition if no grouping is discovered:
from collections import deque
d = deque([[0,2] , [0,1] , [2,3] , [4,5,7,8] , [6,4]])
result = [d.popleft()]
while d:
    v = d.popleft()
    result = [list(set(i+v)) if any(c in i for c in v) else i for i in result] if any(any(c in i for c in v) for i in result) else result + [v]
print(result)
Output:
[[0, 1, 2, 3], [8, 4, 5, 6, 7]]
Consider the general case of the following list:
l = [['path/to/file', 0], ['path/to/folder$IDX/', 2], ['and/another/', 5]]
That is, a list of lists, where each inner list has a string as its first element that may or may not contain a special marker (in the example above it's $IDX) and some random integer as its second element.
My goal is to have a new list of lists, where every inner list that has the special marker in its first element is replaced with n + 1 new lists, in which the special marker is replaced by 0, 1, 2, ..., n (where n is known).
For example, if n = 2, for the input above, the output should be:
l = [['path/to/file', 0], ['path/to/folder0/', 2], ['path/to/folder1/', 2], ['path/to/folder2/', 2], ['and/another/', 5]]
another example (again, for n = 2):
input:
l = [['random_text$IDX', 512], ['string', 2], ['more_$IDX_random_text', 5]]
output:
l = [['random_text0', 512], ['random_text1', 512], ['random_text2', 512], ['string', 2], ['more_0_random_text', 5], ['more_1_random_text', 5], ['more_2_random_text', 5]]
My first thought was to divide the original list into two lists of lists, one that has the marker (l_with) and one that doesn't (l_without), then process l_with and add it to l_without, but this rather naive attempt failed right at the start:
>>> l = [['path/to/file', 0], ['path/to/folder$IDX/', 2], ['and/another/', 5]]
>>> l_with = [e for e in l if '$IDX' in e[0]]
>>> l_without = [e for e in l if '$IDX' not in e[0]]
>>> l_new = l_without + [[e[0].replace('$IDX', str(i)), e[1]] for i in range(3) for e in l_with]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: expected a character buffer object
In addition, even if the above would work, it's not that scalable... (consider a more general case: what if each inner-list has more than two elements? The only assumption I can make for now is that my (special-or-not) string will be the first element on each inner-list.)
What's a concise way of doing that? (scalable or not...)
This might help.
res = []
l = [['path/to/file', 0], ['path/to/folder$IDX/', 2], ['and/another/', 5]]
n = 2
for i in l:
    if "$IDX" in i[0]:
        for j in range(0, n+1):
            res.append([i[0].replace("$IDX", str(j)) ] + i[1:])
    else:
        res.append(i)
print(res)
Output:
[['path/to/file', 0], ['path/to/folder0/', 2], ['path/to/folder1/', 2], ['path/to/folder2/', 2], ['and/another/', 5]]
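The same loop can be wrapped in a reusable function. The following is only a sketch in Python 3 syntax; expand_marker and its marker parameter are names I introduced. It keeps any extra elements after the first one, so inner lists with more than two elements are handled too.

```python
def expand_marker(rows, n, marker="$IDX"):
    """Replace each row whose first element contains marker with n + 1 rows."""
    expanded = []
    for first, *rest in rows:
        if marker in first:
            # one copy per index 0..n, with the marker substituted
            expanded.extend([first.replace(marker, str(k)), *rest]
                            for k in range(n + 1))
        else:
            expanded.append([first, *rest])
    return expanded


rows = [['random_text$IDX', 512], ['string', 2], ['more_$IDX_random_text', 5]]
for row in expand_marker(rows, 2):
    print(row)
```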
I want to write a recursive function that returns the subsets of a set and prints them like this when the input set is [1, 2, 3]:
[
[],
[1],
[2],
[3],
[1, 2],
[1, 3],
[2, 3],
[1, 2, 3]
]
My idea is to split the set into two parts: the first element and the rest. Then I combine the first element with every element of the list built from the rest.
Here is my code:
def subset(set):
    if len(set) == 0:
        return [set]
    elif len(set) == 1:
        return [set[0]]
    else:
        short = subset(set[1:])
        alist=[]
        for item in short:
            blist=[set[0]]
            blist.append(item)
            alist.append(blist)
        return alist
It doesn't work. How can I fix it? (Only one parameter, set, is allowed.)
Assuming that the elements don't contain any duplicates:
def subset(l):
    if not l:
        return [[]]
    rest = subset(l[1:])
    return rest + [[l[0]] + s for s in rest]
The output is not exactly in the same order as you require:
[[], [3], [2], [2, 3], [1], [1, 3], [1, 2], [1, 2, 3]]
But we can try to sort the resulting list of subsets. Assuming the elements of the list are always integers we can try:
sorted(subset([1, 2, 3]), key=lambda l : int('0' + ''.join(str(i) for i in l)))
And then we have the desired output:
[[], [1], [2], [3], [1, 2], [1, 3], [2, 3], [1, 2, 3]]
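Note that the integer key above works by concatenating digits, so it can tie or misorder once elements have more than one digit (for example, [12] and [1, 2] both map to 12). A possible alternative, sketched here as a suggestion, is to sort by length first and then by the elements themselves:

```python
def subset(l):
    # same recursive construction as above
    if not l:
        return [[]]
    rest = subset(l[1:])
    return rest + [[l[0]] + s for s in rest]


# order by subset size, then lexicographically by elements
ordered = sorted(subset([1, 2, 12]), key=lambda s: (len(s), s))
print(ordered)
# [[], [1], [2], [12], [1, 2], [1, 12], [2, 12], [1, 2, 12]]
```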
I believe this should do the trick, although I should probably be asleep so let me know if it doesn't...
def rec(l, _s=None):
    if _s is None:
        _s = []
    _s.append(tuple(l))
    for i, null in enumerate(l):
        sl = [v for j, v in enumerate(l) if i != j]
        if len(sl) > 0:
            rec(sl, _s)
    _s = set(_s)
    return _s

print(rec([1,2,3,4]))
Alternatively, without recursion:
import itertools

def norec(l):
    s = []
    for i in range(len(l)):
        s.extend(itertools.combinations(l, i+1))
    return s
Finally, recursive with only one parameter. It's a bit clunky, though.
def rec2(l):
    new = [l]
    for i, null in enumerate(l):
        sub = [v for j, v in enumerate(l) if i != j]
        new.extend(rec2(sub))
    return set([tuple(x) for x in new])
I guess generating the powerset is a nice exercise in recursion, but it's generally better to do things iteratively if you can; Python doesn't have any optimisations for recursion.
It's possible to do powersets with a one-liner, using reduce, but since Guido doesn't like reduce I'll do a slightly longer version. :)
>>> def powerset(seq):
...     z = [[]]
...     for x in seq:
...         z += [y + [x] for y in z]
...     return z
...
>>> ps = powerset(range(1, 5)); ps.sort(key=lambda i:(len(i), i)); ps
[[], [1], [2], [3], [4], [1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4],
[1, 2, 3], [1, 2, 4], [1, 3, 4], [2, 3, 4], [1, 2, 3, 4]]
(I manually wrapped the output to avoid scrolling).
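For the curious, the reduce-based one-liner alluded to above might look like the sketch below. This is my guess at what is meant, using functools.reduce from Python 3; powerset_reduce is a made-up name.

```python
from functools import reduce


def powerset_reduce(seq):
    # start from [[]] and, for each element, add a copy of every subset extended by it
    return reduce(lambda acc, x: acc + [subset + [x] for subset in acc], seq, [[]])


print(powerset_reduce([1, 2, 3]))
# [[], [1], [2], [1, 2], [3], [1, 3], [2, 3], [1, 2, 3]]
```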
I think dreyescat gave the right answer. Check the code below to find out where your code went wrong compared to his, unless you have already figured it out :-).
def subset(set):
    if len(set) == 0:
        return [set]
    elif len(set) == 1:  # actually we do not need it
        #return [set[0]]  # this is wrong
        return [set] + [[]]  # this is correct, although redundant
    else:
        short = subset(set[1:])
        alist=[]
        for item in short:
            blist=[set[0]]
            #blist.append(item)  # item is a list, so do not append it to blist
            blist += item  # addition
            alist.append(blist)
        return short + alist  # you have to add this "short" which is the return value of the last "subset" call

print(subset([1,2,3]))
#print(sorted(subset([1, 2, 3]), key=lambda l : int('0' + ''.join(str(i) for i in l))))