How can I get a list shape without using numpy? - python

a1=[1,2,3,4,5,6]
b1=[[1,2,3], [4,5,6]]
With np.shape, a1 returns (6,) and b1 returns (2, 3).
If NumPy is not allowed, how can I get the shape of list a1?
I am mainly confused about how to let the Python program know that a1 has only one dimension. Is there a good method?

>>> a = [1,2,3,4,5,6]
>>> print(len(a))
6
For one-dimensional lists, the above method can be used. len(list_name) returns the number of elements in the list.
>>> a = [[1,2,3],[4,5,6]]
>>> nrow = len(a)
>>> ncol = len(a[0])
>>> nrow
2
>>> ncol
3
The above gives the dimensions of the list. len(a) returns the number of rows. len(a[0]) returns the number of elements in a[0], which is the number of columns.
Here's a link to original answer.
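The len() calls above do not by themselves tell a flat list from a list of rows. A minimal sketch of one way to make that distinction, assuming only one or two dimensions matter; simple_shape is a hypothetical helper name, not part of the answer above:

```python
def simple_shape(lst):
    """Return (n,) for a flat list, or (rows, cols) for a rectangular list of lists."""
    # Treat the input as 2D only if it is non-empty and every element is a list.
    if lst and all(isinstance(row, list) for row in lst):
        cols = len(lst[0])
        if any(len(row) != cols for row in lst):
            raise ValueError("rows have different lengths")
        return (len(lst), cols)
    # Otherwise fall back to the one-dimensional case.
    return (len(lst),)


a1 = [1, 2, 3, 4, 5, 6]
b1 = [[1, 2, 3], [4, 5, 6]]
print(simple_shape(a1))  # (6,)
print(simple_shape(b1))  # (2, 3)
```

This sketch does not look deeper than two levels; the recursive answers further down handle arbitrary nesting.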

The question clearly states 'without using numpy'. However, if anybody reached here looking for a solution without that restriction, consider the one below. It works for a balanced (rectangular) list.
import numpy as np
b1 = [[1,2,3], [4,5,6]]
print(np.asarray(b1).shape)  # (2, 3)

This is a recursive attempt at solving your problem. It will only work if all the lists at the same depth have the same length; otherwise it will raise a ValueError:
from collections.abc import Sequence

def get_shape(lst, shape=()):
    """
    returns the shape of nested lists similarly to numpy's shape.
    :param lst: the nested list
    :param shape: the shape up to the current recursion depth
    :return: the shape including the current depth
             (finally this will be the full depth)
    """
    if not isinstance(lst, Sequence):
        # base case
        return shape
    # peek ahead and assure all lists in the next depth
    # have the same length
    if isinstance(lst[0], Sequence):
        l = len(lst[0])
        if not all(len(item) == l for item in lst):
            msg = 'not all lists have the same length'
            raise ValueError(msg)
    shape += (len(lst), )
    # recurse
    shape = get_shape(lst[0], shape)
    return shape
Given your input (and the inputs from the comments), these are the results:
a1=[1,2,3,4,5,6]
b1=[[1,2,3],[4,5,6]]
print(get_shape(a1)) # (6,)
print(get_shape(b1)) # (2, 3)
print(get_shape([[0,1], [2,3,4]])) # raises ValueError
print(get_shape([[[1,2],[3,4]],[[5,6],[7,8]]])) # (2, 2, 2)
Not sure if the last result is what you wanted.
UPDATE
As pointed out in the comments by mkl, the code above will not catch all the cases where the shape of the nested list is inconsistent; e.g. [[0, 1], [2, [3, 4]]] will not raise an error.
This is a shot at checking whether or not the shape is consistent (there might be a more efficient way to do this...):
from collections.abc import Sequence, Iterator
from itertools import tee, chain

def is_shape_consistent(lst: Iterator):
    """
    check if all the elements of a nested list have the same
    shape.
    first check the 'top level' of the given lst, then flatten
    it by one level and recursively check that.
    :param lst:
    :return:
    """
    lst0, lst1 = tee(lst, 2)
    try:
        item0 = next(lst0)
    except StopIteration:
        return True
    is_seq = isinstance(item0, Sequence)
    if not all(is_seq == isinstance(item, Sequence) for item in lst0):
        return False
    if not is_seq:
        return True
    return is_shape_consistent(chain(*lst1))
Which could be used this way:
lst0 = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
lst1 = [[0, 1, 2], [3, [4, 5]], [7, [8, 9]]]
assert is_shape_consistent(iter(lst0))
assert not is_shape_consistent(iter(lst1))

Here is a great example from the book "Ten Essays on Fizz Buzz" by Joel Grus, using recursion.
from typing import List, Tuple, Union

def shape(ndarray: Union[List, float]) -> Tuple[int, ...]:
    if isinstance(ndarray, list):
        # More dimensions, so make a recursive call
        outermost_size = len(ndarray)
        row_shape = shape(ndarray[0])
        return (outermost_size, *row_shape)
    else:
        # No more dimensions, so we're done
        return ()
Example:
three_d = [
    [[0, 0, 0], [1, 1, 1], [2, 2, 2]],
    [[0, 0, 0], [1, 1, 1], [2, 2, 2]],
    [[0, 0, 0], [1, 1, 1], [2, 2, 2]],
    [[0, 0, 0], [1, 1, 1], [2, 2, 2]],
    [[0, 0, 0], [1, 1, 1], [2, 2, 2]],
]
result = shape(three_d)
print(result)
# (5, 3, 3)

Depending on the level of thoroughness required, I would recommend using recursion that builds up the shape from the innermost to the outermost list. That will allow you to check that all the sizes match up at every depth and index.
def shape(lst):
    def ishape(lst):
        shapes = [ishape(x) if isinstance(x, list) else [] for x in lst]
        shape = shapes[0]
        if shapes.count(shape) != len(shapes):
            raise ValueError('Ragged list')
        shape.append(len(lst))
        return shape
    return tuple(reversed(ishape(lst)))
Here is a demo on IDEOne: https://ideone.com/HJRwlC
shapes.count(shape) != len(shapes) is a neat trick to determine if all the shapes up to a given level are identical, taken from https://stackoverflow.com/a/3844948/2988730.
If your only goal is to determine whether the list is one dimensional or not, just run a single all on the outermost list:
is_1d = all(not isinstance(x, list) for x in lst)
OR
is_1d = not any(isinstance(x, list) for x in lst)

The following function follows the first item of each dimension of a list. It doesn't matter how many dimensions the list has.
def list_shape(input):
    shape = []
    a = len(input)
    shape.append(a)
    b = input[0]
    while a > 0:
        try:
            a = len(b)
            shape.append(a)
            b = b[0]
        except (TypeError, IndexError):
            # b has no length (a number) or is empty: innermost level reached
            break
    return shape
list1 = [[[123], [231]], [[345], [231]]]
print(list_shape(list1))
Output:
[2, 2, 1]
Note: Only works for symmetrical lists and numeric data within the list.

My proposal for the problem is the following relatively complex code, which returns the maximum shape as a tuple.
The function can be fed any kind of nested iterable object that has a __len__ method. I use itertools.zip_longest() to calculate the shape.
import itertools

def get_max_shape(value):
    if hasattr(value, '__len__'):  # add here `and not hasattr(value, 'capitalize')` to exclude strings
        s0 = len(value)
        if s0 == 0:
            return ()
        if s0==1 and hasattr(value,'capitalize'):  # handle string like objects
            return ()
        s1 = 0
        sub_shape = []
        # via zip_longest we iterate over the longest sub_list item
        for v2 in itertools.zip_longest(*(v if hasattr(v, '__len__') else tuple() for v in value), fillvalue=0):
            s1 += 1  # calc the length of second dimension
            for v3 in v2:
                sub_shape_new = get_max_shape(v3)  # recursive measurement of lower dimensions
                # compare new shapes with old shapes (find max)
                for i, new in enumerate(sub_shape_new):
                    if i >= len(sub_shape):
                        # new items
                        sub_shape.extend(sub_shape_new[i:])
                    else:
                        # is max?
                        if sub_shape[i] < new:
                            sub_shape[i] = new
        if len(sub_shape):
            return (s0, s1) + tuple(sub_shape)
        elif s1:
            return (s0, s1)
        else:
            return (s0,)
    else:
        return ()
The execution delivers:
>>> print(get_max_shape([]))
()
>>> print(get_max_shape([1, 2, 3]))
(3,)
>>> print(get_max_shape([1, 2, [2, 3]]))
(3, 2)
>>> print(get_max_shape([[[2, 3, 4, 5], 1, 2], [3, [2, 3, 4], 5, 6]]))
(2, 4, 4)
>>> print(get_max_shape([[[2, 3, 4, [4, [5, 6]]], 1, 2], [3, [2, [3, 4, 5, 6, 7], 4], 5, 6]]))
(2, 4, 4, 5, 2)
>>> print(get_max_shape(['123456789', []])) # strings are also counted as lists
(2, 9)
The code might be improved, but in general I think it works. It does not iterate over all items to get the dimensions; it uses len() where possible.

Related

how to expand list in list that included in some not-iterable objects to flat? [duplicate]

I want to flatten a list that contains both nested lists and non-iterable objects.
I tried to do this with a list comprehension, but I get an error on the non-iterable objects.
How can I flatten this list?
# [[1, 2], 3] -> [1, 2, 3]
list = [[1, 2], 3]
flat = [item for sublist in list for item in sublist] # TypeError: 'int' object is not iterable
print(flat)
In my environment, numpy is installed in addition to the standard functions.
I tried numpy.concatenate(list).flat, but I get an error.
# [[1, 2], 3] -> [1, 2, 3]
list = [[1, 2], 3]
flat = numpy.concatenate(list).flat # ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 1 dimension(s) and the array at index 1 has 0 dimension(s)
print(flat)
If the iterables are only lists and only one level deep, you can do it in the list comprehension.
L = [[1, 2], 3]
flat = [v for item in L for v in (item if isinstance(item,list) else [item])]
If there are multiple levels and a variety of iterable types, you will probably need a recursive function:
def flatten(L):
    if not isinstance(L,(list,tuple,set)):  # you probably don't want str here
        yield L
        return
    for F in L:
        yield from flatten(F)
L = [[1, 2], 3, ({5,6,7},[8,9],10)]
flat = list(flatten(L))
print(flat)
[1, 2, 3, 5, 6, 7, 8, 9, 10]
You could try this to see if it's what you're looking for.
It can flatten any level of nesting by recursively calling itself. Be aware that it has not been performance-tested, so there may be room to improve it.
from collections.abc import Iterable  # collections.Iterable was removed in Python 3.10

def flatten(L):
    if isinstance(L, Iterable):
        return [a for i in L for a in flatten(i)]  # recursively calling
    else:
        return [L]
Running it:
lst = [[1, 2], 3, [[4, 5], 6] ]
print(flatten(lst)) # [1, 2, 3, 4, 5, 6]
lst2 = [[1, 2], 3, [[4, 5, [6]]], 7, 8]
print(flatten(lst2)) # [1, 2, 3, 4, 5, 6, 7, 8] # deeply nested
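One caveat with checking for Iterable is that strings are iterable too, and each character of a string is itself a string, so the recursion above would never bottom out on string data. A minimal sketch of a variant that treats str and bytes as single items; flatten_atoms is a hypothetical name chosen for this example:

```python
from collections.abc import Iterable


def flatten_atoms(obj):
    """Yield the leaves of a nested iterable, treating str and bytes as leaves."""
    if isinstance(obj, Iterable) and not isinstance(obj, (str, bytes)):
        for element in obj:
            # Recurse into nested containers of any depth.
            yield from flatten_atoms(element)
    else:
        yield obj


print(list(flatten_atoms([[1, 2], 3, ["ab", [4]]])))  # [1, 2, 3, 'ab', 4]
```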
There may be more elegant solutions, but the trivial one would be to just iterate with a couple of for loops, checking value type:
flat = []
for item in list:
    try:
        for subitem in item:
            flat.append(subitem)
    except TypeError:  # Not iterable, append as is
        flat.append(item)
Note this assumes the nesting is only one level deep.
First of all, I highly recommend avoiding identifiers such as list, dict, set, etc., as these are built-in Python types; even though the syntax allows it, you will shadow ("hide") them for the rest of your code.
Also, as a suggestion, avoid using list comprehensions for more complex operations, as they can become difficult to read.
I recommend the following approach:
my_list = [[1, 2], 3]
flat = []
for item in my_list:
    if isinstance(item, list):
        for val in item:
            flat.append(val)
    else:
        flat.append(item)
print(flat)
Using list comprehension, the solution would look like:
my_list = [[1, 2], 3]
flat = [v for item in my_list for v in (item if isinstance(item,list) else [item])]
print(flat)

Subsets With Duplicates, issues with only returning empty list

I am working on a question as following:
Given a set of numbers that might contain duplicates, find all of its distinct subsets.
You can use the following as an example :
Example 1:
Input: [1, 3, 3]
Output: [], [1], [3], [1,3], [3,3], [1,3,3]
Example 2:
Input: [1, 5, 3, 3]
Output: [], [1], [5], [3], [1,5], [1,3], [5,3], [1,5,3], [3,3],
[1,3,3], [3,3,5], [1,5,3,3]
My approach is
class Solution:
    def distinct_subset(self, nums):
        n = len(nums)
        previousEnd = 0
        output = []
        for i in range(n):
            # judge if the current element is equal to the previous element
            # if so, only update the elements generated in the previous iteration
            if i > 0 and nums[i] == nums[i-1]:
                previousStart = previousEnd + 1
            else:
                previousStart = 0
            perviousEnd = len(output)
            # create a temp array to store the output from the previous iteration
            temp = list(output[previousStart:previousEnd])
            # add current element to all the array generated by the previous iteration
            output += [j + [nums[i]] for j in temp]
        return output

def main():
    print("Here is the list of subsets: " + str(Solution().distinct_subset([1, 3, 3])))
    print("Here is the list of subsets: " + str(Solution().distinct_subset([1, 5, 3, 3])))

main()
However, my approach will only return []:
Here is the list of subsets: []
Here is the list of subsets: []
Process finished with exit code 0
I am not sure where I went wrong. The algorithm is supposed to update the output in each iteration, but it fails to do so.
Please feel free to share your ideas. Thanks in advance for your help.
Yes, I ran your code, and the function always returns an empty list: output starts out empty, so the slice copied into temp is always empty and nothing is ever appended.
Forgive me, but I had to look up what 'all distinct subsets' meant; I stumbled across this code, which seems to do exactly what you are asking.
# Python3 program to find all subsets of
# given set. Any repeated subset is
# considered only once in the output
def printPowerSet(arr, n):
    # Function to find all subsets of given set.
    # Any repeated subset is considered only
    # once in the output
    _list = []
    # Run counter i from 000..0 to 111..1
    for i in range(2**n):
        subset = ""
        # consider each element in the set
        for j in range(n):
            # Check if jth bit in the i is set.
            # If the bit is set, we consider
            # jth element from set
            if (i & (1 << j)) != 0:
                subset += str(arr[j]) + "|"
        # if subset is encountered for the first time
        # If we use set<string>, we can directly insert
        if subset not in _list and len(subset) > 0:
            _list.append(subset)
    # consider every subset
    for subset in _list:
        # split the subset and print its elements
        arr = subset.split('|')
        for string in arr:
            print(string, end = " ")
        print()

# Driver Code
if __name__ == '__main__':
    arr = [10, 12, 12, 17]
    n = len(arr)
    printPowerSet(arr, n)
However, as you can see, the above code does not use a class, just a single function. If that works, great; if you are required to use a class, let me know, as the code above will need to change.
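For comparison, the iterative approach from the question can be repaired with a few small changes. The sketch below assumes the intended algorithm is the usual one for this exercise: sort the input, seed the output with the empty subset, and for a repeated value extend only the subsets added in the previous step. The function name distinct_subsets and the variable names are choices made for this example, not taken from the question.

```python
def distinct_subsets(nums):
    """Return all distinct subsets of nums, which may contain duplicates."""
    nums = sorted(nums)      # equal values must be adjacent
    output = [[]]            # start from the empty subset, not an empty list
    start = end = 0
    for i, value in enumerate(nums):
        # For a repeated value, extend only the subsets created in the
        # previous iteration; "end" still holds the old length here.
        start = end if i > 0 and nums[i] == nums[i - 1] else 0
        end = len(output)
        output += [subset + [value] for subset in output[start:end]]
    return output


print(distinct_subsets([1, 3, 3]))
# [[], [1], [3], [1, 3], [3, 3], [1, 3, 3]]
```

The differences from the question's code are the initial value of output, the sort, and the consistent spelling of the variable that remembers the previous end index.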
I assume the below is what you are looking for:
[1, 3, 3] to [1,3]
[1, 5, 3, 3] to [1,5,3]
set(list) will do that for you easily; however, it doesn't handle compound data structures well, because nested lists are not hashable.
The code below works for compound data one level deep:
[[1, 1], [0, 1], [0, 1], [0, 0], [1, 0], [1, 1], [1, 1]]
to:
[[1, 1], [0, 1], [0, 0], [1, 0]]
code:
def get_unique(items):
    temp = []
    for i in items:
        if i not in temp:
            temp.append(i)
            yield i
data = [[1, 1], [0, 1], [0, 1], [0, 0], [1, 0], [1, 1], [1, 1]]
print(*get_unique(data))
I've trimmed the above code to give you your desired outputs, still not in a class though, is this okay?...
def distinct_subset(user_input):
    n = len(user_input)
    output = []
    for i in range(2 ** n):
        subset = ""
        for j in range(n):
            if (i & (1 << j)) != 0:
                subset += str(user_input[j]) + ", "
        if subset[:-2] not in output and len(subset) > 0:
            output.append(subset[:-2])
    return output

def main():
    print("Here is the list of subsets: " + str(distinct_subset([1, 3, 3])))
    print("Here is the list of subsets: " + str(distinct_subset([1, 5, 3, 3])))

main()
You're looking for distinct combinations of the powerset of your list.
Using itertools to generate the combinations and a set to eliminate duplicates, you could write the function like this:
from itertools import combinations
def uniqueSubsets(A):
    A = sorted(A)
    return [*map(list,{subset for size in range(len(A)+1)
                              for subset in combinations(A,size)})]
print(uniqueSubsets([1,3,3]))
# [[1, 3], [3, 3], [1], [3], [], [1, 3, 3]]
print(uniqueSubsets([1,5,3,3]))
# [1, 3] [3, 3] [1] [3] [3, 3, 5] [1, 3, 5] [1, 5] [5] [] [1, 3, 3, 5] [1, 3, 3] [3, 5]
If you have a lot of duplicates, it may be more efficient to filter them out as you go. Here is a recursive generator function that short-circuits the expansion when a combination has already been seen. It generates combinations by removing one element at a time (starting from the full size) and recursing to get shorter combinations.
def uniqueSubsets(A,seen=None):
    if seen is None: seen,A = set(),sorted(A)
    for i in range(len(A)):                    # for each position in the list
        subset = (*A[:i],*A[i+1:])             # combination without that position
        if subset in seen: continue            # skip combinations already seen
        seen.add(subset)
        yield from uniqueSubsets(subset,seen)  # get shorter combinations
    yield list(A)
print(*uniqueSubsets([1,3,3]))
# [] [3] [3, 3] [1] [1, 3] [1, 3, 3]
print(*uniqueSubsets([1,5,3,3]))
# [] [5] [3] [3, 5] [3, 3] [3, 3, 5] [1] [1, 5] [1, 3] [1, 3, 5] [1, 3, 3] [1, 3, 3, 5]
In both cases we are sorting the list in order to ensure that the combinations will always present the values in the same order for the set() to recognize them. (otherwise lists such as [3,3,1,3] could still produce duplicates)
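A different technique from the two above avoids generating duplicates in the first place: count how often each value occurs, then choose how many copies of each value (from zero up to its count) go into a subset. This is a sketch under that assumption; unique_subsets_by_count is a hypothetical name, and the values are assumed to be hashable and sortable.

```python
from collections import Counter
from itertools import product


def unique_subsets_by_count(nums):
    """Build each distinct subset once by choosing a multiplicity per value."""
    counts = sorted(Counter(nums).items())        # [(value, occurrences), ...]
    values = [value for value, _ in counts]
    choices = [range(count + 1) for _, count in counts]
    return [
        # Repeat each value as many times as picked for this subset.
        [value for value, k in zip(values, picks) for _ in range(k)]
        for picks in product(*choices)
    ]


print(unique_subsets_by_count([1, 3, 3]))
# [[], [3], [3, 3], [1], [1, 3], [1, 3, 3]]
```

No set is needed to filter results, because each combination of multiplicities corresponds to exactly one distinct subset.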

unite lists if at least one value matches in python

Let's say I have a list of lists, for example:
[[0, 2], [0, 1], [2, 3], [4, 5, 7, 8], [6, 4]]
If at least one of the values in a list also appears in a different list, I would like to unite the two lists, so in the example the final result would be:
[[0, 1, 2, 3], [4, 5, 6, 7, 8]]
I really don't care about the order of the values inside a list: [0, 1, 2, 3] or [0, 2, 1, 3] are both fine.
I tried to do it but it doesn't work. Do you have any ideas? Thanks.
Edit (sorry for not posting the code that I tried before):
What I tried to do was the following:
for p in llista:
    for q in p:
        for k in llista:
            if p==k:
                llista.remove(k)
            else:
                for h in k:
                    if p!=k:
                        if q==h:
                            k.remove(h)
                            for t in k:
                                if t not in p:
                                    p.append(t)
llista_final = [x for x in llista if x != []]
Where llista is the list of lists.
I have to admit this is a tricky problem. I'm really curious what this problem represents and/or where you found it.
I initially thought this was just a graph connected-components problem, but I wanted to take a shortcut and avoid creating an explicit representation of the graph, running BFS, etc.
The idea of the solution is this: for every sublist, check if it has some common element with any other sublist, and replace that with their union.
Not very pythonic, but here it is:
def merge(l):
    l = list(map(tuple, l))
    for i, h in enumerate(l):
        sh = set(h)
        for j, k in enumerate(l):
            if i == j: continue
            sk = set(k)
            if sh & sk:  # h and k have some element in common
                l[j] = tuple(sh | sk)
    return list(map(list, set(l)))
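The connected-components view mentioned above can also be sketched with a small union-find (disjoint-set) structure, which is a different technique from the merge function in this answer. The names merge_overlapping, find and union are hypothetical, and the sketch assumes the items are hashable and sortable.

```python
def merge_overlapping(lists):
    """Unite sublists that share at least one element (connected components)."""
    parent = {}

    def find(x):
        # Register unseen items as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for sub in lists:
        for item in sub:
            find(item)               # make sure single-item lists are registered
        for item in sub[1:]:
            union(sub[0], item)      # everything in one sublist shares a group

    groups = {}
    for item in list(parent):
        groups.setdefault(find(item), []).append(item)
    return [sorted(group) for group in groups.values()]


print(merge_overlapping([[0, 2], [0, 1], [2, 3], [4, 5, 7, 8], [6, 4]]))
# [[0, 1, 2, 3], [4, 5, 6, 7, 8]]
```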
Here is a function that does what you want. I tried to use self-documenting variable names and comments to help you understand how this code works. As far as I can tell, the code is pythonic. I used sets to speed up and simplify some of the operations. The downside of that is that the items in your input list-of-lists must be hashable, but your example uses integers which works perfectly well.
def cliquesfromlistoflists(inputlistoflists):
    """Given a list of lists, return a new list of lists that unites
    the old lists that have at least one element in common.
    """
    listofdisjointsets = []
    for inputlist in inputlistoflists:
        # Update the list of disjoint sets using the current sublist
        inputset = set(inputlist)
        unionofsetsoverlappinginputset = inputset.copy()
        listofdisjointsetsnotoverlappinginputset = []
        for aset in listofdisjointsets:
            # Unite set if overlaps the new input set, else just store it
            if aset.isdisjoint(inputset):
                listofdisjointsetsnotoverlappinginputset.append(aset)
            else:
                unionofsetsoverlappinginputset.update(aset)
        listofdisjointsets = (listofdisjointsetsnotoverlappinginputset
                              + [unionofsetsoverlappinginputset])
    # Return the information in a list-of-lists format
    return [list(aset) for aset in listofdisjointsets]
print(cliquesfromlistoflists([[0, 2], [0, 1], [2, 3], [4, 5, 7, 8], [6, 4]]))
# printout is [[0, 1, 2, 3], [4, 5, 6, 7, 8]]
This solution modifies the generic breadth-first search to gradually diminish the initial deque and update a result list with either a combination should a match be found or a list addition if no grouping is discovered:
from collections import deque
d = deque([[0,2] , [0,1] , [2,3] , [4,5,7,8] , [6,4]])
result = [d.popleft()]
while d:
    v = d.popleft()
    result = [list(set(i+v)) if any(c in i for c in v) else i for i in result] if any(any(c in i for c in v) for i in result) else result + [v]
print(result)
Output:
[[0, 1, 2, 3], [8, 4, 5, 6, 7]]

Sum of two matrices

I ran into an exercise, which I have a problem with:
Write a function with two input parameters: M1 and M2, those are arrays: list of list of numbers. Return the sum of the matrices if they are compatible, or an empty list otherwise.
For example:
A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 1, 1], [1, 1, 1]]
matrix_sum(A, B)
You get:
[[2, 3, 4], [5, 6, 7]]
So I tried:
def matrix_sum(M1, M2):
    while len(M1)==len(M2):
        res = []
        for i in range(len(M1)):
            row = []
            for j in range(len(M1[0])):
                row.append(M1[i][j]+M2[i][j])
            res.append(row)
        return res
It works for some inputs, but the checker reported:
Test failed for
matrix_sum([[1, 2], [2, 3]], [[4, 2, 1], [1, 2, 3]])
expected output: [],
actual output: [[5, 4], [3, 5]]
How can I change it to work for this also?
Your function checks only that the quantities of rows match; it utterly ignores columns. In fact, if you reverse the mismatched arguments, your function will crash on an index error.
Add another check:
if len(M1) == len(M2) and \
len(M1[0]) == len(M2[0]):
Collect the dimensions first, check if they're valid for element-wise addition, then carry out the addition.
def matrix_sum(M1, M2):
    dim_m1, dim_n1 = len(M1), len(M1[0])
    dim_m2, dim_n2 = len(M2), len(M2[0])
    if dim_m1 != dim_m2 or dim_n1 != dim_n2:
        return []
    res = [[0 for _ in range(dim_n1)] for _ in range(dim_m1)]
    for m in range(dim_m1):
        for n in range(dim_n1):
            res[m][n] = M1[m][n] + M2[m][n]
    return res
This should work (I tested it with your examples). It checks the length of every row:
def matrix_sum(M1, M2):
    comp=True
    n=0
    for i in M1:
        if len(i)!=len(M2[n]):
            comp=False
        n+=1
    output=[]
    if comp:
        n=0
        for i in M1:
            add=[]
            m=0
            for j in i:
                add.append(j+M2[n][m])
                m+=1
            n+=1
            output.append(add)
    return output
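The same checks can be written more compactly with zip. This is a sketch rather than a drop-in replacement for any answer above: it compares the number of rows first and then the length of every pair of rows, so it also rejects ragged inputs and inputs with different row counts.

```python
def matrix_sum(m1, m2):
    """Return the element-wise sum, or [] if the shapes are not compatible."""
    same_shape = len(m1) == len(m2) and all(
        len(row1) == len(row2) for row1, row2 in zip(m1, m2)
    )
    if not same_shape:
        return []
    # Pair up rows, then pair up the elements within each pair of rows.
    return [[a + b for a, b in zip(row1, row2)] for row1, row2 in zip(m1, m2)]


A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 1, 1], [1, 1, 1]]
print(matrix_sum(A, B))                                       # [[2, 3, 4], [5, 6, 7]]
print(matrix_sum([[1, 2], [2, 3]], [[4, 2, 1], [1, 2, 3]]))   # []
```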

Search in 2D list using python to find x,y position

I have a 2D list and I need to search for the index of an element. As I am a beginner in programming, I used the following function:
def in_list(c):
    for i in xrange(0,no_classes):
        if c in classes[i]:
            return i;
    return -1
Here classes is a 2D list and no_classes denotes the number of classes, i.e. the first dimension of the list. -1 is returned when c is not in the array. Is there any way I can optimize the search?
You don't need to define no_classes yourself. Use enumerate():
def in_list(c, classes):
    for i, sublist in enumerate(classes):
        if c in sublist:
            return i
    return -1
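If both coordinates are needed, as the title suggests, the enumerate() loop can return the column as well. A minimal sketch; find_position is a hypothetical name, and returning None for a missing item is a choice made for this example rather than the -1 convention used above.

```python
def find_position(item, grid):
    """Return (row, column) of the first occurrence of item, or None."""
    for row_index, row in enumerate(grid):
        try:
            # list.index raises ValueError when the item is not in this row.
            return row_index, row.index(item)
        except ValueError:
            continue
    return None


grid = [[1, 2], [3, 4, 5]]
print(find_position(5, grid))  # (1, 2)
print(find_position(9, grid))  # None
```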
Use list.index(item):
a = [[1,2],[3,4,5]]
def in_list(item,L):
    for i in L:
        if item in i:
            return L.index(i)
    return -1
print(in_list(3,a))
# prints 1
If order doesn't matter and you have no duplicates in your data, I suggest turning your 2D list into a list of sets:
>>> l = [[1, 2, 4], [6, 7, 8], [9, 5, 10]]
>>> l = [set(x) for x in l]
>>> l
[set([1, 2, 4]), set([8, 6, 7]), set([9, 10, 5])]
After that, your original function will run faster, because membership testing in a set takes constant time on average (while searching a list is linear), so your algorithm becomes O(N) instead of O(N^2).
Note that you should not do this conversion inside your function, or it would be repeated every time the function is called.
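Taking the same idea one step further, a dictionary built once can map each value directly to its row, so a lookup does not need to loop over the rows at all. This is a sketch under the same assumptions as above (hashable values, conversion done once outside the search); build_index is a hypothetical name.

```python
def build_index(classes):
    """Map each value to the index of the first sublist that contains it."""
    index = {}
    for row, sublist in enumerate(classes):
        for value in sublist:
            index.setdefault(value, row)  # keep the first row if a value repeats
    return index


classes = [[1, 2, 4], [6, 7, 8], [9, 5, 10]]
index = build_index(classes)
print(index.get(7, -1))  # 1
print(index.get(3, -1))  # -1
```

The trade-off is extra memory for the dictionary and the need to rebuild it whenever the 2D list changes.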
If your "2D" list is rectangular (same number of columns for each line), you should convert it to a numpy.ndarray and use numpy functionalities to do the search. For an array of integers, you can use == for comparison. For an array of float numbers, you should use np.isclose instead:
a = np.array(c, dtype=int)
i,j = np.where(a == element)
or
a = np.array(c, dtype=float)
i,j = np.where(np.isclose(a, element))
such that i and j contain the line and column indices, respectively.
Example:
import numpy as np
a = np.array([[1, 2],
              [3, 4],
              [2, 6]], dtype=float)
i, j = np.where(np.isclose(a, 2))
print(i)
#array([0, 2])
print(j)
#array([1, 0])
using list comprehension: (2D list to 1D)
a = [[1,22],[333,55555,6666666]]
d1 = [x for b in a for x in b]
print(d1)
