Find index of an element in list using wildcards - python

I have a list like this:
a = [['1','2','3','a','b'],
['4','5','6','c','d'],
['7','8','9','e','f']]
I want to identify the index of an element in the list a based on first three sub-elements of the element. For example, the index of the element which contains ['4','5','6'] as its first three sub-elements is 1.
I have tried to do this using a list comprehension as follows:
ind = [i for i in range(0,len(a)) if (a[i][0] == '4' and a[i][1] == '5' and a[i][2] == '6')]
But this is computationally expensive, as I have to run this code many times inside a for loop. So I am looking for a less computationally expensive method. So far I have tried 'list.index' as follows:
ind = a.index(['4','5','6','*','*'])
where '*' is used as a wildcard string. But it does not work, as it outputs:
['4', '5', '6', '.*', '.*'] is not in list
I think there is something wrong with the way I am using the wildcard. Can you please tell me what it is? Or is there another fast way to identify an element of a list based on its sub-elements?

Solution 1: True wildcard
You can simply use a.index(...) if you use a true wildcard:
class Wildcard:
    def __eq__(self, anything):
        return True
a = [['1','2','3','a','b'],
['4','5','6','c','d'],
['7','8','9','e','f']]
wc = Wildcard()
print(a.index(['4', '5', '6', wc, wc]))
Outputs 1.
This might be fast because 1) it does the searching in C code and 2) it does minimal work, as it for example doesn't create a list slice for every row and might often rule out a row simply by looking at just the first value.
Solution 2: operator.indexOf
Or use operator.indexOf, which finds the index in C rather than in Python like an enumerate solution would. Here are two versions of that; I suspect the one mapping a ready-to-go slice is faster:
from operator import indexOf, itemgetter
a = [['1','2','3','a','b'],
['4','5','6','c','d'],
['7','8','9','e','f']]
print(indexOf((r[:3] for r in a), ['4', '5', '6']))
print(indexOf(map(itemgetter(slice(3)), a), ['4', '5', '6']))

Well you could transpose and slice and transpose back and finally index, like this:
>>> list(zip(*list(zip(*a))[:3])).index(('4', '5', '6'))
1
>>>
But what's wrong with this?
>>> [x[:3] for x in a].index(['4', '5', '6'])
1
>>>

It may be more efficient to implement it as a generator. In this way, if e.g. you know that you can get at most one match you can stop once it is found:
a = [['1','2','3','a','b'],
['4','5','6','c','d'],
['7','8','9','e','f']]
pattern = ['4','5','6']
def find_index(data, pattern):
    # Iterate over the data argument (not the global list a)
    for n, elt in enumerate(data):
        if elt[:3] == pattern:
            yield n
indices = find_index(a, pattern)
next(indices)
It gives:
1

Here's a fairly simple and straightforward way of accomplishing this:
def findListMatch(lists, match):
    # Loop over lists with indices
    for k, sublist in enumerate(lists):
        # Check if the beginning of the list matches the desired start pattern
        if sublist[0:len(match)] == match:
            # If it's a match, return the index
            return k
    # If none of the lists match, return a placeholder value of "None"
    return None
a = [['1','2','3','a','b'],
['4','5','6','c','d'],
['7','8','9','e','f']]
matchIndex = findListMatch(a, ['4', '5', '6'])
# Result:
# matchIndex = 1

@U12-Forward's answer works but unnecessarily builds a temporary list of the same size as the input list before applying the index method, which can be quite an overhead if the list is long. A more efficient approach would be to use enumerate to generate the indices while comparing the first 3 items to the desired list:
next(i for i, l in enumerate(a) if l[:3] == ['4', '5', '6'])
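Since the question mentions repeating the search many times inside a loop, another option is to build a lookup table once and reuse it for every query. This is only a minimal sketch, assuming the list does not change between lookups and that the first matching row should win; build_prefix_index is a hypothetical helper name, not something taken from the answers above.

```python
def build_prefix_index(rows, width=3):
    """Map each row's first `width` items (as a tuple) to the row's index."""
    index = {}
    for i, row in enumerate(rows):
        # setdefault keeps the first index seen when a prefix repeats
        index.setdefault(tuple(row[:width]), i)
    return index

a = [['1', '2', '3', 'a', 'b'],
     ['4', '5', '6', 'c', 'd'],
     ['7', '8', '9', 'e', 'f']]

prefix_index = build_prefix_index(a)        # one pass over a
print(prefix_index[('4', '5', '6')])        # 1
print(prefix_index.get(('0', '0', '0')))    # None: no row has this prefix
```

Building the dictionary costs one pass over the list; each later lookup is a single hash lookup instead of a scan.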


Select level above in iteratively created nested list with Python

I have trawled through so many questions on lists and I can't find what I'm looking for.
I have a string with braces wrapping some values and some other nested pairs of braces containing values. I don't know the deepest level the structure is nested, but it might look something like this:
{121{12}12{211}2}
I want to iterate over this string and transfer it to a nested list in a way similar to the following pseudocode:
for i in thestring
    if there is a leftbrace { start a new list inside the current list
    elseif there is a rightbrace } close the current list and select the list on the level above
    else add i to currently selected list
I have no idea how to go up a list level and close the current sublist.
def listify(s):
    i = iter(s)
    l = []
    for c in i:
        if c == '{':
            l.append(listify(i))
        elif c == '}':
            return l
        else:
            l.append(c)
    return l[0]
The trick is to use a recursive function so that the stack takes care of keeping track which list you're in. We use iter so that we can pass the string to recursive calls without the letters we have already processed.
>>> s = '{121{12}12{211}2}'
>>> listify(s)
['1', '2', '1', ['1', '2'], '1', '2', ['2', '1', '1'], '2']
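The "select the list on the level above" step from the question's pseudocode can also be written without recursion by keeping an explicit stack of open lists. This is a sketch under the assumptions that the braces are balanced and that the whole string is wrapped in one outer pair; listify_iterative is a hypothetical name.

```python
def listify_iterative(s):
    """Parse a brace-nested string into nested lists using an explicit stack."""
    root = []
    stack = [root]                      # stack[-1] is the currently selected list
    for c in s:
        if c == '{':
            new_list = []
            stack[-1].append(new_list)  # start a new list inside the current one
            stack.append(new_list)      # and make it the selected list
        elif c == '}':
            stack.pop()                 # close it; the level above is selected again
        else:
            stack[-1].append(c)
    return root[0]                      # unwrap the single outer list

print(listify_iterative('{121{12}12{211}2}'))
# ['1', '2', '1', ['1', '2'], '1', '2', ['2', '1', '1'], '2']
```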
Here is one way using regex and ast.literal_eval to create a set of tuples:
In [77]: import ast, re
In [78]: s = "{121{12}12{211}2}"
In [79]: new = re.sub(r'(})(?!$)|(?<!^)({)',lambda x : {'{': ',(', '}': ',),'}[x.group(0)], s)
In [80]: ast.literal_eval(new)
Out[80]: {121, (211,), 2, 12, (12,)}

Comparing all elements of 2 lists with Python 2

I have 2 lists: a = ['5', '2', '3', '4'], and b = ['1', '6', '7', '5']. Using Python 2, how can I compare each list element in a to each element in b? (i.e. is a[0] == b[0], is a[0] == b[1], etc).
I know that I could just write out numerous if statements, but I hope that there is a more elegant way to do this.
After checking each list element, I want to know how many times a shared value was found (in my example lists above, it would be one time, '5').
EDIT: This is not a duplicate, b/c i am comparing two different lists to each other, while the possible duplicate dealt with only 1 list.
The count() method of list may help:
>>> a = ['5', '2', '3', '4']
>>> b = ['1', '6', '7', '5']
>>> for item in a:
...     print item, b.count(item)
...
5 1
2 0
3 0
4 0
Probably faster for big inputs than eugene y's, as it only needs to iterate over b once instead of len(a) times:
from collections import Counter
counts = Counter(b)
for i in a:
    print(i, counts[i])
If you are only concerned with shared values, and not with their positions or counts, convert them to set and use intersection:
>>> a = ['5','2','3','4']
>>> b = ['1','6','7','5']
>>> set(a).intersection(b)
{'5'}
If you want to retain how often the elements appear in the intersection, you can also intersect two collections.Counter objects using &:
>>> import collections
>>> a = ['5','2','3','4','1','1','6','5']
>>> b = ['1','6','7','5','5']
>>> collections.Counter(a) & collections.Counter(b)
Counter({'5': 2, '1': 1, '6': 1})
Note: This is different from the solution by @GingerPlusPlus in that it is symmetric, i.e. if 5 is present once in list a and twice in list b, then the shared count will be 1, not 2.
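The difference described in the note can be seen on a tiny input. A minimal sketch, using made-up lists chosen so that '5' appears once in a and twice in b:

```python
from collections import Counter

a = ['5']
b = ['5', '5']

# Asymmetric: for each item of a, how often it occurs in b
counts_in_b = Counter(b)
print([(item, counts_in_b[item]) for item in a])   # [('5', 2)]

# Symmetric: the multiset intersection keeps the smaller count per value
shared = Counter(a) & Counter(b)
print(shared)                  # Counter({'5': 1})
print(sum(shared.values()))    # 1 shared occurrence in total
```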
def cmp(*lists):
    lists_len_min = list(map(lambda x: len(x), lists))
    if min(lists_len_min) != max(lists_len_min):
        raise Exception("Lists must have equal length")
    iterator = iter(lists)
    last = next(iterator)
    # Compare every remaining list, element by element, against the first one
    for element in iterator:
        for i, each in enumerate(element):
            #print(i, last[i], each)
            if last[i] != each:
                return False
    # No mismatch was found in any list
    return True
This function can compare as many lists as you want, provided they have equal length. Just call cmp(list1, list2, list3).
This code produces a list of the elements that appear in both a and b:
a = [1,2,3,4]
b = [2,3,1,7]
c = [e for e in a if e in b]
It may use a lot of memory if the lists are big, but if you plan to use the resulting data anyway, why not?
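For large lists the running time is likely the bigger concern, because e in b scans b once for every element of a. A small sketch of the same comprehension with a set for the membership test, assuming the elements are hashable:

```python
a = [1, 2, 3, 4]
b = [2, 3, 1, 7]

b_set = set(b)                      # built once; membership tests are O(1) on average
c = [e for e in a if e in b_set]    # keeps the order (and any duplicates) of a
print(c)                            # [1, 2, 3]
```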

Include empty values in a list according to specific positions (Python)

I have the following list:
CompleteList=['00:00:00', '00:00:01', '00:00:02', '00:00:03',....,'23:59:59']
and I also have the following list:
IncompleteList=['00:00:00', '00:00:01', '00:00:03',....,'23:59:59']
As can be seen, CompleteList has values that are missing from IncompleteList, for example '00:00:02'.
I also have a third array:
MyList=['22', '33', '25',....,'13']
What I need is to insert empty values into MyList at the positions where IncompleteList has missing values, in the following way:
MyList_result=['22', '33','','25',....,'13']
I have achieved this in the following way:
MyList_result=[]
for item in CompleteList:
    if item in IncompleteList:
        ind=IncompleteList.index(item)
        v=MyList[ind]
        MyList_result.append(v)
    else:
        v=''
        MyList_result.append(v)
This works but it takes too long taking into account the size of the lists that I am working with. I really need to find a more efficient way of doing it. Any help will be appreciated.
The first intuitive approach would be to convert IncompleteList to a set and get an iterator over MyList. Iterating over CompleteList is then a linear operation: emit the next item from the MyList iterator if the element from CompleteList is present in IncompleteList, otherwise (as in your example) an empty string.
Sample Code
IncompleteList=['00:00:00', '00:00:01', '00:00:03','23:59:59']
IncompleteSet = set(IncompleteList)
MyList=['22', '33', '25','13']
CompleteList=['00:00:00', '00:00:01', '00:00:02', '00:00:03','23:59:59']
MyListIt = iter(MyList)
[next(MyListIt) if cl_elem in IncompleteSet else '' for cl_elem in CompleteList]
Sample Output
Out[100]: ['22', '33', '', '25', '13']
Alternatively, you can zip IncompleteList and MyList and convert the pairs to a dictionary. Then iterate over CompleteList and emit the corresponding value from the dictionary if the element is present, otherwise an empty string.
MyDict = dict(zip(IncompleteList, MyList))
[MyDict.get(k, '') for k in CompleteList]
Out[108]: ['22', '33', '', '25', '13']
The bottleneck in your implementation is in two places:
1. You check whether each item from CompleteList is in IncompleteList at
       if item in IncompleteList:
   which in the worst case scans IncompleteList n times (where n is the number of elements in CompleteList).
2. If the item is present, you find its index at
       ind = IncompleteList.index(item)
   which involves another scan of IncompleteList.
The first solution suggested by @Abhijit solves the second problem, since you do not have to scan the list a second time to get the index. However, the check for the presence of the item in the IncompleteList/IncompleteSet is still a bottleneck.
If we can assume sorted lists then the following solution will be faster although a little more complex:
MyList_result = []
incomplete_list_index = 0
incomplete_list_length = len(IncompleteList)
for item in CompleteList:
    if incomplete_list_index < incomplete_list_length and IncompleteList[incomplete_list_index] == item:
        MyList_result.append(MyList[incomplete_list_index])
        incomplete_list_index += 1
    else:
        MyList_result.append('')
This involves just a single pass of the CompleteList (and no pre-processing to generate a dict, as in the second solution suggested by @Abhijit).
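As a quick sanity check, the single-pass idea can be wrapped in a function and compared with the dictionary approach on the small sample data. This is a sketch only: fill_sorted is a hypothetical name, and it assumes both lists are sorted in the same order and that every entry of the incomplete list also occurs in the complete one.

```python
def fill_sorted(complete, incomplete, values):
    """Single pass over `complete`, advancing a cursor into `incomplete`."""
    result = []
    j = 0                                   # cursor into incomplete / values
    for item in complete:
        if j < len(incomplete) and incomplete[j] == item:
            result.append(values[j])
            j += 1
        else:
            result.append('')               # item is missing: pad with ''
    return result

complete = ['00:00:00', '00:00:01', '00:00:02', '00:00:03', '23:59:59']
incomplete = ['00:00:00', '00:00:01', '00:00:03', '23:59:59']
values = ['22', '33', '25', '13']

by_cursor = fill_sorted(complete, incomplete, values)
lookup = dict(zip(incomplete, values))
by_dict = [lookup.get(k, '') for k in complete]

print(by_cursor)              # ['22', '33', '', '25', '13']
print(by_cursor == by_dict)   # True
```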

How to find length of a multi-dimensional list?

How do you find the length of a multi-dimensional list?
I've come up with a way myself, but is this the only way to find the number of values in a multi-dimensional list?
multilist = [['1', '2', 'Ham', '4'], ['5', 'ABCD', 'Foo'], ['Bar', 'Lu', 'Shou']]
counter = 0
for minilist in multilist:
    for value in minilist:
        counter += 1
print(counter)
I'm pretty sure there is a much simpler way to find the length of a multi-dimensional list, but len(list) does not work, as it only gives the number of lists inside. Is there a more efficient method than this?
How about:
sum(len(x) for x in multilist)
Alternative to @mgilson's solution:
sum(map(len, multilist))
If you want the number of items in any n-dimensional list then you need to use a recursive function like this:
def List_Amount(List):
    return _List_Amount(List)
def _List_Amount(List):
    counter = 0
    if isinstance(List, list):
        for l in List:
            c = _List_Amount(l)
            counter+=c
        return counter
    else:
        return 1
This will return the number of items in the list no matter the shape or size of your list
Another alternative (it's this or watch missed math lectures...)
def getLength(element):
    if isinstance(element, list):
        return sum([getLength(i) for i in element])
    return 1
This allows different degrees of 'multi-dimensionality' (if that were a word) to coexist.
eg:
>>> getLength([[1,2],3,4])
4
Or, to allow different collection types
def getLength(element):
    try:
        element.__iter__
        return sum([getLength(i) for i in element])
    except:
        return 1
eg:
>>> getLength([ 1, 2, (1,3,4), {4:3} ])
6
>>> getLength(["cat","dog"])
2
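(Noting that in Python 2, although strings are iterable, they do not have the __iter__ method-wrapper and so will not cause any issues. In Python 3, str does define __iter__, so this version would recurse forever on strings unless they are excluded explicitly.)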
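A version of the same idea for Python 3 could test for iterability explicitly and treat strings as single items. This is a sketch, not the answer's original code; get_length is a hypothetical name, and counting str and bytes as leaves is an assumption about the desired behaviour.

```python
from collections.abc import Iterable

def get_length(element):
    """Count leaf items, treating str and bytes as single items."""
    if isinstance(element, Iterable) and not isinstance(element, (str, bytes)):
        return sum(get_length(i) for i in element)
    return 1

print(get_length([1, 2, (1, 3, 4), {4: 3}]))   # 6 (a dict iterates over its keys)
print(get_length(["cat", "dog"]))              # 2
```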

Python for loop isn't iterating '0' from a list

Python for loop isn't iterating '0' from a list!
I tried to write code to separate an input into numbers and letters (or operators):
g='10+10+20x'
t=[]
for each_g in g:
    t.append(each_g)
lol=[]
a=[]
for each_t in t:
    if each_t.isdigit():
        lol.append(each_t)
        x = t.index(each_t)
        t.pop(x)
    else:
        lol = ''.join(lol)
        a.append(lol)
        a.append(each_t)
        lol=[]
print(a)
The desired output would be:
['10', '+', '10', '+', '20', 'x']
but it prints
['1', '+', '1', '+', '2', 'x']
instead.
Are there any problems with the code, or is there a better solution to make it work as expected?
Thanks.
Don't modify a sequence that you're iterating over. Each pop is shifting a character down before you can process it.
In this case since you're not using t when you're done, there's no need for the pop at all - it's redundant.
Here's an alternative approach (as your code already has been thoroughly discussed by others):
In [38]: import re
In [39]: g='10+10+20x'
In [40]: re.findall(r'(\d+|[a-zA-Z]+|\+)',g)
Out[40]: ['10', '+', '10', '+', '20', 'x']
Editing a list (or any other iterable) while you're iterating over it is a horrible idea.
Here's how iterating over a list works:
when you say for each_t in t, what's actually happening is that a sequence of numbers is generated in turn; first 0, then 1 and so forth, all the way up to len(t)-1. The iterator then gets the element at that index and assigns it to each_t.
When you pop from that list, you are eliminating an index. Therefore, what used to be at index 3 is now at index 2 (if you popped something at an index less than 3). Then what happens is that the iterator accesses the element at the next number. But since the elements have essentially shifted down an index, when the iterator asks for the element at index i, it means "give me the element that used to be at index i". Instead, what it gets is "the element currently at index i, which is indeed the element that used to be at index i+1".
This is exactly why you skip over elements when you delete from a list as you're iterating over it.
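The skipping can be reproduced on a short list, and the original task can be done without mutating anything. A minimal sketch: the first part uses a made-up four-character input, and the second uses itertools.groupby as one possible alternative (note that it would also merge adjacent non-digit characters such as '+x' into a single token).

```python
from itertools import groupby

# Popping while iterating: each pop shifts the next item into the slot
# the loop has already visited, so that item is never seen.
t = list('1020')
seen = []
for ch in t:
    seen.append(ch)
    t.pop(t.index(ch))
print(seen)     # ['1', '2'] - both '0' characters were skipped

# Splitting into runs of digits and non-digits without modifying the input
g = '10+10+20x'
tokens = [''.join(group) for _, group in groupby(g, key=str.isdigit)]
print(tokens)   # ['10', '+', '10', '+', '20', 'x']
```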
