Remove all items of same name from list of sublists - python

I'm trying to remove all instances of a certain string from a list which contains sublists. For example, something like this:
myarray = ['a', 'a', ['b', 'a', 'a'], ['a', 'c', 'd', 'a'], 'a', ['a', 'd']]
ends up like this
mylist = [['b'],['c','d'],['d']]
after removing all the instances of 'a'.
I have used this code:
def delnodata(lst, what):
    for index, item in enumerate(lst):
        if type(item) == list:
            delnodata(item, what)
        else:
            if item == what:
                lst.remove(item)
delnodata(mylist, 'a')
but the output is:
[['b', 'a'], ['c', 'd'], 'a', ['a', 'd']]
I've seen a lot of similar questions on this site, but unfortunately my programming skills aren't good enough to put this together myself!

I'd do this recursively. This will also work for an arbitrary nesting level.
myarray = ['a', 'a', ['b', 'a', 'a'], ['a', 'c', 'd', 'a'], 'a', ['a', 'd']]
def nestremove(lst, what):
    new = []
    for item in lst:
        if isinstance(item,list):
            new.append(nestremove(item,what))
        elif item != what:
            new.append(item)
    return new
print(myarray)
myarray = nestremove(myarray, 'a')
print(myarray)
The function returns a new list, so we don't have to remove items from the original list while iterating over it, which as others have already pointed out can be dangerous (see this question, especially the comments). Instead, you can just reassign myarray.
output:
['a', 'a', ['b', 'a', 'a'], ['a', 'c', 'd', 'a'], 'a', ['a', 'd']]
[['b'], ['c', 'd'], ['d']]
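To illustrate why removing items while iterating is risky, here is a minimal sketch using a flat list. The names `items` and `safe` are hypothetical and only serve the demonstration. Each removal shifts the remaining elements left, so the loop's internal index skips over them.

```python
# Removing while iterating: the iterator's position advances even though
# the list has shrunk, so some elements are never visited.
items = ['a', 'a', 'b', 'a', 'a']
for item in items:
    if item == 'a':
        items.remove(item)
print(items)  # ['b', 'a', 'a'] -- two 'a's survive

# Iterating over a copy visits every original element.
safe = ['a', 'a', 'b', 'a', 'a']
for item in safe[:]:
    if item == 'a':
        safe.remove(item)
print(safe)  # ['b']
```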

First, use for index, item in enumerate(lst[:]) so that the loop iterates over a copy of lst. Second, call delnodata(myarray, 'a'), not delnodata(mylist, 'a') as you have it.
myarray = ['a', 'a', ['b', 'a', 'a'], ['a', 'c', 'd', 'a'], 'a', ['a', 'd']]
def delnodata(lst, what):
    for index, item in enumerate(lst[:]):
        if type(item) == list: # This if statement is optional
            delnodata(item, what)
        else:
            if item == what:
                lst.remove(item)
    print(lst)
delnodata(myarray, 'a')

The following code returns a new list, with 'a's removed. It doesn't modify lists in place, which can cause mysterious problems.
source
myarray = ['a', 'a', ['b', 'a', 'a'], ['a', 'c', 'd', 'a'], 'a', ['a', 'd']]
def testme(data):
    for item in data:
        if type(item) is list:
            yield list( testme(item) )
        elif item != 'a':
            yield item
res = list( testme(myarray) )
print(res)
assert res==[['b'],['c','d'],['d']], res
output
[['b'], ['c', 'd'], ['d']]

Removing one item at a time from a list is very inefficient, because all the following items need to be shifted to fill the gap each time. You should just create a new filtered list instead:
>>> def remover(data, r):
...     return [remover(x, r) if isinstance(x, list) else x for x in data if x != r]
...
>>> myarray = ['a', 'a', ['b', 'a', 'a'], ['a', 'c', 'd', 'a'], 'a', ['a', 'd']]
>>> remover(myarray, 'a')
[['b'], ['c', 'd'], ['d']]
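If other code holds references to the original list objects and an in-place update is genuinely needed, one possible sketch is to rebuild each list's contents with slice assignment instead of calling remove() in a loop. The function name `remove_in_place` is hypothetical and not part of any answer above.

```python
def remove_in_place(lst, what):
    """Remove every occurrence of `what` from `lst` and its nested lists,
    keeping the same list objects."""
    # Clean the nested lists first; this loop does not change lst's length.
    for item in lst:
        if isinstance(item, list):
            remove_in_place(item, what)
    # Slice assignment replaces the contents without creating a new outer list.
    lst[:] = [item for item in lst if item != what]

myarray = ['a', 'a', ['b', 'a', 'a'], ['a', 'c', 'd', 'a'], 'a', ['a', 'd']]
alias = myarray  # a second reference to the same list
remove_in_place(myarray, 'a')
print(alias)  # [['b'], ['c', 'd'], ['d']]
```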

Related

How to efficiently split a list that has a certain periodicity, into multiple lists?

For example the original list:
['k','a','b','c','a','d','e','a','b','e','f','j','a','c','a','b']
We want to split the list into lists started with 'a' and ended with 'a', like the following:
['a','b','c','a']
['a','d','e','a']
['a','b','e','f','j','a']
['a','c','a']
The final output can also be a list of lists. I have tried a double for loop approach with 'a' as the condition, but this is inefficient and not Pythonic.
One possible solution is using re (regex)
import re
l = ['k','a','b','c','a','d','e','a','b','e','f','j','a','c','a','b']
r = [list(f"a{_}a") for _ in re.findall("(?<=a)[^a]+(?=a)", "".join(l))]
print(r)
# [['a', 'b', 'c', 'a'], ['a', 'd', 'e', 'a'], ['a', 'b', 'e', 'f', 'j', 'a'], ['a', 'c', 'a']]
You can do this in one loop:
lst = ['k','a','b','c','a','d','e','a','b','e','f','j','a','c','a','b']
out = [[]]
for i in lst:
    if i == 'a':
        out[-1].append(i)
        out.append([])
    out[-1].append(i)
out = out[1:] if out[-1][-1] == 'a' else out[1:-1]
Also using numpy.split:
import numpy as np
out = [ary.tolist() + ['a'] for ary in np.split(lst, np.where(np.array(lst) == 'a')[0])[1:-1]]
Output:
[['a', 'b', 'c', 'a'], ['a', 'd', 'e', 'a'], ['a', 'b', 'e', 'f', 'j', 'a'], ['a', 'c', 'a']]
Firstly you can store the indices of 'a' from the list.
oList = ['k','a','b','c','a','d','e','a','b','e','f','j','a','c','a','b']
idx_a = list()
for idx, char in enumerate(oList):
    if char == 'a':
        idx_a.append(idx)
Then, for every pair of consecutive indices, you can take the sub-list between them and store it in a list. The range stops one short of the end so that idx_a[x + 1] stays in bounds:
ans = [oList[idx_a[x]:idx_a[x + 1] + 1] for x in range(len(idx_a) - 1)]
You can also get longer sub-lists by pairing indices that are not adjacent.
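The index-pairing idea above can also be sketched with zip, which pairs each marker position with the next one and avoids manual index arithmetic. The function name `split_between` is hypothetical. Unlike the string-joining approaches, this variant does not assume that every element is a single character.

```python
def split_between(seq, marker):
    """Return the slices of seq that start and end at consecutive markers."""
    positions = [i for i, value in enumerate(seq) if value == marker]
    # zip(positions, positions[1:]) yields (start, end) for each adjacent pair.
    return [seq[start:end + 1] for start, end in zip(positions, positions[1:])]

data = ['k','a','b','c','a','d','e','a','b','e','f','j','a','c','a','b']
parts = split_between(data, 'a')
for part in parts:
    print(part)
```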
You can do this with a single iteration and a simple state machine:
original_list = list('kabcadeabefjacab')
multiple_lists = []
for c in original_list:
    if multiple_lists:
        multiple_lists[-1].append(c)
    if c == 'a':
        multiple_lists.append([c])
if multiple_lists[-1][-1] != 'a':
    multiple_lists.pop()
print(multiple_lists)
[['a', 'b', 'c', 'a'], ['a', 'd', 'e', 'a'], ['a', 'b', 'e', 'f', 'j', 'a'], ['a', 'c', 'a']]
We can use str.split() to split the list once we str.join() it into a string, and then use an f-string to add back the stripped "a"s. Note that even if the list starts or ends with an "a", the split list will have an empty string representing the substring before or after the split, so our unpacking logic that discards the first and last subsequences will still work as intended.
def split(data):
    _, *subseqs, _ = "".join(data).split("a")
    return [list(f"a{seq}a") for seq in subseqs]
Output:
>>> from pprint import pprint
>>> testdata = ['k','a','b','c','a','d','e','a','b','e','f','j','a','c','a','b']
>>> pprint(split(testdata))
[['a', 'b', 'c', 'a'],
['a', 'd', 'e', 'a'],
['a', 'b', 'e', 'f', 'j', 'a'],
['a', 'c', 'a']]

List of indices of tuples of tuples that contain certain tuples

I have a list list1 of 3 sublists of tuples like
[[(['A', 'B', 'A'], ['B', 'O', 'A']),
(['A', 'B', 'A'], ['B', 'A', 'O']),
(['A', 'B', 'O'], ['B', 'O', 'A']),
(['A', 'B', 'O'], ['B', 'A', 'O']),
(['A', 'B', 'A'], ['B', 'O', 'A']),
(['A', 'B', 'A'], ['B', 'A', 'O'])],
[(['A', 'B', 'A'], ['B', 'A', 'A']),
(['A', 'B', 'O'], ['B', 'A', 'A']),
(['A', 'B', 'A'], ['B', 'A', 'A'])],
[['A', 'B', 'A'], ['A', 'B', 'O']],
[['A', 'B', 'B']],
[['B', 'A', 'A']]]
Assume list2 = ['A', 'B', 'A']. My goal is to obtain a list of indices of any pairs of tuples (or a singleton set of tuple) in list1 that contain the tuple list2. I tried to use the enumerate function as follows but the result is not correct
print([i for i, j in enumerate(bigset) if ['A', 'B', 'A'] in j[0] or
['A', 'B', 'A'] == j[0] or [['A', 'B', 'A']] in j[0]])
Can anyone please help me with this problem? I'm quite stuck due to the mismatch in the different sizes of tuples of tuples appearing in list1.
Another question I have is: I want to find the total number of 3-element lists in list1. So if I do it by hand, the answer is 22. But how to do it in code? I guess we need to use two for loops?
Expected Output For list1 above with the given list2, we would get the list of indices containing list2 is [0,1,5,6,7,9,10].
Ok, so here you go
This uses recursion because we don't know the depth of your list1, so the indices are counted like this:
0,1
2,3,4,
6,7
8,
9,10,11,12
etc. (the same order you would get by writing everything on one row)
Here the result will be:
[0, 2, 8, 10, 12, 16, 18]
Now the code
def foo(l,ref):
    global s
    global indexes
    for items in l: #if it's an element of 3 letters
        if len(items)== 3 and len(items[0])==1:
            if items == ref:
                indexes.append(s) #save its index if it matches the ref
            s+= 1 #next index
        else: #We need to go deeper
            foo(items,ref)
    return(s)
list1 = [[(['A', 'B', 'A'], ['B', 'O', 'A']),
(['A', 'B', 'A'], ['B', 'A', 'O']),
(['A', 'B', 'O'], ['B', 'O', 'A']),
(['A', 'B', 'O'], ['B', 'A', 'O']),
(['A', 'B', 'A'], ['B', 'O', 'A']),
(['A', 'B', 'A'], ['B', 'A', 'O'])],
[(['A', 'B', 'A'], ['B', 'A', 'A']),
(['A', 'B', 'O'], ['B', 'A', 'A']),
(['A', 'B', 'A'], ['B', 'A', 'A'])],
[['A', 'B', 'A'], ['A', 'B', 'O']],
[['A', 'B', 'B']],
[['B', 'A', 'A']]]
list2 = ['A', 'B', 'A']
indexes = []
s=0
count= foo(list1,list2)
print(indexes)
s is the index we are currently working on.
count is the total number of 3-element lists (22).
indexes is the list of indices you want.
This works even if you make a list3 = [list1,list1,[list1,[list1],list1]]; you may want to try it.
Good luck finishing your script.
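The same traversal can be sketched without global variables by using a generator that yields every innermost list of strings in order. This is only one possible variant, not the answerer's code; the names `iter_leaves`, `leaves` and `matches` are hypothetical, and the sketch assumes the leaves are non-empty lists or tuples containing only strings.

```python
def iter_leaves(node):
    """Yield each innermost list of strings, left to right, at any depth."""
    if node and all(isinstance(x, str) for x in node):
        yield list(node)
    else:
        for child in node:
            # Only descend into containers; stray strings are ignored.
            if isinstance(child, (list, tuple)):
                yield from iter_leaves(child)

list1 = [[(['A', 'B', 'A'], ['B', 'O', 'A']),
          (['A', 'B', 'A'], ['B', 'A', 'O']),
          (['A', 'B', 'O'], ['B', 'O', 'A']),
          (['A', 'B', 'O'], ['B', 'A', 'O']),
          (['A', 'B', 'A'], ['B', 'O', 'A']),
          (['A', 'B', 'A'], ['B', 'A', 'O'])],
         [(['A', 'B', 'A'], ['B', 'A', 'A']),
          (['A', 'B', 'O'], ['B', 'A', 'A']),
          (['A', 'B', 'A'], ['B', 'A', 'A'])],
         [['A', 'B', 'A'], ['A', 'B', 'O']],
         [['A', 'B', 'B']],
         [['B', 'A', 'A']]]
list2 = ['A', 'B', 'A']

leaves = list(iter_leaves(list1))
matches = [i for i, leaf in enumerate(leaves) if leaf == list2]
print(len(leaves))  # 22
print(matches)      # [0, 2, 8, 10, 12, 16, 18]
```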
Would it work for your implementation if we sort out your list1 into a more friendly format first? If so, you could do that in a pretty simple way:
Go through each element of list1, if the element is itself a big list of tuples, then we want to unpack further. If the element is a tuple (so the first element of that tuple is a list), or it is itself one of your 3-element lists, then we just want to append that as it is.
nice_list = []
for i in list1:
    if type(i[0]) == str or type(i[0]) == list:
        # i.e. i is one of your 3-element lists, or a tuple of lists
        nice_list.append(i)
    else:
        #If i is still a big list of other tuples, we want to unpack further
        for j in i:
            nice_list.append(j)
Then you could search for the indices much more easily:
for i, idx in zip(nice_list, range(len(nice_list))):
    if ['A', 'B', 'A'] in i:
        print(idx) #Or append them to a list, whatever you wanted to do
For a not-particularly-elegant solution to your question about finding how many 3-element lists there are, yes you could use a for loop:
no_of_lists = 0
for n in nice_list:
    if type(n) == tuple:
        no_of_lists += len(n)
    elif type(n) == list and type(n[0]) == list:
        # if it is a list of lists
        no_of_lists += len(n)
    elif type(n) == list and type(n[0]) == str:
        #if it is a 3-element list
        no_of_lists += 1
print('Number of 3-element lists contained:', no_of_lists)
Edit: to answer the question you asked in the comments about how the for n in nice_list part works, this just iterates through each element of the list. To explore this, try writing some code to print out nice_list[0], nice_list[1] etc, or a for loop which prints out each n so you can see what that looks like. For example, you could do:
for n in nice_list:
    print(n)
to understand how that's working.
Slightly unconventional approach, due to unknown depth, and/or lack of known array flattening operation - I would try with regex:
import re
def getPos(el, arr):
    el=re.escape(str(el))
    el=rf"(\({el})|({el}\))"
    i=0
    for s in re.finditer(r"\([^\)]+\)", str(arr)):
        if(re.match(el,s.group(0))):
            yield i
        i+=1
Which yields:
>>> print(list(getPos(list2, list1)))
[0, 1, 4, 5, 6, 8, 9]
(Which I believe is the actual result you want).

Iterating over two lists A and B

I am trying to iterate over two lists, A and B, where B is equal to A with the element A[i] removed, for each index i.
For example, with listA = ['A', 'B', 'C', 'D']:
For the first item, 'A', in list A, I want list B to be ['B', 'C', 'D']. For the second item, 'B', in list A, I want list B to be ['A', 'C', 'D'].
What I have tried until now.
listA = ['A', 'B', 'C', 'D']
for term in listA:
    listA.remove(term)
    for item in listA:
        print(listA)
If all you want is to print the sublists, it will be like:
for i in range(len(listA)):
    print(listA[:i]+listA[i+1:])
Or,
for i in listA:
    print(list(set(listA) - set(i)))
Try this,
>>> la = ['A', 'B', 'C', 'D']
>>> for i in la:
...     _temp = la.copy()
...     _temp.remove(i)
...     print(_temp)
Output:
['B', 'C', 'D']
['A', 'C', 'D']
['A', 'B', 'D']
['A', 'B', 'C']
*If you want to assign the printed output to new variables, use a dictionary where each key is the name of a list and each value is the corresponding list.
Is this what you want?
listA = ['A', 'B', 'C', 'D']
Bs = \
    [listA[:idx] + listA[idx + 1:]
     for idx
     in range(len(listA))]
for B in Bs:
    print(B)
Taking the above solutions a step further, you can store each of the resulting lists under the corresponding key using a dictionary comprehension:
keys_map = {x: [item for item in listA if item != x] for x in listA}
print(keys_map)
Output
{
'A': ['B', 'C', 'D'],
'B': ['A', 'C', 'D'],
'C': ['A', 'B', 'D'],
'D': ['A', 'B', 'C']
}
and access the desired key like so
keys_map.get('A')
# returns
['B', 'C', 'D']
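The answers above fall into two families that only agree when the list has no duplicates: slicing drops the element at one position, while filtering by value (or using a set difference) drops every equal element. A small sketch of the difference, using hypothetical function names and a sample list that is not from the question:

```python
def others_by_position(items):
    """For each index, the list without the element at that index."""
    return [items[:i] + items[i + 1:] for i in range(len(items))]

def others_by_value(items):
    """For each element, the list without any element equal to it."""
    return [[y for y in items if y != x] for x in items]

sample = ['A', 'B', 'A']  # contains a duplicate on purpose
print(others_by_position(sample))  # [['B', 'A'], ['A', 'A'], ['A', 'B']]
print(others_by_value(sample))     # [['B'], ['A', 'A'], ['B']]
```

With the question's ['A', 'B', 'C', 'D'], both functions return the same four lists.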

Why does array content get wiped and reset to the first result for a recursive function?

The issue stems from the output.append(a) on the third line. This program would ideally output 6 unique permutations of the input string, but instead returns 6 copies of the first result in the recursive loop. I realize exiting the recursion may have something to do with the array being modified, but how can I circumvent this issue to be able to return an array of solutions?
def permute(a, l, r, output):
    if l==r:
        output.append(a)
    else:
        for i in range(l,r+1):
            a[l], a[i] = a[i], a[l]
            permute(a, l+1, r,output)
            a[l], a[i] = a[i], a[l] # backtrack
# Driver program to test the above function
string = "ABC"
output = []
n = len(string)
a = list(string)
permute(a, 0, n-1,output)
print(output)
For reference, this is what the output looks like:
[['A', 'C', 'B']]
[['B', 'A', 'C'], ['B', 'A', 'C']]
[['B', 'C', 'A'], ['B', 'C', 'A'], ['B', 'C', 'A']]
[['C', 'B', 'A'], ['C', 'B', 'A'], ['C', 'B', 'A'], ['C', 'B', 'A']]
[['C', 'A', 'B'], ['C', 'A', 'B'], ['C', 'A', 'B'], ['C', 'A', 'B'], ['C', 'A', 'B']]
[['A', 'B', 'C'], ['A', 'B', 'C'], ['A', 'B', 'C'], ['A', 'B', 'C'], ['A', 'B', 'C'], ['A', 'B', 'C']]
When the output should be:
['A', 'B', 'C']
['A', 'C', 'B']
['B', 'A', 'C']
['B', 'C', 'A']
['C', 'B', 'A']
['C', 'A', 'B']
The problem is in the line
output.append(a)
it looks fine, but later on the list a changes, and when you append it to output again, the previous a (that you already appended) changes.
To solve the problem, you can simply use shallow copy. Write this instead:
output.append(a[:])
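The aliasing behind this can be seen in isolation with a minimal sketch. The names `rows` and `current` are hypothetical and unrelated to the question's code. Appending a list stores a reference to the same object, whereas appending a slice stores an independent copy.

```python
rows = []
current = ['A', 'B']

rows.append(current)     # stores a reference to the same list object
rows.append(current[:])  # stores a shallow copy made at this moment

current[0] = 'Z'         # later mutation, like the swaps in permute()

print(rows)  # [['Z', 'B'], ['A', 'B']]
```

A shallow copy is sufficient here because the elements are immutable strings.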
Did you know there is an existing function for this in Python?
import itertools
listA = ["A", "B", "C"]
perm = itertools.permutations(listA)
for i in list(perm):
    print(i)
Result:
('A', 'B', 'C')
('A', 'C', 'B')
('B', 'A', 'C')
('B', 'C', 'A')
('C', 'A', 'B')
('C', 'B', 'A')

Find the same elements from two lists and print the elements from both lists

There are two lists:
k = ['a', 'a', 'b', 'b', 'c', 'c', 'd', 'e']
l = ['a', 'c', 'e']
I want to find the same elements from these two lists, that is:
['a', 'c', 'e']
then I want to print out the element we found, for example, 'a' from both lists, that is: ['a', 'a', 'a'].
The result I want is as follows:
['a', 'a', 'a', 'c', 'c', 'c', 'e', 'e']
I tried doing it this way:
c = []
for item_k in k:
    for item_j in j:
        if item_k== item_j:
            c.append(item_k)
            c.append(item_j)
However, the result is ['a', 'a', 'c', 'c', 'e', 'e']
I also tried it this way:
c=[]
for item_k in k:
    if item_k in l:
        c.append(item_k)
        d=l.count(item_k)
        c.append(item_k*d)
print(c)
But it does not work. Can anybody tell me how to do it? I really appreciate your help in advance.
result = [x for x in sorted(k + l) if x in k and x in l]
print(result)
results:
['a', 'a', 'a', 'c', 'c', 'c', 'e', 'e']
Since you want to pick up elements from both lists, the most straightforward way is probably to iterate over each one while checking the other (this can be optimized considerably if you depend on speed):
merged = []
for el in list1:
    if el in list2:
        merged.append(el)
for el in list2:
    if el in list1:
        merged.append(el)
If the order of the elements is important, you'll have to define an iteration order (in what order do you look at which element from which list?).
If the lists are sorted and you want the result to be sorted:
sorted([x for x in list1 if x in set(list2)] + [x for x in list2 if x in set(list1)] )
You can use set operations to intersect and then loop through, appending to a new list any that match the intersected list
k = ['a', 'a', 'b', 'b', 'c', 'c', 'd', 'e']
l = ['a', 'c', 'e']
common_list = list(set(k).intersection(set(l)))
all_results = []
for item in k:
    if item in common_list:
        all_results.append(item)
for item in l:
    if item in common_list:
        all_results.append(item)
print(sorted(all_results))
output:
['a', 'a', 'a', 'c', 'c', 'c', 'e', 'e']
Here's a compact way. Readability might suffer a little, but what fun are comprehensions without a little deciphering?
import itertools
k = ['a', 'a', 'b', 'b', 'c', 'c', 'd', 'e']
l = ['a', 'c', 'e']
combined = [letter for letter in itertools.chain(k,l) if letter in l and letter in k]
Here is an implementation that matches your initial algorithm:
k = ['a', 'a', 'b', 'b', 'c', 'c', 'd', 'e']
l=['a', 'c', 'e']
c=[]
for x in l:
    count = 0
    for y in k:
        if x == y:
            count += 1
    # count occurrences in k, plus one more for the occurrence in l
    while count>=0:
        c.append(x)
        count = count -1
print(c)
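As another possible variant, collections.Counter can tally both lists once, which avoids rescanning a list for every element. This is a sketch rather than one of the posted answers; the function name `common_with_multiplicity` is hypothetical, and it assumes the elements are hashable and sortable.

```python
from collections import Counter

def common_with_multiplicity(first, second):
    """Return the values present in both lists, each repeated as many times
    as it occurs in the two lists combined, in sorted order."""
    counts = Counter(first) + Counter(second)  # combined occurrence counts
    shared = set(first) & set(second)          # values present in both
    result = []
    for value in sorted(shared):
        result.extend([value] * counts[value])
    return result

k = ['a', 'a', 'b', 'b', 'c', 'c', 'd', 'e']
l = ['a', 'c', 'e']
print(common_with_multiplicity(k, l))
# ['a', 'a', 'a', 'c', 'c', 'c', 'e', 'e']
```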
