Iterating through a list for specific instances - python

I have the following code:
paths = [['E', 'D', 'A', 'B'], ['E', 'D', 'A', 'C', 'B'], ['E', 'D', 'B'], ['E', 'D', 'C', 'B'], ['E', 'B'], ['E', 'C', 'B']]
Now, the lists inside a list represent node paths from start to end which were made using Networkx, however that is some background information. My question is more specific.
I am trying to derive the lists that only have every letter from A-E, aka it would return only the list:
paths_desired = [['E', 'D', 'A', 'C', 'B']]
If I were to have another path:
paths = [['E', 'D', 'A', 'B'], ['E', 'D', 'A', 'C', 'B'], ['D', 'B', 'A','C','E'], ['A', 'D', 'C', 'B']]
It would return:
paths_desired = [['E', 'D', 'A', 'C', 'B'],['D', 'B', 'A', 'C', 'E']]
My idea is a for loop that iterates through each list:
letters = ['A', 'B', 'C', 'D', 'E']
desired_paths = []
for i in paths:
    counter = 0
    for j in letters:
        if j in i:
            counter = counter + 1
    if counter == 5:
        desired_paths.append(i)
print(desired_paths)
This works. However, I want to make the loop more specific: I want only the lists that have exactly the order ['E','D','A','C','B'], even if another list within paths contains all the letters in a different order.
Additionally, is there a way to improve my for loop so that, rather than counting, it checks that every letter is present and that no letter appears more than once? Meaning no multiple Es, no multiple Ds, etc.

You can use a set and .issubset() like this:
def pathways(letters, paths):
    ret = []
    letters = set(letters)
    for path in paths:
        if letters.issubset(path):
            ret.append(path)
    return ret

letters = ['A', 'B', 'C', 'D', 'E']
paths = [['E', 'D', 'A', 'B'], ['E', 'D', 'A', 'C', 'B'],
         ['D', 'B', 'A', 'C', 'E'], ['A', 'D', 'C', 'B']]
print(pathways(letters, paths))  # => [['E', 'D', 'A', 'C', 'B'], ['D', 'B', 'A', 'C', 'E']]
Also, as a comment by ShadowRanger pointed out, the pathways() function could be shortened using filter(). Like this:
def pathways(letters, paths):
    return list(filter(set(letters).issubset, paths))

letters = ['A', 'B', 'C', 'D', 'E']
paths = [['E', 'D', 'A', 'B'], ['E', 'D', 'A', 'C', 'B'],
         ['D', 'B', 'A', 'C', 'E'], ['A', 'D', 'C', 'B']]
print(pathways(letters, paths))
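The question also asks for paths in which each letter appears exactly once, which issubset() alone does not enforce. A minimal sketch of that stricter check follows; the helper name exact_pathways and the extra sample path with a repeated 'E' are assumptions made for illustration, not part of the original answer.

```python
def exact_pathways(letters, paths):
    """Keep only paths that contain every letter exactly once (hypothetical helper)."""
    wanted = set(letters)
    # Equal length plus equal sets rules out both missing and repeated letters,
    # assuming `letters` itself contains no duplicates.
    return [p for p in paths if len(p) == len(wanted) and set(p) == wanted]

letters = ['A', 'B', 'C', 'D', 'E']
paths = [['E', 'D', 'A', 'B'], ['E', 'D', 'A', 'C', 'B'],
         ['D', 'B', 'A', 'C', 'E'], ['E', 'D', 'A', 'C', 'B', 'E']]
result = exact_pathways(letters, paths)
print(result)
```

If only one specific order is acceptable, a direct comparison such as path == ['E', 'D', 'A', 'C', 'B'] should be enough, since list equality is order-sensitive.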

Related

How to eliminate the items in the python list step by step?

I am using a 'for loop' to eliminate the item step by step and generate a new list(feature_combination) including different combinations.
feature_list = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
feature_combination = []
for i in range(7):
    feature_list.pop()
    feature_combination.append(feature_list)
feature_combination
The ideal output should be:
[['A', 'B', 'C', 'D', 'E', 'F'],['A', 'B', 'C', 'D', 'E'],['A', 'B', 'C', 'D'],['A', 'B', 'C'],['A', 'B'],['A'], []]
But the current output is:
[[], [], [], [], [], [], []]
When I print the progress step by step:
feature_list = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
feature_combination = []
for i in range(7):
    feature_list.pop()
    print(feature_list)
I get the following results:
['A', 'B', 'C', 'D', 'E', 'F']
['A', 'B', 'C', 'D', 'E']
['A', 'B', 'C', 'D']
['A', 'B', 'C']
['A', 'B']
['A']
[]
So why can't I append these results to an empty list? What is the problem?
It's because when you call feature_combination.append(feature_list), you are appending a reference to feature_list, not the actual value of feature_list. Since feature_list is empty at the end of the for loop, all of the references to it are empty as well.
You can fix it by changing feature_combination.append(feature_list) to feature_combination.append(feature_list.copy()), which makes a copy of the list to store.
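As a minimal sketch of that change applied to the loop from the question (the only assumption is iterating over len(feature_list) instead of the hard-coded 7):

```python
feature_list = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
feature_combination = []

# range() is evaluated once, so popping inside the loop does not shorten it.
for _ in range(len(feature_list)):
    feature_list.pop()
    # Store an independent snapshot rather than a reference to feature_list.
    feature_combination.append(feature_list.copy())

print(feature_combination)
```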
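Note that pop() with no argument removes the last element, so you do not need to pass an index here. Still, I find pop() unnecessary for this task; you could use slicing instead, which builds a new list on every iteration rather than mutating the original.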
Below is an example of how you could accomplish your goal. This code adjusts to your desired output.
feature_list = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
feature_combination = []
for i in range(7):
    feature_list = feature_list[:-1]
    feature_combination.append(feature_list)
print(feature_combination)
Output:
[['A', 'B', 'C', 'D', 'E', 'F'], ['A', 'B', 'C', 'D', 'E'], ['A', 'B', 'C', 'D'], ['A', 'B', 'C'], ['A', 'B'], ['A'], []]
A Python variable is a symbolic name that is a reference or pointer to an object. Once an object is assigned to a variable, you can refer to the object by that name. But the data itself is still contained within the object.
This happens because feature_list points to a single object, which keeps changing as you pop from it. You are basically creating a list that contains [object, object, object, ...], with every entry pointing to the same feature_list object. As you keep popping, every one of those references reflects the updated object.
Here is how you can test this happening -
feature_list = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
feature_combination = []
for i in range(7):
    feature_list.pop()
    feature_combination.append(feature_list)
    print('iteration', i)
    print(feature_combination)  # Print the primary list after each iteration
iteration 0
[['A', 'B', 'C', 'D', 'E', 'F']]
iteration 1
[['A', 'B', 'C', 'D', 'E'], ['A', 'B', 'C', 'D', 'E']]
iteration 2
[['A', 'B', 'C', 'D'], ['A', 'B', 'C', 'D'], ['A', 'B', 'C', 'D']]
iteration 3
[['A', 'B', 'C'], ['A', 'B', 'C'], ['A', 'B', 'C'], ['A', 'B', 'C']]
iteration 4
[['A', 'B'], ['A', 'B'], ['A', 'B'], ['A', 'B'], ['A', 'B']]
iteration 5
[['A'], ['A'], ['A'], ['A'], ['A'], ['A']]
iteration 6
[[], [], [], [], [], [], []]
Notice that after each iteration, every sublist inside the main list reflects the latest pop, because they are all the same object.
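The aliasing can also be confirmed directly with identity checks. This is a small sketch on a shortened, three-element list (an assumption made to keep it brief); the names all_same and distinct_ids are illustrative only.

```python
feature_list = ['A', 'B', 'C']
feature_combination = []
for _ in range(3):
    feature_list.pop()
    feature_combination.append(feature_list)

# `is` compares object identity, not contents.
all_same = all(item is feature_list for item in feature_combination)
# A set of id() values collapses to one entry when all items are one object.
distinct_ids = {id(item) for item in feature_combination}
print(all_same, len(distinct_ids))
```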
A fix
A fix is to use a slice to get and store a copy.
feature_list = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
feature_combination = []
for i in range(7):
    feature_list.pop()
    print(feature_list)
    feature_combination.append(feature_list[:])  # <----
feature_combination
[['A', 'B', 'C', 'D', 'E', 'F'],
['A', 'B', 'C', 'D', 'E'],
['A', 'B', 'C', 'D'],
['A', 'B', 'C'],
['A', 'B'],
['A'],
[]]

Why does array content get wiped and reset to the first result for a recursive function?

The issue stems from the output.append(a) on the third line. This program would ideally output 6 unique permutations of the input string, but instead it returns 6 copies of the same result. I realize that exiting the recursion may have something to do with the array being modified, but how can I circumvent this issue and return an array of solutions?
def permute(a, l, r, output):
    if l == r:
        output.append(a)
    else:
        for i in range(l, r + 1):
            a[l], a[i] = a[i], a[l]
            permute(a, l + 1, r, output)
            a[l], a[i] = a[i], a[l]  # backtrack

# Driver program to test the above function
string = "ABC"
output = []
n = len(string)
a = list(string)
permute(a, 0, n-1,output)
print(output)
For reference, this is what the output looks like:
[['A', 'C', 'B']]
[['B', 'A', 'C'], ['B', 'A', 'C']]
[['B', 'C', 'A'], ['B', 'C', 'A'], ['B', 'C', 'A']]
[['C', 'B', 'A'], ['C', 'B', 'A'], ['C', 'B', 'A'], ['C', 'B', 'A']]
[['C', 'A', 'B'], ['C', 'A', 'B'], ['C', 'A', 'B'], ['C', 'A', 'B'], ['C', 'A', 'B']]
[['A', 'B', 'C'], ['A', 'B', 'C'], ['A', 'B', 'C'], ['A', 'B', 'C'], ['A', 'B', 'C'], ['A', 'B', 'C']]
When the output should be:
['A', 'B', 'C']
['A', 'C', 'B']
['B', 'A', 'C']
['B', 'C', 'A']
['C', 'B', 'A']
['C', 'A', 'B']
The problem is in the line
output.append(a)
It looks fine, but a is the same list object on every call: later swaps modify it in place, so every entry you already appended to output changes along with it.
To solve the problem, you can simply append a shallow copy. Write this instead:
output.append(a[:])
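For completeness, here is a sketch of the question's function with only that one line changed, followed by the same driver code:

```python
def permute(a, l, r, output):
    if l == r:
        # Append a snapshot so later swaps on `a` cannot alter stored results.
        output.append(a[:])
    else:
        for i in range(l, r + 1):
            a[l], a[i] = a[i], a[l]
            permute(a, l + 1, r, output)
            a[l], a[i] = a[i], a[l]  # backtrack

output = []
a = list("ABC")
permute(a, 0, len(a) - 1, output)
print(output)
```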
Did you know there is an existing function for this in Python's standard library?
import itertools
listA = ["A", "B", "C"]
perm = itertools.permutations(listA)
for i in list(perm):
    print(i)
Result:
('A', 'B', 'C')
('A', 'C', 'B')
('B', 'A', 'C')
('B', 'C', 'A')
('C', 'A', 'B')
('C', 'B', 'A')

How to traverse through variable length paths to same destination?

I have a directed network as follows.
Now, I would like to get all possible paths of length 4 and length 5 in this graph (where both the start and end nodes are B).
Examples of paths of length 4 are:
B--C--B--A--B
B--A--B--B--B
B--A--B--C--B
Examples of paths of length 5 are:
B--C--B--A--B--B
B--A--B--B--C--B
I tried to use breadth-first search (BFS) to solve this, but most BFS implementations I found do not satisfy my requirements:
They do not consider variable-length paths.
Their start and end nodes differ, whereas mine are the same.
My current code is as follows:
graph = {'A': ['B'],
         'B': ['A', 'C', 'B'],
         'C': ['B']}

# visits all the nodes of a graph (connected component) using BFS
def bfs_connected_component(graph, start):
    # keep track of all visited nodes
    explored = []
    # keep track of nodes to be checked
    queue = [start]
    # keep looping until there are nodes still to be checked
    while queue:
        # pop shallowest node (first node) from queue
        node = queue.pop(0)
        if node not in explored:
            # add node to list of checked nodes
            explored.append(node)
            neighbours = graph[node]
            # add neighbours of node to queue
            for neighbour in neighbours:
                queue.append(neighbour)
    return explored

bfs_connected_component(graph, 'A')
I would like to know whether there are any Python libraries I can use for this, or whether there is a way to modify BFS to solve this problem.
I am happy to provide more examples if needed :)
When searching for all combinatorial results, I found that recursive generators in python provide more compact and understandable code compared to other approaches (such as recursive functions, or stack-based equivalents).
Here we are looking for all paths from node to goal of a fixed length. The list path is used as an accumulator: it holds the nodes visited so far, and at the base case the current node is appended to it to yield a complete path.
def all_paths(graph, node, goal, length, path=[]):
    if length == 0 and node == goal:
        yield path + [node]
    elif length > 0:
        for n in graph[node]:
            yield from all_paths(graph, n, goal, length - 1, path + [node])
Paths of length 4:
>>> print(*all_paths(graph, 'B', 'B', 4), sep='\n')
['B', 'A', 'B', 'A', 'B']
['B', 'A', 'B', 'C', 'B']
['B', 'A', 'B', 'B', 'B']
['B', 'C', 'B', 'A', 'B']
['B', 'C', 'B', 'C', 'B']
['B', 'C', 'B', 'B', 'B']
['B', 'B', 'A', 'B', 'B']
['B', 'B', 'C', 'B', 'B']
['B', 'B', 'B', 'A', 'B']
['B', 'B', 'B', 'C', 'B']
['B', 'B', 'B', 'B', 'B']
Paths of length 5:
>>> print(*all_paths(graph, 'B', 'B', 5), sep='\n')
['B', 'A', 'B', 'A', 'B', 'B']
['B', 'A', 'B', 'C', 'B', 'B']
['B', 'A', 'B', 'B', 'A', 'B']
['B', 'A', 'B', 'B', 'C', 'B']
['B', 'A', 'B', 'B', 'B', 'B']
['B', 'C', 'B', 'A', 'B', 'B']
['B', 'C', 'B', 'C', 'B', 'B']
['B', 'C', 'B', 'B', 'A', 'B']
['B', 'C', 'B', 'B', 'C', 'B']
['B', 'C', 'B', 'B', 'B', 'B']
['B', 'B', 'A', 'B', 'A', 'B']
['B', 'B', 'A', 'B', 'C', 'B']
['B', 'B', 'A', 'B', 'B', 'B']
['B', 'B', 'C', 'B', 'A', 'B']
['B', 'B', 'C', 'B', 'C', 'B']
['B', 'B', 'C', 'B', 'B', 'B']
['B', 'B', 'B', 'A', 'B', 'B']
['B', 'B', 'B', 'C', 'B', 'B']
['B', 'B', 'B', 'B', 'A', 'B']
['B', 'B', 'B', 'B', 'C', 'B']
['B', 'B', 'B', 'B', 'B', 'B']
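Since the question asks for several lengths at once, the generator can be wrapped so that each requested length maps to its own list of paths. This is only a sketch: paths_of_lengths is a hypothetical helper name, and the generator is restated here with a tuple default for path (an assumption that avoids a mutable default argument) so the block runs on its own.

```python
def all_paths(graph, node, goal, length, path=()):
    """Yield every walk of exactly `length` edges from node to goal."""
    if length == 0 and node == goal:
        yield list(path) + [node]
    elif length > 0:
        for n in graph[node]:
            yield from all_paths(graph, n, goal, length - 1, path + (node,))

def paths_of_lengths(graph, start, goal, lengths):
    """Hypothetical wrapper: map each requested length to its list of walks."""
    return {k: list(all_paths(graph, start, goal, k)) for k in lengths}

graph = {'A': ['B'], 'B': ['A', 'C', 'B'], 'C': ['B']}
result = paths_of_lengths(graph, 'B', 'B', [4, 5])
for k, found in result.items():
    print(k, len(found))  # number of walks found for each length
```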

How to reverse a part of a list?

I have the following list:
abclist = ['a', 'b', 'c', 'd', 'e']
From the above list, how do I create the following one?
Reversed_part = ['c', 'b', 'a', 'd', 'e']
Only the first 3 items are reversed and the last two stay in the same order.
This is one way.
lst = ['a', 'b', 'c', 'd', 'e']
def partial_reverse(lst, start, end):
    """Indexing (start/end) inputs begins at 0 and are inclusive."""
    return lst[:start] + lst[start:end+1][::-1] + lst[end+1:]

partial_reverse(lst, 0, 2) # ['c', 'b', 'a', 'd', 'e']
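abclist = ['a', 'b', 'c', 'd', 'e']
quantityToReverse = 3
# Slice from quantityToReverse onward; a negative slice such as abclist[-0:]
# would return the whole list again when every item is reversed.
reverseArray = list(reversed(abclist[:quantityToReverse])) + abclist[quantityToReverse:]
print(reverseArray)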
You can do it using a combination of the reversed() function and list slicing.
Example:
abclist = ['a', 'b', 'c', 'd', 'e']
print(list(reversed(abclist[:3]))+abclist[-2:])
Output:
['c', 'b', 'a', 'd', 'e']
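If modifying the original list is acceptable, slice assignment is another option. This is a small sketch, with n as an assumed name for the number of leading items to reverse:

```python
abclist = ['a', 'b', 'c', 'd', 'e']
n = 3  # number of leading items to reverse (illustrative name)

# The right-hand side builds a reversed copy of the first n items,
# then slice assignment writes it back over the same positions.
abclist[:n] = abclist[:n][::-1]
print(abclist)
```

Unlike the answers above, this changes abclist itself instead of building a new list.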

Fastest way to group by ID for a really big numpy array

I am trying to find the best way to group 'rows' with similar IDs.
My best guess:
np.array([test[test[:,0] == ID] for ID in List_IDs])
result: array of arrays of arrays
[ array([['ID_1', 'col1','col2',...,'coln'],
['ID_1', 'col1','col2',...,'coln'],...,
['ID_1', 'col1','col2',...,'coln']],dtype='|S32')
array([['ID_2', 'col1','col2',...,'coln'],
['ID_2', 'col1','col2',...,'coln'],...,
['ID_2', 'col1','col2',...,'coln']],dtype='|S32')
....
array([['ID_k', 'col1','col2',...,'coln'],
['ID_k', 'col1','col2',...,'coln'],...,
['ID_K', 'col1','col2',...,'coln']],dtype='|S32')]
Can anyone suggest something more efficient?
Reminder: the test array is huge, and the rows are not ordered.
I am assuming List_IDs is a list of all unique IDs from the first column. With that assumption, here's a NumPy-based solution -
# Sort input array test w.r.t. first column that are IDs
test_sorted = test[test[:,0].argsort()]
# Convert the string IDs to numeric IDs
_,numeric_ID = np.unique(test_sorted[:,0],return_inverse=True)
# Get the indices where shifts (IDs change) occur
_,cut_idx = np.unique(numeric_ID,return_index=True)
# Use the indices to split the input array into sub-arrays with common IDs
out = np.split(test_sorted,cut_idx)[1:]
Sample run -
In [305]: test
Out[305]:
array([['A', 'A', 'B', 'E', 'A'],
['B', 'E', 'A', 'E', 'B'],
['C', 'D', 'D', 'A', 'C'],
['B', 'D', 'A', 'C', 'A'],
['B', 'A', 'E', 'A', 'E'],
['C', 'D', 'C', 'E', 'D']],
dtype='|S32')
In [306]: test_sorted
Out[306]:
array([['A', 'A', 'B', 'E', 'A'],
['B', 'E', 'A', 'E', 'B'],
['B', 'D', 'A', 'C', 'A'],
['B', 'A', 'E', 'A', 'E'],
['C', 'D', 'D', 'A', 'C'],
['C', 'D', 'C', 'E', 'D']],
dtype='|S32')
In [307]: out
Out[307]:
[array([['A', 'A', 'B', 'E', 'A']],
dtype='|S32'), array([['B', 'E', 'A', 'E', 'B'],
['B', 'D', 'A', 'C', 'A'],
['B', 'A', 'E', 'A', 'E']],
dtype='|S32'), array([['C', 'D', 'D', 'A', 'C'],
['C', 'D', 'C', 'E', 'D']],
dtype='|S32')]
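The same sort-then-split idea can be sketched as a self-contained script. This variant skips the intermediate numeric-ID step and asks np.unique for the first index of each ID directly; the sample data, the stable sort, and the grouped dictionary are assumptions added for illustration, not part of the answer above.

```python
import numpy as np

# Hypothetical sample data: the first column holds the ID.
test = np.array([
    ['B', 'x1', 'y1'],
    ['A', 'x2', 'y2'],
    ['C', 'x3', 'y3'],
    ['B', 'x4', 'y4'],
    ['A', 'x5', 'y5'],
])

# A stable sort keeps the original relative order of rows within each ID.
order = test[:, 0].argsort(kind='stable')
test_sorted = test[order]

# return_index gives the first position of each unique ID in the sorted column.
ids, first_idx = np.unique(test_sorted[:, 0], return_index=True)

# Split at every group boundary except the leading 0.
groups = np.split(test_sorted, first_idx[1:])

grouped = {str(k): g.tolist() for k, g in zip(ids, groups)}
print(grouped)
```

The sort dominates the cost, so the array is scanned a fixed number of times rather than once per ID as in the list comprehension from the question.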
