I am fairly new to Python, so please be patient; this is probably simple. I am trying to build an adjacency list representation of a graph. In this particular representation I decided to use a list of lists, where the first value of each sublist represents the tail node and all other values represent head nodes. For example, the graph with edges 1->2, 2->3, 3->1, 1->3 will be represented as [[1,2,3],[2,3],[3,1]].
Running the following code on this edge list gives a problem I do not understand.
The edge list (Example.txt):
1 2
2 3
3 1
3 4
5 4
6 4
8 6
6 7
7 8
The Code:
def adjacency_list(graph):
    graph_copy = graph[:]
    g_tmp = []
    nodes = []
    for arc in graph_copy:
        choice_flag_1 = arc[0] not in nodes
        choice_flag_2 = arc[1] not in nodes
        if choice_flag_1:
            g_tmp.append(arc)
            nodes.append(arc[0])
        else:
            idx = [item[0] for item in g_tmp].index(arc[0])
            g_tmp[idx].append(arc[1])
        if choice_flag_2:
            g_tmp.append([arc[1]])
            nodes.append(arc[1])
    return g_tmp
# Read input from file
g = []
with open('Example.txt') as f:
    for line in f:
        line_split = line.split()
        new_line = []
        for element in line_split:
            new_line.append(int(element))
        g.append(new_line)
print('File Read. There are: %i items.' % len(g))
graph = adjacency_list(g)
During runtime, when the code processes arc 6 7 (second to last line in file), the following lines (found in the else statement) append 7 not only to g_tmp but also to graph_copy and graph.
idx = [item[0] for item in g_tmp].index(arc[0])
g_tmp[idx].append(arc[1])
What is happening?
Thank you!
J
P.S. I'm running Python 3.5
P.P.S. I also tried replacing graph_copy = graph[:] with graph_copy = list(graph). Same behavior.
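The problem is in the lines
if choice_flag_1:
    g_tmp.append(arc)
When you append arc, you are not appending a copy. graph[:] (like list(graph)) only makes a shallow copy: a new outer list whose elements are still the original inner lists. So g_tmp ends up holding a reference to the very same inner list object that graph and graph_copy hold, and a later g_tmp[idx].append(arc[1]) changes all three. Append a new list instead, like so:
if choice_flag_1:
    g_tmp.append([arc[0],arc[1]])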
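To see the aliasing in isolation, here is a minimal sketch. The two-arc graph and the variable shares_inner are made up for illustration; they are not taken from the question's file.

```python
# Two arcs, stored the same way the question stores them.
graph = [[1, 2], [6, 4]]

# Slicing builds a new OUTER list; the inner lists are the same objects.
graph_copy = graph[:]
shares_inner = graph_copy[1] is graph[1]

g_tmp = []
g_tmp.append(graph_copy[1])        # stores a reference to graph's [6, 4]
g_tmp.append(list(graph_copy[0]))  # stores an independent copy of [1, 2]

g_tmp[0].append(7)  # also visible through graph and graph_copy
g_tmp[1].append(3)  # only visible through g_tmp

print(shares_inner)  # True
print(graph)         # [[1, 2], [6, 4, 7]]
print(g_tmp)         # [[6, 4, 7], [1, 2, 3]]
```

list(arc), arc[:] and [arc[0], arc[1]] all create a new inner list here, so any of them avoids the side effect.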
What mistake have I made here?
def levelOrder(root):
    #Write your code here
    que = []
    que.append(root)
    while que != []:
        coot = que.pop()
        print(coot.data,end=" ")
        if coot.left is not None:
            que.append(coot.left)
        if coot.right is not None:
            que.append(coot.right)
Expected output: 1 2 5 3 6 4
My output: 1 2 5 6 3 4
You are appending nodes to the end of the list que (using append()) and also removing nodes from the end of the list que (using list.pop()). That does not preserve the order, so for something like
      1
    /   \
   2     3
  / \   / \
 4   5 6   7
after the first iteration you would have que=[2, 3], and you would then pop 3 first instead of 2, which is incorrect. Instead, you should pop 2, i.e. pop from the left (since you are appending the new nodes to the right).
So replacing coot = que.pop() with coot = que.pop(0) in your existing code should fix the issue. But note that list.pop(0) is an O(n) operation in Python, so I would suggest using collections.deque instead.
With deque, your code can be:
from collections import deque
def levelOrder(root):
    #Write your code here
    que = deque()
    que.append(root)
    while que:  # a deque never compares equal to [], so test its truthiness
        coot = que.popleft()
        print(coot.data,end=" ")
        if coot.left is not None:
            que.append(coot.left)
        if coot.right is not None:
            que.append(coot.right)
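For a version that can be run on its own, here is a minimal sketch. The Node class is an assumption standing in for whatever node type the exercise provides (only data, left and right are used), and level_order returns the visited values as a list instead of printing them.

```python
from collections import deque


class Node:
    """Hypothetical binary tree node with the attributes the question uses."""

    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def level_order(root):
    """Return node values in breadth-first (level) order."""
    order = []
    queue = deque()
    if root is not None:
        queue.append(root)
    while queue:
        node = queue.popleft()  # take from the left: first in, first out
        order.append(node.data)
        if node.left is not None:
            queue.append(node.left)
        if node.right is not None:
            queue.append(node.right)
    return order


# The seven-node tree drawn above.
tree = Node(1, Node(2, Node(4), Node(5)), Node(3, Node(6), Node(7)))
result = level_order(tree)
print(result)  # [1, 2, 3, 4, 5, 6, 7]
```

Swapping popleft() for pop() turns the same loop into a depth-first traversal, which is what produced the unexpected order in the question.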
I'm trying to insert a string from one list into another list on a string match (the lists have different lengths), but I can't figure out how to get it working. I loop over both lists; on a string match I insert the next line from list2 into list1, and this needs to continue until all lines from list2 are inserted in the right place in list1. This is the code I have right now.
with open('D:\\TranslateFIles\\Ext_Python.txt', 'r', encoding='utf-8') as f:
    list1 = f.readlines()
    f.close()
with open('D:\\TranslateFIles\\Combo_Python.txt', 'r', encoding='utf-8') as f:
    list2 = f.readlines()
    f.close()
outFile=open('D:\\TranslateFIles\\Output3_Python.txt', 'w', encoding='utf-8')
list3 = list1.copy()
for element, i in enumerate(list3):
    for j,b in zip(list2, list2[1:]):
        if i == j:
            list3.insert(element, b)
print(*list3)
for line in list3:
    outFile.write(line)
outFile.close()
This is the output I get right now.
T14-P2818-L30:Location
**T14-P8629-A2067-L999:Vestiging**
T14-P8629-A1033-L999:Location
T14-P8629-A2060-L999:Magasin
T14-P4960-V1000-P2818-L128:TransferRoute
T14-P4960-V1003-P2818-L128:WhseEmployee
T14-P4960-V1004-P2818-L128:WorkCenter
T14-P4960-V1001-P2818-L128:StockkeepingUnit
T14-P4960-X1-L999:
T14-F1-P2818-L128:Code
T14-F1-P8629-A1033-L999:Code
T14-F1-P8629-A2060-L999:Code
T14-F2-P2818-L128:Name
T14-F2-P8629-A1033-L999:Name
T14-F2-P8629-A2060-L999:Nom
T14-F130-P2818-L128:Default Bin Code
T14-F130-P8629-A1033-L999:Default Bin Code
T14-F130-P8629-A2060-L999:Code emplacement par défaut
This is the output I want to get.
T14-P2818-L30:Location
T14-P8629-A1033-L999:Location
T14-P8629-A2060-L999:Magasin
**T14-P8629-A2067-L999:Vestiging**
T14-P4960-V1000-P2818-L128:TransferRoute
T14-P4960-V1003-P2818-L128:WhseEmployee
T14-P4960-V1004-P2818-L128:WorkCenter
T14-P4960-V1001-P2818-L128:StockkeepingUnit
T14-P4960-X1-L999:
T14-F1-P2818-L128:Code
T14-F1-P8629-A1033-L999:Code
T14-F1-P8629-A2060-L999:Code
**T14-F1-P8629-A2067-L999:Code**
T14-F2-P2818-L128:Name
T14-F2-P8629-A1033-L999:Name
T14-F2-P8629-A2060-L999:Nom
**T14-F2-P8629-A2067-L999:Naam**
T14-F130-P2818-L128:Default Bin Code
T14-F130-P8629-A1033-L999:Default Bin Code
T14-F130-P8629-A2060-L999:Code emplacement par défaut
**T14-F130-P8629-A2067-L999:Standaard opslaglocatiecode**
Items in list one
T14-P2818-L30:Location
T14-P8629-A1033-L999:Location
T14-P8629-A2060-L999:Magasin
T14-P4960-V1000-P2818-L128:TransferRoute
T14-P4960-V1003-P2818-L128:WhseEmployee
T14-P4960-V1004-P2818-L128:WorkCenter
T14-P4960-V1001-P2818-L128:StockkeepingUnit
T14-P4960-X1-L999:
T14-F1-P2818-L128:Code
T14-F1-P8629-A1033-L999:Code
T14-F1-P8629-A2060-L999:Code
T14-F2-P2818-L128:Name
T14-F2-P8629-A1033-L999:Name
T14-F2-P8629-A2060-L999:Nom
T14-F130-P2818-L128:Default Bin Code
T14-F130-P8629-A1033-L999:Default Bin Code
T14-F130-P8629-A2060-L999:Code emplacement par défaut
T14-F5700-P2818-L128:Name 2
T14-F5700-P8629-A1033-L999:Name 2
T14-F5700-P8629-A2060-L999:Nom 2
T14-F5701-P2818-L128:Address
T14-F5701-P8629-A1033-L999:Address
T14-F5701-P8629-A2060-L999:Adresse
T14-F5702-P2818-L128:Address 2
T14-F5702-P8629-A1033-L999:Address 2
T14-F5702-P8629-A2060-L999:Adresse (2ème ligne)
T14-F5703-P2818-L128:City
T14-F5703-P8629-A1033-L999:City
T14-F5703-P8629-A2060-L999:Ville
T14-F5704-P2818-L128:Phone No.
Items in list two
T14-P8629-A1033-L999:Location
T14-P8629-A2067-L999:Location
T14-F1-P8629-A1033-L999:Code
T14-F1-P8629-A2067-L999:Code
T14-F2-P8629-A1033-L999:Name
T14-F2-P8629-A2067-L999:Name
T14-F130-P8629-A1033-L999:Default Bin Code
T14-F130-P8629-A2067-L999:Default Bin Code
T14-F5700-P8629-A1033-L999:Name 2
T14-F5700-P8629-A2067-L999:Name 2
T14-F5701-P8629-A1033-L999:Address
T14-F5701-P8629-A2067-L999:Address
T14-F5702-P8629-A1033-L999:Address 2
T14-F5702-P8629-A2067-L999:Address 2
T14-F5703-P8629-A1033-L999:City
T14-F5703-P8629-A2067-L999:City
T14-F5704-P8629-A1033-L999:Phone No.
T14-F5704-P8629-A2067-L999:Phone No.
T14-F5705-P8629-A1033-L999:Phone No. 2
T14-F5705-P8629-A2067-L999:Phone No. 2
T14-F5706-P8629-A1033-L999:Telex No.
T14-F5706-P8629-A2067-L999:Telex No.
T14-F5707-P8629-A1033-L999:Fax No.
T14-F5707-P8629-A2067-L999:Fax No.
T14-F5713-P8629-A1033-L999:Contact
T14-F5713-P8629-A2067-L999:Contact
T14-F5714-P8629-A1033-L999:Post Code
T14-F5714-P8629-A2067-L999:Post Code
T14-F5715-P8629-A1033-L999:County
T14-F5715-P8629-A2067-L999:County
The strings with A2067 should be in Dutch, but I'm still translating.
Basically, you want to insert into l3 (which is a copy of l1) an element from l2 only if that element from l2 is identical to the next element from l2? If so, try:
import more_itertools as mit
l3 = l1.copy()
for i, j in mit.pairwise(l2):
    if i == j:
        l3.append(i)
My solution is based on the pairwise function from more_itertools. pairwise iterates over every pair of consecutive elements in a list.
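To make the pairing concrete, here is a small sketch that uses only the standard library. The helper name consecutive_pairs and the sample list are made up for illustration; for a list, zip(items, items[1:]) yields the same pairs as more_itertools.pairwise, and it is the same construct the question already uses with list2.

```python
def consecutive_pairs(items):
    """Return every pair of neighbouring elements of a list, in order."""
    return list(zip(items, items[1:]))


pairs = consecutive_pairs(["a", "b", "c", "d"])
print(pairs)  # [('a', 'b'), ('b', 'c'), ('c', 'd')]
```

Note that the pairs overlap: every element except the first and last appears once as the left item and once as the right item.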
I have a bug in my code and I have tried to fix it using different approaches, still it does not work. I have scaled down my original code to the essential part of it below. I use a textfile as the input and it contains the number of vertices (first line), number of edges (second line), number of colors (third line) and the remaining lines consist of two numbers (separated by a blank space) representing the edges. What is important are the edges.
INPUT
6
5
3
6 2
2 3
3 4
4 6
6 2
CODE
# An instance of m-Coloring Graph problem (NP-hard) Karp-reduced to an
# instance of the Casting problem.
#! /usr/bin/python3
def subgraph(v,aux1,aux2):
    print(nhoods)
    sg = list(aux2[v-1])
    aux1.remove(sg)
    sg.remove(v)
    for i, nhood in enumerate(aux1):
        try:
            aux1[i].remove(v)
            aux2[i].remove(v)
        except ValueError:
            pass # do nothing!
    for vertex in sg:
        sg.extend(subgraph(vertex,aux1,aux2))
    return sg
line = 0
edges = []
inputs = "testfile.txt"
f = open(inputs,"r")
for i in f.readlines():
    line += 1
    if line == 1:
        V = int(i)
    elif line == 2:
        E = int(i)
    elif line == 3:
        m = int(i)
    else:
        edge = [int(n) for n in i.split()]
        if edge in edges:
            pass # Removes double edges
        else:
            edges.append(edge)
conv = [] # Connected vertices
for edge in edges:
    for vend in edge:
        if vend in conv:
            pass
        else:
            conv.append(vend) # Stores none-isolated vertices
# Create lists of neighbors/neighborhoods for each vertex
nhoods = []
for v in conv:
    nhood = []
    for edge in edges:
        if v == edge[0]:
            nhood.append(edge[1])
        elif v == edge[1]:
            nhood.append(edge[0])
    nhood.append(v)
    nhoods.append(nhood)
# Create list of connected subgraphs
aux1 = list(nhoods)
aux2 = list(nhoods)
#for nhood in nhoods:
#    aux1.append(nhood)
#    aux2.append(nhood)
SG = [] # List of subgraphs
while aux1 != []:
    v = aux1[0][0]
    SG.append(subgraph(v,aux1,aux2))
Now, when I run the code, what I want it to do is create copies of the nhoods list called aux1 and aux2 (at line 62 in the code). (I later use these for the purpose of finding connected subgraphs in the input graph.) However, when I modify one of the copied lists aux1 or aux2, nhoods changes! But this should not happen when I am using the list() function, right? I have tried using the copy() function and a for-loop, with no better results. To me it seems that the lists refer to the same spot in memory, but why? Is it that the elements of the lists (which are themselves lists) are referring to the same memory spot? How do I solve this?
I hope I did not miss anything, otherwise just ask, thanks in advance!
Best regards//
The issue you are facing comes from the mutability of lists. You also need to understand the difference between a shallow copy and a deep copy. Everything you have tried so far (list(), copy() and the for-loop) makes a shallow copy: a new outer list that still contains the same inner list objects. Since you have mutable elements inside a mutable object, a deep copy is required. One way to make a deep copy is the copy.deepcopy function.
import copy
...
aux1 = copy.deepcopy(nhoods)
aux2 = copy.deepcopy(nhoods)
Now all of the elements of aux1 and aux2 are separate objects from those of nhoods, so modifying them no longer changes nhoods.
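The difference can be checked with a short sketch. The two neighbourhood lists below are made-up values, not the ones produced by the question's input file.

```python
import copy

nhoods = [[2, 4, 6], [6, 3, 2]]

shallow = list(nhoods)        # new outer list, shared inner lists
deep = copy.deepcopy(nhoods)  # new outer list AND new inner lists

shallow[0].remove(6)  # nhoods[0] changes too: same inner list object
deep[1].remove(3)     # nhoods[1] is left alone

print(nhoods)   # [[2, 4], [6, 3, 2]]
print(shallow)  # [[2, 4], [6, 3, 2]]
print(deep)     # [[2, 4, 6], [6, 2]]
```

For a list that is nested exactly one level deep, [list(inner) for inner in nhoods] gives the same independence without importing copy.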
I'm new to programming and Python, and I'm looking for a way to distinguish between two input formats in the same input text file. For example, let's say I have an input file like this, where values are comma-separated:
5
Washington,A,10
New York,B,20
Seattle,C,30
Boston,B,20
Atlanta,D,50
2
New York,5
Boston,10
Where the format is N followed by N lines of Data1, and M followed by M lines of Data2. I tried opening the file, reading it line by line and storing it into one single list, but I'm not sure how to go about to produce 2 lists for Data1 and Data2, such that I would get:
Data1 = ["Washington,A,10", "New York,B,20", "Seattle,C,30", "Boston,B,20", "Atlanta,D,50"]
Data2 = ["New York,5", "Boston,10"]
My initial idea was to iterate through the list until I found an integer i, remove the integer from the list and continue for the next i iterations all while storing the subsequent values in a separate list, until I found the next integer and then repeat. However, this would destroy my initial list. Is there a better way to separate the two data formats in different lists?
You could use itertools.islice and a list comprehension:
from itertools import islice
string = """
5
Washington,A,10
New York,B,20
Seattle,C,30
Boston,B,20
Atlanta,D,50
2
New York,5
Boston,10
"""
result = [[x for x in islice(parts, idx + 1, idx + 1 + int(line))]
          for parts in [string.split("\n")]
          for idx, line in enumerate(parts)
          if line.isdigit()]
print(result)
This yields
[['Washington,A,10', 'New York,B,20', 'Seattle,C,30', 'Boston,B,20', 'Atlanta,D,50'], ['New York,5', 'Boston,10']]
For a file, you need to change it to:
with open("testfile.txt", "r") as f:
    result = [[x for x in islice(parts, idx + 1, idx + 1 + int(line))]
              for parts in [f.read().split("\n")]
              for idx, line in enumerate(parts)
              if line.isdigit()]
print(result)
You're definitely on the right track.
If you want to preserve the original list here, you don't actually have to remove integer i; you can just go on to the next item.
Code:
originalData = []
formattedData = []
with open("data.txt", "r") as f :
    f = list(f)
    originalData = f
    i = 0
    while i < len(f): # Iterate through every line
        try:
            n = int(f[i]) # See if line can be cast to an integer
            originalData[i] = n # Change string to int in original
            formattedData.append([])
            for j in range(n):
                i += 1
                item = f[i].replace('\n', '')
                originalData[i] = item # Remove newline char in original
                formattedData[-1].append(item)
        except ValueError:
            print("File has incorrect format")
        i += 1
print(originalData)
print(formattedData)
The following code will produce a list results which is equal to [Data1, Data2].
The code assumes that each count is followed by exactly that many entries. That means it will not work for a file like this:
2
New York,5
Boston,10
Seattle,30
The code:
# get the data from the text file
with open('filename.txt', 'r') as file:
    lines = file.read().splitlines()
results = []
index = 0
while index < len(lines):
    # Find the start and end values.
    start = index + 1
    end = start + int(lines[index])
    # Everything from the start up to and excluding the end index gets added
    results.append(lines[start:end])
    # Update the index
    index = end
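The same slicing idea can be wrapped in a function and tried without a file. This is only a sketch: the function name split_counted_blocks and the extra length check are additions for illustration, and the sample lines are the ones from the question.

```python
def split_counted_blocks(lines):
    """Split lines of the form 'N, N entries, M, M entries, ...' into blocks."""
    blocks = []
    index = 0
    while index < len(lines):
        count = int(lines[index])  # raises ValueError if this is not a count
        start = index + 1
        end = start + count
        if end > len(lines):
            raise ValueError("fewer entries than announced at line %d" % index)
        blocks.append(lines[start:end])
        index = end  # jump straight to the next count line
    return blocks


sample = [
    "5",
    "Washington,A,10",
    "New York,B,20",
    "Seattle,C,30",
    "Boston,B,20",
    "Atlanta,D,50",
    "2",
    "New York,5",
    "Boston,10",
]
data1, data2 = split_counted_blocks(sample)
print(data1)
print(data2)  # ['New York,5', 'Boston,10']
```

Because the input list is only sliced, never modified, the original lines stay intact. With the malformed file shown above, int("Seattle,30") raises ValueError, so the mismatch is reported rather than silently mis-parsed.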
I'm having trouble grasping how to get the right order of output when doing nested for loops.
I have a list of integers:
[7, 9, 12]
And a .txt with lines of DNA sequence data.
>Ind1 AACTCAGCTCACG
>Ind2 GTCATCGCTACGA
>Ind3 CTTCAAACTGACT
I am trying to make a nested for loop that takes the first integer (7), goes through the lines of text and prints the character at position 7 for each line. It should then take the next integer and print the character at position 9 for each line, and so on.
with (Input) as getletter:
    for line in getletter:
        if line [0] == ">":
            for pos in position:
                snp = line[pos]
                print line[pos], str(pos)
When I run the above code, I get the data I want, but in the wrong order, like so:
A 7
T 9
G 12
T 7
A 9
G 12
T 7
C 9
A 12
What I want is this:
A 7
T 7
T 7
T 9
A 9
C 9
G 12
G 12
A 12
I suspect the problem can be solved by changing the indentation of the code, but I cannot get it right.
------EDIT--------
I've tried to swap the two loops around, but I am obviously not getting the bigger picture, as this gives me the same (wrong) result as above.
with (Input) as getsnps:
    for line in getsnps:
        if line[0] == ">":
            hit = line
            for pos in position:
                print hit[pos], pos
Trying an answer:
with (Input) as getletter:
    lines = [x.strip() for x in getletter.readlines() if x.startswith('>')]
for pos in position:
    for line in lines:
        snp = line[pos]
        print("%s\t%s" % (pos, snp))
The file is read and cached into a list (lines, discarding any line that does not start with >).
We then iterate over the positions first and the lines second, which prints the results in the expected order.
Please note that you should check that your offset is not bigger than your line.
Alternative without the list comprehension (it will use more memory, especially if you have a lot of useless lines, i.e. lines not starting with '>'):
with (Input) as getletter:
    lines = getletter.readlines()
for pos in position:
    for line in lines:
        if line.startswith('>'):
            snp = line[pos]
            print("%s\t%s" % (pos, snp))
Alternative with another storage (assuming position is small and Input is big):
with (Input) as getletter:
    storage = dict()
    for p in position:
        storage[p] = []
    for line in getletter:
        if line.startswith('>'):
            for p in position:
                storage[p].append(line[p])
for (k, v) in storage.items():
    print("%s -> %s" % (k, ",".join(v)))
If position contains a value that is not a valid index for a line, line[pos] will raise an exception (IndexError). You can either catch it:
try:
    a = line[pos]
except IndexError:
    a = 'X'
or test for it first:
if pos >= len(line):
    a = 'X'
else:
    a = line[pos]
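Putting the loop order and the bounds check together, here is a self-contained Python 3 sketch. The function name bases_by_position, the missing placeholder and the in-memory sample list are assumptions for illustration; the three records are the ones from the question.

```python
def bases_by_position(lines, positions, missing="X"):
    """Return (character, position) pairs grouped by position, then by record."""
    # Keep only the records, i.e. lines starting with '>'.
    records = [line.rstrip("\n") for line in lines if line.startswith(">")]
    result = []
    for pos in positions:        # outer loop: one position at a time
        for record in records:   # inner loop: every record for that position
            base = record[pos] if pos < len(record) else missing
            result.append((base, pos))
    return result


sample = [
    ">Ind1 AACTCAGCTCACG\n",
    ">Ind2 GTCATCGCTACGA\n",
    ">Ind3 CTTCAAACTGACT\n",
]
rows = bases_by_position(sample, [7, 9, 12])
for base, pos in rows:
    print(base, pos)
```

The outer loop decides the grouping: with positions outside and records inside, all characters for position 7 are printed before any for position 9.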