I have implemented the Welsh-Powell graph coloring algorithm.
The input is a text file in which each line holds two numbers: the first is a vertex and the second is one of its neighbors.
It looks like this:
1 2
1 4
1 5
1 8
1 9
1 11
2 3
2 4
2 5
2 6
2 7
2 8
2 10
2 11
2 12
3 4
3 5
3 6
3 7
3 8
3 9
3 11
4 6
4 9
4 10
4 11
4 12
5 6
5 7
5 10
5 11
5 12
6 7
6 10
6 11
6 12
7 8
7 10
7 11
7 12
8 10
8 11
8 12
9 10
9 12
10 12
The output should also be a text file with two numbers per line, where the first is the vertex and the second is its color number, like this:
1 2
2 1
3 2
4 3
5 3
6 4
7 5
8 3
9 1
10 2
11 6
12 6
The number of colors used is not important, as long as it does not exceed the number of vertices.
This is graphColoring.py
def file_to_graph(filename):
    file1 = open(filename, 'r')
    lines = file1.readlines()
    graph = [[], []]
    for line in lines:
        if line == '':
            continue
        row = line.split(' ')
        from_vertex = int(row[0])
        to_vertex = int(row[1])
        found_from_vertex = -1
        found_to_vertex = -1
        for i in range(len(graph[0])):
            if graph[0][i][0] == from_vertex:
                graph[0][i][1] += 1
                found_from_vertex = i
            if graph[0][i][0] == to_vertex:
                graph[0][i][1] += 1
                found_to_vertex = i
        if found_from_vertex == -1:
            graph[0].append([from_vertex, 1])
        if found_to_vertex == -1:
            graph[0].append([to_vertex, 1])
        graph[1].append((from_vertex, to_vertex))
    return graph

def get_color(colored_vertexes, vertex):
    for colored_vertex in colored_vertexes:
        if colored_vertex[0] == vertex:
            return colored_vertex[1]
    return -1

def color_graph(graph, max_colors=-1):
    sorted_vertexes = sorted(graph[0], key=lambda x: x[1])
    colored_vertexes = []
    for vertex_with_edges in reversed(sorted_vertexes):
        colored_vertexes.append([vertex_with_edges[0], 0])
    actual_color = 1
    while True:
        for vertex_with_edge_count in reversed(sorted_vertexes):
            color = 0
            colored_vertex_index = -1
            vertex_name = vertex_with_edge_count[0]
            for i in range(len(colored_vertexes)):
                if colored_vertexes[i][0] == vertex_name:
                    color = colored_vertexes[i][1]
                    colored_vertex_index = i
                    break
            if color != 0:
                continue
            color_of_neighbours = []
            for (from_vertex, to_vertex) in graph[1]:
                if from_vertex == vertex_name:
                    color_of_neighbours.append(get_color(colored_vertexes, to_vertex))
                if to_vertex == vertex_name:
                    color_of_neighbours.append(get_color(colored_vertexes, from_vertex))
            if actual_color not in color_of_neighbours:
                colored_vertexes[colored_vertex_index][1] = actual_color
        actual_color += 1
        is_colored = True
        for colored_vertex in colored_vertexes:
            if colored_vertex[1] == 0:
                is_colored = False
                break
        if is_colored:
            break
    return colored_vertexes
And this is main.py
from graphColoring import file_to_graph, color_graph
import sys

if __name__ == '__main__':
    graph = file_to_graph(sys.argv[1])
    colored_vertexes = color_graph(graph)
    for colored_vertex in sorted(colored_vertexes, key=lambda x: x[0]):
        print(str(colored_vertex[0]) + " " + str(colored_vertex[1]))
EDIT//: The input problem is solved, but I still don't get the expected output. Instead of the expected output above, I get this:
1 2
2 1
3 2
4 4
5 6
6 5
7 4
8 5
9 1
10 3
11 3
12 2
//
I am supposed to run "python main.py ./input.txt" to pass in the text file, but when I run the program it says:
line 7, in <module>
graph = file_to_graph(sys.argv[1])
~~~~~~~~^^^
IndexError: list index out of range
I'm not very good at programming, and all of this is really hard for me to understand, so I would really appreciate some help getting this to work.
It works for me, and it also gives your desired output:
1 1
2 2
3 1
4 2
I think the problem is how you run the script, because when I run
python .\main.py I also get
Traceback (most recent call last):
File "C:\Users\fhdwnig\Desktop\main.py", line 5, in <module>
graph = file_to_graph(sys.argv[1])
~~~~~~~~^^^
You have to run it like this:
python .\main.py .\input.txt
(on Windows)
Without that argument, the script is missing the input file.
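To make a missing argument easier to diagnose than a raw IndexError, main.py could check the length of sys.argv first. This is only a minimal sketch: the helper name get_input_path and the usage message are made up for illustration and are not part of the original script.

```python
import sys

def get_input_path(argv):
    """Return the input file path from argv, or None if it was not given."""
    # argv[0] is the script name, so a path is present only if len(argv) >= 2.
    if len(argv) < 2:
        return None
    return argv[1]

if __name__ == '__main__':
    path = get_input_path(sys.argv)
    if path is None:
        print("usage: python main.py <input file>")
    else:
        print("reading graph from " + path)
```

With a check like this, running python main.py without a file prints the usage line instead of a traceback.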
To get the expected output, change this line
sorted_vertexes = sorted(graph[0], key=lambda x: x[1])
to this one:
sorted_vertexes = sorted(sorted(graph[0][::-1],key=lambda x:x[0], reverse=True), key=lambda x: x[1])
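The one-line change above only alters how vertices with the same degree are ordered: after the reversed() in color_graph, ties end up in ascending vertex order. The sketch below shows the same idea in a compact, self-contained form. It is an illustrative rewrite, not the asker's code: the function name welsh_powell, the adjacency-set representation and the (-degree, vertex) sort key are all assumptions of this example.

```python
from collections import defaultdict

def welsh_powell(edges):
    """Greedy Welsh-Powell coloring; equal degrees are ordered by vertex number."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    # Highest degree first; among equal degrees, lowest vertex number first.
    order = sorted(neighbours, key=lambda v: (-len(neighbours[v]), v))
    colour = {}
    current = 0
    while len(colour) < len(order):
        current += 1
        for v in order:
            if v in colour:
                continue
            # Give v the current colour only if no neighbour already has it.
            if all(colour.get(n) != current for n in neighbours[v]):
                colour[v] = current
    return colour

# Small example: a triangle 1-2-3 with an extra vertex 4 attached to 3.
for vertex, c in sorted(welsh_powell([(1, 2), (2, 3), (1, 3), (3, 4)]).items()):
    print(vertex, c)
```

Because the result depends on the tie-breaking rule, two correct implementations can produce different but equally valid colorings for the same graph.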
Assume I have the following pandas data frame:
my_class value
0 1 1
1 1 2
2 1 3
3 2 4
4 2 5
5 2 6
6 2 7
7 2 8
8 2 9
9 3 10
10 3 11
11 3 12
I want to identify the indices of "my_class" where the class changes and remove n rows after and before this index. The output of this example (with n=2) should look like:
my_class value
0 1 1
5 2 6
6 2 7
11 3 12
My approach:
# where class changes happen
s = df['my_class'].ne(df['my_class'].shift(-1).fillna(df['my_class']))
# mask with `bfill` and `ffill`
df[~(s.where(s).bfill(limit=1).ffill(limit=2).eq(1))]
Output:
my_class value
0 1 1
5 2 6
6 2 7
11 3 12
One possible solution is to:
Make use of the fact that the index contains consecutive integers.
Find the index values where the class changes.
For each such index i, generate the sequence of indices from i-2 to i+1 and concatenate them.
Retrieve the rows whose indices are not in this list.
The code to do it is:
ind = df[df['my_class'].diff().fillna(0, downcast='infer') == 1].index
df[~df.index.isin([item for sublist in
[ range(i-2, i+2) for i in ind ] for item in sublist])]
import numpy as np
import pandas as pd

my_class = np.array([1] * 3 + [2] * 6 + [3] * 3)
cols = np.c_[my_class, np.arange(len(my_class)) + 1]
df = pd.DataFrame(cols, columns=['my_class', 'value'])
df['diff'] = df['my_class'].diff().fillna(0)
# collect the indices of the 2 rows before and the 2 rows after each change
idx2drop = []
for i in df[df['diff'] == 1].index:
    idx2drop += range(i - 2, i + 2)
print(df.drop(idx2drop)[['my_class', 'value']])
Output:
my_class value
0 1 1
5 2 6
6 2 7
11 3 12
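The approaches above hard-code n = 2 and, in two cases, assume the class increases by exactly 1 at each change. A possible generalization to any n and any kind of change is sketched below. The function name drop_around_changes is made up for this example, and it works on row positions rather than index labels, which is an assumption that matches the default integer index used here.

```python
import numpy as np
import pandas as pd

def drop_around_changes(df, column, n):
    """Drop the n rows before and the n rows after every change in `column`."""
    values = df[column].to_numpy()
    # Positions p where values[p] differs from values[p - 1].
    change_pos = np.flatnonzero(values[1:] != values[:-1]) + 1
    keep = np.ones(len(df), dtype=bool)
    for p in change_pos:
        # Rows p-n .. p-1 come before the change, rows p .. p+n-1 after it.
        keep[max(p - n, 0):p + n] = False
    return df[keep]

df = pd.DataFrame({'my_class': [1] * 3 + [2] * 6 + [3] * 3,
                   'value': range(1, 13)})
print(drop_around_changes(df, 'my_class', 2))
```

For the example frame and n = 2 this keeps the rows with index 0, 5, 6 and 11.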
Here's the program:
layout = "{0:>5}"
layout += "{1:>10}"
for i in range(2, 13):
    layout += "{"+str(i)+":9>}"
index = []
for i in range(13):
    index.append(i)
index = tuple(index)
print(layout.format(*index))
and it prints out like this:
0 123456789101112
but I want it to look something like this(number of spaces might be wrong):
0 1 2 3 4 5 6 7 8 9 10 11 12
What did I do wrong?
":9>}"
should be
":>9}"
This gives:
0 1 2 3 4 5 6 7 8 9 10 11 12
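The reason the order matters is that a format spec is read as [[fill]align][width]: a character directly before the alignment symbol is taken as the fill character, not as the width. A small sketch showing the three cases (the variable names are only for illustration):

```python
# Format spec order is [[fill]align][width], so "9>" means: fill with '9',
# right-align, and no width at all (nothing gets padded).
no_width = "[{0:9>}]".format(7)
# ">9" means: right-align in a field 9 characters wide (space is the default fill).
right_aligned = "[{0:>9}]".format(7)
# "9>3" combines both: fill with '9', right-align, width 3.
filled = "[{0:9>3}]".format(7)

print(no_width)       # [7]
print(right_aligned)  # [        7]
print(filled)         # [997]
```

That is why the original layout printed the numbers 1 to 12 with no spacing between them.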
To make it look exactly like what you ask for:
Actually, you're asking for something a bit unusual, but here's a more succinct way to write what you wrote:
layout = "{0:>5}{1:>5}" + ''.join("{" + str(i) + ":>4}" for i in range(2, 13))
print(layout.format(*range(13)))
Gives:
0 1 2 3 4 5 6 7 8 9 10 11 12
I'm trying to use nested loops to print output that looks like this:
1 2 3 4
5 6 7 8
9 10 11 12
This is what I have so far:
def main11():
    for n in range(1,13):
        print(n, end=' ')
However, this prints the numbers in one line: 1 2 3 4 5 6 7 8 9 10 11 12
You can do that using string formatting:
for i in range(1,13):
    print('{:2}'.format(i), end=' ')
    if i%4==0: print()
[OUTPUT]
1 2 3 4
5 6 7 8
9 10 11 12
Modulus Operator (%)
for n in range(1,13):
    print(n, end=' ')
    if n%4 == 0:
        print()
for offset in range(3):
    for i in range(1,5):
        n = offset*4 + i
        print(n, end=' ')
    print()
Output:
1 2 3 4
5 6 7 8
9 10 11 12
Or if you want it nicely formatted the way you have in your post:
for offset in range(3):
    for i in range(1,5):
        n = offset*4 + i
        print("% 2s"%n, end=' ')
    print()
Output:
1 2 3 4
5 6 7 8
9 10 11 12
Most of the time when you write a for loop, you should check if this is the right implementation.
From the requirements you have, I would write something like this:
NB_NB_INLINE = 4
MAX_NB = 12
start = 1
while start < MAX_NB:
    print( ("{: 3d}" * NB_NB_INLINE).format(*tuple( j+start for j in range(NB_NB_INLINE))) )
    start += NB_NB_INLINE
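Another way to look at the problem is to cut the list of numbers into chunks of four and format each chunk as one line. The following is only a sketch of that idea; the function name format_grid and its parameters are invented for this example.

```python
def format_grid(first, last, per_row, width=2):
    """Return the numbers first..last as text, per_row numbers on each line."""
    numbers = list(range(first, last + 1))
    lines = []
    # Step through the list per_row items at a time; each slice is one row.
    for i in range(0, len(numbers), per_row):
        chunk = numbers[i:i + per_row]
        lines.append(" ".join("{:>{w}}".format(n, w=width) for n in chunk))
    return "\n".join(lines)

print(format_grid(1, 12, 4))
```

Building the text first and printing it once also makes the result easy to check or reuse, and a last row with fewer numbers is handled by the slice without extra code.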
I have a sheet of numbers, separated by spaces into columns. Each column represents a different category, and within each column, each number represents a different value. For example, column number four represents age, and within that column, the number 5 represents an age of 44-55. Each row is a different person's record. I'd like to use a Python script to search through the sheet and find all rows where the sixth column is "1". After that, I want to know how many times each number in column one appears in those rows. The script should tell the user: "While column six equals '1', the value '1' appears 12 times in column one. The value '2' appears 18 times..." and so on. Basically, I just want it to list the numbers with their counts. I'm new to Python. I've attached my code below. I think I should be using dictionaries, but I'm not sure how, and so far I haven't come close to figuring this out. I would really appreciate it if someone could walk me through the logic behind such code. Thank you so much!
ldata = open("list.data", "r")
income_dist = {}
for line in ldata:
linelist = line.strip().split(" ")
key_income_dist = linelist[6]
if key_income_dist in income_dist:
income_dist[key_income_dist] = 1 + income_dist[key_income_dist]
else:
income_dist[key_income_dist] = 1
ldata.close()
print value_no_occupations
First, indentation is majorly important in Python and the above is bad: the 5 lines following linelist = line.strip().split(" ") need to be indented to be in the loop like they should be.
Next they should be indented further and this line added before them:
if len(linelist)>6 and linelist[6]=="1":
This line skips over short lines (there are some), and tests for what you said you wanted: "where column six equals "1."" This is column [6] where the first number on the line is referenced as [0] (these are "offsets", not "cardinal", or counting, numbers).
You'll probably want to change key_income_dist = linelist[6] to key_income_dist = linelist[0] or [1] to get what you want. Play around if necessary.
Finally, you should say print income_dist at the end to get a look at your results. If you want fancier output, study up on formatting.
This is actually easier than it seems! The key is collections.Counter
from collections import Counter
ldata = open("list.data")
rows = [tuple(row.split()) for row in ldata if row.split()[5] == "1"]
# warning this will break if some rows are shorter than 6 columns
first_col = Counter(item[0] for item in rows)
If you want the distribution of every column (not just the first) do:
distribution = {column: Counter(item[column] for item in rows) for column in range(len(rows[0]))}
# warning this will break if all rows are not the same size!
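If short or blank rows are a concern, the filter and the count can be combined in one loop that checks the row length first. This is a hedged sketch rather than a drop-in replacement: the function name count_first_column and its parameters are hypothetical, and the sample lines stand in for the contents of list.data.

```python
from collections import Counter

def count_first_column(lines, filter_index=5, filter_value="1"):
    """Count column-one values over the rows whose sixth column equals '1'."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip blank lines and rows that are too short to have a sixth column.
        if len(fields) > filter_index and fields[filter_index] == filter_value:
            counts[fields[0]] += 1
    return counts

sample = [
    "9 2 1 3 5 1 5 2 3 1 2 3 7 1",
    "6 1 3 3 4 1 5 1 1 0 2 3 7 1",
    "9 1 1 5 5 5 5 3 5 2 1 1 7 1",
    "",
    "1 2",
]
counts = count_first_column(sample)
print("While column six equals '1',")
for value in sorted(counts):
    print("the value %s appears %d times in column one." % (value, counts[value]))
```

An open file object can be passed as lines in place of the sample list, since iterating over a file also yields one line at a time.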
Considering that the data file has ~9000 rows of data, if you don't need to keep the original data, you can combine steps 1 and 2 so that the program uses less memory and runs a little faster.
ldata = open("list.data", "r")
# read in all the rows, note that the list values are strings instead of integers
# keep only the rows with 6th column = '1'
only1 = []
for line in ldata:
    if line.strip() == '': # ignore blank lines
        continue
    row = tuple(line.strip().split(" "))
    if row[5] == '1':
        only1.append(row)
ldata.close()
# tally the statistics
income_dist = {}
for row in only1:
    if row[0] in income_dist:
        income_dist[row[0]] += 1
    else:
        income_dist[row[0]] = 1
# print result
print "While column six equals '1',"
for num in sorted(income_dist):
    print "the value %s appears %d times in column one." % (num, income_dist[num])
Sample Test Data in list.data:
9 2 1 5 4 5 5 3 3 0 1 1 7 NA
9 1 1 5 5 5 5 3 5 2 1 1 7 1
9 2 1 3 5 1 5 2 3 1 2 3 7 1
1 2 5 1 2 6 5 1 4 2 3 1 7 1
1 2 5 1 2 6 3 1 4 2 3 1 7 1
8 1 1 6 4 8 5 3 2 0 1 1 7 1
1 1 5 2 3 9 4 1 3 1 2 3 7 1
6 1 3 3 4 1 5 1 1 0 2 3 7 1
2 1 1 6 3 8 5 3 3 0 2 3 7 1
4 1 1 7 4 8 4 3 2 0 2 3 7 1
1 1 5 2 4 1 5 1 1 0 2 3 7 1
4 2 2 2 3 2 5 1 2 0 1 1 5 1
8 2 1 3 6 6 2 2 4 2 1 1 7 1
7 2 1 5 3 5 5 3 4 0 2 1 7 1
1 1 5 2 3 9 4 1 3 1 2 3 7 1
6 1 3 3 4 1 5 1 1 0 2 3 7 1
2 1 1 6 3 8 5 3 3 0 2 3 7 1
4 1 1 7 4 8 4 3 2 0 2 3 7 1
1 1 5 2 4 9 5 1 1 0 2 3 7 1
4 2 2 2 3 2 5 1 2 0 1 1 5 1
Following your original program logic, I came up with this version:
ldata = open("list.data", "r")
# read in all the rows, note that the list values are strings instead of integers
linelist = []
for line in ldata:
    linelist.append(tuple(line.strip().split(" ")))
ldata.close()
# keep only the rows with 6th column = '1'
only1 = []
for row in linelist:
    if row[5] == '1':
        only1.append(row)
# tally the statistics
income_dist = {}
for row in only1:
    if row[0] in income_dist:
        income_dist[row[0]] += 1
    else:
        income_dist[row[0]] = 1
# print result
print "While column six equals '1',"
for num in sorted(income_dist):
    print "the value %s appears %d times in column one." % (num, income_dist[num])
I was wondering how I could align every item in one list, to the corresponding index in the second list. Here is my code so far:
letters=['a','ab','abc','abcd','abcde','abcdef','abcdefg','abcdefgh','abcdefghi','abcdefghij']
numbers=[1,2,3,4,5,6,7,8,9,10]
for x in range(len(letters)):
    print letters[x]+"----------",numbers[x]
This is the output I get:
a---------- 1
ab---------- 2
abc---------- 3
abcd---------- 4
abcde---------- 5
abcdef---------- 6
abcdefg---------- 7
abcdefgh---------- 8
abcdefghi---------- 9
abcdefghij---------- 10
This is the output I want:
a---------- 1
ab--------- 2
abc-------- 3
abcd------- 4
abcde------ 5
abcdef----- 6
abcdefg---- 7
abcdefgh--- 8
abcdefghi-- 9
abcdefghij- 10
You could use string formatting:
for left, right in zip(letters, numbers):
    print '{0:-<12} {1}'.format(left, right)
And the output:
a----------- 1
ab---------- 2
abc--------- 3
abcd-------- 4
abcde------- 5
abcdef------ 6
abcdefg----- 7
abcdefgh---- 8
abcdefghi--- 9
abcdefghij-- 10
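The width 12 above is fixed by hand. If the column should instead adapt to the longest string, the width can be computed and passed into the format spec. The sketch below is written for Python 3 (unlike the Python 2 snippets in this thread), and the function name align is made up for this example.

```python
def align(letters, numbers, fill="-"):
    """Pad each string with `fill` so that all numbers start in the same column."""
    # One more than the longest string, so even the longest gets one fill character.
    width = max(len(s) for s in letters) + 1
    return ["{0:{fill}<{width}} {1}".format(left, right, fill=fill, width=width)
            for left, right in zip(letters, numbers)]

letters = ['a', 'ab', 'abc', 'abcd', 'abcde', 'abcdef', 'abcdefg',
           'abcdefgh', 'abcdefghi', 'abcdefghij']
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
for line in align(letters, numbers):
    print(line)
```

For the lists from the question the computed width is 11, which reproduces the requested output exactly.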
Something like this, using string formatting:
def solve(letters,numbers):
    it=iter(range( max(numbers) ,0,-1))
    for x,y in zip(letters,numbers):
        print "{0}{1} {2}".format(x,"-"*next(it),y)

In [38]: solve(letters,numbers)
a---------- 1
ab--------- 2
abc-------- 3
abcd------- 4
abcde------ 5
abcdef----- 6
abcdefg---- 7
abcdefgh--- 8
abcdefghi-- 9
abcdefghij- 10
letters=['a','ab','abc','abcd','abcde','abcdef','abcdefg','abcdefgh','abcdefghi','abcdefghij']
for c,x in enumerate(letters, start=1):
    print x+("-"*(10-c))+" %s" % c
a--------- 1
ab-------- 2
abc------- 3
abcd------ 4
abcde----- 5
abcdef---- 6
abcdefg--- 7
abcdefgh-- 8
abcdefghi- 9
abcdefghij 10
letters=['a','ab','abc','abcd','abcde','abcdef','abcdefg','abcdefgh','abcdefghi','abcdefghij']
numbers=[1,2,3,4,5,6,7,8,9,10]
for x in range(len(letters)):
    print '{0:11}{1}'.format(letters[x],numbers[x]).replace(' ','-')
a----------1
ab---------2
abc--------3
abcd-------4
abcde------5
abcdef-----6
abcdefg----7
abcdefgh---8
abcdefghi--9
abcdefghij-10