Using NetworkX all_simple_paths gives AttributeError - python

I have a graph whose adjacency matrix has the form below (a 6-node graph where self-edges are 0, missing connections are marked inf, and all other edges are 1):
{1: {1: 0, 2: 1, 3: inf, 4: inf, 5: inf, 6: inf}, 2: {1: 1, 2: 0, 3: inf, 4: 1, 5: 1, 6: inf}, 3: {1: inf, 2: inf, 3: 0, 4: 1, 5: inf, 6: inf}, 4: {1: inf, 2: 1, 3: 1, 4: 0, 5: 1, 6: 1}, 5: {1: inf, 2: 1, 3: inf, 4: 1, 5: 0, 6: inf}, 6: {1: inf, 2: inf, 3: inf, 4: 1, 5: inf, 6: 0}}
I want to use the networkx package for its all_simple_paths function to find all simple paths from a source to a destination, but when I call
nx.all_simple_paths(graph, src, dst)
it gives:
AttributeError: 'dict' object has no attribute 'is_multigraph'
I currently do not have the graph in any other format. How should I resolve this issue?
Thanks.

Your graph is currently stored as a dictionary. It's a little unfair to expect networkx to work automagically on any data structure you choose. Even if it were set up to handle a dictionary in the way you've done it, how would it know how to interpret a 0 or inf?
To use networkx commands you'll need your graph to be in the networkx Graph format.
import networkx as nx

D = {1: {1: 0, 2: 1, 3: float('inf'), 4: float('inf'), 5: float('inf'), 6: float('inf')},
     2: {1: 1, 2: 0, 3: float('inf'), 4: 1, 5: 1, 6: float('inf')},
     3: {1: float('inf'), 2: float('inf'), 3: 0, 4: 1, 5: float('inf'), 6: float('inf')},
     4: {1: float('inf'), 2: 1, 3: 1, 4: 0, 5: 1, 6: 1},
     5: {1: float('inf'), 2: 1, 3: float('inf'), 4: 1, 5: 0, 6: float('inf')},
     6: {1: float('inf'), 2: float('inf'), 3: float('inf'), 4: 1, 5: float('inf'), 6: 0}}

G = nx.Graph()
for node, neighbor_dict in D.items():
    G.add_node(node)  # make sure isolated nodes still end up in the graph
    for neighbor, val in neighbor_dict.items():
        if val != 0 and val < float('inf'):
            G.add_edge(node, neighbor, weight=val)

for path in nx.all_simple_paths(G, 1, 3):
    print(path)
>[1, 2, 4, 3]
>[1, 2, 5, 4, 3]
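If you'd rather not write the loop yourself, networkx can also ingest a dict-of-dicts directly once the 0/inf entries are filtered out. A minimal sketch, assuming the same D as above (from_dict_of_dicts expects the inner values to be edge-attribute dicts, hence the weight wrapping):

import networkx as nx

# Drop self-edges (0) and non-edges (inf), wrap each weight in an
# attribute dict, then let networkx build the graph in one call.
finite = {u: {v: {'weight': w} for v, w in nbrs.items()
              if w != 0 and w < float('inf')}
          for u, nbrs in D.items()}
G = nx.from_dict_of_dicts(finite)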

Related

Python Looping - storing dataframes from .txt file loop, with different lengths

I would like to loop through a bunch of .txt files, processing each one (removing columns, renaming, handling NaNs, etc.) to get an end dataframe df1, which has date, lat, lon, and variable columns. Over the loop, I would like to build df_all, holding the information from all the files (most likely in date order).
However, my dataframes have different lengths, and some of them may share the same date + lat/lon values.
I have written code that reads in and processes files individually, but I'm stuck on how to turn this into a larger loop (via concat/append...?).
I am trying to end up with one large dataframe (df_all) that contains all the 'scattered' information from the different files (the df1 outputs). In addition, where there is a conflicting date and lat/lon, I would like to take the mean. Is this possible to do in python/pandas?
Any help at all on any of the multiple issues would be greatly appreciated! Or ideas on how to go about this.
Here are fake tables that are read in by a for-loop and concatenated into one big table. Once all the rows are in a single big table, you can group the rows that share the same value in the A column and, as an example, take the mean of the B and C columns. You should be able to run this chunk of code yourself, and I hope it gives you keywords for finding other questions similar to yours!
import pandas as pd

# Making fake table read-ins; you'd be using pd.read_csv or similar
def fake_read_table(name):
    small_df1 = pd.DataFrame({'A': {0: 5, 1: 1, 2: 3, 3: 1}, 'B': {0: 4, 1: 4, 2: 4, 3: 4}, 'C': {0: 2, 1: 1, 2: 4, 3: 1}})
    small_df2 = pd.DataFrame({'A': {0: 4, 1: 5, 2: 1, 3: 4, 4: 3, 5: 2, 6: 5, 7: 1}, 'B': {0: 3, 1: 1, 2: 1, 3: 1, 4: 5, 5: 1, 6: 4, 7: 2}, 'C': {0: 4, 1: 1, 2: 5, 3: 2, 4: 4, 5: 4, 6: 5, 7: 2}})
    small_df3 = pd.DataFrame({'A': {0: 2, 1: 2, 2: 4, 3: 3, 4: 1, 5: 4, 6: 5}, 'B': {0: 1, 1: 2, 2: 3, 3: 1, 4: 3, 5: 5, 6: 4}, 'C': {0: 5, 1: 2, 2: 3, 3: 3, 4: 5, 5: 4, 6: 5}})
    if name == '1.txt':
        return small_df1
    if name == '2.txt':
        return small_df2
    if name == '3.txt':
        return small_df3

# Start here
txt_paths = ['1.txt', '2.txt', '3.txt']
big_df = pd.DataFrame()
for txt_path in txt_paths:
    small_df = fake_read_table(txt_path)
    # .. do some processing you need to do somewhere in here ..
    big_df = pd.concat((big_df, small_df))

# Taking the average B and C values for rows that have the same A value
agg_df = big_df.groupby('A').agg(
    mean_B=('B', 'mean'),
    mean_C=('C', 'mean'),
).reset_index()
print(agg_df)
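One note on the loop above: growing a dataframe with repeated pd.concat copies the accumulated rows on every iteration. A small variant (same fake_read_table assumed) collects the pieces in a list and concatenates once:

frames = []
for txt_path in txt_paths:
    small_df = fake_read_table(txt_path)
    # .. per-file processing goes here ..
    frames.append(small_df)
big_df = pd.concat(frames, ignore_index=True)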

Why doesn't json.loads() work for multiple columns?

Is it possible to apply json.loads to multiple columns? If I do something like:
df['col1'] = df['col1'].apply(json.loads)
I can apply it to each entry in col1 and everything is fine. But if I do something like,
df[['col1', 'col2', 'col3']] = df[['col1', 'col2', 'col3']].apply(json.loads)
I get the error:
TypeError: the JSON object must be str, bytes or bytearray, not Series.
Why doesn't this way work? Is it possible to apply it all at once or should I just do each column individually?
Per https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.apply.html, apply applies a function along an axis, so the function receives an entire row or column Series at a time; that is why json.loads complains about being given a Series.
One way is to write a function that works on the row Series. The example doesn't apply json.loads, because the sample input would need to be JSON, but you can change the apply_transform function to meet your needs.
(sorry for the to_dict() output but I have trouble getting dataframe output into text editors)
import pandas as pd
import numpy as np

columns = ["col1", "col2", "col3", "col4", "col5"]
df = pd.DataFrame(np.random.randint(0, 5, size=(5, 5)), columns=columns)
df.to_dict()
# {'col1': {0: 4, 1: 3, 2: 3, 3: 1, 4: 2},
#  'col2': {0: 1, 1: 1, 2: 3, 3: 3, 4: 1},
#  'col3': {0: 1, 1: 2, 2: 3, 3: 4, 4: 3},
#  'col4': {0: 0, 1: 1, 2: 1, 3: 3, 4: 2},
#  'col5': {0: 3, 1: 2, 2: 0, 3: 2, 4: 0}}
You will notice that after the transformation, the values in the first three columns have been doubled. You would replace the multiplication with your own transform:
def apply_transform(row):
    new_row = row.copy()
    for col in ['col1', 'col2', 'col3']:
        new_row[col] = new_row[col] * 2  # apply your own transform here
    return new_row

df_new = df.apply(apply_transform, axis=1)
df_new.to_dict()
# {'col1': {0: 8, 1: 6, 2: 6, 3: 2, 4: 4},
# 'col2': {0: 2, 1: 2, 2: 6, 3: 6, 4: 2},
# 'col3': {0: 2, 1: 4, 2: 6, 3: 8, 4: 6},
# 'col4': {0: 0, 1: 1, 2: 1, 3: 3, 4: 2},
# 'col5': {0: 3, 1: 2, 2: 0, 3: 2, 4: 0}}
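To answer the "is it possible all at once" part directly: yes, if json.loads is mapped element-wise instead of being handed a whole Series. A sketch, assuming the three columns really do hold JSON strings:

import json

# apply hands each column to the lambda as a Series; .map then runs
# json.loads on every individual entry of that Series.
cols = ['col1', 'col2', 'col3']
df[cols] = df[cols].apply(lambda col: col.map(json.loads))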

When I use .copy() in python, why does it still make references of one dictionary? [duplicate]

This question already has answers here:
List of lists changes reflected across sublists unexpectedly
(17 answers)
Closed 1 year ago.
In python, I want to make a 2D array of dictionaries. I do have knowledge of references, so I explicitly used .copy(). When I print the array, however, the dictionaries that I did not want to change also change.
My code is below.
dicts = []
for j in range(3):
    dicts.append([{0: 0, 1: 0, 2: 0, 3: 0}.copy()].copy() * 3)
dicts[0][0][0] = 5
dicts[1][1][0] = 10
print(dicts)
OUTPUT:
[[{0: 5, 1: 0, 2: 0, 3: 0}, {0: 5, 1: 0, 2: 0, 3: 0}, {0: 5, 1: 0, 2: 0, 3: 0}], [{0: 10, 1: 0, 2: 0, 3: 0}, {0: 10, 1: 0, 2: 0, 3: 0}, {0: 10, 1: 0, 2: 0, 3: 0}], [{0: 0, 1: 0, 2: 0, 3: 0}, {0: 0, 1: 0, 2: 0, 3: 0}, {0: 0, 1: 0, 2: 0, 3: 0}]]
Does anyone know why this happens, and anyway to fix it? Thank you.
The way to solve this kind of thing cleanly is with list comprehensions:
dicts = [[{0:0,1:0,2:0,3:0} for i in range(3)] for j in range(3)]
dicts[0][0][0] = 5
dicts[1][1][0] = 10
print(dicts)
Output:
[[{0: 5, 1: 0, 2: 0, 3: 0}, {0: 0, 1: 0, 2: 0, 3: 0}, {0: 0, 1: 0, 2: 0, 3: 0}], [{0: 0, 1: 0, 2: 0, 3: 0}, {0: 10, 1: 0, 2: 0, 3: 0}, {0: 0, 1: 0, 2: 0, 3: 0}], [{0: 0, 1: 0, 2: 0, 3: 0}, {0: 0, 1: 0, 2: 0, 3: 0}, {0: 0, 1: 0, 2: 0, 3: 0}]]
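The reason the original version failed: list multiplication (and list.copy(), which is shallow) duplicates references, not the dictionaries themselves, so every slot in a row pointed at one shared dict. A tiny demonstration:

row = [{0: 0}] * 3   # one dict, three references to it
row[0][0] = 99
print(row)           # [{0: 99}, {0: 99}, {0: 99}]

row = [{0: 0} for _ in range(3)]   # three independent dicts
row[0][0] = 99
print(row)           # [{0: 99}, {0: 0}, {0: 0}]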

Python doesn't return correct values with a dictionary using a function [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 2 years ago.
I have a function that returns an array of values and another array of dictionaries. All the dictionaries should be different, but it returns the same one every time.
When I print them from inside the function I get the correct values, for example:
{0: 0}
{0: 1, 1: 0}
{0: 2, 1: 1, 2: 0}
{0: 3, 1: 5, 2: 0}
{0: 4, 1: 1, 2: 0}
{0: 5, 1: 0, 2: 0}
{0: 6, 1: 5, 2: 0}
{0: 7, 1: 2, 2: 1, 3: 0}
But when I return the arrays I get this (the wrong answer):
([0, 2, 4, 4, 6, 1, 6, 5], # This array is correct
[{0: 7, 1: 2, 2: 1, 3: 0}, # From here is incorrect
{0: 7, 1: 2, 2: 1, 3: 0},
{0: 7, 1: 2, 2: 1, 3: 0},
{0: 7, 1: 2, 2: 1, 3: 0},
{0: 7, 1: 2, 2: 1, 3: 0},
{0: 7, 1: 2, 2: 1, 3: 0},
{0: 7, 1: 2, 2: 1, 3: 0},
{0: 7, 1: 2, 2: 1, 3: 0}])
This is the fragment of code with the problem:
.
.
.
for i in range(n):
    j = i
    count = 0
    while parent[j] != -1:
        s[i][count] = j
        count = count + 1
        j = parent[j]
    s[i][count] = start
    ###########
    print(s[i])
    ###########
return dist, s
I think this is what you're looking for:
for i in range(n):
    j = i
    s = []
    while parent[j] != -1:
        s.append(j)
        j = parent[j]
    s.append(start)
    path[i] = s[::-1]
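The key change is that s = [] creates a fresh list on every iteration, so the paths never share storage (the original presumably initialized s as n references to one shared container, which is why every slot ended up holding the last path). For completeness, a self-contained sketch of the fix; parent, start, and n are assumed from the question's shortest-path context, with parent[v] being v's predecessor and -1 marking the root:

def build_paths(parent, start, n):
    paths = []
    for i in range(n):
        s = []          # fresh list per node: no shared storage
        j = i
        while parent[j] != -1:
            s.append(j)
            j = parent[j]
        s.append(start)
        paths.append(s[::-1])  # reverse into root-to-node order
    return paths

# Example: predecessor array for a small tree rooted at 0
print(build_paths([-1, 0, 1, 1], start=0, n=4))
# [[0], [0, 1], [0, 1, 2], [0, 1, 3]]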

Elements of dict of sets in python

I have a dictionary like this:
dict1 = {0: set([1, 4, 5]), 1: set([2, 6]), 2: set([3]), 3: set([0]), 4: set([1]), 5: set([2]), 6: set([])}
and from this dictionary I want to build another dictionary that counts, for each key of dict1, how many times it occurs in the other keys' value sets; that is, the result should be:
result_dict = {0: 1, 1: 2, 2: 2, 3: 1, 4: 1, 5: 1, 6: 1}
My code was this:
dict1 = {0: set([1, 4, 5]), 1: set([2, 6]), 2: set([3]), 3: set([0]), 4: set([1]), 5: set([2]), 6: set([])}
result_dict = {}
for pair in dict1.keys():
    temp_dict = list(dict1.keys())
    del temp_dict[pair]
    count = 0
    for other_pairs in temp_dict:
        if pair in dict1[other_pairs]:
            count = count + 1
    result_dict[pair] = count
The problem with this code is that it is very slow on a large data set.
Another attempt was a single line, like this:
result_dict = dict((key, dict1.values().count(key)) for key in dict1.keys())
but it gives me wrong results, since the values of dict1 are sets:
{0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0}
Thanks a lot in advance.
I suppose, for a first stab, I would figure out which values are there:
all_values = set().union(*dict1.values())
Then I'd try to count how many times each value occurred:
result_dict = {}
for v in all_values:
    result_dict[v] = sum(v in dict1[key] for key in dict1)
Another approach would be to use a collections.Counter:
result_dict = Counter(v for set_ in dict1.values() for v in set_)
This is probably "cleaner" than my first solution, but it does involve a nested comprehension, which can be a little difficult to grok. It does work, however:
>>> from collections import Counter
>>> dict1
{0: set([1, 4, 5]), 1: set([2, 6]), 2: set([3]), 3: set([0]), 4: set([1]), 5: set([2]), 6: set([])}
>>> result_dict = Counter(v for set_ in dict1.values() for v in set_)
Just create a second dictionary using the keys from dict1, with values initiated at 0. Then iterate through the values in the sets of dict1, incrementing values of result_dict as you go. The runtime is O(n), where n is the aggregate number of values in sets of dict1.
dict1 = {0: set([1, 4, 5]), 1: set([2, 6]), 2: set([3]), 3: set([0]), 4: set([1]), 5: set([2]), 6: set([])}
result_dict = dict.fromkeys(dict1.keys(), 0)
# {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0}
for i in dict1.keys():
    for j in dict1[i]:
        result_dict[j] += 1
print(result_dict)
# {0: 1, 1: 2, 2: 2, 3: 1, 4: 1, 5: 1, 6: 1}
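One caveat about the Counter one-liner above: keys that never occur in any value set are simply absent from the Counter, while the fromkeys version reports them as 0. A small sketch that combines the two, assuming the same dict1:

from collections import Counter

# Count every value across all the sets, then fill in explicit zeros
# for keys of dict1 that never appear in any value set.
counts = Counter(v for s in dict1.values() for v in s)
result_dict = {k: counts.get(k, 0) for k in dict1}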
