How to load Pandas dataframe into Surprise dataset? - python
I am building a recommender system based on users' ratings for 11 different items.
I started with a dictionary (user_dict) of user ratings:
{'U1': [3, 4, 2, 5, 0, 4, 1, 3, 0, 0, 4],
'U2': [2, 3, 1, 0, 3, 0, 2, 0, 0, 3, 0],
'U3': [0, 4, 0, 5, 0, 4, 0, 3, 0, 2, 4],
'U4': [0, 0, 2, 1, 4, 3, 2, 0, 0, 2, 0],
'U5': [0, 0, 0, 5, 0, 4, 0, 3, 0, 0, 4],
'U6': [2, 3, 4, 0, 3, 0, 3, 0, 3, 4, 0],
'U7': [0, 4, 3, 5, 0, 5, 0, 0, 0, 0, 4],
'U8': [4, 3, 0, 3, 4, 2, 2, 0, 2, 3, 2],
'U9': [0, 2, 0, 3, 1, 0, 1, 0, 0, 2, 0],
'U10': [0, 3, 0, 4, 3, 3, 0, 3, 0, 4, 4],
'U11': [2, 2, 1, 2, 1, 0, 2, 0, 1, 0, 2],
'U12': [0, 4, 4, 5, 0, 0, 0, 3, 0, 4, 5],
'U13': [3, 3, 0, 2, 2, 3, 2, 0, 2, 0, 3],
'U14': [0, 3, 4, 5, 0, 5, 0, 0, 0, 4, 0],
'U15': [2, 0, 0, 3, 0, 2, 2, 3, 0, 0, 3],
'U16': [4, 4, 0, 4, 3, 4, 0, 3, 0, 3, 0],
'U17': [0, 2, 0, 3, 1, 0, 2, 0, 1, 0, 3],
'U18': [2, 3, 1, 0, 3, 2, 3, 2, 0, 2, 0],
'U19': [0, 5, 0, 4, 0, 3, 0, 4, 0, 0, 5],
'U20': [0, 0, 3, 0, 3, 0, 4, 0, 2, 0, 0],
'U21': [3, 0, 2, 4, 2, 3, 0, 4, 2, 3, 3],
'U22': [4, 4, 0, 5, 3, 5, 0, 4, 0, 3, 0],
'U23': [3, 0, 0, 0, 3, 0, 2, 0, 0, 4, 0],
'U24': [4, 0, 3, 0, 3, 0, 3, 0, 0, 2, 2],
'U25': [0, 5, 0, 3, 3, 4, 0, 3, 3, 4, 4]}
I then loaded the dictionary into a Pandas dataframe by using this code:
import pandas as pd
df = pd.DataFrame(user_dict)
userRatings_df = df.T
print(userRatings_df)
This prints the data like so:
0 1 2 3 4 5 6 7 8 9 10
U1 3 4 2 5 0 4 1 3 0 0 4
U2 2 3 1 0 3 0 2 0 0 3 0
U3 0 4 0 5 0 4 0 3 0 2 4
U4 0 0 2 1 4 3 2 0 0 2 0
U5 0 0 0 5 0 4 0 3 0 0 4
U6 2 3 4 0 3 0 3 0 3 4 0
U7 0 4 3 5 0 5 0 0 0 0 4
U8 4 3 0 3 4 2 2 0 2 3 2
U9 0 2 0 3 1 0 1 0 0 2 0
U10 0 3 0 4 3 3 0 3 0 4 4
U11 2 2 1 2 1 0 2 0 1 0 2
U12 0 4 4 5 0 0 0 3 0 4 5
U13 3 3 0 2 2 3 2 0 2 0 3
U14 0 3 4 5 0 5 0 0 0 4 0
U15 2 0 0 3 0 2 2 3 0 0 3
U16 4 4 0 4 3 4 0 3 0 3 0
U17 0 2 0 3 1 0 2 0 1 0 3
U18 2 3 1 0 3 2 3 2 0 2 0
U19 0 5 0 4 0 3 0 4 0 0 5
U20 0 0 3 0 3 0 4 0 2 0 0
U21 3 0 2 4 2 3 0 4 2 3 3
U22 4 4 0 5 3 5 0 4 0 3 0
U23 3 0 0 0 3 0 2 0 0 4 0
U24 4 0 3 0 3 0 3 0 0 2 2
U25 0 5 0 3 3 4 0 3 3 4 4
When I attempt to load it into a Surprise dataset, I run this code:
reader = Reader(rating_scale=(1,5))
userRatings_data=Dataset.load_from_df(userRatings_df[[1,2,3,4,5,6,7,8,9,10]],
reader)
I get this error:
ValueError: too many values to unpack (expected 3)
Can anyone help me to fix this error?
The problem comes from the way you convert your dictionary into a pandas dataframe. For Dataset to be able to process a pandas dataframe, the dataframe must have exactly three columns: the first is the user ID, the second is the item ID, and the third is the actual rating.
This is how I would build a dataframe which would run in "Dataset":
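import pandas as pd

DF = pd.DataFrame()
for key in user_dict.keys():
    # Temporary three-column dataframe for a single user
    df = pd.DataFrame(columns=['User', 'Item', 'Rating'])
    df['Rating'] = pd.Series(user_dict[key])
    df['Item'] = df.index  # position in the ratings list serves as the raw item ID
    df['User'] = key
    # Stack this user's rows under the rows collected so far
    DF = pd.concat([DF, df], axis=0)
DF = DF.reset_index(drop=True)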
Each key in the dictionary is a user ID. For every key I build a temporary dataframe that holds the user ID, that user's ratings, and the ratings' indices, which serve as the raw item IDs. The temporary dataframes are then stacked on top of each other in the final, main dataframe.
Hopefully this helps.
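The same three-column layout can also be produced without an explicit loop. The following is a minimal sketch, not the answer's own code: it uses only two users from the question's dictionary, the variable names (wide, long_df, rated_df) are made up for illustration, and it assumes that a rating of 0 means "not rated", which the question does not state.

```python
import pandas as pd

# Two users taken from the question's dictionary.
user_dict = {
    'U1': [3, 4, 2, 5, 0, 4, 1, 3, 0, 0, 4],
    'U2': [2, 3, 1, 0, 3, 0, 2, 0, 0, 3, 0],
}

# Wide frame: one row per user, one column per item position (0..10).
wide = pd.DataFrame.from_dict(user_dict, orient='index')

# melt() turns every (user, item) cell into its own row.
long_df = (
    wide.rename_axis('User')
        .reset_index()
        .melt(id_vars='User', var_name='Item', value_name='Rating')
)

# ASSUMPTION: 0 means "not rated", so those rows are dropped to stay
# inside the rating_scale=(1, 5) used in the question.
rated_df = long_df[long_df['Rating'] > 0].reset_index(drop=True)

print(rated_df.head())
```

A frame shaped like rated_df[['User', 'Item', 'Rating']] has the three columns that Dataset.load_from_df expects together with the reader from the question.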
Related
Can I use pivotal table to create a heatmap table in pandas
I have this data frame and the desired result data frame:
df = pd.DataFrame(
    {
        "I": ["I1", "I2", "I3", "I4", "I5", "I6", "I7"],
        "A": [1, 1, 0, 0, 0, 0, 0],
        "B": [0, 1, 1, 0, 0, 1, 1],
        "C": [0, 0, 0, 0, 0, 1, 1],
        "D": [1, 1, 1, 1, 1, 0, 1],
        "E": [1, 0, 0, 1, 1, 0, 1],
        "F": [0, 0, 0, 1, 1, 0, 0],
        "G": [0, 0, 0, 0, 1, 0, 0],
        "H": [1, 1, 0, 0, 0, 1, 1],
    })
result = pd.DataFrame(
    {
        "I": ["A", "B", "C", "D", "E", "F", "G", "H"],
        "A": [2, 1, 0, 2, 1, 0, 0, 2],
        "B": [1, 4, 2, 3, 1, 0, 0, 3],
        "C": [0, 2, 2, 1, 1, 0, 0, 2],
        "D": [2, 3, 1, 6, 4, 2, 1, 3],
        "E": [1, 1, 1, 4, 4, 2, 1, 2],
        "F": [0, 0, 0, 2, 2, 2, 1, 0],
        "G": [0, 0, 0, 1, 1, 1, 1, 0],
        "H": [2, 3, 2, 3, 2, 0, 0, 4],
    })
print('input dataframe')
print(df)
print('result dataframe')
print(result)
The result data frame is square (the number of rows and columns is the same), and the value in each cell is the number of rows with 1 in both columns. For example, the cell at A:B is the number of rows with 1 in column A and 1 in column B. In this case the result is 1, since only in row I2 are both values one.
I can write a nested for loop to calculate these values, but I am looking for a better way to do so. Can I use a pivot table for this?
My implementation, which doesn't use a pivot table, is as follows:
df = df.astype(bool)
r = pd.DataFrame(index=df.columns[1:], columns=df.columns[1:])
for c1 in df.columns[1:]:
    for c2 in df.columns[1:]:
        # rows where both columns are True
        tmp = df[c1] & df[c2]
        r.loc[c1, c2] = tmp.sum()
print(r)
Running this code generates:
   A  B  C  D  E  F  G  H
A  2  1  0  2  1  0  0  2
B  1  4  2  3  1  0  0  3
C  0  2  2  1  1  0  0  2
D  2  3  1  6  4  2  1  3
E  1  1  1  4  4  2  1  2
F  0  0  0  2  2  2  1  0
G  0  0  0  1  1  1  1  0
H  2  3  2  3  2  0  0  4
Yes, but you'd be better off with matrix multiplication:
df.iloc[:,1:].T @ df.iloc[:,1:]
Output:
   A  B  C  D  E  F  G  H
A  2  1  0  2  1  0  0  2
B  1  4  2  3  1  0  0  3
C  0  2  2  1  1  0  0  2
D  2  3  1  6  4  2  1  3
E  1  1  1  4  4  2  1  2
F  0  0  0  2  2  2  1  0
G  0  0  0  1  1  1  1  0
H  2  3  2  3  2  0  0  4
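To see why a matrix product gives the co-occurrence counts: entry (X, Y) of the transposed indicator matrix times the indicator matrix is the sum over rows of X*Y, which is 1 only where both columns are 1. A minimal self-contained sketch using the question's data follows; the names indicators and co are illustrative and do not come from the answer.

```python
import pandas as pd

df = pd.DataFrame({
    "I": ["I1", "I2", "I3", "I4", "I5", "I6", "I7"],
    "A": [1, 1, 0, 0, 0, 0, 0],
    "B": [0, 1, 1, 0, 0, 1, 1],
    "C": [0, 0, 0, 0, 0, 1, 1],
    "D": [1, 1, 1, 1, 1, 0, 1],
    "E": [1, 0, 0, 1, 1, 0, 1],
    "F": [0, 0, 0, 1, 1, 0, 0],
    "G": [0, 0, 0, 0, 1, 0, 0],
    "H": [1, 1, 0, 0, 0, 1, 1],
})

# Keep only the 0/1 indicator columns; "I" becomes the row index.
indicators = df.set_index("I")

# (columns x rows) @ (rows x columns) -> columns x columns co-occurrence counts.
co = indicators.T @ indicators

print(co)
```

The diagonal of co holds the number of ones in each column, and the matrix is symmetric because co-occurrence does not depend on column order.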
Explode column into columns
I have a pandas train df which looks like this:
   image
1  [[0, 0, 0], [1, 0, 1], [0, 1, 1]]
2  [[1, 1, 1], [0, 0, 1], [0, 0, 1]]
2  [[0, 0, 1], [0, 1, 1], [1, 1, 1]]
Is there any way to "explode" it, but into columns?
   1  2  3  4  5  6  7  8  9
1  0, 0, 0, 1, 0, 1, 0, 1, 1
2  1, 1, 1, 0, 0, 1, 0, 0, 1
2  0, 0, 1, 0, 1, 1, 1, 1, 1
np.vstack the Series of lists of lists, then reshape:
pd.DataFrame(np.vstack(df['image']).reshape(len(df), -1))
   0  1  2  3  4  5  6  7  8
0  0  0  0  1  0  1  0  1  1
1  1  1  1  0  0  1  0  0  1
2  0  0  1  0  1  1  1  1  1
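For reference, here is a self-contained sketch of the vstack-then-reshape idea on the question's three rows. Two details are additions rather than part of the answer: the Series is converted with .tolist() before stacking, and index=df.index is passed so the original row labels (1, 2, 2) are kept.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"image": [
        [[0, 0, 0], [1, 0, 1], [0, 1, 1]],
        [[1, 1, 1], [0, 0, 1], [0, 0, 1]],
        [[0, 0, 1], [0, 1, 1], [1, 1, 1]],
    ]},
    index=[1, 2, 2],
)

# Stacking three 3x3 images gives a 9x3 array ...
stacked = np.vstack(df["image"].tolist())

# ... and reshaping to (number of rows, -1) flattens each image into one row of 9.
flat = pd.DataFrame(stacked.reshape(len(df), -1), index=df.index)

print(flat)
```

This relies on every image having the same shape; ragged lists would not reshape cleanly.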
How can I transform my binary numpy list of arrays to an image?
[[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 1 1 1 0 0 0 0 0 1 1 0 0 3 3 0 0 0 4 4 0 0 0 5 5 5 5 0 0 2 2 2 2 2 0 2 2 2 2 2 0 0 0 6 6 6 6 6 6 0 6 6 6 6]
 [0 1 1 0 0 0 0 0 0 0 0 0 0 3 3 0 0 0 4 4 0 0 5 5 5 5 5 5 0 2 2 2 2 2 2 2 2 2 2 2 2 0 0 6 6 6 6 6 6 6 6 6 6 6]
 [1 1 1 0 0 0 0 0 0 0 0 0 0 3 3 0 0 0 4 4 0 5 5 5 0 0 5 5 5 0 2 2 0 0 2 2 0 0 0 2 2 0 0 6 6 0 0 6 6 6 0 0 6 6]
 [1 1 1 0 0 0 0 0 0 0 0 0 0 3 3 0 0 0 4 4 0 5 5 5 5 0 0 0 0 0 2 2 0 2 2 2 0 0 0 2 2 2 0 6 6 0 0 0 6 6 0 0 6 6]
 [1 1 1 0 0 0 0 0 0 0 0 0 0 3 3 0 0 0 4 4 0 0 5 5 5 5 5 5 0 0 2 2 0 2 2 2 0 0 0 2 2 2 0 6 6 0 0 0 6 6 0 0 6 6]
 [0 1 1 0 0 0 0 0 0 7 0 0 0 3 3 0 0 0 4 4 0 0 0 0 5 5 5 5 5 0 2 2 0 2 2 2 0 0 0 2 2 2 0 6 6 0 0 0 6 6 0 0 6 6]]
As a first step, I want the non-zero pixels to be white and the zero pixels to be black. To turn all the non-zero values into 1, I did:
binary_transform = np.array(labels).astype(bool).astype(int)
and it worked. Then I wanted to turn the list of arrays of 1s and 0s into an image. What I tried:
from PIL import Image
img = Image.fromarray(binary_transform, '1')
img.save('image.png')
The docs for Image.fromarray can be found here: https://pillow.readthedocs.io/en/3.1.x/reference/Image.html
It didn't work, so I then tried the following:
import png
png.from_array(binary_transform, 'L').save('image.png')
According to the docs, 'L' is for greyscale, while I want binary, but I didn't see a binary option. The docs: https://pythonhosted.org/pypng/png.html
I got this error:
ValueError: bitdepth (64) must be a positive integer <= 16
Though you don't say so explicitly, the fact that you said "As a first step..." makes me think you are heading towards a greyscale palette image:
import numpy as np
from PIL import Image
labels=[[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
[0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
[0,0,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
[0,0,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
[0,1,1,1,0,0,0,0,0,1,1,0,0,3,3,0,0,0,4,4,0,0,0,5,5,5,5,0,0,2,2,2,2,2,0,2,2,2,2,2,0,0,0,6,6,6,6,6,6,0,6,6,6,6],
[0,1,1,0,0,0,0,0,0,0,0,0,0,3,3,0,0,0,4,4,0,0,5,5,5,5,5,5,0,2,2,2,2,2,2,2,2,2,2,2,2,0,0,6,6,6,6,6,6,6,6,6,6,6],
[1,1,1,0,0,0,0,0,0,0,0,0,0,3,3,0,0,0,4,4,0,5,5,5,0,0,5,5,5,0,2,2,0,0,2,2,0,0,0,2,2,0,0,6,6,0,0,6,6,6,0,0,6,6],
[1,1,1,0,0,0,0,0,0,0,0,0,0,3,3,0,0,0,4,4,0,5,5,5,5,0,0,0,0,0,2,2,0,2,2,2,0,0,0,2,2,2,0,6,6,0,0,0,6,6,0,0,6,6],
[1,1,1,0,0,0,0,0,0,0,0,0,0,3,3,0,0,0,4,4,0,0,5,5,5,5,5,5,0,0,2,2,0,2,2,2,0,0,0,2,2,2,0,6,6,0,0,0,6,6,0,0,6,6],
[0,1,1,0,0,0,0,0,0,7,0,0,0,3,3,0,0,0,4,4,0,0,0,0,5,5,5,5,5,0,2,2,0,2,2,2,0,0,0,2,2,2,0,6,6,0,0,0,6,6,0,0,6,6]]
binary_transform = np.array(labels).astype(np.uint8)
img = Image.fromarray(binary_transform, 'P')
img.save('image.png')
Note that the image originally shown with this answer was resized and contrast-stretched for display purposes.
If you really only want a true binary, black and white image, use:
binary_transform = np.array(labels).astype(np.uint8)
binary_transform[binary_transform>0] = 255
img = Image.fromarray(binary_transform, 'L')
img.save('image.png')
If I understand you right, you want the image to appear binary, i.e., just black and white, no grey. If that's the case, OpenCV is your friend:
import cv2
import numpy as np
binary_transform = np.array(labels).astype(np.uint8)
_, thresh_img = cv2.threshold(binary_transform, 0, 255, cv2.THRESH_BINARY)
cv2.imwrite('image.png', thresh_img)
Of course PIL will work as well; you just need to adjust your non-zero values:
from PIL import Image
binary_transform = np.array(labels).astype(np.uint8)
binary_transform[binary_transform > 0] = 255
img = Image.fromarray(binary_transform, 'L')
img.save('image.png')
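The step shared by these answers is getting the data into an 8-bit array whose non-zero entries are 255. The sketch below shows just that step with NumPy on a small made-up label array (not the question's data). It also illustrates the likely source of the "bitdepth (64)" error: astype(int) typically produces a 64-bit integer array.

```python
import numpy as np

# Small hypothetical label array standing in for the question's `labels`.
labels = [[0, 1, 1, 0],
          [2, 2, 0, 0],
          [0, 0, 7, 0]]

arr = np.array(labels)  # default integer dtype, typically 64-bit

# Non-zero labels become 255 (white), zeros stay 0 (black), stored as 8-bit.
binary = (arr > 0).astype(np.uint8) * 255

print(binary.dtype)
print(binary)
```

An array like binary is the kind of input passed to Image.fromarray(..., 'L') or cv2.imwrite in the answers above.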
The other answers (all good!) use OpenCV or PIL. Here's how you could create the image using numpngw, a small library that I wrote to create PNG files from numpy arrays.
First, here's my data for the example:
In [173]: x
Out[173]:
[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 1, 1, 1, 1, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 1, 1, 1, 1, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 1, 1, 1, 1, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 1, 1, 1, 1, 0, 3, 0, 0, 0, 0, 0, 4, 4, 4, 4, 4, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 4, 4, 4, 4, 4, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 4, 4, 4, 4, 4, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 4, 4, 4, 4, 4, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 4, 4, 4, 4, 4, 0, 2, 2, 2, 2, 2, 0],
 [0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 4, 4, 4, 4, 4, 0, 2, 2, 2, 2, 2, 0],
 [0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 4, 4, 4, 4, 4, 0, 2, 2, 2, 2, 2, 0],
 [0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 4, 4, 4, 4, 4, 0, 2, 2, 2, 2, 2, 0],
 [0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 4, 4, 4, 4, 4, 0, 2, 2, 2, 2, 2, 0],
 [0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 4, 4, 4, 4, 4, 0, 2, 2, 2, 2, 2, 0],
 [0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 4, 4, 4, 4, 4, 0, 2, 2, 2, 2, 2, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]
Create the image using numpngw.write_png():
In [174]: import numpy as np
In [175]: import numpngw
In [176]: numpngw.write_png("foo.png", (np.array(x) > 0).astype(np.uint8), bitdepth=1)
percentage histogram with matplotlib, One input to the axis is a combination of two columns
I have a data frame with a class value I am trying to predict. I am interested in label 1. I am trying to determine if turn plays a role for a given key value. For a given key value of, say, 1 and a turn number of 1, what percentage of turns have a class value of 1?
For example, for the given data:
key=1, turn=1: 8/11 have a class label 1
key=1, turn=2: 5/6 have a class label 1
How can I plot a percentage histogram for this type of data? I know how to make a normal histogram using matplotlib:
import matplotlib
matplotlib.use('PS')
import matplotlib.pyplot as plt
plt.hist()
but what values would I use to get the percentage histogram?
Sample columns from the dataframe:
key=[ 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 ]
turn=[ 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 1 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4]
class=[0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 0 1 1 0 0 1 1 0 1 0 1 1 0 1 1 0 0 1 0 1 0 0 0 0 0 1 1 1 1 0 1 1 0 0 1 0 0 0 0 0 1 0 1 1 0 0 1 1 1 0 0]
Since the concepts from the linked question are apparently not what you need, an alternative would be to produce pie charts as shown below.
key=[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2 ]
turn=[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
clas=[0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0]
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df=pd.DataFrame({"key":key, "turn":turn, "class":clas})
# The default aggregation is the mean, i.e. the fraction of class 1 per (key, turn)
piv = pd.pivot_table(df, values="class", index="key", columns="turn")
print(piv)
fig, axes = plt.subplots(ncols=4, nrows=2)
for i in range(2):
    axes[i,0].set_ylabel("key {}".format(i+1))
    for j in range(4):
        pie = axes[i,j].pie([piv.values[i,j],1.-piv.values[i,j]], autopct="%.1f%%")
        axes[i,j].set_aspect("equal")
        axes[0,j].set_title("turn {}".format(j+1))
plt.legend(pie[0],["class 1","class 0"], bbox_to_anchor=(1,0.5), loc="right", bbox_transform=plt.gcf().transFigure)
plt.show()
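The percentages themselves come from the fact that the mean of a 0/1 column is the fraction of ones. The sketch below shows this on the first 17 rows of the question's sample (key 1, turns 1 and 2 only); the groupby formulation and the name pct are illustrative choices, not part of the answer.

```python
import pandas as pd

# First 17 rows of the question's sample, all with key == 1.
key = [1] * 17
turn = [1] * 11 + [2] * 6
clas = [0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1,   # turn 1
        1, 1, 1, 1, 0, 1]                  # turn 2

df = pd.DataFrame({"key": key, "turn": turn, "class": clas})

# Mean of a 0/1 column = fraction of ones; multiply by 100 for a percentage.
pct = df.groupby(["key", "turn"])["class"].mean().unstack("turn") * 100

print(pct.round(1))
```

This should reproduce the 8/11 and 5/6 figures from the question as percentages. Calling pct.plot(kind="bar") on such a frame would draw one group of bars per key with one bar per turn, which may be closer to the requested percentage histogram than pie charts.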
Possible bug in scipy.ndimage.measurements.label?
I was trying to write a percolation program in Python, and I saw a tutorial recommending scipy.ndimage.measurements.label to identify the clusters. The problem is that I started to notice some odd behavior in the function: some elements that should belong to the same cluster are receiving different labels. Here is a code snippet that reproduces my problem.
import numpy as np
import scipy
from scipy.ndimage import measurements
grid = np.array([[0, 1, 1, 0, 1, 1, 0, 1, 0, 1],
                 [0, 1, 0, 0, 0, 0, 0, 0, 1, 0],
                 [1, 1, 1, 1, 1, 1, 0, 1, 1, 1],
                 [1, 0, 1, 0, 1, 1, 0, 1, 1, 1],
                 [0, 0, 1, 0, 1, 0, 0, 0, 0, 1],
                 [0, 1, 1, 1, 0, 0, 0, 0, 0, 1],
                 [0, 1, 0, 1, 1, 1, 0, 0, 1, 1], #<- notice the last two elements
                 [1, 1, 0, 1, 1, 1, 1, 1, 1, 0],
                 [1, 0, 0, 0, 1, 1, 1, 1, 0, 1],
                 [1, 1, 1, 0, 0, 0, 1, 1, 0, 0]])
labels, nlabels = measurements.label(grid)
print("Scipy Version: ", scipy.__version__)
print()
print(labels)
The output I get is:
Scipy Version: 0.13.0
[[0 1 1 0 2 2 0 3 0 4]
 [0 1 0 0 0 0 0 0 5 0]
 [1 1 1 1 1 1 0 5 5 5]
 [1 0 1 0 1 1 0 5 5 5]
 [0 0 1 0 1 0 0 0 0 5]
 [0 1 1 1 0 0 0 0 0 5]
 [0 1 0 1 1 1 0 0 1 5] #<- The last two elements
 [1 1 0 1 1 1 1 1 1 0] # are set with different labels
 [1 0 0 0 1 1 1 1 0 6]
 [1 1 1 0 0 0 1 1 0 0]]
Am I missing something about the way this function is supposed to work, or is this a bug? This is very important because labeling the clusters correctly is crucial to getting the right results in percolation.
Thanks for the help.