Python - Permutation/Combination column-wise - python

I have a list
mylist = [
['f', 'l', 'a', 'd', 'l', 'f', 'k'],
['g', 'm', 'b', 'b', 'k', 'g', 'l'],
['h', 'n', 'c', 'a', 'm', 'j', 'o'],
['i', 'o', 'd', 'c', 'n', 'i', 'm'],
['j', 'p', 'e', 'e', 'o', 'h', 'n'],
]
I want to do permutation/combination column-wise, such that the elements of a column are restricted to that column, i.e., f,g,h,i,j remain in Column 1, l,m,n,o,p remain in Column 2 and so on, in the results of the permutation/combination. How can this be achieved in Python 2.7?

You could use zip(*mylist) to list the "columns" of mylist. Then use the * operator (again) to unpack those lists as arguments to IT.product or IT.combinations. For example,
import itertools as IT
list(IT.product(*zip(*mylist)))
yields
[('f', 'l', 'a', 'd', 'l', 'f', 'k'),
('f', 'l', 'a', 'd', 'l', 'f', 'l'),
('f', 'l', 'a', 'd', 'l', 'f', 'o'),
('f', 'l', 'a', 'd', 'l', 'f', 'm'),
...]
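As a small illustration of the same idea, the sketch below applies zip and product to a hypothetical 3x2 list, so the result is short enough to check by hand. The names small, columns and rows are made up for this example.

```python
import itertools as IT

# Hypothetical 3x2 input, small enough to inspect by hand.
small = [['a', 'x'],
         ['b', 'y'],
         ['c', 'z']]

# zip(*small) transposes the rows into columns.
columns = list(zip(*small))        # [('a', 'b', 'c'), ('x', 'y', 'z')]

# product picks one element per column, so each value stays in its column.
rows = list(IT.product(*columns))
print(len(rows))                   # 9 (3 choices * 3 choices)
print(rows[:3])                    # [('a', 'x'), ('a', 'y'), ('a', 'z')]
```

Every tuple in rows has its first element taken from column 1 and its second from column 2, which is the restriction asked for in the question.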

Related

Given a Python list of lists, find all possible flat lists that keeps the order of each sublist?

I have a list of lists. I want to find all flat lists that keep the order of each sublist. As an example, let's say I have a list of lists like this:
ll = [['D', 'O', 'G'], ['C', 'A', 'T'], ['F', 'I', 'S', 'H']]
It is trivial to get one solution. I managed to write the following code which can generate a random flat list that keeps the order of each sublist.
import random
# Flatten the list of lists
flat = [x for l in ll for x in l]
# Shuffle to gain randomness
random.shuffle(flat)
for l in ll:
    # Find the idxs in the flat list that belongs to the sublist
    idxs = [i for i, x in enumerate(flat) if x in l]
    # Change the order to match the order in the sublist
    for j, idx in enumerate(idxs):
        flat[idx] = l[j]
print(flat)
This can generate flat lists that look as follows:
['F', 'D', 'O', 'C', 'A', 'G', 'I', 'S', 'T', 'H']
['C', 'D', 'F', 'O', 'G', 'I', 'S', 'A', 'T', 'H']
['C', 'D', 'O', 'G', 'F', 'I', 'S', 'A', 'T', 'H']
['F', 'C', 'D', 'I', 'S', 'A', 'H', 'O', 'T', 'G']
As you can see, 'A' always appears after 'C', 'T' always appears after 'A', 'O' always appears after 'D', and so on...
However, I want to get all possible solutions.
Please note that:
I want general code that works for any given list of lists, not just for "dog cat fish";
It does not matter whether there are duplicates or not, because every item is distinguishable.
Can anyone suggest a fast Python algorithm for this?
Suppose you are combining the lists by hand. Instead of shuffling and putting things back in order, you would select one list and take its first element, then again select a list and take its first (unused) element, and so on. So the algorithm you need is this: What are all the different ways to pick from a collection of lists with these particular sizes?
In your example you have lists of length 3, 3, 4; suppose you had a bucket with three red balls, three yellow balls and four green balls, which orderings are possible? Model this, and then just pick the first unused element from the corresponding list to get your output.
Say what? For your example, the (distinct) pick orders would be given by
set(itertools.permutations("RRRYYYGGGG"))
For any list of lists, we'll use integer keys instead of letters. The pick orders are:
import itertools
elements = []
for key, lst in enumerate(ll):
    elements.extend([key] * len(lst))
pick_orders = set(itertools.permutations(elements))
Then you just use each pick order to present the elements from your list of lists, say with pop(0) (from a copy of the lists, since pop() is destructive).
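The last step can be sketched as follows. This is only one way to finish the idea: the function name interleavings is made up here, and it uses one iterator per sublist instead of pop(0) on a copy, which has the same effect of always taking the first unused element.

```python
import itertools

def interleavings(ll):
    # Build the "bucket of balls": one key per element, keyed by sublist index.
    elements = []
    for key, lst in enumerate(ll):
        elements.extend([key] * len(lst))
    # Distinct pick orders (the set removes the duplicates that
    # permutations produces for repeated keys).
    pick_orders = set(itertools.permutations(elements))
    results = []
    for order in pick_orders:
        # A fresh iterator per sublist hands out its elements in order.
        iters = [iter(lst) for lst in ll]
        results.append([next(iters[key]) for key in order])
    return results

# Small hypothetical input so the output is easy to verify by hand.
print(sorted(interleavings([['D', 'O'], ['C']])))
# [['C', 'D', 'O'], ['D', 'C', 'O'], ['D', 'O', 'C']]
```

Note that set(itertools.permutations(...)) still walks through every permutation of the keys before discarding duplicates, so for the 10-element example it visits 3628800 tuples to keep 4200 of them.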
Yet another solution, but this one doesn't use any libraries.
def recurse(lst, indices, total, curr):
    done = True
    for l, (pos, index) in zip(lst, enumerate(indices)):
        if index < len(l):  # can increment index
            curr.append(l[index])  # add on corresponding value
            indices[pos] += 1  # increment index
            recurse(lst, indices, total, curr)
            # backtrack
            indices[pos] -= 1
            curr.pop()
            done = False  # modification made, so not done
    if done:  # no changes made
        total.append(curr.copy())
    return
def list_to_all_flat(lst):
    seq = [0] * len(lst)  # set up indexes
    total, curr = [], []
    recurse(lst, seq, total, curr)
    return total
if __name__ == "__main__":
    lst = [['D', 'O', 'G'], ['C', 'A', 'T'], ['F', 'I', 'S', 'H']]
    print(list_to_all_flat(lst))
Try:
from itertools import permutations, chain
ll = [["D", "O", "G"], ["C", "A", "T"], ["F", "I", "S", "H"]]
x = [[(i1, i2, o) for i2, o in enumerate(subl)] for i1, subl in enumerate(ll)]
l = sum(len(subl) for subl in ll)
def is_valid(c):
    seen = {}
    for i1, i2, _ in c:
        if i2 != seen.get(i1, -1) + 1:
            return False
        else:
            seen[i1] = i2
    return True
for c in permutations(chain(*x), l):
    if is_valid(c):
        print([o for *_, o in c])
Prints:
['D', 'O', 'G', 'C', 'A', 'T', 'F', 'I', 'S', 'H']
['D', 'O', 'G', 'C', 'A', 'F', 'T', 'I', 'S', 'H']
['D', 'O', 'G', 'C', 'A', 'F', 'I', 'T', 'S', 'H']
['D', 'O', 'G', 'C', 'A', 'F', 'I', 'S', 'T', 'H']
['D', 'O', 'G', 'C', 'A', 'F', 'I', 'S', 'H', 'T']
['D', 'O', 'G', 'C', 'F', 'A', 'T', 'I', 'S', 'H']
['D', 'O', 'G', 'C', 'F', 'A', 'I', 'T', 'S', 'H']
['D', 'O', 'G', 'C', 'F', 'A', 'I', 'S', 'T', 'H']
...
['F', 'I', 'S', 'H', 'C', 'D', 'A', 'O', 'T', 'G']
['F', 'I', 'S', 'H', 'C', 'D', 'A', 'T', 'O', 'G']
['F', 'I', 'S', 'H', 'C', 'A', 'D', 'O', 'G', 'T']
['F', 'I', 'S', 'H', 'C', 'A', 'D', 'O', 'T', 'G']
['F', 'I', 'S', 'H', 'C', 'A', 'D', 'T', 'O', 'G']
['F', 'I', 'S', 'H', 'C', 'A', 'T', 'D', 'O', 'G']
You can use a recursive generator function:
ll = [['D', 'O', 'G'], ['C', 'A', 'T'], ['F', 'I', 'S', 'H']]
def get_combos(d, c = []):
    if not any(d) and len(c) == sum(map(len, ll)):
        yield c
    elif any(d):
        for a, b in enumerate(d):
            for j, k in enumerate(b):
                yield from get_combos(d[:a]+[b[j+1:]]+d[a+1:], c+[k])
print(list(get_combos(ll)))
Output (first ten permutations):
[['D', 'O', 'G', 'C', 'A', 'T', 'F', 'I', 'S', 'H'], ['D', 'O', 'G', 'C', 'A', 'F', 'T', 'I', 'S', 'H'], ['D', 'O', 'G', 'C', 'A', 'F', 'I', 'T', 'S', 'H'], ['D', 'O', 'G', 'C', 'A', 'F', 'I', 'S', 'T', 'H'], ['D', 'O', 'G', 'C', 'A', 'F', 'I', 'S', 'H', 'T'], ['D', 'O', 'G', 'C', 'F', 'A', 'T', 'I', 'S', 'H'], ['D', 'O', 'G', 'C', 'F', 'A', 'I', 'T', 'S', 'H'], ['D', 'O', 'G', 'C', 'F', 'A', 'I', 'S', 'T', 'H'], ['D', 'O', 'G', 'C', 'F', 'A', 'I', 'S', 'H', 'T'], ['D', 'O', 'G', 'C', 'F', 'I', 'A', 'T', 'S', 'H']]
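A leaner variant of the recursive idea, offered only as a sketch: at each step it takes just the head of one non-empty sublist, so it never builds partial results that later have to be filtered out by length. The name interleave is made up for this example, and the input is assumed to be a list of lists.

```python
def interleave(lists):
    # Base case: every sublist is used up, so there is one (empty) completion.
    if not any(lists):
        yield []
        return
    for i, lst in enumerate(lists):
        if lst:
            # Take the head of sublist i and recurse on what is left.
            rest = lists[:i] + [lst[1:]] + lists[i + 1:]
            for tail in interleave(rest):
                yield [lst[0]] + tail

ll = [['D', 'O', 'G'], ['C', 'A', 'T'], ['F', 'I', 'S', 'H']]
results = list(interleave(ll))
print(len(results))   # 4200
print(results[0])     # ['D', 'O', 'G', 'C', 'A', 'T', 'F', 'I', 'S', 'H']
```

Because it is a generator, the results can also be consumed one at a time instead of being collected into a list.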
For simplicity, let's start with a list containing two sublists.
from itertools import permutations
Ls = [['D', 'O', 'G'], ['C', 'A', 'T']]
L_flattened = []
for L in Ls:
    for item in L:
        L_flattened.append(item)
print("L_flattened:", L_flattened)
print(list(permutations(L_flattened, len(L_flattened))))
[('D', 'O', 'G', 'C', 'A', 'T'), ('D', 'O', 'G', 'C', 'T', 'A'), ('D', 'O', 'G', 'A', 'C', 'T'), ('D', 'O', 'G', 'A', 'T', 'C'), ('D', 'O', 'G', 'T', 'C', 'A'), ('D', 'O', 'G', 'T', 'A', 'C'), ('D', 'O', 'C', 'G', 'A', 'T'),
('D', 'O', 'C', 'G', 'T', 'A'), ('D', 'O', 'C', 'A', 'G', 'T'), ('D', 'O', 'C', 'A', 'T', 'G'),
...
Beware that the number of permutations grows very quickly.
In your example there are 10 items and Permutation(10, 10) = 10! = 3628800.
I suggest you calculate the number of permutations first to get an idea before running the actual code (which may cause a memory error, freeze, or crash).
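The counts can be computed up front with the standard math module. The sketch below shows both the number of unrestricted permutations and the number of orderings that keep each sublist in order, which is n! divided by the factorial of each sublist length. The variable names are only illustrative.

```python
import math

ll = [['D', 'O', 'G'], ['C', 'A', 'T'], ['F', 'I', 'S', 'H']]
n = sum(len(sub) for sub in ll)

# All orderings of the flattened list: n!
total_permutations = math.factorial(n)

# Orderings that keep each sublist in order: n! / (len1! * len2! * ...)
valid = total_permutations
for sub in ll:
    valid //= math.factorial(len(sub))

print(total_permutations)  # 3628800
print(valid)               # 4200
```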
You can try verifying all possible permutations:
import random
import itertools
import numpy as np
ll = [['D', 'O', 'G'], ['C', 'A', 'T'], ['F', 'I', 'S', 'H']]
flat = [x for l in ll for x in l]
all_permutations = list(itertools.permutations(flat))
good_permutations = []
count = 0
for perm in all_permutations:
    count += 1
    cond = True
    for l in ll:
        idxs = [perm.index(x) for i, x in enumerate(flat) if x in l]
        # check if ordered
        if not np.all(np.diff(np.array(idxs)) >= 0):
            cond = False
            break
    if cond == True:
        good_permutations.append(perm)
    if count >= 10000:
        break
print(len(good_permutations))
It is only a basic solution as it is really slow to compute (I set the count to limit the number of permutations that are verified).

Can I pause itertools on python, and resume later?

I need to create a list of strings with all the possible combinations of all letters, uppercase and lowercase, with non-repeating characters, of length 14. This is massive and I know it will take a lot of time and space.
My code right now is this:
import itertools
filename = open("strings.txt", "w")
for com in itertools.permutations('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ', 14):
    filename.write("\n"+"1_"+"".join(com)+"\n"+"0_"+"".join(com))
    print ("".join(com))
Pretty basic; it does the job and I have not found a faster way yet (I tried a Java algorithm I found that seemed faster, but Python was faster).
Since this will take a long time, and from time to time I need to turn off my computer, I need to be able to save where I left off and continue from there; otherwise I will start from the beginning each time it crashes, I turn off my PC, or anything else happens.
Is there any way to do that?
You can pickle that iterator object. Its internal state will be stored in the pickle file. When you resume it should start from where it left off.
Something like this:
import itertools
import os
import pickle
import time
# if the iterator was saved, load it
if os.path.exists('saved_iter.pkl'):
    with open('saved_iter.pkl', 'rb') as f:
        iterator = pickle.load(f)
# otherwise recreate it
else:
    iterator = itertools.permutations('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ', 14)
try:
    for com in iterator:
        # process the object from the iterator
        print(com)
        time.sleep(1.0)
except KeyboardInterrupt:
    # if the script is about to exit, save the iterator state
    with open('saved_iter.pkl', 'wb') as f:
        pickle.dump(iterator, f)
Which results in:
>python so_test.py
('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n')
('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'o')
('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'p')
('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'q')
('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'r')
('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 's')
>python so_test.py
('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 't')
('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'u')
('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'v')
('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'w')
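An alternative that does not depend on pickling the iterator (as far as I know, pickle support for itertools objects was deprecated in Python 3.12 and removed in 3.14) is to save a plain integer: the position of the next permutation. The sketch below rebuilds the permutation at a given position in the same order itertools.permutations uses. The function name nth_permutation and the parameter rank are hypothetical, and math.perm needs Python 3.8 or later.

```python
import itertools
import math

def nth_permutation(pool, k, rank):
    """Return the rank-th k-permutation of pool, in itertools.permutations order."""
    items = list(pool)
    n = len(items)
    if not 0 <= rank < math.perm(n, k):
        raise IndexError("rank out of range")
    result = []
    for pos in range(k):
        # Number of ways to fill the slots that come after this one.
        block = math.perm(n - pos - 1, k - pos - 1)
        idx, rank = divmod(rank, block)
        # Take the idx-th remaining item; the rest stay in their original order.
        result.append(items.pop(idx))
    return tuple(result)

# Check against itertools on a small case.
expected = list(itertools.permutations('abcd', 2))
rebuilt = [nth_permutation('abcd', 2, r) for r in range(len(expected))]
print(rebuilt == expected)  # True
```

With this, the script only has to write the current rank to a file now and then, and on restart continue from that rank instead of from zero.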

How to use dictionary

How do I make 2d lists in run time when user inputs some number.
There are easy and difficult parts:
horizontal is trivial - simply join the inner list, replace the word and the reversed word by their upper-case representation, then split the string into letters again
vertical is mildly more difficult, but you can handle it by transposing the matrix via zip and transposing again after replacing
diagonal is handled in the linked question Find Letter of Words in a Matrix Diagonally
Here is how to do horizontal and vertical:
data = [['l', 'd', 'l', 'o', 'h', 'p'],
['i', 't', 'i', 'f', 'w', 'f'],
['g', 'n', 'r', 'k', 'q', 'u'],
['h', 'g', 'u', 'a', 'l', 'l'],
['t', 'c', 'v', 'g', 't', 'l'],
['d', 'r', 'a', 'w', 'c', 's']]
words = ['draw', 'full', 'hold', 'laugh', 'light', 'start', 'all', 'kid']
from pprint import pprint as p
for w in words:
    print()
    # horizontal
    data1 = [list(''.join(line).replace(w, w.upper())) for line in data]
    # horizontal, word reversed
    data1 = [list(''.join(line).replace(w[::-1], w[::-1].upper())) for line in data1]
    # vertical: transpose with zip, replace, transpose back
    data1 = list(zip(*[list(''.join(line).replace(w, w.upper()))
                       for line in zip(*data1)]))
    # vertical, word reversed
    data1 = list(zip(*[list(''.join(line).replace(w[::-1], w[::-1].upper()))
                       for line in zip(*data1)]))
    # compare as tuples: data starts as a list of lists, data1 holds tuples
    if data1 != [tuple(row) for row in data]:
        print(f"Found: {w}:")
        data = data1
        p(data)
    else:
        print(f"No {w} - check diagonally")
Output:
Found: draw:
[('l', 'd', 'l', 'o', 'h', 'p'),
('i', 't', 'i', 'f', 'w', 'f'),
('g', 'n', 'r', 'k', 'q', 'u'),
('h', 'g', 'u', 'a', 'l', 'l'),
('t', 'c', 'v', 'g', 't', 'l'),
('D', 'R', 'A', 'W', 'c', 's')]
Found: full:
[('l', 'd', 'l', 'o', 'h', 'p'),
('i', 't', 'i', 'f', 'w', 'F'),
('g', 'n', 'r', 'k', 'q', 'U'),
('h', 'g', 'u', 'a', 'l', 'L'),
('t', 'c', 'v', 'g', 't', 'L'),
('D', 'R', 'A', 'W', 'c', 's')]
Found: hold:
[('l', 'D', 'L', 'O', 'H', 'p'),
('i', 't', 'i', 'f', 'w', 'F'),
('g', 'n', 'r', 'k', 'q', 'U'),
('h', 'g', 'u', 'a', 'l', 'L'),
('t', 'c', 'v', 'g', 't', 'L'),
('D', 'R', 'A', 'W', 'c', 's')]
Found: laugh:
[('l', 'D', 'L', 'O', 'H', 'p'),
('i', 't', 'i', 'f', 'w', 'F'),
('g', 'n', 'r', 'k', 'q', 'U'),
('H', 'G', 'U', 'A', 'L', 'L'),
('t', 'c', 'v', 'g', 't', 'L'),
('D', 'R', 'A', 'W', 'c', 's')]
No light - check diagonally
No start - check diagonally
No all - check diagonally
No kid - check diagonally
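The transpose trick used for the vertical case can be isolated in a few lines. This is a minimal sketch with a made-up helper name (highlight_vertical) and a tiny hypothetical grid; it handles only top-to-bottom matches.

```python
def highlight_vertical(grid, word):
    # zip(*grid) turns columns into rows; join each into a string.
    columns = [''.join(col) for col in zip(*grid)]
    # Upper-case the word wherever it occurs in a column.
    marked = [col.replace(word, word.upper()) for col in columns]
    # Transpose back so the result has the original orientation.
    return [list(row) for row in zip(*marked)]

grid = [['c', 'x'],
        ['a', 'y'],
        ['t', 'z']]
print(highlight_vertical(grid, 'cat'))
# [['C', 'x'], ['A', 'y'], ['T', 'z']]
```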
For horizontal you can use this:
for i in range(len(l)):  # l is the list
    temp = ''.join(l[i])
    x = temp.find(word)  # word is the user input, e.g. draw
    y = len(word)
    if x != -1:
        # upper-case the match itself, which starts at index x
        for j in range(x, x + y):
            l[i][j] = l[i][j].upper()
I'll try to do the same for vertical and diagonal as well.
Try this one:
l = [['l', 'd', 'l', 'o', 'h', 'p'],
['i', 't', 'i', 'f', 'w', 'f'],
['g', 'n', 'r', 'k', 'q', 'u'],
['h', 'g', 'u', 'a', 'l', 'l'],
['t', 'c', 'v', 'g', 't', 'l'],
['d', 'r', 'a', 'w', 'c', 's']]
word = input("Please enter a word: ")
for i in range(len(l)):
    for j in range(len(l[i])):
        if l[i][j] in word:
            l[i][j]=l[i][j].capitalize()
print(l)
Output:
Please enter a word: hello
[['L', 'd', 'L', 'O', 'H', 'p'], ['i', 't', 'i', 'f', 'w', 'f'], ['g', 'n', 'r', 'k', 'q', 'u'], ['H', 'g', 'u', 'a', 'L', 'L'], ['t', 'c', 'v', 'g', 't', 'L'], ['d', 'r', 'a', 'w', 'c', 's']]

I just want to get same index letter for each element on my list

For example:
a = "pandaxngeqrymtso-ezmlaesowxaqbujl-noilktxreecytrql-gskaboofsfoxdtei-utsmakotufodhlrd-iroachimpanzeesa-nintrwflyrkhcdum-jcecahkktiklsvhr-mhvsbaykagodwgca-koalatcwlkfmrwbb-jsrrfdolphinuyt"
a = a.split("-")
mylist = []
word = ""
i = 0
while i < len(a[0]): #16
    for elem in a:
        word += elem[i]
    mylist.append(word)
    i += 1
    word = ""
I just want a list which contains "penguinjmhkj, azostrichos..." But I get an IndexError.
What can I do?
You can try this:
>>> word = [''.join(letters) for letters in zip(*(list(word) for word in a.split('-')))]
>>> word
['penguinjmkj',
'azostrichos',
'nmiksonevar',
'dllamatcslr',
'aakbacrabaf',
'xetokhwhatd',
'nsxooifkyco',
'gorftmlkkwl',
'ewesupytalp',
'qxeffarigkh',
'racoonkkofi',
'yqyxdzhldmn',
'mbtdhecswru',
'turtledvgwy',
'sjqersuhcbt']
Explanation:
You can understand what it is doing if you print the individual parts,
e.g.:
>>> print(*(list(word) for word in a.split('-')))
['p', 'a', 'n', 'd', 'a', 'x', 'n', 'g', 'e', 'q', 'r', 'y', 'm', 't', 's', 'o'] ['e', 'z', 'm', 'l', 'a', 'e', 's', 'o', 'w', 'x', 'a', 'q', 'b', 'u', 'j', 'l'] ['n', 'o', 'i', 'l', 'k', 't', 'x', 'r', 'e', 'e', 'c', 'y', 't', 'r', 'q', 'l'] ['g', 's', 'k', 'a', 'b', 'o', 'o', 'f', 's', 'f', 'o', 'x', 'd', 't', 'e', 'i'] ['u', 't', 's', 'm', 'a', 'k', 'o', 't', 'u', 'f', 'o', 'd', 'h', 'l', 'r', 'd'] ['i', 'r', 'o', 'a', 'c', 'h', 'i', 'm', 'p', 'a', 'n', 'z', 'e', 'e', 's', 'a'] ['n', 'i', 'n', 't', 'r', 'w', 'f', 'l', 'y', 'r', 'k', 'h', 'c', 'd', 'u', 'm'] ['j', 'c', 'e', 'c', 'a', 'h', 'k', 'k', 't', 'i', 'k', 'l', 's', 'v', 'h', 'r'] ['m', 'h', 'v', 's', 'b', 'a', 'y', 'k', 'a', 'g', 'o', 'd', 'w', 'g', 'c', 'a'] ['k', 'o', 'a', 'l', 'a', 't', 'c', 'w', 'l', 'k', 'f', 'm', 'r', 'w', 'b', 'b'] ['j', 's', 'r', 'r', 'f', 'd', 'o', 'l', 'p', 'h', 'i', 'n', 'u', 'y', 't']
So it breaks up all the individual words delimited by - into characters.
Then zip does this:
>>> print(*zip(*(list(word) for word in a.split('-'))))
('p', 'e', 'n', 'g', 'u', 'i', 'n', 'j', 'm', 'k', 'j') ('a', 'z', 'o', 's', 't', 'r', 'i', 'c', 'h', 'o', 's') ('n', 'm', 'i', 'k', 's', 'o', 'n', 'e', 'v', 'a', 'r') ('d', 'l', 'l', 'a', 'm', 'a', 't', 'c', 's', 'l', 'r') ('a', 'a', 'k', 'b', 'a', 'c', 'r', 'a', 'b', 'a', 'f') ('x', 'e', 't', 'o', 'k', 'h', 'w', 'h', 'a', 't', 'd') ('n', 's', 'x', 'o', 'o', 'i', 'f', 'k', 'y', 'c', 'o') ('g', 'o', 'r', 'f', 't', 'm', 'l', 'k', 'k', 'w', 'l') ('e', 'w', 'e', 's', 'u', 'p', 'y', 't', 'a', 'l', 'p') ('q', 'x', 'e', 'f', 'f', 'a', 'r', 'i', 'g', 'k', 'h') ('r', 'a', 'c', 'o', 'o', 'n', 'k', 'k', 'o', 'f', 'i') ('y', 'q', 'y', 'x', 'd', 'z', 'h', 'l', 'd', 'm', 'n') ('m', 'b', 't', 'd', 'h', 'e', 'c', 's', 'w', 'r', 'u') ('t', 'u', 'r', 't', 'l', 'e', 'd', 'v', 'g', 'w', 'y') ('s', 'j', 'q', 'e', 'r', 's', 'u', 'h', 'c', 'b', 't')
So it takes the corresponding characters from each group; each of the tuples gets passed as letters in each iteration of the main code.
Then you join each tuple, e.g. in the first iteration:
>>> ''.join(('p', 'e', 'n', 'g', 'u', 'i', 'n', 'j', 'm', 'k', 'j'))
'penguinjmkj'
The structure [<item after some operation> for <each item> in <item_list>] is called a list comprehension. * is used for iterable unpacking.
It took a while, but I cracked it.
All you need to do is handle the IndexError (string index out of range).
The first item has 16 characters, so the while loop runs i up to 15, but the last item after splitting has only 15 characters. To cut a long story short: handle the exception with a try/except where you append the letters, at:
word+= elem[i]
Here is my code, showing how I solved it using try/except:
a = "pandaxngeqrymtso-ezmlaesowxaqbujl-noilktxreecytrql-gskaboofsfoxdtei-utsmakotufodhlrd-iroachimpanzeesa-nintrwflyrkhcdum-jcecahkktiklsvhr-mhvsbaykagodwgca-koalatcwlkfmrwbb-jsrrfdolphinuyt"
newArr = a.split('-')
newWord = []
i = 0
mylist = []
while i < len(newArr[0]):
    word = ""
    for item in newArr:
        try:
            word += item[i]
        except IndexError:
            # this item is shorter than i + 1 characters
            break
    i += 1
    mylist.append(word)
print(mylist)
I used a try/except to handle the IndexError when appending the letter, and break out of the for loop whenever i is past the end of the current item.
Try it for yourself!
Similar to the code from Sayandip, but it would have been more readable for me in the past:
mylist =[]
for element in zip(*a.split('-')):
    mylist.append(''.join(element))
print(mylist)
I get
['penguinjmkj', 'azostrichos', 'nmiksonevar', 'dllamatcslr', 'aakbacrabaf', 'xetokhwhatd', 'nsxooifkyco', 'gorftmlkkwl', 'ewesupytalp', 'qxeffarigkh', 'racoonkkofi', 'yqyxdzhldmn', 'mbtdhecswru', 'turtledvgwy', 'sjqersuhcbt']
I moved the splitting into the for loop, and use the * construct to pass the whole list resulting from the split to zip.
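One detail worth knowing: zip stops at the shortest input, which is why the results above contain 15 strings even though most chunks have 16 characters (the last chunk has only 15). If the final, incomplete column is wanted as well, itertools.zip_longest can pad the short strings. A minimal sketch with a short hypothetical input:

```python
from itertools import zip_longest

parts = "abc-de".split('-')   # hypothetical short input: ['abc', 'de']

# zip stops at the shortest string, so the last column is dropped.
truncated = [''.join(col) for col in zip(*parts)]
print(truncated)   # ['ad', 'be']

# zip_longest pads the shorter strings with fillvalue instead.
padded = [''.join(col) for col in zip_longest(*parts, fillvalue='')]
print(padded)      # ['ad', 'be', 'c']
```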

How can I get all the unique categories within my dataframe using python? [duplicate]

This question already has answers here:
Find the unique values in a column and then sort them
I'm new to Python and trying to work with dataframe manipulation.
I have a df with unique categories:
I am unable to paste the dataframe because I use the Spyder IDE, which is not interactive and does not display all fields.
My input to get all these unique categories within a dataframe:
uc =[]
for i in df['Category']:
    if i[0] not in df['Category']:
        uc.append(i[0])
print(uc)
But when I use this script, I only receive the first letters of these categories:
Output:
['F', 'P', 'N', 'F', 'L', 'T', 'W', 'S', 'W', 'B', 'S', 'F', 'T', 'T', 'B', 'T', 'B', 'L', 'S', 'F', 'F', 'F', 'N', 'P', 'H', 'T', 'L', 'T', 'S', 'E', 'P', 'N', 'T', 'L', 'P', 'L', 'W', 'F', 'N', 'L', 'N', 'L', 'F', 'F', 'N', 'T', 'P', 'L', 'B', 'W', 'L', 'W', 'F', 'F', 'H', 'T', 'F', 'T', 'T', 'N', 'G', 'L', 'M', 'N', 'F', 'N', 'F', 'L', 'N', 'P', 'F', 'B', 'B', 'S', 'F', 'P', 'F', 'P', 'P', 'P', 'B', 'P', 'B', 'B', 'L', 'B', 'F', 'P', 'P', 'B', 'B', 'C', 'G', 'C', 'G', 'B', 'P', 'T', 'P', 'P', 'N', 'G', 'S', 'G', 'F', 'G', 'F', 'T', 'S', 'P', 'F', 'C', 'C', 'C', 'C', 'C', 'G', 'C', 'F', 'C', 'F', 'B', 'G', 'C', 'B', 'B', 'B', 'C', 'P', 'G', 'S', 'D', 'P', 'G', 'F', 'L', 'C', 'G', 'P', 'S', 'B', 'P', 'T', 'T', 'L', 'M', 'F', 'T', 'P', 'C', 'F', 'B', 'M', 'G', 'C', 'P', 'T', 'L', 'F', 'F', 'F', 'T', 'P', 'C', 'G', 'T', 'F', 'F', 'S', 'B', 'M', 'T', 'T', 'T', 'T', 'H', 'B', 'N', 'F', 'A', 'T', 'E', 'M', 'L', 'G', 'P', 'B', 'L', 'N', 'S', 'G', 'G', 'F', 'F', 'F', 'G', 'G', 'G', 'G', 'F', 'T', 'G', 'P', 'G', 'C', 'G', 'G', 'G', 'F', 'T', 'T', 'L', 'F', 'S', 'T', 'F', 'F', 'G', 'G', 'L', 'M', 'T', 'L', 'F', 'B', 'A', 'F', 'B', 'F', 'B', 'B', 'T', 'F', 'B', 'F', 'F', 'P', 'V', 'M', 'S', 'F', 'C', 'B', 'N', 'M', 'W', 'B', 'F', 'B', 'F', 'F', 'M', 'L']
How do I change my script to receive the unique categories within a dataframe?
Try with
df['Category'].unique()
Run print(df['Category'].unique()) and see what you get.
Also, i[0] retrieves the first character of each string value in df['Category'], which is why you only see first letters.
Also, if you are new to pandas, try to abandon the old habit of writing for loops, and check the type() of your results to get a better understanding of them.
Do you want this?
uc = set(df['Category'])
This will create a set containing the unique values of 'Category'
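The two suggestions differ mainly in ordering: Series.unique() returns the values in order of first appearance, while a set has no defined order. The difference can be seen without pandas. In the sketch below a plain list with made-up values stands in for the column, and dict.fromkeys plays the role of an order-preserving unique.

```python
# Hypothetical category values standing in for df['Category'].
categories = ['Food', 'Travel', 'Food', 'Books', 'Travel']

# dict keys are unique and keep insertion order (Python 3.7+).
unique_in_order = list(dict.fromkeys(categories))

# A set is also unique, but its iteration order is not guaranteed.
unique_as_set = set(categories)

print(unique_in_order)         # ['Food', 'Travel', 'Books']
print(sorted(unique_as_set))   # ['Books', 'Food', 'Travel']
```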
