Python - List of unique sequences - python

I have a dictionary with elements as lists of certain sequence:
a = {'seq1':['5', '4', '3', '2', '1', '6', '7', '8', '9'],
'seq2':['9', '8', '7', '6', '5', '4', '3', '2', '1'],
'seq3':['5', '4', '3', '2', '1', '11', '12', '13', '14'],
'seq4':['15', '16', '17'],
'seq5':['18', '19', '20', '21', '22', '23'],
'seq6':['18', '19', '20', '24', '25', '26']}
So there are 6 sequences.
What I need to do is:
Find only the unique lists (two lists that contain the same elements, regardless of order, count as duplicates). Here I need to drop the second list; the first unique list found stays.
Within the unique lists, find the unique subsequences of elements and print them.
The bounds of a unique subsequence are determined by where the element order stops matching: in the 1st and 3rd lists the common run ends exactly after element '1', so we get the subsequence ['5','4','3','2','1'].
In the result I would like to see the elements in exactly the same order as in the original lists (if that is possible at all). So I expect this:
[['5', '4', '3', '2', '1'], ['6', '7', '8', '9'], ['11', '12', '13', '14'], ['15', '16', '17'], ['18', '19', '20'], ['21', '22', '23'], ['24', '25', '26']]
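The first requirement (drop a list when an earlier list holds the same elements in any order) can be sketched on its own. This is only an illustration; the names `seen` and `unique` are made up here, and it assumes Python 3.7+ so that the dictionary keeps insertion order.

```python
a = {'seq1': ['5', '4', '3', '2', '1', '6', '7', '8', '9'],
     'seq2': ['9', '8', '7', '6', '5', '4', '3', '2', '1'],
     'seq3': ['5', '4', '3', '2', '1', '11', '12', '13', '14'],
     'seq4': ['15', '16', '17'],
     'seq5': ['18', '19', '20', '21', '22', '23'],
     'seq6': ['18', '19', '20', '24', '25', '26']}

seen = set()     # frozensets of the element collections already kept
unique = {}      # hypothetical name: the lists that survive, in original order
for name, seq in a.items():
    key = frozenset(seq)       # order-insensitive fingerprint of the list
    if key not in seen:
        seen.add(key)
        unique[name] = seq     # the first list with these elements stays

print(list(unique))
```

A frozenset is hashable, so it can be stored in `seen`, while the original list is kept untouched to preserve the element order.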
Tried to do it this way:
import itertools
unique_sets = []
a = {'seq1':["5","4","3","2","1","6","7","8","9"], 'seq2':["9","8","7","6","5","4","3","2","1"], 'seq3':["5","4","3","2","1","11","12","13","14"], 'seq4':["15","16","17"], 'seq5':["18","19","20","21","22","23"], 'seq6':["18","19","20","24","25","26"]}
b = []
for seq in a.values():
    b.append(seq)
for seq1, seq2 in itertools.combinations(b,2): #searching for intersections
    if set(seq1).intersection(set(seq2)) not in unique_sets:
        #if set(seq1).intersection(set(seq2)) == set(seq1):
            #continue
        unique_sets.append(set(seq1).intersection(set(seq2)))
    if set(seq1).difference(set(seq2)) not in unique_sets:
        unique_sets.append(set(seq1).difference(set(seq2)))
for it in unique_sets:
    print(it)
I got this which is a little bit different from my expectations:
{'9', '5', '2', '3', '7', '1', '4', '8', '6'}
set()
{'5', '2', '3', '1', '4'}
{'9', '8', '6', '7'}
{'5', '2', '14', '3', '1', '11', '12', '4', '13'}
{'17', '16', '15'}
{'19', '20', '18'}
{'23', '21', '22'}
With the commented-out lines enabled, the result is even worse.
I also have a problem with the unordered elements in the resulting sets. I tried the same thing with two separate sets:
seq1 = set([1,2,3,4,5,6,7,8,9])
seq2 = set([1,2,3,4,5,10,11,12])
and it worked fine: the elements never changed their position in the sets. Where is my mistake?
Thanks.
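On the ordering problem: Python sets make no promise about element order. Small integers only appear sorted because of how CPython happens to hash them, which is an implementation detail; strings give no such impression. A minimal sketch of an order-preserving intersection and difference, assuming the order of the first list is the one that should win (the variable names are illustrative only):

```python
seq1 = ['5', '4', '3', '2', '1', '6', '7', '8', '9']
seq3 = ['5', '4', '3', '2', '1', '11', '12', '13', '14']

seq3_members = set(seq3)   # used only for fast membership tests

# Walk seq1 in its original order and keep or drop each element.
common = [x for x in seq1 if x in seq3_members]
only_in_seq1 = [x for x in seq1 if x not in seq3_members]

print(common)
print(only_in_seq1)
```

The set is used purely as a lookup table; the result is built from the list, so the original order survives.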
Update: OK, now I have a slightly more complicated task, where the offered algorithm won't work.
I have this dictionary:
precond = {
'seq1': ["1","2"],
'seq2': ["3","4","2"],
'seq3': ["5","4","2"],
'seq4': ["6","7","4","2"],
'seq5': ["6","4","7","2"],
'seq6': ["6","1","8","9","10"],
'seq7': ["6","1","8","11","9","12","13","14"],
'seq8': ["6","1","8","11","4","15","13"],
'seq9': ["6","1","8","16","9","11","4","17","18","2"],
'seq10': ["6","1","8","19","9","4","16","2"],
}
I expect these sequences, each containing at least 2 elements:
[1, 2],
[4, 2],
[6, 7],
[6, 4, 7, 2],
[6, 1, 8],
[9, 10],
[6, 1, 8, 11],
[9, 12, 13, 14],
[4, 15, 13],
[16, 9, 11, 4, 17, 18, 2],
[19, 9, 4, 16, 2]
Right now I wrote this code:
precond = {
'seq1': ["1","2"],
'seq2': ["3","4","2"],
'seq3': ["5","4","2"],
'seq4': ["6","7","4","2"],
'seq5': ["6","4","7","2"],
'seq6': ["6","1","8","9","10"],
'seq7': ["6","1","8","11","9","12","13","14"],
'seq8': ["6","1","8","11","4","15","13"],
'seq9': ["6","1","8","16","9","11","4","17","18","2"],
'seq10': ["6","1","8","19","9","4","16","2"],
}
seq_list = []
result_seq = []
#d = []
for seq in precond.values():
    seq_list.append(seq)
#print(b)
contseq_ind = 0
control_seq = seq_list[contseq_ind]
mainseq_ind = 1
el_ind = 0
#index2 = 0
def compar():
    if control_seq[contseq_ind] != seq_list[mainseq_ind][el_ind]:
        mainseq_ind += 1
        compar()
    else:
        result_seq.append(control_seq[contseq_ind])
        contseq_ind += 1
        el_ind += 1
        if contseq_ind > len(control_seq):
            control_seq = seq_list[contseq_ind + 1]
            compar()
        else:
            compar()
compar()
This code is not complete anyway: so far it only looks for matching elements from the beginning of the sequences, so I still need to write code that searches for a common run at the end of the two compared sequences.
Right now I have a problem with the recursion. Immediately after the first recursive call I get this error:
if control_seq[contseq_ind] != b[mainseq_ind][el_ind]:
UnboundLocalError: local variable 'control_seq' referenced before assignment
How can I fix this? Or maybe you have a better idea than using recursion? Thank you in advance.
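About the UnboundLocalError: any name that is assigned anywhere inside a function (including with `+=`) is treated as local for the whole function body, so reading it before that assignment fails even when a module-level variable with the same name exists. The sketch below reproduces this with a made-up `counter` variable and shows two common ways around it: declaring the name `global`, or passing the state in as a parameter and returning the new value.

```python
counter = 0

def broken():
    # The augmented assignment makes `counter` local to this function,
    # so the read that `+=` performs first raises UnboundLocalError.
    counter += 1

def with_global():
    global counter       # explicitly rebind the module-level name
    counter += 1

def with_parameter(value):
    return value + 1     # no shared state: take the value in, hand it back

try:
    broken()
except UnboundLocalError as exc:
    error_name = type(exc).__name__

with_global()
print(error_name, counter, with_parameter(counter))
```

Passing the indices as parameters, or using a plain loop, is usually easier to follow than recursion that mutates globals, and a loop also avoids Python's recursion depth limit.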

Not sure if this is what you wanted, but it gets the same result:
from collections import OrderedDict
a = {'seq1':["5","4","3","2","1","6","7","8","9"],
'seq2':["9","8","7","6","5","4","3","2","1"],
'seq3':["5","4","3","2","1","11","12","13","14"],
'seq4':["15","16","17"],
'seq5':["18","19","20","21","22","23"],
'seq6':["18","19","20","24","25","26"]}
level = 0
counts = OrderedDict()
# go through each value in the list of values to count the number
# of times it is used and indicate which list it belongs to
for elements in a.values():
    for element in elements:
        if element in counts:
            # unpack into fresh names so the dictionary `a` is not shadowed
            first_level, count = counts[element]
            counts[element] = first_level, count+1
        else:
            counts[element] = (level,1)
    level+=1
last = 0
result = []
# now break up the dictionary of unique values into lists according
# to the count of each value and the level that they existed in
for k,v in counts.items():
    if v == last:
        result[-1].append(k)
    else:
        result.append([k])
    last = v
print(result)
Result:
[['5', '4', '3', '2', '1'],
['6', '7', '8', '9'],
['11', '12', '13', '14'],
['15', '16', '17'],
['18', '19', '20'],
['21', '22', '23'],
['24', '25', '26']]

Related

Find all possible varients of max pair of 2

Given a string of numbers like 123456, I want to find all the possibilities they can be paired in by 2 or by itself. For example, from the string 123456 I would like to get the following:
12 3 4 5 6, 12 34 5 6, 1 23 4 56, etc.
The nearest I was able to come to was this:
strr = list("123456")
x = list("123456")
for i in range(int(len(strr)/2)):
    newlist = []
    for j in range(i):
        newlist.append(x[j])
    newlist.append(x[i] + x[i+1])
    for j in range(len(x))[i+2:]:
        newlist.append(x[j])
    x = newlist.copy()
    b = x.copy()
    for f in range(len(b))[i:]:
        if f == i:
            print(b)
            continue
        b[f] = b[f - 1][1] + b[f]
        b[f - 1] = b[f - 1][0]
        print(b)
This code gives the output:
It's easy to solve this problem with a recursive generator. This is similar to how you solve change-making problems, just here we have only two "coins", either two characters together, or one character at a time. The total change we're trying to make is the length of the input string. The fact that the characters are digits in a numeric string is irrelevant.
def singles_and_pairs(string):
    if len(string) <= 1: # base case
        yield list(string) # yield either [] or [string] and then quit
        return
    for result in singles_and_pairs(string[:-1]): # first recursion
        result.append(string[-1:])
        yield result
    for result in singles_and_pairs(string[:-2]): # second recursion
        result.append(string[-2:])
        yield result
If you plan on running this on large input strings, you might want to add memoization, since the recursive calls recalculate the same results quite often.
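One possible way to add the memoization mentioned above is sketched here. It is not the answerer's code: the function name `singles_and_pairs_cached` is made up, and it returns immutable tuples instead of yielding lists, because cached results are shared between calls and must not be modified in place.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def singles_and_pairs_cached(string):
    """Return every split of `string` into 1- and 2-character pieces."""
    if len(string) <= 1:
        # base case: one empty split for "", one single-piece split otherwise
        return ((string,),) if string else ((),)
    # splits that end with the last character on its own
    with_single = tuple(r + (string[-1:],)
                        for r in singles_and_pairs_cached(string[:-1]))
    # splits that end with the last two characters paired
    with_pair = tuple(r + (string[-2:],)
                      for r in singles_and_pairs_cached(string[:-2]))
    return with_single + with_pair

splits = singles_and_pairs_cached("123456")
print(len(splits))  # 13
```

Caching saves the repeated work on shared prefixes, but the number of results still grows like the Fibonacci numbers, so the full output remains large for long inputs.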
Pheew, this one took me some time to get right, but it seems to finally work (edited for prettier ordering):
def max_2_partitions(my_string):
    if not my_string:
        return [[]]
    if len(my_string) == 1:
        return [[my_string]]
    ret = []
    for i in range(len(my_string)):
        for l in max_2_partitions(my_string[:i] + my_string[i + 1:]):
            li = sorted([my_string[i]]+l, key = lambda x: (len(x),x))
            if li not in ret:
                ret.append(li)
        for j in range(i+1,len(my_string)):
            for l in max_2_partitions(my_string[:i]+my_string[i+1:j]+my_string[j+1:]):
                li = sorted([my_string[i] + my_string[j]] + l, key = lambda x: (len(x),x))
                if li not in ret:
                    ret.append(li)
    return sorted(ret, key=lambda x: (-len(x),x))
Example:
print(max_2_partitions("1234"))
# [['1', '2', '3', '4'], ['1', '2', '34'], ['1', '3', '24'], ['1', '4', '23'], ['2', '3', '14'], ['2', '4', '13'], ['3', '4', '12'], ['12', '34'], ['13', '24'], ['14', '23']]
12 lines of code, full permutations:
You can first create permutations of the string, and then add spacing:
from itertools import permutations
def solution(A):
    result = []
    def dfs(A,B):
        if not B:
            result.append(A)
        else:
            for i in range(1,min(2,len(B))+1):
                dfs(A+[B[:i]],B[i:])
    for x in permutations(A):
        dfs([],''.join(x))
    return result
print(f"{solution('123') = }")
# solution('123') = [['1', '2', '3'], ['1', '23'], ['12', '3'], ['1', '3', '2'], ['1', '32'], ['13', '2'], ['2', '1', '3'], ['2', '13'], ['21', '3'], ['2', '3', '1'], ['2', '31'], ['23', '1'], ['3', '1', '2'], ['3', '12'], ['31', '2'], ['3', '2', '1'], ['3', '21'], ['32', '1']]

how to split a list every nth item

I am trying to split a list every 5th item, then delete the next two items ('nan'). I have attempted to use List[:5], but that does not seem to work in a loop. The desired output is: [['1','2','3','4','5'],['1','2','3','4','5'],['1','2','3','4','5'],['1','2','3','4','5']]
List = ['1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan']
for i in List:
    # split first 5 items
    # delete next two items
    pass
# Desired output:
# [['1','2','3','4','5'],['1','2','3','4','5'],['1','2','3','4','5'],['1','2','3','4','5']]
There are lots of ways to do this. I recommend stepping by 7 and then slicing off the first 5 items of each step.
data = ['1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan']
# Step by 7 and keep the first 5
chunks = [data[i:i+5] for i in range(0, len(data), 7)]
print(*chunks, sep='\n')
Output:
['1', '2', '3', '4', '5']
['1', '2', '3', '4', '5']
['1', '2', '3', '4', '5']
['1', '2', '3', '4', '5']
Reference: Split a python list into other “sublists”...
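WARNING: this assumes the list follows the pattern you described: every 5 items are followed by 2 'nan' entries.
This loop appends the first 5 items as a list, then deletes the first 7 items, and repeats until the list is empty (so the original list is consumed).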
lst = ['1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan']
output = []
while True:
    if len(lst) <= 0:
        break
    output.append(lst[:5])
    del lst[:7]
print(output) # [['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5']]
List=['1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan']
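new_list = list()
for k in range(len(List)//7):
    new_list.append(List[k*7:k*7+5])
# keep a trailing partial group, if the length is not a multiple of 7
remainder = len(List) % 7
if remainder:
    new_list.append(List[-remainder:][:5])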
A straightforward solution in case the list doesn't follow the pattern you mentioned, but you always want to split the sequence between the 'nan' entries:
result, temp = [], []
for item in lst:
    if item != 'nan':
        temp.append(item)
    elif temp:
        result.append(list(temp))
        temp = []
if temp: # flush a final group that is not followed by 'nan'
    result.append(temp)
Using itertools.groupby would also support chunks of different lengths:
from itertools import groupby
[list(v) for k, v in groupby(List, key='nan'.__ne__) if k]
I guess there is a more Pythonic way to do the same, but:
result = []
while (len(List) > 5):
    result.append(List[0:0+5])
    del List[0:0+5]
    del List[0:2]
This results in: [['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5']]
mainlist=[]
sublist=[]
count=0
for i in List:
    if i!="nan":
        # collect the current item
        sublist.append(i)
        count+=1
        if count==5:
            # group of 5 complete; the next two 'nan' items are skipped by the check above
            mainlist.append(sublist)
            count=0
            sublist=[]
Generally numpy.split(...) will do any kind of custom splitting for you. Some reference:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.split.html
And the code:
import numpy as np
lst = ['1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan']
ind=np.ravel([[i*7+5, (i+1)*7] for i in range(len(lst)//7)])
lst2=np.split(lst, ind)[:-1:2]
print(lst2)
Outputs:
[array(['1', '2', '3', '4', '5'], dtype='<U3'), array(['1', '2', '3', '4', '5'], dtype='<U3'), array(['1', '2', '3', '4', '5'], dtype='<U3'), array(['1', '2', '3', '4', '5'], dtype='<U3')]
I like the slice answers.
Here are my 2 cents.
# changed var name away from var type
myList = ['1','2','3','4','5','nan','nan','1','2','3','4','10','nan','nan','1','2','3','4','15','nan','nan','1','2','3','4','20','nan','nan']
newList = [] # declare new list of lists to create
addItem = [] # declare temp list
myIndex = 0 # declare temp counting variable
for i in myList:
    myIndex +=1
    if myIndex==6:
        pass #do nothing (skip the first 'nan')
    elif myIndex==7: #add sub list to new list and reset variables
        if len(addItem)>0:
            newList.append(list(addItem))
        addItem=[]
        myIndex = 0
    else:
        addItem.append(i)
#output
print(newList)

Adding numbers to lists in Python but will add 1 and 0 instead of 10

Still in my quest for the Josephus problem, I ran across a small problem in the following code
(example code showing the problem without all the Josephus crap, exact same issue with both)
listy = []
var = 0
while var < 15:
    var += 1
    listy += str(var)
    print("var: ", str(var))
    print(listy)
print("")
print("")
print(listy)
Instead of adding, for example, 10 to the list, it will add 1 and 0. So instead of listy looking like:
['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11']
etc as it should, it looks like:
['1', '2', '3', '4', '5', '6', '7', '8', '9', '1', '0', '1', '1']
etc. The full output of the above was posted as an image.
Any help?
Changing
listy += str(var)
to
listy.append(str(var))
fixes the problem
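The reason is that `+=` on a list behaves like `list.extend()`: it iterates over the right-hand side and adds each item separately, and iterating over the string '10' gives the characters '1' and '0'. A short sketch of the three variants (the variable names are just for illustration):

```python
extended = []
extended += str(10)        # same as extended.extend("10"): adds '1', then '0'

appended = []
appended.append(str(10))   # adds the whole string as one element

wrapped = []
wrapped += [str(10)]       # += with a one-element list also adds one element

print(extended, appended, wrapped)
```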

Filtering out a generator

What's the best way to filter out some subsets from a generator? For example, I have a string "1023" and want to produce all possible combinations of its digits. All combinations would be:
['1', '0', '2', '3']
['1', '0', '23']
['1', '02', '3']
['1', '023']
['10', '2', '3']
['10', '23']
['102', '3']
['1023']
I am not interested in a subset that contains a leading 0 on any of the items, so the valid ones are:
['1', '0', '2', '3']
['1', '0', '23']
['10', '2', '3']
['10', '23']
['102', '3']
['1023']
I have two questions.
1) If using a generator, what's the best way to filter out the ones with leading zeroes? Currently, I generate all combinations, then loop through them afterwards and only continue if the subset is valid. For simplicity I am only printing the subset in the sample code. If the generator is very long or contains a lot of invalid subsets, it is almost a waste to loop through the entire generator. Is there a way to make the generator stop when it sees an invalid item (one with a leading zero) and filter it out of 'allCombinations'?
2) If the above doesn't exist, what's a better way to generate these combinations (disregarding combinations with leading zeroes)?
Code using a generator:
import itertools
def isValid(subset): ## DIGITS WITH LEADING 0 IS NOT VALID
    valid = True
    for num in subset:
        if num[0] == '0' and len(num) > 1:
            valid = False
            break
    return valid
def get_combinations(source, comb):
    res = ""
    for x, action in zip(source, comb + (0,)):
        res += x
        if action == 0:
            yield res
            res = ""
digits = "1023"
allCombinations = [list(get_combinations(digits, c)) for c in itertools.product((0, 1), repeat=len(digits) - 1)]
for subset in allCombinations: ## LOOPS THROUGH THE ENTIRE GENERATOR
    if isValid(subset):
        print(subset)
Filtering on an easy and obvious condition like "no leading zeros" can be done more efficiently while the combinations are being built.
def generate_pieces(input_string, predicate):
    if input_string:
        if predicate(input_string):
            yield [input_string]
        for item_size in range(1, len(input_string)+1):
            item = input_string[:item_size]
            if not predicate(item):
                continue
            rest = input_string[item_size:]
            for rest_piece in generate_pieces(rest, predicate):
                yield [item] + rest_piece
Generating every combination of cuts, so long it's not even funny:
>>> list(generate_pieces('10002', lambda x: True))
[['10002'], ['1', '0002'], ['1', '0', '002'], ['1', '0', '0', '02'], ['1', '0', '0', '0', '2'], ['1', '0', '00', '2'], ['1', '00', '02'], ['1', '00', '0', '2'], ['1', '000', '2'], ['10', '002'], ['10', '0', '02'], ['10', '0', '0', '2'], ['10', '00', '2'], ['100', '02'], ['100', '0', '2'], ['1000', '2']]
Only those where no fragment has leading zeros:
>>> list(generate_pieces('10002', lambda x: not x.startswith('0')))
[['10002'], ['1000', '2']]
Substrings that start with a zero were never considered for the recursive step.
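Note that the `startswith('0')` predicate above also rejects a lone '0', while the question treats a single '0' as valid. A sketch of a predicate that matches the question's rule (leading zero only forbidden for pieces longer than one character), reusing `generate_pieces` from this answer; the name `no_leading_zero` is made up for this example:

```python
def generate_pieces(input_string, predicate):
    if input_string:
        if predicate(input_string):
            yield [input_string]
        for item_size in range(1, len(input_string) + 1):
            item = input_string[:item_size]
            if not predicate(item):
                continue          # prune: never recurse past an invalid piece
            rest = input_string[item_size:]
            for rest_piece in generate_pieces(rest, predicate):
                yield [item] + rest_piece

def no_leading_zero(piece):
    # a lone '0' is allowed; '02', '023', ... are not
    return piece == '0' or not piece.startswith('0')

for pieces in generate_pieces('1023', no_leading_zero):
    print(pieces)
```

For '1023' this yields the six valid splits listed in the question, though in a different order.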
One common solution is to filter just before the yield. Here is an example:
import itertools
def my_gen(my_string):
    # Create combinations
    for length in range(len(my_string)):
        for my_tuple in itertools.combinations(my_string, length+1):
            # This is the string you would like to output
            output_string = "".join(my_tuple)
            # filter here:
            if output_string[0] != '0':
                yield output_string
my_string = '1023'
print(list(my_gen(my_string)))
EDIT: Added in a generator comprehension alternative
import itertools
my_string = '1023'
my_gen = ("".join(my_tuple) for length in range(len(my_string))
          for my_tuple in itertools.combinations(my_string, length+1)
          if "".join(my_tuple)[0] != '0')

How could i refresh a list once an item has been removed from a list within a list in python

This is quite complicated, but I would like to be able to refresh a larger list once an item has been taken out of a mini list within the bigger list.
listA = ['1','2','3','4','5','6','6','8','9','5','3','7']
I used the code below to split it into lists of three:
split = [listA[i:(i+3)] for i in range(0, len(listA), 3)]
print(split)
# [['1','2','3'],['4','5','6'],['6','8','9'],['5','3','7']]
split = [['1','2','3'],['4','5','6'],['6','8','9'],['5','3','7']]
If I delete '3' from the first list, split becomes:
del split[0][-1]
split = [['1','2'],['4','5','6'],['6','8','9'],['5','3','7']]
After '3' has been deleted, I would like to refresh the list so that it looks like this:
split = [['1','2','4'],['5','6','6'],['8','9','5'],['3','7']]
Thanks in advance.
Not sure how big this list is getting, but you would need to flatten it and recalculate it:
>>> listA = ['1','2','3','4','5','6','6','8','9','5','3','7']
>>> split = [listA[i:(i+3)] for i in range(0, len(listA), 3)]
>>> split
[['1', '2', '3'], ['4', '5', '6'], ['6', '8', '9'], ['5', '3', '7']]
>>> del split[0][-1]
>>> split
[['1', '2'], ['4', '5', '6'], ['6', '8', '9'], ['5', '3', '7']]
>>> listA = sum(split, []) # <- flatten split list back to 1 level
>>> listA
['1', '2', '4', '5', '6', '6', '8', '9', '5', '3', '7']
>>> split = [listA[i:(i+3)] for i in range(0, len(listA), 3)]
>>> split
[['1', '2', '4'], ['5', '6', '6'], ['8', '9', '5'], ['3', '7']]
Just recreate the single list from your nested lists, then re-split.
You can join the lists, assuming they are only one level deep, with something like:
rejoined = [element for sublist in split for element in sublist]
There are no doubt fancier ways, or single-liners that use itertools or some other library, but don't overthink it. If you're only talking about a few hundred or even a few thousand items this solution is quite good enough.
I need this for turning of cards in the deck in a solitaire game.
You can deal your cards using itertools.groupby() with a good key function:
import itertools
def group_key(x, n=3, flag=[0], counter=itertools.count(0)):
    if next(counter) % n == 0:
        flag[0] = flag[0] ^ 1
    return flag[0]
^ is the bitwise XOR operator; here it flips the flag between 0 and 1. The flag is stored in a list (a mutable default argument) so that its value persists between calls. The counter persists between calls as well, so the grouping only stays aligned as long as the total number of cards seen so far is a multiple of n.
Example:
>>> deck = ['1', '2', '3', '4', '5', '6', '6', '8', '9', '5', '3', '7']
>>> for k,g in itertools.groupby(deck, key=group_key):
... print(list(g))
['1', '2', '3']
['4', '5', '6']
['6', '8', '9']
['5', '3', '7']
Now let's say you've used card '9' and '8', so your new deck looks like:
>>> deck = ['1', '2', '3', '4', '5', '6', '6', '5', '3', '7']
>>> for k,g in itertools.groupby(deck, key=group_key):
... print(list(g))
['1', '2', '3']
['4', '5', '6']
['6', '5', '3']
['7']
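Because `group_key` keeps its counter between calls, a stateless variant may be easier to reuse. The following is only a sketch of that idea, not the answerer's code; the function name `deal` is made up. It pairs each card with its index via enumerate() and groups on the index divided by the group size.

```python
from itertools import groupby

def deal(deck, n=3):
    """Split `deck` into consecutive groups of at most `n` cards."""
    return [[card for _, card in group]
            # cards at index 0..n-1 share key 0, n..2n-1 share key 1, and so on
            for _, group in groupby(enumerate(deck), key=lambda pair: pair[0] // n)]

deck = ['1', '2', '3', '4', '5', '6', '6', '5', '3', '7']
for pile in deal(deck):
    print(pile)
```

Since nothing is stored between calls, each call starts at index 0 regardless of what was dealt earlier.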
Build an object that contains the list and tracks when the list is altered (probably by controlling writes to it), then have the object redo its own split every time the data is altered and save the split list to a member of the object.
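A minimal sketch of such an object, under the assumption that removal by (group, position) is the only write that is needed; the class name `Deck` and the method `remove_at` are hypothetical names chosen for this example.

```python
class Deck:
    """Holds a flat list of cards and keeps a grouped view up to date."""

    def __init__(self, cards, size=3):
        self._cards = list(cards)   # private flat copy of the data
        self._size = size
        self._resplit()

    def _resplit(self):
        # rebuild the grouped view from the flat list
        self.split = [self._cards[i:i + self._size]
                      for i in range(0, len(self._cards), self._size)]

    def remove_at(self, group, pos):
        # translate (group, position) into a flat index, delete, regroup
        del self._cards[group * self._size + pos]
        self._resplit()

deck = Deck(['1', '2', '3', '4', '5', '6', '6', '8', '9', '5', '3', '7'])
deck.remove_at(0, 2)    # remove '3' from the first group
print(deck.split)
```

All writes go through methods, so the grouped view in `split` cannot drift out of sync with the underlying list.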
