python all combinations of subsets of a string - python

I need all combinations of subsets of a string. In addition, a subset of length 1 can only be followed by a subset with length > 1. E.g. for string 4824 the result should be:
[ [4, 824], [4, 82, 4], [48, 24], [482, 4], [4824] ]
So far I managed to retrieve all possible subsets with:
length = len(number)
ss = []
for i in xrange(length):
    for j in xrange(i, length):
        ss.append(number[i:j + 1])
which gives me:
['4', '48', '482', '4824', '8', '82', '824', '2', '24', '4']
But I don't know how to combine those now.

First, write a function for generating all the partitions of the string:
def partitions(s):
    if s:
        for i in range(1, len(s) + 1):
            for p in partitions(s[i:]):
                yield [s[:i]] + p
    else:
        yield []
This iterates all the possible first segments (one character, two characters, etc.) and combines those with all the partitions for the respective remainder of the string.
>>> list(partitions("4824"))
[['4', '8', '2', '4'], ['4', '8', '24'], ['4', '82', '4'], ['4', '824'], ['48', '2', '4'], ['48', '24'], ['482', '4'], ['4824']]
Now, you can just filter those that match your condition, i.e. those that have no two consecutive substrings of length one.
>>> [p for p in partitions("4824") if not any(len(x) == len(y) == 1 for x, y in zip(p, p[1:]))]
[['4', '82', '4'], ['4', '824'], ['48', '24'], ['482', '4'], ['4824']]
Here, zip(p, p[1:]) is a common recipe for iterating over all pairs of consecutive items.
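To make the zip(p, p[1:]) recipe concrete, here is a small sketch of what it produces for one partition. The helper name has_adjacent_singles is made up for illustration and is not part of the answer above.

```python
def has_adjacent_singles(parts):
    """Return True if two neighbouring items both have length 1."""
    # zip(parts, parts[1:]) pairs every item with its successor.
    return any(len(x) == len(y) == 1 for x, y in zip(parts, parts[1:]))


p = ['4', '8', '24']
pairs = list(zip(p, p[1:]))                           # [('4', '8'), ('8', '24')]
bad = has_adjacent_singles(p)                         # True: '4' is followed by '8'
good = has_adjacent_singles(['4', '82', '4'])         # False: no two singles in a row
```

The filter in the answer keeps exactly those partitions for which this check is False.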
Update: Actually, incorporating your constraint directly into the partition function is not that hard, either. Just keep track of the last segment and set the minimum length accordingly.
def partitions(s, minLength=1):
    if len(s) >= minLength:
        for i in range(minLength, len(s) + 1):
            for p in partitions(s[i:], 1 if i > 1 else 2):
                yield [s[:i]] + p
    elif not s:
        yield []
Demo:
>>> print(list(partitions("4824")))
[['4', '82', '4'], ['4', '824'], ['48', '24'], ['482', '4'], ['4824']]
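Since more test cases were asked for, one way to gain some confidence is to check that the filtering approach and the direct approach return the same lists. This is only a sketch in Python 3; the names all_partitions, constrained_partitions, is_valid and approaches_agree are chosen here for illustration.

```python
def all_partitions(s):
    """Yield every way to cut s into contiguous, non-empty pieces."""
    if s:
        for i in range(1, len(s) + 1):
            for p in all_partitions(s[i:]):
                yield [s[:i]] + p
    else:
        yield []


def constrained_partitions(s, min_length=1):
    """Yield only partitions without two consecutive length-1 pieces."""
    if len(s) >= min_length:
        for i in range(min_length, len(s) + 1):
            # After a length-1 piece, the next piece must have length >= 2.
            for p in constrained_partitions(s[i:], 1 if i > 1 else 2):
                yield [s[:i]] + p
    elif not s:
        yield []


def is_valid(p):
    """True if no two neighbouring pieces both have length 1."""
    return not any(len(x) == len(y) == 1 for x, y in zip(p, p[1:]))


def approaches_agree(s):
    """Compare 'generate everything, then filter' with the direct generator."""
    filtered = [p for p in all_partitions(s) if is_valid(p)]
    return filtered == list(constrained_partitions(s))


results = {s: approaches_agree(s) for s in ("4", "48", "482", "4824", "478952")}
```

Both generators try the first segment lengths in the same order, so the two lists can be compared directly without sorting.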

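It would be interesting to see more test cases, but the following algorithm does what you describe:
s = "4824"
def partitions(s):
    # the unsplit string is always a valid result
    yield [s]
    if len(s) > 2:
        for i in range(len(s) - 1, 0, -1):
            for g in partitions(s[i:]):
                out = [s[:i]] + g
                # k (not i) is used here so the split position i is never overwritten;
                # list comprehensions leak their loop variable in Python 2
                if not any([len(out[k]) == len(out[k+1]) and len(out[k]) == 1 for k in range(len(out) - 1)]):
                    yield out
list(partitions(s))
You get: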
[['4824'], ['482', '4'], ['48', '24'], ['4', '824'], ['4', '82', '4']]
Explanation
I based it on the following algorithm, which generates the splits without the length check:
s = "4824"
def partitions_original(s):
    # yield the original string
    yield [s]
    if len(s) > 2:
        for i in range(len(s) - 1, 0, -1):
            # divide the string into two parts
            # iteration 1: a="482", b="4"
            # iteration 2: a="48", b="24"
            # iteration 3: a="4", b="824"
            a = s[:i]
            b = s[i:]
            # recursive call on b
            for g in partitions_original(b):
                # iteration 1: b="4", g takes the value ['4']
                # iteration 2: b="24", g takes the value ['24']
                # iteration 3: b="824", g takes the values ['824'], ['82', '4'], ['8', '24']
                yield [a] + g
list(partitions_original(s))
you get:
[['4824'], ['482', '4'], ['48', '24'], ['4', '824'],
['4', '82', '4'], ['4', '8', '24']]
The problem is ['4', '8', '24'], so I had to add an if to the code, because "a subset of length 1 can only be followed by a subset with length > 1".
For ['4', '8', '24'] the list comprehension inside any() evaluates to [True, False], and any() returns True if any element of the iterable is true, so that split is skipped.
NOTE
The check can also be written with all():
if all([len(out[k]) != len(out[k+1]) or len(out[k]) != 1 for k in range(len(out) - 1)]):

What I am doing here is enumerating every possible set of split positions in the string and discarding the splits that break your condition.
For example, a string with 5 digits such as "12345" has 4 possible positions at which it can be split. Call each choice a possibility, e.g. (0,0,0,0), (1,0,1,0), ..., where (0,0,1,0) means: don't separate 1 and 2345, don't separate 12 and 345, separate 123 and 45, don't separate 1234 and 5. Each possibility is turned into a list of parts and kept only if no two consecutive parts both have length 1.
from itertools import product
def get_comb(string):
    combinations = []
    # one flag per gap between characters: 1 = split there, 0 = don't
    for possibility in product([0, 1], repeat=len(string) - 1):
        parts = []
        start = 0
        for gap, split_here in enumerate(possibility):
            if split_here:
                parts.append(string[start:gap + 1])
                start = gap + 1
        # whatever is left after the last split is the final part
        parts.append(string[start:])
        # keep the split only if no two consecutive parts have length 1
        if not any(len(a) == len(b) == 1 for a, b in zip(parts, parts[1:])):
            combinations.append(parts)
    return combinations
string_ = '4824'
print("%s combinations:" % string_)
print(get_comb(string_))
string_ = '478952'
print("%s combinations:" % string_)
print(get_comb(string_))
string_ = '1234'
print("%s combinations:" % string_)
print(get_comb(string_))
>>
4824 combinations:
[['4824'], ['482', '4'], ['48', '24'], ['4', '824'], ['4', '82', '4']]
478952 combinations:
[['478952'], ['47895', '2'], ['4789', '52'], ['478', '952'], ['478', '95', '2'], ['478', '9', '52'], ['47', '8952'], ['47', '895', '2'], ['47', '89', '52'], ['47', '8', '952'], ['47', '8', '95', '2'], ['4', '78952'], ['4', '7895', '2'], ['4', '789', '52'], ['4', '78', '952'], ['4', '78', '95', '2'], ['4', '78', '9', '52']]
1234 combinations:
[['1234'], ['123', '4'], ['12', '34'], ['1', '234'], ['1', '23', '4']]

A plain version that lists every substring can be written like this:
s = raw_input('enter the string:')
word = []
for i in range(len(s)):
    for j in range(i, len(s)):
        word.append(s[i:j+1])
print word
print 'no of possible combinations:', len(word)
And output:
enter the string:
4824
['4', '48', '482', '4824', '8', '82', '824', '2', '24', '4']
no of possible combinations:10
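The same list of substrings can also be built with itertools.combinations. This is only a Python 3 sketch, and the function name substrings is illustrative. Note that it still only lists substrings; it does not combine them into the partitions asked for in the question.

```python
from itertools import combinations


def substrings(s):
    """Return all contiguous, non-empty substrings of s."""
    # Every pair of cut points i < j in 0..len(s) selects the slice s[i:j].
    return [s[i:j] for i, j in combinations(range(len(s) + 1), 2)]


words = substrings("4824")
count = len(words)   # n * (n + 1) / 2 substrings for a string of length n
```

For a string of length 4 this gives 4 * 5 / 2 = 10 substrings, matching the count printed above.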

Related

How to create a unique list of lists from a list of list of lists without "flattening"

Edited Question
I've got an inner nested list as shown below.
test_list = [
[['1']],
[['2', '2']],
[['3', '3'], ['4', '4'], ['5', '5'], ['6', '6']],
[['7', '7'], ['8'], ['7'], ['1'], ['7', '7']],
[['7', '7'], ['8'], ['7'], ['7', '7']],
[['2', '2']]]
I'm not looking to flatten this list per se. Rather, I would like to achieve two results.
1) Return a list that discards the outer list for entries holding a single item, but keeps the nested list for entries with grouped items, i.e.
Expected list output:
new_list = [
['1'],
['2', '2'],
[['3', '3'], ['4', '4'], ['5', '5'], ['6', '6']],
[['7', '7'], ['8'], ['7'], ['1'], ['7', '7']],
[['7', '7'], ['8'], ['7'], ['7', '7']],
['2', '2']]
And when printed looks like:
['1']
['2', '2']
[['3', '3'], ['4', '4'], ['5', '5'], ['6', '6']]
[['7', '7'], ['8'], ['7'], ['1'], ['7', '7']]
[['7', '7'], ['8'], ['7'], ['7', '7']]
['2', '2']
2) Return a new unique list, which would look like the new_list variable shown below.
new_list = [
['1'],
['2'],
['3', '4', '5', '6'],
['7', '8', '7', '1', '7'],
['7', '8', '7', '7'],
['2']]
i.e. each item in the new list is printed out as:
['1']
['2']
['3', '4', '5', '6']
['7', '8', '7', '1', '7']
['7', '8', '7', '7']
['2']
Thanks.
PS: Sorry for the total re-edit, I totally misrepresented the expected result.
You can use a simple test on the sublists length in a list comprehension:
out = [l[0] if len(l)==1 else l for l in test_list]
output:
[['1'],
['2', '2'],
[['3', '3'], ['4', '4'], ['5', '5'], ['6', '6']],
[['7', '7'], ['8'], ['7'], ['1'], ['7', '7']],
[['7', '7'], ['8'], ['7'], ['7', '7']],
['2', '2']]
If you really want to print:
for l in test_list:
    print(l[0] if len(l)==1 else l)
['1']
['2', '2']
[['3', '3'], ['4', '4'], ['5', '5'], ['6', '6']]
[['7', '7'], ['8'], ['7'], ['1'], ['7', '7']]
[['7', '7'], ['8'], ['7'], ['7', '7']]
['2', '2']
For the edited question:
You need a nested list comprehension:
[[x[0] for x in l] for l in test_list]
Output:
[['1'],
['2'],
['3', '4', '5', '6'],
['7', '8', '7', '1', '7'],
['7', '8', '7', '7'],
['2']]
If you struggle with understanding list comprehensions, especially nested ones, you can break them down into a regular for loop with an if/else.
new_list = []
for each_index in test_list:
    if len(each_index) == 1:
        new_list.append(each_index[0])
    else:
        new_list.append(each_index)
print(new_list)
And the second answer with the nested list comprehension translates to:
new_list = []
for each_index in test_list:
    test = []
    for each_item in each_index:
        test.append(each_item[0])
    new_list.append(test)
print(new_list)
NOTE:
The list comprehension answer given by @mozway is more concise, but when I want to properly understand the flow of a list comprehension, a good practice is to break it down like this, so that I avoid just copying and pasting and come away with a better understanding.
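To tie the two forms together, the comprehensions can be wrapped in small functions and tried on a shortened input. This is just a sketch; the function names unwrap_single and first_items and the sample list are made up for illustration.

```python
def unwrap_single(groups):
    """Drop the outer list of any entry that holds exactly one sublist."""
    return [g[0] if len(g) == 1 else g for g in groups]


def first_items(groups):
    """Keep only the first element of every innermost list."""
    return [[inner[0] for inner in g] for g in groups]


sample = [
    [['1']],
    [['2', '2']],
    [['3', '3'], ['4', '4']],
    [['7', '7'], ['8']],
]

unwrapped = unwrap_single(sample)
firsts = first_items(sample)
```

Here unwrapped keeps the grouped entries nested, while firsts reduces every inner list to its first element.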

How can I extract a column and create a vector out of them?

mat = [['1', '2', '3', '4', '5'],
['6', '7', '8', '9', '10'],
['11', '12', '13', '14', '15']]
Suppose, I have this vector of vectors.
Say, I need to extract 2nd column of each row, convert them into binary, and then create a vector of them.
Is it possible to do it without using NumPy?
Use zip to transpose the list, loop over the resulting columns with enumerate, keep the column whose index is 1, and convert each value with bin().
mat = [['1', '2', '3', '4', '5'],
['6', '7', '8', '9', '10'],
['11', '12', '13', '14', '15']]
vec = [[bin(int(r)) for r in row] for idx, row in enumerate(zip(*mat)) if idx == 1][0]
print(vec) # ['0b10', '0b111', '0b1100']
Yes. This is achievable with the following code:
mat = [['1', '2', '3', '4', '5'],
['6', '7', '8', '9', '10'],
['11', '12', '13', '14', '15']]
def decimalToBinary(n):
    # bin() returns e.g. '0b10'; strip the '0b' prefix
    return bin(n).replace("0b", "")
new_vect = []
for m in mat:
    m = int(m[1])  # second column of the current row
    new_vect.append(decimalToBinary(m))
print(new_vect)
Output:
['10', '111', '1100']
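A more compact variant of the same idea, as a sketch: format(n, 'b') gives the binary digits without the '0b' prefix, so no string replacement is needed. The function name column_as_binary is illustrative, not something from the answers above.

```python
def column_as_binary(rows, col):
    """Return the given column with each entry converted to a binary string."""
    # int() parses the decimal string; format(..., 'b') renders it in base 2.
    return [format(int(row[col]), 'b') for row in rows]


mat = [['1', '2', '3', '4', '5'],
       ['6', '7', '8', '9', '10'],
       ['11', '12', '13', '14', '15']]

vec = column_as_binary(mat, 1)   # second column has index 1
```

Passing a different index picks a different column, e.g. column_as_binary(mat, 0) for the first one.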

How to unpack values from a file

Suppose I have as input a file that reads:
0->54:15
1->41:12
2->35:6
3->42:10
4->34:7
5->58:5
6->55:12
7->39:6
8->36:12
9->38:15
10->53:13
11->56:12
12->51:5
13->48:8
14->60:14
15->46:12
16->57:6
17->52:9
18->40:11
This is actually an adjacency list. I want my code to read the file and take the values as u=0, v=54, w=15 for the first line, and so on, and then carry on with my plan. How can I do this? Thank you in advance for taking the time to read and answer this.
Using .split would be good.
For each line in the file (you can get the lines by using the open() function), split it on the arrow and then on the colon.
for line in lines:
    split_line = line.strip().split("->")  # Split by the arrow first
    split_line = [split_line[0]] + split_line[1].split(":")  # Then split the right-hand part by the colon
    u, v, w = split_line  # Note u, v, and w are strings
I would recommend using the JSON format so you can use the json module in Python to parse the file into variables easily.
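For the split-based approach, a slightly fuller sketch that also converts the values to integers could look like this. It uses str.partition instead of split; the function names parse_edge and parse_edges are made up for illustration, and it assumes every non-blank line has exactly the form u->v:w.

```python
def parse_edge(line):
    """Parse a line such as '0->54:15' into the integers (u, v, w)."""
    u, _, rest = line.strip().partition("->")
    v, _, w = rest.partition(":")
    return int(u), int(v), int(w)


def parse_edges(lines):
    """Parse every non-blank line; works on a list of strings or an open file."""
    # Skip blank lines so a trailing newline in the file does no harm.
    return [parse_edge(line) for line in lines if line.strip()]


edges = parse_edges(["0->54:15\n", "1->41:12\n", "\n"])
u, v, w = edges[0]
```

With a real file you could pass the open file object directly, e.g. parse_edges(f) inside a with open(...) block.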
If you had a single string:
import re
s = \
'''0->54:15
1->41:12
2->35:6
3->42:10
4->34:7
5->58:5
6->55:12
7->39:6
8->36:12
9->38:15
10->53:13
11->56:12
12->51:5
13->48:8
14->60:14
15->46:12
16->57:6
17->52:9
18->40:11'''
s = s.split('\n')
output = [re.split('->|:', x) for x in s]
output
[['0', '54', '15'], ['1', '41', '12'], ['2', '35', '6'], ['3', '42', '10'], ['4', '34', '7'], ['5', '58', '5'], ['6', '55', '12'], ['7', '39', '6'], ['8', '36', '12'], ['9', '38', '15'], ['10', '53', '13'], ['11', '56', '12'], ['12', '51', '5'], ['13', '48', '8'], ['14', '60', '14'], ['15', '46', '12'], ['16', '57', '6'], ['17', '52', '9'], ['18', '40', '11']]
If you want a dictionary
d = {x[0]:[x[1],x[2]] for x in output}
d
{'0': ['54', '15'], '1': ['41', '12'], '2': ['35', '6'], '3': ['42', '10'], '4': ['34', '7'], '5': ['58', '5'], '6': ['55', '12'], '7': ['39', '6'], '8': ['36', '12'], '9': ['38', '15'], '10': ['53', '13'], '11': ['56', '12'], '12': ['51', '5'], '13': ['48', '8'], '14': ['60', '14'], '15': ['46', '12'], '16': ['57', '6'], '17': ['52', '9'], '18': ['40', '11']}
If you want a dataframe:
import pandas as pd
df = pd.DataFrame(output, columns=['u','v','w'])
df
     u   v   w
0    0  54  15
1    1  41  12
2    2  35   6
3    3  42  10
4    4  34   7
5    5  58   5
6    6  55  12
7    7  39   6
8    8  36  12
9    9  38  15
10  10  53  13
11  11  56  12
12  12  51   5
13  13  48   8
14  14  60  14
15  15  46  12
16  16  57   6
17  17  52   9
18  18  40  11
Here is how you can use re.split() to split strings with multiple delimiters:
from re import split
with open('file.txt', 'r') as f:
    l = f.read().splitlines()
lst = [list(filter(None, split(r'[(\-\>):]', s))) for s in l]
print(lst)
Output:
[['0', '54', '15'],
['1', '41', '12'],
['2', '35', '6'],
['3', '42', '10'],
['4', '34', '7'],
['5', '58', '5'],
['6', '55', '12'],
['7', '39', '6'],
['8', '36', '12'],
['9', '38', '15'],
['10', '53', '13'],
['11', '56', '12'],
['12', '51', '5'],
['13', '48', '8'],
['14', '60', '14'],
['15', '46', '12'],
['16', '57', '6'],
['17', '52', '9'],
['18', '40', '11']]
Breaking it down:
This: lst = [list(filter(None, split(r'[(\-\>):]', s))) for s in l] is the equivalent of:
lst = [] # The main list
for s in l: # For every line in the list of lines
    uvw = split(r'[(\-\>):]', s) # uvw = a list of the numbers
    uvw = list(filter(None, uvw)) # Splitting on '-' and '>' separately leaves an empty string, so filter it out
    lst.append(uvw) # Add the list to the main list
I'm going to challenge the way that you're getting the input file in the first place: if you have any control over how you get this input, I'd encourage you to change its format. (If not, maybe this answer will help people who have a similar issue in the future).
There is typically little reason to "roll your own" serialization and deserialization like this - it's reinventing the wheel, given that most modern languages have built-in libraries to do this already. Rather, if at all possible, you should use a standard serialization and deserialization mechanism like Python pickle or a JSON serializer (or even a CSV, so that you can use a CSV parser).
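As a rough illustration of the JSON route, the same edges could be stored as a list of objects and read back with the standard json module. The layout below (a list of dicts with keys u, v and w) is an assumption made for this sketch, not a format given in the question.

```python
import json

# Two edges from the question, in an assumed JSON-friendly layout.
edges = [
    {"u": 0, "v": 54, "w": 15},
    {"u": 1, "v": 41, "w": 12},
]

text = json.dumps(edges)     # the string that would be written to the file
loaded = json.loads(text)    # what the reading side gets back

first = loaded[0]
u, v, w = first["u"], first["v"], first["w"]
```

The values come back as integers, so no manual splitting or int() conversion is needed.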

Split list on smaller lists with equal elements

In my list "A" I have numbers and ' ' entries. I want to make a list of lists named e.g. "B", where every sublist holds nine numbers (if possible), no matter how many ' ' entries it contains.
Any idea how to do this?
A = ['1', '3', '4', '5', '7', '8', '9', ' ', '13', '16', '3', ' ', '5', '17']
B = [ ['1', '3', '4', '5', '7', '8', '9', ' ', '13', '16'], ['3', ' ', '5', '17'] ]
This will help you:
>>> a = ['1', '3', '4', '5', '7', '8', '9', ' ', '13', '16', '3', ' ', '5', '17']
>>> b=[a[i:i+9] for i in xrange(0,len(a),9)]
>>> b
[['1', '3', '4', '5', '7', '8', '9', ' ', '13'], ['16', '3', ' ', '5', '17']]
>>>
This can be done with two nested while loops:
>>> A = ['1', '3', '4', '5', '7', '8', '9', ' ', '13', '16', '3', ' ', '5', '17']
>>> B = []
>>> while A:
...     L = []
...     c = 0
...     while A and c < 9:
...         L.append(A.pop(0))
...         if L[-1].isdigit():
...             c += 1
...     B.append(L)
...
>>> B
[['1', '3', '4', '5', '7', '8', '9', ' ', '13', '16'], ['3', ' ', '5', '17']]
The outer one loops while A is not empty and the inner one while A is not empty and the number of digit only strings appended to the current sub-list is less than 9. The counter is only incremented after a string consisting of only digits is found.
It would be worth your time to get deep into list comprehensions.
Also, there is no xrange in Python 3.x; rather, range in 3.x does exactly what xrange did in Python 2.x.
A = ['1', '3', '4', '5', '7', '8', '9', ' ', '13', '16', '3', ' ', '5', '17']
B = [i for i in A[0:9]] #is cleaner.
Though I'm not sure exactly what your goal is. Do you want the second list (the remainder list as I'm thinking of it) to be in the same variable? So if you had 28 elements in your list you'd want three lists of 9 and one list of 1?
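This is a somewhat dirty solution, but I think you need both the isdigit check and pop.
def take(lst, n):
    if not lst:
        # raise (not return) so an empty input is actually reported
        raise ValueError("Empty list, please check the list.")
    items = list(lst)  # work on a copy so the caller's list is untouched
    new_list = []
    count = 0
    while items:
        item = items.pop(0)
        new_list.append(item)
        if item.isdigit():
            count += 1
        if count >= n:
            # n numbers collected: emit this chunk and start a new one
            yield new_list
            new_list = []
            count = 0
    if new_list:
        # emit the remainder, which holds fewer than n numbers
        yield new_list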
A = ['1', '3', '4', '5', '7', '8', '9', ' ', '13', '16', '3', ' ', '5', '17']
B = [ii for ii in take(A, 9)]
#[['1', '3', '4', '5', '7', '8', '9', ' ', '13', '16'], ['3', ' ', '5', '17']]
Check the following:
https://docs.python.org/2/library/stdtypes.html#str.isdigit

Integer Manipulation of arrays python

I have 2 arrays and I need to replace the last digit of each integer in one array with the corresponding integer from the other. It's better if I show you the output so you get a better understanding of what I'm trying to do. I'm not even sure this is possible.
Output of arrays:
first_array=['3', '4', '5', '2', '0', '0', '1', '7']
second_array=['527', '61', '397', '100', '97', '18', '45', '1']
What it should then look like:
first_array=['3', '4', '5', '2', '0', '0', '1', '7']
second_array =['523', '64', '395', '102', '90', '10', '41', '7']
>>> [s[:-1]+f for (f,s) in zip(first_array, second_array)]
['523', '64', '395', '102', '90', '10', '41', '7']
If it is actual integers, you could try "rounding down" each element of the second list to nearest multiple of 10, then adding each element from the first list. For example:
>>> first = [3,4,5,6]
>>> second = [235,123,789,9021]
>>> second = [x - (x%10) for x in second]
>>> second
[230, 120, 780, 9020]
>>> [x + y for (x,y) in zip(first, second)]
[233, 124, 785, 9026]
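The integer version can be wrapped in a small helper and applied to the numbers from the question. This is a sketch; the name replace_last_digit is illustrative, and it assumes non-negative integers and single-digit replacements.

```python
def replace_last_digit(number, digit):
    """Replace the last decimal digit of a non-negative integer."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if not 0 <= digit <= 9:
        raise ValueError("digit must be between 0 and 9")
    # number % 10 is the current last digit; remove it, then add the new one.
    return number - number % 10 + digit


first = [3, 4, 5, 2, 0, 0, 1, 7]
second = [527, 61, 397, 100, 97, 18, 45, 1]

result = [replace_last_digit(s, f) for f, s in zip(first, second)]
```

The result matches the string-based one-liner above, only with integers instead of strings.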
