I want to create a password list. Let's assume I have already created a list of permutations, for example:
##
##
##
##
Then I want to add other characters to it (for example a and b). In this code, a and b are called special chars and the ## entries are called added chars.
The final list I want looks like this:
ab## , ab##,ab##,ab## , ba##, .... a##b,...,b##a , ... , ba##
Note: I don't want any special character to be duplicated. For example, I
don't want aa## or bb## (a and b can't be repeated because they are
special chars; the # characters can be repeated because they are added chars).
Code:
import itertools
master_list = []
# s (defined elsewhere) holds the special chars; it is the single char 1 in this example
l = list(itertools.combinations_with_replacement('##', 2))  # gives me this list: [(#,#),(#,#),(#,#),(#,#)]
for i in l:
    i = i + tuple(s)  # add the special chars to each combination
    master_list.append(i)
print(master_list)  # now I have this list: [(#,#,1),(#,#,1),....(#,#,1)]
If I could get all permutations of each entry in master_list, my problem would be solved, but I couldn't work out how to do that.
Update: I solved my problem. My idea: first, generate all possible combinations of the added chars (#,#) and save them to a list. Then save the special chars (a,b) to another list. Now we have two lists; we just need to merge them and finally pass each merged entry to the permute_unique function:
def permute_unique(nums):
    perms = [[]]
    for n in nums:
        new_perm = []
        for perm in perms:
            for i in range(len(perm) + 1):
                new_perm.append(perm[:i] + [n] + perm[i:])
                # handle duplication
                if i < len(perm) and perm[i] == n: break
        perms = new_perm
    return perms
l = list(itertools.combinations_with_replacement(algorithm, 3))
for i in l:
    i = i + tuple(s)  # merge
    master_list.append(i)
# permute each merged entry (printing the function object itself would show nothing useful)
for item in master_list:
    print(permute_unique(item))
You can just combine the combinations_with_replacement of the "added" chars with all the permutations of those combinations and the "special" characters:
>>> import itertools
>>> special = "ab"
>>> added = "##"
>>> [''.join(p)
...  for a in itertools.combinations_with_replacement(added, 2)
...  for p in itertools.permutations(a + tuple(special))]
['##ab',
'##ba',
'#a#b',
...
'a#b#',
'ab##',
...
'##ab',
'##ba',
...
'ba##',
'ba##']
If you want to prevent duplicates, pass the inner permutations through a set:
>>> [''.join(p)
...  for a in itertools.combinations_with_replacement(added, 2)
...  for p in set(itertools.permutations(a + tuple(special)))]
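The same idea can be wrapped in a function that collects everything in a single set, which also covers the case where the added string itself contains repeated characters. This is only a minimal sketch: the function name password_candidates and the added chars "#@" are illustrative assumptions, not part of the question.

```python
import itertools

def password_candidates(special, added, n_added):
    """Return every unique string made of n_added added chars plus each special char once."""
    seen = set()
    for combo in itertools.combinations_with_replacement(added, n_added):
        # Permute the chosen added chars together with the special chars.
        for perm in itertools.permutations(combo + tuple(special)):
            seen.add("".join(perm))  # the set drops repeated strings
    return sorted(seen)

# Hypothetical inputs: "ab" are the special chars, "#@" the added chars.
candidates = password_candidates("ab", "#@", 2)
print(len(candidates))
print(candidates[:5])
```

Because each special char is appended exactly once before permuting, strings such as aa## can never be produced.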
I have this list
num_list=["mille", "duemila", "tremila", "quattromila", "cinquemila", "seimila", "settemila", "ottomila", "novemila", "diecimila", "milione", "miliardo", "milioni",'miliardi','mila']
I would like to build the following list
output=['millesimo', 'duemillesimo','tremillesimo','quattromillesimo','cinquemillesimo','seimillesimo','settemillesimo', 'ottomillesimo', 'novemillesimo', 'diecimillesimo', 'milionesimo', 'miliardesimo', 'milionesimo','miliardesimo']
The list should be built by applying the following rules after removing the last character from each string:
if the word is 'mila', skip it (it should not appear in the output);
if the remaining string ends with a single 'l', add 'lesimo';
otherwise (the remaining string ends with 'll', or it is "milion" or "miliard"), add 'esimo'.
I started to do as follows:
numeri_card_esimo = [x[:-1] + 'lesimo' if x[:-2] == 'll' else x[:-1] + 'esimo' for x in numeri_card_esimo]
but the output is not what I want:
['millesimo',
'duemilesimo', # it should be duemillesimo
'tremilesimo', # same as above
'quattromilesimo', # same as above
'cinquemilesimo', # same as above
'seimilesimo', # same as above
'settemilesimo', # same as above
'ottomilesimo', # same as above
'novemilesimo', # same as above
'diecimilesimo', # same as above
'milionesimo',
'miliardesimo',
'milionesimo',
'milesimo'] # it should be excluded
It does not work because I am using the if/else conditions incorrectly. How should I write these conditions?
In my opinion, the logic you are trying to apply is a bit long to be used in a list comprehension. It is better to move it into a function for the sake of readability.
def convert(num):
    # Drop the final vowel, then pick the suffix
    num = num[:-1]
    if num[-2:]=='ll' or num=='milion' or num=='miliard':
        num = num + 'esimo'
    elif num[-1]=='l':
        num = num + 'lesimo'
    return num
num_list=["mille", "duemila", "tremila", "quattromila", "cinquemila", "seimila", "settemila", "ottomila", "novemila", "diecimila", "milione", "miliardo", "milioni",'miliardi','mila']
# Remove mila occurrences
num_list = [num for num in num_list if num!='mila']
output = [convert(num) for num in num_list]
print(output)
As an example :
I have 3 lists -
seq_to_find = ['abc','de'] (length = n)
main_list= ['a','b','c','ghi','d','e','far','last','a','b','c'] (length = m)
transaction_nums=[1,3,6,8,10,15,16,17,19,20,22] (note: always sorted,length = m)
How do I find where each sequence starts and ends in main_list?
In other words, I want to write a function, say
def findTheMasks(seq_to_find,main_list,transaction_nums):
that returns a list of sublists, each holding the "start" and "end" transaction_nums of one match.
For the example given above: [[1,6],[10,15],[19,22]]
Please help. Thanks in advance.
I assume that a partially matched sequence does not go into the result, and that there is no empty ('') sequence. Here is a sample solution.
seq_to_find = ['abc', 'de']
main_list= ['a','b','c','ghi','d','e','far','last','a','b','c']
transaction_nums=[1,3,6,8,10,15,16,17,19,20,22]
def findTheMasks(seq_to_find,main_list,transaction_nums):
    ret = []
    # go through each in main list
    for i in range(0, len(main_list)):
        # try to match each seq
        for seq in seq_to_find:
            remain = seq
            # match seq from start, reduce seq if any match, until empty
            for j in range(i, len(main_list)):
                x = main_list[j]
                # remain matches next in main list
                if remain.startswith(x):
                    remain = remain.replace(x, '', 1)
                    # everything matched
                    if not remain:
                        break
                # not matched
                else:
                    break
            # fully matched, add to result
            if not remain:
                ret.append([transaction_nums[i], transaction_nums[j]])
    return ret
print(findTheMasks(seq_to_find, main_list, transaction_nums))
And output is:
[[1, 6], [10, 15], [19, 22]]
I am new to Python and can't quite figure out a solution to my problem. I would like to split a list into two lists, based on what each list item starts with. My list looks like this, where each line represents an item (this is not correct list notation, but I'll leave it like this for a better overview):
***
**
.param
+foo = bar
+foofoo = barbar
+foofoofoo = barbarbar
.model
+spam = eggs
+spamspam = eggseggs
+spamspamspam = eggseggseggs
So I want one list that contains all lines starting with a '+' between .param and .model, and another list that contains all lines starting with a '+' after .model until the end.
I have looked at enumerate() and split(), but since I have a list and not a string and am not trying to match whole items in the list, I'm not sure how to implement them.
What I have is this:
paramList = []
for line in newContent:
    while line.startswith('+'):
        paramList.append(line)
        if line.startswith('.'):
            break
This is just my attempt to create the first list. The problem is that the code reads the second block of '+' lines as well, because break only exits the while loop, not the for loop.
I hope you can understand my question and thanks in advance for any pointers!
What you want is a simple task that can be accomplished using list slices and a list comprehension:
data = ['**','***','.param','+foo = bar','+foofoo = barbar','+foofoofoo = barbarbar',
        '.model','+spam = eggs','+spamspam = eggseggs','+spamspamspam = eggseggseggs']
# First get the interesting positions.
param_tag_pos = data.index('.param')
model_tag_pos = data.index('.model')
# Get all elements between tags.
params = [param for param in data[param_tag_pos + 1: model_tag_pos] if param.startswith('+')]
# Slice to the end of the list; a stop of -1 would drop the last line.
models = [model for model in data[model_tag_pos + 1:] if model.startswith('+')]
print(params)
print(models)
Output
>>> ['+foo = bar', '+foofoo = barbar', '+foofoofoo = barbarbar']
>>> ['+spam = eggs', '+spamspam = eggseggs', '+spamspamspam = eggseggseggs']
Answer to comment:
Suppose you have a list containing numbers from 0 up to 5.
l = [0, 1, 2, 3, 4, 5]
Then using list slices you can select a subset of l:
another = l[2:5] # another is [2, 3, 4]
That's what we are doing here:
data[param_tag_pos + 1: model_tag_pos]
And for your last question: "...how does Python know param are the lines in data it should iterate over, and what exactly does the first param in param for param do?"
Python doesn't know; you have to tell it.
The first param is just a variable name I'm using here; it could be x, list_items, whatever you want.
Here is the line of code translated to plain English:
# Pythonian
params = [param for param in data[param_tag_pos + 1: model_tag_pos] if param.startswith('+')]
# English
params is a list of "things": one for each "thing" we can see in the list `data`
from position `param_tag_pos + 1` up to (but not including) position `model_tag_pos`, but only if that "thing" starts with the character '+'.
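Written as an explicit loop, the comprehension does roughly the following. The data list is copied from the answer above so that the sketch runs on its own.

```python
data = ['**', '***', '.param', '+foo = bar', '+foofoo = barbar', '+foofoofoo = barbarbar',
        '.model', '+spam = eggs', '+spamspam = eggseggs', '+spamspamspam = eggseggseggs']

param_tag_pos = data.index('.param')
model_tag_pos = data.index('.model')

params = []
# Walk over the slice between the two tags, one item at a time.
for param in data[param_tag_pos + 1: model_tag_pos]:
    # Keep the item only if it starts with '+'.
    if param.startswith('+'):
        params.append(param)

print(params)
```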
data = {}
for line in newContent:
    if line.startswith('.'):
        cur_dict = {}
        data[line[1:]] = cur_dict
    elif line.startswith('+'):
        key, value = line[1:].split(' = ', 1)
        cur_dict[key] = value
This creates a dict of dicts:
{'model': {'spam': 'eggs',
'spamspam': 'eggseggs',
'spamspamspam': 'eggseggseggs'},
'param': {'foo': 'bar',
'foofoo': 'barbar',
'foofoofoo': 'barbarbar'}}
I am new to Python
Whoops. Don't bother with my answer then.
I want a list that contains all lines starting with a '+' between
.param and .model and another list that contains all lines starting
with a '+' after model until the end.
import itertools as it
import pprint
data = [
    '***',
    '**',
    '.param',
    '+foo = bar',
    '+foofoo = barbar',
    '+foofoofoo = barbarbar',
    '.model',
    '+spam = eggs',
    '+spamspam = eggseggs',
    '+spamspamspam = eggseggseggs',
]
results = [
    list(group) for key, group in it.groupby(data, lambda s: s.startswith('+'))
    if key
]
pprint.pprint(results)
print('-' * 20)
print(results[0])
print('-' * 20)
pprint.pprint(results[1])
--output:--
[['+foo = bar', '+foofoo = barbar', '+foofoofoo = barbarbar'],
['+spam = eggs', '+spamspam = eggseggs', '+spamspamspam = eggseggseggs']]
--------------------
['+foo = bar', '+foofoo = barbar', '+foofoofoo = barbarbar']
--------------------
['+spam = eggs', '+spamspam = eggseggs', '+spamspamspam = eggseggseggs']
This thing here:
it.groupby(data, lambda x: x.startswith('+'))
...tells python to create groups from the strings according to their first character. If the first character is a '+', then the string gets put into a True group. If the first character is not a '+', then the string gets put into a False group. However, there are more than two groups because consecutive False strings will form a group, and consecutive True strings will form a group.
Based on your data, the first three strings:
***
**
.param
will create one False group. Then, the next strings:
+foo = bar
+foofoo = barbar
+foofoofoo = barbarbar
will create one True group. Then the next string:
'.model'
will create another False group. Then the next strings:
'+spam = eggs'
'+spamspam = eggseggs'
'+spamspamspam = eggseggseggs'
will create another True group. The result will be something like:
{
False: [strs here],
True: [strs here],
False: [strs here],
True: [strs here]
}
Then it's just a matter of picking out each True group: if key, and then converting the corresponding group to a list: list(group).
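To see the groups for yourself, you can print each key next to its group. This is a small sketch using the same data; the name groups is introduced here only for illustration.

```python
import itertools as it

data = ['***', '**', '.param',
        '+foo = bar', '+foofoo = barbar', '+foofoofoo = barbarbar',
        '.model',
        '+spam = eggs', '+spamspam = eggseggs', '+spamspamspam = eggseggseggs']

# groupby yields lazy iterators, so turn each group into a list before using it.
groups = [(key, list(group))
          for key, group in it.groupby(data, lambda s: s.startswith('+'))]

for key, group in groups:
    print(key, group)

# Keep only the groups whose key is True.
results = [group for key, group in groups if key]
```

The keys come out in the order False, True, False, True, which is why there are four groups rather than two.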
Response to comment:
where exactly does python go through data, like how does it know s is
the data it's iterating over?
groupby() works like do_stuff() below:
def do_stuff(items, func):
    for item in items:
        print(func(item))
#Create the arguments for do_stuff():
data = [1, 2, 3]
def my_func(x):
    return x + 100
#Call do_stuff() with the proper argument types:
do_stuff(data, my_func)  #Just like when calling groupby(), you provide some data
                         #and a function that you want applied to each item in data
--output:--
101
102
103
Which can also be written like this:
do_stuff(data, lambda x: x + 100)
lambda creates an anonymous function, which is convenient for simple functions which you don't need to refer to by name.
This list comprehension:
[
    list(group)
    for key, group in it.groupby(data, lambda s: s.startswith('+'))
    if key
]
is equivalent to this:
results = []
for key, group in it.groupby(data, lambda s: s.startswith('+')):
    if key:
        results.append(list(group))
An explicit for loop can be clearer to read; a list comprehension is more compact and is often somewhat faster. Here is the comprehension again, annotated:
[
    list(group)  #The item you want to be in the results list for the current iteration of the loop
    for key, group in it.groupby(data, lambda s: s.startswith('+'))  #A for loop
    if key  #Only include the item for the current loop iteration in the results list if key is True
]
I would suggest doing things step by step.
1) Grab every word from the array separately.
2) Grab the first letter of the word.
3) Look if that is a '+' or '.'
Example code:
import re
class Dark():
    def __init__(self):
        # Array
        x = ['+Hello', '.World', '+Hobbits', '+Dwarves', '.Orcs']
        xPlus = []
        xDot = []
        # Values
        i = 0
        # Look through every word in the array one by one.
        while (i != len(x)):
            # Grab every word (s), and convert to string (y).
            s = x[i:i+1]
            y = '\n'.join(s)
            # Print word
            print(y)
            # Grab the first letter.
            letter = y[:1]
            if (letter == '+'):
                xPlus.append(y)
            elif (letter == '.'):
                xDot.append(y)
            else:
                pass
            # Add +1
            i = i + 1
        # Print lists
        print(xPlus)
        print(xDot)
#Run class
Dark()
I'm somewhat of a Python/programming newbie.
I have written code that does what I need it to:
import re
syns = ['professionals|experts|specialists|pros', 'repayed|payed back', 'ridiculous|absurd|preposterous', 'salient|prominent|significant' ]
new_syns = ['repayed|payed back', 'ridiculous|crazy|stupid', 'salient|prominent|significant', 'winter-time|winter|winter season', 'professionals|pros']
def pipe1(syn):
    # Find first word/phrase in list element up to and including the 1st pipe
    r = r'.*?\|'
    m = re.match(r, syn)
    m = m.group()
    return m
def find_non_match():
    # Compare 'new_syns' with 'syns' and create new list from non-matches in 'new_syns'
    p = '##&' # Place holder created
    joined = p.join(syns)
    joined = p + joined # Adds place holder to beginning of string too
    non_match = []
    for syn in new_syns:
        m = pipe1(syn)
        m = p + m
        if m not in joined:
            non_match.append(syn)
    return non_match
print(find_non_match())
Printed output:
['winter-time|winter|winter season']
The code checks if the word/phrase up to and including the first pipe for each element in new_syns is a match for the same partial match in syns list. The purpose of the code is to actually find the non-matches and then append them to a new list called non_match, which it does.
However, I am wondering whether it is possible to achieve the same thing in far fewer lines using a list comprehension. I have tried, but I am not getting exactly what I want. This is what I have come up with so far:
import re
syns = ['professionals|experts|specialists|pros', 'repayed|payed back', 'ridiculous|absurd|preposterous', 'salient|prominent|significant' ]
new_syns = ['repayed|payed back', 'ridiculous|crazy|stupid', 'salient|prominent|significant', 'winter-time|winter|winter season', 'professionals|pros']
def pipe1(syn):
    # Find first word/phrase in list element up to and including the 1st pipe
    r = r'.*?\|'
    m = re.match(r, syn)
    m = '##&' + m.group() # Add unusual symbol combo to create match for beginning of element
    return m
non_match = [i for i in new_syns if pipe1(i) not in '##&'.join(syns)]
print(non_match)
Printed output:
['winter-time|winter|winter season', 'professionals|pros'] # I don't want 'professionals|pros' in the list
The caveat with the list comprehension is that when joining syns with ##&, the joined string does not start with ##&, whereas my original code above (the one without the list comprehension) adds ##& to the beginning of the joined string. The result is that 'professionals|pros' has slipped through the net. I don't know how to fix that within the list comprehension.
So my question is "Is this possible with the list comprehension?".
I think you want something like:
non_match = [i for i in new_syns if not any(any(w == s.split("|")[0]
                                                for w in i.split("|"))
                                            for s in syns)]
This doesn't use regular expressions, but it does give the result
non_match == ['winter-time|winter|winter season']
The list keeps every item from new_syns for which none (not any) of its '|'-separated words w equals the first word (split("|")[0]) of any synonym group s from syns.
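If only the first word of each entry matters, as in the original pipe1 approach, a set of first words may be simpler. This is a sketch under that assumption (first_words is a name introduced here); unlike the comprehension above, it compares only the first word of each new entry, which gives the same result on this data.

```python
syns = ['professionals|experts|specialists|pros', 'repayed|payed back',
        'ridiculous|absurd|preposterous', 'salient|prominent|significant']
new_syns = ['repayed|payed back', 'ridiculous|crazy|stupid',
            'salient|prominent|significant', 'winter-time|winter|winter season',
            'professionals|pros']

# First word of every existing synonym group, for fast membership tests.
first_words = {s.split("|", 1)[0] for s in syns}

# Keep a new entry only if its first word is not already known.
non_match = [s for s in new_syns if s.split("|", 1)[0] not in first_words]
print(non_match)
```

Comparing whole words also removes the need for the ##& placeholder, since a word can no longer match in the middle of another entry.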
I am trying to write a script that will find strings that share an overlapping region of 5 letters at the beginning or end of each string (shown in example below).
facgakfjeakfjekfzpgghi
pgghiaewkfjaekfjkjakjfkj
kjfkjaejfaefkajewf
I am trying to create a new string which concatenates all three, so the output would be:
facgakfjeakfjekfzpgghiaewkfjaekfjkjakjfkjaejfaefkajewf
Edit:
This is the input:
x = ('facgakfjeakfjekfzpgghi', 'kjfkjaejfaefkajewf', 'pgghiaewkfjaekfjkjakjfkj')
Note: the list is not ordered.
What I've written so far (it is not correct):
def findOverlap(seq)
    i = 0
    while i < len(seq):
        for x[i]:
            #check if x[0:5] == [:5] elsewhere
x = ('facgakfjeakfjekfzpgghi', 'kjfkjaejfaefkajewf', 'pgghiaewkfjaekfjkjakjfkj')
findOverlap(x)
Create a dictionary mapping the first 5 characters of each string to its tail:
strings = {s[:5]: s[5:] for s in x}
and a set of all the suffixes:
suffixes = set(s[-5:] for s in x)
Now find the string whose prefix does not match any suffix:
prefix = next(p for p in strings if p not in suffixes)
Now we can follow the chain of strings:
result = [prefix]
while prefix in strings:
    result.append(strings[prefix])
    prefix = strings[prefix][-5:]
print("".join(result))
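Put together, the steps above could look like the following function. It is a sketch, not a general solution: the name join_overlapping and the overlap parameter are introduced here, and it assumes that every string is at least twice the overlap length, that all prefixes are distinct, and that exactly one string starts the chain.

```python
def join_overlapping(strings, overlap=5):
    """Chain strings whose last `overlap` chars equal another string's first `overlap` chars."""
    # Map each prefix to the rest of its string.
    tails = {s[:overlap]: s[overlap:] for s in strings}
    suffixes = {s[-overlap:] for s in strings}
    # The chain starts at the prefix that is not the suffix of any string.
    prefix = next(p for p in tails if p not in suffixes)
    parts = [prefix]
    while prefix in tails:
        tail = tails.pop(prefix)  # pop so that a cyclic input cannot loop forever
        parts.append(tail)
        prefix = tail[-overlap:]
    return "".join(parts)

x = ('facgakfjeakfjekfzpgghi', 'kjfkjaejfaefkajewf', 'pgghiaewkfjaekfjkjakjfkj')
joined = join_overlapping(x)
print(joined)
```

Each string is looked up once, so the work grows linearly with the number of strings and the input order does not matter.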
A brute-force approach: try every permutation and return the first one in which every neighbouring pair of strings links up:
def solution(x):
    from itertools import permutations
    for perm in permutations(x):
        linked = [perm[i][:-5] for i in range(len(perm)-1)
                  if perm[i][-5:]==perm[i+1][:5]]
        if len(perm)-1==len(linked):
            return "".join(linked)+perm[-1]
    return None
x = ('facgakfjeakfjekfzpgghi', 'kjfkjaejfaefkajewf', 'pgghiaewkfjaekfjkjakjfkj')
print(solution(x))
Loop over each pair of candidates, reverse the second string and use the answer from here