I need to generate unique random lists ("codes") from a set of objects. Each object can appear at most once in each list, and each code has to have a fixed length of 5.
import random
#generate random codes
def generator(code, objects):
    for i in range(len(code)):
        x = random.choices(objects)
        code[i] = x[0]
#Check if code is unique
def isSame(code, list):
    if code not in list:
        return False
    else:
        return True
#If code is unique, append it to the codeList and increase counter by 1
codeCount = 0
def listAppend(code, list):
    if isSame(code,list) == True:
        print('This code is not unique')
    else:
        list.append(code)
        global codeCount
        codeCount += 1
if __name__ == '__main__':
    codeList = []
    desiredCount = 12
    while codeCount != desiredCount:
        code = [None]*5
        objects = ['a','b','c','d','e','f','g']
        generator(code, objects)
        listAppend(code,codeList)
    print(codeList)
This gives me random unique lists, but I couldn't work out how to make each object appear only once in each list.
e.g. ['a', 'g', 'g', 'a', 'e'] ==> 'g' and 'a' each appear twice, whereas I need every object to appear only once, like ['a','b','c','d','e'].
Can anyone think of a good way to do this? Thanks!!
EDIT: each code has to have a fixed length of 5. Also, I'm using random.choices so that I can use its weights (probability) parameter.
The way I would do it is:
from random import randrange as rr
Alphabet="abcdefghijklmnopqrstuvwxyz"
def generate(length):
    code=[]
    # keep drawing until the code is full; a fixed number of draws could come up short
    while len(code) < length:
        random_number=rr(0,len(Alphabet))
        if Alphabet[random_number] not in code:
            code.append(Alphabet[random_number])
    return code
This picks a random element from a tuple / list / string (in my case a string holding the alphabet) and checks whether the element is already in the code; if not, it is added. The length of the code is set by the parameter, which must not exceed the number of distinct elements.
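If the weights are not needed, the standard library's random.sample already draws without replacement, so the membership check can be skipped. The following is only a sketch of that idea; the function name unique_codes and its parameters are made up for illustration and are not part of the question.

```python
import random

def unique_codes(objects, length, count):
    """Return `count` distinct codes, each holding `length` non-repeating objects."""
    seen = set()
    codes = []
    while len(codes) < count:
        # random.sample draws without replacement, so no object repeats in a code
        code = tuple(random.sample(objects, length))
        if code not in seen:  # keep only codes that have not been produced before
            seen.add(code)
            codes.append(list(code))
    return codes

codes = unique_codes(list("abcdefg"), length=5, count=12)
print(codes)
```

Note that the loop assumes count is no larger than the number of possible codes; otherwise it would never finish.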
This will generate all possible selections of 3 unique elements from the source (these are combinations, so order is ignored).
import itertools
list(itertools.combinations('abcdefg',3))
[('a', 'b', 'c'),
('a', 'b', 'd'),
('a', 'b', 'e'),
('a', 'b', 'f'),
('a', 'b', 'g'),
('a', 'c', 'd'),
('a', 'c', 'e'),
('a', 'c', 'f'),
...
('d', 'f', 'g'),
('e', 'f', 'g')]
For size 5, it will be this list
list(itertools.combinations('abcdefg',5))
[('a', 'b', 'c', 'd', 'e'),
('a', 'b', 'c', 'd', 'f'),
('a', 'b', 'c', 'd', 'g'),
('a', 'b', 'c', 'e', 'f'),
('a', 'b', 'c', 'e', 'g'),
('a', 'b', 'c', 'f', 'g'),
('a', 'b', 'd', 'e', 'f'),
('a', 'b', 'd', 'e', 'g'),
('a', 'b', 'd', 'f', 'g'),
('a', 'b', 'e', 'f', 'g'),
('a', 'c', 'd', 'e', 'f'),
('a', 'c', 'd', 'e', 'g'),
('a', 'c', 'd', 'f', 'g'),
('a', 'c', 'e', 'f', 'g'),
('a', 'd', 'e', 'f', 'g'),
('b', 'c', 'd', 'e', 'f'),
('b', 'c', 'd', 'e', 'g'),
('b', 'c', 'd', 'f', 'g'),
('b', 'c', 'e', 'f', 'g'),
('b', 'd', 'e', 'f', 'g'),
('c', 'd', 'e', 'f', 'g')]
By adding just an objects.remove() line to the generator function, I got the solution I wanted.
Removing each chosen object from objects as soon as it is placed in the code rules out reuse.
#generate random codes
def generator(code, objects):
    for i in range(len(code)):
        x = random.choices(objects)
        code[i] = x[0]
        #new line
        objects.remove(x[0])
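Since the question mentions wanting the weights parameter of random.choices, here is a hedged sketch of the same remove-after-pick idea that keeps each weight paired with its object and leaves the caller's lists untouched. The function name weighted_code and the example weights are assumptions for illustration only.

```python
import random

def weighted_code(objects, weights, length):
    """Draw `length` distinct objects, honouring weights, without mutating the inputs."""
    pool = list(objects)          # work on copies so the caller's lists survive
    pool_weights = list(weights)
    code = []
    for _ in range(length):
        # pick an index so the object and its weight can be removed together
        idx = random.choices(range(len(pool)), weights=pool_weights)[0]
        code.append(pool.pop(idx))
        pool_weights.pop(idx)
    return code

objects = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
weights = [5, 1, 1, 1, 1, 1, 1]   # hypothetical weights: 'a' is favoured
code = weighted_code(objects, weights, 5)
print(code)
```

Working on copies means objects does not have to be rebuilt before every call, which the in-place remove() version relies on.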
I have those python lists :
x = [('D', 'F'), ('A', 'D'), ('B', 'G'), ('B', 'C'), ('A', 'B')]
priority_list = ['A', 'B', 'C', 'D', 'F', 'G'] # Ordered from highest to lowest priority
How can I, for each tuple in my list, keep the value with the highest priority according to priority_list? The result would be :
['D', 'A', 'B', 'B', 'A']
More examples:
x = [('B', 'D'), ('E', 'A'), ('B', 'A'), ('D', 'F'), ('E', 'C')]
priority_list = ['A', 'B', 'C', 'D', 'E', 'F']
# Result:
['B', 'A', 'A', 'D', 'C']
x = [('B', 'C'), ('F', 'E'), ('B', 'A'), ('D', 'F'), ('E', 'C')]
priority_list = ['F', 'E', 'D', 'C', 'B', 'A'] # Notice the change in priorities
# Result:
['C', 'F', 'B', 'F', 'E']
Thanks in advance, I might be over complicating this.
You can try
[sorted(i, key=priority_list.index)[0] for i in x]
though it will throw an exception if you find a value not in the priority list.
You can try:
def get_priority_val(data, priority_list):
    for single_val in priority_list:
        if single_val in data:
            return single_val
x = [('D', 'F'), ('A', 'D'), ('B', 'G'), ('B', 'C'), ('A', 'B')]
priority_list = ['A', 'B', 'C', 'D', 'F', 'G']
final_data = []
for data in x:
    final_data.append(get_priority_val(data, priority_list))
print(final_data)
Output:
['D', 'A', 'B', 'B', 'A']
you can try using list comprehension:
ans = [d[0] if priority_list.index(d[0]) < priority_list.index(d[1] ) else d[1] for d in x ]
output:
['D', 'A', 'B', 'B', 'A']
You can do it in one line using a list comprehension :
[y[0] if priority_list.index(y[0]) < priority_list.index(y[1]) else y[1] for y in x]
Output :
['D', 'A', 'B', 'B', 'A']
You can create a dict containing priority values and just use min with a custom key
>>> priority_dict = {k:i for i,k in enumerate(priority_list)}
>>> [min(t, key=priority_dict.get) for t in x]
['D', 'A', 'B', 'B', 'A']
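The dict lookup can also be made tolerant of values that are missing from the priority list, which the list.index approaches cannot handle without raising. This is a minimal sketch under that assumption; pick_by_priority and its default parameter are hypothetical names, not something from the answers above.

```python
def pick_by_priority(pairs, priority_list, default=None):
    """For each tuple, return the member that appears earliest in priority_list."""
    rank = {value: i for i, value in enumerate(priority_list)}
    unknown = len(rank)  # values missing from the list rank below everything else
    result = []
    for pair in pairs:
        best = min(pair, key=lambda v: rank.get(v, unknown))
        # if even the best candidate is unknown, fall back to the default
        result.append(best if best in rank else default)
    return result

x = [('D', 'F'), ('A', 'D'), ('B', 'G'), ('B', 'C'), ('A', 'B')]
priority_list = ['A', 'B', 'C', 'D', 'F', 'G']
print(pick_by_priority(x, priority_list))  # ['D', 'A', 'B', 'B', 'A']
```

Building the rank dict once also avoids scanning priority_list for every element, which matters only when the list is long.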
I have a list of URLs to crawl: ['http://domain1.com','http://domain1.com/page1','http://domain2.com']
Code:
prev_domain = ''
while urls:
    url = urls.pop()
    if base_url(url) == prev_domain: # base_url is a custom function that returns the domain of a URL
        urls.append(url) # is this possible?
        continue
    else:
        crawl(url)
Basically, I don't want to crawl pages of the same domain back to back. Crawling one domain continuously returns HTTP status code 429: Too Many Requests (the user has sent too many requests in a given amount of time, i.e. "rate limiting"). To work around this, I'm planning to use the logic below.
Loop through all items in the list and compare the base URL of the current element with that of the previously processed element.
If the base URLs differ, process the element; otherwise do not process it, just append it back to the same list.
Note: if all URLs left in the list are of the same domain, add a delay before processing each element.
Please share your thoughts.
Your algorithm is almost correct, but not the implementation:
>>> L = [1,2,3]
>>> L.pop()
3
>>> L.append(3)
>>> L
[1, 2, 3]
That's why your program loops forever: if the domain is the same as the previous domain, you just append then pop then append, then.... What you need is not a stack, it's a round robin:
>>> L.pop()
3
>>> L.insert(0, 3)
>>> L
[3, 1, 2]
Let's take a shuffled list of permutations of "abcd":
>>> L = [('b', 'c', 'd', 'a'), ('d', 'c', 'b', 'a'), ('a', 'c', 'd', 'b'), ('c', 'd', 'a', 'b'), ('b', 'd', 'a', 'c'), ('b', 'a', 'd', 'c'), ('b', 'c', 'a', 'd'), ('a', 'b', 'd', 'c'), ('d', 'a', 'b', 'c'), ('a', 'b', 'c', 'd'), ('d', 'c', 'a', 'b'), ('a', 'd', 'c', 'b'), ('d', 'a', 'c', 'b'), ('c', 'd', 'b', 'a'), ('d', 'b', 'c', 'a'), ('d', 'b', 'a', 'c'), ('a', 'd', 'b', 'c'), ('b', 'd', 'c', 'a'), ('c', 'b', 'd', 'a'), ('c', 'a', 'b', 'd'), ('b', 'a', 'c', 'd')]
The first letter is the domain. Here's a slightly modified version of your code:
>>> prev = None
>>> while L:
...     e = L.pop()
...     if L and e[0] == prev:
...         L.insert(0, e)
...     else:
...         print(e)
...         prev = e[0]
('b', 'a', 'c', 'd')
('c', 'a', 'b', 'd')
('b', 'd', 'c', 'a')
('a', 'd', 'b', 'c')
('d', 'b', 'a', 'c')
('c', 'd', 'b', 'a')
('d', 'a', 'c', 'b')
('a', 'd', 'c', 'b')
('d', 'c', 'a', 'b')
('a', 'b', 'c', 'd')
('d', 'a', 'b', 'c')
('a', 'b', 'd', 'c')
('b', 'c', 'a', 'd')
('c', 'd', 'a', 'b')
('a', 'c', 'd', 'b')
('d', 'c', 'b', 'a')
('b', 'c', 'd', 'a')
('c', 'b', 'd', 'a')
('d', 'b', 'c', 'a')
('b', 'a', 'd', 'c')
('b', 'd', 'a', 'c')
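The modification is the `if L and` test: without it, if the last remaining element has the same domain as prev, you'll loop forever on a one-element list: pop, same as prev, insert, pop, ... (as with pop/append). Note that this guard only covers a single leftover element; if several remaining elements all share prev's domain, the loop still never ends.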
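The same round-robin idea can be sketched on real URLs with collections.deque, which pushes to the front in constant time. The base_url helper here is only a stand-in for the question's custom function (it is assumed to return the host via urllib.parse.urlparse), and crawl_order and the skipped counter are illustrative names. The counter lets the loop finish even when every remaining URL shares the previous domain.

```python
from collections import deque
from urllib.parse import urlparse

def base_url(url):
    """Stand-in for the question's custom helper: return the host part of a URL."""
    return urlparse(url).netloc

def crawl_order(urls):
    """Reorder urls so one domain is not visited twice in a row when that is avoidable."""
    queue = deque(urls)
    ordered = []
    prev_domain = None
    skipped = 0  # consecutive urls pushed back because they matched prev_domain
    while queue:
        url = queue.pop()
        # defer only while some url in the queue has not been checked yet
        if base_url(url) == prev_domain and skipped < len(queue):
            queue.appendleft(url)   # O(1), unlike list.insert(0, ...)
            skipped += 1
            continue
        ordered.append(url)
        prev_domain = base_url(url)
        skipped = 0
    return ordered

urls = ['http://domain1.com', 'http://domain1.com/page1', 'http://domain2.com']
print(crawl_order(urls))
```

When a URL is emitted even though it matches the previous domain, that is the point where a delay could be added before crawling it, as the question's note suggests.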
Here's another option: create a dict domain -> list of urls:
>>> d = {}
>>> for e in L:
... d.setdefault(e[0], []).append(e)
>>> d
{'b': [('b', 'c', 'd', 'a'), ('b', 'd', 'a', 'c'), ('b', 'a', 'd', 'c'), ('b', 'c', 'a', 'd'), ('b', 'd', 'c', 'a'), ('b', 'a', 'c', 'd')], 'd': [('d', 'c', 'b', 'a'), ('d', 'a', 'b', 'c'), ('d', 'c', 'a', 'b'), ('d', 'a', 'c', 'b'), ('d', 'b', 'c', 'a'), ('d', 'b', 'a', 'c')], 'a': [('a', 'c', 'd', 'b'), ('a', 'b', 'd', 'c'), ('a', 'b', 'c', 'd'), ('a', 'd', 'c', 'b'), ('a', 'd', 'b', 'c')], 'c': [('c', 'd', 'a', 'b'), ('c', 'd', 'b', 'a'), ('c', 'b', 'd', 'a'), ('c', 'a', 'b', 'd')]}
Now take one element from every domain, drop the domains whose list has become empty, and repeat until the dict is empty:
>>> while d:
...     for k, vs in d.items():
...         e = vs.pop()
...         print (e)
...     d = {k: vs for k, vs in d.items() if vs} # drop the emptied domains
...
('b', 'a', 'c', 'd')
('d', 'b', 'a', 'c')
('a', 'd', 'b', 'c')
('c', 'a', 'b', 'd')
('b', 'd', 'c', 'a')
('d', 'b', 'c', 'a')
('a', 'd', 'c', 'b')
('c', 'b', 'd', 'a')
('b', 'c', 'a', 'd')
('d', 'a', 'c', 'b')
('a', 'b', 'c', 'd')
('c', 'd', 'b', 'a')
('b', 'a', 'd', 'c')
('d', 'c', 'a', 'b')
('a', 'b', 'd', 'c')
('c', 'd', 'a', 'b')
('b', 'd', 'a', 'c')
('d', 'a', 'b', 'c')
('a', 'c', 'd', 'b')
('b', 'c', 'd', 'a')
('d', 'c', 'b', 'a')
The output is more uniform.
Check the following code snippet:
urls = ['http://domain1.com','http://domain1.com/page1','http://domain2.com']
crawl_for_urls = {}
for url in urls:
    domain = base_url(url)
    if domain not in crawl_for_urls:
        crawl_for_urls.update({domain:url})
        crawl(url)
crawl() will be called only once per unique domain.
Or you can use:
urls = ['http://domain1.com','http://domain1.com/page1','http://domain2.com']
crawl_for_urls = {}
for url in urls:
    domain = base_url(url)
    if domain not in crawl_for_urls:
        crawl_for_urls.update({domain:[url]})
        crawl(url)
    else:
        crawl_for_urls.get(domain, []).append(url)
This way you can categorize the URLs by domain and still call crawl() only once per unique domain.
I'm trying to implement an algorithm to find all stable marriage solutions with a brute force approach without the Gale-Shapley algorithm (because it gives us only 2 of them).
I'm using the checking mechanism found on Rosetta Code, but I'm having a hard time finding a way to create all possible matchings with no repetitions (the kind of repetitions you get with 2 nested for loops).
e.g
from these 2 lists [a,b,c] and [d,e,f] create
[(a,d),(b,e),(c,f)]
[(a,d),(b,f),(c,e)]
[(a,e),(b,f),(c,d)]
[(a,e),(b,d),(c,f)]
[(a,f),(b,d),(c,e)]
[(a,f),(b,e),(c,d)]
UPDATE1:
With all the solutions so far, I'm not able to run it when the input gets big.
I should probably do it recursively without storing long data structures, testing each result as I get it and discarding the others. I came up with this solution, but it still has problems: it gives me some repetitions and some matchings are missing. I don't know how to fix it, and sorry, my brain is melting!
boys=['a','b','c']
girls=['d','e','f']
def matches(boys, girls, dic={}):
    if len(dic)==3: #len(girls)==0 gives me more problems
        print(dic) #just for testing with few elements
        #run the stability check
    else:
        for b in boys:
            for g in girls:
                dic[b]=g
                bb=boys[:]
                bb.remove(b)
                gg=girls[:]
                gg.remove(g)
                matches(bb,gg, dic)
        dic.clear()
matches(boys,girls)
gives me this output
{'a': 'd', 'c': 'f', 'b': 'e'} <-
{'a': 'e', 'c': 'f', 'b': 'd'} <-
{'a': 'f', 'c': 'e', 'b': 'd'}
{'a': 'e', 'c': 'f', 'b': 'd'} <-
{'a': 'd', 'c': 'f', 'b': 'e'} <-
{'a': 'd', 'c': 'e', 'b': 'f'} <-
{'a': 'e', 'c': 'd', 'b': 'f'}
{'a': 'd', 'c': 'e', 'b': 'f'} <-
{'a': 'd', 'c': 'f', 'b': 'e'} <-
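A likely cause of the repeats and gaps is that the attempt loops over every boy at each level (so the same matching is reached in several orders) and clears the one shared dict, wiping partial assignments. As a sketch of the recursive, no-storage idea described in the update, the generator below fixes only the first unmatched boy at each level and builds a fresh dict per result. The name matchings is made up, and equal-length lists are assumed.

```python
def matchings(boys, girls):
    """Yield every one-to-one matching of boys to girls as a dict, one at a time."""
    if not boys:
        yield {}
        return
    first, rest = boys[0], boys[1:]
    for i, girl in enumerate(girls):
        remaining = girls[:i] + girls[i + 1:]
        # fix the first boy's partner, then match everyone else recursively
        for partial in matchings(rest, remaining):
            match = {first: girl}
            match.update(partial)
            yield match

for m in matchings(['a', 'b', 'c'], ['d', 'e', 'f']):
    print(m)  # a stability check could be run here instead of printing
```

Because each matching is yielded and then discarded, memory use stays proportional to the recursion depth rather than to the number of matchings.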
UPDATE 2
My complete working exercise inspired by #Zags (inspired by #Jonas):
guyprefers = {
'A': ['P','S','L','M','R','T','O','N'],
'B': ['M','N','S','P','O','L','T','R'],
'D': ['T','P','L','O','R','M','N','S'],
'E': ['N','M','S','O','L','R','T','P'],
'F': ['S','M','P','L','N','R','T','O'],
'G': ['L','R','S','P','T','O','M','N'],
'J': ['M','P','S','R','N','O','T','L'],
'K': ['N','T','O','P','S','M','R','L']
}
galprefers = {
'L': ['F','D','J','G','A','B','K','E'],
'M': ['K','G','D','F','J','B','A','E'],
'N': ['A','F','G','B','E','K','J','D'],
'O': ['K','J','D','B','E','A','F','G'],
'P': ['G','E','J','D','K','A','B','F'],
'R': ['B','K','F','D','E','G','J','A'],
'S': ['J','F','B','A','K','G','E','D'],
'T': ['J','E','A','F','B','D','G','K']
}
guys = sorted(guyprefers.keys())
gals = sorted(galprefers.keys())
def permutations(iterable): #from itertools a bit simplified
    pool = tuple(iterable) #just to understand what it is doing
    n = len(pool)
    indices = list(range(n)) #list() so the item assignments below also work on Python 3
    cycles = list(range(n, 0, -1))
    yield tuple(pool) #the identity ordering, which the loop below never produces
    while n:
        for i in reversed(range(n)):
            cycles[i] -= 1
            if cycles[i] == 0:
                indices[i:] = indices[i+1:] + indices[i:i+1]
                cycles[i] = n - i
            else:
                j = cycles[i]
                indices[i], indices[-j] = indices[-j], indices[i]
                yield tuple(pool[i] for i in indices[:n])
                break
        else:
            return
def check(engaged): #thanks to rosettacode
    inversengaged = dict((v,k) for k,v in engaged.items())
    for she, he in engaged.items():
        shelikes = galprefers[she]
        shelikesbetter = shelikes[:shelikes.index(he)]
        helikes = guyprefers[he]
        helikesbetter = helikes[:helikes.index(she)]
        for guy in shelikesbetter:
            guysgirl = inversengaged[guy]
            guylikes = guyprefers[guy]
            if guylikes.index(guysgirl) > guylikes.index(she):
                return False
        for gal in helikesbetter:
            girlsguy = engaged[gal]
            gallikes = galprefers[gal]
            if gallikes.index(girlsguy) > gallikes.index(he):
                return False
    return True
match_to_check={}
for i in permutations(guys):
    couples = sorted(zip(i, gals))
    for couple in couples:
        match_to_check[couple[1]]=couple[0]
    if check(match_to_check):
        print(match_to_check)
    match_to_check.clear()
with the correct output:
{'M': 'F', 'L': 'D', 'O': 'K', 'N': 'A', 'P': 'G', 'S': 'J', 'R': 'B', 'T': 'E'}
{'M': 'F', 'L': 'D', 'O': 'K', 'N': 'B', 'P': 'G', 'S': 'J', 'R': 'E', 'T': 'A'}
{'M': 'J', 'L': 'D', 'O': 'K', 'N': 'A', 'P': 'G', 'S': 'F', 'R': 'B', 'T': 'E'}
{'M': 'J', 'L': 'D', 'O': 'K', 'N': 'B', 'P': 'G', 'S': 'F', 'R': 'E', 'T': 'A'}
{'M': 'D', 'L': 'F', 'O': 'K', 'N': 'A', 'P': 'G', 'S': 'J', 'R': 'B', 'T': 'E'}
{'M': 'J', 'L': 'G', 'O': 'K', 'N': 'A', 'P': 'D', 'S': 'F', 'R': 'B', 'T': 'E'}
{'M': 'J', 'L': 'G', 'O': 'K', 'N': 'B', 'P': 'A', 'S': 'F', 'R': 'E', 'T': 'D'}
{'M': 'J', 'L': 'G', 'O': 'K', 'N': 'B', 'P': 'D', 'S': 'F', 'R': 'E', 'T': 'A'}
Optimized answer
(Inspired by #Jonas but doesn't require Numpy):
from itertools import permutations
l1 = ["a", "b", "c"]
l2 = ["d", "e", "f"]
valid_pairings = [sorted(zip(i, l2)) for i in permutations(l1)]
valid_pairings is:
[
[('a', 'd'), ('b', 'e'), ('c', 'f')],
[('a', 'd'), ('b', 'f'), ('c', 'e')],
[('a', 'e'), ('b', 'd'), ('c', 'f')],
[('a', 'f'), ('b', 'd'), ('c', 'e')],
[('a', 'e'), ('b', 'f'), ('c', 'd')],
[('a', 'f'), ('b', 'e'), ('c', 'd')]
]
Warning: the size of the output is factorial(n), where n is the size of the smaller input. At n = 14 that is 14! ≈ 8.7 × 10^10 pairings, which needs hundreds of gigabytes of memory or more to store, more than most modern systems have.
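One way to sidestep the memory cost is to generate the pairings lazily and keep only those that pass a test. The sketch below assumes both lists have the same length; iter_pairings is an illustrative name, and is_acceptable is a hypothetical placeholder standing in for a real test such as a stability check.

```python
from itertools import permutations

def iter_pairings(l1, l2):
    """Lazily yield each pairing of l1 with a permutation of l2 (equal lengths assumed)."""
    for perm in permutations(l2):
        yield list(zip(l1, perm))

def is_acceptable(pairing):
    """Hypothetical placeholder for a real test such as a stability check."""
    return pairing[0] == ('a', 'e')

kept = [p for p in iter_pairings(["a", "b", "c"], ["d", "e", "f"]) if is_acceptable(p)]
print(kept)
```

Only the pairings that pass the test are ever held in memory, although the running time still grows factorially with the input size.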
Old Answer
from itertools import product, combinations
def flatten(lst):
    return [item for sublist in lst for item in sublist]
l1 = ["a", "b", "c"]
l2 = ["d", "e", "f"]
all_pairings = combinations(product(l1, l2), min(len(l1), len(l2)))
# remove those pairings where an item appears more than once
valid_pairings = [i for i in all_pairings if len(set(flatten(i))) == len(flatten(i))]
Valid pairings is:
[
(('a', 'd'), ('b', 'e'), ('c', 'f')),
(('a', 'd'), ('b', 'f'), ('c', 'e')),
(('a', 'e'), ('b', 'd'), ('c', 'f')),
(('a', 'e'), ('b', 'f'), ('c', 'd')),
(('a', 'f'), ('b', 'd'), ('c', 'e')),
(('a', 'f'), ('b', 'e'), ('c', 'd'))
]
This is a bit of a brute-force approach (don't use it for long lists): sample the population enough times to be reasonably sure you have hit every possible combination (coverage is probabilistic, not guaranteed), make a set of the samples, and sort the result.
from random import sample
x = ["a","b","c"]
y = ['d','e','f']
z = {tuple(sample(y,3)) for i in range(25)}
result = sorted([list(zip(x,z_)) for z_ in z])
>>>result
[[('a', 'd'), ('b', 'e'), ('c', 'f')],
[('a', 'd'), ('b', 'f'), ('c', 'e')],
[('a', 'e'), ('b', 'd'), ('c', 'f')],
[('a', 'e'), ('b', 'f'), ('c', 'd')],
[('a', 'f'), ('b', 'd'), ('c', 'e')],
[('a', 'f'), ('b', 'e'), ('c', 'd')]]
This is not the way to go, it's just a different approach.
Combine the "wives" with all permutations of the "husbands" and you get all combinations.
import itertools
import numpy as np
husbands = ['d', 'e', 'f']
wifes = ['a', 'b', 'c']
permutations = list(itertools.permutations(husbands))
repetition = [wifes for _ in permutations]
res = np.dstack((repetition,permutations))
print(res)
Result is:
[[['a' 'd']
['b' 'e']
['c' 'f']]
[['a' 'd']
['b' 'f']
['c' 'e']]
[['a' 'e']
['b' 'd']
['c' 'f']]
[['a' 'e']
['b' 'f']
['c' 'd']]
[['a' 'f']
['b' 'd']
['c' 'e']]
[['a' 'f']
['b' 'e']
['c' 'd']]]
If you prefer tuples:
res = res.view(dtype=np.dtype([('x', np.dtype('U1')), ('y', np.dtype('U1'))]))
res = res.reshape(res.shape[:-1])
print(res)
Result:
[[('a', 'd') ('b', 'e') ('c', 'f')]
[('a', 'd') ('b', 'f') ('c', 'e')]
[('a', 'e') ('b', 'd') ('c', 'f')]
[('a', 'e') ('b', 'f') ('c', 'd')]
[('a', 'f') ('b', 'd') ('c', 'e')]
[('a', 'f') ('b', 'e') ('c', 'd')]]
In python, I have a dictionary called
d = {('A', 'A', 'A'):1, ('A', 'A', 'B'):1, ('A', 'A', 'C'):1, ('A', 'B', 'A'): 2, ('A', 'B','C'):2, ...}.
Is there a simple way to change the values (to 10, for example) for every key of the form ('A', 'A', _), where _ can be any character from A to Z?
So, it will look like {('A', 'A', 'A'):10, ('A', 'A', 'B'):10, ('A', 'A', 'C'):10, ('A', 'B', 'A'): 2, ('A', 'B', 'C'):2, ...} at the end.
As for now, I'm using a loop with a variable x for ('A', 'A', x), but I'm wondering if there are such keywords in python.
Thanks for the tips.
Just check the first two elements of each tuple, the last is irrelevant unless you specifically want to make sure it is also a letter:
for k in d:
    if k[0] == "A" and k[1] == "A":
        d[k] = 10
print(d)
{('A', 'B', 'A'): 2, ('A', 'B', 'C'): 2, ('A', 'A', 'A'): 10, ('A', 'A', 'C'): 10, ('A', 'A', 'B'): 10}
If the last element must also actually be alpha then use str.isalpha:
d = {('A', 'A', '!'):1, ('A', 'A', 'B'):1, ('A', 'A', 'C'):1, ('A', 'B', 'A'): 2, ('A', 'B','C'):2}
for k in d:
    if all((k[0] == "A", k[1] == "A", k[2].isalpha())):
        d[k] = 10
print(d)
{('A', 'B', 'A'): 2, ('A', 'B', 'C'): 2, ('A', 'A', '!'): 1, ('A', 'A', 'C'): 10, ('A', 'A', 'B'): 10}
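There is no keyword where d[('A', 'A', _)]=10 will work. You could hack a functional approach using map with python2 (in Python 3, map is lazy, so you would have to consume it, e.g. with list(), for the updates to happen):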
d = {('A', 'A', 'A'):1, ('A', 'A', 'B'):1, ('A', 'A', 'C'):1, ('A', 'B', 'A'): 2, ('A', 'B','C'):2}
map(lambda k: d.__setitem__(k, 10) if ((k[0], k[1]) == ("A", "A")) else k, d)
print(d)
{('A', 'B', 'A'): 2, ('A', 'B', 'C'): 2, ('A', 'A', 'A'): 10, ('A', 'A', 'C'): 10, ('A', 'A', 'B'): 10}
Or including isalpha:
d = {('A', 'A', '!'):1, ('A', 'A', 'B'):1, ('A', 'A', 'C'):1, ('A', 'B', 'A'): 2, ('A', 'B','C'):2}
map(lambda k: d.__setitem__(k, 10) if ((k[0], k[1],k[2].isalpha()) == ("A", "A",True)) else k, d)
print(d)
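A plain loop with tuple slicing generalizes the check on the first two elements to a prefix of any length. This is only a sketch; set_by_prefix is a made-up helper name, not a built-in.

```python
def set_by_prefix(d, prefix, value):
    """Set d[key] = value for every tuple key that starts with `prefix`."""
    n = len(prefix)
    for key in d:                 # only values change, so iterating d is safe
        if key[:n] == prefix:
            d[key] = value

d = {('A', 'A', 'A'): 1, ('A', 'A', 'B'): 1, ('A', 'A', 'C'): 1,
     ('A', 'B', 'A'): 2, ('A', 'B', 'C'): 2}
set_by_prefix(d, ('A', 'A'), 10)
print(d)
```

The same call with prefix ('A',) would update every key whose first element is 'A'.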
How about something like this:
import re
for item in d.keys():
    if re.match(r"\('A', 'A', '[A-Z]'\)",str(item)):
        d[item] = 10
This is another method. The expression evaluates to a list of None values in the console, but it does update the dictionary:
[d.update({y:10}) for y in [x for x in d.keys() if re.match(r"\('A', 'A', '[A-Z]'\)",str(x))]]