Select random variables from list with limit Python - python

I need to generate unique random lists for 3 different objects. Each object can appear only once in each list, and each code has to have a fixed length of 5.
import random

#generate random codes
def generator(code, objects):
    for i in range(len(code)):
        x = random.choices(objects)
        code[i] = x[0]

#Check if code is unique
def isSame(code, list):
    if code not in list:
        return False
    else:
        return True

#If code is unique, append it to the codeList and increase counter by 1
codeCount = 0
def listAppend(code, list):
    if isSame(code,list) == True:
        print('This code is not unique')
    else:
        list.append(code)
        global codeCount
        codeCount += 1

if __name__ == '__main__':
    codeList = []
    desiredCount = 12
    while codeCount != desiredCount:
        code = [None]*5
        objects = ['a','b','c','d','e','f','g']
        generator(code, objects)
        listAppend(code,codeList)
    print(codeList)
This gives me random unique lists, but I couldn't think of how to make each object appear only once in each list.
e.g. ['a', 'g', 'g', 'a', 'e'] ==> 'g' and 'a' are each repeated twice, whereas I need them to appear only once, like ['a','b','c','d','e'].
Can anyone think of a good way to do this? Thanks!!
EDIT: each code has to have a fixed length of 5. Also, I'm using random.choices so that I can use its weights (probability) parameter.

The way I would do it is:
from random import randrange as rr

Alphabet = "abcdefghijklmnopqrstuvwxyz"

def generate(length):
    code = []
    # keep drawing until the code is full; a plain for loop would leave
    # the code short whenever a duplicate letter is drawn
    while len(code) < length:
        letter = Alphabet[rr(0, len(Alphabet))]
        if letter not in code:
            code.append(letter)
    return code
This picks a random element from a tuple / list / string (in my case a string holding the alphabet) and checks whether the element is already in the code. If it is not, it is added to the code; otherwise another element is drawn. The length of the code is determined by the length parameter, which must not exceed the number of distinct elements.
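For the unweighted case, the standard library's random.sample draws distinct positions directly, so no membership check is needed. The following is only a minimal sketch; the function name generate_code and the default length of 5 are illustrative assumptions, and random.sample does not take per-item weights, so it does not cover the weighted requirement from the question's EDIT.

```python
import random

def generate_code(objects, length=5):
    # random.sample picks `length` distinct positions, so no item repeats
    # (hypothetical helper name, not from the original answers)
    return random.sample(objects, length)

objects = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
code = generate_code(objects)
print(code)  # e.g. ['c', 'a', 'g', 'e', 'b'] (members and order vary per run)
```

random.sample raises ValueError if length is larger than the number of objects, which makes an impossible request fail loudly instead of looping.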

This will generate every possible selection of 3 unique elements from the source (the order within a selection is ignored):
import itertools
list(itertools.combinations('abcdefg',3))
[('a', 'b', 'c'),
('a', 'b', 'd'),
('a', 'b', 'e'),
('a', 'b', 'f'),
('a', 'b', 'g'),
('a', 'c', 'd'),
('a', 'c', 'e'),
('a', 'c', 'f'),
...
('d', 'f', 'g'),
('e', 'f', 'g')]
For size 5, it is this list:
list(itertools.combinations('abcdefg',5))
[('a', 'b', 'c', 'd', 'e'),
('a', 'b', 'c', 'd', 'f'),
('a', 'b', 'c', 'd', 'g'),
('a', 'b', 'c', 'e', 'f'),
('a', 'b', 'c', 'e', 'g'),
('a', 'b', 'c', 'f', 'g'),
('a', 'b', 'd', 'e', 'f'),
('a', 'b', 'd', 'e', 'g'),
('a', 'b', 'd', 'f', 'g'),
('a', 'b', 'e', 'f', 'g'),
('a', 'c', 'd', 'e', 'f'),
('a', 'c', 'd', 'e', 'g'),
('a', 'c', 'd', 'f', 'g'),
('a', 'c', 'e', 'f', 'g'),
('a', 'd', 'e', 'f', 'g'),
('b', 'c', 'd', 'e', 'f'),
('b', 'c', 'd', 'e', 'g'),
('b', 'c', 'd', 'f', 'g'),
('b', 'c', 'e', 'f', 'g'),
('b', 'd', 'e', 'f', 'g'),
('c', 'd', 'e', 'f', 'g')]

By adding only an objects.remove() line to the generator function, I got the solution I wanted.
Removing each chosen object from the pool right after it is placed in the code eliminates reuse. Note that this mutates the objects list, so the list has to be rebuilt before every call (as the main loop in the question does).
#generate random codes
def generator(code, objects):
    for i in range(len(code)):
        x = random.choices(objects)
        code[i] = x[0]
        #new line
        objects.remove(x[0])
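The same remove-after-pick idea can be combined with the weights argument of random.choices while leaving the caller's list untouched. This is a sketch under assumptions: the function names, the weights values and the use of a set of tuples for the uniqueness check are all illustrative and not part of the original post.

```python
import random

def generate_code(objects, weights, length=5):
    """Draw `length` distinct items, honoring per-item weights."""
    pool = list(objects)          # copies, so the caller's lists are not mutated
    pool_weights = list(weights)
    code = []
    for _ in range(length):
        # weighted pick of a position in the remaining pool
        idx = random.choices(range(len(pool)), weights=pool_weights, k=1)[0]
        code.append(pool.pop(idx))    # remove the item so it cannot repeat
        pool_weights.pop(idx)         # keep weights aligned with the pool
    return code

def generate_unique_codes(objects, weights, count, length=5):
    """Collect `count` codes, discarding any code that was already produced."""
    seen = set()
    codes = []
    while len(codes) < count:
        code = generate_code(objects, weights, length)
        key = tuple(code)             # lists are not hashable, tuples are
        if key not in seen:
            seen.add(key)
            codes.append(code)
    return codes

objects = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
weights = [5, 1, 1, 1, 1, 1, 1]       # assumed example weights
codes = generate_unique_codes(objects, weights, count=12)
print(codes)
```

The outer loop only terminates if count does not exceed the number of distinct codes that can exist (2520 ordered codes for 7 objects and length 5).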

Related

Python replace tuples from list of tuples by highest priority list

I have those python lists :
x = [('D', 'F'), ('A', 'D'), ('B', 'G'), ('B', 'C'), ('A', 'B')]
priority_list = ['A', 'B', 'C', 'D', 'F', 'G'] # Ordered from highest to lowest priority
How can I, for each tuple in my list, keep the value with the highest priority according to priority_list? The result would be :
['D', 'A', 'B', 'B', 'A']
More examples:
x = [('B', 'D'), ('E', 'A'), ('B', 'A'), ('D', 'F'), ('E', 'C')]
priority_list = ['A', 'B', 'C', 'D', 'E', 'F']
# Result:
['B', 'A', 'A', 'D', 'C']
x = [('B', 'C'), ('F', 'E'), ('B', 'A'), ('D', 'F'), ('E', 'C')]
priority_list = ['F', 'E', 'D', 'C', 'B', 'A'] # Notice the change in priorities
# Result:
['C', 'F', 'B', 'F', 'E']
Thanks in advance, I might be over complicating this.
You can try
[sorted(i, key=priority_list.index)[0] for i in x]
though it will raise a ValueError if a tuple contains a value that is not in the priority list.
You can try:
def get_priority_val(data, priority_list):
    for single_val in priority_list:
        if single_val in data:
            return single_val

x = [('D', 'F'), ('A', 'D'), ('B', 'G'), ('B', 'C'), ('A', 'B')]
priority_list = ['A', 'B', 'C', 'D', 'F', 'G']
final_data = []
for data in x:
    final_data.append(get_priority_val(data, priority_list))
print(final_data)
Output:
['D', 'A', 'B', 'B', 'A']
You can try using a list comprehension:
ans = [d[0] if priority_list.index(d[0]) < priority_list.index(d[1]) else d[1] for d in x]
Output:
['D', 'A', 'B', 'B', 'A']
You can do it in one line using a list comprehension:
[y[0] if priority_list.index(y[0]) < priority_list.index(y[1]) else y[1] for y in x]
Output:
['D', 'A', 'B', 'B', 'A']
You can create a dict containing priority values and just use min with a custom key
>>> priority_dict = {k:i for i,k in enumerate(priority_list)}
>>> [min(t, key=priority_dict.get) for t in x]
['D', 'A', 'B', 'B', 'A']
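One thing the dict approach leaves open is a value that is missing from priority_list: priority_dict.get then returns None, and comparing None with an int fails in Python 3. A possible sketch, assuming unlisted values should rank lowest (that policy and the name pick_highest_priority are assumptions, not something the question specifies):

```python
def pick_highest_priority(pairs, priority_list):
    # lower rank means higher priority
    rank = {value: i for i, value in enumerate(priority_list)}
    fallback = len(priority_list)  # assumed rank for unlisted values (lowest)
    return [min(pair, key=lambda v: rank.get(v, fallback)) for pair in pairs]

x = [('D', 'F'), ('A', 'D'), ('B', 'G'), ('B', 'C'), ('A', 'B')]
priority_list = ['A', 'B', 'C', 'D', 'F', 'G']
print(pick_highest_priority(x, priority_list))  # ['D', 'A', 'B', 'B', 'A']
```

Building the dict once also avoids the repeated linear scans that priority_list.index performs for every element.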

How to pop item from list and push it back to list based on condition using python

I have a list of URLs for crawling: ['http://domain1.com','http://domain1.com/page1','http://domain2.com']
Code:
prev_domain = ''
while urls:
    url = urls.pop()
    if base_url(url) == prev_domain: # base_url is a custom function that returns the domain of a url
        urls.append(url) # is this possible?
        continue
    else:
        crawl(url)
Basically, I don't want to crawl pages of the same domain back to back. Crawling one domain continuously returns HTTP status code 429: Too Many Requests (the user has sent too many requests in a given amount of time, i.e. "rate limiting"). To work around this, I'm planning to use the logic below.
Loop through all items in the list and compare the base URL of the current element with the base URL of the previously processed element.
If the base URLs are different, process the element; otherwise do not process it, and just append it back to the same list.
Note: if the URLs left in the list are all of the same domain, add a delay before processing each element.
Please share your thoughts.
Your algorithm is almost correct, but not the implementation:
>>> L = [1,2,3]
>>> L.pop()
3
>>> L.append(3)
>>> L
[1, 2, 3]
That's why your program loops forever: if the domain is the same as the previous domain, you just append, then pop, then append, and so on. What you need is not a stack, it's a round robin:
>>> L.pop()
3
>>> L.insert(0, 3)
>>> L
[3, 1, 2]
Let's take a shuffled list of permutations of "abcd":
>>> L = [('b', 'c', 'd', 'a'), ('d', 'c', 'b', 'a'), ('a', 'c', 'd', 'b'), ('c', 'd', 'a', 'b'), ('b', 'd', 'a', 'c'), ('b', 'a', 'd', 'c'), ('b', 'c', 'a', 'd'), ('a', 'b', 'd', 'c'), ('d', 'a', 'b', 'c'), ('a', 'b', 'c', 'd'), ('d', 'c', 'a', 'b'), ('a', 'd', 'c', 'b'), ('d', 'a', 'c', 'b'), ('c', 'd', 'b', 'a'), ('d', 'b', 'c', 'a'), ('d', 'b', 'a', 'c'), ('a', 'd', 'b', 'c'), ('b', 'd', 'c', 'a'), ('c', 'b', 'd', 'a'), ('c', 'a', 'b', 'd'), ('b', 'a', 'c', 'd')]
The first letter is the domain. Here's a slightly modified version of your code:
>>> prev = None
>>> while L:
...     e = L.pop()
...     if L and e[0] == prev:
...         L.insert(0, e)
...     else:
...         print(e)
...         prev = e[0]
('b', 'a', 'c', 'd')
('c', 'a', 'b', 'd')
('b', 'd', 'c', 'a')
('a', 'd', 'b', 'c')
('d', 'b', 'a', 'c')
('c', 'd', 'b', 'a')
('d', 'a', 'c', 'b')
('a', 'd', 'c', 'b')
('d', 'c', 'a', 'b')
('a', 'b', 'c', 'd')
('d', 'a', 'b', 'c')
('a', 'b', 'd', 'c')
('b', 'c', 'a', 'd')
('c', 'd', 'a', 'b')
('a', 'c', 'd', 'b')
('d', 'c', 'b', 'a')
('b', 'c', 'd', 'a')
('c', 'b', 'd', 'a')
('d', 'b', 'c', 'a')
('b', 'a', 'd', 'c')
('b', 'd', 'a', 'c')
The modification is the "if L and" part: without it, if the last remaining element has the same domain as prev, you would loop forever on your one-element list: pop, same as prev, insert, pop, ... (as with pop/append).
Here's another option: create a dict domain -> list of urls:
>>> d = {}
>>> for e in L:
... d.setdefault(e[0], []).append(e)
>>> d
{'b': [('b', 'c', 'd', 'a'), ('b', 'd', 'a', 'c'), ('b', 'a', 'd', 'c'), ('b', 'c', 'a', 'd'), ('b', 'd', 'c', 'a'), ('b', 'a', 'c', 'd')], 'd': [('d', 'c', 'b', 'a'), ('d', 'a', 'b', 'c'), ('d', 'c', 'a', 'b'), ('d', 'a', 'c', 'b'), ('d', 'b', 'c', 'a'), ('d', 'b', 'a', 'c')], 'a': [('a', 'c', 'd', 'b'), ('a', 'b', 'd', 'c'), ('a', 'b', 'c', 'd'), ('a', 'd', 'c', 'b'), ('a', 'd', 'b', 'c')], 'c': [('c', 'd', 'a', 'b'), ('c', 'd', 'b', 'a'), ('c', 'b', 'd', 'a'), ('c', 'a', 'b', 'd')]}
Now, take an element of every domain and clear the dict, then loop until the dict is empty:
>>> while d:
...     for k, vs in d.items():
...         e = vs.pop()
...         print(e)
...     d = {k: vs for k, vs in d.items() if vs} # clear the dict
...
('b', 'a', 'c', 'd')
('d', 'b', 'a', 'c')
('a', 'd', 'b', 'c')
('c', 'a', 'b', 'd')
('b', 'd', 'c', 'a')
('d', 'b', 'c', 'a')
('a', 'd', 'c', 'b')
('c', 'b', 'd', 'a')
('b', 'c', 'a', 'd')
('d', 'a', 'c', 'b')
('a', 'b', 'c', 'd')
('c', 'd', 'b', 'a')
('b', 'a', 'd', 'c')
('d', 'c', 'a', 'b')
('a', 'b', 'd', 'c')
('c', 'd', 'a', 'b')
('b', 'd', 'a', 'c')
('d', 'a', 'b', 'c')
('a', 'c', 'd', 'b')
('b', 'c', 'd', 'a')
('d', 'c', 'b', 'a')
The output is more uniform.
Check the following code snippet:
urls = ['http://domain1.com','http://domain1.com/page1','http://domain2.com']
crawl_for_urls = {}
for url in urls:
    domain = base_url(url)
    if domain not in crawl_for_urls:
        crawl_for_urls.update({domain:url})
        crawl(url)
crawl() will be called only once per unique domain.
Or you can use:
urls = ['http://domain1.com','http://domain1.com/page1','http://domain2.com']
crawl_for_urls = {}
for url in urls:
    domain = base_url(url)
    if domain not in crawl_for_urls:
        crawl_for_urls.update({domain:[url]})
        crawl(url)
    else:
        crawl_for_urls.get(domain, []).append(url)
This way you can categorize the URLs by domain and still call crawl() only once per unique domain.

Reading a list of tuples from a text file with strings and numbers

I have a text file where each line represents the results from a sequence mining operation. So the first element in each tuple is a tuple of strings (letters), and the second element is the frequency (int).
How can I read these back from the text file into the original format? Format as follows, copied directly from the text file.... Can't seem to find any similar examples out there but there's got to be a way to do this easily.
(('a',), 30838057)
(('a', 'b'), 23151399)
(('a', 'b', 'c'), 13865674)
(('a', 'b', 'c', 'e'), 8979035)
(('a', 'b', 'c', 'e', 'f'), 6771982)
(('a', 'b', 'c', 'e', 'f', 'g'), 4514076)
(('a', 'b', 'c', 'e', 'f', 'g', 'h'), 2403374)
As others have commented, you can use the ast.literal_eval() function, since your data appears to be formatted the same as Python literals:
import ast
from pprint import pprint

filename = 'tuples_list.txt'
tuple_list = []
with open(filename) as inp:
    for line in inp:
        values = ast.literal_eval(line)
        tuple_list.append(values)
pprint(tuple_list)
Output:
[(('a',), 30838057),
(('a', 'b'), 23151399),
(('a', 'b', 'c'), 13865674),
(('a', 'b', 'c', 'e'), 8979035),
(('a', 'b', 'c', 'e', 'f'), 6771982),
(('a', 'b', 'c', 'e', 'f', 'g'), 4514076),
(('a', 'b', 'c', 'e', 'f', 'g', 'h'), 2403374)]
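If the file may contain blank lines (for example a trailing empty line), ast.literal_eval raises a SyntaxError on them. A small sketch that skips such lines follows; the function name read_tuples is an assumption, and io.StringIO stands in for the real file so the example is self-contained.

```python
import ast
import io

def read_tuples(lines):
    """Parse one Python literal per line, skipping blank lines."""
    parsed = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # a blank line is not a valid literal
        parsed.append(ast.literal_eval(line))
    return parsed

# io.StringIO stands in for an open text file here
sample = io.StringIO("(('a',), 30838057)\n\n(('a', 'b'), 23151399)\n")
records = read_tuples(sample)
print(records)  # [(('a',), 30838057), (('a', 'b'), 23151399)]
```

With a real file, the same function can be called as read_tuples(inp) inside the with open(filename) as inp: block.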

mapping list of numbers to dictionary keys with multiple values

I will start with an example. Let's say I have a dictionary such as:
d = {1:['A','B'],
2:['C']}
and a list:
vals = [1,2]
I want to map the values in the list (vals) to all possible combinations of the corresponding entries in the dictionary (d), so the output here should be two lists:
[['A','C'], ['B','C']]
This is basically the problem I am facing now. I thought I could do it with a for loop, but when faced with the following dictionary and list of values, I couldn't do it with a for loop or even nested loops:
d = {1:['A','B','C'] ,
2:['D','E'],
3:['F','G'],
4:['I'] }
values = [1,2,3,4]
the output here should be:
[['A', 'D', 'F', 'I'],
['A', 'D', 'G', 'I'],
['A', 'E', 'F', 'I'],
['A', 'E', 'G', 'I'],
['B', 'D', 'F', 'I'],
['B', 'D', 'G', 'I'],
['B', 'E', 'F', 'I'],
['B', 'E', 'G', 'I'],
['C', 'D', 'F', 'I'],
['C', 'D', 'G', 'I'],
['C', 'E', 'F', 'I'],
['C', 'E', 'G', 'I']]
You can use itertools.product() for this. Just make a generator expression over the keys you want to include and unpack it into product(). If you are okay with tuples, it's a nice one-liner:
import itertools
list(itertools.product(*(d[x] for x in values)))
results:
[('A', 'D', 'F', 'I'),
('A', 'D', 'G', 'I'),
('A', 'E', 'F', 'I'),
('A', 'E', 'G', 'I'),
('B', 'D', 'F', 'I'),
('B', 'D', 'G', 'I'),
('B', 'E', 'F', 'I'),
('B', 'E', 'G', 'I'),
('C', 'D', 'F', 'I'),
('C', 'D', 'G', 'I'),
('C', 'E', 'F', 'I'),
('C', 'E', 'G', 'I')]
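If lists are needed rather than tuples, as in the question's expected output, each tuple can be converted on the way out. A minimal sketch reusing the question's d and values:

```python
import itertools

d = {1: ['A', 'B', 'C'],
     2: ['D', 'E'],
     3: ['F', 'G'],
     4: ['I']}
values = [1, 2, 3, 4]

# product() yields tuples; list(t) turns each one into a list
result = [list(t) for t in itertools.product(*(d[x] for x in values))]
print(result[:2])  # [['A', 'D', 'F', 'I'], ['A', 'D', 'G', 'I']]
```

The number of results is the product of the list lengths (3 * 2 * 2 * 1 = 12 here), so it grows quickly as keys are added.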
If you simply need a working solution, use Mark Meyer's. However, if you are curious whether it can be done with for loops, the answer is yes, in the following way:
d = {1:['A','B','C'] ,2:['D','E'],3:['F','G'],4:['I']}
for k in sorted(d.keys())[:-1][::-1]:
    d[k] = [(i+j) for i in d[k] for j in d[k+1]]
out = [tuple(i) for i in d[1]]
print(out)
gives:
[('A', 'D', 'F', 'I'), ('A', 'D', 'G', 'I'), ('A', 'E', 'F', 'I'), ('A', 'E', 'G', 'I'), ('B', 'D', 'F', 'I'), ('B', 'D', 'G', 'I'), ('B', 'E', 'F', 'I'), ('B', 'E', 'G', 'I'), ('C', 'D', 'F', 'I'), ('C', 'D', 'G', 'I'), ('C', 'E', 'F', 'I'), ('C', 'E', 'G', 'I')]
Note that this solution assumes that dict d is well-formed, i.e. its keys are consecutive integers starting at 1 and all values are lists of one-letter strs. Now the explanation: the outer for loop iterates over the keys from the second greatest down to 1, in this particular case 3, 2, 1. The list comprehension makes an "every-with-every" join (like SQL CROSS JOIN) of the k-th list with the (k+1)-th list, and the result is stored under the current key k.
Finally, I retrieve d[1], which is a list of strs, and convert it to a list of tuples to match the requirements. If it is unclear how this solution works, copy the code snippet, add print(d) below d[k] = ... and observe what it prints.
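A loop-based variant that leaves d unchanged and does not depend on one-letter strings or consecutive keys might look like the sketch below. The function name cartesian_lists is an assumption; the idea is to extend every partial result by each item of the next key's list.

```python
def cartesian_lists(d, keys):
    """Build every combination by extending partial results one key at a time."""
    results = [[]]  # start with a single empty combination
    for key in keys:
        # pair every partial combination with every item stored under this key
        results = [partial + [item] for partial in results for item in d[key]]
    return results

d = {1: ['A', 'B'], 2: ['C']}
vals = [1, 2]
print(cartesian_lists(d, vals))  # [['A', 'C'], ['B', 'C']]
```

Because new lists are built at every step, the input dictionary can be reused afterwards.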

How to create network data from lists

My data look like this:
data = [['A', 'B', 'C', 'D'],
['E', 'F', 'G'],
['I', 'J']]
I would like to transform the data to the following:
data = [['A', 'B'],
['A', 'C'],
['A', 'D'],
['B', 'C'],
['B', 'D'],
['C', 'D'],
['E', 'F'],
['E', 'G'],
['F', 'G'],
['I', 'J']]
My code is not working:
for item in data:
    count = len(item)
    for i in range (0, count):
        print item[i], item[i+1]
This code needs improvement. Any suggestions?
The main thing here is to use itertools.combinations() with each item of the list. See this example below
>>> from itertools import combinations
>>> list(combinations(['A', 'B', 'C', 'D'] , 2))
[('A', 'B'), ('A', 'C'), ('A', 'D'), ('B', 'C'), ('B', 'D'), ('C', 'D')]
It's then fairly easy to combine the results into a single list using a list comprehension or chain.from_iterable():
>>> from itertools import chain
>>> data = [['A', 'B', 'C', 'D'],
...         ['E', 'F', 'G'],
...         ['I', 'J']]
>>> list(chain.from_iterable(combinations(x, 2) for x in data))
[('A', 'B'), ('A', 'C'), ('A', 'D'), ('B', 'C'), ('B', 'D'), ('C', 'D'), ('E', 'F'), ('E', 'G'), ('F', 'G'), ('I', 'J')]
As pointed out, you can use itertools.combinations together with a list comprehension to flatten the list:
>>> from itertools import combinations
>>> [x for d in data for x in combinations(d, 2)]
[('A', 'B'), ('A', 'C'), ('A', 'D'), ('B', 'C'), ('B', 'D'),
('C', 'D'), ('E', 'F'), ('E', 'G'), ('F', 'G'), ('I', 'J')]
You can use itertools.combinations:
from itertools import combinations

data = [['A', 'B', 'C', 'D'],
        ['E', 'F', 'G'],
        ['I', 'J']]
result = []
for sublist in data:
    # convert each pair from a tuple to a list before collecting it
    result.extend(map(list, combinations(sublist, 2)))
print(result)
OUTPUT
[['A', 'B'], ['A', 'C'], ['A', 'D'], ['B', 'C'], ['B', 'D'], ['C', 'D'], ['E', 'F'], ['E', 'G'], ['F', 'G'], ['I', 'J']]
You simply need 3 nested for loops:
data2 = []
for item in data:
    for i in range(0, len(item)-1):
        for j in range(i+1, len(item)):
            data2.append([item[i],item[j]])
print(data2)
Output:
[['A', 'B'], ['A', 'C'], ['A', 'D'], ['B', 'C'], ['B', 'D'], ['C', 'D'],
['E', 'F'], ['E', 'G'], ['F', 'G'], ['I', 'J']]
Here is a Python one-liner:
pairs = [[item[i], item[j]] for item in data for i in range(len(item)) for j in range(i + 1, len(item))]
