I have a string in which I want to output random ints of differing sizes using Python's str.format method.
E.g.: "{one_digit}:{two_digit}:{one_digit}"
Should yield: "3:27:9"
I'm trying:
import random
"{one_digit}:{two_digit}:{one_digit}".format(one_digit=random.randint(1,9),two_digit=random.randint(10,99))
but this always outputs the same number in both one_digit positions:
"{one_digit}:{two_digit}:{one_digit}".format(one_digit=random.randint(1,9),two_digit=random.randint(10,99))
>>>'4:22:4'
"{one_digit}:{two_digit}:{one_digit}".format(one_digit=random.randint(1,9),two_digit=random.randint(10,99))
>>>'7:48:7'
"{one_digit}:{two_digit}:{one_digit}".format(one_digit=random.randint(1,9),two_digit=random.randint(10,99))
>>>'2:28:2'
"{one_digit}:{two_digit}:{one_digit}".format(one_digit=random.randint(1,9),two_digit=random.randint(10,99))
>>>'1:12:1'
This is expected, since the arguments are evaluated only once, before formatting. I'd like each field to get its own random number, though. I tried using a lambda function:
"test{number}:{number}".format(number=lambda x: random.randint(1,10))
But that only yields
"test{number}:{number}".format(number=lambda x: random.randint(1,10))
>>>'test<function <lambda> at 0x10aa14e18>:<function <lambda> at 0x10aa14e18>'
First off: str.format is the wrong tool for the job, because it doesn't allow you to generate a different value for each replacement.
The correct solution is therefore to implement your own replacement function. We'll replace the {one_digit} and {two_digit} format specifiers with something more suitable: {1} and {2}, respectively.
format_string = "{1}:{2}:{1}"
Now we can use regex to substitute all of these markers with random numbers. Regex is handy because re.sub accepts a replacement function, which we can use to generate a new random number every time:
import random
import re

def repl(match):
    # The captured number is how many digits the random number should have.
    num_digits = int(match.group(1))
    lower_bound = 10 ** (num_digits - 1)
    upper_bound = 10 * lower_bound - 1
    random_number = random.randint(lower_bound, upper_bound)
    return str(random_number)

result = re.sub(r'{(\d+)}', repl, format_string)
print(result)  # e.g. 5:56:1
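As a rough sketch of how the replacement-function idea could be packaged for reuse: the helper name fill_random_digits and the optional rng parameter are hypothetical, not part of the answer above, and the sketch assumes every marker asks for at least one digit.

```python
import random
import re

# Matches markers such as {1} or {2}; the number is the desired digit count.
_MARKER = re.compile(r'\{(\d+)\}')

def fill_random_digits(template, rng=random):
    """Replace each {N} marker with a fresh random N-digit number (N >= 1)."""
    def repl(match):
        num_digits = int(match.group(1))
        lower_bound = 10 ** (num_digits - 1)
        upper_bound = 10 ** num_digits - 1
        return str(rng.randint(lower_bound, upper_bound))
    # re.sub calls repl once per marker, so every marker gets its own number.
    return _MARKER.sub(repl, template)

print(fill_random_digits("{1}:{2}:{1}"))
```

Passing a seeded random.Random instance as rng makes the output reproducible, which can be handy in tests.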
How about this?
import random
r = [1,2,3,4,5]
','.join(map(str,(random.randint(-10**i,10**i) for i in r)))
The two arguments to randint (-10**i and 10**i) are the lower and upper bounds (both inclusive) of each number, and the list r determines how many numbers are generated.
Example output: '-8,45,-328,7634,51218'
Explanation:
It seems you are looking to join random numbers with ,. This can simply be done using ','.join([array with strings]), e.g. ','.join(['1','2']) which would return '1,2'.
What about This?
'%s:%s:%s' % (random.randint(1,9),random.randint(10,99),random.randint(1,9))
EDIT: meeting the requirements.
import random
a = [1, 2, 2, 1, 3, 4, 5, 9, 0]  # our definition of the pattern (powers of ten)
b = ''
for j in a:
    # j is the exponent, so the number drawn has j + 1 digits
    x = random.randint(10**j, 10**(j+1) - 1)
    b = b + '%s:' % x
print(b)
sample:
print (b)
31:107:715:76:2602:99021:357311:7593756971:1:
How do I generate a string of size N, made of numbers and uppercase English letters such as:
6U1S75
4Z4UKK
U911K4
Answer in one line:
''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(N))
or even shorter starting with Python 3.6 using random.choices():
''.join(random.choices(string.ascii_uppercase + string.digits, k=N))
A cryptographically more secure version: see this post
''.join(random.SystemRandom().choice(string.ascii_uppercase + string.digits) for _ in range(N))
In details, with a clean function for further reuse:
>>> import string
>>> import random
>>> def id_generator(size=6, chars=string.ascii_uppercase + string.digits):
...     return ''.join(random.choice(chars) for _ in range(size))
...
>>> id_generator()
'G5G74W'
>>> id_generator(3, "6793YUIO")
'Y3U'
How does it work?
We import string, a module that contains sequences of common ASCII characters, and random, a module that deals with random generation.
string.ascii_uppercase + string.digits just concatenates the strings of characters representing uppercase ASCII letters and digits:
>>> string.ascii_uppercase
'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> string.digits
'0123456789'
>>> string.ascii_uppercase + string.digits
'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
Then we use a list comprehension to create a list of 'n' elements:
>>> list(range(4)) # range yields 'n' numbers; list() shows them in Python 3
[0, 1, 2, 3]
>>> ['elem' for _ in range(4)] # we use range to create 4 times 'elem'
['elem', 'elem', 'elem', 'elem']
In the example above, we use [ ] to create the list, but we don't in the id_generator function, so Python doesn't build the list in memory; it generates the elements on the fly, one by one (this form is called a generator expression).
Instead of asking to create 'n' times the string elem, we will ask Python to create 'n' times a random character, picked from a sequence of characters:
>>> random.choice("abcde")
'a'
>>> random.choice("abcde")
'd'
>>> random.choice("abcde")
'b'
Therefore random.choice(chars) for _ in range(size) really is creating a sequence of size characters. Characters that are randomly picked from chars:
>>> [random.choice('abcde') for _ in range(3)]
['a', 'b', 'b']
>>> [random.choice('abcde') for _ in range(3)]
['e', 'b', 'e']
>>> [random.choice('abcde') for _ in range(3)]
['d', 'a', 'c']
Then we just join them with an empty string so the sequence becomes a string:
>>> ''.join(['a', 'b', 'b'])
'abb'
>>> [random.choice('abcde') for _ in range(3)]
['d', 'c', 'b']
>>> ''.join(random.choice('abcde') for _ in range(3))
'dac'
This Stack Overflow question is the current top Google result for "random string Python". The current top answer is:
''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(N))
This is an excellent method, but the PRNG in random is not cryptographically secure. I assume many people researching this question will want to generate random strings for encryption or passwords. You can do this securely by making a small change in the above code:
''.join(random.SystemRandom().choice(string.ascii_uppercase + string.digits) for _ in range(N))
Using random.SystemRandom() instead of just random uses /dev/urandom on *nix machines and CryptGenRandom() in Windows. These are cryptographically secure PRNGs. Using random.choice instead of random.SystemRandom().choice in an application that requires a secure PRNG could be potentially devastating, and given the popularity of this question, I bet that mistake has been made many times already.
If you're using Python 3.6 or above, you can use the new secrets module as mentioned in MSeifert's answer:
''.join(secrets.choice(string.ascii_uppercase + string.digits) for _ in range(N))
The module docs also discuss convenient ways to generate secure tokens and best practices.
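As an illustration, here is a minimal sketch of a reusable helper built on secrets.choice, plus one of the token helpers from the same module. The function name secure_id is made up for this example; secrets.choice and secrets.token_hex are standard library calls available from Python 3.6.

```python
import secrets
import string

def secure_id(size=6, chars=string.ascii_uppercase + string.digits):
    """Return `size` characters drawn from `chars` using the OS's secure RNG."""
    return ''.join(secrets.choice(chars) for _ in range(size))

print(secure_id())            # six uppercase letters/digits
print(secrets.token_hex(16))  # 32 hexadecimal characters from 16 random bytes
```

If the exact alphabet does not matter, the token_* helpers are usually the shorter route.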
Simply use Python's builtin uuid:
If UUIDs are okay for your purposes, use the built-in uuid package.
One Line Solution:
import uuid; uuid.uuid4().hex.upper()[0:6]
In Depth Version:
Example:
import uuid
uuid.uuid4() #uuid4 => full random uuid
# Outputs something like: UUID('0172fc9a-1dac-4414-b88d-6b9a6feb91ea')
If you need exactly your format (for example, "6U1S75"), you can do it like this:
import uuid

def my_random_string(string_length=10):
    """Returns a random string of length string_length."""
    random = str(uuid.uuid4())  # Convert UUID format to a Python string.
    random = random.upper()  # Make all characters uppercase.
    random = random.replace("-","")  # Remove the UUID '-'.
    return random[0:string_length]  # Return the random string.

print(my_random_string(6))  # For example, D9E50C
A simpler, faster, but slightly less random way is to use random.sample instead of choosing each letter separately. If up to n repetitions of a character should be allowed, enlarge the population n times, e.g.
import random
import string
char_set = string.ascii_uppercase + string.digits
print(''.join(random.sample(char_set*6, 6)))
Note:
random.sample prevents character reuse; multiplying the size of the character set makes repetitions possible, but they are still less likely than they are with independent random choices. If we go for a string of length 6 and we pick 'X' as the first character, then in the choice example the odds of getting 'X' for the second character are the same as the odds of getting 'X' as the first character. In the random.sample implementation, the odds of getting 'X' as the second character are only 36/43 (about 84%) of the odds of getting it as the first: 5 of the remaining 215 entries are 'X', versus 6 of 216 for the first pick.
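To make the difference concrete, here is a small sketch that computes the exact probabilities with fractions.Fraction. It assumes the 36-character set and the multiplier of 6 from the snippet above; the variable names are illustrative only.

```python
from fractions import Fraction
import string

char_set = string.ascii_uppercase + string.digits  # 36 characters
copies = 6                                         # multiplier passed to random.sample
pool_size = len(char_set) * copies                 # 216 entries in total

# Probability that the first pick is 'X' (same for sample and choice).
p_first = Fraction(copies, pool_size)
# random.sample: one 'X' is gone, so 5 of the remaining 215 entries are 'X'.
p_second_sample = Fraction(copies - 1, pool_size - 1)
# random.choice: every pick is independent.
p_second_choice = Fraction(1, len(char_set))

print(p_first, p_second_sample, p_second_choice)  # 1/36 1/43 1/36
print(p_second_sample / p_first)                  # 36/43
```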
import uuid
lowercase_str = uuid.uuid4().hex
lowercase_str is a random value like 'cea8b32e00934aaea8c005a35d85a5c0'
uppercase_str = lowercase_str.upper()
uppercase_str is 'CEA8B32E00934AAEA8C005A35D85A5C0'
From Python 3.6 on, you should use the secrets module instead of the random module if you need the result to be cryptographically secure (otherwise this answer is identical to the one by @Ignacio Vazquez-Abrams):
from secrets import choice
import string
''.join([choice(string.ascii_uppercase + string.digits) for _ in range(N)])
One additional note: a list-comprehension is faster in the case of str.join than using a generator expression!
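A quick way to check this claim on your own machine is a timeit comparison along these lines. This is only a sketch: the function names are made up, and the numbers will vary with the interpreter and hardware, so treat any difference as indicative rather than definitive.

```python
import random
import string
from timeit import timeit

CHARS = string.ascii_uppercase + string.digits
N = 6

def with_list():
    # str.join receives a ready-made list
    return ''.join([random.choice(CHARS) for _ in range(N)])

def with_generator():
    # str.join has to consume the generator before it can build the string
    return ''.join(random.choice(CHARS) for _ in range(N))

print('list comprehension  :', timeit(with_list, number=20000))
print('generator expression:', timeit(with_generator, number=20000))
```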
A faster, easier and more flexible way to do this is to use the strgen module (pip install StringGenerator).
Generate a 6-character random string with upper case letters and digits:
>>> from strgen import StringGenerator as SG
>>> SG("[\u\d]{6}").render()
u'YZI2CI'
Get a unique list:
>>> SG("[\l\d]{10}").render_list(5,unique=True)
[u'xqqtmi1pOk', u'zmkWdUr63O', u'PGaGcPHrX2', u'6RZiUbkk2i', u'j9eIeeWgEF']
Guarantee one "special" character in the string:
>>> SG("[\l\d]{10}&[\p]").render()
u'jaYI0bcPG*0'
A random HTML color:
>>> SG("#[\h]{6}").render()
u'#CEdFCa'
etc.
We need to be aware that this:
''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(N))
might not have a digit (or uppercase character) in it.
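For comparison, a standard-library-only sketch that guarantees at least one uppercase letter and one digit could look like the following. The function name is hypothetical, and note that forcing one character of each class makes the distribution slightly different from a uniform draw over all valid strings.

```python
import random
import string

def id_with_required_classes(size=6):
    """Random uppercase/digit string with at least one letter and one digit."""
    if size < 2:
        raise ValueError("size must be at least 2")
    pool = string.ascii_uppercase + string.digits
    # One guaranteed character from each required class...
    chars = [random.choice(string.ascii_uppercase), random.choice(string.digits)]
    # ...then fill the rest from the full pool and shuffle the positions.
    chars += [random.choice(pool) for _ in range(size - 2)]
    random.shuffle(chars)
    return ''.join(chars)

print(id_with_required_classes())
```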
strgen is faster in developer-time than any of the above solutions. The solution from Ignacio is the fastest run-time performing and is the right answer using the Python Standard Library. But you will hardly ever use it in that form. You will want to use SystemRandom (or fallback if not available), make sure required character sets are represented, use unicode (or not), make sure successive invocations produce a unique string, use a subset of one of the string module character classes, etc. This all requires lots more code than in the answers provided. The various attempts to generalize a solution all have limitations that strgen solves with greater brevity and expressive power using a simple template language.
It's on PyPI:
pip install StringGenerator
Disclosure: I'm the author of the strgen module.
Based on another Stack Overflow answer, Most lightweight way to create a random string and a random hexadecimal number, a better version than the accepted answer would be:
('%06x' % random.randrange(16**6)).upper()
much faster.
I thought no one had answered this yet lol! But hey, here's my own go at it:
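import random
from functools import reduce  # a builtin on Python 2, lives in functools on Python 3

def random_alphanumeric(limit):
    # ASCII codes of all alphanumeric characters: 0-9, A-Z, a-z
    r = list(range(48, 58)) + list(range(65, 91)) + list(range(97, 123))
    random.shuffle(r)
    # note: limit is not used; the length is random, between 0 and len(r)
    return reduce(lambda i, s: i + chr(s), r[:random.randint(0, len(r))], "")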
If you need a random string rather than a pseudo random one, you should use os.urandom as the source
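from os import urandom
from itertools import islice, repeat
import string

def rand_string(length=5):
    chars = set(string.ascii_uppercase + string.digits)
    # Read one random byte at a time and keep only bytes that map to an allowed character.
    random_chars = (urandom(n).decode('latin-1') for n in repeat(1))
    char_gen = (c for c in random_chars if c in chars)
    return ''.join(islice(char_gen, None, length))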
This method is slightly faster, and slightly more annoying, than the random.choice() method Ignacio posted.
It takes advantage of the nature of pseudo-random algorithms, and banks on bitwise and and shift being faster than generating a new random number for each character.
import random

# must be length 32 -- 5 bits -- the question didn't specify using the full set
# of uppercase letters ;)
_ALPHABET = 'ABCDEFGHJKLMNPQRSTUVWXYZ23456789'

def generate_with_randbits(size=32):
    def chop(x):
        while x:
            yield x & 31
            x = x >> 5
    return ''.join(_ALPHABET[x] for x in chop(random.getrandbits(size * 5))).ljust(size, 'A')
...create a generator that takes out 5 bit numbers at a time 0..31 until none left
...join() the results of the generator on a random number with the right bits
With Timeit, for 32-character strings, the timing was:
[('generate_with_random_choice', 28.92901611328125),
('generate_with_randbits', 20.0293550491333)]
...but for 64 character strings, randbits loses out ;)
I would probably never use this approach in production code unless I really disliked my co-workers.
edit: updated to suit the question (uppercase and digits only), and use bitwise operators & and >> instead of % and //
Use Numpy's random.choice() function
import numpy as np
import string

if __name__ == '__main__':
    length = 16
    a = np.random.choice(list(string.ascii_uppercase + string.digits), length)
    print(''.join(a))
Documentation is here http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.random.choice.html
I'd do it this way:
import random
from string import digits, ascii_uppercase

legals = digits + ascii_uppercase

def rand_string(length, char_set=legals):
    output = ''
    for _ in range(length):
        output += random.choice(char_set)
    return output

Or just:

def rand_string(length, char_set=legals):
    return ''.join(random.choice(char_set) for _ in range(length))
Sometimes 0 (zero) & O (letter O) can be confusing. So I use
import uuid
uuid.uuid4().hex[:6].upper().replace('0','X').replace('O','Y')
>>> import string
>>> import random
the following logic still generates 6 character random sample
>>> print ''.join(random.sample((string.ascii_uppercase+string.digits),6))
JT7K3Q
No need to multiply by 6
>>> print ''.join(random.sample((string.ascii_uppercase+string.digits)*6,6))
TK82HK
I used this method to generate a random string of length n (10 here) from the letters a to z:
import random
s = ''.join(random.choice([chr(i) for i in range(ord('a'), ord('z') + 1)]) for _ in range(10))
Security Oriented Approach
Our recommendation for anything security related is to avoid "rolling your own" and to use the secrets module, which is specifically vetted for security.
This is from the best practices section of the docs:
import string
import secrets
alphabet = string.ascii_letters + string.digits
password = ''.join(secrets.choice(alphabet) for i in range(8))
Since you specifically asked for uppercase letters, you can either substitute ascii_uppercase for ascii_letters, or just uppercase the password with:
password = password.upper()
Standard Approach Not Aiming for Security
The canonical approach to this problem (as specified) uses the choices() function in the random module:
>>> from random import choices
>>> from string import ascii_uppercase, digits
>>> population = ascii_uppercase + digits
>>> str.join('', choices(population, k=6))
'6JWF1H'
>>> import random
>>> str = []
>>> chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890'
>>> num = int(raw_input('How long do you want the string to be? '))
How long do you want the string to be? 10
>>> for k in range(1, num+1):
...     str.append(random.choice(chars))
...
>>> str = "".join(str)
>>> str
'tm2JUQ04CK'
The random.choice function picks a random entry in a list. You also create a list so that you can append the character in the for statement. At the end str is ['t', 'm', '2', 'J', 'U', 'Q', '0', '4', 'C', 'K'], but the str = "".join(str) takes care of that, leaving you with 'tm2JUQ04CK'.
Hope this helps!
For those of you who enjoy functional python:
# Python 2 only: imap, string.letters and string.join do not exist in Python 3
from itertools import imap, starmap, islice, repeat
from functools import partial
from string import letters, digits, join
from random import choice

join_chars = partial(join, sep='')
identity = lambda o: o

def irand_seqs(symbols=join_chars((letters, digits)), length=6, join=join_chars, select=choice, breakup=islice):
    """ Generates an indefinite sequence of joined random symbols each of a specific length
    :param symbols: symbols to select,
        [defaults to string.letters + string.digits, digits 0 - 9, lower and upper case English letters.]
    :param length: the length of each sequence,
        [defaults to 6]
    :param join: method used to join selected symbol,
        [defaults to ''.join generating a string.]
    :param select: method used to select a random element from the given population.
        [defaults to random.choice, which selects a single element randomly]
    :return: indefinite iterator generating random sequences of the given [:param length]
    >>> from tools import irand_seqs
    >>> strings = irand_seqs()
    >>> a = next(strings)
    >>> assert isinstance(a, (str, unicode))
    >>> assert len(a) == 6
    >>> assert next(strings) != next(strings)
    """
    return imap(join, starmap(breakup, repeat((imap(select, repeat(symbols)), None, length))))
It generates an indefinite (infinite) iterator of joined random sequences by first generating an indefinite sequence of randomly selected symbols from the given pool, then breaking this sequence into parts of the given length, which are then joined. It should work with any sequence that supports __getitem__. By default it simply generates random sequences of alphanumeric characters, though you can easily modify it to generate other things:
for example to generate random tuples of digits:
>>> irand_tuples = irand_seqs(xrange(10), join=tuple)
>>> next(irand_tuples)
(0, 5, 5, 7, 2, 8)
>>> next(irand_tuples)
(3, 2, 2, 0, 3, 1)
if you don't want to use next for generation you can simply make it callable:
>>> irand_tuples = irand_seqs(xrange(10), join=tuple)
>>> make_rand_tuples = partial(next, irand_tuples)
>>> make_rand_tuples()
(1, 6, 2, 8, 1, 9)
if you want to generate the sequence on the fly simply set join to identity.
>>> irand_tuples = irand_seqs(xrange(10), join=identity)
>>> selections = next(irand_tuples)
>>> next(selections)
8
>>> list(selections)
[6, 3, 8, 2, 2]
As others have mentioned if you need more security then set the appropriate select function:
>>> from random import SystemRandom
>>> rand_strs = irand_seqs(select=SystemRandom().choice)
>>> next(rand_strs)
'QsaDxQ'
the default selector is choice which may select the same symbol multiple times for each chunk, if instead you'd want the same member selected at most once for each chunk then, one possible usage:
>>> from random import sample
>>> irand_samples = irand_seqs(xrange(10), length=1, join=next, select=lambda pool: sample(pool, 6))
>>> next(irand_samples)
[0, 9, 2, 3, 1, 6]
we use sample as our selector, to do the complete selection, so the chunks are actually length 1, and to join we simply call next which fetches the next completely generated chunk, granted this example seems a bit cumbersome and it is ...
(1) This will give you all caps and numbers:
import string, random
passkey = ''
for x in range(8):
    if random.choice([1, 2]) == 1:
        passkey += random.choice(string.ascii_uppercase)
    else:
        passkey += random.choice(string.digits)
print(passkey)
(2) If you later want to include lowercase letters in your key, then this will also work:
import string, random
passkey = ''
for x in range(8):
    if random.choice([1, 2]) == 1:
        passkey += random.choice(string.ascii_letters)
    else:
        passkey += random.choice(string.digits)
print(passkey)
This is a take on Anurag Uniyal's response and something that I was working on myself.
import random
import string

oneFile = open('Numbers.txt', 'w')
userInput = 0
key_count = 0
value_count = 0
chars = string.ascii_uppercase + string.digits + string.punctuation
for userInput in range(int(input('How many 12 digit keys do you want?'))):
    while key_count <= userInput:
        key_count += 1
        number = random.randint(1, 999)
        key = number
        text = str(key) + ": " + str(''.join(random.sample(chars*6, 12)))
        oneFile.write(text + "\n")
oneFile.close()
import string
from random import choice, randint
characters = string.ascii_letters + string.punctuation + string.digits
password = "".join(choice(characters) for x in range(randint(8, 16)))
print(password)
import random
q = 2
o = 1
# a-z and 0-9; remove any characters you do not need
chars = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','0','1','2','3','4','5','6','7','8','9']
while q > o:  # always true, so this keeps printing lines until interrupted
    print("")
    for i in range(1, 128):
        x = random.choice(chars)
        print(x, end="")
Here the length of the string can be changed in the for loop, i.e. for i in range(1, length + 1).
It is a simple algorithm which is easy to understand. It uses a list, so you can discard characters that you do not need.
I was looking at the different answers and took time to read the documentation of secrets
The secrets module is used for generating cryptographically strong random numbers suitable for managing data such as passwords, account authentication, security tokens, and related secrets.
In particular, secrets should be used in preference to the default pseudo-random number generator in the random module, which is designed for modelling and simulation, not security or cryptography.
Looking more into what it has to offer I found a very handy function if you want to mimic an ID like Google Drive IDs:
secrets.token_urlsafe([nbytes=None])
Return a random URL-safe text string, containing nbytes random bytes. The text is Base64 encoded, so on average each byte results in approximately 1.3 characters. If nbytes is None or not supplied, a reasonable default is used.
Use it the following way:
import secrets
import math

def id_generator():
    id = secrets.token_urlsafe(math.floor(32 / 1.3))
    return id

print(id_generator())
This outputs an id that is 32 characters long:
joXR8dYbBDAHpVs5ci6iD-oIgPhkeQFk
I know this is slightly different from the OP's question but I expect that it would still be helpful to many who were looking for the same use-case that I was looking for.
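If you need an exact length other than 32, one possible sketch is to request enough bytes and trim the result. The helper name urlsafe_id is made up for this example; it relies only on the fact that Base64 encodes every 3 bytes as 4 characters.

```python
import math
import secrets

def urlsafe_id(length=32):
    """Return a URL-safe random string of exactly `length` characters."""
    # Base64 turns 3 bytes into 4 characters, so this many bytes is always enough.
    nbytes = math.ceil(length * 3 / 4)
    # Trim in case the encoded token is slightly longer than requested.
    return secrets.token_urlsafe(nbytes)[:length]

print(urlsafe_id())    # 32 characters
print(urlsafe_id(10))  # 10 characters
```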
A simple one:
import string
import random
character = string.ascii_lowercase + string.ascii_uppercase + string.digits + string.punctuation
char_len = len(character)
# you can specify your password length here
pass_len = random.randint(10, 20)
password = ''
for x in range(pass_len):
    password = password + character[random.randint(0, char_len-1)]
print(password)
I would like to suggest the following option:
import crypt
n = 10
crypt.crypt("any string").replace('/', '').replace('.', '').upper()[-n:]
Paranoid mode:
import uuid
import crypt
n = 10
crypt.crypt(str(uuid.uuid4())).replace('/', '').replace('.', '').upper()[-n:]
Two methods :
import random, math

def randStr_1(chars:str, length:int) -> str:
    chars *= math.ceil(length / len(chars))
    chars = chars[0:length]
    chars = list(chars)
    random.shuffle(chars)
    return ''.join(chars)

def randStr_2(chars:str, length:int) -> str:
    return ''.join(random.choice(chars) for i in range(length))
Benchmark :
from timeit import timeit

setup = """
import os, subprocess, time, string, random, math

def randStr_1(letters:str, length:int) -> str:
    letters *= math.ceil(length / len(letters))
    letters = letters[0:length]
    letters = list(letters)
    random.shuffle(letters)
    return ''.join(letters)

def randStr_2(letters:str, length:int) -> str:
    return ''.join(random.choice(letters) for i in range(length))
"""

print('Method 1 vs Method 2', ', run 10 times each.')

for length in [100,1000,10000,50000,100000,500000,1000000]:
    print(length, 'characters:')
    eff1 = timeit("randStr_1(string.ascii_letters, {})".format(length), setup=setup, number=10)
    eff2 = timeit("randStr_2(string.ascii_letters, {})".format(length), setup=setup, number=10)
    print('\t{}s : {}s'.format(round(eff1, 6), round(eff2, 6)))
    print('\tratio = {} : {}\n'.format(eff1/eff1, round(eff2/eff1, 2)))
Output :
Method 1 vs Method 2 , run 10 times each.
100 characters:
0.001411s : 0.00179s
ratio = 1.0 : 1.27
1000 characters:
0.013857s : 0.017603s
ratio = 1.0 : 1.27
10000 characters:
0.13426s : 0.151169s
ratio = 1.0 : 1.13
50000 characters:
0.709403s : 0.855136s
ratio = 1.0 : 1.21
100000 characters:
1.360735s : 1.674584s
ratio = 1.0 : 1.23
500000 characters:
6.754923s : 7.160508s
ratio = 1.0 : 1.06
1000000 characters:
11.232965s : 14.223914s
ratio = 1.0 : 1.27
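The performance of the first method is better. Note, however, that the first method only shuffles a repeated copy of the character set, so the number of times each character appears is fixed rather than random.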
Generate a random 16-byte ID containing letters, digits, '_' and '-':
os.urandom(16).translate((f'{string.ascii_letters}{string.digits}-_'*4).encode('ascii'))
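To spell out why this one-liner works, here is a sketch with the translation table built explicitly. The names ALPHABET, TABLE and random_id are made up for illustration. bytes.translate needs a 256-entry table, and 64 symbols repeated 4 times gives exactly that, with every symbol covering the same number of byte values.

```python
import os
import string

# 64 symbols repeated 4 times gives a 256-entry table: one output symbol for
# every possible input byte, with each symbol covering exactly 4 byte values.
ALPHABET = (string.ascii_letters + string.digits + '-_').encode('ascii')
TABLE = ALPHABET * 4

def random_id(nbytes=16):
    """Map nbytes of OS randomness onto the 64-symbol alphabet."""
    return os.urandom(nbytes).translate(TABLE).decode('ascii')

print(random_id())
```

With an alphabet whose size does not divide 256 evenly, the same trick would either not fill the table or favour some symbols.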
import string, random

lower = string.ascii_lowercase
upper = string.ascii_uppercase
digits = string.digits
special = '!"£$%^&*.,##/?'

def rand_pass(l=4, u=4, d=4, s=4):
    p = []
    [p.append(random.choice(lower)) for x in range(l)]
    [p.append(random.choice(upper)) for x in range(u)]
    [p.append(random.choice(digits)) for x in range(d)]
    [p.append(random.choice(special)) for x in range(s)]
    random.shuffle(p)
    return "".join(p)

print(rand_pass())
# #5U,#A4yIZvnp%51
If you want an easy-to-use but highly customisable key generator, use key-generator pypi package.
Here is the GitHub repo where you can find the complete documentation.
You can customise it to give a string just like you want, with many more options. Here's an example:
from key_generator.key_generator import generate
custom_key = generate(2, ['-', ':'], 3, 10, type_of_value = 'char', capital = 'mix', seed = 17).get_key()
print(custom_key) # ZLFdHXIUe-ekwJCu
Hope this helps :)
Disclaimer: This uses the key-generator library which I made.
For example,
The function could be something like def RandABCD(n, .25, .34, .25, .25):
Where n is the length of the string to be generated and the following numbers are the desired probabilities of A, B, C, D.
I would imagine this is quite simple; however, I am having trouble creating a working program. Any help would be greatly appreciated.
Here's the code to select a single weighted value. You should be able to take it from here. It uses bisect and random to accomplish the work.
from bisect import bisect
from random import random

def WeightedABCD(*weights):
    chars = 'ABCD'
    breakpoints = [sum(weights[:x+1]) for x in range(4)]
    return chars[bisect(breakpoints, random())]
Call it like this: WeightedABCD(.25, .34, .25, .25).
EDIT: Here is a version that works even if the weights don't add up to 1.0:
from bisect import bisect_left
from random import uniform

def WeightedABCD(*weights):
    chars = 'ABCD'
    breakpoints = [sum(weights[:x+1]) for x in range(4)]
    return chars[bisect_left(breakpoints, uniform(0.0,breakpoints[-1]))]
The random module is quite powerful in Python. You can generate a list with the desired characters at the appropriate weights and then use random.choice to obtain a selection.
First, make sure you do an import random.
For example, let's say you wanted a truly random string from A,B,C, or D.
1. Generate a list with the characters
li = ['A','B','C','D']
Then obtain values from it using random.choice
output = "".join([random.choice(li) for i in range(0, n)])
You could easily make that a function with n as a parameter.
In the above case, you have an equal chance of getting A,B,C, or D.
You can use duplicate entries in the list to give characters higher probabilities. So, for example, let's say you wanted a 50% chance of A and 25% chances of B and C respectively. You could have an array like this:
li = ['A','A','B','C']
And so on.
It would not be hard to parameterize the characters coming in with desired weights, to model that I'd use a dictionary.
characterbasis = {'A':25, 'B':25, 'C':25, 'D':25}
Make that the first parameter, and the second being the length of the string and use the above code to generate your string.
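A minimal sketch of that parameterized version, assuming Python 3.6 or later, can lean on random.choices, which accepts weights directly, so the list with duplicate entries is no longer needed. The function name weighted_string is hypothetical.

```python
import random

def weighted_string(character_basis, n):
    """Build a string of n characters drawn independently with the given weights."""
    letters = list(character_basis)
    weights = [character_basis[c] for c in letters]
    # random.choices draws k items with replacement, honouring the weights.
    return ''.join(random.choices(letters, weights=weights, k=n))

characterbasis = {'A': 25, 'B': 25, 'C': 25, 'D': 25}
print(weighted_string(characterbasis, 20))
```

The weights only need to be proportional; they do not have to add up to 100 or to 1.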
For four letters, here's something quick off the top of my head:
from random import random

def randABCD(n, pA, pB, pC, pD):
    # assumes pA + pB + pC + pD == 1
    cA = pA
    cB = cA + pB
    cC = cB + pC
    def choose():
        r = random()
        if r < cA:
            return 'A'
        elif r < cB:
            return 'B'
        elif r < cC:
            return 'C'
        else:
            return 'D'
    return ''.join([choose() for i in range(n)])
I have no doubt that this can be made much cleaner/shorter, I'm just in a bit of a hurry right now.
The reason I wouldn't be content with David in Dakota's answer of using a list of duplicate characters is that depending on your probabilities, it may not be possible to create a list with duplicates in the right numbers to simulate the probabilities you want. (Well, I guess it might always be possible, but you might wind up needing a huge list - what if your probabilities were 0.11235442079, 0.4072777384, 0.2297927874, 0.25057505341?)
EDIT: here's a much cleaner generic version that works with any number of letters with any weights:
from bisect import bisect
from random import uniform

def rand_string(n, content):
    ''' Creates a string of letters (or substrings) chosen independently
    with specified probabilities. content is a dictionary mapping
    a substring to its "weight" which is proportional to its probability,
    and n is the desired number of elements in the string.
    This does not assume the sum of the weights is 1.'''
    l, cdf = zip(*[(l, w) for l, w in content.items()])
    cdf = list(cdf)
    # turn the weights into a cumulative distribution
    for i in range(1, len(cdf)):
        cdf[i] += cdf[i - 1]
    return ''.join([l[bisect(cdf, uniform(0, cdf[-1]))] for i in range(n)])
Here is a rough idea of what might suit you
import random

def distributed_choice(probs):
    r = random.random()
    cum = 0.0
    for pair in probs:
        if r < cum + pair[1]:
            return pair[0]
        cum += pair[1]
    return probs[-1][0]  # guard against floating-point round-off
The parameter probs takes a list of pairs of the form (object, probability). It is assumed that the probabilities sum to 1 (otherwise, it's trivial to normalize).
To use it, for example to build a string of four independent choices, just execute:
''.join(distributed_choice(probs) for _ in range(4))
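The normalization step mentioned above could be sketched as follows. The helper name normalize is made up for this example, and it assumes probs is a list of (object, weight) pairs with non-negative weights.

```python
def normalize(probs):
    """Scale (object, weight) pairs so that the weights sum to 1."""
    total = float(sum(weight for _, weight in probs))
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return [(obj, weight / total) for obj, weight in probs]

# Weights that do not add up to 1 (they sum to 109 here).
probs = normalize([('A', 25), ('B', 34), ('C', 25), ('D', 25)])
print(probs)
```

The normalized list can then be passed to a function that expects probabilities summing to 1.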
Hmm, something like:
import random

class RandomDistribution:
    def __init__(self, kv):
        self.entries = list(kv.keys())
        # where[i] is the cumulative weight at which entry i starts
        self.where = []
        cnt = 0
        for x in self.entries:
            self.where.append(cnt)
            cnt += kv[x]
        self.where.append(cnt)

    def find(self, key):
        # binary search for the interval that contains key
        l, r = 0, len(self.where)-1
        while l+1 < r:
            m = (l+r)//2
            if self.where[m] <= key:
                l = m
            else:
                r = m
        return self.entries[l]

    def randomselect(self):
        return self.find(random.random()*self.where[-1])

rd = RandomDistribution({"foo": 5.5, "bar": 3.14, "baz": 2.8})
for x in range(1000):
    print(rd.randomselect())
should get you most of the way...
Thank you all for your help, I was able to figure something out, mostly with this info.
For my particular need, I did something like this:
import random

# Create a function to randomize a given string
def makerandom(seq):
    return ''.join(random.sample(seq, len(seq)))

def randomDNA(n, probA=0.25, probC=0.25, probG=0.25, probT=0.25):
    notrandom = ''
    A = int(n*probA)
    C = int(n*probC)
    T = int(n*probT)
    G = int(n*probG)
    # The remainder part here is used to make sure all n are used, as one cannot
    # have half an A for example.
    remainder = ''
    for i in range(0, n-(A+G+C+T)):
        remainder += random.choice("ATGC")
    notrandom = notrandom + 'A'*A + 'C'*C + 'G'*G + 'T'*T + remainder
    return makerandom(notrandom)