Check if a string contains a string except a list - python

I have a string as follows:
f = 'ATCTGTCGTYCACGT'
I want to check whether the string contains any characters except: A, C, G or T, and if so, print them.
for i in f:
    if i != 'A' and i != 'C' and i != 'G' and i != 'T':
        print(i)
Is there a way to achieve this without looping through the string?

You can use a set to get the desired output.
f = 'ATCTGTCGTYCACGTXYZ'
valid = {'A', 'C', 'G', 'T'}
unique = set(f)
print(unique - valid)
Output (sets are unordered, so the order may differ):
{'Y', 'X', 'Z'}  # characters in f that are not 'A', 'C', 'G' or 'T'

Depending on the size of your input string, the for loop might be the most efficient solution.
However, since you explicitly ask for a solution without an explicit loop, this can be done with a regex.
import re
f = 'ABCDEFG'
print(*re.findall('[^ABC]', f), sep='\n')
Outputs
D
E
F
G

Just do
l = ['A', 'C', 'G', 'T']
for i in f:
    if i not in l:
        print(i)
It checks whether each character of the string is in the list and prints the ones that are not.
If you don't want to loop through the string you can do:
import re
l = ['A', 'C', 'G', 'T']
# the negated class [^ACGT] matches any character that is NOT in the list
contains_invalid = bool(re.search("[^" + "".join(l) + "]", f))

Technically this loops, but converting the input string to a set first removes duplicate values, so each distinct character is checked only once:
accepted_values = ['a', 't', 'c', 'g']
seq = 'ATCTGTCGTYCACGT'  # renamed from input to avoid shadowing the built-in
print([i for i in set(seq.lower()) if i not in accepted_values])
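The set-based and regex-based answers above can be wrapped into small helpers. This is only a sketch: the function names invalid_chars and has_invalid_chars and the allowed parameter are made up for illustration, and the input is assumed to be upper case already.

```python
import re

def invalid_chars(seq, allowed="ACGT"):
    """Return the distinct characters of seq that are not in allowed, sorted."""
    return sorted(set(seq) - set(allowed))

def has_invalid_chars(seq, allowed="ACGT"):
    """Return True if seq contains at least one character outside allowed."""
    # re.escape keeps characters such as ']' or '^' from being read as regex syntax
    pattern = "[^" + re.escape(allowed) + "]"
    return re.search(pattern, seq) is not None

print(invalid_chars('ATCTGTCGTYCACGT'))      # ['Y']
print(has_invalid_chars('ATCTGTCGTYCACGT'))  # True
print(has_invalid_chars('ATCG'))             # False
```

The set version reports which characters are wrong; the regex version only answers yes or no, but it can stop at the first offending character.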

Related

python - how to use the join method and sort method

My goal is to take a string as input and return a list of its lower-case letters, without repeats, without punctuation, in alphabetical order. For example, the input "happy!" should give ['a','h','p','y']. I tried to use the join function to get rid of the punctuation, but somehow it doesn't work. Does anybody know why? Also, can .sort() sort letters alphabetically? Am I using it the right way? Thanks!
def split(a):
    a.lower()
    return [char for char in a]
def f(a):
    i=split(a)
    s=set(i)
    l=list(s)
    v=l.join(u for u in l if u not in ("?", ".", ";", ":", "!"))
    v.sort()
    return v
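.join() is a string method, but here it is called on a list, so the code raises an AttributeError. join isn't really needed here anyway. Note also that a.lower() returns a new string rather than changing a in place, so its result has to be assigned or used.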
You're on the right track with set(). It only stores unique items, so create a set of your input and compute the intersection(&) with lower case letters. Sort the result:
>>> import string
>>> s = 'Happy!'
>>> sorted(set(s.lower()) & set(string.ascii_lowercase))
['a', 'h', 'p', 'y']
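You could use the following (note that strip() only removes the listed characters from the two ends of the string, not from the middle):
def f(a):
    return sorted(set(a.lower().strip('?.;:!')))
>>> f('Happy!')
['a', 'h', 'p', 'y']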
You could also use a regex for this:
import re
pattern = re.compile(r'[^a-z]')
text = 'Hello# W0rld!!##'
# upper-case letters are removed too; call text.lower() first to keep them
print(sorted(set(pattern.sub('', text))))
Output:
['d', 'e', 'l', 'o', 'r']
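Putting the answers together, a minimal sketch of the requested function might look like this. The name unique_letters is hypothetical, and only ASCII letters are assumed to count as letters. Unlike strip(), the set intersection also discards punctuation in the middle of the string.

```python
import string

def unique_letters(text):
    """Return the distinct ASCII letters of text, lower-cased and sorted."""
    return sorted(set(text.lower()) & set(string.ascii_lowercase))

print(unique_letters("happy!"))          # ['a', 'h', 'p', 'y']
print(unique_letters("Hi, there. Hi!"))  # ['e', 'h', 'i', 'r', 't']
```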

Correct word generation without repetitions

How many five-letter words can you make from a 26-letter alphabet (no repetitions)?
I am writing a program that generates names (just words) of 5 letters in the format consonant_vowel_consonant_vowel_consonant, using only Latin letters. I just want to understand how many times I have to run the loop for generation. At 65780 iterations, for example, repetitions already begin. Can you please tell me how to do it correctly?
import random
import xlsxwriter
consonants = ['B', 'C', 'D', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N', 'P', 'Q',
              'R', 'S', 'T', 'V', 'W', 'X', 'Z']
vowels = ['A', 'E', 'I', 'O', 'U', 'Y']
workbook = xlsxwriter.Workbook('GeneratedNames.xlsx')
worksheet = workbook.add_worksheet()
def names_generator(size=5, chars=consonants + vowels):
    for y in range(65780):
        toggle = True
        _id = ""
        for i in range(size):
            if toggle:
                toggle = False
                _id += random.choice(consonants)
            else:
                toggle = True
                _id += random.choice(vowels)
        worksheet.write(y, 0, _id)
        print(_id)
    workbook.close()
names_generator()
You can use itertools.combinations to get 3 different consonants and 2 different vowels and get the permutations of those to generate all possible "names".
from itertools import combinations, permutations
names = [a+b+c+d+e for cons in combinations(consonants, 3)
                   for a, c, e in permutations(cons)
                   for vow in combinations(vowels, 2)
                   for b, d in permutations(vow)]
There are only 205,200 = 20x19x18x6x5 in total, so this will take no time at all for 5 letters, but will quickly take longer for more. That is, if by "no repetitions" you mean that no letter should occur more than once. If, instead, you just want that no consecutive letters are repeated (which is already guaranteed by alternating consonants and vowels), or that no names are repeated (which is guaranteed by constructing them without randomness), you can just use itertools.product instead, for a total of 288,000 = 20x20x20x6x6 names:
from itertools import product
names = [a+b+c+d+e for a, c, e in product(consonants, repeat=3)
                   for b, d in product(vowels, repeat=2)]
If you want to generate them in random order, you could just random.shuffle the list afterwards, or if you want just a few such names, you can use random.sample or random.choice on the resulting list.
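As a rough sketch of that last suggestion: build the full list once, then shuffle it or draw a sample from it. The letter lists are copied from the question; the sample size of 10 and the fixed seed are arbitrary choices made only for this illustration.

```python
import random
from itertools import product

consonants = list("BCDFGHJKLMNPQRSTVWXZ")
vowels = list("AEIOUY")

# every consonant-vowel-consonant-vowel-consonant name, each exactly once
names = [''.join(t) for t in product(consonants, vowels, consonants, vowels, consonants)]

random.seed(0)                      # fixed seed only to make the sketch repeatable
few = random.sample(names, 10)      # 10 distinct names, no repeats
random.shuffle(names)               # or: all names in random order

print(len(names), len(set(names)))  # 288000 288000
print(few)
```

Because random.sample draws without replacement, the sampled names cannot repeat, which is the property the random.choice loop in the question lacks.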
If you want to avoid duplicates, you shouldn't use randomness but simply generate all such IDs:
from itertools import product
C = consonants
V = vowels
for id_ in map(''.join, product(C, V, C, V, C)):
    print(id_)
or
from itertools import cycle, islice, product
for id_ in map(''.join, product(*islice(cycle((consonants, vowels)), 5))):
    print(id_)
itertools allows for non repetitive permutations https://docs.python.org/3/library/itertools.html
import itertools, re
names = list(itertools.product(consonants + vowels, repeat=5))
consonants_regex = "(" + "|".join(consonants) + ")"
vowels_regex = "(" + "|".join(vowels) + ")"
search_string = consonants_regex + vowels_regex + consonants_regex + vowels_regex + consonants_regex
names_format = ["".join(name) for name in names if re.match(search_string, "".join(name))]
Output:
>>> len(names)
11881376
>>> len(names_format)
288000
I want to make sure to answer your question
"I just want to understand how many times I have to run the cycle for generation"
since I think it is important to get a better intuition about the problem.
You have 20 consonants and 6 vowels and in total that yields 20x6x20x6x20 = 288000 different combinations for words. Since it is sequential, you can split it up to make that easier to understand. You have 20 different consonants you can put as the 1st letter and for each one 6 vowels you can attach afterwards = 20x6 = 120. Then you can keep going and say for those 120 combinations you can add 20 consonants for each = 120x20 = 2400 ... and so on.
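The step-by-step multiplication can be checked with a few lines of Python. This is a sketch using the letter sets from the question; the enumeration at the end is only a cross-check of the arithmetic.

```python
from itertools import product

consonants = "BCDFGHJKLMNPQRSTVWXZ"   # 20 letters
vowels = "AEIOUY"                     # 6 letters

# multiply the number of choices position by position: C V C V C
count = 1
for choices in (consonants, vowels, consonants, vowels, consonants):
    count *= len(choices)
    print(count)       # 20, 120, 2400, 14400, 288000

# cross-check by enumerating every word
total = sum(1 for _ in product(consonants, vowels, consonants, vowels, consonants))
print(total == count)  # True
```

So 288,000 is the number of distinct words, not the number of random draws needed: with random.choice, duplicates typically appear long before that many iterations, which is why the answers above enumerate the words instead.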

Comparing letters in list with string in a list python

I have a list, let's say:
DIRECTION_LETTERS=['u', 'd' ,'r' , 'l', 'w', 'x', 'y', 'z']
My other parameter is an argument: I type in
udl and it returns ['udl'],
so let's say there is another list arg_list = ['udl'].
I want to check whether u, d and l are all in the direction list, that is, whether any letter of the argument is not one of my direction letters,
so that I can print an error message. I have tried this:
for letter in DIRECTION_LETTERS:
    for char in arg_list[4]:
        if letter in arg_list[4][0]:
            continue
        else:
            print (ERROR_MESSAGE_DIRECTIONS)
            return False
return True
There's a handy function in Python called all() which returns True if every element of the iterable it is given is true. Feel free to find a nice way of using it, but in general:
>>> DIRECTION_LETTERS=['u', 'd' ,'r' , 'l', 'w', 'x', 'y', 'z']
>>>
>>> s="udl"
>>> print(all(c in DIRECTION_LETTERS for c in s))
True
>>> s="udlxa"
>>> print(all(c in DIRECTION_LETTERS for c in s))
False
This line
if letter in arg_list[4][0]:
needs to be
if char in DIRECTION_LETTERS:
and you need to delete the line for letter in DIRECTION_LETTERS:.
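With those two changes applied, the check could look roughly like this. It is a sketch: the function name check_directions is hypothetical, the error text is a placeholder (the question does not show it), and the argument string is passed in directly instead of being read from arg_list.

```python
DIRECTION_LETTERS = ['u', 'd', 'r', 'l', 'w', 'x', 'y', 'z']
ERROR_MESSAGE_DIRECTIONS = "invalid direction letters"  # placeholder text, not from the question

def check_directions(directions):
    """Return True if every character of directions is a direction letter."""
    for char in directions:
        if char not in DIRECTION_LETTERS:
            print(ERROR_MESSAGE_DIRECTIONS)
            return False
    return True

print(check_directions("udl"))    # True
print(check_directions("udlxa"))  # prints the error message, then False
```

Note that an empty string passes this check, just as all() returns True for an empty iterable.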

how to use a for loop to replace characters by index using python

Define a function named encrypt which takes as input a string (which is the name of a text file in the current directory). The function should then print the encrypted content of this file.
Here text encryption is done by replacing every occurrence of a vowel with its next in the list 'aeiou'. So 'a' is replaced by 'e', 'e' is replaced by 'i', and so on, and 'u' is replaced by 'a'. Also each consonant is replaced with its next in the list 'bcdfghjklmnpqrstvwxyz', so 'b' is replaced by 'c', 'c' by 'd', and so on, and lastly 'z' is replaced by 'b'. The same replacement logic holds for upper-case letters. Note that non-alphabetic characters should appear in their original form without modification.
def encrypt (eo):
    vowel = 'aeiou'
    con = 'bcdfghjklmnpqrstvwxyz'
    for eo in vowel (t[i+1]):
        res=
        return res
This piece of code could be useful. Pay attention to the content of vowel and con: I appended one letter to each of them to avoid the modulo operation. Assume eo is the input string.
def encrypt(eo):
    vowel = 'aeioua'
    con = 'bcdfghjklmnpqrstvwxyzb'
    encrypt_table = vowel + con
    res = ""
    for letter in eo:
        pos = encrypt_table.find(letter)
        # characters not in the table (digits, punctuation, upper case) stay unchanged
        res += encrypt_table[pos + 1] if pos != -1 else letter
    return res
If eo is the input filename, you need some file read operation like:
>>> fh = open(eo)
>>> text = fh.read()
>>> fh.close()
A more efficient way to do it is to pre-compute an encryptTable list and use the table to replace the original input in a linear manner. In the following code, I assume your input only includes lower-case letters. If the shift distance is not 1, you need to modify the code. Pre-compute:
>>> vowel = 'aeioua'
>>> con = 'bcdfghjklmnpqrstvwxyzb'
>>> encryptTable = []
>>> for i in range(97, 123):  # code points of 'a' to 'z'
...     temp = chr(i)
...     if temp in vowel:
...         encryptTable.append(vowel[vowel.find(temp)+1])
...     else:
...         encryptTable.append(con[con.find(temp)+1])
...
>>> encryptTable
['e', 'c', 'd', 'f', 'i', 'g', 'h', 'j', 'o', 'k', 'l', 'm', 'n', 'p', 'u', 'q', 'r', 's', 't', 'v', 'a', 'w', 'x', 'y', 'z', 'b']
And then replace the content:
>>> plain = "helloworld"
>>> encrypted = "".join([encryptTable[ord(i)-ord('a')] for i in plain])
>>> encrypted
'jimmuxusmf'
def encrypt(s):
    vowels = 'aeiou'
    vowelReps = dict(zip(vowels, vowels[1:]+vowels[0]))
    cons = 'bcdfghjklmnpqrstvwxyz'
    consReps = dict(zip(cons, cons[1:]+cons[0]))
    answer = []
    for char in s:
        if char.lower() in vowelReps:
            answer.append(vowelReps[char.lower()])
        elif char.lower() in consReps:
            answer.append(consReps[char.lower()])
        else:
            answer.append(char)  # non-alphabetic characters pass through unchanged
        if char.isupper():
            answer[-1] = answer[-1].upper()
    return ''.join(answer)
You have multiple problems here:
for eo in ... would replace the eo argument; except
t isn't defined, so will give a NameError;
res= is a SyntaxError; and
Even if all of the above was fixed, return res will happen on the first character, as it is indented too far.
Instead, you could do the following:
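def encrypt(eo):
    vowels = "aeiou"
    for index, vowel in enumerate(vowels):  # iterate through the five vowels
        new_v = vowels[(index + 1) % len(vowels)]  # determine replacement
        # caution: successive replace() calls cascade, because an 'a' that was
        # turned into 'e' is replaced again when 'e' is processed
        eo = eo.replace(vowel, new_v)  # do replacement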
You can then do the same thing for the consonants, then return eo (which should be indented to the same level as vowels = ...!).
Note:
the use of % to keep the index into vowels within the appropriate range; and
the use of enumerate to get both the character vowel from the string vowels and its index within that string.
Alternatively, and more efficiently:
build a dictionary mapping character in to character out;
build a list of replacement characters using the input eo and the dict; and
str.join the output characters together and return it.
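The three steps in the list above might be sketched as follows. The names build_mapping, MAPPING and encrypt_text are hypothetical, and the function takes the text itself rather than a filename. Because every character is looked up exactly once, no letter gets replaced twice.

```python
def build_mapping():
    """Map every letter to the next one in its group, wrapping around at the end."""
    mapping = {}
    for group in ('aeiou', 'bcdfghjklmnpqrstvwxyz'):
        shifted = group[1:] + group[0]          # 'aeiou' -> 'eioua'
        for src, dst in zip(group, shifted):
            mapping[src] = dst
            mapping[src.upper()] = dst.upper()  # same rule for upper case
    return mapping

MAPPING = build_mapping()

def encrypt_text(text):
    """Replace each letter via MAPPING; other characters are kept as they are."""
    return ''.join(MAPPING.get(ch, ch) for ch in text)

print(encrypt_text("helloworld"))     # jimmuxusmf
print(encrypt_text("Hello, World!"))  # Jimmu, Xusmf!
```

To meet the original task, the contents of the file would be read first (for example with open(eo) and read()) and then passed to encrypt_text.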

How to sort the letters in a string alphabetically in Python

Is there an easy way to sort the letters in a string alphabetically in Python?
So for:
a = 'ZENOVW'
I would like to return:
'ENOVWZ'
You can do:
>>> a = 'ZENOVW'
>>> ''.join(sorted(a))
'ENOVWZ'
>>> a = 'ZENOVW'
>>> b = sorted(a)
>>> print(b)
['E', 'N', 'O', 'V', 'W', 'Z']
sorted returns a list, so you can make it a string again using join:
>>> c = ''.join(b)
which joins the items of b together with an empty string '' in between each item.
>>> print(c)
ENOVWZ
Sorted() solution can give you some unexpected results with other strings.
List of other solutions:
Sort letters and make them distinct:
>>> s = "Bubble Bobble"
>>> ''.join(sorted(set(s.lower())))
' belou'
Sort letters and make them distinct while keeping caps:
>>> s = "Bubble Bobble"
>>> ''.join(sorted(set(s)))
' Bbelou'
Sort letters and keep duplicates:
>>> s = "Bubble Bobble"
>>> ''.join(sorted(s))
' BBbbbbeellou'
If you want to get rid of the space in the result, add strip() function in any of those mentioned cases:
>>> s = "Bubble Bobble"
>>> ''.join(sorted(set(s.lower()))).strip()
'belou'
Python's sorted function orders the characters of a string by code point (ASCII value).
INCORRECT: In the example below, e and d end up behind H and W because of their ASCII values.
>>> a = "Hello World!"
>>> "".join(sorted(a))
' !HWdellloor'
CORRECT: To sort the string alphabetically without changing the case of any letter, use:
>>> a = "Hello World!"
>>> "".join(sorted(a,key=lambda x:x.lower()))
' !deHllloorW'
OR (Ref: https://docs.python.org/3/library/functions.html#sorted)
>>> a = "Hello World!"
>>> "".join(sorted(a,key=str.lower))
' !deHllloorW'
If you want to remove all punctuation and numbers, use:
>>> a = "Hello World!"
>>> "".join(filter(lambda x:x.isalpha(), sorted(a,key=lambda x:x.lower())))
'deHllloorW'
You can use reduce (in Python 3 it has to be imported from functools first):
>>> from functools import reduce
>>> a = 'ZENOVW'
>>> reduce(lambda x,y: x+y, sorted(a))
'ENOVWZ'
This code sorts a string in alphabetical order without using Python's built-in sorting functions:
k = input("Enter any string again ")
li = []
x = len(k)
for i in range (0,x):
    li.append(k[i])
print("List is : ",li)
for i in range(0,x):
    for j in range(0,x):
        if li[i]<li[j]:
            temp = li[i]
            li[i]=li[j]
            li[j]=temp
j=""
for i in range(0,x):
    j = j+li[i]
print("After sorting String is : ",j)
I really liked the answer with the reduce() function. Here's another way to sort the string using accumulate().
from itertools import accumulate
s = 'mississippi'
print(tuple(accumulate(sorted(s)))[-1])
sorted(s) -> ['i', 'i', 'i', 'i', 'm', 'p', 'p', 's', 's', 's', 's']
tuple(accumulate(sorted(s))) -> ('i', 'ii', 'iii', 'iiii', 'iiiim', 'iiiimp', 'iiiimpp', 'iiiimpps', 'iiiimppss', 'iiiimppsss', 'iiiimppssss')
We select the last element (index -1) of the tuple.
