My code starts like this:
complementDNA = originalDNA.replace('a' , 't' , 't' , 'a')
and when I run it, it says:
complementDNA = originalDNA.replace('a' , 't' , 't' , 'a')
TypeError: replace() takes at most 3 arguments (4 given)
Assuming originalDNA is a string, I think you don't want replace here, you want translate, i.e.:
originalDNA = 'atgta' # Know nothing about DNA btw
complement_table = str.maketrans('at', 'ta')
complementDNA = originalDNA.translate(complement_table)
# complementDNA is now 'tagat'
To give a brief explanation, maketrans takes at least 2 arguments and at most 3. The first two arguments are strings of equal length, where each character of the first argument will be replaced by the character at the same position in the second argument. The optional third argument is another string containing the characters you want to delete.
So, for example, str.maketrans('ac', 'ca', 'b') will replace 'a' with 'c', 'c' with 'a' and delete all 'b'.
'abccba'.translate(str.maketrans('ac', 'ca', 'b')) will then be 'caac'
replace takes two required arguments (plus an optional count): replace(old, new).
You would have to call it once for 'a' to 't' and once more for 't' to 'a', and that would not give the right answer, because the second call also changes the characters the first call just produced. One way to do it is to convert the DNA to a list of characters and iterate over them, manually converting 'a' to 't' and 't' to 'a', like so:
DNAlist = []
for character in originalDNA:
    DNAlist.append(character)
for i in range(0, len(DNAlist)):
    if DNAlist[i] == 'a':
        DNAlist[i] = 't'
    elif DNAlist[i] == 't':
        DNAlist[i] = 'a'
# Convert the list back to string
DNAstring = ''.join(DNAlist)
That said, I would suggest working with lists until you actually need the DNA as a string. Strings are immutable in Python, i.e. they can't be changed in place; every modification creates a new string. Repeated string operations can therefore be expensive.
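To see why two plain replace calls in a row do not give the right answer, here is a minimal sketch (the variable names step1, naive, swap and correct are made up for illustration). The second call cannot distinguish the 't' characters that the first call just produced from the original ones.

```python
original_dna = "atgta"

# Step 1 turns every 'a' into 't' ...
step1 = original_dna.replace("a", "t")
# ... so step 2 also converts those freshly created 't's back to 'a'.
naive = step1.replace("t", "a")

# Looking each character up exactly once avoids the problem.
swap = {"a": "t", "t": "a"}
correct = "".join(swap.get(ch, ch) for ch in original_dna)

print(step1)    # ttgtt
print(naive)    # aagaa
print(correct)  # tagat
```

The dictionary lookup visits each position once, which is essentially what the list-based loop does by hand.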
If you read the documentation of str.replace, you will see that it replaces all occurrences of the first argument with the second argument.
To compute the complementary DNA strand of a given DNA strand with str.replace you have to do the following:
dna = "atgcgctagctcattt"
# Replace A by T and T by A.
cdna = dna.replace('a', 'x')
cdna = cdna.replace('t', 'a')
cdna = cdna.replace('x', 't')
# Replace G by C and C by G.
cdna = cdna.replace('g', 'x')
cdna = cdna.replace('c', 'g')
cdna = cdna.replace('x', 'c')
However it is probably more efficient to use str.translate:
dna = "atgcgctagctcattt"
table = str.maketrans("atgc", "tacg")  # named table, because map would shadow the built-in
cdna = dna.translate(table)
which is similar to Jose's answer. In both cases the result will be:
cdna = "tacgcgatcgagtaaa"
I hope this will help you.
The method str.replace() takes at most three arguments: the string to replace, the replacement string, and how many times you want to replace (leave it out to replace all). You can't swap both characters in a single call. Try:
complementDNA = originalDNA.replace('a' , 'x').replace('t', 'a').replace('x', 't')
I'm learning python. I'm trying to identify rows of data where the string value includes a special character.
import pandas as pd
cn = pd.read_excel(f"../Files/df.xlsx", sheet_name='Values')
cn = cn[['DestinationName']]
special_characters = "!##$%^&*()-+?_=,<>/"
cn['Special Characters'] = ["Y" if any(c in special_characters for c in cn) else "N"]
Basically, I'd like to either only display rows that include any of the special characters, or create a separate column to show whether Yes (it includes a special character) or No. For example, Red & Blue has the "&" character so it should be flagged as Yes, while RedBlue shouldn't.
I'm a little stuck, and any help would be appreciated
I would recommend using sets for this specific task:
Create a set from your string of special characters.
Create a new column containing the following boolean: "the intersection of special_characters and the string in column DestinationName is non-empty".
It should look like this:
special_characters_set = set(list(special_characters))
cn["Special Characters"] = cn["DestinationName"].apply(lambda x : len(set(list(x)).intersection(special_characters_set)) != 0)
Where
# list('hello') = ['h', 'e', 'l', 'l', 'o'] # ordered and repetitions
# set(list('hello')) = {'h', 'e', 'l', 'o'} # non ordered and no repetitions
Keep in mind that the .apply() method is not the most computationally efficient way to manipulate DataFrames.
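The per-string check itself can be tried without pandas. The following is only a sketch: the helper name has_special is hypothetical, and it uses set.isdisjoint instead of measuring the length of an intersection. A function like this could be passed to .apply on the DestinationName column.

```python
special_characters = "!##$%^&*()-+?_=,<>/"
special_characters_set = set(special_characters)

def has_special(text):
    """Return True if text shares at least one character with the set."""
    # isdisjoint accepts any iterable, so the string can be passed directly.
    return not special_characters_set.isdisjoint(text)

print(has_special("Red & Blue"))  # True
print(has_special("RedBlue"))     # False
```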
How to replace the first character alone in a string using python?
string = "11234"
translation_table = str.maketrans({'1': 'I'})
output= (string.translate(translation_table))
print(output)
Expected Output:
I1234
Actual Output:
11234
I am not sure what you want to achieve, but it seems you just want to replace a '1' with an 'I' just once, so try this:
string = "11234"
string.replace('1', 'I', 1)  # returns 'I1234'; string itself is unchanged
str.replace takes 3 parameters old, new, and count (which is optional). count indicates the number of times you want to replace the old substring with the new substring.
In Python, strings are immutable, meaning you cannot assign to an index or modify a character in place. Use str.replace() instead. Here's the function signature:
str.replace(old, new[, count])
This built in function returns a copy of the string with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.
If you don't want to use str.replace(), you can do it manually by taking advantage of slicing:
def manual_replace(s, char, index):
    return s[:index] + char + s[index + 1:]
string = '11234'
print(manual_replace(string, 'I', 0))
Output
I1234
You can use re (regex) and its sub function. The first parameter is the pattern you want to replace, the second is the replacement, the third is the string, and the fourth is the count; I pass 1 because you only want to replace the first occurrence:
>>> import re
>>> string = "11234"
>>> re.sub('1', 'I', string, 1)
'I1234'
>>>
It's virtually just:
re.sub('1', 'I', string, 1)
I would like to replace all the french letters within words with their ASCII equivalent.
letters = [['é', 'à'], ['è', 'ù'], ['â', 'ê'], ['î', 'ô'], ['û', 'ç']]
for x in letters:
    for a in x:
        a = a.replace('é', 'e')
        a = a.replace('à', 'a')
        a = a.replace('è', 'e')
        a = a.replace('ù', 'u')
        a = a.replace('â', 'a')
        a = a.replace('ê', 'e')
        a = a.replace('î', 'i')
        a = a.replace('ô', 'o')
        a = a.replace('û', 'u')
        a = a.replace('ç', 'c')
print(letters[0][0])
However, this code still prints é. How can I make this work?
May I suggest you consider using translation tables.
translationTable = str.maketrans("éàèùâêîôûç", "eaeuaeiouc")
test = "Héllô Càèùverâêt Jîôûç"
test = test.translate(translationTable)
print(test)
will print Hello Caeuveraet Jiouc. Pardon my French.
You can also use unidecode. Install it: pip install unidecode.
Then, do:
from unidecode import unidecode
s = "Héllô Càèùverâêt Jîôûç ïîäüë"
s = unidecode(s)
print(s) # Hello Caeuveraet Jiouc iiaue
The result will be the same string, but the french characters will be converted to their ASCII equivalent: Hello Caeuveraet Jiouc iiaue
The replace function returns the string with the character replaced.
In your code you don't store this return value.
The lines in your loop should be a = a.replace('é', 'e').
You also need to store that output so you can print it in the end.
This post explains how variables within loops are accessed.
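As a small illustration of the point about storing the result (a sketch only; the loop variable names row, ch and i are arbitrary), compare rebinding the loop variable with writing back into the list by index:

```python
letters = [['é', 'à'], ['è', 'ù'], ['â', 'ê'], ['î', 'ô'], ['û', 'ç']]

# Rebinding the loop variable leaves the list untouched.
for row in letters:
    for ch in row:
        ch = ch.replace('é', 'e')
print(letters[0][0])  # still 'é'

# Writing back by index stores the result in the list.
for row in letters:
    for i, ch in enumerate(row):
        row[i] = ch.replace('é', 'e')
print(letters[0][0])  # now 'e'
```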
Although I am new to Python, I would approach it this way:
letterXchange = {'à':'a', 'â':'a', 'ä':'a', 'é':'e', 'è':'e', 'ê':'e', 'ë':'e',
'î':'i', 'ï':'i', 'ô':'o', 'ö':'o', 'ù':'u', 'û':'u', 'ü':'u', 'ç':'c'}
text = input() # Replace it with the string in your code.
for item in list(text):
    if item in letterXchange:
        text = text.replace(item, letterXchange.get(str(item)))
    else:
        pass
print(text)
Here is another solution, using the low level unicode package called unicodedata.
In Unicode, a character like 'ù' is actually a composite character, made of the character 'u' and another character called 'COMBINING GRAVE ACCENT', which is basically the '̀'. Using the decomposition function in unicodedata, one can obtain the code points (in hex) of these two parts.
>>> import unicodedata as ud
>>> ud.decomposition('ù')
'0075 0300'
>>> chr(0x0075)
'u'
>>> chr(0x0300)
'̀'
Therefore, to retrieve 'u' from 'ù', we can first split the decomposition string, then use the built-in int function for the conversion (see this thread for converting a hex string to an integer), and then get the character using the chr function.
import unicodedata as ud
def get_ascii_char(c):
    s = ud.decomposition(c)
    if s == '': # for an indecomposable character, it returns ''
        return c
    code = int('0x' + s.split()[0], 0)
    return chr(code)
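The function above works on one character at a time. As a possible alternative for whole strings (this is not part of the answer above, and the name strip_accents is hypothetical), unicodedata.normalize with the 'NFD' form performs the decomposition for every character, after which the combining marks can be filtered out with unicodedata.combining:

```python
import unicodedata as ud

def strip_accents(text):
    # NFD splits each accented letter into its base letter plus combining marks.
    decomposed = ud.normalize('NFD', text)
    # combining() returns a non-zero class for combining marks, so drop those.
    return ''.join(ch for ch in decomposed if not ud.combining(ch))

print(strip_accents("Héllô Càèùverâêt Jîôûç"))  # Hello Caeuveraet Jiouc
```

Characters without a decomposition pass through unchanged, so plain ASCII text is left as it is.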
s1 = 'GSHMGLYELSASNFELHVAQGDHFIKFFAPWCGHCKALAPTWEQLALGLEHSETVKIGKVDbTQHYELbSGNQVRGYPTLLWFRDGKKVDQYKGKRDLESLREYVESQLQR'
This is a string in which I would like to replace the lowercase letters with a certain uppercase letter, say, 'C'. The command I am using is:
string.replace(s1, s1.lower(), 'C')
Problem: the resulting string is still the same as the old one; b is still 'b' and not 'C'.
Currently, you're trying to replace a lowercase copy of the entire string with 'C'. You're also seemingly not assigning the result of string.replace to anything, which won't work: replace doesn't modify the string in place, it returns a new copy of the string with the replacements applied.
You'll need to iterate over the string and replace any lowercase letters.
s1 = 'GSHMGLYELSASNFELHVAQGDHFIKFFAPWCGHCKALAPTWEQLALGLEHSETVKIGKVDbTQHYELbSGNQVRGYPTLLWFRDGKKVDQYKGKRDLESLREYVESQLQR'
replaced_string = ''.join(x if x.isupper() else 'C' for x in s1)
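One detail worth knowing about the isupper test: it replaces every character that is not uppercase, which includes digits and punctuation. The sketch below (with a short made-up input string) contrasts it with a test on islower, which touches only lowercase letters.

```python
s1 = "GSHMbTQ-42x"  # made-up sample with a digit and a hyphen

# Replaces everything that is not an uppercase letter.
by_not_upper = ''.join(ch if ch.isupper() else 'C' for ch in s1)
# Replaces only lowercase letters.
by_lower = ''.join('C' if ch.islower() else ch for ch in s1)

print(by_not_upper)  # GSHMCTQCCCC
print(by_lower)      # GSHMCTQ-42C
```

For the protein sequence in the question, which contains only letters, both variants give the same result.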
Your condition is too complex for the simple replace method. Use a regular expression instead:
import re
s1 = "GaHMxLYELmASNFElHVAQG"
s2 = re.sub(r"[a-z]", "C", s1)
print(s2)
It will print "GCHMCLYELCASNFECHVAQG"
[a-z] means "any letter from a to z"; add extra lowercase letters to it for your language if needed. For example, for Russian the pattern would be: [a-zа-я]
string.replace(s1, s1.lower(), 'C')
will replace with 'C' only an occurrence of the entire lowercased string:
gshmglyelsasnfelhvaqgdhfikffapwcghckalaptweqlalglehsetvkigkvdbtqhyelbsgnqvrgyptllwfrdgkkvdqykgkrdleslreyvesqlqr
If you want to substitute all the characters with a given property in a string, I suggest using regular expressions; in your case it will be:
s2 = re.sub("[a-z]", "C", s1)
s1.lower() is equal to
>>> s1.lower()
'gshmglyelsasnfelhvaqgdhfikffapwcghckalaptweqlalglehsetvkigkvdbtqhyelbsgnqvrgyptllwfrdgkkvdqykgkrdleslreyvesqlqr'
So string.replace(s1, s1.lower(), 'C') searches string s1 for any occurrences of that whole string of lowercase characters, and if it finds any, it replaces each one with 'C'.
Note that replace has also been a method on strings themselves ever since Python 2.0 or so; s1.replace(s1.lower(), 'C') would do the exact same thing.
You can use a translation table:
>>> from string import ascii_lowercase
>>> trans_table = str.maketrans(ascii_lowercase, 'C' * len(ascii_lowercase))
>>> s1.translate(trans_table)
str.maketrans takes two strings of characters with equal lengths, and translate() then translates each occurrence of a character in the first to its equivalent in the second. (In Python 2 these were string.maketrans and string.lowercase.)
ascii_lowercase is 'abcdefghijklmnopqrstuvwxyz', and 'C' * len(ascii_lowercase) is simply a string of 26 Cs.
Is it possible to replace a single character inside a string that occurs many times?
Input:
Sentence=("This is an Example. Thxs code is not what I'm having problems with.") #Example input
^
Sentence=("This is an Example. This code is not what I'm having problems with.") #Desired output
Replace the 'x' in "Thxs" with an i, without replacing the x in "Example".
You can do it by including some context:
s = s.replace("Thxs", "This")
Alternatively you can keep a list of words that you don't wish to replace:
import re
whitelist = ['example', 'explanation']
def replace_except_whitelist(m):
    s = m.group()
    if s in whitelist: return s
    else: return s.replace('x', 'i')
s = 'Thxs example'
result = re.sub(r"\w+", replace_except_whitelist, s)
print(result)
Output:
This example
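Note that the whitelist lookup is case-sensitive, so "Example" with a capital E from the question would not match 'example'. A possible variation, sketched here under the assumption that whitelisted words should match regardless of case, compares the lowercased word against a set:

```python
import re

whitelist = {'example', 'explanation'}

def replace_except_whitelist(m):
    word = m.group()
    # Compare in lower case so 'Example' matches 'example'.
    if word.lower() in whitelist:
        return word
    return word.replace('x', 'i')

s = "This is an Example. Thxs code is not what I'm having problems with."
result = re.sub(r"\w+", replace_except_whitelist, s)
print(result)
# This is an Example. This code is not what I'm having problems with.
```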
Sure, but you essentially have to build up a new string out of the parts you want:
>>> s = "This is an Example. Thxs code is not what I'm having problems with."
>>> s[22]
'x'
>>> s[:22] + "i" + s[23:]
"This is an Example. This code is not what I'm having problems with."
For information about the notation used here, see good primer for python slice notation.
If you know whether you want to replace the first occurrence of x, or the second, or the third, or the last, you can combine str.find (or str.rfind if you wish to start from the end of the string) with slicing and str.replace. Call find with the character you wish to replace as many times as needed to reach a position just before the occurrence you want to change (for the specific sentence you suggest, just once), then slice the string in two and replace only one occurrence in the second slice.
An example is worth a thousand words, or so they say. In the following, I assume you want to substitute the (n+1)th occurrence of the character.
>>> s = "This is an Example. Thxs code is not what I'm having problems with."
>>> n = 1
>>> pos = 0
>>> for i in range(n):
...     pos = s.find('x', pos) + 1
...
>>> s[:pos] + s[pos:].replace('x', 'i', 1)
"This is an Example. This code is not what I'm having problems with."
Note that you need to add an offset to pos, otherwise you will replace the occurrence of x you have just found.
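The loop above can be wrapped into a small helper. This is only a sketch: the function name replace_nth and its 1-based counting are choices made here, not part of the answer, and the string is returned unchanged when there are fewer than n occurrences.

```python
def replace_nth(s, old, new, n):
    """Replace only the n-th occurrence (1-based) of old in s."""
    if n < 1:
        return s
    pos = -1
    for _ in range(n):
        # Start each search one position past the previous match.
        pos = s.find(old, pos + 1)
        if pos == -1:
            return s  # fewer than n occurrences: leave s unchanged
    return s[:pos] + new + s[pos + len(old):]

s = "This is an Example. Thxs code is not what I'm having problems with."
print(replace_nth(s, 'x', 'i', 2))
# This is an Example. This code is not what I'm having problems with.
```

Because the replacement is built by slicing around the found position, no other occurrence is touched.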