How to create a censor "translator" via function in python

I'm trying to create a "translator" of sorts, in which if the raw_input has any curses (pre-determined, I list maybe 6 test ones), the function will output a string with the curse as ****.
This is my code below:
def censor(sequence):
    curse = ('badword1', 'badword2', 'badword3', 'badword4', 'badword5', 'badword6')
    nsequence = sequence.split()
    aword = ''
    bsequence = []
    for x in range(0, len(nsequence)):
        if nsequence[x] != curse:
            bsequence.append(nsequence[x])
        else:
            bsequence.append('*' * (len(x)))
        latest = ''.join(bsequence)
    return bsequence

if __name__ == "__main__":
    print(censor(raw_input("Your sentence here: ")))

A simple approach is to use Python's built-in string method str.replace:
def censor(string):
    curses = ('badword1', 'badword2', 'badword3', 'badword4', 'badword5', 'badword6')
    for curse in curses:
        string = string.replace(curse, '*' * len(curse))
    return string
To improve efficiency, you could try to compile the list of curses into a regular expression and then do a single replacement operation.
Python Documentation
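A minimal sketch of the single-pass regular expression approach mentioned above, assuming the same six placeholder words. The names CURSES and CURSE_PATTERN are illustrative and do not come from the original answer. Like str.replace, this matches substrings, so a listed word embedded in a longer word is masked as well.

```python
import re

# Placeholder words from the question; CURSES and CURSE_PATTERN are illustrative names.
CURSES = ('badword1', 'badword2', 'badword3', 'badword4', 'badword5', 'badword6')

# re.escape keeps any regex metacharacters in a word literal.
CURSE_PATTERN = re.compile('|'.join(re.escape(curse) for curse in CURSES))


def censor(text):
    # Replace each match with as many asterisks as it has characters.
    return CURSE_PATTERN.sub(lambda match: '*' * len(match.group(0)), text)


print(censor('this badword1 is a badword3'))
```

The pattern is compiled once at import time, so each call to censor scans the input a single time regardless of how many words are listed.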

First, there's no need to iterate over element indices here. Python allows you to iterate over the elements themselves, which is ideal for this case.
Second, you are checking whether each of those words in the given sentence is equal to the entire tuple of potential bad words. You want to check whether each word is in that tuple (a set would be better).
Third, you are mixing up indices and elements when you do len(x) - that assumes that x is the word itself, but it is actually the index, as you use elsewhere.
Fourth, you are joining the sequence within the loop, and on the empty string. You should join it on a space, and only after you've checked each element.
def censor(sequence):
    curse = {'badword1', 'badword2', 'badword3', 'badword4', 'badword5', 'badword6'}
    nsequence = sequence.split()
    bsequence = []
    for x in nsequence:
        if x not in curse:
            bsequence.append(x)
        else:
            bsequence.append('*' * (len(x)))
    return ' '.join(bsequence)
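The second point can be checked directly. In this small sketch (the variable names are illustrative), comparing a word to the whole tuple with != is always true, because a string never equals a tuple, while in tests membership:

```python
curse = ('badword1', 'badword2')
word = 'badword1'

# A str is never equal to a tuple, so this is True for every word.
differs_from_tuple = word != curse

# Membership is the test the censor actually needs.
is_curse = word in curse

print(differs_from_tuple, is_curse)
```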

Related

How to replace all T with U in an input string of DNA?

So, the task is quite simple. I just need to replace all "T"s with "U"s in an input string of DNA. I have written the following code:
def transcribe_dna_to_rna(s):
    base_change = {"t":"U", "T":"U"}
    replace = "".join([base_change(n,n) for n in s])
    return replace.upper()
and for some reason, I get the following error code:
'dict' object is not callable
Why is it that my dictionary is not callable? What should I change in my code?
Thanks for any tips in advance!

To correctly convert DNA to RNA nucleotides in string s, use a combination of str.maketrans and str.translate, which replaces thymine with uracil while preserving the case. For example:
s = 'ACTGactgACTG'
s = s.translate(str.maketrans("tT", "uU"))
print(s)
# ACUGacugACUG
Note that in bioinformatics, case (lower or upper) is often important and should be preserved, so keeping both t -> u and T -> U is important. See, for example:
Uppercase vs lowercase letters in reference genome
SEE ALSO:
Character Translation using Python (like the tr command)
Note that there are specialized bioinformatics tools specifically for handling biological sequences.
For example, BioPython offers transcribe:
from Bio.Seq import Seq
my_seq = Seq('ACTGactgACTG')
my_seq = my_seq.transcribe()
print(my_seq)
# ACUGacugACUG
To install BioPython, use conda install biopython or conda create --name biopython biopython.

The error message tells you that base_change(n,n) tries to call base_change as if it were a function, when in fact it is a dictionary. (It is a TypeError raised at run time, not a syntax error.)
I guess what you wanted to say was
def transcribe_dna_to_rna(s):
    base_change = {"t":"U", "T":"U"}
    replace = "".join([base_change.get(n, n) for n in s])
    return replace.upper()
where the function is the .get(x, y) method of the dictionary, which returns the value for the key in x if it is present, and otherwise y (so in this case, return the original n if it's not in the dictionary).
But this is overcomplicating things; Python very easily lets you replace characters in strings.
def transcribe_dna_to_rna(s):
    return s.upper().replace("T", "U")
(Stole the reordering to put the .upper() first from @norie's answer; thanks!)
If your real dictionary was much larger, your original attempt might make more sense, as long chains of .replace().replace().replace()... are unattractive and eventually inefficient when you have a lot of them.
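For a larger mapping, one option is to pass the dictionary to str.maketrans and apply it in a single str.translate call (Python 3). This is a sketch under that assumption; BASE_CHANGE, TABLE, and transcribe are illustrative names, and unlike the versions above it preserves case instead of upper-casing the result.

```python
# Illustrative mapping; keys must be single characters for str.maketrans.
BASE_CHANGE = {"T": "U", "t": "u"}
TABLE = str.maketrans(BASE_CHANGE)


def transcribe(s):
    # Characters missing from the table pass through unchanged.
    return s.translate(TABLE)


print(transcribe("ACTGactg"))
```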

In Python 3, use str.translate:
dna = "ACTG"
rna = dna.translate(str.maketrans("T", "U")) # "ACUG"

Convert s to upper case and then do the replacement.
def transcribe_dna_to_rna(s):
    return s.upper().replace("T", "U")

parse nested function to extract each inner function in python

I have a nested expression as below
expression = 'position(\'a\' IN Concat("function_test"."PRODUCT_CATEGORIES"."CATEGORY_NAME" , "function_test"."PRODUCT_CATEGORIES"."CATEGORY_NAME" ))'
I want the output to list the nested (inner) function first and then the outer functions:
['Concat("function_test"."PRODUCT_CATEGORIES"."CATEGORY_NAME" , "function_test"."PRODUCT_CATEGORIES"."CATEGORY_NAME" )','position(\'a\' IN Concat("function_test"."PRODUCT_CATEGORIES"."CATEGORY_NAME" , "function_test"."PRODUCT_CATEGORIES"."CATEGORY_NAME" ))']
Below is the code I have tried
result = []
for i in range(len(expression)):
    if expression[i]=="(":
        a.append(i)
    elif expression[i]==")":
        fromIdx=a.pop()
        fromIdx2=max(a[-1],expression.rfind(",", 0, fromIdx))
        flag=False
        for (fromIndex, toIndex) in first_Index:
            if fromIdx2 + 1 >= fromIndex and i <= toIndex:
                flag=True
                break
        if flag==False:
            result.append(expression[fromIdx2+1:i+1])
But this works only if the arguments in the expression are separated by ','
For example:
expression = 'position(\'a\' , Concat("function_test"."PRODUCT_CATEGORIES"."CATEGORY_NAME" , "function_test"."PRODUCT_CATEGORIES"."CATEGORY_NAME" ))'
and the result for this expression from my code is correct, as expected.
The first expression uses the IN operator instead of ',', so my code doesn't work on it.
Please help.

If you want it to be reliable, you need a full-fledged SQL parser. Fortunately, there is an out-of-box solution for that: https://pypi.org/project/sqlparse/. As soon as you have a parsed token tree, you can walk through it and do what you need:
import sqlparse

def extract_functions(tree):
    res = []
    def visit(token):
        if token.is_group:
            for child in token.tokens:
                visit(child)
        if isinstance(token, sqlparse.sql.Function):
            res.append(token.value)
    visit(tree)
    return res

extract_functions(sqlparse.parse(expression)[0])
Explanation.
sqlparse.parse(expression) parses the string and returns a tuple of statements. As there is only one statement in the example, we can just take the first element. If there are many statements, you should rather iterate over all tuple elements.
extract_functions recursively walks the parsed token tree depth first (since you want inner calls to appear before outer ones). It uses token.is_group to decide whether the current token has children to visit, tests whether the current token is a function, and if it is, appends its string representation (token.value) to the result list.
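If installing sqlparse is not an option, a rough pure-Python alternative is to match parentheses with a stack and walk back from each opening parenthesis over the function name. This is a different technique from the sqlparse answer above and not a substitute for a real parser: it assumes no parentheses occur inside quoted strings and that function names consist only of letters, digits, and underscores. extract_calls is an illustrative name.

```python
def extract_calls(expression):
    open_positions = []  # indices of "(" still waiting for their ")"
    calls = []
    for i, ch in enumerate(expression):
        if ch == "(":
            open_positions.append(i)
        elif ch == ")" and open_positions:
            open_idx = open_positions.pop()
            # Walk back over the identifier that precedes "(".
            start = open_idx
            while start > 0 and (expression[start - 1].isalnum()
                                 or expression[start - 1] == "_"):
                start -= 1
            if start < open_idx:  # skip bare parenthesised groups with no name
                calls.append(expression[start:i + 1])
    return calls


print(extract_calls("position('a' IN Concat(x.y , x.z ))"))
```

Because an inner closing parenthesis is always reached before the outer one, inner calls are appended first, and the separator between arguments (',' or IN) plays no role.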

How do i loop through each entry in a string delimited by a comma and include it as a separate input using Python

I have a string called listnumbers
listnumbers
'1.0,2.0,3.0,4.0,5.0,6.0'
I have a function that returns each value of that string
def myfun(lists):
    return ','.join([i for i in lists.split(',')])
When I call the function
myfun(listnumbers)
'1.0,2.0,3.0,4.0,5.0,6.0'
I have a loop script
aprx = arcpy.mp.ArcGISProject("CURRENT")
m = aprx.listMaps("Map")[0]
for lyr in m.listLayers("OMAP_PCT_POP_ACS17"):
    if lyr.supports("DEFINITIONQUERY"):
        lyr.definitionQuery="Value=" ""+myfun(listnumbers)+""
I end up getting
Value=1.0,2.0,3.0,4.0,5.0,6.0
What I would really like is for this to loop and give me
Value=1.0
Value=2.0
Value=3.0
and so on, as separate entries. I feel I am very close; I just need to make some changes.

Instead of joining the string back together, just leave it apart:
def myfun(lists):
    return [i for i in lists.split(',')]
Then, in your loop, you should loop through the values of the list returned by myfun:
aprx = arcpy.mp.ArcGISProject("CURRENT")
m = aprx.listMaps("Map")[0]
for lyr in m.listLayers("OMAP_PCT_POP_ACS17"):
    if lyr.supports("DEFINITIONQUERY"):
        for value in myfun(listnumbers):
            lyr.definitionQuery = "Value=" + value
However, str.split already does what this improved myfun does, since it already returns a list. Thus, you can simplify even further and get rid of myfun entirely:
aprx = arcpy.mp.ArcGISProject("CURRENT")
m = aprx.listMaps("Map")[0]
for lyr in m.listLayers("OMAP_PCT_POP_ACS17"):
    if lyr.supports("DEFINITIONQUERY"):
        for value in listnumbers.split(','):
            lyr.definitionQuery = "Value=" + value
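Outside ArcGIS, the splitting step can be checked on its own. This sketch only builds the query strings; whether a layer should receive them one at a time or combined into a single expression depends on the ArcGIS workflow, which is not covered here. The name queries is illustrative.

```python
listnumbers = '1.0,2.0,3.0,4.0,5.0,6.0'

# One query string per comma-separated value.
queries = ["Value=" + value for value in listnumbers.split(',')]

for query in queries:
    print(query)
```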

Local variable 'list' referenced before assignment

I made a simple script that converts any input text into a "code" and can also translate it back. It only works one word at a time.
I want the script to add each new code to a list that is printed every time. For example, the first time you translate something, "HELLO" becomes "lohleci". The second time, I want it not only to show "world" = "ldwropx", but also to show below it everything translated so far.
I'm new to Python and have looked through forums for people with similar problems. The way I tried doing it (a segment was removed and put into a separate script), I get an error saying "local variable 'list' referenced before assignment." This is the code producing the error:
list = "none"
def list():
    word = raw_input("")
    if list == "none":
        list = word + " "
        print list
        list()
    else:
        new_list = list + word + " "
        list = new_list
        print list
        list()
list()

Your code has several problems, all of which are fixable with a bit more knowledge.
Don't use the name list for your own variables or functions. It's the name of a built-in Python function, and if you use that name for your own functions you won't be able to call the built-in function. (At least, not without resorting to advanced tricks which you shouldn't be trying to learn yet.)
You're also re-using the same name (list) for two different things, a variable and a function. Don't do that; give them different, meaningful names which reflect what they are. E.g., wordlist for the variable that contains a list of words, and get_words() for your function.
Instead of using a variable named list where you accumulate a set of strings, but which isn't actually a Python list, why not use a real Python list? They're designed for exactly what you want to do.
You use Python lists like this:
wordlist = []
# To add words at the end of the list:
wordlist.append("hello")
# To print the list in format ["word", "word 2", "word 3"]:
print wordlist
# To put a single space between each item of the list, then print it:
print " ".join(wordlist)
# To put a comma-and-space between each item of the list, then print it:
print ", ".join(wordlist)
Don't worry too much about the join() function, and why the separator (the string that goes between the list items) comes before the join(), just yet. That gets into classes, instances, and methods, which you'll learn later. For now, focus on using lists properly.
Also, if you use lists properly, you'll have no need for that if list == "none" check you're doing, because you can append() to an empty list just as well as to a list with contents. So your code would become:
Example A
wordlist = []

def translate_this(word):
    # Define this however you like
    return word

def get_words():
    word = raw_input("")
    translated_word = translate_this(word)
    wordlist.append(translated_word)
    print " ".join(wordlist)
    # Or: print ", ".join(wordlist)
    get_words()

get_words()
Now there's one more change I'd suggest making. Instead of calling your function at the end every time, use a while loop. The condition of the while loop can be anything you like; in particular, if you make the condition to be the Python value True, then the loop will never exit and keep on looping forever, like so:
Example B
wordlist = []

def translate_this(word):
    # Define this however you like
    return word

def get_words():
    while True:
        word = raw_input("")
        translated_word = translate_this(word)
        wordlist.append(translated_word)
        print " ".join(wordlist)
        # Or: print ", ".join(wordlist)

get_words()
Finally, if you want to get out of a loop (any loop, not just an infinite loop) early, you can use the break statement:
Example C
wordlist = []

def translate_this(word):
    # Define this however you like
    return word

def get_words():
    while True:
        word = raw_input("")
        if word == "quit":
            break
        translated_word = translate_this(word)
        wordlist.append(translated_word)
        print " ".join(wordlist)
        # Or: print ", ".join(wordlist)

get_words()
That should solve most of your problems so far. If you have any questions about how any of this code works, let me know.
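As for the error message itself: when a function assigns to a name anywhere in its body, Python treats that name as local for the whole function, so reading it before the assignment fails. A minimal Python 3 sketch with illustrative names (state, broken, fixed):

```python
state = "none"


def broken():
    # The assignment below makes `state` local to broken(),
    # so this comparison reads a local that has no value yet.
    if state == "none":
        state = "seen"
    return state


def fixed():
    # Declaring the name global makes reads and writes use the module-level variable.
    global state
    if state == "none":
        state = "seen"
    return state


try:
    broken()
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__

print(error_name)
print(fixed())
```

Using a real list with append(), as in Examples A to C, sidesteps the issue because append() changes the list in place without assigning to the name.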

Python: argument conversion during string format error /w dictionary/list reads

I'm new to these boards and understand there is a protocol, so any critique is appreciated. I began programming in Python a few days ago and am trying to catch up. The goal of the program is to read a file and record, in a dictionary, every position at which a specific string occurs in the document. Issues abound; I'll take all responses.
Here is my code:
f = open('C:\CodeDoc\Mm9\sampleCpG.txt', 'r')
cpglist = f.read()
def buildcpg(cpg):
    return "\t".join(["%d" % (k) for k in cpg.items()])

lookingFor = 'CG'
i = 0
index = 0
cpgdic = {}
try:
    while i < len(cpglist):
        index = cpglist.index(lookingFor, i)
        i = index + 1
        for index in range(len(cpglist)):
            if index not in cpgdic:
                cpgdic[index] = index
        print (buildcpg(cpgdic))
except ValueError:
    pass
f.close()
The cpgdic is supposed to act as a dictionary of the position reference obtained in the index. Each read of index should be entering cpgdic as a new value, and the print (buildcpg(cpgdic)) is my hunch of where the logic fails. I believe(??) it is passing cpgdic into the buildcpg function, where it should be returned as an output of all the positions of 'CG', however the error "TypeError:not all arguments converted during string formatting" shows up. Your turn!
ps. this destroys my 2GB memory; I need to improve with much more reading

cpg.items is yielding tuples. As such, k is a tuple (length 2) and then you're trying to format that as a single integer.
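The failure can be reproduced without the file. In this Python 3 sketch (the sample dictionary is made up), formatting a (key, value) tuple with a single %d raises the same TypeError, while unpacking the pair gives each value its own placeholder:

```python
cpg = {0: 0, 5: 5}  # made-up sample positions

try:
    # Each item is a 2-tuple, but the format string has only one %d.
    ["%d" % item for item in cpg.items()]
    message = None
except TypeError as exc:
    message = str(exc)

# Unpack key and value so each gets its own placeholder.
formatted = "\t".join("%d:%d" % (key, value) for key, value in cpg.items())

print(message)
print(formatted)
```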
As a side note, you'll probably be a bit more memory efficient if you leave off the [ and ] in the join line. This will turn your list comprehension to a generator expression which is a bit nicer. If you're on python2.x, you could use cpg.iteritems() instead of cpg.items() as well to save a little memory.
It also makes little sense to store a dictionary where the keys and the values are the same. In this case, a simple list is probably more elegant. I would probably write the code this way:
with open(r'C:\CodeDoc\Mm9\sampleCpG.txt') as fin:
    cpgtxt = fin.read()
indices = [i for i,_ in enumerate(cpgtxt) if cpgtxt[i:i+2] == 'CG']
# join() needs strings, so convert each index first
print '\t'.join(str(i) for i in indices)
Here it is in action:
>>> s = "CGFOOCGBARCGBAZ"
>>> indices = [i for i,_ in enumerate(s) if s[i:i+2] == 'CG']
>>> print indices
[0, 5, 10]
Note that
i for i,_ in enumerate(s)
is roughly the same thing as
i for i in range(len(s))
except that I don't like range(len(s)) and the former version will work with any iterable -- Not just sequences.
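Another way to collect the positions, sketched here in Python 3, is to call str.find repeatedly and resume the search just past each hit. This is an alternative to the list comprehension above, not the answerer's method; find_all is an illustrative name.

```python
def find_all(text, needle):
    positions = []
    start = text.find(needle)
    while start != -1:  # str.find returns -1 when there is no further match
        positions.append(start)
        # Resume one character later so overlapping matches are also found.
        start = text.find(needle, start + 1)
    return positions


positions = find_all("CGFOOCGBARCGBAZ", "CG")
print("\t".join(str(p) for p in positions))
```

Only the matching positions are stored, so memory use grows with the number of matches rather than with the length of the text.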
