String for text issue in Python

I ran into some problems while trying to work with a string read from a text file.
I was given a file, sqroot2_10kdigits.txt.
The contents of sqroot2_10kdigits.txt look like this:
1.4142135623 7309504880 1688724209 6980785696 7187537694 8073176679 7379907324 7846210703 8850387534 3276415727 3501384623 0912297024
9248360558 5073721264 4121497099 9358314132 2266592750 5592755799
9505011527 8206057147 0109559971 6059702745 3459686201 4728517418
6408891986 0955232923 0484308714 3214508397 6260362799 5251407989
6872533965 4633180882 9640620615 2583523950 5474575028 7759961729
8355752203 3753185701 1354374603 4084988471
My code is below:
myfile = open("sqroot2_10kdigits.txt")
txt = myfile.read()
print(txt)
myfile.close()
Q2: Make a new empty string called sqroot_2_string. Note that there's a space between every 10 digits. Instead of using the .rstrip() method, try using .replace(" ", "") to remove all the spaces in the file and save the result in the empty string I just made. Check the length of the string as well; it should be 10002. Then print the first 10 digits followed by "...". Here's an example:
The first 10 digit of square root of 2 is 1.4142135623...
My code is below:
def sqroot_2_string(string):
    count = 0
    list = []
    for i in xrange(len(string)):
        if string[i] != ' ':
            list.append(string[i])
    return toString(list)
# Utility Function
def toString(List):
    return ''.join(List)
# Driver program
string = myfile
print sqroot_2_string(string)
Can anyone check my code for Q2? I don't know how to use .replace(" ", "") to remove all the spaces in the file and save the result in the empty string.

You can just do
def sqroot_2_string(string):
    return string.replace(" ", "")
Also note that you should do
print(sqroot_2_string(txt))
so you are using the text from the file instead of the file handle
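A minimal Python 3 sketch of the whole exercise might look like the following. The name sample_text is a stand-in for the text returned by myfile.read(), not something from the original post. Because the excerpt shown in the question also contains line breaks, this sketch assumes they need removing too in order to reach the expected length.

```python
# Stand-in for the text returned by myfile.read(); the real file is assumed
# to hold groups of 10 digits separated by spaces and line breaks.
sample_text = "1.4142135623 7309504880 1688724209\n6980785696 7187537694\n"

# Start from an empty string, as the exercise asks, then store the cleaned text.
sqroot_2_string = ""
sqroot_2_string = sample_text.replace(" ", "").replace("\n", "")

# For the full file, the exercise says this length should be 10002.
print(len(sqroot_2_string))

# "1." plus the first 10 digits is the first 12 characters.
print("The first 10 digit of square root of 2 is " + sqroot_2_string[:12] + "...")
```

With the real file, replace sample_text with the txt variable from the question's first snippet.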

Related

Python: how to avoid the space in dna calculated?

I am using python 2.7.
I want to find the length of a DNA sequence, but I have no idea where the mistake is. The length of the DNA is supposed to be 283, but it comes out as 345.
The sequence printed on a single line looks right; only the length is wrong.
I think the spaces are being counted too. How can I get the length of the DNA without including the spaces?
Thank you.
import re
singleSeq = ""
fh = open("seq.embl.txt")
lines = fh.readlines()
for line in lines:
    lines = line.strip()
    m = re.match(r"\s+(.[^\d]+)\s+\d+", line)
    if m:
        print(m.group(0))
        seqline = m.group(1)
        print(seqline)
        singleSeq += seqline
print("\nSequence in a single line: ")
# print(line.strip(singleSeq))
print(singleSeq)
print("\nSequence length: ", len(singleSeq))
Output
Sequence in a single line:
cccatgtccc agcggcgtat tgctttgcat cgcgaacgca ctttcaatgt cccagcggcg tattgcttct attttataag taccagctaa attttttttt tttttttata agtaccagct aaaatttttt tttttttttt ttataagtac cagctaaaat tttttttttt tttttttata agtaccagct aaaatttttt ttttttttta taagttccag cggcgtattg ctttctgaaa tttaaaaaaa aaaaaaaatt tttttttaat aatatattat ata
Sequence length: 345
This should do the trick
# Python3 code to remove whitespace
def remove(string):
    return string.replace(" ", "")
# Driver Program
string = ' t e s t '
print(remove(string))
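Applied to the DNA question, the same idea could be sketched as below. The variable single_seq is a short, hypothetical excerpt rather than the real sequence. The sketch uses split() and join() instead of replace(" ", "") so that tabs and line breaks are dropped as well; that is a swapped-in variation, not what the answer above does.

```python
# Hypothetical excerpt of the sequence text, with blanks between 10-base groups.
single_seq = "cccatgtccc agcggcgtat tgctttgcat\ncgcgaacgca ata"

# str.split() with no arguments splits on any run of whitespace
# (spaces, tabs, line breaks), so joining the pieces drops all of it.
bases_only = "".join(single_seq.split())

print("Sequence length with whitespace:", len(single_seq))
print("Sequence length, bases only:", len(bases_only))
```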
It seems you are reinventing the wheel here. I strongly suggest you try Biopython for this:
from Bio import SeqIO
record = SeqIO.read("seq.embl.txt", "embl")
print("\nSequence length: ", len(record))

how to replace (or delete) a part of string from txt file in python

I am very new to Python (and programming in general), and here is my issue. I would like to replace (or delete) part of a string in a txt file that contains hundreds or thousands of lines. Each line starts with the very same string, which I want to delete.
I have not found a method to delete it, so I tried to replace it with an empty string, but for some reason it doesn't work.
Here is what I have written:
file = "C:/Users/experimental/Desktop/testfile siera.txt"
siera_log = open(file)
text_to_replace = "Chart: Bar Backtest: NQU8-CME [CB] 1 Min #1 | Study: free dll = 0 |"
for each_line in siera_log:
    new_line = each_line.replace("text_to_replace", " ")
    print(new_line)
When I print it to check whether it worked, I can see that the lines are as they were before; no change was made.
Can anyone help me find out why?
The problem is that you're passing the string literal "text_to_replace" rather than the variable text_to_replace.
But since, as you say, each line starts with the very same string you want to delete, for this specific problem you could just remove the first n characters from each line:
text_to_replace = "Chart: Bar Backtest: NQU8-CME [CB] 1 Min #1 | Study: free dll = 0 |"
n = len(text_to_replace)
for each_line in siera_log:
    new_line = each_line[n:]
    print(new_line)
If you quote a variable it becomes a string literal and won't be evaluated as a variable.
Change your line for replacement to:
new_line = each_line.replace(text_to_replace, " ")
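To make the difference concrete, here is a small Python 3 sketch. The two sample lines and the list name cleaned are made up for illustration; only text_to_replace comes from the question. It also guards the slice with startswith(), which is an extra precaution the answers above do not include.

```python
# Hypothetical log lines; the real file is assumed to repeat the same prefix.
text_to_replace = "Chart: Bar Backtest: NQU8-CME [CB] 1 Min #1 | Study: free dll = 0 |"
sample_lines = [
    text_to_replace + " entry at 7450.25\n",
    text_to_replace + " exit at 7462.00\n",
]

cleaned = []
for each_line in sample_lines:
    # Quoted, this searches for the literal characters text_to_replace,
    # which never occur in the line, so nothing changes.
    unchanged = each_line.replace("text_to_replace", "")
    assert unchanged == each_line

    # Unquoted, the variable's value is used; slice only when the prefix is there.
    if each_line.startswith(text_to_replace):
        each_line = each_line[len(text_to_replace):]
    cleaned.append(each_line.strip())

print(cleaned)
```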

Python String Replace Error

I have a python script that keeps returning the following error:
TypeError: replace() takes at least 2 arguments (1 given)
I cannot for the life of me figure out what is causing this.
Here is part of my code:
inHandler = open(inFile2, 'r')
outHandler = open(outFile2, 'w')
for line in inHandler:
    str = str.replace("set([u'", "")
    str = str.replace("'", "")
    str = str.replace("u'", "")
    str = str.replace("'])", "")
    outHandler.write(str)
inHandler.close()
outHandler.close()
Everything shown within double quotation marks needs to be replaced with nothing.
So set([u' should become an empty string.
This is what you want to do:
for line in inHandler:
    line = line.replace("set([u'", "")
    line = line.replace("'", "")
    line = line.replace("u'", "")
    line = line.replace("'])", "")
    outHandler.write(line)
In the documentation, wherever it says something like str.replace(old, new[, count]), str stands for whatever string you are working with. In fact, str is the name of the built-in string type, so you never want to change what it means by assigning anything to it.
line = line.replace("set([u'", "")
The line on the left of the = sets the variable equal to the new, improved string.
The line in front of .replace is the string you want to change.
There are two ways to call replace.
Let us start by defining a string:
In [19]: s = "set([u'"
We can call the replace method of string s:
In [20]: s.replace("u'", "")
Out[20]: 'set(['
Or, we can call the replace of the class str:
In [21]: str.replace(s, "u'", "")
Out[21]: 'set(['
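The latter way requires three arguments because str is the class rather than a particular string, so the string to operate on must be passed explicitly as the first argument. That is why you received the error about missing arguments.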
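The two call styles can be compared side by side in a short Python 3 sketch. The names via_instance and via_class are illustrative only. Note that Python 3 words the TypeError message differently from the Python 2 message quoted in the question, so the sketch prints whatever message the interpreter gives.

```python
s = "set([u'"

# Method call on an instance: the string itself is supplied implicitly.
via_instance = s.replace("u'", "")

# Call through the class: the string must be passed as the first argument.
via_class = str.replace(s, "u'", "")

print(via_instance == via_class)

# With only two arguments, the first is taken as the string to operate on
# and one required argument is missing, so Python raises TypeError.
try:
    str.replace("set([u'", "")
except TypeError as exc:
    print("TypeError:", exc)
```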
What went wrong
Consider the code:
for line in inHandler:
    str = str.replace("set([u'", "")
    str = str.replace("'", "")
    str = str.replace("u'", "")
    str = str.replace("'])", "")
First, note the goal is to replace text in line but nowhere in the calls to replace is the variable line used for anything.
The first call to replace generates the error:
>>> str.replace("set([u'", "")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: replace() takes at least 2 arguments (1 given)
Used in the above form, str.replace interprets its first argument as the string to replace. It is as if you wrote:
"set([u'".replace("")
In other words, it thinks that set([u' is the string to operate on and that the replace function was given just one argument: the empty string. That is why the message is replace() takes at least 2 arguments (1 given).
What you need is to operate on the variable line:
line = line.replace("set([u'", "")
And so on for the remaining lines in the loop.
Conclusion: you have two issues:
error: wrong syntax for str.replace
misuse: rebinding the Python built-in name str
Issue 1, error: wrong syntax for str.replace
For the error TypeError: replace() takes at least 2 arguments (1 given), the root cause is this line of your code:
str = str.replace("set([u'", "")
The intention is to use str.replace to do the replacement.
One correct syntax is:
newStr = oldStr.replace(fromStr, toStr[, replaceCount])
Applied to your code, that could be:
replacedLine = line.replace("set([u'", "")
Note: another syntax is:
newStr = str.replace(oldStr, fromStr, toStr[, replaceCount])
For full details, please refer to another post's answer.
Issue 2, misuse: rebinding the Python built-in name str
Background knowledge:
str is not a reserved keyword, but it is the name of the built-in string class in Python 3.
str has many built-in methods, such as str.replace() and str.lower().
This also means you should NOT use str as an ordinary variable or function name.
So your code:
for line in inHandler:
    str = str.replace("set([u'", "")
should change to (something like this):
for line in inHandler:
    newLine = line.replace("set([u'", "")
or
for line in inHandler:
    newLine = str.replace(line, "set([u'", "")
I think this should work.
for s in inHandler:
    s = s.replace("set([u'", " ") ## notice the space between the quotes
    s = s.replace("'", " ")
    s = s.replace("u'", " ")
    s = s.replace("'])", " ")
Please refrain from using built-in data types as variables (like you have used str).
You could compress the four lines of replace code
str = str.replace("set([u'", "")
str = str.replace("'", "")
str = str.replace("u'", "")
str = str.replace("'])", "")
as,
str = re.sub(r"set\(\[u'|u'|'\]\)|'", r"", str)
Example:
>>> import re
>>> string = "foo set([u'bar'foobaru''])"
>>> string = re.sub(r"set\(\[u'|u'|'\]\)|'", r"", string)
>>> string
'foo barfoobar'
I modified your code as below:
inHandler = open('test.txt', 'r')
outHandler = open('test1.txt', 'w')
data = ''
for line in inHandler.readlines():
    print 'src:' + line
    line = line.replace("set([u'", "")
    line = line.replace("u'", "")
    line = line.replace("'])", "")
    line = line.replace("'", "")
    data += line
    print 'replace:' + line
outHandler.write(data)
inHandler.close()
outHandler.close()
And I tested it. Result:
src:set([u'adg',u'dafasdf'])
replace:adg,dafasdf
src:set([u'adg',u'dafasdf'])
replace:adg,dafasdf
src:set([u'adg',u'dafasdf'])
replace:adg,dafasdf
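One detail worth noticing in the modified code is that the replacements are applied in a different order than in the question. The Python 3 sketch below compares the two orders on the sample line from the test above; the helper apply_in_order is a hypothetical name introduced only for this illustration.

```python
line = "set([u'adg',u'dafasdf'])"

def apply_in_order(text, patterns):
    """Remove each pattern from text, one after another."""
    for pattern in patterns:
        text = text.replace(pattern, "")
    return text

# Order used in the question: the lone quote goes second, so the later
# patterns u' and ']) no longer have a quote to match.
question_order = apply_in_order(line, ["set([u'", "'", "u'", "'])"])

# Order used in the modified code: the lone quote goes last.
modified_order = apply_in_order(line, ["set([u'", "u'", "'])", "'"])

print(question_order)
print(modified_order)
```

Removing the bare quote first leaves a stray u and a trailing ]) behind, which is why the more specific patterns should be handled before it.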
If you are referring to the built-in string method str.replace in Python, its documentation says:
str.replace(old, new[, count])
Return a copy of the string with all occurrences of substring old replaced by new.
If the optional argument count is given, only the first count occurrences are replaced.
This means you need to give it two values.
Since you are using line as the loop variable, you can try:
local_string = ''
for line in inHandler:
    local_string = line
    local_string = local_string.replace("set([u'", "")
    local_string = local_string.replace("'", "")
    local_string = local_string.replace("u'", "")
    local_string = local_string.replace("'])", "")

Encrypting the lines in a file

I'm trying to write a program that opens a text file, and shifts each of the characters in the file 5 characters to the right. It should only do this for alphanumeric characters, and leave nonalphanumerics as they are. (ex: C becomes H) I'm supposed to be using the ASCII table to do this, and I'm having an issue when the characters wrap around. ex: w should become b, but my program gives me a character that's in the ASCII table. Another issue I'm having is that all the characters are printing on separate lines and I'd like them all to print on the same line.
I can't use lists or dictionaries.
This is what I have, I'm not sure how to do the final if statement
def main():
    fileName= input('Please enter the file name: ')
    encryptFile(fileName)
def encryptFile(fileName):
    f= open(fileName, 'r')
    line=1
    while line:
        line=f.readline()
        for char in line:
            if char.isalnum():
                a=ord(char)
                b= a + 5
                #if number wraps around, how to correct it
                if
                    print(chr(c))
                else:
                    print(chr(b))
            else:
                print(char)
Using str.translate:
In [24]: import string
In [25]: string.uppercase
Out[25]: 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
In [26]: string.uppercase[5:]+string.uppercase[:5]
Out[26]: 'FGHIJKLMNOPQRSTUVWXYZABCDE'
In [27]: table = string.maketrans(string.uppercase, string.uppercase[5:]+string.uppercase[:5])
In [28]: 'CAR'.translate(table)
Out[28]: 'HFW'
In [29]: 'HELLO'.translate(table)
Out[29]: 'MJQQT'
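The session above is Python 2 (string.uppercase and string.maketrans). A rough Python 3 equivalent, assuming the same five-place shift on uppercase letters only, would use string.ascii_uppercase and str.maketrans:

```python
import string

upper = string.ascii_uppercase
# Map each uppercase letter to the one five places later, wrapping at Z.
table = str.maketrans(upper, upper[5:] + upper[:5])

print("CAR".translate(table))
print("HELLO".translate(table))
```

Characters that are not in the table, such as spaces or punctuation, pass through translate() unchanged.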
First, it matters whether the character is lowercase or uppercase. I am going to assume here that all the characters are lowercase (if they aren't, it would be easy enough to convert them).
if b > 122:
    b = b - 122  # z = 122
    c = b + 96   # a = 97
In ASCII (decimal), w = 119 and z = 122, so 119 + 5 = 124 and 124 - 122 = 2, which is our new b. We then add that to a - 1, which is 96 (this takes care of the case where we get 1 back): 2 + 96 = 98, and 98 is b.
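The same wrap-around can be written with the modulo operator, which avoids the separate if test. The Python 3 sketch below is one possible way to do it without lists or dictionaries; the function names shift_char and encrypt_line are made up here, and wrapping digits within 0 to 9 is an assumption, since the question does not say how digits should wrap.

```python
def shift_char(char, shift=5):
    """Shift ASCII letters and digits by `shift`, wrapping within each group."""
    if "a" <= char <= "z":
        base, size = ord("a"), 26
    elif "A" <= char <= "Z":
        base, size = ord("A"), 26
    elif "0" <= char <= "9":
        base, size = ord("0"), 10
    else:
        return char  # leave everything else untouched
    # Offset within the group, shifted and wrapped with the modulo operator.
    return chr(base + (ord(char) - base + shift) % size)

def encrypt_line(line):
    """Build the shifted line by string concatenation (no lists or dicts)."""
    result = ""
    for char in line:
        result += shift_char(char)
    return result

print(encrypt_line("Cow 7!"))
```

Printing one whole encrypted line at a time also keeps the characters on the same line. For lines read from a file, which already end in a line break, print(encrypt_line(line), end="") avoids doubling it.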
For the printing on the same line, instead of printing when you have them, I would write them to a list, then create a string from that list.
e.g. instead of
    print(chr(c))
else:
    print(chr(b))
I would do
    someList.append(chr(c))
else:
    someList.append(chr(b))
then join each element of the list together into one string.
You could create a dictionary to handle it:
import string
s = string.lowercase + string.uppercase + string.digits + string.lowercase[:5]
encryptionKey = {s[i]:s[i+5] for i in range(len(s)-5)}
The final addend to s (+ string.lowercase[:5]) adds the first 5 letters into the key. Then, we use a simple dictionary comprehension to create a key for the encryption.
Put into your code (I also changed it so you iterate through the lines rather than using f.readline()):
import string
def main():
    fileName= input('Please enter the file name: ')
    encryptFile(fileName)
def encryptFile(fileName):
    s = string.lowercase + string.uppercase + string.digits + string.lowercase[:5]
    encryptionKey = {s[i]:s[i+5] for i in range(len(s)-5)}
    f= open(fileName, 'r')
    line=1
    for line in f:
        for char in line:
            if char.isalnum():
                print(encryptionKey[char])
            else:
                print(char)

How to append two strings in Python?

I have done this operation millions of times, just using the + operator! I have no idea why it is not working this time, it is overwriting the first part of the string with the new one! I have a list of strings and just want to concatenate them in one single string! If I run the program from Eclipse it works, from the command-line it doesn't!
The list is:
["UNH+1+XYZ:08:2:1A+%CONVID%'&\r", "ORG+1A+77499505:ABC+++A+FR:EUR++123+1A'&\r", "DUM'&\r"]
I want to discard the first and the last elements, the code is:
ediMsg = ""
count = 1
print "extract_the_info, lineList ",lineList
print "extract_the_info, len(lineList) ",len(lineList)
while (count < (len(lineList)-1)):
    temp = ""
    # ediMsg = ediMsg+str(lineList[count])
    # print "Count "+str(count)+" ediMsg ",ediMsg
    print "line value : ",lineList[count]
    temp = lineList[count]
    ediMsg += " "+temp
    print "ediMsg : ",ediMsg
    count += 1
    print "count ",count
Look at the output:
extract_the_info, lineList ["UNH+1+XYZ:08:2:1A+%CONVID%'&\r", "ORG+1A+77499505:ABC+++A+FR:EUR++123+1A'&\r", "DUM'&\r"]
extract_the_info, len(lineList) 8
line value : ORG+1A+77499505:ABC+++A+FR:EUR++123+1A'&
ediMsg : ORG+1A+77499505:ABC+++A+FR:EUR++123+1A'&
count 2
line value : DUM'&
DUM'& : ORG+1A+77499505:ABC+++A+FR:EUR++123+1A'&
count 3
Why is it doing so!?
While the two answers are correct (use " ".join()), your problem (besides very ugly python code) is this:
Your strings end in "\r", which is a carriage return. Everything is fine, but when you print to the console, "\r" will make printing continue from the start of the same line, hence overwrite what was written on that line so far.
You should use the following and forget about this nightmare:
''.join(list_of_strings)
The problem is not with the concatenation of the strings (although that could use some cleaning up), but in your printing. The \r in your string has a special meaning and will overwrite previously printed strings.
Use repr(), as such:
...
print "line value : ", repr(lineList[count])
temp = lineList[count]
ediMsg += " "+temp
print "ediMsg : ", repr(ediMsg)
...
to print out your result; that will make sure no special characters mess up the output.
'\r' is the carriage return character. When you're printing out a string, a '\r' will cause the next characters to go at the start of the line.
Change this:
print "ediMsg : ",ediMsg
to:
print "ediMsg : ",repr(ediMsg)
and you will see the embedded \r values.
And while your code works, please change it to the one-liner:
ediMsg = ' '.join(lineList[1:-1])
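Putting the slice, the join and repr() together, a short Python 3 sketch could look like this. The four-element lineList is a shortened, made-up version of the list in the question, and stripping the carriage returns with rstrip("\r") is an optional extra that the answers above do not require.

```python
# Shortened, hypothetical version of the list from the question.
lineList = ["UNH+1'&\r", "ORG+1A'&\r", "FPT+CC'&\r", "DUM'&\r"]

# Slice off the first and last elements, then join the rest with spaces.
ediMsg = " ".join(lineList[1:-1])

# repr() shows the carriage returns as \r instead of sending them to the terminal.
print("ediMsg :", repr(ediMsg))

# Optional: drop the trailing carriage returns before joining.
cleaned = " ".join(part.rstrip("\r") for part in lineList[1:-1])
print("cleaned:", cleaned)
```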
Your problem is printing, and it is not string manipulation. Try using '\n' as last char instead of '\r' in each string in:
lineList = [
"UNH+1+TCCARQ:08:2:1A+%CONVID%'&\r",
"ORG+1A+77499505:PARAF0103+++A+FR:EUR++11730788+1A'&\r",
"DUM'&\r",
"FPT+CC::::::::N'&\r",
"CCD+CA:5132839000000027:0450'&\r",
"CPY+++AF'&\r",
"MON+712:1.00:EUR'&\r",
"UNT+8+1'\r"
]
I just gave it a quick look. It seems your problem arises when you are printing the text. I haven't done such things for a long time, but probably you only get the last line when you print. If you check the actual variable, I'm sure you'll find that the value is correct.
By last line, I'm talking about the \r you got in the text strings.
