Encryption with input and output file - python

I am new to Python and am stuck on the task below. Below is an example of my input file and the output I want.
Input File Message (from online sample)
So pure of heart
And strong of mind
So true of aim with his marshmallow laser
Marshmallow laser
Output File Message
LhtinkXthYtaXTkm
ugWtlmkhgZthYtfbgW
LhtmknXthYtTbftpbmatabltfTklafTeehpteTlXk
FTklafTeehpteTlXk
Below is my code. Any guidance on why it isn't completing the intended task would be helpful. It prints 'wwww'; I believe it is one 'w' for each line.
inputFileName = input("Enter the message to encrypt: ")
key = int( input("Enter the shift key: " ))
outputFileName = input("Enter the output file name: " )
infile=open(inputFileName,"r")
outfile = open( outputFileName, "w" )
sequence=infile.readlines()
alphabet = " ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
shiftedAlphabetStart = alphabet[len(alphabet) - key:]
shiftedAlphabetEnd = alphabet[:len(alphabet) - key]
shiftedAlphabet = shiftedAlphabetStart + shiftedAlphabetEnd
print( alphabet )
print( shiftedAlphabet )
encryptedMessage = ''
for character in sequence:
    letterIndex = alphabet.find( character )
    encryptedCharacter = shiftedAlphabet[letterIndex]
    #print( "{0} -> {1}".format( character, encryptedCharacter ) )
    encryptedMessage = encryptedMessage + encryptedCharacter
print( "The encrypted message is: {0}".format( encryptedMessage ))

If you print(sequence), you'll see that it's a list of lines, not a string.
So when you iterate over it with for character in sequence:, you're not going through the original text character by character; you're going through the list line by line. alphabet.find() cannot find a whole line in the alphabet and returns -1, so every line is encoded as shiftedAlphabet[-1], which is why you get the same letter once per line.
This is because readlines() returns a list of lines.
If you still want to use readlines(), you can join the lines back into one string with something like:
original_text = ''
for line in sequence:
    original_text += line
A better way, however, is to simply change sequence = infile.readlines() to sequence = infile.read().
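As a minimal sketch of the character-by-character approach (not the asker's exact program): the function name shift_text is made up for illustration, and passing characters outside the alphabet (such as "\n") through unchanged is an assumption, since the question does not say what should happen to them. A key of 7 is used because it reproduces the first two lines of the sample output shown in the question.

```python
ALPHABET = " ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

def shift_text(text, key, alphabet=ALPHABET):
    """Encrypt text with the question's shifted-alphabet scheme (hypothetical helper)."""
    # Same construction as in the question: rotate the alphabet by `key`.
    shifted = alphabet[len(alphabet) - key:] + alphabet[:len(alphabet) - key]
    result = []
    for character in text:          # text is one string, so this is per character
        index = alphabet.find(character)
        if index == -1:
            # Assumption: characters not in the alphabet (e.g. "\n") are kept as-is.
            result.append(character)
        else:
            result.append(shifted[index])
    return "".join(result)

sample = "So pure of heart\nAnd strong of mind\n"
print(shift_text(sample, 7))
```

Because the text is processed as one string, the line breaks survive and each line of the input maps to one line of the output.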

Related

I wrote a script and need to fix it so that it doesn't write lines to the generated script in string form (see the problem in the body)

PyPython.py
from Project import *
v = open("Project.py", "w")
w = open("Backup.txt", "w")
PInput = None
DisableCode = "Save"
LineSeek = 0
Line = "None"
while PInput != DisableCode:
    PInput = input(": ")
    if PInput == "Create Line":
        Line = input("Type in the command: ")
        Line = repr(Line)
        v.write(Line + "\n")
        w.write(Line + "\n")
print("Done!")
After running the code, Project.py contains:
'print("Hi")'
It should be:
print("Hi")
What should I change in PyPython.py to get rid of the quotation marks in Project.py?
There is no need for repr(Line) here. The built-in repr() function returns a string containing a printable representation of an object; for a string, that representation includes the surrounding quotes.
Try removing that line, and it will work as expected.
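A quick way to see the difference, as a small sketch: the variable names below are made up, and the string stands in for what input() would return.

```python
line = 'print("Hi")'      # stand-in for the text typed at the prompt

with_repr = repr(line)    # repr() wraps the text in quotes
print(with_repr)          # 'print("Hi")'
print(line)               # print("Hi")
```

Writing line itself (plus "\n") to the file gives the unquoted command.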

Python printing with user defined functions

I'm trying to write code that takes data from a file and writes it out differently. I have most of the code, but when I run it, everything ends up on one line.
import csv
#Step 4
def read_data(filename):
    try:
        data = open("dna.txt", "r")
    except IOError:
        print( "File not found")
    return data
#Step 5
def get_dna_stats(dna_string):
    a_letters = ""
    t_letters = ""
    if "A" in dna_string:
        a_letters.append("A")
    if "T" in dna_string:
        t_letters.append("T")
    nucleotide_content = ((len(a_letters) + len(t_letters))/len(dna_string))
#Step 6
def get_dna_complement(dna_string):
    dna_complement = ""
    for i in dna_string:
        if i == "A":
            dna_complement.append("T")
        elif i == "T":
            dna_complement.append("A")
        elif i == "G":
            dna_complement.append("C")
        elif i == "C":
            dna_complement.append("G")
        else:
            break
    return dna_complement
#Step 7
def print_dna(dna_strand):
    dna_complement = get_dna_complement(dna_strand)
    for i in dna_strand:
        for j in dna_complement:
            print( i + "=" + j)
#Step 8
def get_rna_sequence(dna_string):
    rna_complement = ""
    for i in dna_string:
        if i == "A":
            rna_complement.append("U")
        elif i == "T":
            rna_complement.append("A")
        elif i == "G":
            rna_complement.append("C")
        elif i == "C":
            rna_complement.append("G")
        else:
            break
    return rna_complement
#Step 9
def extract_exon(dna_strand, start, end):
    return (f"{dna_strand} between {start} and {end}")
#Step 10
def calculate_exon_pctg(dna_strand, exons):
    exons_length = 0
    for i in exons:
        exons_length += 1
    return exons_length/ len(dna_strand)
#Step 11
def format_data(dna_string):
    x = "dna_strand"[0:62].upper()
    y = "dna_strand"[63:90].lower()
    z = "dna_strand"[91:-1].upper()
    return x+y+z
#Step 12
def write_results(output, filename):
    try:
        with open("output.csv","w") as csvFile:
            writer = csv.writer(csvFile)
            for i in output:
                csvFile.write(i)
    except IOError:
        print("Error writing file")
#Step 13
def main():
    read_data("dna.txt")
    output = []
    output.append("The AT content is" + get_dna_stats() + "% of the DNA sequence.")
    get_dna_stats("dna_sequence")
    output.append("The DNA complement is " + get_dna_complement())
    get_dna_complement("dna_sequence")
    output.append("The RNA sequence is" + get_rna_sequence())
    get_rna_sequence("dna_sequence")
    exon1 = extract_exon("dna_sequence", 0, 62)
    exon2 = extract_exon("dna_sequence", 91, len("dna_sequence"))
    output.append(f"The exon regions are {exon1} and {exon2}")
    output.append("The DNA sequence, which exons in uppercase and introns in lowercase, is" + format_dna())
    format_data("dna_sequence")
    output.append("Exons comprise " + calculate_exon_pctg())
    calculate_exon_pctg("dna_sequence",[exon1, exon2])
    write_results(output, "results.txt")
    print("DNA processing complete")
#Step 14
if __name__ == "__main__":
    main()
When I run it, it's supposed to output a file that looks like this, but my code ends up putting every word on the top line, like this.
I have a feeling it has to do with the write_results function, but that's all I know about how to write to the file.
The second mistake I'm making is that I'm not calling the functions correctly in the append statements. I've tried concatenating and I've tried formatting the string, but now I'm hitting a roadblock on what I need to do.
When you write to the file, you need to concatenate a '\n' to the end of the string every time you want the text that follows to start on a new line in the written file,
for example:
output.append("The AT content is" + get_dna_stats() + "% of the DNA sequence." + '\n')
To solve your second problem I would change your code to something like this:
temp = "The AT content is" + get_dna_stats() + "% of the DNA sequence." + '\n'
output.append(temp)
A temporary variable is not strictly required: Python evaluates the function call before append() runs either way. Building the string first does make the two steps easier to debug, though. Note that get_dna_stats() must be passed the DNA string and must return a value, and a numeric result has to be converted with str() before it can be concatenated to a string.
read_data() doesn't actually read anything (it just opens the file). It should read the file and return its contents:
def read_data(filename):
    with open(filename, "r") as f:
        return f.read()
get_dna_stats() won't get DNA stats: it doesn't return anything, it doesn't count "A"s or "T"s (it only checks whether they're present), and nucleotide_content is computed but never used or returned. It should probably count and return the result:
def get_dna_stats(dna_string):
    num_a = dna_string.count("A")
    num_t = dna_string.count("T")
    nucleotide_content = (num_a + num_t) / float(len(dna_string))
    return nucleotide_content
get_dna_complement() and get_rna_sequence(): you can't append to a string. Instead use
dna_complement += "T"
... and rather than break, either append a "?" to denote a failed transcription, or raise ValueError("invalid letter in DNA: " + i).
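Put together, the complement function could look roughly like the sketch below. The DNA_COMPLEMENT lookup table is an illustrative choice, not part of the question's code (an if/elif chain works the same way); the letter pairs are the ones from the question, and raising ValueError is one of the two options mentioned above.

```python
# Hypothetical lookup table; the pairs are the ones used in the question.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def get_dna_complement(dna_string):
    """Return the complement strand, built with += instead of append()."""
    dna_complement = ""
    for letter in dna_string:
        if letter not in DNA_COMPLEMENT:
            # Fail loudly instead of silently stopping with break.
            raise ValueError("invalid letter in DNA: " + letter)
        dna_complement += DNA_COMPLEMENT[letter]
    return dna_complement

print(get_dna_complement("ATGC"))  # TACG
```

get_rna_sequence() can follow the same pattern with "A" mapped to "U".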
print_dna() is a bit more interesting. I'm guessing you want to "zip" each letter of the DNA and its complement. Coincidentally, you can use the zip function to achieve just that:
def print_dna(dna_strand):
    dna_complement = get_dna_complement(dna_strand)
    for dna_letter, complement in zip(dna_strand, dna_complement):
        print(dna_letter + "=" + complement)
As for extract_exon(), I don't know what that is, but presumably you just want the substring from start to end, which is achieved by:
def extract_exon(dna_strand, start, end):
    return dna_strand[start:end] # possibly end+1, I don't know exons
I am guessing that in calculate_exon_pctg(), you want exons_length += len(i) to sum the lengths of the exons. You can achieve this with the built-in function sum (applied to the lengths, since sum() cannot add up the strings themselves):
exons_length = sum(len(exon) for exon in exons)
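A minimal sketch of the whole function, assuming the exons are passed in as substrings of the strand (which is what extract_exon() returns once it slices); the example strand is made up:

```python
def calculate_exon_pctg(dna_strand, exons):
    """Fraction of dna_strand covered by the exon substrings."""
    exons_length = sum(len(exon) for exon in exons)
    return exons_length / len(dna_strand)

strand = "AAAATTTTGGGG"                 # made-up 12-letter strand
exons = [strand[0:4], strand[8:12]]     # two 4-letter slices
print(calculate_exon_pctg(strand, exons))
```

The result is a fraction between 0 and 1; multiply by 100 if a percentage is wanted in the output text.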
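In function format_data(), lose the double quotes. You want the variable, not the literal string "dna_strand". The parameter is named dna_string, so the two names also have to match.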
main() doesn't pass any data around. It should pass the result of read_data() to all the other functions (and convert the numeric result of get_dna_stats() with str() before concatenating it):
def main():
    data = read_data("dna.txt")
    output = []
    output.append("The AT content is " + str(get_dna_stats(data)) + "% of the DNA sequence.")
    output.append("The DNA complement is " + get_dna_complement(data))
    output.append("The RNA sequence is " + get_rna_sequence(data))
    ...
    write_results(output, "results.txt")
    print("DNA processing complete")
The key for you at this stage is to understand how function calls work: they take data as input parameters, and they return some results. You need to a) provide the input data, and b) catch the results.
write_results(): from your screenshot, you seem to want to write a plain old text file, yet you use csv.writer() (which writes CSV, i.e. tabular data). To write plain text:
def write_results(output, filename):
    with open(filename, "w") as f:
        f.write("\n".join(output)) # join output lines with newline
        f.write("\n") # extra newline at file's end
If you really do want a CSV file, you'll need to define the columns first, and make all the output you collect fit that column format.
You never told your program to start a new line. You can either append or prepend the special "\n" character to each of your strings, or write it separately after each string. This is already system agnostic: a file opened in text mode translates "\n" to the platform's line separator, so os.linesep is not needed (and os.write() takes a file descriptor and bytes, so it cannot be used on a file object). Your write_results function could look like this:
def write_results(output, filename):
    try:
        with open("output.csv","w") as csvFile:
            writer = csv.writer(csvFile)
            for i in output:
                csvFile.write(i)
                csvFile.write("\n") # Add this line! It ends the current line of output
    except IOError:
        print("Error writing file")
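To check that each string really ends up on its own line, a small round trip like the following can help. This is only a sketch: the result strings are made up, and the temporary directory is a hypothetical location chosen so the example doesn't touch real files.

```python
import os
import tempfile

def write_results(output, filename):
    """Write each string in output on its own line of a plain-text file."""
    with open(filename, "w") as f:
        for line in output:
            f.write(line + "\n")   # "\n" is translated per platform in text mode

results = ["The AT content is 0.5", "The DNA complement is TACG"]  # made-up strings
path = os.path.join(tempfile.mkdtemp(), "results.txt")             # hypothetical location
write_results(results, path)

with open(path) as f:
    lines_read = f.read().splitlines()
print(lines_read)
```

Reading the file back with splitlines() should give the same list that was written.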

How do I split text from a text file at newlines?

I need to encrypt a message. The message follows; it is saved in a file named assignmenttest.txt
Hi my name is Allie
I am a Junior
I like to play volleyball
I need the program to encrypt each line and keep its format. So, I wrote the following program:
fileInputName = input("Enter the file you want to encrypt: ")
key = int(input("Enter your shift key: "))
outputFileName = input("Enter the file name to write to: ")
fileInputOpen = open(fileInputName, "r")
message = fileInputOpen.read()
alphabet = " ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
shiftedStart = alphabet[len(alphabet) - key:]
shiftedEnd = alphabet[:len(alphabet) - key]
shiftedAlphabet = shiftedStart + shiftedEnd
encryptedMessage = ""
for character in message:
    letterIndex = message.split("\n")
    letterIndex = alphabet.find(character)
    encryptedCharacter = shiftedAlphabet[letterIndex]
    #print( "{0} -> {1}".format(character, encryptedCharacter))
    encryptedMessage += encryptedCharacter
print("The encrypted message is: {0}".format(encryptedMessage))
outputFile = open( outputFileName, "w")
print(encryptedMessage, file=outputFile)
outputFile.close()
print("Done writing encrypted message to file {0}".format(outputFileName))
I tried to use a split at \n, but the output is not formatted in three separate lines, instead it is all just one long string of encrypted letters.
Any ideas on how to split the encrypted message at the correct spot and have it display as such? I've tried multiple split methods and none have worked. Thank you so much.
As the other answers have said, you can replace
fileInputOpen = open(fileInputName, "r")
message = fileInputOpen.read()
with
with open(fileInputName, "r") as f:
    messages = f.readlines()
This way, messages will be a list of strings, where each string is the text of a single line in your input file (including its trailing newline, which is stripped below). Then, with some slight modifications to your loop over the characters, you can encrypt each string in your messages list. Here, I replaced your encryptedMessage with currentEncryptedMessage, which is reset for every line, and added encryptedMessages, a list that keeps track of the encrypted version of each string in messages.
encryptedMessages = []
for message in messages:
    currentEncryptedMessage = ""
    for character in message.rstrip("\n"):
        ... # same as code provided
        currentEncryptedMessage += encryptedCharacter
    encryptedMessages.append(currentEncryptedMessage)
When writing to your file, you can iterate through each element in encryptedMessages to print line by line.
with open( outputFileName, "w") as outputFile:
    for message in encryptedMessages:
        print(message, file=outputFile)
And so your output text file will preserve the line breaks from your input file.
Instead of splitting at '\n', you can copy every character in message that is not in alphabet to encryptedMessage unchanged when you encounter one.
for character in message:
    if character not in alphabet:
        encryptedMessage += character
        continue # this takes us back to the beginning of the loop
    letterIndex = alphabet.find(character)
    encryptedCharacter = shiftedAlphabet[letterIndex]
    #print( "{0} -> {1}".format(character, encryptedCharacter))
    encryptedMessage += encryptedCharacter
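The same "leave unknown characters alone" behaviour can also be had without an explicit loop. The sketch below swaps in a different technique, str.maketrans() with str.translate(), rather than the loop from the question; the key value and the variable names encrypt_table and decrypt_table are made up for illustration.

```python
alphabet = " ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
key = 3  # example shift key, chosen arbitrarily

shifted = alphabet[len(alphabet) - key:] + alphabet[:len(alphabet) - key]

# Map each alphabet character to its shifted counterpart (and back).
encrypt_table = str.maketrans(alphabet, shifted)
decrypt_table = str.maketrans(shifted, alphabet)

message = "Hi my name is Allie\nI am a Junior\nI like to play volleyball\n"
encrypted = message.translate(encrypt_table)    # "\n" is not in the table, so it is kept
decrypted = encrypted.translate(decrypt_table)

print(encrypted)
```

Characters that have no entry in the table, such as "\n", pass through translate() untouched, so the three lines stay three lines.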
Try changing:
message = fileInputOpen.read()
to
message = fileInputOpen.readlines()
This makes the read return the file's contents as a list of lines, which lets you do your processing line by line first. Beyond that, if you want to encrypt each character, you'll need another for loop over the characters.
Instead of reading the file all at once, read the lines individually.
f = open("file.txt")
for i in f.readlines():
    print (i)
You'll have to loop over each line and every character you want to un-shift;
The script should only un-shift characters present in alphabet;
Checking for file existence is also a must, or you may get errors if the file doesn't exist.
with open... is the recommended way of reading and writing files in Python.
Here's an approach:
import os
import string
fileInputName = input("Enter the file you want to encrypt: ")
while not os.path.exists(fileInputName):
    fileInputName = input("{} file doesn't exist.\nEnter the file you want to encrypt : ".format(fileInputName))
key = int(input("Enter your shift key (> 0): "))
while key < 1 :
    key = int(input("Invalid shift key value ({}) \nEnter your shift key (> 0): ".format(key)))
fileOutputName = input("Enter the file name to write to: ")
if os.path.exists(fileOutputName) :
    ow = input("{} exists, overwrite? (y/n): ".format(fileOutputName))
    if not ow.startswith("y"):
        fileOutputName = input("Enter the file name to write to: ") # asks for output filename again
alphabet = string.ascii_letters + " "
shiftedStart = alphabet[len(alphabet) - key:]
shiftedEnd = alphabet[:len(alphabet) - key]
shiftedAlphabet = shiftedStart + shiftedEnd
with open(fileOutputName, "w") as outputFile: # opens out file ("w" so that an existing file is overwritten)
    with open(fileInputName, "r") as inFile: # opens in file
        for line in inFile.readlines(): # loop all lines in fileInput
            encryptedCharacter = ""
            for character in line: # loop all characters in line
                if character in alphabet: # un-shift only if character is present in `alphabet`
                    letterIndex = alphabet.find(character)
                    encryptedCharacter += shiftedAlphabet[letterIndex]
                else:
                    encryptedCharacter += character # add the original character un-shifted
            outputFile.write("{}".format(encryptedCharacter)) # write line to outfile

Read special characters from .txt file in python

The goal of this code is to find the frequency of words used in a book.
I am trying to read in the text of a book but the following line keeps throwing my code off:
precious protégés. No, gentlemen; he'll always show 'em a clean pair
specifically the é character
I have looked at the following documentation, but I don't quite understand it: https://docs.python.org/3.4/howto/unicode.html
Here's my code:
import string
# Create word dictionary from the comprehensive word list
word_dict = {}
def create_word_dict ():
    # open words.txt and populate dictionary
    word_file = open ("./words.txt", "r")
    for line in word_file:
        line = line.strip()
        word_dict[line] = 1
# Removes punctuation marks from a string
def parseString (st):
    st = st.encode("ascii", "replace")
    new_line = ""
    st = st.strip()
    for ch in st:
        ch = str(ch)
        if (n for n in (1,2,3,4,5,6,7,8,9,0)) in ch or ' ' in ch or ch.isspace() or ch == u'\xe9':
            print (ch)
            new_line += ch
        else:
            new_line += ""
    # now remove all instances of 's or ' at end of line
    new_line = new_line.strip()
    print (new_line)
    if (new_line[-1] == "'"):
        new_line = new_line[:-1]
    new_line.replace("'s", "")
    # Conversion from ASCII codes back to useable text
    message = new_line
    decodedMessage = ""
    for item in message.split():
        decodedMessage += chr(int(item))
    print (decodedMessage)
    return new_line
# Returns a dictionary of words and their frequencies
def getWordFreq (file):
    # Open file for reading the book.txt
    book = open (file, "r")
    # create an empty set for all Capitalized words
    cap_words = set()
    # create a dictionary for words
    book_dict = {}
    total_words = 0
    # remove all punctuation marks other than '[not s]
    for line in book:
        line = line.strip()
        if (len(line) > 0):
            line = parseString (line)
            word_list = line.split()
            # add words to the book dictionary
            for word in word_list:
                total_words += 1
                if (word in book_dict):
                    book_dict[word] = book_dict[word] + 1
                else:
                    book_dict[word] = 1
    print (book_dict)
    # close the file
    book.close()
def main():
    wordFreq1 = getWordFreq ("./Tale.txt")
    print (wordFreq1)
main()
The error that I received is as follows:
Traceback (most recent call last):
  File "Books.py", line 80, in <module>
    main()
  File "Books.py", line 77, in main
    wordFreq1 = getWordFreq ("./Tale.txt")
  File "Books.py", line 60, in getWordFreq
    line = parseString (line)
  File "Books.py", line 36, in parseString
    decodedMessage += chr(int(item))
OverflowError: Python int too large to convert to C long
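When you open a text file in Python without specifying an encoding, the platform's default encoding is used (on Windows this is typically a legacy "ANSI" code page), so an é stored as UTF-8 is not decoded correctly. Try passing the encoding explicitly; the same applies to the open() call that reads the book:
word_file = open ("./words.txt", "r", encoding='utf-8')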
The best way I could think of is to read each character as its numeric code into a list, and then convert it back to a character. For example, 97 is the ASCII code for "a", so chr(97) returns "a" (and ord("a") returns 97). Note that é is not an ASCII character; its Unicode code point is 233 (0xe9). Check out some online ASCII tables that also provide values for special characters.
Try:
def parseString(st):
    st = st.encode("ascii", "replace")
    # rest of code here
The new error you are getting is because you are calling isalpha on an int (i.e. a number)
Try this:
for ch in st:
    ch = str(ch)
    if any(str(n) in ch for n in (1,2,3,4,5,6,7,8,9,0)) or ' ' in ch or ch.isspace() or ch == u'\xe9':
        print (ch)

Python file handling: output is not as expected.

I tried the following program on a file but didn't get the accurate result.
The decoded file is not the exact copy of the original message.
Some letters are eaten up somewhere.
"""
This Python script encrypts a message in the following fashion:
Don't replace the characters in the even places.
Replace the characters in the odd places by shifting them forward by their place numbers,
and if they exceed 'z', they start again from 'a'.
example: the message "hello" becomes "ieolt" and the message "maya" becomes "naba"
"""
import os
import time
import sys
def openfile(filename): # opens file with name 'filename'
    file_to_open = open(filename,'a+')
    return file_to_open
def readfile(filename): # returns a long string with the info of the message in 'filename' file.
    time.sleep(0.3)
    print "Reading from the file "+filename
    reading_file = openfile(filename)
    read_msg = reading_file.read()
    return read_msg
def decode(msg): # returns decoded message of input message 'msg'.
    """ reverse function of encode(msg) """
    decoded_message = ""
    letters = " abcdefghijklmnopqrstuvwxyz"
    time.sleep(0.5)
    print " Encoding ...."
    print "encoding the message...."
    index_of_msg = 0
    for char in msg.lower():
        if char.isalpha():
            if index_of_msg%2 == 0 :
                decoded_message += letters[(letters.rfind (char)- (index_of_msg+1))%26] # changed msg.rfind(char) to index_of_msg
            else:
                decoded_message += char
        else:
            decoded_message += char
        index_of_msg +=1
    time.sleep(0.5)
    print "decoding completed"
    return decoded_message
def encode(msg): # returns encoded message of input message 'msg'.
    """Clean up work must be done here.."""
    encoded_message = ""
    letters = " abcdefghijklmnopqrstuvwxyz"
    time.sleep(0.5)
    print " Encoding ...."
    print "encoding the message...."
    index_of_msg = 0
    for char in msg.lower():
        if char.isalpha():
            if index_of_msg%2 == 0 :
                encoded_message += letters[(letters.rfind(char)+ (index_of_msg+1))%26] # changed msg.rfind(char) to index_of_msg
            else:
                encoded_message += char
        else:
            encoded_message += char
        index_of_msg +=1
    time.sleep(0.5)
    print "encoding completed"
    return encoded_message
def write(msg,filename): # writes the message 'msg' given to it, to the file named 'filename'.
    print "Opening the file "+filename
    time.sleep(0.4)
    file_output = openfile(filename)
    print filename + " opened and ready to be written"
    time.sleep(0.3)
    print "Writing the encoded message to the file "+filename
    file_output.write(msg)
    file_output.close()
    time.sleep(0.4)
    print "Writing to the file has completed."
def start(): # Starter main function that incorporates all other functions :)
    os.chdir('aaest/')
    clear = lambda: os.system('clear')
    clear()
    print "Hi, Welcome to this Encryption Program. \n"
    filename = raw_input("Enter the file name in which you stored the message: ")
    print "Opening the file " + filename
    time.sleep(0.5)
    openfile(filename)
    print filename +" opened up and ready, retrieving the message from it."
    time.sleep(0.5)
    message = readfile(filename)
    print "The message of the "+filename+" is retrieved."
    time.sleep(0.5)
    encoded_msg = encode(message)
    time.sleep(0.3)
    decoded_msg = decode(encoded_msg)
    encoded_file = raw_input("Enter the name of the output file in which encoded message will be saved :")
    write(encoded_msg,encoded_file)
    decoded_file = raw_input("Enter the name of the output file in which decoded message will be saved :")
    write(decoded_msg,decoded_file)
start()
Can anyone please help me with this.
Part of your problem is that your letters strings begin with space rather than 'a'. So if you have a 'y' as the first character of the string, it gets replaced with a space. Then when you try to decode, the space fails your isalpha check and is not replaced.
There are a number of ways this code could be cleaner, but that's the first logical error I see. Unless I'm missing something, letters = "abcdefghijklmnopqrstuvwxyz" should fix that particular error. Or better yet, use string.ascii_lowercase.
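As a rough sketch of that fix, here are the two functions reduced to the shifting logic only and written for Python 3 (the question's code is Python 2). The helper shift_odd_places and its direction parameter are made up for illustration; the file handling, sleeps and prints are left out.

```python
import string

LETTERS = string.ascii_lowercase  # "abc...z", with no leading space

def shift_odd_places(msg, direction):
    """Shift letters in odd places (1st, 3rd, ...) by their place number."""
    result = ""
    for index, char in enumerate(msg.lower()):
        if char in LETTERS and index % 2 == 0:
            place = index + 1
            # % 26 wraps past 'z' (or before 'a' when decoding).
            result += LETTERS[(LETTERS.index(char) + direction * place) % 26]
        else:
            result += char
    return result

def encode(msg):
    return shift_odd_places(msg, +1)

def decode(msg):
    return shift_odd_places(msg, -1)

print(encode("hello"))          # ieolt
print(decode(encode("maya")))   # maya
```

Because every letter now maps to another letter, decoding sees a letter wherever encoding shifted one, so the round trip returns the (lowercased) original message.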
