Capitalize the first letter after a separator in Python

I am trying to capitalize the first letter after a separator or split (":"). This means that if I have a file with lines like this:
hello:world
iamlearning:python
is:cool
I would like to convert it to this:
hello:World
iamlearning:Python
is:Cool
I looked for information on how to do it and ran some tests, but it did not work for me. I can lowercase all the words, but I cannot capitalize the first letter after a separator. Here is the code:
fname = input("Enter file name: ")
f = open(fname)
s = f.read().strip().lower()
f.close()
f = open(fname,"w")
f.write(s)
f.close()
If someone can help me, I am trying to make a text editor :)
Thanks in advance.

have fun making your combo editor :^)
delimiter = ":"
with open("file.txt", "r", encoding='UTF-8') as f:
    data = [f"{x}{delimiter}{y.capitalize()}" for x,y in [tuple(i.strip().split(delimiter)) for i in f.readlines()]]
with open("file.txt", "w+", encoding='UTF-8') as f:
    f.write("\n".join(data))
before:
username:password1
email:password2
after:
username:Password1
email:Password2
So what's going on here?
Well, let's break down this crazy list comprehension:
data = [tuple(i.strip().split(delimiter)) for i in f.readlines()]
which sets data to:
[('username', 'password1'), ('email', 'password2')]
basically a list of tuples for each combo in the combolist. ("email", "pass")
So, what's a tuple? Think of it as a lightweight ordered list of data that can easily be accessed in loops.
All this line does is split the data into 2 parts so I can easily edit the second in my next line. Which brings us to the second loop...
data = [f"{x}{delimiter}{y.capitalize()}" for x,y in data]
This goes through our list of tuples, data, and just joins them with an f-string. You could just as easily have done (x + delimiter + y.capitalize()).
Obviously you will want to create more modules to use in your tool so you can do the same thing while applying different things. E.g. if you wanted to add an ! to the end of a password you could edit that line and add a function in that applies to y:
import random
def addSpecial(y):
    # append one randomly chosen special character to y
    return f"{y}{random.choice(['!', '?', '*', '$'])}"
f"{x}{delimiter}{addSpecial(y)}"
some other resources:
Tuple unpacking in for loops
Lastly, don't use it yourself to do harm... I know a thing or two because I've seen (or done) a thing or two in that community...
Feel free to contact me via my bio for more information

Split the line, capitalize the second part, rebuild the string:
l = 'iamlearning:python'
a, b = l.split(':')
a + ':' + b.capitalize()
#'iamlearning:Python'
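The two-value unpacking above raises a ValueError when a line contains no separator or more than one. A minimal sketch that splits only on the first separator, assuming lines without a separator should pass through unchanged (the function name capitalize_after is made up for illustration). It upper-cases just the first character, whereas str.capitalize() also lower-cases the rest of the string:

```python
def capitalize_after(line, sep=":"):
    """Capitalize the text following the first `sep`; leave other lines unchanged."""
    head, found, tail = line.partition(sep)
    if not found:
        # No separator on this line, so there is nothing to capitalize
        return line
    # Upper-case only the first character; the rest of the tail is untouched
    return head + found + tail[:1].upper() + tail[1:]


for sample in ["hello:world", "iamlearning:python", "no separator", "a:b:c"]:
    print(capitalize_after(sample))
```

To process a file, the same function could be applied to each stripped line before writing the lines back.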

Related

Sort file by key

I am learning Python 3 and I'm having issues completing this task. It's given a file with a string on each new line. I have to sort its content by the string located between the first hyphen and the second hyphen and write the sorted content into a different file. This is what I tried so far, but nothing gets sorted:
def sort_keys(path, input, output):
    list = []
    with open(path+'\\'+input, 'r') as f:
        for line in f:
            if line.count('-') >= 1:
                list.append(line)
    sorted(list, key = lambda s: s.split("-")[1])
    with open(path + "\\"+ output, 'w') as o:
        for line in list:
            o.write(line)
sort_keys("C:\\Users\\Daniel\\Desktop", "sample.txt", "results.txt")
This is the input file: https://pastebin.com/j8r8fZP6
Question 1: What am I doing wrong with the sorting? I've used it to sort the words of a sentence on the last letter and it worked fine, but here I don't know what I am doing wrong.
Question 2: I feel that writing the content of the input file into a list, sorting the list and then writing that content out afterwards is not very efficient. What is the "pythonic" way of doing it?
Question 3: Do you know any good exercises to learn working with files + folders in Python 3?
Kind regards
Your sorting is fine. The problem is that sorted() returns a list, rather than altering the one provided. It's also much easier to use list comprehensions to read the file:
def sort_keys(path, infile, outfile):
    with open(path+'\\'+infile, 'r') as f:
        inputlines = [line.strip() for line in f.readlines() if "-" in line]
    outputlines = sorted(inputlines, key=lambda s: s.split("-")[1])
    with open(path + "\\" + outfile, 'w') as o:
        for line in outputlines:
            o.write(line + "\n")
sort_keys("C:\\Users\\Daniel\\Desktop", "sample.txt", "results.txt")
I also changed a few variable names, for legibility's sake.
EDIT: I understand that the list can also be sorted in place (list.sort(key=...)), however this way seems more readable to me.
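To make the difference concrete, here is a small sketch with made-up sample lines (the helper name second_field is only for illustration). sorted() builds a new list and leaves its argument alone, while list.sort() reorders the list in place and returns None:

```python
lines = ["a-zeta-1", "b-alpha-2", "c-mid-3"]


def second_field(s):
    # The text between the first and second hyphen
    return s.split("-")[1]


result = sorted(lines, key=second_field)  # new list; `lines` is unchanged
print(lines)
print(result)

lines.sort(key=second_field)  # sorts in place and returns None
print(lines)
```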
First, your data has a couple lines without hyphens. Is that a typo? Or do you need to deal with those lines? If it is NOT a typo and those lines are supposed to be part of the data, how should they be handled?
I'm going to assume those lines are typos and ignore them for now.
Second, do you need to return the whole line? But each line is sorted by the 2nd group of characters between the hyphens? If that's the case...
first, read in the file:
f = open('./text.txt', 'r')
There are a couple ways to go from here, but let's clean up the file contents a little and make a list object:
l = [i.replace("\n","") for i in f]
This will create a list l with all the newline characters removed. This particular way of creating the list is called a list comprehension. You can do the exact same thing with the following code:
l = []
for i in f:
    l.append(i.replace("\n",""))
Now lets create a dictionary with the key as the 2nd group and the value as the whole line. Again, there are some lines with no hyphens, so we are going to just skip those for now with a simple try/except block:
d = {}
for i in l:
    try:
        d[i.split("-")[1]] = i
    except IndexError:
        pass
Now, here things can get slightly tricky. It depends on how you want to approach the problem. Dictionaries are not sorted by key in Python (since Python 3.7 they keep insertion order, which is not the same thing), so there is not a really good way to simply sort the dictionary. ONE way (not necessarily the BEST way) is to create a sorted list of the dictionary keys:
s = sorted([k for k, v in d.items()])
Again, I used a list comprehension here, but you can rewrite that line to do the exact same thing here:
s = []
for k, v in d.items():
    s.append(k)
s = sorted(s)
Now, we can write the dictionary back to a file by iterating through the dictionary using the sorted list. To see what I mean, lets print out the dictionary one value at a time using the sorted list as the keys:
for i in s:
    print(d[i])
But instead of printing, we will now append the line to a file:
o = open('./out.txt', 'a')
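for i in s:
    o.write(d[i] + "\n")
o.close()
Because the newline characters were stripped earlier, the + "\n" part is needed to put each entry on its own line. Also note the difference between 'a' and 'w': 'a' appends to whatever the file already contains, while 'w' empties the file once, at the moment it is opened. Since the file is opened only once, before the loop, either mode writes every line; with 'a', running the script a second time adds another copy of the output.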
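One caveat with the dictionary approach: if two lines share the same key, the later line replaces the earlier one. A rough sketch with made-up sample lines that compares it with sorting the lines directly, which keeps every line that contains a hyphen:

```python
lines = ["x-beta-1", "no hyphen here", "y-alpha-2", "z-beta-3"]

# Dictionary approach: the second "beta" line replaces the first
by_key = {}
for line in lines:
    parts = line.split("-")
    if len(parts) > 1:
        by_key[parts[1]] = line
from_dict = [by_key[k] for k in sorted(by_key)]

# Sorting the lines directly keeps both "beta" lines in their original order
with_key = [line for line in lines if "-" in line]
direct = sorted(with_key, key=lambda s: s.split("-")[1])

print(from_dict)
print(direct)
```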

Python - Get specific characters from text file or from list

I have a text file with this in it
Curtain Open time: 8:00
When I wrote to the file I used this
File.write("Curtain Open Time: " + Var_CurtainOpenTime + "\n")
I used the "\n" to go onto the next line so that more data can be written. "Var_CurtainOpenTime" is a variable; in this case it was "8:00". I have some code to read the line which looks like this:
FileRead = open('File.txt', 'r')
Printing this would read "Curtain Open Time: 8:00".
I want to be able to just get "8:00". I had previously used FileRead.split(" ") to separate each word but after the 8:00 I get ["Curtain", "Open", "Time:", "8:00\n"]. So I believe I would need to remove the first 3 indexes somehow and somehow remove '\n' from the last index. I don't know how I would approach this. Any help?
Try the following:
with open('File.txt') as f:
    # strip the newline, split on whitespace, take the 4th word of the first line
    result = [line.replace('\n','').split()[3:][0] for line in f][0]
or just:
FileRead = open('File.txt', 'r')
result = [line.replace('\n','').split()[3:][0] for line in FileRead][0]
you just need to change from the .split(" ") to .split() and then get the last list item
with open('file.txt') as f:
    print(f.read().split()[-1])
Well once you have the list from the split, you can remove the first 3 terms by doing l=l[3:] (where l is your list). Then you can remove the \n by doing s = s[:-1] where s is your desired string. This is using list slicing. You can look at documentation if you want to understand it further.
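Another option, shown here only as a sketch, is to strip the known label from the front of the line instead of counting words. The function name value_after_label and its default label are assumptions for illustration, and the label has to match the file's capitalization exactly:

```python
def value_after_label(line, label="Curtain Open Time:"):
    """Return the text after `label`, or None if the line does not start with it."""
    line = line.strip()  # removes the trailing "\n" as well
    if not line.startswith(label):
        return None
    return line[len(label):].strip()


print(value_after_label("Curtain Open Time: 8:00\n"))
```

This keeps working if the value itself contains spaces, which word-based indexing would not.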

Trouble with turning .txt file into lists

In a project I am doing I am storing the lists in a .txt file in order to save space in my actual code. I can turn each line into a separate list but I need to turn multiple lines into one list where each letter is a different element. I would appreciate any help I can get. The program I wrote is as follows:
lst = open('listvariables.txt', 'r')
data = lst.readlines()
for line in data:
    words = line.split()
    print(words)
Here is a part of the .txt file I am using:
; 1
wwwwwwwwwww
wwwwswwwwww
wwwsswwwssw
wwskssssssw
wssspkswssw
wwwskwwwssw
wwwsswggssw
wwwswwgwsww
wwsssssswww
wwssssswwww
wwwwwwwwwww
; 2
wwwwwwwww
wwwwwsgsw
wwwwskgsw
wwwsksssw
wwskpswww
wsksswwww
wggswwwww
wssswwwww
wwwwwwwww
If someone could make the program print out two lists that would be great.
You can load the whole file and turn it into a char-list like:
with open('listvariables.txt', 'r') as f:
    your_list = list(f.read())
I'm not sure why you want to do it, tho. You can iterate over string the same way you can iterate over a list - the only advantage is that list is a mutable object but you wouldn't want to do complex changes to it, anyway.
If you want each character in a string to be an element of the final list you should use
myList = list(myString)
If I understand you correctly this should work:
with open('listvariables.txt', 'r') as my_file:
    my_list = [list(line) for line in my_file.read().splitlines()]
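If the goal is one flat list per level, as the question asks, a possible sketch is below. It assumes every level starts with a line beginning with ";" as in the sample file; the function name read_levels and the short sample rows are made up for illustration:

```python
def read_levels(lines):
    """Group rows into levels; a line starting with ';' begins a new level."""
    levels = []
    for raw in lines:
        row = raw.strip()
        if not row:
            continue
        if row.startswith(";"):
            levels.append([])        # start a new, empty level
        elif levels:
            levels[-1].extend(row)   # each character becomes its own element
    return levels


sample = ["; 1\n", "www\n", "wsw\n", "; 2\n", "wg\n", "pw\n"]
levels = read_levels(sample)
for level in levels:
    print(level)
```

Since an open file can be iterated line by line, passing the file object from open('listvariables.txt') in place of sample should work the same way.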

Referring to a list of names using Python

I am new to Python, so please bear with me.
I can't get this little script to work properly:
genome = open('refT.txt','r')
datafile - a reference genome with a bunch (2 million) of contigs:
Contig_01
TGCAGGTAAAAAACTGTCACCTGCTGGT
Contig_02
TGCAGGTCTTCCCACTTTATGATCCCTTA
Contig_03
TGCAGTGTGTCACTGGCCAAGCCCAGCGC
Contig_04
TGCAGTGAGCAGACCCCAAAGGGAACCAT
Contig_05
TGCAGTAAGGGTAAGATTTGCTTGACCTA
The file is opened:
cont_list = open('dataT.txt','r')
a list of contigs that I want to extract from the dataset listed above:
Contig_01
Contig_02
Contig_03
Contig_05
My hopeless script:
for line in cont_list:
    if genome.readline() not in line:
        continue
    else:
        a=genome.readline()
        s=line+a
        data_out = open ('output.txt','a')
        data_out.write("%s" % s)
        data_out.close()
input('Press ENTER to exit')
The script successfully writes the first three contigs to the output file, but for some reason it doesn't seem able to skip "contig_04", which is not in the list, and move on to "Contig_05".
I might seem a lazy bastard for posting this, but I've spent all afternoon on this tiny bit of code -_-
I would first try to generate an iterable which gives you a tuple: (contig, genome):
def pair(file_obj):
    for line in file_obj:
        yield line, next(file_obj)
Now, I would use that to get the desired elements:
wanted = {'Contig_01', 'Contig_02', 'Contig_03', 'Contig_05'}
with open('filename') as fin:
    pairs = pair(fin)
    while wanted:
        p = next(pairs)
        name = p[0].strip()  # drop the trailing newline before comparing
        if name in wanted:
            # write to output file, store in a list, or dict, ...
            wanted.remove(name)
I would recommend several things:
Try using with open(filename, 'r') as f instead of f = open(...)/f.close(). with will handle the closing for you. It also encourages you to handle all of your file IO in one place.
Try to read in all the contigs you want into a list or other structure. It is a pain to have many files open at once. Read all the lines at once and store them.
Here's some example code that might do what you're looking for
from itertools import zip_longest  # called izip_longest in Python 2
# Read in contigs from file and store in list
contigs = []
with open('dataT.txt', 'r') as contigfile:
    for line in contigfile:
        contigs.append(line.rstrip()) #rstrip() removes '\n' from EOL
# Read through genome file, open up an output file
with open('refT.txt', 'r') as genomefile, open('out.txt', 'w') as outfile:
    # Nifty way to sort through fasta files 2 lines at a time
    for name, seq in zip_longest(*[genomefile]*2):
        # compare the contig name to your list of contigs
        if name.rstrip() in contigs:
            outfile.write(name) #optional. remove if you only want the seq
            outfile.write(seq)
Here's a pretty compact approach to get the sequences you'd like.
def get_sequences(data_file, valid_contigs):
    sequences = []
    with open(data_file) as cont_list:
        for line in cont_list:
            if line.startswith(valid_contigs):
                sequence = next(cont_list).strip()
                sequences.append(sequence)
    return sequences
if __name__ == '__main__':
    valid_contigs = ('Contig_01', 'Contig_02', 'Contig_03', 'Contig_05')
    # refT.txt is the genome file from the question
    sequences = get_sequences('refT.txt', valid_contigs)
    print(sequences)
This utilizes the ability of startswith() to accept a tuple as a parameter and check for any matches. If the line matches what you want (a desired contig), it will grab the next line and append it to sequences after stripping out the unwanted whitespace characters.
From there, writing the sequences grabbed to an output file is pretty straightforward.
Example output:
['TGCAGGTAAAAAACTGTCACCTGCTGGT',
'TGCAGGTCTTCCCACTTTATGATCCCTTA',
'TGCAGTGTGTCACTGGCCAAGCCCAGCGC',
'TGCAGTAAGGGTAAGATTTGCTTGACCTA']
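A variation on the same idea, offered only as a sketch: keep the wanted names in a set and compare whole names, so that a name such as Contig_01 cannot accidentally match Contig_010 the way a prefix test can. The function name select_contigs and the shortened sample sequences are made up for illustration:

```python
def select_contigs(genome_lines, wanted):
    """Yield (name, sequence) pairs whose name is in `wanted`."""
    lines = iter(genome_lines)
    for name in lines:
        seq = next(lines, "")  # "" if the input ends right after a name line
        if name.strip() in wanted:
            yield name.strip(), seq.strip()


genome = ["Contig_01\n", "TGCA\n", "Contig_02\n", "GGTT\n", "Contig_03\n", "AACC\n"]
wanted = {"Contig_01", "Contig_03"}
selected = list(select_contigs(genome, wanted))
for name, seq in selected:
    print(name, seq)
```

With real files, the open genome file can be passed as genome_lines and the set built from the stripped lines of the contig list.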

How to write to a specific line in a text file

I hope I'm not reposting (I did research before hand) but I need a little help.
So I'll explain the problem as best as I can.
I have a text file, and inside it I have information in this format:
a 10
b 11
c 12
I read this file and convert it to a dictionary with the first column as the key, and the second as the value.
Now I'm trying to do the opposite, I need to be able to write the file back with modified values in the same format, the key separated by a space, then the corresponding value.
Why would I want to do this?
Well, all the values are supposed to be changeable by the user using the program. So when they do decide to change the values, I need them to be written back to the text file.
This is where the problem is, I just don't know how to do it.
How might I go about doing this?
I've got my current code for reading the values here:
import csv
T_Dictionary = {}
with open(r"C:\NetSendClient\files\nsed.txt",newline = "") as f:
    reader = csv.reader(f, delimiter=" ")
    T_Dictionary = dict(reader)
ok,supposing the dictionary is called A and the file is text.txt i would do that:
W=""
for i in A: # for each key in the dictionary
    W+="{0} {1}\n".format(i,A[i]) # Append to W a dictionary key , a space , the value corresponding to that key and start a new line
with open("text.txt","w") as O:
    O.write(W)
if i understood what you were asking.
however using this method would leave an empty line at the end of the file ,but that can be removed replacing
O.write(W)
with
O.write(W[0:-1])
i hope it helped
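The same round trip can be sketched with str.join, which avoids the trailing newline without slicing. The helper names dump_pairs and load_pairs are made up, and an in-memory io.StringIO stands in for the real file so the example runs anywhere:

```python
import io


def dump_pairs(mapping, fileobj):
    # One "key value" pair per line, with no newline after the last pair
    fileobj.write("\n".join(f"{key} {value}" for key, value in mapping.items()))


def load_pairs(fileobj):
    # Split each non-empty line on the first space only
    return dict(line.split(" ", 1) for line in fileobj.read().splitlines() if line)


buffer = io.StringIO()
dump_pairs({"a": "10", "b": "11", "c": "12"}, buffer)
text = buffer.getvalue()
restored = load_pairs(io.StringIO(text))
print(text)
print(restored)
```

With a real file, the StringIO objects would be replaced by open(path, "w") and open(path) inside with blocks.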
Something like this:
def txtf_exp2(xlist):
    print("\n", xlist)
    t = open("mytxt.txt", "w+")
    # combines a list of lists into a list
    ylist = []
    for i in range(len(xlist)):
        newstr = xlist[i][0] + "\n"
        ylist.append(newstr)
        newstr = str(xlist[i][1]) + "\n"
        ylist.append(newstr)
    t.writelines(ylist)
    t.seek(0)
    print(t.read())
    t.close()
def txtf_exp3(xlist):
    # does the same as the function above but is simpler
    print("\n", xlist)
    t = open("mytext.txt", "w+")
    for i in range(len(xlist)):
        t.write(xlist[i][0] + "\n" + str(xlist[i][1]) + "\n")
    t.seek(0)
    print(t.read())
    t.close()
You'll have to make some changes, but it's very similar to what you're trying to do. M
