Read a file of numbers into a tuple in python? - python

I have a file that has numbers like this in it:
5
10
15
20
I know how to write code that reads the file and puts the numbers into a LIST, but how do I write code that reads the file and puts the numbers into a TUPLE, given that a tuple doesn't support the append function? This is what I have so far:
filename=input("Please enter the filename or path")
file=open(filename, 'r')
filecontents=file.readlines()
tuple1=tuple(filecontents)
print(tuple1)
the output is this:
('5\n', '10\n', '15\n', '20\n')
it should be this:
5,10,15,20

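Try this, which strips the trailing newline from each line and joins the values with commas:
s=','.join(map(str.rstrip,file))
Demo (this version builds a tuple of strings instead):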
filename=input("Please enter the filename or path: ")
file=open(filename, 'r')
s=tuple(map(str.rstrip,file))
print(s)
Example output:
Please enter the filename or path: thefile.txt
('5', '10', '15', '20')

Using with open(..) is recommended to make sure the file is closed once you are done with it. Then use a generator expression to transform the returned list into a tuple.
filename=input("Please enter the filename or path")
with open(filename, 'r') as f:
    lines = f.readlines()
tup = tuple(line.rstrip('\n') for line in lines)
print(tup)

If you are sure they are integers, you can do something like:
filename=input("Please enter the filename or path")
with open(filename, 'r') as f:
    lines = f.readlines()
result = tuple(int(line.strip('\n')) for line in lines)
print(result)
Also, if you have a list, you can always convert it to a tuple:
t = tuple([1,2,3,4])
So you can build the list by appending elements and then convert it to a tuple at the end.
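A minimal sketch of that build-then-convert pattern, assuming the file holds one integer per line. The helper name read_numbers is made up for illustration, and io.StringIO stands in for a real open file so the example is self-contained:

```python
import io

def read_numbers(lines):
    """Collect one int per non-blank line in a list, then freeze it as a tuple."""
    numbers = []
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            numbers.append(int(line))
    return tuple(numbers)

# io.StringIO stands in for an open file here (an assumption for the demo)
sample = io.StringIO("5\n10\n15\n20\n")
result = read_numbers(sample)
print(result)
```

With a real file you would pass the file object from with open(filename) as f: instead of the StringIO object.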

If you already know how to make a list of ints, just convert it to a tuple, as you are already doing in your attempt.
A map object can be passed to tuple() directly, just as a list can:
filename=input("Please enter the filename or path: ")
with open(filename, 'r') as file:
    filecontents=tuple(map(int, file.read().split()))
print(filecontents)
Also, if you use a with statement you don't need to worry about closing the file (your code was missing that part too).

Related

Simple parsing and sorting data from file

Sorry if this has already been answered before; the searches I have done have not been helpful.
I have a file that stores data as such:
name,number
(Although perhaps not relevant to the question, I will have to add entries to this file. I know how to do this.)
My question is for the pythonic(?) way of analyzing the data and sorting it in ascending order. So if the file was:
alex,30
bob,20
and I have to add the entry
carol, 25
The file should be rewritten as
bob,20
carol,25
alex,30
My first attempt was to store the entire file as a string (by read()) and then split by lines to get a list of strings, procedurally split those strings by a comma, and then create a new list of scores then sort that, but this doesn't seem right and fails because I don't have a way to go "back" once I have the order of scores.
I am unable to use libraries for this program.
Edit:
My first attempt I did not test because all it manages to do is sort a list of the scores; I don't know of a way to get the "entries" back.
file = open("scores.txt" , "r")
data = file.read()
list_data = data.split()
data.append([name,score])
for i in range(len(list_data)):
list_scores = list_scores.append(list_data[i][1])
list_scores = sorted(list_scores)
As you can see, this gives me an ascending list of scores, but I do not know where to go from here in order to sort the list of name, score entries.
You will just have to write the sorted entries back to some file, using some basic string formatting:
with open('scores.txt') as f_in, open('file_out.txt', 'w') as f_out:
    entries = [(x, int(y)) for x, y in (line.strip().split(',') for line in f_in)]
    entries.append(('carol', 25))
    entries.sort(key=lambda e: e[1])
    for x, y in entries:
        f_out.write('{},{}\n'.format(x, y))
I'm going to assume you're capable of putting your data into a .csv file in the following format:
Name,Number
John,20
Jane,25
Then you can use csv.DictReader to read each row into a dictionary, with something like:
import csv
with open('name_age.csv', newline='') as csvfile:
    reader = csv.DictReader(csvfile)
    rows = list(reader)  # read the rows while the file is still open
and append to it using
with open('name_age.csv', 'a', newline='') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=['Name', 'Number'])
    writer.writerow({'Name':'Carol','Number':25})
You can then sort the rows with sorted(), for example using operator.itemgetter from Python's built-in operator module as the key.
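Assuming the "built-in operator" meant here is the operator module, a sort by number might look like the sketch below. It assumes the rows have already been parsed into (name, number) pairs with the number converted to int; the sample data is taken from the question.

```python
from operator import itemgetter

# (name, number) pairs, as they might look after parsing the file
entries = [("alex", 30), ("bob", 20), ("carol", 25)]

# itemgetter(1) picks the number out of each pair, so the sort is by number
sorted_entries = sorted(entries, key=itemgetter(1))

for name, number in sorted_entries:
    print("{},{}".format(name, number))
```

Converting the number to int before sorting matters: as strings, '100' would sort before '20'.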
This is a function that takes a filename and sorts the file in place:
def sort_file(filename):
    f = open(filename, 'r')
    text = f.read()
    f.close()
    lines = [i.split(',') for i in text.splitlines()]
    lines.sort(key=lambda x: int(x[1]))  # compare the numbers as ints, not as strings
    lines = [','.join(i) for i in lines]
    text = '\n'.join(lines)
    f = open(filename, 'w')
    f.write(text)
    f.close()

Read and convert row in text file into list of string

I have a text file data.txt that contains 2 rows of text.
first_row_1 first_row_2 first_row_3
second_row_1 second_row_2 second_row_3
I would like to read the second row of the text file and convert the contents into a list of string in python. The list should look like this;
txt_list_str=['second_row_1','second_row_2','second_row_3']
Here is my attempted code;
import csv
with open('data.txt', newline='') as f:
reader = csv.reader(f)
row1 = next(reader)
row2 = next(reader)
my_list = row2.split(" ")
I got the error AttributeError: 'list' object has no attribute 'split'
I am using python v3.
EDIT: Thanks for all the answers. I am sure all of them works. But can someone tell me what is wrong with my own attempted code? Thanks.
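Your code doesn't work because you are calling split on a list, but split is a string method. csv.reader returns each row as a list of fields, and because the file contains no commas, the whole line ends up as the single element of that list. So in your example you would use row2[0] to get the line as a string: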
my_list = row2[0].split(" ")
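To see why row2 is a list, here is a small sketch that uses io.StringIO in place of data.txt (an assumption made only so that the example is self-contained). It also shows that passing delimiter=' ' to csv.reader makes the reader do the splitting itself:

```python
import csv
import io

text = (
    "first_row_1 first_row_2 first_row_3\n"
    "second_row_1 second_row_2 second_row_3\n"
)

# Default delimiter is a comma, so each row is a list holding the whole line
rows_default = list(csv.reader(io.StringIO(text)))
print(rows_default[1])

# With a space delimiter the reader splits the fields directly
rows_space = list(csv.reader(io.StringIO(text), delimiter=' '))
print(rows_space[1])
```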
Alternatively, if you have access to the numpy library you can use loadtxt.
import numpy as np
f = np.loadtxt("data.txt", dtype=str, skiprows=1)
print (f)
# ['second_row_1' 'second_row_2' 'second_row_3']
The result of this is an array as opposed to a list. You can convert the array to a list if you require one:
print (f.tolist())
#['second_row_1', 'second_row_2', 'second_row_3']
Open the file with open().
E.g.
>>> fp = open('temp.txt')
A file object is an iterator over its lines, so call next() to read the first line and ignore it.
>>> next(fp)
'first_row_1 first_row_2 first_row_3\n'
Store the second line in a variable.
>>> second_line = next(fp)
>>> second_line
'second_row_1 second_row_2 second_row_3'
Use the string split method to get the items as a list. split takes an optional separator argument; if none is given, it splits on whitespace.
>>> second_line.split()
['second_row_1', 'second_row_2', 'second_row_3']
And finally close the file.
>>> fp.close()
Note: there are a number of ways to get this output.
But you should attempt it yourself first, as DavidG said in the comments.
with open("file.txt", "r") as f:
    next(f) # skipping first line; will work without this too
    for line in f:
        txt_list_str = line.split()
print(txt_list_str)
Output
['second_row_1', 'second_row_2', 'second_row_3']

Searching for a term in a text file using python

I'm really desperate for some help on this python code please. I need to search for a variable (string), return it and the data present on the same line as the variable data.
I've managed to create a variable and then search for the variable in a text file, however if the data contained in the variable is found in the text file the contents of the whole text file is printed out not the line in which the variable data exists.
This is my code so far, please help:
number = input("Please enter the number of the item that you want to find:")
f = open("file.txt", "r")
lines = f.read()
if lines.find("number"):
print (lines)
else:
f.close
Thank you in advance.
See my changes below:
number = input("Please enter the number of the item that you want to find:")
f = open("file.txt", "r")
lines = f.readlines() # a list of lines; f.read() would return one big string
for line in lines: # check each line instead
    if number in line: # if the number you're looking for is present
        print(line) # print it
f.close()
Alternatively, use a list comprehension (where lines is a list of lines, e.g. from f.readlines()):
lines_containing_number = [line for line in lines if number in line]
This gives you a list of all the lines in the text file that contain the number, and then you can simply print out the contents of the list.
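As a rough sketch, the comprehension can be wrapped in a small helper. The name find_lines and the sample data are invented for illustration; in practice the lines would come from the open file.

```python
def find_lines(lines, term):
    """Return the lines that contain term, with trailing newlines removed."""
    return [line.rstrip('\n') for line in lines if term in line]

# Made-up sample data standing in for the lines of file.txt
sample_lines = ["101 hammer 4.50\n", "102 nails 1.20\n", "205 saw 9.99\n"]

matches = find_lines(sample_lines, "102")
for match in matches:
    print(match)
```

Note that in is a substring test, so searching this sample for 10 would match both the 101 and the 102 lines.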
If you use a with statement, you don't have to close the file; with handles it for you. Otherwise you have to call f.close(). Solution:
number = input("Please enter the number of the item that you want to find:")
with open('file.txt', 'r') as f:
    for line in f:
        if number in line:
            print(line)

Referring to a list of names using Python

I am new to Python, so please bear with me.
I can't get this little script to work properly:
genome = open('refT.txt','r')
datafile - a reference genome with a bunch (2 million) of contigs:
Contig_01
TGCAGGTAAAAAACTGTCACCTGCTGGT
Contig_02
TGCAGGTCTTCCCACTTTATGATCCCTTA
Contig_03
TGCAGTGTGTCACTGGCCAAGCCCAGCGC
Contig_04
TGCAGTGAGCAGACCCCAAAGGGAACCAT
Contig_05
TGCAGTAAGGGTAAGATTTGCTTGACCTA
The file is opened:
cont_list = open('dataT.txt','r')
a list of contigs that I want to extract from the dataset listed above:
Contig_01
Contig_02
Contig_03
Contig_05
My hopeless script:
for line in cont_list:
if genome.readline() not in line:
continue
else:
a=genome.readline()
s=line+a
data_out = open ('output.txt','a')
data_out.write("%s" % s)
data_out.close()
input('Press ENTER to exit')
The script successfully writes the first three contigs to the output file, but for some reason it doesn't seem able to skip "contig_04", which is not in the list, and move on to "Contig_05".
I might seem a lazy bastard for posting this, but I've spent all afternoon on this tiny bit of code -_-
I would first write a generator that yields tuples of (contig, genome):
def pair(file_obj):
    for line in file_obj:
        # strip the trailing newlines so the names compare equal to the wanted set
        yield line.strip(), next(file_obj).strip()
Now, I would use that to get the desired elements:
wanted = {'Contig_01', 'Contig_02', 'Contig_03', 'Contig_05'}
with open('filename') as fin:
    pairs = pair(fin)
    while wanted:
        p = next(pairs)
        if p[0] in wanted:
            # write to output file, store in a list, or dict, ...
            wanted.discard(p[0])  # sets have no forget(); discard() removes the name
I would recommend several things:
Try using with open(filename, 'r') as f instead of f = open(...)/f.close(). with will handle the closing for you. It also encourages you to handle all of your file IO in one place.
Try to read in all the contigs you want into a list or other structure. It is a pain to have many files open at once. Read all the lines at once and store them.
Here's some example code that might do what you're looking for:
from itertools import zip_longest  # named izip_longest in Python 2
# Read in contigs from file and store in list
contigs = []
with open('dataT.txt', 'r') as contigfile:
    for line in contigfile:
        contigs.append(line.rstrip()) #rstrip() removes '\n' from EOL
# Read through genome file, open up an output file
with open('refT.txt', 'r') as genomefile, open('out.txt', 'w') as outfile:
    # Nifty way to sort through fasta files 2 lines at a time
    for name, seq in zip_longest(*[genomefile]*2):
        # compare the contig name to your list of contigs
        if name.rstrip() in contigs:
            outfile.write(name) #optional. remove if you only want the seq
            outfile.write(seq)
Here's a pretty compact approach to get the sequences you'd like.
def get_sequences(data_file, valid_contigs):
    sequences = []
    with open(data_file) as cont_list:
        for line in cont_list:
            if line.startswith(valid_contigs):
                sequence = next(cont_list).strip()  # Python 3; was cont_list.next()
                sequences.append(sequence)
    return sequences
if __name__ == '__main__':
    valid_contigs = ('Contig_01', 'Contig_02', 'Contig_03', 'Contig_05')
    # refT.txt is the genome file in the question; dataT.txt only lists the names
    sequences = get_sequences('refT.txt', valid_contigs)
    print(sequences)
This utilizes the ability of startswith() to accept a tuple as a parameter and check for any matches. If the line matches what you want (a desired contig), it will grab the next line and append it to sequences after stripping out the unwanted whitespace characters.
From there, writing the sequences grabbed to an output file is pretty straightforward.
Example output:
['TGCAGGTAAAAAACTGTCACCTGCTGGT',
'TGCAGGTCTTCCCACTTTATGATCCCTTA',
'TGCAGTGTGTCACTGGCCAAGCCCAGCGC',
'TGCAGTAAGGGTAAGATTTGCTTGACCTA']
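The writing step could be sketched roughly as follows. This assumes the genome file strictly alternates a name line with a sequence line, as in the question's sample. The helper names extract_contigs and write_contigs are made up for illustration, and io.StringIO objects stand in for the real input and output files.

```python
import io

def extract_contigs(genome_lines, wanted):
    """Return (name, sequence) pairs whose name is in wanted.

    Assumes the input strictly alternates name line, sequence line.
    """
    it = iter(genome_lines)
    # zip(it, it) pulls two consecutive lines from the same iterator per step
    return [(name.strip(), seq.strip())
            for name, seq in zip(it, it)
            if name.strip() in wanted]

def write_contigs(pairs, outfile):
    """Write each name and its sequence on consecutive lines."""
    for name, seq in pairs:
        outfile.write("{}\n{}\n".format(name, seq))

# Stand-in for refT.txt, using three records from the question
genome = io.StringIO(
    "Contig_01\nTGCAGGTAAAAAACTGTCACCTGCTGGT\n"
    "Contig_04\nTGCAGTGAGCAGACCCCAAAGGGAACCAT\n"
    "Contig_05\nTGCAGTAAGGGTAAGATTTGCTTGACCTA\n"
)
wanted = {"Contig_01", "Contig_05"}

pairs = extract_contigs(genome, wanted)
out = io.StringIO()  # stand-in for the output file
write_contigs(pairs, out)
print(out.getvalue(), end="")
```

Keeping the wanted names in a set makes each membership test cheap, which may matter with around two million contigs.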

writing a list to a txt file in python

list consists of RANDOM strings inside it
#example
list = [1,2,3,4]
filename = ('output.txt')
outfile = open(filename, 'w')
outfile.writelines(list)
outfile.close()
my result in the file
1234
so now how do I make the program produce the result that I want which is:
1
2
3
4
myList = [1,2,3,4]
with open('path/to/output', 'w') as outfile:
    outfile.write('\n'.join(str(i) for i in myList))
By the way, the list that you have in your post contains ints, not strings.
Also, please NEVER name your variables list or dict or any other built-in type, since doing so shadows the built-in.
writelines() needs a list of strings with line separators appended to them but your code is only giving it a list of integers. To make it work you'd need to use something like this:
some_list = [1,2,3,4]
filename = 'output.txt'
outfile = open(filename, 'w')
outfile.writelines([str(i)+'\n' for i in some_list])
outfile.close()
In Python, file objects are context managers, which means they can be used in a with statement. So you could do the same thing a little more succinctly with the following, which closes the file automatically. It also uses a generator expression (enclosed in parentheses instead of brackets) to do the string conversion, which avoids building a temporary list just to pass to the function.
with open(filename, 'w') as outfile:
    outfile.writelines((str(i)+'\n' for i in some_list))
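The join and writelines approaches can be compared without touching the disk by writing to io.StringIO objects, used here only as stand-ins for real files. One difference worth noting: the join version leaves no newline after the last item, while the writelines version does.

```python
import io

some_list = [1, 2, 3, 4]

# Approach 1: join the items with newlines, then write once
joined = io.StringIO()
joined.write('\n'.join(str(i) for i in some_list))

# Approach 2: writelines with a newline appended to every item
lined = io.StringIO()
lined.writelines(str(i) + '\n' for i in some_list)

print(repr(joined.getvalue()))
print(repr(lined.getvalue()))
```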
