How to read a file line by line? - python

I'm trying to learn Python and I'm working through a problem from a book, but I'm stuck on one question. It asks me to read a file in which each line contains an 'a' or an 's', and I start with a total of 500. If the line contains an 'a', the amount next to it is added to the total; for example, the line "a 20" adds 20. For an 's', the amount is subtracted. In the end I'm supposed to return the total after all the changes have been applied. So far I have:
def NumFile(file):
    infile = open(file,'r')
    content = infile.readlines()
    infile.close()
    add = ('a','A')
    subtract = ('s','S')
After that I'm completely lost as to how to continue.

You need to iterate over the lines of the file. Here is a skeleton implementation:
# ...
with open(filename) as f:
    for line in f:
        tok = line.split()
        op = tok[0]
        qty = int(tok[1])
        # ...
# ...
This places every operation and quantity into op and qty respectively.
I leave it to you to fill in the blanks (# ...).
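For reference, here is one way the blanks could be filled in. This is only a sketch: it assumes every meaningful line looks like "a 20" or "s 15", starts the total at 500 as the question describes, and uses a hypothetical function name, compute_total, which takes any iterable of lines (an open file works too).

```python
def compute_total(lines, start=500):
    """Apply 'a <n>' (add) and 's <n>' (subtract) lines to a running total."""
    total = start
    for line in lines:
        tok = line.split()
        if len(tok) < 2:
            continue  # skip blank or incomplete lines
        op = tok[0].lower()  # treat 'A' like 'a' and 'S' like 's'
        qty = int(tok[1])
        if op == 'a':
            total += qty
        elif op == 's':
            total -= qty
    return total


# Usage with in-memory lines; an open file object can be passed the same way.
sample = ["a 20\n", "s 5\n", "A 100\n"]
print(compute_total(sample))  # 500 + 20 - 5 + 100 = 615
```

Taking an iterable of lines instead of a filename keeps the function easy to try out without creating a file first.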

A variation might be
f = open('myfile.txt','r')
lines = f.readlines()
f.close()
for i in lines:
    i = i.strip() # removes newline characters
    i = i.split() # splits the string on whitespace and stores the pieces as a list
    key = i[0] # an 'a' or an 's'
    val = int( i[1] ) # an integer, which you can now add to some other variable
Try adding print statements to see what's going on. The cool thing about Python is that you can chain multiple calls on a single line. Here is equivalent code:
for i in open('myfile.txt','r').readlines():
    i = i.strip().split()
    key = i[0]
    val = int(i[1])

Related

Looking to partially fill a fixed size array from file in python

I have to create an array with a fixed size and then partially fill it from a data file. The fixed size needs to be 10, and there are three lines in the file. With my current code I get seven items in the array listed as ''. How do I edit this code to only partially fill the array and ignore the empty spots?
MAX_COLORS = 10
colors = [''] * MAX_COLORS
counter = int(0)
filename = input("Enter the name of the color file (primary.dat or secondary.dat): ")
infile = open(filename, 'r')
line = infile.readline()
while line != '' and counter < MAX_COLORS:
    colors[counter] = str.rstrip(line)
    counter = counter + 1
    line = infile.readline()
infile.close()
As @LarrytheLlama comments, try starting from an empty list and using append:
# you do not need str here because line is already a str; call rstrip() on it directly.
colors.append(line.rstrip())
Also, you might want to remove 'counter' if you do not use it for any other purpose.
The problem is in the following line:
colors[counter] = str.rstrip(line)
Should be:
colors[counter] = line.rstrip()
Explanation: Your variable line is an object of type str, and rstrip is a method of str. Calling line.rstrip() returns an rstripped copy of line.
EDIT: Based on your comments, I understand now what results you are looking for. You simply want to read up to 10 lines from the file, or less, and put the values (without newline), into a list.
I've taken the liberty of rewriting your program not just to fix the problems you had, but in addition I cleaned it up and simplified it. I hope that this code shows you a few other tricks that will be useful to you.
MAX_COLORS = 10
colors = []
filename = input("Enter the name of the color file (primary.dat or secondary.dat): ")
infile = open(filename, 'r')
while len(colors) < MAX_COLORS:
    line = infile.readline()
    if not line:
        break
    colors.append(line.rstrip())
infile.close()
print('The colors are:')
for color in colors:
    print(' %s' % color)
The issue with this is that if you were to have an extra newline at the end of your file (which can easily happen by accident), that otherwise empty line would be read in as a color. You probably don't want that. To solve that problem, you could do it like this:
    line = infile.readline().rstrip()
    if not line:
        break
    colors.append(line)
This will cause your program to stop reading at the first blank line, or the end of the file, whichever comes first.
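The same "read at most N lines" idea can also be sketched with itertools.islice. The helper name read_colors is hypothetical, and note one deliberate difference from the snippet above: this version skips blank lines instead of stopping at the first one, which is an assumption about what you want.

```python
from itertools import islice

MAX_COLORS = 10


def read_colors(lines, max_colors=MAX_COLORS):
    """Return up to max_colors non-blank lines, with trailing whitespace removed."""
    stripped = (line.rstrip() for line in lines)
    non_blank = (line for line in stripped if line)
    return list(islice(non_blank, max_colors))


# Usage with in-memory lines; pass an open file object to read from disk.
colors = read_colors(["red\n", "yellow\n", "blue\n", "\n"])
print(colors)  # ['red', 'yellow', 'blue']
```

Because the list only ever holds what was actually read, there are no empty placeholder entries to ignore later.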

How to read specific data from txt file using python

Help me, I have a data.txt file, and I want to read one specific value from it: 112652447744, which is the free space on drive C.
This is the content of my data.txt file:
Caption FreeSpace Size
C: 112652447744 146776518656
D: 295803727872 299962986496
E:
Z:
You can read each line (ignoring the first) and separate it by spaces to get each column. This way you could extract the size and free space for each partition:
with open('data.txt') as f:
    contents = f.read() # the file contents
# split the text into lines and ignore the first (header) line
lines = contents.splitlines()[1:]
part_data = {}
for line in lines:
    columns = line.split() # split the line on whitespace
    if len(columns) != 3:
        continue # skip incomplete lines
    part_data[columns[0]] = (columns[1], columns[2])
That will give you the free space and size for every partition in the dictionary. To get your actual result it'd be:
part_data['C:'][0]
If you only want the second column and second row, ignoring the drive letter, you can reduce it to the following:
with open('data.txt') as f:
    contents = f.read() # the file contents
second_line = contents.splitlines()[1]
second_column = second_line.split()[1]
There you go, but that requires the file to always be formatted the same way. If the second line has fewer than two columns, this will not work and will most likely raise an IndexError.
Note that a_string.split() splits on runs of whitespace and discards empty strings, while a_string.split(' ') splits on every single space, so consecutive spaces produce empty strings in the result.
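A small illustration of that difference, using a made-up row with several spaces between the columns (the exact spacing is an assumption, not taken from the real file):

```python
row = "C:   112652447744  146776518656"

# No argument: a run of whitespace counts as one separator.
print(row.split())     # ['C:', '112652447744', '146776518656']

# Explicit ' ': every single space is a separator, so empty strings appear.
print(row.split(' '))  # ['C:', '', '', '112652447744', '', '146776518656']
```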
Try this:
# Read the file as a list of lines
file = open('path/to/txt','r').readlines()
# For each line
for disk in file:
    fields = disk.split("\t") # assumes the columns are separated by tabs
    # Check that the line has more than one field
    if(len(fields)>1):
        print(fields[1])
Your question doesn't say so, but can I assume that you want to "lookup" the value that represents FreeSpace in the row where the Caption is C:?
Unless you need the data in the file for something else, just read in the file line by line until you get your result. The first line will be your headers row, and you use the position of the 'caption_header' and 'fs_header' to parse each subsequent data row.
In your example, this will mean we want to test if the first value of each row contains C:, and if so the answer we're looking for will be in the second column. If we find the answer then there is no need to search the rest of the rows.
def find_value(caption_header, fs_header, caption_value, fp):
    fs = None
    with open(fp) as fid:
        headers = fid.readline().strip().split()
        for i, h in enumerate(headers):
            if h == caption_header:
                caption_position = i
            if h == fs_header:
                fs_position = i
        for line in fid: # iterate over the remaining data rows
            values = line.strip().split()
            # skip rows too short to hold both columns (e.g. "E:")
            if len(values) <= max(caption_position, fs_position):
                continue
            if values[caption_position] == caption_value:
                fs = values[fs_position]
                break
    return fs
Then use it like this:
fs = find_value('Caption', 'FreeSpace', 'C:', 'data.txt')

How to fetch these below values in python

I have a file in the below format
.aaa b/b
.ddd e/e
.fff h/h
.lop m/n
I'm trying to read this file. My desired output is if I find ".aaa" I should get b/b, if I find ".ddd" I should get e/e and so on.
I know how to fetch 1st column and 2nd column but I don't know how to compare them and fetch the value. This is what I've written.
file = open('some_file.txt')
for line in file:
    fields = line.strip().split()
    print (fields[0]) # This will give 1st column
    print (fields[1]) # This will give 2nd column
This is not the right way of doing things. What approach should I follow?
Any time you want to do lookups, a dictionary is going to be your friend.
You could write a function to load the data into a dictionary:
def load_data(filename):
    result = dict()
    with open(filename, 'r') as f:
        for line in f:
            k,v = line.strip().split() # will fail if not exactly 2 fields
            result[k] = v
    return result
And then use it to perform your lookups like this:
data = load_data('foo.txt')
print(data['.aaa'])
It sounds like what you may want is to build a dictionary mapping column 1 to column 2. You could try:
file = open('some_file.txt')
field_dict = {}
for line in file:
    fields = line.strip().split()
    field_dict[fields[0]] = fields[1]
Then in your other code, when you see '.ddd' you can simply get the reference from the dictionary (e.g. field_dict['.ddd'] should return 'e/e')
Just split each line on whitespace and check whether the first item matches the word you entered. If it does, print the second item from the list.
word = input("Enter the word to search : ")
with open('some_file.txt') as f:
    for line in f:
        m = line.strip().split()
        if m and m[0] == word: # skip blank lines
            print(m[1])
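If a key might be absent, the dictionary approach from the earlier answers can be combined with dict.get so that a lookup does not raise KeyError. This is a sketch; load_pairs is a hypothetical helper name, and silently ignoring malformed lines is an assumption about the desired behaviour.

```python
def load_pairs(lines):
    """Map the first whitespace-separated field of each line to the second."""
    pairs = {}
    for line in lines:
        fields = line.split()
        if len(fields) == 2:  # ignore blank or malformed lines
            pairs[fields[0]] = fields[1]
    return pairs


data = load_pairs([".aaa b/b\n", ".ddd e/e\n", "\n", ".lop m/n\n"])
print(data.get(".ddd"))             # e/e
print(data.get(".zzz", "missing"))  # missing
```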

Python reading file problems

highest_score = 0
g = open("grades_single.txt","r")
arrayList = []
for line in highest_score:
    if float(highest_score) > highest_score:
        arrayList.extend(line.split())
g.close()
print(highest_score)
Hello, I wondered if anyone could help me; I'm having problems here. I have to read in a file which contains 3 lines. The first line is of no use, and neither is the third. The second contains a list of letters, which I have to pull out (for instance all the As, all the Bs, all the Cs, all the way up to G); there are multiple occurrences of each letter. I have to be able to count how many of each there are with this program. I'm very new to this, so please bear with me if the code is wrong. I just wondered if anyone could point me in the right direction on how to pull out the letters on the second line and count them. I then have to do a mathematical calculation with these letters, but I hope to work that out for myself.
Sample of the data:
GTSDF60000
ADCBCBBCADEBCCBADGAACDCCBEDCBACCFEABBCBBBCCEAABCBB
*
You do not read the contents of the file. To do so, use the .read() or .readlines() method on your opened file. .readlines() reads each line in the file separately, like so:
g = open("grades_single.txt","r")
filecontent = g.readlines()
Since it is good practice to close your file as soon as you have read its contents, directly follow with:
g.close()
Another option would be:
with open("grades_single.txt","r") as g:
    content = g.readlines()
The with statement closes the file for you (so you don't need to call the .close() method this way).
Since you need the contents of the second line only, you can choose that one directly:
content = g.readlines()[1]
.readlines() doesn't strip a line of its newline (which usually is \n), so you still have to do so:
content = g.readlines()[1].strip('\n')
The .count() method lets you count items in a list or in a string. So you could do:
dct = {}
for item in content:
    dct[item] = content.count(item)
This can be written more concisely using a dictionary comprehension:
dct = {item:content.count(item) for item in content}
Finally, you can get the highest count and print it:
highest_score = max(dct.values())
print(highest_score)
.values() returns the values of a dictionary, and max, well, returns the largest of them.
Thus the code that does what you're looking for could be:
with open("grades_single.txt","r") as g:
    content = g.readlines()[1].strip('\n')
dct = {item:content.count(item) for item in content}
highest_score = max(dct.values())
print(highest_score)
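As an alternative to calling .count() once per character, the standard library's collections.Counter tallies every letter in a single pass. The sketch below uses the second line of the sample data from the question as an in-memory string rather than reading grades_single.txt.

```python
from collections import Counter

# Second line of the sample data from the question.
content = "ADCBCBBCADEBCCBADGAACDCCBEDCBACCFEABBCBBBCCEAABCBB"

counts = Counter(content)    # maps each letter to how often it occurs
print(counts["A"])           # 9
print(max(counts.values()))  # 15
```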
highest_score = 0
arrayList = []
with open("grades_single.txt") as f:
    # a file object cannot be indexed, so read the lines first
    arrayList.extend(f.readlines()[1].strip())
print (arrayList)
This will show you the second line of that file, one character per list item. It extends arrayList, and then you can do whatever you want with that list.
import re
# opens the file in read mode (and closes it automatically when done)
with open('my_file.txt', 'r') as opened_file:
    # Temporarily stores all lines of the file here.
    all_lines_list = []
    for line in opened_file.readlines():
        all_lines_list.append(line)
# This is the selected pattern.
# It basically means "match a single character from a to g"
# and ignores upper or lower case
pattern = re.compile(r'[a-g]', re.IGNORECASE)
# Which line I want to choose (assuming you only need one line chosen)
line_num_i_need = 2
# (1 is deducted since the first element in python has index 0)
matches = re.findall(pattern, all_lines_list[line_num_i_need-1])
print('\nMatches found:')
print(matches)
print('\nTotal matches:')
print(len(matches))
You might want to check regular expressions in case you need some more complex pattern.
To count the occurrences of each letter I used a dictionary instead of a list. With a dictionary, you can access each letter count later on.
d = {}
g = open("grades_single.txt", "r")
for i,line in enumerate(g):
    if i == 1:
        holder = list(line.strip())
g.close()
for letter in holder:
    d[letter] = holder.count(letter)
for key,value in d.items():
    print("{},{}".format(key,value))
Outputs (the order of the lines may differ):
A,9
C,15
B,15
E,4
D,5
G,1
F,1
One can treat the first line specially (and in this case ignore it) with next inside try: except StopIteration:. In this case, where you only want the second line, follow with another next instead of a for loop.
with open("grades_single.txt") as f:
    try:
        next(f) # discard 1st line
        line = next(f)
    except StopIteration:
        raise ValueError('file does not even have two lines')
# now use line

Making a dictionary from file, first word is key in each line then other four numbers are to be a tuple value

This dictionary is supposed to take the three-letter country code of a country, e.g. GRE for Great Britain, as the key, and the four consecutive numbers after it as a tuple value. It should look something like this:
{GRE:(204,203,112,116)} and so on for every country in the list. The txt file looks like this:
Country,Games,Gold,Silver,Bronze
AFG,13,0,0,2
ALG,15,5,2,8
ARG,40,18,24,28
ARM,10,1,2,9
ANZ,2,3,4,5 etc.;
This isn't actually code; I just wanted to show how it is formatted.
I need my program to skip the first line because it's a header. Here's what my code looks like thus far:
def medals(goldMedals):
    infile = open(goldMedals, 'r')
    medalDict = {}
    for line in infile:
        if infile[line] != 0:
            key = line[0:3]
            value = line[3:].split(',')
            medalDict[key] = value
    print(medalDict)
    infile.close()
    return medalDict
medals('GoldMedals.txt')
Your for loop should be like:
next(infile) # Skip the first line
for line in infile:
    words = line.split(',')
    medalDict[words[0]] = tuple(map(int, words[1:]))
A variation on a theme, I'd convert all the remaining cols to ints, and I'd use a namedtuple:
from collections import namedtuple
with open('file.txt') as fin:
    # The first line names the columns
    lines = iter(fin)
    columns = next(lines).strip().split(',')
    row = namedtuple('Row', columns[1:])
    results = {}
    for line in lines:
        columns = line.strip().split(',')
        results[columns[0]] = row(*(int(c) for c in columns[1:]))
# Results is now a dict to named tuples
This has the nice feature of 1) skipping the first line and 2) providing both offset and named access to the rows:
# These both work to return the 'Games' column
results['ALG'].Games
results['ALG'][0]
with open('path/to/file') as infile:
    next(infile) # skip the header line, which cannot be converted to ints
    answer = {}
    for line in infile:
        k,v = line.strip().split(',',1)
        answer[k] = tuple(int(i) for i in v.split(','))
I think inspectorG4dget's answer is the most readable... but for those playing code golf:
with open('medals.txt', 'r') as infile:
    headers = infile.readline()
    dict([(i[0], tuple(i[1:])) for i in [list(line.strip().split(',')) for line in infile]])
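Since the file is comma-separated, the standard library's csv module is another option. The sketch below is not from any of the answers above; it uses an in-memory stand-in for the medals file (built from the sample rows in the question), and the name medal_dict is only illustrative.

```python
import csv
import io

# Stand-in for the medals file; a real file opened with newline='' works the same way.
sample = io.StringIO(
    "Country,Games,Gold,Silver,Bronze\n"
    "AFG,13,0,0,2\n"
    "ALG,15,5,2,8\n"
)

reader = csv.reader(sample)
next(reader)  # skip the header row
# Country code -> tuple of the remaining columns converted to ints.
medal_dict = {row[0]: tuple(int(n) for n in row[1:]) for row in reader if row}

print(medal_dict["ALG"])  # (15, 5, 2, 8)
```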
