Reading files into dictionaries


I am trying to read a file into a dictionary. Calling line.split() does not work, because each key and its value sit on separate lines of my file and the lines contain several spaces, so the split produces too many values:
    in inventory2
    (item, description) = line.split()
ValueError: too many values to unpack
Here is my text file. Each key is followed by its value on the next line.
Key
A rusty old key, you used it to gain entry to the manor.
A stick
You found it on your way in, it deals little damage.
Health potion
A health potion, it can restore some health.
Any solutions to this would be much appreciated.
def inventory2():
    inventory_file = open("inventory_test.txt", "r")
    inventory = {}
    for line in inventory_file:
        (item, description) = line.split()
        inventory[(item)] = description
        #invenory = {inventory_file.readline(): inventory_file.readline()}
        print(line)
    inventory_file.close

You are looping over each line in the file, so there will never be a line with both key and value. Use the next() function to get the next line for a given key instead:
def inventory2():
    with open("inventory_test.txt", "r") as inventory_file:
        inventory = {}
        for line in inventory_file:
            item = line.strip()
            description = next(inventory_file).strip()
            inventory[item] = description
    return inventory
or, more compact with a dict comprehension:
def inventory2():
    with open("inventory_test.txt", "r") as inventory_file:
        return {line.strip(): next(inventory_file).strip() for line in inventory_file}
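Both versions above fail with StopIteration when the last key has no description line after it. A minimal sketch of a more forgiving variant follows, assuming blank lines should be skipped and a dangling key can be dropped. The name read_pairs and the sample list are made up for illustration.

```python
def read_pairs(lines):
    """Build a dict from alternating key/value lines.

    `lines` is any iterable of strings, for example an open file.
    Blank lines are skipped; a trailing key without a value is ignored.
    """
    # Strip whitespace and drop blank lines first.
    cleaned = (line.strip() for line in lines if line.strip())
    # zip() pulls twice from the same iterator per pair: key, then value.
    return dict(zip(cleaned, cleaned))


# Hypothetical sample data standing in for inventory_test.txt.
sample = [
    "Key\n",
    "A rusty old key, you used it to gain entry to the manor.\n",
    "\n",
    "A stick\n",
    "You found it on your way in, it deals little damage.\n",
    "Health potion\n",  # no description follows, so this key is dropped
]
inventory = read_pairs(sample)
print(inventory)
```

With a real file the call would look like `with open("inventory_test.txt") as f: inventory = read_pairs(f)`.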

Here is another way:
def inventory2():
    with open("inventory_test.txt", "r") as inventory_file:
        lines = inventory_file.readlines()
    inventory = {}
    x = 0
    # Stop before a trailing key that has no description line.
    while x + 1 < len(lines):
        item = lines[x].strip()
        description = lines[x+1].strip()
        inventory[item] = description
        x += 2
    print(inventory)
    return inventory
Outputs:
{'Key': 'A rusty old key, you used it to gain entry to the manor.', 'A stick': 'You found it on your way in, it deals little damage.', 'Health potion': 'A health potion, it can restore some health.'}

Related

Need to print an author of a book by the book's title

I don't know how to print the author of a book when given the book's title. Each book and its author are separated by a pipe character ("|") in the text file. So far I have only worked out how to print the first book's author.
def load_library(a):
    s = open(a,'r')
    while True:
        theline = s.readline()
        razdel = theline.split('|')
        if len(theline) == 0:
            break
        books_authors=razdel.pop(1)
        return books_authors
    return razdel
    s.close()
if __name__=='__main__':
    result = load_library('books.txt')
    print(result)

Your function returns from inside the while loop instead of collecting each author in a list, so only the first line is ever read.
def load_library(a):
    s = open(a,'r')
    result = []
    while True:
        theline = s.readline()
        if len(theline) == 0:
            break
        # strip the trailing newline before splitting on the pipe
        razdel = theline.strip().split('|')
        books_authors = razdel.pop(1)
        result.append(books_authors)
    s.close()
    return result
if __name__=='__main__':
    result = load_library('books.txt')
    print(result)

Assuming that each line of your text file is stored in the form booktitle|author, this should do it.
import csv
# read the pipe-delimited file into a dict: title -> author
# (filename and booktitle are assumed to be defined already)
D={}
with open(filename,'r') as f:
    fcsv=csv.reader(f,delimiter='|')
    for row in fcsv:
        D[row[0]]=row[1]
print(D[booktitle])

If the 'books.txt' file contained these entries:
Title1|Joe Blow
Title2|Jane Doe
Title Number Three|Henry Longwords
You could build a dictionary from it that has the book titles as keys and, for each title, a list of the author(s) who wrote it. Here's what I'm talking about:
from pprint import pprint
def load_library(filename):
    authors_by_book = {}
    with open(filename) as file:
        for line in file:
            # each line is "title|author"
            title, author = line.strip().split('|')
            authors_by_book.setdefault(title, []).append(author)
    return authors_by_book
if __name__=='__main__':
    authors_by_book = load_library('books.txt')
    pprint(authors_by_book)
    print()
    print('Author(s) of {}: {}'.format('Title2', authors_by_book['Title2']))
Output:
{'Title Number Three': ['Henry Longwords'],
'Title1': ['Joe Blow'],
'Title2': ['Jane Doe']}
Author(s) of Title2: ['Jane Doe']
The value associated with each key is a list because a book could have more than one author. This could happen if the same title appeared more than once in the 'books.txt' file, like this:
Deep Throat|Bob Woodward
Deep Throat|Carl Bernstein
You could modify the code to support having multiple authors listed in a single entry, such as:
Deep Throat|Bob Woodward|Carl Bernstein
if you needed/wanted to.
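As a rough sketch of that modification, assuming every field after the first pipe is an author, the line can be unpacked into a title plus a list of authors. The sample entries below are hypothetical, and the function takes any iterable of lines so that it also works on an open file.

```python
def load_library(lines):
    """Map each title to a list of authors.

    Each line is expected to look like `Title|Author[|Author...]`.
    `lines` is any iterable of strings, such as an open file.
    """
    authors_by_book = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        title, *authors = line.split('|')
        # extend() also merges authors when a title appears on several lines
        authors_by_book.setdefault(title, []).extend(authors)
    return authors_by_book


# Hypothetical sample entries.
sample = [
    "Title1|Joe Blow\n",
    "Deep Throat|Bob Woodward|Carl Bernstein\n",
    "Title1|Jane Doe\n",
]
library = load_library(sample)
print(library["Deep Throat"])
# .get() avoids a KeyError for a title that is not in the file
print(library.get("Unknown title", []))
```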

Python 3.X combining similar lines in .txt files together

A question regarding combining values from a text file into a single variable and printing it.
An example I can give is a .txt file such as this:
School, 234
School, 543
I want to know the steps needed to combine both "School" lines into a single variable "school" with a value of 777.
I know that I need to open the .txt file for reading and then split each line with the .split(",") method.
Code Example:
schoolPopulation = open("SchoolPopulation.txt", "r")
for line in schoolPopulation:
    line = line.split(",")
Could anyone please advise me on how to tackle this problem?

Python has a rich standard library with classes for many typical tasks. Counter is what you need in this situation:
from collections import Counter
c = Counter()
with open('SchoolPopulation.txt', 'r') as fh:
    for line in fh:
        name, val = line.split(',')
        c[name] += int(val)
print(c)

Something like this?
schoolPopulation = open("SchoolPopulation.txt", "r")
results = {}
for line in schoolPopulation:
    parts = line.split(",")
    name = parts[0].lower()
    val = int(parts[1])
    if name in results:
        results[name] += val
    else:
        results[name] = val
print(results)
schoolPopulation.close()
You could also use defaultdict and the with keyword.
from collections import defaultdict
with open("SchoolPopulation.txt", "r") as schoolPopulation:
    results = defaultdict(int)
    for line in schoolPopulation:
        parts = line.split(",")
        name = parts[0].lower()
        val = int(parts[1])
        results[name] += val
print(results)
If you'd like to display your results nicely you can do something like
for key in results:
    print("%s: %d" % (key, results[key]))

school = population = prev = ''
pop_count = 0
with open('SchoolPopulation.txt', 'r') as infile:
    for line in infile:
        line = line.split(',')
        school = line[0]
        population = int(line[1])
        if school == prev or prev == '':
            # add the parsed integer, not the raw string in line[1]
            pop_count += population
        else:
            pass #do something else here
        prev = school
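The snippet above compares each school with the previous line, so it only merges lines for the same school when they are adjacent. Assuming that is the intent, itertools.groupby from the standard library expresses the same idea and yields one total per run. The helper names parse and sum_consecutive and the sample lines are made up for this sketch.

```python
from itertools import groupby


def parse(line):
    """Split 'School, 234' into ('School', 234)."""
    name, value = line.split(',')
    return name.strip(), int(value)


def sum_consecutive(lines):
    """Sum populations over runs of adjacent lines that share a school name."""
    records = (parse(line) for line in lines if line.strip())
    return [
        (name, sum(value for _, value in group))
        for name, group in groupby(records, key=lambda record: record[0])
    ]


# Hypothetical sample lines standing in for SchoolPopulation.txt.
sample = ["School, 234\n", "School, 543\n", "College, 100\n"]
print(sum_consecutive(sample))  # [('School', 777), ('College', 100)]
```

If the same school can appear in non-adjacent lines, the Counter or defaultdict answers are the better fit, since groupby starts a new group whenever the key changes.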

read fasta file and compute DNA gc content

First, I want to say that I am new to Python programming. I have spent nearly 20 hours trying to figure this out, but my code still does not work. I have a FASTA file which consists of IDs and DNA sequences. I would like to read in the FASTA data and do some computational work.
The FASTA file reads like this:
>1111886
AACGAACGCTGGCGGCATGCCTAACACATGCAAGTCGAACGA…
>1111885
AGAGTTTGATCCTGGCTCAGAATGAACGCTGGCGGCGTGCCT…
>1111883
GCTGGCGGCGTGCCTAACACATGTAAGTCGAACGGGACTGGG…
I wrote the following code to read in the FASTA data and do some analysis, such as computing the GC content by ID, the average sequence length, etc. I put a detailed description of what I want each method to do in its docstring. I would appreciate any help improving my code, especially with getting the GC content for each ID.
class fasta(object):
    def __init__(self, filename):
        self.filename = filename
        self.num_sequences = None
        self.sequences = {} #{seq_id} = sequence
    def parse_file(self):
        """Reads in the sequence data contained in the filename associated with this instance of the class.
        Stores both the sequence and the optional comment for each ID."""
        with open(self.filename) as f:
            return f.read().split('>')[1:]
    def get_info(self):
        """Returns a description of the class in a pretty string format.
        The description should include the filename for the instance and the number of sequences."""
        for line in file(self.filename, 'r'):
            if line.startswith('>'):
                self.num_sequences += 1
        return self.num_sequences
    def compute_gc_content(self,some_id):
        """compute the gc content for sequence ID some_id. If some_id does not exist, return an appropriate error value"""
        baseFrequency = {}
        for line in file(self.filename, 'r'):
            if not line.startswith(">"):
                for base in sequence:
                    baseFrequency[base] = baseFrequency.get(base,0)+1
                    items = baseFrequency.items()
                    items.sort()
                for i in items:
                    gc=(baseFrequency['G'] + baseFrequency['C'])/float(len(sequence))
        return gc
    def sequence_statistics(self):
        """returns a dictionary containing
        The average sequence length
        The average gc content"""
        baseFrequency = {}
        for line in file(self.filename, 'r'):
            if not line.startswith(">"):
                for base in sequence:
                    baseFrequency[base] = baseFrequency.get(base,0)+1
                    items = baseFrequency.items()
                    items.sort()
                for i in items:
                    gc=(baseFrequency['G'] + baseFrequency['C'])/float(len(sequence))
                    aveseq=sum(len(sequence))/float(self.count)
        return (gc,aveseq)
    def get_all_kmers(self, k=8):
        """get all kmer counts of size k in fasta file. Returns a dictionary with keys equal to the kmers
        and values equal to the counts"""
        t={}
        for x in range (len(self.sequence)+1-k):
            kmer=self.sequence[x:x+k]
            t[kmer]=f.get(kmer,0)+1
        kmers = get_all_kmers(k=8)
        return(t,len(kmers))
    def query_sequence_id(self, some_id):
        """query sequence ids for some_id. If some_id does not exist in the class, return
        a string error message"""
        for line in file(self.filename, 'r'):
            if id in line:
                print "The id exists"
            else:
                print "The id does not exist"

This should be able to read and parse your file, saving it in a dictionary with the IDs as keys and, for each ID, a sub-dictionary holding its info and data.
Please ask if you don't understand any of this.
def parse_file(self):
    """Reads in the sequence data contained in the filename associated with this instance of the class.
    Stores both the sequence and the optional comment for each ID."""
    def parser(filename):
        seq = []
        with open(filename, 'r') as f:
            for line in f:
                if line.startswith('>'):
                    # a new header: emit the record collected so far
                    if seq:
                        yield seqId, seqInfo, ''.join(seq)
                    line = line.split()
                    seqId = line[0][1:]
                    if len(line) > 1:
                        seqInfo = ' '.join(line[1:])
                    else:
                        seqInfo = ''
                    seq = []
                else:
                    seq.append(line.replace('\n', ''))
            # emit the last record in the file
            if seq:
                yield seqId, seqInfo, ''.join(seq)
    sequences = parser(self.filename)
    self.sequences = {sequenceId: {'info': sequenceInfo, 'data': sequenceData} for (sequenceId, sequenceInfo, sequenceData) in sequences}
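Once the records are in a dictionary of that shape, the GC content per ID and the averages the question asks about reduce to a few lines. This is only a sketch under assumptions: it targets Python 3 (true division), the function names are made up, and the two short records are invented sample data in the shape that parse_file() above stores.

```python
def gc_content(sequence):
    """Return the fraction of G and C bases in `sequence` (0.0 if empty)."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return (sequence.count('G') + sequence.count('C')) / len(sequence)


def sequence_statistics(sequences):
    """Average length and average GC content over all parsed records."""
    if not sequences:
        return {'average_length': 0.0, 'average_gc': 0.0}
    data = [record['data'] for record in sequences.values()]
    return {
        'average_length': sum(len(seq) for seq in data) / len(data),
        'average_gc': sum(gc_content(seq) for seq in data) / len(data),
    }


# Hypothetical records in the {'info': ..., 'data': ...} layout.
sequences = {
    '1111886': {'info': '', 'data': 'AACG'},
    '1111885': {'info': '', 'data': 'GGCC'},
}
print(gc_content(sequences['1111886']['data']))  # 0.5
print(sequence_statistics(sequences))
```

Looking up an unknown ID can be handled with `sequences.get(some_id)`, which returns None instead of raising KeyError.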

Python: Reading individual elements of a file

I am attempting to read in individual elements of a file. In this example, the first element of each line is to be the key of a dictionary. The next five elements will be a corresponding value for said key in list form.
max_points = [25, 25, 50, 25, 100]
assignments = ['hw ch 1', 'hw ch 2', 'quiz ', 'hw ch 3', 'test']
students = {'#Max': max_points}
def load_records(students, filename):
    #loads student records from a file
    in_file = open(filename, "r")
    #run until break
    while True:
        #read line for each iteration
        in_line = in_file.readline()
        #ends while True
        if not in_line: break
        #deletes line read in
        in_line = in_line[:-1]
        #initialize grades list
        grades = [0]*len(students['#Max'])
        #set name and grades
        name, grades[0], grades[1], grades[2], grades[3], grades[4] = in_line.split()
        #add names and grades to dictionary
        students[name] = grades
        print name, students[name]
filename = 'C:\Python27\Python_prgms\Grades_list.txt'
print load_records(students, filename)
The method I have now is extremely caveman, and I would like to know what the more elegant, looping method would be. I have been looking for a while, but I can't seem to find the correct method of iteration. Help a brotha out.

Another way of doing it:
def load_records(students, filename):
    with open(filename) as f:
        for line in f:
            line = line.split()
            name = line[0]
            # a list comprehension gives a real list on both Python 2 and 3
            students[name] = [int(grade) for grade in line[1:]]
            print(name, students[name])
It seems a bit strange, though, that the students dictionary holds both the scores and a parameter, #Max: a key is then either a student's name or a parameter's name. It might be better to keep them separate.
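One possible way to keep them separate is sketched below. It assumes that parameter rows, if the file contains any, start with '#' like the '#Max' key in the question; the student names and grades are invented, and the function takes any iterable of lines so it also works on an open file.

```python
def load_records(lines):
    """Return {name: [grade, ...]} from whitespace-separated lines.

    Lines whose first field starts with '#' (such as a '#Max' row) are
    treated as parameters and skipped, so the result holds students only.
    """
    students = {}
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith('#'):
            continue
        name, *grades = fields
        students[name] = [int(grade) for grade in grades]
    return students


# max_points lives in its own variable instead of inside the dictionary.
max_points = [25, 25, 50, 25, 100]
# Hypothetical sample lines standing in for the grades file.
sample = [
    "#Max 25 25 50 25 100\n",
    "Alice 20 25 45 20 90\n",
    "Bob 25 20 40 25 85\n",
]
students = load_records(sample)
for name, grades in students.items():
    print(name, sum(grades), "/", sum(max_points))
```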

I had an assignment similar to this last year.
def load_records(students, filename):
    with open(filename, 'r') as file:
        while True: # loop until end of file is reached
            s = file.readline()
            if not s: # readline() returns '' at end of file, never None
                break
            # manipulate s how you need
Also, you should use inline comments like the ones above. They make the code much easier to read compared to how you have it now.

How do I append to a new object?

I have this function in python 3 that works almost as I want it to work:
def read_people_from_file(filename):
    """Function that reads a file and adds them as persons"""
    print("reading file")
    try:
        with open(filename, 'rU') as f:
            contents = f.readlines()
    except IOError:
        print("Error: Can not find file or read data")
        sys.exit(1)
    #Remove blank lines
    new_contents = []
    for line in contents:
        if not line.strip():
            continue
        else:
            new_contents.append(line)
    #Remove instructions from file
    del new_contents[0:3]
    #Create persons (--> Here is my problem/question! <--)
    person = 1*[None]
    person[0] = Person()
    person[0] = Person("Abraham", "m", 34, 1, 140, 0.9, 90, 0.9, 0.9)
    for line in new_contents:
        words = line.split()
        person.append(Person(words[0], words[1], words[2], words[3], words[4], words[5], words[6], words[7], words[8]))
    return person
In the last chunk of code, below "#Create persons", is a thing that I have not figured out how to do.
How do I create the empty list of persons and then add persons from the file?
If I remove the hard coded person named "Abraham", my code does not work.
The file is a text file with one person per row with the attributes coming after the name.
Part of the Person class looks like this:
class Person:
    def __init__(self, name=None, gender=None, age=int(100 or 0), beauty=int(0), intelligence=int(0), humor=int(0), wealth=int(0), sexiness=int(0), education=int(0)):
        self.name = name
        self.gender = gender
        self.age = age
        self.beauty = beauty
        self.intelligence = intelligence
        self.humor = humor
        self.wealth = wealth
        self.sexiness = sexiness
        self.education = education
I hope that the above code is self explanatory.
I suspect that there is some more pythonian way of doing what I want.
Any help is appreciated.

You can do
persons = []
...
for line in new_contents:
    words = line.split()
    persons.append(Person(...))

There's always:
persons = [Person(*line.split()) for line in new_contents]

This is probably the simplest way to do what you want:
def readfile():
    people = []
    # open the file in read mode; "with" closes it again afterwards
    with open("file path to read from", "r") as data:
        for line in data:  # goes through each line
            # line.split() breaks the line into a list of words, and the *
            # passes them to Person.__init__ as separate arguments
            people.append(Person(*line.split()))
    return people
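Note that str.split() returns strings, so Person(*line.split()) stores the age as '34' rather than 34. A minimal sketch of converting each column before building the object follows. The cut-down Person class, the CONVERTERS tuple, the parse_person helper and the four-column layout are all assumptions made for illustration, not the asker's real file format.

```python
class Person:
    """Cut-down stand-in for the Person class in the question."""

    def __init__(self, name=None, gender=None, age=0, beauty=0.0):
        self.name = name
        self.gender = gender
        self.age = age
        self.beauty = beauty


# One converter per column, in file order (assumed layout).
CONVERTERS = (str, str, int, float)


def parse_person(line):
    """Turn one whitespace-separated line into a Person with typed fields."""
    words = line.split()
    values = [convert(word) for convert, word in zip(CONVERTERS, words)]
    return Person(*values)


# Hypothetical sample rows.
lines = ["Abraham m 34 0.9\n", "Berta f 29 0.7\n"]
people = [parse_person(line) for line in lines]
print(people[0].name, people[0].age + 1)  # Abraham 35
```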
