Find author of python file from docstring - python

I'm trying to write a program with a function that finds and prints the author of a file by looking for the Author string in the docstring. The code below handles a file where the Author string is followed by the author's name, and also one where the Author string is not followed by a name. What I'm having trouble with is printing Unknown when the Author string does not exist at all, i.e. no part of the docstring contains Author.
N.B. lines is just a list constructed by using readlines() on a file.
def author_name(lines):
    '''Finds the authors name within the docstring'''
    for line in lines:
        if line.startswith("Author"):
            line = line.strip('\n')
            line = line.strip('\'')
            author_line = line.split(': ')
            if len(author_line[1]) >=4:
                print("{0:21}{1}".format("Author", author_line[1]))
            else:
                print("{0:21}{1}".format("Author", "Unknown"))

If you are writing a function, then return a value. Do not use print (that is for debugging only). Once you use return, you can return early if you do find the author:
def author_name(lines):
    '''Finds the authors name within the docstring'''
    name = 'Unknown'  # set before the loop so it exists even if lines is empty
    for line in lines:
        if line.startswith("Author"):
            line = line.strip('\n')
            line = line.strip('\'')
            author_line = line.split(': ')
            if len(author_line[1]) >=4:
                name = author_line[1]
            return "{0:21}{1}".format("Author", name) # ends the function, we found an author
    return "{0:21}{1}".format("Author", name)
print(author_name(some_docstring.splitlines()))
The last return statement only executes if there were no lines starting with Author, because if there was, the function would have returned early.
Alternatively, because we default name to Unknown, you can use break as well to end the loop early and leave returning to that last line:
def author_name(lines):
    '''Finds the authors name within the docstring'''
    name = 'Unknown'  # set before the loop so it exists even if lines is empty
    for line in lines:
        if line.startswith("Author"):
            line = line.strip('\n')
            line = line.strip('\'')
            author_line = line.split(': ')
            if len(author_line[1]) >=4:
                name = author_line[1]
            break # ends the `for` loop, we found an author.
    return "{0:21}{1}".format("Author", name)
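For comparison, here is a minimal sketch of the same early-return idea that also tolerates an Author line with no colon. The name find_author and the sample docstring are made up for illustration. It also accepts any non-empty name instead of requiring four characters, which is an assumption rather than part of the original code.

```python
def find_author(lines):
    """Return the author named on the first 'Author' line, or 'Unknown'."""
    for line in lines:
        if line.startswith("Author"):
            # partition never raises: if ':' is missing, the value part is ''
            _, _, value = line.partition(":")
            value = value.strip().strip("'").strip()
            return value or "Unknown"
    # No line started with "Author"
    return "Unknown"


# Illustrative docstring, split into lines the way readlines() output would be used
sample = "'''Demo module\nAuthor: Jane Doe\n'''".splitlines()
print("{0:21}{1}".format("Author", find_author(sample)))
```

Returning the bare name and formatting it at the call site keeps the function reusable when the caller wants something other than the 21-column layout.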

Related

Python 3: search text file with user input?

There is a part of my program where I would like to pass a sorted list of names from a text file to a function that asks the user to enter a name and then indicates whether the name entered is in the list. If the name is found, its position (i.e. index) is also printed.
The file is just 30 names, but when the second function runs, it shows this after I enter the name I want to search for:
name not found.
name not found.
name found
name not found.
name not found.
...etc for all 30 names.
Here's the code:
def main():
    infile = open('names.txt', 'r')
    line = infile.readline()
    while line !='':
        line = line.rstrip('\n')
        print(line)
        line = infile.readline()
    print('\nHere are the names sorted:\n')
    infile = open("names.txt", 'r')
    names = infile.readlines()
    names.sort()
    for line in names:
        line = line.rstrip('\n')
        print(line)
        line = infile.readline()
    search_file(line) # I don't think this is the correct way to
                      # pass the sorted list of names?
def search_file(line):
    search = open('names.txt', 'r')
    user_search = input('\nSearch for a name(Last, First): ')
    #item_index = line.index(search)
    print()
    with open('names.txt', 'r') as f:
        for line in f:
            if user_search in line:
                print('name found')#, item_index)
            else:
                print('name not found.')
Updated code here:
This time it always displays "not found".
def search_file(line):
    user_search = input('\nSearch for a name(Last, First): ')
    print()
    try:
        item_index = line.index(user_search)
        print(user_search, 'found at index', item_index)
    except ValueError:
        print('not found.')
First, open the file you are searching only once. You can load all of its lines into a list with .readlines(), which returns a list containing one string per line. Then search each line for the user's string:
for l in lines:
    if l.find(userstring) > -1:
        foundvar = True
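A likely reason the updated code always prints "not found" is that main() passes line, the empty string left over from the last readline() call, rather than the names list. Lines returned by readlines() also keep their trailing newline, so an exact match with list.index would fail until the newlines are stripped. The sketch below assumes one name per line. The helper names find_name and load_sorted_names and the sample names are illustrative only.

```python
def find_name(names, query):
    """Return the index of query in names, or -1 if it is absent."""
    try:
        return names.index(query)
    except ValueError:
        return -1


def load_sorted_names(path):
    """Read one name per line, drop the trailing newlines, and sort."""
    with open(path) as f:
        return sorted(line.rstrip("\n") for line in f)


# Stand-in for load_sorted_names('names.txt'), so the example runs without a file
names = sorted(["Smith, Anna", "Doe, John", "Brown, Charlie"])
position = find_name(names, "Doe, John")
print("found at index {}".format(position) if position >= 0 else "not found.")
```

Because the search runs over the whole list at once, a single message is printed instead of one per line of the file.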

Parsing with multiple identifiers

I was trying to implement the block of code from "Generator not working to split string by particular identifier. Python 2", but I found two bugs in it that I can't seem to fix.
Input:
#m120204
CTCT
+
~##!
#this_one_has_an_at_sign
CTCTCT
+
#jfik9
#thisoneisempty
+
#empty line after + and then empty line to end file (2 empty lines)
The two bugs are:
(i) when a # starts the line after the '+' line, as in the 2nd entry (#this_one_has_an_at_sign)
(ii) when the line following the #identification_line or the line following the '+' line is empty, as in the 3rd entry (#thisoneisempty)
I would like the output to be the same as in the post that I referenced:
yield (name, body, extra)
in the case of #this_one_has_an_at_sign
name= this_one_has_an_at_sign
body= CTCTCT
quality= #jfik9
in the case of #thisoneisempty
name= thisoneisempty
body= ''
quality= ''
I tried using flags, but I can't seem to fix this issue. I know how to do it without a generator, but I'm going to be working with big files, so I don't want to go down that path. My current code is:
def organize(input_file):
    name = None
    body = ''
    extra = ''
    for line in input_file:
        line = line.strip()
        if line.startswith('#'):
            if name:
                body, extra = body.split('+',1)
                yield name, body, extra
                body = ''
            name = line
        else:
            body = body + line
    body, extra = body.split('+',1)
    yield name, body, extra
for line in organize(file_path):
    print line

Parsing Generator Python 2

Input:
#example1
abcd
efg
hijklmnopq
#example2
123456789
Script:
def parser_function(f):
    name = ''
    body = ''
    for line in f:
        if len(line) >= 1:
            if line[0] == '#':
                name = line
                continue
        body = body + line
        yield name,''.join(body)
for line in parser_function(data_file):
    print line
Output
('#example1', 'abcd')
('#example1', 'abcdefg')
('#example1', 'abcdefghijklmnopq')
('#example2', 'abcdefghijklmnopq123456789')
Desired Output:
('#example1', 'abcdefghijklmnopq')
('#example2', '123456789')
My problem: my generator yields on every line, and I'm not sure where to reset it. I'm having trouble getting the desired output and have tried a few different ways. Any help would be greatly appreciated. I saw some other generators that used "if name:", but they were fairly complicated. I got it to work using that code, but I'm trying to keep my code as small as possible.
You need to change where you yield:
def parser_function(f):
    name = None
    body = ''
    for line in f:
        if line and line[0] == '#':
            if name:
                yield name, body
            name = line
            body = ''  # start a fresh body for the new record
        else:
            body += line
    if name:
        yield name, body
This yields once before every #... and once at the end.
P.S. I've renamed str to body to avoid shadowing a built-in.
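To check the "yield before every # and once at the end" pattern against the sample input, here is a self-contained sketch. It assumes the lines may carry trailing newlines (as they do when iterating over a file) and strips them. The name iter_records and the in-memory io.StringIO input are illustrative stand-ins for the real function and data file.

```python
import io


def iter_records(lines):
    """Yield (name, body) once per '#' header, joining the lines in between."""
    name = None
    parts = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            if name is not None:
                yield name, "".join(parts)  # flush the previous record
            name, parts = line, []          # start a fresh record
        elif line:
            parts.append(line)
    if name is not None:
        yield name, "".join(parts)          # flush the final record


data = io.StringIO(u"#example1\nabcd\nefg\nhijklmnopq\n#example2\n123456789\n")
for record in iter_records(data):
    print(record)
```

Collecting the pieces in a list and joining once per record avoids repeated string concatenation, which can matter for long bodies.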

Python name to/from file

I want the program to display the user's name if they have entered it before. I have this working in C++ but wanted to practice Python. The program always runs the else branch. I have tried having the if statement check for a string such as "noname" or "empty", and it still runs the else branch.
fr = open('sample.txt','r')
name = fr.read()
fr.close()
if name is None:
    fw = open('sample.txt','w')
    stuff = raw_input("enter name:")
    fw.write(stuff)
    fw.close()
else:
    print(name)
If you have a blank file without any data in it, f.read() doesn't return None, it returns an empty string.
So rather than do if name is None you could write if name == '' or, to be certain, if name in (None, '').
You might also want to make sure you add a newline character when you write the names to your file, so you should do:
f.write(name + '\n')
for example.
Edit: As Cat Plus Plus mentioned, you can just do if name:, because both None and an empty string will evaluate to False. I just thought it was less clear for this particular question.
Use with to open files; it closes them automatically:
with open('sample.txt','a+') as names: # if file does not exist, "a" will create it
    names.seek(0) # "a+" may start at the end of the file, so rewind before reading
    lines = names.read()
    if lines: # if file has data
        print("Hello {}".format(lines))
    else: # else file is empty, ask for name and write name to file
        name = raw_input("enter name:")
        names.write(name)
To check if the name exists before writing:
with open('sample.txt','a+') as names:
    names.seek(0) # rewind before reading, as above
    lines = names.read()
    name = raw_input("enter name:")
    if name in lines:
        print("Hello {}".format(name))
    else:
        names.write(name)
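The read-then-ask logic can also be wrapped in a function so it is easy to try without typing at a prompt. In this sketch the name greet_or_store is hypothetical, and the ask parameter is an assumption standing in for raw_input (or input on Python 3). A temporary directory is used so the example does not touch a real sample.txt.

```python
import os
import tempfile


def greet_or_store(path, ask):
    """Return the stored name, or ask for one and save it if none exists."""
    if os.path.exists(path):
        with open(path) as f:
            name = f.read().strip()
        if name:  # an empty file reads as '', which is falsy
            return name
    name = ask("enter name:")
    with open(path, "w") as f:
        f.write(name + "\n")
    return name


path = os.path.join(tempfile.mkdtemp(), "sample.txt")
print(greet_or_store(path, lambda prompt: "Ada"))     # first run: asks, then stores
print(greet_or_store(path, lambda prompt: "unused"))  # second run: reads the file
```

Opening the file separately for reading and for writing sidesteps the question of where an "a+" handle is positioned.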

Cannot add new items into python dictionary

Hi I'm new to python. I am trying to add different key value pairs to a dictionary depending on different if statements like the following:
def getContent(file):
    for line in file:
        content = {}
        if line.startswith(titlestart):
            line = line.replace(titlestart, "")
            line = line.replace("]]></title>", "")
            content["title"] = line
        elif line.startswith(linkstart):
            line = line.replace(linkstart, "")
            line = line.replace("]]>", "")
            content["link"] = line
        elif line.startswith(pubstart):
            line = line.replace(pubstart, "")
            line = line.replace("</pubdate>", "")
            content["pubdate"] = line
    return content
print getContent(list)
However, this always returns the empty dictionary {}.
I thought it was variable scope issue at first but that doesn't seem to be it. I feel like this is a very simple question but I'm not sure what to google to find the answer.
Any help would be appreciated.
You reinitialize content on every line, so anything stored earlier is discarded. Move the initialization outside of the loop:
def getContent(file):
    content = {}
    for line in file:
        etc.
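A runnable sketch of the corrected shape follows. The post does not show the values of titlestart, linkstart or pubstart, so the TITLE_START and LINK_START markers and the sample feed lines below are hypothetical. The function is named get_content here only to keep it distinct from the original.

```python
# Hypothetical markers; the original post does not show their values.
TITLE_START = "<title><![CDATA["
LINK_START = "<link><![CDATA["


def get_content(lines):
    content = {}  # created once, so entries survive across iterations
    for line in lines:
        line = line.strip()
        if line.startswith(TITLE_START):
            content["title"] = line.replace(TITLE_START, "").replace("]]></title>", "")
        elif line.startswith(LINK_START):
            content["link"] = line.replace(LINK_START, "").replace("]]>", "")
    return content


# Made-up input lines for illustration
feed = [
    "<title><![CDATA[Hello]]></title>",
    "<link><![CDATA[http://example.com]]>",
    "<description>ignored</description>",
]
print(get_content(feed))
```

With the dictionary created before the loop, a final line that matches nothing (such as the description line) no longer wipes out the earlier entries.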
