Deleting a line in file containing exact string (Python)

import re
print "List of names:"
f=open('names.txt','r') #look below
lines = f.readlines()
for line in lines:
    info = line.split('|')
    names = info[0]
    print names
name = raw_input("Enter the name of the person you want to delete: ")
f.close()
f = open('names.txt','w')
for line in lines:
    if not re.match(name,line):
        f.write(line)
        break
print "That person doesn't exist!"
names.txt :
John|22|Nice
Johnny|55|Better than John
Peter|25|The worst
So, when you run the program, list of names is printed and then you have to enter the name of the person whose line you want to delete.
The problem is, if I enter John, it deletes the first and the second line, but I want only the first line to be deleted. My guess is that I'm not doing re.match() right. I tried re.match(name,names) but that doesn't work either.
So, the string you enter into name should be compared to the strings in names , and if there's an exact match, it should delete the line that has name as the first element.
I found a lot of similar problems but my function contains everything combined and I can't figure it out.

re.match only anchors the pattern at the beginning of the string, so the pattern John also matches a line that starts with Johnny. You could add a word boundary to your expression:
name + r'\b'
But in your case re is overkill; a simple comparison will do:
name == line.partition('|')[0]
BTW, if you need to split only once, at the beginning or at the end, the partition and rpartition methods are better options.
EDIT
Timing:
>>> timeit('line.startswith(name+"|")', 'line="John|22|Nice";name="John"')
0.33100164101452345
>>> timeit('line.partition("|")[0] == name', 'line="John|22|Nice";name="John"')
0.2520693876228961
>>> timeit('re.match(name+r"\b", line)', 'import re; line="John|22|Nice";name="John"')
1.8754496594662555
>>> timeit('line.split("|")[0] == name', 'line="John|22|Nice";name="Jonny"')
0.511219799415926
Especially for Padraick
>>> timeit('line.partition("|")[0] == name', 'line="John|22|Nice";name="John"')
0.27333073995099083
>>> timeit('line.split("|", 1)[0] == name', 'line="John|22|Nice";name="John"')
0.5120651608158937
Frankly - I am surprised myself

with open("in.txt") as f:
    lines = f.readlines()
name = raw_input("Enter the name of the person you want to delete: ").lower() + "|"
ln = len(name)
for ind, line in enumerate(lines):
    if name == line[:ln].lower():
        lines[ind:ind+1] = []
        break
with open("in.txt","w") as out:
    out.writelines(lines)
As it stands, this erases only the first "John" it finds. If you want to remove every John, don't break; keep looping and write out only the lines that don't match. Comparing a slice of the line (line[:ln]) against the name is the fastest check.
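The exact-match idea from the answers above can be sketched as a small Python 3 function that works on a list of lines instead of a file, which makes it easy to try out. The name `remove_person` and the `first_only` flag are illustrative assumptions, not part of the original answers.

```python
def remove_person(lines, name, first_only=True):
    """Return a copy of lines without the record(s) whose first field equals name."""
    kept = []
    removed = False
    for line in lines:
        # partition() splits once, at the first '|', so only the name field is compared
        is_match = line.partition("|")[0] == name
        if is_match and not (first_only and removed):
            removed = True
            continue  # drop this line
        kept.append(line)
    return kept


lines = ["John|22|Nice\n", "Johnny|55|Better than John\n", "Peter|25|The worst\n"]
print(remove_person(lines, "John"))
```

Because the whole first field is compared, entering John leaves the Johnny line alone, and a partial name such as Jo removes nothing.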

Related

How to read IDs from a .txt file?

I'm building a database using txt files in which the data is stored like this: idTheme|Name|Hours.
For example: 45|Object Oriented Programming|12
I need to print the line of text with the idTheme I'm given, so I did this:
print("Give me the ID")
id_search = input("> ")
for line in file:
    for x in line:
        if id_search != x:
            break
        else:
            print(line)
            break
Since the ID is always the first character in each line, I thought that would work.
The problem is that this only works when the ID is 1 character long.
I was thinking of putting the info in a list and using split("|") to divide it, but that would put all the info in the list, not only the IDs.
Any suggestions are appreciated.
You could use split as you said and just use index 0 to get the ID.
for line in file:
    id = line.split("|")[0]
    if id_search == id:
        print(line)
You can invert your if statement so if the id is equal to the search term it prints the line, otherwise nothing happens. You also avoid looping through the entire line.
You can use something like:
with open("text.txt") as f:
    lines = [x.strip() for x in list(f) if x]
print("Give me the ID")
id_search = input("> ").strip()
if id_search:
    for line in lines:
        id, name, otherid = line.split("|")
        if id_search == id:
            print(line)
            break
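A minimal sketch of the same lookup as a reusable function. The function name `find_by_id` and the sample record `4|Databases|8` are made up for illustration; only the `45|Object Oriented Programming|12` line comes from the question.

```python
def find_by_id(lines, id_search):
    """Return the first line whose first '|'-separated field equals id_search, else None."""
    for line in lines:
        line = line.strip()
        # Compare the whole ID field, so "4" does not match "45"
        if line.partition("|")[0] == id_search:
            return line
    return None


records = ["4|Databases|8\n", "45|Object Oriented Programming|12\n"]
print(find_by_id(records, "45"))
```

Comparing the complete first field is what makes multi-character IDs work, which the character-by-character loop in the question could not do.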

Find and replace list value based on other list values

I am working on an assignment for school and have hit a wall.
The challenge is:
You will be passed the filename P, firstname F, lastname L, and a new birthday B.
Load the fixed length record file in P, search for F,L in the first and change birthday to B.
Then save the file.
I have managed to read from the file and split the data into the desired entries in a list.
However, my issue arises when I am searching for the first and last name in the list.
The first and last name being passed is Adam Smith, but there is also an Adam Smithers in the list who is found by my code as well.
When I attempt to replace the desired element of the list, it is replacing it for both Smith and Smithers.
We cannot use regular expressions for the assignment, so I am at a bit of a loss for how to approach this and find an exact match for the last name while ignoring the other last name that contains Smith without using regex.
Here's my code:
import sys
P= sys.argv[1]
F= sys.argv[2]
L= sys.argv[3]
B= sys.argv[4]
filePath = P
firstName = F
lastName = L
newBirthday = B
records = []
file1 = open(filePath, 'r')
fileContent = file1.read()
while len(fileContent) > 0:
    record = []
    fname = fileContent[0:16]
    lname = fileContent[16:32]
    bday = fileContent[32:40]
    record = [fname,lname,bday]
    records.append(record)
    fileContent = fileContent[40:]
for record in records:
    if firstName in record[0] and lastName in record[1]:
        record[2] = newBirthday
file1.close()
file2 = open(filePath, 'w')
for record in records:
    file2.write(record[0])
    file2.write(record[1])
    file2.write(record[2])
file2.close()
Any ideas or hints that someone would be able to provide will be most appreciated.
Thanks!
Edit:
Icewine was kind enough to suggest using the below instead:
if firstName == record[0] and lastName == record[1]:
However, when I try that, it does not find any matching records.
I believe this is because there are blank spaces after each name to pad it to 16 characters, giving the name a fixed length. So when I use the == operator, it doesn't find an exact match, because the stored name includes those trailing spaces.
Use == instead of in:
if firstName == record[0] and lastName == record[1]:
EDIT: try this, which removes whitespace from the end of each string:
if firstName.rstrip() == record[0].rstrip() and lastName.rstrip() == record[1].rstrip():
or this, which removes whitespace from both the start and the end of each string:
if firstName.strip() == record[0].strip() and lastName.strip() == record[1].strip():
Trim the whitespace in the string and then use == for matching. This will give the expected output. Sample code:
for record in records:
    if firstName == record[0].strip() and lastName == record[1].strip():
        record[2] = newBirthday
Either pad spaces onto the passed data to match what's in the file:
firstName = f'{F:<16}'
or strip the extra spaces from the file contents to match the passed data:
fname = fileContent[0:16].strip()
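then you can simply compare the names with ==. With the padded version the in operator also works, because the trailing spaces are part of the search term; with the stripped version, in would still match Smithers.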
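To see the difference between the two options, here is a small sketch. The 16-character width comes from the question; the variable names are illustrative assumptions.

```python
WIDTH = 16  # field width used by the fixed-length records in the question

stored = "Smith".ljust(WIDTH)     # what a name field read from the file looks like
other = "Smithers".ljust(WIDTH)
wanted = "Smith"

# Substring test on the raw fields: both records match, which is the original bug
print(wanted in stored, wanted in other)

# Option 1: pad the search term to the field width, then compare
padded = f"{wanted:<{WIDTH}}"
print(padded == stored, padded == other)

# Option 2: strip the stored field, then compare
print(stored.strip() == wanted, other.strip() == wanted)
```

Either option makes the comparison exact; which one to use mostly depends on whether you want to keep the padded fields around for writing the file back.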
An easy solution would be to sort the list of names and values (in this case, birthdays; I suggest you use a dictionary for this purpose) and then perform a search that finds only the first occurrence of an element. That way only Adam Smith is selected.
You can then deal with duplicates by checking whether the next element is the same as the one you found,
i.e. if the first occurrence is at index i, check whether the element at i+1 equals the element at i, and update all of the duplicates. (This may be what you need to do for your exercise, though it doesn't make sense to update other people's birthdays like this.)
Hopefully this helps :)
Here are a few improvements I could suggest to your code:
Make a function, or at least put the code inside if __name__ == '__main__':. You can google why.
Use with open(file_path, 'r') as f: to read and write files; it looks cleaner and you won't forget to close the file.
Remove surrounding spaces before comparing to your input; I used line[:16].strip().
You read in specified widths, so don't forget to write back into the file with those same widths. This code ensures that each part of the record has the specified width:
f.write('{:16s}{:16s}{:8s}'.format(*record))
Here is my version of the code:
import sys
if __name__ == '__main__':
    file_path = sys.argv[1]
    first_name = sys.argv[2]
    last_name = sys.argv[3]
    new_birth_date = sys.argv[4]
    record_list = []
    with open(file_path, 'r') as f:
        while True:
            line = f.read(40) # read 40 chars at the time
            if len(line) == 0:
                # the end of the file was reached
                break
            record = [
                line[:16].strip(),
                line[16:32].strip(),
                line[32:].strip(),
            ]
            if first_name == record[0] and last_name == record[1]:
                record[2] = new_birth_date
            record_list.append(record)
    with open(file_path, 'w') as f:
        for record in record_list:
            f.write('{:16s}{:16s}{:8s}'.format(*record))
Does this work for you?
Note: this code does not split the file on newlines; it reads 40 chars at a time, so you could end up with a newline char inside those 40 chars.
I did it this way because the code in the question seems to do something similar.
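The read, compare, and write-back steps can also be tried without touching a file. The sketch below works on a string of 40-character records; the function name `update_birthday`, the sample birthdays, and their YYYYMMDD layout are assumptions made for illustration, and it assumes the data contains no newlines.

```python
RECORD_FORMAT = "{:16s}{:16s}{:8s}"   # first name, last name, birthday
RECORD_LEN = 40


def update_birthday(content, first_name, last_name, new_birthday):
    """Return content with the birthday replaced for the exactly matching record."""
    out = []
    for start in range(0, len(content), RECORD_LEN):
        chunk = content[start:start + RECORD_LEN]
        fields = [chunk[:16].strip(), chunk[16:32].strip(), chunk[32:40].strip()]
        if fields[0] == first_name and fields[1] == last_name:
            fields[2] = new_birthday
        # Re-pad every field so the record is 40 characters wide again
        out.append(RECORD_FORMAT.format(*fields))
    return "".join(out)


data = (RECORD_FORMAT.format("Adam", "Smith", "19900101")
        + RECORD_FORMAT.format("Adam", "Smithers", "19850505"))
updated = update_birthday(data, "Adam", "Smith", "20000229")
print(updated[32:40], updated[72:80])
```

Only the Adam Smith record changes, and every record keeps its fixed width, so the result can be written straight back to the file.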

How to output only the string in list data?

My file reads the teamNames.txt file which is:
Collingwood
Essendon
Hawthorn
Richmond
Code:
file2 = input("Enter the team-names file: ") ## E.g. teamNames.txt
bob = open(file2)
teamname = []
for line1 in bob: ##loop statement until no more line in file
    teamname.append(line1)
print(teamname)
The output is:
['Collingwood\n', 'Essendon\n', 'Hawthorn\n', 'Richmond\n']
I want to make it so the output will be:
Collingwood, Essendon, Hawthorn, Richmond
One option is to use the replace() function. I've modified your code to include this function.
file2= input("Enter the team-names file: ") ## E.g. teamNames.txt
bob =open(file2)
teamname = []
for line1 in bob: ##loop statement until no more line in file
    teamname.append(line1.replace("\n",""))
print(teamname)
Would give you the output:
['Collingwood', 'Essendon', 'Hawthorn', 'Richmond']
You could then modify teamname to get your requested output:
print(", ".join(teamname))
What about
for line1 in bob:
    teamname.append(line1.strip()) # .strip() removes the \n
print (', '.join(teamname))
The .join() does the final formatting.
Update: I now think a more Pythonic (and elegant) answer would be:
file2 = input("Enter the team-names file: ") ## E.g. teamNames.txt
with open(file2) as f:
    teamname = [line.strip() for line in f]
print (', '.join(teamname))
The with statement ensures the file is closed when the block finishes. Instead of doing the for loop, it now uses a list comprehension, a cool way to create a list by transforming the elements from another list (or from an iterable object, like file).
The join method works well, but you can also use a for loop, which prints each name on its own line:
for name in teamname: # takes each entry separately
    print(name)

Print full sequence not just first line | Python 3.3 | Print from specific line to end (")

I am attempting to pull out multiple (50-100) sequences from a large .txt file separated by new lines ('\n'). Each sequence is a few lines long, but not always the same length, so I can't just print lines x-y. The sequences end with " and the next line always starts with the same word, so maybe that could be used as a keyword.
I am writing using python 3.3
This is what I have so far:
searchfile = open('filename.txt' , 'r')
cache = []
for line in searchfile:
    cache.append(line)
for line in range(len(cache)):
    if "keyword1" in cache[line].lower():
        print(cache[line+5])
This pulls out the starting line (which is 5 lines below the keyword line always) however it only pulls out this line.
How do I print the whole sequence?
Thank you for your help.
EDIT 1:
Current output = ABCDCECECCECECE ...
Desired output = ABCBDBEBSOSO ...
ABCBDBDBDBDD ...
continued until " or new line
Edit 2
Text file looks like this:
Name (keyword):
Date
Address1
Address2
Sex
Response"................................"
Y/N
The sequence between the " and " is what I need
TL;DR - How do I print from line + 5 to end when end = keyword
Not sure if I understand your sequence data but if you're searching for each 'keyword' then the next " char then the following should work:
keyword_pos =[]
endseq_pos = []
for line in range(len(cache)):
    if 'keyword1' in cache[line].lower():
        keyword_pos.append(line)
    if '"' in cache[line]:
        endseq_pos.append(line)
for key in keyword_pos:
    for endseq in endseq_pos:
        if endseq > key:
            print(cache[key:endseq])
            break
This simply compiles a list of all the positions of all the keywords and " characters and then matches the two and prints all the lines in between.
Hope that helps.
I agree with @Michal Frystacky that regex is the way forward. However, as I now understand the problem, we need two searches: one for the 'keyword', and then another, 5 lines on, to find the 'sequence'.
This should work but may need the regex to be tweaked:
import re
with open('yourfile.txt') as f:
    lines = f.readlines()
for i,line in enumerate(lines):
    #first search for keyword
    key_match = re.search(r'\((keyword)',line)
    if key_match:
        #if successful search 5 lines on for the string between the quotation marks
        seq_match = re.search(r'"([A-Z]*)"',lines[i+5])
        if seq_match:
            print(key_match.group(1) +' '+ seq_match.group(1))
This can be done rather simply with a regex:
import re
lines = 'Name (keyword):','Date','Address1','Address2','Sex','Response"................................" '
for line in lines:
    # capture everything between the first pair of double quotes on the line
    match = re.search(r'"(.*?)"', line)
    if match:
        print(match.group(1))
To use this sample code on the real data set, we would read the lines with lines = f.readlines(). It's important to note that this only catches text between one " and another " on the same line; if there is no closing " mark, the data is missed, but accounting for that isn't too difficult.
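For sequences that run over several lines, one possible approach is to start at the line five below the keyword and keep joining lines until the closing quote has been seen. This is only a sketch under the assumptions stated in the question (keyword line, fixed offset of 5, sequence enclosed in double quotes); the function name `extract_sequences` and the sample data are made up for illustration.

```python
def extract_sequences(lines, keyword, offset=5):
    """Collect the quoted text that starts `offset` lines after each keyword line."""
    sequences = []
    for i, line in enumerate(lines):
        if keyword not in line.lower():
            continue
        text = ""
        for following in lines[i + offset:]:
            text += following.rstrip("\n")
            if text.count('"') >= 2:   # opening and closing quote both seen
                break
        parts = text.split('"')
        if len(parts) >= 3:            # something was found between two quotes
            sequences.append(parts[1])
    return sequences


sample = [
    "Name (keyword1):\n", "Date\n", "Address1\n", "Address2\n", "Sex\n",
    'Response"ABCB\n', 'DBEB"\n', "Y/N\n",
    "Name (keyword1):\n", "Date\n", "Address1\n", "Address2\n", "Sex\n",
    'Response"XYZ"\n', "Y/N\n",
]
print(extract_sequences(sample, "keyword1"))
```

A record whose closing quote is missing is skipped here rather than reported; whether that is the right behaviour depends on the real data.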

Python: How to replace the last word of a line from a text file?

How would I go about replacing the last word of a specific line from all the lines of a text file that has been loaded into Python? I know that if I wanted to access it as a list I'd use [-1] for the specific line, but I don't know how to do it as a string. An example of the text file is:
A I'm at the shop with Bill.
B I'm at the shop with Sarah.
C I'm at the shop with nobody.
D I'm at the shop with Cameron.
If you want a more powerful editing option, regex is your friend.
import re
pattern = re.compile(r'\w+(\W*)$')  # Matches the last word and captures any trailing non-word characters
                                    # so we don't lose punctuation. This includes newline chars.
                                    # \w+ rather than \w* avoids a second, empty match at the end of the line.
line_num = 2        # The zero-based index of the line we want to operate on.
new_name = 'Steve'  # Who are we at the shops with? Not Allan.
with open('shopping_friends.txt') as f:
    lines = f.readlines()
lines[line_num] = re.sub(pattern, new_name + r'\1', lines[line_num])
# Substitute the last word with your friend's name, keeping the punctuation after it.
# Now do something with your modified data here
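The substitution can be wrapped in a small helper and checked on a single line. This is a sketch, not the answer's exact code: the function name `replace_last_word` is an assumption, and a function is passed as the replacement so that backslashes in the new word are not interpreted as group references.

```python
import re

# Last run of word characters, plus any trailing punctuation/newline, at the end of the line
LAST_WORD = re.compile(r"\w+(\W*)$")


def replace_last_word(line, new_word):
    """Replace the last word of line, keeping whatever non-word characters follow it."""
    return LAST_WORD.sub(lambda m: new_word + m.group(1), line, count=1)


print(repr(replace_last_word("C I'm at the shop with nobody.\n", "Steve")))
```

A line with no word characters at all, such as a blank line, is returned unchanged.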
Please check this; it is not 100% Pythonic and is meant more as an overview:
file_text = '''I'm at the shop with Bill.
I'm at the shop with Sarah.
I'm at the shop with nobody.
I'm at the shop with Cameron.
I'm at the shop with nobody.'''
def rep_last_word_line(textin, search_for, replace_with, line_nr):
    if (isinstance(textin,str)):
        textin = textin.splitlines()
    else:
        # remove the line ending from all lines - just in case
        textin = [ln.replace('\n', ' ').replace('\r', '') for ln in textin]
    if (textin[line_nr] != None):
        line = textin[line_nr].replace('\n', ' ').replace('\r', '')
        splited_line = line.split()
        last_word = splited_line[-1]
        if (last_word[0:len(search_for)] == search_for):
            splited_line[-1] = last_word.replace(search_for,replace_with)
            textin[line_nr] = ' '.join(splited_line)
    return '\r\n'.join(textin)
print(rep_last_word_line(file_text,'nobody','Steve',2))
print('='*80)
print(rep_last_word_line(file_text,'nobody','Steve',4))
print('='*80)
# read content from file
f = open('in.txt','r')
# file_text = f.read().splitlines() # read text and then use str.splitlines to split lines without the line-ending character
file_text = f.readlines() # this will keep the line-ending character
print(rep_last_word_line(file_text,'nobody','Steve',2))
If you have blank lines, you might want to try splitlines():
lines = your_text.splitlines()
lastword = lines[line_to_replace].split()[-1]
lines[line_to_replace] = lines[line_to_replace].replace(lastword, word_to_replace_with)
Keep in mind that your changes are now in lines and not in your_text, so if you want them to persist you'll have to write out lines instead. splitlines() drops the line endings, so add them back when writing:
with open('shops.txt', 'w') as s:
    for l in lines:
        s.write(l + '\n')
Assuming you have a file "example.txt":
with open("example.txt") as myfile:
    mylines = list(myfile)
lastword = mylines[2].split()[-1]
mylines[2] = mylines[2].replace(lastword,"Steve.")
(Edit: fixed off by one error... assuming by 3rd line, he means human style counting rather than zeroth-indexed)
(Note that the with line returns a file object which will be automatically closed even if for example the length of myfile is less than 3; this file object also provides an iterator which then gets converted into a straight list so you can pick a specific line.)
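One caveat with the str.replace approach is that it changes every occurrence of the last word in the line, not just the final one. A regex-free alternative is to split once from the right with rpartition. This is a sketch; the function name `replace_last_word_plain` is an assumption, and punctuation attached to the last word counts as part of that word here.

```python
def replace_last_word_plain(line, new_word):
    """Replace the last space-separated word, keeping trailing whitespace such as '\\n'."""
    body = line.rstrip()        # text without trailing whitespace
    tail = line[len(body):]     # the trailing whitespace that was stripped
    head, sep, _last = body.rpartition(" ")
    return head + sep + new_word + tail


line = "no no\n"
print(repr(line.replace(line.split()[-1], "yes")))   # every "no" is replaced
print(repr(replace_last_word_plain(line, "yes")))    # only the last word is replaced
```

Because the trailing newline is preserved, the modified lines can be written back with writelines() as they are.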
