I am having a problem with my Python program that I'm running - python

From an input file I'm supposed to extract only the first name of each student and then save the result in a new file called "student-firstname.txt". The output file should contain a list of
first names (not including middle names). I was able to get rid of the last name, but I'm having a problem removing the middle name. Any help or suggestions?
The student names in the file look something like this (last name, first name, and middle initial):
Martin, John
Smith, James W.
Brown, Ashley S.
my python code is:
f=open("studentname.txt", 'r')
f2=open ("student-firstname.txt",'w')
str = ''
for line in f.readlines():
    str = str + line
    line=line.strip()
    token=line.split(",")
    f2.write(token[1]+"\n")
f.close()
f2.close()

f=open("studentname.txt", 'r')
f2=open ("student-firstname.txt",'w')
for line in f.readlines():
    token=line.split()
    f2.write(token[1]+"\n")
f.close()
f2.close()

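Split token[1] on whitespace and take the first piece. Calling split() with no argument also skips the leading space that is left after the comma:
fname = token[1].split()[0]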

with open("studentname.txt") as f, open("student-firstname.txt", 'w') as fout:
    for line in f:
        firstname = line.split()[1]
        print >> fout, firstname
Note:
you could use a with statement to make sure that the files are always closed even in case of an exception. You might need contextlib.nested() on old Python versions
'r' is a default mode for files. You don't need to specify it explicitly
.readlines() reads all lines at once. You could iterate over the file line by line directly
To avoid hardcoding the filenames you could use fileinput. Save it to firstname.py:
#!/usr/bin/env python
import fileinput
for line in fileinput.input():
    firstname = line.split()[1]
    print firstname
Example: $ python firstname.py studentname.txt >student-firstname.txt
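For readers on Python 3, the same idea could be sketched roughly as follows. The function names `first_name` and `extract_first_names` are made up for this illustration, and the sketch assumes every useful line has the form "Last, First [Middle]"; lines without a comma are skipped.

```python
def first_name(line):
    """Return the first name from a 'Last, First M.' line, or None if malformed."""
    _, sep, rest = line.partition(",")
    parts = rest.split()  # split() with no argument ignores extra whitespace
    if not sep or not parts:
        return None
    return parts[0]


def extract_first_names(src_path, dst_path):
    """Write one first name per line to dst_path (both paths are caller-supplied)."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            name = first_name(line)
            if name is not None:
                dst.write(name + "\n")
```

Calling `extract_first_names("studentname.txt", "student-firstname.txt")` would mirror the file names used in the question.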

Check out regular expressions. Something like this will probably work:
>>> import re
>>> nameline = "Smith, James W."
>>> names = re.match(r"(\w+),\s+(\w+).*", nameline)
>>> if names:
...     print names.groups()
('Smith', 'James')
Line 3 basically says: find a sequence of word characters as group 1, followed by a comma, some whitespace characters and another sequence of word characters as group 2, followed by anything else in nameline.
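A Python 3 sketch of the same regular expression, wrapped in a small helper. The names `NAME_RE` and `parse_name` are hypothetical. Note that `\w+` does not match apostrophes or hyphens, so a name such as "O'Brien" would not be parsed by this pattern.

```python
import re

# Group 1 captures the last name, group 2 the first name; the rest is ignored.
NAME_RE = re.compile(r"(\w+),\s+(\w+)")


def parse_name(nameline):
    """Return (last, first), or None when the line does not match."""
    m = NAME_RE.match(nameline)
    return m.groups() if m else None
```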

f = open("file")
o = open("out","w")
for line in f:
    # take the part after the comma, then its first whitespace-separated word
    o.write(line.rstrip().split(",")[1].split()[0]+"\n")
f.close()
o.close()

Related

python3.5.2 deleting all matching characters from a file

Given the following example, how can I remove all "a" characters from a file that has the following content:
asdasdasd \n d1233sss \n aaa \n 123
I wrote the following solution but it does not work:
with open("testfisier","r+") as file:
    for line in file:
        for index in range(len(line)):
            if line[index] is "a": line[index].replace("a","")
There weren't any changes because you didn't write the result back to the file. Note also that replace() returns a new string rather than modifying the original.
with open("testfisier", "r+") as file:
    # Replace in every line, keeping the results.
    lines = [line.replace("a", "") for line in file]
    # Write the changes.
    file.seek(0)
    file.writelines(lines)
    file.truncate()
Or:
with open("testfisier") as f:
    text = f.read().replace("a", "")
with open("testfisier", "w") as f:
    f.write(text)
Try using regexp substitution. For instance, assuming you have read in the string and named it a_string:
import re
a_string = re.sub('a', '', a_string)
This would be one of many possible solutions.
Hope this helps!
You can try this:
import re
data = open("testfisier").read()
final_data = re.sub('a+', '', data)
You can call replace on a long string. No need to call it on single chars. Also, replace does not change a string, but returns a new one:
with open("testfisier", "r+") as file:
    text = file.read()
    text = text.replace("a", "") # replace a's in the entire text
    file.seek(0) # move file pointer back to start
    file.write(text)
    file.truncate() # cut off what is left of the old, longer content
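As a reusable sketch of the read, replace, seek, write approach: the helper name `remove_char` is made up here. It calls truncate() after writing, because the new text is shorter than the old one and stale bytes would otherwise remain at the end of the file.

```python
def remove_char(path, ch):
    """Remove every occurrence of ch from the file at path, in place."""
    with open(path, "r+") as f:
        text = f.read().replace(ch, "")
        f.seek(0)      # go back to the start before overwriting
        f.write(text)
        f.truncate()   # drop leftover bytes, since the new text is shorter
```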

python parse and print text before .(dot)

I am trying to write a program that parses the text file qwer.txt and prints the value before '=' and after '.':
qwer.txt
john.xavier=s/o john
jane.victory=s/o ram
output:
xavier
victory
My program shows the entire line. Please help on how to display the specific text after '.' and before '='.
with open("qwer.txt", 'r') as my_file:
    a = my_file.readlines()
    for line in a:
        for part in line.split():
            if "=" in part:
                print part.split(' ')[-1]
Please help! answers will be appreciated.
with open("qwer.txt", 'r') as my_file:
    for line in my_file:
        print line.split('=')[0].split('.')[1]
You might need to understand the with statement better :-)
Here is my solution:
with open("qwer.txt", 'r') as my_file:
    for line in my_file:
        name = line.split("=", 1)[0]
        print name.split(".")[-1]
The two lines can be combined like this as well:
print line.split("=", 1)[0].split(".")[-1]
The official doc of "with" statement is here
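A Python 3 sketch of the split-based approach that also tolerates malformed lines. The helper name `name_after_dot` is hypothetical, and returning None for lines without '=' or without a dot before it is a design choice of this sketch rather than something the answers above prescribe.

```python
def name_after_dot(line):
    """Return the text between the last '.' before '=' and the '=' itself."""
    key, sep, _ = line.partition("=")
    if not sep or "." not in key:
        return None  # no '=' on the line, or no dot in the part before it
    return key.rsplit(".", 1)[-1].strip()
```

Looping over the file and printing every non-None result would reproduce the output shown in the question.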
Fun little way using regex rather than splitting, and will ignore bad lines rather than erroring (pretty slick if I do say so myself). Also gives you a nice list of names if you want to use them further rather than outputting.
import re
r = re.compile(r'.+?\.(.+)?\=.+')
with open("qwer.txt", 'r') as f:
    names = [r.match(x).group(1) for x in f.read().splitlines() if r.match(x)]
for name in names: print name

How do I write a string to a specific line number?

I am having trouble finding the answer to this after quite a bit of searching.
What I want to do is, do a string search and write on the line above or below it, depending on my string.
Here is something I've done so far:
file = open('input.txt', 'r+')
f = enumerate(file)
for num, line in f:
    if 'string' in line:
        linewrite = num - 1
        ???????
EDIT EXTENSION OF INITIAL QUESTION:
I already picked the answer that best solved my initial question. But now, using Ashwini's method where I rewrote the file, how can I search for AND REPLACE a string? To be more specific:
I have a text file with
SAMPLE
AB
CD
..
TYPES
AB
QP
PO
..
RUNS
AB
DE
ZY
I want to replace AB with XX, ONLY UNDER lines SAMPLE and RUNS
I've already tried multiple ways of using replace(). I tried something like
if 'SAMPLE' in line:
    f1.write(line.replace('testsample', 'XX'))
if 'RUNS' in line:
    f1.write(line.replace('testsample', 'XX'))
and that didn't work.
The following can be used as a template:
import fileinput
for line in fileinput.input('somefile', inplace=True):
    if 'something' in line:
        print 'this goes before the line'
        print line,
        print 'this goes after the line'
    else:
        print line, # just print the line anyway
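In Python 3 the same template might look like the sketch below. The function name `insert_around` and its parameters are made up for illustration. It assumes every line in the file ends with a newline; with inplace=True, fileinput redirects print() output into the file being rewritten.

```python
import fileinput


def insert_around(path, needle, before, after):
    """Rewrite path in place, adding a line before and after each line containing needle."""
    with fileinput.input(path, inplace=True) as f:
        for line in f:
            if needle in line:
                print(before)
                print(line, end="")  # line already ends with a newline
                print(after)
            else:
                print(line, end="")  # copy every other line unchanged
```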
You may have to read all the lines into a list first, and if the condition is matched you can then store your string at a particular index using list.insert. Iterate from the end so that an insertion does not shift the lines you have yet to visit:
with open('input.txt', 'r+') as f:
    lines = f.readlines()
    for i in reversed(range(len(lines))):
        if 'string' in lines[i]:
            lines.insert(i, "somedata\n") # inserts "somedata" above the matching line
    f.seek(0) # moves the pointer to the start of the file
    f.truncate() # truncates the file
    f.writelines(lines) # write the new data to the file
or, without storing all the lines, you'll need a temporary file to store the data, and then
rename the temporary file to the original file:
import os
with open('input.txt', 'r') as f, open("new_file",'w') as f1:
    for line in f:
        if 'string' in line:
            f1.write("somedata\n") # written above the match; move f1.write(line) before the if to write below instead
        f1.write(line)
os.remove('input.txt') # For windows only
os.rename("new_file", 'input.txt') # Rename the new file
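For the follow-up question (replacing AB only under SAMPLE and RUNS), one possible approach is to remember which section header was seen last. This is only a sketch in Python 3: the function name and the idea of passing in the full set of header names are assumptions, not something taken from the answers above.

```python
def replace_in_sections(lines, old, new, target_sections, all_headers):
    """Replace lines equal to old with new, but only under the target section headers."""
    result = []
    current = None
    for line in lines:
        stripped = line.strip()
        if stripped in all_headers:
            current = stripped  # entering a new section
        elif stripped == old and current in target_sections:
            line = line.replace(old, new)
        result.append(line)
    return result
```

The returned list can then be written back with writelines(), for example using the temporary-file pattern.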

How to read a text file into a string variable and strip newlines?

I have a text file that looks like:
ABC
DEF
How can I read the file into a single-line string without newlines, in this case creating a string 'ABCDEF'?
For reading the file into a list of lines, but removing the trailing newline character from each line, see How to read a file without newlines?.
You could use:
with open('data.txt', 'r') as file:
    data = file.read().replace('\n', '')
Or if the file content is guaranteed to be one line:
with open('data.txt', 'r') as file:
    data = file.read().rstrip()
In Python 3.5 or later, using pathlib you can copy text file contents into a variable and close the file in one line:
from pathlib import Path
txt = Path('data.txt').read_text()
and then you can use str.replace to remove the newlines:
txt = txt.replace('\n', '')
You can read from a file in one line:
text = open('very_Important.txt', 'r').read()
Please note that this does not close the file explicitly.
CPython will normally close the file as soon as the file object is garbage collected, because nothing references it any more.
But other Python implementations may not. To write portable code, it is better to use with or close the file explicitly. Short is not always better. See https://stackoverflow.com/a/7396043/362951
To join all lines into a string and remove new lines, I normally use :
with open('t.txt') as f:
    s = " ".join([l.rstrip("\n") for l in f])
with open("data.txt") as myfile:
    data="".join(line.rstrip() for line in myfile)
join() will join a list of strings, and rstrip() with no arguments will trim whitespace, including newlines, from the end of strings.
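The join-plus-rstrip idea can be wrapped in a small helper, sketched here. The name `read_joined` and the `sep` parameter are made up for illustration; this version strips only the newline, so trailing spaces inside a line are kept.

```python
def read_joined(path, sep=""):
    """Read a text file and join its lines with sep, dropping the line endings."""
    with open(path) as f:
        return sep.join(line.rstrip("\n") for line in f)
```

With the ABC/DEF file from the question, `read_joined("data.txt")` would give 'ABCDEF', and passing `sep=" "` would give 'ABC DEF'.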
This can be done using the read() method :
text_as_string = open('Your_Text_File.txt', 'r').read()
Or as the default mode itself is 'r' (read) so simply use,
text_as_string = open('Your_Text_File.txt').read()
I'm surprised nobody mentioned splitlines() yet.
with open ("data.txt", "r") as myfile:
    data = myfile.read().splitlines()
Variable data is now a list that looks like this when printed:
['LLKKKKKKKKMMMMMMMMNNNNNNNNNNNNN', 'GGGGGGGGGHHHHHHHHHHHHHHHHHHHHEEEEEEEE']
Note there are no newlines (\n).
At that point, it sounds like you want to print back the lines to console, which you can achieve with a for loop:
for line in data:
    print(line)
It's hard to tell exactly what you're after, but something like this should get you started:
with open ("data.txt", "r") as myfile:
    data = ' '.join([line.replace('\n', '') for line in myfile.readlines()])
I have fiddled around with this for a while and prefer to use read in combination with rstrip. Without rstrip("\n"), the string keeps the newline at the end of the file, which in most cases is not very useful.
with open("myfile.txt") as f:
    file_content = f.read().rstrip("\n")
    print(file_content)
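The difference between rstrip("\n") and rstrip() with no argument matters when the text ends in other whitespace. A tiny illustration on an in-memory string (the sample value is made up):

```python
raw = "ABC  \nDEF\t\n\n"

only_newlines = raw.rstrip("\n")  # strips trailing "\n" characters only
all_whitespace = raw.rstrip()     # strips trailing spaces, tabs and newlines

print(repr(only_newlines))
print(repr(all_whitespace))
```

Neither call touches the newline in the middle of the string; only the end is trimmed.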
Here are four snippets to choose from:
with open("my_text_file.txt", "r") as file:
    data = file.read().replace("\n", "")
or
with open("my_text_file.txt", "r") as file:
    data = "".join(file.read().split("\n"))
or
with open("my_text_file.txt", "r") as file:
    data = "".join(file.read().splitlines())
or
with open("my_text_file.txt", "r") as file:
    data = "".join([line.rstrip("\n") for line in file])
You can compress this into one or two lines of code:
content = open('filepath','r').read().replace('\n',' ')
print(content)
if your file reads:
hello how are you?
who are you?
blank blank
python output
hello how are you? who are you? blank blank
You can also strip each line and concatenate into a final string.
myfile = open("data.txt","r")
data = ""
lines = myfile.readlines()
for line in lines:
    data = data + line.strip()
This would also work out just fine.
This is a one line, copy-pasteable solution that also closes the file object:
_ = open('data.txt', 'r'); data = _.read(); _.close()
f = open('data.txt','r')
string = ""
while 1:
    line = f.readline()
    if not line:break
    string += line
f.close()
print(string)
python3: Google "list comprehension" if the square bracket syntax is new to you.
with open('data.txt') as f:
    lines = [ line.strip('\n') for line in list(f) ]
Oneliner:
List: "".join([line.rstrip('\n') for line in open('file.txt')])
Generator: "".join((line.rstrip('\n') for line in open('file.txt')))
A list is faster than a generator but heavier on memory; a generator is slower but lighter on memory, like iterating over the lines. In the case of "".join(), I think both should work well. Remove the "".join() call to get the list or the generator itself.
Note: close() / closing of file descriptor probably not needed
Have you tried this?
x = "yourfilename.txt"
y = open(x, 'r').read()
print(y)
To remove line breaks using Python you can use replace function of a string.
This example removes all 3 types of line breaks:
my_string = open('lala.json').read()
print(my_string)
my_string = my_string.replace("\r","").replace("\n","")
print(my_string)
Example file is:
{
"lala": "lulu",
"foo": "bar"
}
You can try it using this replay scenario:
https://repl.it/repls/AnnualJointHardware
I don't feel that anyone addressed the [ ] part of your question. When you read each line into your variable, because there were multiple lines before you replaced the \n with '' you ended up creating a list. If you have a variable of x and print it out just by
x
or print(x)
or str(x)
You will see the entire list with the brackets. If you call each element of the (array of sorts)
x[0]
then it omits the brackets. If you use the str() function you will see just the data and not the '' either.
str(x[0])
Maybe you could try this? I use this in my programs.
Data= open ('data.txt', 'r')
data = Data.readlines()
for i in range(len(data)):
    data[i] = data[i].strip()+ ' '
data = ''.join(data).strip()
Regular expression works too:
import re
with open("depression.txt") as f:
    l = re.split(' ', re.sub('\n',' ', f.read()))[:-1]
print (l)
['I', 'feel', 'empty', 'and', 'dead', 'inside']
with open('data.txt', 'r') as file:
    data = [line.strip('\n') for line in file.readlines()]
    data = ''.join(data)
from pathlib import Path
line_lst = Path("to/the/file.txt").read_text().splitlines()
This is the best way to get all the lines of a file; the '\n' are already stripped by splitlines() (which smartly recognizes Windows/Mac/Unix line endings).
But if you nonetheless want to strip each line:
line_lst = [line.strip() for line in Path("to/the/file.txt").read_text().splitlines()]
strip() was just a useful example, but you can process each line as you please.
In the end, you just want the concatenated text?
txt = ''.join(Path("to/the/file.txt").read_text().splitlines())
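To illustrate how splitlines() treats the different line endings, here is a small sketch on an in-memory string (the sample text is made up), compared with a plain split("\n"):

```python
text = "first\r\nsecond\rthird\nfourth"

lines = text.splitlines()         # recognizes \r\n, \r and \n; endings are dropped
naive = text.split("\n")          # only splits on \n, so stray \r characters remain
joined = "".join(lines)

print(lines)
print(naive)
print(joined)
```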
This works:
Change your file to:
LLKKKKKKKKMMMMMMMMNNNNNNNNNNNNN GGGGGGGGGHHHHHHHHHHHHHHHHHHHHEEEEEEEE
Then:
file = open("file.txt")
line = file.read()
words = line.split()
This creates a list named words that equals:
['LLKKKKKKKKMMMMMMMMNNNNNNNNNNNNN', 'GGGGGGGGGHHHHHHHHHHHHHHHHHHHHEEEEEEEE']
That got rid of the "\n". To answer the part about the brackets getting in your way, just do this:
for word in words: # Assuming words is the list above
    print word # Prints each word in file on a different line
Or:
print words[0] + ",", words[1] # Note that the "+" symbol indicates no spaces
#The comma not in parentheses indicates a space
This returns:
LLKKKKKKKKMMMMMMMMNNNNNNNNNNNNN, GGGGGGGGGHHHHHHHHHHHHHHHHHHHHEEEEEEEE
with open(player_name, 'r') as myfile:
    data=myfile.readline()
words=data.split(" ")
word=words[0]
This code reads the first line and then uses split to store the space-separated words of that line in a list.
Then you can easily access any word, or even store it in a string.
You can also do the same thing using a for loop.
file = open("myfile.txt", "r")
lines = file.readlines()
str = '' #string declaration
for i in range(len(lines)):
    str += lines[i].rstrip('\n') + ' '
print str
Try the following:
with open('data.txt', 'r') as myfile:
    data = myfile.read()
sentences = data.split('\n')
for sentence in sentences:
    print(sentence)
Caution: This does not remove the \n from data. It only prints the text line by line.

Deleting a line from a file in Python

I'm trying to delete a specific line that contains a specific string.
I've a file called numbers.txt with the following content:
peter
tom
tom1
yan
What I want to delete is that tom from the file, so I made this function:
def deleteLine():
    fn = 'numbers.txt'
    f = open(fn)
    output = []
    for line in f:
        if not "tom" in line:
            output.append(line)
    f.close()
    f = open(fn, 'w')
    f.writelines(output)
    f.close()
The output is:
peter
yan
As you can see, the problem is that the function deletes both tom and tom1, but I don't want to delete tom1. I want to delete just tom. This is the output that I want to have:
peter
tom1
yan
Any ideas to change the function to make this correctly?
change the line:
if not "tom" in line:
to:
if "tom" != line.strip():
That's because
if not "tom" in line
checks, whether tom is not a substring of the current line. But in tom1, tom is a substring. Thus, it is deleted.
You probably want one of the following:
if not "tom\n"==line # checks that the whole line is not exactly "tom"
if "tom\n" != line # the same check, written the classical way
if not "tom"==line.strip() # first removes surrounding whitespace from `line`
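Putting the exact-match comparison into a complete function might look like this Python 3 sketch. The name `delete_exact_line` and the returned count are additions for illustration only.

```python
def delete_exact_line(path, target):
    """Remove lines whose stripped content equals target; return how many were removed."""
    with open(path) as f:
        lines = f.readlines()
    kept = [line for line in lines if line.strip() != target]
    with open(path, "w") as f:
        f.writelines(kept)
    return len(lines) - len(kept)
```

For the numbers.txt example, `delete_exact_line("numbers.txt", "tom")` would drop the tom line and keep tom1.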
Just for fun, here's a two-liner to do it.
lines = list(filter(lambda x: x.rstrip("\n") != "tom", open("names.txt", "r")))
open("names.txt", "w").write("".join(lines))
Challenge: someone post a one-liner for this.
You could also use the fileinput module to get arguably the most readable result:
import fileinput
for l in fileinput.input("names.txt", inplace=1):
    if l != "tom\n": print l[:-1]
You can use regex.
import re
if not re.match("^tom$", line):
    output.append(line)
The $ means the end of the string.
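One detail worth knowing: $ also matches just before a trailing newline, which is why "^tom$" still matches a line read from a file as "tom\n". A more explicit alternative, sketched here with a hypothetical helper `is_exact`, strips the newline and uses re.fullmatch; re.escape keeps characters such as '.' in the word from being treated as regex syntax.

```python
import re


def is_exact(line, word):
    """True if line consists of word only, ignoring the trailing newline."""
    return re.fullmatch(re.escape(word), line.rstrip("\n")) is not None
```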
I'm new in programing and python (a few months)... this is my solution:
import fileinput
c = 0 # counter
for line in fileinput.input("korrer.csv", inplace=True, mode="rb"):
    # the line I want to delete
    if c == 3:
        c += 1
        pass
    else:
        line = line.replace("\n", "")
        print line
        c +=1
I'm sure there is a simpler way, just it's an idea. (my English it's not very good looking!!)
