Want to remove \n from len [duplicate] - python

This question already has answers here:
How to read a file line-by-line into a list?
(28 answers)
Closed 1 year ago.
here is my task:
You are provided a books.txt file, which includes the book titles, each one written on a separate line.
Read the title one by one and output the code for each book on a separate line.
For example, if the books.txt file contains:
Some book
Another book
Your program should output:
S9
A12
file = open("/usercode/files/books.txt", "r")
with file as f:
lines = f.readlines()
for i in lines:
count = len(i)
count = str(count - 1)
print(i[0]+count)
file.close()
and this outputs everything correct but the last line because i[#lastLine] is done after the last count if that makes any sense?(could be completely wrong I am learning)
Basically I want to know where I am going wrong in my code. I believe it is the way I structured the for i in lines part and should have handled the \n in a different way to
count = len(i)
count = str(count - 1)
ANSWER
Thank you for informing me, adding i = i.strip() strips new lines aka \n which sorted the problem!
Working code:
file = open("/usercode/files/books.txt", "r")
with file as f:
lines = f.readlines()
for i in lines:
i = i.strip('\n') #Strips new lines aka \n
count = str(len(i))
print(i[0]+count)
file.close()

You can use strip() on i to remove the new lines.
strip - Returns a copy of the string with the leading and trailing characters removed.
Python docs strip
you can cast your count as a str() to print.
str()
Returns a string containing a printable representation of an object.
Python docs str
As comments point out, suggest also changing i to line for readability.
file = open("books.txt", "r")
with file as f:
lines = f.readlines()
for line in lines:
count = str(len(line.strip()))
print(line[0]+count)
file.close()

Related

Split and print the word before and after the \ of *n of lines, from a txt to two different txt's

I searched around a bit, but I couldn't find a solution that fits my needs.
I'm new to python, so I'm sorry if what I'm asking is pretty obvious.
I have a .txt file (for simplicity I will call it inputfile.txt) with a list of names of folder\files like this:
camisos\CROWDER_IMAG_1.mov
camisos\KS_HIGHENERGY.mov
camisos\KS_LOWENERGY.mov
What I need is to split the first word (the one before the \) and write it to a txt file (for simplicity I will call it outputfile.txt).
Then take the second (the one after the \) and write it in another txt file.
This is what i did so far:
with open("inputfile.txt", "r") as f:
lines = f.readlines()
with open("outputfile.txt", "w") as new_f:
for line in lines:
text = input()
print(text.split()[0])
This in my mind should print only the first word in the new txt, but I only got an empty txt file without any error.
Any advice is much appreciated, thanks in advance for any help you could give me.
You can read the file in a list of strings and split each string to create 2 separate lists.
with open("inputfile.txt", "r") as f:
lines = f.readlines()
X = []
Y = []
for line in lines:
X.append(line.split('\\')[0] + '\n')
Y.append(line.split('\\')[1])
with open("outputfile1.txt", "w") as f1:
f1.writelines(X)
with open("outputfile2.txt", "w") as f2:
f2.writelines(Y)

Concatenate lines with previous line based on number of letters in first column

New to coding and trying to figure out how to fix a broken csv file to make be able to work with it properly.
So the file has been exported from a case management system and contains fields for username, casenr, time spent, notes and date.
The problem is that occasional notes have newlines in them and when exporting the csv the tooling does not contain quotation marks to define it as a string within the field.
see below example:
user;case;hours;note;date;
tnn;123;4;solved problem;2017-11-27;
tnn;124;2;random comment;2017-11-27;
tnn;125;3;I am writing a comment
that contains new lines
without quotation marks;2017-11-28;
HJL;129;8;trying to concatenate lines to re form the broken csv;2017-11-29;
I would like to concatenate lines 3,4 and 5 to show the following:
tnn;125;3;I am writing a comment that contains new lines without quotation marks;2017-11-28;
Since every line starts with a username (always 3 letters) I thought I would be able to iterate the lines to find which lines do not start with a username and concatenate that with the previous line.
It is not really working as expected though.
This is what I have got so far:
import re
with open('Rapp.txt', 'r') as f:
for line in f:
previous = line #keep current line in variable to join next line
if not re.match(r'^[A-Za-z]{3}', line): #regex to match 3 letters
print(previous.join(line))
Script shows no output just finishes silently, any thoughts?
I think I would go a slightly different way:
import re
all_the_data = ""
with open('Rapp.txt', 'r') as f:
for line in f:
if not re.search("\d{4}-\d{1,2}-\d{1,2};\n", line):
line = re.sub("\n", "", line)
all_the_data = "".join([all_the_data, line])
print (all_the_data)
There a several ways to do this each with pros and cons, but I think this keeps it simple.
Loop the file as you have done and if the line doesn't end in a date and ; take off the carriage return and stuff it into all_the_data. That way you don't have to play with looking back 'up' the file. Again, lots of way to do this. If you would rather use the logic of starts with 3 letters and a ; and looking back, this works:
import re
all_the_data = ""
with open('Rapp.txt', 'r') as f:
all_the_data = ""
for line in f:
if not re.search("^[A-Za-z]{3};", line):
all_the_data = re.sub("\n$", "", all_the_data)
all_the_data = "".join([all_the_data, line])
print ("results:")
print (all_the_data)
Pretty much what was asked for. The logic being if the current line doesn't start right, take out the previous line's carriage return from all_the_data.
If you need help playing with the regex itself, this site is great: http://regex101.com
The regex in your code matches to all the lines (string) in the txt (finds a valid match to the pattern). The if condition is never true and hence nothing prints.
with open('./Rapp.txt', 'r') as f:
join_words = []
for line in f:
line = line.strip()
if len(line) > 3 and ";" in line[0:4] and len(join_words) > 0:
print(';'.join(join_words))
join_words = []
join_words.append(line)
else:
join_words.append(line)
print(";".join(join_words))
I've tried to not use regex here to keep it a little clear if possible. But, regex is a better option.
A simple way would be to use a generator that acts as a filter on the original file. That filter would concatenate a line to the previous one if it has not a semicolon (;) in its 4th column. Code could be:
def preprocess(fd):
previous = next(fd)
for line in fd:
if line[3] == ';':
yield previous
previous = line
else:
previous = previous.strip() + " " + line
yield previous # don't forget last line!
You could then use:
with open(test.txt) as fd:
rd = csv.DictReader(preprocess(fd))
for row in rd:
...
The trick here is that the csv module only requires on object that returns a line each time next function is applied to it, so a generator is appropriate.
But this is only a workaround and the correct way would be that the previous step directly produces a correct CSV file.

Python - replace a line by its column in file

Sorry for posting such an easy question, but i couldn't find an answer on google.
I wish my code to do something like this
code:
lines = open("Bal.txt").write
lines[1] = new_value
lines.close()
p.s i wish to replace the line in a file with a value
xxx.dat before:
ddddddddddddddddd
EEEEEEEEEEEEEEEEE
fffffffffffffffff
with open('xxx.txt','r') as f:
x=f.readlines()
x[1] = "QQQQQQQQQQQQQQQQQQQ\n"
with open('xxx.txt','w') as f:
f.writelines(x)
xxx.dat after:
ddddddddddddddddd
QQQQQQQQQQQQQQQQQQQ
fffffffffffffffff
Note:f.read() returns a string, whereas f.readlines() returns a list, enabling you to replace an occurrence within that list.
Inclusion of the \n (Linux) newline character is important to separate line[1] from line[2] when you next read the file, or you would end up with:
ddddddddddddddddd
QQQQQQQQQQQQQQQQQQQfffffffffffffffff

Finding string in a file [duplicate]

This question already has answers here:
Accessing the index in 'for' loops
(26 answers)
How can I use `return` to get back multiple values from a loop? Can I put them in a list?
(2 answers)
How should I read a file line-by-line in Python?
(3 answers)
Closed 6 years ago.
First timer here with really using files and I/O. I'm running my code through a tester and the tester calls the different files I'm working with through my code. So for this, I'm representing the file as "filename" below and the string I'm looking for in that file as "s". I'm pretty sure I'm going through the lines of the code and searching for the string correctly. This is what I have for that :
def locate(filename, s):
file= open(filename)
line= file.readlines()
for s in line:
if s in line:
return [line.count]
I'm aware the return line isn't correct. How would I return the number of the line that the string I'm looking for is located on as a list?
You can use enumerate to keep track of the line number:
def locate(filename, s):
with open(filename) as f:
return [i for i, line in enumerate(f, 1) if s in line]
In case the searched string can be found from first and third line it will produce following output:
[1, 3]
You can use enumerate.
Sample Text File
hello hey s hi
hola
s
Code
def locate(filename, letter_to_find):
locations = []
with open(filename, 'r') as f:
for line_num, line in enumerate(f):
for word in line.split(' '):
if letter_to_find in word:
locations.append(line_num)
return locations
Output
[0, 2]
As we can see it shows that the string s on lines 0 and 2.
Note computers start counting at 0
Whats going on
Opens the file with read permissions.
Iterates over each line, enumerateing them as it goes, and keeping track of the line number in line_num.
Iterates over each word in the line.
If the letter_to_find that you passed into the function is in word, it appends the line_num to locations.
return locations
These are the problem lines
for s in line:
if s in line:
you have to read line into another variable apart from s
def locate(filename, s):
file= open(filename)
line= file.readlines()
index = 0;
for l in line:
print l;
index = index + 1
if s in l:
return index
print locate("/Temp/s.txt","s")

Python using re module to parse an imported text file

def regexread():
import re
result = ''
savefileagain = open('sliceeverfile3.txt','w')
#text=open('emeverslicefile4.txt','r')
text='09,11,14,34,44,10,11, 27886637, 0\n561, Tue, 5,Feb,2013, 06,25,31,40,45,06,07, 19070109, 0\n560, Fri, 1,Feb,2013, 05,21,34,37,38,01,06, 13063500, 0\n559, Tue,29,Jan,2013,'
pattern='\d\d,\d\d,\d\d,\d\d,\d\d,\d\d,\d\d'
#with open('emeverslicefile4.txt') as text:
f = re.findall(pattern,text)
for item in f:
print(item)
savefileagain.write(item)
#savefileagain.close()
The above function as written parses the text and returns sets of seven numbers. I have three problems.
Firstly the 'read' file which contains exactly the same text as text='09,...etc' returns a TypeError expected string or buffer, which I cannot solve even by reading some of the posts.
Secondly, when I try to write results to the 'write' file, nothing is returned and
thirdly, I am not sure how to get the same output that I get with the print statement, which is three lines of seven numbers each which is the output that I want.
This should do the trick:
import re
filename = 'sliceeverfile3.txt'
pattern = '\d\d,\d\d,\d\d,\d\d,\d\d,\d\d,\d\d'
new_file = []
# Make sure file gets closed after being iterated
with open(filename, 'r') as f:
# Read the file contents and generate a list with each line
lines = f.readlines()
# Iterate each line
for line in lines:
# Regex applied to each line
match = re.search(pattern, line)
if match:
# Make sure to add \n to display correctly when we write it back
new_line = match.group() + '\n'
print new_line
new_file.append(new_line)
with open(filename, 'w') as f:
# go to start of file
f.seek(0)
# actually write the lines
f.writelines(new_file)
You're sort of on the right track...
You'll iterate over the file:
How to iterate over the file in python
and apply the regex to each line. The link above should really answer all 3 of your questions when you realize you're trying to write 'item', which doesn't exist outside of that loop.

Categories

Resources