make list from text file and compare the lists

The full.txt contains:
www.example.com/a.jpg
www.example.com/b.jpg
www.example.com/k.jpg
www.example.com/n.jpg
www.example.com/x.jpg
The partial.txt contains:
a.jpg
k.jpg
Why does the following code not produce the desired result?
with open('full.txt', 'r') as infile:
    lines_full = [line for line in infile]
with open('partial.txt', 'r') as infile:
    lines_partial = [line for line in infile]
with open('remaining.txt', 'w') as outfile:
    for element in lines_full:
        if element[16:21] not in lines_partial:  # element[16:21] is the file name, e.g. a.jpg
            outfile.write(element)
The desired remaining.txt should have those elements of full.txt that are not in partial.txt exactly as follows:
www.example.com/b.jpg
www.example.com/n.jpg
www.example.com/x.jpg

You can use the os.path module:
from os import path
with open('full.txt', 'r') as f:
    lines_full = f.read().splitlines()
with open('partial.txt', 'r') as f:
    lines_partial = set(f.read().splitlines())  # a set makes membership checks faster
lines_new = [x + '\n' for x in lines_full if path.split(x)[1] not in lines_partial]
with open('remaining.txt', 'w') as f:
    f.writelines(lines_new)

Iterating over the file keeps the newline character at the end of each line, so lines_partial holds "a.jpg\n" rather than "a.jpg", and the comparison with element[16:21] never matches:
with open('partial.txt', 'r') as infile:
    lines_partial = [line for line in infile]
Change it to
with open('partial.txt', 'r') as infile:
    lines_partial = [line.rstrip('\n') for line in infile]
to get rid of the newline characters. rstrip('\n') removes a trailing newline only if one is present, whereas line[:-1] ("without the last character of the line") would also cut a real character off a last line that has no newline.
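Putting both answers together, a minimal sketch might look like the following. The function name write_remaining and its parameters are made up for illustration, and posixpath.basename is used here on the assumption that every entry separates the file name with a forward slash.

```python
import posixpath


def write_remaining(full_path, partial_path, out_path):
    """Copy the lines of full_path whose file name is not listed in partial_path."""
    # Collect the names to exclude; rstrip("\n") drops only the line terminator.
    with open(partial_path) as f:
        excluded = {line.rstrip("\n") for line in f}

    kept = []
    with open(full_path) as f:
        for line in f:
            url = line.rstrip("\n")
            # posixpath.basename returns the part after the last "/".
            if url and posixpath.basename(url) not in excluded:
                kept.append(url)

    with open(out_path, "w") as f:
        f.writelines(url + "\n" for url in kept)
    return kept
```

Comparing on the base name instead of a fixed slice such as element[16:21] means the check keeps working if the host name or the file name changes length.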

Related

How to add empty lines between lines in text.txt document?

The purpose of the code is to add an empty line between the lines of a text.txt document and write some words in those empty lines.
I tried looping through every line, but for that the file has to be in read mode only:
iushnaufihsnuesa
fsuhadnfuisgadnfuigasdf
asfhasndfusaugdf
suhdfnciusgenfuigsaueifcas
This is a sample of the text.txt document.
How can I implement this for this file?
f = open("text.txt", 'w+')
for x in f:
    f.write("\n Words between spacing")
f.close()
First I tried to directly add a new line between each pair of lines and put some text in it.
I also thought of first making empty lines between each line and then adding some words in the empty spaces, but I couldn't figure this out.

Ok, for files in the region of 200 lines long you can store the whole file as a list of strings and add lines when re-writing the file:
with open("text.txt", 'r') as f:
    data = [line for line in f]
with open("text.txt", 'w') as f:
    for line in data:
        f.write(line)
        f.write("Words between spacing\n")

You can divide this operation into three steps.
In the first one, read all the lines from the file into a list[str] using f.readlines():
with open("text.txt", "r") as f:  # using "read" mode
    lines = f.readlines()
The second is to join the lines in the list with the str.join(...) method, using the text to insert as the separator:
lines = "My line between the lines\n".join(lines)
In the third step, write the result back to the file:
with open("text.txt", "w") as f:  # using "write" mode
    f.write(lines)
Alternatively, you can use f.read() in conjunction with text.replace("\n", ...):
with open("text.txt", "r") as f:
    full_text = f.read()
full_text = full_text.replace("\n", "\nMy desirable text between the lines\n")
with open("text.txt", "w") as f:
    f.write(full_text)
Initial text:
iushnaufihsnuesa
fsuhadnfuisgadnfuigasdf
asfhasndfusaugdf
suhdfnciusgenfuigsaueifcas
Final text:
iushnaufihsnuesa
My desirable text between the lines
fsuhadnfuisgadnfuigasdf
My desirable text between the lines
asfhasndfusaugdf
My desirable text between the lines
suhdfnciusgenfuigsaueifcas
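The join-based idea can be wrapped in a small helper, sketched below. The name insert_between is hypothetical, and the sketch assumes the file is small enough to read into memory at once.

```python
def insert_between(path, text):
    """Rewrite the file at path with text on its own line between existing lines."""
    with open(path) as f:
        lines = f.read().splitlines()  # splitlines() drops the line terminators

    # join() places the separator only between items, never after the last one.
    separator = "\n" + text + "\n"
    with open(path, "w") as f:
        f.write(separator.join(lines) + ("\n" if lines else ""))
```

Because the separator is only placed between items, nothing is added after the last line, even when the original file ends with a newline.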

Adding a comma to end of first row of csv files within a directory using python

I've got some code that opens all CSV files in a directory and removes the top 2 lines of each file. Ideally, during this process I would also like it to add a single comma at the end of the new first line (what was originally line 3).
Another possible approach would be to remove the trailing commas on all the other rows in each of the CSVs.
Any thoughts or approaches would be gratefully received.
import glob
path = r'P:\pytest'
for filename in glob.iglob(path+'/*.csv'):
    with open(filename, 'r') as f:
        lines = f.read().split("\n")
        f.close()
    if len(lines) >= 1:
        lines = lines[2:]
        o = open(filename, 'w')
        for line in lines:
            o.write(line+'\n')
        o.close()

Adding a counter in there can solve this:
import glob
path=r'C:/Users/dsqallihoussaini/Desktop/dev_projects/stack_over_flow'
for filename in glob.iglob(path+'/*.csv'):
    with open(filename, 'r') as f:
        lines = f.read().split("\n")
        print(lines)
        f.close()
    if len(lines) >= 1:
        lines = lines[2:]
        o = open(filename, 'w')
        counter=0
        for line in lines:
            counter=counter+1
            if counter==1:
                o.write(line+',\n')
            else:
                o.write(line+'\n')
        o.close()

One possible problem with your code is that you are reading the whole file into memory, which might be fine. If you are reading larger files, then you want to process the file line by line.
The easiest way to do that is to use the fileinput module: https://docs.python.org/3/library/fileinput.html
Something like the following should work:
#!/usr/bin/env python3
import glob
import fileinput
# inplace makes a backup of the file, then any output to stdout is written
# to the current file.
# Change the glob; the pattern below is just an example.
#
# Iterate through each file in the glob.iglob() results
with fileinput.input(files=glob.iglob('*.csv'), inplace=True) as f:
    for line in f:  # Iterate over each line of the current file.
        if f.filelineno() > 2:  # Skip the first two lines
            # Note: 'line' has the newline in it.
            # Insert the comma if line 3 of the file, otherwise output original line
            if f.filelineno() == 3:
                print(line[:-1] + ',')
            else:
                print(line, end="")

I've added an encoding as well, since mine was throwing an error; specifying the encoding fixed that nicely.
import glob
path=r'C:/whateveryourfolderis'
for filename in glob.iglob(path+'/*.csv'):
    with open(filename, 'r',encoding='utf-8') as f:
        lines = f.read().split("\n")
        #print(lines)
        f.close()
    if len(lines) >= 1:
        lines = lines[2:]
        o = open(filename, 'w',encoding='utf-8')
        counter=0
        for line in lines:
            counter=counter+1
            if counter==1:
                o.write(line+',\n')
            else:
                o.write(line+'\n')
        o.close()
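For comparison, here is a compact sketch of the same idea without a counter. The function name trim_csv_headers and its parameters are assumptions for illustration, not part of the answers above. It uses splitlines() instead of split("\n"), which avoids the empty trailing element that would otherwise be written back as an extra blank line.

```python
import glob
import os


def trim_csv_headers(folder, encoding="utf-8"):
    """Drop the first two lines of every CSV in folder and append a comma to the new first line."""
    for filename in glob.iglob(os.path.join(folder, "*.csv")):
        with open(filename, "r", encoding=encoding) as f:
            lines = f.read().splitlines()

        lines = lines[2:]  # discard the first two lines
        if lines:
            lines[0] += ","  # the former line 3 is now the first line

        with open(filename, "w", encoding=encoding) as f:
            f.writelines(line + "\n" for line in lines)
```

Like the answers above, this reads each file fully into memory and overwrites it in place, so it may be worth trying on copies first.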

reading .txt file in python

I have a problem with some code in Python. I want to read a .txt file. I use this code:
f = open('test.txt', 'r') # We need to re-open the file
data = f.read()
print(data)
I would like to read ONLY the first line from this .txt file. I use
f = open('test.txt', 'r') # We need to re-open the file
data = f.readline(1)
print(data)
But only the first letter of the line is shown on screen.
Could you help me read all the letters of the line? (I mean, read the whole first line of the .txt file.)

with open("file.txt") as f:
    print(f.readline())
This opens the file in a with context block (which closes the file automatically when we are done with it) and reads the first line. It is the same as:
f = open("file.txt")
print(f.readline())
f.close()
Your attempt with f.readline(1) won't work because the argument is the maximum number of characters to read from the line, so it returns only the first character.
Second method:
with open("file.txt") as f:
    print(f.readlines()[0])
This gets a list of all the lines and prints only the first one.
To read the fifth line, use
with open("file.txt") as f:
    print(f.readlines()[4])
Or:
with open("file.txt") as f:
    lines = []
    lines.append(f.readline())
    lines.append(f.readline())
    lines.append(f.readline())
    lines.append(f.readline())
    lines.append(f.readline())
    print(lines[-1])
The -1 index refers to the last item of the list.
Learn more:
with statement
files in python
readline method

Your first try is almost there; you should have done the following:
f = open('my_file.txt', 'r')
line = f.readline()
print(line)
f.close()
A safer approach to reading a file is:
with open('my_file.txt', 'r') as f:
    print(f.readline())
Both ways will print only the first line.
Your error was that you passed 1 to readline, which limits the read to a size of 1, that is, a single character. Please refer to https://www.w3schools.com/python/ref_file_readline.asp

I tried this and it works, after your suggestions:
f = open('test.txt', 'r')
data = f.readlines()[1]
print(data)

Use with open(...) instead:
with open("test.txt") as file:
    line = file.readline()
    print(line)

Call f.readline() without parameters.
It returns the first line as a string and moves the cursor to the second line.
The next time you call f.readline(), it returns the second line and moves the cursor to the next one, and so on.
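The difference between readline(1) and readline() can be checked with a short, self-contained sketch. It writes a throwaway temporary file (the file contents and variable names are made up for this example) so that it does not depend on test.txt existing.

```python
import os
import tempfile

# Create a small throwaway file so the example does not depend on test.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first line\nsecond line\n")
    demo_path = tmp.name

with open(demo_path) as f:
    one_char = f.readline(1)      # reads at most 1 character: "f"
    rest_of_first = f.readline()  # continues from the cursor: "irst line\n"
    second = f.readline()         # the next call returns the next line
    at_end = f.readline()         # an empty string signals end of file

os.remove(demo_path)
```

Note that the returned lines keep their trailing newline, which is why print(f.readline()) shows an extra blank line unless the newline is stripped first.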

Copying content of file into another file in Python [duplicate]

I would like to copy certain lines of text from one text file to another. In my current script, when I search for a string, it copies everything afterwards. How can I copy just a certain part of the text, e.g. only copy a line when it has "tests/file/myword" in it?
Current code:
#!/usr/bin/env python
f = open('list1.txt')
f1 = open('output.txt', 'a')
doIHaveToCopyTheLine = False
for line in f.readlines():
    if 'tests/file/myword' in line:
        doIHaveToCopyTheLine = True
    if doIHaveToCopyTheLine:
        f1.write(line)
f1.close()
f.close()

The one-liner:
open("out1.txt", "w").writelines([l for l in open("in.txt").readlines() if "tests/file/myword" in l])
Recommended, using with:
with open("in.txt") as f:
    lines = f.readlines()
    lines = [l for l in lines if "tests/file/myword" in l]
    with open("out.txt", "w") as f1:
        f1.writelines(lines)
Using less memory:
with open("in.txt") as f:
    with open("out.txt", "w") as f1:
        for line in f:
            if "tests/file/myword" in line:
                f1.write(line)

readlines() reads the entire input file into a list, which wastes memory on large files. Just iterate over the lines in the file. I used 'with' on output.txt so that it is automatically closed when done. I did not use it on 'list1.txt'; in CPython that file is normally closed once the for loop ends and the file object is garbage-collected, although the language does not guarantee this.
#!/usr/bin/env python
with open('output.txt', 'a') as f1:
    for line in open('list1.txt'):
        if 'tests/file/myword' in line:
            f1.write(line)

Just a slightly cleaned up way of doing this. This is no more or less performant than ATOzTOA's answer, but there's no reason to do two separate with statements.
with open(path_1, 'a') as file_1, open(path_2, 'r') as file_2:
    for line in file_2:
        if 'tests/file/myword' in line:
            file_1.write(line)

Safe and memory-saving:
with open("out1.txt", "w") as fw, open("in.txt","r") as fr:
    fw.writelines(l for l in fr if "tests/file/myword" in l)
It doesn't create temporary lists (which readlines() and a [] list comprehension would do, a non-starter if the file is huge); everything is done with a generator expression, and the with block ensures that the files are closed on exit.

f=open('list1.txt')
f1=open('output.txt','a')
for x in f.readlines():
    f1.write(x)
f.close()
f1.close()
This will work; try it once. Note that it copies every line of list1.txt into output.txt without filtering on the search string.

In Python 3.10 and later, parenthesized context managers let you use multiple context managers in one with block:
with (open('list1.txt') as fin, open('output.txt', 'w') as fout):
    fout.write(fin.read())

f = open('list1.txt')
f1 = open('output.txt', 'a')
# doIHaveToCopyTheLine=False
for line in f.readlines():
    if 'tests/file/myword' in line:
        f1.write(line)
f1.close()
f.close()
Now your code will work. Try this one.
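The answers above share one pattern: iterate over the input lazily and write only the matching lines. A reusable sketch of that pattern follows; the function name copy_matching_lines, its parameters, and the returned count are assumptions added for illustration.

```python
def copy_matching_lines(src_path, dst_path, needle, mode="w"):
    """Copy every line of src_path that contains needle into dst_path.

    Pass mode="a" to append to an existing output file instead of overwriting it.
    Returns the number of lines copied.
    """
    copied = 0
    with open(src_path) as src, open(dst_path, mode) as dst:
        for line in src:  # reads one line at a time, not the whole file
            if needle in line:
                dst.write(line)
                copied += 1
    return copied
```

For the question's files, a call would look like copy_matching_lines('list1.txt', 'output.txt', 'tests/file/myword', mode='a').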

Take lines from two files, output into same line- python

I am attempting to open two files then take the first line in the first file, write it to an out file, then take the first line in the second file and append it to the same line in the output file, separated by a tab.
I've attempted to code this, and my outfile just ends up being the whole contents of the first file, followed by the entire contents of the second file. I included print statements just because I wanted to see something going on in the terminal while the script was running, that is why they are there. Any ideas?
import sys
InFileName = sys.argv[1]
InFile = open(InFileName, 'r')
InFileName2 = sys.argv[2]
InFile2 = open(InFileName2, 'r')
OutFileName = "combined_data.txt"
OutFile = open(OutFileName, 'a')
for line in InFile:
    OutFile.write(str(line) + '\t')
    print(line)
for line2 in InFile2:
    OutFile.write(str(line2) + '\n')
    print(line)
InFile.close()
InFile2.close()
OutFile.close()

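You can use zip for this:
with open(file1) as f1, open(file2) as f2, open("combined_data.txt", "w") as fout:
    for t in zip(f1, f2):
        fout.write('\t'.join(x.strip() for x in t) + '\n')
If your two files don't have the same number of lines, zip stops at the shorter one; to keep the extra lines, use itertools.zip_longest(f1, f2, fillvalue='') (itertools.izip_longest in Python 2). In Python 2, zip also builds the whole list in memory, so for REALLY BIG files use itertools.izip there; in Python 3, zip is already lazy.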

Perhaps this gives you a few ideas:
Adding entries from multiple files in python
o = open('output.txt', 'w')
fh = open('input.txt', 'r')
fh2 = open('input2.txt', 'r')
for line in fh.readlines():
    o.write(line.rstrip('\r\n') + '\t' + fh2.readline().rstrip('\r\n') + '\n')
## If you want to write the remaining lines from input2.txt:
# for line in fh2.readlines():
#     o.write(line.rstrip('\r\n') + '\n')
fh.close()
fh2.close()
o.close()
This will give you:
line1_of_file_1 line1_of_file_2
line2_of_file_1 line2_of_file_2
line3_of_file_1 line3_of_file_2
line4_of_file_1 line4_of_file_2
Where the space in my output example is a [tab]
Note: no line ending is appended to the file for obvious reasons.
For this to work, the line endings need to be consistent in both file 1 and file 2.
To check this:
print('File 1:')
f = open('input.txt', 'r', newline='')  # newline='' keeps \r\n untranslated
print([f.read()[:200]])
f.close()
print('File 2:')
f = open('input2.txt', 'r', newline='')
print([f.read()[:200]])
f.close()
This should give you something like
File 1:
['This is\ta lot of\t text\r\nWith a few line\r\nendings\r\n']
File 2:
['Give\r\nMe\r\nSome\r\nLove\r\n']
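Combining the two answers, a Python 3 sketch using itertools.zip_longest might look like this. The function name merge_side_by_side and its parameters are hypothetical; the sketch pads the shorter file with empty strings so that no lines are dropped.

```python
from itertools import zip_longest


def merge_side_by_side(path1, path2, out_path):
    """Write line i of path1 and line i of path2 on the same output line, tab-separated."""
    with open(path1) as f1, open(path2) as f2, open(out_path, "w") as out:
        # zip_longest keeps going until the longer file is exhausted,
        # substituting "" for the missing lines of the shorter one.
        for left, right in zip_longest(f1, f2, fillvalue=""):
            out.write(left.rstrip("\r\n") + "\t" + right.rstrip("\r\n") + "\n")
```

Stripping only "\r\n" from the right-hand side of each line removes the line ending while leaving any tabs or spaces inside the data untouched.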
