This is my first time here, and I hope you can help.
I'm new to Python.
I have two .txt files.
Here is an example:
file1.txt
customer1.com
customer2.com
customer3.com
customer4.com
customer5.com
customer6.com
customer7.com
customer8.com
customer9.com
file2.txt
service1
service2
service3
I want to loop over file2.txt for every line of file1.txt,
like the following example:
customer1.com/service1
customer1.com/service2
customer1.com/service3
customer2.com/service1
customer2.com/service2
customer2.com/service3
customer3.com/service1
customer3.com/service2
customer3.com/service3
and so on until file1.txt is done.
I also need an if statement.
For example, let's say customer number 3 has service number 2 (I mean the file is found):
customer3.com/service2 [service found]
I need the loop for customer3 to stop looking for services and save the output (customer3.com/service2) in a new file called file3.txt.
The loop should then continue with the other customers, and for every customer that has a service found, the output is saved in file3.txt.
I hope you understand what I mean.
Thanks.
You could use itertools.product to get a Cartesian product of the lines from each file, which gives every URL combination:
from itertools import product

with open("file1.txt") as f1, open("file2.txt") as f2, open(
    "file3.txt", mode="w"
) as out:
    for x, y in product(f1, f2):
        out.write("%s/%s\n" % (x.strip(), y.strip()))
file3.txt
customer1.com/service1
customer1.com/service2
customer1.com/service3
customer2.com/service1
customer2.com/service2
customer2.com/service3
customer3.com/service1
customer3.com/service2
...
The loop task is easy: read each file and save its lines in a list, then write a file in that loop order. However, I do not understand the blank line or the "service found" logic. It is too general; please be more specific.
Example:
list1, list2 = [], []
with open("file1.txt", "r") as f1:
    line = f1.readline()
    while line:
        line = line.strip()
        list1.append(line)
        line = f1.readline()
with open("file2.txt", "r") as f2:
    line = f2.readline()
    while line:
        line = line.strip()
        list2.append(line)
        line = f2.readline()
with open("file3.txt", "w") as f3:
    for i in list1:
        for j in list2:
            f3.write(f"{i}/{j}\n")
        f3.write("\n")  # just for that blank line
Try this to read the files line by line and combine the lines accordingly.
with open('file1.txt', 'r') as file1, open('file2.txt', 'r') as file2:
    lines1 = file1.readlines()
    lines2 = file2.readlines()
for line_from_1 in lines1:
    for line_from_2 in lines2:
        # strip the trailing newlines so each URL prints on one line
        print(line_from_1.strip() + '/' + line_from_2.strip())
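None of the answers above covers the "service found" part of the question. A minimal sketch of that logic follows. It assumes a hypothetical service_exists(url) callable that returns True when a service is found. The question does not say how that check is done (an HTTP request, a file lookup, something else), so it is left as a parameter. The function names are made up for illustration.

```python
def first_found_services(customers, services, service_exists):
    """Yield the first customer/service URL per customer for which
    service_exists(url) is true, then move on to the next customer."""
    for customer in customers:
        for service in services:
            url = f"{customer}/{service}"
            if service_exists(url):  # hypothetical check supplied by the caller
                yield url
                break  # stop looking at further services for this customer


def save_found(customers_path, services_path, out_path, service_exists):
    """Read both input files and write every found URL to out_path."""
    with open(customers_path) as f1, open(services_path) as f2:
        customers = [line.strip() for line in f1 if line.strip()]
        services = [line.strip() for line in f2 if line.strip()]
    with open(out_path, "w") as out:
        for url in first_found_services(customers, services, service_exists):
            out.write(url + "\n")
```

Calling save_found("file1.txt", "file2.txt", "file3.txt", my_check) would then write one line per customer that has at least one service, where my_check stands for whatever test you use.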
I'm trying to compare two .txt files for changed or deleted lines. If a line was deleted, I want to output the deleted line, and if it was changed, I want to output the new line. I originally tried comparing line to line, but when something was deleted it didn't work for my purpose:
for line1 in f1:
    for line2 in f2:
        if line1 == line2:
            print("SAME", file=x)
        else:
            print(f"Original:{line1} / New:{line2}", file=x)
Then I tried not comparing line to line so I could figure out if something was deleted but I'm not getting any output:
def check_diff(f1, f2):
    check = {}
    for file in [f1, f2]:
        with open(file, 'r') as f:
            check[file] = []
            for line in f:
                check[file].append(line)
    diff = set(check[f1]) - set(check[f2])
    for line in diff:
        print(line.rstrip(), file=x)
I tried combining a lot of other questions previously asked similar to my problem to get this far, but I'm new to python so I need a little extra help. Thanks! Please let me know if I need to add any additional information.
The concept is simple. Let's say file1.txt is the original file, and file2.txt is the one we want to check for changes:
with open('file1.txt', 'r') as f:
    f1 = f.readlines()
with open('file2.txt', 'r') as f:
    f2 = f.readlines()
deleted = []
added = []
for l in f1:
    if l not in f2:
        deleted.append(l)
for l in f2:
    if l not in f1:
        added.append(l)
print('Deleted lines:')
print(''.join(deleted))
print('Added lines:')
print(''.join(added))
For every line in the original file, if that line isn't in the other file, the line has been deleted. If it's the other way around, the line has been added.
I am not sure how you would quantify a changed line (since you could count it as one deleted plus one added line), but perhaps something like the below would be of some aid. Note that if your files are large, it might be faster to store the data in a set instead of a list, since the former has typically a search time complexity of O(1), while the latter has O(n):
with open('file1.txt', 'r') as f1, open('file2.txt', 'r') as f2:
    file1 = set(f1.read().splitlines())
    file2 = set(f2.read().splitlines())
changed_lines = [line for line in file1 if line not in file2]
deleted_lines = [line for line in file2 if line not in file1]
print('Changed lines:\n' + '\n'.join(changed_lines))
print('Deleted lines:\n' + '\n'.join(deleted_lines))
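If the order of the lines matters, the standard library's difflib module can tell deletions, insertions, and replacements apart, which set membership alone cannot. The following is a sketch rather than a drop-in solution. The function name classify_changes is made up for illustration, and treating a "replace" block as a changed line is an assumption about what "changed" should mean here.

```python
import difflib


def classify_changes(old_lines, new_lines):
    """Split the differences between two line lists into
    deleted lines, changed (old, new) blocks, and added lines."""
    deleted, changed, added = [], [], []
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "delete":
            deleted.extend(old_lines[i1:i2])
        elif tag == "insert":
            added.extend(new_lines[j1:j2])
        elif tag == "replace":
            # a run of old lines was swapped for a run of new lines
            changed.append((old_lines[i1:i2], new_lines[j1:j2]))
    return deleted, changed, added
```

To use it on files, read each one with f.read().splitlines() and pass the two lists in, original first.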
The f1.write(line2) works but it does not replace the text in the file, it just adds it to the file. I want the file1 to be identical to file2 if they are different by overwriting the text from file1 with the text from file2
Here is my code:
with open("file1.txt", "r+") as f1, open("file2.txt", "r") as f2:
    for line1 in f1:
        for line2 in f2:
            if line1 == line2:
                print("same")
            else:
                print("different")
                f1.write(line2)
            break
f1.close()
f2.close()
I would read both files, create a new list with the differing elements replaced, and then write the entire list to the file:
with open('file2.txt', 'r') as f:
    content = [line.strip() for line in f]
with open('file1.txt', 'r') as j:
    content_a = [line.strip() for line in j]
for idx, item in enumerate(content_a):
    if content_a[idx] == content[idx]:
        print('same')
        pass
    else:
        print('different')
        content_a[idx] = content[idx]
with open('file1.txt', 'w') as k:
    k.write('\n'.join(content_a))
file1.txt before:
chrx#chrx:~/python/stackoverflow/9.28$ cat file1.txt
this
that
this
that
who #replacing
that
what
blah
code output:
same
same
same
same
different
same
same
same
file1.txt after:
chrx#chrx:~/python/stackoverflow/9.28$ cat file1.txt
this
that
this
that
vash #replaced who
that
what
blah
I want the file1 to be identical to file2
import shutil

with open('file2', 'rb') as f2, open('file1', 'wb') as f1:
    shutil.copyfileobj(f2, f1)
This will be faster as you don't have to read file1.
Your code is not working because you'd have to move the file's current pointer (with f1.seek()) to the correct position before writing the line.
In your code, you're reading a line first, and that positions the pointer after the line just read. When writing, the line data will be written in the file in that point, thus duplicating the line.
Since lines can have different sizes, making this work won't be easy, because even if you position the pointer correctly, if some line is modified to get bigger it would overwrite part of the next line inside the file when you write it. You would end up having to cache at least part of the file contents in memory anyway.
Better truncate the file (erase its contents) and write the other file data directly - then they will be identical. That's what the code in the answer does.
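If you still want to know whether the files differed before overwriting, the standard library can do the comparison for you. This is a minimal sketch: the function name make_identical is made up for illustration, and it assumes both arguments are paths to ordinary files.

```python
import filecmp
import os
import shutil


def make_identical(source, target):
    """Overwrite target with the contents of source unless they already match.
    Returns True if target was rewritten, False if nothing had to change."""
    # shallow=False compares the actual bytes, not just size and timestamps
    if os.path.exists(target) and filecmp.cmp(source, target, shallow=False):
        return False
    shutil.copyfile(source, target)
    return True
```

For the case in the question, that would be make_identical("file2.txt", "file1.txt").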
I am working on a Python program whose goal is to take the first word of each line in one file and put it beside the corresponding line of a different file.
This is the code snippet:
lines = open("x.txt", "r").readlines()
lines2 = open("c.txt", "r").readlines()
for line in lines:
    r = line.split()
    line1 = str(r[0])
    for line2 in lines2:
        l2 = line2
    rn = open("b.txt", "r").read()
    os = open("b.txt", "w").write(rn + line1+ "\t" + l2)
But it doesn't work correctly.
I want this tool to take the first word of each line in one file and put it beside the corresponding line from the other file, for all lines in the file.
For example:
File: 1.txt :
hello there
hi there
File: 2.txt :
michal smith
takawa sama
I want the result to be :
Output:
hello michal smith
hi takawa sama
By using the zip function, you can loop through both simultaneously. Then you can pull the first word from your greeting and add it to the name to write to the file.
greetings = open("x.txt", "r").readlines()
names = open("c.txt", "r").readlines()
with open("b.txt", "w") as output_file:
    for greeting, name in zip(greetings, names):
        greeting = greeting.split()[0]
        # name still ends with its own newline, so strip it before adding ours
        output = "{0} {1}\n".format(greeting, name.strip())
        output_file.write(output)
Yes, as Tigerhawk indicated, you want to use the zip function, which combines elements at the same index from different iterables into tuples (the ith tuple holds the ith element of each iterable).
Example code:
lines = open("x.txt", "r").readlines()
lines2 = open("c.txt", "r").readlines()
newlines = ["{} {}".format(x.split()[0], y) for x, y in zip(lines, lines2)]
with open("b.txt", "w") as opfile:
    # write() only accepts a string; writelines() accepts a list of strings
    opfile.writelines(newlines)
with open('x.txt', 'r') as lines:
    with open('c.txt', 'r') as lines2:
        with open('b.txt', 'w') as result:
            # In Python 3 the built-in map and zip are lazy;
            # they replace itertools.imap and izip from Python 2.
            words = map(lambda x: x.split()[0], lines)
            results = zip(words, lines2)
            for word, line in results:
                result_line = '{0} {1}'.format(word, line)
                result.write(result_line)
This code will work without loading files into memory.
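One thing to keep in mind with all of the zip-based answers: zip stops at the end of the shorter input, so extra lines in the longer file are silently dropped. If that matters, itertools.zip_longest keeps going. Here is a small sketch; the function name pair_first_words and the empty-string placeholder for missing lines are assumptions made for illustration.

```python
from itertools import zip_longest


def pair_first_words(greeting_lines, name_lines, missing=""):
    """Join the first word of each greeting line with the matching name line.
    Lines missing from the shorter input are replaced by `missing`."""
    for greeting, name in zip_longest(greeting_lines, name_lines, fillvalue=missing):
        words = greeting.split()
        first_word = words[0] if words else missing
        yield f"{first_word} {name.strip()}".strip()
```

Both arguments can be open file objects, so this also works without loading the files into memory.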
I have file1.txt, which contains these lines:
list 0
list 1
line 1
I want to write a line to file2.txt only if it does not already exist in file2.txt.
My code:
fo = open("file1.txt", "r")
fin = open("file2.txt", "a")
lines = fo.readlines()
for line in lines:
    if "list" in line:
        fin.write(line)
for line in lines:
    if "li" in line:
        fin.write(line)
Output: it writes the lines twice. I want each line written only once, even if it is matched more than once.
list 0
list 1
list 0
list 1
line 1
My output should be
list 0
list 1
line 1
My suggestion would be to first read all lines of file2.txt and put them into a suitable data structure (e.g. a set).
Then reopen file2.txt in append mode, iterate over all lines of file1.txt, and write only those that are not in the set (here, the in operator comes in handy):
with open("file2.txt", "r") as f2:
    lineset = set(f2)
with open("file2.txt", "a") as f2:
    with open("file1.txt", "r") as f1:
        for line in f1:
            if line not in lineset:
                f2.write(line)
This will read all the lines in file2 and only write a line to file2 if it's not already there. It will also close your files automatically by using the excellent "with" statement in Python. :)
with open("file1.txt", "r") as file1, open("file2.txt", "a+") as file2:
    file2.seek(0)  # "a+" starts at the end of the file; rewind to read it
    lines2 = file2.readlines()
    for line in file1:
        if line not in lines2:
            file2.write(line)
If you prefer a comprehension-style version, the same logic is shorter, but I prefer the readability of the first version.
with open("file1.txt", "r") as file1, open("file2.txt", "a+") as file2:
    file2.seek(0)
    existing = set(file2)
    file2.writelines(line for line in file1 if line not in existing)
Use a set to track the collection of lines in the file2.txt file.
fo = open("file1.txt", "r")
fin = open("file2.txt", "a+")  # "a+" so the file can be read as well as appended to
lines = fo.readlines()
# Rewind the file so that we can read its contents.
fin.seek(0)
existing_lines = set(fin)
for line in lines:
    if line not in existing_lines:
        fin.write(line)
        existing_lines.add(line)
fo.close()
fin.close()
You would want to do something like:
fo = open("file1.txt", "r")
fin = open("file2.txt", "a+")  # "a+" allows reading as well as appending
fin.seek(0)  # rewind before reading the existing lines
linesOut = fo.readlines()
linesIn = fin.readlines()
for lineOut in linesOut:
    # check each line in linesIn to see if it matches lineOut
    writeLine = True
    for lineIn in linesIn:
        if lineOut == lineIn:
            writeLine = False
    # if not, add it
    if writeLine:
        fin.write(lineOut)
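A detail the answers above do not handle: if the last line of file2.txt has no trailing newline, it will not compare equal to the same line read from file1.txt, and the first appended line will be glued onto it. The sketch below compares lines with the newline stripped. The function names are made up for illustration, and it assumes lines should match exactly apart from the line ending.

```python
def lines_to_append(source_lines, existing_lines):
    """Return the source lines (newline stripped) that are not in
    existing_lines and have not already been returned."""
    seen = {line.rstrip("\n") for line in existing_lines}
    new_lines = []
    for line in source_lines:
        text = line.rstrip("\n")
        if text not in seen:
            seen.add(text)
            new_lines.append(text)
    return new_lines


def append_missing(source_path, target_path):
    """Append to target_path every line of source_path it does not contain."""
    with open(source_path) as src, open(target_path, "a+") as dst:
        dst.seek(0)  # "a+" opens at the end, so rewind before reading
        existing = dst.read()
        new_lines = lines_to_append(src, existing.splitlines())
        # make sure the appended text starts on its own line
        if existing and not existing.endswith("\n"):
            dst.write("\n")
        dst.writelines(text + "\n" for text in new_lines)
```

For the files in the question this would be append_missing("file1.txt", "file2.txt").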
I have two text files: file1 has 40 lines and file2 has 1.3 million lines.
I would like to compare every line in file1 with file2.
If a line in file1 appears once or multiple times in file2,
that line (or lines) should be deleted from file2, and the remaining lines of file2 written to a third file, file3.
So far I can only remove one line of file1 from file2 by manually copying it into my code,
where it is the "unwanted_line" variable. Does anyone know how to do this in Python?
Thanks in advance for your assistance.
Here's my code:
fname = open(raw_input('Enter input filename: ')) #file2
outfile = open('Value.txt','w')
unwanted_line = "222" #This is in file1
for line in fname.readlines():
    if not unwanted_line in line:
        # now remove unwanted_line from fname
        data =line.strip("unwanted_line")
        # write it to the output file
        outfile.write(data)
print 'results written to:\n', os.getcwd()+'\Value.txt'
NOTE:
This is how I got it to work for me. I would like to thank everyone who contributed towards the solution; I took your ideas from here. I used set(): the intersection (common lines) of file1 and file2 is removed, and the unique lines in file2 are written to file3. It might not be the most elegant way of doing it, but it works for me. I respect every one of your ideas; they are great and wonderful, and they make me feel Python is the only programming language in the whole world.
Thanks, guys.
def diff_lines(filenameA, filenameB):
    fnameA = set(filenameA)
    fnameB = set(filenameB)
    data = []
    # identify lines not common to both files
    # diff_line = fnameB ^ fnameA
    diff_line = fnameA.symmetric_difference(fnameB)
    data = list(diff_line)
    data.sort()
    return data
Read file1; put the lines into a set or dict (it'll have to be a dict if you're using a really old version of Python); now go through file2 and say something like if line not in things_seen_in_file_1: outfile.write(line) for each line.
Incidentally, in recent Python versions you shouldn't bother calling readlines: an open file is an iterator and you can just say for line in open(filename2): ... to process each line of the file.
Here is my version, but be aware that minuscule variations (like one space before the newline) can cause lines not to be considered the same.
file1, file2, file3 = 'verysmalldict.txt', 'uk.txt', 'not_small.txt'
drop_these = set(open(file1))
with open(file3, 'w') as outfile:
    outfile.write(''.join(line for line in open(file2) if line not in drop_these))
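If the whitespace caveat is a concern, one option is to compare stripped lines instead of raw ones. The following is a sketch under that assumption; the function names are made up for illustration. It still streams the large file line by line, so only the small file is held in memory.

```python
def filter_lines(unwanted_lines, candidate_lines):
    """Yield candidate lines whose stripped text is not among the unwanted ones."""
    unwanted = {line.strip() for line in unwanted_lines}
    for line in candidate_lines:
        if line.strip() not in unwanted:
            yield line


def write_remaining(small_path, big_path, out_path):
    """Write every line of big_path that does not appear in small_path to out_path."""
    with open(small_path) as small, open(big_path) as big, open(out_path, "w") as out:
        out.writelines(filter_lines(small, big))
```

Note that stripping also ignores leading whitespace, which may or may not be what you want for your data.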
with open(path1) as f1:
    lines1 = set(f1)
with open(path2) as f2:
    lines2 = tuple(f2)
# lines that also appear in file1, and lines that do not
lines3 = [x for x in lines2 if x in lines1]
lines2 = [x for x in lines2 if x not in lines1]
with open(path2, 'w') as f2:
    f2.writelines(lines2)
with open(path3, 'w') as f3:
    f3.writelines(lines3)
Closing f2 by using 2 separate with statements is a matter of personal preference/design choice.
What you can do is load file1 completely into memory (since it is small) and check whether each line in file2 matches a line in file1. If it doesn't, write it to file3. Something like this:
file1 = open('file1')
file2 = open('file2')
file3 = open('file3', 'w')
lines_from_file1 = []
# Read in all lines from file1
for line in file1:
    lines_from_file1.append(line)
file1.close()
# Now iterate over lines of file2
for line2 in file2:
    keep_this_line = True
    for line1 in lines_from_file1:
        if line1 == line2:
            keep_this_line = False
            break  # break out of inner for loop
    if keep_this_line:
        # line from file2 is not in file1 so save it into file3
        file3.write(line2)
file2.close()
file3.close()
Maybe not the most elegant solution, but if you don't have to do it every 3 seconds, it should work.
EDIT: By the way, the question in the text somewhat differs from the title. I tried to answer the question in the text.