I wrote some code that takes two .txt files, compares them, and exports the results to another .txt file. Below is my code (sorry about the mess).
Any ideas? Or am I just an imbecile?
Using python 3.5.2:
# Barcodes Search (V3actual)
# Import the text files, putting them into arrays/lists
with open('Barcodes1000', 'r') as f:
    barcodes = {line.strip() for line in f}
with open('EANstaging1000', 'r') as f:
    EAN_staging = {line.strip() for line in f}
##diff = barcodes ^ EAN_staging
##print (diff)
in_barcodes_but_not_in_EAN_staging = barcodes.difference(EAN_staging)
print (in_barcodes_but_not_in_EAN_staging)
# Exporting in_barcodes_but_not_in_EAN_staging to a .txt file
with open("BarcodesSearch29_06_16", "wt") as BarcodesSearch29_06_16: # Create .txt file
    BarcodesSearch29_06_16.write(in_barcodes_but_not_in_EAN_staging) # Write results to the .txt file
From the comments on your question, it sounds like you want to save your collection of strings to a file. file.write expects a single string as input, while file.writelines accepts any iterable of strings, which is what your data (a set) is. Note that writelines does not add line endings, so append "\n" to each element yourself:
with open("BarcodesSearch29_06_16", "wt") as BarcodesSearch29_06_16:
    BarcodesSearch29_06_16.writelines(line + "\n" for line in in_barcodes_but_not_in_EAN_staging)
That iterates through in_barcodes_but_not_in_EAN_staging and writes each element on its own line in the file BarcodesSearch29_06_16.
Try BarcodesSearch29_06_16.write(str(in_barcodes_but_not_in_EAN_staging)), which writes the set's printed representation as a single string. You do not need to call BarcodesSearch29_06_16.close() yourself: the with block closes the file as soon as it exits.
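For reference, here is a minimal end-to-end sketch of the comparison and export, assuming each input file holds one barcode per line. The names read_codes and write_missing, and the choice to sort the output, are illustrative and not part of the original code.

```python
def read_codes(path):
    """Return the set of non-empty, stripped lines in a text file."""
    with open(path, "r") as f:
        return {line.strip() for line in f if line.strip()}


def write_missing(barcodes_path, staging_path, out_path):
    """Write codes found in barcodes_path but not in staging_path, one per line."""
    missing = read_codes(barcodes_path) - read_codes(staging_path)
    with open(out_path, "w") as out:
        # Sets are unordered, so sort to get a reproducible file.
        for code in sorted(missing):
            out.write(code + "\n")
    return missing
```

It could be called as write_missing('Barcodes1000', 'EANstaging1000', 'BarcodesSearch29_06_16').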
Thank you for taking the time to read this post. This is literally my first time trying to use Python, so bear with me.
My Target/Goal: Edit the original text file (Original .txt file) so that an "OR" is inserted between each pair of listed domains (see the target formatting image below). Any help is greatly appreciated.
I have been able to google the information to open and read the txt file, however, I am not sure how to do the formatting part.
Script
Original .txt file
Target formatting
You can achieve this in a couple of lines:
with open(my_file) as fd:
    result = fd.read().replace("\n", " OR ")
You could then write this to another file with:
with open(formatted_file, "w") as fd:
    fd.write(result)
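One caveat: if the file ends with a newline, a plain replace leaves a trailing " OR " at the end of the result. A possible variant, sketched here with a hypothetical helper named join_with_or, splits the text into lines first and skips blank ones:

```python
def join_with_or(text):
    """Join the non-empty lines of text with ' OR ' between them."""
    domains = [line.strip() for line in text.splitlines() if line.strip()]
    return " OR ".join(domains)


print(join_with_or("example.com\nexample.org\nexample.net\n"))
# prints: example.com OR example.org OR example.net
```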
Something you could do is the following:
import re
# Open the file in read mode
with open('Original.txt', 'r') as file:
    # Read the contents of the file
    contents = file.read()
# It seems that your original file has a line break after each domain, so
# you could replace each run of line breaks with " OR " using a regular expression
contents = re.sub(r'\n+', ' OR ', contents)
# Then open the file in write mode
with open('Original.txt', 'w') as file:
    # and finally write the modified contents to the file
    file.write(contents)
A suggestion: you may want to write to a different file first to check that you are happy with the result (or make a copy of Original.txt just in case):
with open('AnotherOriginal.txt', 'w') as file:
    file.write(contents)
I am new to Python and using it for my internship. My goal is to pull specific data from about 100 .ls documents (all in the same folder), write it to another .txt file, and from there import it into Excel. My problem is that I can read all the files, but cannot figure out how to pull the specifics from each file into a list. From the list I want to write them into a .txt file and then import that into Excel.
Is there any way to make readlines() capture only certain lines?
It's hard to know exactly what you want without an example or sample code/content. What you might do is create a list and append the desired line to it.
result_list = [] # Create an empty list
with open("myfile.txt", "r") as f:
    lines = f.readlines() # read the lines of the file
for line in lines: # loop through the lines
    if "desired_string" in line:
        result_list.append(line) # if the line contains the string, the line is added
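To apply the same filter across a whole folder, something like the following could work. This is only a sketch: the folder name, the "*.ls" pattern, and the marker string are assumptions based on the question, and collect_matching_lines and write_lines are hypothetical helper names.

```python
from pathlib import Path


def collect_matching_lines(folder, pattern, marker):
    """Return every line containing marker from files in folder matching pattern."""
    matches = []
    for path in sorted(Path(folder).glob(pattern)):
        with open(path, "r") as f:
            for line in f:  # iterate lazily instead of calling readlines()
                if marker in line:
                    matches.append(line.rstrip("\n"))
    return matches


def write_lines(lines, out_path):
    """Write one collected line per row of a plain text file."""
    with open(out_path, "w") as out:
        out.writelines(line + "\n" for line in lines)
```

For example, write_lines(collect_matching_lines("data", "*.ls", "desired_string"), "results.txt") would gather the matches from every .ls file in a folder named data.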
Hi stackoverflow community,
Situation,
I'm trying to run a converter I found online.
However, what I want is for it to read a list of file paths from a text file and convert those files.
The reason is that these file paths are filtered manually, so I don't have to convert unnecessary files. There are a large number of unnecessary files in the folder.
How can I go about this? Thank you.
with open("file_path", 'r') as file_content:
    content = file_content.read()
content = content.split('\n')
You can read the file's contents using the method above, then convert them into a list (or any other iterable type) so that you can use a for loop. I used content = content.split('\n') to split the contents on '\n' (each time you press the Enter key, a newline character '\n' is written); you can split on any other character instead.
for i in content:
    pass  # replace with the code you want to execute for each path
Looking at your situation, I guess this is what you want (to convert only certain files in a directory), in which case you don't need an extra '.txt' file:
import os
for f in os.listdir(path):
    if f.startswith("Prelim") and f.endswith(".doc"):
        # os.listdir returns bare file names, so join them with the directory
        convert(os.path.join(path, f))
But if for some reason you want to stick with the ".txt" approach, this may help:
with open("list.txt") as f:
    lines = f.readlines()
for line in lines:
    # strip the trailing newline that readlines() keeps on each line
    convert(line.strip())
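A slightly more defensive sketch of the ".txt" approach is shown below. It assumes one path per line and takes the converter as an argument, since its real signature is not shown in the question; read_paths and convert_all are hypothetical names.

```python
def read_paths(list_file):
    """Return the non-blank lines of list_file with surrounding whitespace removed."""
    with open(list_file, "r") as f:
        return [line.strip() for line in f if line.strip()]


def convert_all(list_file, convert):
    """Call convert(path) for each listed path; return the paths that raised."""
    failed = []
    for path in read_paths(list_file):
        try:
            convert(path)
        except Exception as exc:  # keep going if one file fails
            print("could not convert {}: {}".format(path, exc))
            failed.append(path)
    return failed
```

Collecting the failures instead of stopping at the first one means a single bad path in the manually filtered list does not abort the whole batch.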
with open('task3randomtext.txt', 'r') as f:
    text = f.readlines()
This is the code I use to read text from a text file into my program, but it reads it in as a list and will not let me split it to convert it to an array. Is there any other way of splitting it or reading it in?
If you look at the Python documentation (which you should!), you can see that readlines() returns a list containing every line in the file. To then split each line, you could do something along the lines of:
with open('times.txt') as f:
    for line in f.readlines():
        print(line.split())
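If the goal is to keep the pieces rather than print them, a sketch along these lines may help. It assumes whitespace-separated fields, and split_lines is a hypothetical helper name.

```python
def split_lines(lines):
    """Split each line on whitespace, returning one list of fields per line."""
    return [line.split() for line in lines]


rows = split_lines(["alpha beta\n", "gamma delta epsilon\n"])
print(rows)
# prints: [['alpha', 'beta'], ['gamma', 'delta', 'epsilon']]
```

The same function accepts the list returned by f.readlines(), or the open file object itself.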
I'm writing a program where I export an Excel file to .txt and then have to import this .txt file into my program. The main goal is to extract the same part from each line, but the problem is that in the .txt file the rows of the Excel sheet are turned into one huge string with no \n. Do you know if there is a way to separate them within the program, and if so, how can I do it?
The file I'm working with can be downloaded in http://we.tl/YtixI1ck6l
and so far I was trying something like
ppi = []
for line in read_text:
    prot_interaction = line[0:14]
    ppi.append(prot_interaction)
result_ppi = []
for line in read_text:
    result = line[-1]
    result_ppi.append(result)
But since it's not formatted in lines but just in a single one I'm not getting any good results.
Using that file as an example, use the csv module to parse it.
Example:
import csv
with open('/tmp/Model_Oralome.txt', 'rU') as f:
    reader = csv.reader(f, delimiter="\t")
    for row in reader:
        print(row[0])
Prints:
ppi
C4FQL5;Q08426
C8PB60;D2NP19
P40189;Q05655
P22712;Q9NR31
...
P05783;P02751
B5E709;D2NPK7
Q8N7J2;Q9UKZ4
(BTW, the issue you may be having with this particular file is the line terminations are a CR only from a Mac Classic OS. You can fix that in Python by using the Universal Newline mode when you open the file...)
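On Python 3 the 'U' mode flag is deprecated, and open() rejects it from Python 3.11 onward; the csv module's documentation recommends opening the file with newline='' instead, and the reader then copes with CR-only line endings. A minimal sketch, assuming a tab-delimited file with a header row (first_column is a hypothetical helper name):

```python
import csv


def first_column(path, delimiter="\t"):
    """Return the first field of every data row, skipping the header row."""
    with open(path, "r", newline="") as f:
        reader = csv.reader(f, delimiter=delimiter)
        next(reader, None)  # skip the header line (e.g. "ppi")
        return [row[0] for row in reader if row]
```

Calling first_column('/tmp/Model_Oralome.txt') would return the values of the first column as a list instead of printing them.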
Excel is exporting the text file with carriage returns (\r) instead of newlines (\n), so the whole file can come back as a single line.
ppi = []
with open("Model_Oralome.txt", 'r') as f:
    # splitlines() breaks on \r, \n and \r\n alike
    lines = f.read().splitlines()
From here you can iterate through each line of lines. Since it looks like you want the value of the first column:
lines = lines[1:]
for line in lines:
    content = line.split('\t')
    ppi.append(content[0])