I'm having some difficulty with writing a program in Python. I would like the program to read lines between a set of characters, reverse the order of the lines and then write them into a new file. The input is:
AN10 G17 G21 G90
N20 '2014_12_08_Banding_Test_4
N30 M3 S1B
N40G00X0.000Y0.000Z17.000
N50 G00X0.001Y0.001Z17.000
N60 G01Z0.000F3900.0
N70 G01X0.251
N80 G01X149.999
N90 G01Y0.251
N100 G01X149.749
N110 G01X149.499Z-8.169
N120 G01X148.249Z-8.173
N130 G01X146.999Z-8.183
N140 G01X145.499Z-8.201
...
N3140 G01Y0.501
So far my code is:
with open('Source.nc') as infile, open('Output.nc', 'w') as outfile:
    copy = False
    strings_A = ("G01Y", ".251")
    strings_B = ("G01Y", ".501")
    content = infile.readlines()
    for lines in content:
        lines.splitlines(1)
        if all(x in lines for x in strings_A):
            copy = True
        elif all(x in lines for x in strings_B):
            copy = False
        elif copy:
            outfile.writelines(reversed(lines))
I think I am failing to understand something about the difference between lines and a multiline string. I would really appreciate some help here!
Thanks in advance, Arthur
A string has multiple lines if it contains newline characters \n.
You can think of a file as either one long string that contains newline characters:
s = infile.read()
Or you can treat it like a list of lines:
lines = infile.readlines()
If you have a multiline string you can split it into a list of lines:
lines = s.splitlines(False)
# which is basically a special form of:
lines = s.split('\n')
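The two calls are close but not identical, which matters when the text ends with a newline. A minimal sketch, using a made-up sample string rather than the file from the question:

```python
s = "first\nsecond\n"

# splitlines() drops the line endings and adds no trailing empty entry
print(s.splitlines())       # ['first', 'second']

# splitlines(True) keeps the line endings, like readlines() does
print(s.splitlines(True))   # ['first\n', 'second\n']

# split('\n') leaves an empty string after the final newline
print(s.split('\n'))        # ['first', 'second', '']
```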
If you want to process a file line by line, all of the following methods are equivalent (in effect, if not in efficiency):
with open(filename, 'r') as f:
    s = f.read()
lines = s.splitlines()
for line in lines:
    # do something
    pass

with open(filename, 'r') as f:
    lines = f.readlines()
for line in lines:
    # do something
    pass

# this last option is the most pythonic one:
# a file object is iterable and yields its lines one at a time
with open(filename, 'r') as f:
    for line in f:
        # do something
        pass
EDIT: Now the solution to your problem:
with open('Source.nc') as infile, open('Output.nc', 'w') as outfile:
    copy = False
    strings_A = ("G01Y", ".251")
    strings_B = ("G01Y", ".501")
    target_lines = []
    for line in infile:
        if copy and all(x in line for x in strings_B):
            outfile.writelines(reversed(target_lines))
            break
        if copy:
            target_lines.append(line)
        if all(x in line for x in strings_A):
            copy = True
This will copy all lines between a line that matches all(x in line for x in strings_A) and a line that matches all(x in line for x in strings_B) into the outfile in reversed order. The identifying lines are NOT included in the output (I hope that was the intent).
The order of the if clauses is deliberate to achieve that.
Also be aware that the identification tests you use (all(x in line for x in strings_A)) work as a substring search, not a word match. Again, I don't know if that was your intent.
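To make the substring caveat concrete, here is a small sketch comparing the substring test with a whole-token test. The helper names and the sample line are made up for illustration, and the token test assumes the command is separated from the line number by whitespace, as in most of the sample input:

```python
def matches_substrings(line, parts):
    """True if every part occurs anywhere in the line (the test used above)."""
    return all(p in line for p in parts)

def has_token(line, token):
    """True only if the line contains the token as a whole whitespace-separated word."""
    return token in line.split()

wanted = "N90 G01Y0.251\n"
lookalike = "N500 G01Y10.251\n"   # hypothetical line, not from the original file

print(matches_substrings(wanted, ("G01Y", ".251")))     # True
print(matches_substrings(lookalike, ("G01Y", ".251")))  # True, probably unwanted
print(has_token(wanted, "G01Y0.251"))                   # True
print(has_token(lookalike, "G01Y0.251"))                # False
```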
EDIT2: In response to a comment:
with open('Source.nc') as infile, open('Output.nc', 'w') as outfile:
    strings_A = ("G01Y", ".251")
    strings_B = ("G01Y", ".501")
    do_reverse = False
    lines_to_reverse = []
    for line in infile:
        if all(x in line for x in strings_B):
            do_reverse = False
            outfile.writelines(reversed(lines_to_reverse))
            lines_to_reverse = []  # don't emit the same block twice
            outfile.write(line)  # file objects have write(), not writeline()
            continue
        if do_reverse:
            lines_to_reverse.append(line)
            continue
        else:
            outfile.write(line)
        if all(x in line for x in strings_A):
            do_reverse = True
            lines_to_reverse = []
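The same idea can be tried without touching any files by moving the logic into a generator that accepts any iterable of lines. This is only a sketch: reverse_between is a hypothetical helper, it only looks for the end marker while inside a block, and it emits an unterminated block in its original order, which is an assumption about the desired behaviour.

```python
def reverse_between(lines, start_parts, end_parts):
    """Yield all lines, reversing each run that sits between the two markers."""
    buffer = []
    in_block = False
    for line in lines:
        if in_block and all(p in line for p in end_parts):
            # End marker reached: flush the buffered block backwards
            yield from reversed(buffer)
            buffer = []
            in_block = False
            yield line
        elif in_block:
            buffer.append(line)
        else:
            yield line
            if all(p in line for p in start_parts):
                in_block = True
    # Assumption: a block with no end marker is passed through unchanged
    yield from buffer

sample = [
    "N80 G01X149.999\n",
    "N90 G01Y0.251\n",
    "N100 G01X149.749\n",
    "N110 G01X149.499Z-8.169\n",
    "N3140 G01Y0.501\n",
]
result = list(reverse_between(sample, ("G01Y", ".251"), ("G01Y", ".501")))
print("".join(result), end="")
```

With files, the generator plugs straight into writelines: outfile.writelines(reverse_between(infile, strings_A, strings_B)).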
Related
I have a file that contains the following content. This is a sample of the file. The file contains up to 1000 values.
'1022409', '10856967', '11665741'
I need to read the file and create this list ['1022409', '10856967', '11665741']
I'm using the following code:
with open('Pafos.txt', 'r') as f:
    parcelIds_list = f.readlines()
print(parcelIds_list[0].split(','))
The value of parcelIds_list is the list ["'1022409', '10856967', '11665741'"], which has only one element, at index 0.
Any ideas please?
Try this code:
with open('Pafos.txt', 'r') as f:
    # Split the first line on the separator (strip the trailing newline first)
    num = f.readline().strip().split(', ')
# Remove the quotes to get a list of strings
num_list = [x.replace('\'', '') for x in num]
Output
['1022409', '10856967', '11665741']
If you want a list of ints:
num_list = [int(x.replace('\'', '')) for x in num]
Output
[1022409, 10856967, 11665741]
If you have more than one row in the file, you need to add some extra lines of code.
It is a bit hard to write code from the few lines of the text file you have given, but try this:
temp = []
with open('Pafos.txt', 'r') as f:
    parcelIds_list = f.readlines()
for j in parcelIds_list:
    temp.extend(j.split(", "))
new_list = [i.strip()[1:-1] for i in temp]
print(new_list)
Let me know if this doesn't work, and what went wrong. I will modify my answer accordingly.
I found the solution thanks to this.
with open('Pafos.txt', 'r') as f:
    parcelIds_list = f.readlines()
cs_mylist = []
for y in [x.split(',') for x in parcelIds_list]:
    for z in y:
        cs_mylist.append(z.replace(' ', ''))
Probably there is a cleaner way, but this works:
with open('Pafos.txt', 'r') as f:
    parcelIds_list = f.readlines()
line = parcelIds_list[0]
line = line.replace("'", "")
line = line.replace(" ", "")
line = line.split(',')
print(line)
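One possibly cleaner route is to let a regular expression pick out everything between single quotes, which also copes with several rows. This is a sketch only: parse_quoted_values is a made-up helper name, and the second row of the sample text is invented to show the multi-row case.

```python
import re

def parse_quoted_values(text):
    """Return every value found between single quotes, in order of appearance."""
    return re.findall(r"'([^']*)'", text)

# Sample text; the second row is invented for illustration
sample = "'1022409', '10856967', '11665741'\n'20000001', '20000002'\n"
print(parse_quoted_values(sample))
# ['1022409', '10856967', '11665741', '20000001', '20000002']
```

For the real file, the call would look like parse_quoted_values(f.read()) inside the with open('Pafos.txt') block.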
I am trying to compare the lines of two files and capture the lines that match each other. For example,
file1.txt contains
my
sure
file2.txt contains
my : 2
mine : 5
sure : 1
and I am trying to output
my : 2
sure : 1
I have the following code so far
inFile = "file1.txt"
dicts = "file2.txt"
with open(inFile) as f:
    content = f.readlines()
content = [x.strip() for x in content]
with open(dicts) as fd:
    inDict = fd.readlines()
inDict = [x.strip() for x in inDict]
ordered_dict = {}
for line in inDict:
    key = line.split(":")[0].strip()
    value = int(line.split(":")[1].strip())
    ordered_dict[key] = value
for (key, val) in ordered_dict.items():
    for entry in content:
        if entry == key:
            print(key, val)
        else:
            continue
However, this is very inefficient because it loops two times and iterates a lot. Therefore, this is not ideal when it comes to large files. How can I make this workable for large files?
You don't need nested loops. One loop to read in file2 and translate to a dict, and another loop to read file1 and look up the results.
inFile = "file1.txt"
dicts = "file2.txt"
ordered_dict = {}
with open(dicts) as fd:
    for line in fd:
        # strip the trailing newline before splitting
        a, b = line.strip().split(' : ')
        ordered_dict[a] = b
with open(inFile) as f:
    for line in f:
        line = line.strip()
        if line in ordered_dict:
            print(line, ":", ordered_dict[line])
The first loop can be written as a generator expression passed to dict():
with open(dicts) as fd:
    ordered_dict = dict(line.strip().split(' : ') for line in fd)
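If file2 may contain blank or malformed lines, a slightly more defensive variant can help. The following is a sketch under that assumption: load_counts and matching_lines are hypothetical helpers, and the lists stand in for the two open files.

```python
def load_counts(lines):
    """Build {word: count} from lines shaped like 'word : 2'."""
    counts = {}
    for line in lines:
        key, sep, value = line.partition(":")
        if not sep or not value.strip():
            continue  # skip blank lines and lines without a value
        counts[key.strip()] = int(value)
    return counts

def matching_lines(words, counts):
    """Return 'word : count' for each word that has an entry in counts."""
    stripped = (w.strip() for w in words)
    return ["{} : {}".format(w, counts[w]) for w in stripped if w in counts]

# Stand-ins for the open files
file2_lines = ["my : 2\n", "mine : 5\n", "sure : 1\n", "\n"]
file1_lines = ["my\n", "sure\n"]

for out in matching_lines(file1_lines, load_counts(file2_lines)):
    print(out)
# my : 2
# sure : 1
```

Each file is still read exactly once, and every lookup is a constant-time dictionary access on average.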
Here is a solution with one for loop:
inFile = "file1.txt"
dicts = "file2.txt"
with open(inFile) as f:
    # a set of stripped words gives fast membership tests
    content = set(map(str.strip, f))
ordered_dict = {}
with open(dicts) as fd:
    for dline in fd:
        key, val = dline.strip().split(" : ")
        if key in content:
            ordered_dict[key] = val
I would like to search for strings that match a pattern in a text file and export only the matched strings
k=''
regex = re.compile(r'[a-zA-Z]{2}\d{8}')
with open(file, 'r') as f:
    for line in f:
        line = line.replace(',', '')
        line = line.replace('.', '')
        k = regex.findall(line)
        #k.append(line)
        if not k=='':
            position=True
        else:
            position=False
        if position==True:
            print(k)
Somehow my code doesn't work, it always returns the following output:
[] [] [] [] [] [] [] ['AI13933231'] [] [] [] [] []
I want the output to contain only the matched strings. Thank you!
The empty lists [] appear because each of those lines exists but is either empty (containing just \n) or does not match the regex '[a-zA-Z]{2}\d{8}'. Note that regex.findall(line) returns a list, so if nothing matches, the result is an empty list.
Your main error is in this line: if not k=='':. Note that k is a list, so it is never equal to the empty string and the condition is always true.
Consider this code:
import re
regex = re.compile(r'[a-zA-Z]{2}\d{8}')
with open("omg.txt", 'r') as f:
    for line in f:
        line = line.replace(',', '')
        line = line.replace('.', '')
        k = regex.findall(line)
        # an empty list is falsy, so position is True only when something matched
        position = bool(k)
        if position:
            print(k)
Given this file (text after # is a comment, not part of the file):
AZ23153133
# Empty line
AB12355342
gz # No match
XY93312344
The output would be
['AZ23153133']
['AB12355342']
['XY93312344']
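If one flat list of matches is more convenient than one list per line, the results can be accumulated with extend. A minimal sketch, assuming the same pattern and the same punctuation stripping as in the question; find_codes is a made-up name and the list stands in for the open file:

```python
import re

PATTERN = re.compile(r'[a-zA-Z]{2}\d{8}')

def find_codes(lines):
    """Collect every match from every line into a single flat list."""
    found = []
    for line in lines:
        cleaned = line.replace(',', '').replace('.', '')
        # extend() adds nothing when findall() returns an empty list
        found.extend(PATTERN.findall(cleaned))
    return found

sample_lines = ["AZ23153133\n", "\n", "AB12355342\n", "gz\n", "XY93312344\n"]
print(find_codes(sample_lines))
# ['AZ23153133', 'AB12355342', 'XY93312344']
```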
I have a text file in the following format:
DELIMITER1
extract me
extract me
extract me
DELIMITER2
I'd like to extract every block of extract mes between DELIMITER1 and DELIMITER2 in the .txt file
This is my current, non-performing code:
import re
def GetTheSentences(file):
    fileContents = open(file)
    start_rx = re.compile('DELIMITER')
    end_rx = re.compile('DELIMITER2')
    line_iterator = iter(fileContents)
    start = False
    for line in line_iterator:
        if re.findall(start_rx, line):
            start = True
            break
    while start:
        next_line = next(line_iterator)
        if re.findall(end_rx, next_line):
            break
        print next_line
        continue
    line_iterator.next()
Any ideas?
You can simplify this to one regular expression using re.S, the DOTALL flag.
import re
def GetTheSentences(infile):
    with open(infile) as fp:
        for result in re.findall('DELIMITER1(.*?)DELIMITER2', fp.read(), re.S):
            print(result)
# extract me
# extract me
# extract me
This also makes use of the non-greedy operator .*?, so multiple non-overlapping blocks of DELIMITER1-DELIMITER2 pairs will all be found.
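The difference between the non-greedy and the greedy form is easiest to see on a text with two blocks. A small sketch with a made-up sample string:

```python
import re

text = (
    "DELIMITER1\na\nDELIMITER2\n"
    "skip\n"
    "DELIMITER1\nb\nDELIMITER2\n"
)

# Non-greedy: each match stops at the nearest DELIMITER2
lazy = re.findall(r'DELIMITER1(.*?)DELIMITER2', text, re.S)
print(lazy)    # ['\na\n', '\nb\n']

# Greedy: a single match runs on to the last DELIMITER2
greedy = re.findall(r'DELIMITER1(.*)DELIMITER2', text, re.S)
print(greedy)  # ['\na\nDELIMITER2\nskip\nDELIMITER1\nb\n']
```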
If the delimiters are within a line:
def get_sentences(filename):
    with open(filename) as file_contents:
        d1, d2 = '.', ',' # just example delimiters
        for line in file_contents:
            i1, i2 = line.find(d1), line.find(d2)
            if -1 < i1 < i2:
                yield line[i1+1:i2]
sentences = list(get_sentences('path/to/my/file'))
If they are on their own lines:
def get_sentences(filename):
    with open(filename) as file_contents:
        d1, d2 = '.', ',' # just example delimiters
        results = []
        for line in file_contents:
            if d1 in line:
                results = []
            elif d2 in line:
                yield results
            else:
                results.append(line)
sentences = list(get_sentences('path/to/my/file'))
This should do what you want:
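import re
def GetTheSentences(file):
    # 'DELIMITER' alone would also match DELIMITER2, so match the full names
    start_rx = re.compile('DELIMITER1')
    end_rx = re.compile('DELIMITER2')
    start = False
    output = []
    # open in text mode so the str patterns can match the lines
    with open(file) as datafile:
        for line in datafile:
            if start_rx.match(line):
                start = True
            elif end_rx.match(line):
                start = False
            elif start:
                # only lines strictly between the delimiters are collected
                output.append(line)
    return output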
Your previous version looks like it's supposed to be an iterator function. Do you want your output returned one item at a time? That's slightly different.
This is a good job for list comprehensions; no regex is required. The first list comprehension strips the trailing \n from each line read from the text file. The second uses the in operator to filter out the delimiter lines. Note that this keeps every non-delimiter line, so it assumes the file contains nothing outside the delimited blocks.
def extract_lines(file):
    with open(file, 'r') as f:
        scrubbed = [x.strip('\n') for x in f]
    return [x for x in scrubbed if x not in ('DELIMITER1', 'DELIMITER2')]
How can I parse through the following file, and turn each line to an element of a list (there is a whitespace at the beginning of each line) ? Unfortunately I've always sucked at regex :/ So turn this:
32.42.4.120', '32.42.4.127
32.42.5.128', '32.42.5.255
32.42.15.136', '32.42.15.143
32.58.129.0', '32.58.129.7
32.58.131.0', '32.58.131.63
46.7.0.0', '46.7.255.255
into a list :
('32.42.4.120', '32.42.4.127'),
('32.42.5.128', '32.42.5.255'),
('32.42.15.136', '32.42.15.143'),
('32.58.129.0', '32.58.129.7'),
('32.58.131.0', '32.58.131.63'),
How about this? (If I am wrong, at least let me know before down-voting)
>>> x = [tuple(line.strip().split("', '")) for line in open('file')]
>>> x
[('32.42.4.120', '32.42.4.127'), ('32.42.5.128', '32.42.5.255'), ('32.42.15.136', '32.42.15.143'), ('32.58.129.0', '32.58.129.7'), ('32.58.131.0', '32.58.131.63'), ('46.7.0.0', '46.7.255.255')]
No regex needed:
l = []
with open("name_file", "r") as f:
    for line in f:
        l.append(line.split(", "))
If you want to remove the leading space and the quotes, and get tuples, you can do:
l = []
with open("name_file", "r") as f:
    for line in f:
        # split on the full "', '" separator so the quotes are dropped too
        data = line.strip().split("', '")
        l.append((data[0], data[1]))
l = []
with open("test_data.txt") as f:
    for line in f:
        # strip() removes the leading space and the trailing newline, if any
        elems = line.strip().split("', '")
        l.append((elems[0], elems[1]))
print(l)
Output:
[('32.42.4.120', '32.42.4.127'), ('32.42.5.128', '32.42.5.255'), ('32.42.15.136', '32.42.15.143'), ('32.58.129.0', '32.58.129.7'), ('32.58.131.0', '32.58.131.63'), ('46.7.0.0', '46.7.255.255')]
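One detail worth checking with input like this is whether the last line of the file ends with a newline. Fixed-position slicing can then remove a real character, while strip() only removes whitespace. A short sketch, where parse_pair is a hypothetical helper and the sample lines mimic the last row of the data:

```python
def parse_pair(line):
    """Turn " 32.42.4.120', '32.42.4.127" into a 2-tuple of strings."""
    first, second = line.strip().split("', '")
    return (first, second)

with_newline = " 46.7.0.0', '46.7.255.255\n"
without_newline = " 46.7.0.0', '46.7.255.255"

print(parse_pair(with_newline))     # ('46.7.0.0', '46.7.255.255')
print(parse_pair(without_newline))  # ('46.7.0.0', '46.7.255.255')

# Slicing by position eats the final digit when the newline is missing
print(without_newline[1:-1])        # 46.7.0.0', '46.7.255.25
```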