Python string replacement in an XML file

I have a file, FF_tuningConfig_AMPKi.xml, containing records such as:
<KiConfig active="%{active}" id="AMP_RET_W_LIN_SUSPICIOUS_MULTIPLE_LOGIN_IN_SHORT_PERIOD$KiConfig"/>
<KiConfig active="%{active}" id="AMP_RET_W_LIN_UNUSUAL_SESSION_HOUR_OF_DAY$KiConfig"/>
I have the following code:
import os

def replace_content(path, se, search, String_Replace):
    for root, dirs, files in os.walk(path):
        for filename in files:
            if se in filename:
                file = open(os.path.join(root, filename), 'r')
                lines = file.readlines()
                file = open(os.path.join(root, filename), 'w')
                for line in lines:
                    if search in line:
                        #print "found="+line
                        words = line.split('=')
                        # print words
                        # print "line=" + words[0] +"="+ "8\n"
                        line = line.replace(line, String_Replace)
                        #print "after="+line
                    file.write(line)
                file.close()
                print(os.path.join(root, filename) + " was replaced")
replace_content(Path,'FF_tuningConfig_AMPKi.xml','<KiConfig active="%{active}"','<KiConfig active="true"')
I am getting the below:
active="true" <Thresholds>
Instead of:
<KiConfig active="true" id="AMP_RET_W_LIN_UNUSUAL_SESSION_HOUR_OF_DAY$KiConfig"/>

Your problem is with line=line.replace(line,String_Replace). Take a look at the documentation for str.replace(). What you want is:
line = line.replace(search,String_Replace)
To test your code, you could have written a separate script with only the part that seemed to be failing.
# test input
s = '''<KiConfig active="%{active}" id="AMP_RET_W_LIN_SUSPICIOUS_MULTIPLE_LOGIN_IN_SHORT_PERIOD$KiConfig"/>
<KiConfig active="%{active}" id="AMP_RET_W_LIN_UNUSUAL_SESSION_HOUR_OF_DAY$KiConfig"/>'''
lines = s.split('\n')
# parameters
search, String_Replace = '<KiConfig active="%{active}"','<KiConfig active="true"'
# Then the part of your code that seems to be failing
for line in lines:
    if search in line:
        line = line.replace(line, String_Replace)
    print(line)
That lets you focus on the problem and makes it easy and fast to modify then test your code. Once you have that functionality working, copy and paste it into your working code. If that part of your code actually works then you have eliminated it as a source for errors and you can test other parts.
As an aside, there's no need to test whether your search string is in the line before attempting to replace: if the search string isn't in the line, str.replace() will return the line without modification.
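For example, a quick standalone check of that behaviour, using the same search and replacement strings as above:
search, String_Replace = '<KiConfig active="%{active}"', '<KiConfig active="true"'
line = '<KiConfig active="%{active}" id="AMP_RET_W_LIN_UNUSUAL_SESSION_HOUR_OF_DAY$KiConfig"/>'
print(line.replace(search, String_Replace))  # attribute rewritten
print('<Thresholds>'.replace(search, String_Replace))  # no match, returned unchanged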


How to find first and last characters in a file using python?

I am stuck on this revision exercise, which asks me to copy an input file to an output file and return the first and last letters.
def copy_file(filename):
    input_file = open(filename, "r")
    content = input_file.read()
    content[0]
    content[1]
    return content[0] + content[-1]
    input_file.close()
Why do I get an error message when I try to get the first and last letters? And how would I copy the file to the output file?
Here is the test:
input_f = "FreeAdvice.txt"
first_last_chars = copy_file(input_f)
print(first_last_chars)
print_content('cure737.txt')
Error Message:
FileNotFoundError: [Errno 2] No such file or directory: 'hjac737(my username).txt'
All the code after a return statement is never executed; a proper code editor would highlight this for you, so I recommend you use one. So the file was never closed. A good practice is to use a context manager for that: it will automatically call close() for you, even in case of an exception, when you exit the scope (indentation level).
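A minimal illustration of that dead-code problem:
def demo():
    return "result"
    print("this line is never executed")  # anything after the return is dead code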
The code you provided also fails to write the file content, which may be causing the error you reported.
I explicitly used the "rt" (and "wt") modes for the files (although they are the defaults) because we want the first and last character of the file, so it supports Unicode (any character, not just ASCII).
def copy_file(filename):
    with open(filename, "rt") as input_file:
        content = input_file.read()
    print(input_file.closed)  # True

    my_username = "LENORMJU"
    output_file_name = my_username + ".txt"
    with open(output_file_name, "wt") as output_file:
        output_file.write(content)
    print(output_file.closed)  # True

    # last: return the result
    return content[0] + content[-1]
print(copy_file("so67730842.py"))
When I run this script (on itself), the file is copied and I get the output d) (the first character 'd' of 'def' and the last character ')' of the final print call), which is correct.

Parse multiple log files for strings

I'm trying to parse a number of log files from a log directory, to search for any number of strings in a list along with a server name. I feel like I've tried a million different options, and I have it working fine with just one log file, but when I try to go through all the log files in the directory I can't seem to get anywhere.
if args.f:
    logs = args.f
else:
    try:
        logs = glob("/var/opt/cray/log/p0-current/*")
    except IndexError:
        print "Something is wrong. p0-current is not available."
        sys.exit(1)

valid_errors = ["error", "nmi", "CATERR"]

logList = []
for log in logs:
    logList.append(log)

#theLog = open("logList")
#logFile = log.readlines()
#logFile.close()
#printList = []
#for line in logFile:
#    if (valid_errors in line):
#        printList.append(line)
#
#for item in printList:
#    print item

# with open("log", "r") as tmp_log:
#     open_log = tmp_log.readlines()
#     for line in open_log:
#         for down_nodes in open_log:
#             if valid_errors in open_log:
#                 print valid_errors
down_nodes is a pre-filled list further up the script containing a list of servers which are marked as down.
Commented out are some of the various attempts I've been working through.
logList = []
for log in logs:
    logList.append(log)
I thought this might be the way forward: put each individual log file in a list, then loop through that list and use open() followed by readlines(), but I'm missing some kind of logic here. Maybe I'm not thinking correctly.
I could really do with some pointers here please.
Thanks.
So your last for loop is redundant because logs is already a list of strings. With that information, we can iterate through logs and do something for each log.
for log in logs:
    with open(log) as f:
        for line in f.readlines():
            if any(error in line for error in valid_errors):
                pass  # do stuff with the matching line
The line if any(error in line for error in valid_errors): checks whether any of the errors in valid_errors appear in the line. The expression inside any() is a generator expression that evaluates error in line for each error in valid_errors.
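For instance, a quick standalone check (the log line here is made up):
valid_errors = ["error", "nmi", "CATERR"]
line = "2016-03-01T00:00:00 c0-0c0s0n1 kernel: CATERR detected on node"
print(any(error in line for error in valid_errors))  # True: "CATERR" appears in the line
print(any(error in "all systems nominal" for error in valid_errors))  # False: no match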
To answer your question involving down_nodes, I don't believe you should include this in the same any(). You should try something like
if any(error in line for error in valid_errors) and \
        any(node in line for node in down_nodes):
Firstly you need to find all logs:
import os
import fnmatch
def find_files(pattern, top_level_dir):
    for path, dirlist, filelist in os.walk(top_level_dir):
        for name in fnmatch.filter(filelist, pattern):
            yield os.path.join(path, name)
For example, to find all *.txt files in current dir:
txtfiles = find_files('*.txt', '.')
Then get file objects from the names:
def open_files(filenames):
    for name in filenames:
        yield open(name, 'r', encoding='utf-8')
Finally individual lines from files:
def lines_from_files(files):
    for f in files:
        for line in f:
            yield line
Since you want to find some errors the check could look like this:
import re
def find_errors(lines):
    pattern = re.compile('(error|nmi|CATERR)')
    for line in lines:
        if pattern.search(line):
            print(line)
You can now process a stream of lines generated from a given directory:
txt_file_names = find_files('*.txt', '.')
txt_files = open_files(txt_file_names)
txt_lines = lines_from_files(txt_files)
find_errors(txt_lines)
The idea of processing logs as a stream of data comes from a talk by David Beazley.

Python text parsing and saving as html

I've been playing around with Python, trying to write a script to scan a directory for specific files, find certain keywords and save the lines where these keywords appear into a new file. I came up with this:
import sys, os, glob
for filename in glob.glob("./*.LOG"):
    with open(filename) as logFile:
        name = os.path.splitext(logFile.name)[0]
        newLOG = open(name + '_ERROR!' + '.LOG', "w")
        allLines = logFile.readlines()
        logFile.close()
        printList = []
        for line in allLines:
            if ('ERROR' in line) or ('error' in line):
                printList.append(line)
        for item in printList:
            # print item
            newLOG.write(item)
This is all good, but I thought I'd try instead saving this new file as HTML, wrapping it all in the right tags (html, head, body...) so that maybe I could change the font colour of the keywords. So far it looks like this:
import sys, os, glob

for filename in glob.glob("./*.LOG"):
    with open(filename) as logFile:
        name = os.path.splitext(logFile.name)[0]
        newLOG = open(name + '_ERROR!' + '.html', "w")
        newLOG.write('<html>')
        newLOG.write('<head>')
        newLOG.write('<body><p>')
        allLines = logFile.readlines()
        logFile.close()
        printList = []
        for line in allLines:
            if ('ERROR' in line) or ('error' in line):
                printList.append(line)
        for item in printList:
            # print item
            newLOG.write('</html>')
            newLOG.write('</head>')
            newLOG.write('</body><p>')
            newLOG.write(item)
Now the problem is that I'm new to this and I'm still trying to figure out how to work with indentation and loops. Because my HTML tags are being appended from within the loop, every line has the <html>, <head> & <body><p> tags around it and it just looks wrong. I understand the problem and have tried rewriting things so that the tags are applied outside the loop, but I've not had much success.
Could someone show me a better way of getting the file name of the current file and creating a new file and appending to it, as I think this is why I'm getting the file handling errors when trying to change how it all works.
Thanks
It's a matter of indenting the lines to the right level. The HTML footer must be printed at the indentation level of the header lines, not indented within the loop. Try this:
import sys, os, glob
import cgi
for filename in glob.glob("./*.LOG"):
    name = os.path.splitext(filename)[0]
    with open(filename, 'r') as logFile, open('%s_ERROR!.html' % name, 'w') as outfile:
        outfile.write("<html>\n<head>\n</head>\n<body><p>")
        allLines = logFile.readlines()
        printList = []
        for line in allLines:
            if ('ERROR' in line) or ('error' in line):
                printList.append(line)
        for item in printList:
            # Note: HTML-escape value of item
            outfile.write(cgi.escape(item) + '<br>')
        outfile.write("</p></body>\n</html>")
Note that you don't need to use printList - you could just emit the HTML code as you go through the log.
Consider breaking this into smaller functions for reusability and readability.
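A sketch of what that could look like, with the per-file work pulled into a helper (write_error_report is a made-up name) and each matching line written out as it is read, so printList disappears:

import glob
import os
import cgi  # on Python 3, use html.escape instead of cgi.escape

def write_error_report(log_path, html_path):
    with open(log_path, 'r') as logFile, open(html_path, 'w') as outfile:
        outfile.write("<html>\n<head>\n</head>\n<body><p>\n")
        for line in logFile:
            if ('ERROR' in line) or ('error' in line):
                outfile.write(cgi.escape(line.strip()) + '<br>\n')
        outfile.write("</p></body>\n</html>")

for filename in glob.glob("./*.LOG"):
    name = os.path.splitext(filename)[0]
    write_error_report(filename, '%s_ERROR!.html' % name)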

How does one parse a .lua file with Python and pull out the require statements?

I am not very good at parsing files but have something I would like to accomplish. The following is a snippet of a .lua script that has some require statements. I would like to use Python to parse this .lua file and pull the 'require' statements out.
For example, here are the require statements:
require "common.acme_1"
require "common.acme_2"
require "acme_3"
require "common.core.acme_4"
From the example above, I would then like to split the directory from the required file. In the example 'require "common.acme_1"' the directory would be common and the required file would be acme_1. I would then just add the .lua extension to acme_1. I need this information so I can validate whether the file exists on the file system (which I know how to do) and then check it against luac (the compiler) to make sure it is a valid Lua file (which I also know how to do).
I simply need help pulling these require statements out using Python and splitting the directory name from the filename.
You can do this with built-in string methods, but since the parsing is a little bit complicated (paths can be multi-part), the simplest solution might be to use regex. With regex you can do the parsing and splitting using groups:
import re
data = \
'''
require "common.acme_1"
require "common.acme_2"
require "acme_3"
require "common.core.acme_4"
'''
finds = re.findall(r'require\s+"(([^."]+\.)*)?([^."]+)"', data, re.MULTILINE)
print [dict(path=x[0].rstrip('.'),file=x[2]) for x in finds]
The first group is the path (including the trailing .), the second group is the inner group needed for matching repeated path parts (discarded), and the third group is the file name. If there is no path you get path=''.
Output:
[{'path': 'common', 'file': 'acme_1'}, {'path': 'common', 'file': 'acme_2'}, {'path': '', 'file': 'acme_3'}, {'path': 'common.core', 'file': 'acme_4'}]
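Since the question also mentions validating that each file exists on the file system, a possible follow-up under that assumption (reusing finds from above; os.path.isfile does the existence check):
import os.path
records = [dict(path=x[0].rstrip('.'), file=x[2]) for x in finds]
for rec in records:
    # Build a relative path such as common/acme_1.lua or acme_3.lua
    parts = (rec['path'].split('.') if rec['path'] else []) + [rec['file'] + '.lua']
    rel_path = os.path.join(*parts)
    print rel_path, os.path.isfile(rel_path)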
Here ya go!
import sys
import os.path
if len(sys.argv) != 2:
    print "Usage:", sys.argv[0], "<inputfile.lua>"
    exit()

f = open(sys.argv[1], "r")
lines = f.readlines()
f.close()

for line in lines:
    if line.startswith("require "):
        path = line.replace('require "', '').replace('"', '').replace("\n", '').replace(".", "/") + ".lua"
        fName = os.path.basename(path)
        path = path.replace(fName, "")
        print "File: " + fName
        print "Directory: " + path
        #do what you want to each file & path here
Here's a crazy one-liner; not sure if this is exactly what you wanted, and it's most certainly not the most optimal one...
In [270]: import re
In [271]: [[s[::-1] for s in rec[::-1].split(".", 1)][::-1] for rec in re.findall(r"require \"([^\"]*)", text)]
Out[271]:
[['common', 'acme_1'],
['common', 'acme_2'],
['acme_3'],
['common.core', 'acme_4']]
This is straightforward. One-liners are great, but they take too much effort to understand at a glance, and in my opinion this is not a job for regular expressions.
mylines = [line.split('require')[-1] for line in open('mylua.lua').readlines() if line.startswith('require')]
paths = []
for line in mylines:
    if 'common.' in line:
        paths.append(('common', line.split('common.')[-1]))
    else:
        paths.append(('', line))
You could use finditer:
lua='''
require "common.acme_1"
require "common.acme_2"
require "acme_3"
require 'common.core.acme_4'
'''
import re
print [m.group(2) for m in re.finditer(r'^require\s+(\'|")([^\'"]+)(\1)', lua, re.S | re.M)]
# ['common.acme_1', 'common.acme_2', 'acme_3', 'common.core.acme_4']
Then just split on the '.' to split into paths:
for e in [m.group(2) for m in re.finditer(r'^require\s+(\'|")([^\'"]+)(\1)', lua, re.S | re.M)]:
    parts = e.split('.')
    if parts[:-1]:
        print '/'.join(parts[:-1]), parts[-1]
    else:
        print parts[0]
Prints:
common acme_1
common acme_2
acme_3
common/core acme_4
file = '/path/to/test.lua'

def parse():
    with open(file, 'r') as f:
        requires = [line.split()[1].strip('"') for line in f.readlines() if line.startswith('require ')]
    for r in requires:
        filename = r.replace('.', '/') + '.lua'
        print(filename)
The with statement opens the file in question. The next line creates a list from all lines that start with 'require ' by splitting each one on whitespace, keeping only the quoted part, and stripping off the double quotes. The loop then goes through the list, replaces the dots with slashes and appends '.lua'. The print statement shows the results.
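For the sample require statements from the question, calling the function prints the converted paths:
parse()
# common/acme_1.lua
# common/acme_2.lua
# acme_3.lua
# common/core/acme_4.lua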

Python- need to append characters to the beginning and end of each line in text file

I should preface this by saying that I am a complete Python newbie.
I'm trying to create a script that will loop through a directory and its subdirectories looking for text files. When it encounters a text file it will parse the file, convert it to NITF XML and upload it to an FTP directory.
At this point I am still working on reading the text file into variables so that they can be inserted into the XML document in the right places. An example of the text file is as follows.
Headline
Subhead
By A person
Paragraph text.
And here is the code I have so far:
with open("path/to/textFile.txt") as f:
#content = f.readlines()
head,sub,auth = [f.readline().strip() for i in range(3)]
data=f.read()
pth = os.getcwd()
print head,sub,auth,data,pth
My question is: how do I iterate through the body of the text file (data) and wrap each line in HTML P tags? For example:
<P>line of text in file</P> <P>Next line in text file</P>
Something like:
output_format = '<p>{}</p>\n'.format

with open('input') as fin, open('output', 'w') as fout:
    fout.writelines(output_format(line.strip()) for line in fin)
This assumes that you want to write the new content back to the original file:
with open('path/to/textFile.txt') as f:
    content = f.readlines()

with open('path/to/textFile.txt', 'w') as f:
    for line in content:
        f.write('<p>' + line.strip() + '</p>\n')
import shutil

with open('infile') as fin, open('outfile', 'w') as fout:
    for line in fin:
        fout.write('<P>{0}</P>\n'.format(line[:-1]))  # slice off the newline. Same as `line.rstrip('\n')`.

# Only do this once you're sure the script works :)
shutil.move('outfile', 'infile')  # Need to replace the input file with the output file
In your case, you should probably replace
data=f.read()
with:
data = '\n'.join("<p>%s</p>" % l.strip() for l in f)
Use data = f.readlines() here, then iterate over data and try something like this:
for line in data:
    line = "<p>" + line.strip() + "</p>"
    # write line + '\n' to a file or do something else
Append the <p> and </p> tags to each line.
ex:
data_new = []
data = f.readlines()
for lines in data:
    data_new.append("<p>%s</p>\n" % lines.strip())
You could use the fileinput module to modify one or more files in-place, with optional backup file creation if desired (see its documentation for details). Here it is being used to process one file:
import fileinput

for line in fileinput.input('testinput.txt', inplace=1):
    print '<P>' + line[:-1] + '</P>'
The 'testinput.txt' argument could also be a sequence of two or more file names instead of just a single one, which could be useful especially if you're using os.walk() to generate the list of files in the directory and its subdirectories to process (as you probably should be doing).
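For instance, a sketch of that os.walk() variant (the directory name and the .txt extension are placeholders):

import fileinput
import os

txt_files = []
for root, dirs, files in os.walk('some_directory'):
    for name in files:
        if name.endswith('.txt'):
            txt_files.append(os.path.join(root, name))

for line in fileinput.input(txt_files, inplace=1):
    print '<P>' + line[:-1] + '</P>'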
