Find a syslog file based on the date in its name - python

I am working on a Python script that will help me find specific syslog files. The script asks for the IP address of the machine concerned and then for the date (in YYYYMMDD format), since there is one log file per day on the server.
The script is largely copied and adapted from a similar script that a colleague wrote before me. I don't have as much Python expertise as he does, and he didn't leave comments in his code.
# This Python file uses the following encoding: utf-8
import os, sys, re
name = raw_input("IP Address? \n")
dir = os.listdir(os.getcwd())
flag_equip = False
flag_h_start = True
flag_h_end = True
flag_date = True
for n in dir :
    if(n!="syslog_hosts" and n!="syslog_hosts.avec.domain" and n!="syslog_hosts.BKP"):
        for nn in os.listdir("/applis/syslog/syslog_cpe"):
            if(re.search(name,nn)):
                flag_equip = True
                print("Equipment found!")
                while(flag_date):
                    date = raw_input("date AAAAMMJJ ?\n")
                    if(re.search("[\d]{8}$",date)):
                        flag_date = False
                        for nnn in os.listdir("/applis/syslog/syslog_cpe"+"/"+n+"/"+nn):
                            raw_filename = nnn.split(".")
                            for i in raw_filename :
                                if(i==date):
                                    break
if(flag_equip==False):
    print("Equipment not found")
I have a problem with the part where I have to enter the date. No matter what I put in the date, it always gives me this error below. The IP address matches the one I entered but not the date which is always the first one of the machine.
  File "scriptsyslog.py", line 21, in <module>
    for nnn in os.listdir("/applis/syslog/syslog_cpe"+"/"+n+"/"+nn):
OSError: [Errno 2] No such file or directory: '/applis/syslog/syslog_cpe/.bash_history/100.117.130.80.20220404.log'
My code may seem a bit strange but that's because it's not complete. I hope that's clear enough. The log files are all named in the format "[IP address].[YYYYMMDD].log"
For example 100.117.130.80.20220410.log
Thank you in advance. This is my first question on StackOverflow.

Here's an attempt to fix the script.
Rather than require interactive I/O, this reads the IP address and the date as command-line arguments. I think you will find that this is both more user-friendly for interactive use and easier to use from other scripts programmatically.
I can't guess what your file and directory tree looks like, but I have reimplemented the code based on some guesswork. One of the hard parts was figuring out how to give the variables less misleading or cryptic names, but your question doesn't explain how the files are organized, so I could have guessed some parts wrong.
Your original script looks like you were traversing multiple nested directories, but if your prose description is correct, you just want to scan one directory and find the file for the specified IP address and date.
This simply assumes that you have all your files in /applis/syslog/syslog_cpe and that the scanning of files in the current directory was an erroneous addition to a previously working script.
# Stylistic fix: Each import on a separate line
import os
import sys
if len(sys.argv) != 3:
    print("Syntax: %s <ip> <yyyymmdd>" % sys.argv[0].split('/')[-1])
    sys.exit(1)
filename = os.path.join(
    "/applis/syslog/syslog_cpe",
    sys.argv[1] + "." + sys.argv[2] + ".log")
if os.path.exists(filename):
    print(filename)
else:
    # Signal failure to caller
    sys.exit(2)
There were many other things wrong with your script; here is an attempt I came up with while trying to pick apart what your script was actually doing, which I'm leaving here in case it could entertain or enlighten you.
# Stylistic fix: Each import on a separate line
import os
import sys
import re
# Stylistic fix: Encapsulate main code into a function
def scanlog(ip_address, date, basedir="/applis/syslog/syslog_cpe"):
    """
    Return any log in the specified directory matching the
    IP address and date pattern.
    """
    # Validate input before starting to loop
    if not re.search(r"^\d{8}$", date):
        raise ValueError("date must be exactly eight digits yyyymmdd")
    if not re.search(r"^\d{1,3}(?:\.\d{1,3}){3}$", ip_address):
        raise ValueError("IP address must be four octets with dots")
    for file in os.listdir(basedir):
        # Stylistic fix: use value not in (x, y, z) over value != x and value != y and value != z
        # ... Except let's also say if value in then continue
        if file in ("syslog_hosts", "syslog_hosts.avec.domain", "syslog_hosts.BKP"):
            continue
        # ... So that we can continue on the same indentation level
        # Don't use a regex for simple substring matching
        if file.startswith(ip_address):
            # print("Equipment found!")
            if file.split(".")[-2] == date:
                return file
    # print("Equipment not found!")
# Stylistic fix: Call main function if invoked as a script
def main():
    if len(sys.argv) != 3:
        print("Syntax: %s <ip> <yyyymmdd>" % sys.argv[0].split('/')[-1])
        sys.exit(1)
    found = scanlog(*sys.argv[1:])
    if found:
        print(found)
    else:
        # Signal error to caller
        sys.exit(2)
if __name__ == "__main__":
    main()
There is no reason to separately indicate the encoding; Python 3 source generally uses UTF-8 by default, but this script doesn't contain any characters which are not ASCII, so it doesn't even really matter.
I think this should work under Python 2 as well, but it was written and tested with Python 3. I don't think it makes sense to stay on Python 2 any longer anyway.
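If you want stricter input checking than the regular expressions above, here is a minimal sketch that validates both arguments with the standard library before building the path. It is Python 3 only. The helper name build_log_path is my own invention, and it assumes the flat layout <basedir>/<ip>.<yyyymmdd>.log described in the question, which I could not verify.

```python
import datetime
import ipaddress
import os


def build_log_path(ip_address, date, basedir="/applis/syslog/syslog_cpe"):
    """Return the expected log path, or raise ValueError on bad input.

    Hypothetical helper: assumes files are named <ip>.<yyyymmdd>.log
    directly inside basedir, as described in the question.
    """
    # Raises ValueError if this is not a valid IPv4 address
    ipaddress.IPv4Address(ip_address)
    if len(date) != 8 or not date.isdigit():
        raise ValueError("date must be exactly eight digits yyyymmdd")
    # Raises ValueError for impossible dates such as 20220230
    datetime.datetime.strptime(date, "%Y%m%d")
    return os.path.join(basedir, "%s.%s.log" % (ip_address, date))


print(build_log_path("100.117.130.80", "20220410"))
```

The caller can then pass the result to os.path.exists(), exactly as in the first script.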

Related

how to import files from command line in python

I am using python and I am supposed to read a file from command line for further processing. My input file has a binary that should be read for further processing. Here is my input file sub.py:
CODE = " \x55\x48\x8b\x05\xb8\x13\x00\x00"
and my main file which should read this is like the following:
import pyvex
import archinfo
import fileinput
import sys
filename = sys.argv[-1]
f = open(sys.argv[-1],"r")
CODE = f.read()
f.close()
print CODE
#CODE = b"\x55\x48\x8b\x05\xb8\x13\x00\x00"
# translate an AMD64 basic block (of nops) at 0x400400 into VEX
irsb = pyvex.IRSB(CODE, 0x1000, archinfo.ArchAMD64())
# pretty-print the basic block
irsb.pp()
# this is the IR Expression of the jump target of the unconditional exit at the end of the basic block
print irsb.next
# this is the type of the unconditional exit (i.e., a call, ret, syscall, etc)
print irsb.jumpkind
# you can also pretty-print it
irsb.next.pp()
# iterate through each statement and print all the statements
for stmt in irsb.statements:
    stmt.pp()
# pretty-print the IR expression representing the data, and the *type* of that IR expression written by every store statement
import pyvex
for stmt in irsb.statements:
    if isinstance(stmt, pyvex.IRStmt.Store):
        print "Data:",
        stmt.data.pp()
        print ""
        print "Type:",
        print stmt.data.result_type
        print ""
# pretty-print the condition and jump target of every conditional exit from the basic block
for stmt in irsb.statements:
    if isinstance(stmt, pyvex.IRStmt.Exit):
        print "Condition:",
        stmt.guard.pp()
        print ""
        print "Target:",
        stmt.dst.pp()
        print ""
# these are the types of every temp in the IRSB
print irsb.tyenv.types
# here is one way to get the type of temp 0
print irsb.tyenv.types[0]
The problem is that when I run "python maincode.py sub.py" it reads the content of the file into CODE, but the output is completely different from when I assign CODE directly in the script before the statement irsb = pyvex.IRSB(CODE, 0x1000, archinfo.ArchAMD64()). Does anyone know what the problem is and how I can solve it? I even tried importing from the input file, but it does not read the text.
Have you considered the __import__ way?
You could do
mod = __import__(sys.argv[-1])
print mod.CODE
and just pass the filename without the .py extension as your command line argument:
python maincode.py sub
EDIT: Apparently using __import__ directly is discouraged. Instead, you can use the importlib module:
import sys,importlib
mod = importlib.import_module(sys.argv[-1])
print mod.CODE
It should work the same as using __import__.
If you need to pass a path to the module, one way is to add an empty file named __init__.py to each of the directories. That will allow Python to treat the directories as packages, and you can then pass the path in its module form: python maincode.py path.to.subfolder.sub
If for some reason you cannot or don't want to treat the directories as packages, and don't want to add __init__.py files everywhere, you could also use imp.find_module together with imp.load_module. Your maincode.py would instead look like this:
import sys, imp
# find_module takes a list of directories and returns (file, pathname, description)
fp, pathname, description = imp.find_module("sub", ["/path/to/subfolder/"])
mod = imp.load_module("sub", fp, pathname, description)
print mod.CODE
You'll have to write code that splits your command-line input into the module name "sub" and the folder path "/path/to/subfolder/", though. Does that make sense? Once it's ready you'll call it like you expect: python maincode.py /path/to/subfolder/sub/
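On Python 3 the imp module is deprecated (and removed in 3.12), so for newer interpreters here is a rough sketch of loading a module straight from a file path with importlib.util. The function name load_module_from_path is just something I made up for illustration, and the temporary file only stands in for the asker's sub.py.

```python
import importlib.util
import os
import tempfile


def load_module_from_path(path):
    """Load a Python source file as a module, given its file path."""
    # Derive a module name from the file name, e.g. "sub" from ".../sub.py"
    name = os.path.splitext(os.path.basename(path))[0]
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Demonstration with a throwaway file standing in for sub.py
with tempfile.TemporaryDirectory() as tmpdir:
    demo_path = os.path.join(tmpdir, "sub.py")
    with open(demo_path, "w") as f:
        f.write('CODE = b"\\x55\\x48\\x8b\\x05\\xb8\\x13\\x00\\x00"\n')
    mod = load_module_from_path(demo_path)
    print(mod.CODE)
```

With this approach the command-line argument can be an ordinary path such as /path/to/subfolder/sub.py, with no need to split it into folder and module name by hand.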
You're reading the code as text, while when reading as a file you're likely reading as binary.
You probably need to convert binary to text or vice versa to make this work.
Binary to String/Text in Python
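To make the text-versus-binary point concrete: the file holds Python source text, so reading it gives the characters backslash, x, 5, 5 and so on, not the byte 0x55. If importing the file is not an option, one possible approach is to evaluate just the literal with ast.literal_eval. This is only a sketch in Python 3; the helper name parse_code_line is mine, and it assumes the file contains a single line of the form CODE = "..." or CODE = b"...".

```python
import ast


def parse_code_line(line):
    """Turn a line like 'CODE = b"\\x55\\x48"' into the bytes it denotes."""
    # Keep only the literal on the right-hand side of the assignment
    literal = line.split("=", 1)[1].strip()
    value = ast.literal_eval(literal)
    # A plain string literal yields str; map code points 0-255 to bytes
    if isinstance(value, str):
        value = value.encode("latin-1")
    return value


raw_line = r'CODE = b"\x55\x48\x8b\x05\xb8\x13\x00\x00"'
print(len(raw_line))                   # length of the source text
print(len(parse_code_line(raw_line)))  # length of the decoded bytes
```

Note that a leading space inside the quotes, as in the sub.py shown in the question, would also become part of the data.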

Python String Query

I recently made a Twitter bot that takes a specified .txt file and tweets it out, line by line. A lot of the features I built into the program to troubleshoot some formatting issues actually allow the program to work with pretty much any text file.
I would like to build in a feature where I can "import" a .txt file to use.
I put that in quotes because the program runs in the command line at the moment.
I figured there are two ways I can tackle this problem but need some guidance on each:
A) I begin the program with a prompt asking which file the user wants to use. This is stored as a string (let's say in a variable named string) and the code looks like this:
file = open(string,'r')
There are two main issues with this. The first is that I'm unsure how to keep the program from crashing if the file specified is misspelled or does not exist. The second is that it won't mesh with future development (eventually I'd like to build app functionality around this program).
B) Specify the desired file somehow in the command line. While the program will still occasionally crash, it isn't as inconvenient to the user. Also, this would lend itself to future development, as it'll be easier to pass a value in through the command line than through an internal prompt.
Any ideas?
For the first part of the question, exception handling is the way to go. For the second part, you can use a module called argparse.
import argparse
# creating command line argument with the name file_name
parser = argparse.ArgumentParser()
parser.add_argument("file_name", help="Enter file name")
args = parser.parse_args()
# reading the file
with open(args.file_name,'r') as f:
    file = f.read()
You can read more about the argparse module on its documentation page.
Regarding A), you may want to investigate
try:
    with open(fname) as f:
        blah = f.read()
except Exception as ex:
    # handle error
    pass
and for B) you can, e.g.
import sys
fname = sys.argv[1]
You could also combine both to make sure that the user has passed an argument:
#!/usr/bin/env python
# encoding: utf-8
import sys
def tweet_me(text):
    # your bot goes here
    print text
if __name__ == '__main__':
    try:
        fname = sys.argv[1]
        with open(fname) as f:
            blah = f.read()
        tweet_me(blah)
    except Exception as ex:
        print ex
        print "Please call this as %s <name-of-textfile>" % sys.argv[0]
Just in case someone wonders about the # encoding: utf-8 line: it allows the source code to contain UTF-8 characters. Otherwise only ASCII is allowed (in Python 2), which would be fine for this script, so the line is not necessary. I was, however, testing the script on itself (python x.py x.py) and, as a little test, added a UTF-8 comment (# ä). In real life, you will have to care a lot more about the character encoding of your input...
Beware, however, that just catching any Exception that may arise from the whole program is not considered good coding style. While Python encourages you to assume the best and try it, it is wise to catch expectable errors right where they happen. For example, accessing a file which does not exist will raise an IOError. You may end up with something like:
except IndexError as ex:
    print "Syntax: %s <text-file>" % sys.argv[0]
except IOError as ex:
    print "Please provide an existing (and accessible) text-file."
except Exception as ex:
    print "uncaught Exception:", type(ex)
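For completeness, here is how the two suggestions (argparse for the file name, narrowly scoped exception handling for the open call) might fit together in Python 3, where IOError is an alias of OSError. Treat it as a sketch: read_text_file and parse_args are names I picked, and the argument list is hard-coded only so the example runs without a real command line.

```python
import argparse


def read_text_file(fname):
    """Return the file's text, or None if it cannot be read."""
    try:
        with open(fname, encoding="utf-8") as f:
            return f.read()
    except OSError as ex:
        # Covers a missing file, a directory, or insufficient permissions
        print("Cannot read %s: %s" % (fname, ex))
        return None


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Tweet a text file line by line")
    parser.add_argument("file_name", help="path to the text file")
    return parser.parse_args(argv)


# Pass None instead of a list to read the real command line (sys.argv)
args = parse_args(["definitely-missing-demo-file.txt"])
text = read_text_file(args.file_name)
```

A missing argument is then reported by argparse itself, so the IndexError branch is no longer needed.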

File is created but cannot be written in Python

I am trying to write some results I get from a function for a range but I don't understand why the file is empty. The function is working fine because I can see the results in the console when I use print. First, I'm creating the file which is working because it is created; the output file name is taken from a string, and that part is working too. So the following creates the file in the given path:
report_strategy = open(output_path+strategy.partition("strategy(")[2].partition(",")[0]+".txt", "w")
it creates a text file with the name taken from a string named "strategy", for example:
strategy = "strategy(abstraction,Ent_parent)"
a file called "abstraction.txt" is created in the output path folder. So far so good. But I can't get to write anything to this file. I have a range of a few integers
maps = (175,178,185)
This is the function:
def strategy_count(map_path,map_id)
The following loop does the counting for each item in the range "maps" to return an integer:
for i in maps:
    report_strategy.write(str(i), ",", str(strategy_count(maps_path,str(i))))
and the file is closed at the end:
report_strategy.close()
Now the following:
for i in maps:
    print str(i), "," , strategy_count(maps_path,str(i))
does give me what I want in the console:
175 , 3
178 , 0
185 , 1
What am I missing?! The function works, the file is created. I see the output in the console as I want, but I can't write the same thing in the file. And of course, I close the file.
This is a part of a program that reads text files (actually Prolog files) and runs an Answer Set Programming solver called Clingo. Then the output is read to find instances of occurring strategies (a series of actions with specific rules). The whole code:
import pmaps
import strategies
import generalization
# select the strategy to count:
strategy = strategies.abstraction_strategy
import subprocess
def strategy_count(path,name):
    p=subprocess.Popen([pmaps.clingo_path,"0",""],
        stdout=subprocess.PIPE,stderr=subprocess.STDOUT,stdin=subprocess.PIPE)
    #
    ## write input facts and rules to clingo
    with open(path+name+".txt","r") as source:
        for line in source:
            p.stdin.write(line)
    source.close()
    # some generalization rules added
    p.stdin.write(generalization.parent_of)
    p.stdin.write(generalization.chain_parent_of)
    # add the strategy
    p.stdin.write(strategy)
    p.stdin.write("#hide.")
    p.stdin.write("#show strategy(_,_).")
    #p.stdin.write("#show parent_of(_,_,_).")
    # close the input to clingo
    p.stdin.close()
    lines = []
    for line in p.stdout.readlines():
        lines.append(line)
    counter=0
    for line in lines:
        if line.startswith('Answer'):
            answer = lines[counter+1]
            break
        if line.startswith('UNSATISFIABLE'):
            answer = ''
            break
        counter+=1
    strategies = answer.count('strategy')
    return strategies
# select which data set (from the "pmaps" file) to count strategies for:
report_strategy = open(pmaps.hw3_output_path+strategy.partition("strategy(")[2].partition(",")[0]+".txt", "w")
for i in pmaps.pmaps_hw3_fall14:
    report_strategy.write(str(i), ",", str(strategy_count(pmaps.path_hw3_fall14,str(i))))
report_strategy.close()
# the following is for testing the code. It is working and there is the right output in the console
#for i in pmaps.pmaps_hw3_fall14:
#    print str(i), "," , strategy_count(pmaps.path_hw3_fall14,str(i))
write takes one argument, which must be a string. It doesn't take multiple arguments like print, and it doesn't add a line terminator.
If you want the behavior of print, there's a "print to file" option:
print >>whateverfile, stuff, to, print
Looks weird, doesn't it? The function version of print, active by default in Python 3 and enabled with from __future__ import print_function in Python 2, has nicer syntax for it:
print(stuff, to, print, file=whateverfile)
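A small Python 3 sketch of the difference between the two calls. The io.StringIO object is only a stand-in for the real output file, and the (map id, count) pairs are taken from the console output shown in the question rather than computed by strategy_count.

```python
import io

# io.StringIO stands in for a file opened with open(path, "w")
report = io.StringIO()

# Stand-in (map id, count) pairs in place of strategy_count() results
results = [(175, 3), (178, 0), (185, 1)]

for map_id, count in results:
    # print() accepts several arguments and appends the newline itself
    print(map_id, count, sep=",", file=report)
    # write() needs one ready-made string, including the line terminator
    report.write("%d,%d\n" % (map_id, count))

print(report.getvalue(), end="")
```

Both calls produce the same line, so each pair appears twice in the output.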
The problem was with write, which, as @user2357112 mentioned, takes only one argument. The solution could also be joining the strings with + or join():
for i in maps:
    report_strategy.write(str(i)+ ","+str(strategy_count(pmaps.path_hw3_fall14,str(i)))+"\n")
@user2357112, your answer might have the advantage that once the test output in the console is right, you just need to write the same thing to the file. Thanks.

Saving data in Python without a text file?

I have a python program that just needs to save one line of text (a path to a specific folder on the computer).
I've got it working to store it in a text file and read from it; however, I'd much prefer a solution where the python file is the only one.
And so, I ask: is there any way to save text in a Python program even after it's closed, without any new files being created?
EDIT: I'm using py2exe to make the program an .exe file afterwards: maybe the file could be stored in there, and so it's as though there is no text file?
You can save the file name in the Python script and modify it in the script itself, if you like. For example:
import re,sys
savefile = "widget.txt"
x = input("Save file name?:")
lines = list(open(sys.argv[0]))
out = open(sys.argv[0],"w")
for line in lines:
    if re.match("^savefile",line):
        line = 'savefile = "' + x + '"\n'
    out.write(line)
# make sure the rewritten script is flushed to disk
out.close()
This script reads itself into a list then opens itself again for writing and amends the line in which savefile is set. Each time the script is run, the change to the value of savefile will be persistent.
I wouldn't necessarily recommend this sort of self-modifying code as good practice, but I think this may be what you're looking for.
Seems like what you want to do would better be solved using the Windows Registry - I am assuming that since you mentioned you'll be creating an exe from your script.
The following snippet tries to read a string from the registry and, if it doesn't find it (such as when the program is started for the first time), creates it. No files, no mess... except that there will be a registry entry lying around. If you remove the software from the computer, you should also remove the key from the registry. Also be sure to change the MyCompany, MyProgram and My String designators to something more meaningful.
See the Python _winreg API for details.
import _winreg as wr
key_location = r'Software\MyCompany\MyProgram'
try:
    key = wr.OpenKey(wr.HKEY_CURRENT_USER, key_location, 0, wr.KEY_ALL_ACCESS)
    value = wr.QueryValueEx(key, 'My String')
    print('Found value:', value)
except:
    print('Creating value.')
    key = wr.CreateKey(wr.HKEY_CURRENT_USER, key_location)
    wr.SetValueEx(key, 'My String', 0, wr.REG_SZ, 'This is what I want to save!')
wr.CloseKey(key)
Note that the _winreg module is called winreg in Python 3.
Why don't you just put it at the beginning of the code? E.g. start your code:
import ... #import statements should always go first
path = 'what you want to save'
And now you have path saved as a string

How can I test if a string refers to a file or directory? with regular expressions? in python?

So I'm writing a generic backup application with the os module and pickle, and so far I've tried the code below to see if something is a file or a directory (based on its string input and not its physical contents).
import os, re
def test(path):
    prog = re.compile("^[-\w,\s]+.[A-Za-z]{3}$")
    result = prog.match(path)
    if os.path.isfile(path) or result:
        print "is file"
    elif os.path.isdir(path):
        print "is directory"
    else: print "I dont know"
Problems
test("C:/treeOfFunFiles/")
is directory
test("/beach.jpg")
I dont know
test("beach.jpg")
I dont know
test("/directory/")
I dont know
Desired Output
test("C:/treeOfFunFiles/")
is directory
test("/beach.jpg")
is file
test("beach.jpg")
is file
test("/directory/")
is directory
Resources
Test filename with regular expression
Python RE library
Validating file types by regular expression
What regular expression should I be using to tell the difference between what might be a file and what might be a directory? Or is there a different way to go about this?
The os module provides methods to check whether or not a path is a file or a directory. It is advisable to use this module over regular expressions.
>>> import os
>>> print os.path.isfile(r'/Users')
False
>>> print os.path.isdir(r'/Users')
True
This might help someone, I had the exact same need and I used the following regular expression to test whether an input string is a directory, file or neither:
for generic file:
^(\/+\w{0,}){0,}\.\w{1,}$
for generic directory:
^(\/+\w{0,}){0,}$
So the generated python function looks like :
import os, re
def check_input(path):
    check_file = re.compile("^(\/+\w{0,}){0,}\.\w{1,}$")
    check_directory = re.compile("^(\/+\w{0,}){0,}$")
    if check_file.match(path):
        print("It is a file.")
    elif check_directory.match(path):
        print("It is a directory")
    else:
        print("It is neither")
Example:
check_input("/foo/bar/file.xyz") prints -> It is a file.
check_input("/foo/bar/directory") prints -> It is a directory
check_input("Random gibberish") prints -> It is neither
This layer of security of input may be reinforced later by the os.path.isfile() and os.path.isdir() built-in functions as Mr.Squig kindly showed but I'd bet this preliminary test may save you a few microseconds and boost your script performance.
PS: While using this piece of code, I noticed I missed a huge use case: paths that contain special characters such as the dash "-", which is widely used. To solve this I replaced \w{0,}, which only matches word characters (letters, digits and underscore), with .{0,}, which matches any character. This is more of a workaround than a solution, but that's all I have for now.
In a character class, if present and meant as a hyphen, the - needs to either be the first/last character, or escaped \- so change "^[\w-,\s]+\.[A-Za-z]{3}$" to "^[-\w,\s]+\.[A-Za-z]{3}$" for instance.
Otherwise, I think using regexes to determine if something looks like a filename/directory is pointless...
/dev/fd0 isn't a file or directory for instance
~/comm.pipe could look like a file but is a named pipe
~/images/test is a symbolic link to a file called '~/images/holiday/photo1.jpg'
Have a look at the os.path module, which has functions that ask the OS what something is.
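As an illustration of asking the OS rather than guessing from the name, here is a short Python 3 sketch. The describe function and its return strings are made up for this example; the os.path and stat calls are standard library. It only works for paths that actually exist, which is the point: a string alone cannot tell you what it refers to.

```python
import os
import stat
import tempfile


def describe(path):
    """Ask the operating system what kind of object a path refers to."""
    # Check for a link first, because isfile/isdir follow symbolic links
    if os.path.islink(path):
        return "symbolic link"
    if os.path.isfile(path):
        return "file"
    if os.path.isdir(path):
        return "directory"
    if os.path.exists(path) and stat.S_ISFIFO(os.stat(path).st_mode):
        return "named pipe"
    return "unknown or missing"


# Throwaway directory and file, just to have something real to inspect
with tempfile.TemporaryDirectory() as tmpdir:
    file_path = os.path.join(tmpdir, "beach.jpg")
    open(file_path, "w").close()
    kinds = [
        describe(tmpdir),
        describe(file_path),
        describe(os.path.join(tmpdir, "missing")),
    ]

print(kinds)
```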
 
