Python String Query - python

I recently made a Twitter bot that takes a specified .txt file and tweets it out, line by line. A lot of the features I built into the program to troubleshoot some formatting issues actually allow the program to work with pretty much any text file.
I would like to build in a feature where I can "import" a .txt file to use.
I put that in quotes because the program runs in the command line at the moment.
I figured there are two ways I can tackle this problem, but I need some guidance on each:
A) I begin the program with a prompt asking which file the user wants to use. This is stored as a string (let's say in a variable called string) and the code looks like this:
file = open(string,'r')
There are two main issues with this. The first is that I'm unsure how to keep the program from crashing if the file specified is misspelled or does not exist. The second is that it won't mesh with future development (eventually I'd like to build app functionality around this program).
B) Somehow specify the desired file in the command line. While the program will still occasionally crash, it isn't as inconvenient to the user. Also, this would lend itself to future development, as it'll be easier to pass a value in through the command line than through an internal prompt.
Any ideas?

For the first part of the question, exception handling is the way to go. For the second part, you can use a module called argparse.
import argparse
# creating command line argument with the name file_name
parser = argparse.ArgumentParser()
parser.add_argument("file_name", help="Enter file name")
args = parser.parse_args()
# reading the file
with open(args.file_name,'r') as f:
    file = f.read()
You can read more about the argparse module on its documentation page.

Regarding A), you may want to investigate
try:
    with open(fname) as f:
        blah = f.read()
except Exception as ex:
    pass  # handle the error here
and for B) you can, e.g.
import sys
fname = sys.argv[1]
You could also combine both, which also makes sure that the user has passed an argument:
#!/usr/bin/env python
# encoding: utf-8
import sys
def tweet_me(text):
    # your bot goes here
    print(text)
if __name__ == '__main__':
    try:
        fname = sys.argv[1]
        with open(fname) as f:
            blah = f.read()
        tweet_me(blah)
    except Exception as ex:
        print(ex)
        print("Please call this as %s <name-of-textfile>" % sys.argv[0])
Just in case someone wonders about the # encoding: utf-8 line. This allows the source code to contain utf-8 characters. Otherwise only ASCII is allowed, which would be ok for this script. So the line is not necessary. I was, however, testing the script on itself (python x.py x.py) and, as a little test, added a utf-8 comment (# ä). In real life, you will have to care a lot more for character encoding of your input...
Beware, however, that just catching any Exception that may arise from the whole program is not considered good coding style. While Python encourages you to assume the best and just try it, it is wise to catch expected errors right where they happen. For example, accessing a file which does not exist will raise an IOError. You may end up with something like:
except IndexError as ex:
    print("Syntax: %s <text-file>" % sys.argv[0])
except IOError as ex:
    print("Please provide an existing (and accessible) text-file.")
except Exception as ex:
    print("uncaught Exception: %s" % type(ex))
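For readers on Python 3, the same idea can be sketched roughly as follows. This is only an illustration, not the bot itself: read_lines and main are hypothetical names, and the print call stands in for whatever the bot does with each line. In Python 3, IOError is an alias of OSError, and a missing file raises its subclass FileNotFoundError.

```python
import argparse


def read_lines(path):
    """Return the non-empty lines of a text file, or None if it cannot be read."""
    try:
        with open(path, encoding="utf-8") as f:
            return [line.rstrip("\n") for line in f if line.strip()]
    except FileNotFoundError:
        print("No such file: %s" % path)
    except OSError as ex:
        # Covers permission problems, a directory given instead of a file, etc.
        print("Could not read %s: %s" % (path, ex))
    return None


def main(argv):
    """Parse the command line (hypothetical interface) and print each line."""
    parser = argparse.ArgumentParser(description="Tweet a text file line by line.")
    parser.add_argument("file_name", help="text file to read")
    args = parser.parse_args(argv)
    lines = read_lines(args.file_name)
    if lines is None:
        return 1
    for line in lines:
        print(line)  # placeholder for the real tweeting code
    return 0
```

Calling sys.exit(main(sys.argv[1:])) under the usual if __name__ == '__main__': guard would turn this into a script; argparse then reports a missing argument by itself, so no IndexError handling is needed.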

Related

Find a syslog file based on the date in its name

I am working on a Python script that will allow me to find specific syslog files more easily. The script simply consists of typing the IP address of the machine concerned and then the date (in YYYYMMDD format), since there is one log file per day on the server.
The script is largely copied and adapted from a similar script that a colleague programmer did before me. But I don't have as much expertise as he does on Python, and he didn't leave comments on his code.
# This Python file uses the following encoding: utf-8
import os, sys, re
name = raw_input("IP Address? \n")
dir = os.listdir(os.getcwd())
flag_equip = False
flag_h_start = True
flag_h_end = True
flag_date = True
for n in dir :
    if(n!="syslog_hosts" and n!="syslog_hosts.avec.domain" and n!="syslog_hosts.BKP"):
        for nn in os.listdir("/applis/syslog/syslog_cpe"):
            if(re.search(name,nn)):
                flag_equip = True
                print("Equipment found!")
                while(flag_date):
                    date = raw_input("date AAAAMMJJ ?\n")
                    if(re.search("[\d]{8}$",date)):
                        flag_date = False
                        for nnn in os.listdir("/applis/syslog/syslog_cpe"+"/"+n+"/"+nn):
                            raw_filename = nnn.split(".")
                            for i in raw_filename :
                                if(i==date):
                                    break
if(flag_equip==False):
    print("Equipment not found")
I have a problem with the part where I have to enter the date. No matter what I put in the date, it always gives me this error below. The IP address matches the one I entered but not the date which is always the first one of the machine.
File "scriptsyslog.py", line 21, in <module>
for nnn in os.listdir("/applis/syslog/syslog_cpe"+"/"+n+"/"+nn):
OSError: [Errno 2] No such file or directory:'/applis/syslog/syslog_cpe/.bash_history/100.117.130.80.20220404.log'
My code may seem a bit strange but that's because it's not complete. I hope that's clear enough. The log files are all named in the format "[IP address].[YYYYMMDD].log"
For example 100.117.130.80.20220410.log
Thank you in advance. This is my first question on StackOverflow.
Here's an attempt to fix the script.
Rather than require interactive I/O, this reads the IP address and the date as command-line arguments. I think you will find that this is both more user-friendly for interactive use and easier to use from other scripts programmatically.
I can't guess what your file and directory tree looks like, but I have reimplemented the code based on some guesswork. One of the hard parts was figuring out how to give the variables less misleading or cryptic names, but your question doesn't explain how the files are organized, so I could have guessed some parts wrong.
Your original script looks like you were traversing multiple nested directories, but if your prose description is correct, you just want to scan one directory and find the file for the specified IP address and date.
This simply assumes that you have all your files in /applis/syslog/syslog_cpe and that the scanning of files in the current directory was an erroneous addition to a previously working script.
# Stylistic fix: Each import on a separate line
import os
import sys
if len(sys.argv) != 3:
    print("Syntax: %s <ip> <yyyymmdd>" % sys.argv[0].split('/')[-1])
    sys.exit(1)
filename = os.path.join(
    "/applis/syslog/syslog_cpe",
    sys.argv[1] + "." + sys.argv[2] + ".log")
if os.path.exists(filename):
    print(filename)
else:
    # Signal failure to caller
    sys.exit(2)
There were many other things wrong with your script; here is an attempt I came up with while trying to pick apart what your script was actually doing, which I'm leaving here in case it could entertain or enlighten you.
# Stylistic fix: Each import on a separate line
import os
import sys
import re
# Stylistic fix: Encapsulate main code into a function
def scanlog(ip_address, date, basedir="/applis/syslog/syslog_cpe"):
    """
    Return any log in the specified directory matching the
    IP address and date pattern.
    """
    # Validate input before starting to loop
    if not re.search(r"^\d{8}$", date):
        raise ValueError("date must be exactly eight digits yyyymmdd")
    if not re.search(r"^\d{1,3}(?:\.\d{1,3}){3}$", ip_address):
        raise ValueError("IP address must be four octets with dots")
    for file in os.listdir(basedir):
        # Stylistic fix: use value not in (x, y, z) over value != x and value != y and value != z
        # ... Except let's also say if value in then continue
        if file in ("syslog_hosts", "syslog_hosts.avec.domain", "syslog_hosts.BKP"):
            continue
        # ... So that we can continue on the same indentation level
        # Don't use a regex for simple substring matching
        if file.startswith(ip_address):
            # print("Equipment found!")
            if file.split(".")[-2] == date:
                return file
    # print("Equipment not found!")
# Stylistic fix: Call main function if invoked as a script
def main():
    if len(sys.argv) != 3:
        print("Syntax: %s <ip> <yyyymmdd>" % sys.argv[0].split('/')[-1])
        sys.exit(1)
    found = scanlog(*sys.argv[1:])
    if found:
        print(found)
    else:
        # Signal error to caller
        sys.exit(2)
if __name__ == "__main__":
    main()
There is no reason to separately indicate the encoding; Python 3 source generally uses UTF-8 by default, but this script doesn't contain any characters which are not ASCII, so it doesn't even really matter.
I think this should work under Python 2 as well, but it was written and tested with Python 3. I don't think it makes sense to stay on Python 2 any longer anyway.
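One possible refinement, offered only as a sketch: the eight-digit check accepts impossible dates such as 20221399. The hypothetical helpers is_valid_date and log_path below let datetime.strptime reject those; the directory layout is the same assumption as in the scripts above.

```python
import os
from datetime import datetime


def is_valid_date(text):
    """Return True if text is a real calendar date written as yyyymmdd."""
    # strptime tolerates one-digit months and days, so check the shape first.
    if len(text) != 8 or not text.isdigit():
        return False
    try:
        datetime.strptime(text, "%Y%m%d")
    except ValueError:
        return False
    return True


def log_path(ip_address, date, basedir="/applis/syslog/syslog_cpe"):
    """Build the expected log file name; the directory layout is an assumption."""
    if not is_valid_date(date):
        raise ValueError("date must be a valid yyyymmdd date")
    return os.path.join(basedir, "%s.%s.log" % (ip_address, date))
```

The returned path can then be tested with os.path.exists exactly as in the first script.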

Input a text file into a program

I'm working on a PDF generator project. The goal is to have a program that takes document files and generates a PDF file. I'm having trouble finding a way to input a file into the program to be converted.
I started out by using the input function, where I input the file in the terminal. As a test, I wanted to input, open, read, and print a csv file containing US zipcode data. The rest of the program opens, reads and prints out some of the data. Here is the code:
import csv
file = input("Drop file here: ")
with open(file, 'r', encoding='utf8') as zf:
    rf = csv.reader(zf, delimiter=',')
    header = next(rf)
    data = [row for row in rf]
    print(header)
    print(data[1])
    print(data[10])
    print(data[100])
    print(data[1000])
When I opened the terminal to input the file this error (TypeError: 'encoding' is an invalid keyword argument for this function) appeared.
Is there a better way I can code a program to input a file so it can be open and converted into a PDF?
There are several things going on here and, as was mentioned in the comments, it is very relevant which version of Python you are using. A bit more of the backstory:
The input built-in has a different meaning in Python 2 (https://docs.python.org/2.7/library/functions.html#input) than in Python 3 (https://docs.python.org/3.6/library/functions.html#input). In Python 2 it reads the user input and tries to evaluate it as Python code, which is unlikely to be what you actually wanted.
Then, as pointed out, the arguments of open are different as well (https://docs.python.org/2.7/library/functions.html#open and https://docs.python.org/3.6/library/functions.html#open). In particular, the Python 2 built-in open does not accept an encoding keyword, which is what the TypeError is complaining about.
In short, as suggested by @idlehands, if you have both versions installed, try calling python3 instead of python and this code should actually run.
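To make this kind of mix-up fail with a readable message rather than a confusing TypeError, a script could check the interpreter version up front. A minimal sketch; require_python3 is a made-up helper name, not something from the question:

```python
import sys


def require_python3():
    """Raise a clear error when the script is started with Python 2."""
    if sys.version_info[0] < 3:
        raise RuntimeError(
            "This script needs Python 3; try running it with python3.")


# Run the check before anything version-specific happens.
require_python3()
```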
Recommendation: I would suggest not to use interactive input like this at all (unless there is a good reason to do that) and instead let the desired filename be passed in from outside. I'd opt for argparse (https://docs.python.org/3.6/library/argparse.html#module-argparse) in this case which very comfortably gives you great flexibility, for instance myscript.py:
#!/usr/bin/env python3
import argparse
import sys
parser = argparse.ArgumentParser(description='My script to do stuff.')
parser.add_argument('-o', '--output', metavar='OUTFILE', dest='out_file',
                    type=argparse.FileType('w'), default=sys.stdout,
                    help='Resulting file.')
parser.add_argument('in_file', metavar='INFILE', nargs="?",
                    type=argparse.FileType('r'), default=sys.stdin,
                    help='File to be processed.')
args = parser.parse_args()
args.out_file.write(args.in_file.read())  # replace with actual action
This gives you the ability to run the script as a pass-through filter (piping data in and out), to work on specified file(s), and to explicitly use - to denote that stdin/stdout are to be used. argparse also gives you command-line usage/help for free.
You may want to tweak the specifics for different behavior, but the bottom line is that I'd still go with a command-line argument.
EDIT: I should add one more comment for consideration. I'd write the actual code (a function or a more complex object) performing the wanted action so that it exposes its inputs and outputs through its interface, and write the command-line part to gather these bits and call the action code with them. That way you can easily reuse it from another Python script, or write a GUI for it should you need or want to.
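That separation might look roughly like the sketch below. The names convert and main are illustrative only, and convert merely copies text where the real PDF generation would go.

```python
import argparse
import io
import sys


def convert(in_file, out_file):
    """Copy text from in_file to out_file; stands in for the real conversion."""
    for line in in_file:
        out_file.write(line)


def main(argv=None):
    """Gather the input/output files from the command line and call convert()."""
    parser = argparse.ArgumentParser(description="My script to do stuff.")
    parser.add_argument("-o", "--output", metavar="OUTFILE", dest="out_file",
                        type=argparse.FileType("w"), default=sys.stdout,
                        help="Resulting file.")
    parser.add_argument("in_file", metavar="INFILE", nargs="?",
                        type=argparse.FileType("r"), default=sys.stdin,
                        help="File to be processed.")
    args = parser.parse_args(argv)
    convert(args.in_file, args.out_file)


# Reusing the action from other Python code, with in-memory streams:
source = io.StringIO("first line\nsecond line\n")
target = io.StringIO()
convert(source, target)
print(target.getvalue(), end="")
```

Because convert only needs file-like objects, a GUI or another script can call it directly without going through the command-line parser.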

Efficient way to find a string based on a list

I'm new to scripting and have been reading up on Python for about 6 weeks. The script below is meant to read a log file and send an alert if one of the keywords defined in srchstring is found. It works as expected and doesn't alert on strings previously found. However, the file it's processing is actively being written to by an application, and the script is too slow on files around 500 MB. Under 200 MB it works fine, i.e. within 20 seconds.
Could someone suggest a more efficient way to search for a string within a file based on a pre-defined list?
import os
srchstring = ["Shutdown", "Disconnecting", "Stopping Event Thread"]
if os.path.isfile(r"\\server\\share\\logfile.txt"):
    with open(r"\\server\\share\\logfile.txt","r") as F:
        for line in F:
            for st in srchstring:
                if st in line:
                    print line,
                    #do some slicing of string to get dd/mm/yy hh:mm:ss:ms
                    # then create a marker file called file_dd/mm/yy hh:mm:ss:ms
                    if os.path.isfile("file_dd/mm/yy hh:mm:ss:ms"): # check if a file already exists named file_dd/mm/yy hh:mm:ss:ms
                        print "string previously found- ignoring, continuing search" # marker file exists
                    else:
                        open("file_dd/mm/yy hh:mm:ss:ms", 'a') # create file_dd/mm/yy hh:mm:ss:ms
                        print "error string found--creating marker file sending email alert" # no marker file, create it then send email
else:
    print "file not exist"
Refactoring the search expression to a precompiled regular expression avoids the (explicit) innermost loop.
import os, re
regex = re.compile(r'Shutdown|Disconnecting|Stopping Event Thread')
if os.path.isfile(r"\\server\\share\\logfile.txt"):
    #Indentation fixed as per comment
    with open(r"\\server\\share\\logfile.txt","r") as F:
        for line in F:
            if regex.search(line):
                # ...
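If the keyword list should stay a list, the pattern can be built from it instead of being typed by hand. A small sketch under that assumption; matching_lines is a hypothetical helper name, and it accepts any iterable of lines, including an open file:

```python
import re

srchstring = ["Shutdown", "Disconnecting", "Stopping Event Thread"]

# re.escape protects any keyword that happens to contain regex metacharacters.
regex = re.compile("|".join(re.escape(s) for s in srchstring))


def matching_lines(lines):
    """Yield only the lines that contain at least one keyword."""
    for line in lines:
        if regex.search(line):
            yield line
```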
I assume here that you use Linux. If you don't, install MinGW on Windows and the solution below will become suitable too.
Just leave the hard part to the most efficient tools available. Filter your data before it reaches the Python script. Use the grep command to get the lines containing "Shutdown", "Disconnecting" or "Stopping Event Thread":
grep 'Shutdown\|Disconnecting\|Stopping Event Thread' /server/share/logfile.txt
and pipe the lines to your script:
grep 'Shutdown\|Disconnecting\|Stopping Event Thread' /server/share/logfile.txt | python log.py
Edit: Windows solution. You can create a .bat file to make it executable.
findstr /c:"Shutdown" /c:"Disconnecting" /c:"Stopping Event Thread" \\server\share\logfile.txt | python log.py
In 'log.py', read from stdin. It's a file-like object, so there are no difficulties here:
import sys
for line in sys.stdin:
    print line,
    # do some slicing of string to get dd/mm/yy hh:mm:ss:ms
    # then create a marker file called file_dd/mm/yy hh:mm:ss:ms
    # and so on
This solution will reduce the amount of work your script has to do. As Python isn't a fast language, it may speed up the task. I suspect it can be rewritten purely in bash and it will be even faster (20+ years of optimization of a C program is not the thing you compete with easily), but I don't know bash enough.

Taking String arguments for a function without quotes

I've got a function meant to download a file from a URL and write it to a disk, along with imposing a particular file extension. At present, it looks something like this:
import requests
import os
def getpml(url,filename):
    psc = requests.get(url)
    outfile = os.path.join(os.getcwd(),filename+'.pml')
    f = open(outfile,'w')
    f.write(psc.content)
    f.close()
    try:
        with open(outfile) as f:
            print "File Successfully Written"
    except IOError as e:
        print "I/O Error, File Not Written"
    return
When I try something like
getpml('http://www.mysite.com/data.txt','download') I get the appropriate file sitting in the current working directory, download.pml. But when I feed the function the same arguments without the ' symbol, Python says something to the effect of "NameError: name 'download' is not defined" (the URL produces a syntax error). This even occurs if, within the function itself, I use str(filename) or things like that.
I'd prefer not to have to input the arguments of the function in with quote characters - it just makes entering URLs and the like slightly more difficult. Any ideas? I presume there is a simple way to do this, but my Python skills are spotty.
No, that cannot be done. When you are typing Python source code you have to type quotes around strings. Otherwise Python can't tell where the string begins and ends.
It seems like you have a more general misunderstanding too. Calling getpml(http://www.mysite.com) without quotes isn't calling it with "the same argument without quotes". There simply isn't any argument there at all. It's not like there are "arguments with quotes" and "arguments without quotes". Python isn't like speaking a natural human language where you can make any sound and it's up to the listener to figure out what you mean. Python code can only be made up of certain building blocks (object names, strings, operators, etc.), and URLs aren't one of those.
You can call your function differently:
data = """\
http://www.mysite.com/data1.txt download1
http://www.mysite.com/data2.txt download2
http://www.mysite.com/data3.txt download3
"""
for line in data.splitlines():
    url, filename = line.strip().split()
    getpml(url, filename)
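If the goal is simply to avoid typing quotes, another option is to take the values from outside the Python source, for example from the command line, where every argument already arrives as a string. A sketch with a hypothetical parse_args helper; the actual download is left to getpml from the question:

```python
import argparse


def parse_args(argv):
    """Return the URL and base file name given on the command line."""
    parser = argparse.ArgumentParser(description="Download a URL to <filename>.pml")
    parser.add_argument("url", help="address to download")
    parser.add_argument("filename", help="output name without the .pml extension")
    args = parser.parse_args(argv)
    return args.url, args.filename


# Simulate a command line; a real script would pass sys.argv[1:] instead.
url, filename = parse_args(["http://www.mysite.com/data.txt", "download"])
print(url, filename)
```

Saved under an assumed script name such as getpml.py, this could be invoked as python getpml.py http://www.mysite.com/data.txt download, although the shell may still need quotes around URLs containing characters such as & or ?.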

Python - writing lines from file into IRC buffer

Ok, so I am trying to write a Python script for XCHAT that will allow me to type "/hookcommand filename" and then will print that file line by line into my irc buffer.
EDIT: Here is what I have now
__module_name__ = "scroll.py"
__module_version__ = "1.0"
__module_description__ = "script to scroll contents of txt file on irc"
import xchat, random, os, glob, string
def gg(ascii):
    ascii = glob.glob("F:\irc\as\*.txt")
    for textfile in ascii:
        f = open(textfile, 'r')
def gg_cb(word, word_eol, userdata):
    ascii = gg(word[0])
    xchat.command("msg %s %s"%(xchat.get_info('channel'), ascii))
    return xchat.EAT_ALL
xchat.hook_command("gg", gg_cb, help="/gg filename to use")
Well, your first problem is that you're referring to a variable ascii before you define it:
ascii = gg(ascii)
Try making that:
ascii = gg(word[0])
Next, you're opening each file returned by glob... only to do absolutely nothing with them. I'm not going to give you the code for this: please try to work out what it's doing or not doing for yourself. One tip: the xchat interface is an extra complication. Try to get it working in plain Python first, then connect it to xchat.
There may well be other problems - I don't know the xchat api.
When you say "not working", try to specify exactly how it's not working. Is there an error message? Does it do the wrong thing? What have you tried?
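As a starting point for the plain-Python step, the file handling can be tested on its own, without xchat. The helper below is a hypothetical sketch: it only builds the command strings, in the "msg <channel> <line>" shape the question's code already uses, and sends nothing.

```python
def file_to_messages(path, channel):
    """Return one 'msg <channel> <line>' command string per non-empty line."""
    commands = []
    with open(path, "r") as f:
        for line in f:
            text = line.rstrip("\r\n")
            if text:
                commands.append("msg %s %s" % (channel, text))
    return commands
```

Once this returns what you expect for a sample file, each string could be handed to xchat.command inside the callback.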
