Using stdout to customize Logfile - python

For my current project I need to find a way to write a content-dependent, or at least formatted, version of a subprocess's stdout to a logfile.
Currently I'm just using subprocess.Popen(exec, stdout=log), which writes the output to a file, unformatted.
The best idea I've come up with is formatting the output on the fly: intercept stdout, reroute it to a custom module, and let that module write its own strings to the log using a combination of ifs and elifs. But I couldn't find anything in the Python 3 docs on how to do this. The closest answer I could find was in this question; the accepted answer gets kind of close.
But I just fail to understand how exactly to build the class that gets passed as stdout=. Which of its methods receives the input (and in what format)? Which of the methods are necessary? Or might there be an even easier way to accomplish what I want?

You can use the sarge project to let a parent process capture the stdout (and/or stderr) of a subprocess, read it line by line (say), and do whatever you like with the lines read, including writing them, appropriately formatted, to a log file.
Disclosure: I'm the maintainer of the sarge project.
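For instance, a minimal sketch of the kind of on-the-fly formatting the question describes, based on sarge's run/Capture API; the command, the ERROR/WARNING markers, and the log path are placeholders for whatever your program emits:

    from sarge import run, Capture

    capture = Capture()
    # Placeholder command; substitute the executable you currently pass
    # to subprocess.Popen. run() waits for it and captures its stdout.
    run('my_program --verbose', stdout=capture)

    with open('run.log', 'w') as log:
        for raw in capture.readlines():
            line = raw.decode('utf-8', errors='replace')
            # The if/elif formatting from the question:
            if 'ERROR' in line:
                log.write('[ERROR] ' + line)
            elif 'WARNING' in line:
                log.write('[WARN ] ' + line)
            else:
                log.write('        ' + line)

If you need to process lines while the child is still running, run(..., stdout=Capture(), async_=True) lets you readline() as output arrives (older sarge versions spell the keyword async).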

Related

Python3 curses code not working in pipeline

I am writing a Python script that I want to use in a Unix pipeline. My goal is to write to the screen using curses (which should only be seen by the person running the command, not the pipe), and then write the "return value" to stdout at the end so it can continue down the pipeline, something along the lines of ./myscript.py | consumer_script
This was failing in mysterious ways until I found this. The suggested solution was to use newterm instead of initscr.
My problem is that I am using Python, and from what I could find in the documentation, newterm doesn't exist. All I was able to find was a single reference to newterm, and it didn't come with a link.
Could someone please either point me towards the Python newterm, or suggest another way of working with pipes and curses?
I think you're making this more complicated than it needs to be. The simple answer is to write the curses stream to a handle other than stdout; if it works for you, stderr is the obvious choice. In short, anything that gets written to stdout goes into the pipeline, and if you don't want it there, you need a different handle.
Check out this thread for ways to write to stderr in python:
How to print to stderr in Python?
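One wrinkle in Python: there is no exposed newterm, and curses.initscr() draws on whatever file descriptor 1 points at. A common workaround is to shuffle descriptors so fd 1 points at the terminal while curses runs, then restore it. A rough sketch, assuming stderr is still attached to the terminal (true in a normal pipeline):

    import curses
    import os

    # Save the pipe end of stdout, then point fd 1 at the terminal
    # (via stderr, which the pipeline does not capture).
    real_stdout = os.dup(1)
    os.dup2(2, 1)

    try:
        screen = curses.initscr()   # now draws on fd 1 == the terminal
        screen.addstr(0, 0, 'Only the person at the terminal sees this')
        screen.refresh()
        screen.getkey()
    finally:
        curses.endwin()
        os.dup2(real_stdout, 1)     # restore stdout -> pipe
        os.close(real_stdout)

    # The "return value" that continues down the pipeline:
    print('result-for-the-next-command')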

How to customize Flyway so that it can handle CSV files as input as well?

Has someone implemented CSV handling for Flyway? It was requested some time ago (Flyway specific migration with csv files). The Flyway comments now mention it as a possibility for MigrationResolver and MigrationExecutor, but it does not seem to be implemented.
I've tried to do it myself with Flyway 4.2, but I'm not very good with Java. I got as far as creating my own jar from the sample and making it accessible to Flyway. But how does Flyway distinguish when to use the SqlMigrator and when to use my CsvMigrator? I thought I had to register my own prefix/suffix (as the question above suggests), but FlywayConfiguration seems to be read-only; at least I did not see any API calls for doing this :(.
How do I connect the different resolvers to the different migration file types? (.sql to the SQL migration, and .csv/.py to loading CSV and executing Python scripts)
After some blood, sweat, and tears, it looks like I came up with something. I can't make the whole code available because it uses a proprietary file format, but here are the main ideas:
implement ConfigurationAware as well, and use the setFlywayConfiguration implementation to catalog the extra files you want to handle (i.e. .csv). This is executed only once during the run.
during this cataloging I could not use the scanner or LoadableResources; there's some Java magic I do not understand. All the classes and methods seem to be available and accessible, even when inspecting them with .getMethods() at runtime... but actually calling them during a run throws java.lang.NoSuchMethodError and java.lang.NoClassDefFoundError. I wasted a whole day on this; don't do that, just copy-paste the code from org.flywaydb.core.internal.util.scanner.filesystem.FileSystemScanner.
use Set<String> instead of LoadableResources[]; it is way easier to work with, especially since there's no access to LoadableResources anyway and working with arrays was a nightmare.
the python/shell call goes into execute(). Some tips:
any exception or faulty exit code needs to be translated into an SQLException.
the build enforces Java 1.6, so new ProcessBuilder(cmd).inheritIO() cannot be used. If you want to print the child's STDOUT/STDERR, look at these solutions: ProcessBuilder: Forwarding stdout and stderr of started processes without blocking the main thread.
to compile Flyway including your custom module, clone the whole Flyway repo from git, edit the main pom.xml to include your module as well, and compile with: mvn install -P-CommercialDBTest -P-CommandlinePlatformAssemblies -DskipTests=true (I found this in another Stack Overflow question.)
what I haven't done yet is the checksum part; I don't know yet what it expects.

Delegating command line arguments to other commands

I need a Python module that supports forwarding command line arguments to other commands.
argparse makes it easy to parse arguments, but doesn't provide any "deparsing" tool.
I could just forward os.sys.argv if I didn't need to delete or change the values of some of them, but I do.
I can imagine a class that operates on the array of strings the whole time, without losing any information, but I failed to find one.
Does somebody know of such a tool, or has anyone met a similar problem and found another nice way to handle it?
(Sorry for my English :()
If you use the subprocess module to run the commands with the delegated arguments, you can specify your command as a list of strings that won't be subject to shell parsing (as long as you don't use shell=True). You therefore don't need to bother with quoting concerns the way you would if you were reconstructing a command line. See https://docs.python.org/2/library/subprocess.html#frequently-used-arguments for further details.
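For the "delete or change some values" part, argparse's parse_known_args() pairs nicely with this: consume the options you care about and forward the rest untouched. A sketch (the option names and downstream-tool are made up):

    import argparse
    import subprocess
    import sys

    parser = argparse.ArgumentParser()
    parser.add_argument('--output')            # an option we consume/rewrite
    args, passthrough = parser.parse_known_args()

    # Build the delegated command as a list: each element reaches the child
    # verbatim, so no shell quoting is needed (shell=False is the default).
    cmd = ['downstream-tool'] + passthrough
    if args.output:
        cmd += ['--log', args.output]          # changed/injected value

    sys.exit(subprocess.call(cmd))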

Use named pipes to send input to program based on output

Here's a general example of what I need to do:
For example, I would initiate a backtrace by sending the command "bt" to GDB from the program. Then I would search for a word such as "pardrivr" and get the frame number associated with it by using regular expressions. Then I would input "f [frame_number_of_pardrivr]" into GDB. This process would be repeated until the correct information is eventually extracted.
I want to use named pipes in bash or python to accomplish this.
Could someone please provide a simple example of how to do this?
My recommendation is not to do this. Instead, there are two more supportable ways to go:
Write your code in Python directly in gdb. gdb has been extensible in Python for several years now (see the sketch after this list).
Use the gdb MI ("Machine Interface") approach. There are libraries available to parse this already (I'm not sure if there is one in Python, but I assume so). This is better than parsing gdb's command-line output, because some pains are taken to avoid gratuitous breakage; this is the preferred way for programs to interact with gdb.
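To illustrate the first option, here is a rough sketch of the bt/f dance from the question using gdb's bundled Python API. It assumes the inferior stops somewhere (breakpoint, signal, crash) with pardrivr on the stack; run it with something like gdb -x script.py ./prog:

    # Runs inside gdb, not a standalone interpreter.
    import gdb

    gdb.execute('run')   # assumed to stop at a crash or breakpoint

    # Walk the stack directly instead of regex-scraping "bt" output.
    frame = gdb.newest_frame()
    while frame is not None:
        if 'pardrivr' in (frame.name() or ''):
            frame.select()                 # the "f <n>" step, no regex needed
            gdb.execute('info locals')     # extract whatever you need here
            break
        frame = frame.older()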

What to write into log file?

My question is simple: what to write into a log.
Are there any conventions?
What do I have to put in?
Since my app has to be released, I'd like to have friendly logs that most people can read without having to ask what they mean.
I already have some ideas, like a timestamp, a unique identifier for each function/method, etc.
I'd like to have several log levels, like tracing/debugging, information, and errors/warnings.
Do you use some pre-formatted log resources?
Thank you
Python's standard logging module is quite pleasant, and already implemented.
Read this: http://docs.python.org/library/logging.html
Edit
"easy to parse, read," are generally contradictory features. English -- easy to read, hard to parse. XML -- easy to parse, hard to read. There is no format that achieves easy-to-read and easy-to-parse. Some log formats are "not horrible to read and not impossible to parse".
Some folks create multiple handlers so that one log has several formats: a summary for people to read and an XML version for automated parsing.
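With Python's logging module, that is just two handlers with different formatters attached to the same logger. A minimal sketch (file names are arbitrary):

    import logging

    log = logging.getLogger('myapp')
    log.setLevel(logging.DEBUG)

    # Human-readable summary...
    readable = logging.FileHandler('summary.log')
    readable.setFormatter(logging.Formatter(
        '%(asctime)s %(levelname)s %(message)s'))

    # ...and a machine-oriented, pipe-delimited record of the same events.
    parseable = logging.FileHandler('machine.log')
    parseable.setFormatter(logging.Formatter(
        '%(asctime)s|%(levelname)s|%(name)s|%(thread)d|%(message)s'))

    log.addHandler(readable)
    log.addHandler(parseable)
    log.warning('disk nearly full')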
Here are some suggestions for content:
timestamp
message
log message type (such as error, warning, trace, debug)
thread id (so you can make sense of the log file from a multi-threaded application)
Best practices for implementation:
Put a mutex around the write method so that each write is thread-safe and the entries make sense.
Send one message at a time to the log file, and specify the type of each log message. Then you can choose which types of logging to enable at program startup.
Use no buffering on the file, or flush often, in case the program crashes.
Edit: I just noticed the question was tagged with Python, so please see S. Lott's answer before mine. It may be enough for your needs.
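For the Python case, the logging module already handles the locking and flushes each record as it is written; choosing what gets logged at startup can then be as simple as this sketch (the command-line convention is invented):

    import logging
    import sys

    # e.g. "python app.py DEBUG" to turn on verbose logging for one run
    level_name = sys.argv[1] if len(sys.argv) > 1 else 'INFO'
    logging.basicConfig(
        filename='app.log',
        level=getattr(logging, level_name.upper(), logging.INFO),
        format='%(asctime)s %(levelname)s %(message)s')

    logging.debug('recorded only when DEBUG is selected')
    logging.error('always recorded')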
Since you tagged your question python, I refer you to this question as well.
As for the content, Brian's proposal is good. Remember, however, to add the program name if you are using a shared log.
The important thing about a logfile is "greppability". Try to provide all your info in a single line, with proper string identifiers that are unique (also in radix) for a particular condition. As for the timestamp, use the ISO-8601 format, which sorts nicely.
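A sketch of that style with the stdlib logging module; the logger name and the E1042 code are invented, and datefmt keeps the timestamp ISO-8601-like:

    import logging

    # One record per line: sortable timestamp, fixed-width level,
    # then a message carrying a unique, greppable identifier.
    handler = logging.FileHandler('app.log')
    handler.setFormatter(logging.Formatter(
        fmt='%(asctime)s %(levelname)-8s %(name)s %(message)s',
        datefmt='%Y-%m-%dT%H:%M:%S'))

    log = logging.getLogger('myapp')
    log.addHandler(handler)
    log.error('E1042 disk full on /var')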
A good idea is to look at log analysis software. Unless you plan to write your own, you will probably want to exploit an existing log analysis package such as Analog. If that is the case, you will probably want to generate log output in a format similar enough to the ones it accepts. That will let you create nice charts and graphs with minimal effort!
In my opinion, the best approach is to use an existing logging library such as log4j (or one of its variants for other languages). It gives you control over how your messages are formatted, and you can change the formatting without ever touching your code. It follows best practices, is robust, and is used by millions of users. Of course, you could write your own logging framework, but that would be a very odd thing to do unless you need something very specific. In any case, walk through the library's documentation and see how log statements are presented there.
Check out log4py, a Python port of log4j. I think there are several such implementations for Python out there.
Python's logging library is thread-safe for threads within a single process.
timeStamp i.e. DateTime YYYY/MM/DD:HH:mm:ss:ms
User
Thread ID
Function Name
Message/Error Message/Success Message/Function Trace
Have this in XML format and you can then easily write a parser for it:
    <log>
      <logEntry DebugLevel="0|1|2|3|4|5....">
        <TimeStamp format="YYYY/MM/DD:HH:mm:ss:ms" value="2009/04/22:14:12:33:120" />
        <ThreadId value="" />
        <FunctionName value="" />
        <Message type="Error|Success|Failure|Status|Trace">
          Your message goes here
        </Message>
      </logEntry>
    </log>
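If you go that route, here is a parser sketch using the standard library (assuming the entries sit under a single <log> root as above; the file name is arbitrary):

    import xml.etree.ElementTree as ET

    tree = ET.parse('app.log.xml')
    for entry in tree.getroot().iter('logEntry'):
        stamp = entry.find('TimeStamp').get('value')
        message = entry.find('Message')
        if message.get('type') == 'Error':
            print(stamp, message.text.strip())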
