I have a Python script to process a bunch of log files. It opens each log, parses each line, and stores the information in data structures. After scanning through the file, it collects the statistics and generates an output file.
I'm testing it against various logs, and it worked fine for 20+ different files. But recently I found that it consistently fails on a particular file with 4381k lines. It somehow cannot complete the scanning and parsing.
I'm in the process of narrowing down the problem right now; I don't know exactly what is happening there. I'd really appreciate some input to help me find the right direction. Thanks in advance!
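One minimal way to narrow it down (a sketch, not the actual script: parse_line and the file name here are placeholders) is to stream the file line by line and wrap the parsing in a try/except, so either the offending line number or a steadily growing memory footprint becomes visible:

def parse_line(line):
    return line.split()  # placeholder for the real per-line parsing

with open("big.log") as f:                    # iterating the handle streams the file,
    for lineno, line in enumerate(f, 1):      # so it is never loaded into memory at once
        try:
            record = parse_line(line)         # store into your data structures here
        except Exception as exc:
            print("line %d failed: %r (%s)" % (lineno, line[:80], exc))
            raise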
I have a script I'm running a bunch of times that generates and logs data in json files. These take days to run, and I need to run several dozen test cases. I log progress in json files for post-processing, and I'd like to check in occasionally to see how long a run has left. This is all single-threaded, but I've dealt with multiprocessing enough to be scared of opening the file while it's being written, for fear that viewing it will place a temporary lock on the file.
Is it safe to view the json in a linux terminal using nano log_file.json while my Python scripts are running and could attempt to write to the log at any time?
If it is not safe, are there any alternatives?
I'm worried that if Python tries to record an entry, it could be lost or throw an error while I'm viewing progress. Viewing only, no saving, obviously. I'd love to check in on progress to switch between test cases faster, but I really don't want to raise an error that loses days of progress because the script is unable to write to the json.
Sorry if this is a duplicate, I tried searching but I'm not sure what to even search for this question.
You can use the tail command in a terminal to view the logs. The full command is:
tail -F <path_to_file>
It will show the last few lines of the file and keep showing new data as it is written to the file. Merely reading a file this way does not lock it on Linux (file locks there are advisory), so your script's writes are safe.
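If you control the writer, a one-record-per-line layout with an explicit flush makes tail -F particularly useful, because every line that appears is a complete record. A minimal sketch of that writer side (the file name and record contents here are made up):

import json
import time

with open("log_file.json", "a") as log:
    for step in range(1000):
        entry = {"step": step, "ts": time.time()}
        log.write(json.dumps(entry) + "\n")  # one complete JSON object per line
        log.flush()                          # push it out so tail sees it immediately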
I would like to capture all errors that are created during the runtime or during testing (using pytest) of a Python program. So far I have managed to capture all output of pytest using a pytest plugin that saves the output to a .txt file line by line.
But now I would like to have all the errors created by Python in a .txt file as well. As I am new to Python and to this kind of output capturing, do you have any ideas how I could achieve something like this?
Thanks for all the answers in advance.
EDIT: I am creating a plugin for the PyCharm IDE which could visualize these errors in a certain way; that is why I need the errors saved somewhere, so I can work with them later during the visualization. If there is a better way to save them, please share it.
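For reference, one common way to capture every uncaught exception outside pytest is to install a sys.excepthook that appends the traceback to a text file. A minimal sketch (the errors.txt name is arbitrary, and this is not necessarily how a PyCharm plugin would do it):

import sys
import traceback

def log_uncaught(exc_type, exc_value, exc_tb):
    # append the full traceback to a file, then fall back to the default handler
    with open("errors.txt", "a") as f:
        traceback.print_exception(exc_type, exc_value, exc_tb, file=f)
    sys.__excepthook__(exc_type, exc_value, exc_tb)

sys.excepthook = log_uncaught

1 / 0  # demo: this uncaught ZeroDivisionError is written to errors.txt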
A question about general possibilities with Python here. I don't really know enough about programming to know whether this is doable, and if so, how I would go about it.
I have a simple desktop program which you load files into. The program can then output various properties of the thing that's in the file, and depending on what you ask it to do, it will output a report. It produces the report in text format, but not as a file; it just displays the report inside the program itself. Like this: (screenshot)
My question concerns getting this text output for a large number of files: currently I'm manually loading each file into the program individually, making the report, copying it to a text file, and saving the text file.
Basically, I want to know whether or not it's extremely difficult to get Python to do this for me. If it is doable, are there resources available for me to read about how it might be done? Are there any constraints on running my program and various commands from a Python script?
Hope my question's clear enough. Sorry if it's a bit garbled.
The tricky part here is
The program can then output various properties of the thing that's in the file, and depending on what you ask it to do will output a report.
Basically, if the desktop application you use has a command line interface, it is possible and relatively easy.
If this program has a command line option to open a document and output a report in any format (print the report on the standard output, write it into a file on the disk, etc.), you can call that command from a Python script for each file in a list, as in the sketch below.
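For instance, assuming a purely hypothetical CLI of the form reporttool --input FILE --output FILE (invented here for illustration; substitute your program's real options):

import glob
import subprocess

for path in glob.glob("documents/*.dat"):      # every input file in the folder
    out = path + ".report.txt"
    # "reporttool" and its flags are placeholders for the real CLI
    subprocess.run(["reporttool", "--input", path, "--output", out], check=True)
    print("wrote", out)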
If your software doesn't have a CLI (command line interface), it might still be possible but more difficult. In that case, you have to automate actions by using a library that emulates clicks on your software's window (1. click Open, 2. click through the dialog to select the file to load, 3. click the button to generate a report, etc.). It's a pain, but it can be considered.
You will find plenty of resources for teaching yourself how to write a Python script. You will probably need to learn about lists, loops, file manipulation, and maybe the subprocess library, which lets you call any command from your Python script.
I suggest you start with Python 3 instead of Python 2 because it has better Unicode support, which could quickly become an issue if you have non-ASCII characters in your input files or in the reports from your software.
Good luck ;)
If the only way you can get the report is by selecting and copy/pasting it from the program's GUI, the situation just begs for AutoIt instead of Python.
With Python it would be much more difficult. Unless you want to improve your Python knowledge, of course...
By simulating keypresses, you can open a specific file in the program (by sending Ctrl+O, or Alt plus navigating the file menu). Simulate a mouse click or keypress to start report generation. Then simulate a mouse click in the text area and perform something like:
(just a skeleton of a script; it will probably need to be modified to suit your situation and needs)
send("^{a}^{c}") ; to select all and copy (if these keys are supported in this program
$text = ClipGet() ; get contents of clipboard
$fout = FileOpen("somefile.txt",2)
FileWrite($fout,$text)
FileClose($fout)
To fully automate the task, the script can collect a list of source files in a specific folder and run this macro for each of them, automatically naming the resulting txt files.
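That said, if you would rather stay in Python after all, the same select-all/copy/save flow can be approximated with the pyautogui and pyperclip packages; a rough sketch, assuming the target program's window already has focus:

import time

import pyautogui   # simulates keypresses and clicks (pip install pyautogui)
import pyperclip   # reads the clipboard (pip install pyperclip)

pyautogui.hotkey("ctrl", "a")   # select all text in the report area
pyautogui.hotkey("ctrl", "c")   # copy it to the clipboard
time.sleep(0.5)                 # give the program a moment to fill the clipboard

with open("somefile.txt", "w") as fout:
    fout.write(pyperclip.paste())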
I'm a week into learning Python and am trying to write a piece of code that allows me to run a text-based Perl script in LXTerminal automatically. I have a couple of questions regarding some specifics.
I need my code to start the Perl script with a user-inputted environment file, enter a few specific settings into the Perl script, and then read in many .txt files, one at a time, into the Perl script. It also needs to restart the process for every single .txt file and capture each individual output (it would help if every output could be written to a single .csv file).
To call the Perl script, I'm starting with the following:
alphamelts="/home/melts/Desktop/alphamelts"
pipe=subprocess.Popen(["perl", "/home/Desktop/melts/alphaMELTS", "run_alphamelts.command -f %s"]) % raw_input("Enter an environment file:"), stdout=PIPE
Assuming that's correct, I now need it to read in a .txt file, enter number-based commands, have my code wait for the Perl script to finish its calculations, and write the output to a .csv file. If it helps, the Perl script I'm running automatically generates a space-delimited file containing the results of its calculations once the program exits, but it would be super helpful if only a few of its outputs were written to a single separate .csv file for each .txt file processed.
No idea where to go from here but I absolutely have to get this working. Sorry for the complexity.
Thank you!
You can do some really cool stuff in IPython; check out this notebook for some specific examples. As far as waiting for a subprocess to finish, call wait() or communicate() on the Popen object rather than guessing at a pause. Also, for data handling and export to csv and excel, I'd recommend pandas.
Just something to get you started.
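Putting those pieces together, a rough sketch of the whole loop might look like this. The -b batch flag, the output handling, and the CSV columns are all assumptions to adapt to your alphaMELTS setup:

import csv
import glob
import subprocess

alphamelts = "/home/melts/Desktop/alphamelts"
env_file = raw_input("Enter an environment file: ")

with open("results.csv", "wb") as csv_out:        # Python 2: open csv files in binary mode
    writer = csv.writer(csv_out)
    writer.writerow(["input_file", "output"])
    for txt in glob.glob("*.txt"):
        proc = subprocess.Popen(
            ["perl", alphamelts + "/run_alphamelts.command",
             "-f", env_file, "-b", txt],          # "-b <file>" is an assumed batch option
            stdout=subprocess.PIPE,
        )
        out, _ = proc.communicate()               # blocks until the Perl script finishes
        writer.writerow([txt, out.strip()])       # keep whichever outputs you care about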
I need to run a Python program in the background. I give the script one input file, and the code processes that file and creates a new output file. Now, if I change the input file's contents, I don't want to have to run the code again by hand. It should run continuously in the background and regenerate the output file. If someone knows the answer to this, please let me know.
Thank you!
Basically, you have to set up a so-called file watcher, i.e. some mechanism that watches for changes to a file.
There are several techniques for watching file/directory changes in Python. Have a look at this question: Monitoring contents of files/directories?. Another link is here; it is about directory changes, but file changes are handled in a similar way. You could also google "watch file changes python" to get a lot of answers :)
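The simplest portable variant is a polling loop on the file's modification time; a minimal sketch, where process_file stands in for your real processing code and input.txt for your real input file:

import os
import time

INPUT = "input.txt"

def process_file(path):
    print("reprocessing %s" % path)  # placeholder for the code that writes the output file

last_mtime = None
while True:
    mtime = os.path.getmtime(INPUT)
    if mtime != last_mtime:          # the file was (re)written since the last pass
        last_mtime = mtime
        process_file(INPUT)
    time.sleep(1)                    # poll once per second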
Note: if you're programming on Windows, you should probably implement your program as a Windows service; look here for how to do that.