scipy.fft hangs with certain sound files - python

scipy.fft seems to hang when running this simple script:
import scipy
from scipy.io import wavfile
sound = 'sounds/silence/iPhone5.wav'
fs, data = wavfile.read(sound)
print scipy.fft(data)
on certain files. Try this file for example.
A few things I noticed:
Running the individual commands from the interactive interpreter does not hang.
Running with other sound files does not always hang the script (it's not just this file that isn't working though)
Sometimes I get WavFileWarning: chunk not understood, but it doesn't seem to be related to when it happens
If I terminate the script with Ctrl+C I get the result as if it never got stuck.
Opening the file with wave or audiolab leads to the same result.
Is this a bug or am I doing something wrong?

Check the value of data.shape for the files that hang up the system. If your data length happens to be a prime number, or the product of several large prime numbers, there isn't much that the FFT algorithm can do to speed up calculation of the DFT. If you pad with zeros, or trim your data to the nearest power of 2, everything should run much, much faster.

This should have been a comment, but there's just not enough space there...
You could do a bit more debugging, which might help a bit.
(Assuming you're on some sort of unix-like OS)
When the program gets stuck, does it idle or use a lot of CPU? You could use "top" or similar to check.
What is the program doing when it appears stuck? Can you get a stack trace? Either using a debugger like gdb or some other tool.
And I guess what really should be step one. Search the net for your symptoms. If it is a bug, it is likely already found and reported. It might even be fixed already.
By looking at a stack trace it should be possible to see if the program is stuck waiting for something, stuck in a loop somewhere or just doing lots of work.
It might also be able to tell you if the problem is in python code, C extensions or somewhere else. Being used to read stack traces is of course a plus. :)

Related

Libeosa randomly crashes trying to read audio files, despite it reading fine the same files other times

So I mainly want some insight onto wheter this is something I can resolve or if this is a bug or some quirk of librosa itself.
In essence, I am building a database for a machine learning algorithm that uses audio recordings in the .flac format.
Sometimes, the program crashes, with librosa spitting out an error log saying that it couldn't find a given file.
Other times, and the critical part in this, WITHOUT altering the code nor the database, it loads the files fine and carries on (doesn't even throw warnings which is odd concerning the prior behaviour)
Only other thing of note is that my computer's memory is quite full, and the hardware is old (2015-2016) with my RAM being a bit of a bottleneck.
Any insight over what might be happening?
Edit:
Nevermind, I found the bug, garbage is randomly being feed to the list containing the file names.

mpirun crashing on certain amount of processes

having a strange issue when trying to run a simple "hello world" program with MPI.
I eventually want to use 100 processes for this MPI script I'm writing in python and was even able to run the hello world test earlier with up to 100 processes. However, now I keep encountering the same error when I try to run the script with ~50 processes.
The specific error I see seems to be stating:
ORTE_ERROR_LOG: The system limit on number of network connections a process can open was reached in file util/listener.c at line 321
After trying to research this, I understand that it has something to do with a process running out of file descriptors and it seems like the most common solutions state that a file is not closing properly. However, my issue here is, I'm not opening any files? My script is just:
print('I am process:', rank)
So what could the issue be stemming from here?
I seem to have found a slight workaround.
I am working on a Mac, so I'm assuming that earlier I was able to stay under my file limit that is at a certain default amount set by the OS. By configuring the max file limit, I was able to bypass the limit amount I was originally hitting, causing my program to crash.
This fix isn't ideal, since my script now takes quite a while to run, but it is at least a temporary one until I can find a better fix.
If anyone would like to attempt this, the solution I found was posted by #tombigel on GitHub and can be found here.

Python3 Search the virtual memory of a running windows process

begin TLDR;
I want to write a python3 script to scan through the memory of a running windows process and find strings.
end TLDR;
This is for a CTF binary. It's a typical Windows x86 PE file. The goal is simply to get a flag from the processes memory as it runs. This is easy with ProcessHacker you can search through the strings in the memory of the running application and find the flag with a regex. Now because I'm a masochistic geek I strive to script out solutions for CTFs (for everything really). Specifically I want to use python3, C# is also an option but would really like to keep all of the solution scripts in python.
Thought this would be a very simple task. You know... pip install some library written by someone that's already solved the problem and use it. Couldn't find anything that would let me do what I need for this task. Here are the libraries I tried out already.
ctypes - This was the first one I used, specifically ReadProcessMemory. Kept getting 299 errors which was because the buffer I was passing in was larger than that section of memory so I made a recursive function that would catch that exception, divide the buffer length by 2 until it got something THEN would read one byte at a time until it hit a 299 error. May have been on the right track there but I wasn't able to get the flag. I WAS able to find the flag only if I knew the exact address of the flag (which I'd get from process hacker). I may make a separate question on SO to address that, this one is really just me asking the community if something already exists before diving into this.
pymem - A nice wrapper for ctypes but had the same issues as above.
winappdbg - python2.x only. I don't want to use python 2.x.
haystack - Looks like this depends on winappdbg which depends on python 2.x.
angr - This is a possibility, Only scratched the surface with it so far. Looks complicated and it's on the to learn list but don't want to dive into something right now that's not going to solve the issue.
volatility - Looks like this is meant for working with full RAM dumps not for hooking into currently running processes and reading the memory.
My plan at the moment is to dive a bit more into angr to see if that will work, go back to pymem/ctypes and try more things. If all else fails ProcessHacker IS opensource. I'm not fluent in C so it'll take time to figure out how they're doing it. Really hoping there's some python3 library I'm missing or maybe I'm going about this the wrong way.
Ended up writing the script using the frida library. Also have to give soutz to rootbsd because his or her code in the fridump3 project helped greatly.

How to trace random MemoryError in python script?

I have a python script, which is used to perform a lab measurement using several devices. The whole setup is rather involved, including communication over serial devices, API calls as well as the use of self-written and commercial drivers. In the end, however, everything boils down to two nested loops, which vary some parameters, collect data and write it to a file.
My problem is that I observe random occurences of a MemoryError, typically after 10 hours, equivalent to ~15k runs of the loops. At the moment, I don't have an idea, where it comes from or how I can trace it further. So I would be happy for suggestions, how to work on my problem. My observations up to this moment are as follows.
The error occurs at random states of the program. Different runs will throw the MemoryError at different lines of my script.
There is never any helpful error message. Python only says MemoryError without any error string. The traceback leads me to some point in the script, where memory is needed (e.g. when building a list), but it appears to be no specific instruction, which is the problem.
My RAM is far from full. The python process in question typically consumes some ten MB of RAM when viewed in the task manager. In addition, the RAM usage appears to be stable for hours. Usually, it increases slowly for some time, just to drop to down to the previous level quickly, which I interpret as the garbage collector kicking in periodically.
So far I did not find any indications for a memory leak. I used memory_profiler to trace the memory usage of my functions and found it to be stable. In addition, I followed this blog entry to observe what the garbage collector does in detail. Again, I could not find any hints for undeleted objects.
I am stuck to Win7 x86 due to a driver, which will only work on a 32bit system. So I cannot follow suggestions like this to go to a 64 bit version of Windows. Anyway, I do not see, how this would help in my situation.
The iPython console, from which the script is being launched, often behaves strange after the error occurred. Sometimes, a new MemoryError is thrown even for very simple operations. Often, the console is marked by Windows as "not responding" after some time. A menu pops up, where besides the usual options to wait for the process or to terminate it, there is a third option to "restore" the program (whatever that means). Doing so usually causes the console to work normal again.
At this point, I am somewhat out of ideas on how to proceed. The general receipe to comment out parts of the script until it works is highly undesirable in my case. As stated above, each test run will take several hours, meaning a potential downtime of weeks for my lab equipment. Going that direction, appears unfeasable to me. Is there any more direct approach to learn, what is crashing behind the scenes? How can I understand that python apparently fails to malloc?

Python crashes in rare cases when running code - how to debug?

I have a problem that I seriously spent months on now!
Essentially I am running code that requires to read from and save to HD5 files. I am using h5py for this.
It's very hard to debug because the problem (whatever it is) only occurs in like 5% of the cases (each run takes several hours) and when it gets there it crashes python completely so debugging with python itself is impossible. Using simple logs it's also impossible to pinpoint to the exact crashing situation - it appears to be very random, crashing at different points within the code, or with a lag.
I tried using OllyDbg to figure out whats happening and can safely conclude that it consistently crashes at the following location: http://i.imgur.com/c4X5W.png
It seems to be shortly after calling the python native PyObject_ClearWeakRefs, with an access violation error message. The weird thing is that the file is successfully written to. What would cause the access violation error? Or is that python internal (e.g. the stack?) and not file (i.e. my code) related?
Has anyone an idea whats happening here? If not, is there a smarter way of finding out what exactly is happening? maybe some hidden python logs or something I don't know about?
Thank you
PyObject_ClearWeakRefs is in the python interpreter itself. But if it only happens in a small number of runs, it could be hardware related. Things you could try:
Run your program on a different machine. if it doesn't crash there, it is probably a hardware issue.
Reinstall python, in case the installed version has somehow become corrupted.
Run a memory test program.
Thanks for all the answers. I ran two versions this time, one with a new python install and my same program, another one on my original computer/install, but replacing all HDF5 read/write procedures with numpy read/write procedures.
The program continued to crash on my second computer at odd times, but on my primary computer I had zero crashes with the changed code. I think it is thus safe to conclude that the problems were HDF5 or more specifically h5py related. It appears that more people encountered issues with h5py in that respect. Given that any error in my application translates to potentially large financial losses I decided to dump HDF5 completely in favor of other stable solutions.
Use a try catch statement. This can be put into the program in order to stop the program from crashing when erroneous data is entered

Categories

Resources