I have a binary file in which data segments are interspersed. I know the locations (byte offsets) of every data segment, as well as the size of each segment and the type of its data points (float, float32, meaning every data point is encoded in 4 bytes). I want to read those data segments into an array-like structure (for example, a numpy array or pandas DataFrame), but I'm having trouble doing so. I've tried numpy's memmap, but it short-circuits on the last data segment, and numpy's fromfile just gets me bizarre results.
Sample of the code:
begin = datadf["$BEGINDATA"][0]  # datadf is a pandas DataFrame that holds where the data begins and its size
buf.seek(begin)  # buf is the file, opened in "rb" mode
size = datadf["$DATASIZE"][0] + 1  # same source as the above
data = buf.read(size)  # this should get me that data segment, but in binary
Is there a way to reliably convert this binary data to float32?
For further clarification, I'm including printout of first 10 data points.
buf.seek(begin)
print(buf.read(40)) #10 points of float32 (4bytes) means 40
>>>b'\xa5\x10[#\x00\x00\x88#a\xf3\xf7A\x00\x00\x88#&\x93\x9bA\x00\x00\x88#\x00\x00\x00#\xfc\xcd\x08?\x1c\xe2\xbe?\x03\xf9\xa4?'
If it's of any value: while there are 4 bytes (32-bit width) for each float point, every float point is capped to a maximum value of 10,000.
If you want a numpy.ndarray, you can just use numpy.frombuffer:
>>> import numpy as np
>>> data = b'\xa5\x10[#\x00\x00\x88#a\xf3\xf7A\x00\x00\x88#&\x93\x9bA\x00\x00\x88#\x00\x00\x00#\xfc\xcd\x08?\x1c\xe2\xbe?\x03\xf9\xa4?'
>>> np.frombuffer(data, dtype=np.float32)
array([ 3.422891 ,  4.25     , 30.993837 ,  4.25     , 19.44685  ,
        4.25     ,  2.       ,  0.5343931,  1.4912753,  1.2888492],
      dtype=float32)
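If you need this for every segment in the file, here is a minimal sketch of the full loop (assuming, as in your code, that datadf has one row per segment with "$BEGINDATA" and "$DATASIZE" columns, that the size is a byte count, and that buf is the file opened in "rb" mode):
import numpy as np

def read_segments(buf, datadf):
    # Read every data segment described by datadf into a float32 array.
    segments = []
    for begin, size in zip(datadf["$BEGINDATA"], datadf["$DATASIZE"]):
        buf.seek(begin)       # jump to the start of this segment
        raw = buf.read(size)  # the segment's raw bytes
        segments.append(np.frombuffer(raw, dtype=np.float32))
    return segments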
I am trying to calculate the 90th percentile of a variable over a period of 16 years. The data is stored in netCDF files (one month per file, so 12 files/year * 16 years).
I pre-processed the data and took the daily max and monthly mean of the variable of interest. So, bottom line, the folder consists of 192 files that each contain one value (the monthly mean of the daily max).
The data was opened using following command:
ds = xr.open_mfdataset(f"{folderdir}/*.nc", chunks={"time":1})
Trying to calculate the quantile (from some data variable extracted from the ds: data_variable = ds["data_variable"]) with the following code:
q90 = data_variable.quantile(0.95, "time"), yields the following error message:
ValueError: dimension time on 0th function argument to apply_ufunc with dask='parallelized' consists of multiple chunks, but is also a core dimension. To fix, either rechunk into a single dask array chunk along this dimension, i.e., .chunk(dict(time=-1)), or pass allow_rechunk=True in dask_gufunc_kwargs but beware that this may significantly increase memory usage.
I tried to rechunk as explained in the error message by applying data_variable.chunk(dict(time=-1)).quantile(0.95, 'time'), with no success (I got the exact same error).
Further, I tried to rechunk in the following way: data_variable.chunk({'time': 1}), which was also not successful.
Printing out data_variable.chunk() actually shows that the chunk size in the time dimension is supposed to be 1, so I don't understand where I made a mistake.
PS: I didn't try allow_rechunk=True in dask_gufunc_kwargs, since I don't know where to pass that argument.
Thanks for the help,
Max
PS: Printing out data_variable yields (to be clear, the data variable is 'wsgsmax' here):
<xarray.DataArray 'wsgsmax' (time: 132, y: 853, x: 789)>
dask.array<concatenate, shape=(132, 853, 789), dtype=float32, chunksize=(1, 853, 789), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) datetime64[ns] 1995-01-16T12:00:00 ... 2005-12-16T12:00:00
    lon      (y, x) float32 dask.array<chunksize=(853, 789), meta=np.ndarray>
    lat      (y, x) float32 dask.array<chunksize=(853, 789), meta=np.ndarray>
  * x        (x) float32 0.0 2.5e+03 5e+03 ... 1.965e+06 1.968e+06 1.97e+06
  * y        (y) float32 0.0 2.5e+03 5e+03 ... 2.125e+06 2.128e+06 2.13e+06
    height   float32 10.0
Attributes:
    standard_name:  wind_speed_of_gust
    long_name:      Maximum Near Surface Wind Speed Of Gust
    units:          m s-1
    grid_mapping:   Lambert_Conformal
    cell_methods:   time: maximum
    FA_name:        CLSRAFALES.POS
    par:            228
    lvt:            105
    lev:            10
    tri:            2
chunk({"time": 1}) will produce as many chunks as there are time steps.
Each chunk will have a size of 1.
Printing out data_variable.chunk() actually shows that the chunk size in the time dimension is supposed to be 1, so I don't understand where I made a mistake.
To compute percentiles, dask needs to load the full timeseries into memory, so it forbids chunking over the "time" dimension.
So what you want is either chunk({"time": len(ds.time)}) or, directly, the shorthand chunk({"time": -1}).
I don't understand why data_variable.chunk(dict(time=-1)).quantile(0.95, 'time') would not work, though.
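Putting it together, here is a minimal sketch of the fix (assuming the dataset is opened as in the question and the variable of interest is 'wsgsmax'; the 0.95 mirrors the code in the question):
import xarray as xr

ds = xr.open_mfdataset(f"{folderdir}/*.nc", chunks={"time": 1})
wsgs = ds["wsgsmax"]
# Collapse the time dimension into a single chunk, then reduce over it.
# Note the closing parenthesis after chunk(...) before .quantile is called.
q90 = wsgs.chunk({"time": -1}).quantile(0.95, dim="time")
q90 = q90.compute()  # trigger the dask computation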
I am trying to read .pfm images of shape 804 x 600, for which I have written the function below. I know that my images are float16, but they are being read as float32.
import numpy as np

def read_pfm(file):
    """Method to decode .pfm files and return data as numpy array"""
    f = open(file, "rb")
    # read header: identifier line, dimensions line, scale line
    line1, line2, line3 = (f.readline() for _ in range(3))
    width, height = (int(s) for s in line2.split())
    # read data as big-endian float
    data = np.fromfile(f, '>f')  # TODO: data is read as float32. Why? Should be float16
    print(data.dtype)
    print(data.shape)
    data = np.reshape(data, (height, width))  # reshape using the dimensions from the header
    return data
My question is two-fold:
Why are my images being read as float32 by default when they are float16?
When I do force the images to be read as float16 in this way
data = np.fromfile(f,'>f2')
the shape of input changes from (482400,) to (964800,). Why does this happen?
Edit: I realized that I made a mistake and the images are actually float32. However the answer by Daweo still clarifies the confusion I had about 16-/32-bit.
When I do force the images to be read as float16 (...) the shape of input changes from (482400,) to (964800,). Why does this happen?
Observe that 482400 * 2 == 964800 and 32/16 == 2.
Consider a simple example: say you have the following 8 bits
01101110
If you were told that 8-bit integers are used, you would consider that to be a single number (01101110); if told that 4-bit integers are used, you would consider it to be 2 numbers (0110, 1110); and if told that 2-bit integers are used, you would consider it to be 4 numbers (01, 10, 11, 10). Likewise, if a given sequence of bytes contains N numbers when assumed to hold float32 values, then the same sequence of bytes treated as float16 contains N*(32/16), that is, N*2 numbers.
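To see the same effect directly with numpy, here is a quick sketch (the 8 bytes are arbitrary):
import numpy as np

raw = bytes(range(8))                         # 8 arbitrary bytes
print(np.frombuffer(raw, dtype='>f4').shape)  # (2,): 8 bytes / 4 bytes per float32
print(np.frombuffer(raw, dtype='>f2').shape)  # (4,): 8 bytes / 2 bytes per float16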
I am trying to load a .wav file in Python using scipy. My final objective is to create the spectrogram of that audio file. The code for reading the file can be summarized as follows:
import scipy.io.wavfile as wav
(sig, rate) = wav.read(_wav_file_)
For some .wav files I am receiving the following error:
WavFileWarning: Chunk (non-data) not understood, skipping it.
WavFileWarning) ** ValueError: Incomplete wav chunk.
Therefore, I decided to use librosa for reading the files, using:
import librosa
(sig, rate) = librosa.load(_wav_file_, sr=None)
That works properly for all cases; however, I noticed a difference in the colors of the spectrogram. While it was the exact same figure, the colors were somehow inverted. More specifically, I noticed that when I kept the same function for calculating the spectrogram and changed only the way I read the .wav, there was this difference. Any idea what could produce that? Is there a default difference between the way the two approaches read the .wav file?
EDIT:
(rate1, sig1) = wav.read(spec_file) # rate1 = 16000
sig, rate = librosa.load(spec_file) # rate 22050
sig = np.array(α*sig, dtype = "int16")
Something that almost worked was to multiply the result sig by a constant α (alpha) that was the ratio between the max values of the signal from scipy's wavread and the signal derived from librosa. Still, though, the signal rates were different.
This sounds like a quantization problem. If samples in the wave file are stored as float and librosa is just performing a straight cast to an int, any value less than 1 will be truncated to 0. More than likely, this is why sig is an array of all zeros. The float must be scaled to map it into the range of an int. For example,
>>> a = sp.randn(10)
>>> a
array([-0.04250369, 0.244113 , 0.64479281, -0.3665814 , -0.2836227 ,
-0.27808428, -0.07668698, -1.3104602 , 0.95253315, -0.56778205])
Convert a to type int without scaling
>>> a.astype(int)
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
Convert a to int with scaling for 16-bit integer
>>> b = (a* 32767).astype(int)
>>> b
array([ -1392, 7998, 21127, -12011, -9293, -9111, -2512, -42939,
31211, -18604])
Convert scaled int back to float
>>> c = b/32767.0
>>> c
array([-0.04248177, 0.24408704, 0.64476455, -0.36655782, -0.28360851,
-0.27805414, -0.0766625 , -1.31043428, 0.9525132 , -0.56776635])
c and a are only equal to about 3 or 4 decimal places due to quantization to int.
If librosa is returning a float, you can scale it by 2**15 and cast it to an int to get the same range of values that the scipy wave reader returns. Since librosa returns a float, chances are the values lie within a much smaller range, such as [-1, +1], than a 16-bit integer, which will be in [-32768, +32767]. So you need to scale one to get the ranges to match. For example,
sig, rate = librosa.load(spec_file, mono=True)
sig = sig * 32767
If you do not want to do the quantization yourself, you could use pylab's pylab.specgram function to do it for you. You can look inside the function and see how it uses vmin and vmax.
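For example, here is a sketch of pinning the color scale explicitly (the vmin/vmax values are hypothetical dB limits, passed through to the underlying image plot):
import pylab
pylab.specgram(sig, Fs=rate, vmin=-80, vmax=0)
pylab.show()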
It is not completely clear from your post (at least to me) what you want to achieve (there is neither a sample input file nor any preceding script from you). But anyway, to check whether the spectrogram of a wave file shows significant differences depending on whether the signal data returned from the read function is float32 or int, I tested the following 3 functions.
Python Script:
_wav_file_ = "africa-toto.wav"

def spectogram_librosa(_wav_file_):
    import librosa
    import pylab
    import numpy as np
    (sig, rate) = librosa.load(_wav_file_, sr=None, mono=True, dtype=np.float32)
    pylab.specgram(sig, Fs=rate)
    pylab.savefig('spectrogram3.png')

def graph_spectrogram_wave(wav_file):
    import wave
    import pylab
    import numpy as np
    def get_wav_info(wav_file):
        wav = wave.open(wav_file, 'r')
        frames = wav.readframes(-1)
        sound_info = np.frombuffer(frames, dtype='int16')
        frame_rate = wav.getframerate()
        wav.close()
        return sound_info, frame_rate
    sound_info, frame_rate = get_wav_info(wav_file)
    pylab.figure(num=3, figsize=(10, 6))
    pylab.title('spectrogram pylab with wav_file')
    pylab.specgram(sound_info, Fs=frame_rate)
    pylab.savefig('spectrogram2.png')

def graph_wavfileread(_wav_file_):
    import matplotlib.pyplot as plt
    from scipy import signal
    from scipy.io import wavfile
    import numpy as np
    sample_rate, samples = wavfile.read(_wav_file_)
    frequencies, times, spectrogram = signal.spectrogram(samples, sample_rate, nfft=1024)
    plt.pcolormesh(times, frequencies, 10*np.log10(spectrogram))
    plt.ylabel('Frequency [Hz]')
    plt.xlabel('Time [sec]')
    plt.savefig("spectogram1.png")

spectogram_librosa(_wav_file_)
#graph_wavfileread(_wav_file_)
#graph_spectrogram_wave(_wav_file_)
which produced three spectrogram images (not reproduced here) that, apart from minor differences in size and intensity, look quite similar no matter the read method, library, or data type. This makes me question a little for what purpose the outputs need to be 'exactly' the same and how exact they should be.
I do find it strange, though, that the librosa.load() function offers a dtype parameter but works only with float values anyway. Googling in this regard led me only to this issue, which wasn't much help, and this issue, which says that that's how it will stay with librosa, as internally it seems to only use floats.
To add on to what has been said, Librosa has a utility to convert integer arrays to floats.
float_audio = librosa.util.buf_to_float(sig)
I use this to great success when producing spectrograms of pydub AudioSegments. Keep in mind, one of its arguments is the number of bytes per sample. It defaults to 2. You can read more about it in the documentation here. Here is the source code:
def buf_to_float(x, n_bytes=2, dtype=np.float32):
    """Convert an integer buffer to floating point values.
    This is primarily useful when loading integer-valued wav data
    into numpy arrays.

    See Also
    --------
    buf_to_float

    Parameters
    ----------
    x : np.ndarray [dtype=int]
        The integer-valued data buffer
    n_bytes : int [1, 2, 4]
        The number of bytes per sample in `x`
    dtype : numeric type
        The target output type (default: 32-bit float)

    Returns
    -------
    x_float : np.ndarray [dtype=float]
        The input data buffer cast to floating point
    """
    # Invert the scale of the data
    scale = 1./float(1 << ((8 * n_bytes) - 1))
    # Construct the format string
    fmt = '<i{:d}'.format(n_bytes)
    # Rescale and format the data buffer
    return scale * np.frombuffer(x, fmt).astype(dtype)
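Here is a hedged usage sketch with pydub (the file name is hypothetical; sample_width is the AudioSegment's bytes per sample, matching n_bytes above):
from pydub import AudioSegment
import librosa

seg = AudioSegment.from_wav("example.wav")  # hypothetical input file
# raw_data is the segment's interleaved sample buffer as bytes
float_audio = librosa.util.buf_to_float(seg.raw_data, n_bytes=seg.sample_width)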
I'm trying to load some text files into numpy arrays. The .txt files represent pixels of an image, where each pixel is given an arbitrary relative coordinate between -10 and +10 (for x) and 0 and 10 (for y). In total, the image is 10x256 pixels. The catch is that each pixel isn't given RGB values; it is given a list of intensities that correspond to the wavelength values in the first \n-separated "header". The coordinates are given as the first two tab-separated items of each line, and the first line only has "0 0" because it is the wavelength header. The format of the text files is as follows:
Line 1: "0 0 625.15360 625.69449 626.23538 ..." (two coordinates followed by the wavelengths)
Line 2: "-10.00000 -10.00000 839 841 833 843 838 847 ..."
Line 3: "-10.00000 -9.92157 838 839 838 ..."
where 839 and 838 represent the intensity of the 625.15360 wavelength for two different adjacent pixels, one on top of the other (with a small change in y). Furthermore, 841 and 839 would be the intensities of the 625.69449 wavelength, and so on and so forth.
My reasoning thus far has been to iterate through the file using np.genfromtxt() and add the values to a new 3D numpy array with axes (x, y, lambda), each element being assigned a single intensity value. Also, I think it would make much more sense if x and y spanned 0-9 and 0-255, respectively, to represent the image, instead of the arbitrary relative coordinates given in the data...
Problem: I don't know how to load the data into a 3D array (I'm stuck figuring out 2D), and I can't seem to slice properly...
What I have so far:
intensity_array2 = np.zeros([len(unique_y), len(unique_x)], dtype=int)
for element in np.nditer(intensity_array2, op_flags=['readwrite']):
    for i in range(len(unique_y)):
        for j in range(len(unique_x)):
            with open(os.path.join(path_name, right_file)) as rf:
                intensity_array2[i, j] = np.genfromtxt(rf, skip_header=(i*j)+j, delimiter=" ")
Where len(unique_y) = 10 and len(unique_x) = 256 are found in a function above.
I'm not sure I entirely understand your file format, so forgive me if this does not make sense. However, if there is any way you can load all the data in at once I am sure it will run much faster. It appears to me that you can use this to get all the data into memory:
data = np.genfromtxt(rf, delimiter = " ")
Then create your 3D array:
intensity_array2 = np.zeros((10, 256, num_wavelengths))
Then fill in the values of the 3D array:
intensity_array2[ data[:,0], data[:,1], :] = data[:, 2:]
This will not work exactly because your x and y indices can go negative--you might need to add an offset in this case. Also, if your input file is in a predictable format, you may be able to simply call np.reshape() on the data matrix to get what you want.
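As a sketch of that mapping idea (the file name is hypothetical; np.unique with return_inverse turns the arbitrary coordinates into 0-based indices, which avoids the negative-index problem):
import numpy as np

data = np.genfromtxt("pixels.txt", skip_header=1)       # skip the wavelength header line
xs, x_idx = np.unique(data[:, 0], return_inverse=True)  # x coordinates -> 0..len(xs)-1
ys, y_idx = np.unique(data[:, 1], return_inverse=True)  # y coordinates -> 0..len(ys)-1
intensity = np.zeros((len(ys), len(xs), data.shape[1] - 2))
intensity[y_idx, x_idx, :] = data[:, 2:]                # fill each pixel's spectrum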
Building on Lukeclh's answer, try:
data = np.genfromtxt(rf)
Then, cleave off the wavelength values
wavelengths = data[0]
intensities = data[1:]
We can now rearrange the data using reshape:
intensitiesshaped = np.reshape(intensities, (len(unique_x),len(unique_y),-1))
The "-1" value says 'the rest goes here'.
We still have the leading coordinate values on each of these arrays. To trim them, we can do:
wavelengths = wavelengths[2:]
intensitiesshaped = intensitiesshaped[:,:,2:]
This just throws the information in the first two indices away. If you need to retain it you'll have to do something a bit more sophisticated.
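If you do need them, here is a minimal sketch of retaining the coordinates before the trim (the first two entries of each pixel's vector are its x and y coordinates):
x_coords = intensitiesshaped[:, :, 0]  # per-pixel x coordinate
y_coords = intensitiesshaped[:, :, 1]  # per-pixel y coordinate
intensitiesshaped = intensitiesshaped[:, :, 2:]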
I'm trying to read a wav file, then manipulate its contents, sample by sample.
Here's what I have so far:
import scipy.io.wavfile
import math
rate, data = scipy.io.wavfile.read('xenencounter_23.wav')
for i in range(len(data)):
    data[i][0] = math.sin(data[i][0])
    print data[i][0]
The result I get is:
0
0
0
0
0
0
etc
It is reading properly, because if I write print data[i] instead, I usually get non-zero arrays of size 2.
The array data returned by wavfile.read is a numpy array with an integer data type. The data type of a numpy array cannot be changed in place, so this line:
data[i][0] = math.sin(data[i][0])
casts the result of math.sin to an integer, which will always be 0.
Instead of that line, create a new floating point array to store your computed result.
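For example, a minimal sketch of that (operating on a float copy, so the sine values are not truncated to integers):
import numpy as np
float_data = data.astype(np.float64)         # float copy of the integer samples
float_data[:, 0] = np.sin(float_data[:, 0])  # sine of the first channel survives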
Or use numpy.sin to compute the sine of all the elements in the array at once:
import numpy as np
import scipy.io.wavfile
rate, data = scipy.io.wavfile.read('xenencounter_23.wav')
sin_data = np.sin(data)
print sin_data
From your additional comments, it appears that you want to take the sine of each value and write out the result as a new wav file.
Here is an example that (I think) does what you want. I'll use the file 'M1F1-int16-AFsp.wav' from here: http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/Samples.html. The function show_info is just a convenient way to illustrate the results of each step. If you are using an interactive shell, you can use it to inspect the variables and their attributes.
import numpy as np
from scipy.io import wavfile
def show_info(aname, a):
    print "Array", aname
    print "shape:", a.shape
    print "dtype:", a.dtype
    print "min, max:", a.min(), a.max()
    print
rate, data = wavfile.read('M1F1-int16-AFsp.wav')
show_info("data", data)
# Take the sine of each element in `data`.
# The np.sin function is "vectorized", so there is no need
# for a Python loop here.
sindata = np.sin(data)
show_info("sindata", sindata)
# Scale up the values to 16 bit integer range and round
# the value.
scaled = np.round(32767*sindata)
show_info("scaled", scaled)
# Cast `scaled` to an array with a 16 bit signed integer data type.
newdata = scaled.astype(np.int16)
show_info("newdata", newdata)
# Write the data to 'newname.wav'
wavfile.write('newname.wav', rate, newdata)
Here's the output. (The initial warning means there is perhaps some metadata in the file that is not understood by scipy.io.wavfile.read.)
<snip>/scipy/io/wavfile.py:147: WavFileWarning: Chunk (non-data) not understood, skipping it.
WavFileWarning)
Array 'data'
shape: (23493, 2)
dtype: int16
min, max: -7125 14325
Array 'sindata'
shape: (23493, 2)
dtype: float32
min, max: -0.999992 0.999991
Array 'scaled'
shape: (23493, 2)
dtype: float32
min, max: -32767.0 32767.0
Array 'newdata'
shape: (23493, 2)
dtype: int16
min, max: -32767 32767
The new file 'newname.wav' contains two channels of signed 16 bit values.
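As a quick sanity check (a sketch, reusing the imports above), reading the file back should show the same shape and the int16 dtype:
rate2, check = wavfile.read('newname.wav')
assert check.dtype == np.int16
assert check.shape == newdata.shape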