I know the question may not seem to make sense, so let me explain.
I have voice changer software that changes the voice in real time. If I open Audacity and select that software's microphone, I can speak and record with the changed voice.
What I want now is to take an audio file that is already recorded, pass it into that same microphone (to simulate me speaking), and save the voice-changed output to another file. I made some attempts using pyaudio, but without success.
The idea is to use a TTS module in Python to read a dataset I have with multiple lines, save the output to a file, then pass that output to the microphone to change the voice and save the result in another file. That way I can automate the creation of a new dataset with a new speaker to train a new TTS model. The piece I am missing is how to pass an already recorded audio file to the microphone, simulating me speaking into it.
Sorry if this was confusing. I did my best to explain. I hope someone can help me.
Thank you in advance!
This is what I have so far, without success:
import pyaudio
import wave
# Open the audio file
wf = wave.open("my_audio_file.wav", "rb")
# Open the output file
wf_out = wave.open("my_output_file.wav", "wb")
# Set the output file's format and parameters to match the input file
wf_out.setframerate(wf.getframerate())
wf_out.setsampwidth(wf.getsampwidth())
wf_out.setnchannels(wf.getnchannels())
# Initialize PyAudio
p = pyaudio.PyAudio()
# Create an output stream (with no device index given, this is the default output device)
stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                channels=wf.getnchannels(),
                rate=wf.getframerate(),
                output=True)
# Start streaming the audio data
stream.start_stream()
# Send the audio data to the stream and output file
data = wf.readframes(1024)
while data:  # readframes() returns empty bytes at end of file, never ""
    stream.write(data)
    wf_out.writeframes(data)
    data = wf.readframes(1024)
# Stop the stream
stream.stop_stream()
# Close the stream, PyAudio, and both files
stream.close()
p.terminate()
wf_out.close()
wf.close()
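One likely reason the attempt above has no effect is that the stream plays to the default output device, while the output file simply receives a copy of the unmodified input. A minimal sketch of one possible approach follows, assuming a virtual audio cable is installed so that whatever is played into its output device shows up as a microphone the voice changer can select. The helper names `find_device_index` and `play_wav_to_device` are hypothetical, and the device name you pass in depends on your system.

```python
import wave


def find_device_index(devices, name_fragment, want_output=True):
    """Return the index of the first device whose name contains name_fragment.

    `devices` is a list of dicts shaped like the results of PyAudio's
    get_device_info_by_index(). Returns None when nothing matches.
    """
    channel_key = "maxOutputChannels" if want_output else "maxInputChannels"
    for info in devices:
        if name_fragment.lower() in info["name"].lower() and info[channel_key] > 0:
            return info["index"]
    return None


def play_wav_to_device(wav_path, name_fragment, chunk=1024):
    """Play wav_path into the output device whose name matches name_fragment."""
    import pyaudio  # imported here so find_device_index works without PyAudio

    p = pyaudio.PyAudio()
    try:
        devices = [p.get_device_info_by_index(i)
                   for i in range(p.get_device_count())]
        index = find_device_index(devices, name_fragment)
        if index is None:
            raise RuntimeError("no output device matching %r" % name_fragment)
        with wave.open(wav_path, "rb") as wf:
            stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                            channels=wf.getnchannels(),
                            rate=wf.getframerate(),
                            output=True,
                            output_device_index=index)
            data = wf.readframes(chunk)
            while data:
                stream.write(data)
                data = wf.readframes(chunk)
            stream.stop_stream()
            stream.close()
    finally:
        p.terminate()
```

The voice-changed result would then be recorded separately, for example with a second input stream opened on the voice changer's own microphone device.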
Related
I want to append to a WAV file, ideally from a NumPy array. I tried the following code:
data = stream.read(CHUNK)
audio_numpy = numpy.frombuffer(data, dtype=numpy.int16)
scipy.io.wavfile.write(FILENAME, RATE, audio_numpy)
where stream is created by
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                frames_per_buffer=CHUNK)
because I had heard that scipy would append to the file instead of overwriting it. Unfortunately, it overwrites the file.
How can I append to a WAV file? The input comes from the microphone.
The WAV file will be read by ffmpeg later, so it should not be rewritten in full each time, which would also be inefficient.
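One way to avoid rewriting the file is to keep a single writer from the standard `wave` module open and add each chunk as it arrives. The sketch below assumes that approach; the class name `WavAppender` is hypothetical. `writeframes()` accepts raw PCM bytes, so a NumPy array can be passed as `audio_numpy.tobytes()`.

```python
import wave


class WavAppender:
    """Keep one WAV file open and add frames to it chunk by chunk."""

    def __init__(self, path, channels, sampwidth, rate):
        self._wf = wave.open(path, "wb")
        self._wf.setnchannels(channels)
        self._wf.setsampwidth(sampwidth)
        self._wf.setframerate(rate)

    def append(self, data):
        # data: raw PCM bytes, e.g. stream.read(CHUNK) or ndarray.tobytes()
        self._wf.writeframes(data)

    def close(self):
        # Closing finalizes the header with the total frame count
        self._wf.close()
```

Each call to `append()` only writes the new frames. Whether another program can safely read the file before `close()` is called is not something this sketch guarantees.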
I have a video file and I want to get the list of its streams. I can see the result I need by, for example, running a simple `ffprobe video.mp4`:
....
Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661) ......
Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), ......
....
But I need to use Python, with code that works on both Windows and Ubuntu, without executing an external process.
My real goal is to check whether there is ANY audio stream in the video (a simple yes/no would suffice), but I think the extra information could be helpful for my problem, so I'm asking about all the streams.
EDIT: To clarify, I need to avoid executing an external process; I'm looking for Python code or a library that does this within the process.
import os
import json
import subprocess
folder = "path to your videos folder"
for name in os.listdir(folder):
    # os.listdir() returns bare file names, so join them with the folder
    file = os.path.join(folder, name)
    # Pass the arguments as a list so paths with spaces need no shell quoting
    ffprobe_cmd = ["ffprobe", "-hide_banner", "-show_streams", "-print_format", "json", file]
    process = subprocess.Popen(ffprobe_cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    output = json.loads(process.communicate()[0])
    # Reset the flag for every file
    audio_flag = False
    for stream in output["streams"]:
        if stream['codec_type'] == 'audio':
            audio_flag = True
            break
    if audio_flag:
        print("audio present in the file")
    else:
        print("audio not present in the file")
    # loop through the output streams for more detailed output
    for stream in output["streams"]:
        for k, v in stream.items():
            print(k, ":", v)
Note: Make sure that your videos folder contains only valid video files, as I didn't include any file validation in the snippet above. Also, I have tested this code on a video file that contains one video stream and one audio stream.
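If only the yes/no answer is needed, the check on the ffprobe JSON can be separated from the process call, which also makes it easy to try out on sample output. This is a rough sketch; the function names `has_audio_stream` and `probe` are hypothetical, and `probe` still launches ffprobe as an external process, so it does not meet the in-process requirement from the question.

```python
import json


def has_audio_stream(ffprobe_json):
    """Return True if the ffprobe JSON text lists at least one audio stream."""
    info = json.loads(ffprobe_json)
    return any(s.get("codec_type") == "audio" for s in info.get("streams", []))


def probe(path):
    """Run ffprobe on path and return its JSON output as text."""
    import subprocess

    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_streams", "-print_format", "json", path],
        capture_output=True, text=True, check=True)
    return result.stdout
```

Usage would look like `has_audio_stream(probe("video.mp4"))`. Using `.get()` means a file with no `streams` key is reported as having no audio instead of raising a `KeyError`.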
I am creating a program to turn text into speech (TTS).
What I've done so far is split a given word into syllables and then play each pre-recorded syllable.
For example:
INPUT: [TELEVISION]
OUTPUT: [TEL - E - VI - SION]
And then the program plays each sound in order:
First: play TEL.wav
Second: play E.wav
Third: play VI.wav
Fourth: play SION.wav
I am using wave and PyAudio to play each wav file:
wf = wave.open("sounds/%s.wav" %(ss), 'rb')
p = pyaudio.PyAudio()
stream = p.open(...)
data = wf.readframes(CHUNK)
stream.write(data)
... etc.
Now the problem is that during the playback there is a delay between each audio file and the spoken word sounds unnatural.
Is it possible to mix these audio files without creating a new file and play them with 0.2s delay between each audio file?
Edit: I tried Nullman's solution and it worked better than just calling a new wf on each sound.
I also tried putting a crossfade following these instructions.
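One way to remove the pause caused by opening a new file and stream per syllable is to join all the frames in memory first and write them to a single stream. The sketch below assumes every syllable file has the same channel count, sample width, and sample rate, and that the samples are 16-bit PCM (so zero bytes are silence). The names `join_wavs` and `play` are hypothetical, and `gap_seconds` controls how much silence is placed between syllables.

```python
import wave


def join_wavs(paths, gap_seconds=0.0):
    """Return (params, frames) for the files in paths joined with silence.

    Raises ValueError if a file's format differs from the first file's.
    """
    params = None
    silence = b""
    pieces = []
    for path in paths:
        with wave.open(path, "rb") as wf:
            if params is None:
                params = wf.getparams()
                frame_size = params.nchannels * params.sampwidth
                gap_frames = round(gap_seconds * params.framerate)
                # All-zero bytes are silence for signed 16-bit PCM
                silence = b"\x00" * (gap_frames * frame_size)
            elif wf.getparams()[:3] != params[:3]:
                raise ValueError("%s has a different format" % path)
            if pieces:
                pieces.append(silence)
            pieces.append(wf.readframes(wf.getnframes()))
    return params, b"".join(pieces)


def play(params, frames):
    """Play the joined frames through one PyAudio output stream."""
    import pyaudio  # imported here so join_wavs works without PyAudio

    p = pyaudio.PyAudio()
    stream = p.open(format=p.get_format_from_width(params.sampwidth),
                    channels=params.nchannels,
                    rate=params.framerate,
                    output=True)
    stream.write(frames)
    stream.stop_stream()
    stream.close()
    p.terminate()
```

No new file is created; the joined audio only exists in memory until it is played.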
I want to record short audio clips from a USB microphone in Python. I have tried pyaudio, which seemed to fail to communicate with ALSA, and alsaaudio, whose code example produces unreadable files.
So my question: What is the easiest way to record clips from a USB mic in Python?
This script records to test.wav while printing the current amplitude:
import alsaaudio, wave, numpy
inp = alsaaudio.PCM(alsaaudio.PCM_CAPTURE)
inp.setchannels(1)
inp.setrate(44100)
inp.setformat(alsaaudio.PCM_FORMAT_S16_LE)
inp.setperiodsize(1024)
w = wave.open('test.wav', 'w')
w.setnchannels(1)
w.setsampwidth(2)
w.setframerate(44100)
while True:
    l, data = inp.read()
    if l > 0:  # skip empty or failed reads
        # frombuffer replaces the deprecated fromstring
        a = numpy.frombuffer(data, dtype='int16')
        print(numpy.abs(a).mean())
        w.writeframes(data)
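The value printed in the loop is the mean absolute sample value of each period. For reference, here is a small sketch of the same calculation using only the standard library, assuming signed 16-bit little-endian samples as configured above. The function name `mean_abs_amplitude` is hypothetical.

```python
import array
import sys


def mean_abs_amplitude(data):
    """Mean absolute value of signed 16-bit little-endian PCM samples."""
    samples = array.array("h")  # "h" is a signed 16-bit integer
    samples.frombytes(data)
    if sys.byteorder == "big":
        # The data is little-endian, so swap on big-endian machines
        samples.byteswap()
    if not samples:
        return 0.0
    return sum(abs(s) for s in samples) / len(samples)
```

Returning 0.0 for an empty buffer avoids a division by zero when a read yields no data.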
I want to programmatically record the sound coming out of my laptop in Python. I found PyAudio and came up with the following program, which accomplishes the task:
import pyaudio, wave, sys
chunk = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 44100
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = sys.argv[1]
p = pyaudio.PyAudio()
channel_map = (0, 1)
stream_info = pyaudio.PaMacCoreStreamInfo(
    flags = pyaudio.PaMacCoreStreamInfo.paMacCorePlayNice,
    channel_map = channel_map)
stream = p.open(format = FORMAT,
                rate = RATE,
                input = True,
                input_host_api_specific_stream_info = stream_info,
                channels = CHANNELS)
frames = []
for i in range(0, RATE // chunk * RECORD_SECONDS):  # integer division: range() needs an int
    data = stream.read(chunk)
    frames.append(data)
stream.close()
p.terminate()
data = b''.join(frames)  # stream.read() returns bytes
wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(data)
wf.close()
The problem is that I have to connect the headphone jack to the microphone jack. I tried replacing these lines:
input = True,
input_host_api_specific_stream_info = stream_info,
with these:
output = True,
output_host_api_specific_stream_info = stream_info,
but then I get this error:
Traceback (most recent call last):
  File "./test.py", line 25, in <module>
    data = stream.read(chunk)
  File "/Library/Python/2.5/site-packages/pyaudio.py", line 562, in read
    paCanNotReadFromAnOutputOnlyStream)
IOError: [Errno Not input stream] -9975
Is there a way to instantiate the PyAudio stream so that it takes its input from the computer's output, and I don't have to connect the headphone jack to the microphone? Is there a better way to go about this? I'd prefer to stick with a Python app and avoid Cocoa.
You can install Soundflower, which allows you to create extra audio devices and route audio between them. This way you can define your system's output to the Soundflower device and read the audio from it using PyAudio.
You can also take a look at PyJack, an audio client for Jack.
Unfortunately, there's no foolproof way to do it, but Audio Hijack and Wiretap are the best tools available for that.
I can offer an answer that does not involve programming:
Control Panel > Sound > Recording > enable Stereo Mix.
This requires driver support.
I also found that it makes my real sound echo.
At least it solves my problem.
Doing this programmatically will be tricky. Basically, it is not possible to intercept the audio traffic in front of your "output". Therefore, what you would have to do is create your own virtual audio device and make whatever application you want to capture play to that device.
Various third-party applications, some already mentioned by other people, seem to provide such capabilities on macOS. I can add Loopback to the list, but I have no experience with any of these tools.
Programmatically, you would have to mimic something exactly like that.
Check out this script at pycorder. It records output sound in python. I would consider modifying one of the functions to record output using an output stream with the sounddevice module. You can see documentation of sounddevice here. The script works and you can implement it to your needs, but replacing the recording system with an output stream would probably make the code less messy and more efficient.