GStreamer videomixer2 CPU usage 164%. Any solution to decrease it? (Python)

As part of my project, I have to synchronize two videos. Since I am implementing it in Python, I started using GStreamer.
My pipeline looks like this:
filesrc -> decoder -> queuev -> videobox
filesrc-1 -> decoder -> queuev1 -> videobox1
Both of these videoboxes are joined to the mixer like this:
[videobox 1 and 2] -> mixer -> ffmpegcolorspace -> videosink
All of them in a single pipeline.
The problem is that when I run the code, I get 174% CPU usage, which I think is not really optimized. Is there any way to reduce this? Even if I simply run three videos in parallel pipelines, I only get 14% CPU usage.
I am also uploading part of my code here:
import gst

self.pipeline = gst.Pipeline('pipeline')
# First branch: filesrc -> decodebin2 (its source pad appears dynamically)
self.filesrc = gst.element_factory_make("filesrc", "filesrc")
self.filesrc.set_property('location', videoloc1)
self.pipeline.add(self.filesrc)
self.decode = gst.element_factory_make("decodebin2", "decode")
self.pipeline.add(self.decode)
self.queuev = gst.element_factory_make("queue", "queuev")
self.pipeline.add(self.queuev)
self.video = gst.element_factory_make("autovideosink", "video")
self.pipeline.add(self.video)
# Second branch: filesrc_2 -> decodebin2
self.filesrc_2 = gst.element_factory_make("filesrc", "filesrc2")
self.filesrc_2.set_property('location', videoloc2)
self.pipeline.add(self.filesrc_2)
self.decode_2 = gst.element_factory_make("decodebin2", "decode_2")
self.pipeline.add(self.decode_2)
self.queuev_2 = gst.element_factory_make("queue", "queuev_2")
self.pipeline.add(self.queuev_2)
# Mixing stage: both videobox outputs feed the videomixer2
self.mixer = gst.element_factory_make("videomixer2", "mixer")
self.pipeline.add(self.mixer)
self.videobox_1 = gst.element_factory_make("videobox", "videobox_1")
self.pipeline.add(self.videobox_1)
self.videobox_2 = gst.element_factory_make("videobox", "videobox_2")
self.pipeline.add(self.videobox_2)
self.ffmpeg1 = gst.element_factory_make("ffmpegcolorspace", "ffmpeg1")
self.pipeline.add(self.ffmpeg1)
# Static links; the decodebin2 -> queue links happen later, on pad creation
gst.element_link_many(self.filesrc, self.decode)
gst.element_link_many(self.filesrc_2, self.decode_2)
gst.element_link_many(self.queuev, self.videobox_1, self.mixer, self.ffmpeg1, self.video)
gst.element_link_many(self.queuev_2, self.videobox_2, self.mixer)
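(The snippet never links the decoders to the queues: decodebin2 creates its source pads dynamically, so that link presumably happens in the omitted part of the code, in a new-decoded-pad handler. A minimal sketch of such a handler, under that assumption:)

def on_new_decoded_pad(self, dbin, pad, islast):
    # link the dynamically created decodebin2 source pad to the matching queue
    queue = self.queuev if dbin == self.decode else self.queuev_2
    pad.link(queue.get_pad("sink"))

self.decode.connect("new-decoded-pad", self.on_new_decoded_pad)
self.decode_2.connect("new-decoded-pad", self.on_new_decoded_pad)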

videomixer2 is using the CPU to mix the videos, so some load is expected. Anyway, in order to know for sure, run a profiler (oprofile, sysprof) to see what code is using the most CPU. Also, you did not say anything about the resolutions and colorspaces involved, or the hardware you run this on, so it is hard to tell whether it is unexpectedly slow.
Finally, you don't need to mix the videos to sync them; you can just run them in a single pipeline. It is up to your application to e.g. render into separate drawing areas in your window, or whatever.
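For illustration, two independent branches inside one pipeline already share a single clock and therefore stay in sync. A minimal sketch of that idea (gst-python 0.10 style to match the question; the file names are placeholders):

import gst

# Two separate decode/display branches, one shared pipeline and clock
pipeline = gst.parse_launch(
    "filesrc location=video1.avi ! decodebin2 ! ffmpegcolorspace ! autovideosink "
    "filesrc location=video2.avi ! decodebin2 ! ffmpegcolorspace ! autovideosink")
pipeline.set_state(gst.STATE_PLAYING)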

You can use the streamsynchronizer element: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-base-plugins/html/gst-plugins-base-plugins-streamsynchronizer.html

Related

Distributed tensorflow monopolizes GPUs after running server.__init__

I have two computers with two GPUs each. I am trying to get started with distributed TensorFlow and am very confused about how it all works. On computer A I would like to have one ps task (I have the impression this should go on the CPU) and two worker tasks (one per GPU). And I would like to have two worker tasks on computer B. Here's how I have tried to implement this, in test.py:
import tensorflow as tf
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--job_name', required = True, type = str)
parser.add_argument('--task_idx', required = True, type = int)
args, _ = parser.parse_known_args()

JOB_NAME = args.job_name
TASK_INDEX = args.task_idx

ps_hosts = ["computerB-i9:2222"]
worker_hosts = ["computerA-i7:2222", "computerA-i7:2223", "computerB-i9:2223", "computerB-i9:2224"]

cluster = tf.train.ClusterSpec({"ps": ps_hosts, "worker": worker_hosts})
server = tf.train.Server(cluster, job_name = JOB_NAME, task_index = TASK_INDEX)

if JOB_NAME == "ps":
    server.join()
elif JOB_NAME == "worker":
    is_chief = (TASK_INDEX == 0)
    with tf.device(tf.train.replica_device_setter(
            worker_device = "/job:worker/task:%d" % TASK_INDEX,  # was FLAGS.task_index, which is undefined here
            cluster = cluster)):
        a = tf.constant(8)
        b = tf.constant(9)
    with tf.Session(server.target) as sess:
        sess.run(tf.multiply(a, b))
What I am finding by running python3 test.py --job_name ps --task_idx 0 on computer A is that both GPUs on computer A are immediately reserved by the script and that computer B shows no activity. This is not what I expected: since the ps job simply runs server.join(), I thought it should not use the GPU. However, I can see by setting pdb breakpoints that as soon as the server is initialized, the GPUs are taken. This leaves me with several questions:
- Why does the server immediately take all the GPU capacity?
- How am I supposed to allocate GPU and launch different processes?
- Does my original plan even make sense? (I am still a little confused by tasks vs. clusters vs. servers etc...)
I have watched the TensorFlow Developer Summit 2017 video on distributed TensorFlow and I have also been looking around on GitHub and blogs. I have not been able to find a working code example using the latest, or even relatively recent, distributed TensorFlow functions. Likewise, I notice that many questions on Stack Overflow are not answered, so I have read related questions but none that resolve mine. I would appreciate any guidance or recommendations about other resources. Thanks!
I found that the following works when invoked from the command line:
CUDA_VISIBLE_DEVICES="" python3 test.py --job_name ps --task_idx 0 --dir_name TEST
Since I found this in a lot of code examples, it seems to be the standard way to control an individual server's access to GPU resources.
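The same effect can be had from inside the script, as long as the environment variable is set before TensorFlow is imported. A sketch (not from the question's code) that hides the GPUs only for ps tasks:

import os
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--job_name', required=True, type=str)
args, _ = parser.parse_known_args()

if args.job_name == "ps":
    # must happen before the tensorflow import, or the GPUs get grabbed anyway
    os.environ["CUDA_VISIBLE_DEVICES"] = ""

import tensorflow as tf  # a ps server now only sees the CPU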

Improving copying bytes from an Image

I have the following minimal code that gets the bytes from an image:
import Image  # old-style PIL import; with Pillow this is: from PIL import Image
im = Image.open("kitten.png")
# flatten the (R, G, B) tuples from getdata() into one long list of ints
im_data = [pix for pixdata in im.getdata() for pix in pixdata]
This is rather slow (I have gigabytes of images to process), so how could it be sped up? I'm also unfamiliar with what exactly that code is trying to do. All my data is 1280 x 960 x 8-bit RGB, so I can ignore corner cases, etc.
(FYI, the full code is here - I've already replaced the ImageFile loop with the above Image.open().)
You can try
scipy.ndimage.imread()
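For example, a sketch assuming an older SciPy in which scipy.ndimage.imread is still available (it reads the file straight into a NumPy array, skipping the per-pixel Python loop):

from scipy import ndimage

arr = ndimage.imread("kitten.png")  # uint8 array of shape (960, 1280, 3)
im_data = arr.ravel()               # flat sequence: R, G, B, R, G, B, ...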
If you mean speeding it up algorithmically, I can suggest processing the files with multiple threads simultaneously (only if there is no dependency between the processing steps): divide the work logically into a few sections and handle each part simultaneously in threads (you have to put your operation inside a function and call it from the threads); a sketch follows below.
Here is a link to a tutorial about threading in Python:
threading in Python
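A minimal sketch of that per-file split (the file names are hypothetical; note that in CPython a process pool usually beats threads for CPU-bound pixel work because of the GIL):

import glob
import numpy
from multiprocessing import Pool
from PIL import Image

def flatten_image(path):
    # each worker handles one file and returns its flat pixel list
    return numpy.asarray(Image.open(path)).ravel().tolist()

if __name__ == "__main__":
    pool = Pool()  # defaults to one worker per CPU core
    results = pool.map(flatten_image, glob.glob("*.png"))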
I solved my problem, I think:
>>> [pix for pixdata in im.getdata() for pix in pixdata] == \
... numpy.ndarray.tolist(numpy.ndarray.flatten(numpy.asarray(im)))
True
This cuts the runtime in half, and with a bit of bash magic I can run the conversion on the 56 directories in parallel.
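If whatever consumes im_data can take a NumPy array instead of a Python list, dropping the .tolist() call should save even more, since no per-pixel Python int objects get created. A sketch, assuming Pillow:

import numpy
from PIL import Image  # classic PIL would be: import Image

# flat uint8 array; no Python-level object is created per pixel
im_data = numpy.asarray(Image.open("kitten.png"), dtype=numpy.uint8).ravel()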

What is the simplest way to get from MIDI to real audio coming out my speakers (sound synthesis) in Python?

I'm starting work on an app that will need to create sound from lots of pre-loaded ".mid" files.
I'm using Python and Kivy to create an app, as I have made an app already with these tools and they are the only code I know. The other app I made uses no sound whatsoever.
Naturally, I want to make sure that the code I write will work cross-platform.
Right now, I'm simply trying to prove that I can create any real sound from a midi note.
I took this code, suggested in another answer to a similar question, using FluidSynth and Mingus:
from mingus.midi import fluidsynth
fluidsynth.init('/usr/share/sounds/sf2/FluidR3_GM.sf2',"alsa")
fluidsynth.play_Note(64,0,100)
But I hear nothing and get this error:
fluidsynth: warning: Failed to pin the sample data to RAM; swapping is possible.
Why do I get this error, how do I fix it, and is this the simplest way, or even the right way?
I could be wrong, but I don't think there is a channel "0", which is what you are passing as the second argument to .play_Note(). Try this:
fluidsynth.play_Note(64,1,100)
or (from some documentation):
from mingus.containers.note import Note
from mingus.midi import fluidsynth
n = Note("C", 4)
n.channel = 1
n.velocity = 50
fluidsynth.play_Note(n)
UPDATE:
There are references to only channels 1-16 in the source code for that method, with the default channel set to 1:
def play_Note(self, note, channel = 1, velocity = 100):
    """Plays a Note object on a channel[1-16] with a \
velocity[0-127]. You can either specify the velocity and channel \
here as arguments or you can set the Note.velocity and Note.channel \
attributes, which will take presedence over the function arguments."""
    if hasattr(note, 'velocity'):
        velocity = note.velocity
    if hasattr(note, 'channel'):
        channel = note.channel
    self.fs.noteon(int(channel), int(note) + 12, int(velocity))
    return True

Python network bandwidth monitor

I am developing a program in Python, and one element tells the user how much bandwidth they have used since the program was opened (not just within the program, but regular web browsing while the program has been open). The output should be displayed in GTK.
Is there anything in existence for this? If not, can you point me in the right direction? It seems like I would have to edit an existing proxy script like pythonproxy, but I can't see how I would use it.
Thanks,
For my task I wrote a very simple solution using psutil:
import time
import psutil

def main():
    old_value = 0
    while True:
        new_value = psutil.net_io_counters().bytes_sent + psutil.net_io_counters().bytes_recv
        if old_value:
            send_stat(new_value - old_value)
        old_value = new_value
        time.sleep(1)

def convert_to_gbit(value):
    return value/1024./1024./1024.*8

def send_stat(value):
    print("%0.3f" % convert_to_gbit(value))

main()
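If per-interface numbers are wanted rather than the machine-wide total, psutil can also report counters per NIC. A small sketch (the interface name 'eth0' is an assumption):

import psutil

counters = psutil.net_io_counters(pernic=True)['eth0']
print(counters.bytes_sent, counters.bytes_recv)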
import time

def get_bytes(t, iface='wlan0'):
    # read the kernel's cumulative byte counter for the interface
    with open('/sys/class/net/' + iface + '/statistics/' + t + '_bytes', 'r') as f:
        data = f.read()
    return int(data)

while True:
    tx1 = get_bytes('tx')
    rx1 = get_bytes('rx')
    time.sleep(1)
    tx2 = get_bytes('tx')
    rx2 = get_bytes('rx')
    # bytes per second * 8 / 1e6 = megabits per second
    tx_speed = round((tx2 - tx1)*8/1000000.0, 4)
    rx_speed = round((rx2 - rx1)*8/1000000.0, 4)
    print("TX: %fMbps RX: %fMbps" % (tx_speed, rx_speed))

This should work.
Well, I'm not quite sure if there is something in existence (written in Python), but you may want to have a look at the following:
Bandwidth Monitoring (not really an active project, but it may give you an idea)
Munin Monitoring (a Perl-based network monitoring project)
ntop (written in C/C++, based on libpcap)
Also, just to give you pointers if you are looking to do something on your own, one way could be to read and store the per-interface counters from /proc/net/dev (cat /proc/net/dev).
A proxy would only cover network applications that were configured to use it. You could set, e.g., a web browser to use a proxy, but what happens when your proxy exits?
I think the best thing to do is to hook in lower down the stack. There is a program that does this already: iftop (http://en.wikipedia.org/wiki/Iftop).
You could start by reading the source code of iftop, and perhaps wrap that into a Python C extension. Or rewrite iftop to log data to disk and read the log from Python.
Would something like Wireshark (https://wiki.wireshark.org/FrontPage) do the trick? I am tackling a similar problem now, and am inclined to use pyshark, a Wireshark/TShark wrapper, for the task. That way you can get capture file info readily.
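A minimal sketch of that pyshark idea (the interface name and packet count are assumptions; it sums each captured frame's length to estimate the traffic seen):

import pyshark

capture = pyshark.LiveCapture(interface='wlan0')
total_bytes = 0
for packet in capture.sniff_continuously(packet_count=100):
    total_bytes += int(packet.length)  # frame length in bytes
print("captured %d bytes" % total_bytes)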

Modules or functions to obtain information from a network interface in Python

I wrote a little application that I use from the terminal in Linux to keep track of the amount of data, up and down, that I consume in a session of Internet connection (I store the info in MongoDB). The upload and download figures I currently enter by hand, reading them (visually) from the system monitor. The thing is, I would like to automate my application further and make it read the data consumed, up and down, from the network interface I use to connect to the Internet (in my case ppp0), but I have not found a way to do this in Python. I guess Python has a module to import, or something similar, that lets me do what I want, but so far my research has not turned it up.
Do you know of any module, function or similar that allows me to do this in Python?
Any example?
Thanks in advance
Well, I'll answer myself.
I found this recipe in the PyAr community; it fits me like a glove, doing what I wanted without having to use extra commands or other applications.
Slightly modifying the code to better suit my application, and adding a function that converts bytes to megabytes, I left it like this:
def bytestomb(b):
    mb = float(b) / (1024*1024)
    return mb

def bytessubidatransferidos():
    # uploaded (tx) megabytes for the interface
    interface = 'ppp0'
    for line in open('/proc/net/dev', 'r'):
        if interface in line:
            data = line.split('%s:' % interface)[1].split()
            tx_bytes = data[8]  # 9th field after the colon: tx byte counter
            return bytestomb(tx_bytes)

def bytesbajadatransferidos():
    # downloaded (rx) megabytes for the interface
    interface = 'ppp0'
    for line in open('/proc/net/dev', 'r'):
        if interface in line:
            data = line.split('%s:' % interface)[1].split()
            rx_bytes = data[0]  # 1st field after the colon: rx byte counter
            return bytestomb(rx_bytes)

print bytessubidatransferidos()
print bytesbajadatransferidos()
