Questions about Improving Performance with from scratch Python Graphics Library - python

I'm currently working on building a graphics library with Python. (I know opengl exists, but this is just for fun) Essentially, I want to be able to implement many of the common features available with opengl.
Right now, I'm using pyopencl which uses opencl under the hood to send code to the GPU for processing.
Here's the kicker though, I'm developing on a 2015 MacBook Pro that has an AMD GPU.
I've written all of the code to make this work and now I'm working on finding the bottlenecks in the code to try to optimize.
I've found that the biggest bottle neck is in my fragment shader implemenetation when I'm converting numpy arrays to cl.Buffer objects prior to GPU processing and then back to numpy arrays after processing.
Doing some research has led me to think that using SVMs would help to minimize this cost, but.... SVMs were introduced in opencl 2.0. Apparently, Apple has stopped supporting opencl in favor of their own in house GPU library Metal and so I'm stuck with opencl 1.2
So I guess my question is, has anyone else hit this roadblock and if so, what is the common way of handling it? If I transition to Metal, then my code is no longer as universal, but if I stay on opencl, I have performance problems. I could autodetect the platform that the code is being run on and use different implementations specific to the platform, but one of my problems is I don't know if there is a trusted standard implementation for Metal in Python.

Related

How to make python run using Windows GPU?

I've been trying to improve the performance of my python scripts and would like to run some using my computer's built-in GPU. However, my computer is Windows 10 and its GPU is not CUDA compatible. From what I've seen, it seems that the GPU must be CUDA compatible in order for it to run python scripts. Is there any way to utilize my GPU for said purposes? If not, are there other programming languages in which I can do this?
The GPU is a proccessing unit for graphics. It most likely won't help except for drawing polygons, transfering data, or massive data sets. The closest you can get is importing a module (depending on your needs), that uses C++ to interact with the GPU (such as OpenCL), or coding interactions yourself (much more complicated).
To answer your 2nd question, C++ or C# should work with your GPU.
Please specify what script you are trying to run for more detail
Good luck!

how can i work with my GPU in python Visual Studio Code

Hello I know that the key to analyzing data and working with artificial intelligence is to use the gpu and not the cpu. The problem is that I don't know how to use it with Python in the visual studio code, I use Ubuntu, I already have nvidia installed
You have to use with the libraries that are designed to work with the GPUs.
You can use Numba to compile Python code directly to binary with CUDA/ROC support, but I don't really know how limiting it is.
Another way is to call APIs that are designed for parallel computing such as OpenCL, there is PyOpenCL binding for this.
A bit limiting, but sometimes valid way - OpenGL/DirectX compute shaders, they are extremely easy, but not so fast if you need to transfer data back and forth in small batches.

Designing an interface between C++ and Python

I have code for a physics simulation in C++. I am trying to solve a control problem using Deep Reinforcement Learning. Most of the popular packages such as keras, pytorch, are based on Python. So, my machine learning code resides in Python and the physics simulator code is in C++.
At every iteration of the machine learning algorithm, I require a call to the C++ program and want the program to persist its state (maintain the variable values). One way I came up with was reading and writing variable values to files. But this didn't seem scalable considering the high number of variables in the program. I googled and found Boost Python library for wrapping my entire C++ code, so that it might be available to the Python program. My question is by doing this do I lose all the speed up of running the code in C++?
Also, I came across numba, which according to the creators is specifically designed for speed up for Scientific Computing and can reach speeds as fast as native C++. But, this would essentially mean re-writing the entire code in Python. Will this be a better choice?
I am on a strict timeline for my project and any advice on which way I should go will be much appreciated.

Matrix calculation using gpu

Today i was asking to me if is possible to do matrix calculation using gpu instead cpu because i know that a gpu is designed to do them faster then a cpu.
I searched on the net and i found notices about the matrix calculation using gpu with different python's libraries but my question is exists a documentation that descibes how should we write code to comunicate with a gpu.
I'm asking that because i want to develop my own one to better understand how gpu work and to try something different.
Thanks to all.
I solved that problem with OpenCL
OpenCL is a standard library that the vendor of the GPU's implement by their own. Like NVIDIA support openCL and other features thanks to CUDA library.
Here a good guide to get start

LLVM, Parrot, JVM, PyPy + python

What is the problem in developing some languages, for example python for some optimized techniques with some of LLVM / Parrot.
PyPy, LLVM, Parrot are the main technologies for common platform development.
I see this like:
PyPy - framework to build VM with build in optimized VM for python
So it quite general solution. The process goes as listed down:
dynamic_language_code ->
PyPy frontend ->
PyPy internal code - bytecode ->
PyPy optimization ->
leaving PyPy code and:
a. PyPy backend for some VM (like jvm)
b. som Kit to make own VM
c. processing/running PyPy internal code
Am I right About this process? For python there is optimized VM? Particularly by default there is build in VM for optimized PyPy code (step 5.c) - which is for python and every language processing can stop there and be running by it?
Parrot - much like PyPy, but without 5.a and 5.b ? Some internal improvements for dynamic processing (Parrot Magic Cookies).
Both Parrot and PyPy are designed to create a platform which create a common dynamic languages runtime, but PyPy wants more - also to create more VM.
Where is the sens of PyPy? For what we need to create more VM? Shouldn't be better to focus on one VM (like in parrot) - because there is common one code level - either PyPy internal bytecode or Parrot ones.
I think we can't gain nothing better to translate to PyPy bytecode to newly created with PyPy VMs.
LLVM - i see this very similar to PyPy but without VM generator.
It is mature, well designed environment with similar targets as PyPy (but without VM generator) but working on low level structure and great optimization/JIT techniques implemeted
Is see this as: LLVM is general use, but Parrot and **PyPy* designed for dynamic languages. In PyPy / Parrot is more easy to introduce some complicated techniques - because it is more high level then LLVM - like sophisticate compiler which can better understand high level code and produce better assembler code (which humans can't write in reasonable time), then the LLVM one?
Questions:
Am I right? Is there any reason that porting some dynamic language would be better to llvm then to for example Parrot?
I haven't see the activity on development python on Parrot. Is it because using python C extensions doesn't work on parrot? The same problem is in PyPy
Why other VMs don't want to move to LLVM / parrot. Eg ruby -> parrot, CLR/ JVM -> LLVM. Wouldn't be better for them to move to more sophisticated solution? LLVM is in high development process and has big companies investing in.
I know the problem might be in recompile are resources, if there is need to change bytecode - but it is not obligatory - as we can try to port old bytecode to new one, and new compilers produce new bytecode (never less java still need to interpreted own bytecode - so the frontend can check it and translate it to new bytecode)?
What are the problems with linking for example jvm libraries inside llvm (if we port somehow java/jvm/scala to llvm)?
Can you correct me if i'm wrong somewhere
Some addings:
How does Parrot compare to other virtual machines
What's the benefit of Parrot VM for end-users
What are the differences between LLVM and java/jvm
=============
CLARIFICATION
I want to figure how all this software consist - and what is the problem to porting one to other.
That not stuff anybody can possible answer in a stackoverflow questions but i give it a minmal shot.
First what problems do the 3 projects solve?
pypy allows you to implement an interpreter in a high level language and you get a generated jit for free. The good thing about this is that you don't have a dependence mismatch between the langauge and the platform. Thats the reason why pypy-clr is faster then IronPython.
More info here: http://codespeak.net/pypy/dist/pypy/doc/extradoc.html --> High performance implementation of Python for CLI/.NET with JIT compiler generation for dynamic)
llvm is a low level infrastructure for compilers. The general idea is to have one "high level assembly". All the optomizations work on that language. Then there is tons of infrastructure around to help you build compilers (JIT or AOT). Implementing a dynamic language on llvm is possible but needs more work then implementing it on pypy or parrot. You, for example, can't get a GC for free (there are GC you can use together with LLVM see http://llvm.org/devmtg/2009-10/ --> the vmkit video ) There are attempts to build a platform better for dynamic langauges based on llvm: http://www.ffconsultancy.com/ocaml/hlvm/
I don't know that much about parrot but as I understand they want to build one standard VM specialized for dynamic langauges (perl, php, python ....). The problem here is the same as with compiling to JVM/CLR there is a dependency missmatch, just a much smaller one. The VM still does not know the semantics of your langauge. As I unterstand parrot is still pretty slow for user code. (http://confreaks.net/videos/118-elcamp2010-parrot)
The answer to your question:
Am I right? Is there any reason that porting some dynamic language would be better to llvm then to for example Parrot?
Thats a question of effort. Building everthing your self and specialized for you will eventually be faster but it's a LOT more effort.
I haven't see the activity on development python on Parrot. Is it because using python C extensions doesn't work on parrot? The same problem is in PyPy.
Targeting parrot would (at this point) not likely have a benefit over pypy. Why nobody else does it I don't know.
Why other VMs don't want to move to LLVM / parrot. Eg ruby -> parrot, CLR/ JVM -> LLVM. Wouldn't be better for them to move to more sophisticated solution? LLVM is in high
development process and has big companies investing in.
Ok there is a lot of stuff in that question.
Like I said LLVM is hard to move to and parrot is not that fast (correct me if im wrong).
Ruby has Rubinius witch tries to do a lot in ruby and jits to llvm (http://llvm.org/devmtg/2009-10/ --> Accelerating Ruby with LLVM).
There is a implementation of CLR/JVM on LLVM but they both already have very mature implemantations that have big investments.
LLVM is not high level.
I know the problem might be in recompile are resources, if there is need to change bytecode - but it is not obligatory - as we can try to port old bytecode to new one, and new compilers produce new bytecode (never less java still need to interpreted own bytecode - so the frontend can check it and translate it to new bytecode)?
I have no idea what the question is.
What are the problems with linking for example jvm libraries inside llvm (if we port somehow java/jvm/scala to llvm)?
Watch the video of VMKit I linked above that show how far they got and what the problem is (and how they solved it).
Can you correct me if i'm wrong somewhere
Lots of stuff you wrote is wrong or I just don't anderstand what you mean, but the stuff I linked should make a lot of stuff clearer.
Some examples:
Clojure
The creater didn't want all the work of implementing his own vm and all the libraries. So where to go to? Since Clojure is a new langauge you can build it in a way that works well on a platform like the JVM by restricting a lot of dynamic stuff a language like python or ruby would have.
Python
The language can't (practically) be changed to work well on JVM/CLR. So implementing python on those wont bring massive speedups. Static compiler won't work very well either because there are not many static guarantees. Writing a JIT in C will be fast but very hard to change (see the psyco project). Using the llvm jit could work and is explored by the Unladen Swallow project (again http://llvm.org/devmtg/2009-10/ --> Unladen Swallow: Python on LLVM). Some people wanted to have python in python so they started pypy and their idea seams to work really well (see above). Parrot could work as well but I have not seen anybody have try (feel free).
On everything:
I think you're confused and I can understand that. Take your time and read, listen, watch everything you can get. Don't stress yourself. There are a lot of parts to this and eventually you see how what fits together and what makes sense and even when you know a lot there is still a lot of discussing one may do. The question is where to implement a new language or how to speed up a old language have many answers and if you ask 3 people you're likely to get three different answers.
What are you trying to implement? Your question is very confusingly worded (I realize English is likely not your first language).
LLVM and PyPy are both mature, useful projects, but really don't overlap much at this point. (At one point, PyPy could generate LLVM bytecode—which was statically compiled to an interpreter—as opposed to C code, but it didn't provide much of a performance benefit and is no longer supported.)
PyPy lets you write an interpreter in RPython and use that as a description to generate a native code interpreter or JIT; LLVM is a C++ framework for building a compiler toolchain which can also be used to implement a JIT. LLVM's optimizers, code generation and platform support are significantly more advanced than those of PyPy, but it isn't as well suited to building a dynamic language runtime (see the Unladen Swallow retrospective for some examples of why). In particular, it is not as effective at collecting/using runtime feedback (which is absolutely essential for making dynamic languages perform well) as PyPy's trace-based JIT. Also, LLVM's garbage collection support is still somewhat primitive, and it lacks PyPy's unique ability to automatically generate a JIT.
Incidentally two Java implementations are built on LLVM—J3/VMKit and Shark.
You might consider watching the PyPy talk from Stanford last week; it provides a pretty decent overview of how PyPy works. Carl Friedrich Bolz's presentation also provides a good overview of the state of VM implementation.
The main reason? Because VM design is not a settled technology, and having a variety of VMs with different goals and objectives allows a variety of mechnisms to be tried in parallel rather than all having to be tried in series.
The JVM, CLR, PyPy, Parrot, LLVM and the rest all target different kinds of problems in different ways. It's similar to the reasons why Chrome, Firefox, Safari and IE all use their own Javascript engines.
Unladen Swallow attempted to apply LLVM to CPython, and spent more of their time fixing issues in LLVM than they did in doing anything Python specific.
Python-on-Parrot suffered from semantic differences between Perl 6 and Python causing problems with the front-end compilation process, so future efforts in this area are likely to use the PyPy front-end to target the Parrot VM.
Different VM developers certainly keep an eye on what the others are doing, but even when they lift good ideas they will put their own spin on them before incorporating them.

Categories

Resources