Converting a while-loop with an array from Python to Assembly - python

To prepare myself for the next school semester, I am trying to learn Assembly, a language I am not at all familiar with. I already know the basics of computer code and assembly coding and I have made a while-loop in Assembly before. However, this time I want to use an array.
I'm trying to convert a simple while-loop that I made in Python, to Assembly code but could use some assistance.
i = 2
numbers = []
while i <= 255:
    i -= 1
    numbers.append(i)
    i *= 3
    if i < 255:
        numbers.append(i)
    else:
        break
How can I do this? Once I have an idea of what it should look like, I can start studying the code and begin understanding it.
The computer architecture I'm using is an AVR ATMEGA328P
This is a for-loop I made earlier. It's pretty random and probably not very efficient, I was just practicing:
loop:
    cpi  r16, 255      ; compare r16 with 255
    breq notthree      ; if r16 == 255, go handle that case
    mov  r18, r16      ; copy r16 into r18
    cpi  r17, 6        ; compare r17 with 6
    brne notsix        ; if r17 != 6, skip the subtraction
    sub  r17, r16      ; r17 = r17 - r16
    rjmp loop
notthree:
    cpi  r17, 3        ; compare r17 with 3
    breq end_loop      ; if r17 == 3, leave the loop
notsix:
    dec  r16           ; r16 = r16 - 1
    rjmp loop
end_loop:

This question is pretty far beyond the scope of what we can usually answer in a single question. Python is a very high-level language, so a good place to begin may be by learning some C, especially regarding pointers. C is basically the layer above a computer's own language; it allows programs to be written in a more cross-platform way.
Things like arrays and references don't behave the way their higher-level counterparts, lists and variables, do. The idea behind an array is to store data in a contiguous region of memory. In very simple implementations, arrays are referenced by an address pointer and indexed by performing math on that pointer to find nearby addresses. In bounds-checked implementations, either a NULL terminator value or a header indicating the size can be used to prevent reading data outside of the array.
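As a rough illustration of the "contiguous block plus index math" idea, here is a minimal Python sketch (illustrative only; the element size and helper names are made up for the example, and a real array lives directly in RAM rather than in a bytearray):
import struct

ELEMENT_SIZE = 2                          # two bytes per element, like a 16-bit value
storage = bytearray(10 * ELEMENT_SIZE)    # one contiguous region for 10 elements

def store(index, value):
    offset = index * ELEMENT_SIZE         # "pointer math": base + index * size
    struct.pack_into("<H", storage, offset, value)

def load(index):
    offset = index * ELEMENT_SIZE
    return struct.unpack_from("<H", storage, offset)[0]

store(0, 255)
store(1, 765)
print(load(0), load(1))                   # 255 765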
The layer between high-level languages and low-level languages is often much harder to study than the languages themselves, because it requires understanding the design of both parts in order to appreciate their connection. With a simple Google search, you can find various asm tutorials, such as How to make an assembly array. Note that asm is the broad term for a family of related languages that are unique to their architectures.
Good luck on your journey, and feel free to ask broad questions such as this in chatrooms (such as our HCF), IRC channels, or Discord servers. When anything specific causes you trouble, come write up a question here.

Related

How to use python when learning algorithms?

Basically, I am trying to learn some very basic data structures and algorithms using Python. However, I think that when trying to implement these algorithms, I unknowingly start using Python tricks a bit, even ones as simple as the following, which would not be considered tricks by any stretch of the imagination:
for i, item in enumerate(arr):
    # Algo implementation
or
if item in items:
    # do something
I don't know what general guideline to follow so that I can grasp the algorithm as it's meant to be implemented.
It is all right to use Python's techniques to solve problems. The main exception is when Python does something for you and you want to learn how that something was done. One example is Python's heapq--You can't use that directly if your purpose is to understand how the binary heap structure can be used to implement a priority queue. Of course, you could read the source code and learn much that way.
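For example, here is a minimal, illustrative min-heap of the kind heapq hides from you - a sketch only, not production code, just enough to see the sift-up/sift-down logic behind a priority queue:
class MinHeap:
    def __init__(self):
        self._data = []

    def push(self, item):
        self._data.append(item)
        i = len(self._data) - 1
        while i > 0:                      # sift up: swap with parent while smaller
            parent = (i - 1) // 2
            if self._data[i] < self._data[parent]:
                self._data[i], self._data[parent] = self._data[parent], self._data[i]
                i = parent
            else:
                break

    def pop(self):
        data = self._data
        data[0], data[-1] = data[-1], data[0]
        smallest = data.pop()
        i = 0
        while True:                       # sift down: swap with the smaller child
            left, right = 2 * i + 1, 2 * i + 2
            child = i
            if left < len(data) and data[left] < data[child]:
                child = left
            if right < len(data) and data[right] < data[child]:
                child = right
            if child == i:
                return smallest
            data[i], data[child] = data[child], data[i]
            i = child

h = MinHeap()
for x in (5, 1, 4, 2):
    h.push(x)
print([h.pop() for _ in range(4)])        # [1, 2, 4, 5]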
One thing that can help you is to read a data structures and algorithms book that is based on Python. Then you can be assured that Python will not be used to slide over important topics--at least, if the book is any good.
One such book is Problem Solving with Algorithms and Data Structures. Another is Classic Computer Science Problems in Python. The first book is a free PDF download, though I believe there is a more recent edition that is not free. The second is not free but you can get a discount of 40% at the publisher's web site if you use a discount code mentioned in the Talk Python to Me podcast. I am working through the latter book now, as a reminder of the class I took a very long time ago.
As to a recommendation, the price may be a deciding factor for you. The emphases of the two books also differ. The first is older, using more generic Python and not using many of Python's special features. It is also closer to a textbook, going into more depth on its topics. It covers things like execution complexity, for example. The PDF version, however, does not cover as many topics as other versions I have seen. The PDF does not cover graphs (networks), for example, which another version (which I cannot find now) does.
The second is much more recent, using features of Python 3.7 such as type hints. It also is more of an introduction or review. I think I can use "fair use" to quote the relevant section of the book:
Who is this book for?
This book is for both intermediate and experienced programmers.
Experienced programmers who want to deepen their knowledge of Python
will find comfortably familiar problems from their computer science or
programming education. Intermediate programmers will be introduced to
these classic problems in the language of their choice: Python.
Developers getting ready for coding interviews will likely find this
book to be valuable preparation material.
In addition to professional programmers, students enrolled in
undergraduate computer science programs who have an interest in Python
will likely find this book helpful. It makes no attempt to be a
rigorous introduction to data structures and algorithms. This is not a
data structures and algorithms textbook. You will not find proofs or
extensive use of big-O notation within its pages. Instead, it is
positioned as an approachable, hands-on tutorial to the
problem-solving techniques that should be the end product of taking
data structure, algorithm, and artificial intelligence classes.
Once again, knowledge of Python’s syntax and semantics is assumed. A
reader with zero programming experience will get little out of this
book, and a programmer with zero Python experience will almost
certainly struggle. In other words, Classic Computer Science Problems
in Python is a book for working Python programmers and computer
science students.
If you want to understand how an algorithm works, I would strongly recommend working with flowcharts. They represent the algorithmic procedure as relations between the elementary logical statements the algorithm is made of, and they are independent of the programming language in which an algorithm might be implemented.
If you want to learn python along with it, then here is what you can do:
1. Study the flowchart of the algorithm that interests you.
2. Translate that flowchart 1-to-1 into python code.
3. Have a closer look at your python code and try to optimize or compact it.
This can be best illustrated with an example:
1.
Here is the flowchart of Euclid's algorithm that finds the greatest common divisor of two numbers (taken from the wiki page Algorithm):
To understand an algorithm means to be able to follow or even reproduce this flowchart.
2.
Now if your goal is to learn python a great exercise is to take a flowchart and translate it to python. No shortcuts, no simplifications, just 1-to-1 as it is written, translate the algorithm to python. You won't be fooled by any tricks or masked complexity when doing so, as the flowchart tells you the elementary logical steps and you are just translating them to your preferred programming language.
For the example above, a crude 1-to-1 implementation looks like this:
def gcd(a,b):           # point 1
    while True:
        if b == 0:      # p. 2
            return a    # p. 8 + 9
        if a > b:       # p. 3
            a = a - b   # p. 6
                        # p. 7
        else:           # p. 3
            b = b - a   # p. 4
                        # p. 5
3.
By now you have both learned how the algorithm works and how you implement logical statements in python. The tricks you mentioned earlier can enter the game here. You can start to play around and try to make the implementation more efficient, more compact or a one-liner (people like this for some reason). This will not only help your logical understanding but it will also deepen your knowledge of the programming language you are using.
As for the example at hand, Euclid's algorithm, there is not a lot of fancy business that comes to my mind. I somehow find recursive calls elegant, so here is a tricky implementation using this:
def gcd(a,b):
    if b == 0:
        return a
    else:
        return gcd(a-b,b) if a > b else gcd(a, b - a)
Note that you can (and sometimes even have to) do this procedure in the reverse order. It can happen that the only thing you know about an algorithm is an implementation of it. Then you would proceed in exactly the reversed order: 3.->2. Try to identify and 'expand' all trickery that might be present in the implementation. 2.->1. Use the 'expanded' implementation to create a flowchart of the algorithm, in order to have a proper definition.
They are not tricks!
These are the same things you would do in any other language; they are just simpler in Python.
In C/C++ you would do:
for(int i = 0; i < sizeof(arr)/sizeof(arr[0]); i++) {
    // access the array elements here as arr[i]
}
You would do the same thing in Python in a more convenient way, i.e.
for i, a in enumerate(arr):
    # do something
or
for i in range(len(arr)):
    # do something with arr elements
Your algorithms will NOT depend on these syntactical differences.
Whether it is Python, C/C++, or any other language, if you have a good understanding of the language, you are good to go with anything. You just have to keep in mind the time complexities of the algorithms you use and how you implement them.
The thing with Python is that it's much easier to understand, it's shorter to write, it has a lot of built-in functions, you don't need a class or a main function to run your program, and so on.
If you ask me, I would not say these are tricks. All programming languages have these things in common, with just syntactical differences.
It depends on what you are trying to implement. Say you are trying to implement a linked list: you just need to know what you can use in Python to implement that.
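For instance, a bare-bones singly linked list in plain Python might look like the following sketch (the names are made up; the point is just to keep the pointer-chasing visible instead of reaching for the built-in list):
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None                  # reference to the next node, or None at the tail

def push_front(head, value):
    node = Node(value)
    node.next = head
    return node                           # the new node becomes the head

def to_pylist(head):
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out

head = None
for v in (3, 2, 1):
    head = push_front(head, v)
print(to_pylist(head))                    # [1, 2, 3]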

Does every simple mathematical operation use the same amount of power (as in, battery power)?

Recently I have been revising some of my old Python code, which is essentially loops of algebra, in order to have it execute faster, generally by eliminating some unnecessary operations. For example, changing the value of an entry in a list from 0 (as a Python float, which I believe is a double by default) to the same value, which is obviously not necessary. Or checking if a float is equal to something when it MUST be that thing, because a preceding "if" would not have triggered if it wasn't, or some other extraneous operation. This got me wondering about what will preserve my battery more, as I do some of my coding on the bus where I can't plug my laptop in.
For example, which of the following two operations would be expected to use less battery power?
if b != 0: # b was assigned previously, and I know it is zero already
    b = 0
or:
b = 0
The first one checks if b is zero, and it is, so it doesn't do the next part. The second one just assigns b to zero without bothering to check. I believe the first one is more time-efficient, as you don't have to change anything in memory. Is that correct, and if so, would it also be more power-efficient? Does "more time efficient" always imply "more power efficient"?
I suggest watching this talk by Chandler Carruth: "Efficiency with Algorithms, Performance with Data Structures"
He addresses the idea of "power-efficient instructions" at 4m 49s in the video. I agree with him: thinking about how many watts a particular piece of code consumes is useless. As he puts it:
Q: "How to save battery life?"
A: "Finish running the program."
Also, in Python you do not have low-level control to even be thinking about low-level problems like this. Use appropriate data structures and algorithms, and pray that the Python interpreter will give you well-optimized bytecode.
Does every simple mathematical operation use the same amount of power (as in, battery power)?
No. Computing the sum of two numbers is not the same as computing the Fourier transform of a 20-megapixel photo.
I believe the first one is more time-efficient, as you don't have to change anything in memory. Is that correct, and if so, would it also be more power-efficient? Does "more time efficient" always imply "more power efficient"?
Yes. You are right on your intuitions but these are very trivial examples. And if you dig deeper you will get into uncharted territory of weird optimization that's quite difficult to grasp (e.g., see this question: Times two faster than bit shift?)
In general, the more your code utilizes system resources, the more power those resources will use. However, it is more useful to optimize your code based on time or size constraints instead of thinking about high-level code in terms of power draw.
One way of doing this is Big O notation. In essence, Big O notation is a way of comparing the size and/or runtime complexity of an algorithm. https://rob-bell.net/2009/06/a-beginners-guide-to-big-o-notation/
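As a toy sketch of that idea (counting operations rather than measuring, with made-up function names):
def linear_scan(items):
    ops = 0
    for _ in items:                       # O(n): one pass over the data
        ops += 1
    return ops

def all_pairs(items):
    ops = 0
    for a in items:                       # O(n^2): every element against every other
        for b in items:
            ops += 1
    return ops

data = list(range(1000))
print(linear_scan(data), all_pairs(data)) # 1000 vs 1000000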
A computer at its lowest level is a large quantity of transistors which require power to change and maintain their state.
It would be extremely difficult to anticipate how much power any one line of python code would draw.
I once had questions like this. Still do sometimes. Here's the answer I wish someone told me earlier.
Summary
You are correct that generally, if your computer does less work, it'll use less power.
But we have to be really careful in figuring out which logical operations involve more work and which ones involve less work - in this case:
Reading vs writing memory is usually the same amount of work.
if and any other conditional execution also costs work.
Python's "simple operations" are not "simple operations" for the CPU.
But the idea you had is probably correct for some cases you had in mind.
If you're concerned about power consumption, measure where power is being used.
For some perspective: You're asking about which Python code costs you one more drop of water, but really in Python every operation costs a bucket and your whole Python program is using a river and your computer as a whole is using an ocean.
Direct Answers
Don't apply these answers to Python yet. Read the rest of the answer first, because there's so much indirection between Python and the CPU that you'll mislead yourself about how they're connected if you don't take that into account.
I believe the first one is more time-efficient, as you don't have to change anything in memory.
As a general rule, reading memory is just as slow as writing to memory, or even slower depending on exactly what your computer is doing. For further reading you'll want to look into CPU memory cache levels, memory access times, and how out-of-order execution and data dependencies factor into modern CPU architectures.
As a general rule, the if statement in a language is itself an operation which can have a non-negligible cost. For further reading you should look into how CPU pipelining relates to branch prediction and branch penalties. Also look into how if statements are implemented in typical CPU instruction sets.
Does "more time efficient" always imply "more power efficient"?
As a general rule, more work-efficient (doing less work - fewer machine instructions, for example) implies more power-efficient, because on modern hardware (this wasn't always the case) your hardware will use less power when it's not doing anything.
You should be careful about the idea of "more time efficient" though, because modern hardware doesn't always execute the same amount of work in the same amount of time: for further reading you'll want to look into CPU frequency scaling, ARM's big.LITTLE architectures, and discussions about the "Race to Idle" concept as a starting point.
"One Simple Operation" - CPU vs. Python
Your question is about Python, so it's important to realize that Python's x != 0, if, and x = 0 do not map directly to simple operations in the CPU.
For further reading, especially if you're familiar with C, I would recommend taking a long look at how Python is implemented. There are many implementations - the main one is CPython, which is a C program that reads Python source, converts it into Python "bytecode", and then interprets that bytecode one instruction at a time.
As a baseline, if you're using Python, any one "simple" operation is actually a lot of CPU operations, as each step in the Python interpreter is multiple CPU operations, but which ones cost more might be surprising.
Let's break down the three used in our example (I'm primarily describing this from the perspective of the main Python implementation written in C, called "CPython", which I am the most closely familiar with, but in general this explanation is roughly applicable to all of them, though some will be able to optimize out certain steps):
x != 0
It looks like a simple operation, and if this was C and x was an int it would be just one machine instruction - but Python allows for operator overloading, so first Python has to:
look up x (at least one memory read, but may involve one or more hashmap lookups in Python's internals, which is many machine operations),
check the type of x (more memory reading),
based on the type, look up a function pointer that implements the not-equality operation (one or arbitrarily many memory reads and conditional branches, with data dependencies between them),
only then can it finally call that function with references to the Python objects holding the values of x and 0 (which is also not "free" - look up function calling ABIs for more on that).
All that and more has to be done by the CPU even if x is a Python int or float mapping closely to the CPU's native numerical data types.
x = 0
An assignment is actually far cheaper in Python (though still not trivial): it only has to get as far as step 1 above, because once it knows "where" x is, it can just overwrite that pointer with the pointer to the Python object representing 0.
if
Abstractly speaking, the Python if statement has to be able to handle "truthy" and "falsey" values, which in the most naive implementation involves running through more CPU instructions to evaluate what the result of the condition is according to Python's semantics of what's true and what's false.
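If you want to see some of this for yourself, CPython ships a dis module that shows the bytecode the interpreter walks through; a small sketch (the function here is just an example):
import dis

def check_and_clear(x):
    if x != 0:
        x = 0
    return x

# Even this tiny function expands into several bytecode instructions
# (COMPARE_OP, a conditional jump, STORE_FAST, ...), and each of those is
# in turn many machine instructions inside the interpreter loop.
dis.dis(check_and_clear)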
Sidenote About Optimizations
Different Python implementations go to different lengths to get Python operations down to as few CPU operations as possible. For example, an optimizing JIT (Just In Time) compiler might notice that, inside some loop over an array, all elements of the array are native integers, and actually reduce the if x != 0 and x = 0 parts to their respective minimal machine instructions - but that only happens in very specific circumstances, when the optimizing logic can prove that it can safely bypass a lot of the behavior it would normally need to perform.
The biggest thing here is this: a high-level language like Python is so removed from the hardware that "simple" operations are often complex "under the covers".
What You Asked vs. What I Think You Wanted To Ask
Correct me if I'm wrong, but I suspect the use-case you actually had in mind was this:
if x != 0:
    # some code
    x = 0
vs. this:
if x != 0:
    # some code
x = 0
In that case, the first option is superior to the second, because you are already paying the cost of if x != 0 anyway.
Last Point of Emphasis
The hardest breakthrough for me was moving away from trying to reason about individual instructions in my head, and instead switching into looking at how things work and measuring real systems.
Looking at how things work will teach you how to optimize, but measuring will show you where to optimize.
This question is great for exploring the former, but for your stated motivation of reducing power consumption on your laptop, you would benefit more from the latter.
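On the measuring point: for the time side of the question (power is much harder to measure from inside Python), a rough sketch with timeit is about as far as it is worth going - the exact numbers vary by machine and Python version:
import timeit

# Compare the guarded assignment against the unconditional one.
# Both statements start by defining b so the comparison is apples-to-apples.
guarded = timeit.timeit("b = 0\nif b != 0:\n    b = 0", number=1_000_000)
plain = timeit.timeit("b = 0\nb = 0", number=1_000_000)
print(f"guarded: {guarded:.3f}s  plain: {plain:.3f}s")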

Julia for image processing and speech recognition

I recently stumbled upon the Julia language and I was surprised to see its claims. It claims to be many times faster than languages like Python, which I'm currently using for machine learning algorithms for speech recognition.
They claim, for example, that the Fibonacci sequence runs about 30 times faster than in Python.
I don't understand what it is that makes it so much faster. However, what I really want to know is whether someone has actually verified this. If that is the case, I would move from Python to Julia for the speech recognizer I am building, as my response time is already hitting its limit and I cannot afford to make it any slower.
Also, are there any projects on GitHub (or anywhere) where I can find Julia projects which do heavy number crunching like image processing, speech recognition, etc.?
I looked at some other links, such as this one, but could not decide effectively. I can run the programs and verify their claims, but if someone has already done so, it would be helpful to me.
It's not that surprising; Python made a number of decisions that make it a fairly slow language, and Guido van Rossum, its creator, says "It is usually much more effective to take that one piece and replace that one function or module with a little bit of code you wrote in C or C++ rather than rewriting your entire system in a faster language, because for most of what you're doing, the speed of the language is irrelevant." As a general rule, any language that is concerned with speed will be faster than Python: C, C++, Ada, Java, Scala, Clojure, and a number of other languages all show up as more than an order of magnitude faster than Python in typical benchmark implementations. Unless the authors of Julia have completely failed in their attempts to make a faster language, it will be faster than Python.
I don't understand what it is that makes it so much faster.
People often assume Julia is fast due to some sort of superior JIT compiler and that Python could be optimized in the same way since they are both dynamic languages.
However it is actually all due to clever language design choices. All functions in Julia are multiple dispatch unlike Python functions. That means at runtime Julia can pick one particular code implementation for every possible combination of arguments. Since types are immutable in Julia the compiled machine code handling a particular case of argument combinations can be cached.
In Julia, all libraries have also been designed with type stability in mind, meaning that for a given set of argument types a function will always return objects of the same type. E.g. if you do a mathematical operation on two integers and it returns an integer, then most of the time the JIT compiler will know that it ALWAYS returns an integer and thus doesn't have to create lots of code to check for every possible case.
If you are uncertain about the performance of Julia, I suggest you actually look at some small code examples and play with the code generator. With @code_native, you can see the assembly code Julia generates for a function. You will find it is possible to emit basically the same code as a C compiler would. Quite dynamic and complex-looking Julia code surprisingly often turns into just a few assembly instructions.
The thing with Julia is that you can't just look at the performance of some random existing library doing what you want. It could be a quick port or not well optimized. It is easy to write slow Julia code if you have no idea what you are doing. The key thing is that Julia gives you quite good tools for writing highly optimized code.
If you need to, you can typically manage to get performance close to C. But of course you have to spend some time optimizing your code. Julia has functions which allow you to analyze your code and tell you where you did something causing potential performance problems.
Fortunately Julia performance benefits from lots of small functions, so you can typically analyze performance issues in very small isolated cases.
Regarding packages, there is a link to all registered "external packages" from the homepage. You may find some of the things you want there.

Performance differences between Python and C

Working on different projects I have the choice of selecting different programming languages, as long as the task is done.
I was wondering what the real difference is, in terms of performance, between writing a program in Python, versus doing it in C.
The tasks to be done are pretty varied, e.g. sorting textfiles, disk access, network access, textfile parsing.
Is there really a noticeable difference between sorting a textfile using the same algorithm in C versus Python, for example?
And in your experience, given the power of current CPUs (i7), is it really a noticeable difference? (Consider that it's a program that doesn't bring the system to its knees.)
Use python until you have a performance problem. If you ever have one figure out what the problem is (often it isn't what you would have guessed up front). Then solve that specific performance problem which will likely be an algorithm or data structure change. In the rare case that your problem really needs C then you can write just that portion in C and use it from your python code.
C will absolutely crush Python in almost any performance category, but C is far more difficult to write and maintain and high performance isn't always worth the trade off of increased time and difficulty in development.
You say you're doing things like text file processing, but what you omit is how much text file processing you're doing. If you're processing 10 million files an hour, you might benefit from writing it in C. But if you're processing 100 files an hour, why not use python? Do you really need to be able to process a text file in 10ms vs 50ms? If you're planning for the future, ask yourself, "Is this something I can just throw more hardware at later?"
Writing solid code in C is hard. Be sure you can justify that investment in effort.
In general, IO-bound work will depend more on the algorithm than the language. In this case I would go with Python because it will have first-class strings and lots of easy-to-use libraries for manipulating files, etc.
Is there really a noticeable difference between sorting a textfile using the same algorithm in C versus Python, for example?
Yes.
The noticeable differences are these
There's much less Python code.
The Python code is much easier to read.
Python supports really nice unit testing, so the Python code tends to be higher quality.
You can write the Python code more quickly, since there are fewer quirky language features. Having no preprocessor, for example, really saves a lot of hacking around. Super-experienced C programmers hardly notice it, but all that #include sandwich stuff and making the .h files correct is remarkably time-consuming.
Python can be easier to package and deploy, since you don't need a big fancy make script to do a build.
The first rule of computer performance questions: Your mileage will vary. If small performance differences are important to you, the only way you will get valid information is to test with your configuration, your data, and your benchmark. "Small" here is, say, a factor of two or so.
The second rule of computer performance questions: For most applications, performance doesn't matter -- the easiest way to write the app gives adequate performance, even when the problem scales. If that is the case (and it is usually the case) don't worry about performance.
That said:
C compiles down to a machine executable and thus has the potential to execute at least as fast as any other language
Python is generally interpreted and thus may take more CPU than a compiled language
Very few applications are "CPU bound." I/O (to disk, display, or memory) is not greatly affected by compiled vs interpreted considerations and frequently is a major part of computer time spent on an application
Python works at a higher level of abstraction than C, so your development and debugging time may be shorter
My advice: Develop in the language you find the easiest with which to work. Get your program working, then check for adequate performance. If, as usual, performance is adequate, you're done. If not, profile your specific app to find out what is taking longer than expected or tolerable. See if and how you can fix that part of the app, and repeat as necessary.
Yes, sometimes you might need to abandon work and start over to get the performance you need. But having a working (albeit slow) version of the app will be a big help in making progress. When you do reach and conquer that performance goal you'll be answering performance questions in SO rather than asking them.
If your text files that you are sorting and parsing are large, use C. If they aren't, it doesn't matter. You can write poor code in any language though. I have seen simple code in C for calculating areas of triangles run 10x slower than other C code, because of poor memory management, use of structures, pointers, etc.
Your I/O algorithm should be independent of your compute algorithm. If this is the case, then using C for the compute algorithm can be much faster.
(Assumption - The question implies that the author is familiar with C but not Python, therefore I will base my answer with that in mind.)
I was wondering what the real
difference is, in terms of
performance, between writing a program
in Python, versus doing it in C.
C will almost certainly be faster unless it is implemented poorly, but the real questions are:
What are the development implications
(development time, maintenance, etc.)
for either implementation?
Is the performance benefit significant?
Learning Python can take some time, but there are Python modules that can greatly speed development time. For example, the csv module in Python makes reading and writing csv easy. Also, Python strings, arrays, maps, and other objects make it more flexible than plain C and more elegant, in my opinion, than the equivalent C++. Some things like network access may be much quicker to develop in Python as well.
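As a small illustration of the csv point (the file name here is made up for the example):
import csv

rows = [("alice", 10), ("bob", 7)]

with open("scores.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)         # write the rows out

with open("scores.csv", newline="") as f:
    for name, score in csv.reader(f):     # read them back, one row at a time
        print(name, int(score))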
However, it may take time to learn how to program Python well enough to accomplish your task. Since you are concerned with performance, I suggest trying a simple task, such as sorting a text file, in both C and Python. That will give you a better baseline on both languages in terms of performance, development time, and possibly maintenance.
It really depends a lot on what you're doing and whether the algorithm in question is available in Python via a natively compiled library. If it is, then I believe you'll be looking at performance numbers close enough that Python is most likely your answer -- assuming it's your preferred language. If you must implement the algorithm yourself, depending on the amount of logic required and the size of your data set, C/C++ may be the better option. It's hard to provide a less nebulous answer without more information.
To get an idea of the raw difference in speed, check out the Computer Languages Benchmark Game.
Then you have to decide whether that difference matters to you.
Personally, I ended up deciding that it did, but most of the time instead of using C, I ended up using other higher-level languages. Personally I mostly use Scala, but Haskell and C# and Java each have their advantages also.
Across all programs, it isn't really possible to say whether things will be quicker or slower on average in Python or C.
For the programs that I've implemented in both languages, using similar algorithms, I've seen no improvement (and sometimes a performance degradation) for string- and IO-heavy code, when reimplementing python code in C. The execution time is dominated by allocation and manipulation of strings (which functionality python implements very efficiently) and waiting for IO operations (which incurs the same overhead in either language), so the extra overhead of python makes very little difference.
But for programs that do even simple operations on image files, say (images being large enough for processing time to be noticeable compared to IO), C is enormously quicker. For this sort of task the bulk of the time running the python code is spent doing Python Stuff, and this dwarfs the time spent on the underlying operations (multiply, add, compare, etc.). When reimplemented as C, the bureaucracy goes away, the computer spends its time doing real honest work, and for that reason the thing runs much quicker.
It's not uncommon for the Python code to run in (say) 5 seconds where the C code runs in (say) 0.05. So that's a 100x difference -- but in absolute terms, this is not so big a deal. It takes so much less time to write Python code than it does to write C code that your program would have to be run some huge number of times to turn a time profit. I often reimplement in C, for various reasons, but if you don't have this requirement then it's probably not worth bothering. You won't get that part of your life back, and next year computers will be quicker.
Actually, you can solve most of your tasks efficiently with Python.
You just need to know which tools to use. For text processing there is a brilliant package from the eGenix guys - http://www.egenix.com/products/python/mxBase/mxTextTools/. I was able to create very efficient parsers with it in Python, since all the heavy lifting is done by native code.
Same approach goes for any other problem - if you have performance problems, get a C/C++ library with Python interface which implements whatever bottleneck you got efficiently.
C is definitely faster than Python because Python is written in C.
C is a middle-level language and hence faster, but there is not a great difference between C and Python in the execution time it takes.
However, it is really much easier to write code in Python than in C, and it takes much less time to write code and to learn Python than C.
Because it's easy to write, it's also easy to test.
You will find C is much slower. Your developers will have to keep track of memory allocation, and use libraries (such as glib) to handle simple things such as dictionaries, or lists, which python has built-in.
Moreover, when an error occurs, your C program will typically just crash, which means you'll need to get the error to happen in a debugger. Python would give you a stack trace (typically).
Your code will be bigger, which means it will contain more bugs. So not only will it take longer to write, it will take longer to debug, and will ship with more bugs. This means that customers will notice the bugs more often.
So your developers will spend longer fixing old bugs and thus new features will get done more slowly.
In the meantime, your competitors will be using a sensible programming language and their products will be increasing in features and usability rapidly; yours will look bad by comparison. Your customers will leave and you'll go out of business.
The excess time to write the code in C compared to Python will be exponentially greater than the difference between C and Python execution speed.

What is MATLAB good for? Why is it so used by universities? When is it better than Python? [closed]

I've been recently asked to learn some MATLAB basics for a class.
What makes it so cool for researchers and people who work in universities?
I saw it's cool to work with matrices and plotting things... (things that can be done easily in Python using some libraries).
Writing a function or parsing a file is just painful. I'm still at the start, what am I missing?
In the "real" world, what should I think to use it for? When should it can do better than Python? For better I mean: easy way to write something performing.
UPDATE 1: One of the things I'd like to know the most is "Am I missing something?" :D
UPDATE 2: Thank you for your answers. My question is not about whether or not to buy MATLAB. The university can give me a copy of an old version of MATLAB (MATLAB 5, I guess) for free, without breaking the license. I'm interested in its capabilities, whether it deserves a deeper study (I won't need anything more than basic MATLAB in order to pass the exam :P ), and whether it will really be better than Python for a specific kind of task in the real world.
Adam is only partially right. Many, if not most, mathematicians will never touch it. If there is a computer tool used at all, it's going to be something like Mathematica or Maple. Engineering departments, on the other hand, often rely on it and there are definitely useful things for some applied mathematicians. It's also used heavily in industry in some areas.
Something you have to realize about MATLAB is that it started off as a wrapper on Fortran libraries for linear algebra. For a long time, it had an attitude that "all the world is an array of doubles (floats)". As a language, it has grown very organically, and there are some flaws that are very much baked in, if you look at it just as a programming language.
However, if you look at it as an environment for doing certain types of research in, it has some real strengths. It's about as good as it gets for doing floating point linear algebra. The notation is simple and powerful, the implementation fast and trusted. It is very good at generating plots and other interactive tasks. There are a large number of `toolboxes' with good code for particular tasks, that are affordable. There is a large community of users that share numerical codes (Python + NumPy has nothing in the same league, at least yet)
Python, warts and all, is a much better programming language (as are many others). However, it's a decade or so behind in terms of the tools.
The key point is that the majority of people who use MATLAB are not programmers really, and don't want to be.
It's a lousy choice for a general programming language; it's quirky, slow for many tasks (you need to vectorize things to get efficient codes), and not easy to integrate with the outside world. On the other hand, for the things it is good at, it is very very good. Very few things compare. There's a company with reasonable support and who knows how many man-years put into it. This can matter in industry.
Strictly looking at your Python vs. MATLAB comparison, they are mostly different tools for different jobs. In the areas where they do overlap a bit, it's hard to say what the better route to go is (depends a lot on what you're trying to do). But mostly Python isn't all that good at MATLAB's core strengths, and vice versa.
Most of the answers do not get the point.
There is ONE reason matlab is so good and so widely used:
EXTREMELY FAST CODING
I am a computer vision PhD student and have been using Matlab for 4 years; before my PhD I was using different languages including C++, Java, PHP, Python... Most computer vision researchers use Matlab exclusively.
1) Researchers need fast prototyping
In a research environment, we often (hopefully) have new ideas, and we want to test them really quickly to see if it's worth keeping on in that direction. And most often only a tiny sub-part of what we code will be useful.
Matlab is often slower at execution time, but we don't care much. Because we don't know in advance which method is going to be successful, we have to try many things, so our bottleneck is programming time, because our code will most often run only a few times to get the results to publish, and that's all.
So let's see how matlab can help.
2) Everything I need is already there
Matlab really has a lot of the functions that I need, so I don't have to reinvent them all the time:
change the index of a matrix to a 2D coordinate: ind2sub
extract all patches of an image: im2col
compute a histogram of an image: hist(Im(:))
find the unique elements in a list: unique(list)
add a vector to all vectors of a matrix: bsxfun(@plus,M,V)
convolution on n-dimensional arrays: convn(A)
calculate the computation time of a sub-part of the code: tic; %%code; toc
graphical interface to crop an image: imcrop(im)
The list could be very long...
And they are very easy to find by using the help.
The closest to that is Python... but it's just a pain in Python: I have to go to Google each time to look for the name of the function I need, then I need to add packages, the packages are not compatible with one another, the matrix format changes, the convolution function only handles doubles but does not raise an error when I give it chars, it just gives a wrong output... no.
3) IDE
An example: I launch a script. It produces an error because of a matrix. I can still execute code with the command line. I visualize it doing: imagesc(matrix). I see that the last line of the matrix is weird. I fix the bug. All variables are still set. I select the remaining of the code, press F9 to execute the selection, and everything goes on. Debuging becomes fast, thanks to that.
Matlab underlines some of my errors before execution, so I can quickly see the problems. It also proposes ways to make my code faster.
There is an awesome profiler included in the IDE. KCachegrind is such a pain to use compared to that.
Python's IDEs are awful. Python without IPython is not usable, and I never manage to debug using IPython.
Plus autocompletion, help for function arguments, ...
4) Concise code
To normalize all the columns of a matrix (which I need all the time), I do:
bsxfun(@times, A, 1./sqrt(sum(A.^2)))
To remove from a matrix all columns with a small sum:
A(:, sum(A) < e) = []
To do the computation on the GPU:
gpuX = gpuArray(X);
%%% code normally and everything is done on the GPU
To parallelize my code:
parfor n = 1:100
    %%% code normally and everything is multi-threaded
end
What language can beat that?
And of course, I rarely need to write loops; everything is included in functions, which makes the code much easier to read, with no headaches over indices. So I can focus on what I want to program, not how to program it.
5) Plotting tools
Matlab is famous for its plotting tools. They are very helpful.
Python's plotting tools have far fewer features. But there is one thing that is super annoying. You can plot figures only once per script??? If I have a long script I cannot display stuff at each step ---> useless.
6) Documentation
Everything is very quick to access, everything is crystal clear, function names are well chosen.
With Python, I always need to Google stuff, look in forums or on Stack Overflow... a complete time hog.
PS: Finally, what I hate about Matlab: its price.
I've been using matlab for many years in my research. It's great for linear algebra and has a large set of well-written toolboxes. The most recent versions are starting to push it into being closer to a general-purpose language (better optimizers, a much better object model, richer scoping rules, etc.).
This past summer, I had a job where I used Python + numpy instead of Matlab. I enjoyed the change of pace. It's a "real" language (and all that entails), and it has some great numeric features like broadcasting arrays. I also really like the ipython environment.
Here are some things that I prefer about Matlab:
consistency: MathWorks has spent a lot of effort making the toolboxes look and work like each other. They haven't done a perfect job, but it's one of the best I've seen for a codebase that's decades old.
documentation: I find it very frustrating to figure out some things in numpy and/or python because the documentation quality is spotty: some things are documented very well, some not at all. It's often most frustrating when I see things that appear to mimic Matlab, but don't quite work the same. Being able to grab the source is invaluable (to be fair, most of the Matlab toolboxes ship with source too)
compactness: for what I do, Matlab's syntax is often more compact (but not always)
momentum: I have too much Matlab code to change now
If I didn't have such a large existing codebase, I'd seriously consider switching to Python + numpy.
Hold everything. When's the last time you programmed your calculator to play Tetris? Did you actually think you could write anything you want in those 128k of RAM? Likely not. MATLAB is not for programming unless you're dealing with huge matrices. It's the graphing calculator you whip out when you've got megabytes to gigabytes of data to crunch and/or plot. Learn just the basic stuff, but also don't kill yourself trying to make Python be a graphing calculator.
You'll quickly get a feel for when you want to crunch, plot or explore in MATLAB and when you want to have all that Python offers. Lots of engineers turn to pre and post processing in Python or Perl. Occasionally even just calling out to MATLAB for the hard bits.
They are such completely different tools that you should learn their basic strengths first without trying to replace one with the other. Granted for saving money I'd either use Octave or skimp on ease and learn to work with sparse matrices in Perl or Python.
MATLAB is great for doing array manipulation, doing specialized math functions, and for creating nice plots quick.
I'd probably only use it for large programs if I could use a lot of array/matrix manipulation.
You don't have to worry about the IDE as much as in more formal packages, so it's easier for students without a lot of programming experience to pick up.
MATLAB is a popular and widely adopted piece of sophisticated software. It'd be a mistake to think of it as merely math software, since it has a wide range of "toolboxes". I recently used Matplotlib to plot some data from a database and it did the job without needing all the bells and whistles of MATLAB. However, it may not be fair to compare Python and MATLAB in every situation. As with everything else, the decision depends on what you need to do.
I used MATLAB in undergrad for control systems design and simulation, and also for image processing in grad school. For these fields MATLAB makes the most sense because of the powerful control and image processing toolboxes. As everyone mentioned, array operations, which are used in every MATLAB script you'd need to write, are very easy with MATLAB. Another nice thing about MATLAB is that it's very easy and fast to prototype and try out ideas using the built-in toolbox functions. For instance, it takes no effort to import an image and compute its histogram or do some simple processing on it. One disadvantage of MATLAB could be its speed, because of its interpreted nature. However, if one really needs speed, then one can choose to implement the tested logic in C/C++, etc.
For further comparison with Python, I can say that MATLAB provides a full package for you to do your work without the need to look around for external libraries and implement extra functions.
One last point about MATLAB which I see is not mentioned in the answers here is that it has a very powerful visual modeling/simulation environment called Simulink. It's easier to design and simulate larger systems with Simulink. Finally, again, it all depends on the problem you need to solve. If your problem domain can make use of one of MATLAB's toolboxes and you have access to MATLAB, then you can be sure that you'll have the right tool to solve it.
MATLAB, as mentioned by others, is great at matrix manipulation, and was originally built as an extension of the well-known BLAS and LAPACK libraries used for linear algebra. It interfaces well with other languages like Java, and is well favored by engineering and scientific companies for its well developed and documented libraries. From what I know of Python and NumPy, while they share many of the fundamental capabilities of MATLAB, they don't have the full breadth and depth of capabilities with their libraries.
Personally, I use MATLAB because that's what I learned in my internship, that's what I used in grad school, and that's what I used in my first job. I don't have anything against Python (or any other language). It's just what I'm used too.
Also, in addition to Scilab mentioned by @Jim C, there is another free alternative from GNU called Octave.
Personally, I tend to think of Matlab as an interactive matrix calculator and plotting tool with a few scripting capabilities, rather than as a full-fledged programming language like Python or C. The reason for its success is that matrix stuff and plotting work out of the box, and you can do a few very specific things in it with virtually no actual programming knowledge. The language is, as you point out, extremely frustrating to use for more general-purpose tasks, such as even the simplest string processing. Its syntax is quirky, and it wasn't created with the abstractions necessary for projects of more than 100 lines or so in mind.
I think the reason why people try to use Matlab as a serious programming language is that most engineers (there are exceptions; my degree is in biomedical engineering and I like programming) are horrible programmers and hate to program. They're taught Matlab in college mostly for the matrix math, and they learn some rudimentary programming as part of learning Matlab, and just assume that Matlab is good enough. I can't think of anyone I know who knows any language besides Matlab, but still uses Matlab for anything other than a few pure number crunching applications.
The most likely reason that it's used so much in universities is that the mathematics faculty are used to it, understand it, and know how to incorporate it into their curriculum.
Between matplotlib+pylab and NumPy, I don't think there's much actual difference between Matlab and Python other than cultural inertia, as suggested by @Adam Bellaire.
I believe you have a very good point, and it's one that has been raised in the company where I work. The company is limited in its ability to apply Matlab because of the licensing costs involved. One developer proved that Python was a very suitable replacement, but it fell on ignorant ears because, to the owners of those ears:
No-one in the company knew Python although many of us wanted to use it.
MatLab has a name, a company, and task force behind it to solve any problems.
There were some (but not a lot) of legacy MatLab projects that would need to be re-written.
If it's worth £10,000 (??) it's gotta be worth it!!
I'm with you here. Python is a very good replacement for MatLab.
I should point out that I've been told the company uses maybe 5% to 10% of MatLab's capabilities, and that is the basis for my agreement with the original poster.
MATLAB is a fantastic tool for
prototyping
engineering simulation and
fast visualization of data
You can really play with, visualize and test your ideas on a data set very effectively. It should not be regarded as an alternative to other software languages used for product development. I highly recommend it for the above tasks, though it is expensive - free alternatives like Octave and Python are catching up.
Seems to be pure inertia. Where it is in use, everyone is too busy to learn IDL or numpy in sufficient detail to switch, and they don't want to rewrite good working programs. Luckily that's not strictly true, but true enough in enough places that Matlab will be around a long time. Like Fortran (in active use where I work!).
The main reason it is useful in industry is the plug-ins built on top of the core functionality. Almost all active Matlab development for the last few years has focused on these.
Unfortunately, you won't have much opportunity to use these in an academic environment.
I know this question is old, and therefore may no longer be watched, but I felt it was necessary to comment. As an aerospace engineer at Georgia Tech, I can say, with no qualms, that MATLAB is awesome. You can have it quickly interface with your Excel spreadsheets to pull in data about how high and fast rockets are flying, how the wind affects those same rockets, and how different engines matter. Beyond rocketry, similar concepts come into play for cars, trucks, aircraft, spacecraft, and even athletics. You can pull in large amounts of data, manipulate all of it, and make sure your results are as they should be. In the event something is off, you can add a breakpoint where an error occurs and debug your program without having to recompile every time you want to run it. Is it slower than some other programs? Well, technically. I'm sure that if you did the number crunching it's great for on an NVIDIA graphics processor instead, it would probably be faster, but that requires a lot more effort and harder debugging.
As a general programming language, MATLAB is weak. It's not meant to compete with Python, Java, ActionScript, C/C++ or any other general-purpose language. It's meant for the engineering and mathematics niche the name implies, and it serves that niche fantastically.
MATLAB WAS a wrapper around commonly available libraries. And in many cases it still is. When you get to larger datasets, it has many additional optimizations, including examining and special-casing common problems (reducing to sparse matrices where useful, for example), and handling edge cases. Often, you can submit a problem in a standard form to a general function, and it will determine the best underlying algorithm to use based on your data. For small N, all algorithms are fast, but MATLAB makes determining the optimal algorithm a non-issue.
This is written by someone who hates MATLAB, and has tried to replace it due to integration issues. From your question, you mention getting MATLAB 5 and using it for a course. At that level, you might want to look at Octave, an open source implementation with the same syntax. I'm guessing it is up to MATLAB 5 levels by now (I only play around with it). That should allow you to "pass your exam". For bare MATLAB functionality it seems to be close. It is lacking in the toolbox support (which, again, mostly serves to reformulate the function calls to forms familiar to engineers in the field and selects the right underlying algorithm to use).
One reason MATLAB is popular with universities is the same reason a lot of things are popular with universities: there's a lot of professors familiar with it, and it's fairly robust.
I've spoken to a lot of folks who are especially interested in MATLAB's nascent ability to tap into the GPU instead of working serially. Having used Python in grad school, I kind of wish I had the licks to work with MATLAB in that case. It sure would make vector space calculations a breeze.
It's been some time since I've used Matlab, but from memory it does provide (albeit with extra plugins) the ability to generate source to allow you to realise your algorithm on a DSP.
Since Python is a general-purpose programming language, there is no reason why you couldn't do everything in Python that you can do in Matlab. However, Matlab does provide a number of other tools - e.g. a very broad array of DSP features, and a broad array of S- and Z-domain features.
All of these could be hand coded in python (since it's a general purpose language), but if all you're after is the results perhaps spending the money on Matlab is the cheaper option?
These features have also been tuned for performance. E.g. the documentation for NumPy specifies that its Fourier transform is optimised for power-of-2 data sets. As I understand it, Matlab has been written to use the most efficient Fourier transform for the size of the data set, not just powers of 2.
edit: Oh, and in Matlab you can produce some sensational looking plots very easily, which is important when you're presenting your data. Again, certainly not impossible using other tools.
I think you answered your own question when you noted that Matlab is "cool to work with matrices and plotting things". Any application that requires a lot of matrix maths and visualisation will probably be easiest to do in Matlab.
That said, Matlab's syntax feels awkward and shows the language's age. In contrast, Python is a much nicer general purpose programming language and, with the right libraries can do much of what Matlab does. However, Matlab is always going to have a more concise syntax than Python for vector and matrix manipulation.
If much of your programming involves these sorts of manipulations, such as in signal processing and some statistical techniques, then Matlab will be a better choice.
First Mover Advantage. Matlab has been around since the late 1970s. Python came along more recently, and the libraries that make it suitable for Matlab type tasks came along even more recently. People are used to Matlab, so they use it.
Matlab is good at number crunching, and also at matrix manipulation. It has many helpful built-in libraries (depending on the version). I think it is easier to use than Python if you are going to be calculating equations.
