How Can I Approach Machine Learning in Python, Simply? - python

I am a web developer and tools creator professionally. The way that I learn things is by making the smallest possible product that is still functional and useful, and then adding complexity / refactoring as needed. I am trying to learn data science and pick up machine learning skills. When I look at the "small" projects in Machine Learning they assume a knowledge of Numpy, Pandas, and Seaborn in addition to being well versed in Python.
How can I make the machine learning equivalent of an index.html with Hello World inside of an <h1> tag?
If this is not possible, what sorts of projects can I do to incrementally "master" the underlying libraries. i.e. What is a "useful" project that one can do with only Numpy, only Numpy + Pandas, only Numpy + Pandas + Seaborn etc?
Apologies for open ended-ness. Not sure where else to post a question like this.

In response to the latter question
"What is a "useful" project that one can do with only Numpy, only Numpy + Pandas, only Numpy + Pandas + Seaborn etc?"
Numpy Projects in Isolation:
http://www.labri.fr/perso/nrougier/teaching/numpy.100/index.html
Projects here start very minimally and move into more and more complex problem spaces using on numpy as the only tool.
Numpy and Pandas Projects in Isolation:
https://www.hackerearth.com/practice/machine-learning/data-manipulation-visualisation-r-python/tutorial-data-manipulation-numpy-pandas-python/tutorial/
This link contains a short introduction to numpy and then spends the bulk of its resources discussing data frames.
Both links have the user solve increasingly complex data frame manipulations and were awesome resources to me and gave me lots of ideas on way I could expand and add additional complexity.

Related

What is the difference between Python’s Pandas, R’s ggplot2 and Tableau? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 1 year ago.
Improve this question
I'm a newbie to the data world. Any help at a high level for the above would be greatly appreciated.
Thanks
You can make multiple differences. On the first level: What is the difference between Python, R and Tableau? These are basically all three entirely different tools.
Python is an open-source general purpose programming language. In other words, Python is suitable for a lot of tasks. You can program desktop applications, websites/servers but is also heavily represented in datascience, computational/natural sciences,...
R on the other hand is also considered a programming language, but specifically for datascience and statistics. This means that, out-of-the-box, R will have a lot of statistical tools and functionality whereas Python requires some third-party installation.
Tableau is not a programming language, but rather a visualisation software package. You can make so-called dashboards with it, that is aimed to summarizing your complex data to human-interpretable information. The main advantage is that programming knowledge is not really necessary.
Now pandas is a third-party, open-source python library for data science and is most known for introducing dataframes. Dataframes are popular because they are an intuitive way to work with your data. You can read and write multiple file formats as well as perform operations on your data frames, calculate some statistics, make visualizations,... Note that R does not need a similar library since dataframes are already included in the standard R package.
ggplot2 is an open-source R library that is specifically aimed at plotting data. The equivalent library in Python would be matplotlib, which pandas uses as well to generate its plots.
Which tool is best for you depends on what you would like to do. If you aren't interested in learning how to program, go for Tableau. If you do want to learn how to code, you could go with Python and/or R. If you are only interested in statistics and datascience, you can go with R. However, if you are interested in other parts as well, such as machine learning, web development, ... you're better of with Python.
ggplot2 is a full-featured R library for making publication-quality statistical graphics. It has a very expressive style that makes it useful for exploratory analysis as well as final graphics for publication.
Pandas is a python library that brings the R "data frame" to python. It is not a plotting library but it interfaces with the matplotlib library to make publication quality graphics.
Tableau is a standalone application for building dashboards and interactive graphics that are linked to databases. It has a mostly graphical interface with less coding than R/Python. The dashboards made in Tableau can also be done in R/Python, but by involving other libraries as well.

MatLab user thinking of learning Python [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 8 years ago.
Improve this question
I’m considering learning Python with the idea of letting go of MatLab, although I really like MatLab. However, I’m concerned that getting all of the moving and independent pieces to fit together may be a challenge and one that may not be worth it in the end. I’ve also thought about getting into Visual Basic or Visual C++. In the end, I keep coming back to the ease of MatLab. Any thoughts or comments regarding the difficulty of getting going in Python? Is it worth it?
A good place to start is this page: SciPy getting started, which gives an overview of the scientific toolstack that you might be able to use to move towards MatLab from Python: notably the libraries numpy, scipy, matplotlib, and the interactive working environment IPython. In particular, numpy and matplotlib are designed to be very similar to working with MatLab.
NumPy‘s array type augments the Python language with an efficient data structure useful for numerical work, e.g., manipulating matrices. NumPy also provides basic numerical routines, such as tools for finding eigenvectors.
For example in matlab you might write
eye(3)-diag([1 1],1)
and get back
1 -1 0
0 1 -1
0 0 1
In Python/numpy you would write
import numpy as np
np.eye(3)-np.diag([1,1],1)
and get back
array([[ 1., -1., 0.],
[ 0., 1., -1.],
[ 0., 0., 1.]])
With matplotlib
you have full control of line styles, font properties, axes properties, etc, via an object oriented interface or via a set of functions familiar to MATLAB users.
In MatLab for plotting you might write
x=linspace(-pi, pi, 100);
plot(x,sin(x))
In Python/numpy/matplotlib you would write
x=np.linspace(-np.pi, np.pi, 100)
import matplotlib.pyplot as plt
plt.plot(x,np.sin(x))
There is plenty on the web designed for people making the transition, see, e.g. NumPy for Matlab Users.
MATLAB® and NumPy/SciPy have a lot in common. But there are many differences. NumPy and SciPy were created to do numerical and scientific computing in the most natural way with Python, not to be MATLAB® clones. This page is intended to be a place to collect wisdom about the differences, mostly for the purpose of helping proficient MATLAB® users become proficient NumPy and SciPy users. NumPyProConPage is another page for curious people who are thinking of adopting Python with NumPy and SciPy instead of MATLAB® and want to see a list of pros and cons.
You might also like to consider pylab, which brings together numpy and matplotlib into a single namespace, so you don't have to bother with the np and plt prefixes I used above. See, e.g., wikipedia.
There are tags on this site that are worth looking at: e.g. numpy, scipy, matplotlib. There is also a question on Python at stats.se which you might find relevant. If you are interested in statistics, or in reading, writing and manipulating tabular data, you will be interested in pandas, Python's answer to R's data frame.
As to C++, it's a great language, but not in the same category as Python. This is not the right place to discuss their pros and cons, but in short, C++ is much closer to the machine than Python and if you spend the time you can write highly optimized code. In Python you can get code working very quickly, glueing together independent pieces and easily reading and writing data from wherever you want to, but Python code can sometimes run slowly (it's like Matlab -- if you vectorize in numpy it's fast, otherwise it's interpreted and slow). You might occasionally want to speed up slow Python code using the ability for Python to call functions defined in C, see, e.g., this question. (I'll leave Visual Basic to one side as it doesn't seem relevant.)
Finally, as noted in the comments, answering any specifics would involve knowing exactly what your requirements were, not just what you want to do, but who you want to do it with, and how much time and money you have available to invest.
Yes Python is worth learning. It was my first major language which I learned over ten years ago. It's used heavily in Linux systems especially when the systems boot up. Also the libraries for the language are very well developed. If you want to design a game PyGame has been around a long time and makes the process pretty easy. If you want to program for networking Twisted is an excellent library they have. And their web framework Django is a beauty.
I found Python to be a very English like language to get into.
If you'd like to you might take a peak at Ruby as well. Ruby allows a lot of different styles of programming. So you can program just like you did in other languages in Ruby (I strictly mean in style). Also Ruby has the most "free" resources for learning and getting into the language online that I've ever seen.
I love both languages. But my answer to you is "Yes!" Python is a great first language.
It all depends on what you want to do, and what method's you prefer.
You can find many mathematical plotting libraries here: https://wiki.python.org/moin/NumericAndScientific/Plotting Many of which could be an excellent alternative to Matlab.
Python is great for Firmware programming like with Arduino's. C++ is great and very powerful for Programming software & applications. If you want to program hardware, go with python. If you want to program software, go with C++. Im learning C++ and its great.

Object Tracking: MATLAB vs. Python Numpy

I will soon be starting a final year Engineering project, consisting of the real-time tracking of objects moving on a 2D-surface. The objects will be registered by my algorithm using feature extraction.
I am trying to do some research to decide whether I should use MATLAB or use Python Numpy (Numerical Python). Some of the factors I am taking into account:
1.) Experience
I have reasonable experience in both, but perhaps more experience in image processing using Numpy. However, I have always found MATLAB to be very intuitive and easy to pick up.
2.) Real-Time abilities
It is very important that my choice be able to support the real-time acquisition of video data from an external camera. I found this link for MATLAB showing how to do it. I am sure that the same would be possible for Python, perhaps using the OpenCV library?
3.) Performance
I have heard, although never used, that MATLAB can easily split independent calculations across multiple cores. I should think that this would be very useful, and I am not sure whether the same is equally simple for Numpy?
4.) Price
I know that there is a cost associated with MATLAB, but I will be working at a university and thus will have access to full MATLAB without any cost to myself, so price is not a factor.
I would greatly appreciate any input from anyone who has done something similar, and what your experience was.
Thanks!
Python (with NumPy, SciPy and MatPlotLib) is the new Matlab. So I strongly recommend Python over Matlab.
I made the change over a year ago and I am very happy with the results.
Here it is a short pro/con list for Python and Matlab
Python pros:
Object Oriented
Easy to write large and "real" programs
Open Source (so it's completely free to use)
Fast (most of the heavy computation algorithms have a python wrapper to connect with C libraries e.g. NumPy, SciPy, SciKits, libSVM, libLINEAR)
Comfortable environment, highly configurable (iPython, python module for VIM, ...)
Fast growing community of Python users. Tons of documentation and people willing to help
Python cons:
Could be a pain to install (especially some modules in OS X)
Plot manipulation is not as nice/easy as in Matlab, especially 3D plots or animations
It's still a script language, so only use it for (fast) prototyping
Python is not designed for multicore programming
Matlab pros:
Very easy to install
Powerful Toolboxes (e.g. SignalProcessing, Systems Biology)
Unified documentation, and personalized support as long as you buy the licence
Easy to have plot animations and interactive graphics (that I find really useful for running experiments)
Matlab cons:
Not free (and expensive)
Based on Java + X11, which looks extremely ugly (ok, I accept I'm completely biased here)
Difficult to write large and extensible programs
A lot of Matlab users are switching to Python :)
I would recommend python.
I switched from MATLAB -> python about 1/2 way through my phd, and do not regret it. At the most simplistic, python is a much nicer language, has real objects, etc.
If you expect to be doing any parts of your code in c/c++ I would definitely recommend python. The mex interface works, but if your build gets complicated/big it starts to be a pain and I never sorted out how to effectively debug it. I also had great difficulty with mex+allocating large blocks interacting with matlab's memory management (my inability to fix that issue is what drove me to switch).
As a side note/self promotion, I have Crocker-Grier in c++ (with swig wrappers) and pure python.
If you're experienced with both languages it's not really a decision criterion.
Matlab has problems coping with real time settings especially since most computer vision algorithms are very costly. This is the advantage of using a tried and tested library such as OpenCV where many of the algorithms you'll be using are efficiently implemented. Matlab offers the possibility of compiling code into Mex-files but that is a lot of work.
Matlab has parallel for loops parfor which makes multicore processing easy (or at least easier). But the question is if that will suffice to get real-time speeds.
No comment.
The main advantage of Matlab is that you'll obtain a running program very quickly due to its good documentation. But I found that code reusability is bad with Matlab unless you put a heavy emphasis on it.
I think the final decision has to be if you have to/can run your algorithm real-time which I doubt in Matlab, but that depends on what methods you're planning to use.
Others have made a lot of great comments (I've opined on this topic before in another answer https://stackoverflow.com/a/5065585/392949) , but I just wanted to point out that Python has a number of really excellent tools for parallel computing/splitting up work across multiple cores. Here's a short and by no means comprehensive list:
IPython Parallel toolkit: http://ipython.org/ipython-doc/dev/parallel/index.html
mpi4py: https://code.google.com/p/mpi4py
The multiprocessing module in the standard library: http://docs.python.org/library/multiprocessing.html
pyzmq: http://zeromq.github.com/pyzmq/ (what the IPython parallel toolkit is based on)
parallel python (pp): http://www.parallelpython.com/
Cython's wrapping of openmp: http://docs.cython.org/src/userguide/parallelism.html
You will also probably find cython to be much to be a vastly superior tool compared to what Matlab has to offer if you ever need to interface external C-libraries or write C-extensions, and it has excellent numpy support built right in.
There is a list with a bunch of other options here:
http://wiki.python.org/moin/ParallelProcessing

How to use python, PyLab, NumPy, etc for my Physics lab class over excel

I took a scientific programming course this semester that I really enjoyed and experimented with a lot. We used python, and all the related modules. I am taking a physics lab next semester and I just wanted to hear from some of you how python can help me in ways that excel can't or in ways that are better than excel's capabilities. I use Mathematica for symbolic stuff so I would use python for data purposes.
Off the top of my head, here are the related things I can do:
All of the things you would expect in a intro course (loops, arrays, slicing arrays, etc).
Reading data from a text file.
Plotting scatter, line, and bar graphs.
Learning how to plot linear regression but haven't totally figured it out.
I have done 7 of the problems on Project Euler (nothing to brag about, but it might give you a better idea of where I stand in skills).
Looking forward to hearing from some of you. You don't have to explain how to use the things you mention, I could look up the documentation.
The paper Python all a scientist needs comes to mind. I hope you can make the needed transformations from Biology to Physics.
Scipy will also be useful to you, as it includes many more advanced analysis tools. For example, Scipy includes a linear regression, and gets more interesting from there. Along with the other tools you mentioned, you'll probably find most of your needs covered.
Other notes on tool selection:
Mathematica is a great tool, if you can afford it. I've played around with the other options, like Sympy, and sadly, they don't come close to being as useful as Mathematica.
I can't imagine using Excel for any serious scientific work. If you're planning to continue forward using the tools that you learn in class, you might as well start with tools that offer you that potential.
Don't reject Excel outright. It's still great for doing simple data analysis and plotting. Excel also has the considerable advantage of being installed on most engineer and scientist's computers, making it a lot easier to share your work with colleagues.
That said, I do use Python when Excel just won't cut it; times when I've had to:
color the points in a scatter plot based on a third column
plot a field of vectors
extract a few values from each of several thousand data files to do statistical process control
generate dozens of scatter plots over different dimensions of a large data set to find which variables are important
solve a nonlinear equation at several intermediate points of a calculation, not just as the final result.
accept variable length input from a user to define a problem
VBA in Excel can do a lot of those things too, but it becomes painful fast in such a primitive language. I dream that Microsoft will make IronPython a first-class scripting language in the next version of Excel. Until then, you might want to try Resolver One
I can recall 2 presentations by Jan Martinek on EuroScipy 2008, he's PhD candidate and presented some fun experiments with Physics in the background. Abstracts are here and I'm sure he would't mind to share more if you contact him directly. Also, take a look at other presentation from EuroScipy, there are some more Physics-related ones.

What is MATLAB good for? Why is it so used by universities? When is it better than Python? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I've been recently asked to learn some MATLAB basics for a class.
What does make it so cool for researchers and people that works in university?
I saw it's cool to work with matrices and plotting things... (things that can be done easily in Python using some libraries).
Writing a function or parsing a file is just painful. I'm still at the start, what am I missing?
In the "real" world, what should I think to use it for? When should it can do better than Python? For better I mean: easy way to write something performing.
UPDATE 1: One of the things I'd like to know the most is "Am I missing something?" :D
UPDATE 2: Thank you for your answers. My question is not about buy or not to buy MATLAB. The university has the possibility to give me a copy of an old version of MATLAB (MATLAB 5 I guess) for free, without breaking the license. I'm interested in its capabilities and if it deserves a deeper study (I won't need anything more than basic MATLAB in oder to pass the exam :P ) it will really be better than Python for a specific kind of task in the real world.
Adam is only partially right. Many, if not most, mathematicians will never touch it. If there is a computer tool used at all, it's going to be something like Mathematica or Maple. Engineering departments, on the other hand, often rely on it and there are definitely useful things for some applied mathematicians. It's also used heavily in industry in some areas.
Something you have to realize about MATLAB is that it started off as a wrapper on Fortran libraries for linear algebra. For a long time, it had an attitude that "all the world is an array of doubles (floats)". As a language, it has grown very organically, and there are some flaws that are very much baked in, if you look at it just as a programming language.
However, if you look at it as an environment for doing certain types of research in, it has some real strengths. It's about as good as it gets for doing floating point linear algebra. The notation is simple and powerful, the implementation fast and trusted. It is very good at generating plots and other interactive tasks. There are a large number of `toolboxes' with good code for particular tasks, that are affordable. There is a large community of users that share numerical codes (Python + NumPy has nothing in the same league, at least yet)
Python, warts and all, is a much better programming language (as are many others). However, it's a decade or so behind in terms of the tools.
The key point is that the majority of people who use MATLAB are not programmers really, and don't want to be.
It's a lousy choice for a general programming language; it's quirky, slow for many tasks (you need to vectorize things to get efficient codes), and not easy to integrate with the outside world. On the other hand, for the things it is good at, it is very very good. Very few things compare. There's a company with reasonable support and who knows how many man-years put into it. This can matter in industry.
Strictly looking at your Python vs. MATLAB comparison, they are mostly different tools for different jobs. In the areas where they do overlap a bit, it's hard to say what the better route to go is (depends a lot on what you're trying to do). But mostly Python isn't all that good at MATLAB's core strengths, and vice versa.
Most of answers do not get the point.
There is ONE reason matlab is so good and so widely used:
EXTREMELY FAST CODING
I am a computer vision phD student and have been using matlab for 4 years, before my phD I was using different languages including C++, java, php, python... Most of the computer vision researchers are using exclusively matlab.
1) Researchers need fast prototyping
In research environment, we have (hopefully) often new ideas, and we want to test them really quick to see if it's worth keeping on in that direction. And most often only a tiny sub-part of what we code will be useful.
Matlab is often slower at execution time, but we don't care much. Because we don't know in advance what method is going to be successful, we have to try many things, so our bottle neck is programming time, because our code will most often run a few times to get the results to publish, and that's all.
So let's see how matlab can help.
2) Everything I need is already there
Matlab has really a lot of functions that I need, so that I don't have to reinvent them all the time:
change the index of a matrix to 2d coordinate: ind2sub extract all patches of an image: im2col; compute a histogram of an image: hist(Im(:)); find the unique elements in a list unique(list); add a vector to all vectors of a matrix bsxfun(#plus,M,V); convolution on n-dimensional arrays convn(A); calculate the computation time of a sub part of the code: tic; %%code; toc; graphical interface to crop an image: imcrop(im);
The list could be very long...
And they are very easy to find by using the help.
The closest to that is python...But It's just a pain in python, I have to go to google each time to look for the name of the function I need, and then I need to add packages, and the packages are not compatible one with another, the format of the matrix change, the convolution function only handle doubles but does not make an error when I give it char, just give a wrong output... no
3) IDE
An example: I launch a script. It produces an error because of a matrix. I can still execute code with the command line. I visualize it doing: imagesc(matrix). I see that the last line of the matrix is weird. I fix the bug. All variables are still set. I select the remaining of the code, press F9 to execute the selection, and everything goes on. Debuging becomes fast, thanks to that.
Matlab underlines some of my errors before execution. So I can quickly see the problems. It proposes some way to make my code faster.
There is an awesome profiler included in the IDE. KCahcegrind is such a pain to use compared to that.
python's IDEs are awefull. python without ipython is not usable. I never manage to debug, using ipython.
+autocompletion, help for function arguments,...
4) Concise code
To normalize all the columns of a matrix ( which I need all the time), I do:
bsxfun(#times,A,1./sqrt(sum(A.^2)))
To remove from a matrix all colums with small sum:
A(:,sum(A)<e)=[]
To do the computation on the GPU:
gpuX = gpuarray(X);
%%% code normally and everything is done on GPU
To paralize my code:
parfor n=1:100
%%% code normally and everything is multi-threaded
What language can beat that?
And of course, I rarely need to make loops, everything is included in functions, which make the code way easier to read, and no headache with indices. So I can focus, on what I want to program, not how to program it.
5) Plotting tools
Matlab is famous for its plotting tools. They are very helpful.
Python's plotting tools have much less features. But there is one thing super annoying. You can plot figures only once per script??? if I have along script I cannot display stuffs at each step ---> useless.
6) Documentation
Everything is very quick to access, everything is crystal clear, function names are well chosen.
With python, I always need to google stuff, look in forums or stackoverflow.... complete time hog.
PS: Finally, what I hate with matlab: its price
I've been using matlab for many years in my research. It's great for linear algebra and has a large set of well-written toolboxes. The most recent versions are starting to push it into being closer to a general-purpose language (better optimizers, a much better object model, richer scoping rules, etc.).
This past summer, I had a job where I used Python + numpy instead of Matlab. I enjoyed the change of pace. It's a "real" language (and all that entails), and it has some great numeric features like broadcasting arrays. I also really like the ipython environment.
Here are some things that I prefer about Matlab:
consistency: MathWorks has spent a lot of effort making the toolboxes look and work like each other. They haven't done a perfect job, but it's one of the best I've seen for a codebase that's decades old.
documentation: I find it very frustrating to figure out some things in numpy and/or python because the documentation quality is spotty: some things are documented very well, some not at all. It's often most frustrating when I see things that appear to mimic Matlab, but don't quite work the same. Being able to grab the source is invaluable (to be fair, most of the Matlab toolboxes ship with source too)
compactness: for what I do, Matlab's syntax is often more compact (but not always)
momentum: I have too much Matlab code to change now
If I didn't have such a large existing codebase, I'd seriously consider switching to Python + numpy.
Hold everything. When's the last time you programed your calculator to play tetris? Did you actually think you could write anything you want in those 128k of RAM? Likely not. MATLAB is not for programming unless you're dealing with huge matrices. It's the graphing calculator you whip out when you've got Megabytes to Gigabytes of data to crunch and/or plot. Learn just basic stuff, but also don't kill yourself trying to make Python be a graphing calculator.
You'll quickly get a feel for when you want to crunch, plot or explore in MATLAB and when you want to have all that Python offers. Lots of engineers turn to pre and post processing in Python or Perl. Occasionally even just calling out to MATLAB for the hard bits.
They are such completely different tools that you should learn their basic strengths first without trying to replace one with the other. Granted for saving money I'd either use Octave or skimp on ease and learn to work with sparse matrices in Perl or Python.
MATLAB is great for doing array manipulation, doing specialized math functions, and for creating nice plots quick.
I'd probably only use it for large programs if I could use a lot of array/matrix manipulation.
You don't have to worry about the IDE as much as in more formal packages, so it's easier for students without a lot of programming experience to pick up.
MATLAB is a popular and widely adapted piece of a
sophisticated software package. It'd be a mistake to think
it's merely a math software since it has a wide range of
"toolboxes". I recently used Matplotlib to plot some data
from a database and it did the job without needing all the
bells and whistles of MATLAB. However, it may not be proper
to compare Python and MATLAB in every situation. As with
everything else the decision depends on what you need to do.
I used MATLAB in undergrad for control systems design and
simulation and also for image processing in grad school. For
these fields MATLAB makes the most sense because of the
powerful control and image processing toolboxes. As everyone
mentioned, array operations, which are used in every MATLAB
script you'd need to write, are very easy with MATLAB.
Another nice thing about MATLAB is that it's very easy and
fast to do prototyping and trying out ideas using the built
in toolbox functions. For instance, it takes no effort to
import an image and compute it's histogram or do some simple
processing on it. One disadvantage of MATLAB could be it's
speed because of its interpreted nature. However, if one
really needs speed than he can choose to implement the
tested logic in C/C++, etc.
For further comparison with Python, I can say that MATLAB
provides a full package for you to do your work without the
need of looking around for external libraries and
implementing extra functions.
One last point about MATLAB which I see is not mentioned in
the answers here is that it has a very powerful visual
modeling/simulation environment called Simulink. It's
easier to design and simulate larger systems with Simulink.
Finally, again, it all depends on the problem you need to
solve. If your problem domain can make use of one of
MATLAB's toolboxes and you have access to MATLAB then you
can be sure that you'll have the right tool to solve it.
MATLAB, as mentioned by others, is great at matrix manipulation, and was originally built as an extension of the well-known BLAS and LAPACK libraries used for linear algebra. It interfaces well with other languages like Java, and is well favored by engineering and scientific companies for its well developed and documented libraries. From what I know of Python and NumPy, while they share many of the fundamental capabilities of MATLAB, they don't have the full breadth and depth of capabilities with their libraries.
Personally, I use MATLAB because that's what I learned in my internship, that's what I used in grad school, and that's what I used in my first job. I don't have anything against Python (or any other language). It's just what I'm used too.
Also, there is another free version in addition to scilab mentioned by #Jim C from gnu called Octave.
Personally, I tend to think of Matlab as an interactive matrix calculator and plotting tool with a few scripting capabilities, rather than as a full-fledged programming language like Python or C. The reason for its success is that matrix stuff and plotting work out of the box, and you can do a few very specific things in it with virtually no actual programming knowledge. The language is, as you point out, extremely frustrating to use for more general-purpose tasks, such as even the simplest string processing. Its syntax is quirky, and it wasn't created with the abstractions necessary for projects of more than 100 lines or so in mind.
I think the reason why people try to use Matlab as a serious programming language is that most engineers (there are exceptions; my degree is in biomedical engineering and I like programming) are horrible programmers and hate to program. They're taught Matlab in college mostly for the matrix math, and they learn some rudimentary programming as part of learning Matlab, and just assume that Matlab is good enough. I can't think of anyone I know who knows any language besides Matlab, but still uses Matlab for anything other than a few pure number crunching applications.
The most likely reason that it's used so much in universities is that the mathematics faculty are used to it, understand it, and know how to incorporate it into their curriculum.
Between matplotlib+pylab and NumPy I don't think there's much actual difference between Matlab and python other than cultural inertia as suggested by #Adam Bellaire.
I believe you have a very good point and it's one that has been raised in the company where I work. The company is limited in it's ability to apply matlab because of the licensing costs involved. One developer proved that Python was a very suitable replacement but it fell on ignorant ears because to the owners of those ears...
No-one in the company knew Python although many of us wanted to use it.
MatLab has a name, a company, and task force behind it to solve any problems.
There were some (but not a lot) of legacy MatLab projects that would need to be re-written.
If it's worth £10,000 (??) it's gotta be worth it!!
I'm with you here. Python is a very good replacement for MatLab.
I should point out that I've been told the company uses maybe 5% to 10% of MatLabs capabilities and that is the basis for my agreement with the original poster
MATLAB is a fantastic tool for
prototyping
engineering simulation and
fast visualization of data
You can really play with, visualize and test your ideas on a data set very effectively. It should not be regarded as an alternative to other software languages used for product development. I highly recommend it for the above tasks, though it is expensive - free alternatives like Octave and Python are catching up.
Seems to be pure inertia. Where it is in use, everyone is too busy to learn IDL or numpy in sufficient detail to switch, and don't want to rewrite good working programs. Luckily that's not strictly true, but true enough in enough places that Matlab will be around a long time. Like Fortran (in active use where i work!)
The main reason it is useful in industry is the plug-ins built on top of the core functionality. Almost all active Matlab development for the last few years has focused on these.
Unfortunately, you won't have much opportunity to use these in an academic environment.
I know this question is old, and therefore may no longer be
watched, but I felt it was necessary to comment. As an
aerospace engineer at Georgia Tech, I can say, with no
qualms, that MATLAB is awesome. You can have it quickly
interface with your Excel spreadsheets to pull in data about
how high and fast rockets are flying, how the wind affects
those same rockets, and how different engines matter. Beyond
rocketry, similar concepts come into play for cars, trucks,
aircraft, spacecraft, and even athletics. You can pull in
large amounts of data, manipulate all of it, and make sure
your results are as they should be. In the event something is
off, you can add a line break where an error occurs to debug
your program without having to recompile every time you want
to run your program. Is it slower than some other programs?
Well, technically. I'm sure if you want to do the number
crunching it's great for on an NVIDIA graphics processor, it
would probably be faster, but it requires a lot more effort
with harder debugging.
As a general programming language, MATLAB is weak. It's not
meant to work against Python, Java, ActionScript, C/C++ or
any other general purpose language. It's meant for the
engineering and mathematics niche the name implies, and it
does so fantastically.
MATLAB WAS a wrapper around commonly available libraries.
And in many cases it still is. When you get to larger
datasets, it has many additional optimizations, including
examining and special casing common problems (reducing to
sparse matrices where useful, for example), and handling
edge cases. Often, you can submit a problem in a standard
form to a general function, and it will determine the best
underlying algorithm to use based on your data. For small
N, all algorithms are fast, but MATLAB makes determining the
optimal algorithm a non-issue.
This is written by someone who hates MATLAB, and has tried
to replace it due to integration issues. From your
question, you mention getting MATLAB 5 and using it for a
course. At that level, you might want to look at
Octave, an open source implementation with the same
syntax. I'm guessing it is up to MATLAB 5 levels by now (I
only play around with it). That should allow you to "pass
your exam". For bare MATLAB functionality it seems to be
close. It is lacking in the toolbox support (which, again,
mostly serves to reformulate the function calls to forms
familiar to engineers in the field and selects the right
underlying algorithm to use).
One reason MATLAB is popular with universities is the same reason a lot of things are popular with universities: there's a lot of professors familiar with it, and it's fairly robust.
I've spoken to a lot of folks who are especially interested in MATLAB's nascent ability to tap into the GPU instead of working serially. Having used Python in grad school, I kind of wish I had the licks to work with MATLAB in that case. It sure would make vector space calculations a breeze.
It's been some time since I've used Matlab, but from memory it does provide (albeit with extra plugins) the ability to generate source to allow you to realise your algorithm on a DSP.
Since python is a general purpose programming language there is no reason why you couldn't do everything in python that you can do in matlab. However, matlab does provide a number of other tools - eg. a very broad array of dsp features, a broad array of S and Z domain features.
All of these could be hand coded in python (since it's a general purpose language), but if all you're after is the results perhaps spending the money on Matlab is the cheaper option?
These features have also been tuned for performance. eg. The documentation for Numpy specifies that their Fourier transform is optimised for power of 2 point data sets. As I understand Matlab has been written to use the most efficient Fourier transform to suit the size of the data set, not just power of 2.
edit: Oh, and in Matlab you can produce some sensational looking plots very easily, which is important when you're presenting your data. Again, certainly not impossible using other tools.
I think you answered your own question when you noted that Matlab is "cool to work with matrixes and plotting things". Any application that requires a lot of matrix maths and visualisation will probably be easiest to do in Matlab.
That said, Matlab's syntax feels awkward and shows the language's age. In contrast, Python is a much nicer general purpose programming language and, with the right libraries can do much of what Matlab does. However, Matlab is always going to have a more concise syntax than Python for vector and matrix manipulation.
If much of your programming involves these sorts of manipulations, such as in signal processing and some statistical techniques, then Matlab will be a better choice.
First Mover Advantage. Matlab has been around since the late 1970s. Python came along more recently, and the libraries that make it suitable for Matlab type tasks came along even more recently. People are used to Matlab, so they use it.
Matlab is good at doing number crunching. Also Matrix and matrix manipulation. It has many helpful built in libraries(depends on the what version) I think it is easier to use than python if you are going to be calculating equations.

Categories

Resources