Wow, Python much faster than MatLab

hi All,

instead of questions,
my first success story:

I converted my first MatLab algorithm into Python (using SciPy),
and it not only works perfectly,
but also runs much faster:

MatLab: 14 msec
Python: 2 msec

After taking the first difficult steps into Python,
all kinds of small problems as you already know,
it now seems a piece of cake to convert from MatLab to Python.
(the final programs of MatLab and Python can almost only be
distinguished by the comment character ;-)

Especially I like:
- more relaxed behavior when exceeding the upper limit of a (1-dimensional)
array
- many more functions available, like a simple "mean"
- reducing the datatype where it's allowed (booleans of 1 byte)

thanks for all your help,
probably need some more in the future,
cheers,
Stef Mientki
Dec 29 '06 #1
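
A minimal sketch of the three points listed in the post above, assuming NumPy is installed. The array contents and variable names are made up for illustration and are not taken from the original algorithm:

```python
import numpy as np

signal = np.array([1.0, 2.0, 3.0, 4.0])

# Slicing past the end is clipped to the array length instead of raising.
tail = signal[2:50000]

# A ready-made mean, so no explicit loop is needed.
average = signal.mean()

# A comparison yields a boolean array that stores one byte per element.
mask = signal > 2.0

print(tail)                       # [3. 4.]
print(average)                    # 2.5
print(mask.dtype, mask.itemsize)  # bool 1
```

Note that only slices are clipped; asking for a single element past the end, such as signal[50000], still raises an IndexError.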

Stef Mientki wrote:
> hi All,
>
> instead of questions,
> my first success story:
>
> I converted my first MatLab algorithm into Python (using SciPy),
> and it not only works perfectly,
> but also runs much faster:
>
> MatLab: 14 msec
> Python: 2 msec

For times this small, I wonder if timing comparisons are valid. I do
NOT think SciPy is in general an order of magnitude faster than Matlab
for the tasks typically performed with Matlab.

> After taking the first difficult steps into Python,
> all kinds of small problems as you already know,
> it now seems a piece of cake to convert from MatLab to Python.
> (the final programs of MatLab and Python can almost only be
> distinguished by the comment character ;-)
>
> Especially I like:
> - more relaxed behavior when exceeding the upper limit of a (1-dimensional)
> array

Could you explain what this means? In general, I don't want a
programming language to be "relaxed" about exceeding array bounds.

Dec 30 '06 #2
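
One way to make a millisecond-scale measurement more trustworthy on the Python side is to repeat it many times with the standard timeit module. The sketch below assumes a hypothetical stand-in function, moving_sum, because the original algorithm was not posted:

```python
import timeit

def moving_sum(values, width=3):
    """Hypothetical stand-in for the algorithm being timed."""
    return [sum(values[i:i + width]) for i in range(len(values) - width + 1)]

data = list(range(1000))

# Each sample runs the function 100 times. Taking the minimum of 5 samples
# filters out most of the interference from other processes.
samples = timeit.repeat(lambda: moving_sum(data), number=100, repeat=5)
best_ms_per_call = min(samples) / 100 * 1000.0

print(f"best of 5: {best_ms_per_call:.3f} ms per call")
```

The minimum is used rather than the mean because slower samples mostly reflect what else the machine was doing, not the code under test.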
On Fri, 29 Dec 2006 19:35:22 -0800, Beliavsky wrote:
>> Especially I like:
>> - more relaxed behavior when exceeding the upper limit of a (1-dimensional)
>> array
>
> Could you explain what this means? In general, I don't want a
> programming language to be "relaxed" about exceeding array bounds.

I'm not sure about SciPy, but lists in standard Python allow this:

>>> array = [1, 2, 3, 4]
>>> array[2:50000]
[3, 4]

That's generally a good thing.


--
Steven.

Dec 30 '06 #3
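
The relaxed behaviour shown above applies to slices only. A short sketch of the difference between slicing and indexing on a plain list (the variable names are illustrative):

```python
array = [1, 2, 3, 4]

# A slice is clipped to the valid range, so an oversized stop is harmless.
clipped = array[2:50000]
empty = array[10:20]

# Indexing a single element past the end still fails loudly.
try:
    array[50000]
    index_checked = False
except IndexError:
    index_checked = True

print(clipped, empty, index_checked)  # [3, 4] [] True
```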
>> MatLab: 14 msec
>> Python: 2 msec
>
> For times this small, I wonder if timing comparisons are valid. I do
> NOT think SciPy is in general an order of magnitude faster than Matlab
> for the tasks typically performed with Matlab.

The algorithm is meant for real-time analysis,
where these kinds of differences count a lot.
I'm also a typical "surface programmer"
(don't need/want to know what's going on inside),
just want to get my analysis done,
and the fact that Python has many more functions available
means I have to write far fewer explicit or implicit for loops,
and thus I expect it to "look" faster for me always.

>> After taking the first difficult steps into Python,
>> all kinds of small problems as you already know,
>> it now seems a piece of cake to convert from MatLab to Python.
>> (the final programs of MatLab and Python can almost only be
>> distinguished by the comment character ;-)
>>
>> Especially I like:
>> - more relaxed behavior when exceeding the upper limit of a (1-dimensional)
>> array
>
> Could you explain what this means? In general, I don't want a
> programming language to be "relaxed" about exceeding array bounds.

Well, I have to admit, that wasn't a very tactful remark, "noise" is still
an unwanted issue in software.
But in the meantime I've been reading further and I should replace that with
some other great things:
- the very efficient way comments are turned into help information
- the (at first sight) very easy, yet quite powerful OOP implementation.

cheers,
Stef Mientki
Dec 30 '06 #4
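
Assuming the remark about comments refers to docstrings, the sketch below shows both points at once: a small class, and a string literal under each definition that becomes its help text. The Filter class is hypothetical and only serves as an illustration:

```python
class Filter:
    """Hypothetical moving-average filter used only for illustration."""

    def __init__(self, width):
        self.width = width

    def apply(self, samples):
        """Return the moving average of samples over self.width points."""
        w = self.width
        return [sum(samples[i:i + w]) / w for i in range(len(samples) - w + 1)]

smoother = Filter(2)
print(smoother.apply([1.0, 3.0, 5.0]))  # [2.0, 4.0]

# The docstring is stored on the object itself; help(Filter) renders the
# same text as formatted help.
print(Filter.apply.__doc__)
```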
> I'm not sure about SciPy,

Yes, SciPy allows it too!

> but lists in standard Python allow this:
>
> >>> array = [1, 2, 3, 4]
> >>> array[2:50000]
> [3, 4]
>
> That's generally a good thing.

You're not perhaps by origin an analog engineer ;-)

cheers,
Stef Mientki
Dec 30 '06 #5
Another great thing: with rpy you have R bindings for Python.
So you have the power of R and the easy syntax and big standard lib of Python! :)
Dec 30 '06 #6
Mathias Panzenboeck wrote:
> Another great thing: with rpy you have R bindings for Python.
> So you have the power of R and the easy syntax and big standard lib of Python! :)

forgive my ignorance, what's R, rpy ?
Or is it only relevant for Linux users ?

cheers
Stef
Dec 30 '06 #7
Stef Mientki <S.**************@mailbox.kun.nl> writes:
> Mathias Panzenboeck wrote:
>> Another great thing: with rpy you have R bindings for Python.
>
> forgive my ignorance, what's R, rpy ?
> Or is it only relevant for Linux users ?
[...]

R is a language / environment for statistical programming. RPy is a
Python interface to let you use R from Python. I think they both run
on both Windows and Linux.

http://www.r-project.org/

http://rpy.sourceforge.net/
John
Dec 30 '06 #8

Stef Mientki wrote:
> MatLab: 14 msec
> Python: 2 msec

I have the same experience. NumPy is usually faster than Matlab. But it
very much depends on how the code is structured.

I wonder if it is possible to improve the performance of NumPy by
having its fundamental types in the language, instead of depending on
operator overloading. For example, in NumPy, a statement like

array3[:] = array1[:] + array2[:]

allocates an intermediate array that is not needed. This is because the
operator overloading cannot know if it's evaluating a part of a larger
statement like

array1[:] = (array1[:] + array2[:]) * (array3[:] + array4[:])

If arrays had been a part of the language, as they are in Matlab and
Fortran 95, the compiler could see this and avoid intermediate storage,
as well as looping over the data only once. This is one of the main
reasons why Fortran is better than C++ for scientific computing. I.e.
instead of

    for (i=0; i<n; i++)
        array1[i] = (array1[i] + array2[i]) * (array3[i] + array4[i]);

one actually gets something like three intermediates and four loops:

    tmp1 = malloc(n*sizeof(whatever));
    for (i=0; i<n; i++)
        tmp1[i] = array1[i] + array2[i];
    tmp2 = malloc(n*sizeof(whatever));
    for (i=0; i<n; i++)
        tmp2[i] = array3[i] + array4[i];
    tmp3 = malloc(n*sizeof(whatever));
    for (i=0; i<n; i++)
        tmp3[i] = tmp1[i] * tmp2[i];
    free(tmp1);
    free(tmp2);
    for (i=0; i<n; i++)
        array1[i] = tmp3[i];
    free(tmp3);

In C++ this is actually further bloated by constructor, destructor and
copy constructor calls.

Why one should use Fortran over C++ is obvious. But it also applies to
NumPy, and also to the issue of NumPy vs. Matlab, as Matlab knows about
arrays and has a compiler that can deal with this, whilst NumPy depends
on bloated operator overloading. On the other hand, Matlab is
fundamentally impaired on function calls and array slicing compared
with NumPy (basically copies are created instead of views). Thus, which
is faster - Matlab or NumPy - very much depends on how the code is
written.

Now for my question: operator overloading is (as shown) not the
solution to efficient scientific computing. It creates serious bloat
where it is undesired. Can NumPy's performance be improved by adding
the array types to the Python language itself? Or is the dynamic
nature of Python preventing this?

Sturla Molden

Dec 31 '06 #9
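
Within NumPy as it stands, some of the temporaries described above can be avoided by hand with the out argument of the ufuncs. The sketch below assumes small made-up arrays and a single reusable buffer named scratch. It still makes three passes over the data, so it reduces allocations but does not fuse the loops:

```python
import numpy as np

n = 5
array1 = np.arange(n, dtype=float)
array2 = np.ones(n)
array3 = np.full(n, 2.0)
array4 = np.full(n, 3.0)

# Reference result from the plain expression, which allocates temporaries.
expected = (array1 + array2) * (array3 + array4)

# One scratch buffer is allocated once; array1 is updated in place.
scratch = np.empty(n)
np.add(array3, array4, out=scratch)       # scratch = array3 + array4
np.add(array1, array2, out=array1)        # array1 = array1 + array2
np.multiply(array1, scratch, out=array1)  # array1 = array1 * scratch

print(np.array_equal(array1, expected))   # True
```

The price is readability: the single expression becomes three calls, which is roughly the bookkeeping a compiler with built-in array types could do on its own.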
sturlamolden wrote:
> array3[:] = array1[:] + array2[:]

OT, but why are you slicing array1 and array2? All that does is create new array
objects pointing to the same data.

> Now for my question: operator overloading is (as shown) not the
> solution to efficient scientific computing. It creates serious bloat
> where it is undesired. Can NumPy's performance be improved by adding
> the array types to the Python language itself? Or is the dynamic
> nature of Python preventing this?

Pretty much. Making the array types builtin rather than from a third party
module doesn't really change anything. However, if type inferencing tools like
psyco are taught about numpy arrays like they are already taught about ints,
then one could make it avoid temporaries.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

Dec 31 '06 #10
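
The point about slicing can be checked directly. A minimal sketch, assuming NumPy is installed, that contrasts an array slice (a view on the same data) with a list slice (a copy):

```python
import numpy as np

array1 = np.arange(4)
view = array1[:]

print(view is array1)                  # False: a new array object
print(np.shares_memory(view, array1))  # True: same underlying data

view[0] = 99
print(array1[0])                       # 99: the write shows through array1

# A Python list behaves differently: slicing copies the elements.
items = [0, 1, 2, 3]
copied = items[:]
copied[0] = 99
print(items[0])                        # 0
```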

sturlamolden wrote:
> as well as looping over the data only once. This is one of the main
> reasons why Fortran is better than C++ for scientific computing. I.e.
> instead of
>
>     for (i=0; i<n; i++)
>         array1[i] = (array1[i] + array2[i]) * (array3[i] + array4[i]);
>
> one actually gets something like three intermediates and four loops:
>
>     tmp1 = malloc(n*sizeof(whatever));
>     for (i=0; i<n; i++)
>         tmp1[i] = array1[i] + array2[i];
>     tmp2 = malloc(n*sizeof(whatever));
>     for (i=0; i<n; i++)
>         tmp2[i] = array3[i] + array4[i];
>     tmp3 = malloc(n*sizeof(whatever));
>     for (i=0; i<n; i++)
>         tmp3[i] = tmp1[i] * tmp2[i];
>     free(tmp1);
>     free(tmp2);
>     for (i=0; i<n; i++)
>         array1[i] = tmp3[i];
>     free(tmp3);

C/C++ do not allocate extra arrays. What you posted _might_ bear a
small resemblance to what numpy might produce (if using vectorized
code, not explicit loop code). This is entirely unrelated to the
reasons why Fortran can be faster than C.

-Mike

Dec 31 '06 #11

Klaas wrote:
> C/C++ do not allocate extra arrays. What you posted _might_ bear a
> small resemblance to what numpy might produce (if using vectorized
> code, not explicit loop code). This is entirely unrelated to the
> reasons why fortran can be faster than c.

Array libraries in C++ that use operator overloading produce
intermediate arrays for the same reason as NumPy. There is a C++
library that is sometimes able to avoid intermediates (Blitz++), but
it can only do so for small arrays for which bounds are known at
compile time.

Operator overloading is sometimes portrayed as required for scientific
computing (e.g. in Java vs. C# flame wars), but the cure can be worse
than the disease.

C does not have operator overloading and is an entirely different case.
You can of course avoid intermediates in C++ if you use C++ as C. You
can do that in Python as well.

Jan 1 '07 #12
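
What "using C++ as C" looks like in Python can be sketched with an explicit element-wise loop over typed arrays from the standard array module. The values are made up for illustration. The loop makes one pass and allocates no intermediate arrays, but each iteration runs in the interpreter, so it is usually much slower than the vectorized NumPy expression for large inputs:

```python
from array import array

array1 = array("d", [0.0, 1.0, 2.0])
array2 = array("d", [1.0, 1.0, 1.0])
array3 = array("d", [2.0, 2.0, 2.0])
array4 = array("d", [3.0, 3.0, 3.0])

# One fused loop: every element is read, combined and written back in place.
for i in range(len(array1)):
    array1[i] = (array1[i] + array2[i]) * (array3[i] + array4[i])

print(list(array1))  # [5.0, 10.0, 15.0]
```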
