(hello

‘world)

How I feel about Scheme’s performance.

I came across this post written earlier today, How fast is Scheme? Well…. which states:

I don’t know much about Scheme [...] but it seems that the Scheme compilers produce quite sluggish code, at least looking through the grainy, distorted lens that is the Computer Language Benchmarks Game.

That seems to make enough sense to me. For one-off heavily numerical tests, Scheme pretty much sucks. Especially considering that to compile the code it often has to go through C.

He goes on to say:

SBCL comes out very much on top; in general the Gambit programs take two or three times as long to do their job (although I haven’t looked at memory usage). But as far as Scheme compilers go, Gambit seems to be improving things.

With the right declarations, SBCL has been known to outperform straight C as well. Its compiler is really quite something. And (IIRC) it doesn’t have the “handicap” of using C as an intermediate language.

Personally however, I tend to consider C generating Scheme compilers more practical for issues of portability and/or easier FFI than performance. Chicken’s a good example of braindead-easy FFI. And I feel where Gambit really shines is when you need many, many threads. It’ll handle hundreds of thousands of threads in a breeze. Termite is a good example of exploiting its threading capabilities. RScheme does real-time GC. SCSH has replaced every use I’ve had for shell/perl/etc scripts. There’s a decent chance that any random Linux box already has Guile installed. SISC and Kawa leverage Java’s JIT machiery and provide trivially easy Java FFI. Bigloo can compile to Java bytecode as well. MzScheme runs its own JIT compiler by default where it can, and is prefered to its C output. And there’s a JIT option for all the C-generating Scheme’s as well, compile with LLVM instead of GCC. I think it’d be especially interesting to compare MzScheme’s JIT to MzScheme’s C + LLVM. And if you’re going the Scheme->C route for performance reasons, I’d think Stalin to be the obvious Scheme to use, whether with GCC or LLVM depending on its expected run time (assuming there’s even noticeable startup overhead with LLVM, it’s embedded VM is really quite minimal). Many of them have really nice OO systems built in as well

But none of these tests run long enough to let any of the JIT options really shine. Basically, I say throw any of the JIT options into the mix, and make the tests long enough to really let them do their magic.

I very strongly suspect the same holds for Java, having run a Freenet node for quite some time it gets noticeably snappier once the JVM has had a chance to see it run for a while, especially for recent versions of Java. I’d also argue that’s exactly (other than its relatively primitive GC) what makes Java so incredibly horrid for client-side stuff such as applets, you’re loading a JIT compiler and GC you’ll hardly get a chance to even use before you’ve closed the window. But its JIT has been making leaps and bounds lately, as mentioned on Good Math, Bad Math:

About a year later, testing a new JIT for Java, the Java time was down to 0.7 seconds to run the code, plus about 1 second for the JVM to start up. (The startup times for C, C++, and Ocaml weren’t really measurable – they were smaller than the margin of error for the measurements.)

This is from the previous measurement of 1 minute and 20 seconds for Java. As I said before, SISC, Kawa, and Bigloo will happily use the Java VM. Straight C scored 0.8 seconds. OCaml kicked all their asses even before compilation, but that’s not the point here. If you really need every last bit of performance you can get though, OCaml seems to be worth looking into.

So yeah, I’m *very* interested in what the performance possibilities for Scheme really are, if nothing else out of shear curiosity. Maybe I’ll wind up running a few benchmarks of my own at some point, reiterating the tests, say, a hundred or a thousand times each… but in the end, even if this does increase their performance relative to SBCL… you don’t need to use an obscure implementation or do JIT tricks with SBCL in the first place. A lot of people havn’t heard of Stalin or LLVM. A lot of people don’t want to load one language (Java) to run another (Scheme). Although again, I’d question whether MzScheme’s performance is really so bad in the long run.

And I’d question whether it was really worth it in the first place. Fluxus and Impromtu are two obvious examples which come to mind, both heavy graphics/audio livecoding systems, which solve many problems the same way you would in, say, Python. Offload much of the heavy work onto libraries. There’s a PDF floating on the net somewhere about MzScheme controlling an array of telescopes, and of course there’s the US Navy’s Metcast project. SchemDoc, Scheme Elucidator, the LAML they’re both based on that you can just feed any XML DTD into and get a Scheme representation of that XML language in. SCWM and Orion window managers. MetaModeler for dealing with many/large databases. For web stuff there’s TeX2page, SISCweb, BRL, WiLiKi, the Hop framework, HtmlPrag, SXML, it goes on. There’s a lot of uses which don’t demand every last bit of performance from the Scheme implementation, and I’m just not really doing anything that does.

And if I were writing something and came across an annoying bottleneck? I’d likely take the NetBSD approach. Instead of trying to tweak things to run faster (LLVM, Java, implementation-specific declarations, etc…), I’d see if I couldn’t find a fundamentally more efficient algorithm first. Which reminds me, I still want those books.

[update] On the Reddit thread where apparently most of the views for this article are coming from (who’d have thought so many people would be interested in some Scheme noob’s opinions of language performance? Well over 500 hits already, scant hours later), there’s a link to a fascinating email thread discussing the floating point speed of Gambit-C. I have Brad Lucier’s paper printing to read later as I type this. There’s also a reply in the thread by Brad himself, my favorite part of which is the end summary, which is generally similar to the “NetBSD approach” mentioned above. Is there some sort of established term for this idea?

Anyway, I’ll certainly be rethinking my opinions of Gambit-C as “that threading/Termite implementation”.

[update 2] It’s morning, getting close to 1500 views for this post now… searching for r5rs performance on Google this is the 3rd result! I still don’t see why this is drawing so much attention, but as long as it is, have you seen Scheme Now!?

Scheme Now!, also known as Snow, is a repository of Scheme packages that are portable to several popular implementations of Scheme.

Snow is a general framework for developing and distributing portable Scheme packages. Snow comes with a set of core packages that provide portable APIs for practical programming features such as networking, cryptography, data compression, file system access, etc. Snow packages can export procedures, macros and records.

[update 3] 1844 views as I write this on July 1st. Also now the first result on google for the aforementioned search. Wow. o_O

About these ads

June 29, 2008 - Posted by | Common Lisp, Programming, Scheme

14 Comments »

  1. Freenet? As an example of speed? Oh well.

    Comment by David | June 29, 2008 | Reply

  2. The daemon itself, or any long-running Java application, yes. Especially with recent JVM’s. The Freenet protocol it implements may not be the fastest thing around, and it’s often dealing with rather large data sets, but there’s not much that can be done about. I’ve never seen a performance problem related to the daemon itself.

    Comment by prael | June 29, 2008 | Reply

  3. > That seems to make enough sense to me. For one-off heavily numerical tests, Scheme pretty much sucks.

    How many of them are heavily numeric?

    Comment by Isaac Gouy | June 29, 2008 | Reply

  4. There’s a list of the programs and language implementations tested on the Benchmark site at http://shootout.alioth.debian.org/gp4/index.php

    Of those, the tests used in the SBCL/Gambit blog were pidigits, fannkuch, fasta, and binary-trees. The first two are numeric, the 3rd is generating & writing random sequences (though not sequences of numbers), and b-trees are b-trees.

    Comment by prael | June 29, 2008 | Reply

  5. There was a time when CMUCL/SBCL were obscure, and even (somewhat earlier) when people expected high level languages, including Lisps, to be slow. We would never have gotten to this point if everyone just decided to leave the status quo as is and treated efforts otherwise as curiosities.

    Comment by web design | June 30, 2008 | Reply

  6. The Scheme implementations that are generally thought to perform best are not represented in the Shootout. For example, Chez Scheme, a commercial product, it generally thought to be the fastest Scheme. Larceny, which is an academic project, is also quite performant, but not present in the shootout.

    Comment by Sam TH | June 30, 2008 | Reply

  7. @web design:
    Fair enough, but I wasn’t necessarily saying they shouldn’t strive to improve performance if that’s what they wish to do. I’m just saying I don’t do anything that would really benefit from it. When I said that I questioned it being worth it, I was referring to things like using LLVM vs. GCC, or using a JVM based implementation. In fact the shootout site itself has a great FAQ, part of which is comparing Java’s performance when it was allowed to run the test 400 times without being restarted, the difference was negligible. If the implementations themselves improve their performance however, especially if it’s by improving their compilation, I can only cheer them on.

    @Sam TH:
    True. Gambit’s not there either. Neither is Gauche, Scheme48, scsh, scm… On the Gambit/SBCL post (first link in this one) there’s mention of improving Gambit’s performance considerably by specifying a larger default heap, and building Gambit itself with a couple options which improves the resulting application’s performance as well. I’d be quite interested to see Chez as well though, TSPL is by far my favorite reference book so far.

    Comment by prael | June 30, 2008 | Reply

  8. prael | June 29, 2008
    > pidigits, fannkuch, fasta, and binary-trees

    When you write “heavily numerical tests” I’d think of tests like n-body or spectral-norm, the sort of thing Fortran made its name doing.

    pidigits is about arbitrary precision arithmetic, not hardware floats and ints.

    fannkuch is about loops and vectors – can you profile the code and find where one of the Schemes spends time?

    fasta certainly spends a chunk of time on random number generation – can you profile the code and find where one of the Schemes spends time?

    binary-trees isn’t what anyone would think of as “heavily numerical” but none of the Scheme implementations are accepted because they use a more compact leaf-node.

    Comment by Isaac Gouy | June 30, 2008 | Reply

  9. > Gambit’s not there either. Neither is Gauche, Scheme48 …

    And the benchmark game will not attempt to measure every Scheme implementation – that’s a chore for the Scheme community ;-)

    Having said that I did install Larceny and did try to find someone interested in implementing the programs – without success; and I did install Gambit-C and did try to find someone interested in implementing the programs but they only provided a couple, which didn’t seem to work.

    Comment by Isaac Gouy | June 30, 2008 | Reply

  10. http://dynamo.iro.umontreal.ca/~gambit/wiki/index.php/Programming_language_shootout
    has both you mention if you wish to run them. I’m content having read Brad Lucier’s paper.

    I didn’t mean to imply that the benchmark guys *should* go out and write tests for all the various implementations, that’d likely turn into a full time job just for Scheme, let alone the various flavors of all the other languages too. And then keeping them up to date… it might be a fun job, but not for free =)

    As for the profiling, I’m sure the implementation devs already do that sort of thing when they’re working on performance, and would be much more apt at such a task.

    In the end I agree with Oisín’s description of benchmarks being a “grainy, distorted lens”.

    Comment by prael | June 30, 2008 | Reply

  11. [...] How I feel about Scheme’s performance. I came across this post written earlier today, How fast is Scheme? Well…. which states: [...]

    Pingback by Top Posts « WordPress.com | June 30, 2008 | Reply

  12. I am not specially fond of Scheme, but believe me, it is fast. Well, the only Scheme I know from hands on is Stalin, since it generates code that is compact enough for my needs. I do a lot of number crunching, and always compare Stalin with gcc (I am supposed to write my programs in C). I consider myself lucky, when Stalin beats gcc by 50%. In numerical applications, Stalin often outperforms gcc by a factor of 3. There are cases, like that famous two D integration, that Stalin produces the result ten times faster than C.

    Comment by Eduardo Costa | October 18, 2008 | Reply

  13. Last I checked, Gambit-C doesn’t have threads. Termite works with separate processes.

    Comment by Aaron | January 6, 2011 | Reply

  14. It is ironic that gambit was mentioned as the example of slow scheme.
    I just did a simple benchmark and compiled/optimized gambit came ahead of popular dynamic languages tested.
    It’s peformance approaches that of java in this admittedly simple benchmark.x

    perl 369.47
    python 142.21
    js rhino 105.95
    lua 69.95
    v8 javascript 6.32
    gsc (standard-bindings) (block) (fixnum) (not safe) 2.05
    java -server -dsa 1.81
    C -O3 1.19

    Optimized gambit beat out unoptimized sbcl, too.

    Here is the progression of gambit performance with stages of optimizations.
    The original poster is perhaps talking about the top number here.
    Not the improvement: two orders of magnitude.
    si 226.34
    gsc plain 12.19
    gsc (standard-bindings) 9.43
    gsc (standard-bindings) (block) 6.70
    gsc (standard-bindings) (block) (fixnum) 5.56
    gsc (standard-bindings) (block) (fixnum) (not safe) 2.05

    If you are interested, please read this doc.
    http://www.iro.umontreal.ca/~gambit/Gambit-inside-out.pdf

    Comment by tengu | January 6, 2011 | Reply


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: