I see why VMs are so popular…

At least for functional-flavored languages. C’s stack and the lack of any explicit way to manage it really makes coroutines/fibers a pain in the ass, and turns any attempt at implementing tail calls into a horrific nightmare. At least if you care about performance /and/ portability. It does however excel at implementing state machines, which gets you about one step away from a VM. But to do them (TCO/fibers) directly you’ve really gotta jump through some hoops. The easy solution would be to use goto… but C won’t let you jump to labels across functions (and for good reason, given they didn’t have its use as a /target/ language in mind).

So let’s see some options I’ve come across. We’ll start with fibers/coroutines as anyone bothering to read this will probably find them more useful (and many of the benefits of TCO can be had with them anyway)…

Protothreads. This is probably the single coolest bit of C I’ve ever seen. In a nutshell, it’s a set of macros which turn a function into an instance of Duff’s device where calls to yield form the boundries of the case statements.
ANSI C, fast, and ridiculously lightweight with a simple implementation (if you grok duff’s device). Includes a timer mechanism which can be leveraged for some semblance of preemtiveness.
Totally loses local variables between yeilds/timeouts. Its idea of “thread” is limited to a function, blocking calls further down the chain need extra work to deal with. Implementation details make use of switch dangerous.
Employment of static declarations where nessesary were succesful but known problems with volatile makes some other possibilites dangerous. Blocking issue wasn’t particularly investigated, the code generated was treating functions as actors — they were /all/ protothreads with timeouts around the few I/O related functions exposed to the HLL. Performance as such was surprisingly decent. The cons are trivial to deal with in the small code bases of its intend use — embedded devices.

Heavyweight OS-backed threads.
Has the obvious advantage of being preemptive and able to utilize multiple cores across multiple processors without any additional work.
Has the obvious disadvantage of being preemptive in a world where people are sadly often too fucking stupid to use something like STM and/or message passing, preferring instead to fool themselves into believing that they can handle manual locking while walking along the edge of non determinism bolted into a serial language rather than reserving it for cases where profiling actually shows it to be nessesary, which frankly for “typical desktop applications”, it probably isn’t.
Probably a bad idea on typical computers if you intend to spawn half a million threads but obviously useful for distributing fibers across cores.
Not ANSI C, but there are many libraries available that can ease portability in various ways such as Pthreads-w32 or a host of libs that simply abstract the native threading capabilities (if any) of whatever OS they’re running on.

Not a whole lot to say, essentially just saves/restores the stack pointer and registers for later restoration. ANSI C, fast, but takes more work to build coroutines out of since you can only jump /down/ the /current/ call stack, provided the matching setjmp is still /below/ it. This bites a lot of first attempts at such, as well as a few attempts at using it for exception handling. Chicken Scheme uses this when nessesary (stack is almost full) to jump back down to a trampoline after moving any live stack data to the heap in its implementation of TCO, folowing the ideas of Cheney on the M.T.A., from which it is free to generate the C code in continuation passing style, and thus provide call/cc, which gets you to whatever sort of threading or exception handling you want to build out of it.

What a lot of people wish setjmp/longjmp were (and many would despise…). Basically gives you one shot continutations by saving the entire stack (rather than just the stack pointer) to heap for later restoration. Unfortunately you need POSIX to get this. But if you have it fibers become nearly trivial. A lot of wrapping libs seem to happily use it where available with or instead of pthreads such as Gnu Pth etc…

Most others attempts I’ve seen at getting coop threading or fibers in C boil down to various wrappers around or reimplementations of setjmp.h or ucontext.h with their own various performance and portability characteristics, many of which were facinating. While not about threads per se, Implementation Strategies for First-Class Continuations is a great overview of differrent methods with more detail than I could ever be bothered with, where even just simpler one shot continuations gets you a stone’s throw from fibers and exceptions and a lot of the problems are the same.

Unfortunately most of this doesn’t really help me to get here, which is where I really want to be (and from which I get the rest practically for free). At least in any intuitive fashion. Surely I can fake it (as mentioned with protothreads) but it just feels… dirty. Of the other possibilities trampolining is common and simple to implement, and Cheney on the M.T.A.’s/Chicken’s way of doing it while also using the stack as part of the GC system I think is quite slick, but they also just don’t seem right. Alas, most the other attempts I’ve seen involve inline assembly, relying on compiler specific extentions (GCC and LLVM will do TCO at least on intel and ppc), or editing the resulting asm to replace calls with jumps before the final assembler/linking passes, which is comical at best, and none of which I’ll attempt. So it seems to me that my only real options are to by whatever means make each function a thread/actor at the C level (as done) or adopt some form of trampolining.

Or… quit worrying about any of it and just implement a VM, which C excells at. Or use any of a number of existing VMs (Neko looks particularly nice for my uses, other than its being under the lgpl which has a tendency to make my brain segfault whenever I attempt to parse it).

Really though, I wish C would just give me a fucking nonlocal goto to use when generating code. Or for someone to throw money at the C– project. And while I’m wishing, being able to depend upon the presence of ucontext.h when I’m writing C by hand would be fucking awesome. I think the C1x people should do that rather than a specific threading implementation, though for many uses the effect would be the same (assuming it retains state as I’d expect, anyway), and just as braindead for embedded devices.

Eh, ok, enough half incoherant rambling. I’ve been awake far too many hours and my stomach is threatening to start digesting my tongue if I don’t provide it with real food soon.

February 18, 2010 - Posted by | Annoyances, Programming, Scheme

No comments yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: