I’m feeling all mad-scientistic tonight, sipping coke, with 30 tabs open in Opera :)

On my journey into the weird world of functional programming on the .NET CLR, I continue to stumble upon interesting stuff: papers, articles, interviews, each worthy of hours of dedication. Alas, my day still has only 24 hours (I need to work around that somehow), so for the time being I can’t do much more than glance over everything.

The questions I was trying to answer during the last few days were (and to a great degree still are):

  • What is the cost of using threads on the CLR and how does F# handle them?
  • Is F# fast enough to handle computationally-intensive tasks?
  • Is F# suited for distributed programming?

The cost of using threads on the CLR

The answer to that question is pretty clear, fortunately. This thread on the Joel on Software Discussion Board pretty much deals with the question from my last post: .NET/F# vs. Erlang. The initial question in the thread was along the lines of “I want to do computation with a great bunch (n x 1000) of agents, represented by processes, communicating via messages. Should I use F# or Erlang?”

The first answer was already very helpful to me:

My understanding is that the .net threads are very heavy weight (128K) when compared to erlang processes (300-400 bytes).

So, no native CLR threads for larger numbers of processes. After a bit of thinking though, this turns out not to be a problem. What I am planning can easily (perhaps even best) be achieved with a few long-running threads. Yet the need for many agents and lots of communication might turn up later, and I don’t want to lock myself into something that’ll leave me in the lurch.
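To put the quoted figures in perspective, here is a quick back-of-the-envelope calculation (in Python, simply because it makes a handy calculator). The per-thread and per-process sizes are taken straight from the forum quote and are not something I measured; the agent counts are just my guess at what “n x 1000” might mean.

```python
# Per-agent memory figures as quoted in the forum thread (not measured here).
CLR_THREAD_BYTES = 128 * 1024   # "128K" per .NET thread
ERLANG_PROCESS_BYTES = 400      # upper end of "300-400 bytes"


def total_mib(agents, bytes_per_agent):
    """Total memory in MiB for `agents` agents at `bytes_per_agent` each."""
    return agents * bytes_per_agent / (1024 * 1024)


# Hypothetical agent counts in the "n x 1000" range from the original question.
for agents in (1_000, 10_000, 100_000):
    clr = total_mib(agents, CLR_THREAD_BYTES)
    erl = total_mib(agents, ERLANG_PROCESS_BYTES)
    print(f"{agents:>7} agents: CLR threads ~{clr:8.1f} MiB, "
          f"Erlang processes ~{erl:5.1f} MiB")
```

If those numbers hold, a thousand thread-backed agents already cost around 125 MiB, while the same number of Erlang processes stays well under 1 MiB.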

Further down the thread, someone posted links to a Microsoft project called the Concurrency and Coordination Runtime (CCR), which is used in the Robotics Studio and fits precisely this problem.

I have only had a peek at the CCR, but if I understand the introduction of Satnam Singh’s paper An Asynchronous Messaging Library for C# correctly, it’s a library that runs on the unmodified CLR and provides green threads (or something similar) via continuation-passing style (CPS), with good performance.
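To get a feel for what “many agents, few threads” could look like, here is a rough sketch of the general idea. This is not the CCR API, which I haven’t used yet. It is written in Python, with generators standing in for the suspended continuations, and every name in it (run_agents, counter_agent, make_ring) is made up for illustration.

```python
from collections import deque


def run_agents(agents, initial_messages):
    """Run generator-based agents on a single OS thread until no messages remain.

    agents: dict mapping a name to a primed generator. Each generator receives
    a message via send() and yields a list of (recipient, message) pairs.
    Returns the number of messages delivered.
    """
    queue = deque(initial_messages)   # pending (recipient, message) pairs
    delivered = 0
    while queue:
        recipient, message = queue.popleft()
        outgoing = agents[recipient].send(message)  # resume the suspended agent
        queue.extend(outgoing)
        delivered += 1
    return delivered


def counter_agent(name, next_name, log):
    """Hypothetical agent: logs each number, then forwards it decremented until zero."""
    outgoing = []
    while True:
        n = yield outgoing            # suspend here until a message arrives
        log.append((name, n))
        outgoing = [(next_name, n - 1)] if n > 0 else []


def make_ring(size, log):
    """Build `size` agents in a ring; each one forwards to its neighbour."""
    agents = {}
    for i in range(size):
        gen = counter_agent(i, (i + 1) % size, log)
        next(gen)                     # prime: run up to the first yield
        agents[i] = gen
    return agents


log = []
ring = make_ring(10_000, log)         # ten thousand agents, no extra threads
delivered = run_agents(ring, [(0, 25)])
print(delivered, log[:3], log[-1])
```

Each agent here is just a suspended generator, so an idle agent costs a small heap object instead of a thread stack. A real library would of course add scheduling across a thread pool, error handling and so on.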

I’m definitely gonna look deeper into this, and I hope there’s a way to test it out. Hey cool, it’s free for non-commercial use!

Performance of F# and distributed programming

Nice performance always seems to be among the major points OCaml proponents bring up, and it was one of the major reasons for me to consider F#/OCaml. Hard numbers on F# performance weren’t that easy to find, though. The official F# page only claims it has “a performance profile like that of C#”. This article, however, turned out to be quite informative:

Like OCaml, F# is excellent at writing tools for symbolic programming. Examples of this type of programming include static analysis, compilers, and other sophisticated analyses over structured data terms. Remember using Maple or Matlab? How about Mathematica? That’s what writing F# is like. If F# is to be used to analyze various complex problems, then harnessing it to a virtual supercomputer seemed like a logical thing. Slated to learn two new technologies, I was hoping I hadn’t bit off more than I could chew.

(emphasis mine)

This is exactly my situation. The article describes an experiment in distributing a computationally intensive algorithm over multiple machines using Alchemi, a .NET grid computing framework. The algorithm was deployed first in pure C#, then in F# wrapped in C#, and finally in pure F#. The results?

Without any optimization the F# implementation took 453 ms. C# finished up in 62 ms. Turning optimization on caused the C# results to drop to 46 ms. And, most shocking of all, the F# was dead even at 46 ms! Apparently, the F# compiler does an excellent job optimizing the code. Continuing on, I ran the entire C# and F# grid applications against one another. Without any optimization the C# code took 3.9 seconds to calculate 100 digits of pi while F# only took 2.8 seconds. Optimizing this application showed that F# would finish in 1.5 seconds while C# would not complete any faster than 2.6 seconds.

Nice!
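To make the quoted numbers easier to compare, here is a quick ratio calculation (again in Python, used purely as a calculator). All timings are copied from the article’s quote above; I haven’t reproduced any of them myself, and the dictionary and function names are just mine.

```python
# Timings as quoted from the article (not re-measured here).
first_test_ms = {                 # first comparison, in milliseconds
    "F# unoptimized": 453,
    "C# unoptimized": 62,
    "F# optimized": 46,
    "C# optimized": 46,
}
grid_app_s = {                    # full grid application, 100 digits of pi, in seconds
    "C# unoptimized": 3.9,
    "F# unoptimized": 2.8,
    "C# optimized": 2.6,
    "F# optimized": 1.5,
}


def ratio(slower, faster):
    """How many times longer `slower` took compared to `faster`."""
    return slower / faster


# Effect of turning on optimization for F# in the first test.
fsharp_gain = ratio(first_test_ms["F# unoptimized"], first_test_ms["F# optimized"])
# C# vs. F# for the optimized grid application.
grid_gap = ratio(grid_app_s["C# optimized"], grid_app_s["F# optimized"])

print(f"F# optimization gain in the first test: {fsharp_gain:.1f}x")
print(f"Optimized grid app, C# time / F# time:  {grid_gap:.2f}x")
```

So going by the article’s figures, optimization made the F# code almost ten times faster in the first test, and the optimized C# grid application took roughly 1.7 times as long as the F# one.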

Next steps

I’m definitely gonna work my way through one of the excellent OCaml tutorials online.

The ones that seem recommended most often are

More Stuff

As always, browsing the web got me sidetracked here and there instead of keeping me focused on F#.

I want to share the most interesting discoveries with you.

  • Threading in C# by Joseph Albahari is a pretty profound free online book about, well, threading in C#
  • Brian Beckman: Monads, Monoids, and Mort is a Channel 9 interview in which Brian tells a bit of the history of his career, chats about functional programming, nicely explains monoids and monads along the way and makes a pretty shocking statement about the relation between Visual Basic and C# at Microsoft Research.
  • F# News is a blog that deals solely with, who’d have guessed it, F#
  • F# for Visualization is a book from Flying Frog. The page links to an impressive demo illustrating F#’s performance.
  • Meet Bob the Monadic Lover seems to be a very entertaining introduction to monads. I haven’t read it yet though, I’m still fighting my way through…
  • Category Theory for Beginners: great slides from an introductory course on category theory, which focus on its motivations and practical uses for computer scientists.

Holy crap, that was a long post.