Alaminium

|al(j)ʊˈmɪnɪəm| mass noun - the chemical element of atomic number 13, a corrosion-resistant metal named after Habib Alamin, a computer programmer

Miranda: First Impressions of a Haskellite


Miranda, the precursor to Haskell, has recently been released as open source. I took the opportunity to try it out with a little side project I’ve been mulling over for a while.

These are my first impressions.

Syntax

Syntactically, Miranda has a similar feel to Haskell, as would be expected. There are some striking differences, some of which are quite limiting.

There is no allowance for _ as the first (or only) character of an identifier, which makes quickly marking out intentionally unused variables hard. There are no case expressions, so pattern matching of intermediate return values can only be done with helper functions. Comments are indicated with a || prefix and terminated with a newline; the boolean ‘or’ operator is \/.
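To make the missing case expression concrete, here is a minimal sketch in Haskell rather than Miranda. The names describe, describe', and render are made up for illustration. The second version mirrors the style Miranda pushes you towards, where the intermediate value is handed to a helper function that pattern matches on its argument.

```haskell
-- Hypothetical example: look up a key and describe the result.

-- With a case expression (available in Haskell, not in Miranda).
describe :: String -> [(String, Int)] -> String
describe key table =
  case lookup key table of
    Just n  -> key ++ " = " ++ show n
    Nothing -> key ++ " is unset"

-- Without case: a local helper pattern matches on the intermediate value.
describe' :: String -> [(String, Int)] -> String
describe' key table = render (lookup key table)
  where
    render (Just n) = key ++ " = " ++ show n
    render Nothing  = key ++ " is unset"
```

Both versions behave identically; the helper version just costs an extra name for every intermediate match.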

Equality checks can be done with either of == or =, and, as far as I can tell, these have identical behaviour in the context of conditional expressions. Talking of conditionals, Miranda offers no standalone if control flow construct; ifs are only used as a prefix to a conditional where a guard clause would be used, like so:

foo a = a, if a < 0
      = a * 2, otherwise
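For readers coming from Haskell, the same function written with Haskell guards would look roughly like this. The Int signature is an assumption made for the sketch; the Miranda version works on num.

```haskell
-- Haskell counterpart of the Miranda definition above:
-- the condition comes before the result instead of after it.
foo :: Int -> Int
foo a
  | a < 0     = a
  | otherwise = a * 2
```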

Whitespace is significant, and it feels noticeably less flexible to me than Haskell, but I can’t prove that. This is not necessarily a bad thing.

There are a fair few oddities. The one standout syntactic oddity to me is the operator-like functions, div (integer division) and mod (modulus), which act like operators, but don’t look like it. You use div, for example, like x div y; you can’t use it like div x y. If you partially apply it, it acts like an operator section, such as in map (div y) xs.
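As a point of comparison, this is how the same thing looks in Haskell, where div is an ordinary prefix function that needs backticks to be used infix. The sketch assumes that Miranda’s (div y) means “divide by y”, the way an operator section would; halves and halves' are hypothetical names.

```haskell
-- Infix use needs backticks in Haskell; the section (`div` 2) is \x -> x `div` 2.
halves :: [Int] -> [Int]
halves xs = map (`div` 2) xs

-- Prefix application is also fine in Haskell, unlike with Miranda's div.
halves' :: [Int] -> [Int]
halves' xs = map (\x -> div x 2) xs
```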

Semantics

Miranda has a lot of semantics you might expect from a precursor to Haskell.

The parametrically polymorphic Hindley-Milner type system with type inference is there, as are pattern matching, higher-order functions, automatic currying, non-strictness, and something I didn’t expect: list comprehensions.

Purity

Something which I really didn’t expect is that Miranda is full of impure workarounds. I knew that, even in Haskell, IO was not done with a monadic interface until Haskell 1.3, and I had a vague idea that before that, main in Haskell involved a list of IO requests. From what I can now gather, main wasn’t just a [Request], but a [Response] -> [Request]; requests could be made based on previous responses, and the laziness of the response list and request list made this possible.
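To illustrate how a [Response] -> [Request] main can react to earlier responses, here is a small pure simulation. It is only a sketch: Request, Response, Dialogue, greet, and runDialogue are simplified stand-ins invented for this example, not the actual definitions from the old Haskell reports.

```haskell
-- Simplified stand-ins for the real request/response types.
data Request  = GetLine | PutLine String
data Response = Line String | EndOfInput | Done

-- A program maps the (lazy) list of responses to a list of requests.
type Dialogue = [Response] -> [Request]

-- Ask for a line, then choose the next request based on the response.
-- The first request is emitted before any response is inspected.
greet :: Dialogue
greet responses =
  GetLine : case responses of
    Line name : _ -> [PutLine ("hello, " ++ name)]
    _             -> [PutLine "nobody there"]

-- A pure interpreter: feeds canned input lines in, collects output lines.
-- The responses are defined in terms of the requests and vice versa;
-- laziness is what lets this circular definition terminate.
runDialogue :: Dialogue -> [String] -> [String]
runDialogue program input = outputs
  where
    requests             = program responses
    (responses, outputs) = step requests input

    step [] _ = ([], [])
    step (PutLine s : rest) ins =
      let (rs, outs) = step rest ins in (Done : rs, s : outs)
    step (GetLine : rest) (l : ins) =
      let (rs, outs) = step rest ins in (Line l : rs, outs)
    step (GetLine : rest) [] =
      let (rs, outs) = step rest [] in (EndOfInput : rs, outs)
```

The important detail is that greet produces GetLine without looking at the response list; if it pattern matched on the responses first, the program would deadlock waiting for a response to a request it hadn’t made yet.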

I expected Miranda to do IO using a similar mechanic. I learnt that main in Miranda could be any type and it would behave in one of three ways, depending on its type:

I figured main as [sys_message] was the equivalent to Haskell’s pre-monadic IO main. At that time, I didn’t recall that main in Haskell pre-monadic IO was a function that took a list of responses and returned requests, not just a straight list of requests. I scoured the Miranda documentation trying to figure out how to use the return value of a sys_message request made earlier to determine what sys_message request to make next.

After noticing that the possible sys_message values are all output-oriented and are documented in section 31/2 of the Miranda manual (“UNIX/Miranda system interface” ➤ “Output to UNIX files etc”), it dawned on me that this isn’t possible, and that the only way to write programs that take input is to use the copious magic impure keywords and functions provided for that purpose: read, readvals (a.k.a. $+), $*, $-, filemode, filestat, system, getenv (notably, no setenv).

I’m getting num to these hacks

Integers and floats are different types on the value level but not the type level. There’s a num type, which 1 and 1.0 share, but integer 1 = True and integer 1.0 = False.
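One way to picture this from the Haskell side is a single type with two kinds of value, where telling them apart is a run-time check rather than something the type checker does. This is only a mental model, not how mira implements num; MirandaNum, Whole, Floating, and isInteger are hypothetical names.

```haskell
-- Hypothetical model of Miranda's num: one type, two kinds of value.
data MirandaNum = Whole Integer | Floating Double

-- Analogue of Miranda's integer predicate: it inspects the value,
-- because the type alone cannot distinguish 1 from 1.0.
isInteger :: MirandaNum -> Bool
isInteger (Whole _)    = True
isInteger (Floating _) = False
```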

Ad-hoc Polymorphism

I haven’t yet gotten to the stage where I absolutely can’t make do without typeclasses or some analogous feature. I’ve had a look at the documentation, and abstype, %free, and %include look like half-promising candidates to look into further.

In the meantime, so far, I’ve had to write the functor, applicative, and monad functions separately for each type I use them for (for lists and custom types).
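In Haskell terms, working without typeclasses looks something like the sketch below: every type gets its own separately named copy of each operation. All of the names (Option, mapList, mapOption, bindList, bindOption) are made up for illustration.

```haskell
-- A custom optional type, standing in for any user-defined type.
data Option a = None | Some a

-- "fmap", written once per type.
mapList :: (a -> b) -> [a] -> [b]
mapList f xs = [f x | x <- xs]

mapOption :: (a -> b) -> Option a -> Option b
mapOption _ None     = None
mapOption f (Some x) = Some (f x)

-- ">>=", written once per type.
bindList :: [a] -> (a -> [b]) -> [b]
bindList xs f = concat (mapList f xs)

bindOption :: Option a -> (a -> Option b) -> Option b
bindOption None     _ = None
bindOption (Some x) f = f x
```

The cost is less in the definitions themselves than in the fact that no code can be written once and reused across both types.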

Miranda has a magic show function, which returns a string representation of any value given to it.

Tooling

There’s almost no tooling to speak of.

Package management

There is no package manager nor ecosystem of packages I’m aware of, and I haven’t yet found any community.

Syntax highlighting

Vim recognises files with a .m extension as Matlab files by default. I found a Miranda syntax highlighter at github.com/zlahham/vim-miranda which comes with an ftdetect for Miranda and sets files with .m extension to be detected as Miranda files.

It works okay, but it doesn’t cover all of Miranda’s syntax. I noticed that single quotes change the syntax highlighting immediately, even if they are being validly used in the middle or end of an identifier.

mira

mira is a script-oriented interpreter. The documentation refers to Miranda programs as scripts everywhere. There is no option to compile to native code, only bytecode which is then either executed or loaded into a REPL. Miranda, the language, is pretty small, too, and well suited for scripting. To understand how small, this is the full list of values and types in the standard environment:

Appendfile Appendfileb Closefile Exit Stderr Stdout Stdoutb System Tofile Tofileb abs and arctan cjustify code concat const converse cos decode digit drop dropwhile e entier error exp filemode filestat filter foldl foldl1 foldr foldr1 force fst getenv hd hugenum id index init integer iterate last lay layn letter limit lines ljustify log log10 map map2 max max2 member merge min min2 mkset neg numval or pi postfix product read readb rep repeat reverse rjustify scan seq showfloat showhex shownum showoct showscaled sin snd sort spaces sqrt subtract sum sys_message system take takewhile tinynum tl transpose undef until zip zip2 zip3 zip4 zip5 zip6

I believe the only identifiers missing from that list are the primitive types bool, char, and num. The only non-primitive type is sys_message, which has the constructors Appendfile Appendfileb Closefile Exit Stderr Stdout Stdoutb System Tofile Tofileb already listed above.

Running a Miranda file with a .m extension results in mira creating matching bytecode versions with .x extensions for each file it compiles (the entrypoint and included files). To solve this problem and just allow single executable scripts, I created mirapack, only to later find out that mira doesn’t create these bytecode files if the Miranda file doesn’t end in a .m extension, but mira -exec still works (the file can have no extension or an extension that’s not .m). I had initially thought the bytecode files were created regardless of extension because the miralib directory came with a prelude and preludx file (these contain internal implementation for parts of stdenv.m).

Unfortunately, running mira (without -exec) on a file that doesn’t exist enters a REPL session with that file as the script, and if the file doesn’t have a .m extension, mira will add a .m extension to the filename it sets as the session’s script, even if the file exists without the .m extension. mirapack is also still useful for when you want to bundle a small project with %include directives into a single executable.

One small quality of life improvement I’d make to mira would be to print error messages even after a repeated build of the same files.

Size and performance

Compiler and standard library

As I pointed out just a little earlier, Miranda comes with a compact standard library. On my machine, the mira binary itself clocks in at ~236K, a little over 10% (~11%) of GHC 8.4.4’s binary size of ~2M; this is more than I expected considering how many fewer features it has. Still, stdenv.m with documentation stripped out (it’s a literate file) is ~6K, stdenv.x is ~5.8K, and GHC’s base-4.11.1.0 (compiled) is ~92M.

Far more important than size to me is performance. This is where the lack of features becomes a feature in itself. I’ve been using GHC for a long time, and I started to get frustrated with the compile times. This blog runs on Hakyll, and every time I push a new commit, the CD build takes up to ~40 minutes (my CD provider doesn’t cache builds).

I don’t mean to put down the work of GHC maintainers at all. It’s an incredible compiler with a lot of high-impact benefits and it’s stuck around where other Haskell98 or Haskell2010 compilers haven’t, but being the only viable Haskell compiler around, it also has to work for a broad range of use cases. Advanced type system features of GHC may affect compile times despite not being active for a given project.

I’ve even gone as far as to learn OCaml for its fast compilation times, despite having avoided it for a long time due to its off-putting (to me) syntax; snappiness in software is always a priority for me, and after a little while, my eyes got used to the syntax. Alas, OCaml doesn’t mandate a pure functional style (though it does encourage it). Of course, even Haskell has escape hatches like unsafePerformIO, but it goes much further than OCaml to discourage their use (such as having main be a pure description of an impure program built up with pure functions).

Runtime

Since mira is interpreted, there’s no native code output to measure the performance or size of, so build performance and runtime performance are essentially the same.

However, mira comes with a runtime, and so does GHC. GHC comes with an amazing green threads implementation and STM, to name just a couple of genuinely widely useful features, which can have a noticeable impact on GHC’s startup times.

It may sound silly to care about a ~300ms startup time when using runhaskell, for example (which is what you want when you’re writing scripts), but I feel it, and snappiness isn’t just a problem for hello world in Haskell. I’ve already mentioned that this blog, which runs on Hakyll, takes ~30–40 minutes for the CD build to publish a post (installing LaTeX to build my CV is part of that, although before that, the blog took even longer to build: ~55 minutes, in fact).

The culture around Haskell doesn’t seem to care enough about snappiness to make it a priority. That’s perfectly fine, of course; it’s just not for me. I can live with it in larger projects, where tools like ghcid or stack build --fast are a big help (though they only help so much), but for small scripts, it’s subjectively painful for me.

Writing small purely functional scripts is a big reason I wanted to try Miranda — small scripts I could put into my home bin directory for dotfiles glue and so on. In this context, startup time matters more than usual. I think pure functional programming lends itself well to scripting, but for the life of me, I can’t figure out a practical way to do it with today’s languages and tooling.


In the end, what I was looking for in Miranda was a pure functional programming language that I could use for scripting. From what little I’ve used Miranda, it hits the scriptability criterion, but not the pure functional criterion.

I can’t say I was expecting much. Even if it had been pure, a main :: [Response] -> [Request] would’ve been a massive pain in the ass. That said, I was hoping to be pleasantly surprised.