Miranda: First Impressions of a Haskellite
Miranda, the precursor to Haskell, has recently been released as open source. I took the opportunity to try it out with a little side project I’ve been mulling over for a while.
These are my first impressions.
Syntax
Syntactically, Miranda has a similar feel to Haskell, as would be expected. There are some striking differences, some of which are quite limiting.
There is no allowance for _ as the first (or only) character of an identifier, which makes quickly marking out intentionally unused variables hard. There are no case expressions, so pattern matching of intermediate return values can only be done with helper functions. Comments are indicated with a || prefix and terminated with a newline; the boolean ‘or’ operator is \/.
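To illustrate the point about case expressions, here is a rough sketch in Haskell (not Miranda) of the same function written both ways. The names describeWithCase, describeWithHelper, and render are made up for this example; the second version is the shape that a language without case expressions pushes you towards.

```haskell
-- With a case expression, the intermediate result of `lookup`
-- is pattern matched in place.
describeWithCase :: [(String, Int)] -> String -> String
describeWithCase env key =
  case lookup key env of
    Just n  -> key ++ " = " ++ show n
    Nothing -> key ++ " is unbound"

-- Without case expressions, the match has to move into a helper
-- function defined by pattern-matching equations.
-- (`render` is a hypothetical helper name.)
describeWithHelper :: [(String, Int)] -> String -> String
describeWithHelper env key = render (lookup key env)
  where
    render (Just n) = key ++ " = " ++ show n
    render Nothing  = key ++ " is unbound"

main :: IO ()
main = do
  let env = [("x", 1), ("y", 2)]
  putStrLn (describeWithCase env "x")   -- prints: x = 1
  putStrLn (describeWithHelper env "z") -- prints: z is unbound
```

Both versions behave identically; the only difference is where the pattern match lives.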
Equality checks can be done with either == or =, and, as far as I can tell, these behave identically in the context of conditional expressions. Talking of conditionals, Miranda offers no standalone if control flow construct; an if only appears as a prefix to the condition in a position where Haskell would use a guard clause, like so:
foo a = a, if a < 0
      = a * 2, otherwise
Whitespace is significant, and it feels noticeably less flexible to me than Haskell, but I can’t prove that. This is not necessarily a bad thing.
There are a fair few oddities. The standout syntactic oddity to me is the pair of operator-like functions, div (integer division) and mod (modulus), which act like operators but don’t look like operators. You use div, for example, like x div y; you can’t use it like div x y. If you partially apply it, it acts like an operator section, such as in map (div y) xs.
Semantics
Miranda has a lot of semantics you might expect from a precursor to Haskell.
The parametrically polymorphic Hindley-Milner type system with type inference is there, as are pattern matching, higher-order functions, automatic currying, non-strictness, and something I didn’t expect: list comprehensions.
Purity
Something which I really didn’t expect is that Miranda is full of impure workarounds. I knew that, even in Haskell, IO was not done with a monadic interface until Haskell 1.3, and I had a vague idea that before that, main in Haskell involved a list of IO requests. From what I can now gather, main wasn’t just a [Request], but a [Response] -> [Request]; requests could be made based on previous responses, and the laziness of the response list and request list made this possible.
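To make the [Response] -> [Request] idea concrete, here is a small pure model of it in modern Haskell. This is only a sketch: the Request and Response constructors below (GetLine, PutLine, Line, Done) are simplified stand-ins I made up, not the actual Haskell 1.2 definitions, and runDialogue is a toy interpreter that feeds in a list of input lines rather than talking to the real world.

```haskell
-- Simplified, hypothetical request/response types (not the Haskell 1.2 ones).
data Request  = GetLine | PutLine String
data Response = Line String | Done

-- A program maps the (lazy) list of responses to a list of requests.
type Dialogue = [Response] -> [Request]

-- Toy interpreter: answers each request in order, taking GetLine input
-- from the given list and collecting PutLine output. The program is fed
-- the very response list that is being built from its own requests,
-- which only works because both lists are produced lazily.
runDialogue :: Dialogue -> [String] -> [String]
runDialogue prog input = outputs
  where
    requests             = prog responses
    (responses, outputs) = go requests input

    go [] _ = ([], [])
    go (PutLine s : rs) ins =
      let (resps, outs) = go rs ins
      in  (Done : resps, s : outs)
    go (GetLine : rs) ins =
      let (line, rest)  = case ins of
                            []       -> ("", [])
                            (l : ls) -> (l, ls)
          (resps, outs) = go rs rest
      in  (Line line : resps, outs)

-- The third request depends on the response to the second one.
-- The lazy pattern (~) delays inspecting the responses until `name`
-- is actually needed, by which point the response exists.
greeter :: Dialogue
greeter ~(_ : Line name : _) =
  [ PutLine "Name?"
  , GetLine
  , PutLine ("Hello, " ++ name)
  ]

main :: IO ()
main = mapM_ putStrLn (runDialogue greeter ["Ada"])
-- prints:
-- Name?
-- Hello, Ada
```

If greeter matched on its argument strictly, the interpreter would need a response before the program had issued its first request, and the whole thing would loop.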
I expected Miranda to do IO using a similar mechanic. I learnt that main in Miranda could be any type and it would behave in one of three ways, depending on its type:
- if it were a [sys_message] (which is essentially a list of IO requests), Miranda would evaluate each sys_message in turn,
- if it were a string, Miranda would print it directly to the console, and
- if it were anything else, Miranda would apply show to it and print the resulting string.
I figured main as [sys_message] was the equivalent to Haskell’s pre-monadic IO main. At that time, I didn’t recall that main in Haskell pre-monadic IO was a function that took a list of responses and returned requests, not just a straight list of requests. I scoured the Miranda documentation trying to figure out how to use the return value of a sys_message request made earlier to determine what sys_message request to make next.
After noticing that the possible sys_message values were all output-oriented and were documented in section 31/2 of the Miranda manual (“UNIX/Miranda system interface” ➤ “Output to UNIX files etc”), it dawned on me that this isn’t possible, and that the only way to write programs that get input is to use the copious number of magic impure keywords and functions for that purpose: read, readvals (a.k.a. $+), $*, $-, filemode, filestat, system, getenv (notably, no setenv).
I’m getting num to these hacks
Integers and floats are distinguished at the value level but not at the type level. There’s a single num type, which 1 and 1.0 share, but integer 1 = True and integer 1.0 = False.
Ad-hoc Polymorphism
I haven’t yet gotten to the stage where I absolutely can’t make do without typeclasses or some analogous feature. I’ve had a look at the documentation, and abstype, %free, and %include look like half-promising candidates to look into further.
In the meantime, I’ve had to write the functor, applicative, and monad functions separately for each type I use them with (lists and custom types).
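For a sense of what that looks like, here is a sketch in Haskell (not Miranda) that deliberately avoids typeclasses. Result is a hypothetical custom type invented for this example, and the function names are my own; each type gets its own copy of map and bind rather than sharing fmap and >>=.

```haskell
-- A hypothetical custom type, standing in for "custom types" above.
data Result a = Failure String | Success a
  deriving (Show, Eq)

-- Functor-style and monad-style functions for lists...
mapList :: (a -> b) -> [a] -> [b]
mapList _ []       = []
mapList f (x : xs) = f x : mapList f xs

bindList :: [a] -> (a -> [b]) -> [b]
bindList xs f = concat (mapList f xs)

-- ...and the same again, by hand, for Result.
mapResult :: (a -> b) -> Result a -> Result b
mapResult _ (Failure e) = Failure e
mapResult f (Success x) = Success (f x)

bindResult :: Result a -> (a -> Result b) -> Result b
bindResult (Failure e) _ = Failure e
bindResult (Success x) f = f x

main :: IO ()
main = do
  print (bindList [1, 2, 3 :: Int] (\x -> [x, x * 10]))
  -- prints: [1,10,2,20,3,30]
  print (bindResult (Success (4 :: Int)) (\x -> mapResult (+ 1) (Success x)))
  -- prints: Success 5
```

The cost is not the individual definitions, which are short, but that any code written against one of them cannot be reused for the other.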
Miranda has a magic show function, which returns a string representation of any value given to it.
Tooling
There’s almost no tooling to speak of.
Package management
There is no package manager nor ecosystem of packages I’m aware of, and I haven’t yet found any community.
Syntax highlighting
Vim recognises files with a .m extension as Matlab files by default. I found a Miranda syntax highlighter at github.com/zlahham/vim-miranda which comes with an ftdetect for Miranda and sets files with a .m extension to be detected as Miranda files.
It works okay, but it doesn’t cover all of Miranda’s syntax. I noticed that single quotes change the syntax highlighting immediately, even if they are being validly used in the middle or end of an identifier.
mira
mira is a script-oriented interpreter. The documentation refers to Miranda programs as scripts everywhere. There is no option to compile to native code, only bytecode which is then either executed or loaded into a REPL. Miranda, the language, is pretty small, too, and well suited for scripting. To understand how small, this is the full list of values and types in the standard environment:
Appendfile Appendfileb Closefile Exit Stderr Stdout Stdoutb System Tofile Tofileb abs and arctan cjustify code concat const converse cos decode digit drop dropwhile e entier error exp filemode filestat filter foldl foldl1 foldr foldr1 force fst getenv hd hugenum id index init integer iterate last lay layn letter limit lines ljustify log log10 map map2 max max2 member merge min min2 mkset neg numval or pi postfix product read readb rep repeat reverse rjustify scan seq showfloat showhex shownum showoct showscaled sin snd sort spaces sqrt subtract sum sys_message system take takewhile tinynum tl transpose undef until zip zip2 zip3 zip4 zip5 zip6
I believe the only identifiers missing from that list are the primitive types bool, char, and num. The only non-primitive type is sys_message, which has the constructors Appendfile Appendfileb Closefile Exit Stderr Stdout Stdoutb System Tofile Tofileb already listed above.
Running a Miranda file with a .m extension results in mira creating matching bytecode versions with .x extensions for each file it compiles (the entrypoint and included files). To work around this and allow single executable scripts, I created mirapack, only to later find out that mira doesn’t create these bytecode files if the Miranda file doesn’t end in a .m extension, while mira -exec still works (the file can have no extension or an extension that’s not .m). I had initially thought the bytecode files were created regardless of extension because the miralib directory came with a prelude and a preludx file (these contain internal implementation for parts of stdenv.m).
Unfortunately, running mira (without -exec) on a file that doesn’t exist enters a REPL session with that file as the script, and if the file doesn’t have a .m extension, mira will add a .m extension to the filename it sets as the session’s script, even if the file exists without the .m extension. mirapack is also still useful for when you want to bundle a small project with %include directives into a single executable.
One small quality of life improvement I’d make to mira would be to print error messages even after a repeated build of the same files.
Size and performance
Compiler and standard library
As I pointed out just a little earlier, Miranda comes with a compact standard library. On my machine, the mira binary itself clocks in at ~236K, a little over 10% (~11%) of GHC 8.4.4’s binary size of ~2M; this is more than I expected considering how many fewer features it has. Still, stdenv.m with documentation stripped out (it’s a literate file) is ~6K. stdenv.x is ~5.8K, and GHC’s base-4.11.1.0 (compiled) is ~92M.
Far more important than size to me is performance. This is where the lack of features becomes a feature in itself. I’ve been using GHC for a long time, and I started to get frustrated with the compile times. This blog runs on Hakyll, and every time I push a new commit, the CD build takes ~40 minutes (my CD provider doesn’t cache builds).
I don’t mean to put down the work of GHC maintainers at all. It’s an incredible compiler with a lot of high impact benefits and it’s stuck around where other Haskell98 or Haskell2010 compilers haven’t, but being the only viable Haskell compiler around, it also has to work for a broad range of use cases. Advanced type system features of GHC may affect compile times even when a given project doesn’t use them.
I’ve even gone as far as to learn OCaml for its fast compilation times, despite having avoided it for a long time due to its off-putting (to me) syntax; after a little while, my eyes got used to the syntax, but snappiness in software is always a priority for me. Alas, OCaml doesn’t mandate a pure functional style (though it does encourage it). Of course, even Haskell has escape hatches like unsafePerformIO, but it goes much further than OCaml to discourage their use (such as having main be a pure description of an impure program built up with pure functions).
Runtime
Since mira is interpreted, there’s no native code output to measure the performance or size of, so build performance and runtime performance are essentially the same.
However, mira comes with a runtime, and so does GHC. GHC comes with an amazing green threads implementation and STM, to name just a couple of genuinely widely useful features, which can have a noticeable impact on GHC’s startup times.
It may sound silly to care about a ~300ms startup time when using runhaskell, for example (which is what you want when you’re writing scripts), but I feel it, and snappiness isn’t just a problem for hello world in Haskell. I’ve already mentioned that this blog running on Hakyll takes ~30–40 minutes for the CD build to publish a post (an installation of LaTeX is involved in that, to build my CV, although before that the blog took even longer to build: ~55 minutes, in fact).
The culture around Haskell doesn’t seem to care enough about snappiness to make it a priority. That’s perfectly fine, of course, it’s just not for me. I can live with it in larger projects, using tools like ghcid or stack build --fast, but for small scripts, it’s subjectively painful for me. Even for larger projects, where ghcid or stack build --fast are a big help, they only help so much.
Writing small purely functional scripts is a big reason I wanted to try Miranda: small scripts I could put into my home bin directory for dotfiles glue and so on. In this context, startup time matters more than usual. I think pure functional programming lends itself well to scripting, but for the life of me, I can’t figure out a practical way to do it with today’s languages and tooling.
In the end, what I was looking for in Miranda was a pure functional programming language that I could use for scripting. From what little I’ve used Miranda, it hits the scriptability criterion, but not the pure functional criterion.
I can’t say I was expecting much. Even if it had been pure, a main :: [Response] -> [Request] would’ve been a massive pain in the ass. That said, I was hoping to be pleasantly surprised.