
Two compiler implementations

I think I was reading about Oberon –the programming language, not the operating system– because of a post on Hacker News, and I ended up on the page of Niklaus Wirth.

Wirth was the chief designer of the programming languages Euler (1965), PL360 (1966), ALGOL W (1966), Pascal (1970), Modula (1975), Modula-2 (1978), Oberon (1987), Oberon-2 (1991), and Oberon-07 (2007) –according to Wikipedia–, among other things.

That is truly remarkable. So I was looking around and found that his website has PDFs describing the implementation of two compilers.

The code is very readable, and it is not just because I had to write some Pascal back at university –Oberon is a descendant of that language–. There are things missing, but it is mostly refinement of error reporting and things like that. Everything that is important is there.

For example, the code generation for multiplication in Oberon-0:

PROCEDURE MulOp*(VAR x, y: Item);   (* x := x * y *)
BEGIN
  (* both operands constant: fold the multiplication at compile time *)
  IF (x.mode = Const) & (y.mode = Const) THEN x.a := x.a * y.a
  (* multiplying by 2: strength-reduce to a logical shift left *)
  ELSIF (y.mode = Const) & (y.a = 2) THEN load(x); Put1(Lsl, x.r, x.r, 1)
  (* one constant operand: multiply register by immediate *)
  ELSIF y.mode = Const THEN load(x); Put1(Mul, x.r, x.r, y.a)
  ELSIF x.mode = Const THEN load(y); Put1(Mul, y.r, y.r, x.a); x.mode := Reg; x.r := y.r
  (* general case: both operands in registers *)
  ELSE load(x); load(y); Put0(Mul, RH-2, x.r, y.r); DEC(RH); x.r := RH-1
  END
END MulOp;

It even shows constant folding –when both operands are constants– and a bit of strength reduction –multiplying by 2 becomes a shift–. The resulting code may not be super-optimized, but it is a complete example at an easy-to-understand size.

I skimmed through the “Compiler Construction” book during my holidays and it is not too long. I’m looking forward to reading it properly when I have some time.

Reading off-line

Last night I was finally reading A Scheme Primer, after weeks of it sitting open in a tab on my phone. Firefox for Android reloads the tab every time I select it, so I thought: why not have a local copy that I can read even if I’m offline or the page is not available?

Well, it turns out that Firefox can’t save the page. Not as HTML, not as a complete website, not as a PDF; it basically lacks the functionality. Which is a trend with Mozilla, to be honest. Chrome is terrible from the point of view of privacy and I don’t want to play into Google’s hands, but Firefox is great at keeping things working just barely enough to ensure they are never successful, which is a real shame.

I looked around and I found a developer saying back in 2012 that the functionality would be added, but they hadn’t had time yet. I’m not holding my breath.

So I tried Chrome, which does allow saving the page, but when I try to open it using the “Files” app, it doesn’t know how to open the file and wants to search for an app to install. Brilliant.

You can print to a PDF –with Chrome; in Firefox I can’t find how to do it, so it may not be possible–, but a PDF is not ideal as the text doesn’t reflow to use the phone screen efficiently. There must be a better way, right?

It isn’t an out-of-the-box solution, but this is what I ended up doing:

  1. Download the page as HTML on my desktop (using Firefox).
  2. Convert the page to epub format using the always powerful pandoc:
     pandoc --from=html --to=epub 'A Scheme Primer.html' -o scheme-primer.epub
  3. Install ReadEra –which is free, but I’m considering buying the premium version because it is a great reader!–.
  4. Copy the epub over, and now I have the Scheme tutorial to read off-line anytime I want.

I don’t know if it is the website strictly following standards without much CSS, or pandoc being amazing, or the ebook reader being great, but the result is much better than I was expecting. It feels like an actual ebook.

When I’m away from my desktop PC I already have two apps from the public library to read magazines and ebooks, listen to audiobooks, and whatnot, and I don’t use them much because… I guess what I wanted was to read about Scheme instead. I’m not a heavy app user: other than Podcast Addict and email, I tend to use the browser for everything else.

I’m glad I still have some options but, in reality, the ideal would have been for Firefox on Android to be a better browser.

Playing The Minish Cap

Shortly after I blogged about playing Pokémon, we stopped. We got some Pokémon past level 20, but the kids lost interest –I guess the battles are a bit samey–.

So we started something completely different. Or maybe not that different: it is not an RPG, but Lenna’s Inception –from our pile of unfinished games– takes a lot of inspiration from games like this one. We are playing ‘The Legend of Zelda: The Minish Cap’, and it looks like we are close to finishing it –according to an online guide, at least–.

This is one of the Capcom Zeldas from 2004, and it is heavy on puzzles. The gimmick of this one is a magical talking cap that can shrink Link to the size of the Minish, a type of tiny magical creature that only children can see.

Boss

I didn't know how to deal with this boss

The game is your prototypical Zelda: it introduces new items to solve new types of puzzles, which let you explore new areas to progress the story. It also has a good number of small side quests that can be unlocked by “fusing Kinstones” with the inhabitants of Hyrule. And it includes rescuing Zelda from a curse, of course.

The game looks fantastic on the GBA and we are playing it emulated on a big screen. The bosses are interesting and fun, and considering that this is probably the furthest I have gotten in a Zelda game, I would say it is not too difficult –even if sometimes it is very obtuse, but in those cases I would suggest checking a guide; we look at the screenshots and that’s enough to give us a spoiler-free experience–.

A new key

This key included a mini-puzzle later on

In general I would say the game has aged pretty well. I recall some complaints about swapping items too much, including in the sequences to defeat some of the bosses, and sometimes it is not clear at all what to use or where –like using a bomb in a specific place; how classic!–, but considering Minish Cap was designed to be played on a handheld and we have the Internet to get unstuck, in my opinion this game is very enjoyable by today’s standards and I totally recommend it.

Hopefully we won’t lose interest and we will finish it, for a change!

New project: micro2

Since the repo is not on GitHub, which basically wipes out any chance of anyone discovering the project accidentally –at least there it would pop up in my followers’ feeds–, I thought I would use this blog to make it “public”.

I’m learning Haskell and I have decided to use the same project I used to re-learn Go, which again is huge, and it is unlikely I will finish it, but there you go! I’m focusing on the compiler part, as I already scratched the interpreter itch.

The public repo is accessible at micro2-lang and the usual disclaimers apply –especially as I’m a Haskell newbie–. I know the name isn’t great; I may change it later on.

So far it has been a lot of fun. I’m using parser combinators via parsec and it is amazing how much functionality I get with a tiny fraction of the code I wrote in Micro. There are some downsides, though. For example, the error messages are not quite as good.

For example, this program has an error:

 1  module main
 2
 3  // just add two numbers
 4  def add(a: u8, b: u8): u8 {
 5      return a + b
 6  }
 7
 8  add(3, 2); // 10

Statements in Micro2 are delimited by semicolons –the return on line 5 is missing one–, so the compiler reports:

error: "add.cr2" (line 6, column 1):
unexpected "}"
expecting statement

Which is not terrible, but not great either; serviceable, I would say. Considering the objectives of the project, this is absolutely fine.
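
For the curious, this is roughly what it looks like with parsec –a minimal sketch, not the real micro2 grammar; the <?> label is what produces the “expecting statement” part of the message–:

import Text.Parsec
import Text.Parsec.String (Parser)

-- A "statement" in this sketch is just an identifier.
statement :: Parser String
statement = (many1 letter <* spaces) <?> "statement"

-- Statements are delimited by semicolons, as in Micro2.
program :: Parser [String]
program = spaces *> statement `endBy` (char ';' <* spaces) <* eof

main :: IO ()
main = print (parse program "add.cr2" "foo; bar; }")

Running it fails with an error in the same format as the one above: the source name, line and column, unexpected “}”, and what was expected.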

I am currently working on the type checking and semantic analysis, which is really still parsing, but it can’t all be implemented using parsec (it will know what a “return” looks like, but it can’t tell whether it is used inside a function or returns the right type).
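
As an illustration of that kind of check –with hypothetical types, nothing to do with the real micro2 AST–, a later pass over the tree can carry the context that the parser alone doesn’t have:

-- Hypothetical AST types, only to illustrate the idea.
data Stmt
  = Return Expr
  | ExprStmt Expr
  | FuncDef String [Stmt]

data Expr = Lit Int | Var String

-- The parser knows what a return looks like; only this pass can
-- tell whether it actually appears inside a function.
checkStmt :: Bool -> Stmt -> Either String ()
checkStmt inFunc stmt = case stmt of
  Return _
    | inFunc -> Right ()
    | otherwise -> Left "return outside of a function"
  FuncDef _ body -> mapM_ (checkStmt True) body
  ExprStmt _ -> Right ()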

As I already did it in Go, it is not too complicated, but I’m getting comfortable with Haskell and its monads. The harder part, I suspect, will be the code generation, and I’m looking forward to it!

Tomato, Pomodoro, Tomate?

I tried to learn Haskell about a year ago. Which basically meant ordering a book (“Programming in Haskell” by Graham Hutton) via my employer’s training budget, only to rage-quit after reading about 100 pages. It was probably not the book; I wasn’t really focused, so I moved on to other things.

Since then I have returned to Go and written a non-trivial amount of code, and it was alright. At some point I stopped thinking about the language and wrote code fluently, and that’s a good thing, even if I found Go itself a bit boring.

So, in my search for a new exciting language to add to my toolset, I was reading a bit about Scheme –again, I know–, and long story short, I decided to go back to Haskell. I’m now reading Learn You a Haskell for Great Good!, which has the advantage of a free version that can be read online, and it is more direct than the other book I have. And I’m enjoying it!

I’m still on chapter 5 (of 14), so it is a bit early, but so far I’m finding it very similar to Scala –with the Typelevel stack, see Cats and Cats Effect–, and you can tell that Haskell has been a great influence –assuming it is not the other way around, because in some cases Scala seems to take the features a bit further–.

Reading technical books is hard because, at least for me, it feels like a lot of information that I don’t seem to retain, until I need to apply it and then it surfaces –most of the time, at least–. So, now that it felt like I knew some Haskell, and following the Scala similarities, I wrote a small command-line tool: Tomato.

It is a very simple Pomodoro tool –in reality, just a timer– inspired by pomo, written in Go by Rob Muhlestein. I think I saw it in one of his streams and thought it was neat. It is a simple tool, but it already has some meat to it, including input/output, which is perfect for practising some of the tricky bits of Haskell.

And the experience was very nice. Most of my problems were with some syntax I don’t know yet, and especially with a couple of libraries I had to use. They are documented, but perhaps I missed having some examples.

For example, I spent way too much time figuring out how to add minutes to a UTC time.

Let’s look at this function:

-- needs: import Data.Time.Clock (getCurrentTime, addUTCTime, secondsToNominalDiffTime)
doStart :: String -> Integer -> IO ()
doStart stateFile mins = do
  currTime <- getCurrentTime
  let deadline = addUTCTime secs currTime
  writeFile stateFile (show deadline)
  putStrLn ("Started, end on " ++ show deadline)
  where
    secs = secondsToNominalDiffTime (60 * fromInteger mins)

It creates the state file containing the UTC time of the deadline for the timer. I initially had the minutes as an Int –and not an Integer; the difference is subtle–, and secondsToNominalDiffTime requires a Pico to produce the NominalDiffTime value required by addUTCTime. Sometimes you can have this same problem in Scala, which I call type-matching bingo. The solution is clean, but oh, it wasn’t easy to get there!
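
For reference, these are the signatures involved –from Data.Time.Clock in the time package– and a distilled sketch of the conversion chain:

import Data.Time.Clock
  ( UTCTime, addUTCTime, getCurrentTime, secondsToNominalDiffTime )

-- secondsToNominalDiffTime :: Pico -> NominalDiffTime
-- addUTCTime :: NominalDiffTime -> UTCTime -> UTCTime

-- fromInteger produces the Pico that secondsToNominalDiffTime wants;
-- starting from an Int you need fromIntegral instead, which is the
-- kind of detail that makes the bingo harder.
minutesFromNow :: Integer -> IO UTCTime
minutesFromNow mins = do
  now <- getCurrentTime
  pure (addUTCTime (secondsToNominalDiffTime (fromInteger (60 * mins))) now)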

The library for dealing with the program arguments gave me some trouble as well, even though the docs have two good examples. But in that case I only had myself to blame, and when the function signature clicked, I understood how the defaults for the options worked. It was all because I’m using XDG_CACHE_HOME for the state file, and the function to get that directory returns an IO value, so none of the examples applied exactly.
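
This is a sketch of the pattern –using optparse-applicative as a stand-in for the options library, and a hypothetical state file name–: resolve the IO-dependent default first, and only then build the parser:

import Options.Applicative
import System.Directory (XdgDirectory (XdgCache), getXdgDirectory)

newtype Opts = Opts { stateFile :: FilePath }

-- By the time the parser is built, the default is a plain FilePath,
-- even though computing it required IO.
optsParser :: FilePath -> Parser Opts
optsParser defStateFile =
  Opts
    <$> strOption
      ( long "state-file"
          <> value defStateFile
          <> showDefault
          <> help "File used to store the timer state"
      )

main :: IO ()
main = do
  -- resolve XDG_CACHE_HOME (with its fallback) before parsing
  defStateFile <- getXdgDirectory XdgCache "tomato/state"
  opts <- execParser (info (optsParser defStateFile <**> helper) fullDesc)
  putStrLn ("state file: " ++ stateFile opts)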

The similarities with Scala are definitely a good thing: Option is Maybe (with Some -> Just and None -> Nothing), Either is the same, pattern matching is very similar –although Scala seems more flexible; it could be me not knowing all the tricks in Haskell though–, the IO Monad in Cats Effect is modelled after Haskell’s IO, etc.
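
A trivial illustration of the mapping –a hypothetical example, but the shapes line up almost one to one with Scala–:

safeDiv :: Int -> Int -> Maybe Int   -- Scala: def safeDiv(a: Int, b: Int): Option[Int]
safeDiv _ 0 = Nothing                -- Scala: None
safeDiv a b = Just (a `div` b)       -- Scala: Some(a / b)

describe :: Maybe Int -> String
describe result = case result of     -- Scala: result match { ... }
  Just n -> "result: " ++ show n
  Nothing -> "division by zero"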

The LSP support via haskell-language-server is good, although I tried to do a rename and it wasn’t supported. The docs are good, and I especially like that on hover you see the docs and a link to a local HTML file with the full documentation of that package.

Perhaps the discoverability in Haskell is worse than in Scala, because of the syntax. In Scala you have a value and you can inspect its methods with the help of your editor and LSP by just typing value., but in Haskell it seems to me like everything is a function, so the dot trick is not possible. So in the end it is a bit overwhelming having all that API without really knowing it; good luck finding secondsToNominalDiffTime to start with!

You can see that the book’s intro to lists includes a lot of functions that you will probably need to learn eventually, and this is also a bit of a problem with Scala: you are trying to solve a problem, and the cleanest and most elegant solution is a method you don’t know about.

All these are just first impressions, and I’m sure I wrote some code that is a bit beyond my current knowledge of the language, but it wasn’t too frustrating and it only took me a bit longer than it should have. I found myself thinking about Scala Native, because of those bits that feel a bit less flexible in Haskell than in Scala, but I have to say that I’m interested and I’ll keep going!

Micro is public

This is a short announcement: I have made “Micro”, my toy programming language, public and it is available here: Micro @ git.usebox.net.

It is finished, but not as finished as I originally planned. I knew building an interpreter and a compiler was a lot of work, but I also had the audacity to build my own thing while learning how to make an interpreter and refreshing my Go; and on top of all that, my own thing wasn’t easy at all!

Micro is a statically typed programming language, and I was reading the most excellent Crafting Interpreters, which guides you through the journey of building an interpreter for a dynamically typed language, which means I was pretty much on my own for a lot of the things I had to write.

Besides, what I really wanted to write was a compiler targeting an 8-bit micro (likely the Z80), and I basically spent too much time on the interpreter, implementing things that it was very unlikely I could make happen in the compiler for the target CPU (e.g. closures, or tail call optimization for recursion).

Anyway, I’m very happy with the result, and this is, by far, the best of my toy programming languages. I’m proud of what I have accomplished and I think I’m better prepared to start a more focused project with better chances of success.

I don’t rule out playing a bit more with Micro, but it was a bit unrealistic as a first project, so I’m happy to close this chapter for now.

Self-hosting git repos

I had already decided to rely less on GitHub and use GitLab instead, and the truth is that since then I haven’t started many new projects –and I even forgot about GitLab and still created a couple of new repos on GitHub, :facepalm:–.

GitHub Copilot is now available to everybody, and the controversy has gotten worse now that they are charging money for the service, including a campaign by the Software Freedom Conservancy.

I like Drew’s commentary on Copilot and open source laundering, especially:

[…] I would update your licenses to clarify that incorporating the code into a machine learning model is considered a form of derived work, and that your license terms apply to the model and any works produced with that model.

Which is probably the right thing to do, but we know that if GitHub (actually, Microsoft) has implemented Copilot the way they have, it is because they think they can get away with it, so adding a notice like that is likely not going to have any effect.

Anyway, I just reconsidered whether I need one of these hosting solutions (also known as a “forge”), and I came to the conclusion that I don’t.

Self-hosting a git repo over SSH is very easy:

$ mkdir my_repo_name
$ cd my_repo_name
$ git init --bare

Optionally, if you plan to serve the repo over HTTP:

# PWD is still my_repo_name
# the sample hook runs "git update-server-info", which is needed to
# serve the repo over the dumb HTTP protocol
$ cp hooks/post-update.sample hooks/post-update

And that’s all! You can use myuser@hostname:/path/to/repos/my_repo_name as the remote and you have a private repo.

Although you may have a clone somewhere else, remember to set up backups and things like that, and you are set.

I also wanted a way of browsing the repos via the web, because sometimes it is useful to check the code without requiring git or a clone. I also like the idea of rendering a README.md as HTML to have a small microsite for a project that perhaps doesn’t need a dedicated page on my website.

For that I decided to use cgit, which you can see in action at git.usebox.net –very empty, for now–. I also enabled clones over HTTPS (read-only), so the repos are public, and altogether it took me about 15 minutes.

It is clear that I have lost functionality –that I don’t need–, but this is perfect for me because:

  • My projects are small and likely to only interest me.
  • I can do without CI, which arguably would be a waste of resources and energy for such small projects.
  • I have almost never gotten any contributions, and when I have, the contributors are likely to have the skills to send patches by email (or provide me with a URL to pull from). I recommend this tutorial on how to contribute to email-driven projects.
  • I can always move to a forge if a project grows to the point where there is a real benefit. For example, it is likely ubox MSX lib will stay on GitLab.

Obviously there are some benefits that come with centralisation. Besides easier workflows for contribution, discovery is an important one: people search for projects on GitHub.

In my experience, that wasn’t that important for my projects. Most of them got some stars only after I shared a link to the repo on a forum or social media, and for most people it was a way of keeping a bookmark or just saying “cool project”. And it really doesn’t matter: I shared the code in case it was useful to somebody else, and if I didn’t get any meaningful contributions, the stars didn’t do anything for me.

Anyway, the bottom line is that anything that is not GitHub won’t have the benefits of being on the most popular hosting service, so I think it won’t matter that much if I use GitLab or if I self-host my repositories.

I know this is just a drop in the ocean, but if we don’t do anything, nothing will ever change.

Panic jump in Go

It is fair to say that at this point I have stopped refreshing my knowledge of Go and I’m learning new things. Lots of them, actually, thanks to the toy programming language that I’m implementing.

One of those things is altering the flow of the program when implementing statements like return, continue or break.

I am following Crafting Interpreters as a reference for implementing an interpreter for my language. The book implements the tree-walk interpreter in Java, where it can use exceptions, but those aren’t available in Go (which is a shame; I generally prefer languages that support exceptions).

Let’s look at an example from the book, converted to my language:

 1  def count(n number) number {
 2    for n < 100 {
 3      if n == 3 {
 4          return n;
 5      }
 6      println(n);
 7      n = n + 1;
 8    }
 9    return n;
10  }
11
12  count(1);
13  // output:
14  // 1
15  // 2

Because of the way a tree-walk interpreter works, when the return on line 4 gets evaluated, the interpreter is a few function calls deep:

  1. Called count –we should return here–.
  2. Evaluate count(1).
  3. Evaluate for.
  4. Evaluate if.
  5. Evaluate return –this is where we are–.

In Java, going from the last point back to the first and returning from there is quite simple, because we have exceptions that will unwind those function calls and take us where we want to go –we can catch the exception there–, but in Go we can’t do that. Instead we have panic and recover, and we can use them to do something similar –which I call a panic jump, although that is probably not the technical name–.

In my interpreter I have:

type PanicJumpType int

const (
	PanicJumpReturn PanicJumpType = iota
	PanicJumpTailCall
	PanicJumpContinue
	PanicJumpBreak
)

type PanicJump struct {
	typ   PanicJumpType
	value any
	loc   tokens.Location
}

Which defines a type that will allow me to:

  • Say which type of “panic jump” I’m using, because it could be a return, but other statements have a similar behaviour as well.
  • Provide a value (e.g. the value for return).
  • Track where that jump came from, so we can report useful errors (e.g. “return without function in filename:line:col”).

So in the evaluation of return we can use it like this:

// val is the value to return, and v.Loc is the location of that value
panic(&PanicJump{typ: PanicJumpReturn, value: val, loc: v.Loc})

And in the code that evaluates a function call we have something like this:

func (i *Interpreter) call(call ast.Call) (result any, err error) {

//... more code irrelevant for the example ...

// handle return via panic call
defer func() {
        if r := recover(); r != nil {
                if val, ok := r.(*PanicJump); ok && val.typ == PanicJumpReturn {
                        result = val.value
                        err = nil
                } else {
                        // won't be handled here
                        panic(r)
                }
        }
}()

// this function call may panic jump
_, err = fun.Call(i, args, call.Loc)

//... even more code ...

So before we call fun.Call, which could “panic jump”, we set up a handler that checks that the panic jump is one we can handle (PanicJumpReturn) and sets the return values of Interpreter.call.

If it is not a panic jump we can handle –including an actual panic; hopefully my code is perfect and that never happens–, we propagate it by calling panic again, and it will be handled somewhere else.

The only thing I found slightly ugly is that, the panic handler being a deferred function, the only way it can set the return values of Interpreter.call is by using named return parameters, which is definitely less readable than explicit return values.

We also need a “catch all” handler in the interpreter, because return could be called outside of a function. Currently that shouldn’t happen in my implementation, because my parser already checks for it and should never pass an AST (Abstract Syntax Tree) that is not 100% valid to the interpreter.

In the end it is not too complicated, even if it is arguably less elegant than actual exceptions. It was a great Eureka! moment when I found the solution, even if I don’t know what the performance impact is –I’m not really concerned about that for now!–.

Taxonomies

I started this blog over a year ago, and I mentioned in the first post that it was a work in progress. There was at least one thing I knew I would likely want to do, but thought was not important: making the posts’ tags visible.

Which basically means making Hugo render special pages listing the posts with a specific tag. Although, in the end, the motivation to add the feature goes beyond the usual functionality of a weblog.

This site was a personal website before I added my personal log, and I implemented it with Hugo when moving away from my old Django-managed website. I had a good amount of content I wanted to sort and classify, even if some of it was very old –circa 2002–. Initially I wanted to present all that information, and perhaps decide later whether to get rid of some of it.

So I thought I could use tags, and that worked well for making the main index pages linked in the menu at the top of the pages. However, it is not ideal, because I ended up with one large archive holding basically anything I couldn’t put in any of the other sections.

I could just use those tags to navigate blog posts, but considering that everything is already tagged, I have decided to expose tags on all pages, hoping it will improve navigation.

Now you can click on the tag names in any post –under the post title; this post is tagged blog–, and any other page that has tags will show them at the bottom, after the last-updated information. For example, all pages tagged ZX Spectrum.

I’m not sure how much this is going to improve things, as these are still mostly disconnected pages, only now indexed by tag. The more I think about it, the more I feel this should look like a wiki, where you can also see incoming links and recently updated pages, but that’s probably something to explore at a different time.

The wiki movement is close to being dead

I have accidentally spent some time recently reading wikis, and I ended up on the WikiRevival page on Community Wiki:

Wiki movement is close to being dead. We all know that. One can only imagine how this wiki used to be years ago. Lively, pulsating. Something like that.

The page was last edited on 2021-08-23, so it is kind of fresh –sometimes it is hard to know: you may read something that feels like it could have been written today, and it turns out to be 10 years old–.

As I understand it, the wiki movement refers to public wikis as a means of collaboration and a tool for building communities. And I wish I could say the rumours of its death have been greatly exaggerated, but my anecdotal experience seems to confirm that public wikis are, at the very least, not as alive as they used to be.

In a way, we could say the same about blogs. Or, more recently, forums. And before that, mailing lists. And before that, Usenet newsgroups. And depending on your age, I guess we could go even further back.

I can hear you ask: what about Wikipedia?

The page has more:

But what’s not dead? Atlassion (sic) Confluence (a proprietary product), WikimediaFoundation and wikis on wikifarms like Fandom (used to be Wikia) that describe modern media. Even wikis on services like GitHub or GitLab are not widely used.

So Wikipedia is not what I think they meant by the movement.

Then it goes further and asks what features are required for a wiki engine to be successful today –e.g. mobile support–, and what the unresolved wiki issues are. Obviously all this could be completely wrong, but it is an interesting read, and it ends with some optimism, or at least hope, that with some small changes wikis could come back.

I have always been fascinated by wikis. Or, should I say, by the wiki movement that I didn’t experience. Not even back in the early 2000s, when I was active in my local Linux User Group and it was standard to have a wiki. Ours had write permission restricted to the LUG members –who didn’t write much–. Sadly, it looks like we didn’t understand what it was all about.