As part of modernising, updating and generally overhauling my blog, I thought it would be nice to add some consistency to the Yaml front matter used by Jekyll. For those who do not know, Jekyll treats any file that contains a Yaml front matter block as a special file and processes it. The front matter can contain variables in the form `foo: value`. Jekyll itself defines some predefined global variables and variables for posts, but anything else is valid and can be used in Liquid tags.
I wondered if I could write some F# to:
- Load all the markdown files.
- Parse all the front matter.
- Modify the front matter to drop variables no longer required by a theme.
- Update the front matter with new variables which are understood by the current theme.
- Randomly assign a path to a header image file for each post which doesn’t already have one.
- Write the front matter back to its post.
Fairly straightforward requirements.
Loading and parsing the front matter
I’m using YamlDotNet to do most of the heavy lifting. I think I could also have used the FSharp.Configuration Type Provider, but I’m not sure that it would have done exactly what I wanted.
I’m just writing this in an F# script, hosted in a project. After adding the YamlDotNet NuGet package, we can reference it and get to work:
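A minimal sketch of such a script header, assuming an F# 5 or later toolchain where `#r "nuget: ..."` is available (an older script would point `#r` at the DLL in the packages folder instead). The `postsPath` name and the `../blog/_posts` location are hypothetical:

```fsharp
// Pull YamlDotNet straight from NuGet (needs the .NET 5 SDK / F# 5 or later).
#r "nuget: YamlDotNet"

open System
open System.IO
open System.Text.RegularExpressions
open YamlDotNet.Serialization
open YamlDotNet.Serialization.NamingConventions

// Hypothetical location of the blog's posts, relative to this script.
let postsPath = Path.Combine(__SOURCE_DIRECTORY__, "..", "blog", "_posts")
```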
Here, we reference the package, and then open various namespaces for use later on. The code for my blog is kept in a separate folder, relative to the project which holds the F# scripts I’m writing about. This is nice and easy.
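The front matter also needs a type to deserialise into. As a rough sketch only: the property names, the `header-img` key and the exact choice of attributes below are my assumptions for illustration, not necessarily what the blog's theme uses:

```fsharp
#r "nuget: YamlDotNet"

open YamlDotNet.Serialization

// Hypothetical front matter shape; adjust the properties to your own theme.
[<AllowNullLiteral>]
type FrontMatter() =
    member val Title = "" with get, set
    member val Layout = "" with get, set
    member val Tags : string[] = [||] with get, set

    // "header-img" does not follow camelCase, so it gets an explicit alias.
    // ApplyNamingConventions = false keeps the alias exactly as written.
    [<YamlMember(Alias = "header-img", ApplyNamingConventions = false)>]
    member val HeaderImage = "" with get, set

    // Not part of the Yaml itself; excluded from (de)serialisation.
    [<YamlIgnore>]
    member val FilePath = "" with get, set
```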
This is a class with auto-implemented properties. You can see three attributes in use. The `YamlMember` attribute allows us to alias a property in Yaml which doesn’t follow the CamelCase convention we configured the deserialiser with. I think that a C# version of this would look pretty much the same.
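Setting up the deserialiser could look something like the following. This is a sketch using YamlDotNet's `DeserializerBuilder`; the call to `IgnoreUnmatchedProperties` is my assumption (it stops unknown front matter keys from throwing) and may not match the original script. The dictionary at the end is only there to show a call being made:

```fsharp
#r "nuget: YamlDotNet"

open System.Collections.Generic
open YamlDotNet.Serialization
open YamlDotNet.Serialization.NamingConventions

// Build a deserialiser that maps camelCase Yaml keys onto .NET properties.
let deserializer =
    DeserializerBuilder()
        .WithNamingConvention(CamelCaseNamingConvention.Instance)
        .IgnoreUnmatchedProperties()
        .Build()

// Quick smoke test: plain key/value Yaml into a dictionary.
let sample =
    deserializer.Deserialize<Dictionary<string, string>>("title: Hello\nlayout: post")
```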
This initialises the YamlDotNet deserialiser, and is almost exactly how you would do this in C#. To deserialise something, we need some Yaml. When I was testing this, I got a pretty weird error from YamlDotNet which essentially meant that it couldn’t parse the file. It turns out that everything in the file outside the Yaml front matter was upsetting it.
Oh regex, I do love thee.
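One way to write such a regex is sketched below. The pattern and the `extractFrontMatter` helper are my own illustration, assuming the front matter starts on the very first line of the file:

```fsharp
open System.Text.RegularExpressions

// Capture everything between the opening and closing --- lines
// into a named group called Yaml.
let frontMatterRegex =
    Regex(@"\A---\s*\r?\n(?<Yaml>.*?)\r?\n---\s*$",
          RegexOptions.Singleline ||| RegexOptions.Multiline)

// Hypothetical helper: Some yaml when a front matter block is found, else None.
let extractFrontMatter (text : string) =
    let m = frontMatterRegex.Match(text)
    if m.Success then Some m.Groups.["Yaml"].Value else None
```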
Very simply, this regex will parse everything in a file between two `---` blocks into a named `Yaml` group. We now have the actual front matter, but we still need to parse it into an object.
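A parsing function for that job might be sketched like this. The cut-down `FrontMatter` type, the regex and the `Option` return value are assumptions made to keep the example self-contained:

```fsharp
#r "nuget: YamlDotNet"

open System.IO
open System.Text.RegularExpressions
open YamlDotNet.Serialization
open YamlDotNet.Serialization.NamingConventions

// Hypothetical, cut-down front matter type for illustration.
type FrontMatter() =
    member val Title = "" with get, set
    member val Layout = "" with get, set
    [<YamlIgnore>]
    member val FilePath = "" with get, set

let deserializer =
    DeserializerBuilder()
        .WithNamingConvention(CamelCaseNamingConvention.Instance)
        .IgnoreUnmatchedProperties()
        .Build()

let frontMatterRegex =
    Regex(@"\A---\s*\r?\n(?<Yaml>.*?)\r?\n---\s*$",
          RegexOptions.Singleline ||| RegexOptions.Multiline)

let parseFrontMatter (filePath : string) =
    // Inner function: deserialise the Yaml and remember where it came from.
    let parse (yaml : string) =
        let frontMatter = deserializer.Deserialize<FrontMatter>(yaml)
        // Guard against an empty front matter block, which yields null.
        if isNull (box frontMatter) then None
        else
            frontMatter.FilePath <- filePath
            Some frontMatter

    let text = File.ReadAllText(filePath)
    let m = frontMatterRegex.Match(text)
    if m.Success then parse m.Groups.["Yaml"].Value else None
```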
This is a bit more complex so let’s unpack it:
- Pass in the `filePath`.
- Read all of the text from it.
- Strip only the front matter from the text.
- Parse the front matter text with an inner function, which uses the `deserializer`, and return it. Here, we also keep track of the file path (we will need this later).
We also need to load all of the markdown files:
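A small sketch of that step, assuming the posts all end in `.md` and may sit in subfolders. The `loadFiles` name and its parameter order are hypothetical:

```fsharp
open System.IO

// Lazily list every file under `path` that matches `pattern`,
// including files in subfolders.
let loadFiles (pattern : string) (path : string) =
    Directory.EnumerateFiles(path, pattern, SearchOption.AllDirectories)
```

Putting the pattern first means `loadFiles "*.md"` can be partially applied and dropped into a pipeline that supplies the folder.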
Notice how those last couple of functions are curried, so they can be partially applied. That lets us do all of the work in one pipeline:
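As an illustration of the idea, here is a self-contained sketch. To avoid pulling in YamlDotNet, `tryReadFrontMatter` is a stand-in that returns the raw Yaml text alongside its file path instead of a deserialised object, and all of the names are hypothetical:

```fsharp
open System.IO
open System.Text.RegularExpressions

let frontMatterRegex =
    Regex(@"\A---\s*\r?\n(?<Yaml>.*?)\r?\n---\s*$",
          RegexOptions.Singleline ||| RegexOptions.Multiline)

// Curried: `loadFiles "*.md"` is itself a function still waiting for a path.
let loadFiles (pattern : string) (path : string) =
    Directory.EnumerateFiles(path, pattern, SearchOption.AllDirectories)

// Stand-in for the real parser: raw Yaml text paired with its file path.
let tryReadFrontMatter (filePath : string) =
    let m = frontMatterRegex.Match(File.ReadAllText(filePath))
    if m.Success then Some (filePath, m.Groups.["Yaml"].Value) else None

// Load, parse and collect in one pipeline; files without front matter are skipped.
let loadFrontMatter (path : string) =
    path
    |> loadFiles "*.md"
    |> Seq.choose tryReadFrontMatter
    |> Seq.toList
```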
This gives us a dataset to work with. Next time we’ll continue with the rest of the requirements.