There are lots of blog comment systems, and this blog has used Disqus for a long time. I’m not going to go into all the reasons to move away from Disqus, but page load times, wanting more control over your data, and being able to respect your readers’ privacy figure highly.
Also, this is a technical blog focused on software development and associated topics, which means that anyone who wants to comment here is almost certain to be familiar with GitHub and have an account, and to be as uncomfortable using Disqus as I have been.
I did investigate rolling my own code based on examples from other blogs, which have used Jekyll Liquid templates and JavaScript to pull comments from the GitHub API and to post new comments back to the repo hosting the blog. This has some attraction, but also a big drawback: authorising against the GitHub API, as you don’t really want your client id and client secret exposed in the repo.
Enter utteranc.es
You can get around this by hosting an app on Heroku to use as the postback URL so that you can hide the client id and client secret, and there is also Staticman, but none of these seemed as simple as just using utteranc.es.
To configure utteranc.es, head over to the website, follow the instructions, and fill out the form to suit you. For the mapping from blog post to issue, I chose ‘Issue title contains page title’, and I also chose to have utteranc.es add a ‘Comment’ label to the issue it creates in the blog repository. After you do that, you’ll get a code snippet generated for you that looks somewhat like this:
Add this to a Jekyll include, for example utterances.html, and then include it in your post.html layout at the position you want the blog comments to appear. Most Jekyll blog templates have Disqus support, so it will probably just be a simple case of finding where in the layout Disqus is included and replacing it.
Exporting existing comments
If your existing comments are not important to you, then at this point you can stop and enjoy your new GitHub-powered comment system. For me, it’s the principle of the thing: the comments on my blog belong to me and to the author of each comment. So, we can do something about it.
Disqus allows you to export your comments, and once you do so, the export is emailed to the address registered with your Disqus account. I’ve done a lot of work with XML in a previous role, and I think that the Disqus XML export looks… odd. Each post on your blog appears to be mapped to a <thread> element, which contains the expected metadata about the blog post. I would expect each individual comment to be nested in a <comments> element, but this is not the case. Instead, each individual comment is a <post> element at the same level as the <thread>, and the two are mapped to each other using an id attribute. I don’t think that makes much sense, although I’m sure there must be good reasons. I just can’t think what they might be.
A comment then, looks like this:
An actual comment on this post looks like:
You can see the way that the post element is mapped back to the containing thread using the dsq:id attribute.
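To make the flat layout concrete, here is a minimal sketch that reads a made-up, heavily trimmed sample with System.Xml.Linq rather than the XmlProvider used later. The sample XML, the namespace URIs and the child `<thread dsq:id="..."/>` inside each `<post>` are assumptions based on the description above, so check them against your own export.

```fsharp
open System.Xml.Linq

// Made-up, trimmed sample: <thread> and <post> are siblings under the root,
// and each <post> refers to its thread through a dsq:id attribute.
let sample = """<disqus xmlns="http://disqus.com" xmlns:dsq="http://disqus.com/disqus-internals">
  <thread dsq:id="100"><title>First post</title></thread>
  <thread dsq:id="200"><title>Second post</title></thread>
  <post dsq:id="1"><message>Nice article!</message><thread dsq:id="100" /></post>
  <post dsq:id="2"><message>Thanks for this.</message><thread dsq:id="100" /></post>
</disqus>"""

// Element names live in the default namespace, the id attribute in the dsq one.
let name (local: string) = XName.Get(local, "http://disqus.com")
let dsqId = XName.Get("id", "http://disqus.com/disqus-internals")

let root = XDocument.Parse(sample).Root

// Top-level <thread> elements are the blog posts: (id, title) pairs.
let threads =
    root.Elements(name "thread")
    |> Seq.map (fun t -> (t.Attribute(dsqId).Value, t.Element(name "title").Value))
    |> Seq.toList

// Count the comments per thread id by following each post's thread reference.
let commentsByThread =
    root.Elements(name "post")
    |> Seq.groupBy (fun p -> p.Element(name "thread").Attribute(dsqId).Value)
    |> Seq.map (fun (threadId, posts) -> (threadId, Seq.length posts))
    |> Map.ofSeq
```

Note that thread 200 never appears in `commentsByThread`: nothing on the thread itself says whether it has comments, so the only way to find out is to scan the posts.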
Parsing the XML
The strange structure makes the XML less straightforward to parse, as we’ll have to do a little bit of work to match up blog posts with the comments on them. Also very annoying is the fact that a thread element doesn’t know whether it actually has any associated post comments.
We can accomplish this fairly easily with a little bit of F# and the FSharp.Data XmlProvider. Setting the provider up is straightforward; here I’m just using a direct reference to the assembly, which I’d previously added via NuGet.
If you are new to F# (and I’m still fairly new) this might look scary, but it really isn’t. After referencing the assembly in the script, we open the FSharp.Data namespace, and then initialise an XmlProvider by passing it the XML file we’re going to parse.
That enables the XmlProvider to infer a lot of things about the XML in the file, and then the XmlProvider loads the actual data from the file. Two records are also defined to hold the details about the Threads/Posts that are going to be imported, and how multiple comments refer to a single blog post. These records are analogous to simple C# POCO classes, except that their fields are immutable by default.
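As a rough sketch, two records along these lines would do the job. The field names and types are guesses for illustration, not the definitions from the original script.

```fsharp
open System

// Illustrative record shapes; every field name here is an assumption.
type Comment =
    { ThreadId: int64      // id of the thread (blog post) the comment belongs to
      Author: string
      Message: string
      CreatedAt: DateTime }

type BlogPost =
    { Id: int64            // id of the Disqus thread
      Title: string
      Url: string
      Comments: Comment list }

let post =
    { Id = 100L
      Title = "First post"
      Url = "https://example.com/first-post"
      Comments = [] }

let comment =
    { ThreadId = 100L
      Author = "alice"
      Message = "Nice article!"
      CreatedAt = DateTime(2019, 1, 1) }

// Records are immutable, so attaching comments yields a modified copy
// and leaves the original value untouched.
let withComments = { post with Comments = [ comment ] }
```

The copy-and-update expression on the last line is the piece that the later steps lean on: a post is never changed in place, a new value is built with its comments filled in.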
With these types ready, we can define a couple of functions to convert the XML into them, and thus do away with a lot of the extraneous noise in the XML that we don’t really care about.
These functions use currying, which as a longtime C# developer I’m still getting the hang of, and that will come in handy shortly. They map the Disqus types generated by the XmlProvider into the custom types I defined, taking care to filter out comments we don’t want to import and to skip any blog posts which Disqus says have been deleted.
I’m not sure the Seq.filter in the toComments function worked correctly, as I still had to go and manually delete a couple of comments that were marked as spam from the GitHub Issues.
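A minimal sketch of what such conversion functions could look like is below. `RawThread` and `RawPost` are hypothetical stand-ins for the provider-generated types, and all of their field names are assumptions. It filters on both the deleted and the spam flag, which is worth double-checking in your own version given the leftover spam mentioned above.

```fsharp
// Hypothetical stand-ins for the types the XmlProvider would generate.
type RawThread = { DsqId: int64; RawTitle: string; ThreadDeleted: bool }
type RawPost =
    { ParentDsqId: int64
      RawAuthor: string
      RawMessage: string
      PostDeleted: bool
      Spam: bool }

// The trimmed-down types we actually want to work with.
type Comment = { ThreadId: int64; Author: string; Message: string }
type BlogPost = { Id: int64; Title: string; Comments: Comment list }

// Keep only comments that are neither deleted nor flagged as spam.
let toComments (posts: RawPost seq) : Comment list =
    posts
    |> Seq.filter (fun p -> not p.PostDeleted && not p.Spam)
    |> Seq.map (fun p -> { ThreadId = p.ParentDsqId; Author = p.RawAuthor; Message = p.RawMessage })
    |> Seq.toList

// Skip threads marked as deleted; comments get attached in a later step.
let toBlogPosts (threads: RawThread seq) : BlogPost list =
    threads
    |> Seq.filter (fun t -> not t.ThreadDeleted)
    |> Seq.map (fun t -> { Id = t.DsqId; Title = t.RawTitle; Comments = [] })
    |> Seq.toList

let rawPosts =
    [ { ParentDsqId = 100L; RawAuthor = "alice"; RawMessage = "Nice article!"; PostDeleted = false; Spam = false }
      { ParentDsqId = 100L; RawAuthor = "bot"; RawMessage = "Buy now"; PostDeleted = false; Spam = true }
      { ParentDsqId = 100L; RawAuthor = "bob"; RawMessage = "Oops"; PostDeleted = true; Spam = false } ]

let rawThreads =
    [ { DsqId = 100L; RawTitle = "First post"; ThreadDeleted = false }
      { DsqId = 200L; RawTitle = "Old draft"; ThreadDeleted = true } ]

let comments = toComments rawPosts
let posts = toBlogPosts rawThreads
```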
With those functions defined, we need a way of mapping the comments to the correct blog post.
Here we take a single post and all of the comments, and then use a nested function to grab the set of comments associated with that post by way of the ThreadId. With that written, we can use some more currying to create another function that will do a lot of hard work for us:
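The single-post step can be sketched roughly as follows. The function name `addCommentsToPost`, the argument order and the record fields are assumptions for illustration.

```fsharp
type Comment = { ThreadId: int64; Author: string; Message: string }
type BlogPost = { Id: int64; Title: string; Comments: Comment list }

// Hypothetical name and argument order: all comments first, then the post.
let addCommentsToPost (comments: Comment list) (post: BlogPost) : BlogPost =
    // Nested helper: does this comment belong to the post being processed?
    let belongsToPost (c: Comment) = c.ThreadId = post.Id
    { post with Comments = comments |> List.filter belongsToPost }

let allComments =
    [ { ThreadId = 100L; Author = "alice"; Message = "Nice article!" }
      { ThreadId = 200L; Author = "bob"; Message = "Thanks for this." }
      { ThreadId = 100L; Author = "carol"; Message = "Very helpful." } ]

let firstPost =
    addCommentsToPost allComments { Id = 100L; Title = "First post"; Comments = [] }
```

The nested `belongsToPost` can see `post` because it is defined inside the outer function, which is what keeps the filter itself down to a single line.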
This function will take the threads, use the toBlogPosts function to turn them into BlogPost records, and then map each blog post to the correct comments using the function we’ve just defined. But where do the comments come from? Well, it turns out this currying thing is really quite useful, as it enables all this magic-looking |>, or ‘piping’, to happen.
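For anyone else coming from C#, a tiny standalone sketch of how currying and the pipe operator fit together may help. The `tag` function is made up purely for illustration.

```fsharp
// A curried function: its arguments are supplied one at a time.
let tag (label: string) (text: string) = sprintf "[%s] %s" label text

// Supplying only the first argument gives back a new function
// that is still waiting for the second one.
let asComment = tag "Comment"

// `x |> f` is just `f x`, so these three expressions are equivalent.
let direct = tag "Comment" "Nice article!"
let partial = asComment "Nice article!"
let piped = "Nice article!" |> tag "Comment"
```

The pipe always feeds its left-hand side in as the last argument, which is why the data a function operates on is usually its final parameter.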
Take all the posts data, turn it all into comments, pipe that to the addCommentsToTheirPosts function, and then filter out blog posts which don’t have any comments, as importing those is pointless. All for around 24 lines of code. I know full well the C# it would take to do all that, and whilst with C# 8 you could probably get close, I doubt you’d equal 24 lines.
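Put together, the pipeline might look something like this sketch, which runs over in-memory sample data instead of the parsed export. The name `addCommentsToTheirPosts` comes from the text above, but its signature and the record fields are assumptions.

```fsharp
type Comment = { ThreadId: int64; Message: string }
type BlogPost = { Id: int64; Title: string; Comments: Comment list }

// Posts come first so that the comments can be piped in as the last argument.
let addCommentsToTheirPosts (posts: BlogPost list) (comments: Comment list) : BlogPost list =
    posts
    |> List.map (fun post ->
        { post with Comments = comments |> List.filter (fun c -> c.ThreadId = post.Id) })

let posts =
    [ { Id = 100L; Title = "First post"; Comments = [] }
      { Id = 200L; Title = "Second post"; Comments = [] }
      { Id = 300L; Title = "Third post"; Comments = [] } ]

let comments =
    [ { ThreadId = 100L; Message = "Nice article!" }
      { ThreadId = 300L; Message = "Thanks for this." }
      { ThreadId = 100L; Message = "Very helpful." } ]

// Attach the comments, then drop the posts that ended up with none.
let postsToImport =
    comments
    |> addCommentsToTheirPosts posts
    |> List.filter (fun post -> not (List.isEmpty post.Comments))
```

Filtering every comment once per post is quadratic, which is fine for a small blog. For a much larger export, grouping the comments by thread id first would avoid the repeated scans.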
Just to be on the safe side, it’s probably a good idea to look through each of the posts and comments that we’ve now got, just to see if things are matching up correctly.
Running that will give you an idea of which blog posts are going to be imported, and the number of comments on each. The first time I ran this, I found some of the blog posts in the Disqus XML export did not have the post’s title set, so I was getting duplicated post titles. As there were only three instances of this error, I just manually corrected the XML and re-ran the script to check I had everything correct.
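A check along these lines can also flag the duplicated titles automatically. This is only a sketch over made-up data; the `BlogPost` shape and the sample titles are assumptions.

```fsharp
type BlogPost = { Id: int64; Title: string; CommentCount: int }

let posts =
    [ { Id = 100L; Title = "First post"; CommentCount = 2 }
      { Id = 200L; Title = "My Blog"; CommentCount = 1 }
      { Id = 300L; Title = "My Blog"; CommentCount = 4 } ]

// One line per post: what will be imported and how many comments it has.
let summary =
    posts |> List.map (fun p -> sprintf "%s (%d comments)" p.Title p.CommentCount)

summary |> List.iter (printfn "%s")

// A title that appears more than once suggests threads whose real title
// was missing from the export.
let duplicateTitles =
    posts
    |> List.countBy (fun p -> p.Title)
    |> List.filter (fun (_, count) -> count > 1)
    |> List.map fst
```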
Uploading to GitHub
So far, so good. Now comes the fun part and something I’ve yet to do in F#, which is interop with a C# library. It turns out that it’s not so hard, but that makes perfect sense when you understand that F# is a .NET language, just like C#. A long time ago I started to write an API library for GitHub, but I gave it up in favour of Octokit.net.
We can easily reference Octokit and open the namespace as before:
Then we just need to set up a few variables:
These just get us a client to work with; all I did was register a new Personal Access Token on my account to use as the password. Notice how with F# you don’t need the new keyword, even though these are classes from a C# assembly. These can then be used in the following function, which I’m gonna prefix with this warning:
It does work though, so just… use at your own caution.
I’m sure that a more experienced F# person is going to look at that and be like “WTF”, but as I said, it does work. I left the printfn log messages in, but essentially it loops over each post, waits a couple of seconds, creates the new issue, and then loops over all of the comments for that post and adds them as comments to the issue. I put the Thread.Sleep calls in there just so I didn’t hammer the GitHub API. Honestly, there were so few comments to import that I doubt it would have triggered the rate limit, but I imagine a more popular blog with more comments on its posts would.
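The shape of that loop can be sketched without touching the network by passing the GitHub calls in as functions. `createIssue` and `createComment` are hypothetical stand-ins for whatever wraps the Octokit client in a real run, and the comment body format is made up.

```fsharp
open System
open System.Threading

type Comment = { Author: string; Message: string }
type BlogPost = { Title: string; Comments: Comment list }

// createIssue returns the new issue number; createComment posts one comment to it.
// Both are placeholders for the real API calls.
let importPosts
    (pause: TimeSpan)
    (createIssue: string -> int)
    (createComment: int -> string -> unit)
    (posts: BlogPost list) =
    for post in posts do
        Thread.Sleep(pause) // be gentle with the API between requests
        let issueNumber = createIssue post.Title
        printfn "Created issue #%d for '%s'" issueNumber post.Title
        for comment in post.Comments do
            Thread.Sleep(pause)
            createComment issueNumber (sprintf "%s wrote:\n\n%s" comment.Author comment.Message)

// Dry run with in-memory fakes and no pause; a real import would use a pause
// of a couple of seconds and functions that call GitHub.
let issues = ResizeArray<string>()
let postedComments = ResizeArray<int * string>()

let fakeCreateIssue (title: string) =
    issues.Add(title)
    issues.Count

let fakeCreateComment (issueNumber: int) (body: string) =
    postedComments.Add((issueNumber, body))

importPosts TimeSpan.Zero fakeCreateIssue fakeCreateComment
    [ { Title = "First post"; Comments = [ { Author = "alice"; Message = "Nice article!" } ] }
      { Title = "Second post"; Comments = [] } ]
```

Doing a dry run like this first is a cheap way to see exactly which issues and comments would be created before pointing the function at a real repository.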