The article is the publishing script?
Table of Contents
I used to have a pretty standard Jekyll theme as this web page. I'd hacked on it a bit and added support for Org Mode being the Emacs user I am, but after getting it up and running I promptly lost interest. Will the same thing happen to this page? Yeah probably. Either way, lectures just ended for another semester and I got a spur of motivation to get something going again. Specifically to try out Pollen. Which describes itself as
[…] a publishing system that helps authors make functional and beautiful digital books.
At the core of Pollen is an argument:
- Digital books should be the best books weve ever had. So far, they're not even close.
- Because digital books are software, an author shouldn't think of a book as merely data. The book is a program.
- The way we make digital books better than their predecessors is by exploiting this programmability.
— Matthew Butterick, Pollen: the book is a program
While it clearly states that it's for digital books, it targets HTML
and looked pretty cool so I gave it a shot. If you've seen the bottom of this page you may notice it was created with org-mode
, so clearly, the pollen
attempt didn't make it. While it's a very cool system, I had issues with the export behaving oddly, paragraph tags were showing up where they shouldn't have and it became difficult to fine-tune the export, despite having immense control of the output with the pollen markup
format. As a templating system, the pollen preprocessor
could be quite nice. I can see being able to inject arbitrary Racket code into anything text-based as something somebody needs for some reason.
1. So what's so special about this?
Well not much. I've used a lot of org-mode
in the past and even attempted to do this once before. It's quite a nice markup language, I've personally used it for organising notes, managing TODO lists, planning projects, writing assignments, scheduling, agenda views, tagging, and exporting to various formats. It also supports literate programming, tables, and extensive customisation with Emacs Lisp. The literate programming has been quite useful for assignments that require lots of in-document code and outputs.
One thing that org-mode
has that I think markdown
and other variants lack is the extensibility. Don't get me wrong, there are a ton of tools that can process markdown
and spit out something useful that is plenty configurable. But org-mode
is first and foremost a part of Emacs. That means it comes with Emac's extensibility, which, in my opinion, is lightyears beyond anything else. And something that I'll talk more about later on this page.
2. org-export
and org-publish
My lecturers certainly aren't interested in reading my assignments in a plain-text markup language almost exclusively supported by a text editor first released in 1985. Fortunately, org-mode
has excellent support exporting documents to other formats including HTML
, and \(\LaTeX\). On top of this org-publish
adds support for processes a set of files, potentiality exporting, and publishing them. These systems are highly configurable and featureful.
All pages here (including the mono-page) are generated from a set of .org
files, published via the org-html-publish-to-html
function, smeared with a little CSS
that isn't mine, then thrown up on a github.io page. All in all it was about an evening's work to learn how the org-publish
system worked, experiment, and polish to just what I had in mind.
3. What about the so-called publishing script being an article?
Yeah, so this article isn't really the publishing script. Specifically the bootstrappaing.org
file is tangled to publish.el
, which is then executed. Tangling is the process of extracting source code from a document expanding, merging, and transforming it. Org then recomposes them into one or more separate files.
But it didn't use to be like that. To bootstrap the page I wrote a bog-standard .el
file that could be loaded and executed from the command line to publish the page during testing,
emacs -Q --batch -l publish.el --funcall publish
During the writing of this article, I'm moving everything into here. Then I'll be able to
emacs -Q --batch -l org bootstrapping.org -f org-babel-tangle -l publish.el --eval '(org-publish "site" t)'
which isn't much of a change to be fair, but it's the principle that matters.
4. The meat of it
4.1. Setup, packages, and variables
First, we'll do some house keeping, I don't want these packages in my main init.el
, so we'll move the package directory and require
the built-in things I'll need.
(defvar base-directory (expand-file-name "~/Projects/github.io/")) (defvar public-directory (file-name-concat base-directory "docs")) (mkdir public-directory t) (require 'package) (setq package-user-dir (file-name-concat base-directory ".packages")) (add-to-list 'package-archives '("melpa" . "https://melpa.org/packages/")) (add-to-list 'package-archives '("melpa-stable" . "https://stable.melpa.org/packages/")) (package-initialize) (unless package-archive-contents (package-refresh-contents)) (unless (package-installed-p 'use-package) (package-install 'use-package)) (require 'use-package) (require 'ox-publish) (require 'cl-lib) (use-package htmlize :ensure t)
I use a few custom variables as well. inline-html-publish-index-subpages
is a bit interesting, because I wanted to generate a mono-page as well as the individual pages I needed to have a custom publishing function, and I need to store the subpages for addition to the mono-page.
The -marker
variables are used to mark where some of the table of contents and subpages should be inserted. It's a bit of a hack I know I know but it works just fine.
(defvar inline-html-publish-index-subpages nil) (defvar inline-html-publish-toc-marker "<!-- Inline html toc marker -->") (defvar inline-html-publish-subpage-marker "<!-- Inline html subpage marker -->") (defvar inline-html-publish-page-format "<h1 class='subpage-title' id='%s'><a href='%s'>%s</a></h1>")
I also need to configure some generic org-html
variables.
(setq org-html-head-include-default-style t org-html-preamble t org-html-htmlize-output-type 'css org-html-postamble t org-html-head (concat "<link rel='stylesheet' type='text/css' href='https://latex.now.sh/style.css'/>" "<link rel='stylesheet' type='text/css' href='org-htmlize-style.css'/>" "<link rel='stylesheet' type='text/css' href='style.css'/>"))
The real meat of the org-publish
system is the org-publish-project-alist
variable, it defines what files should be published and how. This is a pretty boring one, we only have
"contens":
recursively publish any file ending in.org
(exceptindex.org
) in thebase-directory
to thepublic-directory
with theorg-html-publish-to-html
function."static":
just copies some common resources to thepublic-directory
."index":
is a little more interesting. It's very similar to"contents"
but use myinline-html-publish-index
function to publish theindex.org
file.
(setq org-publish-project-alist `(("contents" :base-directory ,base-directory :base-extension "org" :exclude "index.org" :publishing-directory ,public-directory :recursive t :publishing-function org-html-publish-to-html :html-postamble ,(concat "<p class=\"author\">%a</p>" "<p class=\"author\">Date: %T</p>" "<p class=\"author\">%c</p>") :headline-levels 4 :auto-preamble t) ("static" :base-directory ,base-directory :base-extension "css\\|js\\|png\\|jpg\\|gif\\|pdf\\|mp3\\|ogg\\|swf" :include ("CNAME") :exclude "docs" :publishing-directory ,public-directory :recursive t :publishing-function org-publish-attachment) ("index" :inline-components ("contents") :inline-plist (:body-only t :publishing-function inline-html-publish) :publishing-function inline-html-publish-index :base-directory ,base-directory :base-extension "org" :exclude ".*" :include ("index.org") :publishing-directory ,public-directory :recursive t :headline-levels 4 :auto-preamble t) ("site" :components ("index" "contents" "static"))))
inline-components
and inline-plist
are my own properties, they're used in inline-html-publish-index
to configure the mono-page subpage exports.
4.2. Functions
I wrote a small helper macro to make writing some HTML
tags easier. It's not particularly fancy but it does the job.
(defmacro with-tag (tag attribute-alist &rest body) "Insert <TAG `attributes'>, execute BODY, then insert </TAG>. ATTRIBUTE-ALIST's key-value pairs are converted into HTML attributes." (declare (indent 1) (debug t)) (let ((attribute-list (gensym "attribute-list")) (attributes (gensym "attributes"))) `(let* ((,attribute-list (cl-loop for (prob . val) in ,attribute-alist collect (concat prob "='" val "'"))) (,attributes (string-join ,attribute-list " "))) (insert (concat "<" ,tag " " ,attributes ">")) ,@body (insert (concat "</" ,tag ">")))))
The previously mentioned inline-html-publish
function works by getting the to-be-published file and exporting it as HTML
into a temp buffer, adding some HTML
to the top via a format string, then plopping that plus some metadata into the inline-html-publish-index-subpages
variable.
(defun inline-html-publish (plist filename pub-dir) "Publish FILENAME into `inline-html-publish-index-subpages'. Stored as (timestamp output-filename title html-string). Update the `:inline-components' in PLIST with `:inline-plist'. PUB-DIR is ignored." (interactive) (let* ((org-inhibit-startup t) (visiting (find-buffer-visiting filename)) (work-buffer (or visiting (find-file-noselect filename))) (output-filename (org-publish-file-relative-name (file-name-with-extension filename ".html") plist)) (temp-buffer (generate-new-buffer " *temp*" t)) (title nil)) (with-current-buffer work-buffer (setq title (org-get-title)) (org-export-to-buffer 'html temp-buffer nil nil nil t)) (with-current-buffer temp-buffer (when title (insert (format inline-html-publish-page-format (file-name-sans-extension output-filename) output-filename title))) (push `(,(org-publish-cache-mtime-of-src filename) ,output-filename ,title ,(buffer-string)) inline-html-publish-index-subpages)) (kill-buffer temp-buffer) (unless visiting (kill-buffer work-buffer))))
To the "publish" the index.org
file it uses my inline-html-publish-index
function. It first merges the properties for the components from :inline-components
with :inline-plist
and does some other setup-related tasks. Then publishes the index.org
file to the html-buffer
buffer, followed up publishing the :inline-components
with the now updated project alist.
(defun inline-html-publish-index (plist filename pub-dir) "Build a mono-page from FILENAME and `inline-html-publish-index-subpages'. Merges the `:inline-plist' with the plist of `:inline-components' in PLIST" (interactive) (let* ((org-inhibit-startup t) (org-html-preamble-format `(("en" "<h1 class='author'>%a</h1>"))) (org-html-postamble-format `(("en" ,(concat "<p class=\"author\">Date: %T</p>" "<p class=\"author\">%c</p>")))) (visiting (find-buffer-visiting filename)) (work-buffer (or visiting (find-file-noselect filename))) (output-filename (file-name-concat pub-dir (file-name-with-extension (org-publish-file-relative-name filename plist) ".html"))) (html-buffer (find-file-noselect output-filename nil t)) (inline-plist (plist-get plist :inline-plist)) (inline-projects (cl-loop for component in (plist-get plist :inline-components) collect (let* ((project (assoc component org-publish-project-alist)) (project-name (car project)) (project-plist (cdr project))) (cons project-name (append inline-plist project-plist)))))) (with-current-buffer work-buffer (org-export-to-buffer 'html html-buffer nil nil nil nil)) (org-publish-projects inline-projects) (setq inline-html-publish-index-subpages (mapcar #'cdr (sort inline-html-publish-index-subpages #'(lambda (x y) (not (time-less-p (car x) (car y))))))) (with-current-buffer html-buffer (goto-char (point-min)) (search-forward inline-html-publish-toc-marker) (with-tag "ul" nil (cl-loop for (html-filename title html) in inline-html-publish-index-subpages do (with-tag "li" nil (with-tag "a" `(("href" . ,(concat "#" (file-name-sans-extension html-filename)))) (insert title))))) (goto-char (point-min)) (search-forward inline-html-publish-subpage-marker) (cl-loop for (html-filename title html) in inline-html-publish-index-subpages do (with-tag "div" '(("class" . "subpage")) (insert html))) (save-buffer)) (kill-buffer html-buffer) (unless visiting (kill-buffer work-buffer))))
The :inline-components
uses the inline-html-publish
function which populates the inline-html-publish-index-subpages
variable. That's sorted and the timestamps tripped out. It then finds the markers within the html buffer and inserts the relevant content.
And that's it.
4.3. Wait wait where did those markers come from?
They're just embedded as exported HTML
blocks in the index.org
file. Here's what that looks like
#+title:Jake Moss --- Mono-page #+begin_export html <div class='centred-container'> <div class='abstract'> <h2>About me</h2> I'm a Computer Science Honours student at the University of Queensland interested in high performance computing, lisps, and functional programming. I mainly write Cython in Emacs for my day job and am working on computer algebra systems for my thesis. </div> </div> #+end_export This is a mono-page of everything here (although not much). I prefer being able to scroll and search through web pages in their entirety. Here's a list of things on this page #+begin_src emacs-lisp :exports results :results value html (string-join `("<div>" ,inline-html-publish-toc-marker "</div>") "\n") #+end_src #+begin_export html <div> <!-- Inline html toc marker --> </div> #+end_export Pages can be narrowed to by clicking on their respective titles. #+begin_src emacs-lisp :exports results :results value html (string-join `("<div>" ,inline-html-publish-subpage-marker "</div>") "\n") #+end_src #+begin_export html <div> <!-- Inline html subpage marker --> </div> #+end_export Some things in =org-publish= are just hard coded and you cannot configure them, one such thing is the detection of $\LaTeX$ fragments to determine if the =mathjax= scripts should be included. Because I'm inserting the subpages myself the =mathjax= scripts in them don't apply. So there's one down here to make sure they work on the mono-page.