Latest posts for tag staticsite
These are some notes about my redesign work in staticsite 2.x.
Mapping constraints and invariants
I started keeping notes of constraints and invariants, and this helped a lot in keeping bounds on the cognitive efforts of design.
I particularly liked how mapping the set of constraints added during site generation helped break down processing into a series of well-defined steps. Code that handles each step now has a specific task, and can rely on clear assumptions.
Declarative page metadata
I designed page metadata as declarative fields added to the Page class.
I used typed descriptors for the fields, so that metadata fields can now have logic and validation, and are self-documenting!
This is the core of the Field implementation.
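To give an idea of what a typed descriptor field can look like, here is a minimal sketch. It is not staticsite's actual Field code: the names `Field`, `StrField`, `validate` and the `title` field are assumptions made up for illustration.

```python
from __future__ import annotations

from typing import Any, Generic, Optional, TypeVar

T = TypeVar("T")


class Field(Generic[T]):
    """Declarative metadata field (hypothetical sketch, not staticsite's real class)."""

    def __init__(self, *, default: T, doc: str = "") -> None:
        self.default = default
        # The docstring travels with the field, so it can document itself
        self.__doc__ = doc
        self.name = ""

    def __set_name__(self, owner: type, name: str) -> None:
        # Python calls this when the owning class body is evaluated
        self.name = name

    def validate(self, value: Any) -> T:
        # Subclasses override this to add per-field logic
        return value

    def __get__(self, obj: Optional[object], objtype: Optional[type] = None) -> Any:
        if obj is None:
            # Accessed on the class: return the field itself for introspection
            return self
        return obj.__dict__.get(self.name, self.default)

    def __set__(self, obj: object, value: Any) -> None:
        obj.__dict__[self.name] = self.validate(value)


class StrField(Field[str]):
    def validate(self, value: Any) -> str:
        if not isinstance(value, str):
            raise TypeError(f"{self.name} must be a string, not {type(value).__name__}")
        return value.strip()


class Page:
    title = StrField(default="", doc="Title of the page")
```

Assigning `page.title = "  Hello "` goes through `validate`, and `Page.title.__doc__` gives the field's documentation.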
Lean core
I tried to implement as much as possible in feature plugins, leaving to the staticsite core only what is essential to create the structure for plugins to build on.
The core provides a tree structure, an abstract Page object that can render to a file and resolve references to other pages, a Site that holds settings and controls the various loading steps, and little else.
The only type of content supported by the core is static asset files: Markdown, RestructuredText, images, taxonomies, feeds, directory indices, and so on, are all provided via feature plugins.
Feature plugins
Feature plugins work by providing functions to be called at the various loading steps, and mixins to be added to site pages.
Mixins provided by feature plugins can add new declarative metadata fields, and extend Page methods: this ends up being very clean and powerful, and plays decently well with mypy's static type checking, too!
See for example the code of the alias feature, which allows a page to declare aliases that redirect to it. This is useful, for example, when moving content around.
It has a mixin (AliasPageMixin) that adds an aliases field that holds a list of page paths.
During the "generate" step, when autogenerated pages can be created, the aliases feature iterates through all pages that defined an aliases metadata, and generates the corresponding redirection pages.
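The shape of such a feature could be sketched roughly as follows. Only the name AliasPageMixin comes from the text above: `Page`, `MarkdownPage`, `RedirectPage` and `generate_aliases` are simplified, hypothetical stand-ins, and the real feature uses declarative fields rather than a plain constructor argument.

```python
from __future__ import annotations

from typing import Iterable, List


class Page:
    """Simplified stand-in for the core page class."""

    def __init__(self, path: str) -> None:
        self.path = path


class AliasPageMixin(Page):
    """Mixin adding a list of alias paths to a page (simplified)."""

    def __init__(self, path: str, *, aliases: Iterable[str] = ()) -> None:
        super().__init__(path)
        self.aliases: List[str] = list(aliases)


class MarkdownPage(AliasPageMixin, Page):
    """A page type that opts into the aliases metadata."""


class RedirectPage(Page):
    """Autogenerated page that redirects to another page."""

    def __init__(self, path: str, target: Page) -> None:
        super().__init__(path)
        self.target = target

    def render(self) -> str:
        return f'<meta http-equiv="refresh" content="0; url={self.target.path}">'


def generate_aliases(pages: Iterable[Page]) -> List[RedirectPage]:
    """The "generate" step: create one redirect page per declared alias."""
    redirects = []
    for page in pages:
        # Pages without the mixin simply have no aliases
        for alias in getattr(page, "aliases", ()):
            redirects.append(RedirectPage(alias, page))
    return redirects
```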
Self-documenting code
Staticsite can list loaded features, features can list the page subclasses that they use, and pages can list metadata fields.
As a result, each feature, each type of page, and each field of each page can generate documentation about itself: the staticsite reference is autogenerated in that way, mostly from Feature, Page, and Field docstrings.
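As a rough illustration of how docstrings can drive a generated reference, here is a small sketch. The `Field` and `Page` classes and the `document` function are hypothetical simplifications, not the code that builds the staticsite reference.

```python
import inspect


class Field:
    """Minimal stand-in for a metadata field carrying its own documentation."""

    def __init__(self, doc: str) -> None:
        self.__doc__ = doc


class Page:
    """Base class for all site pages."""

    title = Field("Title of the page")
    date = Field("Publication date")


def document(page_cls: type) -> str:
    """Build a plain-text reference for a page class from runtime docstrings."""
    lines = [page_cls.__name__, inspect.getdoc(page_cls) or "", ""]
    # Walk base classes first, so that inherited fields are listed too
    for cls in reversed(page_cls.__mro__):
        for name, value in vars(cls).items():
            if isinstance(value, Field):
                lines.append(f"  {name}: {value.__doc__}")
    return "\n".join(lines)
```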
Understand the language, stay close to the language
Python has matured massively in recent years, and I like to stay on top of the language and standard library release notes for each release.
I like how what used to be dirty hacks have now found a clean way into the language:
- what one would implement with metaclass magic one can now mostly do with descriptors, and get language support for it, including static type checking.
- understanding the inheritance system and method resolution order allows one to write type-checkable mixins
- runtime-accessible docstrings help a lot with autogenerating documentation
- os.scandir and os functions that accept directory file descriptors make filesystem exploration pleasantly fast, for an interpreted language!
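For the last point, a minimal sketch of scanning a tree through directory file descriptors could look like this. It assumes a POSIX system and Python 3.7 or later (where os.scandir accepts a file descriptor); `walk_fd` is a name made up for the example, not a staticsite function.

```python
import os
from typing import Iterator


def walk_fd(dir_fd: int, relpath: str = "") -> Iterator[str]:
    """Yield relative paths of regular files below the directory open at dir_fd."""
    with os.scandir(dir_fd) as entries:
        for entry in entries:
            path = os.path.join(relpath, entry.name)
            if entry.is_dir(follow_symlinks=False):
                # Open the subdirectory relative to the parent descriptor,
                # so the kernel does not need to re-resolve the full path
                sub_fd = os.open(
                    entry.name, os.O_RDONLY | os.O_DIRECTORY, dir_fd=dir_fd
                )
                try:
                    yield from walk_fd(sub_fd, path)
                finally:
                    os.close(sub_fd)
            elif entry.is_file(follow_symlinks=False):
                yield path
```

The caller opens the root once with `os.open(root, os.O_RDONLY | os.O_DIRECTORY)` and closes it when done.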
In theory I wanted to announce the release of staticsite 2.0, but then I found bugs that prevented me from writing this post, so I'm also releasing 2.1, 2.2, and 2.3 :grin:
staticsite is the static site generator that I ended up writing after giving other generators a try.
I did a big round of cleanup of the code, which among other things allowed me to implement incremental builds.
It turned out that staticsite is fast enough that incremental builds are not really needed; however, a bug in caching rendered markdown made me forget about that. Now I have fixed that bug, too, and I can choose between running staticsite fast and ridiculously fast.
My favourite bit of this work is the internal cleanup: I found a way to simplify the core design massively, and now the core and plugin system are simple enough that I can explain them, and I'll probably write a blog post or two about it in the next few days.
On top of that, staticsite is basically clean with mypy running in strict mode! Getting there was a great ride which prompted a lot of thinking about designing code properly, as mypy is pretty good at flagging clumsy hacks.
If you want to give it a try, check out the small tutorial A new blog in under one minute.
I just released staticsite version 1.4, dedicated to creating a blog.
GitHub mode
Tobias Gruetzmacher implemented GitHub mode for staticsite.
Although GitHub now has a similar site rendering mode, it doesn't give you a live preview: if you run ssite serve on a GitHub project you will get a live preview of README.md and the project documentation.
Post series
I have added support for post series, which allow you to easily interlink posts with previous/next links.
You can see it in action on links and on An Italian song a day, an ongoing series that currently posts a link to an Italian song each day.
A year ago, I wrote:
Instead of keeping substantial tabs open until I have read all of them, or losing them in the jungle of browser bookmarks, I have written a script that collects them into a file per month, and turns them into markdown files for my blog.
That script turned out to be quirky and overengineered, so much so that I stopped using it myself.
I've now rethought my approach, and downscaled it: instead of saving a copy of each page locally, I can blog a reference to https://archive.org or https://archive.is. I do not need to autogenerate a description from the site itself.
The result has been a nicely minimal set of changes to staticsite that resulted in a new version where adding a link to a monthly collection is as easy as typing ssite new -a links.
As long as I remember to rebuild the site 3 weeks from now, a new post should automagically appear in my blog.
At work, to simplify build dependencies of DB-All.e we decided to port the documentation from LaTeX to Markdown.
Shortly after starting with the porting I resented not having a live preview of my work. I guess I got addicted to it with staticsite.
Actually, staticsite does preview interlinked Markdown files. I wonder if GitHub supports cross-linking between Markdown files in the same repo? It does, and incidentally it uses the same syntax as staticsite.
It shouldn't take long to build a different front-end on top of the staticsite engine just for this purpose. Indeed it didn't take long: here it is: mdpreview.
So, as you are editing the README.md of your project, you can now run mdpreview in the project directory, and you get live preview on your browser.
If your README.md links to other documentation in your project, those links will work, too.
mdpreview uses the same themes as staticsite, so you can even tweak its appearance. And if you need to render the documentation and put it online somewhere, then staticsite can render it for you.
I experimented with it by splitting staticsite's documentation into several parts, and I had great fun with it.
So, you want live preview of your project's Markdown documentation? mdpreview
When you are happy with it you can commit it to GitHub and it will show just fine.
You want it to show on your website instead? Build it with staticsite.
I'm considering merging staticsite and mdpreview somehow. Maybe mdpreview could just be a different command line front-end to staticsite's functionality. That's food for thought for the next few days.
Would you prefer to preview something else instead of Markdown? There is actually nothing markup specific in staticsite, so you can take this file as inspiration and implement support for the markup language of your choice in this whole toolchain. Except maybe for GitHub's website: that doesn't run on staticsite (yet).
I farm bits and pieces out to the guys who are much more brilliant than I am. I say, "build me a laser", this. "Design me a molecular analyzer", that. They do, and I just stick 'em together. (Seth Brundle, "The Fly")
When I decided to try and turn siterefactor into staticsite, I decided that I would go ahead only for as long as it could be done with minimal work, writing code in the most straightforward way on top of existing and stable components.
I am pleased by how far that went.
Python-Markdown
It works fast enough, already comes with extensions for most of what I needed, and can be extended in several ways.
One of the extension methods is a hook for manipulating the ElementTree of the rendered document before serializing it to HTML, which made it really easy to go and process internal links in all <a href= and <img src= attributes.
To tell an internal link from an external link I just use the standard python urlparse and see if the link has a scheme or a netloc component. If it does not, and if it has a path, then it is an internal link.
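The check described above fits in a few lines. This is a sketch of the idea using only the standard library, not staticsite's actual code: `is_internal`, `rewrite_internal_links` and the `resolve` callback are names I made up, and in the real thing the tree walk would live inside a Python-Markdown extension hook.

```python
import xml.etree.ElementTree as ET
from typing import Callable
from urllib.parse import urlparse

# Attributes that may hold links, per tag (the two named in the text)
LINK_ATTRS = {"a": "href", "img": "src"}


def is_internal(url: str) -> bool:
    """A link is internal if it has no scheme, no netloc, and a path."""
    parsed = urlparse(url)
    if parsed.scheme or parsed.netloc:
        return False
    return bool(parsed.path)


def rewrite_internal_links(root: ET.Element, resolve: Callable[[str], str]) -> None:
    """Replace internal link targets in place with whatever resolve() returns."""
    for el in root.iter():
        attr = LINK_ATTRS.get(el.tag)
        if attr is None:
            continue
        url = el.get(attr)
        if url and is_internal(url):
            el.set(attr, resolve(url))
```

Note that a bare fragment such as `#top` has no path, so it is left alone.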
This also means that I do not need to invent new Markdown syntax for internal references, avoiding the need for remembering things like [text]({{< relref "blog/post.md" >}}) or [text]({filename}/blog/post.md).
In staticsite, it's just [text](/blog/post.md) or [text](post.md) if the post is nearby.
This feels nicely clean to me: if I wanted to implement fancy markdown features, I could do it as Python-Markdown extensions and submit them upstream. If I wanted to implement fancy interlinking features, I could do it with a special url scheme in links.
For example, it would be straightforward to implement a ssite: url scheme that expanded the url with elements from staticsite's settings using a call to python's string.format (ssite:{SETTING_NAME}/bar maybe?), except I do not currently see any use cases for extending internal linking from what it is now.
Jinja2
Jinja2 is a template engine that I already knew. It is widely used, powerful, and pleasant to use, both on the templating side and on the API side.
It is not HTML specific, so I can also use it to generate Atom, RSS2, "dynamic" site content, and even new site Markdown pages.
Implementing RSS and Atom feeds was just a matter of writing and testing these Jinja2 macros and then reusing them anywhere.
toml, yaml, json
No need to implement my own front matter parsing. Also, reusing the same syntax as Hugo allows me to just link to its documentation.
python-slugify
I found python-slugify so I did not bother writing a slug-generating function.
As a side effect, things now work better than I would even have thought to implement, including transliteration of non-ascii characters:
$ ./ssite new example --noedit --title "Cosí parlò Enrico"
/enrico-dev/staticsite/example/site/blog/2016/cosi-parlo-enrico.md
(I just filed an RFP)
python-livereload
Implementing ssite serve, which monitors the file system, autoreloads when content changes, and renders everything on the fly, took about an hour. Most of that hour went into implementing rendering pages on demand.
Then I discovered that it autoreloads even when I edit staticsite's source code.
Then I discovered that it communicates with the browser and even automatically triggers a page refresh.
I can keep vim on half my screen and a browser in the other half, and I get live preview for free every time I save, without ever leaving the editor.
Bootstrap
I already use Bootstrap at work, so creating the default theme templates with it took about 10 minutes.
This morning I tried looking at my website using my mobile phone, and I pleasantly saw it automatically turning into a working mobile version of itself.
Pygments
Python-Markdown uses Pygments for syntax highlighting, and it can be themed just by loading a .css.
So, without me really doing anything, even staticsite's syntax highlighting is themable, and there's even a nice page with a list of themes to choose from.
Everything else...
Command line parsing? Straight argparse.
Logging? python's logging support.
Copying static resource files? shutil.copy2.
Parsing dates? dateutil.parser.
Timing execution? time.perf_counter.
Timezone handling? pytz.
Building the command to run an editor? string.format.
Matching site pages? fnmatch.translate.
...and then some.
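As an example of how little glue some of these need, matching site pages with fnmatch.translate could be sketched like this. The function names and the sample paths are assumptions for illustration only.

```python
import fnmatch
import re
from typing import Iterable, List, Pattern


def compile_page_filter(pattern: str) -> Pattern[str]:
    """Turn a shell-style glob into a compiled regular expression."""
    return re.compile(fnmatch.translate(pattern))


def find_pages(paths: Iterable[str], pattern: str) -> List[str]:
    """Return the site paths that match the glob, compiling it only once."""
    regex = compile_page_filter(pattern)
    return [path for path in paths if regex.match(path)]
```

In fnmatch, `*` also matches `/`, so `blog/*.md` selects posts at any depth under blog/.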
If I ever decide to implement incremental rendering, how do I implement tracking which source files have changed?
Well, for example, how about just asking git?
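A sketch of what asking git could look like, under the assumption that the site sources are tracked in a git repository. `changed_since` and `parse_name_only` are hypothetical helpers, and this only sees tracked files: new untracked sources would need a separate check.

```python
import subprocess
from typing import List


def parse_name_only(output: str) -> List[str]:
    """Split NUL-separated output of `git diff --name-only -z` into paths."""
    return [name for name in output.split("\0") if name]


def changed_since(repo_dir: str, rev: str) -> List[str]:
    """List tracked files that differ between rev and the working tree."""
    res = subprocess.run(
        ["git", "diff", "--name-only", "-z", rev, "--"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_name_only(res.stdout)
```

Remembering the commit of the last build and calling `changed_since(site_dir, last_built_rev)` would then give the list of sources to rerender.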
I decided to rethink the state of my personal site, and try out some of the new static site generators that are available now.
To do that, I jotted down a series of things that I want in a static site generator, then wrote a tool to convert my ikiwiki site to other formats, and set out to evaluate things.
As a benchmark I did a full rebuild of my site, which currently contains 1164 static files and 458 markdown pages.
My requirements
Free layout for my site
My / is mostly a high-level index to the site contents.
Blog posts are at /blog.
My talk archive is organised like a separate blog at /talks.
I want the freedom to create other sections of the site, each with its own rss feed, located wherever I want in the site hierarchy.
Assets next to posts
I occasionally blog just a photo with a little
comment, and I would like the .md
page with the
comment to live next to the image in the file system.
I did not even know that I had this as a requirement until I found static site generators that mandated a completely different directory structure for markdown contents and for static assets.
Multiple RSS/Atom feeds
I want at least one RSS/Atom feed per tag, because I use tags for marking which articles go to http://planet.debian.org.
I also want RSS/Atom feeds for specific parts of the site, like the blog and talks.
Arbitrary contents in /index.html
I want total control over the contents of the main home page of the site.
Quick preview while editing contents
I do not like to wait several seconds for the site to be rebuilt at every review iteration of the pages I write.
This makes me feel like the task of editing is harder than it should, and makes me lose motivation to post.
Reasonable time for a full site rebuild
I want to be able to run a full rebuild of the site in a reasonable time.
I could define "reasonable" in this case as how long I can stare at the screen without getting bored, starting to do something else, and forgetting what it was that I was doing with the site.
It is ok if a rebuild takes something like 10 or 30 seconds. It is not ok if it takes minutes.
Code and dependency ecosystems that I can work with
I can deal with Python and Go.
I cannot deal with Ruby or JavaScript.
I forgot all about Perl.
Also, if it isn't in Debian it does not exist.
Decent themes out of the box
One of my hopes in switching to a more mainstream generator is to pick and choose themes and easily give my site a more modern look.
Hugo
Hugo is written in Go and is in Debian testing.
Full rebuild time for my site is acceptable, and it can even parallelize:
$ time hugo
real 0m5.285s
user 0m9.556s
sys 0m1.052s
Free layout for my site was hard to get.
I could replace /index.html by editing the template page for it, but then I did not find out how to create another similar index in an arbitrary place.
Also, archetypes are applied only on the first path component of new posts, but I would like them instead to be matched on the last path component first, and failing that, travelling up the path until the top. This should be easy to fix by reorganizing the content a bit around here.
For example, a path for a new blog post of mine could be blog/2016/debian/ and I would like it to match the debian archetype first, and failing that the blog archetype.
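The lookup order I have in mind can be sketched in Python as follows. This is not Hugo code: `find_archetype` and the `"default"` fallback name are assumptions made up to illustrate the matching order.

```python
from pathlib import PurePosixPath
from typing import Collection


def find_archetype(
    post_path: str, archetypes: Collection[str], default: str = "default"
) -> str:
    """Pick the archetype matching the deepest path component that has one."""
    # Try the last path component first, then travel up towards the top
    for component in reversed(PurePosixPath(post_path).parts):
        if component in archetypes:
            return component
    return default
```

With archetypes `{"blog", "debian"}`, the path blog/2016/debian/ picks debian, while blog/2016/misc/ falls back to blog.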
Assets next to posts almost work.
Hugo automatically generates one feed per taxonomy element, and one feed per section. This would be currently sufficient for me, although I don't like the idea that sections map 1 to 1 to toplevel directories in the site structure.
Hugo has a server that watches the file system and rerenders pages as they are modified, so the quick preview while editing works fine.
About themes, it took me several tries to find a theme that would render navigation elements for both sections and tags, and most themes would render my pages with white components all around, and expect me to somehow dig in and tweak them. That frustrated me, because for quite a while I could not tell if I had misconfigured Hugo's taxonomies or if the theme was just somehow incomplete.
Nikola
Nikola is written in Python and is in Debian testing.
Full rebuild time for my site is almost two orders of magnitude more than Hugo, and I am miffed to find the phrases "Nikola is fast." or "Fast building process" in its front page and package description:
$ time nikola build
real 3m31.667s
user 3m4.016s
sys 0m24.684s
Free layout could be achieved by fiddling with the site configuration to tell it where to read sources.
Assets next to posts work after tweaking the configuration, but they require writing inconsistent links in the markdown source: https://github.com/getnikola/nikola/issues/2266. I have a hard time accepting that, because I want to author content with consistent semantic interlinking, so that 10 years from now I can parse it and convert it to something else if a new technology comes out.
Nikola generates one RSS/Atom feed per tag just fine. I have not tried generating feeds for different sections of the site.
Incremental generation inside its built in server works fine.
Pelican
Pelican is written in Python and is in Debian testing.
Full rebuild time for my site is acceptable:
$ time pelican -d
real 0m18.207s
user 0m16.680s
sys 0m1.448s
By default, pelican seems to generate a single flat directory of html files regardless of the directory hierarchy of the sources. To have free layout, pelican needs some convincing in the configuration:
PATH_METADATA = r"(?P<relpath>.+)\.md"
ARTICLE_SAVE_AS = "{relpath}/index.html"
but even if I do that, the urls that it generates still point to just {slug}/index.html and I have not trivially found a configuration option to fix that accordingly. I got quite uncomfortable at the idea of needing to configure content generation and linking to match, instead of having one automatically being in sync with the other.
Having assets next to posts seems to be possible (also setting STATIC_PATHS = ["."]), but I do not recall making progress on this front.
I did not manage to generate a feed for each tag out of the box, though there is probably some knob in the configuration for it.
I gave up on Pelican, as trying it out felt like a constant process of hacking the configuration from defaults that do not make any sense for me, without even knowing if a configuration exists that would do what I need.
Ikiwiki
Ikiwiki is written in Perl and is in Debian. Although I am no longer proficient with Perl, I was already using it, so it was worth considering.
Full rebuild time feels a bit on the slow side but is still acceptable:
$ time ikiwiki --setup site.setup
real 0m37.356s
user 0m34.488s
sys 0m1.536s
In terms of free site structure and feeds for all or part of the site, ikiwiki just excels.
I even considered writing a python web server that monitors the file system and calls ikiwiki --refresh when anything changes, and calling it a day.
However, when I tried to re-theme my website around a simple bootstrap boilerplate, I found that to be hard, as some of the HTML structure is hardcoded in Perl (and it's also my fault) and there is only so much that can be done by tweaking the (rather unreadable) templates.
siterefactor
During all these experiments I had built siterefactor to generate contents for all those static site engines, and it was going through all the contents quite fast:
$ time ./siterefactor src dst -t hugo
real 0m1.222s
user 0m0.904s
sys 0m0.308s
So I wondered how slow it would become if, instead of making it write markdown, I made it write HTML via python markdown and Jinja2:
$ time ./siterefactor ~/zz/ikiwiki/pub/ ~/zz/ikiwiki/web -t web
real 0m6.739s
user 0m5.952s
sys 0m0.416s
I then started wondering how much slower it would become if I implemented postprocessing of all local URLs generated by Markdown to make sure they are kept consistent even if the path of a generated page is different from the path of its source. Not much slower, really.
I then added taxonomies. And arbitrary Jinja2 templates in the input, able to generate page lists and RSS/Atom feeds.
And theming.
And realised that reading all the sources and cross-linking them took 0.2 seconds, and the rest was generation time. And that after cross-linking, each page can be generated independently from all the others.
staticsite
So my site is now generated with staticsite:
$ time ssite build
real 0m6.833s
user 0m5.804s
sys 0m0.500s
It's comparable with Hugo, and on a single process.