Blogs should be Trees

Posted Jan. 3 2022 by Tarek.

I believe in the Feynman Technique and the saying that "If you can't explain something in simple terms, you don't understand it". When I am struggling to understand a topic, I break it down into blocks and try to understand each block by explaining it to myself in writing. Doing so sheds light on possible holes in my understanding which, when addressed, make subsequent blocks more comprehensible.

It's been over a decade since I published my last blog post, but I haven't been completely idle in the meantime. I accumulated a lot of written pieces which have aided me at work and in my studies, but I never published them. I also started seeing patterns in those writings. For example, one piece might depend on another, or a piece might be reusable in more than one place. Consequently, ideas on how to make use of those patterns kept coming, in particular about how I would structure my posts if I were to start blogging again.

Modeling a Blog Post

Plain Old Post

Normally a blog post with rich content consists of a single chunk of data with information about how it should be rendered already baked in:

<h1>This is the post title</h1>
<p>This is a post paragraph</p>
<h2>Title of section A</h2>
<p>This content belongs to section A</p>
<h2>Title of section B</h2>
<p>This content belongs to section B</p>

Representing the above as a single blob of data lacks the semantics that could help give the reader an improved UX. Alternatively, we could logically split the above into blocks and visualize it that way.

Post as a Series of Blocks

<!-- block_id=0 -->
<h1>This is the post title</h1>
<!-- block_id=1 -->
<p>This is a post paragraph</p>
<!-- block_id=2 -->
<h2>Title of section A</h2>
<!-- block_id=3 -->
<p>This content belongs to section A</p>
<!-- block_id=4 -->
<h2>Title of section B</h2>
<!-- block_id=5 -->
<p>This content belongs to section B</p>

A blog post is then a concatenation of blocks. Rendering blocks (as opposed to the post as a whole) is a first step towards adding some level of semantics to the data being rendered. Those semantics allow some flexibility in how a specific block is treated or rendered, for example deciding to render an interactive code executor when a block contains a pre element with JavaScript code inside.
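To make the idea concrete, here is a minimal sketch of how comment-delimited markup like the above could be split into blocks and inspected one by one. The function names and the data-lang attribute are hypothetical, introduced only for illustration; they are not taken from this blog's code.

```javascript
// Hypothetical helper: split markup annotated with <!-- block_id=N -->
// comments into a list of { id, html } blocks.
function splitIntoBlocks(markup) {
  const marker = /<!--\s*block_id=(\d+)\s*-->/g;
  const blocks = [];
  let previous = null;
  let match;
  while ((match = marker.exec(markup)) !== null) {
    if (previous !== null) {
      // The previous block ends where this marker begins.
      previous.html = markup.slice(previous.start, match.index).trim();
    }
    previous = { id: Number(match[1]), start: marker.lastIndex, html: "" };
    blocks.push(previous);
  }
  if (previous !== null) {
    // The last block runs to the end of the markup.
    previous.html = markup.slice(previous.start).trim();
  }
  return blocks.map(({ id, html }) => ({ id, html }));
}

// Assumed rule for illustration: a block is "runnable" when it is a <pre>
// element flagged as JavaScript through a hypothetical data-lang attribute.
function isRunnableCode(block) {
  return /^<pre\b[^>]*data-lang="javascript"/.test(block.html);
}
```

A renderer could then loop over the result of splitIntoBlocks and attach an interactive executor only to the blocks for which isRunnableCode returns true, leaving all other blocks untouched.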

The next step would be to separate a block's data from the block's rendering information.

Separating Data from Presentation

With data untangled from presentation, a post's content could be modeled as follows:

[
  {
    "type": "header",
    "data":  "This is the Post Title"
  },
  {
    "type": "paragraph",
    "data":  "This is a post paragraph"
  },
  {
    "type": "header",
    "data":  "Title of Section A",
    "level": 1
  },
  {
    "type": "paragraph",
    "data":  "This content belongs to section A"
  },
  {
    "type": "header",
    "data":  "Title of Section B",
    "level": 1
  },
  {
    "type": "paragraph",
    "data":  "This content belongs to section B"
  }
]

Such a model enables fine-grained processing based on the block type. Is the data of a block of type code too long? Display a short version and allow the user to expand it. Is this a markdown paragraph that should be rendered to HTML? Add a property "content_type": "markdown" to the relevant block. The level property on header blocks determines which header tag <h[1-6]> to use when rendering.
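As a rough sketch of what such type-based processing might look like, the snippet below renders a flat list of blocks to HTML. Several things here are assumptions rather than facts about any real implementation: the function names, the MAX_CODE_LINES threshold, the "collapsed" class, and the reading of level as an offset where a missing level means the post title (h1) and level 1 means h2.

```javascript
const MAX_CODE_LINES = 10; // assumed threshold for collapsing long code blocks

function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Render a single block; every block type gets its own branch.
function renderBlock(block) {
  switch (block.type) {
    case "header": {
      // Assumption: no level means the post title, level N maps to h(N+1).
      const level = Math.min((block.level ?? 0) + 1, 6);
      return `<h${level}>${escapeHtml(block.data)}</h${level}>`;
    }
    case "paragraph":
      return `<p>${escapeHtml(block.data)}</p>`;
    case "code": {
      // Show only the first lines of long code and mark the block as collapsed.
      const lines = block.data.split("\n");
      const shown = lines.slice(0, MAX_CODE_LINES).join("\n");
      const cls = lines.length > MAX_CODE_LINES ? ' class="collapsed"' : "";
      return `<pre${cls}>${escapeHtml(shown)}</pre>`;
    }
    default:
      throw new Error(`Unknown block type: ${block.type}`);
  }
}

// A post is the concatenation of its rendered blocks.
function renderPost(blocks) {
  return blocks.map(renderBlock).join("\n");
}
```

Because each type is handled in its own branch, supporting a new kind of block (or a property such as content_type) only touches one place in the renderer.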

This data model works nicely in Notion, and editor.js is based on it as well.

Referencing a Part of a Post

One particular feature I wanted is the ability to reference a specific part of a post. Traditionally this is achieved via bookmarks (appending #block_id to the URL), which makes the browser scroll to the element carrying this id in the web page. I don't want to just scroll to it, though. I want the focus to be on it. I want it to stand out from the rest of the post, for example by highlighting it. Achieving this via JavaScript could look like:

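// Highlight the block referenced by the URL fragment, if such an element exists.
const targetId = window.location.hash.substring(1);
const target = targetId ? document.getElementById(targetId) : null;
if (target) {
  target.classList.add("highlight");
}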

This, however, will highlight a single block only. When information in the next block completes the highlighted one but the next block is not highlighted, things might look messy. A flat list of blocks simply does not contain enough information to decide where a specific highlight should end. To achieve this effect, a different structure is needed, one where blocks that belong together can be grouped together: a tree.

Enter the BlockTree

A flat list of blocks could be augmented with grouping information. A tree data structure, on the other hand, has this grouping information already built in, in the form of the parent-child relationship. Consider the example post above when modeled as a tree:

{
  "type": "header",
  "data": "This is the Post Title",
  "children": [
    {
      "type": "paragraph",
      "data": "This is a post paragraph"
    },
    {
      "type": "header",
      "data": "Title of Section A",
      "children": [
        {
          "type": "paragraph",
          "data": "This content belongs to section A"
        }
      ]
    },
    {
      "type": "header",
      "data": "Title of Section B",
      "children": [
        {
          "type": "paragraph",
          "data": "This content belongs to section B"
        }
      ]
    }
  ]
}

Note that the level information was omitted, as it can be inferred from a node's depth in the tree. Rendering the above to HTML could look like:

<div id="node-1">
  <h1>This is the Post Title</h1>
  <div id="node-2">
    <p>This is a post paragraph</p>
  </div>
  <div id="node-3">
    <h2>Title of Section A</h2>
    <div id="node-4">
      <p>This content belongs to section A</p>
    </div>
  </div>
  <div id="node-5">
    <h2>Title of Section B</h2>
    <div id="node-6">
      <p>This content belongs to section B</p>
    </div>
  </div>
</div>
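Nested markup of this kind could be produced by a short recursive function. The following is only a sketch under stated assumptions, not necessarily how this blog renders its posts: renderNode and escapeHtml are hypothetical names, ids are handed out in visiting order, the header tag is derived from the node's depth, and every non-header node is treated as a paragraph.

```javascript
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Render a node inside its own <div>, followed by its children.
// `counter` is shared across the recursion so ids follow pre-order.
function renderNode(node, depth = 1, counter = { next: 1 }, indent = "") {
  const id = `node-${counter.next++}`;
  // The heading level is inferred from the depth; it is not stored in the node.
  const tag = node.type === "header" ? `h${Math.min(depth, 6)}` : "p";
  const lines = [
    `${indent}<div id="${id}">`,
    `${indent}  <${tag}>${escapeHtml(node.data)}</${tag}>`,
  ];
  for (const child of node.children ?? []) {
    lines.push(renderNode(child, depth + 1, counter, indent + "  "));
  }
  lines.push(`${indent}</div>`);
  return lines.join("\n");
}
```

Calling renderNode on the root of the tree yields one wrapper element per node, with each child wrapper nested inside its parent's wrapper.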

With that nested structure, highlighting node-3 would automatically highlight node-4, since node-4 is its child and the two semantically belong together.
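The same reasoning can be expressed directly on the tree data. The sketch below assumes that ids are assigned in pre-order (node-1 for the root, then its descendants in document order) and uses a hypothetical function name, highlightedIds, to list every node that ends up highlighted for a given target.

```javascript
// Return the id of the target node plus the ids of all its descendants.
// Assumption: ids are assigned in pre-order, starting at node-1 for the root.
function highlightedIds(tree, targetId) {
  let next = 1;
  const result = [];
  function visit(node, insideHighlight) {
    const id = `node-${next++}`;
    // A node is highlighted if it is the target or lies below the target.
    const active = insideHighlight || id === targetId;
    if (active) {
      result.push(id);
    }
    for (const child of node.children ?? []) {
      visit(child, active);
    }
  }
  visit(tree, false);
  return result;
}
```

For the example post, asking for node-3 returns node-3 and node-4, while asking for node-2 returns only node-2, because that paragraph has no children.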

This Blog

In this blog I will be experimenting with this BlockTree concept to see whether it is actually useful. Posts (including this one) will show a small dotted icon next to nodes which support interaction (desktop only for now). The interaction currently available gets you a direct link to that node, which displays it on a standalone page and might be useful for some people. Later I might consider enabling comments on particular nodes, as well as other features. Additionally, modeling posts in blocks makes it relatively easy to generate other types of views. For example, append .md to the URL of any blog post to see a text-only markdown version of it.
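To illustrate why alternative views come cheaply with this model, here is a minimal sketch of a markdown view generated from the same tree. It is an assumption about how such a view could be built, not the code that serves the .md pages; toMarkdown is a hypothetical name, and only header and paragraph nodes are considered.

```javascript
// Convert a node and its children to markdown.
// The number of "#" characters is derived from the node's depth in the tree.
function toMarkdown(node, depth = 1) {
  const own =
    node.type === "header"
      ? `${"#".repeat(Math.min(depth, 6))} ${node.data}`
      : node.data;
  const parts = [own];
  for (const child of node.children ?? []) {
    parts.push(toMarkdown(child, depth + 1));
  }
  // Blank lines separate markdown blocks.
  return parts.join("\n\n");
}
```

The traversal is the same as for the HTML view; only the per-node output differs, which is what makes adding further view types inexpensive.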