High-performance tidy trees visualization

June 14, 2022 · 17 min read

NaiveTidy

This article introduces the algorithm to draw non-layered trees in linear time and re-layout partially when some nodes change in O(d) time, where d is the maximum depth of the changed node.

The source code is available at zxch3n/tidy. It only takes a few milliseconds to finish the layout of a tree with tens of thousands of nodes within the web browser.

Introduction
- Layered and non-layered
- Aesthetic rules
A naive non-layered tree visualization
Tidy tree visualization algorithm
Relayout partially
Benchmark
Future work
- Remaining aesthetic issue
- The Tidy Library
References

Introduction

Trees are ubiquitous data structures. Various visualizations of trees were proposed to help people understand different aspects of trees. This article focuses on the classical node-link diagrams.

A good tree drawing should be aesthetically pleasing while using as little space as possible. In 1979, Wetherell and Shannon [1] formalized aesthetic rules for tidy trees and presented the first O(n) algorithm to solve it. In 1981, Reingold and Tilford [2] extended the aesthetic rules to make the drawings more aesthetically pleasing and compact. However, both of them would violate the aesthetic rules when extended to m-ary trees. In 1990, Walker [3] solved this problem with O(n^2) time complexity. Buchheim [4] improved Walker’s work to run in O(n) in 2002. In 2014, Ploeg [5] presented the first O(n) algorithm to produce tidy non-layered drawings of trees.

This article introduces the layout algorithm of non-layered tidy trees [5] and presents a fast relayout algorithm. Engineers can use it to build fast tree editor tools like mindmaps.

Layered and non-layered

In tree visualization, each node may have variant width and height. For example, nodes may contain text content of different lengths. In this case, the tree can be drawn as layered, where nodes with the same depth are horizontally aligned, or non-layered, where there is a fixed vertical distance between parent and children.

Layered drawings make depth comparison easier, whereas non-layered ones generally use less space.

Non-Layered

Layered

The algorithm of non-layered drawing is harder than the layered one since the former can be easily extended to the latter. Ploeg [5] designed the first algorithm to finish the layout of non-layered tidy trees in linear time. The detail of the algorithm will be covered in the rest of this article. If you are interested in how layered drawing works, you can read the article by Bill Mill [6].

Aesthetic rules

A tidy drawing of a tree should obey the aesthetic rules while using as little space as possible. The aesthetic rules include:

No overlapped node
No crossed line
A node’s children should stay on the same line
Parents should be centered over their children
A subtree should be drawn the same way regardless of where it occurs in the tree
Nodes are ordered. Drawings of nodes should have the same order.
A tree and its mirror image should produce drawings that are reflections of one another, which implies
- Small, interior subtrees should be spaced out evenly among larger subtrees, where the larger subtrees are adjacent at one or more levels.
- Small subtrees at the far left or far right should be adjacent to larger subtrees.

A naive non-layered tree visualization

Most mind map applications use a naive version of the tidy layout, i.e., it obeys the aesthetic rules but does not care about compactness.

The naive tidy visualization algorithm treats each node and its offsprings as a bounding box, avoiding collision between the bounding boxes.

function naiveTidyLayout(root) {
  root.y = 0
  root.preOrderTraverse(node => {
    node.y = node.parent.y + node.parent.height + margin
  })
  root.postOrderTraverse(node => {
    if (node.children.length == 0) {
      node.bbox.width = node.width
      return
    }

    const childrenWidth = node.children.reduce(
      (acc, cur) => acc + cur.width + margin,
      -margin
    )
    node.bbox.width = max(node.width, childrenWidth)
    let relativeX = node.width / 2 - childrenWidth / 2
    for (const child of node.children) {
      child.relativeX = relativeX + child.bbox.width / 2 - child.width / 2
      relativeX += child.bbox.width + margin
    }
  })
  root.preOrderTraverse(node => {
    node.x = node.parent.x + node.relativeX
  })
}

It is easy to implement and has good performance. But its layout is not compact, as shown in the interactive example.

Tidy tree visualization algorithm

How to achieve compactness

Note that the aesthetic rules require that a subtree should be drawn the same way no matter where it appears in the tree. It means that if the layout of the subtree is done, inside it, all nodes’ relative positions to their parent are finalized. So we can use a post-order traversal to determine the relative positions. A high-level abstraction of the tree visualization algorithm is

function layout(root) {
  root.postOrderTraverse(node => {
    // determine each child subtree's relative position to the parent
    layoutSubtree(node, node.children)
  })

  root.preOrderTraverse(node => {
    finalizeAbsolutePosition(node)
  })
}

To arrange children compactly, during layoutSubtree, we need child subtrees to be as close to their siblings as possible. Therefore, we can abstract a tidy layout algorithm as below.

function layoutSubtree(node, children) {
  if (children.length == 0) {
    return
  }

  const prev = [children[0]]
  for (let i = 1; i < children.length; i++) {
    const cur = children[i]
    cur.relativeX = getMoveDistance(prev, cur)
    prev.push(cur)
  }

  positionRoot(node)
}

Suppose we are visualizing a complete k-ary tree, then we have , where is the time complexity of the getMoveDistance function. Based on master theorem,

If , then .
If , then .
If , then .
ϵ > 0 is a constant.

We only need a distance function with complexity of to make the algorithm run in linear time. - Ploeg [5] proposed a algorithm with , so that . Ploeg [5] prooved its linear complexity for the cases other than complete k-ary tree.

Fix aesthetic rule 7

The above algorithm satisfies aesthetic rules 1-6, but not aesthetic rule 7: a tree and its mirror image should produce drawings that are reflections of one another, as shown in the image below.

Violet Aesthetic Rule 7

Obey Aesthetic Rule 7

It happens when large neighbors surround small subtrees. The algorithm will pile the small ones to the left.

A simple fix is to take the average positions of the original layout and the mirror of the mirrored layout. But it tends to cluster the small subtrees at the center.

Taking average positions

Walker [3] designed the first algorithm to address this issue, giving a more visually pleasing output. Its idea is that when moving a subtree to the right, the move distance should also be distributed to the smaller interior subtrees.

How to fix aesthetic rule 7

To make the overall algorithm run within linear time, we need an O(1) method to

Find the intermediate siblings
Distribute the distance evenly to the intermedia siblings

Now the code changes to

function layoutSubtree(node, children) {
  if (children.length == 0) {
    return
  }

  const prev = [children[0]]
  for (let i = 1; i < children.length; i++) {
    const cur = children[i]
    const collideIndex = getCollisionIndex(prev, cur)
    const distance = getMoveDistance(prev, cur)
    cur.relativeX = distance
    distributeDistanceToInteriorSubtrees(children, distance, collideIndex, i)
    prev.push(cur)
  }

  positionRoot(node)
}

The O(n) layout algorithm [5]

To make the overall algorithm run in O(n), we need the following methods done in

1.  getMoveDistance(leftSiblings, subtree)
2.  getCollisionIndex(leftSiblings, subtree)
3.  distributeDistanceToInteriorSubtrees(children, distance, collideIndex, currentIndex)

Determine move distance

getMoveDistance calculates how far a subtree needs to move to avoid colliding with the siblings on its left. For this purpose, it only needs to calculate the distance between the siblings’ right and the subtree’s left contour.

Note that aesthetic rule 5 requires: a subtree should be drawn the same way regardless of where it occurs in the tree. So a subtree’s left and right contour are not affected by other subtrees or their ancestors.

We introduce four new variables for this purpose.

threadLeft: points to the next left contour node
modifierThreadLeft: the next left contour node x position relative to this
threadRight: points to the next right contour node
modifierThreadRight: the next right contour node x position relative to this

The thread of the right contour is:

The first node of the thread is the rightmost subtree’s root.
If the current thread node has children, then the next node in the thread is its last-child
If the current thread node c has no child, the next node in the thread is c.threadRight.
c.modifierThreadRight is c.threadRight’s x position relative to c

The left contour thread follows the right contour’s mirrored rules.

When calculating the contour’s distance, only nodes with intersections on the y-axis need to be compared. And in both layered and non-layered cases, each node’s y positions can be precomputed.

We can express the extended behaviors in pseudocode.

function layoutSubtree(node, children) {
  if (children.length == 0) {
    return;
  }

  const prev = [children[0]];
  for (let i = 1; i < children.length; i++) {
    const cur = children[i];
    const collideIndex = getCollisionIndex(prev, cur);
    const distance = getMoveDistance(prev, cur);
    cur.relativeX = distance;
    distributeDistanceToInteriorSubtrees(children, distance, collideIndex, i);
+   mergeContour(prev, cur);
    prev.push(cur);
  }

  positionRoot(node);
}

+function getMoveDistance(prev, cur) {
+  const curLeftContour = cur;
+  const prevRightContour = prev[prev.length - 1];
+  let maxDistance = 0;
+  while (curLeftContour && prevRightContour) {
+    const xL = getRelativeX(curLeftContour)
+    const xR = getRelativeX(prevRightContour) + prevRightContour.width + margin;
+    maxDistance = max(maxDistance, xR - xL);
+    const yL = curLeftContour.y + curLeftContour.height;
+    const yR = prevRightContour.y + prevRightContour.height;
+    if (yL <= yR) {
+      curLeftContour = nextLeftContour(curLeftContour);
+    }
+    if (yL >= yR) {
+      prevRightContour = nextRightContour(prevRightContour);
+    }
+  }
+
+  return maxDistance;
+}
+
+function mergeContour(prev, cur) {
+  if (bottom(prev) > bottom(cur)) {
+    const extremeRight = getRightThreadLastNode(cur)
+    extremeRight.threadRight = getRightThreadNodeAtY(prev[prev.length - 1], extremeRight.y);
+    extremeRight.modifierThreadRight = ...;
+  } else if (bottom(prev) < bottom(cur)) {
+    const extremeLeft = getLeftThreadLastNode(prev[0])
+    extremeLeft.threadLeft = getLeftThreadNodeAtY(cur, extremeLeft.y);
+    extremeLeft.modifierThreadLeft = ...;
+  }
+}

Get collided subtree

During the post-order traversal, inside each iteration, we do layouts on subtrees that share the same parent. getCollisionIndex get which subtree the right contour node, which produces the maxDistance, belongs to.

The easiest way to implement is to follow the parent pointer of the node to find the subtree’s root. But in the worst case, it takes O(n^2).

Notice that each subtree can only take a continuous y-span in the right contour thread of a forest. And the y-span always starts with another subtree’s bottom or 0 and ends with the bottom of the subtree.

So we can tell which subtree the right contour node belongs to by the y position.

Ploeg [5] used a linked list to construct this data structure:

+class IYL {
+  constructor(
+    public bottom: number,
+    public index: number,
+    public next: IYL | null
+  ) {}
+
+  update(minY: number, index: number) {
+    let cur = this;
+    while (cur != null && minY >= cur.bottom) cur = cur.next;
+    return new IYL(minY, index, cur);
+  }
+}

function layoutSubtree(node, children) {
  if (children.length == 0) {
    return;
  }

  const prev = [children[0]];
+ let iyl = new IYL(getBottom(children[0]), 0, null);
  for (let i = 1; i < children.length; i++) {
    const cur = children[i];
+   const [distance, collideIndex] = getMoveDistance(prev, cur, iyl);
    cur.relativeX = distance;
    distributeDistanceToInteriorSubtrees(children, distance, collideIndex, i);
    mergeContour(prev, cur);
    prev.push(cur);
+   iyl = iyl.update(getBottom(cur), i);
  }

  positionRoot(node);
}

function getMoveDistance(prev, cur, iyl) {
  const curLeftContour = cur;
  const prevRightContour = prev[prev.length - 1];
  let maxDistance = 0;
  let collideIndex = 0;
  while (curLeftContour && prevRightContour) {
+   if (xR.y + xR.height > iyl.bottom) {
+     iyl = iyl.next;
+   }

    const xL = getRelativeX(curLeftContour)
    const xR = getRelativeX(prevRightContour) + prevRightContour.width + margin;
+   if (xR - xL > maxDistance) {
+     maxDistance = xR - xL;
+     collideIndex = iyl.index;
+   }

    const yL = curLeftContour.y + curLeftContour.height;
    const yR = prevRightContour.y + prevRightContour.height;
    if (yL <= yR) {
      curLeftContour = nextLeftContour(curLeftContour);
    }
    if (yL >= yR) {
      prevRightContour = nextRightContour(prevRightContour);
    }
  }

  return [maxDistance, collideIndex];
}

Distribute spacing

We can implement it in a bruit-force way as below. But in the worst case, it causes the overall complexity to be .

function distributeDistanceToInteriorSubtrees(children, distance, from, to) {
  for (let i = from + 1; i < to; i++) {
    children[i].relativeX += (distance * (i - from)) / (to - from)
  }
}

[4] introduced a simple trick to make it O(1). Given that the distance distributed is always an arithmetic progression, we can cache the difference at the start node and clear the effect at the end node. We need two variables: shiftAcceleration and shiftChange. They are cached position changes that will apply to children[from+1..to] in finalizeAbsolutePosition.

function layout(root) {
  root.postOrderTraverse(node => {
    // determine each child subtree's relative position to the parent
    layoutSubtree(node, node.children);
  })

  root.preOrderTraverse(node => {
    finalizeAbsolutePosition(node);
  })
}

function finalizeAbsolutePosition(node) {
  addChildSpacing(node);
  ...
}

function addChildSpacing(node) {
  let speed = 0.;
  let delta = 0.;
  for (const child of node.children) {
    speed += child.shiftAcceleration;
    delta += speed + child.shiftChange;
    child.relativeX += delta;
  }
}

function distributeDistanceToInteriorSubtrees(children, distance, from, to) {
  if (to == from + 1){
    return;
  }

  children[from + 1].shiftAcceleration += distance/(to-from);
  children[to].shiftAcceleration -= distance/(to-from);
  children[to].shiftChange -= distance - distance/(to-from);
}

Final Code

You can find the final code of this article here. However, there are still a few trivial details missing in this abstraction. For details, please refer to the source code.

Interactive example

Layout Type:

Tidy

Nodes

In this example, you can try non-layered tidy layout, layered tidy layout, and naive layout.

Relayout partially

Node editing is a common scenario in applications like mind maps. If the relayout time is larger than 16ms, there will be obvious freezes, which is unacceptable. Therefore, a partial relayout is required for a smooth user experience on large editable tree visualization.

In this section, we exclude the finalizeAbsolutePosition and only talk about how to re-calculate each node’s relative position to their parent. Because changing one node’s size may cause the entire tree’s absolute positions to change, relative positions are much more stable as aesthetic rule 5 required. And in the real-world application, we only need the absolute positions of the nodes inside the screen. By detecting the collision between trees’ bounding boxes and the screen, we can filter out the offscreen content swiftly. Strictly speaking, the time complexity of partial relayout is O(d + m), where m is the in-screen nodes number, and d is the max depth of changed nodes.

Adding the partially relayout support to the naive version is relatively easy. Because the affected states are straightforward to reason about, i.e., only the bounding boxes of the edited nodes and their ancestors are changed.

For partial relayout, we need to determine which thread pointer caches are outdated. We say a subtree is changed if it or its offspring’s sizes change or an insertion or deletion happens inside this subtree. Note that

If all nodes inside a subtree are not changed, the thread pointers of its nodes that point to the node inside this subtree won’t change. This is because the layout of a tree is agnostic about nodes outside. And the thread pointers that point inside are only affected by the structure of the subtree itself.
In a subtree, only the deepest nodes, which have the greatest value of node.y + node.height, have thread pointers that point outside of the tree. This rule is obvious in the mergeContour implementation.
Based on 1 and 2, we can infer that we only need to update the thread pointers of its deepest nodes for an unchanged subtree.
From 3, we can infer that if a node and its parent are roots of unchanged subtrees, we only need to update the parent subtree’s deepest nodes’ threads.
Generalizing 4, for all unchanged subtrees, only those who are siblings of the changed subtrees need to update their deepest nodes’ thread pointers.

Based on the above observations, we only need to relayout the changed subtree and update its siblings’ thread pointers. Below is the code about partial relayout when there is a single changed node. We can easily extend it to multiple changed nodes.

function relayout(root, changedNode) {
  let node = changedNode
  while (node.parent) {
    for (const sibling of node.parent.children) {
      const r = getRightBottomNode(sibling)
      r.threadRight = null
      r.modifierThreadRight = 0
      const l = getLeftBottomtNode(sibling)
      l.threadRight = null
      l.modifierThreadRight = 0
    }

    layoutSubtree(node.parent, node.parent.children)
    node = node.parent
  }
}

Benchmark

The code is available at GitHub. The layout algorithms are written in Rust and compiled to WASM. The renderer is written in TypeScript.

benchmark

Benchmarks were done on MacBook Pro (13-inch, M1, 2020). It only measures the layout algorithm without the allocation and deallocation time. Mysteriously, the WASM build is faster than the native build on large data. It is very interesting, but I have not found out why.

Future work

Remaining aesthetic issue

The current layout algorithm is still not ideal in some cases.

Current layout

Ideal layout

As the above example shows, the root’s children are not symmetric in the current layout though it obeys the aesthetic rules. It does not violate aesthetic rule 7, since the drawing of mirrored structure would give a mirrored layout.

To fix this issue, we need a new aesthetic rule to formalize the problem.

The Tidy Library

Though tidy tree visualization is useful in many fields, we haven’t seen many adoptions of this algorithm in the industry because it is error-prone and slower. I hope Tidy Lib can make the adoption simple and reliable.

The plan is to add full support for partial relayout, compress wasm size, and performance improvement. This lib mainly focuses on the layout algorithm. Building a full-fledged tree visualization tool or editor is not a priority of this lib.

This lib is published under MIT License. Don’t hesitate to contact me if you want to participate or have any suggestions.

References

[1]C. Wetherell and A. Shannon, “Tidy Drawings of Trees,” IEEE Transactions on Software Engineering, vol. SE-5, no. 5, pp. 514–520, Sep. 1979, doi: 10.1109/TSE.1979.234212.
[2] E. M. Reingold and J. S. Tilford, “Tidier Drawings of Trees,” IEEE Transactions on Software Engineering, vol. SE-7, no. 2, pp. 223–228, Mar. 1981, doi: 10.1109/TSE.1981.234519.
[3] J. Q. Walker II, “A node-positioning algorithm for general trees,” Software: Practice and Experience, vol. 20, no. 7, pp. 685–705, 1990, doi: 10.1002/spe.4380200705.
[4] C. Buchheim, M. Jünger, and S. Leipert, “Improving Walker’s Algorithm to Run in Linear Time,” in Graph Drawing, Berlin, Heidelberg, 2002, pp. 344–353. doi: 10.1007/3-540-36151-0_32.
[5] A. van der Ploeg, “Drawing non-layered tidy trees in linear time,” Software: Practice and Experience, vol. 44, no. 12, pp. 1467–1484, 2014, doi: 10.1002/spe.2213.
[6] Bill Mill, “Drawing Presentable Trees.” https://llimllib.github.io/pymag-trees/ 2008