sourcediver.org

2022/04/29

How to split yaml documents with anchors

Maximilian Güntner

yaml has become the de-facto standard for infrastructure-as-code tooling, and modern applications use it for their configuration files. Surely a relief from XML (remember XSLT?)! The widespread use of yaml, together with the trend to declare and configure components instead of writing code (e.g. deploy scripts), also means that yaml editing skills are important if you want to be fast and efficient. In this post, I will show you a trick that saved me at least an hour of boring, error-prone copy and paste.

Now, imagine that you have a single yaml file that has several documents with anchors like the following:

name: document1
foo: &array
- 1
- 2
- 3

---
name: document2
bar: *array

big-document.yaml

You want to split this file into individual files, one for each document. Maybe the original file grew too big, or you have decided that document2 no longer belongs in the same file as document1. You could use a tool like csplit or awk and split along the delimiter (---), but if you pay attention to the content, you will see that the alias *array in the second document would stop working, because its anchor &array stays behind in the first document.
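To make the problem concrete, here is a minimal sketch of the naive approach. It recreates the example file in a temporary directory, splits it with awk, and then counts anchors and aliases in the second part. The output names part1.yaml and part2.yaml are made up for this illustration.

```shell
#!/bin/sh
# Sketch: split naively on '---' and check whether an alias lost its anchor.
# The output names part1.yaml / part2.yaml are hypothetical.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

cat > big-document.yaml <<'EOF'
name: document1
foo: &array
- 1
- 2
- 3

---
name: document2
bar: *array
EOF

# Naive split: start a new output file at every '---' line.
awk 'BEGIN { n = 1 } /^---$/ { n++; next } { print > ("part" n ".yaml") }' big-document.yaml

# part2.yaml still uses the alias *array ...
grep -c '\*array' part2.yaml            # prints 1
# ... but the anchor &array was left behind in part1.yaml.
grep -c '&array' part2.yaml || true     # prints 0
```

The second file now contains an alias without a matching anchor, so a yaml parser has nothing to resolve it against.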

Therefore, you should use tooling that understands yaml. The one I am using is called yq, and it is to yaml what jq is to json: a Swiss army knife that, with enough fiddling, can do almost everything.

Now back to the example. Three things are needed:

  • evaluate all references / anchors
  • then, split the files along the delimiters
  • give the files a meaningful name

And all this is possible with yq in a single line:

$ yq -s '.name' eval 'explode(.)' big-document.yaml

eval is a yq command that evaluates the expression explode(.) for every document, replacing each alias with the content of its anchor. -s '.name' writes each document to its own file, named after the value of .name, so document1.yaml and document2.yaml in the example above:

name: document1
foo:
  - 1
  - 2
  - 3

document1.yaml

name: document2
bar:
  - 1
  - 2
  - 3

document2.yaml
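If you want to double-check the result, a quick grep can flag files that still contain an anchor or alias. This is only a rough sketch: the function name has_refs and the regular expression are my own assumptions, and the pattern is a heuristic that can also match such tokens inside plain strings.

```shell
#!/bin/sh
# Sketch: report yaml files that still contain an anchor (&name) or alias (*name).
# has_refs and its pattern are hypothetical helpers, not part of yq.
has_refs() {
    grep -Eq '(^|[[:space:]])[&*][A-Za-z0-9_-]+' "$1"
}

for f in document1.yaml document2.yaml; do
    [ -f "$f" ] || continue   # skip files that were not generated
    if has_refs "$f"; then
        echo "$f: still contains anchors or aliases"
    else
        echo "$f: fully expanded"
    fi
done
```

For the two files above, neither should be reported as still containing anchors or aliases.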

Of course, you can add more expressions and transform the documents while splitting.