2022/04/29
How to split YAML documents with anchors
YAML has become the de facto standard for infrastructure-as-code tooling, and modern applications use it for their configuration files. Surely a relief from XML (remember XSLT?)! The widespread use of YAML, together with the trend to declare and configure components instead of writing code (e.g. deploy scripts), also means that YAML editing skills are important if you want to be fast and efficient. In this post, I will show you a trick that saved me at least an hour of boring, error-prone copy and paste.
Now, imagine that you have a single yaml file that has several documents with anchors like the following:
name: document1
foo: &array
- 1
- 2
- 3
---
name: document2
bar: *array
big-document.yaml
You want to split this file into individual files, one for each document. Maybe the original file grew too big, or you have decided that document2 no longer belongs in the same file as document1.
You could use a tool like csplit or awk and split along the delimiter (---), but if you pay attention to the content, you will see that the alias *array in the second document would stop working, because its anchor &array stays behind in the first file.
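To make the problem concrete, here is a minimal Python sketch of what a text-only split does to the example file. The helper names naive_split and dangling_aliases are made up for this illustration, and the regular expressions are a crude heuristic (they do not parse YAML and would also match & or * inside ordinary strings), so treat it as a demonstration, not as a validator.

```python
import re

# The example file from above, inlined so the sketch is self-contained.
BIG_DOCUMENT = """\
name: document1
foo: &array
- 1
- 2
- 3
---
name: document2
bar: *array
"""


def naive_split(text):
    """Split on lines consisting only of '---', with no YAML awareness."""
    parts = re.split(r"^---[ \t]*$", text, flags=re.MULTILINE)
    return [part.strip("\n") for part in parts if part.strip()]


def dangling_aliases(chunk):
    """Return aliases (*name) in chunk that have no anchor (&name) in the same chunk.

    Crude heuristic: a plain regex scan, not a YAML parser.
    """
    anchors = set(re.findall(r"&([A-Za-z0-9_-]+)", chunk))
    aliases = set(re.findall(r"\*([A-Za-z0-9_-]+)", chunk))
    return sorted(aliases - anchors)


if __name__ == "__main__":
    for number, chunk in enumerate(naive_split(BIG_DOCUMENT), start=1):
        print(f"chunk {number}: dangling aliases = {dangling_aliases(chunk)}")
```

The first chunk keeps its anchor and has nothing dangling, while the second chunk still contains *array without any &array to resolve it against. That is exactly the breakage a plain csplit or awk split would leave behind.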
Therefore you should use tooling that understands YAML. The one I am using is called yq, and it is to YAML what jq is to JSON: a Swiss Army knife that, with enough fiddling, can do almost anything.
Now back to the example. Three things are needed:
- evaluate all references / anchors
- then, split the files along the delimiters
- give the files a meaningful name
And all this is possible with yq in a single line:
$ yq -s '.name' eval 'explode(.)' big-document.yaml
eval is a yq command that evaluates the expression explode(.) for every document; explode replaces each alias with the content of the anchor it refers to. -s '.name' writes each document to its own file, named after the value of .name, so document1.yaml and document2.yaml in the example above:
name: document1
foo:
- 1
- 2
- 3
document1.yaml
name: document2
bar:
- 1
- 2
- 3
document2.yaml
Of course you can add more expressions and transform the documents while splitting.