summaryrefslogtreecommitdiff
path: root/docs/atomic-upgrades.md
blob: f4b31acaa292bdcb1a5aba19ef077f1d28ab7763 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
---
nav_order: 5
---

# Atomic Upgrades
{: .no_toc }

1. TOC
{:toc}

## You can turn off the power anytime you want...

OSTree is designed to implement fully atomic and safe upgrades;
more generally, atomic transitions between lists of bootable
deployments.  If the system crashes or you pull the power, you
will have either the old system, or the new one.

## Simple upgrades via HTTP

First, the most basic model OSTree supports is one where it replicates
pre-generated filesystem trees from a server over HTTP, tracking
exactly one ref, which is stored in the `.origin` file for the
deployment.  The command `ostree admin upgrade`
implements this.

To begin a simple upgrade, OSTree fetches the contents of the ref from
the remote server.  Suppose we're tracking a ref named
`exampleos/buildmaster/x86_64-runtime`.  OSTree fetches the URL
`http://example.com/repo/refs/heads/exampleos/buildmaster/x86_64-runtime`,
which contains a SHA256 checksum.  This determines the tree to deploy,
and `/etc` will be merged from currently booted tree.

If we do not have this commit, then we perform a pull process.
At present (without static deltas), this involves quite simply just
fetching each individual object that we do not have, asynchronously.
Put in other words, we only download changed files (zlib-compressed).
Each object has its checksum validated and is stored in `/ostree/repo/objects/`.

Once the pull is complete, we have downloaded all the objects that we need
to perform a deployment.

## Upgrades via external tools (e.g. package managers)

As mentioned in the introduction, OSTree is also designed to allow a
model where filesystem trees are computed on the client.  It is
completely agnostic as to how those trees are generated; they could be
computed with traditional packages, packages with post-deployment
scripts on top, or built by developers directly from revision control
locally, etc.

At a practical level, most package managers today (`dpkg` and `rpm`)
operate "live" on the currently booted filesystem.  The way they could
work with OSTree is to, instead, take the list of installed packages in
the currently booted tree, and compute a new filesystem from that.  A
later chapter describes in more details how this could work:
[Adapting Existing Systems](adapting-existing.md).

For the purposes of this section, let's assume that we have a
newly generated filesystem tree stored in the repo (which shares
storage with the existing booted tree).  We can then move on to
checking it back out of the repo into a deployment.

## Assembling a new deployment directory

Given a commit to deploy, OSTree first allocates a directory for
it.  This is of the form `/boot/loader/entries/ostree-$stateroot-$checksum.$serial.conf`.
The `$serial` is normally `0`, but if a
given commit is deployed more than once, it will be incremented.
This is supported because the previous deployment may have
configuration in `/etc` that we do not want to use or overwrite.

Now that we have a deployment directory, a 3-way merge is
performed between the (by default) currently booted deployment's
`/etc`, its default
configuration, and the new deployment (based on its `/usr/etc`).

## Atomically swapping boot configuration

At this point, a new deployment directory has been created as a
hardlink farm; the running system is untouched, and the bootloader
configuration is untouched.  We want to add this deployment to the
"deployment list".

To support a more general case, OSTree supports atomic transitioning
between arbitrary sets of deployments, with the restriction that the
currently booted deployment must always be in the new set.  In the
normal case, we have exactly one deployment, which is the booted one,
and we want to add the new deployment to the list.  A more complex
command might allow creating 100 deployments as part of one atomic
transaction, so that one can set up an automated system to bisect
across them.

## The bootversion

OSTree allows swapping between boot configurations by implementing the
"swapped directory pattern" in `/boot`.  This means it is a symbolic
link to one of two directories `/ostree/boot.[0|1]`.  To swap the
contents atomically, if the current version is `0`, we create
`/ostree/boot.1`, populate it with the new contents, then atomically
swap the symbolic link.  Finally, the old contents can be garbage
collected at any point.

## The /ostree/boot directory

However, we want to optimize for the case where the set of
kernel/initramfs/devicetree sets is the same between both the old and new
deployment lists.  This happens when doing an upgrade that does not
include the kernel; think of a simple translation update.  OSTree
optimizes for this case because on some systems `/boot` may be on a
separate medium such as flash storage not optimized for significant
amounts of write traffic.  Related to this, modern OSTree has support
for having `/boot` be a read-only mount by default - it will
automatically remount read-write just for the portion of time
necessary to update the bootloader configuration.

To implement this, OSTree also maintains the directory
`/ostree/boot.$bootversion`, which is a set
of symbolic links to the deployment directories.  The
`$bootversion` here must match the version of
`/boot`.  However, in order to allow atomic transitions of
*this* directory, this is also a swapped directory,
so just like `/boot`, it has a version of `0` or `1` appended.

Each bootloader entry has a special `ostree=` argument which refers to
one of these symbolic links.  This is parsed at runtime in the
initramfs.

###### Licensing for this document:
`SPDX-License-Identifier: (CC-BY-SA-3.0 OR GFDL-1.3-or-later)`