---
title: Action Collection
---

# Action Collection Design

* Extract common code from the Resource Reporter and Data Collector.
* Expose a general purpose API for querying a record of all actions taken during the Chef run.
* Enable utilities like the 'zap' cookbook to be written to interact properly with Custom Resources.

The Action Collection tracks all actions taken by all Chef resources.  The resources can be declared in recipe code, created as sub-resources of
custom resources, or built "by hand".  Since the Action Collection hooks the events which are fired from the `run_action` method on `Chef::Resource`,
it does not matter how the resources were built (as long as they were correctly passed the Chef `run_context`).

The Action Collection is complementary, but superior, to the resource collection.  The resource collection has an incomplete picture of what might
happen or has happened in the run, since there are many common ways of invoking resource actions which are not captured by how the resource collection
is built.  Replaying the sequence of actions in the Action Collection would be closer to replaying the chef-client converge than trying to re-converge
the resource collection (although both of those models are still flawed in the presence of any imperative code that controls the shape of those objects).

This design extracts common duplicated code from the Data Collector and the old Resource Reporter, and is designed to be used by other consumers which
need to ask questions like "in this run, what file resources had actions fired on them?", which can then be used to answer questions like
"which files is Chef managing in this directory?".

# Usage

## Action Collection Event Hook Registration

Consumers may register an event handler which hooks the `action_collection_registration` event.  This event is fired directly before recipes are
compiled and converged (after library loading, attributes, etc).  This is just before the earliest point in time that a resource should fire an
action, so it represents the latest point at which a consumer should decide whether it needs the Action Collection to be enabled.

Consumers that hook this event are passed the Action Collection instance, which they can save in order to query it later.  They
should then register themselves with the Action Collection (since without registering any interest, the Action Collection will disable itself).

```ruby
  def action_collection_registration(action_collection)
    @action_collection = action_collection
    action_collection.register(self)
  end
```

## Library Registration

Any cookbook library code may also register itself with the Action Collection.  The Action Collection will be registered with the `run_context` after
it is created, so registration may be accomplished easily:

```ruby
  Chef.run_context.action_collection.register(self)
```

## Action Collection Requires Registration

If neither of the prior methods is used to register with the Action Collection, then the Action Collection will disable itself and will not record
any actions, which avoids the memory overhead of tracking the actions during the run.  The Data Collector takes advantage of this: if the run start
message from the Data Collector is refused by the server, then the Data Collector disables itself and does not register with the Action Collection,
which in turn leaves the Action Collection disabled (provided no other consumer has registered).  This relies on the delayed hooking through the
`action_collection_registration` event, so that the Data Collector never registers itself after it is disabled.
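
The opt-in behavior described above can be modeled in a few lines.  This is a simplified stand-in and not Chef's implementation: `ToyActionCollection`,
`ToyConsumer`, and the `recording?` and `track` methods are hypothetical names used only for illustration.

```ruby
# Toy model of opt-in tracking: records are kept only if a consumer registered.
# ToyActionCollection and ToyConsumer are hypothetical, not Chef classes.
class ToyActionCollection
  attr_reader :records

  def initialize
    @consumers = []
    @records = []
  end

  # Consumers call this from their registration hook to declare interest.
  def register(consumer)
    @consumers << consumer
  end

  # With no registered consumers the collection is effectively disabled.
  def recording?
    !@consumers.empty?
  end

  # Stand-in for the resource event hooks: drop the record when disabled.
  def track(record)
    @records << record if recording?
  end
end

class ToyConsumer
  def initialize(enabled:)
    @enabled = enabled
  end

  # Same shape as the registration hook shown earlier, but a disabled consumer
  # returns without registering, so it never keeps the collection alive.
  def action_collection_registration(action_collection)
    return unless @enabled

    @action_collection = action_collection
    action_collection.register(self)
  end
end

disabled_run = ToyActionCollection.new
ToyConsumer.new(enabled: false).action_collection_registration(disabled_run)
disabled_run.track("file[/etc/motd] :create")

enabled_run = ToyActionCollection.new
ToyConsumer.new(enabled: true).action_collection_registration(enabled_run)
enabled_run.track("file[/etc/motd] :create")

puts "disabled run kept #{disabled_run.records.length} record(s)"
puts "enabled run kept #{enabled_run.records.length} record(s)"
```

Because the decision is deferred until the registration hook runs, a consumer that has already disabled itself never has to unregister.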

## Searching

The `filtered_collection` method returns "slices" of the `ActionCollection` object.  The `max_nesting` argument can be used to prune
how deep into sub-resources the returned view goes (`max_nesting: 0` will return only resources in recipe context, including any hand-created resources,
but no sub-resources).  There are also five action states which can be filtered on: `up_to_date`, `skipped`, `updated`, `failed`, and `unprocessed`.
All of these are true by default, so they must be explicitly set to false to remove them from the filtered collection.

The `ActionCollection` object itself implements `Enumerable` and yields `ActionRecord` objects (see the `ActionCollection` code for the fields exposed on
`ActionRecord`s).

This would return all file resources in any state in the recipe context:

```ruby
Chef.run_context.action_collection.filtered_collection(max_nesting: 0).select { |rec| rec.new_resource.is_a?(Chef::Resource::File) }
```
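
To make the filter semantics concrete, here is a simplified stand-in for `filtered_collection` that operates on plain structs.  `ToyRecord`,
`TOY_STATES`, and `toy_filtered_collection` are hypothetical names.  The sketch assumes that the state filters are keyword arguments named after the
states listed above, and it returns a plain array where the real method returns a slice of the `ActionCollection`.

```ruby
# Simplified stand-in for the filtering semantics described above.
# ToyRecord, TOY_STATES, and toy_filtered_collection are hypothetical, not Chef APIs.
ToyRecord = Struct.new(:name, :status, :nesting_level)

TOY_STATES = %i[up_to_date skipped updated failed unprocessed].freeze

def toy_filtered_collection(records, max_nesting: nil, **states)
  unknown = states.keys - TOY_STATES
  raise ArgumentError, "unknown state filter(s): #{unknown.join(', ')}" unless unknown.empty?

  # Every state is included unless it is explicitly switched off.
  wanted = TOY_STATES.select { |state| states.fetch(state, true) }

  records.select do |rec|
    (max_nesting.nil? || rec.nesting_level <= max_nesting) && wanted.include?(rec.status)
  end
end

records = [
  ToyRecord.new("file[/etc/motd]", :updated, 0),
  ToyRecord.new("package[nginx]", :up_to_date, 0),
  ToyRecord.new("service[nginx]", :skipped, 0),
  ToyRecord.new("template[/etc/nginx/nginx.conf]", :failed, 1), # sub-resource
]

# Only recipe-context actions that changed something or failed.
changed_or_failed = toy_filtered_collection(
  records, max_nesting: 0, up_to_date: false, skipped: false, unprocessed: false
)
puts changed_or_failed.map(&:name)
```

Under the same assumption about keyword names, the equivalent query against a real run would pass `up_to_date: false, skipped: false, unprocessed: false`
alongside `max_nesting: 0` to `filtered_collection`.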

NOTE:
As the Action Collection API was initially designed around the Resource Reporter and Data Collector use cases, the searching API is currently rudimentary.
It could easily borrow some of the resource collection's features for searching on the name of the resource, and could use a more fluent API
for composing searches.

# Implementation Details

## Resource Event Lifecycle Hooks

Resource actions fire off several events in sequence:

1. `resource_action_start` - this is always fired first
2. `resource_current_state_loaded` - this is normally second, but may be skipped in the case of a resource which throws an exception during
`load_current_resource` (which means that the `current_resource` off the `ActionRecord` may be nil).
3. `resource_up_to_date` / `resource_skipped` / `resource_updated` / `resource_failed` - one of these is always called which corresponds to the state of the action.
4. `resource_completed` - this is always fired last

For skipped resources, the conditional will be saved in the `ActionRecord`.  For failed resources the exception is saved in the `ActionRecord`.
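
The sequence above can be sketched as a small recorder that folds the events into one record per action.  This is a toy model, not the
`ActionCollection` code: `ToyActionRecord`, `ToyRecorder`, and the simplified hook signatures are hypothetical, and deriving the nesting level from a
stack of in-flight actions is an assumption made for illustration.

```ruby
# Toy recorder showing how the event sequence could be folded into records.
# ToyActionRecord, ToyRecorder, and the simplified hook signatures are hypothetical.
ToyActionRecord = Struct.new(
  :new_resource, :action, :current_resource, :status, :conditional, :exception, :nesting_level
)

class ToyRecorder
  attr_reader :records

  def initialize
    @records = []
    @pending = [] # stack of in-flight actions; its depth gives the nesting level
  end

  # Always first: open a record for the action that is starting.
  def resource_action_start(new_resource, action)
    @pending << ToyActionRecord.new(new_resource, action, nil, nil, nil, nil, @pending.length)
  end

  # Normally second; never called if loading the current state raised.
  def resource_current_state_loaded(current_resource)
    @pending.last.current_resource = current_resource
  end

  def resource_up_to_date
    @pending.last.status = :up_to_date
  end

  def resource_updated
    @pending.last.status = :updated
  end

  # Skipped actions keep the conditional that caused the skip.
  def resource_skipped(conditional)
    @pending.last.status = :skipped
    @pending.last.conditional = conditional
  end

  # Failed actions keep the exception.
  def resource_failed(exception)
    @pending.last.status = :failed
    @pending.last.exception = exception
  end

  # Always last: close the record and store it.
  def resource_completed
    @records << @pending.pop
  end
end

recorder = ToyRecorder.new
# A resource whose current state could not be loaded: no current-state event is fired.
recorder.resource_action_start("file[/etc/motd]", :create)
recorder.resource_failed(RuntimeError.new("permission denied"))
recorder.resource_completed

failed = recorder.records.first
puts "#{failed.new_resource} #{failed.status} current_resource=#{failed.current_resource.inspect}"
```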

## Unprocessed Resources

The unprocessed resource concept is to report on resources which are left in the resource collection after a failure.  A successful Chef run should
never leave any unprocessed resources (`action :nothing` resources are still inspected by the resource collection and are processed).  Unprocessed
resources exist only when an exception was thrown during the execution of the resource collection, so that the runner that executes the resource
collection never visited them.

This list necessarily omits any unprocessed sub-resources in custom resources, since the run was aborted before those resources
executed actions and built their own sub-resource collections.

This was a design requirement of the Data Collector.

To implement this in a more sane manner, the runner that evaluates the resource collection now tracks the resources that it visits.
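
The idea of a runner that tracks what it visits can be sketched as follows.  `ToyRunner` and its methods are hypothetical and resources are
represented as plain strings; this illustrates the concept only and is not how the Chef runner is written.

```ruby
# Toy runner that remembers which resources it visited, so that anything left
# over after a failure can be reported as unprocessed. All names are hypothetical.
class ToyRunner
  attr_reader :visited

  def initialize(resource_collection)
    @resource_collection = resource_collection
    @visited = []
  end

  # Visits each resource in order; the block stands in for running its action.
  # A resource counts as visited even if its action then raises.
  def converge
    @resource_collection.each do |resource|
      @visited << resource
      yield resource
    end
  end

  # Resources the runner never reached.
  def unprocessed
    @resource_collection - @visited
  end
end

runner = ToyRunner.new(
  ["package[nginx]", "template[/etc/nginx/nginx.conf]", "service[nginx]"]
)

begin
  runner.converge do |resource|
    raise "template rendering failed" if resource.start_with?("template")
  end
rescue RuntimeError => e
  puts "run failed: #{e.message}"
end

puts "unprocessed: #{runner.unprocessed.inspect}"
```

In this sketch the failing template is visited (it would be reported as failed), and only the service that the runner never reached is unprocessed.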