README for lorry-controller
===========================

Overview
--------

Lorry Controller, or LC for short, manages the importing of source
code from external sources into git repositories on a Trove, GitLab,
or Gerrit server (Downstream Host).

LC uses the Lorry tool to do the actual import. Lorry can read code
from several different version control systems, and convert it to
git. External repositories can be specified individually, as Lorry
`.lorry` specification files. In addition, LC can be told to mirror
all the git repositories on a Trove or GitLab server (Upstream Host).

LC runs Lorry for the right external repositories, and takes care of
running a suitable number of Lorry instances concurrently, and
recovering from any problems. LC has a web-based administration
interface, and an HTTP API for reporting and controlling its state.

This README file documents the LC configuration file and general use.
For the architecture of LC and the HTTP API, see the `ARCH.md` file.

Installation
------------

See the `INSTALL.md` file.

Lorry Controller configuration: overview
----------------------------------------

Lorry Controller has two levels of configuration. The first level is
command line options and configuration files. This level specifies
things such as log levels and network addresses to listen on. Most
importantly, this level specifies the location of the second level.
To list these options, run `lorry-controller-webapp --help`.

The second level is a git repository that specifies which external
repositories and Upstream Hosts to import into the Downstream Host.
This git repository is referred to as CONFGIT in documentation, and is
specified with the `--confgit-url` command line option, or the
`confgit-url` key in the configuration file. The configuration file
could contain this, for example:

    [config]
    confgit-url = ssh://git@localhost/baserock/local-config/lorries

The system integration of a Trove automatically includes a
configuration file that contains a configuration such as the above.
The URL contains the name of the Trove, so it needs to be customised
for each Trove, but as long as you're only using LC as part of a
Baserock Trove, it's all taken care of for you automatically.
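
To check what a given first-level configuration file points at, the
`[config]` fragment shown above can be read with Python's standard
`configparser` module. This is only a sketch: the function name
`read_confgit_url` is hypothetical, and it assumes the file follows the
INI-style layout of the example. It is not how Lorry Controller itself
loads its settings.

```python
import configparser

# Sample contents, copied from the example configuration fragment.
SAMPLE = """\
[config]
confgit-url = ssh://git@localhost/baserock/local-config/lorries
"""


def read_confgit_url(text):
    """Return the confgit-url value from INI-style configuration text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser.get("config", "confgit-url")


print(read_confgit_url(SAMPLE))
```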


The CONFGIT repository
----------------------

The CONFGIT repository must contain at least the file
`lorry-controller.conf`. It may also contain other files, including
`.lorry` files for Lorry, but all other files are ignored unless
referenced by `lorry-controller.conf`.



The `lorry-controller.conf` file
--------------------------------

`lorry-controller.conf` is a JSON file containing a list of maps. Each
map specifies an Upstream Host or one set of `.lorry`
files. Here's an example that tells LC to mirror the `git.baserock.org`
Trove and anything in the `open-source-lorries/*.lorry` files (if any
exist).

    [
        {
            "ignore": [
                "baserock/lorries"
            ], 
            "interval": "2H", 
            "ls-interval": "4H", 
            "prefixmap": {
                "baserock": "baserock", 
                "delta": "delta"
            }, 
            "protocol": "http", 
            "host": "git.baserock.org",
            "type": "trove"
        },
        {
            "type": "lorries",
            "interval": "6H",
            "prefix": "delta",
            "globs": [
                "open-source-lorries/*.lorry"
            ]
        }
    ]

A Host specification (map) uses the following mandatory keys:

* `type` -- either `trove` or `gitlab`, depending on the type of
  Upstream Host.

* `host` -- the Upstream Host to mirror; a domain name or IP address.

* `protocol` -- specify how Lorry Controller (and Lorry) should talk
  to the Upstream Host. Allowed values are `ssh`, `https`, `http`.

* `prefixmap` -- map repository path prefixes from the Upstream Host
  to the Downstream Host. If the upstream prefix is `foo`, and the
  downstream prefix is `bar`, then upstream repository
  `foo/baserock/yeehaa` gets mirrored to downstream repository
  `bar/baserock/yeehaa`. If the Upstream Host has a repository that
  does not match a prefix, that repository gets ignored.

* `ls-interval` -- determine how often Lorry Controller should query
  the Upstream Host for a list of repositories it may mirror. See below
  for how the value is interpreted. The default is 24 hours.

* `interval` -- specify how often Lorry Controller should mirror the
  repositories in the spec. See the description of INTERVAL values
  below. The default interval is 24 hours.
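
The `prefixmap` rule described above can be sketched in Python as
follows. The function name `map_repo_path` is hypothetical, and the
sketch assumes that a prefix must match whole path components (so `foo`
does not match `foobar/x`); Lorry Controller's own matching may differ
in such edge cases.

```python
def map_repo_path(upstream_path, prefixmap):
    """Return the downstream path for upstream_path, or None if ignored.

    prefixmap maps upstream prefixes to downstream prefixes.
    """
    for upstream_prefix, downstream_prefix in prefixmap.items():
        # Assumption: a prefix only matches on a path component boundary.
        if (upstream_path == upstream_prefix
                or upstream_path.startswith(upstream_prefix + "/")):
            return downstream_prefix + upstream_path[len(upstream_prefix):]
    # No prefix matched: the repository is not mirrored.
    return None


prefixmap = {"foo": "bar"}
print(map_repo_path("foo/baserock/yeehaa", prefixmap))  # bar/baserock/yeehaa
print(map_repo_path("other/repo", prefixmap))           # None
```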

Additionally, the following optional keys are allowed in Host
specifications:

* `ignore` -- a list of git repositories from the Upstream Host that
  should NOT be mirrored. Each list element is a glob pattern which
  is matched against the path to the git repository (not including
  the leading slash).

* `auth` -- specify how to authenticate to the Upstream Host over https
  (only). It should be a dictionary with the fields `username` and
  `password`.
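
As an illustration of how `ignore` patterns select repositories, here
is a minimal Python sketch. The function name `is_ignored` is
hypothetical, and the use of `fnmatch` semantics (where `*` also
matches `/`) is an assumption; the exact glob dialect Lorry Controller
uses is not specified in this document.

```python
import fnmatch


def is_ignored(repo_path, ignore_patterns):
    """Return True if repo_path matches any of the ignore glob patterns."""
    # Patterns are matched against the path without its leading slash.
    path = repo_path.lstrip("/")
    return any(fnmatch.fnmatchcase(path, pattern)
               for pattern in ignore_patterns)


ignore = ["baserock/lorries"]
print(is_ignored("baserock/lorries", ignore))  # True
print(is_ignored("baserock/morph", ignore))    # False
```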

A GitLab specification (map) uses an additional mandatory key:

* `private-token` -- the GitLab private token for a user with at
  least master permissions on any group under which you may wish to
  create repositories.

A Lorry specification (map) uses the following keys, all of them
mandatory:

* `type` -- must be `lorries`, to specify it's a Lorry specification.

* `interval` -- identical in meaning to the `interval` in a
  Host specification.

* `prefix` -- a path prefix to be prepended to all repositories
  created from the `.lorry` files from this spec.

* `globs` -- a list of globs (as strings) for `.lorry` files to use.
  Each glob is matched relative to the directory containing the
  configuration file that holds this spec. It is OK for the globs to
  not match anything.
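
To preview which `.lorry` files a `globs` list would pick up, something
like the following Python sketch can be run against a checkout of
CONFGIT. The function name `find_lorry_files` is hypothetical, and the
use of Python's `glob` module is an assumption about glob semantics,
not a description of Lorry Controller's internals.

```python
import glob
import os


def find_lorry_files(conf_dir, globs):
    """Return the files matched by each glob, relative to conf_dir.

    conf_dir is the directory containing lorry-controller.conf.
    A glob that matches nothing simply contributes no files.
    """
    found = []
    for pattern in globs:
        found.extend(sorted(glob.glob(os.path.join(conf_dir, pattern))))
    return found


# Example: run from the top of a CONFGIT checkout.
for path in find_lorry_files(".", ["open-source-lorries/*.lorry"]):
    print(path)
```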

For backwards compatibility with another implementation of Lorry
Controller, other fields in either type of specification are allowed
and silently ignored.

An INTERVAL value (for `interval` or `ls-interval`) is a number and a
unit to indicate a time interval. Allowed units are minutes (`m`),
hours (`h`), and days (`d`), expressed as single-letter codes in upper
or lower case.
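
The INTERVAL format can be illustrated with a small Python parser that
converts a value such as `2H` into seconds. The function name
`parse_interval` is hypothetical, and raising `ValueError` on malformed
input is an assumption; Lorry Controller may handle invalid values
differently, for example by falling back to a default.

```python
import re

# Seconds per unit, keyed by the lower-case single-letter code.
UNIT_SECONDS = {"m": 60, "h": 60 * 60, "d": 24 * 60 * 60}


def parse_interval(value):
    """Convert an INTERVAL string such as '2H' or '30m' into seconds."""
    match = re.fullmatch(r"(\d+)([mhd])", value.strip(), re.IGNORECASE)
    if match is None:
        # Assumption: malformed values are rejected outright.
        raise ValueError("not a valid interval: %r" % (value,))
    number, unit = match.groups()
    return int(number) * UNIT_SECONDS[unit.lower()]


print(parse_interval("2H"))   # 7200
print(parse_interval("30m"))  # 1800
```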

The syntax of `.lorry` files is specified by the Lorry program; see
its documentation for details.  Lorry Controller supports an
optional `description` field in `.lorry` files that is used to set
the repository description on the Downstream Host.


HTTP proxy configuration: `proxy.conf`
--------------------------------------

Lorry Controller will look for a file called `proxy.conf` in the same
directory as the `lorry-controller.conf` configuration file.
It is in JSON format, with the following key/value pairs:

* `hostname` -- the hostname of the HTTP proxy
* `username` -- username for authenticating to the proxy
* `password` -- a **cleartext** password for authenticating to the
  proxy
* `port` -- port number for connecting to the proxy

Lorry Controller will use this information for both HTTP and HTTPS
proxying.

Do note that the **password is stored in cleartext** and that access
to the configuration file (and the git repository where it is stored)
must be controlled appropriately.
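
For illustration, the Python sketch below parses a `proxy.conf` with
the four keys listed above and assembles a proxy URL from them. All the
values in the sample are made up, the function name `proxy_url` is
hypothetical, and the `http://user:password@host:port` URL shape is an
assumption; this document does not describe how Lorry Controller passes
the proxy settings on, nor whether `port` is written as a number or a
string.

```python
import json
from urllib.parse import quote

# Hypothetical proxy.conf contents; every value here is a placeholder.
SAMPLE_PROXY_CONF = """
{
    "hostname": "proxy.example.com",
    "username": "lorry",
    "password": "s3cret",
    "port": 3128
}
"""


def proxy_url(conf_text):
    """Build a proxy URL from the JSON text of a proxy.conf file."""
    conf = json.loads(conf_text)
    # Percent-encode the credentials so special characters stay valid.
    user = quote(conf["username"], safe="")
    password = quote(conf["password"], safe="")
    return "http://%s:%s@%s:%s" % (
        user, password, conf["hostname"], conf["port"])


print(proxy_url(SAMPLE_PROXY_CONF))
```

Because the resulting URL embeds the cleartext password, it needs the
same care as the configuration file itself.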

WEBAPP 'Admin' Interface
------------------------

An 'admin' interface runs locally on port 12765.

For the moment, you can access this interface using an ssh tunnel. For
example:

    ssh -L 12765:localhost:12765 root@lorryhost

This binds port 12765 on your localhost to port 12765 on lorryhost.
While the tunnel is running, you can access the 'admin' interface at
http://localhost:12765/1.0/status-html

When used within a Trove, a web interface for managing Lorry Controller
is accessible from http://trove/1.0/status-html.