From 74ffb6dbd4d21888d147bf912cc0de75e85d609e Mon Sep 17 00:00:00 2001 From: Ben Hutchings Date: Wed, 15 Jul 2020 16:22:51 +0100 Subject: Add .md extension to Markdown documents This will cause them to be rendered on GitLab and other git hosts' web interfaces. --- ARCH | 505 ------------------------------------------------------------- ARCH.md | 505 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ INSTALL | 273 --------------------------------- INSTALL.md | 273 +++++++++++++++++++++++++++++++++ README | 208 ------------------------- README.md | 208 +++++++++++++++++++++++++ 6 files changed, 986 insertions(+), 986 deletions(-) delete mode 100644 ARCH create mode 100644 ARCH.md delete mode 100644 INSTALL create mode 100644 INSTALL.md delete mode 100644 README create mode 100644 README.md diff --git a/ARCH b/ARCH deleted file mode 100644 index 5f7ce7b..0000000 --- a/ARCH +++ /dev/null @@ -1,505 +0,0 @@ -% Architecture of daemonised Lorry Controller -% Codethink Ltd - -Introduction -============ - -This is an architecture document for Lorry Controller. It is aimed at -those who develop the software, or develop against its HTTP API. See -the file `README` for general information about Lorry Controller. - - -Requirements -============ - -Some concepts/terminology: - -* CONFGIT is the git repository Lorry Controller uses for its - configuration. - -* Lorry specification: the configuration that tells Lorry to mirror an - upstream version control repository or tarball. Note that a `.lorry` - file may contain several specifications. - -* Upstream Host: a git hosting server that Lorry Controller mirrors - from. - -* Host specification: which Upstream Host to mirror. This gets - broken into generated Lorry specifications, one per git repository - on the other Host. There can be many Host specifications to - mirror many Hosts. - -* Downstream Host: a git hosting server that Lorry Controller mirrors - to. - -* run queue: all the Lorry specifications (from CONFGIT or generated - from the Host specifications) a Lorry Controller knows about; this - is the set of things that get scheduled. The queue has a linear - order (first job in the queue is the next job to execute). - -* job: An instance of executing a Lorry specification. Each job has an - identifier and associated data (such as the output provided by the - running job, and whether it succeeded). - -* admin: a person who can control or reconfigure a Lorry Controller - instance. All users of the HTTP API are admins, for example. - -For historical reasons, Hosts are also referred to as Troves in many -places. - -Original set of requirements, which have been broken down and detailed -below: - -* Lorry Controller should be capable of being reconfigured at runtime - to allow new tasks to be added and old tasks to be removed. - (RC/ADD, RC/RM, RC/START) - -* Lorry Controller should not allow all tasks to become stuck if one - task is taking a long time. (RR/MULTI) - -* Lorry Controller should not allow stuck tasks to remain stuck - forever. (Configurable timeout? monitoring of disk usage or CPU to - see if work is being done?)
(RR/TIMEOUT) - -* Lorry Controller should be able to be controlled at runtime to allow: - - Querying of the current task set (RQ/SPECS, RQ/SPEC) - - Querying of currently running tasks (RQ/RUNNING) - - Promotion or demotion of a task in the queue (RT/TOP, RT/BOT) - - Support for health monitoring, to allow appropriate alerts - to be sent out (MON/STATIC, MON/DU) - -The detailed requirements (prefixed by a unique identifier, which is -used elsewhere to refer to the exact requirement): - -* (FW) Lorry Controller can access Upstream Hosts from behind firewalls. - * (FW/H) Lorry Controller can access the Upstream Host using HTTP or - HTTPS only, without using ssh, in order to get a list of - repositories to mirror. (Lorry itself also needs to be able to - access the Upstream Host using HTTP or HTTPS only, bypassing - ssh, but that's a Lorry problem and outside the scope of Lorry - Controller, so it'll need to be dealt with separately.) - * (FW/C) Lorry Controller does not verify SSL/TLS certificates - when accessing the Upstream Host. -* (RC) Lorry Controller can be reconfigured at runtime. - * (RC/ADD) A new Lorry specification can be added to CONFGIT, and - a running Lorry Controller will add it to its run queue as - soon as it is notified of the change. - * (RC/RM) A Lorry specification can be removed from CONFGIT, and a - running Lorry Controller will remove it from its run queue as - soon as it is notified of the change. - * (RC/START) A Lorry Controller reads CONFGIT when it starts, - updating its run queue if anything has changed. -* (RT) Lorry Controller can be controlled at runtime. - * (RT/KILL) An admin can get their Lorry Controller to stop a - running job. - * (RT/TOP) An admin can get their Lorry Controller to move a Lorry - spec to the beginning of the run queue. - * (RT/BOT) An admin can get their Lorry Controller to move a Lorry - spec to the end of the run queue. - * (RT/QSTOP) An admin can stop their Lorry Controller from - scheduling any new jobs. - * (RT/QSTART) An admin can get their Lorry Controller to start - scheduling jobs again. -* (RQ) Lorry Controller can be queried at runtime. - * (RQ/RUNNING) An admin can list all currently running jobs. - * (RQ/ALLJOBS) An admin can list all finished jobs that the Lorry - Controller still remembers. - * (RQ/SPECS) An admin can list all existing Lorry specifications - in the run queue. - * (RQ/SPEC) An admin can query existing Lorry specifications in - the run queue for any information the Lorry Controller holds for - them, such as the last time they successfully finished running. -* (RR) Lorry Controller is reasonably robust. - * (RR/CONF) Lorry Controller ignores any broken Lorry or Host - specifications in CONFGIT, and runs without them. - * (RR/TIMEOUT) Lorry Controller stops a job that runs for too - long. - * (RR/MULTI) Lorry Controller can run multiple jobs at the same - time, and lets the maximal number of such jobs be configured by - the admin. - * (RR/DU) Lorry Controller (and the way it runs Lorry) is - designed to be frugal about disk space usage. - * (RR/CERT) Lorry Controller tells Lorry to not worry about - unverifiable SSL/TLS certificates and to continue even if the - certificate can't be verified or the verification fails. -* (RS) Lorry Controller is reasonably scalable. - * (RS/SPECS) Lorry Controller works for the number of Lorry - specifications we have on git.baserock.org (a number that will - increase, and is currently about 500).
- * (RS/GITS) Lorry Controller works for mirroring git.baserock.org - (about 500 git repositories). - * (RS/HW) Lorry Controller may assume that CPU, disk, and - bandwidth are sufficient, if not to be needlessly wasted. -* (MON) Lorry Controller can be monitored from the outside. - * (MON/STATIC) Lorry Controller updates at least once a minute a - static HTML file, which shows its current status with sufficient - detail that an admin knows if things get stuck or break. - * (MON/DU) Lorry Controller measures, at least, the disk usage of - each job and Lorry specification. -* (SEC) Lorry Controller is reasonably secure. - * (SEC/API) Access to the Lorry Controller run-time query and - controller interfaces is managed with iptables (for now). - * (SEC/CONF) Access to CONFGIT is managed by the git server that - hosts it. (Gitano on Trove.) - -Architecture design -=================== - -Constraints ------------ - -Python is not good at multiple threads (partly due to the global -interpreter lock), and mixing threads and executing subprocesses is -quite tricky to get right in general. Thus, this design splits the -software into a threaded web application (using the bottle.py -framework) and one or more single-threaded worker processes to execute -Lorry. - -Entities --------- - -* An admin is a human being or some software using the HTTP API to - communicate with the Lorry Controller. -* Lorry Controller runs Lorry appropriately, and consists of several - components described below. -* The Downstream Host is as defined in Requirements. -* An Upstream Host is as defined in Requirements. There can be - multiple Upstream Hosts. - -Components of Lorry Controller ------------------------------- - -* CONFGIT is a git repository for Lorry Controller configuration, - which the Lorry Controller (see WEBAPP below) can access and pull - from. Pushing is not required and should be prevented by Gitano. - CONFGIT is hosted on the Downstream Host. - -* STATEDB is persistent storage for the Lorry Controller's state: what - Lorry specs it knows about (provided by the admin, or generated from - a Host spec by Lorry Controller itself), their ordering, jobs that - have been run or are being run, information about the jobs, etc. The - idea is that the Lorry Controller process can terminate (cleanly or - by crashing), and be restarted, and continue approximately from - where it was. Also, a persistent storage is useful if there are - multiple processes involved due to how bottle.py and WSGI work. - STATEDB is implemented using sqlite3. - -* WEBAPP is the controlling part of Lorry Controller, which maintains - the run queue, and provides an HTTP API for monitoring and - controlling Lorry Controller. WEBAPP is implemented as a bottle.py - application. bottle.py runs the WEBAPP code in multiple threads to - improve concurrency. - -* MINION runs jobs (external processes) on behalf of WEBAPP. It - communicates with WEBAPP over HTTP, and requests a job to run, - starts it, and while it waits, sends partial output to the WEBAPP - every few seconds, and asks the WEBAPP whether the job should be - aborted or not. MINION may eventually run on a different host than - WEBAPP, for added scalability. - -Components external to Lorry Controller ---------------------------------------- - -* A web server. This runs the Lorry Controller WEBAPP, using WSGI so - that multiple instances (processes) can run at once, and thus serve - many clients. - -* bottle.py is a Python microframework for web applications. 
It sits - between the web server itself and the WEBAPP code. - -* systemd is the operating system component that starts services and - processes. - -How the components work together --------------------------------- - -* Each WEBAPP instance is started by the web server, when a request - comes in. The web server is started by a systemd unit. - -* Each MINION instance is started by a systemd unit. Each MINION - handles one job at a time, and doesn't block other MINIONs from - running other jobs. The admins decide how many MINIONs run at once, - depending on hardware resources and other considerations. (RR/MULTI) - -* An admin communicates with the WEBAPP only, by making HTTP requests. - Each request is either a query (GET) or a command (POST). Queries - report state as stored in STATEDB. Commands cause the WEBAPP - instance to do something and alter STATEDB accordingly. - -* When an admin makes changes to CONFGIT, and pushes them to the Downstream - Host, the Host's git post-update hook makes an HTTP request to - WEBAPP to update STATEDB from CONFGIT. (RC/ADD, RC/RM) - -* Each MINION likewise communicates only with the WEBAPP using HTTP - requests. MINION requests a job to run (which triggers WEBAPP's job - scheduling), and then reports results to the WEBAPP (which causes - WEBAPP to store them in STATEDB), which tells MINION whether to - continue running the job or not (RT/KILL). There is no separate - scheduling process: all scheduling happens when there is a MINION - available. - -* At system start up, a systemd unit makes an HTTP request to WEBAPP - to make it refresh STATEDB from CONFGIT. (RC/START) - -* A timer unit for systemd makes an HTTP request to get WEBAPP to - refresh the static HTML status page. (MON/STATIC) - -In summary: systemd starts WEBAPP and MINIONs, and whenever a -MINION can do work, it asks WEBAPP for something to do, and reports -back results. Meanwhile, admin can query and control via HTTP requests -to WEBAPP, and WEBAPP instances communicate via STATEDB. - -The WEBAPP ----------- - -The WEBAPP provides an HTTP API as described below. - -Run queue management: - -* `POST /1.0/stop-queue` causes WEBAPP to stop scheduling new jobs to - run. Any currently running jobs are not affected. (RT/QSTOP) - -* `POST /1.0/start-queue` causes WEBAPP to start scheduling jobs - again. (RT/QSTART) - -* `GET /1.0/list-queue` causes WEBAPP to return a JSON list of ids of - all Lorry specifications in the run queue, in the order they are in - the run queue. (RQ/SPECS) - -* `POST /1.0/move-to-top` with `path=lorryspecid` as the body, where - `lorryspecid` is the id (path) of a Lorry specification in the run - queue, causes WEBAPP to move the specified spec to the head of the - run queue, and store this in STATEDB. It doesn't affect currently - running jobs. (RT/TOP) - -* `POST /1.0/move-to-bottom` with `path=lorryspecid` in the body is - like `/move-to-top`, but moves the job to the end of the run queue. - (RT/BOT) - -Running job management: - -* `GET /1.0/list-running-jobs` causes WEBAPP to return a JSON list of - ids of all currently running jobs. (RQ/RUNNING) - -* `GET /1.0/job/` causes WEBAPP to return a JSON map (dict) - with all the information about the specified job. - -* `POST /1.0/stop-job` with `job_id=jobid` where `jobid` is an id of a - running job, causes WEBAPP to record in STATEDB that the job is to - be killed, and waits for it to be killed. (Killing to be done when - MINION gets around to it.) This request returns as soon as the - STATEDB change is done. 
- -* `GET /1.0/list-jobs` causes WEBAPP to return a JSON list of ids - of all jobs, running or finished, that it knows about. (RQ/ALLJOBS) - -* `GET /1.0/list-jobs-html` is the same as `list-jobs`, but returns an - HTML page instead. - -* `POST /1.0/remove-job` with `job_id=jobid` in the body, removes a - stopped job from the state database. - -* `POST /1.0/remove-ghost-jobs` looks for any running jobs in STATEDB - that haven't been updated (with `job-update`, see below) in a long - time (see `--ghost-timeout`), and marks them as terminated. This is - used to catch situations when a MINION fails to tell the WEBAPP that - a job has terminated. - -Other status queries: - -* `GET /1.0/status` causes WEBAPP to return a JSON object that - describes the state of Lorry Controller. This information is meant - to be programmatically useable and may or may not be the same as in - the HTML page. - -* `GET /1.0/status-html` causes WEBAPP to return an HTML page that - describes the state of Lorry Controller. This also updates an - on-disk copy of the HTML page, which the web server is configured to - serve using a normal HTTP request. This is the primary interface for - human admins to look at the state of Lorry Controller. (MON/STATIC) - -* `GET /1.0/lorry/` causes WEBAPP to return a JSON map - (dict) with all the information about the specified Lorry - specification. (RQ/SPEC) - - -Requests for MINION: - -* `GET /1.0/give-me-job` is used by MINION to get a new job to run. - WEBAPP will either return a JSON object describing the job to run, - or return a status code indicating that there is nothing to do. - WEBAPP will respond immediately, even if there is nothing for MINION - to do, and MINION will then sleep for a while before it tries again. - WEBAPP updates STATEDB to record that the job is allocated to a - MINION. - -* `POST /1.0/job-update` is used by MINION to push updates about the - job it is running to WEBAPP. The body sets fields `exit` (exit code - of program, or `no` if not set), `stdout` (some output from the - job's standard output) and `stderr` (ditto, but standard error - output). There MUST be at least one `job-update` call, which - indicates the job has terminated. WEBAPP responds with a status - indicating whether the job should continue to run or be terminated - (RR/TIMEOUT). WEBAPP records the job as terminated only after MINION - tells it the job has been terminated. MINION makes the `job-update` - request frequently, even if the job has produced no output, so that - WEBAPP can update a timestamp in STATEDB to indicate the job is - still alive. - -Other requests: - -* `POST /1.0/read-configuration` causes WEBAPP to update its copy of - CONFGIT and update STATEDB based on the new configuration, if it has - changed. Returns OK/ERROR status. (RC/ADD, RC/RM, RC/START) - - This is called by systemd units at system startup and periodically - (perhaps once a minute) otherwise. It can also be triggered by an - admin (there is a button on the `/1.0/status-html` web page). - -* `POST /1.0/ls-troves` causes WEBAPP to refresh its list of - repositories in each Upstream Host, if the current list is too old - (see the `ls-interval` setting for each Upstream Host in - `lorry-controller.conf`). This gets called from a systemd timer unit - at a suitable interval. - -* `POST /1.0/force-ls-troves` causes the repository refresh to happen - for all Upstream Hosts, regardless of whether it is due or not. This - can be called manually by an admin. 
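As a concrete sketch, an admin could exercise this API with `curl`. (This is illustrative only: the hostname is an assumption, port 12765 is the WEBAPP port mentioned under "Code structure" below, and the spec path is a made-up example.)

    # List the ids of all Lorry specs currently in the run queue.
    curl http://localhost:12765/1.0/list-queue

    # Move one spec (identified by its repository path) to the head of the queue.
    curl --data 'path=upstream/linux.git' http://localhost:12765/1.0/move-to-top

    # Pause scheduling of new jobs, then resume it.
    curl --request POST http://localhost:12765/1.0/stop-queue
    curl --request POST http://localhost:12765/1.0/start-queue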
- - -The MINION ----------- - -* Do `GET /1.0/give-me-job` to WEBAPP. -* If didn't get a job, sleep a while and try again. -* If did get job, fork and exec that. -* In a loop: wait for output, for a suitably short period of time, - from job (or its termination), with `select` or similar mechanism, - and send anything (if anything) you get to WEBAPP. If the WEBAPP - told us to kill the job, kill it, then send an update to that effect - to WEBAPP. -* Go back to top to request new job. - - -Old job removal ---------------- - -To avoid the STATEDB filling up with logs of old jobs, a systemd timer -unit will run occasionally to remove jobs so old, nobody cares about -them anymore. To make it easier to experiment with the logic of -choosing what to remove (age only? keep failed ones? something else?) -the removal is kept outside the WEBAPP. - - -STATEDB -------- - -The STATEDB has several tables. This section explains them. - -The `running_queue` table has a single column (`running`) and a single -row, and is used to store a single boolean value that specifies -whether WEBAPP is giving out jobs to run from the run-queue. This -value is controlled by `/1.0/start-queue` and `/1.0/stop-queue` -requests. - -The `lorries` table implements the run-queue: all the Lorry specs that -WEBAPP knows about. It has the following columns: - -* `path` is the path of the git repository on the Downstream Host, i.e., - the git repository to which Lorry will push. This is a unique - identifier. It is used, for example, to determine if a Lorry spec - is obsolete after a CONFGIT update. -* `text` has the text of the Lorry spec. This may be read from a file - or generated by Lorry Controller itself. This text will be given to - Lorry when a job is run. -* `generated` is set to 0 or 1, depending on if the Lorry came from an - actual `.lorry` file or was generated by Lorry Controller. - - -Code structure -============== - -The Lorry Controller code base is laid out as follows: - -* `lorry-controller-webapp` is the main program of WEBAPP. It sets up - the bottle.py framework. All the implementations for the various - HTTP requests are in classes in the `lorrycontroller` Python - package, as subclasses of the `LorryControllerRoute` class. The main - program uses introspection ("magic") to find the subclasses - automatically and sets up the bottle.py routes correctly. This makes - it possible to spread the code into simple classes; bottle's normal - way (with the `@app.route` decorator) seemed to make that harder and - require everything in the same class. - -* `lorrycontroller` is a Python package with: - - - The HTTP request handlers (`LorryControllerRoute` and its subclasses) - - Management of STATEDB (`statedb` module) - - Support for various Downstream and Upstream Host types - (`hosts`, `gitano`, `gerrit`, `gitlab`, `local` modules) - - Some helpful utilities (`proxy` module) - -* `lorry-controller-minion` is the entirety of the MINION, except that - it uses the `lorrycontroller.setup_proxy` function. - The MINION is kept very simple on purpose: all the interesting logic - is in the WEBAPP instead. - -* `static` has static content to be served over HTTP. Primarily, the - CSS file for the HTML interfaces. When LC is integrated within the - Downstream Host, the web server gets configured to serve these files directly. - The `static` directory will be accessible over plain HTTP on port - 80, and on port 12765 via the WEBAPP, to allow HTML pages to refer - to it via a simple path. 
- -* `templates` contains bottle.py HTML templates for various pages. - -* `etc` contains files to be installed in `/etc` when LC is installed - on a Baserock system. Primarily this is the web server (lighttpd) - configuration to invoke WEBAPP. - -* `units` contains various systemd units that start services and run - time-based jobs. - -* `yarns.webapp` contains an integration test suite for WEBAPP. - This is run by the `./check` script. The `./test-wait-for-port` - script is used by the yarns. - -Example ------- - -As an example, to modify how the `/1.0/status-html` request works, you -would look at its implementation in `lorrycontroller/status.py`, and -perhaps also the HTML templates in `templates/*.tpl`. - -STATEDB ------- - -The persistent state of WEBAPP is stored in an SQLite3 database. All -access to STATEDB within WEBAPP is via the -`lorrycontroller/statedb.py` code module. That means there are no SQL -statements outside `statedb.py` at all, nor is it OK to add any. If -the interface provided by the `StateDB` class isn't sufficient, then -modify the class suitably, but do not add any new SQL outside it. - -All access from outside of WEBAPP happens via WEBAPP's HTTP API. -Only the WEBAPP is allowed to touch STATEDB in any way. - -The bottle.py framework runs multiple threads of WEBAPP code. The -threads communicate only via STATEDB. There is no shared state in -memory. SQLite's locking is used for mutual exclusion. - -The `StateDB` class acts as a context manager for Python's `with` -statements to provide locking. To access STATEDB with locking, use -code such as this: - - with self.open_statedb() as statedb: - hosts = statedb.get_hosts() - for host in hosts: - statedb.remove_host(host) - -The code executed by the `with` statement is run under lock, and the -lock gets released automatically even if there is an exception. - -(You could manage locks manually. It's a good way to build character -and learn why using the context manager is really simple and leads to -more correct code.) diff --git a/ARCH.md b/ARCH.md new file mode 100644 index 0000000..77218f1 --- /dev/null +++ b/ARCH.md @@ -0,0 +1,505 @@ +% Architecture of daemonised Lorry Controller +% Codethink Ltd + +Introduction +============ + +This is an architecture document for Lorry Controller. It is aimed at +those who develop the software, or develop against its HTTP API. See +the file `README.md` for general information about Lorry Controller. + + +Requirements +============ + +Some concepts/terminology: + +* CONFGIT is the git repository Lorry Controller uses for its + configuration. + +* Lorry specification: the configuration that tells Lorry to mirror an + upstream version control repository or tarball. Note that a `.lorry` + file may contain several specifications. + +* Upstream Host: a git hosting server that Lorry Controller mirrors + from. + +* Host specification: which Upstream Host to mirror. This gets + broken into generated Lorry specifications, one per git repository + on the other Host. There can be many Host specifications to + mirror many Hosts. + +* Downstream Host: a git hosting server that Lorry Controller mirrors + to. + +* run queue: all the Lorry specifications (from CONFGIT or generated + from the Host specifications) a Lorry Controller knows about; this + is the set of things that get scheduled. The queue has a linear + order (first job in the queue is the next job to execute). + +* job: An instance of executing a Lorry specification.
Each job has an + identifier and associated data (such as the output provided by the + running job, and whether it succeeded). + +* admin: a person who can control or reconfigure a Lorry Controller + instance. All users of the HTTP API are admins, for example. + +For historical reasons, Hosts are also referred to as Troves in many +places. + +Original set of requirements, which have been broken down and detailed +below: + +* Lorry Controller should be capable of being reconfigured at runtime + to allow new tasks to be added and old tasks to be removed. + (RC/ADD, RC/RM, RC/START) + +* Lorry Controller should not allow all tasks to become stuck if one + task is taking a long time. (RR/MULTI) + +* Lorry Controller should not allow stuck tasks to remain stuck + forever. (Configurable timeout? monitoring of disk usage or CPU to + see if work is being done?) (RR/TIMEOUT) + +* Lorry Controller should be able to be controlled at runtime to allow: + - Querying of the current task set (RQ/SPECS, RQ/SPEC) + - Querying of currently running tasks (RQ/RUNNING) + - Promotion or demotion of a task in the queue (RT/TOP, RT/BOT) + - Support for health monitoring, to allow appropriate alerts + to be sent out (MON/STATIC, MON/DU) + +The detailed requirements (prefixed by a unique identifier, which is +used elsewhere to refer to the exact requirement): + +* (FW) Lorry Controller can access Upstream Hosts from behind firewalls. + * (FW/H) Lorry Controller can access the Upstream Host using HTTP or + HTTPS only, without using ssh, in order to get a list of + repositories to mirror. (Lorry itself also needs to be able to + access the Upstream Host using HTTP or HTTPS only, bypassing + ssh, but that's a Lorry problem and outside the scope of Lorry + Controller, so it'll need to be dealt with separately.) + * (FW/C) Lorry Controller does not verify SSL/TLS certificates + when accessing the Upstream Host. +* (RC) Lorry Controller can be reconfigured at runtime. + * (RC/ADD) A new Lorry specification can be added to CONFGIT, and + a running Lorry Controller will add it to its run queue as + soon as it is notified of the change. + * (RC/RM) A Lorry specification can be removed from CONFGIT, and a + running Lorry Controller will remove it from its run queue as + soon as it is notified of the change. + * (RC/START) A Lorry Controller reads CONFGIT when it starts, + updating its run queue if anything has changed. +* (RT) Lorry Controller can be controlled at runtime. + * (RT/KILL) An admin can get their Lorry Controller to stop a + running job. + * (RT/TOP) An admin can get their Lorry Controller to move a Lorry + spec to the beginning of the run queue. + * (RT/BOT) An admin can get their Lorry Controller to move a Lorry + spec to the end of the run queue. + * (RT/QSTOP) An admin can stop their Lorry Controller from + scheduling any new jobs. + * (RT/QSTART) An admin can get their Lorry Controller to start + scheduling jobs again. +* (RQ) Lorry Controller can be queried at runtime. + * (RQ/RUNNING) An admin can list all currently running jobs. + * (RQ/ALLJOBS) An admin can list all finished jobs that the Lorry + Controller still remembers. + * (RQ/SPECS) An admin can list all existing Lorry specifications + in the run queue. + * (RQ/SPEC) An admin can query existing Lorry specifications in + the run queue for any information the Lorry Controller holds for + them, such as the last time they successfully finished running. +* (RR) Lorry Controller is reasonably robust.
+ * (RR/CONF) Lorry Controller ignores any broken Lorry or Host + specifications in CONFGIT, and runs without them. + * (RR/TIMEOUT) Lorry Controller stops a job that runs for too + long. + * (RR/MULTI) Lorry Controller can run multiple jobs at the same + time, and lets the maximal number of such jobs be configured by + the admin. + * (RR/DU) Lorry Controller (and the way it runs Lorry) is + designed to be frugal about disk space usage. + * (RR/CERT) Lorry Controller tells Lorry to not worry about + unverifiable SSL/TLS certificates and to continue even if the + certificate can't be verified or the verification fails. +* (RS) Lorry Controller is reasonably scalable. + * (RS/SPECS) Lorry Controller works for the number of Lorry + specifications we have on git.baserock.org (a number that will + increase, and is currently about 500). + * (RS/GITS) Lorry Controller works for mirroring git.baserock.org + (about 500 git repositories). + * (RS/HW) Lorry Controller may assume that CPU, disk, and + bandwidth are sufficient, if not to be needlessly wasted. +* (MON) Lorry Controller can be monitored from the outside. + * (MON/STATIC) Lorry Controller updates at least once a minute a + static HTML file, which shows its current status with sufficient + detail that an admin knows if things get stuck or break. + * (MON/DU) Lorry Controller measures, at least, the disk usage of + each job and Lorry specification. +* (SEC) Lorry Controller is reasonably secure. + * (SEC/API) Access to the Lorry Controller run-time query and + controller interfaces is managed with iptables (for now). + * (SEC/CONF) Access to CONFGIT is managed by the git server that + hosts it. (Gitano on Trove.) + +Architecture design +=================== + +Constraints +----------- + +Python is not good at multiple threads (partly due to the global +interpreter lock), and mixing threads and executing subprocesses is +quite tricky to get right in general. Thus, this design splits the +software into a threaded web application (using the bottle.py +framework) and one or more single-threaded worker processes to execute +Lorry. + +Entities +-------- + +* An admin is a human being or some software using the HTTP API to + communicate with the Lorry Controller. +* Lorry Controller runs Lorry appropriately, and consists of several + components described below. +* The Downstream Host is as defined in Requirements. +* An Upstream Host is as defined in Requirements. There can be + multiple Upstream Hosts. + +Components of Lorry Controller +------------------------------ + +* CONFGIT is a git repository for Lorry Controller configuration, + which the Lorry Controller (see WEBAPP below) can access and pull + from. Pushing is not required and should be prevented by Gitano. + CONFGIT is hosted on the Downstream Host. + +* STATEDB is persistent storage for the Lorry Controller's state: what + Lorry specs it knows about (provided by the admin, or generated from + a Host spec by Lorry Controller itself), their ordering, jobs that + have been run or are being run, information about the jobs, etc. The + idea is that the Lorry Controller process can terminate (cleanly or + by crashing), and be restarted, and continue approximately from + where it was. Also, a persistent storage is useful if there are + multiple processes involved due to how bottle.py and WSGI work. + STATEDB is implemented using sqlite3. 
+ +* WEBAPP is the controlling part of Lorry Controller, which maintains + the run queue, and provides an HTTP API for monitoring and + controlling Lorry Controller. WEBAPP is implemented as a bottle.py + application. bottle.py runs the WEBAPP code in multiple threads to + improve concurrency. + +* MINION runs jobs (external processes) on behalf of WEBAPP. It + communicates with WEBAPP over HTTP, and requests a job to run, + starts it, and while it waits, sends partial output to the WEBAPP + every few seconds, and asks the WEBAPP whether the job should be + aborted or not. MINION may eventually run on a different host than + WEBAPP, for added scalability. + +Components external to Lorry Controller +--------------------------------------- + +* A web server. This runs the Lorry Controller WEBAPP, using WSGI so + that multiple instances (processes) can run at once, and thus serve + many clients. + +* bottle.py is a Python microframework for web applications. It sits + between the web server itself and the WEBAPP code. + +* systemd is the operating system component that starts services and + processes. + +How the components work together +-------------------------------- + +* Each WEBAPP instance is started by the web server, when a request + comes in. The web server is started by a systemd unit. + +* Each MINION instance is started by a systemd unit. Each MINION + handles one job at a time, and doesn't block other MINIONs from + running other jobs. The admins decide how many MINIONs run at once, + depending on hardware resources and other considerations. (RR/MULTI) + +* An admin communicates with the WEBAPP only, by making HTTP requests. + Each request is either a query (GET) or a command (POST). Queries + report state as stored in STATEDB. Commands cause the WEBAPP + instance to do something and alter STATEDB accordingly. + +* When an admin makes changes to CONFGIT, and pushes them to the Downstream + Host, the Host's git post-update hook makes an HTTP request to + WEBAPP to update STATEDB from CONFGIT. (RC/ADD, RC/RM) + +* Each MINION likewise communicates only with the WEBAPP using HTTP + requests. MINION requests a job to run (which triggers WEBAPP's job + scheduling), and then reports results to the WEBAPP (which causes + WEBAPP to store them in STATEDB), which tells MINION whether to + continue running the job or not (RT/KILL). There is no separate + scheduling process: all scheduling happens when there is a MINION + available. + +* At system start up, a systemd unit makes an HTTP request to WEBAPP + to make it refresh STATEDB from CONFGIT. (RC/START) + +* A timer unit for systemd makes an HTTP request to get WEBAPP to + refresh the static HTML status page. (MON/STATIC) + +In summary: systemd starts WEBAPP and MINIONs, and whenever a +MINION can do work, it asks WEBAPP for something to do, and reports +back results. Meanwhile, admin can query and control via HTTP requests +to WEBAPP, and WEBAPP instances communicate via STATEDB. + +The WEBAPP +---------- + +The WEBAPP provides an HTTP API as described below. + +Run queue management: + +* `POST /1.0/stop-queue` causes WEBAPP to stop scheduling new jobs to + run. Any currently running jobs are not affected. (RT/QSTOP) + +* `POST /1.0/start-queue` causes WEBAPP to start scheduling jobs + again. (RT/QSTART) + +* `GET /1.0/list-queue` causes WEBAPP to return a JSON list of ids of + all Lorry specifications in the run queue, in the order they are in + the run queue. 
(RQ/SPECS) + +* `POST /1.0/move-to-top` with `path=lorryspecid` as the body, where + `lorryspecid` is the id (path) of a Lorry specification in the run + queue, causes WEBAPP to move the specified spec to the head of the + run queue, and store this in STATEDB. It doesn't affect currently + running jobs. (RT/TOP) + +* `POST /1.0/move-to-bottom` with `path=lorryspecid` in the body is + like `/move-to-top`, but moves the job to the end of the run queue. + (RT/BOT) + +Running job management: + +* `GET /1.0/list-running-jobs` causes WEBAPP to return a JSON list of + ids of all currently running jobs. (RQ/RUNNING) + +* `GET /1.0/job/` causes WEBAPP to return a JSON map (dict) + with all the information about the specified job. + +* `POST /1.0/stop-job` with `job_id=jobid` where `jobid` is an id of a + running job, causes WEBAPP to record in STATEDB that the job is to + be killed, and waits for it to be killed. (Killing to be done when + MINION gets around to it.) This request returns as soon as the + STATEDB change is done. + +* `GET /1.0/list-jobs` causes WEBAPP to return a JSON list of ids + of all jobs, running or finished, that it knows about. (RQ/ALLJOBS) + +* `GET /1.0/list-jobs-html` is the same as `list-jobs`, but returns an + HTML page instead. + +* `POST /1.0/remove-job` with `job_id=jobid` in the body, removes a + stopped job from the state database. + +* `POST /1.0/remove-ghost-jobs` looks for any running jobs in STATEDB + that haven't been updated (with `job-update`, see below) in a long + time (see `--ghost-timeout`), and marks them as terminated. This is + used to catch situations when a MINION fails to tell the WEBAPP that + a job has terminated. + +Other status queries: + +* `GET /1.0/status` causes WEBAPP to return a JSON object that + describes the state of Lorry Controller. This information is meant + to be programmatically useable and may or may not be the same as in + the HTML page. + +* `GET /1.0/status-html` causes WEBAPP to return an HTML page that + describes the state of Lorry Controller. This also updates an + on-disk copy of the HTML page, which the web server is configured to + serve using a normal HTTP request. This is the primary interface for + human admins to look at the state of Lorry Controller. (MON/STATIC) + +* `GET /1.0/lorry/` causes WEBAPP to return a JSON map + (dict) with all the information about the specified Lorry + specification. (RQ/SPEC) + + +Requests for MINION: + +* `GET /1.0/give-me-job` is used by MINION to get a new job to run. + WEBAPP will either return a JSON object describing the job to run, + or return a status code indicating that there is nothing to do. + WEBAPP will respond immediately, even if there is nothing for MINION + to do, and MINION will then sleep for a while before it tries again. + WEBAPP updates STATEDB to record that the job is allocated to a + MINION. + +* `POST /1.0/job-update` is used by MINION to push updates about the + job it is running to WEBAPP. The body sets fields `exit` (exit code + of program, or `no` if not set), `stdout` (some output from the + job's standard output) and `stderr` (ditto, but standard error + output). There MUST be at least one `job-update` call, which + indicates the job has terminated. WEBAPP responds with a status + indicating whether the job should continue to run or be terminated + (RR/TIMEOUT). WEBAPP records the job as terminated only after MINION + tells it the job has been terminated. 
MINION makes the `job-update` + request frequently, even if the job has produced no output, so that + WEBAPP can update a timestamp in STATEDB to indicate the job is + still alive. + +Other requests: + +* `POST /1.0/read-configuration` causes WEBAPP to update its copy of + CONFGIT and update STATEDB based on the new configuration, if it has + changed. Returns OK/ERROR status. (RC/ADD, RC/RM, RC/START) + + This is called by systemd units at system startup and periodically + (perhaps once a minute) otherwise. It can also be triggered by an + admin (there is a button on the `/1.0/status-html` web page). + +* `POST /1.0/ls-troves` causes WEBAPP to refresh its list of + repositories in each Upstream Host, if the current list is too old + (see the `ls-interval` setting for each Upstream Host in + `lorry-controller.conf`). This gets called from a systemd timer unit + at a suitable interval. + +* `POST /1.0/force-ls-troves` causes the repository refresh to happen + for all Upstream Hosts, regardless of whether it is due or not. This + can be called manually by an admin. + + +The MINION +---------- + +* Do `GET /1.0/give-me-job` to WEBAPP. +* If didn't get a job, sleep a while and try again. +* If did get job, fork and exec that. +* In a loop: wait for output, for a suitably short period of time, + from job (or its termination), with `select` or similar mechanism, + and send anything (if anything) you get to WEBAPP. If the WEBAPP + told us to kill the job, kill it, then send an update to that effect + to WEBAPP. +* Go back to top to request new job. + + +Old job removal +--------------- + +To avoid the STATEDB filling up with logs of old jobs, a systemd timer +unit will run occasionally to remove jobs so old, nobody cares about +them anymore. To make it easier to experiment with the logic of +choosing what to remove (age only? keep failed ones? something else?) +the removal is kept outside the WEBAPP. + + +STATEDB +------- + +The STATEDB has several tables. This section explains them. + +The `running_queue` table has a single column (`running`) and a single +row, and is used to store a single boolean value that specifies +whether WEBAPP is giving out jobs to run from the run-queue. This +value is controlled by `/1.0/start-queue` and `/1.0/stop-queue` +requests. + +The `lorries` table implements the run-queue: all the Lorry specs that +WEBAPP knows about. It has the following columns: + +* `path` is the path of the git repository on the Downstream Host, i.e., + the git repository to which Lorry will push. This is a unique + identifier. It is used, for example, to determine if a Lorry spec + is obsolete after a CONFGIT update. +* `text` has the text of the Lorry spec. This may be read from a file + or generated by Lorry Controller itself. This text will be given to + Lorry when a job is run. +* `generated` is set to 0 or 1, depending on if the Lorry came from an + actual `.lorry` file or was generated by Lorry Controller. + + +Code structure +============== + +The Lorry Controller code base is laid out as follows: + +* `lorry-controller-webapp` is the main program of WEBAPP. It sets up + the bottle.py framework. All the implementations for the various + HTTP requests are in classes in the `lorrycontroller` Python + package, as subclasses of the `LorryControllerRoute` class. The main + program uses introspection ("magic") to find the subclasses + automatically and sets up the bottle.py routes correctly. 
This makes + it possible to spread the code into simple classes; bottle's normal + way (with the `@app.route` decorator) seemed to make that harder and + require everything in the same class. + +* `lorrycontroller` is a Python package with: + + - The HTTP request handlers (`LorryControllerRoute` and its subclasses) + - Management of STATEDB (`statedb` module) + - Support for various Downstream and Upstream Host types + (`hosts`, `gitano`, `gerrit`, `gitlab`, `local` modules) + - Some helpful utilities (`proxy` module) + +* `lorry-controller-minion` is the entirety of the MINION, except that + it uses the `lorrycontroller.setup_proxy` function. + The MINION is kept very simple on purpose: all the interesting logic + is in the WEBAPP instead. + +* `static` has static content to be served over HTTP. Primarily, the + CSS file for the HTML interfaces. When LC is integrated within the + Downstream Host, the web server gets configured to serve these files directly. + The `static` directory will be accessible over plain HTTP on port + 80, and on port 12765 via the WEBAPP, to allow HTML pages to refer + to it via a simple path. + +* `templates` contains bottle.py HTML templates for various pages. + +* `etc` contains files to be installed in `/etc` when LC is installed + on a Baserock system. Primarily this is the web server (lighttpd) + configuration to invoke WEBAPP. + +* `units` contains various systemd units that start services and run + time-based jobs. + +* `yarns.webapp` contains an integration test suite for WEBAPP. + This is run by the `./check` script. The `./test-wait-for-port` + script is used by the yarns. + +Example +------- + +As an example, to modify how the `/1.0/status-html` request works, you +would look at its implementation in `lorrycontroller/status.py`, and +perhaps also the HTML templates in `templates/*.tpl`. + +STATEDB +------- + +The persistent state of WEBAPP is stored in an SQLite3 database. All +access to STATEDB within WEBAPP is via the +`lorrycontroller/statedb.py` code module. That means there are no SQL +statements outside `statedb.py` at all, nor is it OK to add any. If +the interface provided by the `StateDB` class isn't sufficient, then +modify the class suitably, but do not add any new SQL outside it. + +All access from outside of WEBAPP happens via WEBAPP's HTTP API. +Only the WEBAPP is allowed to touch STATEDB in any way. + +The bottle.py framework runs multiple threads of WEBAPP code. The +threads communicate only via STATEDB. There is no shared state in +memory. SQLite's locking is used for mutual exclusion. + +The `StateDB` class acts as a context manager for Python's `with` +statements to provide locking. To access STATEDB with locking, use +code such as this: + + with self.open_statedb() as statedb: + hosts = statedb.get_hosts() + for host in hosts: + statedb.remove_host(host) + +The code executed by the `with` statement is run under lock, and the +lock gets released automatically even if there is an exception. + +(You could manage locks manually. It's a good way to build character +and learn why using the context manager is really simple and leads to +more correct code.) diff --git a/INSTALL b/INSTALL deleted file mode 100644 index 5344093..0000000 --- a/INSTALL +++ /dev/null @@ -1,273 +0,0 @@ -# Installing Lorry Controller - -## Dependencies - -Required: - -* **Python 3**: Tested with versions 3.6 and 3.8. - -* **Git**: Tested with versions 2.11 and 2.27. - -* **Lorry**: Also note the dependencies in its README.
- -* **Bottle**: Can be installed from PyPI, or as the `python3-bottle` - package in Debian. Tested with version 0.12.18. - -* **cliapp**: Can be installed as the `python3-cliapp` package in - Debian, or with: - - pip3 install https://gitlab.com/trovekube/cliapp/-/archive/cliapp-1.20180812.1/cliapp-cliapp-1.20180812.1.tar.gz - - or from the source at . Tested with version - 1.20180812.1. - -* **PyYAML**: Can be installed from PyPI, or as the `python3-yaml` - package in Debian. Tested with version 5.3.1. - -* **yoyo-migrations**: Can be installed from PyPI. Tested with - version 7.0.2. - -Optional: - -* **curl**: Needed if you want to run the test suite. Can be - installed as a distribution package. Tested with versions 7.52.1 - and 7.70.0. - -* **flup**: Needed if you want to run the web application as a FastCGI - server. Can be installed from PyPI. Tested with version 1.0.3, and - earlier versions won't run on Python 3. - -* **OpenSSH**: The OpenSSH client is needed if you want to use any - Downstream Host other than the local filesystem. Tested with - versions 7.4p1 and 8.2p1. - -* **python-gitlab**: Needed if you want to use GitLab as a Downstream - or Upstream Host. Can be installed from PyPI or as the - `python3-gitlab` package in Debian 10 onward. Tested with version - 1.15.0. - -* **Requests**: Needed if you want to use Gitano or GitLab as the - Downstream Host. Can be installed from PyPI or as the - `python3-requests` package in Debian. Tested with version 2.23.0. - -* **yarn**: Needed if you want to run the test suite. Can be - installed as the `cmdtest` package in Debian, or from the source at - . Tested with version 0.27-1 and with - commit `cdfe14e45134` on the master branch. - -## User account - -Create a single user account and home directory for Lorry and Lorry -Controller on the host where they will run. - -Create an SSH key pair for Lorry, and install the *private* key in -`.ssh` in Lorry's home directory. - -## Configuring the Downstream Host - -### Gerrit - -These instructions were written for Gerrit 3.1. - -1. Create a user for Lorry in Gerrit's authentication provider. - Add Lorry's SSH *public* key to the Gerrit user account. - -2. Create a group in Gerrit, or add the user to a group, that will be - permitted to create repositories and push changes to them. The - Lorry user should be a member but not an owner of this group. - -3. (Optional but strongly recommended) Create a parent project for - the mirror repositories and make this group the owner. - - Use the `gerrit create-project` command with the - `--permissions-only` option. Alternatively, in the web UI, create - a new project and fill out the form as follows: - - * Set 'Repository name' as you wish. This is independent of the - names of repositories that Lorry will create. - * Leave 'Rights inherit' blank - * Set Owner to the group - * Set 'Create initial empty commit' to 'False' - * Set 'Only serve as parent for other repositories' to 'True' - -4. Give the group permission to create repositories, - [bypass review](https://gerrit-review.googlesource.com/Documentation/user-upload.html#bypass_review), - [skip validation](https://gerrit-review.googlesource.com/Documentation/user-upload.html#skip_validation), - and push tags that aren't on a branch: - - * In 'All-Projects', give the group 'Create Project' permission. - In the web UI this is in the Global Capabilities section.
- * In the parent project (or 'All-Projects'), give the group 'Forge - Author Identity', 'Forge Committer Identity', 'Forge Server - Identity', 'Push', and 'Push Merge Commit' permissions over - `refs/*` - * If you *did not* create a parent project, then in 'All-Projects' - also give the group 'Create Reference', 'Create Signed Tag', and - 'Create Annotated Tag' permissions over `refs/*` - -5. In `lorry.conf`: - - * Set `mirror-base-url-{fetch,push}` to - `git+ssh://`*username*`@`*hostname*`:29418` - * Set `push-option = skip-validation` - -6. In `webapp.conf`: - - * Set `downstream-host-type = gerrit` - * Set `downstream-ssh-url = ssh://`*username*`@`*hostname*`:29418` - * Set `gerrit-parent-project =` *parent-project* - -7. Add Gerrit's SSH host public key to `.ssh/known_hosts` in Lorry's - home directory. - -### Gitano - -Gitano and Lorry Controller would normally be deployed together as -part of a Trove: . - -### Gitea - -These instructions were written for Gitea 1.11. - -1. Create a user for Lorry in Gitea (or its authentication provider). - Log in as the user and add Lorry's SSH *public* key to the user - account. Generate an access token for the user. - -2. Set `mirror-base-url-{fetch,push}` in `lorry.conf` to - `git+ssh://git@`*hostname* - -3. In `webapp.conf`: - - * Set `downstream-host-type = gitea` - * Set `downstream-visibility` to the desired visibility of - repositories: `private`, `internal`, or `public` - * Set `downstream-http-url` to the HTTPS or HTTP (not recommended) - URL of the Gitea server. - * Set `gitea-access-token =` *access-token* - -4. Add Gitea's SSH host public key to `.ssh/known_hosts` in Lorry's - home directory. - -Gitea requires all repositories to be organised under a user or -organisation, and organisations cannot contain other organisations. -You must therefore ensure that the CONFGIT specifies repository paths -with exactly two path components. - -Lorry Controller will attempt to create organisations as needed to -contain repositories. If your Gitea configuration does not allow -users to do this, you will need to create organisations in advance and -give the Lorry user permission to create repositories under them. - -### GitLab - -These instructions were written for GitLab CE 12.10. - -1. Create a user for Lorry in GitLab (or its authentication provider). - Add Lorry's SSH *public* key to the user account. Generate an - impersonation token for the user. - -2. Set `mirror-base-url-{fetch,push}` in `lorry.conf` to - `git+ssh://git@`*hostname* - -3. In `webapp.conf`: - - * Set `downstream-host-type = gitlab` - * Set `downstream-visibility` to the desired visibility of - repositories: `private`, `internal`, or `public` - * Set `downstream-http-url` to the HTTPS or HTTP (not recommended) - URL of the GitLab server. - * Set `gitlab-private-token =` *impersonation-token* - -4. Add GitLab's SSH host public key to `.ssh/known_hosts` in Lorry's - home directory. - -GitLab requires all projects to be organised under a user or group. -You must therefore ensure that the CONFGIT specifies repository paths -with at least two path components. - -Lorry Controller will attempt to create groups as needed to contain -projects. If your GitLab configuration does not allow users to do -this, you will need to create top-level groups in advance and give the -Lorry user permission to create subgroups and projects under them. - -### Local filesystem - -1. Create a directory to contain the repositories, writable by - the Lorry user. - -2. 
Set `mirror-base-url-{fetch,push}` in `lorry.conf` to the directory - name. - -3. In `webapp.conf`: - - * Set `downstream-host-type = local` - * Set `local-base-directory =` *directory* - -## Configuring a front-end web server - -WEBAPP can run behind a front-end web server connected through FastCGI. -To enable FastCGI, set `wsgi = yes` in `webapp.conf`. - -The front-end web server must be configured so that: - -* It does any necessary access control - -* It passes the request path as `PATH_INFO`, not split into - `SCRIPT_NAME` and `PATH_INFO` - -* It creates the FastCGI socket and starts WEBAPP as the Lorry user. - WEBAPP should normally be started with the command: - - /usr/bin/lorry-controller-webapp --config=/etc/lorry-controller/webapp.conf - -An example configuration for lighttpd, and a corresponding systemd -unit file, are included in the source as -`etc/lighttpd/lorry-controller-webapp-httpd.conf` and -`units/lighttpd-lorry-controller-webapp.service`. - -## Creating the CONFGIT repository - -The CONFGIT repository can be hosted anywhere, but will normally be a -private repository on the Downstream Host. The Lorry user account on -the Downstream Host must be given permission to read, but not write, -to it. Only administrators of Lorry Controller should be permitted -to push to it. - -The configuration files stored in CONFGIT are documented in README. - -## Installing Lorry and Lorry Controller - -1. In a copy of the Lorry source tree, run: - - python3 setup.py install --install-layout=deb - -2. Install `lorry.conf` in `/etc/` - -3. Create the directories named in `lorry.conf` - (`bundle-dest`, `tarball-dest`, `working-area` settings), - owned by the Lorry user. - -4. In a copy of the Lorry Controller source tree, run: - - python3 setup.py install --install-layout=deb - -5. Install `webapp.conf` in `/etc/lorry-controller/` - -6. Create the directories named in `webapp.conf` - (`configuration-directory` setting and parent directories for the - `statedb` and `status-html` settings), owned by the Lorry user. - -7. Install the unit files from Lorry Controller's source tree - (`units/lorry-controller-*`) in `/lib/systemd/system/`. These may - need to be modified to specify the correct URL for the front-end - web server. - -8. Run `systemctl daemon-reload` to make systemd load the unit files. - -9. Enable and start as many MINION services as you want to run in - parallel. For example, to create 5 services: - - for i in $(seq 5); do - systemctl enable lorry-controller-minion@$i.service - systemctl start lorry-controller-minion@$i.service - done diff --git a/INSTALL.md b/INSTALL.md new file mode 100644 index 0000000..d7ad241 --- /dev/null +++ b/INSTALL.md @@ -0,0 +1,273 @@ +# Installing Lorry Controller + +## Dependencies + +Required: + +* **Python 3**: Tested with versions 3.6 and 3.8. + +* **Git**: Tested with versions 2.11 and 2.27. + +* **Lorry**: Also note the dependencies in its README. + +* **Bottle**: Can be installed from PyPI, or as the `python3-bottle` + package in Debian. Tested with version 0.12.18. + +* **cliapp**: Can be installed as the `python3-cliapp` package in + Debian, or with: + + pip3 install https://gitlab.com/trovekube/cliapp/-/archive/cliapp-1.20180812.1/cliapp-cliapp-1.20180812.1.tar.gz + + or from the source at . Tested with version + 1.20180812.1. + +* **PyYAML**: Can be installed from PyPI, or as the `python3-yaml` + package in Debian. Tested with version 5.3.1. + +* **yoyo-migrations**: Can be installed from PyPI. Tested with + version 7.0.2. 
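The required dependencies that come from PyPI can be installed in one go. (A sketch, assuming `pip3` is available and using the usual PyPI package names; the cliapp URL is the one given above.)

    # Required dependencies available from PyPI.
    pip3 install bottle PyYAML yoyo-migrations

    # cliapp, from the release archive named above.
    pip3 install https://gitlab.com/trovekube/cliapp/-/archive/cliapp-1.20180812.1/cliapp-cliapp-1.20180812.1.tar.gz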
+ +Optional: + +* **curl**: Needed if you want to run the test suite. Can be + installed as a distribution package. Tested with versions 7.52.1 + and 7.70.0. + +* **flup**: Needed if you want to run the web application as a FastCGI + server. Can be installed from PyPI. Tested with version 1.0.3, and + earlier versions won't run on Python 3. + +* **OpenSSH**: The OpenSSH client is needed if you want to use any + Downstream Host other than the local filesystem. Tested with + versions 7.4p1 and 8.2p1. + +* **python-gitlab**: Needed if you want to use GitLab as a Downstream + or Upstream Host. Can be installed from PyPI or as the + `python3-gitlab` package in Debian 10 onward. Tested with version + 1.15.0. + +* **Requests**: Needed if you want to use Gitano or GitLab as the + Downstream Host. Can be installed from PyPI or as the + `python3-requests` package in Debian. Tested with version 2.23.0. + +* **yarn**: Needed if you want to run the test suite. Can be + installed as the `cmdtest` package in Debian, or from the source at + . Tested with version 0.27-1 and with + commit `cdfe14e45134` on the master branch. + +## User account + +Create a single user account and home directory for Lorry and Lorry +Controller on the host where they will run. + +Create an SSH key pair for Lorry, and install the *private* key in +`.ssh` in Lorry's home directory. + +## Configuring the Downstream Host + +### Gerrit + +These instructions were written for Gerrit 3.1. + +1. Create a user for Lorry in Gerrit's authentication provider. + Add Lorry's SSH *public* key to the Gerrit user account. + +2. Create a group in Gerrit, or add the user to a group, that will be + permitted to create repositories and push changes to them. The + Lorry user should be a member but not an owner of this group. + +3. (Optional but strongly recommended) Create a parent project for + the mirror repositories and make this group the owner. + + Use the `gerrit create-project` command with the + `--permissions-only` option. Alternatively, in the web UI, create + a new project and fill out the form as follows: + + * Set 'Repository name' as you wish. This is independent of the + names of repositories that Lorry will create. + * Leave 'Rights inherit' blank + * Set Owner to the group + * Set 'Create initial empty commit' to 'False' + * Set 'Only serve as parent for other repositories' to 'True' + +4. Give the group permission to create repositories, + [bypass review](https://gerrit-review.googlesource.com/Documentation/user-upload.html#bypass_review), + [skip validation](https://gerrit-review.googlesource.com/Documentation/user-upload.html#skip_validation), + and push tags that aren't on a branch: + + * In 'All-Projects', give the group 'Create Project' permission. + In the web UI this is in the Global Capabilities section. + * In the parent project (or 'All-Projects'), give the group 'Forge + Author Identity', 'Forge Committer Identity', 'Forge Server + Identity', 'Push', and 'Push Merge Commit' permissions over + `refs/*` + * If you *did not* create a parent project, then in 'All-Projects' + also give the group 'Create Reference', 'Create Signed Tag', and + 'Create Annotated Tag' permissions over `refs/*` + +5. In `lorry.conf`: + + * Set `mirror-base-url-{fetch,push}` to + `git+ssh://`*username*`@`*hostname*`:29418` + * Set `push-option = skip-validation` + +6.
+
+### Gitano
+
+Gitano and Lorry Controller would normally be deployed together as
+part of a Trove: .
+
+### Gitea
+
+These instructions were written for Gitea 1.11.
+
+1. Create a user for Lorry in Gitea (or its authentication provider).
+   Log in as the user and add Lorry's SSH *public* key to the user
+   account. Generate an access token for the user.
+
+2. Set `mirror-base-url-{fetch,push}` in `lorry.conf` to
+   `git+ssh://git@`*hostname*
+
+3. In `webapp.conf`:
+
+   * Set `downstream-host-type = gitea`
+   * Set `downstream-visibility` to the desired visibility of
+     repositories: `private`, `internal`, or `public`
+   * Set `downstream-http-url` to the HTTPS or HTTP (not recommended)
+     URL of the Gitea server.
+   * Set `gitea-access-token =` *access-token*
+
+4. Add Gitea's SSH host public key to `.ssh/known_hosts` in Lorry's
+   home directory.
+
+Gitea requires all repositories to be organised under a user or
+organisation, and organisations cannot contain other organisations.
+You must therefore ensure that the CONFGIT specifies repository paths
+with exactly two path components.
+
+Lorry Controller will attempt to create organisations as needed to
+contain repositories. If your Gitea configuration does not allow
+users to do this, you will need to create organisations in advance and
+give the Lorry user permission to create repositories under them.
+
+### GitLab
+
+These instructions were written for GitLab CE 12.10.
+
+1. Create a user for Lorry in GitLab (or its authentication provider).
+   Add Lorry's SSH *public* key to the user account. Generate an
+   impersonation token for the user.
+
+2. Set `mirror-base-url-{fetch,push}` in `lorry.conf` to
+   `git+ssh://git@`*hostname*
+
+3. In `webapp.conf`:
+
+   * Set `downstream-host-type = gitlab`
+   * Set `downstream-visibility` to the desired visibility of
+     repositories: `private`, `internal`, or `public`
+   * Set `downstream-http-url` to the HTTPS or HTTP (not recommended)
+     URL of the GitLab server.
+   * Set `gitlab-private-token =` *impersonation-token*
+
+4. Add GitLab's SSH host public key to `.ssh/known_hosts` in Lorry's
+   home directory.
+
+GitLab requires all projects to be organised under a user or group.
+You must therefore ensure that the CONFGIT specifies repository paths
+with at least two path components.
+
+Lorry Controller will attempt to create groups as needed to contain
+projects. If your GitLab configuration does not allow users to do
+this, you will need to create top-level groups in advance and give the
+Lorry user permission to create subgroups and projects under them.
+
+### Local filesystem
+
+1. Create a directory to contain the repositories, writable by
+   the Lorry user.
+
+2. Set `mirror-base-url-{fetch,push}` in `lorry.conf` to the directory
+   name.
+
+3. In `webapp.conf`:
+
+   * Set `downstream-host-type = local`
+   * Set `local-base-directory =` *directory*
+
+## Configuring a front-end web server
+
+WEBAPP can run behind a front-end web server connected through FastCGI.
+To enable FastCGI, set `wsgi = yes` in `webapp.conf`.
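+
+Pulling the settings from the sections above together, a complete
+`webapp.conf` for a GitLab Downstream Host running behind FastCGI
+might look like the following sketch. The URLs and token are
+illustrative placeholders, and the `[config]` section name is an
+assumption based on the cliapp-style configuration example shown in
+`README.md`:
+
+    [config]
+    confgit-url = ssh://git@gitlab.example.com/ops/lorries.git
+    downstream-host-type = gitlab
+    downstream-http-url = https://gitlab.example.com
+    downstream-visibility = internal
+    gitlab-private-token = REPLACE-WITH-IMPERSONATION-TOKEN
+    wsgi = yes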
+
+The front-end web server must be configured so that:
+
+* It does any necessary access control
+
+* It passes the request path as `PATH_INFO`, not split into
+  `SCRIPT_NAME` and `PATH_INFO`
+
+* It creates the FastCGI socket and starts WEBAPP as the Lorry user.
+  WEBAPP should normally be started with the command:
+
+      /usr/bin/lorry-controller-webapp --config=/etc/lorry-controller/webapp.conf
+
+An example configuration for lighttpd, and a corresponding systemd
+unit file, are included in the source as
+`etc/lighttpd/lorry-controller-webapp-httpd.conf` and
+`units/lighttpd-lorry-controller-webapp.service`.
+
+## Creating the CONFGIT repository
+
+The CONFGIT repository can be hosted anywhere, but will normally be a
+private repository on the Downstream Host. The Lorry user account on
+the Downstream Host must be given permission to read from it, but not
+to write to it. Only administrators of Lorry Controller should be
+permitted to push to it.
+
+The configuration files stored in CONFGIT are documented in `README.md`.
+
+## Installing Lorry and Lorry Controller
+
+1. In a copy of the Lorry source tree, run:
+
+       python3 setup.py install --install-layout=deb
+
+2. Install `lorry.conf` in `/etc/`
+
+3. Create the directories named in `lorry.conf`
+   (`bundle-dest`, `tarball-dest`, `working-area` settings),
+   owned by the Lorry user.
+
+4. In a copy of the Lorry Controller source tree, run:
+
+       python3 setup.py install --install-layout=deb
+
+5. Install `webapp.conf` in `/etc/lorry-controller/`
+
+6. Create the directories named in `webapp.conf`
+   (`configuration-directory` setting and parent directories for the
+   `statedb` and `status-html` settings), owned by the Lorry user.
+
+7. Install the unit files from Lorry Controller's source tree
+   (`units/lorry-controller-*`) in `/lib/systemd/system/`. These may
+   need to be modified to specify the correct URL for the front-end
+   web server.
+
+8. Run `systemctl daemon-reload` to make systemd load the unit files.
+
+9. Enable and start as many MINION services as you want to run in
+   parallel. For example, to create 5 services:
+
+       for i in $(seq 5); do
+           systemctl enable lorry-controller-minion@$i.service
+           systemctl start lorry-controller-minion@$i.service
+       done
diff --git a/README b/README
deleted file mode 100644
index f2afb80..0000000
--- a/README
+++ /dev/null
@@ -1,208 +0,0 @@
-README for lorry-controller
-===========================
-
-Overview
---------
-
-Lorry Controller, or LC for short, manages the importing of source
-code from external sources into git repositories on a Trove, GitLab,
-or Gerrit server (Downstream Host).
-
-LC uses the Lorry tool to do the actual import. Lorry can read code
-from several different version control systems, and convert them to
-git. External repositories can be specfied individually, as Lorry
-`.lorry` specification files. In addition, LC can be told to mirror
-all the git repositories on a Trove or GitLab server (Upstream Host).
-
-LC runs Lorry for the right external repositories, and takes care of
-running a suitable number of Lorry instances concurrently, and
-recovering from any problems. LC has a web based administration
-interface, and an HTTP API for reporting and controlling its state.
-
-This README file documents the LC configuration file and general use.
-For the architecture of LC and the HTTP API, see the `ARCH` file.
-
-Installation
-------------
-
-See the INSTALL file.
- -Lorry Controller configuration: overview ------------------------------- - -Lorry Controller has two levels of configuration. The first level is -command line options and configuration files. This level specifies -things such as log levels, network addresses to listen on, and such. -Most importantly, this level specifies the location of the second -level. For information about these options, run -`lorry-controller-webapp --help` to get a list of them. - -The second level is a git repository that specifies which external -repositories and Upstream Hosts to import into the Downstream Host. -This git repository is referred to as CONFGIT in documentation, and is -specified with the the `--confgit-url` command line option, or the -`confgit-url` key in the configuration file. The configuration file -could contain this, for example: - - [config] - confgit-url = ssh://git@localhost/baserock/local-config/lorries - -The system integration of a Trove automatically includes a -configuration file that contains a configuration such as the above. -The URL contains the name of the Trove, so it needs to be customised -for each Trove, but as long as you're only using LC as part of a -Baserock Trove, it's all taken care of for you automatically. - - -The CONFGIT repository ----------------------- - -The CONFGIT repository must contain at least the file -`lorry-controller.conf`. It may also contain other files, including -`.lorry` files for Lorry, but all other files are ignored unless -referenced by `lorry-controller.conf`. - - - -The `lorry-controller.conf` file --------------------------------- - -`lorry-controller.conf` is a JSON file containing a list of maps. Each -map specifies an Upstream Host or one set of `.lorry` -files. Here's an example that tells LC to mirror the `git.baserock.org` -Trove and anything in the `open-source-lorries/*.lorry` files (if any -exist). - - [ - { - "ignore": [ - "baserock/lorries" - ], - "interval": "2H", - "ls-interval": "4H", - "prefixmap": { - "baserock": "baserock", - "delta": "delta" - }, - "protocol": "http", - "host": "git.baserock.org", - "type": "trove" - }, - { - "type": "lorries", - "interval": "6H", - "prefix": "delta", - "globs": [ - "open-source-lorries/*.lorry" - ] - } - ] - -A Host specification (map) uses the following mandatory keys: - -* `type:` -- either `trove` or `gitlab`, depending on the type of - Upstream Host. - -* `host` -- the Upstream Host to mirror; a domain name or IP address. - -* `protocol` -- specify how Lorry Controller (and Lorry) should talk - to the Upstream Host. Allowed values are `ssh`, `https`, `http`. - -* `prefixmap` -- map repository path prefixes from the Upstream Host - to the Downstream Host. If the upstream prefix is `foo`, and the - downstream prefix is `bar`, then upstream repository - `foo/baserock/yeehaa` gets mirrored to downstream repository - `bar/baserock/yeehaa`. If the Upstream Host has a repository that - does not match a prefix, that repository gets ignored. - -* `ls-interval` -- determine how often should Lorry Controller query - the Upstream Host for a list of repositories it may mirror. See below - for how the value is interpreted. The default is 24 hours. - -* `interval` -- specify how often Lorry Controller should mirror the - repositories in the spec. See below for INTERVAL. The default - interval is 24 hours. - -Additionally, the following optional keys are allowed in Host -specifications: - -* `ignore` -- a list of git repositories from the Upstream Host that - should NOT be mirrored. 
Each list element is a glob pattern which
-  is matched against the path to the git repository (not including leading
-  slash).
-
-* `auth` -- specify how to authenticate to the Upstream Host over https
-  (only). It should be a dictionary with the fields `username` and
-  `password`.
-
-A GitLab specification (map) uses an additional mandatory key:
-
-* `private-token` -- the GitLab private token for a user with the
-  minimum permissions of master of any group you may wish to create
-  repositories under.
-
-A Lorry specification (map) uses the following keys, all of them
-mandatory:
-
-* `type: lorries` -- specify it's a Lorry specification.
-
-* `interval` -- identical in meaning to the `interval` in a
-  Host specification.
-
-* `prefix` -- a path prefix to be prepended to all repositories
-  created from the `.lorry` files from this spec.
-
-* `globs` -- a list of globs (as strings) for `.lorry` files to use.
-  The glob is matched in the directory containing the configuration
-  file in which this spec is. It is OK for the globs to not match
-  anything.
-
-For backwards compatibility with another implementation of Lorry
-Controller, other fields in either type of specification are allowed
-and silently ignored.
-
-An INTERVAL value (for `interval` or `ls-interval`) is a number and a
-unit to indicate a time interval. Allowed units are minutes (`m`),
-hours (`h`), and days (`d`), expressed as single-letter codes in upper
-or lower case.
-
-The syntax of `.lorry` files is specified by the Lorry program; see
-its documentation for details.
-
-
-HTTP proxy configuration: `proxy.conf`
---------------------------------------
-
-Lorry Controller will look for a file called `proxy.conf` in the same
-directory as the `lorry-controller.conf` configuration file.
-It is in JSON format, with the following key/value pairs:
-
-* `hostname` -- the hostname of the HTTP proxy
-* `username` -- username for authenticating to the proxy
-* `password` -- a **cleartext** password for authenticating to the
-  proxy
-* `port` -- port number for connecting to the proxy
-
-Lorry Controller will use this information for both HTTP and HTTPS
-proxying.
-
-Do note that the **password is stored in cleartext** and that access
-to the configuration file (and the git repository where it is stored)
-must be controlled appropriately.
-
-WEBAPP 'Admin' Interface
-------------------------
-
-An 'admin' interface runs locally on port 12765.
-
-For the moment you can access this interface using an ssh tunnel, for
-example:
-
-ssh -L 12765:localhost:12765 root@lorryhost
-
-will bind 12765 on your localhost to 12765 on lorryhost, with this running
-you can access the 'admin' interface at
-http://localhost:12765/1.0/status-html
-
-When used within Trove, a web interface for managing lorry controller
-is accessible from http://trove/1.0/status-html.
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..32c804d
--- /dev/null
+++ b/README.md
@@ -0,0 +1,208 @@
+README for lorry-controller
+===========================
+
+Overview
+--------
+
+Lorry Controller, or LC for short, manages the importing of source
+code from external sources into git repositories on a Trove, GitLab,
+or Gerrit server (Downstream Host).
+
+LC uses the Lorry tool to do the actual import. Lorry can read code
+from several different version control systems, and convert them to
+git. External repositories can be specified individually, as Lorry
+`.lorry` specification files. In addition, LC can be told to mirror
+all the git repositories on a Trove or GitLab server (Upstream Host).
+
+LC runs Lorry for the right external repositories, and takes care of
+running a suitable number of Lorry instances concurrently, and
+recovering from any problems. LC has a web based administration
+interface, and an HTTP API for reporting and controlling its state.
+
+This README file documents the LC configuration file and general use.
+For the architecture of LC and the HTTP API, see the `ARCH.md` file.
+
+Installation
+------------
+
+See the `INSTALL.md` file.
+
+Lorry Controller configuration: overview
+----------------------------------------
+
+Lorry Controller has two levels of configuration. The first level is
+command line options and configuration files. This level specifies
+things such as log levels, network addresses to listen on, and such.
+Most importantly, this level specifies the location of the second
+level. For information about these options, run
+`lorry-controller-webapp --help` to get a list of them.
+
+The second level is a git repository that specifies which external
+repositories and Upstream Hosts to import into the Downstream Host.
+This git repository is referred to as CONFGIT in documentation, and is
+specified with the `--confgit-url` command line option, or the
+`confgit-url` key in the configuration file. The configuration file
+could contain this, for example:
+
+    [config]
+    confgit-url = ssh://git@localhost/baserock/local-config/lorries
+
+The system integration of a Trove automatically includes a
+configuration file that contains a configuration such as the above.
+The URL contains the name of the Trove, so it needs to be customised
+for each Trove, but as long as you're only using LC as part of a
+Baserock Trove, it's all taken care of for you automatically.
+
+
+The CONFGIT repository
+----------------------
+
+The CONFGIT repository must contain at least the file
+`lorry-controller.conf`. It may also contain other files, including
+`.lorry` files for Lorry, but all other files are ignored unless
+referenced by `lorry-controller.conf`.
+
+
+
+The `lorry-controller.conf` file
+--------------------------------
+
+`lorry-controller.conf` is a JSON file containing a list of maps. Each
+map specifies an Upstream Host or one set of `.lorry`
+files. Here's an example that tells LC to mirror the `git.baserock.org`
+Trove and anything in the `open-source-lorries/*.lorry` files (if any
+exist).
+
+    [
+        {
+            "ignore": [
+                "baserock/lorries"
+            ],
+            "interval": "2H",
+            "ls-interval": "4H",
+            "prefixmap": {
+                "baserock": "baserock",
+                "delta": "delta"
+            },
+            "protocol": "http",
+            "host": "git.baserock.org",
+            "type": "trove"
+        },
+        {
+            "type": "lorries",
+            "interval": "6H",
+            "prefix": "delta",
+            "globs": [
+                "open-source-lorries/*.lorry"
+            ]
+        }
+    ]
+
+A Host specification (map) uses the following mandatory keys:
+
+* `type:` -- either `trove` or `gitlab`, depending on the type of
+  Upstream Host.
+
+* `host` -- the Upstream Host to mirror; a domain name or IP address.
+
+* `protocol` -- specify how Lorry Controller (and Lorry) should talk
+  to the Upstream Host. Allowed values are `ssh`, `https`, `http`.
+
+* `prefixmap` -- map repository path prefixes from the Upstream Host
+  to the Downstream Host. If the upstream prefix is `foo`, and the
+  downstream prefix is `bar`, then upstream repository
+  `foo/baserock/yeehaa` gets mirrored to downstream repository
+  `bar/baserock/yeehaa`. If the Upstream Host has a repository that
+  does not match a prefix, that repository gets ignored.
+
+* `ls-interval` -- how often Lorry Controller should query the
+  Upstream Host for a list of repositories it may mirror. See below
+  for how the value is interpreted. The default is 24 hours.
+
+* `interval` -- specify how often Lorry Controller should mirror the
+  repositories in the spec. See below for INTERVAL. The default
+  interval is 24 hours.
+
+Additionally, the following optional keys are allowed in Host
+specifications:
+
+* `ignore` -- a list of git repositories from the Upstream Host that
+  should NOT be mirrored. Each list element is a glob pattern which
+  is matched against the path to the git repository (not including leading
+  slash).
+
+* `auth` -- specify how to authenticate to the Upstream Host over https
+  (only). It should be a dictionary with the fields `username` and
+  `password`.
+
+A GitLab specification (map) uses an additional mandatory key:
+
+* `private-token` -- the GitLab private token for a user with the
+  minimum permissions of master of any group you may wish to create
+  repositories under.
+
+A Lorry specification (map) uses the following keys, all of them
+mandatory:
+
+* `type: lorries` -- specify it's a Lorry specification.
+
+* `interval` -- identical in meaning to the `interval` in a
+  Host specification.
+
+* `prefix` -- a path prefix to be prepended to all repositories
+  created from the `.lorry` files from this spec.
+
+* `globs` -- a list of globs (as strings) for `.lorry` files to use.
+  The glob is matched in the directory containing the configuration
+  file in which this spec is. It is OK for the globs to not match
+  anything.
+
+For backwards compatibility with another implementation of Lorry
+Controller, other fields in either type of specification are allowed
+and silently ignored.
+
+An INTERVAL value (for `interval` or `ls-interval`) is a number and a
+unit to indicate a time interval. Allowed units are minutes (`m`),
+hours (`h`), and days (`d`), expressed as single-letter codes in upper
+or lower case.
+
+The syntax of `.lorry` files is specified by the Lorry program; see
+its documentation for details.
+
+
+HTTP proxy configuration: `proxy.conf`
+--------------------------------------
+
+Lorry Controller will look for a file called `proxy.conf` in the same
+directory as the `lorry-controller.conf` configuration file.
+It is in JSON format, with the following key/value pairs (see the
+sketch after this list):
+
+* `hostname` -- the hostname of the HTTP proxy
+* `username` -- username for authenticating to the proxy
+* `password` -- a **cleartext** password for authenticating to the
+  proxy
+* `port` -- port number for connecting to the proxy
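+
+For example, a complete `proxy.conf` might look like the following
+sketch. The hostname, credentials, and port are illustrative
+placeholders:
+
+    {
+        "hostname": "proxy.example.com",
+        "username": "lorry",
+        "password": "secret",
+        "port": 8080
+    }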
+
+Lorry Controller will use this information for both HTTP and HTTPS
+proxying.
+
+Do note that the **password is stored in cleartext** and that access
+to the configuration file (and the git repository where it is stored)
+must be controlled appropriately.
+
+WEBAPP 'Admin' Interface
+------------------------
+
+An 'admin' interface runs locally on port 12765.
+
+For the moment you can access this interface using an ssh tunnel, for
+example:
+
+    ssh -L 12765:localhost:12765 root@lorryhost
+
+This binds port 12765 on your local host to port 12765 on `lorryhost`.
+With the tunnel running, you can access the 'admin' interface at
+http://localhost:12765/1.0/status-html
+
+When used within Trove, a web interface for managing lorry controller
+is accessible from http://trove/1.0/status-html.
-- 
cgit v1.2.1