author     Nick Vatamaniuc <vatamane@apache.org>   2016-11-25 11:47:56 -0500
committer  Nick Vatamaniuc <vatamane@apache.org>   2017-04-28 17:35:50 -0400
commit     4841774575fb5771a245e5f046a26eaa7ac8dbb4 (patch)
tree       67d23b5b1e97bcb8d5f07798303c833e291460d2 /dev
parent     dcfa0902b62613f6c17062d756aa2c6e68871cfa (diff)
download   couchdb-4841774575fb5771a245e5f046a26eaa7ac8dbb4.tar.gz
Stitch scheduling replicator together.
Glue together all the scheduling replicator pieces.

The scheduler is the main component. It can run a large number of replication jobs by switching between them, stopping some and starting others periodically. Jobs which fail are backed off exponentially. Normal (non-continuous) jobs will be allowed to run to completion to preserve their current semantics.

Scheduler behavior can be configured by these options in the `[replicator]` configuration section:

 * `max_jobs` : Number of actively running replications. Making this too high could cause performance issues, while making it too low could mean replication jobs might not have enough time to make progress before getting unscheduled again. This parameter can be adjusted at runtime and takes effect during the next rescheduling cycle.

 * `interval` : Scheduling interval in milliseconds. During each rescheduling cycle the scheduler might start or stop up to `max_churn` jobs.

 * `max_churn` : Maximum number of replications to start and stop during rescheduling. This parameter, together with `interval`, defines the rate of job replacement. During startup, however, a much larger number of jobs (up to `max_jobs`) could be started in a short period of time.

Replication jobs are added to the scheduler by the document processor, or from the `couch_replicator:replicate/2` function when called from the `_replicate` HTTP endpoint handler.

The document processor listens for updates via the couch_multidb_changes module, then tries to add replication jobs to the scheduler. Sometimes translating a document update into a replication job can fail, either permanently (for example, if the document is malformed and missing expected fields) or temporarily (if it is a filtered replication and the filter cannot be fetched). A failed filter fetch will be retried with an exponential backoff.

couch_replicator_clustering is in charge of monitoring cluster membership changes. When membership changes, a rescan is initiated after a configurable quiet period. The rescan shuffles replication jobs to make sure each replication job is running on only one node.

A new set of stats was added to introspect scheduler and doc processor internals.

The top replication supervisor structure is `rest_for_one`. This means that if a child crashes, all children to the "right" of it will be restarted (if the supervisor hierarchy is visualized as an upside-down tree). Clustering, the connection pool and the rate limiter are towards the "left" as they are more fundamental; if the clustering child crashes, most other components will be restarted. The doc processor and multi-db changes children are towards the "right"; if they crash, they can be safely restarted without affecting already running replications or components like clustering or the connection pool.

Jira: COUCHDB-3324
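To make the `[replicator]` options described above concrete, here is a minimal standalone sketch of such a section and of reading it from Python. It is not part of this commit, and the values are illustrative assumptions rather than defaults stated here.

# Sketch only: illustrative [replicator] values for the scheduler options
# described in the commit message; the commit does not define any defaults.
import configparser

SAMPLE_INI = """
[replicator]
max_jobs = 500
interval = 60000
max_churn = 20
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE_INI)
replicator = config["replicator"]
print(int(replicator["max_jobs"]),    # actively running replications
      int(replicator["interval"]),    # scheduling interval, in milliseconds
      int(replicator["max_churn"]))   # starts/stops per rescheduling cycle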
Diffstat (limited to 'dev')
-rwxr-xr-x  dev/run  14
1 file changed, 14 insertions, 0 deletions
diff --git a/dev/run b/dev/run
index 100256394..781c71530 100755
--- a/dev/run
+++ b/dev/run
@@ -125,6 +125,8 @@ def setup_argparse():
                       help='HAProxy port')
     parser.add_option('--node-number', dest="node_number", type=int, default=1,
                       help='The node number to seed them when creating the node(s)')
+    parser.add_option('-c', '--config-overrides', action="append", default=[],
+                      help='Optional key=val config overrides. Can be repeated')
     return parser.parse_args()
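The option added in the hunk above uses optparse's append action, so every repeated -c/--config-overrides occurrence accumulates into a single list. A small standalone sketch; the argv values below are hypothetical and only illustrate the accumulation:

# Standalone sketch of the append behaviour of the new -c flag.
import optparse

parser = optparse.OptionParser()
parser.add_option('-c', '--config-overrides', action="append", default=[],
                  help='Optional key=val config overrides. Can be repeated')

opts, args = parser.parse_args(['-c', 'max_jobs=4', '--config-overrides', 'interval=5000'])
print(opts.config_overrides)  # ['max_jobs=4', 'interval=5000']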
@@ -143,6 +145,7 @@ def setup_context(opts, args):
             'with_haproxy': opts.with_haproxy,
             'haproxy': opts.haproxy,
             'haproxy_port': opts.haproxy_port,
+            'config_overrides': opts.config_overrides,
             'procs': []}
@@ -190,6 +193,16 @@ def setup_configs(ctx):
         write_config(ctx, node, env)
 
 
+def apply_config_overrides(ctx, content):
+    for kv_str in ctx['config_overrides']:
+        key, val = kv_str.split('=')
+        key, val = key.strip(), val.strip()
+        match = "[;=]{0,2}%s.*" % key
+        repl = "%s = %s" % (key, val)
+        content = re.sub(match, repl, content)
+    return content
+
+
 def get_ports(idnode):
     assert idnode
     return ((10000 * idnode) + 5984, (10000 * idnode) + 5986)
@@ -214,6 +227,7 @@ def write_config(ctx, node, env):
         if base == "default.ini":
             content = hack_default_ini(ctx, node, content)
+            content = apply_config_overrides(ctx, content)
         elif base == "local.ini":
             content = hack_local_ini(ctx, content)
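For reference, a standalone sketch of the substitution that apply_config_overrides performs, with the ctx plumbing removed and run against hypothetical default.ini content (not taken from the repository):

# Sketch mirroring the regex-based rewrite in apply_config_overrides above;
# the overrides are passed in directly instead of being read from ctx.
import re

def apply_overrides(overrides, content):
    for kv_str in overrides:
        key, val = kv_str.split('=')
        key, val = key.strip(), val.strip()
        # "[;=]{0,2}key.*" also matches a commented-out ";key = ..." line,
        # so overriding re-enables the setting with the new value.
        match = "[;=]{0,2}%s.*" % key
        repl = "%s = %s" % (key, val)
        content = re.sub(match, repl, content)
    return content

SAMPLE_DEFAULT_INI = """[replicator]
;max_jobs = 500
interval = 60000
"""

print(apply_overrides(["max_jobs = 4", "interval=5000"], SAMPLE_DEFAULT_INI))
# [replicator]
# max_jobs = 4
# interval = 5000

Given the hunks above, an invocation along the lines of `dev/run -c 'max_jobs=4' -c 'interval=5000'` would rewrite those keys in each node's generated default.ini; the key names here are only examples.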