author    Mehdi Abaakouk <sileht@sileht.net>  2018-04-12 12:15:56 +0200
committer Mehdi Abaakouk <sileht@sileht.net>  2018-04-20 10:45:45 +0200
commit    1dcbd607df0696101b40f77d7721489679ebe0ba (patch)
tree      2b39dd5db78d21efe7d542d26f17f850df5dcfe6 /doc
parent    e9b7abc8711305060ccb395d3daf5474a47e6d5b (diff)
download  ceilometer-1dcbd607df0696101b40f77d7721489679ebe0ba.tar.gz
Deprecating transformers and pipeline partitioning
These features don't work well: rate-of-change metrics can still be computed incorrectly even with pipeline partitioning enabled, and backends like Gnocchi offer a better alternative for computing them. This deprecates both features so that they can be removed in a couple of releases.

Change-Id: I52362c69b7d500bfe6dba76f78403a9d376deb80
Diffstat (limited to 'doc')
-rw-r--r--  doc/source/admin/telemetry-best-practices.rst  |   6
-rw-r--r--  doc/source/admin/telemetry-data-collection.rst |   7
-rw-r--r--  doc/source/admin/telemetry-data-pipelines.rst  | 288
-rw-r--r--  doc/source/admin/telemetry-measurements.rst    |  11
-rw-r--r--  doc/source/contributor/3-Pipeline.png          | bin 46678 -> 25393 bytes
-rw-r--r--  doc/source/contributor/4-Transformer.png       | bin 42222 -> 0 bytes
-rw-r--r--  doc/source/contributor/architecture.rst        |  21
7 files changed, 12 insertions, 321 deletions
diff --git a/doc/source/admin/telemetry-best-practices.rst b/doc/source/admin/telemetry-best-practices.rst
index db4439cc..4c3dc3f7 100644
--- a/doc/source/admin/telemetry-best-practices.rst
+++ b/doc/source/admin/telemetry-best-practices.rst
@@ -27,9 +27,3 @@ Data collection
central and compute agents as necessary. The agents are designed to scale
horizontally. For more information refer to the `high availability guide
<https://docs.openstack.org/ha-guide/controller-ha-telemetry.html>`_.
-
-#. `workload_partitioning` of notification agents is only required if
- the pipeline configuration leverages transformers. It may also be enabled if
- batching is required to minimize load on the defined publisher targets. If
- transformers are not enabled, multiple agents may still be deployed without
- `workload_partitioning` and processing will be done greedily.
diff --git a/doc/source/admin/telemetry-data-collection.rst b/doc/source/admin/telemetry-data-collection.rst
index fb80bb60..26c16353 100644
--- a/doc/source/admin/telemetry-data-collection.rst
+++ b/doc/source/admin/telemetry-data-collection.rst
@@ -39,10 +39,9 @@ By default, the notification agent is configured to build both events and
samples. To enable selective data models, set the required pipelines using the
`pipelines` option under the `[notification]` section.
-Additionally, the notification agent is responsible for all data processing
-such as transformations and publishing. After processing, the data is sent
-to any supported publisher target such as gnocchi or panko. These services
-persist the data in configured databases.
+Additionally, the notification agent is responsible for sending the data to
+any supported publisher target such as gnocchi or panko. These services
+persist the data in configured databases.
The different OpenStack services emit several notifications about the
various types of events that happen in the system during normal
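As a hedged sketch of the `pipelines` option mentioned above (the pipeline
names ``meter`` and ``event`` are assumed defaults, not confirmed by this
diff):

.. code-block:: ini

   [notification]
   # Build only the sample (meter) pipeline; the event pipeline is
   # left out. Pipeline names are assumed defaults.
   pipelines = meter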
diff --git a/doc/source/admin/telemetry-data-pipelines.rst b/doc/source/admin/telemetry-data-pipelines.rst
index ebcac0c5..db6b751b 100644
--- a/doc/source/admin/telemetry-data-pipelines.rst
+++ b/doc/source/admin/telemetry-data-pipelines.rst
@@ -6,7 +6,7 @@ Data processing and pipelines
The mechanism by which data is processed is called a pipeline. Pipelines,
at the configuration level, describe a coupling between sources of data and
-the corresponding sinks for transformation and publication of data. This
+the corresponding sinks for publication of data. This
functionality is handled by the notification agents.
A source is a producer of data: ``samples`` or ``events``. In effect, it is a
@@ -17,13 +17,9 @@ Each source configuration encapsulates name matching and mapping
to one or more sinks for publication.
A sink, on the other hand, is a consumer of data, providing logic for
-the transformation and publication of data emitted from related sources.
+the publication of data emitted from related sources.
-In effect, a sink describes a chain of handlers. The chain starts with
-zero or more transformers and ends with one or more publishers. The
-first transformer in the chain is passed data from the corresponding
-source, takes some action such as deriving rate of change, performing
-unit conversion, or aggregating, before publishing_.
+In effect, a sink describes a list of one or more publishers.
.. _telemetry-pipeline-configuration:
@@ -52,7 +48,6 @@ The meter pipeline definition looks like:
- 'sink name'
sinks:
- name: 'sink name'
- transformers: 'definition of transformers'
publishers:
- 'list of publishers'
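Filled in, a minimal sketch of such a definition without transformers
(source/sink names and the publisher are illustrative placeholders):

.. code-block:: yaml

   ---
   sources:
       - name: meter_source
         meters:
             - "*"
         sinks:
             - meter_sink
   sinks:
       - name: meter_sink
         publishers:
             - gnocchi://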
@@ -97,30 +92,8 @@ The above definition methods can be used in the following combinations:
same pipeline. Wildcard and included meters cannot co-exist in the
same pipeline definition section.
-The transformers section of a pipeline sink provides the possibility to
-add a list of transformer definitions. The available transformers are:
-
-.. list-table::
- :widths: 50 50
- :header-rows: 1
-
- * - Name of transformer
- - Reference name for configuration
- * - Accumulator
- - accumulator
- * - Aggregator
- - aggregator
- * - Arithmetic
- - arithmetic
- * - Rate of change
- - rate\_of\_change
- * - Unit conversion
- - unit\_conversion
- * - Delta
- - delta
-
The publishers section contains the list of publishers, where the
-samples data should be sent after the possible transformations.
+sample data should be sent.
Similarly, the event pipeline definition looks like:
@@ -140,229 +113,6 @@ Similarly, the event pipeline definition looks like:
The event filter uses the same filtering logic as the meter pipeline.
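As a hedged sketch of that event pipeline definition (names and the publisher
are illustrative placeholders following the meter pipeline shape above):

.. code-block:: yaml

   ---
   sources:
       - name: event_source
         events:
             - "*"
         sinks:
             - event_sink
   sinks:
       - name: event_sink
         publishers:
             - panko://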
-.. _telemetry-transformers:
-
-Transformers
-------------
-
-.. note::
-
- Transformers maintain data in memory and therefore do not guarantee
- durability in certain scenarios. A more durable and efficient solution
- may be achieved post-storage using solutions like Gnocchi.
-
-The definition of transformers can contain the following fields:
-
-name
- Name of the transformer.
-
-parameters
- Parameters of the transformer.
-
-The parameters section can contain transformer specific fields, like
-source and target fields with different subfields in case of the rate of
-change, which depends on the implementation of the transformer.
-
-The following are supported transformers:
-
-Rate of change transformer
-``````````````````````````
-Transformer that computes the change in value between two data points in time.
-In the case of the transformer that creates the ``cpu_util`` meter, the
-definition looks like:
-
-.. code-block:: yaml
-
- transformers:
- - name: "rate_of_change"
- parameters:
- target:
- name: "cpu_util"
- unit: "%"
- type: "gauge"
- scale: "100.0 / (10**9 * (resource_metadata.cpu_number or 1))"
-
-The rate of change transformer generates the ``cpu_util`` meter
-from the sample values of the ``cpu`` counter, which represents
-cumulative CPU time in nanoseconds. The transformer definition above
-defines a scale factor (for nanoseconds and multiple CPUs), which is
-applied before the transformation derives a sequence of gauge samples
-with unit ``%``, from sequential values of the ``cpu`` meter.
-
-The definition for the disk I/O rate, which is also generated by the
-rate of change transformer:
-
-.. code-block:: yaml
-
- transformers:
- - name: "rate_of_change"
- parameters:
- source:
- map_from:
- name: "disk\\.(read|write)\\.(bytes|requests)"
- unit: "(B|request)"
- target:
- map_to:
- name: "disk.\\1.\\2.rate"
- unit: "\\1/s"
- type: "gauge"
-
-Unit conversion transformer
-```````````````````````````
-
-Transformer to apply a unit conversion. It takes the volume of the meter
-and multiplies it with the given ``scale`` expression. Also supports
-``map_from`` and ``map_to`` like the rate of change transformer.
-
-Sample configuration:
-
-.. code-block:: yaml
-
- transformers:
- - name: "unit_conversion"
- parameters:
- target:
- name: "disk.kilobytes"
- unit: "KB"
- scale: "volume * 1.0 / 1024.0"
-
-With ``map_from`` and ``map_to``:
-
-.. code-block:: yaml
-
- transformers:
- - name: "unit_conversion"
- parameters:
- source:
- map_from:
- name: "disk\\.(read|write)\\.bytes"
- target:
- map_to:
- name: "disk.\\1.kilobytes"
- scale: "volume * 1.0 / 1024.0"
- unit: "KB"
-
-Aggregator transformer
-``````````````````````
-
-A transformer that sums up the incoming samples until enough samples
-have come in or a timeout has been reached.
-
-Timeout can be specified with the ``retention_time`` option. If you want
-to flush the aggregation, after a set number of samples have been
-aggregated, specify the size parameter.
-
-The volume of the created sample is the sum of the volumes of samples
-that came into the transformer. Samples can be aggregated by the
-attributes ``project_id``, ``user_id`` and ``resource_metadata``. To aggregate
-by the chosen attributes, specify them in the configuration and set which
-value of the attribute to take for the new sample (first to take the
-first sample's attribute, last to take the last sample's attribute, and
-drop to discard the attribute).
-
-To aggregate 60s worth of samples by ``resource_metadata`` and keep the
-``resource_metadata`` of the latest received sample:
-
-.. code-block:: yaml
-
- transformers:
- - name: "aggregator"
- parameters:
- retention_time: 60
- resource_metadata: last
-
-To aggregate each 15 samples by ``user_id`` and ``resource_metadata`` and keep
-the ``user_id`` of the first received sample and drop the
-``resource_metadata``:
-
-.. code-block:: yaml
-
- transformers:
- - name: "aggregator"
- parameters:
- size: 15
- user_id: first
- resource_metadata: drop
-
-Accumulator transformer
-```````````````````````
-
-This transformer simply caches the samples until enough samples have
-arrived and then flushes them all down the pipeline at once:
-
-.. code-block:: yaml
-
- transformers:
- - name: "accumulator"
- parameters:
- size: 15
-
-Multi meter arithmetic transformer
-``````````````````````````````````
-
-This transformer enables us to perform arithmetic calculations over one
-or more meters and/or their metadata, for example::
-
- memory_util = 100 * memory.usage / memory
-
-A new sample is created with the properties described in the ``target``
-section of the transformer's configuration. The sample's
-volume is the result of the provided expression. The calculation is
-performed on samples from the same resource.
-
-.. note::
-
- The calculation is limited to meters with the same interval.
-
-Example configuration:
-
-.. code-block:: yaml
-
- transformers:
- - name: "arithmetic"
- parameters:
- target:
- name: "memory_util"
- unit: "%"
- type: "gauge"
- expr: "100 * $(memory.usage) / $(memory)"
-
-To demonstrate the use of metadata, the following implementation of a
-novel meter shows average CPU time per core:
-
-.. code-block:: yaml
-
- transformers:
- - name: "arithmetic"
- parameters:
- target:
- name: "avg_cpu_per_core"
- unit: "ns"
- type: "cumulative"
- expr: "$(cpu) / ($(cpu).resource_metadata.cpu_number or 1)"
-
-.. note::
-
- Expression evaluation gracefully handles NaNs and exceptions. In
- such a case it does not create a new sample but only logs a warning.
-
-Delta transformer
-`````````````````
-
-This transformer calculates the change between two sample datapoints of a
-resource. It can be configured to capture only the positive growth deltas.
-
-Example configuration:
-
-.. code-block:: yaml
-
- transformers:
- - name: "delta"
- parameters:
- target:
- name: "cpu.delta"
- growth_only: True
-
.. _publishing:
Publishers
@@ -510,33 +260,3 @@ specified. A sample ``publishers`` section in the
- panko://
- udp://10.0.0.2:1234
- notifier://?policy=drop&max_queue_length=512&topic=custom_target
-
-Pipeline Partitioning
-~~~~~~~~~~~~~~~~~~~~~
-
-.. note::
-
- Partitioning is only required if pipelines contain transformations. It has
- secondary benefit of supporting batching in certain publishers.
-
-On large workloads, multiple notification agents can be deployed to handle the
-flood of incoming messages from monitored services. If transformations are
-enabled in the pipeline, the notification agents must be coordinated to ensure
-related messages are routed to the same agent. To enable coordination, set the
-``workload_partitioning`` value in ``notification`` section.
-
-To distribute messages across agents, ``pipeline_processing_queues`` option
-should be set. This value defines how many pipeline queues to create which will
-then be distributed to the active notification agents. It is recommended that
-the number of processing queues, at the very least, match the number of agents.
-
-Increasing the number of processing queues will improve the distribution of
-messages across the agents. It will also help batching which minimises the
-requests to Gnocchi storage backend. It will also increase the load the on
-message queue as it uses the queue to shard data.
-
-.. warning::
-
- Decreasing the number of processing queues may result in lost data as any
- previously created queues may no longer be assigned to active agents. It
- is only recommended that you **increase** processing queues.
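For reference, the now-deprecated options described in the removed section
above, as an illustrative sketch (option names are taken from that text;
values are examples only):

.. code-block:: ini

   [notification]
   # Deprecated: coordinate notification agents so related samples are
   # routed to the same agent (only needed when transformers are used).
   workload_partitioning = True
   # Deprecated: number of pipeline queues sharded across the active
   # agents; should at least match the agent count. Example value only.
   pipeline_processing_queues = 16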
diff --git a/doc/source/admin/telemetry-measurements.rst b/doc/source/admin/telemetry-measurements.rst
index dbd4148c..f1d27cec 100644
--- a/doc/source/admin/telemetry-measurements.rst
+++ b/doc/source/admin/telemetry-measurements.rst
@@ -356,12 +356,11 @@ The following meters are collected for OpenStack Compute.
To enable libvirt ``disk.*`` support when running on RBD-backed shared
storage, you need to install libvirt version 1.2.16+.
-The Telemetry service supports creating new meters by using
-transformers. For more details about transformers see
-:ref:`telemetry-transformers`. Among the meters gathered from libvirt and
-Hyper-V, there are a few which are derived from other meters. The list of
-meters that are created by using the ``rate_of_change`` transformer from the
-above table is the following:
+The Telemetry service supports creating new meters by using transformers, but
+this is deprecated and its use is discouraged. Among the meters gathered from
+libvirt and Hyper-V, there are a few which are derived from other meters. The
+list of meters that are created by using the ``rate_of_change`` transformer
+from the above table is the following:
- cpu_util
diff --git a/doc/source/contributor/3-Pipeline.png b/doc/source/contributor/3-Pipeline.png
index 5948d1c4..01f82436 100644
--- a/doc/source/contributor/3-Pipeline.png
+++ b/doc/source/contributor/3-Pipeline.png
Binary files differ
diff --git a/doc/source/contributor/4-Transformer.png b/doc/source/contributor/4-Transformer.png
deleted file mode 100644
index 4aa24059..00000000
--- a/doc/source/contributor/4-Transformer.png
+++ /dev/null
Binary files differ
diff --git a/doc/source/contributor/architecture.rst b/doc/source/contributor/architecture.rst
index 24920867..7446c3d3 100644
--- a/doc/source/contributor/architecture.rst
+++ b/doc/source/contributor/architecture.rst
@@ -154,27 +154,6 @@ Ceilometer offers the ability to take data gathered by the agents, manipulate
it, and publish it in various combinations via multiple pipelines. This
functionality is handled by the notification agents.
-Transforming the data
----------------------
-
-.. figure:: ./4-Transformer.png
- :width: 100%
- :align: center
- :alt: Transformer example
-
- Example of aggregation of multiple cpu time usage samples in a single
- cpu percentage sample.
-
-The data gathered from the polling and notifications agents contains a wealth
-of data and if combined with historical or temporal context, can be used to
-derive even more data. Ceilometer offers various transformers which can be used
-to manipulate data in the pipeline.
-
-.. note::
-
- The equivalent functionality can be handled more stably by storage
- drivers such as Gnocchi.
-
Publishing the data
-------------------