author | Vladimir Mihailenco <vladimir.webdev@gmail.com> | 2022-12-14 14:41:33 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2022-12-14 14:41:33 +0200 |
commit | 74c251a19d211b3dcf555adc4a68bdc6f097f763 (patch) | |
tree | 07ff1f3d86556288d14306f7f7f48ff0e1849da5 | |
parent | 022a3e8314daa59b31fdce1d32e0e74d77f564cc (diff) | |
Add OpenTelemetry example with Uptrace backend (#2452)
* chore: add opentelemetry example
* chore: add opentelemetry API Jupyter notebook
* chore: use a shorter title
* chore: cleanup
Co-authored-by: dvora-h <67596500+dvora-h@users.noreply.github.com>
18 files changed, 1248 insertions, 2 deletions
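Note on the DSN used in this commit: the same string, `http://project2_secret_token@localhost:14317/2`, appears in `main.py`, `config/otel-collector.yaml`, and `config/vector.toml`. The sketch below is a minimal illustration, assuming the DSN follows an ordinary URL layout with the project token as the user part and the project id as the path. `split_dsn` is a hypothetical helper written for this note; it is not part of the `uptrace` package and nothing in the commit calls it.

```python
from urllib.parse import urlsplit

# DSN string copied verbatim from the example files in this commit.
DSN = "http://project2_secret_token@localhost:14317/2"


def split_dsn(dsn):
    """Split a DSN into its parts (hypothetical helper, assumes a plain URL layout)."""
    parts = urlsplit(dsn)
    return {
        "scheme": parts.scheme,
        "token": parts.username,  # user part of the URL
        "host": parts.hostname,
        "port": parts.port,
        "project_id": int(parts.path.lstrip("/")),  # path segment after the host
    }


if __name__ == "__main__":
    print(split_dsn(DSN))
```

Read this way, the token and project id line up with the `projects` entry that has `id: 2` in `uptrace.yml`, and port 14317 matches `listen.grpc.addr` in the same file.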
diff --git a/docs/conf.py b/docs/conf.py index a265e5c..cdbeb02 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -162,7 +162,7 @@ html_logo = "_static/redis-cube-red-white-rgb.svg" # Add any paths that contain custom static files (such as style sheets) here, # relative to this directory. They are copied after the builtin static files, # so a file named "default.css" will overwrite the builtin "default.css". -html_static_path = ["_static"] +html_static_path = ["_static", "images"] # If not '', a 'Last updated on:' timestamp is inserted at every page bottom, # using the given strftime format. diff --git a/docs/examples.rst b/docs/examples.rst index 3fed8b4..47fdbdf 100644 --- a/docs/examples.rst +++ b/docs/examples.rst @@ -13,4 +13,5 @@ Examples examples/search_vector_similarity_examples examples/pipeline_examples examples/timeseries_examples - examples/redis-stream-example.ipynb + examples/redis-stream-example + examples/opentelemetry_api_examples diff --git a/docs/examples/opentelemetry/README.md b/docs/examples/opentelemetry/README.md new file mode 100644 index 0000000..a1d1c04 --- /dev/null +++ b/docs/examples/opentelemetry/README.md @@ -0,0 +1,47 @@ +# Example for redis-py OpenTelemetry instrumentation + +This example demonstrates how to monitor Redis using [OpenTelemetry](https://opentelemetry.io/) and +[Uptrace](https://github.com/uptrace/uptrace). It requires Docker to start Redis Server and Uptrace. + +See +[Monitoring redis-py performance with OpenTelemetry](https://redis-py.readthedocs.io/en/latest/opentelemetry.html) +for details. + +**Step 1**. Download the example using Git: + +```shell +git clone https://github.com/redis/redis-py.git +cd example/opentelemetry +``` + +**Step 2**. Optionally, create a virtualenv: + +```shell +python3 -m venv .venv +source .venv/bin/active +``` + +**Step 3**. Install dependencies: + +```shell +pip install -r requirements.txt +``` + +**Step 4**. 
Start the services using Docker and make sure Uptrace is running: + +```shell +docker-compose up -d +docker-compose logs uptrace +``` + +**Step 5**. Run the Redis client example and follow the link from the CLI to view the trace: + +```shell +python3 main.py +trace: http://localhost:14318/traces/ee029d8782242c8ed38b16d961093b35 +``` + +![Redis trace](./image/redis-py-trace.png) + +You can also open Uptrace UI at [http://localhost:14318](http://localhost:14318) to view available +spans, logs, and metrics. diff --git a/docs/examples/opentelemetry/config/alertmanager.yml b/docs/examples/opentelemetry/config/alertmanager.yml new file mode 100644 index 0000000..ac3e340 --- /dev/null +++ b/docs/examples/opentelemetry/config/alertmanager.yml @@ -0,0 +1,53 @@ +# See https://prometheus.io/docs/alerting/latest/configuration/ for details. + +global: + # The smarthost and SMTP sender used for mail notifications. + smtp_smarthost: "mailhog:1025" + smtp_from: "alertmanager@example.com" + smtp_require_tls: false + +receivers: + - name: "team-X" + email_configs: + - to: "some-receiver@example.com" + send_resolved: true + +# The root route on which each incoming alert enters. +route: + # The labels by which incoming alerts are grouped together. For example, + # multiple alerts coming in for cluster=A and alertname=LatencyHigh would + # be batched into a single group. + group_by: ["alertname", "cluster", "service"] + + # When a new group of alerts is created by an incoming alert, wait at + # least 'group_wait' to send the initial notification. + # This way ensures that you get multiple alerts for the same group that start + # firing shortly after another are batched together on the first + # notification. + group_wait: 30s + + # When the first notification was sent, wait 'group_interval' to send a batch + # of new alerts that started firing for that group. + group_interval: 5m + + # If an alert has successfully been sent, wait 'repeat_interval' to + # resend them. 
+ repeat_interval: 3h + + # A default receiver + receiver: team-X + + # All the above attributes are inherited by all child routes and can + # overwritten on each. + + # The child route trees. + routes: + # This route matches error alerts created from spans or logs. + - matchers: + - alert_kind="error" + group_interval: 24h + receiver: team-X + +# The directory from which notification templates are read. +templates: + - "/etc/alertmanager/template/*.tmpl" diff --git a/docs/examples/opentelemetry/config/otel-collector.yaml b/docs/examples/opentelemetry/config/otel-collector.yaml new file mode 100644 index 0000000..b44dd1f --- /dev/null +++ b/docs/examples/opentelemetry/config/otel-collector.yaml @@ -0,0 +1,68 @@ +extensions: + health_check: + pprof: + endpoint: 0.0.0.0:1777 + zpages: + endpoint: 0.0.0.0:55679 + +receivers: + otlp: + protocols: + grpc: + http: + hostmetrics: + collection_interval: 10s + scrapers: + cpu: + disk: + load: + filesystem: + memory: + network: + paging: + redis: + endpoint: "redis-server:6379" + collection_interval: 10s + jaeger: + protocols: + grpc: + +processors: + resourcedetection: + detectors: ["system"] + batch: + send_batch_size: 10000 + timeout: 10s + +exporters: + logging: + logLevel: debug + otlp: + endpoint: uptrace:14317 + tls: + insecure: true + headers: { "uptrace-dsn": "http://project2_secret_token@localhost:14317/2" } + +service: + # telemetry: + # logs: + # level: DEBUG + pipelines: + traces: + receivers: [otlp, jaeger] + processors: [batch] + exporters: [otlp, logging] + metrics: + receivers: [otlp] + processors: [batch] + exporters: [otlp] + metrics/hostmetrics: + receivers: [hostmetrics, redis] + processors: [batch, resourcedetection] + exporters: [otlp] + logs: + receivers: [otlp] + processors: [batch] + exporters: [otlp] + + extensions: [health_check, pprof, zpages] diff --git a/docs/examples/opentelemetry/config/vector.toml b/docs/examples/opentelemetry/config/vector.toml new file mode 100644 index 0000000..10db91d 
--- /dev/null +++ b/docs/examples/opentelemetry/config/vector.toml @@ -0,0 +1,39 @@ +[sources.syslog_logs] +type = "demo_logs" +format = "syslog" +interval = 0.1 + +[sources.apache_common_logs] +type = "demo_logs" +format = "apache_common" +interval = 0.1 + +[sources.apache_error_logs] +type = "demo_logs" +format = "apache_error" +interval = 0.1 + +[sources.json_logs] +type = "demo_logs" +format = "json" +interval = 0.1 + +# Parse Syslog logs +# See the Vector Remap Language reference for more info: https://vrl.dev +[transforms.parse_logs] +type = "remap" +inputs = ["syslog_logs"] +source = ''' +. = parse_syslog!(string!(.message)) +''' + +# Export data to Uptrace. +[sinks.uptrace] +type = "http" +inputs = ["parse_logs", "apache_common_logs", "apache_error_logs", "json_logs"] +encoding.codec = "json" +framing.method = "newline_delimited" +compression = "gzip" +uri = "http://uptrace:14318/api/v1/vector/logs" +#uri = "https://api.uptrace.dev/api/v1/vector/logs" +headers.uptrace-dsn = "http://project2_secret_token@localhost:14317/2" diff --git a/docs/examples/opentelemetry/docker-compose.yml b/docs/examples/opentelemetry/docker-compose.yml new file mode 100644 index 0000000..ea1d6dc --- /dev/null +++ b/docs/examples/opentelemetry/docker-compose.yml @@ -0,0 +1,81 @@ +version: "3" + +services: + clickhouse: + image: clickhouse/clickhouse-server:22.7 + restart: on-failure + environment: + CLICKHOUSE_DB: uptrace + healthcheck: + test: ["CMD", "wget", "--spider", "-q", "localhost:8123/ping"] + interval: 1s + timeout: 1s + retries: 30 + volumes: + - ch_data:/var/lib/clickhouse + ports: + - "8123:8123" + - "9000:9000" + + uptrace: + image: "uptrace/uptrace:1.2.0" + #image: 'uptrace/uptrace-dev:latest' + restart: on-failure + volumes: + - uptrace_data:/var/lib/uptrace + - ./uptrace.yml:/etc/uptrace/uptrace.yml + #environment: + # - DEBUG=2 + ports: + - "14317:14317" + - "14318:14318" + depends_on: + clickhouse: + condition: service_healthy + + otel-collector: + image: 
otel/opentelemetry-collector-contrib:0.58.0 + restart: on-failure + volumes: + - ./config/otel-collector.yaml:/etc/otelcol-contrib/config.yaml + ports: + - "4317:4317" + - "4318:4318" + + vector: + image: timberio/vector:0.24.X-alpine + volumes: + - ./config/vector.toml:/etc/vector/vector.toml:ro + + alertmanager: + image: prom/alertmanager:v0.24.0 + restart: on-failure + volumes: + - ./config/alertmanager.yml:/etc/alertmanager/config.yml + - alertmanager_data:/alertmanager + ports: + - 9093:9093 + command: + - "--config.file=/etc/alertmanager/config.yml" + - "--storage.path=/alertmanager" + + mailhog: + image: mailhog/mailhog:v1.0.1 + restart: on-failure + ports: + - "8025:8025" + + redis-server: + image: redis + ports: + - "6379:6379" + redis-cli: + image: redis + +volumes: + uptrace_data: + driver: local + ch_data: + driver: local + alertmanager_data: + driver: local diff --git a/docs/examples/opentelemetry/image/redis-py-trace.png b/docs/examples/opentelemetry/image/redis-py-trace.png Binary files differnew file mode 100644 index 0000000..e443238 --- /dev/null +++ b/docs/examples/opentelemetry/image/redis-py-trace.png diff --git a/docs/examples/opentelemetry/main.py b/docs/examples/opentelemetry/main.py new file mode 100755 index 0000000..b140dd0 --- /dev/null +++ b/docs/examples/opentelemetry/main.py @@ -0,0 +1,56 @@ +#!/usr/bin/env python3 + +import time + +import uptrace +from opentelemetry import trace +from opentelemetry.instrumentation.redis import RedisInstrumentor + +import redis + +tracer = trace.get_tracer("app_or_package_name", "1.0.0") + + +def main(): + uptrace.configure_opentelemetry( + dsn="http://project2_secret_token@localhost:14317/2", + service_name="myservice", + service_version="1.0.0", + ) + RedisInstrumentor().instrument() + + client = redis.StrictRedis(host="localhost", port=6379) + + span = handle_request(client) + print("trace:", uptrace.trace_url(span)) + + for i in range(10000): + handle_request(client) + time.sleep(1) + + +def 
handle_request(client): + with tracer.start_as_current_span( + "handle-request", kind=trace.SpanKind.CLIENT + ) as span: + client.get("my-key") + client.set("hello", "world") + client.mset( + { + "employee_name": "Adam Adams", + "employee_age": 30, + "position": "Software Engineer", + } + ) + + pipe = client.pipeline() + pipe.set("foo", 5) + pipe.set("bar", 18.5) + pipe.set("blee", "hello world!") + pipe.execute() + + return span + + +if __name__ == "__main__": + main() diff --git a/docs/examples/opentelemetry/requirements.txt b/docs/examples/opentelemetry/requirements.txt new file mode 100644 index 0000000..2132801 --- /dev/null +++ b/docs/examples/opentelemetry/requirements.txt @@ -0,0 +1,3 @@ +redis==4.3.4 +uptrace==1.14.0 +opentelemetry-instrumentation-redis==0.35b0 diff --git a/docs/examples/opentelemetry/uptrace.yml b/docs/examples/opentelemetry/uptrace.yml new file mode 100644 index 0000000..4cb39f8 --- /dev/null +++ b/docs/examples/opentelemetry/uptrace.yml @@ -0,0 +1,297 @@ +## +## Uptrace configuration file. +## See https://uptrace.dev/get/config.html for details. +## +## You can use environment variables anywhere in this file, for example: +## +## foo: $FOO +## bar: ${BAR} +## baz: ${BAZ:default} +## +## To escape `$`, use `$$`, for example: +## +## foo: $$FOO_BAR +## + +## +## ClickHouse database credentials. +## +ch: + # Connection string for ClickHouse database. For example: + # clickhouse://<user>:<password>@<host>:<port>/<database>?sslmode=disable + # + # See https://clickhouse.uptrace.dev/guide/golang-clickhouse.html#options + dsn: "clickhouse://default:@clickhouse:9000/uptrace?sslmode=disable" + +## +## A list of pre-configured projects. Each project is fully isolated. +## +projects: + # Conventionally, the first project is used to monitor Uptrace itself. + - id: 1 + name: Uptrace + # Token grants write access to the project. Keep a secret. 
+ token: project1_secret_token + pinned_attrs: + - service.name + - host.name + - deployment.environment + # Group spans by deployment.environment attribute. + group_by_env: false + # Group funcs spans by service.name attribute. + group_funcs_by_service: false + + # Other projects can be used to monitor your applications. + # To monitor micro-services or multiple related services, use a single project. + - id: 2 + name: My project + token: project2_secret_token + pinned_attrs: + - service.name + - host.name + - deployment.environment + # Group spans by deployment.environment attribute. + group_by_env: false + # Group funcs spans by service.name attribute. + group_funcs_by_service: false + +## +## Create metrics from spans and events. +## +metrics_from_spans: + - name: uptrace.tracing.spans_duration + description: Spans duration (excluding events) + instrument: histogram + unit: microseconds + value: span.duration / 1000 + attrs: + - span.system as system + - service.name as service + - host.name as host + - span.status_code as status + where: not span.is_event + + - name: uptrace.tracing.spans + description: Spans count (excluding events) + instrument: counter + unit: 1 + value: span.count + attrs: + - span.system as system + - service.name as service + - host.name as host + - span.status_code as status + where: not span.is_event + + - name: uptrace.tracing.events + description: Events count (excluding spans) + instrument: counter + unit: 1 + value: span.count + attrs: + - span.system as system + - service.name as service + - host.name as host + where: span.is_event + +## +## To require authentication, uncomment the following section. +## +auth: + # users: + # - username: uptrace + # password: uptrace + # - username: admin + # password: admin + + # # Cloudflare user provider: uses Cloudflare Zero Trust Access (Identity) + # # See https://developers.cloudflare.com/cloudflare-one/identity/ for more info. 
+ # cloudflare: + # # The base URL of the Cloudflare Zero Trust team. + # - team_url: https://myteam.cloudflareaccess.com + # # The Application Audience (AUD) Tag for this application. + # # You can retrieve this from the Cloudflare Zero Trust 'Access' Dashboard. + # audience: bea6df23b944e4a0cd178609ba1bb64dc98dfe1f66ae7b918e563f6cf28b37e0 + + # # OpenID Connect (Single Sign-On) + # oidc: + # # The ID is used in API endpoints, for example, in redirect URL + # # `http://<uptrace-host>/api/v1/sso/<oidc-id>/callback`. + # - id: keycloak + # # Display name for the button in the login form. + # # Default to 'OpenID Connect' + # display_name: Keycloak + # # The base URL for the OIDC provider. + # issuer_url: http://localhost:8080/realms/uptrace + # # The OAuth 2.0 Client ID + # client_id: uptrace + # # The OAuth 2.0 Client Secret + # client_secret: ogbhd8Q0X0e5AZFGSG3m9oirPvnetqkA + # # Additional OAuth 2.0 scopes to request from the OIDC provider. + # # Defaults to 'profile'. 'openid' is requested by default and need not be specified. + # scopes: + # - profile + # # The OIDC UserInfo claim to use as the user's username. + # # Defaults to 'preferred_username'. + # claim: preferred_username + +## +## Alerting rules for monitoring metrics. +## +## See https://uptrace.dev/get/alerting.html for details. 
+## +alerting: + rules: + - name: Network errors + metrics: + - system.network.errors as $net_errors + query: + - $net_errors > 0 group by host.name + # for the last 5 minutes + for: 5m + annotations: + summary: "{{ $labels.host_name }} has high number of net errors: {{ $values.net_errors }}" + + - name: Filesystem usage >= 90% + metrics: + - system.filesystem.usage as $fs_usage + query: + - group by host.name + - group by device + - where device !~ "loop" + - $fs_usage{state="used"} / $fs_usage >= 0.9 + for: 5m + annotations: + summary: "{{ $labels.host_name }} has high FS usage: {{ $values.fs_usage }}" + + - name: Uptrace is dropping spans + metrics: + - uptrace.projects.spans as $spans + query: + - $spans{type=dropped} > 0 + for: 1m + annotations: + summary: "Uptrace has dropped {{ $values.spans }} spans" + + - name: Always firing (for fun and testing) + metrics: + - process.runtime.go.goroutines as $goroutines + query: + - $goroutines >= 0 group by host.name + for: 1m + annotations: + summary: "{{ $labels.host_name }} has high number of goroutines: {{ $values.goroutines }}" + + # Create alerts from error logs and span events. + create_alerts_from_spans: + enabled: true + labels: + alert_kind: error + +## +## AlertManager client configuration. +## See https://uptrace.dev/get/alerting.html for details. +## +## Note that this is NOT an AlertManager config and you need to configure AlertManager separately. +## See https://prometheus.io/docs/alerting/latest/configuration/ for details. +## +alertmanager_client: + # AlertManager API endpoints that Uptrace uses to manage alerts. + urls: + - "http://alertmanager:9093/api/v2/alerts" + +## +## Various options to tweak ClickHouse schema. +## For changes to take effect, you need reset the ClickHouse database with `ch reset`. +## +ch_schema: + # Compression codec, for example, LZ4, ZSTD(3), or Default. + compression: ZSTD(3) + + # Whether to use ReplicatedMergeTree instead of MergeTree. 
+ replicated: false + + # Cluster name for Distributed tables and ON CLUSTER clause. + #cluster: uptrace1 + + spans: + storage_policy: "default" + # Delete spans data after 30 days. + ttl_delete: 30 DAY + + metrics: + storage_policy: "default" + # Delete metrics data after 90 days. + ttl_delete: 90 DAY + +## +## Addresses on which Uptrace receives gRPC and HTTP requests. +## +listen: + # OTLP/gRPC API. + grpc: + addr: ":14317" + # tls: + # cert_file: config/tls/uptrace.crt + # key_file: config/tls/uptrace.key + + # OTLP/HTTP API and Uptrace API with UI. + http: + addr: ":14318" + # tls: + # cert_file: config/tls/uptrace.crt + # key_file: config/tls/uptrace.key + +## +## Various options for Uptrace UI. +## +site: + # Overrides public URL for Vue-powered UI in case you put Uptrace behind a proxy. + #addr: 'https://uptrace.mydomain.com' + +## +## Spans processing options. +## +spans: + # The size of the Go chan used to buffer incoming spans. + # If the buffer is full, Uptrace starts to drop spans. + #buffer_size: 100000 + + # The number of spans to insert in a single query. + #batch_size: 10000 + +## +## Metrics processing options. +## +metrics: + # List of attributes to drop for being noisy. + drop_attrs: + - telemetry.sdk.language + - telemetry.sdk.name + - telemetry.sdk.version + + # The size of the Go chan used to buffer incoming measures. + # If the buffer is full, Uptrace starts to drop measures. + #buffer_size: 100000 + + # The number of measures to insert in a single query. + #batch_size: 10000 + +## +## SQLite/PostgreSQL db that is used to store metadata such us metric names, dashboards, alerts, +## and so on. +## +db: + # Either sqlite or postgres. + driver: sqlite + # Database connection string. + # + # Uptrace automatically creates SQLite database file in the current working directory. + # Make sure the directory is writable by Uptrace process. 
+ dsn: "file:uptrace.sqlite3?_pragma=foreign_keys(1)&_pragma=busy_timeout(1000)" + +# Secret key that is used to sign JWT tokens etc. +secret_key: 102c1a557c314fc28198acd017960843 + +# Enable to log HTTP requests and database queries. +debug: false diff --git a/docs/examples/opentelemetry_api_examples.ipynb b/docs/examples/opentelemetry_api_examples.ipynb new file mode 100644 index 0000000..28fe758 --- /dev/null +++ b/docs/examples/opentelemetry_api_examples.ipynb @@ -0,0 +1,423 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "7b02ea52", + "metadata": {}, + "source": [ + "# OpenTelemetry Python API" + ] + }, + { + "cell_type": "markdown", + "id": "56520927", + "metadata": {}, + "source": [ + "## Install OpenTelemetry" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "c0ed8440", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Defaulting to user installation because normal site-packages is not writeable\n", + "Requirement already satisfied: opentelemetry-api in /home/vmihailenco/.local/lib/python3.10/site-packages (1.14.0)\n", + "Requirement already satisfied: opentelemetry-sdk in /home/vmihailenco/.local/lib/python3.10/site-packages (1.14.0)\n", + "Requirement already satisfied: setuptools>=16.0 in /usr/lib/python3/dist-packages (from opentelemetry-api) (59.6.0)\n", + "Requirement already satisfied: deprecated>=1.2.6 in /home/vmihailenco/.local/lib/python3.10/site-packages (from opentelemetry-api) (1.2.13)\n", + "Requirement already satisfied: opentelemetry-semantic-conventions==0.35b0 in /home/vmihailenco/.local/lib/python3.10/site-packages (from opentelemetry-sdk) (0.35b0)\n", + "Requirement already satisfied: typing-extensions>=3.7.4 in /home/vmihailenco/.local/lib/python3.10/site-packages (from opentelemetry-sdk) (4.4.0)\n", + "Requirement already satisfied: wrapt<2,>=1.10 in /home/vmihailenco/.local/lib/python3.10/site-packages (from deprecated>=1.2.6->opentelemetry-api) 
(1.14.1)\n", + "Note: you may need to restart the kernel to use updated packages.\n" + ] + } + ], + "source": [ + "pip install opentelemetry-api opentelemetry-sdk" + ] + }, + { + "cell_type": "markdown", + "id": "861fa9cb", + "metadata": {}, + "source": [ + "### Configure OpenTelemetry with console exporter" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "c061b6cb", + "metadata": {}, + "outputs": [], + "source": [ + "from opentelemetry import trace\n", + "from opentelemetry.sdk.trace import TracerProvider\n", + "from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter\n", + "\n", + "trace.set_tracer_provider(TracerProvider())\n", + "trace.get_tracer_provider().add_span_processor(\n", + " BatchSpanProcessor(ConsoleSpanExporter())\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "ae4a626c", + "metadata": {}, + "source": [ + "### Create a span using the tracer" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "f918501b", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{\n", + " \"name\": \"operation-name\",\n", + " \"context\": {\n", + " \"trace_id\": \"0xff14cec5f33afeca0d04ced2c2185b39\",\n", + " \"span_id\": \"0xd06e73b03bd55b4a\",\n", + " \"trace_state\": \"[]\"\n", + " },\n", + " \"kind\": \"SpanKind.INTERNAL\",\n", + " \"parent_id\": null,\n", + " \"start_time\": \"2022-12-07T13:46:11.050878Z\",\n", + " \"end_time\": \"2022-12-07T13:46:12.051944Z\",\n", + " \"status\": {\n", + " \"status_code\": \"UNSET\"\n", + " },\n", + " \"attributes\": {},\n", + " \"events\": [],\n", + " \"links\": [],\n", + " \"resource\": {\n", + " \"attributes\": {\n", + " \"telemetry.sdk.language\": \"python\",\n", + " \"telemetry.sdk.name\": \"opentelemetry\",\n", + " \"telemetry.sdk.version\": \"1.14.0\",\n", + " \"service.name\": \"unknown_service\"\n", + " },\n", + " \"schema_url\": \"\"\n", + " }\n", + "}\n" + ] + } + ], + "source": [ + "import time\n", + 
"\n", + "tracer = trace.get_tracer(\"app_or_package_name\", \"1.0.0\")\n", + "\n", + "# measure the timing of the operation\n", + "with tracer.start_as_current_span(\"operation-name\") as span:\n", + " time.sleep(1)" + ] + }, + { + "cell_type": "markdown", + "id": "ec4267aa", + "metadata": {}, + "source": [ + "### Record attributes" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "fa9d265f", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{\n", + " \"name\": \"operation-name\",\n", + " \"context\": {\n", + " \"trace_id\": \"0xfc11f0cc7afeefd79134eea639f5c78b\",\n", + " \"span_id\": \"0xee791bf3cab65079\",\n", + " \"trace_state\": \"[]\"\n", + " },\n", + " \"kind\": \"SpanKind.INTERNAL\",\n", + " \"parent_id\": null,\n", + " \"start_time\": \"2022-12-07T13:46:30.886188Z\",\n", + " \"end_time\": \"2022-12-07T13:46:31.887323Z\",\n", + " \"status\": {\n", + " \"status_code\": \"UNSET\"\n", + " },\n", + " \"attributes\": {\n", + " \"enduser.id\": \"jupyter\",\n", + " \"enduser.email\": \"jupyter@redis-py\"\n", + " },\n", + " \"events\": [],\n", + " \"links\": [],\n", + " \"resource\": {\n", + " \"attributes\": {\n", + " \"telemetry.sdk.language\": \"python\",\n", + " \"telemetry.sdk.name\": \"opentelemetry\",\n", + " \"telemetry.sdk.version\": \"1.14.0\",\n", + " \"service.name\": \"unknown_service\"\n", + " },\n", + " \"schema_url\": \"\"\n", + " }\n", + "}\n" + ] + } + ], + "source": [ + "with tracer.start_as_current_span(\"operation-name\") as span:\n", + " if span.is_recording():\n", + " span.set_attribute(\"enduser.id\", \"jupyter\")\n", + " span.set_attribute(\"enduser.email\", \"jupyter@redis-py\")\n", + " time.sleep(1)" + ] + }, + { + "cell_type": "markdown", + "id": "e40655de", + "metadata": {}, + "source": [ + "### Change the span kind" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "af2980ac", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": 
"stream", + "text": [ + "{\n", + " \"name\": \"operation-name\",\n", + " \"context\": {\n", + " \"trace_id\": \"0x2b4d1ba36423e6c17067079f044b5b62\",\n", + " \"span_id\": \"0x323d6107cfe594bd\",\n", + " \"trace_state\": \"[]\"\n", + " },\n", + " \"kind\": \"SpanKind.SERVER\",\n", + " \"parent_id\": null,\n", + " \"start_time\": \"2022-12-07T13:53:20.538393Z\",\n", + " \"end_time\": \"2022-12-07T13:53:20.638595Z\",\n", + " \"status\": {\n", + " \"status_code\": \"UNSET\"\n", + " },\n", + " \"attributes\": {},\n", + " \"events\": [],\n", + " \"links\": [],\n", + " \"resource\": {\n", + " \"attributes\": {\n", + " \"telemetry.sdk.language\": \"python\",\n", + " \"telemetry.sdk.name\": \"opentelemetry\",\n", + " \"telemetry.sdk.version\": \"1.14.0\",\n", + " \"service.name\": \"unknown_service\"\n", + " },\n", + " \"schema_url\": \"\"\n", + " }\n", + "}\n" + ] + } + ], + "source": [ + "with tracer.start_as_current_span(\"operation-name\", kind=trace.SpanKind.SERVER) as span:\n", + " time.sleep(0.1)" + ] + }, + { + "cell_type": "markdown", + "id": "2a9f1d99", + "metadata": {}, + "source": [ + "### Exceptions are automatically recorded" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "1b453d66", + "metadata": {}, + "outputs": [ + { + "ename": "ValueError", + "evalue": "", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", + "Cell \u001b[0;32mIn[6], line 3\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[38;5;28;01mwith\u001b[39;00m tracer\u001b[38;5;241m.\u001b[39mstart_as_current_span(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124moperation-name\u001b[39m\u001b[38;5;124m\"\u001b[39m, kind\u001b[38;5;241m=\u001b[39mtrace\u001b[38;5;241m.\u001b[39mSpanKind\u001b[38;5;241m.\u001b[39mSERVER) \u001b[38;5;28;01mas\u001b[39;00m span:\n\u001b[1;32m 2\u001b[0m 
time\u001b[38;5;241m.\u001b[39msleep(\u001b[38;5;241m0.1\u001b[39m)\n\u001b[0;32m----> 3\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mValueError\u001b[39;00m\n", + "\u001b[0;31mValueError\u001b[0m: " + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{\n", + " \"name\": \"operation-name\",\n", + " \"context\": {\n", + " \"trace_id\": \"0x20457d98d4456b99810163027c7899de\",\n", + " \"span_id\": \"0xf16e4c1620091c72\",\n", + " \"trace_state\": \"[]\"\n", + " },\n", + " \"kind\": \"SpanKind.SERVER\",\n", + " \"parent_id\": null,\n", + " \"start_time\": \"2022-12-07T13:55:24.108227Z\",\n", + " \"end_time\": \"2022-12-07T13:55:24.208771Z\",\n", + " \"status\": {\n", + " \"status_code\": \"ERROR\",\n", + " \"description\": \"ValueError: \"\n", + " },\n", + " \"attributes\": {},\n", + " \"events\": [\n", + " {\n", + " \"name\": \"exception\",\n", + " \"timestamp\": \"2022-12-07T13:55:24.208730Z\",\n", + " \"attributes\": {\n", + " \"exception.type\": \"ValueError\",\n", + " \"exception.message\": \"\",\n", + " \"exception.stacktrace\": \"Traceback (most recent call last):\\n File \\\"/home/vmihailenco/.local/lib/python3.10/site-packages/opentelemetry/trace/__init__.py\\\", line 573, in use_span\\n yield span\\n File \\\"/home/vmihailenco/.local/lib/python3.10/site-packages/opentelemetry/sdk/trace/__init__.py\\\", line 1033, in start_as_current_span\\n yield span_context\\n File \\\"/tmp/ipykernel_241440/2787006841.py\\\", line 3, in <module>\\n raise ValueError\\nValueError\\n\",\n", + " \"exception.escaped\": \"False\"\n", + " }\n", + " }\n", + " ],\n", + " \"links\": [],\n", + " \"resource\": {\n", + " \"attributes\": {\n", + " \"telemetry.sdk.language\": \"python\",\n", + " \"telemetry.sdk.name\": \"opentelemetry\",\n", + " \"telemetry.sdk.version\": \"1.14.0\",\n", + " \"service.name\": \"unknown_service\"\n", + " },\n", + " \"schema_url\": \"\"\n", + " }\n", + "}\n" + ] + } + ], + "source": [ + "with 
tracer.start_as_current_span(\"operation-name\", kind=trace.SpanKind.SERVER) as span:\n", + " time.sleep(0.1)\n", + " raise ValueError" + ] + }, + { + "cell_type": "markdown", + "id": "23708329", + "metadata": {}, + "source": [ + "### Use nested blocks to create child spans" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "9eb261d7", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{\n", + " \"name\": \"child-span\",\n", + " \"context\": {\n", + " \"trace_id\": \"0x5625fbd0a1be15b49cda0d2bb236d158\",\n", + " \"span_id\": \"0xc13b2c102566ffaf\",\n", + " \"trace_state\": \"[]\"\n", + " },\n", + " \"kind\": \"SpanKind.INTERNAL\",\n", + " \"parent_id\": \"0xa5f1a9afdf26173c\",\n", + " \"start_time\": \"2022-12-07T13:57:14.011221Z\",\n", + " \"end_time\": \"2022-12-07T13:57:14.011279Z\",\n", + " \"status\": {\n", + " \"status_code\": \"UNSET\"\n", + " },\n", + " \"attributes\": {\n", + " \"foo\": \"bar\"\n", + " },\n", + " \"events\": [],\n", + " \"links\": [],\n", + " \"resource\": {\n", + " \"attributes\": {\n", + " \"telemetry.sdk.language\": \"python\",\n", + " \"telemetry.sdk.name\": \"opentelemetry\",\n", + " \"telemetry.sdk.version\": \"1.14.0\",\n", + " \"service.name\": \"unknown_service\"\n", + " },\n", + " \"schema_url\": \"\"\n", + " }\n", + "}\n", + "{\n", + " \"name\": \"operation-name\",\n", + " \"context\": {\n", + " \"trace_id\": \"0x5625fbd0a1be15b49cda0d2bb236d158\",\n", + " \"span_id\": \"0xa5f1a9afdf26173c\",\n", + " \"trace_state\": \"[]\"\n", + " },\n", + " \"kind\": \"SpanKind.INTERNAL\",\n", + " \"parent_id\": null,\n", + " \"start_time\": \"2022-12-07T13:57:13.910849Z\",\n", + " \"end_time\": \"2022-12-07T13:57:14.011320Z\",\n", + " \"status\": {\n", + " \"status_code\": \"UNSET\"\n", + " },\n", + " \"attributes\": {},\n", + " \"events\": [],\n", + " \"links\": [],\n", + " \"resource\": {\n", + " \"attributes\": {\n", + " \"telemetry.sdk.language\": \"python\",\n", + " 
\"telemetry.sdk.name\": \"opentelemetry\",\n", + " \"telemetry.sdk.version\": \"1.14.0\",\n", + " \"service.name\": \"unknown_service\"\n", + " },\n", + " \"schema_url\": \"\"\n", + " }\n", + "}\n" + ] + } + ], + "source": [ + "with tracer.start_as_current_span(\"operation-name\") as span:\n", + " time.sleep(0.1)\n", + " with tracer.start_as_current_span(\"child-span\") as span:\n", + " span.set_attribute(\"foo\", \"bar\")" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.6" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/docs/images/opentelemetry/distributed-tracing.png b/docs/images/opentelemetry/distributed-tracing.png Binary files differnew file mode 100644 index 0000000..a011697 --- /dev/null +++ b/docs/images/opentelemetry/distributed-tracing.png diff --git a/docs/images/opentelemetry/redis-metrics.png b/docs/images/opentelemetry/redis-metrics.png Binary files differnew file mode 100644 index 0000000..7c2beb4 --- /dev/null +++ b/docs/images/opentelemetry/redis-metrics.png diff --git a/docs/images/opentelemetry/redis-py-trace.png b/docs/images/opentelemetry/redis-py-trace.png Binary files differnew file mode 100644 index 0000000..e443238 --- /dev/null +++ b/docs/images/opentelemetry/redis-py-trace.png diff --git a/docs/images/opentelemetry/tree-of-spans.png b/docs/images/opentelemetry/tree-of-spans.png Binary files differnew file mode 100644 index 0000000..399c8a0 --- /dev/null +++ b/docs/images/opentelemetry/tree-of-spans.png diff --git a/docs/index.rst b/docs/index.rst index 6dd5379..a6ee05e 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -72,6 +72,7 @@ Module Documentation advanced_features clustering 
    lua_scripting
+   opentelemetry
    examples
 
 Contributing
diff --git a/docs/opentelemetry.rst b/docs/opentelemetry.rst
new file mode 100644
index 0000000..9678102
--- /dev/null
+++ b/docs/opentelemetry.rst
@@ -0,0 +1,177 @@
+Integrating OpenTelemetry
+=========================
+
+What is OpenTelemetry?
+----------------------
+
+`OpenTelemetry <https://opentelemetry.io>`_ is an open-source observability framework for traces, metrics, and logs.
+
+OpenTelemetry allows developers to collect and export telemetry data in a vendor-agnostic way. With OpenTelemetry, you can instrument your application once and then add or change vendors without changing the instrumentation. For example, here is a list of `popular DataDog competitors <https://uptrace.dev/get/compare/datadog-competitors.html>`_ that support OpenTelemetry.
+
+What is tracing?
+----------------
+
+`OpenTelemetry tracing <https://uptrace.dev/opentelemetry/distributed-tracing.html>`_ allows you to see how a request progresses through different services and systems, the timing of each operation, and any logs and errors as they occur.
+
+In a distributed environment, tracing also helps you understand relationships and interactions between microservices. Distributed tracing gives insight into how a particular microservice is performing and how that service affects other microservices.
+
+.. image:: images/opentelemetry/distributed-tracing.png
+   :alt: Trace
+
+Using tracing, you can break down requests into spans. A **span** is an operation (unit of work) your app performs while handling a request, for example, a database query or a network call.
+
+A **trace** is a tree of spans that shows the path that a request makes through an app. The root span is the first span in a trace.
+
+.. image:: images/opentelemetry/tree-of-spans.png
+   :alt: Trace
+
+To learn more about tracing, see `Distributed Tracing using OpenTelemetry <https://uptrace.dev/opentelemetry/distributed-tracing.html>`_.
+
+OpenTelemetry instrumentation
+-----------------------------
+
+Instrumentations are plugins for popular frameworks and libraries that use the OpenTelemetry API to record important operations, for example, HTTP requests, DB queries, logs, errors, and more.
+
+To install OpenTelemetry `instrumentation <https://opentelemetry-python-contrib.readthedocs.io/en/latest/instrumentation/redis/redis.html>`_ for redis-py:
+
+.. code-block:: shell
+
+    pip install opentelemetry-instrumentation-redis
+
+You can then use it to instrument code like this:
+
+.. code-block:: python
+
+    from opentelemetry.instrumentation.redis import RedisInstrumentor
+
+    RedisInstrumentor().instrument()
+
+Once the code is patched, you can use redis-py as usual:
+
+.. code-block:: python
+
+    # Sync client
+    client = redis.Redis()
+    client.get("my-key")
+
+    # Async client
+    client = redis.asyncio.Redis()
+    await client.get("my-key")
+
+OpenTelemetry API
+-----------------
+
+The `OpenTelemetry <https://uptrace.dev/opentelemetry/>`_ API is a programming interface that you can use to instrument code and collect telemetry data such as traces, metrics, and logs.
+
+You can use the OpenTelemetry API to measure important operations:
+
+.. code-block:: python
+
+    from opentelemetry import trace
+
+    tracer = trace.get_tracer("app_or_package_name", "1.0.0")
+
+    # Create a span with name "operation-name" and kind="server".
+    with tracer.start_as_current_span("operation-name", kind=trace.SpanKind.SERVER) as span:
+        do_some_work()
+
+Record contextual information using attributes:
+
+.. code-block:: python
+
+    if span.is_recording():
+        span.set_attribute("http.method", "GET")
+        span.set_attribute("http.route", "/projects/:id")
+
+And monitor exceptions:
+
+.. code-block:: python
+
+    except ValueError as exc:
+        # Record the exception and update the span status.
+        span.record_exception(exc)
+        span.set_status(trace.Status(trace.StatusCode.ERROR, str(exc)))
+
+See `OpenTelemetry Python Tracing API <https://uptrace.dev/opentelemetry/python-tracing.html>`_ for details.
+
+Uptrace
+-------
+
+Uptrace is an `open-source APM <https://uptrace.dev/get/open-source-apm.html>`_ that supports distributed tracing, metrics, and logs. You can use it to monitor applications and set up automatic alerts to receive notifications via email, Slack, Telegram, and more.
+
+You can use Uptrace to monitor redis-py using this `GitHub example <https://github.com/redis/redis-py/tree/master/docs/examples/opentelemetry>`_ as a starting point.
+
+.. image:: images/opentelemetry/redis-py-trace.png
+   :alt: Redis-py trace
+
+You can `install Uptrace <https://uptrace.dev/get/install.html>`_ by downloading a DEB/RPM package or a pre-compiled binary.
+
+Monitoring Redis Server performance
+-----------------------------------
+
+In addition to monitoring the redis-py client, you can also monitor Redis Server performance using the OpenTelemetry Collector Agent.
+
+The OpenTelemetry Collector is a proxy/middleman between your application and a `distributed tracing tool <https://uptrace.dev/get/compare/distributed-tracing-tools.html>`_ such as Uptrace or Jaeger. The Collector receives telemetry data, processes it, and then exports the data to APM tools that can store it permanently.
+
+For example, you can use the Redis receiver provided by the OpenTelemetry Collector to `monitor Redis performance <https://uptrace.dev/opentelemetry/redis-monitoring.html>`_:
+
+.. image:: images/opentelemetry/redis-metrics.png
+   :alt: Redis metrics
+
+See the introduction to `OpenTelemetry Collector <https://uptrace.dev/opentelemetry/collector.html>`_ for details.
+
+Alerting and notifications
+--------------------------
+
+Uptrace also allows you to monitor `OpenTelemetry metrics <https://uptrace.dev/opentelemetry/metrics.html>`_ using alerting rules.
For example, the following rule uses the ``group by node`` expression to create an alert whenever an individual Redis shard is down:
+
+.. code-block:: yaml
+
+    # /etc/uptrace/uptrace.yml
+
+    alerting:
+      rules:
+        - name: Redis shard is down
+          metrics:
+            - redis_up as $redis_up
+          query:
+            - group by cluster # monitor each cluster,
+            - group by bdb # each database,
+            - group by node # and each shard
+            - $redis_up == 0
+          # shard should be down for 5 minutes to trigger an alert
+          for: 5m
+
+You can also create queries with more complex expressions. For example, the following rule creates an alert when the keyspace hit rate is lower than 75%:
+
+.. code-block:: yaml
+
+    # /etc/uptrace/uptrace.yml
+
+    alerting:
+      rules:
+        - name: Redis read hit rate < 75%
+          metrics:
+            - redis_keyspace_read_hits as $hits
+            - redis_keyspace_read_misses as $misses
+          query:
+            - group by cluster
+            - group by bdb
+            - group by node
+            - $hits / ($hits + $misses) < 0.75
+          for: 5m
+
+See `Alerting and Notifications <https://uptrace.dev/get/alerting.html>`_ for details.
+
+What's next?
+------------
+
+Next, you can learn how to configure `uptrace-python <https://uptrace.dev/get/uptrace-python.html>`_ to export spans, metrics, and logs to Uptrace.
+
+You may also be interested in the following guides:
+
+- `OpenTelemetry Django <https://uptrace.dev/opentelemetry/instrumentations/python-django.html>`_
+- `OpenTelemetry Flask <https://uptrace.dev/opentelemetry/instrumentations/python-flask.html>`_
+- `OpenTelemetry FastAPI <https://uptrace.dev/opentelemetry/instrumentations/python-fastapi.html>`_
+- `OpenTelemetry SQLAlchemy <https://uptrace.dev/opentelemetry/instrumentations/python-sqlalchemy.html>`_
+- `OpenTelemetry instrumentations <https://uptrace.dev/opentelemetry/instrumentations/>`_
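As a note on the keyspace hit-rate rule in the "Alerting and notifications" section of the patch above: the expression `$hits / ($hits + $misses) < 0.75` can be checked by hand with a few lines of Python. The sketch below is only an illustration of the arithmetic, not part of the commit and not how Uptrace evaluates rules. The function names and the choice to return `None` when there were no reads are assumptions made for this example, and the `for: 5m` duration condition is deliberately left out.

```python
from typing import Optional

# Same threshold as in the alerting rule (75%).
HIT_RATE_THRESHOLD = 0.75


def keyspace_hit_rate(hits: int, misses: int) -> Optional[float]:
    """Return hits / (hits + misses), or None when there were no reads.

    Hypothetical helper: plain Python raises ZeroDivisionError on 0 / 0,
    so this sketch handles the "no traffic yet" case explicitly.
    """
    total = hits + misses
    if total == 0:
        return None
    return hits / total


def should_alert(hits: int, misses: int,
                 threshold: float = HIT_RATE_THRESHOLD) -> bool:
    """Return True when the hit rate is strictly below the threshold."""
    rate = keyspace_hit_rate(hits, misses)
    return rate is not None and rate < threshold


if __name__ == "__main__":
    # (hits, misses) samples: below, exactly at, and above the threshold,
    # plus a shard that has served no reads at all.
    for hits, misses in [(600, 400), (750, 250), (900, 100), (0, 0)]:
        print(hits, misses, keyspace_hit_rate(hits, misses),
              should_alert(hits, misses))
```

Because the comparison is strict, a shard sitting exactly at 75% does not alert in this sketch, which mirrors the `<` in the rule.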