1 files changed, 105 insertions, 6 deletions
diff --git a/doc/development/performance.md b/doc/development/performance.md
index 3b4525dc8ee..e93dc26e4fc 100644
--- a/doc/development/performance.md
+++ b/doc/development/performance.md
@@ -256,7 +256,7 @@ The following configuration options can be configured:
 - `STACKPROF_MODE`: See [sampling modes](https://github.com/tmm1/stackprof#sampling).
   Defaults to `cpu`.
 - `STACKPROF_INTERVAL`: Sampling interval. Unit semantics depend on `STACKPROF_MODE`.
-  For `object` mode this is a per-event interval (every `n`th event is sampled)
+  For `object` mode this is a per-event interval (every `nth` event is sampled)
   and defaults to `1000`.
   For other modes such as `cpu` this is a frequency and defaults to `10000` μs (100hz).
 - `STACKPROF_FILE_PREFIX`: File path prefix where profiles are stored. Defaults
@@ -293,7 +293,7 @@ worker processes), selecting the latter.
 
 For Sidekiq, the signal can be sent to the `sidekiq-cluster` process via `pkill
 -USR2 bin/sidekiq-cluster`, which forwards the signal to all Sidekiq
-children. Alternatively, you can also select a specific pid of interest.
+children. Alternatively, you can also select a specific PID of interest.
 
 Production profiles can be especially noisy. It can be helpful to visualize them
 as a [flame graph](https://github.com/brendangregg/FlameGraph). This can be done
@@ -306,7 +306,7 @@ bundle exec stackprof --stackcollapse /tmp/stackprof.55769.c6c3906452.profile |
 ## RSpec profiling
 
 The GitLab development environment also includes the
-[rspec_profiling](https://github.com/foraker/rspec_profiling) gem, which is used
+[`rspec_profiling`](https://github.com/foraker/rspec_profiling) gem, which is used
 to collect data on spec execution times. This is useful for analyzing the
 performance of the test suite itself, or seeing how the performance of a spec
 may have changed over time.
@@ -347,13 +347,112 @@ example, you can find which tests take longest to run or which execute the most
 queries. This can be handy for optimizing our tests or identifying performance
 issues in our code.
 
-## Memory profiling
+## Memory optimization
 
-We can use two approaches, often in combination, to track down memory issues:
+We can use a set of different techniques, often in combination, to track down memory issues:
 
 - Leaving the code intact and wrapping a profiler around it.
+- Use memory allocation counters for requests and services.
 - Monitor memory usage of the process while disabling/enabling different parts of the code we suspect could be problematic.
 
+### Memory allocations
+
+Ruby shipped with GitLab includes a special patch to allow [tracing memory allocations](https://gitlab.com/gitlab-org/gitlab/-/issues/296530).
+This patch is available by default for
+[Omnibus](https://gitlab.com/gitlab-org/omnibus-gitlab/-/merge_requests/4948),
+[CNG](https://gitlab.com/gitlab-org/build/CNG/-/merge_requests/591),
+[GitLab CI](https://gitlab.com/gitlab-org/gitlab-build-images/-/merge_requests/355),
+[GCK](https://gitlab.com/gitlab-org/gitlab-compose-kit/-/merge_requests/149)
+and can additionally be enabled for [GDK](https://gitlab.com/gitlab-org/gitlab-development-kit/-/blob/master/doc/advanced.md#apply-custom-patches-for-ruby).
+
+This patch provides a set of 3 metrics that makes it easier to understand efficiency of memory usage for a given codepath:
+
+- `mem_objects`: the number of objects allocated.
+- `mem_bytes`: the number of bytes allocated by malloc.
+- `mem_mallocs`: the number of malloc allocations.
+
+The number of objects and bytes allocated impact how often GC cycles happen.
+Fewer objects allocations result in a significantly more responsive application.
+
+It is advised that web server requests do not allocate more than `100k mem_objects`
+and `100M mem_bytes`. You can view the current usage on [GitLab.com](https://log.gprd.gitlab.net/goto/3a9678bb595e3f89a0c7b5c61bcc47b9).
+
+#### Checking memory pressure of own code
+
+There are two ways of measuring your own code:
+
+1. Review `api_json.log`, `development_json.log`, `sidekiq.log` that includes memory allocation counters.
+1. Use `Gitlab::Memory::Instrumentation.with_memory_allocations` for a given codeblock and log it.
+1. Use [Measuring module](service_measurement.md)
+
+```json
+{"time":"2021-02-15T11:20:40.821Z","severity":"INFO","duration_s":0.27412,"db_duration_s":0.05755,"view_duration_s":0.21657,"status":201,"method":"POST","path":"/api/v4/projects/user/1","mem_objects":86705,"mem_bytes":4277179,"mem_mallocs":22693,"correlation_id":"...}
+```
+
+#### Different types of allocations
+
+The `mem_*` values represent different aspects of how objects and memory are allocated in Ruby:
+
+- The following example will create around of `1000` of `mem_objects` since strings
+   can be frozen, and while the underlying string object remains the same, we still need to allocate 1000 references to this string:
+
+  ```ruby
+  Gitlab::Memory::Instrumentation.with_memory_allocations do
+    1_000.times { '0123456789' }
+  end
+
+  => {:mem_objects=>1001, :mem_bytes=>0, :mem_mallocs=>0}
+  ```
+
+- The following example will create around of `1000` of `mem_objects`, as strings are created dynamically.
+   Each of them will not allocate additional memory, as they fit into Ruby slot of 40 bytes:
+
+  ```ruby
+  Gitlab::Memory::Instrumentation.with_memory_allocations do
+    s = '0'
+    1_000.times { s * 23 }
+  end
+
+  => {:mem_objects=>1002, :mem_bytes=>0, :mem_mallocs=>0}
+  ```
+
+- The following example will create around of `1000` of `mem_objects`, as strings are created dynamically.
+   Each of them will allocate additional memory as strings are larger than Ruby slot of 40 bytes:
+
+  ```ruby
+  Gitlab::Memory::Instrumentation.with_memory_allocations do
+    s = '0'
+    1_000.times { s * 24 }
+  end
+
+  => {:mem_objects=>1002, :mem_bytes=>32000, :mem_mallocs=>1000}
+  ```
+
+- The following example will allocate over 40kB of data, and perform only a single memory allocation.
+   The existing object will be reallocated/resized on subsequent iterations:
+
+  ```ruby
+  Gitlab::Memory::Instrumentation.with_memory_allocations do
+    str = ''
+    append = '0123456789012345678901234567890123456789' # 40 bytes
+    1_000.times { str.concat(append) }
+  end
+  => {:mem_objects=>3, :mem_bytes=>49152, :mem_mallocs=>1}
+  ```
+
+- The following example will create over 1k of objects, perform over 1k of allocations, each time mutating the object.
+   This does result in copying a lot of data and perform a lot of memory allocations
+  (as represented by `mem_bytes` counter) indicating very inefficient method of appending string:
+
+  ```ruby
+  Gitlab::Memory::Instrumentation.with_memory_allocations do
+    str = ''
+    append = '0123456789012345678901234567890123456789' # 40 bytes
+    1_000.times { str += append }
+  end
+  => {:mem_objects=>1003, :mem_bytes=>21968752, :mem_mallocs=>1000}
+  ```
+
 ### Using Memory Profiler
 
 We can use `memory_profiler` for profiling.
@@ -409,7 +508,7 @@ Fragmented Ruby heap snapshot could look like this:
 
 ![Ruby heap fragmentation](img/memory_ruby_heap_fragmentation.png)
 
-Memory fragmentation could be reduced by tuning GC parameters as described in [this post by Nate Berkopec](https://www.speedshop.co/2017/12/04/malloc-doubles-ruby-memory.html). This should be considered as a tradeoff, as it may affect overall performance of memory allocation and GC cycles.
+Memory fragmentation could be reduced by tuning GC parameters [as described in this post](https://www.speedshop.co/2017/12/04/malloc-doubles-ruby-memory.html). This should be considered as a tradeoff, as it may affect overall performance of memory allocation and GC cycles.
 
 ## Importance of Changes