summaryrefslogtreecommitdiff
path: root/doc/development/diffs.md
diff options
context:
space:
mode:
Diffstat (limited to 'doc/development/diffs.md')
-rw-r--r--doc/development/diffs.md202
1 files changed, 7 insertions, 195 deletions
diff --git a/doc/development/diffs.md b/doc/development/diffs.md
index b38fcea4f00..c84bf57e085 100644
--- a/doc/development/diffs.md
+++ b/doc/development/diffs.md
@@ -1,199 +1,11 @@
---
-stage: Create
-group: Code Review
-info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments
+redirect_to: 'merge_request_concepts/diffs/index.md'
+remove_date: '2023-04-10'
---
-# Working with diffs
+This document was moved to [another location](merge_request_concepts/diffs/index.md).
-We rely on different sources to present diffs. These include:
-
-- Gitaly service
-- Database (through `merge_request_diff_files`)
-- Redis (cached highlighted diffs)
-
-## Deep Dive
-
-<!-- vale gitlab.Spelling = NO -->
-
-In January 2019, Oswaldo Ferreira hosted a Deep Dive (GitLab team members only:
-`https://gitlab.com/gitlab-org/create-stage/issues/1`) on GitLab Diffs and Commenting on Diffs
-functionality to share domain-specific knowledge with anyone who may work in this part of the
-codebase in the future:
-
-<!-- vale gitlab.Spelling = YES -->
-
-- <i class="fa fa-youtube-play youtube" aria-hidden="true"></i>
- [Recording on YouTube](https://www.youtube.com/watch?v=K6G3gMcFyek)
-- Slides on [Google Slides](https://docs.google.com/presentation/d/1bGutFH2AT3bxOPZuLMGl1ANWHqFnrxwQwjiwAZkF-TU/edit)
-- [PDF slides](https://gitlab.com/gitlab-org/create-stage/uploads/b5ad2f336e0afcfe0f99db0af0ccc71a/)
-
-Everything covered in this deep dive was accurate as of GitLab 11.7, and while specific details may
-have changed since then, it should still serve as a good introduction.
-
-## Architecture overview
-
-### Merge request diffs
-
-When refreshing a merge request (pushing to a source branch, force-pushing to target branch, or if the target branch now contains any commits from the MR)
-we fetch the comparison information using `Gitlab::Git::Compare`, which fetches `base` and `head` data using Gitaly and diff between them through
-`Gitlab::Git::Diff.between`.
-The diffs fetching process _limits_ single file diff sizes and the overall size of the whole diff through a series of constant values. Raw diff files are
-then persisted on `merge_request_diff_files` table.
-
-Even though diffs larger than 10% of the value of `ApplicationSettings#diff_max_patch_bytes` are collapsed,
-we still keep them on PostgreSQL. However, diff files larger than defined _safety limits_
-(see the [Diff limits section](#diff-limits)) are _not_ persisted in the database.
-
-In order to present diffs information on the merge request diffs page, we:
-
-1. Fetch all diff files from database `merge_request_diff_files`
-1. Fetch the _old_ and _new_ file blobs in batch to:
- - Highlight old and new file content
- - Know which viewer it should use for each file (text, image, deleted, etc)
- - Know if the file content changed
- - Know if it was stored externally
- - Know if it had storage errors
-1. If the diff file is cacheable (text-based), it's cached on Redis
- using `Gitlab::Diff::FileCollection::MergeRequestDiff`
-
-### Note diffs
-
-When commenting on a diff (any comparison), we persist a truncated diff version
-on `NoteDiffFile` (which is associated with the actual `DiffNote`). So instead
-of hitting the repository every time we need the diff of the file, we:
-
-1. Check whether we have the `NoteDiffFile#diff` persisted and use it
-1. Otherwise, if it's a current MR revision, use the persisted
- `MergeRequestDiffFile#diff`
-1. In the last scenario, go the repository and fetch the diff
-
-## Diff limits
-
-As explained above, we limit single diff files and the size of the whole diff. There are scenarios where we collapse the diff file,
-and cases where the diff file is not presented at all, and the user is guided to the Blob view.
-
-### Diff collection limits
-
-Limits that act onto all diff files collection. Files number, lines number and files size are considered.
-
-```ruby
-Gitlab::Git::DiffCollection.collection_limits[:safe_max_files] = Gitlab::Git::DiffCollection::DEFAULT_LIMITS[:max_files] = 100
-```
-
-File diffs are collapsed (but are expandable) if 100 files have already been rendered.
-
-```ruby
-Gitlab::Git::DiffCollection.collection_limits[:safe_max_lines] = Gitlab::Git::DiffCollection::DEFAULT_LIMITS[:max_lines] = 5000
-```
-
-File diffs are collapsed (but be expandable) if 5000 lines have already been rendered.
-
-```ruby
-Gitlab::Git::DiffCollection.collection_limits[:safe_max_bytes] = Gitlab::Git::DiffCollection.collection_limits[:safe_max_files] * 5.kilobytes = 500.kilobytes
-```
-
-File diffs are collapsed (but be expandable) if 500 kilobytes have already been rendered.
-
-```ruby
-Gitlab::Git::DiffCollection.collection_limits[:max_files] = Commit::DIFF_HARD_LIMIT_FILES = 1000
-```
-
-No more files are rendered at all if 1000 files have already been rendered.
-
-```ruby
-Gitlab::Git::DiffCollection.collection_limits[:max_lines] = Commit::DIFF_HARD_LIMIT_LINES = 50000
-```
-
-No more files are rendered at all if 50,000 lines have already been rendered.
-
-```ruby
-Gitlab::Git::DiffCollection.collection_limits[:max_bytes] = Gitlab::Git::DiffCollection.collection_limits[:max_files] * 5.kilobytes = 5000.kilobytes
-```
-
-No more files are rendered at all if 5 megabytes have already been rendered.
-
-All collection limit parameters are sent and applied on Gitaly. That is, after the limit is surpassed,
-Gitaly only returns the safe amount of data to be persisted on `merge_request_diff_files`.
-
-### Individual diff file limits
-
-Limits that act onto each diff file of a collection. Files number, lines number and files size are considered.
-
-#### Expandable patches (collapsed)
-
-Diff patches are collapsed when surpassing 10% of the value set in `ApplicationSettings#diff_max_patch_bytes`.
-That is, it's equivalent to 10kb if the maximum allowed value is 100kb.
-The diff is persisted and expandable if the patch size doesn't
-surpass `ApplicationSettings#diff_max_patch_bytes`.
-
-Although this nomenclature (Collapsing) is also used on Gitaly, this limit is only used on GitLab (hardcoded - not sent to Gitaly).
-Gitaly only returns `Diff.Collapsed` (RPC) when surpassing collection limits.
-
-#### Not expandable patches (too large)
-
-The patch not be rendered if it's larger than `ApplicationSettings#diff_max_patch_bytes`.
-Users see a `Changes are too large to be shown.` message and a button to view only that file in that commit.
-
-```ruby
-Commit::DIFF_SAFE_LINES = Gitlab::Git::DiffCollection::DEFAULT_LIMITS[:max_lines] = 5000
-```
-
-File diff is suppressed (technically different from collapsed, but behaves the same, and is expandable) if it has more than 5000 lines.
-
-This limit is hardcoded and only applied on GitLab.
-
-## Viewers
-
-Diff Viewers, which can be found on `models/diff_viewer/*` are classes used to map metadata about each type of Diff File. It has information
-whether it's a binary, which partial should be used to render it or which File extensions this class accounts for.
-
-`DiffViewer::Base` validates _blobs_ (old and new versions) content, extension and file type to check if it can be rendered.
-
-## Merge request diffs against the `HEAD` of the target branch
-
-Historically, merge request diffs have been calculated by `git diff target...source` which compares the
-`HEAD` of the source branch with the merge base (or a common ancestor) of the target branch and the source's.
-This solution works well until the target branch starts containing some of the
-changes introduced by the source branch: Consider the following case, in which the source branch
-is `feature_a` and the target is `main`:
-
-1. Checkout a new branch `feature_a` from `main` and remove `file_a` and `file_b` in it.
-1. Add a commit that removes `file_a` to `main`.
-
-The merge request diff still contains the `file_a` removal while the actual diff compared to
-`main`'s `HEAD` has only the `file_b` removal. The diff with such redundant
-changes is harder to review.
-
-In order to display an up-to-date diff, in GitLab 12.9 we
-[introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/27008) merge request
-diffs compared against `HEAD` of the target branch: the
-target branch is artificially merged into the source branch, then the resulting
-merge ref is compared to the source branch to calculate an accurate
-diff.
-
-Until we complete the epics ["use merge refs for diffs"](https://gitlab.com/groups/gitlab-org/-/epics/854)
-and ["merge conflicts in diffs"](https://gitlab.com/groups/gitlab-org/-/epics/4893),
-both options `main (base)` and `main (HEAD)` are available to be displayed in merge requests:
-
-![Merge ref head options](img/merge_ref_head_options_v13_6.png)
-
-The `main (HEAD)` option is meant to replace `main (base)` in the future.
-
-In order to support comments for both options, diff note positions are stored for
-both `main (base)` and `main (HEAD)` versions ([introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/198457) in 12.10).
-The position for `main (base)` version is stored in `Note#position` and
-`Note#original_position` columns, for `main (HEAD)` version `DiffNotePosition`
-has been introduced.
-
-One of the key challenges to deal with when working on merge ref diffs are merge
-conflicts. If the target and source branch contains a merge conflict, the branches
-cannot be automatically merged. The
-<i class="fa fa-youtube-play youtube" aria-hidden="true"></i> [recording on YouTube](https://www.youtube.com/watch?v=GFXIFA4ZuZw&feature=youtu.be&ab_channel=GitLabUnfiltered)
-is a quick introduction to the problem and the motivation behind the [epic](https://gitlab.com/groups/gitlab-org/-/epics/854).
-
-In 13.5 a solution for both-modified merge
-conflict has been
-[introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/232484). However,
-there are more classes of merge conflicts that are to be
-[addressed](https://gitlab.com/groups/gitlab-org/-/epics/4893) in the future.
+<!-- This redirect file can be deleted after <2023-04-10>. -->
+<!-- Redirects that point to other docs in the same project expire in three months. -->
+<!-- Redirects that point to docs in a different project or site (for example, link is not relative and starts with `https:`) expire in one year. -->
+<!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/redirects.html -->