# Migration guide from Git Annex to Git LFS

>**Note:**
Git Annex support [has been removed][issue-remove-annex] in GitLab Enterprise
Edition 9.0 (2017/03/22).

Both [Git Annex][] and [Git LFS][] are tools to manage large files in Git.

## History

Git Annex [was introduced in GitLab Enterprise Edition 7.8][post-3], at a time
when Git LFS didn't yet exist. A few months later, GitLab added support for
Git LFS in [GitLab 8.2][post-2], where it is available in both the Community and
Enterprise editions.

## Differences between Git Annex and Git LFS

Some items below are general differences between the two protocols and some are
ones that GitLab developed.

- Git Annex works only through SSH, whereas Git LFS works both with SSH and HTTPS
  (SSH support was added in GitLab 8.12).
- Annex files are stored in a sub-directory of the normal repositories, whereas
  LFS files are stored outside of the repositories in a place you can define.
- Git Annex requires a more complex setup, but has many more options than Git
  LFS. You can compare the commands each one offers by running `man git-annex`
  and `man git-lfs`.
- Annex files cannot be browsed directly in GitLab's interface, whereas LFS
  files can.

## Migration steps

>**Note:**
Since Git Annex files are stored in a sub-directory of the normal repositories
(`.git/annex/objects`) and LFS files are stored outside of the repositories,
they are not compatible as they are using a different scheme. Therefore, the
migration has to be done manually per repository.

There are basically two steps you need to take in order to migrate from Git
Annex to Git LFS.

### TL;DR

If you know what you are doing and want to skip the reading, this is what you
need to do (we assume you have [git-annex enabled](../git_annex.md#using-gitlab-git-annex) in your
repository and that you have made backups in case something goes wrong).
Fire up a terminal, navigate to your Git repository and:


1. Disable `git-annex`:

    ```bash
    git annex sync --content
    git annex direct
    git annex uninit
    git annex indirect
    ```

1. Enable `git-lfs`:

    ```bash
    git lfs install
    git lfs track <files>
    git add .
    git commit -m "commit message"
    git push
    ```

### Disabling Git Annex in your repo

Before changing anything, make sure you have a backup of your repository first.
There are a couple of ways to do that, but you can simply clone it to another
local path and maybe push it to GitLab if you want a remote backup as well.
Here you'll find a guide on
[how to back up a **git-annex** repository to an external hard drive][bkp-ext-drive].

Since Annex files are stored as objects with symlinks and cannot be directly
modified, we need to first remove those symlinks.

>**Note:**
Make sure you read about the [`direct` mode][annex-direct], as it contains
useful information that may apply to your use case. Note that `annex direct` is
deprecated in Git Annex version 6, so you may need to upgrade your repository
if the server also has Git Annex 6 installed. Read more in the
[Git Annex troubleshooting tips](../git_annex.md#troubleshooting-tips) section.

1. Back up your repository:

    ```bash
    cd repository
    git annex sync --content
    cd ..
    git clone repository repository-backup
    cd repository-backup
    git annex get
    cd ..
    ```

1. Use `annex direct`:

    ```bash
    cd repository
    git annex direct
    ```

    The output should be similar to this:

    ```bash
    commit
    On branch master
    Your branch is up-to-date with 'origin/master'.
    nothing to commit, working tree clean
    ok
    direct debian.iso ok
    direct  ok
    ```

1. Disable Git Annex with [`annex uninit`][uninit]:

    ```bash
    git annex uninit
    ```

    The output should be similar to this:

    ```bash
    unannex debian.iso ok
    Deleted branch git-annex (was 2534d2c).
    ```

    This will `unannex` every file in the repository, leaving the original files.

1. Switch back to `indirect` mode:

    ```bash
    git annex indirect
    ```

    The output should be similar to this:

    ```bash
    (merging origin/git-annex into git-annex...)
    (recording state in git...)
    commit  (recording state in git...)

    ok
    (recording state in git...)
    [master fac3194] commit before switching to indirect mode
     1 file changed, 1 deletion(-)
     delete mode 120000 alpine-virt-3.4.4-x86_64.iso
    ok
    indirect  ok
    ok
    ```
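
Because annexed files are symlinks into `.git/annex/objects`, one way to confirm
that the steps above left only regular files behind is to look for symlinks that
still point there. The following is a minimal sketch, not part of Git Annex
itself; the function name `find_annex_symlinks` is hypothetical, and it assumes
file names do not contain newlines:

```bash
# List symlinks in the working tree that still point into .git/annex/objects.
# Prints one path per line; prints nothing when no annexed symlinks remain.
find_annex_symlinks() {
  local root="${1:-.}"
  # Skip the .git directory itself, then inspect every remaining symlink.
  find "$root" -path "$root/.git" -prune -o -type l -print |
    while IFS= read -r link; do
      case "$(readlink "$link")" in
        *.git/annex/objects/*) printf '%s\n' "$link" ;;
      esac
    done
}
```

Running `find_annex_symlinks .` from the root of the repository should print
nothing once every file has been unannexed.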

---

At this point, you have two options. Either add, commit and push the files
directly back to GitLab or switch to Git LFS. We will tackle the LFS switch in
the next section.

### Enabling Git LFS in your repo

Git LFS is enabled by default on all GitLab products (GitLab CE, GitLab EE,
GitLab.com), so you don't need to do anything server-side.

1. First, make sure you have `git-lfs` installed locally:

   ```bash
   git lfs help
   ```

   If the terminal doesn't print the list of available `git-lfs` commands,
   [install the Git LFS client][install-lfs] first.

1. Inside the repo, run the following command to initiate LFS:

   ```bash
   git lfs install
   ```

1. Enable `git-lfs` for the group of files you want to track. You
   can track specific files, all files containing the same extension, or an
   entire directory:

   ```bash
   git lfs track images/01.png   # per file
   git lfs track "*.png"         # per extension (quoted so the shell does not expand it)
   git lfs track "images/**"     # per directory
   ```

   Once you do that, run `git status` and you'll see `.gitattributes` added
   to your repo. It collects all file patterns that you chose to track via
   `git-lfs`.

1. Add the files, commit and push them to GitLab:

   ```bash
   git add .
   git commit -m "commit message"
   git push
   ```

   If your remote is set up with HTTP, you will be asked to enter your login
   credentials. If you have [2FA enabled](../../user/profile/account/two_factor_authentication.md), make sure to use a
   [personal access token](../../user/profile/account/two_factor_authentication.md#personal-access-tokens)
   instead of your password.
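
Before pushing, it can help to double-check which patterns ended up in
`.gitattributes`. `git lfs track` records each pattern on a line carrying the
`filter=lfs` attribute, so a small helper can list them. This is a sketch only;
the function name `lfs_patterns` is hypothetical:

```bash
# Print every pattern in a .gitattributes file that is routed through the LFS filter.
lfs_patterns() {
  local file="${1:-.gitattributes}"
  # Skip comments and blank lines; the pattern is the first field of each line.
  awk '!/^[ \t]*(#|$)/ && /(^|[ \t])filter=lfs([ \t]|$)/ { print $1 }' "$file"
}
```

Running `git lfs track` with no arguments prints similar information directly
from the Git LFS client.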

## Removing the Git Annex branches

After the migration finishes successfully, you can remove all `git-annex`
related branches from your repository.

On GitLab, navigate to your project's **Repository ➔ Branches** and delete all
branches created by Git Annex: `git-annex`, and all under `synced/`.

![repository branches](images/git-annex-branches.png)

You can also do this on the command line with:

```bash
git branch -d synced/master
git branch -d synced/git-annex
git push origin :synced/master
git push origin :synced/git-annex
git push origin :git-annex
git remote prune origin
```
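
If a repository has more `synced/` branches than the ones listed above, the
branch names can be filtered rather than typed by hand. The sketch below reads
branch names on standard input and keeps only `git-annex` and anything under
`synced/`; the function name `annex_branches` is hypothetical:

```bash
# Read branch names on stdin and print only those created by Git Annex:
# the git-annex branch itself and everything under synced/.
annex_branches() {
  # "|| true" keeps the exit status at 0 when no branch matches.
  grep -E '^(git-annex|synced/.+)$' || true
}
```

For example, `git for-each-ref --format='%(refname:short)' refs/heads | annex_branches`
lists the local branches that are candidates for deletion. Review the list
before deleting anything.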

If there are still some Annex objects inside your repository (`.git/annex/`)
or references inside `.git/config`, run `annex uninit` again:

```bash
git annex uninit
```
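
A quick check for both kinds of leftovers might look like the sketch below. The
function name `has_annex_leftovers` is hypothetical, and it assumes that any
line mentioning `annex` in `.git/config` counts as a reference, which may
produce false positives (for example, a branch whose name contains "annex"):

```bash
# Succeed (exit 0) when a repository still shows signs of Git Annex.
has_annex_leftovers() {
  local repo="${1:-.}"
  # A leftover object store.
  [ -d "$repo/.git/annex" ] && return 0
  # Any mention of "annex" in the local config (assumption: a plain text match is enough).
  [ -f "$repo/.git/config" ] && grep -qi 'annex' "$repo/.git/config"
}
```

If `has_annex_leftovers .` succeeds after the migration, running
`git annex uninit` again as described above is the next step.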

## Further Reading

- (Blog Post) [Getting Started with Git LFS][post-1]
- (Blog Post) [Announcing LFS Support in GitLab][post-2]
- (Blog Post) [GitLab Annex Solves the Problem of Versioning Large Binaries with Git][post-3]
- (GitLab Docs) [Git Annex](../git_annex.md)
- (GitLab Docs) [Git LFS](manage_large_binaries_with_git_lfs.md)

[annex-direct]: https://git-annex.branchable.com/direct_mode/
[bkp-ext-drive]: https://www.thomas-krenn.com/en/wiki/Git-annex_Repository_on_an_External_Hard_Drive
[Git Annex]: http://git-annex.branchable.com/
[Git LFS]: https://git-lfs.github.com/
[install-lfs]: https://git-lfs.github.com/
[issue-remove-annex]: https://gitlab.com/gitlab-org/gitlab-ee/issues/1648
[lfs-track]: https://about.gitlab.com/2017/01/30/getting-started-with-git-lfs-tutorial/#tracking-files-with-lfs
[post-1]: https://about.gitlab.com/2017/01/30/getting-started-with-git-lfs-tutorial/
[post-2]: https://about.gitlab.com/2015/11/23/announcing-git-lfs-support-in-gitlab/
[post-3]: https://about.gitlab.com/2015/02/17/gitlab-annex-solves-the-problem-of-versioning-large-binaries-with-git/
[uninit]: https://git-annex.branchable.com/git-annex-uninit/