summaryrefslogtreecommitdiff
path: root/lib/gitlab/conflict
diff options
context:
space:
mode:
authorSean McGivern <sean@gitlab.com>2017-03-14 14:06:26 +0000
committerSean McGivern <sean@gitlab.com>2017-03-15 11:18:29 +0000
commit96c77bf77550813c30b223448763d14977749e84 (patch)
tree2ce6c32c757c90aeb3256803dfbc37cdd7396305 /lib/gitlab/conflict
parent181c2582fbba4cdb276709b3f4920fab18e1e962 (diff)
downloadgitlab-ce-96c77bf77550813c30b223448763d14977749e84.tar.gz
Allow resolving conflicts with non-ASCII charsallow-resolving-conflicts-in-utf-8
We wanted to check that the text could be encoded as JSON, because conflict resolutions are passed back and forth in that format, so the file itself must be UTF-8. However, all strings from the repository come back without an encoding from Rugged, making them ASCII_8BIT. We force to UTF-8, and reject if it's invalid. This still leaves the problem of a file that 'looks like' UTF-8 (contains valid UTF-8 byte sequences), but isn't. However: 1. If the conflicts contain the problem bytes, the user will see that the file isn't displayed correctly. 2. If the problem bytes are outside of the conflict area, then we will write back the same bytes when we resolve the conflicts, even though we though the encoding was UTF-8.
Diffstat (limited to 'lib/gitlab/conflict')
-rw-r--r--lib/gitlab/conflict/parser.rb8
1 files changed, 3 insertions, 5 deletions
diff --git a/lib/gitlab/conflict/parser.rb b/lib/gitlab/conflict/parser.rb
index d3524c338ee..84f9ecd3d23 100644
--- a/lib/gitlab/conflict/parser.rb
+++ b/lib/gitlab/conflict/parser.rb
@@ -15,11 +15,9 @@ module Gitlab
raise UnmergeableFile if text.blank? # Typically a binary file
raise UnmergeableFile if text.length > 200.kilobytes
- begin
- text.to_json
- rescue Encoding::UndefinedConversionError
- raise UnsupportedEncoding
- end
+ text.force_encoding('UTF-8')
+
+ raise UnsupportedEncoding unless text.valid_encoding?
line_obj_index = 0
line_old = 1