author Lin Jen-Shin <godfat@godfat.org> 2017-07-20 21:02:07 +0800
committer Lin Jen-Shin <godfat@godfat.org> 2017-08-08 19:09:56 +0800
commit 8e2350ae9514f4b296e1717ca26a6033e9d2aca8 (patch)
tree b280999d1eb363cab0813a7d150a4d47bdabd609
parent feb8974cc87455328dea708be556e41b59e8ba26 (diff)
Raise encoding confidence threshold to 50
It is recommended that we set this to 50:
https://gitlab.com/gitlab-org/gitlab-ce/issues/35098#note_35036746

In this particular issue, the detected confidence was 42 for Shift JIS, but the content was in fact UTF-8 with a single bad character. In this case we shouldn't treat it as Shift JIS; we should treat it as UTF-8 and remove the invalid bytes. Treating it as Shift JIS would corrupt the whole data.

Unfortunately, the diff that triggered this cannot be disclosed, so we can't use it as a test example.
-rw-r--r-- lib/gitlab/encoding_helper.rb 2
1 files changed, 1 insertions, 1 deletions
diff --git a/lib/gitlab/encoding_helper.rb b/lib/gitlab/encoding_helper.rb
index 781f9c56a42..8ddc91e341d 100644
--- a/lib/gitlab/encoding_helper.rb
+++ b/lib/gitlab/encoding_helper.rb
@@ -11,7 +11,7 @@ module Gitlab
     # obscure encoding with low confidence.
     # There is a lot more info with this merge request:
     # https://gitlab.com/gitlab-org/gitlab_git/merge_requests/77#note_4754193
-    ENCODING_CONFIDENCE_THRESHOLD = 40
+    ENCODING_CONFIDENCE_THRESHOLD = 50
 
     def encode!(message)
       return nil unless message.respond_to? :force_encoding