From aa32b5f398472074bb2b8c5b037c867d470f215b Mon Sep 17 00:00:00 2001
From: Karl Williamson
Date: Fri, 3 Mar 2023 05:00:13 -0700
Subject: Fix my_strftime() upper space limit

The comments said that 100:1 expansion factor had long been sufficient.
But it turns out that was wrong; there are locales with a higher ratio,
that we just didn't notice were failing. This commit adds comments and
ups the ratio to 2000:1
---
 locale.c | 28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/locale.c b/locale.c
index 7943a7734d..d1410aa007 100644
--- a/locale.c
+++ b/locale.c
@@ -4906,19 +4906,29 @@ and LC_TIME are not the same locale.
         }
 
         /* There are several possible reasons for a 0 return code for a
-         * non-empty format, and they are not trivial to tease apart. What we
-         * do is to assume that the reason is not enough space in the buffer,
-         * so increase it and try again. */
+         * non-empty format, and they are not trivial to tease apart. This
+         * issue is a known bug in the strftime() API. What we do to cope is
+         * to assume that the reason is not enough space in the buffer, so
+         * increase it and try again. */
         bufsize *= 2;
 
         /* But don't just keep increasing the size indefinitely. Stop when it
          * becomes obvious that the reason for failure is something besides not
-         * enough space. This heuristic has long been in effect successfully.
-         * */
-    } while (bufsize < 100 * fmtlen);
-
-    /* Here, strftime() returned 0, and it wasn't for lack of space. There
-     * are two possible reasons:
+         * enough space. The most likely largest expanding format is %c. On
+         * khw's Linux box, the maximum result of this is 67 characters, in the
+         * km_KH locale. If a new script comes along that uses 4 UTF-8 bytes
+         * per character, and with a similar expansion factor, that would be a
+         * 268:2 byte ratio, or a bit more than 128:1 = 2**7:1. Some strftime
+         * implementations allow you to say %1000c to pad to 1000 bytes. This
+         * shows that it is impossible to implement this without a heuristic
+         * (that can fail). But it indicates we need to be generous in the
+         * upper limit before failing. The previous heuristic used was too
+         * stingy. Since the size doubles per iteration, it doesn't take many
+         * to reach the limit */
+    } while (bufsize < ((1 << 11) + 1) * fmtlen);
+
+    /* Here, strftime() returned 0, and it likely wasn't for lack of space.
+     * There are two possible reasons:
      *
      * First is that the result is legitimately 0 length. This can happen
      * when the format is precisely "%p". That is the only documented format
-- 
cgit v1.2.1
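
The grow-and-retry heuristic described in the new comments can be sketched on its own, outside of Perl. The following is a minimal illustration only, not the code in locale.c: the helper name format_time_bounded(), the MAX_EXPANSION macro, and the 64-byte starting size are assumptions made for the example. Only the ((1 << 11) + 1) factor (2049) and the doubling step come from the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Assumed name for the cap on output bytes per format byte; the value
 * mirrors the ((1 << 11) + 1) factor used in the patch. */
#define MAX_EXPANSION ((1 << 11) + 1)

/* Hypothetical helper (not Perl's my_strftime()): format 'tm' according to
 * 'fmt' into a malloc'd buffer.  Returns NULL on allocation failure or if
 * strftime() still returns 0 once the size cap is reached; a NULL therefore
 * does not say *why* strftime() returned 0.  The caller frees the result. */
char *
format_time_bounded(const char *fmt, const struct tm *tm)
{
    assert(fmt != NULL && tm != NULL);

    size_t fmtlen = strlen(fmt);
    if (fmtlen == 0) {
        /* An empty format legitimately yields an empty string. */
        char *empty = malloc(1);
        if (empty != NULL)
            empty[0] = '\0';
        return empty;
    }

    size_t bufsize = 64;        /* assumed starting size */
    char *buf = NULL;

    do {
        char *grown = realloc(buf, bufsize);
        if (grown == NULL) {
            free(buf);
            return NULL;
        }
        buf = grown;

        /* strftime() returns 0 both for "buffer too small" and for a
         * legitimately empty result, so 0 is ambiguous here. */
        size_t len = strftime(buf, bufsize, fmt, tm);
        if (len > 0)
            return buf;

        /* Assume lack of space was the reason, and try a bigger buffer. */
        bufsize *= 2;
    } while (bufsize < MAX_EXPANSION * fmtlen);

    /* The cap was reached: lack of space is no longer a plausible cause. */
    free(buf);
    return NULL;
}
```

Because the size doubles on every pass, going from a 100:1 cap to a 2049:1 cap costs only a handful of extra iterations in the failure case. A caller that needs to tell an empty "%p" result apart from a real error still has to handle the NULL return separately, as the comment following the loop in the patch goes on to discuss.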