summaryrefslogtreecommitdiff
path: root/cygwin
diff options
context:
space:
mode:
authorKarl Williamson <khw@cpan.org>2016-04-09 11:55:58 -0600
committerKarl Williamson <khw@cpan.org>2016-04-09 13:59:24 -0600
commita0b5329765c5b5a7cbb69a2d628d4cea8f0323c1 (patch)
tree7705515b15168f8ed0a755e8c12af03df4478964 /cygwin
parent929e12133425199fce3fe026156c10876ac0cbb8 (diff)
downloadperl-a0b5329765c5b5a7cbb69a2d628d4cea8f0323c1.tar.gz
PATCH: [perl #127708] segfault in "$!" in threads
This was showing up on Darwin because its setlocale is particularly not thread safe. But the problem is more generic. Using locales just isn't a good idea in a threaded application, as locales are process-wide, not thread-specific. Calling setlocale() changes the locale of all threads at the same time. Further the return of setlocale() is a pointer to internal storage. If you call setlocale() just to learn what it currently is without actually changing the locale, there is no guarantee that another thread won't interrupt your thread, switching the locale to something else before you've had a chance to copy it somewhere else for safekeeping, and the internal storage may have been freed during that interruption, leading to things like segfaults. This is a problem that has been around in the locale handling code for a long time. I don't know why it hasn't shown up before, or maybe it has and is not reproducible because it's timing dependent, and so any problems didn't have tickets written for them, or were rejected as not reproducible. But the problem has been made worse in recent releases. Only fairly recently has perl changed so this problem can occur in programs that don't use locale explicitly: ones that don't 'use locale' nor call setlocale(). This ticket is for such a program that gets a locale-related segfault, without ever touching locales itself. I have done an audit of the perl source, looking for all such occurrences, and this patch fixes all of them that I found. The only other ones, besides "$!", is in converting to/from UTF-8 in cygwin.c. In all such cases, perl briefly switches the locale, does an operation, then switches back. The solution here is to add mutexes to make these areas of code uninterruptible critical sections, so that they can rely on having the locale be what they expect it to be during the entirety of the operation, and can't have a setlocale() from another thread free internal storage. But this is not a general solution. A thread executing these sections can interrupt some other thread doing a setlocale() and zap that. However, we have long cautioned against doing setlocales() in a thread, and that caution was strengthened in a commit made yesterday, fc82b82ef4784a38877f35f56ee16f14934460ce. The current commit should make safe all threaded programs that don't use locales explicitly. It's too close to the 5.24 release to do the rearchitecting required for a general solution. That would involve adding more critical sections. POSIX 2008 introduced new locale handling functions that are thread-safe, and affect only a single thread, and don't require mutexes. The ultimate solution would be to use those tools where available, and to hide from the outer code which set is being used. Thus, perl would be thread-safe on such platforms, while remaining problematic on older ones, though fixed so segfaults wouldn't occur. Tony Cook believes we could emulate the newer behavior on all platforms at a significant performance penalty. I think this would require a lot of code, and suspect there would be glitches in it for XS code. But he may have some ideas about how to do it simply. In any case, this has to wait until post 5.24. Three other notes: It seems to me that the cygwin code could be replaced by equivalent code that doesn't use locales at all. The comments in the source seem to even indicate that. I'll look into doing this in 5.25. Another possible reason that this hasn't shown up in earlier perls is that the problems may have been entirely affecting I/O operations and there are already mutexes involving I/O, and so those could be inadvertently protecting from, or at least minimizing, the problems found here. I haven't investigated to verify this. This commit doesn't add a test. I am asking on p5p for assistance in writing one
Diffstat (limited to 'cygwin')
-rw-r--r--cygwin/cygwin.c41
1 files changed, 37 insertions, 4 deletions
diff --git a/cygwin/cygwin.c b/cygwin/cygwin.c
index 59aa730448..24b278f2fd 100644
--- a/cygwin/cygwin.c
+++ b/cygwin/cygwin.c
@@ -154,7 +154,15 @@ wide_to_utf8(const wchar_t *wbuf)
{
char *buf;
int wlen = 0;
- char *oldlocale = setlocale(LC_CTYPE, NULL);
+ char *oldlocale;
+ dVAR;
+
+ /* Here and elsewhere in this file, we have a critical section to prevent
+ * another thread from changing the locale out from under us. XXX But why
+ * not just use uvchr_to_utf8? */
+ LOCALE_LOCK;
+
+ oldlocale = setlocale(LC_CTYPE, NULL);
setlocale(LC_CTYPE, "utf-8");
/* uvchr_to_utf8(buf, chr) or Encoding::_bytes_to_utf8(sv, "UCS-2BE"); */
@@ -164,6 +172,9 @@ wide_to_utf8(const wchar_t *wbuf)
if (oldlocale) setlocale(LC_CTYPE, oldlocale);
else setlocale(LC_CTYPE, "C");
+
+ LOCALE_UNLOCK;
+
return buf;
}
@@ -172,8 +183,13 @@ utf8_to_wide(const char *buf)
{
wchar_t *wbuf;
mbstate_t mbs;
- char *oldlocale = setlocale(LC_CTYPE, NULL);
+ char *oldlocale;
int wlen = sizeof(wchar_t)*strlen(buf);
+ dVAR;
+
+ LOCALE_LOCK;
+
+ oldlocale = setlocale(LC_CTYPE, NULL);
setlocale(LC_CTYPE, "utf-8");
wbuf = (wchar_t *) safemalloc(wlen);
@@ -182,6 +198,9 @@ utf8_to_wide(const char *buf)
if (oldlocale) setlocale(LC_CTYPE, oldlocale);
else setlocale(LC_CTYPE, "C");
+
+ LOCALE_UNLOCK;
+
return wbuf;
}
#endif /* cygwin 1.7 */
@@ -280,7 +299,12 @@ XS(XS_Cygwin_win_to_posix_path)
wchar_t *wbuf = (wchar_t *) safemalloc(wlen);
if (!IN_BYTES) {
mbstate_t mbs;
- char *oldlocale = setlocale(LC_CTYPE, NULL);
+ char *oldlocale;
+ dVAR;
+
+ LOCALE_LOCK;
+
+ oldlocale = setlocale(LC_CTYPE, NULL);
setlocale(LC_CTYPE, "utf-8");
/* utf8_to_uvchr_buf(src_path, src_path + wlen, wpath) or Encoding::_utf8_to_bytes(sv, "UCS-2BE"); */
wlen = mbsrtowcs(wpath, (const char**)&src_path, wlen, &mbs);
@@ -288,6 +312,8 @@ XS(XS_Cygwin_win_to_posix_path)
err = cygwin_conv_path(what, wpath, wbuf, wlen);
if (oldlocale) setlocale(LC_CTYPE, oldlocale);
else setlocale(LC_CTYPE, "C");
+
+ LOCALE_UNLOCK;
} else { /* use bytes; assume already ucs-2 encoded bytestream */
err = cygwin_conv_path(what, src_path, wbuf, wlen);
}
@@ -365,7 +391,12 @@ XS(XS_Cygwin_posix_to_win_path)
int wlen = sizeof(wchar_t)*(len + 260 + 1001);
wchar_t *wpath = (wchar_t *) safemalloc(sizeof(wchar_t)*len);
wchar_t *wbuf = (wchar_t *) safemalloc(wlen);
- char *oldlocale = setlocale(LC_CTYPE, NULL);
+ char *oldlocale;
+ dVAR;
+
+ LOCALE_LOCK;
+
+ oldlocale = setlocale(LC_CTYPE, NULL);
setlocale(LC_CTYPE, "utf-8");
if (!IN_BYTES) {
mbstate_t mbs;
@@ -388,6 +419,8 @@ XS(XS_Cygwin_posix_to_win_path)
wcsrtombs(win_path, (const wchar_t **)&wbuf, wlen, NULL);
if (oldlocale) setlocale(LC_CTYPE, oldlocale);
else setlocale(LC_CTYPE, "C");
+
+ LOCALE_UNLOCK;
} else {
int what = absolute_flag ? CCP_POSIX_TO_WIN_A : CCP_POSIX_TO_WIN_A | CCP_RELATIVE;
win_path = (char *) safemalloc(len + 260 + 1001);