author | Gabriel Scherer <gabriel.scherer@gmail.com> | 2022-04-05 16:32:21 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2022-04-05 16:32:21 +0200 |
commit | 6b7301bd093454fb6d1fc53692e4686f66d226fa | |
tree | ea6b21c7773cfd19f3f8ca3fca4134df861dad49 | |
parent | 730efbafc90c4555da6a5bbbdc488199d21d88d2 | |
runtime/HACKING.adoc: tips on debugging the runtime (#11058)
-rw-r--r-- | .gitattributes | 2 |
-rw-r--r-- | Changes | 6 |
-rw-r--r-- | HACKING.adoc | 15 |
-rw-r--r-- | runtime/HACKING.adoc | 156 |
4 files changed, 178 insertions, 1 deletions
diff --git a/.gitattributes b/.gitattributes
index 92418e066a..7b6bf2f462 100644
--- a/.gitattributes
+++ b/.gitattributes
@@ -51,7 +51,7 @@ api_docgen/*.mld typo.missing-header
 api_docgen/alldoc.tex typo.missing-header
 tools/mantis2gh_stripped.csv typo.missing-header
 
-*.adoc typo.long-line=may
+*.adoc typo.long-line=may typo.very-long-line=may
 
 # Github templates and scripts lack headers, have long lines
 /.github/** typo.missing-header typo.long-line=may typo.very-long-line=may
diff --git a/Changes b/Changes
--- a/Changes
+++ b/Changes
@@ -162,6 +162,9 @@ Working version
 
 ### Manual and documentation:
 
+- #11058: runtime/HACKING.adoc tips on debugging the runtime
+  (Gabriel Scherer, review by Enguerrand Decorne and Nicolás Ojeda Bär)
+
 ### Compiler user-interface and warnings:
 
 - #11089: Add 'since <version>' information to compiler warnings.
@@ -191,6 +194,9 @@ Working version
 - #11008, #11047: rework GC statistics in the Multicore runtime
   (Gabriel Scherer, review by Enguerrand Decorne)
 
+- #11058: basic debugging documentation in runtime/HACKING.adoc
+  (Gabriel Scherer, review by ???)
+
 ### Build system:
 
 * #10893: Remove configuration options --disable-force-safe-string and
diff --git a/HACKING.adoc b/HACKING.adoc
index 68faafb332..d751186899 100644
--- a/HACKING.adoc
+++ b/HACKING.adoc
@@ -127,6 +127,21 @@ link:typing/HACKING.adoc[].
 
 === Runtime system
 
+The low-level routines that OCaml programs use during their execution:
+garbage collection, interaction with the operating system
+(IO in particular), low-level primitives to manipulate some OCaml data
+structures, etc. Mostly implemented in C, with some rare bits of
+assembly code in architecture-specific files. The "includes"
+corresponding to the `.c` files are in the link:runtime/caml[]
+subdirectory.
+
+Some files are only used by bytecode programs, some only used by
+native-compiled programs, but most of the runtime code is
+common. (See link:runtime/Makefile[] for the list of common,
+bytecode-only and native-only source files.)
+
+See link:runtime/HACKING.adoc[].
+
 === Libraries
 
 link:stdlib/[]:: The standard library. Each file is largely
diff --git a/runtime/HACKING.adoc b/runtime/HACKING.adoc
new file mode 100644
index 0000000000..0ee84a6a5c
--- /dev/null
+++ b/runtime/HACKING.adoc
@@ -0,0 +1,156 @@
+= Tips on hacking the OCaml runtime system =
+
+== Linking a test program with the debug runtime ==
+
+Suppose you have a self-contained OCaml program `test.ml` that
+crashes, you are working on a development repository (not an installed
+version of your system). You probably want to run `test.ml` against
+the "debug runtime", which in particular activates the `CAMLassert`
+debug assertions.
+
+If you want to use the bytecode compiler:
+
+----
+# build the runtime
+make runtime -j
+
+# compile as usual
+./ocamlc.opt -nostdlib -I stdlib test.ml -o test
+
+# run with the debug runtime (ocamlrund)
+./runtime/ocamlrund ./test
+----
+
+If you want to use the native compiler:
+
+----
+# build the native runtime
+make runtimeopt -j
+
+# compile with "-runtime-variant d"
+./ocamlopt.opt -nostdlib -I stdlib -runtime-variant d -I runtime test.ml -o test
+
+./test
+----
+
+Note that the debug runtime does extra work, so it may slow down your
+program -- and sometimes make the issue you are trying to debug
+vanish.
+
+== GC messages ==
+
+The GC can send various messages about what it is doing, enabled with
+the "v" option of OCAMLRUNPARAM. Various options are more or less
+documented in
+link:https://ocaml.org/manual/runtime.html#s:ocamlrun-options[].
+You can enable all printing with
+
+----
+OCAMLRUNPARAM="v=0xffffffff" ./test
+----
+
+Note: `caml_gc_log` can be used to show log messages prefixed with the
+thread number, and it corresponds to the more precise setting
+`v=0x800`.
+
+== Heap verification ==
+
+Another useful OCAMLRUNPARAM setting is `V=1`, which enables
+additional sanity checks on the heap during major GC cycles.
+
+----
+OCAMLRUNPARAM="V=1" ./test
+----
+
+== Getting stack traces after assertion failures (Linux) ==
+
+The output of a crashing OCaml program may end up like this:
+
+----
+[03] file domain.c; line 404 ### Assertion failed: domain_state->young_start == NULL
+Aborted (core dumped)
+----
+
+The message "core dumped" indicates that some debugging information was kept on the disk.
+
+On Linux, systemd-enabled systems tend to use a systemd tool (of course!) to store core dumps.
+
+----
+# ask your system how core dumps are handled.
+$ cat /proc/sys/kernel/core_pattern
+|/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h
+----
+
+If your system is also using `systemd-coredump`, then the command
+`coredumpctl dump` will show you information about the last "core
+dump".
+
+----
+$ $ coredumpctl dump
+           PID: 678260 (Domain0)
+           UID: 1000 (gasche)
+           GID: 1000 (gasche)
+        Signal: 6 (ABRT)
+     Timestamp: Fri 2022-02-25 09:30:32 CET (4min 30s ago)
+  Command Line: ./test
+    Executable: /home/gasche/Prog/ocaml/github-max_domains/test
+ Control Group: [...]
+                [...]
+     Disk Size: 133.0K
+       Message: Process 678260 (Domain0) of user 1000 dumped core.
+
+                Stack trace of thread 678266:
+                #0 0x00007f60ee4842a2 raise (libc.so.6 + 0x3d2a2)
+                #1 0x00007f60ee46d8a4 abort (libc.so.6 + 0x268a4)
+                #2 0x0000000000475022 n/a (/home/gasche/Prog/ocaml/github-max_domains/test + 0x75022)
+Refusing to dump core to tty (use shell redirection or specify --output).
+----
+
+You can get a full backtrace using `echo bt | coredumpctl debug`:
+
+----
+$ echo bt | coredumpctl debug
+[...]
+Core was generated by `./test'.
+Program terminated with signal SIGABRT, Aborted.
+#0 0x00007f60ee4842a2 in raise () from /lib64/libc.so.6
+[Current thread is 1 (Thread 0x7f60d77fe640 (LWP 678266))]
+Missing separate debuginfos, use: dnf debuginfo-install glibc-2.33-20.fc34.x86_64
+(gdb) #0 0x00007f60ee4842a2 in raise () from /lib64/libc.so.6
+#1 0x00007f60ee46d8a4 in abort () from /lib64/libc.so.6
+#2 0x0000000000475022 in caml_failed_assert (
+    expr=expr@entry=0x488498 "domain_state->young_start == NULL",
+    file_os=file_os@entry=0x488218 "domain.c", line=line@entry=404) at misc.c:56
+#3 0x0000000000461831 in caml_free_minor_heap () at domain.c:404
+#4 0x000000000046237b in caml_reallocate_minor_heap (wsize=wsize@entry=786432) at domain.c:469
+#5 0x0000000000474404 in caml_set_minor_heap_size (wsize=wsize@entry=786432) at minor_gc.c:130
+#6 0x00000000004696b3 in caml_gc_set (v=<optimized out>) at gc_ctrl.c:222
+#7 <signal handler called>
+#8 0x000000000042a3b2 in camlTest__set_gc_280 () at test.ml:17
+#9 0x000000000042a818 in camlTest__fun_529 () at test.ml:39
+#10 0x000000000044947a in camlStdlib__Domain__body_694 () at domain.ml:204
+#11 <signal handler called>
+#12 0x000000000045fe38 in caml_callback_exn (closure=<optimized out>, arg=<optimized out>, arg@entry=1) at callback.c:169
+#13 0x0000000000460369 in caml_callback (closure=<optimized out>, arg=arg@entry=1) at callback.c:253
+#14 0x0000000000461f6a in domain_thread_func (v=0x7ffdd7357bb0) at domain.c:1034
+#15 0x00007f60ee61f299 in start_thread () from /lib64/libpthread.so.0
+#16 0x00007f60ee547353 in clone () from /lib64/libc.so.6
+(gdb) quit
+----
+
+== Using `rr` for deterministic replay debugging ==
+
+There is a lot of information on how to use `rr` to debug the OCaml
+runtime on the OCaml Multicore wiki:
+link:https://github.com/ocaml-multicore/ocaml-multicore/wiki/Debugging-the-OCaml-Multicore-runtime#rr[].
+
+TODO: it would be nice to migrate some information here.
+
+== Compiling with sanitizers ==
+
+TODO: I would be curious to know!
+
+(For the brave there are some scripts in
+link:../tools/ci/inria/sanitizers/script[], but you probably don't
+want to run them directly, in particular they will `git clean -xfd`,
+destroying changed/uncommited files in your development repository!)
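
A minimal sketch, not part of this commit, of how the two `OCAMLRUNPARAM` settings documented in the new `runtime/HACKING.adoc` (`v=0xffffffff` for GC messages and `V=1` for heap verification) might be combined in a single run. It assumes the comma-separated `OCAMLRUNPARAM` syntax described in the OCaml manual; `join_runparam` is a hypothetical helper name used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not in the OCaml tree): join its arguments with
# commas, the separator assumed here for OCAMLRUNPARAM settings.
join_runparam() {
    # "$*" joins the arguments with the first character of IFS;
    # the subshell keeps the IFS change local to this function.
    ( IFS=,; printf '%s\n' "$*" )
}

# GC messages plus heap verification, the two settings from the patch.
params=$(join_runparam "v=0xffffffff" "V=1")

# Print the command line instead of running it, since ./test only exists
# after the compile steps from the patch have been carried out.
echo "OCAMLRUNPARAM=\"$params\" ./test"
```

Once `./test` has been built against the debug runtime as described in the patch, the printed command can be pasted into the shell as is.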