author Gabriel Scherer <gabriel.scherer@gmail.com> 2022-04-05 16:32:21 +0200
committer GitHub <noreply@github.com> 2022-04-05 16:32:21 +0200
commit 6b7301bd093454fb6d1fc53692e4686f66d226fa
tree ea6b21c7773cfd19f3f8ca3fca4134df861dad49
parent 730efbafc90c4555da6a5bbbdc488199d21d88d2
runtime/HACKING.adoc: tips on debugging the runtime (#11058)
-rw-r--r-- .gitattributes 2
-rw-r--r-- Changes 6
-rw-r--r-- HACKING.adoc 15
-rw-r--r-- runtime/HACKING.adoc 156
4 files changed, 178 insertions, 1 deletions
diff --git a/.gitattributes b/.gitattributes
index 92418e066a..7b6bf2f462 100644
--- a/.gitattributes
+++ b/.gitattributes
@@ -51,7 +51,7 @@ api_docgen/*.mld typo.missing-header
api_docgen/alldoc.tex typo.missing-header
tools/mantis2gh_stripped.csv typo.missing-header
-*.adoc typo.long-line=may
+*.adoc typo.long-line=may typo.very-long-line=may
# Github templates and scripts lack headers, have long lines
/.github/** typo.missing-header typo.long-line=may typo.very-long-line=may
diff --git a/Changes b/Changes
index 4eff52f2ad..9debeefe74 100644
--- a/Changes
+++ b/Changes
@@ -162,6 +162,9 @@ Working version
### Manual and documentation:
+- #11058: runtime/HACKING.adoc tips on debugging the runtime
+ (Gabriel Scherer, review by Enguerrand Decorne and Nicolás Ojeda Bär)
+
### Compiler user-interface and warnings:
- #11089: Add 'since <version>' information to compiler warnings.
@@ -191,6 +194,9 @@ Working version
- #11008, #11047: rework GC statistics in the Multicore runtime
(Gabriel Scherer, review by Enguerrand Decorne)
+- #11058: basic debugging documentation in runtime/HACKING.adoc
+ (Gabriel Scherer, review by ???)
+
### Build system:
* #10893: Remove configuration options --disable-force-safe-string and
diff --git a/HACKING.adoc b/HACKING.adoc
index 68faafb332..d751186899 100644
--- a/HACKING.adoc
+++ b/HACKING.adoc
@@ -127,6 +127,21 @@ link:typing/HACKING.adoc[].
=== Runtime system
+The low-level routines that OCaml programs use during their execution:
+garbage collection, interaction with the operating system
+(IO in particular), low-level primitives to manipulate some OCaml data
+structures, etc. Mostly implemented in C, with some rare bits of
+assembly code in architecture-specific files. The "includes"
+corresponding to the `.c` files are in the link:runtime/caml[]
+subdirectory.
+
+Some files are only used by bytecode programs, some only used by
+native-compiled programs, but most of the runtime code is
+common. (See link:runtime/Makefile[] for the list of common,
+bytecode-only and native-only source files.)
+
+See link:runtime/HACKING.adoc[].
+
=== Libraries
link:stdlib/[]:: The standard library. Each file is largely
diff --git a/runtime/HACKING.adoc b/runtime/HACKING.adoc
new file mode 100644
index 0000000000..0ee84a6a5c
--- /dev/null
+++ b/runtime/HACKING.adoc
@@ -0,0 +1,156 @@
+= Tips on hacking the OCaml runtime system =
+
+== Linking a test program with the debug runtime ==
+
+Suppose you have a self-contained OCaml program `test.ml` that
+crashes, and you are working in a development repository (not an
+installed version of OCaml). You probably want to run `test.ml`
+against the "debug runtime", which in particular activates the
+`CAMLassert` debug assertions.
+
+If you want to use the bytecode compiler:
+
+----
+# build the runtime
+make runtime -j
+
+# compile as usual
+./ocamlc.opt -nostdlib -I stdlib test.ml -o test
+
+# run with the debug runtime (ocamlrund)
+./runtime/ocamlrund ./test
+----
+
+If you want to use the native compiler:
+
+----
+# build the native runtime
+make runtimeopt -j
+
+# compile with "-runtime-variant d"
+./ocamlopt.opt -nostdlib -I stdlib -runtime-variant d -I runtime test.ml -o test
+
+./test
+----
+
+Note that the debug runtime does extra work, so it may slow down your
+program -- and sometimes make the issue you are trying to debug
+vanish.
+
+== GC messages ==
+
+The GC can print various messages about what it is doing, enabled with
+the `v` option of `OCAMLRUNPARAM`. Its value is a sum of flags; the
+individual flags are documented (more or less) in
+link:https://ocaml.org/manual/runtime.html#s:ocamlrun-options[].
+You can enable all messages with
+
+----
+OCAMLRUNPARAM="v=0xffffffff" ./test
+----
+
+Note: in the runtime's C code, `caml_gc_log` can be used to print log
+messages prefixed with the thread number; these messages are enabled
+by the more specific setting `v=0x800`.
+
+== Heap verification ==
+
+Another useful OCAMLRUNPARAM setting is `V=1`, which enables
+additional sanity checks on the heap during major GC cycles.
+
+----
+OCAMLRUNPARAM="V=1" ./test
+----
+
+== Getting stack traces after assertion failures (Linux) ==
+
+The output of a crashing OCaml program may end like this:
+
+----
+[03] file domain.c; line 404 ### Assertion failed: domain_state->young_start == NULL
+Aborted (core dumped)
+----
+
+The message "core dumped" indicates that a core dump (a snapshot of the crashed process, usable with a debugger) was saved to disk.
+
+On Linux, systemd-enabled systems tend to use a systemd tool (of course!) to store core dumps.
+
+----
+# ask your system how core dumps are handled.
+$ cat /proc/sys/kernel/core_pattern
+|/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h
+----
+
+If your system is also using `systemd-coredump`, then the command
+`coredumpctl dump` will show you information about the last "core
+dump".
+
+----
+$ coredumpctl dump
+ PID: 678260 (Domain0)
+ UID: 1000 (gasche)
+ GID: 1000 (gasche)
+ Signal: 6 (ABRT)
+ Timestamp: Fri 2022-02-25 09:30:32 CET (4min 30s ago)
+ Command Line: ./test
+ Executable: /home/gasche/Prog/ocaml/github-max_domains/test
+ Control Group: [...]
+ [...]
+ Disk Size: 133.0K
+ Message: Process 678260 (Domain0) of user 1000 dumped core.
+
+ Stack trace of thread 678266:
+ #0 0x00007f60ee4842a2 raise (libc.so.6 + 0x3d2a2)
+ #1 0x00007f60ee46d8a4 abort (libc.so.6 + 0x268a4)
+ #2 0x0000000000475022 n/a (/home/gasche/Prog/ocaml/github-max_domains/test + 0x75022)
+Refusing to dump core to tty (use shell redirection or specify --output).
+----
+
+You can get a full backtrace using `echo bt | coredumpctl debug`:
+
+----
+$ echo bt | coredumpctl debug
+[...]
+Core was generated by `./test'.
+Program terminated with signal SIGABRT, Aborted.
+#0 0x00007f60ee4842a2 in raise () from /lib64/libc.so.6
+[Current thread is 1 (Thread 0x7f60d77fe640 (LWP 678266))]
+Missing separate debuginfos, use: dnf debuginfo-install glibc-2.33-20.fc34.x86_64
+(gdb) #0 0x00007f60ee4842a2 in raise () from /lib64/libc.so.6
+#1 0x00007f60ee46d8a4 in abort () from /lib64/libc.so.6
+#2 0x0000000000475022 in caml_failed_assert (
+ expr=expr@entry=0x488498 "domain_state->young_start == NULL",
+ file_os=file_os@entry=0x488218 "domain.c", line=line@entry=404) at misc.c:56
+#3 0x0000000000461831 in caml_free_minor_heap () at domain.c:404
+#4 0x000000000046237b in caml_reallocate_minor_heap (wsize=wsize@entry=786432) at domain.c:469
+#5 0x0000000000474404 in caml_set_minor_heap_size (wsize=wsize@entry=786432) at minor_gc.c:130
+#6 0x00000000004696b3 in caml_gc_set (v=<optimized out>) at gc_ctrl.c:222
+#7 <signal handler called>
+#8 0x000000000042a3b2 in camlTest__set_gc_280 () at test.ml:17
+#9 0x000000000042a818 in camlTest__fun_529 () at test.ml:39
+#10 0x000000000044947a in camlStdlib__Domain__body_694 () at domain.ml:204
+#11 <signal handler called>
+#12 0x000000000045fe38 in caml_callback_exn (closure=<optimized out>, arg=<optimized out>, arg@entry=1) at callback.c:169
+#13 0x0000000000460369 in caml_callback (closure=<optimized out>, arg=arg@entry=1) at callback.c:253
+#14 0x0000000000461f6a in domain_thread_func (v=0x7ffdd7357bb0) at domain.c:1034
+#15 0x00007f60ee61f299 in start_thread () from /lib64/libpthread.so.0
+#16 0x00007f60ee547353 in clone () from /lib64/libc.so.6
+(gdb) quit
+----
+
+== Using `rr` for deterministic replay debugging ==
+
+There is a lot of information on how to use `rr` to debug the OCaml
+runtime on the OCaml Multicore wiki:
+link:https://github.com/ocaml-multicore/ocaml-multicore/wiki/Debugging-the-OCaml-Multicore-runtime#rr[].
+
+TODO: it would be nice to migrate some information here.
+
+== Compiling with sanitizers ==
+
+TODO: I would be curious to know!
+
+(For the brave, there are some scripts in
+link:../tools/ci/inria/sanitizers/script[], but you probably don't
+want to run them directly: in particular, they run `git clean -xfd`,
+destroying changed/uncommitted files in your development repository!)