summaryrefslogtreecommitdiff
path: root/gdb/dwarf2/line-header.h
Commit message (Collapse)AuthorAgeFilesLines
* gdb: move call site types to call-site.hSimon Marchi2023-01-201-2/+0
| | | | | | | | | | | | | | | | | | I hesitated between putting the file in the dwarf2 directory (as gdb/dwarf2/call-site.h) or in the common directory (as gdb/call-site.h). The concept of call site is not DWARF-specific, another debug info reader could provide this information. But as it is, the implementation is a bit DWARF-specific, as one form it can take is a DWARF expression and parameters can be defined using a DWARF register number. So I ended up choosing to put it under dwarf2/. If another debug info reader ever wants to provide call site information, we can introduce a layer of abstraction between the "common" call site and the "dwarf2" call site. The copyright start year comes from the date `struct call_site` was introduced. Change-Id: I1cd84aa581fbbf729edc91b20f7d7a6e0377014d Reviewed-By: Bruno Larsen <blarsen@redhat.com>
* Update copyright year range in header of all files managed by GDBJoel Brobecker2023-01-011-1/+1
| | | | | | | This commit is the result of running the gdb/copyright.py script, which automated the update of the copyright year range for all source files managed by the GDB project to be updated to include year 2023.
* gdb: add "id" fields to identify symtabs and subfilesSimon Marchi2022-07-291-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Printing macros defined in the main source file doesn't work reliably using various toolchains, especially when DWARF 5 is used. For example, using the binaries produced by either of these commands: $ gcc --version gcc (GCC) 11.2.0 $ ld --version GNU ld (GNU Binutils) 2.38 $ gcc test.c -g3 -gdwarf-5 $ clang --version clang version 13.0.1 $ clang test.c -gdwarf-5 -fdebug-macro I get: $ ./gdb -nx -q --data-directory=data-directory a.out (gdb) start Temporary breakpoint 1 at 0x111d: file test.c, line 6. Starting program: /home/simark/build/binutils-gdb-one-target/gdb/a.out Temporary breakpoint 1, main () at test.c:6 6 return ZERO; (gdb) p ZERO No symbol "ZERO" in current context. When starting to investigate this (taking the gcc-compiled binary as an example), we see that GDB fails to look up the appropriate macro scope when evaluating the expression. While stopped in macro_lookup_inclusion: (top-gdb) p name $1 = 0x62100011a980 "test.c" (top-gdb) p source.filename $2 = 0x62100011a9a0 "/home/simark/build/binutils-gdb-one-target/gdb/test.c" `source` is the macro_source_file that we would expect GDB to find. `name` comes from the symtab::filename field of the symtab we are stopped in. GDB doesn't find the appropriate macro_source_file because the name of the macro_source_file doesn't match exactly the name of the symtab. The name of the main symtab comes from the compilation unit's DW_AT_name, passed to the buildsym_compunit's constructor: https://gitlab.com/gnutools/binutils-gdb/-/blob/4815d6125ec580cc02a1094d61b8c9d1cc83c0a1/gdb/dwarf2/read.c#L10627-10630 The contents of DW_AT_name, in this case, is "test.c". It is typically (what I witnessed all compilers do) the same string that was passed to the compiler on the command-line. The name of the macro_source_file comes from the line number program header's file table, from the call to the line_header::file_file_name method: https://gitlab.com/gnutools/binutils-gdb/-/blob/4815d6125ec580cc02a1094d61b8c9d1cc83c0a1/gdb/dwarf2/macro.c#L54-65 line_header::file_file_name prepends the directory path that the file entry refers to, in the file table (if the file name is not already absolute). In this case, the file name is "test.c", appended to the directory "/home/simark/build/binutils-gdb-one-target/gdb". Because the symtab's name is not created the same way as the macro_source_file's name is created, we get this mismatch. GDB fails to find the appropriate macro scope for the symtab, and we can't print macros when stopped in that symtab. To make this work, we must ensure that paths produced in these two ways end up identical. This can be tricky because of the different ways a path can be passed to the compiler by the user. Another thing to consider is that while the main symtab's name (or subfile, before it becomes a symtab) is created using DW_AT_name, the main symtab is also referred to using its entry in the line table header's file table, when processing the line table. We must therefore ensure that the same name is produced in both cases, so that a call to "start_subfile" for the main subfile will correctly find the already-created subfile, created by buildsym_compunit's constructor. If we fail to do that, things still often work, because of a fallback: the watch_main_source_file_lossage method. This method determines that if the main subfile has no symbols but there exists another subfile with the same basename (e.g. "test.c") that does have symbols, it's probably because there was some filename mismatch. So it replaces the main subfile with that other subfile. I think that heuristic is useful as a last effort to work around any bug or bad debug info, but I don't think we should design things such as to rely on it. It's a heuristic, it can get things wrong. So in my search for a fix, it is important that given some good debug info, we don't end up relying on that for things to work. A first attempt at fixing this was to try to prepend the compilation directory here or not prepend it there. In practice, because of all the possible combinations of debug info the compilers produce, it was not possible to get something that would produce reliable, consistent paths. Another attempt at fixing this was to make both macro_source_file objects and symtab objects use the most complete form of path possible. That means to prepend directories at least until we get an absolute path. In theory, we should end up with the same path in all cases. This generally worked, but because it changed the symtab names, it resulted in user-visible changes (for example, paths to source files in Breakpoint hit messages becoming always absolute). I didn't find this very good, first because there is a "set filename-display" setting that lets the user control how they want the paths to be displayed, and that would suddenly make this setting completely ineffective (although even today, it is a bit dependent on the debug info). Second, it would require a good amount of testsuite tweaks to make tests accept these suddenly absolute paths. This new patch is a slight variation of that: it adds a new field called "filename_for_id" in struct symtab and struct subfile, next to the existing filename field. The goal is to separate the internal ids used for finding objects from the names used for presentation. This field is used for identifying subfiles, symtabs and macro_source_files internally. For DWARF symtabs, this new field is meant to contain the "most complete possible" path, as discussed above. So for a given file, it must always be in the same form, everywhere. The existing symtab::filename field remains the one used for printing to the user, so there shouldn't be any change in how paths are printed. Changes in the core symtab files are: - Add "name_for_id" and "filename_for_id" fields to "struct subfile" and "struct symtab", next to existing "name" and "filename" fields. - Make buildsym_compunit::buildsym_compunit and buildsym_compunit::start_subfile accept a "name_for_id" parameter next to the existing "name" ones. - Make buildsym_compunit::start_subfile use "name_for_id" for looking up existing subfiles. This is the key thing for making calls to start_subfile for the main source file look up the existing subfile successfully, and avoid relying on watch_main_source_file_lossage. - Make sal_macro_scope pass "filename_for_id", rather than "filename", to macro_lookup_inclusion. This is the key thing to making the lookup work and macro printing work. Changes in the DWARF files are: - Make line_header::file_file_name return the "most complete possible" name. The only pre-existing user of this method is the macro code, to give the macro_source_file objects their name. And we now want them to have this "most complete possible" name, which will match the corresponding symtab's "filename_for_id". - Make dwarf2_cu::start_compunit_symtab pass the "most complete possible" name for the main symtab's "filename_for_id". In this context, where the info comes from the compilation unit's DW_AT_name / DW_AT_comp_dir, it means prepending DW_AT_comp_dir to DW_AT_name if DW_AT_name is not already absolute. - Change dwarf2_start_subfile to build a name_for_id for the subfile being started. The simplest way is to re-use line_header::file_file_name, since the callers always have a file_entry handy. This ensures that it will get the exact same path representation as the macro code does, for the same file (since it also uses line_header::file_file_name). - Update calls to allocate_symtab to pass the "name_for_id" from the subfile. Tests exercising all this are added by the following patch. Of all the cases I tried, the only one I found that ends up relying on watch_main_source_file_lossage is the following one: $ clang --version clang version 13.0.1 Target: x86_64-pc-linux-gnu Thread model: posix InstalledDir: /usr/bin $ clang ./test.c -g3 -O0 -gdwarf-4 $ ./gdb -nx --data-directory=data-directory -q -readnow -iex "set debug symtab-create 1" a.out ... [symtab-create] start_subfile: name = test.c, name_for_id = /home/simark/build/binutils-gdb-one-target/gdb/test.c [symtab-create] start_subfile: name = ./test.c, name_for_id = /home/simark/build/binutils-gdb-one-target/gdb/./test.c [symtab-create] start_subfile: name = ./test.c, name_for_id = /home/simark/build/binutils-gdb-one-target/gdb/./test.c [symtab-create] start_subfile: found existing symtab with name_for_id /home/simark/build/binutils-gdb-one-target/gdb/./test.c (/home/simark/build/binutils-gdb-one-target/gdb/./test.c) [symtab-create] watch_main_source_file_lossage: using subfile ./test.c as the main subfile As we can see, there are two forms used for "test.c", one with a "." and one without. This comes from the fact that the compilation unit DIE contains: DW_AT_name ("test.c") DW_AT_comp_dir ("/home/simark/build/binutils-gdb-one-target/gdb") without a ".", and the line table for that file contains: include_directories[ 1] = "." file_names[ 1]: name: "test.c" dir_index: 1 When assembling the filename from that entry, we get a ".". It is a bit unexpected that the main filename resulting from the line table header does not match exactly the name in the compilation unit. For instance, gcc uses "./test.c" for the DW_AT_name, which gives identical paths in the compilation unit and in the line table header. Similarly, with DWARF 5: $ clang ./test.c -g3 -O0 -gdwarf-5 clang create two entries that refer to the same file but are of in a different form. include_directories[ 0] = "/home/simark/build/binutils-gdb-one-target/gdb" include_directories[ 1] = "." file_names[ 0]: name: "test.c" dir_index: 0 file_names[ 1]: name: "test.c" dir_index: 1 The first file name produces a path without a "." while the second does. This is not caught by watch_main_source_file_lossage, because of dwarf_decode_lines that creates a symtab for each file entry in the line table. It therefore appears as "non-empty" to watch_main_source_file_lossage. This results in two symtabs: (gdb) maintenance info symtabs { objfile /home/simark/build/binutils-gdb-one-target/gdb/a.out ((struct objfile *) 0x613000005d00) { ((struct compunit_symtab *) 0x62100011aca0) debugformat DWARF 5 producer clang version 13.0.1 name test.c dirname /home/simark/build/binutils-gdb-one-target/gdb blockvector ((struct blockvector *) 0x621000129ec0) user ((struct compunit_symtab *) (null)) { symtab test.c ((struct symtab *) 0x62100011ad20) fullname (null) linetable ((struct linetable *) 0x0) } { symtab ./test.c ((struct symtab *) 0x62100011ad60) fullname (null) linetable ((struct linetable *) 0x621000129ef0) } } } I am not sure what is the consequence of this, but this is also what happens before my patch, so I think its acceptable to leave it as-is. To handle these two cases nicely, I think we will need a function that removes the unnecessary "." from path names, something that can be done later. Finally, I made a change in find_file_and_directory is necessary to avoid breaking test gdb.dwarf2/dw2-compdir-oldgcc.exp: info source gcc42 Without that change, we would get: (gdb) info source Current source file is /dir/d/dw2-compdir-oldgcc42.S Compilation directory is /dir/d whereas the expected result is: (gdb) info source Current source file is dw2-compdir-oldgcc42.S Compilation directory is /dir/d This test was added here: https://sourceware.org/pipermail/gdb-patches/2012-November/098144.html Long story short, GCC <= 4.2 apparently had a bug where it would generate a DW_AT_name with a full path ("/dir/d/dw2-compdir-oldgcc42.S") and no DW_AT_comp_dir. The line table has one entry with filename "dw2-compdir-oldgcc42.S", which refers to directory 0. Directory 0 normally refers to the compilation unit's comp dir, but it is non-existent in this case. This caused some symtab lookup problems, and to work around them, some workaround was added, which today reads as: if (res.get_comp_dir () == nullptr && producer_is_gcc_lt_4_3 (cu) && res.get_name () != nullptr && IS_ABSOLUTE_PATH (res.get_name ())) res.set_comp_dir (ldirname (res.get_name ())); Source: https://gitlab.com/gnutools/binutils-gdb/-/blob/6577f365ebdee7dda71cb996efa29d3714cbccd0/gdb/dwarf2/read.c#L9428-9432 It extracts an artificial DW_AT_comp_dir from DW_AT_name, if there is no DW_AT_comp_dir and DW_AT_name is absolute. Prior to my patch, a subfile would get created with filename "/dir/d/dw2-compdir-oldgcc42.S", from DW_AT_name, and another would get created with filename "dw2-compdir-oldgcc42.S" from the line table's file table. Then watch_main_source_file_lossage would kick in and merge them, keeping only the "dw2-compdir-oldgcc42.S" one: [symtab-create] start_subfile: name = /dir/d/dw2-compdir-oldgcc42.S [symtab-create] start_subfile: name = dw2-compdir-oldgcc42.S [symtab-create] start_subfile: name = dw2-compdir-oldgcc42.S [symtab-create] start_subfile: found existing symtab with name dw2-compdir-oldgcc42.S (dw2-compdir-oldgcc42.S) [symtab-create] watch_main_source_file_lossage: using subfile dw2-compdir-oldgcc42.S as the main subfile And so "info source" would show "dw2-compdir-oldgcc42.S" as the filename. With my patch applied, but without the change in find_file_and_directory, both DW_AT_name and the line table would try to start a subfile with the same filename_for_id, and there was no need for watch_main_source_file_lossage - which is what we want: [symtab-create] start_subfile: name = /dir/d/dw2-compdir-oldgcc42.S, name_for_id = /dir/d/dw2-compdir-oldgcc42.S [symtab-create] start_subfile: name = dw2-compdir-oldgcc42.S, name_for_id = /dir/d/dw2-compdir-oldgcc42.S [symtab-create] start_subfile: found existing symtab with name_for_id /dir/d/dw2-compdir-oldgcc42.S (/dir/d/dw2-compdir-oldgcc42.S) [symtab-create] start_subfile: name = dw2-compdir-oldgcc42.S, name_for_id = /dir/d/dw2-compdir-oldgcc42.S [symtab-create] start_subfile: found existing symtab with name_for_id /dir/d/dw2-compdir-oldgcc42.S (/dir/d/dw2-compdir-oldgcc42.S) But since the one with name == "/dir/d/dw2-compdir-oldgcc42.S", coming from DW_AT_name, gets created first, it wins, and the symtab ends up with "/dir/d/dw2-compdir-oldgcc42.S" as the name, "info source" shows "/dir/d/dw2-compdir-oldgcc42.S" and the test breaks. This is not wrong per-se, after all DW_AT_name is "/dir/d/dw2-compdir-oldgcc42.S", so it wouldn't be wrong to report the current source file as "/dir/d/dw2-compdir-oldgcc42.S". If you compile a file passing "/an/absolute/path.c", DW_AT_name typically contains (at least with GCC) "/an/absolute/path.c" and GDB tells you that the source file is "/an/absolute/path.c". But we can also keep the existing behavior fairly easily with a little change in find_file_and_directory. When extracting an artificial DW_AT_comp_dir from DW_AT_name, we now modify the name to just keep the file part. The result is coherent with what compilers do when you compile a file by just passing its filename ("gcc path.c -g"): DW_AT_name ("path.c") DW_AT_comp_dir ("/home/simark/build/binutils-gdb-one-target/gdb") With this change, filename_for_id is still the full name, "/dir/d/dw2-compdir-oldgcc42.S", but the filename of the subfile / symtab (what ends up shown by "info source") is just "dw2-compdir-oldgcc42.S", and that makes the test happy. Change-Id: I8b5cc4bb3052afdb172ee815c051187290566307
* gdb/dwarf: pass a file_entry to line_header::file_file_nameSimon Marchi2022-07-291-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | In the following patch, there will be some callers of file_file_name that will already have access to the file_entry object for which they want the file name. It would be inefficient to have them pass an index, only for line_header::file_file_name to re-lookup the same file_entry object. Change line_header::file_file_name to accept a file_entry object reference, instead of an index to look up. I think this change makes sense in any case. Callers that have an index can first obtain a file_entry using line_header::file_name_at or line_header::file_names. When passing a file_entry object, we can assume that the file_entry's index is valid, unlike when passing an index. So, push the special case about an invalid index to the sole current caller of file_file_name, macro_start_file. I think that error belongs there anyway, since it specifically talks about "bad file number in macro information". This requires recording the file index in the file_entry structure, so add that. Change-Id: Ic6e44c407539d92b7863d7ba82405ade17f384ad
* gdb/dwarf: pass compilation directory to line headerSimon Marchi2022-07-291-3/+23
| | | | | | | | | | | | The following patch changes line_header::file_file_name to prepend the compilation directory to the file name, if needed. For that, the line header needs to know about the compilation directory. Prepare for that by adding a constructor that takes it as a parameter, and passing the value down everywhere needed. Add a second constructor for the special case of building a line_header for doing a hash table lookup, since that case doesn't require a compilation directory value. Change-Id: Iba3ba0293e4e2d13a64b257cf9a3094684d54330
* gdb/dwarf: remove line_header::header_length fieldSimon Marchi2022-04-211-1/+0
| | | | | | This can be a local in dwarf_decode_line_header. Change-Id: I2ecf4616d1a3197bd1e81ded9f999a2da9a685af
* gdb/dwarf: remove line_header::total_length fieldSimon Marchi2022-04-211-1/+0
| | | | | | | | | | This doesn' have to be a field, it can simply be a local variable in dwarf_decode_line_header. Name the local variable "unit_length", since that's what the field in called in DWARF 4 and 5. It's always easier to follow the code with the standard on the side when we use the same terminology. Change-Id: I3ad1022afd9410b193ea11b9b5437686c1e4e633
* Delete DWARF psymtab codeTom Tromey2022-04-121-3/+0
| | | | | | | This removes the DWARF psymtab reader.
* gdb: change file_file_name to return an std::stringSimon Marchi2022-04-071-4/+2
| | | | | | | Straightforward change, return an std::string instead of a gdb::unique_xmalloc_ptr<char>. No behavior change expected. Change-Id: Ia5e94c94221c35f978bb1b7bdffbff7209e0520e
* Automatic Copyright Year update after running gdb/copyright.pyJoel Brobecker2022-01-011-1/+1
| | | | | | | | This commit brings all the changes made by running gdb/copyright.py as per GDB's Start of New Year Procedure. For the avoidance of doubt, all changes in this commits were performed by the script.
* Fix file-name handling regression with DWARF indexTom Tromey2021-07-171-7/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When run with the gdb-index or debug-names target boards, dup-psym.exp fails. This came up for me because my new DWARF scanner reuses this part of the existing index code, and so it registers as a regression. This is PR symtab/25834. Looking into this, I found that the DWARF index code here is fairly different from the psymtab code. I don't think there's a deep reason for this, and in fact, it seemed to me that the index code could simply mimic what the psymtab code already does. That is what this patch implements. The DW_AT_name and DW_AT_comp_dir are now stored in the quick file names table. This may require allocating a quick file names table even when DW_AT_stmt_list does not exist. Then, the functions that work with this data are changed to use find_source_or_rewrite, just as the psymbol code does. Finally, line_header::file_full_name is removed, as it is no longer needed. Currently, the index maintains a hash table of "quick file names". The hash table uses a deletion function to free the "real name" components when necessary. There's also a second such function to implement the forget_cached_source_info method. This bug fix patch will create a quick file name object even when there is no DW_AT_stmt_list, meaning that the object won't be entered in the hash table. So, this patch changes the memory management approach so that the entries are cleared when the per-BFD object is destroyed. (A dwarf2_per_cu_data destructor is not introduced, because we have been avoiding adding a vtable to that class.) Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=25834
* Update copyright year range in all GDB filesJoel Brobecker2021-01-011-1/+1
| | | | | | | | | This commits the result of running gdb/copyright.py as per our Start of New Year procedure... gdb/ChangeLog Update copyright year range in copyright header of all GDB files.
* gdb: rename dwarf2_per_objfile variables/fields to per_objfileSimon Marchi2020-05-291-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While doing the psymtab-sharing patchset, I avoided renaming variables unnecessarily to avoid adding noise to patches, but I'd like to do it now. Basically, we have these dwarf2 per-something structures: - dwarf2_per_objfile - dwarf2_per_bfd - dwarf2_per_cu_data I named the instances of dwarf2_per_bfd `per_bfd` and most of instances of dwarf2_per_cu_data are called `per_cu`. Most pre-existing instances of dwarf2_per_objfile are named `dwarf2_per_objfile`. For consistency with the other type, I'd like to rename them to just `per_objfile`. The `dwarf2_` prefix is superfluous, since it's already clear we are in dwarf2 code. It also helps reducing the line wrapping by saving 7 precious columns. gdb/ChangeLog: * dwarf2/comp-unit.c, dwarf2/comp-unit.h, dwarf2/index-cache.c, dwarf2/index-cache.h, dwarf2/index-write.c, dwarf2/index-write.h, dwarf2/line-header.c, dwarf2/line-header.h, dwarf2/macro.c, dwarf2/macro.h, dwarf2/read.c, dwarf2/read.h: Rename struct dwarf2_per_objfile variables and fields from `dwarf2_per_objfile` to just `per_objfile` throughout. Change-Id: I3c45cdcc561265e90df82cbd36b4b4ef2fa73aef
* Move more code to line-header.cTom Tromey2020-03-261-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | This moves some more code out of read.c and into line-header.c. dwarf_decode_line_header is split into two -- the part remaining in read.c handles interfacing to the dwarf2_cu; while the part in line-header.c (more or less) purely handles the actual decoding. gdb/ChangeLog 2020-03-26 Tom Tromey <tom@tromey.com> * dwarf2/line-header.h (dwarf_decode_line_header): Declare. * dwarf2/read.c (dwarf2_statement_list_fits_in_line_number_section_complaint): Move to line-header.c. (read_checked_initial_length_and_offset, read_formatted_entries): Likewise. (dwarf_decode_line_header): Split into two. * dwarf2/line-header.c (dwarf2_statement_list_fits_in_line_number_section_complaint): Move from read.c. (read_checked_initial_length_and_offset, read_formatted_entries): Likewise. (dwarf_decode_line_header): New function, split from read.c.
* Make some line_header methods constTom Tromey2020-03-261-4/+14
| | | | | | | | | | | | | | This changes a few line_header methods to be const. In some cases, a const overload is added. gdb/ChangeLog 2020-03-26 Tom Tromey <tom@tromey.com> * dwarf2/line-header.h (struct line_header) <is_valid_file_index, file_names_size, file_full_name, file_file_name>: Use const. <file_name_at, file_names>: Add const overload. * dwarf2/line-header.c (line_header::file_file_name) (line_header::file_full_name): Update.
* Move DWARF line_header to new fileTom Tromey2020-02-081-0/+188
This moves the line_header class to a pair of new files, making dwarf2/read.c somewhat smaller. 2020-02-08 Tom Tromey <tom@tromey.com> * dwarf2/read.h (dwarf_line_debug): Declare. * Makefile.in (COMMON_SFILES): Add dwarf2/line-header.c. * dwarf2/read.c: Move line_header code to new files. (dwarf_line_debug): No longer static. * dwarf2/line-header.c: New file. * dwarf2/line-header.h: New file. Change-Id: I8d9d8a2398b4e888e20cc5dd68d041c28b5a06e3