diff options
Diffstat (limited to 'dis88/dis88.1')
-rw-r--r-- | dis88/dis88.1 | 147 |
1 files changed, 147 insertions, 0 deletions
diff --git a/dis88/dis88.1 b/dis88/dis88.1 new file mode 100644 index 0000000..c991ad7 --- /dev/null +++ b/dis88/dis88.1 @@ -0,0 +1,147 @@ +.TH dis88 1 LOCAL +.SH "NAME" +dis88 \- 8088 symbolic disassembler +.SH "SYNOPSIS" +\fBdis88\fP [ -o ] ifile [ ofile ] +.SH "DESCRIPTION" +Dis88 reads ifile, which must be in PC/IX a.out format. +It interprets the binary opcodes and data locations, and +writes corresponding assembler source code to stdout, or +to ofile if specified. The program's output is in the +format of, and fully compatible with, the PC/IX assembler, +as(1). If a symbol table is present in ifile, labels and +references will be symbolic in the output. If the input +file lacks a symbol table, the fact will be noted, and the +disassembly will proceed, with the disassembler generating +synthetic labels as needed. If the input file has split +I/D space, or if it is executable, the disassembler will +make all necessary adjustments in address-reference calculations. +.PP +If the "-o" option appears, object code will be included +in comments during disassembly of the text segment. This +feature is used primarily for debugging the disassembler +itself, but may provide information of passing interest +to users. +.PP +The program always outputs the current machine address +before disassembling an opcode. If a symbol table is +present, this address is output as an assembler comment; +otherwise, it is incorporated into the synthetic label +which is generated internally. Since relative jumps, +especially short ones, may target unlabelled locations, +the program always outputs the physical target address +as a comment, to assist the user in following the code. +.PP +The text segment of an object file is always padded to +an even machine address. In addition, if the file has +split I/D space, the text segment will be padded to a +paragraph boundary (i.e., an address divisible by 16). +As a result of this padding, the disassembler may produce +a few spurious, but harmless, instructions at the +end of the text segment. +.PP +Disassembly of the data segment is a difficult matter. +The information to which initialized data refers cannot +be inferred from context, except in the special case +of an external data or address reference, which will be +reflected in the relocation table. Internal data and +address references will already be resolved in the object file, +and cannot be recreated. Therefore, the data +segment is disassembled as a byte stream, with long +stretches of null data represented by an appropriate +".zerow" pseudo-op. This limitation notwithstanding, +labels (as opposed to symbolic references) are always +output at appropriate points within the data segment. +.PP +If disassembly of the data segment is difficult, disassembly of the +bss segment is quite easy, because uninitialized data is all +zero by definition. No data +is output in the bss segment, but symbolic labels are +output as appropriate. +.PP +For each opcode which takes an operand, a particular +symbol type (text, data, or bss) is appropriate. This +tidy correspondence is complicated somewhat, however, +by the existence of assembler symbolic constants and +segment override opcodes. Therefore, the disassembler's +symbol lookup routine attempts to apply a certain amount +of intelligence when it is asked to find a symbol. If +it cannot match on a symbol of the preferred type, it +may return a symbol of some other type, depending on +preassigned (and somewhat arbitrary) rankings within +each type. Finally, if all else fails, it returns a +string containing the address sought as a hex constant; +this behavior allows calling routines to use the output +of the lookup function regardless of the success of its +search. +.PP +It is worth noting, at this point, that the symbol lookup +routine operates linearly, and has not been optimized in +any way. Execution time is thus likely to increase +geometrically with input file size. The disassembler is +internally limited to 1500 symbol table entries and 1500 +relocation table entries; while these limits are generous +(/unix, itself, has fewer than 800 symbols), they are not +guaranteed to be adequate in all cases. If the symbol +table or the relocation table overflows, the disassembly +aborts. +.PP +Finally, users should be aware of a bug in the assembler, +which causes it not to parse the "esc" mnemonic, even +though "esc" is a completely legitimate opcode which is +documented in all the Intel literature. To accommodate +this deficiency, the disassembler translates opcodes of +the "esc" family to .byte directives, but notes the +correct mnemonic in a comment for reference. +.PP +In all cases, it should be possible to submit the output +of the disassembler program to the assembler, and assemble +it without error. In most cases, the resulting object +code will be identical to the original; in any event, it +will be functionally equivalent. +.SH "SEE ALSO" +adb(1), as(1), cc(1), ld(1). +.br +"Assembler Reference Manual" in the PC/IX Programmer's +Guide. +.SH "DIAGNOSTICS" +"can't access input file" if the input file cannot be +found, opened, or read. +.sp +"can't open output file" if the output file cannot be +created. +.sp +"warning: host/cpu clash" if the program is run on a +machine with a different CPU. +.sp +"input file not in object format" if the magic number +does not correspond to that of a PC/IX object file. +.sp +"not an 8086/8088 object file" if the CPU ID of the +file header is incorrect. +.sp +"reloc table overflow" if there are more than 1500 +entries in the relocation table. +.sp +"symbol table overflow" if there are more than 1500 +entries in the symbol table. +.sp +"lseek error" if the input file is corrupted (should +never happen). +.sp +"warning: no symbols" if the symbol table is missing. +.sp +"can't reopen input file" if the input file is removed +or altered during program execution (should never happen). +.SH "BUGS" +Numeric co-processor (i.e., 8087) mnemonics are not currently supported. +Instructions for the co-processor are +disassembled as CPU escape sequences, or as interrupts, +depending on how they were assembled in the first place. +.sp +Despite the program's best efforts, a symbol retrieved +from the symbol table may sometimes be different from +the symbol used in the original assembly. +.sp +The disassembler's internal tables are of fixed size, +and the program aborts if they overflow. |