From 8b494f241f2abfc0f869ab42c1601ea5027d3477 Mon Sep 17 00:00:00 2001
From: Robert de Bath
Date: Sun, 24 May 1998 18:43:39 +0200
Subject: Import dis88-pcix
---
 README | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 91 insertions(+)
 create mode 100644 README
(limited to 'README')
diff --git a/README b/README
new file mode 100644
index 0000000..d0c80dc
--- /dev/null
+++ b/README
@@ -0,0 +1,91 @@
+ dis88
+ Beta Release
+ 87/09/01
+ ---
+ G. M. HARDING
+ POB 4142
+ Santa Clara CA 95054-0142
+
+
+ "Dis88" is a symbolic disassembler for the Intel 8088 CPU,
+ designed to run under the PC/IX operating system on an IBM XT
+ or fully-compatible clone. Its output is in the format of, and
+ is completely compatible with, the PC/IX assembler, "as". The
+ program is copyrighted by its author, but may be copied and re-
+ distributed freely provided that complete source code, with all
+ copyright notices, accompanies any distribution. This provision
+ also applies to any modifications you may make. You are urged
+ to comment such changes, giving, as a minimum, your name and
+ complete address.
+
+ This release of the program is a beta release, which means
+ that it has been extensively, but not exhaustively, tested.
+ User comments, recommendations, and bug fixes are welcome. The
+ principal features of the current release are:
+
+ (a) The ability to disassemble any file in PC/IX object
+ format, making full use of symbol and relocation information if
+ it is present, regardless of whether the file is executable or
+ linkable, and regardless of whether it has continuous or split
+ I/D space;
+
+ (b) Automatic generation of synthetic labels when no sym-
+ bol table is available; and
+
+ (c) Optional output of address and object-code informa-
+ tion as assembler comment text.
+
+ Limitations of the current release are:
+
+ (a) Numeric co-processor (i.e., 8087) mnemonics are not
+ supported.
+ Instructions for the co-processor are disassembled
+ as CPU escape sequences, or as interrupts, depending on how
+ they were assembled in the first place. This limitation will be
+ addressed in a future release.
+
+ (b) Symbolic references within the object file's data
+ segment are not supported. Thus, for example, if a data segment
+ location is initialized to point to a text segment address, no
+ reference to a text segment symbol will be detected. This limi-
+ tation is likely to remain in future releases, because object
+ code does not, in most cases, contain sufficient information to
+ allow meaningful interpretation of pure data. (Note, however,
+ that symbolic references to the data segment from within the
+ text segment are always supported.)
+
+ As a final caveat, be aware that the PC/IX assembler does
+ not recognize the "esc" mnemonic, even though it refers to a
+ completely valid CPU operation which is documented in all the
+ Intel literature. Thus, the corresponding opcodes (0xd8 through
+ 0xdf) are disassembled as .byte directives. For reference, how-
+ ever, the syntactically-correct "esc" instruction is output as
+ a comment.
+
+ To build the disassembler program, transfer all the source
+ files, together with the Makefile, to a suitable (preferably
+ empty) PC/IX directory. Then, simply type "make".
+
+ To use dis88, place it in a directory which appears in
+ your $PATH list. It may then be invoked by name from whatever
+ directory you happen to be in. As a minimum, the program must
+ be invoked with one command-line argument: the name of the ob-
+ ject file to be disassembled. (Dis88 will complain if the file
+ specified is not an object file.) Optionally, you may specify
+ an output file; stdout is the default. One command-line switch
+ is available: "-o", which makes the program display addresses
+ and object code along with its mnemonic disassembly.
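The invocation rules in the preceding paragraph can be sketched as a small helper that assembles the argument list for one run. This is an illustration only and not part of dis88: the name build_dis88_argv and its parameters are hypothetical, and the argument order (switch, input file, optional output file) follows the synopsis in the manual page, "dis88 [ -o ] ifile [ ofile ]".

```python
def build_dis88_argv(ifile, ofile=None, show_object_code=False):
    """Return the argument list for one dis88 run (hypothetical helper)."""
    if not ifile:
        # The object file name is the one mandatory argument.
        raise ValueError("dis88 needs the name of an object file")
    argv = ["dis88"]
    if show_object_code:
        # "-o" adds addresses and object code to the disassembly.
        argv.append("-o")
    argv.append(ifile)
    if ofile is not None:
        # Without an output file, dis88 writes to stdout.
        argv.append(ofile)
    return argv
```

The resulting list could be handed to any process-spawning routine; building it separately keeps the "one required argument" rule easy to check.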
+
+ The "-o" option is useful primarily for verifying the cor-
+ rectness of the program's output. In particular, it may be used
+ to check the accuracy of local relative jump opcodes. These
+ jumps often target local labels, which are lost at assembly
+ time; thus, the disassembly may contain cryptic instructions
+ like "jnz .+39". As a user convenience, all relative jump and
+ call opcodes are output with a comment which identifies the
+ physical target address.
+
+ By convention, the release level of the program as a whole
+ is the SID of the file disrel.c, and this SID string appears in
+ each disassembly. Release 2.1 of the program is the first beta
+ release to be distributed on Usenet.
+
--
cgit v1.2.1
From 42192453ea219b80d0bf9f41e51e36d3d4d0740b Mon Sep 17 00:00:00 2001
From: Robert de Bath
Date: Tue, 1 Oct 1996 14:00:00 +0200
Subject: Import dis88-minix
---
 README | 148 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 148 insertions(+)
(limited to 'README')
diff --git a/README b/README
index d0c80dc..a007634 100644
--- a/README
+++ b/README
@@ -89,3 +89,151 @@
 each disassembly. Release 2.1 of the program is the first beta
 release to be distributed on Usenet.
 
+
+.TH dis88 1 LOCAL
+.SH "NAME"
+dis88 \- 8088 symbolic disassembler
+.SH "SYNOPSIS"
+\fBdis88\fP [ -o ] ifile [ ofile ]
+.SH "DESCRIPTION"
+Dis88 reads ifile, which must be in PC/IX a.out format.
+It interprets the binary opcodes and data locations, and
+writes corresponding assembler source code to stdout, or
+to ofile if specified. The program's output is in the
+format of, and fully compatible with, the PC/IX assembler,
+as(1). If a symbol table is present in ifile, labels and
+references will be symbolic in the output. If the input
+file lacks a symbol table, the fact will be noted, and the
+disassembly will proceed, with the disassembler generating
+synthetic labels as needed.
+If the input file has split
+I/D space, or if it is executable, the disassembler will
+make all necessary adjustments in address-reference calculations.
+.PP
+If the "-o" option appears, object code will be included
+in comments during disassembly of the text segment. This
+feature is used primarily for debugging the disassembler
+itself, but may provide information of passing interest
+to users.
+.PP
+The program always outputs the current machine address
+before disassembling an opcode. If a symbol table is
+present, this address is output as an assembler comment;
+otherwise, it is incorporated into the synthetic label
+which is generated internally. Since relative jumps,
+especially short ones, may target unlabelled locations,
+the program always outputs the physical target address
+as a comment, to assist the user in following the code.
+.PP
+The text segment of an object file is always padded to
+an even machine address. In addition, if the file has
+split I/D space, the text segment will be padded to a
+paragraph boundary (i.e., an address divisible by 16).
+As a result of this padding, the disassembler may produce
+a few spurious, but harmless, instructions at the
+end of the text segment.
+.PP
+Disassembly of the data segment is a difficult matter.
+The information to which initialized data refers cannot
+be inferred from context, except in the special case
+of an external data or address reference, which will be
+reflected in the relocation table. Internal data and
+address references will already be resolved in the object file,
+and cannot be recreated. Therefore, the data
+segment is disassembled as a byte stream, with long
+stretches of null data represented by an appropriate
+".zerow" pseudo-op. This limitation notwithstanding,
+labels (as opposed to symbolic references) are always
+output at appropriate points within the data segment.
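The byte-stream treatment of the data segment described above can be sketched as follows. This is an illustration under stated assumptions, not the code dis88 uses: the function name render_data_segment, the eight-byte threshold for a "long stretch", the one-byte-per-line ".byte" layout, and the reading of ".zerow N" as "N zero-filled 16-bit words" are all assumptions made for the example.

```python
def render_data_segment(data, min_zero_run=8):
    """Render raw data bytes as assembler lines (illustrative only)."""
    if min_zero_run < 2:
        # A run must cover at least one whole 16-bit word.
        raise ValueError("min_zero_run must be at least 2")
    lines = []
    i = 0
    n = len(data)
    while i < n:
        # Measure the run of zero bytes starting at position i.
        j = i
        while j < n and data[j] == 0:
            j += 1
        run = j - i
        if run >= min_zero_run:
            words = run // 2
            # Assumption: ".zerow N" reserves N zero-filled words.
            lines.append(".zerow %d" % words)
            i += words * 2  # an odd trailing zero falls through to .byte
        else:
            lines.append(".byte 0x%02x" % data[i])
            i += 1
    return lines
```

Labels are left out of the sketch; in a fuller version they would be emitted whenever the current offset matches a data symbol, which would also force a zero run to be split at that offset.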
+.PP
+If disassembly of the data segment is difficult, disassembly of the
+bss segment is quite easy, because uninitialized data is all
+zero by definition. No data
+is output in the bss segment, but symbolic labels are
+output as appropriate.
+.PP
+For each opcode which takes an operand, a particular
+symbol type (text, data, or bss) is appropriate. This
+tidy correspondence is complicated somewhat, however,
+by the existence of assembler symbolic constants and
+segment override opcodes. Therefore, the disassembler's
+symbol lookup routine attempts to apply a certain amount
+of intelligence when it is asked to find a symbol. If
+it cannot match on a symbol of the preferred type, it
+may return a symbol of some other type, depending on
+preassigned (and somewhat arbitrary) rankings within
+each type. Finally, if all else fails, it returns a
+string containing the address sought as a hex constant;
+this behavior allows calling routines to use the output
+of the lookup function regardless of the success of its
+search.
+.PP
+It is worth noting, at this point, that the symbol lookup
+routine operates linearly, and has not been optimized in
+any way. Execution time is thus likely to increase
+geometrically with input file size. The disassembler is
+internally limited to 1500 symbol table entries and 1500
+relocation table entries; while these limits are generous
+(/unix, itself, has fewer than 800 symbols), they are not
+guaranteed to be adequate in all cases. If the symbol
+table or the relocation table overflows, the disassembly
+aborts.
+.PP
+Finally, users should be aware of a bug in the assembler,
+which causes it not to parse the "esc" mnemonic, even
+though "esc" is a completely legitimate opcode which is
+documented in all the Intel literature. To accommodate
+this deficiency, the disassembler translates opcodes of
+the "esc" family to .byte directives, but notes the
+correct mnemonic in a comment for reference.
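The "esc" workaround described above can be sketched as follows. This is only an illustration, not the disassembler's own routine: the function name esc_as_bytes, the "|" comment character, and the exact layout of the output line are assumptions, and the comment carries just the 6-bit escape code rather than a fully formatted operand. The instruction length is derived from the standard 8086 mod/rm byte that follows every opcode in the 0xd8 to 0xdf range.

```python
def esc_as_bytes(code, offset=0, comment_char="|"):
    """Return (length, text) for the esc instruction at code[offset]."""
    opcode = code[offset]
    if not 0xD8 <= opcode <= 0xDF:
        raise ValueError("not an esc opcode")
    modrm = code[offset + 1]
    mod, rm = modrm >> 6, modrm & 7
    # Displacement size per 8086 addressing rules.
    if mod == 1:
        disp = 1
    elif mod == 2 or (mod == 0 and rm == 6):
        disp = 2
    else:
        disp = 0  # register form, or memory form with no displacement
    length = 2 + disp
    # Escape code: low 3 bits of the opcode, then bits 5-3 of mod/rm.
    esc_code = ((opcode & 7) << 3) | ((modrm >> 3) & 7)
    raw = code[offset:offset + length]
    line = ".byte " + ", ".join("0x%02x" % b for b in raw)
    # Assumption: comment_char starts an assembler comment.
    return length, "%s\t%s esc 0x%02x" % (line, comment_char, esc_code)
```

Emitting the whole instruction, displacement included, as one .byte directive keeps the reassembled object code byte-for-byte identical while the comment preserves what the bytes mean.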
+.PP
+In all cases, it should be possible to submit the output
+of the disassembler program to the assembler, and assemble
+it without error. In most cases, the resulting object
+code will be identical to the original; in any event, it
+will be functionally equivalent.
+.SH "SEE ALSO"
+adb(1), as(1), cc(1), ld(1).
+.br
+"Assembler Reference Manual" in the PC/IX Programmer's
+Guide.
+.SH "DIAGNOSTICS"
+"can't access input file" if the input file cannot be
+found, opened, or read.
+.sp
+"can't open output file" if the output file cannot be
+created.
+.sp
+"warning: host/cpu clash" if the program is run on a
+machine with a different CPU.
+.sp
+"input file not in object format" if the magic number
+does not correspond to that of a PC/IX object file.
+.sp
+"not an 8086/8088 object file" if the CPU ID of the
+file header is incorrect.
+.sp
+"reloc table overflow" if there are more than 1500
+entries in the relocation table.
+.sp
+"symbol table overflow" if there are more than 1500
+entries in the symbol table.
+.sp
+"lseek error" if the input file is corrupted (should
+never happen).
+.sp
+"warning: no symbols" if the symbol table is missing.
+.sp
+"can't reopen input file" if the input file is removed
+or altered during program execution (should never happen).
+.SH "BUGS"
+Numeric co-processor (i.e., 8087) mnemonics are not currently supported.
+Instructions for the co-processor are
+disassembled as CPU escape sequences, or as interrupts,
+depending on how they were assembled in the first place.
+.sp
+Despite the program's best efforts, a symbol retrieved
+from the symbol table may sometimes be different from
+the symbol used in the original assembly.
+.sp
+The disassembler's internal tables are of fixed size,
+and the program aborts if they overflow.
--
cgit v1.2.1
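The symbol-related entry under BUGS follows from the lookup strategy described under DESCRIPTION: a linear scan that prefers one symbol type, falls back to another, and finally returns a hex constant. A minimal sketch of that strategy is given here. It is not the program's actual routine; the name lookup_symbol, the tuple layout of the table, and the particular fallback ranking are hypothetical.

```python
def lookup_symbol(symbols, address, preferred):
    """Linear lookup over (name, address, type) tuples (illustrative)."""
    # Hypothetical ranking: a lower number wins among fallback types.
    fallback_rank = {"text": 0, "data": 1, "bss": 2}
    best = None
    for name, sym_addr, sym_type in symbols:
        if sym_addr != address:
            continue
        if sym_type == preferred:
            return name  # exact match on the preferred type
        rank = fallback_rank.get(sym_type, 99)
        if best is None or rank < best[0]:
            best = (rank, name)
    if best is not None:
        # A symbol of another type: it may differ from the one
        # used in the original assembly.
        return best[1]
    # Nothing matched: return the address as a hex constant so the
    # caller can use the result either way.
    return "0x%04x" % address
```

Because every call walks the whole table, the cost of a disassembly grows with both the number of operands and the number of symbols, which is the performance concern the manual page raises.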