20. WLA Symbols

Symbols can be optionally generated as a part of the assembly and link steps. With a compatible emulator, this can provide extra information for debugging a ROM, or otherwise help in understanding how it operates.

The symbols file can be generated by wlalink by adding “-S” to the command line. This outputs labels, definitions, and some other rudimentary data. Most prominently, it shows where the various sections, such as subroutines and data, ended up in the ROM output, so that they can be looked up in the emulator’s ROM or RAM space.

Extra information for address-to-line mapping can be provided by adding the following command line arguments:

  • Run object generation (e.g. “wla-65816”) with “-i” to include list data in the output obj files

  • Run wlalink with “-S -A” to generate symbols with information related to address-to-line mapping

Address-to-line mappings include information that relates lines in the source files to individual instructions in the generated ROM. This can be used to provide richer disassembly in the emulator, or allow for rich debugging in an external IDE.

20.1. WLA Symbol Version History

If you are maintaining a WLA symbol file parser, please review this page when new versions of WLA DX are released, as the format might have changed.

Version 1: https://github.com/vhelin/wla-dx/blob/v9.12/doc/symbols.rst

  • Base version, including sections [labels], [definitions], [breakpoints], [symbols], [source files], [rom checksum], [addr-to-line mapping]

Version 2: https://github.com/vhelin/wla-dx/blob/v10.5/doc/symbols.rst

  • Added [information] section

  • Deprecated [source files] section, and replaced with [source files v2]

  • Deprecated [addr-to-line mapping] section definition, and replaced with [addr-to-line mapping v2]

Version 3: https://github.com/vhelin/wla-dx/blob/master/doc/symbols.rst

  • Added [sections] and [ramsections] sections

  • Added “wlasymbol true” under [information] section

20.2. Information For Emulator Developers

In order to properly support loading of WLA symbol files, it is recommended to follow the specification below, especially so that future additions to the symbol files are handled gracefully.

  • The file should be read one line at a time

  • Any text on a line following a ; should be ignored

  • Lines matching \[[^\]]+\] in regex or [%[^]]] in scanf code are section headers, and represent a new section. Section names can contain spaces (e.g. [rom checksum]), so the pattern must not stop at whitespace. Note that no section data will start with [.

  • Lines following the section header are the data for that section. If your parser supports the section, parse its lines using that section’s specific formatting; otherwise skip them. Keep reading lines until a new section header is encountered.

  • Unless otherwise specified, none of the data in any section should be assumed to be sorted in any particular way.
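The reading rules above can be sketched as a small splitter that groups data lines by section name. This is only an illustration: the function name split_sections is hypothetical, and the header pattern accepts spaces inside the brackets because names such as [rom checksum] contain them.

```python
import re
from collections import defaultdict

# Anything between "[" and "]" is taken as the section name (assumption:
# names never contain "]").
SECTION_HEADER = re.compile(r"^\[([^\]]+)\]$")


def split_sections(text):
    """Group the data lines of a symbol file by section name."""
    sections = defaultdict(list)
    current = None
    for raw_line in text.splitlines():
        # Ignore everything after ';' and surrounding whitespace.
        line = raw_line.split(";", 1)[0].strip()
        if not line:
            continue
        header = SECTION_HEADER.match(line)
        if header:
            current = header.group(1)
            sections[current]  # register the section even if it has no data
        elif current is not None:
            sections[current].append(line)
    return dict(sections)
```

Unknown sections simply end up as extra keys in the returned dict, so a parser built on this can ignore sections added by future format versions.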

The following is the list of currently supported sections, what they mean, and how their data should be interpreted.

20.2.1. [information]

The only fields this section has currently are “version” (and then the version number) and “wlasymbol” (which is followed by “true”). [information], if present, must always occur before any other section or data, and its first line will always be the format version.
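A minimal sketch of reading this section follows. It assumes each line is a key followed by a space and a value (for example “version 3”), that the version number is written in decimal, and that a file without [information] is version 1, since the section was only added in version 2. The function names are hypothetical.

```python
def parse_information(lines):
    """Turn "key value" lines from [information] into a dict of strings."""
    info = {}
    for line in lines:
        key, _, value = line.strip().partition(" ")
        if key:
            info[key] = value.strip()
    return info


def format_version(info_lines):
    """Return the format version, or None if it cannot be parsed.

    Assumption: a missing "version" field means version 1.
    """
    info = parse_information(info_lines or [])
    try:
        return int(info.get("version", "1"))
    except ValueError:
        return None
```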

20.2.2. [labels]

This is a list of all labels to sections of the ROM, such as subroutine locations, or data locations. Each line lists an address in hexadecimal (bank and offset) and a string associated with that address. This data could be used, for example, to identify what section a given target address is in, by searching for the label with the closest address less than the target address.

  • Regex match: [0-9a-fA-F]{2}:[0-9a-fA-F]{4} .*

  • Format specifier: %2x:%4x %s
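The closest-label lookup described above could look roughly like this. The function names are hypothetical, labels are sorted first because the section data is not guaranteed to be ordered, and the sketch treats a label exactly at the target address as a match as well.

```python
import bisect
import re

LABEL_LINE = re.compile(r"^([0-9a-fA-F]{2}):([0-9a-fA-F]{4}) (.*)$")


def parse_labels(lines):
    """Return a sorted list of (bank, offset, name) tuples."""
    labels = []
    for line in lines:
        m = LABEL_LINE.match(line)
        if m:
            labels.append((int(m.group(1), 16), int(m.group(2), 16), m.group(3)))
    labels.sort()
    return labels


def nearest_label(labels, bank, offset):
    """Name of the last label in `bank` at or below `offset`, or None."""
    keys = [(b, o) for b, o, _ in labels]
    i = bisect.bisect_right(keys, (bank, offset)) - 1
    if i < 0 or labels[i][0] != bank:
        return None
    return labels[i][2]
```

For repeated lookups, the keys list can be built once and reused instead of being rebuilt on every call.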

20.2.3. [definitions]

This is a list of various definitions provided in code - or automatically during WLA’s processing - and values associated with them. Most prominently, WLA outputs the size of each section of the ROM. Each line lists an integer value in hexadecimal, and a string (name) associated with that value.

  • Regex match: [0-9a-fA-F]{8} .*

  • Format specifier: %8x %s

20.2.4. [breakpoints]

This is a list of hexadecimal ROM addresses where the .BREAKPOINT directive was used in the source assembly. Each line lists an address in hexadecimal (bank and offset).

  • Regex match: [0-9a-fA-F]{2}:[0-9a-fA-F]{4}

  • Format specifier: %2x:%4x

20.2.5. [symbols]

This is a list of hexadecimal ROM addresses where the .SYMBOL directive was used in the source assembly. Each line lists an address in hexadecimal (bank and offset) and a string associated with that address.

  • Regex match: [0-9a-fA-F]{2}:[0-9a-fA-F]{4} .*

  • Format specifier: %2x:%4x %s

20.2.6. [source files v2]

These are used to identify what files were used during the assembly process, especially to map generated assembly back to source file contents. Each line lists a hexadecimal object file index, a hexadecimal source file index, a hexadecimal CRC32 checksum of the file, and a file path relative to the generated ROM’s root. This could be used to load the contents of one of the input files when running the ROM, and to verify that the file is up to date by checking its CRC32 checksum against the one generated during assembly.

  • Regex match: [0-9a-fA-F]{4}:[0-9a-fA-F]{4} [0-9a-fA-F]{8} .*

  • Format specifier: %4x:%4x %8x %s
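As an illustration, the entries can be parsed and checked against the files on disk as below. The function names are hypothetical, and the check assumes the checksum is the common CRC-32 (as computed by zlib) over the raw bytes of the file; if WLA uses a different variant, the comparison would need to change.

```python
import re
import zlib
from pathlib import Path

SOURCE_LINE = re.compile(
    r"^([0-9a-fA-F]{4}):([0-9a-fA-F]{4}) ([0-9a-fA-F]{8}) (.*)$"
)


def parse_source_files(lines):
    """Map (object file index, source file index) -> (crc32, relative path)."""
    files = {}
    for line in lines:
        m = SOURCE_LINE.match(line)
        if m:
            key = (int(m.group(1), 16), int(m.group(2), 16))
            files[key] = (int(m.group(3), 16), m.group(4))
    return files


def is_up_to_date(root, crc32, relative_path):
    """True if the file exists under `root` and its CRC-32 matches."""
    path = Path(root) / relative_path
    if not path.is_file():
        return False
    # Assumption: standard zlib CRC-32 of the file's raw bytes.
    return (zlib.crc32(path.read_bytes()) & 0xFFFFFFFF) == crc32
```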

20.2.7. [rom checksum]

This is just a single line identifying what the hexadecimal CRC32 checksum of the ROM file was when the symbol file was generated. This could be used to verify that the symbol file itself is up-to-date with the ROM in question. This checksum is calculated by reading the ROM file’s entire binary, and not by reading any platform-specific checksum value embedded in the ROM itself.

  • Regex match: [0-9a-fA-F]{8}

  • Format specifier: %8x

20.2.8. [addr-to-line mapping v2]

Each line lists a hexadecimal ROM address, bank, ROM bank offset and memory address, mapped to a hexadecimal object file index, source file index and line index. The file indices refer back to the file indices specified in the [source files v2] section, so that the source file name can be discovered. This information can be used to, for example, display source file information in line with disassembled code, or to tell an external text editor the location of the current Program Counter by specifying a source file and line instead of an address in the binary ROM file.

  • Regex match: [0-9a-fA-F]{8} [0-9a-fA-F]{2}:[0-9a-fA-F]{4} [0-9a-fA-F]{4} [0-9a-fA-F]{4}:[0-9a-fA-F]{4}:[0-9a-fA-F]{8}

  • Format specifier: %8x %2x:%4x %4x %4x:%4x:%8x
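A rough sketch of turning a ROM address into a source file and line is shown below. The names LineMapping, parse_mappings and source_location are hypothetical, and source_files is assumed to be a dict from (object file index, source file index) to a path, built from the source files section.

```python
import re
from typing import NamedTuple


class LineMapping(NamedTuple):
    rom_address: int
    bank: int
    bank_offset: int
    memory_address: int
    object_file: int
    source_file: int
    line: int


MAPPING_LINE = re.compile(
    r"^([0-9a-fA-F]{8}) ([0-9a-fA-F]{2}):([0-9a-fA-F]{4}) ([0-9a-fA-F]{4}) "
    r"([0-9a-fA-F]{4}):([0-9a-fA-F]{4}):([0-9a-fA-F]{8})$"
)


def parse_mappings(lines):
    """Index the mapping entries by ROM address."""
    by_rom_address = {}
    for line in lines:
        m = MAPPING_LINE.match(line)
        if m:
            entry = LineMapping(*(int(g, 16) for g in m.groups()))
            by_rom_address[entry.rom_address] = entry
    return by_rom_address


def source_location(mappings, source_files, rom_address):
    """Return (path, line) for an exact ROM address, or None if unknown."""
    entry = mappings.get(rom_address)
    if entry is None:
        return None
    path = source_files.get((entry.object_file, entry.source_file))
    if path is None:
        return None
    return path, entry.line
```

An emulator that tracks the Program Counter as bank and offset rather than as a ROM address could index the same entries by (bank, bank_offset) instead.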

20.2.9. [sections]

Each line specifies a .SECTION: hexadecimal ROM address, bank, ROM bank offset, memory address, size and name. Use this information for example to locate .SECTION data in the output.

  • Regex match: [0-9a-fA-F]{8} [0-9a-fA-F]{2}:[0-9a-fA-F]{4} [0-9a-fA-F]{4} [0-9a-fA-F]{8} .*

  • Format specifier: %8x %2x:%4x %4x %8x %s
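Locating the .SECTION that covers a given ROM address might be sketched as follows. The names are hypothetical, and the lookup assumes the size is a byte count and that a section occupies one contiguous range starting at its ROM address.

```python
import re
from typing import NamedTuple


class Section(NamedTuple):
    rom_address: int
    bank: int
    bank_offset: int
    memory_address: int
    size: int
    name: str


SECTION_LINE = re.compile(
    r"^([0-9a-fA-F]{8}) ([0-9a-fA-F]{2}):([0-9a-fA-F]{4}) "
    r"([0-9a-fA-F]{4}) ([0-9a-fA-F]{8}) (.*)$"
)


def parse_sections(lines):
    """Parse [sections] data lines into Section tuples."""
    sections = []
    for line in lines:
        m = SECTION_LINE.match(line)
        if m:
            numbers = [int(g, 16) for g in m.groups()[:5]]
            sections.append(Section(*numbers, m.group(6)))
    return sections


def section_at(sections, rom_address):
    """Return the first section whose ROM range contains the address."""
    for section in sections:
        # Assumption: the section spans [rom_address, rom_address + size).
        if section.rom_address <= rom_address < section.rom_address + section.size:
            return section
    return None
```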

20.2.10. [ramsections]

Each line specifies a .RAMSECTION: hexadecimal bank, RAM bank offset, memory address, size and name. Use this information for example to see where a .RAMSECTION was placed.

  • Regex match: [0-9a-fA-F]{2}:[0-9a-fA-F]{4} [0-9a-fA-F]{4} [0-9a-fA-F]{8} .*

  • Format specifier: %2x:%4x %4x %8x %s
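One possible use of this data is a quick summary of how much RAM the .RAMSECTIONs take up in each bank. The sketch below uses hypothetical function names and assumes the size field is a byte count.

```python
import re
from collections import Counter

RAMSECTION_LINE = re.compile(
    r"^([0-9a-fA-F]{2}):([0-9a-fA-F]{4}) ([0-9a-fA-F]{4}) ([0-9a-fA-F]{8}) (.*)$"
)


def parse_ramsections(lines):
    """Parse [ramsections] data lines into dicts."""
    entries = []
    for line in lines:
        m = RAMSECTION_LINE.match(line)
        if m:
            bank, offset, address, size = (int(g, 16) for g in m.groups()[:4])
            entries.append({
                "bank": bank,
                "offset": offset,
                "address": address,
                "size": size,
                "name": m.group(5),
            })
    return entries


def bytes_used_per_bank(entries):
    """Sum the sizes of all .RAMSECTIONs in each bank."""
    usage = Counter()
    for entry in entries:
        usage[entry["bank"]] += entry["size"]
    return dict(usage)
```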