Assembly language, also incorrectly referred to as assembler, is a low-level symbolic language that represents a microprocessor's binary machine instructions in a human-readable form. An assembly language program is written in an editor and saved in a file referred to as source code. The program that translates the assembly language instructions in the source file into machine-executable form is called an assembler (by comparison, a compiler translates a high-level language, such as C or FORTRAN, into machine-executable form). The output of the assembler is referred to as object code; depending on the type of assembler, it may be directly executable or may require additional linking to create an executable program. The assembler is said to "assemble" the program, a process referred to as assembly. The reverse process is "disassembly", and the program is said to have been "disassembled." A disassembled program is human-readable, although usually less so than the original source code, because comments and symbolic names such as labels are not stored in the machine code.
Every processor architecture (e.g. x86, MIPS, 6502, 68000) has its own instruction set and syntax, and therefore a different assembly language. Most assembly languages show similarities, but they are still unique to each CPU architecture. Because of this, people speak of "6502 assembly language" or "68000 assembly language" to be clear.
Machine instructions are represented by numbers which are stored as binary code in the computer's memory. To make it possible to program with these commands, each opcode has a short symbol called a mnemonic.
An assembler translates these mnemonics into their corresponding opcodes and data. An example:
    *=$c000
    lda #$00    ; load the number 0 into the accumulator register
    sta $d020   ; store the content of the accumulator in the register for the border color
    ...
... inside the RAM of the C64 this is represented like this:
    Address  Opcode + Operand(s)
    c000     A9 00      ; "C000" is the memory address, "A9" the opcode, "00" the operand
    c002     8D 20 D0   ; "C002" is the memory address, "8D" the opcode, "20" and "D0" the two operand bytes (address $d020, low byte first)
    ...
... as binary code from $c000
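The translation step can be sketched in a few lines of Python. This is only an illustration, not how any particular C64 assembler is implemented: the names `OPCODES`, `assemble_line` and `assemble` are made up for this sketch, and the table knows just the two opcodes from the listing above (`A9` for `lda` with an immediate value, `8D` for `sta` with an absolute address).

```python
# Minimal "assembler" sketch for the two-instruction example.
# Assumption: only these two opcodes are supported; this is not a full 6502 table.
OPCODES = {
    ("lda", "immediate"): 0xA9,  # LDA #value
    ("sta", "absolute"): 0x8D,   # STA address
}

def assemble_line(line):
    """Translate one source line into a list of byte values."""
    code = line.split(";")[0].strip()      # drop the comment
    mnemonic, operand = code.split()
    if operand.startswith("#$"):           # immediate: one operand byte
        value = int(operand[2:], 16)
        return [OPCODES[(mnemonic, "immediate")], value]
    if operand.startswith("$"):            # absolute: 16-bit address, low byte first
        address = int(operand[1:], 16)
        return [OPCODES[(mnemonic, "absolute")], address & 0xFF, address >> 8]
    raise ValueError("unsupported operand: " + operand)

def assemble(source):
    """Assemble all non-empty lines and return the machine code as bytes."""
    out = []
    for line in source.strip().splitlines():
        if line.split(";")[0].strip():
            out.extend(assemble_line(line))
    return bytes(out)

SOURCE = """
lda #$00   ; load 0 into the accumulator
sta $d020  ; store it in the border color register
"""

machine_code = assemble(SOURCE)
hex_dump = " ".join(f"{b:02X}" for b in machine_code)
binary_dump = " ".join(f"{b:08b}" for b in machine_code)
print(hex_dump)     # A9 00 8D 20 D0
print(binary_dump)  # 10101001 00000000 10001101 00100000 11010000
```

The second printed line is the same five bytes written out bit by bit, which is the form in which they sit in memory from $c000 onwards.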
Some assemblers for the C64 (e.g. TurboAssembler, or the cross-assembler ACME) not only translate the mnemonics but also provide macros for common operations, which reduces the programmer's workload.
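Conceptually, a macro is a named template that the assembler replaces with ordinary instructions before translating them. The Python sketch below models that idea only; it does not reproduce the macro syntax of TurboAssembler or ACME, and the macro name `set_color`, its parameters, and the helper `expand` are hypothetical.

```python
# Hypothetical macro table: name -> (parameter names, body lines).
# This models macro expansion as plain text substitution.
MACROS = {
    "set_color": (["register", "value"], ["lda #{value}", "sta {register}"]),
}

def expand(lines):
    """Replace each macro call with its body; pass other lines through unchanged."""
    result = []
    for line in lines:
        parts = line.split(None, 1)        # macro name, then the argument list
        name = parts[0]
        if name in MACROS:
            params, body = MACROS[name]
            args = [a.strip() for a in parts[1].split(",")]
            bindings = dict(zip(params, args))
            result.extend(template.format(**bindings) for template in body)
        else:
            result.append(line)
    return result

program = ["set_color $d020, $00", "set_color $d021, $00", "rts"]
expanded = expand(program)
print("\n".join(expanded))
```

Two macro calls expand to four instructions here, so the programmer writes less source text while the resulting machine code is the same as if every instruction had been typed out.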
Advantages / Disadvantages
Advantages:
- As close to the machine as possible: each assembly instruction corresponds directly to a machine instruction
- Depending on the skill of the programmer, assembled programs can be extremely fast
- Well-written assembly programs are usually rather small
Disadvantages:
- Difficult to learn
- High workload because:
  - the language is verbose and quite inexpressive, lacking the data structures and abstraction facilities provided by most higher-level languages
  - errors can be difficult to track down and often cause the program (or machine) to crash unpredictably
- Usually not portable to other computer systems, even if they have the same CPU (for example, Atari ST and Amiga computers both use the 68000 CPU, but differences in hardware architecture and operating system make programs non-portable)
Even with these difficulties, assembly language is worth learning. As a byproduct, you get to understand and appreciate the inner workings of a computer at a deep level. Assembly language still has its uses today. Programming of device drivers would be difficult without assembly language, and many small embedded systems require it.
Set screen colors to black ... (sys 49152)
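    *=$c000
    lda #$00    ; 0 is the color code for black
    sta $d020   ; border color register
    sta $d021   ; background color register
    rts         ; return to BASIC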
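The number in `sys 49152` is simply the start address $c000 written in decimal. As a small check, the Python sketch below converts the address and lists the bytes the routine occupies. The variable names are illustrative, and the byte values are written out by hand from the standard 6502 opcodes (`A9` for `lda` immediate, `8D` for `sta` absolute, `60` for `rts`), not produced by an assembler.

```python
START = 0xC000   # start address set by "*=$c000"
print(START)     # decimal value to use with SYS

# Bytes of the routine, hand-assembled for illustration.
routine = bytes([
    0xA9, 0x00,        # lda #$00
    0x8D, 0x20, 0xD0,  # sta $d020
    0x8D, 0x21, 0xD0,  # sta $d021
    0x60,              # rts
])

end = START + len(routine) - 1
summary = f"${START:04x}-${end:04x}, {len(routine)} bytes"
print(summary)
```

The whole routine fits in nine bytes, from $c000 to $c008.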
Play music (expecting player routine at $1000) ... (sys 49152)
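    *=$c000
            sei             ; disable interrupts while the IRQ vector is changed
            lda #<irq       ; low byte of the handler address
            ldx #>irq       ; high byte of the handler address
            sta $0314       ; IRQ vector, low byte
            stx $0315       ; IRQ vector, high byte
            lda #$00
            tay
            tax
            jsr $1000       ; initialize the player before interrupts are re-enabled
            cli             ; enable interrupts again
            rts
    irq     lda $d012       ; read the current raster line
            cmp #100
            bne irq         ; wait until raster line 100 is reached
            jsr $1003       ; call the player's play routine
            jmp $ea31       ; continue with the standard KERNAL IRQ handler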
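The operators `<` and `>` select the low and high byte of a 16-bit address, which is needed because the 6502 registers hold only 8 bits each. The Python sketch below shows the arithmetic. It assumes the label `irq` ends up at $c014, which is what counting the 20 bytes of instructions before the label gives when the code starts at $c000. The function names are illustrative.

```python
def low_byte(address):
    """Equivalent of the assembler's '<' operator."""
    return address & 0xFF

def high_byte(address):
    """Equivalent of the assembler's '>' operator."""
    return (address >> 8) & 0xFF

IRQ_ADDRESS = 0xC014  # assumed address of the "irq" label

lo, hi = low_byte(IRQ_ADDRESS), high_byte(IRQ_ADDRESS)
print(f"lda #<irq -> lda #${lo:02x}")
print(f"ldx #>irq -> ldx #${hi:02x}")

# Model the two stores into the IRQ vector at $0314/$0315 (low byte first)
# and read the vector back as the CPU would see it.
memory = {0x0314: lo, 0x0315: hi}
vector = memory[0x0314] | (memory[0x0315] << 8)
print(f"IRQ vector -> ${vector:04x}")
```

Because the vector is written one byte at a time, interrupts are disabled with `sei` while it is updated. Otherwise an interrupt arriving between the two stores would jump to a half-updated address.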