Looping

We're going to write a program to calculate exponents. Before we can do that, we need to discuss looping.

A loop is a series of instructions that can be executed repeatedly. When execution reaches the end, it jumps back to the beginning and runs the loop again. This continues until some condition is met that causes the loop to stop. Take a look at the following minimal example:

    mov rax, 0

loop_start:
    add rax, 2
    jmp loop_start

This snippet starts by setting rax to 0. Then, it continually adds 2 to rax. Let's look at it in more detail:

    mov rax, 0

Before the loop begins, the register rax is set to 0.

loop_start:

This is a label, marking the start of the loop. We can jump to this label any time we want the loop to run again.

    add rax, 2

This is the body of the loop. This instruction is executed each time the loop runs. It adds 2 to the value of rax, so every time the loop runs, rax will be increased by 2. Since rax starts at 0, if the loop runs 10 times, rax will be set to 20 by the end.

    jmp loop_start

This instruction jumps back to the loop_start: label, so the loop starts over at the beginning. As a result, add rax, 2 runs over and over again.

There's just one problem with this loop: it never ends. This is what is known as an infinite loop, because it continues forever. If you were to run a program with this loop inside, that program would appear to freeze. The loop would run continually until the program or computer was forcibly halted.

In order to be useful, a loop must have a termination condition: some condition under which the loop stops repeating. Take a look at the following updated snippet:

    mov rax, 0

loop_start:
    add rax, 2

    cmp rax, 10
    jl loop_start

This is a bit different. rax still starts at 0, and 2 is still added to rax each time the loop runs. But instead of the unconditional jump at the end of the loop, there is now a conditional jump, which only repeats the loop if a certain condition is met.

    cmp rax, 10

This instruction compares the value of rax to 10. The possible results of this comparison are:

  • rax could be greater than 10
  • rax could be equal to 10
  • rax could be less than 10

When this instruction runs, the result of the comparison is stored in the rflags register. The cmp instruction does not act on the result by itself; it sets things up so the next instruction can.
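To see that cmp only records its result and leaves its operands alone, here is a minimal sketch. It assumes the same Linux x86-64 setup and exit convention used by the full programs in this section, and the starting value 4 is arbitrary:

```nasm
%define sys_exit 60

section .text

global _start
_start:

; An arbitrary starting value
    mov rax, 4

; cmp works out rax - 10 internally and records the outcome in rflags,
; but the difference is thrown away: rax still holds 4 afterwards
    cmp rax, 10

; Exit the program, returning rax as the status code
    mov rdi, rax
    mov rax, sys_exit
    syscall
```

If assembled and run, this should exit with a status code of 4, since the comparison did not change rax.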

    jl loop_start

This is a conditional jump. It jumps to the label loop_start: only if rax is less than 10. If rax is greater than or equal to 10, the jump doesn't happen and the loop ends.

There are several variants of the conditional jump instruction. The jl instruction above stands for "jump if less than".
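As a rough illustration of a few of those variants, the sketch below handles each of the three possible outcomes of cmp rax, 10 with a different conditional jump. The labels is_less, is_equal and done, and the status codes 0, 1 and 2, are made up for this example:

```nasm
%define sys_exit 60

section .text

global _start
_start:

; The value to test (an arbitrary example value)
    mov rax, 12

    cmp rax, 10

; jl: jump if less than
    jl is_less

; je: jump if equal (the flags set by cmp are still intact here)
    je is_equal

; Neither jump was taken, so rax must be greater than 10
    mov rdi, 2
    jmp done

is_less:
    mov rdi, 0
    jmp done

is_equal:
    mov rdi, 1

done:

; Exit the program with the chosen status code
    mov rax, sys_exit
    syscall

; Other variants include jg (greater than), jge (greater than or equal to),
; jle (less than or equal to) and jne (not equal)
```

With rax set to 12, neither jump is taken and the status code should be 2. Changing the starting value to 10 or to 3 should give 1 or 0 instead.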

Altogether, this snippet's behavior can be summarized as follows:

  • rax is set to its starting value of 0.
  • rax is increased to 2.
  • rax is less than 10, so the loop restarts.
  • rax is increased to 4.
  • rax is less than 10, so the loop restarts.
  • rax is increased to 6.
  • rax is less than 10, so the loop restarts.
  • rax is increased to 8.
  • rax is less than 10, so the loop restarts.
  • rax is increased to 10.
  • rax is not less than 10, so the loop does not restart.

The loop runs 5 times, adding 2 to rax each time, until rax reaches 10.
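One way to check the iteration count is to count the iterations in a second register and return that count as the status code. This is only a sketch: using rcx as the counter is a choice made for this example, and it assumes the same Linux exit convention as the programs below:

```nasm
%define sys_exit 60

section .text

global _start
_start:

    mov rax, 0

; rcx counts how many times the loop body has run
    mov rcx, 0

loop_start:
    add rax, 2

; Record one more iteration
    add rcx, 1

    cmp rax, 10
    jl loop_start

; Exit the program, returning the iteration count
    mov rdi, rcx
    mov rax, sys_exit
    syscall
```

The status code should be 5, one for each time 2 was added to rax.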

A basic loop with output

This small program is going to use a loop to print a line of text to the console over and over. Take a look at the following code:

%define sys_write 1
%define stdout 1

%define sys_exit 60

%define newline 10

section .data

    output: db "Greetings!", newline
    output_len: equ $-output

section .text

global _start
_start:

; The number of times to print the text out
    mov rbx, 7

loop_start:

; Print the text to the console
    mov rax, sys_write
    mov rdi, stdout
    mov rsi, output
    mov rdx, output_len
    syscall

; Decrement the loop counter
    dec rbx

; Continue the loop while rbx > 0
    cmp rbx, 0
    jg loop_start


; Exit the program
    mov rax, sys_exit
    mov rdi, 0
    syscall

We're using rbx to keep track of how many times to print the text. Each time the loop runs, it:

  1. Prints the text out
  2. Subtracts 1 from the loop counter rbx
  3. Starts over if rbx is still greater than 0

In closer detail:

section .data

    output: db "Greetings!", newline
    output_len: equ $-output

We've added a data section containing two values:

  1. output - the string to print to the console
  2. output_len - the number of characters in the output string

; The number of times to print the text out
    mov rbx, 7

This sets the loop counter. rbx holds the number of times the loop will run, and we'll use it to decide when to stop repeating the loop.

loop_start:

This is the beginning of the loop.

; Print the text to the console
    mov rax, sys_write
    mov rdi, stdout
    mov rsi, output
    mov rdx, output_len
    syscall

This is the body of the loop. Here we print the string output to the console. This will be executed each time the loop runs.

; Decrement the loop counter
    dec rbx

Now the loop counter in rbx has to be decremented to keep track of how many times the loop has run. Each time the loop runs, we subtract 1 from rbx. So at any point during the program, rbx contains the number of iterations left to run before the loop will be finished.

; Continue the loop while rbx > 0
    cmp rbx, 0
    jg loop_start

Here we check the loop counter against 0. If rbx is greater than 0, the loop continues. If rbx has reached 0, the loop ends.

; Exit the program
    mov rax, sys_exit
    mov rdi, 0
    syscall

At this point, the loop will have run 7 times and then stopped. The program exits.

Type the program above into a file called "printspam.asm" and run it. You should see the text "Greetings!" written out 7 times. Try changing the printed text to something else. Also try changing the initial value of rbx from 7 to something else. Whatever value rbx starts with is the number of times the string will be printed.

However, there's actually a bug in this program: if you set rbx to 0, the string will still be printed once. This is because we're using the wrong loop style for the job. The loop in this program is called a do..while loop, which works like this:

  1. Print the output string
  2. Check if the loop should end yet and start over if not

Notice that the test to decide whether the loop should end doesn't happen until the end of the loop body, after the string has already been printed. This means that a do..while loop always runs at least once, no matter what the result of the conditional check is. Whatever value you give to rbx, the string will be printed at least one time.
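The "at least once" behavior can be checked without printing anything. The sketch below keeps the same do..while shape but replaces the print with a counter in rcx (a register chosen just for this example) and returns the count as the status code:

```nasm
%define sys_exit 60

section .text

global _start
_start:

; Ask for zero iterations
    mov rbx, 0

; rcx counts how many times the loop body actually runs
    mov rcx, 0

loop_start:

; Loop body: count this iteration instead of printing
    add rcx, 1

; Decrement the loop counter
    dec rbx

; Continue the loop while rbx > 0
    cmp rbx, 0
    jg loop_start

; Exit the program, returning the number of iterations
    mov rdi, rcx
    mov rax, sys_exit
    syscall
```

Even though rbx starts at 0, the status code should be 1: the body runs once before the check is ever reached.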

We can solve this problem by using a while loop. A while loop is another style of loop where the conditional check happens at the beginning of the loop, before the print operation. The loop will be reorganized to look more like this:

  1. Check if the loop should end yet and jump out of the loop if so
  2. Print the output string
  3. Go back to step 1

By checking to see if the loop should end at the very beginning of the loop, we prevent the output string from printing at all if rbx is 0 or a negative number.

Take a look at the updated program:

%define sys_write 1
%define stdout 1

%define sys_exit 60

%define newline 10

section .data

    output: db "Greetings!", newline
    output_len: equ $-output

section .text

global _start
_start:

; The number of times to print the text out
    mov rbx, 0

loop_start:

; Check if the loop should end yet
    cmp rbx, 0
    jle loop_stop

; Print the text to the console
    mov rax, sys_write
    mov rdi, stdout
    mov rsi, output
    mov rdx, output_len
    syscall

; Decrement the loop counter
    dec rbx

; Run the loop again
    jmp loop_start

loop_stop:


; Exit the program
    mov rax, sys_exit
    mov rdi, 0
    syscall

Now the conditional check happens at the beginning of the loop. Let's go over the changes in more detail:

; The number of times to print the text out
    mov rbx, 0

rbx now starts at 0. The string should never be printed if the counter starts at 0.

loop_start:

This is the start of the loop.

; Check if the loop should end yet
    cmp rbx, 0
    jle loop_stop

At the very beginning of each loop iteration, we check to see if the loop should end yet.

The cmp instruction is the same as before, but we're using a different form of conditional jump. jle stands for "jump if less than or equal to". So once rbx hits 0, we jump out of the loop to the loop_stop: label.

As long as rbx is greater than 0, we run the loop body:

; Print the text to the console
    mov rax, sys_write
    mov rdi, stdout
    mov rsi, output
    mov rdx, output_len
    syscall

The output string is printed to the console.

; Decrement the loop counter
    dec rbx

The loop counter is decremented, to keep track of the number of times the loop has run.

; Run the loop again
    jmp loop_start

Here we jump back to the start of the loop. This is an unconditional jump, meaning it jumps no matter what. Since we now check if the loop should continue at the beginning of the loop, we don't need to check it here at the end.

loop_stop:

This is where we jump when the loop ends. Once rbx hits 0, the jle instruction will jump here, breaking out of the loop.

Type the new program into a file (or edit the old one) and run it again. You should see that the bug has been fixed. If rbx is set to 0, the string never prints. If rbx is set to a positive integer (5, 7, etc), the string is printed that number of times.
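The same check at the top of the loop also covers negative starting values. As a sketch, the version below replaces the print with a counter in rcx (chosen just for this example) and starts rbx at -3:

```nasm
%define sys_exit 60

section .text

global _start
_start:

; A negative number of iterations
    mov rbx, -3

; rcx counts how many times the loop body actually runs
    mov rcx, 0

loop_start:

; Check if the loop should end yet
    cmp rbx, 0
    jle loop_stop

; Loop body: count this iteration instead of printing
    add rcx, 1

; Decrement the loop counter
    dec rbx

; Run the loop again
    jmp loop_start

loop_stop:

; Exit the program, returning the number of iterations
    mov rdi, rcx
    mov rax, sys_exit
    syscall
```

The status code should be 0, because jle jumps straight to loop_stop: before the body runs even once.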

Exponents

Now we're going to write a program to calculate exponents. We want to be able to take an input like 2 and a power like 3 and calculate the result. With those example values, we should get 2 ^ 3 = 8. 2 ^ 3 is the same as 2 * 2 * 2, so we can calculate it by repeatedly multiplying by the same value. To do this, we'll need a loop similar to the ones introduced above.

Take a look at the following program:

%define sys_exit 60

section .text

global _start
_start:

; Starting values: calculating 2 ^ 3
    mov rbx, 2
    mov rcx, 3

; This stores the result, which starts as 1
    mov rax, 1

loop_start:

; Compare rcx to 0
    cmp rcx, 0

; Break loop once rcx reaches 0
    jle loop_stop

; Multiply rax by rbx, storing the result each time in rax
    imul rax, rbx

; Decrement rcx
    dec rcx

; Start the loop again
    jmp loop_start

loop_stop:

; End the program
    mov rdi, rax
    mov rax, sys_exit
    syscall

Let's break it down line-by-line:

    mov rbx, 2
    mov rcx, 3

These are the "inputs" of the program. Since they're hard-coded into the source, they're not technically inputs, but they are the values we'll be operating on. Since we're trying to calculate 2 ^ 3, both values need to be in registers so we can work with them.

The basic idea here is we're going to have a loop which repeatedly multiplies a running total by the value in rbx. rcx will keep track of how many times this multiplication still needs to occur.

    mov rax, 1

rax will store the running total. We're going to multiply the value in rbx (2) against the value in rax the number of times specified by rcx (3). So we start with rax set to 1, since 1 * 2 * 2 * 2 = 2 ^ 3.
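Written out by hand without a loop, the calculation 1 * 2 * 2 * 2 would look something like the sketch below. It is only meant to show what the loop automates, and it is hard-wired to an exponent of 3:

```nasm
%define sys_exit 60

section .text

global _start
_start:

; The base
    mov rbx, 2

; The running total starts as 1
    mov rax, 1

; Multiply the running total by the base three times
    imul rax, rbx
    imul rax, rbx
    imul rax, rbx

; End the program, returning the result as the status code
    mov rdi, rax
    mov rax, sys_exit
    syscall
```

This should return 8, but changing the exponent means adding or removing imul lines by hand, which is exactly what the loop counter in rcx avoids.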

loop_start:

This marks the beginning of the loop. The loop body will be run 3 times. Execution will jump back to this point repeatedly, until rcx reaches 0 and the final answer is stored in rax.

; Compare rcx to 0
    cmp rcx, 0

We start by checking if the exponent rcx has reached 0. Since rcx might start at 0 (for example, 2 ^ 0 = 1), we have to do this check before running the loop body.

; Break loop once rcx reaches 0
    jle loop_stop

If rcx has reached 0, we end the loop by jumping out of it to the label loop_stop:. This means that the loop will continue running until rcx reaches 0.

; Multiply rax by rbx, storing the result each time in rax
    imul rax, rbx

This is where the actual multiplication happens. Each time the loop runs, we multiply the running total stored in rax by the base number in rbx.

Each time a loop runs is called an iteration. Take a look at the following table, which lists each iteration and shows how rax increases each time as it's multiplied by 2:

Iteration   rax starting value   rax ending value
1           1                    2
2           2                    4
3           4                    8

The loop runs three times, each time multiplying the value in rax by 2.

    dec rcx

We don't want the loop to run forever, so we need a way of keeping track of how many times it's been run. We start the program by setting rcx to the value of 3 as that's how many times we want to run the loop. So each time the loop runs, we need to reduce the value of rcx by 1. That's what the dec instruction does: it subtracts 1 from whatever register you give it. This is also known as decrementing.

Note: We could also have used the sub instruction like this: sub rcx, 1. For our purposes, dec rcx and sub rcx, 1 are equivalent. (One difference that doesn't matter here: sub updates the carry flag, while dec leaves it unchanged.)
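As a quick sketch of the two instructions side by side, the program below subtracts 1 from rcx once with each and returns what is left. The starting value 3 is arbitrary:

```nasm
%define sys_exit 60

section .text

global _start
_start:

    mov rcx, 3

; Subtract 1 using dec: rcx is now 2
    dec rcx

; Subtract 1 using sub: rcx is now 1
    sub rcx, 1

; End the program, returning rcx as the status code
    mov rdi, rcx
    mov rax, sys_exit
    syscall
```

The status code should be 1, since each instruction removed exactly 1 from rcx.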

See the following table which lists each iteration, including the value of rcx at the end of the loop (after the dec rcx line) for each one:

Iteration   rax ending value   rcx ending value
1           2                  2
2           4                  1
3           8                  0

After each of the first two iterations, rcx is still greater than 0, so the loop continues and rax is multiplied by 2 again. At the end of the third iteration, rcx reaches 0, so the check at the top of the loop jumps to loop_stop: and the loop stops. At this point, rax is left with its final value of 8, which is the result of 2 ^ 3.

; Start the loop again
    jmp loop_start

This is the end of the loop. We jump back to the beginning to keep the loop running.

loop_stop:

This is the label we jump to in order to end/break the loop. Once rcx hits 0, the conditional jump instruction jle will jump here, breaking out of the loop and preventing it from running again.

    mov rdi, rax
    mov rax, sys_exit
    syscall

When the loop ends, the result will be stored in rax (and it should be 8). In order to return the value as an exit status code, it needs to be in rdi, so we move it there and then end the program. 8 gets returned as the status code.

Type the program into a file called "exponent.asm" and run it with the "run" script:

./run exponent

You should see 8 returned on the console. Try changing the values of rbx and rcx to calculate different exponents.

For example, try 7 ^ 3. Modify the lines at the beginning that set the inputs like this:

    mov rbx, 7
    mov rcx, 3

This should give us 7 * 7 * 7 = 343, right? Wrong! Exit status codes are only one byte in size, which means they can only hold a number from 0 to 255. When we try to return 343, only the lowest 8 bits of the value are kept, so it wraps around past 255. Instead of returning 343, the program returns 87, which is 343 - 256. This limitation makes status codes a bad way to get this kind of output from a program. Eventually, we'll work through converting numbers of (virtually) any size into ASCII strings and printing those out, but we're not quite there yet.
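The wrap-around arithmetic can be modeled with the same kind of while loop used earlier in this section. This sketch is only an illustration of the arithmetic: the system does not actually loop, it just keeps the lowest byte of the value. It assumes a non-negative starting value:

```nasm
%define sys_exit 60

section .text

global _start
_start:

; The value we would like to return
    mov rax, 343

loop_start:

; Stop once the value fits in one byte (0 to 255)
    cmp rax, 256
    jl loop_stop

; Otherwise, wrap around by subtracting 256
    sub rax, 256

; Check again
    jmp loop_start

loop_stop:

; End the program, returning the wrapped value
    mov rdi, rax
    mov rax, sys_exit
    syscall
```

Starting from 343, the loop subtracts 256 once and stops at 87, matching the status code the exponent program reports for 7 ^ 3.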