Off-by-one error

From Infogalactic: the planetary knowledge core
Jump to: navigation, search

An off-by-one error (OBOE), also commonly known as an OBOB (off-by-one bug), is a logic error involving the discrete equivalent of a boundary condition. It often occurs in computer programming when an iterative loop iterates one time too many or too few. This problem could arise when a programmer makes mistakes such as using "is less than or equal to" where "is less than" should have been used in a comparison or fails to take into account that a sequence starts at zero rather than one (as with array indices in many languages). This can also occur in a mathematical context.

Looping over arrays

Consider an array of items, and items m through n (inclusive) are to be processed. How many items are there? An intuitive answer may be n − m, but that is off by one, exhibiting a fencepost error; the correct answer is n –m + 1.

For this reason, ranges in computing are often represented by half-open intervals; the range from m to n (inclusive) is represented by the range from m (inclusive) to n + 1 (exclusive) to avoid fencepost errors. For example, a loop that iterates five times can be written as a half-open interval from 0 to 5:

for (i = 0; i < 5; i++) 
{
    /* Body of the loop */
}

The loop body is executed first of all with i equal to 0; i then becomes 1, 2, 3, and finally 4 on successive iterations. At that point, i becomes 5, so i < 5 is false and the loop ends. However, if the comparison used were <= (less than or equal to), the loop would be carried out six times: i takes the values 0, 1, 2, 3, 4, and 5. Likewise, if i were initialized to 1 rather than 0, there would only be four iterations: i takes the values 1, 2, 3, and 4. Both of these alternatives can cause off-by-one errors.

Another such error can occur if a do-while loop is used in place of a while loop (or vice versa.) A do-while loop is guaranteed to run at least once.

Array-related confusion may also result from differences in programming languages. Numbering from 0 is most common, but some languages start array numbering with 1. Pascal has arrays with user-defined indices. This makes it possible to model the array indices after the problem domain.

Fencepost error

File:Fencepost error.svg
A straight fence with n sections has n+1 posts.

A fencepost error (occasionally called a telegraph pole or lamp-post error) is a specific type of off-by-one error. The following problem illustrates the error:

<templatestyles src="Template:Blockquote/styles.css" />

If you build a straight fence 30 meters long with posts spaced 3 meters apart, how many posts do you need?

The intuitive answer 10 is wrong. The fence has 10 sections, but 11 posts.

The reverse error occurs when the number of posts is known and the number of sections is assumed to be the same. The actual number of sections is one less than the number of posts.

More generally, the problem can be stated as follows:

<templatestyles src="Template:Blockquote/styles.css" />

If you have n telegraph poles, how many gaps are there between them?

The correct answer may be n − 1, if the line of poles is open-ended, n if they form a loop, or n + 1 if the open sides of the sequence of poles count as sections. The precise problem definition must be carefully considered, as the setup for one situation may give the wrong answer for other situations. Fencepost errors come from counting things rather than the spaces between them, or vice versa, or by neglecting to consider whether one should count one or both ends of a row.

Fencepost errors can also occur in units other than length. For example, the Time Pyramid, consisting of 120 blocks placed at 10 year intervals between blocks, is scheduled to take 1190 (not 1200) years to build, from the installation of the first block to the last block. One of the earliest fencepost errors involved time, where the Julian calendar originally calculated leap years incorrectly, due to counting inclusively rather than exclusively, yielding a leap year every three years rather than every four.

"Fencepost error" can[citation needed], in rare occasions, refer to an error induced by unexpected regularities in input values, which can (for instance) completely thwart a theoretically efficient binary tree or hash function implementation. This error involves the difference between expected and worst case behaviours of an algorithm.

In larger numbers being off-by-one is often not a major issue. In smaller numbers, however, and specific cases where accuracy is paramount committing an off-by-one error can be disastrous. Sometimes such an issue will also be repeated and, therefore, worsened, by someone passing on an incorrect calculation if the following person makes the same kind of mistake again (of course, the error might also be reversed).

An example of this error can occur in the computational language Matlab with the linspace() function, whose parameters are (lower value, upper value, number of values) and not (lower value, upper value, number of increments). A programmer who misunderstands the third parameter to be the number of increments might hope that linspace(0,10,5) would achieve a sequence [0, 2, 4, 6, 8, 10] but instead would get [0, 2.5, 5, 7.5, 10].

Security implications

A common off-by-one error which results in a security-related bug is caused by misuse of the C standard library strncat routine. A common misconception with strncat is that the guaranteed null termination will not write beyond the maximum length. In reality it will write a terminating null character one byte beyond the maximum length specified. The following code contains such a bug:

void foo (char *s) 
{
    char buf[15];
    memset(buf, 0, sizeof(buf));
    strncat(buf, s, sizeof(buf)); // Final parameter should be: sizeof(buf)-1
}

Off-by-one errors are common in using the C library because it is not consistent with respect to whether one needs to subtract 1 byte – functions like fgets() and strncpy will never write past the length given them (fgets() subtracts 1 itself, and only retrieves (length − 1) bytes), whereas others, like strncat will write past the length given them. So the programmer has to remember for which functions they need to subtract 1.

On some systems (little endian architectures in particular) this can result in the overwriting of the least significant byte of the frame pointer. This can cause an exploitable condition where an attacker can hijack the local variables for the calling routine.

One approach that often helps avoid such problems is to use variants of these functions that calculate how much to write based on the total length of the buffer, rather than the maximum number of characters to write. Such functions include strlcat and strlcpy, and are often considered "safer" because they make it easier to avoid accidentally writing past the end of a buffer. (In the code example above, calling strlcat(buf, s, sizeof(buf)) instead would remove the bug.)

See also

References

  • Lua error in package.lua at line 80: module 'strict' not found.