Late night thoughts on security: Defensive Programming

Overview

Defensive programming is a technique in which you assume the worst about all input.

The biggest danger to your application is user input.
- It's uncontrolled, unexpected and unpredictable.
The input sent to your application could be malicious.
- Or it could just be something you never expected.
Debugging takes a lot of time.

First Rule

Never Assume Anything!

Two kinds of assumptions:

Expected input
- Data from the user cannot be trusted. As such, all input must be validated.
- For each input:
  - Define the set of all legal input values.
  - When receiving input, validate against this set.
  - Determine the behavior when input is incorrect:
    - Terminate
    - Retry
    - Warning
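
The validation steps above can be sketched in C roughly as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name parse_age, the legal range 0 to 150, and the result codes are all hypothetical and chosen only for illustration.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical legal set for this example: whole numbers from 0 to 150. */
#define AGE_MIN 0
#define AGE_MAX 150

/* Illustrative result codes; the caller decides whether to
 * terminate, retry, or warn. */
enum parse_result { PARSE_OK = 0, PARSE_NOT_A_NUMBER, PARSE_OUT_OF_RANGE };

/* Validates text against the legal set and stores the value in *out.
 * *out is written only when PARSE_OK is returned. */
enum parse_result parse_age(const char *text, int *out)
{
    char *end = NULL;
    long value;

    if (text == NULL || out == NULL || *text == '\0') {
        return PARSE_NOT_A_NUMBER;
    }

    errno = 0;
    value = strtol(text, &end, 10);

    /* Reject input with no digits or with trailing junk such as "42abc". */
    if (end == text || *end != '\0') {
        return PARSE_NOT_A_NUMBER;
    }
    /* strtol reports overflow through errno; then apply our own range. */
    if (errno == ERANGE || value < AGE_MIN || value > AGE_MAX) {
        return PARSE_OUT_OF_RANGE;
    }

    *out = (int)value;
    return PARSE_OK;
}
```

Note that strtol skips leading whitespace and accepts a sign, so " 42" and "+42" pass this check. If those are not in your legal set, reject them before calling strtol.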
Assumptions about the programming language
- Some primitive data types have different sizes depending on the operating system and the hardware platform. For example, integers have been 8, 16, 32 and 64 bits wide. Assuming the size of a data type can be disastrous when the code runs on a different platform.
- In C, the ranges of the integer types are defined in limits.h. In addition, C has the sizeof operator, which yields the size in bytes of a type or variable.
- You need to be especially careful with integer operations.

short x = 10000 * 10;
Will x overflow?
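
Whether x overflows depends on the platform, which is exactly the point. Where short is 16 bits and int is 32 bits, 10000 * 10 is computed as the int 100000, which is larger than SHRT_MAX (32767), so the value cannot be stored in x intact. A minimal sketch of checking the limits instead of assuming them might look like this; the helper name mul_to_short is hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: multiplies a and b and stores the product in *out
 * only if it fits in a short. Returns false instead of overflowing. */
bool mul_to_short(short a, short b, short *out)
{
    /* long long is at least 64 bits wide, so the product of two shorts
     * cannot overflow it as long as short is at most 32 bits wide. */
    long long product = (long long)a * (long long)b;

    /* Compare against the limits from limits.h rather than a guessed size. */
    if (product < SHRT_MIN || product > SHRT_MAX) {
        return false;
    }

    *out = (short)product;
    return true;
}
```

The range test uses SHRT_MIN and SHRT_MAX, so the same code stays correct on a platform where short is wider than 16 bits.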

Testing:

Just testing that it works is not good enough.
Error cases need to be tested, to see that the application reacts accordingly. Add tests for different kinds of input:

illogical
strange ASCII characters
numerical/non-numerical
positive/negative/0
too large/small
only composed of numbers/letters
border values
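
One way to cover these kinds of input is a table of test cases that is run against the function under test. The sketch below is only an illustration: is_valid_name, its rule (1 to 16 letters or digits) and the test table are all hypothetical, not taken from a real project.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define USERNAME_MAX_LEN 16  /* hypothetical limit for this example */

/* Hypothetical function under test: accepts 1..USERNAME_MAX_LEN characters,
 * each a letter or digit (as classified by isalnum in the current locale). */
bool is_valid_name(const char *s)
{
    size_t len, i;

    if (s == NULL) {
        return false;
    }
    len = strlen(s);  /* assumes s is NUL-terminated */
    if (len < 1 || len > USERNAME_MAX_LEN) {
        return false;
    }
    for (i = 0; i < len; i++) {
        if (!isalnum((unsigned char)s[i])) {
            return false;
        }
    }
    return true;
}

struct name_case { const char *input; bool expected; };

static const struct name_case name_cases[] = {
    { "alice",             true  },  /* ordinary input */
    { "a",                 true  },  /* border: shortest legal value */
    { "abcdefghijklmnop",  true  },  /* border: exactly 16 characters */
    { "abcdefghijklmnopq", false },  /* border: 17 characters, too large */
    { "",                  false },  /* too small */
    { "12345",             true  },  /* only composed of numbers */
    { "bob smith",         false },  /* contains a space */
    { "tab\there",         false },  /* strange ASCII character */
    { "-1",                false },  /* negative number as text */
    { NULL,                false },  /* illogical: no input at all */
};

/* Runs every case and returns the number of failures; 0 means all passed. */
int count_name_failures(void)
{
    size_t i;
    int failures = 0;

    for (i = 0; i < sizeof name_cases / sizeof name_cases[0]; i++) {
        if (is_valid_name(name_cases[i].input) != name_cases[i].expected) {
            failures++;
        }
    }
    return failures;
}
```

Adding a new error case is then a one-line change to the table, which makes it cheap to record every strange input a tester finds.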

Ask other people to test your application
- Start with the technical testers
- Then ask non-technical people

Second Rule

Use standards!

Proper coding standards address weaknesses in the language standard and/or compiler design.
They define a format or "style" used for writing code. Every software development team should have an agreed-upon and formally documented coding standard. Coding standards make code more coherent and easier to read, thus reducing the likelihood of bugs. They cover a wide range of topics:

Variable naming
Indentation
Position of brackets
Content of header files
Function declaration
Use of constants/magic numbers
Macro definitions

Good reference: Google C++ Style Guide

Third Rule

Keep code as simple as possible!

Complexity breeds bugs.
Software should only contain the features it needs.
Proper planning is key to keeping your application simple.
Functions should be seen as a contract
- Given input, they execute a specific task.
- They should not do anything other than that specific task.
- If they cannot execute that task, they should have some kind of indicator so that the caller can detect the error.
  - Throw an exception (not available in C)
  - Set a global error value
  - Return an invalid value
    - NULL?
    - False?
    - Negative number?
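
In C, two of these indicators are often combined: a negative return value says that the call failed, and the global errno says why. The sketch below assumes a hypothetical function checked_div whose contract is spelled out in its comment; the choice of EDOM and ERANGE is an assumption made for this example.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical contract: on success returns 0 and stores a / b in *result.
 * On failure returns -1, sets errno, and leaves *result untouched.
 *   EDOM   - an argument is outside the function's domain
 *            (b is 0, or result is NULL)
 *   ERANGE - the quotient cannot be represented in an int */
int checked_div(int a, int b, int *result)
{
    if (result == NULL || b == 0) {
        errno = EDOM;
        return -1;
    }
    /* INT_MIN / -1 overflows on two's complement machines. */
    if (a == INT_MIN && b == -1) {
        errno = ERANGE;
        return -1;
    }

    *result = a / b;
    return 0;
}
```

Returning the status separately from the computed value avoids the question of which value is "invalid": every int is a possible quotient, so none of them could be reserved as an error marker.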
Refactoring
- It is not a bug-fixing technique, but it is a good technique to battle feature creep:
  - Features are often added during development
  - These features are often the source of problems
- It fights this by forcing the programmer to reevaluate the structure of his/her program.
- It can help you keep your application simple
Third-party libraries
- Code reuse is not just a smart choice, it's a safe choice. Odds are that a specific library has proven itself and is much more stable than anything you could build short-term. Although code reuse is highly recommended, many questions must be addressed before using someone else's code:
  - Does this do exactly what I need?
  - How much will I need to change my design?
  - How stable is it? What reputation does it have?
  - How old is the code?
  - Who built it?
  - Are people still using it? Can I get help?
  - How much documentation is there?

Case Study

(on function PF_SafeStrCpy)

1. Initial version:

/*
** This function copies the null terminated string str2 to str1.
** Str1 is truncated, if the length of str2 is greater than or equal to
** the length of str1. The parameter len1 must contain the length value
** of str1 (sizeof).
*/
/* this function replaces the system one due to QAC */
IM_string PF_SafeStrCpy (IM_string str1,
                         IM_cstring str2,
                         IM_word len1)
{
    IM_word len2;

    /* strlen returns the length without the terminating '\0' -> + 1 */
    len2 = strlen(str2) + 1;

    if (len2 > len1)
    {
        str1[len1 - 1] = '\0';
        return strncpy(str1, str2, len1 - 1);
    }
    else
    {
        str1[len2] = '\0';
        return strncpy(str1, str2, len2);
    }
}

2. Problem discovered: A simple test proves that this function is not so safe though:

char str1[100];
char str2[100]; /* strlen(str2) == 99 */

The call PF_SafeStrCpy(str1, str2, 100); takes the else branch: len2 is 100, so str1[len2] = '\0' writes to str1[100], one byte past the end of str1.

3. Actual corrected version:

/**
 * \see PF_Basic.h
 * corrected version of PF_SafeStrCpy; the origin is from project SAR
 */
char* PF_SafeStrCpy (char* str1, const char* str2, IM_word len)
{
    if ( 0 == len ) {
        return NULL;
    }
    else {
        char* cptr = strncpy(str1, str2, len);
        str1[len-1] = '\0';
        return cptr;
    }
}

4. Remarks on actual version:

Attack possibilities:

Passing NULL as "src" (str2) or "dest" (str1) can easily cause the program to terminate, thereby enabling a DoS attack.
Passing an incorrect "count" (len), which strncpy uses unchecked, causes buffer overflows.

5. General solution:

As a rule, the buffer that receives the string must be large enough to hold the specified maximum number of characters (not bytes) plus the terminating null character.

Follow these rules for safe use of strncpy():

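Verify that src and dest are not NULL.
Set the final character of dest to the null character before copying.
Use strncpy(dest, src, sizeof(dest)/sizeof(dest[0])). This size expression is only valid where dest is an array, not a pointer.
If the final character of dest (index sizeof(dest) - 1) is no longer null after the copy, then src did not fit: the copy was truncated and dest is not terminated.

These steps are effective, but usage still requires care in checking sizes.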
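
The rules above could be wrapped in a single function along these lines. This is a sketch rather than a drop-in replacement: the name copy_checked and its return codes are hypothetical, and the caller is still trusted to pass the true element count of dest.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper following the rules above.
 * dest_size is the number of elements in dest.
 * Returns  0 if src fit completely,
 *          1 if src was truncated (dest is still terminated),
 *         -1 if an argument was invalid. */
int copy_checked(char *dest, size_t dest_size, const char *src)
{
    if (dest == NULL || src == NULL || dest_size == 0) {
        return -1;
    }

    /* Pre-set the last element. strncpy overwrites it with a non-null
     * character only when src has dest_size or more characters. */
    dest[dest_size - 1] = '\0';
    strncpy(dest, src, dest_size);

    if (dest[dest_size - 1] != '\0') {
        dest[dest_size - 1] = '\0';  /* restore the terminator */
        return 1;
    }
    return 0;
}
```

A call such as copy_checked(buf, sizeof buf, input) works when buf is a char array in scope. If buf has decayed to a pointer, sizeof buf is the pointer size and the check silently protects the wrong length.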

6. Observations:

strlen() vs sizeof()

This code:

char String[]="Hello";

printf("\n Size of string: %zu. String length: %zu.\n", sizeof(String), strlen(String) );

... prints: Size of string: 6. String length: 5.

Non NULL-terminated string:

This code:

char String[5] = { 'H', 'e', 'l', 'l', 'o' };

printf("\n Size of string: %zu. String length: %zu.\n", sizeof(String), strlen(String) );

... might print: Size of string: 5. String length: 19. (The string length is unpredictable. String has no terminator, so strlen() keeps reading past the end of the array until it happens to find a NUL byte, which is undefined behavior.)

strlen() with pointers:

This code:

char *ptr = "Hello";

printf("For ptr: sizeof = %zu, strlen = %zu.\n", sizeof ptr, strlen(ptr));

... prints: For ptr: sizeof = 4, strlen = 5. (ptr is a pointer, so sizeof ptr is the size of the pointer, in this case 4 bytes. On a 64-bit platform it would typically be 8.)

strlen() is NOT safe to call!
- Unless you positively know that the string IS null-terminated.
- When you call strlen() on an improperly terminated string:
  - strlen() scans until a null character is found
  - Can scan outside buffer if string is not null-terminated
  - Can result in a segmentation fault or bus error
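
When the size of the buffer is known, the length can be measured without ever reading past it. The following is a minimal sketch using the standard memchr function; the helper name bounded_len is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the string length if a null character is
 * found within the first buf_size bytes of buf. Returns buf_size if no
 * terminator is found, and 0 if buf is NULL. Never reads past buf_size. */
size_t bounded_len(const char *buf, size_t buf_size)
{
    const char *nul;

    if (buf == NULL) {
        return 0;
    }

    /* memchr examines at most buf_size bytes. */
    nul = (const char *)memchr(buf, '\0', buf_size);
    return (nul != NULL) ? (size_t)(nul - buf) : buf_size;
}
```

A result equal to buf_size tells the caller that the buffer holds no terminator, so it must not be passed on to strlen(), printf("%s") or similar functions. Where it is available, the POSIX function strnlen() offers the same bounded behavior.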

Good external links

Defensive programming on Wikipedia
Secure programmer: Developing secure programs article by David Wheeler
Secure programmer: Countering buffer overflows article by David Wheeler
Proactive Debugging article by Jack Ganssle
Secure Programming for Linux and Unix HOWTO -- Creating Secure Software free book by David Wheeler
Programming issues: Buffer overflows


Tuesday, November 16, 2010
