Lecture 12: Software Security and Testing

Memory Errors

Memory errors are bugs in how a program handles memory; they arise in memory-unsafe languages such as C and C++.

When a memory error occurs, the program accesses memory that it should not; this is termed a memory safety violation.

A memory error either crashes the program or makes it behave in unexpected ways.

All types of memory errors can be potentially exploited by attackers.

Common memory errors: heap and stack buffer overflows and underflows

Buffer Overflows

Writing outside the boundaries of a buffer → spatial violation

Buffer overflows are caused by insufficient input checks, unchecked buffer sizes, and integer overflows/underflows

Stack Buffer Overflow Example:

In C, a string is just a buffer of chars; a null character '\0' marks the end of the string.

If the string is bigger than the buffer → Stack Buffer Overflow

All memory regions except the stack and the heap have a fixed size; the stack and the heap grow and shrink while the program runs.

Buffer overflows can happen when calling common string and buffer functions.

But they are not limited to those functions

E.g. read(), fread(), gets(), fgets(), etc. can also write past the end of a buffer

Custom data copying code can also suffer

String copy function.

The problem: if the source buffer is larger than the destination buffer → buffer overflow
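The problem above goes away once the copy is bounded by the size of the destination. A minimal sketch, assuming a hypothetical helper named bounded_copy (not a standard library function) that truncates instead of overflowing:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst without ever writing past dst[dst_size - 1].
 * Returns 0 if the whole string fit, -1 if it was truncated
 * or if dst_size is 0. bounded_copy is a hypothetical helper. */
int bounded_copy(char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL && src != NULL);
    if (dst_size == 0) {
        return -1;                      /* no room even for the terminator */
    }
    size_t src_len = strlen(src);
    size_t room = dst_size - 1;         /* keep one byte for '\0' */
    size_t n = src_len < room ? src_len : room;
    memcpy(dst, src, n);
    dst[n] = '\0';                      /* always terminate */
    return src_len == n ? 0 : -1;
}
```

With a 4-byte destination, copying "abcdef" stores "abc" plus the terminator and reports the truncation through the return value, so the caller can decide how to react.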

Buffer Underflows

Writing before the start of a buffer → spatial violation

Common programmer errors that lead to it:

- Insufficient input checks

- Unchecked buffer size

- Integer overflows/underflows

Less common than buffer overflows

The example copies the src buffer into the dst buffer, but starts from the end and works backward

If the src string is larger than the dst buffer, the copy runs past the start of dst → buffer underflow

Off-by-One Variation

Writing outside the boundaries of a buffer by one byte → spatial violation

dp = dst + dst_size - 1

Off-by-one buffer overflow
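The pointer dst + dst_size - 1 is the last valid byte of the buffer, and off-by-one bugs usually come from a loop bound that forgets to reserve that byte for the terminator. A minimal sketch of the correct bound, assuming a hypothetical function fill_string written only for this illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Fill dst with the character c and terminate it.
 * Returns the number of bytes written, including the '\0'.
 * fill_string is a hypothetical example function. */
size_t fill_string(char *dst, size_t dst_size, char c) {
    assert(dst != NULL);
    if (dst_size == 0) {
        return 0;
    }
    char *last = dst + dst_size - 1;    /* last valid byte of the buffer */
    char *dp = dst;
    /* Writing `dp <= last` here would fill the whole buffer, and the
     * terminator below would then land one byte past the end. */
    while (dp < last) {
        *dp++ = c;
    }
    *dp = '\0';                         /* dp == last: still inside dst */
    return (size_t)(dp - dst) + 1;
}
```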

Summary

Write → over/under-write → corrupt neighboring memory areas

Read → over/under-read → leak data from neighboring memory areas

Are These Vulnerabilities?

If a user can trigger them → yes

Effects of over/under-writes:

- Crash the application (DoS)

- Take over the application (remote code execution, arbitrary code execution) → the attacker can manipulate the application

- Corrupt application state (e.g. an is_admin variable that controls whether the user may connect to the program)
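The state-corruption case can be simulated without invoking undefined behavior by keeping the buffer and its neighbor inside one array. This is a sketch under assumptions: the layout (an 8-byte name followed by a one-byte is_admin flag) and all names are hypothetical, chosen only to mirror the is_admin example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { NAME_SIZE = 8, ADMIN_OFFSET = NAME_SIZE, STATE_SIZE = NAME_SIZE + 1 };

/* Simulated application state: bytes 0..7 hold a name, byte 8 holds
 * the is_admin flag. One array keeps the "overflow" well-defined C. */
typedef struct {
    unsigned char bytes[STATE_SIZE];
} app_state;

/* Vulnerable: trusts len, which may exceed NAME_SIZE. The assert caps
 * len at STATE_SIZE only so the demonstration stays inside the array. */
void set_name_unchecked(app_state *s, const unsigned char *name, size_t len) {
    assert(s != NULL && name != NULL && len <= STATE_SIZE);
    memcpy(s->bytes, name, len);
}

/* Fixed: rejects input that does not fit in the name field. */
int set_name_checked(app_state *s, const unsigned char *name, size_t len) {
    assert(s != NULL && name != NULL);
    if (len > NAME_SIZE) {
        return -1;
    }
    memcpy(s->bytes, name, len);
    return 0;
}

int is_admin(const app_state *s) {
    assert(s != NULL);
    return s->bytes[ADMIN_OFFSET] != 0;
}
```

A 9-byte input whose last byte is nonzero flips is_admin through set_name_unchecked, while set_name_checked rejects the same input and leaves the flag untouched.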

Effects of over/under-read:

- Leak sensitive data

Code Injection

Actions that lead to buffer overflow

The gets(buf) function is insecure and can lead to buffer overflows

Overwrite the stack and inject code into buf

Overwrite the return address of the function so that it points to the code just injected

When the main function returns, instead of terminating, the program executes "/bin/sh"

How to learn buf's address?

We assume an almost fixed memory layout (attacker can infer buffer's location easily)

Writable and Executable Memory

Code injection is possible because there is a memory area that is both writable and executable

Can be eliminated if we introduce the W^X Policy:

The Write XOR Execute (W^X) policy mandates that in a program there are no memory pages that are both writable and executable
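On Linux, one way to check the policy is to look at the permission field of each mapping listed in /proc/self/maps. A minimal sketch, assuming that file's line format (address range, then four permission characters such as rw-p); the function name is hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Inspect one line in the format of Linux's /proc/self/maps, e.g.
 *   "7ffc1000-7ffc3000 rw-p 00000000 00:00 0 [stack]"
 * Returns 1 if the mapping is both writable and executable (a W^X
 * violation), 0 if it is not, and -1 if the line cannot be parsed. */
int maps_line_violates_wx(const char *line) {
    assert(line != NULL);
    const char *space = strchr(line, ' ');
    if (space == NULL || strlen(space + 1) < 4) {
        return -1;
    }
    const char *perms = space + 1;      /* four characters: r, w, x, p/s */
    return perms[1] == 'w' && perms[2] == 'x';
}
```

Feeding every line of /proc/self/maps through this check and finding no 1 results would indicate that the process currently has no page that is both writable and executable.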

The Memory Management Unit (MMU) - Paging

Used in all modern servers, laptops, and smartphones

When the CPU executes an instruction that accesses memory, the access goes through the MMU, which enforces permissions at the page level; this is what prevents code injection

Page Permissions

W^X

If memory is writable, it should not be executable

- Does not allow stack to be executed

- Cannot execute injected code

But we can reuse code present in the executable instead

Code Reuse (Return-to-libc)

When the attacker can no longer inject code, they can try return-to-libc

- Overwrite the return address with a pointer to a libc function such as system

Call the system function with arguments controlled by the attacker

How to learn system's address?

We assume an almost fixed memory layout → the attacker can easily find system's address and exploit it

Code Reuse (Gadgets)

Instead of whole functions, we can overwrite the return address on the stack with pointers to gadgets (short instruction sequences already present in the program)

Chain gadgets, each of which performs a small piece of computation

Defenses

Enforcement

Randomization

Fixed Process Layout

The programs we have exploited (thus far) have a fixed memory layout

- Data segments start at the same address

- Binary is loaded at the same address

- Shared libraries are loaded at the same address

One Attack Fits All

Fixed process layout → facilitates exploit development

Attacker can statically discover:

- the location of their data

- the location of code (e.g. functions, gadgets)

An exploit developed on one system will work on all other systems running the same software

Address Space Layout Randomization (ASLR)

Ideal version → when starting up a process, randomly pick the base address where each data and code segment will be loaded.

Introduce uncertainty for the attacker → they need to guess the location of code and their data

In practice, completely reordering the segments like this is not possible.

Base addresses are randomly selected from within predetermined ranges

Libraries are loaded in the gaps (usually between the stack and the heap)

Example

No ASLR - Fixed memory layout

With ASLR - the stack frames are slightly moved
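The difference can be observed by printing a few addresses and running the program several times. The sketch below is an assumption-laden probe, not part of the lecture: the struct and function names are made up, and whether the code and data addresses change between runs depends on the OS settings and on whether the binary was built as position-independent.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    uintptr_t code;   /* address of a function in the binary */
    uintptr_t data;   /* address of a global variable */
    uintptr_t heap;   /* address of a malloc'd block (0 if malloc failed) */
    uintptr_t stack;  /* address of a local variable */
} layout_sample;

static int global_marker = 1;

/* Record one address from each region of the process. */
layout_sample sample_layout(void) {
    layout_sample s;
    int local_marker = 0;
    void *block = malloc(16);
    s.code = (uintptr_t)&sample_layout;
    s.data = (uintptr_t)&global_marker;
    s.heap = (uintptr_t)block;
    s.stack = (uintptr_t)&local_marker;
    free(block);
    return s;
}

/* Print the sample; compare the output of several runs. */
void print_layout(const layout_sample *s) {
    assert(s != NULL);
    printf("code  0x%" PRIxPTR "\n", s->code);
    printf("data  0x%" PRIxPTR "\n", s->data);
    printf("heap  0x%" PRIxPTR "\n", s->heap);
    printf("stack 0x%" PRIxPTR "\n", s->stack);
}
```

With a fixed layout every run prints the same four values; with ASLR enabled at least the stack and heap addresses are expected to differ from run to run.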

Randomization

Most OSs support ASLR

Randomization is prone to information disclosure attacks

First, exploit some vulnerabilities to learn the memory layout and then (usually) exploit other vulnerabilities to complete your attack, e.g. code reuse.

Information Disclosure Attack: Heartbleed (buffer overread)

Heartbleed-style information disclosure can also be used to learn the memory layout
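A buffer over-read of this kind happens when a reply is built from a length that the request claims rather than from the number of bytes actually received. The following is a simplified model with hypothetical names, not the actual OpenSSL code; it only shows the bounds check whose absence causes the leak.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Echo a heartbeat-style payload back into out.
 * claimed_len comes from the request and is attacker-controlled;
 * payload_len is how many payload bytes were actually received.
 * Returns 0 on success, -1 if the request must be rejected. */
int echo_payload(unsigned char *out, size_t out_size,
                 const unsigned char *payload, size_t payload_len,
                 size_t claimed_len) {
    assert(out != NULL && payload != NULL);
    if (claimed_len > payload_len) {
        return -1;      /* would read past the payload: over-read */
    }
    if (claimed_len > out_size) {
        return -1;      /* would write past the reply buffer */
    }
    memcpy(out, payload, claimed_len);
    return 0;
}
```

Without the first check, a request carrying 3 bytes but claiming 64 would copy 61 bytes of whatever lies after the payload in memory into the reply.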

Enforcement

Enforce some policy to protect application state

- W^X

- Return address protections (e.g. canaries, shadow stacks)

- Code pointer protections (e.g. Control-Flow Integrity, Code Pointer Integrity)
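The idea behind a canary can be modeled in a few lines: a known value sits between the buffer and the return address, so a linear overflow that reaches the return address has to change the canary first. The sketch below simulates a stack frame inside one array; the layout, sizes, and names are all assumptions for illustration, and real compilers insert and check canaries automatically.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { BUF_SIZE = 8, CANARY_SIZE = 4, RET_SIZE = 4,
       FRAME_SIZE = BUF_SIZE + CANARY_SIZE + RET_SIZE };

/* Simulated frame: [ buffer | canary | return address ]. */
typedef struct {
    unsigned char bytes[FRAME_SIZE];
    unsigned char expected[CANARY_SIZE];  /* reference copy kept elsewhere */
} sim_frame;

/* Function entry: place the canary between buffer and return address. */
void frame_init(sim_frame *f, const unsigned char canary[CANARY_SIZE]) {
    assert(f != NULL && canary != NULL);
    memset(f->bytes, 0, FRAME_SIZE);
    memcpy(f->bytes + BUF_SIZE, canary, CANARY_SIZE);
    memcpy(f->expected, canary, CANARY_SIZE);
}

/* Vulnerable write: len may exceed BUF_SIZE (capped at FRAME_SIZE
 * only to keep the simulation inside the array). */
void frame_write_unchecked(sim_frame *f, const unsigned char *data, size_t len) {
    assert(f != NULL && data != NULL && len <= FRAME_SIZE);
    memcpy(f->bytes, data, len);
}

/* Function exit: returns 1 if the canary is unchanged, 0 otherwise. */
int frame_canary_intact(const sim_frame *f) {
    assert(f != NULL);
    return memcmp(f->bytes + BUF_SIZE, f->expected, CANARY_SIZE) == 0;
}
```

A program using this scheme would abort instead of returning whenever frame_canary_intact reports 0.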

It is extremely difficult and expensive to protect the integrity of everything (enforcement cannot cover all potential attacks)

It is extremely difficult to protect against every potential attack. However, attackers have to exploit bugs to launch their attacks → find the bugs and fix them before the attackers do

Fuzzing

Find bugs in a program by feeding it large quantities of automatically generated inputs

The program is run on every input generated

Every run is monitored for any sign of bugs or vulnerabilities

Simplest Fuzzing

Example target: a program that opens a JPEG file

How could we do better?

- Randomly corrupt/change real JPEG files

- Reference the JPEG spec so that we generate only "JPEG-looking" data (inputs that form well-formed JPEG pictures)

- Instrument the JPEG parser to measure how deep into the code we are getting

Common Fuzzing Strategies

Mutation-based Fuzzing: Randomly mutate test cases from some initial corpus of inputs

Generation-based Fuzzing: Generate test cases based on the grammar of the input format

Coverage-guided Fuzzing: Measure code coverage of test cases to guide fuzzing towards new (unexplored) program states

Mutation-based Fuzzing

1. Collect a corpus of inputs that explores as many states as possible

2. Mutate inputs randomly, possibly guided by heuristics, or domain-specific knowledge (input format)

3. Run the program on the inputs and check for crashes

4. Go to step 2
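Steps 2 to 4 above can be sketched as a small loop. Everything here is an assumption made for illustration: the mutation is a single random bit flip, the random numbers come from a tiny xorshift generator so that runs are reproducible, and toy_target stands in for a real program by reporting a "crash" on inputs that start with "BUG".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Small deterministic PRNG (xorshift32); the state must be nonzero. */
static uint32_t next_random(uint32_t *state) {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Step 2: flip one randomly chosen bit of the input in place. */
void mutate_one_bit(unsigned char *data, size_t len, uint32_t *rng) {
    assert(data != NULL && len > 0 && rng != NULL);
    size_t byte = next_random(rng) % len;
    unsigned bit = next_random(rng) % 8;
    data[byte] ^= (unsigned char)(1u << bit);
}

/* Hypothetical target: returns 1 ("crash") on inputs starting with "BUG". */
int toy_target(const unsigned char *data, size_t len) {
    assert(data != NULL);
    return len >= 3 && data[0] == 'B' && data[1] == 'U' && data[2] == 'G';
}

/* Steps 2-4: mutate a fresh copy of the seed, run the target, repeat.
 * Returns the iteration that triggered a crash, or -1 if none did. */
long fuzz_mutation(const unsigned char *seed, size_t len,
                   uint32_t rng_seed, long max_iters) {
    unsigned char input[64];
    assert(seed != NULL && len > 0 && len <= sizeof input && rng_seed != 0);
    uint32_t rng = rng_seed;
    for (long i = 0; i < max_iters; i++) {
        memcpy(input, seed, len);
        mutate_one_bit(input, len, &rng);
        if (toy_target(input, len)) {
            return i;
        }
    }
    return -1;
}
```

The quality of the corpus matters: a seed such as "BUF" is one bit flip away from the crashing input, whereas a seed such as "AAA" can never reach it with this single-flip mutator.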

Generation-based (non-random) Fuzzing

Generate test cases based on a specification of the input format

1. Convert a specification (or prior knowledge) of the input format into a generative procedure

2. Generate test cases according to the procedure and introduce random perturbations

3. Run the program on the inputs and check for crashes

4. Go back to step 2

Typically used to test network protocols, where each packet must follow a specific format
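A minimal sketch of steps 1 and 2 for a packet-like input. The packet format (two magic bytes "PK", a one-byte length field, then the payload) is invented for this example and does not correspond to any real protocol; the perturbation deliberately corrupts only the length field so that the rest of the packet still looks valid to the parser.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { HEADER_SIZE = 3, MAX_PAYLOAD = 16,
       MAX_PACKET = HEADER_SIZE + MAX_PAYLOAD };

/* Small deterministic PRNG (xorshift32); the state must be nonzero. */
static uint32_t next_random(uint32_t *state) {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Generate a packet that follows the format exactly.
 * Returns the total packet length in bytes. */
size_t generate_packet(unsigned char out[MAX_PACKET], uint32_t *rng) {
    assert(out != NULL && rng != NULL && *rng != 0);
    size_t payload_len = next_random(rng) % (MAX_PAYLOAD + 1);
    out[0] = 'P';
    out[1] = 'K';
    out[2] = (unsigned char)payload_len;
    for (size_t i = 0; i < payload_len; i++) {
        out[HEADER_SIZE + i] = (unsigned char)(next_random(rng) & 0xFF);
    }
    return HEADER_SIZE + payload_len;
}

/* Random perturbation: flip one bit of the length field so that it
 * no longer matches the real payload size. */
void perturb_length(unsigned char *packet, uint32_t *rng) {
    assert(packet != NULL && rng != NULL);
    packet[2] ^= (unsigned char)(1u << (next_random(rng) % 8));
}

/* Format check: returns 1 if the packet matches the format, else 0. */
int packet_is_well_formed(const unsigned char *packet, size_t len) {
    assert(packet != NULL);
    return len >= HEADER_SIZE && packet[0] == 'P' && packet[1] == 'K'
        && (size_t)packet[2] == len - HEADER_SIZE;
}
```

Inputs produced this way pass the early magic-byte checks of a parser and exercise the code that trusts the length field, which purely random bytes would rarely reach.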

Coverage-guided Fuzzing

Use code coverage as a feedback signal that guides fuzzing

1. Collect a corpus of inputs that explores as many states as possible

2. Select an input from the corpus prioritizing those that have high coverage, and randomly mutate it

3. Run the program on the mutated input and check for a crash. If the input produces previously unseen coverage, add it to the corpus

4. Go to step 2
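The feedback loop above can be sketched with a toy target that reports which branches it reached. All names and the target itself are hypothetical; as a simplification, the sketch picks corpus entries uniformly at random instead of prioritizing high-coverage ones, and it mutates by replacing one random byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { INPUT_LEN = 4, CORPUS_MAX = 16 };

/* Small deterministic PRNG (xorshift32); the state must be nonzero. */
static uint32_t next_random(uint32_t *state) {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Hypothetical instrumented target: each nested branch taken sets one
 * coverage bit. The deepest branch stands for the buggy state. */
unsigned toy_coverage(const unsigned char in[INPUT_LEN]) {
    unsigned cov = 0;
    if (in[0] == 'F') {
        cov |= 1u;
        if (in[1] == 'U') {
            cov |= 2u;
            if (in[2] == 'Z') {
                cov |= 4u;
                if (in[3] == 'Z') {
                    cov |= 8u;
                }
            }
        }
    }
    return cov;
}

/* Steps 2-4: returns the union of all coverage bits reached. */
unsigned fuzz_coverage(const unsigned char seed[INPUT_LEN],
                       uint32_t rng_seed, long iters) {
    unsigned char corpus[CORPUS_MAX][INPUT_LEN];
    size_t corpus_size = 1;
    uint32_t rng = rng_seed;
    assert(seed != NULL && rng_seed != 0);
    memcpy(corpus[0], seed, INPUT_LEN);
    unsigned seen = toy_coverage(seed);
    for (long i = 0; i < iters; i++) {
        unsigned char input[INPUT_LEN];
        /* Step 2: select a corpus entry and mutate one byte of it. */
        size_t pick = next_random(&rng) % corpus_size;
        memcpy(input, corpus[pick], INPUT_LEN);
        size_t pos = next_random(&rng) % INPUT_LEN;
        input[pos] = (unsigned char)(next_random(&rng) & 0xFF);
        /* Step 3: run the target; keep inputs that reach new coverage. */
        unsigned cov = toy_coverage(input);
        if ((cov & ~seen) != 0) {
            seen |= cov;
            if (corpus_size < CORPUS_MAX) {
                memcpy(corpus[corpus_size++], input, INPUT_LEN);
            }
        }
    }
    return seen;
}
```

Because an input that gets one branch deeper is kept and mutated further, the fuzzer can reach the innermost branch one byte at a time instead of having to guess all four bytes at once.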

Summary

Fuzzing tries to generate inputs that lead the program to states that satisfy specific criteria

- Security purposes: find vulnerabilities

- Reliability purposes: find bugs in the program logic
