
Lecture 12: Software Security and Testing

by Hongwoo 2024. 4. 8.
 

Memory Errors

Memory errors are software bugs in the way we handle memory in memory-unsafe languages like C/C++.

When a memory error occurs, a program accesses memory that it should not; this is termed violating memory safety.

A memory error leads either to a program crash or to strange program behavior.

All types of memory errors can be potentially exploited by attackers.

 

Common memory errors: heap/stack buffer overflows and underflows

 

 

Buffer Overflows

Writing outside the boundaries of a buffer → spatial violation

Buffer overflows are caused by wrong input checks, unchecked buffer sizes, and integer overflows/underflows

 

Stack Buffer Overflow Example:

 

In C, a string is just a buffer of chars; a null character '\0' marks the end of the string.

If the string is bigger than the buffer → Stack Buffer Overflow
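
To make the size rule concrete, here is a minimal sketch in C. The helper name fits_in_buffer is hypothetical (it is not from the lecture); it only checks whether a string, including its terminating '\0', fits into a buffer of a given size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `s` plus its terminating '\0'
 * fits into a buffer of `buf_size` bytes, 0 otherwise. */
int fits_in_buffer(const char *s, size_t buf_size)
{
    assert(s != NULL);
    /* strlen() does not count the terminator, so one extra byte is needed. */
    return strlen(s) < buf_size;
}
```

For example, "hello" needs 6 bytes (5 characters plus '\0'), so it fits into a 6-byte buffer but not into a 5-byte one.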

 

 

All memory regions except the stack and the heap have a fixed size; the stack and the heap change size while the program runs.

 

Buffer overflows can happen when calling common string and buffer functions.

But they are not limited to those functions

E.g. read(), fread(), gets(), fgets(), etc.

Custom data-copying code can also suffer

Example: a string copy function.

The problem: if the source buffer is larger than the destination buffer → buffer overflow
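
One way to avoid this problem in custom copying code is to pass the destination size along and stop before it is exceeded. The sketch below is an illustration under that assumption, not the lecture's code; the name copy_bounded and its return convention are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bounded copy: copies at most dst_size - 1 characters from
 * src and always terminates dst. Returns 0 if all of src fit, -1 if it
 * was truncated (or if dst has no room at all). */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);
    if (dst_size == 0)
        return -1;                    /* no room even for the terminator */

    size_t i = 0;
    while (i + 1 < dst_size && src[i] != '\0') {
        dst[i] = src[i];              /* i is always < dst_size - 1 here */
        i++;
    }
    dst[i] = '\0';
    return src[i] == '\0' ? 0 : -1;   /* -1: src was longer than dst */
}
```

Returning an error on truncation lets the caller decide whether a shortened string is acceptable, instead of silently corrupting neighboring memory.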

 

 

Buffer Underflows

Writing outside the boundaries of a buffer, before its start → spatial violation

Common programmer errors that lead to it:

- Insufficient input checks

- Unchecked buffer size

- Integer overflows/underflows

Less common than buffer overflows

 

The example tries to copy the src buffer into the dst buffer, but it starts copying from the opposite end (from the end of dst, moving backwards)

If the src string is larger than the dst buffer, we end up with a buffer underflow
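
The lecture's code is not reproduced in these notes, so the following is a hypothetical reconstruction of a backwards copy with the missing size check added. The name copy_backwards and the choice to return NULL on failure are assumptions for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical backwards copy: places src at the END of dst, writing from
 * the last byte of dst towards its start. Returns a pointer to the copied
 * string inside dst, or NULL if src (plus '\0') does not fit. */
char *copy_backwards(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t len = strlen(src);
    if (len + 1 > dst_size)
        return NULL;              /* without this check we would write before dst[0] */

    char *dp = dst + dst_size - 1;   /* last valid byte of dst */
    const char *sp = src + len;      /* one past the last character of src */
    *dp = '\0';
    while (sp > src)
        *--dp = *--sp;               /* copy characters from back to front */
    return dp;
}
```

Without the length check, a long src would keep decrementing dp past dst[0], which is exactly the underflow described above.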

 

 

Off-by-One Variation

Writing outside the boundaries of a buffer by one byte → spatial violation

dp = dst + dst_size - 1

Off by one buffer overflow
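
A classic source of off-by-one errors is a loop bound that uses <= instead of <. The sketch below models memory as one small array (an 8-byte buffer followed by one neighbor byte) so that the stray write stays inside defined memory; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_SIZE 8

/* Toy memory model: bytes[0..7] is the buffer, bytes[8] is its neighbor. */
struct toy_memory {
    unsigned char bytes[BUF_SIZE + 1];
};

/* Buggy fill: the loop runs BUF_SIZE + 1 times. */
void fill_off_by_one(struct toy_memory *m, unsigned char value)
{
    assert(m != NULL);
    for (size_t i = 0; i <= BUF_SIZE; i++)   /* BUG: should be i < BUF_SIZE */
        m->bytes[i] = value;
}

/* Correct fill: the last valid index is BUF_SIZE - 1. */
void fill_correct(struct toy_memory *m, unsigned char value)
{
    assert(m != NULL);
    for (size_t i = 0; i < BUF_SIZE; i++)
        m->bytes[i] = value;
}
```

After fill_correct the neighbor byte is untouched; after fill_off_by_one it holds the fill value, even though only one byte was written out of bounds.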

 

Summary

Write → over/under-write → corrupt neighboring memory areas

Read → over/under-read → leak data from neighboring memory areas

 

 

 

Are These Vulnerabilities?

If a user can trigger them → Yes

Effects of over/under-writes:

- Crash the application (DoS)

- Take over the application (remote code execution, arbitrary code execution) → the attacker can manipulate the application

- Corrupt application state (e.g. an is_admin variable that decides whether the user gets access to the program)
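
The state-corruption case can be illustrated with a toy model in which an 8-byte name field is directly followed by an is_admin flag. This is a sketch under that assumed layout (real compilers may lay out variables differently), and all names in it are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_SIZE 8
#define ADMIN_AT  NAME_SIZE

/* Toy memory model: mem[0..7] is the name, mem[8] is the is_admin flag. */
struct toy_session {
    unsigned char mem[NAME_SIZE + 1];
};

/* Buggy setter: the length is never compared against NAME_SIZE. The cap at
 * sizeof s->mem only keeps this demo inside the toy memory. */
void set_name_unchecked(struct toy_session *s, const char *name)
{
    size_t n = strlen(name) + 1;
    for (size_t i = 0; i < n && i < sizeof s->mem; i++)
        s->mem[i] = (unsigned char)name[i];
}

/* Checked setter: rejects names that do not fit, leaving the flag intact. */
int set_name_checked(struct toy_session *s, const char *name)
{
    size_t n = strlen(name) + 1;
    if (n > NAME_SIZE)
        return -1;
    memcpy(s->mem, name, n);
    return 0;
}

int is_admin(const struct toy_session *s)
{
    assert(s != NULL);
    return s->mem[ADMIN_AT] != 0;
}
```

A 9-byte name whose last byte is non-zero flips the flag through set_name_unchecked, while set_name_checked rejects the same input.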

Effects of over/under-read:

- Leak sensitive data

 

 

Code Injection

Actions that lead to buffer overflow

 

 

The gets(buf) function is insecure and can lead to buffer overflows: it has no way of knowing how large buf is
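
A common replacement is fgets(), which takes the buffer size as an argument. The wrapper below is a minimal sketch; the name read_line and its return convention are assumptions for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper: reads one line from `in` into buf, storing at most
 * buf_size - 1 characters plus '\0'. Strips the trailing newline if it was
 * read. Returns 0 on success, -1 on end-of-file or error. */
int read_line(FILE *in, char *buf, size_t buf_size)
{
    assert(in != NULL && buf != NULL && buf_size > 0);
    if (fgets(buf, (int)buf_size, in) == NULL)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';   /* cut at the newline, if present */
    return 0;
}
```

If the input line is longer than the buffer, the rest of the line stays in the stream for the next read instead of overwriting the stack.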

 

Overwrite the stack and inject code into buf

Rewrite the return address of the function so that it points to the injected code, which then gets executed

When the main function ends, instead of terminating, the program executes "/bin/sh"

 

 

How to learn buf's address? 

We assume an almost fixed memory layout (attacker can infer buffer's location easily)

 

 

Writable and Executable Memory

Code injection is possible because there is a memory area that is both writable and executable

This can be eliminated by introducing the W^X policy:

The Write XOR Execute (W^X) policy mandates that in a program there are no memory pages that are both writable and executable
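
Expressed in code, the policy is a simple predicate over page permissions. The sketch below assumes a POSIX system and uses the PROT_* flags from <sys/mman.h>; the function name violates_wx is hypothetical.

```c
#include <assert.h>
#include <sys/mman.h>

/* Hypothetical check: returns 1 if a set of page permissions is both
 * writable and executable (which W^X forbids), 0 otherwise. */
int violates_wx(int prot)
{
    return (prot & PROT_WRITE) && (prot & PROT_EXEC);
}
```

Under this rule, a data page (read + write) and a code page (read + execute) are both fine; only the combination of write and execute is rejected.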

 

 

The Memory Management Unit (MMU) - Paging

Used in all modern servers, laptops, and smart phones

 

When the CPU executes an instruction that accesses memory, the access goes through the MMU, which enforces permissions at the page level; this is what prevents code injection
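
"Page level" means that permissions apply to whole pages, so every address on the same page shares the same permissions. The sketch below shows how an address splits into a page number and an offset, assuming 4 KiB pages (a common size, but an assumption here); all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u   /* assumed page size: 4 KiB */

/* Which page an address belongs to. */
uintptr_t page_number(uintptr_t addr)
{
    return addr / TOY_PAGE_SIZE;
}

/* Position of the address inside its page. */
uintptr_t page_offset(uintptr_t addr)
{
    return addr % TOY_PAGE_SIZE;
}

/* Two addresses share permissions if they are on the same page. */
int same_page(uintptr_t a, uintptr_t b)
{
    return page_number(a) == page_number(b);
}
```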

 

 

Page Permissions

 

 

W^X

If memory is writable, it should not be executable

- Does not allow stack to be executed

- Cannot execute injected code

But we can reuse code present in the executable instead

 

 

Code Reuse (Return-to-libc)

Once the attacker can no longer inject code, they can try return-to-libc

- Overwrite the return address with a pointer to an existing library function, e.g. system()

- system() is then called with arguments controlled by the attacker

How to learn system()'s address?

We assume an almost fixed memory layout → the attacker can easily find system()'s address and exploit it

 

 

Code Reuse (Gadgets)

Instead of whole functions, we can overwrite the return address on the stack with pointers to gadgets (short instruction sequences already present in the program)

Chain gadgets, each of which does a small piece of computation

 

 

Defenses

Enforcement

Randomization

 

 

Fixed Process Layout

The programs we have exploited (thus far) have a fixed memory layout

- Data segments start at the same address

- Binary is loaded at the same address

- Shared libraries are loaded at the same address

 

 

One Attack Fits All

Fixed process layout → facilitates exploit development

Attacker can statically discover:

- the location of their data

- the location of code (e.g. functions, gadgets)

An exploit developed on one system will work on all other systems running the same software

 

 

Address Space Layout Randomization (ASLR)

Ideal version → when starting up a process, randomly pick the base address where each data and code segment will be loaded.

Introduce uncertainty for the attacker → they need to guess the location of code and of their data

 

Completely reshuffling the order of the segments like this is not allowed. ▼

 

 

Base addresses are randomly selected from within predetermined ranges

Libraries are loaded in the gaps (usually between the stack and the heap)
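
Picking a base address from a predetermined range can be sketched as choosing one of the page-aligned slots in that range. This is only an illustration: the function name pick_base, the 4 KiB page size, and passing the random value in as a parameter are assumptions, and a real OS draws its randomness from its own entropy source.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u   /* assumed page size: 4 KiB */

/* Hypothetical base picker: returns a page-aligned address in [lo, hi).
 * lo and hi must be page-aligned; rnd is a caller-supplied random value. */
uintptr_t pick_base(uintptr_t lo, uintptr_t hi, uint64_t rnd)
{
    assert(lo < hi);
    assert(lo % TOY_PAGE_SIZE == 0 && hi % TOY_PAGE_SIZE == 0);

    uint64_t slots = (hi - lo) / TOY_PAGE_SIZE;   /* possible base addresses */
    return lo + (uintptr_t)(rnd % slots) * TOY_PAGE_SIZE;
}
```

The number of slots bounds how many guesses an attacker needs: a narrow range means little uncertainty, no matter how good the random source is.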

 

 

Example

 

No ASLR - Fixed memory layout

With ASLR - the stack frames are slightly moved 

 

 

Randomization

Most OSs support ASLR 

Randomization is prone to information disclosure attacks

First, exploit some vulnerabilities to learn the memory layout and then (usually) exploit other vulnerabilities to complete your attack, e.g. code reuse.

 

Information Disclosure Attack: Heartbleed (buffer overread)

Heartbleed-style information disclosure can also be used to learn the memory layout

 

 

Enforcement

Enforce some policy to protect application state

- W^X

- Return address protections (e.g. canaries, shadow stacks)

- Code pointer protections (e.g. Control-Flow Integrity, Code Pointer Integrity)
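
The idea behind a canary can be shown with a toy stack frame in which a known value sits between the buffer and the return address. This is a simplified model, not how a compiler implements it (real canaries are word-sized, randomly chosen, and inserted automatically); all names and the one-byte layout are assumptions for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_SIZE  8
#define CANARY_AT BUF_SIZE         /* canary sits right after the buffer */
#define RET_AT    (BUF_SIZE + 1)   /* toy one-byte "return address" */

struct toy_frame {
    unsigned char mem[BUF_SIZE + 2];
};

void frame_init(struct toy_frame *f, unsigned char canary, unsigned char ret)
{
    assert(f != NULL);
    for (size_t i = 0; i < BUF_SIZE; i++)
        f->mem[i] = 0;
    f->mem[CANARY_AT] = canary;
    f->mem[RET_AT] = ret;
}

/* Buggy write into the buffer: n is not checked against BUF_SIZE. The cap
 * at sizeof f->mem only keeps this demo inside the toy frame. */
void frame_write(struct toy_frame *f, const unsigned char *data, size_t n)
{
    for (size_t i = 0; i < n && i < sizeof f->mem; i++)
        f->mem[i] = data[i];
}

/* Check done before returning: 1 if the canary is unchanged, 0 otherwise. */
int canary_intact(const struct toy_frame *f, unsigned char canary)
{
    return f->mem[CANARY_AT] == canary;
}
```

A linear overflow that reaches the return address has to pass over the canary first, so the check detects it before the corrupted address is used.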

 

It is extremely difficult and expensive to protect the integrity of everything (enforcement cannot cover all potential attacks)

It is extremely difficult to protect against every potential attack. However, attackers have to exploit bugs to launch their attacks → find the bugs and fix them before the attackers do

 

 

Fuzzing

Find bugs in a program by feeding it large quantities of automatically generated inputs

The program is run on every input generated

Every run is monitored for any sign of bugs or vulnerabilities

 

 

Simplest Fuzzing

Example target: a program that opens a JPEG file

How could we do better?

- Randomly corrupt/change real JPEG files

- Reference the JPEG spec so that we generate only "JPEG-looking" data (inputs that are structured like proper JPEG pictures)

- Measure the JPEG parser to see how deep we're getting in the code

 

 

Common Fuzzing Strategies

Mutation-based Fuzzing: Randomly mutate test cases from some initial corpus of inputs

Generation-based Fuzzing: Generate test cases based on the grammar of the input format

Coverage-guided Fuzzing: Measure code coverage of test cases to guide fuzzing towards new (unexplored) program states

 

 

Mutation-based Fuzzing

1. Collect a corpus of inputs that explores as many states as possible

2. Mutate inputs randomly, possibly guided by heuristics, or domain-specific knowledge (input format)

3. Run the program on the inputs and check for crashes

4. Go to step 2
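
The steps above can be sketched as a small loop. Everything here is a toy under stated assumptions: the target is a made-up function that "crashes" on inputs starting with 'B', the corpus is a single seed, and the random numbers come from a simple linear congruential generator so the run is reproducible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INPUT_SIZE 3

/* Small deterministic PRNG, used only to keep the sketch reproducible. */
uint32_t next_rand(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state >> 16;             /* the high bits are the better ones */
}

/* Hypothetical target: "crashes" (returns 1) on inputs starting with 'B'. */
int target_crashes(const unsigned char in[INPUT_SIZE])
{
    return in[0] == 'B';
}

/* Step 2: mutate by overwriting one randomly chosen byte. */
void mutate(unsigned char buf[INPUT_SIZE], uint32_t *state)
{
    size_t pos = next_rand(state) % INPUT_SIZE;
    buf[pos] = (unsigned char)next_rand(state);
}

/* Steps 2-4: returns the iteration at which a crash was seen, or -1. */
long fuzz(const unsigned char seed[INPUT_SIZE], long max_iters, uint32_t rng_seed)
{
    uint32_t state = rng_seed;
    unsigned char input[INPUT_SIZE];

    assert(max_iters >= 0);
    for (long i = 1; i <= max_iters; i++) {
        memcpy(input, seed, INPUT_SIZE);   /* start from the corpus entry */
        mutate(input, &state);             /* step 2 */
        if (target_crashes(input))         /* step 3 */
            return i;
    }                                      /* step 4: loop back */
    return -1;
}
```

A real fuzzer would run the target in a separate process and treat signals such as a segmentation fault as the crash indicator, rather than a return value.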

 

 

Generation-based (non-random) Fuzzing

Generate test cases based on a specification for the input format

1. Convert a specification (or prior knowledge) of the input format into a generative procedure

2. Generate test cases according to the procedure and introduce random perturbations

3. Run the program on the inputs and check for crashes

4. Go back to step 2
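
As an illustration of steps 1 and 2, the sketch below turns a made-up packet format into a generator. The format (one magic byte 0x7e, one length byte, then that many payload bytes, at most 16) is invented for this example, as are all names; the perturbation shown is just one possibility, namely lying about the length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAGIC       0x7e
#define MAX_PAYLOAD 16
#define MAX_PACKET  (2 + MAX_PAYLOAD)

/* Small deterministic PRNG, used only to keep the sketch reproducible. */
uint32_t next_rand(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state >> 16;
}

/* Steps 1-2: generate a packet that follows the toy spec. If perturb is
 * non-zero, overwrite the length field with a random byte on purpose.
 * Returns the number of bytes written to out. */
size_t gen_packet(unsigned char out[MAX_PACKET], uint32_t *state, int perturb)
{
    size_t len = next_rand(state) % (MAX_PAYLOAD + 1);

    out[0] = MAGIC;
    out[1] = (unsigned char)len;
    for (size_t i = 0; i < len; i++)
        out[2 + i] = (unsigned char)next_rand(state);
    if (perturb)
        out[1] = (unsigned char)next_rand(state);   /* length may now be wrong */
    return 2 + len;
}

/* The toy spec as a checker: 1 if the packet is well-formed, 0 otherwise. */
int is_well_formed(const unsigned char *p, size_t n)
{
    return n >= 2 && p[0] == MAGIC && p[1] <= MAX_PAYLOAD && n == 2u + p[1];
}
```

Unperturbed packets always pass is_well_formed, which lets them get past a parser's early validity checks; the perturbed ones probe how the parser handles a length field that disagrees with the actual data.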

 

Usually used to test network protocols, where the inputs are packets with a specific format

 

 

Coverage-guided Fuzzing

Use code coverage as a feedback signal that guides fuzzing

1. Collect a corpus of inputs that explores as many states as possible

2. Select an input from the corpus prioritizing those that have high coverage, and randomly mutate it

3. Run the program on the selected input and check for a crash. If the input produces previously unseen coverage, add it to the corpus

4. Go to step 2
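
The loop above can be sketched with a toy target that reports which of its branches ran. All of this is assumed for illustration: the target only "crashes" on the input "FUZZ", coverage is a four-entry array filled in by hand rather than by compiler instrumentation, and the corpus has a small fixed capacity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INPUT_SIZE 4
#define N_BRANCHES 4
#define MAX_CORPUS 8

uint32_t next_rand(uint32_t *state)   /* small deterministic PRNG */
{
    *state = *state * 1664525u + 1013904223u;
    return *state >> 16;
}

/* Hypothetical instrumented target: records the branches it takes in cov
 * and returns 1 ("crash") only for the input "FUZZ". */
int run_target(const unsigned char in[INPUT_SIZE], unsigned char cov[N_BRANCHES])
{
    memset(cov, 0, N_BRANCHES);
    if (in[0] != 'F') return 0;
    cov[0] = 1;
    if (in[1] != 'U') return 0;
    cov[1] = 1;
    if (in[2] != 'Z') return 0;
    cov[2] = 1;
    if (in[3] != 'Z') return 0;
    cov[3] = 1;
    return 1;
}

/* Adds cov to seen; returns 1 if cov contained a branch not seen before. */
int merge_coverage(unsigned char seen[N_BRANCHES], const unsigned char cov[N_BRANCHES])
{
    int is_new = 0;
    for (size_t i = 0; i < N_BRANCHES; i++)
        if (cov[i] && !seen[i]) { seen[i] = 1; is_new = 1; }
    return is_new;
}

/* Returns the iteration at which a crash was seen, or -1. */
long fuzz(const unsigned char seed[INPUT_SIZE], long max_iters, uint32_t rng_seed)
{
    unsigned char corpus[MAX_CORPUS][INPUT_SIZE];
    unsigned char seen[N_BRANCHES] = {0}, cov[N_BRANCHES], input[INPUT_SIZE];
    size_t count = 1;
    uint32_t state = rng_seed;

    memcpy(corpus[0], seed, INPUT_SIZE);            /* step 1 */
    run_target(seed, cov);
    merge_coverage(seen, cov);

    for (long i = 1; i <= max_iters; i++) {
        /* Step 2: entries are only added when they reach a new branch, so in
         * this toy the newest entry has the highest coverage; prefer it. */
        size_t pick = (next_rand(&state) % 4 == 0) ? next_rand(&state) % count
                                                   : count - 1;
        memcpy(input, corpus[pick], INPUT_SIZE);
        input[next_rand(&state) % INPUT_SIZE] = (unsigned char)next_rand(&state);

        if (run_target(input, cov))                 /* step 3 */
            return i;
        if (merge_coverage(seen, cov) && count < MAX_CORPUS)
            memcpy(corpus[count++], input, INPUT_SIZE);
    }
    return -1;
}
```

The feedback is what makes the nested checks tractable: an input that gets 'F' right is kept, so later mutations only need to find the next byte instead of guessing all four at once.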

 

 

Summary

Fuzzing tries to generate inputs that lead the program to states that satisfy specific criteria

- Security purposes: find vulnerabilities

- Reliability purposes: find bugs in the program logic

 

 
