
Lecture 12: Software Security and Testing

by Hongwoo 2024. 4. 8.

     

    Memory Errors

    Memory errors are bugs in how a program handles memory in memory-unsafe languages such as C and C++.

    When a memory error occurs, the program accesses memory that it should not; this is called a memory-safety violation.

    A memory error leads either to a crash or to unexpected program behavior.

    All types of memory errors can potentially be exploited by attackers.

     

    Common memory errors: heap and stack buffer overflows and underflows

     

     

    Buffer Overflows

    Writing past the end of a buffer → spatial violation

    Typically caused by insufficient input checks, unchecked buffer sizes, and integer overflows/underflows

     

    Stack Buffer Overflow Example:

     

    In C, a string is just a buffer of chars; a null character '\0' marks the end of the string.

    If the string, including its terminating '\0', is longer than the stack buffer it is copied into → stack buffer overflow

     

     

    All memory regions except the stack and the heap have a fixed size; the stack and the heap can grow while the program runs.

     

    Buffer overflows can happen when calling common string and buffer functions.

    They are not limited to those functions, e.g. read(), fread(), gets(), fgets(), etc.

    Custom data-copying code can also be affected.

    Example: a string copy function.

    The problem: if the source string is larger than the destination buffer → buffer overflow
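
    A minimal sketch of the check that such a copy function is missing. The name copy_checked and its return convention are assumptions made for illustration, not part of the lecture material.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst only if the whole string,
   including its terminating '\0', fits. Returns 0 on success, -1 otherwise. */
int copy_checked(char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL && src != NULL);
    size_t len = strlen(src);
    if (len >= dst_size) {
        return -1;               /* would overflow dst: refuse to copy */
    }
    memcpy(dst, src, len + 1);   /* len + 1 also copies the '\0' */
    return 0;
}
```

    With an 8-byte destination, a 7-character string is accepted (7 characters plus '\0'), while an 8-character string is rejected instead of overwriting the byte after the buffer.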

     

     

    Buffer Underflows

    Writing before the start of a buffer → spatial violation

    Common programmer errors that lead to it:

    - Insufficient input checks

    - Unchecked buffer size

    - Integer overflows/underflows

    Less common than buffer overflows

     

    The example copies the src buffer into the dst buffer, but it starts at the end of both buffers and copies backwards

    If the src string is larger than the dst buffer, the copy continues past the start of dst → buffer underflow
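
    The backwards copy described above might look like the following sketch once a size check is added. The function name and signature are hypothetical; without the check, the loop would keep decrementing dp below dst.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copies src into the END of dst, walking backwards.
   Returns 0 on success, -1 if src does not fit. */
int copy_backwards_checked(char *dst, size_t dst_size,
                           const char *src, size_t src_len) {
    assert(dst != NULL && src != NULL);
    if (src_len > dst_size) {
        return -1;                 /* unchecked, dp would step below dst[0] */
    }
    char *dp = dst + dst_size;     /* one past the last byte of dst */
    const char *sp = src + src_len;
    while (sp > src) {
        *--dp = *--sp;             /* copy the last byte first */
    }
    return 0;
}
```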

     

     

    Off-by-One Variation

    Writing outside the boundaries of a buffer by one byte → spatial violation

    dp = dst + dst_size - 1

    Off-by-one buffer overflow
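
    A small sketch of where the single stray byte usually comes from: the loop bound forgets the terminating '\0'. The function copy_truncate is a hypothetical illustration, not the code from the lecture slide.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copies as much of src as fits and always terminates dst.
   Returns the number of characters copied (excluding '\0'). */
size_t copy_truncate(char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL && src != NULL && dst_size > 0);
    size_t i = 0;
    /* Stop at dst_size - 1 so that one byte stays free for '\0'.
       Writing `i < dst_size` here is the off-by-one bug: the terminator
       below would then land at dst[dst_size], one byte past the buffer. */
    while (i < dst_size - 1 && src[i] != '\0') {
        dst[i] = src[i];
        i++;
    }
    dst[i] = '\0';
    return i;
}
```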

     

    Summary

    Write → over/under-write → corrupt neighboring memory areas

    Read → over/under-read → leak data from neighboring memory areas

     

     

     

    Are These Vulnerabilities?

    If a user can trigger them → Yes

    Effects of over/under-writes:

    - Crash the application (DoS)

    - Take over the application (remote code execution, arbitrary code execution) → the attacker controls what the application does

    - Corrupt application state (e.g. an is_admin variable that decides whether the user may connect to the program)

    Effects of over/under-read:

    - Leak sensitive data
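
    To make the "corrupt application state" effect concrete, here is a sketch with an is_admin flag stored next to a name buffer. The struct and both functions are assumptions for illustration. Because a real overflow of name is undefined behavior in C, the unchecked variant writes through a pointer to the whole struct, which is well defined and still shows how the neighboring field gets overwritten.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct session {
    char name[8];
    int  is_admin;   /* stored right after name (modulo padding) */
};

/* Stand-in for an unchecked copy into s->name: writes len bytes starting
   at the first byte of the struct, so long inputs spill into is_admin. */
void set_name_unchecked(struct session *s, const unsigned char *input, size_t len) {
    assert(s != NULL && input != NULL && len <= sizeof *s);
    memcpy(s, input, len);
}

/* Checked variant: rejects input that does not fit into name. */
int set_name_checked(struct session *s, const unsigned char *input, size_t len) {
    assert(s != NULL && input != NULL);
    if (len >= sizeof s->name) {
        return -1;
    }
    memcpy(s->name, input, len);
    s->name[len] = '\0';
    return 0;
}
```

    Feeding sizeof(struct session) bytes of 'A' to the unchecked variant leaves is_admin non-zero, even though the input was only meant to set the name; the checked variant rejects the same input and leaves the flag untouched.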

     

     

    Code Injection

    Actions that lead to buffer overflow

     

     

    The gets(buf) function is insecure and can lead to buffer overflows: it has no way to limit how many bytes it writes into buf

     

    Overflow the stack buffer and inject code into buf

    Overwrite the function's return address so that it points to the injected code

    When the main function returns, instead of terminating, the program executes the injected code, which launches "/bin/sh"

     

     

    How to learn buf's address?

    We assume an almost fixed memory layout, so the attacker can easily infer the buffer's location

     

     

    Writable and Executable Memory

    Code injection is possible because there is a memory area that is both writable and executable

    It can be prevented by introducing the W^X policy:

    The Write XOR Execute (W^X) policy mandates that a program has no memory pages that are both writable and executable

     

     

    The Memory Management Unit (MMU) - Paging

    Used in all modern servers, laptops, and smartphones

     

    When the CPU executes an instruction that accesses memory, the access goes through the MMU, which enforces permissions at the page level; this is what makes it possible to prevent code injection
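
    The page-level check can be modeled in a few lines. This is a simplified software model with made-up names (PERM_R, PERM_W, PERM_X, mmu_allows, satisfies_wx), not how any particular MMU or operating system exposes permissions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical permission bits for one page. */
enum { PERM_R = 1, PERM_W = 2, PERM_X = 4 };

/* W^X: a page may be writable or executable, but never both. */
bool satisfies_wx(int page_perms) {
    return !((page_perms & PERM_W) && (page_perms & PERM_X));
}

/* Model of the per-access check: the access is allowed only if the page
   grants every permission the instruction needs. */
bool mmu_allows(int page_perms, int access) {
    assert((access & ~(PERM_R | PERM_W | PERM_X)) == 0);
    return (page_perms & access) == access;
}
```

    With a stack page marked PERM_R | PERM_W, mmu_allows(stack, PERM_X) is false, so bytes written into a stack buffer cannot be executed as code.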

     

     

    Page Permissions

     

     

    W^X

    If memory is writable, it should not be executable

    - Does not allow stack to be executed

    - Cannot execute injected code

    But we can reuse code present in the executable instead

     

     

    Code Reuse (Return-to-libc)

    When the attacker can no longer inject code, they can try return-to-libc

    - Overwrite the return address with a pointer to a function that already exists in the process, such as system() in libc

    Call the system() function with an argument controlled by the attacker

    How to learn system's address?

    We assume an almost fixed memory layout → the attacker can easily find system's address and exploit it

     

     

    Code Reuse (Gadgets)

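    Instead of whole functions, we can overwrite the stack with return addresses that point to gadgets

    Gadgets are short instruction sequences already present in the program; executing them one after another performs the attacker's computation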

     

     

    Defenses

    Enforcement

    Randomization

     

     

    Fixed Process Layout

    The programs we have exploited (thus far) have a fixed memory layout

    - Data segments start at the same address

    - Binary is loaded at the same address

    - Shared libraries are loaded at the same address

     

     

    One Attack Fits All

    Fixed process layout → facilitates exploit development

    Attacker can statically discover:

    - the location of their data

    - the location of code (e.g. functions, gadgets)

    An exploit developed on one system will work on all other systems running the same software

     

     

    Address Space Layout Randomization (ASLR)

    Ideal version → when starting up a process, randomly pick the base address where each data and code segment will be loaded.

    This introduces uncertainty for the attacker → they need to guess the location of the code and of their data

     

    Completely shuffling the order of the segments like this is not possible. ▼

     

     

    Base addresses are randomly selected from within predetermined ranges

    Libraries are loaded in the gaps (usually between the stack and the heap)

     

     

    Example

     

    No ASLR - Fixed memory layout

    With ASLR - the stack frames are slightly moved 
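
    One way to observe this yourself is to record one address from each region and compare across runs. The sketch below uses hypothetical names (layout_sample, sample_layout, print_layout); converting a function pointer to an integer is implementation-defined in C, but it works on mainstream platforms.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>

struct layout_sample {
    uintptr_t code;   /* address of a function in the binary */
    uintptr_t stack;  /* address of a local variable */
    uintptr_t heap;   /* address returned by malloc */
};

/* Fills *out with one address per region. Returns 0 on success, -1 on failure. */
int sample_layout(struct layout_sample *out) {
    assert(out != NULL);
    int local = 0;
    void *block = malloc(16);
    if (block == NULL) {
        return -1;
    }
    out->code = (uintptr_t)&sample_layout;
    out->stack = (uintptr_t)&local;
    out->heap = (uintptr_t)block;
    free(block);
    return 0;
}

void print_layout(const struct layout_sample *s) {
    printf("code  0x%" PRIxPTR "\n", s->code);
    printf("stack 0x%" PRIxPTR "\n", s->stack);
    printf("heap  0x%" PRIxPTR "\n", s->heap);
}
```

    Calling sample_layout and print_layout from main and running the program twice should, on a system with ASLR enabled, typically print different stack and heap addresses each time. The code address usually changes only if the binary was built as a position-independent executable.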

     

     

    Randomization

    Most OSs support ASLR 

    Randomization is prone to information disclosure attacks

    First, exploit some vulnerabilities to learn the memory layout and then (usually) exploit other vulnerabilities to complete your attack, e.g. code reuse.

     

    Information Disclosure Attack: Heartbleed (buffer overread)

    Heartbleed-style information disclosure can also be used to learn the memory layout
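
    Heartbleed came down to trusting a length field sent by the peer: the reply copied as many bytes as the request claimed to contain, not as many as it actually contained. The sketch below shows the missing check in isolation; echo_payload and its parameters are hypothetical and are not the OpenSSL code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical echo routine: copies the payload back only if the length the
   peer claims matches what was really received and fits into out.
   Returns the number of bytes copied, or -1 if the request is rejected. */
long echo_payload(const unsigned char *payload, size_t actual_len,
                  size_t claimed_len, unsigned char *out, size_t out_size) {
    assert(payload != NULL && out != NULL);
    if (claimed_len > actual_len || claimed_len > out_size) {
        return -1;   /* without this check, memcpy would over-read payload */
    }
    memcpy(out, payload, claimed_len);
    return (long)claimed_len;
}
```

    A 3-byte payload that claims to be 10 bytes long is rejected; without the check, the 7 extra bytes would come from whatever happens to sit after the payload in memory.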

     

     

    Enforcement

    Enforce some policy to protect application state

    - W^X

    - Return address protections (e.g. canaries, shadow stacks) 

    - Code pointer protections (e.g. Control-Flow Integrity, Code Pointer Integrity)

     

    It is extremely difficult and expensive to protect the integrity of everything (enforcement cannot cover all potential attacks)

    It is extremely difficult to protect against every potential attack. However, attackers have to exploit bugs to launch their attacks → find the bugs and fix them before the attackers do

     

     

    Fuzzing

    Find bugs in a program by feeding it large quantities of automatically generated inputs

    The program is run on every input generated

    Every run is monitored for any sign of bugs or vulnerabilities

     

     

    Simplest Fuzzing

    Example target: a program that opens a JPEG file

    How could we do better?

    - Randomly corrupt/change real JPEG files

    - Reference the JPEG spec so that we generate only "JPEG-looking" data (inputs that form well-formed JPEG pictures)

    - Measure the JPEG parser to see how deep we're getting in the code

     

     

    Common Fuzzing Strategies

    Mutation-based Fuzzing: Randomly mutate test cases from some initial corpus of inputs

    Generation-based Fuzzing: Generate test cases based on the grammar of the input format

    Coverage-guided Fuzzing: Measure code coverage of test cases to guide fuzzing towards new (unexplored) program states

     

     

    Mutation-based Fuzzing

    1. Collect a corpus of inputs that explores as many states as possible

    2. Mutate inputs randomly, possibly guided by heuristics, or domain-specific knowledge (input format)

    3. Run the program on the inputs and check for crashes

    4. Go to step 2
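
    The loop above can be sketched as follows. Everything here is a toy under stated assumptions: the corpus is a single seed, the "program" is a hypothetical function target_crashes that returns 1 where a real parser would crash, and randomness comes from a small fixed-seed xorshift generator so that runs are reproducible. Real fuzzers run the actual program and watch for signals or sanitizer reports.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_INPUT 16

/* Small deterministic PRNG (xorshift32) so that runs are reproducible. */
static uint32_t rng_state = 2463534242u;
static uint32_t rng_next(void) {
    rng_state ^= rng_state << 13;
    rng_state ^= rng_state >> 17;
    rng_state ^= rng_state << 5;
    return rng_state;
}

/* Hypothetical target: "crashes" when the header is "IMG" and the size
   byte that follows is larger than 16. */
static int target_crashes(const uint8_t *data, size_t len) {
    return len >= 4 && memcmp(data, "IMG", 3) == 0 && data[3] > 16;
}

/* Returns the iteration at which a crashing input was found (and stores it
   in crash_out), or -1 if none was found within max_iters. */
long fuzz_mutation(const uint8_t *seed, size_t len, long max_iters,
                   uint8_t *crash_out) {
    assert(seed != NULL && crash_out != NULL && len > 0 && len <= MAX_INPUT);
    uint8_t input[MAX_INPUT];
    for (long i = 0; i < max_iters; i++) {
        memcpy(input, seed, len);                    /* take an input from the corpus */
        uint32_t r = rng_next();
        input[(r >> 8) % len] = (uint8_t)(r >> 16);  /* step 2: overwrite one random byte */
        if (target_crashes(input, len)) {            /* step 3: run and check */
            memcpy(crash_out, input, len);
            return i;
        }
    }                                                /* step 4: repeat */
    return -1;
}
```

    Starting from the valid seed "IMG" followed by a size byte of 4, most mutations either break the header or leave the size small; the fuzzer succeeds once it happens to overwrite the size byte with a large value.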

     

     

    Generation-based (non-random) Fuzzing

    Generate test cases based on a specification for the input format

    1. Convert a specification (or prior knowledge) of the input format into a generative procedure

    2. Generate test cases according to the procedure and introduce random perturbations

    3. Run the program on the inputs and check for crashes

    4. Go back to step 2
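
    A sketch of these steps for an invented packet format: one start byte 0x7E, one length byte, then that many payload bytes. The format, the function names, and the simulated bug (a parser that trusts the length byte) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PAYLOAD 8
#define MAX_PACKET (2 + MAX_PAYLOAD)

/* Small deterministic PRNG (xorshift32) so that runs are reproducible. */
static uint32_t rng_state = 2463534242u;
static uint32_t rng_next(void) {
    rng_state ^= rng_state << 13;
    rng_state ^= rng_state >> 17;
    rng_state ^= rng_state << 5;
    return rng_state;
}

/* Steps 1-2: build a packet that follows the format, then perturb it. */
static size_t generate_packet(uint8_t *out) {
    size_t n = (rng_next() >> 8) % (MAX_PAYLOAD + 1);
    out[0] = 0x7E;
    out[1] = (uint8_t)n;
    for (size_t i = 0; i < n; i++) {
        out[2 + i] = (uint8_t)(rng_next() >> 16);
    }
    uint32_t r = rng_next();
    if (((r >> 8) & 3u) == 0) {
        out[1] = (uint8_t)(r >> 16);   /* roughly 1 in 4: random length byte */
    }
    return 2 + n;
}

/* Hypothetical target: "crashes" when the declared length exceeds the
   payload that is really there (a real parser would over-read). */
static int target_crashes(const uint8_t *pkt, size_t len) {
    return len >= 2 && pkt[0] == 0x7E && (size_t)pkt[1] > len - 2;
}

/* Returns the iteration of the first crashing packet, or -1 if none. */
long fuzz_generation(long max_iters, uint8_t *crash_out, size_t *crash_len) {
    assert(crash_out != NULL && crash_len != NULL);
    uint8_t pkt[MAX_PACKET];
    for (long i = 0; i < max_iters; i++) {
        size_t len = generate_packet(pkt);
        if (target_crashes(pkt, len)) {        /* step 3 */
            memcpy(crash_out, pkt, len);
            *crash_len = len;
            return i;
        }
    }
    return -1;
}
```

    Because every packet starts out well-formed, all of them get past the start-byte check, and the perturbation concentrates the testing effort on the field most likely to be mishandled.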

     

    Typically used to test network protocols, where packets have a specific format

     

     

    Coverage-guided Fuzzing

    Use code coverage as a feedback signal that guides fuzzing

    1. Collect a corpus of inputs that explores as many states as possible

    2. Select an input from the corpus prioritizing those that have high coverage, and randomly mutate it

    3. Run the program on the selected input and check for a crash. If the input produces previously unseen coverage, add it to the corpus

    4. Go to step 2
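
    A toy version of this feedback loop. The target checks four bytes one after another, and the number of checks passed stands in for code coverage; real tools obtain coverage from compiler instrumentation. All names are hypothetical, and the selection rule is simplified to "always pick the newest corpus entry", which here is also the one with the highest coverage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INPUT_LEN 4
#define MAX_CORPUS 8

/* Small deterministic PRNG (xorshift32) so that runs are reproducible. */
static uint32_t rng_state = 2463534242u;
static uint32_t rng_next(void) {
    rng_state ^= rng_state << 13;
    rng_state ^= rng_state >> 17;
    rng_state ^= rng_state << 5;
    return rng_state;
}

/* Hypothetical target: returns how many nested checks the input passes
   (0..4). Passing all four stands for reaching the buggy code. */
static int target_coverage(const uint8_t *in) {
    static const uint8_t magic[INPUT_LEN] = { 'F', 'U', 'Z', 'Z' };
    int depth = 0;
    while (depth < INPUT_LEN && in[depth] == magic[depth]) {
        depth++;
    }
    return depth;
}

/* Returns the iteration at which the bug was reached, or -1 if not found. */
long fuzz_coverage_guided(long max_iters, uint8_t *crash_out) {
    assert(crash_out != NULL);
    uint8_t corpus[MAX_CORPUS][INPUT_LEN];
    int corpus_size = 1;
    memset(corpus[0], 'A', INPUT_LEN);                     /* step 1: initial corpus */
    int best_cov = target_coverage(corpus[0]);
    for (long i = 0; i < max_iters; i++) {
        uint8_t cand[INPUT_LEN];
        memcpy(cand, corpus[corpus_size - 1], INPUT_LEN);  /* step 2: select ... */
        uint32_t r = rng_next();
        cand[(r >> 8) % INPUT_LEN] = (uint8_t)(r >> 16);   /* ... and mutate */
        int cov = target_coverage(cand);                   /* step 3: run */
        if (cov == INPUT_LEN) {
            memcpy(crash_out, cand, INPUT_LEN);
            return i;
        }
        if (cov > best_cov && corpus_size < MAX_CORPUS) {
            memcpy(corpus[corpus_size++], cand, INPUT_LEN); /* new coverage: keep it */
            best_cov = cov;
        }
    }
    return -1;
}
```

    The feedback is what makes the search tractable: guessing all four bytes at once succeeds with probability 1 in 2^32 per attempt, whereas keeping every input that passes one more check reduces the problem to four separate one-byte guesses, each expected to take about 1024 attempts.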

     

     

    Summary

    Fuzzing tries to generate inputs that lead the program to states that satisfy specific criteria

    - Security purposes: find vulnerabilities

    - Reliability purposes: find bugs in the program logic

     

     
