This thesis presents the first compile-time method for allocating a portion of a program's dynamic data to scratch-pad memory. A scratch-pad is a fast, directly addressed, compiler-managed SRAM that replaces the hardware-managed cache. It is motivated by its better real-time guarantees compared to caches and by its significantly lower overheads in access time, energy consumption, area, and overall runtime.
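To give a flavor of the approach (developed in detail in Chapters 5-7, including the heap bin sizing of Section 6.3), the sketch below shows one way a compiler-managed heap bin for a hot allocation site might be expressed in C. The names, sizes, and fallback policy are illustrative assumptions only, not the scheme actually used in the thesis.

```c
/* Illustrative sketch, not the thesis's actual implementation: the compiler
 * reserves a fixed-size "bin" for one frequently executed heap allocation
 * site and rewrites that site's malloc() calls to bump-allocate from the
 * bin, falling back to the ordinary DRAM heap when the bin is full.  A
 * static array stands in for the scratch-pad region here; on real hardware
 * the bin would be placed in SPM through the linker script. */
#include <stdlib.h>
#include <stdint.h>

#define SPM_BIN_SIZE 2048               /* bin size fixed at compile time    */

static uint8_t spm_bin[SPM_BIN_SIZE];   /* stand-in for the SPM-resident bin */
static size_t  spm_bin_used = 0;        /* bump-pointer allocation state     */

/* Replacement the compiler could emit for one allocation site. */
static void *spm_bin_malloc(size_t n)
{
    if (spm_bin_used + n <= SPM_BIN_SIZE) {
        void *p = &spm_bin[spm_bin_used];
        spm_bin_used += n;
        return p;                       /* served from scratch-pad           */
    }
    return malloc(n);                   /* overflow served from the DRAM heap */
}
```

In this simplified view, choosing SPM_BIN_SIZE per allocation site is the analogue of the compile-time bin sizing decision; the real method must also handle transfers, safety transformations, and code generation as outlined in Chapter 6.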
Contents
1 Introduction
1.1 Organization of thesis
2 Embedded systems and software development
2.1 Embedded systems
2.2 Intel StrongARM microprocessor
2.3 Embedded software development
2.4 C language compilers
2.5 Heap data allocation
2.6 Recursive functions
3 Previous work on SPM allocation
3.1 Overview of related research
3.2 Static SPM allocation methods
3.3 Dynamic SPM allocation techniques
3.4 Existing methods for dynamic program data
3.5 Heap-to-stack conversion techniques
3.6 Memory hierarchy research
3.7 Dynamic memory manager research
3.8 Other related methods
4 Dynamic allocation of static program data
4.1 Overview for static program allocation
4.2 The Dynamic Program Region Graph
4.3 Allocation method for code, stack and global objects
4.4 Algorithm modifications
4.5 Layout and code generation
4.6 Summary of results
5 Dynamic program data
5.1 Understanding dynamic data in software
5.2 Obstacles to optimizing software with dynamic data
5.3 Creating the DPRG with dynamic data
6 Compiler allocation of dynamic data
6.1 Overview of SPM allocation for dynamic data
6.2 Preparing the DPRG for allocation
6.3 Calculating heap bin allocation sizes
6.4 Overview of the iterative portion
6.5 Transfer minimizations
6.6 Heap safety transformations
6.7 Memory layout technique for address assignment
6.8 Feedback driven transformations
6.9 Termination of iterative steps
6.10 Code generation for optimized binaries
7 Robust dynamic data handling
7.1 General optimizations
7.2 Recursive function stack handling
7.3 Compile-time unknown-size heap objects
7.4 Profile sensitivity
8 Methodology
8.1 Target hardware platform
8.2 Software platform requirements
8.3 Compiler implementation
8.4 Simulation platform
8.5 Benchmark overview
8.6 Benchmark classes
8.7 Benchmark suite
9 Results
9.1 Dynamic heap allocation results
9.1.1 Runtime and energy gain
9.1.2 Transfer method comparison
9.1.3 Reduction in heap DRAM accesses
9.1.4 Effect of DRAM latency
9.1.5 Effect of varying SPM size
9.2 Unknown-size heap allocation
9.3 Recursive function allocation
9.4 Comparison with caches
9.5 Profile sensitivity
9.5.1 Profile input variation
9.6 Code allocation
10 Conclusion
A Additional results
A.1 Detailed recursive stack allocation results
A.2 Cache comparison results
A.3 Profile sensitivity results
Author: Dominguez, Angel
Source: University of Maryland