Optimization of Data Accesses for Database Applications
- Chen, Zhifeng
- Issue Date
- data access
- As microprocessor speeds increase in line with Moore's law, the access speeds of main memory and disks lag far behind. As a result, disk and memory accesses are significant performance bottlenecks for a wide range of applications. In particular, the performance of a database server in a data center is often limited, depending on the workload, by disk I/Os or memory accesses. This dissertation investigates techniques that improve the effectiveness of buffer caches and processor caches to bridge these two speed gaps for database servers in a data center environment. To address the disk I/O bottleneck, this dissertation proposes global management of the database-storage buffer cache hierarchy, which delivers performance comparable to that of a single cache with the aggregate size of the hierarchy. To manage buffer caches globally, this dissertation answers two challenging questions: 1) without modifying the existing I/O interface (the hierarchy-aware approach), how can the database and storage caches collaborate to behave as a global cache; 2) if the I/O interface is extended (the aggressively-collaborative approach), is the resulting performance improvement worthwhile? To answer the first question, this dissertation proposes a hierarchy-aware method based on eviction information. The storage cache uses a Client Content Tracking table to obtain the eviction information transparently. When the database server evicts a block, the storage server selectively fetches the corresponding block from disk. Both simulation and implementation results show that the hierarchy-aware method improves the storage cache hit ratio significantly: cache hit ratios increase by a factor of 5 in simulations, and the database transaction rate increases by 22% in real-system experiments. To answer the second question, this dissertation adopts an empirical evaluation approach to explore the large design space of database-storage collaborative caching. 
This design space has three dimensions: collaboration approaches (hierarchy-aware and aggressively-collaborative), replacement algorithms, and workload-specific optimizations. Through both trace-driven simulation and real-system implementation, this dissertation evaluates 248 combinations in the design space, which include all previously proposed solutions and many new approaches. The results indicate that, in all tested cases, aggressively-collaborative caching improves performance over hierarchy-aware caching by less than 2.5% on average in simulation and by less than 1.0% in real-system experiments. In other words, hierarchy-aware caching, which requires no modification to the existing I/O interface, delivers performance similar to that of aggressively-collaborative caching. To address the memory access bottleneck for database servers, this dissertation proposes a technique, Hanuman, to improve processor cache performance. Hanuman dynamically reformats data in database buffer caches, adapting the data layout to changing database queries; this improves the spatial locality of the data and, in turn, the processor cache hit ratio. To determine the best data layout for the current workload of multiple queries, Hanuman performs a heuristic cost analysis of candidate layouts and chooses the layout that minimizes the estimated number of cache misses. The results indicate that Hanuman reduces processor cache misses by 63-80% and query execution time by 16-24% for decision support queries.
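The hierarchy-aware method summarized in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify how the Client Content Tracking table works, so everything here is an assumption made for illustration: the table is modeled as a map from a client buffer slot to the disk block last read into it, so that a read into an occupied slot implies the previous block was evicted. The names `ClientContentTracker`, `StorageCache`, `observe_read`, and `slot`, the FIFO replacement, and the rule "reload unless already cached" are all hypothetical stand-ins, not the dissertation's design.

```python
from collections import OrderedDict


class ClientContentTracker:
    """Hypothetical tracking table: client buffer slot -> disk block it holds."""

    def __init__(self):
        self.slot_to_block = {}

    def observe_read(self, slot, block):
        """Record that `block` was read into `slot`.

        Returns the block that previously occupied the slot (inferred to
        have been evicted by the client), or None if nothing was displaced.
        """
        evicted = self.slot_to_block.get(slot)
        self.slot_to_block[slot] = block
        return None if evicted == block else evicted


class StorageCache:
    """Storage-side cache that is filled on inferred client evictions."""

    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk              # assumed backing store: block -> data
        self.blocks = OrderedDict()   # cached blocks, oldest first (FIFO)
        self.tracker = ClientContentTracker()

    def read(self, slot, block):
        """Serve a client read; return (data, was_storage_cache_hit)."""
        hit = block in self.blocks
        if hit:
            # Assumption: the client now holds the block, so drop our copy.
            data = self.blocks.pop(block)
        else:
            data = self.disk[block]
        evicted = self.tracker.observe_read(slot, block)
        # Assumed selection rule: reload the evicted block unless cached.
        if evicted is not None and evicted not in self.blocks:
            self._place(evicted, self.disk[evicted])
        return data, hit

    def _place(self, block, data):
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)  # discard the oldest entry
        self.blocks[block] = data


disk = {b: f"data{b}" for b in range(4)}
cache = StorageCache(capacity=2, disk=disk)
cache.read(slot=0, block=0)           # client loads block 0 into slot 0
cache.read(slot=0, block=1)           # slot 0 reused: block 0 was evicted
print(cache.read(slot=1, block=0))    # block 0 is served from the storage cache
```

In this sketch the storage cache learns about client evictions only from the reads it already sees, which is the sense in which no change to the I/O interface is needed.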
- Type of Resource
- Copyright and License Information
- You are granted permission for the non-commercial reproduction, distribution, display, and performance of this technical report in any format, BUT this permission is only for a period of 45 (forty-five) days from the most recent time that you verified that this technical report is still available from the University of Illinois at Urbana-Champaign Computer Science Department under terms that include this permission. All other rights are reserved by the author(s).