Special Seminar - Diyu Zhou, "Performance and Reliability in System Software Through Decoupling"
Exponential growth in users, requests, and data poses ever-increasing demands on the performance of today's data centers. To meet these demands, data centers utilize ultra-fast storage devices and scale out computation by leveraging large numbers of cores and servers. Unfortunately, existing system software has failed to keep up with these developments, thereby preventing applications from fully realizing the potential benefits.
In this talk, I will present my work on designing modern system software to exploit current computing trends by supporting three critical application requirements: I/O efficiency, multicore scalability, and practical reliability. A common theme across this work is identifying and breaking hidden, undesirable couplings in existing systems to overcome their limitations. I will first present OdinFS, a high-performance and scalable file system for emerging non-volatile memory (NVM). By taking into account the unique characteristics of NVM, OdinFS decouples NVM access from the application threads, thereby scaling to hundreds of cores and achieving tens to hundreds of times better performance than the prior state of the art. I will next present RRC, an application-transparent replication system for commercial off-the-shelf containers. RRC decouples replication-based operations from normal operations. It thus incurs latency overhead up to 75x lower than competitive schemes, while also achieving significantly lower throughput overhead, enabling its practical deployment.
Diyu Zhou is a postdoctoral researcher at EPFL. He completed his Ph.D. at UCLA, advised by Yuval Tamir. His research focuses on building high-performance, scalable, and reliable computer systems. His work spans multiple areas of systems research. Specifically, he has developed I/O stacks to support emerging storage devices, devised frameworks and algorithms for synchronization primitives to scale to massive multi-core machines, built novel tools for detecting concurrency bugs, and designed practical fault tolerance mechanisms for virtualized systems.