Tech Talk: Read-copy update (RCU) validation and verification for Linux

  • Date Tuesday, November 11, 2014  Time 11:00 AM
  • Speaker Paul McKenney
  • Location Galois, Inc., 421 SW 6th Ave., Suite 300, Portland, OR, USA (3rd floor of the Commonwealth Building)
  • Galois is pleased to host the following tech talk.
    These talks are open to the interested public--please join us!
    (There is no need to pre-register for the talk.)

Read-copy update (RCU) is a synchronization mechanism that is sometimes used as an alternative to reader-writer locking (among other things) that was added to the Linux kernel in 2002. A similar mechanism was added to Sequent’s DYNIX/ptx parallel UNIX kernel in 1993, and antecedents go back to at least 1980. Although a fully functional textbook implementation of RCU comprises only about 20 lines of code, the Linux-kernel implementation comprises more than 15,000 lines of code due to harsh requirements involving performance, scalability, energy efficiency, real-time response, memory footprint, module unloading, hotplugging of CPUs, and software-engineering considerations.

Therefore, a key Linux-kernel RCU challenge is validation and verification. To this end, more than 2500 lines of the current Linux-kernel implementation do torture testing. However, there are now more than one billion devices running Linux, so that an RCU bug that manifested on average once every million years of machine time would manifest about three times every day across the installed base. Furthermore, the combination of Internet of Things and Josh Triplett’s Linux Kernel Tinification project is likely to significantly increase the number of running instances.

I write and run rcutorture tests myself, and I like to think that my 20 years of parallel-code testing experience allows me to meet this validation challenge, but a simple analysis suggests a gap of several orders of magnitude. Additional validation and verification techniques are thus called for. This talk gives a brief overview of RCU and describes my adventures evaluating various verification techniques.

Paul E. McKenney has been coding for almost four decades, more than half of that on parallel hardware, where his work has earned him a reputation among some as a flaming heretic. Over the past decade, Paul has been an IBM Distinguished Engineer at the IBM Linux Technology Center. Paul maintains the RCU implementation within the Linux kernel, where the variety of workloads present highly entertaining performance, scalability, real-time response, and energy-efficiency challenges. Prior to that, he worked on the DYNIX/ptx kernel at Sequent, and prior to that on packet-radio and Internet protocols (but long before it was polite to mention Internet at cocktail parties), system administration, business applications, and real-time systems. He has a BS in Computer Science and another in Mechanical Engineering along with an MS in Computer Science, all from Oregon State University. He also has a Ph.D. in Computer Science and Engineering from OGI/OHSU. He has more than 100 publications and more than 50 patents. His hobbies include what passes for running at his age along with the usual house-wife-and-kids habit.