RaceHound 1.0 - Once More about the Races in the Kernel
To put it simply, a «data race» is a situation in which two or more threads of execution (in an application, in the kernel, etc.) may access the same data in memory concurrently and at least one of the threads changes that data. Such conditions may lead to very unpleasant consequences but are often quite hard to detect.
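The definition above can be written down as a small predicate. The C sketch below is only an illustration: the `mem_access` structure and the `accesses_race` function are hypothetical names, not part of RaceHound or the kernel, and the sketch assumes the caller already knows that the two accesses are not ordered by locks or other synchronization.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one memory access (not a kernel or RaceHound type). */
struct mem_access {
    int       thread_id;  /* which thread of execution made the access */
    uintptr_t addr;       /* start address of the accessed area */
    size_t    size;       /* size of the accessed area, in bytes */
    bool      is_write;   /* true if the access changes the data */
};

/* Do the byte ranges [addr, addr + size) of the two accesses overlap? */
static bool ranges_overlap(const struct mem_access *a,
                           const struct mem_access *b)
{
    return a->addr < b->addr + b->size && b->addr < a->addr + a->size;
}

/*
 * Two accesses that may happen concurrently form a data race if they come
 * from different threads, touch the same data, and at least one of them
 * is a write.
 * Assumption: the accesses are not ordered by any synchronization.
 */
static bool accesses_race(const struct mem_access *a,
                          const struct mem_access *b)
{
    assert(a != NULL && b != NULL);
    return a->thread_id != b->thread_id &&
           ranges_overlap(a, b) &&
           (a->is_write || b->is_write);
}
```

Under this predicate, a write and a read from different threads to overlapping bytes are reported as a race, while two reads of the same data are not.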
Tools like KernelStrider may help reveal the races. They usually find many potential races but may produce false alarms too. That is, they may report a potential race where no race is actually possible.
RaceHound, on the other hand, may miss some races, but if it has detected a race, that race really does happen. The two tools work very well together: KernelStrider finds potential races while RaceHound checks whether these races really happen.
This combination was used not long ago to detect an interesting race in the «uvcvideo» driver (webcam support) in the kernel 4.1-rc5. The developers of the driver suggest that nothing bad should happen because of that race, but it is a race nonetheless.
The ideas implemented in RaceHound are rather simple.
- Place software breakpoints (similar to what the debuggers do) on the instructions that may be involved in the races.
- When a software breakpoint hits, determine the address and the size of the memory area the instruction is about to access.
- Place hardware breakpoints on that memory area to detect appropriate accesses to it.
- Make a delay. If some code tries to access that memory area during the delay, the hardware breakpoints will trigger and the race will be revealed.
- Remove the hardware breakpoints and let the original instruction execute as usual.
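The steps above can be sketched as a small user-space simulation in C. This is not RaceHound's code and uses no kernel APIs: the `watch` structure stands in for a hardware breakpoint, the `others` array stands in for the accesses that other threads make during the delay, and all names (`sim_access`, `watch_arm`, `watch_check`, `watch_disarm`, `on_sw_breakpoint`) are hypothetical. Which accesses count as «appropriate» is an assumption here, taken from the definition of a data race: if the instruction writes, any access by another thread conflicts; if it only reads, only writes conflict.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a memory access. */
struct sim_access {
    uintptr_t addr;
    size_t    size;
    bool      is_write;
};

/* Hypothetical stand-in for a hardware breakpoint on a memory area. */
struct watch {
    uintptr_t addr;
    size_t    size;
    bool      watch_reads;  /* trigger on reads too, not only on writes */
    bool      armed;
    int       hits;         /* conflicting accesses seen while armed */
};

/* "Place hardware breakpoints" on the area [addr, addr + size). */
static void watch_arm(struct watch *w, uintptr_t addr, size_t size,
                      bool watch_reads)
{
    assert(size > 0);
    w->addr = addr;
    w->size = size;
    w->watch_reads = watch_reads;
    w->armed = true;
    w->hits = 0;
}

/* "Remove the hardware breakpoints". */
static void watch_disarm(struct watch *w)
{
    w->armed = false;
}

/* Called for each simulated access made by another thread. */
static void watch_check(struct watch *w, const struct sim_access *a)
{
    if (!w->armed)
        return;
    bool overlap = a->addr < w->addr + w->size &&
                   w->addr < a->addr + a->size;
    if (overlap && (a->is_write || w->watch_reads))
        w->hits++;
}

/*
 * Simulated handler of a software breakpoint placed on an instruction that
 * is about to perform the access `insn`. Returns true if a race was seen.
 */
static bool on_sw_breakpoint(const struct sim_access *insn,
                             const struct sim_access *others, size_t n_others)
{
    struct watch w = {0};

    /* The address and the size are known: watch the memory area. */
    watch_arm(&w, insn->addr, insn->size, insn->is_write);

    /* The "delay": replay what the other threads do in the meantime. */
    for (size_t i = 0; i < n_others; i++)
        watch_check(&w, &others[i]);

    watch_disarm(&w);
    /* At this point the original instruction would execute as usual. */
    return w.hits > 0;
}
```

The simulation also shows why a tool built this way does not produce false alarms but may miss races: a race is reported only if a conflicting access really arrives while the watch is armed.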
As is often the case, the devil is in the details. Implementing that algorithm was far from easy. It is no surprise that RaceHound has been in development since the summer of 2012.
For version 1.0, our specialists overhauled the core components of RaceHound compared to the previous versions (0.x). Previously, it was only possible to analyze one kernel module at a time, with additional restrictions. Now RaceHound is able to monitor the code of several modules and of the kernel proper at the same time: any code where a software breakpoint can be placed.
Besides, during the development of RaceHound 1.0, an error in the implementation of the software breakpoints (the so-called Kprobes, to be exact) was found and fixed in the kernel. The fix should appear in the kernel 4.1.