5. Synchronising the CPU buffers to the event buffer

At some point, we have to process the data in each CPU buffer and enter it into the main (event) buffer. The file buffer_sync.c contains the relevant code. We periodically (currently every HZ/4 jiffies, i.e. four times a second) start the synchronisation process. In addition, we process the buffers on certain events, such as an application calling munmap(). This is particularly important for exit(): the CPU buffers contain pointers to the task structure, so if we don't process all the buffers before the task is destroyed and its task structure freed, we could end up dereferencing a stale pointer in one of the CPU buffers.

We also add a notification when a kernel module is loaded; this is so that user-space can re-read /proc/modules to determine the load addresses of kernel module text sections. Without this notification, samples for a newly-loaded module could get lost or be attributed to the wrong module.

The synchronisation itself works in the following manner: first, mutual exclusion on the event buffer is taken. Remember, we do not need to do that for each CPU buffer, as we only read from the tail iterator; interrupts may arrive for the same buffer in the meantime, but they write at the position of the head iterator, leaving previously written entries intact. Then, we process each CPU buffer in turn. A CPU switch notification is added to the event buffer first (for --separate=cpu support). Then the processing of the actual data starts.
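The outer loop described above can be sketched in user-space C. This is only an illustration under stated assumptions, not the code in buffer_sync.c: the struct layouts, the names sync_all_cpus(), event_add(), EV_ESCAPE and EV_CPU_SWITCH, and the use of a plain flag in place of the real mutex are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define CPU_BUF_SIZE   8        /* hypothetical sizes, for illustration only */
#define EVENT_BUF_SIZE 64
#define EV_ESCAPE      (~0UL)   /* hypothetical marker preceding a notification */
#define EV_CPU_SWITCH  2UL      /* hypothetical "CPU switch" notification code */

/* Hypothetical per-CPU buffer: interrupts write at head_pos,
 * the synchronisation code reads from tail_pos. */
struct cpu_buf {
    unsigned long entries[CPU_BUF_SIZE];
    size_t head_pos;
    size_t tail_pos;
};

/* Hypothetical event buffer; 'locked' stands in for the real mutex. */
struct event_buf {
    unsigned long data[EVENT_BUF_SIZE];
    size_t pos;
    int locked;
};

/* Append one word to the event buffer; only legal while it is locked. */
static void event_add(struct event_buf *ev, unsigned long value)
{
    assert(ev->locked);
    assert(ev->pos < EVENT_BUF_SIZE);
    ev->data[ev->pos++] = value;
}

/* Take exclusion on the event buffer once, then drain each CPU buffer. */
static void sync_all_cpus(struct event_buf *ev, struct cpu_buf *cpus,
                          size_t nr_cpus)
{
    assert(!ev->locked);
    ev->locked = 1;                      /* stand-in for taking the mutex */

    for (size_t cpu = 0; cpu < nr_cpus; cpu++) {
        struct cpu_buf *b = &cpus[cpu];
        size_t head = b->head_pos;       /* read only up to this snapshot */

        /* CPU switch notification goes in before the CPU's data. */
        event_add(ev, EV_ESCAPE);
        event_add(ev, EV_CPU_SWITCH);
        event_add(ev, (unsigned long)cpu);

        /* No per-CPU lock: only the reader ever moves tail_pos. */
        while (b->tail_pos != head) {
            event_add(ev, b->entries[b->tail_pos]);
            b->tail_pos = (b->tail_pos + 1) % CPU_BUF_SIZE;
        }
    }

    ev->locked = 0;                      /* stand-in for releasing the mutex */
}
```

The sketch copies entries verbatim; the real code interprets each entry rather than copying it, and a kernel implementation would also need appropriate memory ordering between the interrupt writer and the reader.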

As mentioned, the CPU buffer consists of task switch entries and the actual samples. When the routine sync_buffer() sees a task switch, the process ID and process group ID are recorded into the event buffer, along with a dcookie (see below) identifying the application binary (e.g. /bin/bash). The mmap_sem for the task is then taken, to allow safe iteration across the task's list of mapped areas. Each sample is then processed as described in the next section.
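To make the two kinds of CPU buffer entry concrete, here is a minimal sketch of how a task switch entry might be turned into event buffer records. The entry layout, the record order, and the names process_entries(), EV_ESCAPE and EV_TASK_SWITCH are assumptions made for illustration; the real sync_buffer() differs in detail, and the sample handling shown here is a placeholder for the translation described in the next section.

```c
#include <assert.h>
#include <stddef.h>

#define EVENT_BUF_SIZE 64       /* hypothetical size */
#define EV_ESCAPE      (~0UL)   /* hypothetical marker preceding a notification */
#define EV_TASK_SWITCH 1UL      /* hypothetical "task switch" notification code */

/* Hypothetical CPU buffer entry: either a task switch or a sample. */
enum entry_kind { ENTRY_TASK_SWITCH, ENTRY_SAMPLE };

struct cpu_entry {
    enum entry_kind kind;
    unsigned long pid;      /* task switch: process ID */
    unsigned long pgid;     /* task switch: process group ID */
    unsigned long cookie;   /* task switch: dcookie of the application binary */
    unsigned long eip;      /* sample: program counter */
    unsigned long event;    /* sample: counter that overflowed */
};

struct event_buf {
    unsigned long data[EVENT_BUF_SIZE];
    size_t pos;
};

static void event_add(struct event_buf *ev, unsigned long value)
{
    assert(ev->pos < EVENT_BUF_SIZE);
    ev->data[ev->pos++] = value;
}

/* Walk the entries, emitting a task switch record or a sample for each. */
static void process_entries(struct event_buf *ev,
                            const struct cpu_entry *entries, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        const struct cpu_entry *e = &entries[i];

        if (e->kind == ENTRY_TASK_SWITCH) {
            event_add(ev, EV_ESCAPE);
            event_add(ev, EV_TASK_SWITCH);
            event_add(ev, e->pid);
            event_add(ev, e->pgid);
            event_add(ev, e->cookie);
            /* The kernel would take the task's mmap_sem at this point
             * so that its mapped areas can be walked safely. */
        } else {
            /* Placeholder: the real code translates the address first. */
            event_add(ev, e->eip);
            event_add(ev, e->event);
        }
    }
}
```

Emitting the task switch record before any of the task's samples is what lets a reader of the event buffer attribute each following sample to the right process.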

After a buffer has been read, the tail iterator is updated to reflect how much of the buffer was processed. Note that when we determined how much data there was to read in the CPU buffer, we also called cpu_buffer_reset() to reset last_task and last_is_kernel, as we've already mentioned. During the processing, more samples may have been arriving in the CPU buffer; this is OK because we are careful to only update the tail iterator to how much we actually read - on the next buffer synchronisation, we will start again from that point.
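The snapshot-then-advance behaviour can be sketched as follows. This is a user-space illustration under assumptions, not the kernel code: the struct ring layout and the functions ring_push(), ring_available(), ring_snapshot() and ring_consume() are hypothetical, and the reset values for last_task and last_is_kernel are stand-ins for whatever cpu_buffer_reset() actually stores.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 8   /* hypothetical size, for illustration only */

/* Hypothetical CPU buffer with the writer's cached state. */
struct ring {
    unsigned long entries[RING_SIZE];
    size_t head_pos;        /* advanced only by the writer */
    size_t tail_pos;        /* advanced only by the reader */
    int last_is_kernel;     /* writer state cleared on each sync */
    const void *last_task;  /* writer state cleared on each sync */
};

/* Writer side: store at head, then advance; drop the entry if full. */
static void ring_push(struct ring *r, unsigned long value)
{
    size_t next = (r->head_pos + 1) % RING_SIZE;
    if (next == r->tail_pos)
        return;
    r->entries[r->head_pos] = value;
    r->head_pos = next;
}

/* Number of entries currently between tail and head. */
static size_t ring_available(const struct ring *r)
{
    return (r->head_pos + RING_SIZE - r->tail_pos) % RING_SIZE;
}

/* Decide how much to read, and reset the writer's cached state
 * (assumed values; modelled on the role of cpu_buffer_reset()). */
static size_t ring_snapshot(struct ring *r)
{
    size_t count = ring_available(r);
    r->last_is_kernel = -1;
    r->last_task = NULL;
    return count;
}

/* Read exactly 'count' entries, then move the tail by that amount only. */
static void ring_consume(struct ring *r, size_t count, unsigned long *out)
{
    size_t pos = r->tail_pos;

    assert(count <= ring_available(r));
    for (size_t i = 0; i < count; i++) {
        out[i] = r->entries[pos];
        pos = (pos + 1) % RING_SIZE;
    }
    r->tail_pos = pos;   /* entries written after the snapshot stay queued */
}
```

If an entry is pushed between ring_snapshot() and ring_consume(), it is not read in this pass; because the tail only moves past what was read, the next pass picks it up.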