
Commit

Merge tag 'trace-ring-buffer-v6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull trace ring buffer fixes from Steven Rostedt:

 - Enable resize on mmap() error

   When a process mmaps a ring buffer, its size is locked and resizing
   is disabled. But if the user passes in a bad parameter, the mmap()
   can fail after resizing has already been disabled, and the error
   path exits without re-enabling it, leaving the ring buffer
   unresizable from then on. Re-enable resizing of the ring buffer on
   the mmap() error path (a minimal sketch of the pattern follows this
   list).

 - Have resizing return proper error and not always -ENOMEM

   If the ring buffer is mmapped by one task and another task tries to
   resize the buffer, the resize fails with -ENOMEM. This is confusing
   to the user, as there may be plenty of memory available. Have it
   return the error that actually occurred (in this case -EBUSY) so the
   user can understand why the resize failed.

 - Test the sub-buffer array to validate persistent memory buffer

   On boot up, the initialization of the persistent memory buffer does
   a validation check to see if the content of the data is valid; if
   so, it uses the memory as is, otherwise it re-initializes it.
   There's metadata in this persistent memory that keeps track of which
   sub-buffer is the reader page, and an array that states the order of
   the sub-buffers. The values in this array are indexes into the
   sub-buffers. The validator checks that all the entries in the array
   are within the sub-buffer list index, but it does not check for
   duplicates.

   While working on this code, the array got corrupted and had
   duplicates, so not all the sub-buffers were accounted for. This
   passed the validator, as all entries were valid, but the linked list
   was incorrect and could have caused a crash. In this case the
   corruption only produced incorrect data, but it could have been more
   severe. To fix this, create a bitmask that covers all the sub-buffer
   indexes and clear it to all zeros. While iterating the array and
   checking its values, set the bit corresponding to each index found
   in the array. If the bit was already set, it is a duplicate: mark
   the buffer as invalid and reset it. (A standalone sketch of this
   duplicate check follows this list.)

 - Prevent mmap()ing persistent ring buffer

   The persistent ring buffer uses vmap() to map the persistent memory.
   Currently, the mmap() logic only uses virt_to_page() to get the page
   from the ring buffer memory and uses that to map to user space. This
   works because a normal ring buffer uses alloc_page() to allocate its
   memory, but because the persistent ring buffer uses vmap(), it
   causes a kernel crash.

   Fixing this to work with vmap() is not hard, but since mmap() on
   persistent memory buffers never worked, just have mmap() return
   -ENODEV (what was returned for persistent memory ring buffers before
   mmap() was implemented, as they never supported mmap()). Normal
   buffers will still allow mmap(). Implementing mmap() for persistent
   memory ring buffers can wait until the next merge window.

 - Fix polling on persistent ring buffers

   There's a "buffer_percent" option (default set to 50), that is used
   to have reads of the ring buffer binary data block until the buffer
   fills to that percentage. The field "pages_touched" is incremented
   every time a new sub-buffer has content added to it. This field is
   used in the calculations to determine the amount of content is in the
   buffer and if it exceeds the "buffer_percent" then it will wake the
   task polling on the buffer.

   As persistent ring buffers can be created from the content of a
   previous boot, the "pages_touched" field was not updated. This means
   that if a task were to poll on the persistent buffer, it would block
   even if the buffer was completely full. It would block even if
   "buffer_percent" was zero, because with "pages_touched" at zero the
   buffer would be calculated as having no content. Update
   pages_touched when initializing the persistent ring buffer from a
   previous boot. (A toy illustration of the calculation follows this
   list.)

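The following are minimal, self-contained userspace sketches of three of
the fixes above. They are not the kernel code (the actual changes are in
the diff below) and all function and variable names in them are
illustrative stand-ins.

A sketch of the mmap() error-path fix: the "resize disabled" counter that
is bumped when the mapping starts must be dropped again on every error
exit, otherwise the buffer can never be resized afterwards.

	#include <stdatomic.h>
	#include <stdio.h>

	static atomic_int resize_disabled;	/* stand-in for the ring buffer's resize lock */

	static int do_map(int bad_param)
	{
		atomic_fetch_add(&resize_disabled, 1);	/* lock out resizing */

		if (bad_param) {
			/* the missing cleanup: re-enable resizing on the error path */
			atomic_fetch_sub(&resize_disabled, 1);
			return -1;
		}
		/* ... set up the mapping ... */
		return 0;
	}

	int main(void)
	{
		do_map(1);					/* a failing mmap() */
		printf("resize_disabled = %d\n",
		       atomic_load(&resize_disabled));		/* back to 0 */
		return 0;
	}

A sketch of the duplicate check on the sub-buffer array: a plain boolean
array stands in for the kernel's bitmap helpers, and buffers[] plays the
role of the meta->buffers[] ordering (at most 64 sub-buffers are assumed
for the sake of the example).

	#include <stdbool.h>
	#include <stdio.h>

	static bool subbuf_array_valid(const int *buffers, int nr_subbufs)
	{
		bool seen[64] = { false };	/* bitmask over sub-buffer indexes */

		for (int i = 0; i < nr_subbufs; i++) {
			int idx = buffers[i];

			if (idx < 0 || idx >= nr_subbufs)	/* the old range check */
				return false;
			if (seen[idx])				/* the new duplicate check */
				return false;
			seen[idx] = true;
		}
		return true;
	}

	int main(void)
	{
		int good[] = { 2, 0, 1, 3 };
		int bad[]  = { 2, 0, 2, 3 };	/* index 2 duplicated, index 1 missing */

		printf("%d %d\n", subbuf_array_valid(good, 4),
				  subbuf_array_valid(bad, 4));	/* prints "1 0" */
		return 0;
	}

A toy version of the polling calculation (not the kernel's exact formula):
it shows why a stale pages_touched count makes a full persistent buffer
look empty to a task waiting for the buffer_percent threshold.

	#include <stdbool.h>
	#include <stdio.h>

	static bool buffer_full_enough(unsigned long pages_touched,
				       unsigned long nr_pages,
				       unsigned int buffer_percent)
	{
		/* compare pages_touched/nr_pages against buffer_percent without dividing */
		return pages_touched * 100 >= (unsigned long)buffer_percent * nr_pages;
	}

	int main(void)
	{
		/* a fully written persistent buffer whose pages_touched was never restored */
		printf("%d\n", buffer_full_enough(0, 8, 50));	/* 0: the poller blocks */
		/* after the fix, pages_touched reflects the restored content */
		printf("%d\n", buffer_full_enough(8, 8, 50));	/* 1: the poller wakes */
		return 0;
	}
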
* tag 'trace-ring-buffer-v6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  ring-buffer: Update pages_touched to reflect persistent buffer content
  tracing: Do not allow mmap() of persistent ring buffer
  ring-buffer: Validate the persistent meta data subbuf array
  tracing: Have the error of __tracing_resize_ring_buffer() passed to user
  ring-buffer: Unlock resize on mmap error
torvalds committed Feb 16, 2025
2 parents 4966590 + 9793783 commit 5784d8c
Showing 2 changed files with 31 additions and 9 deletions.
28 changes: 26 additions & 2 deletions kernel/trace/ring_buffer.c
@@ -1672,14 +1672,18 @@ static void *rb_range_buffer(struct ring_buffer_per_cpu *cpu_buffer, int idx)
  * must be the same.
  */
 static bool rb_meta_valid(struct ring_buffer_meta *meta, int cpu,
-			  struct trace_buffer *buffer, int nr_pages)
+			  struct trace_buffer *buffer, int nr_pages,
+			  unsigned long *subbuf_mask)
 {
 	int subbuf_size = PAGE_SIZE;
 	struct buffer_data_page *subbuf;
 	unsigned long buffers_start;
 	unsigned long buffers_end;
 	int i;
 
+	if (!subbuf_mask)
+		return false;
+
 	/* Check the meta magic and meta struct size */
 	if (meta->magic != RING_BUFFER_META_MAGIC ||
 	    meta->struct_size != sizeof(*meta)) {
@@ -1712,6 +1716,8 @@ static bool rb_meta_valid(struct ring_buffer_meta *meta, int cpu,
 
 	subbuf = rb_subbufs_from_meta(meta);
 
+	bitmap_clear(subbuf_mask, 0, meta->nr_subbufs);
+
 	/* Is the meta buffers and the subbufs themselves have correct data? */
 	for (i = 0; i < meta->nr_subbufs; i++) {
 		if (meta->buffers[i] < 0 ||
@@ -1725,6 +1731,12 @@ static bool rb_meta_valid(struct ring_buffer_meta *meta, int cpu,
 			return false;
 		}
 
+		if (test_bit(meta->buffers[i], subbuf_mask)) {
+			pr_info("Ring buffer boot meta [%d] array has duplicates\n", cpu);
+			return false;
+		}
+
+		set_bit(meta->buffers[i], subbuf_mask);
 		subbuf = (void *)subbuf + subbuf_size;
 	}
 
@@ -1838,6 +1850,11 @@ static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
 				cpu_buffer->cpu);
 			goto invalid;
 		}
+
+		/* If the buffer has content, update pages_touched */
+		if (ret)
+			local_inc(&cpu_buffer->pages_touched);
+
 		entries += ret;
 		entry_bytes += local_read(&head_page->page->commit);
 		local_set(&cpu_buffer->head_page->entries, ret);
@@ -1889,17 +1906,22 @@ static void rb_meta_init_text_addr(struct ring_buffer_meta *meta)
 static void rb_range_meta_init(struct trace_buffer *buffer, int nr_pages)
 {
 	struct ring_buffer_meta *meta;
+	unsigned long *subbuf_mask;
 	unsigned long delta;
 	void *subbuf;
 	int cpu;
 	int i;
 
+	/* Create a mask to test the subbuf array */
+	subbuf_mask = bitmap_alloc(nr_pages + 1, GFP_KERNEL);
+	/* If subbuf_mask fails to allocate, then rb_meta_valid() will return false */
+
 	for (cpu = 0; cpu < nr_cpu_ids; cpu++) {
 		void *next_meta;
 
 		meta = rb_range_meta(buffer, nr_pages, cpu);
 
-		if (rb_meta_valid(meta, cpu, buffer, nr_pages)) {
+		if (rb_meta_valid(meta, cpu, buffer, nr_pages, subbuf_mask)) {
 			/* Make the mappings match the current address */
 			subbuf = rb_subbufs_from_meta(meta);
 			delta = (unsigned long)subbuf - meta->first_buffer;
@@ -1943,6 +1965,7 @@ static void rb_range_meta_init(struct trace_buffer *buffer, int nr_pages)
 			subbuf += meta->subbuf_size;
 		}
 	}
+	bitmap_free(subbuf_mask);
 }
 
 static void *rbm_start(struct seq_file *m, loff_t *pos)
@@ -7126,6 +7149,7 @@ int ring_buffer_map(struct trace_buffer *buffer, int cpu,
 		kfree(cpu_buffer->subbuf_ids);
 		cpu_buffer->subbuf_ids = NULL;
 		rb_free_meta_page(cpu_buffer);
+		atomic_dec(&cpu_buffer->resize_disabled);
 	}
 
 unlock:
12 changes: 5 additions & 7 deletions kernel/trace/trace.c
@@ -5977,8 +5977,6 @@ static int __tracing_resize_ring_buffer(struct trace_array *tr,
 ssize_t tracing_resize_ring_buffer(struct trace_array *tr,
 				   unsigned long size, int cpu_id)
 {
-	int ret;
-
 	guard(mutex)(&trace_types_lock);
 
 	if (cpu_id != RING_BUFFER_ALL_CPUS) {
@@ -5987,11 +5985,7 @@ ssize_t tracing_resize_ring_buffer(struct trace_array *tr,
 			return -EINVAL;
 	}
 
-	ret = __tracing_resize_ring_buffer(tr, size, cpu_id);
-	if (ret < 0)
-		ret = -ENOMEM;
-
-	return ret;
+	return __tracing_resize_ring_buffer(tr, size, cpu_id);
 }
 
 static void update_last_data(struct trace_array *tr)
@@ -8285,6 +8279,10 @@ static int tracing_buffers_mmap(struct file *filp, struct vm_area_struct *vma)
 	struct trace_iterator *iter = &info->iter;
 	int ret = 0;
 
+	/* Currently the boot mapped buffer is not supported for mmap */
+	if (iter->tr->flags & TRACE_ARRAY_FL_BOOT)
+		return -ENODEV;
+
 	ret = get_snapshot_map(iter->tr);
 	if (ret)
 		return ret;
