How Does the Linux Kernel Collect Task Stats Data
Motivation
Recently, I found it hard to determine what percentage of its time a process spends waiting for synchronous I/O (e.g., read). One option is the taskstats API provided by the Linux kernel [1]. However, I was unsure how precise its numbers are. To find out, I dug into the Linux kernel source code to see how “blkio_delay_total” (delay time waiting for synchronous block I/O to complete) is calculated.
Details
“blkio_delay_total” is calculated in the function “__delayacct_add_tsk” in the file “linux/kernel/delayacct.c”, as follows.
int __delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk)
{
    ...
    tmp = d->blkio_delay_total + tsk->delays->blkio_delay;
    d->blkio_delay_total = (tmp < d->blkio_delay_total) ? 0 : tmp;
    ...

    return 0;
}
“__delayacct_add_tsk” adds the task’s “blkio_delay” to “blkio_delay_total”, and resets the total to 0 if the sum overflows. “blkio_delay” is the time that the process has spent waiting for synchronous block I/O. It is updated around each wait in “io_schedule_timeout” [2], as follows.
/*
 * This task is about to go to sleep on IO. Increment rq->nr_iowait so
 * that process accounting knows that this is a task in IO wait state.
 */
long __sched io_schedule_timeout(long timeout)
{
    int old_iowait = current->in_iowait;
    struct rq *rq;
    long ret;

    current->in_iowait = 1;
    blk_schedule_flush_plug(current);

    delayacct_blkio_start();
    rq = raw_rq();
    atomic_inc(&rq->nr_iowait);
    ret = schedule_timeout(timeout);
    current->in_iowait = old_iowait;
    atomic_dec(&rq->nr_iowait);
    delayacct_blkio_end();

    return ret;
}
EXPORT_SYMBOL(io_schedule_timeout);
When a process starts to wait for I/O, “delayacct_blkio_start” records the start time. When the wait ends, “delayacct_blkio_end” computes the elapsed time (current time minus start time) and adds it to the process’s “blkio_delay”. That value is in turn added to “blkio_delay_total” when the taskstats data is collected.
Conclusion
1, Each time the current process waits for synchronous I/O, the wait time is measured and accumulated in blkio_delay, which is then added to blkio_delay_total.
2, blkio_delay is updated when the current process finishes its sync I/O wait.
References
[1] https://www.kernel.org/doc/Documentation/accounting/taskstats-struct.txt
[2] http://lxr.free-electrons.com/source/kernel/sched/core.c?v=4.7#L4988
A nice post!
But how do I get the `blkio_delay_total` values out of the kernel into user space?
There are two main ways to get `blkio_delay_total` values from kernel space to user space.
1, Use the taskstats interface exported by the Linux kernel, which is accessed over netlink (https://www.kernel.org/doc/Documentation/accounting/delay-accounting.txt).
2, Export the `blkio_delay_total` values from kernel space to user space yourself through procfs.