fanotify (7) - Linux Manuals
NAME
fanotify - monitoring filesystem events
DESCRIPTION
The fanotify API provides notification and interception of filesystem events. Use cases include virus scanning and hierarchical storage management. The original fanotify API supported only a limited set of events. In particular, there was no support for create, delete, and move events; support for those events was added in Linux 5.1. (See inotify(7) for details of an API that reported those events before Linux 5.1.)
Additional capabilities compared to the inotify(7) API include the ability to monitor all of the objects in a mounted filesystem, the ability to make access permission decisions, and the possibility to read or modify files before access by other applications.
The following system calls are used with this API: fanotify_init(2), fanotify_mark(2), read(2), write(2), and close(2).
fanotify_init(), fanotify_mark(), and notification groups
The fanotify_init(2) system call creates and initializes an fanotify notification group and returns a file descriptor referring to it.
An fanotify notification group is a kernel-internal object that holds a list of files, directories, filesystems, and mount points for which events shall be created.
For each entry in an fanotify notification group, two bit masks exist: the mark mask and the ignore mask. The mark mask defines file activities for which an event shall be created. The ignore mask defines activities for which no event shall be generated. Having these two types of masks permits a filesystem, mount point, or directory to be marked for receiving events, while at the same time ignoring events for specific objects under a mount point or directory.
The fanotify_mark(2) system call adds a file, directory, filesystem or mount point to a notification group and specifies which events shall be reported (or ignored), or removes or modifies such an entry.
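As a rough sketch of how the two calls fit together, the following function creates a notification group and marks the mount containing a given path for FAN_CLOSE_WRITE events. The helper name watch_mount and the particular flags chosen here (FAN_CLASS_NOTIF, FAN_CLOEXEC, O_RDONLY | O_LARGEFILE) are illustrative assumptions, not requirements of the API. On most systems these calls need the CAP_SYS_ADMIN capability.

```c
#define _GNU_SOURCE /* Needed to get O_LARGEFILE definition */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/fanotify.h>
#include <unistd.h>

/* Hypothetical helper: returns an fanotify file descriptor that watches
   the mount containing 'path', or -1 on error. */
int
watch_mount(const char *path)
{
    assert(path != NULL);

    /* Create the notification group (notification-only class). */
    int fd = fanotify_init(FAN_CLASS_NOTIF | FAN_CLOEXEC,
                           O_RDONLY | O_LARGEFILE);
    if (fd == -1) {
        perror("fanotify_init");
        return -1;
    }

    /* Add a mark for the whole mount that contains 'path'. */
    if (fanotify_mark(fd, FAN_MARK_ADD | FAN_MARK_MOUNT,
                      FAN_CLOSE_WRITE, AT_FDCWD, path) == -1) {
        perror("fanotify_mark");
        close(fd);
        return -1;
    }

    return fd;
}
```

The caller owns the returned descriptor and is expected to close(2) it when monitoring is no longer needed.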
A possible usage of the ignore mask is for a file cache. Events of interest for a file cache are modification of a file and closing of the same. Hence, the cached directory or mount point is to be marked to receive these events. After receiving the first event informing that a file has been modified, the corresponding cache entry will be invalidated. No further modification events for this file are of interest until the file is closed. Hence, the modify event can be added to the ignore mask. Upon receiving the close event, the modify event can be removed from the ignore mask and the file cache entry can be updated.
The entries in the fanotify notification groups refer to files and directories via their inode number and to mounts via their mount ID. If files or directories are renamed or moved within the same mount, the respective entries survive. If files or directories are deleted or moved to another mount or if filesystems or mounts are unmounted, the corresponding entries are deleted.
The event queue
As events occur on the filesystem objects monitored by a notification group, the fanotify system generates events that are collected in a queue. These events can then be read (using read(2) or similar) from the fanotify file descriptor returned by fanotify_init(2).
Two types of events are generated: notification events and permission events. Notification events are merely informative and require no action to be taken by the receiving application, with one exception: if a valid file descriptor is provided within a generic event, the file descriptor must be closed. Permission events are requests to the receiving application to decide whether permission for a file access shall be granted. For these events, the recipient must write a response which decides whether access is granted or not.
An event is removed from the event queue of the fanotify group when it has been read. Permission events that have been read are kept in an internal list of the fanotify group until either a permission decision has been taken by writing to the fanotify file descriptor or the fanotify file descriptor is closed.
Reading fanotify events
Calling read(2) for the file descriptor returned by fanotify_init(2) blocks (if the flag FAN_NONBLOCK is not specified in the call to fanotify_init(2)) until either a file event occurs or the call is interrupted by a signal (see signal(7)).
The use of one of the flags FAN_REPORT_FID or FAN_REPORT_DIR_FID in fanotify_init(2) influences which data structures are returned to the event listener for each event. Events reported to a group initialized with one of these flags use file handles to identify filesystem objects instead of file descriptors.
After a successful read(2), the read buffer contains one or more of the following structures:
struct fanotify_event_metadata {
In case of an fanotify group that identifies filesystem objects by file
handles, you should also expect to receive one or more additional information
records of the structure detailed below following the generic
fanotify_event_metadata
structure within the read buffer:
struct fanotify_event_info_header {
struct fanotify_event_info_fid {
For performance reasons, it is recommended to use a large
buffer size (for example, 4096 bytes),
so that multiple events can be retrieved by a single
read(2).
The return value of
read(2)
is the number of bytes placed in the buffer,
or -1 in case of an error (but see BUGS).
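A minimal sketch of such a read, under the assumption that the caller wants interrupted calls retried and treats "no events available" on a nonblocking descriptor as zero bytes. The helper name read_events is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: read pending event data into 'buf'.
   Returns the number of bytes placed in the buffer, 0 if the descriptor
   is nonblocking and no events are available, or -1 on any other error. */
ssize_t
read_events(int fd, void *buf, size_t size)
{
    assert(buf != NULL && size > 0);

    for (;;) {
        ssize_t len = read(fd, buf, size);

        if (len >= 0)
            return len;
        if (errno == EINTR)     /* Interrupted by a signal: retry */
            continue;
        if (errno == EAGAIN)    /* FAN_NONBLOCK set, nothing queued */
            return 0;
        return -1;
    }
}
```

A caller might pass a buffer of 4096 bytes or more, for example "char buf[4096]; ssize_t len = read_events(fd, buf, sizeof(buf));", so that several events are returned per call.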
The fields of the
fanotify_event_metadata
structure are as follows:
A program listening to fanotify events can compare this PID
to the PID returned by
getpid(2),
to determine whether the event is caused by the listener itself,
or is due to a file access by another process.
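The comparison described above might look like this; the helper name is_own_event is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/fanotify.h>
#include <unistd.h>

/* Hypothetical helper: true if the event was caused by the calling
   process itself, false if another process triggered it. */
bool
is_own_event(const struct fanotify_event_metadata *metadata)
{
    assert(metadata != NULL);
    return metadata->pid == getpid();
}
```

A listener that itself opens files on the monitored mount can use such a check to skip its own accesses.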
The bit mask in
mask
indicates which events have occurred for a single filesystem object.
Multiple bits may be set in this mask,
if more than one event occurred for the monitored filesystem object.
In particular,
consecutive events for the same filesystem object and originating from the
same process may be merged into a single event, with the exception that two
permission events are never merged into one queue entry.
The bits that may appear in
mask
are as follows:
To check for any close event, the following bit mask may be used:
To check for any move event, the following bit mask may be used:
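As an illustration, the composite masks FAN_CLOSE and FAN_MOVE from <sys/fanotify.h> can be tested against the mask field as sketched below. The helper names are hypothetical, the static assertions require a C11 compiler, and FAN_MOVE needs kernel headers from Linux 5.1 or later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/fanotify.h>

/* The composite masks are simply the OR of the individual event bits. */
static_assert(FAN_CLOSE == (FAN_CLOSE_WRITE | FAN_CLOSE_NOWRITE),
              "FAN_CLOSE covers both close events");
static_assert(FAN_MOVE == (FAN_MOVED_FROM | FAN_MOVED_TO),
              "FAN_MOVE covers both move events");

/* Hypothetical helper: true if 'mask' contains any close event. */
bool
is_close_event(uint64_t mask)
{
    return (mask & FAN_CLOSE) != 0;
}

/* Hypothetical helper: true if 'mask' contains any move event. */
bool
is_move_event(uint64_t mask)
{
    return (mask & FAN_MOVE) != 0;
}
```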
The following bits may appear in
mask
only in conjunction with other event type bits:
The fields of the
fanotify_event_info_fid
structure are as follows:
The following macros are provided to iterate over a buffer containing
fanotify event metadata returned by a
read(2)
from an fanotify file descriptor:
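A sketch of a typical loop over a read buffer using the FAN_EVENT_OK() and FAN_EVENT_NEXT() macros follows. The helper name count_events is hypothetical, and the sketch only counts events and closes their file descriptors; a real listener would inspect the mask of each event at the marked spot.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/fanotify.h>
#include <unistd.h>

/* Hypothetical helper: walk 'len' bytes of event data in 'buf', close any
   file descriptor carried by an event, and return the number of events. */
size_t
count_events(void *buf, ssize_t len)
{
    struct fanotify_event_metadata *metadata = buf;
    size_t count = 0;

    assert(buf != NULL);

    /* FAN_EVENT_OK() checks that a complete event fits in the remaining
       'len' bytes; FAN_EVENT_NEXT() advances and decrements 'len'. */
    while (FAN_EVENT_OK(metadata, len)) {
        /* Stop if the kernel and the headers disagree on the layout. */
        if (metadata->vers != FANOTIFY_METADATA_VERSION)
            break;

        /* A real listener would handle metadata->mask here. */

        if (metadata->fd >= 0)
            close(metadata->fd);

        count++;
        metadata = FAN_EVENT_NEXT(metadata, len);
    }

    return count;
}
```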
In addition, there is:
struct fanotify_response {
The fields of this structure are as follows:
If access is denied, the call made by the requesting application fails with an
EPERM
error.
In addition to the usual errors for
write(2),
the following errors can occur when writing to the fanotify file descriptor:
The fanotify API does not report file accesses and modifications that
may occur because of
mmap(2),
msync(2),
and
munmap(2).
Events for directories are created only if the directory itself is opened,
read, or closed.
Adding, removing, or changing children of a marked directory does not create
events for the monitored directory itself.
Fanotify monitoring of directories is not recursive:
to monitor subdirectories under a directory,
additional marks must be created.
The
FAN_CREATE
event can be used for detecting when a subdirectory has been created under
a marked directory.
An additional mark must then be set on the newly created subdirectory.
This approach is racy, because it can lose events that occurred inside the
newly created subdirectory, before a mark is added on that subdirectory.
Monitoring mounts offers the capability to monitor a whole directory tree
in a race-free manner.
Monitoring filesystems offers the capability to monitor changes made from
any mount of a filesystem instance in a race-free manner.
The event queue can overflow.
In this case, events are lost.
As of Linux 3.17,
the following bugs exist:
The following shell session shows an example of
running this program.
This session involved editing the file
/home/user/temp/notes.
Before the file was opened, a
FAN_OPEN_PERM
event occurred.
After the file was closed, a
FAN_CLOSE_WRITE
event occurred.
Execution of the program ends when the user presses the ENTER key.
# ./fanotify_example /home
Press enter key to terminate.
Listening for events.
FAN_OPEN_PERM: File /home/user/temp/notes
FAN_CLOSE_WRITE: File /home/user/temp/notes
/* Read all available fanotify events from the file descriptor 'fd' */
static void
handle_events(int fd)
{
Monitoring an fanotify file descriptor for events
When an fanotify event occurs, the fanotify file descriptor is reported as
readable when passed to
epoll(7),
poll(2),
or
select(2).
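For instance, waiting with poll(2) could be sketched as below. The helper name wait_readable and its return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <poll.h>

/* Hypothetical helper: wait up to 'timeout_ms' milliseconds (-1 means
   wait indefinitely) for 'fd' to become readable.
   Returns 1 if readable, 0 on timeout, -1 on error. */
int
wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    assert(timeout_ms >= -1);

    int n = poll(&pfd, 1, timeout_ms);
    if (n <= 0)
        return n;               /* 0: timeout, -1: error (see errno) */

    /* Anything other than POLLIN (e.g. POLLERR) is treated as an error. */
    return (pfd.revents & POLLIN) ? 1 : -1;
}
```

When wait_readable() returns 1 for an fanotify file descriptor, a subsequent read(2) on it will not block.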
Dealing with permission events
For permission events, the application must
write(2)
a structure of the following form to the
fanotify file descriptor:
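Writing such a response might look like the following sketch, which fills in struct fanotify_response from <sys/fanotify.h>. The helper name respond is hypothetical, and the sketch accepts only the plain FAN_ALLOW and FAN_DENY decisions.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/fanotify.h>
#include <unistd.h>

/* Hypothetical helper: answer the permission event that carried
   'event_fd' with 'decision' (FAN_ALLOW or FAN_DENY).
   Returns 0 on success, -1 on error. */
int
respond(int fanotify_fd, int event_fd, uint32_t decision)
{
    assert(decision == FAN_ALLOW || decision == FAN_DENY);

    struct fanotify_response response = {
        .fd = event_fd,         /* fd taken from the event metadata */
        .response = decision,
    };

    ssize_t n = write(fanotify_fd, &response, sizeof(response));

    return n == (ssize_t) sizeof(response) ? 0 : -1;
}
```

After responding, the listener is still expected to close the file descriptor that came with the event.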
Closing the fanotify file descriptor
When all file descriptors referring to the fanotify notification group are
closed, the fanotify group is released and its resources
are freed for reuse by the kernel.
Upon
close(2),
outstanding permission events will be set to allowed.
/proc/[pid]/fdinfo
The file
/proc/[pid]/fdinfo/[fd]
contains information about fanotify marks for file descriptor
fd
of process
pid.
See
proc(5)
for details.
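As a small debugging aid, a process could print this file for one of its own descriptors roughly as follows. The helper name dump_fdinfo is hypothetical, and the sketch assumes that /proc is mounted.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copy /proc/self/fdinfo/<fd> to 'out'.
   Returns the number of lines copied (assuming lines shorter than the
   internal buffer), or -1 if the file cannot be opened. */
int
dump_fdinfo(int fd, FILE *out)
{
    char path[64];
    char line[512];
    int lines = 0;

    assert(out != NULL);

    snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", fd);

    FILE *in = fopen(path, "r");
    if (in == NULL)
        return -1;

    while (fgets(line, sizeof(line), in) != NULL) {
        fputs(line, out);
        lines++;
    }

    fclose(in);
    return lines;
}
```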
ERRORS
In addition to the usual errors for
read(2),
the following errors can occur when reading from the
fanotify file descriptor:
VERSIONS
The fanotify API was introduced in version 2.6.36 of the Linux kernel and
enabled in version 2.6.37.
Fdinfo support was added in version 3.8.
CONFORMING TO
The fanotify API is Linux-specific.
NOTES
The fanotify API is available only if the kernel was built with the
CONFIG_FANOTIFY
configuration option enabled.
In addition, fanotify permission handling is available only if the
CONFIG_FANOTIFY_ACCESS_PERMISSIONS
configuration option is enabled.
Limitations and caveats
Fanotify reports only events that a user-space program triggers through the
filesystem API.
As a result,
it does not catch remote events that occur on network filesystems.
BUGS
Before Linux 3.19,
fallocate(2)
did not generate fanotify events.
Since Linux 3.19,
calls to
fallocate(2)
generate
FAN_MODIFY
events.
EXAMPLES
The two example programs below demonstrate the usage of the fanotify API.
Example program: fanotify_example.c
The first program is an example of fanotify being
used with its event object information passed in the form of a file
descriptor.
The program marks the mount point passed as a command-line argument and
waits for events of type
FAN_OPEN_PERM
and
FAN_CLOSE_WRITE.
When a permission event occurs, a
FAN_ALLOW
response is given.
Program source: fanotify_example.c
#define _GNU_SOURCE /* Needed to get O_LARGEFILE definition */
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <poll.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/fanotify.h>
#include <unistd.h>