I’m writing this because I found it difficult to find a complete description of how the ZFS L2 cache (L2ARC) works. Some information is very easy to find, but seems to lack detail. Here is how I believe it works:
1. Format of l2:
Every device in the L2 cache is a ring buffer: when new data is written, the oldest data is dropped/overwritten. Nothing else influences what is dropped. First written is first dropped.
The L2 is not an ARC. It has only one list, which is fed from the ARC. It does not adapt in any way; caching priorities are fixed (see search order below).
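To make the ring-buffer behaviour concrete, here is a minimal sketch of one L2 device as a plain FIFO. This is a toy model based on my description above, not the actual ZFS code; the class name `L2Device` and its methods are made up for illustration.

```python
from collections import deque


class L2Device:
    """Toy model of one L2 device: a FIFO ring where the oldest entry goes first."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.entries = deque()  # (key, size) pairs, oldest on the left

    def write(self, key, size):
        """Store a buffer, dropping the oldest entries until it fits.

        Returns the keys that were overwritten.
        """
        if size > self.capacity:
            raise ValueError("buffer is larger than the device")
        evicted = []
        while self.used + size > self.capacity:
            old_key, old_size = self.entries.popleft()
            self.used -= old_size
            evicted.append(old_key)
        self.entries.append((key, size))
        self.used += size
        return evicted

    def __contains__(self, key):
        return any(k == key for k, _ in self.entries)


dev = L2Device(100)
dev.write("a", 40)
dev.write("b", 40)
print(dev.write("c", 40))  # "a" was written first, so it is dropped first
```

Note that there is no hit counter and no reordering on access: a read never changes what gets dropped next.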
2. Populating the l2:
The l2 is populated by scanning the tail end of the regular (in memory) arc lists up to a certain depth.
A new scan is initiated every vfs.zfs.l2arc_feed_secs seconds. It scans until it has found vfs.zfs.l2arc_write_max bytes that are eligible for L2 (not already in L2, not locked, etc.).
Each list is scanned from the tail up to a depth of vfs.zfs.l2arc_write_max * vfs.zfs.l2arc_headroom bytes.
The tails of the ARC lists are searched in the following order:
MFU Metadata -> MRU Metadata -> MFU Data -> MRU Data
So the MRU Data list is only searched if the tails of the other lists hold less than vfs.zfs.l2arc_write_max eligible bytes.
If a scan finds vfs.zfs.l2arc_write_max bytes in the scanned data, it is written to L2.
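The scan described above can be sketched roughly like this. It is a simplified model of my understanding, not the real implementation: the function name `feed_scan`, the `(key, size, locked)` tuples and the exact behaviour at the limits are assumptions made for illustration.

```python
def feed_scan(arc_lists, write_max, headroom, in_l2):
    """Pick buffers for one L2 feed cycle.

    arc_lists: lists in search order (MFU metadata, MRU metadata,
               MFU data, MRU data); each list is ordered tail first
               and holds (key, size, locked) tuples.
    in_l2:     set of keys that are already cached in L2.
    """
    scan_limit = write_max * headroom  # max bytes looked at per list
    selected = []
    selected_bytes = 0
    for buffers in arc_lists:
        scanned = 0
        for key, size, locked in buffers:
            if scanned >= scan_limit:
                break  # scanned deep enough, go to the next list
            scanned += size
            if locked or key in in_l2:
                continue  # not eligible for L2
            if selected_bytes + size > write_max:
                return selected  # write budget for this cycle is used up
            selected.append(key)
            selected_bytes += size
    return selected


lists = [
    [("m1", 4, False), ("m2", 4, True)],   # MFU metadata tail
    [("d1", 4, False), ("d2", 4, False)],  # a later list
]
print(feed_scan(lists, write_max=8, headroom=2, in_l2=set()))
```

In this example `m2` is skipped because it is locked, and `d2` is left out because the first two eligible buffers already fill the write budget.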
Because a scan only starts every vfs.zfs.l2arc_feed_secs seconds and writes a maximum of vfs.zfs.l2arc_write_max bytes, this effectively limits the write bandwidth to the L2 devices.
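As a quick worked example of that limit: the upper bound is simply the write budget divided by the feed interval. The values below (8 MiB every 1 second) are only illustrative and may not match the settings on your system; the function name is made up.

```python
MIB = 1024 * 1024


def max_l2_write_rate(write_max_bytes, feed_secs):
    """Upper bound on L2 write bandwidth in bytes per second."""
    return write_max_bytes / feed_secs


# Illustrative values: 8 MiB per cycle, one cycle per second.
print(max_l2_write_rate(8 * MIB, 1) / MIB)  # 8.0 (MiB/s)
```

With these numbers, filling a 100 GiB device would take at least 100 * 1024 / 8 = 12800 seconds, a bit over three and a half hours.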
If multiple L2 devices are used, data is written to them round-robin. This means that if the devices are unequal in size, how long data stays cached is more or less random, because it depends on which device the data happened to be written to.
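A rough way to see the effect of unequal sizes: if round-robin gives every device the same share of the writes, a device wraps around after its size divided by its share of the write rate. The sketch below assumes a constant total write rate and an equal split, which is a simplification; the function name is hypothetical.

```python
def retention_seconds(device_sizes, total_write_rate):
    """Estimate how long data survives on each device before being overwritten.

    Assumes round-robin gives each device an equal share of the writes
    and that the total write rate (bytes/s) is constant.
    """
    per_device_rate = total_write_rate / len(device_sizes)
    return [size / per_device_rate for size in device_sizes]


# Two devices, one twice as large as the other, 10 bytes/s in total.
print(retention_seconds([100, 200], 10))  # [20.0, 40.0]
```

So data that lands on the larger device stays cached twice as long, even though nothing about the data itself differs.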
3. Cache hits in l2:
If data is not in the ARC but is in L2, it is read from L2 and cached in the ARC as if it had been read from the primary disks. Nothing happens to the data in L2; it could be evicted shortly after the hit (but by then it is in the ARC, and will probably be written to L2 again before it is evicted from the ARC).
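The read path as I understand it, as a small sketch. Plain dicts stand in for the ARC, the L2 and the pool disks; the function name `read_block` is made up, and ARC eviction is left out entirely.

```python
def read_block(key, arc, l2, disk):
    """Return (data, source) and cache the result in the ARC.

    arc, l2 and disk are plain dicts used as stand-ins.
    """
    if key in arc:
        return arc[key], "arc"
    if key in l2:
        data, source = l2[key], "l2"  # L2 hit: the L2 copy is left untouched
    else:
        data, source = disk[key], "disk"
    arc[key] = data  # cached the same way, whichever device it came from
    return data, source


arc, l2 = {}, {"a": b"x"}
disk = {"a": b"x", "b": b"y"}
print(read_block("a", arc, l2, disk))  # served from L2, now also in the ARC
print(read_block("a", arc, l2, disk))  # second read is served from the ARC
```

The L2 entry is neither refreshed nor removed by the hit, so its position in the ring buffer stays the same.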