One of the advantages of the in-kernel BPF virtual machine is that it is
fast. BPF programs are just-in-time compiled and run directly by the CPU,
so there is no interpreter overhead. For many of the intended use cases,
though, “fast” can never be quite fast enough. It is thus unsurprising
that a number of patch sets currently under development aim to speed up
one aspect or another of using BPF in the system. A few, in particular,
seem about ready to hit the mainline.
Source: LWN.net – A medley of performance-related BPF patches