ath11k: optimise ath11k_dp_tx_completion_handler
the current code does 4 memcpys for each completion frame.
1) duplicate the desc
2 + 3) inside kfifo insertion
4) kfifo remove
The code simply drops the kfifo and uses a trivial ring buffer. This
requires a single memcpy for insertion. There is no removal needed as
we can simply use the inserted data for processing. As the code runs
inside the NAPI context it is atomic and there is no need for most of
the locking.
Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>