net: stmmac: Add Split Header support and enable it in XGMAC cores
authorJose Abreu <Jose.Abreu@synopsys.com>
Sat, 17 Aug 2019 18:54:43 +0000 (20:54 +0200)
committerDavid S. Miller <davem@davemloft.net>
Sat, 17 Aug 2019 19:43:59 +0000 (12:43 -0700)
commit67afd6d1cfdf0d0461fe3c4c922447f3e9b1c6ee
treebb67952bed5794bde765a624e0c5956c75ee5a2a
parentc887e02a938d7cae9837322448de0af2dd75ee1d
net: stmmac: Add Split Header support and enable it in XGMAC cores

Add the support for Split Header feature in the RX path and enable it in
XGMAC cores.

This does not impact neither beneficts bandwidth but it does reduces CPU
usage because without the feature all the entire packet is memcpy'ed,
while that with the feature only the header is.

With Split Header disabled 'perf stat -d' gives:
86870.624945 task-clock (msec)      #    0.429 CPUs utilized
     1073352 context-switches       #    0.012 M/sec
           1 cpu-migrations         #    0.000 K/sec
         213 page-faults            #    0.002 K/sec
327113872376 cycles                 #    3.766 GHz (62.53%)
 56618161216 instructions           #    0.17  insn per cycle (75.06%)
 10742205071 branches               #  123.658 M/sec (75.36%)
   584309242 branch-misses          #    5.44% of all branches (75.19%)
 17594787965 L1-dcache-loads        #  202.540 M/sec (74.88%)
  4003773131 L1-dcache-load-misses  #   22.76% of all L1-dcache hits (74.89%)
  1313301468 LLC-loads              #   15.118 M/sec (49.75%)
   355906510 LLC-load-misses        #   27.10% of all LL-cache hits (49.92%)

With Split Header enabled 'perf stat -d' gives:
49324.456539 task-clock (msec)     #    0.245 CPUs utilized
     2542387 context-switches      #    0.052 M/sec
           1 cpu-migrations        #    0.000 K/sec
         213 page-faults           #    0.004 K/sec
177092791469 cycles                #    3.590 GHz (62.30%)
 68555756017 instructions          #    0.39  insn per cycle (75.16%)
 12697019382 branches              #  257.418 M/sec (74.81%)
   442081897 branch-misses         #    3.48% of all branches (74.79%)
 20337958358 L1-dcache-loads       #  412.330 M/sec (75.46%)
  3820210140 L1-dcache-load-misses #   18.78% of all L1-dcache hits (75.35%)
  1257719198 LLC-loads             #   25.499 M/sec (49.73%)
   685543923 LLC-load-misses       #   54.51% of all LL-cache hits (49.86%)

Changes from v2:
- Reword commit message (Jakub)
Changes from v1:
- Add performance info (David)
- Add misssing dma_sync_single_for_device()

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/ethernet/stmicro/stmmac/common.h
drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h
drivers/net/ethernet/stmicro/stmmac/dwxgmac2_descs.c
drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c
drivers/net/ethernet/stmicro/stmmac/hwif.h
drivers/net/ethernet/stmicro/stmmac/stmmac.h
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c