drm/xe/xe2: Set tile y type in XY_FAST_COPY_BLT to Tile4
author Haridhar Kalvala <haridhar.kalvala@intel.com>
Fri, 29 Sep 2023 21:36:39 +0000 (14:36 -0700)
committer Rodrigo Vivi <rodrigo.vivi@intel.com>
Thu, 21 Dec 2023 16:42:04 +0000 (11:42 -0500)
Set bits 30 and 31 of XY_FAST_COPY_BLT's dword1 for XeHP and above.

Whether the destination or source is Y-Major is selected in dword0, so
there is nothing left to choose in dword1. According to the bspec for
Xe2, "Behavior is undefined when programmed the value 0". For XeHP as
well, the only value allowed in those bits is 0b11: "Legacy Tile-Y" can
no longer be selected, only the newer Tile4.

So, unconditionally set those bits for graphics IP 12.50 and above.
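
As an illustration only, the dword1 value described above can be sketched
as a standalone helper. The function name fast_copy_dword1() and its
verx100 parameter are hypothetical and do not exist in the driver; verx100
stands in for GRAPHICS_VERx100(xe), and REG_BIT(n) is expanded by hand to
(1u << n) so the snippet builds outside the kernel tree.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the patch; REG_BIT(n) written out as (1u << n). */
#define XY_FAST_COPY_BLT_DEPTH_32      (3u << 24)
#define XY_FAST_COPY_BLT_D1_SRC_TILE4  (1u << 31)
#define XY_FAST_COPY_BLT_D1_DST_TILE4  (1u << 30)

/*
 * Hypothetical helper (not part of the driver): compute dword1 of
 * XY_FAST_COPY_BLT. verx100 is an assumed stand-in for
 * GRAPHICS_VERx100(xe), e.g. 1250 for graphics IP 12.50.
 */
static uint32_t fast_copy_dword1(unsigned int verx100, uint32_t pitch)
{
	uint32_t dw1;

	/* Mirrors the existing pitch <= U16_MAX assertion in emit_copy(). */
	assert(pitch <= UINT16_MAX);

	dw1 = XY_FAST_COPY_BLT_DEPTH_32 | pitch;

	/* On 12.50 and above, both tile type bits are set unconditionally. */
	if (verx100 >= 1250)
		dw1 |= XY_FAST_COPY_BLT_D1_SRC_TILE4 |
		       XY_FAST_COPY_BLT_D1_DST_TILE4;

	return dw1;
}
```

With a pitch of 0x100, this sketch yields 0x03000100 below 12.50 and
0xC3000100 from 12.50 onwards, i.e. the same value with bits 30 and 31 set.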

v2: Reword commit message and extend it to graphics version >= 12.50
    (Matt Roper)

Bspec: 57567
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Haridhar Kalvala <haridhar.kalvala@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230929213640.3189912-4-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
drivers/gpu/drm/xe/regs/xe_gpu_commands.h
drivers/gpu/drm/xe/xe_migrate.c

index 1fdf2e4f1c9fe7440dfbaee9ebcc7b2e6f1e0655..cc7b56763f1005159dda70262d37482a1c737c28 100644
@@ -57,6 +57,8 @@
 
 #define XY_FAST_COPY_BLT_CMD           (2 << 29 | 0x42 << 22)
 #define   XY_FAST_COPY_BLT_DEPTH_32    (3<<24)
+#define   XY_FAST_COPY_BLT_D1_SRC_TILE4        REG_BIT(31)
+#define   XY_FAST_COPY_BLT_D1_DST_TILE4        REG_BIT(30)
 
 #define        PVC_MEM_SET_CMD         (2 << 29 | 0x5b << 22)
 #define   PVC_MEM_SET_CMD_LEN_DW       7
index 313e3c0a6e90bb5b93e301fd81e553a20081a9c4..69488a0fada477e69db612c9973f18c33a63ae45 100644
@@ -543,12 +543,19 @@ static void emit_copy(struct xe_gt *gt, struct xe_bb *bb,
                      u64 src_ofs, u64 dst_ofs, unsigned int size,
                      unsigned int pitch)
 {
+       struct xe_device *xe = gt_to_xe(gt);
+
        xe_gt_assert(gt, size / pitch <= S16_MAX);
        xe_gt_assert(gt, pitch / 4 <= S16_MAX);
        xe_gt_assert(gt, pitch <= U16_MAX);
 
        bb->cs[bb->len++] = XY_FAST_COPY_BLT_CMD | (10 - 2);
-       bb->cs[bb->len++] = XY_FAST_COPY_BLT_DEPTH_32 | pitch;
+       if (GRAPHICS_VERx100(xe) >= 1250)
+               bb->cs[bb->len++] = XY_FAST_COPY_BLT_DEPTH_32 | pitch |
+                                   XY_FAST_COPY_BLT_D1_SRC_TILE4 |
+                                   XY_FAST_COPY_BLT_D1_DST_TILE4;
+       else
+               bb->cs[bb->len++] = XY_FAST_COPY_BLT_DEPTH_32 | pitch;
        bb->cs[bb->len++] = 0;
        bb->cs[bb->len++] = (size / pitch) << 16 | pitch / 4;
        bb->cs[bb->len++] = lower_32_bits(dst_ofs);