1 | .. SPDX-License-Identifier: GPL-2.0 |
2 | ||
3 | ========================= | |
4 | Introduction to LoongArch | |
5 | ========================= | |
6 | ||
7 | LoongArch is a new RISC ISA, which is a bit like MIPS or RISC-V. There are | |
8 | currently 3 variants: a reduced 32-bit version (LA32R), a standard 32-bit | |
9 | version (LA32S) and a 64-bit version (LA64). There are 4 privilege levels | |
10 | (PLVs) defined in LoongArch: PLV0~PLV3, from high to low. The kernel runs at PLV0 | |
11 | while applications run at PLV3. This document introduces the registers, basic | |
12 | instruction set, virtual memory and some other topics of LoongArch. | |
13 | ||
14 | Registers | |
15 | ========= | |
16 | ||
17 | LoongArch registers include general purpose registers (GPRs), floating point | |
18 | registers (FPRs), vector registers (VRs) and control status registers (CSRs) | |
19 | used in privileged mode (PLV0). | |
20 | ||
21 | GPRs | |
22 | ---- | |
23 | ||
24 | LoongArch has 32 GPRs ( ``$r0`` ~ ``$r31`` ); each one is 32 bits wide in LA32 | |
25 | and 64 bits wide in LA64. ``$r0`` is hard-wired to zero. The other registers | |
26 | are not architecturally special, except ``$r1``, which is hard-wired as the | |
27 | link register of the BL instruction. | |
28 | ||
29 | The kernel uses a variant of the LoongArch register convention, as described in | |
30 | the LoongArch ELF psABI spec, in :ref:`References <loongarch-references>`: | |
31 | ||
32 | ================= =============== =================== ============ | |
33 | Name Alias Usage Preserved | |
34 | across calls | |
35 | ================= =============== =================== ============ | |
36 | ``$r0`` ``$zero`` Constant zero Unused | |
37 | ``$r1`` ``$ra`` Return address No | |
38 | ``$r2`` ``$tp`` TLS/Thread pointer Unused | |
39 | ``$r3`` ``$sp`` Stack pointer Yes | |
40 | ``$r4``-``$r11`` ``$a0``-``$a7`` Argument registers No | |
41 | ``$r4``-``$r5`` ``$v0``-``$v1`` Return value No | |
42 | ``$r12``-``$r20`` ``$t0``-``$t8`` Temp registers No | |
43 | ``$r21`` ``$u0`` Percpu base address Unused | |
44 | ``$r22`` ``$fp`` Frame pointer Yes | |
45 | ``$r23``-``$r31`` ``$s0``-``$s8`` Static registers Yes | |
46 | ================= =============== =================== ============ | |
47 | ||
48 | .. Note:: |
49 | The register ``$r21`` is reserved in the ELF psABI, but used by the Linux | |
50 | kernel for storing the percpu base address. It normally has no ABI name, | |
51 | but is called ``$u0`` in the kernel. You may also see ``$v0`` or ``$v1`` | |
52 | in some old code; however, they are deprecated aliases of ``$a0`` and ``$a1`` | |
53 | respectively. | |
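The register table above can be captured as a small lookup table. The following is only an illustrative sketch: the names ``gpr_name``, ``gpr_number`` and ``gpr_is_callee_saved`` are hypothetical helpers, not kernel APIs, and ``$u0`` is the kernel-only name for ``$r21``.

```c
#include <assert.h>
#include <string.h>

/* Kernel-side aliases for $r0..$r31, copied from the table above. */
static const char *const gpr_alias[32] = {
    "$zero", "$ra", "$tp", "$sp",
    "$a0", "$a1", "$a2", "$a3", "$a4", "$a5", "$a6", "$a7",
    "$t0", "$t1", "$t2", "$t3", "$t4", "$t5", "$t6", "$t7", "$t8",
    "$u0", "$fp",
    "$s0", "$s1", "$s2", "$s3", "$s4", "$s5", "$s6", "$s7", "$s8",
};

/* Return the alias of GPR number n (0..31). */
const char *gpr_name(unsigned int n)
{
    assert(n < 32);
    return gpr_alias[n];
}

/* Return the GPR number for an alias, or -1 if the alias is unknown. */
int gpr_number(const char *alias)
{
    for (int i = 0; i < 32; i++)
        if (strcmp(gpr_alias[i], alias) == 0)
            return i;
    return -1;
}

/* Registers marked "Yes" in the table: $sp, $fp and $s0-$s8. */
int gpr_is_callee_saved(unsigned int n)
{
    assert(n < 32);
    return n == 3 || (n >= 22 && n <= 31);
}
```

The deprecated ``$v0``/``$v1`` aliases are deliberately left out, so ``gpr_number("$v0")`` returns -1 in this sketch.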
54 | |
55 | FPRs | |
56 | ---- | |
57 | ||
58 | LoongArch has 32 FPRs ( ``$f0`` ~ ``$f31`` ) when FPU is present. Each one is | |
59 | 64 bits wide on the LA64 cores. | |
60 | ||
61 | The floating-point register convention is the same as described in the | |
62 | LoongArch ELF psABI spec: | |
63 | ||
64 | ================= ================== =================== ============ | |
65 | Name Alias Usage Preserved | |
66 | across calls | |
67 | ================= ================== =================== ============ | |
68 | ``$f0``-``$f7`` ``$fa0``-``$fa7`` Argument registers No | |
69 | ``$f0``-``$f1`` ``$fv0``-``$fv1`` Return value No | |
70 | ``$f8``-``$f23`` ``$ft0``-``$ft15`` Temp registers No | |
71 | ``$f24``-``$f31`` ``$fs0``-``$fs7`` Static registers Yes | |
72 | ================= ================== =================== ============ | |
73 | ||
74 | .. Note:: |
75 | You may see ``$fv0`` or ``$fv1`` in some old code; however, they are | |
76 | deprecated aliases of ``$fa0`` and ``$fa1`` respectively. | |
77 | |
78 | VRs | |
79 | ---- | |
80 | ||
81 | There are currently 2 vector extensions to LoongArch: | |
82 | ||
83 | - LSX (Loongson SIMD eXtension) with 128-bit vectors, | |
84 | - LASX (Loongson Advanced SIMD eXtension) with 256-bit vectors. | |
85 | ||
86 | LSX brings ``$v0`` ~ ``$v31`` while LASX brings ``$x0`` ~ ``$x31`` as the vector | |
87 | registers. | |
88 | ||
89 | The VRs overlap with FPRs: for example, on a core implementing LSX and LASX, | |
90 | the lower 128 bits of ``$x0`` are shared with ``$v0``, and the lower 64 bits of | |
91 | ``$v0`` are shared with ``$f0``; the same holds for all other VRs. | |
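One way to picture the overlap is a C union whose members alias the same storage. This is purely a software model for illustration, not how the hardware or the kernel represents the registers; the type ``vreg_model`` and the helper ``vreg_from_lasx`` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Model of one physical vector register, split into 64-bit lanes.
 * Lane 0 holds the lowest 64 bits in every view. */
typedef union {
    uint64_t f;     /* $fN: the 64-bit FPR view */
    uint64_t v[2];  /* $vN: the 128-bit LSX view */
    uint64_t x[4];  /* $xN: the 256-bit LASX view */
} vreg_model;

static_assert(sizeof(vreg_model) == 32, "a LASX register is 256 bits");

/* Write all four lanes through the LASX view and return the register. */
vreg_model vreg_from_lasx(uint64_t w0, uint64_t w1, uint64_t w2, uint64_t w3)
{
    vreg_model r;
    r.x[0] = w0;
    r.x[1] = w1;
    r.x[2] = w2;
    r.x[3] = w3;
    return r;
}
```

After a write through ``x``, reading ``f`` yields lane 0 and reading ``v`` yields lanes 0 and 1, which mirrors the sharing described above.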
92 | ||
93 | CSRs | |
94 | ---- | |
95 | ||
96 | CSRs can only be accessed from privileged mode (PLV0): | |
97 | ||
98 | ================= ===================================== ============== | |
99 | Address Full Name Abbrev Name | |
100 | ================= ===================================== ============== | |
101 | 0x0 Current Mode Information CRMD | |
102 | 0x1 Pre-exception Mode Information PRMD | |
103 | 0x2 Extension Unit Enable EUEN | |
104 | 0x3 Miscellaneous Control MISC | |
105 | 0x4 Exception Configuration ECFG | |
106 | 0x5 Exception Status ESTAT | |
107 | 0x6 Exception Return Address ERA | |
108 | 0x7 Bad (Faulting) Virtual Address BADV | |
109 | 0x8 Bad (Faulting) Instruction Word BADI | |
110 | 0xC Exception Entrypoint Address EENTRY | |
111 | 0x10 TLB Index TLBIDX | |
112 | 0x11 TLB Entry High-order Bits TLBEHI | |
113 | 0x12 TLB Entry Low-order Bits 0 TLBELO0 | |
114 | 0x13 TLB Entry Low-order Bits 1 TLBELO1 | |
115 | 0x18 Address Space Identifier ASID | |
116 | 0x19 Page Global Directory Address for PGDL | |
117 | Lower-half Address Space | |
118 | 0x1A Page Global Directory Address for PGDH | |
119 | Higher-half Address Space | |
120 | 0x1B Page Global Directory Address PGD | |
121 | 0x1C Page Walk Control for Lower- PWCL | |
122 | half Address Space | |
123 | 0x1D Page Walk Control for Higher- PWCH | |
124 | half Address Space | |
125 | 0x1E STLB Page Size STLBPS | |
126 | 0x1F Reduced Virtual Address Configuration RVACFG | |
127 | 0x20 CPU Identifier CPUID | |
128 | 0x21 Privileged Resource Configuration 1 PRCFG1 | |
129 | 0x22 Privileged Resource Configuration 2 PRCFG2 | |
130 | 0x23 Privileged Resource Configuration 3 PRCFG3 | |
131 | 0x30+n (0≤n≤15) Saved Data Register SAVEn | |
132 | 0x40 Timer Identifier TID | |
133 | 0x41 Timer Configuration TCFG | |
134 | 0x42 Timer Value TVAL | |
135 | 0x43 Compensation of Timer Count CNTC | |
136 | 0x44 Timer Interrupt Clearing TICLR | |
137 | 0x60 LLBit Control LLBCTL | |
138 | 0x80 Implementation-specific Control 1 IMPCTL1 | |
139 | 0x81 Implementation-specific Control 2 IMPCTL2 | |
140 | 0x88 TLB Refill Exception Entrypoint TLBRENTRY | |
141 | Address | |
142 | 0x89 TLB Refill Exception Bad (Faulting) TLBRBADV | |
143 | Virtual Address | |
144 | 0x8A TLB Refill Exception Return Address TLBRERA | |
145 | 0x8B TLB Refill Exception Saved Data TLBRSAVE | |
146 | Register | |
147 | 0x8C TLB Refill Exception Entry Low-order TLBRELO0 | |
148 | Bits 0 | |
149 | 0x8D TLB Refill Exception Entry Low-order TLBRELO1 | |
150 | Bits 1 | |
151 | 0x8E TLB Refill Exception Entry High-order TLBREHI | |
152 | Bits | |
153 | 0x8F TLB Refill Exception Pre-exception TLBRPRMD | |
154 | Mode Information | |
155 | 0x90 Machine Error Control MERRCTL | |
156 | 0x91 Machine Error Information 1 MERRINFO1 | |
157 | 0x92 Machine Error Information 2 MERRINFO2 | |
158 | 0x93 Machine Error Exception Entrypoint MERRENTRY | |
159 | Address | |
160 | 0x94 Machine Error Exception Return MERRERA | |
161 | Address | |
162 | 0x95 Machine Error Exception Saved Data MERRSAVE | |
163 | Register | |
164 | 0x98 Cache TAGs CTAG | |
165 | 0x180+n (0≤n≤3) Direct Mapping Configuration Window n DMWn | |
166 | 0x200+2n (0≤n≤31) Performance Monitor Configuration n PMCFGn | |
167 | 0x201+2n (0≤n≤31) Performance Monitor Overall Counter n PMCNTn | |
168 | 0x300 Memory Load/Store WatchPoint MWPC | |
169 | Overall Control | |
170 | 0x301 Memory Load/Store WatchPoint MWPS | |
171 | Overall Status | |
172 | 0x310+8n (0≤n≤7) Memory Load/Store WatchPoint n MWPnCFG1 | |
173 | Configuration 1 | |
174 | 0x311+8n (0≤n≤7) Memory Load/Store WatchPoint n MWPnCFG2 | |
175 | Configuration 2 | |
176 | 0x312+8n (0≤n≤7) Memory Load/Store WatchPoint n MWPnCFG3 | |
177 | Configuration 3 | |
178 | 0x313+8n (0≤n≤7) Memory Load/Store WatchPoint n MWPnCFG4 | |
179 | Configuration 4 | |
180 | 0x380 Instruction Fetch WatchPoint FWPC | |
181 | Overall Control | |
182 | 0x381 Instruction Fetch WatchPoint FWPS | |
183 | Overall Status | |
184 | 0x390+8n (0≤n≤7) Instruction Fetch WatchPoint n FWPnCFG1 | |
185 | Configuration 1 | |
186 | 0x391+8n (0≤n≤7) Instruction Fetch WatchPoint n FWPnCFG2 | |
187 | Configuration 2 | |
188 | 0x392+8n (0≤n≤7) Instruction Fetch WatchPoint n FWPnCFG3 | |
189 | Configuration 3 | |
190 | 0x393+8n (0≤n≤7) Instruction Fetch WatchPoint n FWPnCFG4 | |
191 | Configuration 4 | |
192 | 0x500 Debug Register DBG | |
193 | 0x501 Debug Exception Return Address DERA | |
194 | 0x502 Debug Exception Saved Data Register DSAVE | |
195 | ================= ===================================== ============== | |
196 | ||
197 | ERA, TLBRERA, MERRERA and DERA are sometimes also known as EPC, TLBREPC, MERREPC | |
198 | and DEPC respectively. | |
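Several rows in the table describe families of registers whose address depends on an index n. The arithmetic can be sketched as below; the function names are hypothetical and exist only to spell out the formulas from the Address column.

```c
#include <assert.h>
#include <stdint.h>

/* SAVEn: 0x30+n, 0 <= n <= 15 */
uint32_t csr_save(unsigned int n)
{
    assert(n <= 15);
    return 0x30 + n;
}

/* DMWn: 0x180+n, 0 <= n <= 3 */
uint32_t csr_dmw(unsigned int n)
{
    assert(n <= 3);
    return 0x180 + n;
}

/* PMCFGn: 0x200+2n, 0 <= n <= 31 */
uint32_t csr_pmcfg(unsigned int n)
{
    assert(n <= 31);
    return 0x200 + 2 * n;
}

/* PMCNTn: 0x201+2n, 0 <= n <= 31 */
uint32_t csr_pmcnt(unsigned int n)
{
    assert(n <= 31);
    return 0x201 + 2 * n;
}

/* MWPnCFGk: 0x310+8n for k=1 up to 0x313+8n for k=4, 0 <= n <= 7 */
uint32_t csr_mwp_cfg(unsigned int n, unsigned int k)
{
    assert(n <= 7 && k >= 1 && k <= 4);
    return 0x310 + 8 * n + (k - 1);
}

/* FWPnCFGk: 0x390+8n for k=1 up to 0x393+8n for k=4, 0 <= n <= 7 */
uint32_t csr_fwp_cfg(unsigned int n, unsigned int k)
{
    assert(n <= 7 && k >= 1 && k <= 4);
    return 0x390 + 8 * n + (k - 1);
}
```

For instance, ``csr_pmcfg(1)`` evaluates to 0x202 and ``csr_pmcnt(1)`` to 0x203, so configuration and counter registers interleave.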
199 | ||
200 | Basic Instruction Set | |
201 | ===================== | |
202 | ||
203 | Instruction formats | |
204 | ------------------- | |
205 | ||
206 | LoongArch instructions are 32 bits wide, belonging to 9 basic instruction | |
207 | formats (and variants of them): | |
208 | ||
209 | =========== ========================== | |
210 | Format name Composition | |
211 | =========== ========================== | |
212 | 2R Opcode + Rj + Rd | |
213 | 3R Opcode + Rk + Rj + Rd | |
214 | 4R Opcode + Ra + Rk + Rj + Rd | |
215 | 2RI8 Opcode + I8 + Rj + Rd | |
216 | 2RI12 Opcode + I12 + Rj + Rd | |
217 | 2RI14 Opcode + I14 + Rj + Rd | |
218 | 2RI16 Opcode + I16 + Rj + Rd | |
219 | 1RI21 Opcode + I21L + Rj + I21H | |
220 | I26 Opcode + I26L + I26H | |
221 | =========== ========================== | |
222 | ||
223 | Rd is the destination register operand, while Rj, Rk and Ra ("a" stands for | |
224 | "additional") are the source register operands. I8/I12/I14/I16/I21/I26 are | |
225 | immediate operands of the respective widths. The longer I21 and I26 are stored | |
226 | in separate higher and lower parts in the instruction word, denoted by the "L" | |
227 | and "H" suffixes. | |
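To make the field layout concrete, here is a minimal sketch that packs the 3R and I26 formats. The bit positions (5-bit register fields starting at bit 0, a 17-bit opcode for 3R, a 6-bit opcode plus 16-bit low part and 10-bit high part for I26) and the opcode values used in the usage note are taken from my reading of the ISA manual listed in the references, not from this document, so treat them as assumptions. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* 3R: Opcode[31:15] + Rk[14:10] + Rj[9:5] + Rd[4:0] (assumed layout). */
uint32_t encode_3r(uint32_t opcode, unsigned int rd, unsigned int rj,
                   unsigned int rk)
{
    assert(opcode < (1u << 17) && rd < 32 && rj < 32 && rk < 32);
    return (opcode << 15) | (rk << 10) | (rj << 5) | rd;
}

/* I26: Opcode[31:26] + I26L[25:10] + I26H[9:0] (assumed layout).
 * offs is a signed 26-bit value; its low 16 bits go to I26L and its
 * high 10 bits go to I26H. */
uint32_t encode_i26(uint32_t opcode, int32_t offs)
{
    assert(opcode < (1u << 6));
    assert(offs >= -(1 << 25) && offs < (1 << 25));
    uint32_t raw = (uint32_t)offs & 0x3FFFFFFu;
    return (opcode << 26) | ((raw & 0xFFFFu) << 10) | (raw >> 16);
}

/* Reassemble the split immediate and sign-extend it from 26 bits. */
int32_t decode_i26(uint32_t insn)
{
    uint32_t raw = ((insn >> 10) & 0xFFFFu) | ((insn & 0x3FFu) << 16);
    return (int32_t)(raw ^ 0x2000000u) - 0x2000000;
}
```

Assuming opcode 0x20 for ADD.W and 0x14 for B, ``encode_3r(0x20, 4, 5, 6)`` gives 0x001018A4 and ``encode_i26(0x14, 1)`` gives 0x50000400; ``decode_i26`` recovers the original immediate, including negative ones.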
228 | ||
229 | List of Instructions | |
230 | -------------------- | |
231 | ||
232 | For brevity, only instruction names (mnemonics) are listed here; please see the | |
233 | :ref:`References <loongarch-references>` for details. | |
234 | ||
235 | ||
236 | 1. Arithmetic Instructions:: | |
237 | ||
238 | ADD.W SUB.W ADDI.W ADD.D SUB.D ADDI.D | |
239 | SLT SLTU SLTI SLTUI | |
240 | AND OR NOR XOR ANDN ORN ANDI ORI XORI | |
241 | MUL.W MULH.W MULH.WU DIV.W DIV.WU MOD.W MOD.WU | |
242 | MUL.D MULH.D MULH.DU DIV.D DIV.DU MOD.D MOD.DU | |
243 | PCADDI PCADDU12I PCADDU18I | |
244 | LU12I.W LU32I.D LU52I.D ADDU16I.D | |
245 | ||
246 | 2. Bit-shift Instructions:: | |
247 | ||
248 | SLL.W SRL.W SRA.W ROTR.W SLLI.W SRLI.W SRAI.W ROTRI.W | |
249 | SLL.D SRL.D SRA.D ROTR.D SLLI.D SRLI.D SRAI.D ROTRI.D | |
250 | ||
251 | 3. Bit-manipulation Instructions:: | |
252 | ||
253 | EXT.W.B EXT.W.H CLO.W CLO.D CLZ.W CLZ.D CTO.W CTO.D CTZ.W CTZ.D | |
254 | BYTEPICK.W BYTEPICK.D BSTRINS.W BSTRINS.D BSTRPICK.W BSTRPICK.D | |
255 | REVB.2H REVB.4H REVB.2W REVB.D REVH.2W REVH.D BITREV.4B BITREV.8B BITREV.W BITREV.D | |
256 | MASKEQZ MASKNEZ | |
257 | ||
258 | 4. Branch Instructions:: | |
259 | ||
260 | BEQ BNE BLT BGE BLTU BGEU BEQZ BNEZ B BL JIRL | |
261 | ||
262 | 5. Load/Store Instructions:: | |
263 | ||
264 | LD.B LD.BU LD.H LD.HU LD.W LD.WU LD.D ST.B ST.H ST.W ST.D | |
265 | LDX.B LDX.BU LDX.H LDX.HU LDX.W LDX.WU LDX.D STX.B STX.H STX.W STX.D | |
266 | LDPTR.W LDPTR.D STPTR.W STPTR.D | |
267 | PRELD PRELDX | |
268 | ||
269 | 6. Atomic Operation Instructions:: | |
270 | ||
271 | LL.W SC.W LL.D SC.D | |
272 | AMSWAP.W AMSWAP.D AMADD.W AMADD.D AMAND.W AMAND.D AMOR.W AMOR.D AMXOR.W AMXOR.D | |
273 | AMMAX.W AMMAX.D AMMIN.W AMMIN.D | |
274 | ||
275 | 7. Barrier Instructions:: | |
276 | ||
277 | IBAR DBAR | |
278 | ||
279 | 8. Special Instructions:: | |
280 | ||
281 | SYSCALL BREAK CPUCFG NOP IDLE ERTN(ERET) DBCL(DBGCALL) RDTIMEL.W RDTIMEH.W RDTIME.D | |
282 | ASRTLE.D ASRTGT.D | |
283 | ||
284 | 9. Privileged Instructions:: | |
285 | ||
286 | CSRRD CSRWR CSRXCHG | |
287 | IOCSRRD.B IOCSRRD.H IOCSRRD.W IOCSRRD.D IOCSRWR.B IOCSRWR.H IOCSRWR.W IOCSRWR.D | |
288 | CACOP TLBP(TLBSRCH) TLBRD TLBWR TLBFILL TLBCLR TLBFLUSH INVTLB LDDIR LDPTE | |
289 | ||
290 | Virtual Memory | |
291 | ============== | |
292 | ||
293 | LoongArch supports direct-mapped virtual memory and page-mapped virtual memory. | |
294 | ||
295 | Direct-mapped virtual memory is configured by CSR.DMWn (n=0~3); it has a simple | |
296 | relationship between virtual address (VA) and physical address (PA):: | |
297 | ||
298 | VA = PA + FixedOffset | |
299 | ||
300 | Page-mapped virtual memory has an arbitrary relationship between VA and PA, which | |
301 | is recorded in TLB and page tables. LoongArch's TLB includes a fully-associative | |
302 | MTLB (Multiple Page Size TLB) and set-associative STLB (Single Page Size TLB). | |
303 | ||
304 | By default, the whole virtual address space of LA32 is configured like this: | |
305 | ||
306 | ============ =========================== ============================= | |
307 | Name Address Range Attributes | |
308 | ============ =========================== ============================= | |
309 | ``UVRANGE`` ``0x00000000 - 0x7FFFFFFF`` Page-mapped, Cached, PLV0~3 | |
310 | ``KPRANGE0`` ``0x80000000 - 0x9FFFFFFF`` Direct-mapped, Uncached, PLV0 | |
311 | ``KPRANGE1`` ``0xA0000000 - 0xBFFFFFFF`` Direct-mapped, Cached, PLV0 | |
312 | ``KVRANGE`` ``0xC0000000 - 0xFFFFFFFF`` Page-mapped, Cached, PLV0 | |
313 | ============ =========================== ============================= | |
314 | ||
315 | User mode (PLV3) can only access UVRANGE. For direct-mapped KPRANGE0 and | |
316 | KPRANGE1, PA is equal to VA with bits 29~31 cleared. For example, the uncached | |
317 | direct-mapped VA of 0x00001000 is 0x80001000, and the cached direct-mapped | |
318 | VA of 0x00001000 is 0xA0001000. | |
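The two example addresses above can be reproduced with a few lines of C. This is a sketch under the default LA32 layout from the table; the macro and function names are hypothetical, and the mask keeps the low 29 bits because each direct-mapped range spans 512 MiB.

```c
#include <assert.h>
#include <stdint.h>

#define KPRANGE0_BASE   0x80000000u  /* direct-mapped, uncached */
#define KPRANGE1_BASE   0xA0000000u  /* direct-mapped, cached */
#define KPRANGE_PA_MASK 0x1FFFFFFFu  /* offset within a 512 MiB range */

/* Uncached direct-mapped VA for a physical address below 512 MiB. */
uint32_t la32_pa_to_uncached_va(uint32_t pa)
{
    assert(pa <= KPRANGE_PA_MASK);
    return KPRANGE0_BASE | pa;
}

/* Cached direct-mapped VA for a physical address below 512 MiB. */
uint32_t la32_pa_to_cached_va(uint32_t pa)
{
    assert(pa <= KPRANGE_PA_MASK);
    return KPRANGE1_BASE | pa;
}

/* Physical address behind a VA in KPRANGE0 or KPRANGE1. */
uint32_t la32_direct_va_to_pa(uint32_t va)
{
    assert(va >= KPRANGE0_BASE && va < 0xC0000000u);
    return va & KPRANGE_PA_MASK;
}
```

Both ``la32_direct_va_to_pa(0x80001000)`` and ``la32_direct_va_to_pa(0xA0001000)`` return 0x00001000, matching the example.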
319 | ||
320 | By default, the whole virtual address space of LA64 is configured like this: | |
321 | ||
322 | ============ ====================== ====================================== | |
323 | Name Address Range Attributes | |
324 | ============ ====================== ====================================== | |
325 | ``XUVRANGE`` ``0x0000000000000000 - Page-mapped, Cached, PLV0~3 | |
326 | 0x3FFFFFFFFFFFFFFF`` | |
327 | ``XSPRANGE`` ``0x4000000000000000 - Direct-mapped, Cached / Uncached, PLV0 | |
328 | 0x7FFFFFFFFFFFFFFF`` | |
329 | ``XKPRANGE`` ``0x8000000000000000 - Direct-mapped, Cached / Uncached, PLV0 | |
330 | 0xBFFFFFFFFFFFFFFF`` | |
331 | ``XKVRANGE`` ``0xC000000000000000 - Page-mapped, Cached, PLV0 | |
332 | 0xFFFFFFFFFFFFFFFF`` | |
333 | ============ ====================== ====================================== | |
334 | ||
335 | User mode (PLV3) can only access XUVRANGE. For direct-mapped XSPRANGE and | |
336 | XKPRANGE, PA is equal to VA with bits 60~63 cleared, and the cache attribute | |
337 | is configured by bits 60~61 in VA: 0 is for strongly-ordered uncached, 1 is | |
338 | for coherent cached, and 2 is for weakly-ordered uncached. | |
339 | ||
340 | Currently we only use XKPRANGE for direct mapping and XSPRANGE is reserved. | |
341 | ||
342 | To put this in action: the strongly-ordered uncached direct-mapped VA (in | |
343 | XKPRANGE) of 0x00000000_00001000 is 0x80000000_00001000, the coherent cached | |
344 | direct-mapped VA (in XKPRANGE) of 0x00000000_00001000 is 0x90000000_00001000, | |
345 | and the weakly-ordered uncached direct-mapped VA (in XKPRANGE) of | |
346 | 0x00000000_00001000 is 0xA0000000_00001000. | |
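The same arithmetic for LA64 can be sketched as follows, assuming the default layout above where XKPRANGE starts at 0x8000000000000000 and bits 60~61 select the cache attribute. The enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Cache attribute values carried in VA bits 60~61. */
enum la64_cache_attr {
    LA64_SUC = 0,  /* strongly-ordered uncached */
    LA64_CC  = 1,  /* coherent cached */
    LA64_WUC = 2,  /* weakly-ordered uncached */
};

#define XKPRANGE_BASE 0x8000000000000000ull

/* Direct-mapped VA in XKPRANGE for a physical address and attribute. */
uint64_t la64_pa_to_xkprange(uint64_t pa, enum la64_cache_attr attr)
{
    assert(attr <= LA64_WUC);
    assert((pa >> 60) == 0);  /* bits 60~63 must be free for the prefix */
    return XKPRANGE_BASE | ((uint64_t)attr << 60) | pa;
}

/* Physical address behind a direct-mapped VA: clear bits 60~63. */
uint64_t la64_direct_va_to_pa(uint64_t va)
{
    return va & ~(0xFull << 60);
}
```

With ``pa = 0x1000``, the three attributes produce 0x8000000000001000, 0x9000000000001000 and 0xA000000000001000 respectively, which are the addresses given in the paragraph above.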
347 | ||
348 | Relationship of Loongson and LoongArch | |
349 | ====================================== | |
350 | ||
351 | LoongArch is a RISC ISA that is different from all other existing ones, while | |
352 | Loongson is a family of processors. Loongson includes 3 series: Loongson-1 is | |
353 | the 32-bit processor series, Loongson-2 is the low-end 64-bit processor series, | |
354 | and Loongson-3 is the high-end 64-bit processor series. Old Loongson is based on | |
355 | MIPS, while New Loongson is based on LoongArch. Take Loongson-3 as an example: | |
356 | Loongson-3A1000/3B1500/3A2000/3A3000/3A4000 are MIPS-compatible, while | |
357 | Loongson-3A5000 (and future revisions) are all based on LoongArch. | |
358 | ||
359 | .. _loongarch-references: | |
360 | ||
361 | References | |
362 | ========== | |
363 | ||
364 | Official web site of Loongson Technology Corp. Ltd.: | |
365 | ||
366 | http://www.loongson.cn/ | |
367 | ||
368 | Developer web site of Loongson and LoongArch (Software and Documentation): | |
369 | ||
370 | http://www.loongnix.cn/ | |
371 | ||
372 | https://github.com/loongson/ | |
373 | ||
374 | https://loongson.github.io/LoongArch-Documentation/ | |
375 | ||
376 | Documentation of LoongArch ISA: | |
377 | ||
378 | https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-Vol1-v1.00-CN.pdf (in Chinese) | |
379 | ||
380 | https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-Vol1-v1.00-EN.pdf (in English) | |
381 | ||
382 | Documentation of LoongArch ELF psABI: | |
383 | ||
384 | https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-ELF-ABI-v1.00-CN.pdf (in Chinese) | |
385 | ||
386 | https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-ELF-ABI-v1.00-EN.pdf (in English) | |
387 | ||
388 | Linux kernel repository of Loongson and LoongArch: | |
389 | ||
390 | https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git |