Updated SM2 WWMM (2) (markdown)

Sun Yimin 2024-02-26 08:34:52 +08:00
parent 9bec042da7
commit 2af450584b

@ -457,21 +457,23 @@ $T_3=t_7 \ast 2^{448} + t_6 \ast 2^{384} + t_5 \ast 2^{320} + (t_4+Y) \ast 2^{25
ADCQ t1, acc1 // (carry2, acc1) = acc1 + H(t0 * ord0) + carry1 ADCQ t1, acc1 // (carry2, acc1) = acc1 + H(t0 * ord0) + carry1
MOVQ t0, acc0 // acc0 = t0 MOVQ t0, acc0 // acc0 = t0
MULXQ p256ord<>+0x08(SB), AX, t1
ADCQ $0, t1 // t1 = carry2 + H(t0*ord1)
ADDQ AX, acc1 // (carry3, acc1) = acc1 + L(t0*ord1)
ADCQ t1, acc2 // (carry4, acc2) = acc2 + t1 + carry3
ADCQ $0, acc3 // (carry5, acc3) = acc3 + carry4
ADCQ $0, acc0 // acc0 = t0 + carry5
// calculate the negative part: [acc0, acc3, acc2, acc1] - [0, 0x100000000, 1, 0] * t0 // calculate the negative part: [acc0, acc3, acc2, acc1] - [0, 0x100000000, 1, 0] * t0
MOVQ t0, AX MOVQ t0, AX
//MOVQ t0, DX // This is not required due to t0=DX already
SHLQ $32, AX SHLQ $32, AX
SHRQ $32, DX SHRQ $32, DX
SUBQ t0, acc2 SUBQ t0, acc2
SBBQ AX, acc3 SBBQ AX, acc3
SBBQ DX, acc0 SBBQ DX, acc0
MOVQ t0, DX
MULXQ p256ord<>+0x08(SB), AX, t1
ADCQ $0, t1 // t1 = carry2 + H(t0*ord1)
ADDQ AX, acc1 // (carry3, acc1) = acc1 + L(t0*ord1)
ADCQ t1, acc2 // (carry4, acc2) = acc2 + t1 + carry3
ADCQ $0, acc3 // (carry5, acc3) = acc3 + carry4
ADCQ $0, acc0 // acc0 = t0 + carry5
``` ```
Multiplications: 3 Multiplications: 3
Shifts: 2 Shifts: 2
@ -633,7 +635,7 @@ $t_5=t_5 - 0$
Multiplications: 3 Multiplications: 3
Shifts: 2 Shifts: 2
Additions: 9 Additions: 9
Subtractions: 4 Subtractions: 3
**Using MULXQ**: **Using MULXQ**:
```asm ```asm
@ -645,35 +647,36 @@ $t_5=t_5 - 0$
MULXQ p256ord<>+0x00(SB), AX, BX MULXQ p256ord<>+0x00(SB), AX, BX
ADDQ AX, acc0 ADDQ AX, acc0
ADCQ BX, acc1 ADCQ BX, acc1
MOVQ t0, acc0
MULXQ p256ord<>+0x08(SB), AX, BX
ADCQ $0, BX
ADDQ AX, acc1
ADCQ BX, acc2
ADCQ $0, acc3
ADCQ t0, acc4
ADCQ $0, acc5
MOVQ t0, AX MOVQ t0, AX
//MOVQ t0, DX // This is not required due to t0=DX already
SHLQ $32, AX SHLQ $32, AX
SHRQ $32, DX SHRQ $32, DX
SUBQ t0, acc2 SUBQ t0, acc2
SBBQ AX, acc3 SBBQ AX, acc3
SBBQ DX, acc4 SBBQ DX, acc0
SBBQ $0, acc5
MOVQ t0, DX
MULXQ p256ord<>+0x08(SB), AX, BX
ADCQ $0, BX
ADDQ AX, acc1
ADCQ BX, acc2
ADCQ $0, acc3
ADCQ acc0, acc4
ADCQ $0, acc5
``` ```
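Multiplications: 3 Multiplications: 3
Shifts: 2 Shifts: 2
Additions: 8 Additions: 8
Subtractions: 4 Subtractions: 3
| Approach | Multiplications | Shifts | Additions | Subtractions | | Approach | Multiplications | Shifts | Additions | Subtractions |
| ----------- | ----------- | ----------- | ----------- | ----------- | | ----------- | ----------- | ----------- | ----------- | ----------- |
| Approach 1 | 5 | 0 | 15 | 0 | | Approach 1 | 5 | 0 | 15 | 0 |
| Approach 1 with MULX/ADCX/ADOX | 5 | 0 | 10 | 0 | | Approach 1 with MULX/ADCX/ADOX | 5 | 0 | 10 | 0 |
| Approach 2 | 3 | 2 | 9 | 3 | | Approach 2 | 3 | 2 | 9 | 3 |
| Approach 2 with MULX | 3 | 2 | 8 | 4 | | Approach 2 with MULX | 3 | 2 | 8 | 3 |
It appears that, where **MULXQ/ADCXQ/ADOXQ** are supported, Approach 1 with MULX/ADCX/ADOX is the better choice It appears that, where **MULXQ/ADCXQ/ADOXQ** are supported, Approach 1 with MULX/ADCX/ADOX is the better choice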
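
The shift-and-subtract step marked "calculate the negative part" in the listings relies on the shape of the two high limbs of the order. A minimal Go sketch of that identity follows. It is not part of the wiki page: it assumes the high limbs of `p256ord` are `0xFFFFFFFFFFFFFFFF` and `0xFFFFFFFEFFFFFFFF` (the SM2 group order), and the helper names `mulHighShift`, `mulHighBig`, and `limbsToBig` are hypothetical. Under that assumption the two limbs together equal $2^{128} - 2^{96} - 1$, so multiplying by `t0` reduces to one addition at the top limb and three subtractions, which is what `SUBQ`/`SBBQ`/`SBBQ` with `t0`, `t0<<32`, and `t0>>32` compute.

```go
package main

import (
	"fmt"
	"math/big"
	"math/bits"
)

// Assumed high two 64-bit limbs of the SM2 group order (little-endian limb order).
const (
	ord2 uint64 = 0xFFFFFFFFFFFFFFFF
	ord3 uint64 = 0xFFFFFFFEFFFFFFFF
)

// mulHighShift computes t0 * (ord3*2^64 + ord2) without a multiplication,
// using t0*2^128 - t0*(2^96 + 1). The result is returned as three
// little-endian 64-bit limbs.
func mulHighShift(t0 uint64) [3]uint64 {
	var r [3]uint64
	var borrow uint64
	// Start from [t0, 0, 0] (t0 at limb 2) and subtract [t0>>32, t0<<32, t0].
	r[0], borrow = bits.Sub64(0, t0, 0)
	r[1], borrow = bits.Sub64(0, t0<<32, borrow)
	r[2], _ = bits.Sub64(t0, t0>>32, borrow)
	return r
}

// mulHighBig computes the same product with math/big as a reference.
func mulHighBig(t0 uint64) *big.Int {
	hi := new(big.Int).SetUint64(ord3)
	hi.Lsh(hi, 64)
	hi.Add(hi, new(big.Int).SetUint64(ord2))
	return hi.Mul(hi, new(big.Int).SetUint64(t0))
}

// limbsToBig converts three little-endian limbs to a big.Int.
func limbsToBig(r [3]uint64) *big.Int {
	v := new(big.Int)
	for i := 2; i >= 0; i-- {
		v.Lsh(v, 64)
		v.Add(v, new(big.Int).SetUint64(r[i]))
	}
	return v
}

func main() {
	samples := []uint64{0, 1, 0xFFFFFFFF, 0x100000000, 0xFFFFFFFFFFFFFFFF}
	for _, t0 := range samples {
		match := limbsToBig(mulHighShift(t0)).Cmp(mulHighBig(t0)) == 0
		fmt.Printf("t0=%#x match=%v\n", t0, match)
	}
}
```

The sketch only checks the arithmetic identity. It does not model how the assembly interleaves the remaining `MULXQ` with the carry chain, which is what the instruction counts in the table compare.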