Updated MFMM (markdown)

Sun Yimin 2021-12-23 21:25:30 +08:00
parent 02ae8a6270
commit 1f62239755

19
MFMM.md

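@@ -153,33 +153,30 @@ acc0, acc1, acc2, acc3, acc4, acc5 are 64-bit registers
Consider the following algorithm (essentially one round of additions, then one round of subtractions) Consider the following algorithm (essentially one round of additions, then one round of subtractions)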
acc4, acc3, acc2, acc1 acc4, acc3, acc2, acc1
+ acc0, 0, 0, (acc0 - L(acc0*2^32)) + acc0, 0, 0, acc0
- H(acc0*2^32) L(acc0*2^32) H(acc0*2^32) - H(acc0*2^32) L(acc0*2^32) H(acc0*2^32) L(acc0*2^32)
=> Further optimization => Further optimization
acc4, acc3, acc2, acc1 acc4, acc3, acc2, acc1
+ (acc0 - H(acc0*2^32)), 0, 0, (acc0 - L(acc0*2^32)) + (acc0 - H(acc0*2^32)), 0, 0, acc0
- L(acc0*2^32) H(acc0*2^32) - L(acc0*2^32) H(acc0*2^32) L(acc0*2^32)
Clearly, acc0 - H(acc0 * 2^32) >= 0 and acc0 - L(acc0 * 2^32) >= 0. Clearly, acc0 - H(acc0 * 2^32) >= 0.
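The bounds above can be checked directly. This is a minimal Go sketch, assuming `H` and `L` denote the high and low 64-bit words of the 128-bit product `acc0 * 2^32` (so `H = acc0 >> 32` and `L = acc0 << 32`). The helper name `split` is hypothetical and used for illustration only. It prints, for a few sample values, whether `acc0 >= H` and whether `acc0 >= L` hold.

```go
package main

import (
	"fmt"
	"math/bits"
)

// split returns the high and low 64-bit words of the 128-bit product acc0 * 2^32.
// (Hypothetical helper, not part of the original page.)
func split(acc0 uint64) (hi, lo uint64) {
	return bits.Mul64(acc0, 1<<32)
}

func main() {
	for _, acc0 := range []uint64{1, 0xffffffff, 0x123456789abcdef0, ^uint64(0)} {
		hi, lo := split(acc0)
		// hi equals acc0 >> 32, so it can never exceed acc0.
		// lo equals acc0 << 32, which can exceed acc0 (for example when acc0 = 1).
		fmt.Printf("acc0=%#x H=%#x L=%#x acc0>=H: %v acc0>=L: %v\n",
			acc0, hi, lo, acc0 >= hi, acc0 >= lo)
	}
}
```

For `acc0 = 1` the low word is `2^32`, so `acc0 - L` would be negative, while `acc0 - H` stays non-negative for every input.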
MOVQ acc0, AX MOVQ acc0, AX
MOVQ acc0, DX MOVQ acc0, DX
SHLQ $32, AX SHLQ $32, AX
SHRQ $32, DX SHRQ $32, DX
MOVQ acc0, t0
SUBQ AX, t0
SUBQ DX, acc0
ADDQ t0, acc1 ADDQ acc0, acc1
ADCQ $0, acc2 ADCQ $0, acc2
ADCQ $0, acc3 ADCQ $0, acc3
ADCQ acc0, acc4 ADCQ acc0, acc4
ADCQ $0, acc5 ADCQ $0, acc5
SUBQ AX, acc1
SUBQ DX, acc2 SUBQ DX, acc2
SBBQ AX, acc3 SBBQ AX, acc3
SBBQ $0, acc4 SBBQ DX, acc4
SBBQ $0, acc5 SBBQ $0, acc5
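The add round and subtract round can also be sketched in Go and compared against arbitrary-precision arithmetic. This is an illustrative sketch, not the page's implementation: it follows the word layout `+ (acc0, 0, 0, acc0)` and `- (H, L, H, L)` over `(acc4, acc3, acc2, acc1)`, and it assumes the borrow from the `acc1` subtraction is propagated into `acc2` and onward. The names `reduceStep`, `reference`, and `toBig` are hypothetical.

```go
package main

import (
	"fmt"
	"math/big"
	"math/bits"
)

// reduceStep folds acc[0] into the words acc[1]..acc[5]:
//   + (acc0, 0, 0, acc0) aligned with (acc4, acc3, acc2, acc1)
//   - (H, L, H, L)       where H:L is the 128-bit value acc0 * 2^32
// The result is returned as (acc1, acc2, acc3, acc4, acc5).
func reduceStep(acc [6]uint64) [5]uint64 {
	acc0 := acc[0]
	lo := acc0 << 32 // plays the role of AX
	hi := acc0 >> 32 // plays the role of DX
	r := [5]uint64{acc[1], acc[2], acc[3], acc[4], acc[5]}
	var c, b uint64
	// Round of additions with a carry chain.
	r[0], c = bits.Add64(r[0], acc0, 0)
	r[1], c = bits.Add64(r[1], 0, c)
	r[2], c = bits.Add64(r[2], 0, c)
	r[3], c = bits.Add64(r[3], acc0, c)
	r[4], _ = bits.Add64(r[4], 0, c)
	// Round of subtractions with a borrow chain (assumed to run through every word).
	r[0], b = bits.Sub64(r[0], lo, 0)
	r[1], b = bits.Sub64(r[1], hi, b)
	r[2], b = bits.Sub64(r[2], lo, b)
	r[3], b = bits.Sub64(r[3], hi, b)
	r[4], _ = bits.Sub64(r[4], 0, b)
	return r
}

// reference computes the same quantity with big.Int:
// V + acc0*(2^192 + 1) - acc0*(2^160 + 2^32), where V is the value of acc[5]..acc[1].
func reference(acc [6]uint64) *big.Int {
	v := new(big.Int)
	for i := 5; i >= 1; i-- {
		v.Lsh(v, 64)
		v.Add(v, new(big.Int).SetUint64(acc[i]))
	}
	a := new(big.Int).SetUint64(acc[0])
	add := new(big.Int).Lsh(a, 192)
	add.Add(add, a)
	sub := new(big.Int).Lsh(a, 160)
	sub.Add(sub, new(big.Int).Lsh(a, 32))
	v.Add(v, add)
	return v.Sub(v, sub)
}

// toBig converts the five result words (least significant first) to a big.Int.
func toBig(r [5]uint64) *big.Int {
	v := new(big.Int)
	for i := 4; i >= 0; i-- {
		v.Lsh(v, 64)
		v.Add(v, new(big.Int).SetUint64(r[i]))
	}
	return v
}

func main() {
	acc := [6]uint64{0x0123456789abcdef, 1, ^uint64(0), 0, ^uint64(0), 0}
	fmt.Println(toBig(reduceStep(acc)).Cmp(reference(acc)) == 0)
}
```

Keeping both rounds as full carry and borrow chains avoids any intermediate value that could go negative, which is the concern with folding `acc0 - L(acc0*2^32)` into a single word.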
### Step 3: compute X * Y1 and add it to tmp ### Step 3: compute X * Y1 and add it to tmp