Updated SM9 Implementation and Optimization (markdown)

Sun Yimin 2023-07-11 15:21:24 +08:00
parent 6adab79931
commit 846005eeda

@@ -41,6 +41,37 @@ Go is a relatively simple language, but for the sake of that simplicity the compiler does a lot of extra work
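## Using SIMD to copy values
That is, an assembly implementation of the Set operation, combined with reducing the number of Set calls as much as possible. This "optimization" makes the implementation more complex and hurts maintainability, so it may not be worth it.
## Implementing Neg with Sub
I noticed by accident that the Neg method performs worse than the Sub method that was implemented later, which is rather odd. Benchmarked in isolation, gfpNeg (BenchmarkGfPNeg-6) is actually faster than the gfpSub-based version (BenchmarkGfPNeg2-6):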
```
goos: windows
goarch: amd64
pkg: github.com/emmansun/gmsm/sm9/bn256
cpu: Intel(R) Core(TM) i5-9500 CPU @ 3.00GHz
BenchmarkGfPNeg-6 349538827 3.399 ns/op 0 B/op 0 allocs/op
BenchmarkGfPNeg2-6 282038318 4.208 ns/op 0 B/op 0 allocs/op
```
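To make the two variants concrete, here is a minimal sketch of negation computed directly and negation computed as `0 - a` through subtraction. It is only an illustration under loud assumptions: the modulus `p`, and the functions `sub`, `neg` and `negViaSub`, are hypothetical single-word stand-ins, not the gmsm code, where gfpNeg and gfpSub operate on 256-bit field elements.

```go
package main

import "fmt"

// p is a toy 61-bit prime modulus (2^61 - 1), chosen only for illustration.
// Assumption: the real field uses a 256-bit prime, not a single machine word.
const p uint64 = 1<<61 - 1

// sub returns (a - b) mod p for a, b in [0, p).
func sub(a, b uint64) uint64 {
	d := a - b
	if a < b {
		d += p // the subtraction wrapped around, so add the modulus back
	}
	return d
}

// neg returns (-a) mod p directly: p - a, with 0 mapped to 0.
func neg(a uint64) uint64 {
	if a == 0 {
		return 0
	}
	return p - a
}

// negViaSub computes the same value as 0 - a by reusing sub.
func negViaSub(a uint64) uint64 {
	return sub(0, a)
}

func main() {
	for _, a := range []uint64{0, 1, 12345, p - 1} {
		fmt.Println(a, neg(a) == negViaSub(a))
	}
}
```

Both functions return the same residue, so swapping one for the other changes only the cost, not the result.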
When the change is applied to the MulUNC method of gfP2, however, the result reverses.

With gfpNeg:
```
goos: windows
goarch: amd64
pkg: github.com/emmansun/gmsm/sm9/bn256
cpu: Intel(R) Core(TM) i5-9500 CPU @ 3.00GHz
BenchmarkGfP2MulU-6 8290990 141.1 ns/op 64 B/op 1 allocs/op
BenchmarkGfP2SquareU-6 10009350 117.0 ns/op 64 B/op 1 allocs/op
```
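With gfpSub (note that this run also reports 0 B/op and 0 allocs/op, where the gfpNeg run reports 64 B/op and 1 allocs/op):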
```
goos: windows
goarch: amd64
pkg: github.com/emmansun/gmsm/sm9/bn256
cpu: Intel(R) Core(TM) i5-9500 CPU @ 3.00GHz
BenchmarkGfP2MulU-6 12727611 92.70 ns/op 0 B/op 0 allocs/op
BenchmarkGfP2SquareU-6 17728008 66.35 ns/op 0 B/op 0 allocs/op
```
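The `B/op` and `allocs/op` columns suggest that heap allocation, rather than the arithmetic itself, accounts for much of the gap. These notes do not say where the allocation comes from, so the following is only a sketch of how one might check whether a call allocates, using `testing.AllocsPerRun` from the standard library. The type `elem` and the functions `addNew` and `addInto` are hypothetical and unrelated to the gmsm code.

```go
package main

import (
	"fmt"
	"testing"
)

// elem is a hypothetical 4-limb value, loosely modelled on a 256-bit element.
type elem [4]uint64

var (
	x, y, out elem
	sink      *elem // keeps results reachable so the work is not discarded
)

// addNew returns a freshly allocated result; the returned pointer escapes
// to the heap, so each call costs one allocation.
//
//go:noinline
func addNew(a, b *elem) *elem {
	r := new(elem)
	for i := range r {
		r[i] = a[i] + b[i] // limb-wise add without carry: illustration only
	}
	return r
}

// addInto writes into caller-provided storage and needs no allocation.
//
//go:noinline
func addInto(dst, a, b *elem) {
	for i := range dst {
		dst[i] = a[i] + b[i]
	}
}

// allocsNew reports the average number of allocations per call of addNew.
func allocsNew() float64 {
	return testing.AllocsPerRun(1000, func() { sink = addNew(&x, &y) })
}

// allocsInto reports the average number of allocations per call of addInto.
func allocsInto() float64 {
	return testing.AllocsPerRun(1000, func() { addInto(&out, &x, &y) })
}

func main() {
	fmt.Println("allocs per call, returning a new value:", allocsNew())
	fmt.Println("allocs per call, writing in place:", allocsInto())
}
```

In a real benchmark the same information appears in the `allocs/op` column when running `go test -bench . -benchmem`.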
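## Next steps
* Following *New software speed records for cryptographic pairings*, implement the arithmetic with floating-point operations and SIMD.
* [High-Speed Software Implementation of the Optimal Ate Pairing over Barreto-Naehrig Curves](https://eprint.iacr.org/2010/354.pdf): optimizations for arithmetic over the quadratic extension field, though its choice of p is a special one.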