From 96a9696ad04070ffc1bcdcfb09ca4ed4300a358a Mon Sep 17 00:00:00 2001
From: Sun Yimin
Date: Mon, 22 Mar 2021 09:38:14 +0800
Subject: [PATCH] =?UTF-8?q?Updated=20SM4=E6=80=A7=E8=83=BD=E4=BC=98?=
 =?UTF-8?q?=E5=8C=96=20(markdown)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 SM4性能优化.md | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/SM4性能优化.md b/SM4性能优化.md
index 6ec4c45..86f4cd7 100644
--- a/SM4性能优化.md
+++ b/SM4性能优化.md
@@ -12,6 +12,7 @@ Go's symmetric-encryption implementation separates the cipher mode from Block-level encryption, and
 
 # Before optimization
 
+    CPU: i5-9500
     goos: windows
     goarch: amd64
     pkg: github.com/emmansun/gmsm/sm4
@@ -34,6 +35,7 @@ Go's symmetric-encryption implementation separates the cipher mode from Block-level encryption, and
 
 # Using AES-NI at the Block level
 
+    CPU: i5-9500
     goos: windows
     goarch: amd64
     pkg: github.com/emmansun/gmsm/sm4_test
@@ -59,16 +61,19 @@ Go's symmetric-encryption implementation separates the cipher mode from Block-level encryption, and
 # Parallel optimization of CBC-mode decryption
 I did not write a separate asm function for this; I took the lazy route.
 
+    CPU: i5-9500
     BenchmarkSM4CBCDecrypt1K-6    292531    4103 ns/op    249.56 MB/s    0 B/op    0 allocs/op
 
 # Parallel optimization of CTR mode
 
+    CPU: i5-9500
     BenchmarkSM4CTR1K-6           292522    4121 ns/op    247.30 MB/s    0 B/op    0 allocs/op
     BenchmarkSM4CTR8K-6            36483   33203 ns/op    246.57 MB/s    0 B/op    0 allocs/op
 
 # GCM mode optimization
 For now only the encryption is parallelized; optimizing the GHASH part will have to come gradually.
 
+    CPU: i5-9500
     BenchmarkSM4GCMSeal1K-6       153688    7904 ns/op    129.56 MB/s    0 B/op    0 allocs/op
     BenchmarkSM4GCMOpen1K-6       149971    7896 ns/op    129.69 MB/s    0 B/op    0 allocs/op
     BenchmarkSM4GCMSign1K-6       315027    3753 ns/op    272.85 MB/s    0 B/op    0 allocs/op
@@ -77,6 +82,7 @@ Go's symmetric-encryption implementation separates the cipher mode from Block-level encryption, and
 # ASM optimization of GHASH in GCM mode
 The asm code is adapted from the AES implementation, and the speedup is striking!
 
+    CPU: i5-9500
     BenchmarkSM4GCMSeal1K-6       273218    4491 ns/op     228.00 MB/s    0 B/op    0 allocs/op
     BenchmarkSM4GCMOpen1K-6       250770    4516 ns/op     226.73 MB/s    0 B/op    0 allocs/op
     BenchmarkSM4GCMSign1K-6      3321482     359 ns/op    2853.54 MB/s    0 B/op    0 allocs/op
@@ -127,3 +133,35 @@ Golang provides no optimization interface for these two modes; perhaps these two modes are not very
     BenchmarkSM4GCMOpen8K-8        26265   53390 ns/op     153.44 MB/s    0 B/op    0 allocs/op
     PASS
     ok      github.com/emmansun/gmsm/sm4_test       47.862s
+
+    CPU: i5-9500
+    goos: windows
+    goarch: amd64
+    BenchmarkAESCBCEncrypt1K-6   1000000     1006 ns/op     1017.63 MB/s    0 B/op    0 allocs/op
+    BenchmarkSM4CBCEncrypt1K-6     87804    13595 ns/op       75.32 MB/s    0 B/op    0 allocs/op
+    BenchmarkAESCBCDecrypt1K-6   1240671      964 ns/op     1061.74 MB/s    0 B/op    0 allocs/op
+    BenchmarkSM4CBCDecrypt1K-6    300069     4037 ns/op      253.68 MB/s    0 B/op    0 allocs/op
+    BenchmarkAESCFBEncrypt1K-6    876500     1425 ns/op      714.92 MB/s    0 B/op    0 allocs/op
+    BenchmarkSM4CFBEncrypt1K-6     86581    13843 ns/op       73.61 MB/s    0 B/op    0 allocs/op
+    BenchmarkAESCFBDecrypt1K-6    878245     1338 ns/op      761.56 MB/s    0 B/op    0 allocs/op
+    BenchmarkSM4CFBDecrypt1K-6     86564    13823 ns/op       73.72 MB/s    0 B/op    0 allocs/op
+    BenchmarkAESCFBDecrypt8K-6    112794    10522 ns/op      778.09 MB/s    0 B/op    0 allocs/op
+    BenchmarkSM4CFBDecrypt8K-6     10000   110776 ns/op       73.91 MB/s    0 B/op    0 allocs/op
+    BenchmarkAESOFB1K-6          1343679      892 ns/op     1142.41 MB/s    0 B/op    0 allocs/op
+    BenchmarkSM4OFB1K-6            89094    13409 ns/op       76.00 MB/s    0 B/op    0 allocs/op
+    BenchmarkAESCTR1K-6          1000000     1036 ns/op      984.00 MB/s    0 B/op    0 allocs/op
+    BenchmarkSM4CTR1K-6           292957     4098 ns/op      248.66 MB/s    0 B/op    0 allocs/op
+    BenchmarkAESCTR8K-6           149863     8200 ns/op      998.46 MB/s    0 B/op    0 allocs/op
+    BenchmarkSM4CTR8K-6            36595    32699 ns/op      250.38 MB/s    0 B/op    0 allocs/op
+    BenchmarkAESGCMSeal1K-6      4802740      249 ns/op     4113.39 MB/s    0 B/op    0 allocs/op
+    BenchmarkSM4GCMSeal1K-6       267092     4385 ns/op      233.52 MB/s    0 B/op    0 allocs/op
+    BenchmarkAESGCMOpen1K-6      5665056      212 ns/op     4836.19 MB/s    0 B/op    0 allocs/op
+    BenchmarkSM4GCMOpen1K-6       273200     4380 ns/op      233.80 MB/s    0 B/op    0 allocs/op
+    BenchmarkAESGCMSign1K-6      9603033      124 ns/op     8258.43 MB/s    0 B/op    0 allocs/op
+    BenchmarkSM4GCMSign1K-6      3725722      322 ns/op     3183.11 MB/s    0 B/op    0 allocs/op
+    BenchmarkAESGCMSign8K-6      1570182      764 ns/op    10723.55 MB/s    0 B/op    0 allocs/op
+    BenchmarkSM4GCMSign8K-6      1244473      964 ns/op     8498.90 MB/s    0 B/op    0 allocs/op
+    BenchmarkAESGCMSeal8K-6       768501     1619 ns/op     5058.99 MB/s    0 B/op    0 allocs/op
+    BenchmarkSM4GCMSeal8K-6        36162    33197 ns/op      246.77 MB/s    0 B/op    0 allocs/op
+    BenchmarkAESGCMOpen8K-6       944479     1325 ns/op     6183.50 MB/s    0 B/op    0 allocs/op
+    BenchmarkSM4GCMOpen8K-6        36162    33197 ns/op      246.77 MB/s    0 B/op    0 allocs/op