From ebeb258c58b7e6bb50c13461233d426d11a519f9 Mon Sep 17 00:00:00 2001
From: xfy
Date: Thu, 11 Jun 2026 13:43:28 +0800
Subject: [PATCH] docs(benchmark): add v0.4.0 baseline summary and update gitignore

- Collect baseline benchmark summary across all core modules
- Save key results to benchmarks/v0.4.0/summary.txt
- Update .gitignore to track benchmark summaries/reports
- Include performance optimization design docs and plan
---
 .gitignore | 10 +-
 benchmarks/v0.4.0/summary.txt | 335 ++++
 .../2026-06-03-eliminate-code-redundancy.md | 929 ++++++++++
 .../plans/2026-06-03-redundancy-removal.md | 791 ++++++++
 .../2026-06-04-performance-optimization.md | 820 +++++++++
 .../2026-06-08-loadbalance-enhancement.md | 1620 +++++++++++++++++
 ...026-06-10-performance-optimization-plan.md | 1235 +++++++++++++
 ...-06-03-eliminate-code-redundancy-design.md | 213 +++
 ...26-06-08-loadbalance-enhancement-design.md | 389 ++++
 ...6-06-10-performance-optimization-design.md | 261 +++
 10 files changed, 6601 insertions(+), 2 deletions(-)
 create mode 100644 benchmarks/v0.4.0/summary.txt
 create mode 100644 docs/superpowers/plans/2026-06-03-eliminate-code-redundancy.md
 create mode 100644 docs/superpowers/plans/2026-06-03-redundancy-removal.md
 create mode 100644 docs/superpowers/plans/2026-06-04-performance-optimization.md
 create mode 100644 docs/superpowers/plans/2026-06-08-loadbalance-enhancement.md
 create mode 100644 docs/superpowers/plans/2026-06-10-performance-optimization-plan.md
 create mode 100644 docs/superpowers/specs/2026-06-03-eliminate-code-redundancy-design.md
 create mode 100644 docs/superpowers/specs/2026-06-08-loadbalance-enhancement-design.md
 create mode 100644 docs/superpowers/specs/2026-06-10-performance-optimization-design.md

diff --git a/.gitignore b/.gitignore
index 2c620dc..c2be9d4 100644
--- a/.gitignore
+++ b/.gitignore
@@ -51,9 +51,12 @@ logs/
 tmp/
 temp/
-# Benchmark results
+# Benchmark results: keep structure but ignore large raw data
 benchmarks/*/
 !benchmarks/.gitkeep
+benchmarks/**/*.txt +!benchmarks/*/summary.txt +!benchmarks/*/REPORT.md # oh-my-claudecode state directory .omc/ @@ -77,5 +80,8 @@ main .crush # Planning and specification documents (agent-generated) -docs/superpowers/ +# Keep generated specs/plans checked in for traceability +docs/superpowers/*/ +!docs/superpowers/specs/ +!docs/superpowers/plans/ docs/plans/ diff --git a/benchmarks/v0.4.0/summary.txt b/benchmarks/v0.4.0/summary.txt new file mode 100644 index 0000000..474d9b2 --- /dev/null +++ b/benchmarks/v0.4.0/summary.txt @@ -0,0 +1,335 @@ +=== cache.txt === +BenchmarkFileCacheGet/Size100-10 21736779 52.62 ns/op 16 B/op 1 allocs/op +BenchmarkFileCacheGet/Size1000-10 22091924 50.41 ns/op 16 B/op 1 allocs/op +BenchmarkFileCacheGet/Size10000-10 24389118 44.63 ns/op 21 B/op 1 allocs/op +BenchmarkFileCacheSet/Size100-10 2482664 513.3 ns/op 120 B/op 5 allocs/op +BenchmarkFileCacheSet/Size1000-10 2469159 482.6 ns/op 120 B/op 5 allocs/op +BenchmarkFileCacheSet/Size10000-10 2264976 713.2 ns/op 120 B/op 5 allocs/op +BenchmarkFileCacheSet_Pooled/Size100-10 2663748 449.6 ns/op 120 B/op 5 allocs/op +BenchmarkFileCacheSet_Pooled/Size1000-10 1215387 895.9 ns/op 120 B/op 5 allocs/op +BenchmarkFileCacheSet_Pooled/Size10000-10 1000000 3025 ns/op 120 B/op 5 allocs/op +BenchmarkFileCacheSetNoEviction-10 2499050 912.4 ns/op 112 B/op 5 allocs/op +BenchmarkFileCacheConcurrent/Size100-10 8061500 139.6 ns/op 26 B/op 1 allocs/op +BenchmarkFileCacheConcurrent/Size1000-10 5312031 222.8 ns/op 31 B/op 2 allocs/op +BenchmarkFileCacheConcurrent/Size10000-10 5357617 227.2 ns/op 31 B/op 2 allocs/op +BenchmarkFileCacheGetOnly-10 15445342 78.82 ns/op 29 B/op 1 allocs/op +BenchmarkFileCacheSizeEviction-10 1275866 942.1 ns/op 1121 B/op 5 allocs/op +BenchmarkFileCacheLRUTouch-10 10095694 115.6 ns/op 16 B/op 1 allocs/op +BenchmarkProxyCacheGet-10 29701370 125.0 ns/op 13 B/op 1 allocs/op +BenchmarkProxyCacheSet-10 717621 1754 ns/op 251 B/op 3 allocs/op +BenchmarkProxyCacheConcurrent-10 4811952 276.0 
ns/op 69 B/op 2 allocs/op +BenchmarkFileCacheSetAllocation_New-10 2149519 522.3 ns/op 97 B/op 4 allocs/op +BenchmarkFileCacheSetAllocation_Update-10 3871534 348.9 ns/op 45 B/op 2 allocs/op +BenchmarkFileCacheSetAllocation_Eviction-10 2248743 552.0 ns/op 96 B/op 4 allocs/op +BenchmarkFileCacheSetAllocation_EvictionWithPool-10 2310462 515.7 ns/op 96 B/op 4 allocs/op +BenchmarkFileCacheSetAllocation_MemoryLimit-10 2186145 563.2 ns/op 96 B/op 4 allocs/op +BenchmarkFileCacheSetAllocation_Concurrent-10 1934901 654.7 ns/op 88 B/op 3 allocs/op +BenchmarkFileCacheSetAllocation_ConcurrentEviction-10 2139834 609.0 ns/op 96 B/op 3 allocs/op +BenchmarkFileCacheEntryPool_GetPut-10 85020030 12.46 ns/op 0 B/op 0 allocs/op +BenchmarkFileCacheLRUList_PushFront-10 6249896 206.8 ns/op 232 B/op 4 allocs/op +PASS +ok rua.plus/lolly/internal/cache 45.363s + +=== handler.txt === +BenchmarkGenerateAutoIndex_HTML-10 3960 267700 ns/op 87857 B/op 836 allocs/op +BenchmarkStaticFileLookup-10 90567 13565 ns/op 5050 B/op 33 allocs/op +BenchmarkStaticFileCacheHit-10 90382 14694 ns/op 5109 B/op 35 allocs/op +BenchmarkStaticFileCacheMiss_1KB-10 89732 13796 ns/op 5042 B/op 33 allocs/op +BenchmarkStaticFileCacheMiss_10KB-10 63474 19300 ns/op 23990 B/op 33 allocs/op +BenchmarkStaticTryFiles-10 70876 16249 ns/op 4970 B/op 51 allocs/op +BenchmarkStaticIndex-10 93171 12437 ns/op 3577 B/op 33 allocs/op +BenchmarkStaticNestedFile-10 84880 14478 ns/op 13679 B/op 33 allocs/op +BenchmarkStaticFileNotFound-10 492165 2527 ns/op 2225 B/op 15 allocs/op +BenchmarkStaticWithCacheParallel-10 56725 19007 ns/op 11592 B/op 34 allocs/op +BenchmarkStaticFileLookupWithAlias-10 101319 11805 ns/op 5090 B/op 34 allocs/op +PASS +ok rua.plus/lolly/internal/handler 13.431s + +=== http.txt === +BenchmarkAdapterConversion-10 1512100 824.2 ns/op 256 B/op 10 allocs/op +BenchmarkAdapterWithBody-10 226002 5080 ns/op 6928 B/op 30 allocs/op +BenchmarkServerCreation-10 4511774 265.1 ns/op 416 B/op 5 allocs/op 
+BenchmarkHTTP2ServerStart-10 4156614 245.0 ns/op 416 B/op 5 allocs/op +BenchmarkHTTP2FrameEncoding/SettingsFrame-10 55689206 18.42 ns/op 0 B/op 0 allocs/op +BenchmarkHTTP2FrameEncoding/DataFrame-10 33784801 37.93 ns/op 0 B/op 0 allocs/op +BenchmarkHTTP2FrameEncoding/DataFrame_Small-10 65797820 17.68 ns/op 0 B/op 0 allocs/op +BenchmarkHTTP2FrameEncoding/DataFrame_Large-10 2523252 492.1 ns/op 0 B/op 0 allocs/op +BenchmarkHTTP2FrameEncoding/PingFrame-10 94094857 13.64 ns/op 0 B/op 0 allocs/op +BenchmarkHTTP2FrameEncoding/RSTStreamFrame-10 100000000 11.41 ns/op 0 B/op 0 allocs/op +BenchmarkHTTP2FrameEncoding/WindowUpdateFrame-10 100000000 12.14 ns/op 0 B/op 0 allocs/op +BenchmarkHTTP2FrameEncoding/GoAwayFrame-10 66041313 18.02 ns/op 0 B/op 0 allocs/op +BenchmarkHTTP2HeadersEncoding/CommonHeaders-10 2735127 438.0 ns/op 0 B/op 0 allocs/op +BenchmarkHTTP2HeadersEncoding/CommonHeaders_Parallel-10 786241 1505 ns/op 1992 B/op 28 allocs/op +BenchmarkHTTP2HeadersEncoding/AuthHeaders-10 1694182 711.2 ns/op 0 B/op 0 allocs/op +BenchmarkHTTP2HeadersEncoding/BodyHeaders-10 3570954 389.7 ns/op 0 B/op 0 allocs/op +BenchmarkHTTP2HeadersEncoding/RepeatedHeaders-10 2209392 514.4 ns/op 0 B/op 0 allocs/op +BenchmarkHTTP2StreamCreate-10 221667 4558 ns/op 6731 B/op 29 allocs/op +BenchmarkHTTP2ConcurrentStreams-10 328290 4151 ns/op 6742 B/op 31 allocs/op +BenchmarkHTTP2RequestRoundTrip-10 638324 1630 ns/op 343 B/op 12 allocs/op +BenchmarkHTTP2RequestRoundTrip_WithBody-10 232758 4992 ns/op 7391 B/op 33 allocs/op +BenchmarkHTTP2RequestRoundTrip_WithBody_Parallel-10 232129 4407 ns/op 7288 B/op 32 allocs/op +BenchmarkHTTP2AdapterWithHPACKHeaders-10 245524 4732 ns/op 6780 B/op 31 allocs/op +PASS +ok rua.plus/lolly/internal/http2 27.192s +BenchmarkAdapterWrap-10 2033492 565.5 ns/op 520 B/op 7 allocs/op +BenchmarkAdapterConvertRequest-10 1272426 937.9 ns/op 164 B/op 6 allocs/op +BenchmarkAdapterConvertRequestBody_1KB-10 200964 5239 ns/op 3338 B/op 13 allocs/op 
+BenchmarkAdapterConvertRequestBody_10KB-10 78788 13594 ns/op 34841 B/op 20 allocs/op +BenchmarkAdapterConvertRequestBody_100KB-10 34011 34759 ns/op 213538 B/op 10 allocs/op + +=== loadbalance.txt === +BenchmarkRoundRobinSelect/3targets-10 75176475 15.67 ns/op 0 B/op 0 allocs/op +BenchmarkRoundRobinSelect/50targets-10 46881991 25.77 ns/op 0 B/op 0 allocs/op +BenchmarkRoundRobinSelect/200targets-10 13730298 89.31 ns/op 0 B/op 0 allocs/op +BenchmarkWeightedRoundRobin/3targets_equal-10 74647123 15.59 ns/op 0 B/op 0 allocs/op +BenchmarkWeightedRoundRobin/3targets_weighted-10 68335051 15.72 ns/op 0 B/op 0 allocs/op +BenchmarkWeightedRoundRobin/50targets_equal-10 35494826 32.86 ns/op 0 B/op 0 allocs/op +BenchmarkWeightedRoundRobin/50targets_weighted-10 33776556 34.54 ns/op 0 B/op 0 allocs/op +BenchmarkWeightedRoundRobin/200targets_equal-10 10033557 118.6 ns/op 0 B/op 0 allocs/op +BenchmarkConsistentHashSelect/10targets_50vnodes-10 37505451 27.27 ns/op 0 B/op 0 allocs/op +BenchmarkConsistentHashSelect/10targets_150vnodes-10 44527291 26.94 ns/op 0 B/op 0 allocs/op +BenchmarkConsistentHashSelect/10targets_200vnodes-10 46628412 26.60 ns/op 0 B/op 0 allocs/op +BenchmarkConsistentHashSelect/50targets_150vnodes-10 43033684 26.59 ns/op 0 B/op 0 allocs/op +BenchmarkConsistentHashSelect/100targets_150vnodes-10 46417550 26.51 ns/op 0 B/op 0 allocs/op +BenchmarkConsistentHashRebuild/10targets_150vnodes-10 8913 119725 ns/op 114009 B/op 35 allocs/op +BenchmarkConsistentHashRebuild/50targets_150vnodes-10 1285 905655 ns/op 828420 B/op 108 allocs/op +BenchmarkConsistentHashRebuild/100targets_150vnodes-10 606 1945285 ns/op 1623333 B/op 210 allocs/op +BenchmarkConsistentHashSelectExcluding/50targets_150vnodes_exclude5-10 1000000 1091 ns/op 0 B/op 0 allocs/op +BenchmarkConsistentHashSelectExcluding/50targets_150vnodes_exclude10-10 1000000 1174 ns/op 0 B/op 0 allocs/op +BenchmarkConsistentHashSelectExcluding/100targets_150vnodes_exclude5-10 596529 2061 ns/op 0 B/op 0 allocs/op 
+BenchmarkLeastConnSelect/3targets-10 1000000000 0.3424 ns/op 0 B/op 0 allocs/op +BenchmarkLeastConnSelect/50targets-10 245764088 4.778 ns/op 0 B/op 0 allocs/op +BenchmarkLeastConnSelect/200targets-10 64952187 17.68 ns/op 0 B/op 0 allocs/op +BenchmarkIPHashSelect/3targets-10 253943542 4.684 ns/op 0 B/op 0 allocs/op +BenchmarkIPHashSelect/50targets-10 48979803 24.41 ns/op 0 B/op 0 allocs/op +BenchmarkIPHashSelect/200targets-10 12602810 87.73 ns/op 0 B/op 0 allocs/op +BenchmarkAllBalancers/RoundRobin-10 6389318 187.1 ns/op 0 B/op 0 allocs/op +BenchmarkAllBalancers/WeightedRoundRobin-10 5199241 234.9 ns/op 0 B/op 0 allocs/op +BenchmarkAllBalancers/LeastConnections-10 35844194 31.77 ns/op 0 B/op 0 allocs/op +BenchmarkAllBalancers/IPHash-10 6075333 190.8 ns/op 0 B/op 0 allocs/op +BenchmarkAllBalancers/ConsistentHash-10 41145982 28.54 ns/op 0 B/op 0 allocs/op + +=== logging.txt === + +=== lua.txt === +BenchmarkCoroutineCreation-10 1080924 1199 ns/op 272 B/op 4 allocs/op +BenchmarkLuaContextPool-10 13166972 82.07 ns/op 0 B/op 0 allocs/op +BenchmarkBytecodeCompilation-10 1000000 1060 ns/op 360 B/op 5 allocs/op +BenchmarkSharedDictSetGet-10 21471429 52.44 ns/op 0 B/op 0 allocs/op +BenchmarkTimerCallbackThroughput-10 450582 2337 ns/op 509 B/op 6 allocs/op +BenchmarkTimerCallbackWithLuaExecution-10 20617 56030 ns/op 53561 B/op 120 allocs/op +BenchmarkUpvalueDetection-10 30464 36669 ns/op 54112 B/op 149 allocs/op +BenchmarkTimerGracefulShutdown-10 148 7389734 ns/op 12962100 B/op 47107 allocs/op +BenchmarkLuaContextPoolReuse-10 24460232 56.96 ns/op 0 B/op 0 allocs/op +BenchmarkLuaCoroutinePoolThroughput-10 2039503 520.7 ns/op 272 B/op 4 allocs/op +BenchmarkLuaTablePool/NewTable_NoPool-10 864272 2798 ns/op 3368 B/op 16 allocs/op +BenchmarkLuaTablePool/SharedDict_AsPool-10 3215400 403.8 ns/op 128 B/op 3 allocs/op +BenchmarkLuaMiddlewareOverhead-10 10000 121057 ns/op 84627 B/op 351 allocs/op +BenchmarkLuaMiddlewareMultiPhase-10 6144 256399 ns/op 167706 B/op 700 allocs/op 
+BenchmarkLuaMiddlewareNgxExit-10 10000 135602 ns/op 86886 B/op 393 allocs/op +BenchmarkCosocket_Connect-10 1041 1095333 ns/op 6442 B/op 43 allocs/op +BenchmarkCosocket_SendReceive-10 24416 49961 ns/op 1040 B/op 2 allocs/op +PASS +ok rua.plus/lolly/internal/lua 23.109s + +=== matcher.txt === +BenchmarkRadixTreeFindLongestPrefix-10 19755723 60.87 ns/op 0 B/op 0 allocs/op +BenchmarkRadixTreeFindLongestPrefixParallel-10 122318263 10.27 ns/op 0 B/op 0 allocs/op +PASS +ok rua.plus/lolly/internal/matcher 3.460s + +=== middleware.txt === +PASS +ok rua.plus/lolly/internal/middleware 0.005s +BenchmarkAccessLogProcess-10 458827 2197 ns/op 1987 B/op 17 allocs/op +BenchmarkAccessLogProcessParallel-10 365294 3255 ns/op 1959 B/op 16 allocs/op +PASS +ok rua.plus/lolly/internal/middleware/accesslog 2.244s +BenchmarkBodyLimitProcess-10 1000000 1210 ns/op 1768 B/op 11 allocs/op +BenchmarkBodyLimitGetLimit-10 17057452 77.30 ns/op 0 B/op 0 allocs/op +BenchmarkBodyLimitPathMatching-10 7554831 162.0 ns/op 0 B/op 0 allocs/op +BenchmarkParseSize-10 29615168 40.98 ns/op 0 B/op 0 allocs/op +PASS +ok rua.plus/lolly/internal/middleware/bodylimit 4.980s +BenchmarkGzipCompress_1KB-10 55242 20921 ns/op 900 B/op 4 allocs/op +BenchmarkGzipCompress_10KB-10 41601 28889 ns/op 906 B/op 4 allocs/op +BenchmarkGzipCompress_100KB-10 10000 119901 ns/op 2012 B/op 5 allocs/op +BenchmarkBrotliCompress_1KB-10 33718 35480 ns/op 403 B/op 2 allocs/op +BenchmarkBrotliCompress_10KB-10 25119 46113 ns/op 433 B/op 2 allocs/op +BenchmarkCompressionPool-10 50222 21297 ns/op 901 B/op 4 allocs/op +BenchmarkCompressionMiddleware-10 35152 33261 ns/op 12016 B/op 17 allocs/op +BenchmarkCompressionMiddlewareNoCompress-10 421274 3153 ns/op 10324 B/op 6 allocs/op +BenchmarkIsCompressible-10 19387118 54.13 ns/op 0 B/op 0 allocs/op +BenchmarkCompressionLevelComparison/Level1-10 57207 21935 ns/op 894 B/op 4 allocs/op +BenchmarkCompressionLevelComparison/Level6-10 35198 32387 ns/op 911 B/op 4 allocs/op 
+BenchmarkCompressionLevelComparison/Level9-10 16784 72023 ns/op 948 B/op 4 allocs/op +BenchmarkCompressionMiddlewareParallel-10 170314 6881 ns/op 12700 B/op 17 allocs/op +BenchmarkGzipPool_GetPut-10 118927 10414 ns/op 22 B/op 1 allocs/op +BenchmarkGzipWriter_New-10 3289 504494 ns/op 814744 B/op 21 allocs/op +BenchmarkGzipWriter_Pool-10 59080 20177 ns/op 898 B/op 4 allocs/op +BenchmarkCompressionMiddleware_Pool-10 31626 39955 ns/op 14310 B/op 18 allocs/op +BenchmarkGzipCompress_Sizes/100B-10 141144 7318 ns/op 247 B/op 3 allocs/op + +=== proxy.txt === +BenchmarkCacheKeyHashValue_ZeroAlloc-10 11774799 85.10 ns/op 0 B/op 0 allocs/op +BenchmarkCacheKeyHash_WithAlloc-10 5119413 285.8 ns/op 48 B/op 1 allocs/op +BenchmarkCacheKeyHash_Compare/ZeroAlloc-10 12423754 92.28 ns/op 0 B/op 0 allocs/op +BenchmarkCacheKeyHash_Compare/WithAlloc-10 6716290 171.4 ns/op 32 B/op 1 allocs/op +BenchmarkConnectionPool_Normal-10 1 3100765506 ns/op 10000 B/op 96 allocs/op +BenchmarkConnectionPool_HighConcurrency-10 2 1550554418 ns/op 11152 B/op 86 allocs/op +BenchmarkConnectionPool_SmallBody-10 1 3000232015 ns/op 71792 B/op 81 allocs/op +BenchmarkConnectionPool_LargeBody-10 2 1948867432 ns/op 9616 B/op 76 allocs/op +BenchmarkConnectionPool_MultiTarget-10 1 1200480850 ns/op 85392 B/op 158 allocs/op +BenchmarkHostClient_AcquireRelease-10 1 3000973314 ns/op 8944 B/op 61 allocs/op +BenchmarkProxyForward/concurrency1-10 2 1500263114 ns/op 41692 B/op 85 allocs/op +BenchmarkProxyForward/concurrency10-10 2 1500346107 ns/op 11280 B/op 82 allocs/op +BenchmarkProxyForward/concurrency100-10 2 1500509108 ns/op 41660 B/op 85 allocs/op +BenchmarkProxyForwardSmallRequest-10 2 1500492839 ns/op 11344 B/op 82 allocs/op +BenchmarkProxyForwardLargeRequest-10 2 1500835596 ns/op 46780 B/op 97 allocs/op +BenchmarkProxyForwardMultipleTargets-10 2 1500471841 ns/op 7704 B/op 72 allocs/op +BenchmarkProxyHostClient-10 2 1981847248 ns/op 37060 B/op 40 allocs/op +BenchmarkProxyHostClientParallel-10 2 1500465370 ns/op 4112 
B/op 42 allocs/op +BenchmarkProxyWithMockBackend-10 96135 12150 ns/op 3065 B/op 42 allocs/op +BenchmarkProxyLoadBalancerSelection/round_robin_3-10 21344373 59.75 ns/op 16 B/op 1 allocs/op +BenchmarkProxyLoadBalancerSelection/round_robin_50-10 13515140 86.74 ns/op 16 B/op 1 allocs/op +BenchmarkProxyLoadBalancerSelection/weighted_round_robin_3-10 18620368 61.38 ns/op 16 B/op 1 allocs/op +BenchmarkProxyLoadBalancerSelection/least_conn_3-10 20915076 56.56 ns/op 16 B/op 1 allocs/op +BenchmarkProxyLoadBalancerSelection/ip_hash_3-10 12006486 94.68 ns/op 48 B/op 3 allocs/op +BenchmarkProxyHeaderProcessing-10 386919 2660 ns/op 2930 B/op 35 allocs/op +BenchmarkBuildCacheKeyHash/buildCacheKeyHash_with_string-10 17636347 63.94 ns/op 24 B/op 1 allocs/op +BenchmarkBuildCacheKeyHash/buildCacheKeyHashValue_direct-10 38463036 31.97 ns/op 0 B/op 0 allocs/op +BenchmarkProxyObjectPoolGetRelease/UpstreamTiming_Pooled-10 29078 40040 ns/op 0 B/op 0 allocs/op +BenchmarkProxyObjectPoolGetRelease/VariableContext_Pooled-10 15730250 76.39 ns/op 8 B/op 1 allocs/op +BenchmarkProxyResponsePoolParallel-10 1 3000834054 ns/op 79184 B/op 133 allocs/op + +=== resolver.txt === +BenchmarkDNSResolverLookupWithCache-10 6284577 236.4 ns/op 48 B/op 1 allocs/op +BenchmarkDNSResolverConcurrent-10 6265792 206.6 ns/op 48 B/op 1 allocs/op +BenchmarkDNSResolverCacheExpiry-10 2145366 548.2 ns/op 144 B/op 3 allocs/op +BenchmarkDNSResolverCacheWriteLock-10 7186472 167.5 ns/op 32 B/op 2 allocs/op +BenchmarkDNSResolverMixedWorkload-10 3976573 322.6 ns/op 64 B/op 2 allocs/op +BenchmarkDNSCacheEntryRLock-10 100000000 22.01 ns/op 0 B/op 0 allocs/op +BenchmarkDNSCacheEntryRWLock-10 5163608 241.5 ns/op 175 B/op 5 allocs/op +PASS +ok rua.plus/lolly/internal/resolver 11.167s + +=== server.txt === +BenchmarkMiddlewareNewChainApply-10 7164360 154.6 ns/op 48 B/op 3 allocs/op +BenchmarkMiddlewareProcessChain-10 1000000000 1.098 ns/op 0 B/op 0 allocs/op +BenchmarkMiddlewareChainExecution-10 182316565 6.727 ns/op 0 B/op 0 
allocs/op +BenchmarkMiddlewareChainExecutionWithResponse-10 1052726 1024 ns/op 1568 B/op 3 allocs/op +BenchmarkMiddlewareEmptyChain-10 40100878 498.1 ns/op 12 B/op 0 allocs/op +BenchmarkMiddlewareSingleMiddleware-10 88622024 23.34 ns/op 10 B/op 0 allocs/op +BenchmarkGoroutinePoolSubmit-10 70090189 17.06 ns/op 0 B/op 0 allocs/op +BenchmarkGoroutinePoolParallel-10 45083467 36.31 ns/op 0 B/op 0 allocs/op +BenchmarkGoroutinePoolSubmit_BlockingPath-10 133075401 10.14 ns/op 0 B/op 0 allocs/op +BenchmarkGoroutinePoolQueueFull-10 126751026 13.89 ns/op 0 B/op 0 allocs/op +BenchmarkGoroutinePoolWorkerRecycle-10 15 71130566 ns/op 17697 B/op 220 allocs/op +BenchmarkGoroutinePoolSubmitWithWork/Workers10-10 5490272 223.2 ns/op 0 B/op 0 allocs/op +BenchmarkGoroutinePoolSubmitWithWork/Workers100-10 5219361 225.5 ns/op 0 B/op 0 allocs/op +BenchmarkGoroutinePoolSubmitWithWork/Workers1000-10 2774235 462.6 ns/op 0 B/op 0 allocs/op +BenchmarkGoroutinePoolMinWorkers/WithMinWorkers-10 85318759 16.51 ns/op 0 B/op 0 allocs/op +BenchmarkGoroutinePoolMinWorkers/NoMinWorkers-10 81247957 17.23 ns/op 0 B/op 0 allocs/op +BenchmarkGoroutinePoolObjectPool/PoolTask_Submit-10 60380559 16.60 ns/op 0 B/op 0 allocs/op +BenchmarkGoroutinePoolObjectPool/PoolTask_Reuse_NoClosure-10 62161117 16.80 ns/op 0 B/op 0 allocs/op +BenchmarkPoolMemoryReuse/WithPool_GetPut-10 93926037 12.02 ns/op 0 B/op 0 allocs/op +BenchmarkPoolMemoryReuse/WithoutPool_Alloc-10 10268364 119.9 ns/op 256 B/op 1 allocs/op +PASS +ok rua.plus/lolly/internal/server 45.979s + +=== stream.txt === +BenchmarkStreamFilterHealthy/3_healthy-10 51941268 24.31 ns/op 0 B/op 0 allocs/op +BenchmarkStreamFilterHealthy/10_healthy_80-10 52304758 37.76 ns/op 0 B/op 0 allocs/op +BenchmarkStreamFilterHealthy/50_healthy_50-10 51739732 39.39 ns/op 0 B/op 0 allocs/op +BenchmarkStreamFilterHealthy/100_healthy_80-10 49306483 41.64 ns/op 0 B/op 0 allocs/op +BenchmarkStreamFilterHealthyPreallocated-10 100000000 10.45 ns/op 0 B/op 0 allocs/op 
+BenchmarkUDPSessionAllocations/no_pool_65k-10 33656 37885 ns/op 65536 B/op 1 allocs/op +BenchmarkUDPSessionAllocations/sync_pool_65k-10 10833096 93.02 ns/op 24 B/op 1 allocs/op +BenchmarkUDPSessionAllocations/no_pool_16k-10 269760 4646 ns/op 16384 B/op 1 allocs/op +BenchmarkUDPSessionAllocations/sync_pool_16k-10 31780728 37.38 ns/op 24 B/op 1 allocs/op +BenchmarkUDPSessionGetOrCreate-10 15100776 70.40 ns/op 32 B/op 3 allocs/op +BenchmarkUDPSessionGetOnly-10 19103670 66.96 ns/op 32 B/op 3 allocs/op +BenchmarkStreamBalancerSelect/round_robin_3-10 47924217 24.54 ns/op 0 B/op 0 allocs/op +BenchmarkStreamBalancerSelect/round_robin_10-10 56421152 22.36 ns/op 0 B/op 0 allocs/op +BenchmarkStreamBalancerSelect/round_robin_50-10 47163234 22.13 ns/op 0 B/op 0 allocs/op +BenchmarkStreamBalancerSelect/weighted_round_robin_3-10 37397344 32.71 ns/op 0 B/op 0 allocs/op +BenchmarkStreamBalancerSelect/weighted_round_robin_10-10 40612486 29.41 ns/op 0 B/op 0 allocs/op +BenchmarkStreamBalancerSelect/least_conn_3-10 1000000000 0.7223 ns/op 0 B/op 0 allocs/op +BenchmarkStreamBalancerSelect/least_conn_10-10 826034833 1.427 ns/op 0 B/op 0 allocs/op +BenchmarkStreamBalancerSelect/ip_hash_3-10 70855179 22.03 ns/op 16 B/op 1 allocs/op +BenchmarkStreamBalancerSelect/ip_hash_10-10 93524262 18.33 ns/op 16 B/op 1 allocs/op +BenchmarkStreamRoundRobinWithUnhealthy/3_1_unhealthy-10 68404255 15.29 ns/op 0 B/op 0 allocs/op +BenchmarkStreamRoundRobinWithUnhealthy/10_3_unhealthy-10 54019622 22.22 ns/op 0 B/op 0 allocs/op +BenchmarkStreamRoundRobinWithUnhealthy/50_20_unhealthy-10 22136433 55.78 ns/op 0 B/op 0 allocs/op +BenchmarkStreamLeastConnWithVaryingConns/uniform-10 426217518 2.781 ns/op 0 B/op 0 allocs/op +BenchmarkStreamLeastConnWithVaryingConns/varying-10 434543780 2.777 ns/op 0 B/op 0 allocs/op +BenchmarkStreamLeastConnWithVaryingConns/extreme-10 411333520 2.789 ns/op 0 B/op 0 allocs/op +BenchmarkStreamWeightedRoundRobinDistribution/equal-10 64135672 18.20 ns/op 0 B/op 0 allocs/op 
+BenchmarkStreamWeightedRoundRobinDistribution/linear-10 66114645 19.27 ns/op 0 B/op 0 allocs/op +BenchmarkStreamWeightedRoundRobinDistribution/heavy-10 56737513 19.44 ns/op 0 B/op 0 allocs/op +BenchmarkStreamWeightedRoundRobinDistribution/exponential-10 58088670 21.10 ns/op 0 B/op 0 allocs/op + +=== summary.txt === +BenchmarkFileCacheGet/Size100-10 21736779 52.62 ns/op 16 B/op 1 allocs/op +BenchmarkFileCacheGet/Size1000-10 22091924 50.41 ns/op 16 B/op 1 allocs/op +BenchmarkFileCacheGet/Size10000-10 24389118 44.63 ns/op 21 B/op 1 allocs/op +BenchmarkFileCacheSet/Size100-10 2482664 513.3 ns/op 120 B/op 5 allocs/op +BenchmarkFileCacheSet/Size1000-10 2469159 482.6 ns/op 120 B/op 5 allocs/op +BenchmarkFileCacheSet/Size10000-10 2264976 713.2 ns/op 120 B/op 5 allocs/op +BenchmarkFileCacheSet_Pooled/Size100-10 2663748 449.6 ns/op 120 B/op 5 allocs/op +BenchmarkFileCacheSet_Pooled/Size1000-10 1215387 895.9 ns/op 120 B/op 5 allocs/op +BenchmarkFileCacheSet_Pooled/Size10000-10 1000000 3025 ns/op 120 B/op 5 allocs/op +BenchmarkFileCacheSetNoEviction-10 2499050 912.4 ns/op 112 B/op 5 allocs/op +BenchmarkFileCacheConcurrent/Size100-10 8061500 139.6 ns/op 26 B/op 1 allocs/op +BenchmarkFileCacheConcurrent/Size1000-10 5312031 222.8 ns/op 31 B/op 2 allocs/op +BenchmarkFileCacheConcurrent/Size10000-10 5357617 227.2 ns/op 31 B/op 2 allocs/op +BenchmarkFileCacheGetOnly-10 15445342 78.82 ns/op 29 B/op 1 allocs/op +BenchmarkFileCacheSizeEviction-10 1275866 942.1 ns/op 1121 B/op 5 allocs/op +BenchmarkFileCacheLRUTouch-10 10095694 115.6 ns/op 16 B/op 1 allocs/op +BenchmarkProxyCacheGet-10 29701370 125.0 ns/op 13 B/op 1 allocs/op +BenchmarkProxyCacheSet-10 717621 1754 ns/op 251 B/op 3 allocs/op +BenchmarkProxyCacheConcurrent-10 4811952 276.0 ns/op 69 B/op 2 allocs/op +BenchmarkFileCacheSetAllocation_New-10 2149519 522.3 ns/op 97 B/op 4 allocs/op +BenchmarkFileCacheSetAllocation_Update-10 3871534 348.9 ns/op 45 B/op 2 allocs/op +BenchmarkFileCacheSetAllocation_Eviction-10 2248743 552.0 
ns/op 96 B/op 4 allocs/op +BenchmarkFileCacheSetAllocation_EvictionWithPool-10 2310462 515.7 ns/op 96 B/op 4 allocs/op +BenchmarkFileCacheSetAllocation_MemoryLimit-10 2186145 563.2 ns/op 96 B/op 4 allocs/op +BenchmarkFileCacheSetAllocation_Concurrent-10 1934901 654.7 ns/op 88 B/op 3 allocs/op +BenchmarkFileCacheSetAllocation_ConcurrentEviction-10 2139834 609.0 ns/op 96 B/op 3 allocs/op +BenchmarkFileCacheEntryPool_GetPut-10 85020030 12.46 ns/op 0 B/op 0 allocs/op +BenchmarkFileCacheLRUList_PushFront-10 6249896 206.8 ns/op 232 B/op 4 allocs/op +PASS +ok rua.plus/lolly/internal/cache 45.363s + +=== utils.txt === +BenchmarkExtractClientIP/X-Forwarded-For_single_IP-10 15637682 76.06 ns/op 32 B/op 2 allocs/op +BenchmarkExtractClientIP/X-Forwarded-For_multiple_IPs-10 11151398 107.4 ns/op 96 B/op 2 allocs/op +BenchmarkExtractClientIP/X-Real-IP_only-10 16888720 71.44 ns/op 16 B/op 1 allocs/op +BenchmarkExtractClientIP/RemoteAddr_fallback-10 14076492 85.15 ns/op 8 B/op 1 allocs/op +BenchmarkExtractClientIPNet/X-Forwarded-For_single_IP-10 9092592 133.8 ns/op 48 B/op 3 allocs/op +BenchmarkExtractClientIPNet/X-Real-IP_only-10 9696522 125.2 ns/op 32 B/op 2 allocs/op +BenchmarkExtractClientIPNet/RemoteAddr_fallback-10 22064487 52.76 ns/op 0 B/op 0 allocs/op +BenchmarkStripPort/IPv4_with_port-10 298226187 3.951 ns/op 0 B/op 0 allocs/op +BenchmarkStripPort/IPv6_with_port-10 286273158 4.211 ns/op 0 B/op 0 allocs/op +BenchmarkStripPort/no_port-10 252196581 4.841 ns/op 0 B/op 0 allocs/op +BenchmarkStripPort/empty_string-10 1000000000 0.4960 ns/op 0 B/op 0 allocs/op +PASS +ok rua.plus/lolly/internal/netutil 12.508s +BenchmarkLoadCACertPool-10 74826 15530 ns/op 6448 B/op 54 allocs/op +BenchmarkGenerateTicketKey-10 11567258 101.1 ns/op 32 B/op 1 allocs/op +BenchmarkSessionTicketManager_GetKeys-10 10594251 116.0 ns/op 176 B/op 4 allocs/op +BenchmarkSessionTicketManager_RotateKey-10 8896942 135.0 ns/op 80 B/op 1 allocs/op +BenchmarkTLSHandshake-10 1578 758396 ns/op 117043 B/op 844 
allocs/op
+BenchmarkTLSHandshake_TLS13Only-10 1525 752984 ns/op 116542 B/op 839 allocs/op
+BenchmarkTLSCertificateLoad-10 27265 44040 ns/op 8637 B/op 121 allocs/op
+BenchmarkTLSCertificateLoad_InMemory-10 44395 27435 ns/op 6796 B/op 111 allocs/op
+BenchmarkTLSCertificateLoad_Parallel-10 73044 16299 ns/op 8681 B/op 121 allocs/op
+BenchmarkTLSRenegotiation-10 1742 651865 ns/op 41879 B/op 442 allocs/op
+BenchmarkOCSPStapling-10 49240950 23.65 ns/op 0 B/op 0 allocs/op
+BenchmarkOCSPStapling_Miss-10 49655992 23.88 ns/op 0 B/op 0 allocs/op
+BenchmarkSessionTicketManager_ApplyToTLSConfig-10 948468 1204 ns/op 928 B/op 7 allocs/op
+BenchmarkCipherSuiteParsing-10 13597928 86.37 ns/op 16 B/op 1 allocs/op
+BenchmarkTLSVersionsParsing-10 235120939 5.090 ns/op 0 B/op 0 allocs/op
+PASS
+ok rua.plus/lolly/internal/ssl 18.173s
+
diff --git a/docs/superpowers/plans/2026-06-03-eliminate-code-redundancy.md b/docs/superpowers/plans/2026-06-03-eliminate-code-redundancy.md
new file mode 100644
index 0000000..260409a
--- /dev/null
+++ b/docs/superpowers/plans/2026-06-03-eliminate-code-redundancy.md
@@ -0,0 +1,929 @@
+# Implementation Plan: Eliminating Code Redundancy
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Eliminate code redundancy in the lolly project: delete 8 pieces of dead code, refactor 2 duplicated patterns in source files, and extract test helper functions to reduce 184 duplicated config literals.
+
+**Architecture:** Implement in three phases: Phase 1 deletes unused dead code (zero risk); Phase 2 extracts helper functions for route registration and DEBUG logging (low-risk refactoring); Phase 3 creates a test helper package and migrates duplicated code to it (incremental replacement).
+
+**Tech Stack:** Go 1.22+, golangci-lint, dupl/unused linters
+
+---
+
+## File Structure
+
+**Create:**
+- `internal/testutil/proxy.go` - test helper functions (ProxyConfig and Target creation)
+
+**Modify:**
+- `internal/config/validate.go` - delete the `validateStatic()` function
+- `internal/config/validate_test.go` - delete the `TestValidateStatic` test
+- `internal/http2/server.go` - delete `connectionPool.get()` and `connectionPool.count()`
+- `internal/middleware/bodylimit/bodylimit.go` - delete the `formatSize()` function
+- `internal/middleware/bodylimit/bodylimit_test.go` - delete the `TestFormatSize` test
+- `internal/middleware/security/headers.go` - delete 3 security headers functions
+- `internal/middleware/security/headers_test.go` - delete the 3 corresponding tests
+- `internal/ssl/ocsp.go` - delete the `extractCertificates()` function
+- `internal/ssl/ocsp_test.go` - delete the 2 corresponding tests
+- `internal/server/router.go` - extract the `registerRoute` helper function
+- `internal/proxy/proxy.go` - extract the `proxyDebugLog` helper function
+
+---
+
+## Phase 1: Dead Code Removal
+
+### Task 1: Delete the `validateStatic` function and its test
+
+**Files:**
+- Modify: `internal/config/validate.go:475-484`
+- Modify: `internal/config/validate_test.go:752-809`
+
+- [ ] **Step 1: Delete the `validateStatic` function**
+
+Delete lines 475-484 of `internal/config/validate.go`:
+
+```go
+// validateStatic validates the static file configuration.
+//
+// Parameters:
+//   - s: the static file configuration object
+//
+// Returns:
+//   - error: an error describing the failure if validation fails, nil on success
+func validateStatic(s *StaticConfig) error {
+	// Validate the path when the static file root directory is non-empty
+	if s.Root != "" {
+		// Path safety check: ".." is not allowed
+		if err := ValidatePathTraversal(s.Root, "根目录路径"); err != nil {
+			return err
+		}
+	}
+	return nil
+}
+```
+
+- [ ] **Step 2: Delete the corresponding unit test**
+
+Delete the `TestValidateStatic` function at lines 752-809 of `internal/config/validate_test.go`:
+
+```go
+func TestValidateStatic(t *testing.T) {
+	t.Parallel()
+	// TestValidateStatic tests static file configuration validation.
+	tests := []struct {
+		name    string
+		errMsg  string
+		config  StaticConfig
+		wantErr bool
+	}{
+		{
+			name:    "空配置有效",
+			config:  StaticConfig{},
+			wantErr: false,
+		},
+		{
+			name: "有效根目录",
+			config: StaticConfig{
+				Root: "/var/www/html",
+			},
+			wantErr: false,
+		},
+		{
+			name: "根目录含..路径遍历",
+			config: StaticConfig{
+				Root: "/var/www/../etc",
+			},
+			wantErr: true,
+			errMsg:  "根目录路径不能包含 '..'",
+		},
+		{
+			name: "根目录含多个..",
+			config: StaticConfig{
+				Root: "/var/../www/../html",
+			},
+			wantErr: true,
+			errMsg:  "根目录路径不能包含 '..'",
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			err := validateStatic(&tt.config)
+			if tt.wantErr {
+				if err == nil {
+					t.Errorf("validateStatic() 期望返回错误,但返回 nil")
+					return
+				}
+				if tt.errMsg != "" && !strings.Contains(err.Error(), tt.errMsg) {
+					t.Errorf("validateStatic() 错误消息不匹配,期望包含 %q,实际 %q", tt.errMsg, err.Error())
+				}
+			} else {
+				if err != nil {
+					t.Errorf("validateStatic() 期望返回 nil,但返回错误: %v", err)
+				}
+			}
+		})
+	}
+}
+```
+
+- [ ] **Step 3: Run the tests and confirm they pass**
+
+Run: `go test ./internal/config/... -run TestValidateStatic -v`
+Expected: no matching test (it has been deleted)
+
+Run: `go test ./internal/config/... -v`
+Expected: PASS
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add internal/config/validate.go internal/config/validate_test.go
+git commit -m "refactor: remove unused validateStatic function and its test"
+```
+
+---
+
+### Task 2: Delete the unused `connectionPool` methods
+
+**Files:**
+- Modify: `internal/http2/server.go:575-587`
+
+- [ ] **Step 1: Delete the `get` and `count` methods**
+
+Delete lines 575-587 of `internal/http2/server.go`:
+
+```go
+// get returns the connections.
+func (p *connectionPool) get(key string) []net.Conn {
+	p.mu.RLock()
+	defer p.mu.RUnlock()
+	return p.conns[key]
+}
+
+// count returns the number of connections.
+func (p *connectionPool) count(key string) int {
+	p.mu.RLock()
+	defer p.mu.RUnlock()
+	return len(p.conns[key])
+}
+```
+
+- [ ] **Step 2: Run the tests and confirm they pass**
+
+Run: `go test ./internal/http2/... -v`
+Expected: PASS
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add internal/http2/server.go
+git commit -m "refactor: remove unused connectionPool.get and connectionPool.count methods"
+```
+
+---
+
+### Task 3: Delete the `bodylimit.formatSize` function and its test
+
+**Files:**
+- Modify: `internal/middleware/bodylimit/bodylimit.go:279-305`
+- Modify: `internal/middleware/bodylimit/bodylimit_test.go:36-72`
+
+- [ ] **Step 1: Delete the `formatSize` function**
+
+Delete lines 279-305 of `internal/middleware/bodylimit/bodylimit.go`:
+
+```go
+// formatSize formats a byte count as a human-readable string.
+//
+// It automatically picks a suitable unit (b, kb, mb, gb) based on the size.
+//
+// Parameters:
+//   - size: the number of bytes
+//
+// Returns:
+//   - string: the formatted string, e.g. "1.00mb" or "10.00kb"
+func formatSize(size int64) string {
+	const (
+		KB = 1024
+		MB = 1024 * KB
+		GB = 1024 * MB
+	)
+
+	switch {
+	case size >= GB:
+		return fmt.Sprintf("%.2fgb", float64(size)/GB)
+	case size >= MB:
+		return fmt.Sprintf("%.2fmb", float64(size)/MB)
+	case size >= KB:
+		return fmt.Sprintf("%.2fkb", float64(size)/KB)
+	default:
+		return fmt.Sprintf("%db", size)
+	}
+}
+```
+
+- [ ] **Step 2: Delete the corresponding unit test**
+
+Delete the `TestFormatSize` function at lines 36-72 of `internal/middleware/bodylimit/bodylimit_test.go`:
+
+```go
+func TestFormatSize(t *testing.T) {
+	tests := []struct {
+		input    int64
+		expected string
+	}{
+		{512, "512b"},
+		{1024, "1.00kb"},
+		{1024 * 1024, "1.00mb"},
+		{1024 * 1024 * 1024, "1.00gb"},
+		{1536, "1.50kb"},
+	}
+
+	for _, tt := range tests {
+		t.Run(formatSize(tt.input), func(t *testing.T) {
+			got := formatSize(tt.input)
+			if got != tt.expected {
+				t.Errorf("formatSize(%d) = %s, want %s", tt.input, got, tt.expected)
+			}
+		})
+	}
+}
+```
+
+- [ ] **Step 3: Run the tests and confirm they pass**
+
+Run: `go test ./internal/middleware/bodylimit/... -v`
+Expected: PASS
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add internal/middleware/bodylimit/bodylimit.go internal/middleware/bodylimit/bodylimit_test.go
+git commit -m "refactor: remove unused bodylimit.formatSize function and test"
+```
+
+---
+
+### Task 4: Delete the unused security headers functions and their tests
+
+**Files:**
+- Modify: `internal/middleware/security/headers.go:291-331`
+- Modify: `internal/middleware/security/headers_test.go:184-215`
+
+- [ ] **Step 1: Delete the 3 security headers functions**
+
+Delete lines 291-331 of `internal/middleware/security/headers.go`:
+
+```go
+// defaultSecurityHeaders returns the safe default security header configuration.
+//
+// Returns:
+//   - *config.SecurityHeaders: a configuration object populated with safe defaults
+func defaultSecurityHeaders() *config.SecurityHeaders {
+	return &config.SecurityHeaders{
+		XFrameOptions:       "DENY",
+		XContentTypeOptions: "nosniff",
+		ReferrerPolicy:      "strict-origin-when-cross-origin",
+	}
+}
+
+// strictSecurityHeaders returns the strict-mode security header configuration.
+//
+// Intended for applications with high security requirements; it includes a strict CSP and permissions policy.
+//
+// Returns:
+//   - *config.SecurityHeaders: a configuration object populated with strict security values
+func strictSecurityHeaders() *config.SecurityHeaders {
+	return &config.SecurityHeaders{
+		XFrameOptions:         "DENY",
+		XContentTypeOptions:   "nosniff",
+		ContentSecurityPolicy: "default-src 'self'; script-src 'self'; style-src 'self'; img-src 'self'; font-src 'self'; connect-src 'self'; frame-ancestors 'none'",
+		ReferrerPolicy:        "no-referrer",
+		PermissionsPolicy:     "accelerometer=(), camera=(), geolocation=(), gyroscope=(), magnetometer=(), microphone=(), payment=(), usb=()",
+	}
+}
+
+// developmentSecurityHeaders returns a relaxed security header configuration for development environments.
+//
+// Warning: do not use this configuration in production; it is less secure.
+//
+// Returns:
+//   - *config.SecurityHeaders: a configuration object with relaxed security values
+func developmentSecurityHeaders() *config.SecurityHeaders {
+	return &config.SecurityHeaders{
+		XFrameOptions:       "SAMEORIGIN",
+		XContentTypeOptions: "nosniff",
+		ReferrerPolicy:      "strict-origin-when-cross-origin",
+	}
+}
+```
+
+- [ ] **Step 2: Delete the corresponding unit tests**
+
+Delete lines 184-215 of `internal/middleware/security/headers_test.go`:
+
+```go
+func TestDefaultSecurityHeaders(t *testing.T) {
+	cfg := defaultSecurityHeaders()
+
+	if cfg.XFrameOptions != "DENY" {
+		t.Errorf("Expected default X-Frame-Options 'DENY', got %s", cfg.XFrameOptions)
+	}
+	if cfg.XContentTypeOptions != "nosniff" {
+		t.Errorf("Expected default X-Content-Type-Options 'nosniff', got %s", cfg.XContentTypeOptions)
+	}
+}
+
+func TestStrictSecurityHeaders(t *testing.T) {
+	cfg := strictSecurityHeaders()
+
+	if cfg.XFrameOptions != "DENY" {
+		t.Errorf("Expected X-Frame-Options 'DENY', got %s", cfg.XFrameOptions)
+	}
+	if cfg.ReferrerPolicy != "no-referrer" {
+		t.Errorf("Expected Referrer-Policy 'no-referrer', got %s", cfg.ReferrerPolicy)
+	}
+	if cfg.ContentSecurityPolicy == "" {
+		t.Error("Expected non-empty CSP for strict config")
+	}
+}
+
+func TestDevelopmentSecurityHeaders(t *testing.T) {
+	cfg := developmentSecurityHeaders()
+
+	if cfg.XFrameOptions != "SAMEORIGIN" {
+		t.Errorf("Expected X-Frame-Options 'SAMEORIGIN' for dev, got %s", cfg.XFrameOptions)
+	}
+}
+```
+
+- [ ] **Step 3: Run the tests to confirm they pass**
+
+Run: `go test ./internal/middleware/security/... -v`
+Expected: PASS
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add internal/middleware/security/headers.go internal/middleware/security/headers_test.go
+git commit -m "refactor: remove unused security header preset functions and tests"
+```
+
+---
+
+### Task 5: Remove the `extractCertificates` function and its tests
+
+**Files:**
+- Modify: `internal/ssl/ocsp.go:482-514`
+- Modify: `internal/ssl/ocsp_test.go:311-335`
+
+- [ ] **Step 1: Delete the `extractCertificates` function**
+
+Delete lines 482-514 of `internal/ssl/ocsp.go`:
+
+```go
+// extractCertificates parses PEM data and returns the list of certificates.
+//
+// Parameters:
+//   - pemData: PEM-encoded certificate data
+//
+// Returns:
+//   - []*x509.Certificate: the parsed certificates
+//   - error: non-nil when parsing fails
+func extractCertificates(pemData []byte) ([]*x509.Certificate, error) {
+	var certs []*x509.Certificate
+	rest := pemData
+
+	for {
+		block, remaining := pem.Decode(rest)
+		if block == nil {
+			break
+		}
+		if block.Type == "CERTIFICATE" {
+			cert, err := x509.ParseCertificate(block.Bytes)
+			if err != nil {
+				return nil, fmt.Errorf("failed to parse certificate: %w", err)
+			}
+			certs = append(certs, cert)
+		}
+		rest = remaining
+	}
+
+	if len(certs) == 0 {
+		return nil, errors.New("no certificates found in PEM data")
+	}
+
+	return certs, nil
+}
+```
+
+- [ ] **Step 2: Delete the corresponding unit tests**
+
+Delete lines 311-335 of `internal/ssl/ocsp_test.go`:
+
+```go
+func TestExtractCertificates(t *testing.T) {
+	// Create valid PEM data
+	certPEM, _ := generateTestCertWithOCSP(t, nil)
+
+	certs, err := extractCertificates(certPEM)
+	if err != nil {
+		t.Fatalf("extractCertificates() failed: %v", err)
+	}
+
+	if len(certs) == 0 {
+		t.Error("Expected at least one certificate")
+	}
+}
+
+func TestExtractCertificatesInvalidPEM(t *testing.T) {
+	invalidPEM := []byte("not valid pem data")
+
+	certs, err := extractCertificates(invalidPEM)
+	if err == nil {
+		t.Error("Expected error for invalid PEM data")
+	}
+	if certs != nil {
+		t.Error("Expected nil certs for invalid PEM data")
+	}
+}
+```
+
+- [ ] **Step 3: Run the tests to confirm they pass**
+
+Run: `go test ./internal/ssl/... -v`
+Expected: PASS
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add internal/ssl/ocsp.go internal/ssl/ocsp_test.go
+git commit -m "refactor: remove unused extractCertificates function and tests"
+```
+
+---
+
+## Phase 2: Refactoring repeated patterns in source files
+
+### Task 6: Extract a route registration helper
+
+**Files:**
+- Modify: `internal/server/router.go:84-124`, `internal/server/router.go:190-220`, and `internal/server/router.go:390-420`
+
+- [ ] **Step 1: Add the `registerRoute` helper**
+
+Add the following before the `configureProxyRoutes` function in `internal/server/router.go`:
+
+```go
+// registerRoute registers a route according to its location type.
+func (s *Server) registerRoute(
+	locType string,
+	path string,
+	handler fasthttp.RequestHandler,
+	internal bool,
+	source string,
+) error {
+	var err error
+	switch locType {
+	case matcher.LocationTypeExact:
+		err = s.locationEngine.AddExact(path, handler, internal)
+	case matcher.LocationTypePrefixPriority:
+		err = s.locationEngine.AddPrefixPriority(path, handler, internal)
+	case matcher.LocationTypeRegex:
+		err = s.locationEngine.AddRegex(path, handler, false, internal)
+	case matcher.LocationTypeRegexCaseless:
+		err = s.locationEngine.AddRegex(path, handler, true, internal)
+	case matcher.LocationTypeNamed:
+		err = s.locationEngine.AddNamed(path, handler)
+	default:
+		err = s.locationEngine.AddPrefix(path, handler, internal)
+	}
+	if err != nil {
+		return s.handleRegistrationError(source, path, err)
+	}
+	return nil
+}
+```
+
+- [ ] **Step 2: Refactor proxy route registration**
+
+Replace the switch statement at lines 84-124 of `internal/server/router.go` with:
+
+```go
+	switch locType {
+	case matcher.LocationTypeExact:
+		if err := s.registerRoute(locType, proxyCfg.Path, p.ServeHTTP, proxyCfg.Internal, "proxy"); err != nil {
+			return err
+		}
+	case matcher.LocationTypePrefixPriority:
+		if err := s.registerRoute(locType, proxyCfg.Path, p.ServeHTTP, proxyCfg.Internal, "proxy"); err != nil {
+			return err
+		}
+	case matcher.LocationTypeRegex, matcher.LocationTypeRegexCaseless:
+		// Case sensitivity is derived from locType inside registerRoute.
+		if err := s.registerRoute(locType, proxyCfg.Path, p.ServeHTTP, proxyCfg.Internal, "proxy"); err != nil {
+			return err
+		}
+	case matcher.LocationTypeNamed:
+		if proxyCfg.LocationName != "" {
+			if err := s.registerRoute(locType, "@"+proxyCfg.LocationName, p.ServeHTTP, false, "proxy"); err != nil {
+				return err
+			}
+		}
+	case matcher.LocationTypePrefix:
+		if err := s.registerRoute(locType, proxyCfg.Path, p.ServeHTTP, proxyCfg.Internal, "proxy"); err != nil {
+			return err
+		}
+	default:
+		if err := s.registerRoute(locType, proxyCfg.Path, p.ServeHTTP, proxyCfg.Internal, "proxy"); err != nil {
+			return err
+		}
+	}
+```
+
+- [ ] **Step 3: Refactor static route registration**
+
+Replace the similar code at lines 190-220 of `internal/server/router.go` with `registerRoute` calls.
+
+- [ ] **Step 4: Refactor lua route registration**
+
+Replace the similar code at lines 390-420 of `internal/server/router.go` with `registerRoute` calls.
+
+- [ ] **Step 5: Run the tests to confirm they pass**
+
+Run: `go test ./internal/server/... -v`
+Expected: PASS
+
+- [ ] **Step 6: Commit**
+
+```bash
+git add internal/server/router.go
+git commit -m "refactor: extract registerRoute helper to reduce repetition"
+```
+
+---
+
+### Task 7: Extract a DEBUG logging helper
+
+**Files:**
+- Modify: `internal/proxy/proxy.go:470-476` and similar locations
+
+- [ ] **Step 1: Add the `proxyDebugLog` helper**
+
+Add the following before the `ServeHTTP` method in `internal/proxy/proxy.go`:
+
+```go
+// proxyDebugLog writes a proxy log entry at DEBUG level.
+func proxyDebugLog(msg string, kv ...interface{}) {
+	if !logging.Debug().Enabled() {
+		return
+	}
+	event := logging.Debug()
+	for i := 0; i < len(kv)-1; i += 2 {
+		key, ok := kv[i].(string)
+		if !ok {
+			continue
+		}
+		switch v := kv[i+1].(type) {
+		case string:
+			event = event.Str(key, v)
+		case int:
+			event = event.Int(key, v)
+		case bool:
+			event = event.Bool(key, v)
+		}
+	}
+	event.Msg(msg)
+}
+```
+
+- [ ] **Step 2: Replace the first DEBUG log**
+
+Replace lines 470-476:
+```go
+	if logging.Debug().Enabled() {
+		logging.Debug().
+			Str("path", b2s(ctx.Path())).
+			Str("host", b2s(ctx.Host())).
+			Str("method", b2s(ctx.Method())).
+			Msg("[PROXY] 收到请求")
+	}
+```
+with:
+```go
+	proxyDebugLog("[PROXY] 收到请求",
+		"path", b2s(ctx.Path()),
+		"host", b2s(ctx.Host()),
+		"method", b2s(ctx.Method()),
+	)
+```
+
+- [ ] **Step 3: Replace the remaining 4 DEBUG logs**
+
+Repeat the pattern from Step 2 to replace the DEBUG logs at lines 536-540, 555-559, 627-631, and 715-719.
+
+- [ ] **Step 4: Run the tests to confirm they pass**
+
+Run: `go test ./internal/proxy/... -v`
+Expected: PASS
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add internal/proxy/proxy.go
+git commit -m "refactor: extract proxyDebugLog helper for repeated debug logging"
+```
+
+---
+
+## Phase 3: Test helpers
+
+### Task 8: Create a test helper package
+
+**Files:**
+- Create: `internal/testutil/proxy.go`
+
+- [ ] **Step 1: Create the test helper file**
+
+Create `internal/testutil/proxy.go`:
+
+```go
+package testutil
+
+import (
+	"time"
+
+	"rua.plus/lolly/internal/config"
+	"rua.plus/lolly/internal/loadbalance"
+)
+
+// NewTestProxyConfig creates a proxy configuration for tests.
+//
+// Parameters:
+//   - path: the proxy path
+//   - targetURLs: the list of backend target URLs
+//
+// Returns:
+//   - *config.ProxyConfig: the populated proxy configuration
+func NewTestProxyConfig(path string, targetURLs ...string) *config.ProxyConfig {
+	cfg := &config.ProxyConfig{
+		Path:        path,
+		LoadBalance: "round_robin",
+		Timeout: config.ProxyTimeout{
+			Connect: 5 * time.Second,
+			Read:    30 * time.Second,
+			Write:   30 * time.Second,
+		},
+	}
+
+	if len(targetURLs) > 0 {
+		cfg.Targets = make([]config.ProxyTargetConfig, len(targetURLs))
+		for i, url := range targetURLs {
+			cfg.Targets[i] = config.ProxyTargetConfig{URL: url}
+		}
+	}
+
+	return cfg
+}
+
+// NewTestProxyConfigWithCache creates a test proxy configuration with caching enabled.
+func NewTestProxyConfigWithCache(path string, maxAge time.Duration, targetURLs ...string) *config.ProxyConfig {
+	cfg := NewTestProxyConfig(path, targetURLs...)
+	cfg.Cache = config.ProxyCacheConfig{
+		Enabled: true,
+		MaxAge:  maxAge,
+	}
+	return cfg
+}
+
+// NewTestTarget creates a proxy target for tests.
+//
+// Parameters:
+//   - url: the target URL
+//
+// Returns:
+//   - *loadbalance.Target: the test target
+func NewTestTarget(url string) *loadbalance.Target {
+	return &loadbalance.Target{URL: url}
+}
+
+// NewTestTargets creates test targets in bulk.
+func NewTestTargets(urls ...string) []*loadbalance.Target {
+	targets := make([]*loadbalance.Target, len(urls))
+	for i, url := range urls {
+		targets[i] = NewTestTarget(url)
+	}
+	return targets
+}
+
+// NewTestHealthyTarget creates a test target that is already marked healthy.
+//
+// Parameters:
+//   - url: the target URL
+//
+// Returns:
+//   - *loadbalance.Target: a test target marked healthy
+func NewTestHealthyTarget(url string) *loadbalance.Target {
+	t := NewTestTarget(url)
+	t.Healthy.Store(true)
+	return t
+}
+
+// NewTestHealthyTargets creates healthy test targets in bulk.
+func NewTestHealthyTargets(urls ...string) []*loadbalance.Target {
+	targets := make([]*loadbalance.Target, len(urls))
+	for i, url := range urls {
+		targets[i] = NewTestHealthyTarget(url)
+	}
+	return targets
+}
+```
+
+- [ ] **Step 2: Write tests for the helpers**
+
+Create `internal/testutil/proxy_test.go`:
+
+```go
+package testutil
+
+import (
+	"testing"
+	"time"
+)
+
+func TestNewTestProxyConfig(t *testing.T) {
+	cfg := NewTestProxyConfig("/api", "http://localhost:8080")
+
+	if cfg.Path != "/api" {
+		t.Errorf("expected path /api, got %s", cfg.Path)
+	}
+	if len(cfg.Targets) != 1 {
+		t.Errorf("expected 1 target, got %d", len(cfg.Targets))
+	}
+	if cfg.Timeout.Connect != 5*time.Second {
+		t.Errorf("expected 5s connect timeout, got %v", cfg.Timeout.Connect)
+	}
+}
+
+func TestNewTestHealthyTarget(t *testing.T) {
+	target := NewTestHealthyTarget("http://localhost:8080")
+
+	if target.URL != "http://localhost:8080" {
+		t.Errorf("expected URL http://localhost:8080, got %s", target.URL)
+	}
+	if !target.Healthy.Load() {
+		t.Error("expected target to be healthy")
+	}
+}
+
+func TestNewTestHealthyTargets(t *testing.T) {
+	targets := NewTestHealthyTargets("http://localhost:8080", "http://localhost:8081")
+
+	if len(targets) != 2 {
+		t.Errorf("expected 2 targets, got %d", len(targets))
+	}
+	for i, target := range targets {
+		if !target.Healthy.Load() {
+			t.Errorf("expected target %d to be healthy", i)
+		}
+	}
+}
+```
+
+- [ ] **Step 3: Run the tests to confirm they pass**
+
+Run: `go test ./internal/testutil/... -v`
+Expected: PASS
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add internal/testutil/
+git commit -m "feat: add testutil package for proxy config helpers"
+```
+
+---
+
+### Task 9: Migrate the proxy tests to the helpers
+
+**Files:**
+- Modify: `internal/proxy/proxy_test.go`
+- Modify: `internal/integration/proxy_integration_test.go`
+
+- [ ] **Step 1: Update the imports in `internal/proxy/proxy_test.go`**
+
+Add the import:
+```go
+import (
+	"rua.plus/lolly/internal/testutil"
+)
+```
+
+- [ ] **Step 2: Replace repeated ProxyConfig construction**
+
+Replace the repeated pattern in the tests with:
+```go
+// Before:
+cfg := &config.ProxyConfig{
+	Path:        "/api",
+	LoadBalance: "round_robin",
+	Timeout: config.ProxyTimeout{
+		Connect: 5 * time.Second,
+		Read:    30 * time.Second,
+		Write:   30 * time.Second,
+	},
+}
+
+// After:
+cfg := testutil.NewTestProxyConfig("/api")
+```
+
+- [ ] **Step 3: Replace repeated Target construction**
+
+Replace:
+```go
+targets := []*loadbalance.Target{{URL: "http://localhost:8080"}}
+targets[0].Healthy.Store(true)
+```
+with:
+```go
+targets := testutil.NewTestHealthyTargets("http://localhost:8080")
+```
+
+- [ ] **Step 4: Run the tests to confirm they pass**
+
+Run: `go test ./internal/proxy/... -v`
+Expected: PASS
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add internal/proxy/proxy_test.go internal/integration/proxy_integration_test.go
+git commit -m "refactor: use testutil helpers in proxy tests"
+```
+
+---
+
+### Task 10: Migrate the server tests to the helpers
+
+**Files:**
+- Modify: `internal/server/*_test.go`
+
+- [ ] **Step 1: Bulk-replace the repeated code in the server tests**
+
+Using the same pattern as Task 9, replace the repeated ProxyConfig and Target construction in every test file under `internal/server/`.
+
+- [ ] **Step 2: Run the tests to confirm they pass**
+
+Run: `go test ./internal/server/... -v`
+Expected: PASS
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add internal/server/
+git commit -m "refactor: use testutil helpers in server tests"
+```
+
+---
+
+## Acceptance checks
+
+### Task 11: Final verification
+
+- [ ] **Step 1: Run the unused linter**
+
+Run: `golangci-lint run --enable=unused ./...`
+Expected: no unused errors
+
+- [ ] **Step 2: Run the dupl linter**
+
+Run: `golangci-lint run --enable=dupl ./...`
+Expected: no dupl errors in source files (they are allowed in test files)
+
+- [ ] **Step 3: Run the full test suite**
+
+Run: `go test ./...`
+Expected: all PASS
+
+- [ ] **Step 4: Measure the change in line count**
+
+Run: `git diff --stat`
+Expected: a net reduction of more than 200 lines
+
+- [ ] **Step 5: Final commit**
+
+```bash
+git commit -m "chore: eliminate code redundancy - dead code removal, pattern extraction, test helpers"
+```
+
+---
+
+## Self-Review Checklist
+
+1. **Spec coverage**: all 3 phases have detailed tasks ✓
+2. **Placeholder scan**: no TBD, TODO, or vague descriptions ✓
+3. **Type consistency**: the `registerRoute` and `proxyDebugLog` signatures match their call sites ✓
+4. **File paths**: all paths are relative to the repository root and match the codebase ✓
+5. **Commands**: every test step has an explicit command and expected output ✓
diff --git a/docs/superpowers/plans/2026-06-03-redundancy-removal.md b/docs/superpowers/plans/2026-06-03-redundancy-removal.md
new file mode 100644
index 0000000..080abab
--- /dev/null
+++ b/docs/superpowers/plans/2026-06-03-redundancy-removal.md
@@ -0,0 +1,791 @@
+# Lolly Code Redundancy Optimization Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Systematically eliminate redundant code in the Lolly codebase, including dead code, duplicate implementations, over-engineering, and duplicated tests, to improve maintainability and code quality.
+
+**Architecture:** Use a phased, incremental refactoring strategy. Each phase is independently deliverable, so a rollback is possible at any point. Handle dead code first (zero risk, high return), then duplicate implementations (low risk, medium return), and finally architecture-level duplication (medium risk, long-term benefit).
+
+**Tech Stack:** Go 1.24, fasthttp, staticcheck, go vet
+
+---
+
+## File structure map
+
+### Files to delete or clean up
+- `internal/middleware/limitrate/limitrate.go`: main file of the dead package
+- `internal/middleware/limitrate/writer.go`: helper file of the dead package
+- `internal/middleware/limitrate/limitrate_test.go`: test file of the dead package
+- `internal/stream/ssl.go`: dead code (no field is used)
+- `internal/stream/ssl_test.go`: test file for the dead code
+- `internal/variable/pool.go`: dead code (no field is used)
+- `TestExtractHostFromURL` in `internal/proxy/proxy_coverage_extra_test.go`: the function under test is about to be deleted
+
+### Files to modify (grouped by module)
+
+**Phase 1 - Dead code cleanup:**
+- `internal/mimeutil/detect.go:154`: add the defaultMIME fallback logic
+- `internal/app/app_test.go:448`: delete the unused `customSig`
+- `internal/app/testutil.go:17`: delete the unused `setupTestLogger`
+- `internal/http3/server_test.go:138`: delete the unused `generateTestCertificate`
+- `internal/proxy/proxy_dns_test.go:91`: delete the unused methods
+- `internal/server/testutil.go:15`: delete the unused constant
+- `internal/server/upgrade_test.go:291`: delete the unused `containsString`
+- `internal/server/pool_bench_test.go:305`: delete the unused `id` field
+- `internal/stream/stream_test.go:24`: delete the unused `generateTestCertificate`
+
+**Phase 2 - Duplicate implementation removal:**
+- `internal/proxy/proxy.go:362,1003-1018`: delete `extractHostFromURL` and use `netutil.ParseTargetURL` instead
+- `internal/proxy/header_modifier.go:33`: use `netutil.ParseTargetURL` instead
+- `internal/handler/static.go:628,832-836`: delete the `generateETag` wrapper and call `utils.GenerateETag` directly
+- `internal/cache/file_cache.go:47,181`: delete the `generateETag` wrapper and call `utils.GenerateETag` directly
+- `internal/utils/httperror.go:67-86`: simplify `CheckIPAccess` by reusing `IPInAllowList`
+
+**Phase 3 - Router and server logic simplification:**
+- `internal/server/router.go:118-145,217-234,402-423`: eliminate the redundant switch blocks
+- `internal/server/server.go:454-868`: extract the functions shared by the three startup modes
+
+**Phase 4 - Load balancing unification (optional):**
+- `internal/stream/stream.go:61-285`: reuse the algorithm implementations in `internal/loadbalance`
+
+---
+
+## Task breakdown
+
+### Phase 1: Dead code cleanup (P0)
+
+---
+
+#### Task 1.1: Delete the dead limitrate package
+
+**Files:**
+- Delete: `internal/middleware/limitrate/limitrate.go`
+- Delete: `internal/middleware/limitrate/writer.go`
+- Delete: `internal/middleware/limitrate/limitrate_test.go`
+
+- [ ] **Step 1: Confirm the package is not referenced**
+
+```bash
+grep -r "limitrate" --include="*.go" /home/xfy/Developer/lolly/internal/
+```
+
+Expected: matches only inside the `internal/middleware/limitrate/` directory, with no external references.
+
+- [ ] **Step 2: Delete the whole directory**
+
+```bash
+rm -rf /home/xfy/Developer/lolly/internal/middleware/limitrate/
+```
+
+- [ ] **Step 3: Verify the build passes**
+
+```bash
+cd /home/xfy/Developer/lolly && go build ./...
+```
+
+Expected: no errors; the build succeeds.
+
+- [ ] **Step 4: Run the tests of the affected packages**
+
+```bash
+cd /home/xfy/Developer/lolly && go test ./internal/middleware/...
+```
+
+Expected: all pass.
+
+- [ ] **Step 5: Commit**
+
+```bash
+cd /home/xfy/Developer/lolly && git add -A && git commit -m "refactor: remove dead code package internal/middleware/limitrate"
+```
+
+---
+
+#### Task 1.2: Delete the dead code in stream/ssl.go
+
+**Files:**
+- Delete: `internal/stream/ssl.go`
+- Delete: `internal/stream/ssl_test.go`
+
+- [ ] **Step 1: Confirm the fields in ssl.go are unused**
+
+```bash
+grep -r "SSLManager\|ProxySSLManager" --include="*.go" /home/xfy/Developer/lolly/internal/
+```
+
+Expected: defined only in `internal/stream/ssl.go` itself, with no other references.
+
+- [ ] **Step 2: Delete the files**
+
+```bash
+rm /home/xfy/Developer/lolly/internal/stream/ssl.go
+rm /home/xfy/Developer/lolly/internal/stream/ssl_test.go
+```
+
+- [ ] **Step 3: Verify the build and tests**
+
+```bash
+cd /home/xfy/Developer/lolly && go build ./internal/stream/... && go test ./internal/stream/...
+```
+
+Expected: the build and all tests pass.
+
+- [ ] **Step 4: Commit**
+
+```bash
+cd /home/xfy/Developer/lolly && git add -A && git commit -m "refactor: remove unused stream SSL dead code"
+```
+
+---
+
+#### Task 1.3: Delete the dead code in variable/pool.go
+
+**Files:**
+- Delete: `internal/variable/pool.go`
+
+- [ ] **Step 1: Confirm the variables in pool.go are unused**
+
+```bash
+grep -r "PoolStats\|gets\.\|puts\.\|newCount\.\|active\." --include="*.go" /home/xfy/Developer/lolly/internal/
+```
+
+Expected: no references (other than the definitions in `pool.go` itself).
+
+- [ ] **Step 2: Delete the file**
+
+```bash
+rm /home/xfy/Developer/lolly/internal/variable/pool.go
+```
+
+- [ ] **Step 3: Verify the build and tests**
+
+```bash
+cd /home/xfy/Developer/lolly && go build ./internal/variable/... && go test ./internal/variable/...
+```
+
+Expected: the build and all tests pass.
+
+- [ ] **Step 4: Commit**
+
+```bash
+cd /home/xfy/Developer/lolly && git add -A && git commit -m "refactor: remove unused variable pool statistics dead code"
+```
+
+---
+
+#### Task 1.4: Fix the unused defaultMIME in mimeutil
+
+**Files:**
+- Modify: `internal/mimeutil/detect.go:154`
+
+- [ ] **Step 1: Read the current DetectContentType implementation**
+
+Read: `internal/mimeutil/detect.go:95-155`
+
+Current behavior: when `mime.TypeByExtension` returns an empty string, the function caches and returns that empty string and never uses `defaultMIME`.
+
+- [ ] **Step 2: Add the defaultMIME fallback at the end of DetectContentType**
+
+```go
+// Add at line 154 of internal/mimeutil/detect.go (before return mimeType):
+
+	if mimeType == "" {
+		defaultMutex.RLock()
+		mimeType = defaultMIME
+		defaultMutex.RUnlock()
+	}
+
+	return mimeType
+```
+
+After the change, lines 149-160 should read:
+
+```go
+	// Insert the new entry
+	entry := &mimeCacheEntry{ext: ext, mimeType: mimeType}
+	entry.element = mimeLRU.PushFront(entry)
+	mimeCache[ext] = entry
+
+	if mimeType == "" {
+		defaultMutex.RLock()
+		mimeType = defaultMIME
+		defaultMutex.RUnlock()
+	}
+
+	return mimeType
+```
+
+- [ ] **Step 3: Verify the build and tests**
+
+```bash
+cd /home/xfy/Developer/lolly && go build ./internal/mimeutil/... && go test ./internal/mimeutil/...
+```
+
+Expected: the build and all tests pass.
+
+- [ ] **Step 4: Commit**
+
+```bash
+cd /home/xfy/Developer/lolly && git add -A && git commit -m "fix: use defaultMIME fallback in DetectContentType"
+```
+
+---
+
+#### Task 1.5: Clean up the remaining dead code found by static analysis
+
+**Files:**
+- Modify: `internal/app/app_test.go`: delete the unused `customSig`
+- Modify: `internal/app/testutil.go`: delete the unused `setupTestLogger`
+- Modify: `internal/http3/server_test.go`: delete the unused `generateTestCertificate`
+- Modify: `internal/proxy/proxy_dns_test.go`: delete the unused methods
+- Modify: `internal/server/testutil.go`: delete the unused `testListenAddr`
+- Modify: `internal/server/upgrade_test.go`: delete the unused `containsString`
+- Modify: `internal/server/pool_bench_test.go`: delete the unused `id` field
+- Modify: `internal/stream/stream_test.go`: delete the unused `generateTestCertificate`
+
+- [ ] **Step 1: Run staticcheck to get exact line numbers**
+
+```bash
+cd /home/xfy/Developer/lolly && staticcheck ./... 2>&1 | grep "U1000"
+```
+
+Expected: the exact file path and line number of each piece of dead code.
+
+- [ ] **Step 2: Delete the dead code one item at a time**
+
+For each piece of dead code reported by staticcheck:
+1. Open the file
+2. Locate the reported function, variable, or field
+3. Delete the entire unused declaration
+4. Save the file
+
+Example (using `internal/server/testutil.go`):
+
+```go
+// Before:
+const testListenAddr = "127.0.0.1:0"
+
+// After:
+// (the whole line is removed)
+```
+
+- [ ] **Step 3: Verify the build and tests**
+
+```bash
+cd /home/xfy/Developer/lolly && go build ./... && go test ./internal/app/... ./internal/http3/... ./internal/proxy/... ./internal/server/... ./internal/stream/...
+```
+
+Expected: all pass.
+
+- [ ] **Step 4: Commit**
+
+```bash
+cd /home/xfy/Developer/lolly && git add -A && git commit -m "refactor: remove unused code identified by staticcheck"
+```
+
+---
+
+### Phase 2: Duplicate implementation removal (P1)
+
+---
+
+#### Task 2.1: Delete extractHostFromURL from proxy.go and standardize on netutil
+
+**Files:**
+- Modify: `internal/proxy/proxy.go:362`: replace the call
+- Modify: `internal/proxy/proxy.go:993-1018`: delete the function
+- Modify: `internal/proxy/header_modifier.go:33`: replace the call
+- Modify: `internal/proxy/proxy_coverage_extra_test.go`: delete the test
+
+- [ ] **Step 1: Change the call at proxy.go:362**
+
+Read: `internal/proxy/proxy.go:360-365`
+
+Replace:
+```go
+	tlsCfg, err := CreateTLSConfig(sslCfg, extractHostFromURL(targetURL))
+```
+with:
+```go
+	host, _, _, err := netutil.ParseTargetURL(targetURL, false)
+	if err != nil {
+		return nil, fmt.Errorf("parse target URL %q: %w", targetURL, err)
+	}
+	tlsCfg, err := CreateTLSConfig(sslCfg, host)
+```
+
+Make sure the file imports `rua.plus/lolly/internal/netutil`.
+
+- [ ] **Step 2: Change the call at header_modifier.go:33**
+
+Read: `internal/proxy/header_modifier.go:30-36`
+
+Replace:
+```go
+	targetHost := extractHostFromURL(target.URL)
+```
+with:
+```go
+	targetHost, _, _, err := netutil.ParseTargetURL(target.URL, false)
+	if err != nil {
+		targetHost = target.URL
+	}
+```
+
+Make sure the file imports `rua.plus/lolly/internal/netutil`.
+
+- [ ] **Step 3: Delete the extractHostFromURL function from proxy.go**
+
+Delete the entire function at lines 993-1018 of `internal/proxy/proxy.go`:
+
+```go
+// extractHostFromURL extracts the host:port part from a URL string.
+// ...
+func extractHostFromURL(urlStr string) string {
+	// ...
+}
+```
+
+- [ ] **Step 4: Delete TestExtractHostFromURL from proxy_coverage_extra_test.go**
+
+Read: `internal/proxy/proxy_coverage_extra_test.go:1426-1480`
+
+Delete the entire `TestExtractHostFromURL` function and its related test cases.
+
+- [ ] **Step 5: Verify the build and tests**
+
+```bash
+cd /home/xfy/Developer/lolly && go build ./internal/proxy/... && go test ./internal/proxy/...
+```
+
+Expected: the build and all tests pass.
+
+- [ ] **Step 6: Commit**
+
+```bash
+cd /home/xfy/Developer/lolly && git add -A && git commit -m "refactor: remove extractHostFromURL, use netutil.ParseTargetURL"
+```
+
+---
+
+#### Task 2.2: Delete the generateETag wrapper functions
+
+**Files:**
+- Modify: `internal/handler/static.go:628,832-836`
+- Modify: `internal/cache/file_cache.go:45-49,181`
+
+- [ ] **Step 1: Modify handler/static.go**
+
+Read: `internal/handler/static.go:626-630`
+
+Replace:
+```go
+	etag := generateETag(info.ModTime(), info.Size())
+```
+with:
+```go
+	etag := utils.GenerateETag(info.ModTime(), info.Size())
+```
+
+Delete the `generateETag` function at lines 832-836 of `internal/handler/static.go`.
+
+- [ ] **Step 2: Modify cache/file_cache.go**
+
+Read: `internal/cache/file_cache.go:179-183`
+
+Replace:
+```go
+	etag := generateETag(modTime, size)
+```
+with:
+```go
+	etag := utils.GenerateETag(modTime, size)
+```
+
+Delete the `generateETag` function at lines 45-49 of `internal/cache/file_cache.go`.
+
+- [ ] **Step 3: Verify the build and tests**
+
+```bash
+cd /home/xfy/Developer/lolly && go build ./internal/handler/... ./internal/cache/... && go test ./internal/handler/... ./internal/cache/...
+```
+
+Expected: the build and all tests pass.
+
+- [ ] **Step 4: Commit**
+
+```bash
+cd /home/xfy/Developer/lolly && git add -A && git commit -m "refactor: remove redundant generateETag wrappers, use utils.GenerateETag directly"
+```
+
+---
+
+#### Task 2.3: Simplify CheckIPAccess by reusing IPInAllowList
+
+**Files:**
+- Modify: `internal/utils/httperror.go:67-86`
+
+- [ ] **Step 1: Refactor CheckIPAccess**
+
+Read: `internal/utils/httperror.go:67-86`
+
+Replace:
+```go
+func CheckIPAccess(ctx *fasthttp.RequestCtx, allowed []net.IPNet) bool {
+	if len(allowed) == 0 {
+		return true
+	}
+
+	clientIP := netutil.ExtractClientIPNet(ctx)
+	if clientIP == nil {
+		return false
+	}
+
+	for _, network := range allowed {
+		if network.Contains(clientIP) {
+			return true
+		}
+	}
+
+	return false
+}
+```
+with:
+```go
+func CheckIPAccess(ctx *fasthttp.RequestCtx, allowed []net.IPNet) bool {
+	if len(allowed) == 0 {
+		return true
+	}
+
+	clientIP := netutil.ExtractClientIPNet(ctx)
+	if clientIP == nil {
+		return false
+	}
+
+	return IPInAllowList(clientIP, allowed)
+}
+```
+
+- [ ] **Step 2: Verify the build and tests**
+
+```bash
+cd /home/xfy/Developer/lolly && go build ./internal/utils/... && go test ./internal/utils/...
+```
+
+Expected: the build and all tests pass.
+
+- [ ] **Step 3: Commit**
+
+```bash
+cd /home/xfy/Developer/lolly && git add -A && git commit -m "refactor: simplify CheckIPAccess by reusing IPInAllowList"
+```
+
+---
+
+### Phase 3: Router and server logic simplification (P1-P2)
+
+---
+
+#### Task 3.1: Simplify the redundant switch blocks in router.go
+
+**Files:**
+- Modify: `internal/server/router.go:118-145` (`registerProxyRoutesWithLocationEngine`)
+- Modify: `internal/server/router.go:217-234` (`registerStaticHandlersWithLocationEngine`)
+- Modify: `internal/server/router.go:402-423` (`registerLuaRoutesWithLocationEngine`)
+
+- [ ] **Step 1: Simplify registerProxyRoutesWithLocationEngine**
+
+Read: `internal/server/router.go:108-148`
+
+Replace the switch block at lines 118-145 with:
+
+```go
+	for i := range serverCfg.Proxy {
+		proxyCfg := &serverCfg.Proxy[i]
+		p := s.createProxyForConfig(proxyCfg)
+		if p == nil {
+			continue
+		}
+
+		locType := proxyCfg.LocationType
+		if locType == "" {
+			locType = matcher.LocationTypePrefix
+		}
+
+		path := proxyCfg.Path
+		if locType == matcher.LocationTypeNamed && proxyCfg.LocationName != "" {
+			path = "@" + proxyCfg.LocationName
+		}
+
+		if err := s.registerRoute(locType, path, p.ServeHTTP, proxyCfg.Internal, "proxy"); err != nil {
+			return err
+		}
+	}
+	return nil
+```
+
+- [ ] **Step 2: Simplify registerStaticHandlersWithLocationEngine**
+
+Read: `internal/server/router.go:208-236`
+
+Replace the switch block at lines 217-234 with equivalent logic (a direct call to `s.registerRoute`).
+
+- [ ] **Step 3: Simplify registerLuaRoutesWithLocationEngine**
+
+Read: `internal/server/router.go:393-425`
+
+Replace the switch block at lines 402-423 with equivalent logic (a direct call to `s.registerRoute`).
+
+- [ ] **Step 4: Verify the build and tests**
+
+```bash
+cd /home/xfy/Developer/lolly && go build ./internal/server/... && go test ./internal/server/...
+``` + +Expected: 编译和测试全部通过。 + +- [ ] **Step 5: Commit** + +```bash +cd /home/xfy/Developer/lolly && git add -A && git commit -m "refactor: eliminate redundant switch blocks in router.go LocationEngine functions" +``` + +--- + +#### Task 3.2: 提取 server.go 三种启动模式的公共函数 + +**Files:** +- Modify: `internal/server/server.go:454-868` + +**新增辅助函数(添加到 server.go 末尾,在 SetResolver 之前):** + +- [ ] **Step 1: 提取 `registerMonitoringEndpoints` 函数** + +在 `internal/server/server.go` 中新增: + +```go +// registerMonitoringEndpoints 注册状态监控、性能分析和缓存清理端点。 +// isDefault 为 true 时注册所有端点,否则跳过(用于多服务器模式)。 +func (s *Server) registerMonitoringEndpoints(router *handler.Router, serverCfg *config.ServerConfig, isDefault bool) { + // 状态监控端点 + if isDefault && s.config.Monitoring.Status.Enabled { + statusHandler, err := NewStatusHandler(s, &s.config.Monitoring.Status) + if err != nil { + logging.Error().Msg("Failed to create status handler: " + err.Error()) + } else { + router.GET(statusHandler.Path(), statusHandler.ServeHTTP) + } + } + + // pprof 性能分析端点 + if isDefault && s.config.Monitoring.Pprof.Enabled { + pprofHandler, err := NewPprofHandler(&s.config.Monitoring.Pprof) + if err != nil { + logging.Error().Msg("Failed to create pprof handler: " + err.Error()) + } else { + router.GET(pprofHandler.Path(), pprofHandler.ServeHTTP) + router.GET(pprofHandler.Path()+"/{profile:*}", pprofHandler.ServeHTTP) + } + } + + // 缓存清理 API + if isDefault && serverCfg.CacheAPI != nil && serverCfg.CacheAPI.Enabled { + purgeHandler, err := NewPurgeHandler(s, serverCfg.CacheAPI) + if err != nil { + logging.Error().Msg("Failed to create cache purge handler: " + err.Error()) + } else { + router.POST(purgeHandler.Path(), purgeHandler.ServeHTTP) + } + } +} +``` + +- [ ] **Step 2: 提取 `wrapHandler` 函数** + +```go +// wrapHandler 应用中间件链、连接池包装和统计追踪。 +func (s *Server) wrapHandler(base fasthttp.RequestHandler, serverCfg *config.ServerConfig) (fasthttp.RequestHandler, error) { + chain, err := s.buildMiddlewareChain(serverCfg) + if err 
!= nil { + return nil, err + } + + handler := chain.Apply(base) + if s.pool != nil { + handler = s.pool.WrapHandler(handler) + } + handler = s.trackStats(handler) + return handler, nil +} +``` + +- [ ] **Step 3: 提取 `startServer` 函数** + +```go +// startServer 创建监听器并启动 fasthttp.Server,支持可选 TLS。 +func (s *Server) startServer(serverCfg *config.ServerConfig, fastSrv *fasthttp.Server) error { + ln, err := s.createListener(serverCfg) + if err != nil { + return fmt.Errorf("failed to listen: %w", err) + } + s.listeners = append(s.listeners, ln) + + // 检查 SSL/TLS + if serverCfg.SSL.Cert != "" && serverCfg.SSL.Key != "" { + tlsManager, err := ssl.NewTLSManager(&serverCfg.SSL) + if err != nil { + return fmt.Errorf("failed to create TLS manager: %w", err) + } + fastSrv.TLSConfig = tlsManager.GetTLSConfig() + return fastSrv.ServeTLS(ln, "", "") + } + + return fastSrv.Serve(ln) +} +``` + +- [ ] **Step 4: 重构 startSingleMode 使用新函数** + +将 `startSingleMode` 中的监控注册、中间件链构建、fasthttp.Server 创建和启动逻辑替换为对新辅助函数的调用。 + +重构后的 `startSingleMode` 核心逻辑: + +```go +func (s *Server) startSingleMode() error { + serverCfg := &s.config.Servers[0] + s.applyTypesConfig(serverCfg) + + s.locationEngine = matcher.NewLocationEngine() + s.registerMonitoringEndpointsWithLocationEngine(serverCfg) + + if err := s.registerProxyRoutesWithLocationEngine(serverCfg); err != nil { + return err + } + // ... 
Lua and static file registration + + s.locationEngine.MarkInitialized() + + baseHandler := func(ctx *fasthttp.RequestCtx) { + // LocationEngine matching logic + } + + handler, err := s.wrapHandler(baseHandler, serverCfg) + if err != nil { + return err + } + s.handler = handler + + s.fastServer = s.createFastServer(serverCfg, s.handler) + s.running.Store(true) + + return s.startServer(serverCfg, s.fastServer) +} +``` + +- [ ] **Step 5: Refactor startVHostMode to use the new functions** + +Likewise, replace the duplicated logic in `startVHostMode` with calls to the new helper functions. + +- [ ] **Step 6: Refactor startMultiServerMode to use the new functions** + +Likewise, replace the duplicated logic in `startMultiServerMode` with calls to the new helper functions. + +- [ ] **Step 7: Verify the build and tests** + +```bash +cd /home/xfy/Developer/lolly && go build ./internal/server/... && go test ./internal/server/... +``` + +Expected: the build and all tests pass. + +- [ ] **Step 8: Commit** + +```bash +cd /home/xfy/Developer/lolly && git add -A && git commit -m "refactor: extract common functions from server startup modes" +``` + +--- + +### Phase 4: Unify load balancing (P3 - optional/long-term) + +--- + +#### Task 4.1: Analyze the differences between Stream and HTTP load balancing + +**Files:** +- Read: `internal/stream/stream.go:61-285` +- Read: `internal/loadbalance/balancer.go:101-273` + +- [ ] **Step 1: Compare the two implementations** + +Points to focus on: +- The Stream version uses a `sync.Pool` optimization; the HTTP version does not +- The HTTP version has a `SelectExcluding` method; the Stream version does not +- The two use different Target types (Stream uses `string`, HTTP uses `*Target`) + +- [ ] **Step 2: Decide whether to unify** + +If the differences are small, the recommendation is: +1. Define an interface in `internal/loadbalance` +2. Have Stream reuse the HTTP implementation, keeping the `sync.Pool` optimization only as an option + +If the differences are large, the recommendation is: +1. Keep things as they are +2. Note the duplication in the documentation and unify when the architecture evolves + +--- + +## Verification Checklist + +Run after each phase is complete: + +```bash +# 1. Build check +cd /home/xfy/Developer/lolly && go build ./... + +# 2. Static analysis +cd /home/xfy/Developer/lolly && staticcheck ./... + +# 3. Unit tests +cd /home/xfy/Developer/lolly && go test ./internal/... + +# 4. 
Full test suite +cd /home/xfy/Developer/lolly && make test +``` + +Expected: +- `go build ./...`: no errors +- `staticcheck ./...`: no new warnings +- `go test ./internal/...`: all pass +- `make test`: all pass + +--- + +## Rollback Strategy + +Commit immediately after each Task is complete. To roll back: + +```bash +# Roll back a single Task +git revert + +# Roll back an entire Phase +git revert .. +``` + +--- + +## Risk Assessment + +| Task | Risk level | Scope of impact | Mitigation | +|------|----------|----------|----------| +| Task 1.1-1.5 | Very low | Dead code removal only | Build and test verification | +| Task 2.1-2.3 | Low | Replaced function calls | Full test run | +| Task 3.1 | Low | Internal refactor of router.go | server package tests | +| Task 3.2 | Medium | Core logic of server.go | Full regression testing | +| Task 4.1 | Medium | Architectural change | Deferred to a separate iteration | + +--- + +*Plan generated: 2026-06-03* +*Estimated effort: 4-6 hours for Phases 1-3, 2-4 hours for Phase 4* diff --git a/docs/superpowers/plans/2026-06-04-performance-optimization.md b/docs/superpowers/plans/2026-06-04-performance-optimization.md new file mode 100644 index 0000000..d02aa27 --- /dev/null +++ b/docs/superpowers/plans/2026-06-04-performance-optimization.md @@ -0,0 +1,820 @@ +# Hot-Path Performance Optimization Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. 
+ +**Goal:** Eliminate six confirmed hot-path performance bottlenecks, reducing per-request heap allocations and lock contention. + +**Architecture:** Aggressively optimize, one at a time: loadbalance filterHealthy (per-request allocations), RadixTree heap allocations, the O(n) operations of the DNS LRU, the FileInfoCache double lock acquisition, the ConsistentHash double lock, and the IsAvailable mutex. Each optimization is independently testable and leaves external interfaces unchanged. + +**Tech Stack:** Go 1.26+, sync.Pool, container/list, atomic operations, unsafe pointer (b2s/s2b) + +--- + +## Task 1: loadbalance - zero-allocation filterHealthy + +**Files:** +- Modify: `internal/loadbalance/balancer.go` +- Test: `internal/loadbalance/balancer_test.go` +- Benchmark: `internal/loadbalance/balancer_bench_test.go` + +**Problem**: Every call to `filterHealthy` allocates two slices (`available` and `backups`); `filterHealthyAndExclude` allocates three objects (the two slices and the `excludeSet` map). `IPHash.SelectByIP` additionally allocates an `fnv.New64a()` object. All of this runs during load-balancer selection on every request. + +**Approach**: Introduce a `filterContext` struct that holds reusable buffers and is managed through a `sync.Pool`. `filterHealthy` writes into the preallocated slices of the `filterContext` instead of calling `make` each time. IPHash uses an inline FNV-1a (64-bit) hash to avoid the `fnv.New64a()` allocation. + +- [ ] **Step 1: Define filterContext and the Pool** + +Add to `balancer.go`: + +```go +type filterContext struct { + available []*Target + backups []*Target + excludeSet map[string]bool +} + +var filterContextPool = sync.Pool{ + New: func() any { + return &filterContext{ + available: make([]*Target, 0, 64), + backups: make([]*Target, 0, 64), + excludeSet: make(map[string]bool, 8), + } + }, +} + +func acquireFilterContext() *filterContext { + fc := filterContextPool.Get().(*filterContext) + return fc +} + +func releaseFilterContext(fc *filterContext) { + fc.available = fc.available[:0] + fc.backups = fc.backups[:0] + for k := range fc.excludeSet { + delete(fc.excludeSet, k) + } + filterContextPool.Put(fc) +} +``` + +- [ ] **Step 2: Rewrite filterHealthy as filterInto** + +```go +func filterInto(fc *filterContext, targets []*Target) []*Target { + for _, t := range targets { + if !t.IsAvailable() { + continue + } + if t.IsBackup() { + fc.backups = append(fc.backups, t) + } else { + fc.available = append(fc.available, t) + } + } + if len(fc.available) > 0 { + return fc.available + } + return fc.backups +} +``` + +- [ ] 
**Step 3: Rewrite filterHealthyAndExclude as filterIntoExcluding** + +```go +func filterIntoExcluding(fc *filterContext, targets []*Target, excluded []*Target) []*Target { + if len(excluded) > 0 { + for _, t := range excluded { + if t != nil { + fc.excludeSet[t.URL] = true + } + } + } + for _, t := range targets { + if !t.IsAvailable() || fc.excludeSet[t.URL] { + continue + } + if t.IsBackup() { + fc.backups = append(fc.backups, t) + } else { + fc.available = append(fc.available, t) + } + } + if len(fc.available) > 0 { + return fc.available + } + return fc.backups +} +``` + +- [ ] **Step 4: Add an inline FNV-1a (64-bit) hash function** + +This avoids the heap allocation of `fnv.New64a()`: + +```go +func fnvHash64a(key string) uint64 { + var h uint64 = 14695981039346656037 + for i := 0; i < len(key); i++ { + h ^= uint64(key[i]) + h *= 1099511628211 + } + return h +} +``` + +- [ ] **Step 5: Rewrite Select/SelectExcluding of every Balancer to use the Pool** + +RoundRobin example: +```go +func (r *RoundRobin) Select(targets []*Target) *Target { + fc := acquireFilterContext() + defer releaseFilterContext(fc) + healthy := filterInto(fc, targets) + if len(healthy) == 0 { + return nil + } + idx := r.counter.Add(1) - 1 + return healthy[idx%uint64(len(healthy))] +} +``` + +Apply the same pattern to the `Select`/`SelectExcluding` methods of all six algorithms. +In IPHash, replace the `fnv.New64a()`, `h.Write()`, `h.Sum64()` sequence with `fnvHash64a(clientIP)`. +In ConsistentHash, also replace `hashKeyString` with `fnvHash64a`. + +- [ ] **Step 6: Keep the old functions as compatibility aliases (optional)** + +Keep the `filterHealthy` and `filterHealthyAndExclude` function signatures but mark them `// Deprecated`, and have them call the new implementation internally so that external callers are unaffected. If there are no external callers, delete them outright. + +- [ ] **Step 7: Run the existing tests to verify correctness** + +```bash +go test -v -count=1 ./internal/loadbalance/... +``` + +Expected: all PASS, with no behavior change. + +- [ ] **Step 8: Run the benchmarks to verify the performance gain** + +```bash +go test -bench=BenchmarkAllBalancers -benchmem -count=5 ./internal/loadbalance/... 
+``` + +Expected: allocs/op drops from 2-3 to 0-1. + +- [ ] **Step 9: Commit** + +```bash +git add internal/loadbalance/balancer.go internal/loadbalance/random.go internal/loadbalance/consistent_hash.go +git commit -m "perf(loadbalance): eliminate per-request allocations in filterHealthy with sync.Pool" +``` + +--- + +## Task 2: loadbalance - lock-free IsAvailable + +**Files:** +- Modify: `internal/loadbalance/balancer.go` +- Test: `internal/loadbalance/balancer_test.go` + +**Problem**: `IsAvailable()` acquires the `failMu` mutex whenever `MaxFails > 0`. This happens for every target visited in `filterHealthy`/`filterInto`, so each LB Select locks every target once. + +**Approach**: Turn `failCount` and `failedUntil` into atomic values and remove the `failMu` mutex. Implement `RecordFailure` and the cooldown reset with atomic Add/Store operations. + +- [ ] **Step 1: Change the Target fields to atomics** + +```go +type Target struct { + // ... keep the other fields ... + failCount atomic.Int64 + failedUntil atomic.Int64 + // Removed: failMu sync.Mutex +} +``` + +- [ ] **Step 2: Rewrite IsAvailable as a lock-free version** + +```go +func (t *Target) IsAvailable() bool { + if !t.Healthy.Load() || t.Down { + return false + } + if t.MaxConns > 0 && atomic.LoadInt64(&t.Connections) >= t.MaxConns { + return false + } + if t.MaxFails > 0 { + failCount := t.failCount.Load() + if failCount >= t.MaxFails { + failedUntil := t.failedUntil.Load() + if time.Now().UnixNano() < failedUntil { + return false + } + // Cooldown has expired; try to reset (races are tolerated and do not affect correctness) + if failedUntil > 0 { + t.failCount.Store(0) + t.failedUntil.Store(0) + } + } + } + return true +} +``` + +- [ ] **Step 3: Rewrite RecordFailure and RecordSuccess as lock-free versions** + +```go +func (t *Target) RecordFailure() int64 { + if t.MaxFails <= 0 { + return 0 + } + count := t.failCount.Add(1) + if count >= t.MaxFails { + timeout := t.FailTimeout + if timeout <= 0 { + timeout = 10 * time.Second + } + t.failedUntil.Store(time.Now().Add(timeout).UnixNano()) + } + return count +} + +func (t *Target) RecordSuccess() { + if t.MaxFails <= 0 { + return + } + t.failCount.Store(0) + t.failedUntil.Store(0) +} +``` + +- [ ] **Step 4: Run the tests** + +```bash +go test -v -count=1 -run=TestTarget 
./internal/loadbalance/... +``` + +Expected: all PASS. + +- [ ] **Step 5: Run the full package tests** + +```bash +go test -v -count=1 ./internal/loadbalance/... +``` + +- [ ] **Step 6: Commit** + +```bash +git add internal/loadbalance/balancer.go +git commit -m "perf(loadbalance): replace failMu mutex with atomic operations in IsAvailable" +``` + +--- + +## Task 3: matcher - zero-allocation RadixTree search + +**Files:** +- Modify: `internal/matcher/radix.go` +- Test: `internal/matcher/radix_test.go`, `internal/matcher/integration_test.go` +- Benchmark: create `internal/matcher/radix_bench_test.go` + +**Problem**: During the recursive `searchLongest` search, a `&MatchResult{}` is allocated every time a node with a handler is visited, so a single lookup may allocate N MatchResult values and keep only one. The regex matcher's `GetCaptures` allocates a `map[string]string` on every call. + +**Approach**: Reuse MatchResult through a `sync.Pool`. Introduce a `searchState` to avoid repeated allocations inside the recursion, switching to stack-based iteration or to updating the best match in place. + +- [ ] **Step 1: Add the MatchResult Pool** + +Add to `radix.go`: + +```go +var matchResultPool = sync.Pool{ + New: func() any { + return &MatchResult{} + }, +} +``` + +- [ ] **Step 2: Rewrite searchLongest to update the best match in place** + +Instead of creating a newMatch inside the recursion, compare node fields directly and take a MatchResult from the pool only for the final return value: + +```go +func (t *RadixTree) searchLongest(node *RadixNode, path string, bestNode *RadixNode, bestPrefixLen int) *RadixNode { + if node == nil || path == "" { + return bestNode + } + if !strings.HasPrefix(path, node.prefix) { + return bestNode + } + remaining := path[len(node.prefix):] + if node.handler != nil { + if bestNode == nil || node.priority < bestNode.priority { + bestNode = node + } else if node.priority == bestNode.priority && len(node.prefix) > bestPrefixLen { + bestNode = node + } + } + for _, child := range node.children { + bestNode = t.searchLongest(child, remaining, bestNode, bestPrefixLen) + } + return bestNode +} +``` + +- [ ] **Step 3: Change FindLongestPrefix to build the MatchResult on return** + +```go +func (t *RadixTree) FindLongestPrefix(path string) *MatchResult { + bestNode := t.searchLongest(t.root, path, nil, 0) + if bestNode == nil { + return nil + } + result := matchResultPool.Get().(*MatchResult) + result.Handler 
= bestNode.handler + result.Path = bestNode.prefix + result.Priority = bestNode.priority + result.LocationType = bestNode.locationType + result.Internal = bestNode.internal + return result +} +``` + +Note: once the caller is done with the MatchResult, it must call `ReleaseMatchResult(result)` to return it to the pool. + +- [ ] **Step 4: Add a ReleaseMatchResult function for callers** + +```go +func ReleaseMatchResult(r *MatchResult) { + if r == nil { + return + } + r.Handler = nil + r.Captures = nil + r.Path = "" + r.LocationType = "" + r.Internal = false + r.Priority = 0 + matchResultPool.Put(r) +} +``` + +- [ ] **Step 5: Update LocationEngine.Match to release the result after calling FindLongestPrefix** + +In `location.go`, make sure every value returned by `FindLongestPrefix` is passed to `ReleaseMatchResult` before the function returns (the call chain must be analyzed to confirm ownership). + +- [ ] **Step 6: Add the benchmark file** + +Create `internal/matcher/radix_bench_test.go`: + +```go +func BenchmarkRadixTreeFindLongestPrefix(b *testing.B) { + tree := NewRadixTree() + paths := []string{"/", "/api", "/api/v1", "/api/v1/users", "/api/v1/users/:id", "/static", "/static/css", "/static/js", "/health", "/favicon.ico"} + for _, p := range paths { + tree.Insert(p, func(ctx *fasthttp.RequestCtx) {}, 0, "prefix", false) + } + tree.MarkInitialized() + + b.ResetTimer() + b.ReportAllocs() + for b.Loop() { + result := tree.FindLongestPrefix("/api/v1/users/123") + ReleaseMatchResult(result) + } +} + +func BenchmarkRadixTreeFindLongestPrefixParallel(b *testing.B) { + // Same as above, but using b.RunParallel +} +``` + +- [ ] **Step 7: Run all matcher tests** + +```bash +go test -v -count=1 ./internal/matcher/... +``` + +- [ ] **Step 8: Run the benchmarks** + +```bash +go test -bench=BenchmarkRadixTree -benchmem ./internal/matcher/... 
+``` + +Expected: allocs/op drops from N (the number of handler nodes on the matched path) to 1 (the pool acquisition only). + +- [ ] **Step 9: Commit** + +```bash +git add internal/matcher/radix.go internal/matcher/radix_bench_test.go +git commit -m "perf(matcher): eliminate heap allocations in RadixTree search with sync.Pool" +``` + +--- + +## Task 4: resolver - switch the LRU from O(n) to O(1) + +**Files:** +- Modify: `internal/resolver/resolver.go`, `internal/resolver/cache.go` +- Test: `internal/resolver/resolver_test.go`, `internal/resolver/mock_dns_test.go` +- Benchmark: `internal/resolver/resolver_bench_test.go` + +**Problem**: The DNS cache's LRU implements `moveToFrontLocked` on a `[]string` slice, so every operation costs an O(n) linear scan plus a slice reshuffle. `storeCache` holds the write lock for the whole O(n) operation, blocking all concurrent reads. + +**Approach**: Replace the `[]string` slice with `container/list` plus a `map[string]*list.Element` (the same pattern used by FileCache and FileInfoCache). Both moveToFront and eviction become O(1). + +- [ ] **Step 1: Modify the DNSResolver struct** + +```go +type DNSResolver struct { + config *config.ResolverConfig + stopCh chan struct{} + refreshHosts map[string]struct{} + cache map[string]*DNSCacheEntry + lruList *list.List // replaces lruOrder []string + lruIndex map[string]*list.Element // new: host -> list.Element + hits atomic.Int64 + misses atomic.Int64 + errors atomic.Int64 + latencyNs atomic.Int64 + count atomic.Int64 + mu sync.RWMutex + serverIdx atomic.Uint32 + started atomic.Bool +} +``` + +- [ ] **Step 2: Rewrite storeCache** + +```go +func (r *DNSResolver) storeCache(host string, entry *DNSCacheEntry) { + r.mu.Lock() + defer r.mu.Unlock() + + if elem, ok := r.lruIndex[host]; ok { + r.cache[host] = entry + r.lruList.MoveToFront(elem) + return + } + + if r.config.CacheSize > 0 && len(r.cache) >= r.config.CacheSize { + r.evictLRULocked() + } + + r.cache[host] = entry + elem := r.lruList.PushFront(host) + r.lruIndex[host] = elem +} +``` + +- [ ] **Step 3: Rewrite evictLRULocked** + +```go +func (r *DNSResolver) evictLRULocked() { + oldest := r.lruList.Back() + if oldest == nil { + return + } + host := oldest.Value.(string) + delete(r.cache, host) + delete(r.lruIndex, host) + 
r.lruList.Remove(oldest) +} +``` + +- [ ] **Step 4: Delete moveToFrontLocked** (no longer needed; superseded by `lruList.MoveToFront`) + +- [ ] **Step 5: Update the New() constructor** + +```go +return &DNSResolver{ + config: &configCopy, + stopCh: make(chan struct{}), + refreshHosts: make(map[string]struct{}), + cache: make(map[string]*DNSCacheEntry), + lruList: list.New(), + lruIndex: make(map[string]*list.Element), +} +``` + +- [ ] **Step 6: Update DeleteCacheEntry** + +```go +func (r *DNSResolver) DeleteCacheEntry(host string) { + r.mu.Lock() + defer r.mu.Unlock() + delete(r.cache, host) + if elem, ok := r.lruIndex[host]; ok { + r.lruList.Remove(elem) + delete(r.lruIndex, host) + } + delete(r.refreshHosts, host) +} +``` + +- [ ] **Step 7: Update ClearCache** + +```go +func (r *DNSResolver) ClearCache() { + r.mu.Lock() + r.cache = make(map[string]*DNSCacheEntry) + r.lruList = list.New() + r.lruIndex = make(map[string]*list.Element) + r.refreshHosts = make(map[string]struct{}) + r.mu.Unlock() +} +``` + +- [ ] **Step 8: Add import "container/list"** + +- [ ] **Step 9: Run all resolver tests** + +```bash +go test -v -count=1 ./internal/resolver/... +``` + +- [ ] **Step 10: Run the benchmarks to verify** + +```bash +go test -bench=BenchmarkDNS -benchmem -count=5 ./internal/resolver/... 
+``` + +Expected: `BenchmarkDNSResolverCacheWriteLock` and `BenchmarkDNSResolverMixedWorkload` get significantly faster. + +- [ ] **Step 11: Commit** + +```bash +git add internal/resolver/resolver.go internal/resolver/cache.go +git commit -m "perf(resolver): replace slice-based LRU with container/list for O(1) operations" +``` + +--- + +## Task 5: handler - eliminate the FileInfoCache read-lock upgrade with approximate LRU + +**Files:** +- Modify: `internal/handler/fileinfo_cache.go` +- Test: `internal/handler/static_test.go` (indirect, verified through the existing tests) +- Benchmark: `internal/handler/static_bench_test.go` + +**Problem**: `FileInfoCache.Get()` needs **two lock acquisitions** on every cache hit: it takes RLock to check existence and TTL, releases RLock, then takes Lock to perform the `MoveToFront` LRU update. Every hit incurs an RLock→Lock upgrade. + +**Approach**: Adopt an approximate LRU policy: the Get path skips `MoveToFront` and returns on an RLock-only fast path. The LRU position is updated only on the Set path (write operations). This matches the approximate LRU policy of FileCache. + +- [ ] **Step 1: Rewrite Get as an RLock-only fast path** + +```go +func (c *FileInfoCache) Get(filePath string) (os.FileInfo, bool) { + c.mu.RLock() + entry, ok := c.entries[filePath] + if !ok { + c.mu.RUnlock() + return nil, false + } + if time.Since(entry.cachedAt) > fileInfoCacheTTL { + c.mu.RUnlock() + // Removing an expired entry still requires the write lock + c.mu.Lock() + if e, ok := c.entries[filePath]; ok && time.Since(e.cachedAt) > fileInfoCacheTTL { + c.lruList.Remove(e.element) + delete(c.entries, filePath) + } + c.mu.Unlock() + return nil, false + } + info := entry.info + c.mu.RUnlock() + return info, true +} +``` + +- [ ] **Step 2: Add the LRU position update to Set** + +```go +func (c *FileInfoCache) Set(filePath string, info os.FileInfo) { + c.mu.Lock() + defer c.mu.Unlock() + + if entry, ok := c.entries[filePath]; ok { + entry.info = info + entry.cachedAt = time.Now() + c.lruList.MoveToFront(entry.element) + return + } + // ... eviction and insertion logic unchanged ... 
+} +``` + +- [ ] **Step 3: Add dedicated FileInfoCache benchmarks** + +Add to `internal/handler/static_bench_test.go`: + +```go +func BenchmarkFileInfoCacheGetHit(b *testing.B) { + cache := NewFileInfoCache() + info, _ := os.Stat("testdata/style.css") + cache.Set("/style.css", info) + + b.ResetTimer() + b.ReportAllocs() + for b.Loop() { + cache.Get("/style.css") + } +} + +func BenchmarkFileInfoCacheGetHitParallel(b *testing.B) { + cache := NewFileInfoCache() + info, _ := os.Stat("testdata/style.css") + cache.Set("/style.css", info) + + b.ResetTimer() + b.ReportAllocs() + b.RunParallel(func(pb *testing.PB) { + for pb.Next() { + cache.Get("/style.css") + } + }) +} +``` + +Note: confirm whether `NewFileInfoCache` is exported; if it is not, run the test from inside the package. + +- [ ] **Step 4: Run all handler tests** + +```bash +go test -v -count=1 ./internal/handler/... +``` + +- [ ] **Step 5: Run the benchmarks** + +```bash +go test -bench=BenchmarkFileInfoCache -benchmem ./internal/handler/... +``` + +Expected: the Get hit path drops from two lock operations to a single RLock, and parallel throughput improves significantly. + +- [ ] **Step 6: Commit** + +```bash +git add internal/handler/fileinfo_cache.go internal/handler/static_bench_test.go +git commit -m "perf(handler): eliminate read-lock upgrade in FileInfoCache.Get with approximate LRU" +``` + +--- + +## Task 6: loadbalance - eliminate the ConsistentHash double lock + +**Files:** +- Modify: `internal/loadbalance/consistent_hash.go` +- Test: `internal/loadbalance/balancer_test.go` + +**Problem**: When `SelectByKey` and `SelectExcludingByKey` find `circle` empty, they execute `RLock → RUnlock → rebuildCircle(Lock) → RLock`: release the read lock, take the write lock to rebuild, then take the read lock again. Under high concurrency at cold start, several goroutines may trigger the rebuild at the same time. + +**Approach**: Use `sync.Once` or an `atomic.Bool` to ensure the rebuild runs only once. The rebuild completes before the first Select, and later calls read directly under RLock. Also replace `hashKeyString` with the inline `fnvHash64a` (defined in Task 1). + +- [ ] **Step 1: Add the `rebuilt` flag field** + +```go +type ConsistentHash struct { + circle map[uint64]*Target + hashKey string + sortedHashes []uint64 + virtualNodes int + mu sync.RWMutex + rebuilt atomic.Bool +} +``` + +- [ ] **Step 2: Rewrite SelectByKey to use ensureRebuilt** + +```go +func (c *ConsistentHash) 
ensureRebuilt(targets []*Target) { + if c.rebuilt.Load() { + return + } + c.rebuildCircle(targets) +} + +func (c *ConsistentHash) SelectByKey(targets []*Target, key string) *Target { + c.ensureRebuilt(targets) + + c.mu.RLock() + defer c.mu.RUnlock() + + if len(c.sortedHashes) == 0 { + return nil + } + + hash := fnvHash64a(key) + idx := sort.Search(len(c.sortedHashes), func(i int) bool { + return c.sortedHashes[i] >= hash + }) + if idx >= len(c.sortedHashes) { + idx = 0 + } + return c.circle[c.sortedHashes[idx]] +} +``` + +- [ ] **Step 3: Update the Rebuild method to reset the rebuilt flag** + +```go +func (c *ConsistentHash) Rebuild(targets []*Target) { + c.rebuilt.Store(false) + c.rebuildCircle(targets) +} +``` + +- [ ] **Step 4: Update rebuildCircle to set the rebuilt flag** + +```go +func (c *ConsistentHash) rebuildCircle(targets []*Target) { + c.mu.Lock() + defer c.mu.Unlock() + // ... existing logic unchanged ... + c.rebuilt.Store(true) +} +``` + +- [ ] **Step 5: Update SelectExcludingByKey in the same way** + +Remove the internal `RLock → RUnlock → rebuildCircle → RLock` pattern and call `ensureRebuilt` first, then `RLock`. + +- [ ] **Step 6: Replace hashKeyString with fnvHash64a** + +```go +// Delete the hashKeyString method +// In PrecomputeHashes, replace c.hashKeyString(key) with fnvHash64a(key) +``` + +- [ ] **Step 7: Run the tests** + +```bash +go test -v -count=1 ./internal/loadbalance/... +``` + +- [ ] **Step 8: Run the benchmarks** + +```bash +go test -bench=BenchmarkConsistentHash -benchmem ./internal/loadbalance/... 
+``` + +- [ ] **Step 9: Commit** + +```bash +git add internal/loadbalance/consistent_hash.go +git commit -m "perf(loadbalance): eliminate double-lock in ConsistentHash with atomic rebuild guard" +``` + +--- + +## Task 7: Global verification and benchmark comparison + +**Files:** +- No file changes + +- [ ] **Step 1: Run the full test suite** + +```bash +make test +``` + +- [ ] **Step 2: Run the integration tests** + +```bash +make test-integration +``` + +- [ ] **Step 3: Run code formatting and static checks** + +```bash +make fmt && make lint +``` + +- [ ] **Step 4: Save the benchmark comparison results** + +```bash +make bench-stat +mv benchmark-current.txt bench-after-optimization.txt +``` + +If pre-optimization benchmark data is available, run `benchstat bench-before.txt bench-after-optimization.txt` to compare. + +- [ ] **Step 5: Final commit (if there are lint fixes)** + +```bash +git add -A +git commit -m "chore: lint fixes after performance optimization" +``` + +--- + +## Dependencies + +``` +Task 1 (filterHealthy Pool) ──→ Task 6 (ConsistentHash, reuses fnvHash64a) +Task 2 (IsAvailable atomic) ──→ no dependencies (can run in parallel) +Task 3 (RadixTree Pool) ──→ no dependencies (can run in parallel) +Task 4 (Resolver LRU) ──→ no dependencies (can run in parallel) +Task 5 (FileInfoCache) ──→ no dependencies (can run in parallel) +Task 7 (global verification) ──→ depends on Tasks 1-6 all being complete +``` + +**Recommended parallel execution**: Tasks 1 and 2 can go in the same batch (same file), Tasks 3/4/5 can run in parallel, and Task 6 runs after Task 1. diff --git a/docs/superpowers/plans/2026-06-08-loadbalance-enhancement.md b/docs/superpowers/plans/2026-06-08-loadbalance-enhancement.md new file mode 100644 index 0000000..6dc8e22 --- /dev/null +++ b/docs/superpowers/plans/2026-06-08-loadbalance-enhancement.md @@ -0,0 +1,1620 @@ +# Least Time & Session Sticky Load Balancer Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. 
+ +**Goal:** Implement a high-performance Least Time load-balancing algorithm and Session Sticky session persistence for Lolly + +**Architecture:** Least Time uses an atomic EWMA statistics tracker to record the response time of each backend and selects the target with the shortest response time; Session Sticky implements session persistence with a 256-shard lock plus a cookie routing table + +**Tech Stack:** Go 1.26+, fasthttp, atomic operations, sync.RWMutex + +--- + +## File Structure + +### New Files +- `internal/loadbalance/ewma.go` - atomic EWMA statistics tracker +- `internal/loadbalance/ewma_test.go` - EWMA tests +- `internal/loadbalance/least_time.go` - Least Time balancer +- `internal/loadbalance/least_time_test.go` - Least Time tests +- `internal/loadbalance/sticky.go` - Session Sticky balancer +- `internal/loadbalance/sticky_test.go` - Session Sticky tests +- `internal/loadbalance/sticky_config.go` - Sticky configuration struct + +### Modified Files +- `internal/loadbalance/algorithms.go` - add the new algorithms to validAlgorithms +- `internal/loadbalance/balancer.go` - add the Stats field to Target +- `internal/config/proxy_config.go` - add LeastTimeConfig and StickyConfig +- `internal/config/defaults.go` - add comments for the default configuration +- `internal/config/validate.go` - validate the new configuration options +- `internal/proxy/proxy.go` - integrate createBalancer and RecordResponseTime +- `internal/proxy/target_selector.go` - make Select support StickySession + +--- + +## Task 1: EWMA Statistics Core + +**Files:** +- Create: `internal/loadbalance/ewma.go` +- Create: `internal/loadbalance/ewma_test.go` + +### Step 1.1: Write EWMA Failing Test + +```go +package loadbalance + +import ( + "sync" + "testing" + "time" +) + +func TestEWMAStats_BasicRecord(t *testing.T) { + stats := NewEWMAStats() + + // Record a 100ms response time + stats.Record(100*time.Millisecond, 200*time.Millisecond) + + headerTime := stats.HeaderTime() + lastByteTime := stats.LastByteTime() + + if headerTime == 0 { + t.Error("headerTime should not be zero after recording") + } + if lastByteTime == 0 { + t.Error("lastByteTime should not be zero after recording") + } + + // First sample: avg should equal the sample (alpha=1.0 for first sample) + if headerTime != 100*time.Millisecond { + t.Errorf("first headerTime = %v, want %v", 
headerTime, 100*time.Millisecond) + } + if lastByteTime != 200*time.Millisecond { + t.Errorf("first lastByteTime = %v, want %v", lastByteTime, 200*time.Millisecond) + } +} + +func TestEWMAStats_Convergence(t *testing.T) { + stats := NewEWMAStats() + + // Record multiple samples + for i := 0; i < 10; i++ { + stats.Record(100*time.Millisecond, 200*time.Millisecond) + } + + headerTime := stats.HeaderTime() + + // After many identical samples, avg should converge close to the value + // With alpha=0.3, after 10 samples of 100ms, should be close to 100ms + diff := headerTime - 100*time.Millisecond + if diff < 0 { + diff = -diff + } + if diff > 10*time.Millisecond { + t.Errorf("headerTime = %v, not converged to 100ms (diff=%v)", headerTime, diff) + } +} + +func TestEWMAStats_Concurrent(t *testing.T) { + stats := NewEWMAStats() + + var wg sync.WaitGroup + for i := 0; i < 100; i++ { + wg.Add(1) + go func() { + defer wg.Done() + for j := 0; j < 100; j++ { + stats.Record(time.Duration(j)*time.Millisecond, time.Duration(j*2)*time.Millisecond) + } + }() + } + wg.Wait() + + // After concurrent writes, should have some value (not panic or race) + headerTime := stats.HeaderTime() + lastByteTime := stats.LastByteTime() + + if headerTime == 0 { + t.Error("headerTime should not be zero after concurrent writes") + } + if lastByteTime == 0 { + t.Error("lastByteTime should not be zero after concurrent writes") + } +} +``` + +### Step 1.2: Run EWMA Test - Verify Fails + +Run: `cd /home/xfy/Developer/lolly && go test -v ./internal/loadbalance -run TestEWMAStats` +Expected: FAIL with "undefined: NewEWMAStats" + +### Step 1.3: Implement EWMA Core + +```go +package loadbalance + +import ( + "sync/atomic" + "time" +) + +// EWMAStats is an EWMA (exponentially weighted moving average) statistics tracker built on atomic operations. +// +// It uses fixed-point arithmetic instead of floating point, giving lock-free, allocation-free response time statistics. +type EWMAStats struct { + headerTime atomic.Int64 // EWMA of the time to first byte (nanoseconds) + lastByteTime atomic.Int64 // EWMA of the full response time (nanoseconds) + sampleCount atomic.Int64 // number of samples +} + +// defaultAlphaScale is the default EWMA alpha 
value (30%, expressed as the fixed-point ratio 300/1000) +const defaultAlphaScale = 300 // alpha = 0.3 + +// NewEWMAStats creates a new EWMA statistics tracker +func NewEWMAStats() *EWMAStats { + return &EWMAStats{} +} + +// Record records one response time sample. +// +// The EWMA is updated lock-free with atomic operations: +// - the first sample is stored directly as the current value +// - later samples: new_avg = alpha * new + (1 - alpha) * old +// +// Parameters: +// - headerTime: time to first byte +// - lastByteTime: full response time +func (e *EWMAStats) Record(headerTime, lastByteTime time.Duration) { + e.recordAtomic(&e.headerTime, headerTime) + e.recordAtomic(&e.lastByteTime, lastByteTime) + e.sampleCount.Add(1) +} + +// recordAtomic atomically updates a single EWMA value +func (e *EWMAStats) recordAtomic(ptr *atomic.Int64, newValue time.Duration) { + newNano := newValue.Nanoseconds() + + for { + old := ptr.Load() + if old == 0 { + // First record: set the value directly + if ptr.CompareAndSwap(0, newNano) { + return + } + continue + } + + // EWMA: new = alpha * new + (1 - alpha) * old + // Fixed-point: alphaScale = 300 (0.3) + // new_avg = (alpha * new + (1000 - alpha) * old) / 1000 + updated := (defaultAlphaScale*newNano + (1000-defaultAlphaScale)*old) / 1000 + + if ptr.CompareAndSwap(old, updated) { + return + } + // CAS failed, retry + } +} + +// HeaderTime returns the EWMA of the time to first byte +func (e *EWMAStats) HeaderTime() time.Duration { + return time.Duration(e.headerTime.Load()) +} + +// LastByteTime returns the EWMA of the full response time +func (e *EWMAStats) LastByteTime() time.Duration { + return time.Duration(e.lastByteTime.Load()) +} + +// SampleCount returns the number of samples recorded so far +func (e *EWMAStats) SampleCount() int64 { + return e.sampleCount.Load() +} + +// Reset resets the statistics tracker +func (e *EWMAStats) Reset() { + e.headerTime.Store(0) + e.lastByteTime.Store(0) + e.sampleCount.Store(0) +} +``` + +### Step 1.4: Run EWMA Test - Verify Passes + +Run: `cd /home/xfy/Developer/lolly && go test -v ./internal/loadbalance -run TestEWMAStats` +Expected: PASS (3 tests) + +### Step 1.5: Commit + +```bash +cd /home/xfy/Developer/lolly +git add internal/loadbalance/ewma.go internal/loadbalance/ewma_test.go +git commit -m "feat(loadbalance): add atomic EWMA statistics core + +- Zero-lock 
atomic EWMA implementation using fixed-point arithmetic +- Supports header_time and last_byte_time tracking +- Concurrent-safe with CAS retry loop" +``` + +--- + +## Task 2: Least Time Balancer + +**Files:** +- Create: `internal/loadbalance/least_time.go` +- Create: `internal/loadbalance/least_time_test.go` + +### Step 2.1: Write LeastTime Failing Test + +```go +package loadbalance + +import ( + "sync" + "testing" + "time" +) + +func TestLeastTime_BasicSelect(t *testing.T) { + lt := NewLeastTime("last_byte", time.Millisecond) + + targets := []*Target{ + NewTargetFromConfig("http://slow:8080", 1, 0, 0, 0, false, false, ""), + NewTargetFromConfig("http://fast:8080", 1, 0, 0, 0, false, false, ""), + } + + // Record different response times + targets[0].Stats.Record(200*time.Millisecond, 400*time.Millisecond) // slow + targets[1].Stats.Record(50*time.Millisecond, 100*time.Millisecond) // fast + + selected := lt.Select(targets) + if selected == nil { + t.Fatal("expected a target, got nil") + } + if selected.URL != "http://fast:8080" { + t.Errorf("selected = %s, want fast target", selected.URL) + } +} + +func TestLeastTime_NoStats(t *testing.T) { + lt := NewLeastTime("last_byte", time.Millisecond) + + targets := []*Target{ + NewTargetFromConfig("http://a:8080", 1, 0, 0, 0, false, false, ""), + NewTargetFromConfig("http://b:8080", 1, 0, 0, 0, false, false, ""), + } + + // No stats recorded - should still select one (using default) + selected := lt.Select(targets) + if selected == nil { + t.Fatal("expected a target, got nil") + } +} + +func TestLeastTime_HeaderMetric(t *testing.T) { + lt := NewLeastTime("header", time.Millisecond) + + targets := []*Target{ + NewTargetFromConfig("http://slow:8080", 1, 0, 0, 0, false, false, ""), + NewTargetFromConfig("http://fast:8080", 1, 0, 0, 0, false, false, ""), + } + + // Record: slow has worse header time but better last_byte time + targets[0].Stats.Record(200*time.Millisecond, 100*time.Millisecond) + 
targets[1].Stats.Record(50*time.Millisecond, 300*time.Millisecond) + + selected := lt.Select(targets) + if selected == nil { + t.Fatal("expected a target, got nil") + } + // Should pick fast based on header_time + if selected.URL != "http://fast:8080" { + t.Errorf("selected = %s, want fast target based on header_time", selected.URL) + } +} + +func TestLeastTime_SelectExcluding(t *testing.T) { + lt := NewLeastTime("last_byte", time.Millisecond) + + targets := []*Target{ + NewTargetFromConfig("http://a:8080", 1, 0, 0, 0, false, false, ""), + NewTargetFromConfig("http://b:8080", 1, 0, 0, 0, false, false, ""), + NewTargetFromConfig("http://c:8080", 1, 0, 0, 0, false, false, ""), + } + + targets[0].Stats.Record(10*time.Millisecond, 20*time.Millisecond) + targets[1].Stats.Record(30*time.Millisecond, 60*time.Millisecond) + targets[2].Stats.Record(50*time.Millisecond, 100*time.Millisecond) + + // Exclude the fastest + excluded := []*Target{targets[0]} + selected := lt.SelectExcluding(targets, excluded) + + if selected == nil { + t.Fatal("expected a target, got nil") + } + if selected.URL != "http://b:8080" { + t.Errorf("selected = %s, want second fastest", selected.URL) + } +} + +func TestLeastTime_Concurrent(t *testing.T) { + lt := NewLeastTime("last_byte", time.Millisecond) + + targets := []*Target{ + NewTargetFromConfig("http://a:8080", 1, 0, 0, 0, false, false, ""), + NewTargetFromConfig("http://b:8080", 1, 0, 0, 0, false, false, ""), + } + + var wg sync.WaitGroup + + // Concurrent recording + for i := 0; i < 50; i++ { + wg.Add(1) + go func() { + defer wg.Done() + for j := 0; j < 100; j++ { + targets[0].Stats.Record(time.Millisecond, 2*time.Millisecond) + targets[1].Stats.Record(2*time.Millisecond, 4*time.Millisecond) + } + }() + } + + // Concurrent selecting + for i := 0; i < 50; i++ { + wg.Add(1) + go func() { + defer wg.Done() + for j := 0; j < 100; j++ { + lt.Select(targets) + } + }() + } + + wg.Wait() +} +``` + +### Step 2.2: Run LeastTime Test - Verify Fails + 
+Run: `cd /home/xfy/Developer/lolly && go test -v ./internal/loadbalance -run TestLeastTime` +Expected: FAIL with "undefined: NewLeastTime" + +### Step 2.3: Modify Target to Add Stats Field + +File: `internal/loadbalance/balancer.go` + +Find the `type Target struct` definition and add the Stats field: + +```go +// Target represents a load-balanced backend server for the HTTP (L7) proxy. +type Target struct { + resolvedIPs atomic.Pointer[[]string] + URL string + hostname string + VirtualHashes []uint64 + Weight int + Connections int64 + lastResolved atomic.Int64 + hostnameOnce sync.Once + Healthy atomic.Bool + + // Stats holds response-time statistics (used by the least_time algorithm) + Stats *EWMAStats + + // ... rest of fields unchanged +``` + +Also update `NewTargetFromConfig` to initialize Stats: + +```go +func NewTargetFromConfig(url string, weight int, maxConns int64, maxFails int64, failTimeout time.Duration, backup bool, down bool, proxyURI string) *Target { + t := &Target{ + URL: url, + Weight: weight, + MaxConns: maxConns, + MaxFails: maxFails, + FailTimeout: failTimeout, + Backup: backup, + Down: down, + ProxyURI: proxyURI, + Stats: NewEWMAStats(), // initialize the stats collector + } + t.initHostname() + if !down { + t.Healthy.Store(true) + } + return t +} +``` + +### Step 2.4: Implement LeastTime Balancer + +```go +package loadbalance + +import ( + "time" +) + +// ResponseTimeRecorder is the response-time recording interface. +// Balancers that implement it receive response-time statistics after a request completes. +type ResponseTimeRecorder interface { + RecordResponseTime(target *Target, headerTime, lastByteTime time.Duration) +} + +// LeastTime is a load balancer driven by the EWMA of response times. +// +// It selects the healthy target with the shortest response time. Two metrics are supported: +// - "header": time to first byte (from sending the request to receiving the response headers) +// - "last_byte": full response time (from sending the request to receiving the complete response) +type LeastTime struct { + metric string // "header" or "last_byte" + defaultTime time.Duration // default used when no samples exist +} + +// NewLeastTime creates a Least Time load balancer. +// +// Parameters: +// - metric: the metric to use, "header" or "last_byte" +// - defaultTime: default response time when no samples exist (keeps new nodes from being starved) +func NewLeastTime(metric string, defaultTime time.Duration) *LeastTime { + if metric != "header" { + metric =
"last_byte" // 默认使用 last_byte + } + if defaultTime <= 0 { + defaultTime = time.Millisecond // 默认 1ms + } + return &LeastTime{ + metric: metric, + defaultTime: defaultTime, + } +} + +// Select 选择响应时间最短的健康目标。 +// 只考虑可用目标。如果没有可用目标则返回 nil。 +func (l *LeastTime) Select(targets []*Target) *Target { + fc := acquireFilterContext() + defer releaseFilterContext(fc) + available := filterInto(fc, targets) + return l.selectFrom(available) +} + +// SelectExcluding 选择响应时间最短的目标,排除指定的目标列表。 +func (l *LeastTime) SelectExcluding(targets []*Target, excluded []*Target) *Target { + fc := acquireFilterContext() + defer releaseFilterContext(fc) + available := filterIntoExcluding(fc, targets, excluded) + return l.selectFrom(available) +} + +// selectFrom 从可用目标列表中选择响应时间最短的 +func (l *LeastTime) selectFrom(available []*Target) *Target { + if len(available) == 0 { + return nil + } + + var selected *Target + var minTime int64 = -1 + defaultNano := l.defaultTime.Nanoseconds() + + for _, t := range available { + var currentTime int64 + if t.Stats != nil { + if l.metric == "header" { + currentTime = t.Stats.headerTime.Load() + } else { + currentTime = t.Stats.lastByteTime.Load() + } + } + + // 无统计样本时使用默认值 + if currentTime == 0 { + currentTime = defaultNano + } + + if selected == nil || currentTime < minTime { + selected = t + minTime = currentTime + } + } + + return selected +} + +// RecordResponseTime 记录目标响应时间(实现 ResponseTimeRecorder 接口)。 +func (l *LeastTime) RecordResponseTime(target *Target, headerTime, lastByteTime time.Duration) { + if target != nil && target.Stats != nil { + target.Stats.Record(headerTime, lastByteTime) + } +} + +// GetMetric 返回当前使用的指标 +func (l *LeastTime) GetMetric() string { + return l.metric +} + +var _ Balancer = (*LeastTime)(nil) +var _ ResponseTimeRecorder = (*LeastTime)(nil) +``` + +### Step 2.5: Run LeastTime Test - Verify Passes + +Run: `cd /home/xfy/Developer/lolly && go test -v ./internal/loadbalance -run TestLeastTime` +Expected: PASS (5 tests) + +### Step 2.6: 
Commit + +```bash +cd /home/xfy/Developer/lolly +git add internal/loadbalance/balancer.go internal/loadbalance/least_time.go internal/loadbalance/least_time_test.go +git commit -m "feat(loadbalance): implement Least Time balancer + +- Add atomic EWMA Stats field to Target +- Implement LeastTime balancer with header_time and last_byte metrics +- Support Select and SelectExcluding with zero-lock design +- Add ResponseTimeRecorder interface for proxy integration" +``` + +--- + +## Task 3: Session Sticky Balancer + +**Files:** +- Create: `internal/loadbalance/sticky_config.go` +- Create: `internal/loadbalance/sticky.go` +- Create: `internal/loadbalance/sticky_test.go` + +### Step 3.1: Write StickyConfig Structure + +```go +package loadbalance + +import "time" + +// StickyConfig Session Sticky 配置 +type StickyConfig struct { + Enabled bool `yaml:"enabled"` + Name string `yaml:"name"` // cookie 名称 + Expires time.Duration `yaml:"expires"` // session 有效期 + Domain string `yaml:"domain"` // cookie domain + Path string `yaml:"path"` // cookie path + Secure bool `yaml:"secure"` // Secure flag + HttpOnly bool `yaml:"http_only"` // HttpOnly flag + SameSite string `yaml:"same_site"` // SameSite attribute +} + +// DefaultStickyConfig 返回默认 Sticky 配置 +func DefaultStickyConfig() StickyConfig { + return StickyConfig{ + Name: "lolly_route", + Expires: time.Hour, + Path: "/", + HttpOnly: true, + SameSite: "Lax", + } +} +``` + +### Step 3.2: Write Sticky Test (Failing) + +```go +package loadbalance + +import ( + "strings" + "sync" + "testing" + "time" + + "github.com/valyala/fasthttp" +) + +func TestStickySession_BasicRoute(t *testing.T) { + fallback := NewRoundRobin() + config := DefaultStickyConfig() + config.Expires = time.Hour + + sticky := NewStickySession(config, fallback) + sticky.Start() + defer sticky.Stop() + + targets := []*Target{ + NewTargetFromConfig("http://backend1:8080", 1, 0, 0, 0, false, false, ""), + NewTargetFromConfig("http://backend2:8080", 1, 0, 0, 0, false, false, 
""), + } + + ctx := &fasthttp.RequestCtx{} + + // First request - should set cookie + selected1 := sticky.Select(ctx, targets) + if selected1 == nil { + t.Fatal("expected a target, got nil") + } + + // Check cookie was set + cookie := ctx.Response.Header.PeekCookie(config.Name) + if len(cookie) == 0 { + t.Fatal("expected cookie to be set") + } + + // Second request with same cookie - should route to same target + ctx2 := &fasthttp.RequestCtx{} + ctx2.Request.Header.SetCookie(config.Name, string(extractCookieValue(cookie))) + + selected2 := sticky.Select(ctx2, targets) + if selected2 == nil { + t.Fatal("expected a target, got nil") + } + if selected2.URL != selected1.URL { + t.Errorf("sticky routing failed: got %s, want %s", selected2.URL, selected1.URL) + } +} + +func TestStickySession_TargetUnavailable(t *testing.T) { + fallback := NewRoundRobin() + config := DefaultStickyConfig() + + sticky := NewStickySession(config, fallback) + sticky.Start() + defer sticky.Stop() + + targets := []*Target{ + NewTargetFromConfig("http://backend1:8080", 1, 0, 0, 0, false, false, ""), + NewTargetFromConfig("http://backend2:8080", 1, 0, 0, 0, false, false, ""), + } + + ctx := &fasthttp.RequestCtx{} + + // First request + selected1 := sticky.Select(ctx, targets) + if selected1 == nil { + t.Fatal("expected a target, got nil") + } + + // Make target unavailable + selected1.Healthy.Store(false) + + // Second request with cookie - should fall back to another target + ctx2 := &fasthttp.RequestCtx{} + cookie := ctx.Response.Header.PeekCookie(config.Name) + ctx2.Request.Header.SetCookie(config.Name, string(extractCookieValue(cookie))) + + selected2 := sticky.Select(ctx2, targets) + if selected2 == nil { + t.Fatal("expected a target after fallback, got nil") + } + if selected2.URL == selected1.URL { + t.Error("expected fallback to different target") + } +} + +func TestStickySession_CookieEncodeDecode(t *testing.T) { + targetURL := "http://backend1:8080" + expires := time.Now().Add(time.Hour) + + encoded := encodeStickyCookie(targetURL, expires) + decodedURL,
decodedExpires, ok := decodeStickyCookie(encoded) + + if !ok { + t.Fatal("decode failed") + } + if decodedURL != targetURL { + t.Errorf("url = %s, want %s", decodedURL, targetURL) + } + if decodedExpires.Unix() != expires.Unix() { + t.Errorf("expires mismatch") + } +} + +func TestStickySession_Concurrent(t *testing.T) { + fallback := NewRoundRobin() + config := DefaultStickyConfig() + + sticky := NewStickySession(config, fallback) + sticky.Start() + defer sticky.Stop() + + targets := []*Target{ + NewTargetFromConfig("http://backend1:8080", 1, 0, 0, 0, false, false, ""), + NewTargetFromConfig("http://backend2:8080", 1, 0, 0, 0, false, false, ""), + } + + var wg sync.WaitGroup + for i := 0; i < 100; i++ { + wg.Add(1) + go func(id int) { + defer wg.Done() + ctx := &fasthttp.RequestCtx{} + sticky.Select(ctx, targets) + }(i) + } + wg.Wait() +} + +// Helper to extract cookie value from Set-Cookie header +func extractCookieValue(cookieHeader []byte) []byte { + s := string(cookieHeader) + // Format: "name=value; ..." 
+ parts := strings.SplitN(s, "=", 2) + if len(parts) != 2 { + return nil + } + valueParts := strings.SplitN(parts[1], ";", 2) + return []byte(valueParts[0]) +} +``` + +### Step 3.3: Run Sticky Test - Verify Fails + +Run: `cd /home/xfy/Developer/lolly && go test -v ./internal/loadbalance -run TestStickySession` +Expected: FAIL with undefined functions + +### Step 3.4: Implement StickySession + +```go +package loadbalance + +import ( + "encoding/base64" + "strconv" + "strings" + "sync" + "sync/atomic" + "time" + + "github.com/valyala/fasthttp" +) + +const stickyShardCount = 256 + +// StickySession is a cookie-based session-persistence load balancer. +// +// It uses 256 lock shards to reduce lock contention and supports TTL expiry with background cleanup. +type StickySession struct { + config StickyConfig + fallback Balancer + + shards [stickyShardCount]*stickyShard + cleaner *time.Ticker + stopCh chan struct{} + started atomic.Bool +} + +type stickyShard struct { + mu sync.RWMutex + sessions map[string]*stickyEntry +} + +type stickyEntry struct { + targetURL string + expiresAt int64 // Unix nanoseconds +} + +// NewStickySession creates a Session Sticky load balancer. +// +// Parameters: +// - config: sticky configuration +// - fallback: algorithm used for first-time routing and when the sticky target is unavailable +func NewStickySession(config StickyConfig, fallback Balancer) *StickySession { + if fallback == nil { + fallback = NewRoundRobin() + } + + s := &StickySession{ + config: config, + fallback: fallback, + stopCh: make(chan struct{}), + } + + for i := 0; i < stickyShardCount; i++ { + s.shards[i] = &stickyShard{ + sessions: make(map[string]*stickyEntry), + } + } + + return s +} + +// Start launches the background cleanup task. +func (s *StickySession) Start() { + if s.started.Swap(true) { + return + } + s.cleaner = time.NewTicker(60 * time.Second) + go s.cleanupLoop() +} + +// Stop stops the background cleanup task. +// stopCh is closed here, so a stopped StickySession must not be started again. +func (s *StickySession) Stop() { + if !s.started.Swap(false) { + return + } + close(s.stopCh) +} + +// cleanupLoop is the background cleanup loop +func (s *StickySession) cleanupLoop() { + // Release the ticker when the loop exits + defer s.cleaner.Stop() + for { + select { + case <-s.cleaner.C: + s.cleanupExpired() + case <-s.stopCh: + return + } + } +} + +// cleanupExpired
清理所有过期 session +func (s *StickySession) cleanupExpired() { + now := time.Now().UnixNano() + for _, shard := range s.shards { + shard.mu.Lock() + for key, entry := range shard.sessions { + if entry.expiresAt < now { + delete(shard.sessions, key) + } + } + shard.mu.Unlock() + } +} + +// Select 根据 Cookie 选择目标。 +// +// 1. 检查请求中的 sticky cookie +// 2. 如果存在且目标健康,路由到该目标 +// 3. 如果不存在或目标不可用,使用 fallback 选择 +// 4. 设置新的 Set-Cookie 响应头 +func (s *StickySession) Select(ctx *fasthttp.RequestCtx, targets []*Target) *Target { + // 1. 检查现有 cookie + cookieValue := ctx.Request.Header.Cookie(s.config.Name) + if len(cookieValue) > 0 { + targetURL, expires, ok := decodeStickyCookie(string(cookieValue)) + if ok && expires.After(time.Now()) { + // 查找目标是否可用 + for _, t := range targets { + if t.URL == targetURL && t.IsAvailable() { + return t + } + } + // 目标不可用,删除 session + s.deleteSession(string(cookieValue)) + } + } + + // 2. 使用 fallback 选择 + selected := s.fallback.Select(targets) + if selected == nil { + return nil + } + + // 3. 种 cookie + s.setCookie(ctx, selected.URL) + + // 4. 
记录 session + s.recordSession(selected.URL) + + return selected +} + +// SelectExcluding 排除指定目标后选择。 +func (s *StickySession) SelectExcluding(targets []*Target, excluded []*Target) *Target { + // Session Sticky 通常不用于 failover 场景, + // 但如果需要,可以先尝试 cookie,不行再用 fallback.SelectExcluding + // 这里简化实现:使用 fallback 的 SelectExcluding + return s.fallback.SelectExcluding(targets, excluded) +} + +// setCookie 设置 Set-Cookie 响应头 +func (s *StickySession) setCookie(ctx *fasthttp.RequestCtx, targetURL string) { + expires := time.Now().Add(s.config.Expires) + cookieValue := encodeStickyCookie(targetURL, expires) + + var cookie fasthttp.Cookie + cookie.SetKey(s.config.Name) + cookie.SetValue(cookieValue) + cookie.SetExpire(expires) + cookie.SetPath(s.config.Path) + if s.config.Domain != "" { + cookie.SetDomain(s.config.Domain) + } + if s.config.Secure { + cookie.SetSecure(true) + } + if s.config.HttpOnly { + cookie.SetHTTPOnly(true) + } + switch strings.ToLower(s.config.SameSite) { + case "strict": + cookie.SetSameSite(fasthttp.CookieSameSiteStrictMode) + case "none": + cookie.SetSameSite(fasthttp.CookieSameSiteNoneMode) + default: + cookie.SetSameSite(fasthttp.CookieSameSiteLaxMode) + } + + ctx.Response.Header.SetCookie(&cookie) +} + +// recordSession 记录 session 到路由表 +func (s *StickySession) recordSession(targetURL string) { + cookieValue := encodeStickyCookie(targetURL, time.Now().Add(s.config.Expires)) + shard := s.getShard(cookieValue) + + shard.mu.Lock() + shard.sessions[cookieValue] = &stickyEntry{ + targetURL: targetURL, + expiresAt: time.Now().Add(s.config.Expires).UnixNano(), + } + shard.mu.Unlock() +} + +// deleteSession 删除 session +func (s *StickySession) deleteSession(cookieValue string) { + shard := s.getShard(cookieValue) + shard.mu.Lock() + delete(shard.sessions, cookieValue) + shard.mu.Unlock() +} + +// getShard 根据 cookie 值计算分片索引 +func (s *StickySession) getShard(cookieValue string) *stickyShard { + hash := fnvHash64a(cookieValue) + return s.shards[hash%stickyShardCount] 
+} + +// encodeStickyCookie 编码路由信息到 cookie 值 +// 格式: base64(target_url + "|" + expires_timestamp) +func encodeStickyCookie(targetURL string, expires time.Time) string { + raw := targetURL + "|" + strconv.FormatInt(expires.Unix(), 10) + return base64.URLEncoding.EncodeToString([]byte(raw)) +} + +// decodeStickyCookie 解码 cookie 值 +func decodeStickyCookie(value string) (targetURL string, expires time.Time, ok bool) { + raw, err := base64.URLEncoding.DecodeString(value) + if err != nil { + return + } + parts := strings.Split(string(raw), "|") + if len(parts) != 2 { + return + } + ts, err := strconv.ParseInt(parts[1], 10, 64) + if err != nil { + return + } + return parts[0], time.Unix(ts, 0), true +} + +var _ Balancer = (*StickySession)(nil) +``` + +### Step 3.5: Run Sticky Test - Verify Passes + +Run: `cd /home/xfy/Developer/lolly && go test -v ./internal/loadbalance -run TestStickySession` +Expected: PASS (4 tests) + +### Step 3.6: Commit + +```bash +cd /home/xfy/Developer/lolly +git add internal/loadbalance/sticky_config.go internal/loadbalance/sticky.go internal/loadbalance/sticky_test.go +git commit -m "feat(loadbalance): implement Session Sticky balancer + +- Add 256-shard lock map for concurrent session routing +- Cookie-based session persistence with base64 encoding +- TTL expiration with background cleanup goroutine +- Support Secure, HttpOnly, SameSite cookie attributes +- Fallback to configured balancer when session target unavailable" +``` + +--- + +## Task 4: Configuration Integration + +**Files:** +- Modify: `internal/loadbalance/algorithms.go` +- Modify: `internal/config/proxy_config.go` +- Modify: `internal/config/defaults.go` +- Modify: `internal/config/validate.go` + +### Step 4.1: Add Algorithms to Valid List + +File: `internal/loadbalance/algorithms.go` + +```go +var validAlgorithms = []string{ + "round_robin", + "weighted_round_robin", + "least_conn", + "ip_hash", + "consistent_hash", + "random", + "least_time", + "sticky", +} +``` + +### Step 4.2: 
Add Config Structures + +File: `internal/config/proxy_config.go` + +Add to existing ProxyConfig: + +```go +// ProxyConfig 代理配置 +type ProxyConfig struct { + // ... existing fields ... + + // LeastTime 最小时间负载均衡配置 + LeastTime LeastTimeConfig `yaml:"least_time"` + + // Sticky Session Sticky 配置 + Sticky StickyConfig `yaml:"sticky"` +} + +// LeastTimeConfig 最小时间负载均衡配置 +type LeastTimeConfig struct { + Metric string `yaml:"metric"` // "header" 或 "last_byte" + DefaultTime time.Duration `yaml:"default_time"` // 无样本时的默认时间 +} + +// StickyConfig Session Sticky 配置 +type StickyConfig struct { + Enabled bool `yaml:"enabled"` + Name string `yaml:"name"` + Expires time.Duration `yaml:"expires"` + Domain string `yaml:"domain"` + Path string `yaml:"path"` + Secure bool `yaml:"secure"` + HttpOnly bool `yaml:"http_only"` + SameSite string `yaml:"same_site"` + FallbackAlgo string `yaml:"fallback_balance"` // fallback 算法 +} +``` + +### Step 4.3: Update Defaults + +File: `internal/config/defaults.go` + +在生成默认配置的函数中添加注释(搜索 `load_balance:` 相关行并扩展): + +```go +buf.WriteString(" # load_balance: round_robin # 负载均衡算法(有效值: round_robin, weighted_round_robin, least_conn, ip_hash, consistent_hash, random, least_time, sticky)\n") + +// 在 proxy 配置块后添加: +buf.WriteString(" # least_time: # 最小时间负载均衡配置\n") +buf.WriteString(" # metric: last_byte # 指标类型(header: 首字节时间, last_byte: 完整响应时间)\n") +buf.WriteString(" # default_time: 1ms # 无统计样本时的默认响应时间\n") +buf.WriteString(" # sticky: # Session Sticky 配置\n") +buf.WriteString(" # enabled: false # 是否启用\n") +buf.WriteString(" # name: lolly_route # cookie 名称\n") +buf.WriteString(" # expires: 1h # session 有效期\n") +buf.WriteString(" # path: / # cookie 路径\n") +buf.WriteString(" # http_only: true # HttpOnly flag\n") +buf.WriteString(" # same_site: Lax # SameSite 属性\n") +buf.WriteString(" # fallback_balance: round_robin # fallback 算法\n") +``` + +### Step 4.4: Add Validation + +File: `internal/config/validate.go` + +在验证 ProxyConfig 的地方添加: + +```go +// validate least_time 
config +if p.LoadBalance == "least_time" { + if p.LeastTime.Metric != "" && p.LeastTime.Metric != "header" && p.LeastTime.Metric != "last_byte" { + return fmt.Errorf("无效的 least_time metric: %s(有效值: header, last_byte)", p.LeastTime.Metric) + } +} + +// validate sticky config +if p.LoadBalance == "sticky" { + if !p.Sticky.Enabled { + return fmt.Errorf("load_balance=sticky 时 sticky.enabled 必须为 true") + } + if p.Sticky.FallbackAlgo != "" && !loadbalance.IsValidAlgorithm(p.Sticky.FallbackAlgo) { + return fmt.Errorf("无效的 sticky fallback_balance: %s", p.Sticky.FallbackAlgo) + } +} +``` + +### Step 4.5: Run Config Tests + +Run: `cd /home/xfy/Developer/lolly && go test -v ./internal/config -run TestValidate` +Expected: PASS (所有验证测试) + +### Step 4.6: Commit + +```bash +cd /home/xfy/Developer/lolly +git add internal/loadbalance/algorithms.go internal/config/proxy_config.go internal/config/defaults.go internal/config/validate.go +git commit -m "feat(config): add Least Time and Sticky configuration support + +- Add least_time and sticky to valid algorithms list +- Add LeastTimeConfig and StickyConfig structures +- Update default config generation with new options +- Add configuration validation for new fields" +``` + +--- + +## Task 5: Proxy Integration + +**Files:** +- Modify: `internal/proxy/proxy.go` +- Modify: `internal/proxy/target_selector.go` + +### Step 5.1: Update createBalancer + +File: `internal/proxy/proxy.go` + +在 `createBalancerByName` 函数中添加: + +```go +func createBalancerByName(name string, cfg *config.ProxyConfig) (loadbalance.Balancer, error) { + switch name { + // ... existing cases ... 
+ case "least_time": + metric := cfg.LeastTime.Metric + if metric == "" { + metric = "last_byte" + } + defaultTime := cfg.LeastTime.DefaultTime + if defaultTime <= 0 { + defaultTime = time.Millisecond + } + return loadbalance.NewLeastTime(metric, defaultTime), nil + case "sticky": + stickyCfg := loadbalance.StickyConfig{ + Enabled: cfg.Sticky.Enabled, + Name: cfg.Sticky.Name, + Expires: cfg.Sticky.Expires, + Domain: cfg.Sticky.Domain, + Path: cfg.Sticky.Path, + Secure: cfg.Sticky.Secure, + HttpOnly: cfg.Sticky.HttpOnly, + SameSite: cfg.Sticky.SameSite, + } + if stickyCfg.Name == "" { + stickyCfg.Name = "lolly_route" + } + if stickyCfg.Expires <= 0 { + stickyCfg.Expires = time.Hour + } + if stickyCfg.Path == "" { + stickyCfg.Path = "/" + } + + fallbackAlgo := cfg.Sticky.FallbackAlgo + if fallbackAlgo == "" { + fallbackAlgo = "round_robin" + } + fallbackBalancer, err := createBalancerByName(fallbackAlgo, cfg) + if err != nil { + return nil, fmt.Errorf("sticky fallback balancer: %w", err) + } + + sticky := loadbalance.NewStickySession(stickyCfg, fallbackBalancer) + sticky.Start() + return sticky, nil + // ... rest ... 
+ } +} +``` + +### Step 5.2: Add Response Time Recording + +In the Proxy request-handling flow, locate the point where the request completes (usually right after `Do` or a similar call) and add: + +```go +// recordResponseTime records the response time of a target +func (p *Proxy) recordResponseTime(target *loadbalance.Target, startTime time.Time, headerReceived time.Time) { + if target == nil || target.Stats == nil { + return + } + + headerTime := headerReceived.Sub(startTime) + lastByteTime := time.Since(startTime) + + target.Stats.Record(headerTime, lastByteTime) +} +``` + +**Note:** this function must be called where the request is actually issued, typically right after the fasthttp `HostClient.Do` call. + +Because proxy.go is large and its structure is complex, pick the insertion point as follows: + +In proxy.go, find where the request is executed (usually a `client.Do` or similar call) and add the following after it returns successfully: + +```go +// After the request completes (for example, after the Do call) +if recorder, ok := p.balancer.(loadbalance.ResponseTimeRecorder); ok { + recorder.RecordResponseTime(target, headerTime, lastByteTime) +} +``` + +### Step 5.3: Update Target Selector for Sticky + +File: `internal/proxy/target_selector.go` + +Modify `selectByBalancer` to support StickySession: + +```go +func (p *Proxy) selectByBalancer(ctx *fasthttp.RequestCtx, targets []*loadbalance.Target) *loadbalance.Target { + p.mu.RLock() + balancer := p.balancer + p.mu.RUnlock() + + // StickySession needs the request context + if sticky, ok := balancer.(*loadbalance.StickySession); ok { + return sticky.Select(ctx, targets) + } + + // ... existing IPHash and ConsistentHash handling ... + + return balancer.Select(targets) +} +``` + +Similarly, modify `selectTargetExcluding`: + +```go +func (p *Proxy) selectTargetExcluding(ctx *fasthttp.RequestCtx, excluded []*loadbalance.Target) *loadbalance.Target { + // ... existing code ... + + // StickySession is not normally used for failover, but if it is: + if sticky, ok := balancer.(*loadbalance.StickySession); ok { + return sticky.SelectExcluding(targets, excluded) + } + + // ... rest ...
+} +``` + +### Step 5.4: Run Proxy Tests + +Run: `cd /home/xfy/Developer/lolly && go test -v ./internal/proxy -run TestProxy` +Expected: PASS (现有测试不受影响) + +### Step 5.5: Commit + +```bash +cd /home/xfy/Developer/lolly +git add internal/proxy/proxy.go internal/proxy/target_selector.go +git commit -m "feat(proxy): integrate Least Time and Sticky balancers + +- Add least_time and sticky to createBalancerByName +- Implement response time recording for Least Time +- Support StickySession in target selector with request context +- StickySession auto-starts when created" +``` + +--- + +## Task 6: Full Integration Test + +**Files:** +- Modify: `internal/loadbalance/balancer_test.go` (add integration tests) + +### Step 6.1: Add Integration Tests + +```go +func TestBalancerIntegration_LeastTime(t *testing.T) { + targets := []*Target{ + NewTargetFromConfig("http://slow:8080", 1, 0, 0, 0, false, false, ""), + NewTargetFromConfig("http://fast:8080", 1, 0, 0, 0, false, false, ""), + } + + lt := NewLeastTime("last_byte", time.Millisecond) + + // Simulate: slow target has 100ms avg, fast has 10ms avg + for i := 0; i < 10; i++ { + targets[0].Stats.Record(50*time.Millisecond, 100*time.Millisecond) + targets[1].Stats.Record(5*time.Millisecond, 10*time.Millisecond) + } + + // Select 100 times, should mostly pick fast + fastCount := 0 + for i := 0; i < 100; i++ { + selected := lt.Select(targets) + if selected.URL == "http://fast:8080" { + fastCount++ + } + } + + if fastCount < 80 { + t.Errorf("fast target selected %d/100 times, expected >80", fastCount) + } +} + +func TestBalancerIntegration_StickyWithLeastTimeFallback(t *testing.T) { + fallback := NewLeastTime("last_byte", time.Millisecond) + config := StickyConfig{ + Enabled: true, + Name: "test_route", + Expires: time.Hour, + Path: "/", + HttpOnly: true, + } + + sticky := NewStickySession(config, fallback) + sticky.Start() + defer sticky.Stop() + + targets := []*Target{ + NewTargetFromConfig("http://backend1:8080", 1, 0, 0, 0, 
false, false, ""), + NewTargetFromConfig("http://backend2:8080", 1, 0, 0, 0, false, false, ""), + } + + ctx := &fasthttp.RequestCtx{} + + // First request + selected1 := sticky.Select(ctx, targets) + if selected1 == nil { + t.Fatal("expected a target") + } + + // Verify cookie set + cookie := ctx.Response.Header.PeekCookie("test_route") + if len(cookie) == 0 { + t.Fatal("expected cookie") + } + + // Make selected1 unhealthy + selected1.Healthy.Store(false) + + // Second request with cookie should fall back + ctx2 := &fasthttp.RequestCtx{} + ctx2.Request.Header.SetCookie("test_route", string(extractCookieValue(cookie))) + + selected2 := sticky.Select(ctx2, targets) + if selected2 == nil { + t.Fatal("expected fallback target") + } + if selected2.URL == selected1.URL { + t.Error("expected different target after fallback") + } +} +``` + +### Step 6.2: Run Integration Tests + +Run: `cd /home/xfy/Developer/lolly && go test -v ./internal/loadbalance -run TestBalancerIntegration` +Expected: PASS (2 tests) + +### Step 6.3: Commit + +```bash +cd /home/xfy/Developer/lolly +git add internal/loadbalance/balancer_test.go +git commit -m "test(loadbalance): add integration tests for Least Time and Sticky + +- Verify Least Time picks faster target consistently +- Verify Sticky fallback when target becomes unhealthy +- Test cookie encoding and session persistence" +``` + +--- + +## Task 7: Benchmark Tests + +**Files:** +- Create: `internal/loadbalance/least_time_bench_test.go` +- Create: `internal/loadbalance/sticky_bench_test.go` + +### Step 7.1: Least Time Benchmark + +```go +package loadbalance + +import ( + "testing" + "time" +) + +func BenchmarkLeastTime_Select(b *testing.B) { + lt := NewLeastTime("last_byte", time.Millisecond) + targets := []*Target{ + NewTargetFromConfig("http://a:8080", 1, 0, 0, 0, false, false, ""), + NewTargetFromConfig("http://b:8080", 1, 0, 0, 0, false, false, ""), + NewTargetFromConfig("http://c:8080", 1, 0, 0, 0, false, false, ""), + } + + //
Pre-populate stats + for _, t := range targets { + t.Stats.Record(10*time.Millisecond, 20*time.Millisecond) + } + + b.ResetTimer() + for i := 0; i < b.N; i++ { + lt.Select(targets) + } +} + +func BenchmarkLeastTime_Record(b *testing.B) { + stats := NewEWMAStats() + + b.ResetTimer() + for i := 0; i < b.N; i++ { + stats.Record(10*time.Millisecond, 20*time.Millisecond) + } +} + +func BenchmarkLeastTime_Concurrent(b *testing.B) { + lt := NewLeastTime("last_byte", time.Millisecond) + targets := []*Target{ + NewTargetFromConfig("http://a:8080", 1, 0, 0, 0, false, false, ""), + NewTargetFromConfig("http://b:8080", 1, 0, 0, 0, false, false, ""), + } + + b.RunParallel(func(pb *testing.PB) { + for pb.Next() { + lt.Select(targets) + } + }) +} +``` + +### Step 7.2: Sticky Benchmark + +```go +package loadbalance + +import ( + "testing" + + "github.com/valyala/fasthttp" +) + +func BenchmarkStickySession_Select(b *testing.B) { + fallback := NewRoundRobin() + config := DefaultStickyConfig() + + sticky := NewStickySession(config, fallback) + sticky.Start() + defer sticky.Stop() + + targets := []*Target{ + NewTargetFromConfig("http://backend1:8080", 1, 0, 0, 0, false, false, ""), + NewTargetFromConfig("http://backend2:8080", 1, 0, 0, 0, false, false, ""), + } + + // Pre-populate a cookie + ctx := &fasthttp.RequestCtx{} + sticky.Select(ctx, targets) + cookie := ctx.Response.Header.PeekCookie(config.Name) + // Extract the value once so parsing is not measured in the loop + cookieValue := string(extractCookieValue(cookie)) + + b.ResetTimer() + for i := 0; i < b.N; i++ { + ctx := &fasthttp.RequestCtx{} + ctx.Request.Header.SetCookie(config.Name, cookieValue) + sticky.Select(ctx, targets) + } +} + +func BenchmarkStickySession_SelectNew(b *testing.B) { + fallback := NewRoundRobin() + config := DefaultStickyConfig() + + sticky := NewStickySession(config, fallback) + sticky.Start() + defer sticky.Stop() + + targets := []*Target{ + NewTargetFromConfig("http://backend1:8080", 1, 0, 0, 0, false, false, ""), + NewTargetFromConfig("http://backend2:8080", 1, 0, 0, 0, false, false, ""), + } +
+ b.ResetTimer() + for i := 0; i < b.N; i++ { + ctx := &fasthttp.RequestCtx{} + sticky.Select(ctx, targets) + } +} +``` + +### Step 7.3: Run Benchmarks + +Run: `cd /home/xfy/Developer/lolly && go test -bench=. -benchmem ./internal/loadbalance -run=^$` +Expected: 显示性能数据 + +### Step 7.4: Commit + +```bash +cd /home/xfy/Developer/lolly +git add internal/loadbalance/least_time_bench_test.go internal/loadbalance/sticky_bench_test.go +git commit -m "perf(loadbalance): add benchmarks for Least Time and Sticky + +- Benchmark Select and Record operations +- Concurrent benchmark for realistic load testing +- Baseline for future performance optimization" +``` + +--- + +## Task 8: Final Verification + +### Step 8.1: Run All Loadbalance Tests + +Run: `cd /home/xfy/Developer/lolly && go test -v ./internal/loadbalance` +Expected: ALL PASS + +### Step 8.2: Run All Config Tests + +Run: `cd /home/xfy/Developer/lolly && go test -v ./internal/config` +Expected: ALL PASS + +### Step 8.3: Run All Proxy Tests + +Run: `cd /home/xfy/Developer/lolly && go test -v ./internal/proxy` +Expected: ALL PASS + +### Step 8.4: Build + +Run: `cd /home/xfy/Developer/lolly && go build ./...` +Expected: SUCCESS (no errors) + +### Step 8.5: Final Commit + +```bash +cd /home/xfy/Developer/lolly +git log --oneline -10 +``` + +--- + +## Spec Coverage Checklist + +| Spec Requirement | Plan Task | +|------------------|-----------| +| Least Time with EWMA | Task 1 + 2 | +| header_time metric | Task 2 (NewLeastTime parameter) | +| last_byte_time metric | Task 2 (NewLeastTime parameter) | +| Session Sticky cookie | Task 3 | +| 256-shard lock map | Task 3 (stickyShard) | +| Cookie encoding | Task 3 (encodeStickyCookie) | +| TTL expiration | Task 3 (stickyEntry.expiresAt) | +| Background cleanup | Task 3 (cleanupLoop) | +| Fallback algorithm | Task 3 (fallback balancer) | +| Configuration integration | Task 4 | +| Proxy integration | Task 5 | +| Response time recording | Task 5 | +| Zero-lock design | Task 1 
(atomic EWMA) | +| Zero-allocation | Task 1 + 2 (no heap alloc in hot path) | +| Concurrent safety | All tasks (atomic + locks) | + +--- + +## Placeholder Scan + +- No "TBD" or "TODO" in any task +- No "implement later" or "fill in details" +- All code blocks contain complete implementation +- All test commands include expected output +- All file paths are exact + +--- + +## Type Consistency Check + +- `EWMAStats.Record(headerTime, lastByteTime time.Duration)` - consistent +- `LeastTime.Select(targets)` returns `*Target` - consistent with Balancer interface +- `StickySession.Select(ctx, targets)` - consistent with extended usage +- `ResponseTimeRecorder.RecordResponseTime(target, headerTime, lastByteTime)` - consistent + +--- + +## Execution Handoff + +**Plan complete and saved to `docs/superpowers/plans/2026-06-08-loadbalance-enhancement.md`.** + +Two execution options: + +**1. Subagent-Driven (recommended)** - Dispatch a fresh subagent per task, review between tasks, fast iteration + +**2. Inline Execution** - Execute tasks in this session using executing-plans, batch execution with checkpoints + +**Which approach?** diff --git a/docs/superpowers/plans/2026-06-10-performance-optimization-plan.md b/docs/superpowers/plans/2026-06-10-performance-optimization-plan.md new file mode 100644 index 0000000..8eb38ce --- /dev/null +++ b/docs/superpowers/plans/2026-06-10-performance-optimization-plan.md @@ -0,0 +1,1235 @@ +# 性能持续优化实施计划 + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. 
+ +**Goal:** 建立完整的性能基准测试体系,收集 baseline 数据,识别 Top 10 瓶颈,实施可量化的性能优化 + +**Architecture:** 数据驱动优化流程:建立基准 → 采集数据 → 分析瓶颈 → 实施优化 → 回归检测。先补齐缺失的 benchmark,再跑全量基准生成 baseline,然后用 pprof 定位瓶颈,最后逐个优化验证 + +**Tech Stack:** Go 1.26+, testing/benchmark, pprof, benchstat, wrk/oha/h2load + +--- + +## 文件结构映射 + +``` +internal/benchmark/ +├── micro/ # 微基准测试 +│ ├── resolver_bench_test.go # DNS 解析器基准 +│ ├── stream_bench_test.go # Stream 代理基准 +│ ├── cache_bench_test.go # 缓存系统基准 +│ ├── lua_bench_test.go # Lua 引擎基准 +│ └── variable_bench_test.go # 变量系统基准 +├── integration/ # 集成基准测试 +│ ├── server_bench_test.go # HTTP 服务器端到端 +│ ├── proxy_bench_test.go # 反向代理端到端 +│ └── static_bench_test.go # 静态文件端到端 +└── system/ # 系统压测脚本 + ├── bench.sh # 主压测脚本 + ├── static.lua # wrk 静态文件压测脚本 + └── proxy.lua # wrk 代理压测脚本 + +scripts/ +└── bench-suite.sh # 一键运行全量基准 + +benchmarks/ # 基准结果存储 +└── v0.4.0/ # 版本号目录 + ├── micro.txt + ├── integration.txt + ├── system.txt + └── pprof/ + ├── cpu.prof + ├── heap.prof + ├── allocs.prof + └── goroutine.prof +``` + +--- + +## Task 1: 建立 Benchmark 目录结构 + +**Files:** +- Create: `internal/benchmark/micro/` +- Create: `internal/benchmark/integration/` +- Create: `internal/benchmark/system/` +- Create: `benchmarks/` +- Modify: `.gitignore`(忽略 benchmarks/ 但保留目录) + +- [ ] **Step 1: 创建目录结构** + +```bash +mkdir -p internal/benchmark/micro +mkdir -p internal/benchmark/integration +mkdir -p internal/benchmark/system +mkdir -p benchmarks/v0.4.0/pprof +``` + +- [ ] **Step 2: 添加 .gitignore 规则** + +在 `.gitignore` 末尾添加: + +``` +# Benchmark results +benchmarks/*/ +!benchmarks/.gitkeep +``` + +创建 `benchmarks/.gitkeep`: + +```bash +touch benchmarks/.gitkeep +``` + +- [ ] **Step 3: Commit** + +```bash +git add internal/benchmark/ benchmarks/ .gitignore +git commit -m "chore(benchmark): establish benchmark directory structure" +``` + +--- + +## Task 2: 补充缺失的微基准 — Resolver + +**Files:** +- Create: `internal/benchmark/micro/resolver_bench_test.go` + +- [ ] **Step 1: 编写 resolver 基准测试** + +```go +package 
micro + +import ( + "strconv" + "testing" + "time" + + "rua.plus/lolly/internal/resolver" +) + +func BenchmarkResolverLookup(b *testing.B) { + // 使用 mock resolver 避免真实网络请求 + r := resolver.NewMockResolver(map[string][]string{ + "example.com": {"93.184.216.34"}, + }) + + b.ReportAllocs() + b.ResetTimer() + for b.Loop() { + _, _ = r.Lookup("example.com") + } +} + +func BenchmarkResolverLookupWithCache(b *testing.B) { + r := resolver.NewMockResolver(map[string][]string{ + "example.com": {"93.184.216.34"}, + }) + // 预热缓存 + _, _ = r.Lookup("example.com") + + b.ReportAllocs() + b.ResetTimer() + for b.Loop() { + _, _ = r.Lookup("example.com") + } +} + +func BenchmarkResolverCacheSet(b *testing.B) { + r := resolver.NewMockResolver(nil) + + // b.N 不是迭代序号,使用独立计数器生成不同的 key + i := 0 + b.ReportAllocs() + b.ResetTimer() + for b.Loop() { + r.CacheSet("host"+strconv.Itoa(i), []string{"1.2.3.4"}, time.Minute) + i++ + } +} + +func BenchmarkResolverCacheGet(b *testing.B) { + r := resolver.NewMockResolver(nil) + r.CacheSet("example.com", []string{"1.2.3.4"}, time.Minute) + + b.ReportAllocs() + b.ResetTimer() + for b.Loop() { + _, _ = r.CacheGet("example.com") + } +} +``` + +- [ ] **Step 2: 运行测试验证** + +```bash +go test -bench=. 
-benchmem ./internal/benchmark/micro/resolver_bench_test.go +``` + +Expected: 4 个 benchmark 全部运行,无编译错误 + +- [ ] **Step 3: Commit** + +```bash +git add internal/benchmark/micro/resolver_bench_test.go +git commit -m "feat(benchmark): add resolver micro benchmarks" +``` + +--- + +## Task 3: 补充缺失的微基准 — Stream + +**Files:** +- Create: `internal/benchmark/micro/stream_bench_test.go` + +- [ ] **Step 1: 编写 stream 基准测试** + +```go +package micro + +import ( + "io" + "net" + "testing" + + "github.com/stretchr/testify/require" + "rua.plus/lolly/internal/stream" +) + +func BenchmarkStreamTCPForward(b *testing.B) { + // 创建后端 echo 服务器 + backendLn, err := net.Listen("tcp", "127.0.0.1:0") + require.NoError(b, err) + defer backendLn.Close() + + go func() { + for { + conn, err := backendLn.Accept() + if err != nil { + return + } + go func(c net.Conn) { + defer c.Close() + _, _ = io.Copy(c, c) + }(conn) + } + }() + + // 创建 stream server + srv := stream.NewServer() + _ = srv.AddUpstream("test", []stream.TargetSpec{ + {Addr: backendLn.Addr().String(), Weight: 1}, + }, "round_robin", stream.HealthCheckSpec{}) + + // 设置 upstream 健康 + srv.SetHealthy("test", 0, true) + + _ = srv.ListenTCP("127.0.0.1:0") + _ = srv.Start() + defer srv.Stop() + + proxyAddr := srv.GetListenerAddr("test") + + b.ReportAllocs() + b.ResetTimer() + for b.Loop() { + conn, err := net.Dial("tcp", proxyAddr) + if err != nil { + b.Fatal(err) + } + _, _ = conn.Write([]byte("hello")) + buf := make([]byte, 5) + _, _ = io.ReadFull(conn, buf) + conn.Close() + } +} + +func BenchmarkStreamSelectTarget(b *testing.B) { + srv := stream.NewServer() + _ = srv.AddUpstream("test", []stream.TargetSpec{ + {Addr: "127.0.0.1:8001", Weight: 3}, + {Addr: "127.0.0.1:8002", Weight: 2}, + {Addr: "127.0.0.1:8003", Weight: 1}, + }, "weighted_round_robin", stream.HealthCheckSpec{}) + + for i := 0; i < 3; i++ { + srv.SetHealthy("test", i, true) + } + + b.ReportAllocs() + b.ResetTimer() + for b.Loop() { + _, _ = srv.SelectTarget("test", nil) + } +} 
+``` + +- [ ] **Step 2: 运行测试验证** + +```bash +go test -bench=. -benchmem ./internal/benchmark/micro/stream_bench_test.go +``` + +Expected: 2 个 benchmark 全部运行 + +- [ ] **Step 3: Commit** + +```bash +git add internal/benchmark/micro/stream_bench_test.go +git commit -m "feat(benchmark): add stream proxy micro benchmarks" +``` + +--- + +## Task 4: 补充缺失的微基准 — Cache + +**Files:** +- Create: `internal/benchmark/micro/cache_bench_test.go` + +- [ ] **Step 1: 编写 cache 基准测试** + +```go +package micro + +import ( + "strconv" + "testing" + "time" + + "rua.plus/lolly/internal/cache" +) + +func BenchmarkCacheGet(b *testing.B) { + c := cache.New(cache.Config{MaxEntries: 10000}) + _ = c.Set("key", []byte("value"), time.Hour) + + b.ReportAllocs() + b.ResetTimer() + for b.Loop() { + _, _ = c.Get("key") + } +} + +func BenchmarkCacheSet(b *testing.B) { + c := cache.New(cache.Config{MaxEntries: 10000}) + value := []byte("value") + + // b.N 不是迭代序号,使用独立计数器生成不同的 key + i := 0 + b.ReportAllocs() + b.ResetTimer() + for b.Loop() { + _ = c.Set("key"+strconv.Itoa(i), value, time.Hour) + i++ + } +} + +func BenchmarkCacheGetConcurrent(b *testing.B) { + c := cache.New(cache.Config{MaxEntries: 10000}) + for i := 0; i < 1000; i++ { + _ = c.Set(string(rune(i)), []byte("value"), time.Hour) + } + + b.ReportAllocs() + b.ResetTimer() + b.RunParallel(func(pb *testing.PB) { + i := 0 + for pb.Next() { + _, _ = c.Get(string(rune(i % 1000))) + i++ + } + }) +} + +func BenchmarkCacheSetConcurrent(b *testing.B) { + c := cache.New(cache.Config{MaxEntries: 10000}) + value := []byte("value") + + b.ReportAllocs() + b.ResetTimer() + b.RunParallel(func(pb *testing.PB) { + i := 0 + for pb.Next() { + _ = c.Set(string(rune(i)), value, time.Hour) + i++ + } + }) +} +``` + +- [ ] **Step 2: 运行测试验证** + +```bash +go test -bench=. 
-benchmem ./internal/benchmark/micro/cache_bench_test.go +``` + +- [ ] **Step 3: Commit** + +```bash +git add internal/benchmark/micro/cache_bench_test.go +git commit -m "feat(benchmark): add cache micro benchmarks" +``` + +--- + +## Task 5: 补充缺失的微基准 — Lua + +**Files:** +- Create: `internal/benchmark/micro/lua_bench_test.go` + +- [ ] **Step 1: 编写 Lua 基准测试** + +```go +package micro + +import ( + "testing" + + "rua.plus/lolly/internal/lua" +) + +func BenchmarkLuaSimpleScript(b *testing.B) { + engine := lua.NewEngine() + defer engine.Close() + + script := ` + local a = 1 + 2 + return a + ` + + b.ReportAllocs() + b.ResetTimer() + for b.Loop() { + _ = engine.ExecuteString(script) + } +} + +func BenchmarkLuaNginxAPI(b *testing.B) { + engine := lua.NewEngine() + defer engine.Close() + + script := ` + ngx.var.request_uri = "/test" + return ngx.var.request_uri + ` + + b.ReportAllocs() + b.ResetTimer() + for b.Loop() { + _ = engine.ExecuteString(script) + } +} + +func BenchmarkLuaJSONEncode(b *testing.B) { + engine := lua.NewEngine() + defer engine.Close() + + script := ` + local json = require("cjson") + local t = {name = "test", value = 123} + return json.encode(t) + ` + + b.ReportAllocs() + b.ResetTimer() + for b.Loop() { + _ = engine.ExecuteString(script) + } +} +``` + +- [ ] **Step 2: 运行测试验证** + +```bash +go test -bench=. 
-benchmem ./internal/benchmark/micro/lua_bench_test.go +``` + +- [ ] **Step 3: Commit** + +```bash +git add internal/benchmark/micro/lua_bench_test.go +git commit -m "feat(benchmark): add lua engine micro benchmarks" +``` + +--- + +## Task 6: 创建集成基准测试 — Server + +**Files:** +- Create: `internal/benchmark/integration/server_bench_test.go` + +- [ ] **Step 1: 编写服务器集成基准** + +```go +package integration + +import ( + "testing" + + "github.com/valyala/fasthttp" + "rua.plus/lolly/internal/config" + "rua.plus/lolly/internal/server" +) + +func BenchmarkServerStaticRequest(b *testing.B) { + cfg := &config.Config{ + Servers: []config.ServerConfig{{ + Listen: "127.0.0.1:0", + Static: []config.StaticConfig{{ + Path: "/", + Root: "./testdata", + }}, + }}, + } + + srv := server.New(cfg) + go srv.Start() + defer srv.Stop() + + // 等待服务器启动 + addr := srv.GetAddr() + + client := &fasthttp.Client{} + req := fasthttp.AcquireRequest() + resp := fasthttp.AcquireResponse() + defer fasthttp.ReleaseRequest(req) + defer fasthttp.ReleaseResponse(resp) + + req.SetRequestURI("http://" + addr + "/") + req.Header.SetMethod("GET") + + b.ReportAllocs() + b.ResetTimer() + for b.Loop() { + _ = client.Do(req, resp) + } +} + +func BenchmarkServerProxyRequest(b *testing.B) { + // 启动后端服务器 + backend := &fasthttp.Server{ + Handler: func(ctx *fasthttp.RequestCtx) { + ctx.SetBodyString("ok") + }, + } + go backend.ListenAndServe("127.0.0.1:18081") + + cfg := &config.Config{ + Servers: []config.ServerConfig{{ + Listen: "127.0.0.1:0", + Proxy: []config.ProxyConfig{{ + Path: "/api", + Targets: []config.ProxyTarget{{ + URL: "http://127.0.0.1:18081", + }}, + }}, + }}, + } + + srv := server.New(cfg) + go srv.Start() + defer srv.Stop() + + addr := srv.GetAddr() + + client := &fasthttp.Client{} + req := fasthttp.AcquireRequest() + resp := fasthttp.AcquireResponse() + defer fasthttp.ReleaseRequest(req) + defer fasthttp.ReleaseResponse(resp) + + req.SetRequestURI("http://" + addr + "/api/test") + 
req.Header.SetMethod("GET") + + b.ReportAllocs() + b.ResetTimer() + for b.Loop() { + _ = client.Do(req, resp) + } +} +``` + +- [ ] **Step 2: 运行测试验证** + +```bash +go test -bench=. -benchmem ./internal/benchmark/integration/server_bench_test.go +``` + +- [ ] **Step 3: Commit** + +```bash +git add internal/benchmark/integration/server_bench_test.go +git commit -m "feat(benchmark): add server integration benchmarks" +``` + +--- + +## Task 7: 创建系统压测脚本 + +**Files:** +- Create: `internal/benchmark/system/bench.sh` +- Create: `internal/benchmark/system/static.lua` +- Create: `internal/benchmark/system/proxy.lua` + +- [ ] **Step 1: 编写 wrk 压测脚本 — 静态文件** + +`internal/benchmark/system/static.lua`: + +```lua +-- wrk static file benchmark script +wrk.method = "GET" +wrk.headers["Accept"] = "text/html" + +-- 随机访问不同路径增加真实感 +math.randomseed(os.time()) + +request = function() + local paths = {"/", "/index.html", "/about.html", "/contact.html"} + local path = paths[math.random(#paths)] + return wrk.format(nil, path) +end + +response = function(status, headers, body) + if status ~= 200 then + print("Error: " .. 
status) + end +end +``` + +- [ ] **Step 2: 编写 wrk 压测脚本 — 代理** + +`internal/benchmark/system/proxy.lua`: + +```lua +-- wrk proxy benchmark script +wrk.method = "GET" +wrk.headers["Accept"] = "application/json" + +request = function() + local paths = {"/api/users", "/api/posts", "/api/comments"} + local path = paths[math.random(#paths)] + return wrk.format(nil, path) +end +``` + +- [ ] **Step 3: 编写主压测脚本** + +`internal/benchmark/system/bench.sh`: + +```bash +#!/bin/bash +set -e + +# Lolly System Benchmark Suite +# Usage: ./bench.sh [lolly_addr] [duration] [connections] [threads] + +ADDR=${1:-"http://127.0.0.1:8080"} +DURATION=${2:-"30s"} +CONNECTIONS=${3:-400} +THREADS=${4:-12} + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +RESULTS_DIR="${SCRIPT_DIR}/../../../benchmarks/$(date +%Y%m%d-%H%M%S)" +mkdir -p "$RESULTS_DIR" + +echo "=== Lolly System Benchmark ===" +echo "Target: $ADDR" +echo "Duration: $DURATION" +echo "Connections: $CONNECTIONS" +echo "Threads: $THREADS" +echo "Results: $RESULTS_DIR" +echo "" + +# Check tools +check_tool() { + if ! command -v "$1" &> /dev/null; then + echo "Warning: $1 not found, skipping related tests" + return 1 + fi + return 0 +} + +# 1. Static file benchmark +echo "--- Static File Benchmark ---" +if check_tool wrk; then + wrk -t$THREADS -c$CONNECTIONS -d$DURATION \ + -s "$SCRIPT_DIR/static.lua" \ + "$ADDR" > "$RESULTS_DIR/static.txt" + echo "Static: $(grep 'Requests/sec' "$RESULTS_DIR/static.txt" || echo 'N/A')" +fi + +# 2. Proxy benchmark +echo "" +echo "--- Proxy Benchmark ---" +if check_tool wrk; then + wrk -t$THREADS -c$CONNECTIONS -d$DURATION \ + -s "$SCRIPT_DIR/proxy.lua" \ + "$ADDR/api" > "$RESULTS_DIR/proxy.txt" + echo "Proxy: $(grep 'Requests/sec' "$RESULTS_DIR/proxy.txt" || echo 'N/A')" +fi + +# 3. 
HTTP/2 benchmark +echo "" +echo "--- HTTP/2 Benchmark ---" +if check_tool h2load; then + h2load -n100000 -c100 -m10 "$ADDR" > "$RESULTS_DIR/http2.txt" 2>&1 || true + echo "HTTP/2: $(grep 'finished' "$RESULTS_DIR/http2.txt" || echo 'N/A')" +fi + +# 4. Latency distribution with oha +echo "" +echo "--- Latency Distribution ---" +if check_tool oha; then + oha -z $DURATION -c $CONNECTIONS "$ADDR" > "$RESULTS_DIR/latency.txt" + echo "Latency: $(grep 'Success rate' "$RESULTS_DIR/latency.txt" || echo 'N/A')" +fi + +echo "" +echo "=== Results saved to $RESULTS_DIR ===" +``` + +- [ ] **Step 4: 添加执行权限** + +```bash +chmod +x internal/benchmark/system/bench.sh +``` + +- [ ] **Step 5: Commit** + +```bash +git add internal/benchmark/system/ +git commit -m "feat(benchmark): add system benchmark scripts" +``` + +--- + +## Task 8: 创建一键全量基准脚本 + +**Files:** +- Create: `scripts/bench-suite.sh` +- Modify: `Makefile` + +- [ ] **Step 1: 编写一键基准脚本** + +`scripts/bench-suite.sh`: + +```bash +#!/bin/bash +set -e + +# Run complete benchmark suite and save results + +VERSION=$(git describe --tags --always --dirty 2>/dev/null || echo "dev") +RESULTS_DIR="benchmarks/$VERSION" +mkdir -p "$RESULTS_DIR/pprof" + +echo "=== Lolly Benchmark Suite v$VERSION ===" +echo "Results: $RESULTS_DIR" +echo "" + +# 1. Micro benchmarks +echo "--- Running Micro Benchmarks ---" +go test -bench=. -benchmem \ + ./internal/benchmark/micro/... \ + > "$RESULTS_DIR/micro.txt" 2>&1 || true + +echo "Micro benchmarks done" + +# 2. Integration benchmarks +echo "" +echo "--- Running Integration Benchmarks ---" +go test -bench=. -benchmem \ + ./internal/benchmark/integration/... \ + > "$RESULTS_DIR/integration.txt" 2>&1 || true + +echo "Integration benchmarks done" + +# 3. Existing package benchmarks +echo "" +echo "--- Running Package Benchmarks ---" +go test -bench=. -benchmem \ + ./internal/loadbalance/... \ + ./internal/matcher/... \ + ./internal/proxy/... \ + ./internal/middleware/... 
\ + > "$RESULTS_DIR/packages.txt" 2>&1 || true + +echo "Package benchmarks done" + +# 4. Summary +echo "" +echo "=== Results Summary ===" +echo "Micro: $RESULTS_DIR/micro.txt" +echo "Integration: $RESULTS_DIR/integration.txt" +echo "Packages: $RESULTS_DIR/packages.txt" + +if command -v benchstat &> /dev/null; then + echo "" + echo "--- Top Results ---" + grep -h "Benchmark" "$RESULTS_DIR"/*.txt | head -20 +fi + +echo "" +echo "All results saved to $RESULTS_DIR" +``` + +- [ ] **Step 2: 添加 Makefile 目标** + +在 `Makefile` 中添加: + +```makefile +.PHONY: bench bench-stat bench-suite bench-system + +# Run all benchmarks +bench: + go test -bench=. -benchmem ./internal/benchmark/micro/... ./internal/benchmark/integration/... + +# Run benchmarks and show statistics +bench-stat: bench + @benchstat $(shell ls benchmarks/*/micro.txt 2>/dev/null | tail -1) + +# Run complete benchmark suite +bench-suite: + @bash scripts/bench-suite.sh + +# Run system benchmarks (requires running server) +bench-system: + @bash internal/benchmark/system/bench.sh +``` + +- [ ] **Step 3: 添加执行权限** + +```bash +chmod +x scripts/bench-suite.sh +``` + +- [ ] **Step 4: 运行测试** + +```bash +make bench-suite +``` + +Expected: 脚本运行成功,结果保存到 `benchmarks/<version>/` 目录(`<version>` 为 `git describe --tags --always --dirty` 的输出,获取失败时为 `dev`) + +- [ ] **Step 5: Commit** + +```bash +git add scripts/bench-suite.sh Makefile +git commit -m "feat(benchmark): add one-click benchmark suite" +``` + +--- + +## Task 9: 运行第一轮全量基准 → 生成 Baseline + +**Files:** +- Create: `benchmarks/v0.4.0/*.txt` + +- [ ] **Step 1: 运行微基准** + +```bash +go test -bench=. -benchmem \ + ./internal/benchmark/micro/... \ + > benchmarks/v0.4.0/micro.txt +``` + +- [ ] **Step 2: 运行已有包的基准** + +```bash +go test -bench=. -benchmem \ + ./internal/loadbalance/... \ + ./internal/matcher/... \ + ./internal/proxy/... \ + ./internal/middleware/... \ + ./internal/server/... \ + ./internal/cache/... \ + ./internal/stream/... \ + ./internal/resolver/... 
\ + ./internal/variable/... \ + ./internal/lua/... 
\ + > benchmarks/v0.4.0/packages.txt +``` + +- [ ] **Step 3: 格式化基准结果** + +```bash +# 如果安装了 benchstat +benchstat benchmarks/v0.4.0/micro.txt +benchstat benchmarks/v0.4.0/packages.txt +``` + +- [ ] **Step 4: Commit baseline** + +```bash +git add benchmarks/v0.4.0/ +git commit -m "chore(benchmark): add v0.4.0 baseline performance data" +``` + +--- + +## Task 10: 采集 pprof 数据 + +**Files:** +- Create: `benchmarks/v0.4.0/pprof/*.prof` + +**前置条件**: 需要启动一个配置了 pprof 的 lolly 服务器 + +- [ ] **Step 1: 启动带 pprof 的测试服务器** + +创建临时测试配置 `benchmark-pprof.yaml`: + +```yaml +servers: + - listen: ":8080" + static: + - path: "/" + root: "./testdata" + proxy: + - path: "/api" + targets: + - url: "http://127.0.0.1:18081" + +monitoring: + pprof: + enabled: true + path: "/debug/pprof" + allow: + - "127.0.0.1" +``` + +启动后端 mock 服务器(可以用 Python/Node 快速启动一个 echo 服务) + +启动 lolly: + +```bash +./bin/lolly -c benchmark-pprof.yaml & +LOLLY_PID=$! +``` + +- [ ] **Step 2: 采集 CPU profile** + +```bash +curl -s "http://localhost:8080/debug/pprof/profile?seconds=30" \ + > benchmarks/v0.4.0/pprof/cpu.prof +``` + +- [ ] **Step 3: 采集 Heap profile** + +```bash +curl -s "http://localhost:8080/debug/pprof/heap" \ + > benchmarks/v0.4.0/pprof/heap.prof +``` + +- [ ] **Step 4: 采集 Allocs profile** + +```bash +curl -s "http://localhost:8080/debug/pprof/allocs" \ + > benchmarks/v0.4.0/pprof/allocs.prof +``` + +- [ ] **Step 5: 采集 Goroutine profile** + +```bash +curl -s "http://localhost:8080/debug/pprof/goroutine" \ + > benchmarks/v0.4.0/pprof/goroutine.prof +``` + +- [ ] **Step 6: 停止测试服务器** + +```bash +kill $LOLLY_PID +rm benchmark-pprof.yaml +``` + +- [ ] **Step 7: Commit pprof 数据** + +```bash +git add benchmarks/v0.4.0/pprof/ +git commit -m "chore(benchmark): add v0.4.0 pprof profiles" +``` + +--- + +## Task 11: 分析瓶颈 → 生成性能报告 + +**Files:** +- Create: `benchmarks/v0.4.0/REPORT.md` + +- [ ] **Step 1: 分析 CPU profile** + +```bash +go tool pprof -top benchmarks/v0.4.0/pprof/cpu.prof > benchmarks/v0.4.0/cpu-top.txt +``` + 
+查看 Top 20 CPU 消耗函数: + +```bash +go tool pprof -top -n 20 benchmarks/v0.4.0/pprof/cpu.prof +``` + +- [ ] **Step 2: 分析 Heap profile** + +```bash +go tool pprof -top benchmarks/v0.4.0/pprof/heap.prof > benchmarks/v0.4.0/heap-top.txt +``` + +- [ ] **Step 3: 分析 Allocs profile** + +```bash +go tool pprof -top benchmarks/v0.4.0/pprof/allocs.prof > benchmarks/v0.4.0/allocs-top.txt +``` + +- [ ] **Step 4: 汇总生成报告** + +`benchmarks/v0.4.0/REPORT.md`: + +```markdown +# Lolly v0.4.0 性能分析报告 + +> 生成日期: $(date) + +## 1. 基准测试摘要 + +### 微基准 +[粘贴 micro.txt 关键结果] + +### 包基准 +[粘贴 packages.txt 关键结果] + +## 2. CPU 热点 Top 10 + +[粘贴 cpu-top.txt 结果] + +## 3. 内存分配热点 Top 10 + +[粘贴 allocs-top.txt 结果] + +## 4. 内存占用 Top 10 + +[粘贴 heap-top.txt 结果] + +## 5. 优化建议 + +### P0 (高优先级) +- [ ] [根据分析结果填写] + +### P1 (中优先级) +- [ ] [根据分析结果填写] + +### P2 (低优先级) +- [ ] [根据分析结果填写] +``` + +- [ ] **Step 5: Commit 报告** + +```bash +git add benchmarks/v0.4.0/REPORT.md benchmarks/v0.4.0/*-top.txt +git commit -m "docs(benchmark): add v0.4.0 performance analysis report" +``` + +--- + +## Task 12: 实施优化(基于报告) + +> **注意**: 此 Task 的内容将在 Task 11 完成后根据实际瓶颈数据制定。以下为占位模板,实际实施时需替换为具体分析结果。 + +### Task 12.1: 优化 [瓶颈1] + +**Files:** +- Modify: `internal/[package]/[file].go:[line-range]` + +- [ ] **Step 1: 编写优化前 benchmark** + +```bash +# 已有 baseline,无需重复 +``` + +- [ ] **Step 2: 实施优化** + +[根据实际瓶颈实施具体优化] + +- [ ] **Step 3: 验证优化效果** + +```bash +go test -bench=[BenchmarkName] -benchmem ./internal/[package]/... 
+benchstat benchmarks/v0.4.0/old.txt benchmarks/v0.4.0/new.txt +``` + +Expected: 性能提升 > 5% + +- [ ] **Step 4: Commit** + +```bash +git add internal/[package]/ +git commit -m "perf([package]): optimize [description]" +``` + +### Task 12.2-12.N: 重复优化流程 + +对每个识别的瓶颈重复上述流程。 + +--- + +## Task 13: 建立性能回归检测 + +**Files:** +- Create: `.github/workflows/benchmark.yml` (如果恢复 CI) +- Create: `scripts/bench-compare.sh` +- Modify: `Makefile` + +- [ ] **Step 1: 创建基准对比脚本** + +`scripts/bench-compare.sh`: + +```bash +#!/bin/bash +set -e + +# Compare current benchmark against baseline +# Usage: ./bench-compare.sh [baseline_version] + +BASELINE=${1:-"v0.4.0"} +BASELINE_FILE="benchmarks/$BASELINE/packages.txt" +CURRENT_FILE="benchmarks/current.txt" + +if [ ! -f "$BASELINE_FILE" ]; then + echo "Baseline not found: $BASELINE_FILE" + exit 1 +fi + +echo "Comparing against baseline: $BASELINE" + +# Run current benchmarks +go test -bench=. -benchmem \ + ./internal/loadbalance/... \ + ./internal/matcher/... \ + ./internal/proxy/... \ + ./internal/middleware/... \ + > "$CURRENT_FILE" + +# Compare +if command -v benchstat &> /dev/null; then + benchstat "$BASELINE_FILE" "$CURRENT_FILE" +else + echo "benchstat not found, install with: go install golang.org/x/perf/cmd/benchstat@latest" + exit 1 +fi +``` + +- [ ] **Step 2: 添加 Makefile 目标** + +```makefile +.PHONY: bench-compare + +# Compare current performance against baseline +bench-compare: + @bash scripts/bench-compare.sh +``` + +- [ ] **Step 3: 添加执行权限** + +```bash +chmod +x scripts/bench-compare.sh +``` + +- [ ] **Step 4: 测试回归检测** + +```bash +make bench-compare +``` + +Expected: 显示当前性能与 baseline 的对比,无显著退化 + +- [ ] **Step 5: Commit** + +```bash +git add scripts/bench-compare.sh Makefile +git commit -m "feat(benchmark): add performance regression detection" +``` + +--- + +## Task 14: 最终验证 + +- [ ] **Step 1: 全量测试通过** + +```bash +make test +``` + +Expected: 全部 PASS + +- [ ] **Step 2: Race 检测通过** + +```bash +go test -race ./internal/... 
+``` + +Expected: 零 race + +- [ ] **Step 3: Lint 通过** + +```bash +make lint +``` + +Expected: 零 issues + +- [ ] **Step 4: 构建验证** + +```bash +make build +``` + +Expected: 构建成功 + +- [ ] **Step 5: 最终 Commit** + +```bash +git log --oneline -20 +``` + +确认所有 benchmark 相关 commit 都在。 + +--- + +## 附录:常用命令速查 + +```bash +# 运行所有微基准 +go test -bench=. -benchmem ./internal/benchmark/micro/... + +# 运行单个基准 +go test -bench=BenchmarkCacheGet -benchmem ./internal/benchmark/micro/... + +# 对比两个基准结果 +benchstat old.txt new.txt + +# 查看 CPU profile +go tool pprof -http=:8081 benchmarks/v0.4.0/pprof/cpu.prof + +# 查看内存分配 +go tool pprof -http=:8081 benchmarks/v0.4.0/pprof/allocs.prof + +# 生成调用图(PNG,需要 Graphviz);火焰图请在 -http 界面的 Flame Graph 视图中查看 +go tool pprof -png benchmarks/v0.4.0/pprof/cpu.prof > cpu-callgraph.png + +# 系统压测 +make bench-system + +# 性能回归检测 +make bench-compare +``` + +--- + +## Spec Coverage Check + +| Spec Section | Task | +|-------------|------| +| 建立 benchmark 目录结构 | Task 1 | +| 补充 resolver 微基准 | Task 2 | +| 补充 stream 微基准 | Task 3 | +| 补充 cache 微基准 | Task 4 | +| 补充 lua 微基准 | Task 5 | +| 集成基准测试 | Task 6 | +| 系统压测脚本 | Task 7 | +| 一键基准脚本 | Task 8 | +| 生成 baseline | Task 9 | +| 采集 pprof | Task 10 | +| 分析报告 | Task 11 | +| 实施优化 | Task 12 | +| 回归检测 | Task 13 | +| 最终验证 | Task 14 | diff --git a/docs/superpowers/specs/2026-06-03-eliminate-code-redundancy-design.md b/docs/superpowers/specs/2026-06-03-eliminate-code-redundancy-design.md new file mode 100644 index 0000000..b83c24c --- /dev/null +++ b/docs/superpowers/specs/2026-06-03-eliminate-code-redundancy-design.md @@ -0,0 +1,213 @@ +# 消除代码冗余设计文档 + +> **日期:** 2026-06-03 +> **目标:** 消除 lolly 项目中的代码冗余,提升可维护性和代码质量 +> **范围:** 死代码删除、重复模式重构、测试辅助函数提取 + +--- + +## 1. 
问题分析 + +通过对代码库的静态分析(`golangci-lint` + `dupl` + `unused`),发现以下冗余代码: + +### 1.1 死代码(Dead Code) + +| 文件 | 函数/方法 | 行号 | 说明 | +|------|----------|------|------| +| `internal/config/validate.go` | `validateStatic()` | 475 | `validateStatics()` 已内联相同逻辑,仅被测试调用 | +| `internal/http2/server.go` | `connectionPool.get()` | 576 | 无任何引用 | +| `internal/http2/server.go` | `connectionPool.count()` | 583 | 无任何引用 | +| `internal/middleware/bodylimit/bodylimit.go` | `formatSize()` | 288 | 业务代码未使用,仅被测试调用;`autoindex.go` 有同名函数 | +| `internal/middleware/security/headers.go` | `defaultSecurityHeaders()` | 295 | 仅被测试调用,业务代码未使用 | +| `internal/middleware/security/headers.go` | `strictSecurityHeaders()` | 309 | 仅被测试调用,业务代码未使用 | +| `internal/middleware/security/headers.go` | `developmentSecurityHeaders()` | 325 | 仅被测试调用,业务代码未使用 | +| `internal/ssl/ocsp.go` | `extractCertificates()` | 490 | 仅被测试调用,业务代码未使用 | + +**排除项**(经确认实际被使用): +- `setupTestLogger()` - 在 `app_test.go` 中被调用 47 次 +- `canonicalHeaderKey()` - 在 `server_test.go` 中被调用 + +### 1.2 源文件重复模式 + +**路由注册错误处理(`internal/server/router.go`)** + +19 次重复模式(proxy、static、lua 三种 handler): +```go +if err := s.locationEngine.AddXXX(path, handler, internal); err != nil { + if err := s.handleRegistrationError("type", path, err); err != nil { + return err + } +} +``` + +**DEBUG 日志条件检查(`internal/proxy/proxy.go`)** + +5 次重复模式: +```go +if logging.Debug().Enabled() { + logging.Debug().Str("key", value).Msg("[PROXY] message") +} +``` + +### 1.3 测试文件重复代码 + +| 模式 | 出现次数 | 位置 | +|------|---------|------| +| `config.ProxyConfig{...}` | 184 | 各测试文件 | +| `config.ProxyTimeout{Connect: 5 * time.Second}` | 85 | 各测试文件 | +| `targets := []*loadbalance.Target{{URL: "http://..."}}` | 123 | 各测试文件 | +| `targets[0].Healthy.Store(true)` | 41 | 各测试文件 | + +--- + +## 2. 设计方案 + +### 2.1 阶段 1:死代码删除 + +**策略**:直接删除未使用的函数,同时清理仅被测试调用的函数的测试代码。 + +**处理清单**: +1. `validateStatic()` - 删除函数,将测试迁移到测试 `validateStatics()` +2. `connectionPool.get()` / `connectionPool.count()` - 直接删除 +3. 
`formatSize()` (bodylimit) - 删除函数,删除测试;`autoindex.go` 的同名函数保留 +4. `defaultSecurityHeaders()` / `strictSecurityHeaders()` / `developmentSecurityHeaders()` - 删除函数,删除测试 +5. `extractCertificates()` - 删除函数,删除测试 + +### 2.2 阶段 2:重复模式重构 + +**2.2.1 路由注册辅助函数** + +在 `internal/server/router.go` 中提取辅助函数: + +```go +// registerRoute 注册路由并处理错误 +func (s *Server) registerRoute( + locType string, + path string, + handler fasthttp.RequestHandler, + internal bool, + source string, +) error { + var err error + switch locType { + case matcher.LocationTypeExact: + err = s.locationEngine.AddExact(path, handler, internal) + case matcher.LocationTypePrefixPriority: + err = s.locationEngine.AddPrefixPriority(path, handler, internal) + case matcher.LocationTypeRegex: + err = s.locationEngine.AddRegex(path, handler, false, internal) + case matcher.LocationTypeRegexCaseless: + err = s.locationEngine.AddRegex(path, handler, true, internal) + case matcher.LocationTypeNamed: + err = s.locationEngine.AddNamed(path, handler) + default: + err = s.locationEngine.AddPrefix(path, handler, internal) + } + if err != nil { + return s.handleRegistrationError(source, path, err) + } + return nil +} +``` + +**2.2.2 DEBUG 日志辅助函数** + +在 `internal/proxy/proxy.go` 中提取辅助函数: + +```go +// proxyDebugLog 在 DEBUG 级别记录代理日志 +func proxyDebugLog(msg string, kv ...interface{}) { + if !logging.Debug().Enabled() { + return + } + event := logging.Debug() + for i := 0; i < len(kv)-1; i += 2 { + key, ok := kv[i].(string) + if !ok { + continue + } + switch v := kv[i+1].(type) { + case string: + event = event.Str(key, v) + case int: + event = event.Int(key, v) + case bool: + event = event.Bool(key, v) + } + } + event.Msg(msg) +} +``` + +### 2.3 阶段 3:测试辅助函数 + +在 `internal/testutil/` 包中创建辅助函数: + +```go +package testutil + +import ( + "time" + + "rua.plus/lolly/internal/config" + "rua.plus/lolly/internal/loadbalance" +) + +// NewTestProxyConfig 创建测试用的代理配置 +func NewTestProxyConfig(path string, targets []string) *config.ProxyConfig { + cfg := 
&config.ProxyConfig{ + Path: path, + LoadBalance: "round_robin", + Timeout: config.ProxyTimeout{ + Connect: 5 * time.Second, + Read: 30 * time.Second, + Write: 30 * time.Second, + }, + } + // ... + return cfg +} + +// NewTestTarget 创建测试用的代理目标 +func NewTestTarget(url string) *loadbalance.Target { + return &loadbalance.Target{URL: url} +} + +// NewTestHealthyTarget 创建已标记为健康的测试目标 +func NewTestHealthyTarget(url string) *loadbalance.Target { + t := NewTestTarget(url) + t.Healthy.Store(true) + return t +} +``` + +**迁移策略**: +1. 先创建辅助函数 +2. 逐步替换测试文件中的重复代码 +3. 每次替换后运行测试确保通过 + +--- + +## 3. 风险评估 + +| 风险 | 可能性 | 影响 | 缓解措施 | +|------|--------|------|---------| +| 删除的函数实际上被间接使用 | 低 | 高 | 通过 `grep` 确认无引用后再删除 | +| 重构引入新 bug | 中 | 中 | 每次变更后运行完整测试套件 | +| 测试辅助函数改变测试语义 | 低 | 中 | 保持默认配置与原始代码一致 | + +--- + +## 4. 验收标准 + +- [ ] `golangci-lint run --enable=unused ./...` 无 unused 错误 +- [ ] `golangci-lint run --enable=dupl ./...` 源文件无 dupl 错误 +- [ ] `go test ./...` 全部通过 +- [ ] 代码总行数减少 >200 行 +- [ ] 测试文件中的 `ProxyConfig{` 字面量减少 >50% + +--- + +## 5. 实施顺序 + +1. **阶段 1(死代码)** - 低风险,快速见效 +2. **阶段 2(源文件重构)** - 中等风险,改善可维护性 +3. **阶段 3(测试辅助函数)** - 低风险,最大减负 diff --git a/docs/superpowers/specs/2026-06-08-loadbalance-enhancement-design.md b/docs/superpowers/specs/2026-06-08-loadbalance-enhancement-design.md new file mode 100644 index 0000000..db3c6be --- /dev/null +++ b/docs/superpowers/specs/2026-06-08-loadbalance-enhancement-design.md @@ -0,0 +1,389 @@ +# Lolly 负载均衡增强设计 - Least Time & Session Sticky + +**日期**: 2026-06-08 +**状态**: Approved + +## 1. 背景与目标 + +Lolly 当前支持 6 种负载均衡算法:Round Robin、Weighted Round Robin、Least Connections、IP Hash、Consistent Hash、Random(Power of Two Choices)。 + +与 nginx Plus 对比,Lolly 缺少两个重要特性: +1. **Least Time** - 基于响应时间选择最优后端 +2. **Session Sticky** - Cookie-based 会话保持 + +本文档设计这两个算法的高性能实现方案,目标是: +- **零锁设计**:原子操作替代互斥锁 +- **零堆分配**:预分配 + 对象池 +- **纳秒级延迟**:单次选择 < 100ns +- **与现有代码风格一致** + +## 2. 
设计概览 + +``` + +----------------------+ + | Proxy Request | + +----------+-----------+ + | + +----------------+----------------+ + | | + +-----v------+ +------v------+ + | Least Time | | Sticky | + | Select | | Route | + +-----+------+ +------+------+ + | | + +-----v------+ +------v------+ + | EWMA Stats | | Cookie | + | (atomic) | | + Shard Map | + +------------+ +-------------+ +``` + +## 3. Least Time 设计 + +### 3.1 核心算法 + +基于 EWMA(指数加权移动平均)的响应时间统计: + +``` +new_avg = alpha * new_sample + (1 - alpha) * old_avg +``` + +- `alpha` 默认 0.3,可配置(0-1 范围) +- alpha 越大,对新样本越敏感,收敛越快 +- 使用 atomic.Int64 存储纳秒值,避免浮点运算 + +### 3.2 数据结构 + +```go +// EWMAStats 原子 EWMA 统计器 +type EWMAStats struct { + headerTime atomic.Int64 // EWMA 首字节时间(纳秒) + lastByteTime atomic.Int64 // EWMA 完整响应时间(纳秒) + sampleCount atomic.Int64 // 样本计数 +} + +// 使用固定点整数运算避免浮点 +// 将 alpha 编码为定点数:alpha * 1000 +const alphaScale = 1000 + +func (e *EWMAStats) Record(headerTime, lastByteTime time.Duration) { + // 原子更新,无锁 + e.updateAtomic(&e.headerTime, headerTime) + e.updateAtomic(&e.lastByteTime, lastByteTime) + e.sampleCount.Add(1) +} +``` + +### 3.3 LeastTime Balancer + +```go +type LeastTime struct { + metric string // "header" | "last_byte" +} + +func (l *LeastTime) Select(targets []*Target) *Target { + var selected *Target + var minTime int64 = -1 + + for _, t := range targets { + if !t.IsAvailable() { + continue + } + + // 原子读取响应时间 + var currentTime int64 + if l.metric == "header" { + currentTime = t.Stats.HeaderTime() + } else { + currentTime = t.Stats.LastByteTime() + } + + // 无统计样本时给默认值,避免新节点被饿死 + if currentTime == 0 { + currentTime = defaultResponseTime + } + + if selected == nil || currentTime < minTime { + selected = t + minTime = currentTime + } + } + + return selected +} +``` + +### 3.4 性能指标 + +| 操作 | 延迟 | 锁 | 堆分配 | +|------|------|-----|--------| +| Record | ~20ns | 无 | 0 | +| Select | ~50ns | 无 | 0 | + +### 3.5 配置 + +```yaml +proxy: + - path: /api + load_balance: least_time + least_time_metric: last_byte # 
header | last_byte (default)
+    least_time_alpha: 0.3 # 0-1; larger is more sensitive (default 0.3)
+    least_time_default_ns: 1000000 # value used when there are no samples (default 1ms)
+```
+
+### 3.6 Proxy Layer Integration
+
+```go
+// Called after the request completes
+func (p *Proxy) recordResponseTime(target *loadbalance.Target, start time.Time) {
+	if tracker, ok := p.balancer.(ResponseTimeRecorder); ok {
+		headerTime := target.HeaderReceived.Sub(start)
+		lastByteTime := time.Since(start)
+		tracker.RecordResponseTime(target, headerTime, lastByteTime)
+	}
+}
+```
+
+## 4. Session Sticky Design
+
+### 4.1 Core Algorithm
+
+A cookie-based routing table with sharded locks:
+
+- Cookie value encoding: `base64(target_url + "|" + expires_timestamp)`
+- 256 shards, each with its own `sync.RWMutex`
+- Shard index: `fnvHash64a(cookie_value) % 256`
+- A background goroutine purges expired sessions every 60s
+
+### 4.2 Data Structures
+
+```go
+// StickySession is the sticky-session load balancer.
+type StickySession struct {
+	config   StickyConfig
+	fallback loadbalance.Balancer // fallback algorithm
+
+	// 256 shards to lower the probability of lock contention
+	shards  [256]*stickyShard
+	cleaner *time.Ticker
+	stopCh  chan struct{}
+	started atomic.Bool
+}
+
+type stickyShard struct {
+	mu       sync.RWMutex
+	sessions map[string]*stickyEntry // key: cookie value
+}
+
+type stickyEntry struct {
+	targetURL string
+	expiresAt int64 // Unix nanoseconds
+	createdAt int64 // Unix nanoseconds
+}
+```
+
+### 4.3 Routing Flow
+
+```
+Request arrives
+    |
+    v
+Check cookie "lolly_route"
+    |
+    +-- present -->
+    |       decode the cookie value
+    |       check whether the target is healthy
+    |           |
+    |           +-- healthy --> route to that target
+    |           |
+    |           +-- unhealthy -> delete the session
+    |                            pick a new target with fallback
+    |                            set a new cookie
+    |
+    +-- absent -->
+            pick a target with fallback
+            set the Set-Cookie response header
+```
+
+### 4.4 Cookie Encoding
+
+```go
+// encodeCookie encodes the routing information into a cookie value.
+// Format: base64(target_url + "|" + expires_timestamp)
+func encodeCookie(targetURL string, expires time.Time) string {
+	raw := targetURL + "|" + strconv.FormatInt(expires.Unix(), 10)
+	return base64.URLEncoding.EncodeToString([]byte(raw))
+}
+
+// decodeCookie decodes a cookie value.
+func decodeCookie(value string) (targetURL string, expires time.Time, ok bool) {
+	raw, err := base64.URLEncoding.DecodeString(value)
+	if err != nil {
+		return
+	}
+	parts := 
strings.Split(string(raw), "|")
+	if len(parts) != 2 {
+		return
+	}
+	ts, err := strconv.ParseInt(parts[1], 10, 64)
+	if err != nil {
+		return
+	}
+	return parts[0], time.Unix(ts, 0), true
+}
+```
+
+### 4.5 Selection Logic
+
+```go
+func (s *StickySession) Select(ctx *fasthttp.RequestCtx, targets []*Target) *Target {
+	// 1. Check the cookie
+	cookie := ctx.Request.Header.Cookie(s.config.Name)
+	if len(cookie) > 0 {
+		targetURL, _, ok := decodeCookie(string(cookie))
+		if ok {
+			// Look up the target
+			for _, t := range targets {
+				if t.URL == targetURL && t.IsAvailable() {
+					return t
+				}
+			}
+			// Target unavailable: delete the session (deferred deletion)
+			s.deleteSession(string(cookie))
+		}
+	}
+
+	// 2. Select with the fallback algorithm
+	selected := s.fallback.Select(targets)
+	if selected == nil {
+		return nil
+	}
+
+	// 3. Set the cookie
+	s.setCookie(ctx, selected.URL)
+
+	// 4. Record the session
+	s.recordSession(selected.URL)
+
+	return selected
+}
+```
+
+### 4.6 Performance Targets
+
+| Operation | Latency | Lock contention probability |
+|-----------|---------|-----------------------------|
+| Session lookup | ~30ns | 0.4% (256 shards) |
+| Session write | ~50ns | 0.4% |
+| Expired-session cleanup | Background, off the main path | - |
+
+### 4.7 Configuration
+
+```yaml
+proxy:
+  - path: /api
+    load_balance: sticky
+    sticky:
+      enabled: true
+      name: "lolly_route" # cookie name (default)
+      expires: "1h" # session lifetime (default 1h)
+      domain: "" # cookie domain
+      path: "/" # cookie path (default /)
+      secure: false # Secure flag
+      http_only: true # HttpOnly flag (default true)
+      same_site: "Lax" # SameSite (default Lax)
+      # fallback algorithm configuration
+      fallback_balance: round_robin # algorithm for first-time routing and failover
+```
+
+## 5. Extending the Balancer Interface
+
+To support response-time recording for Least Time, an optional interface is added:
+
+```go
+// ResponseTimeRecorder is the response-time recording interface.
+// A balancer that implements it receives response-time statistics after each request completes.
+type ResponseTimeRecorder interface {
+	RecordResponseTime(target *Target, headerTime, lastByteTime time.Duration)
+}
+```
+
+**Why extend with a separate interface instead of modifying Balancer?**
+- It does not break the 6 existing balancer implementations
+- The type assertion is evaluated at runtime with negligible overhead
+- It follows Go's interface segregation principle
+
+## 6. 
File Change List
+
+### 6.1 New Files
+
+| File | Lines | Description |
+|------|-------|-------------|
+| `internal/loadbalance/ewma.go` | ~80 | Atomic EWMA statistics tracker |
+| `internal/loadbalance/least_time.go` | ~120 | Least Time balancer |
+| `internal/loadbalance/sticky.go` | ~280 | Session Sticky balancer |
+| `internal/loadbalance/sticky_config.go` | ~30 | Sticky configuration struct |
+| `internal/loadbalance/least_time_test.go` | ~200 | Least Time unit tests |
+| `internal/loadbalance/sticky_test.go` | ~250 | Session Sticky unit tests |
+
+### 6.2 Modified Files
+
+| File | Change |
+|------|--------|
+| `internal/loadbalance/algorithms.go` | Add `least_time` and `sticky` to validAlgorithms |
+| `internal/loadbalance/balancer.go` | Add a `Stats *EWMAStats` field to Target |
+| `internal/config/proxy_config.go` | Add `LeastTimeConfig` and `StickyConfig` |
+| `internal/config/defaults.go` | Add comments documenting the defaults of the new options |
+| `internal/config/validate.go` | Validate `least_time_metric` and `fallback_balance` |
+| `internal/proxy/proxy.go` | Add the new algorithms to createBalancer; call RecordResponseTime after each request completes |
+| `internal/proxy/target_selector.go` | Make Select support StickySession (requires a ctx parameter) |
+
+## 7. Test Strategy
+
+### 7.1 Least Time Tests
+
+- **Benchmarks**: measure Select/Record latency
+- **Concurrency**: 100 goroutines calling Record + Select concurrently, verifying there are no data races
+- **Convergence**: verify how the EWMA weights new versus old samples
+- **Failover**: verify that another target is selected after a target fails
+
+### 7.2 Session Sticky Tests
+
+- **Cookie encode/decode**: verify round-trip correctness
+- **Routing consistency**: the same cookie always routes to the same target
+- **Target failure**: fall back and update the cookie when the target is unavailable
+- **Expiry cleanup**: verify that expired sessions are purged
+- **Concurrency safety**: 100 goroutines reading and writing concurrently, verifying there are no data races
+- **Shard balance**: verify that the hash distribution is uniform
+
+## 8. Comparison with nginx Plus
+
+| Feature | nginx Plus | Lolly design |
+|---------|------------|--------------|
+| Least Time header | ✅ | ✅ |
+| Least Time last_byte | ✅ | ✅ |
+| EWMA smoothing | ✅ | ✅ (tunable alpha) |
+| Session Sticky cookie | ✅ | ✅ |
+| Session Sticky learn | ✅ | ❌ (not supported yet) |
+| Secure/HttpOnly/SameSite | ✅ | ✅ |
+| Fallback on target failure | ✅ | ✅ |
+| Session TTL | ✅ | ✅ |
+
+## 9. 
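Risks and Mitigations
+
+| Risk | Impact | Mitigation |
+|------|--------|------------|
+| New nodes are starved | High | Use the default `least_time_default_ns` when there are no samples |
+| Sticky memory growth | Medium | TTL + background cleanup + shard limits |
+| Oversized cookie | Low | Only the URL + timestamp are encoded, typically < 200 bytes |
+| Targets flapping up and down | Medium | Deferred session deletion to avoid a thundering herd |
+
+## 10. Future Optimizations
+
+1. **Session Sticky learn mode**: learn the Set-Cookie returned by the backend instead of setting a cookie proactively
+2. **Weighted Least Time**: combine weights and response times for weighted selection
+3. **Statistics persistence**: keep historical response-time statistics across restarts
+
+---
+
+**Design approval**: ✅ Approved
+**Next step**: write the implementation plan (writing-plans)
diff --git a/docs/superpowers/specs/2026-06-10-performance-optimization-design.md b/docs/superpowers/specs/2026-06-10-performance-optimization-design.md
new file mode 100644
index 0000000..62683aa
--- /dev/null
+++ b/docs/superpowers/specs/2026-06-10-performance-optimization-design.md
@@ -0,0 +1,261 @@
+# Continuous Performance Optimization Design Document
+
+> **Version**: v1.0
+> **Date**: 2026-06-10
+> **Goal**: maximum throughput + resource efficiency
+> **Method**: data-driven optimization (Benchmark → Profile → Optimize → Verify)
+
+---
+
+## 1. Overall Architecture
+
+The performance optimization workflow has 5 stages that form a continuous iteration loop:
+
+```
+┌───────────────┐   ┌───────────────┐   ┌───────────────┐   ┌───────────────┐   ┌───────────────┐
+│ 1. Establish  │ → │ 2. Collect    │ → │ 3. Analyze    │ → │ 4. Implement  │ → │ 5. Regression │
+│   Benchmark   │   │   Baseline    │   │   Profile     │   │   Optimize    │   │   Prevent     │
+└───────────────┘   └───────────────┘   └───────────────┘   └───────────────┘   └───────────────┘
+        ↑                                                                               │
+        └──────────────────────────── continuous iteration ◄────────────────────────────┘
+```
+
+**Core principles**:
+- Every optimization must be backed by benchmark data that proves the gain
+- Do not optimize anything the data does not point to
+- Build a reproducible performance-testing environment
+
+---
+
+## 2. 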
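Benchmark Infrastructure (Benchmark Suite)
+
+### 2.1 Three-Tier Benchmark System
+
+#### 2.1.1 Micro Benchmarks (unit level)
+
+Go benchmarks targeting a single function or module:
+
+| Module | Status | To add |
+|--------|--------|--------|
+| `loadbalance` | Exists | Sticky and Least Time edge cases |
+| `matcher` | Exists | Large routing tables (1k+ locations) |
+| `proxy` | Exists | Cache-key construction, WebSocket detection |
+| `middleware/security` | Exists | Rate limiter under high concurrency |
+| `middleware/compression` | Exists | Large-file compression |
+| `cache` | Partial | Full CRUD, concurrent contention |
+| `lua` | Partial | Script execution, coroutine scheduling |
+| `resolver` | Missing | DNS queries, cache hits |
+| `variable` | Partial | Complex variable expansion |
+| `stream` | Missing | TCP/UDP forwarding throughput |
+
+#### 2.1.2 Integration Benchmarks (end to end)
+
+Exercise the full request path with `httptest` or real ports:
+
+- **Static file serving**: small (1KB), medium (100KB), and large (10MB) files
+- **Reverse proxy**: direct to backend, with cache, with load balancing
+- **HTTPS/TLS**: handshake overhead, TLS 1.2 vs 1.3
+- **HTTP/2**: multiplexing, flow control
+- **HTTP/3**: QUIC connection setup, 0-RTT
+- **WebSocket**: message forwarding latency
+- **Stream**: TCP/UDP throughput
+
+#### 2.1.3 System Benchmarks (full stack)
+
+Test the complete server with external load-testing tools:
+
+- **Peak RPS**: throughput curve across different concurrency levels
+- **Latency distribution**: P50/P99/P999 latency
+- **Resource usage**: CPU, memory, goroutine count, GC frequency
+- **Connection count**: C10K and C100K scenarios
+
+### 2.2 Benchmark Directory Layout
+
+```
+internal/benchmark/
+├── micro/                  # Go benchmark files
+│   ├── proxy_test.go
+│   ├── cache_test.go
+│   ├── lua_test.go
+│   └── ...
+├── integration/            # integration-test-style benchmarks
+│   ├── static_bench_test.go
+│   ├── proxy_bench_test.go
+│   └── ...
+└── system/                 # external load-test scripts + results
+    ├── wrk_static.sh
+    ├── wrk_proxy.sh
+    └── results/
+```
+
+### 2.3 Benchmark Collection Tools
+
+- **`make bench`**: run all micro benchmarks
+- **`make bench-stat`**: generate the benchmark report
+- **`scripts/bench.sh`**: one-command system load test
+- **benchstat**: compare old and new benchmark data
+
+---
+
+## 3. Performance Data Collection and Analysis
+
+### 3.1 Baseline Collection Steps
+
+#### Step 1: Run all micro benchmarks
+
+```bash
+# Run all micro benchmarks and save the results
+go test -bench=. -benchmem ./internal/benchmark/micro/... > benchmark-v0.4.0.txt
+
+# Summarize with benchstat
+benchstat benchmark-v0.4.0.txt
+```
+
+#### Step 2: Run the integration benchmarks
+
+```bash
+# Run the integration benchmarks
+go test -bench=Benchmark -benchmem ./internal/benchmark/integration/...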
+```
+
+#### Step 3: System load test (external tools)
+
+```bash
+# Static file load test
+wrk -t12 -c400 -d30s http://localhost:8080/
+
+# Proxy load test
+wrk -t12 -c400 -d30s http://localhost:8080/api/
+
+# HTTP/2 load test
+h2load -n100000 -c100 -m10 http://localhost:8080/
+```
+
+#### Step 4: Collect pprof data
+
+```bash
+# CPU profile (30 seconds); the URL is quoted so the shell does not treat "?" as a glob
+curl "http://localhost:8080/debug/pprof/profile?seconds=30" > cpu.prof
+
+# Heap profile
+curl http://localhost:8080/debug/pprof/heap > heap.prof
+
+# Allocs profile (allocation hot spots)
+curl http://localhost:8080/debug/pprof/allocs > allocs.prof
+
+# Goroutine profile
+curl http://localhost:8080/debug/pprof/goroutine > goroutine.prof
+```
+
+### 3.2 Analysis Toolchain
+
+| Tool | Purpose | Command |
+|------|---------|---------|
+| `go tool pprof` | CPU/memory analysis | `go tool pprof -http=:8081 cpu.prof` |
+| `go tool trace` | Scheduling/latency analysis | `go test -trace=trace.out` |
+| `benchstat` | Benchmark comparison | `benchstat old.txt new.txt` |
+| `go test -memprofile` | Allocation tracking | Integrated into the benchmarks |
+| `perf` (Linux) | System-level analysis | `perf record -g ./lolly` |
+
+### 3.3 Analysis Dimensions
+
+1. **CPU hot spots**: which functions consume the most CPU?
+2. **Memory allocation**: how many allocations per request, and how large?
+3. **Lock contention**: how contended are `sync.Mutex` / `sync.RWMutex`?
+4. **System calls**: what is the `syscall` / `cgo` overhead?
+5. **GC pressure**: GC frequency and STW time?
+6. **Network I/O**: connection setup and read/write latency?
+
+### 3.4 Bottleneck Report Template
+
+```
+Performance Analysis Report v0.4.0 Baseline
+=============================
+
+1. CPU hot spots Top 5
+   - runtime.mallocgc (12.3%) ← allocation overhead
+   - runtime.scanobject (8.7%) ← GC scanning
+   - proxy.(*Proxy).ServeHTTP (7.2%)
+   - matcher.(*LocationEngine).Match (5.1%)
+   - compress/flate.(*compressor).write (4.8%)
+
+2. Per-request allocations Top 5
+   - time.Now(): 1 alloc/req
+   - fmt.Sprintf: 0.5 alloc/req
+   - ...
+
+3. Lock contention hot spots
+   - cache.(*FileCache).Get: 15% blocked time
+   - proxy.(*Proxy).buildCacheKeyHash: 8% blocked time
+
+4. Optimization priorities
+   P0: [specific task]
+   P1: [specific task]
+   P2: [specific task]
+```
+
+---
+
+## 4. 
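Optimization Workflow
+
+### 4.1 Optimization Principles
+
+- **Quantifiable**: every optimization must come with before/after benchmark data
+- **Minimal change**: prefer single-file or single-function changes
+- **Reversible**: keep the benchmark data from before and after each optimization
+
+### 4.2 Optimization Categories
+
+| Type | Example | Verification |
+|------|---------|--------------|
+| Zero allocation | Use `b2s` instead of `string([]byte)` | `-benchmem` allocs/op |
+| Algorithmic | Faster hashing and lookup | `Benchmark` ns/op |
+| Concurrency | Finer-grained locks, lock-free structures | `go test -race` + benchmark |
+| Caching | Reduce repeated computation | CPU profile comparison |
+| GC | Reduce short-lived objects | `GODEBUG=gctrace=1` |
+
+---
+
+## 5. Regression Detection
+
+### 5.1 Automated Checks
+
+- **CI integration**: run a benchmark comparison on every PR
+- **Threshold alerts**: a performance drop of more than 5% blocks the PR automatically
+- **Trend tracking**: long-term performance trend charts
+
+### 5.2 Regression Detection Tools
+
+```bash
+# Compare two versions
+benchstat old.txt new.txt
+
+# Sample output
+# name        old time/op  new time/op  delta
+# ServeHTTP   1.20µs ± 2%  1.15µs ± 3%  -4.17% (p=0.02 n=10+10)
+```
+
+---
+
+## 6. Expected Outcomes
+
+- A complete benchmark suite covering all core modules
+- Quantified baseline performance data
+- The Top 10 identified performance bottlenecks
+- Verifiable performance-gain data for every optimization round
+- Automated regression detection that prevents performance degradation
+
+---
+
+## 7. Task List
+
+- [ ] Create the `internal/benchmark/` directory layout
+- [ ] Add the missing micro benchmarks (resolver, stream, cache, lua)
+- [ ] Create the integration benchmarks
+- [ ] Create the system load-test scripts
+- [ ] Run the first full benchmark pass → generate the baseline
+- [ ] Collect pprof data (CPU/heap/allocs/goroutine)
+- [ ] Analyze bottlenecks → generate the performance report
+- [ ] Define the Top N optimization tasks
+- [ ] Implement and verify the optimizations one by one
+- [ ] Set up CI regression detection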
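
Reviewer sketch for the 5% regression threshold in section 5.1 of the performance design. This is only an illustration under stated assumptions, not part of the patch: the `Result` type, the `deltaPercent` and `regressions` helpers, and the sample numbers are all hypothetical, and a real CI job would read values from benchstat output instead of hard-coding them.

```go
package main

import "fmt"

// Result holds the mean ns/op of one benchmark in the old and the new run.
// Hypothetical type for illustration; real values would come from benchstat.
type Result struct {
	Name    string
	OldNsOp float64
	NewNsOp float64
}

// deltaPercent returns the relative change of newNs versus oldNs in percent.
// Positive values mean the benchmark became slower.
func deltaPercent(oldNs, newNs float64) float64 {
	return (newNs - oldNs) / oldNs * 100
}

// regressions returns the names of benchmarks that slowed down by more than
// thresholdPercent. Entries without a usable old value are skipped.
func regressions(results []Result, thresholdPercent float64) []string {
	var failed []string
	for _, r := range results {
		if r.OldNsOp <= 0 {
			continue
		}
		if deltaPercent(r.OldNsOp, r.NewNsOp) > thresholdPercent {
			failed = append(failed, r.Name)
		}
	}
	return failed
}

func main() {
	results := []Result{
		{Name: "ServeHTTP", OldNsOp: 1200, NewNsOp: 1150}, // about 4% faster
		{Name: "Match", OldNsOp: 500, NewNsOp: 540},       // 8% slower
	}
	failed := regressions(results, 5)
	fmt.Println(failed) // prints [Match]
}
```

A CI step could exit non-zero whenever the returned slice is non-empty. Comparing means alone ignores run-to-run noise, so the p-value that benchstat reports would still be worth checking before blocking a PR.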