<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>Andrewei's Blog</title>
<link href="https://andrewei1316.github.io/atom.xml" rel="self"/>
<link href="https://andrewei1316.github.io/"/>
<updated>2021-01-03T09:16:39.494Z</updated>
<id>https://andrewei1316.github.io/</id>
<author>
<name>Andrewei</name>
</author>
<generator uri="https://hexo.io/">Hexo</generator>
<entry>
<title>Reading Notes on "A Top-Down Method for Performance Analysis and Counters Architecture"</title>
<link href="https://andrewei1316.github.io/2020/12/20/top-down-performance-analysis/"/>
<id>https://andrewei1316.github.io/2020/12/20/top-down-performance-analysis/</id>
<published>2020-12-20T06:35:36.000Z</published>
<updated>2021-01-03T09:16:39.494Z</updated>
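<content type="html"><![CDATA[<h2 id="简介"><a href="#简介" class="headerlink" title="简介"></a>Introduction</h2><p>Performance analysis keeps getting harder: processors grow more complex, workloads grow more diverse, and the data produced by profiling tools is difficult to manage. At the same time, some domains impose tighter resource and time constraints, which calls for analysis methods that are both faster and more accurate.</p><p>The paper presents a Top-Down Analysis method that quickly identifies the true performance bottlenecks on out-of-order processors. It organizes performance data into a structured hierarchy so that bottlenecks stand out quickly and intuitively, and it has been adopted by many performance tools, including <code>VTune</code>.</p><p>Unlike other performance analysis methods, its overhead is low: it needs only eight simple new performance events on top of a traditional <code>PMU (Performance Monitoring Unit)</code>. It places no restriction on the problem domain, supports comprehensive analysis, and can uncover performance problems in superscalar cores.</p><a id="more"></a><h2 id="背景"><a href="#背景" class="headerlink" title="背景"></a>Background</h2><p>Modern processors have made great strides in performance through techniques such as large-window out-of-order execution, predictive speculation, and hardware prefetching, which let programs reach a high <code>IPC (instructions per cycle)</code>. But while these sophisticated designs push performance higher, they also make the conditions under which a program runs fast more demanding, and finding what limits a program's speed among all the available data becomes trickier.</p><p>From a bird's-eye view, a modern CPU has two main parts: the Frontend and the Backend.</p><p>The frontend fetches instructions from memory and translates them into micro-operations (μops). These <code>μops</code> are handed to the backend, usually after being buffered in a <code>ready-μops-queue</code>.</p><p>The backend schedules, executes, and commits (retires) these <code>μops</code>; execution may be out of order, but <code>μops</code> retire in the original program order.</p><p>Figure 1 shows the microarchitecture of the 3rd-generation Intel Core processor (Ivy Bridge).</p><p><img src="/images/top_down_performance_analysis/figure1_out_of_order_cpu_block_diagram.png" alt="Figure 1: Out-of-order CPU block diagram - Intel Core"></p><p>The traditional way of estimating stalls, for example the <code>cache misses</code> formula $StallCycles = \sum_i Penalty_i \times MissEvent_i$, works only for in-order <code>CPU</code>s. It does not suit modern <code>CPU</code>s for the following reasons:</p><ol><li><p>Stalls overlap: many units work in parallel, so, for example, a <code>data cache miss</code> can be serviced while a later instruction is missing the instruction <code>cache</code>.</p><blockquote><p>Stalls overlap, where many units work in parallel. E.g. a data cache miss can be handled, while some future instruction is missing the instruction cache.</p></blockquote></li><li><p>Speculative execution: when the <code>CPU</code> runs down a wrong control path, events from that path should be weighted less than events from the correct path.</p><blockquote><p>Speculative execution, when CPU follows an incorrect control-path. 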
Events from incorrect path are less critical than those from correct-path.</p></blockquote></li><li><p>Workload-dependent penalties: the naive approach assumes a fixed penalty for every workload, but the cost of, say, a branch misprediction can vary, for example with the distance between branches.</p><blockquote><p>Penalties are workload-dependent, while naïve-approach assumes a fixed penalty for all workloads. E.g. the distance between branches may add to a misprediction cost.</p></blockquote></li><li><p>Restriction to a pre-defined set of miss-events: complex modern microarchitectures can hiccup in many ways, and dedicated events cover only the most common subset.</p><blockquote><p>Restriction to a pre-defined set of miss-events, these sophisticated microarchitectures have so many possible hiccups and only the most common subset is covered by dedicated events.</p></blockquote></li><li><p>Superscalar inaccuracy: a <code>CPU</code> can <code>issue</code>, <code>execute</code>, and <code>retire</code> multiple operations per cycle. As more and more techniques mitigate latency, some applications become limited by the pipeline's bandwidth instead.</p><blockquote><p>Superscalar inaccuracy, a CPU can issue, execute and retire multiple operations in a cycle. Some (e.g. client) applications become limited by the pipeline’s bandwidth as latency is mitigated with more and more techniques. 
</p></blockquote></li></ol><h2 id="自顶向下的分析(Top-Down-Analysis)"><a href="#自顶向下的分析(Top-Down-Analysis)" class="headerlink" title="自顶向下的分析(Top-Down Analysis)"></a>Top-Down Analysis</h2><p>Top-Down Analysis aims to locate performance bottlenecks quickly and accurately. It proceeds as follows:</p><ol><li>Classify <code>CPU</code> execution time at a high structural level. This step flags the domain(s) that perform poorly, as indicated by a high fraction value.</li><li>Drill down into each flagged domain level by level until the specific problem spots are identified; these can then be investigated in a targeted way.</li></ol><h3 id="层次结构(Hierarchy)"><a href="#层次结构(Hierarchy)" class="headerlink" title="层次结构(Hierarchy)"></a>Hierarchy</h3><p>Figure 2 shows the hierarchy defined by <code>Top-Down Analysis</code>.</p><p><img src="/images/top_down_performance_analysis/figure2_the_top_down_analysis_hierarchy.png" alt="Figure 2: The Top-Down Analysis Hierarchy"></p><p>In <code>Top-Down Analysis</code>, pipeline activity is broken down into four basic categories: <code>Retiring</code>, <code>Bad Speculation</code>, <code>Frontend Bound</code>, and <code>Backend Bound</code>.</p><p>The following example, a workload limited by <code>cache</code> performance, illustrates the workflow. After running <code>Top-Down Analysis</code>:</p><ol><li><p><code>Backend Bound</code> is flagged while <code>Frontend Bound</code> is not. Drill down into <code>Backend Bound</code> and ignore <code>Frontend Bound</code>.</p></li><li><p>Under <code>Backend Bound</code>, <code>Memory Bound</code> is flagged (assuming the program is <code>cache</code>-intensive). Keep drilling into <code>Memory Bound</code> and ignore its non-memory siblings.</p></li><li><p>Under <code>Memory Bound</code>, <code>L1 Bound</code> is flagged.</p></li><li><p>Finally, the likely cause of the poor performance is loads that are blocked because they overlap with earlier stores, or loads that are split across cache lines.</p><blockquote><p>I did not fully understand this conclusion, so the original text is quoted here:</p><p>Lastly, Loads block due to overlap with earlier stores or cache line split loads might be specific performance issues underneath L1 Bound</p></blockquote></li></ol><p><strong>The hierarchical structure adds a natural safety net to the analysis of the statistics</strong>: an inner node should be ignored unless every node on the path from the root to it is flagged. For example, code that performs divisions may show high values for both <code>Memory Bound</code> and <code>Divider</code>, but the <code>Divider</code> node should be ignored if <code>Backend</code> and <code>CoreBound</code> are not flagged. This principle is called the hierarchical-safety property. <strong>Only the values of sibling nodes are comparable</strong>, because sibling values are measured at the same pipeline stage.</p><h3 
id="顶层结构的分解(Top-Level-Breakdown)"><a href="#顶层结构的分解(Top-Level-Breakdown)" class="headerlink" title="顶层结构的分解(Top Level Breakdown)"></a>Top Level Breakdown</h3><p>Choosing the top-level categories for a complex microarchitecture is a major challenge. The boundary chosen here is the issue point marked with an asterisk in Figure 1. Based on this boundary, pipeline slots are classified into four categories: Frontend Bound, Backend Bound, Bad Speculation, and Retiring. Figure 3 shows the classification criteria.</p><p><img src="/images/top_down_performance_analysis/figure3_top_level_breakdown_flowchart.png" alt="Figure 3: Top Level breakdown flowchart"></p><ol><li>If a <code>μop</code> is issued in a given slot, it will eventually either retire or be cancelled, so the slot is attributed to Retiring or to Bad Speculation, respectively.</li><li>If a <code>μop</code> is not issued because of a <code>backend-stall</code> (back-pressure on <code>μops</code> caused by insufficient backend resources), the stall is attributed to the backend.</li><li>If a <code>μop</code> is not issued and there is no <code>backend-stall</code>, the stall is attributed to the frontend.</li></ol><p>Because a superscalar processor can issue multiple <code>μops</code> per cycle, classifying at pipeline-slot granularity makes the breakdown robust and accurate. This is a clear difference between <code>Top-Down Analysis</code> and earlier performance analysis methods.</p><h3 id="前端限制分类(Frontend-Bound-Category)"><a href="#前端限制分类(Frontend-Bound-Category)" class="headerlink" title="前端限制分类(Frontend Bound Category)"></a>Frontend Bound Category</h3><p>Recall the main jobs of the frontend:</p><ol><li>Branch prediction: predict the next address to fetch</li><li>Fetch <code>cache line</code>s and parse them into instructions</li><li>Decode instructions into <code>micro-ops</code> </li></ol><p><code>Frontend Bound</code> is further split into <code>Fetch Latency</code> and <code>Fetch Bandwidth</code>:</p><ol><li><p><code>Fetch Latency</code> covers instruction-fetch delays of any cause. <code>icache miss</code>, <code>iTLB miss</code>, and <code>Branch Resteers</code> all belong here.</p><blockquote><p><code>Branch Resteers</code> accounts for instruction-fetch delays after a pipeline flush. A <code>pipeline flush</code> can be triggered by state-clearing events such as <code>branch misprediction</code> or <code>memory nukes</code>, so <code>Branch Resteers</code> is closely related to <code>Bad Speculation</code>.</p></blockquote></li><li><p><code>Fetch Bandwidth</code> covers inefficiency in instruction decoding. For example, high-<code>IPC</code> programs tend to be limited by frontend bandwidth, and sustaining that bandwidth while reducing latency takes extra hardware. Intel uses the <code>LSD(Loop Stream Detector)</code> and the <code>DSB(Decoded-μop Stream 
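Buffer)</code> to address this problem.</p><blockquote><p>The Loop Stream Detector (LSD) was first introduced in the Intel Core microarchitecture (note: it resides in the BPU). The LSD detects small loops that meet certain criteria and locks them in the micro-op queue. The loop's instructions can then be streamed directly from the micro-op queue, with no more fetching, decoding, or reading micro-ops from any cache, until a branch misprediction ends the loop.</p><p>A loop can be locked by the LSD/micro-op queue only if it has the following characteristics:</p><ul><li>It spans at most eight 32-byte instruction chunks</li><li>It has at most 28 micro-ops (~28 x86 instructions)</li><li>All of its micro-ops are also resident in the micro-op cache</li><li>It contains at most eight taken branches, none of which is a CALL or RET</li><li>It has no mismatched stack operations, for example more PUSH than POP instructions</li></ul><p>Many compute-intensive loops, searches, and software string moves match these characteristics.</p><p>Software should use the LSD "opportunistically". For performance-critical code, loop unrolling is usually preferred over LSD locking, even if unrolling prevents the loop from being locked by the LSD (for example, because the code becomes too long).</p><p>Quoted from <a href="https://blog.csdn.net/qq_43401808/article/details/85997414">https://blog.csdn.net/qq_43401808/article/details/85997414</a></p></blockquote></li></ol><h3 id="错误预测分类(Bad-Speculation-category)"><a href="#错误预测分类(Bad-Speculation-category)" class="headerlink" title="错误预测分类(Bad Speculation category)"></a>Bad Speculation Category</h3><p><code>Bad Speculation</code> reflects <code>pipeline slots</code> wasted because of incorrect speculations. It has two parts:</p><ol><li><p>Slots used to issue <code>μops</code> that never retire</p></li><li><p>Slots in which the pipeline is blocked while recovering from an earlier misspeculation</p><blockquote><p>Because recovering from a misspeculation also depends on how quickly the correct instructions can be fetched, this may overlap with <code>Branch Resteers</code> under <code>Frontend Bound</code>.</p></blockquote></li></ol><p><code>Top-Down Analysis</code> places <code>Bad Speculation</code> at the top level so that the share of useful work lost to misspeculation is easy to see, which in turn determines how accurate the values in the other categories are.</p><p><code>Bad Speculation</code> is further split into <code>Branch Mispredict</code> and <code>Machine Clears</code>; the latter causes problems similar to a <code>pipeline flush</code>:</p><ol><li><code>Branch Mispredict</code> focuses on making the program's control flow friendlier to the branch predictor</li><li><code>Machine Clears</code> points to unusual situations such as Memory Ordering Nukes clears, self-modifying code, or certain loads to illegal address ranges</li></ol><h3 id="退休分类(Retiring-Category)"><a href="#退休分类(Retiring-Category)" class="headerlink" title="退休分类(Retiring Category)"></a>Retiring Category</h3><p><code>Retiring Category</code> reflects the fraction of slots used by <code>μops</code> that eventually retire. Ideally every slot would be attributed to Retiring, i.e. a <code>Retiring</code> fraction of <code>100%</code>.</p><p>It is not the case that a high <code>Retiring</code> 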
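fraction leaves no room for improvement; the following indicators are still worth checking:</p><ol><li><code>Floating Point assists</code> under <code>Microcode Sequences</code> are pseudo-operations that hurt performance badly and should be avoided wherever possible</li><li>A high share of non-vectorized code can often be vectorized. Vectorized code lets a single instruction (or <code>μop</code>) do more work, which improves performance. </li></ol><h3 id="后端限制分类(Backend-Bound-Category)"><a href="#后端限制分类(Backend-Bound-Category)" class="headerlink" title="后端限制分类(Backend Bound Category)"></a>Backend Bound Category</h3><p>Backend Bound reflects slots in which no <code>μops</code> are <code>issued</code> because the backend lacks the resources to accept them.</p><p>Backend Bound is split into Memory Bound and Core Bound by breaking down stalls according to how busy the execution units are. In general, reaching the maximum <code>IPC</code> requires keeping the execution units busy at all times. For example, on a machine with 4 slots, if only 3 or fewer <code>μops</code> execute in some steady state, the machine is not at its optimum (the <code>IPC</code> is not 4). Such cycles are called ExecutionStalls.</p><h4 id="内存限制(Memory-Bound)"><a href="#内存限制(Memory-Bound)" class="headerlink" title="内存限制(Memory Bound)"></a>Memory Bound</h4><p>Memory Bound reflects execution stalls caused by the memory subsystem, for example a <code>load</code> that misses in every cache level.</p><p>Modern CPUs implement three levels of cache to hide main-memory latency. In Intel's design, the first-level <code>cache</code> has a data cache, and the second-level <code>cache</code> is shared by instructions and data; both are private to each core. The third-level <code>cache</code> is shared by all cores.</p><p>An out-of-order scheduler tries to avoid memory-access stalls by keeping the execution units busy with <code>μops</code> that have no pending dependencies. The real cost of a memory access is therefore the time during which the scheduler has no ready <code>μops</code> to feed the execution units; the delayed <code>μops</code> are either waiting for a memory access or depend on other <code>μops</code>.</p><p>Figure 4 shows how to attribute execution stalls to the cache level responsible.</p><p><img src="/images/top_down_performance_analysis/figure4_memory_bound_breakdown_flowchart.png" alt="Figure 4: Memory Bound breakdown flowchart"></p><ul><li><code>L1D</code> latency is usually short, comparable to <code>ALU</code> stalls, but in some cases an access that hits <code>L1D</code> can still see high latency. One example is a <code>load</code> blocked from forwarding data from an earlier <code>store</code> to an overlapping address (load blocked to forward data from earlier store to an overlapping address). Another is a <code>load</code> blocked because of <code>4K</code> aliasing. These cases all fall under <code>L1 Bound</code>.</li><li>On an out-of-order <code>CPU</code>, because of the memory-ordering requirements of the <code>X86</code> architecture, <code>store</code> 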
operations are buffered and executed post-retirement. Most of the time <code>store</code> operations have little impact on performance, but they cannot be ignored entirely. <code>Top-Down Analysis</code> defines <code>Store Bound</code> as the fraction of cycles with low execution ports utilization and a high number of stores buffered.</li><li>Within <code>Ext. Memory Bound</code>, <code>Memory Bandwidth</code> and <code>Memory Latency</code> are told apart by the number of outstanding requests waiting on data from memory. When this occupancy exceeds a threshold (say 70%), the time is attributed to <code>Memory Bandwidth</code>; otherwise it is attributed to <code>Memory Latency</code>.</li></ul><h4 id="核心限制(Core-Bound)"><a href="#核心限制(Core-Bound)" class="headerlink" title="核心限制(Core Bound)"></a>Core Bound</h4><p>Core Bound reflects either short execution-starvation periods or poor execution-port utilization, that is, pressure on the execution units or a lack of instruction-level parallelism in the program. For example, a long-latency divide may serialize execution, leaving only a few execution ports in use in a given cycle.</p><p><code>Core Bound</code> problems can usually be addressed with better code. For example, when a sequence of dependent arithmetic operations is flagged as <code>Core Bound</code>, the compiler can relieve it with better instruction scheduling. Vectorization can also relieve <code>Core Bound</code> problems.</p><h2 id="计数架构(Counters-Architecture)"><a href="#计数架构(Counters-Architecture)" class="headerlink" title="计数架构(Counters Architecture)"></a>Counters Architecture</h2><p>This section covers the hardware support that <code>Top-Down Analysis</code> needs. Modern <code>CPU</code>s include a <code>PMU</code>, which provides a set of general-purpose counters capable of counting performance events.</p><p>Table 1 summarizes the basic events, and Table 2 gives the formulas for the <code>Top-Down Analysis</code> metrics.</p><p>Table 1: Definitions of Top-Down performance events</p><table><thead><tr><th>Event</th><th>Definition</th></tr></thead><tbody><tr><td>TotalSlots</td><td>Total number of issue-pipeline slots.</td></tr><tr><td>SlotsIssued</td><td>Utilized issue-pipeline slots to issue operations</td></tr><tr><td>SlotsRetired</td><td>Utilized issue-pipeline slots to retire (complete) operations</td></tr><tr><td>FetchBubbles</td><td>Unutilized issue-pipeline slots while there is no backend-stall</td></tr><tr><td>RecoveryBubbles</td><td>Unutilized issue-pipeline slots due to recovery from earlier miss-speculation</td></tr><tr><td>BrMispredRetired</td><td>Retired miss-predicted branch instructions</td></tr><tr><td>MachineClears</td><td>Machine clear events 
(pipeline is flushed)</td></tr><tr><td>MsSlotsRetired</td><td>Retired pipeline slots supplied by the microsequencer fetch-unit</td></tr><tr><td>OpsExecuted</td><td>Number of operations executed in a cycle</td></tr><tr><td>MemStalls.AnyLoad</td><td>Cycles with no uops executed and at least 1 inflight load that is not completed yet</td></tr><tr><td>MemStalls.L1miss</td><td>Cycles with no uops executed and at least 1 inflight load that has missed the L1-cache</td></tr><tr><td>MemStalls.L2miss</td><td>Cycles with no uops executed and at least 1 inflight load that has missed the L2-cache</td></tr><tr><td>MemStalls.L3miss</td><td>Cycles with no uops executed and at least 1 inflight load that has missed the L3-cache</td></tr><tr><td>MemStalls.Stores</td><td>Cycles with few uops executed and no more stores can be issued</td></tr><tr><td>ExtMemOutstanding</td><td>Number of outstanding requests to the memory controller every cycle</td></tr></tbody></table><p>Table 2: Formulas for Top-Down Metrics</p><table><thead><tr><th>Metric Name</th><th>Formula</th></tr></thead><tbody><tr><td>Frontend Bound</td><td>FetchBubbles / TotalSlots</td></tr><tr><td>Bad Speculation</td><td>(SlotsIssued – SlotsRetired + RecoveryBubbles) / TotalSlots</td></tr><tr><td>Retiring</td><td>SlotsRetired / TotalSlots</td></tr><tr><td>Backend Bound</td><td>1 – (Frontend Bound + Bad Speculation + Retiring)</td></tr><tr><td>Fetch Latency Bound</td><td>FetchBubbles[≥ #MIW] / Clocks</td></tr><tr><td>Fetch Bandwidth Bound</td><td>Frontend Bound – Fetch Latency Bound</td></tr><tr><td>#BrMispredFraction</td><td>BrMispredRetired / (BrMispredRetired + MachineClears)</td></tr><tr><td>Branch Mispredicts</td><td>#BrMispredFraction * Bad Speculation</td></tr><tr><td>Machine Clears</td><td>Bad Speculation – Branch Mispredicts</td></tr><tr><td>MicroSequencer</td><td>MsSlotsRetired / TotalSlots</td></tr><tr><td>BASE</td><td>Retiring – MicroSequencer</td></tr><tr><td>#ExecutionStalls</td><td>(ΣOpsExecuted[= FEW] ) / 
Clocks</td></tr><tr><td>Memory Bound</td><td>(MemStalls.AnyLoad + MemStalls.Stores) / Clocks</td></tr><tr><td>Core Bound</td><td>#ExecutionStalls – Memory Bound</td></tr><tr><td>L1 Bound</td><td>(MemStalls.AnyLoad – MemStalls.L1miss) / Clocks</td></tr><tr><td>L2 Bound</td><td>(MemStalls.L1miss – MemStalls.L2miss) / Clocks</td></tr><tr><td>L3 Bound</td><td>(MemStalls.L2miss – MemStalls.L3miss) / Clocks</td></tr><tr><td>Ext. Memory Bound</td><td>MemStalls.L3miss / Clocks</td></tr><tr><td>MEM Bandwidth</td><td>ExtMemOutstanding[≥ THRESHOLD] / ExtMemOutstanding[≥ 1]</td></tr><tr><td>MEM Latency</td><td>(ExtMemOutstanding[≥ 1] / Clocks) – MEM Bandwidth</td></tr></tbody></table><h2 id="使用效果"><a href="#使用效果" class="headerlink" title="使用效果"></a>Results</h2><p>The evaluation uses <code>SPEC CPU2006</code>, an industry-standard CPU benchmark suite, and applies <code>Top-Down Analysis</code> to collect and analyze the performance data.</p><p>The tests cover two scenarios, <code>single-thread (1C)</code> and <code>multi-copy (4C)</code>. The test environment is:</p><table><thead><tr><th>Item</th><th>Specification</th></tr></thead><tbody><tr><td>Processor</td><td>Intel® Core™ i7-3940XM (Ivy Bridge). 3 GHz fixed frequency. A quadcore with 8MB L3 cache. Hardware prefetchers enabled.</td></tr><tr><td>Memory</td><td>8GB DDR3 @1600 MHz</td></tr><tr><td>OS</td><td>Windows 8 64-bit</td></tr><tr><td>Benchmark</td><td>SPEC CPU 2006 v1.2 (base/rate mode)</td></tr><tr><td>Compiler</td><td>Intel Compiler 14 (SSE4.2 ISA)</td></tr></tbody></table><h3 id="SPEC-CPU2006"><a href="#SPEC-CPU2006" class="headerlink" title="SPEC CPU2006"></a>SPEC CPU2006</h3><p>Figure 10 shows the analysis results for the <code>1C</code> and <code>4C</code> scenarios.</p><p><img src="/images/top_down_performance_analysis/figure10_spec_cpu2006_result.png" alt="Figure10 SPEC CPU2006 Result"></p><p>The comparison shows that the share of <code>Memory Bound</code> grows in the <code>4C</code> scenario relative to <code>1C</code>. This is expected, because the <code>L3 Cache</code> is shared among cores.</p><p>Drilling further down into <code>Ext. 
Memory Bound</code> gives the following breakdown:</p><p><img src="/images/top_down_performance_analysis/figure11_spec_cpu2006_memory_bound.png" alt="Figure11 SPEC CPU2006 Memory Bound"></p><p>In the <code>1C</code> scenario, <code>DRAM Bound</code> is caused by <code>Latency</code>. In the <code>4C</code> scenario, however, <code>DRAM Bandwidth</code> becomes the main cause of <code>DRAM Bound</code> for some benchmarks, because those benchmarks access large amounts of data in <code>DRAM</code>.</p><p>The comparison across processor microarchitectures:</p><p><img src="/images/top_down_performance_analysis/figure12_spec_cpu2006_across_microarchitectures.png" alt="Figure12 SPEC CPU2006 Across Microarchitectures"></p><p>The share of <code>Frontend Bound</code> is clearly lower on the <code>4th</code>-generation processor, because Intel improved both the <code>i-TLB</code> and the <code>i-Cache</code> in that generation. The result corroborates the gain from optimizing specific blocks, and the same comparison can be made across processor families and architectures to better understand how they differ.</p><h2 id="实验"><a href="#实验" class="headerlink" title="实验"></a>Experiment</h2><p>This section implements matrix multiplication in <code>Java</code> in several ways and monitors each run with <code>VTune</code> to compare how the implementations use the <code>CPU</code>.</p><p>The experiment first allocates two one-dimensional <code>float</code> arrays of length $3000 \times 3000$, fills them with random values, and then runs the matrix multiplication 10 times with each method.</p><h3 id="场景1"><a href="#场景1" class="headerlink" title="场景1"></a>Scenario 1</h3><p>The simplest, most naive implementation.</p><p>The code is as follows:</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">private</span> <span class="keyword">static</span> <span class="keyword">float</span>[] multiply1(<span class="keyword">float</span>[] matrix1, <span class="keyword">float</span>[] matrix2, <span class="keyword">int</span> size) {</span><br><span class="line"> <span class="keyword">float</span>[] res = 
<span class="keyword">new</span> <span class="keyword">float</span>[size * size];</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">0</span>; i < size; i++) {</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> j = <span class="number">0</span>; j < size; j++) {</span><br><span class="line"> <span class="keyword">float</span> sum = <span class="number">0</span>;</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> k = <span class="number">0</span>; k < size; k++) {</span><br><span class="line"> sum += matrix1[i * size + k] * matrix2[k * size + j];</span><br><span class="line"> }</span><br><span class="line"> res[i * size + j] = sum;</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> res;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>VTune reports the following:</p><p><img src="/images/top_down_performance_analysis/figure5_fmultiply1_3000.png" alt="Figure5 Fmultiply1_3000"></p><h3 id="场景2"><a href="#场景2" class="headerlink" title="场景2"></a>Scenario 2</h3><p>This version follows the method described in <a href="http://csapp.cs.cmu.edu/2e/waside/waside-blocking.pdf">CS:APP2e Web Aside MEM:BLOCKING: Using Blocking to Increase Temporal Locality</a>. The idea is to partition the original matrices into submatrices and then exploit the mathematical fact that submatrices can be manipulated just like scalars.</p><p>The code below implements this method. The basic idea is to partition A and C into $1 \times bsize$ row slivers and to partition B into $bsize \times bsize$ blocks.</p><ol><li>The innermost <code>(j, k)</code> loop pair (the two most deeply nested loops) multiplies a sliver of A by a block of B and accumulates the result into a sliver of C.</li><li>The <code>i</code> loop iterates through the n row slivers of A and C, using the same block of B on each iteration (the block size is <code>bsize * bsize</code>).</li></ol><p><img src="/images/top_down_performance_analysis/figure6_blocked_multiply.png" alt="Figure6 Block Multiply"></p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">private</span> <span class="keyword">static</span> <span class="keyword">float</span>[] multiply2(<span class="keyword">float</span>[] matrix1, <span class="keyword">float</span>[] matrix2, <span class="keyword">int</span> size) {</span><br><span class="line"> <span class="keyword">int</span> BLOCK_SIZE = <span class="number">8</span>;</span><br><span class="line"> <span class="keyword">float</span>[] res = <span class="keyword">new</span> <span class="keyword">float</span>[size * size];</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> kk = <span class="number">0</span>; kk < size; kk += BLOCK_SIZE) {</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> jj = <span class="number">0</span>; jj < size; jj += BLOCK_SIZE) {</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">0</span>; i < size; i++) {</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> j = jj; j < jj + BLOCK_SIZE; ++j) {</span><br><span class="line"> <span class="keyword">float</span> sum = res[i * size + j];</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> k = kk; k < kk + BLOCK_SIZE; ++k) {</span><br><span class="line"> sum += matrix1[i * size + k] * matrix2[k * size + j];</span><br><span class="line"> }</span><br><span 
class="line"> res[i * size + j] = sum;</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> res;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>VTune reports the following:</p><p><img src="/images/top_down_performance_analysis/figure7_fmultiply2_3000.png" alt="Figure7 Fmultiply2_3000"></p><h3 id="场景3"><a href="#场景3" class="headerlink" title="场景3"></a>Scenario 3</h3><p>This version is a method mentioned in the article <a href="https://richardstartin.github.io/posts/multiplying-matrices-fast-and-slow">Multiplying Matrices, Fast and Slow</a>.</p><p>The code is as follows:</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">private</span> <span class="keyword">static</span> <span class="keyword">float</span>[] multiply3(<span class="keyword">float</span>[] matrix1, <span class="keyword">float</span>[] matrix2, <span class="keyword">int</span> size) {</span><br><span class="line"> <span class="keyword">float</span>[] res = <span class="keyword">new</span> <span class="keyword">float</span>[size * size];</span><br><span class="line"> <span class="keyword">int</span> in = <span class="number">0</span>;</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">0</span>; i < size; ++i) {</span><br><span class="line"> <span 
class="keyword">int</span> kn = <span class="number">0</span>;</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> k = <span class="number">0</span>; k < size; ++k) {</span><br><span class="line"> <span class="keyword">float</span> aik = matrix1[in + k];</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> j = <span class="number">0</span>; j < size; ++j) {</span><br><span class="line"> res[in + j] += aik * matrix2[kn + j];</span><br><span class="line"> }</span><br><span class="line"> kn += size;</span><br><span class="line"> }</span><br><span class="line"> in += size;</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> res;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>VTune reports the following:</p><p><img src="/images/top_down_performance_analysis/figure8_fmultiply3_3000.png" alt="Figure8 Fmultiply3_3000"></p><h3 id="场景4"><a href="#场景4" class="headerlink" title="场景4"></a>Scenario 4</h3><p>This version is an improved variant of the Scenario 3 version.</p><p>The code is as follows:</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span 
class="keyword">public</span> <span class="keyword">static</span> <span class="keyword">float</span>[] multiply4(<span class="keyword">float</span>[] matrix1, <span class="keyword">float</span>[] matrix2, <span class="keyword">int</span> size) {</span><br><span class="line"> <span class="keyword">float</span>[] res = <span class="keyword">new</span> <span class="keyword">float</span>[size * size];</span><br><span class="line"> <span class="keyword">float</span>[] bBuffer = <span class="keyword">new</span> <span class="keyword">float</span>[size];</span><br><span class="line"> <span class="keyword">float</span>[] cBuffer = <span class="keyword">new</span> <span class="keyword">float</span>[size];</span><br><span class="line"> <span class="keyword">int</span> in = <span class="number">0</span>;</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">0</span>; i < size; ++i) {</span><br><span class="line"> <span class="keyword">int</span> kn = <span class="number">0</span>;</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> k = <span class="number">0</span>; k < size; ++k) {</span><br><span class="line"> <span class="keyword">float</span> aik = matrix1[in + k];</span><br><span class="line"> System.arraycopy(matrix2, kn, bBuffer, <span class="number">0</span>, size);</span><br><span class="line"> saxpy(size, aik, bBuffer, cBuffer);</span><br><span class="line"> kn += size;</span><br><span class="line"> }</span><br><span class="line"> System.arraycopy(cBuffer, <span class="number">0</span>, res, in, size);</span><br><span class="line"> Arrays.fill(cBuffer, <span class="number">0f</span>);</span><br><span class="line"> in += size;</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> res;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span 
class="keyword">private</span> <span class="keyword">static</span> <span class="keyword">void</span> <span class="title">saxpy</span><span class="params">(<span class="keyword">int</span> n, <span class="keyword">float</span> aik, <span class="keyword">float</span>[] b, <span class="keyword">float</span>[] c)</span> </span>{</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">0</span>; i < n; ++i) {</span><br><span class="line"> c[i] += aik * b[i];</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>Vtune 监控信息如下</p><p><img src="/images/top_down_performance_analysis/figure9_fmultiply4_3000.png" alt="Figure9 Fmultiply4_3000"></p><h2 id="参考资料"><a href="#参考资料" class="headerlink" title="参考资料"></a>参考资料</h2><p><a href="https://ieeexplore.ieee.org/document/6844459/metrics#metrics">A Top-Down method for performance analysis and counters architecture</a></p><p><a href="https://software.intel.com/content/www/us/en/develop/documentation/vtune-help/top.html">Intel® VTune™ Profiler</a></p><p><a href="https://kernel.taobao.org/2019/03/Top-down-Microarchitecture-Analysis-Method/">Top-down Microarchitecture Analysis Method</a></p><p><a href="https://richardstartin.github.io/posts/multiplying-matrices-fast-and-slow">Multiplying Matrices, Fast and Slow</a></p><p><a href="http://csapp.cs.cmu.edu/2e/waside/waside-blocking.pdf">CS:APP2e Web Aside MEM:BLOCKING: Using Blocking to Increase Temporal Locality</a></p>]]></content>
<summary type="html"><h2 id="简介"><a href="#简介" class="headerlink" title="简介"></a>简介</h2><p>随着处理器复杂度的增加、处理任务的多样化,以及性能分析工具产生的数据越来越难以管理,性能分析的难度日益增加。同时,某些领域对资源和时间的限制更加严格,进一步要求性能分析方法在分析速度和结果准确性上做得更好。</p>
<p>这篇文章给出了一个自顶向下的分析方法(Top-Down Analysis),可以在乱序处理器上快速定位真正的性能瓶颈。该方法将性能数据结构化、分层组织,能够直观、快速地展示性能瓶颈,并且已经被包括 <code>VTune</code> 在内的众多性能工具采用。</p>
<p>不同于其他性能分析方法,该方法的开销很低,只需要在传统的 <code>PMU(Performance Monitoring Unit)</code> 中增加 8 个简单的性能事件。它对问题域没有限制,可以全面地进行性能分析,并且可以找到超标量核心的性能问题。</p></summary>
<category term="cpu" scheme="https://andrewei1316.github.io/categories/cpu/"/>
<category term="performance" scheme="https://andrewei1316.github.io/categories/cpu/performance/"/>
<category term="cpu" scheme="https://andrewei1316.github.io/tags/cpu/"/>
<category term="performance" scheme="https://andrewei1316.github.io/tags/performance/"/>
</entry>
<entry>
<title>【转载】Skylake Microarchitecture</title>
<link href="https://andrewei1316.github.io/2020/12/13/skylake-microarchitecture/"/>
<id>https://andrewei1316.github.io/2020/12/13/skylake-microarchitecture/</id>
<published>2020-12-13T12:35:04.000Z</published>
<updated>2021-01-03T09:16:39.440Z</updated>
<content type="html"><![CDATA[<blockquote><p>本文全部内容都来自于 DECODEZ “Skylake 微架构剖析” 系列,地址 <a href="https://decodezp.github.io/2019/01/07/quickwords9-skylake-pipeline-1/">https://decodezp.github.io/2019/01/07/quickwords9-skylake-pipeline-1/</a></p><p>搬运仅仅为了留作笔记,详细内容请直接访问 DECODEZ 的博客网站 <a href="https://decodezp.github.io/">https://decodezp.github.io/</a></p></blockquote><h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>了解 <code>CPU</code> 的微架构是基于其开发“硬核”软件的必需步骤。由于一些历史遗留问题,现存的技术资料往往存在一些概念混淆、重复命名甚至自相矛盾之处。本文一来梳理 <code>Skylake</code> 微架构(主要是流水线)的组成和特性,二来试图厘清一些含混的概念用以帮助后来者。</p><p>另外在介绍完微架构之后,会继续结合 <code>Perf</code> 中的 <code>Performance Event</code> 来对照说明互为印证。</p><blockquote><p>需要强调的是,本文的重点是Skylake的流水线(pipeline)架构,core间的连接和架构方式不作重点说明。</p></blockquote><a id="more"></a><h2 id="Skylake-介绍"><a href="#Skylake-介绍" class="headerlink" title="Skylake 介绍"></a>Skylake 介绍</h2><p><code>Skylake</code> 是由 <code>Intel</code> 以色列研发中心于2015年发布的 <code>14nm CPU</code> 架构。作为 <code>Broadwell</code> 的继任者,<code>Skylake</code> 在原有架构的基础上,对一些关键特性和组件做出了相当幅度的优化:</p><p><img src="/images/skylake_microarchitecture/figure1_compare_with_5th.png" alt="Figure1 Compare With 5th"></p><p>上图简单列举了一些量化指标,现在不求甚解就好。</p><p>在指令集方面,引入了<code>AVX-512</code>、<code>CLFLUSHOPT</code>、<code>CLWB</code>等新的指令集,不过本文不打算介绍这些东西,写到这里只是觉得如果只用上一段话结束这一小节有些太突兀了。</p><h2 id="流水线概览"><a href="#流水线概览" class="headerlink" title="流水线概览"></a>流水线概览</h2><p><img src="/images/skylake_microarchitecture/figure2_intel_common_arch_post_ucache.png" alt="Figure2 intel_common_arch_post_ucache"></p><p>引用上面这张图是为了举一个反例,说明一下本文要解决的问题。这张图可以被当做是一张流水线的架构抽象,我可以指着每一个组件讲讲它们都是干嘛的,但这里的问题就是某一个相同的组件在不同的文档、资料、甚至语境下可能有两个甚至更多个名字。</p><p>比如蓝色方块最下面的<code>Allocation Queue</code>,它就还有一个名字叫做<code>Instruction Decode Queue</code>,同时它还有可能被叫做<code>IDQ</code>或<code>AQ</code>。而关于<code>Decoded Instruction Queue</code>、<code>Micro Instruction Sequencer</code>、<code>Re-order buffer</code>、<code>Scheduler</code>、<code>Reservation 
Station</code>等概念的辨析也是…需要下一番功夫。</p><p>本文将以全网最清晰的方式讲清楚这些概念。</p><p>从high-level的层面来讲,Skylake的流水线架构与Broadwell和Haswell没有太大出入。还是可以分为两个阶段:</p><h3 id="前端-Front-End"><a href="#前端-Front-End" class="headerlink" title="前端(Front-End)"></a>前端(Front-End)</h3><p>上图中蓝色部分就代表流水线的前端。它的主要作用就是获取指令、解码(Decode)指令。</p><p>为了最大限度的发挥CPU的能力,前端就需要尽可能高效率地把程序指令输送给后端。这里就面临两个挑战:</p><ol><li>如何更快更准确地取得要执行的指令</li><li>如何将取得的指令更快地解码为微指令(micro-ops/uops)</li></ol><p>有了更多的微指令输送给后端(执行单元),后端的工作量才能饱和。而前端的所有组件和机制,都是围绕这两个挑战进行的。</p><h3 id="后端-Back-End"><a href="#后端-Back-End" class="headerlink" title="后端(Back-End)"></a>后端(Back-End)</h3><p>上图中红色的部分就代表流水线的后端。一般来讲绿色的部分是存储子系统,虽然与后端交互,但严格讲不算在后端里面。</p><p>后端的主要任务就是执行前端送过来的指令。和前端类似,后端除了“来料加工”之外,也有它自己需要面对的挑战:</p><ol><li>如何提高指令的并行程度</li><li>如何充分利用已有的CPU能力</li></ol><p>如果将CPU比作一家餐厅,跑在上面的应用就是来餐厅就餐的食客。前端类似餐厅的服务生,需要接受客人的下单,同时将订单送到后厨。而后厨就类似后端,负责做出客人需要的菜品。</p><p>但如何能让上菜速度更快?前端是否可以在客人排位时就让其提前下单?后厨是否能够提前准备好本店热门的特色菜,或者一并煮好一大锅面条,根据需要浇上不同的浇头?</p><p>CPU说是高科技,其实干得也就是这些事情。</p><h2 id="前端-Frontend"><a href="#前端-Frontend" class="headerlink" title="前端(Frontend)"></a>前端(Frontend)</h2><p>处理器在前端这一部分的时候还是顺序(in-order)处理的,主要是也确实没什么乱序的空间。虽然说是顺序,但前端因为贴近业务,所以受人写的代码的影响也比较大。如果仅仅只是“取指令->解码”,恐怕需要写程序的人是个非常聪明的程序员。前端很多组件的工作其实都是在填程序员的坑,这也是我比较心疼前端的地方。</p><h3 id="Fetch"><a href="#Fetch" class="headerlink" title="Fetch"></a>Fetch</h3><p><img src="/images/skylake_microarchitecture/figure3_skylake_fetch.png" alt="Figure3 skylake_fetch"></p><p>前端的任务,首先是从内存中取得指令。同读取数据类似,<code>CPU</code> 通过查询页表获得指令所在的内存地址,同时把指令塞到 <code>CPU</code> 的 <code>L1</code> 指令缓存里。</p><p>具体要把哪个地址上的指令数据送到 <code>L1I$</code> 里,这是分支预测器(Branch predictor)的工作。作为 <code>CPU</code> 的核心技术,<code>Intel</code> 并没有透露太多信息,我们这里也只好一笔带过。不过它的细节也许很复杂,但它的脾气很好掌握:和我们很多人不喜欢自己的工作一样,它的工作就是处理分支,但它最不喜欢分支。</p><p>在 <code>Skylake</code> 架构里,<code>L1I$</code> 大小为 <code>32KB</code>,组织形式为 <code>8-way set associative</code> (关于 <code>CPU</code> 缓存组织形式的讲解可以参照<a href="https://decodezp.github.io/2018/11/25/quickwords2-cacheassociativity/">这篇</a>),每个 <code>Cycle</code> 可以取 
<code>16Byte</code> 长度(fetch window)的指令。如果你开了 <code>Hyper-thread</code>,那么同一个物理核上的两个逻辑核均分这个 <code>fetch window</code>,每个 <code>Cycle</code> 各占用一次。</p><p>在 <code>L1I$</code> 里的指令还都是变长的 <code>x86 macro-ops</code>,也就是我们看到的那些编译之后的汇编指令。如果熟悉这些指令的话,就会知道这些指令的长度(就是那些二进制数字)都不一样,同时一条指令有时可以由好几个操作组成。</p><p>这种指令对 <code>CPU</code> 的执行单元来说是很不友好的,同时如果想要通过乱序执行提高指令的并行度,减小指令的粒度也是必须的步骤。因此需要把这些 <code>macro-ops</code>“解码”为 <code>micro-ops</code>。</p><p>当然具体的解码工作还在后面。从 <code>L1I$</code> 中取得指令数据后,首先要进入“预解码”阶段,在这里需要识别出在一个 <code>fetch window</code> 中取得的这 <code>16</code> 个 <code>Byte</code> 的数据里面有多少个指令。除此之外,还需要对一些特殊指令,比如分支跳转打上一些标记。</p><p>但因为指令变长的原因,<code>16</code> 个 <code>Byte</code> 往往并不对应固定的指令数,还有可能最后一个指令有一半在这 <code>16Byte</code> 里,另一半还在后面。另外就是 <code>pre-decode</code> 在一个 <code>Cycle</code> 最多识别出 <code>6</code> 个指令,或者这 <code>16Byte</code> 的数据都解析完。如果你这 <code>16</code> 个 <code>Byte</code> 里包含有 <code>7</code> 个指令,那么第一个 <code>Cycle</code> 识别出前 <code>6</code> 个之后,还需要第二个 <code>Cycle</code> 识别最后一个,然后才能再读取后面 <code>16Byte</code>。</p><p>那么 <code>pre-decode</code> 的效率就变成了 <code>3.5 instruction / cycle</code>(7 个指令用了 2 个 <code>Cycle</code>),比最理想的情况 <code>6 instruction / cycle</code> 降低了约 <code>42%</code>,现实就是这么残酷。</p><p>经过 <code>pre-decode</code> 之后,才真正从 <code>16Byte</code> 的二进制数据中识别出了指令,这些指令下一步要被塞到一个队列里(Instruction Queue)看看有没有什么能被优化的地方。一个最常见的优化方式就是 <code>macro-op fusion</code>,就是把两个相邻的,且能被一个指令表示的指令,用那一个指令替换掉。比如:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">cmp eax, [mem]</span><br><span class="line">jne loop</span><br></pre></td></tr></table></figure><p>直接换成</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">cmpjne eax, [mem], loop</span><br></pre></td></tr></table></figure><p>当然既然决定这么替换,新指令对流水线的开销肯定小于被替换的两个指令之和。如此可以减轻一部分后端执行单元的工作负荷和资源开销。</p><p>OK, 
在取得了指令数据,识别出了数据中的指令,并对指令做了一些优化合并之后,就该开始正儿八经地解码了。</p><h3 id="解码"><a href="#解码" class="headerlink" title="解码"></a>解码</h3><p><img src="/images/skylake_microarchitecture/figure4_skylake_decode.png" alt="Figure4 skylake_decode"></p><p>在拿到了经过“预解码”的 <code>macro-ops</code> 之后,开始正式进入解码过程。<code>macro-ops</code> 进入 <code>Instruction Decode</code> 组件解码,最终的输出为定长的 <code>micro-ops</code>。</p><p><code>Instruction Decode</code> 组件也有入口带宽限制,每个 <code>Cycle</code> 最多取 <code>3</code> 个 <code>unfused</code> 指令 + <code>2</code> 个 <code>fused</code> 指令,或者 <code>5</code> 个 <code>unfused</code> 指令(这里指 <code>macro ops</code>)。所以说 <code>fused</code> 多了也不好,一个 <code>Cycle</code> 最多取两个。同时如果开了 <code>Hyper Thread</code>,则两个 <code>Thread</code> 按 <code>Cycle</code> 交替使用 <code>Instruction Decode</code>。</p><p>在 <code>Instruction Decode</code> 组件里面的就是各个具体的 <code>Decoder</code>。<code>Decoder</code> 类型可以分为两类,一类是 <code>Simple Decoder</code>,一类是 <code>Complex Decoder</code>,感觉这句是在说废话。</p><p>顾名思义,<code>Simple Decoder</code> 处理的是解码之后的输出为1个 <code>fused-μop</code> 的指令;<code>Complex Decoder</code> 处理的是解码之后的输出为1个至4个 <code>fused-μop</code> 的指令。</p><h4 id="Fused-μop"><a href="#Fused-μop" class="headerlink" title="Fused-μop"></a>Fused-μop</h4><p>注意这里说的是 fused-<code>μop</code>,不是 fused-<code>macro</code>。在这里所有输出的 <code>μop</code> 都是做过 <code>fused</code> 处理的,目的是减少后续资源的占用。</p><blockquote><p>但这里有一个比较容易混淆的概念,就是 <code>fused-μop</code> 并非专指那些两个 <code>μop</code> 合并之后生成的 “合并μop”,而是指所有经过了 <code>μop fusion</code> 处理的 <code>μop</code>:有些指令可能两个 <code>μop</code> 变一个,但也有一些是一个还是一个,即便如此,输出的那一个也叫 <code>fused-μop</code>。</p></blockquote><p>为了进一步澄清这个概念,我们稍微需要涉及一点后端的概念。在前端这里,生成 <code>fused-μop</code> 的部分还属于 <code>CPU</code> 流水线中的 <code>μops fused domain</code>,而在后端需要将指令发射到执行单元去的时候,是不能执行 <code>fused μop</code> 的,所以 <code>fused μop</code> 还需要再分解为 <code>unfused μop</code> 才可以执行,这一部分就属于 <code>CPU</code> 流水线中的 <code>μops unfused domain</code>。</p><p>有了这些概念之后,我们可以看一下<a href="https://www.agner.org/optimize/instruction_tables.pdf">Instruction 
Tables.pdf</a> 这份文档。</p><p>在P244中有对 <code>skylake</code> 指令的说明,上面有对一些概念的解释,下面是一张表格:</p><p><img src="/images/skylake_microarchitecture/figure5_fjnbCV.png" alt="Figure5 fjnbCV"></p><p>在这张表格里是最常见的 <code>mov</code> 命令的说明。但因为操作数(operands)的不同在真正执行的时候也会有细节上的差别。第一行中的 <code>mov</code> 的两个操作数一个是 <code>register</code>,另外一个是一个立即数。在 <code>μops fused domain</code> 和 <code>μops unfused domain</code> 两栏中的计数都是1。</p><p>这种指令也算在 <code>μops fused domain</code> 经过了 <code>fusion</code> 处理。只不过其实前后没什么区别。</p><p>但如果我们看一下所有在 <code>μops unfused domain</code> 里计数为 <code>2</code> 的 <code>mov</code> 指令,它们在 <code>μops fused domain</code> 中的计数都是 <code>1</code>。这种 <code>mov</code> 指令就是真正做过 <code>2</code> 条 <code>μop</code> 合并的 <code>mov</code> 指令。</p><p>这份表格还有很多有趣的内容,推荐有时间的时候随手翻翻。</p><p><code>Skylake</code> 有 <code>4</code> 个 <code>Simple Decoder</code> 和 <code>1</code> 个 <code>Complex Decoder</code>。但从表里我们可以看到 <code>μops fused domain</code> 计数为1,也就是可以被 <code>Simple Decoder</code> 处理的指令在所有指令中所占的比例似乎并没有达到 <code>4/5</code> 那么高。</p><p>这里需要说明的是,输出大于 <code>4</code> 个 <code>μop</code> 的指令,既不由 <code>Simple Decoder</code> 处理,也不由 <code>Complex Decoder</code> 处理,而是直接去查 <code>Microcode Sequencer(MS)</code>,这是一块类似于缓存的 <code>ROM</code>。</p><p><code>Complex Decoder</code> 的数量始终为 <code>1</code> 的原因是 <code>Complex Decoder</code> 解码出来的 <code>μop</code> 都很难在执行时进行并行化处理,同时 <code>Complex Decoder</code> 每多解码一个 <code>μop</code>,就要有一个 <code>Simple Decoder</code> 处于不能工作的状态。</p><p>对 <code>CPU</code> 来说,它最希望的就是它要做的工作,它需要的数据,它要执行的指令,都已经在一块缓存里准备就绪了。这是 <code>CPU</code> 上班摸鱼的主要方法,但摸出了风格,摸出了水平。下一部分介绍一下在指令解码方面的缓存内容。</p><h4 id="MSROM"><a href="#MSROM" class="headerlink" title="MSROM"></a>MSROM</h4><p>MSROM(Micro-code sequencer ROM)就是在上文中提到的专门处理输出大于 <code>4</code> 个 <code>μop</code> 的指令的那块类似缓存的 <code>ROM</code>。很多文档里面也直接将其称为 <code>MS</code>,具体叫什么多需要结合上下文语境,知道是一回事就好了。</p><blockquote><p>我个人其实推荐读者在编写自己的文档时能注意这些名称上的“一致性”,同编写程序时给变量或函数命名时的一致性一样,这些看似没什么“技术含量”的工作,却能够极大地提高信息传达的效率,也就是提高文档或代码的可读性和可维护性。</p></blockquote><p>在 <code>Instruction 
Decoder</code> 收到一个输出要大于 <code>4</code> 个 <code>μop</code> 的指令之后,它就会将请求转发给 <code>MSROM</code>。<code>MSROM</code> 虽然是专门解码/查询大于 <code>4</code> 个 <code>μop</code> 的指令的组件,但它最大的传输效率是 <code>4 μop/Cycle</code>。同时在它工作的时候,所有的 <code>Instruction Decoder</code> 都要处于 <code>Disable</code> 的状态。因此虽然它的工作不太需要“动脑子”,但却仍要尽量避免。</p><h4 id="Stack-Engine"><a href="#Stack-Engine" class="headerlink" title="Stack Engine"></a>Stack Engine</h4><p><code>Stack Engine</code> 是专门处理栈操作指令的专用组件。类似 <code>PUSH</code>、<code>POP</code>、<code>CALL</code>、<code>RET</code> 这样的指令都算栈操作指令。<code>Stack Engine</code> 不算什么新鲜的黑科技,自从<code>Pentium M</code> 时代起就已经出现在 <code>Intel</code> 的 <code>CPU</code> 中。它的主要目的是避免栈操作指令对后端资源的占用,从而为其他计算任务提供出更多的资源。为此,<code>Stack Engine</code> 提供栈操作指令专用的加法器和其他所需的逻辑完成这一任务。</p><p><code>Stack Engine</code> 在 <code>Instruction Decoder</code> 之后,监控所有流出的 <code>μop</code>,并且从中提取出栈操作指令,进而直接执行,从而减轻栈操作指令对后端资源的占用。</p><p>这也可能是为什么有些时候 <code>inline</code> 的函数性能还不如不 <code>inline</code> 的原因吧:D(不负责任猜测</p><h4 id="Decoded-Stream-Buffer-DSB"><a href="#Decoded-Stream-Buffer-DSB" class="headerlink" title="Decoded Stream Buffer(DSB)"></a>Decoded Stream Buffer(DSB)</h4><p><img src="/images/skylake_microarchitecture/figure6_dsb_cache.png" alt="Figure6 dsb_cache"></p><h5 id="别名"><a href="#别名" class="headerlink" title="别名"></a>别名</h5><p>像 <code>DSB</code> 这种组件,首先要说明的就是它也叫 <code>μop cache</code> 或 <code>decoded icache</code>。</p><h5 id="作用"><a href="#作用" class="headerlink" title="作用"></a>作用</h5><p>无论是用 <code>Instruction Decoder</code> 还是用 <code>MSROM</code>,终究还是要做一次 “解码” 的操作。但同所有 <code>Cache</code> 加速的原理一样,如果能把解码之后的结果(μop)存下来,下次再出现的时候直接使用,那么就可以显著提高解码速度,<code>DSB</code> 就是这个目的。</p><h5 id="参数"><a href="#参数" class="headerlink" title="参数"></a>参数</h5><p><code>DSB</code> 的组织形式是 <code>32</code> 个 <code>set</code>,每个 <code>set</code> 有 <code>8</code> 条 <code>cache line</code>,每条 <code>cache line</code> 最多保存 <code>6</code> 个 <code>μop</code>。</p><p>每次 <code>cache hit</code> 可以传输最大 <code>6</code> 个 <code>μop/Cycle</code>,这 
<code>6</code> 个 <code>μop</code> 最大可以对应到 <code>64 byte</code> 的前端 <code>fetch window size</code>,并且完全不需要任何 <code>Instruction decoder</code> 参与,也没有繁琐的解码过程。在实际应用中,<code>DSB</code> 的 <code>cache hit rate</code> 在 <code>80%</code> 或以上。</p><h5 id="与icache的关系"><a href="#与icache的关系" class="headerlink" title="与icache的关系"></a>与icache的关系</h5><p><code>CPU</code> 的 <code>icache</code> 一般存储的是最原始的从内存里读进来的程序的汇编指令(macro instruction)。而 <code>DSB</code> 或者 <code>μop cache</code> 虽然也是存 <code>instruction</code> 的 <code>cache</code>,但如前所述,它存的是已经解码好的 <code>μop</code>,所以这玩意有时候又被称为 “decoded icache”。当然了,这些 <code>μop</code> 都是 <code>CPU</code> 的 <code>icache</code> 中的指令解码之后得到的。</p><h5 id="与MSROM的关系"><a href="#与MSROM的关系" class="headerlink" title="与MSROM的关系"></a>与MSROM的关系</h5><p>输出大于 <code>4</code> 个 <code>μop</code> 的指令依然只能由 <code>MSROM</code> 解码。<code>DSB</code> 保存的也是那些小于等于 <code>4</code> 个 <code>μop</code> 指令的 <code>μop</code>。</p><h4 id="MITE-Path和DSB-Path"><a href="#MITE-Path和DSB-Path" class="headerlink" title="MITE Path和DSB Path"></a>MITE Path和DSB Path</h4><p>这两个概念主要用于区分最终需要执行的 <code>μop</code> 是通过什么方式来的。在上一节 <code>Decoded Stream Buffer</code> 之前的所有内容,都算是 <code>MITE Path</code>。<code>MITE</code> 是 Micro-instruction Translation Engine 的缩写,同时它在有些文档里也被称作 <code>legacy decode pipeline</code> 或 <code>legacy path</code>。这条线路上过来的 <code>μop</code> 都是从 <code>macro instruction</code> 一步一步解码来的。</p><p><code>DSB path</code> 就是直接从 <code>DSB</code> 那条道上过来的 <code>μop</code>。当 <code>CPU</code> 需要在 <code>MITE Path</code>、<code>DSB Path</code> 以及 <code>MSROM</code> 之间切换(switch)以便取得所需的 <code>μop</code> 时,需要花费一定的 <code>CPU Cycle</code> 完成这一工作。</p><h4 id="Instruction-Decode-Queue-IDQ"><a href="#Instruction-Decode-Queue-IDQ" class="headerlink" title="Instruction Decode Queue(IDQ)"></a>Instruction Decode Queue(IDQ)</h4><p><code>IDQ</code> 也叫 <code>Allocation Queue(AQ)</code>,也有时候会写成是 <code>Decode Queue</code>。解码完成的 <code>μop</code> 在进入后端之前需要先在 <code>IDQ</code> 中做一下缓冲。作为一个 
”缓冲队列”,主要作用是将前端解码可能引入的流水线”气泡(bubbles)“消化掉,为后端提供稳定的 <code>μop</code> 供应(目标是 <code>6μop/Cycle</code>)。</p><p><code>Skylake</code> 的 <code>IDQ</code> 最大可以存放 <code>64</code> 个 <code>μop/thread</code>,比 <code>Broadwell</code> 的 <code>28</code> 个多一倍还多。这些 <code>μop</code> 在 <code>IDQ</code> 中除了排一下队之外,还会被 <code>Loop Stream Detector(LSD)</code>扫描一遍,用来发现这些 <code>μop</code> 是不是来自于一个循环。</p><h5 id="Loop-Stream-Detector-LSD"><a href="#Loop-Stream-Detector-LSD" class="headerlink" title="Loop Stream Detector(LSD)"></a>Loop Stream Detector(LSD)</h5><p>如果在 <code>IDQ</code> 中能被发现存在循环体 <code>μop</code>,那么在下一次循环的时候,就不需要去重新解码这些循环体生成的 <code>μop</code>,而是直接由 <code>LSD</code> 提供 <code>μops</code>。这便可以省去指令 <code>fetch</code>、解码、读 <code>μop cache</code>、分支预测等所有之前的步骤,并且能进一步减少缓存占用。当然,当 <code>LSD</code> 起作用的时候,整个前端都是处于 <code>Disabled</code> 的状态。</p><p><code>Skylake</code> 的 <code>LSD</code> 需要在 <code>IDQ</code> 的长度(64μop)内发现循环,所以,循环体还是尽量紧凑一点吧:D</p><h2 id="后端-Backend"><a href="#后端-Backend" class="headerlink" title="后端 (Backend)"></a>后端 (Backend)</h2><p><img src="/images/skylake_microarchitecture/figure7_backend.png" alt="Figure7 backend"></p><p>还是首先介绍一下这个部分是否有别的名字。在有些文档里后端又直接被称为 <code>Execution Engine</code>。后端的主要任务当然就是执行前端解码出来的这些 <code>μop</code>。但后端和前端的设计都在围绕着“如何提高指令的并行性”来设计和优化。</p><p>在 <code>Skylake</code> 架构中,<code>IDQ</code> 以最大 <code>6μop/Cycle</code> 的速度将 <code>μop</code> 送入 <code>Re-order Buffer</code>,后端的处理在 <code>Re-order Buffer</code> 中正式开始。</p><h3 id="Out-of-order-OOO-Execution-Engine"><a href="#Out-of-order-OOO-Execution-Engine" class="headerlink" title="Out-of-order(OOO)Execution/Engine"></a>Out-of-order(OOO)Execution/Engine</h3><p>先讲一下OOO(乱序)以便对后端的执行有一个整体的把握。</p><p>我们的程序虽然是按顺序编写的指令,但CPU并不(一定)会按相同的方式执行。为了提升整体效率,CPU采用的是乱序执行的方式。从一个“窗口”范围内选取可以执行的指令执行,并且这些操作对用户透明,在程序编写者的角度看来仍是在按他编写的指令顺序执行。</p><blockquote><p>从根本上来讲,OOO是用”数据流(Data flow)”的角度来看待程序,而非程序员的“指令流”视角。</p></blockquote><p>指令的目的就是以一种特定的方式操纵存在于内存/缓存中的数据,引起数据的变化,其实这就是我们通常所说的“写程序”。只不过这是人类习惯的逻辑方式,在机器看来并不一定高效。</p><p><img 
src="/images/skylake_microarchitecture/figure8_execution_engine_example.png" alt="Figure8 execution_engine_example"></p><p>在上图例子中,需要执行左上角的六个计算指令。<code>In-order execution</code> 是假设完全按照程序顺序执行这六个指令的耗时。下面的 <code>In-order(superscalar3)</code> 是合并了一些可以并行执行的指令的耗时。</p><p>因为指令(2)中的 <code>r1</code> 要依赖指令(1)的结果,所以指令(2)只能等(1)执行结束再执行。而本来可以并行执行的(3)(4)也因为要保证 <code>In-order</code> 顺序而只能一同放在(1)之后执行。</p><p>但从左下角的 <code>Data flow</code> 的角度来看,其实我们并不需要按照指令顺序运行程序:指令(2)完全可以放在后面执行,并重新安排并行计算顺序。这样就又节省了执行所需的时间。</p><p>OOO选择可执行指令的依据是:</p><ul><li>不依赖未执行指令操纵的数据</li><li>有可用的执行资源</li></ul><p>为了尽可能让进入后端的指令满足这两个条件,OOO采用了一系列的组件和技术。在后面的章节中将会进行介绍。</p><p><img src="/images/skylake_microarchitecture/figure9_out_of_order.png" alt="Figure9 out_of_order"></p><p>上图是一个OOO的概念示意图。前端输出给后端的都是顺序指令流,后端在一个窗口范围中选择可以执行的指令进行乱序执行。这里面没有强调的是,最终指令退出(retire)的顺序仍是按照程序的顺序。</p><h3 id="OOO-Once-More"><a href="#OOO-Once-More" class="headerlink" title="OOO Once More"></a>OOO Once More</h3><p>这里对 OOO(Out-Of-Order) 乱序执行再简单讲两句。深入乱序执行的难点不在于“不按指令顺序执行”,而是如何做到“按指令顺序退出”。</p><p>这里面的关键是,所有执行过的指令都先被“缓存”起来,并不把执行之后的结果真正写到寄存器或者内存里。从用户角度看,这个指令其实并没有被“执行”,因为它没有引起任何数据方面的变化。等到它可以确定是需要被执行的指令,并且它前面的指令都已经把结果写入(commit)之后,它再去 <code>Commit</code>。这样从用户角度看来,程序就是按照指令顺序执行了。</p><blockquote><p>在很多文档里,<code>Commit</code> 和 <code>Retire</code> 是两个可以互换(interchangeable)的词。</p></blockquote><p>说实话,研究这块东西,最烦的就是同一个概念有N个名字。</p><p><img src="/images/skylake_microarchitecture/figure10_out_of_order_once_more.png" alt="Figure10 out_of_order_once_more"></p><p>再来总结一下 <code>OOO</code> 的 <code>Big Picture</code> :</p><ul><li>左边 <code>Fetch&Decode</code> 是之前讲的前端(Front-End)相关的内容。此时指令还是有序的。</li><li><code>Decode</code> 成微指令(μop)之后,这些微指令进入一个指令池(Instruction Pool),这里面能够被执行的指令,就直接被执行。“能够被执行”是指满足以下两个条件:<ul><li>已有指令需要的数据</li><li>执行单元有空闲</li></ul></li><li>当指令被执行之后<ul><li>通知所有对该指令有依赖的指令(们),它们所需要的数据已经准备好。</li><li>注意这里说的是“执行”,不是上面说的 “Retire” 或 “Commit”</li><li>为实现这一功能,CPU 中还必须要对微指令的操作数(数据)有 Bookkeeping 的能力</li></ul></li><li>Commit 指令<ul><li>只有当前指令的前序(指令顺序)指令都 Commit 之后,才能 Commit 
当前指令</li><li>Commit 也可以并行进行,前提是满足上面一条的条件,同时并行 Commit 的指令间没有依赖</li></ul></li></ul><h3 id="False-Dependency"><a href="#False-Dependency" class="headerlink" title="False Dependency"></a>False Dependency</h3><p>乱序执行的一大前置条件就是指令数据间没有相互依赖。下面就着重分析一下依赖。</p><p>用下面的指令过程作一个示例:</p><p><img src="/images/skylake_microarchitecture/figure11_out_of_order_false_dependency.png" alt="Figure11 out_of_order_false_dependency"></p><p>简单分析一下:</p><ul><li>Read After Write(RAW) 型依赖<br> (2)指令需要读取 <code>r1</code> 的值,而 <code>r1</code> 的值需要(1)指令执行之后给出。所以(2)指令对(1)指令有 RAW 依赖。RAW 依赖也被称作 <code>true dependency</code> 或者 <code>flow dependency</code>。</li><li>Write After Read(WAR) 型依赖<br> (3)指令需要更新 <code>r8</code> 的值,但在此之前(2)指令需要读取 <code>r8</code> 的值参与计算。所以(3)指令对(2)指令有 WAR 依赖。WAR 依赖也被称作 <code>anti-dependencies</code>。</li><li>Write After Write(WAW) 型依赖<br> (4)指令需要在(2)指令写入 <code>r3</code> 之后再写入 <code>r3</code>。所以(4)指令对(2)指令有 WAW 依赖。WAW 依赖也可以被叫做 <code>output dependencies</code>。</li></ul><p>按照以上的分析,这几条指令几乎没有可以并行执行的余地。不过,我想你也已经看出了一些“转机”:针对 WAR 和 WAW,是可以被 Register Rename 这种方法破解的。这两种依赖都被称为 <code>false dependency</code>。</p><h3 id="Register-Rename"><a href="#Register-Rename" class="headerlink" title="Register Rename"></a>Register Rename</h3><p>当需要写入 <code>r1</code> 的指令在读取 <code>r1</code> 的指令之后,写入的 <code>r1</code> 的新值可以首先保存在另外一个寄存器 <code>r1’</code>里。读取 <code>r1</code> 的指令仍然读取原 <code>r1</code> 寄存器中的值,这样存在 WAR 依赖的指令就可以并行执行。当所有需要读取 <code>r1</code> 原值的指令都执行完毕,<code>r1</code> 就可以用新值代替。</p><blockquote><p>Register Rename 其实就是利用 CPU 提供的大量的物理寄存器,为寄存器制作“分身”(或者说 Alias),为提高程序并行性提供便利。</p></blockquote><p>上面的例子里,<code>r1</code> 是 <code>architectural register</code>,<code>r1’</code> 是内部的 <code>physical register</code>。Register Rename 就是在制作这两种寄存器间的映射关系。当然,这一切对用户来说都是透明的。</p><p>如前所述,<code>physical register</code> 的数量远多于 <code>architectural register</code> 的数量。其实 <code>architectural register</code> 仅仅是一个“代号”,并不是真正存放数据的位置。用这种方式,可以消除 <code>WAW</code> 和 <code>WAR</code> 这两种数据依赖进而增加程序整体的并行性。</p><p>那么到底怎么操作呢?其实本质上也就是建立一个“映射表”,一个从“代号”到存储位置的映射表。</p><p>E.g.</p><p>现有5个 
<code>architectural register</code> 寄存器:r1, r2, r3, r4, r5;9个 <code>physical register</code> 寄存器:p1, p2, …, p9。</p><p>指令:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">Add r1, r2, r3 ;r1 = r2 + r3</span><br><span class="line">Sub r2, r1, r2 ;r2 = r1 - r2</span><br><span class="line">Add r1, r4, r5 ;r1 = r4 + r5</span><br></pre></td></tr></table></figure><p>最开始是一个简单的映射关系:</p><table><thead><tr><th align="left">r1</th><th align="left">r2</th><th align="left">r3</th><th align="left">r4</th><th align="left">r5</th></tr></thead><tbody><tr><td align="left">p1</td><td align="left">p2</td><td align="left">p3</td><td align="left">p4</td><td align="left">p5</td></tr></tbody></table><p>在这张表里面还有一个 <code>FreeList</code>,用来保存还没有被占用的 <code>physical register</code>。</p><table><thead><tr><th align="left"></th><th align="left"></th><th align="left"></th><th align="left"></th></tr></thead><tbody><tr><td align="left">p6</td><td align="left">p7</td><td align="left">p8</td><td align="left">p9</td></tr></tbody></table><p>OK,首先考虑不使用 <code>Register Rename</code> 的情景。第二条指令是必须等待第一条指令执行完成之后才能执行,因为 <code>r1</code> 有 <code>RAW</code> 型依赖。这个其实 <code>Register Rename</code> 也没有办法。但是第三条指令也不能在第二条指令之前执行,因为写入 <code>r1</code> 可能会影响第二条指令的结果(r2)。</p><p>为了增加指令的并行性,让第三条指令能与第一条指令并行,同时消除 <code>WAW</code> 和 <code>WAR</code> 型依赖,看一下 <code>Register Rename</code> 是怎么做的。</p><p>第一条指令就用原始对应的寄存器,此时还没有 <code>Register Rename</code>。对应的“映射表”<code>Rename Table</code>如下:</p><table><thead><tr><th align="left"></th><th align="left"></th></tr></thead><tbody><tr><td align="left">r1</td><td align="left">p1</td></tr><tr><td align="left">r2</td><td align="left">p2</td></tr><tr><td align="left">r3</td><td align="left">p3</td></tr><tr><td align="left">r4</td><td align="left">p4</td></tr><tr><td align="left">r5</td><td 
align="left">p5</td></tr></tbody></table><p>第二条指令中,<code>r2</code> 针对第一条指令有 <code>WAR</code> 型依赖,可以将写入 <code>r2</code> 的结果放在另外一个寄存器里。从 <code>FreeList</code> 中选取下一个空闲的 <code>physical register</code>,即 <code>p6</code>。</p><p>所以这条指令实际上就变成了 <code>Sub p6, p1, p2 ; p6 = p1 - p2</code>。</p><p><code>Rename Table</code> 如下:</p><table><thead><tr><th align="left"></th><th align="left"></th></tr></thead><tbody><tr><td align="left">r1</td><td align="left">p1</td></tr><tr><td align="left">r2</td><td align="left"><strong>p6</strong></td></tr><tr><td align="left">r3</td><td align="left">p3</td></tr><tr><td align="left">r4</td><td align="left">p4</td></tr><tr><td align="left">r5</td><td align="left">p5</td></tr></tbody></table><p>即告知后续指令 <code>r2</code> 最终的结果保存在 <code>p6</code> 里面。</p><p>第三条指令,<code>r1</code> 针对第一条指令有 <code>WAW</code> 型依赖,可以将写入 <code>r1</code> 的结果放到另外一个寄存器里。从 <code>FreeList</code> 中选取下一个空闲的 <code>physical register</code>,即 <code>p7</code>。</p><p>所以这条指令实际上就变成了 <code>Add p7, p4, p5 ; p7 = p4 + p5</code></p><p><code>Rename Table</code> 如下:</p><table><thead><tr><th align="left"></th><th align="left"></th></tr></thead><tbody><tr><td align="left">r1</td><td align="left"><strong>p7</strong></td></tr><tr><td align="left">r2</td><td align="left">p6</td></tr><tr><td align="left">r3</td><td align="left">p3</td></tr><tr><td align="left">r4</td><td align="left">p4</td></tr><tr><td align="left">r5</td><td align="left">p5</td></tr></tbody></table><p>即告知后续指令 <code>r1</code> 最终的结果保存在 <code>p7</code> 里面。</p><blockquote><p>所有指令对 <code>architectural register</code> 的读取都先通过 <code>Rename Table</code> 获得确切地址。</p></blockquote><p>回到最初提到的问题,因为第一条指令和第三条指令实际写入的寄存器(分别是 <code>p1</code> 和 <code>p7</code>)并不冲突,且第二条指令仍从 <code>p1</code> 中读取 <code>r1</code> 的旧值,因此第一条和第三条指令可以并行执行。</p><p>现代CPU的 <code>Rename Table</code> 一般是在 <code>ROB</code> 里的 <code>RAT(Register Alias Table)</code>。同时 <code>physical register</code> 也会被 <code>ROB entry</code> 取代。</p><p>其实现在对 <code>Register Rename 
</code>的理解更多的是建立一个概念,在整个微架构中,这一步不是一个孤立的组件,所有组件之间都需要紧密配合。</p><p>后续会对后端的执行进行示例介绍。</p><h4 id="示例"><a href="#示例" class="headerlink" title="示例"></a>示例</h4><p>一个示例介绍Reorder Buffer(ROB)和Register Alias Table(RAT)和Reservation Station(RS)</p><p>理解乱序执行(Out-of-Order)的核心其实就是把ROB,RAT和RS这三个组件搞透。</p><p>如果要单独讲,很容易成为一大锅概念和专有名词的杂烩。所以这次把这几个紧密相关的组件放到一起,先用例子说明,仅描述自然行为,同时也避免出现太多概念。</p><p><img src="/images/skylake_microarchitecture/figure12_AW0jkn.png" alt="Figure12 AW0jkn"></p><p>上图是在一个起始时刻 <code>CYCLE 0</code> 时CPU后端各组件的状态。它即将执行 <code>Instructions</code> 表格里的6条指令。不同种类指令所需要消耗的执行时间如 <code>Cycle Consumption</code> 所示。</p><p><code>ARF</code> 是 <code>Architectural Register File</code>,里面保存有当前时刻 <code>architectural register</code> 中的值;<code>RAT</code> 就是前面介绍过的 <code>Register Alias Table</code>,主要用作对 <code>architectural register</code> 的Rename。</p><p><code>Reservation Station(RS)</code> 根据所连接的执行不同类型指令的Port而分成两类,一类保存 <code>ADD/SUB</code> 相关的指令,一类保存 <code>MUL/DIV</code> 相关的指令。里面的指令在两个Value都 <code>Ready</code> 的时候将发送到执行单元执行。</p><p><code>Re Order Buffer</code> 旁边的表格是这6条指令从 <code>Issue</code> 到 <code>Execute</code>, <code>Write</code> 最后再到 <code>Commit</code> 这几个状态的 <code>cycle</code> 时刻表。</p><p>OK,那么下面进入第一个cycle。</p><p><img src="/images/skylake_microarchitecture/figure13_AW0HOg.png" alt="Figure13 AW0HOg"></p><p>第一条指令 <code>DIV R2, R3, R4</code> 按照先进先出的原则首先进入 <code>ROB1</code>。</p><p>在ROB中,<code>Dst</code> 填该指令的目的 <code>architectural register</code>,也就是 R2;<code>Value</code> 是该指令执行完计算出来的结果,显然现在还不得而知,表示是否执行完的<code>Done</code> 标志位也是N的状态。</p><p>同时针对 <code>DIV</code> 指令的RS中也有空闲资源,因此该指令也会在同一cycle进入RS。目的 tag<code>D-tag</code> 填写指令对应的 <code>ROB</code> 条目(ROB1);<code>Tag1</code> 和 <code>Tag2</code> 通过查阅 <code>RAT</code> 中 <code>R3</code> 和 <code>R4</code> 的状态,如果有 Rename 的情况,则填写对应的 ROB 条目,如果没有,则直接读取 <code>ARF</code> 中的值,作为 <code>Value</code> 填入。</p><p>因此,<code>D-tag</code> 是 <code>ROB1</code>,<code>Tag1</code> 和 <code>Tag2</code> 因为 <code>R3</code> 和 <code>R4</code> 没有 Rename 所以不填,直接读取 <code>ARF</code> 
中的值,20 和 5,放入 <code>Value1</code> 和 <code>Value2</code> 中。</p><p>之后,在 <code>RAT</code> 中,R2 被 Rename 成了 <code>ROB1</code>,即表示后续指令欲读取 R2 的值的话,都应该去读取 <code>ROB1</code> 中 <code>value</code> 的值。</p><p>此时该 <code>DIV</code> 指令所需要的操作数都已经 <code>Ready</code>,那么就可以在下一个 cycle 时从 RS 中 <code>发射</code> 到执行单元去执行。</p><p>下面进入第二个 cycle。</p><p><img src="/images/skylake_microarchitecture/figure14_AW0qmQ.png" alt="Figure14 AW0qmQ"></p><p>在第二个 cycle 中,第一条 <code>DIV</code> 指令开始执行,根据 <code>DIV</code> 的执行周期,那么我们知道它将在第 <code>2 + 10 = 12</code> 个 cycle 中执行完成。同时 ROB 中还有空闲,我们可以 <code>issue</code> 第二条 <code>MUL</code> 指令。</p><p>在 RS 中,上一条 <code>DIV</code> 指令已经清出,也有空闲资源,所以 <code>MUL</code> 指令也可以进入到 RS 中。另外几个选项也如 <code>DIV</code> 指令的判断方式,因此 <code>D-tag</code> 为 <code>ROB2</code>,两个 <code>value</code> 为 4 和 2。此时 <code>MUL</code> 指令也已经 <code>Ready</code>,可以在下一 个cycle 开始执行。</p><p>同时 <code>RAT</code> 中将 <code>R1</code> rename 到 <code>ROB2</code>。因为后续最新的 <code>R1</code> 的值将等于 <code>ROB2</code> 中的 <code>value</code>。</p><p><img src="/images/skylake_microarchitecture/figure15_AW0Lwj.png" alt="Figure15 AW0Lwj"></p><p>在第三个 cycle 中,<code>MUL</code> 指令开始执行,根据 <code>MUL</code> 的执行周期,它将在第 <code>3 + 3 = 6</code> 个 cycle 中执行完成。因 ROB 中还有空闲,此时可以 <code>issue</code> 第三条 <code>ADD</code> 指令。</p><p>RS 里面,<code>ADD</code> 指令需要放到存放 <code>ADD/SUB</code> 指令的 RS 中,除此之外,各字段的填写方式与之前的指令没有区别。<code>R7</code> 和 <code>R8</code> 也可以直接从 <code>ARF</code> 中获取数值,因此该 <code>ADD</code> 指令也已经 <code>Ready</code>,可以在下一个 cycle 开始执行。</p><p>之后,<code>RAT</code> 中将 <code>R3</code> rename 到 <code>ROB3</code>。</p><p><img src="/images/skylake_microarchitecture/figure16_AW076S.png" alt="Figure16 AW076S"></p><p>那么在第四个 cycle 中,第四条 <code>MUL</code> 指令可以进入 <code>ROB</code> 和 <code>RS</code> 之中。在 RS 中,<code>D-tag</code> 填入该指令对应的 <code>ROB</code> 条目,即 <code>ROB4</code>。而它的第一个操作数 <code>R1</code> 通过<code>RAT</code> 读取(参见 cycle 3 中的 <code>RAT</code> 情况。),rename 到了 <code>ROB2</code>,因此 <code>tag1</code> 需要填 <code>ROB2</code>。<code>Tag2</code> 同理,填 
<code>ROB1</code>。</p><p>之后,<code>RAT</code> 中的 <code>R1</code> 需要 rename 到 <code>ROB4</code>,以保持最新的状态。</p><p>RS 中,因为该条指令两个操作数的 <code>value</code> 还没有 Ready,不能在下一个 cycle 开始执行,因此还暂存在 RS 之中。</p><p><img src="/images/skylake_microarchitecture/figure17_AW0OTs.png" alt="Figure17 AW0OTs"></p><p>在第五个 cycle 中,拆成两个阶段来看。第一个阶段,也即 <code>cycle 5'</code>,第五条 <code>SUB</code> 指令进入 <code>ROB</code> 和 <code>RS</code>,各字段的填写方式与之前相同。</p><p><img src="/images/skylake_microarchitecture/figure18_AW0vYq.png" alt="Figure18 AW0vYq"></p><p>在 cycle 5 的第二个阶段中,注意到指令时刻表中,第三条指令将在 cycle 5 完成执行,并进入 <code>Write</code> 阶段。</p><p>于是此时第三条指令在 <code>ROB</code> 中对应的 <code>ROB3</code> 的 <code>Value</code> 中将填入该指令执行的结果,也就是 3,同时设置标志位 <code>DONE</code> 为 Y。</p><p>在执行完成之后,在同一个 cycle 中,CPU 还将进行一个操作,就是将该结果广播给 RS 中现存的指令,如果有等待 <code>ROB3</code> 执行结果的指令,将接收该结果并更新状态。</p><p>在当前 <code>RS(Adder)</code> 中,<code>SUB</code> 指令正在等待 <code>ROB3</code> 的结果(参见 <code>cycle 5'</code>),于是其不再等待 <code>Tag1</code>,并在 <code>Value1</code> 中填入结果 3。此时该 <code>SUB</code> 指令也已经 Ready,并将在下一个 cycle 中执行,根据其执行开销,将在第 <code>6 + 1 = 7</code> 个 cycle 时执行完成。</p><p><img src="/images/skylake_microarchitecture/figure19_AW0xf0.png" alt="Figure19 AW0xf0"></p><p>第六个 cycle 仍然分为两个阶段。第一个阶段 <code>cycle 6'</code> 里,第六条 <code>ADD</code> 指令可以进入 ROB 以及 RS。</p><p>在 RS 中,<code>D-tag</code> 填写该指令所在的 ROB 条目 <code>ROB6</code>,两个操作数通过读取 <code>RAT</code> 获得,<code>R4</code> 和 <code>R2</code> 对应的分别是 <code>ROB5</code> 和 <code>ROB1</code>。</p><p><code>RAT</code> 中 <code>R1</code> 所对应的最新值修改为 <code>ROB6</code>。</p><p><img src="/images/skylake_microarchitecture/figure20_AWBCXF.png" alt="Figure20 AWBCXF"></p><p>在第二个阶段,注意到此时第二条指令也在 <code>cycle 6</code> 执行完毕,因此它将执行的结果(8)写入到其所在的 ROB 条目 <code>ROB2</code>,并在同时将执行的结果广播给 RS 中的指令。</p><p>此时 RS 中的 <code>MUL</code> 指令正在等待 <code>ROB2</code> 的值,于是在其对应的 <code>Value1</code> 中写入计算的结果(8)。</p><p><img src="/images/skylake_microarchitecture/figure21_AWBplT.png" alt="Figure21 AWBplT"></p><p>在第七个周期,注意到第五条指令也该执行完成,其执行所得到的结果(-1),也需要写回到 
<code>ROB5</code> 并广播给 RS 中的指令。但此时没有等待该值的指令。所以对其他状态暂时没有影响。</p><p>但如果此时有新的指令需要 <code>R4</code>,<code>ROB5</code> 此时的值可以直接传递给该指令。</p><p><img src="/images/skylake_microarchitecture/figure22_AWBSpV.png" alt="Figure22 AWBSpV"></p><p>在第 7 个指令之后,CPU 进入一个尴尬的时期。没有新的指令执行完毕,RS 中的指令也没有 <code>Ready</code> 的,观察一下时刻表,下一个时刻有新的指令执行完毕是 <code>cycle 12</code> 的事。</p><p>在 <code>cycle 12</code> 中第一条 <code>DIV</code> 指令执行完毕,结果写入 <code>ROB1</code>,广播结果给 RS 中的指令,正好两个都需要 <code>ROB1</code>,并且拿到这个结果之后都进入 <code>Ready</code> 状态,可以在下一个 cycle 执行。</p><p>更新一下第四条和第六条指令的时刻表,执行都是在第13个 cycle,完成将分别在第 16 和 14 个cycle。</p><p>此时还发生了一件事,就是 ROB 中的第一条指令的 <code>DONE</code> 标志位标成了 <code>Y</code>。ROB 之前我们介绍是一个先入先出的 FIFO 结构,只有第一条指令完成之后,才能按顺序开始 commit。</p><p><img src="/images/skylake_microarchitecture/figure23_AWB96U.png" alt="Figure23 AWB96U"></p><p>所以在 <code>cycle 13</code>,第一条指令历史性的 commit 了。Commit 的意思就是把结果写入到 <code>ARF</code>,因此 <code>R2</code> 在 ARF 中改为了4。同时删除该 ROB 条目,为后续的指令腾出资源。当然 <code>RAT</code> 中也不再需要 rename 到 <code>ROB1</code>,最新的值已经在 <code>ARF</code> 中。</p><p><img src="/images/skylake_microarchitecture/figure24_AWBim4.png" alt="Figure24 AWBim4"></p><p>在 <code>cycle 14</code> 中,ROB 中的当前在队列头部的指令,也就是第二条指令也可以 commit 了,按之前的操作,<code>R1 </code>的值也改成了最新的值(8)。</p><p>同时,第六条指令也执行完毕,计算的结果写入 <code>ROB6</code>。当然这条指令还不能 commit,因为 commit 需要按指令顺序。</p><p><img src="/images/skylake_microarchitecture/figure25_AWBF0J.png" alt="Figure25 AWBF0J"></p><p>第15个cycle,除了commit第三条指令之外没什么好做的。和以前的操作类似。</p><p><img src="/images/skylake_microarchitecture/figure26_AWBVt1.png" alt="Figure26 AWBVt1"></p><p>第 16 个指令,第 4 条指令执行完毕,结果写入 <code>ROB4</code>,同时它也是当前 ROB 中在队列头部的指令,可以在下一个 cycle commit。</p><p><img src="/images/skylake_microarchitecture/figure27_AWBEkR.png" alt="Figure27 AWBEkR"></p><p>那就commit呗。</p><p><img src="/images/skylake_microarchitecture/figure28_AWBk79.png" alt="Figure28 AWBk79"></p><p>剩下的第 18,19 cycle 想必你也知道该干什么了:把最后的两条指令 commit 掉。</p><p><img src="/images/skylake_microarchitecture/figure29_AWBZfx.png" alt="Figure29 
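AWBZfx"></p><p>OK, once every row of the instruction timetable is filled in, these 6 instructions have officially finished executing.</p><h4 id="关于这几个组件"><a href="#关于这几个组件" class="headerlink" title="关于这几个组件"></a>About these components</h4><p>The whole point of the example is to explain the structure, properties, and function of three components: the <code>RAT</code>, the <code>ROB</code>, and the <code>RS</code>. Once the example is familiar, you can go back to the traditional "textbooks" and check your understanding against their long passages of prose.</p><p>The example still leaves some things out, such as branches, and in particular what happens after a branch misprediction. It is nonetheless enough to show what out-of-order execution with in-order commit actually means.</p><p>That largely covers the front end and back end of the CPU microarchitecture. The final part, which follows, introduces the components involved in memory operations.</p><h2 id="Load-and-Store"><a href="#Load-and-Store" class="headerlink" title="Load and Store"></a>Load and Store</h2><p>This chapter covers <code>load</code> and <code>store</code>.</p><p>Although we treat <code>load</code> and <code>store</code> as special instructions set apart from the other categories, all instructions and pipelines are designed with a single goal: raising instruction-level parallelism by eliminating dependencies. In other words:</p><ol><li>Control dependencies are eliminated through <code>branch prediction</code></li><li><code>false dependencies</code> are eliminated through <code>register renaming</code></li></ol><blockquote><p>Note that <code>register renaming</code> applies to registers, not to main memory.</p></blockquote><p>Do memory operations have dependencies too, and if so, how should they be resolved?</p><h3 id="read-write-与-load-store-的不同"><a href="#read-write-与-load-store-的不同" class="headerlink" title="read/write 与 load/store 的不同"></a>read/write versus load/store</h3><p><code>load</code> and <code>store</code> are the two memory instructions, whereas <code>read</code> and <code>write</code> are the actions that actually touch memory. In most contexts the two pairs of terms are interchangeable, but to avoid ambiguity in what follows we distinguish them with these definitions:</p><ol><li><code>store</code> is a memory instruction; the memory write happens only after the <code>store</code> instruction commits</li><li><code>load</code> is also a memory instruction, but the memory read may happen before or after the <code>load</code> instruction commits. The reason is that a <code>load</code> can reuse the result of an earlier store to the same address.</li></ol><h3 id="寄存器与主存"><a href="#寄存器与主存" class="headerlink" title="寄存器与主存"></a>Registers and main memory</h3><p>Registers and memory are subject to the same kinds of dependencies, and <code>false dependencies</code> can be eliminated during out-of-order execution.</p><p>The difference is that the address of a memory operation is known only at run time, which makes dependencies harder to detect. For example:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span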
class="line">Load r3 = 0[R6]</span><br><span class="line">Add r7 = r3 + r9</span><br><span class="line">Store r4 -> 0[r7]</span><br><span class="line">Sub r1 = r1 - r2</span><br><span class="line">Load r8 = 0[r1]</span><br></pre></td></tr></table></figure><p>The third instruction stores the value of <code>r4</code> to the memory address held in <code>r7</code>, and the fifth instruction then reads from the memory address held in <code>r1</code>. Assume the cache does not hit. If <code>r7</code> and <code>r1</code> hold different values, nothing goes wrong; but if <code>r7 == r1</code>, there is a problem: because the third instruction has not committed yet, the fifth instruction would read an incorrect value. In other words, this is a <code>RAW true dependency</code>.</p><p>The underlying cause is <code>memory aliasing</code>: when two pointers refer to the same memory address, a <code>true dependency</code> arises.</p><h3 id="load-与-store-示例"><a href="#load-与-store-示例" class="headerlink" title="load 与 store 示例"></a>A <code>load</code> and <code>store</code> example</h3><p>Assume the following initial state:</p><p><img src="/images/skylake_microarchitecture/figure30_ELdlTO.png" alt="Figure30 ELdlTO"></p><p>At the start, several <code>load</code> and <code>store</code> instructions have been sent to the <code>Load-Store Queue (LSQ)</code>, and the cache holds 4 <code>address-value</code> pairs.</p><p><img src="/images/skylake_microarchitecture/figure31_ELdQ0K.png" alt="Figure31 ELdQ0K"></p><p>Execution starts with the first instruction in the <code>LSQ</code>: <code>load from addr 0x3290</code>.</p><p>The first step is to check whether an earlier <code>store</code> instruction wrote to the same address. Since this is the first instruction, there is no earlier <code>store</code>.</p><p>The next step is to look for a match in the cache. In our scenario the lookup hits, and <code>42</code> is passed into the <code>Value</code> slot in the <code>LSQ</code>.</p><p><img src="/images/skylake_microarchitecture/figure32_ELdMm6.png" alt="Figure32 ELdMm6"></p><p>Next comes a <code>store</code> instruction. Assume the value it computes is <code>25</code>; that value is placed in the <code>Value</code> column of the <code>LSQ</code>.</p><blockquote><p>This is because main memory is written only at commit.</p></blockquote><p><img src="/images/skylake_microarchitecture/figure33_ELduOx.png" alt="Figure33 ELduOx"></p><p>The following instruction is handled the same way; assume its value is <code>-17</code>.</p><p><img src="/images/skylake_microarchitecture/figure34_ELdn61.png" alt="Figure34 ELdn61"></p><p>The next <code>load</code> instruction again first checks whether an earlier <code>store</code> wrote to the same location. None did, so the value at the corresponding address is passed from the <code>Cache</code> into the
<code>Value</code> field of the <code>LSQ</code>.</p><p><img src="/images/skylake_microarchitecture/figure35_ELd8te.png" alt="Figure35 ELd8te"></p><p>The <code>load</code> after that finds an earlier <code>store</code> to the same address, so it takes the <code>store</code>'s value directly into <code>Value</code>. This is a <code>store-forward</code> operation.</p><p><img src="/images/skylake_microarchitecture/figure36_ELdYpd.png" alt="Figure36 ELdYpd"></p><p>The next <code>load</code> instruction reads the value <code>1</code> from the <code>Cache</code> and places it in the <code>LSQ</code>.</p><p><img src="/images/skylake_microarchitecture/figure37_ELd3kD.png" alt="Figure37 ELd3kD"></p><p>Then comes a <code>store</code> instruction; assume its computed value is <code>0</code>, which is placed in the <code>LSQ</code>.</p><p><img src="/images/skylake_microarchitecture/figure38_ELdGfH.png" alt="Figure38 ELdGfH"></p><p>The next <code>load</code> instruction again uses <code>store-forward</code> and places <code>25</code> in <code>Value</code>.</p><p><img src="/images/skylake_microarchitecture/figure39_EOa09H.png" alt="Figure39 EOa09H"></p><p>The next <code>load</code> instruction finds several <code>store</code> instructions with the same address; it takes the closest one and places <code>0</code> in the <code>Value</code> field of the <code>LSQ</code>.</p><p><img src="/images/skylake_microarchitecture/figure40_EOaB3d.png" alt="Figure40 EOaB3d"></p><p>The last <code>load</code> instruction places <code>1</code> from the <code>Cache</code> into the <code>LSQ</code>.</p><p>Next, the instructions are committed.</p><p><img src="/images/skylake_microarchitecture/figure41_EOdEPe.png" alt="Figure41 EOdEPe"></p><p>A <code>load</code> instruction is simply dequeued from the <code>LSQ</code>, because its value was already loaded into the register during the execute stage.</p><p><img src="/images/skylake_microarchitecture/figure42_EOdurt.png" alt="Figure42 EOdurt"></p><p>A <code>store</code> instruction writes its value to the <code>Cache</code> and is then dequeued.</p><p><img src="/images/skylake_microarchitecture/figure43_EOdexA.png" alt="Figure43 EOdexA"></p><p>The next <code>store</code> instruction likewise updates the <code>Cache</code> and is then dequeued.</p><p><img src="/images/skylake_microarchitecture/figure44_EOdZ2d.png" alt="Figure44 EOdZ2d"></p><p>The next three <code>load</code> instructions are dequeued.</p><p><img src="/images/skylake_microarchitecture/figure45_EOdV8H.png" alt="Figure45
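EOdV8H"></p><p>The <code>store</code> updates the cache and is dequeued.</p><p><img src="/images/skylake_microarchitecture/figure46_EOdnKI.png" alt="Figure46 EOdnKI"></p><p>The last three <code>load</code> instructions are dequeued.</p><p>A <code>store</code> updates the cache only at the commit stage for the following reason: if the processor detects a misprediction in the pipeline and the instructions after the last <code>store</code> instruction have to be flushed, the cache state is not affected; and when the prediction is correct, the cache can be regarded as never having been affected.</p><blockquote><p>To reiterate</p><p>All content in this article comes from DECODEZ's “Skylake 微架构剖析” (Skylake microarchitecture analysis) series, at <a href="https://decodezp.github.io/2019/01/07/quickwords9-skylake-pipeline-1/">https://decodezp.github.io/2019/01/07/quickwords9-skylake-pipeline-1/</a></p><p>It is reposted here purely as personal notes; for the full details, please go directly to DECODEZ's blog at <a href="https://decodezp.github.io/">https://decodezp.github.io/</a></p></blockquote>]]></content>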
<summary type="html"><blockquote>
<p>All content in this article comes from DECODEZ's “Skylake 微架构剖析” (Skylake microarchitecture analysis) series, at <a href="https://decodezp.github.io/2019/01/07/quickwords9-skylake-pipeline-1/">https://decodezp.github.io/2019/01/07/quickwords9-skylake-pipeline-1/</a></p>
<p>It is reposted here purely as personal notes; for the full details, please go directly to DECODEZ's blog at <a href="https://decodezp.github.io/">https://decodezp.github.io/</a></p>
</blockquote>
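<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>Preface</h2><p>Understanding the microarchitecture of a <code>CPU</code> is a prerequisite for developing "hardcore" software on top of it. For historical reasons, the existing technical material often confuses concepts, reuses names, and even contradicts itself. This article first walks through the composition and characteristics of the <code>Skylake</code> microarchitecture (mainly the pipeline), and second tries to untangle some muddled concepts to help those who come after.</p>
<p>Once the microarchitecture has been introduced, the article goes on to cross-reference it with the <code>Performance Event</code>s in <code>Perf</code>, so that each corroborates the other.</p>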
<blockquote>
<p>Note that the focus of this article is Skylake's pipeline architecture; how cores are interconnected and organized is not covered in detail.</p>
</blockquote></summary>
<category term="cpu" scheme="https://andrewei1316.github.io/categories/cpu/"/>
<category term="arch" scheme="https://andrewei1316.github.io/categories/cpu/arch/"/>
<category term="cpu" scheme="https://andrewei1316.github.io/tags/cpu/"/>
<category term="arch" scheme="https://andrewei1316.github.io/tags/arch/"/>
</entry>
<entry>
<title>Reading Notes on "Star Schema Benchmark"</title>
<link href="https://andrewei1316.github.io/2020/12/12/star-schema-benchmark/"/>
<id>https://andrewei1316.github.io/2020/12/12/star-schema-benchmark/</id>
<published>2020-12-12T05:28:55.000Z</published>
<updated>2020-12-13T06:34:11.522Z</updated>
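<content type="html"><![CDATA[<h2 id="简介"><a href="#简介" class="headerlink" title="简介"></a>Introduction</h2><p><code>SSB</code> (Star Schema Benchmark) is a data model, based on real-world business applications, defined by researchers at the University of Massachusetts Boston. It is widely accepted as a fair and neutral way to model decision-support applications, and both academia and industry commonly use it to evaluate the performance of decision-support technology.<br><code>SSB</code> is derived from the <code>TPC-H</code> standard published by the <code>TPC</code> (Transaction Processing Performance Council). It turns the snowflake schema of <code>TPC-H</code> into a star schema, and replaces the complex <code>Ad-Hoc</code> benchmark queries of <code>TPC-H</code> with more rigidly structured <code>OLAP</code> queries.</p><blockquote><p>The Transaction Processing Performance Council is a non-profit organization founded by several dozen member companies and headquartered in the United States. It is open to the whole world, but to date the vast majority of its members are large companies from the US, Japan, and Western Europe. TPC members are mainly computer hardware and software vendors rather than computer users. Its role is to define the standard specifications and the performance and price metrics for business-application benchmarks, and to manage the publication of test results.</p><p>Quoted from Baidu Baike: <a href="https://baike.baidu.com/item/TPC/1814556">TPC (事务处理性能委员会)</a></p></blockquote><p>The reason for not using <code>TPC-H</code> as-is is the aim of providing more general Functional Coverage and Selectivity Coverage:</p><ol><li>Functional Coverage: use queries that span multiple tables wherever possible, to stay close to real-world usage</li><li>Selectivity Coverage: filter the fact table through conditions on the dimension tables, so that the filtered result set is relatively small</li></ol><p>A few concepts:</p><ol><li>SF (Scale Factor): the data-volume factor passed in when generating the test data set; it determines the number of rows finally generated for each table.</li><li>FF (Filter Factor): each WHERE predicate selects a subset of rows, and the ratio of selected rows to rows before filtering is called the FF. When the filtered columns are independent of one another, the FF of a table is the product of the FFs of the individual predicates on that table.</li></ol><a id="more"></a><h2 id="表结构"><a href="#表结构" class="headerlink" title="表结构"></a>Table structure</h2><p><img src="/images/star-schema-benchmark/figure_1.2_ssb_schema.png" alt="Figure1.2 SSB Schema"></p><p>In both <code>TPC-H</code> and <code>SSB</code>, the size of every table follows from the chosen scale factor, for example <code>SF=1</code> through <code>SF=10</code>; in most cases the table size is a multiple of <code>SF</code>.</p><h3 id="LINEORDER-事实表"><a href="#LINEORDER-事实表" class="headerlink" title="LINEORDER 事实表"></a>LINEORDER fact table</h3><p>The fact table merges the <code>LINEITEM</code> and <code>ORDERS</code> tables of TPC-H, which is closer to data-warehouse practice and avoids unnecessary <code>join</code> work during queries.</p><p>Size: $SF*6,000,000$</p><h4 id="Schema-描述"><a href="#Schema-描述" class="headerlink" title="Schema 描述"></a>Schema description</h4><table><thead><tr><th>Field</th><th>Description</th></tr></thead><tbody><tr><td>LO_ORDERKEY</td><td>numeric (int up to SF 300) first 8 of each 32 keys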
populated</td></tr><tr><td>LO_LINENUMBER</td><td>numeric 1-7</td></tr><tr><td>LO_CUSTKEY</td><td>numeric identifier FK to C_CUSTKEY</td></tr><tr><td>LO_PARTKEY</td><td>identifier FK to P_PARTKEY</td></tr><tr><td>LO_SUPPKEY</td><td>numeric identifier FK to S_SUPPKEY</td></tr><tr><td>LO_ORDERDATE</td><td>identifier FK to D_DATEKEY</td></tr><tr><td>LO_ORDERPRIORITY</td><td>fixed text, size 15</td></tr><tr><td>LO_SHIPPRIORITY</td><td>fixed text, size 1</td></tr><tr><td>LO_QUANTITY</td><td>numeric 1-50 (for PART)</td></tr><tr><td>LO_EXTENDEDPRICE</td><td>numeric ≤ 55,450 (for PART)</td></tr><tr><td>LO_ORDTOTALPRICE</td><td>numeric ≤ 388,000 (ORDER)</td></tr><tr><td>LO_DISCOUNT</td><td>numeric 0-10 (for PART, percent)</td></tr><tr><td>LO_REVENUE</td><td>numeric (for PART: (lo_extendedprice*(100-lo_discnt))/100)</td></tr><tr><td>LO_SUPPLYCOST</td><td>numeric (for PART)</td></tr><tr><td>LO_TAX</td><td>numeric 0-8 (for PART)</td></tr><tr><td>LO_COMMITDATE</td><td>FK to D_DATEKEY</td></tr><tr><td>LO_SHIPMODE</td><td>fixed text, size 10</td></tr><tr><td>Compound Primary Key: LO_ORDERKEY, LO_LINENUMBER</td><td></td></tr></tbody></table><h4 id="建表-SQL"><a href="#建表-SQL" class="headerlink" title="建表 SQL"></a>建表 SQL</h4><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span 
class="keyword">CREATE</span> <span class="keyword">TABLE</span> LINEORDER </span><br><span class="line">(</span><br><span class="line"> LO_ORDERKEY <span class="built_in">INTEGER</span> <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> LO_LINENUMBER <span class="built_in">INTEGER</span> <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> LO_CUSTKEY <span class="built_in">INTEGER</span> <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> LO_PARTKEY <span class="built_in">INTEGER</span> <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> LO_SUPPKEY <span class="built_in">INTEGER</span> <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> LO_ORDERDATE <span class="built_in">INTEGER</span> <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> LO_ORDERPRIORITY <span class="built_in">VARCHAR</span>(<span class="number">15</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> LO_SHIPPRIORITY <span class="built_in">VARCHAR</span>(<span class="number">1</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> LO_QUANTITY <span class="built_in">INTEGER</span> <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> LO_EXTENDEDPRICE <span class="built_in">INTEGER</span> <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> LO_ORDERTOTALPRICE <span class="built_in">INTEGER</span> <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> LO_DISCOUNT <span class="built_in">INTEGER</span> <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> 
LO_REVENUE <span class="built_in">INTEGER</span> <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> LO_SUPPLYCOST <span class="built_in">INTEGER</span> <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> LO_TAX <span class="built_in">INTEGER</span> <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> LO_COMMITDATE <span class="built_in">INTEGER</span> <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> LO_SHIPMODE <span class="built_in">VARCHAR</span>(<span class="number">10</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span></span><br><span class="line">);</span><br></pre></td></tr></table></figure><h3 id="PART-维度表"><a href="#PART-维度表" class="headerlink" title="PART 维度表"></a>PART 维度表</h3><p>规模:$ 200,000*floor(1+log2SF)$</p><h4 id="Schema-描述-1"><a href="#Schema-描述-1" class="headerlink" title="Schema 描述"></a>Schema 描述</h4><table><thead><tr><th>字段</th><th>描述</th></tr></thead><tbody><tr><td>P_PARTKEY</td><td>identifier</td></tr><tr><td>P_NAME</td><td>variable text, size 22 (Not unique)</td></tr><tr><td>P_MFGR</td><td>fixed text, size 6 (MFGR#1-5, CARD = 5)</td></tr><tr><td>P_CATEGORY</td><td>fixed text, size 7 (‘MFGR#’||1-5||1-5: CARD = 25)</td></tr><tr><td>P_BRAND1</td><td>fixed text, size 9 (P_CATEGORY||1-40: CARD = 1000)</td></tr><tr><td>P_COLOR</td><td>variable text, size 11 (CARD = 94)</td></tr><tr><td>P_TYPE</td><td>variable text, size 25 (CARD = 150)</td></tr><tr><td>P_SIZE</td><td>numeric 1-50 (CARD = 50)</td></tr><tr><td>P_CONTAINER</td><td>fixed text, size 10 (CARD = 40)</td></tr><tr><td>Primary Key: P_PARTKEY</td><td></td></tr></tbody></table><h4 id="建表-SQL-1"><a href="#建表-SQL-1" class="headerlink" title="建表 SQL"></a>建表 SQL</h4><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">CREATE</span> <span class="keyword">TABLE</span> PART </span><br><span class="line">(</span><br><span class="line"> P_PARTKEY <span class="built_in">INTEGER</span> <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> P_NAME <span class="built_in">VARCHAR</span>(<span class="number">22</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> P_MFGR <span class="built_in">VARCHAR</span>(<span class="number">6</span>),</span><br><span class="line"> P_CATEGORY <span class="built_in">VARCHAR</span>(<span class="number">7</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> P_BRAND1 <span class="built_in">VARCHAR</span>(<span class="number">9</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> P_COLOR <span class="built_in">VARCHAR</span>(<span class="number">11</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> P_TYPE <span class="built_in">VARCHAR</span>(<span class="number">25</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> P_SIZE <span class="built_in">INTEGER</span> <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> P_CONTAINER <span class="built_in">VARCHAR</span>(<span class="number">10</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span></span><br><span 
class="line">);</span><br></pre></td></tr></table></figure><h3 id="SUPPLIER-维度表"><a href="#SUPPLIER-维度表" class="headerlink" title="SUPPLIER 维度表"></a>SUPPLIER dimension table</h3><p>Size: $SF*2,000$</p><h4 id="Schema-描述-2"><a href="#Schema-描述-2" class="headerlink" title="Schema 描述"></a>Schema description</h4><table><thead><tr><th>Field</th><th>Description</th></tr></thead><tbody><tr><td>S_SUPPKEY</td><td>numeric identifier</td></tr><tr><td>S_NAME</td><td>fixed text, size 25: ‘Supplier’||S_SUPPKEY</td></tr><tr><td>S_ADDRESS</td><td>variable text, size 25 (city below)</td></tr><tr><td>S_CITY</td><td>fixed text, size 10 (10/nation: S_NATION_PREFI||(0-9))</td></tr><tr><td>S_NATION</td><td>fixed text, size 15 (25 values, longest UNITED KINGDOM)</td></tr><tr><td>S_REGION</td><td>fixed text, size 12 (5 values: longest MIDDLE EAST)</td></tr><tr><td>S_PHONE</td><td>fixed text, size 15 (many values, format: 43-617-354-1222)</td></tr><tr><td>Primary Key: S_SUPPKEY</td><td></td></tr></tbody></table><h4 id="建表-SQL-2"><a href="#建表-SQL-2" class="headerlink" title="建表 SQL"></a>Table creation SQL</h4><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">CREATE</span> <span class="keyword">TABLE</span> SUPPLIER</span><br><span class="line">(</span><br><span class="line"> S_SUPPKEY <span class="built_in">INTEGER</span> <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> S_NAME <span class="built_in">VARCHAR</span>(<span class="number">25</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> S_ADDRESS <span
class="built_in">VARCHAR</span>(<span class="number">25</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> S_CITY <span class="built_in">VARCHAR</span>(<span class="number">10</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> S_NATION <span class="built_in">VARCHAR</span>(<span class="number">15</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> S_REGION <span class="built_in">VARCHAR</span>(<span class="number">12</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> S_PHONE <span class="built_in">VARCHAR</span>(<span class="number">15</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span></span><br><span class="line">);</span><br></pre></td></tr></table></figure><h3 id="CUSTOMER-维度表"><a href="#CUSTOMER-维度表" class="headerlink" title="CUSTOMER 维度表"></a>CUSTOMER dimension table</h3><p>Size: $SF*30,000$</p><h4 id="Schema-描述-3"><a href="#Schema-描述-3" class="headerlink" title="Schema 描述"></a>Schema description</h4><table><thead><tr><th>Field</th><th>Description</th></tr></thead><tbody><tr><td>C_CUSTKEY</td><td>numeric identifier</td></tr><tr><td>C_NAME</td><td>variable text, size 25 ‘Customer’||C_CUSTKEY</td></tr><tr><td>C_ADDRESS</td><td>variable text, size 25 (city below)</td></tr><tr><td>C_CITY</td><td>fixed text, size 10 (10/nation: C_NATION_PREFI||(0-9))</td></tr><tr><td>C_NATION</td><td>fixed text, size 15 (25 values, longest UNITED KINGDOM)</td></tr><tr><td>C_REGION</td><td>fixed text, size 12 (5 values: longest MIDDLE EAST)</td></tr><tr><td>C_PHONE</td><td>fixed text, size 15 (many values, format: 43-617-354-1222)</td></tr><tr><td>C_MKTSEGMENT</td><td>fixed text, size 10 (longest is AUTOMOBILE)</td></tr><tr><td>Primary Key: C_CUSTKEY</td><td></td></tr></tbody></table><h4 id="建表-SQL-3"><a href="#建表-SQL-3" class="headerlink" title="建表
SQL"></a>建表 SQL</h4><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">CREATE</span> <span class="keyword">TABLE</span> CUSTOMER</span><br><span class="line">(</span><br><span class="line"> C_CUSTKEY <span class="built_in">INTEGER</span> <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> C_NAME <span class="built_in">VARCHAR</span>(<span class="number">25</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> C_ADDRESS <span class="built_in">VARCHAR</span>(<span class="number">25</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> C_CITY <span class="built_in">VARCHAR</span>(<span class="number">10</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> C_NATION <span class="built_in">VARCHAR</span>(<span class="number">15</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> C_REGION <span class="built_in">VARCHAR</span>(<span class="number">12</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> C_PHONE <span class="built_in">VARCHAR</span>(<span class="number">15</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> C_MKTSEGMENT <span class="built_in">VARCHAR</span>(<span class="number">10</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span></span><br><span 
class="line">);</span><br></pre></td></tr></table></figure><h3 id="DATE-维度表"><a href="#DATE-维度表" class="headerlink" title="DATE 维度表"></a>DATE 维度表</h3><p>规模:7 years of days</p><h4 id="Schema-描述-4"><a href="#Schema-描述-4" class="headerlink" title="Schema 描述"></a>Schema 描述</h4><table><thead><tr><th>字段</th><th>描述</th></tr></thead><tbody><tr><td>D_DATEKEY</td><td>identifier, unique id – e.g. 19980327 (what we use)</td></tr><tr><td>D_DATE</td><td>fixed text, size 18: e.g. December 22, 1998</td></tr><tr><td>D_DAYOFWEEK</td><td>fixed text, size 8, Sunday..Saturday</td></tr><tr><td>D_MONTH</td><td>fixed text, size 9: January, …, December</td></tr><tr><td>D_YEAR</td><td>unique value 1992-1998</td></tr><tr><td>D_YEARMONTHNUM</td><td>numeric (YYYYMM)</td></tr><tr><td>D_YEARMONTH</td><td>fixed text, size 7: (e.g.: Mar1998)</td></tr><tr><td>D_DAYNUMINWEEK</td><td>numeric 1-7</td></tr><tr><td>D_DAYNUMINMONTH</td><td>numeric 1-31</td></tr><tr><td>D_DAYNUMINYEAR</td><td>numeric 1-366</td></tr><tr><td>D_MONTHNUMINYEAR</td><td>numeric 1-12</td></tr><tr><td>D_WEEKNUMINYEAR</td><td>numeric 1-53</td></tr><tr><td>D_SELLINGSEASON</td><td>text, size 12 (e.g.: Christmas)</td></tr><tr><td>D_LASTDAYINWEEKFL</td><td>1 bit</td></tr><tr><td>D_LASTDAYINMONTHFL</td><td>1 bit</td></tr><tr><td>D_HOLIDAYFL</td><td>1 bit</td></tr><tr><td>D_WEEKDAYFL</td><td>1 bit</td></tr><tr><td>Primary Key: D_DATEKEY</td><td></td></tr></tbody></table><h4 id="建表-SQL-4"><a href="#建表-SQL-4" class="headerlink" title="建表 SQL"></a>建表 SQL</h4><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span 
class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">CREATE</span> <span class="keyword">TABLE</span> DATES (</span><br><span class="line"> D_DATEKEY <span class="built_in">INTEGER</span>,</span><br><span class="line"> D_DATE <span class="built_in">VARCHAR</span>(<span class="number">18</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> D_DAYOFWEEK <span class="built_in">VARCHAR</span>(<span class="number">18</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> D_MONTH <span class="built_in">VARCHAR</span>(<span class="number">9</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> D_YEAR <span class="built_in">INTEGER</span> <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> D_YEARMONTHNUM <span class="built_in">INTEGER</span>,</span><br><span class="line"> D_YEARMONTH <span class="built_in">VARCHAR</span>(<span class="number">7</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> D_DAYNUMINWEEK <span class="built_in">INTEGER</span>,</span><br><span class="line"> D_DAYNUMINMONTH <span class="built_in">INTEGER</span>,</span><br><span class="line"> D_DAYNUMINYEAR <span class="built_in">INTEGER</span>,</span><br><span class="line"> D_MONTHNUMINYEAR <span class="built_in">INTEGER</span>,</span><br><span class="line"> D_WEEKNUMINYEAR <span class="built_in">INTEGER</span>,</span><br><span class="line"> D_SELLINGSEASON <span class="built_in">VARCHAR</span>(<span class="number">12</span>) <span class="keyword">NOT</span> <span class="literal">NULL</span>,</span><br><span class="line"> D_LASTDAYINWEEKFL <span 
class="built_in">INTEGER</span>,</span><br><span class="line"> D_LASTDAYINMONTHFL <span class="built_in">INTEGER</span>,</span><br><span class="line"> D_HOLIDAYFL <span class="built_in">INTEGER</span>,</span><br><span class="line"> D_WEEKDAYFL <span class="built_in">INTEGER</span></span><br><span class="line">);</span><br></pre></td></tr></table></figure><h2 id="查询语句"><a href="#查询语句" class="headerlink" title="查询语句"></a>Queries</h2><p>Compared with <code>TPC-H</code>, <code>SSB</code> simplifies the model, dropping some <code>Table</code>s and adding new ones. On the <code>SQL</code> side, <code>SSB</code> builds on <code>TPC-H</code> but uses as few <code>SQL</code> statements as possible to reach complete conclusions.</p><h3 id="查询语句定义"><a href="#查询语句定义" class="headerlink" title="查询语句定义"></a>Query definitions</h3><h4 id="Q1"><a href="#Q1" class="headerlink" title="Q1"></a>Q1</h4><p>The template for the first class of queries is:</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">select</span> <span class="keyword">sum</span>(lo_extendedprice * lo_discount) <span class="keyword">as</span> revenue</span><br><span class="line"> <span class="keyword">from</span> lineorder, <span class="built_in">date</span></span><br><span class="line"> <span class="keyword">where</span> lo_orderdate = d_datekey</span><br><span class="line"> <span class="keyword">and</span> [DATE_FILTER]</span><br><span class="line"> <span class="keyword">and</span> [LO_DISCOUNT_FILTER]</span><br><span class="line"> <span class="keyword">and</span> [LO_QUANTITY_FILTER]; </span><br></pre></td></tr></table></figure><p>The scenario: within a given <code>time range</code>, select the orders whose <code>discount</code> and <code>quantity sold</code> fall within certain ranges, and compute the <code>sum</code> of the <code>revenue</code> they bring in.</p><p>This scenario restricts the data through a single dimension table.</p><p>The scenario expands into 3 queries. The rows selected by the individual queries do not overlap, which effectively avoids the influence of system caching (caching could otherwise spare later accesses the disk I/O cost).</p><h5 id="Q1-1"><a
href="#Q1-1" class="headerlink" title="Q1.1"></a>Q1.1</h5><p>Assign values to the variables above:</p><figure class="highlight"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">DATE_FILTER -> d_year = 1993</span><br><span class="line">LO_DISCOUNT_FILTER -> lo_discount between 1 and 3</span><br><span class="line">LO_QUANTITY_FILTER -> lo_quantity < 25</span><br></pre></td></tr></table></figure><p>The complete <code>SQL</code> is:</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">select</span> <span class="keyword">sum</span>(lo_extendedprice*lo_discount) <span class="keyword">as</span> revenue</span><br><span class="line"> <span class="keyword">from</span> lineorder, <span class="built_in">date</span></span><br><span class="line"> <span class="keyword">where</span> lo_orderdate = d_datekey</span><br><span class="line"> <span class="keyword">and</span> d_year = <span class="number">1993</span></span><br><span class="line"> <span class="keyword">and</span> lo_discount <span class="keyword">between</span> <span class="number">1</span> <span class="keyword">and</span> <span class="number">3</span></span><br><span class="line"> <span class="keyword">and</span> lo_quantity < <span class="number">25</span>;</span><br></pre></td></tr></table></figure><p>Here $FF = (1/7) * (3/11) * 0.5 = 0.0194805$; for <code>SF = 1</code>, the number of <code>LINEORDER</code> rows selected is $0.0194805 * 6,000,000 ≈ 116,883$.</p><h5 id="Q1-2"><a href="#Q1-2" class="headerlink" title="Q1.2"></a>Q1.2</h5><p>Assign values to the variables above:</p><figure class="highlight"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span
class="line">3</span><br></pre></td><td class="code"><pre><span class="line">DATE_FILTER -> d_yearmonthnum = 199401</span><br><span class="line">LO_DISCOUNT_FILTER -> lo_discount between 4 and 6</span><br><span class="line">LO_QUANTITY_FILTER -> lo_quantity between 26 and 35</span><br></pre></td></tr></table></figure><p>The complete <code>SQL</code> is</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">select</span> <span class="keyword">sum</span>(lo_extendedprice*lo_discount) <span class="keyword">as</span> revenue</span><br><span class="line"> <span class="keyword">from</span> lineorder, <span class="built_in">date</span></span><br><span class="line"> <span class="keyword">where</span> lo_orderdate = d_datekey</span><br><span class="line"> <span class="keyword">and</span> d_yearmonthnum = <span class="number">199401</span></span><br><span class="line"> <span class="keyword">and</span> lo_discount <span class="keyword">between</span> <span class="number">4</span> <span class="keyword">and</span> <span class="number">6</span></span><br><span class="line"> <span class="keyword">and</span> lo_quantity <span class="keyword">between</span> <span class="number">26</span> <span class="keyword">and</span> <span class="number">35</span>;</span><br></pre></td></tr></table></figure><p>Here $FF = (1/84) * (3/11) * 0.2 = 0.00064935$; for <code>SF = 1</code>, the number of <code>LINEORDER</code> rows selected is $0.00064935 * 6,000,000 ≈ 3896$.</p><h5 id="Q1-3"><a href="#Q1-3" class="headerlink" title="Q1.3"></a>Q1.3</h5><p>Assign the template variables as follows</p><figure class="highlight"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">DATE_FILTER -> d_weeknuminyear = 6 and d_year = 
1994</span><br><span class="line">LO_DISCOUNT_FILTER -> lo_discount between 5 and 7</span><br><span class="line">LO_QUANTITY_FILTER -> lo_quantity between 26 and 35</span><br></pre></td></tr></table></figure><p>The complete <code>SQL</code> is</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">select</span> <span class="keyword">sum</span>(lo_extendedprice*lo_discount) <span class="keyword">as</span> revenue</span><br><span class="line"> <span class="keyword">from</span> lineorder, <span class="built_in">date</span></span><br><span class="line"> <span class="keyword">where</span> lo_orderdate = d_datekey</span><br><span class="line"> <span class="keyword">and</span> d_weeknuminyear = <span class="number">6</span></span><br><span class="line"> <span class="keyword">and</span> d_year = <span class="number">1994</span></span><br><span class="line"> <span class="keyword">and</span> lo_discount <span class="keyword">between</span> <span class="number">5</span> <span class="keyword">and</span> <span class="number">7</span></span><br><span class="line"> <span class="keyword">and</span> lo_quantity <span class="keyword">between</span> <span class="number">26</span> <span class="keyword">and</span> <span class="number">35</span>;</span><br></pre></td></tr></table></figure><p>Here $FF = (1/364) * (3/11) * 0.1 = 0.000075$; for <code>SF = 1</code>, the number of <code>LINEORDER</code> rows selected is $0.000075 * 6,000,000 ≈ 450$.</p><h4 id="Q2"><a href="#Q2" class="headerlink" title="Q2"></a>Q2</h4><p>The template for the second class of queries is</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span
class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">select</span> <span class="keyword">sum</span>(lo_revenue), d_year, p_brand1</span><br><span class="line"> <span class="keyword">from</span> lineorder, <span class="built_in">date</span>, part, supplier</span><br><span class="line"> <span class="keyword">where</span> lo_orderdate = d_datekey</span><br><span class="line"> <span class="keyword">and</span> lo_partkey = p_partkey</span><br><span class="line"> <span class="keyword">and</span> lo_suppkey = s_suppkey</span><br><span class="line"> <span class="keyword">and</span> [PART_FILTER]</span><br><span class="line"> <span class="keyword">and</span> [S_REGION_FILTER]</span><br><span class="line"> <span class="keyword">group</span> <span class="keyword">by</span> d_year, p_brand1</span><br><span class="line"> <span class="keyword">order</span> <span class="keyword">by</span> d_year, p_brand1;</span><br></pre></td></tr></table></figure><p>The scenario: given restrictions on <code>supplier</code> and part <code>category</code>, compute the <code>total revenue</code> for each <code>brand</code> in each <code>year</code>.</p><p>This scenario restricts the data through two dimension tables.</p><p>This scenario also expands into 3 queries. The rows they select do not overlap with each other, nor with the rows selected by <code>Q1</code>.</p><h5 id="Q2-1"><a href="#Q2-1" class="headerlink" title="Q2.1"></a>Q2.1</h5><p>Assign the template variables as follows</p><figure class="highlight"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">PART_FILTER -> p_category = 'MFGR#12'</span><br><span class="line">S_REGION_FILTER -> s_region = 'AMERICA'</span><br></pre></td></tr></table></figure><p>The complete <code>SQL</code> is</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span
class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">select</span> <span class="keyword">sum</span>(lo_revenue), d_year, p_brand1</span><br><span class="line"> <span class="keyword">from</span> lineorder, <span class="built_in">date</span>, part, supplier</span><br><span class="line"> <span class="keyword">where</span> lo_orderdate = d_datekey</span><br><span class="line"> <span class="keyword">and</span> lo_partkey = p_partkey</span><br><span class="line"> <span class="keyword">and</span> lo_suppkey = s_suppkey</span><br><span class="line"> <span class="keyword">and</span> p_category = <span class="string">'MFGR#12'</span></span><br><span class="line"> <span class="keyword">and</span> s_region = <span class="string">'AMERICA'</span></span><br><span class="line"> <span class="keyword">group</span> <span class="keyword">by</span> d_year, p_brand1</span><br><span class="line"> <span class="keyword">order</span> <span class="keyword">by</span> d_year, p_brand1;</span><br></pre></td></tr></table></figure><p>Here $FF = (1/25) * (1/5) = 1/125$; for <code>SF = 1</code>, the number of <code>LINEORDER</code> rows selected is $(1/125) * 6,000,000 ≈ 48,000$.</p><h5 id="Q2-2"><a href="#Q2-2" class="headerlink" title="Q2.2"></a>Q2.2</h5><p>Assign the template variables as follows</p><figure class="highlight"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">PART_FILTER -> p_brand1 between 'MFGR#2221' and 'MFGR#2228'</span><br><span class="line">S_REGION_FILTER -> s_region = 'ASIA'</span><br></pre></td></tr></table></figure><p>The complete <code>SQL</code> is</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span
class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">select</span> <span class="keyword">sum</span>(lo_revenue), d_year, p_brand1</span><br><span class="line"> <span class="keyword">from</span> lineorder, <span class="built_in">date</span>, part, supplier</span><br><span class="line"> <span class="keyword">where</span> lo_orderdate = d_datekey</span><br><span class="line"> <span class="keyword">and</span> lo_partkey = p_partkey</span><br><span class="line"> <span class="keyword">and</span> lo_suppkey = s_suppkey</span><br><span class="line"> <span class="keyword">and</span> p_brand1 <span class="keyword">between</span> <span class="string">'MFGR#2221'</span> <span class="keyword">and</span> <span class="string">'MFGR#2228'</span></span><br><span class="line"> <span class="keyword">and</span> s_region = <span class="string">'ASIA'</span></span><br><span class="line"> <span class="keyword">group</span> <span class="keyword">by</span> d_year, p_brand1</span><br><span class="line"> <span class="keyword">order</span> <span class="keyword">by</span> d_year, p_brand1;</span><br></pre></td></tr></table></figure><p>Here $FF = (1/125) * (1/5) = 1/625$; for <code>SF = 1</code>, the number of <code>LINEORDER</code> rows selected is $(1/625) * 6,000,000 ≈ 9600$.</p><h5 id="Q2-3"><a href="#Q2-3" class="headerlink" title="Q2.3"></a>Q2.3</h5><p>Assign the template variables as follows</p><figure class="highlight"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">PART_FILTER -> p_brand1 = 'MFGR#2221'</span><br><span class="line">S_REGION_FILTER -> s_region = 'EUROPE'</span><br></pre></td></tr></table></figure><p>The complete <code>SQL</code> is</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span
class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">select</span> <span class="keyword">sum</span>(lo_revenue), d_year, p_brand1</span><br><span class="line"> <span class="keyword">from</span> lineorder, <span class="built_in">date</span>, part, supplier</span><br><span class="line"> <span class="keyword">where</span> lo_orderdate = d_datekey</span><br><span class="line"> <span class="keyword">and</span> lo_partkey = p_partkey</span><br><span class="line"> <span class="keyword">and</span> lo_suppkey = s_suppkey</span><br><span class="line"> <span class="keyword">and</span> p_brand1 = <span class="string">'MFGR#2221'</span></span><br><span class="line"> <span class="keyword">and</span> s_region = <span class="string">'EUROPE'</span></span><br><span class="line"> <span class="keyword">group</span> <span class="keyword">by</span> d_year, p_brand1</span><br><span class="line"> <span class="keyword">order</span> <span class="keyword">by</span> d_year, p_brand1;</span><br></pre></td></tr></table></figure><p>Here $FF = (1/1000) * (1/5) = 1/5000$; for <code>SF = 1</code>, the number of <code>LINEORDER</code> rows selected is $(1/5000) * 6,000,000 ≈ 1200$.</p><h4 id="Q3"><a href="#Q3" class="headerlink" title="Q3"></a>Q3</h4><p>The template for the third class of queries is</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">select</span> c_nation, s_nation, d_year, <span class="keyword">sum</span>(lo_revenue) <span
class="keyword">as</span> revenue</span><br><span class="line"> <span class="keyword">from</span> customer, lineorder, supplier, <span class="built_in">date</span></span><br><span class="line"> <span class="keyword">where</span> lo_custkey = c_custkey</span><br><span class="line"> <span class="keyword">and</span> lo_suppkey = s_suppkey</span><br><span class="line"> <span class="keyword">and</span> lo_orderdate = d_datekey</span><br><span class="line"> <span class="keyword">and</span> [CUSTOMOR_FILTER]</span><br><span class="line"> <span class="keyword">and</span> [SUPPLIER_FILTER]</span><br><span class="line"> <span class="keyword">and</span> [DATE_FILTER]</span><br><span class="line"> <span class="keyword">group</span> <span class="keyword">by</span> c_nation, s_nation, d_year</span><br><span class="line"> <span class="keyword">order</span> <span class="keyword">by</span> d_year <span class="keyword">asc</span>, revenue <span class="keyword">desc</span>;</span><br></pre></td></tr></table></figure><p>The scenario: given restrictions on <code>supplier</code>, <code>customer</code> and <code>time</code>, compute the <code>total revenue</code> for each combination of <code>customer nation</code>, <code>supplier nation</code> and <code>year</code>.</p><p>This scenario restricts the data through three dimension tables.</p><p>This scenario expands into 4 queries. Apart from <code>Q3.3</code> and <code>Q3.4</code>, the rows they select do not overlap with each other, nor with the rows selected by <code>Q1</code> and <code>Q2</code>.</p><h5 id="Q3-1"><a href="#Q3-1" class="headerlink" title="Q3.1"></a>Q3.1</h5><p>Assign the template variables as follows</p><figure class="highlight"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">CUSTOMOR_FILTER -> c_region = 'ASIA'</span><br><span class="line">SUPPLIER_FILTER -> s_region = 'ASIA'</span><br><span class="line">DATE_FILTER -> d_year >= 1992 and d_year <= 1997</span><br></pre></td></tr></table></figure><p>The complete <code>SQL</code> is</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span
class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">select</span> c_nation, s_nation, d_year, <span class="keyword">sum</span>(lo_revenue) <span class="keyword">as</span> revenue</span><br><span class="line"> <span class="keyword">from</span> customer, lineorder, supplier, <span class="built_in">date</span></span><br><span class="line"> <span class="keyword">where</span> lo_custkey = c_custkey</span><br><span class="line"> <span class="keyword">and</span> lo_suppkey = s_suppkey</span><br><span class="line"> <span class="keyword">and</span> lo_orderdate = d_datekey</span><br><span class="line"> <span class="keyword">and</span> c_region = <span class="string">'ASIA'</span></span><br><span class="line"> <span class="keyword">and</span> s_region = <span class="string">'ASIA'</span></span><br><span class="line"> <span class="keyword">and</span> d_year >= <span class="number">1992</span> <span class="keyword">and</span> d_year <= <span class="number">1997</span></span><br><span class="line"> <span class="keyword">group</span> <span class="keyword">by</span> c_nation, s_nation, d_year</span><br><span class="line"> <span class="keyword">order</span> <span class="keyword">by</span> d_year <span class="keyword">asc</span>, revenue <span class="keyword">desc</span>;</span><br></pre></td></tr></table></figure><p>Here $FF = (1/5) * (1/5) * (6/7) = 6/175$; for <code>SF = 1</code>, the number of <code>LINEORDER</code> rows selected is $(6/175) * 6,000,000 ≈ 205,714$.</p><h5 id="Q3-2"><a href="#Q3-2" class="headerlink" title="Q3.2"></a>Q3.2</h5><p>Assign the template variables as follows</p><figure class="highlight"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td
class="code"><pre><span class="line">CUSTOMOR_FILTER -> c_nation = 'UNITED STATES'</span><br><span class="line">SUPPLIER_FILTER -> s_nation = 'UNITED STATES'</span><br><span class="line">DATE_FILTER -> d_year >= 1992 and d_year <= 1997</span><br></pre></td></tr></table></figure><p>The complete <code>SQL</code> is</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">select</span> c_city, s_city, d_year, <span class="keyword">sum</span>(lo_revenue) <span class="keyword">as</span> revenue</span><br><span class="line"> <span class="keyword">from</span> customer, lineorder, supplier, <span class="built_in">date</span></span><br><span class="line"> <span class="keyword">where</span> lo_custkey = c_custkey</span><br><span class="line"> <span class="keyword">and</span> lo_suppkey = s_suppkey</span><br><span class="line"> <span class="keyword">and</span> lo_orderdate = d_datekey</span><br><span class="line"> <span class="keyword">and</span> c_nation = <span class="string">'UNITED STATES'</span></span><br><span class="line"> <span class="keyword">and</span> s_nation = <span class="string">'UNITED STATES'</span></span><br><span class="line"> <span class="keyword">and</span> d_year >= <span class="number">1992</span> <span class="keyword">and</span> d_year <= <span class="number">1997</span></span><br><span class="line"> <span class="keyword">group</span> <span class="keyword">by</span> c_city, s_city, d_year</span><br><span class="line"> <span class="keyword">order</span> <span class="keyword">by</span> d_year <span class="keyword">asc</span>, revenue <span
class="keyword">desc</span>;</span><br></pre></td></tr></table></figure><p>Here $FF = (1/25) * (1/25) * (6/7) = 6/4375$; for <code>SF = 1</code>, the number of <code>LINEORDER</code> rows selected is $(6/4375) * 6,000,000 ≈ 8,228$.</p><h5 id="Q3-3"><a href="#Q3-3" class="headerlink" title="Q3.3"></a>Q3.3</h5><p>Assign the template variables as follows</p><figure class="highlight"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">CUSTOMOR_FILTER -> (c_city='UNITED KI1' or c_city='UNITED KI5')</span><br><span class="line">SUPPLIER_FILTER -> (s_city='UNITED KI1' or s_city='UNITED KI5')</span><br><span class="line">DATE_FILTER -> d_year >= 1992 and d_year <= 1997</span><br></pre></td></tr></table></figure><p>The complete <code>SQL</code> is</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">select</span> c_city, s_city, d_year, <span class="keyword">sum</span>(lo_revenue) <span class="keyword">as</span> revenue</span><br><span class="line"> <span class="keyword">from</span> customer, lineorder, supplier, <span class="built_in">date</span></span><br><span class="line"> <span class="keyword">where</span> lo_custkey = c_custkey</span><br><span class="line"> <span class="keyword">and</span> lo_suppkey = s_suppkey</span><br><span class="line"> <span class="keyword">and</span> lo_orderdate = d_datekey</span><br><span class="line"> <span class="keyword">and</span> (c_city=<span class="string">'UNITED KI1'</span> <span class="keyword">or</span> c_city=<span class="string">'UNITED KI5'</span>)</span><br><span class="line"> 
<span class="keyword">and</span> (s_city=<span class="string">'UNITED KI1'</span> <span class="keyword">or</span> s_city=<span class="string">'UNITED KI5'</span>)</span><br><span class="line"> <span class="keyword">and</span> d_year >= <span class="number">1992</span> <span class="keyword">and</span> d_year <= <span class="number">1997</span></span><br><span class="line"> <span class="keyword">group</span> <span class="keyword">by</span> c_city, s_city, d_year</span><br><span class="line"> <span class="keyword">order</span> <span class="keyword">by</span> d_year <span class="keyword">asc</span>, revenue <span class="keyword">desc</span>;</span><br></pre></td></tr></table></figure><p>Here $FF = (1/125) * (1/125) * (6/7) = 6/109375$; for <code>SF = 1</code>, the number of <code>LINEORDER</code> rows selected is $(6/109375) * 6,000,000 ≈ 329$.</p><h5 id="Q3-4"><a href="#Q3-4" class="headerlink" title="Q3.4"></a>Q3.4</h5><p>Assign the template variables as follows</p><figure class="highlight"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">CUSTOMOR_FILTER -> (c_city='UNITED KI1' or c_city='UNITED KI5')</span><br><span class="line">SUPPLIER_FILTER -> (s_city='UNITED KI1' or s_city='UNITED KI5')</span><br><span class="line">DATE_FILTER -> d_yearmonth = 'Dec1997'</span><br></pre></td></tr></table></figure><p>The complete <code>SQL</code> is</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">select</span> c_city, s_city, d_year, <span class="keyword">sum</span>(lo_revenue) <span class="keyword">as</span> 
revenue</span><br><span class="line"> <span class="keyword">from</span> customer, lineorder, supplier, <span class="built_in">date</span></span><br><span class="line"> <span class="keyword">where</span> lo_custkey = c_custkey</span><br><span class="line"> <span class="keyword">and</span> lo_suppkey = s_suppkey</span><br><span class="line"> <span class="keyword">and</span> lo_orderdate = d_datekey</span><br><span class="line"> <span class="keyword">and</span> (c_city=<span class="string">'UNITED KI1'</span> <span class="keyword">or</span> c_city=<span class="string">'UNITED KI5'</span>)</span><br><span class="line"> <span class="keyword">and</span> (s_city=<span class="string">'UNITED KI1'</span> <span class="keyword">or</span> s_city=<span class="string">'UNITED KI5'</span>)</span><br><span class="line"> <span class="keyword">and</span> d_yearmonth = <span class="string">'Dec1997'</span></span><br><span class="line"> <span class="keyword">group</span> <span class="keyword">by</span> c_city, s_city, d_year</span><br><span class="line"> <span class="keyword">order</span> <span class="keyword">by</span> d_year <span class="keyword">asc</span>, revenue <span class="keyword">desc</span>;</span><br></pre></td></tr></table></figure><p>Here $FF = (1/125) * (1/125) * (1/84) = 1/1,312,500$; for <code>SF = 1</code>, the number of <code>LINEORDER</code> rows selected is $(1/1,312,500) * 6,000,000 ≈ 5$.</p><h4 id="Q4"><a href="#Q4" class="headerlink" title="Q4"></a>Q4</h4><p>The scenario for the fourth class of queries: given restrictions on <code>supplier</code>, <code>customer</code>, <code>part</code> and <code>time</code>, compute the <code>total profit</code> for each combination of <code>customer nation</code>, <code>supplier nation</code> and <code>year</code>. This scenario restricts the data through four dimension tables.</p><p>The rows selected by <code>Q4</code> do not overlap with those of <code>Q1</code>, <code>Q2</code> or <code>Q3</code>, so there is no <code>Cache</code> effect across these classes. However, <code>Q4.2</code> and <code>Q4.3</code> each select a subset of the rows of the preceding query and are therefore affected by the <code>Cache</code>. This is inherent to drill-down analysis of this kind and cannot be avoided.</p><h5 id="Q4-1"><a href="#Q4-1" class="headerlink" title="Q4.1"></a>Q4.1</h5><figure class="highlight sql"><table><tr><td
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">select</span> d_year, c_nation, <span class="keyword">sum</span>(lo_revenue - lo_supplycost) <span class="keyword">as</span> profit</span><br><span class="line"> <span class="keyword">from</span> <span class="built_in">date</span>, customer, supplier, part, lineorder</span><br><span class="line"> <span class="keyword">where</span> lo_custkey = c_custkey</span><br><span class="line"> <span class="keyword">and</span> lo_suppkey = s_suppkey</span><br><span class="line"> <span class="keyword">and</span> lo_partkey = p_partkey</span><br><span class="line"> <span class="keyword">and</span> lo_orderdate = d_datekey</span><br><span class="line"> <span class="keyword">and</span> c_region = <span class="string">'AMERICA'</span></span><br><span class="line"> <span class="keyword">and</span> s_region = <span class="string">'AMERICA'</span></span><br><span class="line"> <span class="keyword">and</span> (p_mfgr = <span class="string">'MFGR#1'</span> <span class="keyword">or</span> p_mfgr = <span class="string">'MFGR#2'</span>)</span><br><span class="line"> <span class="keyword">group</span> <span class="keyword">by</span> d_year, c_nation</span><br><span class="line"> <span class="keyword">order</span> <span class="keyword">by</span> d_year, c_nation;</span><br></pre></td></tr></table></figure><p>Here $FF = (1/5) * (1/5) * (2/5) = 2/125$; for <code>SF = 1</code>, the number of <code>LINEORDER</code> rows selected is $(2/125) * 6,000,000 ≈ 96000$.</p><h5 id="Q4-2"><a href="#Q4-2" class="headerlink" title="Q4.2"></a>Q4.2</h5><p>Suppose the result of <code>Q4.1</code> shows that profit grew by 40% over <code>1997 ~ 1998</code>. We then want to look into what drove the increase during that period, for example by grouping on <code>s_nation</code> and <code>p_category</code>.</p><p>As with every drill-down step in <code>Q4</code>, this query reads a subset of the rows read by the previous one, so it is affected by the <code>Cache</code>. Since this is how such an analysis naturally proceeds, it cannot be avoided.</p><p>This leads to the following <code>SQL</code></p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">select</span> d_year, s_nation, p_category, <span class="keyword">sum</span>(lo_revenue - lo_supplycost) <span class="keyword">as</span> profit</span><br><span class="line"> <span class="keyword">from</span> <span class="built_in">date</span>, customer, supplier, part, lineorder</span><br><span class="line"> <span class="keyword">where</span> lo_custkey = c_custkey</span><br><span class="line"> <span class="keyword">and</span> lo_suppkey = s_suppkey</span><br><span class="line"> <span class="keyword">and</span> lo_partkey = p_partkey</span><br><span class="line"> <span class="keyword">and</span> lo_orderdate = d_datekey</span><br><span class="line"> <span class="keyword">and</span> c_region = <span class="string">'AMERICA'</span></span><br><span class="line"> <span class="keyword">and</span> s_region = <span class="string">'AMERICA'</span></span><br><span class="line"> <span class="keyword">and</span> (d_year = <span class="number">1997</span> <span class="keyword">or</span> d_year = <span class="number">1998</span>)</span><br><span class="line"> <span class="keyword">and</span> (p_mfgr = <span
class="string">'MFGR#1'</span> <span class="keyword">or</span> p_mfgr = <span class="string">'MFGR#2'</span>)</span><br><span class="line"> <span class="keyword">group</span> <span class="keyword">by</span> d_year, s_nation, p_category</span><br><span class="line"> <span class="keyword">order</span> <span class="keyword">by</span> d_year, s_nation, p_category;</span><br></pre></td></tr></table></figure><p>Here $FF = (1/5) * (1/5) * (2/7) * (2/5) = 4/875$; for <code>SF = 1</code>, the number of <code>LINEORDER</code> rows selected is $(4/875) * 6,000,000 ≈ 27,428$.</p><h5 id="Q4-3"><a href="#Q4-3" class="headerlink" title="Q4.3"></a>Q4.3</h5><p>Suppose the result of <code>Q4.2</code> shows that most of the profit growth over <code>1997 ~ 1998</code> came from <code>s_nation = 'UNITED STATES'</code> and <code>p_category = 'MFGR#14'</code>. We now want to drill down further, to <code>cities</code> in the United States and to the parts' <code>p_brand1</code>, to see the details.</p><p>This leads to the following <code>SQL</code></p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">select</span> d_year, s_city, p_brand1, <span class="keyword">sum</span>(lo_revenue - lo_supplycost) <span class="keyword">as</span> profit</span><br><span class="line"> <span class="keyword">from</span> <span class="built_in">date</span>, customer, supplier, part, lineorder</span><br><span class="line"> <span class="keyword">where</span> lo_custkey = c_custkey</span><br><span class="line"> <span class="keyword">and</span> lo_suppkey = s_suppkey</span><br><span class="line"> <span class="keyword">and</span> lo_partkey = p_partkey</span><br><span class="line"> <span class="keyword">and</span> 
lo_orderdate = d_datekey</span><br><span class="line"> <span class="keyword">and</span> c_region = <span class="string">'AMERICA'</span></span><br><span class="line"> <span class="keyword">and</span> s_nation = <span class="string">'UNITED STATES'</span></span><br><span class="line"> <span class="keyword">and</span> (d_year = <span class="number">1997</span> <span class="keyword">or</span> d_year = <span class="number">1998</span>)</span><br><span class="line"> <span class="keyword">and</span> p_category = <span class="string">'MFGR#14'</span></span><br><span class="line"> <span class="keyword">group</span> <span class="keyword">by</span> d_year, s_city, p_brand1</span><br><span class="line"> <span class="keyword">order</span> <span class="keyword">by</span> d_year, s_city, p_brand1;</span><br></pre></td></tr></table></figure><p>Here $FF = (1/5) * (1/25) * (2/7) * (1/25) = 2/21875$; for <code>SF = 1</code>, the number of <code>LINEORDER</code> rows selected is $(2/21875) * 6,000,000 ≈ 549$.</p><h3 id="查询分析"><a href="#查询分析" class="headerlink" title="查询分析"></a>Query Analysis</h3><p><img src="/images/star-schema-benchmark/Table_3.1_FF_Analysis_of_Queries_in_Section_3.1.png" alt="Table 3.1. FF Analysis of Queries in Section 3.1"></p><p>The underlined <code>FF</code> in the table is, for each query, the smallest <code>FF</code> among its predicates on indexable dimension columns. The most effective way to speed up a query that restricts an indexable dimension column is to sort <code>LINEORDER</code> by that column; otherwise, an index on the column will probably not reduce the number of disk pages that must be read.</p><blockquote><p>I did not fully understand this part, so the original text is quoted here</p><p>The underlined FF for each query distinguishes the smallest FF over the indexable dimension column predicate. The most valuable way we can speed up a query which has an indexable dimension column restriction is to sort the LINEORDER by that column; Otherwise, indexes on such columns will probably not limit the number of disk pages that must be accessed. Note that by breaking ties for underlining away from supplier, we can avoid underlines in the supplier city roll-up column in Table 3.1. Thus we can avoid a LINEORDER sort by s_city. 
The query set suggests sorts by time, part brand roll-up and (customer roll-up, supplier roll-up).</p><p>We see that Q4 shifts from customer-sort to part-sort as best match between Q4.1 and Q4.3. </p></blockquote><h2 id="测试数据"><a href="#测试数据" class="headerlink" title="测试数据"></a>Test Data</h2><p>The data for the <code>SSB</code> benchmark can be generated with <a href="https://github.com/eyalroz/ssb-dbgen">dbgen</a>, as follows</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">git clone git@github.com:eyalroz/ssb-dbgen.git</span><br><span class="line">cd ssb-dbgen</span><br><span class="line">cmake . && cmake --build .</span><br><span class="line">mkdir data</span><br><span class="line">mv dbgen ./data</span><br><span class="line">cp dists.dss ./data</span><br><span class="line">cd data</span><br><span class="line">./dbgen -v -s 10</span><br></pre></td></tr></table></figure><p>The tool's command-line options</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span
class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br></pre></td><td class="code"><pre><span class="line">./dbgen -h</span><br><span class="line">SSB (Star Schema Benchmark) Population Generator (Version 1.0.0)</span><br><span class="line">Copyright Transaction Processing Performance Council 1994 - 2000</span><br><span class="line">USAGE:</span><br><span class="line">dbgen [-{vfFD}] [-O {fhmsv}][-T {pcsdla}]</span><br><span class="line">[-s <scale>][-C <procs>][-S <step>]</span><br><span class="line">dbgen [-v] [-O {dfhmr}] [-s <scale>] [-U <updates>] [-r <percent>]</span><br><span class="line"></span><br><span class="line">-b <s> -- load distributions for <s></span><br><span class="line">-C <n> -- use <n> processes to generate data</span><br><span class="line"> [Under DOS, must be used with -S]</span><br><span class="line">-D -- do database load in line</span><br><span class="line">-d <n> -- split deletes between <n> files</span><br><span class="line">-f -- force. 
Overwrite existing files</span><br><span class="line">-F -- generate flat files output</span><br><span class="line">-h -- display this message</span><br><span class="line">-i <n> -- split inserts between <n> files</span><br><span class="line">-n <s> -- inline load into database <s></span><br><span class="line">-O d -- generate SQL syntax for deletes</span><br><span class="line">-O f -- over-ride default output file names</span><br><span class="line">-O h -- output files with headers</span><br><span class="line">-O m -- produce columnar output</span><br><span class="line">-O r -- generate key ranges for deletes.</span><br><span class="line">-O v -- Verify data set without generating it.</span><br><span class="line">-q -- enable QUIET mode</span><br><span class="line">-r <n> -- updates refresh (n/100)% of the</span><br><span class="line"> data set</span><br><span class="line">-s <n> -- set Scale Factor (SF) to <n></span><br><span class="line">-S <n> -- build the <n>th step of the data/update set</span><br><span class="line">-T c -- generate cutomers dimension table ONLY</span><br><span class="line">-T p -- generate parts dimension table ONLY</span><br><span class="line">-T s -- generate suppliers dimension table ONLY</span><br><span class="line">-T d -- generate date dimension table ONLY</span><br><span class="line">-T l -- generate lineorder fact table ONLY</span><br><span class="line">-U <s> -- generate <s> update sets</span><br><span class="line">-v -- enable VERBOSE mode</span><br><span class="line"></span><br><span class="line">To generate the SF=1 (1GB), validation database population, use:</span><br><span class="line">dbgen -vfF -s 1</span><br><span class="line"></span><br><span class="line">To generate updates for a SF=1 (1GB), use:</span><br><span class="line">dbgen -v -U 1 -s 1</span><br></pre></td></tr></table></figure>]]></content>
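<summary type="html"><h2 id="简介"><a href="#简介" class="headerlink" title="简介"></a>Introduction</h2><p><code>SSB</code> (Star Schema Benchmark) is a data model based on real-world business applications, defined by researchers at the University of Massachusetts Boston. It is widely regarded in the industry as a fair and neutral way to simulate decision-support applications, and both academia and industry commonly use it to evaluate the performance of decision-support technology.<br><code>SSB</code> is derived from the <code>TPC-H</code> standard published by <code>TPC</code> (Transaction Processing Performance Council). It changes the snowflake schema of <code>TPC-H</code> into a star schema and replaces the complex <code>Ad-Hoc</code> queries of <code>TPC-H</code> with more rigidly structured <code>OLAP</code> queries.</p>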
<blockquote>
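<p>The Transaction Processing Performance Council (TPC) is a non-profit organization founded by several dozen member companies and headquartered in the United States. It is open to the whole world, but so far the vast majority of its members are large companies from the United States, Japan, and Western Europe. Its members are mainly computer hardware and software vendors rather than computer users. Its role is to define the standard specifications and the performance and price metrics for business-application benchmarks, and to manage the publication of test results.</p>
<p>Quoted from Baidu Baike: <a href="https://baike.baidu.com/item/TPC/1814556">TPC (Transaction Processing Performance Council)</a></p>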
</blockquote>
<p>The reason for not using <code>TPC-H</code> directly is the wish to provide more general functional coverage and selectivity coverage:</p>
<ol>
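<li>Functional coverage: choose queries that span multiple tables wherever possible, to stay close to real-world usage</li>
<li>Selectivity coverage: filter the fact table through conditions on the dimension tables, so that the filtered result set is relatively small</li>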
</ol>
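<p>To make selectivity coverage a little more concrete, here is a rough sketch of how the fraction of fact-table rows that survives a set of dimension filters can be estimated. The four fractions are the ones this post uses for a query that filters on <code>c_region</code>, <code>s_nation</code>, <code>d_year</code> and <code>p_category</code>; the independence of the filters and the figure of 6,000,000 <code>LINEORDER</code> rows at <code>SF = 1</code> are assumptions of the sketch, and the variable names are made up for illustration.</p>
<figure class="highlight python"><table><tr><td class="code"><pre><span class="line">from fractions import Fraction</span><br><span class="line"></span><br><span class="line"># Per-dimension filter fractions: c_region (1/5), s_nation (1/25),</span><br><span class="line"># d_year in two of seven years (2/7), p_category (1/25).</span><br><span class="line">filter_factors = [Fraction(1, 5), Fraction(1, 25), Fraction(2, 7), Fraction(1, 25)]</span><br><span class="line"></span><br><span class="line"># Assumption: the filters are independent, so the fractions simply multiply.</span><br><span class="line">combined = Fraction(1)</span><br><span class="line">for ff in filter_factors:</span><br><span class="line">    combined *= ff</span><br><span class="line"></span><br><span class="line">LINEORDER_ROWS_SF1 = 6_000_000  # assumed LINEORDER size at SF = 1</span><br><span class="line">expected_rows = round(combined * LINEORDER_ROWS_SF1)</span><br><span class="line"></span><br><span class="line">print(combined)       # 2/21875</span><br><span class="line">print(expected_rows)  # 549</span><br></pre></td></tr></table></figure>
<p>Exact fractions are used instead of floats so that the combined value stays readable as 2/21875 rather than a rounded decimal.</p>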
<p>A few concepts:</p>
<ol>
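<li>SF (Scale Factor): the data-volume scale factor passed in when generating the test data set; it determines how many rows are generated for each table.</li>
<li>FF (Filter Factor): each WHERE predicate selects a fraction of the rows; the ratio of selected rows to the rows before filtering is called the FF. Provided the filtered columns are independent of one another, a table's FF is the product of the FFs of the individual predicates on that table.</li>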
</ol></summary>
<category term="olap" scheme="https://andrewei1316.github.io/categories/olap/"/>
<category term="benchmark" scheme="https://andrewei1316.github.io/categories/olap/benchmark/"/>
<category term="olap" scheme="https://andrewei1316.github.io/tags/olap/"/>
<category term="benchmark" scheme="https://andrewei1316.github.io/tags/benchmark/"/>
<category term="ssb" scheme="https://andrewei1316.github.io/tags/ssb/"/>
</entry>
<entry>
<title>Reading Notes on "Column-Stores vs. Row-Stores: How Different Are They Really?"</title>
<link href="https://andrewei1316.github.io/2020/11/20/column-stores-vs-row-stores/"/>
<id>https://andrewei1316.github.io/2020/11/20/column-stores-vs-row-stores/</id>
<published>2020-11-20T13:11:35.000Z</published>
<updated>2020-12-13T06:31:06.073Z</updated>
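<content type="html"><![CDATA[<h2 id="摘要"><a href="#摘要" class="headerlink" title="摘要"></a>Abstract</h2><p>This paper mainly discusses why, in the OLAP domain, column-oriented storage and execution are faster than row-oriented storage and execution.</p><p>In OLAP scenarios, benchmarks consistently report that column-oriented storage and execution are an order of magnitude faster than their row-oriented counterparts. The commonly accepted explanation is:</p><blockquote><p>column-stores are more I/O efficient for read-only queries since they only have to read from disk (or from memory) those attributes accessed by a query.</p><p>In other words, a column-store does less I/O on read-only queries because it reads only the fields the query needs.</p></blockquote><p>This view leads people to believe that even a row-store can match column-store performance through certain optimizations, including:</p><ol><li>Vertical partitioning</li><li>Indexing every column</li></ol><p>These optimizations let a query read only the data of the columns it needs, which speeds up analysis.</p><p>A series of experiments shows, however, that these techniques do not bring a row-store up to column-store performance. The reason is that, beyond its storage advantage, a column-store also benefits from the following optimizations at execution time:</p><ol><li>Compression</li><li>Late Materialization</li><li>Block Iteration</li><li>Invisible Join</li></ol><p>The first three already exist in current column-oriented systems; the last is a new strategy proposed in this paper and is described in detail in later sections.</p><a id="more"></a><h2 id="简介"><a href="#简介" class="headerlink" title="简介"></a>Introduction</h2><p>In recent years many papers on column-store database systems have presented data showing that, in certain domains and especially on read-intensive analytical workloads, column-oriented systems outperform row-oriented systems by an order of magnitude. But the way these systems were benchmarked is rather "traditional": the test plans are still designed around the data-structure thinking of row-oriented systems. They reveal the potential of column-oriented systems, but they do not answer the most important question:</p><blockquote><p>Are these performance gains due to something fundamental about the way column-oriented DBMSs are internally architected, or would such gains also be possible in a conventional system that used a more column-oriented physical design?</p><p>That is: are column-oriented systems better because of their internal architecture (personally, I take this to mean the columnar storage format)? Or could a traditional system reach the same performance through a column-oriented physical design (that is, by emulating columnar storage)?</p></blockquote><p>To answer this question, the paper uses the Star Schema Benchmark (SSBM), a typical data-warehouse benchmark schema, and emulates column-store designs inside a row-oriented system. The emulated designs are:</p><ol><li>Vertically partitioning the tables: split each table into two-column (key-value) tables, so that a query only needs to access the specific columns it uses</li><li>Index-only plans: create a set of indexes on each table that covers every column needed by the queries, so that field values can be read directly from the indexes during query execution</li><li>Materialized views: build materialized views for all the queries, to obtain the best possible query performance</li></ol><p>The conclusion from these experiments is that <strong>even when column-oriented ideas are applied to the storage design of a row-oriented system, its performance on analytical workloads still cannot compete with a column-oriented system</strong>.</p><p>With that conclusion in hand, we turn to the next question:</p><blockquote><p>Which of the many 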
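column-database specific optimizations proposed in the literature are most responsible for the significant performance advantage of column-stores over row-stores on warehouse workloads?</p><p>That is: in data-warehouse scenarios, exactly which optimizations make column-oriented systems outperform row-oriented ones?</p></blockquote><p>The paper gives the answer directly. The main optimizations in column-oriented systems are:</p><ol><li>Late Materialization</li><li>Block Iteration</li><li>Compression</li><li>Invisible Joins, a new optimization proposed in this paper</li></ol><p>At this point the paper raises another question:</p><blockquote><p>However, because each of these techniques was described in a separate research paper, no work has analyzed exactly which of these gains are most significant.</p><p>Quantitatively, how much performance does each of these optimizations contribute?</p></blockquote><p>In the work that follows, the authors remove these optimizations from the C-Store database one at a time and measure the effect. They find that:</p><ol><li><strong>The gain from compression depends on the data; at best it can reach an order of magnitude</strong></li><li><strong>Late materialization gives about a 3x improvement</strong></li><li><strong>Block iteration and Invisible Joins give about a 1.5x improvement</strong></li></ol><p>The following sections describe the details of these experiments.</p><h2 id="面向行的执行"><a href="#面向行的执行" class="headerlink" title="面向行的执行"></a>Row-Oriented Execution</h2><p>This section discusses how vertical partitioning, index-only plans, and materialized views affect the performance of a row-oriented system.</p><h3 id="垂直分表(Vertical-Partitioning)"><a href="#垂直分表(Vertical-Partitioning)" class="headerlink" title="垂直分表(Vertical Partitioning)"></a>Vertical Partitioning</h3><p>Vertical partitioning requires some mechanism for stitching the fields back together into the original records. The obvious approach is to add the primary key to every partition, but primary keys can be large and are often composite, so in practice an integer "position" column is added to every table instead. If a query needs several columns, it then has to perform extra joins on the position column. Adding indexes to speed up those joins increases storage and I/O overhead again, so it is hard to come out ahead.</p><h3 id="全索引执行计划(Index-only-plans)"><a href="#全索引执行计划(Index-only-plans)" class="headerlink" title="全索引执行计划(Index-only plans)"></a>Index-only Plans</h3><p>Vertical partitioning introduces two new problems:</p><ol><li>It needs a <code>position</code> field on every column to reconstruct the original records, which wastes a great deal of disk space and disk bandwidth</li><li>Most row-stores keep a relatively large header on every tuple, which wastes even more disk space</li></ol><p>To avoid these problems, we consider index-only plans.</p><p>To avoid going back to the base table, every predicate column in the query must be evaluated through that column's index to produce a list of matching primary keys, and the lists are then merged. If the predicate on some column cannot be filtered quickly through its index, the index has to be scanned in full, possibly more than once. In the end, scanning several indexes and merging the primary-key lists can be slower than a single pass that scans and filters the base table. The heavy redundancy of metadata and header information also causes a large performance loss. Overall, both the storage and the I/O costs are high.</p><h2 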
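id="物化视图(materialization-view)"><a href="#物化视图(materialization-view)" class="headerlink" title="物化视图(materialization view)"></a>Materialized Views</h2><p>Materialized views are generated entirely from the predefined SQL queries and contain no columns beyond those the queries need. This obviously gives very good query performance (at the cost of poor insert performance) and high I/O efficiency, but it only works for an extremely limited set of scenarios.</p><h2 id="面向列的执行"><a href="#面向列的执行" class="headerlink" title="面向列的执行"></a>Column-Oriented Execution</h2><p>This section discusses the performance impact of compression, late materialization, block iteration, and the Invisible Join.</p><h3 id="压缩(Compression)"><a href="#压缩(Compression)" class="headerlink" title="压缩(Compression)"></a>Compression</h3><p>Compression means placing highly similar, low-entropy data together so that the same information can be expressed in less space. It is therefore more effective in a column-store than in a row-store: a column-store keeps the values of one column together, and values in the same column usually share a type and many other characteristics, so they compress more easily. Chasing a high compression ratio alone is pointless, though, and may even reduce execution efficiency, because most operators must decompress the data before working on it, and a higher compression ratio usually means slower decompression.</p><p>The advantages of compression are:</p><ol><li><p>Compressed data is smaller, so it takes less disk space, and with less data on disk there is less I/O pressure when reading</p></li><li><p>Sometimes decompression can be skipped entirely and operators can work directly on the compressed data. For example, some operations can be applied directly to data compressed with <code>Run-Length</code> encoding</p><blockquote><p>Run-length encoding works roughly as follows: the original sequence 1, 1, 2, 2, 3, 3, 3 is stored as 1 * 2, 2 * 2, 3 * 3 (value * run length). A sum or count over the column can then be computed directly as sum = 1 * 2 + 2 * 2 + 3 * 3 = 15 and count = 2 + 2 + 3 = 7, which needs no decompression and is also faster to compute.</p></blockquote></li></ol><p>Compression also works best in combination with <code>sort</code>: when the data is sorted, neighboring values are more likely to be alike, which gives a better compression ratio.</p><h3 id="延迟物化(Late-Materialization)"><a href="#延迟物化(Late-Materialization)" class="headerlink" title="延迟物化(Late Materialization)"></a>Late Materialization</h3><p>Materialization means that, at some stage of the query, the underlying column-oriented storage format has to be converted into the (row-oriented) format the client expects.</p><p>Why is materialization needed at all?</p><p>A query usually touches only some of the columns. In a row-store, execution has to fetch and parse whole rows, extract the fields it needs, and throw the rest away. In a column-store each column is stored separately, so only the columns the query needs are read. That data is still organized by column in memory, so at some point during execution it has to be turned into a row-oriented form.</p><p>Late materialization has several advantages:</p><ol><li><code>select</code> and <code>aggregation</code> operations do not actually need whole rows, so materializing too early wastes work</li><li>If the data is compressed, materialization forces it to be decompressed, which undermines the benefits of compression</li><li>A columnar in-memory layout is very friendly to the CPU cache and so improves execution efficiency; a row layout, by contrast, fills cache lines with columns the query does not need and uses the cache poorly</li><li>Block iteration over fixed-width columns can treat the data as an array and exploit many CPU features (SIMD, cache-line alignment, CPU pipelining, and so on); in a row-store, by contrast, column types and widths usually differ and there are many variable-length fields, so it is hard to accelerate</li></ol><h2 id="块迭代计算(Block-Iteration)"><a 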
href="#块迭代计算(Block-Iteration)" class="headerlink" title="块迭代计算(Block Iteration)"></a>Block Iteration</h2><p>In the row-store model, every operator first iterates to the next tuple, then extracts the value of a field from it through a defined interface, and only then operates on the value. Each tuple processed therefore costs one or two extra function calls just to fetch the data (this is generally known as the Volcano model).</p><p>In a column-store, each block iteration returns many values at once, and when an operator works on a column, a whole block of that column's values can be passed to the processing function. No extra function calls are needed to fetch values, and if the column is fixed-width it can be iterated over directly as an array.</p><p>This approach can take full advantage of many CPU features (SIMD, cache-line alignment, CPU pipelining, and so on).</p><h3 id="隐式连接(Invisible-Join)"><a href="#隐式连接(Invisible-Join)" class="headerlink" title="隐式连接(Invisible Join)"></a>Invisible Join</h3><p>Suppose we have the following table schema:</p><p><img src="/images/column_store_vs_row_store/Figure1_Schema_of_the_SSBM_Benchmark.png" alt="Figure 1: Schema of the SSBM Benchmark"></p><p>For this model, consider a join scenario such as the following SQL:</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">SELECT</span> c.nation, s.nation, d.year, <span class="keyword">sum</span>(lo.revenue) <span class="keyword">as</span> revenue</span><br><span class="line"><span class="keyword">FROM</span> customer <span class="keyword">AS</span> c, lineorder <span class="keyword">AS</span> lo, supplier <span class="keyword">AS</span> s, dwdate <span class="keyword">AS</span> d</span><br><span class="line"><span class="keyword">WHERE</span> lo.custkey = c.custkey</span><br><span class="line"><span class="keyword">AND</span> lo.suppkey = s.suppkey</span><br><span class="line"><span class="keyword">AND</span> lo.orderdate = d.datekey</span><br><span class="line"><span class="keyword">AND</span> c.region = 'ASIA'</span><br><span class="line"><span class="keyword">AND</span> s.region = 'ASIA'</span><br><span 
class="line"><span class="keyword">AND</span> d.year >= <span class="number">1992</span> <span class="keyword">and</span> d.year <= <span class="number">1997</span></span><br><span class="line"><span class="keyword">GROUP</span> <span class="keyword">BY</span> c.nation, s.nation, d.year</span><br><span class="line"><span class="keyword">ORDER</span> <span class="keyword">BY</span> d.year <span class="keyword">asc</span>, revenue <span class="keyword">desc</span>;</span><br></pre></td></tr></table></figure><p>Generally speaking, there are two ways to execute this join.</p><h4 id="按照-Selectivity-来-join"><a href="#按照-Selectivity-来-join" class="headerlink" title="按照 Selectivity 来 join"></a>Join ordered by selectivity</h4><p>This approach is simple: the joins are executed one after another in order of predicate selectivity.</p><p>For example, if <code>c.region = 'ASIA'</code> is highly selective, that condition is used first to filter the <code>customer</code> table, and the filtered <code>customer</code> table is then <code>join</code>ed with the <code>lineorder</code> table, so the <code>nation</code> field of <code>customer</code> is added to the resulting <code>customer-order</code> table. The <code>join</code>s with the <code>supplier</code> and <code>date</code> tables are then done in the same way.</p><p>The drawback is that the <code>join</code> starts right at the beginning, so the rest of the plan cannot benefit from the late materialization described above.</p><h4 id="传统延迟物化的-join"><a href="#传统延迟物化的-join" class="headerlink" title="传统延迟物化的 join"></a>Traditional late-materialized join</h4><p>This approach avoids performing the <code>join</code> up front. It works as follows:</p><ol><li>Filter the <code>customer</code> table with <code>c.region = 'ASIA'</code> to obtain the set of matching <code>customer key</code>s, and also record the positions of the matching records in the <code>customer</code> table</li><li>Use the <code>customer key</code>s from step <code>1</code> to filter the <code>lineorder</code> table and obtain the positions of the matching records</li><li>Walk the position list from step <code>2</code> and extract <code>supplier key</code>, <code>order date</code>, and <code>revenue</code>; then, using the <code>customer key</code> and the position information from step <code>1</code>, extract the <code>c.nation</code> field from the <code>customer</code> table</li><li>Handle the <code>join</code>s with the <code>supplier</code> and <code>date</code> tables in the same way</li></ol><p>The drawback is that extracting <code>c.nation</code> in step <code>3</code> is a random access, which is expensive.</p><h4 id="Invisible-join"><a href="#Invisible-join" class="headerlink" title="Invisible join"></a>Invisible 
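join</h4><p>The <code>Invisible Join</code> is a new method proposed in this paper for <code>join</code>s over the star schema described above. It addresses the weakness of the traditional late-materialized join by keeping random reads to a minimum, which improves performance. The steps are:</p><ol><li><p>Apply the corresponding predicate to each dimension table to obtain the <code>key</code>s of the matching records in that dimension table (<code>dimension table</code>). These <code>key</code>s are also foreign keys (<code>foreign key</code>) in the fact table (<code>the fact table</code>).</p><p> <img src="/images/column_store_vs_row_store/Figure2_The_%EF%AC%81rst_phase_of_the_joins_needed_to_execute_Query_3.1_from_the_Star_Schema_benchmark_on_some_sample_data.png" alt="Figure 2: The first phase of the joins needed to execute Query 3.1 from the Star Schema benchmark on some sample data"></p></li><li><p>Scan each foreign-key column of the fact table and check every value against the <code>key</code>s from step <code>1</code>, producing for each column a <code>bitmap</code> of the positions of the matching records. <code>AND</code> these <code>bitmap</code>s together to get the <code>bitmap</code> of the final filter result</p><p> <img src="/images/column_store_vs_row_store/Figure3_The_second_phase_of_the_joins_needed_to_execute_Query_3.1_from_the_Star_Schema_benchmark_on_some_sample_data.png" alt="Figure 3: The second phase of the joins needed to execute Query 3.1 from the Star Schema benchmark on some sample data"></p></li><li><p>Use the <code>bitmap</code> from step <code>2</code> to extract, for each dimension table in turn, the foreign-key values from the fact table, and use those keys to fetch the other dimension-table columns the query needs. If a dimension table's keys are sorted, contiguous values starting from <code>1</code>, its columns can be fetched much like indexing into an array (which is far faster than the traditional late-materialized approach).</p><p> <img src="/images/column_store_vs_row_store/Figure4_The_third_phase_of_the_joins_needed_to_execute_Query_3.1_from_the_Star_Schema_benchmark_on_some_sample_data.png" alt="Figure 4: The third phase of the joins needed to execute Query 3.1 from the Star Schema benchmark on some sample data"></p></li></ol><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>Summary</h2><p>The main results of the paper are:</p><ol><li>It showed that trying to emulate a column-store inside a row-store does not give good performance, and that techniques usually regarded as "good" for warehouse performance (index-only plans, bitmap indexes, and so on) do little to improve the situation.</li><li>It proposed a new technique for improving join performance in column-stores, called the invisible join, and showed experimentally that in most cases a join executed with this technique performs as well as, or better than, selecting and extracting data from a denormalized table in which the join has already been materialized. The conclusion is that the denormalization used by row-oriented systems in data-warehouse scenarios is unnecessary in a column-store (which saves a great deal of space).</li><li>It analyzed where column-store performance comes from in data-warehouse workloads, examining how much late materialization, compression, block iteration, and invisible 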
joins each contribute to overall system performance, and showed that simple column-oriented operation (without compression and late materialization) does not significantly outperform a well-optimized row-store design.</li></ol><h2 id="参考资料"><a href="#参考资料" class="headerlink" title="参考资料"></a>References</h2><p><a href="https://zhuanlan.zhihu.com/p/54484592">读后感之《Column-Stores vs. Row-Stores》</a></p><p><a href="https://zhuanlan.zhihu.com/p/54433448">《Column-Stores vs. Row-Stores》读后感</a></p>]]></content>
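<summary type="html"><h2 id="摘要"><a href="#摘要" class="headerlink" title="摘要"></a>Abstract</h2><p>This paper mainly discusses why, in the OLAP domain, column-oriented storage and execution are faster than row-oriented storage and execution.</p>
<p>In OLAP scenarios, benchmarks consistently report that column-oriented storage and execution are an order of magnitude faster than their row-oriented counterparts. The commonly accepted explanation is:</p>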
<blockquote>
<p>column-stores are more I/O efficient for read-only queries since they only have to read from disk (or from memory) those attributes accessed by a query.</p>
<p>In other words, a column-store does less I/O on read-only queries because it reads only the fields the query needs.</p>
</blockquote>
<p>This view leads people to believe that even a row-store can match column-store performance through certain optimizations, including:</p>
<ol>
<li>Vertical partitioning</li>
<li>Indexing every column</li>
</ol>
<p>These optimizations let a query read only the data of the columns it needs, which speeds up analysis.</p>
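<p>As a minimal sketch of the vertical-partitioning idea listed above: a row-oriented table is split into one (position, value) list per column, so that an aggregate touches a single column, while reading several columns requires a join on position. The sample rows, column names, and helper functions are made up for illustration and are not taken from the paper.</p>
<figure class="highlight python"><table><tr><td class="code"><pre><span class="line"># Hypothetical sample rows; the column names are illustrative only.</span><br><span class="line">rows = [</span><br><span class="line">    {"orderkey": 1, "custkey": 10, "revenue": 100},</span><br><span class="line">    {"orderkey": 2, "custkey": 11, "revenue": 250},</span><br><span class="line">    {"orderkey": 3, "custkey": 10, "revenue": 75},</span><br><span class="line">]</span><br><span class="line"></span><br><span class="line">def vertically_partition(rows):</span><br><span class="line">    """Split row-oriented data into one (position, value) list per column."""</span><br><span class="line">    partitions = {}</span><br><span class="line">    for position, row in enumerate(rows):</span><br><span class="line">        for column, value in row.items():</span><br><span class="line">            partitions.setdefault(column, []).append((position, value))</span><br><span class="line">    return partitions</span><br><span class="line"></span><br><span class="line">def read_columns(partitions, columns):</span><br><span class="line">    """Rebuild partial rows by joining the requested columns on position."""</span><br><span class="line">    result = {}</span><br><span class="line">    for column in columns:</span><br><span class="line">        for position, value in partitions[column]:</span><br><span class="line">            result.setdefault(position, {})[column] = value</span><br><span class="line">    return [result[position] for position in sorted(result)]</span><br><span class="line"></span><br><span class="line">partitions = vertically_partition(rows)</span><br><span class="line"># An aggregate over one column reads only that column's partition.</span><br><span class="line">total_revenue = sum(value for _, value in partitions["revenue"])</span><br><span class="line"># Reading two columns needs a join on the position key.</span><br><span class="line">pairs = read_columns(partitions, ["custkey", "revenue"])</span><br><span class="line">print(total_revenue)  # 425</span><br><span class="line">print(pairs[0])       # {'custkey': 10, 'revenue': 100}</span><br></pre></td></tr></table></figure>
<p>Even in this toy form, every stored value carries its position alongside it and multi-column reads pay for a join, which hints at the overheads a real row-store would face with this design.</p>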
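<p>A series of experiments shows, however, that these techniques do not bring a row-store up to column-store performance. The reason is that, beyond its storage advantage, a column-store also benefits from the following optimizations at execution time:</p>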
<ol>
<li>Compression</li>
<li>Late Materialization</li>
<li>Block Iteration</li>
<li>Invisible Join</li>
</ol>
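<p>The first three already exist in current column-oriented systems; the last is a new strategy proposed in this paper and is described in detail in later sections.</p></summary>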
<category term="olap" scheme="https://andrewei1316.github.io/categories/olap/"/>
<category term="存储" scheme="https://andrewei1316.github.io/categories/olap/%E5%AD%98%E5%82%A8/"/>
<category term="olap" scheme="https://andrewei1316.github.io/tags/olap/"/>
<category term="存储" scheme="https://andrewei1316.github.io/tags/%E5%AD%98%E5%82%A8/"/>
</entry>
<entry>
<title>A Summary of the Google File System</title>
<link href="https://andrewei1316.github.io/2020/10/05/google-file-system/"/>
<id>https://andrewei1316.github.io/2020/10/05/google-file-system/</id>
<published>2020-10-05T08:52:25.000Z</published>
<updated>2020-11-19T13:22:30.704Z</updated>
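<content type="html"><![CDATA[<p>This post contains my study notes for the <code>MIT6.824</code> course. It summarizes the key points of the paper and adds my own understanding, so it may differ from the paper in places. Readers who want the details should read the original paper or follow the <code>MIT6.824</code> course.</p><p><a href="https://pdos.csail.mit.edu/6.824/papers/gfs.pdf">The Google File System</a></p><p><a href="https://pdos.csail.mit.edu/6.824/video/3.html">GFS MIT Video</a></p><h2 id="简介"><a href="#简介" class="headerlink" title="简介"></a>Introduction</h2><p><code>Google File System</code>, or <code>GFS</code> for short, is a scalable distributed file system designed and implemented by <code>Google</code> for data-intensive applications.</p><p>The design of <code>GFS</code> is based on the following usage assumptions:</p><ol><li>It runs on inexpensive commodity hardware, where component failures are the norm. The system must therefore constantly monitor itself, detect errors, tolerate faults, and recover automatically.</li><li>It mainly stores large files (<code>100MB</code> to several <code>GB</code>). Small files must be supported, but there is no need to optimize for them.</li><li>It supports two kinds of reads: large streaming reads (hundreds of <code>KB</code>, or <code>1MB</code> or more at a time) and small random reads (a few <code>KB</code> at an arbitrary offset).</li><li>It supports two kinds of writes: large sequential appends to files, and small writes at arbitrary positions (which need not be efficient).</li><li>It must efficiently support the semantics of many clients concurrently appending to the same file (in <code>Google</code>'s workloads, files stored in <code>GFS</code> are often used as <code>producer-consumer</code> queues or for many-way merging)</li><li>High sustained throughput takes priority over low latency</li></ol><a id="more"></a><h2 id="GFS-架构"><a href="#GFS-架构" class="headerlink" title="GFS 架构"></a>GFS Architecture</h2><p><img src="/images/gfs/figure1_gfs_architecture.png" alt="Figure 1: GFS architecture"></p><p>A <code>GFS</code> cluster consists of a single <code>Master</code> node and multiple <code>Chunk Server</code>s, and is accessed concurrently by many (hundreds of) clients.</p><blockquote><p>A single <code>Master</code> node does not mean that only one server in the cluster is able to become <code>Master</code>; it means that at any given moment only one node holds the <code>Master</code> role. When that node goes down, a new <code>Master</code> node takes over.</p></blockquote><p>Files stored in <code>GFS</code> are split into fixed-size <code>Chunk</code>s. When a <code>Chunk</code> is created, the <code>Master</code> assigns it an immutable, globally unique 64-bit <code>Chunk</code> handle. A <code>Chunk Server</code> stores each <code>Chunk</code> on its local disk as an ordinary <code>linux</code> file, and reads or writes chunk data identified by a <code>Chunk</code> handle and a byte range. For reliability, each chunk is replicated on several different <code>Chunk Server</code>s (usually 3 replicas).</p><blockquote><p>Replicas also help with load balancing during large-scale reads</p></blockquote><p>The <code>Master</code> node manages the whole file system. Its main responsibilities are:</p><ol><li><p>Metadata for the whole file system: the namespace, access control information, the mapping from files to <code>Chunk</code>s, and the current locations of chunks</p></li><li><p>System-wide dynamic activities: <code>Chunk</code> lease management, garbage collection of orphaned chunks, and migration of <code>Chunk</code>s between <code>Chunk Server</code>s</p></li><li><p>Periodic heartbeat communication with each <code>Chunk Server</code>, to send it instructions and collect its state</p></li><li><p>Accepting and responding to client requests</p></li></ol><p>The <code>GFS</code> client code implements the <code>GFS</code> file system <code>API</code> and communicates with the <code>Master</code> node and the <code>Chunk Server</code>s to read and write data.</p><p>Because the system has only one <code>Master</code> node, and to keep it from becoming a bottleneck, clients contact the <code>Master</code> only for metadata. All data operations go directly between the client and the <code>Chunk Server</code>s, and the client caches the metadata it obtains from the <code>Master</code> for a period of time. Neither the client nor the <code>Chunk Server</code> needs to cache file data.</p><h2 id="元数据"><a href="#元数据" class="headerlink" title="元数据"></a>Metadata</h2><p>The <code>Master</code> stores three major types of metadata:</p><ol><li>The file and <code>Chunk</code> namespaces</li><li>The mapping from files to <code>Chunk</code>s</li><li>The locations of each <code>Chunk</code>'s replicas</li></ol><p>All of this metadata is kept in the <code>Master</code>'s memory, which makes metadata changes very easy for the <code>Master</code>. The <code>Master</code> can periodically scan its entire state in the background, simply and efficiently, to implement <code>Chunk</code> garbage collection, re-replication when a <code>Chunk Server</code> fails, load balancing across <code>Chunk Server</code>s, disk usage accounting, and so on. The one risk is that keeping metadata in memory limits the number of <code>Chunk</code>s the cluster can manage to what fits in the <code>Master</code>'s memory. According to the paper, however, <code>Google</code> did not run into this problem, because a <code>64MB</code> <code>Chunk</code> takes less than <code>64B</code> of <code>Master</code> memory, and in <code>Google</code>'s workloads most <code>Chunk</code>s are full.</p><blockquote><p>Each <code>Chunk</code> is designed to be <code>64MB</code>, mainly for the following reasons:</p><ol><li>It reduces the amount of metadata and therefore how often clients talk to the <code>Master</code> (each <code>Chunk</code> covers a larger range of data, and a client can cache metadata for more data)</li><li>Since each <code>Chunk</code> covers a large range of data, a client is likely to perform many operations on the same <code>Chunk</code> and can keep a persistent <code>TCP</code> connection to the <code>Chunk Server</code>, which reduces network overhead</li><li>Less metadata means less memory pressure on the <code>Master</code></li></ol><p>On the other hand, a large Chunk also introduces problems. For example, a small file consists of only one 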
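<code>Chunk</code>, and operations on it can create a hot spot. The paper mitigates this as follows:</p><ol><li>Configure more replicas for such <code>Chunk</code>s to spread the read load</li><li>Avoid operating on such a <code>Chunk</code> from many clients at the same time wherever possible</li></ol><p>Alternatively, clients could be allowed to share data with one another when reading.</p></blockquote><p>To avoid losing state when the <code>Master</code> crashes, two kinds of metadata, <strong>the file and Chunk namespaces</strong> and <strong>the file-to-Chunk mapping</strong>, are persisted to local disk as an operation log (Operation Log) in the order of the changes, and replicated to other <code>Master</code> nodes. For a client request that changes this metadata, the <code>Master</code> responds to the client only after both the local node and the other <code>Master</code> nodes have flushed the <code>operation log</code> to disk.</p><p>When the <code>operation log</code> grows beyond a certain size, the <code>Master</code> takes a <code>Checkpoint</code> of the system state. On startup, the <code>Master</code> only needs to load the latest <code>Checkpoint</code> and replay the limited portion of the <code>operation log</code> written after it to recover the state it had before the crash.</p><p><strong>The location of each Chunk</strong> is not persisted. The <code>Master</code> simply polls the <code>Chunk Server</code>s for this information at startup, and afterwards monitors <code>Chunk Server</code> state through periodic heartbeat messages. This design simplifies keeping the data up to date as <code>Chunk Server</code>s join the cluster, leave it, change names, fail, and restart.</p><h2 id="一致性模型"><a href="#一致性模型" class="headerlink" title="一致性模型"></a>Consistency Model</h2><p><code>GFS</code> offers a relatively relaxed consistency model, which supports a highly distributed system while remaining relatively simple and easy to implement.</p><h3 id="一致性保障机制"><a href="#一致性保障机制" class="headerlink" title="一致性保障机制"></a>Consistency Guarantees</h3><p>Because all file system metadata is stored in the <code>Master</code>'s memory, file namespace mutations (such as file creation) can be made atomic and correct through locking on the <code>Master</code>. The <code>Master</code>'s <code>operation log</code> also defines a global order for these operations.</p><p><code>GFS</code> defines a few terms to describe the state of a file after a mutation:</p><ol><li>If all clients read the same data no matter which replica they read from, the <code>file region</code> is <strong>consistent (Consistent)</strong></li><li>Conversely, if there are two clients that read different data from some replicas, the <code>file region</code> is <strong>inconsistent (Inconsistent)</strong></li><li>For a <strong>consistent file region</strong>, if every client can also read back what its last mutation wrote, the <code>file region</code> is <strong>defined (Defined)</strong></li></ol><blockquote><p><strong>Consistent</strong> and <strong>defined</strong> here are understood from the client's point of view:</p><ol><li>If 10 clients <code>GET</code> the data after a mutation at the same time and all of them receive the same data, the region is <code>consistent</code></li><li>Suppose 10 clients concurrently perform <strong>different</strong> mutations (for example, each modifies a different part of the file, with no overlap) and then <code>GET</code> the modified data. Every client receives the same data (which already qualifies as the <code>consistent</code> 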
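state), and the data matches what each client expected from its own mutation, so each client can be <strong>sure</strong> that its mutation succeeded. The region is therefore called <code>defined</code>.</li><li>Suppose 10 clients concurrently perform <strong>different</strong> mutations (for example, they modify the same part of the file, so the writes overlap) and then <code>GET</code> the modified data. Every client receives the same data (which already qualifies as the <code>consistent</code> state), but each client finds that the data differs from what it expected from its own mutation. From the client's point of view there is no way to tell whether the result is correct, so the state is called <code>undefined</code>.</li></ol></blockquote><p>After a data mutation, the state of a <code>file region</code> depends on the type of operation, whether it succeeded, and whether there were concurrent mutations. Below we go through the cases, using the table from the paper together with the paper's own description.</p><blockquote><p>The paper does not define region; my guess is that it is the range of the file touched by a mutation</p></blockquote><p><img src="/images/gfs/table1_file_region_state_after_mutation.png" alt="Table 1: File region state after mutation"></p><p><strong>Random write (Write)</strong></p><blockquote><p> A write causes data to be written at an application-specified file offset.</p></blockquote><p>As the paper states, the position of a random write is specified by the client, so the position and content written are the same whether or not there are retries.</p><ol><li><p>Concurrent writes</p><p>According to Section <code>3.1 Leases and Mutation Order</code> of the paper, with concurrent writes the order is decided by the <code>Chunk</code>'s <code>primary</code> replica, and every replica applies the writes in the same order. So as long as all of them eventually succeed, all replicas of the <code>Chunk</code> hold the same content, that is, the state is <code>consistent</code>. But since the ranges written by different clients may overlap, the region can be <code>undefined</code>.</p><p>Suppose the original content of a <code>Chunk</code> is:</p><table><thead><tr><th>Position</th><th>0</th><th>1</th><th>2</th><th>3</th><th>4</th><th>5</th><th>6</th><th>7</th><th>8</th><th>9</th></tr></thead><tbody><tr><td>Content</td><td>a</td><td>b</td><td>c</td><td>d</td><td>e</td><td>f</td><td>g</td><td>h</td><td>i</td><td>j</td></tr></tbody></table><p>Now <code>client 1</code> wants to overwrite the range <code>[0, 3]</code> with <code>0</code>, and <code>client 2</code> wants to overwrite the range <code>[2, 5]</code> with <code>1</code>.</p><ol><li>If the writes are applied in the order <code>client 1</code>, <code>client 2</code>, then after both complete the <code>Chunk</code> content becomes</li></ol><table><thead><tr><th>Position</th><th>0</th><th>1</th><th>2</th><th>3</th><th>4</th><th>5</th><th>6</th><th>7</th><th>8</th><th>9</th></tr></thead><tbody><tr><td>Content</td><td>0</td><td>0</td><td>1</td><td>1</td><td>1</td><td>1</td><td>g</td><td>h</td><td>i</td><td>j</td></tr></tbody></table><ol start="2"><li>If the writes are applied in the order <code>client 2</code>, <code>client 1</code>, then after both complete the <code>Chunk</code> 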
content becomes</li></ol><table><thead><tr><th>Position</th><th>0</th><th>1</th><th>2</th><th>3</th><th>4</th><th>5</th><th>6</th><th>7</th><th>8</th><th>9</th></tr></thead><tbody><tr><td>Content</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>1</td><td>g</td><td>h</td><td>i</td><td>j</td></tr></tbody></table><p>Beyond the cases above, Section <code>3.1 Leases and Mutation Order</code> of the paper also notes that a write spanning multiple <code>Chunk</code>s is split up and applied to each <code>Chunk</code> separately. Because write order is controlled per <code>Chunk</code>, it is possible for <code>Chunk1</code> to apply the writes in the order <code>client 1</code>, <code>client 2</code> while <code>Chunk2</code> applies them in the order <code>client 2</code>, <code>client 1</code>. That makes matters even worse.</p></li><li><p>Serial success</p><p>Serial success means that only one client is writing at any moment. After its write completes it can read back the data it expected, so the state is <code>defined</code></p></li><li><p>Failed write</p><p>When a write succeeds on some replicas of a <code>Chunk</code> but fails on others, the region ends up <code>inconsistent</code>.</p></li></ol><p><strong>Record append (Append Records)</strong></p><blockquote><p>A record append causes data (the “record”) to be appended atomically at least once even in the presence of concurrent mutations, but at an offset of GFS’s choosing (Section 3.3). (In contrast, a “regular” append is merely a write at an offset that the client believes to be the current end of file.) The offset is returned to the client and marks the beginning of a defined region that contains the record. In addition, GFS may insert padding or record duplicates in between. 
They occupy regions considered to be inconsistent and are typically dwarfed by the amount of user data.</p></blockquote><p>As the paper states, record append in <code>GFS</code> has the following properties:</p><ol><li>A record append is an <code>atomic</code> operation (appended records are never interleaved)</li><li>The <code>offset</code> of a record append is chosen by <code>GFS</code> (more precisely, by the <code>Chunk</code> replica selected as <code>primary</code>)</li><li>When a record append fails it is retried, and as a result <code>GFS</code> may insert some <code>padding</code> or leave some duplicate data</li></ol><p>We again use an example to show the states that can arise. Because the operation is <code>atomic</code>, we discuss <code>concurrent appends</code> and <code>serial appends</code> together.</p><ol><li><p>Concurrent success and serial success</p><p>Because record append is atomic, data from different clients cannot overlap, so every client can read back the data it expected after writing. The state is <code>defined</code>.</p><p>Now consider what happens when a client retries. Suppose a <code>Chunk</code> has 3 replicas:</p><p><strong>Chunk1(primary)</strong> </p><table><thead><tr><th>Position</th><th>0</th><th>1</th><th>2</th></tr></thead><tbody><tr><td>Content</td><td>a</td><td>b</td><td>c</td></tr></tbody></table><p><strong>Chunk2</strong></p><table><thead><tr><th>Position</th><th>0</th><th>1</th><th>2</th></tr></thead><tbody><tr><td>Content</td><td>a</td><td>b</td><td>c</td></tr></tbody></table><p><strong>Chunk3</strong></p><table><thead><tr><th>Position</th><th>0</th><th>1</th><th>2</th></tr></thead><tbody><tr><td>Content</td><td>a</td><td>b</td><td>c</td></tr></tbody></table><p>The client now requests a record append with the content <code>123</code>. If <code>Chunk1</code> succeeds but <code>Chunk2</code> fails partway through (and <code>Chunk3</code> applies nothing), the <code>Chunk</code> contents become</p><p><strong>Chunk1(primary)</strong> 
</p><table><thead><tr><th>Position</th><th>0</th><th>1</th><th>2</th><th>3</th><th>4</th><th>5</th></tr></thead><tbody><tr><td>Content</td><td>a</td><td>b</td><td>c</td><td>1</td><td>2</td><td>3</td></tr></tbody></table><p><strong>Chunk2</strong></p><table><thead><tr><th>Position</th><th>0</th><th>1</th><th>2</th><th>3</th><th>4</th></tr></thead><tbody><tr><td>Content</td><td>a</td><td>b</td><td>c</td><td>1</td><td>2</td></tr></tbody></table><p><strong>Chunk3</strong></p><table><thead><tr><th>Position</th><th>0</th><th>1</th><th>2</th></tr></thead><tbody><tr><td>Content</td><td>a</td><td>b</td><td>c</td></tr></tbody></table><p>Once the client notices the error it retries. Because the <code>offset</code> of a record append is chosen by the <code>primary</code>, <code>Chunk1</code> decides that this append starts at <code>offset = 6</code>. <code>Chunk2</code> and <code>Chunk3</code> pad their files with special characters so that their end-of-file <code>offset</code> matches the <code>primary</code>. </p><p><strong>Chunk1(primary)</strong> </p><table><thead><tr><th>Position</th><th>0</th><th>1</th><th>2</th><th>3</th><th>4</th><th>5</th><th>6</th><th>7</th><th>8</th></tr></thead><tbody><tr><td>Content</td><td>a</td><td>b</td><td>c</td><td>1</td><td>2</td><td>3</td><td>1</td><td>2</td><td>3</td></tr></tbody></table><p><strong>Chunk2</strong></p><table><thead><tr><th>Position</th><th>0</th><th>1</th><th>2</th><th>3</th><th>4</th><th>5</th><th>6</th><th>7</th><th>8</th></tr></thead><tbody><tr><td>Content</td><td>a</td><td>b</td><td>c</td><td>1</td><td>2</td><td>-</td><td>1</td><td>2</td><td>3</td></tr></tbody></table><p><strong>Chunk3</strong></p><table><thead><tr><th>Position</th><th>0</th><th>1</th><th>2</th><th>3</th><th>4</th><th>5</th><th>6</th><th>7</th><th>8</th></tr></thead><tbody><tr><td>Content</td><td>a</td><td>b</td><td>c</td><td>-</td><td>-</td><td>-</td><td>1</td><td>2</td><td>3</td></tr></tbody></table><p>From the client's point of view, <code>GFS</code> uses checksums and duplicate detection so that the data the client reads is <code>defined</code> (see Section <code>2.7.2 Implications for Applications</code> of the paper).</p><p>The actual <code>Chunk</code> replicas, however, really are <code>inconsistent</code>.</p></li><li><p>Failed write</p><p>As with random writes, a failed append leaves the region in an <code>inconsistent</code> 
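state.</p></li></ol><h3 id="程序实现"><a href="#程序实现" class="headerlink" title="程序实现"></a>Implications for Applications</h3><p>Given these characteristics, <code>GFS</code> recommends that applications rely on the following techniques:</p><ol><li>Prefer appends over overwrites</li><li>Checkpointing</li><li>Self-validating writes</li><li>Self-identifying records</li></ol><p>After appending all of its data, the application atomically renames the file to a permanent name, or it periodically writes a <code>Checkpoint</code> recording how much data has been written successfully. A <code>Checkpoint</code> may include application-level checksums. <code>Readers</code> verify and process only the file content up to the last <code>Checkpoint</code>, which is known to be in the defined state. This approach meets our consistency and concurrency requirements. Appending is more efficient than writing at random positions and more resilient to application failures. <code>Checkpoint</code>s let a <code>Writer</code> restart incrementally and keep a <code>Reader</code> from processing data that has been written successfully but is still incomplete from the application's point of view. <code>Readers</code> handle the occasional padding and duplicates as follows. <code>Writers</code> include extra information, such as a <code>Checksum</code>, in every record they write so that its validity can be verified. A <code>Reader</code> can use the <code>Checksum</code> to identify and discard extra padding and record fragments. If the application cannot tolerate occasional duplicates, it can filter them out using unique identifiers in the records.</p><h2 id="GFS-中的常见操作"><a href="#GFS-中的常见操作" class="headerlink" title="GFS 中的常见操作"></a>Common Operations in GFS</h2><h3 id="读取"><a href="#读取" class="headerlink" title="读取"></a>Reads</h3><p><img src="/images/gfs/figure1_gfs_architecture.png" alt="Figure 1: GFS architecture"></p><p>A client reads data from <code>GFS</code> as follows:</p><ol><li>Using the fixed <code>Chunk</code> size, the client translates the file name and the requested byte offset into a <code>Chunk</code> index within the file</li><li>The client sends the file name and the <code>Chunk</code> index to the <code>Master</code> node</li><li>The <code>Master</code> node replies with the corresponding <code>Chunk</code> handle and the locations of its replicas</li><li>The client caches this information, using the file name and <code>Chunk</code> index as the <code>key</code></li><li>The client sends a read request (containing the <code>Chunk</code> handle and a byte range) to the nearest <code>Chunk</code> replica.</li></ol><h3 id="租约(lease)"><a href="#租约(lease)" class="headerlink" title="租约(lease)"></a>Leases</h3><p><code>GFS</code> uses leases (lease) to keep the mutation order consistent across replicas. The <code>Master</code> node grants a lease to one replica of a <code>Chunk</code>, which is then called the <code>primary Chunk (primary)</code>. The <code>primary Chunk</code> picks a serial order for all mutations to the <code>Chunk</code>, and all replicas follow that order when applying mutations. The global mutation order is therefore defined first by the order in which the <code>Master</code> node grants leases, and within a lease by the serial numbers assigned by the <code>primary Chunk</code>.</p><p>Leases reduce the load on the <code>Master</code> node. A lease is valid for <code>60s</code> by default, and during that time the <code>primary Chunk</code> can request an extension by piggybacking the request on the heartbeat messages it exchanges with the 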
<code>Master</code> 的心跳中附加信息来申请延长租期。<code>Master</code> 也可以提前取消租约,亦或者在 <code>主Chunk</code>失联且租约过期后,与其他的 <code>Chunk</code> 副本签订新的租约。</p><h3 id="写入操作"><a href="#写入操作" class="headerlink" title="写入操作"></a>写入操作</h3><p> <img src="/images/gfs/figure2_write_control_and_data_flow.png" alt="图2:写入和数据流"></p><p>写入操作的过程如下:</p><ol><li>客户机向 <code>Master</code> 节点询问哪一个 <code>Chunk Server</code>持有当前的租约,以及其它副本的位置。如果没有一个 <code>Chunk</code> 持有租约,<code>Master</code> 节点就选择其中一个副本建立一个租约(这个步骤在图上没有显示)。</li><li><code>Master</code> 节点将 <code>主Chunk</code> 的标识符以及其它副本(又称为 <code>secondary副本</code>、二级副本)的位置返回给客户端。客户机缓存这些数据以便后续的操作。只有在 <code>主Chunk</code> 不可用,或者 <code>主Chunk</code> 回复信息表明它已不再持有租约的时候,客户端才需要重新跟 <code>Master</code> 节点联系。</li><li>客户端把数据推送到所有的副本上。客户端可以以任意的顺序推送数据。<code>Chunk Server</code>接收到数据并保存在它的内部 <code>LRU缓存</code> 中,一直到数据被使用或者过期交换出去。由于数据流的网络传输负载非常高,通过分离数据流和控制流,我们可以基于网络拓扑情况对数据流进行规划,提高系统性能,而不用去理会哪个 <code>Chunk Server</code>保存了 <code>主Chunk</code>。</li><li>当所有的副本都确认接收到了数据,客户端发送写请求到 <code>主Chunk</code> 服务器。这个请求标识了早前推送到所有副本的数据。<code>主Chunk</code> 为接收到的所有操作分配连续的序列号,这些操作可能来自不同的客户端,序列号保证了操作顺序执行。它以序列号的顺序把操作应用到它自己的本地状态中。</li><li><code>主Chunk</code> 把写请求传递到所有的二级副本。每个二级副本依照 <code>主Chunk</code> 分配的序列号以相同的顺序执行这些操作。</li><li>所有的二级副本回复 <code>主Chunk</code>,它们已经完成了操作。</li><li><code>主Chunk</code> 服务器回复客户端。任何副本产生的任何错误都会返回给客户端。在出现错误的情况下,写入操作可能在 <code>主Chunk</code> 和一些二级副本执行成功。(如果操作在主Chunk 上失败了,操作就不会被分配序列号,也不会被传递。)客户端的请求被确认为失败,被修改的region处于不一致的状态。客户端代码通过重复执行失败的操作来处理这样的错误。在从头开始重复执行之前,客户机会先从步骤(3)到步骤(7)做几次尝试。</li></ol><p>如果写入的数据量很大,跨域了多个 <code>Chunk</code> ,客户端会将其分成多个写操作。</p><p>由于数据推送需要消耗大量的网络带宽,客户端在推送数据的时候,会沿着一个 <code>Chunk Server</code>链顺序的推送,而不是以其它拓扑形式分散推送(例如,树型拓扑结构)。线性推送模式下,每台机器所有的出口带宽都用于以最快的速度传输数据,而不是在多个接受者之间分配带宽。</p><p>同时,利用基于 <code>TCP</code> 连接的、管道式数据推送的方式来让延长最小化,<code>Chunk Server</code>接收到数据后,马上开始向前推送。</p><h3 id="追加写入"><a href="#追加写入" class="headerlink" title="追加写入"></a>追加写入</h3><p>追加写入与上文提到的覆盖写入过程基本一致:</p><ol><li>客户端只需要指定要追加的数据,写入的偏移量由 <code>GFS</code> 来决定</li><li>客户端把数据推送完毕后,向 <code>主Chunk</code> 
发送写入请求</li><li><code>主Chunk</code> 会首先检查当前的追加操作是否超出了 <code>Chunk</code> 的最大尺寸(64MB)<ol><li>如果超出则 <code>主Chunk</code> 会首先将当前的 <code>Chunk</code> 填充到最佳尺寸,然后通知所有 <code>二级副本</code> 做相同的操作,最后回复客户端要求其对下一个 <code>Chunk</code> 继续进行追加操作</li><li>如果没有超出,则 <code>主Chunk</code> 将数据追加到自己的副本内,再通知 <code>二级副本</code> 写在跟 <code>主Chunk</code> 相同的位置上</li></ol></li></ol><p>关于追加失败的场景,前面讲 <code>一致性模型</code> 时已经提到,不再在此赘述。</p><h3 id="快照"><a href="#快照" class="headerlink" title="快照"></a>快照</h3><p>快照使用 <code>Copy-on-Write</code> 技术,当 <code>Master</code> 节点收到一个快照请求时:</p><ol><li>取消作快照文件的所有租约。这样就保证了,后续与这些文件有关的操作,都必须先请求 <code>Master</code> 节点(参考前面提到的写入流程)</li><li>等租约撤回后,<code>Master</code> 首先会将这个操作以日志的形式记录到磁盘,然后开始在内存中复制相关文件或者目录的元数据,这些元数据指向相同的 <code>Chunk</code></li><li>当客户端第一次查询 <code>Chunk C</code> 的 <code>primary</code> 以及副本位置,想要做写入操作时,<code>Master</code> 发现指向 <code>Chunk C</code> 的引用计数超过了 <code>1</code>。此时 <code>Master</code> 不会马上响应客户端的请求,而是首先创建一个 <code>Chunk C</code> 的新 <code>handle</code>,并要求每个拥有 <code>Chunk C</code> 的服务器在本地复制一个相同的 <code>Chunk C</code>,之后在新创建出的 <code>Chunk C</code> 中选择一个签订租约,并将信息返回给客户端</li></ol><h3 id="命名空间和锁"><a href="#命名空间和锁" class="headerlink" title="命名空间和锁"></a>命名空间和锁</h3><p>为了能允许客户端并发操作,<code>Master</code> 会使用命名空间上的锁来保证操作的正确性。</p><p>与传统的文件系统不同,<code>GFS</code> 没有维护一个目录树,也不支持文件或者目录的链接(<code>unix</code> 中的符号链接)。在逻辑上,<code>GFS</code> 的名称空间就是一个全路径和元数据映射关系的查找表。利用前缀压缩,这个表可以高效的存储在内存中。在存储名称空间的树型结构上,每个节点(绝对路径的文件名或绝对路径的目录名)都有一个关联的读写锁。</p><p>每个 <code>Master</code> 节点的操作在开始之前都要获得一系列的锁。通常情况下,如果一个操作涉及 <code>/d1/d2/…/dn/leaf</code>,那么操作首先要获得目录 <code>/d1,/d1/d2,…,/d1/d2/…/dn</code> 的读锁,以及 <code>/d1/d2/…/dn/leaf</code> 的读写锁。注意,根据操作的不同,<code>leaf</code> 可以是 一个文件,也可以是一个目录。</p><p>为了优化锁占用的内存,读写锁采用惰性分配的方式,且不再使用的时候会被及时回收。</p><p>锁的获取也要依据一个全局一致的顺序来避免死锁:首先按名称空间的层次排序,在同一个层次内按字典顺序排序。</p><h3 id="副本"><a href="#副本" class="headerlink" title="副本"></a>副本</h3><h4 id="位置"><a href="#位置" class="headerlink" 
title="位置"></a>位置</h4><p>副本位置的选择主要遵循两个目标:</p><ol><li>最大化数据可靠性和可用性</li><li>最大化网络带宽利用率</li></ol><p>而仅仅在多台机器上存储副本只能保证硬盘损坏或者机器失效带来的影响,以及最大化每台机器的带宽利用率。所以必须要在多个机架见分布存储 <code>Chunk</code> 副本。</p><h4 id="创建、复制和负载均衡"><a href="#创建、复制和负载均衡" class="headerlink" title="创建、复制和负载均衡"></a>创建、复制和负载均衡</h4><p>除去读写之外,副本主要还有三个操作:创建、重新复制和负载均衡(迁移)</p><p><strong>创建</strong></p><p>创建副本时,要选择在什么地方放置空的副本,<code>Master</code> 在选择时主要考虑下面几个因素:</p><ol><li>优先考虑硬盘使用率低于平均水平的服务器</li><li>保证每个服务器上最近创建的 <code>Chunk</code> 不要过多,因为 <code>Chunk</code> 创建意味着接下来会有大量的写入和查询。</li><li>如上文所说,倾向于分布在不同的机架上</li></ol><p><strong>复制</strong></p><p>当 <code>Chunk</code> 副本由于以下几个可能的原因,导致副本数量小于用户指定的复制因数的时候,<code>Master</code> 节点就会重新复制它:</p><ol><li><code>Chunk Server</code>不可用</li><li><code>Chunk Server</code>报告它所存储的一个副本损坏</li><li><code>Chunk Server</code>的一块磁盘不可用</li><li><code>Chunk</code> 副本的复制参数被增加</li></ol><p>当多个 <code>Chunk</code> 需要被复制时,优先级会考虑以下因素</p><ol><li>当前副本数与复制因数的差值,差值越大优先级越高</li><li>优先复制未被删除的 <code>Chunk</code> (删除是惰性的,会被定时回收,下文有介绍)</li><li>优先复制会阻塞客户端查询处理流程的</li></ol><p>复制时, <code>Master</code> 会 “命令” 拥有相应 <code>Chunk</code> 副本的 <code>Chunk Server</code>上克隆一个副本出来,并按照 <code>Chunk</code> 创建时的策略选择副本位置。</p><p>为了防止克隆时产生的流量影响客户端的操作,<code>Master</code> 对整个集群和每个 <code>Chunk Server</code>上同时进行克隆操作的数量做了限制,并且 <code>Chunk Server</code>通过调节它对源 <code>Chunk Server</code>读请求的频率来限制它用于克隆操作的带宽。</p><p><strong>重新负载均衡</strong></p><p><code>Master</code> 服务器周期性地检查当前的副本分布情况,然后移动副本以便更好的利用硬盘空间、更有效的进行负载 均衡。而且在这个过程中,<code>Master</code> 服务器逐渐的填满一个新的 <code>Chunk Server</code>,而不是在短时间内用新的 <code>Chunk</code> 填满它,以至于过载。新副本的存储位置选择策略和上面讨论的相同。另外,<code>Master</code> 节点必须选择哪个副本要被移走。通常情况,<code>Master</code> 节点移走那些剩余空间低于平均值的 <code>Chunk</code> 服务 器上的副本,从而平衡系统整体的硬盘使用率。</p><h3 id="垃圾回收"><a href="#垃圾回收" class="headerlink" title="垃圾回收"></a>垃圾回收</h3><p><code>GFS</code> 使用惰性删除策略来处理文件删除操作。</p><p>文件删除的流程为:</p><ol><li><code>Master</code> 节点立即将删除操作写入操作日志中</li><li>把文件名改为一个包含删除时间戳的隐藏的名字</li><li>当 <code>Master</code> 
对文件系统命名空间做常规扫描时删除所有三天前的隐藏文件</li></ol><p>在真正删除隐藏文件之前,被客户端删除的文件都可以通过更改文件名的方式回滚删除操作。文件的元数据也是在删除隐藏文件时被删除的。</p><p><code>Master</code> 在对 <code>Chunk</code> 名字空间做类似的常规扫描时,<code>Master</code> 节点找到孤儿 <code>Chunk</code>(不被任何文件包含的 <code>Chunk</code>)并删除它们的元数据。 <code>Chunk Server</code>在和 <code>Master</code> 节点交互的心跳信息中,报告它拥有的 <code>Chunk</code> 子集的信息,<code>Master</code> 节点回复 <code>Chunk Server</code>哪些 <code>Chunk</code> 在 <code>Master</code> 节点保存的元数据中已经不存在了。<code>Chunk Server</code>可以任意删除这些 <code>Chunk</code> 的副本。</p><p>惰性删除的优势:</p><ol><li>对于组件失效是常态的大规模分布式系统,垃圾回收方式简单可靠。<code>Chunk</code> 可能在某些 <code>Chunk Server</code>创建成功,某些 <code>Chunk Server</code>上创建失败,失败的副本处于无法被 <code>Master</code> 节点识别的状态。副本删除消息可能丢失,<code>Master</code> 节点必须重新发送失败的删除消息,包括自身的和 <code>Chunk</code>服务器的 。 垃圾回收提供了一致的、可靠的清除无用副本的方法。</li><li>垃圾回收把存储空间的回收操作合并到 <code>Master</code> 节点规律性的后台活动中。因此,操作被批量的执行,开销会被分散。另外,垃圾回收在<code>Master</code> 节点相对空闲的时候完成。这样 <code>Master</code> 节点就可以给那些需要快速反应的客户机请求提供更快捷的响应。</li><li>延缓存储空间回收为意外的、不可逆转的删除操作提供了安全保障。</li></ol><p>当然,延迟回收可能会造成空间的浪费,特别是当磁盘空间紧张或者客户端频繁创建、删除新文件的时候。对于这个问题,可以通过显式的再次删除一个已经被删除文件的方式来加速回收空间。另外 <code>GFS</code> 允许为命名空间的不同部分设置不同的复制参数和回收策略,比如可以指定某些目录下不做复制,删除时立即回收空间。</p><h3 id="失效副本检测"><a href="#失效副本检测" class="headerlink" title="失效副本检测"></a>失效副本检测</h3><p><code>Master</code> 在每次跟 <code>Chunk</code> 签订租约时增加 <code>Chunk</code> 版本号,然后通知最新副本,只有当 <code>Master</code> 和所有副本都将新的版本号持久化存储后,才会响应客户端的请求。</p><p>当一个 <code>Chunk Server</code>在更新版本号时失效,在它重启向 <code>Master</code> 报告当前副本状态时,<code>Master</code> 就会检测出它包含过期 <code>Chunk</code>。相反如果 <code>Master</code> 发现他记录的版本号比自己要高,则会更新自己的版本号到最新版本。(<strong>此处 Master 会更新自己的 Chunk 吗?</strong>)</p><p>客户端请求 <code>Master</code> 节点 <code>Chunk</code> 信息时,对于已经过期的 <code>Chunk</code>,<code>Master</code> 会直接认为不存在。另外,<code>Master</code> 节点在通知客户端哪个 <code>Chunk Server</code>持有租约、或者指示 <code>Chunk Server</code>从哪个 <code>Chunk Server</code>进行克隆时,消息中都附带了 <code>Chunk</code> 的版本号。客户端或者 <code>Chunk Server</code>在执行操作时都会验证版本号以确保总是访问当前版本的数据。</p><h2 id="容错和诊断"><a 
href="#容错和诊断" class="headerlink" title="容错和诊断"></a>容错和诊断</h2><p>由于 <code>GFS</code> 在设计之初的目标为运行在廉价的日用硬件上,组件的频繁失效是一种常态,所以容错和诊断是 <code>GFS</code> 设计时非常重要的一部分。</p><h3 id="高可用"><a href="#高可用" class="headerlink" title="高可用"></a>高可用</h3><p><code>GFS</code> 使用两条简单的策略保证系统的高可用性:快速恢复和复制。</p><h4 id="快速恢复"><a href="#快速恢复" class="headerlink" title="快速恢复"></a>快速恢复</h4><p><code>Master</code> 节点和 <code>Chunk Server</code> 的状态都保存在本地,无论是正常的重启还是异常的重启都可以快速的恢复到之前的状态。</p><h4 id="Master-的复制"><a href="#Master-的复制" class="headerlink" title="Master 的复制"></a>Master 的复制</h4><p><code>Master</code> 的所有操作日志和 <code>Checkpoint</code> 文件都被复制到多台机器上。并且凡是涉及更改 <code>Master</code> 状态的操作,一定会确保操作日志写入到 <code>Master</code> 和复制机器的磁盘上才会响应客户端的请求。客户端使用规范的别名(例如 gfs-master)访问 <code>Master</code>,一旦当前 <code>Master</code> 节点不可用,<code>GFS</code> 系统外部的监控进程会在其它的存有完整操作日志的机器上启动一个新的 <code>Master</code>进程,并将别名指向新的 <code>Master</code> 节点。</p><p><code>GFS</code> 中还有一些 <code>Shadow Master</code> 节点,他们在 <code>Master</code> 宕机期间可以临时提供文件系统的只读访问,由于 <code>Shadow Master</code> 的元数据比 <code>Master</code> 节点更新慢(通常不到 1s),所以通过 <code>Shadow Master</code> 读取文件内容时,有可能读取到过期数据。</p><p><code>Shadow Master</code> 服务器为通过读取正在进行操作的日志副本来保持自身状态是最新的,它依照和主 <code>Master</code> 服务器完全相同的顺序来更改内部的数据结构。<code>Shadow Master</code> 服务器在启动的时候也会从 <code>Chun Server</code> 轮询数据(之后定期拉数据),数据中包括了 <code>Chunk</code> 副本的位置信息;<code>Shadow Master</code> 服务器也会定期和 <code>Chunk Server</code> 通信以确定它们的状态。在主 <code>Master</code> 服务器因创建和删除副本导致副本位置信息更新时,<code>Shadow Master</code> 服务器才和主 <code>Master</code> 服务器通信来更新自身状态。</p><h4 id="Chunk-的复制"><a href="#Chunk-的复制" class="headerlink" title="Chunk 的复制"></a>Chunk 的复制</h4><p>每个 <code>Chunk</code> 都被复制到不同机架上的不同的 <code>Chunk Server</code> 上。用户可以为文件命名空间的不同部分设定不同的复制级别。缺省是 3。当有<code>Chunk Server</code> 离线了,或者通过 <code>Chksum校验</code> 发现了已经损坏的数据,<code>Master</code> 节点通过克隆已有的副本保证每个 <code>Chunk</code> 都被完整复制。</p><h3 id="数据完整性"><a href="#数据完整性" class="headerlink" title="数据完整性"></a>数据完整性</h3><p><code>Chunk Server</code> 会把每个 <code>Chunk 
Replica</code> 切分为若干个 64KB 大小的块,并为每个块计算 32 位校验和。和 <code>Master</code> 的元数据一样,这些校验和会被保存在 <code>Chunk Server</code> 的内存中,每次修改前都会用先写日志的形式来保证可用。当 <code>Chunk Server</code> 接收到读请求时,<code>Chunk Server</code> 首先会利用校验和检查所需读取的数据是否有发生损坏,如此一来 <code>Chunk Server</code> 便不会把损坏的数据传递给其他请求发送者,无论它是客户端还是另一个 <code>Chunk Server</code>。发现损坏后,<code>Chunk Server</code> 会为请求发送者发送一个错误,并向 <code>Master</code> 告知数据损坏事件。接收到错误后,请求发送者会选择另一个 <code>Chunk Server</code> 重新发起请求,而 <code>Master</code> 则会利用另一个 <code>Replica</code> 为该 <code>Chunk</code> 进行重备份。当新的 <code>Replica</code> 创建完成后,<code>Master</code> 便会通知该 <code>Chunk Server</code> 删除这个损坏的 <code>Replica</code>。</p><p>当进行数据追加操作时,<code>Chunk Server</code> 可以为位于 <code>Chunk</code> 尾部的校验和块的校验和进行增量式的更新,或是在产生了新的校验和块时为其计算新的校验和。即使是被追加的校验和块在之前已经发生了数据损坏,增量更新后的校验和依然会无法与实际的数据相匹配,在下一次读取时依然能够检测到数据的损坏。在进行数据写入操作时,<code>Chunk Server</code> 必须读取并校验包含写入范围起始点和结束点的校验和块,然后进行写入,最后再重新计算校验和。</p><p>除外,在空闲的时候,<code>Chunk Server</code> 也会周期地扫描并校验不活跃的 <code>Chunk Replica</code> 的数据,以确保某些 <code>Chunk Replica</code> 即使在不怎么被读取的情况下,其数据的损坏依然能被检测到,同时也确保了这些已损坏的 <code>Chunk Replica</code> 不至于让 <code>Master</code> 认为该 <code>Chunk</code> 已有足够数量的 <code>Replica</code>。</p><h2 id="FAQ"><a href="#FAQ" class="headerlink" title="FAQ"></a>FAQ</h2><p>MIT 6.824 的课程材料中给出了和 GFS 有关的 FAQ,以下是相关问答的翻译。</p><blockquote><p>Q:为什么原子记录追加操作是至少一次(At Least Once),而不是确定一次(Exactly Once)?</p></blockquote><p>要让追加操作做到确定一次是不容易的,因为如此一来 Primary 会需要保存一些状态信息以检测重复的数据,而这些信息也需要复制到其他服务器上,以确保 Primary 失效时这些信息不会丢失。在 Lab 3 中你会实现确定一次的行为,但用的是比 GFS 更复杂的协议(Raft)。</p><blockquote><p>Q:应用怎么知道 Chunk 中哪些是填充数据或者重复数据?</p></blockquote><p>要想检测填充数据,应用可以在每个有效记录之前加上一个魔数(Magic Number)进行标记,或者用校验和保证数据的有效性。应用可通过在记录中添加唯一 ID 来检测重复数据,这样应用在读入数据时就可以利用已经读入的 ID 来排除重复的数据了。GFS 本身提供了 library 来支撑这些典型的用例。</p><blockquote><p>Q:考虑到原子记录追加操作会把数据写入到文件的一个不可预知的偏移值中,客户端该怎么找到它们的数据?</p></blockquote><p>追加操作(以及 GFS 本身)主要是面向那些会完整读取文件的应用的。这些应用会读取所有的记录,所以它们并不需要提前知道记录的位置。例如,一个文件中可能包含若干个并行的网络爬虫获取的所有链接 URL。这些 URL 在文件中的偏移值是不重要的,应用只会想要完整读取所有 URL。</p><blockquote><p>Q:如果一个应用使用了标准的 
POSIX 文件 API,为了使用 GFS 它会需要做出修改吗?</p></blockquote><p>答案是需要的,不过 GFS 并不是设计给已有的应用的,它主要面向的是新开发的应用,如 MapReduce 程序。</p><blockquote><p>Q:GFS 是怎么确定最近的 Replica 的位置的?</p></blockquote><p>论文中有提到 GFS 是基于保存 Replica 的服务器的 IP 地址来判断距离的。在 2003 年的时候,Google 分配 IP 地址的方式应该确保了如果两个服务器的 IP 地址在 IP 地址空间中较为接近,那么它们在机房中的位置也会较为接近。</p><blockquote><p>Q:Google 现在还在使用 GFS 吗?</p></blockquote><p>Google 仍然在使用 GFS,而且是作为其他如 BigTable 等存储系统的后端。由于工作负载的扩大以及技术的革新,GFS 的设计在这些年里无疑已经经过大量调整了,但我并不了解其细节。HDFS 是公众可用的对 GFS 的设计的一种效仿,很多公司都在使用它。</p><blockquote><p>Q:Master 不会成为性能瓶颈吗?</p></blockquote><p>确实有这个可能,GFS 的设计者也花了很多心思来避免这个问题。例如,Master 会把它的状态保存在内存中以快速地进行响应。从实验数据来看,对于大文件读取(GFS 主要针对的负载类型),Master 不是瓶颈所在;对于小文件操作以及目录操作,Master 的性能也还跟得上(见 6.2.4 节)。</p><blockquote><p>Q:GFS 为了性能和简洁而牺牲了正确性,这样的选择有多合理呢?</p></blockquote><p>这是分布式系统领域的老问题了。保证强一致性通常需要更加复杂且需要机器间进行更多通信的协议(正如我们会在接下来几门课中看到的那样)。通过利用某些类型的应用可以容忍较为松懈的一致性的事实,人们就能够设计出拥有良好性能以及足够的一致性的系统。例如,GFS 对 MapReduce 应用做出了特殊优化,这些应用需要的是对大文件的高读取效率,还能够容忍文件中存在数据空洞、重复记录或是不一致的读取结果;另一方面,GFS 则不适用于存储银行账号的存款信息。</p><blockquote><p>Q:如果 Master 失效了会怎样?</p></blockquote><p>GFS 集群中会有持有 Master 状态完整备份的 Replica Master;通过论文中没有提到的某个机制,GFS 会在 Master 失效时切换到其中一个 Replica(见 5.1.3 节)。有可能这会需要一个人类管理者的介入来指定一个新的 Master。无论如何,我们都可以确定集群中潜伏着一个故障单点,理论上能够让集群无法从 Master 失效中进行自动恢复。我们会在后面的课程中学习如何使用 Raft 协议实现可容错的 Master。</p><h3 id="问题"><a href="#问题" class="headerlink" title="问题"></a>问题</h3><p>除了 FAQ,课程还要求学生在阅读 GFS 的论文后回答一个问题,问题如下:</p><blockquote><p>Describe a sequence of events that result in a client reading stale data from the Google File System</p><p>描述一个事件序列,使得客户端会从 Google File System 中读取到过时的数据</p></blockquote><p>通过查阅论文,不难找到两处答案:由失效后重启的 Chunk Server + 客户端缓存的 Chunk 位置数据导致客户端读取到过时的文件内容(见 4.5 和 2.7.1 节),和由于 Shadow Master 读取到的过时文件元信息(见 5.1.3 节)。以上是保证所有写入操作都成功时客户端可能读取到过时数据的两种情况 —— 如果有写入操作失败,数据会进入<strong>不确定</strong>的状态,自然客户端也有可能读取到过时或是无效的数据。</p><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h2><p>论文的第六章为 <code>GFS</code> 的 <code>Benchmark</code> ,第七章为 <code>GFS</code> 在生产环境使用时遇到的一些问题,本文没有总结,有兴趣的读者可以阅读论文原文。</p><p>基于 
<code>Google File System </code> 开发的 <code>HDFS</code> 一直是分布式文件系统开源实现的首选。本篇论文与 <code>Map Reduce</code> 是大数据的开山之作,与 <code>Big Table</code> 并称为 <code>Google</code> 的三驾马车,是非常经典的论文,值得不断的学习和推敲。 </p><h2 id="参考资料"><a href="#参考资料" class="headerlink" title="参考资料"></a>参考资料</h2><ol><li><p><a href="https://pdos.csail.mit.edu/6.824/papers/gfs-faq.txt">GFS FAQ</a></p></li><li><p><a href="https://blog.wonter.net/posts/aa8a2f6c/">MIT 6.824(二)GFS的一致性模型</a></p></li><li><p><a href="https://mr-dai.github.io/gfs/">Google File System 总结</a></p></li></ol>]]></content>
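<summary type="html"><p>These are my study notes for the <code>MIT6.824</code> course. They summarize the key points of the paper and add my own understanding, so they may differ from the paper in places; readers who want the details should read the original paper or take the <code>MIT6.824</code> course.</p>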
<p><a href="https://pdos.csail.mit.edu/6.824/papers/gfs.pdf">The Google File System</a></p>
<p><a href="https://pdos.csail.mit.edu/6.824/video/3.html">GFS MIT Video</a></p>
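<h2 id="简介"><a href="#简介" class="headerlink" title="简介"></a>Introduction</h2><p>The <code>Google File System</code>, or <code>GFS</code> for short, is a scalable distributed file system for data-intensive applications, designed and implemented by <code>Google</code>.</p>
<p>The design of <code>GFS</code> is based on the following usage scenarios:</p>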
<ol>
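<li>It runs on inexpensive commodity hardware, where component failures are the norm. The system must therefore constantly monitor itself, detect errors, tolerate faults, and recover automatically.</li>
<li>It mainly stores large files (<code>100MB</code> to <code>several GB</code>). Small files must be supported, but need not be optimized for.</li>
<li>It supports two kinds of reads: large streaming reads (<code>hundreds of KB</code>, or <code>1MB</code> or more per read) and small random reads (<code>a few KB</code> at an arbitrary offset).</li>
<li>It supports two kinds of writes: large sequential appends to files, and small writes at arbitrary positions (which need not be efficient).</li>
<li>It must efficiently support well-defined semantics for multiple clients appending to the same file concurrently (at <code>Google</code>, files stored in <code>GFS</code> are often used as <code>producer-consumer</code> queues or for many-way merging).</li>
<li>High sustained throughput matters more than low latency.</li>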
</ol></summary>
</entry>
<entry>
<title>MapReduce Summary</title>
<link href="https://andrewei1316.github.io/2020/10/04/map-reduce/"/>
<id>https://andrewei1316.github.io/2020/10/04/map-reduce/</id>
<published>2020-10-04T05:13:12.000Z</published>
<updated>2020-11-19T13:22:30.705Z</updated>
<content type="html"><![CDATA[<p>These are my study notes for the <code>MIT6.824</code> course. They summarize the key points of the paper and add my own understanding, so they may differ from the paper in places; readers who want the details should read the original paper or take the <code>MIT6.824</code> course.</p><p><a href="https://pdos.csail.mit.edu/6.824/papers/mapreduce.pdf">MapReduce: Simplified Data Processing on Large Clusters</a></p><p><a href="https://pdos.csail.mit.edu/6.824/video/1.html">Introduction And MapReduce</a></p><h2 id="简介"><a href="#简介" class="headerlink" title="简介"></a>Introduction</h2><p><code>MapReduce</code> is a programming model and an associated implementation for processing and generating very large data sets. Programs written in the <code>MapReduce</code> style can be parallelized across a large number of commodity machines.</p><h2 id="MapReduce-模型"><a href="#MapReduce-模型" class="headerlink" title="MapReduce 模型"></a>The MapReduce Model</h2><p>A <code>MapReduce</code> computation takes a set of input <code>key/value pairs</code> and produces a set of output <code>key/value pairs</code>.<br><code>MapReduce</code> lets the user express the computation as two functions, <code>Map</code> and <code>Reduce</code>.</p><ul><li>The <code>Map</code> function takes an input <code>key/value pair</code> and produces a set of intermediate <code>key/value pairs</code>. <code>MapReduce</code> groups together all intermediate <code>values</code> whose <code>key</code> is <code>I</code> and passes them to the <code>reduce</code> function.</li><li>The <code>Reduce</code> function accepts an intermediate <code>key</code> <code>I</code> and the set of <code>values</code> for that key, and merges them into a possibly smaller set of values. Because the <code>values</code> may be too large to fit in memory, they are usually supplied to the <code>Reduce</code> function through an iterator.</li></ul><p>The process above can also be written as the following expressions:</p><p>$$ map(k1, v1) -> list(k2, v2) $$</p><p>$$ reduce(k2, list(v2)) -> list(v2) $$</p><a id="more"></a><h3 id="一个例子"><a href="#一个例子" class="headerlink" title="一个例子"></a>An Example</h3><p>The following example illustrates how the <code>MapReduce</code> model computes. To count the number of occurrences of each word in a very large collection of documents, we can write:</p><figure class="highlight shell"><table><tr><td class="code"><pre><span class="line">map(String key, String value):</span><br><span class="line">  // key: document name</span><br><span class="line">  // value: document contents</span><br><span class="line">  for each word w in value:</span><br><span class="line">    EmitIntermediate(w, "1")</span><br><span class="line"></span><br><span class="line">reduce(String key, Iterator values):</span><br><span class="line">  // key: a word</span><br><span class="line">  // values: a list of counts</span><br><span class="line">  int result = 0;</span><br><span class="line">  for each v in values:</span><br><span class="line">    result += ParseInt(v)</span><br><span class="line">  Emit(AsString(result))</span><br></pre></td></tr></table></figure><p>The <code>Map</code> function emits each word in the document together with its count of occurrences (just 1 in this simple example). The <code>Reduce</code> function sums all the counts emitted by <code>Map</code> for a particular word.</p><h2 id="实现"><a href="#实现" class="headerlink" title="实现"></a>Implementation</h2><h3 id="执行过程"><a href="#执行过程" class="headerlink" title="执行过程"></a>Execution Overview</h3><p><img src="/images/map-reduce/figure1_execute_overview.png" alt="执行过程概览"></p><p>Figure 1 shows the flow of a <code>MapReduce</code> computation. When the user program calls the <code>MapReduce</code> function, the following sequence of actions occurs (the numbers below correspond to the numbers in Figure 1):</p><ol><li><p>The user program first calls the <code>MapReduce</code> library, which splits the input files into <code>M pieces</code>, each typically <code>16MB to 64MB</code> in size (controllable through an optional parameter). The user program then starts many copies of the program on a cluster of machines.</p></li><li><p>One of the copies is special: the <code>master</code>. The rest are <code>workers</code> that are assigned work by the <code>master</code>. There are <code>M Map tasks</code> and <code>R Reduce tasks</code> to assign, and the <code>master</code> gives each idle <code>worker</code> one <code>Map</code> task or one <code>Reduce</code> task.</p></li><li><p>A <code>worker</code> that is assigned a <code>Map</code> task reads the corresponding input piece, parses <code>key/value pairs</code> out of it, and passes each <code>key/value pair</code> to the user-defined <code>Map</code> function. The intermediate <code>key/value pairs</code> produced by the <code>Map</code> function are buffered in memory.</p></li>
<li><p>The buffered <code>key/value pairs</code> are partitioned into <code>R</code> regions by the partitioning function and periodically written to local disk. The locations of these buffered <code>key/value pairs</code> on local disk are passed back to the <code>master</code>, which is responsible for forwarding them to the <code>Reduce workers</code>.</p></li><li><p>When a <code>Reduce worker</code> is told these locations by the <code>master</code>, it uses <code>RPC</code> to read the buffered data from the local disks of the <code>Map workers</code>. Once a <code>Reduce worker</code> has read all of its intermediate data, it sorts the data by <code>key</code> so that all occurrences of the same <code>key</code> are grouped together. The sort is needed because many different <code>keys</code> typically map to the same <code>Reduce</code> task. If the intermediate data is too large to sort in memory, an external sort is used.</p></li><li><p>The <code>Reduce worker</code> iterates over the sorted intermediate data and, for each unique intermediate <code>key</code>, passes the <code>key</code> and the corresponding set of intermediate <code>values</code> to the user-defined <code>Reduce</code> function. The output of the <code>Reduce</code> function is appended to the output file for this partition.</p></li><li><p>When all <code>Map</code> and <code>Reduce</code> tasks have completed, the <code>master</code> wakes up the user program. Only at this point does the <code>MapReduce</code> call in the user program return.</p></li></ol><p>After successful completion, the output of <code>MapReduce</code> is available in <code>R</code> output files (one per <code>Reduce</code> task, with file names specified by the user). Users usually do not need to merge these <code>R</code> files into one: they often pass them as input to another <code>MapReduce</code> call, or use them from another distributed application that can handle input split across multiple files.</p><p><span id="master维护的数据结构"/>During the execution above, the <code>master</code> records the current state of every <code>Map</code> and <code>Reduce</code> task and the <code>worker</code> it is assigned to. In addition, the <code>master</code> forwards the locations and sizes of the intermediate files produced by <code>Map</code> tasks to the <code>Reduce</code> tasks.</p><h3 id="容错机制"><a href="#容错机制" class="headerlink" title="容错机制"></a>Fault Tolerance</h3><h4 id="worker-失效"><a href="#worker-失效" class="headerlink" title="worker 失效"></a>Worker Failure</h4><p>The <code>master</code> pings every <code>worker</code> periodically. If a <code>worker</code> does not respond in time, the <code>master</code> marks it as failed. Tasks that ran on that <code>worker</code> fall into the following cases:</p><ol><li><p>Completed <code>Map</code> tasks: their output is no longer accessible, so they are reset to the idle state and later rescheduled on other <code>workers</code>. All <code>workers</code> running <code>Reduce</code> tasks are notified of the <code>re-execution</code>. Any <code>Reduce</code> task that has not yet read the data from the failed <code>worker</code> will read it from the new <code>worker</code> instead.</p></li>
<li><p>Completed <code>Reduce</code> tasks: their output is stored in a global file system, so they do not need to be re-executed.</p><blockquote><p>This reflects Google's setup, where the output of Reduce tasks is stored in GFS and cannot be lost with the worker. If the output were kept in local files, the Reduce task would still have to be re-executed.</p></blockquote></li><li><p>In-progress <code>Map</code> and <code>Reduce</code> tasks: they are reset to the idle state and wait to be rescheduled.</p></li></ol><h4 id="master-失效"><a href="#master-失效" class="headerlink" title="master 失效"></a>Master Failure</h4><p>The <code>master</code> periodically writes the data structures it maintains (<a href="#master维护的数据结构">described above</a>) to disk as a checkpoint. When the <code>master</code> fails, a new <code>master</code> can be started and resume from the state recorded in the last checkpoint.</p><blockquote><p>In Google's own implementation, the MapReduce computation is simply aborted if the master fails, and recovery is left to manual intervention.</p></blockquote><h4 id="对于上述机制的讨论"><a href="#对于上述机制的讨论" class="headerlink" title="对于上述机制的讨论"></a>Discussion of These Mechanisms</h4><p>When the user-defined <code>Map-Reduce</code> functions are deterministic, the same input always yields the same output, so re-execution gives the same result and the fault-tolerance mechanisms above always end with a correct result.</p><p>When the user-defined <code>Map-Reduce</code> functions are non-deterministic, however, two <code>Reduce</code> tasks may consume the results of two different executions of a <code>Map</code> task (with the same <code>Map</code> input), and the outputs of the <code>Reduce</code> function may then differ.</p><h2 id="优化技巧"><a href="#优化技巧" class="headerlink" title="优化技巧"></a>Optimizations</h2><h3 id="存储与计算尽量在同一个节点"><a href="#存储与计算尽量在同一个节点" class="headerlink" title="存储与计算尽量在同一个节点"></a>Keep Storage and Computation on the Same Node</h3><p>To reduce the network cost of fetching data, <code>Google</code> takes the <code>GFS</code> storage layout into account and tries to schedule each <code>Map</code> task on the server that stores its input file, or on a nearby server.</p><h3 id="增加任务数以平衡负载"><a href="#增加任务数以平衡负载" class="headerlink" title="增加任务数以平衡负载"></a>Use More Tasks to Balance Load</h3><p>As described above, the computation is split into <code>M</code> <code>Map</code> tasks and <code>R</code> <code>Reduce</code> tasks. Ideally <code>M</code> and <code>R</code> should be much larger than the number of <code>workers</code>: having each <code>worker</code> perform many different tasks improves dynamic load balancing and avoids hot spots when a <code>worker</code> fails and its tasks are moved to other <code>workers</code>.</p><p>In practice, the following factors limit the values of <code>M</code> and <code>R</code>:</p><ol><li>The <code>master</code> must make $O(M+R)$ scheduling decisions and keep $O(M*R)$ pieces of state in memory</li><li><code>R</code> is specified by the user, because it is the number of final output files</li><li><code>M</code> determines how much input data each <code>Map</code> task receives; we usually want each input piece to be between <code>16MB~64MB</code></li></ol><p>For these reasons, the typical ratio of <code>M</code> and <code>R</code> to the number of <code>workers</code> at <code>Google</code> is: $M=200000, R=5000, worker=2000$</p><h3 id="处理落后任务"><a href="#处理落后任务" class="headerlink" title="处理落后任务"></a>Handling Stragglers</h3><p>A distributed computing framework such as <code>Map-Reduce</code> is strongly subject to the <code>bucket effect</code>: its overall speed is limited by its slowest part. If one <code>worker</code> runs slowly because of its <code>CPU</code>, <code>memory</code>, <code>disk</code>, or similar factors, it holds back the whole computation.</p><p>To address this, when a <code>Map-Reduce</code> operation is close to completion, the <code>master</code> schedules backup executions of the remaining in-progress tasks. A task is marked as completed as soon as either the backup or the original execution finishes.</p><h3 id="自定义分区函数"><a href="#自定义分区函数" class="headerlink" title="自定义分区函数"></a>Custom Partitioning Function</h3><p>After a <code>Map</code> task completes, the <code>Map-Reduce</code> framework partitions the <code>keys</code> produced by <code>Map</code> with a partitioning function, so that identical <code>keys</code> are assigned to the same <code>Reduce</code> task. The partitioning function is usually a <code>hash</code>, for example $hash(key) mod R$. Some scenarios call for a different partitioning: when collecting statistics on <code>URLs</code>, for instance, we may want all <code>URLs</code> of the same domain to go to the same <code>Reduce</code> task, which requires a custom partitioning function.</p><h3 id="Combiner-函数"><a href="#Combiner-函数" class="headerlink" title="Combiner 函数"></a>Combiner Function</h3><p>When a <code>Map</code> task finishes, there may be many <code>key/value pairs</code> with the same <code>key</code>. In the word-count task above, for example, <code>Map</code> may produce a large number of <code>key/value pairs</code> of the form <code>the <-> 1</code>. All of these <code>key/value pairs</code> have to be sent over the network to a <code>Reduce</code> task, which then adds them up.</p><p>To reduce network traffic and the work done by <code>Reduce</code> tasks, the <code>Map-Reduce</code> framework lets the user specify a <code>Combiner</code> function that pre-processes the output of a <code>Map</code> task once the task finishes, merging the values of identical <code>keys</code>.</p>]]></content>
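<summary type="html"><p>These are my study notes for the <code>MIT6.824</code> course. They summarize the key points of the paper and add my own understanding, so they may differ from the paper in places; readers who want the details should read the original paper or take the <code>MIT6.824</code> course.</p>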
<p><a href="https://pdos.csail.mit.edu/6.824/papers/mapreduce.pdf">MapReduce: Simplified Data Processing on Large Clusters</a></p>
<p><a href="https://pdos.csail.mit.edu/6.824/video/1.html">Introduction And MapReduce</a></p>
<h2 id="简介"><a href="#简介" class="headerlink" title="简介"></a>Introduction</h2><p><code>MapReduce</code> is a programming model and an associated implementation for processing and generating very large data sets. Programs written in the <code>MapReduce</code> style can be parallelized across a large number of commodity machines.</p>
<h2 id="MapReduce-模型"><a href="#MapReduce-模型" class="headerlink" title="MapReduce 模型"></a>The MapReduce Model</h2><p>A <code>MapReduce</code> computation takes a set of input <code>key/value pairs</code> and produces a set of output <code>key/value pairs</code>.<br><code>MapReduce</code> lets the user express the computation as two functions, <code>Map</code> and <code>Reduce</code>.</p>
<ul>
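<li>The <code>Map</code> function takes an input <code>key/value pair</code> and produces a set of intermediate <code>key/value pairs</code>. <code>MapReduce</code> groups together all intermediate <code>values</code> whose <code>key</code> is <code>I</code> and passes them to the <code>reduce</code> function.</li>
<li>The <code>Reduce</code> function accepts an intermediate <code>key</code> <code>I</code> and the set of <code>values</code> for that key, and merges them into a possibly smaller set of values. Because the <code>values</code> may be too large to fit in memory, they are usually supplied to the <code>Reduce</code> function through an iterator.</li>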
</ul>
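<p>As a rough illustration of the two functions described above, the following Java sketch counts words entirely in memory. The class name <code>WordCountSketch</code>, the helper <code>run</code> (a single-process stand-in for the framework's group-by-key step), and the sample documents are assumptions made for this example; they are not part of the paper or of any real <code>MapReduce</code> library.</p>

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WordCountSketch {

    // map(k1, v1) -> list(k2, v2): emit (word, "1") for every word in the document.
    public static List<Map.Entry<String, String>> map(String documentName, String contents) {
        List<Map.Entry<String, String>> out = new ArrayList<>();
        for (String word : contents.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(Map.entry(word, "1"));
            }
        }
        return out;
    }

    // reduce(k2, list(v2)) -> list(v2): sum all counts emitted for one word.
    // The values arrive through an iterator, as the text above describes.
    public static List<String> reduce(String word, Iterator<String> values) {
        int result = 0;
        while (values.hasNext()) {
            result += Integer.parseInt(values.next());
        }
        return List.of(Integer.toString(result));
    }

    // Hypothetical stand-in for the framework: run map on every document,
    // group the intermediate values by key, then call reduce once per key.
    public static Map<String, String> run(Map<String, String> documents) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (Map.Entry<String, String> doc : documents.entrySet()) {
            for (Map.Entry<String, String> pair : map(doc.getKey(), doc.getValue())) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
            }
        }
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : grouped.entrySet()) {
            List<String> reduced = reduce(entry.getKey(), entry.getValue().iterator());
            result.put(entry.getKey(), reduced.get(0));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> docs = Map.of(
                "doc1", "the quick fox",
                "doc2", "the lazy dog the end");
        // prints {dog=1, end=1, fox=1, lazy=1, quick=1, the=3}
        System.out.println(run(docs));
    }
}
```

<p>Grouping into a <code>TreeMap</code> keeps the keys sorted, which loosely mirrors the way a real framework brings all values of one key together before calling <code>Reduce</code>.</p>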
<p>The process above can also be written as the following expressions:</p>
<p>$$ map(k1, v1) -&gt; list(k2, v2) $$</p>
<p>$$ reduce(k2, list(v2)) -&gt; list(v2) $$</p></summary>
</entry>
<entry>
<title>Method References</title>
<link href="https://andrewei1316.github.io/2019/06/02/method-reference/"/>
<id>https://andrewei1316.github.io/2019/06/02/method-reference/</id>
<published>2019-06-02T05:55:26.000Z</published>
<updated>2019-06-02T08:49:33.290Z</updated>
<content type="html"><![CDATA[<h2 id="概述"><a href="#概述" class="headerlink" title="概述"></a>Overview</h2><p>A <code>method reference</code> builds on Lambda expressions. It can be regarded as syntactic sugar for a Lambda expression that makes code shorter.</p><p>In a Lambda expression, the part to the right of <code>-></code> is the code to execute, that is, the work to be done; this part is called the Lambda body. Sometimes we want to implement the abstract method of a functional interface, and an existing class already provides exactly the behavior we need. In that case we can use a method reference to implement the interface directly with the existing method.</p><a id="more"></a><h2 id="四种形式"><a href="#四种形式" class="headerlink" title="四种形式"></a>The Four Forms</h2><h3 id="引用静态方法"><a href="#引用静态方法" class="headerlink" title="引用静态方法"></a>Reference to a Static Method</h3><p>Syntax: ClassName::staticMethodName</p><p>Example: <code>String::valueOf</code> corresponds to the Lambda expression <code>s ->String.valueOf(s)</code></p><h3 id="引用特定对象实例的方法"><a href="#引用特定对象实例的方法" class="headerlink" title="引用特定对象实例的方法"></a>Reference to an Instance Method of a Particular Object</h3><p>Syntax: object::instanceMethodName</p><p>Example: <code>obj::toString</code> corresponds to the Lambda expression <code>() -> obj.toString()</code></p><h3 id="引用特定类型任意对象的实例方法"><a href="#引用特定类型任意对象的实例方法" class="headerlink" title="引用特定类型任意对象的实例方法"></a>Reference to an Instance Method of an Arbitrary Object of a Particular Type</h3><p>Syntax: ClassName::instanceMethodName</p><p>Example: <code>String::compareTo</code> corresponds to the Lambda expression <code>(str1, str2) -> str1.compareTo(str2)</code></p><p>Note: this form is the hardest to grasp. Although <code>compareTo</code> takes only one parameter, the corresponding Lambda expression has two, because the first parameter is the object on which <code>compareTo</code> is invoked.</p><h3 id="引用构造方法"><a href="#引用构造方法" class="headerlink" title="引用构造方法"></a>Reference to a Constructor</h3><p>Syntax: ClassName::new</p><p>Example: <code>String::new</code> corresponds to the Lambda expression <code>() -> new String()</code></p><h2 id="例子"><a href="#例子" class="headerlink" title="例子"></a>Example</h2><figure class="highlight java"><table><tr><td class="code"><pre><span class="line"><span class="keyword">package</span> info.andrewei;</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> java.util.Arrays;</span><br><span class="line"><span class="keyword">import</span> java.util.List;</span><br><span class="line"><span class="keyword">import</span> java.util.function.Supplier;</span><br><span class="line"></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@author</span> Andrewei</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">public</span> <span class="class"><span class="keyword">class</span> <span class="title">Main</span> </span>{</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">static</span> <span class="keyword">void</span> <span class="title">main</span><span class="params">(String[] args)</span> </span>{</span><br><span class="line"> Student str1 = <span class="keyword">new</span> Student(<span class="string">"zhangsan"</span>, <span class="number">10</span>);</span><br><span class="line"> Student str2 = <span class="keyword">new</span> Student(<span class="string">"lisi"</span>, <span class="number">50</span>);</span><br><span class="line"> Student str3 = <span class="keyword">new</span> Student(<span class="string">"wangwu"</span>, <span class="number">40</span>);</span><br><span class="line"> Student str4 = <span class="keyword">new</span> Student(<span class="string">"zhaoliu"</span>, <span class="number">30</span>);</span><br><span class="line"> List<Student> list;</span><br><span 
class="line"></span><br><span class="line"> <span class="comment">// Form 1: reference to a static method (ClassName::staticMethod)</span></span><br><span class="line"> list = Arrays.asList(str1, str2, str3, str4);</span><br><span class="line"> list.sort(StudentCompare::staticCompareStudentsByName);</span><br><span class="line"> list.forEach(stu -> System.out.println(stu.getName()));</span><br><span class="line"></span><br><span class="line"> list.sort(StudentCompare::staticCompareStudentsByScore);</span><br><span class="line"> list.forEach(stu -> System.out.println(stu.getScore()));</span><br><span class="line"></span><br><span class="line"> System.out.println(<span class="string">"---------------------"</span>);</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"> <span class="comment">// Form 2: reference to an instance method of a particular object (instance::method)</span></span><br><span class="line"> StudentCompare studentComparator = <span class="keyword">new</span> StudentCompare();</span><br><span class="line"> list = Arrays.asList(str1, str2, str3, str4);</span><br><span class="line"> list.sort(studentComparator::compareStudentsByName);</span><br><span class="line"> list.forEach(stu -> System.out.println(stu.getName()));</span><br><span class="line"></span><br><span class="line"> list.sort(studentComparator::compareStudentsByScore);</span><br><span class="line"> list.forEach(stu -> System.out.println(stu.getScore()));</span><br><span class="line"></span><br><span class="line"> System.out.println(<span class="string">"---------------------"</span>);</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"> <span class="comment">// Form 3: reference to an instance method of an arbitrary object of a type (ClassName::instanceMethod)</span></span><br><span class="line"> list = Arrays.asList(str1, str2, str3, str4);</span><br><span class="line"> list.sort(Student::comparedByName);</span><br><span class="line"> list.forEach(stu -> System.out.println(stu.getName()));</span><br><span class="line"></span><br><span class="line"> list.sort(Student::comparedByScore);</span><br><span class="line"> list.forEach(stu -> 
System.out.println(stu.getScore()));</span><br><span class="line"></span><br><span class="line"> System.out.println(<span class="string">"---------------------"</span>);</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"> <span class="comment">// Form 4: constructor reference (ClassName::new)</span></span><br><span class="line"> System.out.println(Student.getStudent(String::<span class="keyword">new</span>));</span><br><span class="line"></span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="keyword">static</span> <span class="class"><span class="keyword">class</span> <span class="title">Student</span> </span>{</span><br><span class="line"> <span class="keyword">private</span> String name;</span><br><span class="line"> <span class="keyword">private</span> <span class="keyword">int</span> score;</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="title">Student</span><span class="params">(String name, <span class="keyword">int</span> score)</span> </span>{</span><br><span class="line"> <span class="keyword">this</span>.name = name;</span><br><span class="line"> <span class="keyword">this</span>.score = score;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">static</span> Student <span class="title">getStudent</span><span class="params">(Supplier<String> nameSupplier)</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> <span class="keyword">new</span> Student(nameSupplier.get()+<span class="string">"_test"</span>, <span class="number">10</span>);</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> String <span class="title">getName</span><span class="params">()</span> </span>{</span><br><span 
class="line"> <span class="keyword">return</span> name;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">int</span> <span class="title">getScore</span><span class="params">()</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> score;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">int</span> <span class="title">comparedByScore</span><span class="params">(Student stu1)</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> <span class="keyword">this</span>.score - stu1.getScore();</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">int</span> <span class="title">comparedByName</span><span class="params">(Student stu1)</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> <span class="keyword">this</span>.name.compareTo(stu1.getName());</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="meta">@Override</span></span><br><span class="line"> <span class="function"><span class="keyword">public</span> String <span class="title">toString</span><span class="params">()</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> <span class="string">"Student{"</span> +</span><br><span class="line"> <span class="string">"name='"</span> + name + <span class="string">'\''</span> +</span><br><span class="line"> <span class="string">", score="</span> + score +</span><br><span class="line"> <span class="string">'}'</span>;</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span 
class="keyword">static</span> <span class="class"><span class="keyword">class</span> <span class="title">StudentCompare</span> </span>{</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">static</span> <span class="keyword">int</span> <span class="title">staticCompareStudentsByScore</span><span class="params">(Student stu1, Student stu2)</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> stu1.getScore() - stu2.getScore();</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">static</span> <span class="keyword">int</span> <span class="title">staticCompareStudentsByName</span><span class="params">(Student stu1, Student stu2)</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> stu1.getName().compareTo(stu2.getName());</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">int</span> <span class="title">compareStudentsByScore</span><span class="params">(Student stu1, Student stu2)</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> stu1.getScore() - stu2.getScore();</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">int</span> <span class="title">compareStudentsByName</span><span class="params">(Student stu1, Student stu2)</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> stu1.getName().compareTo(stu2.getName());</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// 
Output</span></span><br><span class="line"><span class="comment">//lisi</span></span><br><span class="line"><span class="comment">//wangwu</span></span><br><span class="line"><span class="comment">//zhangsan</span></span><br><span class="line"><span class="comment">//zhaoliu</span></span><br><span class="line"><span class="comment">//10</span></span><br><span class="line"><span class="comment">//30</span></span><br><span class="line"><span class="comment">//40</span></span><br><span class="line"><span class="comment">//50</span></span><br><span class="line"><span class="comment">//---------------------</span></span><br><span class="line"><span class="comment">//lisi</span></span><br><span class="line"><span class="comment">//wangwu</span></span><br><span class="line"><span class="comment">//zhangsan</span></span><br><span class="line"><span class="comment">//zhaoliu</span></span><br><span class="line"><span class="comment">//10</span></span><br><span class="line"><span class="comment">//30</span></span><br><span class="line"><span class="comment">//40</span></span><br><span class="line"><span class="comment">//50</span></span><br><span class="line"><span class="comment">//---------------------</span></span><br><span class="line"><span class="comment">//lisi</span></span><br><span class="line"><span class="comment">//wangwu</span></span><br><span class="line"><span class="comment">//zhangsan</span></span><br><span class="line"><span class="comment">//zhaoliu</span></span><br><span class="line"><span class="comment">//10</span></span><br><span class="line"><span class="comment">//30</span></span><br><span class="line"><span class="comment">//40</span></span><br><span class="line"><span class="comment">//50</span></span><br><span class="line"><span class="comment">//---------------------</span></span><br><span class="line"><span class="comment">//Student{name='_test', score=10}</span></span><br></pre></td></tr></table></figure>]]></content>
<summary type="html"><h2 id="概述"><a href="#概述" class="headerlink" title="概述"></a>Overview</h2><p><code>Method references</code> are built on Lambda expressions. They can be regarded as syntactic sugar for Lambda expressions that simplifies development.</p>
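<p>In a Lambda expression, the part to the right of <code>-&gt;</code> is the code to execute, that is, the behavior to perform; this part is called the Lambda body. Sometimes we need to implement the single abstract method of a functional interface, but an existing class already provides a method that does what we want. In that case, a method reference lets us reuse the existing method directly as the implementation.</p></summary>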
<category term="编程语言" scheme="https://andrewei1316.github.io/categories/%E7%BC%96%E7%A8%8B%E8%AF%AD%E8%A8%80/"/>
<category term="JAVA" scheme="https://andrewei1316.github.io/categories/%E7%BC%96%E7%A8%8B%E8%AF%AD%E8%A8%80/JAVA/"/>
<category term="java" scheme="https://andrewei1316.github.io/tags/java/"/>
</entry>
<entry>
<title>Optional in Detail</title>
<link href="https://andrewei1316.github.io/2019/05/18/java-optional/"/>
<id>https://andrewei1316.github.io/2019/05/18/java-optional/</id>
<published>2019-05-18T02:36:48.000Z</published>
<updated>2019-05-18T04:17:50.903Z</updated>
<content type="html"><![CDATA[<h3 id="定义"><a href="#定义" class="headerlink" title="定义"></a>Definition</h3><ol><li><p><code>Optional</code> was introduced mainly to help avoid <code>NullPointerException</code>.</p></li><li><p><code>Optional</code> is a container for a single value: it holds either an <code>Object</code> or <code>null</code>.</p></li><li><p>It is a <code>value-based class</code>.</p><blockquote><p>A value-based class must satisfy the following conditions:</p><ol><li>It must be <code>final</code> and immutable (though it may contain references to mutable objects);</li><li>It must implement <code>equals</code>, <code>hashCode</code> and <code>toString</code>, and these methods must be computed solely from the state of the current instance, not from its identity or from the state of any other object or variable;</li><li>It makes no use of identity-sensitive operations, such as comparing instances by reference with <code>==</code>, the identity <code>hashCode</code> of an instance, or synchronization on an instance's intrinsic lock;</li><li>Two instances are considered equal solely on the basis of <code>equals()</code>, not on the basis of reference equality (==);</li><li>It has no accessible constructors (the constructors are private) and is instantiated only through factory methods, which make no commitment as to the identity of the returned instances (that is, the first and second call may return different instances);</li><li>If <code>equals</code> reports two instances as equal, they can be freely substituted for one another.</li></ol></blockquote></li></ol><a id="more"></a><h3 id="static-方法"><a href="#static-方法" class="headerlink" title="static 方法"></a>static methods</h3><ol><li><p><code>empty()</code> returns an empty <code>Optional</code>, one whose contained value is <code>null</code>;</p></li><li><p><code>of()</code> constructs an <code>Optional</code> whose contained value is not <code>null</code> (passing <code>null</code> throws a <code>NullPointerException</code>);</p></li><li><p><code>ofNullable()</code> constructs an <code>Optional</code> whose contained value may or may not be <code>null</code>.</p></li></ol><h3 id="一些方法"><a href="#一些方法" class="headerlink" title="一些方法"></a>Some methods</h3><h4 id="ifPresent-方法"><a href="#ifPresent-方法" class="headerlink" title="ifPresent 方法"></a>ifPresent method</h4><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span 
class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * If a value is present, invoke the specified consumer with the value,</span></span><br><span class="line"><span class="comment"> * otherwise do nothing.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> consumer block to be executed if a value is present</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@throws</span> NullPointerException if value is present and {<span class="doctag">@code</span> consumer} is</span></span><br><span class="line"><span class="comment"> * null</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">ifPresent</span><span class="params">(Consumer<? <span class="keyword">super</span> T> consumer)</span> </span>{</span><br><span class="line"> <span class="keyword">if</span> (value != <span class="keyword">null</span>)</span><br><span class="line"> consumer.accept(value);</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>This method takes a <code>Consumer</code> and, when the value held by the <code>Optional</code> is not <code>null</code>, calls the <code>Consumer</code>'s <code>accept</code> method with that value.</p><p>It is convenient when something should happen only if an object is not null, for example:</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span 
class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> info.andrewei;</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> java.util.Optional;</span><br><span class="line"></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@author</span> Andrewei</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">public</span> <span class="class"><span class="keyword">class</span> <span class="title">Main</span> </span>{</span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">static</span> <span class="keyword">void</span> <span class="title">main</span><span class="params">(String[] args)</span> </span>{</span><br><span class="line"> Main ma = <span class="keyword">new</span> Main();</span><br><span class="line"> ma.print(<span class="keyword">null</span>);</span><br><span class="line"> ma.print(<span class="string">"hello"</span>);</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">print</span><span class="params">(String str)</span> </span>{</span><br><span class="line"> <span class="comment">// Approach 1: explicit null check</span></span><br><span class="line"> <span class="keyword">if</span> (<span class="keyword">null</span> != str) {</span><br><span class="line"> System.out.println(str);</span><br><span class="line"> 
}</span><br><span class="line"></span><br><span class="line"> <span class="comment">// Approach 2: Optional.ifPresent</span></span><br><span class="line"> Optional<String> optional = Optional.ofNullable(str);</span><br><span class="line"> optional.ifPresent(System.out::println);</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h4 id="orElse-方法"><a href="#orElse-方法" class="headerlink" title="orElse 方法"></a>orElse method</h4><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Return the value if present, otherwise return {<span class="doctag">@code</span> other}.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> other the value to be returned if there is no value present, may</span></span><br><span class="line"><span class="comment"> * be null</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@return</span> the value, if present, otherwise {<span class="doctag">@code</span> other}</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="keyword">public</span> T <span class="title">orElse</span><span class="params">(T other)</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> value != <span class="keyword">null</span> ? 
value : other;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>This method takes an argument of the same type as the value held by the current <code>Optional</code>. If the current <code>Optional</code> is empty (its value is <code>null</code>), the argument is returned; otherwise the contained value is returned.</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> info.andrewei;</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> java.util.Optional;</span><br><span class="line"></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@author</span> Andrewei</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">public</span> <span class="class"><span class="keyword">class</span> <span class="title">Main</span> </span>{</span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">static</span> <span class="keyword">void</span> <span class="title">main</span><span class="params">(String[] 
args)</span> </span>{</span><br><span class="line"> Main ma = <span class="keyword">new</span> Main();</span><br><span class="line"> ma.print(<span class="keyword">null</span>);</span><br><span class="line"> ma.print(<span class="string">"hello"</span>);</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">print</span><span class="params">(String str)</span> </span>{</span><br><span class="line"> <span class="comment">// Approach 1: explicit null check with a fallback</span></span><br><span class="line"> <span class="keyword">if</span> (<span class="keyword">null</span> != str) {</span><br><span class="line"> System.out.println(str);</span><br><span class="line"> } <span class="keyword">else</span> {</span><br><span class="line"> System.out.println(<span class="string">"world"</span>);</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="comment">// Approach 2: Optional.orElse</span></span><br><span class="line"> Optional<String> optional = Optional.ofNullable(str);</span><br><span class="line"> System.out.println(optional.orElse(<span class="string">"world"</span>));</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h4 id="orElseGet-方法"><a href="#orElseGet-方法" class="headerlink" title="orElseGet 方法"></a>orElseGet method</h4><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span 
class="comment">/**</span></span><br><span class="line"><span class="comment"> * Return the value if present, otherwise invoke {<span class="doctag">@code</span> other} and return</span></span><br><span class="line"><span class="comment"> * the result of that invocation.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> other a {<span class="doctag">@code</span> Supplier} whose result is returned if no value</span></span><br><span class="line"><span class="comment"> * is present</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@return</span> the value if present otherwise the result of {<span class="doctag">@code</span> other.get()}</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@throws</span> NullPointerException if value is not present and {<span class="doctag">@code</span> other} is</span></span><br><span class="line"><span class="comment"> * null</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="keyword">public</span> T <span class="title">orElseGet</span><span class="params">(Supplier<? extends T> other)</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> value != <span class="keyword">null</span> ? 
value : other.get();</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>Unlike <code>orElse</code>, this method takes a <code>Supplier</code>. Only when the current <code>Optional</code> is empty (its value is <code>null</code>) does it call the <code>Supplier</code>'s <code>get()</code> method to produce the value to return.</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> info.andrewei;</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> java.util.Optional;</span><br><span class="line"></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@author</span> Andrewei</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">public</span> <span class="class"><span class="keyword">class</span> <span class="title">Main</span> </span>{</span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">static</span> <span 
class="keyword">void</span> <span class="title">main</span><span class="params">(String[] args)</span> </span>{</span><br><span class="line"> Main ma = <span class="keyword">new</span> Main();</span><br><span class="line"> ma.print(<span class="keyword">null</span>);</span><br><span class="line"> ma.print(<span class="string">"hello"</span>);</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">print</span><span class="params">(String str)</span> </span>{</span><br><span class="line"> <span class="comment">// Approach 1: explicit null check with a fallback</span></span><br><span class="line"> <span class="keyword">if</span> (<span class="keyword">null</span> != str) {</span><br><span class="line"> System.out.println(str);</span><br><span class="line"> } <span class="keyword">else</span> {</span><br><span class="line"> System.out.println(<span class="string">"world"</span>);</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="comment">// Approach 2: Optional.orElseGet</span></span><br><span class="line"> Optional<String> optional = Optional.ofNullable(str);</span><br><span class="line"> System.out.println(optional.orElseGet(() -> <span class="string">"world"</span>));</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h4 id="map-方法"><a href="#map-方法" class="headerlink" title="map 方法"></a>map method</h4><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span 
class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * If a value is present, apply the provided mapping function to it,</span></span><br><span class="line"><span class="comment"> * and if the result is non-null, return an {<span class="doctag">@code</span> Optional} describing the</span></span><br><span class="line"><span class="comment"> * result. Otherwise return an empty {<span class="doctag">@code</span> Optional}.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@apiNote</span> This method supports post-processing on optional values, without</span></span><br><span class="line"><span class="comment"> * the need to explicitly check for a return status. 
For example, the</span></span><br><span class="line"><span class="comment"> * following code traverses a stream of file names, selects one that has</span></span><br><span class="line"><span class="comment"> * not yet been processed, and then opens that file, returning an</span></span><br><span class="line"><span class="comment"> * {<span class="doctag">@code</span> Optional<FileInputStream>}:</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <pre>{<span class="doctag">@code</span></span></span><br><span class="line"><span class="comment"> * Optional<FileInputStream> fis =</span></span><br><span class="line"><span class="comment"> * names.stream().filter(name -> !isProcessedYet(name))</span></span><br><span class="line"><span class="comment"> * .findFirst()</span></span><br><span class="line"><span class="comment"> * .map(name -> new FileInputStream(name));</span></span><br><span class="line"><span class="comment"> * }</pre></span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * Here, {<span class="doctag">@code</span> findFirst} returns an {<span class="doctag">@code</span> Optional<String>}, and then</span></span><br><span class="line"><span class="comment"> * {<span class="doctag">@code</span> map} returns an {<span class="doctag">@code</span> Optional<FileInputStream>} for the desired</span></span><br><span class="line"><span class="comment"> * file if one exists.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> <U> The type of the result of the mapping function</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> mapper a mapping function to apply to the value, if present</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@return</span> an {<span 
class="doctag">@code</span> Optional} describing the result of applying a mapping</span></span><br><span class="line"><span class="comment"> * function to the value of this {<span class="doctag">@code</span> Optional}, if a value is present,</span></span><br><span class="line"><span class="comment"> * otherwise an empty {<span class="doctag">@code</span> Optional}</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@throws</span> NullPointerException if the mapping function is null</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="keyword">public</span><U> Optional<U> <span class="title">map</span><span class="params">(Function<? <span class="keyword">super</span> T, ? extends U> mapper)</span> </span>{</span><br><span class="line"> Objects.requireNonNull(mapper);</span><br><span class="line"> <span class="keyword">if</span> (!isPresent())</span><br><span class="line"> <span class="keyword">return</span> empty();</span><br><span class="line"> <span class="keyword">else</span> {</span><br><span class="line"> <span class="keyword">return</span> Optional.ofNullable(mapper.apply(value));</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>The <code>map</code> method takes a <code>Function</code> object and returns a new <code>Optional</code> object.</p><p>If the current <code>Optional</code> holds no value (its <code>value</code> is <code>null</code>), <code>map</code> returns an empty <code>Optional</code>. If <code>value</code> is not <code>null</code>, it calls the <code>Function</code> object's <code>apply</code> method on <code>value</code> to obtain a new value <code>value1</code> and wraps it with <code>Optional.ofNullable</code>, so the result is an <code>Optional</code> containing <code>value1</code>, or an empty <code>Optional</code> if <code>value1</code> is <code>null</code>. In other words, this method can transform the contained value.</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span 
class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span 
class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> info.andrewei;</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> java.util.Arrays;</span><br><span class="line"><span class="keyword">import</span> java.util.Collections;</span><br><span class="line"><span class="keyword">import</span> java.util.List;</span><br><span class="line"><span class="keyword">import</span> java.util.Optional;</span><br><span class="line"></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@author</span> Andrewei</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">public</span> <span class="class"><span class="keyword">class</span> <span class="title">Main</span> </span>{</span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">static</span> <span class="keyword">void</span> <span class="title">main</span><span class="params">(String[] args)</span> </span>{</span><br><span class="line"> Main ma = <span class="keyword">new</span> Main();</span><br><span class="line"></span><br><span class="line"> Person p1 = <span class="keyword">new</span> Person();</span><br><span class="line"> 
p1.setName(<span class="string">"zhangsan"</span>);</span><br><span class="line"> p1.setAge(<span class="number">10</span>);</span><br><span class="line"></span><br><span class="line"> Person p2 = <span class="keyword">new</span> Person();</span><br><span class="line"> p2.setName(<span class="string">"lisi"</span>);</span><br><span class="line"> p2.setAge(<span class="number">20</span>);</span><br><span class="line"></span><br><span class="line"> Company company = <span class="keyword">new</span> Company();</span><br><span class="line"> company.setName(<span class="string">"company1"</span>);</span><br><span class="line"> company.setPersonList(Arrays.asList(p1, p2));</span><br><span class="line"></span><br><span class="line"> System.out.println(ma.getCompanyPersons(<span class="keyword">null</span>));</span><br><span class="line"> System.out.println(ma.getCompanyPersons(<span class="keyword">new</span> Company()));</span><br><span class="line"> System.out.println(ma.getCompanyPersons(company));</span><br><span class="line"></span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> List<Person> <span class="title">getCompanyPersons</span><span class="params">(Company company)</span> </span>{</span><br><span class="line"> Optional<Company> optional = Optional.ofNullable(company);</span><br><span class="line"> <span class="keyword">return</span> optional.map(Company::getPersonList).orElse(Collections.emptyList());</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="keyword">static</span> <span class="class"><span class="keyword">class</span> <span class="title">Company</span> </span>{</span><br><span class="line"> String name;</span><br><span class="line"> List<Person> personList;</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> String <span 
class="title">getName</span><span class="params">()</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> name;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">setName</span><span class="params">(String name)</span> </span>{</span><br><span class="line"> <span class="keyword">this</span>.name = name;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> List<Person> <span class="title">getPersonList</span><span class="params">()</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> personList;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">setPersonList</span><span class="params">(List<Person> personList)</span> </span>{</span><br><span class="line"> <span class="keyword">this</span>.personList = personList;</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="keyword">static</span> <span class="class"><span class="keyword">class</span> <span class="title">Person</span> </span>{</span><br><span class="line"> String name;</span><br><span class="line"> <span class="keyword">int</span> age;</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> String <span class="title">getName</span><span class="params">()</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> name;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span 
class="keyword">void</span> <span class="title">setName</span><span class="params">(String name)</span> </span>{</span><br><span class="line"> <span class="keyword">this</span>.name = name;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">int</span> <span class="title">getAge</span><span class="params">()</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> age;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">setAge</span><span class="params">(<span class="keyword">int</span> age)</span> </span>{</span><br><span class="line"> <span class="keyword">this</span>.age = age;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="meta">@Override</span></span><br><span class="line"> <span class="function"><span class="keyword">public</span> String <span class="title">toString</span><span class="params">()</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> <span class="string">"Person{"</span> +</span><br><span class="line"> <span class="string">"name='"</span> + name + <span class="string">'\''</span> +</span><br><span class="line"> <span class="string">", age="</span> + age +</span><br><span class="line"> <span class="string">'}'</span>;</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h4 id="filter-方法"><a href="#filter-方法" class="headerlink" title="filter 方法"></a>The filter Method</h4><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * If a value is present, and the value matches the given predicate,</span></span><br><span class="line"><span class="comment"> * return an {<span class="doctag">@code</span> Optional} describing the value, otherwise return an</span></span><br><span class="line"><span class="comment"> * empty {<span class="doctag">@code</span> Optional}.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> predicate a predicate to apply to the value, if present</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@return</span> an {<span class="doctag">@code</span> Optional} describing the value of this {<span class="doctag">@code</span> Optional}</span></span><br><span class="line"><span class="comment"> * if a value is present and the value matches the given predicate,</span></span><br><span class="line"><span class="comment"> * otherwise an empty {<span class="doctag">@code</span> Optional}</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@throws</span> NullPointerException if the predicate is null</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="keyword">public</span> Optional<T> <span class="title">filter</span><span class="params">(Predicate<? 
<span class="keyword">super</span> T> predicate)</span> </span>{</span><br><span class="line"> Objects.requireNonNull(predicate);</span><br><span class="line"> <span class="keyword">if</span> (!isPresent())</span><br><span class="line"> <span class="keyword">return</span> <span class="keyword">this</span>;</span><br><span class="line"> <span class="keyword">else</span></span><br><span class="line"> <span class="keyword">return</span> predicate.test(value) ? <span class="keyword">this</span> : empty();</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>This method takes a <code>Predicate</code> object. If the value <code>value</code> held by the current <code>Optional</code> is <code>null</code>, or if applying the <code>Predicate</code> object's <code>test</code> method to <code>value</code> returns <code>false</code>, the method returns an empty <code>Optional</code>; otherwise it returns itself.</p><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> info.andrewei;</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> java.util.Optional;</span><br><span class="line"></span><br><span class="line"><span 
class="comment">/**</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@author</span> Andrewei</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">public</span> <span class="class"><span class="keyword">class</span> <span class="title">Main</span> </span>{</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">static</span> <span class="keyword">void</span> <span class="title">main</span><span class="params">(String[] args)</span> </span>{</span><br><span class="line"> Main ma = <span class="keyword">new</span> Main();</span><br><span class="line"></span><br><span class="line"> ma.print(<span class="keyword">null</span>);</span><br><span class="line"> ma.print(<span class="string">"abc"</span>);</span><br><span class="line"> ma.print(<span class="string">"abcd"</span>);</span><br><span class="line"></span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">print</span><span class="params">(String str)</span> </span>{</span><br><span class="line"> Optional<String> optional = Optional.ofNullable(str);</span><br><span class="line"></span><br><span class="line"> Optional<String> newOpt = optional.filter(s -> s.length() > <span class="number">3</span>);</span><br><span class="line"> System.out.println(newOpt.orElse(<span class="string">"no print"</span>));</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure>]]></content>
<summary type="html"><h3 id="定义"><a href="#定义" class="headerlink" title="定义"></a>Definition</h3><ol>
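<li><p><code>Optional</code> was introduced mainly to address <code>NullPointerException</code>.</p>
</li>
<li><p><code>Optional</code> is a container for a single value: it holds either an <code>Object</code> or <code>null</code>.</p>
</li>
<li><p>It is a <code>value-based class</code>.</p>
<blockquote>
<p>A value-based class must satisfy the following conditions:</p>
<ol>
<li>It must be <code>final</code> and immutable (though it may contain references to mutable objects);</li>
<li>It must implement <code>equals</code>, <code>hashCode</code>, and <code>toString</code>, and these methods must be computed solely from the state of the instance, not from its identity or from the state of any other object or variable;</li>
<li>It makes no use of identity-sensitive operations, such as reference equality (<code>==</code>) between instances, the identity hash code of an instance, or synchronization on an instance's intrinsic lock;</li>
<li>Two instances are considered equal solely on the basis of <code>equals()</code>, not on reference equality (==);</li>
<li>It has no accessible constructors (the constructors are private); instances are created only through factory methods, and those factory methods make no commitment about the identity of the returned instances (a first and a second call may return different instances);</li>
<li>If two instances are equal according to <code>equals</code>, they can be freely substituted for each other.</li>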
</ol>
</blockquote>
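<p>The conditions above imply that <code>Optional</code> instances should be compared with <code>equals</code>, never with <code>==</code>. The following is a minimal sketch of that rule; the class name <code>ValueBasedDemo</code> and the helper <code>sameWrappedValue</code> are hypothetical names chosen for illustration and are not part of the JDK.</p>

```java
import java.util.Optional;

public class ValueBasedDemo {

    // Hypothetical helper: wraps both arguments and compares the wrappers
    // by state via equals(), never by identity (==).
    static boolean sameWrappedValue(String a, String b) {
        return Optional.ofNullable(a).equals(Optional.ofNullable(b));
    }

    public static void main(String[] args) {
        // Two distinct String objects with equal contents: the wrappers are equal.
        System.out.println(sameWrappedValue("abc", new String("abc"))); // true
        // Two empty wrappers are equal to each other.
        System.out.println(sameWrappedValue(null, null)); // true
        // A present value never equals an empty wrapper.
        System.out.println(sameWrappedValue("abc", null)); // false
    }
}
```

<p>Because the factory methods make no promise about instance identity, a comparison with <code>==</code> on the same inputs could go either way, which is why the sketch relies on <code>equals</code> only.</p>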
</li>
</ol></summary>
<category term="编程语言" scheme="https://andrewei1316.github.io/categories/%E7%BC%96%E7%A8%8B%E8%AF%AD%E8%A8%80/"/>
<category term="JAVA" scheme="https://andrewei1316.github.io/categories/%E7%BC%96%E7%A8%8B%E8%AF%AD%E8%A8%80/JAVA/"/>
<category term="java" scheme="https://andrewei1316.github.io/tags/java/"/>
<category term="optional" scheme="https://andrewei1316.github.io/tags/optional/"/>
</entry>
<entry>
<title>Lambda Expressions and Functional Interfaces</title>
<link href="https://andrewei1316.github.io/2019/05/04/java-lambda/"/>
<id>https://andrewei1316.github.io/2019/05/04/java-lambda/</id>
<published>2019-05-04T08:54:51.000Z</published>
<updated>2019-05-08T01:49:54.018Z</updated>
<content type="html"><![CDATA[<blockquote><p>Notes on a video tutorial; the video is available at <a href="https://www.bilibili.com/video/av46434650">深入理解 Java8+jdk8 源码级思想</a></p></blockquote><h2 id="Lambda-表达式"><a href="#Lambda-表达式" class="headerlink" title="Lambda 表达式"></a>Lambda Expressions</h2><h3 id="Lambda-表达式简介"><a href="#Lambda-表达式简介" class="headerlink" title="Lambda 表达式简介"></a>Introduction to Lambda Expressions</h3><h4 id="介绍"><a href="#介绍" class="headerlink" title="介绍"></a>Overview</h4><p>A lambda expression can be thought of as an anonymous function (in Java it is actually an object, but for now treat it as an anonymous function). Put simply, it is a method without a declaration: it has no access modifier, no declared return type, and no name.</p><h4 id="作用"><a href="#作用" class="headerlink" title="作用"></a>Purpose</h4><ol><li>Before Java 8, a function could not be passed to a method as an argument, nor could a method be declared to return a function. Lambda expressions add the missing functional-programming capability to Java, letting us treat functions as first-class citizens.</li><li>In languages where functions are first-class citizens, the type of a lambda expression is a function. In Java, however, a lambda expression is an object, and it must be attached to a special kind of object type: a functional interface.</li></ol><a id="more"></a><h3 id="Lambda-表达式的语法结构"><a href="#Lambda-表达式的语法结构" class="headerlink" title="Lambda 表达式的语法结构"></a>Syntax of Lambda Expressions</h3><ul><li>A lambda expression can have zero or more parameters.</li><li>Parameter types can be declared explicitly or inferred from the context. For example, <code>(int a)</code> and <code>(a)</code> have the same effect.</li><li>All parameters are enclosed in parentheses and separated by commas. For example: <code>(a, b)</code>, <code>(int a, int b)</code>, or <code>(String a, int b, float c)</code></li><li>Empty parentheses denote an empty parameter list. For example: <code>() -> 42</code></li><li>When there is exactly one parameter and its type can be inferred, the parentheses <code>()</code> can be omitted. For example: <code>a -> a*a</code></li><li>The body of a lambda expression can contain zero or more statements.</li><li>If the body of a lambda expression is a single expression or statement, the braces <code>{}</code> can be omitted. The return type of the anonymous function is the type of that body expression.</li><li>If the body of a lambda expression contains more than one statement, the statements must be enclosed in braces <code>{}</code> (forming a block). The return type of the anonymous function is the return type of the block, or void if nothing is returned.</li></ul><h2 id="函数式接口"><a href="#函数式接口" class="headerlink" title="函数式接口"></a>Functional Interfaces</h2><h3 id="函数式接口简介"><a href="#函数式接口简介" class="headerlink" title="函数式接口简介"></a>Introduction to Functional Interfaces</h3><h4 id="定义"><a href="#定义" class="headerlink" title="定义"></a>Definition</h4><p>An interface that has exactly one abstract method is called a functional interface.</p><blockquote><p> If an interface declares a method that overrides a method of java.lang.Object, that method does not count as one of the interface's abstract methods. The interface declared in the code below is therefore also a functional interface.<br> <figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span 
class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">@FunctionalInterface</span></span><br><span class="line"><span class="keyword">public</span> <span class="class"><span class="keyword">interface</span> <span class="title">MyInterface</span> </span>{</span><br><span class="line"> <span class="function"><span class="keyword">void</span> <span class="title">test</span><span class="params">()</span></span>;</span><br><span class="line"> </span><br><span class="line"> <span class="meta">@Override</span></span><br><span class="line"> <span class="function">String <span class="title">toString</span><span class="params">()</span></span>;</span><br><span class="line">}</span><br></pre></td></tr></table></figure></p></blockquote><h4 id="几个知识点"><a href="#几个知识点" class="headerlink" title="几个知识点"></a>Key Points</h4><ol><li>If an interface is annotated with FunctionalInterface, the compiler holds it to the definition of a functional interface and rejects it if it does not have exactly one abstract method.</li><li>If an interface has exactly one abstract method but is not annotated with FunctionalInterface, the compiler still treats it as a functional interface.</li><li>Instances of a functional interface can be created with lambda expressions, method references, and constructor references.</li></ol><h3 id="java8中常用的函数式接口"><a href="#java8中常用的函数式接口" class="headerlink" title="java8中常用的函数式接口"></a>Commonly Used Functional Interfaces in Java 8</h3><h4 id="Function-接口详解"><a href="#Function-接口详解" class="headerlink" title="Function 接口详解"></a>The Function Interface in Detail</h4><h5 id="源码解析"><a href="#源码解析" class="headerlink" title="源码解析"></a>Source Code Walkthrough</h5><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span 
class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Represents 
a function that accepts one argument and produces a result.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <p>This is a <a href="package-summary.html">functional interface</a></span></span><br><span class="line"><span class="comment"> * whose functional method is {<span class="doctag">@link</span> #apply(Object)}.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> <T> the type of the input to the function</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> <R> the type of the result of the function</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@since</span> 1.8</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="meta">@FunctionalInterface</span></span><br><span class="line"><span class="keyword">public</span> <span class="class"><span class="keyword">interface</span> <span class="title">Function</span><<span class="title">T</span>, <span class="title">R</span>> </span>{</span><br><span class="line"></span><br><span class="line"> <span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Applies this function to the given argument.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> t the function argument</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@return</span> the function result</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"> <span class="function">R <span class="title">apply</span><span class="params">(T t)</span></span>;</span><br><span class="line"></span><br><span 
class="line"> <span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Returns a composed function that first applies the {<span class="doctag">@code</span> before}</span></span><br><span class="line"><span class="comment"> * function to its input, and then applies this function to the result.</span></span><br><span class="line"><span class="comment"> * If evaluation of either function throws an exception, it is relayed to</span></span><br><span class="line"><span class="comment"> * the caller of the composed function.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> <V> the type of input to the {<span class="doctag">@code</span> before} function, and to the</span></span><br><span class="line"><span class="comment"> * composed function</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> before the function to apply before this function is applied</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@return</span> a composed function that first applies the {<span class="doctag">@code</span> before}</span></span><br><span class="line"><span class="comment"> * function and then applies this function</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@throws</span> NullPointerException if before is null</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@see</span> #andThen(Function)</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"> <span class="keyword">default</span> <V> <span class="function">Function<V, R> <span class="title">compose</span><span class="params">(Function<? <span class="keyword">super</span> V, ? 
extends T> before)</span> </span>{</span><br><span class="line"> Objects.requireNonNull(before);</span><br><span class="line"> <span class="keyword">return</span> (V v) -> apply(before.apply(v));</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Returns a composed function that first applies this function to</span></span><br><span class="line"><span class="comment"> * its input, and then applies the {<span class="doctag">@code</span> after} function to the result.</span></span><br><span class="line"><span class="comment"> * If evaluation of either function throws an exception, it is relayed to</span></span><br><span class="line"><span class="comment"> * the caller of the composed function.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> <V> the type of output of the {<span class="doctag">@code</span> after} function, and of the</span></span><br><span class="line"><span class="comment"> * composed function</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> after the function to apply after this function is applied</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@return</span> a composed function that first applies this function and then</span></span><br><span class="line"><span class="comment"> * applies the {<span class="doctag">@code</span> after} function</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@throws</span> NullPointerException if after is null</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@see</span> #compose(Function)</span></span><br><span class="line"><span class="comment"> */</span></span><br><span 
class="line"> <span class="keyword">default</span> <V> <span class="function">Function<T, V> <span class="title">andThen</span><span class="params">(Function<? <span class="keyword">super</span> R, ? extends V> after)</span> </span>{</span><br><span class="line"> Objects.requireNonNull(after);</span><br><span class="line"> <span class="keyword">return</span> (T t) -> after.apply(apply(t));</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Returns a function that always returns its input argument.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> <T> the type of the input and output objects to the function</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@return</span> a function that always returns its input argument</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"> <span class="keyword">static</span> <T> <span class="function">Function<T, T> <span class="title">identity</span><span class="params">()</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> t -> t;</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>The <code>Function</code> functional interface has four methods in total: one abstract method, two methods with default implementations, and one static method.</p><ol><li><code>R apply(T t)</code> takes an argument of type <code>T</code> and returns a value of type <code>R</code>.</li><li><code><V> java.util.function.Function<V, R> compose(java.util.function.Function<? super V, ? extends T> before)</code> and <code><V> java.util.function.Function<T, V> andThen(java.util.function.Function<? super R, ? 
extends V> after)</code> provide two ways of composing behavior. The former calls the <code>apply</code> method of another <code>Function</code> before calling its own <code>apply</code> method; the latter runs its own <code>apply</code> method first and then the <code>apply</code> method of another <code>Function</code>. Note that both methods return a new <code>Function</code> object that implements <code>apply</code>, not the computed result, so after calling either of them you still need to call <code>.apply(T)</code> to obtain the result.</li><li><code><T> java.util.function.Function<T, T> identity()</code> returns a function that simply returns its input argument.</li></ol><h5 id="一个例子"><a href="#一个例子" class="headerlink" title="一个例子"></a>An Example</h5><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> info.andrewei;</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> java.util.function.Function;</span><br><span class="line"></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@author</span> Andrewei</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">public</span> <span class="class"><span 
class="keyword">class</span> <span class="title">Main</span> </span>{</span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">static</span> <span class="keyword">void</span> <span class="title">main</span><span class="params">(String[] args)</span> </span>{</span><br><span class="line"> Main ma = <span class="keyword">new</span> Main();</span><br><span class="line"> System.out.println(ma.compute1(<span class="number">2</span>, value -> value * <span class="number">3</span>, value -> value * value));</span><br><span class="line"> System.out.println(ma.compute2(<span class="number">2</span>, value -> value * <span class="number">3</span>, value -> value * value));</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">int</span> <span class="title">compute1</span><span class="params">(<span class="keyword">int</span> a, Function<Integer, Integer> function1, Function<Integer, Integer> function2)</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> function1.compose(function2).apply(a);</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">int</span> <span class="title">compute2</span><span class="params">(<span class="keyword">int</span> a, Function<Integer, Integer> function1, Function<Integer, Integer> function2)</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> function1.andThen(function2).apply(a);</span><br><span class="line"> }</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Output</span></span><br><span class="line"><span class="comment">// 3 * (2 * 2) = 12</span></span><br><span class="line"><span class="comment">// (2 * 3) ^ 2 = 
36</span></span><br></pre></td></tr></table></figure><h4 id="BIFunction-接口详解"><a href="#BIFunction-接口详解" class="headerlink" title="BIFunction 接口详解"></a>BiFunction interface in detail</h4><h5 id="源码解析-1"><a href="#源码解析-1" class="headerlink" title="源码解析"></a>Source code analysis</h5><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Represents a function that accepts two arguments and produces a result.</span></span><br><span class="line"><span class="comment"> * This is the 
two-arity specialization of {<span class="doctag">@link</span> Function}.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <p>This is a <a href="package-summary.html">functional interface</a></span></span><br><span class="line"><span class="comment"> * whose functional method is {<span class="doctag">@link</span> #apply(Object, Object)}.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> <T> the type of the first argument to the function</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> <U> the type of the second argument to the function</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> <R> the type of the result of the function</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@see</span> Function</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@since</span> 1.8</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="meta">@FunctionalInterface</span></span><br><span class="line"><span class="keyword">public</span> <span class="class"><span class="keyword">interface</span> <span class="title">BiFunction</span><<span class="title">T</span>, <span class="title">U</span>, <span class="title">R</span>> </span>{</span><br><span class="line"></span><br><span class="line"> <span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Applies this function to the given arguments.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> t the first function argument</span></span><br><span class="line"><span 
class="comment"> * <span class="doctag">@param</span> u the second function argument</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@return</span> the function result</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"> <span class="function">R <span class="title">apply</span><span class="params">(T t, U u)</span></span>;</span><br><span class="line"></span><br><span class="line"> <span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Returns a composed function that first applies this function to</span></span><br><span class="line"><span class="comment"> * its input, and then applies the {<span class="doctag">@code</span> after} function to the result.</span></span><br><span class="line"><span class="comment"> * If evaluation of either function throws an exception, it is relayed to</span></span><br><span class="line"><span class="comment"> * the caller of the composed function.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> <V> the type of output of the {<span class="doctag">@code</span> after} function, and of the</span></span><br><span class="line"><span class="comment"> * composed function</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> after the function to apply after this function is applied</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@return</span> a composed function that first applies this function and then</span></span><br><span class="line"><span class="comment"> * applies the {<span class="doctag">@code</span> after} function</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@throws</span> NullPointerException if after is null</span></span><br><span class="line"><span class="comment"> */</span></span><br><span 
class="line"> <span class="keyword">default</span> <V> <span class="function">BiFunction<T, U, V> <span class="title">andThen</span><span class="params">(Function<? <span class="keyword">super</span> R, ? extends V> after)</span> </span>{</span><br><span class="line"> Objects.requireNonNull(after);</span><br><span class="line"> <span class="keyword">return</span> (T t, U u) -> after.apply(apply(t, u));</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>This interface can be understood by analogy with <code>Function</code>. Note that the parameter of <code><V> BiFunction<T, U, V> andThen(Function<? super R, ? extends V> after)</code> is of type <code>Function</code>. The reason is straightforward: <code>andThen</code> first runs its own <code>apply</code> method and then the <code>apply</code> method of the <code>Function</code> that was passed in. Its own <code>apply</code> method produces only a single return value of type <code>R</code>, so the <code>apply</code> method that follows can take only one argument.</p><h5 id="一个例子-1"><a href="#一个例子-1" class="headerlink" title="一个例子"></a>An example</h5><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span 
class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> info.andrewei;</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> java.util.function.BiFunction;</span><br><span class="line"><span class="keyword">import</span> java.util.function.Function;</span><br><span class="line"></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@author</span> Andrewei</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">public</span> <span class="class"><span class="keyword">class</span> <span class="title">Main</span> </span>{</span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">static</span> <span class="keyword">void</span> <span class="title">main</span><span class="params">(String[] args)</span> </span>{</span><br><span class="line"> Main ma = <span class="keyword">new</span> Main();</span><br><span class="line"></span><br><span class="line"> System.out.println(ma.compute3(<span class="number">1</span>, <span class="number">2</span>, (value1, value2) -> value1 + value2));</span><br><span class="line"> System.out.println(ma.compute3(<span class="number">1</span>, <span class="number">2</span>, (value1, value2) -> value1 - value2));</span><br><span class="line"> System.out.println(ma.compute3(<span class="number">1</span>, <span class="number">2</span>, (value1, value2) -> value1 * value2));</span><br><span class="line"> System.out.println(ma.compute3(<span class="number">1</span>, <span class="number">2</span>, (value1, value2) -> value1 / value2));</span><br><span class="line"></span><br><span class="line"> 
System.out.println(ma.compute4(<span class="number">2</span>, <span class="number">3</span>, (value1, value2) -> value1 + value2, value -> value * value));</span><br><span class="line"></span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">int</span> <span class="title">compute3</span><span class="params">(<span class="keyword">int</span> a, <span class="keyword">int</span> b, BiFunction<Integer, Integer, Integer> biFunction)</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> biFunction.apply(a, b);</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">int</span> <span class="title">compute4</span><span class="params">(<span class="keyword">int</span> a, <span class="keyword">int</span> b, BiFunction<Integer, Integer, Integer> biFunction, Function<Integer, Integer> function)</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> biFunction.andThen(function).apply(a, b);</span><br><span class="line"> }</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Output</span></span><br><span class="line"><span class="comment">// 3</span></span><br><span class="line"><span class="comment">// -1</span></span><br><span class="line"><span class="comment">// 2</span></span><br><span class="line"><span class="comment">// 0</span></span><br><span class="line"><span class="comment">// 25</span></span><br></pre></td></tr></table></figure><h4 id="Predicate-接口详解"><a href="#Predicate-接口详解" class="headerlink" title="Predicate 接口详解"></a>Predicate interface in detail</h4><h5 id="源码解析-2"><a href="#源码解析-2" class="headerlink" title="源码解析"></a>Source code analysis</h5><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span 
class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">@FunctionalInterface</span></span><br><span class="line"><span class="keyword">public</span> <span class="class"><span class="keyword">interface</span> <span class="title">Predicate</span><<span class="title">T</span>> </span>{</span><br><span class="line"></span><br><span class="line"> <span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Evaluates this predicate on the given argument.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> t the input argument</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@return</span> {<span class="doctag">@code</span> true} if the input argument matches the predicate,</span></span><br><span class="line"><span class="comment"> * otherwise {<span class="doctag">@code</span> false}</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"> <span class="function"><span class="keyword">boolean</span> <span class="title">test</span><span class="params">(T t)</span></span>;</span><br><span class="line"></span><br><span class="line"> <span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Returns a composed predicate that represents a short-circuiting 
logical</span></span><br><span class="line"><span class="comment"> * AND of this predicate and another. When evaluating the composed</span></span><br><span class="line"><span class="comment"> * predicate, if this predicate is {<span class="doctag">@code</span> false}, then the {<span class="doctag">@code</span> other}</span></span><br><span class="line"><span class="comment"> * predicate is not evaluated.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <p>Any exceptions thrown during evaluation of either predicate are relayed</span></span><br><span class="line"><span class="comment"> * to the caller; if evaluation of this predicate throws an exception, the</span></span><br><span class="line"><span class="comment"> * {<span class="doctag">@code</span> other} predicate will not be evaluated.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> other a predicate that will be logically-ANDed with this</span></span><br><span class="line"><span class="comment"> * predicate</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@return</span> a composed predicate that represents the short-circuiting logical</span></span><br><span class="line"><span class="comment"> * AND of this predicate and the {<span class="doctag">@code</span> other} predicate</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@throws</span> NullPointerException if other is null</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"> <span class="function"><span class="keyword">default</span> Predicate<T> <span class="title">and</span><span class="params">(Predicate<? 
<span class="keyword">super</span> T> other)</span> </span>{</span><br><span class="line"> Objects.requireNonNull(other);</span><br><span class="line"> <span class="keyword">return</span> (t) -> test(t) && other.test(t);</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Returns a predicate that represents the logical negation of this</span></span><br><span class="line"><span class="comment"> * predicate.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@return</span> a predicate that represents the logical negation of this</span></span><br><span class="line"><span class="comment"> * predicate</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"> <span class="function"><span class="keyword">default</span> Predicate<T> <span class="title">negate</span><span class="params">()</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> (t) -> !test(t);</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Returns a composed predicate that represents a short-circuiting logical</span></span><br><span class="line"><span class="comment"> * OR of this predicate and another. 
When evaluating the composed</span></span><br><span class="line"><span class="comment"> * predicate, if this predicate is {<span class="doctag">@code</span> true}, then the {<span class="doctag">@code</span> other}</span></span><br><span class="line"><span class="comment"> * predicate is not evaluated.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <p>Any exceptions thrown during evaluation of either predicate are relayed</span></span><br><span class="line"><span class="comment"> * to the caller; if evaluation of this predicate throws an exception, the</span></span><br><span class="line"><span class="comment"> * {<span class="doctag">@code</span> other} predicate will not be evaluated.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> other a predicate that will be logically-ORed with this</span></span><br><span class="line"><span class="comment"> * predicate</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@return</span> a composed predicate that represents the short-circuiting logical</span></span><br><span class="line"><span class="comment"> * OR of this predicate and the {<span class="doctag">@code</span> other} predicate</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@throws</span> NullPointerException if other is null</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"> <span class="function"><span class="keyword">default</span> Predicate<T> <span class="title">or</span><span class="params">(Predicate<? 
<span class="keyword">super</span> T> other)</span> </span>{</span><br><span class="line"> Objects.requireNonNull(other);</span><br><span class="line"> <span class="keyword">return</span> (t) -> test(t) || other.test(t);</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Returns a predicate that tests if two arguments are equal according</span></span><br><span class="line"><span class="comment"> * to {<span class="doctag">@link</span> Objects#equals(Object, Object)}.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> <T> the type of arguments to the predicate</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> targetRef the object reference with which to compare for equality,</span></span><br><span class="line"><span class="comment"> * which may be {<span class="doctag">@code</span> null}</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@return</span> a predicate that tests if two arguments are equal according</span></span><br><span class="line"><span class="comment"> * to {<span class="doctag">@link</span> Objects#equals(Object, Object)}</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"> <span class="keyword">static</span> <T> <span class="function">Predicate<T> <span class="title">isEqual</span><span class="params">(Object targetRef)</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> (<span class="keyword">null</span> == targetRef)</span><br><span class="line"> ? 
Objects::isNull</span><br><span class="line"> : object -> targetRef.equals(object);</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>This interface is mainly used for checks, that is, <code>does the input satisfy a condition</code> scenarios. It has five methods:</p><ol><li><code>boolean test(T t)</code> takes an argument of type <code>T</code> and returns a <code>boolean</code> value.</li><li><code>Predicate<T> and(Predicate<? super T> other)</code> accepts another <code>Predicate</code>; the composed predicate returns <code>true</code> only when both <code>Predicate</code>s evaluate to <code>true</code>, i.e. a logical <code>AND</code>.</li><li><code>Predicate<T> or(Predicate<? super T> other)</code> is the counterpart of the method above: that one is a logical <code>AND</code>, this one is a logical <code>OR</code>.</li><li><code>Predicate<T> negate()</code> returns a predicate that evaluates <code>!test(t)</code>.</li><li><code><T> Predicate<T> isEqual(Object targetRef)</code> tests whether two <code>object</code>s are equal. It may look odd at first glance: the method actually uses the passed-in argument <code>targetRef</code> to build a <code><T> Predicate<T></code> object, fixing <code>targetRef</code> as one of the two <code>object</code>s being compared; you then call <code>.test(obj)</code> to check for equality.</li></ol><h5 id="一个例子-2"><a href="#一个例子-2" class="headerlink" title="一个例子"></a>An example</h5><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span 
class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> info.andrewei;</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> java.util.Arrays;</span><br><span class="line"><span 
class="keyword">import</span> java.util.List;</span><br><span class="line"><span class="keyword">import</span> java.util.function.Predicate;</span><br><span class="line"><span class="keyword">import</span> java.util.stream.Collectors;</span><br><span class="line"></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@author</span> Andrewei</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">public</span> <span class="class"><span class="keyword">class</span> <span class="title">Main</span> </span>{</span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">static</span> <span class="keyword">void</span> <span class="title">main</span><span class="params">(String[] args)</span> </span>{</span><br><span class="line"> List<Integer> list = Arrays.asList(<span class="number">0</span>, <span class="number">1</span>, <span class="number">2</span>, <span class="number">3</span>, <span class="number">4</span>, <span class="number">5</span>, <span class="number">6</span>, <span class="number">7</span>, <span class="number">8</span>, <span class="number">9</span>);</span><br><span class="line"></span><br><span class="line"> System.out.println(list.stream().filter(i -> i % <span class="number">2</span> == <span class="number">0</span>).collect(Collectors.toList()));</span><br><span class="line"> System.out.println(<span class="string">"-----------------------"</span>);</span><br><span class="line"> System.out.println(list.stream().filter(i -> i % <span class="number">2</span> != <span class="number">0</span>).collect(Collectors.toList()));</span><br><span class="line"> System.out.println(<span class="string">"-----------------------"</span>);</span><br><span class="line"> System.out.println(list.stream().filter(i -> i >= <span 
class="number">5</span>).collect(Collectors.toList()));</span><br><span class="line"> System.out.println(<span class="string">"-----------------------"</span>);</span><br><span class="line"> System.out.println(list.stream().filter(i -> i < <span class="number">3</span>).collect(Collectors.toList()));</span><br><span class="line"> System.out.println(<span class="string">"-----------------------"</span>);</span><br><span class="line"> System.out.println(list.stream().filter(i -> <span class="keyword">true</span>).collect(Collectors.toList()));</span><br><span class="line"> System.out.println(<span class="string">"-----------------------"</span>);</span><br><span class="line"> System.out.println(list.stream().filter(i -> <span class="keyword">false</span>).collect(Collectors.toList()));</span><br><span class="line"> System.out.println(<span class="string">"-----------------------"</span>);</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"> Main ma = <span class="keyword">new</span> Main();</span><br><span class="line"></span><br><span class="line"> System.out.println(list.stream().filter(item -> ma.and(item, i -> i> <span class="number">5</span>, i-> i % <span class="number">2</span> == <span class="number">0</span>)).collect(Collectors.toList()));</span><br><span class="line"> System.out.println(<span class="string">"-----------------------"</span>);</span><br><span class="line"> System.out.println(list.stream().filter(item -> ma.or(item, i -> i> <span class="number">5</span>, i-> i % <span class="number">2</span> == <span class="number">0</span>)).collect(Collectors.toList()));</span><br><span class="line"> System.out.println(<span class="string">"-----------------------"</span>);</span><br><span class="line"> System.out.println(list.stream().filter(item -> ma.negate(item, i -> i> <span class="number">5</span>, i-> i % <span class="number">2</span> == <span 
class="number">0</span>)).collect(Collectors.toList()));</span><br><span class="line"> System.out.println(<span class="string">"-----------------------"</span>);</span><br><span class="line"></span><br><span class="line"> Predicate<String> isStringEqual = Predicate.isEqual(<span class="string">"string"</span>);</span><br><span class="line"> System.out.println(isStringEqual.test(<span class="string">"string"</span>));</span><br><span class="line"> System.out.println(<span class="string">"-----------------------"</span>);</span><br><span class="line"> System.out.println(isStringEqual.test(<span class="string">"string1"</span>));</span><br><span class="line"> System.out.println(<span class="string">"-----------------------"</span>);</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">boolean</span> <span class="title">and</span><span class="params">(<span class="keyword">int</span> i, Predicate<Integer> p1, Predicate<Integer> p2)</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> p1.and(p2).test(i);</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">boolean</span> <span class="title">or</span><span class="params">(<span class="keyword">int</span> i, Predicate<Integer> p1, Predicate<Integer> p2)</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> p1.or(p2).test(i);</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">boolean</span> <span class="title">negate</span><span class="params">(<span class="keyword">int</span> i, Predicate<Integer> p1, Predicate<Integer> p2)</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> 
p1.and(p2).negate().test(i);</span><br><span class="line"> }</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Output</span></span><br><span class="line"><span class="comment">//[0, 2, 4, 6, 8]</span></span><br><span class="line"><span class="comment">//-----------------------</span></span><br><span class="line"><span class="comment">//[1, 3, 5, 7, 9]</span></span><br><span class="line"><span class="comment">//-----------------------</span></span><br><span class="line"><span class="comment">//[5, 6, 7, 8, 9]</span></span><br><span class="line"><span class="comment">//-----------------------</span></span><br><span class="line"><span class="comment">//[0, 1, 2]</span></span><br><span class="line"><span class="comment">//-----------------------</span></span><br><span class="line"><span class="comment">//[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]</span></span><br><span class="line"><span class="comment">//-----------------------</span></span><br><span class="line"><span class="comment">//[]</span></span><br><span class="line"><span class="comment">//-----------------------</span></span><br><span class="line"><span class="comment">//[6, 8]</span></span><br><span class="line"><span class="comment">//-----------------------</span></span><br><span class="line"><span class="comment">//[0, 2, 4, 6, 7, 8, 9]</span></span><br><span class="line"><span class="comment">//-----------------------</span></span><br><span class="line"><span class="comment">//[0, 1, 2, 3, 4, 5, 7, 9]</span></span><br><span class="line"><span class="comment">//-----------------------</span></span><br><span class="line"><span class="comment">//true</span></span><br><span class="line"><span class="comment">//-----------------------</span></span><br><span class="line"><span class="comment">//false</span></span><br><span class="line"><span class="comment">//-----------------------</span></span><br></pre></td></tr></table></figure><h4 id="Supplier-接口详解"><a 
href="#Supplier-接口详解" class="headerlink" title="Supplier 接口详解"></a>The Supplier Interface in Detail</h4><h5 id="源码解析-3"><a href="#源码解析-3" class="headerlink" title="源码解析"></a>Source Code Walkthrough</h5><figure class="highlight java"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Represents a supplier of results.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <p>There is no requirement that a new or distinct result be returned each</span></span><br><span class="line"><span class="comment"> * time the supplier is invoked.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <p>This is a <a href="package-summary.html">functional interface</a></span></span><br><span class="line"><span class="comment"> * whose functional method is {<span class="doctag">@link</span> #get()}.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@param</span> <T> the type of results supplied by this supplier</span></span><br><span class="line"><span 
class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@since</span> 1.8</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="meta">@FunctionalInterface</span></span><br><span class="line"><span class="keyword">public</span> <span class="class"><span class="keyword">interface</span> <span class="title">Supplier</span><<span class="title">T</span>> </span>{</span><br><span class="line"></span><br><span class="line"> <span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Gets a result.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@return</span> a result</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"> <span class="function">T <span class="title">get</span><span class="params">()</span></span>;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>This interface is very simple: it has a single abstract method, <code>T get()</code>, which returns an object. The object returned may be the same on every call or a different one each time.</p>]]></content>
<summary type="html"><blockquote>
<p>Notes from a video tutorial; the video is available at <a href="https://www.bilibili.com/video/av46434650">深入理解 Java8+jdk8 源码级思想</a>.</p>
</blockquote>
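<h2 id="Lambda-表达式"><a href="#Lambda-表达式" class="headerlink" title="Lambda 表达式"></a>Lambda Expressions</h2><h3 id="Lambda-表达式简介"><a href="#Lambda-表达式简介" class="headerlink" title="Lambda 表达式简介"></a>Overview of Lambda Expressions</h3><h4 id="介绍"><a href="#介绍" class="headerlink" title="介绍"></a>Introduction</h4><p>A lambda expression can be thought of as an anonymous function (in Java it is actually an object, but for now treat it as an anonymous function). Put simply, it is a method without a declaration: it has no access modifier, no declared return type, and no name.</p>
<h4 id="作用"><a href="#作用" class="headerlink" title="作用"></a>Purpose</h4><ol>
<li>Before Java 8, a function could not be passed to a method as an argument, nor could a method be declared to return a function. Lambda expressions add the functional-programming features that Java was missing and let us treat functions as first-class citizens.</li>
<li>In languages where functions are first-class citizens, the type of a lambda expression is a function. In Java, however, lambda expressions are objects, and they must be attached to a special kind of object type: the functional interface.</li>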
</ol></summary>
<category term="编程语言" scheme="https://andrewei1316.github.io/categories/%E7%BC%96%E7%A8%8B%E8%AF%AD%E8%A8%80/"/>
<category term="JAVA" scheme="https://andrewei1316.github.io/categories/%E7%BC%96%E7%A8%8B%E8%AF%AD%E8%A8%80/JAVA/"/>
<category term="java" scheme="https://andrewei1316.github.io/tags/java/"/>
<category term="lambda" scheme="https://andrewei1316.github.io/tags/lambda/"/>
<category term="函数式接口" scheme="https://andrewei1316.github.io/tags/%E5%87%BD%E6%95%B0%E5%BC%8F%E6%8E%A5%E5%8F%A3/"/>
</entry>
<entry>
<title>Deploying a Locally Generated Hexo Blog to a Server with git</title>
<link href="https://andrewei1316.github.io/2016/11/28/hexo-deploy-to-vps/"/>
<id>https://andrewei1316.github.io/2016/11/28/hexo-deploy-to-vps/</id>
<published>2016-11-28T05:34:45.000Z</published>
<updated>2018-04-09T01:16:07.266Z</updated>
<content type="html"><![CDATA[<h2 id="hexo-简介"><a href="#hexo-简介" class="headerlink" title="hexo 简介"></a>About Hexo</h2><p>Hexo is a fast, simple, and efficient blog framework. It parses posts written in Markdown (or other rendering engines) and, within seconds, generates static pages with an attractive theme. Because Hexo ultimately produces static pages, deploying only requires uploading those pages to a server. Combined with git, this gives us one-command automatic deployment. The steps below describe how to set it up.</p><a id="more"></a><h2 id="服务器准备工作"><a href="#服务器准备工作" class="headerlink" title="服务器准备工作"></a>Server Preparation</h2><h3 id="安装-git"><a href="#安装-git" class="headerlink" title="安装 git"></a>Install git</h3><p>Install git on the server:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">yum install git</span><br></pre></td></tr></table></figure><h3 id="新建-git-用户"><a href="#新建-git-用户" class="headerlink" title="新建 git 用户"></a>Create a git User</h3><p>Create a <code>git</code> user on the server (any other name works too):</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">useradd -d /home/git -m git</span><br></pre></td></tr></table></figure><p>This command creates a new user, creates <code>/home/git/</code> as that user's home directory, and, with the default settings on most distributions, also creates a group with the same name as the user.</p><h3 id="配置-SSH-免密访问"><a href="#配置-SSH-免密访问" class="headerlink" title="配置 SSH 免密访问"></a>Configure Passwordless SSH Access</h3><p>On your local machine (the one you write the blog on), go to the <code>~/.ssh/</code> directory (create it if it does not exist) and run:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">ssh-keygen -t rsa</span><br></pre></td></tr></table></figure><p>Press Enter at every prompt. This generates two files in <code>~/.ssh/</code>: <code>id_rsa</code> and <code>id_rsa.pub</code>. Copy <code>id_rsa.pub</code> to the <code>git</code> user's home directory on the server:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">scp ~/.ssh/id_rsa.pub git@<your-ip>:.</span><br></pre></td></tr></table></figure><p>Log in to the server as the <code>git</code> user and run:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">mkdir -p ~/.ssh</span><br><span class="line">cat id_rsa.pub >> ~/.ssh/authorized_keys</span><br></pre></td></tr></table></figure><p>At this point the client machine can log in to the server without a password.<br>If login still fails, file permissions may be the cause; fix them on the server as follows:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">chmod 700 ~/.ssh</span><br><span class="line">chmod 600 ~/.ssh/authorized_keys</span><br></pre></td></tr></table></figure><p>On the server, the <code>~/.ssh/</code> directory must be created as the <code>git</code> user.</p><h2 id="服务器配置"><a href="#服务器配置" class="headerlink" title="服务器配置"></a>Server Configuration</h2><h3 id="在服务器上创建仓库"><a href="#在服务器上创建仓库" class="headerlink" title="在服务器上创建仓库"></a>Create the Repository on the Server</h3><p>Log in to the server as the <code>git</code> user and run:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">cd ~</span><br><span class="line">mkdir blog.git</span><br><span class="line">cd blog.git</span><br><span class="line">git init --bare</span><br></pre></td></tr></table></figure><h3 id="配置-Git-Hook"><a href="#配置-Git-Hook" class="headerlink" title="配置 Git Hook"></a>Configure the Git Hook</h3><p>Assume the web root is <code>/var/www/</code> and the blog lives in its <code>blog</code> subdirectory.<br>Log in to the server as <code>root</code>, go to <code>/var/www/</code>, and create the <code>blog</code> subdirectory. At this point the <code>git</code> user has no write permission on this directory. Checking with <code>ls -l</code> shows that <code>blog</code> belongs to <code>root</code>:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">drwxr-xr-x 2 root root 4096 10月 27 00:19 blog</span><br></pre></td></tr></table></figure><p>The <code>git</code> user needs access to this directory: after the <code>blog.git</code> repository receives a push, the <code>git</code> user has to 
<code>checkout</code> the pushed content into <code>/var/www/blog/</code>. Because the directory was created by <code>root</code>, the <code>git</code> user cannot write to it, so hand ownership of the directory to the <code>git</code> user:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">chown git:git blog</span><br></pre></td></tr></table></figure><p>Check again with <code>ls -l</code>:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">drwxr-xr-x 2 git git 4096 10月 27 00:19 blog</span><br></pre></td></tr></table></figure><p>Switch to the <code>git</code> user and run:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">cd ~</span><br><span class="line">git clone blog.git /var/www/blog</span><br></pre></td></tr></table></figure><p>The last step is to handle pushes to <code>blog.git</code> so that the content of the <code>blog</code> directory is updated automatically. As the <code>git</code> user, run:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">cd ~/blog.git/hooks</span><br><span class="line">touch post-receive</span><br><span class="line">cat > post-receive << EOF</span><br><span class="line"><span class="meta">></span><span class="bash"> <span class="comment">#!/bin/bash -l</span></span></span><br><span class="line"><span class="meta">></span><span class="bash"> <span class="built_in">unset</span> GIT_DIR</span></span><br><span class="line"><span class="meta">></span><span class="bash"> <span class="built_in">cd</span> /var/www/blog && git pull</span></span><br><span class="line"><span class="meta">></span><span class="bash"> 
EOF</span></span><br></pre></td></tr></table></figure><p>Make the script executable:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">chmod +x post-receive</span><br></pre></td></tr></table></figure><h2 id="配置本地-Hexo-的部署信息"><a href="#配置本地-Hexo-的部署信息" class="headerlink" title="配置本地 Hexo 的部署信息"></a>Configure Deployment in the Local Hexo Site</h2><p>Open <code>_config.yml</code>, find the <code>deploy</code> field, and change it as follows:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">deploy: </span><br><span class="line"> type: git</span><br><span class="line"> message: update</span><br><span class="line"> repository: git@<your-ip>:blog.git</span><br><span class="line"> branch: master</span><br></pre></td></tr></table></figure><p>If you changed the default SSH port, for security or any other reason, the <code>repository</code> setting above has to be changed to:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">repository: ssh://git@<your-ip>:<your-port>/~/blog.git</span><br></pre></td></tr></table></figure><p>Run the following command to deploy:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">hexo d</span><br></pre></td></tr></table></figure><p>If you see the following error:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">ERROR Deployer not found: git</span><br></pre></td></tr></table></figure><p>it means the deployer has not been installed yet. Starting with Hexo 3.0, the deployer must be installed separately:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">npm install hexo-deployer-git 
--save</span><br></pre></td></tr></table></figure><h2 id="测试"><a href="#测试" class="headerlink" title="测试"></a>Testing</h2><p>After finishing a post, you can simply run:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">hexo g -d</span><br></pre></td></tr></table></figure><p>to deploy the latest version of the blog to the server.</p>]]></content>
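<summary type="html"><h2 id="hexo-简介"><a href="#hexo-简介" class="headerlink" title="hexo 简介"></a>About Hexo</h2><p>Hexo is a fast, simple, and efficient blog framework. It parses posts written in Markdown (or other rendering engines) and, within seconds, generates static pages with an attractive theme. Because Hexo ultimately produces static pages, deploying only requires uploading those pages to a server. Combined with git, this gives us one-command automatic deployment. The steps below describe how to set it up.</p></summary>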
<category term="Linux" scheme="https://andrewei1316.github.io/categories/Linux/"/>
<category term="软件" scheme="https://andrewei1316.github.io/categories/Linux/%E8%BD%AF%E4%BB%B6/"/>
<category term="hexo" scheme="https://andrewei1316.github.io/categories/Linux/%E8%BD%AF%E4%BB%B6/hexo/"/>
<category term="hexo" scheme="https://andrewei1316.github.io/tags/hexo/"/>
<category term="git" scheme="https://andrewei1316.github.io/tags/git/"/>
<category term="blog" scheme="https://andrewei1316.github.io/tags/blog/"/>
</entry>
<entry>
<title>Connecting to an L2TP/IPsec VPN on Ubuntu 16.04</title>
<link href="https://andrewei1316.github.io/2016/11/27/Enabling-L2TP-IPSec-on-Ubuntu16-04/"/>
<id>https://andrewei1316.github.io/2016/11/27/Enabling-L2TP-IPSec-on-Ubuntu16-04/</id>
<published>2016-11-27T11:13:28.000Z</published>
<updated>2018-04-09T01:16:07.242Z</updated>
<content type="html"><![CDATA[<p>I recently needed to connect to an L2TP/IPsec VPN from <code>Ubuntu</code>, so I searched online and found that a developer named <code>Werner Jaeger</code> had written a program called <code>l2tp-ipsec-vpn</code> that solves this problem. On <code>Ubuntu16.04LTS</code>, however, the package source for that program no longer exists. I later found a solution in <a href="http://blog.z-proj.com/enabling-l2tp-over-ipsec-on-ubuntu-16-04/">Enabling L2TP over IPSec on Ubuntu 16.04</a>, and I am writing this post to keep a backup of it.</p><a id="more"></a><h3 id="安装依赖"><a href="#安装依赖" class="headerlink" title="安装依赖"></a>Install Dependencies</h3><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo apt install intltool libtool network-manager-dev libnm-util-dev libnm-glib-dev libnm-glib-vpn-dev libnm-gtk-dev libnm-dev libnma-dev ppp-dev libdbus-glib-1-dev libsecret-1-dev libgtk-3-dev libglib2.0-dev xl2tpd strongswan</span><br></pre></td></tr></table></figure><h3 id="下载源码并编译安装"><a href="#下载源码并编译安装" class="headerlink" title="下载源码并编译安装"></a>Download the Source, Build, and Install</h3><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">git clone https://github.com/nm-l2tp/network-manager-l2tp.git </span><br><span class="line"><span class="meta">#</span><span class="bash"> If that link is dead, clone from here instead: git <span class="built_in">clone</span> https://github.com/andrewei1316/network-manager-l2tp.git</span></span><br><span class="line">cd network-manager-l2tp </span><br><span class="line">autoreconf -fi </span><br><span class="line">intltoolize</span><br><span class="line"></span><br><span class="line">./configure --disable-static --prefix=/usr --sysconfdir=/etc --libdir=/usr/lib/x86_64-linux-gnu --libexecdir=/usr/lib/NetworkManager 
--localstatedir=/var --with-pppd-plugin-dir=/usr/lib/pppd/2.4.7</span><br><span class="line"></span><br><span class="line">make </span><br><span class="line">sudo make install </span><br></pre></td></tr></table></figure><h3 id="取消IPsec应用程序访问控制的设置"><a href="#取消IPsec应用程序访问控制的设置" class="headerlink" title="取消IPsec应用程序访问控制的设置"></a>Remove the Application Access Control (AppArmor) Settings for IPsec</h3><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">sudo apparmor_parser -R /etc/apparmor.d/usr.lib.ipsec.charon </span><br><span class="line">sudo apparmor_parser -R /etc/apparmor.d/usr.lib.ipsec.stroke </span><br></pre></td></tr></table></figure><h3 id="用libpcap代替x2ltpd"><a href="#用libpcap代替x2ltpd" class="headerlink" title="用libpcap代替x2ltpd"></a>Replace the Packaged xl2tpd with a Build That Uses libpcap</h3><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">sudo apt remove xl2tpd </span><br><span class="line">sudo apt install libpcap0.8-dev</span><br><span class="line"></span><br><span class="line">wget https://github.com/xelerance/xl2tpd/archive/v1.3.6/xl2tpd-1.3.6.tar.gz </span><br><span class="line">tar xvzf xl2tpd-1.3.6.tar.gz </span><br><span class="line">cd xl2tpd-1.3.6 </span><br><span class="line">make </span><br><span class="line">sudo make install </span><br></pre></td></tr></table></figure><p>Finally, reboot the computer. The <code>L2TP</code> connection type will then be available under <code>Add network connection</code>.</p>]]></content>
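<summary type="html"><p>I recently needed to connect to an L2TP/IPsec VPN from <code>Ubuntu</code>, so I searched online and found that a developer named <code>Werner Jaeger</code> had written a program called <code>l2tp-ipsec-vpn</code> that solves this problem. On <code>Ubuntu16.04LTS</code>, however, the package source for that program no longer exists. I later found a solution in <a href="http://blog.z-proj.com/enabling-l2tp-over-ipsec-on-ubuntu-16-04/">Enabling L2TP over IPSec on Ubuntu 16.04</a>, and I am writing this post to keep a backup of it.</p></summary>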
<category term="Linux" scheme="https://andrewei1316.github.io/categories/Linux/"/>
<category term="Linux" scheme="https://andrewei1316.github.io/tags/Linux/"/>
<category term="软件" scheme="https://andrewei1316.github.io/tags/%E8%BD%AF%E4%BB%B6/"/>
</entry>
<entry>
<title>Splay Tree (SplayTree)</title>
<link href="https://andrewei1316.github.io/2016/07/11/splay-tree/"/>
<id>https://andrewei1316.github.io/2016/07/11/splay-tree/</id>
<published>2016-07-11T02:30:56.000Z</published>
<updated>2018-04-09T01:16:07.269Z</updated>
<content type="html"><![CDATA[<h2 id="预备知识"><a href="#预备知识" class="headerlink" title="预备知识"></a>Prerequisites</h2><ol><li>Tree traversal</li><li>Basics of binary trees</li><li>Basics of binary search trees</li><li>Range update and range query with segment trees</li><li>Basics of balanced binary search trees (optional)</li></ol><h2 id="简介"><a href="#简介" class="headerlink" title="简介"></a>Introduction</h2><p>A splay tree (SplayTree) is an improved variant of the balanced binary search tree. Its operations closely resemble those of a balanced binary tree, but there are also many differences.</p><a id="more"></a><blockquote><p>Properties of a splay tree:</p></blockquote><ul><li>The structure of the tree is adjusted by rotating nodes in order to achieve a particular goal;</li><li>The tree as a whole stays ordered: however it is rotated, its in-order traversal is fixed and never changes;</li><li>A splay tree does not guarantee that the tree is “balanced”, so it cannot guarantee that every operation runs in $O(\log n)$ time; in an amortized sense, however, every operation costs $O(\log n)$;</li><li>A splay tree follows the “80/20 rule” (also called the “90/10 rule”; that is, $80\%$ of operations concentrate on $20\%$ of the data). In other words, the more often certain values are operated on in a splay tree, the cheaper operations on those values become, a property of great practical value.</li></ul><h2 id="基本操作"><a href="#基本操作" class="headerlink" title="基本操作"></a>Basic Operations</h2><blockquote><p>All sample code for the operations below is written for the <code>range update, range sum</code> problem. The code is based on <a href="http://www.cnblogs.com/kuangbin/archive/2013/04/21/3034081.html">kuangbin's blog</a>, which is acknowledged here.</p></blockquote><p><strong>Code overview:</strong></p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span 
class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="meta-keyword">define</span> key_value tree[tree[root].ch[1]].ch[0]</span></span><br><span class="line"><span class="comment">// Macro for the left child of the root's right child; operations try to gather the target data at this position</span></span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> maxn = <span class="number">100010</span>; <span class="comment">// maximum data size</span></span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> INF = (<span class="number">1</span> << <span class="number">29</span>); <span class="comment">// sentinel extreme value</span></span><br><span class="line"><span class="class"><span class="keyword">struct</span> <span class="title">Node</span>{</span></span><br><span class="line"> <span class="keyword">int</span> ch[<span class="number">2</span>]; <span class="comment">// left and right children</span></span><br><span class="line"> <span class="keyword">int</span> pre, val, size; <span class="comment">// parent, value of this node, size of the subtree rooted at this node</span></span><br><span class="line"> <span class="keyword">long</span> <span class="keyword">long</span> sum; <span class="comment">// sum of the subtree rooted at this node</span></span><br><span class="line"></span><br><span class="line"> <span class="keyword">int</span> rev, add, same; 
<span class="comment">// reverse tag, lazy add tag, "all elements in the interval are equal" tag</span></span><br><span class="line"> <span class="keyword">int</span> lx, rx, mx; <span class="comment">// max sum of a contiguous subsequence starting at the left end, max sum of one ending at the right end, max contiguous subsequence sum over the whole interval</span></span><br><span class="line">};</span><br><span class="line"></span><br><span class="line"><span class="keyword">int</span> root, total; <span class="comment">// root node, number of nodes</span></span><br><span class="line"><span class="built_in">stack</span> <<span class="keyword">int</span>> mPool; <span class="comment">// memory pool: holds nodes freed on deletion so they can be reused later</span></span><br><span class="line">Node tree[maxn]; <span class="comment">// all nodes of the tree</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">int</span> n, q; <span class="comment">// n numbers, q queries</span></span><br><span class="line"><span class="keyword">int</span> data[maxn]; <span class="comment">// original data</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">updateAdd</span><span class="params">(<span class="keyword">int</span> rt, <span class="keyword">int</span> add)</span></span>; <span class="comment">// update the lazy add tag</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">updateRev</span><span class="params">(<span class="keyword">int</span> rt)</span></span>; <span class="comment">// update the lazy reverse tag</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">updateSame</span><span class="params">(<span class="keyword">int</span> rt, <span class="keyword">int</span> val)</span></span>; <span class="comment">// update the lazy "set the interval to one value" tag</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">pushUp</span><span class="params">(<span class="keyword">int</span> rt)</span></span>; <span class="comment">// 
on the way back up, recompute the parent from its children</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">pushDown</span><span class="params">(<span class="keyword">int</span> rt)</span></span>; <span class="comment">// while descending the tree, push the parent's lazy tags down to its children</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">newNode</span><span class="params">(<span class="keyword">int</span> &rt, <span class="keyword">int</span> pre, <span class="keyword">int</span> val)</span></span>; <span class="comment">// add a new node (current root node, parent, value to add)</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">buildTree</span><span class="params">(<span class="keyword">int</span> &cur, <span class="keyword">int</span> l, <span class="keyword">int</span> r, <span class="keyword">int</span> pre, <span class="keyword">int</span> *a)</span></span>;</span><br><span class="line"> <span class="comment">// build the tree (current root node, left end of the range, right end of the range, parent, original data)</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">init</span><span class="params">(<span class="keyword">int</span> *data)</span></span>; <span class="comment">// initialize the whole tree by calling buildTree (original data)</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">rotate</span><span class="params">(<span class="keyword">int</span> cur, <span class="keyword">int</span> com)</span></span>; <span class="comment">// single rotation: rotate node cur left (com==0) or right (com==1) (node to rotate, rotation direction)</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">splay</span><span class="params">(<span class="keyword">int</span> rt, <span class="keyword">int</span> tar)</span></span>; <span class="comment">// restructure the tree so that node rt ends up directly below node tar (node to move, target node)</span></span><br><span 
class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getKth</span><span class="params">(<span class="keyword">int</span> rt, <span class="keyword">int</span> k)</span></span>; <span class="comment">// 得到第 k 个数,(当前根,第k个数)</span></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getValMinPos</span><span class="params">(<span class="keyword">int</span> rt, <span class="keyword">int</span> val)</span></span>; <span class="comment">// 得到比 val 大的最小值的位置(当前节点,val);</span></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getValPos</span><span class="params">(<span class="keyword">int</span> rt, <span class="keyword">int</span> val)</span></span>; <span class="comment">// 得到 val 的位置,(当前节点,val);</span></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getValRank</span><span class="params">(<span class="keyword">int</span> rt, <span class="keyword">int</span> val)</span></span>; <span class="comment">// 得到 val 的排名, (当前节点,val);</span></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getMin</span><span class="params">(<span class="keyword">int</span> rt)</span></span>; <span class="comment">// 得到最小的数字<树中的数基于大小排列>(当前节点);</span></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getMax</span><span class="params">(<span class="keyword">int</span> rt)</span></span>; <span class="comment">// 得到最大的数字<树中的数基于大小排列>(当前节点);</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">insertOne</span><span class="params">(<span class="keyword">int</span> x, <span class="keyword">int</span> val)</span></span>; <span class="comment">// 在第 x 个数后面插入 val;</span></span><br><span class="line"><span 
class="function"><span class="keyword">void</span> <span class="title">erase</span><span class="params">(<span class="keyword">int</span> rt)</span></span>; <span class="comment">// 内存回收,在删除节点的时候调用(删除的节点);</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">deleteOne</span><span class="params">(<span class="keyword">int</span> k)</span></span>; <span class="comment">// 回收第 k 个数</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">insert</span><span class="params">(<span class="keyword">int</span> pos, <span class="keyword">int</span> cnt, <span class="keyword">int</span> *val)</span></span>; <span class="comment">// 在第 pos 个数后插入 cnt 个数,这些数存放在 val 数组中</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">Delete</span><span class="params">(<span class="keyword">int</span> pos, <span class="keyword">int</span> cnt)</span></span>; <span class="comment">// 从第 pos 个数开始连续删除 cnt 个数</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">getSum</span><span class="params">(<span class="keyword">int</span> l, <span class="keyword">int</span> r)</span></span>; <span class="comment">// 获取区间[l, r]中的和</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">makeAdd</span><span class="params">(<span class="keyword">int</span> l, <span class="keyword">int</span> r, <span class="keyword">int</span> val)</span></span>; <span class="comment">// 将 [l, r] 区间的所有值都增加 val</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">makeSame</span><span class="params">(<span class="keyword">int</span> pos, <span class="keyword">int</span> cnt, <span class="keyword">int</span> val)</span></span>; <span class="comment">// 从 pos 开始连续的 cnt 个数都变为 
val</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">revolve</span><span class="params">(<span class="keyword">int</span> l, <span class="keyword">int</span> r, <span class="keyword">int</span> T)</span></span>; <span class="comment">// Range rotation: cyclically shift [l, r] right by T positions</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">reverse</span><span class="params">(<span class="keyword">int</span> l, <span class="keyword">int</span> r)</span></span>; <span class="comment">// Range reversal: reverse the order of all numbers in [l, r]</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">getMaxSum</span><span class="params">(<span class="keyword">int</span> pos, <span class="keyword">int</span> cnt)</span></span>; <span class="comment">// Maximum contiguous-subsequence sum within the cnt consecutive numbers starting at pos</span></span><br></pre></td></tr></table></figure><p><strong>Next, we walk through how the algorithm works and implement the code as we go:</strong></p><p>Below are <code>pushUp</code>, <code>pushDown</code>, and the lazy-tag handlers. They work much like range operations on a segment tree, so they are not explained in detail here.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span 
class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// 更新增量延迟标记</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">updateAdd</span><span class="params">(<span class="keyword">int</span> rt, <span class="keyword">int</span> add)</span></span>{</span><br><span class="line"> <span class="keyword">if</span>(!rt) <span class="keyword">return</span>;</span><br><span class="line"> tree[rt].add += add;</span><br><span class="line"> tree[rt].val += add;</span><br><span class="line"> tree[rt].sum += (<span class="keyword">long</span> <span class="keyword">long</span>)add * tree[rt].size;</span><br><span class="line">}</span><br><span class="line"></span><br><span 
class="line"><span class="comment">// 更新反转延迟标记</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">updateRev</span><span class="params">(<span class="keyword">int</span> rt)</span></span>{</span><br><span class="line"> <span class="keyword">if</span>(!rt) <span class="keyword">return</span>;</span><br><span class="line"> tree[rt].rev ^= <span class="number">1</span>;</span><br><span class="line"> swap(tree[rt].lx, tree[rt].rx);</span><br><span class="line"> swap(tree[rt].ch[<span class="number">0</span>], tree[rt].ch[<span class="number">1</span>]);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// 更新区间元素值相同延迟标记</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">updateSame</span><span class="params">(<span class="keyword">int</span> rt, <span class="keyword">int</span> val)</span></span>{</span><br><span class="line"> <span class="keyword">if</span>(!rt) <span class="keyword">return</span>;</span><br><span class="line"> tree[rt].val = val;</span><br><span class="line"> tree[rt].sum = val * tree[rt].size;</span><br><span class="line"> tree[rt].lx = tree[rt].rx = tree[rt].mx = max(val, val * tree[rt].size);</span><br><span class="line"> tree[rt].same = <span class="number">1</span>;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// 通过孩子节点的数据来更新父节点的数据</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">pushUp</span><span class="params">(<span class="keyword">int</span> rt)</span></span>{</span><br><span class="line"> <span class="keyword">int</span> lson = tree[rt].ch[<span class="number">0</span>], rson = tree[rt].ch[<span class="number">1</span>];</span><br><span class="line"></span><br><span class="line"> <span class="comment">// 
更新节点的大小</span></span><br><span class="line"> tree[rt].size = tree[lson].size + tree[rson].size + <span class="number">1</span>;</span><br><span class="line"></span><br><span class="line"> <span class="comment">// 更新该节点及其子树所有值的和</span></span><br><span class="line"> tree[rt].sum = tree[lson].sum + tree[rson].sum + tree[rt].val;</span><br><span class="line"></span><br><span class="line"> <span class="comment">// 更新子序列最大值</span></span><br><span class="line"> tree[rt].lx = max((<span class="keyword">long</span> <span class="keyword">long</span>)tree[lson].lx, tree[lson].sum + tree[rt].val + max(<span class="number">0</span>, tree[rson].lx));</span><br><span class="line"> tree[rt].rx = max((<span class="keyword">long</span> <span class="keyword">long</span>)tree[rson].rx, tree[rson].sum + tree[rt].val + max(<span class="number">0</span>, tree[lson].rx));</span><br><span class="line"> tree[rt].mx = max(<span class="number">0</span>, tree[lson].rx) + tree[rt].val + max(<span class="number">0</span>, tree[rson].lx);</span><br><span class="line"> tree[rt].mx = max(tree[rt].mx, max(tree[lson].mx, tree[rson].mx));</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// 将父节点的延迟标记更新到孩子节点</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">pushDown</span><span class="params">(<span class="keyword">int</span> rt)</span></span>{</span><br><span class="line"></span><br><span class="line"> <span class="comment">// 更新增量延迟标记</span></span><br><span class="line"> <span class="keyword">if</span>(tree[rt].add){</span><br><span class="line"> updateAdd(tree[rt].ch[<span class="number">0</span>], tree[rt].add);</span><br><span class="line"> updateAdd(tree[rt].ch[<span class="number">1</span>], tree[rt].add);</span><br><span class="line"> tree[rt].add = <span class="number">0</span>;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> 
<span class="comment">// 更新区间相同标记</span></span><br><span class="line"> <span class="keyword">if</span>(tree[rt].same){</span><br><span class="line"> updateSame(tree[rt].ch[<span class="number">0</span>], tree[rt].val);</span><br><span class="line"> updateSame(tree[rt].ch[<span class="number">1</span>], tree[rt].val);</span><br><span class="line"> tree[rt].same = <span class="number">0</span>;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="comment">// 更新反转标记</span></span><br><span class="line"> <span class="keyword">if</span>(tree[rt].rev){</span><br><span class="line"> updateRev(tree[rt].ch[<span class="number">0</span>]);</span><br><span class="line"> updateRev(tree[rt].ch[<span class="number">1</span>]);</span><br><span class="line"> tree[rt].rev = <span class="number">0</span>;</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h3 id="建树"><a href="#建树" class="headerlink" title="建树"></a>建树</h3><p>建树时要考虑数据的顺序问题,这是由你需要求解的问题所决定的。这些顺序包括,按照插入数据的大小排序,数据的原始排列顺序等等。</p><p>参考代码:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span 
class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">newNode</span><span class="params">(<span class="keyword">int</span> &rt, <span class="keyword">int</span> pre, <span class="keyword">int</span> val)</span></span>{</span><br><span class="line"> <span class="keyword">if</span>(!mPool.empty()){</span><br><span class="line"> rt = mPool.top();</span><br><span class="line"> mPool.pop();</span><br><span class="line"> }<span class="keyword">else</span>{</span><br><span class="line"> rt = ++total;</span><br><span class="line"> }</span><br><span class="line"> tree[rt].pre = pre;</span><br><span class="line"> tree[rt].size = <span class="number">1</span>;</span><br><span class="line"> tree[rt].val = val;</span><br><span class="line"> tree[rt].add = <span class="number">0</span>;</span><br><span class="line"> tree[rt].sum = val;</span><br><span class="line"> tree[rt].rev = tree[rt].same = <span class="number">0</span>;</span><br><span class="line"> tree[rt].ch[<span class="number">0</span>] = tree[rt].ch[<span class="number">1</span>] = <span class="number">0</span>;</span><br><span class="line"> tree[rt].lx = tree[rt].rx = tree[rt].mx = val;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">buildTree</span><span 
class="params">(<span class="keyword">int</span> &cur, <span class="keyword">int</span> l, <span class="keyword">int</span> r, <span class="keyword">int</span> pre, <span class="keyword">int</span> *a)</span></span>{</span><br><span class="line"> <span class="keyword">if</span>(l > r) <span class="keyword">return</span>;</span><br><span class="line"> <span class="keyword">int</span> mid = (l + r) >> <span class="number">1</span>;</span><br><span class="line"> newNode(cur, pre, a[mid]);</span><br><span class="line"> buildTree(tree[cur].ch[<span class="number">0</span>], l, mid - <span class="number">1</span>, cur, a);</span><br><span class="line"> buildTree(tree[cur].ch[<span class="number">1</span>], mid + <span class="number">1</span>, r, cur, a);</span><br><span class="line"> pushUp(cur);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">init</span><span class="params">(<span class="keyword">int</span> *data)</span></span>{</span><br><span class="line"></span><br><span class="line"> root = total = <span class="number">0</span>;</span><br><span class="line"> <span class="keyword">while</span>(!mPool.empty()) mPool.pop();</span><br><span class="line"> tree[root].rev = tree[root].same = <span class="number">0</span>;</span><br><span class="line"> tree[root].ch[<span class="number">0</span>] = tree[root].ch[<span class="number">1</span>] = <span class="number">0</span>;</span><br><span class="line"> tree[root].lx = tree[root].rx = tree[root].mx = -INF;</span><br><span class="line"> tree[root].sum = tree[root].add = tree[root].val = <span class="number">0</span>;</span><br><span class="line"> tree[root].pre = tree[root].size = tree[root].sum = <span class="number">0</span>;</span><br><span class="line"></span><br><span class="line"> newNode(root, <span class="number">0</span>, <span class="number">-1</span>); <span class="comment">// 
Note 1</span></span><br><span class="line"> newNode(tree[root].ch[<span class="number">1</span>], root, <span class="number">-1</span>); <span class="comment">// Note 2</span></span><br><span class="line"></span><br><span class="line"> buildTree(key_value, <span class="number">0</span>, n - <span class="number">1</span>, tree[root].ch[<span class="number">1</span>], data);</span><br><span class="line"></span><br><span class="line"> pushUp(tree[root].ch[<span class="number">1</span>]);</span><br><span class="line"> pushUp(root);</span><br><span class="line">}</span><br></pre></td></tr></table></figure><blockquote><p>Note 1 and Note 2 insert two extra sentinel nodes. Right after <code>init</code>, one of them is the <code>root, sitting above the tree built from the data</code>, and the other is the <code>root's right child</code>. Whatever operations are applied to the tree afterwards, these two nodes <code>always</code> remain the first and the last node of the in-order sequence. With them in place, we never have to check whether a parent is the root or whether a node is the rightmost one. Both are special cases that would otherwise need separate handling. The idea is the same as using a linked list with a dummy head node instead of one without; it is worth reading up on.</p></blockquote><h3 id="旋转"><a href="#旋转" class="headerlink" title="Rotation"></a>Rotation</h3><p>Most splay tree operations are built on rotations, which are also the most important part of the structure. Rotations generally come in three groups, six cases in total, described below:</p><h4 id="单旋"><a href="#单旋" class="headerlink" title="Single rotation"></a>Single rotation</h4><p>When the target node is the left or right child of the root, a single rotation moves the target node to the root position.<br><img src="/images/splay-tree/splay-tree1.png" alt="Right rotation"><br><img src="/images/splay-tree/splay-tree2.png" alt="Left rotation"></p><h4 id="一字型双旋"><a href="#一字型双旋" class="headerlink" title="Zig-zig double rotation"></a>Zig-zig double rotation</h4><p>Node <code>A</code>'s parent <code>B</code> is not the root, <code>B</code>'s parent is <code>C</code>, and <code>A</code> and <code>B</code> are both left children or both right children of their respective parents. In this case we perform a left-left or a right-right rotation.</p><h5 id="一字型双右旋"><a href="#一字型双右旋" class="headerlink" title="Zig-zig: two right rotations"></a>Zig-zig: two right rotations</h5><p><img src="/images/splay-tree/splay-tree3.png" alt="Zig-zig: two right rotations"></p><h5 id="一字型双左旋"><a href="#一字型双左旋" class="headerlink" title="Zig-zig: two left rotations"></a>Zig-zig: two left rotations</h5><p><img src="/images/splay-tree/splay-tree4.png" alt="Zig-zig: two left rotations"></p><h4 id="之字型双旋"><a href="#之字型双旋" class="headerlink" title="Zig-zag double rotation"></a>Zig-zag double rotation</h4><p>Node <code>A</code>'s parent <code>B</code> is not the root, <code>B</code>'s parent is <code>C</code>, and of <code>A</code> and <code>B</code> 
one is the left child of its parent while the other is the right child of its parent. In this case we perform a left-right or a right-left rotation.</p><h5 id="之字型先右后左"><a href="#之字型先右后左" class="headerlink" title="Zig-zag: right then left"></a>Zig-zag: right then left</h5><p><img src="/images/splay-tree/splay-tree5.png" alt="Zig-zag: right then left"></p><h5 id="之字型先左后右"><a href="#之字型先左后右" class="headerlink" title="Zig-zag: left then right"></a>Zig-zag: left then right</h5><p><img src="/images/splay-tree/splay-tree6.png" alt="Zig-zag: left then right"></p><p>Reference code:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span 
class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// 实现单旋</span></span><br><span class="line"><span class="comment">// com == 0 时, 对 cur 节点进行左旋</span></span><br><span class="line"><span class="comment">// com == 1 时, 对 cur 节点进行右旋</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">rotate</span><span class="params">(<span class="keyword">int</span> cur, <span class="keyword">int</span> com)</span></span>{</span><br><span class="line"> <span class="keyword">int</span> pre = tree[cur].pre;</span><br><span class="line"> pushDown(pre);</span><br><span class="line"> pushDown(cur);</span><br><span class="line"></span><br><span class="line"> tree[pre].ch[!com] = tree[cur].ch[com];</span><br><span class="line"> tree[tree[cur].ch[com]].pre = pre;</span><br><span class="line"></span><br><span class="line"> <span class="comment">/* 上面的语句可以展开成下面的语句</span></span><br><span class="line"><span class="comment"> if(com){</span></span><br><span class="line"><span class="comment"> tree[pre].ch[0] = tree[cur].ch[1]; </span></span><br><span class="line"><span class="comment"> tree[tree[cur].ch[1]].pre = pre;</span></span><br><span class="line"><span class="comment"> }else{</span></span><br><span class="line"><span class="comment"> tree[pre].ch[1] = tree[cur].ch[0]; </span></span><br><span class="line"><span class="comment"> tree[tree[cur].ch[0]].pre = pre;</span></span><br><span class="line"><span class="comment"> }</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"></span><br><span class="line"> <span 
class="keyword">if</span>(tree[pre].pre){</span><br><span class="line"> tree[tree[pre].pre].ch[tree[tree[pre].pre].ch[<span class="number">1</span>] == pre] = cur;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> tree[cur].pre = tree[pre].pre;</span><br><span class="line"> tree[cur].ch[com] = pre;</span><br><span class="line"> tree[pre].pre = cur;</span><br><span class="line"> pushUp(pre);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// 实现树的调整</span></span><br><span class="line"><span class="comment">// 将 rt 节点调整到 tar 下面</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">splay</span><span class="params">(<span class="keyword">int</span> rt, <span class="keyword">int</span> tar)</span></span>{</span><br><span class="line"> pushDown(rt);</span><br><span class="line"> <span class="keyword">while</span>(tree[rt].pre != tar){</span><br><span class="line"> <span class="keyword">if</span>(tree[tree[rt].pre].pre == tar){</span><br><span class="line"> pushDown(tree[rt].pre);</span><br><span class="line"> pushDown(rt);</span><br><span class="line"> rotate(rt, tree[tree[rt].pre].ch[<span class="number">0</span>] == rt);</span><br><span class="line"> }<span class="keyword">else</span>{</span><br><span class="line"> pushDown(tree[tree[rt].pre].pre);</span><br><span class="line"> pushDown(tree[rt].pre);</span><br><span class="line"> pushDown(rt);</span><br><span class="line"> <span class="keyword">int</span> pre = tree[rt].pre;</span><br><span class="line"> <span class="keyword">int</span> com = tree[tree[pre].pre].ch[<span class="number">0</span>] == pre;</span><br><span class="line"> <span class="keyword">if</span>(tree[pre].ch[com] == rt){</span><br><span class="line"> rotate(rt, !com);</span><br><span class="line"> rotate(rt, com);</span><br><span class="line"> }<span 
class="keyword">else</span>{</span><br><span class="line"> rotate(pre, com);</span><br><span class="line"> rotate(rt, com);</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> pushUp(rt);</span><br><span class="line"> <span class="keyword">if</span>(tar == <span class="number">0</span>) root = rt;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h3 id="获取某个值"><a href="#获取某个值" class="headerlink" title="Looking up a value"></a>Looking up a value</h3><p>Most splay tree operations first locate a node and then build the rest of the operation on top of it.</p><h4 id="获取第-k-个数"><a href="#获取第-k-个数" class="headerlink" title="Getting the k-th number"></a>Getting the k-th number</h4><p>It can be found from the subtree sizes: at each node, compare <code>k</code> with the size of the left subtree plus one and descend accordingly.<br>Reference code:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getKth</span><span class="params">(<span class="keyword">int</span> rt, <span class="keyword">int</span> k)</span></span>{</span><br><span class="line"> pushDown(rt);</span><br><span class="line"> <span class="keyword">int</span> tmp = tree[tree[rt].ch[<span class="number">0</span>]].size + <span class="number">1</span>;</span><br><span class="line"> <span class="keyword">if</span>(tmp == k) <span class="keyword">return</span> rt;</span><br><span class="line"> <span class="keyword">else</span> <span class="keyword">if</span>(tmp > k) <span class="keyword">return</span> getKth(tree[rt].ch[<span class="number">0</span>], k);</span><br><span class="line"> <span class="keyword">else</span> <span class="keyword">return</span> getKth(tree[rt].ch[<span class="number">1</span>], k - tmp);</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h4 id="获得大于-val-的最小数位置"><a 
href="#获得大于-val-的最小数位置" class="headerlink" title="Position of the smallest number greater than val"></a>Position of the smallest number greater than val</h4><p>This operation relies on the tree's in-order traversal being a non-decreasing sequence, so the answer can be found by descending from the root. Note that the code returns the node holding <code>val</code> itself when such a node exists.<br>Reference code:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getValMinPos</span><span class="params">(<span class="keyword">int</span> rt, <span class="keyword">int</span> val)</span></span>{</span><br><span class="line"> <span class="keyword">int</span> Min = INF;</span><br><span class="line"> <span class="keyword">int</span> pos = <span class="number">-1</span>;</span><br><span class="line"> <span class="keyword">while</span>(rt){</span><br><span class="line"> pushDown(rt);</span><br><span class="line"> <span class="keyword">if</span>(tree[rt].val == val) <span class="keyword">return</span> rt;</span><br><span class="line"> <span class="keyword">if</span>(tree[rt].val > val){</span><br><span class="line"> <span class="keyword">if</span>(Min > tree[rt].val){</span><br><span class="line"> Min = tree[rt].val;</span><br><span class="line"> pos = rt;</span><br><span class="line"> }</span><br><span class="line"> rt = tree[rt].ch[<span class="number">0</span>];</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">else</span> rt = tree[rt].ch[<span class="number">1</span>];</span><br><span class="line"> }</span><br><span class="line"> <span 
class="keyword">return</span> pos;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><blockquote><p>The largest value smaller than <code>val</code> can be found in the same way.</p></blockquote><h4 id="获取值为-val-的数的排名"><a href="#获取值为-val-的数的排名" class="headerlink" title="Rank of the number with value val"></a>Rank of the number with value val</h4><p>This operation assumes that the values in the splay tree are unique and that the tree's in-order traversal is a non-decreasing sequence.<br>First search the tree from the root for the node holding <code>val</code>, then splay that node to the <code>root</code> position; the answer is <code>the size of root's left subtree + 1</code>.<br>Reference code:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line"><span class="comment">// Find the position of val</span></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getValPos</span><span class="params">(<span class="keyword">int</span> rt, <span class="keyword">int</span> val)</span></span>{</span><br><span class="line"> <span class="keyword">if</span>(rt == <span class="number">0</span>) <span class="keyword">return</span> <span class="number">-1</span>;</span><br><span class="line"> <span class="keyword">if</span>(tree[rt].val == val) <span class="keyword">return</span> rt;</span><br><span class="line"> <span class="keyword">else</span> <span class="keyword">if</span>(tree[rt].val > val)</span><br><span class="line"> <span class="keyword">return</span> getValPos(tree[rt].ch[<span class="number">0</span>], val);</span><br><span class="line"> <span class="keyword">else</span></span><br><span class="line"> <span class="keyword">return</span> getValPos(tree[rt].ch[<span 
class="number">0</span>], val);</span><br><span class="line"> <span class="keyword">else</span></span><br><span class="line"> <span class="keyword">return</span> getValPos(tree[rt].ch[<span class="number">1</span>], val);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getValRank</span><span class="params">(<span class="keyword">int</span> rt, <span class="keyword">int</span> val)</span></span>{</span><br><span class="line"> <span class="keyword">int</span> pos = getValPos(root, val);</span><br><span class="line"> splay(pos, <span class="number">0</span>);</span><br><span class="line"> <span class="keyword">int</span> res = <span class="number">0</span>;</span><br><span class="line"> <span class="keyword">if</span>(tree[root].ch[<span class="number">0</span>]) res += tree[tree[root].ch[<span class="number">0</span>]].size;</span><br><span class="line"> res += <span class="number">1</span>;</span><br><span class="line"> <span class="keyword">return</span> res;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h4 id="获取最小数的位置"><a href="#获取最小数的位置" class="headerlink" title="获取最小数的位置"></a>获取最小数的位置</h4><p>此操作基于树的中序遍历是一个不减序列;<br>参考代码:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getMin</span><span class="params">(<span class="keyword">int</span> rt)</span></span>{</span><br><span class="line"> pushDown(rt);</span><br><span class="line"> <span class="keyword">while</span>(tree[rt].ch[<span class="number">0</span>]){</span><br><span 
class="line">        rt = tree[rt].ch[<span class="number">0</span>];</span><br><span class="line">        pushDown(rt);</span><br><span class="line">    }</span><br><span class="line">    <span class="keyword">return</span> rt;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h4 id="获取最大数的位置"><a href="#获取最大数的位置" class="headerlink" title="获取最大数的位置"></a>Getting the position of the largest number</h4><p>This operation relies on the in-order traversal of the tree being a non-decreasing sequence.<br>Reference code:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getMax</span><span class="params">(<span class="keyword">int</span> rt)</span></span>{</span><br><span class="line">    pushDown(rt);</span><br><span class="line">    <span class="keyword">while</span>(tree[rt].ch[<span class="number">1</span>]){</span><br><span class="line">        rt = tree[rt].ch[<span class="number">1</span>];</span><br><span class="line">        pushDown(rt);</span><br><span class="line">    }</span><br><span class="line">    <span class="keyword">return</span> rt;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h3 id="插入"><a href="#插入" class="headerlink" title="插入"></a>Insertion</h3><p>Insertion is a basic tree operation: following the ordering used when the tree was built, <code>key</code> is inserted at the appropriate leaf position (for example, the position that keeps the elements in ascending order).</p><p>Take inserting <code>val</code> after the <code>x</code>-th number as an example:<br>First rotate the <code>x</code>-th number to the <code>root</code> position, then rotate the <code>x + 1</code>-th number to the position of <code>root's right child</code>. These two numbers are adjacent, so the right child's left subtree is now empty, and the new number only needs to be inserted as the <code>left child of root's right child</code>.</p><p>Reference code:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span 
class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Insert val after the x-th number</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">insertOne</span><span class="params">(<span class="keyword">int</span> x, <span class="keyword">int</span> val)</span></span>{</span><br><span class="line">    splay(getKth(root, x + <span class="number">1</span>), <span class="number">0</span>);</span><br><span class="line">    splay(getKth(root, x + <span class="number">2</span>), root);</span><br><span class="line">    newNode(key_value, tree[root].ch[<span class="number">1</span>], val);</span><br><span class="line">    pushUp(tree[root].ch[<span class="number">1</span>]);</span><br><span class="line">    pushUp(root);</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h3 id="删除"><a href="#删除" class="headerlink" title="删除"></a>Deletion</h3><p>Take deleting the <code>k</code>-th number as an example:<br>First move the <code>k - 1</code>-th number to the <code>root</code> position, then move the <code>k + 1</code>-th number to the right child of <code>root</code>; the <code>k</code>-th number is then at the position of <code>the left child of root's right child</code>.</p><p>Reference code:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Reclaim memory: return every node of the subtree to the pool</span></span><br><span class="line"><span 
class="function"><span class="keyword">void</span> <span class="title">erase</span><span class="params">(<span class="keyword">int</span> rt)</span></span>{</span><br><span class="line">    <span class="keyword">if</span>(rt){</span><br><span class="line">        mPool.push(rt);</span><br><span class="line">        erase(tree[rt].ch[<span class="number">0</span>]);</span><br><span class="line">        erase(tree[rt].ch[<span class="number">1</span>]);</span><br><span class="line">    }</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Delete the k-th number</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">deleteOne</span><span class="params">(<span class="keyword">int</span> k)</span></span>{</span><br><span class="line">    splay(getKth(root, k), <span class="number">0</span>);</span><br><span class="line">    splay(getKth(root, k + <span class="number">2</span>), root);</span><br><span class="line">    erase(key_value);</span><br><span class="line">    tree[key_value].pre = <span class="number">0</span>;</span><br><span class="line">    key_value = <span class="number">0</span>;</span><br><span class="line">    pushUp(tree[root].ch[<span class="number">1</span>]);</span><br><span class="line">    pushUp(root);</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h2 id="拓展操作"><a href="#拓展操作" class="headerlink" title="拓展操作"></a>Extended operations</h2><p>Extended operations are built by combining the basic operations in various ways; a few typical examples are discussed here.</p><h3 id="区间操作"><a href="#区间操作" class="headerlink" title="区间操作"></a>Interval operations</h3><p><strong>Problem 1:</strong> insert a contiguous interval after some position <code>L</code><br><strong>Solution:</strong><br>Rotate node <code>L</code> to the root, then rotate <code>(L+1)</code> to the right child of the root. Insert the new elements as the left subtree of <code>(L+1)</code>.<br><img src="/images/splay-tree/splay-tree7.jpg" alt="区间插入"></p><p>Reference code:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Insert the cnt numbers of the val array after the pos-th number</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">insert</span><span class="params">(<span class="keyword">int</span> pos, <span class="keyword">int</span> cnt, <span class="keyword">int</span> *val)</span></span>{</span><br><span class="line">    splay(getKth(root, pos + <span class="number">1</span>), <span class="number">0</span>);</span><br><span class="line">    splay(getKth(root, pos + <span class="number">2</span>), root);</span><br><span class="line">    buildTree(key_value, <span class="number">0</span>, cnt - <span class="number">1</span>, tree[root].ch[<span class="number">1</span>], val);</span><br><span class="line">    pushUp(tree[root].ch[<span class="number">1</span>]);</span><br><span class="line">    pushUp(root);</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p><strong>Problem 2:</strong> delete a contiguous interval <code>[L,R]</code><br><strong>Solution:</strong><br>Rotate node <code>(L-1)</code> to the root, then rotate <code>(R+1)</code> to the right child of <code>(L-1)</code>. The left subtree of <code>(R+1)</code> is now exactly the interval <code>[L,R]</code>.<br>Deleting that whole left subtree finishes the job.<br><img src="/images/splay-tree/splay-tree8.jpg" alt="区间删除"></p><p>Reference code:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Delete cnt consecutive numbers starting from the pos-th number</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">Delete</span><span class="params">(<span 
class="keyword">int</span> pos, <span class="keyword">int</span> cnt)</span></span>{</span><br><span class="line">    splay(getKth(root, pos), <span class="number">0</span>);</span><br><span class="line">    splay(getKth(root, pos + cnt + <span class="number">1</span>), root);</span><br><span class="line">    erase(key_value);</span><br><span class="line">    tree[key_value].pre = <span class="number">0</span>;</span><br><span class="line">    key_value = <span class="number">0</span>;</span><br><span class="line">    pushUp(tree[root].ch[<span class="number">1</span>]);</span><br><span class="line">    pushUp(root);</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p><strong>Problem 3:</strong> compute the sum of all elements in the interval <code>[L,R]</code><br><strong>Solution:</strong><br>Rotate node <code>(L-1)</code> to the root, then rotate <code>(R+1)</code> to the right child of <code>(L-1)</code>. The left subtree of <code>(R+1)</code> is now exactly the interval <code>[L,R]</code>. The <code>sum</code> stored at the root of that left subtree is the answer.<br><img src="/images/splay-tree/splay-tree8.jpg" alt="区间求和"></p><p>Reference code:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Get the sum of [l, r]; sum is stored as long long, so return long long</span></span><br><span class="line"><span class="function"><span class="keyword">long</span> <span class="keyword">long</span> <span class="title">getSum</span><span class="params">(<span class="keyword">int</span> l, <span class="keyword">int</span> r)</span></span>{</span><br><span class="line">    splay(getKth(root, l), <span class="number">0</span>);</span><br><span class="line">    splay(getKth(root, r + <span class="number">2</span>), root);</span><br><span class="line">    <span class="keyword">return</span> tree[key_value].sum;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p><strong>Problem 4:</strong> interval update<br><strong>Solution:</strong><br>Rotate node <code>(L-1)</code> to the root, then rotate <code>(R+1)</code> to the right child of <code>(L-1)</code>. At this point the left subtree of
<code>(R+1)</code> is exactly the interval <code>[L,R]</code>. Update that whole left subtree. (If there are many updates, lazy tags like those of a segment tree can be used.)<br><img src="/images/splay-tree/splay-tree8.jpg" alt="区间更新"></p><p>Reference code:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Add val to every value in the interval [l, r]</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">makeAdd</span><span class="params">(<span class="keyword">int</span> l, <span class="keyword">int</span> r, <span class="keyword">int</span> val)</span></span>{</span><br><span class="line">    splay(getKth(root, l), <span class="number">0</span>);</span><br><span class="line">    splay(getKth(root, r + <span class="number">2</span>), root);</span><br><span class="line">    updateAdd(key_value, val);</span><br><span class="line">    pushUp(tree[root].ch[<span class="number">1</span>]);</span><br><span class="line">    pushUp(root);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Set the cnt consecutive numbers starting at pos to val</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">makeSame</span><span class="params">(<span class="keyword">int</span> pos, <span class="keyword">int</span> cnt, <span class="keyword">int</span> val)</span></span>{</span><br><span class="line">    splay(getKth(root, pos), <span 
class="number">0</span>);</span><br><span class="line">    splay(getKth(root, pos + cnt + <span class="number">1</span>), root);</span><br><span class="line">    updateSame(key_value, val);</span><br><span class="line">    pushUp(tree[root].ch[<span class="number">1</span>]);</span><br><span class="line">    pushUp(root);</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p><strong>Problem 5:</strong> cyclic shift of an interval<br>Suppose the original interval is [L, R] and holds the values [2, 3, 4, 5]; after cyclically shifting it right 3 times, the sequence becomes [3, 4, 5, 2].<br><strong>Solution:</strong><br>Rotate node <code>(L-1)</code> to the root, then rotate <code>(R+1)</code> to the right child of <code>(L-1)</code>. The left subtree of <code>(R+1)</code> is now exactly the interval <code>[L,R]</code>. Suppose the interval after the shift reads <code>[l,R,L,r]</code> (in the original interval these positions appear in the order <code>[L,r,l,R]</code>).<br>Rotate node <code>r</code> to the root of the left subtree of <code>(R+1)</code>, then rotate <code>R</code> to the root of the right subtree of <code>r</code>. Finally, move the two subtrees.<br><img src="/images/splay-tree/splay-tree9.jpg" alt="区间滑动"><br><img src="/images/splay-tree/splay-tree10.jpg" alt="区间滑动"></p><p>Reference code:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Cyclically shift the interval [l, r] right by T positions</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">revolve</span><span class="params">(<span class="keyword">int</span> l, <span class="keyword">int</span> r, <span class="keyword">int</span> 
T)</span></span>{</span><br><span class="line">    <span class="keyword">int</span> len = r - l + <span class="number">1</span>;</span><br><span class="line">    T = (T % len + len) % len;</span><br><span class="line">    <span class="keyword">if</span>(T == <span class="number">0</span>) <span class="keyword">return</span>;</span><br><span class="line">    <span class="keyword">int</span> c = r - T + <span class="number">1</span>;</span><br><span class="line">    splay(getKth(root, c), <span class="number">0</span>);</span><br><span class="line">    splay(getKth(root, r + <span class="number">2</span>), root);</span><br><span class="line">    <span class="keyword">int</span> tmp = key_value;</span><br><span class="line">    key_value = <span class="number">0</span>;</span><br><span class="line">    pushUp(tree[root].ch[<span class="number">1</span>]);</span><br><span class="line">    pushUp(root);</span><br><span class="line">    splay(getKth(root, l), <span class="number">0</span>);</span><br><span class="line">    splay(getKth(root, l + <span class="number">1</span>), root);</span><br><span class="line">    key_value = tmp;</span><br><span class="line">    tree[key_value].pre = tree[root].ch[<span class="number">1</span>];</span><br><span class="line">    pushUp(tree[root].ch[<span class="number">1</span>]);</span><br><span class="line">    pushUp(root);</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p><strong>Problem 6:</strong> interval reversal<br><strong>Solution:</strong><br>Rotate node <code>(L-1)</code> to the root, then rotate <code>(R+1)</code> to the right child of <code>(L-1)</code>. The left subtree of <code>(R+1)</code> is now exactly the interval <code>[L,R]</code>. Then swap the left and right subtrees of each node in it, one after another. (A lazy tag can also be used.)<br><img src="/images/splay-tree/splay-tree11.jpg" alt="区间反转"><br><img src="/images/splay-tree/splay-tree12.jpg" alt="区间反转"></p><p>Reference code:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">reverse</span><span class="params">(<span class="keyword">int</span> l, <span class="keyword">int</span> r)</span></span>{</span><br><span class="line">    splay(getKth(root, l), <span class="number">0</span>);</span><br><span class="line">    splay(getKth(root, r + <span class="number">2</span>), root);</span><br><span class="line">    updateRev(key_value);</span><br><span class="line">    pushUp(tree[root].ch[<span class="number">1</span>]);</span><br><span class="line">    pushUp(root);</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p><strong>Problem 7:</strong> maximum contiguous subsequence sum of an interval<br><strong>Solution:</strong><br>Each node maintains three values. Let <code>[L,R]</code> be the interval covered by the subtree rooted at node <code>i</code>. <code>lx[i]</code> is the maximum sum of a contiguous subsequence that starts at <code>L</code>.<br><code>rx[i]</code> is the maximum sum of a contiguous subsequence that ends at <code>R</code>. <code>mx[i]</code> is the maximum contiguous subsequence sum anywhere in the interval.<br>All three values can be derived from the values of the child nodes (the recomputation happens whenever the tree structure or a node value changes):</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">lx[i] = max(lx[lson],sum[lson] + key[i] + max(<span class="number">0</span>,lx[rson]));</span><br><span class="line">rx[i] = max(rx[rson],sum[rson] + key[i] + max(<span class="number">0</span>,rx[lson]));</span><br><span class="line">mx[i] = max(<span class="number">0</span>,rx[lson]) + key[i] + max(<span class="number">0</span>,lx[rson]);</span><br><span class="line">mx[i] = max(mx[i],max(mx[lson],mx[rson]));</span><br></pre></td></tr></table></figure><p>With <code>mx</code> maintained, rotate node <code>(L-1)</code> to the root, then rotate <code>(R+1)</code> to the right child of <code>(L-1)</code>. The left subtree of <code>(R+1)</code> is now exactly the interval <code>[L,R]</code>.<br>The <code>mx</code> value at the root of that left subtree is the answer.<br><img src="/images/splay-tree/splay-tree13.jpg" alt="区间子序列最大的和"></p><p>Reference code:</p><figure 
class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Maximum contiguous subsequence sum within the cnt numbers starting at pos</span></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getMaxSum</span><span class="params">(<span class="keyword">int</span> pos, <span class="keyword">int</span> cnt)</span></span>{</span><br><span class="line">    splay(getKth(root, pos), <span class="number">0</span>);</span><br><span class="line">    splay(getKth(root, pos + cnt + <span class="number">1</span>), root);</span><br><span class="line">    <span class="keyword">return</span> tree[key_value].mx;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>Summary</h2><p>As with segment trees, what a splay tree has to maintain depends on the problem, so the code must be adapted to each problem's requirements. Once the splay operations are understood, what matters more is exploiting these properties to solve problems quickly, for example working out which rotations make the answer quick and convenient to read off.<br>Below is a complete program that combines the kinds of information I have needed to maintain so far.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span 
class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span 
class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span class="line">115</span><br><span class="line">116</span><br><span class="line">117</span><br><span class="line">118</span><br><span class="line">119</span><br><span class="line">120</span><br><span class="line">121</span><br><span class="line">122</span><br><span class="line">123</span><br><span class="line">124</span><br><span class="line">125</span><br><span class="line">126</span><br><span class="line">127</span><br><span class="line">128</span><br><span class="line">129</span><br><span class="line">130</span><br><span class="line">131</span><br><span class="line">132</span><br><span class="line">133</span><br><span class="line">134</span><br><span class="line">135</span><br><span class="line">136</span><br><span class="line">137</span><br><span class="line">138</span><br><span class="line">139</span><br><span class="line">140</span><br><span class="line">141</span><br><span class="line">142</span><br><span class="line">143</span><br><span class="line">144</span><br><span class="line">145</span><br><span class="line">146</span><br><span class="line">147</span><br><span 
class="line">148</span><br><span class="line">149</span><br><span class="line">150</span><br><span class="line">151</span><br><span class="line">152</span><br><span class="line">153</span><br><span class="line">154</span><br><span class="line">155</span><br><span class="line">156</span><br><span class="line">157</span><br><span class="line">158</span><br><span class="line">159</span><br><span class="line">160</span><br><span class="line">161</span><br><span class="line">162</span><br><span class="line">163</span><br><span class="line">164</span><br><span class="line">165</span><br><span class="line">166</span><br><span class="line">167</span><br><span class="line">168</span><br><span class="line">169</span><br><span class="line">170</span><br><span class="line">171</span><br><span class="line">172</span><br><span class="line">173</span><br><span class="line">174</span><br><span class="line">175</span><br><span class="line">176</span><br><span class="line">177</span><br><span class="line">178</span><br><span class="line">179</span><br><span class="line">180</span><br><span class="line">181</span><br><span class="line">182</span><br><span class="line">183</span><br><span class="line">184</span><br><span class="line">185</span><br><span class="line">186</span><br><span class="line">187</span><br><span class="line">188</span><br><span class="line">189</span><br><span class="line">190</span><br><span class="line">191</span><br><span class="line">192</span><br><span class="line">193</span><br><span class="line">194</span><br><span class="line">195</span><br><span class="line">196</span><br><span class="line">197</span><br><span class="line">198</span><br><span class="line">199</span><br><span class="line">200</span><br><span class="line">201</span><br><span class="line">202</span><br><span class="line">203</span><br><span class="line">204</span><br><span class="line">205</span><br><span class="line">206</span><br><span class="line">207</span><br><span 
class="line">208</span><br><span class="line">209</span><br><span class="line">210</span><br><span class="line">211</span><br><span class="line">212</span><br><span class="line">213</span><br><span class="line">214</span><br><span class="line">215</span><br><span class="line">216</span><br><span class="line">217</span><br><span class="line">218</span><br><span class="line">219</span><br><span class="line">220</span><br><span class="line">221</span><br><span class="line">222</span><br><span class="line">223</span><br><span class="line">224</span><br><span class="line">225</span><br><span class="line">226</span><br><span class="line">227</span><br><span class="line">228</span><br><span class="line">229</span><br><span class="line">230</span><br><span class="line">231</span><br><span class="line">232</span><br><span class="line">233</span><br><span class="line">234</span><br><span class="line">235</span><br><span class="line">236</span><br><span class="line">237</span><br><span class="line">238</span><br><span class="line">239</span><br><span class="line">240</span><br><span class="line">241</span><br><span class="line">242</span><br><span class="line">243</span><br><span class="line">244</span><br><span class="line">245</span><br><span class="line">246</span><br><span class="line">247</span><br><span class="line">248</span><br><span class="line">249</span><br><span class="line">250</span><br><span class="line">251</span><br><span class="line">252</span><br><span class="line">253</span><br><span class="line">254</span><br><span class="line">255</span><br><span class="line">256</span><br><span class="line">257</span><br><span class="line">258</span><br><span class="line">259</span><br><span class="line">260</span><br><span class="line">261</span><br><span class="line">262</span><br><span class="line">263</span><br><span class="line">264</span><br><span class="line">265</span><br><span class="line">266</span><br><span class="line">267</span><br><span 
class="line">268</span><br><span class="line">269</span><br><span class="line">270</span><br><span class="line">271</span><br><span class="line">272</span><br><span class="line">273</span><br><span class="line">274</span><br><span class="line">275</span><br><span class="line">276</span><br><span class="line">277</span><br><span class="line">278</span><br><span class="line">279</span><br><span class="line">280</span><br><span class="line">281</span><br><span class="line">282</span><br><span class="line">283</span><br><span class="line">284</span><br><span class="line">285</span><br><span class="line">286</span><br><span class="line">287</span><br><span class="line">288</span><br><span class="line">289</span><br><span class="line">290</span><br><span class="line">291</span><br><span class="line">292</span><br><span class="line">293</span><br><span class="line">294</span><br><span class="line">295</span><br><span class="line">296</span><br><span class="line">297</span><br><span class="line">298</span><br><span class="line">299</span><br><span class="line">300</span><br><span class="line">301</span><br><span class="line">302</span><br><span class="line">303</span><br><span class="line">304</span><br><span class="line">305</span><br><span class="line">306</span><br><span class="line">307</span><br><span class="line">308</span><br><span class="line">309</span><br><span class="line">310</span><br><span class="line">311</span><br><span class="line">312</span><br><span class="line">313</span><br><span class="line">314</span><br><span class="line">315</span><br><span class="line">316</span><br><span class="line">317</span><br><span class="line">318</span><br><span class="line">319</span><br><span class="line">320</span><br><span class="line">321</span><br><span class="line">322</span><br><span class="line">323</span><br><span class="line">324</span><br><span class="line">325</span><br><span class="line">326</span><br><span class="line">327</span><br><span 
class="line">328</span><br><span class="line">329</span><br><span class="line">330</span><br><span class="line">331</span><br><span class="line">332</span><br><span class="line">333</span><br><span class="line">334</span><br><span class="line">335</span><br><span class="line">336</span><br><span class="line">337</span><br><span class="line">338</span><br><span class="line">339</span><br><span class="line">340</span><br><span class="line">341</span><br><span class="line">342</span><br><span class="line">343</span><br><span class="line">344</span><br><span class="line">345</span><br><span class="line">346</span><br><span class="line">347</span><br><span class="line">348</span><br><span class="line">349</span><br><span class="line">350</span><br><span class="line">351</span><br><span class="line">352</span><br><span class="line">353</span><br><span class="line">354</span><br><span class="line">355</span><br><span class="line">356</span><br><span class="line">357</span><br><span class="line">358</span><br><span class="line">359</span><br><span class="line">360</span><br><span class="line">361</span><br><span class="line">362</span><br><span class="line">363</span><br><span class="line">364</span><br><span class="line">365</span><br><span class="line">366</span><br><span class="line">367</span><br><span class="line">368</span><br><span class="line">369</span><br><span class="line">370</span><br><span class="line">371</span><br><span class="line">372</span><br><span class="line">373</span><br><span class="line">374</span><br><span class="line">375</span><br><span class="line">376</span><br><span class="line">377</span><br><span class="line">378</span><br><span class="line">379</span><br><span class="line">380</span><br><span class="line">381</span><br><span class="line">382</span><br><span class="line">383</span><br><span class="line">384</span><br><span class="line">385</span><br><span class="line">386</span><br><span class="line">387</span><br><span 
class="line">388</span><br><span class="line">389</span><br><span class="line">390</span><br><span class="line">391</span><br><span class="line">392</span><br><span class="line">393</span><br><span class="line">394</span><br><span class="line">395</span><br><span class="line">396</span><br><span class="line">397</span><br><span class="line">398</span><br><span class="line">399</span><br><span class="line">400</span><br><span class="line">401</span><br><span class="line">402</span><br><span class="line">403</span><br><span class="line">404</span><br><span class="line">405</span><br><span class="line">406</span><br><span class="line">407</span><br><span class="line">408</span><br><span class="line">409</span><br><span class="line">410</span><br><span class="line">411</span><br><span class="line">412</span><br><span class="line">413</span><br><span class="line">414</span><br><span class="line">415</span><br><span class="line">416</span><br><span class="line">417</span><br><span class="line">418</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*=============================================================================</span></span><br><span class="line"><span class="comment"># author: Andrewei</span></span><br><span class="line"><span class="comment"># last modified: 2016-07-12 08:22</span></span><br><span class="line"><span class="comment"># filename: a.cpp</span></span><br><span class="line"><span class="comment"># description: </span></span><br><span class="line"><span class="comment">=============================================================================*/</span></span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string">&lt;set&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string">&lt;map&gt;</span></span></span><br><span class="line"><span class="meta">#<span 
class="meta-keyword">include</span> <span class="meta-string"><cmath></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><stack></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><queue></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><stack></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><vector></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><cstdio></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><string></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><cstdlib></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><cstring></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><iostream></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><algorithm></span></span></span><br><span class="line"><span class="keyword">using</span> <span class="keyword">namespace</span> <span class="built_in">std</span>;</span><br><span class="line"><span class="meta">#<span class="meta-keyword">define</span> key_value tree[tree[root].ch[1]].ch[0]</span></span><br><span class="line"><span class="comment">// Macro for the left child of the root's right child; every range operation first gathers its target interval into this subtree</span></span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> maxn = <span 
class="number">500010</span>; <span class="comment">// maximum data size</span></span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> INF = (<span class="number">1</span> << <span class="number">29</span>); <span class="comment">// a sufficiently large value used as infinity</span></span><br><span class="line"><span class="class"><span class="keyword">struct</span> <span class="title">Node</span>{</span></span><br><span class="line"> <span class="keyword">int</span> ch[<span class="number">2</span>]; <span class="comment">// left and right children</span></span><br><span class="line"> <span class="keyword">int</span> pre, val, size; <span class="comment">// parent, value stored at this node, size of the subtree rooted at this node</span></span><br><span class="line"> <span class="keyword">long</span> <span class="keyword">long</span> sum; <span class="comment">// sum of the subtree rooted at this node</span></span><br><span class="line"></span><br><span class="line"> <span class="keyword">int</span> rev, add, same; <span class="comment">// lazy tags: reverse, pending add, "all elements equal" assignment</span></span><br><span class="line"> <span class="keyword">int</span> lx, rx, mx; <span class="comment">// maximum prefix sum, maximum suffix sum, and maximum subarray sum of the interval</span></span><br><span class="line">};</span><br><span class="line"></span><br><span class="line"><span class="keyword">int</span> root, total; <span class="comment">// root node, number of nodes allocated so far</span></span><br><span class="line"><span class="built_in">stack</span> <<span class="keyword">int</span>> mPool; <span class="comment">// memory pool: indices of nodes freed by deletions, kept for reuse</span></span><br><span class="line">Node tree[maxn]; <span class="comment">// all nodes of the tree</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">int</span> n, q; <span class="comment">// n numbers, q queries</span></span><br><span class="line"><span class="keyword">int</span> data[maxn]; <span class="comment">// raw input data</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// Apply a pending-add tag to node rt (lx, rx and mx are left unchanged here)</span></span><br><span class="line"><span class="function"><span 
class="keyword">void</span> <span class="title">updateAdd</span><span class="params">(<span class="keyword">int</span> rt, <span class="keyword">int</span> add)</span></span>{</span><br><span class="line"> <span class="keyword">if</span>(!rt) <span class="keyword">return</span>;</span><br><span class="line"> tree[rt].add += add;</span><br><span class="line"> tree[rt].val += add;</span><br><span class="line"> tree[rt].sum += (<span class="keyword">long</span> <span class="keyword">long</span>)add * tree[rt].size;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Apply a reverse tag to node rt</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">updateRev</span><span class="params">(<span class="keyword">int</span> rt)</span></span>{</span><br><span class="line"> <span class="keyword">if</span>(!rt) <span class="keyword">return</span>;</span><br><span class="line"> tree[rt].rev ^= <span class="number">1</span>;</span><br><span class="line"> swap(tree[rt].lx, tree[rt].rx);</span><br><span class="line"> swap(tree[rt].ch[<span class="number">0</span>], tree[rt].ch[<span class="number">1</span>]);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Apply an "all elements equal val" tag to node rt</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">updateSame</span><span class="params">(<span class="keyword">int</span> rt, <span class="keyword">int</span> val)</span></span>{</span><br><span class="line"> <span class="keyword">if</span>(!rt) <span class="keyword">return</span>;</span><br><span class="line"> tree[rt].val = val;</span><br><span class="line"> tree[rt].sum = (<span class="keyword">long</span> <span class="keyword">long</span>)val * tree[rt].size;</span><br><span class="line"> tree[rt].lx = tree[rt].rx = tree[rt].mx = max(val, val * tree[rt].size);</span><br><span class="line"> tree[rt].same = <span class="number">1</span>;</span><br><span 
class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Recompute a node's aggregates from its children</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">pushUp</span><span class="params">(<span class="keyword">int</span> rt)</span></span>{</span><br><span class="line"> <span class="keyword">int</span> lson = tree[rt].ch[<span class="number">0</span>], rson = tree[rt].ch[<span class="number">1</span>];</span><br><span class="line"></span><br><span class="line"> <span class="comment">// Update the subtree size</span></span><br><span class="line"> tree[rt].size = tree[lson].size + tree[rson].size + <span class="number">1</span>;</span><br><span class="line"></span><br><span class="line"> <span class="comment">// Update the sum of this node and its whole subtree</span></span><br><span class="line"> tree[rt].sum = tree[lson].sum + tree[rson].sum + tree[rt].val;</span><br><span class="line"></span><br><span class="line"> <span class="comment">// Update the maximum prefix, suffix and subarray sums</span></span><br><span class="line"> tree[rt].lx = max((<span class="keyword">long</span> <span class="keyword">long</span>)tree[lson].lx, tree[lson].sum + tree[rt].val + max(<span class="number">0</span>, tree[rson].lx));</span><br><span class="line"> tree[rt].rx = max((<span class="keyword">long</span> <span class="keyword">long</span>)tree[rson].rx, tree[rson].sum + tree[rt].val + max(<span class="number">0</span>, tree[lson].rx));</span><br><span class="line"> tree[rt].mx = max(<span class="number">0</span>, tree[lson].rx) + tree[rt].val + max(<span class="number">0</span>, tree[rson].lx);</span><br><span class="line"> tree[rt].mx = max(tree[rt].mx, max(tree[lson].mx, tree[rson].mx));</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Push a node's lazy tags down to its children</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">pushDown</span><span class="params">(<span 
class="keyword">int</span> rt)</span></span>{</span><br><span class="line"></span><br><span class="line"> <span class="comment">// Push down the pending-add tag</span></span><br><span class="line"> <span class="keyword">if</span>(tree[rt].add){</span><br><span class="line"> updateAdd(tree[rt].ch[<span class="number">0</span>], tree[rt].add);</span><br><span class="line"> updateAdd(tree[rt].ch[<span class="number">1</span>], tree[rt].add);</span><br><span class="line"> tree[rt].add = <span class="number">0</span>;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="comment">// Push down the "all elements equal" tag</span></span><br><span class="line"> <span class="keyword">if</span>(tree[rt].same){</span><br><span class="line"> updateSame(tree[rt].ch[<span class="number">0</span>], tree[rt].val);</span><br><span class="line"> updateSame(tree[rt].ch[<span class="number">1</span>], tree[rt].val);</span><br><span class="line"> tree[rt].same = <span class="number">0</span>;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="comment">// Push down the reverse tag</span></span><br><span class="line"> <span class="keyword">if</span>(tree[rt].rev){</span><br><span class="line"> updateRev(tree[rt].ch[<span class="number">0</span>]);</span><br><span class="line"> updateRev(tree[rt].ch[<span class="number">1</span>]);</span><br><span class="line"> tree[rt].rev = <span class="number">0</span>;</span><br><span class="line"> }</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">newNode</span><span class="params">(<span class="keyword">int</span> &rt, <span class="keyword">int</span> pre, <span class="keyword">int</span> val)</span></span>{</span><br><span class="line"> <span class="keyword">if</span>(!mPool.empty()){</span><br><span class="line"> rt = mPool.top();</span><br><span class="line"> mPool.pop();</span><br><span 
class="line"> }<span class="keyword">else</span>{</span><br><span class="line"> rt = ++total;</span><br><span class="line"> }</span><br><span class="line"> tree[rt].pre = pre;</span><br><span class="line"> tree[rt].size = <span class="number">1</span>;</span><br><span class="line"> tree[rt].val = val;</span><br><span class="line"> tree[rt].add = <span class="number">0</span>;</span><br><span class="line"> tree[rt].sum = val;</span><br><span class="line"> tree[rt].rev = tree[rt].same = <span class="number">0</span>;</span><br><span class="line"> tree[rt].ch[<span class="number">0</span>] = tree[rt].ch[<span class="number">1</span>] = <span class="number">0</span>;</span><br><span class="line"> tree[rt].lx = tree[rt].rx = tree[rt].mx = val;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">buildTree</span><span class="params">(<span class="keyword">int</span> &cur, <span class="keyword">int</span> l, <span class="keyword">int</span> r, <span class="keyword">int</span> pre, <span class="keyword">int</span> *a)</span></span>{</span><br><span class="line"> <span class="keyword">if</span>(l > r) <span class="keyword">return</span>;</span><br><span class="line"> <span class="keyword">int</span> mid = (l + r) >> <span class="number">1</span>;</span><br><span class="line"> newNode(cur, pre, a[mid]);</span><br><span class="line"> buildTree(tree[cur].ch[<span class="number">0</span>], l, mid - <span class="number">1</span>, cur, a);</span><br><span class="line"> buildTree(tree[cur].ch[<span class="number">1</span>], mid + <span class="number">1</span>, r, cur, a);</span><br><span class="line"> pushUp(cur);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">init</span><span class="params">(<span class="keyword">int</span> 
*data)</span></span>{</span><br><span class="line"></span><br><span class="line"> root = total = <span class="number">0</span>;</span><br><span class="line"> <span class="keyword">while</span>(!mPool.empty()) mPool.pop();</span><br><span class="line"> tree[root].rev = tree[root].same = <span class="number">0</span>;</span><br><span class="line"> tree[root].ch[<span class="number">0</span>] = tree[root].ch[<span class="number">1</span>] = <span class="number">0</span>;</span><br><span class="line"> tree[root].lx = tree[root].rx = tree[root].mx = -INF;</span><br><span class="line"> tree[root].sum = tree[root].add = tree[root].val = <span class="number">0</span>;</span><br><span class="line"> tree[root].pre = tree[root].size = tree[root].sum = <span class="number">0</span>;</span><br><span class="line"></span><br><span class="line"> newNode(root, <span class="number">0</span>, <span class="number">-1</span>); <span class="comment">// Note 1</span></span><br><span class="line"> newNode(tree[root].ch[<span class="number">1</span>], root, <span class="number">-1</span>); <span class="comment">// Note 2</span></span><br><span class="line"></span><br><span class="line"> buildTree(key_value, <span class="number">0</span>, n - <span class="number">1</span>, tree[root].ch[<span class="number">1</span>], data);</span><br><span class="line"></span><br><span class="line"> pushUp(tree[root].ch[<span class="number">1</span>]);</span><br><span class="line"> pushUp(root);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Single rotation</span></span><br><span class="line"><span class="comment">// com == 0: rotate node cur to the left</span></span><br><span class="line"><span class="comment">// com == 1: rotate node cur to the right</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">rotate</span><span class="params">(<span class="keyword">int</span> cur, <span class="keyword">int</span> 
com)</span></span>{</span><br><span class="line"> <span class="keyword">int</span> pre = tree[cur].pre;</span><br><span class="line"> pushDown(pre);</span><br><span class="line"> pushDown(cur);</span><br><span class="line"></span><br><span class="line"> tree[pre].ch[!com] = tree[cur].ch[com];</span><br><span class="line"> tree[tree[cur].ch[com]].pre = pre;</span><br><span class="line"></span><br><span class="line"> <span class="comment">/* The two statements above expand to the following:</span></span><br><span class="line"><span class="comment"> if(com){</span></span><br><span class="line"><span class="comment"> tree[pre].ch[0] = tree[cur].ch[1]; </span></span><br><span class="line"><span class="comment"> tree[tree[cur].ch[1]].pre = pre;</span></span><br><span class="line"><span class="comment"> }else{</span></span><br><span class="line"><span class="comment"> tree[pre].ch[1] = tree[cur].ch[0]; </span></span><br><span class="line"><span class="comment"> tree[tree[cur].ch[0]].pre = pre;</span></span><br><span class="line"><span class="comment"> }</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"></span><br><span class="line"> <span class="keyword">if</span>(tree[pre].pre){</span><br><span class="line"> tree[tree[pre].pre].ch[tree[tree[pre].pre].ch[<span class="number">1</span>] == pre] = cur;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> tree[cur].pre = tree[pre].pre;</span><br><span class="line"> tree[cur].ch[com] = pre;</span><br><span class="line"> tree[pre].pre = cur;</span><br><span class="line"> pushUp(pre);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Splay: restructure the tree</span></span><br><span class="line"><span class="comment">// Rotate node rt upward until it is a child of tar (tar == 0 makes rt the root)</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">splay</span><span class="params">(<span class="keyword">int</span> rt, <span 
class="keyword">int</span> tar)</span></span>{</span><br><span class="line"> pushDown(rt);</span><br><span class="line"> <span class="keyword">while</span>(tree[rt].pre != tar){</span><br><span class="line"> <span class="keyword">if</span>(tree[tree[rt].pre].pre == tar){</span><br><span class="line"> pushDown(tree[rt].pre);</span><br><span class="line"> pushDown(rt);</span><br><span class="line"> rotate(rt, tree[tree[rt].pre].ch[<span class="number">0</span>] == rt);</span><br><span class="line"> }<span class="keyword">else</span>{</span><br><span class="line"> pushDown(tree[tree[rt].pre].pre);</span><br><span class="line"> pushDown(tree[rt].pre);</span><br><span class="line"> pushDown(rt);</span><br><span class="line"> <span class="keyword">int</span> pre = tree[rt].pre;</span><br><span class="line"> <span class="keyword">int</span> com = tree[tree[pre].pre].ch[<span class="number">0</span>] == pre;</span><br><span class="line"> <span class="keyword">if</span>(tree[pre].ch[com] == rt){</span><br><span class="line"> rotate(rt, !com);</span><br><span class="line"> rotate(rt, com);</span><br><span class="line"> }<span class="keyword">else</span>{</span><br><span class="line"> rotate(pre, com);</span><br><span class="line"> rotate(rt, com);</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> pushUp(rt);</span><br><span class="line"> <span class="keyword">if</span>(tar == <span class="number">0</span>) root = rt;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getKth</span><span class="params">(<span class="keyword">int</span> rt, <span class="keyword">int</span> k)</span></span>{</span><br><span class="line"> pushDown(rt);</span><br><span class="line"> <span class="keyword">int</span> tmp = tree[tree[rt].ch[<span class="number">0</span>]].size + <span 
class="number">1</span>;</span><br><span class="line"> <span class="keyword">if</span>(tmp == k) <span class="keyword">return</span> rt;</span><br><span class="line"> <span class="keyword">else</span> <span class="keyword">if</span>(tmp > k) <span class="keyword">return</span> getKth(tree[rt].ch[<span class="number">0</span>], k);</span><br><span class="line"> <span class="keyword">else</span> <span class="keyword">return</span> getKth(tree[rt].ch[<span class="number">1</span>], k - tmp);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getValMinPos</span><span class="params">(<span class="keyword">int</span> rt, <span class="keyword">int</span> val)</span></span>{</span><br><span class="line"> <span class="keyword">int</span> Min = INF;</span><br><span class="line"> <span class="keyword">int</span> pos = <span class="number">-1</span>;</span><br><span class="line"> <span class="keyword">while</span>(rt){</span><br><span class="line"> pushDown(rt);</span><br><span class="line"> <span class="keyword">if</span>(tree[rt].val == val) <span class="keyword">return</span> rt;</span><br><span class="line"> <span class="keyword">if</span>(tree[rt].val > val){</span><br><span class="line"> <span class="keyword">if</span>(Min > tree[rt].val){</span><br><span class="line"> Min = tree[rt].val;</span><br><span class="line"> pos = rt;</span><br><span class="line"> }</span><br><span class="line"> rt = tree[rt].ch[<span class="number">0</span>];</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">else</span> rt = tree[rt].ch[<span class="number">1</span>];</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> pos;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Find the node holding val (returns -1 if it is absent)</span></span><br><span class="line"><span 
class="function"><span class="keyword">int</span> <span class="title">getValPos</span><span class="params">(<span class="keyword">int</span> rt, <span class="keyword">int</span> val)</span></span>{</span><br><span class="line"> <span class="keyword">if</span>(!rt) <span class="keyword">return</span> <span class="number">-1</span>;</span><br><span class="line"> <span class="keyword">if</span>(tree[rt].val == val) <span class="keyword">return</span> rt;</span><br><span class="line"> <span class="keyword">else</span> <span class="keyword">if</span>(tree[rt].val > val)</span><br><span class="line"> <span class="keyword">return</span> getValPos(tree[rt].ch[<span class="number">0</span>], val);</span><br><span class="line"> <span class="keyword">else</span></span><br><span class="line"> <span class="keyword">return</span> getValPos(tree[rt].ch[<span class="number">1</span>], val);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getValRank</span><span class="params">(<span class="keyword">int</span> rt, <span class="keyword">int</span> val)</span></span>{</span><br><span class="line"> <span class="keyword">int</span> pos = getValPos(root, val);</span><br><span class="line"> splay(pos, <span class="number">0</span>);</span><br><span class="line"> <span class="keyword">int</span> res = <span class="number">0</span>;</span><br><span class="line"> <span class="keyword">if</span>(tree[root].ch[<span class="number">0</span>]) res += tree[tree[root].ch[<span class="number">0</span>]].size;</span><br><span class="line"> res += <span class="number">1</span>;</span><br><span class="line"> <span class="keyword">return</span> res;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getMin</span><span class="params">(<span class="keyword">int</span> 
rt)</span></span>{</span><br><span class="line"> pushDown(rt);</span><br><span class="line"> <span class="keyword">while</span>(tree[rt].ch[<span class="number">0</span>]){</span><br><span class="line"> rt = tree[rt].ch[<span class="number">0</span>];</span><br><span class="line"> pushDown(rt);</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> rt;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getMax</span><span class="params">(<span class="keyword">int</span> rt)</span></span>{</span><br><span class="line"> pushDown(rt);</span><br><span class="line"> <span class="keyword">while</span>(tree[rt].ch[<span class="number">1</span>]){</span><br><span class="line"> rt = tree[rt].ch[<span class="number">1</span>];</span><br><span class="line"> pushDown(rt);</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> rt;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Insert val after the x-th element</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">insertOne</span><span class="params">(<span class="keyword">int</span> x, <span class="keyword">int</span> val)</span></span>{</span><br><span class="line"> splay(getKth(root, x + <span class="number">1</span>), <span class="number">0</span>);</span><br><span class="line"> splay(getKth(root, x + <span class="number">2</span>), root);</span><br><span class="line"> newNode(key_value, tree[root].ch[<span class="number">1</span>], val);</span><br><span class="line"> pushUp(tree[root].ch[<span class="number">1</span>]);</span><br><span class="line"> pushUp(root);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Recycle the nodes of a removed subtree into the memory pool</span></span><br><span 
class="line"><span class="function"><span class="keyword">void</span> <span class="title">erase</span><span class="params">(<span class="keyword">int</span> rt)</span></span>{</span><br><span class="line"> <span class="keyword">if</span>(rt){</span><br><span class="line"> mPool.push(rt);</span><br><span class="line"> erase(tree[rt].ch[<span class="number">0</span>]);</span><br><span class="line"> erase(tree[rt].ch[<span class="number">1</span>]); </span><br><span class="line"> }</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Delete the k-th element</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">deleteOne</span><span class="params">(<span class="keyword">int</span> k)</span></span>{</span><br><span class="line"> splay(getKth(root, k), <span class="number">0</span>);</span><br><span class="line"> splay(getKth(root, k + <span class="number">2</span>), root);</span><br><span class="line"> erase(key_value);</span><br><span class="line"> tree[key_value].pre = <span class="number">0</span>;</span><br><span class="line"> key_value = <span class="number">0</span>;</span><br><span class="line"> pushUp(tree[root].ch[<span class="number">1</span>]);</span><br><span class="line"> pushUp(root);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Insert the cnt values of array val after the pos-th element</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">insert</span><span class="params">(<span class="keyword">int</span> pos, <span class="keyword">int</span> cnt, <span class="keyword">int</span> *val)</span></span>{</span><br><span class="line"> splay(getKth(root, pos + <span class="number">1</span>), <span class="number">0</span>);</span><br><span class="line"> splay(getKth(root, pos + <span class="number">2</span>), root);</span><br><span class="line"> 
buildTree(key_value, <span class="number">0</span>, cnt - <span class="number">1</span>, tree[root].ch[<span class="number">1</span>], val);</span><br><span class="line"> pushUp(tree[root].ch[<span class="number">1</span>]);</span><br><span class="line"> pushUp(root);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Delete cnt consecutive elements starting at the pos-th element</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">Delete</span><span class="params">(<span class="keyword">int</span> pos, <span class="keyword">int</span> cnt)</span></span>{</span><br><span class="line"> splay(getKth(root, pos), <span class="number">0</span>);</span><br><span class="line"> splay(getKth(root, pos + cnt + <span class="number">1</span>), root);</span><br><span class="line"> erase(key_value);</span><br><span class="line"> tree[key_value].pre = <span class="number">0</span>;</span><br><span class="line"> key_value = <span class="number">0</span>;</span><br><span class="line"> pushUp(tree[root].ch[<span class="number">1</span>]);</span><br><span class="line"> pushUp(root);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Return the sum of the interval [l, r]</span></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getSum</span><span class="params">(<span class="keyword">int</span> l, <span class="keyword">int</span> r)</span></span>{</span><br><span class="line"> splay(getKth(root, l), <span class="number">0</span>);</span><br><span class="line"> splay(getKth(root, r + <span class="number">2</span>), root);</span><br><span class="line"> <span class="keyword">return</span> tree[key_value].sum;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Cyclically shift the interval [l, r] to the right by T positions</span></span><br><span class="line"><span class="function"><span 
class="keyword">void</span> <span class="title">revolve</span><span class="params">(<span class="keyword">int</span> l, <span class="keyword">int</span> r, <span class="keyword">int</span> T)</span></span>{</span><br><span class="line"> <span class="keyword">int</span> len = r - l + <span class="number">1</span>;</span><br><span class="line"> T = (T % len + len) % len;</span><br><span class="line"> <span class="keyword">if</span>(T == <span class="number">0</span>) <span class="keyword">return</span>;</span><br><span class="line"> <span class="keyword">int</span> c = r - T + <span class="number">1</span>;</span><br><span class="line"> splay(getKth(root, c), <span class="number">0</span>);</span><br><span class="line"> splay(getKth(root, r + <span class="number">2</span>), root);</span><br><span class="line"> <span class="keyword">int</span> tmp = key_value;</span><br><span class="line"> key_value = <span class="number">0</span>;</span><br><span class="line"> pushUp(tree[root].ch[<span class="number">1</span>]);</span><br><span class="line"> pushUp(root);</span><br><span class="line"> splay(getKth(root, l), <span class="number">0</span>);</span><br><span class="line"> splay(getKth(root, l + <span class="number">1</span>), root);</span><br><span class="line"> key_value = tmp;</span><br><span class="line"> tree[key_value].pre = tree[root].ch[<span class="number">1</span>];</span><br><span class="line"> pushUp(tree[root].ch[<span class="number">1</span>]);</span><br><span class="line"> pushUp(root);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">reverse</span><span class="params">(<span class="keyword">int</span> l, <span class="keyword">int</span> r)</span></span>{</span><br><span class="line"> splay(getKth(root, l), <span class="number">0</span>);</span><br><span class="line"> splay(getKth(root, r + <span class="number">2</span>), 
root);</span><br><span class="line"> updateRev(key_value);</span><br><span class="line"> pushUp(tree[root].ch[<span class="number">1</span>]);</span><br><span class="line"> pushUp(root);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">makeSame</span><span class="params">(<span class="keyword">int</span> pos, <span class="keyword">int</span> cnt, <span class="keyword">int</span> val)</span></span>{</span><br><span class="line"> splay(getKth(root, pos), <span class="number">0</span>);</span><br><span class="line"> splay(getKth(root, pos + cnt + <span class="number">1</span>), root);</span><br><span class="line"> updateSame(key_value, val);</span><br><span class="line"> pushUp(tree[root].ch[<span class="number">1</span>]);</span><br><span class="line"> pushUp(root);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Add val to every element of the interval [l, r]</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">makeAdd</span><span class="params">(<span class="keyword">int</span> l, <span class="keyword">int</span> r, <span class="keyword">int</span> val)</span></span>{</span><br><span class="line"> splay(getKth(root, l), <span class="number">0</span>);</span><br><span class="line"> splay(getKth(root, r + <span class="number">2</span>), root);</span><br><span class="line"> updateAdd(key_value, val);</span><br><span class="line"> pushUp(tree[root].ch[<span class="number">1</span>]);</span><br><span class="line"> pushUp(root);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="comment">// Maximum subarray sum within the cnt elements starting at the pos-th element</span></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getMaxSum</span><span class="params">(<span 
class="keyword">int</span> pos, <span class="keyword">int</span> cnt)</span></span>{</span><br><span class="line"> splay(getKth(root, pos), <span class="number">0</span>);</span><br><span class="line"> splay(getKth(root, pos + cnt + <span class="number">1</span>), root);</span><br><span class="line"> <span class="keyword">return</span> tree[key_value].mx;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">main</span><span class="params">()</span></span>{</span><br><span class="line"> <span class="keyword">int</span> x, y, z;</span><br><span class="line"> <span class="keyword">char</span> op[<span class="number">20</span>];</span><br><span class="line"> <span class="keyword">while</span>(<span class="built_in">scanf</span>(<span class="string">"%d%d"</span>, &n, &q) != EOF){</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = <span class="number">0</span>; i < n; i++){</span><br><span class="line"> <span class="built_in">scanf</span>(<span class="string">"%d"</span>, &data[i]);</span><br><span class="line"> }</span><br><span class="line"> init(data);</span><br><span class="line"> <span class="keyword">while</span>(q--){</span><br><span class="line"> <span class="built_in">scanf</span>(<span class="string">"%s"</span>, op);</span><br><span class="line"> <span class="keyword">if</span>(op[<span class="number">0</span>] == <span class="string">'I'</span>){</span><br><span class="line"> <span class="built_in">scanf</span>(<span class="string">"%d%d"</span>, &x, &y);</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = <span class="number">0</span>; i < y; i++)</span><br><span class="line"> <span class="built_in">scanf</span>(<span class="string">"%d"</span>, &data[i]);</span><br><span class="line"> insert(x, y, data);</span><br><span class="line"> }<span 
class="keyword">else</span> <span class="keyword">if</span>(op[<span class="number">0</span>] == <span class="string">'D'</span>){</span><br><span class="line"> <span class="built_in">scanf</span>(<span class="string">"%d%d"</span>, &x, &y);</span><br><span class="line"> Delete(x, y);</span><br><span class="line"> }<span class="keyword">else</span> <span class="keyword">if</span>(op[<span class="number">0</span>] == <span class="string">'M'</span> && op[<span class="number">2</span>] == <span class="string">'K'</span>){</span><br><span class="line"> <span class="built_in">scanf</span>(<span class="string">"%d%d%d"</span>, &x, &y, &z);</span><br><span class="line"> makeSame(x, y, z);</span><br><span class="line"> }<span class="keyword">else</span> <span class="keyword">if</span>(op[<span class="number">0</span>] == <span class="string">'R'</span>){</span><br><span class="line"> <span class="built_in">scanf</span>(<span class="string">"%d%d"</span>, &x, &y);</span><br><span class="line"> reverse(x, x + y - <span class="number">1</span>);</span><br><span class="line"> }<span class="keyword">else</span> <span class="keyword">if</span>(op[<span class="number">0</span>] == <span class="string">'G'</span>){</span><br><span class="line"> <span class="built_in">scanf</span>(<span class="string">"%d%d"</span>, &x, &y);</span><br><span class="line"> <span class="built_in">printf</span>(<span class="string">"%d\n"</span>, getSum(x, x + y - <span class="number">1</span>));</span><br><span class="line"> }<span class="keyword">else</span>{</span><br><span class="line"> <span class="built_in">printf</span>(<span class="string">"%d\n"</span>, getMaxSum(<span class="number">1</span>, tree[root].size - <span class="number">2</span>));</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> <span class="number">0</span>;</span><br><span 
class="line">}</span><br></pre></td></tr></table></figure>]]></content>
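<summary type="html"><h2 id="预备知识"><a href="#预备知识" class="headerlink" title="预备知识"></a>Prerequisites</h2><ol>
<li>Tree traversal</li>
<li>Basics of binary trees</li>
<li>Basics of binary search trees</li>
<li>Range update and range query on segment trees</li>
<li>Basics of balanced binary search trees (optional)</li>
</ol>
<h2 id="简介"><a href="#简介" class="headerlink" title="简介"></a>Introduction</h2><p>A splay tree (SplayTree) is an improved variant of the balanced binary search tree. Its operations closely resemble those of a balanced binary tree, but there are also many differences.</p></summary>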
<category term="算法" scheme="https://andrewei1316.github.io/categories/%E7%AE%97%E6%B3%95/"/>
<category term="数据结构" scheme="https://andrewei1316.github.io/categories/%E7%AE%97%E6%B3%95/%E6%95%B0%E6%8D%AE%E7%BB%93%E6%9E%84/"/>
<category term="数据结构" scheme="https://andrewei1316.github.io/tags/%E6%95%B0%E6%8D%AE%E7%BB%93%E6%9E%84/"/>
<category term="伸展树" scheme="https://andrewei1316.github.io/tags/%E4%BC%B8%E5%B1%95%E6%A0%91/"/>
<category term="树" scheme="https://andrewei1316.github.io/tags/%E6%A0%91/"/>
</entry>
<entry>
<title>Binary Indexed Trees (树状数组)</title>
<link href="https://andrewei1316.github.io/2016/07/05/binary-indexed-trees/"/>
<id>https://andrewei1316.github.io/2016/07/05/binary-indexed-trees/</id>
<published>2016-07-05T08:23:22.000Z</published>
<updated>2020-10-08T12:13:56.348Z</updated>
<content type="html"><![CDATA[<h2 id="引入"><a href="#引入" class="headerlink" title="引入"></a>Motivation</h2><p>I had heard of this data structure long ago, but everything it does can also be done with a segment tree. In my youthful ignorance I assumed that knowing segment trees meant I could ignore it. Yet for every problem I ended up typing out a long segment tree, which was painful. It is only a binary indexed tree, so let's go and learn it!</p><a id="more"></a><h2 id="简介"><a href="#简介" class="headerlink" title="简介"></a>Introduction</h2><p>What can a binary indexed tree do? It maintains the prefix sums of a sequence, supporting both updates and range-sum queries in $log(n)$ time.<br>It has two main operations:</p><ul><li>1> add(i, val); adds val to the i-th element, in $log(n)$ time</li><li>2> sum(i); computes the sum over $[1, i]$, also in $log(n)$ time</li></ul><p>Yes, a segment tree can do all of this too. So where does the binary indexed tree's advantage lie? Read on.</p><h2 id="详解"><a href="#详解" class="headerlink" title="详解"></a>Details</h2><h3 id="结构与节点含义"><a href="#结构与节点含义" class="headerlink" title="结构与节点含义"></a>Structure and meaning of the nodes</h3><p>The structure of a binary indexed tree is a simplification of the segment tree's structure; in other words, it is a part of the segment tree's structure, as shown in the figure.<img src="/images/binary-indexed-trees/bit1.png" alt="树状数组的结构">Here A is the plain array and C is the binary indexed tree.<br>Let us look at what each node of the binary indexed tree means:</p><table><thead><tr><th>Node</th><th>Index (binary)</th><th>Meaning</th></tr></thead><tbody><tr><td>C1</td><td>0001</td><td>C1 = A1</td></tr><tr><td>C2</td><td>0010</td><td>C2 = C1 + A2 = A1 + A2</td></tr><tr><td>C3</td><td>0011</td><td>C3 = A3</td></tr><tr><td>C4</td><td>0100</td><td>C4 = C2 + C3 + A4 = A1 + … + A4</td></tr><tr><td>C5</td><td>0101</td><td>C5 = A5</td></tr><tr><td>C6</td><td>0110</td><td>C6 = C5 + A6 = A5 + A6</td></tr><tr><td>C7</td><td>0111</td><td>C7 = A7</td></tr><tr><td>C8</td><td>1000</td><td>C8 = C4 + C6 + C7 + A8 = A1 + … + A8</td></tr></tbody></table><p>From the table we can see that every node of the binary indexed tree actually stores the sum of an interval. Take a node $C_i$ with index $i$, and suppose the binary representation of $i$ ends in $k$ zeros. Then $C_i$ is the sum of the $2^k$ elements ending at position $i$, that is, $C_i$ is: </p><p>$$\sum_{j=i-2^k+1}^{i}A_j$$</p><h3 id="求和操作"><a href="#求和操作" class="headerlink" title="求和操作"></a>Sum query</h3><p>Now that we understand what each node means, let us construct a method for the sum over $[1, i], 1 <= i <= n$ (denoted <code>sum(i)</code>). Assume for now that a function <code>int lowbit(int i);</code> returns $2^k$ (where $k$ is the number of trailing zeros in the binary representation of $i$); its implementation is covered later. Then:</p><p>$$ sum(i) = sum(i - lowbit(i)) + C_i$$</p><p>This recurrence can be evaluated recursively or iteratively; it terminates when $ i <= 0 $</p><blockquote><p>The steps are illustrated below by computing the sum over $[1,7]$ (that is, the value of sum(7)):</p></blockquote><ul><li><code>lowbit(7) = 1</code>, so $C_7=A_7$ and $sum(7)=sum(7-1)+C_7$;</li><li><code>lowbit(6) = 2</code>, so $C_6=A_6+A_5$ and $sum(6)=sum(6-2)+C_6$;</li><li><code>lowbit(4) = 4</code>, so $C_4=A_1+A_2+A_3+A_4$ and $sum(4)=sum(4-4)+C_4$;<br>Since $4-4==0$, the algorithm stops, giving $sum(7)=C_7+C_6+C_4$;</li></ul><p>We can now compute the sum over any prefix $[1,i]$. For an arbitrary interval $[l,r]$, simply compute $sum(r)-sum(l-1)$.</p><p>Reference code for the sum query:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Iterative version</span></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getSum</span><span class="params">(<span class="keyword">int</span> x)</span></span>{</span><br><span class="line"> <span class="keyword">int</span> sum = <span class="number">0</span>;</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = x; i; i -= lowbit(i)){</span><br><span class="line"> sum += C[i];</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> sum;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Recursive version (an alternative to the one above)</span></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getSum</span><span class="params">(<span class="keyword">int</span> x)</span></span>{</span><br><span class="line"> <span class="keyword">return</span> x > <span class="number">0</span> ? 
C[x] + getSum(x - lowbit(x)) : <span class="number">0</span>;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h3 id="更新操作"><a href="#更新操作" class="headerlink" title="更新操作"></a>更新操作</h3><p>介绍完了求和操作,下面介绍更新操作。上文中已经提到,树状数组中的每一个节点维护的都是一个区间和,那么当一个节点的值发生变化的时候也就意味着会有一系列与之相关的节点的值也要发生变化,这一点从树状数组的结构图中也可以看出。</p><blockquote><p>比如 $C_1$ 节点发生变化,那么 $C_1, C_2, C_4, C_8, …$ 都要跟着变化,如何维护好这个变化是我们需要关注的事情。</p></blockquote><p>由结构图我们知道,当一个节点变化时影响到的是本节点和该节点的直接和间接父节点,而对于一个节点 $i$, 它的父节点的下标为 <code>i + lowbit(i)</code>, 所以更新操作跟求和操作非常相似,只需要不断的寻找父节点然后更新它即可。</p><p>更新操作参考代码如下:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// 迭代求解</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">update</span><span class="params">(<span class="keyword">int</span> x, <span class="keyword">int</span> val)</span></span>{</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = x; i <= n; i += lowbit(i)){</span><br><span class="line"> C[i] += val;</span><br><span class="line"> }</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// 递归求解</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">update</span><span class="params">(<span class="keyword">int</span> x, <span class="keyword">int</span> val)</span></span>{</span><br><span class="line"> <span 
class="keyword">if</span>(x <= n){</span><br><span class="line"> C[x] += val;</span><br><span class="line"> update(x + lowbit(x), val);</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h3 id="lowbit-函数的实现"><a href="#lowbit-函数的实现" class="headerlink" title="lowbit 函数的实现"></a>Implementing the lowbit function</h3><p>So far we have used <code>lowbit(x)</code> to obtain $2^k$ (where k is the number of trailing zeros in the binary representation of x). What is the quickest and most convenient way to implement it?<br>We could of course loop over the bits of <code>x</code> starting from the lowest one, but there is a better way: <code>x & (-x)</code>.<br>Here is why this expression yields what we need, assuming <code>x</code> is a non-negative <code>n</code>-bit signed integer. In two's complement, <code>-x</code> equals <code>~x + 1</code>, which flips every bit above the lowest set bit of <code>x</code> and leaves that bit and the zeros below it unchanged, so the AND keeps only the lowest set bit:<br><img src="/images/binary-indexed-trees/bit2.png" alt="lowbit原理"></p><p>The <code>lowbit</code> function can also be implemented as <code>((i-1)^i)&i</code>, which likewise relies on properties of the binary representation; it is not explained further here.</p><p>Reference code for lowbit:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Implementation 1</span></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">lowbit</span><span class="params">(<span class="keyword">int</span> x)</span></span>{</span><br><span class="line"> <span class="keyword">return</span> x & (-x);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Implementation 2 (an alternative; use only one of the two)</span></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">lowbit</span><span class="params">(<span class="keyword">int</span> x)</span></span>{</span><br><span class="line"> <span class="keyword">return</span> ((x - <span class="number">1</span>) ^ x) & x;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h2 id="应用"><a 
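href="#应用" class="headerlink" title="应用"></a>Applications</h2><p>By now the advantage of the binary indexed tree should be obvious: the code is almost as simple as <code>a + b</code>. Beyond the basic usage described above, where else can a binary indexed tree be put to clever use?</p><h3 id="应用一-单点更新,区间查询"><a href="#应用一-单点更新,区间查询" class="headerlink" title="应用一 单点更新,区间查询"></a>Application 1: point update, range query</h3><p><strong>Problem:</strong><br>Given a sequence of $n(1 <= n <= 500000)$ elements, all initially 0, support three operations:</p><ul><li><strong>add x v</strong>, add v to the x-th element;</li><li><strong>sub x v</strong>, subtract v from the x-th element;</li><li><strong>sum x y</strong>, compute the sum from the x-th element to the y-th element;</li></ul><p><strong>Solution:</strong><br>This is the most basic use of a binary indexed tree. The first two operations map directly to <code>update(x, v);</code> and <code>update(x, -v);</code> respectively, and the third is <code>getSum(y) - getSum(x - 1)</code>.</p><h3 id="应用二-区间更新,单点查询"><a href="#应用二-区间更新,单点查询" class="headerlink" title="应用二 区间更新,单点查询"></a>Application 2: range update, point query</h3><p><strong>Problem:</strong><br>Given a sequence of $n(1 <= n <= 500000)$ elements, all initially 0, support two operations:</p><ul><li><strong>add x y v</strong>, add v to every element from the x-th to the y-th;</li><li><strong>get x</strong>, query the value of the x-th element;</li></ul><p><strong>Solution:</strong><br>This problem can be reduced to point update and range query, as follows:<br>The first operation is implemented with two calls to <code>update</code>, namely <code>update(x, v)</code> and <code>update(y + 1, -v)</code>;<br>The second operation, querying the value of the <code>x</code>-th element, is simply a call to <code>getSum(x)</code>.</p><blockquote><p>Note: the tree now stores a difference array, so <code>getSum(x)</code> actually returns the accumulated <strong>increment</strong> of element <code>x</code>.</p></blockquote><p><strong>Extension:</strong><br>If the initial data is not all 0, then <code>get x</code> should be written as <code>getSum(x) + data[x]</code>, where <code>data[x]</code> is the initial value of the <code>x</code>-th element.</p><h3 id="应用三-区间更新,区间查询"><a href="#应用三-区间更新,区间查询" class="headerlink" title="应用三 区间更新,区间查询"></a>Application 3: range update, range query</h3><p><strong>Problem:</strong><br>Given a sequence of $n(1 <= n <= 500000)$ elements, support two operations:</p><ul><li><strong>add x y v</strong>, add v to every element from the x-th to the y-th;</li><li><strong>sum x y</strong>, compute the sum from the x-th element to the y-th element;</li></ul><p><strong>Solution:</strong><br>A binary indexed tree can solve this problem, but it arguably goes against the spirit of the structure, so a segment tree is the recommended tool here.<br>The binary indexed tree approach is as follows, for one update that adds $v$ to $[x, y]$:<br>Let $s(i) = \sum_{j=1}^{i}a_j$ be the prefix sum before adding $v$</p><p>Let $s’(i) = \sum_{j=1}^{i}a’_j$ be the prefix sum after adding $v$<br>Then:</p><p>$i < x : s’(i) = s(i)$</p><p>$x <= i <= y : s’(i) 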
= s(i) + v * (i - x + 1)$ </p><p>$y < i : s’(i) = s(i) + v * (y - x + 1)$</p><p>我们构建两个树状数组 <code>bit0</code>, <code>bit1</code>,并且满足:<br>$$\sum_{j=1}^{i}a_j = sum(bit1, i) * i + sum(bit0, i)$$<br>于是,在区间 $[l,r]$ 上同时加上 $x$ 就可以看作是:</p><blockquote><p>在 <code>bit0</code> 上的 <code>l</code> 位置上加上 <code>-x(l-1)</code><br>在 <code>bit1</code> 上的 <code>l</code> 位置上加上 <code>x</code><br>在 <code>bit0</code> 上的 <code>r+1</code> 位置上加上 <code>x*r</code><br>在 <code>bit1</code> 上的 <code>r+1</code> 位置上加上 <code>-x</code></p></blockquote><p>这四个操作都可以在 $log(n)$ 内完成。<br>下面实现一下:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">int</span> n;</span><br><span class="line"><span class="keyword">int</span> a[MAXN]; <span 
class="comment">// original data</span></span><br><span class="line"><span class="keyword">int</span> bit0[MAXN], bit1[MAXN];</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">lowbit</span><span class="params">(<span class="keyword">int</span> x)</span></span>{</span><br><span class="line"> <span class="keyword">return</span> x & (-x);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getSum</span><span class="params">(<span class="keyword">int</span> *bit, <span class="keyword">int</span> x)</span></span>{</span><br><span class="line"> <span class="keyword">int</span> sum = <span class="number">0</span>;</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = x; i; i -= lowbit(i)){</span><br><span class="line"> sum += bit[i];</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> sum;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">update</span><span class="params">(<span class="keyword">int</span> *bit, <span class="keyword">int</span> x, <span class="keyword">int</span> val)</span></span>{</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = x; i <= n; i += lowbit(i)){</span><br><span class="line"> bit[i] += val;</span><br><span class="line"> }</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">solve</span><span class="params">()</span></span>{</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = <span class="number">1</span>; i <= n; i++){</span><br><span 
class="line"> update(bit0, i, a[i]);</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="comment">// 更新操作 [x, y] 区间内加上 val;</span></span><br><span class="line"> update(bit0, x, -val * (x - <span class="number">1</span>));</span><br><span class="line"> update(bit1, x, val);</span><br><span class="line"> update(bit0, y + <span class="number">1</span>, val * y);</span><br><span class="line"> update(bit1, y + <span class="number">1</span>, -val);</span><br><span class="line"></span><br><span class="line"> <span class="comment">// 求和操作 求 [x, y] 区间和</span></span><br><span class="line"> <span class="keyword">int</span> sum = <span class="number">0</span>;</span><br><span class="line"> sum += getSum(bit0, y) + getSum(bit1, y) * y;</span><br><span class="line"> sum -= getSum(bit0, x - <span class="number">1</span>) + getSum(bit1, x - <span class="number">1</span>) * (x - <span class="number">1</span>);</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h3 id="应用四-求逆序对数"><a href="#应用四-求逆序对数" class="headerlink" title="应用四 求逆序对数"></a>应用四 求逆序对数</h3><p><strong>问题1:</strong><br>一个长度为 $n(1 <= n <= 500000)$ 的元素序列,给定每个数的值, 求它的逆序对数</p><p><strong>解答:</strong><br>求逆序对可以用 <code>归并排序</code> 来解,复杂度为 $nlog(n)$, 如果用树状数组的话, 怎么做呢?<br>我们用一个例子来介绍求解的步骤,比如我们要求 4, 2, 1, 3, 6, 5 这个序列的逆序对。</p><blockquote><p>首先,我们要声明一个数组,初始化为 0, 大小为我们要求的那个序列的最大值(如果最大值太大的话,建议离散化), 并将答案记录在 ans 中;</p></blockquote><ol><li>将 4 插入到序列中,执行 <code>update(4, 1)</code>, <code>cnt += (1 - getSum(4))</code>;</li><li>将 2 插入到序列中,执行 <code>update(2, 1)</code>, <code>cnt += (2 - getSum(2))</code>;</li><li>将 1 插入到序列中,执行 <code>update(1, 1)</code>, <code>cnt += (3 - getSum(1))</code>;</li><li>将 3 插入到序列中,执行 <code>update(3, 1)</code>, <code>cnt += (4 - getSum(3))</code>;</li><li>将 6 插入到序列中,执行 <code>update(6, 1)</code>, <code>cnt += (5 - getSum(6))</code>;</li><li>将 5 插入到序列中,执行 <code>update(5, 1)</code>, <code>cnt += (6 - 
getSum(5))</code>;<br>当这些操作做完后,<code>cnt</code> 就是我们要求的值.</li></ol><p><strong>问题2:</strong><br>给定$N(N <= 100000)$个区间,定义两个区间$(S_i, E_i)$和$(S_j, E_j)$的<code>></code>如下:如果 $S_i <= S_j and E_j <= E_i and E_i - S_i > E_j - S_j$,则 $(S_i, E_i) > (S_j, E_j)$,现在要求每个区间有多少区间<code>></code>它。<br>简化这个问题的话就是求, 有多少个区间 $i$ 完全覆盖 $j$, 并且这两个区间不能相等(如下图所示)。<br><img src="/images/binary-indexed-trees/bit3.png" alt="应用四题目"><br><strong>解答:</strong> </p><ol><li>对区间进行排序,排序规则为:左端点递增,如果左端点相同,则右端点递减。</li><li>枚举区间,不断插入区间右端点,因为区间左端点是保持递增的,所以对于某个区间$(S_i, E_i)$,只需要查询树状数组中$[Ei, MAX]$这一段有多少已经插入的数据,就能知道有多少个区间是比它大的,这里需要注意的是多个区间相等的情况,因为有排序,所以它们在排序后的数组中一定是相邻的,所以在遇到有相等区间的情况,需要”延迟”插入。等下一个不相等区间出现时才把之前保存下来的区间右端点进行插入。插入完毕再进行统计。<br>备注: 这里的插入即$(E_j, 1)$,统计则是$getSum(n) - getSum(E_i - 1)$ (其中 $j < i$ )。</li></ol><h4 id="关于逆序对的种种讨论"><a href="#关于逆序对的种种讨论" class="headerlink" title="关于逆序对的种种讨论"></a>关于逆序对的种种讨论</h4><p>既然提到了逆序对,那么我们现在来讨论一下当一个序列发生改变时其逆序对的变化</p><p><strong>问题1:</strong></p><p>现在考虑这样一个问题:在序列 $a[]$ 中,交换 $a[i]$ 与 $a[j](i < j)$,则序列的逆序对数奇偶性有何变化?</p><p><strong>解答:</strong></p><p>为了简化问题,我们首先考虑 $i, j$ 相邻的情况:<br>我们可以肯定的是如果 $i, j$ 相邻, 那么 $a[i], a[j]$ 的交换不会引起 $a[k], k < i$ 和 $a[k], k > j$ 的逆序对, 那么: 当 $a[i] > a[j]$时,$a[i], a[j]$ 的交换会导致整体的逆序对数 $-1$, 反之会导致整体的逆序对数 $+1$, 也就是说 ** $a[i], a[j]$ 的交换会导致整体逆序对数的变化,即相邻两个数的交换会导致整体逆序对数的变化 **(在这里我们不考虑 $a[i] == a[j]$)。<br>假设 $a[i]$ 与 $a[j]$ 之间有 $m(m >= 1)$ 个数:<br>这种情况下,我们可以考虑 $a[i], a[j]$ 的交换是下面三种情况的叠加,</p><ol><li>$a[i]$ 依次与其后面的 $m$ 个数交换直到 $a[j]$, 共进行 $m$ 次交换;</li><li>$a[i], a[j]$ 进行交换,共$1$次;</li><li>$a[j]$ 依次与其前面的 $m$ 个数交换直到 $a[i]$ 之前的位置,共进行 $m$ 次交换;</li></ol><p>如此可知,共进行了 $2 \ast m + 1$ 次交换, 综合两种情况可知 <strong>任何两个数的交换会导致整体逆序对数的变化</strong>(不考虑 $a[i] == a[j]$ 的情况).</p><p><strong>问题2:</strong></p><p>在问题1的基础上我们接着考虑,$a[i], a[j]$交换后对整体逆序对数量的影响。<br><strong>解答:</strong></p><p>可以很容易的想到想要解决这个问题,必须要知道在区间 $[l, r]$ 中比 $a[i]$ 小的数的个数 <code>num_less_i</code>, 比 $a[i]$ 大的数的个数 <code>num_larger_i</code>, 比 $a[j]$ 小的数的个数 <code>num_less_j</code>, 比 $a[j]$ 大的数的个数 <code>num_larger_j</code>, 
则逆序对的变化为:<br><code>change_num = -num_less_i + num_larger_i + num_less_j - num_larger_j</code><br>现在的问题是如何快速的求出这些值。<br>令 <code>cnt[i][j]</code> 表示到位置 <code>i</code> 为止(包括<code>i</code>), 比 <code>j</code> 小的数的个数,并且假定对任意的 $a[i], a[i] <= m$, 那么</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">num_less_i = cnt[j][a[i]] - cnt[i][a[i]];</span><br><span class="line">num_larger_i = j - i - num_less_i;</span><br><span class="line">num_less_j = cnt[j][a[j]] - cnt[i][a[j]];</span><br><span class="line">num_larger_j = j - i - <span class="number">1</span> - num_less_j;</span><br><span class="line"><span class="comment">//第四行的减一是减去a[j]本身也算一个数</span></span><br></pre></td></tr></table></figure><p>至于 <code>cnt</code> 数组可以预处理出来:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">//递推求cnt数组, 复杂度比较大</span></span><br><span class="line"><span class="keyword">for</span>(<span class="keyword">int</span> i = <span class="number">1</span>; i <= n; i++){</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> j = <span class="number">1</span>; j <=m; j++){</span><br><span class="line"> <span class="keyword">if</span>(a[i] < j)cnt[i][j] = cnt[i - <span class="number">1</span>][j] + <span class="number">1</span>;</span><br><span class="line"> <span class="keyword">else</span> cnt[i][j] = cnt[i - <span class="number">1</span>][j];</span><br><span class="line"> }</span><br><span 
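class="line">}</span><br></pre></td></tr></table></figure><p>Problem solved.</p><p><strong>Problem 3:</strong></p><p>Suppose we are given a sequence <code>b</code>, where $b[i], (1 <= i <= n)$ is the inversion count of the value $i$ in some other sequence, that is, the number of larger values that appear before $i$. Can we construct a permutation of $1 - n $ that satisfies <code>b</code>?</p><p><strong>Solution:</strong></p><p>This is exactly the reverse of counting inversions. Take the <code>b</code> sequence <code>1, 2, 0, 1, 0</code> as an example. How do we construct the permutation?<br>Let's try. The inversion count of <code>1</code> is <code>1</code>, meaning exactly <code>1</code> larger number precedes <code>1</code> in the new sequence. Since <code>1</code> is the smallest number, every number before it is larger, so <code>1</code> must be at position <code>2</code>. Sequence so far: <code>_ 1 _ _ _</code>. The inversion count of <code>2</code> is <code>2</code>, and by the same reasoning <code>2</code> must be at position <code>4</code>, giving <code>_ 1 _ 2 _</code>. In other words, <code>2</code> takes the <code>3</code>rd empty slot. From another angle: consider the position list <code>(1,2,3,4,5)</code>. The number <code>1</code> has taken position <code>2</code>, so remove <code>2</code> from the list -> <code>(1,3,4,5)</code>. The position for <code>2</code> is then the <code>3</code>rd smallest element of the list, that is, the <code>(b[i] + 1)</code>-th smallest. A binary indexed tree can find the <code>K</code>-th smallest element (see 应用五).</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// ans is the constructed sequence</span></span><br><span class="line"><span class="comment">// the tree holds a 1 at every position that is still empty</span></span><br><span class="line"><span class="keyword">for</span>(<span class="keyword">int</span> i = <span class="number">1</span>; i <= n; i++){</span><br><span class="line"> <span class="keyword">int</span> pos = find_kth_element(b[i] + <span class="number">1</span>);</span><br><span class="line"> ans[pos] = i;</span><br><span class="line"> update(pos, <span class="number">-1</span>);</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p><strong>Problem 4:</strong></p><p>Take a sequence <code>b</code> as in the previous problem, except that <code>b[i]</code> is now the inversion count at position <code>i</code> of the original sequence. Can you reconstruct the original sequence? (The original sequence is a permutation of <code>1-n</code>.)</p><p><strong>Solution:</strong></p><p>This problem differs from the previous one, but the same idea applies. For example, take the <code>b</code> sequence <code>0, 1, 2, 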
0, 1</code></p><p>因为一个序列的第一个数的逆序对数总是为<code>0</code>,所以从前往后的分析不太靠谱.那么我们试一试从后向前分析.最后一个数的逆序对数为<code>1</code>,说明他前面只能有一个数比他大,显然最后一个数只能是<code>4</code>.即序列变成 <code>_ _ _ _ 4</code>. 倒数第二个数的逆序对数为<code>0</code>,则同样可确定该数只能是<code>5</code>.序列变成 <code>_ _ _5 4</code>. 倒数第三个数的逆序对数为<code>2</code>,可确定该数为<code>1</code>.有什么规律呢?用<code>cnt</code>表示还剩下的数,每次要填的数,是不是第<code>cnt - b[i]</code>小的数呢?倒数第一个数的逆序对数为<code>1</code>,要填的是第 <code>5 - 1</code>小的数,也就是<code>4</code>. 然后倒数第二个数的逆序对数为<code>0</code>,要填第 <code>4-0</code>小的数,在剩余的数里面就是<code>5</code>.以此类推.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">//算法伪代码</span></span><br><span class="line"><span class="comment">//ans为构造的序列</span></span><br><span class="line"><span class="keyword">for</span>(<span class="keyword">int</span> i = n; i > <span class="number">0</span>; i--){</span><br><span class="line"> <span class="keyword">int</span> num = find_kth_element(i - b[i]);</span><br><span class="line"> ans[i] = num;</span><br><span class="line"> update(num, <span class="number">-1</span>);</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h3 id="应用五-求第k大数"><a href="#应用五-求第k大数" class="headerlink" title="应用五 求第k大数"></a>应用五 求第k大数</h3><p><strong>问题:</strong></p><p>给定一个空的栈,现对他进行如下操作:</p><ol><li><strong>push x</strong>, 将 x(0 <= x <= 1000) 放于栈顶;</li><li><strong>pop</strong>, 将栈顶元素弹出;</li><li><strong>query k</strong>, 查询栈中第 k 大元素;</li></ol><p><strong>解答:</strong></p><p>开一个大小为 <code>1000</code> 的数组并初始化为 0, </p><ol><li>当进行 <code>push x</code> 操作的时候调用 <code>update(x, 1)</code>;</li><li>当进行 <code>pop</code> 操作的时候调用 <code>update(x, -1)</code>;</li><li>对于 <code>query k</code> 操作,实际上就是找到一个右端点 <code>r</code>, 使得 $[1, r]$ 的和为 
<code>k</code>, 故我们可以用二分与树状数组相结合来查找 <code>r</code>;</li></ol><p>具体实现:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> MAXN = <span class="number">1010</span>;</span><br><span class="line"></span><br><span class="line"><span class="keyword">int</span> 
a[MAXN];</span><br><span class="line"><span class="built_in">stack</span> <<span class="keyword">int</span>> sta;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">lowbit</span><span class="params">(<span class="keyword">int</span> x)</span></span>{</span><br><span class="line"> <span class="keyword">return</span> x & (-x);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">update</span><span class="params">(<span class="keyword">int</span> x, <span class="keyword">int</span> val)</span></span>{</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = x; i < MAXN; i += lowbit(i)){</span><br><span class="line"> a[i] += val;</span><br><span class="line"> }</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getSum</span><span class="params">(<span class="keyword">int</span> x)</span></span>{</span><br><span class="line"> <span class="keyword">int</span> sum = <span class="number">0</span>;</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = x; i; i -= lowbit(i)){</span><br><span class="line"> sum += a[i];</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> sum;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">FindMid</span><span class="params">(<span class="keyword">int</span> k)</span></span>{</span><br><span class="line"> <span class="keyword">int</span> l = <span class="number">0</span>, r = MAXN;</span><br><span class="line"> <span class="keyword">while</span>(r - l > <span 
class="number">1</span>){</span><br><span class="line"> <span class="keyword">int</span> mid = (l + r) >> <span class="number">1</span>;</span><br><span class="line"> <span class="keyword">if</span>(getSum(mid) < k) l = mid;</span><br><span class="line"> <span class="keyword">else</span> r = mid;</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> r;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">solve</span><span class="params">()</span></span>{</span><br><span class="line"></span><br><span class="line"> <span class="built_in">memset</span>(a, <span class="number">0</span>, <span class="keyword">sizeof</span>(a));</span><br><span class="line"> <span class="keyword">while</span>(!sta.empty()) sta.pop();</span><br><span class="line"></span><br><span class="line"> <span class="comment">// push x: store x at index x + 1, since x may be 0 and index 0 would make update loop forever</span></span><br><span class="line"> sta.push(x);</span><br><span class="line"> update(x + <span class="number">1</span>, <span class="number">1</span>);</span><br><span class="line"></span><br><span class="line"> <span class="comment">// pop: remove the top value from the tree, then from the stack</span></span><br><span class="line"> <span class="keyword">int</span> top = sta.top();</span><br><span class="line"> update(top + <span class="number">1</span>, <span class="number">-1</span>);</span><br><span class="line"> sta.pop();</span><br><span class="line"></span><br><span class="line"> <span class="comment">// query k: FindMid returns the smallest index whose prefix count reaches k; subtract the shift</span></span><br><span class="line"> <span class="keyword">int</span> ans = FindMid(k) - <span class="number">1</span>;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h3 id="应用六-二维树状数组"><a href="#应用六-二维树状数组" class="headerlink" title="应用六 二维树状数组"></a>Application 6: two-dimensional binary indexed tree</h3><p><strong>Problem:</strong></p><p>Given a matrix $mp(n * m, (1 <= n, m <= 1000))$ whose elements are all initially $0$, support two operations:</p><ol><li><strong>add x y v</strong>, meaning $ mp[x][y] += v $;</li><li><strong>query x1 y1 x2 y2</strong>, meaning the sum of all numbers in the rectangle bounded by x1, y1, x2, y2;</li></ol><p><strong>Solution:</strong></p><p>The ordinary one-dimensional binary indexed tree can be rewritten as a two-dimensional one:</p><figure 
class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">lowbit</span><span class="params">(<span class="keyword">int</span> x)</span></span>{</span><br><span class="line"> <span class="keyword">return</span> x & (-x);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">update</span><span class="params">(<span class="keyword">int</span> x, <span class="keyword">int</span> y, <span class="keyword">int</span> val)</span></span>{</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = x; i <= n; i += lowbit(i)){</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> j = y; j <= m; j += lowbit(j)){</span><br><span class="line"> c[i][j] += val;</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">getSum</span><span class="params">(<span class="keyword">int</span> x, <span class="keyword">int</span> 
y)</span></span>{</span><br><span class="line"> <span class="keyword">int</span> sum = <span class="number">0</span>;</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = x; i; i -= lowbit(i)){</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> j = y; j; j -= lowbit(j)){</span><br><span class="line"> sum += c[i][j];</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> sum;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>A close look shows that the two-dimensional implementation is almost identical to the one-dimensional one: it only adds one more loop and stores the data in a two-dimensional array. Likewise, the three-dimensional case only adds one more array dimension and one more loop in both the update and the sum.</p><p>To get the sum over the rectangle bounded by the two points $(x1, y1)$ and $(x2, y2)$, apply inclusion-exclusion to the prefix sums:<br>$$ sum = getSum(x2, y2) - getSum(x1 - 1, y2) - getSum(x2, y1 - 1) + getSum(x1 - 1, y1 - 1)$$</p><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>Summary</h2><p>After all that rambling, let us sum up the advantages of the binary indexed tree:</p><ol><li>It takes very little code, which is excellent value for what it can do;</li><li>It is flexible to apply and powerful;</li><li>Every operation on a one-dimensional binary indexed tree runs in $O(log(n))$;</li><li>It needs only linear space, that is, $O(n)$;</li><li>It extends to higher dimensions.</li></ol>]]></content>
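<summary type="html"><h2 id="引入"><a href="#引入" class="headerlink" title="引入"></a>Introduction</h2><p>I had heard of this data structure long ago, but everything it does can also be done with a segment tree. Young and naive, I assumed that knowing segment trees meant I could ignore it, yet typing out a long segment tree for every problem was painful. It is only a binary indexed tree, so let's go learn it!</p></summary>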
<category term="算法" scheme="https://andrewei1316.github.io/categories/%E7%AE%97%E6%B3%95/"/>
<category term="数据结构" scheme="https://andrewei1316.github.io/categories/%E7%AE%97%E6%B3%95/%E6%95%B0%E6%8D%AE%E7%BB%93%E6%9E%84/"/>
<category term="数据结构" scheme="https://andrewei1316.github.io/tags/%E6%95%B0%E6%8D%AE%E7%BB%93%E6%9E%84/"/>
<category term="算法" scheme="https://andrewei1316.github.io/tags/%E7%AE%97%E6%B3%95/"/>
<category term="树状数组" scheme="https://andrewei1316.github.io/tags/%E6%A0%91%E7%8A%B6%E6%95%B0%E7%BB%84/"/>
</entry>
<entry>
<title>C++ Virtual Functions and Polymorphism</title>
<link href="https://andrewei1316.github.io/2016/06/07/cplusplus-virtual-function/"/>
<id>https://andrewei1316.github.io/2016/06/07/cplusplus-virtual-function/</id>
<published>2016-06-07T04:06:56.000Z</published>
<updated>2018-04-09T01:16:07.246Z</updated>
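<content type="html"><![CDATA[<h3 id="写在前面"><a href="#写在前面" class="headerlink" title="写在前面"></a>Preface</h3><p>This article is reposted from <a href="http://www.cppblog.com/dawnbreak/archive/2009/03/10/76084.aspx">http://www.cppblog.com/dawnbreak/archive/2009/03/10/76084.aspx</a>, with thanks to the author.<br>I have always found C++ fascinating: its inheritance and polymorphism are considerably more flexible than Java's. Today I came across the article above on virtual functions, thought it was well written, and am sharing it here.</p><hr><a id="more"></a><h1 id="虚函数表"><a href="#虚函数表" class="headerlink" title="虚函数表"></a>The Virtual Function Table</h1><p>Anyone familiar with C++ knows that virtual functions are typically implemented through a virtual function table (virtual table, or V-Table for short). The table mainly holds the addresses of a class's virtual functions. It resolves inheritance and overriding so that each entry reflects the function that should really be called. Every instance of a class with virtual functions stores a pointer to this table in its own memory, so when we operate on a derived-class object through a base-class pointer the table becomes especially important: like a map, it points to the function that should actually be invoked.<br>Let us look closely at this table. The C++ standard does not prescribe an object layout, but the compilers tested in this article place the pointer to the virtual function table at the very beginning of the object instance (which keeps the lookup of virtual function offsets simple). This means that from the address of an object instance we can obtain the virtual function table, then walk through its function pointers and call the corresponding functions.<br>An example makes this concrete.<br>Suppose we have the following class:</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="class"><span class="keyword">class</span> <span class="title">Base</span>{</span></span><br><span class="line"> <span class="keyword">public</span>:</span><br><span class="line"> <span class="function"><span class="keyword">virtual</span> <span class="keyword">void</span> <span class="title">f</span><span class="params">()</span> </span>{ <span class="built_in">cout</span> << <span class="string">"Base::f"</span> << <span class="built_in">endl</span>; }</span><br><span class="line"> <span class="function"><span class="keyword">virtual</span> <span class="keyword">void</span> <span class="title">g</span><span class="params">()</span> </span>{ <span class="built_in">cout</span> << <span class="string">"Base::g"</span> << <span class="built_in">endl</span>; }</span><br><span class="line"> <span class="function"><span class="keyword">virtual</span> <span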
class="keyword">void</span> <span class="title">h</span><span class="params">()</span> </span>{ <span class="built_in">cout</span> << <span class="string">"Base::h"</span> << <span class="built_in">endl</span>; }</span><br><span class="line">};</span><br></pre></td></tr></table></figure><p>Next we use an instance of Base to get at the virtual function table:</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">typedef</span> <span class="title">void</span><span class="params">(*Fun)</span><span class="params">(<span class="keyword">void</span>)</span></span>; Base b;</span><br><span class="line">Fun pFun = <span class="literal">NULL</span>;</span><br><span class="line"><span class="built_in">cout</span> << <span class="string">"虚函数表地址:"</span> << (<span class="keyword">int</span>*)(&b) << <span class="built_in">endl</span>;</span><br><span class="line"><span class="built_in">cout</span> << <span class="string">"虚函数表 — 第一个函数地址:"</span> << (<span class="keyword">int</span>*)*(<span class="keyword">int</span>*)(&b) << <span class="built_in">endl</span>;</span><br><span class="line"></span><br><span class="line">pFun = (Fun)*((<span class="keyword">int</span>*)*(<span class="keyword">int</span>*)(&b));</span><br><span class="line"></span><br><span class="line">pFun();</span><br></pre></td></tr></table></figure><p>The program was tested on Windows XP + VS2003 and on Linux 2.6.22 + GCC 4.1.3. The <code>int*</code> casts assume that a pointer fits in an <code>int</code>, as on a 32-bit build, and the trick as a whole depends on compiler-specific object layout.<br>The actual output is as follows (the two labels read "virtual function table address" and "virtual function table, address of first function"):</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">虚函数表地址:0012FED4</span><br><span class="line">虚函数表 —
第一个函数地址:0044F148</span><br><span class="line">Base::f</span><br></pre></td></tr></table></figure><p>This example shows that casting <code>&b</code> to <code>int *</code> gives us the location of the virtual function table pointer; dereferencing it yields the address of the table, and dereferencing once more yields the address of the first virtual function, <code>Base::f()</code>. The program above confirms this (it casts the value read through the <code>int*</code> to a function pointer). From this we can also see how to call <code>Base::g()</code> and <code>Base::h()</code>:</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">(Fun)*((<span class="keyword">int</span>*)*(<span class="keyword">int</span>*)(&b)+<span class="number">0</span>); <span class="comment">// Base::f()</span></span><br><span class="line">(Fun)*((<span class="keyword">int</span>*)*(<span class="keyword">int</span>*)(&b)+<span class="number">1</span>); <span class="comment">// Base::g()</span></span><br><span class="line">(Fun)*((<span class="keyword">int</span>*)*(<span class="keyword">int</span>*)(&b)+<span class="number">2</span>); <span class="comment">// Base::h()</span></span><br></pre></td></tr></table></figure><p>The figure below shows how the virtual function table is laid out in memory:<br><img src="/images/cpluscplus_virtual_function/o_vtable1.jpg" alt="虚函数分布"><br><strong>Note:</strong> In the figure above I added an extra node at the end of the virtual function table. It is the table's terminating node: like the string terminator <code>'\0'</code>, it marks the end of the table. Its value differs between compilers. Under <code>WinXP+VS2003</code> it is <code>NULL</code>. Under <code>Ubuntu 7.10 + Linux 2.6.22 + GCC 4.1.3</code>, a value of <code>1</code> means that another virtual function table follows, and a value of <code>0</code> means that this is the last one.<br>Next I describe what the table looks like without and with overriding. A virtual function that is never overridden serves little purpose. I cover the no-override case mainly for contrast, since comparing the two makes the underlying implementation clearer.</p><h1 id="一般继承(无虚函数覆盖)"><a href="#一般继承(无虚函数覆盖)" class="headerlink" title="一般继承(无虚函数覆盖)"></a>Single Inheritance (No Virtual Function Overriding)</h1><p>Now let us see what the virtual function table looks like under inheritance. Suppose we have the inheritance relationship shown below:<br><img src="/images/cpluscplus_virtual_function/o_vtable2.jpg" alt="虚函数分布"><br>Note that in this hierarchy the derived class does not override any base-class function. The virtual function table of a derived-class instance then looks as follows.<br>For the instance <code>Derive d;</code> the table is:<br><img src="/images/cpluscplus_virtual_function/o_vtable3.jpg"
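alt="虚函数分布"><br>We can observe the following:</p><blockquote><p>1) Virtual functions are placed in the table in declaration order.<br>2) The base class's virtual functions come before the derived class's.<br>I am sure you can adapt the earlier program to verify this yourself.</p></blockquote><h1 id="一般继承(有虚函数覆盖)"><a href="#一般继承(有虚函数覆盖)" class="headerlink" title="一般继承(有虚函数覆盖)"></a>Single Inheritance (With Virtual Function Overriding)</h1><p>Overriding base-class virtual functions is the obvious thing to do; otherwise virtual functions would be pointless. So what happens when a derived class overrides a virtual function of its base class? Suppose we have the inheritance relationship below.<br><img src="/images/cpluscplus_virtual_function/o_vtable4.jpg" alt="虚函数分布"><br>To make the effect of inheritance visible, in this design I override only one base-class function: <code>f()</code>. The virtual function table of a derived-class instance then looks like this:<br><img src="/images/cpluscplus_virtual_function/o_vtable5.jpg" alt="虚函数分布"><br>From the table we can observe the following:</p><blockquote><p>1) The overriding f() is placed in the slot that the base class's virtual function used to occupy.<br>2) Functions that are not overridden stay where they were.<br>Now consider a program like this:</p></blockquote><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">Base *b = <span class="keyword">new</span> Derive();</span><br><span class="line">b->f();</span><br></pre></td></tr></table></figure><p>In the virtual function table reached through <code>b</code>, the slot for <code>f()</code> now holds the address of <code>Derive::f()</code>, so the actual call invokes <code>Derive::f()</code>. This is how polymorphism is achieved.</p><h1 id="多重继承(无虚函数覆盖)"><a href="#多重继承(无虚函数覆盖)" class="headerlink" title="多重继承(无虚函数覆盖)"></a>Multiple Inheritance (No Virtual Function Overriding)</h1><p>Next let us look at multiple inheritance. Suppose we have the class hierarchy below. Note that the derived class does not override any base-class function.<br><img src="/images/cpluscplus_virtual_function/o_vtable6.jpg" alt="虚函数分布"><br>The virtual function tables in a derived-class instance look like this:<br><img src="/images/cpluscplus_virtual_function/o_vtable7.jpg" alt="虚函数分布"><br>We can see that:</p><blockquote><p>1) Each base class has its own virtual table.<br>2) The derived class's own virtual functions are placed in the table of the first base class (first in declaration order).<br>This arrangement lets pointers of different base-class types point at the same derived-class instance and still reach the right functions.</p></blockquote><h1 id="多重继承(有虚函数覆盖)"><a href="#多重继承(有虚函数覆盖)" class="headerlink" title="多重继承(有虚函数覆盖)"></a>Multiple Inheritance (With Virtual Function Overriding)</h1><p>Now let us see what happens when virtual functions are overridden.<br>In the figure below, the derived class overrides the base classes' <code>f()</code>.<br><img src="/images/cpluscplus_virtual_function/o_vtable8.jpg" alt="虚函数分布"><br>The virtual function tables in a derived-class instance then look like this:<br><img src="/images/cpluscplus_virtual_function/o_vtable9.jpg"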
alt="虚函数分布"><br>We can see that the <code>f()</code> slot in all three base-class virtual function tables now holds the derived class's function pointer. A pointer of any of the base-class static types can therefore point at the derived object and call the derived class's <code>f()</code>. For example:</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">Derive d;</span><br><span class="line">Base1 *b1 = &d;</span><br><span class="line">Base2 *b2 = &d;</span><br><span class="line">Base3 *b3 = &d;</span><br><span class="line">b1->f(); <span class="comment">//Derive::f()</span></span><br><span class="line">b2->f(); <span class="comment">//Derive::f()</span></span><br><span class="line">b3->f(); <span class="comment">//Derive::f()</span></span><br><span class="line">b1->g(); <span class="comment">//Base1::g()</span></span><br><span class="line">b2->g(); <span class="comment">//Base2::g()</span></span><br><span class="line">b3->g(); <span class="comment">//Base3::g()</span></span><br></pre></td></tr></table></figure><h1 id="基类的析构函数为什么通常写为虚函数"><a href="#基类的析构函数为什么通常写为虚函数" class="headerlink" title="基类的析构函数为什么通常写为虚函数"></a>Why Base-Class Destructors Are Usually Declared Virtual</h1><p>An example illustrates the issue. Suppose we have the following code:</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span
class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><iostream></span></span></span><br><span class="line"><span class="keyword">using</span> <span class="keyword">namespace</span> <span class="built_in">std</span>;</span><br><span class="line"></span><br><span class="line"><span class="class"><span class="keyword">class</span> <span class="title">Base</span>{</span></span><br><span class="line"><span class="keyword">public</span>:</span><br><span class="line"> <span class="function"><span class="keyword">virtual</span> <span class="keyword">void</span> <span class="title">func</span><span class="params">()</span></span>{</span><br><span class="line"> <span class="built_in">cout</span> <<<span class="string">"class Base: func(), do something!"</span><< <span class="built_in">endl</span>;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> Base(){}</span><br><span class="line"> </span><br><span class="line"> <span class="keyword">virtual</span> ~Base(){</span><br><span class="line"> <span class="built_in">cout</span> <<<span class="string">"class Base: ~Base(), do something!"</span><< <span class="built_in">endl</span>;</span><br><span class="line"> }</span><br><span class="line">};</span><br><span 
class="line"></span><br><span class="line"><span class="class"><span class="keyword">class</span> <span class="title">Derived</span> :</span> <span class="keyword">public</span> Base{</span><br><span class="line"><span class="keyword">public</span>:</span><br><span class="line"> <span class="function"><span class="keyword">void</span> <span class="title">func</span><span class="params">()</span></span>{</span><br><span class="line"> <span class="built_in">cout</span> <<<span class="string">"class Derived: func(), do something!"</span><< <span class="built_in">endl</span>;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> Derived(){}</span><br><span class="line"></span><br><span class="line"> ~Derived(){</span><br><span class="line"> <span class="built_in">cout</span> <<<span class="string">"class Derived: ~Derived(), do something!"</span><< <span class="built_in">endl</span>;</span><br><span class="line"> }</span><br><span class="line">};</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">main</span><span class="params">()</span></span>{</span><br><span class="line"></span><br><span class="line"> Base *b = <span class="keyword">new</span> Derived();</span><br><span class="line"> b -> func();</span><br><span class="line"> <span class="keyword">delete</span> b;</span><br><span class="line"></span><br><span class="line"> <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>输出结果如下:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">class Derived: func(), do something!</span><br><span class="line">class Derived: ~Derived(), do something!</span><br><span class="line">class Base: ~Base(), do 
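something!</span><br></pre></td></tr></table></figure><p>If we now remove <code>virtual</code> from the destructor of class Base, the output becomes:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">class Derived: func(), do something!</span><br><span class="line">class Base: ~Base(), do something!</span><br></pre></td></tr></table></figure><p>The Derived destructor is no longer called, even though main really did create a Derived object with new. Formally, deleting a derived object through a base-class pointer whose destructor is not virtual is undefined behavior; in practice, as here, the derived part is never destroyed, so any resources it owns are leaked.</p><h1 id="安全性"><a href="#安全性" class="headerlink" title="安全性"></a>Safety</h1><p>Every article about C++ ends up criticizing C++ a little, and this one is no exception. After the discussion above we should have a fairly detailed picture of the virtual function table. Water can carry a boat, and it can also capsize it. So let us see what mischief the virtual function table makes possible.</p><h2 id="通过父类型的指针访问子类自己的虚函数"><a href="#通过父类型的指针访问子类自己的虚函数" class="headerlink" title="通过父类型的指针访问子类自己的虚函数"></a>Accessing a Derived Class's Own Virtual Functions Through a Base-Class Pointer</h2><p>As we know, a derived class that overrides none of its base class's virtual functions gains nothing from them, because polymorphism depends on overriding. Although the figures above show Derive's own virtual functions in the table that belongs to Base1, we cannot call them with statements like these:</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">Base1 *b1 = <span class="keyword">new</span> Derive();</span><br><span class="line">b1->g1(); <span class="comment">// compile error</span></span><br></pre></td></tr></table></figure><p>Any attempt to call, through a base-class pointer, a derived-class member function that does not override a base-class function is rejected by the compiler, so such a program does not compile. At run time, however, we can reach the function through the virtual function table with raw pointers and so violate C++ semantics.</p><h2 id="访问non-public的虚函数"><a href="#访问non-public的虚函数" class="headerlink" title="访问non-public的虚函数"></a>Accessing Non-Public Virtual Functions</h2><p>In addition, virtual functions of the base class that are <code>private</code> or <code>protected</code> still appear in the virtual function table, so the same table-walking technique reaches these <code>non-public</code> virtual functions just as easily.<br>For example:</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span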
class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="class"><span class="keyword">class</span> <span class="title">Base</span> {</span></span><br><span class="line"><span class="keyword">private</span>:</span><br><span class="line"> <span class="function"><span class="keyword">virtual</span> <span class="keyword">void</span> <span class="title">f</span><span class="params">()</span> </span>{ <span class="built_in">cout</span> << <span class="string">"Base::f"</span> << <span class="built_in">endl</span>; }</span><br><span class="line">};</span><br><span class="line"></span><br><span class="line"><span class="class"><span class="keyword">class</span> <span class="title">Derive</span> :</span> <span class="keyword">public</span> Base{</span><br><span class="line">};</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">typedef</span> <span class="title">void</span><span class="params">(*Fun)</span><span class="params">(<span class="keyword">void</span>)</span></span>;</span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">main</span><span class="params">()</span> </span>{</span><br><span class="line"> Derive d;</span><br><span class="line"> Fun pFun = (Fun)*((<span class="keyword">int</span>*)*(<span class="keyword">int</span>*)(&d)+<span class="number">0</span>);</span><br><span class="line"> pFun();</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h1 id="结束语"><a href="#结束语" class="headerlink" title="结束语"></a>Closing Remarks</h1><p>C++ is a language full of magic. As programmers, we never seem to know quite what it is doing behind our backs. To master it we have to understand what goes on inside C++, including its dangerous parts. Otherwise it is a language in which you end up shooting yourself in the foot.</p>]]></content>
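<summary type="html"><h3 id="写在前面"><a href="#写在前面" class="headerlink" title="写在前面"></a>Preface</h3><p>This article is reposted from <a href="http://www.cppblog.com/dawnbreak/archive/2009/03/10/76084.aspx">http://www.cppblog.com/dawnbreak/archive/2009/03/10/76084.aspx</a>, with thanks to the author.<br>I have always found C++ fascinating: its inheritance and polymorphism are considerably more flexible than Java's. Today I came across the article above on virtual functions, thought it was well written, and am sharing it here.</p>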
<hr></summary>
<category term="编程语言" scheme="https://andrewei1316.github.io/categories/%E7%BC%96%E7%A8%8B%E8%AF%AD%E8%A8%80/"/>
<category term="C++" scheme="https://andrewei1316.github.io/categories/%E7%BC%96%E7%A8%8B%E8%AF%AD%E8%A8%80/C/"/>
<category term="C++" scheme="https://andrewei1316.github.io/tags/C/"/>
<category term="虚函数" scheme="https://andrewei1316.github.io/tags/%E8%99%9A%E5%87%BD%E6%95%B0/"/>
<category term="多态性" scheme="https://andrewei1316.github.io/tags/%E5%A4%9A%E6%80%81%E6%80%A7/"/>
</entry>
<entry>
<title>Fixing Chinese Input in Sublime Text on Ubuntu</title>
<link href="https://andrewei1316.github.io/2016/05/05/sublime-input-Chinese/"/>
<id>https://andrewei1316.github.io/2016/05/05/sublime-input-Chinese/</id>
<published>2016-05-05T12:11:39.000Z</published>
<updated>2018-04-09T01:16:07.269Z</updated>
<content type="html"><![CDATA[<p>Sublime is a very handy code editor with syntax highlighting for many languages and a rich set of plugins. While using it to write blog posts recently, however, I found that I could not type Chinese in it. I looked up several fixes and have collected them below:</p><a id="more"></a><p><strong>Method 1:</strong><br>Some experts have already written automated scripts for this problem, for example <a href="https://github.com/lyfeyaj/sublime-text-imfix">sublime-text-imfix</a>. Its author gives the following usage instructions:</p><ol><li><p>Download the code with git or by any other means</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">sudo apt-get update && sudo apt-get upgrade <span class="comment"># update the system</span></span><br><span class="line">sudo apt-get install git <span class="comment"># install git</span></span><br><span class="line">git <span class="built_in">clone</span> https://github.com/lyfeyaj/sublime-text-imfix.git <span class="comment"># clone the repository</span></span><br></pre></td></tr></table></figure></li><li><p>Enter the repository root</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">cd</span> sublime-text-imfix</span><br></pre></td></tr></table></figure></li><li><p>Run the script</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">./sublime-imfix</span><br></pre></td></tr></table></figure></li><li><p>Restart Sublime and check whether the problem is fixed</p></li></ol><p>This method is very simple. On my machine, though, after applying it I could type Chinese when Sublime was started from the command line with <code>subl</code>, but not when it was opened from its desktop shortcut. That is enough for everyday coding, so I did not try the remaining methods; I cannot vouch for their results and list them for reference only.</p><p><strong>Method 2:</strong><br><a href="https://www.sinosky.org/linux-sublime-text-fcitx.html">美解决 Linux 下 Sublime Text 中文输入</a>, a fairly detailed write-up that is worth trying;</p><p><strong>Method 3:</strong><br><a href="http://jingyan.baidu.com/article/f3ad7d0ff8731609c3345b3b.html">如何解决在Ubuntu14.04下Sublime Text 3无法输入中文的问题</a>, a very detailed guide with screenshots throughout.</p>]]></content>
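<summary type="html"><p>Sublime is a very handy code editor with syntax highlighting for many languages and a rich set of plugins. While using it to write blog posts recently, however, I found that I could not type Chinese in it. I looked up several fixes and have collected them below:</p></summary>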
<category term="Linux" scheme="https://andrewei1316.github.io/categories/Linux/"/>
<category term="软件" scheme="https://andrewei1316.github.io/categories/Linux/%E8%BD%AF%E4%BB%B6/"/>
<category term="sublime" scheme="https://andrewei1316.github.io/categories/Linux/%E8%BD%AF%E4%BB%B6/sublime/"/>
<category term="Linux" scheme="https://andrewei1316.github.io/tags/Linux/"/>
</entry>
<entry>
<title>Installing deb Packages on Ubuntu</title>
<link href="https://andrewei1316.github.io/2016/04/27/ubuntu-install-deb/"/>
<id>https://andrewei1316.github.io/2016/04/27/ubuntu-install-deb/</id>
<published>2016-04-27T13:12:15.000Z</published>
<updated>2018-04-09T01:16:07.269Z</updated>
<content type="html"><![CDATA[<p>There are two ways to install a deb package on Ubuntu:</p><ol><li>Double-click the deb file. This opens the system's software center, which walks you through the installation.</li><li>As programmers we naturally prefer the more hardcore route: the command line. On Ubuntu you can install with the dpkg command. Here are some commonly used dpkg commands:<a id="more"></a><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">sudo dpkg -I iptux.deb <span class="comment"># show details of the iptux.deb package file, such as name, version and size (-I is equivalent to --info)</span></span><br><span class="line">sudo dpkg -c iptux.deb <span class="comment"># list the files contained in iptux.deb (-c is equivalent to --contents)</span></span><br><span class="line">sudo dpkg -i iptux.deb <span class="comment"># install iptux.deb (-i is equivalent to --install)</span></span><br><span class="line">sudo dpkg -l iptux <span class="comment"># show the status entry of the iptux package (get the package name from dpkg -I; -l is equivalent to --list)</span></span><br><span class="line">sudo dpkg -L iptux <span class="comment"># list all files installed by the iptux package (get the package name from dpkg -I; -L is equivalent to --listfiles)</span></span><br><span class="line">sudo dpkg -s iptux <span class="comment"># show the detailed status of the iptux package (get the package name from dpkg -I; -s is equivalent to --status)</span></span><br><span class="line">sudo dpkg -r iptux <span class="comment"># remove the iptux package (get the package name from dpkg -I; -r is equivalent to --remove)</span></span><br></pre></td></tr></table></figure></li></ol><p><strong>Note:</strong> dpkg does not resolve dependencies automatically. If the deb package you are installing has dependencies, either avoid this command or install the dependencies first, in dependency order.</p>]]></content>
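<summary type="html"><p>There are two ways to install a deb package on Ubuntu:</p>
<ol>
<li>Double-click the deb file. This opens the system's software center, which walks you through the installation.</li>
<li>As programmers we naturally prefer the more hardcore route: the command line. On Ubuntu you can install with the dpkg command. Here are some commonly used dpkg commands:</summary>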
<category term="Linux" scheme="https://andrewei1316.github.io/categories/Linux/"/>
<category term="软件" scheme="https://andrewei1316.github.io/categories/Linux/%E8%BD%AF%E4%BB%B6/"/>
<category term="Linux" scheme="https://andrewei1316.github.io/tags/Linux/"/>
</entry>
<entry>
<title>POJ1273(Drainage Ditches)</title>
<link href="https://andrewei1316.github.io/2016/04/15/poj1273/"/>
<id>https://andrewei1316.github.io/2016/04/15/poj1273/</id>
<published>2016-04-15T07:38:39.000Z</published>
<updated>2018-04-09T01:16:07.268Z</updated>
<content type="html"><![CDATA[<p>Problem link: <a href="http://poj.org/problem?id=1273">http://poj.org/problem?id=1273</a></p><h2 id="题目大意"><a href="#题目大意" class="headerlink" title="题目大意:"></a>Problem Summary:</h2><p>Whenever it rains, water collects in John's field and drowns his clover, so he has dug a number of drainage ditches. Each ditch has a valve at its start that limits the ditch's maximum flow. The input gives the number of ditches, the number of valves, and the maximum flow of each ditch. The water collects at valve 1 and drains out at the last valve. Find the maximum rate at which John can drain the water. </p><a id="more"></a><h2 id="思路"><a href="#思路" class="headerlink" title="思路:"></a>Approach:</h2><p>Stated abstractly, we have a graph with n vertices whose edge weights are capacities, and we want the maximum flow from source 1 to sink n.<br>This is a bare maximum-flow problem, useful for practicing algorithm templates.</p><h2 id="参考代码"><a href="#参考代码" class="headerlink" title="参考代码"></a>Reference Code</h2><h3 id="EK算法"><a href="#EK算法" class="headerlink" title="EK算法"></a>EK Algorithm</h3><h4 id="邻接矩阵"><a href="#邻接矩阵" class="headerlink" title="邻接矩阵"></a>Adjacency Matrix</h4><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span
class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><queue></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><cstdio></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><cstring></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><algorithm></span></span></span><br><span class="line"><span class="keyword">using</span> <span class="keyword">namespace</span> <span class="built_in">std</span>;</span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> MAXN = <span class="number">300</span>;</span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> MAX_INT = <span class="number">0x7fffffff</span>;</span><br><span class="line"></span><br><span class="line"><span class="keyword">int</span> n,
m;</span><br><span class="line"><span class="keyword">int</span> pre[MAXN];</span><br><span class="line"><span class="keyword">bool</span> vis[MAXN];</span><br><span class="line"><span class="keyword">int</span> mp[MAXN][MAXN];</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">bool</span> <span class="title">bfs</span><span class="params">(<span class="keyword">int</span> s, <span class="keyword">int</span> t)</span></span>{</span><br><span class="line"> <span class="built_in">queue</span> <<span class="keyword">int</span>> que;</span><br><span class="line"> <span class="built_in">memset</span>(vis, <span class="number">0</span>, <span class="keyword">sizeof</span>(vis));</span><br><span class="line"> <span class="built_in">memset</span>(pre, <span class="number">-1</span>, <span class="keyword">sizeof</span>(pre));</span><br><span class="line"> pre[s] = s;</span><br><span class="line"> vis[s] = <span class="literal">true</span>;</span><br><span class="line"> que.push(s);</span><br><span class="line"> <span class="keyword">while</span>(!que.empty()){</span><br><span class="line"> <span class="keyword">int</span> u = que.front();</span><br><span class="line"> que.pop();</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = <span class="number">1</span>; i <= n; i++){</span><br><span class="line"> <span class="keyword">if</span>(mp[u][i] && !vis[i]){</span><br><span class="line"> pre[i] = u;</span><br><span class="line"> vis[i] = <span class="literal">true</span>;</span><br><span class="line"> <span class="keyword">if</span>(i == t) <span class="keyword">return</span> <span class="literal">true</span>;</span><br><span class="line"> que.push(i);</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> <span class="literal">false</span>;</span><br><span 
class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">EK</span><span class="params">(<span class="keyword">int</span> s, <span class="keyword">int</span> t)</span></span>{</span><br><span class="line"> <span class="keyword">int</span> ans = <span class="number">0</span>;</span><br><span class="line"> <span class="keyword">while</span>(bfs(s, t)){</span><br><span class="line"> <span class="keyword">int</span> mi = MAX_INT;</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = t; i != s; i = pre[i]){</span><br><span class="line"> mi = min(mi, mp[pre[i]][i]);</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = t; i != s; i = pre[i]){</span><br><span class="line"> mp[pre[i]][i] -= mi;</span><br><span class="line"> mp[i][pre[i]] += mi;</span><br><span class="line"> }</span><br><span class="line"> ans += mi;</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> ans;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">main</span><span class="params">()</span></span>{</span><br><span class="line"> <span class="keyword">while</span>(<span class="built_in">scanf</span>(<span class="string">"%d%d"</span>, &m, &n) != EOF){</span><br><span class="line"> <span class="keyword">int</span> u, v, w;</span><br><span class="line"> <span class="built_in">memset</span>(mp, <span class="number">0</span>, <span class="keyword">sizeof</span>(mp));</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = <span class="number">0</span>; i < m; i++){</span><br><span class="line"> <span class="built_in">scanf</span>(<span class="string">"%d%d%d"</span>, &u, &v, 
&w);</span><br><span class="line"> mp[u][v] += w;</span><br><span class="line"> }</span><br><span class="line"> <span class="built_in">printf</span>(<span class="string">"%d\n"</span>, EK(<span class="number">1</span>, n));</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h4 id="邻接表"><a href="#邻接表" class="headerlink" title="邻接表"></a>Adjacency list</h4><figure class="highlight c++"><table><tr><td class="code"><pre><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><queue></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><cstdio></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><cstring></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span 
class="meta-string"><iostream></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><algorithm></span></span></span><br><span class="line"><span class="keyword">using</span> <span class="keyword">namespace</span> <span class="built_in">std</span>;</span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> MAXN = <span class="number">430</span>;</span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> MAX_INT = (<span class="number">1</span> << <span class="number">30</span>);</span><br><span class="line"></span><br><span class="line"><span class="class"><span class="keyword">struct</span> <span class="title">Edge</span>{</span></span><br><span class="line"> <span class="keyword">int</span> v, nxt, w;</span><br><span class="line">};</span><br><span class="line"></span><br><span class="line"><span class="class"><span class="keyword">struct</span> <span class="title">Node</span>{</span></span><br><span class="line"> <span class="keyword">int</span> v, id;</span><br><span class="line">};</span><br><span class="line"></span><br><span class="line"><span class="keyword">int</span> n, m, ecnt;</span><br><span class="line"><span class="keyword">bool</span> vis[MAXN];</span><br><span class="line"><span class="keyword">int</span> head[MAXN];</span><br><span class="line">Node pre[MAXN];</span><br><span class="line">Edge edge[MAXN];</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">init</span><span class="params">()</span></span>{</span><br><span class="line"> ecnt = <span class="number">0</span>;</span><br><span class="line"> <span class="built_in">memset</span>(edge, <span class="number">0</span>, <span class="keyword">sizeof</span>(edge));</span><br><span class="line"> <span class="built_in">memset</span>(head, <span 
class="number">-1</span>, <span class="keyword">sizeof</span>(head));</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">addEdge</span><span class="params">(<span class="keyword">int</span> u, <span class="keyword">int</span> v, <span class="keyword">int</span> w)</span></span>{</span><br><span class="line"> edge[ecnt].v = v;</span><br><span class="line"> edge[ecnt].w = w;</span><br><span class="line"> edge[ecnt].nxt = head[u];</span><br><span class="line"> head[u] = ecnt++;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">bool</span> <span class="title">bfs</span><span class="params">(<span class="keyword">int</span> s, <span class="keyword">int</span> t)</span></span>{</span><br><span class="line"> <span class="built_in">queue</span> <<span class="keyword">int</span>> que;</span><br><span class="line"> <span class="built_in">memset</span>(vis, <span class="number">0</span>, <span class="keyword">sizeof</span>(vis));</span><br><span class="line"> <span class="built_in">memset</span>(pre, <span class="number">-1</span>, <span class="keyword">sizeof</span>(pre));</span><br><span class="line"> pre[s].v = s;</span><br><span class="line"> vis[s] = <span class="literal">true</span>;</span><br><span class="line"> que.push(s);</span><br><span class="line"> <span class="keyword">while</span>(!que.empty()){</span><br><span class="line"> <span class="keyword">int</span> u = que.front();</span><br><span class="line"> que.pop();</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = head[u]; i + <span class="number">1</span>; i = edge[i].nxt){</span><br><span class="line"> <span class="keyword">int</span> v = edge[i].v;</span><br><span class="line"> <span class="keyword">if</span>(!vis[v] && edge[i].w){</span><br><span class="line"> 
pre[v].v = u;</span><br><span class="line"> pre[v].id = i;</span><br><span class="line"> vis[v] = <span class="literal">true</span>;</span><br><span class="line"> <span class="keyword">if</span>(v == t) <span class="keyword">return</span> <span class="literal">true</span>;</span><br><span class="line"> que.push(v);</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> <span class="literal">false</span>;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">EK</span><span class="params">(<span class="keyword">int</span> s, <span class="keyword">int</span> t)</span></span>{</span><br><span class="line"> <span class="keyword">int</span> ans = <span class="number">0</span>;</span><br><span class="line"> <span class="keyword">while</span>(bfs(s, t)){</span><br><span class="line"> <span class="keyword">int</span> mi = MAX_INT;</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = t; i != s; i = pre[i].v){</span><br><span class="line"> mi = min(mi, edge[pre[i].id].w);</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = t; i != s; i = pre[i].v){</span><br><span class="line"> edge[pre[i].id].w -= mi;</span><br><span class="line"> edge[pre[i].id ^ <span class="number">1</span>].w += mi;</span><br><span class="line"> }</span><br><span class="line"> ans += mi;</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> ans;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">main</span><span class="params">()</span></span>{</span><br><span class="line"> <span 
class="keyword">while</span>(<span class="built_in">scanf</span>(<span class="string">"%d%d"</span>, &m, &n) != EOF){</span><br><span class="line"> init();</span><br><span class="line"> <span class="keyword">int</span> u, v, w;</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = <span class="number">0</span>; i < m; i++){</span><br><span class="line"> <span class="built_in">scanf</span>(<span class="string">"%d%d%d"</span>, &u, &v, &w);</span><br><span class="line"> addEdge(u, v, w);</span><br><span class="line"> addEdge(v, u, <span class="number">0</span>);</span><br><span class="line"> }</span><br><span class="line"> <span class="built_in">printf</span>(<span class="string">"%d\n"</span>, EK(<span class="number">1</span>, n));</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">}</span><br></pre></td></tr></table></figure>]]></content>
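<summary type="html"><p>Problem link: <a href="http://poj.org/problem?id=1273">http://poj.org/problem?id=1273</a></p>
<h2 id="题目大意"><a href="#题目大意" class="headerlink" title="题目大意:"></a>Problem summary:</h2><p>Whenever it rains, water collects in John's field and drowns the clover he planted, so he built a number of drainage ditches. Each ditch has a valve at its start that limits the ditch's maximum flow rate. The input gives the number of ditches, the number of valves, and the maximum flow rate of each ditch. The water collects at valve 1 and drains out at the last valve. Find the maximum rate at which John can drain the water.</p></summary>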
<category term="题解" scheme="https://andrewei1316.github.io/categories/%E9%A2%98%E8%A7%A3/"/>
<category term="poj" scheme="https://andrewei1316.github.io/categories/%E9%A2%98%E8%A7%A3/poj/"/>
<category term="题解" scheme="https://andrewei1316.github.io/tags/%E9%A2%98%E8%A7%A3/"/>
<category term="poj" scheme="https://andrewei1316.github.io/tags/poj/"/>
<category term="网络流" scheme="https://andrewei1316.github.io/tags/%E7%BD%91%E7%BB%9C%E6%B5%81/"/>
</entry>
<entry>
<title>Introduction to Network Flows</title>
<link href="https://andrewei1316.github.io/2016/04/11/network-flows/"/>
<id>https://andrewei1316.github.io/2016/04/11/network-flows/</id>
<published>2016-04-11T08:51:36.000Z</published>
<updated>2018-04-09T01:16:07.267Z</updated>
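<content type="html"><![CDATA[<h2 id="基本概念-从书上摘抄-可以直接跳过不看"><a href="#基本概念-从书上摘抄-可以直接跳过不看" class="headerlink" title="基本概念(从书上摘抄,可以直接跳过不看)"></a>Basic concepts (excerpted from a textbook; feel free to skip)</h2><h3 id="容量网络和网络最大流"><a href="#容量网络和网络最大流" class="headerlink" title="容量网络和网络最大流"></a>Capacity networks and maximum flow</h3><p><strong>Capacity network:</strong> Let <code>G(V, E)</code> be a directed network in which one vertex of V is designated the source (denoted Vs) and another the sink (denoted Vt). Each arc <code><u, v>∈E</code> carries a weight c(u, v)>0, called the <code>capacity of the arc</code>. Such a directed network G is called a capacity network.</p><blockquote><p>In other words: a graph that has a source and a sink and whose arcs can carry flow.</p></blockquote><p><strong>Arc flow:</strong> the actual flow through each arc <code><u, v></code> of the capacity network G (flow for short), denoted <code>f(u, v)</code>.<br><strong>Network flow:</strong> the set of flows on all arcs, <code>f = { f(u, v) }</code>, is called a network flow of the capacity network G.<br><strong>Feasible flow:</strong> in a capacity network <code>G(V, E)</code>, a network flow f that satisfies the following conditions is called a feasible flow:</p><ul><li><strong>Capacity constraint:</strong> $0 ≤ f(u, v) ≤ c(u, v)$</li><li><strong>Conservation condition:</strong> for every vertex other than Vs and Vt, the total inflow equals the total outflow. For the source, <code>total outflow of Vs - total inflow of Vs = f</code>; for the sink, <code>total inflow of Vt - total outflow of Vt = f</code>. The quantity <code>f</code> is called the value of the feasible flow.</li></ul><blockquote><p>In other words: flow travels from Vs to Vt so that the source satisfies $f_o - f_i = f$, the sink satisfies $f_i - f_o = f$, every other vertex satisfies $f_i = f_o$, and the current flow on every edge is at most its capacity (here $f_i$ is the flow into a vertex and $f_o$ the flow out of it).</p></blockquote><a id="more"></a><p><strong>Pseudoflow:</strong> a network flow that satisfies the capacity constraint but not the conservation condition is called a pseudoflow, or a capacity-feasible flow.<br><strong>Maximum flow:</strong> in a capacity network <code>G(V, E)</code>, a feasible flow (one satisfying both the capacity constraint and the conservation condition) with the largest possible value is called a maximum network flow, or maximum flow for short.</p><h3 id="链与增广路"><a href="#链与增广路" class="headerlink" title="链与增广路"></a>Chains and augmenting paths</h3><p>Given a feasible flow <code>f = { f(u, v) }</code> in a capacity network <code>G(V, E)</code>, arcs can be classified into four types according to how much flow they carry and how that flow compares with their capacity:</p><ul><li>saturated arc: $f(u, v) = c(u, v)$;</li><li>unsaturated arc: $f(u, v) < c(u, v)$;</li><li>zero-flow arc: $f(u, v) = 0$;</li><li>nonzero-flow arc: $f(u, v) > 0$.</li></ul><p><strong>Chain:</strong> in a capacity network, a vertex sequence $(u, u_1, u_2, \ldots, u_n, v)$ is called a chain if every two adjacent vertices are joined by an arc in either direction, e.g. <code><u, u1></code> or <code><u1, u></code> is an arc of the network. Along a chain from Vs to Vt, the arcs fall into two classes:</p><ul><li><strong>Forward arc:</strong> an arc whose direction agrees with the direction of the chain; the set of such arcs is denoted <code>P+</code>;</li><li><strong>Backward arc:</strong> an arc whose direction is opposite to the direction of the chain; the set of such arcs is denoted <code>P-</code>.</li></ul><p><strong>Augmenting path:</strong> let f be a feasible flow in a capacity network G and P a chain from Vs to Vt. If P satisfies both of the following conditions:</p><ul><li>on every forward arc <code><u, v></code> of P, $0 ≤ f(u, v) < c(u, v)$, i.e. every arc in <code>P+</code> is unsaturated;</li><li>on every backward arc <code><u, v></code> of P, $0 < f(u, v) ≤ c(u, v)$, i.e. every arc in <code>P-</code> carries nonzero flow,</li></ul><p>then P is called an augmenting path with respect to the feasible flow f, or simply an <code>augmenting path</code> (also called an augmenting chain or improvable path). Improving a feasible flow along an augmenting path is called <code>augmentation</code>.</p><h3 id="残留容量与残留网络"><a href="#残留容量与残留网络" class="headerlink" title="残留容量与残留网络"></a>Residual capacity and residual networks</h3><p><strong>Residual capacity:</strong> given a capacity network <code>G(V, E)</code> and a feasible flow f, the residual capacity of arc <code><u, v></code> is $c'(u, v) = c(u, v) - f(u, v)$; it is the amount by which the flow on that arc can still be increased. Because decreasing the flow from u to v is equivalent to increasing the flow from v to u, each arc <code><u, v></code> also has a residual capacity in the reverse direction, $c'(v, u) = f(u, v)$.</p><blockquote><p>The flow that can still be pushed through a capacity network is called its residual capacity.</p></blockquote><p><strong>Residual network:</strong> given a capacity network <code>G(V, E)</code> and a network flow f on it, the residual network of G with respect to f (residual network for short) is denoted <code>G'(V', E')</code>. Its vertex set V' is the same as the vertex set V of G, i.e. V' = V. For every arc <code><u, v></code> of G, if $f(u, v) < c(u, v)$ then G' contains an arc <code><u, v>∈E'</code> with capacity $c'(u, v) = c(u, v) - f(u, v)$, and if $f(u, v) > 0$ then G' contains an arc <code><v, u>∈E'</code> with capacity $c'(v, u) = f(u, v)$. Chinese texts call the residual network either <code>残留网络</code> or <code>剩余网络</code>.</p><blockquote><p>The network formed by the residual capacities together with the source and the sink.</p></blockquote><h3 id="割与最小割"><a href="#割与最小割" class="headerlink" title="割与最小割"></a>Cuts and minimum cuts</h3><p><strong>Cut:</strong> in a capacity network <code>G(V, E)</code>, let <code>E'⊆E</code>. If the underlying undirected graph of G is no longer connected after E' is removed, E' is called a cut of G. A cut partitions the vertex set V of G into two subsets S and T = V - S, and is written (S, T).<br><strong>s-t cut:</strong> if, in addition, the two vertex subsets satisfy <code>Vs ∈ S</code> and <code>Vt ∈ T</code>, the cut is called an <code>s-t cut</code>. In an s-t cut (S, T), an arc <code><u, v>(u∈S, v∈T)</code> is called a forward arc of the cut, and an arc <code><u, v>( u∈T, v∈S)</code> is called a backward arc of the cut.<br><strong>Capacity of a cut:</strong> let <code>(S, T)</code> be a cut of the capacity network <code>G(V, E)</code>; its capacity is defined as the sum of the capacities of all its forward arcs and is denoted <code>c(S, T)</code>.<br><strong>Minimum cut:</strong> a minimum cut of the capacity network <code>G(V, E)</code> is a cut of smallest capacity.</p><h3 id="相关定理"><a href="#相关定理" class="headerlink" title="相关定理"></a>Related theorems</h3><h4 id="残留网络与原网络的关系"><a href="#残留网络与原网络的关系" class="headerlink" title="残留网络与原网络的关系"></a>Relation between the residual network and the original network</h4><p>Let f be a feasible flow of the capacity network G(V, E) and f' a feasible flow of the residual network G'. Then f + f' is again a feasible flow of G. (f + f' means adding the flows on corresponding arcs, with flow on a reverse arc <v, u> of G' counted as a reduction of f(u, v).)</p><h4 id="网络流流量与割的净流量之间的关系"><a 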
href="#网络流流量与割的净流量之间的关系" class="headerlink" title="网络流流量与割的净流量之间的关系"></a>Flow value and the net flow across a cut</h4><p>In a capacity network G(V, E), let f be any flow and (S, T) any s-t cut. Then $f(S, T) = | f |$, i.e. the value of a network flow equals the net flow across any s-t cut.</p><h4 id="网络流流量与割的容量之间的关系"><a href="#网络流流量与割的容量之间的关系" class="headerlink" title="网络流流量与割的容量之间的关系"></a>Flow value and the capacity of a cut</h4><p>In a capacity network G(V, E), let f be any flow and (S, T) any s-t cut. Then $f(S, T) ≤ c(S, T)$, i.e. the value of a network flow is at most the capacity of any s-t cut.</p><h4 id="最大流最小割定理"><a href="#最大流最小割定理" class="headerlink" title="最大流最小割定理"></a>Max-flow min-cut theorem</h4><p>In a capacity network G(V, E), the value of a maximum flow equals the capacity of a minimum cut.</p><h4 id="增广路定理"><a href="#增广路定理" class="headerlink" title="增广路定理"></a>Augmenting path theorem</h4><p>Let f be a feasible flow of the capacity network G(V, E). Then f is a maximum flow if and only if the network contains no augmenting path with respect to f.</p><h4 id="几个等价命题"><a href="#几个等价命题" class="headerlink" title="几个等价命题"></a>Several equivalent statements</h4><p>Let f be a feasible flow of the capacity network G(V, E). The following statements are equivalent:</p><ol><li>f is a maximum flow of the capacity network G;</li><li>| f | equals the capacity of a minimum cut of the capacity network;</li><li>the capacity network contains no augmenting path;</li><li>the residual network G' contains no path from the source to the sink.</li></ol><h2 id="最大流"><a href="#最大流" class="headerlink" title="最大流"></a>Maximum flow</h2><p>Maximum-flow algorithms follow one of two approaches: the <code>augmenting-path</code> approach and the <code>preflow-push</code> approach. Both are introduced below.</p><h3 id="增广路算法-Ford-Fulkerson"><a href="#增广路算法-Ford-Fulkerson" class="headerlink" title="增广路算法(Ford-Fulkerson)"></a>Augmenting-path algorithm (Ford-Fulkerson)</h3><h4 id="基本思想"><a href="#基本思想" class="headerlink" title="基本思想"></a>Basic idea</h4><p>By the augmenting path theorem, a maximum flow can be obtained by starting from any feasible flow and augmenting it along augmenting paths until none remains. Such an algorithm is called an augmenting-path algorithm. The key questions are how to find augmenting paths efficiently and how to guarantee that the algorithm terminates after finitely many augmentations.<br>The basic procedure is:</p><ul><li>(1) Take a feasible flow f as the initial flow (if none is given, use the zero flow f = { 0 });</li><li>(2) Look for an augmenting path P with respect to f; if one is found, improve f into a larger flow along P and create the corresponding <strong>reverse arcs</strong>;</li><li>(3) Repeat step (2) until f has no augmenting path.</li></ul><p>Illustration:<br><img src="/images/network-flows/FFalgo1.png" alt="Ford-Fulkerson算法过程"><br><img src="/images/network-flows/FFalgo2.png" alt="Ford-Fulkerson算法过程"></p><p>The two key operations of the augmenting-path algorithm are <code>finding an augmenting path</code> and <code>improving the network flow</code>.<br><strong>Question: why create reverse arcs?</strong><br><strong>Answer: they give the program a chance to undo an earlier choice.</strong> The figure below shows what this means:<br>If the program first finds the augmenting path 1 -> 2 -> 4 -> 6, it obtains a flow of value 2 and, without reverse arcs, cannot augment any further.<br>If reverse arcs are created whenever the flow is updated, it can still find the augmenting path 1 -> 3 -> 4 -> 2 -> 5 -> 6 with value 1, which yields the maximum flow of 3.<br><img src="/images/network-flows/FFalgo7.jpg" alt="Ford-Fulkerson算法过程"></p><h4 id="一般增广路算法-EdmondsKarp"><a href="#一般增广路算法-EdmondsKarp" class="headerlink" title="一般增广路算法(EdmondsKarp)"></a>The general augmenting-path algorithm (EdmondsKarp)</h4><h5 id="算法流程"><a href="#算法流程" class="headerlink" title="算法流程"></a>Procedure</h5><p>The general augmenting-path algorithm is implemented essentially as described above: each iteration finds one augmenting path and then updates the flow along it. The figure above reveals a problem, however: an augmenting path may wind back and forth and become very long, so we often take a detour (we could have moved steadily closer to the sink, yet some arcs in the middle lead away from it), which makes each update more expensive. To avoid this, the EK algorithm searches for augmenting paths with bfs, so every augmenting path it finds is a shortest one, i.e. it uses the fewest arcs from the source to the sink.</p><h5 id="算法实现"><a href="#算法实现" class="headerlink" title="算法实现"></a>Implementation</h5><h6 id="邻接矩阵"><a href="#邻接矩阵" class="headerlink" title="邻接矩阵"></a>Adjacency matrix</h6><figure class="highlight c++"><table><tr><td class="code"><pre><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><queue></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><cstdio></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><cstring></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><algorithm></span></span></span><br><span class="line"><span class="keyword">using</span> <span class="keyword">namespace</span> <span class="built_in">std</span>;</span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> MAXN = <span class="number">300</span>;</span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> MAX_INT = <span class="number">0x7fffffff</span>; <span class="comment">// 2^31 - 1, written as a literal because shifting 1 left by 31 overflows a signed int</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">int</span> n; <span class="comment">// number of vertices in the graph</span></span><br><span class="line"><span class="keyword">int</span> pre[MAXN]; <span class="comment">// pre[i] is the predecessor of vertex i on the augmenting path from s to t found by bfs</span></span><br><span class="line"><span class="keyword">bool</span> vis[MAXN]; <span class="comment">// whether a vertex has been visited by the current bfs</span></span><br><span class="line"><span class="keyword">int</span> mp[MAXN][MAXN]; <span class="comment">// graph data: mp[u][v] is the residual capacity of the arc from u to v</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">bool</span> <span class="title">bfs</span><span class="params">(<span class="keyword">int</span> s, <span class="keyword">int</span> t)</span></span>{</span><br><span class="line"> <span class="built_in">queue</span> <<span class="keyword">int</span>> que;</span><br><span class="line"> <span class="built_in">memset</span>(vis, <span class="number">0</span>, <span class="keyword">sizeof</span>(vis));</span><br><span class="line"> <span class="built_in">memset</span>(pre, <span class="number">-1</span>, <span class="keyword">sizeof</span>(pre));</span><br><span class="line"> pre[s] = s;</span><br><span class="line"> vis[s] = <span class="literal">true</span>;</span><br><span class="line"> que.push(s);</span><br><span class="line"> <span class="keyword">while</span>(!que.empty()){</span><br><span class="line"> <span class="keyword">int</span> u = que.front();</span><br><span class="line"> que.pop();</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = <span class="number">1</span>; i <= n; i++){</span><br><span class="line"> <span class="keyword">if</span>(mp[u][i] && !vis[i]){</span><br><span class="line"> pre[i] = u;</span><br><span class="line"> vis[i] = <span class="literal">true</span>;</span><br><span class="line"> <span class="keyword">if</span>(i == t) <span class="keyword">return</span> <span class="literal">true</span>;</span><br><span class="line"> que.push(i);</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> <span class="literal">false</span>;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span 
class="keyword">int</span> <span class="title">EK</span><span class="params">(<span class="keyword">int</span> s, <span class="keyword">int</span> t)</span></span>{</span><br><span class="line"> <span class="keyword">int</span> ans = <span class="number">0</span>;</span><br><span class="line"> <span class="keyword">while</span>(bfs(s, t)){ <span class="comment">// repeat while an augmenting path exists</span></span><br><span class="line"> <span class="keyword">int</span> mi = MAX_INT; <span class="comment">// bottleneck: smallest residual capacity on the path</span></span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = t; i != s; i = pre[i]){</span><br><span class="line"> mi = min(mi, mp[pre[i]][i]);</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = t; i != s; i = pre[i]){</span><br><span class="line"> mp[pre[i]][i] -= mi; <span class="comment">// use up capacity on the forward arc</span></span><br><span class="line"> mp[i][pre[i]] += mi; <span class="comment">// give the same amount back to the reverse arc</span></span><br><span class="line"> }</span><br><span class="line"> ans += mi;</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> ans;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h6 id="邻接表"><a href="#邻接表" class="headerlink" title="邻接表"></a>Adjacency list</h6><figure class="highlight c++"><table><tr><td class="code"><pre><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><queue></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><cstring></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><algorithm></span></span></span><br><span class="line"><span class="keyword">using</span> <span class="keyword">namespace</span> <span class="built_in">std</span>;</span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> MAXN = <span class="number">430</span>;</span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> MAX_INT = (<span class="number">1</span> << <span 
class="number">30</span>);</span><br><span class="line"></span><br><span class="line"><span class="class"><span class="keyword">struct</span> <span class="title">Edge</span>{</span></span><br><span class="line"> <span class="keyword">int</span> v, nxt, w;</span><br><span class="line">};</span><br><span class="line"></span><br><span class="line"><span class="class"><span class="keyword">struct</span> <span class="title">Node</span>{</span></span><br><span class="line"> <span class="keyword">int</span> v, id;</span><br><span class="line">};</span><br><span class="line"></span><br><span class="line"><span class="keyword">int</span> n, m, ecnt;</span><br><span class="line"><span class="keyword">bool</span> vis[MAXN];</span><br><span class="line"><span class="keyword">int</span> head[MAXN];</span><br><span class="line">Node pre[MAXN];</span><br><span class="line">Edge edge[MAXN];</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">init</span><span class="params">()</span></span>{</span><br><span class="line"> ecnt = <span class="number">0</span>;</span><br><span class="line"> <span class="built_in">memset</span>(edge, <span class="number">0</span>, <span class="keyword">sizeof</span>(edge));</span><br><span class="line"> <span class="built_in">memset</span>(head, <span class="number">-1</span>, <span class="keyword">sizeof</span>(head));</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">addEdge</span><span class="params">(<span class="keyword">int</span> u, <span class="keyword">int</span> v, <span class="keyword">int</span> w)</span></span>{</span><br><span class="line"> edge[ecnt].v = v;</span><br><span class="line"> edge[ecnt].w = w;</span><br><span class="line"> edge[ecnt].nxt = head[u];</span><br><span class="line"> head[u] = ecnt++;</span><br><span 
class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">bool</span> <span class="title">bfs</span><span class="params">(<span class="keyword">int</span> s, <span class="keyword">int</span> t)</span></span>{</span><br><span class="line"> <span class="built_in">queue</span> <<span class="keyword">int</span>> que;</span><br><span class="line"> <span class="built_in">memset</span>(vis, <span class="number">0</span>, <span class="keyword">sizeof</span>(vis));</span><br><span class="line"> <span class="built_in">memset</span>(pre, <span class="number">-1</span>, <span class="keyword">sizeof</span>(pre));</span><br><span class="line"> pre[s].v = s;</span><br><span class="line"> vis[s] = <span class="literal">true</span>;</span><br><span class="line"> que.push(s);</span><br><span class="line"> <span class="keyword">while</span>(!que.empty()){</span><br><span class="line"> <span class="keyword">int</span> u = que.front();</span><br><span class="line"> que.pop();</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = head[u]; i + <span class="number">1</span>; i = edge[i].nxt){</span><br><span class="line"> <span class="keyword">int</span> v = edge[i].v;</span><br><span class="line"> <span class="keyword">if</span>(!vis[v] && edge[i].w){</span><br><span class="line"> pre[v].v = u;</span><br><span class="line"> pre[v].id = i;</span><br><span class="line"> vis[v] = <span class="literal">true</span>;</span><br><span class="line"> <span class="keyword">if</span>(v == t) <span class="keyword">return</span> <span class="literal">true</span>;</span><br><span class="line"> que.push(v);</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> <span class="literal">false</span>;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span 
class="function"><span class="keyword">int</span> <span class="title">EK</span><span class="params">(<span class="keyword">int</span> s, <span class="keyword">int</span> t)</span></span>{</span><br><span class="line"> <span class="keyword">int</span> ans = <span class="number">0</span>;</span><br><span class="line"> <span class="keyword">while</span>(bfs(s, t)){</span><br><span class="line"> <span class="keyword">int</span> mi = MAX_INT;</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = t; i != s; i = pre[i].v){</span><br><span class="line"> mi = min(mi, edge[pre[i].id].w);</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = t; i != s; i = pre[i].v){</span><br><span class="line"> edge[pre[i].id].w -= mi;</span><br><span class="line"> edge[pre[i].id ^ <span class="number">1</span>].w += mi;</span><br><span class="line"> }</span><br><span class="line"> ans += mi;</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> ans;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// add an edge together with its reverse edge</span></span><br><span class="line">addEdge(u, v, w);</span><br><span class="line">addEdge(v, u, <span class="number">0</span>);</span><br><span class="line"><span class="comment">// call</span></span><br><span class="line"><span class="keyword">int</span> ans = EK(s, t);</span><br></pre></td></tr></table></figure><h5 id="算法复杂度"><a href="#算法复杂度" class="headerlink" title="算法复杂度"></a>Algorithm complexity</h5><p>Each augmentation costs one BFS plus one update of the residual network, roughly O(m), where m is the number of edges in the graph. How many augmentations are needed? If every augmentation raised the flow by only 1, up to nW augmentations could be required (n is the number of vertices and W the largest edge capacity), for a total of O(nmW). Because EK always augments along a shortest path, the number of augmentations is also bounded by O(nm), so the overall complexity is O(nm^2).</p><h4 id="Dinic-算法"><a href="#Dinic-算法" class="headerlink" title="Dinic 算法"></a>Dinic algorithm</h4><h5 id="算法思想"><a href="#算法思想" class="headerlink" title="算法思想"></a>Idea</h5><p>Like EK, Dinic augments along shortest augmenting paths. The difference is that Dinic does not find just one augmenting path per BFS: it first runs one BFS to 
label every vertex, which builds a level graph, and then searches for augmenting paths inside that level graph and updates the flow along them.</p><h5 id="算法流程-1"><a href="#算法流程-1" class="headerlink" title="算法流程"></a>Procedure</h5><blockquote><ol><li>Use BFS to split the graph into levels, i.e. label every vertex with its shortest distance from the source (counting every edge as length 1). Note: only edges whose residual capacity is greater than 0 may be used when building the level graph.</li><li>Use DFS to find an augmenting path from the source to the sink. Note: the search must follow the level graph, i.e. edge (u, v) may be put on the path only if $dis[u] = dis[v] - 1$, where $dis[i]$ is the level label of vertex $i$. Once a path is found, take the minimum residual capacity $l$ over its edges and update the residual capacity of every edge on the path (forward arc - l, reverse arc + l).</li><li>Repeat step 2. When no augmenting path is left, go back to step 1 and rebuild the level graph, until the sink can no longer be reached from the source.</li></ol></blockquote><p>The procedure is illustrated in the figure below:<br><img src="/images/network-flows/FFalgo8.jpg" alt="DINIC算法过程"></p><h5 id="算法实现-1"><a href="#算法实现-1" class="headerlink" title="算法实现"></a>Implementation</h5><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span 
class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><queue></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><cstdio></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><cstring></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><iostream></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><algorithm></span></span></span><br><span class="line"><span class="keyword">using</span> <span class="keyword">namespace</span> <span class="built_in">std</span>;</span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> MAXN = <span class="number">510</span>;</span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> MAXN_INT = (<span class="number">1</span> << <span class="number">29</span>);</span><br><span 
class="line"></span><br><span class="line"><span class="keyword">int</span> n, m;</span><br><span class="line"><span class="keyword">int</span> dis[MAXN];</span><br><span class="line"><span class="keyword">int</span> mp[MAXN][MAXN];</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">bfs</span><span class="params">(<span class="keyword">int</span> s)</span></span>{</span><br><span class="line"> <span class="built_in">memset</span>(dis, <span class="number">0xff</span>, <span class="keyword">sizeof</span>(dis));</span><br><span class="line"> dis[s] = <span class="number">0</span>;</span><br><span class="line"> <span class="built_in">queue</span> <<span class="keyword">int</span>> que;</span><br><span class="line"> que.push(s);</span><br><span class="line"> <span class="keyword">while</span>(!que.empty()){</span><br><span class="line"> <span class="keyword">int</span> top = que.front();</span><br><span class="line"> que.pop();</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = <span class="number">1</span>; i <= n; i++){</span><br><span class="line"> <span class="keyword">if</span>(dis[i] < <span class="number">0</span> && mp[top][i] > <span class="number">0</span>){</span><br><span class="line"> dis[i] = dis[top] + <span class="number">1</span>;</span><br><span class="line"> que.push(i);</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">if</span>(dis[n] > <span class="number">0</span>) <span class="keyword">return</span> <span class="literal">true</span>;</span><br><span class="line"> <span class="keyword">return</span> <span class="literal">false</span>;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">Find</span><span 
class="params">(<span class="keyword">int</span> x, <span class="keyword">int</span> low)</span></span>{</span><br><span class="line"> <span class="keyword">int</span> a = <span class="number">0</span>;</span><br><span class="line"> <span class="keyword">if</span>(x == n) <span class="keyword">return</span> low;</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = <span class="number">1</span>; i <= n; i++){</span><br><span class="line"> <span class="keyword">if</span>(mp[x][i] > <span class="number">0</span> </span><br><span class="line"> && dis[i] == dis[x] + <span class="number">1</span></span><br><span class="line"> && (a = Find(i, min(low, mp[x][i])))){</span><br><span class="line"> mp[x][i] -= a;</span><br><span class="line"> mp[i][x] += a;</span><br><span class="line"> <span class="keyword">return</span> a;</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">main</span><span class="params">()</span></span>{</span><br><span class="line"> <span class="keyword">while</span>(<span class="built_in">scanf</span>(<span class="string">"%d%d"</span>, &n, &m) != EOF){</span><br><span class="line"> <span class="built_in">memset</span>(mp, <span class="number">0</span>, <span class="keyword">sizeof</span>(mp));</span><br><span class="line"> <span class="keyword">int</span> u, v, w;</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = <span class="number">0</span>; i < m; i++){</span><br><span class="line"> <span class="built_in">scanf</span>(<span class="string">"%d%d%d"</span>, &u, &v, &w);</span><br><span class="line"> mp[u][v] += w;</span><br><span class="line"> }</span><br><span class="line"> <span 
class="keyword">int</span> ans = <span class="number">0</span>, tmp;</span><br><span class="line"> <span class="keyword">while</span>(bfs(<span class="number">1</span>)){</span><br><span class="line"> <span class="keyword">while</span>(tmp = Find(<span class="number">1</span>, MAXN_INT))</span><br><span class="line"> ans += tmp;</span><br><span class="line"> }</span><br><span class="line"> <span class="built_in">printf</span>(<span class="string">"%d\n"</span>, ans);</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p><strong>Current-arc optimization and multi-path augmentation:</strong></p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span 
class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><queue></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><cstdio></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><cstring></span></span></span><br><span 
class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><iostream></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><algorithm></span></span></span><br><span class="line"><span class="keyword">using</span> <span class="keyword">namespace</span> <span class="built_in">std</span>;</span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> MAXN = <span class="number">101000</span>;</span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> MAXN_INT = (<span class="number">1</span> << <span class="number">29</span>);</span><br><span class="line"></span><br><span class="line"><span class="class"><span class="keyword">struct</span> <span class="title">Edge</span>{</span></span><br><span class="line"> <span class="keyword">int</span> v, w, nxt;</span><br><span class="line">};</span><br><span class="line"></span><br><span class="line"><span class="keyword">int</span> s, t;</span><br><span class="line"><span class="keyword">int</span> n, m, ecnt;</span><br><span class="line">Edge edge[MAXN * <span class="number">2</span>];</span><br><span class="line"><span class="keyword">int</span> head[MAXN], dis[MAXN], curEdge[MAXN];</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">init</span><span class="params">()</span></span>{</span><br><span class="line"> ecnt = <span class="number">0</span>;</span><br><span class="line"> <span class="built_in">memset</span>(dis, <span class="number">-1</span>, <span class="keyword">sizeof</span>(dis));</span><br><span class="line"> <span class="built_in">memset</span>(edge, <span class="number">0</span>, <span class="keyword">sizeof</span>(edge));</span><br><span class="line"> <span class="built_in">memset</span>(head, <span class="number">-1</span>, <span 
class="keyword">sizeof</span>(head));</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">addEdge</span><span class="params">(<span class="keyword">int</span> u, <span class="keyword">int</span> v, <span class="keyword">int</span> w)</span></span>{</span><br><span class="line"> edge[ecnt].v = v;</span><br><span class="line"> edge[ecnt].w = w;</span><br><span class="line"> edge[ecnt].nxt = head[u];</span><br><span class="line"> head[u] = ecnt++;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">bfs</span><span class="params">()</span></span>{</span><br><span class="line"> <span class="built_in">memset</span>(dis, <span class="number">-1</span>, <span class="keyword">sizeof</span>(dis)); dis[t] = <span class="number">0</span>; <span class="comment">// reset the labels before every BFS, otherwise stale labels make the outer loop run forever</span></span><br><span class="line"> <span class="built_in">queue</span> <<span class="keyword">int</span>> que;</span><br><span class="line"> que.push(t);</span><br><span class="line"> <span class="keyword">while</span>(!que.empty()){</span><br><span class="line"> <span class="keyword">int</span> u = que.front();</span><br><span class="line"> que.pop();</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = head[u]; i + <span class="number">1</span>; i = edge[i].nxt){</span><br><span class="line"> <span class="keyword">if</span>(dis[edge[i].v] == <span class="number">-1</span> && edge[i ^ <span class="number">1</span>].w > <span class="number">0</span>){</span><br><span class="line"> dis[edge[i].v] = dis[u] + <span class="number">1</span>;</span><br><span class="line"> que.push(edge[i].v);</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> dis[s] != <span class="number">-1</span>;</span><br><span class="line">}</span><br><span class="line"></span><br><span 
class="function"><span class="keyword">int</span> <span class="title">dfs</span><span class="params">(<span class="keyword">int</span> u, <span class="keyword">int</span> v, <span class="keyword">int</span> flow)</span></span>{</span><br><span class="line"> <span class="keyword">if</span>(u == t) <span class="keyword">return</span> flow;</span><br><span class="line"> <span class="keyword">int</span> delta = flow;</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> &i = curEdge[u]; i + <span class="number">1</span>; i = edge[i].nxt){</span><br><span class="line"> <span class="keyword">if</span>(dis[u] == dis[edge[i].v] + <span class="number">1</span> && edge[i].w){</span><br><span class="line"> <span class="keyword">int</span> d = dfs(edge[i].v, v, min(delta, edge[i].w));</span><br><span class="line"> edge[i].w -= d, edge[i ^ <span class="number">1</span>].w += d;</span><br><span class="line"> delta -= d;</span><br><span class="line"> <span class="keyword">if</span>(delta == <span class="number">0</span>) <span class="keyword">break</span>;</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> flow - delta;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">dinic</span><span class="params">()</span></span>{</span><br><span class="line"> <span class="keyword">int</span> ans = <span class="number">0</span>;</span><br><span class="line"> <span class="keyword">while</span>(bfs()){</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = <span class="number">0</span>; i < n; i++)</span><br><span class="line"> curEdge[i] = head[i];</span><br><span class="line"> ans += dfs(s, t, MAXN_INT);</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> 
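ans;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">main</span><span class="params">()</span></span>{</span><br><span class="line"> <span class="keyword">while</span>(<span class="built_in">scanf</span>(<span class="string">"%d%d"</span>, &n, &m) != EOF){</span><br><span class="line"> init(); s = <span class="number">1</span>, t = n; <span class="comment">// assumption: vertex 1 is the source and vertex n the sink, as in the matrix version above</span></span><br><span class="line"> <span class="keyword">int</span> u, v, w;</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = <span class="number">0</span>; i < m; i++){</span><br><span class="line"> <span class="built_in">scanf</span>(<span class="string">"%d%d%d"</span>, &u, &v, &w);</span><br><span class="line"> addEdge(u, v, w);</span><br><span class="line"> addEdge(v, u, <span class="number">0</span>);</span><br><span class="line"> }</span><br><span class="line"> <span class="built_in">printf</span>(<span class="string">"%d\n"</span>, dinic());</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h5 id="时间复杂度"><a href="#时间复杂度" class="headerlink" title="时间复杂度"></a>Time complexity</h5><p>$O(V^2E)$</p><h4 id="最短增广路算法-SAP"><a href="#最短增广路算法-SAP" class="headerlink" title="最短增广路算法(SAP)"></a>Shortest augmenting path algorithm (SAP)</h4><h5 id="算法思想-1"><a href="#算法思想-1" class="headerlink" title="算法思想"></a>Idea</h5><p>The shortest augmenting path algorithm uses distance labels to reduce the cost of finding augmenting paths. The distance label of a vertex is the minimum number of arcs from that vertex to the sink (its shortest-path length when every edge has weight 1). Let d[i] be the label of vertex i. If an arc (i, j) with positive residual capacity and d[i] = d[j] + 1 is called an admissible arc, and augmentation only follows admissible arcs, then every path we walk is automatically a shortest path. 
The initial label of every vertex can be computed once at the start by a BFS from the sink along reversed arcs.</p><h5 id="算法流程-2"><a href="#算法流程-2" class="headerlink" title="算法流程"></a>Procedure</h5><blockquote><ol><li>Define the label of a vertex as its shortest distance to the sink;</li><li>Always augment along admissible edges. For example, given two vertices i and j with d[i] = 3 and d[j] = 4, we have d[j] = d[i] + 1, so an edge from j to i with positive residual capacity is admissible.</li><li>After finding an augmenting path, update the flow on every edge of the path.</li><li>After all admissible edges of the current vertex have been tried, relabel it as $d[now] = min(d[next] | Flow(now, next) > 0) + 1$, so that the next search has somewhere to go.</li><li>When no augmenting path remains in the graph, stop; the flow value obtained is the maximum flow.</li></ol></blockquote><p>To follow the relabeling process, first consider its purpose. A label needs updating when no augmenting path can be continued under the current labels. Relabeling makes it possible to go on, and since we always pick the smallest label among the vertices that can still be reached, each search stays as short as possible.<br>The figures below walk through the relabeling process:</p><ol><li>Suppose we have the graph below; for simplicity, no arrows or capacities are drawn:<br><img src="/images/network-flows/FFalgo3.png" alt="SAP算法过程"></li><li>Label the graph: the label of each vertex is its shortest distance to the sink (every edge counts as 1)<br><img src="/images/network-flows/FFalgo4.png" alt="SAP算法过程"></li><li>On the first pass we find the augmenting path 1->2->9 and update the flow on its edges, which gives the figure below<br>The brown numbers are the flow values on the edges. Searching again with these labels, no augmenting path can be found from 1: flow(1,2) is 0, so that edge cannot be used, and $h[1]=2$ while $h[3]+1=3≠h[1]$ and $h[5]+1=5≠h[1]$. So we relabel 1 according to $min(h[next]|Flow(now,next)>0)+1$, which gives $h[1]=h[3]+1=3$.<br><img src="/images/network-flows/FFalgo5.png" alt="SAP算法过程"></li><li>The second pass finds the augmenting path 1->3->4->9. After using it, no admissible edge can be found again, so we relabel once more as described above, giving $h[1]=h[5]+1=5$. Searching and relabeling continue like this until the graph becomes the one below<br><img src="/images/network-flows/FFalgo6.png" alt="SAP算法过程"></li><li>Now relabeling h[1] finds no vertex that can be used to update it, so $h[1]=∞$ and the program exits.</li></ol><p><strong>GAP optimization:</strong> Because admissible edges are defined as $ (now, next) | h[now] = h[next]+1 $, if the labels have a "gap", i.e. some label value is held by no vertex, then the residual graph contains no augmenting path and we can stop right away, which cuts useless searching. For example: if no vertex has label 3 while vertices with label 4 and with label 2 both exist, then a search that reaches any vertex labeled 4 cannot go any further, so no augmenting path exists. In that case we can set $ h[1] = n$ to end the search indirectly.</p><h5 id="算法实现-2"><a href="#算法实现-2" class="headerlink" title="算法实现"></a>Implementation</h5><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span 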
class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span 
class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><queue></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><cstdio></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><cstring></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><iostream></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><algorithm></span></span></span><br><span class="line"><span class="keyword">using</span> <span class="keyword">namespace</span> <span class="built_in">std</span>;</span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> MAXN = <span class="number">5010</span>;</span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> MAXN_INT = (<span class="number">1</span> << <span class="number">29</span>);</span><br><span class="line"></span><br><span class="line"><span class="class"><span class="keyword">struct</span> <span class="title">Edge</span>{</span></span><br><span class="line"> 
<span class="keyword">int</span> v, w, nxt;</span><br><span class="line">};</span><br><span class="line"></span><br><span class="line"><span class="keyword">bool</span> isFind;</span><br><span class="line"><span class="keyword">int</span> head[MAXN];</span><br><span class="line">Edge edge[MAXN];</span><br><span class="line"><span class="keyword">int</span> dis[MAXN], gap[MAXN];</span><br><span class="line"><span class="keyword">int</span> n, m, ecnt, aug, maxFlow;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">init</span><span class="params">()</span></span>{</span><br><span class="line"> ecnt = maxFlow = <span class="number">0</span>;</span><br><span class="line"> <span class="built_in">memset</span>(gap, <span class="number">0</span>, <span class="keyword">sizeof</span>(gap));</span><br><span class="line"> <span class="built_in">memset</span>(dis, <span class="number">0</span>, <span class="keyword">sizeof</span>(dis));</span><br><span class="line"> <span class="built_in">memset</span>(edge, <span class="number">0</span>, <span class="keyword">sizeof</span>(edge));</span><br><span class="line"> <span class="built_in">memset</span>(head, <span class="number">-1</span>, <span class="keyword">sizeof</span>(head));</span><br><span class="line"> gap[<span class="number">0</span>] = n;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">addEdge</span><span class="params">(<span class="keyword">int</span> u, <span class="keyword">int</span> v, <span class="keyword">int</span> w)</span></span>{</span><br><span class="line"> edge[ecnt].v = v;</span><br><span class="line"> edge[ecnt].w = w;</span><br><span class="line"> edge[ecnt].nxt = head[u];</span><br><span class="line"> head[u] = ecnt++;</span><br><span class="line">}</span><br><span 
class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">Find</span><span class="params">(<span class="keyword">int</span> s)</span></span>{</span><br><span class="line"> <span class="keyword">int</span> dx, augc, minDis;</span><br><span class="line"> <span class="keyword">if</span>(s == n){</span><br><span class="line"> isFind = <span class="literal">true</span>;</span><br><span class="line"> maxFlow += aug;</span><br><span class="line"> <span class="keyword">return</span>;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> augc = aug;</span><br><span class="line"> minDis = n - <span class="number">1</span>;</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = head[s]; i + <span class="number">1</span>; i = edge[i].nxt){ <span class="comment">// walk the edges leaving s</span></span><br><span class="line"> <span class="keyword">if</span>(edge[i].w > <span class="number">0</span>){</span><br><span class="line"> <span class="keyword">if</span>(dis[s] == dis[edge[i].v] + <span class="number">1</span>){</span><br><span class="line"> aug = min(aug, edge[i].w);</span><br><span class="line"> Find(edge[i].v);</span><br><span class="line"> <span class="keyword">if</span>(dis[<span class="number">1</span>] >= n) <span class="keyword">return</span>;</span><br><span class="line"> <span class="keyword">if</span>(isFind){</span><br><span class="line"> dx = i;</span><br><span class="line"> <span class="keyword">break</span>;</span><br><span class="line"> }</span><br><span class="line"> aug = augc;</span><br><span class="line"> }</span><br><span class="line"> minDis = min(minDis, dis[edge[i].v]);</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">if</span>(!isFind){</span><br><span class="line"> gap[dis[s]]--;</span><br><span class="line"> <span class="keyword">if</span>(gap[dis[s]] == <span class="number">0</span>) 
dis[<span class="number">1</span>] = n;</span><br><span class="line"> dis[s] = minDis + <span class="number">1</span>;</span><br><span class="line"> gap[dis[s]]++;</span><br><span class="line"> }<span class="keyword">else</span>{</span><br><span class="line"> edge[dx].w -= aug;</span><br><span class="line"> edge[dx ^ <span class="number">1</span>].w += aug;</span><br><span class="line"> }</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">main</span><span class="params">()</span></span>{</span><br><span class="line"> <span class="keyword">while</span>(<span class="built_in">scanf</span>(<span class="string">"%d%d"</span>, &n, &m) != EOF){</span><br><span class="line"> init();</span><br><span class="line"> <span class="keyword">int</span> u, v, w;</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = <span class="number">0</span>; i < m; i++){</span><br><span class="line"> <span class="built_in">scanf</span>(<span class="string">"%d%d%d"</span>, &u, &v, &w);</span><br><span class="line"> addEdge(u, v, w);</span><br><span class="line"> addEdge(v, u, <span class="number">0</span>);</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="keyword">while</span>(dis[<span class="number">1</span>] < n){</span><br><span class="line"> isFind = <span class="number">0</span>;</span><br><span class="line"> aug = MAXN_INT;</span><br><span class="line"> Find(<span class="number">1</span>);</span><br><span class="line"> }</span><br><span class="line"> <span class="built_in">cout</span> << maxFlow << <span class="built_in">endl</span>;</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h5 id="时间复杂度-1"><a href="#时间复杂度-1" 
class="headerlink" title="时间复杂度"></a>Time complexity</h5><p>$O(V^2E)$</p><h3 id="预流推进算法"><a href="#预流推进算法" class="headerlink" title="预流推进算法"></a>Preflow-push algorithm</h3><p>The preflow-push algorithm starts from a preflow and augments flow from active vertices along admissible arcs; each such augmentation is called a push. During the pushes the flow always satisfies the capacity constraint but generally violates flow conservation, so it is only a pseudoflow. Moreover, if in a pseudoflow the total flow leaving every vertex (other than the source Vs and the sink Vt) never exceeds the total flow entering it, the pseudoflow is called a preflow. This is where the name preflow-push comes from.</p><h4 id="算法流程-3"><a href="#算法流程-3" class="headerlink" title="算法流程"></a>Algorithm steps</h4><blockquote><ol><li>First run one BFS to give every vertex v a label dis[v], the length of the shortest path from v to the sink.</li><li>Saturate every edge leaving S, and add the active vertices this creates to the queue Q.</li><li>Take an active vertex u from Q and examine every edge (u, v) of the residual network G’ in turn. If $dis[u] = dis[v] + 1$, push flow along that edge, until u is no longer active (it has no excess flow left). </li><li>If u is still active, relabel it: $dis[u] = min(dis[v] + 1)$, taken over the edges (u, v) that exist in G’, and then put u back into the queue.</li><li>Repeat steps 3 and 4 until the queue Q is empty.</li></ol></blockquote><h4 id="算法实现-3"><a href="#算法实现-3" class="headerlink" title="算法实现"></a>Implementation</h4><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span 
class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span 
class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> size = <span class="number">501</span>;</span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> MAX = <span class="number">1</span> << <span class="number">15</span>;</span><br><span class="line"> </span><br><span class="line"><span class="keyword">int</span> graph[size][size];</span><br><span class="line"><span class="keyword">int</span> label[size]; <span class="comment">//distance label of each vertex</span></span><br><span class="line"><span class="keyword">bool</span> visited[size];</span><br><span class="line"> </span><br><span class="line"><span class="function"><span class="keyword">bool</span> <span class="title">bfs</span><span class="params">(<span class="keyword">int</span> st, <span class="keyword">int</span> ed)</span></span></span><br><span class="line"><span class="function"></span>{</span><br><span class="line"> <span class="built_in">memset</span>(label, <span class="number">-1</span>, <span class="keyword">sizeof</span>(label));</span><br><span class="line"> <span class="built_in">memset</span>(visited, <span class="literal">false</span>, <span class="keyword">sizeof</span>(visited));</span><br><span class="line"> label[st] = <span class="number">0</span>;</span><br><span class="line"> visited[st] = <span class="literal">true</span>;</span><br><span class="line"> <span class="built_in">vector</span> < <span class="keyword">int</span> >plist;</span><br><span class="line"> plist.push_back(st);</span><br><span class="line"> <span class="keyword">while</span> (plist.size()) {</span><br><span class="line"> <span class="keyword">int</span> p = plist[<span class="number">0</span>];</span><br><span class="line"> plist.erase(plist.begin());</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> 
i = <span class="number">0</span>; i < size; i++) {</span><br><span class="line"> <span class="keyword">if</span> (graph[i][p] > <span class="number">0</span> && !visited[i]) {</span><br><span class="line"> plist.push_back(i);</span><br><span class="line"> visited[i] = <span class="literal">true</span>;</span><br><span class="line"> label[i] = label[p] + <span class="number">1</span>;</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">if</span> (label[ed] == <span class="number">-1</span>) {</span><br><span class="line"> <span class="keyword">return</span> <span class="literal">false</span>;</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> <span class="literal">true</span>;</span><br><span class="line">}</span><br><span class="line"> </span><br><span class="line"><span class="keyword">int</span> inflow[size]; <span class="comment">//excess flow held at each vertex</span></span><br><span class="line"> </span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">maxFlow</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>{</span><br><span class="line"> <span class="built_in">memset</span>(inflow, <span class="number">0</span>, <span class="keyword">sizeof</span>(inflow));</span><br><span class="line"> </span><br><span class="line"> <span class="comment">//heights: label every vertex with its BFS distance to the sink</span></span><br><span class="line"> bfs(size - <span class="number">1</span>, <span class="number">0</span>); <span class="comment">//end point: size - 1, start point: 0</span></span><br><span class="line"> <span class="built_in">memset</span>(visited, <span class="literal">false</span>, <span class="keyword">sizeof</span>(visited));</span><br><span class="line"> </span><br><span class="line"><span class="comment">//prepare()</span></span><br><span class="line"> <span class="built_in">vector</span> 
< <span class="keyword">int</span> >plist;</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">0</span>; i < size; i++) {</span><br><span class="line"> <span class="keyword">if</span> (graph[<span class="number">0</span>][i] > <span class="number">0</span>) { <span class="comment">//vertex 0 is the source: saturate every edge leaving it</span></span><br><span class="line"> inflow[i] = graph[<span class="number">0</span>][i];</span><br><span class="line"> graph[<span class="number">0</span>][i] -= inflow[i];</span><br><span class="line"> graph[i][<span class="number">0</span>] += inflow[i];</span><br><span class="line"> <span class="keyword">if</span> (!visited[i]) {</span><br><span class="line"> plist.push_back(i);</span><br><span class="line"> visited[i] = <span class="literal">true</span>;</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">while</span> (plist.size()) {</span><br><span class="line"> <span class="keyword">int</span> p = plist[<span class="number">0</span>];</span><br><span class="line"> plist.erase(plist.begin());</span><br><span class="line"> visited[p] = <span class="literal">false</span>;</span><br><span class="line"> <span class="keyword">int</span> minLabel = <span class="number">-1</span>;</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">0</span>; i < size; i++) {</span><br><span class="line"> <span class="keyword">if</span> (graph[p][i] > <span class="number">0</span>) {</span><br><span class="line"> <span class="keyword">if</span> (label[p] == label[i] + <span class="number">1</span>) {</span><br><span class="line"> <span class="keyword">int</span> flow = min(inflow[p], graph[p][i]);</span><br><span class="line"> inflow[p] -= flow;</span><br><span class="line"> inflow[i] += flow;</span><br><span class="line"> graph[p][i] -= flow;</span><br><span class="line"> graph[i][p] += flow;</span><br><span class="line"> </span><br><span class="line"> <span class="keyword">if</span> (!visited[i] 
&& inflow[i] > <span class="number">0</span>) {</span><br><span class="line"> plist.push_back(i);</span><br><span class="line"> visited[i] = <span class="literal">true</span>;</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">if</span> (inflow[p] > <span class="number">0</span> && p != size - <span class="number">1</span>) {</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">0</span>; i < size; i++) {</span><br><span class="line"> <span class="keyword">if</span> (graph[p][i] > <span class="number">0</span>) {</span><br><span class="line"> <span class="keyword">if</span> (minLabel == <span class="number">-1</span> || minLabel > label[i] + <span class="number">1</span>) {</span><br><span class="line"> minLabel = label[i] + <span class="number">1</span>;</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">if</span> (!visited[p] && minLabel != <span class="number">-1</span> && minLabel < size) <span class="comment">//the minLabel < size check is required: testing showed the loop may otherwise never terminate</span></span><br><span class="line"> {</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">0</span>; i < size; i++) {</span><br><span class="line"> <span class="keyword">if</span> (label[i] + <span class="number">1</span> == minLabel && graph[p][i] > <span class="number">0</span>) {</span><br><span class="line"> visited[p] = <span class="literal">true</span>;</span><br><span class="line"> label[p] = minLabel;</span><br><span class="line"> plist.push_back(p);</span><br><span class="line"> <span class="keyword">break</span>;</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> 
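}</span><br><span class="line"> <span class="keyword">return</span> inflow[size - <span class="number">1</span>];</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h4 id="复杂度分析"><a href="#复杂度分析" class="headerlink" title="复杂度分析"></a>Complexity analysis</h4><p>If Q is a standard FIFO queue, the running time is $O(n^2m)$: a label never exceeds $n$ (beyond that there is certainly no path to the sink), so each of the $n$ vertices is relabeled at most $n$ times, and between two relabelings each of the $m$ edges is pushed along at most once. If Q is a priority queue that serves the vertex with the highest label first, we obtain the highest-label preflow-push algorithm, whose running time is only $O(n^2\sqrt{m})$.</p><h2 id="最小费用最大流"><a href="#最小费用最大流" class="headerlink" title="最小费用最大流"></a>Minimum-cost maximum flow</h2><h3 id="简介"><a href="#简介" class="headerlink" title="简介"></a>Introduction</h3><p>Minimum-cost maximum flow addresses the following problem: besides a maximum capacity, every edge of the graph also has a cost, namely the price of sending one unit of flow through that edge. The task is to minimize the total cost while keeping the flow from the source to the sink maximum.</p><h3 id="求解思想"><a href="#求解思想" class="headerlink" title="求解思想"></a>Idea</h3><p>Consider this problem first: among the variants of the shortest-path problem there is often one in which every road has not only a length but also a construction cost, and we are asked for the cheapest route among the shortest routes from start to finish. How did we solve it?<br>We know that the length of the shortest path is fixed, but the edges that make up a shortest path are not. So while searching for the shortest path we only need to adjust the priority of the candidate edges to steer the search, and the requirement above is met.<br>That problem closely resembles the minimum-cost maximum-flow problem: if we adjust the priority of the candidate edges while looking for an augmenting path, the problem is solved. We know that the cost of one augmenting path is $cost = minFlow \times \sum_{i \in P} w_i$, where $P$ is the set of edges on the augmenting path and $w_i$ is the cost per unit of flow on edge $i$. The priority here is therefore simply the shortest path when the length of each edge is taken to be its cost per unit of flow.</p><h3 id="求解算法"><a href="#求解算法" class="headerlink" title="求解算法"></a>Algorithms</h3><p>Corresponding to the three max-flow algorithms there are three algorithms for minimum-cost maximum flow. Let us compare the three pairs:</p><blockquote><p><strong>Max flow, EK algorithm:</strong> each round uses BFS to find a shortest augmenting path (the one with the fewest edges) and augments along it.<br><strong>Cost flow, E’K’ algorithm:</strong> each round uses SPFA to compute the distance labels of the graph and then augments along admissible edges.</p></blockquote><blockquote><p><strong>Max flow, Dinic algorithm:</strong> uses BFS to obtain each vertex's distance label from the source and augments along paths on which the distance label changes by exactly 1 per edge, until no such path remains in the network; it then recomputes the distance labels with another BFS, and exits when the BFS finds that the sink is no longer reachable from the source.<br><strong>Cost flow, primal-dual algorithm:</strong> uses SPFA to obtain the shortest path from the source to every vertex and augments along the direction of the shortest paths, until no such path remains in the network; it then reruns SPFA, and stops when no shortest path reaches the sink.</p></blockquote><blockquote><p><strong>Max flow, SAP algorithm:</strong> like Dinic it is based on distance labels, but here the label stored is the distance to the sink. Consider the effect of each augmentation on the network: an augmentation can only make distance labels larger, and it never breaks the property $dis[u] \le dis[v] + w[u, v]$; it can only make the equality stop holding. Failing to find an admissible edge means exactly that there is no vertex v with $dis[u] = dis[v] + w[u, v]$. Restoring the equality is simple and does not require recomputing the distance labels of the whole graph; adjusting one label is enough: if the search for an augmenting path from u fails, i.e. no v satisfies $dis[u] = dis[v] + w[u, v]$, then among all edges &lt;u, v&gt; (v∈V) find the v with the smallest distance label and set $dis[u] = dis[v] + w[u, v]$.<br><strong>Cost flow, ZKW algorithm:</strong> each augmentation likewise never breaks $dis[u] \le dis[v] + w[u, v]$; it can only make the equality stop holding. Moreover, only a few vertices are affected (only vertices on the augmenting path can be). So there is no need for SPFA to recompute all the distance labels. If an attempt to assemble an augmenting path from admissible edges fails at vertex u, then among all edges &lt;u, v&gt; (v∈V) find the v with the smallest distance label and make $dis[u] = dis[v] + w[u, v]$ hold.</p></blockquote><h3 id="费用流-E’K’-算法"><a href="#费用流-E’K’-算法" class="headerlink" title="费用流 E’K’ 算法"></a>Cost flow: the E’K’ algorithm</h3><p>The idea was described above: replace the BFS in the max-flow EK algorithm with SPFA, which changes the order in which paths are explored:</p><h4 id="算法步骤"><a href="#算法步骤" class="headerlink" title="算法步骤"></a>Algorithm steps</h4><p>The same as the EK algorithm, except that BFS is replaced by an SPFA shortest-path search in which the weight of an edge is its cost per unit of flow.<br>As shown in the figure below:<br><img src="/images/network-flows/FFalgo9.jpg" alt="SAP算法过程"></p><h4 id="算法实现-4"><a href="#算法实现-4" class="headerlink" title="算法实现"></a>Implementation</h4><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span 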
class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span 
class="meta-string"><queue></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><cstdio></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><cstring></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><iostream></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><algorithm></span></span></span><br><span class="line"><span class="keyword">using</span> <span class="keyword">namespace</span> <span class="built_in">std</span>;</span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> MAXN = <span class="number">1010</span>;</span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> MAXM = <span class="number">1000100</span>;</span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> MAXN_INT = (<span class="number">1</span> << <span class="number">29</span>);</span><br><span class="line"></span><br><span class="line"><span class="class"><span class="keyword">struct</span> <span class="title">Edge</span>{</span></span><br><span class="line"> <span class="keyword">int</span> v, w, c, nxt;</span><br><span class="line">};</span><br><span class="line"></span><br><span class="line"><span class="class"><span class="keyword">struct</span> <span class="title">Node</span>{</span></span><br><span class="line"> <span class="keyword">int</span> id, v;</span><br><span class="line">};</span><br><span class="line"></span><br><span class="line"><span class="keyword">bool</span> vis[MAXN];</span><br><span class="line">Node pre[MAXN];</span><br><span class="line">Edge edge[MAXM]; <span class="comment">// indexed by edge id, so it needs MAXM slots rather than MAXN</span></span><br><span class="line"><span class="keyword">int</span> n, m, ecnt, 
sumFlow;</span><br><span class="line"><span class="keyword">int</span> head[MAXN], dis[MAXN];</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">init</span><span class="params">()</span></span>{</span><br><span class="line"> ecnt = <span class="number">0</span>;</span><br><span class="line"> <span class="built_in">memset</span>(edge, <span class="number">0</span>, <span class="keyword">sizeof</span>(edge));</span><br><span class="line"> <span class="built_in">memset</span>(head, <span class="number">-1</span>, <span class="keyword">sizeof</span>(head));</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">addEdge</span><span class="params">(<span class="keyword">int</span> u, <span class="keyword">int</span> v, <span class="keyword">int</span> c, <span class="keyword">int</span> w)</span></span>{</span><br><span class="line"> edge[ecnt].v = v;</span><br><span class="line"> edge[ecnt].w = w;</span><br><span class="line"> edge[ecnt].c = c;</span><br><span class="line"> edge[ecnt].nxt = head[u];</span><br><span class="line"> head[u] = ecnt++;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">bool</span> <span class="title">SPFA</span><span class="params">(<span class="keyword">int</span> s, <span class="keyword">int</span> t, <span class="keyword">int</span> n)</span></span>{</span><br><span class="line"> <span class="built_in">queue</span> <<span class="keyword">int</span>> que;</span><br><span class="line"> <span class="built_in">memset</span>(vis, <span class="number">0</span>, <span class="keyword">sizeof</span>(vis));</span><br><span class="line"> fill(dis, dis + MAXN, MAXN_INT);</span><br><span class="line"> vis[s] = <span class="literal">true</span>;</span><br><span class="line"> 
dis[s] = <span class="number">0</span>;</span><br><span class="line"> que.push(s);</span><br><span class="line"> <span class="keyword">while</span>(!que.empty()){</span><br><span class="line"> <span class="keyword">int</span> u =que.front();</span><br><span class="line"> que.pop();</span><br><span class="line"> vis[u] = <span class="literal">false</span>;</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = head[u]; i + <span class="number">1</span>; i = edge[i].nxt){</span><br><span class="line"> <span class="keyword">int</span> v = edge[i].v;</span><br><span class="line"> <span class="keyword">if</span>(edge[i].w > <span class="number">0</span> && dis[v] > dis[u] + edge[i].c){ <span class="comment">// relax only along edges with residual capacity</span></span><br><span class="line"> dis[v] = dis[u] + edge[i].c;</span><br><span class="line"> pre[v].v = u;</span><br><span class="line"> pre[v].id = i;</span><br><span class="line"> <span class="keyword">if</span>(!vis[v]){</span><br><span class="line"> que.push(v);</span><br><span class="line"> vis[v] = <span class="literal">true</span>;</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">if</span>(dis[t] == MAXN_INT) <span class="keyword">return</span> <span class="literal">false</span>;</span><br><span class="line"> <span class="keyword">return</span> <span class="literal">true</span>;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">MCMF</span><span class="params">(<span class="keyword">int</span> s, <span class="keyword">int</span> t, <span class="keyword">int</span> n)</span></span>{</span><br><span class="line"> <span class="keyword">int</span> flow = <span class="number">0</span>;</span><br><span class="line"> <span class="keyword">int</span> minCost = <span class="number">0</span>;</span><br><span class="line"> <span 
class="keyword">while</span>(SPFA(s, t, n)){</span><br><span class="line"> <span class="keyword">int</span> minFlow = MAXN_INT + <span class="number">1</span>;</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = t; i != s; i = pre[i].v){</span><br><span class="line"> minFlow = min(minFlow, edge[pre[i].id].w);</span><br><span class="line"> }</span><br><span class="line"> flow += minFlow; <span class="comment">// accumulate the total flow pushed so far</span></span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = t; i != s; i = pre[i].v){</span><br><span class="line"> edge[pre[i].id].w -= minFlow;</span><br><span class="line"> edge[pre[i].id ^ <span class="number">1</span>].w += minFlow;</span><br><span class="line"> }</span><br><span class="line"> minCost += dis[t] * minFlow;</span><br><span class="line"> }</span><br><span class="line"> sumFlow = flow;</span><br><span class="line"> <span class="keyword">return</span> minCost;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">main</span><span class="params">()</span></span>{</span><br><span class="line"> <span class="keyword">while</span>(<span class="built_in">scanf</span>(<span class="string">"%d%d"</span>, &n, &m) != EOF){</span><br><span class="line"> init(); <span class="keyword">int</span> u, v, c, w; <span class="comment">// reset head[] and the edge list for every test case</span></span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = <span class="number">0</span>; i < m; i++){</span><br><span class="line"> <span class="built_in">scanf</span>(<span class="string">"%d%d%d%d"</span>, &u, &v, &c, &w);</span><br><span class="line"> addEdge(u, v, c, w);</span><br><span class="line"> addEdge(v, u, -c, <span class="number">0</span>);</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">int</span> ans = MCMF(<span class="number">1</span>, n, n);</span><br><span class="line"> <span 
class="built_in">printf</span>(<span class="string">"%d\n"</span>, ans);</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">}</span><br></pre></td></tr></table></figure>]]></content>
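<summary type="html"><h2 id="基本概念-从书上摘抄-可以直接跳过不看"><a href="#基本概念-从书上摘抄-可以直接跳过不看" class="headerlink" title="基本概念(从书上摘抄,可以直接跳过不看)"></a>Basic concepts (copied from a textbook; feel free to skip)</h2><h3 id="容量网络和网络最大流"><a href="#容量网络和网络最大流" class="headerlink" title="容量网络和网络最大流"></a>Capacity networks and maximum flow</h3><p><strong>Capacity network:</strong> Let <code>G(V, E)</code> be a directed network in which one vertex of V is designated the source (denoted Vs) and another the sink (denoted Vt). Every arc <code>&lt;u, v&gt;∈E</code> carries a weight c(u, v)&gt;0, called the <code>capacity of the arc</code>. Such a directed network G is called a capacity network.</p>
<blockquote>
<p>In other words: a graph that has a source and a sink and can carry flow.</p>
</blockquote>
<p><strong>Arc flow:</strong> the actual flow through each arc <code>&lt;u, v&gt;</code> of the capacity network G (flow for short), written <code>f(u, v)</code>.<br><strong>Network flow:</strong> the set of flows on all arcs, <code>f = &#123; f(u, v) &#125;</code>, is called a network flow of the capacity network G.<br><strong>Feasible flow:</strong> in a capacity network <code>G(V, E)</code>, a network flow f that satisfies the following conditions is called a feasible flow:</p>
<ul>
<li><strong>Capacity constraint:</strong> $0 \le f(u, v) \le c(u, v)$</li>
<li><strong>Conservation condition:</strong> for every vertex other than Vs and Vt, the total inflow equals the total outflow. For the source, <code>total outflow of Vs - total inflow of Vs = f</code>; for the sink, <code>total inflow of Vt - total outflow of Vt = f</code>. The value <code>f</code> is called the value of the feasible flow.</li>
</ul>
<blockquote>
<p>In other words: there is a path from Vs to Vt in the graph; on this path the start satisfies $f_o - f_i = f$, the end satisfies $f_i - f_o = f$, every other vertex satisfies $f_i = f_o$, and the current flow on every edge is at most its capacity. (Here $f_i$ denotes the inflow and $f_o$ the outflow.)</p>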
</blockquote></summary>
<category term="算法" scheme="https://andrewei1316.github.io/categories/%E7%AE%97%E6%B3%95/"/>
<category term="图论" scheme="https://andrewei1316.github.io/categories/%E7%AE%97%E6%B3%95/%E5%9B%BE%E8%AE%BA/"/>
<category term="图论" scheme="https://andrewei1316.github.io/tags/%E5%9B%BE%E8%AE%BA/"/>
<category term="网络流" scheme="https://andrewei1316.github.io/tags/%E7%BD%91%E7%BB%9C%E6%B5%81/"/>
</entry>
<entry>
<title>Graph Connectivity Problems</title>
<link href="https://andrewei1316.github.io/2016/04/06/Connectivity-of-Graphs/"/>
<id>https://andrewei1316.github.io/2016/04/06/Connectivity-of-Graphs/</id>
<published>2016-04-06T15:22:02.000Z</published>
<updated>2018-04-09T01:16:07.242Z</updated>
<content type="html"><![CDATA[<h2 id="基本概念"><a href="#基本概念" class="headerlink" title="基本概念"></a>基本概念</h2><h3 id="无向图"><a href="#无向图" class="headerlink" title="无向图"></a>无向图</h3><a id="more"></a><ul><li><p><strong>连通图和非联通图</strong>: 如果无向图 G 中任意一对顶点都是连通的,则称此图是连通图(connected graph);相反,如<br>果一个无向图不是连通图,则称为非连通图(disconnected graph)。对非连通图G,其极大连通子图称为连通分量(connected component,或连通分支),连通分支数记为w(G)。</p></li><li><p><strong>割顶集与连通度</strong>: 设V’是连通图G 的一个顶点子集,在G 中删去V’及与V’关联的边后图不连通,则称 V’ 是 G 的割顶集(vertex-cut set)。如果割顶集V’的任何真子集都不是割顶集,则称V’为极小割顶 集。顶点个数最小的极小割顶集称为最小割顶集。最小割顶集中顶点的个数,称作图G 的顶点连通度(vertex connectivity degree),记做κ(G),且称图G 是κ–连通图(κ–connected graph)。</p></li><li><p><strong>割点</strong>:如果割顶集中只有一个顶点,则该顶点可以称为割点(cut-vertex),或关节点。</p></li><li><p><strong>点双连通图</strong>:如果一个无向连通图 G 没有关节点,或者说点连通度κ(G) > 1,则称 G 为点双 连通图,或者称为重连通图。 </p></li><li><p><strong>点双连通分量</strong>:一个连通图 G 如果不是点双连通图,那么它可以包括几个点双连通分量,也 称为重连通分量(或块)。一个连通图的重连通分量是该图的极大重连通子图,在重连通分量中不存在关节点。 </p></li><li><p><strong>割边集与边连通度</strong>:设 E’ 是连通图 G 的边集的子集,在 G 中删去E’后图不连通,则称E’是G 的割边集 (edge-cut set)。如果割边集 E’ 的任何真子集都不是割边集,则称 E’ 为极小割边集。边数最小的极 小割边集称为最小割边集。最小割边集中边的个数,称作图G 的边连通度(edge connectivity degree),记做λ(G),且称图G 是λ–边连通图(λ–edge–connected graph)。 </p></li><li><p><strong>割边</strong>:如果割边集中只有一条边,则该边可以称为割边(bridge),或桥。</p></li><li><p><strong>边双连通图</strong>:如果一个无向连通图 G 没有割边,或者说边连通度λ(G) > 1,则称G 为边双连通图。</p></li><li><p><strong>边双连通分量</strong>:边双连通分量:一个连通图 G 如果不是边双连通图,那么它可以包括几个边双连通分量。一 个连通图的边双连通分量是该图的极大重连通子图,在边双连通分量中不存在割边。在连通图中, 把割边删除,则连通图变成了多个连通分量,每个连通分量就是一个边双连通分量。</p></li><li><p><strong>顶点连通性与边连通性的关系</strong>:(顶点连通度、边连通度与图的最小度的关系) 设G 为无向连通图,则存在关系式:$$κ(G) ≤ λ(G) ≤ δ(G)$$</p></li><li><p><strong>割边和割点的联系</strong>:(割边和割点的联系)设 v 是图 G 中与一条割边相关联的顶点,则 v 是 G 的割点当且仅当$$deg(v) ≥ 2$$</p></li></ul><h3 id="有向图"><a href="#有向图" class="headerlink" title="有向图"></a>有向图</h3><ul><li><p><strong>强连通(strongly connected)</strong>:若 G 是有向图,如果对图 G 中任意两个顶点 u 和 v,既存在从 u 到 v 的路径,也存在从 v 到 u 的路径,则称该有向图为强连通有向图。对于非强连通图,其极 大强连通子图称为其强连通分量。</p></li><li><p><strong>单连通(simply 
connected)</strong>:若 G 是有向图,如果对图 G 中任意两个顶点 u 和 v,存在从 u 到 v 的路径或从 v 到 u 的路径,则称该有向图为单连通有向图。</p></li><li><p><strong>弱连通(weak connected)</strong>:若 G 是有向图,如果忽略图 G 中每条有向边的方向,得到的无向 图(即有向图的基图)连通,则称该有向图为弱连通有向图。</p></li></ul><h2 id="无向图点连通性的求解及应用"><a href="#无向图点连通性的求解及应用" class="headerlink" title="无向图点连通性的求解及应用"></a>无向图点连通性的求解及应用</h2><h3 id="求割点"><a href="#求割点" class="headerlink" title="求割点"></a>求割点</h3><p>Tarjan 算法只需从某个顶点出发进行一次遍历,就可以求得图中所有的关节点,因此其复杂度为O(n^2)。接下来以图(a)所示的无向图为例介绍这种方法。<br>在图(a)中,对该图从顶点 4 出发进行深度优先搜索,实线表示搜索前进方向,虚线表示 回退方向,顶点旁的数字标明了进行深度优先搜索时各顶点的访问次序,即深度优先数。在 DFS 搜索过程中,可以将各顶点的深度优先数记录在数组dfn 中。 图(b)是进行DFS 搜索后得到的根为顶点4 的深度优先生成树。为了更加直观地描述树形结 构,将此生成树改画成图(d)所示的树形形状。在图(d)中,还用虚线画出了两条虽然属于图G、但 不属于生成树的边,即(4, 5)和(6, 8)。 请注意:在深度优先生成树中,如果u 和v 是2 个顶点,且在生成树中u 是v 的祖先,则必 有dfn[u] < dfn[v],表明u 的深度优先数小于v,u 先于 v 被访问。<br><img src="/images/Connectivity-of-Graphs/1.png"><br>图G 中的边可以分为3 种:</p><ul><li><ol><li>生成树的边,如(2, 4)、(6, 7)等。</li></ol></li><li><ol start="2"><li>回边(back edge):图(d)中虚线所表示的非生成树的边,称为回边。当且仅当 u 在生成树中是 v 的祖先,或者 v 是 u 的祖先时,非生成树的边(u,v)才成为一条回边。如图(a)及图(d)中的(4, 5)、(6, 8)都是回边。 </li></ol></li><li><ol start="3"><li>交叉边:除生成树的边、回边外,图G 中的其他边称为交叉边。<br>请特别注意:一旦生成树确定以后,那么原图中的边只可能是回边和生成树的边,交叉边实际上是不存在的。为什么?(说明:对有向图进行DFS 搜索后,非生成树的边可能是交叉边) 假设图G 中存在边(1, 10),如图(c)所示,这就是所谓的交叉边,那么顶点10(甚至其他顶点都)只能位于顶点4 的左边这棵子树中。另外,如果在图G 中增加两条交叉边(1, 10)和(7, 9),则图G 就是一个重连通图,如图(c)所示。</li></ol></li></ul><blockquote><p>顶点u 是关节点的充要条件:</p></blockquote><ol><li>如果顶点u 是深度优先搜索生成树的根,则u 至少有2 个子女。为什么呢?因为删除u,它的子女所在的子树就断开了,你不用担心这些子树之间(在原图中)可能存在边,因为交叉边是不存在的。</li></ol><blockquote><ol start="2"><li>如果 u 不是生成树的根,则它至少有一个子女 w,从 w 出发,不可能通过w、w 的子孙,以及一条回边组成的路径到达 u 的祖先。为什么呢?这是因为如果删除顶点 u 及其 所关联的边,则以顶点 w 为根的子树就从搜索树中脱离了。例如,顶点6 为什么是关节 点?这是因为它的一个子女顶点,如图(d)所示,即顶点7,不存在如前所述的路径到达顶点6 的祖先结点,这样,一旦顶点6 删除了,则以顶点7 为根结点的子树就断开了。 又如,顶点7 为什么不是关节点?这是因为它的所有子女顶点,当然在图(d)中只有顶点 8,存在如前所述的路径到达顶点7 的祖先结点,即顶点6,这样,一旦顶点7 删除了, 则以顶点8 为根结点的子树仍然跟图G 连通。</li></ol></blockquote><p>因此,可对图 G 的每个顶点 u 定义一个 low 值:low[u]是从 u 或 u 的子孙出发通过回边可以到达的最低深度优先数。low[u]的定义如下: </p><blockquote><p>low[u] 
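= Min<br>{<br>dfn[u],<br>Min{ low[w] | w is a child of u },<br>Min{ dfn[v] | v is adjacent to u and (u, v) is a back edge }<br>}</p></blockquote><p>That is, low[u] is the minimum of the three terms above, where:</p><ul><li>the 1st term is the depth-first number of u itself;</li><li>the 2nd term is the minimum of low[w] over its (possibly several) children w, because whatever lowest depth-first number a child can reach, u can also reach through that child;</li><li>the 3rd term is the lowest depth-first number that u can reach directly through a back edge.</li></ul><p>Therefore, <strong>a vertex u is an articulation point if and only if u is either the root of the depth-first spanning tree with at least two children, or not the root but has a child w such that low[w]>=dfn[u].</strong><br>Here “low[w]>=dfn[u]” means that the lowest depth-first number reachable from the child w of u along a path of the kind described above is greater than or equal to the depth-first number of u (recall that if vertex m is an ancestor of vertex n in the depth-first spanning tree, then dfn[m] < dfn[n]); in other words, neither w nor any of its descendants has a back edge to a proper ancestor of u. Deleting u and its incident edges then detaches the subtree rooted at w from the search tree. The depth-first number dfn[n] of each vertex can be assigned as the search moves forward, while low[n] is computed during backtracking.<br>The figure and table are used next to explain how low[n] is computed for each vertex n during backtracking (the low[n] value just computed is shown in bold, italic and underlined):</p><ul><li><ol><li>In figure (a), after vertex 1 is visited the search must backtrack. Vertex 1 has no children, so low[1] equals its depth-first number dfn[1], which is 5;</li></ol></li><li><ol start="2"><li>After backtracking from vertex 1 to vertex 5, the search must backtrack further, so the low value of vertex 5 is computed now. Vertex 5 can reach the root directly through the back edge (5, 4), and the root's depth-first number is 1, so the low value of vertex 5 is 1; </li></ol></li><li><ol start="3"><li>After backtracking from vertex 5 to vertex 3, the search must backtrack further, so the low value of vertex 3 is computed now. Its child, vertex 5, has low value 1, so the low value of vertex 3 is also 1;</li></ol></li><li><ol start="4"><li>After backtracking from vertex 3 to vertex 2, the search must backtrack further, so the low value of vertex 2 is computed now. Its child, vertex 3, has low value 1, so the low value of vertex 2 is also 1;</li></ol></li><li><ol start="5"><li>After backtracking from vertex 2 to vertex 4, the search goes on to visit the vertices in the right subtree of vertex 4. The low value of vertex 4 is computed at this point: its child, vertex 2, has low value 1, so the low value of vertex 4 is also 1. The low[n] values of the vertices in the right subtree of root 4 are computed in the same way during backtracking.<br><img src="/images/Connectivity-of-Graphs/2.png"><br>Once an articulation point u has been found, one question remains: into how many connected components does removing u split the originally connected graph? The answer is:</li></ol></li><li><ol><li>if the articulation point u is the root, the graph splits into as many connected components as u has children;</li></ol></li><li><ol start="2"><li>if the articulation point u is not the root and has d children w with low[w] >= dfn[u], removing it splits the graph into d + 1 connected components.</li></ol></li></ul><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span 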
class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// res[i] > 0 means i is a cut vertex, and removing i leaves res[i] connected components;</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">Tarjan</span><span class="params">(<span class="keyword">int</span> u, <span class="keyword">int</span> fa)</span></span>{</span><br><span class="line"> <span class="keyword">int</span> son = <span class="number">0</span>;</span><br><span class="line"> dfn[u] = low[u] = ++tdfn;</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = head[u]; i + <span class="number">1</span>; i = edge[i].nxt){</span><br><span class="line"> <span class="keyword">int</span> v = edge[i].v;</span><br><span class="line"> <span class="keyword">if</span>((fa ^ <span class="number">1</span>) == i) <span class="keyword">continue</span>; <span class="comment">// fa is the index of the edge used to enter u; skip its reverse edge</span></span><br><span class="line"> <span class="keyword">if</span>(!dfn[v]){</span><br><span class="line"> Tarjan(v, i);</span><br><span class="line"> low[u] = min(low[u], low[v]);</span><br><span class="line"> <span class="keyword">if</span>(low[v] >= dfn[u]) son++; <span class="comment">// the subtree of v cannot reach above u</span></span><br><span class="line"> }<span class="keyword">else</span> low[u] = min(low[u], dfn[v]);</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">if</span>(u != root && son) res[u] = son + <span class="number">1</span>;</span><br><span class="line"> <span class="keyword">else</span> <span class="keyword">if</span>(u == root && son > <span class="number">1</span>) res[u] = son;</span><br><span class="line"> <span class="keyword">else</span> res[u] = <span class="number">0</span>;</span><br><span 
class="line">}</span><br><span class="line"><span class="comment">// Usage</span></span><br><span class="line">root = <span class="number">1</span>;</span><br><span class="line">Tarjan(root, <span class="number">-1</span>);</span><br></pre></td></tr></table></figure><h3 id="点双连通分量的求解"><a href="#点双连通分量的求解" class="headerlink" title="点双连通分量的求解"></a>Finding Vertex-Biconnected Components</h3><p>Every biconnected component can be found along the way while the articulation points are being computed. The method: maintain a stack that holds the current biconnected component; during the DFS, push every tree edge or back edge onto the stack as soon as it is found. When some child v of a vertex u satisfies dfn[u] <= low[v], u separates the subtree of v from the rest of the graph (u is a cut vertex unless it is the root with a single child); pop edges off the stack one by one until edge (u, v) has been popped. The popped edges together with their endpoints form one biconnected component. A cut vertex can belong to several biconnected components; every other vertex and every edge belongs to exactly one.</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// bcc_cnt is the number of components found; note that popping a vertex stack when low[u] == dfn[u] groups vertices into edge-biconnected components</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">Tarjan</span><span class="params">(<span class="keyword">int</span> u, <span class="keyword">int</span> fa)</span></span>{</span><br><span class="line"> sta.push(u);</span><br><span class="line"> instack[u] = <span class="literal">true</span>;</span><br><span class="line"> low[u] = dfn[u] = ++tdfn;</span><br><span class="line"> <span class="keyword">for</span>(<span 
class="keyword">int</span> i = head[u]; i + <span class="number">1</span>; i = edge[i].nxt){</span><br><span class="line"> <span class="keyword">int</span> v = edge[i].v;</span><br><span class="line"> <span class="keyword">if</span>((fa ^ <span class="number">1</span>) == i) <span class="keyword">continue</span>;</span><br><span class="line"> <span class="keyword">if</span>(!dfn[v]){</span><br><span class="line"> Tarjan(v, i);</span><br><span class="line"> low[u] = min(low[u], low[v]);</span><br><span class="line"> }<span class="keyword">else</span> low[u] = min(low[u], dfn[v]);</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">if</span>(low[u] == dfn[u]){</span><br><span class="line"> bcc_cnt++;</span><br><span class="line"> <span class="keyword">int</span> top;</span><br><span class="line"> <span class="keyword">do</span>{</span><br><span class="line"> top = sta.top();</span><br><span class="line"> sta.pop();</span><br><span class="line"> instack[top] = <span class="literal">false</span>;</span><br><span class="line"> belong[top] = bcc_cnt;</span><br><span class="line"> }<span class="keyword">while</span>(u != top);</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h3 id="顶点连通度求解"><a href="#顶点连通度求解" class="headerlink" title="顶点连通度求解"></a>Computing Vertex Connectivity</h3><p>To be added...</p><h3 id="割边的求解"><a href="#割边的求解" class="headerlink" title="割边的求解"></a>Finding Cut Edges</h3><p>Finding cut edges (bridges) is similar to finding cut vertices. The criterion: an edge (u, v) of an undirected graph is a bridge if and only if (u, v) is a spanning-tree edge, with u the parent of v, and dfn[u] < low[v].<br>For example, for the undirected graph in figure (a), if the DFS starts from vertex 4, the <code>dfn[]</code> and <code>low[]</code> values of each vertex are as shown in figure (a) (the two numbers beside each vertex are its <code>dfn[]</code> and <code>low[]</code> values), and the depth-first search tree is shown in figure (b). By the criterion above, edges (1, 5), (4, 6), (8, 9) and (9, 10) are the cut edges of the graph.<br><img src="/images/Connectivity-of-Graphs/3.png"></p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Bridge-finding template; the res array stores the ids of the bridges</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">Tarjan</span><span class="params">(<span class="keyword">int</span> u, <span class="keyword">int</span> fa)</span></span>{</span><br><span class="line"> low[u] = dfn[u] = ++tdfn;</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = head[u]; i + <span class="number">1</span>; i = edge[i].nxt){</span><br><span class="line"> <span class="keyword">int</span> v = edge[i].v;</span><br><span class="line"> <span class="keyword">if</span>((fa ^ <span class="number">1</span>) == i) <span class="keyword">continue</span>;</span><br><span class="line"> <span class="keyword">if</span>(!dfn[v]){</span><br><span class="line"> Tarjan(v, i);</span><br><span class="line"> low[u] = min(low[u], low[v]);</span><br><span class="line"> <span class="keyword">if</span>(low[v] > dfn[u]) res[cnt++] = edge[i].id;</span><br><span class="line"> }<span class="keyword">else</span> low[u] = min(low[u], dfn[v]);</span><br><span class="line"> }</span><br><span class="line">}</span><br><span class="line"><span class="comment">// Usage: Tarjan(1, -1);</span></span><br></pre></td></tr></table></figure><h3 id="边双连通分量的求解"><a href="#边双连通分量的求解" class="headerlink" title="边双连通分量的求解"></a>Finding Edge-Biconnected Components</h3><p>Once all bridges have been found, delete them; the original graph falls apart into several connected blocks, and each block is one edge-biconnected component. A bridge belongs to no edge-biconnected component; every other edge and every vertex belongs to exactly one edge-biconnected component. </p><h3 id="边连通度的求解"><a href="#边连通度的求解" class="headerlink" title="边连通度的求解"></a>Computing Edge Connectivity</h3><h2 id="有向图强连通性的求解及应用"><a href="#有向图强连通性的求解及应用" 
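class="headerlink" title="有向图强连通性的求解及应用"></a>Computing Strong Connectivity of Directed Graphs and Its Applications</h2><h3 id="有向图强连通分量的求解算法"><a href="#有向图强连通分量的求解算法" class="headerlink" title="有向图强连通分量的求解算法"></a>Algorithms for Finding Strongly Connected Components of a Directed Graph</h3><h4 id="Tarjan-算法"><a href="#Tarjan-算法" class="headerlink" title="Tarjan 算法"></a><strong>Tarjan's Algorithm</strong></h4><p>Tarjan's algorithm is based on DFS; each strongly connected component is a subtree of the search tree. During the search, the unprocessed nodes of the current search tree are pushed onto a stack, and on backtracking it can be decided whether the nodes from the top of the stack down to some node form a strongly connected component. When dfn(u) = low(u), the nodes of the search subtree rooted at u that are still on the stack form one strongly connected component.<br>The directed graph in figure (a) is used next to explain the idea and the execution of Tarjan's algorithm. In this graph, { 1, 2, 5, 3 } is a strongly connected component, and { 4 } and { 6 } are each strongly connected components as well. Figure (b) is the depth-first search tree obtained by a DFS from vertex 1. Convention: when a vertex has several unvisited adjacent vertices, they are chosen in increasing order of vertex number. The two numbers beside each vertex are its depth-first number (dfn[]) and its low[] value. In figure (b), dashed lines are non-tree edges; edge <5, 6> is a cross edge, and edges <5, 1> and <3, 5> are back edges.<br><img src="/images/Connectivity-of-Graphs/4.png"><br>Figures (c) to (f) show the execution of Tarjan's algorithm. In figure (c), the search follows the solid arrows to vertex 6, where it cannot go any further; since dfn[6] = low[6] = 4 at this point, a strongly connected component has been found. The stack is popped until the popped vertex is vertex 6 itself (u == v), so { 6 } is a strongly connected component.<br>In figure (d), the search backtracks along the dashed arrows to vertex 4 and finds dfn[4] == low[4] = 3; after popping, { 4 } is a strongly connected component.<br>In figure (e), the search backtracks to vertex 2 and continues to vertex 5, which is pushed onto the stack. Vertex 5 has a directed edge to vertex 1, and vertex 1 is still on the stack, so low[5] = 1 and the edge <5, 1> is a back edge. Vertex 6 has already been popped, so <5, 6> is a cross edge. On returning to vertex 2, since <2, 5> is a tree edge, low[2] = low[5] = 1.<br>In figure (f), the search first backtracks to vertex 1 and then visits vertex 3. Vertex 3 has a directed edge to vertex 5; vertex 5 has already been visited and is still on the stack, so edge <3, 5> is a back edge and low[3] = dfn[5] = 5. After returning to vertex 1, the search finds dfn[1] == low[1] and pops every vertex on the stack, which gives the strongly connected component { 3, 5, 2, 1 }. Tarjan's algorithm ends here, having found all three strongly connected components of the graph: { 6 }, { 4 } and { 3, 5, 2, 1 }.<br>Time complexity of Tarjan's algorithm: assuming the graph is stored as adjacency lists, each vertex is visited once and is pushed and popped exactly once, and each edge is examined once, so the time complexity is O(n + m).</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span 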
class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Code template</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">Tarjan</span><span class="params">(<span class="keyword">int</span> u)</span></span>{</span><br><span class="line"> sta.push(u);</span><br><span class="line"> instack[u] = <span class="literal">true</span>;</span><br><span class="line"> dfn[u] = low[u] = ++tdfn;</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = head[u]; i + <span class="number">1</span>; i = edge[i].nxt){</span><br><span class="line"> <span class="keyword">int</span> v = edge[i].v;</span><br><span class="line"> <span class="keyword">if</span>(!dfn[v]){</span><br><span class="line"> Tarjan(v);</span><br><span class="line"> low[u] = min(low[u], low[v]);</span><br><span class="line"> }<span class="keyword">else</span> <span class="keyword">if</span>(instack[v]) low[u] = min(low[u], dfn[v]);</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">if</span>(low[u] == dfn[u]){</span><br><span class="line"> <span class="keyword">int</span> top;</span><br><span class="line"> scc_cnt++;</span><br><span class="line"> <span class="keyword">do</span>{</span><br><span class="line"> top = sta.top();</span><br><span class="line"> sta.pop();</span><br><span class="line"> instack[top] = <span class="literal">false</span>;</span><br><span class="line"> belong[top] = scc_cnt;</span><br><span class="line"> }<span class="keyword">while</span>(u != top);</span><br><span 
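class="line"> }</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="comment">// Usage</span></span><br><span class="line"><span class="keyword">for</span>(<span class="keyword">int</span> i = <span class="number">1</span>; i <= n; i++)</span><br><span class="line"> <span class="keyword">if</span>(!dfn[i]) Tarjan(i);</span><br></pre></td></tr></table></figure><h4 id="Kosaraju-算法"><a href="#Kosaraju-算法" class="headerlink" title="Kosaraju 算法"></a><strong>Kosaraju's Algorithm</strong></h4><p>Kosaraju's algorithm performs two DFS passes, one on the directed graph G and one on its reverse graph GT (the directed graph obtained by reversing every edge); its time complexity is also O(n + m). Compared with Tarjan's algorithm, the idea behind Kosaraju's algorithm is more intuitive.<br>The principle of Kosaraju's algorithm: if a subgraph G’ of a directed graph G is strongly connected, reversing every edge changes nothing; the vertices of G’ remain mutually reachable and G’ is still a strongly connected subgraph. But if G’ is only unilaterally connected, some vertices may no longer be reachable from one another once the edges are reversed. Reversing the edges therefore filters out the blocks that are not strongly connected.<br>Kosaraju's algorithm proceeds as follows:</p><ul><li>(1) Run a depth-first search on the original graph G and record the dfn[] value of each vertex. For the method to be correct in general, this value must be the order in which the DFS finishes each vertex (post-order), which is what the array vs[] in the template below records.</li><li>(2) Reverse every edge of G to obtain its reverse graph GT.</li><li>(3) Starting from the remaining vertex with the largest dfn[ ] value, run a DFS on the reverse graph GT and delete every vertex it reaches; these vertices form one strongly connected component.</li><li>(4) If some vertices have not been deleted yet, repeat step (3); otherwise the algorithm ends.<br>The directed graph G in figure (a) is used next to trace Kosaraju's algorithm. Figure (b) is the forward search, after which the dfn[ ] value of each vertex is known. Figure (c) is the reverse graph GT. Figure (d) is the DFS on GT starting from vertex 3, which yields the first strongly connected component { 1, 2, 5, 3 }; figures (e) and (f) are the DFS runs from vertices 4 and 6, which yield the other two strongly connected components.<br><img src="/images/Connectivity-of-Graphs/5.png"><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span 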
class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Algorithm template</span></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">dfs</span><span class="params">(<span class="keyword">int</span> u)</span></span>{</span><br><span class="line"> vis[u] = <span class="literal">true</span>;</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = head[u]; i + <span class="number">1</span>; i = edge[i].nxt)</span><br><span class="line"> <span class="keyword">if</span>(!vis[edge[i].v]) dfs(edge[i].v);</span><br><span class="line"> vs[vscnt++] = u; <span class="comment">// record u in post-order (finish order)</span></span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">rdfs</span><span class="params">(<span class="keyword">int</span> u, <span class="keyword">int</span> k)</span></span>{</span><br><span class="line"> vis[u] = <span class="literal">true</span>;</span><br><span class="line"> belong[u] = k;</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = rhead[u]; i + <span class="number">1</span>; i = redge[i].nxt)</span><br><span class="line"> <span class="keyword">if</span>(!vis[redge[i].v]) rdfs(redge[i].v, k);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">scc</span><span class="params">()</span></span>{</span><br><span class="line"> <span class="built_in">memset</span>(vis, <span class="number">0</span>, <span class="keyword">sizeof</span>(vis));</span><br><span class="line"> vscnt = <span class="number">0</span>;</span><br><span class="line"> <span 
class="keyword">for</span>(<span class="keyword">int</span> i = <span class="number">1</span>; i <= n; i++)</span><br><span class="line"> <span class="keyword">if</span>(!vis[i]) dfs(i);</span><br><span class="line"></span><br><span class="line"> <span class="keyword">int</span> scc_cnt = <span class="number">0</span>;</span><br><span class="line"> <span class="built_in">memset</span>(vis, <span class="number">0</span>, <span class="keyword">sizeof</span>(vis));</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i = vscnt - <span class="number">1</span>; i >= <span class="number">0</span>; i--)</span><br><span class="line"> <span class="keyword">if</span>(!vis[vs[i]]) rdfs(vs[i], scc_cnt++);</span><br><span class="line"> </span><br><span class="line"> <span class="keyword">return</span> scc_cnt;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h2 id="连通性算法的应用2-SAT"><a href="#连通性算法的应用2-SAT" class="headerlink" title="连通性算法的应用2_SAT"></a>An Application of Connectivity Algorithms: 2-SAT</h2><h3 id="简介"><a href="#简介" class="headerlink" title="简介"></a>Introduction</h3>Given a sequence A of N Boolean values and some constraints, such as A[x] AND A[y] = 0 or A[x] OR A[y] OR A[z] = 1, the task is to determine values for A[0..N-1] that satisfy all the constraints. This is called the SAT problem; in particular, if every constraint involves at most two elements, it is called the 2-SAT problem.</li></ul><h3 id="展开"><a href="#展开" class="headerlink" title="展开"></a>Elaboration</h3><p>Since each constraint in a 2-SAT problem involves at most two elements, there are 11 possible kinds of constraint:<br>A[x]<br>NOT A[x]<br>A[x] AND A[y]<br>A[x] AND NOT A[y]<br>A[x] OR A[y]<br>A[x] OR NOT A[y]<br>NOT (A[x] AND A[y])<br>NOT (A[x] OR A[y])<br>A[x] XOR A[y]<br>NOT (A[x] XOR A[y])<br>A[x] XOR NOT A[y]<br>Furthermore, A[x] AND A[y] is equivalent to (A[x]) AND (A[y]) (it can be split into the two constraints A[x] and A[y]), and NOT(A[x] OR A[y]) is equivalent to NOT A[x] AND NOT A[y] (it can be split into the two constraints NOT A[x] and NOT A[y]). So there are at most 9 kinds of constraint.</p><p>In practice, a 2-SAT problem usually takes the following form: there are N pairs of items; exactly one item must be chosen from each pair; and there are constraints between them (two given items cannot both be chosen, two given items cannot both be left out, exactly one of two given items must be chosen, a given item must be chosen, and so on). Each pair of items can then be treated as one Boolean value (choosing the first item corresponds to 0, choosing the second to 1). If every constraint involves at most two items, each of them can be converted into one of the 9 basic kinds of constraint, which yields a 2-SAT model.</p><h3 id="建模"><a href="#建模" 
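class="headerlink" title="建模"></a>Modeling</h3><p>Modeling a 2-SAT problem in fact follows the practical problem very closely. Build a directed graph with 2N vertices, grouped into N pairs; the two vertices of a pair represent the values 0 and 1 of one element of the Boolean sequence A (below, the vertex for A[i] = 1, i.e. "i is chosen", is called i, and the vertex for A[i] = 0 is called i’; the edge rules that follow rely on this naming). Clearly exactly one vertex of each pair must be chosen. The edges carry a specific meaning: an edge <i, j> means that if i is chosen then j must be chosen too. Among the 9 kinds of constraint above, the last 7, which are binary, can all be realized by adding edges. For example, NOT(A[x] AND A[y]) needs the two edges <x, y’> and <y, x’>, and A[x] OR A[y] needs the two edges <x’, y> and <y’, x>. The first two kinds are unary: A[x] (x must be chosen) is realized by the edge <x’, x>, and NOT A[x] (x must not be chosen) by the edge <x, x’>.</p><h3 id="O-NM-算法:求字典序最小的解"><a href="#O-NM-算法:求字典序最小的解" class="headerlink" title="O(NM)算法:求字典序最小的解"></a>O(NM) Algorithm: Finding the Lexicographically Smallest Solution</h3><p>From the meaning of the edges in the 2-SAT graph, if there is a path from i to j, then choosing i forces choosing j; equivalently, if j is not chosen, i cannot be chosen either.<br>This gives a very intuitive algorithm:</p><ul><li><p>(1) Give every vertex a state V: V = 0 means undetermined, V = 1 means definitely chosen, V = 2 means definitely not chosen. A vertex is called determined if and only if its V value is nonzero. Set up two queues Q1 and Q2 that hold, respectively, the vertices tentatively chosen and the vertices tentatively not chosen in the current attempt.</p></li><li><p>(2) If every vertex of the graph is determined, a solution has been found and the algorithm ends. Otherwise clear Q1 and Q2, pick any undetermined vertex i, add i to Q1 and add i’ to Q2;</p></li><li><p>(3) Find all successors of i. For each successor j, if j is undetermined, add j to Q1; if j’ (the other vertex in the same pair as j) is undetermined, add j’ to Q2;</p></li><li><p>(4) For each vertex in Q2, find all of its predecessors (this requires building the reverse graph first); if a predecessor is undetermined, add it to Q2;</p></li><li><p>(5) If any of the following occurs during steps (3) and (4), the attempt fails; otherwise it succeeds:</p><ul><li><1> a vertex already in Q1 is added to Q2;</li><li><2> a vertex already in Q2 is added to Q1;</li><li><3> some j has state 2;</li><li><4> some i’ or j’ has state 1, or a predecessor of some i’ or j’ has state 1;</li></ul></li><li><p>(6) If the attempt succeeds, set the state of every vertex in Q1 to 1 and the state of every vertex in Q2 to 2, then go to (2). Otherwise try vertex i’ instead; if that also fails, the problem has no solution.</p></li></ul><p>The time complexity of this algorithm is O(NM) (in the worst case every vertex is tried, and each attempt traverses every edge), but in most cases it stays far below this bound.<br>In an implementation, a single array vst can represent membership in Q1 and Q2. Use two marker values i1 and i2 (i1 and i2 must differ for different i, which avoids re-initializing the array before each attempt and saves time): vst[i] = i1 means i has been added to Q1, and vst[i] = i2 means i has been added to Q2. The queues Q1 and Q2 themselves are still needed, because the traversal (BFS) requires a queue; to avoid visiting a vertex twice, a vertex is added to Q1 (or Q2) only if its vst value is not already i1 (or i2). As soon as a contradiction arises, the attempt is aborted and declared failed.</p><p>Although this algorithm usually runs faster than O(NM), its overall performance is still worse than that of the O(M) algorithm below. However, it has one very important use: finding the lexicographically smallest solution!<br>If the two vertices of each pair are numbered consecutively in the graph (01, 23, 45, ...), the pairs can be tried in order (pair 0, pair 1, ...), trying the smaller-numbered vertex of each pair first and the larger-numbered one only if that fails. This always yields the lexicographically smallest solution (if a solution exists), because once a vertex is determined it can never be changed.<br>If the two vertices of a pair are not numbered consecutively (for example 03, 25, 14, ...), sort the pairs in increasing order of their smaller vertex number, then scan the sorted pairs in turn: try the smaller-numbered vertex first and choose it if that succeeds; otherwise try the larger-numbered vertex and choose it if that succeeds; otherwise (both fail) there is no solution.</p><figure 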
class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span 
class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span class="line">115</span><br><span class="line">116</span><br><span class="line">117</span><br><span class="line">118</span><br><span class="line">119</span><br><span class="line">120</span><br><span class="line">121</span><br><span 
class="line">122</span><br><span class="line">123</span><br><span class="line">124</span><br><span class="line">125</span><br><span class="line">126</span><br><span class="line">127</span><br><span class="line">128</span><br><span class="line">129</span><br><span class="line">130</span><br><span class="line">131</span><br><span class="line">132</span><br><span class="line">133</span><br><span class="line">134</span><br><span class="line">135</span><br><span class="line">136</span><br><span class="line">137</span><br><span class="line">138</span><br><span class="line">139</span><br><span class="line">140</span><br><span class="line">141</span><br><span class="line">142</span><br><span class="line">143</span><br><span class="line">144</span><br><span class="line">145</span><br><span class="line">146</span><br><span class="line">147</span><br><span class="line">148</span><br><span class="line">149</span><br><span class="line">150</span><br><span class="line">151</span><br><span class="line">152</span><br><span class="line">153</span><br><span class="line">154</span><br><span class="line">155</span><br><span class="line">156</span><br><span class="line">157</span><br><span class="line">158</span><br><span class="line">159</span><br><span class="line">160</span><br><span class="line">161</span><br><span class="line">162</span><br><span class="line">163</span><br><span class="line">164</span><br><span class="line">165</span><br><span class="line">166</span><br><span class="line">167</span><br><span class="line">168</span><br><span class="line">169</span><br><span class="line">170</span><br><span class="line">171</span><br><span class="line">172</span><br><span class="line">173</span><br><span class="line">174</span><br><span class="line">175</span><br><span class="line">176</span><br><span class="line">177</span><br><span class="line">178</span><br><span class="line">179</span><br><span class="line">180</span><br><span class="line">181</span><br><span 
class="line">182</span><br><span class="line">183</span><br><span class="line">184</span><br><span class="line">185</span><br><span class="line">186</span><br><span class="line">187</span><br><span class="line">188</span><br><span class="line">189</span><br><span class="line">190</span><br><span class="line">191</span><br><span class="line">192</span><br><span class="line">193</span><br><span class="line">194</span><br><span class="line">195</span><br><span class="line">196</span><br><span class="line">197</span><br><span class="line">198</span><br><span class="line">199</span><br><span class="line">200</span><br><span class="line">201</span><br><span class="line">202</span><br><span class="line">203</span><br><span class="line">204</span><br><span class="line">205</span><br><span class="line">206</span><br><span class="line">207</span><br><span class="line">208</span><br><span class="line">209</span><br><span class="line">210</span><br><span class="line">211</span><br><span class="line">212</span><br><span class="line">213</span><br><span class="line">214</span><br><span class="line">215</span><br><span class="line">216</span><br><span class="line">217</span><br><span class="line">218</span><br><span class="line">219</span><br><span class="line">220</span><br><span class="line">221</span><br><span class="line">222</span><br><span class="line">223</span><br><span class="line">224</span><br><span class="line">225</span><br><span class="line">226</span><br><span class="line">227</span><br><span class="line">228</span><br><span class="line">229</span><br><span class="line">230</span><br><span class="line">231</span><br><span class="line">232</span><br><span class="line">233</span><br><span class="line">234</span><br><span class="line">235</span><br><span class="line">236</span><br><span class="line">237</span><br><span class="line">238</span><br><span class="line">239</span><br><span class="line">240</span><br><span class="line">241</span><br><span 
class="line">242</span><br><span class="line">243</span><br><span class="line">244</span><br><span class="line">245</span><br><span class="line">246</span><br><span class="line">247</span><br><span class="line">248</span><br><span class="line">249</span><br><span class="line">250</span><br><span class="line">251</span><br><span class="line">252</span><br><span class="line">253</span><br><span class="line">254</span><br><span class="line">255</span><br><span class="line">256</span><br><span class="line">257</span><br><span class="line">258</span><br><span class="line">259</span><br><span class="line">260</span><br><span class="line">261</span><br><span class="line">262</span><br><span class="line">263</span><br><span class="line">264</span><br><span class="line">265</span><br><span class="line">266</span><br><span class="line">267</span><br><span class="line">268</span><br><span class="line">269</span><br><span class="line">270</span><br><span class="line">271</span><br><span class="line">272</span><br><span class="line">273</span><br><span class="line">274</span><br><span class="line">275</span><br><span class="line">276</span><br><span class="line">277</span><br><span class="line">278</span><br><span class="line">279</span><br><span class="line">280</span><br><span class="line">281</span><br><span class="line">282</span><br><span class="line">283</span><br><span class="line">284</span><br><span class="line">285</span><br><span class="line">286</span><br><span class="line">287</span><br><span class="line">288</span><br><span class="line">289</span><br><span class="line">290</span><br><span class="line">291</span><br><span class="line">292</span><br><span class="line">293</span><br><span class="line">294</span><br><span class="line">295</span><br><span class="line">296</span><br><span class="line">297</span><br><span class="line">298</span><br><span class="line">299</span><br><span class="line">300</span><br><span class="line">301</span><br><span 
class="line">302</span><br><span class="line">303</span><br><span class="line">304</span><br><span class="line">305</span><br><span class="line">306</span><br><span class="line">307</span><br><span class="line">308</span><br><span class="line">309</span><br><span class="line">310</span><br><span class="line">311</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Template code</span></span><br><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment">HDU 1814</span></span><br><span class="line"><span class="comment">Finds the lexicographically smallest solution</span></span><br><span class="line"><span class="comment">C++ 2652ms 2316K</span></span><br><span class="line"><span class="comment"></span></span><br><span class="line"><span class="comment">*/</span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span><span class="meta-string"><stdio.h></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span><span class="meta-string"><iostream></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span><span class="meta-string"><algorithm></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span><span class="meta-string"><iostream></span></span></span><br><span class="line"><span class="keyword">using</span> <span class="keyword">namespace</span> <span class="built_in">std</span>;</span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> MAXN=<span class="number">16010</span>;</span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> MAXM=<span class="number">100000</span>;</span><br><span class="line"><span class="class"><span class="keyword">struct</span> <span class="title">Node</span></span></span><br><span class="line"><span class="class">{</span></span><br><span class="line"> <span 
class="keyword">int</span> a,b,pre,next;</span><br><span class="line">}E[MAXM],E2[MAXM];</span><br><span class="line"><span class="keyword">int</span> _n,n,m;</span><br><span class="line"><span class="keyword">int</span> V[MAXN],ST[MAXN][<span class="number">2</span>],Q[MAXN],Q2[MAXN],vst[MAXN];</span><br><span class="line"><span class="keyword">bool</span> res_ex;</span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">init_d</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>{</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i=<span class="number">0</span>;i<n;i++)</span><br><span class="line"> E[i].a=E[i].pre=E[i].next=E2[i].a=E2[i].pre=E2[i].next=i;</span><br><span class="line"> m=n;</span><br><span class="line">}</span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">add_edge</span><span class="params">(<span class="keyword">int</span> a,<span class="keyword">int</span> b)</span></span></span><br><span class="line"><span class="function"></span>{</span><br><span class="line"> E[m].a=a;E[m].b=b;E[m].pre=E[a].pre;E[m].next=a;E[a].pre=m;E[E[m].pre].next=m;</span><br><span class="line"> E2[m].a=b;E2[m].b=a;E2[m].pre=E2[b].pre;E2[m].next=b;E2[b].pre=m;E2[E2[m].pre].next=m;</span><br><span class="line"> m++;</span><br><span class="line">}</span><br><span class="line"><span class="function"><span class="keyword">void</span> <span class="title">solve</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>{<span class="comment">//1</span></span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i=<span class="number">0</span>;i<n;i++)</span><br><span class="line"> {</span><br><span class="line"> V[i]=<span class="number">0</span>;</span><br><span class="line"> vst[i]=<span 
class="number">0</span>;</span><br><span class="line"> }</span><br><span class="line"> res_ex=<span class="number">1</span>;</span><br><span class="line"> <span class="keyword">int</span> i,i1,i2,j,k,front,rear,front2,rear2;</span><br><span class="line"> <span class="keyword">int</span> len;</span><br><span class="line"> <span class="keyword">bool</span> ff;</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> _i=<span class="number">0</span>;_i<_n;_i++)</span><br><span class="line"> {<span class="comment">//2</span></span><br><span class="line"> <span class="keyword">if</span>(V[_i<<<span class="number">1</span>]==<span class="number">1</span>||V[(_i<<<span class="number">1</span>)+<span class="number">1</span>]==<span class="number">1</span>)<span class="keyword">continue</span>;<span class="comment">// skip decided pairs; look for a pair that is still undetermined</span></span><br><span class="line"> i=_i<<<span class="number">1</span>;len=<span class="number">0</span>;</span><br><span class="line"> <span class="keyword">if</span>(!V[i])</span><br><span class="line"> {<span class="comment">//3</span></span><br><span class="line"> ST[len][<span class="number">0</span>]=i;</span><br><span class="line"> ST[len++][<span class="number">1</span>]=<span class="number">1</span>;</span><br><span class="line"> <span class="keyword">if</span>(!V[i^<span class="number">1</span>])</span><br><span class="line"> {</span><br><span class="line"> ST[len][<span class="number">0</span>]=i^<span class="number">1</span>;</span><br><span class="line"> ST[len++][<span class="number">1</span>]=<span class="number">2</span>;</span><br><span class="line"> }</span><br><span class="line"> Q[front=rear=<span class="number">0</span>]=i;</span><br><span class="line"> vst[i]=i1=n+i;</span><br><span class="line"> Q2[front2=rear2=<span class="number">0</span>]=i^<span class="number">1</span>;</span><br><span class="line"> vst[i^<span class="number">1</span>]=i2=(n<<<span 
class="number">1</span>)+i;</span><br><span class="line"> ff=<span class="number">1</span>;</span><br><span class="line"> <span class="keyword">for</span>(;front<=rear;front++)</span><br><span class="line"> {<span class="comment">//4</span></span><br><span class="line"> j=Q[front];</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> p=E[j].next;p!=j;p=E[p].next)</span><br><span class="line"> {<span class="comment">//5</span></span><br><span class="line"> k=E[p].b;</span><br><span class="line"> <span class="keyword">if</span>(V[k]==<span class="number">2</span>||vst[k]==i2||V[k^<span class="number">1</span>]==<span class="number">1</span>||vst[k^<span class="number">1</span>]==i1)</span><br><span class="line"> {ff=<span class="number">0</span>;<span class="keyword">break</span>;}</span><br><span class="line"> <span class="keyword">if</span>(vst[k]!=i1)</span><br><span class="line"> {<span class="comment">//6</span></span><br><span class="line"> Q[++rear]=k;vst[k]=i1;</span><br><span class="line"> <span class="keyword">if</span>(!V[k])</span><br><span class="line"> {</span><br><span class="line"> ST[len][<span class="number">0</span>]=k;</span><br><span class="line"> ST[len++][<span class="number">1</span>]=<span class="number">1</span>;</span><br><span class="line"> }</span><br><span class="line"> }<span class="comment">//6</span></span><br><span class="line"> <span class="keyword">if</span>(vst[k^<span class="number">1</span>]!=i2)</span><br><span class="line"> {<span class="comment">//6</span></span><br><span class="line"> Q2[++rear2]=k^<span class="number">1</span>;vst[k^<span class="number">1</span>]=i2;</span><br><span class="line"> <span class="keyword">if</span>(!V[k])</span><br><span class="line"> {</span><br><span class="line"> ST[len][<span class="number">0</span>]=k^<span class="number">1</span>;</span><br><span class="line"> ST[len++][<span class="number">1</span>]=<span class="number">2</span>;</span><br><span 
class="line"> }</span><br><span class="line"> }<span class="comment">//6</span></span><br><span class="line"> }<span class="comment">//5</span></span><br><span class="line"> <span class="keyword">if</span>(!ff)<span class="keyword">break</span>;</span><br><span class="line"> }<span class="comment">//4</span></span><br><span class="line"> <span class="keyword">if</span>(ff)</span><br><span class="line"> {<span class="comment">//4</span></span><br><span class="line"> <span class="keyword">for</span>(;front2<=rear2;front2++)</span><br><span class="line"> {<span class="comment">//5</span></span><br><span class="line"> j=Q2[front2];</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> p=E2[j].next;p!=j;p=E2[p].next)</span><br><span class="line"> {<span class="comment">//6</span></span><br><span class="line"> k=E2[p].b;</span><br><span class="line"> <span class="keyword">if</span>(V[k]==<span class="number">1</span>||vst[k]==i1)</span><br><span class="line"> {ff=<span class="number">0</span>;<span class="keyword">break</span>;}</span><br><span class="line"> <span class="keyword">if</span>(vst[k]!=i2)</span><br><span class="line"> {</span><br><span class="line"> vst[k]=i2;Q2[++rear2]=k;</span><br><span class="line"> <span class="keyword">if</span>(!V[k])</span><br><span class="line"> {</span><br><span class="line"> ST[len][<span class="number">0</span>]=k;</span><br><span class="line"> ST[len++][<span class="number">1</span>]=<span class="number">2</span>;</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> }<span class="comment">//6</span></span><br><span class="line"> <span class="keyword">if</span>(!ff)<span class="keyword">break</span>;</span><br><span class="line"> }<span class="comment">//5</span></span><br><span class="line"> <span class="keyword">if</span>(ff)</span><br><span class="line"> {</span><br><span class="line"> <span class="keyword">for</span>(<span 
class="keyword">int</span> j=<span class="number">0</span>;j<len;j++)V[ST[j][<span class="number">0</span>]]=ST[j][<span class="number">1</span>];</span><br><span class="line"> <span class="keyword">continue</span>;</span><br><span class="line"> }</span><br><span class="line"> }<span class="comment">//4</span></span><br><span class="line"> }<span class="comment">//3</span></span><br><span class="line"> i=(_i<<<span class="number">1</span>)+<span class="number">1</span>;len=<span class="number">0</span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">//********************************************</span></span><br><span class="line"><span class="comment">// The block below is identical to the one above and can be copied, provided the one above is correct</span></span><br><span class="line"><span class="comment">//********************************************</span></span><br><span class="line"> <span class="keyword">if</span>(!V[i])</span><br><span class="line"> {<span class="comment">//3</span></span><br><span class="line"> ST[len][<span class="number">0</span>]=i;</span><br><span class="line"> ST[len++][<span class="number">1</span>]=<span class="number">1</span>;</span><br><span class="line"> <span class="keyword">if</span>(!V[i^<span class="number">1</span>])</span><br><span class="line"> {</span><br><span class="line"> ST[len][<span class="number">0</span>]=i^<span class="number">1</span>;</span><br><span class="line"> ST[len++][<span class="number">1</span>]=<span class="number">2</span>;</span><br><span class="line"> }</span><br><span class="line"> Q[front=rear=<span class="number">0</span>]=i;</span><br><span class="line"> vst[i]=i1=n+i;</span><br><span class="line"> Q2[front2=rear2=<span class="number">0</span>]=i^<span class="number">1</span>;</span><br><span class="line"> vst[i^<span class="number">1</span>]=i2=(n<<<span class="number">1</span>)+i;</span><br><span class="line"> ff=<span class="number">1</span>;</span><br><span class="line"> <span 
class="keyword">for</span>(;front<=rear;front++)</span><br><span class="line"> {<span class="comment">//4</span></span><br><span class="line"> j=Q[front];</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> p=E[j].next;p!=j;p=E[p].next)</span><br><span class="line"> {<span class="comment">//5</span></span><br><span class="line"> k=E[p].b;</span><br><span class="line"> <span class="keyword">if</span>(V[k]==<span class="number">2</span>||vst[k]==i2||V[k^<span class="number">1</span>]==<span class="number">1</span>||vst[k^<span class="number">1</span>]==i1)</span><br><span class="line"> {ff=<span class="number">0</span>;<span class="keyword">break</span>;}</span><br><span class="line"> <span class="keyword">if</span>(vst[k]!=i1)</span><br><span class="line"> {<span class="comment">//6</span></span><br><span class="line"> Q[++rear]=k;vst[k]=i1;</span><br><span class="line"> <span class="keyword">if</span>(!V[k])</span><br><span class="line"> {</span><br><span class="line"> ST[len][<span class="number">0</span>]=k;</span><br><span class="line"> ST[len++][<span class="number">1</span>]=<span class="number">1</span>;</span><br><span class="line"> }</span><br><span class="line"> }<span class="comment">//6</span></span><br><span class="line"> <span class="keyword">if</span>(vst[k^<span class="number">1</span>]!=i2)</span><br><span class="line"> {<span class="comment">//6</span></span><br><span class="line"> Q2[++rear2]=k^<span class="number">1</span>;vst[k^<span class="number">1</span>]=i2;</span><br><span class="line"> <span class="keyword">if</span>(!V[k])</span><br><span class="line"> {</span><br><span class="line"> ST[len][<span class="number">0</span>]=k^<span class="number">1</span>;</span><br><span class="line"> ST[len++][<span class="number">1</span>]=<span class="number">2</span>;</span><br><span class="line"> }</span><br><span class="line"> }<span class="comment">//6</span></span><br><span class="line"> }<span 
class="comment">//5</span></span><br><span class="line"> <span class="keyword">if</span>(!ff)<span class="keyword">break</span>;</span><br><span class="line"> }<span class="comment">//4</span></span><br><span class="line"> <span class="keyword">if</span>(ff)</span><br><span class="line"> {<span class="comment">//4</span></span><br><span class="line"> <span class="keyword">for</span>(;front2<=rear2;front2++)</span><br><span class="line"> {<span class="comment">//5</span></span><br><span class="line"> j=Q2[front2];</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> p=E2[j].next;p!=j;p=E2[p].next)</span><br><span class="line"> {<span class="comment">//6</span></span><br><span class="line"> k=E2[p].b;</span><br><span class="line"> <span class="keyword">if</span>(V[k]==<span class="number">1</span>||vst[k]==i1)</span><br><span class="line"> {ff=<span class="number">0</span>;<span class="keyword">break</span>;}</span><br><span class="line"> <span class="keyword">if</span>(vst[k]!=i2)</span><br><span class="line"> {</span><br><span class="line"> vst[k]=i2;Q2[++rear2]=k;</span><br><span class="line"> <span class="keyword">if</span>(!V[k])</span><br><span class="line"> {</span><br><span class="line"> ST[len][<span class="number">0</span>]=k;</span><br><span class="line"> ST[len++][<span class="number">1</span>]=<span class="number">2</span>;</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> }<span class="comment">//6</span></span><br><span class="line"> <span class="keyword">if</span>(!ff)<span class="keyword">break</span>;</span><br><span class="line"> }<span class="comment">//5</span></span><br><span class="line"> <span class="keyword">if</span>(ff)</span><br><span class="line"> {</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> j=<span class="number">0</span>;j<len;j++)V[ST[j][<span class="number">0</span>]]=ST[j][<span 
class="number">1</span>];</span><br><span class="line"> <span class="keyword">continue</span>;</span><br><span class="line"> }</span><br><span class="line"> }<span class="comment">//4</span></span><br><span class="line"> }<span class="comment">//3</span></span><br><span class="line"><span class="comment">//**************************************************************</span></span><br><span class="line"> <span class="keyword">if</span>(V[_i<<<span class="number">1</span>]+V[(_i<<<span class="number">1</span>)+<span class="number">1</span>]!=<span class="number">3</span>){res_ex=<span class="number">0</span>;<span class="keyword">break</span>;}</span><br><span class="line"> }<span class="comment">//2</span></span><br><span class="line">}<span class="comment">//1</span></span><br><span class="line"><span class="comment">// Node ids must start from 0; 2*i and 2*i+1 form one SAT pair</span></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">main</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>{</span><br><span class="line"> <span class="keyword">int</span> M;</span><br><span class="line"> <span class="keyword">int</span> x,y;</span><br><span class="line"> <span class="keyword">while</span>(<span class="built_in">scanf</span>(<span class="string">"%d%d"</span>,&_n,&M)!=EOF)</span><br><span class="line"> {</span><br><span class="line"> n=_n<<<span class="number">1</span>;</span><br><span class="line"> init_d();</span><br><span class="line"> <span class="keyword">while</span>(M--)</span><br><span class="line"> {</span><br><span class="line"> <span class="built_in">scanf</span>(<span class="string">"%d%d"</span>,&x,&y);</span><br><span class="line"> x--;</span><br><span class="line"> y--;</span><br><span class="line"> <span class="keyword">if</span>(x!=(y^<span class="number">1</span>))</span><br><span class="line"> {</span><br><span class="line"> add_edge(x,y^<span 
class="number">1</span>);</span><br><span class="line"> add_edge(y,x^<span class="number">1</span>);</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> solve();</span><br><span class="line"> <span class="keyword">if</span>(res_ex)</span><br><span class="line"> {</span><br><span class="line"> <span class="keyword">for</span>(<span class="keyword">int</span> i=<span class="number">0</span>;i<n;i++)<span class="comment">// V: 0 = undetermined, 1 = chosen, 2 = not chosen</span></span><br><span class="line"> <span class="keyword">if</span>(V[i]==<span class="number">1</span>)<span class="built_in">printf</span>(<span class="string">"%d\n"</span>,i+<span class="number">1</span>);</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">else</span> <span class="built_in">printf</span>(<span class="string">"NIE\n"</span>);</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment">// An alternative solution found online</span></span><br><span class="line"><span class="comment">#include <cstdio></span></span><br><span class="line"><span class="comment">#include <iostream></span></span><br><span class="line"><span class="comment">#include <cstring></span></span><br><span class="line"><span class="comment">using namespace std;</span></span><br><span class="line"><span class="comment">const int MAXN = 16005;</span></span><br><span class="line"><span class="comment"></span></span><br><span class="line"><span class="comment">struct Edge{</span></span><br><span class="line"><span class="comment"> int v, nxt;</span></span><br><span class="line"><span class="comment">};</span></span><br><span class="line"><span class="comment"></span></span><br><span class="line"><span class="comment">int cnt, ecnt, 
n, m;</span></span><br><span class="line"><span class="comment">Edge edge[4 * MAXN];</span></span><br><span class="line"><span class="comment">int head[MAXN], col[MAXN], res[MAXN];</span></span><br><span class="line"><span class="comment"></span></span><br><span class="line"><span class="comment">void init(){</span></span><br><span class="line"><span class="comment"> cnt = ecnt = 0;</span></span><br><span class="line"><span class="comment"> memset(res, 0, sizeof(res));</span></span><br><span class="line"><span class="comment"> memset(col, 0, sizeof(col));</span></span><br><span class="line"><span class="comment"> memset(edge, 0, sizeof(edge));</span></span><br><span class="line"><span class="comment"> memset(head, -1, sizeof(head));</span></span><br><span class="line"><span class="comment">}</span></span><br><span class="line"><span class="comment"></span></span><br><span class="line"><span class="comment">void addEdge(int u, int v){</span></span><br><span class="line"><span class="comment"> edge[ecnt].v = v;</span></span><br><span class="line"><span class="comment"> edge[ecnt].nxt = head[u];</span></span><br><span class="line"><span class="comment"> head[u] = ecnt++;</span></span><br><span class="line"><span class="comment">}</span></span><br><span class="line"><span class="comment"></span></span><br><span class="line"><span class="comment">bool dfs(int u){</span></span><br><span class="line"><span class="comment"> if(col[u] == 2) return false;</span></span><br><span class="line"><span class="comment"> if(col[u] == 1) return true;</span></span><br><span class="line"><span class="comment"> col[u] = 1;</span></span><br><span class="line"><span class="comment"> col[u ^ 1] = 2;</span></span><br><span class="line"><span class="comment"> res[cnt++] = u;</span></span><br><span class="line"><span class="comment"> for(int i = head[u]; i + 1; i = edge[i].nxt){</span></span><br><span class="line"><span class="comment"> int v = edge[i].v;</span></span><br><span 
class="line"><span class="comment"> if(!dfs(v)) return false;</span></span><br><span class="line"><span class="comment"> }</span></span><br><span class="line"><span class="comment"> return true;</span></span><br><span class="line"><span class="comment">}</span></span><br><span class="line"><span class="comment"></span></span><br><span class="line"><span class="comment">bool solve(){</span></span><br><span class="line"><span class="comment"> memset(col, 0, sizeof(col));</span></span><br><span class="line"><span class="comment"> for(int i = 0; i < n; i++){</span></span><br><span class="line"><span class="comment"> if(col[i]) continue;</span></span><br><span class="line"><span class="comment"> cnt = 0;</span></span><br><span class="line"><span class="comment"> if(!dfs(i)){</span></span><br><span class="line"><span class="comment"> for(int j = 0; j < cnt; j++){</span></span><br><span class="line"><span class="comment"> col[res[j]] = 0;</span></span><br><span class="line"><span class="comment"> col[res[j] ^ 1] = 0;</span></span><br><span class="line"><span class="comment"> }</span></span><br><span class="line"><span class="comment"> if(!dfs(i ^ 1)) return false;</span></span><br><span class="line"><span class="comment"> }</span></span><br><span class="line"><span class="comment"> }</span></span><br><span class="line"><span class="comment"> return true;</span></span><br><span class="line"><span class="comment">}</span></span><br><span class="line"><span class="comment"></span></span><br><span class="line"><span class="comment">int main(){</span></span><br><span class="line"><span class="comment"> while(scanf("%d%d", &n, &m) != EOF){</span></span><br><span class="line"><span class="comment"> init();</span></span><br><span class="line"><span class="comment"> n <<= 1;</span></span><br><span class="line"><span class="comment"> int u, v;</span></span><br><span class="line"><span class="comment"> for(int i = 0; i < m; i++){</span></span><br><span class="line"><span 
class="comment"> scanf("%d%d", &u, &v);</span></span><br><span class="line"><span class="comment"> u--, v--;</span></span><br><span class="line"><span class="comment"> addEdge(u, v ^ 1);</span></span><br><span class="line"><span class="comment"> addEdge(v, u ^ 1);</span></span><br><span class="line"><span class="comment"> }</span></span><br><span class="line"><span class="comment"> if(solve()){</span></span><br><span class="line"><span class="comment"> for(int i = 0; i < n; i++)</span></span><br><span class="line"><span class="comment"> if(col[i] == 1) printf("%d\n", i + 1);</span></span><br><span class="line"><span class="comment"> } else printf("NIE\n");</span></span><br><span class="line"><span class="comment"> }</span></span><br><span class="line"><span class="comment"> return 0;</span></span><br><span class="line"><span class="comment">}</span></span><br><span class="line"><span class="comment">*/</span></span><br></pre></td></tr></table></figure><h3 id="只输出一组可行解-O-n-m"><a href="#只输出一组可行解-O-n-m" class="headerlink" title="只输出一组可行解(O(n + m))"></a>Outputting Any One Feasible Solution (O(n + m))</h3><p>According to 《挑战程序设计竞赛》 (the Programming Contest Challenge Book), if no variable x lies in the same strongly connected component (SCC) as its negation ¬x, then for every Boolean variable x, the rule $$ x \text{ is true} \iff \text{the SCC containing } x \text{ comes after the SCC containing } \lnot x \text{ in topological order} $$ gives an assignment of the Boolean variables that makes the formula true.</p><h3 id="一些例题"><a href="#一些例题" class="headerlink" title="一些例题"></a>Some Example Problems</h3><h4 id="POJ-2117"><a href="#POJ-2117" class="headerlink" title="POJ 2117"></a><a href="http://poj.org/submit?problem_id=2117">POJ 2117</a></h4><p>To be continued...</p>]]></content>
<summary type="html"><h2 id="基本概念"><a href="#基本概念" class="headerlink" title="基本概念"></a>Basic Concepts</h2><h3 id="无向图"><a href="#无向图" class="headerlink" title="无向图"></a>Undirected Graphs</h3></summary>
<category term="算法" scheme="https://andrewei1316.github.io/categories/%E7%AE%97%E6%B3%95/"/>
<category term="图论" scheme="https://andrewei1316.github.io/categories/%E7%AE%97%E6%B3%95/%E5%9B%BE%E8%AE%BA/"/>
<category term="图论" scheme="https://andrewei1316.github.io/tags/%E5%9B%BE%E8%AE%BA/"/>
<category term="连通性" scheme="https://andrewei1316.github.io/tags/%E8%BF%9E%E9%80%9A%E6%80%A7/"/>
</entry>
<entry>
<title>Introduction to the C++ STL</title>
<link href="https://andrewei1316.github.io/2016/04/06/cppSTL/"/>
<id>https://andrewei1316.github.io/2016/04/06/cppSTL/</id>
<published>2016-04-06T12:19:31.000Z</published>
<updated>2018-04-09T01:16:07.246Z</updated>
<content type="html"><![CDATA[<p>This article is reposted from: <a href="https://www.zybuluo.com/comzyh/note/138935">https://www.zybuluo.com/comzyh/note/138935</a></p><h2 id="参考网站"><a href="#参考网站" class="headerlink" title="参考网站"></a>Reference Sites</h2><p>These sites are good places to look up whatever you need:</p><ul><li><a href="http://www.cplusplus.com/reference/">cplusplus</a></li><li><a href="http://en.cppreference.com/w/">cppreference</a></li><li><a href="http://zh.cppreference.com/w/%E9%A6%96%E9%A1%B5">cppreference (Chinese)</a></li></ul><p>The STL provides many handy containers and algorithms.<br>This article briefly introduces the following:</p><a id="more"></a><ul><li><code>vector</code> (a vector, which can be thought of as a variable-length array)</li><li><code>utility</code> (pair)</li><li><code>algorithm</code> (sort)</li><li><code>queue</code> (queue, priority_queue)</li><li><code>list</code> (linked list)</li><li><code>map</code> (key-value mapping)</li><li><code>bitset</code> (an array of bits packed into integers)</li></ul><h2 id="C-模板简单介绍"><a href="#C-模板简单介绍" class="headerlink" title="C++ 模板简单介绍"></a>A Brief Introduction to C++ Templates</h2><p>Here is how cplusplus declares <a href="http://www.cplusplus.com/reference/vector/">vector</a>:</p><blockquote><p>template < class T, class Alloc = allocator< T > > class vector; // generic template</p></blockquote><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">vector</span><<span class="keyword">int</span>> arr; </span><br><span class="line"><span class="built_in">queue</span><<span class="keyword">int</span>> q;</span><br><span class="line"><span class="built_in">vector</span><<span class="built_in">pair</span><<span class="keyword">int</span>,<span class="keyword">int</span>> > points; <span class="comment">// note the space between the two >; it is required before C++11</span></span><br><span class="line"><span class="built_in">map</span><<span class="built_in">string</span>,<span class="keyword">int</span>> reflect;</span><br></pre></td></tr></table></figure><p>A typedef can shorten the code, at some cost to readability:</p><figure class="highlight 
c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="built_in">pair</span><<span class="keyword">int</span>,<span class="keyword">int</span>> pii;</span><br><span class="line"><span class="built_in">vector</span><pii> points;</span><br></pre></td></tr></table></figure><h3 id="iterator迭代器"><a href="#iterator迭代器" class="headerlink" title="iterator迭代器"></a>Iterators</h3><p>Iterators are an important part of C++. This article does not cover them in depth, but many STL container methods return iterators, so a basic understanding is necessary.<br>For example, a <code>vector</code> can be traversed like this:</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">vector</span><<span class="keyword">int</span>> n;</span><br><span class="line"><span class="keyword">for</span>(<span class="built_in">vector</span><<span class="keyword">int</span>>::iterator it=n.begin();it!=n.end();it++)</span><br><span class="line"> <span class="built_in">cout</span> << *it << <span class="built_in">endl</span>;</span><br></pre></td></tr></table></figure><p>The element an <code>iterator</code> points to can be accessed directly with <code>*it</code>; its members can also be accessed with <code>-></code>:</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">vector</span><<span class="built_in">pair</span><<span class="keyword">int</span>,<span class="keyword">int</span>> > points;</span><br><span class="line"><span class="keyword">for</span> (<span class="built_in">vector</span><<span class="built_in">pair</span><<span class="keyword">int</span>,<span class="keyword">int</span>> >::iterator it = points.begin();it != points.end();it++)</span><br><span class="line"> <span class="built_in">cout</span> << <span 
class="string">"x: "</span> << it->first << <span class="string">", y: "</span> << (*it).second << <span class="built_in">endl</span>;</span><br></pre></td></tr></table></figure><p>Of course, arrays are not normally traversed this way; this only demonstrates how an <code>iterator</code> is used.</p><h3 id="vector"><a href="#vector" class="headerlink" title="vector"></a>vector</h3><p>vector is one of the most commonly used data structures in C++. It behaves like a variable-length array, and its performance is not noticeably different from that of a plain array.<br>See the cplusplus reference for <a href="http://www.cplusplus.com/reference/vector/">vector</a>.</p><p>A common use: building an adjacency list.</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><iostream></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><vector></span></span></span><br><span class="line"><span class="keyword">using</span> <span class="keyword">namespace</span> <span class="built_in">std</span>;</span><br><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> MAXN = <span class="number">100</span>;</span><br><span class="line"><span class="built_in">vector</span><<span class="keyword">int</span>> tab[MAXN+<span class="number">1</span>];</span><br><span class="line"><span class="keyword">int</span> N, M; <span class="comment">// number of nodes and number of edges</span></span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">main</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>{</span><br><span class="line"> <span class="built_in">cin</span> >> N >> M;</span><br><span class="line"> <span class="keyword">for</span> (<span 
class="keyword">int</span> i=<span class="number">1</span>;i<=N;i++)</span><br><span class="line"> tab[i].clear();</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">0</span>;i<M;i++)</span><br><span class="line"> {</span><br><span class="line"> <span class="keyword">int</span> a,b;</span><br><span class="line"> <span class="built_in">cin</span> >> a >> b;</span><br><span class="line"> tab[a].push_back(b);</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>Let's look at what happens in the code above.</p><p>Declaration: <code>vector<int> array</code> declares a single <code>vector</code>, whereas <code>vector<int> tab[101]</code> declares an array of vectors. The 0th edge leaving node 5 is simply <code>tab[5][0]</code>.<br><code>tab[i].clear()</code> empties a vector. The call has no effect in this program, but many problems contain multiple test cases, and forgetting to clear the adjacency list left over from the previous case leads to painful wrong answers.<br><code>tab[a].push_back(b)</code> appends an element with value b to the end of the vector <code>tab[a]</code>. So how is this adjacency list used?</p><p>Traverse all nodes adjacent to node a:</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">vector</span><<span class="keyword">int</span>> tab[<span class="number">100</span>];</span><br><span class="line"><span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">0</span>;i<tab[a].size();i++)</span><br><span class="line"> <span class="built_in">cout</span> << tab[a][i] << <span class="built_in">endl</span>;</span><br></pre></td></tr></table></figure><p>Watch out for this pitfall:</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">0</span>;i <= tab[a].size()<span class="number">-1</span>;i++)</span><br><span class="line"> <span class="built_in">cout</span> << tab[a][i] << <span 
class="built_in">endl</span>;</span><br></pre></td></tr></table></figure><p>This can blow up, because <code>size()</code> of a <code>vector</code> returns a <code>size_t</code>, which is an unsigned integer type (32 bits wide on 32-bit platforms, usually 64 bits on 64-bit ones). If the <code>vector</code> is empty, <code>size()</code> returns <code>0</code>, and the unsigned subtraction <code>0 - 1</code> wraps around to the largest representable value (<code>4294967295</code> for a 32-bit type). The loop then reads out of bounds and the program cheerfully dies with a runtime error (RE).</p><h4 id="其他重要的成员"><a href="#其他重要的成员" class="headerlink" title="其他重要的成员"></a>Other important members</h4><ul><li><code>vector::resize()</code>: if you want an array of length 100 right away instead of calling push_back one element at a time, just write <code>xxx.resize(100)</code></li><li><code>vector::begin()</code> returns an iterator to the first element</li><li><code>vector::end()</code> returns the past-the-end iterator; note that it does not point at the last element but at the position one past it</li></ul><h3 id="pair"><a href="#pair" class="headerlink" title="pair"></a><a href="http://www.cplusplus.com/reference/utility/">pair</a></h3><p>Using <code>pair</code> requires <code>#include <utility></code></p><p>A <code>pair</code> represents a pair of values. It can describe things such as a 2D coordinate (x,y) or an edge in a graph. The two components of a <code>pair</code> may have different types, as shown below.</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="built_in">pair</span><<span class="keyword">int</span>,<span class="keyword">int</span>> point; <span class="comment">// describes a point</span></span><br><span class="line"><span class="keyword">typedef</span> <span class="built_in">pair</span><<span class="built_in">string</span>,<span class="keyword">int</span>> name_and_id_pair; <span class="comment">// student name and student ID</span></span><br><span class="line"><span class="keyword">typedef</span> <span class="built_in">pair</span><<span class="keyword">int</span>,<span class="keyword">double</span>> id_to_height_pair; <span class="comment">// student ID and height</span></span><br></pre></td></tr></table></figure><h4 id="如何制造-pair"><a href="#如何制造-pair" class="headerlink" title="如何制造 pair"></a>How to construct a 
pair</h4><p>The usual ways are calling the constructor and using <code>make_pair</code>.</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">pair</span><<span class="keyword">int</span>,<span class="keyword">int</span>> point1 = <span class="built_in">make_pair</span>(<span class="number">1</span>,<span class="number">1</span>);</span><br><span class="line"><span class="built_in">pair</span><<span class="keyword">double</span>,<span class="keyword">double</span>> point2 = <span class="built_in">make_pair</span>(<span class="number">2.0</span>,<span class="number">3.0</span>);</span><br><span class="line"><span class="built_in">pair</span><<span class="keyword">int</span>,<span class="keyword">int</span>> point3 = <span class="built_in">pair</span><<span class="keyword">int</span>,<span class="keyword">int</span>>(<span class="number">1.0</span>,<span class="number">2.0</span>);</span><br><span class="line"><span class="built_in">pair</span><<span class="keyword">int</span>,<span class="keyword">int</span>> point4 = <span class="built_in">make_pair</span>(<span class="number">1.0</span>,<span class="number">2.0</span>); <span class="comment">// this compiles: make_pair yields a pair<double,double>, which pair's converting constructor turns into pair<int,int>, silently truncating the doubles</span></span><br></pre></td></tr></table></figure><h4 id="如何使用-pair"><a href="#如何使用-pair" class="headerlink" title="如何使用 pair"></a>How to use pair</h4><p>A <code>pair</code> has two main members, <code>first</code> and <code>second</code>; for a <code>pair<U,V></code> their types are <code>U</code> and <code>V</code> respectively.</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">pair</span><<span class="keyword">int</span>, <span class="keyword">double</span>> yz = <span 
class="built_in">make_pair</span>(<span class="number">1</span>, <span class="number">179.99999</span>);</span><br><span class="line"><span class="built_in">cout</span> << yz.first << <span class="string">":"</span> << yz.second << <span class="built_in">endl</span>;</span><br><span class="line"><span class="built_in">cout</span> << <span class="keyword">sizeof</span>(yz.first) << <span class="string">" "</span> << <span class="keyword">sizeof</span>(yz.second) << <span class="built_in">endl</span>;</span><br></pre></td></tr></table></figure><p>Output</p><blockquote><p>1:180<br>4 8</p></blockquote><p>By default <code>cout</code> prints six significant digits, so <code>179.99999</code> is shown rounded to <code>180</code>. The sizes are those of a typical platform with a 4-byte <code>int</code> and an 8-byte <code>double</code>.</p><h3 id="sort-的姿势"><a href="#sort-的姿势" class="headerlink" title="sort 的姿势"></a>Ways to use sort</h3><p>The <code>cplusplus</code> page on <a href="http://www.cplusplus.com/reference/algorithm/sort/">sort</a><br><code>sort</code> is a very important algorithm in the <code>STL</code>. It is very efficient, and you will almost never need to write a sort by hand, so whenever you need to sort, use <code>sort</code>!</p><p><code>std::sort</code> requires <code>#include <algorithm></code></p><h4 id="基本排序姿势"><a href="#基本排序姿势" class="headerlink" title="基本排序姿势"></a><strong>Basic sorting</strong></h4><p>The prototype of the first form is shown below. It takes two <code>RandomAccessIterator</code>s and sorts the elements in the range <code>[first,last)</code>. <strong>Note that the range is half-open</strong>, that is, <strong>the position the <code>last</code> iterator points to does not take part in the sort</strong>.</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">template</span> <<span class="class"><span class="keyword">class</span> <span class="title">RandomAccessIterator</span>></span></span><br><span class="line"><span class="class"><span class="title">void</span> <span class="title">sort</span> (<span class="title">RandomAccessIterator</span> <span class="title">first</span>, <span class="title">RandomAccessIterator</span> <span class="title">last</span>);</span></span><br></pre></td></tr></table></figure><p>Sorting an <code>int</code> array in ascending order<br>A pointer is also a kind of <code>RandomAccessIterator</code>, so 
<code>sort</code> accepts raw pointers directly</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">const</span> <span class="keyword">int</span> N = <span class="number">1000</span>;</span><br><span class="line"><span class="keyword">int</span> arr[N];</span><br><span class="line">sort(arr , arr + N); <span class="comment">// arr + N points one past the end of the array, which is exactly what sort expects for last: the position after the final element</span></span><br></pre></td></tr></table></figure><h4 id="排序vector"><a href="#排序vector" class="headerlink" title="排序vector"></a><strong>Sorting a vector</strong></h4><p>A <code>vector</code> hands out iterators directly, which is very convenient</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">vector</span><<span class="keyword">int</span>> <span class="built_in">array</span>;</span><br><span class="line">sort(<span class="built_in">array</span>.begin(), <span class="built_in">array</span>.end());</span><br></pre></td></tr></table></figure><h4 id="从大到小排序"><a href="#从大到小排序" class="headerlink" title="从大到小排序"></a><strong>Sorting in descending order</strong></h4><p>Let's look at the second prototype of sort</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">template</span> <<span class="class"><span class="keyword">class</span> <span class="title">RandomAccessIterator</span>, <span class="title">class</span> <span class="title">Compare</span>></span></span><br><span class="line"><span class="class"> <span class="title">void</span> <span class="title">sort</span> (<span class="title">RandomAccessIterator</span> <span class="title">first</span>, <span class="title">RandomAccessIterator</span> <span class="title">last</span>, <span 
class="title">Compare</span> <span class="title">comp</span>);</span></span><br></pre></td></tr></table></figure><p>A third parameter appears here: <code>Compare comp</code>. <code>comp</code> is a comparator, and it can be used in many ways.<br>To sort in descending order, just pass the <code>greater</code> comparator to <code>sort</code>.</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">vector</span><<span class="keyword">double</span>> height;</span><br><span class="line">sort(height.begin(), height.end(), greater<<span class="keyword">double</span>>());</span><br></pre></td></tr></table></figure><p>Note how the comparator is used: <code>std::greater</code> is a struct template, declared in <code><functional></code>.</p><p>See the <code>cplusplus</code> page on <a href="http://www.cplusplus.com/reference/functional/greater/">std::greater</a>.</p><p>The prototype of <code>greater</code> is <code>template <class T> struct greater</code>;</p><p>What we actually need to pass is an object of type <code>greater</code>, so we call the <code>greater</code> constructor, which gives <code>greater<double>()</code>.</p><h4 id="排序-pair"><a href="#排序-pair" class="headerlink" title="排序 pair"></a><strong>Sorting pairs</strong></h4><p>Sorting <code>pair</code>s is very easy: a plain <code>sort</code> uses <code>first</code> as the primary key and <code>second</code> as the secondary key by default.</p><p>For example, to sort a series of events with the start time as the primary key and the end time as the secondary key:</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">vector</span><<span class="built_in">pair</span><<span class="keyword">int</span>,<span class="keyword">int</span>> > events;</span><br><span class="line">sort(events.begin(),events.end());</span><br></pre></td></tr></table></figure><p>Done~</p><h4 id="对自定义结构体进行排序(重载运算符方案)"><a href="#对自定义结构体进行排序(重载运算符方案)" class="headerlink" title="对自定义结构体进行排序(重载运算符方案)"></a><strong>Sorting a custom struct (operator overloading approach)</strong></h4><p>We only need to overload the <code><</code> operator of the struct.</p><p>For example, sorting events by start time:</p><figure class="highlight 
c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="class"><span class="keyword">struct</span> <span class="title">T_event</span></span></span><br><span class="line"><span class="class">{</span></span><br><span class="line"> <span class="keyword">int</span> begin_at, end_at;</span><br><span class="line"> <span class="keyword">bool</span> <span class="keyword">operator</span> <(<span class="keyword">const</span> T_event &other)<span class="keyword">const</span></span><br><span class="line"> {</span><br><span class="line"> <span class="keyword">return</span> begin_at < other.begin_at;</span><br><span class="line"> }</span><br><span class="line">};</span><br><span class="line"><span class="built_in">vector</span><T_event> events;</span><br><span class="line">sort(events.begin(),events.end());</span><br></pre></td></tr></table></figure><p>Note the two <code>const</code>s and the reference <code>&</code> in the overloaded operator. <code>const T_event &other</code> prevents the comparison from modifying <code>other</code>; the second <code>const</code> forbids the comparison from modifying the members of the struct it belongs to. Leaving out these <code>const</code> qualifiers can produce screens full of compile errors. The other operand should be passed by reference <code>&</code>, which avoids copying the struct on every comparison.</p><h4 id="双(多)关键字排序"><a href="#双(多)关键字排序" class="headerlink" title="双(多)关键字排序"></a><strong>Sorting by two (or more) keys</strong></h4><p>Suppose we need to sort a <code>vector</code> of structs with a complicated requirement: first by <code>a</code> descending, then by <code>c</code> ascending, then by <code>b</code> descending. On top of that <code>c</code> is a <code>double</code>, and two structs count as having the same <code>c</code> when their <code>c</code> values are equal after rounding down (the code uses an <code>(int)</code> cast, which matches rounding down for non-negative values). How do we do that?</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="class"><span class="keyword">struct</span> <span class="title">Three_key</span></span></span><br><span class="line"><span class="class">{</span></span><br><span class="line"> <span class="keyword">int</span> a, b;</span><br><span class="line"> <span class="keyword">double</span> c;</span><br><span class="line"> <span class="keyword">bool</span> <span class="keyword">operator</span> < (<span class="keyword">const</span> Three_key &other)<span class="keyword">const</span></span><br><span class="line"> {</span><br><span class="line"> <span class="keyword">if</span> (a != other.a)</span><br><span class="line"> <span class="keyword">return</span> a > other.a;</span><br><span class="line"> <span class="keyword">if</span> ((<span class="keyword">int</span>)c != (<span class="keyword">int</span>)other.c)</span><br><span class="line"> <span class="keyword">return</span> (<span class="keyword">int</span>)c < (<span class="keyword">int</span>)other.c;</span><br><span class="line"> <span class="keyword">return</span> b > other.b;</span><br><span class="line"> }</span><br><span class="line">};</span><br></pre></td></tr></table></figure><p>That's all it takes.</p><h4 id="使用比较函数排序"><a href="#使用比较函数排序" class="headerlink" title="使用比较函数排序"></a><strong>Sorting with a comparison function</strong></h4><p>Sometimes we need to sort the same array several times, each time by a different criterion. How do we do that?</p><p>Take coordinates: first sort by X ascending, do something, then sort by Y descending. It can be written like this:</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span 
class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">vector</span><<span class="built_in">pair</span><<span class="keyword">int</span>,<span class="keyword">int</span>> > points;</span><br><span class="line"><span class="function"><span class="keyword">bool</span> <span class="title">cmp_x_inc</span><span class="params">(<span class="keyword">const</span> <span class="built_in">pair</span><<span class="keyword">int</span>,<span class="keyword">int</span>> &p1, <span class="keyword">const</span> <span class="built_in">pair</span><<span class="keyword">int</span>,<span class="keyword">int</span>> &p2)</span></span></span><br><span class="line"><span class="function"></span>{</span><br><span class="line"> <span class="keyword">return</span> p1.first < p2.first;</span><br><span class="line">}</span><br><span class="line"><span class="function"><span class="keyword">bool</span> <span class="title">cmp_y_dec</span><span class="params">(<span class="keyword">const</span> <span class="built_in">pair</span><<span class="keyword">int</span>,<span class="keyword">int</span>> &p1, <span class="keyword">const</span> <span class="built_in">pair</span><<span class="keyword">int</span>,<span class="keyword">int</span>> &p2)</span></span></span><br><span class="line"><span class="function"></span>{</span><br><span class="line"> <span class="keyword">return</span> p1.second > p2.second;</span><br><span class="line">}</span><br><span class="line">sort(points.begin(),points.end(),cmp_x_inc);<span class="comment">// ascending by X</span></span><br><span class="line"><span class="comment">//do something...</span></span><br><span class="line">sort(points.begin(),points.end(),cmp_y_dec);<span class="comment">// descending by Y</span></span><br></pre></td></tr></table></figure><p>Just pass a pointer to the comparison function to sort~</p><h4 id="对字符串排序(使用结构体,不推荐)"><a href="#对字符串排序(使用结构体,不推荐)" 
class="headerlink" title="对字符串排序(使用结构体,不推荐)"></a><strong>Sorting strings (using a struct, not recommended)</strong></h4><p>First define the struct</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><cstring> // strcmp is declared here</span></span></span><br><span class="line"><span class="class"><span class="keyword">struct</span> <span class="title">T_String</span></span></span><br><span class="line"><span class="class">{</span></span><br><span class="line"> <span class="keyword">char</span> str[<span class="number">10000</span>];</span><br><span class="line"> <span class="keyword">bool</span> <span class="keyword">operator</span> < (<span class="keyword">const</span> T_String &s) <span class="keyword">const</span></span><br><span class="line"> {</span><br><span class="line"> <span class="keyword">return</span> <span class="built_in">strcmp</span>(str,s.str) < <span class="number">0</span>;</span><br><span class="line"> }</span><br><span class="line">};</span><br><span class="line"><span class="built_in">vector</span><T_String> strs;</span><br><span class="line">sort(strs.begin(),strs.end());</span><br></pre></td></tr></table></figure><p>This approach is simple, but it has a number of drawbacks, for example:</p><ul><li>The string is stored inside the struct, so the struct is large and swapping structs is expensive</li><li>It cannot sort constant strings</li><li>It is generally not recommended</li></ul><h4 id="对字符串排序(使用-char-数组)"><a href="#对字符串排序(使用-char-数组)" class="headerlink" title="*对字符串排序(使用 char 数组)**"></a><strong>Sorting strings (using an array of char* pointers)</strong></h4><p>Swapping strings is expensive, but the strings themselves never change. We do not need to move the strings at all; in the end we only need to be able to visit them in lexicographic order. So we can create a pointer to each string and sort the pointers with the comparison-function approach described above.</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><iostream></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><cstdio></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><algorithm></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><cstring></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><vector></span></span></span><br><span class="line"><span class="keyword">using</span> <span class="keyword">namespace</span> <span class="built_in">std</span>;</span><br><span class="line"><span class="keyword">int</span> N = <span class="number">0</span>;</span><br><span class="line"><span class="keyword">char</span> strs[<span class="number">1000</span>][<span class="number">1000</span>];</span><br><span class="line"><span class="built_in">vector</span><<span class="keyword">char</span> *> strs_sorted;</span><br><span 
class="line"><span class="function"><span class="keyword">bool</span> <span class="title">char_ptr_cmp</span><span class="params">(<span class="keyword">const</span> <span class="keyword">char</span> *a, <span class="keyword">const</span> <span class="keyword">char</span> *b)</span></span></span><br><span class="line"><span class="function"></span>{</span><br><span class="line"> <span class="keyword">return</span> <span class="built_in">strcmp</span>(a,b) < <span class="number">0</span>;</span><br><span class="line">}</span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">main</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>{</span><br><span class="line"> <span class="keyword">while</span> (N < <span class="number">1000</span> && <span class="built_in">scanf</span>(<span class="string">"%999s"</span>, strs[N]) == <span class="number">1</span>) N++; <span class="comment">// count only successful reads and stay inside the buffers</span></span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">0</span>; i < N; i++)</span><br><span class="line"> strs_sorted.push_back(strs[i]);</span><br><span class="line"> sort(strs_sorted.begin(), strs_sorted.end(), char_ptr_cmp);</span><br><span class="line"> <span class="built_in">printf</span>(<span class="string">"sorted strs are:\n"</span>);</span><br><span class="line"> <span class="keyword">for</span> (<span class="built_in">vector</span><<span class="keyword">char</span>*>::iterator it = strs_sorted.begin(); it != strs_sorted.end(); it++)</span><br><span class="line"> <span class="built_in">printf</span>(<span class="string">"%s\n"</span>, *it);</span><br><span class="line"> <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h4 id="拓展-使用自定义比较器(伪函数)"><a href="#拓展-使用自定义比较器(伪函数)" class="headerlink" title="(拓展) 使用自定义比较器(伪函数)"></a><strong>(Extension) Using a custom comparator (functor)</strong></h4><p>Suppose we define a struct that contains an int array of length 10</p><figure class="highlight 
c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="class"><span class="keyword">struct</span> <span class="title">T_arr</span></span></span><br><span class="line"><span class="class">{</span></span><br><span class="line"> <span class="keyword">int</span> arr[<span class="number">10</span>];</span><br><span class="line">};</span><br><span class="line"><span class="built_in">vector</span><T_arr> <span class="built_in">array</span>;</span><br></pre></td></tr></table></figure><p>We need to sort <code>array</code> 10 times, each time using a different index <code>(arr[0],arr[1],...)</code> as the key. What do we do?</p><p>Write 10 comparison functions? Sounds totally reasonable~~ Yeah, right!</p><p>Remember the <code>greater</code> we mentioned a moment ago? <code>std::greater</code> is a struct, so let's look at its implementation</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">template</span> <<span class="class"><span class="keyword">class</span> <span class="title">T</span>> <span class="title">struct</span> <span class="title">greater</span> :</span> binary_function <T,T,<span class="keyword">bool</span>> {</span><br><span class="line"> <span class="function"><span class="keyword">bool</span> <span class="title">operator</span><span class="params">()</span> <span class="params">(<span class="keyword">const</span> T& x, <span class="keyword">const</span> T& y)</span> <span class="keyword">const</span> </span>{<span class="keyword">return</span> x>y;}</span><br><span class="line">};</span><br></pre></td></tr></table></figure><p>As we can see, <code>greater</code> overloads a strange-looking operator, <code>()</code>. When <code>sort</code> compares two elements it calls this operator on them, so we can write a struct of our own in the same way</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><cstdio></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><algorithm></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><vector></span></span></span><br><span class="line"><span class="keyword">using</span> <span class="keyword">namespace</span> <span class="built_in">std</span>;</span><br><span class="line"><span class="class"><span class="keyword">struct</span> <span class="title">T_arr</span></span></span><br><span class="line"><span class="class">{</span></span><br><span class="line"> <span class="keyword">int</span> arr[<span class="number">4</span>];</span><br><span 
class="line">};</span><br><span class="line"><span class="built_in">vector</span><T_arr> to_sort;</span><br><span class="line"><span class="class"><span class="keyword">struct</span> <span class="title">T_arr_cmp</span></span></span><br><span class="line"><span class="class">{</span></span><br><span class="line"> <span class="keyword">int</span> index;</span><br><span class="line"> T_arr_cmp(<span class="keyword">int</span> index): index(index) {} <span class="comment">// constructor: remembers which index to compare by</span></span><br><span class="line"> <span class="function"><span class="keyword">bool</span> <span class="title">operator</span> <span class="params">()</span><span class="params">(<span class="keyword">const</span> T_arr &a, <span class="keyword">const</span> T_arr &b)</span> <span class="keyword">const</span></span></span><br><span class="line"><span class="function"> </span>{</span><br><span class="line"> <span class="keyword">return</span> a.arr[index] < b.arr[index];</span><br><span class="line"> }</span><br><span class="line">};</span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">main</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>{</span><br><span class="line"> <span class="keyword">int</span> N;</span><br><span class="line"> <span class="built_in">scanf</span>(<span class="string">"%d"</span>, &N);</span><br><span class="line"> to_sort.resize(N);</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">0</span>; i < N; i++)</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> j = <span class="number">0</span>; j < <span class="number">4</span>; j++)</span><br><span class="line"> <span class="built_in">scanf</span>(<span class="string">"%d"</span>, &to_sort[i].arr[j]);</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> j = <span class="number">0</span>; j < 
<span class="number">4</span>; j++)</span><br><span class="line"> {</span><br><span class="line"> sort(to_sort.begin(), to_sort.end(), T_arr_cmp(j)); <span class="comment">// pass in the comparator, using element j of the array as the key</span></span><br><span class="line"> <span class="built_in">printf</span>(<span class="string">"the to_sort sort by arr[%d] is:\n"</span>, j);</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">0</span>; i < N; i++)</span><br><span class="line"> <span class="built_in">printf</span>(<span class="string">"%4d %4d %4d %4d\n"</span>, to_sort[i].arr[<span class="number">0</span>], to_sort[i].arr[<span class="number">1</span>], to_sort[i].arr[<span class="number">2</span>], to_sort[i].arr[<span class="number">3</span>]);</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">}</span><br></pre></td></tr></table></figure><p>The code above passes a struct to <code>sort</code>. The struct has a member variable <code>index</code>, meaning that <code>arr[index]</code> is the key used for comparison, and this <code>index</code> is supplied through the constructor when the struct is created.<br>It is equivalent to</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">T_arr_cmp <span class="title">cmp</span><span class="params">(j)</span></span>;</span><br><span class="line">sort(to_sort.begin(), to_sort.end(), cmp);</span><br></pre></td></tr></table></figure><p>You can test it with the following data</p><blockquote><p>5<br>1 2 3 4<br>2 3 4 1<br>3 4 1 2<br>4 1 2 3<br>1 2 3 4</p></blockquote><h4 id="拓展-使用lambda函数进行排序(C-11)"><a href="#拓展-使用lambda函数进行排序(C-11)" class="headerlink" title="(拓展)使用lambda函数进行排序(C++11)"></a><strong>(Extension) Sorting with lambda functions (C++11)</strong></h4><p>Defining a comparison function every single time is such a pain!<br>Scrolling up to hunt for the comparison function while reading code drives you crazy and breaks your train of thought!!<br>And a comparator takes so many lines to write!!!</p><p>Fortunately, the C++11 standard provides 
<code>lambda</code> functions (anonymous functions, declared right where they are used), which make the code much nicer to read</p><p>See: <a href="http://zh.cppreference.com/w/cpp/language/lambda">Lambda functions (since C++11)</a></p><p>The example above, which used a comparator to sort the array several times, can be rewritten as follows. Remember to compile with <code>g++ xxx.cpp -std=c++11</code> (to enable the C++11 standard)</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><cstdio></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><algorithm></span></span></span><br><span class="line"><span class="meta">#<span class="meta-keyword">include</span> <span class="meta-string"><vector></span></span></span><br><span class="line"><span class="keyword">using</span> <span class="keyword">namespace</span> <span class="built_in">std</span>;</span><br><span class="line"><span class="class"><span class="keyword">struct</span> <span class="title">T_arr</span></span></span><br><span class="line"><span class="class">{</span></span><br><span 
class="line"> <span class="keyword">int</span> arr[<span class="number">4</span>];</span><br><span class="line">};</span><br><span class="line"><span class="built_in">vector</span><T_arr> to_sort;</span><br><span class="line"><span class="function"><span class="keyword">int</span> <span class="title">main</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>{</span><br><span class="line"> <span class="keyword">int</span> N;</span><br><span class="line"> <span class="built_in">scanf</span>(<span class="string">"%d"</span>, &N);</span><br><span class="line"> to_sort.resize(N);</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">0</span>; i < N; i++)</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> j = <span class="number">0</span>; j < <span class="number">4</span>; j++)</span><br><span class="line"> <span class="built_in">scanf</span>(<span class="string">"%d"</span>, &to_sort[i].arr[j]);</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> j = <span class="number">0</span>; j < <span class="number">4</span>; j++)</span><br><span class="line"> {</span><br><span class="line"> sort(to_sort.begin(), to_sort.end(), [&j](<span class="keyword">const</span> T_arr &a, <span class="keyword">const</span> T_arr &b)-><span class="keyword">bool</span></span><br><span class="line"> {</span><br><span class="line"> <span class="keyword">return</span> a.arr[j] < b.arr[j]; </span><br><span class="line"> }); </span><br><span class="line"> <span class="built_in">printf</span>(<span class="string">"the to_sort sort by arr[%d] is:\n"</span>, j);</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">0</span>; i < N; i++)</span><br><span class="line"> <span class="built_in">printf</span>(<span class="string">"%4d 
%4d %4d %4d\n"</span>, to_sort[i].arr[<span class="number">0</span>], to_sort[i].arr[<span class="number">1</span>], to_sort[i].arr[<span class="number">2</span>], to_sort[i].arr[<span class="number">3</span>]);</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<p>Here is the first form of the lambda syntax listed on cppreference:</p><blockquote><p>[ capture ] ( params ) mutable exception attribute -> ret { body }</p></blockquote><p>Now compare that with what we passed as the last argument to <code>sort</code>:</p>
<figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">sort(to_sort.begin(), to_sort.end(), [&j](<span class="keyword">const</span> T_arr &a, <span class="keyword">const</span> T_arr &b)-><span class="keyword">bool</span></span><br><span class="line">{</span><br><span class="line">    <span class="keyword">return</span> a.arr[j] < b.arr[j];</span><br><span class="line">});</span><br></pre></td></tr></table></figure>
<p>First, <code>[&j]</code> captures <code>j</code> by reference, so the comparison can use the <code>j</code> defined outside the <code>lambda</code> directly; there is no need to build an awkward comparator object and pass an <code>index</code> into it.<br>The rest is the same as the function-based comparison described earlier, except that the function is defined at the call site and has no name.</p><p>The earlier program that sorts strings through pointers can also be rewritten with a lambda:</p>
<figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">sort(strs_sorted.begin(), strs_sorted.end(), [](<span class="keyword">const</span> <span class="keyword">char</span> *a, <span class="keyword">const</span> <span class="keyword">char</span> *b)-><span class="keyword">bool</span></span><br><span class="line">{</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">strcmp</span>(a,b) < <span class="number">0</span>;</span><br><span class="line">});</span><br></pre></td></tr></table></figure>
<p>Lambdas really are handy!</p>
<h3 id="queue"><a href="#queue" class="headerlink" title="queue"></a>queue</h3><p><code>queue</code> is the queue class provided by the <code>STL</code>; it has many advantages over a hand-written queue.</p><p><code>std::queue</code> requires <code>#include <queue></code>.</p><p>The main members of <code>queue</code>:</p>
<ul><li><p><code>push(const value_type& val)</code> appends an element to the back of the queue</p></li><li><p><code>pop()</code> removes the front element</p></li><li><p><code>front()</code> returns the front element</p></li><li><p><code>empty()</code> reports whether the queue is empty</p></li></ul>
<p>A simple demonstration:</p>
<figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">queue</span><<span class="keyword">int</span>> q;</span><br><span class="line">q.push(<span class="number">0</span>);</span><br><span class="line"><span class="keyword">while</span> (!q.empty())</span><br><span class="line">{</span><br><span class="line">    <span class="keyword">int</span> h = q.front();</span><br><span class="line">    q.pop();</span><br><span class="line">    <span class="keyword">if</span> (h < <span class="number">100</span>)</span><br><span class="line">        q.push(h+<span class="number">1</span>);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">"%d\n"</span>,h);</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<p>Note: always check <code>empty()</code> before calling <code>pop()</code>; you only get one chance. Calling <code>pop()</code> on an empty <code>queue</code> is undefined behavior. In practice a later <code>empty()</code> will most likely return <code>false</code>, because <code>size</code> has wrapped around to <code>2<sup>32</sup>−1</code>, and after that anything strange can happen.</p>
<h3 id="priority-queue"><a href="#priority-queue" class="headerlink" title="priority_queue"></a>priority_queue</h3><p>As the name suggests, this is a priority queue, a very important data structure in competitive programming; algorithms such as Dijkstra's cannot do without it.<br>For reference, see the cplusplus pages for <a href="http://www.cplusplus.com/reference/queue/priority_queue/">priority_queue</a> and its <a href="http://www.cplusplus.com/reference/queue/priority_queue/priority_queue/">constructor</a>.</p><p><code>priority_queue</code> is used in much the same way as <code>queue</code>. The main difference is that <code>front()</code> is replaced by <code>top()</code>, because <code>priority_queue</code> is implemented as a heap.</p><p>Note that <code>priority_queue</code> is a max-heap by default, meaning larger elements leave the queue first. To have smaller elements leave first, supply a comparator.</p><p>Overloading a struct's operator to get a "min-heap"</p><p>We often want elements to leave the queue in some particular priority order. In Dijkstra's algorithm, for example, the vertex with the smallest current distance should leave first. We can do that like this:</p>
<figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span 
class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="class"><span class="keyword">struct</span> <span class="title">State</span></span></span><br><span class="line"><span class="class">{</span></span><br><span class="line">    <span class="keyword">int</span> point,dis;</span><br><span class="line">    <span class="keyword">bool</span> <span class="keyword">operator</span> < (<span class="keyword">const</span> State &s)<span class="keyword">const</span></span><br><span class="line">    {</span><br><span class="line">        <span class="keyword">return</span> dis > s.dis;</span><br><span class="line">    }</span><br><span class="line">};</span><br><span class="line"><span class="built_in">priority_queue</span><State> q;</span><br></pre></td></tr></table></figure>
<p>Whatever approach you use, <code>priority_queue</code> remains a max-heap; all we did was overload the operator so that elements with a smaller <code>dis</code> compare as larger. This idiom is very common in competitive programming and is usually sufficient.</p>
<h4 id="拓展-自定义priority-queue的比较方法"><a href="#拓展-自定义priority-queue的比较方法" class="headerlink" title="(拓展)自定义priority_queue的比较方法"></a><strong>(Extension) Customizing the comparison used by priority_queue</strong></h4><p>The example above could just as well be stored in a <code>pair<int,int></code>. What if we insist on using a type like <code>pair<int,int></code>, whose operators we cannot overload ourselves?<br>Or what if we are simply not able to overload the <code><</code> operator of some struct?</p><p>Let's look at the declaration of <code>priority_queue</code> first:</p>
<figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">template</span> <<span class="class"><span class="keyword">class</span> <span class="title">T</span>, <span class="title">class</span> <span class="title">Container</span> = <span class="title">vector</span><T>,</span></span><br><span class="line"><span class="class"> <span class="title">class</span> <span class="title">Compare</span> = <span class="title">less</span><typename Container::value_type> > <span class="title">class</span> <span 
class="title">priority_queue</span>;</span></span><br></pre></td></tr></table></figure>
<p>As you can see, there are three template parameters, and the last two have defaults. To supply a custom comparator we have to spell out all three. Remember the comparator (functor) discussed under <code>sort</code> above? Pass a functor type as the third parameter:</p>
<figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="class"><span class="keyword">struct</span> <span class="title">pair_cmp</span></span></span><br><span class="line"><span class="class">{</span></span><br><span class="line">    <span class="function"><span class="keyword">bool</span> <span class="title">operator</span><span class="params">()</span><span class="params">(<span class="keyword">const</span> <span class="built_in">pair</span><<span class="keyword">int</span>, <span class="keyword">int</span>> &a, <span class="keyword">const</span> <span class="built_in">pair</span><<span class="keyword">int</span>, <span class="keyword">int</span>> &b)</span> <span class="keyword">const</span></span></span><br><span class="line"><span class="function">    </span>{</span><br><span class="line">        <span class="keyword">return</span> a.second > b.second;</span><br><span class="line">    }</span><br><span class="line">};</span><br><span class="line"><span class="built_in">priority_queue</span><<span class="built_in">pair</span><<span class="keyword">int</span>, <span class="keyword">int</span>>, <span class="built_in">vector</span><<span class="built_in">pair</span><<span class="keyword">int</span>, <span class="keyword">int</span>> >, pair_cmp> q;</span><br></pre></td></tr></table></figure>
<h3 id="set"><a href="#set" class="headerlink" title="set"></a>set</h3><p><code>set</code> is an ordered set; using it requires <code>#include <set></code>.</p><p><code>set</code> is implemented with a balanced tree, so insertion, deletion, and lookup each take <code>O(log N)</code> time. Elements cannot be changed in place; to modify one, erase it and insert the new value.</p><p><code>set</code> is often used for duplicate detection, for example to detect already-visited states during a search.</p><p>See the cplusplus page on <a href="http://www.cplusplus.com/reference/set/set/">set</a>.</p>
<h4 id="声明一个set"><a href="#声明一个set" class="headerlink" title="声明一个set"></a><strong>Declaring a set</strong></h4><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">set</span><<span class="keyword">int</span>> iset;</span><br></pre></td></tr></table></figure>
<p>Commonly used members:</p><ul><li><p><code>set::insert(val)</code> inserts an element</p></li><li><p><code>set::empty()</code> reports whether the set is empty</p></li><li><p><code>set::clear()</code> removes all elements</p></li><li><p><code>set::size()</code> returns the number of elements</p></li><li><p><code>set::erase()</code> removes elements (it can be called in three ways: by value, by iterator, or by iterator range)</p></li><li><p><code>set::find(val)</code> returns an iterator to the given element, or <code>end()</code> if it is not present</p></li><li><p><code>set::lower_bound(val)</code> returns an iterator to the first element not less than <code>val</code></p></li><li><p><code>set::upper_bound(val)</code> returns an iterator to the first element greater than <code>val</code></p></li><li><p><code>set::begin()</code> returns an iterator to the first element (iteration goes from smallest to largest)</p></li><li><p><code>set::end()</code> returns the past-the-end iterator</p></li><li><p><code>set::rbegin()</code>, <code>set::rend()</code> return reverse iterators (iteration goes from largest to smallest)</p></li><li><p><code>set::count(val)</code> returns the number of elements equal to <code>val</code><br>This can only be 1 (present) or 0 (absent), so it can be used to test whether an element exists.</p></li></ul>]]></content>
<summary type="html"><p>This article is reposted from: <a href="https://www.zybuluo.com/comzyh/note/138935">https://www.zybuluo.com/comzyh/note/138935</a></p>
<h2 id="参考网站"><a href="#参考网站" class="headerlink" title="参考网站"></a>Reference sites</h2><p>The following sites are recommended for looking up whatever you need:</p>
<ul>
<li><a href="http://www.cplusplus.com/reference/">cplusplus</a></li>
<li><a href="http://en.cppreference.com/w/">cppreference</a></li>
<li><a href="http://zh.cppreference.com/w/%E9%A6%96%E9%A1%B5">cppreference (Chinese)</a></li>
</ul>
<p>The STL provides many handy containers and algorithms.<br>Here is a brief introduction.</p></summary>
<category term="编程语言" scheme="https://andrewei1316.github.io/categories/%E7%BC%96%E7%A8%8B%E8%AF%AD%E8%A8%80/"/>
<category term="C++" scheme="https://andrewei1316.github.io/categories/%E7%BC%96%E7%A8%8B%E8%AF%AD%E8%A8%80/C/"/>
<category term="C++" scheme="https://andrewei1316.github.io/tags/C/"/>
<category term="STL" scheme="https://andrewei1316.github.io/tags/STL/"/>
</entry>
</feed>