\documentclass[a4paper]{article}
\def\npart {II}
\def\nterm {Michaelmas}
\def\nyear {2016}
\def\nlecturer {A. Ashton}
\def\ncourse {Integrable Systems}
\input{header}
\begin{document}
\maketitle
{\small
\noindent\emph{Part IB Methods, and Complex Methods or Complex Analysis are essential; Part II Classical Dynamics is desirable.}
\vspace{10pt}
\noindent Integrability of ordinary differential equations: Hamiltonian systems and the Arnol'd--Liouville Theorem (sketch of proof). Examples.\hspace*{\fill}[3]
\vspace{5pt}
\noindent Integrability of partial differential equations: The rich mathematical structure and the universality of the integrable nonlinear partial differential equations (Korteweg-de Vries, sine-Gordon). B\"acklund transformations and soliton solutions.\hspace*{\fill}[2]
\vspace{5pt}
\noindent The inverse scattering method: Lax pairs. The inverse scattering method for the KdV equation, and other integrable PDEs. Multi soliton solutions. Zero curvature representation. \hspace*{\fill}[6]
\vspace{5pt}
\noindent Hamiltonian formulation of soliton equations.\hspace*{\fill}[2]
\vspace{5pt}
\noindent Painlev\'e equations and Lie symmetries: Symmetries of differential equations, the ODE reductions of certain integrable nonlinear PDEs, Painlev\'e equations.\hspace*{\fill}[3]%
}
\tableofcontents
\setcounter{section}{-1}
\section{Introduction}
What is an integrable system? Unfortunately, an integrable system is something mathematicians have not yet managed to define properly. Intuitively, an integrable system is a differential equation we can ``integrate up'' directly. While in theory integrable systems should be very rare, it turns out that many systems arising in nature are integrable. By exploiting the fact that they are integrable, we can solve them much more easily.
\section{Integrability of ODE's}
\subsection{Vector fields and flow maps}
In the first section, we are going to look at the integrability of ODE's. Here we are going to consider a general $m$-dimensional first-order non-linear ODE. As always, restricting to first-order ODE's is not an actual restriction, since any higher-order ODE can be written as a system of first-order ODE's. At the end, we will be concerned with a special kind of ODE given by a \emph{Hamiltonian system}. However, in this section, we first give a quick overview of the general theory of ODEs.
An $m$-dimensional ODE is specified by a \term{vector field} $\mathbf{V}: \R^m \to \R^m$ and an \term{initial condition} $\mathbf{x}_0 \in \R^m$. The objective is to find some $\mathbf{x}(t) \in \R^m$, which is a function of $t \in (a, b)$ for some interval $(a, b)$ containing $0$, satisfying
\[
\dot{\mathbf{x}} = \mathbf{V}(\mathbf{x}),\quad \mathbf{x}(0) = \mathbf{x}_0.
\]
In this course, we will assume the vector field $\mathbf{V}$ is sufficiently ``nice'', so that the following result holds:
\begin{fact}
For a ``nice'' vector field $\mathbf{V}$ and any initial condition $\mathbf{x}_0$, there is always a unique solution to $\dot{\mathbf{x}} = \mathbf{V}(\mathbf{x})$, $\mathbf{x}(0) = \mathbf{x}_0$. Moreover, this solution depends smoothly (ie. infinitely differentiably) on $t$ and $\mathbf{x}_0$.
\end{fact}
It is convenient to write the solution as
\[
\mathbf{x}(t) = g^t \mathbf{x}_0,
\]
where $g^t: \R^m \to \R^m$ is called the \emph{flow map}. Since $\mathbf{V}$ is nice, we know this is a smooth map. This flow map has some nice properties:
\begin{prop}\leavevmode
\begin{enumerate}
\item $g^0 = \id$
\item $g^{t + s} = g^t g^s$
\item $(g^{t})^{-1} = g^{-t}$
\end{enumerate}
\end{prop}
If one knows group theory, then this says that $g$ is a group homomorphism from $\R$ to the group of diffeomorphisms of $\R^m$, ie. the group of smooth invertible maps $\R^m \to \R^m$.
\begin{proof}
The equality $g^0 = \id$ is by definition of $g$, and the last equality follows from the first two since $t + (-t) = 0$. To see the second, we need to show that
\[
g^{t + s}\mathbf{x}_0 = g^t (g^s \mathbf{x}_0)
\]
for any $\mathbf{x}_0$. To do so, we see that both of them, as a function of $t$, are solutions to
\[
\dot{\mathbf{x}} = \mathbf{V}(\mathbf{x}),\quad \mathbf{x}(0) = g^s \mathbf{x}_0.
\]
So the result follows since solutions are unique.
\end{proof}
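As a sanity check, we can see all of this happening in the one case where we can write the flow map down explicitly.
\begin{eg}
For a linear vector field $\mathbf{V}(\mathbf{x}) = A\mathbf{x}$, where $A$ is a constant $m \times m$ matrix, the unique solution to $\dot{\mathbf{x}} = A\mathbf{x}$, $\mathbf{x}(0) = \mathbf{x}_0$ is $\mathbf{x}(t) = e^{tA}\mathbf{x}_0$. So the flow map is $g^t = e^{tA}$, and the properties above reduce to familiar facts about the matrix exponential:
\[
e^{0A} = I,\quad e^{(t + s)A} = e^{tA} e^{sA},\quad (e^{tA})^{-1} = e^{-tA}.
\]
\end{eg}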
We say that $\mathbf{V}$ is the \term{infinitesimal generator} of the flow $g^t$. This is because we can Taylor expand:
\[
\mathbf{x}(\varepsilon) = g^\varepsilon \mathbf{x}_0 = \mathbf{x}(0) + \varepsilon \dot{\mathbf{x}}(0) + o(\varepsilon) = \mathbf{x}_0 + \varepsilon \mathbf{V}(\mathbf{x}_0) + o(\varepsilon).
\]
Given vector fields $\mathbf{V}_1, \mathbf{V}_2$, one natural question to ask is whether their flows commute, ie. if they generate $g_1^t$ and $g_2^s$, then must we have
\[
g_1^t g_2^s \mathbf{x}_0 = g_2^s g_1^t \mathbf{x}_0
\]
for all $\mathbf{x}_0$? In general, this need not be true, so we might be interested to find out if this happens to be true for particular $\mathbf{V}_1, \mathbf{V}_2$. However, often, it is difficult to check this directly, because differential equations are generally hard to solve, and we will probably have a huge trouble trying to find explicit expressions for $g_1$ and $g_2$.
Thus, we would want to be able to consider this problem at an infinitesimal level, ie. just by looking at $\mathbf{V}_1, \mathbf{V}_2$ themselves. It turns out the answer is given by the commutator:
\begin{defi}[Commutator]\index{commutator}
For two vector fields $\mathbf{V}_1, \mathbf{V}_2: \R^m \to \R^m$, we define a third vector field called the \emph{commutator} by
\[
[\mathbf{V}_1, \mathbf{V}_2] = \left(\mathbf{V}_1 \cdot \frac{\partial}{\partial \mathbf{x}}\right) \mathbf{V}_2 - \left(\mathbf{V}_2 \cdot \frac{\partial}{\partial \mathbf{x}}\right) \mathbf{V}_1,
\]
where we write
\[
\frac{\partial}{\partial \mathbf{x}} = \left(\frac{\partial}{\partial x_1}, \cdots, \frac{\partial}{\partial x_m}\right)^T.
\]
More explicitly, the $i$th component is given by
\[
[\mathbf{V}_1, \mathbf{V}_2]_i = \sum_{j = 1}^m \left((\mathbf{V}_1)_j \frac{\partial}{\partial x_j} (\mathbf{V}_2)_i - (\mathbf{V}_2)_j \frac{\partial}{\partial x_j} (\mathbf{V}_1)_i\right).
\]
\end{defi}
The result we have is
\begin{prop}
Let $\mathbf{V}_1, \mathbf{V}_2$ be vector fields with flows $g_1^t$ and $g_2^s$. Then we have
\[
[\mathbf{V}_1, \mathbf{V}_2] = 0 \quad\Longleftrightarrow\quad g_1^t g_2^s = g_2^s g_1^t.
\]
\end{prop}
\begin{proof}
See example sheet 1.
\end{proof}
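\begin{eg}
To see the proposition in action, take $m = 2$, and consider the vector fields $\mathbf{V}_1 = (1, 0)^T$ and $\mathbf{V}_2 = (0, x_1)^T$. Solving the ODEs directly, the flows are
\[
g_1^t(x_1, x_2) = (x_1 + t, x_2),\quad g_2^s(x_1, x_2) = (x_1, x_2 + sx_1),
\]
and these do not commute, since $g_1^t g_2^s(x_1, x_2) = (x_1 + t, x_2 + sx_1)$, while $g_2^s g_1^t (x_1, x_2) = (x_1 + t, x_2 + s(x_1 + t))$. Correspondingly, the commutator is non-zero:
\[
[\mathbf{V}_1, \mathbf{V}_2] = \frac{\partial}{\partial x_1}(0, x_1)^T - x_1 \frac{\partial}{\partial x_2}(1, 0)^T = (0, 1)^T.
\]
\end{eg}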
\subsection{Hamiltonian dynamics}
From now on, we are going to restrict to a very special kind of ODE, known as a \emph{Hamiltonian system}. To write down a general ODE, the background setting is just the space $\R^m$. We then pick a vector field, and then we get an ODE. To write down a Hamiltonian system, we need more things in the background, but conversely we need to supply less information to get the system. These Hamiltonian systems are very useful in classical dynamics, and our results here have applications in classical dynamics, but we will not go into the physical applications here.
The background setting of a Hamiltonian system is a \term{phase space} $M = \R^{2n}$. Points on $M$ are described by coordinates
\[
(\mathbf{q}, \mathbf{p}) = (q_1, \cdots, q_n, p_1, \cdots, p_n).
\]
We tend to think of the $q_i$ as the ``generalized position'' coordinates of particles, and the $p_i$ as the ``generalized momentum'' coordinates. We will often write
\[
\mathbf{x} = (\mathbf{q}, \mathbf{p})^T.
\]
It is very important to note that here we have ``paired up'' each $q_i$ with the corresponding $p_i$. In normal $\R^n$, all the coordinates are on an equal footing, but this is no longer the case here. To encode this information, we define the $2n \times 2n$ anti-symmetric matrix
\[
J =
\begin{pmatrix}
0 & I_n\\
-I_n & 0
\end{pmatrix}.
\]
We call this the \term{symplectic form}, and this is the extra structure we have for a phase space. We will later see that all the things we care about can be written in terms of $J$, but for practical purposes, we will often express them in terms of $\mathbf{p}$ and $\mathbf{q}$ instead.
The first example is the \emph{Poisson bracket}:
\begin{defi}[Poisson bracket]\index{Poisson bracket}
For any two functions $f, g: M \to \R$, we define the \emph{Poisson bracket} by
\[
\{f, g\} = \frac{\partial f}{\partial \mathbf{x}} J \frac{\partial g}{\partial \mathbf{x}} = \frac{\partial f}{\partial \mathbf{q}} \cdot \frac{\partial g}{\partial \mathbf{p}} - \frac{\partial f}{\partial \mathbf{p}} \cdot \frac{\partial g}{\partial \mathbf{q}}.
\]
\end{defi}
This has some obvious and not-so-obvious properties:
\begin{prop}\leavevmode
\begin{enumerate}
\item This is linear in each argument.
\item This is antisymmetric, ie. $\{f, g\} = - \{g, f\}$.
\item This satisfies the Leibniz property:
\[
\{f, gh\} = \{f, g\}h + \{f, h\} g.
\]
\item This satisfies the Jacobi identity:
\[
\{f, \{g, h\}\} + \{g, \{h, f\}\} + \{h, \{f, g\}\} = 0.
\]
\item We have
\[
\{q_i, q_j\} = \{p_i, p_j\} = 0,\quad \{q_i, p_j\} = \delta_{ij}.
\]
\end{enumerate}
\end{prop}
\begin{proof}
Just write out the definitions. In particular, you will be made to write out the 24 terms of the Jacobi identity in the first example sheet.
\end{proof}
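For example, the last set of relations can be read off directly from the definition: since $\frac{\partial q_i}{\partial q_k} = \delta_{ik}$ and $\frac{\partial q_i}{\partial p_k} = 0$ (and similarly with $q$ and $p$ exchanged), we have
\[
\{q_i, p_j\} = \sum_{k = 1}^n \left(\frac{\partial q_i}{\partial q_k}\frac{\partial p_j}{\partial p_k} - \frac{\partial q_i}{\partial p_k}\frac{\partial p_j}{\partial q_k}\right) = \sum_{k = 1}^n \delta_{ik}\delta_{jk} = \delta_{ij}.
\]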
We will be interested in problems on $M$ of the following form:
\begin{defi}[Hamilton's equation]\index{Hamilton's equation}
\emph{Hamilton's equation} is an equation of the form
\[
\dot{\mathbf{q}} = \frac{\partial H}{\partial \mathbf{p}},\quad \dot{\mathbf{p}} = -\frac{\partial H}{\partial \mathbf{q}}\tag{$*$}
\]
for some function $H: M \to \R$ called the \term{Hamiltonian}.
\end{defi}
Just as we think of $\mathbf{q}$ and $\mathbf{p}$ as generalized position and momentum, we tend to think of $H$ as generalized energy.
Note that given the phase space $M$, all we need to specify a Hamiltonian system is just a Hamiltonian function $H: M \to \R$, which is much less information than that needed to specify a vector field.
In terms of $J$, we can write Hamilton's equation as
\[
\dot{\mathbf{x}} = J \frac{\partial H}{\partial \mathbf{x}}.
\]
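Indeed, writing this out in blocks, we have
\[
J \frac{\partial H}{\partial \mathbf{x}} =
\begin{pmatrix}
0 & I_n\\
-I_n & 0
\end{pmatrix}
\begin{pmatrix}
\frac{\partial H}{\partial \mathbf{q}}\\
\frac{\partial H}{\partial \mathbf{p}}
\end{pmatrix}
=
\begin{pmatrix}
\frac{\partial H}{\partial \mathbf{p}}\\
-\frac{\partial H}{\partial \mathbf{q}}
\end{pmatrix},
\]
so the two block components of $\dot{\mathbf{x}} = J \frac{\partial H}{\partial \mathbf{x}}$ are exactly the two equations in $(*)$.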
We can imagine Hamilton's equation as specifying the trajectory of a particle. In this case, we might want to ask how, say, the speed of the particle changes as it evolves. In general, suppose we have a smooth function $f: M \to \R$. We want to find the value of $\frac{\d f}{\d t}$. We simply have to apply the chain rule to obtain
\[
\frac{\d f}{\d t} = \frac{\d}{\d t} f(\mathbf{x}(t)) = \frac{\partial f}{\partial \mathbf{x}} \cdot \dot{\mathbf{x}} = \frac{\partial f}{\partial \mathbf{x}} J \frac{\partial H}{\partial \mathbf{x}} = \{f, H\}.
\]
We record this result:
\begin{prop}
Let $f: M \to \R$ be a smooth function. If $\mathbf{x}(t)$ evolves according to Hamilton's equation, then
\[
\frac{\d f}{\d t} = \{f, H\}.
\]
\end{prop}
In particular, a function $f$ is constant along the flow if and only if $\{f, H\} = 0$. This is very convenient. Without a result like this, if we want to see if $f$ is a conserved quantity of the particle (ie. $\frac{\d f}{\d t} = 0$), we might have to integrate the equations of motion, and then try to find explicitly what is conserved, or perhaps mess around with the equations of motion to somehow find that $\frac{\d f}{\d t}$ vanishes. However, we now have a very systematic way of figuring out if $f$ is a conserved quantity --- we just compute $\{f, H\}$.
In particular, we automatically find that the Hamiltonian is conserved:
\[
\frac{\d H}{\d t} = \{H, H\} = 0.
\]
\begin{eg}
Consider a particle (of unit mass) with position $\mathbf{q} = (q_1, q_2, q_3)$ (in Cartesian coordinates) moving under the influence of a potential $U(\mathbf{q})$. By Newton's second law, we have
\[
\ddot{\mathbf{q}} = -\frac{\partial U}{\partial \mathbf{q}}.
\]
This is actually a Hamiltonian system. We define the momentum variables by
\[
p_i = \dot{q}_i,
\]
then we have
\[
\dot{\mathbf{x}} =
\begin{pmatrix}
\dot{\mathbf{q}}\\
\dot{\mathbf{p}}
\end{pmatrix}
=
\begin{pmatrix}
\mathbf{p}\\
-\frac{\partial U}{\partial \mathbf{q}}
\end{pmatrix}
= J \frac{\partial H}{\partial \mathbf{x}},
\]
with
\[
H = \frac{1}{2} \abs{\mathbf{p}}^2 + U(\mathbf{q}).
\]
This is just the usual energy! Indeed, we can compute
\[
\frac{\partial H}{\partial \mathbf{p}} = \mathbf{p},\quad \frac{\partial H}{\partial \mathbf{q}} = \frac{\partial U}{\partial \mathbf{q}}.
\]
\end{eg}
\begin{defi}[Hamiltonian vector field]\index{Hamiltonian vector field}
Given a Hamiltonian function $H$, the \emph{Hamiltonian vector field} is given by
\[
\mathbf{V}_H = J \frac{\partial H}{\partial \mathbf{x}}.
\]
\end{defi}
We then see that by definition, the Hamiltonian vector field generates the Hamiltonian flow. More generally, for any $f: M \to \R$, we define
\[
\mathbf{V}_f = J \frac{\partial f}{\partial \mathbf{x}}.
\]
This is the Hamiltonian vector field with respect to $f$.
We now have two bracket-like things we can form. Given two functions $f, g$, we can take the Poisson bracket to get $\{f, g\}$, and then consider its Hamiltonian vector field $\mathbf{V}_{\{f, g\}}$. On the other hand, we can first form $\mathbf{V}_f$ and $\mathbf{V}_g$, and then take the commutator of the vector fields. It turns out these are not equal, but differ by a sign.
\begin{prop}
We have
\[
[\mathbf{V}_f, \mathbf{V}_g] = - \mathbf{V}_{\{f, g\}}.
\]
\end{prop}
\begin{proof}
See first example sheet.
\end{proof}
\begin{defi}[First integral]\index{first integral}
Given a phase space $M$ with a Hamiltonian $H$, we call $f: M \to \R$ a \emph{first integral} of the Hamiltonian system if
\[
\{f, H\} = 0.
\]
\end{defi}
The reason for the term ``first integral'' is historical --- when we solve a differential equation, we integrate the equation. Every time we integrate it, we obtain a new constant. And the first constant we obtain when we integrate is known as the first integral. However, for our purposes, we can just as well think of it as a constant of motion.
\begin{eg}
Consider the two-body problem --- the Sun is fixed at the origin, and a planet has Cartesian coordinates $\mathbf{q} = (q_1, q_2, q_3)$. The equation of motion will be
\[
\ddot{\mathbf{q}} = - \frac{\mathbf{q}}{|\mathbf{q}|^3}.
\]
This is equivalent to the Hamiltonian system $\mathbf{p} = \dot{\mathbf{q}}$, with
\[
H = \frac{1}{2} |\mathbf{p}|^2 - \frac{1}{|\mathbf{q}|}.
\]
We have an angular momentum given by
\[
\mathbf{L} = \mathbf{q} \wedge \mathbf{p}.
\]
Working with coordinates, we have
\[
L_i = \varepsilon_{ijk} q_j p_k.
\]
We then have (with implicit summation)
\begin{align*}
\{L_i, H\} &= \frac{\partial L_i}{\partial q_\ell}\frac{\partial H}{\partial p_\ell} - \frac{\partial L_i}{\partial p_\ell} \frac{\partial H}{\partial q_\ell}\\
&= \varepsilon_{ijk} \left(p_k \delta_{\ell j}p_\ell + \frac{q_j q_k}{|\mathbf{q}|^3}\right)\\
&= \varepsilon_{ijk} \left(p_k p_j + \frac{q_j q_k}{|\mathbf{q}|^3}\right)\\
&= 0,
\end{align*}
where we know the thing vanishes because we contracted a symmetric tensor with an antisymmetric one. So this is a first integral.
Less interestingly, we know $H$ is also a first integral. In general, some Hamiltonians have many, many first integrals.
\end{eg}
Our objective for the remainder of the chapter is to show that if our Hamiltonian system has enough first integrals, then we can find a change of coordinates so that the equations of motion become ``trivial''. However, we need to impose some constraints on the integrals for this to be true. We will need the following definitions:
\begin{defi}[Involution]\index{involution}
We say that two first integrals $F, G$ are in \emph{involution} if $\{F, G\} = 0$ (so $F$ and $G$ ``\emph{Poisson commute}'').
\end{defi}
\begin{defi}[Independent first integrals]\index{independent first integrals}
A collection of functions $f_i: M \to \R$ is independent if at each $\mathbf{x} \in M$, the vectors $\frac{\partial f_i}{\partial \mathbf{x}}$ for $i = 1, \cdots, n$ are linearly independent.
\end{defi}
In general, we will say a system is ``integrable'' if we can find a change of coordinates so that the equations of motion become ``trivial'' and we can just integrate them up. This is a bit vague, so we will instead define integrability in terms of the existence of first integrals, and then later see that if these conditions are satisfied, then we can indeed integrate the system up:
\begin{defi}[Integrable system]\index{integrable system}
A $2n$-dimensional Hamiltonian system $(M, H)$ is \emph{integrable} if there exist $n$ first integrals $\{f_i\}_{i = 1}^n$ that are independent and in involution (ie. $\{f_i, f_j\} = 0$ for all $i, j$).
\end{defi}
The word independent is very important, or else people will cheat, eg. by taking $H, 2H, e^H, H^2, \cdots$.
\begin{eg}
Two-dimensional Hamiltonian systems are always integrable (at least where $\frac{\partial H}{\partial \mathbf{x}} \not= 0$): we can just take $f_1 = H$, which is automatically in involution with itself.
\end{eg}
\subsection{Canonical transformations}
We now come to the main result of the chapter. We will show that we can indeed integrate up integrable systems. We are going to show that there is a clever choice of coordinates such that Hamilton's equations become ``trivial''. However, recall that the coordinates in a Hamiltonian system are not arbitrary. We have somehow ``paired up'' $q_i$ and $p_i$. So we want to only consider coordinate changes that somehow respect this pairing.
There are many ways we can define what it means to ``respect'' the pairing. We will pick a simple definition --- we require that it preserves the form of Hamilton's equation.
Suppose we had a general coordinate change $(\mathbf{q}, \mathbf{p}) \mapsto (\mathbf{Q}(\mathbf{q}, \mathbf{p}), \mathbf{P}(\mathbf{q}, \mathbf{p}))$.
\begin{defi}[Canonical transformation]\index{canonical transformation}
A coordinate change $(\mathbf{q}, \mathbf{p}) \mapsto (\mathbf{Q}, \mathbf{P})$ is called \emph{canonical} if it leaves Hamilton's equations invariant, ie. the equations in the original coordinates
\[
\dot{\mathbf{q}} = \frac{\partial H}{\partial \mathbf{p}},\quad \dot{\mathbf{p}} = -\frac{\partial H}{\partial \mathbf{q}}
\]
are equivalent to
\[
\dot{\mathbf{Q}} = \frac{\partial \tilde{H}}{\partial \mathbf{P}},\quad \dot{\mathbf{P}} = -\frac{\partial \tilde{H}}{\partial \mathbf{Q}},
\]
where $\tilde{H}(\mathbf{Q}, \mathbf{P}) = H(\mathbf{q}, \mathbf{p})$.
\end{defi}
If we write $\mathbf{x} = (\mathbf{q}, \mathbf{p})$ and $\mathbf{y} = (\mathbf{Q}, \mathbf{P})$, then this is equivalent to asking for
\[
\dot{\mathbf{x}} = J \frac{\partial H}{\partial \mathbf{x}} \quad\Longleftrightarrow\quad \dot{\mathbf{y}} = J \frac{\partial \tilde{H}}{ \partial \mathbf{y}}.
\]
\begin{eg}
If we just swap the $\mathbf{q}$ and $\mathbf{p}$ around, then the equations change by a sign. So this is not a canonical transformation.
\end{eg}
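Concretely, the swap is the linear map with matrix
\[
A =
\begin{pmatrix}
0 & I_n\\
I_n & 0
\end{pmatrix},
\]
and a quick computation gives $AJA^T = -J \not= J$, consistent with the criterion we derive next.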
\begin{eg}
The simplest possible case of a canonical transformation is a linear transformation. Consider a linear change of coordinates given by
\[
\mathbf{x} \mapsto \mathbf{y}(\mathbf{x}) = A\mathbf{x}.
\]
We claim that this is canonical iff $AJA^T = J$, ie. that $A$ is \emph{symplectic}\index{symplectic transformation}.
Indeed, by linearity, we have
\[
\dot{\mathbf{y}} = A\dot{\mathbf{x}} = AJ\frac{\partial H}{\partial \mathbf{x}}.
\]
Setting $\tilde{H}(\mathbf{y}) = H(\mathbf{x})$, we have
\[
\frac{\partial H}{\partial x_i} = \frac{\partial y_j}{\partial x_i} \frac{\partial \tilde{H}(\mathbf{y})}{\partial y_j} = A_{ji} \frac{\partial \tilde{H}(\mathbf{y})}{\partial y_j} = \left[A^T \frac{\partial \tilde{H}}{\partial \mathbf{y}}\right]_i.
\]
Putting this back in, we have
\[
\dot{\mathbf{y}} = AJA^T \frac{\partial\tilde{H}}{\partial \mathbf{y}}.
\]
So $\mathbf{y} \mapsto \mathbf{y}(\mathbf{x})$ is canonical iff $J = AJA^T$.
\end{eg}
What about more general cases? Recall from IB Analysis II that a differentiable map is ``locally linear''. Now Hamilton's equations are purely local equations, so we might expect the following:
\begin{prop}
A map $\mathbf{x} \mapsto \mathbf{y}(\mathbf{x})$ is canonical iff $D\mathbf{y}$ is \emph{symplectic}\index{symplectic map}\index{symplectomorphism}, ie.
\[
D\mathbf{y} J (D\mathbf{y})^T = J.
\]
\end{prop}
Indeed, this follows from a simple application of the chain rule.
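Explicitly, if $\mathbf{y} = \mathbf{y}(\mathbf{x})$ and $\tilde{H}(\mathbf{y}) = H(\mathbf{x})$, then the chain rule gives $\frac{\partial H}{\partial \mathbf{x}} = (D\mathbf{y})^T \frac{\partial \tilde{H}}{\partial \mathbf{y}}$ exactly as in the linear example, so
\[
\dot{\mathbf{y}} = D\mathbf{y}\, \dot{\mathbf{x}} = D\mathbf{y}\, J \frac{\partial H}{\partial \mathbf{x}} = \left(D\mathbf{y}\, J (D\mathbf{y})^T\right) \frac{\partial \tilde{H}}{\partial \mathbf{y}},
\]
and this is of Hamiltonian form iff $D\mathbf{y}\, J (D\mathbf{y})^T = J$.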
\subsubsection*{Generating functions}
We now discuss a useful way of producing canonical transformations, known as \term{generating functions}. In general, we can do generating functions in four different ways, but they are all very similar, so we will just do one that will be useful later on.
Suppose we have a function $S: \R^{2n} \to \R$. We suggestively write its arguments as $S(\mathbf{q}, \mathbf{P})$. We now set
\[
\mathbf{p} = \frac{\partial S}{\partial \mathbf{q}},\quad \mathbf{Q} = \frac{\partial S}{\partial \mathbf{P}}.
\]
By this equation, we mean we write down the first equation, which allows us to solve for $\mathbf{P}$ in terms of $\mathbf{q}, \mathbf{p}$. Then the second equation tells us the value of $\mathbf{Q}$ in terms of $\mathbf{q}, \mathbf{P}$, hence in terms of $\mathbf{p}, \mathbf{q}$.
Usually, the way we use this is that we already have a candidate for what $\mathbf{P}$ should be. We then try to find a function $S(\mathbf{q}, \mathbf{P})$ such that the first equation holds. Then the second equation will tell us what the right choice of $\mathbf{Q}$ is.
Checking that this indeed gives rise to a canonical transformation is just a very careful application of the chain rule, which we shall not go into. Instead, we look at a few examples to see it in action.
\begin{eg}
Consider the generating function
\[
S(\mathbf{q}, \mathbf{P}) = \mathbf{q} \cdot \mathbf{P}.
\]
Then we have
\[
\mathbf{p} = \frac{\partial S}{\partial \mathbf{q}} = \mathbf{P},\quad \mathbf{Q} = \frac{\partial S}{\partial \mathbf{P}} = \mathbf{q}.
\]
So this generates the identity transformation $(\mathbf{Q}, \mathbf{P}) = (\mathbf{q}, \mathbf{p})$.
\end{eg}
\begin{eg}
In a 2-dimensional phase space, we consider the generating function
\[
S(q, P) = qP + q^2.
\]
Then we have
\[
p = \frac{\partial S}{\partial q} = P + 2q,\quad Q = \frac{\partial S}{\partial P} = q.
\]
So we have the transformation
\[
(Q, P) = (q, p - 2q).
\]
In matrix form, this is
\[
\begin{pmatrix}
Q\\P
\end{pmatrix} =
\begin{pmatrix}
1 & 0\\
-2 & 1
\end{pmatrix}
\begin{pmatrix}
q\\p
\end{pmatrix}.
\]
To see that this is canonical, we compute
\[
\begin{pmatrix}
1 & 0\\
-2 & 1
\end{pmatrix}
J
\begin{pmatrix}
1 & 0\\
-2 & 1
\end{pmatrix}^T =
\begin{pmatrix}
1 & 0\\
-2 & 1
\end{pmatrix}
\begin{pmatrix}
0 & 1\\
-1 & 0
\end{pmatrix}
\begin{pmatrix}
1 & -2\\
0 & 1
\end{pmatrix} =
\begin{pmatrix}
0 & 1\\
-1 & 0
\end{pmatrix}
\]
So this is indeed a canonical transformation.
\end{eg}
\subsection{The Arnold-Liouville theorem}
We now get to the Arnold-Liouville theorem. This theorem says that if a Hamiltonian system is integrable, then we can find a canonical transformation $(\mathbf{q}, \mathbf{p}) \mapsto (\mathbf{Q}, \mathbf{P})$ such that $\tilde{H}$ depends only on $\mathbf{P}$. If this happens, then Hamilton's equations reduce to
\[
\dot{\mathbf{Q}} = \frac{\partial \tilde{H}}{\partial \mathbf{P}},\quad \dot{\mathbf{P}} = -\frac{\partial \tilde{H}}{\partial \mathbf{Q}} = 0,
\]
which is pretty easy to solve. We find that $\mathbf{P}(t) = \mathbf{P}_0$ is a constant, and since the right hand side of the first equation depends only on $\mathbf{P}$, we find that $\dot{\mathbf{Q}}$ is also constant! So $\mathbf{Q} = \mathbf{Q}_0 + \Omega t$, where
\[
\Omega = \frac{\partial \tilde{H}}{\partial \mathbf{P}} (\mathbf{P}_0).
\]
So the solution just falls out very easily.
Before we prove the Arnold-Liouville theorem in full generality, we first see what the canonical transformation looks like in a very particular case. Here we will just write down the canonical transformation and check that it works, but we will later find that the Arnold-Liouville theorem gives us a general method to find the transformation.
\begin{eg}
Consider the harmonic oscillator with Hamiltonian
\[
H(q, p) = \frac{1}{2}p^2 + \frac{1}{2}\omega^2 q^2.
\]
Since this is a 2-dimensional system, we only need a single first integral. Since $H$ is a first integral for trivial reasons, this is an integrable Hamiltonian system.
We can actually draw the lines on which $H$ is constant --- they are just ellipses:
\begin{center}
\begin{tikzpicture}
\draw [->] (-3, 0) -- (3, 0) node [right] {$q$};
\draw [->] (0, -2) -- (0, 2) node [above] {$p$};
\foreach \x in {0.4, 0.8, 1.2} {
\begin{scope}[scale=\x]
\draw ellipse (1.5 and 1);
\end{scope}
}
\end{tikzpicture}
\end{center}
We note that the ellipses are each homeomorphic to $S^1$. Now we introduce the coordinate transformation $(q, p) \mapsto (\phi, I)$, defined by
\[
q = \sqrt{\frac{2I}{\omega}} \sin \phi,\quad p = \sqrt{2I\omega} \cos \phi.
\]
For the purpose of this example, we can suppose we obtained this formula through divine inspiration. However, in the Arnold-Liouville theorem, we will provide a general way of coming up with these formulas.
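In fact, checking that this transformation is canonical is quick. By the proposition above, the map is canonical iff its Jacobian is symplectic, and since $AJA^T = (\det A)J$ for any $2 \times 2$ matrix $A$, in a $2$-dimensional phase space being symplectic is the same as having determinant $1$. Here the Jacobian of $(\phi, I) \mapsto (q, p)$ has
\[
\frac{\partial q}{\partial \phi}\frac{\partial p}{\partial I} - \frac{\partial q}{\partial I}\frac{\partial p}{\partial \phi} = \cos^2 \phi + \sin^2 \phi = 1,
\]
and hence so does that of the inverse map.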
In these new coordinates, the Hamiltonian looks like
\[
\tilde{H}(\phi, I) = H(q(\phi, I), p(\phi, I)) = \omega I.
\]
This is really nice. There is no $\phi$! Now Hamilton's equations become
\[
\dot\phi = \frac{\partial \tilde{H}}{ \partial I} = \omega,\quad \dot{I} = -\frac{\partial \tilde{H}}{\partial \phi} = 0.
\]
We can integrate up to obtain
\[
\phi(t) = \phi_0 + \omega t,\quad I(t) = I_0.
\]
For some unexplainable reason, we decide it is fun to consider the integral along paths of constant $H$:
\begin{align*}
\frac{1}{2\pi}\oint p \;\d q &= \frac{1}{2\pi} \int_0^{2\pi}p(\phi, I) \left(\frac{\partial q}{\partial \phi} \;\d \phi + \frac{\partial q}{\partial I} \;\d I\right)\\
&= \frac{1}{2\pi} \int_0^{2\pi}p(\phi, I) \left(\frac{\partial q}{\partial \phi} \;\d \phi\right)\\
&= \frac{1}{2\pi} \int_0^{2\pi} \sqrt{\frac{2I}{\omega}}\sqrt{2I\omega} \cos^2 \phi \;\d \phi\\
&= I
\end{align*}
This is interesting. We could always have performed the integral $\frac{1}{2\pi} \oint p \;\d q$ along paths of constant $H$ without knowing anything about $I$ and $\phi$, and this would have magically given us the new coordinate $I$.
\end{eg}
There are two things to take away from this.
\begin{enumerate}
\item The motion takes place on $S^1$.
\item We got $I$ by performing $\frac{1}{2\pi}\oint p \;\d q$.
\end{enumerate}
These two ideas are essentially what we are going to prove for general Hamiltonian system.
\begin{thm}[Arnold-Liouville theorem]\index{Arnold-Liouville theorem}
We let $(M, H)$ be an integrable $2n$-dimensional Hamiltonian system with independent, involutive first integrals $f_1, \cdots, f_n$, where $f_1 = H$. For any fixed $\mathbf{c} \in \R^n$, we set
\[
M_\mathbf{c} = \{(\mathbf{q}, \mathbf{p}) \in M: f_i(\mathbf{q}, \mathbf{p}) = c_i, i =1 , \cdots, n\}.
\]
Then
\begin{enumerate}
\item $M_\mathbf{c}$ is a smooth $n$-dimensional surface in $M$. If $M_\mathbf{c}$ is compact and connected, then it is diffeomorphic to
\[
T^n = S^1 \times \cdots \times S^1.
\]
\item If $M_\mathbf{c}$ is compact and connected, then locally, there exist canonical coordinate transformations $(\mathbf{q}, \mathbf{p}) \mapsto (\boldsymbol\phi, \mathbf{I})$ called the \term{action-angle coordinates} such that the angles $\{\phi_k\}_{k = 1}^n$ are coordinates on $M_\mathbf{c}$; the actions $\{I_k\}_{k = 1}^n$ are first integrals, and the transformed Hamiltonian $\tilde{H}(\boldsymbol\phi, \mathbf{I})$ does not depend on $\boldsymbol\phi$. In particular, Hamilton's equations become
\[
\dot{\mathbf{I}} = 0,\quad \dot{\boldsymbol\phi} = \frac{\partial \tilde{H}}{\partial \mathbf{I}} = \text{constant}.
\]
\end{enumerate}
\end{thm}
Some parts of the proof will refer to certain results from rather pure courses, which the applied people may be willing to just take on faith.
\begin{proof}[Proof sketch]
The first part is purely differential geometry. We first show that $M_\mathbf{c}$ is smooth and $n$-dimensional. The proof is purely formal, and follows from the preimage theorem you may or may not have learnt from IID Differential Geometry (which is in turn an easy consequence of the inverse function theorem from IB Analysis II). The key that makes this work is that the constraints are independent, which is the condition that allows the preimage theorem to apply.
We next show that $M_\mathbf{c}$ is diffeomorphic to the torus if it is compact and connected. Consider the Hamiltonian vector fields defined by
\[
\mathbf{V}_{f_i} = J \frac{\partial f_i}{\partial \mathbf{x}}.
\]
We claim that these are \emph{tangent} to the surface $M_\mathbf{c}$. By differential geometry, it suffices to show that the derivative of the $\{f_j\}$ in the direction of $\mathbf{V}_{f_i}$ vanishes. We can compute
\[
\left(\mathbf{V}_{f_i} \cdot \frac{\partial}{\partial \mathbf{x}}\right)f_j = \frac{\partial f_j}{\partial \mathbf{x}} J \frac{\partial f_i}{\partial \mathbf{x}} = \{f_j, f_i\} = 0.
\]
Since this vanishes, we know that $\mathbf{V}_{f_i}$ is tangent to the surface. Again by differential geometry, the flow maps $\{g_i\}$ must map $M_\mathbf{c}$ to itself. Also, we know that the flow maps commute. Indeed, this follows from the fact that
\[
[\mathbf{V}_{f_i}, \mathbf{V}_{f_j}] = -\mathbf{V}_{\{f_i, f_j\}} = -\mathbf{V}_{0} = 0.
\]
So we have a whole bunch of commuting flow maps from $M_\mathbf{c}$ to itself. We set
\[
g^\mathbf{t} = g_1^{t_1} g_2^{t_2} \cdots g_n^{t_n},
\]
where $\mathbf{t} \in \R^n$. Then because of commutativity, we have
\[
g^{\mathbf{t}_1 + \mathbf{t}_2} = g^{\mathbf{t}_1}g^{\mathbf{t}_2}.
\]
So this gives a group action of $\R^n$ on the surface $M_\mathbf{c}$. We fix $\mathbf{x} \in M_\mathbf{c}$. We define
\[
\stab(\mathbf{x}) = \{\mathbf{t} \in \R^n: g^\mathbf{t}\mathbf{x} = \mathbf{x}\}.
\]
We introduce the map
\[
\phi: \frac{\R^n}{\stab(\mathbf{x})} \to M_\mathbf{c}
\]
given by $\phi(\mathbf{t}) = g^{\mathbf{t}}\mathbf{x}$. By the orbit-stabilizer theorem, this gives a bijection between $\R^n/\stab(\mathbf{x})$ and the orbit of $\mathbf{x}$. It can be shown that the orbit of $\mathbf{x}$ is exactly the connected component of $\mathbf{x}$. Now if $M_\mathbf{c}$ is connected, then this must be the whole of $M_\mathbf{c}$! By general differential geometry theory, we get that this map is indeed a diffeomorphism.
We know that $\stab(\mathbf{x})$ is a subgroup of $\R^n$, and if the $g_i$ are non-trivial, it can be seen (at least intuitively) that this is discrete. Thus, it must be isomorphic to something of the form $\Z^k$ with $1 \leq k \leq n$.
So we have
\[
M_\mathbf{c} \cong \R^n / \stab(\mathbf{x}) \cong \R^n/\Z^k \cong \R^k/\Z^k \times \R^{n - k} \cong T^k \times \R^{n - k}.
\]
Now if $M_\mathbf{c}$ is compact, we must have $n - k = 0$, ie. $n = k$, so that we have no factors of $\R$. So $M_\mathbf{c} \cong T^n$.
\separator
With all the differential geometry out of the way, we can now construct the action-angle coordinates.
For simplicity of presentation, we only do it in the case when $n = 2$. The proof for higher dimensions is entirely analogous, except that we need to use a higher-dimensional analogue of Green's theorem, which we do not currently have.
We note that it is currently trivial to re-parameterize the phase space with coordinates $(\mathbf{Q}, \mathbf{P})$ such that $\mathbf{P}$ is constant within the Hamiltonian flow, and each coordinate of $\mathbf{Q}$ takes values in $S^1$. Indeed, we just put $\mathbf{P} = \mathbf{c}$ and use the diffeomorphism $T^n \cong M_\mathbf{c}$ to parameterize each $M_\mathbf{c}$ as a product of $n$ copies of $S^1$. However, this is not good enough, because such an arbitrary transformation will almost certainly not be canonical. So we shall try to find a more natural, and in fact canonical, way of parametrizing our phase space.
We first work on the generalized momentum part. We want to replace $\mathbf{c}$ with something nicer. We will do something analogous to what we did for the simple harmonic oscillator.
So we fix a $\mathbf{c}$, and try to come up with some numbers $\mathbf{I}$ that labels this $M_\mathbf{c}$. Recall that our surface $M_\mathbf{c}$ looks like a torus:
\begin{center}
\begin{tikzpicture}
\draw (0,0) ellipse (2 and 1.12);
\path[rounded corners=24pt] (-.9,0)--(0,.6)--(.9,0) (-.9,0)--(0,-.56)--(.9,0);
\draw[rounded corners=28pt] (-1.1,.1)--(0,-.6)--(1.1,.1);
\draw[rounded corners=24pt] (-.9,0)--(0,.6)--(.9,0);
\end{tikzpicture}
\end{center}
Up to continuous deformation of loops, we see that there are two non-trivial ``single'' loops in the torus, given by the red and blue loops:
\begin{center}
\begin{tikzpicture}
\draw (0,0) ellipse (2 and 1.12);
\draw [mred] (0,0) ellipse (1.5 and 0.6);
\draw [mblue] (0, -0.71) ellipse (0.1 and 0.41);
\draw [rounded corners=28pt] (-1.1,.1)--(0,-.6)--(1.1,.1);
\draw [rounded corners=24pt] (-.9,0)--(0,.6)--(.9,0);
\end{tikzpicture}
\end{center}
More generally, for an $n$-torus, we have $n$ such distinct loops $\Gamma_1, \cdots, \Gamma_n$. More concretely, after identifying $M_\mathbf{c}$ with $T^n$, these are the loops given by
\[
\{0\} \times \cdots \times \{0\} \times S^1 \times \{0\} \times \cdots \times \{0\} \subseteq T^n.
\]
We now attempt to define:
\[
I_j = \frac{1}{2\pi} \oint_{\Gamma_j} \mathbf{p}\cdot \d \mathbf{q}.
\]
This is just like the formula we had for the simple harmonic oscillator.
We want to make sure this is well-defined --- recall that $\Gamma_i$ actually represents a \emph{class} of loops identified under continuous deformation. What if we picked a different loop?
\begin{center}
\begin{tikzpicture}
\draw (0,0) ellipse (2 and 1.12);
\draw [mblue] (0.5, -1.09) arc (270:450:0.1 and 0.43);
\draw [mblue, dashed] (0.5, -0.23) arc (90:270:0.1 and 0.43);
\draw [mblue, dashed] (-0.5, -1.09) arc (270:450:0.1 and 0.43);
\draw [mblue] (-0.5, -0.23) arc (90:270:0.1 and 0.43);
\node [mblue, right] at (0.6, -0.66) {$\Gamma_2'$};
\node [mblue, left] at (-0.6, -0.66) {$\Gamma_2$};
\draw [rounded corners=28pt] (-1.1,.1)--(0,-.6)--(1.1,.1);
\draw [rounded corners=24pt] (-.9,0)--(0,.6)--(.9,0);
\end{tikzpicture}
\end{center}
On $M_\mathbf{c}$, we have the equation
\[
f_i(\mathbf{q}, \mathbf{p}) = c_i.
\]
We will have to assume that we can invert this equation for $\mathbf{p}$ locally, ie. we can write
\[
\mathbf{p} = \mathbf{p}(\mathbf{q}, \mathbf{c}).
\]
The condition for being able to do so is just
\[
\det\left(\frac{\partial f_i}{\partial p_j}\right) \not= 0,
\]
which is a mild non-degeneracy condition.
Then by definition, the following holds identically:
\[
f_i(\mathbf{q}, \mathbf{p}(\mathbf{q}, \mathbf{c})) = c_i.
\]
We can then differentiate this with respect to $q_k$ to obtain
\[
\frac{\partial f_i}{\partial q_k} + \frac{\partial f_i}{\partial p_\ell} \frac{\partial p_\ell}{\partial q_k} = 0
\]
on $M_\mathbf{c}$. Now recall that the $\{f_i\}$'s are in involution. So on $M_\mathbf{c}$, we have
\begin{align*}
0 &= \{f_i, f_j\} \\
&= \frac{\partial f_i}{\partial q_k} \frac{\partial f_j}{\partial p_k} - \frac{\partial f_i}{\partial p_k} \frac{\partial f_j}{\partial q_k}\\
&= \left(-\frac{\partial f_i}{\partial p_\ell} \frac{\partial p_\ell}{\partial q_k}\right)\frac{\partial f_j}{\partial p_k} - \frac{\partial f_i}{\partial p_k}\left(-\frac{\partial f_j}{\partial p_\ell} \frac{\partial p_\ell}{\partial q_k}\right)\\
&= \left(-\frac{\partial f_i}{\partial p_k} \frac{\partial p_k}{\partial q_\ell}\right)\frac{\partial f_j}{\partial p_\ell} - \frac{\partial f_i}{\partial p_k}\left(-\frac{\partial f_j}{\partial p_\ell} \frac{\partial p_\ell}{\partial q_k}\right)\\
&= \frac{\partial f_i}{\partial p_k} \left(\frac{\partial p_\ell}{\partial q_k} - \frac{\partial p_k}{\partial q_\ell}\right) \frac{\partial f_j}{\partial p_\ell}.
\end{align*}
Recall that the determinants of the matrices $\frac{\partial f_i}{\partial p_k}$ and $\frac{\partial f_j}{\partial p_\ell}$ are non-zero, ie. the matrices are invertible. So for this to hold, the middle matrix must vanish! So we have
\[
\frac{\partial p_\ell}{\partial q_k} - \frac{\partial p_k}{\partial q_\ell} = 0.
\]
In our particular case of $n = 2$, since $\ell, k$ can only be $1, 2$, the only non-trivial thing this says is
\[
\frac{\partial p_1}{\partial q_2} - \frac{\partial p_2}{\partial q_1} = 0.
\]
Now suppose we have two ``simple'' loops $\Gamma_2$ and $\Gamma_2'$. Then they bound an area $A$:
\begin{center}
\begin{tikzpicture}
\draw (0,0) ellipse (2 and 1.12);
\draw [mblue] (0.5, -1.09) arc (270:450:0.1 and 0.43);
\draw [mblue, dashed] (0.5, -0.23) arc (90:270:0.1 and 0.43);
\draw [mblue, dashed] (-0.5, -1.09) arc (270:450:0.1 and 0.43);
\draw [mblue] (-0.5, -0.23) arc (90:270:0.1 and 0.43);
\node [mblue, right] at (0.6, -0.66) {$\Gamma_2'$};
\node [mblue, left] at (-0.6, -0.66) {$\Gamma_2$};
\draw [rounded corners=28pt] (-1.1,.1)--(0,-.6)--(1.1,.1);
\draw [rounded corners=24pt] (-.9,0)--(0,.6)--(.9,0);
\fill [morange, opacity=0.3] (0.5, -1.09) arc (270:450:0.1 and 0.43) to [out=190, in=-10] (-0.5, -0.23) arc (90:270:0.1 and 0.43) to [out=-7, in=187] (0.5, -1.09);
\node [morange!50!black] at (0, -0.71) {$A$};
\end{tikzpicture}
\end{center}
Then we have
\begin{align*}
\left(\oint_{\Gamma_2} - \oint_{\Gamma_2'}\right) \mathbf{p}\cdot \d \mathbf{q} &= \oint_{\partial A}\mathbf{p}\cdot \d \mathbf{q}\\
&= \iint_A \left(\frac{\partial p_2}{\partial q_1} - \frac{\partial p_1}{\partial q_2}\right) \;\d q_1\;\d q_2\\
&= 0
\end{align*}
by Green's theorem.
So $I_j$ is well-defined, and
\[
\mathbf{I} = \mathbf{I}(\mathbf{c})
\]
is just a function of $\mathbf{c}$. These will be our new ``momentum'' coordinates. To figure out what the angles $\boldsymbol\phi$ should be, we use generating functions. For now, we assume that we can invert $\mathbf{I}(\mathbf{c})$, so that we can write
\[
\mathbf{c} = \mathbf{c}(\mathbf{I}).
\]
We arbitrarily pick a point $\mathbf{x}_0$, and define the generating function
\[
S(\mathbf{q}, \mathbf{I}) = \int_{\mathbf{x}_0}^\mathbf{x} \mathbf{p}(\mathbf{q}', \mathbf{c}(\mathbf{I})) \cdot \d \mathbf{q}',
\]
where $\mathbf{x} = (\mathbf{q}, \mathbf{p}) = (\mathbf{q}, \mathbf{p}(\mathbf{q}, \mathbf{c}(\mathbf{I})))$. However, this is not \emph{a priori} well-defined, because we haven't said how we are going to integrate from $\mathbf{x}_0$ to $\mathbf{x}$. We are going to pick paths arbitrarily, but we want to make sure it is well-defined. Suppose we change from a path $\gamma_1$ to $\gamma_2$ by a little bit, and they enclose a surface $B$.
\begin{center}
\begin{tikzpicture}
\node [circ] {};
\node [left] {$\mathbf{x}_0$};
\node at (2, 2) [circ] {};
\node at (2, 2) [right] {$\mathbf{x}$};
\node at (2, 1) {$\gamma_2$};
\draw [->-=0.6] plot [smooth, tension=1] coordinates {(0, 0) (0.8, 1.7) (2, 2)};
\node [above] at (0.8, 1.7) {$\gamma_1$};
\draw [->-=0.6] plot [smooth, tension=1] coordinates {(0, 0) (1.2, -0.1) (2, 2)};
\node at (1, 0.9) {$B$};
\end{tikzpicture}
\end{center}
Then we have
\[
S(\mathbf{q}, \mathbf{I}) \mapsto S(\mathbf{q}, \mathbf{I}) + \oint_{\partial B} \mathbf{p} \cdot \d \mathbf{q}.
\]
Again, we are integrating $\mathbf{p} \cdot \d\mathbf{q}$ around a boundary, so there is no change.
However, we don't live in flat space. We live on a torus, and we can have a crazy loop that does something like this:
\begin{center}
\begin{tikzpicture}
\draw (-3, 1) to [out=5, in=175] (3, 1);
\draw (-3, -1) to [out=5, in=175] (3, -1);
\node (i) [circ] at (-1, 0) {};
\node (f) [circ] at (1, 0.2) {};
\draw [mblue, ->-=0.5] (i) to [out=10, in=150] (f);
\draw [mred] (i) to [out=30, in=180] (0, 1.15);
\draw [mred, dashed, ->-=0.7] (0, 1.15) arc(90:-90:0.1 and 1);
\draw [mred] (0, -0.85) to [out=180, in=180] (f);
\end{tikzpicture}
\end{center}
Then what we have effectively got is that we added a loop (say) $\Gamma_2$ to our path, and this contributes an extra term $2\pi I_2$. In general, these transformations give changes of the form
\[
S(\mathbf{q}, \mathbf{I}) \mapsto S(\mathbf{q}, \mathbf{I}) + 2\pi I_j.
\]
This is the only thing that can happen. So differentiating with respect to $\mathbf{I}$, we know that
\[
\boldsymbol\phi = \frac{\partial S}{\partial \mathbf{I}}
\]
is well-defined modulo $2\pi$. These are the \emph{angle coordinates}. Note that just like angles, we can pick $\boldsymbol\phi$ consistently locally without this ambiguity, as long as we stay near some fixed point, but when we want to talk about the whole surface, this ambiguity necessarily arises. Now also note that
\[
\frac{\partial S}{\partial \mathbf{q}} = \mathbf{p}.
\]
Indeed, we can write
\[
S = \int_{\mathbf{x}_0}^\mathbf{x} \mathbf{F} \cdot \d \mathbf{x}',
\]
where
\[
\mathbf{F} = (\mathbf{p}, 0).
\]
So by the fundamental theorem of calculus, we have
\[
\frac{\partial S}{\partial \mathbf{x}} = \mathbf{F}.
\]
So we get that
\[
\frac{\partial S}{\partial \mathbf{q}} = \mathbf{p}.
\]
In summary, we have constructed on $M_\mathbf{c}$ the following: $\mathbf{I} = \mathbf{I}(\mathbf{c})$, $S(\mathbf{q}, \mathbf{I})$, and
\[
\boldsymbol\phi = \frac{\partial S}{\partial \mathbf{I}},\quad \mathbf{p} = \frac{\partial S}{\partial \mathbf{q}}.
\]
So $S$ is a generator for the canonical transformation, and $(\mathbf{q}, \mathbf{p}) \mapsto (\boldsymbol\phi, \mathbf{I})$ is a canonical transformation.
Note that at any point $\mathbf{x}$, we know $\mathbf{c} = \mathbf{f}(\mathbf{x})$, where $\mathbf{f} = (f_1, \cdots, f_n)$. So $\mathbf{I}(\mathbf{c}) = \mathbf{I}(\mathbf{f}(\mathbf{x}))$ depends only on the first integrals. So we have
\[
\dot{\mathbf{I}} = 0.
\]
So Hamilton's equations become
\[
\dot{\boldsymbol\phi} = \frac{\partial \tilde{H}}{\partial \mathbf{I}},\quad \dot{\mathbf{I}} = -\frac{\partial \tilde{H}}{\partial \boldsymbol\phi} = 0.
\]
So the new Hamiltonian depends only on $\mathbf{I}$. So we can integrate up and get
\[
\boldsymbol\phi(t) = \boldsymbol\phi_0 + \Omega t,\quad \mathbf{I}(t) = \mathbf{I}_0,
\]
where
\[
\Omega = \frac{\partial\tilde{H}}{\partial \mathbf{I}}(\mathbf{I}_0).
\]
\end{proof}
To summarize, to integrate up an integrable Hamiltonian system, we identify the different cycles $\Gamma_1, \cdots, \Gamma_n$ on $M_\mathbf{c}$. We then construct
\[
I_j = \frac{1}{2\pi} \oint_{\Gamma_j} \mathbf{p}\cdot \d \mathbf{q},
\]
where $\mathbf{p} = \mathbf{p}(\mathbf{q}, \mathbf{c})$. We then invert this to say
\[
\mathbf{c} = \mathbf{c}(\mathbf{I}).
\]
We then compute
\[
\boldsymbol\phi = \frac{\partial S}{\partial \mathbf{I}},
\]
where
\[
S = \int_{\mathbf{x}_0}^{\mathbf{x}} \mathbf{p}(\mathbf{q}', \mathbf{c}(\mathbf{I})) \cdot \d \mathbf{q}'.
\]
Now we run through this procedure again for the harmonic oscillator.
\begin{eg}
In the harmonic oscillator, we have
\[
H(q, p) = \frac{1}{2}p^2 + \frac{1}{2}\omega^2 q^2.
\]
We then have
\[
M_\mathbf{c} = \left\{(q, p): \frac{1}{2} p^2 + \frac{1}{2}\omega^2 q^2 = c\right\}.
\]
The first part of the Arnold-Liouville theorem says this is diffeomorphic to $T^1 = S^1$, which it is! The next step is to pick a loop, and there is an obvious one --- the circle itself. We write
\[
p = p(q, c) = \pm \sqrt{2c - \omega^2 q^2}
\]
on $M_\mathbf{c}$. Then we have
\[
I = \frac{1}{2\pi} \oint p \;\d q = \frac{c}{\omega}.
\]
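Indeed, $\oint p \;\d q$ is just the area enclosed by the ellipse $M_\mathbf{c}$, which has semi-axes $\sqrt{2c}/\omega$ (in $q$) and $\sqrt{2c}$ (in $p$), so
\[
I = \frac{1}{2\pi} \cdot \pi \cdot \frac{\sqrt{2c}}{\omega} \cdot \sqrt{2c} = \frac{c}{\omega}.
\]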
We can then write $c$ as a function of $I$ by
\[
c = c(I) = \omega I.
\]
Now construct
\[
S(q, I) = \int_{x_0}^{x} p(q', c(I))\;\d q'.
\]
We can pick $x_0$ to be the point with $q = 0$. Then this is equal to
\[
\int_0^q \sqrt{2\omega I - \omega^2 q'^2} \;\d q'.
\]
To find $\phi$, we need to differentiate this thing to get
\[
\phi = \frac{\partial S}{\partial I} = \omega\int_0^q \frac{\d q'}{\sqrt{2 \omega I - \omega^2 q'^2}} = \sin^{-1}\left(\sqrt{\frac{\omega}{2I}}q\right).
\]
As expected, this is only well-defined modulo $2\pi$! Inverting, and using the fact that $c = \omega I$, we have
\[
q = \sqrt{\frac{2I}{\omega}} \sin \phi,\quad p = \sqrt{2I\omega} \cos \phi.
\]
These are exactly the coordinates we obtained through divine inspiration last time.
\end{eg}
\section{Partial Differential Equations}
For the remainder of the course, we are going to look at PDE's. We can view these as infinite-dimensional analogues of ODE's. So what do we expect for integrable PDE's? Recall that if a $2n$-dimensional ODE is integrable, then it has $n$ independent first integrals in involution. Since PDE's are infinite-dimensional, and half of infinity is still infinity, we would expect an integrable PDE to have infinitely many first integrals. Similar to the case of integrable ODE's, we would also expect that there will be some magic transformation that allows us to write down the solution with ease, even if the initial problem looks very complicated.
These are all true, but our journey will be less straightforward. To begin with, we will not define what integrability means, because it is a rather complicated issue. We will go through one method of ``integrating up'' a PDE in detail, known as the inverse scattering transform, and we will apply it to a particular equation. Unfortunately, the way we apply the inverse scattering transform to a PDE is not obvious, and here we will have to do it through ``divine inspiration''.
Before we get to the inverse scattering transform, we first look at a few examples of PDEs.
\subsection{KdV equation}
The \term{KdV equation} is given by
\[
u_t + u_{xxx} - 6 u u_x = 0.
\]
Before we study the KdV equation, we will look at some variations of this where we drop some terms, and then see how they compare.
\begin{eg}
Consider the linear PDE
\[
u_t + u_{xxx} = 0,
\]
where $u = u(x, t)$ is a function of two variables. This admits solutions of the form
\[
e^{ikx - i\omega t},
\]
known as \term{plane wave modes}. For this to be a solution, $\omega$ must obey the \term{dispersion relation}
\[
\omega = \omega(k) = -k^3.
\]
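Indeed, when we substitute the plane wave mode into the PDE, each $\partial_t$ brings down a factor of $-i\omega$ and each $\partial_x$ a factor of $ik$, so we need
\[
-i\omega + (ik)^3 = -i(\omega + k^3) = 0,
\]
ie. $\omega = -k^3$.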
For \emph{any} $k$, as long as we pick $\omega$ this way, we obtain a solution. By writing the solution as
\[
u(x, t) = \exp\left(ik\left(x - \frac{\omega(k)}{k}t\right)\right),
\]
we see that plane wave modes travel at speed
\[
\frac{\omega}{k} = -k^2.
\]
It is very important that the speed depends on $k$. Different plane wave modes travel at different speeds. This is going to give rise to what we call \term{dispersion}.
A general solution is a superposition of plane wave modes
\[
\sum_k a(k) e^{ikx - i \omega(k) t},
\]
or even an uncountable superposition
\[
\int_k A(k) e^{ikx - i\omega(k)t}\;\d k.
\]
It is a theorem that for linear PDE's on convex domains, all solutions are indeed superpositions of plane wave modes. So this is indeed completely general.
So suppose we have an initial solution that looks like this:
\begin{center}
\begin{tikzpicture}[yscale=1.5]
\draw [domain=-2:2,samples=50, mblue, thick] plot (\x, {exp(-3 * \x * \x)});
\end{tikzpicture}
\end{center}
We write this as a superposition of plane wave modes. As we let time pass, different plane wave modes travel at different speeds, so this becomes a huge mess! So after some time, it might look like
\begin{center}
\begin{tikzpicture}[yscale=1.5]
\draw [domain=0:2,samples=50, mblue, thick] plot (\x, {exp(-3 * \x * \x)});
\draw [domain=-3:0,samples=50, mblue, thick] plot [smooth] (\x, {exp(-\x * \x) *(1 - 0.5 * sin(400*\x*\x) * sin(400*\x*\x))});
\end{tikzpicture}
\end{center}
Intuitively, what gives us the dispersion is the third-order derivative $\partial_x^3$. If we had $\partial_x$ instead, then there would be no dispersion.
\end{eg}
\begin{eg}
Consider the non-linear PDE
\[
u_t - 6 uu_x = 0.
\]
This looks almost intractable, as non-linear PDE's are scary, and we don't know what to do. However, it turns out that we can solve this for any initial data $u(x, 0) = f(x)$ via the method of characteristics. Details are left for the second example sheet, but the solution we get is
\[
u(x, t) = f(\xi),
\]
where $\xi$ is given implicitly by
\[
\xi = x + 6t f(\xi).
\]
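Indeed, along a characteristic curve $x(t)$ satisfying $\frac{\d x}{\d t} = -6u$, the chain rule gives
\[
\frac{\d}{\d t} u(x(t), t) = u_t + \dot{x} u_x = u_t - 6uu_x = 0.
\]
So $u$ is constant along characteristics, which are therefore straight lines, and the characteristic starting at $x(0) = \xi$ is $x(t) = \xi - 6tf(\xi)$. Rearranging gives the implicit formula above.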
We can show that $u_x$ becomes, in general, infinite in finite time. Indeed, we have
\[
u_x = f'(\xi) \frac{\partial \xi}{\partial x}.
\]
We differentiate the formula for $\xi$ to obtain
\[
\frac{\partial \xi}{\partial x} = 1 + 6tf'(\xi) \frac{\partial \xi}{\partial x}.
\]
So we know $\frac{\partial \xi}{\partial x}$ becomes infinite when $1 - 6t f'(\xi) = 0$. In general, this happens in finite time, and at that time the wave profile develops a vertical slope. \emph{After} that, it becomes a multi-valued function! So the solution might evolve like this:
\begin{center}
\begin{tikzpicture}[xscale=0.9]
\draw [domain=-1.5:1.5,samples=50, thick, mblue] plot (\x, {1.5 * exp(-3 * \x * \x)});
\draw [->] (2, 0.75) -- (3, 0.75);
\begin{scope}[shift={(5, 0)}]
\draw [mblue, thick] (-1.5, 0) -- (-1.3, 0) to [out=0, in=90] (0.5, 1.3) -- (0.5, 0.3) to [out=270, in=180] (1, 0) -- (1.5, 0);
\draw [->] (2, 0.75) -- (3, 0.75);
\end{scope}
\begin{scope}[shift={(10, 0)}]
\draw [mblue, thick] (-1.5, 0) -- (-1.3, 0) to [out=0, in=90, looseness=0.7] (0.7, 1.3) to [out=270, in=90] (0.3, 0.5) to [out=270, in=180] (1, 0) -- (1.5, 0);
\end{scope}
\end{tikzpicture}
\end{center}
This is known as \term{wave-breaking}.
We can imagine that the $-6uu_x$ term is what gives us wave-breaking.
\end{eg}
What happens if we combine both of these effects?
\begin{defi}[KdV equation]\index{KdV equation}
The \emph{KdV equation} is given by
\[
u_t + u_{xxx} - 6 u u_x = 0.
\]
\end{defi}
It turns out that this has a perfect balance between dispersion and non-linearity. This admits very special solutions known as \term{solitons}. For example, a $1$-soliton solution is
\[
u(x, t) = -2 \chi_1^2 \sech^2\left(\chi_1 (x - 4 \chi_1^2 t)\right).
\]
\begin{center}
\begin{tikzpicture}[xscale=0.8]
\draw [domain=-5.5:5.5,samples=200, thick, mblue] plot (\x, {1.5* (cosh (\x))^(-2)});
\end{tikzpicture}
\end{center}
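To see where this comes from, we can look for travelling wave solutions $u(x, t) = \phi(\xi)$ with $\xi = x - ct$, decaying (along with derivatives) as $|\xi| \to \infty$. The KdV equation then reduces to the ODE
\[
-c\phi' + \phi''' - 6\phi\phi' = 0.
\]
Integrating once, with the decay killing the integration constant, we get $\phi'' = c\phi + 3\phi^2$. Multiplying by $\phi'$ and integrating again gives
\[
(\phi')^2 = \phi^2(c + 2\phi),
\]
and one can check directly that $\phi(\xi) = -\frac{c}{2} \sech^2\left(\frac{\sqrt{c}}{2}\xi\right)$ satisfies this. Setting $c = 4\chi_1^2$ recovers the $1$-soliton solution above.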
The solution tries to both topple over and disperse, and it turns out the two effects balance exactly, so that it moves like a normal wave at a constant speed. If we look at the solution, then we see that it has the peculiar property that the speed of the wave depends on the amplitude: the soliton above has depth $2\chi_1^2$ and speed $4\chi_1^2$. The taller you are, the faster you move.
Now what if we started with \emph{two} of these solitons? If we placed them far apart, then they should not interact, and they would just individually move to the right. But note that the speed depends on the amplitude. So if we put a taller one before a shorter one, they might catch up with each other and then collide! Indeed, suppose they started off looking like this:
\pgfplotsset{compat=1.12}
\pgfplotsset{width=\textwidth, height=0.4\textwidth, axis lines=none}
\begin{center}
\centering
\begin{tikzpicture}
\begin{axis}[restrict x to domain=-20:0, ymax=0.8]
\addplot [thick, mblue] table [x={x}, y={0}] {solitons.csv};
\end{axis}
\end{tikzpicture}
\end{center}
After a while, the tall one starts to catch up:
\begin{center}
\centering
\begin{tikzpicture}
\begin{axis}[restrict x to domain=-15:5, ymax=0.8]
\addplot [thick, mblue] table [x={x}, y={1}] {solitons.csv};
\end{axis}