[SeFa] Closed-Form Factorization of Latent Semantics in GANs (Jul 2020) #21


0. Article Information and Links

1. What do the authors try to accomplish?

Unsupervised discovery of semantic latent space directions.

2. What's great compared to previous research?

  • Unsupervised is better than supervised because (1) it needs no labels, and (2) it can discover more directions than labels alone would allow.
  • GANSpace's PCA approach requires sampling latent codes and computing PCA over the samples. This takes 1.5 hours on BigGAN and 2 minutes on StyleGAN.
    • Recap: PCA finds the most important directions of variation among data points in a high-dimensional space. It requires many data points to be effective.
  • SeFa, however, has a closed-form solution and requires no sampling or PCA computation; it operates only on the trained weight values of the latent code's linear projections. This runs in under 1 second.
    • The authors prove that we only need to examine the trained weights of the first linear projection to find semantic changes in the output image (Equation 3).
  • Quantitative and qualitative results show SeFa's directions to be more disentangled than GANSpace's.
    [Figure: SeFa vs. GANSpace comparison]

3. Where are the key elements of the technology and method?

2.2 Unsupervised Semantic Factorization

"our goal turns into finding the directions n that can cause the
significant change of y."

"we make an assumption that a large change of [the latent code's linear projection,] y, will lead to a large content variation of [the output image]"

Equation 3: A change in the projection space, ∆y, is equivalent to αAn.
$$
\Delta \mathbf{y} = \alpha A \mathbf{n}
$$
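A quick numeric check of Equation 3 (a toy sketch: the sizes and the affine form y = Az + b are stand-ins for the generator's actual first layer):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 512                                  # latent dimensionality (stand-in value)
A = rng.normal(size=(1024, d))           # toy first-layer weight
b = rng.normal(size=1024)                # toy first-layer bias

def first_layer(z):
    return A @ z + b                     # y = Az + b

z = rng.normal(size=d)                   # a latent code
n = rng.normal(size=d)
n /= np.linalg.norm(n)                   # a unit direction
alpha = 3.0

delta_y = first_layer(z + alpha * n) - first_layer(z)  # the bias b cancels
assert np.allclose(delta_y, alpha * A @ n)             # Δy = αAn
```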

Equation 4: to get the largest change, frame it as a constrained optimization problem.
$$
\mathbf{n}^* = \underset{\{\mathbf{n} \,:\, \mathbf{n}^T \mathbf{n} = 1\}}{\arg\max} \; \|A \mathbf{n}\|_2^2
$$
The constraint is that nᵀn = 1, i.e. unit vectors only. We maximize the squared L2 norm, akin to distance. Why the constraint? Because ‖An‖ grows linearly as we scale n, so a "max" over an unbounded space is meaningless; the unit sphere establishes the boundary.
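To make that concrete, a tiny sketch (toy A): scaling n scales the objective, so without the unit-norm constraint the "max" runs off to infinity.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(1024, 512))            # toy weight matrix
n = rng.normal(size=512)

for c in (1.0, 10.0, 100.0):
    print(np.linalg.norm(A @ (c * n)) ** 2)  # ||A(cn)||² = c²·||An||²: unbounded in c
```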

Equation 5: the top-k version of Equation 4, for the k most meaningful directions (reconstructed below).
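Reconstructed from the description, with each n_i constrained to unit norm:

$$
N^* = \underset{\{\mathbf{n}_i \,:\, \mathbf{n}_i^T \mathbf{n}_i = 1,\; i = 1, \dots, k\}}{\arg\max} \; \sum_{i=1}^{k} \|A \mathbf{n}_i\|_2^2
$$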

Equation 6: Constrained optimization can be easily solved with the Lagrangian.
$$
\mathcal{L} = \sum_{i=1}^{k} \mathbf{n}_i^T A^T A \, \mathbf{n}_i - \sum_{i=1}^{k} \lambda_i \left( \mathbf{n}_i^T \mathbf{n}_i - 1 \right)
$$

  • Lagrangian recap: the left term f(n_1, n_2, n_3, …) is the objective over the unconstrained space; the right term g(n_1, n_2, n_3, …) encodes the constraint. The "− 1" pins each n_i to the unit sphere (n_iᵀn_i − 1 = 0).
  • We simply rewrite Equation 4/5 in Lagrangian form.

Equation 7: This is the derivative of the Lagrangian in equation 6; we find the max by finding the critical points (where the derivative is 0).
$$
\frac{\partial \mathcal{L}}{\partial \mathbf{n}_i} = 2 A^T A \, \mathbf{n}_i - 2 \lambda_i \mathbf{n}_i = 0
$$

Note: we can simplify by dropping the factor of 2; the solutions are the same.

  • This implies that every solution n_i (any vector that zeroes the equation) must be an eigenvector of AᵀA, with λ_i the corresponding eigenvalue.
    • Eigenvector recap: applying the matrix AᵀA to one of its eigenvectors n is equivalent to scaling n by some scalar; that scalar is λ.
    • Eigenvectors matter here because, for a symmetric matrix like AᵀA, they are pairwise orthogonal (when their eigenvalues differ).
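A toy sanity check of Equation 7 (the matrix A is a made-up stand-in): plugging in an eigenvector of AᵀA, with λ its eigenvalue, makes the derivative vanish.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(1024, 512))            # toy first-layer weights
eigvals, eigvecs = np.linalg.eigh(A.T @ A)  # AᵀA is symmetric, so eigh applies

n, lam = eigvecs[:, -1], eigvals[-1]        # top eigenpair
grad = 2 * A.T @ A @ n - 2 * lam * n        # left-hand side of Equation 7
assert np.allclose(grad, 0.0, atol=1e-7)    # critical point: derivative ≈ 0
```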

2.3 Property of the Discovered Semantics

Equation 8: To find the eigenvalues, we use the eigendecomposition. These eigenvalues correspond to eigenvectors n_i of the latent space. This is a special case of the eigendecomposition because AᵀA is positive semi-definite; see page 5 of these lecture notes, or Wikipedia.
$$
A^T A = Q \Lambda Q^{-1} = Q \Lambda Q^T
$$

(The columns of Q are the eigenvectors and Λ is the diagonal matrix of eigenvalues; Q⁻¹ = Qᵀ because AᵀA is symmetric, which makes Q orthogonal.)

The eigenvectors are orthogonal. From the above lecture notes: "The important properties of a positive semi-definite matrix is that [...] eigenvectors are pairwise orthogonal when their eigenvalues are different."

"Obviously, each n_i is a column of Q". Why? NOT SURE.
Draft: Diagonal matrix Lambda are the eigen values,

Since eigenvectors are orthogonal, they represent independent semantic directions in latent space.
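Put together, the whole method is a few lines of linear algebra. A minimal sketch (my own, not the authors' released code; `weight` stands in for the generator's first projection matrix A):

```python
import numpy as np

def sefa_directions(weight, k=5):
    """Closed-form semantic directions: the top-k eigenvectors of AᵀA.

    weight: the first linear projection A, shape (out_dim, latent_dim).
    Returns k unit-norm directions as rows, sorted by eigenvalue (descending).
    """
    eigvals, eigvecs = np.linalg.eigh(weight.T @ weight)  # eigh: ascending order
    return eigvecs[:, ::-1][:, :k].T                      # reverse, take top k

# Toy usage: the returned directions come back orthonormal, as Section 2.3 claims.
A = np.random.default_rng(0).normal(size=(1024, 512))
dirs = sefa_directions(A, k=5)
assert np.allclose(dirs @ dirs.T, np.eye(5), atol=1e-8)
```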

Equation 9: Having orthogonal latent vectors also produces orthogonal projections y.
$$
(A \mathbf{n}_i)^T (A \mathbf{n}_j) = \mathbf{n}_i^T A^T A \, \mathbf{n}_j = \lambda_j \, \mathbf{n}_i^T \mathbf{n}_j = 0 \quad (i \neq j)
$$
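Continuing the sketch above, a quick numeric check of Equation 9: the projections A n_i of distinct directions are mutually orthogonal.

```python
proj = dirs @ A.T                             # row i is (A n_i)ᵀ, shape (5, 1024)
gram = proj @ proj.T                          # entry (i, j) = n_iᵀ AᵀA n_j
off_diag = gram - np.diag(np.diag(gram))
assert np.allclose(off_diag, 0.0, atol=1e-6)  # ≈ 0 off the diagonal
```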

4. How do the authors measure success?

Quantitative

Compare the cosine distance of the found vectors to the "ground-truth" directions learned by the supervised InterFaceGAN.
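That is, the standard cosine measure between a discovered direction n_1 and InterFaceGAN's supervised direction n_2 (notation mine):

$$
\cos(\mathbf{n}_1, \mathbf{n}_2) = \frac{\mathbf{n}_1^T \mathbf{n}_2}{\|\mathbf{n}_1\| \, \|\mathbf{n}_2\|}
$$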

Qualitative

Some test results

Appears to disentangle moderately well for some attributes

Advantage of unsupervised over supervised

5. How did you verify that it works?

TODO: apply to generated clothes latent space.

6. Things to discuss? (e.g. weaknesses, potential for future work, relation to other work)

Weaknesses

  • The authors' stated goal is to find directions that cause the most perturbation in the output image; but what about entanglement? A big change is not necessarily semantically isolated; it could be a change composed of several semantics.
    • This is shown in Table 2: there is still entanglement.

    • Supposedly, disentanglement is addressed by the eigendecomposition, which yields mutually orthogonal directions.

      • As acknowledged by the authors, orthogonal directions don't totally disentangle: "(masculinizing + aging) and (feminizing + aging) are also two orthogonal directions in the latent space."
  • The authors make an assumption that a large change in the linear projection of the latent space will lead to a large variation in the output image (section 2.2, first ¶). However, that assumption is not proven in the paper.
  • The authors assume that the rate of change of the projected space y is monotonic, and that the max change found within the unit-norm constraint also holds outside it. Is that guaranteed, or can the rate of change vary across the latent space? Recall StyleGAN's concept of Perceptual Path Length: an entangled space might be drastically curved.

How does this compare to GANSpace's PCA approach? PCA can also be done via an eigendecomposition (of the sample covariance matrix), so the key difference is what gets decomposed; see the sketch below.
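A sketch of the contrast (toy matrices, not GANSpace's actual pipeline): GANSpace runs PCA on many sampled projections, i.e. an eigendecomposition of their covariance, whereas SeFa eigendecomposes AᵀA directly, with no sampling.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(1024, 512))                 # first-layer weights

# GANSpace-style: sample latents, project, then PCA (eigendecompose the covariance).
Z = rng.normal(size=(10_000, 512))               # needs many samples -> slow on big models
Y = Z @ A.T                                      # projected codes
pca_vals, pca_vecs = np.linalg.eigh(np.cov(Y, rowvar=False))

# SeFa-style: eigendecompose AᵀA; no sampling at all, so it runs in well under a second.
sefa_vals, sefa_vecs = np.linalg.eigh(A.T @ A)
```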

7. Are there any papers to read next?

8. References
