
# [archive] How exam works

Oleg Vasilev edited this page Sep 13, 2017 · 1 revision

## Unnecessarily grumpy intro

Hard to admit, but our attempt to incentivize projects wasn't as popular as we hoped (if you are among the few heroes who jumped in - know that you're awesome). Additionally, the HSE CS department forced us to roll out some kind of an exam at the end of the course. So here it goes, our kinda-sorta-optional-HSE-exam.

## Rules & grading

Instead of a regular exam with tests and questions, we ask you to pick a problem from the list (or suggest your own), develop a solution for that problem, and make sure you understand what exactly you wrote and why.

## Skipping the exam

Anyone with at least 40 points can skip the examination. If you do, no bonus points will be awarded and the final grade will be determined by your score at the course deadline (26 dec 16).

If you already have enough points for an A grade, please do skip the exam even if you feel masochistic. There are bound to be more productive ways to spend your time. Consider ICLR workshops :)

## Grading

The baseline for a working solution is 10 pts, explaining the basics of how you made it is +5 pts, and up to +5 pts for answering questions on it, which adds up to 20 pts.

Any clever tricks, useful (not just nice) visualizations and creativity will result in some bonus points, at approximately the same rate as in homework assignments.

## What is a good solution

Any self-respecting solution

  • Should solve the problem you picked, not something else :)
  • Should evaluate quality in some way (even for generative models, at least evaluate test loss and give samples)
  • Should have an applier notebook/app that uses pre-trained model
  • Should be explainable by you in under 10 minutes (Russian or English, no preference)
  • Should either attempt the relevant approaches covered in the base practical assignments of the course, or provide motivation for not attempting them.
  • Should at least attempt a deep learning solution. Showing that a non-DL solution is better in some way will result in bonus points, as long as your DL solution isn't unreasonably flawed.
  • Should not contain paragraphs of comments to remind you how every screw works (unless you can't do without them)
  • Should not cause spontaneous eye bleeding (100 pages of debug prints is a no)
  • It's okay to use another person's code as long as you understand it as well as if you wrote it yourself. It would be great if you mentioned that person, even if they're your groupmate :)
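The "applier notebook" requirement above can be sketched as follows. This is a purely illustrative numpy example (the linear "model", file name, and helper names are invented, not part of the course materials): the training notebook saves its learned parameters, and a separate applier notebook only loads them and runs the forward pass.

```python
import pickle
import numpy as np

# Hypothetical minimal "model": weights of a linear classifier trained elsewhere.
# In a real project this would be your trained network's parameters.
def save_model(weights, bias, path):
    with open(path, "wb") as f:
        pickle.dump({"weights": weights, "bias": bias}, f)

def load_model(path):
    with open(path, "rb") as f:
        params = pickle.load(f)
    return params["weights"], params["bias"]

def apply_model(weights, bias, x):
    # Forward pass only - no training code belongs in the applier notebook
    logits = x @ weights + bias
    return logits.argmax(axis=-1)

# --- "training notebook" side: pretend these weights were learned ---
w = np.array([[2.0, -1.0], [-1.0, 2.0]])
b = np.array([0.0, 0.0])
save_model(w, b, "model.pkl")

# --- "applier notebook" side: load pre-trained weights and predict ---
w2, b2 = load_model("model.pkl")
preds = apply_model(w2, b2, np.array([[1.0, 0.0], [0.0, 1.0]]))
print(preds)  # [0 1]
```

The point is the separation: a grader should be able to open the applier side, load your checkpoint, and get predictions without re-running training.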

## Understanding

Ideally, one should be able to explain

  • What architecture you used and some basic motivations (why convolutional? why recurrent? why use batch normalization?)
  • Dark knowledge of what exactly the optimal nonlinearity / training curriculum is will be appreciated, but is not expected
  • Exact math and formulae derivations are, again, appreciated, but not expected
  • The basic rule is that you should be able to explain any non-default steps you made.
    • Random sampling is default; prioritized sampling needs an explanation
    • Regular optimization methods are default; applying some ada* method to all layers except the embedding, and vanilla SGD to the embedding, needs some motivation, or at least intuition
    • Using any external data/networks needs an explanation
  • A rule of thumb: if you actually developed the solution without mindlessly copypasting blobs of code, you should be okay even if you forget a thing or two.
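As a purely illustrative example of the kind of "non-default step" mentioned above (adaptive updates for dense layers, plain SGD for the embedding), here is a hypothetical numpy sketch - all names, shapes, and hyperparameters are invented for illustration:

```python
import numpy as np

# Hypothetical parameters: an embedding matrix plus one dense weight matrix
rng = np.random.default_rng(0)
embedding = rng.normal(size=(5, 3))
dense = rng.normal(size=(3, 2))
grad_sq = np.zeros_like(dense)  # Adagrad accumulator, kept for the dense layer only

def sgd_step(param, grad, lr=0.1):
    # Vanilla SGD update - the "default" you might keep for the embedding
    return param - lr * grad

def adagrad_step(param, grad, accum, lr=0.1, eps=1e-8):
    # Adagrad update: per-parameter learning rates via accumulated squared grads
    accum += grad ** 2
    return param - lr * grad / (np.sqrt(accum) + eps), accum

# Fake gradients standing in for a real backward pass
g_emb = rng.normal(size=embedding.shape)
g_dense = rng.normal(size=dense.shape)

embedding = sgd_step(embedding, g_emb)          # embedding: plain SGD
dense, grad_sq = adagrad_step(dense, g_dense, grad_sq)  # dense layer: Adagrad
```

If you do something like this in your solution, be ready to say *why* - e.g. why adaptive per-parameter rates were a bad fit for your embedding.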

## The list

Tags:

  • [image] computer vision
  • [text] natural language processing
  • [sound] speech processing
  • [ts] time series processing
  • [generative] involves generating something
  • [experimental] is likely to cause complications (you need to either solve it or show that the straightforward approach fails)
  • [gpu] having a GPU is near-required (not that it isn't useful everywhere)

1. [lite] [image] NotMnist Classification

2. [lite] [image] Recognize human emotion

3. [lite] [text] Classify review sentiment

4. [image] Classify leaves

5. [image] [gpu] Classify fish species

6. [image] [gpu] Classify satellite images

7. [image] [gpu] Classify ocean life

8. [image] Predict gender by handwriting

9. [image] Predict restaurant score given photo

10. [image] [generative] NotMnist generation

  • Learn to generate new fonts like the ones in the dataset

11. [experimental] [ts] Train financial model or prove it cannot be trained efficiently

12. [text] Classify salary given job description

13. [text] Score question answers

14. [text] Classify lame questions on stackoverflow

15. [text] Predict question tags on stackoverflow

16. [experimental] [text] Classify the s**t outa wikipedia

17. [experimental] [ts] Classify malware

18. [ts] [generative] Generate molecules for given properties

19. [ts] [generative] Name of the game

20. [sound] [gpu] Identify speakers by voice

21. [sound] Identify language by voice

  • A small dataset of audiobooks
  • Task - classify the language given a small (3-5s, you may adjust) extract.
  • Data augmentation OR external data usage may be crucial
    • e.g. parse a lot of audiobooks
  • For starters, learn to distinguish between any 2 languages
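The clip-extraction step for this task could look like the following minimal numpy sketch. The sample rate, durations, and function name are assumptions for illustration, not from the course materials:

```python
import numpy as np

def random_clip(waveform, sample_rate=16000, min_s=3.0, max_s=5.0, rng=None):
    """Cut a random 3-5 second extract from a longer mono recording."""
    if rng is None:
        rng = np.random.default_rng()
    clip_len = int(rng.uniform(min_s, max_s) * sample_rate)
    clip_len = min(clip_len, len(waveform))  # guard against short recordings
    start = rng.integers(0, len(waveform) - clip_len + 1)
    return waveform[start:start + clip_len]

# Fake 30-second mono recording at 16 kHz (real data: decoded audiobook audio)
audio = np.random.randn(30 * 16000)
clip = random_clip(audio, rng=np.random.default_rng(42))
print(len(clip) / 16000)  # somewhere between 3.0 and 5.0 seconds
```

Sampling many such clips per audiobook is one cheap way to do the data augmentation the task hints at.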

N-1. Homework as an exam task

  • Alternatively, you can pick any homework from the list (HW2, HW3, HW4, HW6, HW7, HW8, HW9, HW11, HW12) and complete it the regular way.
  • The implementation points will be awarded the same way you would otherwise get them for the homework, which is luckily the same amount as for the technical part of a project.
  • The lateness penalty is not applied.
  • You are expected to explain what exactly you were doing in the homework and answer basic questions on the implementation.