Skip to content

SvenDH/CapsNet-ASR

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

39 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

CapsNet-ASR

Phoneme recognition with Capsule Networks

Presentation 24 april link: https://docs.google.com/presentation/d/149ZALP9stKvSWqu2N12QfzwSR06X3CsfPLHTvgIRO0U/edit?usp=sharing

Extra ideas (possible individual research questions):

Since our frames are not necessarily of only one phone, we can maybe label them as 0.75A 0.25B and compare this with the classifier distribution. This would mean that (0.75A 0.25B) predicted as (0.6A 0.2B 0.2C) is better than (0.6A 0.1B 0.3C), even though in both predictions the amount of prediciton for A is the same.

Extra reading:

First paper to use 48 (39) instead of 61 labels: http://repository.cmu.edu/cgi/viewcontent.cgi?article=2768&context=compsci

Single recurrent neural network on TIMIT phoneme classification: http://people.idsia.ch/~santiago/papers/IDSIA-04-08.pdf

Good general info about TIMIT and phoneme classification: https://www.intechopen.com/books/speech-technologies/phoneme-recognition-on-the-timit-database

About

Phoneme recognition with Capsule Networks

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 3

  •  
  •  
  •