v1.2.0
Release v1.2.0: Engine Upgrade
With this release, we also welcome @andreafooo to our team, who did a lot of the heavy lifting for it; glad to have you! What also makes us really happy is the increased involvement of all the contributors in the growing aTrain user community.
Also, as a shameless plug, it is really cool that aTrain is now recommended at Harvard University (and many other universities worldwide) for the local transcription of sensitive audio material.
But on to the release, and it is a big one: it includes a completely rewritten backend, lots of new features and improvements, and faster-large-v3-turbo as the default model.
Major Improvements
- Completely new backend aTrain_core
- Support for faster-whisper large-v3
- New default model for a great balance between speed and accuracy: faster-whisper large-v3-turbo
- Support for distil-whisper single-language models for a large speed improvement. Currently only available for English and marked as beta, as there is an elusive bug where the GUI sometimes does not show completion even though the transcript is already finished and available in the folder.
- Introduction of our Model Manager: download only the models you really use, which greatly reduces the installer size (we include the speaker diarization model and faster-large-v3-turbo as the defaults)
- Halved installer size thanks to the Model Manager
- Rewrote the transcription time estimate function: the estimate is now updated live and should be very accurate
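
For readers curious what a live time estimate can look like, here is a minimal sketch in Python. It is not aTrain's actual implementation: the class `LiveEtaEstimator`, its parameters, and the assumption that the processing rate stays roughly constant are all hypothetical and only illustrate the general idea of extrapolating from the progress observed so far.

```python
import time


class LiveEtaEstimator:
    """Hypothetical helper: estimates remaining time from observed progress."""

    def __init__(self, total_audio_seconds, clock=time.monotonic):
        if total_audio_seconds <= 0:
            raise ValueError("total_audio_seconds must be positive")
        self.total = float(total_audio_seconds)
        self.clock = clock          # injectable clock, handy for testing
        self.start = clock()        # wall-clock time when processing began

    def remaining_seconds(self, processed_audio_seconds):
        """Return the estimated seconds left, or None if no progress yet."""
        # Clamp progress into the valid range [0, total].
        processed = min(max(float(processed_audio_seconds), 0.0), self.total)
        elapsed = self.clock() - self.start
        if processed == 0 or elapsed <= 0:
            return None
        # Audio seconds handled per wall-clock second so far.
        rate = processed / elapsed
        return (self.total - processed) / rate
```

Calling `remaining_seconds` each time a new chunk of audio finishes yields an estimate that keeps adjusting to the actual speed of the machine, rather than relying on a fixed guess made up front.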
Minor improvements
- Updated faster-whisper to 1.0.2
- Updated pyannote-audio to 3.2.0
Major bugfixes
- aTrain no longer crashes when the filename contains special characters. Thanks to @wenyuan-wu and @hirowa for figuring this out and to @SjDayg for fixing it
- Fixed a NumPy error during installation, thanks @samfisherirl