official Docker images for Kaldi #3284
Comments
Docker is not something that I really use myself so I wouldn't be able to
help a lot. If you are willing to help I'm open to the idea though.
…On Thu, May 2, 2019 at 11:22 AM Mortaza (Morrie) Doulaty < ***@***.***> wrote:
Are there any plans to add official Docker images for Kaldi on Docker Hub?
Running Kaldi inside containers might be quite helpful for some
users/workloads, and I think having official Kaldi images on Docker Hub
would be a good thing.
We can set up automated builds for CPU- and GPU-based images, and I can help
with the setup etc. if this is something that you think would be beneficial
to other users.
(We have some good experience with running containerized Kaldi ASR
workloads, both training and decoding, on a Slurm cluster.)
|
yes, happy to help |
What service supports doing automated builds of docker containers? Does
Docker Hub itself support that? I admit that I am not very familiar with
this. Right now, we are using Travis CI, which has no concept of Docker.
…On Thu, May 2, 2019 at 9:44 AM Mortaza (Morrie) Doulaty < ***@***.***> wrote:
yes, happy to help
to start with, we'll need to set up a new public repository on Docker Hub (
http://hub.docker.com/), which is the container registry that we're going
to use
and since you're the yoda master, it probably makes sense that you own the
organisation and the repository (similar to github) - so the orgname would
be kaldi-asr and the repository name would be kaldi
then the account owner needs to connect those two accounts together
(meaning DockerHub and GitHub) so that we can set up automated builds whenever
something new is pushed
Similar to other projects, we can have latest dev images (both CPU and GPU
versions)
and also images for branches that are more stable (I can see 5.0, 5.1,
5.2, 5.3, 5.4 branches, which seem to be stable versions), again both
CPU and GPU versions
--
Daniel Galvez
http://danielgalvez.me
https://github.com/galv
|
BTW, something to be aware of is that Kaldi uses absolute paths in a lot of
its files. The fact that Docker allows you to mount paths with different
names every time you run a container may cause some problems if you change
mounts frequently.
|
I created an organization `kaldiasr` (no `-` allowed) but the next steps
look complicated.
I could add someone else there, e.g. you, or preferably @galv or @kkm .
|
I mean, maybe if I add someone else to the team for the kaldiasr
organization on dockerhub they can do the next steps.
BTW, I don't want to build images for much older versions, the support gets
too much. 5.4 is the lowest I'd go.
|
@galv DockerHub supports building images there - we can also use your existing Travis CI pipeline to build, tag and push images to DockerHub, please have a look here: https://docs.travis-ci.com/user/docker/#building-a-docker-image-from-a-dockerfile @danpovey sure, however you feel like it's more appropriate - and sure, will include |
OK, email me with your id on dockerhub.
…On Thu, May 2, 2019 at 1:22 PM Mortaza (Morrie) Doulaty < ***@***.***> wrote:
@galv <https://github.com/galv> DockerHub support building images there -
we can also use your existing Travis CI pipeline to build, tag and push
images to DockerHub, please have a look here:
https://docs.travis-ci.com/user/docker/#building-a-docker-image-from-a-dockerfile
There are no issues with absolute paths whatsoever
@danpovey <https://github.com/danpovey> sure, however you feel like it's
more appropriate - and sure, will include 5.4 onward
|
I don't think you understood my comment on the absolute paths problem. It
won't affect the build but it will affect running docker containers.
|
probably not fully understood what you meant then |
When you run Kaldi inside a container, it will use absolute paths based
on how the container has mounted its filesystem. The host has likely
mounted its filesystem differently, though, so this forces you to do all your
work inside the container. Not necessarily bad.
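One way to sidestep the path mismatch described above is to bind-mount the working directory at the same absolute path inside the container as on the host, so that any absolute path Kaldi writes into its files resolves identically on both sides. The following is only a sketch under that assumption: the helper name `docker_run_argv`, the image name `example/kaldi:latest`, and the directory `/data/exp` are placeholders, not anything defined in this thread. It relies only on the standard `docker run` options `--rm`, `-v`, and `-w`.

```python
import posixpath
import shlex


def docker_run_argv(image, shared_dir, command):
    """Build a `docker run` argument list that mounts shared_dir at the
    same absolute path inside the container as on the host."""
    if not posixpath.isabs(shared_dir):
        raise ValueError("shared_dir must be an absolute path")
    return [
        "docker", "run", "--rm",
        # Identical source and target, so absolute paths match on both sides.
        "-v", f"{shared_dir}:{shared_dir}",
        # Start the command in the shared directory.
        "-w", shared_dir,
        image,
        *command,
    ]


if __name__ == "__main__":
    # Placeholder image name and directory, for illustration only.
    argv = docker_run_argv("example/kaldi:latest", "/data/exp", ["bash", "run.sh"])
    print(shlex.join(argv))
```

This only helps as long as the same host directory is used on every run; changing the mount between runs brings the original problem back.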
On Thursday, May 2, 2019, Mortaza (Morrie) Doulaty wrote:
… probably not fully understood what you meant then
regardless, inside the container you can train without having to change
any folder structure of Kaldi and abs paths are fine
(can't think of why it can be an issue?)
|
probably the easiest would be: I create my proposed changes in my own forks, both in github and dockerhub, then you guys have a look and if all looks good, then we integrate in the main Kaldi repo in github and docker hub and continue there. |
Sounds good.
|
so here is the first version: It includes both CPU and GPU based images. I also pushed both images to DockerHub: I plan to add more image variants, a minimal image and etc. We also need to automate the building and pushing process, which can eventually be done (not entirely sure about building GPU based images in DockerHub, we may need to build them somewhere else that we have access to a GPU) |
Great! @galv do you have time to look into this? Sorry I have a lot to do today. |
Sure, @galv please have a look and let me know how would you like to proceed |
Seems okay to me, although I'm not sure that you need this line anymore: https://github.com/mdoulaty/kaldi/blob/75338cbd787943537322cae194e3d1ae11e7f103/docker/ubuntu16.04-gpu/Dockerfile#L26 My understanding was that the default |
as far as I remember in debian:9.8 there was no |
I will let @galv comment on that. |
Just tested the cpu image (for diarization). It works like a charm.. |
Thanks a lot!!
My preference is that @galv reviews this and lets me know whether to merge,
but if that doesn't happen by, say, Friday, ping me and I'll work on a
backup plan.
…On Tue, May 14, 2019 at 5:18 PM Fábio Franco Uechi ***@***.***> wrote:
so here is the first version:
https://github.com/mdoulaty/kaldi/tree/master/docker
It includes both CPU and GPU based images. I also pushed both images to
DockerHub:
https://cloud.docker.com/repository/docker/mdoulaty/kaldi/tags
I plan to add more image variants, a minimal image and etc.
We also need to automate the building and pushing process, which can
eventually be done (not entirely sure about building GPU based images in
DockerHub, we may need to build them somewhere else that we have access to
a GPU)
Just tested the cpu image (for diarization). It works like charm..
|
@fabito thanks for testing! |
Make a PR for it. I will look at it but I won't take the time to test it in
any capacity. I'm most interested in how we can do CI with these FYI.
On Tuesday, May 14, 2019, Mortaza (Morrie) Doulaty wrote:
… @fabito <https://github.com/fabito> thanks for testing!
those are temporary locations and hopefully they will be moved to the
official Kaldi repo here on GitHub as well as Docker Hub very soon
|
@mdoulaty, what are your thoughts on the "minimal" image? Is the idea to remove all the build dependencies and copy over only the compiled binaries and utility scripts? |
I'll go ahead and express my own thoughts on that. It's a moving target,
and it is better handled by a build system. In particular, cmake's cpack
packaging system is a good bet.
|
yes, something along those lines, have two envs in the Docker file, one for building Kaldi and one with just the compiled artifacts |
Kaldi wouldn't be much use without the scripts.
…On Wed, May 15, 2019 at 4:30 AM Mortaza (Morrie) Doulaty < ***@***.***> wrote:
Still a bit unsure if it's a good idea to include the scripts or just have
the core binaries there
|
I agree with Dan: Kaldi is not a product in itself. It is a set of building
blocks for ASR research, and the scripts are part of it. Without the scripts,
it's not of much use.
There is a certain gap between the needs of industry (product-oriented
people) and our conception of Kaldi as an ASR toolbox. @kkm000 or perhaps
@dgalvez can comment on how much work it is to bridge the gap.
y.
|
Sorry, one more thought: the things I have mentioned are among the
reasons we don't really care about virtualization and packaging. There is
not a strong benefit for researchers in going that way (or they have their
own infrastructure already set up and taken care of by the support team in
their company).
y.
|
okay then the minimal images will include all the scripts |
Does anyone actually use Kaldi dockers for training? Just curious. |
@sayint-ai Actually, I am using a Kaldi Docker container at my company, but my setup is complicated. I have included the compiled Kaldi binaries (CPU and GPU), some audio utilities (e.g. ffmpeg, sox), the SGE configuration, etc. I have not installed any Kaldi scripts in it. I build this container on k8s and use it for model training. |
@galv I enabled automatic builds in Docker Hub (for the CPU-only image). Apparently there is a 4-hour timeout limit, and with the VM that they provide, the image can't be built in 4 hours (a sample failed build can be found here: https://cloud.docker.com/repository/registry-1.docker.io/mdoulaty/kaldi/builds/650bc55f-9f18-4aeb-b98f-1ced857246bd). Then I tried integrating automatic builds in Travis: I updated the Travis YAML and enabled Docker builds there (see https://github.com/mdoulaty/kaldi/blob/master/.travis.yml for reference on how to enable Docker builds). This wasn't successful either, since Travis has a max limit of 50 mins (https://docs.travis-ci.com/user/customizing-the-build/#build-timeouts). I'll prepare some scripts to create a VM with a GPU (using some cloud-provider-agnostic tech, such as Terraform), pull Kaldi, build the images and push them to DockerHub. Then we can put this inside another Docker image and build this small image in Travis (which triggers the main build and returns well before the 50-min timeout). Any thoughts? |
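The constraint described above can be written down as a small check. This is only an illustrative sketch: the two limits are the ones quoted in this thread (4 hours for Docker Hub automated builds, 50 minutes for Travis CI), while the function name `services_that_fit` and the estimated build times in the usage lines are hypothetical and not measured values from the thread.

```python
# Build time limits as quoted in this thread, in minutes.
LIMITS_MIN = {
    "docker-hub": 4 * 60,
    "travis-ci": 50,
}


def services_that_fit(estimated_build_min, limits=LIMITS_MIN):
    """Return the names of the services whose limit exceeds the estimate."""
    return sorted(
        name for name, limit in limits.items()
        if estimated_build_min < limit
    )


if __name__ == "__main__":
    # Hypothetical estimates: a small trigger job and a full image build.
    print(services_that_fit(10))
    print(services_that_fit(300))
```

A short trigger job fits within both limits, while a build that exceeds four hours fits neither, which is the motivation for running the full build on a separately provisioned VM.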
Great, thanks a lot!
It may be possible to avoid the timeout by reducing optimization levels and
using shared libraries.
I don't have time to get into this in too big a way right now but I
appreciate your work. Maybe @galv can give some advice.
…On Sat, May 18, 2019 at 1:17 PM Mortaza (Morrie) Doulaty < ***@***.***> wrote:
Anyway neither of those was offering GPU support and we any way had to use
some other VMs that had GPUs. Now I guess we'll have to build CPU images
there as well. So not a big deal.
|
the simple work around (and what they officially suggest) is to break down into smaller images |
@galv here is a working version of the automated builds: |
@galv did you have a chance to check this? |
updated GPU image scripts |
Thanks! Let me know if there is anything you need from me, e.g. merging something. |
Sure, just sent a PR with some minor changes |
OK, great. Let's revisit the topic of moving that to kaldi-asr in the
future, I am pretty busy right now.
…On Thu, Jun 6, 2019 at 11:39 AM Mortaza (Morrie) Doulaty < ***@***.***> wrote:
Sure, just sent a PR with some minor changes
This initial part can be considered done. Two images are provided in the
main repo (CPU-based and GPU-based images).
Also have a side repo <https://github.com/mdoulaty/kaldi-image-builder>
which contains the automatic build & push scripts for the daily builds -
probably better to keep that as a separate repo (but can move to kaldi-asr
org if that makes sense). That repo includes some code for provisioning VMs
(in any public or private cloud provider - as long as it's supported by
Terraform, but the examples are with Microsoft Azure). I'm running those
builds daily on my account and pushing the latest versions of both CPU and
GPU images to Docker hub.
|
Semi-related NVIDIA maintains a docker Kaldi image with a once a month release cycle. We try to keep the source relatively recent with TOT. https://ngc.nvidia.com/catalog/containers/nvidia:kaldi Note this container is tested against NVIDIA hardware to validate that things are functionally correct. |
@hwiorn : Hi, I wish to perform Kaldi training from multiple docker containers being on different physical machines. I have experience with SGE and Kaldi in the past, but I have troubles making the containers visible for SGE. Could you provide please some hints about how you configured SGE inside containers? My physical machines are in the same LAN. Thanks! |
This is really a gridengine question- you should ask on the gridengine-users list. It's really a networking issue more than anything else, as you need the docker images to be individually addressable on the local network. |
An alternative approach could be having the SGE running outside the containers and change queue.pl to call the command as a Docker command. For example when you run |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
resolved, we now rely on github actions to get the images for docker built |
Are there any plans to add official Docker images for Kaldi on Docker Hub?
Running Kaldi inside containers might be quite helpful for some users/workloads, and I think having official Kaldi images on Docker Hub would be a good thing.
We can set up automated builds for CPU- and GPU-based images, and I can help with the setup etc. if this is something that you think would be beneficial to other users.
(We have some good experience with running containerized Kaldi ASR workloads, both training and decoding, on a Slurm cluster.)