Installing whisperX with Docker

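whisperX is a speech-to-text library that can also tell speakers apart, which makes it a good fit for transcribing meeting recordings. Its big drawback is the environment: the dependencies are complicated and it needs a specific CUDA version, so I decided to use Docker to deal with that.
The existing images did not meet my needs, because nearly all of them are built for one-off runs. What I want is a long-running web service that exposes speech-to-text through an API. I found a Dockerfile on GitHub that works as a starting point.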

Pitfalls to watch out for

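Cloning GitHub projects from inside China can be slow, so I use a GitHub mirror to speed things up. I use 521github; you only need to swap the host in the repository URL.
pip installs are slow as well, so I use the Tsinghua PyPI mirror.
After changing the Dockerfile, it is best to rebuild with docker-compose build --no-cache. Running docker-compose up -d --build directly may reuse cached layers.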
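
The two mirror tips above can be sketched in shell as follows. This is only an illustration: the helper name to_mirror_url is made up for this example, the 521github host is the one used in this post (whether it is reachable for you is not guaranteed), and PIP_INDEX_URL is the environment variable pip reads for its index URL.

```shell
#!/usr/bin/env bash
# Sketch: mirror settings for builds that run from inside China.

# Hypothetical helper: rewrite a github.com URL to the 521github mirror host.
# URLs that do not start with https://github.com/ are printed unchanged.
to_mirror_url() {
  printf '%s\n' "$1" | sed 's#^https://github\.com/#https://521github.com/#'
}

# pip picks up PIP_INDEX_URL from the environment, so exporting it once
# has the same effect as passing -i to every pip3 install call.
export PIP_INDEX_URL="https://pypi.tuna.tsinghua.edu.cn/simple"

to_mirror_url "https://github.com/m-bain/whisperX"
# prints: https://521github.com/m-bain/whisperX
```

Inside a Dockerfile, the same pip setting could be written once as ENV PIP_INDEX_URL=https://pypi.tuna.tsinghua.edu.cn/simple instead of repeating -i on each pip3 install line.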

The finished Dockerfile

Here is the code:

FROM nvidia/cuda:11.8.0-runtime-ubuntu22.04

# Back up the apt sources list before changing it
RUN cp /etc/apt/sources.list /etc/apt/sources.list.bak
# Switch to the Aliyun apt mirror
RUN sed -i 's/http:\/\/archive.ubuntu.com\/ubuntu\//http:\/\/mirrors.aliyun.com\/ubuntu\//g' /etc/apt/sources.list
RUN sed -i 's/http:\/\/security.ubuntu.com\/ubuntu\//http:\/\/mirrors.aliyun.com\/ubuntu\//g' /etc/apt/sources.list
# Install the SSH server and a few basic tools
RUN apt-get update && apt-get install -y openssh-server vim iputils-ping wget curl git
RUN mkdir /var/run/sshd
# Set the root password and allow root login over SSH (use your own password)
RUN echo 'root:qingli520' | chpasswd
RUN sed -i 's/#PermitRootLogin prohibit-password/PermitRootLogin yes/' /etc/ssh/sshd_config
# Install whisperX
RUN apt-get update -y 
RUN apt-get install -y git ffmpeg software-properties-common
RUN add-apt-repository -y ppa:deadsnakes/ppa
RUN apt-get install -y python3.10 python3-pip
#RUN pip3 install setuptools-rust -i https://pypi.tuna.tsinghua.edu.cn/simple
RUN pip3 install torch==2.0.0+cu118 torchaudio==2.0.0 -f https://download.pytorch.org/whl/torch_stable.html -i https://pypi.tuna.tsinghua.edu.cn/simple
#RUN pip3 install git+https://521github.com/m-bain/whisperX -i https://pypi.tuna.tsinghua.edu.cn/simple
WORKDIR /
RUN git clone https://521github.com/m-bain/whisperX
WORKDIR /whisperX
RUN pip3 install -r requirements.txt
#RUN mkdir /app

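# Turn off sshd's StrictModes check, which verifies the ownership and
# permissions of the user's home directory and key files before login
RUN sed -i 's/#StrictModes yes/StrictModes no/' /etc/ssh/sshd_config

# Expose the SSH port
EXPOSE 22

# Start the SSH service in the foreground
CMD ["/usr/sbin/sshd", "-D"]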
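
As a rough sketch of how the image above might be built and used, the script below builds it without cache, starts a container with GPU access, and logs in over SSH. The image tag, container name, and host port are assumptions made for this example and do not come from the original setup. The run helper is also just an illustration: it prints each command and only executes it when DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch: build the image, start it with GPU access, and log in over SSH.
# Assumed names (not from the post): image tag, container name, host port.
IMAGE="${IMAGE:-whisperx-ssh}"
CONTAINER="${CONTAINER:-whisperx}"
HOST_PORT="${HOST_PORT:-2222}"

# Print each command; execute it only when DRY_RUN=0 is set.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# --no-cache forces every layer to be rebuilt after a Dockerfile change.
run docker build --no-cache -t "$IMAGE" .
# --gpus all needs the NVIDIA Container Toolkit installed on the host.
run docker run -d --gpus all --name "$CONTAINER" -p "${HOST_PORT}:22" "$IMAGE"
# Log in with the root password set in the Dockerfile.
run ssh -p "$HOST_PORT" root@localhost
```

Running the script as-is only previews the three commands. Running it with DRY_RUN=0 from the directory that contains the Dockerfile executes them.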

EchoBlog

Alan
