Changes to +DL+Whisper

#author("2024-07-18T10:41:30+08:00","default:Admin","Admin")
#author("2024-07-18T10:43:31+08:00","default:Admin","Admin")
[[Deep Learning]]

#contents

&color(red){Prerequisite: the information on this page is based on Whisper 1.5.0.};

* Whisper [#v9e2bc78]

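This approach uses Whisper, the open-source speech recognition model from OpenAI, which is written in Python. Install a few packages, write a few lines of code, wait a while (how long depends on your machine's performance and the length of the audio or video), and the transcribed text comes out. It really is that simple.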
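The "few lines of code" mentioned above might look like the following minimal sketch. It assumes the `openai-whisper` package and ffmpeg are installed. The model name `"base"` and the helper names `transcribe` and `format_segment` are illustrative choices, not something this page prescribes.

```python
def format_segment(start, end, text):
    """Format one segment as '[start -> end] text', times in seconds."""
    return "[%.2fs -> %.2fs] %s" % (start, end, text.strip())


def transcribe(audio_path, model_name="base"):
    """Transcribe one file with openai-whisper and return formatted lines."""
    # Imported lazily so the helper above works without the package;
    # assumes `pip install openai-whisper` has been run.
    import whisper

    model = whisper.load_model(model_name)   # downloads the model on first use
    result = model.transcribe(audio_path)    # dict with "text" and "segments"
    return [
        format_segment(seg["start"], seg["end"], seg["text"])
        for seg in result["segments"]
    ]
```

Calling `transcribe("audio.wav")` with a real audio file (the file name here is a placeholder) returns one formatted line per recognized segment.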

GitHub repository:

 https://github.com/openai/whisper

Reference:

 https://blog.csdn.net/xiaohucxy/article/details/134838912

Model downloads:

 https://huggingface.co/ggerganov/whisper.cpp/tree/main
 // uploaded by a different contributor
 https://huggingface.co/Systran
** CPU [#vca4d666]

Install the following two packages from NuGet:
- Whisper.net
- Whisper.net.Runtime

** GPU [#i3b54b68]

Install the following two packages from NuGet:
- Whisper.net
- Whisper.net.Runtime.Clblast


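- Whisper.net.Runtime.Cublas: cuBLAS is a linear algebra library provided by NVIDIA. It contains functions for linear algebra problems such as matrix-vector and matrix-matrix multiplication.
- Whisper.net.Runtime.Clblast: CLBlast is a BLAS (Basic Linear Algebra Subprograms) library for OpenCL. It provides functions for performing linear algebra operations.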


* Faster-Whisper [#e9d4529a]

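Whisper is already simple to install and call, but for programmers, who are famously "lazy", it is still not lean enough: PyTorch, ffmpeg, and sometimes even Rust have to be installed separately.

Hence the faster and leaner faster-whisper. It is not a thin wrapper around Whisper. It reimplements OpenAI's Whisper model on top of CTranslate2, a fast inference engine for Transformer models.

In short, it is faster than Whisper: the project is said to claim a 4 to 8 times speedup. It supports both GPU and CPU, and it even runs on my aging Mac.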

GitHub repository:

 https://github.com/SYSTRAN/faster-whisper

CUDA download:

 https://developer.nvidia.com/cuda-downloads

Running a model saved locally (the example below loads the small model):

#codeprettify{{
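from faster_whisper import WhisperModel

# Model name, used only by the commented-out alternatives below
# (passing a name instead of a path lets faster-whisper download the model).
model_size = "small"

# Local directory that holds the downloaded faster-whisper-small files.
path = r"E:\aSer\whisper\faster-whisper-small"

# Run on CPU with INT8, loading the model from the local directory only
model = WhisperModel(model_size_or_path=path, device="cpu", compute_type="int8", local_files_only=True)

# or run on GPU with FP16
# model = WhisperModel(model_size, device="cuda", compute_type="float16")
# or run on GPU with INT8
# model = WhisperModel(model_size, device="cuda", compute_type="int8_float16")

# Transcribe Chinese audio with beam search; the VAD filter removes non-speech audio.
segments, info = model.transcribe(r"E:\aSer\whisper\20240716091034.wav", beam_size=5, language="zh", vad_filter=True, vad_parameters=dict(min_silence_duration_ms=1000))

# segments is a generator, so the transcription runs while it is iterated.
for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))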
}}
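The loop above only prints the segments. As one possible next step, the sketch below turns the same `start`, `end`, and `text` attributes into SRT subtitle text. `Segment` is a hypothetical stand-in for the objects faster-whisper yields, and `srt_timestamp` and `segments_to_srt` are helper names introduced here for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in exposing the three attributes the loop above reads.
Segment = namedtuple("Segment", ["start", "end", "text"])


def srt_timestamp(seconds):
    """Convert seconds to the SRT form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return "%02d:%02d:%02d,%03d" % (hours, minutes, secs, ms)


def segments_to_srt(segments):
    """Build SRT text from any iterable of objects with start, end, text."""
    blocks = []
    for index, segment in enumerate(segments, start=1):
        blocks.append(
            "%d\n%s --> %s\n%s\n"
            % (
                index,
                srt_timestamp(segment.start),
                srt_timestamp(segment.end),
                segment.text.strip(),
            )
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)
```

Because the function only reads attributes, the generator returned by `model.transcribe(...)` could be passed to `segments_to_srt` directly, and the result written to a `.srt` file.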



** Download links [#td954730]

These links can only be reached through a VPN or proxy.

|large-v3 model|https://huggingface.co/Systran/faster-whisper-large-v3/tree/main|
|large-v2 model|https://huggingface.co/guillaumekln/faster-whisper-large-v2/tree/main|
|large-v1 model|https://huggingface.co/guillaumekln/faster-whisper-large-v1/tree/main|
|medium model|https://huggingface.co/guillaumekln/faster-whisper-medium/tree/main|
|small model|https://huggingface.co/guillaumekln/faster-whisper-small/tree/main|
|base model|https://huggingface.co/guillaumekln/faster-whisper-base/tree/main|
|tiny model|https://huggingface.co/guillaumekln/faster-whisper-tiny/tree/main|
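Instead of downloading the files by hand, a model could also be fetched from a script. The sketch below assumes the `huggingface_hub` package is available (it is normally installed together with faster-whisper) and that Hugging Face is reachable from your network. The repository IDs are taken from the URLs in the table above. `repo_for` and `download_model` are hypothetical helper names.

```python
# Repository IDs taken from the URLs in the table above.
MODEL_REPOS = {
    "large-v3": "Systran/faster-whisper-large-v3",
    "large-v2": "guillaumekln/faster-whisper-large-v2",
    "large-v1": "guillaumekln/faster-whisper-large-v1",
    "medium": "guillaumekln/faster-whisper-medium",
    "small": "guillaumekln/faster-whisper-small",
    "base": "guillaumekln/faster-whisper-base",
    "tiny": "guillaumekln/faster-whisper-tiny",
}


def repo_for(model_size):
    """Return the Hugging Face repository ID for a model size."""
    try:
        return MODEL_REPOS[model_size]
    except KeyError:
        raise ValueError("unknown model size: %r" % model_size) from None


def download_model(model_size, target_dir):
    """Download all files of one model into target_dir and return the path."""
    # Imported lazily; assumes `pip install huggingface_hub`.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_for(model_size), local_dir=target_dir)
```

The directory returned by `download_model` can then be passed as `model_size_or_path` to `WhisperModel`, as in the local-model example on this page.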

Download cuBLAS and cuDNN:

 https://github.com/Purfview/whisper-standalone-win/releases/tag/libs

** Environment setup [#ae0befd5]

Create a Python environment with conda:

 conda create -n faster_whisper python=3.9 # Python 3.8 to 3.11 is required

Activate the virtual environment:

 conda activate faster_whisper

Install faster-whisper and its dependencies:

 pip install faster-whisper
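After these steps, a quick sanity check of the environment can save some debugging time. The sketch below tests the interpreter against the 3.8 to 3.11 range given above and checks whether the `faster_whisper` package can be found. Both function names are illustrative.

```python
import importlib.util
import sys


def python_version_supported(version_info=None):
    """Return True if the version is within the 3.8 to 3.11 range stated above."""
    major, minor = (version_info or sys.version_info)[:2]
    return major == 3 and 8 <= minor <= 11


def faster_whisper_installed():
    """Return True if the faster_whisper package can be located (without importing it)."""
    return importlib.util.find_spec("faster_whisper") is not None


print("Python version supported:", python_version_supported())
print("faster-whisper installed:", faster_whisper_installed())
```

Run it inside the activated `faster_whisper` environment. If either line reports `False`, revisit the corresponding step above.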

* Distil-Whisper [#e89abc06]

Distil-Whisper is a distilled version of Whisper that is 6 times faster, 49% smaller, and performs within 1% word error rate (WER) on out-of-distribution evaluation sets:

 https://github.com/huggingface/distil-whisper?tab=readme-ov-file

Model download:

 https://huggingface.co/distil-whisper/distil-large-v3


