
![](https://res.ailemon.net/common/asrt_title_header_en.png)

[![GPL-3.0 Licensed](https://img.shields.io/badge/License-GPL3.0-blue.svg?style=flat)](https://opensource.org/licenses/GPL-3.0) 
[![Stars](https://img.shields.io/github/stars/nl8590687/ASRT_SpeechRecognition)](https://github.com/nl8590687/ASRT_SpeechRecognition) 
[![TensorFlow Version](https://img.shields.io/badge/Tensorflow-1.15+-blue.svg)](https://www.tensorflow.org/) 
[![Python Version](https://img.shields.io/badge/Python-3.6+-blue.svg)](https://www.python.org/) 
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.5808434.svg)](https://doi.org/10.5281/zenodo.5808434)

ASRT is A Deep-Learning-Based Chinese Speech Recognition System. If you like this project, please **star** it. 

**ReadMe Language** | [中文版](https://github.com/nl8590687/ASRT_SpeechRecognition/blob/master/README.md) | English |

[**ASRT Project Home Page**](https://asrt.ailemon.net/) | 
[**Released Download**](https://wiki.ailemon.net/docs/asrt-doc/download) | 
[**View this project's wiki document (Chinese)**](https://wiki.ailemon.net/docs/asrt-doc) | 
[**Experience Demo**](https://asrt.ailemon.net/demo) | 
[**Donate**](https://wiki.ailemon.net/docs/asrt-doc/asrt-doc-1deo9u61unti9)

If you have any questions while working with this project, feel free to open an issue in this repo and I will respond as soon as possible.

Please check the [FAQ Page (Chinese)](https://wiki.ailemon.net/docs/asrt-doc/asrt-doc-1deoeud494h4f) before asking, to avoid duplicate questions.

If the program behaves abnormally, please attach a complete screenshot when asking, and state the CPU architecture, GPU model, operating system, and the Python, TensorFlow and CUDA versions used, as well as whether any code has been modified or datasets have been added or removed.
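
Part of the information requested above can be gathered automatically. The following is a minimal sketch, not part of ASRT: the helper names `package_version` and `environment_report` are hypothetical, and it assumes Python 3.8+ because it relies on `importlib.metadata`. It does not detect the GPU model or CUDA version; `nvidia-smi` reports those on machines with the NVIDIA driver installed.

```python
import platform
import sys
from importlib import metadata


def package_version(name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def environment_report():
    """Collect basic environment facts worth pasting into an issue."""
    return {
        "cpu_architecture": platform.machine(),
        "operating_system": platform.platform(),
        "python": sys.version.split()[0],
        "tensorflow": package_version("tensorflow"),
        "numpy": package_version("numpy"),
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

A value of `None` means the package was not found in the current environment.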

## Introduction

This project is implemented with tensorflow.keras and is based on deep convolutional neural networks, long short-term memory networks, an attention mechanism, and CTC.

## Minimum requirements for training
### Hardware
* CPU: 4+ cores (x86_64 / amd64)
* RAM: 16 GB+
* GPU: NVIDIA, 11 GB+ video memory (GTX 1080 Ti or better)
* Disk: 500 GB HDD (or SSD)

### Software
* Linux: Ubuntu 18.04 + / CentOS 7 +
* Python: 3.6 +
* TensorFlow: 1.15, 2.x + (the latest release and x.x.0 releases are not recommended)

## Quick Start
The following steps use Linux as an example:

First, clone the project to your computer with Git, then download the datasets needed for training. For download links, see the [Data Sets section at the end of this document](https://github.com/nl8590687/ASRT_SpeechRecognition/blob/master/README_EN.md#data-sets).
```shell
$ git clone https://github.com/nl8590687/ASRT_SpeechRecognition.git
```

Alternatively, use the "Fork" button to create your own copy of the project and then clone it locally with your own SSH key.

After cloning the repository via git, go to the project root directory; create a subdirectory `/data/speech_data` (you can use a soft link instead) for datasets, and then extract the downloaded datasets directly into it.

```shell
$ cd ASRT_SpeechRecognition

$ mkdir /data/speech_data

$ tar zxf <dataset zip files name> -C /data/speech_data/ 
```
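
If the datasets already live on another disk, the symbolic-link option mentioned above can be sketched as follows. This is an illustration only: `link_dataset_dir` is a hypothetical helper, not an ASRT script, and both paths are parameters you supply yourself.

```python
import os


def link_dataset_dir(existing_dir, link_path):
    """Create link_path as a symbolic link pointing at existing_dir.

    Returns True if a new link was created, False if link_path already exists.
    """
    if os.path.lexists(link_path):
        return False
    # Make sure the parent directory of the link exists before linking.
    parent = os.path.dirname(os.path.abspath(link_path))
    os.makedirs(parent, exist_ok=True)
    os.symlink(os.path.abspath(existing_dir), link_path, target_is_directory=True)
    return True
```

The function refuses to overwrite anything that already exists at `link_path`, so it is safe to call more than once.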

Note that the configuration file in the current version includes six datasets by default: Thchs30, ST-CMDS, Primewords, aishell-1, aidatatang200, and MagicData. Remove any that you do not need. To use other datasets, add the data configuration yourself and organize the data in the standard format supported by ASRT beforehand.

To download the pinyin syllable list files for the default datasets:
```shell
$ python download_default_datalist.py
```

Currently available models are 24, 25, 251 and 251bn.

Before running this project, please install the required [Python 3 dependency libraries](https://github.com/nl8590687/ASRT_SpeechRecognition#python-import).

To start training this project, please execute:
```shell
$ python3 train_speech_model.py
```
To evaluate (test) the model, execute:
```shell
$ python3 evaluate_speech_model.py
```
Before testing, make sure the model file path set in the code files exists.

To run speech recognition on a single WAV audio file:
```shell
$ python3 predict_speech_file.py
```
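
The Model section of this README states that input audio may be at most 16 seconds long, so it can help to check a clip before recognition. The sketch below uses only the standard-library `wave` module; the function names and the `MAX_SECONDS` constant are hypothetical and not taken from the ASRT code.

```python
import wave

MAX_SECONDS = 16.0  # limit quoted in the Model section of this README


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def fits_model_input(path, max_seconds=MAX_SECONDS):
    """Return True if the clip is no longer than max_seconds."""
    return wav_duration_seconds(path) <= max_seconds
```

Longer recordings would have to be split into shorter segments first; how to split them is outside the scope of this sketch.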

To start the ASRT API server, execute:
```shell
$ python3 asrserver_http.py
```

Please note that after starting the API server, you need the client software that corresponds to this ASRT project in order to perform speech recognition. For details, see the Wiki documentation to [download ASRT Client SDK & Demo](https://wiki.ailemon.net/docs/asrt-doc/download).


To test whether calls to the API service succeed:
```shell
$ python3 client_http.py
```

To train or use a model other than Model 251bn, change the model imported from `speech_model_zoo` at the corresponding place in the code files.

If you run into any problem while running or using the program, please open an issue and I will reply as soon as possible.

To deploy ASRT with Docker:
```shell
$ docker pull ailemondocker/asrt_service:1.2.0
$ docker run --rm -it -p 20001:20001 --name asrt-server -d ailemondocker/asrt_service:1.2.0
```
This starts an API server for recognition only, not for training.
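
After the container starts, a quick way to confirm that something is listening on the published port is a plain TCP connection attempt. This is a minimal sketch: `port_is_open` is a hypothetical helper, the default port 20001 is taken from the `docker run` command above, and a successful connection only shows that the port is reachable, not that the recognition API itself is healthy.

```python
import socket


def port_is_open(host="127.0.0.1", port=20001, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```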

## Model

### Speech Model

DCNN + CTC

The maximum length of the input audio is 16 seconds, and the output is the corresponding Chinese pinyin sequence. 
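
To illustrate how a CTC-trained network's per-frame output turns into a pinyin sequence, here is a sketch of the standard greedy (best-path) collapse rule: merge consecutive repeats, then drop the blank symbol. This README does not say which decoding strategy ASRT uses, so treat this as a generic illustration; the function name and the `"_"` blank marker are assumptions.

```python
def ctc_greedy_decode(frame_labels, blank):
    """Collapse a per-frame label sequence: merge repeats, then drop blanks."""
    decoded = []
    previous = None
    for label in frame_labels:
        # Emit a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


frames = ["_", "ni3", "ni3", "_", "hao3", "hao3", "_"]
print(ctc_greedy_decode(frames, blank="_"))  # ['ni3', 'hao3']
```

A blank between two identical labels keeps them separate, which is how CTC can represent a repeated syllable.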

* Questions about downloading trained models

Release packages that include trained model weights can be downloaded from the [ASRT download page](https://wiki.ailemon.net/docs/asrt-doc/download). 

The GitHub [Releases](https://github.com/nl8590687/ASRT_SpeechRecognition/releases) page contains the archives of each released version of the software together with their descriptions. Each release includes a zip file that contains the trained model weight files. 

### Language Model 

Maximum Entropy Hidden Markov Model Based on Probability Graph. 

The input is a Chinese pinyin sequence, and the output is the corresponding Chinese character text. 
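
The general idea of decoding pinyin into characters with a Markov model can be sketched with a plain first-order Viterbi search. This is not ASRT's language model: the function name, the candidate table, and every probability below are made-up toy values chosen only to make the example run.

```python
import math


def viterbi_pinyin_to_text(pinyins, candidates, start_p, trans_p, floor=1e-8):
    """Pick the most probable character sequence for a pinyin sequence.

    candidates: pinyin -> list of characters that can be read that way
    start_p:    character -> probability of starting a sentence
    trans_p:    (previous character, character) -> transition probability
    Unseen events fall back to `floor` so that log() stays defined.
    """
    if not pinyins:
        return ""
    # best[c] = (log-probability, text) of the best path ending in character c
    best = {
        c: (math.log(start_p.get(c, floor)), c)
        for c in candidates[pinyins[0]]
    }
    for pinyin in pinyins[1:]:
        step = {}
        for c in candidates[pinyin]:
            step[c] = max(
                (score + math.log(trans_p.get((prev, c), floor)), text + c)
                for prev, (score, text) in best.items()
            )
        best = step
    return max(best.values())[1]


# Toy tables (illustrative values, not real statistics).
candidates = {"ni3": ["你", "拟"], "hao3": ["好", "郝"]}
start_p = {"你": 0.6, "拟": 0.1}
trans_p = {("你", "好"): 0.5, ("你", "郝"): 0.01, ("拟", "好"): 0.05}

print(viterbi_pinyin_to_text(["ni3", "hao3"], candidates, start_p, trans_p))  # 你好
```

Working in log-probabilities avoids numeric underflow on long sentences, and keeping only the best path per ending character is what makes the search linear in sentence length.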

## About Accuracy

At present, the best model reaches roughly 85% pinyin accuracy on the test set. 
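
This README does not define how the pinyin accuracy is computed. One common definition for sequence recognition is one minus the edit distance divided by the reference length, sketched below. Both function names are hypothetical, and this may differ from what `evaluate_speech_model.py` actually reports.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    previous_row = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current_row = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            substitution = previous_row[j - 1] + (ref_item != hyp_item)
            current_row.append(min(previous_row[j] + 1,      # deletion
                                   current_row[j - 1] + 1,   # insertion
                                   substitution))
        previous_row = current_row
    return previous_row[-1]


def pinyin_accuracy(reference, hypothesis):
    """1 - (edit distance / reference length); negative if the hypothesis is far too long."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 1.0 - edit_distance(reference, hypothesis) / len(reference)


reference = ["ni3", "hao3", "shi4", "jie4"]
hypothesis = ["ni3", "hao3", "si4", "jie4"]
print(pinyin_accuracy(reference, hypothesis))  # 0.75
```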

## Python Dependency Library

* tensorflow (1.15 - 2.x)
* numpy
* wave
* matplotlib
* math
* scipy
* requests
* flask
* waitress

If you have trouble installing these packages, run the following command, provided that you have a GPU and that CUDA 11.2 and cuDNN 8.1 are already installed:

```shell
$ pip install -r requirements.txt
```
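
To see which of the listed dependencies are still missing after installation, a small check with the standard-library `importlib` can help. This is a sketch under the assumption that each list entry is also the module's import name; `missing_modules` and `REQUIRED_MODULES` are hypothetical names, not part of ASRT.

```python
from importlib import util

# Import names taken from the dependency list in this section.
REQUIRED_MODULES = ("tensorflow", "numpy", "wave", "matplotlib", "math",
                    "scipy", "requests", "flask", "waitress")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("missing:", ", ".join(missing) if missing else "none")
```

`find_spec` only locates a module without importing it, so the check stays fast even for heavy packages.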

[Dependent Environment Details and Hardware Requirement](https://wiki.ailemon.net/docs/asrt-doc/asrt-doc-1deobk7bmlgd6)

## ASRT Client SDK for Calling Speech Recognition API

ASRT provides client SDKs for several platforms and programming languages so that client applications can add speech recognition features; the SDKs call the service via RPC. Please refer to the ASRT project documentation for details.

|Client Platform|Project Repos Link|
|-|-|
|Windows Client SDK & Demo|[ASRT_SDK_WinClient](https://github.com/nl8590687/ASRT_SDK_WinClient)|
|Python3 Client SDK & Demo (Any Platform)|[ASRT_SDK_Python3](https://github.com/nl8590687/ASRT_SDK_Python3)|
|Golang Client SDK & Demo|[asrt-sdk-go](https://github.com/nl8590687/asrt-sdk-go)|
|Java Client SDK & Demo|[ASRT_SDK_Java](https://github.com/nl8590687/ASRT_SDK_Java)|

## Data Sets 

For the full list, see [Some free Chinese speech datasets (Chinese)](https://blog.ailemon.net/2018/11/21/free-open-source-chinese-speech-datasets/).

|Dataset|Time|Size|Download (CN Mirrors)|Download (Source)|
|-|-|-|-|-|
|THCHS30|40h|6.01G|[data_thchs30.tgz](<http://openslr.magicdatatech.com/resources/18/data_thchs30.tgz>)|[data_thchs30.tgz](<http://www.openslr.org/resources/18/data_thchs30.tgz>)|
|ST-CMDS|100h|7.67G|[ST-CMDS-20170001_1-OS.tar.gz](<http://openslr.magicdatatech.com/resources/38/ST-CMDS-20170001_1-OS.tar.gz>)|[ST-CMDS-20170001_1-OS.tar.gz](<http://www.openslr.org/resources/38/ST-CMDS-20170001_1-OS.tar.gz>)|
|AIShell-1|178h|14.51G|[data_aishell.tgz](<http://openslr.magicdatatech.com/resources/33/data_aishell.tgz>)|[data_aishell.tgz](<http://www.openslr.org/resources/33/data_aishell.tgz>)|
|Primewords|100h|8.44G|[primewords_md_2018_set1.tar.gz](<http://openslr.magicdatatech.com/resources/47/primewords_md_2018_set1.tar.gz>)|[primewords_md_2018_set1.tar.gz](<http://www.openslr.org/resources/47/primewords_md_2018_set1.tar.gz>)|
|aidatatang_200zh|200h|17.47G|[aidatatang_200zh.tgz](<http://openslr.magicdatatech.com/resources/62/aidatatang_200zh.tgz>)|[aidatatang_200zh.tgz](<http://www.openslr.org/resources/62/aidatatang_200zh.tgz>)|
|MagicData|755h|52G/1.0G/2.2G| [train_set.tar.gz](<http://openslr.magicdatatech.com/resources/68/train_set.tar.gz>) / [dev_set.tar.gz](<http://openslr.magicdatatech.com/resources/68/dev_set.tar.gz>) / [test_set.tar.gz](<http://openslr.magicdatatech.com/resources/68/test_set.tar.gz>)|[train_set.tar.gz](<http://www.openslr.org/resources/68/train_set.tar.gz>) / [dev_set.tar.gz](<http://www.openslr.org/resources/68/dev_set.tar.gz>) / [test_set.tar.gz](<http://www.openslr.org/resources/68/test_set.tar.gz>)|


  Note: how to unpack the AISHELL-1 dataset:

  ```
  $ tar xzf data_aishell.tgz
  $ cd data_aishell/wav
  $ for tar in *.tar.gz; do tar xvf "$tar"; done
  ```
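
The same unpacking loop can be written in Python with the standard-library `tarfile` module, which may be convenient on systems without a POSIX shell. This is a sketch, not an ASRT script: `extract_nested_archives` is a hypothetical helper, and it should only be run on archives from a source you trust.

```python
import glob
import os
import tarfile


def extract_nested_archives(directory, pattern="*.tar.gz"):
    """Extract every archive matching pattern inside directory, in place.

    Returns the sorted list of archive paths that were extracted.
    """
    archives = sorted(glob.glob(os.path.join(directory, pattern)))
    for archive in archives:
        # tarfile detects the compression type automatically in read mode.
        with tarfile.open(archive) as tar:
            tar.extractall(path=directory)
    return archives
```

For AISHELL-1 this would be called on the `data_aishell/wav` directory after the outer `data_aishell.tgz` has been unpacked.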

Special thanks to those who made these speech datasets publicly available. 

If a dataset link above cannot be opened or downloaded, try [OpenSLR](http://www.openslr.org).

## ASRT Documents

* [ASRT project's Wiki document](https://wiki.ailemon.net/docs/asrt-doc)

An introduction to ASRT:
* [ASRT: Chinese Speech Recognition System (Chinese)](https://blog.ailemon.net/2018/08/29/asrt-a-chinese-speech-recognition-system/)

How to use ASRT to train and deploy:
* [Teach you how to use ASRT to train Chinese ASR model (Chinese)](<https://blog.ailemon.net/2020/08/20/teach-you-how-use-asrt-train-chinese-asr-model/>)
* [Teach you how to use ASRT to deploy Chinese ASR API Server (Chinese)](<https://blog.ailemon.net/2020/08/27/teach-you-how-use-asrt-deploy-chinese-asr-api-server/>)

For questions about the principles of the statistical language model that are often asked, see: 
* [Simple Chinese word frequency statistics to generate N-gram language model (Chinese)](https://blog.ailemon.net/2017/02/20/simple-words-frequency-statistic-without-segmentation-algorithm/)
* [Statistical Language Model: Chinese Pinyin to Words (Chinese)](https://blog.ailemon.net/2017/04/27/statistical-language-model-chinese-pinyin-to-words/)

For questions about CTC, see: 

* [[Translation] Sequence Modeling with CTC (Chinese)](<https://blog.ailemon.net/2019/07/18/sequence-modeling-with-ctc/>)

For more information, please refer to the author's blog: [AILemon Blog](https://blog.ailemon.net/) (Chinese)

## License

[GPL v3.0](LICENSE) © [nl8590687](https://github.com/nl8590687) Author: [ailemon](https://www.ailemon.net/)

## Cite this project

[DOI: 10.5281/zenodo.5808434](https://doi.org/10.5281/zenodo.5808434)

## Contributors

[Contributors Page](https://github.com/nl8590687/ASRT_SpeechRecognition/graphs/contributors)

@nl8590687 (repo owner)
-												docs: 更新readme文档

											
										
										
											2022-03-07 22:05:48 +08:00
+								![](https://res.ailemon.net/common/asrt_title_header_en.png)
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
-												update readme

											
										
										
											2020-04-18 14:35:39 +08:00
+								[![GPL-3.0 Licensed](https://img.shields.io/badge/License-GPL3.0-blue.svg?style=flat)](https://opensource.org/licenses/GPL-3.0)
-												docs: update readme

											
										
										
											2022-03-20 14:03:24 +08:00
+								[![Stars](https://img.shields.io/github/stars/nl8590687/ASRT_SpeechRecognition)](https://github.com/nl8590687/ASRT_SpeechRecognition)
-												feat: 提升最低建议版本

											
										
										
											2021-11-09 21:37:19 +08:00
+								[![TensorFlow Version](https://img.shields.io/badge/Tensorflow-1.15+-blue.svg)](https://www.tensorflow.org/)
 								[![Python Version](https://img.shields.io/badge/Python-3.6+-blue.svg)](https://www.python.org/)
-												docs: 更新readme文档

											
										
										
											2022-03-07 22:05:48 +08:00
+								[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.5808434.svg)](https://doi.org/10.5281/zenodo.5808434)
 								ASRT is A Deep-Learning-Based Chinese Speech Recognition System. If you like this project, please **star** it.
-												update readme

											
										
										
											2018-07-26 10:41:00 +08:00
-												Update README

											
										
										
											2018-12-24 14:01:40 +08:00
+								**ReadMe Language** | [中文版](https://github.com/nl8590687/ASRT_SpeechRecognition/blob/master/README.md) | English |
-												update readme

											
										
										
											2018-07-26 10:41:00 +08:00
-												update:更新README中有关ASRT项目网站的URL

											
										
										
											2021-05-09 17:43:39 +08:00
+								[**ASRT Project Home Page**](https://asrt.ailemon.net/) |
-												docs: update readme

											
										
										
											2022-04-06 13:01:39 +08:00
+								[**Released Download**](https://wiki.ailemon.net/docs/asrt-doc/download) |
-												doc: 更新README文档

											
										
										
											2021-12-09 21:24:23 +08:00
+								[**View this project's wiki document (Chinese)**](https://wiki.ailemon.net/docs/asrt-doc) |
-												update:更新README中有关ASRT项目网站的URL

											
										
										
											2021-05-09 17:43:39 +08:00
+								[**Experience Demo**](https://asrt.ailemon.net/demo) |
-												doc: 更新README文档

											
										
										
											2021-12-09 21:24:23 +08:00
+								[**Donate**](https://wiki.ailemon.net/docs/asrt-doc/asrt-doc-1deo9u61unti9)
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
-												Update README

											
										
										
											2018-12-24 14:01:40 +08:00
+								If you have any questions in your works with this project, welcome to put up issues in this repo and I will response as soon as possible.
-												add some info

											
										
										
											2018-09-08 15:13:05 +08:00
-												doc: 更新README文档

											
										
										
											2021-12-09 21:24:23 +08:00
+								You can check the [FAQ Page (Chinese)](https://wiki.ailemon.net/docs/asrt-doc/asrt-doc-1deoeud494h4f) first before asking questions to avoid repeating questions.
-												Update README

											
										
										
											2018-12-24 14:01:40 +08:00
-												docs: update readme

											
										
										
											2022-03-08 21:32:38 +08:00
+								If there is any abnormality when the program is running, please send a complete screenshot when asking questions, and indicate the CPU architecture, GPU model, operating system, Python, TensorFlow and CUDA versions used, and whether any code has been modified or data sets have been added or deleted, etc. .
-												update readme

											
										
										
											2020-04-18 14:35:39 +08:00
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
+								## Introduction
-												doc: 更新readme文档

											
										
										
											2021-11-24 15:51:46 +08:00
+								This project uses tensorFlow.keras based on deep convolutional neural network and long-short memory neural network, attention mechanism and CTC to implement.
-												Update README

											
										
										
											2018-12-24 14:01:40 +08:00
-												docs: update readme

											
										
										
											2022-03-08 21:32:38 +08:00
+								## Minimum requirements for training
 								### Hardware
-												docs: update readme

											
										
										
											2022-03-20 14:03:24 +08:00
+								* CPU: 4 Core (x86_64, amd64) +
-												docs: update readme

											
										
										
											2022-03-08 21:32:38 +08:00
+								* RAM: 16 GB +
 								* GPU: NVIDIA, Graph Memory 11GB+ (>1080ti)
 								* 硬盘: 500 GB HDD(or SSD)
 								### Software
 								* Linux: Ubuntu 18.04 + / CentOS 7 +
 								* Python: 3.6 +
 								* TensorFlow: 1.15, 2.x + (The latest and x.x.0 are deprecated)
 								## Quick Start
-												docs: 更新readme文档

											
										
										
											2022-03-07 22:05:48 +08:00
+								Take the operation under the Linux system as an example:
-												Update README

											
										
										
											2018-12-24 14:01:40 +08:00
 								First, clone the project to your computer through Git, and then download the data sets needed for the training of this project. For the download links, please refer to [End of Document](https://github.com/nl8590687/ASRT_SpeechRecognition/blob/master/README_EN.md#data-sets)
 								```shell
 								$ git clone https://github.com/nl8590687/ASRT_SpeechRecognition.git
 								```
 								Or you can use the "Fork" button to copy a copy of the project and then clone it locally with your own SSH key.
-												docs: 更新readme文档

											
										
										
											2022-03-07 22:05:48 +08:00
+								After cloning the repository via git, go to the project root directory; create a subdirectory `/data/speech_data` (you can use a soft link instead) for datasets, and then extract the downloaded datasets directly into it.
-												添加提示：注意，Thchs30和ST-CMDS都必须下载，缺一不可

											
										
										
											2019-10-20 18:13:03 +08:00
-												Update README

											
										
										
											2018-12-24 14:01:40 +08:00
+								```shell
 								$ cd ASRT_SpeechRecognition
-												doc: 更新readme内容

											
										
										
											2022-02-11 15:33:30 +08:00
+								$ mkdir /data/speech_data
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
-												doc: 更新readme内容

											
										
										
											2022-02-11 15:33:30 +08:00
+								$ tar zxf <dataset zip files name> -C /data/speech_data/
-												Update README

											
										
										
											2018-12-24 14:01:40 +08:00
+								```
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
-												docs: 更新readme文档

											
										
										
											2022-03-07 22:05:48 +08:00
+								Note that in the current version, in the configuration file, six data sets, Thchs30, ST-CMDS, Primewords, aishell-1, aidatatang200, MagicData, are added by default, please delete them if you don’t need them. If you want to use other data sets, you need to add data configuration yourself, and use the standard format supported by ASRT to organize the data in advance.
-												doc: 更新readme文档

											
										
										
											2021-11-24 15:51:46 +08:00
-												doc: 更新readme内容

											
										
										
											2022-02-11 15:33:30 +08:00
+								To download pinyin syllable list files for default dataset:
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
+								```shell
-												doc: 更新readme内容

											
										
										
											2022-02-11 15:33:30 +08:00
+								$ python download_default_datalist.py
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
+								```
-												docs: update readme

											
										
										
											2022-03-21 23:10:55 +08:00
+								Currently available models are 24, 25, 251 and 251bn
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
-												Update README

											
										
										
											2018-12-24 14:01:40 +08:00
+								Before running this project, please install the necessary [Python3 version dependent library](https://github.com/nl8590687/ASRT_SpeechRecognition#python-import)
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
+								To start training this project, please execute:
 								```shell
-												doc: 更新readme文档

											
										
										
											2021-11-24 15:51:46 +08:00
+								$ python3 train_speech_model.py
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
+								```
 								To start the test of this project, please execute:
 								```shell
-												doc: 更新readme文档

											
										
										
											2021-11-24 15:51:46 +08:00
+								$ python3 evaluate_speech_model.py
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
+								```
 								Before testing, make sure the model file path filled in the code files exists.
-												docs: update readme

											
										
										
											2022-03-20 14:03:24 +08:00
+								To predict one wave audio file for speech recognition：
 								```shell
 								$ python3 predict_speech_file.py
 								```
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
+								ASRT API Server startup please execute:
 								```shell
-												docs: update readme

											
										
										
											2022-03-20 14:03:24 +08:00
+								$ python3 asrserver_http.py
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
+								```
-												docs: update readme

											
										
										
											2022-03-20 14:03:24 +08:00
+								Please note that after opening the API server, you need to use the client software corresponding to this ASRT project for voice recognition. For details, see the Wiki documentation to [download ASRT Client SDK & Demo](https://wiki.ailemon.net/docs/asrt-doc/download).
 								To test whether it is successful or not that calls api service interface:
 								```shell
 								$ python3 client_http.py
 								```
-												update readme

											
										
										
											2019-01-25 22:48:54 +08:00
-												feat: 切换默认声学模型到m251bn

											
										
										
											2022-03-27 21:47:12 +08:00
+								If you want to train and use other model(not Model 251bn), make changes in the corresponding position of the `import speech_model_zoo` in the code files.
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
 								If there is any problem during the execution of the program or during use, it can be promptly put forward in the issue, and I will reply as soon as possible.
-												docs: 添加docker部署命令

											
										
										
											2022-01-07 15:58:39 +08:00
+								Deploy ASRT by docker：
 								```shell
-												docs: update readme

											
										
										
											2022-04-06 13:01:39 +08:00
+								$ docker pull ailemondocker/asrt_service:1.2.0
 								$ docker run --rm -it -p 20001:20001 --name asrt-server -d ailemondocker/asrt_service:1.2.0
-												docs: 添加docker部署命令

											
										
										
											2022-01-07 15:58:39 +08:00
+								```
 								It will start a api server for recognition rather than training.
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
 								## Model
 								### Speech Model
-												docs: update readme

											
										
										
											2022-03-20 14:03:24 +08:00
+								DCNN + CTC
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
-												Update README

											
										
										
											2018-12-24 14:01:40 +08:00
+								The maximum length of the input audio is 16 seconds, and the output is the corresponding Chinese pinyin sequence.
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
+								* Questions about downloading trained models
-												doc: 更新README文档

											
										
										
											2021-12-09 21:24:23 +08:00
+								The released finished software that includes trained model weights can be downloaded from [ASRT download page](https://wiki.ailemon.net/docs/asrt-doc/download).
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
-												update readme

											
										
										
											2021-03-04 19:25:17 +08:00
+								Github [Releases](https://github.com/nl8590687/ASRT_SpeechRecognition/releases) page includes the archives of the various versions of the software released and it's introduction. Under each version module, there is a zip file that includes trained model weights files.
-												update readme

											
										
										
											2020-05-11 17:56:54 +08:00
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
+								### Language Model
 								Maximum Entropy Hidden Markov Model Based on Probability Graph.
-												Update README

											
										
										
											2018-12-24 14:01:40 +08:00
+								The input is a Chinese pinyin sequence, and the output is the corresponding Chinese character text.
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
+								## About Accuracy
-												docs: update readme

											
										
										
											2022-04-06 13:01:39 +08:00
+								At present, the best model can basically reach 85% of Pinyin correct rate on the test set.
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
-												doc: 更新readme文档

											
										
										
											2021-11-24 15:51:46 +08:00
+								## Python Dependency Library
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
-												doc: 更新readme文档

											
										
										
											2021-11-24 15:51:46 +08:00
+								* tensorFlow (1.15 - 2.x)
 								* numpy
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
+								* wave
 								* matplotlib
 								* math
-												doc: 更新readme文档

											
										
										
											2021-11-24 15:51:46 +08:00
+								* scipy
-												docs: 更新了README文档，补充了requirments.txt

											
										
										
											2021-05-16 21:54:12 +08:00
+								* requests
-												docs: update readme

											
										
										
											2022-03-20 14:03:24 +08:00
+								* flask
 								* waitress
-												docs: 更新了README文档，补充了requirments.txt

											
										
										
											2021-05-16 21:54:12 +08:00
-												feat: 提升最低建议版本

											
										
										
											2021-11-09 21:37:19 +08:00
+								If you have trouble when install those packages, please run the following script to do it as long as you have a GPU and CUDA 11.2 and cudnn 8.1 have been installed：
-												docs: 更新了README文档，补充了requirments.txt

											
										
										
											2021-05-16 21:54:12 +08:00
 								```shell
 								$ pip install -r requirements.txt
 								```
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
-												doc: 更新README文档

											
										
										
											2021-12-09 21:24:23 +08:00
+								[Dependent Environment Details and Hardware Requirement](https://wiki.ailemon.net/docs/asrt-doc/asrt-doc-1deobk7bmlgd6)
-												update readme

											
										
										
											2020-01-17 17:57:35 +08:00
-												docs: update readme

											
										
										
											2022-03-20 14:03:24 +08:00
+								## ASRT Client SDK for Calling Speech Recognition API
 								ASRT provides the abilities to import client SDKs for several platform and programing language for client develop speech recognition features , which work by RPC. Please refer ASRT project documents for detail.
 								|Client Platform|Project Repos Link|
 								|-|-|
 								|Windows Client SDK & Demo|[ASRT_SDK_WinClient](https://github.com/nl8590687/ASRT_SDK_WinClient)|
 								|Python3 Client SDK & Demo (Any Platform)|[ASRT_SDK_Python3](https://github.com/nl8590687/ASRT_SDK_Python3)|
-												docs: update readme

											
										
										
											2022-03-21 23:10:55 +08:00
+								|Golang Client SDK & Demo|[asrt-sdk-go](https://github.com/nl8590687/asrt-sdk-go)|
-												docs & ci: 更新相关信息

											
										
										
											2022-03-26 23:18:10 +08:00
+								|Java Client SDK & Demo|[ASRT_SDK_Java](https://github.com/nl8590687/ASRT_SDK_Java)|
-												docs: update readme

											
										
										
											2022-03-20 14:03:24 +08:00
-												update readme.md

											
										
										
											2018-06-25 20:22:23 +08:00
+								## Data Sets
-												update readme

											
										
										
											2020-04-18 14:35:39 +08:00
-												docs: update readme

											
										
										
											2022-03-08 21:32:38 +08:00
+								For full content please refer: [Some free Chinese speech datasets (Chinese)](https://blog.ailemon.net/2018/11/21/free-open-source-chinese-speech-datasets/)
-												Update README

											
										
										
											2018-12-24 14:01:40 +08:00
-												docs: update readme

											
										
										
											2022-03-08 21:32:38 +08:00
+								|Dataset|Time|Size|Download (CN Mirrors)|Download (Source)|
 								|-|-|-|-|-|
 								|THCHS30|40h|6.01G|[data_thchs30.tgz](<http://openslr.magicdatatech.com/resources/18/data_thchs30.tgz>)|[data_thchs30.tgz](<http://www.openslr.org/resources/18/data_thchs30.tgz>)|
 								|ST-CMDS|100h|7.67G|[ST-CMDS-20170001_1-OS.tar.gz](<http://openslr.magicdatatech.com/resources/38/ST-CMDS-20170001_1-OS.tar.gz>)|[ST-CMDS-20170001_1-OS.tar.gz](<http://www.openslr.org/resources/38/ST-CMDS-20170001_1-OS.tar.gz>)|
 								|AIShell-1|178h|14.51G|[data_aishell.tgz](<http://openslr.magicdatatech.com/resources/33/data_aishell.tgz>)|[data_aishell.tgz](<http://www.openslr.org/resources/33/data_aishell.tgz>)|
 								|Primewords|100h|8.44G|[primewords_md_2018_set1.tar.gz](<http://openslr.magicdatatech.com/resources/47/primewords_md_2018_set1.tar.gz>)|[primewords_md_2018_set1.tar.gz](<http://www.openslr.org/resources/47/primewords_md_2018_set1.tar.gz>)|
 								|aidatatang_200zh|200h|17.47G|[aidatatang_200zh.tgz](<http://openslr.magicdatatech.com/resources/62/aidatatang_200zh.tgz>)|[aidatatang_200zh.tgz](<http://www.openslr.org/resources/62/aidatatang_200zh.tgz>)|
 								|MagicData|755h|52G/1.0G/2.2G| [train_set.tar.gz](<http://openslr.magicdatatech.com/resources/68/train_set.tar.gz>) / [dev_set.tar.gz](<http://openslr.magicdatatech.com/resources/68/dev_set.tar.gz>) / [test_set.tar.gz](<http://openslr.magicdatatech.com/resources/68/test_set.tar.gz>)|[train_set.tar.gz](<http://www.openslr.org/resources/68/train_set.tar.gz>) / [dev_set.tar.gz](<http://www.openslr.org/resources/68/dev_set.tar.gz>) / [test_set.tar.gz](<http://www.openslr.org/resources/68/test_set.tar.gz>)|
-												Update README

											
										
										
											2018-12-24 14:01:40 +08:00
-												docs: update readme

											
										
										
											2022-03-08 21:32:38 +08:00
+								  Note：The way to unzip AISHELL-1 dataset
-												update readme

											
										
										
											2019-01-25 22:48:54 +08:00
-												更新贡献者名单

											
										
										
```
$ tar xzf data_aishell.tgz
$ cd data_aishell/wav
$ for f in *.tar.gz; do tar xvf "$f"; done
```
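For environments without a POSIX shell, the same two-step extraction can be sketched in Python with the standard `tarfile` module. The function name `extract_aishell` and its `dest` parameter are illustrative and not part of ASRT; the sketch assumes the outer archive unpacks to `data_aishell/wav`, as the shell commands above imply.

```python
import tarfile
from pathlib import Path


def extract_aishell(archive, dest="."):
    """Unpack data_aishell.tgz, then every inner *.tar.gz under data_aishell/wav.

    Hypothetical helper (not part of ASRT). Returns the number of inner
    archives that were unpacked.
    """
    dest = Path(dest)

    # Step 1: unpack the outer archive (like `tar xzf data_aishell.tgz`).
    with tarfile.open(archive, "r:gz") as outer:
        outer.extractall(path=dest)

    # Step 2: unpack each inner archive in place (like the shell `for` loop).
    wav_dir = dest / "data_aishell" / "wav"
    inner_archives = sorted(wav_dir.glob("*.tar.gz"))
    for inner_path in inner_archives:
        # "r:*" lets tarfile detect the compression, as `tar xvf` does.
        with tarfile.open(inner_path, "r:*") as inner:
            inner.extractall(path=wav_dir)
    return len(inner_archives)
```

Calling `extract_aishell("data_aishell.tgz")` from the download directory should leave the same layout as the shell commands. Only run it on archives from a source you trust, since `extractall` writes whatever paths the archive contains.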

											
										
										
Special thanks to the predecessors who made these speech datasets publicly available.

											
										
										
If a dataset link above cannot be opened or downloaded, visit [OpenSLR](http://www.openslr.org).
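Because each dataset is listed with two mirrors, a download script can simply try them in order. The following is a minimal sketch using only the standard library; `download_with_fallback` is a hypothetical helper, not something ASRT provides, and it does not resume or clean up partial downloads.

```python
import shutil
import urllib.request


def download_with_fallback(urls, dest):
    """Try each URL in order and return the first one that downloads fully.

    Hypothetical helper (not part of ASRT). A failed attempt may leave a
    partial file at `dest`; the next successful attempt overwrites it.
    """
    last_error = None
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=30) as resp, open(dest, "wb") as out:
                shutil.copyfileobj(resp, out)
            return url
        except OSError as err:  # urllib.error.URLError is a subclass of OSError
            last_error = err  # remember the failure and try the next mirror
    raise RuntimeError(f"all mirrors failed: {last_error}")
```

For example, passing the two `data_thchs30.tgz` URLs from the table as `urls` and `"data_thchs30.tgz"` as `dest` would fetch the archive from whichever mirror responds first.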

											
										
										
## ASRT Documents

											
										
										
* [ASRT project's Wiki document](https://wiki.ailemon.net/docs/asrt-doc)

											
										
										
An introductory post about ASRT:
* [ASRT: Chinese Speech Recognition System (Chinese)](https://blog.ailemon.net/2018/08/29/asrt-a-chinese-speech-recognition-system/)

											
										
										
How to train and deploy with ASRT:
* [Teach you how to use ASRT to train a Chinese ASR model (Chinese)](<https://blog.ailemon.net/2020/08/20/teach-you-how-use-asrt-train-chinese-asr-model/>)
* [Teach you how to use ASRT to deploy a Chinese ASR API server (Chinese)](<https://blog.ailemon.net/2020/08/27/teach-you-how-use-asrt-deploy-chinese-asr-api-server/>)

											
										
										
For frequently asked questions about the principles of the statistical language model, see:
* [Simple Chinese word frequency statistics to generate N-gram language model (Chinese)](https://blog.ailemon.net/2017/02/20/simple-words-frequency-statistic-without-segmentation-algorithm/)
* [Statistical Language Model: Chinese Pinyin to Words (Chinese)](https://blog.ailemon.net/2017/04/27/statistical-language-model-chinese-pinyin-to-words/)

											
										
										
For questions about CTC, see:

											
										
										
* [[Translation] Sequence Modeling with CTC (Chinese)](<https://blog.ailemon.net/2019/07/18/sequence-modeling-with-ctc/>)

											
										
										
For more information, please refer to the author's blog: [AILemon Blog](https://blog.ailemon.net/) (Chinese)

											
										
										
## License

											
										
										
[GPL v3.0](LICENSE) © [nl8590687](https://github.com/nl8590687) Author: [ailemon](https://www.ailemon.net/)

											
										
										
## Cite this project

											
										
										
[DOI: 10.5281/zenodo.5808434](https://doi.org/10.5281/zenodo.5808434)

											
										
										
## Contributors

											
										
										
[Contributors Page](https://github.com/nl8590687/ASRT_SpeechRecognition/graphs/contributors)

											
										
										
@nl8590687 (repo owner)