Commit 1c9cd2f

docs: update documentation
1 parent e33659b commit 1c9cd2f

1 file changed

+96
-53
lines changed

README.md

@@ -29,7 +29,8 @@ $ pip install --upgrade aspeak
2929
## Usage
3030

3131
```
32-
usage: aspeak [-h] [-V | -L | -Q | [-t [TEXT] | -s [SSML]]] [-p PITCH] [-r RATE] [-S STYLE] [-f FILE] [-e ENCODING] [-o OUTPUT_PATH] [--mp3 | -F FORMAT] [-l LOCALE] [-v VOICE]
32+
usage: aspeak [-h] [-V | -L | -Q | [-t [TEXT] | -s [SSML]]] [-p PITCH] [-r RATE] [-S STYLE] [-f FILE] [-e ENCODING] [-o OUTPUT_PATH] [--mp3 | --ogg | --webm | --wav | -F FORMAT]
33+
[-l LOCALE] [-v VOICE] [-q QUALITY]
3334
3435
This program uses trial auth token of Azure Cognitive Services to do speech synthesis for you
3536
@@ -48,13 +49,18 @@ options:
4849
Text/SSML file encoding, default to "utf-8"(Not for stdin!)
4950
-o OUTPUT_PATH, --output OUTPUT_PATH
5051
Output file path, wav format by default
51-
--mp3 Use mp3 format instead of wav. (Only works when outputting to a file)
52+
--mp3 Use mp3 format for output. (Only works when outputting to a file)
53+
--ogg Use ogg format for output. (Only works when outputting to a file)
54+
--webm Use webm format for output. (Only works when outputting to a file)
55+
--wav Use wav format for output
5256
-F FORMAT, --format FORMAT
5357
Set output audio format (experts only)
5458
-l LOCALE, --locale LOCALE
5559
Locale to use, default to en-US
5660
-v VOICE, --voice VOICE
5761
Voice to use
62+
-q QUALITY, --quality QUALITY
63+
Output quality, default to 0
5864
5965
Options for --text:
6066
-p PITCH, --pitch PITCH
@@ -73,7 +79,7 @@ Options for --text:
7379
#### Speak "Hello, world!" to default speaker.
7480

7581
```sh
76-
$ aspeak -t "Hello, world!"
82+
$ aspeak -t "Hello, world"
7783
```
7884

7985
#### List all available voices.
@@ -117,13 +123,95 @@ Status: GA
117123
#### Save synthesized speech to a file.
118124

119125
```sh
120-
$ aspeak -t "Hello, world!" -o output.wav
126+
$ aspeak -t "Hello, world" -o output.wav
121127
```
122128

123-
If you prefer mp3, you can use `--mp3` option.
129+
If you prefer mp3, ogg, or webm, use the `--mp3`, `--ogg`, or `--webm` option respectively.
124130

125131
```sh
126-
$ aspeak -t "Hello, world!" -o output.mp3 --mp3
132+
$ aspeak -t "Hello, world" -o output.mp3 --mp3
133+
$ aspeak -t "Hello, world" -o output.ogg --ogg
134+
$ aspeak -t "Hello, world" -o output.webm --webm
135+
```
136+
137+
#### List available quality levels and formats
138+
139+
```sh
140+
$ aspeak -Q
141+
```
142+
143+
<details>
144+
145+
<summary>Output</summary>
146+
147+
```
148+
Available qualities:
149+
Qualities for wav:
150+
-2: Riff8Khz16BitMonoPcm
151+
-1: Riff16Khz16BitMonoPcm
152+
0: Riff24Khz16BitMonoPcm
153+
1: Riff24Khz16BitMonoPcm
154+
Qualities for mp3:
155+
-3: Audio16Khz32KBitRateMonoMp3
156+
-2: Audio16Khz64KBitRateMonoMp3
157+
-1: Audio16Khz128KBitRateMonoMp3
158+
0: Audio24Khz48KBitRateMonoMp3
159+
1: Audio24Khz96KBitRateMonoMp3
160+
2: Audio24Khz160KBitRateMonoMp3
161+
3: Audio48Khz96KBitRateMonoMp3
162+
4: Audio48Khz192KBitRateMonoMp3
163+
Qualities for ogg:
164+
-1: Ogg16Khz16BitMonoOpus
165+
0: Ogg24Khz16BitMonoOpus
166+
1: Ogg48Khz16BitMonoOpus
167+
Qualities for webm:
168+
-1: Webm16Khz16BitMonoOpus
169+
0: Webm24Khz16BitMonoOpus
170+
1: Webm24Khz16Bit24KbpsMonoOpus
171+
172+
Available formats:
173+
- Riff8Khz16BitMonoPcm
174+
- Riff16Khz16BitMonoPcm
175+
- Audio16Khz128KBitRateMonoMp3
176+
- Raw24Khz16BitMonoPcm
177+
- Raw48Khz16BitMonoPcm
178+
- Raw16Khz16BitMonoPcm
179+
- Audio24Khz160KBitRateMonoMp3
180+
- Ogg24Khz16BitMonoOpus
181+
- Audio16Khz64KBitRateMonoMp3
182+
- Raw8Khz8BitMonoALaw
183+
- Audio24Khz16Bit48KbpsMonoOpus
184+
- Ogg16Khz16BitMonoOpus
185+
- Riff8Khz8BitMonoALaw
186+
- Riff8Khz8BitMonoMULaw
187+
- Audio48Khz192KBitRateMonoMp3
188+
- Raw8Khz16BitMonoPcm
189+
- Audio24Khz48KBitRateMonoMp3
190+
- Raw24Khz16BitMonoTrueSilk
191+
- Audio24Khz16Bit24KbpsMonoOpus
192+
- Audio24Khz96KBitRateMonoMp3
193+
- Webm24Khz16BitMonoOpus
194+
- Ogg48Khz16BitMonoOpus
195+
- Riff48Khz16BitMonoPcm
196+
- Webm24Khz16Bit24KbpsMonoOpus
197+
- Raw8Khz8BitMonoMULaw
198+
- Audio16Khz16Bit32KbpsMonoOpus
199+
- Audio16Khz32KBitRateMonoMp3
200+
- Riff24Khz16BitMonoPcm
201+
- Raw16Khz16BitMonoTrueSilk
202+
- Audio48Khz96KBitRateMonoMp3
203+
- Webm16Khz16BitMonoOpus
204+
```
205+
206+
</details>
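
The quality levels in the listing above can be read as a per-container lookup table. The following is a minimal Python sketch of that reading, with the values copied from the listing; the names `QUALITIES` and `format_for` are hypothetical and this is not necessarily how aspeak resolves `-q` internally.

```python
# Quality levels copied from the `aspeak -Q` listing above.
# QUALITIES and format_for are illustrative names, not part of aspeak's API.
QUALITIES = {
    "wav": {
        -2: "Riff8Khz16BitMonoPcm",
        -1: "Riff16Khz16BitMonoPcm",
        0: "Riff24Khz16BitMonoPcm",
        1: "Riff24Khz16BitMonoPcm",
    },
    "mp3": {
        -3: "Audio16Khz32KBitRateMonoMp3",
        -2: "Audio16Khz64KBitRateMonoMp3",
        -1: "Audio16Khz128KBitRateMonoMp3",
        0: "Audio24Khz48KBitRateMonoMp3",
        1: "Audio24Khz96KBitRateMonoMp3",
        2: "Audio24Khz160KBitRateMonoMp3",
        3: "Audio48Khz96KBitRateMonoMp3",
        4: "Audio48Khz192KBitRateMonoMp3",
    },
    "ogg": {
        -1: "Ogg16Khz16BitMonoOpus",
        0: "Ogg24Khz16BitMonoOpus",
        1: "Ogg48Khz16BitMonoOpus",
    },
    "webm": {
        -1: "Webm16Khz16BitMonoOpus",
        0: "Webm24Khz16BitMonoOpus",
        1: "Webm24Khz16Bit24KbpsMonoOpus",
    },
}


def format_for(container: str, quality: int = 0) -> str:
    """Return the format name listed for a container and quality level."""
    levels = QUALITIES[container]
    if quality not in levels:
        raise ValueError(
            f"quality {quality} is not listed for {container}; "
            f"choose from {sorted(levels)}"
        )
    return levels[quality]


print(format_for("mp3", 4))
print(format_for("ogg"))
```

Note that the valid range differs per container (for example, mp3 spans -3 to 4 while ogg spans -1 to 1), so a quality value that works for one container may not be listed for another.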
207+
208+
#### Increase/Decrease audio quality
209+
210+
```sh
211+
# Less than default quality.
212+
$ aspeak -t "Hello, world" -o output.mp3 --mp3 -q=-1
213+
# Higher than default quality for mp3
214+
$ aspeak -t "Hello, world" -o output.mp3 --mp3 -q=3
127215
```
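
When driving these commands from a script, it can help to assemble the argument list programmatically. Below is a minimal Python sketch that uses only the flags shown in this README (`-t`, `-o`, `--mp3`/`--ogg`/`--webm`/`--wav`, `-q`); the helper name `build_aspeak_args` is hypothetical, and the `-q=N` spelling simply mirrors the examples above.

```python
import shlex
from typing import List, Optional

# Container flags named in the usage text of this README.
CONTAINER_FLAGS = {"mp3": "--mp3", "ogg": "--ogg", "webm": "--webm", "wav": "--wav"}


def build_aspeak_args(
    text: str,
    output: str,
    container: Optional[str] = None,
    quality: Optional[int] = None,
) -> List[str]:
    """Build an argv list for aspeak (hypothetical helper, not part of aspeak)."""
    args = ["aspeak", "-t", text, "-o", output]
    if container is not None:
        args.append(CONTAINER_FLAGS[container])
    if quality is not None:
        # Use the -q=N form from the README so negative values stay attached.
        args.append(f"-q={quality}")
    return args


args = build_aspeak_args("Hello, world", "output.mp3", container="mp3", quality=-1)
print(shlex.join(args))  # shell-quoted form of the command, for inspection
```

If aspeak is installed, the resulting list can be passed to `subprocess.run(args, check=True)`; building a list rather than a single string avoids shell-quoting problems with the text argument.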
128216

129217
#### Read text from file and speak it.
@@ -184,55 +272,10 @@ $ aspeak -t "你好,世界!" -v zh-CN-XiaoxiaoNeural -p 1.5 -r 0.5 -S sad
184272

185273
### Examples for Advanced Users
186274

187-
#### List available audio formats
188-
189-
```sh
190-
$ aspeak -Q
191-
```
192-
193-
<details>
194-
195-
<summary>Output</summary>
196-
197-
```
198-
Available formats:
199-
- Audio24Khz96KBitRateMonoMp3
200-
- Audio16Khz128KBitRateMonoMp3
201-
- Webm24Khz16Bit24KbpsMonoOpus
202-
- Audio48Khz96KBitRateMonoMp3
203-
- Raw16Khz16BitMonoTrueSilk
204-
- Riff16Khz16BitMonoPcm
205-
- Audio24Khz16Bit24KbpsMonoOpus
206-
- Raw16Khz16BitMonoPcm
207-
- Raw8Khz8BitMonoMULaw
208-
- Ogg24Khz16BitMonoOpus
209-
- Audio24Khz160KBitRateMonoMp3
210-
- Audio16Khz64KBitRateMonoMp3
211-
- Riff48Khz16BitMonoPcm
212-
- Audio16Khz16Bit32KbpsMonoOpus
213-
- Raw24Khz16BitMonoTrueSilk
214-
- Raw8Khz16BitMonoPcm
215-
- Riff8Khz8BitMonoMULaw
216-
- Ogg48Khz16BitMonoOpus
217-
- Raw48Khz16BitMonoPcm
218-
- Webm16Khz16BitMonoOpus
219-
- Raw24Khz16BitMonoPcm
220-
- Riff8Khz8BitMonoALaw
221-
- Audio48Khz192KBitRateMonoMp3
222-
- Webm24Khz16BitMonoOpus
223-
- Riff24Khz16BitMonoPcm
224-
- Audio16Khz32KBitRateMonoMp3
225-
- Raw8Khz8BitMonoALaw
226-
- Audio24Khz48KBitRateMonoMp3
227-
- Riff8Khz16BitMonoPcm
228-
- Audio24Khz16Bit48KbpsMonoOpus
229-
- Ogg16Khz16BitMonoOpus
230-
```
231-
232-
</details>
233-
234275
#### Use a custom audio format for output
235276

277+
**Note**: When outputting to the default speaker, using a non-wav format may produce white noise.
278+
236279
```sh
237280
$ python -m aspeak -t "Hello World" -F Riff48Khz16BitMonoPcm -o high-quality.wav
238281
```
