@@ -29,7 +29,8 @@ $ pip install --upgrade aspeak
## Usage

```
- usage: aspeak [-h] [-V | -L | -Q | [-t [TEXT] | -s [SSML]]] [-p PITCH] [-r RATE] [-S STYLE] [-f FILE] [-e ENCODING] [-o OUTPUT_PATH] [--mp3 | -F FORMAT] [-l LOCALE] [-v VOICE]
+ usage: aspeak [-h] [-V | -L | -Q | [-t [TEXT] | -s [SSML]]] [-p PITCH] [-r RATE] [-S STYLE] [-f FILE] [-e ENCODING] [-o OUTPUT_PATH] [--mp3 | --ogg | --webm | --wav | -F FORMAT]
+ [-l LOCALE] [-v VOICE] [-q QUALITY]

This program uses trial auth token of Azure Cognitive Services to do speech synthesis for you

@@ -48,13 +49,18 @@ options:
Text/SSML file encoding, default to "utf-8"(Not for stdin!)
-o OUTPUT_PATH, --output OUTPUT_PATH
Output file path, wav format by default
- --mp3 Use mp3 format instead of wav. (Only works when outputting to a file)
+ --mp3 Use mp3 format for output. (Only works when outputting to a file)
+ --ogg Use ogg format for output. (Only works when outputting to a file)
+ --webm Use webm format for output. (Only works when outputting to a file)
+ --wav Use wav format for output
-F FORMAT, --format FORMAT
Set output audio format (experts only)
-l LOCALE, --locale LOCALE
Locale to use, default to en-US
-v VOICE, --voice VOICE
Voice to use
+ -q QUALITY, --quality QUALITY
+ Output quality, default to 0

Options for --text:
-p PITCH, --pitch PITCH
@@ -73,7 +79,7 @@ Options for --text:
#### Speak "Hello, world!" to default speaker.

```sh
- $ aspeak -t "Hello, world!"
+ $ aspeak -t "Hello, world"
```

#### List all available voices.
@@ -117,13 +123,95 @@ Status: GA
#### Save synthesized speech to a file.

```sh
- $ aspeak -t "Hello, world!" -o output.wav
+ $ aspeak -t "Hello, world" -o output.wav
```

- If you prefer mp3, you can use `--mp3` option.
+ If you prefer mp3, ogg, or webm, use the `--mp3`, `--ogg`, or `--webm` option.

```sh
- $ aspeak -t "Hello, world!" -o output.mp3 --mp3
+ $ aspeak -t "Hello, world" -o output.mp3 --mp3
+ $ aspeak -t "Hello, world" -o output.ogg --ogg
+ $ aspeak -t "Hello, world" -o output.webm --webm
+ ```
+
+ #### List available quality levels and formats
+
+ ```sh
+ $ aspeak -Q
+ ```
+
+ <details>
+
+ <summary>Output</summary>
+
+ ```
+ Available qualities:
+ Qualities for wav:
+ -2: Riff8Khz16BitMonoPcm
+ -1: Riff16Khz16BitMonoPcm
+ 0: Riff24Khz16BitMonoPcm
+ 1: Riff24Khz16BitMonoPcm
+ Qualities for mp3:
+ -3: Audio16Khz32KBitRateMonoMp3
+ -2: Audio16Khz64KBitRateMonoMp3
+ -1: Audio16Khz128KBitRateMonoMp3
+ 0: Audio24Khz48KBitRateMonoMp3
+ 1: Audio24Khz96KBitRateMonoMp3
+ 2: Audio24Khz160KBitRateMonoMp3
+ 3: Audio48Khz96KBitRateMonoMp3
+ 4: Audio48Khz192KBitRateMonoMp3
+ Qualities for ogg:
+ -1: Ogg16Khz16BitMonoOpus
+ 0: Ogg24Khz16BitMonoOpus
+ 1: Ogg48Khz16BitMonoOpus
+ Qualities for webm:
+ -1: Webm16Khz16BitMonoOpus
+ 0: Webm24Khz16BitMonoOpus
+ 1: Webm24Khz16Bit24KbpsMonoOpus
+
+ Available formats:
+ - Riff8Khz16BitMonoPcm
+ - Riff16Khz16BitMonoPcm
+ - Audio16Khz128KBitRateMonoMp3
+ - Raw24Khz16BitMonoPcm
+ - Raw48Khz16BitMonoPcm
+ - Raw16Khz16BitMonoPcm
+ - Audio24Khz160KBitRateMonoMp3
+ - Ogg24Khz16BitMonoOpus
+ - Audio16Khz64KBitRateMonoMp3
+ - Raw8Khz8BitMonoALaw
+ - Audio24Khz16Bit48KbpsMonoOpus
+ - Ogg16Khz16BitMonoOpus
+ - Riff8Khz8BitMonoALaw
+ - Riff8Khz8BitMonoMULaw
+ - Audio48Khz192KBitRateMonoMp3
+ - Raw8Khz16BitMonoPcm
+ - Audio24Khz48KBitRateMonoMp3
+ - Raw24Khz16BitMonoTrueSilk
+ - Audio24Khz16Bit24KbpsMonoOpus
+ - Audio24Khz96KBitRateMonoMp3
+ - Webm24Khz16BitMonoOpus
+ - Ogg48Khz16BitMonoOpus
+ - Riff48Khz16BitMonoPcm
+ - Webm24Khz16Bit24KbpsMonoOpus
+ - Raw8Khz8BitMonoMULaw
+ - Audio16Khz16Bit32KbpsMonoOpus
+ - Audio16Khz32KBitRateMonoMp3
+ - Riff24Khz16BitMonoPcm
+ - Raw16Khz16BitMonoTrueSilk
+ - Audio48Khz96KBitRateMonoMp3
+ - Webm16Khz16BitMonoOpus
+ ```
+
+ </details>
+
+ #### Increase/decrease audio quality
+
+ ```sh
+ # Lower than the default quality.
+ $ aspeak -t "Hello, world" -o output.mp3 --mp3 -q=-1
+ # Higher than the default quality (mp3 levels range from -3 to 4).
+ $ aspeak -t "Hello, world" -o output.mp3 --mp3 -q=3
```

#### Read text from file and speak it.
@@ -184,55 +272,10 @@ $ aspeak -t "你好,世界!" -v zh-CN-XiaoxiaoNeural -p 1.5 -r 0.5 -S sad

### Examples for Advanced Users

- #### List available audio formats
-
- ```sh
- $ aspeak -Q
- ```
-
- <details>
-
- <summary>Output</summary>
-
- ```
- Available formats:
- - Audio24Khz96KBitRateMonoMp3
- - Audio16Khz128KBitRateMonoMp3
- - Webm24Khz16Bit24KbpsMonoOpus
- - Audio48Khz96KBitRateMonoMp3
- - Raw16Khz16BitMonoTrueSilk
- - Riff16Khz16BitMonoPcm
- - Audio24Khz16Bit24KbpsMonoOpus
- - Raw16Khz16BitMonoPcm
- - Raw8Khz8BitMonoMULaw
- - Ogg24Khz16BitMonoOpus
- - Audio24Khz160KBitRateMonoMp3
- - Audio16Khz64KBitRateMonoMp3
- - Riff48Khz16BitMonoPcm
- - Audio16Khz16Bit32KbpsMonoOpus
- - Raw24Khz16BitMonoTrueSilk
- - Raw8Khz16BitMonoPcm
- - Riff8Khz8BitMonoMULaw
- - Ogg48Khz16BitMonoOpus
- - Raw48Khz16BitMonoPcm
- - Webm16Khz16BitMonoOpus
- - Raw24Khz16BitMonoPcm
- - Riff8Khz8BitMonoALaw
- - Audio48Khz192KBitRateMonoMp3
- - Webm24Khz16BitMonoOpus
- - Riff24Khz16BitMonoPcm
- - Audio16Khz32KBitRateMonoMp3
- - Raw8Khz8BitMonoALaw
- - Audio24Khz48KBitRateMonoMp3
- - Riff8Khz16BitMonoPcm
- - Audio24Khz16Bit48KbpsMonoOpus
- - Ogg16Khz16BitMonoOpus
- ```
-
- </details>
-
#### Use a custom audio format for output

+ **Note**: When outputting to the default speaker, using a non-wav format may produce white noise.
+
```sh
$ python -m aspeak -t "Hello World" -F Riff48Khz16BitMonoPcm -o high-quality.wav
```
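
The quality levels printed by `aspeak -Q` in this diff can be read as a lookup from a container format plus a `-q` level to a named audio format. The sketch below mirrors that table in Python to make the relationship concrete. `QUALITIES` and `resolve_format` are hypothetical names used for illustration only; this is an assumption about how the mapping could be modelled, not aspeak's actual implementation.

```python
# Illustrative only: this table is copied from the `aspeak -Q` output shown
# in this diff. The names below are hypothetical, not part of aspeak's API.
QUALITIES = {
    "wav": {
        -2: "Riff8Khz16BitMonoPcm",
        -1: "Riff16Khz16BitMonoPcm",
        0: "Riff24Khz16BitMonoPcm",
        1: "Riff24Khz16BitMonoPcm",
    },
    "mp3": {
        -3: "Audio16Khz32KBitRateMonoMp3",
        -2: "Audio16Khz64KBitRateMonoMp3",
        -1: "Audio16Khz128KBitRateMonoMp3",
        0: "Audio24Khz48KBitRateMonoMp3",
        1: "Audio24Khz96KBitRateMonoMp3",
        2: "Audio24Khz160KBitRateMonoMp3",
        3: "Audio48Khz96KBitRateMonoMp3",
        4: "Audio48Khz192KBitRateMonoMp3",
    },
    "ogg": {
        -1: "Ogg16Khz16BitMonoOpus",
        0: "Ogg24Khz16BitMonoOpus",
        1: "Ogg48Khz16BitMonoOpus",
    },
    "webm": {
        -1: "Webm16Khz16BitMonoOpus",
        0: "Webm24Khz16BitMonoOpus",
        1: "Webm24Khz16Bit24KbpsMonoOpus",
    },
}


def resolve_format(container: str = "wav", quality: int = 0) -> str:
    """Return the format name listed for a container and quality level."""
    if container not in QUALITIES:
        raise ValueError(f"unknown container: {container!r}")
    levels = QUALITIES[container]
    if quality not in levels:
        valid = ", ".join(str(q) for q in sorted(levels))
        raise ValueError(
            f"quality {quality} is not listed for {container}; choose one of: {valid}"
        )
    return levels[quality]


if __name__ == "__main__":
    # Corresponds to `--mp3 -q=3` in the README example.
    print(resolve_format("mp3", 3))
```

Read this way, `--mp3 -q=3` corresponds to `Audio48Khz96KBitRateMonoMp3`, while the highest mp3 level in the listed table is 4.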