initial commit

author: Peter Bengtsson <mail@peterbe.com> 2020-12-08 14:40:17 -0500
committer: Peter Bengtsson <mail@peterbe.com> 2020-12-08 14:40:17 -0500
commit: 33058f2b292b3a581333bdfb21b8f671898c5060 (patch)
tree: 51c3e392513ec574331b2d3f85c394445ea803c6 /files/zh-cn/web/api/web_speech_api
parent: 8b66d724f7caf0157093fb09cfec8fbd0c6ad50a (diff)
download: translated-content-33058f2b292b3a581333bdfb21b8f671898c5060.tar.gz
translated-content-33058f2b292b3a581333bdfb21b8f671898c5060.tar.bz2
translated-content-33058f2b292b3a581333bdfb21b8f671898c5060.zip
2 files changed, 467 insertions, 0 deletions
diff --git a/files/zh-cn/web/api/web_speech_api/index.html b/files/zh-cn/web/api/web_speech_api/index.html
new file mode 100644
index 0000000000..82ffea08fa
--- /dev/null
+++ b/files/zh-cn/web/api/web_speech_api/index.html
@@ -0,0 +1,116 @@
+---
+title: Web Speech API
+slug: Web/API/Web_Speech_API
+tags:
+  - API
+  - 参考
+  - 合成
+  - 实验性的
+  - 网页语音API
+  - 识别
+  - 语音
+translation_of: Web/API/Web_Speech_API
+---
+<div>{{DefaultAPISidebar("Web Speech API")}}{{seecompattable}}</div>
+
+<div class="summary">
+<p>Web Speech API 使您能够将语音数据合并到 Web 应用程序中。 Web Speech API 有两个部分：SpeechSynthesis 语音合成 （文本到语音 TTS）和 SpeechRecognition  语音识别（异步语音识别）。</p>
+</div>
+
+<h2 id="Web_Speech_的概念及用法">Web Speech 的概念及用法</h2>
+
+<p>Web Speech API 使 Web 应用能够处理语音数据，该项 API 包含以下两个部分：</p>
+
+<ul>
+ <li>语音识别通过 {{domxref("SpeechRecognition")}} 接口进行访问，它提供了识别从音频输入（通常是设备默认的语音识别服务）中识别语音情景的能力。一般来说，你将使用该接口的构造函数来构造一个新的 {{domxref("SpeechRecognition")}} 对象，该对象包含了一些列有效的对象处理函数来检测识别设备麦克风中的语音输入。{{domxref("SpeechGrammar")}} 接口则表示了你应用中想要识别的特定文法。文法则通过 <a href="http://www.w3.org/TR/jsgf/">JSpeech Grammar Format</a> (<strong>JSGF</strong>.) 来定义。</li>
+ <li>语音合成通过 {{domxref("SpeechSynthesis")}} 接口进行访问，它提供了文字到语音（TTS）的能力，这使得程序能够读出它们的文字内容（通常使用设备默认的语音合成器）。不同的声音类类型通过 {{domxref("SpeechSynthesisVoice")}} 对象进行表示，不同部分的文字则由 {{domxref("SpeechSynthesisUtterance")}} 对象来表示。你可以将它们传递给 {{domxref("SpeechSynthesis.speak()")}} 方法来产生语音。</li>
+</ul>
+
+<p>更多关于这些特性的细节请参考 <a href="/en-US/docs/Web/API/Web_Speech_API/Using_the_Web_Speech_API">Using the Web Speech API</a>。</p>
+
+<h2 id="Web_Speech_的_API_接口">Web Speech 的 API 接口</h2>
+
+<h3 id="语音识别"> 语音识别</h3>
+
+<dl>
+ <dt>{{domxref("SpeechRecognition")}}</dt>
+ <dd>语音识别服务的控制器接口；它也处理由语音识别服务发来的 {{domxref("SpeechRecognitionEvent")}} 事件。</dd>
+ <dt>{{domxref("SpeechRecognitionAlternative")}}</dt>
+ <dd>表示由语音识别服务识别出的一个词汇。</dd>
+ <dt>{{domxref("SpeechRecognitionError")}}</dt>
+ <dd>表示语音识别服务发出的报错信息。</dd>
+ <dt>{{domxref("SpeechRecognitionEvent")}}</dt>
+ <dd>{{event("result")}} 和 {{event("nomatch")}} 的事件对象，包含了与语音识别过程中间或最终结果相关的全部数据。</dd>
+ <dt>{{domxref("SpeechGrammar")}}</dt>
+ <dd>我们将要交由语音识别服务进行识别的词汇或者词汇的模式。</dd>
+ <dt>{{domxref("SpeechGrammarList")}}</dt>
+ <dd>表示一个由 {{domxref("SpeechGrammar")}} 对象构成的列表。</dd>
+ <dt>{{domxref("SpeechRecognitionResult")}}</dt>
+ <dd>表示一次识别中的匹配项，其中可能包含多个 {{domxref("SpeechRecognitionAlternative")}} 对象。</dd>
+ <dt>{{domxref("SpeechRecognitionResultList")}}</dt>
+ <dd>表示包含 {{domxref("SpeechRecognitionResult")}} 对象的一个列表，如果是以 {{domxref("SpeechRecognition.continuous","continuous")}} 模式捕获的结果，则是单个对象。</dd>
+</dl>
+
+<h3 id="语音合成">语音合成</h3>
+
+<dl>
+ <dt>{{domxref("SpeechSynthesis")}}</dt>
+ <dd>语音合成服务的控制器接口，可用于获取设备上可用的合成语音，开始、暂停以及其它相关命令的信息。</dd>
+ <dt>{{domxref("SpeechSynthesisErrorEvent")}}</dt>
+ <dd>包含了在发音服务处理 {{domxref("SpeechSynthesisUtterance")}} 对象过程中的信息及报错信息。</dd>
+ <dt>{{domxref("SpeechSynthesisEvent")}}</dt>
+ <dd>包含了经由发音服务处理过的 {{domxref("SpeechSynthesisUtterance")}} 对象当前状态的信息。</dd>
+ <dt>{{domxref("SpeechSynthesisUtterance")}}</dt>
+ <dd>表示一次发音请求。其中包含了将由语音服务朗读的内容，以及如何朗读它（例如：语种、音高、音量）。</dd>
+</dl>
+
+<dl>
+ <dt>{{domxref("SpeechSynthesisVoice")}}</dt>
+ <dd>表示系统提供的一个声音。每个 <code>SpeechSynthesisVoice</code> 都有与之相关的发音服务，包括了语种、名称 和 URI 等信息。</dd>
+ <dt>{{domxref("Window.speechSynthesis")}}</dt>
+ <dd>由规格文档指定的，被称为 <code>SpeechSynthesisGetter</code> 的 <code>[NoInterfaceObject]</code> 接口的一部分，在 <code>Window</code> 对象中实现，<code>speechSynthesis</code> 属性可用于访问 {{domxref("SpeechSynthesis")}} 控制器，从而获取语音合成功能的入口。</dd>
+</dl>
+
+<h2 id="示例">示例</h2>
+
+<p>GitHub 上的 <a href="https://github.com/mdn/web-speech-api/">Web Speech API repo</a> 的示例程序展示了语音识别及合成。</p>
+
+<h2 id="规范">规范</h2>
+
+<table class="standard-table">
+ <tbody>
+  <tr>
+   <th scope="col">规范</th>
+   <th scope="col">状态</th>
+   <th scope="col">描述</th>
+  </tr>
+  <tr>
+   <td>{{SpecName('Web Speech API')}}</td>
+   <td>{{Spec2('Web Speech API')}}</td>
+   <td>原始定义</td>
+  </tr>
+ </tbody>
+</table>
+
+<h2 id="浏览器兼容性">浏览器兼容性</h2>
+
+<h3 id="语音识别_2"><code>语音识别</code></h3>
+
+<div class="hidden">此页面上的兼容性表格，是根据结构化数据生成的，如果您愿意贡献数据，请查看 <a href="https://github.com/mdn/browser-compat-data">https://github.com/mdn/browser-compat-data</a> 并发送 pull request 给我们。</div>
+
+<div>{{Compat("api.SpeechRecognition", 0)}}</div>
+
+<h3 id="语音合成_2"><code>语音合成</code></h3>
+
+<div>{{Compat("api.SpeechSynthesis", 0)}}</div>
+
+<div class="hidden">此页面上的兼容性表格，是根据结构化数据生成的，如果您愿意贡献数据，请查看 <a href="https://github.com/mdn/browser-compat-data">https://github.com/mdn/browser-compat-data</a> 并发送 pull request 给我们。</div>
+
+<h2 id="相关链接">相关链接</h2>
+
+<ul>
+ <li><a href="/en-US/docs/Web/API/Web_Speech_API/Using_the_Web_Speech_API">Using the Web Speech API</a></li>
+ <li><a href="http://www.sitepoint.com/talking-web-pages-and-the-speech-synthesis-api/">SitePoint article</a></li>
+ <li><a href="http://updates.html5rocks.com/2014/01/Web-apps-that-talk---Introduction-to-the-Speech-Synthesis-API">HTML5Rocks article</a></li>
+ <li><a href="http://aurelio.audero.it/demo/speech-synthesis-api-demo.html">Demo</a> [aurelio.audero.it]</li>
+</ul>
diff --git a/files/zh-cn/web/api/web_speech_api/using_the_web_speech_api/index.html b/files/zh-cn/web/api/web_speech_api/using_the_web_speech_api/index.html
new file mode 100644
index 0000000000..614247b6c4
--- /dev/null
+++ b/files/zh-cn/web/api/web_speech_api/using_the_web_speech_api/index.html
@@ -0,0 +1,351 @@
+---
+title: 使用 Web Speech API
+slug: Web/API/Web_Speech_API/Using_the_Web_Speech_API
+tags:
+  - 语音合成
+  - 语音识别
+translation_of: Web/API/Web_Speech_API/Using_the_Web_Speech_API
+---
+<p class="summary">Web Speech API 提供了两类不同方向的函数——语音识别和语音合成(也被称为文本转为语音，英语简写是tts)——开启了有趣的新可用性和控制机制。这篇文章提供了这两个方向的简单介绍，并且都带有例子。</p>
+
+<h2 id="Speech_recognition">Speech recognition</h2>
+
+<p>Speech recognition(语音识别) 涉及三个过程：首先，需要设备的麦克风接收这段语音；其次，speech recognition service(语音识别服务器) 会根据一系列语法(基本上，语法是你希望在具体的应用中能够识别出来的词汇) 来检查这段语音；最后，当一个单词或者短语被成功识别后，结果会以文本字符串的形式返回(结果可以有多个)，以及更多的行为可以设置被触发。</p>
+
+<p>Web Speech API 有一个主要的控制接口——<a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition"><code>SpeechRecognition</code></a>， 外加一些如表示语法、表示结果等等亲密相关的接口。通常，设备都有可使用的默认语音识别系统，大部分现代操作系统使用这个语音识别系统来处理语音命令，比如 Mac OS X 上的 Dictation，iOS上的 Siri，Win10上的 Cortana，Android Speech 等等。</p>
+
+<p><span style='font-family: x-locale-heading-primary,zillaslab,Palatino,"Palatino Linotype",x-locale-heading-secondary,serif; font-size: 1.375rem;'>Demo</span></p>
+
+<p>为了简单展示 Web speech recognition 的作用，我们写了一个demo——<a href="https://github.com/mdn/web-speech-api/tree/master/speech-color-changer">Speech color changer</a>。点击屏幕之后，说出 HTML 颜色关键字(网页里罗列的单词就是)，接下来应用的背景颜色就会变成你说的颜色。</p>
+
+<p><img alt="The UI of an app titled Speech Color changer. It invites the user to tap the screen and say a color, and then it turns the background of the app that colour. In this case it has turned the background red." src="https://mdn.mozillademos.org/files/11975/speech-color-changer.png" style="border: 1px solid black; display: block; height: 533px; margin: 0px auto; width: 300px;"></p>
+
+<p>为了跑这个 demo，可以 clone Github仓库(上面甩出的就是，或者<a href="https://github.com/mdn/web-speech-api/archive/master.zip">directly download</a>)，可以在支持的移动端浏览器(比如Chrome) 导航到 <a href="http://mdn.github.io/web-speech-api/speech-color-changer/">live demo URL</a> 直接观看(亲测desktop browser也是可以的，不过只能是Chrome)，也可以通过 <a href="https://developer.mozilla.org/en-US/docs/Tools/WebIDE">WebIDE</a> 作为一个app加载到Firefox OS(Firefox OS使用API的权限问题见下文)。</p>
+
+<h3 id="Browser_support">Browser support</h3>
+
+<p>对于 Web Speech API speech recognition(语音识别)的支持，在各浏览器中还不成熟，还在发展，现在主要的限制如下：</p>
+
+<ul>
+ <li>
+  <p>Firefox 桌面端和移动端在Gecko 44+中都支持，并且是没有前缀的，它可以在<code>about:config</code> 中把 <code>media.webspeech.recognition.enable</code> 设置为 <code>true</code> 打开。权限设置/UI还没有整理出来，所以权限还不能被用户使用，也就是不能用。不过很快会修复吧~</p>
+ </li>
+ <li>
+  <p>Firefox OS 2.5+ 也支持，但作为一个特权API(privileged API)需要权限，因此你需要在<a href="https://developer.mozilla.org/en-US/docs/Web/Apps/Build/Manifest">manifest.webapp</a> (也可以通过 WebIDE 下载， 或者使应用得到验证后在 <a href="https://marketplace.firefox.com/">Firefox Marketplace</a> 可使用)如下设置：</p>
+
+  <pre class="brush: json line-numbers  language-json"><code class="language-json"><span class="key token">"permissions":</span> <span class="punctuation token">{</span>
+  <span class="key token">"audio-capture" :</span> <span class="punctuation token">{</span>
+    <span class="key token">"description" :</span> <span class="string token">"Audio capture"</span>
+  <span class="punctuation token">}</span><span class="punctuation token">,</span>
+  <span class="key token">"speech-recognition" :</span> <span class="punctuation token">{</span>
+    <span class="key token">"description" :</span> <span class="string token">"Speech recognition"</span>
+  <span class="punctuation token">}</span>
+<span class="punctuation token">}</span></code></pre>
+
+  <pre class="brush: json line-numbers  language-json"><code class="language-json"><span class="key token">"type":</span> <span class="string token">"privileged"</span></code></pre>
+ </li>
+ <li>
+  <p>Chrome桌面端和Android端自version 33 以来均支持，但是带有前缀，所以你需要使用带有前缀的版本，比如：<code>webkitSpeechRecognition</code></p>
+ </li>
+</ul>
+
+<h3 id="HTML_和_CSS">HTML 和 CSS</h3>
+
+<p>对于这个应用来说，HTML和CSS部分是无足轻重的。仅仅只有一个标题，一个介绍段落和一个div用来输出check的结果。</p>
+
+<pre class="brush: html">&lt;h1&gt;Speech color changer&lt;/h1&gt;
+&lt;p&gt;Tap/click then say a color to change the background color of the app.&lt;/p&gt;
+&lt;div&gt;
+  &lt;p class="output"&gt;&lt;em&gt;...diagnostic messages&lt;/em&gt;&lt;/p&gt;
+&lt;/div&gt;</pre>
+
+<p>CSS也只是提供了简单的响应式样式，跨设备看上去也是ok的。</p>
+
+<h3 id="JavaScript">JavaScript</h3>
+
+<p>JavaScript部分会介绍更多细节。</p>
+
+<h4 id="浏览器支持">浏览器支持</h4>
+
+<p>之前有说到过，Chrome现在支持的是带有前缀的 speech recognition，因此在code开始部分得加些内容保证在需要前缀的Chrome和不需要前缀的像Firefox中，使用的object都是正确的。</p>
+
+<pre class="brush: js">var SpeechRecognition = SpeechRecognition || webkitSpeechRecognition
+var SpeechGrammarList = SpeechGrammarList || webkitSpeechGrammarList
+var SpeechRecognitionEvent = SpeechRecognitionEvent || webkitSpeechRecognitionEvent</pre>
+
+<h4 id="The_grammar">The grammar</h4>
+
+<p>这部分是我们的代码定义希望应用能够识别的语法。语法放在下面定义的变量<code>grammar</code>中：</p>
+
+<pre class="brush: js">var colors = [ 'aqua' , 'azure' , 'beige', 'bisque', 'black', 'blue', 'brown', 'chocolate', 'coral' ... ];
+var grammar = '#JSGF V1.0; grammar colors; public &lt;color&gt; = ' + colors.join(' | ') + ' ;'</pre>
+
+<p>语法格式使用的是 <a href="http://www.w3.org/TR/jsgf/">JSpeech Grammar Format</a> (<strong>JSGF</strong>) ——你可以在前面的链接中了解更多关于语法格式的规范。不过现在，让我们快速地浏览它：</p>
+
+<ul>
+ <li>
+  <p>每一行用分号分隔，和js中一样</p>
+ </li>
+ <li>
+  <p>第一行——<code>#JSGF V1.0</code> ——说的是语法使用的格式和版本。这总是需要首先包括在内</p>
+ </li>
+ <li>
+  <p>第二行表示我们想要识别的<code>term</code> 的类型(这里就是<code>colors</code>)。<code>public</code> 声明这是一条公共规则，尖括号中的字符串定义需要识别<code>term</code> 的名字(这里就是<code>color</code>)，等号后面的是这个<code>term</code> 可以被识别和接受的具体值。得注意每一个值如何被一个管道符号分割开的</p>
+ </li>
+ <li>
+  <p>你可以按照上面的结构，在多行中，想定义多少就定义多少<code>terms</code> ，也可以包括相当复杂的语法定义。对于我们这个简单的demo，就把语法定义的简单些</p>
+ </li>
+</ul>
+
+<h4 id="将grammer插入speech_recognition">将grammer插入speech recognition</h4>
+
+<p>接下来是使用 <a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition/SpeechRecognition"><code>SpeechRecognition()</code></a> 构造函数，定义一个speech recognition实例，控制对于这个应用的识别。还需要使用 <a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechGrammarList/SpeechGrammarList"><code>SpeechGrammarList()</code></a> 构造函数，创立一个speech grammer list对象，包含我们的语法。</p>
+
+<pre class="brush: js">var recognition = new SpeechRecognition();
+var speechRecognitionList = new SpeechGrammarList();</pre>
+
+<p>使用 <a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechGrammarList/addFromString"><code>SpeechGrammarList.addFromString()</code></a> 把语法添加到列表(list)，这个方法接收两个参数，第一个是我们想要添加的包含语法内容的字符串，第二个是对添加的这条语法的权重(权重值范围是0~1)，权重其实是相对于其他语法，这一条语法的重要程度。添加到列表的语法就是可用的，并且是一个<a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechGrammar"><code>SpeechGrammar</code></a> 实例。</p>
+
+<pre class="brush: js">speechRecognitionList.addFromString(grammar, 1);</pre>
+
+<p>我们然后通过设置 <a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition/grammars"><code>SpeechRecognition.grammars</code></a> 属性值，把我们的<a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechGrammarList"><code>SpeechGrammarList</code></a> 添加到speech recognition实例。在继续往前走之前，我们还需要设置recognition实例其他的一些属性：</p>
+
+<ul>
+ <li>
+  <p><a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition/lang"><code>SpeechRecognition.lang</code></a> ：设置识别的是什么语言。这个设定是良好的做好，因此墙裂推荐~</p>
+ </li>
+ <li>
+  <p><a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition/interimResults"><code>SpeechRecognition.interimResults</code></a> ：定义speech recognition系统要不要返回临时结果(interim results)，还是只返回最终结果。对于这个简单demo，只返回最终结果就够了。</p>
+ </li>
+ <li>
+  <p><a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition/maxAlternatives"><code>SpeechRecognition.maxAlternatives</code></a> ：定义每次结果返回的可能匹配值的数量。这有时有用，比如要的结果不明确，你想要用一个列表展示所有可能值，让用户自己从中选择一个正确的。但这里这个简单demo就不用了，因此我们设置为1(1也就是默认值)。</p>
+ </li>
+</ul>
+
+<pre class="brush: js">recognition.grammars = speechRecognitionList;
+//recognition.continuous = false;
+recognition.lang = 'en-US';
+recognition.interimResults = false;
+recognition.maxAlternatives = 1;</pre>
+
+<div class="note">
+<p><strong>Note</strong>: <a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition/continuous"><code>SpeechRecognition.continuous</code></a> 控制的是每一次允许多个结果被捕捉(比如在这个demo中连着说两个颜色关键字，都可以被捕捉)，或者一次只能识别一个结果。代码中它被注释掉的原因是，在Gecko中它还不被支持，所以如果把它加进去会破坏这个应用。你可以在收到第一个结果后简单地停止识别，从而得到类似的结果，稍后将会看到。</p>
+</div>
+
+<h4 id="开始语音识别">开始语音识别</h4>
+
+<p>在获取输出的<code>&lt;div&gt;</code> 和 html元素引用之后(这些我们可以用来待会输出语音识别诊断的结果，更新应用的背景颜色)，我们添加了一个<code>onclick</code> 事件处理，作用是当屏幕被点击后，语音识别服务将开启——这通过调用 <a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition/start"><code>SpeechRecognition.start()</code></a> 实现。<code>forEach()</code> 方法内部的工作是，为每个颜色关键字添加一个这个颜色的背景色，这样就直观知道了这个颜色关键字指向什么颜色。</p>
+
+<pre class="brush: js">var diagnostic = document.querySelector('.output');
+var bg = document.querySelector('html');
+var hints = document.querySelector('.hints');
+
+var colorHTML= '';
+colors.forEach(function(v, i, a){
+  console.log(v, i);
+  colorHTML += '&lt;span style="background-color:' + v + ';"&gt; ' + v + ' &lt;/span&gt;';
+});
+hints.innerHTML = 'Tap/click then say a color to change the background color of the app. Try '+ colorHTML + '.';
+
+document.body.onclick = function() {
+  recognition.start();
+  console.log('Ready to receive a color command.');
+}</pre>
+
+<h4 id="接收、处理结果">接收、处理结果</h4>
+
+<p>一旦语音识别开始，有许多event handlers可以用于做结果返回的后续操作，除了识别的结果外还有些零碎的相关信息可供操作(可查看 <a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition#Event_handlers"><code>SpeechRecognition</code> event handlers list</a> )。最常见会使用的一个是 <a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition/onresult"><code>SpeechRecognition.onresult</code></a> ，这在收到一个成功的结果时候触发。</p>
+
+<pre class="brush: js">recognition.onresult = function(event) {
+  var last = event.results.length - 1;
+  var color = event.results[last][0].transcript;
+  diagnostic.textContent = 'Result received: ' + color + '.';
+  bg.style.backgroundColor = color;
+  console.log('Confidence: ' + event.results[0][0].confidence);
+}</pre>
+
+<p>代码中第三行看上去有一点复杂，让我们一步一步解释以下。<a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognitionEvent/results"><code>SpeechRecognitionEvent.results</code></a> 属性返回的是一个<a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognitionResultList"><code>SpeechRecognitionResultList</code></a> 对象(这个对象会包含<a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognitionResult"><code>SpeechRecognitionResult</code></a> 对象们)，它有一个getter，所以它包含的这些对象可以像一个数组被访问到——所以<code>[last]</code> 返回的是排在最后位置(最新)的<code>SpeechRecognitionResult</code>对象。每个<code>SpeechRecognitionResult</code> 对象包含的 <a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognitionAlternative"><code>SpeechRecognitionAlternative</code></a> 对象含有一个被识别的单词。这些<code>SpeechRecognitionResult</code> 对象也有一个getter，所以<code>[0]</code> 返回的是其中包含的第一个<a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognitionAlternative"><code>SpeechRecognitionAlternative</code></a> 对象。最后返回的<code>transcript</code>属性就是被识别单词的字符串，把背景颜色设置为这个颜色，并在UI中报告出这个结果信息。</p>
+
+<p>也使用了 <a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition/onspeechend"><code>SpeechRecognition.onspeechend</code></a> 这个handle停止语音识别服务(具体工作在<a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition/stop"><code>SpeechRecognition.stop()</code></a>) ，一旦一个单词被识别就不能再说咯(必须再点击屏幕再次开启语音识别)</p>
+
+<pre class="brush: js">recognition.onspeechend = function() {
+  recognition.stop();
+}</pre>
+
+<h4 id="处理error和未能识别的语音">处理error和未能识别的语音</h4>
+
+<p>最后两个handlers处理的两种情况，一种是你说的内容不在定义的语法中所以识别不了，另一种是出现了error。</p>
+
+<p><a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition/onnomatch"><code>SpeechRecognition.onnomatch</code></a> 支持的就是第一种，但是得注意它似乎在Firefox或者Chrome中触发会有问题；它只是返回任何被识别的内容：</p>
+
+<pre class="brush: js">recognition.onnomatch = function(event) {
+  diagnostic.textContent = 'I didnt recognise that color.';
+}</pre>
+
+<p><a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition/onerror"><code>SpeechRecognition.onerror</code></a> 处理的是第二种情况，识别成功了但是有error出现—— <a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognitionError/error"><code>SpeechRecognitionError.error</code></a> 属性包含的信息就是返回的确切的error是什么。</p>
+
+<pre class="brush: js">recognition.onerror = function(event) {
+  diagnostic.textContent = 'Error occurred in recognition: ' + event.error;
+}</pre>
+
+<h2 id="Speech_synthesis">Speech synthesis</h2>
+
+<p>语音合成(也被称作是文本转为语音，英语简写是tts) 包括接收app中需要语音合成的文本，再在设备麦克风播放出来这两个过程。</p>
+
+<p>Web Speech API 对此有一个主要控制接口——<a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesis"><code>SpeechSynthesis</code></a> ，外加一些处理如何表示要被合成的文本(也被称为 utterances)，用什么声音来播出 utterances 等工作的相关接口。同样的，许多操作系统都有自己的某种语音合成系统，在这个任务中我们调用可用的API来使用语音合成系统。</p>
+
+<h3 id="Demo">Demo</h3>
+
+<p>为了展示 Web语音合成的简单使用，我们提供了一个例子—— <a href="https://github.com/mdn/web-speech-api/tree/gh-pages/speak-easy-synthesis">Speak easy synthesis</a> 。例子是一套表单控件，包括输入需要被合成的文本，设置音调、语速和发出文本时需要的语音。在输入文本之后，按下<code>Enter</code>/<code>Return</code>键使它播放。</p>
+
+<p><img alt="UI of an app called speak easy synthesis. It has an input field in which to input text to be synthesised, slider controls to change the rate and pitch of the speech, and a drop down menu to choose between different voices." src="https://mdn.mozillademos.org/files/11977/speak-easy-synthesis.png" style="border: 1px solid black; display: block; height: 533px; margin: 0px auto; width: 300px;"></p>
+
+<p>想跑这个例子，你可以git clone Github仓库中的部分(或者<a href="https://github.com/mdn/web-speech-api/archive/master.zip">直接下载</a>)，在桌面版支持的浏览器打开 index.html 文件，或者在移动端浏览器直接导向 <a href="http://mdn.github.io/web-speech-api/speak-easy-synthesis/">live demo URL</a> ，像 Chrome和Firefox OS。</p>
+
+<h3 id="浏览器支持_2">浏览器支持</h3>
+
+<p>Web Speech API 语音合成部分在各浏览器中还是在发展，还不成熟，现在有以下几个限制点：</p>
+
+<ul>
+ <li>
+  <p>Firefox 桌面版和移动版在 Gecko 42+(Windows)/44+支持，但是没有前缀，可以通过将<code>media.webspeech.synth.enabled</code>标志在<code>about:config</code>中转为<code>true</code>来启用。</p>
+ </li>
+ <li>
+  <p>Firefox OS 2.5+ 支持，但是默认的，不需要任何权限。</p>
+ </li>
+ <li>
+  <p>Chrome 桌面版和安卓版自33版以来都支持，但是没有前缀</p>
+ </li>
+</ul>
+
+<h3 id="HTML_和_CSS_2">HTML 和 CSS</h3>
+
+<p>HTML和CSS还是无足轻重，只是简单包含一个标题，一段介绍文字，以及一个表格带有一些简单控制功能。<code>&lt;select&gt;</code> 元素初始时空的，之后会用 js使用<code>&lt;option&gt;</code> 填充。</p>
+
+<pre class="brush: html">&lt;h1&gt;Speech synthesiser&lt;/h1&gt;
+
+&lt;p&gt;Enter some text in the input below and press return to hear it. change voices using the dropdown menu.&lt;/p&gt;
+
+&lt;form&gt;
+  &lt;input type="text" class="txt"&gt;
+  &lt;div&gt;
+    &lt;label for="rate"&gt;Rate&lt;/label&gt;&lt;input type="range" min="0.5" max="2" value="1" step="0.1" id="rate"&gt;
+    &lt;div class="rate-value"&gt;1&lt;/div&gt;
+    &lt;div class="clearfix"&gt;&lt;/div&gt;
+  &lt;/div&gt;
+  &lt;div&gt;
+    &lt;label for="pitch"&gt;Pitch&lt;/label&gt;&lt;input type="range" min="0" max="2" value="1" step="0.1" id="pitch"&gt;
+    &lt;div class="pitch-value"&gt;1&lt;/div&gt;
+    &lt;div class="clearfix"&gt;&lt;/div&gt;
+  &lt;/div&gt;
+  &lt;select&gt;
+
+  &lt;/select&gt;
+&lt;/form&gt;</pre>
+
+<h3 id="JavaScript_2">JavaScript</h3>
+
+<p>让我们看看js在这个app中的强大表现。</p>
+
+<h4 id="设置变量">设置变量</h4>
+
+<p>首先我们获得UI中涉及的DOM元素的引用，但更有趣的是，我们获得了<a href="https://developer.mozilla.org/en-US/docs/Web/API/Window/speechSynthesis"><code>Window.speechSynthesis</code></a> 的引用。这是API的入口点——它返回了<a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesis"><code>SpeechSynthesis</code></a> 的一个实例，对于 web语音合成的控制接口。</p>
+
+<pre class="brush: js">var synth = window.speechSynthesis;
+
+var inputForm = document.querySelector('form');
+var inputTxt = document.querySelector('.txt');
+var voiceSelect = document.querySelector('select');
+
+var pitch = document.querySelector('#pitch');
+var pitchValue = document.querySelector('.pitch-value');
+var rate = document.querySelector('#rate');
+var rateValue = document.querySelector('.rate-value');
+
+var voices = [];
+</pre>
+
+<h4 id="填充_select_元素">填充 select 元素</h4>
+
+<p>为使用设备可使用的不同的语音选项填充<code>&lt;select&gt;</code>元素，我们写了<code>populateVoiceList()</code> 方法。首先调用<a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesis/getVoices"><code>SpeechSynthesis.getVoices()</code></a> ，这个函数返回包含所有可用语音(<a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesisVoice"><code>SpeechSynthesisVoice</code></a>对象)的列表。接下来循环这个列表，每次创建一个<code>&lt;option&gt;</code> 元素，设置它的文本内容以显示声音的名称（从<code>SpeechSynthesisVoice.name</code>获取），语音的语言（从<code>SpeechSynthesisVoice.lang</code>获取），如果某个语音是合成引擎默认的(检查<a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesisVoice/default"><code>SpeechSynthesisVoice.default</code></a>属性返回<code>true</code>) 在文本内容后面添加<code>-- DEFAULT</code>。</p>
+
+<p>对于每个<code>option</code>元素，我们也创建了<code>data-</code> 属性，属性值是语音的名字和语言，这样在之后我们可以轻松获取这个信息。之后把所有的<code>option</code>元素作为孩子添加到<code>select</code> 元素。</p>
+
+<pre class="brush: js">function populateVoiceList() {
+  voices = synth.getVoices();
+
+  for(i = 0; i &lt; voices.length ; i++) {
+    var option = document.createElement('option');
+    option.textContent = voices[i].name + ' (' + voices[i].lang + ')';
+
+    if(voices[i].default) {
+      option.textContent += ' -- DEFAULT';
+    }
+
+    option.setAttribute('data-lang', voices[i].lang);
+    option.setAttribute('data-name', voices[i].name);
+    voiceSelect.appendChild(option);
+  }
+}</pre>
+
+<p>接下来是运行这个函数，我们做如下代码工作。因为Firefox不支持<a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesis/onvoiceschanged"><code>SpeechSynthesis.onvoiceschanged</code></a> ，所以很常规地只是返回语音对象列表当<a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesis/getVoices"><code>SpeechSynthesis.getVoices()</code></a> 被触发。但是 Chrome 就有点不同了，在<code>SpeechSynthesis.getVoices()</code> 被触发时，先要等待事件触发(有点绕~按照下面代码，<code>populateVoiceList</code> 函数在Firefox运行一次，在Chrome中运行两次)：</p>
+
+<pre class="brush: js">populateVoiceList();
+if (speechSynthesis.onvoiceschanged !== undefined) {
+  speechSynthesis.onvoiceschanged = populateVoiceList;
+}</pre>
+
+<h4 id="说出输入的文本">说出输入的文本</h4>
+
+<p>接下来我们创建一个事件处理器(handler)，开始说出在文本框中输入的文本。我们把<code>onsubmit</code> 处理器挂在表单上，当<code>Enter/Return</code> 被按下，对应行为就会发生。我们首先通过构造函数创建一个新的<a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesisUtterance/SpeechSynthesisUtterance"><code>SpeechSynthesisUtterance()</code></a> 实例——把文本输入框中的值作为参数传递。</p>
+
+<p>接下来，我们需要弄清楚使用哪种语音。使用<a href="https://developer.mozilla.org/en-US/docs/Web/API/HTMLSelectElement"><code>HTMLSelectElement</code></a> <code>selectedOptions</code> 属性返回当前选中的 <a href="https://developer.mozilla.org/en-US/docs/Web/HTML/Element/option"><code>&lt;option&gt;</code></a> 元素。然后使用元素的<code>data-name</code>属性，找到 <a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesisVoice"><code>SpeechSynthesisVoice</code></a> 对象的<code>name</code>匹配<code>data-name</code> 的值。把匹配的语音对象设置为<a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesisUtterance/voice"><code>SpeechSynthesisUtterance.voice</code></a> 的属性值。</p>
+
+<p>最后，我们设置 <a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesisUtterance/pitch"><code>SpeechSynthesisUtterance.pitch</code></a> 和<a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesisUtterance/rate"><code>SpeechSynthesisUtterance.rate</code></a> 属性值为对应范围表单元素中的值。哈哈所有准备工作就绪，调用 <a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesis/speak"><code>SpeechSynthesis.speak()</code></a> 开始说话。把 <a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesisUtterance"><code>SpeechSynthesisUtterance</code></a> 实例作为参数传递。</p>
+
+<pre class="brush: js">inputForm.onsubmit = function(event) {
+  event.preventDefault();
+
+  var utterThis = new SpeechSynthesisUtterance(inputTxt.value);
+  var selectedOption = voiceSelect.selectedOptions[0].getAttribute('data-name');
+  for(i = 0; i &lt; voices.length ; i++) {
+    if(voices[i].name === selectedOption) {
+      utterThis.voice = voices[i];
+    }
+  }
+  utterThis.pitch = pitch.value;
+  utterThis.rate = rate.value;
+  synth.speak(utterThis);</pre>
+
+<p>在事件处理器的最后部分，我们加入了一个 <a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesisUtterance/onpause"><code>SpeechSynthesisUtterance.onpause</code></a> 处理器，来展示<a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesisEvent"><code>SpeechSynthesisEvent</code></a> 如何可以很好地使用。当 <a href="https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesis/pause"><code>SpeechSynthesis.pause()</code></a> 被调用，这将返回一条消息，报告该语音暂停的字符编号和名称。</p>
+
+<pre class="brush: js">   utterThis.onpause = function(event) {
+    var char = event.utterance.text.charAt(event.charIndex);
+    console.log('Speech paused at character ' + event.charIndex + ' of "' +
+    event.utterance.text + '", which is "' + char + '".');
+  }</pre>
+
+<p>最后，我们在文本输入框添加了 <a href="https://developer.mozilla.org/en-US/docs/Web/API/HTMLElement/blur">blur()</a> 方法。这主要是在Firefox操作系统上隐藏键盘</p>
+
+<pre class="brush: js">  inputTxt.blur();
+}</pre>
+
+<h4 id="Updating_the_displayed_pitch_and_rate_values">Updating the displayed pitch and rate values</h4>
+
+<p>代码的最后部分，在每次滑动条移动时，简单地更新<code>pitch/rate</code>在UI中展示的值。</p>
+
+<pre class="brush: js">pitch.onchange = function() {
+  pitchValue.textContent = pitch.value;
+}
+
+rate.onchange = function() {
+  rateValue.textContent = rate.value;
+}</pre>
+
+<p> </p>
+
+<p> </p>
+
+<p> </p>
+
+<p> </p>
author	Peter Bengtsson <mail@peterbe.com>	2020-12-08 14:40:17 -0500
committer	Peter Bengtsson <mail@peterbe.com>	2020-12-08 14:40:17 -0500
commit	33058f2b292b3a581333bdfb21b8f671898c5060 (patch)
tree	51c3e392513ec574331b2d3f85c394445ea803c6 /files/zh-cn/web/api/web_speech_api
parent	8b66d724f7caf0157093fb09cfec8fbd0c6ad50a (diff)
download	translated-content-33058f2b292b3a581333bdfb21b8f671898c5060.tar.gz translated-content-33058f2b292b3a581333bdfb21b8f671898c5060.tar.bz2 translated-content-33058f2b292b3a581333bdfb21b8f671898c5060.zip