diff options
Diffstat (limited to 'files/zh-cn/glossary/base64/index.html')
-rw-r--r-- | files/zh-cn/glossary/base64/index.html | 605 |
1 files changed, 605 insertions, 0 deletions
diff --git a/files/zh-cn/glossary/base64/index.html b/files/zh-cn/glossary/base64/index.html new file mode 100644 index 0000000000..5da5d1e0f4 --- /dev/null +++ b/files/zh-cn/glossary/base64/index.html @@ -0,0 +1,605 @@ +--- +title: Base64的编码与解码 +slug: Web/API/WindowBase64/Base64_encoding_and_decoding +translation_of: Glossary/Base64 +--- +<p><strong>Base64 </strong>是一组相似的<a href="https://en.wikipedia.org/wiki/Binary-to-text_encoding">二进制到文本</a>(binary-to-text)的编码规则,使得二进制数据在解释成 radix-64 的表现形式后能够用 ASCII 字符串的格式表示出来。<em>Base64</em> 这个词出自一种 <a href="https://en.wikipedia.org/wiki/MIME#Content-Transfer-Encoding">MIME 数据传输编码</a>。 </p> + +<p>Base64编码普遍应用于需要通过被设计为处理文本数据的媒介上储存和传输二进制数据而需要编码该二进制数据的场景。这样是为了保证数据的完整并且不用在传输过程中修改这些数据。Base64 也被一些应用(包括使用 <a href="https://en.wikipedia.org/wiki/MIME">MIME</a> 的电子邮件)和在 <a href="/zh-CN/docs/XML">XML</a> 中储存复杂数据时使用。 </p> + +<p>在 JavaScript 中,有两个函数被分别用来处理解码和编码 <em>base64</em> 字符串:</p> + +<ul> + <li>{{domxref("WindowBase64.atob","atob()")}}</li> + <li>{{domxref("WindowBase64.btoa","btoa()")}}</li> +</ul> + +<p><code>atob()</code> 函数能够解码通过base-64编码的字符串数据。相反地,<code>btoa()</code> 函数能够从二进制数据“字符串”创建一个base-64编码的ASCII字符串。</p> + +<p><code>atob()</code> 和 <code>btoa()</code> 均使用字符串。如果你想使用 <code><a href="/en-US/docs/Web/API/ArrayBuffer">ArrayBuffers</a></code>,请参阅后文。</p> + +<h4 id="编码尺寸增加">编码尺寸增加</h4> + +<p>每一个Base64字符实际上代表着6比特位。因此,3字节(一字节是8比特,3字节也就是24比特)的字符串/二进制文件可以转换成4个Base64字符(4x6 = 24比特)。</p> + +<p>这意味着Base64格式的字符串或文件的尺寸约是原始尺寸的133%(增加了大约33%)。如果编码的数据很少,增加的比例可能会更高。例如:字符串<code>"a"</code>的<code>length === 1</code>进行Base64编码后是<code>"YQ=="</code>的<code>length === 4</code>,尺寸增加了300%。</p> + + + +<table class="topicpage-table"> + <tbody> + <tr> + <td> + <h2 class="Documentation" id="Documentation" name="Documentation">文档</h2> + + <dl> + <dt><a href="https://developer.mozilla.org/en-US/docs/data_URIs" title="https://developer.mozilla.org/en-US/docs/data_URIs"><code>data</code> URIs</a></dt> + <dd><small><code>data</code> URIs, 定义于 <a class="external" href="http://tools.ietf.org/html/rfc2397" title="http://tools.ietf.org/html/rfc2397">RFC 2397</a>,用于在文档内嵌入小的文件。</small></dd> + <dt><a href="https://en.wikipedia.org/wiki/Base64" title="https://en.wikipedia.org/wiki/Base64">Base64</a></dt> + <dd><small>维基百科上关于 Base64 的文章。</small></dd> + <dt>{{domxref("WindowBase64.atob","atob()")}}</dt> + <dd><small>解码一个Base64字符串。</small></dd> + <dt>{{domxref("WindowBase64.btoa","btoa()")}}</dt> + <dd><small>从一个字符串或者二进制数据编码一个Base64字符串。</small></dd> + <dt><a href="#The_.22Unicode_Problem.22">"Unicode 问题"</a></dt> + <dd><small>在大多数浏览器里里,在一个Unicode字符串上调用btoa()会造成一个<code>Character Out Of Range异常。这一段写了一些解决方案。</code></small></dd> + <dt><a href="/en-US/docs/URIScheme" title="/en-US/docs/URIScheme">URIScheme</a></dt> + <dd><small>Mozilla支持的URI schemes列表。</small></dd> + <dt><a href="/en-US/docs/Web/JavaScript/Typed_arrays/StringView" title="/en-US/docs/Web/JavaScript/Typed_arrays/StringView"><code>StringView</code></a></dt> + <dd>这篇文章发布了一个我们做的库,目的在于: + <ul> + <li>为字符串创建一个类C接口 (i.e. array of characters codes —<a href="/en-US/docs/Web/JavaScript/Typed_arrays/ArrayBufferView" title="/en-US/docs/Web/JavaScript/Typed_arrays/ArrayBufferView"> <code>ArrayBufferView</code></a> in JavaScript) ,基于JavaScript <a href="/en-US/docs/Web/JavaScript/Typed_arrays/ArrayBuffer" title="/en-US/docs/Web/JavaScript/Typed_arrays/ArrayBuffer"><code>ArrayBuffer</code></a> 接口。</li> + <li>为类字符串对象(目前为止为: <code>stringView</code>s) 创建一系列方法,它们<strong>严格按照数字数组</strong>工作,而不是不可变的字符串。</li> + <li>可用于其它Unicode编码,和默认的 <code><a href="/en-US/docs/Web/API/DOMString" title="/en-US/docs/Web/API/DOMString">DOMStrings</a>不同。</code></li> + </ul> + </dd> + </dl> + + <p><span class="alllinks"><a href="/en-US/docs/tag/Base64">查看所有...</a></span></p> + </td> + <td> + <h2 class="Tools" id="Tools" name="Tools">工具</h2> + + <ul> + <li><a href="#Solution_.232_.E2.80.93_rewriting_atob()_and_btoa()_using_TypedArrays_and_UTF-8">Rewriting <code>atob()</code> and <code>btoa()</code> using <code>TypedArray</code>s and UTF-8</a></li> + <li><a href="/en-US/docs/Web/JavaScript/Typed_arrays/StringView" title="/en-US/docs/Web/JavaScript/Typed_arrays/StringView"><code>StringView</code> – a C-like representation of strings based on typed arrays</a></li> + </ul> + + <p><span class="alllinks"><a href="/en-US/docs/tag/Base64">View All...</a></span></p> + + <h2 class="Related_Topics" id="Related_Topics" name="Related_Topics">相关文章</h2> + + <ul> + <li><a href="/en-US/docs/Web/JavaScript/Typed_arrays/ArrayBuffer"><code>ArrayBuffer</code></a></li> + <li><a href="/en-US/docs/Web/JavaScript/Typed_arrays">Typed arrays</a></li> + <li><a href="/en-US/docs/Web/JavaScript/Typed_arrays/ArrayBufferView">ArrayBufferView</a></li> + <li><a href="/en-US/docs/Web/JavaScript/Typed_arrays/Uint8Array"><code>Uint8Array</code></a></li> + <li><a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Typed_arrays/StringView" title="/en-US/docs/Web/JavaScript/Typed_arrays/StringView"><code>StringView</code> – a C-like representation of strings based on typed arrays</a></li> + <li><a href="/en-US/docs/Web/API/DOMString" title="/en-US/docs/Web/API/DOMString"><code>DOMString</code></a></li> + <li><a href="/en-US/docs/URI" title="/en-US/docs/URI"><code>URI</code></a></li> + <li><a href="/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURI" title="/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURI"><code>encodeURI()</code></a></li> + </ul> + </td> + </tr> + </tbody> +</table> + +<h2 id="Unicode_问题">Unicode 问题</h2> + +<p>由于 <a href="/en-US/docs/Web/API/DOMString" title="/en-US/docs/Web/API/DOMString"><code>DOMString</code></a> 是16位编码的字符串,所以如果有字符超出了8位ASCII编码的字符范围时,在大多数的浏览器中对Unicode字符串调用 <code>window.btoa</code> 将会造成一个 <code>Character Out Of Range</code> 的异常。有很多种方法可以解决这个问题:</p> + +<ul> + <li><a href="#Solution_1_–_JavaScript's_UTF-16_>_base64">the first method</a> consists in encoding JavaScript's native UTF-16 strings directly into base64 (fast, portable, clean)</li> + <li><a href="#Solution_2_–_JavaScript's_UTF-16_>_UTF-8_>_base64">the second method</a> consists in converting JavaScript's native UTF-16 strings to UTF-8 and then encode the latter into base64 (relatively fast, portable, clean)</li> + <li><a href="#Solution_3_–_JavaScript's_UTF-16_>_binary_string_>_base64">the third method</a> consists in encoding JavaScript's native UTF-16 strings directly into base64 via binary strings (very fast, relatively portable, very compact)</li> + <li><a href="#Solution_4_–_escaping_the_string_before_encoding_it">the fourth method</a> consists in escaping the whole string (with UTF-8, see {{jsxref("encodeURIComponent")}}) and then encode it (portable, non-standard)</li> + <li><a href="#Solution_5_–_rewrite_the_DOMs_atob()_and_btoa()_using_JavaScript's_TypedArrays_and_UTF-8">the fifth method</a> is similar to the second method, but uses third party libraries</li> +</ul> + +<h3 id="Solution_1_–_JavaScripts_UTF-16_>_base64">Solution #1 – JavaScript's UTF-16 => base64</h3> + +<p>A very fast and widely useable way to solve the unicode problem is by encoding JavaScript native UTF-16 strings directly into base64. Please visit the URL <code>data:text/plain;charset=utf-16;base64,OCY5JjomOyY8Jj4mPyY=</code> for a demonstration (copy the data uri, open a new tab, paste the data URI into the address bar, then press enter to go to the page). This method is particularly efficient because it does not require any type of conversion, except mapping a string into an array. The following code is also useful to get an <a href="/en-US/docs/Web/JavaScript/Typed_arrays/ArrayBuffer">ArrayBuffer</a> from a <em>Base64</em> string and/or viceversa (<a href="#Appendix_to_Solution_1_Decode_a_Base64_string_to_Uint8Array_or_ArrayBuffer" title="Appendix to Solution #1: Decode a Base64 string to Uint8Array or ArrayBuffer">see below</a>).</p> + +<pre class="brush: js notranslate">"use strict"; + +/*\ +|*| +|*| Base64 / binary data / UTF-8 strings utilities (#1) +|*| +|*| https://developer.mozilla.org/en-US/docs/Web/API/WindowBase64/Base64_encoding_and_decoding +|*| +|*| Author: madmurphy +|*| +\*/ + +/* Array of bytes to base64 string decoding */ + +function b64ToUint6 (nChr) { + + return nChr > 64 && nChr < 91 ? + nChr - 65 + : nChr > 96 && nChr < 123 ? + nChr - 71 + : nChr > 47 && nChr < 58 ? + nChr + 4 + : nChr === 43 ? + 62 + : nChr === 47 ? + 63 + : + 0; + +} + +function base64DecToArr (sBase64, nBlockSize) { + + var + sB64Enc = sBase64.replace(/[^A-Za-z0-9\+\/]/g, ""), nInLen = sB64Enc.length, + nOutLen = nBlockSize ? Math.ceil((nInLen * 3 + 1 >>> 2) / nBlockSize) * nBlockSize : nInLen * 3 + 1 >>> 2, aBytes = new Uint8Array(nOutLen); + + for (var nMod3, nMod4, nUint24 = 0, nOutIdx = 0, nInIdx = 0; nInIdx < nInLen; nInIdx++) { + nMod4 = nInIdx & 3; + nUint24 |= b64ToUint6(sB64Enc.charCodeAt(nInIdx)) << 18 - 6 * nMod4; + if (nMod4 === 3 || nInLen - nInIdx === 1) { + for (nMod3 = 0; nMod3 < 3 && nOutIdx < nOutLen; nMod3++, nOutIdx++) { + aBytes[nOutIdx] = nUint24 >>> (16 >>> nMod3 & 24) & 255; + } + nUint24 = 0; + } + } + + return aBytes; +} + +/* Base64 string to array encoding */ + +function uint6ToB64 (nUint6) { + + return nUint6 < 26 ? + nUint6 + 65 + : nUint6 < 52 ? + nUint6 + 71 + : nUint6 < 62 ? + nUint6 - 4 + : nUint6 === 62 ? + 43 + : nUint6 === 63 ? + 47 + : + 65; + +} + +function base64EncArr (aBytes) { + + var eqLen = (3 - (aBytes.length % 3)) % 3, sB64Enc = ""; + + for (var nMod3, nLen = aBytes.length, nUint24 = 0, nIdx = 0; nIdx < nLen; nIdx++) { + nMod3 = nIdx % 3; + /* Uncomment the following line in order to split the output in lines 76-character long: */ + /* + if (nIdx > 0 && (nIdx * 4 / 3) % 76 === 0) { sB64Enc += "\r\n"; } + */ + nUint24 |= aBytes[nIdx] << (16 >>> nMod3 & 24); + if (nMod3 === 2 || aBytes.length - nIdx === 1) { + sB64Enc += String.fromCharCode(uint6ToB64(nUint24 >>> 18 & 63), uint6ToB64(nUint24 >>> 12 & 63), uint6ToB64(nUint24 >>> 6 & 63), uint6ToB64(nUint24 & 63)); + nUint24 = 0; + } + } + + return eqLen === 0 ? + sB64Enc + : + sB64Enc.substring(0, sB64Enc.length - eqLen) + (eqLen === 1 ? "=" : "=="); + +} +</pre> + +<h4 id="Tests">Tests</h4> + +<pre class="brush: js notranslate">var myString = "☸☹☺☻☼☾☿"; + +/* Part 1: Encode `myString` to base64 using native UTF-16 */ + +var aUTF16CodeUnits = new Uint16Array(myString.length); +Array.prototype.forEach.call(aUTF16CodeUnits, function (el, idx, arr) { arr[idx] = myString.charCodeAt(idx); }); +var sUTF16Base64 = base64EncArr(new Uint8Array(aUTF16CodeUnits.buffer)); + +/* Show output */ + +alert(sUTF16Base64); // "OCY5JjomOyY8Jj4mPyY=" + +/* Part 2: Decode `sUTF16Base64` to UTF-16 */ + +var sDecodedString = String.fromCharCode.apply(null, new Uint16Array(base64DecToArr(sUTF16Base64, 2).buffer)); + +/* Show output */ + +alert(sDecodedString); // "☸☹☺☻☼☾☿"</pre> + +<p>The produced string is fully portable, although represented as UTF-16. If you prefer UTF-8, see <a href="#Solution_2_–_JavaScript's_UTF-16_>_UTF-8_>_base64">the next solution</a>.</p> + +<h4 id="Appendix_to_Solution_1_Decode_a_Base64_string_to_Uint8Array_or_ArrayBuffer">Appendix to <a href="#Solution_1_–_JavaScript's_UTF-16_>_base64">Solution #1</a>: Decode a <em>Base64</em> string to <a href="/en-US/docs/Web/JavaScript/Typed_arrays/Uint8Array">Uint8Array</a> or <a href="/en-US/docs/Web/JavaScript/Typed_arrays/ArrayBuffer">ArrayBuffer</a></h4> + +<p>The functions above let us also create <a href="/en-US/docs/Web/JavaScript/Typed_arrays/Uint8Array">uint8Arrays</a> or <a href="/en-US/docs/Web/JavaScript/Typed_arrays/ArrayBuffer">arrayBuffers</a> from <em>base64</em>-encoded strings:</p> + +<pre class="brush: js notranslate">var myArray = base64DecToArr("QmFzZSA2NCDigJQgTW96aWxsYSBEZXZlbG9wZXIgTmV0d29yaw=="); // "Base 64 \u2014 Mozilla Developer Network" (as UTF-8) + +var myBuffer = base64DecToArr("QmFzZSA2NCDigJQgTW96aWxsYSBEZXZlbG9wZXIgTmV0d29yaw==").buffer; // "Base 64 \u2014 Mozilla Developer Network" (as UTF-8) + +alert(myBuffer.byteLength);</pre> + +<div class="note"><strong>Note:</strong> The function <code>base64DecToArr(sBase64[, <em>nBlockSize</em>])</code> returns an <a href="/en-US/docs/Web/JavaScript/Typed_arrays/Uint8Array"><code>uint8Array</code></a> of bytes. If your aim is to build a buffer of 16-bit / 32-bit / 64-bit raw data, use the <code>nBlockSize</code> argument, which is the number of bytes which the <code>uint8Array.buffer.bytesLength</code> property must result to be a multiple of (<code>1</code> or omitted for ASCII, binary content, <a href="https://developer.mozilla.org/en-US/docs/Web/API/DOMString/Binary">binary strings</a>, UTF-8-encoded strings; <code>2</code> for UTF-16 strings; <code>4</code> for UTF-32 strings).</div> + +<p>For a more complete library, see <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Typed_arrays/StringView" title="/en-US/docs/Web/JavaScript/Typed_arrays/StringView"><code>StringView</code> – a C-like representation of strings based on typed arrays</a> (source code <a href="https://github.com/madmurphy/stringview.js">available on GitHub</a>).</p> + +<h3 id="Solution_2_–_JavaScripts_UTF-16_>_UTF-8_>_base64">Solution #2 – JavaScript's UTF-16 => UTF-8 => base64</h3> + +<p>This solution consists in converting a JavaScript's native UTF-16 string into a UTF-8 string and then encoding the latter into base64. This also grants that converting a pure ASCII string to base64 always produces the same output as the native <a href="https://developer.mozilla.org/en-US/docs/Web/API/WindowBase64/btoa" title="The documentation about this has not yet been written; please consider contributing!"><code>btoa()</code></a>.</p> + +<pre class="brush: js notranslate">"use strict"; + +/*\ +|*| +|*| Base64 / binary data / UTF-8 strings utilities (#2) +|*| +|*| https://developer.mozilla.org/en-US/docs/Web/API/WindowBase64/Base64_encoding_and_decoding +|*| +|*| Author: madmurphy +|*| +\*/ + +/* Array of bytes to base64 string decoding */ + +function b64ToUint6 (nChr) { + + return nChr > 64 && nChr < 91 ? + nChr - 65 + : nChr > 96 && nChr < 123 ? + nChr - 71 + : nChr > 47 && nChr < 58 ? + nChr + 4 + : nChr === 43 ? + 62 + : nChr === 47 ? + 63 + : + 0; + +} + +function base64DecToArr (sBase64, nBlockSize) { + + var + sB64Enc = sBase64.replace(/[^A-Za-z0-9\+\/]/g, ""), nInLen = sB64Enc.length, + nOutLen = nBlockSize ? Math.ceil((nInLen * 3 + 1 >>> 2) / nBlockSize) * nBlockSize : nInLen * 3 + 1 >>> 2, aBytes = new Uint8Array(nOutLen); + + for (var nMod3, nMod4, nUint24 = 0, nOutIdx = 0, nInIdx = 0; nInIdx < nInLen; nInIdx++) { + nMod4 = nInIdx & 3; + nUint24 |= b64ToUint6(sB64Enc.charCodeAt(nInIdx)) << 18 - 6 * nMod4; + if (nMod4 === 3 || nInLen - nInIdx === 1) { + for (nMod3 = 0; nMod3 < 3 && nOutIdx < nOutLen; nMod3++, nOutIdx++) { + aBytes[nOutIdx] = nUint24 >>> (16 >>> nMod3 & 24) & 255; + } + nUint24 = 0; + } + } + + return aBytes; +} + +/* Base64 string to array encoding */ + +function uint6ToB64 (nUint6) { + + return nUint6 < 26 ? + nUint6 + 65 + : nUint6 < 52 ? + nUint6 + 71 + : nUint6 < 62 ? + nUint6 - 4 + : nUint6 === 62 ? + 43 + : nUint6 === 63 ? + 47 + : + 65; + +} + +function base64EncArr (aBytes) { + + var eqLen = (3 - (aBytes.length % 3)) % 3, sB64Enc = ""; + + for (var nMod3, nLen = aBytes.length, nUint24 = 0, nIdx = 0; nIdx < nLen; nIdx++) { + nMod3 = nIdx % 3; + /* Uncomment the following line in order to split the output in lines 76-character long: */ + /* + if (nIdx > 0 && (nIdx * 4 / 3) % 76 === 0) { sB64Enc += "\r\n"; } + */ + nUint24 |= aBytes[nIdx] << (16 >>> nMod3 & 24); + if (nMod3 === 2 || aBytes.length - nIdx === 1) { + sB64Enc += String.fromCharCode(uint6ToB64(nUint24 >>> 18 & 63), uint6ToB64(nUint24 >>> 12 & 63), uint6ToB64(nUint24 >>> 6 & 63), uint6ToB64(nUint24 & 63)); + nUint24 = 0; + } + } + + return eqLen === 0 ? + sB64Enc + : + sB64Enc.substring(0, sB64Enc.length - eqLen) + (eqLen === 1 ? "=" : "=="); + +} + +/* UTF-8 array to DOMString and vice versa */ + +function UTF8ArrToStr (aBytes) { + + var sView = ""; + + for (var nPart, nLen = aBytes.length, nIdx = 0; nIdx < nLen; nIdx++) { + nPart = aBytes[nIdx]; + sView += String.fromCharCode( + nPart > 251 && nPart < 254 && nIdx + 5 < nLen ? /* six bytes */ + /* (nPart - 252 << 30) may be not so safe in ECMAScript! So...: */ + (nPart - 252) * 1073741824 + (aBytes[++nIdx] - 128 << 24) + (aBytes[++nIdx] - 128 << 18) + (aBytes[++nIdx] - 128 << 12) + (aBytes[++nIdx] - 128 << 6) + aBytes[++nIdx] - 128 + : nPart > 247 && nPart < 252 && nIdx + 4 < nLen ? /* five bytes */ + (nPart - 248 << 24) + (aBytes[++nIdx] - 128 << 18) + (aBytes[++nIdx] - 128 << 12) + (aBytes[++nIdx] - 128 << 6) + aBytes[++nIdx] - 128 + : nPart > 239 && nPart < 248 && nIdx + 3 < nLen ? /* four bytes */ + (nPart - 240 << 18) + (aBytes[++nIdx] - 128 << 12) + (aBytes[++nIdx] - 128 << 6) + aBytes[++nIdx] - 128 + : nPart > 223 && nPart < 240 && nIdx + 2 < nLen ? /* three bytes */ + (nPart - 224 << 12) + (aBytes[++nIdx] - 128 << 6) + aBytes[++nIdx] - 128 + : nPart > 191 && nPart < 224 && nIdx + 1 < nLen ? /* two bytes */ + (nPart - 192 << 6) + aBytes[++nIdx] - 128 + : /* nPart < 127 ? */ /* one byte */ + nPart + ); + } + + return sView; + +} + +function strToUTF8Arr (sDOMStr) { + + var aBytes, nChr, nStrLen = sDOMStr.length, nArrLen = 0; + + /* mapping... */ + + for (var nMapIdx = 0; nMapIdx < nStrLen; nMapIdx++) { + nChr = sDOMStr.charCodeAt(nMapIdx); + nArrLen += nChr < 0x80 ? 1 : nChr < 0x800 ? 2 : nChr < 0x10000 ? 3 : nChr < 0x200000 ? 4 : nChr < 0x4000000 ? 5 : 6; + } + + aBytes = new Uint8Array(nArrLen); + + /* transcription... */ + + for (var nIdx = 0, nChrIdx = 0; nIdx < nArrLen; nChrIdx++) { + nChr = sDOMStr.charCodeAt(nChrIdx); + if (nChr < 128) { + /* one byte */ + aBytes[nIdx++] = nChr; + } else if (nChr < 0x800) { + /* two bytes */ + aBytes[nIdx++] = 192 + (nChr >>> 6); + aBytes[nIdx++] = 128 + (nChr & 63); + } else if (nChr < 0x10000) { + /* three bytes */ + aBytes[nIdx++] = 224 + (nChr >>> 12); + aBytes[nIdx++] = 128 + (nChr >>> 6 & 63); + aBytes[nIdx++] = 128 + (nChr & 63); + } else if (nChr < 0x200000) { + /* four bytes */ + aBytes[nIdx++] = 240 + (nChr >>> 18); + aBytes[nIdx++] = 128 + (nChr >>> 12 & 63); + aBytes[nIdx++] = 128 + (nChr >>> 6 & 63); + aBytes[nIdx++] = 128 + (nChr & 63); + } else if (nChr < 0x4000000) { + /* five bytes */ + aBytes[nIdx++] = 248 + (nChr >>> 24); + aBytes[nIdx++] = 128 + (nChr >>> 18 & 63); + aBytes[nIdx++] = 128 + (nChr >>> 12 & 63); + aBytes[nIdx++] = 128 + (nChr >>> 6 & 63); + aBytes[nIdx++] = 128 + (nChr & 63); + } else /* if (nChr <= 0x7fffffff) */ { + /* six bytes */ + aBytes[nIdx++] = 252 + (nChr >>> 30); + aBytes[nIdx++] = 128 + (nChr >>> 24 & 63); + aBytes[nIdx++] = 128 + (nChr >>> 18 & 63); + aBytes[nIdx++] = 128 + (nChr >>> 12 & 63); + aBytes[nIdx++] = 128 + (nChr >>> 6 & 63); + aBytes[nIdx++] = 128 + (nChr & 63); + } + } + + return aBytes; + +}</pre> + +<h4 id="Tests_2">Tests</h4> + +<pre class="brush: js notranslate">/* Tests */ + +var sMyInput = "Base 64 \u2014 Mozilla Developer Network"; + +var aMyUTF8Input = strToUTF8Arr(sMyInput); + +var sMyBase64 = base64EncArr(aMyUTF8Input); + +alert(sMyBase64); // "QmFzZSA2NCDigJQgTW96aWxsYSBEZXZlbG9wZXIgTmV0d29yaw==" + +var aMyUTF8Output = base64DecToArr(sMyBase64); + +var sMyOutput = UTF8ArrToStr(aMyUTF8Output); + +alert(sMyOutput); // "Base 64 — Mozilla Developer Network"</pre> + +<h3 id="Solution_3_–_JavaScripts_UTF-16_>_binary_string_>_base64">Solution #3 – JavaScript's UTF-16 => binary string => base64</h3> + +<p>The following is the fastest and most compact possible approach. The output is exactly the same produced by <a href="#Solution_1_–_JavaScript's_UTF-16_>_base64">Solution #1</a> (UTF-16 encoded strings), but instead of rewriting {{domxref("WindowBase64.atob","atob()")}} and {{domxref("WindowBase64.btoa","btoa()")}} it uses the native ones. This is made possible by the fact that instead of using typed arrays as encoding/decoding inputs this solution uses <a href="/en-US/docs/Web/API/DOMString/Binary">binary strings</a> as an intermediate format. It is a “dirty” workaround in comparison to <a href="#Solution_1_–_JavaScript's_UTF-16_>_base64">Solution #1</a> (<a href="/en-US/docs/Web/API/DOMString/Binary">binary strings</a> are a grey area), however it works pretty well and requires only a few lines of code.</p> + +<pre class="brush: js notranslate">"use strict"; + +/*\ +|*| +|*| Base64 / binary data / UTF-8 strings utilities (#3) +|*| +|*| https://developer.mozilla.org/en-US/docs/Web/API/WindowBase64/Base64_encoding_and_decoding +|*| +|*| Author: madmurphy +|*| +\*/ + +function btoaUTF16 (sString) { + + var aUTF16CodeUnits = new Uint16Array(sString.length); + Array.prototype.forEach.call(aUTF16CodeUnits, function (el, idx, arr) { arr[idx] = sString.charCodeAt(idx); }); + return btoa(String.fromCharCode.apply(null, new Uint8Array(aUTF16CodeUnits.buffer))); + +} + +function atobUTF16 (sBase64) { + + var sBinaryString = atob(sBase64), aBinaryView = new Uint8Array(sBinaryString.length); + Array.prototype.forEach.call(aBinaryView, function (el, idx, arr) { arr[idx] = sBinaryString.charCodeAt(idx); }); + return String.fromCharCode.apply(null, new Uint16Array(aBinaryView.buffer)); + +}</pre> + +<h4 id="Tests_3">Tests</h4> + +<pre class="brush: js notranslate">var myString = "☸☹☺☻☼☾☿"; + +/* Part 1: Encode `myString` to base64 using native UTF-16 */ + +var sUTF16Base64 = btoaUTF16(myString); + +/* Show output */ + +alert(sUTF16Base64); // "OCY5JjomOyY8Jj4mPyY=" + +/* Part 2: Decode `sUTF16Base64` to UTF-16 */ + +var sDecodedString = atobUTF16(sUTF16Base64); + +/* Show output */ + +alert(sDecodedString); // "☸☹☺☻☼☾☿" +</pre> + +<p>For a cleaner solution that uses typed arrays instead of binary strings, see solutions <a href="#Solution_1_–_JavaScript's_UTF-16_>_base64">#1</a> and <a href="#Solution_2_–_JavaScript's_UTF-16_>_UTF-8_>_base64">#2</a>.</p> + +<h3 id="Solution_4_–_escaping_the_string_before_encoding_it">Solution #4 – escaping the string before encoding it</h3> + +<pre class="brush:js notranslate">function b64EncodeUnicode(str) { + // first we use encodeURIComponent to get percent-encoded UTF-8, + // then we convert the percent encodings into raw bytes which + // can be fed into btoa. + return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g, + function toSolidBytes(match, p1) { + return String.fromCharCode('0x' + p1); + })); +} + +b64EncodeUnicode('✓ à la mode'); // "4pyTIMOgIGxhIG1vZGU=" +b64EncodeUnicode('\n'); // "Cg==" +</pre> + +<p>To decode the Base64-encoded value back into a String:</p> + +<pre class="brush: js notranslate">function b64DecodeUnicode(str) { + // Going backwards: from bytestream, to percent-encoding, to original string. + return decodeURIComponent(atob(str).split('').map(function(c) { + return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2); + }).join('')); +} + +b64DecodeUnicode('4pyTIMOgIGxhIG1vZGU='); // "✓ à la mode" +b64DecodeUnicode('Cg=='); // "\n" +</pre> + +<p><a href="https://git.daplie.com/Daplie/unibabel-js">Unibabel</a> implements common conversions using this strategy.</p> + +<h3 id="Solution_5_–_rewrite_the_DOMs_atob_and_btoa_using_JavaScripts_TypedArrays_and_UTF-8">Solution #5 – rewrite the DOMs <code>atob()</code> and <code>btoa()</code> using JavaScript's <code>TypedArray</code>s and UTF-8</h3> + +<p>Use a <a href="/en-US/docs/Web/API/TextEncoder">TextEncoder</a> polyfill such as <a href="https://github.com/inexorabletash/text-encoding">TextEncoding</a> (also includes legacy windows, mac, and ISO encodings), <a href="https://github.com/coolaj86/TextEncoderLite">TextEncoderLite</a>, combined with a <a href="https://github.com/feross/buffer">Buffer</a> and a Base64 implementation such as <a href="https://github.com/beatgammit/base64-js">base64-js</a> or <a href="https://github.com/waitingsong/base64">TypeScript version of </a>base64-js for both modern browsers and Node.js.</p> + +<p>When a native <code>TextEncoder</code> implementation is not available, the most light-weight solution would be to use <a href="#Solution_3_–_JavaScript's_UTF-16_>_binary_string_>_base64">Solution #3</a> because in addition to being much faster, <a href="#Solution_3_–_JavaScript's_UTF-16_>_binary_string_>_base64">Solution #3</a> also works in IE9 "out of the box." Alternatively, use <a href="https://github.com/coolaj86/TextEncoderLite">TextEncoderLite</a> with <a href="https://github.com/beatgammit/base64-js">base64-js</a>. Use the browser implementation when you can.</p> + +<p>The following function implements such a strategy. It assumes base64-js imported as <code><script type="text/javascript" src="base64js.min.js"/></code>. Note that TextEncoderLite only works with UTF-8.</p> + +<pre class="brush: js notranslate">function Base64Encode(str, encoding = 'utf-8') { + var bytes = new (typeof TextEncoder === "undefined" ? TextEncoderLite : TextEncoder)(encoding).encode(str); + return base64js.fromByteArray(bytes); +} + +function Base64Decode(str, encoding = 'utf-8') { + var bytes = base64js.toByteArray(str); + return new (typeof TextDecoder === "undefined" ? TextDecoderLite : TextDecoder)(encoding).decode(bytes); +} +</pre> + +<p><strong>注意</strong>: <a href="https://github.com/coolaj86/TextEncoderLite">TextEncoderLite</a> 不能正确处理四字节 UTF-8 字符, 比如 '\uD842\uDFB7' 或缩写为 '\u{20BB7}' 。参见 <a href="https://github.com/solderjs/TextEncoderLite/issues/16">issue </a><br> + 可使用 <a href="https://github.com/inexorabletash/text-encoding">text-encoding</a> 作为替代。</p> + +<p>某些场景下,以上经由 UTF-8 转换到 Base64 的实现在空间利用上不一定高效。当处理包含大量 U+0800-U+FFFF 区域间字符的文本时, UTF-8 输出结果长于 UTF-16 的,因为这些字符在 UTF-8 下占用三个字节而 UTF-16 是两个。在处理均匀分布 UTF-16 码点的 JavaScript 字符串时应考虑采用 UTF-16 替代 UTF-8 作为 Base64 结果的中间编码格式,这将减少 40% 尺寸。</p> + +<div class="standardNoteBlock"> +<p><strong>译者注</strong>:下为陈旧翻译片段</p> +</div> + +<ul> + <li>第一个是转义(escape)整个字符串然后编码这个它;</li> + <li>第二个是把UTF-16的 <a href="/en-US/docs/Web/API/DOMString" title="/en-US/docs/Web/API/DOMString"><code>DOMString</code></a> 转码为UTF-8的字符数组然后编码它。</li> +</ul> + +<h3 id="方案_1_–_编码之前转义escape字符串">方案 #1 – 编码之前转义(escape)字符串</h3> + +<pre class="brush:js notranslate">function b64EncodeUnicode(str) { + return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g, function(match, p1) { + return String.fromCharCode('0x' + p1); + })); +} + +b64EncodeUnicode('✓ à la mode'); // "4pyTIMOgIGxhIG1vZGU=" +</pre> + +<p>把base64转换回字符串</p> + +<pre class="notranslate"><code>function b64DecodeUnicode(str) { + return decodeURIComponent(atob(str).split('').map(function(c) { + return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2); + }).join('')); +} + +b64DecodeUnicode('4pyTIMOgIGxhIG1vZGU='); // "✓ à la mode" +b64DecodeUnicode('Cg=='); // "\n"</code></pre> + +<p><a href="https://github.com/coolaj86/unibabel-js">Unibabel</a> 是一个包含了一些使用这种策略的通用转换的库。</p> + +<h3 id="方案_6_–_用JavaScript的_TypedArray_和_UTF-8重写DOM的_atob_和_btoa">方案 #6 – 用JavaScript的 <code>TypedArray</code> 和 UTF-8重写DOM的 <code>atob()</code> 和 <code>btoa()</code></h3> + +<p>使用像<a href="https://github.com/inexorabletash/text-encoding">TextEncoding</a>(包含了早期(legacy)的windows,mac, 和 ISO 编码),<a href="https://github.com/coolaj86/TextEncoderLite/blob/master/index.js">TextEncoderLite</a> 或者 <a href="https://github.com/feross/buffer">Buffer</a> 这样的文本编码器增强(polyfill)和Base64增强,比如<a href="https://github.com/beatgammit/base64-js/blob/master/index.js">base64-js</a> 或 <a href="https://github.com/waitingsong/base64">TypeScript 版本的 </a>base64-js (适用于长青浏览器和 Node.js)。</p> + +<p>最简单,最轻量级的解决方法就是使用 <a href="https://github.com/coolaj86/TextEncoderLite/blob/master/index.js">TextEncoderLite</a> 和 <a href="https://github.com/beatgammit/base64-js/blob/master/index.js">base64-js</a>.</p> + +<p>想要更完整的库的话,参见 <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Typed_arrays/StringView" title="/en-US/docs/Web/JavaScript/Typed_arrays/StringView"><code>StringView</code> – a C-like representation of strings based on typed arrays</a>.</p> + +<h2 id="参见">参见</h2> + +<ul> + <li>{{domxref("WindowBase64.atob","atob()")}}</li> + <li>{{domxref("WindowBase64.btoa","btoa()")}}</li> + <li><a href="/en-US/docs/data_URIs" title="/en-US/docs/data_URIs"><code>data</code> URIs</a></li> + <li><a href="/en-US/docs/Web/JavaScript/Typed_arrays/ArrayBuffer">ArrayBuffer</a></li> + <li><a href="/en-US/docs/Web/JavaScript/Typed_arrays">TypedArrays</a></li> + <li><a href="/en-US/docs/Web/JavaScript/Typed_arrays/ArrayBufferView">ArrayBufferView</a></li> + <li><a href="/en-US/docs/Web/JavaScript/Typed_arrays/Uint8Array">Uint8Array</a></li> + <li><a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Typed_arrays/StringView" title="/en-US/docs/Web/JavaScript/Typed_arrays/StringView"><code>StringView</code> – a C-like representation of strings based on typed arrays</a></li> + <li><a href="/en-US/docs/Web/API/DOMString" title="/en-US/docs/Web/API/DOMString">DOMString</a></li> + <li><a href="/en-US/docs/URI" title="/en-US/docs/URI"><code>URI</code></a></li> + <li><a href="/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURI" title="/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURI"><code>encodeURI()</code></a></li> + <li><a href="/en-US/docs/XPCOM_Interface_Reference/nsIURIFixup" title="/en-US/docs/XPCOM_Interface_Reference/nsIURIFixup"><code>nsIURIFixup()</code></a></li> + <li><a href="https://en.wikipedia.org/wiki/Base64" title="https://en.wikipedia.org/wiki/Base64"><code>Base64 on Wikipedia</code></a></li> +</ul> |