{"id":476002,"date":"2023-08-09T07:25:33","date_gmt":"2023-08-09T07:25:33","guid":{"rendered":""},"modified":"2023-09-05T11:11:49","modified_gmt":"2023-09-05T11:11:49","slug":"bert","status":"publish","type":"wiki","link":"https:\/\/oneproxy.pro\/cn\/wiki\/bert\/","title":{"rendered":"BERT"},"content":{"rendered":"<p>BERT, short for Bidirectional Encoder Representations from Transformers, is a groundbreaking approach in natural language processing (NLP) that uses Transformer models to understand language in ways earlier techniques could not.<\/p>\n<h2>Origin and History of BERT<\/h2>\n<p>BERT was introduced by researchers at Google AI Language in 2018. It was created to overcome the limitations of earlier language representation models. BERT was first described in the paper \u201cBERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,\u201d published on arXiv.<\/p>\n<h2>Understanding BERT<\/h2>\n<p>BERT is a method for pre-training language representations: a general-purpose \u201clanguage understanding\u201d model is trained on a large amount of text data and then fine-tuned for specific tasks. BERT transformed the NLP field because it is designed to model and understand the complexity of language more accurately.<\/p>\n<p>BERT's key innovation is its bidirectional training of 
Transformers. Unlike earlier models, which processed text in a single direction (left to right or right to left), BERT reads the entire sequence of words at once. This lets the model learn the context of a word from all of its surroundings, both to its left and to its right.<\/p>\n<h2>Internal Structure and Functioning of BERT<\/h2>\n<p>BERT uses an architecture called the Transformer. A Transformer consists of an encoder and a decoder, but BERT uses only the encoder. Each Transformer encoder has two parts:<\/p>\n<ol>\n<li>Self-attention mechanism: determines which words in a sentence are related to one another. It does this by scoring the relevance of each word to every other word and using those scores to weight how much the words influence each other.<\/li>\n<li>Feed-forward neural network: after the attention mechanism, the word representations are passed to a feed-forward neural network.<\/li>\n<\/ol>\n<p>Information in BERT flows in both directions, so the model can see the words before and after the current word, which gives it a more accurate understanding of context.<\/p>\n<h2>Key Features of BERT<\/h2>\n<ol>\n<li>\n<p><strong>Bidirectionality<\/strong>: Unlike earlier models, 
BERT considers the full context of a word by looking at the words that come before and after it.<\/p>\n<\/li>\n<li>\n<p><strong>Transformers<\/strong>: BERT uses the Transformer architecture, which lets it process long sequences of words more effectively and efficiently.<\/p>\n<\/li>\n<li>\n<p><strong>Pre-training and fine-tuning<\/strong>: BERT is pre-trained on a large corpus of unlabeled text and then fine-tuned for specific tasks.<\/p>\n<\/li>\n<\/ol>\n<h2>Types of BERT<\/h2>\n<p>BERT comes in two sizes:<\/p>\n<ol>\n<li><strong>BERT-Base<\/strong>: 12 layers (Transformer blocks), 12 attention heads, and 110 million parameters.<\/li>\n<li><strong>BERT-Large<\/strong>: 24 layers (Transformer blocks), 16 attention heads, and 340 million parameters.<\/li>\n<\/ol>\n<table>\n<thead>\n<tr>\n<th><\/th>\n<th>BERT-Base<\/th>\n<th>BERT-Large<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Layers (Transformer blocks)<\/td>\n<td>12<\/td>\n<td>24<\/td>\n<\/tr>\n<tr>\n<td>Attention heads<\/td>\n<td>12<\/td>\n<td>16<\/td>\n<\/tr>\n<tr>\n<td>Parameters<\/td>\n<td>110 million<\/td>\n<td>340 million<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2>Uses of BERT, Challenges, and Solutions<\/h2>\n<p>BERT is widely used in many NLP tasks, such as question answering, sentence classification, and entity recognition.<\/p>\n<p>Challenges with BERT include:<\/p>\n<ol>\n<li>\n<p><strong>Computational resources<\/strong>: 
Because of its large number of parameters and deep architecture, BERT requires substantial computational resources to train.<\/p>\n<\/li>\n<li>\n<p><strong>Lack of transparency<\/strong>: Like many deep learning models, BERT can behave as a \u201cblack box,\u201d making it hard to understand how it arrives at a particular decision.<\/p>\n<\/li>\n<\/ol>\n<p>Solutions to these problems include:<\/p>\n<ol>\n<li>\n<p><strong>Using pre-trained models<\/strong>: Instead of training from scratch, you can take a pre-trained BERT model and fine-tune it for a specific task, which requires fewer computational resources.<\/p>\n<\/li>\n<li>\n<p><strong>Interpretability tools<\/strong>: Tools such as LIME and SHAP can help make a BERT model's decisions easier to interpret.<\/p>\n<\/li>\n<\/ol>\n<h2>BERT and Similar Technologies<\/h2>\n<table>\n<thead>\n<tr>\n<th><\/th>\n<th>BERT<\/th>\n<th>Long Short-Term Memory (LSTM)<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Direction<\/td>\n<td>Bidirectional<\/td>\n<td>Unidirectional<\/td>\n<\/tr>\n<tr>\n<td>Architecture<\/td>\n<td>Transformer<\/td>\n<td>Recurrent<\/td>\n<\/tr>\n<tr>\n<td>Contextual understanding<\/td>\n<td>Better<\/td>\n<td>Limited<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2>Future Perspectives and Technologies Related to BERT<\/h2>\n<p>BERT continues to inspire new models in NLP. 
DistilBERT, a smaller, faster, and lighter version of BERT, and RoBERTa, a variant of BERT that drops the next-sentence prediction pre-training objective, are typical examples of recent progress.<\/p>\n<p>Future research on BERT is likely to focus on making the model more efficient, more interpretable, and better at handling longer sequences.<\/p>\n<h2>BERT and Proxy Servers<\/h2>\n<p>BERT is largely unrelated to proxy servers: BERT is an NLP model, whereas proxy servers are networking tools. However, when downloading pre-trained BERT models or using them through an API, a reliable, fast, and secure proxy server such as OneProxy can help ensure stable and secure data transfer.<\/p>\n<h2>Related Links<\/h2>\n<ol>\n<li>\n<p><a href=\"https:\/\/arxiv.org\/abs\/1810.04805\" target=\"_new\" rel=\"noopener nofollow\">BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding<\/a><\/p>\n<\/li>\n<li>\n<p><a href=\"https:\/\/ai.googleblog.com\/2018\/11\/open-sourcing-bert-state-of-art-pre.html\" target=\"_new\" rel=\"noopener nofollow\">Google AI Blog: Open Sourcing BERT<\/a><\/p>\n<\/li>\n<li>\n<p><a href=\"https:\/\/towardsdatascience.com\/bert-explained-state-of-the-art-language-model-for-nlp-f8b21a9b6270\" target=\"_new\" rel=\"noopener nofollow\">BERT 
Explained: A Complete Guide with Theory and Tutorial<\/a><\/p>\n<\/li>\n<\/ol>","protected":false},"featured_media":467710,"menu_order":0,"template":"","meta":{"_acf_changed":false,"content-type":"","inline_featured_image":false,"footnotes":""},"class_list":["post-476002","wiki","type-wiki","status-publish","has-post-thumbnail","hentry"],"acf":{"faq_title":"Frequently Asked Questions about <mark>Bidirectional Encoder Representations from Transformers (BERT)<\/mark>","faq_items":[{"question":"What is BERT?","answer":"<p>BERT, or Bidirectional Encoder Representations from Transformers, is a cutting-edge method in the field of natural language processing (NLP) that leverages Transformer models to understand language in a way that surpasses earlier technologies.<\/p>"},{"question":"Who introduced BERT and when?","answer":"<p>BERT was introduced by researchers at Google AI Language in 2018. The paper titled \"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,\" published on arXiv, was the first to mention BERT.<\/p>"},{"question":"What is the key innovation of BERT?","answer":"<p>The key innovation of BERT is its bidirectional training of Transformers. This is a departure from previous models that processed text data in one direction only. BERT reads the entire sequence of words at once, learning the context of a word based on all its surroundings.<\/p>"},{"question":"How does BERT work internally?","answer":"<p>BERT uses an architecture known as Transformer, specifically its encoder part. Each Transformer encoder comprises a self-attention mechanism, which determines the relevance of words to each other, and a feed-forward neural network, which the words pass through after the attention mechanism. 
BERT's bidirectional information flow gives it a richer contextual understanding of language.<\/p>"},{"question":"What are the main types of BERT?","answer":"<p>BERT primarily comes in two sizes: BERT-Base and BERT-Large. BERT-Base has 12 layers, 12 attention heads, and 110 million parameters. BERT-Large, on the other hand, has 24 layers, 16 attention heads, and 340 million parameters.<\/p>"},{"question":"What challenges might one face when using BERT?","answer":"<p>BERT requires substantial computational resources for training due to its large number of parameters and deep architecture. Furthermore, like many deep learning models, BERT can be a \"black box,\" making it challenging to understand how it makes a particular decision.<\/p>"},{"question":"How do BERT and proxy servers relate?","answer":"<p>While BERT and proxy servers operate in different spheres (NLP and networking, respectively), a proxy server can be crucial when downloading pre-trained BERT models or using them via APIs. A reliable proxy server like OneProxy ensures secure and stable data transmission.<\/p>"},{"question":"What are the future prospects related to BERT?","answer":"<p>BERT continues to inspire new models in NLP like DistilBERT and RoBERTa. 
Future research on BERT may focus on making the model more efficient, more interpretable, and better at handling longer sequences.<\/p>"}]},"_links":{"self":[{"href":"https:\/\/oneproxy.pro\/cn\/wp-json\/wp\/v2\/wiki\/476002","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/oneproxy.pro\/cn\/wp-json\/wp\/v2\/wiki"}],"about":[{"href":"https:\/\/oneproxy.pro\/cn\/wp-json\/wp\/v2\/types\/wiki"}],"version-history":[{"count":0,"href":"https:\/\/oneproxy.pro\/cn\/wp-json\/wp\/v2\/wiki\/476002\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/oneproxy.pro\/cn\/wp-json\/wp\/v2\/media\/467710"}],"wp:attachment":[{"href":"https:\/\/oneproxy.pro\/cn\/wp-json\/wp\/v2\/media?parent=476002"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}