{"id":478342,"date":"2023-08-09T09:31:27","date_gmt":"2023-08-09T09:31:27","guid":{"rendered":""},"modified":"2023-09-05T11:16:35","modified_gmt":"2023-09-05T11:16:35","slug":"parquet","status":"publish","type":"wiki","link":"https:\/\/oneproxy.pro\/cn\/wiki\/parquet\/","title":{"rendered":"Parquet"},"content":{"rendered":"<p>Parquet is a columnar storage file format designed to store and process large volumes of data efficiently. It was developed as an open-source project by Cloudera and Twitter in 2013. Parquet's main goal is to optimize data storage and processing for big data analytics, which makes it a good fit for data warehouses, data lakes, and use cases in the Apache Hadoop ecosystem.<\/p>\n<h2>The Origin of Parquet and Its First Mention<\/h2>\n<p>Parquet grew out of the need to store and process big data efficiently. As big data technologies emerged, traditional storage formats struggled with large datasets. Parquet was developed to address these problems by introducing a columnar storage approach.<\/p>\n<p>Parquet was first announced publicly in 2013 by engineers at Twitter and Cloudera. The announcement introduced the Parquet 
format and highlighted its advantages, such as better compression, improved query performance, and support for complex data types.<\/p>\n<h2>Detailed Information About Parquet: Expanding the Topic<\/h2>\n<p>Parquet uses a columnar storage approach: data is stored and organized by column rather than by row. This design enables a range of performance optimizations and particularly benefits analytical workloads. Some of Parquet's main features are:<\/p>\n<ol>\n<li>\n<p><strong>Columnar storage:<\/strong> Parquet stores each column separately, which yields better compression and lets a query read only the columns it needs.<\/p>\n<\/li>\n<li>\n<p><strong>Compression:<\/strong> Parquet supports several compression codecs, such as Snappy, Gzip, and Zstandard, to reduce storage space and improve read performance.<\/p>\n<\/li>\n<li>\n<p><strong>Data type support:<\/strong> It supports a wide range of data types, including primitive types (such as integers, strings, and booleans) and complex types (such as arrays, maps, and structs).<\/p>\n<\/li>\n<li>\n<p><strong>Schema evolution:<\/strong> Parquet 
supports schema evolution, allowing users to add, remove, or modify columns over time without breaking compatibility with existing data.<\/p>\n<\/li>\n<li>\n<p><strong>Predicate pushdown:<\/strong> Query predicates are pushed down to the storage layer, which reduces the amount of data that has to be read during query execution.<\/p>\n<\/li>\n<li>\n<p><strong>Parallel processing:<\/strong> A Parquet file can be split into smaller row groups, which enables parallel processing in distributed environments such as Hadoop.<\/p>\n<\/li>\n<li>\n<p><strong>Cross-platform compatibility:<\/strong> Parquet is designed to be platform independent, enabling seamless data exchange between different systems.<\/p>\n<\/li>\n<\/ol>\n<h2>Parquet's Internal Structure: How Parquet Works<\/h2>\n<p>A Parquet file is made up of several components that together enable efficient storage and processing:<\/p>\n<ol>\n<li>\n<p><strong>File metadata:<\/strong> Stored in the file footer, it holds information about the file's schema, the compression codecs used, and other properties.<\/p>\n<\/li>\n<li>\n<p><strong>Row groups:<\/strong> Each Parquet file is divided into row groups, and each row group is further divided into column chunks. Row groups facilitate parallel processing and data compression.<\/p>\n<\/li>\n<li>\n<p><strong>Column metadata:<\/strong> 
For each column, Parquet stores metadata such as the data type, the compression codec, and encoding information.<\/p>\n<\/li>\n<li>\n<p><strong>Data pages:<\/strong> Data pages hold the actual column data and are compressed individually to maximize storage efficiency.<\/p>\n<\/li>\n<li>\n<p><strong>Dictionary pages (optional):<\/strong> For columns with repeated values, Parquet uses dictionary encoding: the unique values are stored once and referenced from the data pages.<\/p>\n<\/li>\n<li>\n<p><strong>Statistics:<\/strong> Parquet can also store per-column statistics, such as minimum and maximum values, which query engines can use for optimization.<\/p>\n<\/li>\n<\/ol>\n<h2>Analysis of Parquet's Key Features<\/h2>\n<p>Parquet's key features have led to its wide adoption in big data processing. Let us look at some of them:<\/p>\n<ol>\n<li>\n<p><strong>Efficient compression:<\/strong> Parquet's columnar storage and compression techniques reduce file size, which lowers storage costs and speeds up data transfer.<\/p>\n<\/li>\n<li>\n<p><strong>Performance optimization:<\/strong> By reading only the necessary columns during a query, Parquet minimizes I\/O 
operations, which speeds up query processing.<\/p>\n<\/li>\n<li>\n<p><strong>Schema flexibility:<\/strong> Support for schema evolution allows agile changes to the data schema without harming existing data.<\/p>\n<\/li>\n<li>\n<p><strong>Cross-language support:<\/strong> Parquet files can be used from many programming languages, including Java, Python, and C++, which makes Parquet a versatile format for a wide range of data processing workflows.<\/p>\n<\/li>\n<li>\n<p><strong>Rich data types:<\/strong> Broad support for different data types covers a wide range of use cases and accommodates the complex data structures common in big data analytics.<\/p>\n<\/li>\n<li>\n<p><strong>Interoperability:<\/strong> As an open-source project with a well-defined specification, Parquet promotes interoperability between different tools and systems.<\/p>\n<\/li>\n<\/ol>\n<h2>Parquet Versions and Their Characteristics<\/h2>\n<p>The Parquet format has two major versions: <strong>Parquet-1.0<\/strong> and <strong>Parquet-2.0<\/strong>. Apache Arrow is a separate project, an in-memory columnar format whose libraries can read and write Parquet files; 
it is not a Parquet version. The two format versions share the same core concepts and advantages but differ in compatibility and feature set. The following table compares them:<\/p>\n<table>\n<thead>\n<tr>\n<th>Feature<\/th>\n<th>Parquet-1.0<\/th>\n<th>Parquet-2.0<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Schema evolution<\/td>\n<td>Supported<\/td>\n<td>Supported<\/td>\n<\/tr>\n<tr>\n<td>Columnar compression<\/td>\n<td>Supported (Gzip, Snappy, etc.)<\/td>\n<td>Supported (Gzip, Snappy, and codecs added in later format releases, such as LZ4 and Zstd)<\/td>\n<\/tr>\n<tr>\n<td>Dictionary encoding<\/td>\n<td>Supported<\/td>\n<td>Supported<\/td>\n<\/tr>\n<tr>\n<td>Nested data support<\/td>\n<td>Supported (arrays, maps, structs)<\/td>\n<td>Supported (arrays, maps, structs)<\/td>\n<\/tr>\n<tr>\n<td>Compatibility<\/td>\n<td>Readable by most tools<\/td>\n<td>Newer encodings and page formats may not be readable by older tools<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2>Using Parquet: Methods, Problems, and Solutions<\/h2>\n<h3>Ways to Use Parquet<\/h3>\n<p>Parquet can be applied in a variety of data-intensive scenarios, for example:<\/p>\n<ol>\n<li>\n<p><strong>Data warehousing:<\/strong> Parquet is commonly used in data warehouses because of its fast query performance and efficient storage.<\/p>\n<\/li>\n<li>\n<p><strong>Big data processing:<\/strong> 
In big data frameworks such as Hadoop, Parquet files are a preferred choice because they can be processed in parallel.<\/p>\n<\/li>\n<li>\n<p><strong>Data lakes:<\/strong> Parquet is a popular format for storing diverse data types in data lakes, which makes it easier to analyze the data and extract insights.<\/p>\n<\/li>\n<li>\n<p><strong>Streaming data:<\/strong> Because it supports schema evolution, Parquet is suitable for handling evolving data streams.<\/p>\n<\/li>\n<\/ol>\n<h3>Problems and Solutions<\/h3>\n<ol>\n<li>\n<p><strong>Compatibility issues:<\/strong> Some older tools may have limited support for Parquet-2.0. The solution is to write Parquet-1.0 files or to upgrade the tools so that they support the newer version.<\/p>\n<\/li>\n<li>\n<p><strong>Schema design complexity:<\/strong> Designing a flexible schema requires careful thought. Using a unified schema across data sources can simplify data integration.<\/p>\n<\/li>\n<li>\n<p><strong>Data quality issues:<\/strong> Incorrect data types or schema changes can cause data quality problems. Data validation and disciplined schema evolution practices can mitigate them.<\/p>\n<\/li>\n<li>\n<p><strong>Cold start overhead:<\/strong> Because the metadata has to be parsed first, reading a Parquet 
file's first few rows can be slow. Pre-caching or using an optimized file layout can reduce this overhead.<\/p>\n<\/li>\n<\/ol>\n<h2>Main Characteristics and Other Comparisons<\/h2>\n<table>\n<thead>\n<tr>\n<th>Feature<\/th>\n<th>Description<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Storage format<\/td>\n<td>Columnar<\/td>\n<\/tr>\n<tr>\n<td>Compression options<\/td>\n<td>Gzip, Snappy, LZ4, Zstandard<\/td>\n<\/tr>\n<tr>\n<td>Platform independence<\/td>\n<td>Yes<\/td>\n<\/tr>\n<tr>\n<td>Data type support<\/td>\n<td>Broad support for primitive and complex data types<\/td>\n<\/tr>\n<tr>\n<td>Schema evolution<\/td>\n<td>Supported<\/td>\n<\/tr>\n<tr>\n<td>Predicate pushdown<\/td>\n<td>Supported<\/td>\n<\/tr>\n<tr>\n<td>Parallel processing<\/td>\n<td>Enabled through row groups<\/td>\n<\/tr>\n<tr>\n<td>Interoperability<\/td>\n<td>Works with a variety of big data frameworks, such as Apache Hadoop, Apache Spark, and Apache Drill<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2>Future Prospects and Technologies Related to Parquet<\/h2>\n<p>Parquet's outlook is bright, with ongoing work to improve its capabilities and integrations. Key areas of development and adoption include:<\/p>\n<ol>\n<li>\n<p><strong>Optimized query engines:<\/strong> Continued progress in query engines and libraries such as Apache Arrow, Apache Drill, and Presto will further improve Parquet 
query performance.<\/p>\n<\/li>\n<li>\n<p><strong>Streaming support:<\/strong> Parquet is expected to play a significant role in real-time data streaming and analytics, alongside technologies such as Apache Kafka and Apache Flink.<\/p>\n<\/li>\n<li>\n<p><strong>Cloud data lakes:<\/strong> The rise of cloud data lakes, driven by platforms such as Amazon S3 and Azure Data Lake Storage, will drive Parquet adoption because of its cost efficiency and scalable performance.<\/p>\n<\/li>\n<li>\n<p><strong>AI and machine learning integration:<\/strong> Because Parquet stores large datasets efficiently, it will remain an integral part of data preparation and training pipelines in machine learning and AI projects.<\/p>\n<\/li>\n<\/ol>\n<h2>How Proxy Servers Can Be Used or Associated with Parquet<\/h2>\n<p>Proxy servers can benefit from Parquet in several ways:<\/p>\n<ol>\n<li>\n<p><strong>Caching and data compression:<\/strong> A proxy server can use Parquet to cache frequently accessed data efficiently, reducing response times for subsequent requests.<\/p>\n<\/li>\n<li>\n<p><strong>Log processing and analysis:<\/strong> Proxy server logs collected in Parquet 
format can be analyzed with big data processing tools, providing valuable insights for network optimization and security.<\/p>\n<\/li>\n<li>\n<p><strong>Data exchange and integration:<\/strong> Proxy servers that handle data from many sources can convert and store that data in Parquet format, enabling seamless integration with big data platforms and analytics systems.<\/p>\n<\/li>\n<li>\n<p><strong>Resource optimization:<\/strong> By taking advantage of Parquet's columnar storage and predicate pushdown, proxy servers can optimize resource usage and improve overall performance.<\/p>\n<\/li>\n<\/ol>\n<h2>Related Links<\/h2>\n<p>For more information about Parquet, see the following resources:<\/p>\n<ol>\n<li><a href=\"https:\/\/parquet.apache.org\/\" target=\"_new\" rel=\"noopener nofollow\">Apache Parquet Official Website<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/apache\/parquet-format\" target=\"_new\" rel=\"noopener nofollow\">Parquet Format Specification<\/a><\/li>\n<li><a href=\"https:\/\/blog.cloudera.com\/parquet\/\" target=\"_new\" rel=\"noopener nofollow\">Cloudera Engineering Blog on Parquet<\/a><\/li>\n<li><a href=\"https:\/\/arrow.apache.org\/\" target=\"_new\" rel=\"noopener nofollow\">Apache Arrow Official Website<\/a> (for Arrow's Parquet-related 
information)<\/li>\n<\/ol>","protected":false,"featured_media":0,"menu_order":0,"template":"","meta":{"_acf_changed":false,"content-type":"","inline_featured_image":false,"footnotes":""},"class_list":["post-478342","wiki","type-wiki","status-publish","hentry"],"acf":{"faq_title":"Frequently Asked Questions about <mark>Parquet: A Comprehensive Guide<\/mark>","faq_items":[{"question":"What is Parquet?","answer":"<p>Parquet is a columnar storage file format designed for efficient storage and processing of large datasets. It is particularly well-suited for big data analytics, data warehousing, and Apache Hadoop environments.<\/p>"},{"question":"How did Parquet originate, and when was it first mentioned?","answer":"<p>Parquet was developed as an open-source project by Cloudera and Twitter in 2013. It was first announced publicly by engineers at the two companies in the same year.<\/p>"},{"question":"What are the key features of Parquet?","answer":"<p>Parquet offers several key features, including columnar storage, efficient compression techniques, support for various data types (primitive and complex), schema evolution, predicate pushdown, and parallel processing.<\/p>"},{"question":"How does Parquet work internally?","answer":"<p>Internally, Parquet files consist of file metadata, row groups, column metadata, data pages, and optional dictionary pages. This design allows for optimized storage, fast query processing, and support for various data types.<\/p>"},{"question":"What are the different types of Parquet versions, and how do they differ?","answer":"<p>The Parquet format has two major versions: Parquet-1.0 and Parquet-2.0. 
While both versions share core concepts, Parquet-2.0 adds newer features, such as additional encodings, that some older tools may not fully support.<\/p>"},{"question":"In what ways can Parquet be used, and what problems does it solve?","answer":"<p>Parquet finds applications in data warehousing, big data processing, data lakes, and handling streaming data. It solves challenges related to efficient storage, fast query performance, schema evolution, and cross-platform compatibility.<\/p>"},{"question":"What are the main characteristics of Parquet compared to other storage formats?","answer":"<p>Compared to other formats, Parquet stands out for its columnar storage, efficient compression options, extensive data type support, schema evolution capabilities, and the ability to enable predicate pushdown for query optimization.<\/p>"},{"question":"What are the future prospects and technologies related to Parquet?","answer":"<p>The future of Parquet is promising, with ongoing improvements in query engines, support for real-time data streaming, and its growing role in cloud data lakes and AI\/ML integration.<\/p>"},{"question":"How can proxy servers benefit from Parquet?","answer":"<p>Proxy servers can utilize Parquet for caching, data compression, log processing, and seamless data integration. Parquet's resource optimization features can improve overall proxy server performance.<\/p>"},{"question":"Where can I find more information about Parquet?","answer":"<p>For more information about Parquet, you can visit the <a href=\"https:\/\/parquet.apache.org\/\" target=\"_new\">Apache Parquet Official Website<\/a> or refer to the Parquet Format Specification on <a href=\"https:\/\/github.com\/apache\/parquet-format\" target=\"_new\">GitHub<\/a>. Additionally, you can explore Cloudera's Engineering Blog for insightful articles on Parquet. 
For information on Parquet support in the Apache Arrow libraries, you can visit the <a href=\"https:\/\/arrow.apache.org\/\" target=\"_new\">Apache Arrow Official Website<\/a>.<\/p>"}]},"_links":{"self":[{"href":"https:\/\/oneproxy.pro\/cn\/wp-json\/wp\/v2\/wiki\/478342","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/oneproxy.pro\/cn\/wp-json\/wp\/v2\/wiki"}],"about":[{"href":"https:\/\/oneproxy.pro\/cn\/wp-json\/wp\/v2\/types\/wiki"}],"version-history":[{"count":0,"href":"https:\/\/oneproxy.pro\/cn\/wp-json\/wp\/v2\/wiki\/478342\/revisions"}],"wp:attachment":[{"href":"https:\/\/oneproxy.pro\/cn\/wp-json\/wp\/v2\/media?parent=478342"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}