{"id":490643,"date":"2023-10-18T21:07:32","date_gmt":"2023-10-18T21:07:32","guid":{"rendered":"https:\/\/oneproxy.pro\/uncategorized\/how-recruiters-use-scraping-to-find-professionals\/"},"modified":"2024-08-27T06:53:50","modified_gmt":"2024-08-27T06:53:50","slug":"how-recruiters-use-scraping-to-find-professionals","status":"publish","type":"post","link":"https:\/\/oneproxy.pro\/vn\/guides\/how-recruiters-use-scraping-to-find-professionals\/","title":{"rendered":"How Recruiters Use Scraping to Find Professionals"},"content":{"rendered":"<p>Recruiters are brought in when a vacancy is narrowly specialized and a suitable specialist is hard to find. They therefore have to be resourceful and use advanced technologies to search for information. One example is web scraping, also called parsing, which automatically collects information from websites. Let's look at how recruiters use it to find specialists and what benefits it brings.<\/p>\n\n\n\n<p>Parsing is a technique for collecting data from websites by extracting information from their HTML code. 
Recruiters use a variety of tools and programs for this purpose.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Applications of Parsing in Recruiting<\/h2>\n\n\n\n<p>This section looks in detail at how web scraping is used to source staff and select candidates.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Collecting Information About Specialists<\/h3>\n\n\n\n<p>Recruiters use scraping to automatically collect data about specialists from platforms such as <a href=\"https:\/\/career.habr.com\/companies\/oneproxy\" target=\"_blank\" data-type=\"link\" data-id=\"https:\/\/career.habr.com\/companies\/oneproxy\" rel=\"noreferrer noopener nofollow\">Habr<\/a>, <a href=\"https:\/\/vc.ru\/u\/3414104-oneproxy\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">VC.ru<\/a> or <a href=\"https:\/\/github.com\/oneproxypro\" rel=\"nofollow noopener\" target=\"_blank\">GitHub<\/a>. 
This gives them access to information about candidates' skills, work experience, education and even contact details.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Analyzing Resumes and Profiles<\/h3>\n\n\n\n<p>With web scraping, recruiters extract data on key skills and experience from candidates' resumes and profiles on specialized platforms. This helps automate the matching of job requirements against candidate profiles, which speeds up and improves the selection process.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Evaluating Candidates Based on Public Information<\/h3>\n\n\n\n<p>By analyzing professional achievements and projects available in public sources, recruiters can make better-informed choices about whom to invite for an interview. In addition, by analyzing candidates' public statements and interactions on social networks, recruiters can assess their approach to work and their fit with the corporate culture. 
This is a less obvious aspect of recruiting and is still underestimated. Rest assured, however, that leading corporations also take this information into account when selecting specialists.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Advantages of Using Parsing in Recruiting<\/h2>\n\n\n\n<p>Web scraping brings several benefits to recruiting. First, it makes the search more efficient, because an automated process handles many profiles faster than a manual one. Second, it saves recruiters time and resources, so they can focus on analyzing the data and making decisions.<\/p>\n\n\n\n<p>It is worth remembering that web scraping has limitations and risks. Some websites may restrict scraping of their data. This can be addressed by using a package of proxy servers that distributes requests across different IP addresses. 
In this way, the parser can successfully get around the ban.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Parsing Recommendations for Recruiting<\/h2>\n\n\n\n<p>To use parsing successfully in recruiting, follow these recommendations:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Optimize requests and avoid bans. As noted above, exceeding a server's request limit from a single IP leads to a ban. To prevent this, use multiple IP addresses. This minimizes the risk of being banned and makes parsing more efficient.<\/li>\n\n\n\n<li>Regularly update candidate profile information. Profiles can change, so it is important to refresh the database regularly. In addition, new profiles appear all the time. 
Parsing should therefore become a routine procedure.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">The Outlook for Parsing in Recruiting<\/h2>\n\n\n\n<p>With the development of artificial intelligence and machine learning, the role and benefits of parsing in recruiting will continue to grow. Algorithms are becoming increasingly accurate at identifying suitable candidates and predicting success in a given position.<\/p>\n\n\n\n<p>For successful parsing, it is important to have a pool of clean proxy servers that provides reliable, encrypted data transfer. Server proxies (a package of several) or mobile proxies will do. A mobile proxy contains a pool of several thousand IPs, which keeps the parsing process secure.<\/p>","protected":false},"excerpt":{"rendered":"<p>Recruiters are hired in cases where the vacancy is narrowly focused and it is difficult to find a specialist. Therefore, we have to be sophisticated in every possible way, using advanced technologies to search for information. For example, web scraping or parsing, which allows you to automatically collect information from sites. 
Let&#8217;s figure out how [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":490650,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"content-type":"","inline_featured_image":false,"footnotes":""},"categories":[33],"tags":[],"class_list":["post-490643","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-guides"],"acf":{"faq_title":"","faq_items":null},"_links":{"self":[{"href":"https:\/\/oneproxy.pro\/vn\/wp-json\/wp\/v2\/posts\/490643","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/oneproxy.pro\/vn\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/oneproxy.pro\/vn\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/oneproxy.pro\/vn\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/oneproxy.pro\/vn\/wp-json\/wp\/v2\/comments?post=490643"}],"version-history":[{"count":1,"href":"https:\/\/oneproxy.pro\/vn\/wp-json\/wp\/v2\/posts\/490643\/revisions"}],"predecessor-version":[{"id":505479,"href":"https:\/\/oneproxy.pro\/vn\/wp-json\/wp\/v2\/posts\/490643\/revisions\/505479"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/oneproxy.pro\/vn\/wp-json\/wp\/v2\/media\/490650"}],"wp:attachment":[{"href":"https:\/\/oneproxy.pro\/vn\/wp-json\/wp\/v2\/media?parent=490643"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/oneproxy.pro\/vn\/wp-json\/wp\/v2\/categories?post=490643"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/oneproxy.pro\/vn\/wp-json\/wp\/v2\/tags?post=490643"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}