{"id":491326,"date":"2023-11-07T05:06:42","date_gmt":"2023-11-07T05:06:42","guid":{"rendered":"https:\/\/oneproxy.pro\/?p=491326"},"modified":"2024-08-27T06:51:14","modified_gmt":"2024-08-27T06:51:14","slug":"proxy-rotation-with-python","status":"publish","type":"post","link":"https:\/\/oneproxy.pro\/my\/guides\/proxy-rotation-with-python\/","title":{"rendered":"Advanced Techniques for Proxy Rotation with Python"},"content":{"rendered":"\n<p>An efficient proxy rotation mechanism is essential for large-scale web scraping or data mining. A basic setup may suffice in the early stages of a scraping project or for small crawls, but the real challenge arises when scaling up. To mitigate risks such as IP blocking and to ensure the robustness of your scraping infrastructure, a more sophisticated proxy rotation system becomes imperative.<\/p>\n\n\n\n<p>For such purposes, a professional proxy service provider like OneProxy becomes invaluable. 
With a diverse pool of data center proxy servers, such services can vastly enhance the reliability and efficiency of your scraping tasks.<\/p>\n\n\n\n<p>Below, we develop a more advanced proxy rotator using Python and Beautiful Soup, drawing on OneProxy&#8217;s proxy list.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1792\" height=\"1024\" src=\"https:\/\/oneproxy.pro\/wp-content\/uploads\/2023\/11\/rotating-proxies-1.webp\" alt=\"Proxy Rotation With Python\" class=\"wp-image-491329\" title=\"\" srcset=\"https:\/\/oneproxy.pro\/wp-content\/uploads\/2023\/11\/rotating-proxies-1.webp 1792w, https:\/\/oneproxy.pro\/wp-content\/uploads\/2023\/11\/rotating-proxies-1-1280x731.webp 1280w, https:\/\/oneproxy.pro\/wp-content\/uploads\/2023\/11\/rotating-proxies-1-150x86.webp 150w, https:\/\/oneproxy.pro\/wp-content\/uploads\/2023\/11\/rotating-proxies-1-768x439.webp 768w, https:\/\/oneproxy.pro\/wp-content\/uploads\/2023\/11\/rotating-proxies-1-1536x878.webp 1536w, https:\/\/oneproxy.pro\/wp-content\/uploads\/2023\/11\/rotating-proxies-1-18x10.webp 18w\" sizes=\"auto, (max-width: 1792px) 100vw, 1792px\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Preliminary Setup<\/h2>\n\n\n\n<p>Before you begin, ensure that Beautiful Soup, the <code>requests<\/code> library, and the <code>lxml<\/code> parser (used by the parsing code in this guide) are installed in your Python environment. These tools let you parse HTML content and make HTTP requests.<\/p>\n\n\n\n<p>Our proxy rotation script will fetch public proxies from OneProxy&#8217;s free proxy pool, which can be accessed at <a href=\"https:\/\/oneproxy.pro\/free-proxy\/\">OneProxy Free Proxy List<\/a>. This list is updated regularly, offering a fresh set of proxies for various needs.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Basic Fetch Code<\/h2>\n\n\n\n<p>First, we need basic code to fetch the HTML content of OneProxy&#8217;s free proxy list. 
We send a browser-like user-agent string, which helps get past basic user-agent-based bot detection.<\/p>\n\n\n\n<div class=\"hcb_wrap\"><pre class=\"prism line-numbers lang-python\" data-lang=\"Python\"><code># -*- coding: utf-8 -*-\nfrom bs4 import BeautifulSoup\nimport requests\n\nurl = &#39;https:\/\/oneproxy.pro\/free-proxy\/&#39;\n\ndef fetch_proxies(url):\n    # Send a browser-like User-Agent so the request is not rejected as an obvious bot\n    header = {\n        &#39;User-Agent&#39;: &#39;Mozilla\/5.0 (Macintosh; Intel Mac OS X 10_11_2) &#39; +\n        &#39;AppleWebKit\/601.3.9 (KHTML, like Gecko) Version\/9.0.2 Safari\/601.3.9&#39;\n    }\n    response = requests.get(url, headers=header, timeout=10)\n    response.raise_for_status()  # Fail early on HTTP errors\n    return response.content\n<\/code><\/pre><\/div>\n\n\n\n<p>This function retrieves the HTML content from the provided URL and raises an exception if the server returns an HTTP error.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Parsing the Proxy List<\/h2>\n\n\n\n<p>The <code>BeautifulSoup<\/code> library will parse the HTML content to extract the proxies. The proxies are typically listed within a table structure on the web page, identified by specific HTML tags and attributes.<\/p>\n\n\n\n<div class=\"hcb_wrap\"><pre class=\"prism line-numbers lang-python\" data-lang=\"Python\"><code>def parse_proxies(html_content):\n    soup = BeautifulSoup(html_content, &#39;lxml&#39;)\n    proxy_table = soup.select_one(&#39;#proxy-list-table&#39;)  # Replace with the correct ID\n    proxies = []\n    if proxy_table is None:\n        return proxies  # Table not found; check the selector above\n    for row in proxy_table.select(&#39;tr&#39;):\n        columns = row.select(&#39;td&#39;)\n        if len(columns) &gt;= 2:  # Skip header rows and malformed rows\n            ip = columns[0].get_text(strip=True)\n            port = columns[1].get_text(strip=True)\n            proxies.append({&#39;ip&#39;: ip, &#39;port&#39;: port})\n    return proxies<\/code><\/pre><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Rotating Proxies<\/h2>\n\n\n\n<p>The following function rotates proxies by randomly selecting one from the fetched list:<\/p>\n\n\n\n<div class=\"hcb_wrap\"><pre class=\"prism line-numbers lang-python\" data-lang=\"Python\"><code>from random import choice\n\ndef 
rotate_proxies(proxies):\n    # Return a random proxy, or None if the list is empty\n    if proxies:\n        return choice(proxies)\n    return None<\/code><\/pre><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Putting It All Together<\/h2>\n\n\n\n<p>The final script combines proxy fetching, parsing, and rotation into a single proxy rotation system.<\/p>\n\n\n\n<div class=\"hcb_wrap\"><pre class=\"prism line-numbers lang-python\" data-lang=\"Python\"><code># -*- coding: utf-8 -*-\nimport requests\nfrom bs4 import BeautifulSoup\nfrom random import choice\n\n# Functions previously defined: fetch_proxies, parse_proxies, rotate_proxies\n\nproxies = []  # This will hold our list of proxies\n\ndef refresh_proxies():\n    global proxies\n    proxies = parse_proxies(fetch_proxies(&#39;https:\/\/oneproxy.pro\/free-proxy\/&#39;))\n\ndef get_random_proxy():\n    if not proxies:\n        refresh_proxies()\n    return rotate_proxies(proxies)\n\n# Main execution\nrefresh_proxies()\nproxy = get_random_proxy()\nif proxy:\n    print(proxy[&#39;ip&#39;], proxy[&#39;port&#39;])\nelse:\n    print(&#39;No proxies available&#39;)<\/code><\/pre><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Professional Scaling with OneProxy<\/h2>\n\n\n\n<p>In production environments that handle thousands of requests, free proxy pools may not be reliable or fast enough. 
At that point, a <a href=\"https:\/\/oneproxy.pro\/services\/rotating-proxies\/\" data-type=\"link\" data-id=\"https:\/\/oneproxy.pro\/services\/rotating-proxies\/\">rotating proxy service<\/a> becomes essential.<\/p>\n\n\n\n<p>OneProxy offers a robust solution with features such as:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Global High-Speed Proxies<\/strong>: Millions of data center proxies worldwide ensure uninterrupted and rapid connections.<\/li>\n\n\n\n<li><strong>Automatic IP Rotation<\/strong>: IP addresses are rotated seamlessly to prevent detection and bans.<\/li>\n\n\n\n<li><strong>User-Agent String Rotation<\/strong>: Mimics requests from various web browsers and versions, making bots harder to detect.<\/li>\n\n\n\n<li><strong>CAPTCHA Solving<\/strong>: Integrates technology to solve CAPTCHAs automatically, thereby streamlining the scraping process.<\/li>\n<\/ul>\n\n\n\n<p>With OneProxy, customers have overcome the challenges of IP blocking and streamlined their web data extraction processes.<\/p>\n\n\n\n<p>OneProxy&#8217;s services are versatile and can be used from any programming language, catering to a wide array of projects and requirements.<\/p>\n\n\n\n<p><strong>Special Offer<\/strong>: Experience the power of professional proxy rotation with OneProxy. Get started with 50,000 requests at no cost.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><a href=\"https:\/\/oneproxy.pro\/services\/rotating-proxies\/\" data-type=\"link\" data-id=\"https:\/\/oneproxy.pro\/services\/rotating-proxies\/\">Buy Rotating Proxies<\/a><\/h3>\n","protected":false},"excerpt":{"rendered":"<p>An efficient proxy rotation mechanism is essential for large-scale web scraping or data mining. A basic setup may suffice in the early stages of a scraping project or for small crawls, but the real challenge arises when scaling up. 
To mitigate risks such as IP blocking and to ensure the robustness of [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":491327,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"content-type":"","inline_featured_image":false,"footnotes":""},"categories":[33],"tags":[],"class_list":["post-491326","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-guides"],"acf":{"faq_title":"","faq_items":null},"_links":{"self":[{"href":"https:\/\/oneproxy.pro\/my\/wp-json\/wp\/v2\/posts\/491326","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/oneproxy.pro\/my\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/oneproxy.pro\/my\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/oneproxy.pro\/my\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/oneproxy.pro\/my\/wp-json\/wp\/v2\/comments?post=491326"}],"version-history":[{"count":1,"href":"https:\/\/oneproxy.pro\/my\/wp-json\/wp\/v2\/posts\/491326\/revisions"}],"predecessor-version":[{"id":505840,"href":"https:\/\/oneproxy.pro\/my\/wp-json\/wp\/v2\/posts\/491326\/revisions\/505840"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/oneproxy.pro\/my\/wp-json\/wp\/v2\/media\/491327"}],"wp:attachment":[{"href":"https:\/\/oneproxy.pro\/my\/wp-json\/wp\/v2\/media?parent=491326"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/oneproxy.pro\/my\/wp-json\/wp\/v2\/categories?post=491326"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/oneproxy.pro\/my\/wp-json\/wp\/v2\/tags?post=491326"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}