Pured

[Pured Python TUT] How to code a proxy scraper and scrape bulk proxies!


Today I will be showing you guys how to code a proxy scraper in Python. It scrapes proxies from a proxy site and combines them into one list; the same steps can be repeated for other proxy sites. Enjoy these Python tutorials, I will be making more.

Leave a like  :hype:

[hide]

First, we need to import the modules.

import requests
from bs4 import BeautifulSoup
import os

 

proxies = []

We need requests to send the HTTP request to the proxy site, bs4 (BeautifulSoup) to parse the returned HTML and pull the proxies out of it, and os to open the .txt file once it has been written (writing the file itself only needs the built-in open()). We also create an empty proxies list to hold the results.

 

Second, we need to set up the request and the parsing of the page.

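# A timeout keeps the script from hanging if the site does not respond.
page = requests.get('https://free-proxy-list.net/', timeout=10)
soup = BeautifulSoup(page.text, 'html.parser')
# Pick out only the proxy table, then its body rows.
table = soup.find('table', attrs={'id':'proxylisttable'})
body = table.find('tbody')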

We set the page to a site that lists proxies and send a request to it. Next we set up BeautifulSoup. It has to be set up correctly so that it scrapes only the proxy table and not any other HTML on the page.
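If soup.find returns None (for example, because the site changed its markup), calling table.find on it raises an AttributeError. Here is a minimal sketch of a more defensive lookup. It assumes that when the id is missing, the proxy table is the first table on the page. find_proxy_rows is a hypothetical helper name, not part of bs4.

```python
from bs4 import BeautifulSoup

def find_proxy_rows(html, table_id='proxylisttable'):
    """Return the <tr> rows of the proxy table, or [] if no table is found."""
    soup = BeautifulSoup(html, 'html.parser')
    # Try the id used in this tutorial first.
    table = soup.find('table', attrs={'id': table_id})
    if table is None:
        # Assumption: fall back to the first table on the page.
        table = soup.find('table')
    if table is None:
        return []
    # Some pages omit <tbody>; search the table itself in that case.
    body = table.find('tbody') or table
    return body.find_all('tr')
```

You would call it as find_proxy_rows(page.text) and loop over the result the same way as over body.find_all('tr').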

 

Third, we now actually scrape the proxies.

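for row in body.find_all('tr'):
   cols = row.find_all('td')[:7]
   # Skip rows that do not have all seven columns we read from.
   if len(cols) < 7:
       continue
   proxies.append({
       'ip': cols[0].text,
       'port': cols[1].text,
       'iso': cols[2].text,
       'country': cols[3].text,
       # Column 6 says whether the proxy supports https.
       'protocol': 'https' if cols[6].text == 'yes' else 'http',
       'alive': True})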

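From the page's HTML we scrape the table of proxies row by row. For each row we store the IP, port, ISO code and country, and set the protocol to https when the seventh column says yes, otherwise http. Note that every proxy is marked as alive without being tested; the code does not filter anything out, it takes the site's word for it.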
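The row handling can also be pulled out into a small function so it is easy to try on its own. This is only a sketch: cells_to_proxy is a made-up name, and it assumes the same column order as the loop above (IP, port, ISO code, country, then the https flag in the seventh column).

```python
def cells_to_proxy(cells):
    """Build a proxy dict from a row's cell texts, or return None if the row is too short."""
    if len(cells) < 7:
        return None
    # Strip stray whitespace that often surrounds table cell text.
    cells = [c.strip() for c in cells]
    return {
        'ip': cells[0],
        'port': cells[1],
        'iso': cells[2],
        'country': cells[3],
        'protocol': 'https' if cells[6].lower() == 'yes' else 'http',
        'alive': True,  # taken from the listing, not actually tested
    }
```

Inside the loop you could use it as cells_to_proxy([td.text for td in row.find_all('td')]) and append the result only when it is not None.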

 

Fourth, we have to save the proxies.

 

with open("proxies.txt", "w") as file:
   # Write one ip:port per line.
   for p in proxies:
       file.write(f"{p['ip']}:{p['port']}\n")

 

os.startfile('proxies.txt')

 

GG, we're done. This saves only the IP and the port from the table we scraped. We could make it save the country, ISO code, etc., but all we really need is ip:port. After it saves, it opens the saved proxies file. Keep in mind that os.startfile only works on Windows.
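If you are not on Windows, the saving step can be sketched like this. save_proxies and open_in_default_app are hypothetical helper names, and the open and xdg-open commands are assumptions about what is installed on macOS and Linux desktops.

```python
import os
import subprocess
import sys

def save_proxies(proxies, path='proxies.txt'):
    """Write one ip:port per line and return the number of lines written."""
    with open(path, 'w') as file:
        for p in proxies:
            file.write(f"{p['ip']}:{p['port']}\n")
    return len(proxies)

def open_in_default_app(path):
    """Open a file with the platform's default application."""
    if sys.platform.startswith('win'):
        os.startfile(path)  # only exists on Windows
    elif sys.platform == 'darwin':
        subprocess.run(['open', path], check=False)
    else:
        # Assumption: a Linux desktop with xdg-open installed;
        # this raises FileNotFoundError if the command is missing.
        subprocess.run(['xdg-open', path], check=False)
```

Call save_proxies(proxies) first, then open_in_default_app('proxies.txt') if you want the file to pop up afterwards.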

[/hide]


Useful to know! thanks for this.


if this actually works this is lit ngl


thank you so much


thank you so much for the thread


WOW this is so good



