Scrape email IDs with Python

[How to] Scrape emails using Python

Krupal variya Python

This tutorial shows how to code an email crawler that scrapes email IDs from the web using Python.

Coding such a crawler is useful for anyone who wants to understand how crawling and scraping work, and it can also be a gem for direct marketers looking for a way to automate their data extraction tasks.

Before jumping into the coding part, let's take a look at the basic functionality of this scraper. The crawler has to perform the following tasks to scrape emails:

  1. Open start page
  2. Look for emails, add to db if found
  3. Look for new links, add to crawling queue if found
  4. Keep crawling until all pages are crawled
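
The four steps above can be sketched as a single loop. This is only a minimal illustration, not the tutorial's final code: the `crawl` function, its `fetch` argument (any callable that returns a page's HTML as text), the `max_pages` limit, and the simple regular expressions are all assumptions made for this example. An in-memory set stands in for the database, and only the standard library is used so the sketch runs without a network connection.

```python
import re
from collections import deque
from urllib.parse import urljoin

# Deliberately simple patterns (assumptions for this sketch, not a full email or HTML parser).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
HREF_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)


def crawl(start_url, fetch, max_pages=50):
    """Breadth-first crawl. `fetch(url)` is a hypothetical callable returning HTML text."""
    queue = deque([start_url])  # pages waiting to be crawled
    seen = {start_url}          # pages already queued, so no page is visited twice
    emails = set()              # stand-in for the database
    crawled = 0

    # Step 4: keep going until the queue is empty (or the page limit is hit).
    while queue and crawled < max_pages:
        url = queue.popleft()
        html = fetch(url)                      # Step 1: open the page
        crawled += 1
        emails.update(EMAIL_RE.findall(html))  # Step 2: collect any emails found
        for href in HREF_RE.findall(html):     # Step 3: queue links not seen before
            link = urljoin(url, href)          # resolve relative links against the page URL
            if link.startswith("http") and link not in seen:
                seen.add(link)
                queue.append(link)
    return emails
```

Passing `fetch` in as an argument keeps the loop independent of how pages are downloaded, so it can be tried against a small dictionary of fake pages before pointing it at a real site.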

Let's start by importing some libraries.

Step 1: Importing Libraries

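import re                          # regular expressions (standard library)
import sqlite3                     # SQLite database access (standard library)
import requests                    # HTTP client (third-party package)
from urllib.parse import urlsplit  # splits a URL into scheme, host, path, ...
from bs4 import BeautifulSoup      # HTML parser (third-party, package name beautifulsoup4)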
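
To give a feel for what the standard-library imports offer, here is a small sketch that can be run without any network access. The helper names `base_url` and `save_emails`, the `emails` table, and the email pattern are assumptions made for illustration, not part of the tutorial's code. `requests` and `BeautifulSoup` are left out so that the example has no third-party dependencies.

```python
import re
import sqlite3
from urllib.parse import urlsplit

# Simple email pattern (an assumption for this sketch; real-world addresses can be more varied).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def base_url(url):
    """Return scheme and host of a URL, useful for resolving relative links."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def save_emails(conn, html):
    """Find emails in `html` and store them in a hypothetical `emails` table, skipping duplicates."""
    conn.execute("CREATE TABLE IF NOT EXISTS emails (address TEXT PRIMARY KEY)")
    found = set(EMAIL_RE.findall(html))
    # INSERT OR IGNORE leaves existing rows alone when an address is already stored.
    conn.executemany(
        "INSERT OR IGNORE INTO emails (address) VALUES (?)",
        [(address,) for address in found],
    )
    conn.commit()
    return found


# Usage: an in-memory database keeps the example free of side effects.
conn = sqlite3.connect(":memory:")
save_emails(conn, "Write to sales@example.com or sales@example.com")
print(base_url("https://example.com/contact?ref=home"))
print(conn.execute("SELECT address FROM emails").fetchall())
```

Using the address as the primary key means the same email found on several pages is stored only once.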

 
