The Ultimate Guide to Automating SEO with Python: Boost Rankings & Save Time!


As a blogger and technical writer, I've learned over the years that SEO is essential for reaching a wider audience and making sure content ranks well on search engines, particularly Google.

Since I'd rather spend my time on what I like most, blogging and writing about programming and software development, I've been exploring and experimenting with different techniques, libraries, and automation methods for automating SEO and enhancing visibility with Python.

Note: Understanding how to optimize content, analyze keyword trends, and leverage automation has been a game-changer for me in driving organic traffic and improving authority.

In this guide, I’ll share essential tips on how Python can be used to automate SEO tasks. We'll learn about:

  • Setting Up Your Python Environment for SEO Automation.
  • Why Python is the Perfect Tool for SEO Automation.
  • Essential Python Libraries for SEO Automation.
  • Error Handling & Rate Limiting with practical code examples for implementing responsible scraping practices that won't get your IPs blocked.
  • Enhanced Content Optimization with NLP - with example code for text similarity analysis to help optimize content relevance.
  • Competitive Analysis Automation - with a comprehensive code example for analyzing competitors' SEO strategies, including visualizations.
  • Machine Learning for SEO Predictions - using a Random Forest model to predict SEO rankings based on historical data.

Introduction: Why Automate SEO with Python?

In the fast-paced world of blogging and technical writing, Search Engine Optimization (SEO) is a critical driver of search visibility and success.

However, managing SEO manually can quickly become overwhelming. Tasks like keyword research, site audits, rank tracking, and backlink analysis are not only repetitive but also require great attention to detail and constant updates.

For bloggers and technical writers, this often means spending hours on repetitive tasks instead of focusing on writing tutorials and articles that provide value to other developers.

Since I'm a programmer first and foremost, I already have a magic tool for automating these tasks: Python, a versatile, powerful, and open-source programming language that has become a go-to choice for automating complex tasks.

Python’s simplicity, combined with its vast ecosystem of libraries and frameworks, makes it uniquely suited for SEO automation.

Whether you’re analyzing large datasets, scraping web data, or generating reports, Python can handle it all with precision and efficiency. By automating these tasks, you can save time, reduce human error, and free up resources to focus on writing high-quality articles and tutorials!

This guide is designed to help you harness the power of Python for SEO. From setting up your environment to building custom scripts for keyword analysis, site audits, and performance tracking, you’ll learn how to streamline your workflows, improve accuracy, and boost productivity.

Whether you’re a seasoned developer or a content marketer with limited coding experience, this guide will equip you with the tools and knowledge to take your SEO efforts to the next level.

Let’s dive into the world of SEO automation with Python and discover how you can work smarter, not harder, to achieve your content marketing goals!

🛠️ Getting Started: Setting Up Your Python Environment for SEO Automation

Before diving into the world of SEO automation with Python, it’s important to set up your development environment. Don’t worry—this process is straightforward, even if you’re new to Python. By the end of this section, you’ll have everything you need to start building powerful SEO automation scripts.


Step 1: Install Python

If you haven’t already, the first step is to install Python on your computer. Python is compatible with Windows, macOS, and Linux, so you can follow the steps for your operating system:

  • Visit the official Python website: python.org
  • Download the latest version: As of now, Python 3.x is the latest. Make sure to check the box that says "Add Python to PATH" during installation on Windows (this makes it easier to run Python from your command line or terminal).
  • Verify the installation: Open your command line (Command Prompt on Windows, Terminal on macOS/Linux) and type: python --version

If Python is installed correctly, you’ll see the version number displayed.


Step 2: Choose a Code Editor or IDE

You can write Python scripts in any text editor, but using an Integrated Development Environment (IDE) or code editor will make writing and debugging code, and even managing the necessary libraries, much easier. Here are some popular IDEs you can choose from:

  • Visual Studio Code (VS Code): Lightweight, customizable, and packed with extensions for Python development.
  • PyCharm: A powerful IDE specifically designed for Python, with advanced features for debugging and testing.
  • Jupyter Notebook: Great for interactive coding and data analysis, especially if you’re working with datasets.

Step 3: Install Essential Python Libraries for SEO

Python is known as a batteries-included language, and its strength lies in its libraries: pre-written code modules that make complex tasks simple. For SEO automation, here are some must-know libraries:

  • BeautifulSoup and Requests: For web scraping and extracting data from websites. Install with: pip install beautifulsoup4 requests
  • Pandas: For data manipulation and analysis, especially useful for keyword research and reporting. Install with: pip install pandas
  • NumPy: For numerical computations and handling large datasets. Install with: pip install numpy
  • SEO-specific libraries: Libraries like SEOanalyzer or pySEO can simplify tasks like site audits and backlink analysis. Install with: pip install seoanalyzer

Step 4: Set Up a Virtual Environment

A virtual environment is a self-contained directory where you can install all the dependencies required for your project. This ensures that your Python environment stays clean and avoids conflicts between different projects.

You can create a virtual environment using the following command:

python -m venv seo-automation-env

Next, you need to activate the virtual environment:

  • Windows: seo-automation-env\Scripts\activate
  • macOS/Linux: source seo-automation-env/bin/activate

After that, you can install any required libraries within the virtual environment using pip and keep everything organized.
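
For example, with the environment active, you can install the core libraries for this guide and snapshot them so the project stays reproducible:

pip install beautifulsoup4 requests pandas
pip freeze > requirements.txt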


Step 5: Test Your Setup

Once everything is set up, it’s time to test your environment. Create a simple Python script to ensure everything is working as expected. For example, try scraping a webpage using BeautifulSoup and Requests:

import requests
from bs4 import BeautifulSoup

# Fetch a webpage
url = "https://example.com"
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')

# Extract the title tag
title = soup.title.string
print(f"Page Title: {title}")

Run the script, and if you see the title of the webpage printed, congratulations—your environment is ready for SEO automation!


With your Python environment set up, you’re now equipped to start automating SEO tasks.


🔥 Why Python is the Perfect Tool for SEO Automation

Python offers several advantages that make it the ideal choice for SEO tasks:

1️⃣ Scalability & Efficiency

Python effortlessly processes large datasets, making it perfect for analyzing keyword lists, traffic reports, and SEO performance metrics.
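
As a quick illustration, here is a minimal sketch of that kind of bulk processing; the file name and column names (keywords.csv, Keyword, Search Volume) are placeholders for your own keyword export:

import pandas as pd

# Load a keyword export (file and column names are placeholders)
df = pd.read_csv("keywords.csv")

# Filter and rank thousands of keywords in a couple of lines
high_value = df[df["Search Volume"] > 1000].sort_values("Search Volume", ascending=False)
print(high_value.head(10))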

2️⃣ Flexibility with APIs & Web Scraping

Python can interact with APIs and extract data from websites, helping you gather competitor insights and optimize content.

3️⃣ Open-Source & Cost-Effective

As a free and open-source tool, Python is backed by a vibrant community constantly developing new libraries and resources.

4️⃣ Customization & Automation

Unlike third-party SEO tools with fixed functionalities, Python allows you to create custom automation scripts tailored to your needs.


🛠 Essential Python Libraries for SEO Automation

To get started, you'll need to familiarize yourself with some powerful Python libraries:

🔍 1. BeautifulSoup

A popular web scraping tool for extracting SEO-relevant data such as meta tags, headings, and keywords from web pages. Here is a quick example:

from bs4 import BeautifulSoup
import requests

url = "https://example.com"
response = requests.get(url)
soup = BeautifulSoup(response.content, 'html.parser')
meta_tags = soup.find_all('meta')
for tag in meta_tags:
    print(tag)

🌐 2. Requests

Helps send HTTP requests to retrieve website data, useful for scraping and API interactions.

import requests
response = requests.get("https://api.example.com/data")
print(response.json())

📊 3. Pandas

A must-have for processing and analyzing large keyword datasets, traffic reports, and rankings.

import pandas as pd

# Example: Creating a dataframe from SEO keyword data
data = {"Keyword": ["SEO tools", "Python SEO"], "Search Volume": [5000, 3000]}
df = pd.DataFrame(data)
print(df)

🤖 4. Selenium

Automates browser interactions, useful for crawling JavaScript-heavy websites.

from selenium import webdriver

driver = webdriver.Chrome()
driver.get("https://example.com")
print(driver.title)
driver.quit()

📈 5. Matplotlib & Seaborn

Used for visualizing SEO trends, keyword performance, and competitor analysis.

import matplotlib.pyplot as plt

data = [5000, 7000, 6500, 7200]
labels = ["Jan", "Feb", "Mar", "Apr"]
plt.plot(labels, data)
plt.title("SEO Traffic Growth")
plt.show()

🔗 6. Google API Client

Allows access to Google Analytics, Search Console, and other SEO-related data.

from googleapiclient.discovery import build

# Note: Search Console data requires OAuth credentials (e.g. from google-auth);
# an API key alone is not sufficient for Search Analytics queries
service = build('searchconsole', 'v1', credentials=credentials)


⚡ Automating SEO Tasks with Python

Now, let's see some examples of how to automate SEO tasks with Python.

📌 1. Web Scraping for SEO Insights

Extract competitor metadata, headings, and keyword usage.

from bs4 import BeautifulSoup
import requests

url = "https://competitor.com"
response = requests.get(url)
soup = BeautifulSoup(response.content, 'html.parser')
meta_tags = soup.find_all('meta')
for tag in meta_tags:
    print(tag)

🔍 2. Automating Keyword Research

Use APIs like the Google Ads API, which powers Keyword Planner, to discover keyword opportunities.

# The legacy AdWords API has been sunset; the Google Ads API replaces it
from google.ads.googleads.client import GoogleAdsClient
client = GoogleAdsClient.load_from_storage()

🏗 3. Site Audits & SEO Performance Checks

Check for broken links, missing meta descriptions, and page speed issues.

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://yourwebsite.com")
# find_elements_by_tag_name was removed in Selenium 4; use find_elements with By
links = driver.find_elements(By.TAG_NAME, 'a')
for link in links:
    print(link.get_attribute('href'))
driver.quit()

🔗 4. Automating Backlink Analysis

Pull backlink data using tools like Ahrefs or SEMrush APIs.

import requests
url = "https://apiv2.ahrefs.com/?from=backlinks&target=example.com&mode=domain&token=YOUR_TOKEN"
response = requests.get(url)
data = response.json()
for backlink in data['backlinks']:
    print(backlink['url'])

📝 5. Content Optimization & Generation

Analyze keyword density and generate meta descriptions using Natural Language Processing (NLP).

import spacy

# Requires the model download: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
doc = nlp("This is a sample SEO-optimized content snippet.")
for token in doc:
    print(token.text, token.pos_)


🚀 Creating an SEO Keyword Rank Tracker

Monitor keyword performance in Google Search results.

from googleapiclient.discovery import build

# As above, Search Console requires OAuth credentials rather than an API key
service = build('searchconsole', 'v1', credentials=credentials)
site_url = "https://yourwebsite.com"
response = service.searchanalytics().query(siteUrl=site_url, body={
    'startDate': '2023-01-01',
    'endDate': '2023-01-31',
    'dimensions': ['query'],
    'rowLimit': 10
}).execute()
for row in response.get('rows', []):  # 'rows' is absent when there is no data
    print(row['keys'][0], row['clicks'], row['position'])


📌 Best Practices for Python SEO Automation

✅ 1. Ensure Data Accuracy

Regularly verify that your data sources and scraping methods are up to date.
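
One lightweight way to do this is a sanity check that fails loudly when a page's structure changes. Here is a minimal sketch, assuming your scripts rely on titles and meta descriptions:

import requests
from bs4 import BeautifulSoup

def validate_scrape(url):
    # Fail loudly if the elements we rely on have disappeared, which usually
    # means the page structure (or our selector) has changed
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, 'html.parser')
    assert soup.title and soup.title.string, f"No title tag found at {url}"
    assert soup.find('meta', attrs={'name': 'description'}), f"No meta description at {url}"
    return soup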

✅ 2. Follow Ethical Guidelines

Always respect website permissions, robots.txt files, and API rate limits.
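
Python's standard library can check robots.txt for you before you fetch a page. Here is a small sketch using urllib.robotparser (the user agent string is a placeholder):

from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

def is_allowed(url, user_agent="MySEOBot"):
    # Build the robots.txt URL for the target site and ask whether
    # this user agent may fetch the given page
    parts = urlparse(url)
    rp = RobotFileParser()
    rp.set_url(f"{parts.scheme}://{parts.netloc}/robots.txt")
    rp.read()
    return rp.can_fetch(user_agent, url)

print(is_allowed("https://example.com/some-page"))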

✅ 3. Maintain & Update Scripts

SEO trends and APIs change frequently. Keep your Python scripts updated to stay relevant.


🔄 Error Handling & Rate Limiting

Implementing proper error handling and rate limiting is crucial when working with web scraping and APIs to avoid getting blocked:

# Example of rate limiting and error handling
import requests
import time
from requests.exceptions import RequestException

def scrape_with_rate_limit(urls, delay=2):
    results = []
    for url in urls:
        try:
            response = requests.get(url, timeout=10)
            response.raise_for_status()  # Raise exception for 4XX/5XX responses
            results.append(response.text)
            time.sleep(delay)  # Respect websites by waiting between requests
        except RequestException as e:
            print(f"Error scraping {url}: {e}")
            continue
    return results
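
A natural extension is to retry transient failures with exponential backoff instead of giving up on the first error. Here is a minimal sketch of that pattern:

import time
import requests
from requests.exceptions import RequestException

def fetch_with_retries(url, max_retries=3, base_delay=2):
    # Retry transient failures, doubling the wait after each failed attempt
    for attempt in range(max_retries):
        try:
            response = requests.get(url, timeout=10)
            response.raise_for_status()
            return response.text
        except RequestException as e:
            wait = base_delay * (2 ** attempt)
            print(f"Attempt {attempt + 1} failed for {url}: {e}. Retrying in {wait}s...")
            time.sleep(wait)
    return None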


🧠 Advanced Content Optimization with NLP

Take your content optimization to the next level with Natural Language Processing techniques:

# Content optimization with text similarity analysis
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def analyze_content_relevance(target_keyword, content):
    # Create TF-IDF vectorizer
    vectorizer = TfidfVectorizer()

    # Transform texts to vectors
    vectors = vectorizer.fit_transform([target_keyword, content])

    # Calculate similarity
    similarity = cosine_similarity(vectors[0:1], vectors[1:2])[0][0]
    return similarity

# Example usage
keyword = "python seo automation"
article = "This comprehensive guide covers various aspects of automating SEO tasks using Python programming."
relevance_score = analyze_content_relevance(keyword, article)
print(f"Content relevance score: {relevance_score:.2f}")


🔍 Automating Competitive Analysis

Stay ahead of your competitors by automating the analysis of their SEO strategies:

import pandas as pd
import requests
from bs4 import BeautifulSoup
import matplotlib.pyplot as plt
import seaborn as sns

def analyze_competitors(competitor_urls, keywords):
    """
    Analyze multiple competitors for keyword usage, content length, 
    and meta tag optimization.

    Args:
        competitor_urls (list): List of competitor website URLs
        keywords (list): List of target keywords to analyze

    Returns:
        DataFrame: Results of the competitive analysis
    """
    results = []

    for url in competitor_urls:
        try:
            # Get the webpage content
            response = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'}, timeout=10)
            response.raise_for_status()
            soup = BeautifulSoup(response.content, 'html.parser')

            # Extract text content
            text_content = soup.get_text(separator=' ', strip=True)
            word_count = len(text_content.split())

            # Extract title and meta description
            title = soup.title.text if soup.title else "No title"
            meta_desc = soup.find('meta', attrs={'name': 'description'})
            meta_desc = meta_desc['content'] if meta_desc else "No meta description"

            # Count keyword occurrences
            keyword_counts = {}
            for keyword in keywords:
                keyword_counts[keyword] = text_content.lower().count(keyword.lower())

            # Count headings
            h1_count = len(soup.find_all('h1'))
            h2_count = len(soup.find_all('h2'))
            h3_count = len(soup.find_all('h3'))

            # Extract image alt text usage
            images = soup.find_all('img')
            images_with_alt = sum(1 for img in images if img.get('alt'))

            # Analyze internal and external links
            all_links = soup.find_all('a', href=True)
            internal_links = sum(1 for link in all_links if link['href'].startswith('/') or url in link['href'])
            external_links = len(all_links) - internal_links

            # Compile results
            result = {
                'URL': url,
                'Title Length': len(title),
                'Meta Description Length': len(meta_desc),
                'Word Count': word_count,
                'H1 Count': h1_count,
                'H2 Count': h2_count,
                'H3 Count': h3_count,
                'Images with Alt Text': images_with_alt,
                'Total Images': len(images),
                'Internal Links': internal_links,
                'External Links': external_links
            }

            # Add keyword counts
            for keyword, count in keyword_counts.items():
                result[f'Keyword: {keyword}'] = count

            results.append(result)

        except Exception as e:
            print(f"Error analyzing {url}: {e}")

    # Convert to DataFrame
    df = pd.DataFrame(results)
    return df

def visualize_competitive_analysis(df):
    """
    Create visualizations for competitive analysis data

    Args:
        df (DataFrame): Competitive analysis results
    """
    # Set up the matplotlib figure
    plt.figure(figsize=(14, 10))

    # 1. Content Length Comparison
    plt.subplot(2, 2, 1)
    sns.barplot(x='URL', y='Word Count', data=df)
    plt.title('Content Length Comparison')
    plt.xticks(rotation=45, ha='right')
    plt.tight_layout()

    # 2. Meta Description Length
    plt.subplot(2, 2, 2)
    sns.barplot(x='URL', y='Meta Description Length', data=df)
    plt.title('Meta Description Length')
    plt.xticks(rotation=45, ha='right')
    plt.tight_layout()

    # 3. Heading Usage
    plt.subplot(2, 2, 3)
    heading_data = df.melt(
        id_vars=['URL'], 
        value_vars=['H1 Count', 'H2 Count', 'H3 Count'],
        var_name='Heading Type', 
        value_name='Count'
    )
    sns.barplot(x='URL', y='Count', hue='Heading Type', data=heading_data)
    plt.title('Heading Usage')
    plt.xticks(rotation=45, ha='right')
    plt.tight_layout()

    # 4. Internal vs External Links
    plt.subplot(2, 2, 4)
    link_data = df.melt(
        id_vars=['URL'], 
        value_vars=['Internal Links', 'External Links'],
        var_name='Link Type', 
        value_name='Count'
    )
    sns.barplot(x='URL', y='Count', hue='Link Type', data=link_data)
    plt.title('Internal vs External Links')
    plt.xticks(rotation=45, ha='right')
    plt.tight_layout()

    plt.tight_layout(pad=3.0)
    plt.savefig('competitor_analysis.png')
    plt.show()

# Example usage
competitors = [
    "https://competitor1.com",
    "https://competitor2.com",
    "https://competitor3.com"
]

target_keywords = ["python seo", "automation", "web scraping"]
analysis_results = analyze_competitors(competitors, target_keywords)
print(analysis_results)
visualize_competitive_analysis(analysis_results)


📊 Using Machine Learning for SEO Predictions

Take your SEO strategy to the next level by using machine learning to predict rankings and performance:

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error, r2_score
import matplotlib.pyplot as plt
import seaborn as sns

def predict_seo_performance(data_csv_path):
    """
    Use machine learning to predict SEO performance based on historical data

    Args:
        data_csv_path (str): Path to CSV file with historical SEO data

    Returns:
        model: Trained RandomForest model
        scaler: Fitted StandardScaler (needed to score new content later)
        X_test: Test features
        y_test: Test target values
        y_pred: Predicted values
    """
    # Load your historical SEO data
    # CSV should have columns like: word_count, keyword_density, backlinks, page_load_time, 
    # meta_desc_length, h1_count, etc. and a target column "ranking" or "traffic"
    df = pd.read_csv(data_csv_path)

    # Handle missing values
    df = df.fillna(0)

    # Define features and target
    X = df.drop(['url', 'ranking'], axis=1)  # Adjust column names as needed
    y = df['ranking']

    # Split data
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # Standardize features
    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)
    X_test_scaled = scaler.transform(X_test)

    # Train Random Forest model
    model = RandomForestRegressor(n_estimators=100, random_state=42)
    model.fit(X_train_scaled, y_train)

    # Make predictions
    y_pred = model.predict(X_test_scaled)

    # Evaluate the model
    mse = mean_squared_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)
    print(f"Mean Squared Error: {mse:.2f}")
    print(f"R² Score: {r2:.2f}")

    # Feature importance
    feature_importance = pd.DataFrame({
        'Feature': X.columns,
        'Importance': model.feature_importances_
    }).sort_values('Importance', ascending=False)

    print("\nFeature Importance:")
    print(feature_importance.head(10))

    # Visualize feature importance
    plt.figure(figsize=(10, 6))
    sns.barplot(x='Importance', y='Feature', data=feature_importance.head(10))
    plt.title('Top 10 Features for SEO Ranking')
    plt.tight_layout()
    plt.savefig('seo_feature_importance.png')
    plt.show()

    # Also return the fitted scaler so new content can be transformed consistently
    return model, scaler, X_test_scaled, y_test, y_pred

def predict_new_content_performance(model, scaler, new_content_features):
    """
    Predict the performance of new content based on its features

    Args:
        model: Trained model
        scaler: Fitted StandardScaler
        new_content_features (dict): Features of the new content

    Returns:
        float: Predicted ranking or traffic
    """
    # Convert features to DataFrame
    new_df = pd.DataFrame([new_content_features])

    # Scale features
    scaled_features = scaler.transform(new_df)

    # Make prediction
    prediction = model.predict(scaled_features)[0]

    return prediction
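
Here is a hedged usage sketch; the CSV path and the feature names in new_page are hypothetical and must match the columns of your own historical data:

# Example usage (hypothetical file and feature names)
model, scaler, X_test, y_test, y_pred = predict_seo_performance('seo_history.csv')

new_page = {
    'word_count': 1800,
    'keyword_density': 1.8,
    'backlinks': 42,
    'page_load_time': 1.2,
    'meta_desc_length': 152,
    'h1_count': 1
}
predicted_ranking = predict_new_content_performance(model, scaler, new_page)
print(f"Predicted ranking: {predicted_ranking:.1f}")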


📍 Automating Local SEO Tasks

Boost your local search presence with these Python automation techniques:

# Local SEO automation: Extract and analyze Google My Business data
from googleapiclient.discovery import build
from google.oauth2 import service_account

def analyze_gmb_insights(service_account_file):
    # Set up credentials
    credentials = service_account.Credentials.from_service_account_file(
        service_account_file,
        scopes=['https://www.googleapis.com/auth/business.manage']
    )

    # Build the service
    service = build('mybusinessaccountmanagement', 'v1', credentials=credentials)

    # Get account list
    accounts = service.accounts().list().execute()

    # Process account data
    for account in accounts.get('accounts', []):
        print(f"Account Name: {account['name']}")

        # Get locations for this account
        # Note: the v4 My Business API used here has been deprecated in favor of
        # the Business Profile APIs; treat this as a historical illustration
        locations_service = build('mybusiness', 'v4', credentials=credentials)
        locations = locations_service.accounts().locations().list(
            name=account['name'],
            pageSize=100
        ).execute()

        # Process location data
        for location in locations.get('locations', []):
            print(f"Location: {location['name']}")

            # Get insights for this location
            insights = locations_service.accounts().locations().reportInsights(
                name=location['name'],
                body={
                    'locationNames': [location['name']],
                    'basicRequest': {
                        'metricRequests': [
                            {'metric': 'QUERIES_DIRECT'},
                            {'metric': 'QUERIES_INDIRECT'},
                            {'metric': 'VIEWS_MAPS'},
                            {'metric': 'VIEWS_SEARCH'},
                            {'metric': 'ACTIONS_WEBSITE'},
                            {'metric': 'ACTIONS_PHONE'},
                            {'metric': 'ACTIONS_DRIVING_DIRECTIONS'}
                        ],
                        'timeRange': {
                            'startTime': '2023-01-01T00:00:00Z',
                            'endTime': '2023-01-31T23:59:59Z'
                        }
                    }
                }
            ).execute()

            # Process insights data
            print(insights)


🔄 End-to-End SEO Workflow Automation

Create a complete workflow that handles all aspects of SEO analysis and optimization:

import pandas as pd
import requests
from bs4 import BeautifulSoup
import matplotlib.pyplot as plt
import seaborn as sns
import os
import time
import csv
from datetime import datetime
import json
import random
import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
import re

# Download required NLTK data
nltk.download('punkt')
nltk.download('punkt_tab')  # needed by word_tokenize on newer NLTK releases
nltk.download('stopwords')

class SEOAutomator:
    """
    A comprehensive class for automating various SEO tasks in a single workflow
    """

    def __init__(self, website_url, competitors=None, keywords=None):
        """
        Initialize the SEO automator

        Args:
            website_url (str): URL of the website to analyze
            competitors (list): List of competitor URLs
            keywords (list): List of target keywords
        """
        self.website_url = website_url
        self.competitors = competitors or []
        self.keywords = keywords or []
        self.results_dir = f"seo_analysis_{datetime.now().strftime('%Y%m%d_%H%M%S')}"

        # Create results directory
        if not os.path.exists(self.results_dir):
            os.makedirs(self.results_dir)

        # Initialize results containers
        self.site_audit_results = {}
        self.keyword_analysis = {}
        self.competitor_analysis = {}
        self.content_recommendations = {}

        # User agent rotation to avoid blocking
        self.user_agents = [
            'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
            'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1 Safari/605.1.15',
            'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36'
        ]

    def _get_random_user_agent(self):
        """Get a random user agent from the list"""
        return random.choice(self.user_agents)

    def _fetch_page(self, url):
        """
        Fetch a webpage with error handling and rate limiting

        Args:
            url (str): URL to fetch

        Returns:
            BeautifulSoup: Parsed HTML content
        """
        headers = {'User-Agent': self._get_random_user_agent()}
        try:
            response = requests.get(url, headers=headers, timeout=10)
            response.raise_for_status()

            # Basic rate limiting
            time.sleep(2)

            return BeautifulSoup(response.content, 'html.parser')
        except Exception as e:
            print(f"Error fetching {url}: {e}")
            return None

    def audit_site(self):
        """
        Perform a basic SEO audit of the website

        Returns:
            dict: Audit results
        """
        print(f"Auditing website: {self.website_url}")

        soup = self._fetch_page(self.website_url)
        if not soup:
            return {"error": f"Could not fetch {self.website_url}"}

        # Extract key SEO elements
        title = soup.title.text if soup.title else "No title"
        meta_desc = soup.find('meta', attrs={'name': 'description'})
        meta_desc = meta_desc['content'] if meta_desc else "No meta description"

        # Check headings structure
        headings = {
            'h1': [h.text.strip() for h in soup.find_all('h1')],
            'h2': [h.text.strip() for h in soup.find_all('h2')],
            'h3': [h.text.strip() for h in soup.find_all('h3')]
        }

        # Check for alt text on images
        images = soup.find_all('img')
        # Use .get() so images without a src attribute don't raise a KeyError
        images_missing_alt = [img.get('src', '') for img in images if not img.get('alt')]

        # Check for canonical URL
        canonical = soup.find('link', attrs={'rel': 'canonical'})
        canonical_url = canonical['href'] if canonical else "No canonical URL found"

        # Check for structured data
        structured_data = soup.find_all('script', attrs={'type': 'application/ld+json'})
        has_structured_data = len(structured_data) > 0

        # Collect all links
        all_links = soup.find_all('a', href=True)
        internal_links = [link['href'] for link in all_links if link['href'].startswith('/') or self.website_url in link['href']]
        external_links = [link['href'] for link in all_links if link['href'].startswith('http') and self.website_url not in link['href']]

        # Check for responsive design meta tag
        responsive = soup.find('meta', attrs={'name': 'viewport'}) is not None

        # Store results
        self.site_audit_results = {
            'title': title,
            'title_length': len(title),
            'meta_description': meta_desc,
            'meta_description_length': len(meta_desc),
            'headings': headings,
            'h1_count': len(headings['h1']),
            'total_images': len(images),
            'images_missing_alt': images_missing_alt,
            'images_missing_alt_count': len(images_missing_alt),
            'canonical_url': canonical_url,
            'has_structured_data': has_structured_data,
            'internal_links_count': len(internal_links),
            'external_links_count': len(external_links),
            'is_responsive': responsive
        }

        # Save results to file
        with open(f"{self.results_dir}/site_audit.json", 'w') as f:
            json.dump(self.site_audit_results, f, indent=4)

        print("Site audit completed and saved to site_audit.json")
        return self.site_audit_results

    def analyze_keywords(self):
        """
        Analyze keyword usage on the website

        Returns:
            dict: Keyword analysis results
        """
        print(f"Analyzing keywords for: {self.website_url}")

        soup = self._fetch_page(self.website_url)
        if not soup:
            return {"error": f"Could not fetch {self.website_url}"}

        # Extract all text content
        text_content = soup.get_text(separator=' ', strip=True)

        # Tokenize and clean text
        tokens = word_tokenize(text_content.lower())
        stop_words = set(stopwords.words('english'))
        tokens = [word for word in tokens if word.isalnum() and word not in stop_words]

        # Calculate word frequency
        word_freq = {}
        for token in tokens:
            if token in word_freq:
                word_freq[token] += 1
            else:
                word_freq[token] = 1

        # Sort by frequency
        sorted_freq = dict(sorted(word_freq.items(), key=lambda x: x[1], reverse=True))
        top_words = {k: sorted_freq[k] for k in list(sorted_freq.keys())[:50]}

        # Analyze keyword usage
        keyword_analysis = {}
        for keyword in self.keywords:
            keyword = keyword.lower()
            # Check exact matches
            exact_count = text_content.lower().count(keyword)

            # Check for keyword in title
            in_title = keyword in soup.title.text.lower() if soup.title else False

            # Check for keyword in meta description
            meta_desc = soup.find('meta', attrs={'name': 'description'})
            in_meta = keyword in meta_desc['content'].lower() if meta_desc else False

            # Check for keyword in headings
            in_h1 = any(keyword in h.text.lower() for h in soup.find_all('h1'))
            in_h2 = any(keyword in h.text.lower() for h in soup.find_all('h2'))

            # Store keyword analysis
            keyword_analysis[keyword] = {
                'exact_matches': exact_count,
                'in_title': in_title,
                'in_meta_description': in_meta,
                'in_h1': in_h1,
                'in_h2': in_h2,
                'keyword_density': (exact_count / len(tokens)) * 100 if tokens else 0
            }

        self.keyword_analysis = {
            'top_words': top_words,
            'total_words': len(tokens),
            'keyword_analysis': keyword_analysis
        }

        # Save results to file
        with open(f"{self.results_dir}/keyword_analysis.json", 'w') as f:
            json.dump(self.keyword_analysis, f, indent=4)

        print("Keyword analysis completed and saved to keyword_analysis.json")
        return self.keyword_analysis

# Example usage of the SEO Automator
automator = SEOAutomator(
    website_url="https://example.com",
    competitors=["https://competitor1.com", "https://competitor2.com"],
    keywords=["python seo", "seo automation", "python scraping"]
)

# Run the complete workflow
automator.audit_site()
automator.analyze_keywords()


🎯 Conclusion: Supercharge Your SEO with Python

Python is a game-changer for SEO professionals. By leveraging automation, you can:

✔ Save time on repetitive tasks
✔ Improve accuracy and efficiency
✔ Scale your SEO efforts effectively
✔ Make data-driven decisions with advanced analytics
✔ Stay ahead of competitors with real-time monitoring
✔ Customize your SEO approach with precision

Start integrating Python into your SEO workflow today and take your rankings to the next level! 🚀🔥

