Python Automation Scripts: Automate Repetitive Tasks 2026

Why Python is the automation language of choice

I have spent years automating workflows across dozens of projects, and Python consistently wins over alternatives like Bash, PowerShell, or JavaScript for automation tasks. The reason is a combination of readable syntax, a vast standard library, and an ecosystem of third-party packages that covers virtually any automation scenario you can think of.

The return on investment for automation is remarkable. A script that takes four hours (240 minutes) to write but saves thirty minutes of manual work per day pays for itself in eight days, and every day after that is time saved. In 2026, with AI coding assistants accelerating script writing, the payback period is even shorter.
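The break-even arithmetic above can be written as a tiny helper, which is handy when deciding whether a task is worth automating. This is only a sketch; the function name `payback_days` is made up for illustration.

```python
def payback_days(hours_to_write: float, minutes_saved_per_day: float) -> float:
    """Days of use before the time saved equals the time spent writing."""
    if minutes_saved_per_day <= 0:
        raise ValueError("minutes_saved_per_day must be positive")
    # Convert the writing time to minutes, then divide by the daily saving
    return hours_to_write * 60 / minutes_saved_per_day


print(payback_days(4, 30))  # 8.0
```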

File and directory automation

The most common automation tasks involve files. Organizing downloads, renaming batches of files, cleaning up old logs, or processing data files daily are all excellent automation candidates.

import os
import shutil
from pathlib import Path
from datetime import datetime, timedelta

def organize_downloads(downloads_path: str) -> None:
    """
    Organizes files in a downloads folder by type.
    Creates subfolders: Images, Documents, Videos, Audio, Archives, Other
    """
    downloads = Path(downloads_path)

    # Define extension to folder mapping
    categories = {
        'Images':    ['.jpg', '.jpeg', '.png', '.gif', '.webp', '.svg', '.bmp'],
        'Documents': ['.pdf', '.docx', '.doc', '.xlsx', '.xls', '.pptx', '.txt', '.csv'],
        'Videos':    ['.mp4', '.mkv', '.avi', '.mov', '.wmv', '.flv'],
        'Audio':     ['.mp3', '.wav', '.flac', '.aac', '.ogg'],
        'Archives':  ['.zip', '.tar', '.gz', '.rar', '.7z'],
    }

    # Build reverse lookup: extension -> folder
    ext_map = {ext: folder for folder, exts in categories.items() for ext in exts}

    moved = 0
    # Snapshot the listing first, because the loop creates subfolders in this directory
    for file_path in list(downloads.iterdir()):
        if not file_path.is_file():
            continue

        ext = file_path.suffix.lower()
        target_folder = ext_map.get(ext, 'Other')
        target_dir = downloads / target_folder
        target_dir.mkdir(exist_ok=True)

        # Handle duplicate filenames
        target_path = target_dir / file_path.name
        if target_path.exists():
            stem = file_path.stem
            timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
            target_path = target_dir / f"{stem}_{timestamp}{ext}"

        shutil.move(str(file_path), str(target_path))
        moved += 1
        print(f"  Moved: {file_path.name} → {target_folder}/")

    print(f"\nDone. Organized {moved} files.")

def cleanup_old_files(directory: str, days_old: int = 30) -> None:
    """Deletes files older than N days. Use with caution."""
    cutoff = datetime.now() - timedelta(days=days_old)
    deleted = 0

    for file_path in Path(directory).rglob('*'):
        if file_path.is_file():
            mod_time = datetime.fromtimestamp(file_path.stat().st_mtime)
            if mod_time < cutoff:
                file_path.unlink()
                deleted += 1
                print(f"Deleted: {file_path}")

    print(f"Cleaned up {deleted} files older than {days_old} days.")

# Run the organizer only when executed as a script, not when imported
if __name__ == '__main__':
    organize_downloads(os.path.expanduser('~/Downloads'))
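Batch renaming is mentioned above but not shown, so here is a minimal sketch of one way to do it. The function name `batch_rename`, the `prefix_001.ext` naming scheme, and the dry-run default are my own assumptions, not a fixed recipe. It refuses to overwrite an existing file rather than guessing what you meant.

```python
from pathlib import Path


def batch_rename(directory: str, prefix: str, pattern: str = "*",
                 dry_run: bool = True) -> list[tuple[str, str]]:
    """Rename matching files to '<prefix>_001.ext', '<prefix>_002.ext', ...

    Returns the (old_name, new_name) pairs. With dry_run=True nothing is
    renamed, so the plan can be reviewed first.
    """
    folder = Path(directory)
    # Sort so the numbering is stable and predictable
    files = sorted(p for p in folder.glob(pattern) if p.is_file())
    plan = []
    for index, path in enumerate(files, start=1):
        new_path = path.with_name(f"{prefix}_{index:03d}{path.suffix.lower()}")
        if new_path == path:
            continue  # already has the target name
        if new_path.exists():
            raise FileExistsError(f"Refusing to overwrite {new_path.name}")
        plan.append((path.name, new_path.name))
        if not dry_run:
            path.rename(new_path)
    return plan
```

Calling `batch_rename(folder, 'trip')` first prints nothing and changes nothing; inspect the returned pairs, then call it again with `dry_run=False` once the plan looks right.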

Web scraping automation

Web scraping is one of the most powerful and frequently needed automation tasks. Before scraping any site, always check its robots.txt and terms of service.

import requests
from bs4 import BeautifulSoup
import pandas as pd
import time

def scrape_hacker_news(pages: int = 3) -> pd.DataFrame:
    """
    Scrapes Hacker News front page stories.
    Respects the site by adding delays between requests.
    """
    base_url = 'https://news.ycombinator.com'
    stories = []

    headers = {
        'User-Agent': 'Mozilla/5.0 (compatible; research-bot/1.0)'
    }

    for page in range(1, pages + 1):
        url = f"{base_url}/?p={page}"
        print(f"Scraping page {page}...")

        response = requests.get(url, headers=headers, timeout=10)
        response.raise_for_status()

        soup = BeautifulSoup(response.text, 'html.parser')
        rows = soup.select('tr.athing')

        for row in rows:
            title_cell = row.select_one('.titleline > a')
            score_row = row.find_next_sibling('tr')

            if not title_cell:
                continue

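            score_el = score_row.select_one('.score') if score_row else None
            # The subtext row holds several item links (age, comments);
            # the comments link is assumed to be the last one
            item_links = score_row.select('a[href*="item"]') if score_row else []
            comments_text = item_links[-1].get_text(strip=True) if item_links else ''

            stories.append({
                'title': title_cell.get_text(strip=True),
                'url': title_cell.get('href', ''),
                # "1 point" and "42 points" both start with the number
                'score': int(score_el.get_text().split()[0]) if score_el else 0,
                'comments': comments_text if 'comment' in comments_text else '0',
            })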

        time.sleep(1.5)  # Be polite: wait between requests

    df = pd.DataFrame(stories)
    df.to_csv('hacker_news.csv', index=False)
    print(f"Saved {len(df)} stories to hacker_news.csv")
    return df

# Install: pip install requests beautifulsoup4 pandas
# stories = scrape_hacker_news(pages=2)
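The advice to check robots.txt before scraping can itself be automated with the standard library's `urllib.robotparser`. The sketch below parses rules from a string so it runs offline; the sample rules and the `research-bot` user agent are illustrative assumptions. Against a real site you would call `set_url()` and `read()` on the parser instead of `parse()`.

```python
from urllib.robotparser import RobotFileParser


def build_robots_checker(robots_txt: str) -> RobotFileParser:
    """Build a parser from the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Hypothetical rules for illustration only
sample_rules = "\n".join([
    "User-agent: *",
    "Disallow: /private/",
])

checker = build_robots_checker(sample_rules)
print(checker.can_fetch("research-bot", "https://example.com/news"))       # True
print(checker.can_fetch("research-bot", "https://example.com/private/x"))  # False
```

Note that robots.txt only covers crawling rules; the terms of service still need a human read.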

Email automation

After years of building notification systems, the pattern I always come back to is a simple email utility that is easy to call from anywhere in a pipeline.

import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.base import MIMEBase
from email import encoders
from pathlib import Path
import os

class EmailAutomator:
    def __init__(self, smtp_host: str, smtp_port: int, username: str, password: str):
        self.smtp_host = smtp_host
        self.smtp_port = smtp_port
        self.username = username
        self.password = password

    def send(
        self,
        to: list[str],
        subject: str,
        body_html: str,
        attachments: list[str] | None = None
    ) -> bool:
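        # 'mixed' is the container type for a body plus attachments;
        # 'alternative' is meant for different renderings of the same body
        msg = MIMEMultipart('mixed')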
        msg['From'] = self.username
        msg['To'] = ', '.join(to)
        msg['Subject'] = subject
        msg.attach(MIMEText(body_html, 'html'))

        if attachments:
            for filepath in attachments:
                part = MIMEBase('application', 'octet-stream')
                with open(filepath, 'rb') as f:
                    part.set_payload(f.read())
                encoders.encode_base64(part)
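                # Passing filename as a parameter lets the library quote
                # names that contain spaces or special characters
                part.add_header(
                    'Content-Disposition',
                    'attachment',
                    filename=Path(filepath).name
                )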
                msg.attach(part)

        try:
            with smtplib.SMTP(self.smtp_host, self.smtp_port) as server:
                server.starttls()
                server.login(self.username, self.password)
                server.sendmail(self.username, to, msg.as_string())
            print(f"Email sent to {', '.join(to)}")
            return True
        except Exception as e:
            print(f"Failed to send email: {e}")
            return False

# Example usage with Gmail
mailer = EmailAutomator(
    smtp_host='smtp.gmail.com',
    smtp_port=587,
    username=os.getenv('EMAIL_USER'),
    password=os.getenv('EMAIL_APP_PASSWORD')  # Use App Password, not your real password
)

mailer.send(
    to=['team@company.com'],
    subject='Daily Sales Report — March 20, 2026',
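    # The original HTML markup was lost in extraction; these tags are a plausible reconstruction
    body_html='<h2>Sales Report</h2><p>See attached CSV for today\'s numbers.</p>',
    attachments=['sales_report_2026-03-20.csv']
)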
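For comparison, the same kind of message can be assembled with the newer `email.message.EmailMessage` API from the standard library, which works out the multipart structure on its own. This is an alternative sketch, not a replacement for the class above; the function name `build_message` and the plain-text fallback wording are my own assumptions.

```python
import mimetypes
from email.message import EmailMessage
from pathlib import Path
from typing import Optional


def build_message(sender: str, to: list[str], subject: str, body_html: str,
                  attachments: Optional[list[str]] = None) -> EmailMessage:
    """Build an HTML email with a plain-text fallback and optional attachments."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(to)
    msg["Subject"] = subject
    # Plain-text part for clients that do not render HTML
    msg.set_content("This message is best viewed in an HTML-capable mail client.")
    msg.add_alternative(body_html, subtype="html")

    for filepath in attachments or []:
        path = Path(filepath)
        # Guess the MIME type from the filename; fall back to generic binary
        ctype, _ = mimetypes.guess_type(path.name)
        maintype, subtype = (ctype or "application/octet-stream").split("/", 1)
        msg.add_attachment(path.read_bytes(), maintype=maintype,
                           subtype=subtype, filename=path.name)
    return msg
```

The result can be passed to `server.send_message(msg)` inside the same `smtplib.SMTP` block used earlier, in place of `sendmail`.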

API automation and monitoring

import requests
import time
from datetime import datetime

def monitor_api_health(endpoints: list[dict], interval_seconds: int = 300) -> None:
    """
    Monitors a list of API endpoints and logs their status.
    endpoints = [{'name': 'Users API', 'url': 'https://api.example.com/health', 'expected_status': 200}]
    """
    log_file = f"api_monitor_{datetime.now().strftime('%Y%m%d')}.log"

    while True:
        timestamp = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
        results = []

        for endpoint in endpoints:
            try:
                start = time.time()
                response = requests.get(endpoint['url'], timeout=10)
                elapsed = round((time.time() - start) * 1000, 2)

                status = 'OK' if response.status_code == endpoint.get('expected_status', 200) else 'FAIL'
                result = f"[{status}] {endpoint['name']}: {response.status_code} in {elapsed}ms"
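            except requests.exceptions.Timeout:
                result = f"[TIMEOUT] {endpoint['name']}: No response after 10s"
            except requests.exceptions.ConnectionError:
                result = f"[DOWN] {endpoint['name']}: Connection failed"
            except requests.exceptions.RequestException as exc:
                # Catch-all so one misbehaving endpoint cannot crash the monitor loop
                result = f"[ERROR] {endpoint['name']}: {exc}"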

            results.append(result)
            print(result)

        log_entry = f"{timestamp}\n" + '\n'.join(results) + '\n' + '-'*60 + '\n'
        with open(log_file, 'a') as f:
            f.write(log_entry)

        time.sleep(interval_seconds)

# Monitor your APIs every 5 minutes
endpoints_to_monitor = [
    {'name': 'Main API', 'url': 'https://api.myapp.com/health', 'expected_status': 200},
    {'name': 'Auth Service', 'url': 'https://auth.myapp.com/ping', 'expected_status': 200},
]
# monitor_api_health(endpoints_to_monitor, interval_seconds=300)
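The monitor above loops forever, which makes it awkward to test or to run from cron. One possible variation, sketched below, runs a single pass and takes the HTTP call as a parameter. The names `check_endpoint`, `run_checks`, and `fetch_status` are hypothetical; with requests you could pass `lambda url: requests.get(url, timeout=10).status_code`.

```python
import time
from typing import Callable


def check_endpoint(endpoint: dict, fetch_status: Callable[[str], int]) -> dict:
    """Run one health check and return a structured result."""
    expected = endpoint.get("expected_status", 200)
    start = time.perf_counter()
    try:
        status_code = fetch_status(endpoint["url"])
    except Exception as exc:  # timeouts, DNS failures, refused connections, ...
        return {"name": endpoint["name"], "ok": False,
                "detail": f"{type(exc).__name__}: {exc}"}
    elapsed_ms = round((time.perf_counter() - start) * 1000, 2)
    return {"name": endpoint["name"], "ok": status_code == expected,
            "detail": f"HTTP {status_code} in {elapsed_ms}ms"}


def run_checks(endpoints: list[dict], fetch_status: Callable[[str], int]) -> int:
    """Check every endpoint once and return the number of failures."""
    results = [check_endpoint(endpoint, fetch_status) for endpoint in endpoints]
    for result in results:
        label = "OK" if result["ok"] else "FAIL"
        print(f"[{label}] {result['name']}: {result['detail']}")
    return sum(not result["ok"] for result in results)
```

Returning the failure count makes it easy to use as a process exit code, so a scheduler or CI job can react to a non-zero result.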

Comparison: automation approaches

| Task | Best Tool | Python Library | Difficulty |
| --- | --- | --- | --- |
| File organization | Python | pathlib, shutil | Beginner |
| Web scraping (static) | Python | requests, BeautifulSoup | Beginner |
| Web scraping (dynamic JS) | Python | playwright, selenium | Intermediate |
| Email automation | Python | smtplib, email | Beginner |
| Scheduled tasks | cron + Python | schedule, APScheduler | Beginner |
| Browser automation | Python | playwright | Intermediate |
| Excel/CSV processing | Python | pandas, openpyxl | Beginner |

Scheduling scripts to run automatically

# Using the 'schedule' library for simple task scheduling
# pip install schedule

import schedule
import time
from datetime import datetime

def daily_backup():
    print(f"[{datetime.now()}] Starting daily backup...")
    # Your backup logic here
    print("Backup complete.")

def hourly_report():
    print(f"[{datetime.now()}] Generating hourly report...")
    # Your report generation logic here

def weekly_cleanup():
    print(f"[{datetime.now()}] Running weekly cleanup...")

# Schedule tasks
schedule.every().day.at("02:00").do(daily_backup)
schedule.every().hour.do(hourly_report)
schedule.every().monday.at("09:00").do(weekly_cleanup)

print("Scheduler running. Press Ctrl+C to stop.")
while True:
    schedule.run_pending()
    time.sleep(60)  # Check every minute
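One thing to watch with a long-running loop like this: as far as I know, the schedule library does not catch exceptions raised inside a job, so a single failure can stop the whole scheduler. A common workaround is a small decorator that logs the error and carries on. The sketch below uses only the standard library; the names `keep_running` and `flaky_backup` are made up for the example.

```python
import functools
import logging

logger = logging.getLogger("scheduler")


def keep_running(job):
    """Wrap a job so an exception is logged instead of propagating."""
    @functools.wraps(job)
    def wrapper(*args, **kwargs):
        try:
            return job(*args, **kwargs)
        except Exception:
            # logger.exception records the full traceback
            logger.exception("Job %s failed", job.__name__)
            return None
    return wrapper


@keep_running
def flaky_backup():
    raise RuntimeError("disk full")


flaky_backup()  # logs the traceback and returns None instead of raising
```

A decorated function can be registered exactly like the others, for example `schedule.every().day.at("02:00").do(flaky_backup)`.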

Common errors and solutions

ModuleNotFoundError: No module named 'requests'

The library is not installed in your current Python environment. If you are using a virtual environment, make sure it is activated before installing: source env/bin/activate (Linux/Mac) or env\Scripts\activate (Windows), then pip install requests. If you have multiple Python versions, try python -m pip install requests to ensure you are installing for the right version.
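When it is unclear which interpreter is actually running, a quick check from inside Python usually settles it. This is a small diagnostic sketch; the function name `describe_environment` is hypothetical, and it expects a top-level module name such as `requests`.

```python
import importlib.util
import sys


def describe_environment(module_name: str) -> str:
    """Report the running interpreter and where a top-level module would load from."""
    spec = importlib.util.find_spec(module_name)
    location = spec.origin if spec else "not installed for this interpreter"
    # In a virtual environment, sys.prefix differs from sys.base_prefix
    in_venv = sys.prefix != sys.base_prefix
    return (f"python: {sys.executable}\n"
            f"virtual environment: {in_venv}\n"
            f"{module_name}: {location}")


print(describe_environment("requests"))
```

If the interpreter path is not the one inside your virtual environment, the environment was not activated for this run.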

PermissionError: [Errno 13] Permission denied

Your script does not have permission to read or write a file. On Linux/Mac, check ownership with ls -la and fix with chmod or chown. On Windows, run your terminal as Administrator or check that the file is not open in another program. Never run automation scripts as root/Administrator unless absolutely necessary.
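A script can also check up front whether a path looks writable and fail with a clear message instead of a traceback halfway through a job. The sketch below relies on `os.access`, which is only advisory (the answer can change between the check and the write), so the write itself should still be wrapped in `try/except PermissionError`. The helper name `can_write` is an assumption for illustration.

```python
import os
from pathlib import Path


def can_write(path: str) -> bool:
    """Best-effort check that the current user can create or overwrite `path`."""
    target = Path(path)
    if target.exists():
        return os.access(target, os.W_OK)
    # For a new file, the parent directory must be writable and searchable
    return os.access(target.parent, os.W_OK | os.X_OK)
```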

requests.exceptions.SSLError: certificate verify failed

The SSL certificate of the target site cannot be verified. Do not use verify=False as a permanent solution as it leaves you open to man-in-the-middle attacks. Instead, update your CA certificates: pip install --upgrade certifi. If the site uses a self-signed certificate in a controlled environment, pass the cert path: requests.get(url, verify='/path/to/cert.pem').
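To keep `verify=False` out of a codebase entirely, one option is a helper that only ever returns a CA bundle path or `True`. This is a sketch under assumptions: the function name `resolve_verify` and the environment variable `INTERNAL_CA_BUNDLE` are hypothetical. Separately, requests itself honors the `REQUESTS_CA_BUNDLE` environment variable.

```python
import os
from pathlib import Path
from typing import Optional, Union


def resolve_verify(ca_bundle: Optional[str] = None) -> Union[str, bool]:
    """Return a value for requests' `verify=`: a CA bundle path, or True for the defaults."""
    # INTERNAL_CA_BUNDLE is a made-up variable name for this example
    candidate = ca_bundle or os.environ.get("INTERNAL_CA_BUNDLE")
    if candidate is None:
        return True  # use the default trusted CA store
    if not Path(candidate).is_file():
        # Fail loudly rather than silently disabling verification
        raise FileNotFoundError(f"CA bundle not found: {candidate}")
    return candidate
```

Usage would look like `requests.get(url, verify=resolve_verify(), timeout=10)`.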

Script works manually but fails when scheduled via cron

Cron jobs run with a minimal environment. Your script likely relies on environment variables or PATH entries that are not available in cron's context. Fix this by adding full paths to executables, loading your .env file explicitly in the script, or running your script with the full Python path: /home/user/env/bin/python /home/user/scripts/myscript.py.
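Loading the .env file explicitly can be done in a few lines without extra dependencies. The sketch below handles only simple `KEY=VALUE` lines; the function name `load_env_file` is hypothetical, and a dedicated library such as python-dotenv covers more edge cases. In a real script the path would typically be built from the script's own location, for example `Path(__file__).resolve().parent / ".env"`, so it does not depend on cron's working directory.

```python
import os
from pathlib import Path


def load_env_file(env_path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines and add them to os.environ.

    Variables that are already set in the environment are left untouched.
    """
    loaded = {}
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and anything that is not an assignment
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        loaded[key] = value
        os.environ.setdefault(key, value)
    return loaded
```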

UnicodeDecodeError when reading files

The file encoding does not match what Python expects. Always specify the encoding explicitly: open(file, encoding='utf-8'). If you do not know the encoding, use the chardet library to detect it: pip install chardet, then chardet.detect(open(file, 'rb').read()) to identify the encoding before reading.
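If adding chardet is not an option, a simpler fallback is to try a short list of likely encodings in order. This is a swapped-in technique, not what chardet does: it does not detect anything statistically, it just takes the first encoding that decodes without error. The function name `read_text_with_fallback` and the default encoding list are assumptions to adapt to your data.

```python
from pathlib import Path


def read_text_with_fallback(
    path: str,
    encodings: tuple[str, ...] = ("utf-8-sig", "cp1252"),
) -> tuple[str, str]:
    """Return (text, encoding_used), trying each encoding in order."""
    data = Path(path).read_bytes()
    for encoding in encodings:
        try:
            # utf-8-sig decodes UTF-8 with or without a byte order mark
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so it never fails,
    # but the result may be wrong for other encodings
    return data.decode("latin-1"), "latin-1"
```

Returning the encoding alongside the text makes it easy to log which files needed a fallback.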

Written by
Jesús García

Passionate about technology and personal finance. I write about innovation, artificial intelligence, investing, and strategies for improving your finances. My goal is to make complex topics accessible to everyone.
