Universal Python Unit Test Generator AI Agent
About
An AI-powered Python test generation agent that analyzes codebases, generates contextual unit tests, validates them, and auto-heals failures across multiple LLM backends.
Overview
Developed a Universal Python Unit Test Generator AI Agent that automatically analyzes Python codebases and generates comprehensive, context-aware unit tests. The system supports multiple LLM backends and adapts to different project types such as web applications, machine learning projects, data science workflows, scripts, and libraries.
The agent is designed to reduce manual testing effort, improve code reliability, and speed up test creation for Python projects ranging from small scripts to large codebases.
Objectives
- Automate unit test generation for Python codebases
- Support multiple AI providers and local LLM execution
- Generate tests based on project type and code structure
- Validate and auto-heal generated tests
- Improve developer productivity and test coverage
Tech Stack
- Language: Python
- Testing Frameworks: pytest, unittest
- AI Backends: OpenAI GPT, Google Gemini, Anthropic Claude, Ollama
- Code Analysis: AST-based parsing
- Async Processing: Python async workflows
- Tooling: CLI, environment variables, caching, logging
Key Features
- Multi-LLM backend support with OpenAI, Gemini, Claude, and Ollama
- AST-based code analysis for deep understanding of modules, functions, and classes
- Automatic project type detection for web, ML, data science, scripts, and libraries
- Context-aware unit test generation with positive cases, edge cases, and error handling
- Framework-specific test patterns for Flask, Django, FastAPI, TensorFlow, scikit-learn, and more
- Syntax validation and import resolution before test creation
- Auto-healing mechanism to fix common test generation issues
- Asynchronous processing for large codebases
- Intelligent caching to reduce repeated API calls
- Exponential backoff and retry handling for LLM API failures
- CLI support with interactive and direct execution modes
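The syntax validation and import resolution step listed above can be illustrated with a minimal sketch built on the standard library. The function names `check_syntax` and `unresolved_imports` are hypothetical and are not taken from the project; the sketch only checks that a generated test file parses and that its absolute top-level imports can be located in the current environment.

```python
import ast
import importlib.util
from typing import List, Optional


def check_syntax(source: str) -> Optional[str]:
    """Return None if the source parses, otherwise a short error description."""
    try:
        ast.parse(source)
    except SyntaxError as exc:
        return f"line {exc.lineno}: {exc.msg}"
    return None


def unresolved_imports(source: str) -> List[str]:
    """List top-level modules imported by the source that cannot be located."""
    missing: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]  # relative imports (level > 0) are skipped
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            try:
                found = importlib.util.find_spec(root) is not None
            except (ImportError, ValueError):
                found = False
            if not found and root not in missing:
                missing.append(root)
    return missing
```

A generated test would be accepted only when `check_syntax` returns `None` and `unresolved_imports` returns an empty list; anything else could be handed to the auto-healing step.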
Architecture / Design
Designed the system as a modular AI agent pipeline:
Codebase Input → AST Analysis → Project Type Detection → Dependency Mapping → LLM Prompt Generation → Test Generation → Validation → Auto-Healing → Test Output
The architecture supports interchangeable LLM backends, allowing developers to choose cloud-based models or local Ollama models depending on privacy, cost, and performance requirements.
Implementation
- Implemented AST-based parsing to extract functions, classes, imports, and module metadata
- Built a strategy-based test generation flow based on detected project type
- Integrated multiple LLM providers through a unified backend interface
- Added validation layers for syntax checking, import resolution, and execution readiness
- Implemented caching that reduces duplicate LLM requests by 60–80%
- Added retry logic with exponential backoff and configurable timeouts to handle LLM API failures
- Built CLI options for model selection, repository path, framework choice, dry-run mode, and verbose logging
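The AST parsing and project type detection steps above can be sketched with the standard `ast` module. This is an illustrative sketch only: `ModuleSummary`, `summarize`, `detect_project_type`, and the `PROJECT_MARKERS` mapping are hypothetical and much simpler than a real detector would need to be.

```python
import ast
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class ModuleSummary:
    functions: List[str] = field(default_factory=list)
    classes: List[str] = field(default_factory=list)
    imports: List[str] = field(default_factory=list)


def summarize(source: str) -> ModuleSummary:
    """Collect top-level functions, classes, and imports from a module."""
    summary = ModuleSummary()
    for node in ast.parse(source).body:  # top-level statements only
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            summary.functions.append(node.name)
        elif isinstance(node, ast.ClassDef):
            summary.classes.append(node.name)
        elif isinstance(node, ast.Import):
            summary.imports.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            summary.imports.append(node.module or ".")
    return summary


# Hypothetical mapping from marker imports to project types.
PROJECT_MARKERS = {
    "web": {"flask", "django", "fastapi"},
    "ml": {"tensorflow", "sklearn"},
}


def detect_project_type(imports: Iterable[str]) -> str:
    """Guess a project type from imported package names."""
    roots = {name.split(".")[0] for name in imports}
    for project_type, markers in PROJECT_MARKERS.items():
        if roots & markers:
            return project_type
    return "unknown"
```

The detected type could then select a generation strategy, for example a prompt template that asks for Flask test-client fixtures when the type is `web`.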
Outcomes
- Reduced manual unit test writing effort by 80% or more
- Supported Python projects ranging from small scripts to enterprise-scale codebases
- Generated tests for positive flows, edge cases, and exception scenarios
- Improved developer productivity by automating repetitive testing workflows
- Enabled privacy-friendly test generation using local Ollama models
Performance Highlights
- Small projects: ~30 seconds test generation time
- Medium projects: ~2 minutes average generation time
- Large projects: ~8 minutes average generation time
- API caching reduces repeated LLM calls by 60–80%
- Supports concurrent processing for faster test generation
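The caching behavior above can be illustrated with a small in-memory cache keyed by a hash of the model name and prompt. This is a sketch under stated assumptions: `ResponseCache` and `get_or_call` are hypothetical names, and a real cache would likely persist entries to disk and expire them.

```python
import hashlib
import json
from typing import Callable, Dict


class ResponseCache:
    """In-memory cache keyed by a hash of model name and prompt."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_call(self, model: str, prompt: str, call: Callable[[str], str]) -> str:
        """Return the cached response, or invoke `call` once and store the result."""
        key = self._key(model, prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = call(prompt)
        self._store[key] = result
        return result
```

Because the model name is part of the key, switching backends does not return a response that was produced by a different model.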
Impact
This project demonstrates how AI agents can assist developers in writing reliable and maintainable software. By combining code analysis, LLM reasoning, validation, and auto-healing, the system creates a practical developer productivity tool for modern Python teams.
The project showcases:
- AI-assisted software engineering
- Developer tooling
- Test automation
- Multi-LLM orchestration
- Code intelligence and static analysis
Scalability & Reliability
- Async processing for concurrent file handling
- Configurable retry and timeout policies
- Local LLM support for private codebases
- Dry-run mode for safe validation
- Exclusion patterns for large repositories
- Logging and diagnostics for troubleshooting
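The concurrency and retry policies above can be sketched with `asyncio`. The helper names `with_retries` and `process_files`, the default values, and the choice of `ConnectionError` as the retryable error are all assumptions made for illustration; a real client would catch its provider's own exception types.

```python
import asyncio
import random
from typing import Awaitable, Callable, List, TypeVar

T = TypeVar("T")


async def with_retries(
    call: Callable[[], Awaitable[T]],
    *,
    attempts: int = 3,
    base_delay: float = 0.5,
    timeout: float = 30.0,
) -> T:
    """Await call() with a per-attempt timeout and exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return await asyncio.wait_for(call(), timeout=timeout)
        except (asyncio.TimeoutError, ConnectionError):
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt; jitter spreads out concurrent retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            await asyncio.sleep(delay)
    raise RuntimeError("unreachable")


async def process_files(
    paths: List[str],
    worker: Callable[[str], Awaitable[T]],
    *,
    max_concurrency: int = 4,
) -> List[T]:
    """Run worker over all paths, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(path: str) -> T:
        async with semaphore:
            return await worker(path)

    return await asyncio.gather(*(bounded(p) for p in paths))
```

The semaphore caps how many files are in flight at once, which keeps a large repository from exhausting provider rate limits, and `asyncio.gather` returns results in the same order as the input paths.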
Key Learnings
- Building AI agents for software engineering workflows
- Designing multi-provider LLM abstractions
- Static analysis using Python AST
- Test validation and auto-healing strategies
- Developer experience design through CLI tooling
Role
- AI Tooling / Python Developer