Testing Architecture & Infrastructure¶
This document outlines AICO's testing strategy, architecture, and infrastructure for the polyglot monorepo.
Overview¶
AICO uses a comprehensive testing approach across all subsystems (/cli, /backend, /frontend, /studio) with consistent patterns and tooling. The testing strategy emphasizes reliability, maintainability, and developer productivity.
Testing Philosophy¶
- Quality First: Tests are first-class citizens, not afterthoughts
- Fast Feedback: Quick test execution for rapid development cycles
- Comprehensive Coverage: Unit, integration, and end-to-end testing
- Polyglot Consistency: Consistent patterns across different technologies
- Real-World Validation: Tests use actual AICO configuration and data
Directory Structure¶
Each subsystem follows a consistent testing structure that mirrors the source code organization:
/{subsystem}/tests/
├── unit/                     # Unit tests (mirror source structure)
│   ├── api/
│   ├── services/
│   └── models/
├── integration/              # Integration tests
│   ├── test_auth_cycle.py
│   └── test_api_endpoints.py
├── fixtures/                 # Test data and utilities
│   ├── mock_data.py
│   └── test_config.py
└── conftest.py               # Test configuration
Unit Tests mirror the source code structure, making it easy to find and maintain tests alongside the code they validate.
Integration Tests focus on component interactions and full workflow validation, including authentication flows and API endpoint testing.
Test Support includes reusable fixtures, mock objects, and shared configuration that reduces duplication across test suites.
Test Categories¶
Unit Tests¶
- Purpose: Test individual components in isolation
- Scope: Functions, classes, methods
- Characteristics: Fast, isolated, mocked dependencies
- Tools: pytest (Python), Jest (React), Flutter test framework
Integration Tests¶
- Purpose: Test component interactions
- Scope: API endpoints, database operations, service integration
- Characteristics: Real dependencies, database transactions
- Current Examples: Authentication flows, session management
End-to-End Tests¶
- Purpose: Test complete user workflows
- Scope: Full system interactions across subsystems
- Characteristics: Real environment, user scenarios
- Future: Cross-subsystem workflows
Performance Tests¶
- Purpose: Validate system performance characteristics
- Scope: Load testing, stress testing, benchmarks
- Tools: pytest-benchmark, Artillery, Flutter performance tests
Current Implementation¶
Backend Integration Tests¶
The backend currently implements comprehensive integration tests that validate critical authentication and session management functionality:
# Run from project root
backend/.venv/Scripts/python.exe backend/tests/integration/test_auth_dependency.py
backend/.venv/Scripts/python.exe backend/tests/integration/test_full_auth_cycle.py
backend/.venv/Scripts/python.exe backend/tests/integration/test_session_endpoints.py
Authentication Dependency Testing validates JWT token generation, validation, and integration between authentication services and the API gateway.
Full Authentication Lifecycle Testing provides end-to-end validation covering user login, token generation, protected resource access, session management, and proper logout/token revocation for both REST and WebSocket protocols.
Session Endpoint Testing focuses on session creation, validation, renewal, and cleanup, including edge cases like concurrent sessions and timeouts.
Key Features:
- Real AICO configuration and encrypted database (not mocked)
- Multi-protocol coverage (REST + WebSocket)
- Complete token lifecycle verification
- Database session auditing and cleanup validation
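For illustration, the shape of such a lifecycle test can be sketched with FastAPI's TestClient. Everything in this sketch is hypothetical: the app import path, endpoint routes, and payload fields are placeholders, not AICO's actual API.
# A hedged sketch only: app import path, routes, and payloads are hypothetical.
from fastapi.testclient import TestClient
from backend.main import app  # hypothetical import path

def test_full_auth_cycle():
    client = TestClient(app)

    # Log in and obtain a token
    login = client.post(
        "/api/v1/auth/login",
        json={"username": "testuser", "password": "secret"},
    )
    assert login.status_code == 200
    headers = {"Authorization": f"Bearer {login.json()['access_token']}"}

    # The token grants access to a protected resource
    assert client.get("/api/v1/me", headers=headers).status_code == 200

    # Logout revokes the token; subsequent access must be rejected
    assert client.post("/api/v1/auth/logout", headers=headers).status_code == 200
    assert client.get("/api/v1/me", headers=headers).status_code == 401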
Future Testing Infrastructure¶
Test Framework Strategy¶
The future testing infrastructure will leverage industry-standard frameworks tailored to each subsystem's technology stack, ensuring optimal developer experience and robust test execution.
Backend Testing Framework¶
The backend will use pytest with configuration emphasizing strict test discovery and comprehensive coverage:
# pyproject.toml
[tool.pytest.ini_options]
testpaths = ["tests"]
addopts = ["--strict-markers", "--cov=backend", "--cov-report=html"]
markers = [
    "unit: Unit tests",
    "integration: Integration tests",
    "auth: Authentication tests"
]
This enables developers to run specific test subsets (pytest -m unit) during development while ensuring comprehensive validation in CI/CD.
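Individual tests opt into these subsets by declaring a marker. A minimal sketch (the module path and test body are hypothetical):
# backend/tests/unit/test_example.py  (hypothetical module)
import pytest

@pytest.mark.unit
def test_username_is_normalized():
    assert "TestUser".lower() == "testuser"
Running pytest -m unit then collects only the tests that carry the unit marker.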
Frontend Testing Framework¶
Flutter's built-in testing framework provides widget testing and integration with Dart's ecosystem:
// test/flutter_test_config.dart
import 'dart:async';
import 'package:flutter_test/flutter_test.dart';

Future<void> testExecutable(FutureOr<void> Function() testMain) async {
  setUpAll(() {
    // Global test setup (e.g., shared mocks or bindings)
  });
  await testMain();
}
Studio Testing Framework¶
Jest with React Testing Library for component and integration testing:
{
  "scripts": {
    "test": "jest",
    "test:watch": "jest --watch"
  },
  "jest": {
    "testEnvironment": "jsdom"
  }
}
Test Data Management Strategy¶
Effective test data management is crucial for maintaining reliable, fast, and maintainable tests across the AICO ecosystem. The strategy balances realism with test isolation and performance.
Fixture-Based Data Management¶
Reusable test fixtures organized by domain reduce duplication and ensure consistency:
# backend/tests/fixtures/auth_fixtures.py
import pytest
# User and get_connection come from the application code

@pytest.fixture
def mock_user():
    return User(user_uuid="test-uuid", username="testuser", roles=["admin"])

@pytest.fixture
def test_db():
    conn = get_connection(":memory:")  # In-memory for speed
    yield conn
    conn.close()
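Once registered (typically via conftest.py), fixtures are injected by parameter name. A minimal usage sketch, assuming the connection returned by get_connection exposes a DB-API-style execute():
def test_mock_user_has_admin_role(mock_user):
    assert "admin" in mock_user.roles

def test_db_accepts_queries(test_db):
    # Assumes a DB-API-style connection with execute()
    assert test_db.execute("SELECT 1").fetchone() is not None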
Database Testing Strategy¶
Unit Tests use in-memory SQLite for maximum speed and isolation.
Integration Tests use dedicated test database instances with controlled test data that mirrors production schema.
The infrastructure includes automatic schema setup/teardown, transaction-based isolation, and utilities for creating realistic test data.
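As a sketch of transaction-based isolation using only the standard sqlite3 module (the fixture name and schema are illustrative, not AICO's actual schema):
import sqlite3
import pytest

@pytest.fixture
def isolated_db(tmp_path):
    # Manual transaction control: autocommit on, explicit BEGIN/ROLLBACK
    conn = sqlite3.connect(str(tmp_path / "test.db"), isolation_level=None)
    conn.execute("CREATE TABLE IF NOT EXISTS sessions (id TEXT PRIMARY KEY)")
    conn.execute("BEGIN")
    yield conn  # the test runs inside the open transaction
    conn.execute("ROLLBACK")  # discard everything the test wrote
    conn.close()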
Mock and Stub Management¶
External dependencies are mocked to balance test isolation with realistic behavior simulation, including external APIs, file operations, and network communications.
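A minimal sketch using the standard library's unittest.mock; the patch target is a hypothetical module path, not a real AICO import:
from unittest.mock import patch

def test_health_check_with_mocked_upstream():
    # patch() swaps the real function for a MagicMock inside the with-block
    with patch("backend.services.external_client.fetch_status",
               return_value={"status": "ok"}) as fake_fetch:
        status = fake_fetch()["status"]  # stands in for code under test
        assert status == "ok"
        fake_fetch.assert_called_once()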
CI/CD Integration Strategy¶
The continuous integration and deployment pipeline will provide automated testing across all subsystems, ensuring that changes are validated before integration and deployment. The strategy emphasizes parallel execution, comprehensive coverage, and fast feedback.
Multi-Subsystem Testing Pipeline¶
The CI/CD pipeline will run tests for all four subsystems (CLI, Backend, Frontend, Studio) in parallel, maximizing efficiency and providing rapid feedback to developers. Each subsystem will have its own optimized testing environment with appropriate language runtimes and dependencies.
Quality Gates and Coverage¶
The pipeline will enforce quality gates including minimum code coverage thresholds, successful test execution, and performance benchmarks. Coverage reports will be automatically generated and tracked over time, providing visibility into testing effectiveness and identifying areas that need additional test coverage.
Test Result Integration¶
Test results will be integrated with the development workflow through pull request status checks, coverage reporting services, and notification systems. This ensures that test failures are immediately visible to developers and that code quality standards are maintained.
Environment Management¶
The CI/CD system will manage multiple testing environments, including isolated database instances, mock external services, and realistic configuration scenarios. This approach ensures that tests run in consistent, predictable environments while avoiding interference between parallel test runs.
Testing Best Practices¶
General Principles¶
- AAA Pattern: Arrange, Act, Assert (sketched after this list)
- Single Responsibility: One test, one concern
- Descriptive Names: Test names describe behavior
- Independent Tests: No test dependencies
- Fast Execution: Quick feedback loops
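A minimal AAA-shaped example; all helper names are hypothetical placeholders:
def test_issued_token_round_trips():
    # Arrange: build the input state
    user = make_test_user()  # hypothetical factory
    # Act: perform exactly one behavior
    token = issue_token(user)  # hypothetical service call
    # Assert: verify the observable outcome
    assert validate_token(token).user_uuid == user.user_uuid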
Authentication Testing¶
- Real Credentials: Use actual user credentials for integration tests
- Session Lifecycle: Test complete authentication flows
- Security Validation: Verify token revocation and session cleanup
- Protocol Coverage: Test both REST and WebSocket authentication
Database Testing¶
- Transaction Isolation: Each test in its own transaction
- Cleanup: Proper test data cleanup
- Audit Trails: Verify session and audit logging
- Migration Testing: Test schema migrations
API Testing¶
- Contract Testing: Verify API contracts
- Error Handling: Test error conditions
- Authorization: Test role-based access control
- Rate Limiting: Test API limits and throttling (see the sketch below)
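For example, a burst of requests should eventually trip the limiter. A hedged sketch, assuming a client fixture, a hypothetical endpoint, and an unspecified limit below 50 requests:
def test_burst_hits_rate_limit(client):
    responses = [client.get("/api/v1/ping") for _ in range(50)]
    assert any(r.status_code == 429 for r in responses)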
Performance Considerations¶
Test Execution Speed¶
- Parallel Execution: Run tests in parallel where possible
- Test Categorization: Separate fast and slow tests
- Database Optimization: Use in-memory databases for unit tests
- Mock External Services: Avoid network calls in unit tests
Resource Management¶
- Memory Usage: Monitor test memory consumption
- Database Connections: Proper connection pooling and cleanup
- File Handles: Clean up temporary files and resources
- Process Isolation: Avoid test interference
Monitoring and Reporting¶
Test Metrics¶
- Coverage Reports: Code coverage tracking
- Test Duration: Monitor test execution times
- Flaky Tests: Identify and fix unreliable tests
- Failure Analysis: Root cause analysis of test failures
Quality Gates¶
- Minimum Coverage: Enforce coverage thresholds
- Test Success Rate: Require passing tests for deployment
- Performance Benchmarks: Maintain performance standards
- Security Scans: Automated security testing
Future Enhancements¶
Planned Improvements¶
- Complete pytest framework setup for backend
- Flutter test framework configuration
- React/Jest setup for studio
- CLI test framework (Python/pytest)
- Cross-subsystem end-to-end tests
- Performance testing infrastructure
- Test data management system
- Automated test generation tools
Advanced Features¶
- Property-based testing with Hypothesis
- Mutation testing for test quality
- Visual regression testing for UI
- Load testing with realistic scenarios
- Chaos engineering tests
- Security penetration testing automation
Code Coverage¶
Coverage Strategy¶
AICO uses a unified coverage approach across all subsystems using the LCOV format for consistency and cross-platform compatibility.
Coverage Data Storage¶
Each subsystem stores coverage data in its own coverage/ directory following the idiomatic approach:
backend/coverage/ # Python coverage data
cli/coverage/ # CLI coverage data
frontend/coverage/ # Flutter coverage data
studio/coverage/ # React coverage data
Important: All coverage/ directories should be added to .gitignore, as coverage data is generated locally and should not be committed to version control.
Coverage Generation by Subsystem¶
Flutter (Frontend)¶
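# Generate coverage with Flutter's built-in test runner
flutter test --coverage
# Output: frontend/coverage/lcov.info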
Python (Backend/CLI)¶
# Generate coverage with pytest-cov, writing LCOV data to coverage/lcov.info
pytest --cov --cov-report=lcov:coverage/lcov.info
# Output: backend/coverage/lcov.info or cli/coverage/lcov.info
React (Studio)¶
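# Generate coverage with Jest (lcov is among Jest's default reporters)
npm test -- --coverage
# Output: studio/coverage/lcov.info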
HTML Coverage Reports¶
To generate human-readable HTML coverage reports from LCOV data:
Prerequisites¶
Install the cross-platform LCOV viewer:
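# Assumes the @lcov-viewer/cli npm package, which provides the
# lcov-viewer command used below
npm install -g @lcov-viewer/cli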
Generate HTML Reports¶
# From any subsystem directory
lcov-viewer lcov coverage/lcov.info --output coverage/html
# Examples:
# Frontend
cd frontend && lcov-viewer lcov coverage/lcov.info --output coverage/html
# Backend
cd backend && lcov-viewer lcov coverage/lcov.info --output coverage/html
# Studio
cd studio && lcov-viewer lcov coverage/lcov.info --output coverage/html
Viewing Reports¶
Open the generated HTML report in your browser:
# Windows
start coverage/html/index.html
# macOS
open coverage/html/index.html
# Linux
xdg-open coverage/html/index.html
Coverage Workflow¶
- Run tests with coverage: Use subsystem-specific commands to generate lcov.info
- Generate HTML report: Use lcov-viewer to convert LCOV data to HTML
- Review coverage: Open the HTML report in a browser to analyze coverage gaps
- Iterate: Add tests for uncovered code and repeat
Known Coverage Limitations¶
Coverage instrumentation has known limitations across different platforms and frameworks:
Flutter Coverage Limitations¶
- Static factory methods: Coverage tracking may be incomplete for static methods in factory classes
- Stream operations: Complex stream subscriptions can interfere with coverage collection timing
- Widget constructors: Some widget initialization code may not be accurately tracked
General Coverage Considerations¶
- Focus on functionality: Prioritize comprehensive functional testing over achieving 100% coverage metrics
- Quality over quantity: Well-designed tests that verify behavior are more valuable than tests written solely for coverage
- Known gaps: Document and accept coverage gaps for code patterns that are difficult to instrument but well-tested functionally
Running Tests¶
Current Test Execution¶
Current integration tests use direct Python execution with detailed output:
# Backend integration tests
backend/.venv/Scripts/python.exe backend/tests/integration/test_full_auth_cycle.py
backend/.venv/Scripts/python.exe backend/tests/integration/test_auth_dependency.py
backend/.venv/Scripts/python.exe backend/tests/integration/test_session_endpoints.py
Each test provides comprehensive output including database session analysis and detailed failure reporting.
Future Framework-Based Execution¶
Streamlined execution through framework-specific commands:
# Backend
pytest tests/ # All tests
pytest tests/unit/ # Unit tests only
pytest -m auth # Authentication tests
# Frontend
flutter test # All tests
flutter test --coverage # With coverage
# Studio
npm test # All tests
npm test -- --coverage # With coverage
# All subsystems
make test-all
This approach supports both targeted execution during development and comprehensive validation in CI/CD pipelines.
Conclusion¶
AICO's testing architecture provides a solid foundation for reliable software development across the polyglot monorepo. The current integration tests validate critical authentication and session management functionality, while the planned infrastructure will support comprehensive testing at all levels.
The emphasis on real-world validation, consistent patterns, and developer productivity ensures that testing remains an enabler rather than a bottleneck in the development process.