Contributor Onboarding Guide

Welcome to the Codex repository! This guide will help you get started as a contributor, whether you're a human developer or an AI Agent.

Table of Contents

  1. Quick Start
  2. Repository Overview
  3. Development Setup
  4. Understanding the Codebase
  5. Making Your First Contribution
  6. Testing Guidelines
  7. Code Review Process
  8. AI Agent Integration
  9. Common Workflows
  10. Getting Help

Quick Start

For Human Contributors

# 1. Clone the repository
git clone https://github.com/Aries-Serpent/_codex_.git
cd _codex_

# 2. Set up Python environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# 3. Install dependencies
pip install -e .
pip install -r requirements-dev.txt

# 4. Run tests to verify setup
pytest tests/ -v

# 5. Start exploring!
python -m codex.cli --help

For AI Agents

1. Read AGENTS.md for comprehensive agent guidance
2. Review agents/prompts/ for pre-defined workflows
3. Use workflow navigator for common tasks
4. Follow this onboarding guide for contribution patterns

Repository Overview

Purpose

Codex is a Level 4 MLOps production system with:

  • Audit Pipeline v1.5.5: Deterministic capability tracking
  • 1,208+ tests: Comprehensive test coverage
  • 693 documentation files: Extensive documentation
  • AI-First Design: Built for AI Assistant/Agent intuitiveness

Key Statistics

  • MLOps Maturity: Level 4 Certified (100/100 score)
  • Test Coverage: 72%
  • Capabilities Tracked: 39 (18/18 critical at maturity)
  • Python Version: 3.9-3.12 supported
  • Repository Size: ~200,000 lines of code

Architecture

_codex_/
├── src/codex_ml/          # Core ML code
├── training/              # Training pipelines
├── scripts/               # Utility scripts
├── tests/                 # Test suite
├── docs/                  # Documentation
├── agents/                # AI Agent infrastructure
├── .github/workflows/     # CI/CD pipelines
└── requirements*.txt      # Dependencies

Development Setup

Prerequisites

  • Python 3.9+ (3.12 recommended)
  • Git
  • pip and virtualenv
  • (Optional) Docker for containerized development

Detailed Setup

1. Environment Variables

Create a .env file based on .env.example:

cp .env.example .env
# Edit .env with your configuration

Common variables:

  • CODEX_ENV_PYTHON_VERSION: Python version (default: detected)
  • CODEX_SESSION_ID: Session identifier
  • CODEX_LOG_DB_PATH: SQLite database path
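As a sketch of how these variables might be read from the process environment, with illustrative fallback defaults (the helper name and default values are assumptions, not repo code):

```python
import os

def load_codex_env() -> dict:
    """Read Codex settings from the environment, with illustrative defaults.

    Variable names match the ones listed above; the defaults here are
    placeholders — consult .env.example for the real ones.
    """
    return {
        "python_version": os.environ.get("CODEX_ENV_PYTHON_VERSION", "3.12"),
        "session_id": os.environ.get("CODEX_SESSION_ID", "local-dev"),
        "log_db_path": os.environ.get("CODEX_LOG_DB_PATH", ".codex/logs.db"),
    }

if __name__ == "__main__":
    print(load_codex_env())
```

Values set in your shell or `.env` override the defaults, so the same code runs unchanged in local and CI environments.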

2. Install Development Tools

# Install pre-commit hooks
pip install pre-commit
pre-commit install

# Install linters
pip install ruff black isort mypy

# Install testing tools
pip install pytest pytest-cov hypothesis

3. Verify Installation

# Check Python installation
python --version

# Check dependencies
pip list | grep -E "pytest|ruff|black"

# Run sample test
pytest tests/test_tokenization.py -v

4. IDE Setup

VS Code:

{
  "python.linting.enabled": true,
  "python.linting.ruffEnabled": true,
  "python.formatting.provider": "black",
  "python.testing.pytestEnabled": true
}

PyCharm:

  • Enable Ruff in Settings → Tools → External Tools
  • Set test runner to pytest
  • Enable type checking with mypy


Understanding the Codebase

Core Concepts

1. Deterministic Audit Pipeline

Located in scripts/space_traversal/:

  • audit_runner.py: Main audit orchestration
  • trend_database.py: SQLite-based trend storage
  • viz_*.py: Multiple visualization formats

2. Capability System

Capabilities are tracked and scored:

  • Config: codex_capability_map.yaml
  • Scores: audit_artifacts/capabilities_scored.json
  • Trends: .codex/trends/trends.db
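A sketch of how the scored-capabilities artifact might be consumed from Python. The "name"/"score" schema is an assumption mirroring the jq filter shown later in this guide — verify it against the real file before relying on it:

```python
import json

def low_scoring(capabilities: list[dict], threshold: float = 0.85) -> list[str]:
    """Names of capabilities scoring below `threshold`.

    Assumes entries with "name" and "score" keys; check the actual
    audit_artifacts/capabilities_scored.json schema.
    """
    return [c["name"] for c in capabilities if c["score"] < threshold]

def load_scores(path: str = "audit_artifacts/capabilities_scored.json") -> list[dict]:
    """Load the scored-capabilities artifact produced by the audit runner."""
    with open(path) as f:
        return json.load(f)

# Inline sample so the sketch runs without the artifact present:
sample = [
    {"name": "tokenization", "score": 0.92},
    {"name": "checkpointing", "score": 0.71},
]
print(low_scoring(sample))  # ['checkpointing']
```

Swap the inline sample for `load_scores()` once you have run the audit pipeline locally.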

3. Agent Infrastructure

Located in agents/:

  • workflow_navigator.py: Tokenized workflow execution
  • prompts/: Pre-defined prompts for common tasks
  • ORCHESTRATION.md: Physics-inspired decision-making

4. Testing Philosophy

  • Comprehensive: 1,208+ tests
  • Fast: Most tests run in < 1 second
  • Isolated: Each test is independent
  • Deterministic: No flaky tests

Code Organization

Key Files:
- AGENTS.md                 # Main agent guide
- COMPREHENSIVE_GAP_ANALYSIS.md  # Current gaps and priorities
- codex_gap_registry.yaml   # Known gaps tracking

Core Modules:
- src/codex_ml/training/    # Training pipelines
- src/codex_ml/evaluation/  # Evaluation metrics
- src/codex_ml/connectors/  # Storage connectors
- src/codex_ml/plugins/     # Plugin system

Scripts:
- scripts/space_traversal/  # Audit pipeline
- scripts/archive_files.py  # Archival automation
- scripts/dependency_analyzer.py  # Dependency analysis

Tests:
- tests/                    # Main test suite
- tests/capabilities/       # Capability-specific tests
- tests/space_traversal/    # Audit pipeline tests

Making Your First Contribution

Step-by-Step Guide

1. Find an Issue

Good first issues:

  • Look for the good-first-issue label
  • Check COMPREHENSIVE_GAP_ANALYSIS.md for priorities
  • Review codex_gap_registry.yaml for known gaps

2. Create a Branch

# Create feature branch
git checkout -b feature/your-feature-name

# Or bugfix branch
git checkout -b fix/bug-description

3. Make Changes

  • Follow existing code style
  • Add tests for new functionality
  • Update documentation if needed
  • Keep commits small and focused

4. Run Tests and Linters

# Format code
black src/ scripts/ tests/
isort src/ scripts/ tests/

# Lint code
ruff check src/ scripts/ tests/

# Run tests
pytest tests/ -v

# Type check (if applicable)
mypy src/codex_ml/

5. Commit Changes

# Stage changes
git add .

# Commit with descriptive message
git commit -m "feat: Add feature description

- Detail 1
- Detail 2"

# Push to your fork
git push origin feature/your-feature-name

6. Create Pull Request

  • Go to GitHub repository
  • Click "New Pull Request"
  • Fill in PR template
  • Link related issues
  • Wait for review

Commit Message Format

Use conventional commits:

type(scope): Short description

Longer description if needed

- Bullet point 1
- Bullet point 2

Fixes #issue-number

Types:

  • feat: New feature
  • fix: Bug fix
  • docs: Documentation changes
  • test: Test additions/modifications
  • refactor: Code refactoring
  • chore: Maintenance tasks
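The format above is mechanical enough to check in code. A minimal sketch of a subject-line validator (the regex and function are illustrative, not part of the repo's tooling):

```python
import re

# Types mirror the list above; the (scope) part is optional.
COMMIT_RE = re.compile(
    r"^(feat|fix|docs|test|refactor|chore)(\([\w-]+\))?: .+"
)

def is_conventional(subject: str) -> bool:
    """Return True if a commit subject line follows the format above."""
    return bool(COMMIT_RE.match(subject))

print(is_conventional("feat(audit): Add trend export"))  # True
print(is_conventional("updated stuff"))                  # False
```

A check like this could run from a pre-commit hook, though the repo's actual hook configuration should be treated as the source of truth.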


Testing Guidelines

Writing Tests

# tests/test_example.py
import pytest

def test_basic_functionality():
    """Test basic functionality."""
    result = function_under_test()
    assert result == expected_value

def test_error_handling():
    """Test error handling."""
    with pytest.raises(ValueError):
        function_with_error()

@pytest.fixture
def sample_data():
    """Provide sample data for tests."""
    return {"key": "value"}

def test_with_fixture(sample_data):
    """Test using fixture."""
    assert sample_data["key"] == "value"

Running Tests

# Run all tests
pytest tests/ -v

# Run specific file
pytest tests/test_example.py -v

# Run specific test
pytest tests/test_example.py::test_basic_functionality -v

# Run with coverage
pytest tests/ --cov=src/codex_ml --cov-report=term

# Run tests matching pattern
pytest -k "tokenization" -v

Test Categories

  • Unit tests: Test individual functions/classes
  • Integration tests: Test component interactions
  • Smoke tests: Quick sanity checks
  • Property-based tests: Hypothesis-driven tests
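The repository uses Hypothesis for property-based tests; the underlying idea — assert an invariant over many generated inputs instead of a few hand-picked cases — can be illustrated with plain seeded random data. The `encode`/`decode` pair here is purely illustrative, not a repo API:

```python
import random

def encode(text: str) -> list[int]:
    """Toy encoder: map each character to its code point."""
    return [ord(c) for c in text]

def decode(codes: list[int]) -> str:
    """Toy decoder: inverse of encode."""
    return "".join(chr(c) for c in codes)

# Property: decoding an encoding returns the original string.
# Seeded RNG keeps the test deterministic, matching the repo's philosophy.
rng = random.Random(0)
for _ in range(100):
    s = "".join(chr(rng.randrange(32, 127)) for _ in range(rng.randrange(0, 20)))
    assert decode(encode(s)) == s
print("round-trip property holds")
```

With Hypothesis, the loop is replaced by `@given(st.text())`, which also shrinks failing inputs to a minimal counterexample.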

Code Review Process

Review Checklist

For Authors:

- [ ] Tests pass locally
- [ ] Code is formatted and linted
- [ ] Documentation updated
- [ ] PR description is clear
- [ ] Commits are clean

For Reviewers:

- [ ] Code follows repository conventions
- [ ] Tests are comprehensive
- [ ] Changes are minimal and focused
- [ ] Documentation is accurate
- [ ] No security vulnerabilities

Responding to Feedback

  • Address all comments
  • Ask for clarification if needed
  • Make requested changes in new commits
  • Re-request review when ready

AI Agent Integration

For AI Agents

The repository is designed for AI Agent intuitiveness:

Workflow Navigator

from agents.workflow_navigator import WorkflowNavigator

navigator = WorkflowNavigator()

# Execute audit
navigator.execute('AUDIT_EXEC')

# Generate documentation
navigator.execute('DOC_GEN')

# Self-healing
navigator.execute('SELF_HEAL')

Pre-Defined Prompts

Located in agents/prompts/:

  • Audit: audit/run-full-audit.md
  • Debugging: debugging/test-failure-debugging.md
  • Organization: organization/repository-cleanup.md
  • Deployment: deployment/pre-release-deployment.md

Agent Guidelines

Read AGENTS.md for comprehensive guidance on:

  • Agent architecture
  • Workflow tokens
  • Physics-inspired orchestration
  • Mental mapping


Common Workflows

Workflow 1: Add New Test

# 1. Create test file
touch tests/test_new_feature.py

# 2. Write test
cat > tests/test_new_feature.py << 'EOF'
def test_new_feature():
    assert True
EOF

# 3. Run test
pytest tests/test_new_feature.py -v

# 4. Commit
git add tests/test_new_feature.py
git commit -m "test: Add tests for new feature"

Workflow 2: Fix Bug

# 1. Reproduce bug
pytest tests/test_failing.py -v

# 2. Fix bug in source
# Edit src/codex_ml/module.py

# 3. Verify fix
pytest tests/test_failing.py -v

# 4. Commit
git add src/codex_ml/module.py
git commit -m "fix: Fix issue with module"

Workflow 3: Run Audit Pipeline

# Full audit
python scripts/space_traversal/audit_runner.py run

# Generate dashboard
python scripts/generate_audit_dashboard.py

# Check results
cat audit_artifacts/capabilities_scored.json | jq '.[] | select(.score < 0.85)'

Workflow 4: Update Documentation

# Edit documentation
vim docs/my-doc.md

# Preview (if using mkdocs)
mkdocs serve

# Commit
git add docs/my-doc.md
git commit -m "docs: Update documentation for feature"

Getting Help

Resources

  • AGENTS.md: Comprehensive agent guide
  • CONTRIBUTING.md: Contribution guidelines
  • COMPREHENSIVE_GAP_ANALYSIS.md: Current priorities
  • agents/prompts/: Pre-defined workflows

Asking Questions

  1. Check existing documentation first
  2. Search closed issues/PRs
  3. Open a new issue with:
     • Clear problem description
     • Steps to reproduce
     • Expected vs actual behavior
     • Environment details

Debugging Tips

See agents/prompts/debugging/:

  • Test failure debugging
  • Merge conflict resolution
  • Performance optimization
  • Security remediation


Next Steps

Now that you're onboarded:

  1. Explore the codebase: Browse src/, scripts/, tests/
  2. Run the audit pipeline: python scripts/space_traversal/audit_runner.py run
  3. Read key documentation: AGENTS.md, COMPREHENSIVE_GAP_ANALYSIS.md
  4. Find your first issue: Check good-first-issue labels
  5. Join the community: Engage with other contributors

Suggested First Contributions

  • Add tests for uncovered code
  • Improve documentation
  • Fix small bugs
  • Enhance AI Agent prompts
  • Optimize performance bottlenecks

Conclusion

Welcome to the Codex project! We're excited to have you as a contributor. This repository is designed to be AI-friendly, well-documented, and maintainable.

Remember:

  • Start small
  • Ask questions
  • Follow conventions
  • Write tests
  • Have fun!

Happy Contributing! 🚀


Appendix: Useful Commands

# Development
python -m codex.cli --help
python scripts/space_traversal/audit_runner.py --help

# Testing
pytest tests/ -v
pytest -k "pattern" -v
pytest --collect-only

# Linting
ruff check .
black --check .
isort --check .

# Type Checking
mypy src/codex_ml/

# Git
git status
git log --oneline -10
git diff main..HEAD

# Dependencies
pip list
pip freeze > requirements.txt

Last Updated: 2025-12-11
Version: 1.0.0
Maintainer: Codex Team