Codebase Cognitive Map¶
Generated: 2026-01-23T08:42:00Z | Updated by: doc-freshness-checker agent | PR: #2960 | Branch: copilot/update-html-documentation-standards
🎯 Mission Overview¶
Objective: Provide a high-level cognitive map of the _codex_ repository including components, flows, dependencies, and operational context for AI agents and human contributors.
Energy Level: ⚡⚡⚡⚡ (4/5 - High Priority Reference Document)
Status: 🟢 Active
Last Updated: 2026-01-23T08:42:00Z | Version: 2.0.0 | Last Reviewed: 2026-01-23T08:42:00Z
Architecture Overview¶
Type: Modular ML/AI Platform with Agent Orchestration
MLOps Maturity: Level 4 (100/100 Azure MLOps) - Production Ready
Stats: 1500+ tests (100% passing), 72% coverage, 0 vulnerabilities
Repository Structure¶
_codex_/
├── src/                  # Core application code
│   ├── codex/            # Ingestion pipeline (ingest→analyze→transform→verify)
│   ├── rag/              # RAG pipelines & retrieval
│   ├── verification/     # Chain-of-Verification (CoVe)
│   ├── mcp/              # Model Context Protocol adapters
│   └── tools/            # Tool registry
├── agents/               # Autonomous agents (workflow, quantum, physics)
├── scripts/              # Automation & utilities
│   └── mcp/              # ChatGPT Project packaging system
├── tests/                # 1500+ test suite
├── docs/                 # Documentation hub
│   ├── mcp/              # MCP packaging docs (93+ KB)
│   ├── system/           # Cognitive brain (this file)
│   └── capabilities/     # Capability guides
└── .github/              # CI/CD workflows & automation
Core Components¶
1. Codex Ingestion Pipeline (src/codex/)¶
Purpose: Complete Python code processing system
Commands:
python -m codex.cli ingest <source> # Ingest code (file/ZIP/Git)
python -m codex.cli analyze <snapshot-id> # Static + runtime analysis
python -m codex.cli transform <snapshot-id> --tier A # Apply transformations
python -m codex.cli verify <snapshot-id> # Behavior verification
Flow: Source → Ingest → Analyze → Transform → Verify → PR
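The four stages can also be chained from a script. The following is a minimal sketch, assuming the caller already knows the snapshot ID (this document does not say how `ingest` reports it). `build_pipeline_commands` is a hypothetical helper: it only assembles the argument lists shown under Commands and does not run anything.

```python
import sys


def build_pipeline_commands(source, snapshot_id, tier="A"):
    """Return the argv lists for the four documented stages, in order.

    Hypothetical helper: only the subcommands and the --tier flag shown in
    this document are used, and nothing is executed here.
    """
    base = [sys.executable, "-m", "codex.cli"]
    return [
        base + ["ingest", source],
        base + ["analyze", snapshot_id],
        base + ["transform", snapshot_id, "--tier", tier],
        base + ["verify", snapshot_id],
    ]


# "project.zip" and "snap-001" are placeholder values for illustration.
commands = build_pipeline_commands("project.zip", "snap-001")
for argv in commands:
    print(" ".join(argv[1:]))
```

Each list could be handed to `subprocess.run(argv, check=True)` so that a failing stage stops the chain before the next one starts.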
2. Agent System (agents/)¶
Purpose: Autonomous AI agents with physics-inspired optimization
Key Agents:
- workflow_navigator.py - Tokenized workflows (AUDIT_EXEC, DOC_GEN)
- quantum_game_theory.py - Quantum-inspired decisions
- physics_orchestrator.py - 6 physics paradigms
- mental_mapping.py - Context tracking
Tokens: audit, decide, docs, organize, review, heal
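One way to picture tokenized workflows is a registry that maps each token to a handler. This is a sketch only: `TOKENS` reuses the six token names listed above, while `register`, `dispatch`, and the handler body are hypothetical and are not taken from workflow_navigator.py.

```python
from typing import Callable, Dict

# The six workflow tokens listed in this document.
TOKENS = ("audit", "decide", "docs", "organize", "review", "heal")

# Hypothetical registry: token name -> handler taking a request string.
_handlers: Dict[str, Callable[[str], str]] = {}


def register(token: str):
    """Decorator that binds a handler to one of the known tokens."""
    if token not in TOKENS:
        raise ValueError(f"unknown token: {token}")

    def wrap(func):
        _handlers[token] = func
        return func

    return wrap


def dispatch(token: str, request: str) -> str:
    """Route a request to the handler registered for the token."""
    try:
        handler = _handlers[token]
    except KeyError:
        raise LookupError(f"no handler registered for token: {token}") from None
    return handler(request)


@register("audit")
def run_audit(request: str) -> str:
    # Placeholder body; a real agent would perform the audit here.
    return f"audit:{request}"


result = dispatch("audit", "src/codex")
```

Keeping the token list closed means a typo in a token name fails at registration time instead of being silently ignored at dispatch time.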
3. MCP Package System (scripts/mcp/)¶
Purpose: Package codebase for ChatGPT Projects
Commands:
./scripts/mcp/mcp-package --list # List 9 topics
./scripts/mcp/mcp-package --topic agents # Package by topic
./scripts/mcp/mcp-package --custom "patterns" # Custom patterns
Topics: zendesk, agents, quantum, docs, mcp, workflows, python_dev, testing, security
Output: Flat ZIP with manifest.json, README_dataset.md, index.md
Docs: docs/mcp/ - 93+ KB across 8 comprehensive guides
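To illustrate what a flat ZIP with a manifest might look like, here is a minimal in-memory sketch. The `__` separator used by `flatten_name` and the manifest field names are assumptions made for this example; the actual mcp-package script may use a different scheme, and the README_dataset.md and index.md outputs are omitted.

```python
import io
import json
import zipfile


def flatten_name(path: str) -> str:
    """Turn a nested path into a single flat file name (assumed scheme)."""
    return path.strip("/").replace("/", "__")


def build_flat_zip(files: dict) -> bytes:
    """Pack {path: bytes} into a flat ZIP plus a manifest.json index."""
    manifest = {"files": []}
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(files):
            flat = flatten_name(path)
            archive.writestr(flat, files[path])
            manifest["files"].append({"original": path, "flat": flat})
        archive.writestr("manifest.json", json.dumps(manifest, indent=2))
    return buffer.getvalue()


# Sample contents are placeholders, not real repository files.
payload = build_flat_zip({
    "agents/workflow_navigator.py": b"# agent\n",
    "docs/mcp/QUICK_START.md": b"# Quick start\n",
})
names = zipfile.ZipFile(io.BytesIO(payload)).namelist()
```

Because every archive member sits at the top level, the manifest is the only place where the original directory layout is preserved.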
4. RAG & Verification (src/rag/, src/verification/)¶
- RAG pipelines: Chunking, embedding, retrieval
- CoVe: Chain-of-Verification fact-checking
- MCP adapters: Pinecone, Mock integrations
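The chunking, embedding, and retrieval steps can be sketched in a few lines. This is an illustration of the general technique, not the code in src/rag/: the bag-of-words `embed` function stands in for a real embedding model, and all function names here are hypothetical.

```python
import math
from collections import Counter


def chunk(text: str, size: int = 8, overlap: int = 2):
    """Split text into overlapping windows of `size` words."""
    words = text.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words counts (stand-in for a model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks, top_k: int = 1):
    """Return the top_k chunks ranked by similarity to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


document = (
    "The ingest stage stores a snapshot. "
    "The verify stage checks behavior after each transformation."
)
chunks = chunk(document, size=6, overlap=1)
best = retrieve("verify stage", chunks)
```

In a fuller pipeline the `embed` step would call an embedding model and the ranking would be delegated to a vector store such as the optional Pinecone integration.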
Data Flows¶
Code Ingestion¶
External Source → Ingest → Static Analysis → Runtime Analysis →
LLM Intent Inference → Transformation → Verification → PR Creation
Agent Workflow¶
Request → WorkflowNavigator → Agent Orchestration →
Task Execution → Verification → State Persistence
MCP Packaging¶
Human Request → Component Selection → File Flattening →
Manifest Generation → ZIP Creation → ChatGPT Upload
CI/CD¶
Git Push → Status Validation → Security Gates → Quality Gates →
Test Execution → Cache Management → Artifact Generation
Dependencies & Integrations¶
External Services¶
- OpenAI API: LLM intent inference (OPENAI_API_KEY)
- GitHub API: PR creation, workflows (GITHUB_TOKEN)
- Pinecone: Vector embeddings (optional)
- CodeQL/Semgrep: Security scanning
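A small startup check can report which credentials are missing before any service is called. The sketch below assumes that OPENAI_API_KEY and GITHUB_TOKEN are mandatory and that PINECONE_API_KEY is optional; the split and the `check_credentials` helper are assumptions for illustration, not documented behavior.

```python
import os

# Assumed split between mandatory and optional credentials.
REQUIRED = ("OPENAI_API_KEY", "GITHUB_TOKEN")
OPTIONAL = ("PINECONE_API_KEY",)


def check_credentials(env=None):
    """Return (missing_required, missing_optional) for the given mapping.

    Falls back to the process environment when no mapping is passed.
    Empty values count as missing.
    """
    env = os.environ if env is None else env
    missing_required = [name for name in REQUIRED if not env.get(name)]
    missing_optional = [name for name in OPTIONAL if not env.get(name)]
    return missing_required, missing_optional


# "sk-example" is a placeholder, not a real key.
missing, skipped = check_credentials({"OPENAI_API_KEY": "sk-example"})
```

Passing the environment as a parameter keeps the check easy to test without touching real secrets.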
Python Dependencies¶
- Core: numpy, pandas, openai, httpx, pydantic, hydra-core
- Dev: pytest, black, ruff, mypy, nox, pre-commit
- ML/AI: torch, transformers, safetensors (optional)
CI/CD Pipeline¶
Key Workflows (.github/workflows/)¶
| Workflow | Trigger | Purpose | Cache |
|---|---|---|---|
| status_validation.yml | push, PR | Repo status | - |
| security_gates.yml | push, PR | Security | - |
| nox_gates.yml | push, PR | Quality (lint/type) | Ruff, MyPy |
| optimized-ci.yml | push, PR | Optimized CI | All tools |
| build-chatgpt-package.yml | dispatch | MCP packaging | - |
| scan-secrets-variables.yml | schedule | Secrets scan | Gitleaks |
Cache Strategy (Phase 3C-Lite)¶
- Ruff: ~20-30 MB | MyPy: ~50-80 MB
- Pytest: ~30-50 MB | pre-commit: ~50-100 MB
- Total: 7.69 GB / 10 GB limit (23% buffer)
- Keys: ${{ runner.os }}-${{ github.workflow }}-<tool>-${{ hashFiles(...) }}
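The key layout and the buffer figure can be reproduced with a short sketch. `files_digest` is only a stand-in for the `hashFiles(...)` expression: it follows the same idea (a digest over file contents) but is not guaranteed to produce the same value as GitHub Actions. The workflow and tool names passed in below are illustrative.

```python
import hashlib


def files_digest(contents):
    """Hash a sequence of file contents (bytes) into one hex digest.

    Stand-in for hashFiles(...): same idea, not the same output.
    """
    outer = hashlib.sha256()
    for blob in contents:
        outer.update(hashlib.sha256(blob).digest())
    return outer.hexdigest()


def cache_key(runner_os, workflow, tool, contents):
    """Compose a key in the <os>-<workflow>-<tool>-<hash> layout."""
    return f"{runner_os}-{workflow}-{tool}-{files_digest(contents)}"


def buffer_percent(used_gb, limit_gb):
    """Share of the cache limit that is still free, in percent."""
    return (limit_gb - used_gb) / limit_gb * 100


key = cache_key("Linux", "nox_gates", "ruff", [b"[tool.ruff]\nline-length = 100\n"])
buffer = buffer_percent(7.69, 10)  # about 23.1, matching the 23% buffer above
```

Because the hash is the last key segment, any change to the hashed files yields a new key and therefore a fresh cache entry for that tool.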
Security & Secrets¶
Secrets (GitHub UI injected)¶
- OPENAI_API_KEY - OpenAI API
- PINECONE_API_KEY - Pinecone (optional)
- CODEX_MASTER_KEY - Genesis Protocol
Security Scanning¶
- Gitleaks, Trufflehog - Secret detection
- Semgrep SAST - Static analysis
- CodeQL - Code scanning
Anti-/tmp/ Protection¶
Policy: Use .github/tmp/ instead of /tmp/
Applied: emergency_cache_cleanup.sh, MCP tools
Doc: docs/system/ANTI_TMP_PROTECTION_SYSTEM.md
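As a rough illustration of the policy, the sketch below creates scratch directories under .github/tmp/ and flags script lines that still point at /tmp/. Both helpers are hypothetical and are not the contents of ANTI_TMP_PROTECTION_SYSTEM.md; the line scan is a plain substring check and will miss paths built dynamically.

```python
import tempfile
from pathlib import Path


def repo_tmp_dir(repo_root, prefix="codex-"):
    """Create a scratch directory under <repo_root>/.github/tmp/."""
    base = Path(repo_root) / ".github" / "tmp"
    base.mkdir(parents=True, exist_ok=True)
    return Path(tempfile.mkdtemp(prefix=prefix, dir=str(base)))


def find_tmp_violations(script_text):
    """Return 1-based line numbers that reference /tmp/ outside .github/tmp/."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        # Ignore the approved location before looking for /tmp/.
        if "/tmp/" in line.replace(".github/tmp/", ""):
            hits.append(number)
    return hits


violations = find_tmp_violations("out=/tmp/cache\nok=.github/tmp/cache\n")
```

A check like `find_tmp_violations` could run as a pre-commit hook so that new scripts adopt the repository-local location from the start.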
MCP & ChatGPT Integration¶
Packaging Capabilities¶
- 9 Predefined Topics: All major capabilities covered
- Custom Patterns: Glob-based file selection
- Flat Structure: Optimized for ChatGPT
- Metadata: SHA256, sizes, language detection
- Navigation: Manifest-driven discovery
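The selection and metadata steps might look like the following sketch. The extension-to-language table, the entry field names, and both helpers are assumptions for illustration; the real packaging script may detect languages and structure its manifest differently.

```python
import fnmatch
import hashlib
import os

# Assumed extension-to-language table; the real detector is not documented here.
LANGUAGES = {".py": "python", ".md": "markdown", ".yml": "yaml", ".sh": "shell"}


def select(paths, pattern):
    """Glob-style, case-sensitive selection over repository-relative paths."""
    return [p for p in paths if fnmatch.fnmatchcase(p, pattern)]


def describe(path, data):
    """Build one manifest entry: SHA256, size in bytes, detected language."""
    suffix = os.path.splitext(path)[1]
    return {
        "path": path,
        "sha256": hashlib.sha256(data).hexdigest(),
        "size": len(data),
        "language": LANGUAGES.get(suffix, "unknown"),
    }


paths = [
    "agents/workflow_navigator.py",
    "agents/README.md",
    "docs/mcp/QUICK_START.md",
]
picked = select(paths, "agents/*.py")
# The file body below is a placeholder, not the real module.
entry = describe(picked[0], b"print('hi')\n")
```

Note that `fnmatch` patterns let `*` match path separators as well, so a pattern such as `agents/*.py` would also match files in nested subdirectories.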
Methodology Transfer (8 Capabilities)¶
- Python script development/deconstruction
- Workflow navigation & state management
- Quantum game theory application
- API integration patterns
- CI/CD workflow optimization
- Agent-based architecture
- TDD methodology
- Documentation generation
Documentation (docs/mcp/)¶
- QUICK_START.md - 5-minute onboarding
- PACKAGING_GUIDE.md - Complete workflows
- PACKAGEABLE_CAPABILITIES.md - Capability transfer
- ChatGPT_Project_SYSTEM_PROMPT.md - AI prompt
- GENERIC_NAVIGATION_SYSTEM.md - Universal navigation
- ADVANCED_FEATURES_PLANSET.md - Roadmap (Future iterations)
Operational Context¶
GitHub Limits¶
- Copilot Pro+: 64K tokens/session
- GitHub Team: 10 GB cache, limited Actions minutes
- Current Cache: 7.69 GB (23% buffer)
Quality Metrics¶
- Tests: 1500+ (100% passing)
- Coverage: 72% (target: 80%+)
- Security: 0 vulnerabilities
- Cache Hit Rate: 90%+ projected
Performance Targets¶
- Test execution: <5 min
- Lint/type: <2 min
- Package creation: <2 min
Quick Reference¶
Common Commands¶
# Codex
python -m codex.cli ingest|analyze|transform|verify
# MCP
./scripts/mcp/mcp-package --list|--topic|--custom
# Testing
make docker-test
pytest tests/ --cov=src/
# Quality
nox -s lint|type|format
# Agent
python -m scripts.space_traversal.audit_runner agent-interface
Entry Points¶
| System | Entry | Type |
|---|---|---|
| Codex CLI | python -m codex.cli | Module |
| MCP Package | ./scripts/mcp/mcp-package | Script |
| Agent Navigator | agents.workflow_navigator | Class |
| Tests | pytest / make docker-test | Command |
Navigation for AI Agents¶
Getting Started¶
- Architecture: This doc → docs/ARCHITECTURE.md
- Capabilities: docs/capabilities/*.md
- Workflows: agents/TOKENIZED_WORKFLOWS.md
- MCP: docs/mcp/QUICK_START.md
- Contributing: docs/CONTRIBUTING.md
Finding Things¶
- Code: src/ (app), agents/ (agents)
- Tests: tests/ (mirrors src/)
- Scripts: scripts/ (automation)
- Docs: docs/ (organized by topic)
- CI/CD: .github/workflows/
Common Tasks¶
- New capability: docs/capabilities/template
- Extend agents: agents/workflow_navigator.py
- Add CI: .github/workflows/templates
- Package code: scripts/mcp/mcp-package
- Run tests: make docker-test
Related Documents¶
- Codebase Dashboard - Live status & next steps
- Roadmap - Feature roadmap & iterations
- Architecture - Detailed architecture
- Contributing - Contribution guide
- Admin Guide - Admin setup
Owner: DevOps + Agent Development Team | Review: Monthly or after major changes | Last Reviewed: 2026-01-23T08:42:00Z
Verification Checklist¶
Architecture Accuracy¶
- Component structure matches current repository layout
- Data flows reflect actual implementation
- Dependencies list is up-to-date
- Integration points correctly documented
Documentation Quality¶
- All code examples are valid and tested
- Links to related documents are functional
- Tables render correctly in GitHub/browser
- Commands and paths are accurate
Currency¶
- Updated to reflect latest repository state (2026-01-23)
- Version number incremented (2.0.0)
- Iteration-based workflow language used throughout
Success Metrics¶
| Metric | Target | Current | Status |
|---|---|---|---|
| Documentation freshness | <30 iterations | 0 iterations | ✅ |
| Broken links | 0 | 0 | ✅ |
| Outdated references | 0 | 0 | ✅ |
| Table rendering issues | 0 | 0 | ✅ |
Physics Alignment¶
| Principle | Application | Section |
|---|---|---|
| Path 🛤️ | Clear navigation from overview to detailed components | All sections |
| Fields | Data flows show transformation through pipeline | Data Flows |
| Patterns | Architecture patterns visible and documented | Components |
| Redundancy | Multiple entry points and cross-references | Navigation |
| Balance | Balanced detail across all major components | All sections |
Redundancy Patterns¶
Navigation Redundancy:
- Multiple access paths: by component, by workflow, by role
- Cross-references between related sections
- Both top-down and bottom-up navigation supported
Update Strategy:
- Version-controlled documentation
- Git history maintains all previous versions
- Rollback available via commit history
⚡ Energy Distribution¶
| Section | Energy | Rationale |
|---|---|---|
| Architecture Overview | ⚡⚡⚡⚡ | Critical for understanding system structure |
| Core Components | ⚡⚡⚡⚡⚡ | Essential for development and maintenance |
| Data Flows | ⚡⚡⚡ | Important for troubleshooting and optimization |
| CI/CD Pipeline | ⚡⚡⚡ | Key for deployment and automation |
| Quick Reference | ⚡⚡ | Utility section for common tasks |
Questions? → Dashboard