🎯 AIP-C01 Complete Gap Cheat Sheet

Common Mistakes to watch for

🛡️ Bedrock Guardrails – Deep Dive (HOT · 4 mistakes)

Domain 3.1 · 3.3 · 3.4

Guardrail Component Map – What Does What

Content filters – harmful content (violence, hate, sexual) by category + strength
Word filters – block specific forbidden words/phrases
Sensitive info / PII filters – detect/redact PII in input/output
Denied topics – block specific off-topic discussions (competitors, crypto, etc.)
Contextual grounding – detect hallucination / unauthorized data leakage in RAG
Prompt attack filter – detect jailbreaks and prompt injection attacks
Denied topics ≠ injection prevention. SQL injection / jailbreaks → Prompt Attack Filter

Guardrails Monitoring Metrics (Tricky)

Monitor InvocationsIntervened filtered by the GuardrailContentSource dimension
Filter by the GuardrailPolicyType dimension – shows which policy is intervening
GuardrailPolicyType values: ContentPolicy, TopicPolicy, SensitiveInformationPolicy, WordPolicy, GroundingPolicy
Enable tracing first: {"trace": "enabled"} in guardrailConfig
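As a minimal sketch, the trace flag above sits inside the guardrailConfig of a Converse-style request. The model ID and guardrail ID/version here are placeholders:

```python
# Sketch: Converse-style request dict with guardrail tracing enabled.
# Guardrail ID, version, and model ID are illustrative placeholders.
def build_converse_request(prompt: str, guardrail_id: str, guardrail_version: str) -> dict:
    return {
        "modelId": "anthropic.claude-3-haiku-20240307-v1:0",  # example model ID
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "guardrailConfig": {
            "guardrailIdentifier": guardrail_id,
            "guardrailVersion": guardrail_version,
            "trace": "enabled",  # required before GuardrailPolicyType trace data is emitted
        },
    }

req = build_converse_request("Hello", "gr-1234", "1")
```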

Enforcing Guardrails – IAM Condition Key

✗ Lambda proxy that validates guardrails before forwarding to Bedrock
✓ IAM policy condition key bedrock:GuardrailIdentifier on InvokeModel and Converse API calls
Apply to ALL roles accessing Bedrock → mandatory guardrail with no bypass. No Lambda overhead, no single point of failure.
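A sketch of the policy shape this describes: deny invocation unless the request carries the mandated guardrail. The account ID, Region, and guardrail ARN are placeholders:

```python
# Sketch of an IAM policy using the bedrock:GuardrailIdentifier condition
# key: deny InvokeModel unless the request uses the mandated guardrail.
# Account ID, Region, and guardrail ARN are placeholders.
ENFORCE_GUARDRAIL_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyInvokeWithoutGuardrail",
            "Effect": "Deny",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream",
            ],
            "Resource": "*",
            "Condition": {
                "StringNotEquals": {
                    "bedrock:GuardrailIdentifier":
                        "arn:aws:bedrock:us-east-1:111122223333:guardrail/gr-example"
                }
            },
        }
    ],
}
```

Attached to every Bedrock-accessing role, the Deny + StringNotEquals pair makes the guardrail mandatory with no bypass path.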

WAF vs Guardrails vs SageMaker Clarify (Clarification)

✓ Use Bedrock Guardrails For

  • AI content safety (harmful, toxic)
  • Prompt injection / jailbreak detection
  • PII in AI input/output
  • Topic restriction for AI
  • RAG hallucination (grounding)
  • Bedrock Agents protection

✗ Not These

  • AWS WAF → HTTP attacks, DDoS; not for AI content
  • SageMaker Clarify → bias/explainability for ML training, NOT policy enforcement; NOT for governance/RBAC
  • Bias Detection β€” BOLD Dataset NEW

    SageMaker Clarify + RealToxicityPrompts dataset for text generation bias
    Bedrock model evaluation jobs using BOLD dataset (Bias in Open-Ended Language Generation) + secondary model validation
    BOLD = purpose-built for bias detection in language generation models. SageMaker Clarify = ML model bias/explainability (not text gen evaluation).

Guardrails Contextual Grounding for RAG Quality

✗ SageMaker Clarify for RAG quality and explainability assessment
✓ Bedrock Guardrails contextual grounding checks – detect responses not grounded in the retrieved context
Grounding checks compare the model's response against the retrieved context. If the response contains claims not supported by the KB → the intervention triggers.
🧪 Evaluation Systems (CRITICAL · 6 mistakes · 17–20% of score)

Domain 5.1 – Your Single Weakest Domain

CreateEvaluationJob API – The Correct Approach

✗ Batch inference jobs → custom LLM-judge Lambda → Spearman correlation
✓ Use the CreateEvaluationJob API with a consistent evaluator model for ALL candidate FMs → parallel managed processing → results to S3
Then use Lambda for Spearman's rank correlation on the S3 results. Managed API = consistent + parallel + automatic. DIY = inconsistent + fragile.
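The Spearman step a Lambda would run over the S3 results can be sketched in pure Python (score lists are illustrative; no-ties formula):

```python
# Sketch: Spearman's rank correlation between two judges' score lists,
# as a Lambda might compute over evaluation results pulled from S3.
def spearman(xs: list[float], ys: list[float]) -> float:
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0.0] * len(vals)
        for rank, i in enumerate(order):
            r[i] = float(rank + 1)
        return r

    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    # Spearman's rho via the squared rank-difference formula (assumes no ties)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - (6 * d2) / (n * (n ** 2 - 1))

print(spearman([1, 2, 3, 4], [10, 20, 30, 40]))  # 1.0 (perfectly concordant)
```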

Evaluation Dataset Hard Limit (EXACT LIMIT)

✗ S3 versioning + CORS config for a 5,000-prompt dataset
✓ Hard limit: 1,000 prompts per evaluation job. For 5,000 prompts → split into 5 jobs of 1,000 each.
CORS = browser console access. S3 versioning = file history. Neither affects prompt limits. The fix is always: split the dataset.
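The split itself is mechanical; a minimal sketch:

```python
# Sketch: split a large prompt dataset into evaluation jobs of at most
# 1,000 prompts each, per the hard limit above.
MAX_PROMPTS_PER_JOB = 1000

def split_into_jobs(prompts: list[str], limit: int = MAX_PROMPTS_PER_JOB) -> list[list[str]]:
    return [prompts[i:i + limit] for i in range(0, len(prompts), limit)]

jobs = split_into_jobs([f"prompt-{i}" for i in range(5000)])
print(len(jobs), len(jobs[0]))  # 5 1000
```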

Human Evaluation Workforce

✗ SageMaker Ground Truth labeling job with AwsManagedHumanLoopRequestSource
✓ Create an Amazon Cognito user pool → manage a custom workforce → assign it to the Bedrock evaluation work team
SageMaker Ground Truth = data labeling for ML training. Bedrock human evaluation = Cognito user pools for your custom expert workforce.

Custom vs Benchmark Evaluation Metrics

✗ Industry-standard benchmark dataset (MMLU etc.) for company-specific tone
✓ Human-validated custom dataset + custom metrics for a formality scale and brand-specific tone/style
General benchmarks = general capability. Company-specific quality (tone, formality, domain) = custom dataset + custom metrics. Always.

Evaluation Process Order (Sequence)

✗ Create a test dataset with diverse scenarios first (to replace a model in production)
✓ Run the eval → analyze results and generate a comprehensive evaluation report – that's the final step and the primary deliverable
The exam tests whether you know the OUTPUT of an evaluation process (the report/analysis) vs just creating inputs. Analysis + report = the answer when asked about systematic evaluation.

Output Traceability – Tag at Generation Time

✗ Bedrock invocation logging to correlate logs with the data source
✓ Tag FM outputs with metadata from the data source at generation time – content, timestamps, source ID
Invocation logs capture inputs/outputs but don't inherently link to which source influenced the output. Tagging at generation = traceability baked in.
🔐 IAM Identity Center & Authentication (6 mistakes)

Domain 3.2

Enterprise Federation – Always IAM Identity Center

✗ Create IAM users matching AD usernames + cross-account roles
✗ IAM roles + AssumeRole API calls for on-prem apps
✓ IAM Identity Center + external IdP (Microsoft Entra ID / Active Directory) + permission sets mapped to departments + multi-account Regional failover
Corporate IdP → IAM Identity Center → permission sets per department → access to Bedrock models. Never create IAM users that mirror AD accounts.

Third-Party App Auth – OIDC + Cognito

✗ IAM users per employee + Secrets Manager credential rotation
✓ OIDC integration with Amazon Cognito → authenticate via the IdP → exchange for temporary AWS credentials (STS)
Temporary credentials > long-lived IAM credentials. Secrets Manager rotation ≠ federation. Cognito = federated identity for apps → STS tokens.

PII Detection – Comprehend Is the Right Tool (2 mistakes)

✗ Batch Comprehend + weekly Macie scans
✗ Textract + Macie + Kendra for email PII + search
✓ Comprehend real-time PII detection API + Macie custom classifiers for continuous monitoring; Kendra for search after PII redaction

Correct Tool Per Use Case

  • Email/text PII → Comprehend (real-time)
  • S3 bucket discovery → Macie (custom classifiers)
  • Document OCR → Textract
  • Enterprise search → Kendra

Common Wrong Combos

  • Textract for email → OCR tool, not NLP
  • Batch PII → too slow for continuous monitoring
  • Weekly Macie → not continuous enough

VPC Endpoint for Amazon Bedrock

✗ VPC Gateway endpoint for Bedrock Runtime
✓ VPC Interface endpoint (PrivateLink) for Bedrock Runtime
Gateway endpoints = S3 and DynamoDB ONLY. Everything else (Bedrock, SageMaker, SSM, etc.) = Interface endpoints via PrivateLink.
🤖 Strands SDK + MCP Servers + Agentic Patterns (NEW · 2 mistakes)

Domain 2.1

Strands + MCP vs Bedrock Flows

✗ Bedrock Flows with sequential LLM invocations + Lambda for dynamic tool access
✓ Strands Agents SDK + create MCP servers for each tool → deploy the Strands agent + MCP servers as Lambda → dynamic tool selection by the agent

Strands + MCP

  • Dynamic tool selection at runtime
  • LLM decides which tool to call
  • Each company tool = one MCP server
  • Flexible, agentic workflows

Bedrock Flows

  • Predefined sequential steps
  • Fixed execution path
  • Good for deterministic pipelines
  • NOT for dynamic tool routing

Strands Multi-Specialist Agents

✗ Single Bedrock Agent with an OpenAPI schema per task (BillingTool, SupportTool)
✓ Strands Agents SDK → specialist agents per domain (BillingAgent, TechSupportAgent) → deploy as Lambda → register in Strands → orchestration routes requests
Multi-domain complex queries → Strands with specialist agents. Single-task automation → Bedrock Agents with action groups.

ReAct Pattern in Step Functions

✗ Chain-of-thought with Choice states to adjust the reasoning path
✓ ReAct states: Observation → Reasoning → Action + built-in Step Functions error handling + retry logic

ReAct (Reasoning + Acting)

  • Interleaves reasoning + tool use
  • Observe environment → reason → act
  • Dynamic – loops until done
  • Agentic/tool-use workflows

Chain-of-Thought

  • Explicit reasoning chain
  • No action/tool loop
  • Linear reasoning path
  • Complex reasoning (not actions)
  • πŸ“

    Inference Profiles β€” Multi-Use NEW Β· 2 mistakes

    Domain 1.2 Β· 4.1

Inference Profiles for Cost Attribution

✗ S3 Event Notifications + Lambda routing by S3 key prefix
✓ Create Bedrock inference profiles per clinic/cost center → invoke the model via the profile → cost reports broken down by profile
1 inference profile = 1 cost center. Tag the profile with the clinic ID. AWS Cost Explorer shows costs per profile = per-clinic attribution.

Inference Profiles for Multi-Region HA

✗ Cross-Region inference in a round-robin pattern for HA
✓ Inference profile with primary-Region + secondary-Region configuration – automatic failover (NOT round-robin load balancing)
Cross-Region inference provides failover capability. An inference profile defines the failover order. Round-robin = load balancing ≠ HA. The exam wants failover.
Inference profile = (primary Region) + (secondary Region) → failover when the primary is unavailable
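The failover-order semantics (the profile handles this for you; the invoke functions here are stand-ins) can be sketched as:

```python
# Sketch of failover-order semantics: try the primary Region first, fall
# back to the secondary only on failure. Stand-in invoke functions; a
# real inference profile performs this routing internally.
def invoke_with_failover(prompt, invoke_primary, invoke_secondary):
    try:
        return invoke_primary(prompt)    # primary Region first
    except Exception:
        return invoke_secondary(prompt)  # failover order, not round-robin

def primary_down(prompt):
    raise RuntimeError("primary Region unavailable")

result = invoke_with_failover("hi", primary_down, lambda p: f"secondary:{p}")
print(result)  # secondary:hi
```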

Batch Inference + Inference Profiles for High Throughput

✗ Synchronous API calls + Lambda concurrent invocations for millions of documents
✓ Batch inference jobs with S3 I/O configuration + inference profiles for workload distribution across Regions = max throughput + cost efficiency
Synchronous = latency-sensitive, not designed for bulk. Batch inference = purpose-built for large-scale async processing. S3 I/O = input from S3, results to S3.
🧠 Model Distillation (NEW · 1 mistake)

Domain 1.2

Distillation – Prompts Only, Teacher Generates Responses

✗ Use prompt-response pairs from invocation logs as distillation training data
✓ Model distillation = supply only prompts to Bedrock. The teacher model generates the responses during the distillation process – Bedrock handles the teacher inference internally.

Technique | Training Input | Who Generates Responses
Fine-tuning | Prompt + response pairs (JSONL) | You provide the responses
Distillation ✓ | Prompts only | Teacher model (Bedrock handles it)
Continued pre-training | Raw text corpus | N/A (unsupervised)

Distillation: a small student model learns from a large teacher model's outputs. You just provide the prompts – Bedrock runs the teacher and captures its responses automatically.
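The shape difference between the two training inputs, sketched with illustrative field names (the exact JSONL schema varies by model and feature):

```python
# Sketch: fine-tuning input carries prompt + response pairs; distillation
# input carries prompts only (the teacher fills in the responses).
# Field names are illustrative, not the exact Bedrock schema.
import json

fine_tuning_record = {
    "prompt": "Summarize our refund policy.",
    "completion": "Refunds are issued within 14 days of purchase.",
}
distillation_record = {
    "prompt": "Summarize our refund policy.",  # no completion: teacher generates it
}

ft_line = json.dumps(fine_tuning_record)
dist_line = json.dumps(distillation_record)
print("completion" in dist_line)  # False
```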
⚡ Semantic Caching (NEW · 1 mistake)

Domain 4.2

OpenSearch k-NN Semantic Cache vs ElastiCache Exact-String

✗ ElastiCache Redis with custom semantic caching using exact string matching
✓ OpenSearch Service with a k-NN vector index → Lambda generates an embedding for the incoming query → finds a semantically similar cached response → returns a cache hit without an FM invocation

OpenSearch k-NN Semantic Cache

  • Embeddings stored in a vector index
  • Finds similar (not identical) queries
  • "What's the price?" ≈ "How much does it cost?"
  • Cache hits even with different wording
  • ✓ True semantic caching

ElastiCache Exact-String

  • Key-value: exact query string = key
  • Only matches identical strings
  • "What's the price?" ≠ "How much?"
  • Low cache hit rate for conversational AI
  • ✗ NOT semantic
  • πŸ“‘

    Monitoring & Observability NEW Β· 3 mistakes

    Domain 4.3 Β· 5.2

CloudWatch GenAI Observability (Not X-Ray)

✗ CloudWatch Application Signals + X-Ray traces for Lambda-Bedrock interactions
✓ Amazon CloudWatch GenAI observability for deployed Lambda + Bedrock functions – purpose-built for GenAI monitoring
CloudWatch GenAI observability = a newer capability purpose-built for tracking model invocations, latency, token counts, and errors in GenAI apps. X-Ray = distributed tracing for microservices, not optimized for GenAI metrics.

RAG Performance: CloudWatch Dashboard vs X-Ray

✗ X-Ray distributed tracing focused on OpenSearch vector query latency
✓ Custom CloudWatch dashboard combining retrieval latency + OpenSearch operation counts → analyze Bedrock invocation logs to identify degraded KB queries
For RAG you need a combined view of retrieval latency + model latency + KB query patterns. CloudWatch dashboards with multiple metric sources = a better picture than X-Ray traces alone.

KB Logging vs Model Invocation Logging

✗ Model invocation logging for KB document-processing failures
✓ Knowledge base logging → CloudWatch Logs → CloudWatch Logs Insights to query KB ingestion failures
Two separate log types:
  • Model invocation logs = LLM inputs/outputs, latency, tokens
  • KB ingestion logs = document processing, chunking, embedding failures

Troubleshooting Truncated Responses

✗ CloudWatch Logs Insights to analyze API call patterns for a truncation issue
✓ Truncated/cut-off responses = increase the max_tokens parameter in the model invocation
Troubleshooting quick reference:
  • Cut-off output → max_tokens too low
  • Repetitive output → temperature too low or a top-p issue
  • Irrelevant output → RAG retrieval / reranking issue
  • High latency → batch inference or streaming
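The max_tokens fix, sketched as an invocation body (field names follow the Anthropic-on-Bedrock request shape; other model families use different parameter names):

```python
# Sketch: raising max_tokens in an invocation body to fix truncated
# output. Field names follow the Anthropic-on-Bedrock request shape and
# are illustrative; other model families name this parameter differently.
import json

def build_body(prompt: str, max_tokens: int) -> str:
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,  # too low -> responses get cut off mid-sentence
        "messages": [{"role": "user", "content": prompt}],
    })

body = build_body("Summarize this 10-page report.", max_tokens=2048)
print(json.loads(body)["max_tokens"])  # 2048
```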
📄 Bedrock Data Automation (BDA) (4 mistakes)

Domain 1.3

BDA Architecture: 1 Project, Many Blueprints

✗ Create separate BDA projects per document type
✓ ONE BDA project + multiple blueprints (one per doc type) → BDA auto-selects the correct blueprint per document
Invoke via the InvokeDataAutomationAsync API → pass the single project → BDA picks the right blueprint automatically.

BDA as a Multimodal Parser Before RAG

✗ Simple S3 → Knowledge Base for multimodal data (PDFs, filings, charts)
✓ BDA as the parser → extracts structured insights from multimodal content → THEN feeds into the Knowledge Base for RAG
Pattern: S3 → BDA parser → structured output → KB ingestion → RAG queries. BDA handles PDFs, images, tables, charts, audio, and video.

BDA Blueprint: Transformation vs Validation

✗ Validation to enforce required subfields (FIRST_NAME, LAST_NAME) and reject malformed records
✓ Transformation with a reusable custom type to split/reformat extracted fields (e.g., split "AuthorizedSigner" into components)
Validation = check constraints, reject bad data. Transformation = reshape/split/convert the value. Custom type = a reusable transformation definition in blueprints.

BDA vs EventBridge + Step Functions for Orchestration

✗ BDA blueprint to orchestrate video processing (Rekognition + Bedrock)
✓ BDA = document/media intelligence extraction only. For multi-step orchestration → EventBridge + Step Functions
S3 upload → EventBridge rule → Step Functions state machine → calls Rekognition → calls Bedrock FMs → returns results
📝 Bedrock Prompt Management & Governance (5 mistakes)

Domain 3.3

Prompt Management Features

✓ Versioning – track template changes over time
✓ Review workflows – approve versions BEFORE going live (never auto-activate)
✓ Parameterized templates – reusable prompts with variables
✗ Auto-activating new versions as they save (bypasses governance!)
CloudTrail = compliance audit. CloudWatch Logs = operational monitoring. For prompt compliance → CloudTrail.

Complete Governance Stack

1. Bedrock Prompt Management – parameterized templates + versioning + review workflows
2. Bedrock Guardrails – content safety policies (NOT for RBAC!)
3. AWS CloudTrail – audit log of all prompt invocations + template changes
4. IAM + permission sets – for RBAC (role-based access)
✗ Using Guardrails for role-based access control (RBAC)
Guardrails = content safety. RBAC = IAM. Compliance logs = CloudTrail (not CloudWatch).

Prompt Lineage / Audit Trail

✗ CloudWatch Logs + S3 object tags for prompt lineage and metadata
✓ Bedrock Prompt Management for template versioning + CloudTrail to record all Bedrock API calls (who called what, when, and with which template version)

Output Source Tagging

✗ Bedrock invocation logging to correlate outputs with the data source
✓ Tag FM outputs with metadata from the data source at generation time (source ID, timestamp, doc name)
Invocation logs tell you inputs/outputs. They don't tell you which KB document influenced the answer. Tagging at generation time = traceability built in.
🔍 RAG & Knowledge Bases – Nuances (4 mistakes)

Domain 1.4 · 1.5

Retrieval Relevance – Reranking

✗ Increase the number of retrieved documents to improve relevance
✓ Configure reranking in Bedrock Knowledge Bases – a second-pass semantic re-scoring of retrieved chunks
More docs ≠ better; it often adds noise. Reranking pipeline: vector search → top-K → reranker → best-N → LLM
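The top-K → rerank → best-N flow can be sketched with a toy second-pass scorer (in practice the reranker is a model configured on the Knowledge Base):

```python
# Sketch of the top-K -> rerank -> best-N pipeline. The word-overlap
# scorer is a toy stand-in for a real reranking model.
def rerank(query: str, chunks: list[str], best_n: int = 2) -> list[str]:
    q = set(query.lower().split())
    scored = [(len(q & set(c.lower().split())), c) for c in chunks]
    scored.sort(key=lambda t: t[0], reverse=True)  # highest second-pass score first
    return [c for _, c in scored[:best_n]]

top_k = [  # results of the first-pass vector search
    "shipping times vary by region",
    "refund policy covers 14 days after purchase",
    "refunds require a receipt",
]
print(rerank("what is the refund policy", top_k))
```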

KB Sync: SQS Buffer for Resilience

✗ S3 Event Notifications → direct Lambda trigger → IngestKnowledgeBaseDocuments
✓ S3 Events → SQS queue → Lambda polls the queue → IngestKnowledgeBaseDocuments
SQS buffer = retry on failure. With a direct Lambda trigger, a failed invocation can drop the event. SQS = resilient by design.
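A sketch of the polling Lambda in this pipeline: each SQS record carries an S3 notification in its body, and the actual Bedrock ingest call is stubbed out:

```python
# Sketch of the SQS-buffered KB sync Lambda: parse the S3 event out of
# each SQS record and collect the object keys to ingest. The Bedrock
# ingest call itself is stubbed out (commented).
import json

def handler(event, context=None):
    ingested = []
    for record in event["Records"]:            # SQS batch
        s3_event = json.loads(record["body"])  # S3 notification inside the SQS body
        for s3_record in s3_event["Records"]:
            key = s3_record["s3"]["object"]["key"]
            # here: call IngestKnowledgeBaseDocuments for this object
            ingested.append(key)
    return {"ingested": ingested}

fake_event = {"Records": [{"body": json.dumps(
    {"Records": [{"s3": {"object": {"key": "docs/policy.pdf"}}}]})}]}
print(handler(fake_event))  # {'ingested': ['docs/policy.pdf']}
```

If the ingest call raises, the SQS message becomes visible again and is retried; with a direct S3-to-Lambda trigger there is no queue to return it to.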

KB Ingestion: Split Docs, Don't Compress

✗ Enable S3 bucket compression for large PDF files before KB ingestion
✓ Split large documents into smaller files before ingestion – compression doesn't help with Bedrock's document size limits
Bedrock KB has per-document size limits. Compression changes the file size but doesn't change how Bedrock processes the document content. Splitting is the solution.

RAG Accuracy vs Streaming Latency

✗ Implement response streaming to fix product accuracy issues
✓ Accuracy issues → RAG Knowledge Bases with the current product catalog. Streaming → reduces perceived latency only.
Streaming = faster first token. RAG = the model knows the correct facts. Different problems, different solutions.

Custom Chunking for Complex HTML

✗ Built-in hierarchical chunking with parent/child chunk-size settings
✓ Custom Lambda function + LangChain framework → deploy via a Lambda layer → configure as custom chunking in the Knowledge Base
Built-in hierarchical = simple docs. Complex HTML with nested headers, tables, and mixed content = custom Lambda chunking for fine-grained control.

Bedrock On-Demand vs SageMaker for Nova Models

✗ SageMaker AI real-time endpoint + auto scaling for unpredictable traffic with Nova models
✓ Deploy the Nova model in Amazon Bedrock → invoke via Lambda with on-demand inference API calls → no infrastructure management
If the model is available in Bedrock natively (e.g., Nova, Claude, Titan), use Bedrock on-demand – no endpoints to manage, automatic scaling, pay per token.
🔎 OpenSearch: Hybrid Search & Service vs Serverless (NEW · 1 mistake)

Domain 1.4

OpenSearch Service vs Serverless for Sub-Second Hybrid Search

✗ OpenSearch Serverless + dense-only vectors for sub-second search
✓ OpenSearch Service (managed) + BOTH sparse AND dense vectors = hybrid search for the best accuracy + sub-second performance

OpenSearch Service (managed)

  • Full feature support, including hybrid search
  • Sparse vectors (BM25/keyword) + dense vectors
  • Best for sub-second + accuracy requirements
  • More control over indexing, replicas

OpenSearch Serverless

  • Simpler, auto-scaling
  • Good for variable workloads
  • Limited: no cross-cluster search, smaller feature set
  • Dense vectors only (at exam time)

Hybrid search = sparse (keyword/BM25) + dense (semantic embedding) combined. Better precision + recall than either alone.
🔗 Integration Patterns (4 mistakes)

Domain 2.3 · 2.4 · 2.5

Async Document Processing Architecture

✗ API Gateway WebSocket + InvokeModelWithResponseStream for document upload + processing
✓ S3 presigned URL → user uploads to S3 directly → S3 Event Notifications → SQS → Lambda polls → Bedrock async processing → results to S3
Document processing = async (not streaming). Streaming = real-time token-by-token output. Large-document processing = an async queue-based pipeline.

IAM for WebSocket Streaming

✗ DynamoDB table + DynamoDB Streams to buffer and forward streaming tokens
✓ The Lambda's IAM role needs bedrock:InvokeModelWithResponseStream + execute-api:ManageConnections – resource ARNs must include the WebSocket API ID

API Gateway for Model Routing

✗ Separate API Gateways per provider + client-side routing + client-side API keys
✓ Single API Gateway REST API + non-proxy integration + mapping templates (VTL) + Secrets Manager for keys + stage variables for endpoint URLs
Never store API keys client-side. 1 API Gateway → mapping templates transform per provider → Secrets Manager for API keys.

Amazon Q Developer – Full Scope (NEW)

✗ Q Developer = only code completions + documentation lookups
✗ Q Developer = only security analysis + compliance
✓ Q Developer does ALL of it: code generation + refactoring, contextual IDE suggestions based on YOUR project codebase, API guidance, performance optimization, security analysis, code transformation/modernization
Key word: "contextual" – Q Developer understands YOUR project, not just generic APIs. It's an AI coding assistant, not just autocomplete.
⚙️ SageMaker Inference Types (2 mistakes)

Domain 2.2 · 5.2

SageMaker Inference Decision Matrix

Type | When to Use | Instance Type | Your Mistake
Real-time | Low latency, synchronous, always-on traffic | Any | –
Asynchronous ✓ | Long-running, large payloads, image/video generation, spiky traffic | Accelerated (GPU) | Used Serverless + general purpose
Serverless | Infrequent/variable traffic, cost savings when idle, simple models | General purpose (auto) | Wrong for image gen (needs GPU)
Batch | Offline large dataset, no latency requirement | Any | –
Shadow test | Validate a new model version without production impact | Any | Picked A/B testing instead

Shadow test = traffic copy, no user impact. A/B test = splits live traffic between variants. For safe validation → shadow test.
📦 Data Processing & Storage Patterns (6 mistakes)

Domain 1.3 · 1.4

Fine-Tuning Data Pipeline

✗ AWS Glue crawler + Amazon EMR + Apache Spark to produce JSONL
✓ Glue crawler → Glue Data Catalog → Glue ETL jobs → transform to JSONL (Converse API format) → S3 → Bedrock fine-tuning job
Glue ETL = serverless, managed. EMR = a complex cluster. For format conversion to JSONL → Glue ETL is the correct AWS-native path.
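The conversion step a Glue ETL job performs can be sketched as records-to-JSONL; the message structure loosely follows the Converse-style role/content shape, and field names are illustrative:

```python
# Sketch: transform tabular records into JSONL training lines, the kind
# of conversion a Glue ETL job would do. Message structure loosely
# follows the Converse-style role/content shape; field names are
# illustrative, not an exact Bedrock schema.
import json

def to_jsonl(records: list[dict]) -> str:
    lines = []
    for r in records:
        lines.append(json.dumps({
            "messages": [
                {"role": "user", "content": r["question"]},
                {"role": "assistant", "content": r["answer"]},
            ]
        }))
    return "\n".join(lines)  # one JSON object per line

rows = [{"question": "Return window?", "answer": "14 days."}]
out = to_jsonl(rows)
print(len(out.splitlines()))  # 1 line per record
```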

S3 Metadata Types

Type | Examples | Set By
System-defined | Last-Modified, Content-Type, ETag, Content-Length | S3 (auto)
User-defined | x-amz-meta-author, x-amz-meta-source, x-amz-meta-dept | You, on upload
Object tags | Classification=Confidential, Discipline=Physics | You, anytime

✗ Store timestamps as user-defined metadata
Timestamps → system-defined (S3 sets them automatically). Authorship → user-defined. Classifications → tags or user-defined.

Comprehend: Entity Recognition vs Classification

✗ Comprehend custom classification to categorize products
✓ Comprehend entity recognition to extract product attributes (brand, model, specs)

Entity Recognition

  • Extracts named entities from text
  • Product names, brands, specs
  • People, places, dates
  • Output: structured attributes

Custom Classification

  • Labels the whole document/text
  • Assigns category labels
  • Sentiment categories
  • Output: class labels

PII in Existing Text vs New Stream

✓ Existing email data in S3 → Comprehend PII detection/redaction + Kendra for enterprise search
✓ Real-time data stream → Comprehend real-time PII detection API (not batch)
✓ S3 bucket PII discovery → Macie with custom classifiers for continuous monitoring
✗ Textract for email text (Textract = OCR for documents, not NLP for text PII)
📊 Complete Domain Score Analysis – All 7 Documents

120 Questions Total

Mistakes by Subdomain – Priority Order

Domain | Mistakes | Priority | Key Gap
5.1 Evaluation Systems | 6 | 🔥🔥🔥 CRITICAL | CreateEvaluationJob API, 1,000-prompt limit, Cognito workforce, custom metrics, BOLD dataset
3.2 Data Security & Privacy | 6 | 🔥🔥🔥 CRITICAL | IAM Identity Center vs IAM users, OIDC + Cognito, VPC interface endpoint, Comprehend real-time PII
1.3 Data Validation & Pipelines | 6 | 🔥🔥🔥 CRITICAL | BDA blueprints, Glue ETL vs EMR, BDA multimodal parser, custom chunking, Comprehend entity
3.3 AI Governance & Compliance | 5 | 🔥🔥 HIGH | Prompt Management versioning/review, GuardrailIdentifier IAM key, CloudTrail vs CloudWatch
2.3 Enterprise Integration | 4 | 🔥🔥 HIGH | ReAct vs CoT, S3 presigned + SQS for async, EventBridge + Step Functions, RAG vs streaming
3.1 Input/Output Safety | 4 | 🔥🔥 HIGH | Prompt attack filter vs denied topics, GuardrailPolicyType metrics, contextual grounding for RAG
4.3 Monitoring Systems | 3 | 🔥🔥 HIGH | CloudWatch GenAI observability, KB logging vs invocation logging, max_tokens troubleshooting
2.5 App Integration Patterns | 3 | 🔥 MEDIUM | Q Developer contextual IDE integration, WebSocket IAM permissions
5.2 Troubleshooting | 3 | 🔥 MEDIUM | Shadow test vs A/B test, split docs vs compress, max_tokens for truncation
2.1 Agentic AI Solutions | 2 | 🔥 MEDIUM | Strands + MCP vs Bedrock Flows, multi-specialist agents
2.2 Model Deployment | 2 | 🔥 MEDIUM | SageMaker async + GPU for image gen, Bedrock on-demand for Nova
1.2 Select & Configure FMs | 2 | 🔥 MEDIUM | Inference profile for HA (primary + secondary), distillation = prompts only
1.4 Vector Store Solutions | 2 | 🔥 MEDIUM | OpenSearch Service + hybrid search vs Serverless, S3 metadata types
4.1 Cost Optimization | 2 | 🔥 MEDIUM | Batch inference + inference profiles, inference profiles for cost attribution
1.5 Retrieval Mechanisms | 2 | 🔥 MEDIUM | Reranking, SQS-buffered KB sync
3.4 Responsible AI | 2 | 🔥 MEDIUM | BOLD dataset via Bedrock eval (not SageMaker Clarify)
4.2 App Performance | 1 | ✅ LOW | Semantic caching with OpenSearch k-NN
2.4 FM API Integrations | 1 | ✅ LOW | Single API GW + mapping templates + Secrets Manager

🧠 Master Recall Card – 25 Rules to Memorize

BDA Structure | 1 project, multiple blueprints → BDA auto-selects per document
Eval Prompt Limit | Max 1,000 prompts per evaluation job → split into multiple jobs
Shadow Test | Safe validation (traffic copy) vs A/B (live split) – use shadow for new-model validation
VPC Endpoint | Bedrock = Interface endpoint (PrivateLink). Gateway = S3/DynamoDB ONLY
Guardrail IAM | bedrock:GuardrailIdentifier condition key on InvokeModel/Converse – not a Lambda proxy
Prompt Attack Filter | For injection attacks. Denied topics = off-topic content. NOT the same.
IAM Identity Center | Always for enterprise AD/Entra ID. Never create IAM users matching AD accounts.
Glue ETL not EMR | Fine-tuning data → Glue ETL for JSONL. EMR = overkill for format conversion.
Comprehend Entity | Entity recognition = extract attributes. Classification = assign labels to the doc.
ReAct vs CoT | ReAct = observe → reason → act (agentic). CoT = linear reasoning chain (no actions).
KB Sync Resilience | S3 Event → SQS queue → Lambda (poll) → IngestKnowledgeBaseDocuments
Custom Eval Metrics | Company-specific quality = custom dataset + custom metrics. Not industry benchmarks.
S3 Timestamps | Timestamps = system-defined metadata (S3 auto). Authorship = user-defined.
RAG vs Streaming | RAG = accuracy. Streaming = latency. Different problems, different solutions.
Inference Profiles HA | Primary + secondary Region config. Failover, not round-robin.
Prompt Management | Versioning + review workflow before activation. CloudTrail for the compliance audit.
CloudWatch GenAI | Use CloudWatch GenAI observability (not Application Signals + X-Ray) for Lambda + Bedrock.
BDA = Parser Only | BDA extracts from docs/media. Orchestration = EventBridge + Step Functions.
Strands + MCP | Dynamic tool use at runtime. Bedrock Flows = predefined sequential steps only.
Cognito Workforce | Bedrock human evaluation = Cognito user pool (not SageMaker Ground Truth).
Distillation = Prompts | Supply only prompts. The teacher model generates responses internally during distillation.
Semantic Cache | OpenSearch k-NN for semantic matching. ElastiCache Redis = exact-string only.
BOLD Dataset | Bias detection in text gen = Bedrock eval + BOLD dataset. Not SageMaker Clarify.
max_tokens | Truncated responses = increase max_tokens. Not a logging/monitoring fix.
Hybrid Search | OpenSearch Service (not Serverless) + sparse + dense vectors = sub-second hybrid search.