Scenario-Based Practice Questions
40+ exam-style questions weighted by domain importance. Select a domain or attempt all.
Flashcard System
60+ flashcards organized by domain. Click to flip, then rate your knowledge.
"Which AWS Service?" Rapid-Fire Drill
30+ timed scenarios that test your service knowledge under pressure. Pick the right service in 15 seconds per question!
Architecture Pattern Matcher
Match services to use cases, order pipeline steps, and connect architecture patterns.
Exam Readiness Tracker
Self-assess your knowledge across all 5 domains. Progress saves automatically.
Final 48-Hour Focus Board
Your fastest route to exam readiness: study the traps, then grind the final simulator.
Tonight: Highest-Value Topics
- Bedrock APIs: `Converse`, `InvokeModel`, `InvokeModelWithResponseStream`, `CreateModelInvocationJob`, `RetrieveAndGenerate`
- Inference profiles vs. cross-Region inference vs. Provisioned Throughput
- Knowledge Bases vs. OpenSearch hybrid search vs. custom RAG
- Bedrock Agents vs. Strands vs. Step Functions vs. Flows
- Guardrails: tracing, `GuardrailPolicyType`, `bedrock:GuardrailIdentifier`
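A minimal boto3 sketch of the guardrail-tracing topic above, using the `Converse` request shape. The guardrail ID, version, and model ID are placeholders, not real resources.

```python
# Sketch: attaching a guardrail with tracing to a Converse call.
# Guardrail ID/version and model ID below are placeholders.

def build_converse_request(model_id, user_text, guardrail_id, guardrail_version):
    """Build kwargs for bedrock-runtime Converse with guardrail tracing enabled."""
    return {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": user_text}]}],
        "guardrailConfig": {
            "guardrailIdentifier": guardrail_id,
            "guardrailVersion": guardrail_version,
            "trace": "enabled",  # surfaces which policy intervened in the response trace
        },
    }

request = build_converse_request(
    "anthropic.claude-3-haiku-20240307-v1:0",
    "Summarize our refund policy.",
    "gr-example123", "1",
)

# import boto3
# client = boto3.client("bedrock-runtime")
# response = client.converse(**request)
# The guardrail trace in the response reports which policy type intervened.
```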
Tomorrow: Stabilize the Gaps
- Reranker models, hybrid retrieval, embedding drift, KB ingestion logging
- Identity patterns: Cognito OIDC, IAM Identity Center, temporary credentials
- Optimization patterns: prompt caching, batch inference, model cascading
- Evaluation patterns: LLM-as-judge, RAGAs, A/B testing, canary rollout
- Application patterns: Amplify AI Kit, Amazon Q Developer, MCP deployment
Final Practice Order
- Run the full deck once in Final Practice.
- Review only missed questions and write the decision rule in one line.
- Use flashcards here for weak domains.
- Do one last retry-misses pass, then finish with a shuffled fresh run.
| If the question says... | Usually think... | Why it matters |
|---|---|---|
| Managed failover and best Region for inference | Cross-Region inference profile | More precise than generic Route 53 or manual cross-Region logic. |
| Large async batch jobs for text/image workloads | `CreateModelInvocationJob` | `StartAsyncInvoke` is a trap for Nova Reel video generation, not general batch inference. |
| Need to improve ordering of already relevant retrieved docs | Reranker models | Retrieval found good docs, but ranking is weak. |
| Need to inspect KB ingestion failures | Knowledge base logging to CloudWatch Logs | CloudTrail logs API calls, not document-level ingestion statuses. |
| Need to force guardrail use on every model call | IAM with `bedrock:GuardrailIdentifier` | Central enforcement beats custom proxy Lambda. |
| Need to know exactly which guardrail policy intervened | `trace: enabled` + `GuardrailPolicyType` metrics | Better than only knowing input vs output was blocked. |
| Need a model to stop after a phrase | Stop sequences | Prompt instructions are weaker and unreliable. |
| Unpredictable traffic with long idle periods | On-demand Bedrock via Lambda/API | Provisioned Throughput is usually wrong unless utilization is steady. |
| Deterministic compliance workflow with audit | Step Functions | Agents and Flows are too dynamic for strict execution guarantees. |
| Persistent MCP server connections | ECS Fargate | Lambda is a common distractor but is a poor fit for persistent SSE connections. |
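The `bedrock:GuardrailIdentifier` row above can be sketched as a deny-unless IAM policy built in Python. The account ID and guardrail ARN are placeholders; verify the exact condition shape against current IAM documentation before relying on it.

```python
# Sketch: IAM policy that denies model invocation unless the request carries
# the required guardrail, via the bedrock:GuardrailIdentifier condition key.
# The guardrail ARN is a placeholder.

import json

def guardrail_enforcement_policy(guardrail_arn):
    """Deny InvokeModel unless the request references the required guardrail."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "RequireGuardrail",
            "Effect": "Deny",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream",
            ],
            "Resource": "*",
            "Condition": {
                "StringNotEquals": {"bedrock:GuardrailIdentifier": guardrail_arn}
            },
        }],
    }

policy = guardrail_enforcement_policy(
    "arn:aws:bedrock:us-east-1:111122223333:guardrail/EXAMPLEID")
print(json.dumps(policy, indent=2))
```

Central enforcement like this applies to every caller the policy is attached to, which is why it beats wrapping Bedrock behind a custom proxy Lambda.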
Retrieval Stack
- Knowledge Bases = managed RAG, least ops
- OpenSearch hybrid = best search tuning + high QPS
- pgvector = vector + SQL joins
- S3 Vectors = billion-scale, lowest cost
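The "least ops" option above (Knowledge Bases) comes down to a single `RetrieveAndGenerate` call. A sketch with placeholder IDs:

```python
# Sketch: minimal managed-RAG request against a Bedrock Knowledge Base.
# Knowledge base ID and model ARN are placeholders.

def build_rag_request(query, kb_id, model_arn):
    """Build kwargs for bedrock-agent-runtime retrieve_and_generate."""
    return {
        "input": {"text": query},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                "modelArn": model_arn,
            },
        },
    }

request = build_rag_request(
    "What is our PTO policy?",
    "KBEXAMPLE01",
    "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0",
)

# import boto3
# client = boto3.client("bedrock-agent-runtime")
# response = client.retrieve_and_generate(**request)
# The grounded answer arrives in the response output text, with source citations.
```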
Agent Stack
- Bedrock Agents = managed agent orchestration
- Strands = open-source, more control
- AgentCore = runtime/policy/memory/evals building blocks
- Step Functions = deterministic orchestration, not autonomous reasoning
Security Stack
- Guardrails = content safety and grounded responses
- Cognito / Identity Center = federation and temporary access
- PrivateLink = network isolation
- CloudTrail = API audit, not KB ingestion detail
Additional Study Material
Deep-dive content on topics not fully covered in the reference guide.
Amazon Q Developer
AI-powered development assistant (replaced CodeWhisperer):
- Code Generation: Inline suggestions, function completion, boilerplate
- Code Transformation: Refactoring, language translation, modernization
- Security Scanning: Vulnerability detection, remediation suggestions
- Unit Test Generation: Auto-generate tests for existing code
- Code Explanation: Explain complex code blocks in plain English
- CLI Assistance: Natural language to CLI commands
IDE Support: VS Code, JetBrains, Visual Studio, AWS Console, CLI
Amazon Q Business
Fully managed GenAI assistant for enterprise knowledge:
- 40+ Data Source Connectors: Confluence, SharePoint, Salesforce, S3, databases, Slack, Jira
- ACL Respect: Honors existing access permissions from source systems
- Natural Language Q&A: Ask questions across all connected sources
- Admin Controls: Topic blocking, response guardrails
- Plugins: Create custom actions (ticket creation, approvals)
AWS Amplify AI Kit
- Declarative React components: `<AIConversation>` with built-in streaming response handling
- Conversation history management
- Authentication integration (Cognito)
- Backend Bedrock connection (no direct client API calls)
- Rapid prototyping for mobile/web GenAI features
Bedrock Flows vs Step Functions
| Aspect | Bedrock Flows | Step Functions |
|---|---|---|
| Type | No-code prompt chain builder | General workflow orchestration |
| Users | Business analysts, non-devs | Developers |
| Best For | LLM prompt chains | Complex business logic, audit trails |
| Execution | Sequential prompt flow | Deterministic state machine |
| Integrations | Bedrock models, KBs, Guardrails | 200+ AWS services |
Bedrock Data Automation
- Automated extraction from complex documents (PDFs, images)
- Handles tables, forms, charts, key-value pairs
- Pre-processes documents before Knowledge Base ingestion
- Better retrieval accuracy for structured content
- Replaces need for custom Textract + parsing pipelines
CRM Enhancement Pattern
- CRM events → EventBridge → Lambda → Bedrock
- Use cases: summarize customer interactions, generate next-best-action, auto-draft responses
- Write results back to CRM via API
- Works with Salesforce, HubSpot, ServiceNow
- Lambda handles authentication and data transformation
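A sketch of the Lambda step in this pattern: take the EventBridge event, build a summarization prompt, and return a payload to write back to the CRM. The event fields (`recordId`, `interactionNotes`) are an illustrative schema, not a real CRM payload, and the model call is injected so the sketch runs without AWS access.

```python
# Sketch of the CRM pattern's Lambda step. Event field names are illustrative.

def build_summary_prompt(event):
    """Turn a CRM interaction event into a summarization prompt."""
    notes = event["detail"]["interactionNotes"]
    return (
        "Summarize this customer interaction in two sentences and suggest "
        f"one next-best-action:\n\n{notes}"
    )

def handler(event, context, invoke=None):
    # invoke is injected for testing; in production it wraps a
    # bedrock-runtime Converse call.
    prompt = build_summary_prompt(event)
    summary = invoke(prompt) if invoke else "(no model client configured)"
    # The caller writes this payload back to the CRM via its API.
    return {"crmRecordId": event["detail"]["recordId"], "summary": summary}

# EventBridge wraps custom payloads under "detail":
event = {"detail": {"recordId": "0012345",
                    "interactionNotes": "Customer asked about renewal pricing."}}
result = handler(event, None, invoke=lambda p: "stub summary")
```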
Prompt Caching
TTL Options
- Default TTL: 5 minutes — the cache entry expires if no request reuses that prefix
- Extended TTL: 1 hour — For longer sessions (document QA, agentic workflows)
- Cache refreshes on each access within TTL window
- TTL is per-model, per-region
Supported APIs
- Converse API
- InvokeModel API
- ConverseStream API
- InvokeModelWithResponseStream API
- Compatible with Cross-Region Inference
Key Requirements
- Minimum tokens per checkpoint: Varies by model (e.g., 1024 tokens for Claude)
- Cache checkpoints are placed at the end of the cacheable prefix
- Prefix must be an exact byte-for-byte match
- Multiple cache checkpoints possible in a single prompt
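The checkpoint rules above can be sketched as a Converse request that places a `cachePoint` content block after a long, stable system prompt. The model ID is a placeholder, and minimum-token and support rules vary by model.

```python
# Sketch: cache checkpoint at the end of a reused system prompt.
# Everything before the cachePoint must repeat byte-for-byte to get a hit.

LONG_SYSTEM_PROMPT = "You are a support agent. " * 200  # stand-in for a long, reused prefix

def build_cached_request(model_id, user_text):
    return {
        "modelId": model_id,
        "system": [
            {"text": LONG_SYSTEM_PROMPT},
            {"cachePoint": {"type": "default"}},  # checkpoint closes the cacheable prefix
        ],
        "messages": [{"role": "user", "content": [{"text": user_text}]}],
    }

request = build_cached_request(
    "anthropic.claude-3-7-sonnet-20250219-v1:0", "Where is my order?")

# import boto3
# response = boto3.client("bedrock-runtime").converse(**request)
# Usage metadata in the response distinguishes cache reads from cache writes.
```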
Cost Model
- Cache Write: Slightly higher cost than normal processing (one-time)
- Cache Read: Up to 90% cheaper than normal processing
- Break-even: After ~2-3 cache reads, you save money
- Response metadata shows cache hit/miss status
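The break-even claim can be sanity-checked with simple arithmetic. The multipliers below (writes at 1.25x the base input-token price, reads at 0.1x) are illustrative assumptions; actual multipliers vary by model, which is what moves break-even between one and a few reads.

```python
# Back-of-envelope break-even check for prompt caching.
# write_mult/read_mult are illustrative price multipliers, not published rates.

def breakeven_reads(write_mult=1.25, read_mult=0.10):
    """Cache reads needed before cumulative cost drops below the no-cache cost."""
    extra_write_cost = write_mult - 1.0   # premium paid once to populate the cache
    saving_per_read = 1.0 - read_mult     # discount earned on each cache hit
    reads = 0
    while reads * saving_per_read <= extra_write_cost:
        reads += 1
    return reads

print(breakeven_reads())          # cheap reads recoup the write premium quickly
print(breakeven_reads(2.0, 0.5))  # pricier writes and reads push break-even out
```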
Prompt Caching Use Cases
1. Document QA (same document, many questions)
2. Agentic workflows with repeated tool definitions
3. Few-shot learning with extensive example sets
4. Any pattern where the prompt prefix is reused across requests
AgentCore Runtime
- Session Isolation: Each user session runs in its own sandboxed environment
- Duration: Up to 8 hours of continuous execution per session
- Bidirectional Streaming: Real-time interaction between user and agent
- Code Interpreter: Execute Python/JavaScript in sandbox
- Browser Runtime: Navigate web pages, fill forms, extract data
- Auto-scaling based on concurrent sessions
AgentCore Gateway
- API → Tool: Convert any REST API or Lambda into agent-compatible tools
- MCP Server Connection: Connect to external MCP servers
- Unified Interface: Single management layer for all tool integrations
- Authentication: Handles tool-level auth (API keys, OAuth, IAM)
- Schema Management: Auto-generates tool schemas for the agent
AgentCore Policy
- Natural Language Rules: "Only admin users can access the delete API"
- Auto-compilation: Converts to Cedar policy language
- Real-time Enforcement: Evaluated on every tool call during execution
- Identity-aware: Uses user role, tenant, custom claims for decisions
- Audit Logging: Every policy decision is logged
AgentCore Memory
- Session Memory: Short-term context within a single conversation
- Episodic Memory: Long-term memory that persists across sessions
- Agents learn from past interactions (user preferences, past resolutions)
- Semantic search over memory store
- Memory scoped per user, per tenant
AgentCore Evaluations
- 13 Built-in Evaluators: Tool selection, relevance, safety, consistency, etc.
- Custom Evaluators: Define domain-specific quality checks
- Continuous Monitoring: Evaluate every interaction in real-time
- Batch Evaluation: Run evaluations on historical conversations
- Integrates with CloudWatch for alerting
AgentCore Identity & Observability
Identity:
- Integrates with identity providers (Okta, Azure AD, Cognito)
- Custom claims for tenant-level isolation
- Maps external identities to agent authorization context
Observability:
- CloudWatch dashboards (latency, throughput, errors, tokens)
- OpenTelemetry integration for distributed tracing
- Track agent behavior across tools and sessions
AI Governance Mechanisms
- Model Inventory: Centralized catalog of all AI models (SageMaker Model Registry)
- Approval Workflows: Pending → Approved → Rejected status for each model version
- Version Control: Track model versions, parameters, training data
- Data Lineage: Source data → preprocessing → training → deployment tracking
- Risk Assessment: Categorize models by risk level (low/medium/high)
SageMaker Model Cards
- Standardized documentation for ML models
- Content: Purpose, training data, performance metrics, ethical considerations, limitations
- Designed for auditor review
- Integrates with Model Registry
- Exportable as PDF for compliance records
- Supports custom fields for domain-specific requirements
Responsible AI Principles on AWS
| Principle | AWS Implementation |
|---|---|
| Fairness | SageMaker Clarify bias detection (pre/post-training) |
| Transparency | Model Cards, SHAP explanations, feature importance |
| Privacy | Data never used for training, encryption, VPC isolation |
| Security | Guardrails, IAM, KMS encryption, PrivateLink |
| Robustness | Model evaluation, A/B testing, canary deployments |
| Governance | Model Registry, approval workflows, audit trails |
SageMaker Clarify Bias Metrics
- CI (Class Imbalance): pre-training metric; detects over/under-representation of a group
- DPL (Difference in Positive Proportions in Labels): pre-training metric; measures labeling bias between groups
- DPPL (Difference in Positive Proportions in Predicted Labels): post-training metric; compares predicted outcomes across groups
- KL Divergence: statistical measure of distribution difference between groups
RAGAs Framework
Purpose-built metrics for RAG pipeline evaluation:
- Context Relevance: Are retrieved documents relevant to the query?
- Faithfulness: Is the answer grounded in retrieved context? (detects hallucination)
- Answer Relevance: Does the answer address the actual question?
- Answer Correctness: Is the answer factually correct?
Automated evaluation using LLM-as-judge. Open-source Python library.
LLM-as-Judge Pattern
- Use a powerful evaluator model to score outputs of another model
- Dimensions: coherence, relevance, fluency, harmlessness, helpfulness
- Scalable alternative to human evaluation
- Supported by Bedrock Model Evaluation (automatic mode)
- Limitations: May have style bias, position bias, verbosity bias
- Best practice: use a different model family as judge than the one being evaluated
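A sketch of the judge side of this pattern: a rubric prompt plus a tolerant parser for the score. The rubric wording is illustrative, and the actual judge call (a Converse request to a different model family) is stubbed out.

```python
# Sketch: LLM-as-judge rubric prompt and score parsing.
# Rubric text is illustrative; the judge model call is commented out.

import re

RUBRIC = (
    "Rate the RESPONSE to the QUESTION on a 1-5 scale for relevance and "
    "coherence. Answer with 'SCORE: <n>' and one short justification.\n\n"
    "QUESTION: {question}\nRESPONSE: {response}"
)

def build_judge_prompt(question, response):
    return RUBRIC.format(question=question, response=response)

def parse_score(judge_output):
    """Extract the 1-5 score; return None if the judge ignored the format."""
    match = re.search(r"SCORE:\s*([1-5])", judge_output)
    return int(match.group(1)) if match else None

# judge_output = boto3.client("bedrock-runtime").converse(...)  # judge from a
# different model family than the one being evaluated, per the best practice above
print(parse_score("SCORE: 4 - relevant and mostly coherent"))  # -> 4
```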
Bedrock Model Evaluation
| Type | Automatic | Human |
|---|---|---|
| Method | Built-in metrics + LLM-as-judge | Human reviewers via Ground Truth |
| Speed | Fast, scalable | Slow, expensive |
| Best For | Objective metrics, model comparison | Subjective quality, edge cases |
| Metrics | Accuracy, toxicity, robustness, quality dimensions | Custom rubrics, preferences |
SageMaker Deployment Strategies
- Canary: Send small % to new model first, gradually increase. Auto-rollback on CloudWatch alarm triggers.
- Blue-Green: Deploy new model in parallel, switch all traffic at once. Old model stays for instant rollback.
- Linear: Shift traffic in equal increments (e.g., 10% every 10 min).
- All-at-once: Switch immediately. Fastest but riskiest — no gradual rollout.
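The canary strategy above maps to a `DeploymentConfig` passed to `update_endpoint`. A sketch with a placeholder alarm name; check the field names against the current SageMaker API reference.

```python
# Sketch: SageMaker DeploymentConfig for a canary rollout with
# alarm-driven auto-rollback. Alarm name is a placeholder.

def canary_deployment_config(alarm_name, canary_percent=10, wait_seconds=600):
    return {
        "BlueGreenUpdatePolicy": {
            "TrafficRoutingConfiguration": {
                "Type": "CANARY",
                "CanarySize": {"Type": "CAPACITY_PERCENT", "Value": canary_percent},
                "WaitIntervalInSeconds": wait_seconds,  # bake time before the full shift
            },
        },
        "AutoRollbackConfiguration": {
            "Alarms": [{"AlarmName": alarm_name}],  # alarm trip -> automatic rollback
        },
    }

config = canary_deployment_config("prod-model-error-rate")

# import boto3
# boto3.client("sagemaker").update_endpoint(
#     EndpointName="my-endpoint",
#     EndpointConfigName="my-new-config",
#     DeploymentConfig=config,
# )
```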
A/B Testing with SageMaker Experiments
- Organize model variants as experiment trials
- Track parameters, metrics, and artifacts per variant
- Compare results across trials with built-in visualization
- Integrates with Model Registry for version tracking
- Use for: prompt variants, model comparisons, hyperparameter tuning
Common Troubleshooting Patterns
- Throttling (429): Retry with exponential backoff + Cross-Region Inference
- RAG quality degradation: Re-evaluate chunking, re-index, check embedding consistency
- Agent wrong tool selection: Improve tool descriptions, use AgentCore Evaluations
- High latency: Enable streaming, optimize prompt length, use prompt caching
- Cost overruns: Implement intelligent routing, prompt caching, batch processing
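The throttling fix above can be sketched as a generic retry wrapper with exponential backoff and jitter. The flaky callable stands in for a Bedrock call; in production you would also lean on the SDK's built-in retry configuration.

```python
# Sketch: exponential backoff with jitter around a throttled call.

import random
import time

def call_with_backoff(fn, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Retry fn on throttling-style errors, doubling the delay each attempt."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as err:
            if "Throttling" not in str(err) or attempt == max_attempts - 1:
                raise  # non-throttling error, or out of attempts
            delay = base_delay * (2 ** attempt) * random.uniform(0.5, 1.0)  # jitter
            sleep(delay)

# Example: a call that throttles twice, then succeeds.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("ThrottlingException")
    return "ok"

print(call_with_backoff(flaky, sleep=lambda s: None))  # -> ok
```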