Domain 1 · 20% of exam

Fundamentals of AI and ML

This is the warm-up. It's your 'what is even happening' foundation: what AI is, what ML is, how supervised, unsupervised, and reinforcement learning differ, and how the ML lifecycle works.

Task statements: 1.1, 1.2, 1.3
Estimated questions: ~10 of 50 scored

Updated May 21, 2026

The Big Picture (read this first)

Imagine you have a 5-year-old who’s never heard of computers. Here is how to explain the words on the box of this exam:

Artificial Intelligence (AI) = a giant umbrella term. Any computer that does something that looks “smart.” That’s it. Playing chess, recognizing faces, answering email — all AI.
Machine Learning (ML) = a kind of AI where the computer learns the rules from examples instead of a human writing the rules. You show it 10,000 cat pictures, it figures out what a cat looks like.
Deep Learning (DL) = a kind of ML that uses many-layered “neural networks” (math inspired loosely by brain cells). Best for messy, complex data like images, speech, and language.
Generative AI (GenAI) = a kind of deep learning that can produce new things — text, images, audio, code — instead of only classifying or predicting. ChatGPT is GenAI.
Foundation Model (FM) = a giant pre-trained GenAI model, trained on huge data, that can be adapted to many tasks. Think of it as a college graduate vs. a kindergartner — already broadly educated.
Agentic AI = an FM that doesn’t just answer once. It plans steps, uses tools (search, code, APIs), checks its work, and iterates toward a goal. It can act in the world, not just talk.

The nesting

AI ⊃ ML ⊃ Deep Learning ⊃ Generative AI ⊃ Foundation Models. Agentic AI sits on top of FMs. Each is a stricter subset of the one before it.

"GenAI is a type of ML" is true on the exam

Some people think GenAI is its own thing. The exam treats GenAI as a subset of deep learning, which is a subset of ML. If a question asks “Which of the following is a category that includes generative AI?” — the answer is machine learning (or deep learning), not “AI” alone if both are options.

Task 1.1 — Basic AI Concepts and Terminology

▶1.1.1 Define basic AI terms (you must know these cold)

Term	Plain-language meaning	Don't confuse with
AI	Anything where a computer does something "smart"	—
ML	AI that learns rules from examples (data)	Hard-coded if/else logic
Deep learning	ML using neural networks with many layers	Shallow ML like decision trees
Neural network	Many layers of math, loosely inspired by neurons, that transform input into output	Actual biology — they're not really brains
Algorithm	A set of instructions / a recipe	The trained model itself
Model	The trained output of an algorithm + data. The "smart thing" you actually use.	The algorithm. Algorithm + data → model.
Inference	Using a trained model to make a prediction or generate output	Training, which is teaching the model
Training	The process of teaching the model from data	Inference, which is using it
Bias (in ML)	Systematic error favoring certain outcomes — often from skewed data	Statistical bias-variance tradeoff (different concept)
Variance (in ML)	How much a model's predictions change when training data changes; high variance = overfitting	Statistical variance (related but specific here)
Generative AI	AI that creates new content (text, images, audio, code)	Discriminative AI, which classifies or predicts existing categories
Foundation model (FM)	A massive pretrained model adaptable to many tasks	A traditional task-specific model (like a single classifier)
LLM	Large Language Model — an FM specifically for text	FM in general (FM is broader; LLM is text-only)
Multi-modal	Model that handles multiple input types: text + images + audio	Single-modal models
Diffusion model	Model type used for image/video generation (Stable Diffusion, etc.)	Transformer (used for text)
Transformer	The neural network architecture behind almost all modern LLMs	Diffusion (different architecture for images)
Embedding	A list of numbers representing meaning. Words/sentences/images become vectors so the computer can compare them.	Tokens (separate concept — see Domain 2)
Token	A chunk of text the model sees as one unit. Roughly 4 characters or ~¾ of a word.	Word — tokens aren't always words
Agentic AI	AI that plans, uses tools, observes results, and iterates — autonomous goal-seeking	A normal LLM that just answers a single prompt
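
To make the Embedding row concrete, here is a minimal sketch of comparing meanings as vectors. The three-number vectors are invented for illustration (real embeddings come from a model and have hundreds or thousands of dimensions), and cosine similarity is one common comparison method, used here as an assumption rather than something the exam guide specifies.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, hand-written for this example only.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.75, 0.2],
    "invoice": [0.1, 0.2, 0.95],
}

sim_cat_kitten = cosine_similarity(embeddings["cat"], embeddings["kitten"])
sim_cat_invoice = cosine_similarity(embeddings["cat"], embeddings["invoice"])

# Related meanings score closer to 1 than unrelated ones.
print(f"cat vs kitten:  {sim_cat_kitten:.3f}")
print(f"cat vs invoice: {sim_cat_invoice:.3f}")
```

The point to remember is only that similar meanings end up as nearby vectors, which is what lets a computer compare them.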

Algorithm vs. Model

This is constantly confused. Algorithm = the recipe. Model = the cake. You feed an algorithm data and out pops a model. You then infer with the model.
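
The recipe-versus-cake distinction can be sketched in a few lines. This is an illustrative example only: the algorithm chosen (least-squares line fitting) and the room/price numbers are assumptions made up for the sketch, and the function names `fit_line` and `predict` are hypothetical.

```python
def fit_line(xs, ys):
    """The ALGORITHM: least-squares fit of a straight line (the recipe)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    # The learned numbers are the MODEL (the cake).
    return {"slope": slope, "intercept": intercept}

def predict(model, x):
    """INFERENCE: apply the trained model to a new input."""
    return model["slope"] * x + model["intercept"]

# Made-up training data: number of rooms -> price in thousands.
rooms = [1, 2, 3, 4]
prices = [100, 150, 200, 250]

model = fit_line(rooms, prices)   # training: algorithm + data -> model
estimate = predict(model, 5)      # inference: model + new input -> output
print(model, estimate)
```

Note that `fit_line` never changes, while `model` would be different for every dataset you feed it.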

▶1.1.2 Identify similarities and differences between AI / ML / Deep Learning / GenAI

You will be asked something like “Which of the following is a subset of machine learning?” or “What distinguishes generative AI from traditional ML?” — these are the answers:

Question form	Answer
Subset of AI	ML, DL, GenAI
Subset of ML	DL, GenAI (because it uses DL)
Subset of DL	GenAI
What separates GenAI from traditional ML	GenAI creates new content; traditional ML classifies/predicts existing categories
What separates DL from "shallow" ML	DL uses many neural network layers, handles unstructured data better, needs more compute and more data

If a question says...

“Spam vs. not spam” → classification (traditional ML)
“Predict house price” → regression (traditional ML)
“Group customers into segments without labels” → clustering (unsupervised ML)
“Recognize a cat in a photo” → deep learning (CNN)
“Write a marketing email from scratch” → generative AI
“Write code suggestions in an IDE” → generative AI (code LLM)

▶1.1.3 Describe types of inferencing

Inference = the model “thinking” — taking input, producing output. Inference happens in different shapes depending on speed and volume needs.

Type	What it means (plain)	When to use	AWS service / pattern
Real-time	Send one request, wait for one response right now	Chatbots, fraud check at checkout, face login	SageMaker real-time endpoint, Bedrock InvokeModel
Batch	Send a huge file of inputs, get all outputs back later	Score 10M customer records overnight	SageMaker Batch Transform, Bedrock Batch Inference
Asynchronous	Submit and don't block; pick up the result when ready	Big inputs (long video, big PDF) where waiting is fine	SageMaker Async Inference
Serverless	Endpoint that scales to zero between requests, scales up on demand	Spiky low-volume traffic where idle cost matters	SageMaker Serverless Inference
Streaming (GenAI)	Output is sent token-by-token as it generates	Chat UIs where users want to see typing in real time	Bedrock InvokeModelWithResponseStream

Real-time ≠ streaming

Real-time means “synchronous, low-latency.” Streaming means “the response is delivered piece by piece.” A real-time API can also be streaming, but they aren’t the same word.
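
The difference in delivery can be sketched without any AWS calls. In this hedged example, `generate_tokens` is a hypothetical stand-in for a model that emits canned tokens; the two wrapper functions only show how the same output reaches the caller either all at once or piece by piece.

```python
def generate_tokens(prompt):
    """Hypothetical stand-in for a model; yields canned tokens one at a time."""
    for token in ["Hello", ",", " world", "!"]:
        yield token

def respond_whole(prompt):
    """Non-streaming: the caller gets nothing until the full text is assembled."""
    return "".join(generate_tokens(prompt))

def respond_streaming(prompt):
    """Streaming: the caller receives each piece as soon as it is produced."""
    yield from generate_tokens(prompt)

full = respond_whole("hi")
pieces = list(respond_streaming("hi"))

print(full)     # one complete string
print(pieces)   # the same content, delivered as separate chunks
```

Both calls here are synchronous, which is why a real-time API can also be a streaming one.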

Cost intuition

Batch is the cheapest per inference. Real-time is the most expensive (you pay for the endpoint to be ready). Serverless is in between but has cold-start latency. The exam loves “most cost-effective for predictable nightly workload” → batch.

▶1.1.4 Describe ML development lifecycle (the pipeline)

Every ML project follows roughly the same path. The exam loves to ask you to put these in order or pick which AWS service does which step.

Business problem framing — what are we even trying to predict / generate / decide?
Data collection — gather raw data from sources (S3, databases, APIs).
Exploratory data analysis (EDA) — understand the data: distributions, missing values, weirdness.
Data preparation / feature engineering — clean, transform, label, split into train/validation/test.
Model selection — pick algorithm or pretrained model.
Training — feed data to algorithm, produce model.
Evaluation — test on held-out data; check metrics (accuracy, F1, etc.).
Tuning — adjust hyperparameters, repeat.
Deployment — push model to production endpoint.
Monitoring — watch for drift, errors, bias in production. Retrain when needed.
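
The "split into train/validation/test" part of the data preparation stage can be sketched as follows. You will not write this on the exam; it is here only to show what the split means. The 70/15/15 proportions, the fixed seed, and the function name `split_dataset` are assumptions chosen for illustration.

```python
import random

def split_dataset(rows, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle a copy of the rows, then cut it into train/validation/test."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # whatever remains is held out
    return train, val, test

rows = list(range(100))  # stand-in for 100 records
train, val, test = split_dataset(rows)
print(len(train), len(val), len(test))
```

The model learns from `train`, is tuned against `val`, and is judged once on `test`, which it has never seen.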

Stage	AWS service that fits
Data storage	S3 (data lake), RDS, Redshift
Data prep	SageMaker Data Wrangler, AWS Glue
Labeling	SageMaker Ground Truth
Training (custom)	SageMaker AI Training Jobs
Pretrained models	SageMaker JumpStart, Bedrock
Hosting / inference	SageMaker Endpoints, Bedrock
Monitoring	SageMaker Model Monitor, CloudWatch
Bias / explainability	SageMaker Clarify
Documentation / governance	SageMaker Model Cards

Out of scope for this exam

You do not need to know how to write training code, choose hyperparameters, or design feature engineering pipelines. The exam wants you to know what each stage is and which AWS service fits there. If you’re spending hours on hyperparameter math — stop, that’s the ML Specialty exam.

▶1.1.5 Learning paradigms — the most-tested concept in Domain 1

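Three main paradigms (supervised, unsupervised, reinforcement), plus two variants you should recognize (semi-supervised and self-supervised). Memorize the one-line definition for each, then drill scenarios.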

Type	Plain meaning	Data needed	Common tasks
Supervised	Learn from examples that have the right answer attached	Labeled data	Classification, regression
Unsupervised	Find patterns without being told what's right	Unlabeled data	Clustering, anomaly detection, dimensionality reduction
Reinforcement	Learn by trial and error from rewards	An environment that gives feedback	Game playing, robotics, RLHF for LLMs
Semi-supervised	Some labeled, mostly unlabeled	Mixed	When labeling is expensive
Self-supervised	The data labels itself (e.g., predict the next word)	Massive unlabeled corpora	How LLMs / FMs are pre-trained
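
"The data labels itself" in the self-supervised row is easier to see in code. This is a simplified sketch: it splits on whole words and uses a fixed three-word context, both of which are assumptions for readability (real pre-training works on tokens and much longer contexts). The function name `next_word_pairs` is hypothetical.

```python
def next_word_pairs(text, context_size=3):
    """Turn raw, unlabeled text into (context, next word) training pairs."""
    words = text.split()
    pairs = []
    for i in range(context_size, len(words)):
        context = tuple(words[i - context_size:i])  # the input
        target = words[i]                           # the "label", taken from the text itself
        pairs.append((context, target))
    return pairs

pairs = next_word_pairs("the cat sat on the mat")
for context, target in pairs:
    print(context, "->", target)
```

No human labeled anything; every target came straight out of the raw sentence.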

Classification vs. regression vs. clustering

Classification = predicting a category. “Is this spam?” “Is this image a cat, dog, or bird?” Output is discrete labels.
Regression = predicting a number. “What’s the house price?” “What’s tomorrow’s temperature?” Output is continuous.
Clustering = grouping similar things without knowing the groups in advance. “Segment my customers into 5 natural groups.” Output is group assignments, not predictions of a known label.

Mnemonic: Classification = Categories. Regression = Real numbers. Clustering = Connect similar without labels.
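
Clustering is the one of the three that works without labels, so it is worth seeing once. The sketch below uses a one-dimensional version of k-means, which is an assumption chosen for illustration (the text above does not name an algorithm); the spend figures and starting centers are made up.

```python
def kmeans_1d(values, centers, iterations=10):
    """Tiny 1-D k-means: assign each value to its nearest center, then re-average."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Made-up monthly spend per customer. Note there are no labels anywhere.
spend = [10, 12, 11, 95, 100, 105]
centers, groups = kmeans_1d(spend, centers=[0, 50])
print(centers)
print(groups)
```

The output is group assignments that the algorithm discovered, not a prediction of a category someone defined in advance.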

Scenario drills (decide before reading the answer)

“Predict if a credit card transaction is fraudulent” → supervised, classification
“Predict tomorrow’s stock price” → supervised, regression
“Group website visitors into similar shopping personas” → unsupervised, clustering
“Train a robot to walk by trying random moves” → reinforcement
“Train a chatbot to be more helpful by getting thumbs-up/down feedback” → reinforcement (RLHF)
“Detect when a server’s metrics look unusual” → unsupervised, anomaly detection
“Pre-train an LLM by predicting the next word in trillions of sentences” → self-supervised

Reinforcement learning trap

Don’t pick reinforcement learning just because the question mentions “feedback” or “improvement over time.” If there’s no environment with rewards and no agent making actions, it’s not RL. A spam filter that retrains weekly is still supervised, not RL.

Task 1.2 — Practical Use Cases for AI

▶1.2.1 When AI helps and when it doesn't

This is heavily tested. Some questions ask “When is AI not the right tool?” — the answers usually involve:

Hard rules required — tax calculations, regulatory thresholds. Use deterministic code, not ML.
Decisions that must be 100% explainable — life-or-death medical or legal. ML can be too opaque.
Cost outweighs benefit — small dataset, simple problem. A spreadsheet rule beats a model.
No data available — ML needs data. No data = no ML.
Real-time, very low latency needs — inference on a large LLM can be too slow.

When AI is the right tool

Pattern recognition in messy unstructured data (images, text, audio), prediction from many features, personalization at scale, content generation, scenarios where rules would be too numerous to write by hand.

▶1.2.2 Traditional ML vs. Generative AI — when to pick which (v1.1 addition)

Use traditional ML when…	Use generative AI / FM when…
You have structured tabular data and a specific prediction target	You have unstructured data (text, images, audio) or want to generate content
The task is narrow and defined (fraud, churn, demand forecast)	The task is broad or open-ended (Q&A, summaries, drafts)
You need full explainability for regulators	Some opacity is acceptable
You have plenty of labeled examples	Few or no labeled examples (FMs work zero-shot or few-shot)
You need consistent, deterministic outputs	You can tolerate non-deterministic outputs and verify them
Cost per prediction must be very low	Token-based cost is acceptable

FMs are not always the right answer

The exam will tempt you with “use GenAI” answers in scenarios where traditional ML or even rule-based logic is correct. If the use case is structured prediction with abundant labels and explainability requirements (e.g., loan approval), traditional ML wins.

▶1.2.3 Real-world AWS-flavored use cases

Use case	AWS managed service
Sentiment analysis on customer reviews	Amazon Comprehend
Detect text in scanned documents / forms	Amazon Textract
Detect faces, objects, unsafe content in images	Amazon Rekognition
Convert audio recordings to text	Amazon Transcribe
Convert text to spoken audio	Amazon Polly
Translate text across languages	Amazon Translate
Build chatbots / voice bots	Amazon Lex
Personalize product recommendations	Amazon Personalize
Enterprise search across documents	Amazon Kendra
Generate text / code / images via LLM	Amazon Bedrock
Build, train, deploy custom ML models	Amazon SageMaker AI
Enterprise AI assistant for employees	Amazon Q Business
AI coding assistant	Amazon Q Developer
BI insights / dashboards with AI	Amazon QuickSight Q
Migrate / modernize workloads with AI	AWS Transform (v1.1 addition)

Textract vs. Rekognition (always tested)

Both work on images. Textract reads printed/handwritten words and form fields. Rekognition identifies objects, faces, scenes, and unsafe content. A scanned invoice → Textract. A photo of a person → Rekognition.

Comprehend vs. Kendra

Comprehend takes text in, gives back analysis (sentiment, entities, key phrases, language detection). Kendra takes a question and returns the most relevant documents from a corpus you pre-indexed. Comprehend = analysis. Kendra = search.

Personalize vs. just "use Bedrock"

If the question is “recommend products to a user based on past behavior” — use Personalize, not Bedrock. Personalize is a managed recommendation engine. Bedrock would be over-engineering.

▶1.2.4 Capabilities and limitations of AI/ML for solving business problems

Capabilities

Process huge volumes of unstructured data fast
Personalize at scale
Automate repetitive analytical work
Surface patterns humans would miss
Generate content drafts, summaries, code

Limitations

Hallucinations — GenAI invents plausible-sounding falsehoods
Bias — model reflects bias in training data
Opacity — hard to explain why a deep model made a specific call
Drift — performance degrades as the world changes after training cutoff
Cost — large models can be expensive per inference
Privacy / data leakage — sensitive data could leak into outputs
Non-determinism — same prompt may give different answers

Task 1.3 — ML Development Lifecycle (deeper)

▶1.3.1 Components of an ML pipeline (in order)

Data collection
Exploratory data analysis (EDA)
Data pre-processing
Feature engineering
Model training
Hyperparameter tuning
Evaluation
Deployment
Monitoring

Mnemonic: "Data Engineers Prep Features, Train Tuned Evaluations, Deploy Monitors"

D-E-P-F-T-T-E-D-M. Use whatever you can remember; the order is what’s tested.

▶1.3.2 Sources of ML models — buy, build, or borrow

Pre-trained from a marketplace / hub — fastest, cheapest, but generic. Use SageMaker JumpStart, Bedrock, AWS Marketplace.
Fine-tuned from a pre-trained model — adapt to your data without training from scratch.
Trained from scratch — full control, max cost, only worth it for huge unique datasets.

▶1.3.3 Methods to use a model in production

Hosted real-time endpoint — always-on inference (SageMaker Endpoint, Bedrock).
Batch transform — score a big chunk all at once.
Async inference — long-running, non-blocking.
Serverless endpoint — pay per request, scales to zero.
Edge deployment — model runs on the device (SageMaker Edge, but rarely tested on AIF-C01).

▶1.3.4 MLOps concepts (high level)

You don’t need to do MLOps for this exam, but you should recognize the words:

CI/CD for ML — automated pipeline from data → trained model → deployment
Model registry — version control for models (SageMaker Model Registry)
Feature store — central, reusable feature library (SageMaker Feature Store)
Drift detection — alert when production data shifts away from training data
A/B testing — split traffic between two models to compare performance
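
Drift detection from the list above can be illustrated with a deliberately simple check. This is a rough sketch, not how any AWS service implements it: the rule (flag drift when the production mean moves more than a chosen number of training standard deviations), the threshold of 2.0, and the function name `mean_shift_drift` are all assumptions.

```python
import statistics

def mean_shift_drift(training_values, production_values, threshold=2.0):
    """Flag drift when the production mean moves far from the training mean.

    The shift is measured in units of the training standard deviation.
    """
    train_mean = statistics.mean(training_values)
    train_std = statistics.stdev(training_values)
    prod_mean = statistics.mean(production_values)
    shift = abs(prod_mean - train_mean) / train_std
    return shift > threshold, shift

# Made-up values of one feature (for example, average order size).
training = [50, 52, 48, 51, 49]
production_stable = [49, 51, 50, 52, 48]
production_shifted = [60, 62, 58, 61, 59]

stable_flag, stable_shift = mean_shift_drift(training, production_stable)
shifted_flag, shifted_shift = mean_shift_drift(training, production_shifted)
print(stable_flag, round(stable_shift, 2))
print(shifted_flag, round(shifted_shift, 2))
```

For the exam, the takeaway is the idea only: compare what the model sees in production with what it saw in training, and alert when they diverge.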

MLOps depth is out of scope

If a question asks you to design an MLOps pipeline architecture — that’s the ML Engineer Associate or ML Specialty exam, not AIF-C01. Pick the simplest answer that uses managed services.

▶1.3.5 Model performance metrics

Metric	What it tells you	When to use
Accuracy	% predictions correct	Balanced datasets only
Precision	Of items you predicted positive, how many actually were? (low false positives)	When false positives are expensive (spam filter)
Recall	Of all actual positives, how many did you catch? (low false negatives)	When missing a positive is dangerous (cancer screening)
F1 score	Harmonic mean of precision and recall	Imbalanced data, balance both
AUC-ROC	How well a classifier ranks positives above negatives across thresholds	Binary classification quality overall
RMSE / MAE	Average error magnitude in regression	Predicting numbers
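
For the regression row, a short worked example shows how RMSE and MAE differ. The prices below are invented for illustration. Both formulas are the standard ones: MAE averages the absolute errors, and RMSE takes the square root of the average squared error.

```python
import math

def mae(actual, predicted):
    """Mean absolute error: the average size of the misses."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error: like MAE, but large misses count for more."""
    mean_sq = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mean_sq)

# Made-up house prices in thousands.
actual = [200, 300, 400]
predicted = [210, 290, 430]   # errors: 10, 10, 30

mae_value = mae(actual, predicted)
rmse_value = rmse(actual, predicted)
print(round(mae_value, 2), round(rmse_value, 2))
```

RMSE comes out higher than MAE here because squaring gives the single 30-unit miss extra weight.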

Confusion matrix in plain English

True Positive (TP) = model said “yes,” answer was “yes”
True Negative (TN) = model said “no,” answer was “no”
False Positive (FP) = model said “yes,” answer was “no” (false alarm)
False Negative (FN) = model said “no,” answer was “yes” (missed it)
Precision = TP / (TP + FP). Recall = TP / (TP + FN).
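
The definitions above can be checked by hand on a tiny example. The ten labels below are made up; the precision and recall formulas are the ones just given, and F1 is computed as their harmonic mean, as described in the metrics table.

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP, FN for binary labels (1 = yes, 0 = no)."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            counts["TP"] += 1
        elif p == 0 and a == 0:
            counts["TN"] += 1
        elif p == 1 and a == 0:
            counts["FP"] += 1   # false alarm
        else:
            counts["FN"] += 1   # missed it
    return counts

# Made-up ground truth and model output for ten items.
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

c = confusion_counts(actual, predicted)
precision = c["TP"] / (c["TP"] + c["FP"])
recall = c["TP"] / (c["TP"] + c["FN"])
f1 = 2 * precision * recall / (precision + recall)

print(c)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

The model raised one false alarm (hurting precision) and missed two real positives (hurting recall), so the two metrics tell different stories about the same predictions.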

Accuracy is misleading on imbalanced data

If 99% of transactions are not fraud, a model that always says “not fraud” is 99% accurate but useless. Use F1, precision, recall for imbalanced problems. Exam loves to set up this trap.
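
The trap can be reproduced in a few lines. This sketch assumes 1,000 made-up transactions, of which 1% are fraud, and a "model" that always answers "not fraud".

```python
def accuracy(actual, predicted):
    """Fraction of predictions that match the true label."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)

def recall(actual, predicted):
    """Fraction of actual positives the model caught (0.0 if there are none)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0

# 1 = fraud, 0 = not fraud. 10 fraud cases out of 1,000 transactions.
actual = [1] * 10 + [0] * 990
predicted = [0] * 1000          # always says "not fraud"

acc = accuracy(actual, predicted)
rec = recall(actual, predicted)
print(f"accuracy={acc:.2f} recall={rec:.2f}")
```

Accuracy looks excellent while recall shows the model caught none of the fraud, which is why accuracy alone is the wrong answer on imbalanced problems.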

▶1.3.6 Business metrics for AI/ML solutions

Technical metrics aren’t enough — the exam expects you to also think in business terms.

ROI / cost savings — does the model save more than it costs?
Customer satisfaction — surveys, NPS, support feedback
Cost per inference / cost per user
Time saved per task
Conversion rate / click-through rate
Reduction in manual review
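
Two of these metrics reduce to simple arithmetic, sketched below. All figures are hypothetical, and ROI is computed here with the common definition of net benefit divided by cost, which is an assumption since the exam guide does not give a formula.

```python
def roi(benefit, cost):
    """Return on investment as a ratio: net benefit divided by cost."""
    return (benefit - cost) / cost

# Hypothetical monthly figures for a deployed model.
monthly_benefit = 30_000.0   # e.g. value of manual review hours saved
monthly_cost = 12_000.0      # e.g. hosting plus maintenance
monthly_inferences = 400_000

roi_ratio = roi(monthly_benefit, monthly_cost)
cost_per_inference = monthly_cost / monthly_inferences

print(f"ROI: {roi_ratio:.0%}")
print(f"Cost per inference: ${cost_per_inference:.3f}")
```

A positive ROI means the model saves more than it costs, which is the first question on the list above.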

Watch for "the most appropriate metric to evaluate the business impact"

When you see “business impact” — pick a business metric (ROI, customer satisfaction), not a technical one (F1, accuracy).

Self-Quiz

Question 1

A retail company wants to group its customers into segments based on purchasing behavior, but it does not have predefined segment labels. Which type of machine learning is most appropriate?

A. Supervised classification
B. Supervised regression
C. Unsupervised clustering
D. Reinforcement learning

Question 2

A logistics company processes 5 million shipping records every night and needs to score each one for predicted delivery time. Cost is a major consideration. Which inference type is most appropriate?

A. Real-time inference
B. Batch inference
C. Asynchronous inference
D. Serverless inference

Question 3

Which AWS service should be used to extract printed text and form fields from scanned PDFs of insurance claims?

A. Amazon Rekognition
B. Amazon Comprehend
C. Amazon Textract
D. Amazon Kendra

Question 4

A medical diagnostic model must minimize the chance of missing a true positive case. Which metric is the most appropriate to optimize?

A. Precision
B. Recall
C. Accuracy
D. RMSE

Question 5

Which of the following is the correct order of stages in a typical ML lifecycle?

A. Training → Data prep → Evaluation → Deployment
B. Data collection → Data prep → Training → Evaluation → Deployment → Monitoring
C. Deployment → Training → Monitoring → Data prep
D. Data collection → Deployment → Training → Evaluation
