Domain 1 · 20% of exam

Fundamentals of AI and ML

This is the warm-up. It's your 'what is even happening' foundation: what AI is, what ML is, how supervised, unsupervised, and reinforcement learning differ, and how the ML lifecycle works.

Task statements: 1.1, 1.2, 1.3
Estimated questions: ~10 of 50 scored

Updated May 21, 2026

The Big Picture (read this first)

Imagine you have a 5-year-old who’s never heard of computers. Here is how to explain the words on the box of this exam:

  • Artificial Intelligence (AI) = a giant umbrella term. Any computer that does something that looks “smart.” That’s it. Playing chess, recognizing faces, answering email: all AI.
  • Machine Learning (ML) = a kind of AI where the computer learns the rules from examples instead of a human writing the rules. You show it 10,000 cat pictures, it figures out what a cat looks like.
  • Deep Learning (DL) = a kind of ML that uses many-layered “neural networks” (math inspired loosely by brain cells). Best for messy, complex data like images, speech, and language.
  • Generative AI (GenAI) = a kind of deep learning that can produce new things — text, images, audio, code — instead of only classifying or predicting. ChatGPT is GenAI.
  • Foundation Model (FM) = a giant pre-trained GenAI model, trained on huge data, that can be adapted to many tasks. Think of it as a college graduate vs. a kindergartner — already broadly educated.
  • Agentic AI = an FM that doesn’t just answer once. It plans steps, uses tools (search, code, APIs), checks its work, and iterates toward a goal. It can act in the world, not just talk.

The nesting

AI ⊃ ML ⊃ Deep Learning ⊃ Generative AI ⊃ Foundation Models, where each term is a narrower subset of the one before it. Agentic AI sits on top of FMs.

"GenAI is a type of ML" is true on the exam

Some people think GenAI is its own thing. The exam treats GenAI as a subset of deep learning, which is a subset of ML. If a question asks “Which of the following is a category that includes generative AI?”, pick machine learning (or deep learning) when it is offered alongside plain “AI”, because the exam wants the most specific correct category.

Task 1.1 — Basic AI Concepts and Terminology

1.1.1 Define basic AI terms (you must know these cold)
Term | Plain-language meaning | Don't confuse with
AI | Anything where a computer does something "smart" | (none)
ML | AI that learns rules from examples (data) | Hard-coded if/else logic
Deep learning | ML using neural networks with many layers | Shallow ML like decision trees
Neural network | Many layers of math, loosely inspired by neurons, that transform input into output | Actual biology (they're not really brains)
Algorithm | A set of instructions / a recipe | The trained model itself
Model | The trained output of an algorithm + data. The "smart thing" you actually use. | The algorithm. Algorithm + data → model.
Inference | Using a trained model to make a prediction or generate output | Training, which is teaching the model
Training | The process of teaching the model from data | Inference, which is using it
Bias (in ML) | Systematic error favoring certain outcomes, often from skewed data | Statistical bias-variance tradeoff (different concept)
Variance (in ML) | How much a model's predictions change when training data changes; high variance = overfitting | Statistical variance (related but specific here)
Generative AI | AI that creates new content (text, images, audio, code) | Discriminative AI, which classifies or predicts existing categories
Foundation model (FM) | A massive pretrained model adaptable to many tasks | A traditional task-specific model (like a single classifier)
LLM | Large Language Model: an FM specifically for text | FM in general (FM is broader; LLM is text-only)
Multi-modal | Model that handles multiple input types: text + images + audio | Single-modal models
Diffusion model | Model type used for image/video generation (Stable Diffusion, etc.) | Transformer (used for text)
Transformer | The neural network architecture behind almost all modern LLMs | Diffusion (different architecture for images)
Embedding | A list of numbers representing meaning. Words/sentences/images become vectors so the computer can compare them. | Tokens (separate concept; see Domain 2)
Token | A chunk of text the model sees as one unit. Roughly 4 characters or ~¾ of a word. | Word (tokens aren't always words)
Agentic AI | AI that plans, uses tools, observes results, and iterates: autonomous goal-seeking | A normal LLM that just answers a single prompt
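
To make the "Embedding" row concrete, here is a minimal sketch of comparing meanings as vectors. The three-number vectors are hand-made for illustration (real embeddings come from a model and have hundreds or thousands of dimensions), and cosine similarity is one common way to compare them, assumed here rather than taken from the exam guide.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy "embeddings"; a real model would produce these.
cat = [0.9, 0.8, 0.1]
kitten = [0.85, 0.75, 0.2]
invoice = [0.1, 0.2, 0.95]

# Similar meanings should score closer to 1.0 than unrelated ones.
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice))  # True
```

The point to remember is only that "close vectors" stands in for "similar meaning"; the exam does not ask you to compute this.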

Algorithm vs. Model

This is constantly confused. Algorithm = the recipe. Model = the cake. You feed an algorithm data and out pops a model. You then infer with the model.
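
The recipe/cake split can be sketched in a few lines. This is an illustration only, assuming ordinary least-squares line fitting as the "algorithm"; the names `fit_line` and `LinearModel` are made up for this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LinearModel:
    """The model: nothing but the learned numbers plus a way to use them."""
    slope: float
    intercept: float

    def predict(self, x: float) -> float:
        # Inference: apply the learned numbers to a new input.
        return self.slope * x + self.intercept

def fit_line(xs, ys) -> LinearModel:
    """The algorithm: a fixed recipe (least squares) that turns data into a model."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return LinearModel(slope=slope, intercept=mean_y - slope * mean_x)

# Training: algorithm + data -> model.
model = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
# Inference: use the model on an input it has not seen.
prediction = model.predict(10)
print(model.slope, model.intercept, prediction)  # 2.0 1.0 21.0
```

Feed the same algorithm different data and you get a different model, which is why the two words are not interchangeable.
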
1.1.2 Identify similarities and differences between AI / ML / Deep Learning / GenAI

You will be asked something like “Which of the following is a subset of machine learning?” or “What distinguishes generative AI from traditional ML?” — these are the answers:

Question form | Answer
Subset of AI | ML, DL, GenAI
Subset of ML | DL, GenAI (because it uses DL)
Subset of DL | GenAI
What separates GenAI from traditional ML | GenAI creates new content; traditional ML classifies/predicts existing categories
What separates DL from "shallow" ML | DL uses many neural network layers, handles unstructured data better, needs more compute and more data

If a question says...

  • “Spam vs. not spam” → classification (traditional ML)
  • “Predict house price” → regression (traditional ML)
  • “Group customers into segments without labels” → clustering (unsupervised ML)
  • “Recognize a cat in a photo” → deep learning (CNN)
  • “Write a marketing email from scratch” → generative AI
  • “Write code suggestions in an IDE” → generative AI (code LLM)
1.1.3 Describe types of inferencing

Inference = the model “thinking” — taking input, producing output. Inference happens in different shapes depending on speed and volume needs.

Type | What it means (plain) | When to use | AWS service / pattern
Real-time | Send one request, wait for one response right now | Chatbots, fraud check at checkout, face login | SageMaker real-time endpoint, Bedrock InvokeModel
Batch | Send a huge file of inputs, get all outputs back later | Score 10M customer records overnight | SageMaker Batch Transform, Bedrock Batch Inference
Asynchronous | Submit and don't block; pick up the result when ready | Big inputs (long video, big PDF) where waiting is fine | SageMaker Async Inference
Serverless | Endpoint that scales to zero between requests, scales up on demand | Spiky low-volume traffic where idle cost matters | SageMaker Serverless Inference
Streaming (GenAI) | Output is sent token-by-token as it generates | Chat UIs where users want to see typing in real time | Bedrock InvokeModelWithResponseStream

Real-time ≠ streaming

Real-time means “synchronous, low-latency.” Streaming means “the response is delivered piece by piece.” A real-time API can also stream its response, but the two terms are not interchangeable.
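
A small sketch of the difference in delivery, using a stand-in for a model rather than any real AWS call. `fake_generate` and both `respond_*` functions are hypothetical names invented for this illustration.

```python
from typing import Iterator, List

def fake_generate(prompt: str) -> List[str]:
    """Stand-in for a model: returns the output as a fixed list of chunks."""
    return ["Batch", " is", " cheapest", "."]

def respond_all_at_once(prompt: str) -> str:
    """Non-streaming: the caller gets nothing until the full answer is ready."""
    return "".join(fake_generate(prompt))

def respond_streaming(prompt: str) -> Iterator[str]:
    """Streaming: the caller receives each chunk as soon as it is produced."""
    for chunk in fake_generate(prompt):
        yield chunk

# Both deliver the same text; only the delivery shape differs.
for piece in respond_streaming("Which inference type is cheapest?"):
    print(piece, end="", flush=True)
print()
```

Either function could sit behind a low-latency synchronous endpoint, which is why "real-time" and "streaming" describe different properties.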

Cost intuition

Batch is usually the cheapest per inference. Real-time is usually the most expensive, because you pay for the endpoint to stay ready even when it is idle. Serverless typically lands in between but adds cold-start latency. The exam loves “most cost-effective for predictable nightly workload” → batch.
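
The arithmetic behind that intuition, as a rough sketch. Every number here is a hypothetical placeholder (not AWS pricing), and it assumes the batch job and the always-on endpoint bill at the same hourly rate.

```python
# All prices and durations below are made-up placeholders, not AWS pricing.
HOURLY_INSTANCE_RATE = 1.00     # assumed dollars per instance-hour
RECORDS_PER_NIGHT = 5_000_000   # assumed nightly workload
BATCH_JOB_HOURS = 2             # assumed time to score the nightly file

def daily_cost(hours_billed: float, hourly_rate: float = HOURLY_INSTANCE_RATE) -> float:
    """Cost of keeping compute running for the given number of hours."""
    return hours_billed * hourly_rate

always_on_cost = daily_cost(24)           # endpoint stays up all day
batch_cost = daily_cost(BATCH_JOB_HOURS)  # compute exists only while the job runs

print(always_on_cost, batch_cost)  # 24.0 2.0
print(always_on_cost / RECORDS_PER_NIGHT > batch_cost / RECORDS_PER_NIGHT)  # True
```

Under these assumptions the same nightly workload costs far less per record in batch, because nothing is billed during the idle hours.
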
1.1.4 Describe ML development lifecycle (the pipeline)

Every ML project follows roughly the same path. The exam loves to ask you to put these in order or pick which AWS service does which step.

  1. Business problem framing — what are we even trying to predict / generate / decide?
  2. Data collection — gather raw data from sources (S3, databases, APIs).
  3. Exploratory data analysis (EDA) — understand the data: distributions, missing values, weirdness.
  4. Data preparation / feature engineering — clean, transform, label, split into train/validation/test.
  5. Model selection — pick algorithm or pretrained model.
  6. Training — feed data to algorithm, produce model.
  7. Evaluation — test on held-out data; check metrics (accuracy, F1, etc.).
  8. Tuning — adjust hyperparameters, repeat.
  9. Deployment — push model to production endpoint.
  10. Monitoring — watch for drift, errors, bias in production. Retrain when needed.
Stage | AWS service that fits
Data storage | S3 (data lake), RDS, Redshift
Data prep | SageMaker Data Wrangler, AWS Glue
Labeling | SageMaker Ground Truth
Training (custom) | SageMaker AI Training Jobs
Pretrained models | SageMaker JumpStart, Bedrock
Hosting / inference | SageMaker Endpoints, Bedrock
Monitoring | SageMaker Model Monitor, CloudWatch
Bias / explainability | SageMaker Clarify
Documentation / governance | SageMaker Model Cards

Out of scope for this exam

You do not need to know how to write training code, choose hyperparameters, or design feature engineering pipelines. The exam wants you to know what each stage is and which AWS service fits there. If you’re spending hours on hyperparameter math — stop, that’s the ML Specialty exam.
1.1.5 Learning paradigms — the most-tested concept in Domain 1

Three main paradigms (supervised, unsupervised, reinforcement), plus two hybrids (semi-supervised and self-supervised). Memorize the one-line definition for each, then drill scenarios.

Type | Plain meaning | Data needed | Common tasks
Supervised | Learn from examples that have the right answer attached | Labeled data | Classification, regression
Unsupervised | Find patterns without being told what's right | Unlabeled data | Clustering, anomaly detection, dimensionality reduction
Reinforcement | Learn by trial and error from rewards | An environment that gives feedback | Game playing, robotics, RLHF for LLMs
Semi-supervised | Some labeled, mostly unlabeled | Mixed | When labeling is expensive
Self-supervised | The data labels itself (e.g., predict the next word) | Massive unlabeled corpora | How LLMs / FMs are pre-trained
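
"The data labels itself" is easier to see in code. This is a toy sketch of how next-word training pairs could be carved out of raw text with no human labeling; splitting on spaces and the name `next_word_pairs` are simplifying assumptions (real LLMs work on tokens, not whole words).

```python
def next_word_pairs(text: str, context_size: int = 3):
    """Turn raw text into (context words, next word) training examples."""
    words = text.split()
    return [
        (words[i - context_size:i], words[i])
        for i in range(context_size, len(words))
    ]

pairs = next_word_pairs("the cat sat on the mat")
for context, target in pairs:
    print(context, "->", target)
# ['the', 'cat', 'sat'] -> on
# ['cat', 'sat', 'on'] -> the
# ['sat', 'on', 'the'] -> mat
```

Every "label" (the next word) came from the text itself, which is why massive unlabeled corpora are enough for pre-training.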

Classification vs. regression vs. clustering

  • Classification = predicting a category. “Is this spam?” “Is this image a cat, dog, or bird?” Output is discrete labels.
  • Regression = predicting a number. “What’s the house price?” “What’s tomorrow’s temperature?” Output is continuous.
  • Clustering = grouping similar things without knowing the groups in advance. “Segment my customers into 5 natural groups.” Output is group assignments, not predictions of a known label.
Mnemonic: Classification = Categories. Regression = Real numbers. Clustering = Connect similar without labels.
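
Clustering is the one of the three that needs no labels. Here is a minimal sketch, assuming a bare-bones two-group k-means on a single feature (monthly spend); the function name and the numbers are invented for illustration.

```python
def two_means(values, iterations: int = 10):
    """Split 1-D values into two groups by repeatedly moving two centers."""
    low, high = min(values), max(values)  # start the centers at the extremes
    labels = []
    for _ in range(iterations):
        # Assign each value to whichever center is nearer.
        labels = [0 if abs(v - low) <= abs(v - high) else 1 for v in values]
        group0 = [v for v, label in zip(values, labels) if label == 0]
        group1 = [v for v, label in zip(values, labels) if label == 1]
        # Move each center to the average of its group (skip empty groups).
        if group0:
            low = sum(group0) / len(group0)
        if group1:
            high = sum(group1) / len(group1)
    return labels

monthly_spend = [12, 15, 14, 90, 95, 88]
print(two_means(monthly_spend))  # [0, 0, 0, 1, 1, 1]
```

Nobody told the code which customers were "low spenders" or "high spenders"; the groups came out of the data. Classification and regression, by contrast, would need a known answer for every training row.
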

Scenario drills (decide before reading the answer)

  1. “Predict if a credit card transaction is fraudulent” → supervised, classification
  2. “Predict tomorrow’s stock price” → supervised, regression
  3. “Group website visitors into similar shopping personas” → unsupervised, clustering
  4. “Train a robot to walk by trying random moves” → reinforcement
  5. “Train a chatbot to be more helpful by getting thumbs-up/down feedback” → reinforcement (RLHF)
  6. “Detect when a server’s metrics look unusual” → unsupervised, anomaly detection
  7. “Pre-train an LLM by predicting the next word in trillions of sentences” → self-supervised

Reinforcement learning trap

Don’t pick reinforcement learning just because the question mentions “feedback” or “improvement over time.” If there’s no environment with rewards and no agent making actions, it’s not RL. A spam filter that retrains weekly is still supervised, not RL.

Task 1.2 — Practical Use Cases for AI

1.2.1 When AI helps and when it doesn't

This is heavily tested. Some questions ask “When is AI not the right tool?” The answers usually involve:

  • Hard rules required — tax calculations, regulatory thresholds. Use deterministic code, not ML.
  • Decisions that must be 100% explainable — life-or-death medical or legal. ML can be too opaque.
  • Cost outweighs benefit — small dataset, simple problem. A spreadsheet rule beats a model.
  • No data available — ML needs data. No data = no ML.
  • Real-time, very low latency needs — large LLM inference can be too slow.

When AI is the right tool

Pattern recognition in messy unstructured data (images, text, audio), prediction from many features, personalization at scale, content generation, scenarios where rules would be too numerous to write by hand.
1.2.2 Traditional ML vs. Generative AI — when to pick which (v1.1 addition)
Use traditional ML when… | Use generative AI / FM when…
You have structured tabular data and a specific prediction target | You have unstructured data (text, images, audio) or want to generate content
The task is narrow and defined (fraud, churn, demand forecast) | The task is broad or open-ended (Q&A, summaries, drafts)
You need full explainability for regulators | Some opacity is acceptable
You have plenty of labeled examples | Few or no labeled examples (FMs work zero-shot or few-shot)
You need consistent, deterministic outputs | You can tolerate non-deterministic outputs and verify them
Cost per prediction must be very low | Token-based cost is acceptable

FMs are not always the right answer

The exam will tempt you with “use GenAI” answers in scenarios where traditional ML or even rule-based logic is correct. If the use case is structured prediction with abundant labels and explainability requirements (e.g., loan approval), traditional ML wins.
1.2.3 Real-world AWS-flavored use cases
Use case | AWS managed service
Sentiment analysis on customer reviews | Amazon Comprehend
Detect text in scanned documents / forms | Amazon Textract
Detect faces, objects, unsafe content in images | Amazon Rekognition
Convert audio recordings to text | Amazon Transcribe
Convert text to spoken audio | Amazon Polly
Translate text across languages | Amazon Translate
Build chatbots / voice bots | Amazon Lex
Personalize product recommendations | Amazon Personalize
Enterprise search across documents | Amazon Kendra
Generate text / code / images via LLM | Amazon Bedrock
Build, train, deploy custom ML models | Amazon SageMaker AI
Enterprise AI assistant for employees | Amazon Q Business
AI coding assistant | Amazon Q Developer
BI insights / dashboards with AI | Amazon QuickSight Q
Migrate / modernize workloads with AI | AWS Transform (v1.1 addition)

Textract vs. Rekognition (always tested)

Both work on images. Textract reads printed/handwritten words and form fields. Rekognition identifies objects, faces, scenes, and unsafe content. A scanned invoice → Textract. A photo of a person → Rekognition.

Comprehend vs. Kendra

Comprehend takes text in, gives back analysis (sentiment, entities, key phrases, language detection). Kendra takes a question and returns the most relevant documents from a corpus you pre-indexed. Comprehend = analysis. Kendra = search.

Personalize vs. just "use Bedrock"

If the question is “recommend products to a user based on past behavior” — use Personalize, not Bedrock. Personalize is a managed recommendation engine. Bedrock would be over-engineering.
1.2.4 Capabilities and limitations of AI/ML for solving business problems

Capabilities

  • Process huge volumes of unstructured data fast
  • Personalize at scale
  • Automate repetitive analytical work
  • Surface patterns humans would miss
  • Generate content drafts, summaries, code

Limitations

  • Hallucinations: GenAI invents plausible-sounding falsehoods
  • Bias: model reflects bias in training data
  • Opacity: hard to explain why a deep model made a specific call
  • Drift: performance degrades as real-world data shifts away from what the model was trained on
  • Cost: large models can be expensive per inference
  • Privacy / data leakage: sensitive data could leak into outputs
  • Non-determinism: same prompt may give different answers

Task 1.3 — ML Development Lifecycle (deeper)

1.3.1 Components of an ML pipeline (in order)
  1. Data collection
  2. Exploratory data analysis (EDA)
  3. Data pre-processing
  4. Feature engineering
  5. Model training
  6. Hyperparameter tuning
  7. Evaluation
  8. Deployment
  9. Monitoring

Mnemonic: "Data Engineers Prep Features, Train Tuned Evaluations, Deploy Monitors"

D-E-P-F-T-T-E-D-M. Use whatever you can remember; the order is what’s tested.
1.3.2 Sources of ML models — buy, build, or borrow
  • Pre-trained from a marketplace / hub — fastest, cheapest, but generic. Use SageMaker JumpStart, Bedrock, AWS Marketplace.
  • Fine-tuned from a pre-trained model — adapt to your data without training from scratch.
  • Trained from scratch — full control, max cost, only worth it for huge unique datasets.
1.3.3 Methods to use a model in production
  • Hosted real-time endpoint — always-on inference (SageMaker Endpoint, Bedrock).
  • Batch transform — score a big chunk all at once.
  • Async inference — long-running, non-blocking.
  • Serverless endpoint — pay per request, scales to zero.
  • Edge deployment — model runs on the device (SageMaker Edge, but rarely tested on AIF-C01).
1.3.4 MLOps concepts (high level)

You don’t need to do MLOps for this exam, but you should recognize the words:

  • CI/CD for ML — automated pipeline from data → trained model → deployment
  • Model registry — version control for models (SageMaker Model Registry)
  • Feature store — central, reusable feature library (SageMaker Feature Store)
  • Drift detection — alert when production data shifts away from training data
  • A/B testing — split traffic between two models to compare performance
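
To give "drift detection" some shape, here is a deliberately simple sketch: measure how far the production average of one feature has moved from the training average, in units of the training standard deviation. The threshold of 3 and the function name are assumptions for illustration, and this is not how SageMaker Model Monitor works internally.

```python
import statistics

DRIFT_THRESHOLD = 3.0  # hypothetical alert level, in training standard deviations

def mean_shift_in_std(training, production) -> float:
    """How many training standard deviations the production mean has moved."""
    mu = statistics.mean(training)
    sigma = statistics.stdev(training)
    return abs(statistics.mean(production) - mu) / sigma

training_order_value = [50, 52, 48, 51, 49]
production_order_value = [60, 62, 58, 61, 59]

shift = mean_shift_in_std(training_order_value, production_order_value)
print(shift > DRIFT_THRESHOLD)  # True: production data no longer looks like training data
```

For the exam, it is enough to recognize that drift means "production data has shifted away from training data" and that monitoring exists to raise that alert.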

MLOps depth is out of scope

If a question asks you to design an MLOps pipeline architecture — that’s the ML Engineer Associate or ML Specialty exam, not AIF-C01. Pick the simplest answer that uses managed services.
1.3.5 Model performance metrics
Metric | What it tells you | When to use
Accuracy | % of predictions correct | Balanced datasets only
Precision | Of items you predicted positive, how many actually were? (low false positives) | When false positives are expensive (spam filter)
Recall | Of all actual positives, how many did you catch? (low false negatives) | When missing a positive is dangerous (cancer screening)
F1 score | Harmonic mean of precision and recall | Imbalanced data, when precision and recall both matter
AUC-ROC | How well a classifier ranks positives above negatives across thresholds | Binary classification quality overall
RMSE / MAE | Average error magnitude in regression | Predicting numbers
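
For the regression row, a short sketch of how the two error metrics are computed. The house prices are made-up numbers (in thousands) chosen only to show the calculation.

```python
import math

def mae(actual, predicted) -> float:
    """Mean absolute error: the average size of the misses."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted) -> float:
    """Root mean squared error: like MAE, but big misses count for more."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

actual_prices = [200, 250, 300]
predicted_prices = [210, 240, 330]

print(mae(actual_prices, predicted_prices))
print(rmse(actual_prices, predicted_prices))
```

RMSE comes out at or above MAE on the same data because squaring weights the one large miss (30) more heavily than the two small ones.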

Confusion matrix in plain English

  • True Positive (TP) = model said “yes,” answer was “yes”
  • True Negative (TN) = model said “no,” answer was “no”
  • False Positive (FP) = model said “yes,” answer was “no” (false alarm)
  • False Negative (FN) = model said “no,” answer was “yes” (missed it)
  • Precision = TP / (TP + FP). Recall = TP / (TP + FN).
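
The formulas above, applied to one made-up confusion matrix. The counts are invented for illustration, and the F1 formula used is the standard harmonic mean of precision and recall.

```python
def precision(tp: int, fp: int) -> float:
    """Of everything flagged positive, the share that really was positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp: int, fn: int) -> float:
    """Of everything that really was positive, the share that got flagged."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0

# Hypothetical counts: 100 cases in total.
tp, fp, fn, tn = 8, 2, 4, 86

p = precision(tp, fp)  # 8 / 10
r = recall(tp, fn)     # 8 / 12
print(round(p, 3), round(r, 3), round(f1_score(p, r), 3))  # 0.8 0.667 0.727
```

Notice that TN never appears in precision, recall, or F1, which is exactly why these metrics stay honest when negatives vastly outnumber positives.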

Accuracy is misleading on imbalanced data

If 99% of transactions are not fraud, a model that always says “not fraud” is 99% accurate but useless. Use precision, recall, and F1 for imbalanced problems. The exam loves to set up this trap.
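
The trap in numbers, as a sketch with a synthetic dataset of 1,000 transactions (1% fraud) and a "model" that always answers "not fraud". The data is fabricated purely to illustrate the point.

```python
# 1 = fraud, 0 = not fraud. Synthetic data: 10 fraud cases out of 1,000.
actual = [1] * 10 + [0] * 990
predicted = [0] * 1000  # a lazy model that never flags anything

accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)

true_positives = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
false_negatives = sum(a == 1 and p == 0 for a, p in zip(actual, predicted))
fraud_recall = true_positives / (true_positives + false_negatives)

print(accuracy)      # 0.99
print(fraud_recall)  # 0.0
```

The accuracy looks excellent while recall shows the model caught none of the fraud, which is the whole reason to reach for recall, precision, or F1 here.
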
1.3.6 Business metrics for AI/ML solutions

Technical metrics aren’t enough — the exam expects you to also think in business terms.

  • ROI / cost savings — does the model save more than it costs?
  • Customer satisfaction — surveys, NPS, support feedback
  • Cost per inference / cost per user
  • Time saved per task
  • Conversion rate / click-through rate
  • Reduction in manual review
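
The ROI bullet comes down to one line of arithmetic. A quick sketch, using the common definition ROI = (benefit − cost) / cost and made-up dollar figures:

```python
def roi(benefit: float, cost: float) -> float:
    """Return on investment as a ratio: 1.5 means a 150% return."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (benefit - cost) / cost

# Hypothetical figures for a support-ticket triage model.
annual_savings = 150_000.0  # e.g., reduced manual review
annual_cost = 60_000.0      # e.g., inference + maintenance

model_roi = roi(annual_savings, annual_cost)
print(model_roi)  # 1.5
```

A positive ROI answers the bullet's question ("does the model save more than it costs?") with a yes; a negative one means the model costs more than it saves.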

Watch for "the most appropriate metric to evaluate the business impact"

When you see “business impact” — pick a business metric (ROI, customer satisfaction), not a technical one (F1, accuracy).

Self-Quiz

Question 1

A retail company wants to group its customers into segments based on purchasing behavior, but it does not have predefined segment labels. Which type of machine learning is most appropriate?

  • A. Supervised classification
  • B. Supervised regression
  • C. Unsupervised clustering
  • D. Reinforcement learning

Question 2

A logistics company processes 5 million shipping records every night and needs to score each one for predicted delivery time. Cost is a major consideration. Which inference type is most appropriate?

  • A. Real-time inference
  • B. Batch inference
  • C. Asynchronous inference
  • D. Serverless inference

Question 3

Which AWS service should be used to extract printed text and form fields from scanned PDFs of insurance claims?

  • A. Amazon Rekognition
  • B. Amazon Comprehend
  • C. Amazon Textract
  • D. Amazon Kendra

Question 4

A medical diagnostic model must minimize the chance of missing a true positive case. Which metric is the most appropriate to optimize?

  • A. Precision
  • B. Recall
  • C. Accuracy
  • D. RMSE

Question 5

Which of the following is the correct order of stages in a typical ML lifecycle?

  • A. Training → Data prep → Evaluation → Deployment
  • B. Data collection → Data prep → Training → Evaluation → Deployment → Monitoring
  • C. Deployment → Training → Monitoring → Data prep
  • D. Data collection → Deployment → Training → Evaluation
