Fundamentals of AI and ML
This is the warm-up. It's your 'what is even happening' foundation: what AI is, what ML is, how supervised, unsupervised, and reinforcement learning differ, and how the ML lifecycle works.
Updated May 21, 2026
The Big Picture (read this first)
Imagine you have a 5-year-old who’s never heard of computers. Here is how to explain the words on the box of this exam:
- Artificial Intelligence (AI) = a giant umbrella term. Any computer that does something that looks “smart.” That’s it. Playing chess, recognizing faces, answering email: all AI.
- Machine Learning (ML) = a kind of AI where the computer learns the rules from examples instead of a human writing the rules. You show it 10,000 cat pictures, it figures out what a cat looks like.
- Deep Learning (DL) = a kind of ML that uses many-layered “neural networks” (math inspired loosely by brain cells). Best for messy, complex data like images, speech, and language.
- Generative AI (GenAI) = a kind of deep learning that can produce new things — text, images, audio, code — instead of only classifying or predicting. ChatGPT is GenAI.
- Foundation Model (FM) = a giant pre-trained GenAI model, trained on huge data, that can be adapted to many tasks. Think of it as a college graduate vs. a kindergartner — already broadly educated.
- Agentic AI = an FM that doesn’t just answer once. It plans steps, uses tools (search, code, APIs), checks its work, and iterates toward a goal. It can act in the world, not just talk.
The nesting
AI ⊃ ML ⊃ DL ⊃ GenAI: each term sits inside the one to its left.
"GenAI is a type of ML" is true on the exam
Task 1.1 — Basic AI Concepts and Terminology
▶1.1.1 Define basic AI terms (you must know these cold)
| Term | Plain-language meaning | Don't confuse with |
|---|---|---|
| AI | Anything where a computer does something "smart" | — |
| ML | AI that learns rules from examples (data) | Hard-coded if/else logic |
| Deep learning | ML using neural networks with many layers | Shallow ML like decision trees |
| Neural network | Many layers of math, loosely inspired by neurons, that transform input into output | Actual biology — they're not really brains |
| Algorithm | A set of instructions / a recipe | The trained model itself |
| Model | The trained output of an algorithm + data. The "smart thing" you actually use. | The algorithm. Algorithm + data → model. |
| Inference | Using a trained model to make a prediction or generate output | Training, which is teaching the model |
| Training | The process of teaching the model from data | Inference, which is using it |
| Bias (in ML) | Systematic error favoring certain outcomes — often from skewed data | Statistical bias-variance tradeoff (different concept) |
| Variance (in ML) | How much a model's predictions change when training data changes; high variance = overfitting | Statistical variance (related but specific here) |
| Generative AI | AI that creates new content (text, images, audio, code) | Discriminative AI, which classifies or predicts existing categories |
| Foundation model (FM) | A massive pretrained model adaptable to many tasks | A traditional task-specific model (like a single classifier) |
| LLM | Large Language Model — an FM specifically for text | FM in general (FM is broader; LLM is text-only) |
| Multi-modal | Model that handles multiple input types: text + images + audio | Single-modal models |
| Diffusion model | Model type used for image/video generation (Stable Diffusion, etc.) | Transformer (used for text) |
| Transformer | The neural network architecture behind almost all modern LLMs | Diffusion (different architecture for images) |
| Embedding | A list of numbers representing meaning. Words/sentences/images become vectors so the computer can compare them. | Tokens (separate concept — see Domain 2) |
| Token | A chunk of text the model sees as one unit. Roughly 4 characters or ~¾ of a word. | Word — tokens aren't always words |
| Agentic AI | AI that plans, uses tools, observes results, and iterates — autonomous goal-seeking | A normal LLM that just answers a single prompt |
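To make the Embedding row concrete, here is a minimal sketch of comparing embeddings with cosine similarity. The three-number vectors are invented for illustration (real embedding models output hundreds or thousands of dimensions), and `most_similar` is a hypothetical helper, not a library call.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings (made-up values, not from a real model).
EMBEDDINGS = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.75, 0.2],
    "invoice": [0.1, 0.2, 0.95],
}

def most_similar(word):
    """Return the other word whose embedding points in the closest direction."""
    others = [w for w in EMBEDDINGS if w != word]
    return max(
        others,
        key=lambda w: cosine_similarity(EMBEDDINGS[word], EMBEDDINGS[w]),
    )

if __name__ == "__main__":
    print(most_similar("cat"))
```

Similar meanings end up as vectors pointing in similar directions, which is why "cat" lands next to "kitten" and far from "invoice" in this toy setup.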
Algorithm vs. Model
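One way to see the distinction: the algorithm is the fitting recipe, and the model is the set of learned numbers that recipe produces from data. A minimal sketch using ordinary least squares on invented data follows; `fit_line` and `predict` are illustrative names, not an AWS or library API.

```python
def fit_line(xs, ys):
    """ALGORITHM: ordinary least squares for y = slope * x + intercept.
    Training means running this recipe on data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    # The returned parameters are the MODEL: algorithm + data -> model.
    return {"slope": slope, "intercept": intercept}

def predict(model, x):
    """INFERENCE: apply the trained model to a new input."""
    return model["slope"] * x + model["intercept"]

# Invented training data: house size (hundreds of sq ft) -> price ($1000s).
sizes = [10, 15, 20, 25]
prices = [200, 300, 400, 500]

model = fit_line(sizes, prices)      # training
prediction = predict(model, 30)      # inference on an unseen input
print(model, prediction)
```

Swap in different training data and the same algorithm yields a different model, which is the point the table row is making.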
▶1.1.2 Identify similarities and differences between AI / ML / Deep Learning / GenAI
You will be asked something like “Which of the following is a subset of machine learning?” or “What distinguishes generative AI from traditional ML?” — these are the answers:
| Question form | Answer |
|---|---|
| Subset of AI | ML, DL, GenAI |
| Subset of ML | DL, GenAI (because it uses DL) |
| Subset of DL | GenAI |
| What separates GenAI from traditional ML | GenAI creates new content; traditional ML classifies/predicts existing categories |
| What separates DL from "shallow" ML | DL uses many neural network layers, handles unstructured data better, needs more compute and more data |
If a question says...
- “Spam vs. not spam” → classification (traditional ML)
- “Predict house price” → regression (traditional ML)
- “Group customers into segments without labels” → clustering (unsupervised ML)
- “Recognize a cat in a photo” → deep learning (CNN)
- “Write a marketing email from scratch” → generative AI
- “Write code suggestions in an IDE” → generative AI (code LLM)
▶1.1.3 Describe types of inferencing
Inference = the model “thinking” — taking input, producing output. Inference happens in different shapes depending on speed and volume needs.
| Type | What it means (plain) | When to use | AWS service / pattern |
|---|---|---|---|
| Real-time | Send one request, wait for one response right now | Chatbots, fraud check at checkout, face login | SageMaker real-time endpoint, Bedrock InvokeModel |
| Batch | Send a huge file of inputs, get all outputs back later | Score 10M customer records overnight | SageMaker Batch Transform, Bedrock Batch Inference |
| Asynchronous | Submit and don't block; pick up the result when ready | Big inputs (long video, big PDF) where waiting is fine | SageMaker Async Inference |
| Serverless | Endpoint that scales to zero between requests, scales up on demand | Spiky low-volume traffic where idle cost matters | SageMaker Serverless Inference |
| Streaming (GenAI) | Output is sent token-by-token as it generates | Chat UIs where users want to see typing in real time | Bedrock InvokeModelWithResponseStream |
Real-time ≠ streaming
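The difference can be sketched without any AWS calls. In the snippet below, `fake_model_tokens` is a hypothetical stand-in for a model; `invoke` returns only after the whole answer exists, while `invoke_with_stream` hands back chunks as they are produced. The function names only loosely mirror the Bedrock operations in the table and are not the real SDK.

```python
from typing import Iterator, List

def fake_model_tokens(prompt: str) -> List[str]:
    """Hypothetical stand-in for a model: returns a canned token list."""
    return ["Your", " order", " ships", " tomorrow", "."]

def invoke(prompt: str) -> str:
    """Request/response: the caller sees nothing until the full answer
    has been generated."""
    return "".join(fake_model_tokens(prompt))

def invoke_with_stream(prompt: str) -> Iterator[str]:
    """Streaming: yield each chunk as soon as it is available so a chat
    UI can render partial output."""
    for token in fake_model_tokens(prompt):
        yield token

if __name__ == "__main__":
    print(invoke("Where is my order?"))
    for chunk in invoke_with_stream("Where is my order?"):
        print(chunk, end="", flush=True)
    print()
```

Both calls produce the same final text. Streaming changes how the output is delivered, not what the model computes.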
Cost intuition
▶1.1.4 Describe ML development lifecycle (the pipeline)
Every ML project follows roughly the same path. The exam loves to ask you to put these in order or pick which AWS service does which step.
- Business problem framing — what are we even trying to predict / generate / decide?
- Data collection — gather raw data from sources (S3, databases, APIs).
- Exploratory data analysis (EDA) — understand the data: distributions, missing values, weirdness.
- Data preparation / feature engineering — clean, transform, label, split into train/validation/test.
- Model selection — pick algorithm or pretrained model.
- Training — feed data to algorithm, produce model.
- Evaluation — test on held-out data; check metrics (accuracy, F1, etc.).
- Tuning — adjust hyperparameters, repeat.
- Deployment — push model to production endpoint.
- Monitoring — watch for drift, errors, bias in production. Retrain when needed.
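The "split into train/validation/test" step in data preparation is what makes the later "held-out data" evaluation possible. A minimal sketch follows; the 70/15/15 fractions and the fixed seed are assumptions chosen for illustration, not an exam requirement.

```python
import random

def split_dataset(rows, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle rows, then split them into train / validation / test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps the split repeatable
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]       # whatever is left is held out
    return train, val, test

rows = list(range(100))                      # stand-in for 100 data records
train, val, test = split_dataset(rows)
print(len(train), len(val), len(test))
```

The training set feeds the training step, the validation set guides tuning, and the test set is touched only for the final evaluation.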
| Stage | AWS service that fits |
|---|---|
| Data storage | S3 (data lake), RDS, Redshift |
| Data prep | SageMaker Data Wrangler, AWS Glue |
| Labeling | SageMaker Ground Truth |
| Training (custom) | SageMaker AI Training Jobs |
| Pretrained models | SageMaker JumpStart, Bedrock |
| Hosting / inference | SageMaker Endpoints, Bedrock |
| Monitoring | SageMaker Model Monitor, CloudWatch |
| Bias / explainability | SageMaker Clarify |
| Documentation / governance | SageMaker Model Cards |
Out of scope for this exam
▶1.1.5 Learning paradigms — the most-tested concept in Domain 1
Three main paradigms, plus two variants (semi-supervised and self-supervised) that show up in the same table. Memorize the one-line definition for each, then drill scenarios.
| Type | Plain meaning | Data needed | Common tasks |
|---|---|---|---|
| Supervised | Learn from examples that have the right answer attached | Labeled data | Classification, regression |
| Unsupervised | Find patterns without being told what's right | Unlabeled data | Clustering, anomaly detection, dimensionality reduction |
| Reinforcement | Learn by trial and error from rewards | An environment that gives feedback | Game playing, robotics, RLHF for LLMs |
| Semi-supervised | Learn from a small labeled set plus a much larger unlabeled set | Mixed (some labeled, mostly unlabeled) | Classification or regression when labeling is expensive |
| Self-supervised | The data labels itself (e.g., predict the next word) | Massive unlabeled corpora | How LLMs / FMs are pre-trained |
Classification vs. regression vs. clustering
- Classification = predicting a category. “Is this spam?” “Is this image a cat, dog, or bird?” Output is discrete labels.
- Regression = predicting a number. “What’s the house price?” “What’s tomorrow’s temperature?” Output is continuous.
- Clustering = grouping similar things without knowing the groups in advance. “Segment my customers into 5 natural groups.” Output is group assignments, not predictions of a known label.
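Clustering is the odd one out because it never sees a label. Here is a tiny sketch of k-means on one-dimensional data to show what "group assignments" look like as output. The spend figures are invented, and `kmeans_1d` is a simplified teaching version with a deterministic start, not a production implementation.

```python
def kmeans_1d(values, k=2, iterations=10):
    """Tiny 1-D k-means. Returns (centroids, cluster index for each value)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted data.
    step = (len(ordered) - 1) / max(k - 1, 1)
    centroids = [float(ordered[round(i * step)]) for i in range(k)]
    assignments = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(values, assignments) if a == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return centroids, assignments

# Invented monthly spend per customer. Note there are no labels anywhere.
monthly_spend = [20, 25, 30, 200, 220, 210]
centroids, groups = kmeans_1d(monthly_spend, k=2)
print(centroids, groups)
```

The output is a group number per customer. Nobody told the algorithm that "low spender" and "high spender" exist; it found the two groups from the data alone.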
Scenario drills (decide before reading the answer)
- “Predict if a credit card transaction is fraudulent” → supervised, classification
- “Predict tomorrow’s stock price” → supervised, regression
- “Group website visitors into similar shopping personas” → unsupervised, clustering
- “Train a robot to walk by trying random moves” → reinforcement
- “Train a chatbot to be more helpful by getting thumbs-up/down feedback” → reinforcement (RLHF)
- “Detect when a server’s metrics look unusual” → unsupervised, anomaly detection
- “Pre-train an LLM by predicting the next word in trillions of sentences” → self-supervised
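For the reinforcement scenarios, "trial and error from rewards" can be sketched with an epsilon-greedy agent on a multi-armed bandit. This is a classic toy setup chosen for illustration, not something the exam asks you to code; the reward probabilities, step count, and seed are all assumptions.

```python
import random

def run_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: returns its estimated value for each action."""
    rng = random.Random(seed)
    n = len(reward_probs)
    counts = [0] * n
    estimates = [0.0] * n
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n)                          # explore: random move
        else:
            action = max(range(n), key=lambda a: estimates[a])  # exploit best guess
        # The environment gives feedback: reward 1 or 0.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        # Incremental average of the rewards seen for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates

# Hypothetical environment: three actions with hidden success rates.
estimates = run_bandit([0.2, 0.5, 0.8])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print(best_action, [round(e, 2) for e in estimates])
```

No labeled examples are involved. The agent only ever sees the reward for the action it just took, which is what separates reinforcement learning from supervised learning.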
Reinforcement learning trap
Task 1.2 — Practical Use Cases for AI
▶1.2.1 When AI helps and when it doesn't
This is heavily tested. Some questions ask “When is AI not the right tool?” The answers usually involve:
- Hard rules required — tax calculations, regulatory thresholds. Use deterministic code, not ML.
- Decisions that must be 100% explainable — life-or-death medical or legal. ML can be too opaque.
- Cost outweighs benefit — small dataset, simple problem. A spreadsheet rule beats a model.
- No data available — ML needs data. No data = no ML.
- Real-time, very low latency needs — large LLM inference can be too slow.
When AI is the right tool
▶1.2.2 Traditional ML vs. Generative AI — when to pick which (v1.1 addition)
| Use traditional ML when… | Use generative AI / FM when… |
|---|---|
| You have structured tabular data and a specific prediction target | You have unstructured data (text, images, audio) or want to generate content |
| The task is narrow and defined (fraud, churn, demand forecast) | The task is broad or open-ended (Q&A, summaries, drafts) |
| You need full explainability for regulators | Some opacity is acceptable |
| You have plenty of labeled examples | Few or no labeled examples (FMs work zero-shot or few-shot) |
| You need consistent, deterministic outputs | You can tolerate non-deterministic outputs and verify them |
| Cost per prediction must be very low | Token-based cost is acceptable |
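To get a feel for "token-based cost," here is a rough estimator built on the rule of thumb of about 4 characters per token. The prices passed in are placeholders, not real Bedrock pricing, and `estimate_tokens` / `estimate_cost` are hypothetical helpers; a real tokenizer will give different counts.

```python
import math

CHARS_PER_TOKEN = 4  # rough rule of thumb for English text, not exact

def estimate_tokens(text: str) -> int:
    """Approximate token count from character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)

def estimate_cost(prompt: str, expected_output_tokens: int,
                  price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Approximate cost of one request, with separate input and output prices."""
    input_tokens = estimate_tokens(prompt)
    return (input_tokens / 1000) * price_in_per_1k \
        + (expected_output_tokens / 1000) * price_out_per_1k

if __name__ == "__main__":
    prompt = "Summarize this review: " + "great product " * 100
    # Placeholder prices per 1,000 tokens (assumed values for illustration).
    per_request = estimate_cost(prompt, 200, price_in_per_1k=0.001,
                                price_out_per_1k=0.002)
    print(f"~${per_request:.6f} per request, "
          f"~${per_request * 1_000_000:.2f} per million requests")
```

Multiplying a tiny per-request number by your real request volume is usually what decides whether an FM or a cheap traditional model is the better fit.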
FMs are not always the right answer
▶1.2.3 Real-world AWS-flavored use cases
| Use case | AWS managed service |
|---|---|
| Sentiment analysis on customer reviews | Amazon Comprehend |
| Detect text in scanned documents / forms | Amazon Textract |
| Detect faces, objects, unsafe content in images | Amazon Rekognition |
| Convert audio recordings to text | Amazon Transcribe |
| Convert text to spoken audio | Amazon Polly |
| Translate text across languages | Amazon Translate |
| Build chatbots / voice bots | Amazon Lex |
| Personalize product recommendations | Amazon Personalize |
| Enterprise search across documents | Amazon Kendra |
| Generate text / code / images via LLM | Amazon Bedrock |
| Build, train, deploy custom ML models | Amazon SageMaker AI |
| Enterprise AI assistant for employees | Amazon Q Business |
| AI coding assistant | Amazon Q Developer |
| BI insights / dashboards with AI | Amazon QuickSight Q |
| Migrate / modernize workloads with AI | AWS Transform (v1.1 addition) |
Textract vs. Rekognition (always tested)
Comprehend vs. Kendra
Personalize vs. just "use Bedrock"
▶1.2.4 Capabilities and limitations of AI/ML for solving business problems
Capabilities
- Process huge volumes of unstructured data fast
- Personalize at scale
- Automate repetitive analytical work
- Surface patterns humans would miss
- Generate content drafts, summaries, code
Limitations
- Hallucinations — GenAI invents plausible-sounding falsehoods
- Bias — model reflects bias in training data
- Opacity — hard to explain why a deep model made a specific call
- Drift — performance degrades as the world changes after training cutoff
- Cost — large models can be expensive per inference
- Privacy / data leakage — sensitive data could leak into outputs
- Non-determinism: the same prompt may give different answers
Task 1.3 — ML Development Lifecycle (deeper)
▶1.3.1 Components of an ML pipeline (in order)
- Data collection
- Exploratory data analysis (EDA)
- Data pre-processing
- Feature engineering
- Model training
- Hyperparameter tuning
- Evaluation
- Deployment
- Monitoring
Mnemonic: "Data Engineers Prep Features, Train Tuned Evaluations, Deploy Monitors"
▶1.3.2 Sources of ML models — buy, build, or borrow
- Pre-trained from a marketplace / hub — fastest, cheapest, but generic. Use SageMaker JumpStart, Bedrock, AWS Marketplace.
- Fine-tuned from a pre-trained model — adapt to your data without training from scratch.
- Trained from scratch — full control, max cost, only worth it for huge unique datasets.
▶1.3.3 Methods to use a model in production
- Hosted real-time endpoint — always-on inference (SageMaker Endpoint, Bedrock).
- Batch transform — score a big chunk all at once.
- Async inference — long-running, non-blocking.
- Serverless endpoint — pay per request, scales to zero.
- Edge deployment — model runs on the device (SageMaker Edge, but rarely tested on AIF-C01).
▶1.3.4 MLOps concepts (high level)
You don’t need to do MLOps for this exam, but you should recognize the words:
- CI/CD for ML — automated pipeline from data → trained model → deployment
- Model registry — version control for models (SageMaker Model Registry)
- Feature store — central, reusable feature library (SageMaker Feature Store)
- Drift detection — alert when production data shifts away from training data
- A/B testing — split traffic between two models to compare performance
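Drift detection is the easiest of these to picture in code. The sketch below flags drift when the production average of one feature moves more than a chosen number of training standard deviations. This is a deliberately simple heuristic with an assumed threshold and invented numbers; managed monitoring tools use richer statistics.

```python
import statistics

def drift_score(training_values, production_values):
    """How many training standard deviations the production mean has moved."""
    train_mean = statistics.mean(training_values)
    train_std = statistics.stdev(training_values)
    prod_mean = statistics.mean(production_values)
    return abs(prod_mean - train_mean) / train_std

def has_drifted(training_values, production_values, threshold=2.0):
    """Alert when the shift exceeds the (assumed) threshold."""
    return drift_score(training_values, production_values) > threshold

# Invented feature values, e.g. average basket size in dollars.
training = [48, 50, 52, 49, 51, 50]
production_ok = [49, 51, 50, 50]
production_shifted = [70, 72, 68, 71]

print(has_drifted(training, production_ok))
print(has_drifted(training, production_shifted))
```

When the alert fires, the usual response is the last lifecycle step: investigate, then retrain on fresher data.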
MLOps depth is out of scope
▶1.3.5 Model performance metrics
| Metric | What it tells you | When to use |
|---|---|---|
| Accuracy | Percentage of all predictions that are correct | Balanced datasets only |
| Precision | Of items you predicted positive, how many actually were? (low false positives) | When false positives are expensive (spam filter) |
| Recall | Of all actual positives, how many did you catch? (low false negatives) | When missing a positive is dangerous (cancer screening) |
| F1 score | Harmonic mean of precision and recall | Imbalanced data where precision and recall both matter |
| AUC-ROC | How well a classifier ranks positives above negatives across thresholds | Binary classification quality overall |
| RMSE / MAE | Average error magnitude in regression | Predicting numbers |
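For the regression row, a short sketch shows how MAE and RMSE differ on the same predictions. The prices are invented; the formulas are the standard definitions.

```python
import math

def mae(actual, predicted):
    """Mean absolute error: the average size of the miss."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error: squares each miss first, so big misses
    count for more."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

# Invented house prices in $1000s. One prediction is off by 40.
actual = [300, 250, 400, 350]
predicted = [310, 240, 400, 390]

print(mae(actual, predicted), round(rmse(actual, predicted), 2))
```

RMSE comes out higher than MAE here because the single large miss dominates once errors are squared. Pick RMSE when large errors are especially costly.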
Confusion matrix in plain English
- True Positive (TP) = model said “yes,” answer was “yes”
- True Negative (TN) = model said “no,” answer was “no”
- False Positive (FP) = model said “yes,” answer was “no” (false alarm)
- False Negative (FN) = model said “no,” answer was “yes” (missed it)
- Precision = TP / (TP + FP). Recall = TP / (TP + FN).
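The definitions above translate directly into code. This sketch counts the four confusion-matrix cells from two label lists and then applies the precision and recall formulas, plus F1 as their harmonic mean. The labels are invented, and 1 means "yes."

```python
def confusion_counts(actual, predicted):
    """Return (TP, TN, FP, FN) for binary labels where 1 = yes, 0 = no."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false alarm
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # missed it
    return tp, tn, fp, fn

def precision_recall_f1(actual, predicted):
    tp, tn, fp, fn = confusion_counts(actual, predicted)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Invented example: 4 real positives, 6 real negatives.
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

print(confusion_counts(actual, predicted))
print(precision_recall_f1(actual, predicted))
```

Here the model caught 2 of 4 real positives (recall 0.5) and was right on 2 of its 3 "yes" calls (precision about 0.67).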
Accuracy is misleading on imbalanced data
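A quick way to convince yourself: score a "model" that always says no on a dataset where positives are rare. The 10-in-1,000 fraud rate below is an invented example.

```python
# 1,000 transactions, only 10 of them fraudulent (label 1).
actual = [1] * 10 + [0] * 990

# A useless "model" that predicts "not fraud" every single time.
always_negative = [0] * len(actual)

correct = sum(1 for a, p in zip(actual, always_negative) if a == p)
accuracy = correct / len(actual)

caught = sum(1 for a, p in zip(actual, always_negative) if a == 1 and p == 1)
recall = caught / sum(actual)   # sum(actual) = number of real positives

print(f"accuracy={accuracy:.2f}, recall={recall:.2f}")
```

The accuracy looks excellent while the model catches zero fraud, which is why recall, precision, or F1 is the safer choice when one class is rare.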
▶1.3.6 Business metrics for AI/ML solutions
Technical metrics aren’t enough — the exam expects you to also think in business terms.
- ROI / cost savings — does the model save more than it costs?
- Customer satisfaction — surveys, NPS, support feedback
- Cost per inference / cost per user
- Time saved per task
- Conversion rate / click-through rate
- Reduction in manual review
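Two of these metrics, time saved per task and ROI, combine naturally into a back-of-the-envelope calculation. Every number below is invented, and the helper names are hypothetical; the ROI formula is the standard (benefit − cost) / cost.

```python
def annual_time_savings_value(minutes_saved_per_task, tasks_per_year, hourly_rate):
    """Dollar value of the staff time a model saves in a year."""
    hours_saved = minutes_saved_per_task * tasks_per_year / 60
    return hours_saved * hourly_rate

def roi(annual_benefit, annual_cost):
    """Return on investment as a fraction: (benefit - cost) / cost."""
    return (annual_benefit - annual_cost) / annual_cost

# Assumed scenario: a summarizer saves 6 minutes on each of 50,000 tickets.
benefit = annual_time_savings_value(6, 50_000, hourly_rate=30.0)
cost = 60_000.0   # assumed yearly inference + maintenance cost

print(f"benefit=${benefit:,.0f}, ROI={roi(benefit, cost):.0%}")
```

A model with great F1 but negative ROI is still a bad business decision, which is the angle these questions are testing.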
Watch for "the most appropriate metric to evaluate the business impact"
Self-Quiz
Question 1
A retail company wants to group its customers into segments based on purchasing behavior, but it does not have predefined segment labels. Which type of machine learning is most appropriate?
- A. Supervised classification
- B. Supervised regression
- C. Unsupervised clustering
- D. Reinforcement learning
Question 2
A logistics company processes 5 million shipping records every night and needs to score each one for predicted delivery time. Cost is a major consideration. Which inference type is most appropriate?
- A. Real-time inference
- B. Batch inference
- C. Asynchronous inference
- D. Serverless inference
Question 3
Which AWS service should be used to extract printed text and form fields from scanned PDFs of insurance claims?
- A. Amazon Rekognition
- B. Amazon Comprehend
- C. Amazon Textract
- D. Amazon Kendra
Question 4
A medical diagnostic model must minimize the chance of missing a true positive case. Which metric is the most appropriate to optimize?
- A. Precision
- B. Recall
- C. Accuracy
- D. RMSE
Question 5
Which of the following is the correct order of stages in a typical ML lifecycle?
- A. Training → Data prep → Evaluation → Deployment
- B. Data collection → Data prep → Training → Evaluation → Deployment → Monitoring
- C. Deployment → Training → Monitoring → Data prep
- D. Data collection → Deployment → Training → Evaluation