Lecture 3: Deep Learning and Neural Networks — How AI Perceives the World
Deep Learning: Inspired by the Brain
Artificial neural networks (ANNs) draw inspiration from the structure of neurons in the brain — but how they actually work is quite different from biological brains.
| Property | Biological Neuron | Artificial Neuron |
|---|---|---|
| Input | Receives chemical signals via dendrites | Weighted sum of numeric inputs (weight × input value) |
| Processing | Fires an electrical signal when threshold is exceeded | Passes through an activation function (ReLU, Sigmoid) |
| Output | Transmitted to the next neuron via the axon | Passes a number to the next layer |
| Learning | Synaptic strength changes (Hebbian learning) | Weights adjusted via backpropagation |
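The artificial-neuron column of the table can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the input values, the weights, and the `bias` term are made-up assumptions (the table itself only mentions weight × input), and `neuron` is a hypothetical helper name.

```python
import math

def relu(z):
    # ReLU: pass positive values through, clamp negatives to zero
    return max(0.0, z)

def sigmoid(z):
    # Sigmoid: squash any real number into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias, activation):
    # Input: weighted sum of the inputs (plus an assumed bias term)
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Processing and output: the activation decides what number is passed on
    return activation(z)

# Illustrative values only
inputs = [1.0, 2.0, 3.0]
weights = [0.5, -1.0, 0.25]
bias = 0.5

relu_out = neuron(inputs, weights, bias, relu)        # weighted sum is -0.25, so ReLU gives 0.0
sigmoid_out = neuron(inputs, weights, bias, sigmoid)  # roughly 0.44
print(relu_out, sigmoid_out)
```

The same weighted sum produces different outputs depending on the activation function, which is why the choice of activation matters.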
How Neural Networks Work
Forward propagation: Input data is transformed as it passes through multiple layers. At each layer, the values are multiplied by weights, summed, and passed through an activation function; the last layer produces the final output.
Loss calculation: The difference between the model's output and the correct answer is quantified as a single number, the loss. The smaller the loss, the closer the output is to the correct answer.
Backpropagation: To reduce the loss, the error is propagated backward from the output layer toward the input layer. This computes the gradient, that is, how much each weight contributed to the loss.
Weight update: Each weight is adjusted slightly in the direction opposite to its gradient. Repeated over many iterations (often millions), the model becomes progressively more accurate.
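The four steps above can be sketched with the smallest possible model: a single weight `w` that should learn the rule y = 2x. The data, the mean-squared-error loss, and the learning rate are assumptions chosen for illustration. With only one weight, backpropagation reduces to a single derivative; in a real network the same chain rule is applied layer by layer.

```python
# Toy data following y = 2x (assumed for illustration)
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w = 0.0               # start from an uninformed weight
learning_rate = 0.01  # assumed step size
losses = []

for step in range(200):
    # 1. Forward pass: compute the model's predictions
    preds = [w * x for x in xs]
    # 2. Loss: mean squared error between predictions and correct answers
    loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)
    # 3. Backward pass: derivative of the loss with respect to w
    grad = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
    # 4. Update: move the weight against the gradient
    w -= learning_rate * grad
    losses.append(loss)

print(f"learned w = {w:.4f}, first loss = {losses[0]:.2f}, last loss = {losses[-1]:.2e}")
```

The loss starts at 30.0 and shrinks toward zero as `w` approaches 2.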
An intuitive analogy:
→ Descending a mountain blindfolded (gradient descent)
→ Feeling the slope underfoot (gradient = slope)
→ Taking one step downhill at a time (weight update)
→ Eventually reaching the lowest point (minimum loss)
Challenges:
→ Global minimum vs. local minimum
→ Learning rate: too large → oscillates, too small → slow
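The learning-rate trade-off can be seen on a deliberately simple "mountain": the function f(x) = x², whose slope is 2x and whose lowest point is x = 0. The function, the starting point, and the three learning rates below are assumptions picked to make the behavior visible; `descend` is a hypothetical helper name.

```python
def descend(learning_rate, start=1.0, steps=20):
    # Gradient descent on f(x) = x**2, recording every position visited
    x = start
    path = [x]
    for _ in range(steps):
        grad = 2 * x              # slope underfoot
        x -= learning_rate * grad  # one step downhill
        path.append(x)
    return path

too_large = descend(1.1)    # overshoots: the sign flips every step and |x| grows
too_small = descend(0.001)  # barely moves in 20 steps
moderate = descend(0.1)     # steadily approaches the minimum at 0

print("too large:", [round(x, 2) for x in too_large[:5]])
print("too small:", round(too_small[-1], 4))
print("moderate: ", round(moderate[-1], 4))
```

In this sketch the overly large rate does not just oscillate, it diverges, while the overly small rate is still close to its starting point after 20 steps.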
Revolutionary Deep Learning Architectures
| Architecture | Specialization | Core Idea | Key Applications |
|---|---|---|---|
| CNN (Convolutional Neural Network) | Images & video | Extracts spatial patterns using filters | Face recognition, medical imaging, self-driving cars |
| RNN / LSTM | Sequential data | Remembers prior information (gate mechanism) | Translation, speech recognition (pre-2017) |
| Transformer | Text & general purpose | Attention processes entire context in parallel | GPT, BERT, ChatGPT, Claude |
| GAN (Generative Adversarial Network) | Image & audio generation | Generator vs. discriminator competing | Image generation, deepfakes |
| Diffusion Model | Image generation | Generates images by reversing a noise process | DALL·E, Stable Diffusion |
CNN — The Eyes of AI
| Layer | Features Extracted | Analogy |
|---|---|---|
| Early Layers (Low-level) | Edges, lines, color changes | Basic shapes like dots, lines, circles |
| Middle Layers (Mid-level) | Textures, patterns, corner combinations | Body parts like eyes, nose, ears |
| Later Layers (High-level) | Full faces, cars, animals | Recognizing cats, dogs, people |
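What an early layer does can be sketched by sliding one small filter over a tiny image. The 4×6 image and the vertical-edge filter below are hand-picked assumptions; in a real CNN the filter values are learned during training. `convolve2d` is a hypothetical helper that computes the sliding dot product without flipping the kernel (cross-correlation), which is what deep learning libraries typically call convolution.

```python
def convolve2d(image, kernel):
    # Slide the kernel over the image (no padding, stride 1) and
    # record the sum of element-wise products at each position
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for a in range(kh):
                for b in range(kw):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        output.append(row)
    return output

# Dark region (0) on the left, bright region (1) on the right
image = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]

# Responds where brightness increases from left to right
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, vertical_edge)
print(feature_map)  # [[0, 3, 3, 0], [0, 3, 3, 0]]
```

The output is zero over the uniform regions and large only where the filter straddles the dark-to-bright boundary, which is the sense in which early layers extract edges.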
In 2012, AlexNet won the ImageNet classification competition with a top-5 error rate of about 15%, compared with about 26% for the runner-up, cutting the error by roughly 40%. That moment is widely regarded as the beginning of the deep learning revolution. Since then, image classifiers have surpassed estimated human-level accuracy on the ImageNet benchmark.
Transformer — The Engine of Modern AI
Building on the introduction in Lecture 1, let’s understand Transformers more deeply.
| Property | RNN/LSTM | Transformer |
|---|---|---|
| Processing Method | Sequential (one step at a time) | All at once (parallel) |
| Long-Range Dependencies | Distant information fades | Attention connects any position directly |
| Training Speed | Cannot parallelize → slow | Parallelizable → fast |
| Model Size | Millions of parameters | Tens to hundreds of billions of parameters |
| Representative Models | LSTM translation models | GPT-4, Claude, Gemini |
The Transformer's core innovation — Attention:
Sentence: "The animal didn't cross the street because it was too tired."
RNN: "it" → information passed sequentially from earlier → distant "animal" fades
Transformer: "it" → attends to every word simultaneously → connects directly to "animal"
→ Correctly identifies "it = animal"
→ This is how LLMs understand complex context
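The "it" example can be sketched with scaled dot-product attention, the scoring rule used in Transformers. Everything numeric here is an assumption: the 2-dimensional key, value, and query vectors are hand-picked so that "it" points toward "animal", whereas a real model learns these vectors from data. Only three tokens from the sentence are included to keep the sketch short.

```python
import math

def softmax(scores):
    # Turn raw scores into positive weights that sum to 1
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    # Score the query against every key at once (no sequential pass)
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # Output: weighted average of the value vectors
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

tokens = ["animal", "street", "it"]
keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # hypothetical key vectors
values = keys                                 # reuse keys as values for simplicity
query_it = [4.0, 0.0]                         # hypothetical query vector for "it"

weights, output = attention(query_it, keys, values)
for token, weight in zip(tokens, weights):
    print(f"{token:>7}: {weight:.2f}")
```

With these vectors, most of the attention weight for "it" lands on "animal", no matter how far apart the two words are in the sentence.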
Limitations of Deep Learning
| Limitation | Explanation | Real-World Problem |
|---|---|---|
| Black Box | Cannot explain why it reached a conclusion | Trust issues with medical and legal AI |
| Data Dependency | Requires large volumes of high-quality data | Insufficient AI for diagnosing rare diseases |
| Distribution Shift | Fails when test conditions differ from training | Self-driving cars encountering unfamiliar roads |
| Compute Cost | Training requires enormous energy and expense | Environmental impact, accessibility inequality |
Key Takeaways
Backpropagation: propagate error backward → adjust weights → repeat
CNN: hierarchical feature extraction, from edges → parts → whole objects
Transformer: sequential processing → parallel Attention → speed and performance revolution
Deep learning limitations: black box + data dependency + compute cost
OIYO Editorial
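Content Editor. A knowledge incubator and professional content creator who researches and shares in-depth, practice- and certification-focused information on business, economics, law, and everyday life.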