Lecture 4: AI's Limitations and Risks — Hallucination, Bias, and Privacy

The Danger of Blindly Trusting AI

AI tools like ChatGPT and Claude demonstrate remarkable capabilities — but they also have critical limitations. The core of AI literacy is understanding both AI’s abilities and its limitations at the same time.

AI Capabilities vs. Limitations — A Realistic Assessment
Domain	What AI Does Well	What AI Cannot Do
Knowledge	Reproducing patterns from vast training data	Guaranteeing current information or expert-verified knowledge
Language	Generating fluent, structured text	Truly understanding meaning or reliably grasping intent
Reasoning	Following logical steps	Reasoning creatively about genuinely novel concepts
Reliability	Maintaining consistent style	Reliably recognizing and correcting its own errors

Hallucination

Hallucination is the most dangerous way in which AI fails.

Hallucination is when an LLM confidently generates facts that don’t exist. Despite the name, AI isn’t actually confused — it’s simply producing the most plausible next tokens, which may have no basis in reality.
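To make the "most plausible next tokens" idea concrete, here is a toy sketch in Python. It is not how any real LLM is implemented: the context string, the candidate tokens, and the probabilities in NEXT_TOKEN_PROBS are all made up for illustration. The point is only that the selection step looks at plausibility scores and nothing else.

```python
import random

# Hypothetical toy "model": a made-up probability table for one context.
# Nothing in this table records whether a continuation is actually true.
NEXT_TOKEN_PROBS = {
    "The study was published in": {"Nature": 0.5, "Science": 0.3, "2019": 0.2},
}

def most_plausible_next(context: str) -> str:
    """Return the highest-probability next token for a known context."""
    probs = NEXT_TOKEN_PROBS[context]
    return max(probs, key=probs.get)

def sample_next(context: str, rng: random.Random) -> str:
    """Sample a next token in proportion to its probability."""
    probs = NEXT_TOKEN_PROBS[context]
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

context = "The study was published in"
# The toy model names a journal whether or not any such study exists.
print(context, most_plausible_next(context))
```

Neither function checks the output against reality, which is why a fluent, confident answer can still be fabricated.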

Types of Hallucination and Real Examples
Type	Description	Real-World Case
Non-existent citations	Cites papers, books, or cases that don't exist	Mata v. Avianca (US, 2023): lawyers filed a brief citing case law fabricated by ChatGPT and were sanctioned by the court
Date and number errors	Confidently states incorrect figures	Wrong corporate financial figures, faulty statistics
Blended information	Incorrectly merges multiple facts	Mixing the biographical details of two different people
Missing recent information	Unaware of events after training cutoff	Makes incorrect guesses about post-cutoff events

Strategies to prevent acting on hallucinations:
→ Always verify important facts from original sources
→ "Tell me the source" is not enough — AI can fabricate sources too
→ High-stakes domains (legal, medical, financial) require expert review
→ Narrow questions to specific, verifiable claims
→ Treat AI responses as drafts; humans must verify

AI Bias

AI can absorb the biases embedded in its training data — and amplify them.

Types of AI Bias and Real Harms
Type of Bias	Cause	Real-World Case
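Data Bias	Training data over- or under-represents certain groups	MIT Media Lab "Gender Shades" study (2018): commercial facial analysis systems misclassified the gender of darker-skinned women at error rates of up to about 35%
Historical Bias	Learns from past discriminatory patterns	Amazon's experimental hiring AI downgraded resumes that included the word "women's" (reported in 2018)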
Confirmation Bias Amplification	Recommends content that reinforces existing beliefs	Social media filter bubbles, radicalization
Linguistic Bias	English-dominated training data	Lower performance and accuracy for non-English languages

Recognize Bias

Consciously check whether AI outputs favor or disadvantage certain groups (by gender, race, age, or nationality).
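One simple way to make this check concrete is to compare how often each group receives a favorable outcome. The sketch below assumes you have collected AI decisions as (group, outcome) pairs; the function names and the sample data are hypothetical and made up for illustration. A large gap does not prove bias by itself, but it tells you where to look more closely.

```python
from collections import defaultdict

def positive_rate_by_group(records):
    """Return the share of favorable outcomes per group.

    records: iterable of (group, outcome) pairs, where outcome is True
    for a favorable decision and False otherwise.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, outcome in records:
        totals[group] += 1
        if outcome:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}

def rate_gap(rates):
    """Difference between the highest and lowest group rates."""
    return max(rates.values()) - min(rates.values())

# Made-up example: AI screening decisions for two groups of applicants.
decisions = [
    ("A", True), ("A", True), ("A", False), ("A", True),
    ("B", True), ("B", False), ("B", False), ("B", False),
]
rates = positive_rate_by_group(decisions)
print(rates)            # group A: 3 of 4 favorable, group B: 1 of 4
print(rate_gap(rates))  # difference between the two rates
```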

Diversify Outputs

Compare results from multiple AI tools and try prompts from different perspectives.

Be Cautious with High-Stakes Decisions

Do not use AI as the sole criterion for important decisions such as hiring, loan approvals, or medical diagnoses.

Provide Feedback

Report incorrect AI outputs. Collective feedback contributes to model improvement.

Privacy and AI

Privacy Risks When Using AI
Risk Type	Description	Prevention
Inclusion in Training Data	Information you input may be used for future model training	Check the service's terms; opt out of data training if possible
Leakage of Sensitive Information	AI may regenerate personal information about others	Never input real names, ID numbers, or account details
Corporate Confidentiality Leaks	Entering work information into external AI poses risks	Use enterprise AI or minimize what you input
Copyright Infringement	AI-generated content may reproduce copyrighted material from the training data	Check the AI service's policy before commercial use
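As a rough illustration of "minimize what you input", the following Python sketch masks obvious identifiers before a prompt is pasted into an external AI service. The patterns and placeholder labels are assumptions chosen for this example, not a complete solution: they catch email addresses and long digit sequences (such as account or ID numbers), but they do not detect real names or other free-text personal details, which still need a human check.

```python
import re

# Hypothetical patterns for illustration only; real data needs broader rules.
PATTERNS = [
    # Email addresses such as jane.doe@example.com
    ("[EMAIL]", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    # Runs of 8 or more digits, optionally separated by spaces or hyphens
    ("[NUMBER]", re.compile(r"\b\d(?:[ -]?\d){7,}\b")),
]

def redact(text: str) -> str:
    """Replace matches of each pattern with its placeholder."""
    for placeholder, pattern in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text

prompt = "Contact jane.doe@example.com about account 1234-5678-9012."
print(redact(prompt))
```

Short numbers such as years are left alone because the digit pattern requires at least eight digits.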

Deepfakes and AI Disinformation

Types of AI-Generated Disinformation
Type	Technology	Harm Examples
Deepfake Video	Face synthesis via GANs or diffusion models	Celebrity impersonation scams, election disinformation
Voice Cloning	Can clone a voice from only a few seconds of sample audio	Fake kidnapping phone calls impersonating family members
AI-Generated Text	Automated mass production of fake news and comments	Public opinion manipulation, fake reviews
Synthetic Images	Photos of events that never happened	Spreading false information about wars or disasters

Perfect detection is difficult, but check for: (1) unnatural edges around the face or hair, (2) blinking or lip movement that is out of sync, (3) mismatched earrings or glasses, and (4) inconsistent backgrounds. Also (5) verify the account the content originated from. AI detection tools like Deepware and Microsoft Video Authenticator can also help, though none of them is fully reliable.

Key Takeaways

Hallucination: AI confidently generates non-existent facts → always verify important information from original sources
Bias: Training data prejudices are absorbed and amplified by AI → never use AI alone for high-stakes decisions
Privacy: Never input sensitive information into external AI → use dedicated enterprise AI for confidential work
Deepfakes: AI-generated disinformation is surging → critically verify the source and context of information