Syllabus
This course introduces the fundamental concepts of Natural Language Processing (NLP) and advanced language model technologies. Students learn through hands-on practice, progressing from basic text processing to using language model APIs and developing NLP applications. The course emphasizes Large Language Models (LLMs) and prompt engineering, with the aim of building practical skills in applying current NLP technologies.
Learning Objectives
Understand the basic concepts and key technologies of NLP and language models.
Practice core NLP techniques such as text preprocessing, word embeddings, and transformer architecture.
Learn methods to perform various NLP tasks using LLM APIs.
Master prompt engineering techniques and apply them to solve real-world problems.
Develop skills to design and implement NLP-based web applications.
Understand the ethical aspects of LLM utilization and learn methods for developing safe AI systems.
Course Outline
Week 1
Overview: Introduction to Natural Language Processing and Language Models
Key Learning Content: Basic concepts of NLP, application areas, introduction to major tasks
Note: Lecture, Discussion on NLP application cases
Week 2
Overview: Basics of Text Preprocessing
Key Learning Content: Tokenization, normalization, stop word removal
Note: Lecture, Practice (Text preprocessing using NLTK library)
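As a rough illustration of the three steps listed for this week, the sketch below tokenizes, normalizes, and removes stop words using only the Python standard library. The practice session itself uses NLTK; the `preprocess` function and the tiny `STOP_WORDS` set are hypothetical stand-ins for illustration, not NLTK's tokenizer or stop word list.

```python
import re
from typing import List

# Hypothetical, deliberately tiny stop word list for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "and", "to", "in"}


def preprocess(text: str) -> List[str]:
    """Lowercase the text, split it into word tokens, and drop stop words."""
    normalized = text.lower()                          # normalization
    tokens = re.findall(r"[a-z0-9]+", normalized)      # simple regex tokenization
    return [t for t in tokens if t not in STOP_WORDS]  # stop word removal


print(preprocess("The cat is sitting in the garden."))
```

A regex tokenizer like this one ignores contractions, hyphenation, and non-English text, which is one reason a dedicated library is used in the practice session.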
Week 3
Overview: Fundamentals of Language Models
Key Learning Content: N-gram models, statistical language models
Note: Lecture, Practice (Implementation of simple N-gram models)
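A minimal sketch of what a simple N-gram model might look like, assuming a bigram (N = 2) model with maximum-likelihood estimates and no smoothing. The function name `train_bigram_model` and the toy corpus are illustrative assumptions, not the course's actual assignment code.

```python
from collections import Counter, defaultdict


def train_bigram_model(tokens):
    """Estimate P(next | previous) from bigram counts (maximum likelihood)."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    model = {}
    for prev, followers in counts.items():
        total = sum(followers.values())
        model[prev] = {word: n / total for word, n in followers.items()}
    return model


# Toy corpus, made up for illustration.
corpus = "the cat sat on the mat the cat slept".split()
model = train_bigram_model(corpus)
print(model["the"])  # probabilities of the words that follow "the"
```

Without smoothing, any bigram absent from the corpus gets probability zero, which is the usual motivation for techniques such as add-one smoothing.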
Week 4
Overview: Word Embeddings
Key Learning Content: Word2Vec, GloVe, FastText
Note: Lecture, Practice (Creating and visualizing word embeddings using Gensim)
Week 5
Overview: Introduction to Transformer Architecture
Key Learning Content: Attention mechanism, transformer structure
Note: Lecture, Analysis of transformer model structure
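To make the attention mechanism concrete, here is a small sketch of scaled dot-product attention written with plain Python lists. It assumes a single head, no masking, and no learned projection matrices; real transformer implementations use tensor libraries, and the function names here are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    dim = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights = softmax(scores)
        # The output is the attention-weighted average of the value vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
        )
    return outputs


# Two identical keys receive equal weight, so the output is the mean of the values.
result = attention([[1.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]], [[1.0, 2.0], [3.0, 4.0]])
print(result)
```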
Week 6
Overview: Understanding LLM APIs
Key Learning Content: OpenAI API usage, tokenization, sampling methods
Note: Lecture, Practice (Simple text generation through API calls)
Special Lecture: 2024 Nobel Prize in Physics - Foundations of Machine Learning
Overview: Foundational discoveries enabling machine learning with artificial neural networks
Key Learning Content: Hopfield networks, Boltzmann machines, and their impact on modern AI and NLP
Note: Discussion on the connection between physics and machine learning
Special Lecture: 2024 Nobel Prize in Chemistry - Computational Protein Design and Structure Prediction
Overview: Breakthrough discoveries in computational protein design and AI-driven protein structure prediction
Key Learning Content: David Baker’s work on computational protein design; Demis Hassabis and John Jumper’s development of AlphaFold2 for protein structure prediction; implications for NLP and AI in scientific research
Note: Lecture, Discussion on the intersection of AI, computational biology, and chemistry
Week 8
Overview: Midterm Project Presentation
Key Learning Content: Development of NLP app prototype using content from weeks 1-7
Note: Student project presentations and feedback
Week 9
Overview: Basics of Prompt Engineering
Key Learning Content: Zero-shot and few-shot prompting, chain-of-thought prompting
Note: Lecture, Practice (Applying various prompting techniques)
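As one possible illustration of few-shot prompting, the sketch below assembles a prompt from a handful of labeled examples followed by a new input. The sentiment task, the example reviews, and the helper name `build_few_shot_prompt` are all assumptions made for this example; the resulting string would be sent to whichever LLM API the practice session uses.

```python
def build_few_shot_prompt(examples, query):
    """Format labeled examples followed by the new input for the model to complete."""
    lines = ["Classify the sentiment of each review as positive or negative.", ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)


# Made-up examples for illustration.
examples = [
    ("The plot was gripping from start to finish.", "positive"),
    ("I want my two hours back.", "negative"),
]
prompt = build_few_shot_prompt(examples, "A charming, well-acted film.")
print(prompt)
```

Passing an empty `examples` list turns the same template into a zero-shot prompt.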
Week 10
Overview: Building LLM-based Q&A Systems
Key Learning Content: Introduction to vector databases, document parsing
Note: Lecture, Practice (Implementing a simple Q&A system)
Week 11
Overview: Basics of Web Application Development
Key Learning Content: Introduction to Flask/Streamlit, basic web app structure
Note: Lecture, Practice (Creating a web app prototype using LLM API)
Week 12
Overview: Controlling and Structuring LLM Outputs
Key Learning Content: Adjusting temperature, utilizing top_p, JSON output
Note: Lecture, Practice (Building an app for structured data extraction)
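The effect of temperature and top_p can be sketched with a toy sampler over raw logits. This is a generic illustration of how these parameters are commonly described, not the internal implementation of any particular LLM API; the function names are hypothetical and the temperature is assumed to be greater than zero.

```python
import math
import random


def apply_temperature(logits, temperature):
    """Convert logits to probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]  # assumes temperature > 0
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their mass reaches top_p, then renormalize."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return {i: probs[i] / mass for i in kept}


def sample(logits, temperature=1.0, top_p=1.0, rng=random):
    """Sample one token index after temperature scaling and nucleus filtering."""
    candidates = top_p_filter(apply_temperature(logits, temperature), top_p)
    indices = list(candidates)
    return rng.choices(indices, weights=[candidates[i] for i in indices], k=1)[0]


print(top_p_filter([0.5, 0.3, 0.2], top_p=0.7))
print(sample([0.0, 1.0, 2.0], temperature=0.7, top_p=0.9))
```

Lowering either parameter makes the output more predictable: temperature reshapes the whole distribution, while top_p cuts off its low-probability tail.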
Week 13
Overview: Introduction to RAG (Retrieval-Augmented Generation)
Key Learning Content: RAG architecture, basics of vector search
Note: Lecture, Practice (Implementing a simple RAG system)
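The retrieval step of a RAG pipeline can be sketched as a nearest-neighbor search by cosine similarity, followed by inserting the retrieved text into the prompt. The three-dimensional vectors below are made up by hand; in practice they would come from an embedding model and be stored in a vector database, and the names `retrieve` and `index` are illustrative assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vec, index, k=2):
    """Return the k documents whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc) for doc, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


# Hypothetical toy index: (document text, hand-made 3-d embedding).
index = [
    ("Transformers use attention.", [0.9, 0.1, 0.0]),
    ("Flask is a web framework.", [0.0, 0.2, 0.9]),
    ("Attention weights sum to one.", [0.8, 0.3, 0.1]),
]
context = retrieve([1.0, 0.2, 0.0], index, k=2)

# The retrieved passages are placed in the prompt ahead of the question.
prompt = (
    "Answer using this context:\n"
    + "\n".join(context)
    + "\n\nQuestion: What is attention?"
)
print(prompt)
```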
Week 14
Overview: Ethics and Safety in LLM Applications
Key Learning Content: Bias detection, content filtering, preventing prompt injection
Note: Lecture, Discussion (Ethical considerations in LLM usage)
Week 15
Overview: Final Project Presentation and Course Wrap-up
Key Learning Content: Sharing results of NLP app development projects
Note: Student project presentations, feedback, discussion on future learning directions
Evaluation
Attendance and Participation (10%)
Weekly Practical Assignments (30%)
Midterm Project (25%)
Final Project (35%)
Course Materials
Lecture Notes: https://nlp2024.halla.ai
GitHub: entelecheia/intronlp-2024
OpenAI API documentation, Hugging Face documentation, latest NLP-related papers and blog posts
Prerequisites
Basic Python Programming
Fundamentals of Statistics and Linear Algebra
Additional Notes
Personal laptop required as the course is practice-oriented
Course content may be partially modified to reflect the latest technology trends