Syllabus

This course introduces the fundamental concepts of Natural Language Processing (NLP) and advanced language model technologies. Students learn through hands-on practice, progressing from basic text processing to language model API usage and NLP application development. The course emphasizes Large Language Models (LLMs) and prompt engineering, with the aim of building practical skills in applying current NLP technologies.

Learning Objectives

Understand the basic concepts and key technologies of NLP and language models.
Practice core NLP techniques such as text preprocessing and word embeddings, and analyze the transformer architecture.
Learn methods to perform various NLP tasks using LLM APIs.
Master prompt engineering techniques and apply them to solve real-world problems.
Develop skills to design and implement NLP-based web applications.
Understand the ethical aspects of LLM utilization and learn methods for developing safe AI systems.

Course Outline

Week 1

Overview: Introduction to Natural Language Processing and Language Models
Key Learning Content: Basic concepts of NLP, application areas, introduction to major tasks
Note: Lecture, Discussion on NLP application cases

Week 2

Overview: Basics of Text Preprocessing
Key Learning Content: Tokenization, normalization, stop word removal
Note: Lecture, Practice (Text preprocessing using the NLTK library)
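
Example (illustrative sketch): the three steps listed for this week can be chained in a few lines. The version below uses only the Python standard library rather than NLTK, so that it runs without downloading any corpora. The `STOP_WORDS` set and the `preprocess` function are hypothetical names made up for this illustration and are not taken from the course materials.

```python
import re

# Hypothetical, deliberately tiny stop word list for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "and", "to", "in"}


def preprocess(text):
    """Normalize, tokenize, and remove stop words from a string."""
    normalized = text.lower()                       # normalization: case folding
    tokens = re.findall(r"[a-z0-9]+", normalized)   # tokenization: alphanumeric runs
    return [t for t in tokens if t not in STOP_WORDS]  # stop word removal


print(preprocess("The cat is sitting in the garden."))
# ['cat', 'sitting', 'garden']
```

A real pipeline would likely use a library tokenizer and a fuller stop word list; the regular expression here also drops punctuation and non-ASCII characters, which is a simplification.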

Week 3

Overview: Fundamentals of Language Models
Key Learning Content: N-gram models, statistical language models
Note: Lecture, Practice (Implementation of simple N-gram models)
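
Example (illustrative sketch): a bigram model, the simplest non-trivial N-gram model, estimates the probability of a word given the previous word from raw counts. The function names below are assumptions chosen for this illustration, and the estimate is an unsmoothed maximum-likelihood estimate, so unseen word pairs get probability 0.

```python
from collections import Counter, defaultdict


def train_bigram_model(tokens):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for prev, curr in zip(tokens, tokens[1:]):
        counts[prev][curr] += 1
    return counts


def bigram_probability(counts, prev, curr):
    """Maximum-likelihood estimate of P(curr | prev); 0.0 for unseen contexts."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return followers[curr] / total if total else 0.0


corpus = "the cat sat on the mat".split()
model = train_bigram_model(corpus)
print(bigram_probability(model, "the", "cat"))  # 0.5: "the" is followed by "cat" once out of twice
```

Smoothing (for example add-one smoothing) would be the usual next step so that unseen bigrams do not receive zero probability.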

Week 4

Overview: Word Embeddings
Key Learning Content: Word2Vec, GloVe, FastText
Note: Lecture, Practice (Creating and visualizing word embeddings using Gensim)

Week 5

Overview: Introduction to Transformer Architecture
Key Learning Content: Attention mechanism, transformer structure
Note: Lecture, Analysis of transformer model structure

Week 6

Overview: Understanding LLM APIs
Key Learning Content: OpenAI API usage, tokenization, sampling methods
Note: Lecture, Practice (Simple text generation through API calls)

Special Lecture: 2024 Nobel Prize in Physics - Foundations of Machine Learning

Overview: Foundational discoveries enabling machine learning with artificial neural networks
Key Learning Content: Hopfield networks, Boltzmann machines, and their impact on modern AI and NLP
Note: Discussion on the connection between physics and machine learning

Special Lecture: 2024 Nobel Prize in Chemistry - Computational Protein Design and Structure Prediction

Overview: Breakthrough discoveries in computational protein design and AI-driven protein structure prediction
Key Learning Content:
- David Baker’s work on computational protein design
- Demis Hassabis and John Jumper’s development of AlphaFold2 for protein structure prediction
- Implications for NLP and AI in scientific research
Note: Lecture, Discussion on the intersection of AI, computational biology, and chemistry

Week 8

Overview: Midterm Project Presentation
Key Learning Content: Development of an NLP app prototype using content from Weeks 1-7
Note: Student project presentations and feedback

Week 9

Overview: Basics of Prompt Engineering
Key Learning Content: Zero-shot and few-shot prompting, chain-of-thought prompting
Note: Lecture, Practice (Applying various prompting techniques)

Week 10

Overview: Building LLM-based Q&A Systems
Key Learning Content: Introduction to vector databases, document parsing
Note: Lecture, Practice (Implementing a simple Q&A system)

Week 11

Overview: Basics of Web Application Development
Key Learning Content: Introduction to Flask/Streamlit, basic web app structure
Note: Lecture, Practice (Creating a web app prototype using an LLM API)

Week 12

Overview: Controlling and Structuring LLM Outputs
Key Learning Content: Adjusting temperature, utilizing top_p, JSON output
Note: Lecture, Practice (Building an app for structured data extraction)
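
Example (illustrative sketch): temperature and top_p both reshape the next-token probability distribution before sampling. The code below shows the commonly described behavior on a toy set of logits; it is not the implementation of any particular API, and the function names are assumptions made for this illustration. Temperature is assumed to be greater than 0.

```python
import math


def apply_temperature(logits, temperature):
    """Softmax over logits divided by the temperature (temperature > 0)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                          # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches top_p, then renormalize over that set."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


logits = [2.0, 1.0, 0.0]
print(apply_temperature(logits, 1.0))   # baseline distribution
print(apply_temperature(logits, 0.5))   # lower temperature: sharper, more deterministic
print(top_p_filter([0.5, 0.3, 0.2], 0.7))  # keeps token indices 0 and 1 only
```

Lower temperatures concentrate probability on the most likely token, while a smaller top_p removes the low-probability tail entirely.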

Week 13

Overview: Introduction to RAG (Retrieval-Augmented Generation)
Key Learning Content: RAG architecture, basics of vector search
Note: Lecture, Practice (Implementing a simple RAG system)
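
Example (illustrative sketch): the retrieval step of a RAG system ranks stored document vectors by similarity to a query vector. The sketch below uses cosine similarity over hand-written two-dimensional vectors; in practice the vectors would come from an embedding model and be stored in a vector database. The document ids, vectors, and function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vec, doc_vecs, k=1):
    """Return the ids of the k documents most similar to the query."""
    ranked = sorted(
        doc_vecs,
        key=lambda doc_id: cosine_similarity(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy embeddings; real ones have hundreds of dimensions.
docs = {"doc_a": [1.0, 0.0], "doc_b": [0.0, 1.0], "doc_c": [0.7, 0.7]}
print(retrieve([0.9, 0.1], docs, k=2))
# ['doc_a', 'doc_c']
```

In a full RAG pipeline, the text of the retrieved documents would then be placed into the prompt sent to the LLM.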

Week 14

Overview: Ethics and Safety in LLM Applications
Key Learning Content: Bias detection, content filtering, preventing prompt injection
Note: Lecture, Discussion (Ethical considerations in LLM usage)

Week 15

Overview: Final Project Presentation and Course Wrap-up
Key Learning Content: Sharing results of NLP app development projects
Note: Student project presentations, feedback, discussion on future learning directions

Evaluation

Attendance and Participation (10%)
Weekly Practical Assignments (30%)
Midterm Project (25%)
Final Project (35%)
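
Example (illustrative sketch): assuming each component is scored on a 0-100 scale, which the syllabus does not state, the overall grade is the weighted sum of the four components. The dictionary keys and the sample scores below are made up for this illustration.

```python
# Weights taken from the evaluation breakdown above.
WEIGHTS = {
    "attendance": 0.10,
    "assignments": 0.30,
    "midterm": 0.25,
    "final": 0.35,
}


def overall_grade(scores):
    """Weighted sum of component scores, each assumed to be on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student: 10 + 24 + 22.5 + 24.5 = 81
sample = {"attendance": 100, "assignments": 80, "midterm": 90, "final": 70}
print(round(overall_grade(sample), 2))
```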

Course Materials

Lecture Notes: https://nlp2024.halla.ai
GitHub: entelecheia/intronlp-2024
OpenAI API documentation, Hugging Face documentation, latest NLP-related papers and blog posts

Prerequisites

Basic Python Programming
Fundamentals of Statistics and Linear Algebra

Additional Notes

A personal laptop is required because the course is practice-oriented
Course content may be partially modified to reflect the latest technology trends