Syllabus

This course introduces the fundamental concepts of Natural Language Processing (NLP) and modern language model technologies. Students learn through hands-on practice, progressing from basic text processing to using language model APIs and developing NLP applications. The course emphasizes Large Language Models (LLMs) and prompt engineering, with the aim of building practical skills in applying current NLP technologies.

Learning Objectives

  1. Understand the basic concepts and key technologies of NLP and language models.

  2. Practice core NLP techniques such as text preprocessing, word embeddings, and the transformer architecture.

  3. Learn methods to perform various NLP tasks using LLM APIs.

  4. Master prompt engineering techniques and apply them to solve real-world problems.

  5. Develop skills to design and implement NLP-based web applications.

  6. Understand the ethical aspects of LLM utilization and learn methods for developing safe AI systems.

Course Outline

Week 1

  • Overview: Introduction to Natural Language Processing and Language Models

  • Key Learning Content: Basic concepts of NLP, application areas, introduction to major tasks

  • Note: Lecture, Discussion on NLP application cases

Week 2

  • Overview: Basics of Text Preprocessing

  • Key Learning Content: Tokenization, normalization, stop word removal

  • Note: Lecture, Practice (Text preprocessing using NLTK library)
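
As a rough preview of this week's topics, the three steps listed above (tokenization, normalization, stop word removal) can be sketched in plain Python. The tiny stop word list and the `preprocess` helper are illustrative assumptions, not course material; the practice session itself uses NLTK, whose tokenizers and stop word lists are more complete.

```python
import re

# Tiny illustrative stop word list (an assumption for this sketch);
# real lists, such as the one shipped with NLTK, are much longer.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "and", "to", "in"}


def preprocess(text):
    """Lowercase the text, split it into tokens, and drop stop words."""
    normalized = text.lower()                       # normalization
    tokens = re.findall(r"[a-z0-9]+", normalized)   # tokenization
    return [t for t in tokens if t not in STOP_WORDS]  # stop word removal


print(preprocess("The cat is sitting in the garden."))
# → ['cat', 'sitting', 'garden']
```

The regular expression keeps only runs of ASCII letters and digits, which is enough for an English example but would need to be replaced for other languages.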

Week 3

  • Overview: Fundamentals of Language Models

  • Key Learning Content: N-gram models, statistical language models

  • Note: Lecture, Practice (Implementation of simple N-gram models)
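
To give a sense of what a simple N-gram model might look like, here is a minimal sketch of a bigram model estimated by counting. The toy corpus, the `<s>`/`</s>` boundary markers, and the function names are assumptions made for illustration; the sketch uses unsmoothed maximum-likelihood estimates, so unseen word pairs get probability zero.

```python
from collections import Counter, defaultdict


def train_bigram(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # Add sentence boundary markers so starts and ends are modeled too.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, curr in zip(tokens, tokens[1:]):
            counts[prev][curr] += 1
    return counts


def bigram_prob(counts, prev, curr):
    """Estimate P(curr | prev) as count(prev, curr) / count(prev, *)."""
    total = sum(counts[prev].values())
    return counts[prev][curr] / total if total else 0.0


corpus = ["the cat sat", "the dog sat", "the cat ran"]  # toy data
model = train_bigram(corpus)
print(bigram_prob(model, "the", "cat"))  # 2 of the 3 words after "the" are "cat"
```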

Week 4

  • Overview: Word Embeddings

  • Key Learning Content: Word2Vec, GloVe, FastText

  • Note: Lecture, Practice (Creating and visualizing word embeddings using Gensim)

Week 5

  • Overview: Introduction to Transformer Architecture

  • Key Learning Content: Attention mechanism, transformer structure

  • Note: Lecture, Analysis of transformer model structure

Week 6

  • Overview: Understanding LLM APIs

  • Key Learning Content: OpenAI API usage, tokenization, sampling methods

  • Note: Lecture, Practice (Simple text generation through API calls)

Special Lecture: 2024 Nobel Prize in Physics - Foundations of Machine Learning

  • Overview: Foundational discoveries enabling machine learning with artificial neural networks

  • Key Learning Content: Hopfield networks, Boltzmann machines, and their impact on modern AI and NLP

  • Note: Discussion on the connection between physics and machine learning

Special Lecture: 2024 Nobel Prize in Chemistry - Computational Protein Design and Structure Prediction

  • Overview: Breakthrough discoveries in computational protein design and AI-driven protein structure prediction

  • Key Learning Content:

    • David Baker’s work on computational protein design

    • Demis Hassabis and John Jumper’s development of AlphaFold2 for protein structure prediction

    • Implications for NLP and AI in scientific research

  • Note: Lecture, Discussion on the intersection of AI, computational biology, and chemistry

Week 8

  • Overview: Midterm Project Presentation

  • Key Learning Content: Development of NLP app prototype using content from weeks 1-7

  • Note: Student project presentations and feedback

Week 9

  • Overview: Basics of Prompt Engineering

  • Key Learning Content: Zero-shot and few-shot prompting, chain-of-thought prompting

  • Note: Lecture, Practice (Applying various prompting techniques)

Week 10

  • Overview: Building LLM-based Q&A Systems

  • Key Learning Content: Introduction to vector databases, document parsing

  • Note: Lecture, Practice (Implementing a simple Q&A system)

Week 11

  • Overview: Basics of Web Application Development

  • Key Learning Content: Introduction to Flask/Streamlit, basic web app structure

  • Note: Lecture, Practice (Creating a web app prototype using LLM API)

Week 12

  • Overview: Controlling and Structuring LLM Outputs

  • Key Learning Content: Adjusting temperature and top_p, producing JSON output

  • Note: Lecture, Practice (Building an app for structured data extraction)
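
The effect of the temperature and top_p settings named above can be illustrated without calling any API. The sketch below applies the usual textbook definitions (temperature-scaled softmax and nucleus filtering) to a hand-picked list of logits. The function names and the logit values are assumptions for this example, and a given provider's implementation may differ in detail.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative mass reaches top_p,
    zero out the rest, and renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return [probs[i] / mass if i in kept else 0.0 for i in range(len(probs))]


logits = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate tokens
base = softmax_with_temperature(logits, 1.0)
sharp = softmax_with_temperature(logits, 0.5)
print(base, sharp)              # the top token gains probability at temperature 0.5
print(top_p_filter(base, 0.9))  # the least likely token is removed
```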

Week 13

  • Overview: Introduction to RAG (Retrieval-Augmented Generation)

  • Key Learning Content: RAG architecture, basics of vector search

  • Note: Lecture, Practice (Implementing a simple RAG system)
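
The retrieval half of RAG reduces to nearest-neighbor search over embedding vectors. The following is a minimal sketch using cosine similarity over hand-made three-dimensional vectors. The document IDs, vector values, and function names are all hypothetical; a real system would obtain embeddings from a model and typically store them in a vector database.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_vec, index, k=2):
    """Return the IDs of the k documents most similar to the query."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:k]]


# Hand-made "embeddings" (hypothetical values, for illustration only).
index = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.8, 0.3, 0.1],
    "doc_tax": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]
print(retrieve(query, index, k=2))
# → ['doc_cats', 'doc_dogs']
```

In a full RAG pipeline, the text of the retrieved documents would then be placed into the prompt sent to the language model.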

Week 14

  • Overview: Ethics and Safety in LLM Applications

  • Key Learning Content: Bias detection, content filtering, preventing prompt injection

  • Note: Lecture, Discussion (Ethical considerations in LLM usage)

Week 15

  • Overview: Final Project Presentation and Course Wrap-up

  • Key Learning Content: Sharing results of NLP app development projects

  • Note: Student project presentations, feedback, discussion on future learning directions

Evaluation

  1. Attendance and Participation (10%)

  2. Weekly Practical Assignments (30%)

  3. Midterm Project (25%)

  4. Final Project (35%)
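
As a worked example of how these weights combine, the sketch below computes an overall score from the four components. It assumes each component is graded on a 0 to 100 scale, which the syllabus does not state, and the sample scores are invented.

```python
# Weights in percent, taken from the list above.
WEIGHTS = {"attendance": 10, "assignments": 30, "midterm": 25, "final": 35}


def overall_score(scores):
    """Weighted average of component scores (each assumed to be 0-100)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Invented scores for illustration.
example = {"attendance": 100, "assignments": 80, "midterm": 70, "final": 90}
print(overall_score(example))
# → 83.0
```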

Course Materials

Prerequisites

  • Basic Python Programming

  • Fundamentals of Statistics and Linear Algebra

Additional Notes

  • A personal laptop is required, as the course is practice-oriented

  • Course content may be partially modified to reflect the latest technology trends