Natural Language Processing and Computer Vision (English)
How does an AI model recognize a face among thousands, or write a flawless essay? In this in-depth minor, you will delve into the technology behind the biggest AI breakthroughs of this decade. You will learn how to teach computers to see, read, and understand, working your way from Convolutional Neural Networks to Transformers and Large Language Models (LLMs). You will not only study the theory, but also build advanced systems for image recognition and text analysis yourself. Become a specialist in the technology that is transforming industries.
Much of the world's data is unstructured: images, videos, and text. In this minor, you will learn how to turn this data into usable information and new content. We will delve deeply into the two most influential domains of the current AI revolution.
The minor is divided into two main components and an integration section:
Part 1: Computer Vision. In this unit, you will learn how machines process visual information. We will start with the basics of digital image processing and quickly scale up to Deep Learning.
- Convolutional Neural Networks (CNNs): You will learn the architecture of networks that recognize patterns in pixels.
- Object Detection & Segmentation: How do you ensure that a self-driving car not only sees an obstacle but also knows exactly where the curb ends and a pedestrian begins? We will work with models such as YOLO and Mask R-CNN.
- Generative Vision: You will explore the world of GANs and Diffusion models for generating and processing images.
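To give a flavour of what "recognizing patterns in pixels" means, here is a minimal sketch of the sliding-filter operation at the core of a CNN layer, written in plain Python. The function name `conv2d`, the toy image, and the hand-picked edge filter are assumptions made for illustration only; in a real network the filter weights are learned from data and a deep learning library handles the computation.

```python
def conv2d(image, kernel):
    """Slide a small filter over a 2D image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            # Weighted sum of the image patch under the filter.
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Toy 4x4 image (assumption): dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked vertical-edge filter (assumption); a CNN would learn such weights.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, kernel)
print(feature_map)  # → [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The filter responds only in the middle column, exactly where the image switches from dark to bright. Stacking many such learned filters is what lets a CNN build up from edges to more complex visual patterns.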
Part 2: Natural Language Processing. Text is complex due to context, nuance, and grammar. You'll learn how modern AI has broken through this barrier.
- From Word to Vector: You'll delve into word embeddings and the transition from RNNs to attention mechanisms.
- The Transformer Revolution: You'll learn how Transformer models work. We'll look at tokenization, contextual embeddings, and the architecture of Large Language Models.
- LLM Engineering: You'll get hands-on experience with Retrieval Augmented Generation (RAG) and fine-tuning models for specific applications.
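The attention mechanism named in Part 2 can be sketched in a few lines of plain Python. This is a simplified, single-head version of scaled dot-product attention, not a full Transformer layer; the function names `softmax` and `attention` and the toy vectors are assumptions chosen for illustration.

```python
import math

def softmax(xs):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention for lists of equal-length vectors."""
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Output is the weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Toy example (assumption): one query that points in the direction of the first key.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
queries = [[1.0, 0.0]]

outputs, weights = attention(queries, keys, values)
print(weights[0])  # the first weight is larger than the second
print(outputs[0])  # the output leans towards the first value vector
```

Because the query matches the first key most closely, most of the weight goes to the first value. In a Transformer, the queries, keys, and values are all computed from the token embeddings, which is how each word's representation comes to depend on its context.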
Part 3: Multimodal AI & Practical Lab.
- Multimodality: How do you connect language to images? Consider models like CLIP or DALL-E. You'll learn how AI creates descriptions for images or generates images from text prompts.
- Deployment & Ethics: Building a model is one thing, making it scalable is another. We'll cover the technical challenges of running large models and the ethical aspects: how do we prevent biased language models or unintended discrimination in computer vision?
- Project phase: You'll conclude the minor with a deep-dive project. For example, you might build a system that translates sign language into text in real time, a medical tool that diagnoses scans, or an advanced AI bot that analyzes and summarizes complex documents.
Learning objectives
After completing the NLP & Computer Vision minor, students will be able to:
- Implement state-of-the-art Deep Learning architectures (such as CNNs and Transformers) for complex visual and textual datasets.
- Perform advanced Computer Vision tasks, including object detection, segmentation, and image generation.
- Design complex NLP pipelines for tasks such as sentiment analysis, machine translation, and text extraction using Large Language Models (LLMs).
- Apply transfer learning and fine-tuning techniques to optimize existing pre-trained models for specific domains.
- Develop multimodal systems that bridge the gap between text and images (e.g., image captioning or text-to-image generation).
- Critically evaluate the performance of perception models and address their technical and ethical limitations (such as bias and deepfakes).
Entry requirements
Target groups: study programmes in Technology
Requirements: Mathematics B (VWO level), programming, AI engineering
Schedule
Description of meetings: Lectures, discussions, workshops and an ongoing project.
Attendance is required during assessment moments and for collaboration in project groups.
Contact hours are spread throughout the week during the day, 10 to 15 hours per week.
Assessment
Performance profile, 15 EC, minimum grade 5.5: The student delivers performance based on established performance indicators. Work attitude (behaviour), professional products (skills) and a written test (knowledge) will be assessed.
Additional information
Locations: Maastricht Paul-Henri Spaaklaan and Heerlen Brightlands Campus