unitinfo
This page provides helpful information about many of the coursework units offered by Computer Science and Software Engineering in 2023.
The information here is not official - for official information please see the current UWA Handbook.
Rather, this page is intended to help students prepare for their upcoming units before the beginning of each semester, and before they have access to UWA's Learning Management System (LMS).
About the unit CITS4012 Natural Language Processing (1st semester 2023)
Unit description:
Natural language has been, and will remain, the preferred way to store and transfer knowledge. More than 80% of electronic data in modern societies is generated and stored as text. Processing this unstructured text to extract useful insights, support actionable decision making, and uncover the hidden treasure of collective intelligence is therefore of enormous value. In this unit, we start with traditional text processing techniques using regular expressions and discuss the need for text processing and normalisation. We then introduce the fundamental pipelines of natural language processing (NLP), including part-of-speech tagging and various approaches to sentence parsing, with the aim of introducing traditional text feature collection techniques for higher-level tasks such as sentiment or document classification. Building on an understanding of the pros and cons of feature-based NLP pipeline approaches, the unit moves on to the modern deep-learning approach to NLP, focusing on word vector representations, neural language models, and recurrent neural networks. The unit situates these techniques around major NLP tasks, including information extraction, sentiment detection, dialogue systems, and machine translation.
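As an illustration of the kind of traditional text processing the unit begins with, below is a minimal Python sketch of regular-expression-based normalisation and tokenisation. It is not drawn from the unit's materials; the function names (normalise, tokenise) and the particular cleaning steps are only assumptions about a typical preprocessing pipeline.

    import re

    def normalise(text):
        # Illustrative cleaning steps, not the unit's prescribed pipeline.
        text = text.lower()
        text = re.sub(r"https?://\S+", " ", text)   # drop URLs
        text = re.sub(r"[^a-z0-9\s']", " ", text)   # keep letters, digits, apostrophes
        text = re.sub(r"\s+", " ", text).strip()    # collapse whitespace
        return text

    def tokenise(text):
        # Whitespace tokenisation of the normalised text.
        return normalise(text).split()

    print(tokenise("Check out https://example.com -- NLP is FUN!!!"))
    # ['check', 'out', 'nlp', 'is', 'fun']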
Unit outcomes:
Students are able to: (1) apply pre-processing techniques for textual data preparation; (2) build pipelines for core NLP tasks; (3) critically analyse different language models; (4) explain how vector representations of words can be obtained; (5) evaluate the performance of NLP solutions, both traditional and neural; and (6) undertake core components of major NLP tasks.
Unit coordinator:
Unit homepage:
Unit is offered in these majors and courses:
Indicative weekly topics:
week 1  | Introduction to Natural Language Processing
week 2  | Word Embedding and Representation
week 3  | From Linear Models to Neural Networks
week 4  | Sequence Model and Recurrent Neural Networks
week 5  | Language Fundamentals
week 6  | Part-of-Speech Tagging
week 7  | Dependency Parsing
week 8  | Attention and Transformer
week 9  | Pretrained Models in NLP
week 10 | Natural Language Generation
week 11 | Applications: GPT, Named Entity Recognition, Topic Modelling
week 12 | Revision and Ethics
Indicative assessment:
Programming Projects and Final Exam
Useful prior experience and background knowledge:
Python, Jupyter Notebook, Numpy, Pandas, Matplotlib, Google Colab
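To gauge the level of background knowledge assumed, the short Python sketch below shows routine NumPy and pandas usage of the kind students would be expected to read and write without instruction. The data frame and column names are invented for illustration and are not taken from the unit.

    import numpy as np
    import pandas as pd

    # Invented example data: per-document token counts and sentiment labels.
    df = pd.DataFrame({
        "doc": ["a", "b", "c", "d"],
        "tokens": [120, 45, 300, 87],
        "label": ["pos", "neg", "pos", "neg"],
    })

    long_docs = df[df["tokens"] > 100]                  # boolean filtering
    mean_tokens = df.groupby("label")["tokens"].mean()  # group-by aggregation
    log_tokens = np.log1p(df["tokens"].to_numpy())      # vectorised arithmetic

    print(long_docs)
    print(mean_tokens)
    print(log_tokens.round(2))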
Useful prior programming and software experience:
Hardware required for this unit:
Students are able to undertake their laboratory exercises and projects in laboratories in the CSSE building, but most students also complete work on their own laptops.
The following hardware is required to successfully complete this unit: a standard laptop.
Software required for this unit:
The following software is required to successfully complete this unit: Conda, Python and various packages, an SSH client, and a Python IDE (e.g. Visual Studio Code, PyCharm).
Operating system(s) used in this unit:
Different units use different operating systems for their teaching - for in-class examples, laboratory exercises, and programming projects.
If a particular operating system is required, it will be used when marking assessments.
This unit can be undertaken on any reasonable platform.
This information last updated 6:47pm Thu 20th Apr 2023