Catalog Course Description:
This course introduces the concepts and principles of knowledge discovery in databases (KDD), with a focus on the techniques of data mining and its function in business, governmental, medical and other information-intensive environments.
Pre-requisites and Co-requisites:
This course is intended for the advanced MS, MSIS, CAS, or Ph.D. student who is specializing in information systems. A background in programming is highly recommended.
Specific prerequisite courses are:
INFO 605 Database Management I
INFO 629 Artificial Intelligence or INFO 612 Knowledge Base Systems
Curriculum Role:
This course is a domain course for Ph.D. students. Ph.D. students in the IS program typically take it in the second year, after they finish the 5 core courses.
Course Rationale:
This course is offered to provide students with advanced knowledge of data mining techniques, algorithms, and methods. Students learn data pre-processing and a range of data mining algorithms, including supervised, semi-supervised, and unsupervised learning.
Course Outcomes:
Upon successful completion of this course, a student will be able to:
Understand the issues of KDD, its history, uses, and motivation.
Understand the importance of data pre-processing and techniques for accomplishing it.
Become familiar with a variety of data mining techniques, apply them to practice databases, and understand which tools are appropriate for which data mining tasks.
Understand how to develop and evaluate a data mining enterprise in the context of the KDD life cycle.
Assist clients in evaluating the appropriateness and efficacy of a variety of data mining methods, as well as the output of the data mining enterprise.
Course Content:
Principal topics and the approximate number of weeks devoted to each are:
Introduction (1)
Data warehouse and OLAP technology for data mining (1)
Data pre-processing methods (1)
Association rules (1)
Classification and prediction (2)
Cluster analysis and other unsupervised methods (1)
Mining complex data (1)
Data mining in bioinformatics (1)
Data mining in various work environments (1)
Presentation:
Note: Presentation method may vary somewhat from section to section.
The principal methods of presentation are lecture, in-class presentation, and class discussion.
Assessment:
Note: Assessment method may vary somewhat from section to section.
Evaluation is by examination and by homework assignments or a project in each of the major areas described above.