
Data Mining

Program: Computer science

Teacher: Mei Yu

Credit: 2

Duration: 40

Course Name: Data Mining

Course Code: S2293208

Semester: 4

Credit: 2

Program: Computer science

Course Module: Optional

Responsible: Mei Yu

E-mail: wjr@tju.edu.cn

Department: Tianjin International Engineering Institute

Time Allocation (1 credit hour = 45 minutes)

Exercise

Lecture

Lab-study

Project

Internship (days)

Personal Work

8

12

20

10

Course Description

This is an optional course designed for Engineering Master's students in Computer Science at TIEI. It draws on artificial intelligence, machine learning, pattern recognition, statistics, and databases, and covers techniques that analyze data automatically and draw inductive inferences from it. Using simple language, the course comprehensively and systematically introduces the basic concepts, methods, and technologies of data mining, as well as the latest progress in the field, from the perspective of databases and data warehouses. It presents the top-ten data mining algorithms in detail through application examples, so that students can apply what they learn in practice.

Prerequisite

  • Mathematical statistics and analysis: concepts and methods

  • Data structures: data storage and query methods

Course Objectives

This course discusses the basic concepts of data mining to help students discover latent knowledge in data. After this course, students should be able to:

  • Understand what data mining is and how to tackle real problems with data mining methods,

  • Master the algorithms related to on-line analytical processing (OLAP), classification, clustering, prediction, and so on,

  • Identify several data mining strategies and the application environment of each strategy, and

  • Comprehensively understand how to build a model with data mining technology to solve a real problem.
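As a small illustration of the clustering objective, the following is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is not taken from the course material: the function name `kmeans`, its parameters, and the toy points in the usage note are assumptions chosen for illustration, and the initial centroids are passed in explicitly so the run is deterministic.

```python
def kmeans(points, initial_centroids, max_iterations=20):
    """Cluster equal-length numeric tuples with Lloyd's algorithm.

    points: list of tuples (toy, in-memory data).
    initial_centroids: list of tuples, one starting centroid per cluster
        (supplied by the caller here; real uses often pick them at random).
    Returns (centroids, clusters), where clusters[i] holds the points
    assigned to centroids[i].
    """
    centroids = list(initial_centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)

        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                mean = tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)

        # Stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters
```

For example, calling `kmeans([(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)], [(0, 0), (10, 10)])` separates the three points near the origin from the three points near (8, 8). The result of k-means depends on the starting centroids, which is one reason the choice of strategy and application environment matters.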

Course Syllabus

  • Data mining overview: definition, tasks, mining objects.

  • Data preprocessing: data, data quality issues, data preprocessing.

  • Clustering: definition, main methods.

  • Classification: definition; methods such as decision trees, naive Bayes, and neural networks.

  • Association analysis: definition, task, Apriori algorithm, FP-tree algorithm.

  • Exception mining (outlier detection): definition, applications, causes of exceptional data, solutions.
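To give a flavor of the association analysis topic, here is a minimal sketch of the Apriori idea for finding frequent itemsets. It is an illustrative toy, not the course's reference implementation: the function name `apriori`, the use of an absolute support count, and the in-memory list of sets are all assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support_count} for all frequent itemsets.

    transactions: list of sets of items (toy, in-memory data).
    min_support: minimum number of transactions an itemset must appear in.
    """
    # Level 1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: by the Apriori property, every (k-1)-subset
            # of a frequent k-itemset must itself be frequent.
            if all(frozenset(sub) in current for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: one pass over the transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent
```

The prune step is what distinguishes Apriori from brute-force enumeration: candidates with any infrequent subset are discarded before the data is scanned. The FP-tree approach listed alongside it avoids candidate generation altogether and is not shown here.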

Textbooks & References

  • Jiawei Han and Micheline Kamber. Data Mining: Concepts and Techniques. China Machine Press, 2006.

  • D. Hand, H. Mannila and P. Smyth. Principles of Data Mining. Springer, 2004.

  • Pang-Ning Tan, Michael Steinbach and Vipin Kumar. Introduction to Data Mining. Addison Wesley, 2005.

  • Te-Ming Huang, Vojislav Kecman and Ivica Kopriva. Kernel Based Algorithms for Mining Huge Data Sets: Supervised, Semi-supervised and Unsupervised Learning. Springer, 2006.

Grade Distribution

Survey: 12%; Experiments: 18%; Final Exam: 70%

Capability Tasks

CT2: To understand the basic concepts and steps of data mining.

CT3: To master related algorithms, such as OLAP, classification, clustering, and prediction.

CT4: To implement data mining algorithms in a given environment.

CS1: To master the basic theories of data mining, and understand the development status and trends of data mining.

CS2: To grasp the top-ten data mining algorithms well enough to develop a system.

Achievements

  • To understand which problems data mining technology can handle. - Level: M

  • To grasp OLAP, classification, clustering, and prediction algorithms. - Level: M

  • To apply data mining technology to solve practical problems. - Level: M

  • To use a programming language to implement data mining algorithms. - Level: M

Students: Computer science, Year 2