Big Data

Program: Computer science

Teacher: Jianrong Wang

Credit: 2

Duration: 40

Course Name: Big Data

Course Code: S2293209

Semester: 4

Credit: 2

Program: Computer science

Course Module: Optional

Responsible: Jianrong Wang

E-mail: wjr@tju.edu.cn

Department: Tianjin International Engineering Institute

Time Allocation (1 credit hour = 45 minutes)

Exercise

Lecture

Lab-study

Project

Internship (days)

Personal Work

8

12

20

10

Course Description

This optional course is designed for Engineering Master's students of Computer Science at TIEI. It gives a broad introduction to big data, covering an overview of the field, its key technologies and challenges, NoSQL databases, cloud databases, Hadoop, HDFS, HBase, MapReduce, stream computing, and related topics. Students gain a preliminary but wide-ranging understanding of big data that enables them to work with large data sets.

Prerequisite

  • Basic knowledge of data structures and algorithm analysis

  • Basic concepts and steps of data mining

Course Objectives

This course covers the basic concepts of big data to help students work with large data sets. After this course, students should be able to:

  • Understand the core knowledge and the development of big data,

  • Master the currently popular big data technologies and understand their characteristics and application scenarios, and

  • Build a big data system and develop simple big data applications.

Course Syllabus

  • Big data overview: basic concepts, development history, main technologies.

  • Hadoop: project structure and its components.

  • Distributed file system HDFS: basic concepts, structure and design requirements, programming.

  • Distributed database HBase: access interfaces, data model, implementation principles and operating mechanism.

  • NoSQL databases: the four types and the three major foundations.

  • Cloud database: concepts, features.

  • MapReduce: workflow, design methods.

  • Stream computing: basic concepts and requirements, design ideas and architecture of Storm.

  • Graph computing: the Pregel graph computation model.

  • Data visualization: concepts and visualization tools.

Textbooks & References

  • Viktor Mayer-Schönberger and Kenneth Cukier. Big Data: A Revolution That Will Transform How We Live, Work and Think. Hodder Export, 2013.

  • Ian Ayres. Super Crunchers: Why Thinking-By-Numbers Is the New Way to Be Smart. Bantam, 2008.

  • Nathan Marz and James Warren. Big Data: Principles and Best Practices of Scalable Realtime Data Systems. Manning Publications, 2015.

  • Frank J. Ohlhorst. Big Data Analytics: Turning Big Data into Big Money. Wiley, 2012.

Capability Tasks

CT2: To master the basic concepts of big data.

CT3: To master the principles, installation, and configuration of HDFS.

CS1: To master the basic theories of big data, and understand the development status and trends of big data.

CS2: To gain a comprehensive and solid foundation in NoSQL databases and cloud databases in order to process big data sets.

Achievements

  • To master the basic knowledge of big data. - Level: A

  • To apply HDFS to process big data. - Level: M

  • To understand the working principles of MapReduce, Hive, HBase, Storm, and ZooKeeper. - Level: A

  • To be able to present results using visualization algorithms. - Level: M

Students: Computer science, Year 2