MISM5302, MDA5304  DATA MINING AND WAREHOUSING  KCA Past Paper

UNIVERSITY EXAMINATIONS: 2018/2019
EXAMINATION FOR THE DEGREES OF MASTER OF SCIENCE IN
INFORMATION SYSTEMS MANAGEMENT
MISM5302, MDA5304 DATA MINING AND WAREHOUSING
ORDINARY EXAMINATIONS
DATE: DECEMBER, 2018 TIME: 2 HOURS
INSTRUCTIONS: Answer Question One & ANY OTHER TWO questions.

QUESTION ONE
(a) Describe the meaning of the following terms in the context of data mining and warehousing:
(i) Noisy data (1 Mark)
(ii) Data mining (1 Mark)
(iii) Entropy (1 Mark)

(b) Briefly describe the knowledge discovery process. Use a diagram to illustrate your answer. (4 Marks)
(c) State and explain two techniques of data reduction. (2 Marks)
(d) Consider the following confusion matrix:
 a  b  c   <-- classified as
 7  6  9 | a = part time
 1  8  4 | b = full time
 5  3  7 | c = Distance learning
Use the above confusion matrix to determine the following: (3 Marks)
(i) Precision for full time class
(ii) Recall for part time class
(iii) True negatives for Distance learning class
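
As an illustration of how the quantities in (d) can be read off the matrix, the following Python sketch assumes that rows are actual classes and columns are predicted classes (the usual reading of a "classified as" header). The function names are illustrative only and do not come from the paper.

```python
# Confusion matrix from part (d).
# Assumption: rows = actual class, columns = predicted class.
labels = ["part time", "full time", "distance learning"]
matrix = [
    [7, 6, 9],  # actual a = part time
    [1, 8, 4],  # actual b = full time
    [5, 3, 7],  # actual c = distance learning
]


def precision(matrix, k):
    """True positives of class k divided by everything predicted as k (column sum)."""
    column_total = sum(row[k] for row in matrix)
    return matrix[k][k] / column_total


def recall(matrix, k):
    """True positives of class k divided by everything that actually is k (row sum)."""
    return matrix[k][k] / sum(matrix[k])


def true_negatives(matrix, k):
    """Instances that neither belong to class k nor were predicted as class k."""
    size = len(matrix)
    return sum(
        matrix[i][j]
        for i in range(size)
        for j in range(size)
        if i != k and j != k
    )


print("Precision (full time):", precision(matrix, 1))            # 8 / 17
print("Recall (part time):", recall(matrix, 0))                  # 7 / 22
print("True negatives (distance learning):", true_negatives(matrix, 2))  # 7 + 6 + 1 + 8
```

If the matrix were instead read with columns as actual classes, the precision and recall formulas would swap roles, so the orientation should be stated in any answer.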
(e) Describe four types of data mining tasks. (4 Marks)
(f) Describe two motivations for data mining. (2 Marks)
(g) Describe two techniques for filling in missing values during the pre-processing phase. (2 Marks)
QUESTION TWO
(a) Briefly explain the meaning of the following terms in the context of data mining and
warehousing:
(i) Clustering (1 Mark)
(ii) Dendrogram (1 Mark)
(iii) Category utility (1 Mark)
(b) State and explain three types of clustering approaches (3 Marks)
(c) Briefly explain three metrics (functions) for measuring the similarity of data items during
clustering. (3 Marks)
(d) Explain any two applications of clustering in business enterprises. (2 Marks)
(e) Describe four operations that are used by the COBWEB algorithm when building the
classification tree. (4 Marks)
QUESTION THREE
(a) Describe any three situations in which decision tree learning methods can be considered.
(3 Marks)
(b) Consider the following data set

Compute the information gain for selecting the outlook attribute as the root of the decision tree
using the ID3 algorithm. (5 Marks)
(c) Briefly explain two symptoms of overfitting and two approaches to avoiding it.
(4 Marks)
(d) Describe the criteria for stopping the building of a decision tree. (2 Marks)
(e) Describe the meaning of the term ‘overfitting’ as used in decision tree learning. (1 Mark)
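
The data set for (b) is not reproduced in this copy of the paper, so the calculation cannot be carried out here. As a hedged sketch of the method only, the Python below computes entropy and information gain for an attribute. The rows in `sample` and the target name `play` are hypothetical placeholders, not the exam's data.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical placeholder rows: NOT the data set from the exam paper.
sample = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "sunny", "play": "no"},
    {"outlook": "overcast", "play": "yes"},
    {"outlook": "rain", "play": "yes"},
]

gain = information_gain(sample, "outlook", "play")
print("Information gain for outlook:", gain)
```

ID3 would evaluate this gain for every candidate attribute and place the one with the highest value at the root.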
QUESTION FOUR
(a) Describe the meaning of the following terms in the context of data warehousing:
(i) Dimension (1 Mark)
(ii) Schema (1 Mark)
(iii) Fact (1 Mark)
(b) There are three types of schemas that can be used to design and develop a data warehouse.
(3 Marks)
(c) Describe any two properties of a data mart. (2 Marks)
(d) Describe the meaning of the initials ETL in the context of data warehousing. (3 Marks)
(e) State and explain four characteristics of a data warehouse. (4 Marks)
