UNIVERSITY EXAMINATIONS: 2012/2013
EXAMINATION FOR THE BACHELOR OF SCIENCE IN
INFORMATION TECHNOLOGY
BIT 4204 DATA WAREHOUSING AND DATA MINING
DATE: AUGUST, 2013 TIME: 2 HOURS
INSTRUCTIONS: Answer Question ONE and any other TWO Questions
QUESTION ONE: 30 MARKS (COMPULSORY)
a) Define the following terms. (4 Marks)
(i) Data warehousing
(ii) Data mining
b) Describe any six types of data that can be mined and state the types of organizations
that gather these types of data. (6 Marks)
c) Discuss any four applications of data mining. (6 Marks)
d) With the help of a diagram, illustrate the Knowledge Discovery Process. (8 Marks)
e) Discuss six ways in which the data that has been mined can be visually presented.
(6 Marks)
QUESTION TWO: 20 MARKS
a) Define the term data warehouse. (2 Marks)
b) Discuss four benefits of data mining. (4 Marks)
c) Discuss any four challenges facing data mining. (4 Marks)
d) Describe any five possible discoveries from a data mining exercise. (10 Marks)
QUESTION THREE: 20 MARKS
a) Discuss six factors that have led to the growth and popularity of data mining.
(6 Marks)
b) State and explain four ways of categorizing data mining systems. (6 Marks)
c) Define an OLAP system. (2 Marks)
d) Discuss six characteristics of OLAP. (6 Marks)
QUESTION FOUR: 20 MARKS
a) A grocery shop sells six items which are Bread, Cheese, Eggs, Juice, Milk and
Yogurt. The shopkeeper also keeps a record of the transactions as follows.
Using the Apriori algorithm, find the association rules with 50% support and 75% confidence.
(14 Marks)
b) Discuss six factors that influence the selection and acquisition of data mining
software. (6 Marks)
QUESTION FIVE: 20 MARKS
a) In building a decision tree, three possible attributes are considered as split attributes.
The information gains for attributes A, B, and C are 0.97, 0.029, and 0.15,
respectively. Which attribute should be selected for the split, and why? (3 Marks)
b) The table below shows the training data for classifying bank loan applications by
assigning applications to one of the risk classes.
(i) Find the attribute that has the highest information gain. (13 Marks)
(ii) Draw the decision tree for the table above. (4 Marks)