BIT3201A BBIT300 DATA WAREHOUSING AND DATA MINING

UNIVERSITY EXAMINATIONS 2017/2018
EXAMINATION FOR THE DEGREE OF BACHELOR OF SCIENCE
IN INFORMATION TECHNOLOGY/ BACHELOR OF BUSINESS IN
INFORMATION TECHNOLOGY
BIT 3201A/BBIT 300: DATA WAREHOUSING AND DATA MINING
FULL TIME/PART TIME/DISTANCE LEARNING
DATE: DECEMBER, 2017 TIME: 2 HOURS
INSTRUCTIONS: Answer Question One & ANY OTHER TWO questions.

QUESTION ONE: 30 MARKS (COMPULSORY)
a) Differentiate between data mining for characterization and data mining for
discrimination 4 Marks
b) Discuss any six desired features of cluster analysis. 6 Marks
c) Discuss six factors that influence the selection and acquisition of data mining
software. 6 Marks
d) Describe any six types of data that are gathered and can be mined, and state the types of
organizations that gather these types of data. 6 Marks
e) In the context of association rules mining and using two items X and Y, define the
following terms. 4 Marks
i. Support
ii. Confidence
f) In the context of association rules mining describe the following terms. 4 Marks
i. Frequent itemsets
ii. Confident rules
QUESTION TWO: 20 MARKS
a) Define the term pruning and explain its application in data mining. 4 Marks
b) Discuss six ways in which the data that has been mined can be visually presented.
6 Marks
c) Discuss four benefits of data mining 4 Marks
d) Discuss any five challenges facing data mining. 5 Marks
QUESTION THREE: 20 MARKS
a) Define the following terms 4 Marks
i. Data warehousing
ii. Data mining
b) By use of appropriate examples, discuss the following possible discoveries from a data
mining exercise. 4 Marks
i. Discrimination
ii. Sequence Analysis
iii. Time Series Analysis
iv. Outlier Analysis
c) Discuss six factors that lead to the growth and popularity of data mining.
6 Marks
d) Describe the various classifications of data mining systems. 6 Marks
QUESTION FOUR: 20 MARKS
a) Differentiate between OLTP and OLAP systems 4 Marks
b) Discuss five characteristics of OLAP 5 Marks
c) Consider a retail shop with the following set of transactions

Using the improved Apriori algorithm, find the association rules with 30% support and 60%
confidence.
11 Marks
QUESTION FIVE: 20 MARKS
a) Describe four distance measures in cluster analysis 4 Marks
i. Euclidean distance
ii. Manhattan distance
iii. Chebyshev distance
iv. Categorical data distance
b) In the context of cluster analysis and with the help of a well-labelled diagram, differentiate
between inter-class similarity and intra-class similarity. 4 Marks
c) Describe three applications of each of the following data mining tasks in real life.
6 Marks
i. Classification
ii. Clustering
d) Describe the following cluster-based methods; where possible use a
diagram.
6 Marks
i. K-means
ii. Density based method
