BIT 3201 DATA WAREHOUSING AND DATA MINING  KCA Past Paper

UNIVERSITY EXAMINATIONS: 2014/2015
ORDINARY EXAMINATION FOR THE BACHELOR OF SCIENCE
IN INFORMATION TECHNOLOGY
BIT 3201 DATA WAREHOUSING AND DATA MINING
(DISTANCE LEARNING)
DATE: DECEMBER, 2014 TIME: 2 HOURS
INSTRUCTIONS: Answer Question ONE and any other TWO

QUESTION ONE [30 MARKS]
a) Define the following terms as used in data warehousing and data mining:
i) Data
ii) Data mart
iii) Data dredging
iv) Data scrubbing [4 Marks]
b) For each of the following tasks, state whether or not it is a data mining task. Give
a reason in each case.
i) Dividing the customers of a company according to their profitability.
ii) Monitoring the heart rate of a patient for abnormalities.
iii) Computing the total sales of a company.
iv) Sorting a student database based on student registration numbers.
v) Predicting the outcomes of tossing a fair coin.
vi) Predicting the future stock price of a company using historical records.
[12 Marks]
c) Differentiate between a data warehouse and a database. [2 Marks]
d) Using two examples in each case, distinguish between predictive and descriptive data
mining techniques. [6 Marks]
e) Discuss three ways of representing the database design in an OLAP system.
[6 Marks]
QUESTION TWO [20 MARKS]
A multidimensional database (MDB) is defined as a type of database that is optimized for
data warehouse and online analytical processing (OLAP) applications. Multidimensional
databases are frequently created using input from existing relational databases. In this
respect:
a) Define the following terms:
i) Data cube
ii) Dimension
iii) Dimension table
iv) Schema [4 Marks]
b) Describe any three benefits of multidimensional databases [6 Marks]
c) Describe the following categories of OLAP tools.
i) ROLAP
ii) HOLAP
iii) MOLAP [6 Marks]
d) Describe the following analytical operations supported by multidimensional OLAP
databases.
i) Drill-down
ii) Slicing and dicing
iii) Roll-up [6 Marks]
QUESTION THREE [20 MARKS]
The main purpose of a data warehouse is to provide aggregate data such as totals, averages,
variances and trends in a format suitable for decision making. From this point of
view:
a) Define the term data warehouse. [2 Marks]
b) Discuss the four key characteristics of a data warehouse. [8 Marks]
c) Discuss the following components of a data warehouse:
i) Operational databases
ii) Load manager
iii) Warehouse manager
iv) Query manager
v) End user access tools [5 Marks]
d) Describe the activities involved in designing and implementing a data warehouse
[5 Marks]
QUESTION FOUR [20 MARKS]
Data mining can be defined as the process of uncovering potentially useful information
in the data warehouse. From this standpoint:
a) Discuss the challenges that face data mining with regard to:
i) Data mining methodology and user interaction issues
ii) Performance issues [6 Marks]
b) Describe the various classifications of data mining techniques. [4 Marks]
c) Discuss the architecture of a typical data mining system. [6 Marks]
d) Discuss how data mining can be applied in the following fields.
i) Medicine
ii) Marketing
iii) Crime management [6 Marks]
QUESTION FIVE [20 MARKS]
a) Define clustering as used in data mining. [2 Marks]
b) Discuss the following distance measures in cluster analysis.
i) Euclidean distance
ii) Manhattan distance
iii) Chebyshev distance [6 Marks]
c) The following data was extracted from a certain retail shop in town.

i) Calculate the support for the rule {Bread,Milk}->{Cake}. [3 Marks]
ii) Calculate the confidence for the rule in (i) above. [3 Marks]
iii) Explain the use of confidence and support of a rule. [3 Marks]
d) Explain the use of the Apriori algorithm in data mining.
[3 Marks]
