BIT3201A DATA WAREHOUSING AND DATA MINING

UNIVERSITY EXAMINATIONS: 2017/2018
EXAMINATION FOR THE DEGREE IN BBIT/BAC/BSc IT/BSc ICT
BIT3201A DATA WAREHOUSING AND DATA MINING
MODE: FULL TIME/PART TIME/DISTANCE LEARNING
ORDINARY EXAMINATIONS
DATE: APRIL, 2018 DURATION: 2 HOURS
INSTRUCTIONS: Answer Question ONE and any other TWO questions

QUESTION ONE [30 MARKS]
a) Data quality can be assessed in terms of accuracy, completeness, and consistency, among
others. Describe TWO other dimensions of data quality.
4 Marks
b) Describe the concept of data mining.
1 Mark
c) Suppose your task as a software engineer at KCA University is to design a data mining
system to examine the university courses database, which contains the following information: the
name, address and status (undergraduate or graduate) of each student, the courses taken, and
their grades. Describe the architecture you would choose as well as the purpose of each
component of the architecture.
6 Marks
d) Define each of the following data mining functionalities:
i. Characterization
ii. Discrimination
iii. Association
iv. Clustering
4 Marks
e) Discuss the following types of data warehouse usage:
i. Information processing
ii. Data mining
iii. Analytical processing
6 Marks
f) In data warehouse technology, a multidimensional view can be implemented by a
relational database technique (ROLAP), a multidimensional database technique (MOLAP),
or a hybrid database technique (HOLAP).
i. Briefly describe each implementation technique.
6 Marks
ii. Explain any THREE common operations/functions performed using the above techniques.
3 Marks
QUESTION TWO [20 MARKS]
a) Suppose that a data warehouse consists of four dimensions: date, spectator, location, and
game. There are two measures: count and charge, where charge is the fee that a spectator pays
when watching a game on a given date. The spectators may be students, adults, or seniors,
with each category having its own charge rate. Draw a star schema for the data warehouse.
8 Marks
b) Describe the following concepts as used in data warehousing:
i. Data cleaning
ii. Data transformation
iii. Refresh
6 Marks
c) A data warehouse can be modelled using either a star schema or a snowflake schema. Briefly
describe the similarities and differences between the two models.
8 Marks
QUESTION THREE [20 MARKS]
a) Explain how dissimilarity between objects can be described by numerical variables.
6 Marks
b) Briefly explain the following approaches to clustering:
i. Partitioning method
ii. Hierarchical method
iii. Density-based method
9 Marks
c) Briefly describe the major steps of decision tree classification.
5 Marks
QUESTION FOUR [20 MARKS]
a) Describe the Apriori algorithm as used in data mining.
6 Marks
b) A certain store in town sold the following items to five different customers.
Using the Apriori algorithm, find the frequent itemsets with 60% and 80% confidence from the
store.
8 Marks
c) Using two items X and Y, define the following terms:
i. Support
ii. Confidence
6 Marks
QUESTION FIVE [20 MARKS]
a) Define the term pruning and explain its application in data mining.
4 Marks
b) Discuss SIX ways in which the data that has been mined can be visually presented.
6 Marks
c) Discuss FIVE benefits of data mining.
4 Marks
d) Discuss any FIVE challenges facing data mining.
5 Marks
