UNIVERSITY EXAMINATIONS: 2020/2021
EXAMINATION FOR THE DEGREE OF BACHELOR OF SCIENCE IN
SOFTWARE DEVELOPMENT
BSD 3203: PROGRAMMING FOR DATA SCIENCE
FULL TIME / PART TIME / DISTANCE LEARNING
ORDINARY EXAMINATION
DATE: DECEMBER, 2021 TIME: 2 HOURS
INSTRUCTIONS: Question ONE is COMPULSORY. Choose any TWO OTHER Questions.
QUESTION ONE (20 Marks) Compulsory
a) Explain any three features that make Python a preferred language for data science.
[6 Marks]
b) Describe the following Scikit-Learn APIs:
i) Dataset representation
ii) Transformers
iii) Estimators
iv) Predictors [8 Marks]
c) Write NumPy code to perform the following tasks:
i) Create a matrix of a specified dimension containing only ones
ii) Create a 4×4 array of uniformly distributed random values between 0 and 1
iii) Create a 3×3 array of normally distributed random values with mean 0 and standard
deviation [6 Marks]
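Illustrative sketch for part (c), offered as one possible approach rather than a marking scheme. The matrix dimensions, the random seed, and the standard deviation of 1 are assumptions, since the question does not state them.

```python
import numpy as np

# Hypothetical dimensions for illustration; the question leaves them open.
rows, cols = 3, 5

# i) Matrix of the given shape filled with ones.
ones_matrix = np.ones((rows, cols))

# Seeded generator so the random draws are reproducible (seed is an assumption).
rng = np.random.default_rng(seed=0)

# ii) 4x4 array of uniformly distributed values in the half-open interval [0, 1).
uniform_values = rng.random((4, 4))

# iii) 3x3 array of normally distributed values with mean 0.
#      scale=1.0 is an assumed standard deviation; the question gives no value.
normal_values = rng.normal(loc=0.0, scale=1.0, size=(3, 3))

print(ones_matrix)
print(uniform_values)
print(normal_values)
```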
QUESTION TWO (15 Marks)
Using appropriate examples, explain the following Python libraries:
a) Pandas
b) Matplotlib
c) SciPy
d) Scikit-learn
e) TensorFlow [15 Marks]
QUESTION THREE (15 Marks)
Use the following dataset to answer the questions below:
Input (x) Output (y)
a) Choose a class of model by importing the appropriate estimator class from Scikit-Learn
[2 Marks]
b) Create the model and arrange the data into a features matrix and target vector [4 Marks]
c) Fit the model to your data and display the slope and intercept [3 Marks]
d) Check the results of model fitting to know whether the model is satisfactory [3 Marks]
e) Apply the model to new data [3 Marks]
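Illustrative sketch of the workflow in parts (a) to (e), assuming a simple linear regression is the intended model class. The x and y values below are hypothetical placeholders, not the values from the question's dataset, and the new inputs in step (e) are likewise made up.

```python
import numpy as np
from sklearn.linear_model import LinearRegression  # a) choose the model class

# Placeholder data (NOT the exam dataset): points lying exactly on y = 2x + 1.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x + 1.0

# b) Create the model; reshape x into a features matrix of shape
#    (n_samples, n_features). y is already a 1-D target vector.
model = LinearRegression(fit_intercept=True)
X = x[:, np.newaxis]

# c) Fit the model and read off the learned slope and intercept.
model.fit(X, y)
slope = model.coef_[0]
intercept = model.intercept_
print("slope:", slope, "intercept:", intercept)

# d) One simple check of fit quality: the R^2 score on the fitted data.
r_squared = model.score(X, y)
print("R^2:", r_squared)

# e) Apply the model to new (hypothetical) inputs.
x_new = np.array([6.0, 7.0])
y_new = model.predict(x_new[:, np.newaxis])
print("predictions:", y_new)
```

With real, noisy data the R^2 score would fall below 1, and a plot of the fitted line against the points would be a useful complementary check.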
QUESTION FOUR (15 Marks)
Use the Pima Indian Diabetes dataset shown below to perform the following tasks using Scikit-Learn:
a) Import the required libraries for the decision tree analysis and load in the required data
[4 Marks]
b) Determine the target and feature variables [2 Marks]
c) Divide the data into training and testing sets in the ratio of 70:30. [3 Marks]
d) Build the Decision Tree Model using Scikit-Learn [3 Marks]
e) Determine how accurately the classifier predicts the outcome. The accuracy is computed
by comparing actual test set values and predicted values. [3 Marks]
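Illustrative sketch of the steps in parts (a) to (e). A synthetic dataset from `make_classification` stands in for the diabetes data, because the file name and column names are not given here; in an actual answer the data would be loaded from its file instead, for example with pandas. The sample count, feature count, and `random_state` values are assumptions.

```python
# a) Import the required libraries.
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (NOT the Pima dataset): 200 rows, 8 numeric features,
# binary target. Replace with the real data when it is available.
# b) X holds the feature variables, y the target variable.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# c) Split into 70% training and 30% testing data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# d) Build and train the decision tree classifier.
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)

# e) Predict on the test set and compare with the actual test labels.
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
```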