MIS772 – Predictive Analytics
Trimester 2, 2022
Assessment 1 (Individual) – Data Analysis and Report
DUE DATE AND TIME: 12th August 2022, 8:00PM AEST
PERCENTAGE OF FINAL GRADE: 20%
WORD LIMIT: Equivalent to 2000 words
The word count is only an indication of the workload. No word limit applies to this assignment. Instead, a page limit applies, as set out in the provided assignment template.
Learning Outcome Details
Unit Learning Outcome (ULO) | Graduate Learning Outcome (GLO) |
ULO1: Understand and apply key statistical theories and data mining concepts | GLO1: Discipline-specific knowledge and capabilities |
Assessment Feedback
Students who submit their work by the due date will receive their marks and feedback on CloudDeakin 3 weeks after the due date.
Extensions
No extensions will be considered unless a written request is submitted and negotiated with the Unit Chair before the due date and time. The extension request form must be completed via CloudDeakin – Assessment – Extension Request (https://www.deakin.edu.au/students/studying/assessment-and-results/assignments) and accompanied by appropriate documentary evidence for the extension. Submissions after the due date/time without an approved extension will be considered late.
Extensions are only granted in extreme circumstances, such as ongoing health, personal hardship or work-related problems. Temporary illnesses, normal work pressures, multiple assignments due at the same time, failure to keep backups, technology failure, etc. are not reasons for an extension. Extension requests made after the assignment due date should be submitted via Student Connect following Deakin procedures: https://www.deakin.edu.au/students/studying/assessment-and-results/special-consideration.
Assignment Objectives
This assignment aims for students to learn how to …
- Articulate problems and solutions in business terms
- Gain insights from data
- Prepare data for different models
- Develop classification models
- Assess and report model performance.
Case Study Description
This case is about a Portuguese banking institution with a growing customer base. The bank manager would like to employ data analytics and machine learning to analyse its customer data for the bank's direct marketing campaigns. The marketing campaigns were based on phone calls. Often, more than one contact with the same client was required.
The data set contains approximately 45,211 observations with 17 variables, as described below:
bank client data:
1 – age (numeric)
2 – job: type of job (categorical)
3 – marital: marital status (categorical)
4 – education: education level (categorical)
5 – default: has credit in default? (binary)
6 – balance: average yearly balance, in euros (numeric)
7 – housing: has housing loan? (binary)
8 – loan: has personal loan? (binary)
related to the last contact of the current campaign:
9 – contact: contact communication type (categorical)
10 – day: last contact day of the month (numeric)
11 – month: last contact month of year (categorical)
12 – duration: last contact duration, in seconds (numeric)
other attributes:
13 – campaign: number of contacts performed during this campaign and for this client (numeric, includes last contact)
14 – pdays: number of days that passed by after the client was last contacted from a previous campaign (numeric, -1 means client was not previously contacted)
15 – previous: number of contacts performed before this campaign and for this client (numeric)
16 – poutcome: outcome of the previous marketing campaign (categorical)
17 – y: has the client subscribed to a term deposit? (binary)
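The pdays attribute mixes a real quantity (days since the last contact) with a sentinel value (-1 for "not previously contacted"). The following is a minimal Python sketch of one way to separate the two when preparing the data. It is illustrative only: the assignment itself must be carried out in RapidMiner, and the function name `split_pdays` and the toy values are assumptions made for this example.

```python
from typing import Optional, Tuple

# Sentinel used by the pdays attribute, per the variable description.
NOT_CONTACTED = -1


def split_pdays(pdays: int) -> Tuple[bool, Optional[int]]:
    """Return (previously_contacted, days_since_contact).

    days_since_contact is None when the client was never contacted,
    so the sentinel is not mistaken for a real number of days.
    """
    if pdays == NOT_CONTACTED:
        return False, None
    return True, pdays


# Toy values for illustration only (not taken from the assignment data).
for raw in (-1, 0, 92):
    print(raw, split_pdays(raw))
```

Keeping the flag and the day count apart avoids treating -1 as if it were a very recent contact when a distance-based model such as k-NN compares clients.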
Financial AI are interested in generating some insights about the clients, especially answering the below questions:
A. What is the distribution of customer age by marital status?
B. What are the (top 5) most popular occupations among the bank customers? Among them, which occupation has the highest average yearly balance? Which occupation has the most people who have completed tertiary education?
C. How can we reliably predict whether a client will subscribe to the term deposit? Define appropriate measures and compare the performance of different classifiers in predicting clients' subscriptions.
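The first two questions above are grouped summaries: ages grouped by marital status, and counts and average balances grouped by job. The sketch below shows the underlying calculation in plain Python. It is a hedged illustration, not a substitute for the required RapidMiner processes; the records are made-up toy rows, and the function names are hypothetical.

```python
from collections import Counter, defaultdict
from statistics import mean, median

# Hypothetical toy records; the real data set is far larger.
records = [
    {"age": 34, "marital": "married", "job": "management", "balance": 1200},
    {"age": 51, "marital": "married", "job": "technician", "balance": 300},
    {"age": 27, "marital": "single", "job": "management", "balance": 800},
    {"age": 45, "marital": "divorced", "job": "blue-collar", "balance": -50},
    {"age": 30, "marital": "single", "job": "technician", "balance": 150},
    {"age": 39, "marital": "married", "job": "management", "balance": 2500},
]


def age_summary_by_marital(rows):
    """Group ages by marital status and report count, min, median and max."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["marital"]].append(row["age"])
    return {
        status: {
            "n": len(ages),
            "min": min(ages),
            "median": median(ages),
            "max": max(ages),
        }
        for status, ages in groups.items()
    }


def top_jobs_with_mean_balance(rows, k=5):
    """Return the k most frequent jobs as (job, count, mean balance) tuples."""
    counts = Counter(row["job"] for row in rows)
    result = []
    for job, n in counts.most_common(k):
        balances = [row["balance"] for row in rows if row["job"] == job]
        result.append((job, n, mean(balances)))
    return result


print(age_summary_by_marital(records))
print(top_jobs_with_mean_balance(records, k=2))
```

A five-number summary per group is only a starting point. A histogram or box plot per marital status conveys the shape of each distribution more directly.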
Financial AI wants you to use RapidMiner to process and explore the provided data, and then develop and evaluate classifiers to predict customers' subscriptions to the term deposit and to minimise misclassifications. The data set is available on the CloudDeakin site, named MIS771 A1 data.zip. You will need to unzip the file before importing the data into RapidMiner.
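"Minimising misclassifications" needs a measure, and accuracy alone can mislead if one class turns out to be much rarer than the other. The sketch below computes the standard measures from confusion-matrix counts. It is a minimal illustration under assumed, made-up counts; the function name `classification_measures` is hypothetical, and the assignment's actual results must come from RapidMiner.

```python
def classification_measures(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall and F1 for the positive class.

    tp/fp/fn/tn are confusion-matrix counts, with "subscribed" as positive.
    Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up counts: 1,000 clients, of whom 100 actually subscribed.
measures = classification_measures(tp=40, fp=20, fn=60, tn=880)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

With these toy counts the accuracy is 0.92 even though only 40% of actual subscribers are found (recall 0.4), which is why recall, precision or F1 for the subscribing class may be worth reporting alongside accuracy.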
Task and Deliverables:
- Executive Summary: Define your problem and solution in business terms. In doing so, answer questions A, B and C, cross-referencing other report sections for support.
- Data Exploration, Pattern Discovery, and Preparation: Visualise the selected attribute characteristics. Use the visualisations to support answering questions A and B. Prepare data for predictive modelling. Transform attributes or create new ones as needed. Use appropriate analysis and data visualisation to investigate relationships between attributes (predictors and label). Interpret results.
- Predictive Modelling: Create and explain two classification models, e.g., k-NN and Decision Tree, to address part of question C. Explain and justify your model’s properties. Investigate and deal with any class imbalance.
- Model evaluation and improvement: Use hold-out validation and cross-validation to evaluate the models. Utilise honest testing. Compare the performance of different models and select the best. Qualify how much we can trust the answer to question C.
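The idea behind honest testing in the last task can be sketched as follows: set aside a hold-out test set that is never used for training or tuning, and cross-validate only on the remaining data. The Python below is a minimal, assumption-laden illustration of stratified splitting (keeping class proportions similar in every partition). The function names and the made-up labels are hypothetical, and the assignment itself must use RapidMiner's own validation operators.

```python
import random
from collections import defaultdict


def stratified_holdout(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), sampling each class separately."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def stratified_folds(indices, labels, k=5, seed=0):
    """Split the given indices into k folds, dealing each class out in turn."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)
    folds = [[] for _ in range(k)]
    for class_indices in by_class.values():
        rng.shuffle(class_indices)
        for pos, i in enumerate(class_indices):
            folds[pos % k].append(i)
    return folds


# Made-up labels for illustration: 90 "no" and 10 "yes".
labels = ["no"] * 90 + ["yes"] * 10
train_idx, test_idx = stratified_holdout(labels, test_fraction=0.2)
folds = stratified_folds(train_idx, labels, k=4)

print("hold-out test size:", len(test_idx))
for n, fold in enumerate(folds):
    yes_count = sum(labels[i] == "yes" for i in fold)
    print(f"fold {n}: size={len(fold)}, yes={yes_count}")
```

Any rebalancing of the classes would normally be applied to training folds only, so that the validation folds and the hold-out set still reflect the original class proportions.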
Submission Instructions
See CloudDeakin for more information about this assignment, especially the assignment template and the assessment rubric. The assignment must be prepared using the provided assignment template (.docx file). Read the instructions and suggestions in the report template and make sure you understand the assessment rubric. The report font and size must be Arial, 10 points.
Only the contents according to the page limits of the report template will be assessed. Any part which is missing in the report or beyond its page limit will not be assessed. We will not look for anything that was missing from your report in your RapidMiner scripts. However, we will check the RapidMiner scripts for consistency with your report and to ensure an authentic effort. Use comments in your RapidMiner process(es) to enable assessors to follow your logic.
A professional analytics report is evidence-based, not speculative! Anything reported that is not substantiated by RapidMiner scripts will not be awarded marks. Create new versions of your processes as you work on them (i.e., include a version number as part of the filename when you save it). It is your responsibility to make regular backups of your RapidMiner processes on alternative storage media. Failure to do so will not be accepted as a reason for seeking extensions.
Your RapidMiner scripts must be developed so that they can be run independently by assessors. Therefore, do NOT create intermediate data stores, or modify the provided dataset for the assignment outside of RapidMiner.
Submission format: Submit two separate files:
- Your report, prepared according to the template, in PDF format.
- Final versions of all your RapidMiner scripts (*.rmp files) compressed as a single ZIP file.
This is an individual assignment. The Deakin policy on Academic Integrity applies.
Your files should be named firstname_lastname_MIS772A1 (e.g. John_Smith_MIS772A1.pdf and John_Smith_MIS772A1.zip).
You are to submit your assignment in the individual Assignment Dropbox in the MIS772 CloudDeakin unit site by the due date. Only assignments received via the CloudDeakin submission box will be marked. Do NOT submit assignments via email. Approved extensions will result in deadlines redefined for individuals.
Notes
- Any work you submit may be checked by electronic or other means for the purposes of detecting collusion and/or plagiarism.
- Feel free to discuss concepts and ideas with peers, but remember that your submission must be your own work. Be careful not to allow others to copy your work. Submissions whose python codes are significantly similar (e.g., mostly identical except for some variable names) are subject to investigation for potential copying. The authors of such submissions may also be asked to present their work to an academic panel if necessary.
- You must keep a backup copy of every assignment you submit, until the marked assignment has been returned to you. In the unlikely event that one of your assignments is misplaced, you will need to submit your backup copy.
- When you are required to submit an assignment through your CloudDeakin unit site, you will receive an email to your Deakin email address confirming that it has been submitted. You should check that you can see your assignment in the Submissions view of the Assignment dropbox folder after upload, and check for, and keep, the email receipt of the submission. You are responsible for submitting the correct documents for the correct unit, in the required content or format. Should you wish to correct your submission, you can resubmit with any applicable penalties. You will not be able to submit, resubmit or correct your submission after the 5 day lateness period (or your extension deadline).
- Penalties for late submission: The following marking penalties will apply if you submit an assessment task after the due date without an approved extension: 5% will be deducted from available marks for each day up to five days, and work that is submitted more than five days after the due date will not be marked and will receive 0% for the task. ‘Day’ means working day for paper submissions and calendar day for electronic submissions. The Unit Chair may refuse to accept a late submission where it is unreasonable or impracticable to assess the task after the due date.