
Note: You need to submit your answers in a word document. You need to transfer the results from the excel file into the word document. In addition, you must submit your Excel file (we prefer a single excel file with one or multiple worksheet for each question) but note that only the word document will be marked. If you think there is any issue or unclarity in any question, please make your assumptions (if there is any) and clearly explain them in your report.
Business Analytics
You need to add the coversheet and sign it. Please write the name of your tutor as well as the name of your lecturer in the coversheet. The analyses and the answers must be your own individual work without consultation of any other person. Also, you are not allowed to help/advise other students.
- Download Facebook live sellers data set. Find explanations of variables on the same page. Considering this data set, answer the following questions. (6 marks)
- Calculate mean, median and mode for the variable number of shares. Interpret the results. (1.5 marks)
- Calculate range, variance and standard deviation for the variable “number of comments”. Interpret the results. (1.5 marks)
- Calculate quartiles and z-score for the variable “number of reactions”. Interpret the results. (1 marks)
- Develop scatterplots and correlation coefficients to investigate the relationship between the variables “number of reactions”, “number of comments” and “number of shares”. Interpret the results. (2 marks)
- Download Facebook live sellers data set. Find explanations of variables on the same page. Considering this data set, answer the following questions. (6 marks)
- Develop a pivot table to explain what kind of status type attracts the highest number of reactions. Interpret the results. (2 marks)
- Develop a pivot table to explain what kind of status attracts the highest number of likes. Interpret the results. (2 marks)
- Develop a pivot table to explain what kind of status types has the strongest and lowest influence in making the audience angry. Present a detailed argument. (2 marks)
- The following questions should be answered by referring to the excel sheet “Q3” of the Excel file
provided for this assignment. (10 marks)
- Present a figure to explore whether there is a relationship between salary and productivity. (2 marks)
- Present a figure to show the distribution of income. (2 marks)
- Present a figure to illustrate the number of staff that are recruited during the time. (2 marks)
- Present a figure to explore whether there is an outlier in the data set. (2 marks)
- The variable “Trustworthiness” identifies how much the manager trusts in an employee. What kind of predictive model do you suggest developing to predict “Trustworthiness” as the output variable and other variables as the input variable. (2 marks)
- Open excel sheet Q4 of the Excel file provided for this assignment. This Excel sheet is about the probability of accepting loan applications. If the bank accepts the customer loan application, it is shown by 1. If the bank rejects the customer loan application, it is shown by 0. In the Excel sheet, the variable “actual” indicates the actual probability of accepting a customer loan
application. The variable “prediction” is the output of the logistics regression model indicating the probability of loan application acceptance. (11 marks)
- Considering the cut-off value of 60%, develop the confusion matrix. What insights we can get from this confusion matrix. (6 marks)
- Considering the cut-off value of 60%, calculate sensitivity, specificity, class I error, class 0 error, and overall model error. Interpret the results. (5 marks)
- Open excel sheet Q5 of the Excel file provided for this assignment. This data set presents a sample of properties in Melbourne, including information about price, condition, postcode, area and the number of bedrooms. (20 marks)
- Develop a multiple regression model including all the variables to predict the price? Interpret the output of the regression model. (5 marks)
- How much is the price of a new property with the postcode of 3003, area of 500 and 5 bedrooms? (5 marks)
- Considering the developed model, which property provides the best investment opportunity? (5 marks)
- Do you think a polynomial degree 2 multiple regression model can be a better model to represent the housing price? Justify your answer. (5 marks)
- A logistic regression model is developed to explore the quality of wine on several different variables. The explanation of variables can be found here. The rating of wine is ranged from 0 to
10. In this model, we consider the wine as good quality if the rating is more than six. Otherwise, we regard it as poor quality. Consider the following as the screenshot of R-studio analysis. (10 marks)
- Write the logistic regression equation. Interpret the model. (5 marks)
- What is the quality of the wine with the following characteristics? (5 marks)
fixed acidity=7volatile acidity=0.3citric acid=0.4residual sugar=11chlorides=0.05free sulfur dioxide=37 | total sulfur dioxide=188density=0.99PH=3.2sulphates=0.44alcohol=11 |
- We want to hold a conference on business analytics. The plan is to invite three speakers to talk about the industry application of business analytics. Currently, we have 3 corporate members. At every conference, every corporate member receives 5 seats for free. If a corporate is not already a member, the attendance cost is $75 per person. We provide breakfast, lunch, and parking for every attendee. The current estimation of conference cost is outlined below. (10 marks)
- Renting conference place: $150.
- Registration processing: $8.50 for every attendee.
- Paying speakers: 3 * 800= $2,400.
- Breakfast: $4 for every attendee.
- Lunch cost $7 for every attendee.
- Parking: $5 for every attendee.
- At the moment, the number of non-member attendees is the essential factor in determining conference profit. Develop a spreadsheet model and explore this issue. (5 marks)
- How many non-member attendees should participate to make no profit and no loss (Profit=0). (5 marks)
- Open excel sheet Q8 of the Excel file provided for this assignment. The data set is related to a construction company machinery usage. Every machinery has a unique ID. Every machinery can be used in one or several different projects. There are missing values in this data set. Explain
your approach (step by step) on how you will fill in the missing values. If you believe it is possible to fill in missing values, write them in the answer sheet. (7 marks)
- Assume that you are in Business Analytics in a manufacturing company. Your manager gave you a task to optimize the scheduling of a project. Your job is to minimize the overall manufacturing time. There are three items to be manufactured and three different types of machinery are available for the manufacturing task. Due to operational reasons, every machinery can only be allocated to the manufacturing of one item. Every item can only be manufactured by one machine. Different machinery has different speeds in manufacturing different items. Let us say 𝑖 𝑎𝑛𝑑 𝑚 represent item and machinery, respectively. 𝑖1 represents the first item 𝑖2 represents the second item and so on. 𝑚1 represents the first machine, 𝑚2 represents the second machinery and so on. The duration of manufacturing every item pertaining to every machinery is presented in the below table. In this problem the sequence of manufacturing is important. Items 𝑖1 and 𝑖2 should be manufactured before manufacturing item 𝑖3. (10+10=20 marks)
𝑚1 | 𝑚2 | 𝑚3 | |
𝑖1 | 2 | 5 | 3 |
𝑖2 | 4 | 7 | 4 |
𝑖3 | 3 | 6 | 2 |
- Write the linear optimization model for the company to make the best decision. (10 marks)
- Solve the model, present the results, and interpret them. (10 marks)
Hint: you can use a binary variable such as 𝑥𝑖𝑚 which can take values of zero and one. Let us say
𝑥𝑖𝑚 = 1 if machine 𝑚 is engaged to manufacture item 𝑖 otherwise 𝑥𝑖𝑚 = 0.

Get expert help for Business Analytics and many more. 24X7 help, plag-free solution. Order online now!