
University of Sunderland School of Computer Science
MODULE CODE | CETM50 |
MODULE TITLE | Technology Management For Organisations |
MODULE LEADER | Ashley Williamson |
ASSESSMENT | 1 of 1 |
TITLE OF ASSESSMENT | Gathering Up Your Data & Scaling Out |
ASSESSMENT VALUE | 100 |
PLEASE READ ALL INSTRUCTIONS AND INFORMATION CAREFULLY.
This assignment contributes 100% to your final module mark. Please ensure that you retain a duplicate of your assignment work as a safeguard in the unlikely event that your work is lost or corrupted online.
THE FOLLOWING LEARNING OUTCOMES WILL BE ASSESSED:
[LO1] Have critical appreciation of the policies and procedures to manage data and information securely, to manage risk in technology management
[LO2] Critically analyse the strategic challenges, risks, opportunities and practical applications afforded by cybersecurity and data science for organisations to enable effective business operation and ensure business continuity
[LO3] Critically assess the appropriate technologies, infrastructures, tools and techniques required to address practical problems and challenges for organisations in data science and cybersecurity
[LO4] Complete analysis and evaluation of the professional, legal and ethical requirements of secure big data management in business and industrial environments
IMPORTANT INFORMATION
You are required to submit your work within the bounds of the University Academic Regulations for Postgraduate Study (see your Programme Guide). Plagiarism and other forms of academic misconduct will not be tolerated and will be dealt with severely. The coursework submission for this module is largely based upon your own practice, but when using material from other sources, for example an occasional short quote, this should be duly referenced. It is important to note that your work WILL BE SUBJECT TO CHECKS FOR ORIGINALITY, which WILL include the use of an electronic plagiarism detection service.
Originality reports will NOT be available until after the assessment deadline. It is therefore important that you understand the referencing standards and make use of available guidance from University Library resources for Academic Referencing.
Where referencing is required, and unless otherwise stated, the Harvard Referencing system MUST be used. (See Programme Guide, or University Library Website).
Where you are asked to submit an individual piece of work, the work must be entirely your own. The safety of your assessments is your responsibility. You must not permit another student access to your work at any time during the inception, design, or development of your coursework submission and must take great care in this respect.
Submission Date and Time | Detailed in Canvas assignment area |
Submission Location | Electronic submission to Canvas assignment area. |
Assessment
Scenario
An SMB (small-to-medium business), Laurel Technology Solutions Ltd., has recently begun to utilise the data they obtain from their customers. They are currently based within the UK with a single office location. Their IT systems are off-the-shelf components for their specific areas (Finance, HR, etc.), requiring manual intervention if any data needs to be compared or cross-referenced between those systems. Their customer data is split across these various systems.
E.g. credit card data is stored only by the financial systems, employment data only within the HR systems, etc. They do not currently have a single cohesive record representing all of their customer data. They are looking to unify these records ahead of further data investigation and exploitation, and to pool the data together into a central database.
In the last financial year they have found huge success, with a large influx of new customers; therefore, they are looking to expand. This expansion will allow them to take on even more customers, and to expand their offerings/operations to include more social media aspects.
As part of this expansion, Laurel Technology Solutions Ltd. is looking at foreign markets (East Asia), with the aim of setting up an office initially located in Seoul, South Korea. The Seoul office would be responsible for customers in that region; however, the main company will still be based within the UK, and will require regular communication back and forth, including customer data. The company’s eventual goal is to improve their core infrastructure by combining their different data streams, with the aim of analysing and then exploiting their data.
Employees who are part of the UK branch will remain employed in the UK. New employees will be hired in South Korea for that office location. It is expected that executives of the company will regularly travel to the new office and will need to conduct work from that location involving the UK branch.
Tasks
In this assignment you are required to produce a Python program for combining the multiple data sources provided, and pushing these to a central organisational data store. This will also involve a critical report, reflecting on the practical component, as well as recommendations and critical evaluation of suitable technologies for the expansion scenario provided above.
Task 1: Python ETL
For this component, you are provided data records from the given SMB. These data involve customer attributes such as names, banking credentials, family attributes, etc. These data files are provided as a mixed modality in a variety of formats (CSV, JSON, XML, and TXT).
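As a minimal sketch of reading such mixed formats with the standard library only, the example below parses small in-memory samples of CSV, JSON, and XML into lists of dictionaries. The sample data, field names, and the `<user>` element layout are assumptions made for illustration; the provided files may be structured differently.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample data; the real files and field names will differ.
CSV_TEXT = "first_name,last_name,age\nAda,Lovelace,36\n"
JSON_TEXT = '[{"firstName": "Ada", "lastName": "Lovelace", "iban": "GB00TEST"}]'
XML_TEXT = '<users><user firstName="Ada" lastName="Lovelace" salary="50000"/></users>'


def read_csv_records(text):
    """Return one dict per CSV row, keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def read_json_records(text):
    """Return the list of objects held in a JSON array."""
    return json.loads(text)


def read_xml_records(text):
    """Return one dict per <user> element, built from its attributes."""
    root = ET.fromstring(text)
    return [dict(user.attrib) for user in root.iter("user")]


csv_records = read_csv_records(CSV_TEXT)
json_records = read_json_records(JSON_TEXT)
xml_records = read_xml_records(XML_TEXT)
```

Note that every CSV and XML value arrives as a string (e.g. the age `"36"`), so numeric fields need converting before they are compared or stored.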
The work herein requires the processing of these data into a homogeneous record, aligning the same customers from different sources together, which are then automatically entered into a Relational Database System using modern tools & libraries.
Figure 1 – Example of how two separate data files can be combined into the final form
You are expected to read and extract data from these various formats, wrangle the data – solving inconsistencies if present – and bring data together into a singular format (see Figure 1 as an example). These unified records are then to be mapped to a relational database using PonyORM, with all unified records being entered into the database. It is expected that your code is sufficiently commented, especially where any documentation is referenced.
Your solution must ONLY make use of stock Python (including any libraries that are part of the standard library, e.g. csv, xml, json), and PonyORM.
Any work utilising libraries not stated above will be ignored and awarded a mark of 0. You shall NOT utilise libraries such as the Pandas library; with the exception of PonyORM, all code should be standard Python you have written to perform these functions. For a complete list of the standard Python Libraries please consult the documentation available (or 3.9 / 3.8, depending on the version you are utilising).
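One way the alignment step could look is sketched below, using plain Python only. It assumes each source has already been converted to a list of dictionaries sharing the field names `first_name` and `last_name`, and it matches customers on a normalised name pair. All names here (`normalise_key`, `merge_sources`, the sample fields) are hypothetical, and matching on name alone is a simplifying assumption that breaks down if two customers share a name.

```python
def normalise_key(first_name, last_name):
    """Build a case- and whitespace-insensitive key for matching customers."""
    return (first_name.strip().lower(), last_name.strip().lower())


def merge_sources(*sources):
    """Combine records from several sources into one dict per customer.

    Each source is a list of dicts that already uses the shared field names
    'first_name' and 'last_name' (an assumption for this sketch).
    Values seen first are kept; later sources only fill missing or empty fields.
    """
    unified = {}
    for records in sources:
        for record in records:
            key = normalise_key(record["first_name"], record["last_name"])
            target = unified.setdefault(key, {})
            for field, value in record.items():
                if target.get(field) in (None, ""):
                    target[field] = value
    return list(unified.values())


# Hypothetical records from two systems describing the same customer.
hr_records = [{"first_name": "Ada", "last_name": "Lovelace", "job_title": "Analyst"}]
finance_records = [{"first_name": " ada ", "last_name": "LOVELACE", "iban": "GB00TEST"}]

customers = merge_sources(hr_records, finance_records)
```

Keeping the first value seen is only one possible conflict policy; where sources genuinely disagree, the choice of which value wins should be justified and documented in the code comments.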
As part of this task, and in addition to the Python solution, you will produce a 10-15 minute recorded presentation reflecting on the types of data provided and the challenges with combining them. In particular, you should reflect on this from a data perspective as well as a personal perspective (the process of you undertaking the task itself, any difficulties faced, challenges overcome, etc.); within this you are expected to walk through your solution as part of this process.
The final part of the presentation should cover any difficulties the company may have in automating such a process, and/or any future considerations given the scenario provided. The structure and style of this presentation is at your discretion. You may wish to use a mixture of voice-over, presentation slides, and direct oration.
For screen recording you may find OBS a helpful tool; it allows full-screen capture as well as picture-in-picture, and works on multiple platforms.
Database Access
A MySQL database shall be used for this task; however, you are welcome to install a WAMP/LAMP stack yourself for testing purposes.
Host: xxxxxxxxxxxxxxx
User: student_ followed by your student ID
Password: xxxxxxxxxxxxxxx
Database: student_ followed by your student ID
E.g. if your student ID is bh12xy, then your connection would use:
User: student_bh12xy
Database: student_bh12xy
Note: These student_ credentials will also work for PhpMyAdmin should you wish to use this for inspection purposes.
Note 2: The credentials covered in the workshops are the same as these; you are free to use either the student_ or the sec_student_ variant. These have security considerations, and you should refer back to the practical workshops if necessary.
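The naming pattern for the user and database can be sketched as a small helper that builds the connection settings from a student ID. The function name `connection_settings` and the placeholder host and password are assumptions for illustration; the real values are those supplied for the module.

```python
def connection_settings(student_id, host, password):
    """Build keyword arguments following the student_<ID> naming pattern."""
    account = f"student_{student_id}"
    return {
        "provider": "mysql",
        "host": host,
        "user": account,    # e.g. student_bh12xy
        "passwd": password,
        "db": account,      # the database shares the user's name
    }


# Placeholder host and password; substitute the values given for the module.
settings = connection_settings("bh12xy", "HOST_PLACEHOLDER", "PASSWORD_PLACEHOLDER")

# With PonyORM installed, the settings could then be passed to Database.bind:
#   from pony.orm import Database
#   db = Database()
#   db.bind(**settings)
```

Keeping the password out of the source file (for example by reading it from an environment variable) avoids publishing it when the code is included as a report appendix.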
Task 2: Scaling Up: Big Data, Big Problems?
Based on the provided expansion scenario you will write a report (8 pages max) which covers the following:
- Critically evaluate potential technology solutions the company could utilise for their expansion, settling on a recommendation for the client with rationale. As part of this, you will also critique the choice of Relational Database to store combined customer data from the Python ETL task.
- Highlight and discuss the potential Big Data issues present with the company opening a foreign office and dealing with non-UK customer data. You should consider any regulatory or legal requirements of the business, as well as technological issues presented by the scaling up of a company’s operations and the data volumes they begin to accrue.
As part of this report submission, you should also include your source code as text in your appendices (in addition to uploading your .py / .ipynb files for Task 1).
Deliverables & Submission Requirements
This assignment requires the submission of a Python Notebook (.ipynb) file, or a Python Script (.py) solution file for the code submission. There are no length limitations on the code.
When marking, code files will be rerun from scratch. Therefore, ensure your solution works prior to submission. For notebooks, this will be Kernel -> Restart and Run All. Any existing output within your scripts will not be considered when marking.
Video presentation files should be of reasonable resolution (approx. 1080p), of reasonable file size, and in a playable file format (e.g. .mp4). Links will NOT be accepted. The entire video presentation is to be uploaded to the video presentation submission zone. You may find it useful to record a short test clip to verify you have appropriate settings for resolution and file size prior to recording your presentation.
Scaling Up: Big Data, Big Problems? requires a PDF submission via Canvas with a maximum of 8 pages. Note that any work which goes beyond this limit will be ignored for the purposes of marking. You should ensure that the PDF is uploaded as-is, and not within a ZIP file or any other archive, for the purposes of plagiarism checking. The overall structure is up to you; however, it is recommended to follow the subpoints as outlined above, as well as consulting the mark scheme. You must include your source code from Task 1 as a plain-text appendix (not images/screenshots) alongside this report.
Cover page, table of contents, references, and appendices sections do not count towards the page limit.
Help with Referencing
Whenever you need to refer the reader to the source of some information, e.g. a book/journal/academic paper/WWW address, provide a citation at that point within the main body of your report.
Example 1: … as we are all now aware referencing is not trivial (Kendal, 2017)
Provide a reference list towards the end of your research paper (after your conclusions section but before any appendices) that contains:
- References, a list of books/journals/academic papers/URLs etc. that have been directly cited from within the report (see example citation above).
- Any material from which text, diagrams or specific ideas have been used, even if this has been presented in your own words, must be cited within the main body of the paper and listed in the reference list. It is not enough to list this material in a bibliography.
Example 2: For Example 1, (using Harvard system) the reference list would contain the following:
Kendal S., 2017, 'Referencing standards', International Student Journal, Vol 55, Pages 25 – 30, Scotts Pub., ISBN 1-243567-89
This shows the authors, date published, title of paper (in single quotes), title of journal or conference (in italics), volume, page numbers, and publisher (ISBN desirable but not essential).
For further help see the following book which is available in the library:
- Cite Them Right: The Essential Guide to Referencing and Plagiarism by Richard Pears and Graham Shields
An interactive online version of this guide is available by logging into My Sunderland with your User ID and password and then clicking on Me and Library Resources.
