Summative assessment brief: Data science project report
1 Assessment in brief
1.1 Module code and title
TRAN5340M - Transport Data Science
1.2 Assessment title
Summative Coursework: Data Science Project Report
1.3 Assessment type
Project Report and Reproducible Code
1.4 Learning outcomes assessed
- To demonstrate advanced data science techniques applied to a transport problem
- To show proficiency in data processing, visualization, and analysis
- To produce high-quality, reproducible research with clear implications for transport planning/policy
- To critically evaluate methodological approaches and results
1.5 Assessment length/Time limit guidance
A PDF document (max 10 pages)
1.6 Weighting
100%
1.7 Deadline or date of assessment
Friday 15th May 2026 at 2pm (2026-05-15 14:00 BST)
1.8 Submission method
- A .zip file containing:
  - A PDF document (max 10 pages)
  - Reproducible code (a .qmd file)
  - Any necessary data files, but do not include any large (above around 10 MB) datasets: provide links to these instead
- Maximum file size: 40 MB
- Submission: Via Minerva (Blackboard Assignment)
1.9 Feedback provision
When there is more than one assessment on a module, you will usually receive your feedback before your next assessment for the module is due. Where it is appropriate to do so, and feedback can be released without invalidating the integrity of ongoing assessments, this will typically be no later than 15 working days post submission. Please be mindful that some students may have approved extensions for assessments which mean it is not appropriate to release feedback within 15 working days after individual submissions. In these cases, feedback will be released no later than 15 working days following the submission of all outstanding work for the assessment.
Feedback will be provided via Minerva and during practical sessions with dedicated time for coursework.
1.10 Module leader & contact details
Robin Lovelace, r.lovelace [at] leeds.ac.uk
1.11 Assessment summary guidance
In ITS, Assessment Instructions are made available separately. Your module leader will tell you when to expect the Assessment Instructions to be available. When the Assessment Instructions are made available you will be able to find them in the Assessment and Feedback folder on the module Minerva page.
2 Use of GenAI
Generative AI category: GREEN
Under this category, AI tools are actively encouraged and can be used extensively.
In this assessment, AI tools can be utilised to:
- Generate, test, and debug code for your transport data analysis
- Assist with data visualization and mapping
- Provide explanations of transport concepts and methods
- Help with code optimization and best practices
- Support your research on the topic by suggesting areas to investigate
- Give feedback on content and provide proofreading
- Accelerate your learning and productivity
Important: You must understand and be able to explain all code and analysis you submit, whether AI-generated or not. Document your AI usage in reflective sections of your report.
In this assessment, AI tools cannot be utilised to:
- produce the entirety of, or sections of, a piece of work that you submit for assessment beyond that which is outlined above.
The use of Generative AI must be acknowledged in an ‘Acknowledgements’ section of any piece of academic work where it has been used as a functional tool to assist in the process of creating academic work.
As a minimum, the acknowledgement should include:
- Name and version of the generative AI system used e.g. Gemini 3
- Publisher (company that made the AI system) e.g. Google
- URL of the AI system
- Brief description (single sentence) of context in which the tool was used.
For example: “I acknowledge the use of Gemini 3 to prototype the visualisation of my results and test different software packages, including ggplot2, base graphics, and plotly.” Best practice is to include a link to the exact prompt used to generate the content or an appendix of up to 1 page in length with key prompts and session information.
The standard Academic Misconduct procedure applies to students believed to have ignored this categorisation.
For detailed guidance see https://generative-ai.leeds.ac.uk/ai-and-assessments/categories-of-assessments/.
General guidance
Skills@library hosts useful guidance on academic skills, including specific guidance on academic writing and referencing (see the Academic skills pages).
3 Assessment criteria and process
The detailed marking criteria are made available on Minerva and online for accessibility.
Assessment length penalties are applied during the marking process and will normally be applied before you receive your mark. Late penalties are normally applied after you receive your mark; if you know you submitted late without a permitted extension, please be aware that your mark may change.
In ITS, written English is assessed in line with the University's policies on Assessment of written English (Student Education Service | University of Leeds). You should use spelling, punctuation and grammar to communicate your ideas clearly.
Marks for the submitted report are awarded in four categories, according to the following criteria:
3.1 Data processing: 20%
- The selection and effective use of input datasets that are large (e.g. covering multiple years), complex (e.g. containing multiple variables) and/or diverse (e.g. input datasets from multiple sources are used and where appropriate combined in the analysis)
- Describe how the data was collected and the implications for data quality, and outline how the input datasets were downloaded (with a reproducible example if possible), with a description that will allow others to understand the structure of the inputs and how to import them
- Evidence of data cleaning techniques (e.g. by re-categorising variables)
- Adding value to datasets with joins (key-based or spatial), creation of new variables (also known as feature engineering) and reshaping data (e.g. from wide to long format), as illustrated in the sketch below
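As an illustration of the operations listed above, the following minimal R sketch (all file and column names are hypothetical placeholders, not provided datasets) reshapes a wide table to long format, adds value with a key-based join, and engineers a new variable:

```r
library(dplyr)
library(tidyr)

# Hypothetical inputs: trip counts per zone in wide format, plus a zone lookup
trips_wide = read.csv("trip_counts.csv") # columns: zone_id, trips_2022, trips_2023
zones = read.csv("zone_lookup.csv")      # columns: zone_id, zone_name, area_km2

trips_long = trips_wide |>
  # Reshape from wide to long: one row per zone-year combination
  pivot_longer(
    cols = starts_with("trips_"),
    names_to = "year",
    names_prefix = "trips_",
    values_to = "trips"
  ) |>
  # Add value with a key-based join to the zone lookup table
  left_join(zones, by = "zone_id") |>
  # Feature engineering: derive a new variable from existing columns
  mutate(trips_per_km2 = trips / area_km2)
```

A spatial join, e.g. with sf::st_join(), applies the same add-value idea to geographic data.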
Distinction (70%+): The report makes use of a complex (with many columns and rows) and/or multiple input datasets, efficiently importing them and adding value by creating new variables, recategorising, changing data formats/types, and/or reshaping the data. Selected datasets are very well suited to the research questions, clearly described, with links to the source and understanding of how the datasets were generated.
Merit (60-69%): The report makes some use of complex or multiple input datasets. The selection, description of, cleaning or value-added to the input datasets show skill and care applied to the data processing stage but with some weaknesses. Selected datasets are appropriate for the research questions, with some description or links to the data source.
Pass (50-59%): There is some evidence of care and attention put into the selection, description of or cleaning of the input datasets but little value has been added. The report makes little use of complex or multiple input datasets. The datasets are not appropriate for the research questions, the datasets are not clearly described, or there are no links to the source or understanding of how the datasets were generated, but the data processing aspect of the work is acceptable.
Fail (0-49%): The report does not make use of appropriate input datasets and contains very little or no evidence of data cleaning, adding value to the datasets or reshaping the data. While there may be some evidence of data processing, it is of poor quality and/or not appropriate for the research questions.
3.2 Visualization and report: 20%
- Creation of figures that are readable and well-described (e.g. with captions and description)
- High quality, attractive or advanced techniques (e.g. multi-layered maps or graphs, facets or other advanced approaches; see the sketch after this list)
- Using visualisation techniques appropriate to the topic and data and interpreting the results correctly (e.g. mentioning potential confounding factors that could account for observed patterns)
- The report is well-formatted, accessible (e.g. legible text size, no excessive code included in the submitted report) and clearly communicates the data and analysis visually, with appropriate figure captions, cross-references and a consistent style
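Building on the hypothetical trips_long object from the sketch in Section 3.1, the following R snippet shows one way a faceted, captioned figure might be produced with ggplot2; the Quarto chunk options shown in the comments are what enable captions and cross-references:

```r
library(ggplot2)

# In a Quarto (.qmd) chunk, options such as:
#   #| label: fig-trips
#   #| fig-cap: "Trip intensity per zone, by year"
# attach a caption and allow cross-referencing in the text with @fig-trips.

ggplot(trips_long, aes(x = zone_name, y = trips_per_km2)) +
  geom_col() +
  facet_wrap(~year) + # facets: one small-multiple panel per year
  labs(x = "Zone", y = "Trips per km2") +
  theme_minimal()
```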
Distinction (70%+): The report contains high quality, attractive, advanced and meaningful visualisations that are very well-described and interpreted, showing deep understanding of how visualisation can communicate meaning contained within datasets. The report is very well-formatted, accessible and clearly communicates the data and analysis visually.
Merit (60-69%): The report contains good visualisations that correctly present the data and highlight key patterns. The report has appropriate formatting.
Pass (50-59%): The report contains only basic visualisations, or the visualisations are not well-described or correctly interpreted, or the report is poorly formatted, not accessible or does not clearly communicate the data and analysis visually.
Fail (0-49%): The report is of unacceptable quality (would likely be rejected in a professional setting) and/or has poor quality and/or few visualisations, or the visualisations are inappropriate given the data and research questions.
3.3 Code quality, efficiency and reproducibility: 20%
- Code quality in the submitted source code, including using consistent style, appropriate packages, and clear comments
- Efficiency, including pre-processing to reduce input datasets (avoiding having to share large datasets in the submission for example) and computationally efficient implementations
- The report is fully reproducible, including generation of figures. There are links to online resources for others wanting to reproduce the analysis for another area, and links to the input data (one reproducible pattern is sketched below)
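One pattern that supports both efficiency and reproducibility is to download raw inputs only when they are not already cached locally, then save a reduced subset for sharing. The URL and file names below are placeholders:

```r
# Download the raw data only if it is not already present locally
raw_file = "trip_counts_raw.csv"
if (!file.exists(raw_file)) {
  download.file("https://example.com/trip_counts_raw.csv", raw_file)
}
trips_raw = read.csv(raw_file)

# Pre-process once, keeping only the rows and columns the report needs,
# so the shared submission stays well under the size limits above
trips_small = subset(trips_raw, year >= 2022, select = c(zone_id, year, trips))
write.csv(trips_small, "trip_counts_small.csv", row.names = FALSE)
```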
Distinction (70%+): The source code underlying the report contains high quality, efficient and reproducible code that is very well-written, using consistent syntax and good style, well-commented and uses appropriate packages. The report is fully reproducible, with links to online resources for others wanting to reproduce the analysis for another area, and links to the input data.
Merit (60-69%): The code is readable and describes the outputs in the report but lacks quality, either in terms of comments, efficiency or reproducibility.
Pass (50-59%): The source code underlying the report describes the outputs in the report but is not well-commented, not efficient or has very limited reproducibility, with few links to online resources for others wanting to reproduce the analysis for another area, and few links to the input data.
Fail (0-49%): The report has little to no reproducible, readable or efficient code. A report that includes limited well-described code in the main text or in associated files would be considered at the borderline between a fail and a pass. A report that includes no code would be considered a low fail under this criterion.
3.4 Understanding the data science process, including choice of topic and impact: 40%
- Topic selection, including originality, availability of datasets related to the topic and relevance to solving transport planning problems
- Clear research question
- Appropriate reference to the academic, policy and/or technical literature and use of the literature to inform the research question and methods
- Use of appropriate data science methods and techniques
- Discussion of the strengths and weaknesses of the analysis and input datasets and/or how limitations could be addressed
- Discuss further research and/or explain the potential impacts of the work
- The conclusions are supported by the analysis and results
- The contents of the report fit together logically and support the aims and/or research questions of the report
Distinction (70%+): The report contains a clear research question, appropriate reference to the academic, policy and/or technical literature, use of appropriate data science methods and techniques, and discussion of the strengths and weaknesses of the analysis and input datasets and/or how limitations could be addressed. The report discusses further research and/or explores the potential impacts of the work. Conclusions are supported by the analysis and results, and the contents of the report fit together logically as a cohesive whole with a clear direction set out by the aims and/or research questions. To get a Distinction there should also be evidence of considering the generalisability of the methods and reflections on how the work could be built on by others in other areas.
Merit (60-69%): There is a clear research question. There is some reference to the academic, policy and/or technical literature. The report has a good structure and the results are supported by the analysis. There is some discussion of the strengths and weaknesses of the analysis and input datasets and/or how limitations could be addressed.
Pass (50-59%): The report contains a valid research question but only limited references to appropriate literature or justification. There is evidence of awareness of the limitations of the results and how they inform conclusions, but these are not fully supported by the analysis. The report has a reasonable structure but does not fit together well in a cohesive whole.
Fail (0-49%): The report does not contain a valid research question, has no references to appropriate literature or justification, does not discuss the limitations of the results or how they inform conclusions, or the report does not have a reasonable structure.
3.5 Presentation/Formatting and referencing
You must appropriately cite all supporting evidence using Quarto’s citation system: https://quarto.org/docs/authoring/citations.html
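In practice this means listing a bibliography file in the document's YAML header and citing entries by key in the text, along the lines of the hypothetical sketch below (references.bib and the citation key are placeholders):

```markdown
---
bibliography: references.bib
---

Active travel has well-documented benefits [@smith2024].
```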
4 Academic misconduct and plagiarism
The University expects that all the work you do, including all forms of assessment submitted and examinations taken, meets the University's standards for Academic Integrity. All suspected breaches of Academic Integrity are investigated through the Academic Misconduct Procedure. This applies to all taught elements of your study, including undergraduate programmes, taught postgraduate study and taught elements of research degrees. Breaching academic integrity standards can lead to serious penalties.
Guidance on Academic Integrity and Academic Misconduct can be found on the For Students website pages (https://students.leeds.ac.uk/info/10110/academic-integrity) and full definitions of offences under the Academic Misconduct Procedure can be found in the Academic Misconduct Procedure on the Student Cases website page (https://secretariat.leeds.ac.uk/student-cases/academic-misconduct/).
5 Assessment criteria rubric
The assessment criteria rubric will be included in the Assessment Instructions that are made available separately.