Overview
Introduction
In this assignment, we extracted data from a real-life database and were tasked with wrangling and preparing the
data to solve a prediction problem (regression or classification). The objectives were:
• To extract data from a database, explore the data and formulate a prediction problem
• To create a single tabular dataset from multiple tables based on the formulated problem
• To wrangle and prepare the data for modeling, and to use the prepared data to build
and evaluate a simple machine learning model
• To document the process, analysis, comparison and findings
The dataset we used comes from Ergast.com, which provides a database of Formula 1 races from the 1950 season until today. It includes information such as the time taken for each lap, the time taken for pit stops, and the performance in the qualifying rounds for all Formula 1 races.
My prediction problem was to predict the total lap time of each driver in each race. A model that can predict total lap time can help both drivers and teams analyze their own performance through a more comprehensive comparison of drivers' skills and car performance on different tracks and circuits.
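As a rough illustration of how such a target could be built, the sketch below sums per-lap times into one total per driver per race using pandas. The table and column names (lap_times, raceId, driverId, lap, milliseconds) and the toy values are assumptions made for this example, not a record of the actual pipeline used in the assignment.

```python
import pandas as pd

# Toy stand-in for a per-lap timing table.
# Column names and values are assumed for illustration only.
lap_times = pd.DataFrame({
    "raceId": [1, 1, 1, 1, 2, 2],
    "driverId": [10, 10, 20, 20, 10, 10],
    "lap": [1, 2, 1, 2, 1, 2],
    "milliseconds": [90000, 88000, 91000, 89500, 80000, 79000],
})


def total_lap_time(laps: pd.DataFrame) -> pd.DataFrame:
    """Sum lap times into one row per (race, driver) pair."""
    return (
        laps.groupby(["raceId", "driverId"], as_index=False)["milliseconds"]
        .sum()
        .rename(columns={"milliseconds": "total_ms"})
    )


# One row per driver per race; total_ms is the regression target.
target = total_lap_time(lap_times)
print(target)
```

The resulting table has one row per race and driver, so it can be joined with other tables (for example qualifying or pit stop data) on the same two keys to form the features for a regression model.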