This report was generated using artificial intelligence (Claude from Anthropic) under general human direction. At the time of generation, the contents have not been comprehensively reviewed by a human analyst.
Meta Information:
This document was generated by Databot using the following prompt: Produce a time series plot of apprenticeship starts delivered in Bristol, broken down by learner home local authority district (LAD). Use the WECA apprenticeship starts dataset in data/weca_apprenticeship_starts.csv.
Databot is a tool that leverages large language models to generate data analyses based on user prompts. It runs in the Positron IDE. It is primarily intended to support EDA and initial analysis tasks. The product is BYOK (Bring Your Own Key) and uses the Anthropic API key provided in the Positron Assistant setup.
The assistant starts by running exploratory data analysis to understand the dataset structure, followed by filtering and aggregating the data as specified in the prompt. It then generates visualizations and summary statistics to highlight key findings.
A few \commands are provided; \report summarises the EDA into a Quarto document, as shown below. This seems like a pretty powerful analysis model for quick initial data exploration.
Introduction
This analysis examines apprenticeship starts delivered in Bristol, with particular focus on understanding the geographic distribution of learners’ home locations. The analysis aims to reveal cross-boundary learning patterns and temporal trends in apprenticeship uptake across the West of England Combined Authority (WECA) region.
Data Source
The analysis uses the WECA apprenticeship starts dataset (weca_apprenticeship_starts.csv), which contains detailed information about apprenticeship programs. For the records delivered in Bristol, the breakdown by learner home LAD is:
# A tibble: 4 × 3
  learner_home_lad                 n percentage
  <chr>                        <int>      <dbl>
1 Bristol, City of              5331       61.1
2 South Gloucestershire         2107       24.1
3 North Somerset                 832        9.5
4 Bath and North East Somerset   459        5.3
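The breakdown above can be produced with a filter followed by a count. The following is a minimal sketch on hypothetical toy rows, not the code Databot ran. The column name `delivery_lad` is an assumption, since the source does not show how the delivery location is recorded; `learner_home_lad`, `n`, and `percentage` match the output shown.

```r
library(dplyr)

# Hypothetical toy records; `delivery_lad` is an assumed column name
weca_toy <- data.frame(
  delivery_lad = c("Bristol, City of", "Bristol, City of", "Bristol, City of",
                   "Bristol, City of", "South Gloucestershire"),
  learner_home_lad = c("Bristol, City of", "Bristol, City of",
                       "South Gloucestershire", "North Somerset",
                       "South Gloucestershire")
)

# Keep only records delivered in Bristol
bristol_toy <- weca_toy |>
  filter(delivery_lad == "Bristol, City of")

# Count records per learner home LAD and express each as a share of the total
home_lad_summary <- bristol_toy |>
  count(learner_home_lad, sort = TRUE) |>
  mutate(percentage = round(100 * n / sum(n), 1))

print(home_lad_summary)
```

Here `n` counts records rather than summing a `starts` column, which is consistent with the report describing 8729 records for Bristol delivery.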
Time Series Analysis
We create a monthly time series to examine temporal patterns in apprenticeship starts across different learner home locations.
Code
library(dplyr)
library(lubridate)

# Create monthly time series data grouped by learner_home_lad
monthly_data <- bristol_data |>
  mutate(year_month = floor_date(start_date, "month")) |>
  group_by(year_month, learner_home_lad) |>
  summarise(total_starts = sum(starts, na.rm = TRUE), .groups = "drop") |>
  arrange(year_month)

# Show data range; format() is needed because cat() prints a Date as its
# underlying count of days since 1970-01-01 (here 17897 and 19327)
cat("Time series covers:", format(min(monthly_data$year_month)), "to", format(max(monthly_data$year_month)))
Time series covers: 2019-01-01 to 2022-12-01
Visualization
Code
library(ggplot2)

# Create the time series plot
ggplot(monthly_data, aes(x = year_month, y = total_starts, color = learner_home_lad)) +
  geom_line(linewidth = 1.2) +
  geom_point(size = 2, alpha = 0.7) +
  labs(
    title = "Apprenticeship Starts in Bristol by Learner Home Location",
    subtitle = "Monthly time series showing apprenticeships delivered in Bristol",
    x = "Date",
    y = "Number of Apprenticeship Starts",
    color = "Learner Home LAD"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(size = 14, face = "bold"),
    plot.subtitle = element_text(size = 12),
    legend.position = "bottom",
    legend.title = element_text(face = "bold"),
    axis.text.x = element_text(angle = 45, hjust = 1)
  ) +
  scale_x_date(date_breaks = "6 months", date_labels = "%b %Y") +
  scale_color_viridis_d(option = "plasma", end = 0.8)
# A tibble: 4 × 6
  learner_home_lad  mean_monthly_starts median_monthly_starts max_monthly_starts
  <chr>                           <dbl>                 <dbl>              <dbl>
1 Bristol, City of                123.                  114.                 318
2 South Gloucester…                48.3                  39                  155
3 North Somerset                   18.2                  15                   61
4 Bath and North E…                10.1                   8                   33
# ℹ 2 more variables: min_monthly_starts <dbl>, total_starts <dbl>
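The report does not show the code behind this summary table. One way it could be computed from the monthly totals is sketched below, using hypothetical toy values in the same shape as `monthly_data`. The output column names follow the table above; the input numbers are invented for illustration.

```r
library(dplyr)

# Hypothetical monthly totals in the same shape as monthly_data
monthly_toy <- data.frame(
  year_month = as.Date(c("2021-01-01", "2021-02-01", "2021-03-01",
                         "2021-01-01", "2021-02-01", "2021-03-01")),
  learner_home_lad = rep(c("Bristol, City of", "North Somerset"), each = 3),
  total_starts = c(100, 140, 120, 10, 30, 20)
)

summary_stats <- monthly_toy |>
  group_by(learner_home_lad) |>
  summarise(
    mean_monthly_starts   = mean(total_starts),
    median_monthly_starts = median(total_starts),
    max_monthly_starts    = max(total_starts),
    min_monthly_starts    = min(total_starts),
    # Computed last so the summaries above still see the monthly column
    total_starts          = sum(total_starts),
    .groups = "drop"
  ) |>
  arrange(desc(total_starts))

print(summary_stats)
```

Months with no records for a given home LAD do not appear in `monthly_data` at all, so means and medians computed this way are averages over months that had at least one record.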
Key Findings
Geographic Distribution
The analysis reveals clear patterns in cross-boundary apprenticeship uptake:
Learners whose home LAD is Bristol, City of account for 61.1% of all apprenticeships delivered in Bristol
South Gloucestershire is the second largest source, representing 24.1% of learners
Significant cross-boundary movement occurs, with 38.9% of apprentices traveling from neighboring authorities
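The 38.9% figure follows directly from the counts in the Data Source table. As a quick arithmetic check, using only the numbers reported there:

```r
# Record counts by learner home LAD, copied from the Data Source table
records_by_home_lad <- c(
  "Bristol, City of"             = 5331,
  "South Gloucestershire"        = 2107,
  "North Somerset"               = 832,
  "Bath and North East Somerset" = 459
)

total_records <- sum(records_by_home_lad)

# Cross-boundary learners are everyone whose home LAD is not Bristol
cross_boundary <- total_records - records_by_home_lad[["Bristol, City of"]]
cross_boundary_pct <- round(100 * cross_boundary / total_records, 1)

cat("Total Bristol-delivered records:", total_records, "\n")
cat("Cross-boundary share (%):", cross_boundary_pct, "\n")
```

The total of 8729 also matches the number of Bristol delivery records given under Technical Details.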
Temporal Patterns
The time series analysis shows:
Consistent seasonal variation across all learner home locations
Similar temporal patterns suggest external factors (potentially including COVID-19 impacts) affecting apprenticeship starts region-wide
Bristol maintains its position as the dominant source throughout the time period
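The report does not say which months drive the seasonal variation. One way to check would be to average the monthly totals by calendar month for each home LAD, as in the sketch below. The series is hypothetical and its September peak is invented purely so the example has something to find; it is not a result from the WECA data.

```r
library(dplyr)

# Hypothetical two-year series with an invented September peak
monthly_toy <- data.frame(
  year_month = seq(as.Date("2021-01-01"), by = "month", length.out = 24),
  learner_home_lad = "Bristol, City of",
  total_starts = rep(c(80, 70, 75, 60, 65, 70, 50, 55, 200, 120, 90, 60),
                     times = 2)
)

# Average starts for each calendar month (1 = January, ..., 12 = December)
seasonal_profile <- monthly_toy |>
  mutate(calendar_month = as.integer(format(year_month, "%m"))) |>
  group_by(learner_home_lad, calendar_month) |>
  summarise(mean_starts = mean(total_starts), .groups = "drop")

# Calendar month with the highest average starts per home LAD
peak_months <- seasonal_profile |>
  group_by(learner_home_lad) |>
  slice_max(mean_starts, n = 1) |>
  ungroup()

print(peak_months)
```

If the profiles for all four home LADs peak and dip in the same calendar months, that would support the claim of consistent seasonality.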
Cross-Boundary Learning
The data show Bristol acting as a regional hub for apprenticeship training, attracting learners from across the West of England area. The dataset does not identify causes, but the pattern is consistent with factors such as:
Bristol’s position as the major economic center in the region
Availability of diverse apprenticeship opportunities in the city
Strong transport links facilitating cross-boundary learning
Technical Details
Data Processing
The analysis involved:
Filtering the full WECA dataset to include only Bristol delivery locations (8729 records)
Aggregating individual apprenticeship records to monthly totals by learner home LAD
Creating time series visualization using ggplot2 with appropriate color coding for different locations
Data Quality Notes
The dataset contains 24925 total records across the WECA region
All Bristol delivery records had valid learner home LAD information
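A validity check of this kind could look like the following sketch, which counts records where the home LAD is missing or blank. The toy rows are hypothetical, and the report does not state which definition of "valid" Databot applied, so treating `NA` and empty strings as invalid is an assumption.

```r
# Hypothetical records; the real check would run on the Bristol subset
bristol_toy <- data.frame(
  learner_home_lad = c("Bristol, City of", "North Somerset", NA, ""),
  starts = c(1, 1, 1, 1)
)

# Assumption: NA and empty or whitespace-only strings count as invalid
is_missing <- is.na(bristol_toy$learner_home_lad) |
  trimws(bristol_toy$learner_home_lad) == ""

n_missing <- sum(is_missing)
cat("Records without a valid learner home LAD:", n_missing, "\n")
```

On the real Bristol subset, the report's statement corresponds to this count being zero.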