Extracting Freeze Frames

In my first post, I provided a tutorial on importing StatsBomb data into R using the StatsBombR package. We took that imported data and created a few summary tables for the FA Women’s Super League. Today we are going to take this a step further and extract the shot freeze frame data that is provided with each shot.

You will have noticed that last time we removed the shot.freeze_frame column from our dataset so we could write the CSV summary tables. This was an important step because the shot freeze frame is provided as a nested dataframe: each cell of that column holds a whole dataframe of its own, which cannot be written to a flat CSV file.
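To make the idea of a nested dataframe concrete, here is a minimal sketch using a made-up two-row table rather than the real StatsBomb events. The `events` and `flat` names and the toy values are assumptions for illustration only; the column names follow the ones used in this post.

```r
library(dplyr)

# Toy events table (not real data): the freeze frame cell of the first row
# holds a whole data frame, while the second row has no freeze frame at all.
events <- tibble(
  type.name = c("Shot", "Pass"),
  shot.freeze_frame = list(
    tibble(
      teammate = c(FALSE, TRUE),
      position.name = c("Goalkeeper", "Right Wing")
    ),
    NULL
  )
)

# The column is a list, so each cell can hold an arbitrary object.
class(events$shot.freeze_frame)

# Pulling out a single cell gives back the nested data frame.
events$shot.freeze_frame[[1]]

# Dropping the list column leaves a flat table that can be written to CSV.
flat <- events %>% select(-shot.freeze_frame)
```

Inspecting one cell with `[[1]]` like this is a quick way to see what a freeze frame contains before unnesting the whole column.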

Here is an example of the data nested within the shot.freeze_frame column for a single shot.

location     teammate  player.id  player.name       position.id  position.name
115.9, 42.7  FALSE     15709      Megan Walsh       1            Goalkeeper
103.4, 58.2  TRUE      15547      Melissa Lawley    17           Right Wing
98.8, 44.0   TRUE      15613      Rinsola Babajide  23           Center Forward
91.8, 57.7   FALSE     16392      Felicity Gibbons  6            Left Back
97.8, 51.1   FALSE     22337      Maya Le Tissier   5            Left Center Back
95.0, 43.1   FALSE     16383      Danique Kerkdijk  3            Right Center Back
98.7, 32.2   FALSE     19414      Kirsty Barton     2            Right Back
88.3, 47.3   FALSE     20034      Danielle Buet     13           Right Center Midfield
88.5, 40.9   FALSE     31529      Léa Le Garrec     15           Left Center Midfield
87.2, 42.9   FALSE     23289      Emily Simpkins    10           Center Defensive Midfield
89.1, 51.3   FALSE     16399      Kate Natkiel      16           Left Midfield
Table 1. A summary of data extracted from the freeze frame column

As we can see, the freeze frame provides valuable information on player locations at the time of the shot. From this we could work out how many players are in front of or behind the ball, whether the shooter has a clear shot, and so on. This was a simple extraction using tidyr::unnest. However, we can run into problems if there is a null value within the column, where no freeze frame positional data is provided. We need to filter those rows out before we can unnest the data. We would do that as follows:

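### I have read in all data previously using StatsBombFreeEvents
library(dplyr)  # %>%, filter, select
library(purrr)  # map_lgl
library(tidyr)  # unnest

# Keep only the shots, with a few descriptive columns and the nested freeze frame
FreezeFrameData <- Data %>%
  filter(type.name == "Shot") %>%
  select(minute, second, shot.outcome.name, shot.freeze_frame)

# Drop shots with no freeze frame, then expand each frame to one row per player.
# Recent versions of tidyr warn unless the column is named explicitly via `cols`.
FreezeFrame <- FreezeFrameData %>%
  filter(!map_lgl(shot.freeze_frame, is.null)) %>%
  unnest(cols = c(shot.freeze_frame))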

Using map_lgl from purrr, I can filter out the null values from the shot freeze frame column. From there I can unnest the data so that each player in a freeze frame gets their own row. Using this filtered and unnested data you can now plot or calculate player density at the time of the shot.
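As a rough sketch of what such a calculation might look like, the example below counts teammates and opponents per shot on a small made-up dataset. Everything in it is an assumption for illustration: the `shots` table, the `shot_x` column (a hypothetical x coordinate of the shot) and the values are invented, and the freeze frame location is assumed to be a list column of c(x, y) pairs, as Table 1 suggests.

```r
library(dplyr)
library(purrr)
library(tidyr)

# Toy stand-in for the shot rows: two shots, one of them with no freeze frame.
shots <- tibble(
  minute = c(12, 47),
  shot_x = c(100, 108),  # hypothetical x coordinate of the shot itself
  shot.freeze_frame = list(
    tibble(
      location = list(c(115.9, 42.7), c(103.4, 58.2), c(91.8, 57.7)),
      teammate = c(FALSE, TRUE, FALSE),
      position.name = c("Goalkeeper", "Right Wing", "Left Back")
    ),
    NULL
  )
)

# Same filter-then-unnest pattern as above, then split location into x and y.
frames <- shots %>%
  filter(!map_lgl(shot.freeze_frame, is.null)) %>%
  unnest(cols = c(shot.freeze_frame)) %>%
  mutate(
    x = map_dbl(location, 1),
    y = map_dbl(location, 2)
  )

# One row per shot: how many teammates and opponents were in the frame,
# and how many opponents stood between the ball and the goal line (larger x).
per_shot <- frames %>%
  group_by(minute) %>%
  summarise(
    teammates = sum(teammate),
    opponents = sum(!teammate),
    opponents_ahead = sum(!teammate & x > shot_x)
  )

per_shot
```

With the real data you would need to keep the shot's own location when selecting columns, since the comparison against the ball position depends on it.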

I hope this helps you examine the free StatsBomb data in more detail.
