Analyzing Angles with Pingouin
pingouin
is a python package for
calculating statistics on data organized in pandas dataframes. It has
an easier to use interface than stats-models and a batteries-included
philosophy where operations that maybe take multiple function calls in
scipy.stats
are rolled into one call in pingouin
. The author calls
it simple-but-exhaustive statistics.
While poking around the capabilities of this package, I discovered the
circular
module. I’d never heard of circular statistics before, but
there are a lot of angles in the data I work with every day – and angles
don’t always “play nice” with other types of scalar data.
For example, if we’re talking about compass headings, the difference between
a heading of 45 degrees and a heading of 48 degrees is 2 degrees.
But, the difference between a heading of 358 degrees and 0 degrees is also
2 degrees. You can’t treat angles like other types of scalar data.
Even if you don’t work in physical coordinate systems, any data with dates and times can be worked in circular statistics. A year is a 360 degree trip around the sun. A day is a 360 degree rotation of the earth. Anything that cycles can be turned into an angle. Tuesday is just an angle away from Sunday. By converting date-times to radians, you can treat 11:59pm on Monday and 12:02am on Tuesday as the near-identical times they really are, rather than as two separate calendar days.
I generally think in degrees (it’s easier
to imaging a 15 degree angle in my mind than a .26 radian angle, but
trigonometry runs on radians.
So, the first thing we need to do with any type of circular data analysis is
convert it into radians.
For radar data, np.radians
works great. But for other types of data,
pingouin’s covert_angles function
lets you specify the number of “units” in your particular kind of circle.
Once your data is in radians, what sort of stats can you do? There is no simple analog to linear regression, unfortunately, but there are correlation functions for circular-to-circular or circular-to-scalar variables, ways to calculate values analogous to scalar mean and variance, and checks for a uniform circular distribution.