When Your Compass Keeps Flipping

A paper about making eigenvectors β€” the hidden "compass needles" inside data β€” point in consistent directions, so we can actually understand what they're telling us.

Jay Damask Β· arXiv:2402.08139 Β· February 2024
Why should I care?

The Problem with Wobbly Compass Needles

Imagine you're navigating through fog using a compass, but every time you glance down, the needle might have randomly flipped 180Β°. North becomes south without warning. You can't trust your heading. Now imagine you have seven compasses, all interconnected β€” if one flips, it confuses the others. That's the situation data scientists face with eigenvectors.

When scientists or financial analysts break complex data into its fundamental patterns β€” a technique called eigenanalysis β€” they get two things for each pattern: a number saying how important the pattern is (the eigenvalue) and an arrow saying what direction the pattern points (the eigenvector).

Here's the catch: the standard software that computes eigenvectors (svd and eig calls) doesn't guarantee the sign of those arrows. The arrow could point north or south β€” mathematically both are equally valid. When you're tracking how patterns evolve over time, this sign ambiguity is catastrophic. It's like trying to track a stock market trend when your chart randomly inverts.

"The same eigenvector, computed at two nearby moments in time, might appear on opposite sides of a hemisphere β€” even when the actual direction has barely changed."

In 2020, Jay Damask published an algorithm (and a free Python package called thucyd) to fix this. It worked well, but it had a limitation: it could only see half the compass rose β€” angles were stuck in a 180Β° window. This paper doubles that to 360Β°, which turns out to matter enormously when distinguishing real patterns from noise.

Now that we know what's at stake, let's build the vocabulary we need.
Building blocks β€” these concepts power everything that follows

The Foundation: Rotations, Reflections & Handedness

Eigenvectors: The DNA of a Dataset

Think of a dataset with 7 measurements per record β€” say, 7 currency exchange rates. You can think of each record as a point in a 7-dimensional room. Eigenvectors are the natural "compass directions" of that room: they reveal the axes along which the data actually stretches or compresses. The first eigenvector points in the direction of maximum variation; the second points perpendicular to it in the direction of next-most variation; and so on.

The Sign Problem

Software computes eigenvectors V by solving a mathematical equation that has a built-in ambiguity: if v is a solution, then βˆ’v is equally valid. It's like saying "the strongest pattern runs along this line" without saying which direction along the line. North or south? The software picks essentially at random.
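A tiny numpy check makes the ambiguity concrete (the matrix here is a toy stand-in for a covariance matrix, not data from the paper):

```python
import numpy as np

# A toy symmetric matrix standing in for a covariance matrix.
A = np.array([[2.0, 0.8],
              [0.8, 1.0]])

eigvals, V = np.linalg.eigh(A)
lam, v = eigvals[-1], V[:, -1]            # dominant eigenpair

# v and -v solve the eigen-equation equally well; the solver's
# choice between them is an implementation accident.
print(np.allclose(A @ v, lam * v))        # True
print(np.allclose(A @ (-v), lam * (-v)))  # True
```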

Rotations vs. Reflections


Left: rotation keeps handedness (the "L" stays readable). Right: reflection flips it (the "L" appears mirrored). Eigenvectors can secretly contain reflections, and this paper is about detecting and handling them.

Handedness and the Determinant

In 3D, we can tell if a coordinate system is "right-handed" or "left-handed" β€” like the difference between your actual right hand and its reflection in a mirror. The mathematical test is the determinant of the eigenvector matrix V:

det(V) = +1 → pure rotation
det(V) = −1 → contains a reflection

The goal of the orientation algorithm is to produce a matrix 𝒱 that lives in the special orthogonal group SO(N) β€” the club of pure rotations. To get there, we need to detect and fix any hidden reflections.

A Givens rotation is the simplest possible rotation: it rotates within exactly one 2D plane while leaving everything else untouched. In a 7-dimensional space, you'd need a series of these simple rotations β€” applied one after another β€” to swing a vector into alignment with a coordinate axis. It's like turning a combination lock: each click adjusts one pair of coordinates.

For an N-dimensional space, there are N(N−1)/2 such angles embedded in the eigenvector matrix. For 7 dimensions, that's 6 + 5 + 4 + 3 + 2 + 1 = 21 angles — one fewer for each successive mode.
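A Givens rotation is easy to write down directly. This minimal numpy sketch (the function name is mine, not the paper's) embeds a 30° turn in one plane of a 7-dimensional space and confirms the angle count:

```python
import numpy as np

def givens_rotation(n, i, j, theta):
    """Identity matrix with a rotation by theta in the (i, j) plane."""
    G = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    G[i, i] = c
    G[j, j] = c
    G[i, j] = -s
    G[j, i] = s
    return G

G = givens_rotation(7, 0, 1, np.pi / 6)   # a 30° turn in one plane
y = G @ np.eye(7)[0]                      # rotate the first basis vector
# Only coordinates 0 and 1 change: y ≈ (cos 30°, sin 30°, 0, 0, 0, 0, 0)
print(np.isclose(np.linalg.det(G), 1.0))  # True: a pure rotation

n = 7
print(n * (n - 1) // 2)                   # 21 embedded angles
```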

The Central Equation

Rᵀ V S = I

In plain English: "Un-rotate the eigenvector matrix V (that's the Rᵀ part), after fixing any reflections (that's the S part), and you should get back to the identity matrix I — a perfectly aligned coordinate system." Finding R and S is the algorithm.

With the vocabulary in place, here comes the key insight of this paper.
The "aha" moment β€” this is the paper's main contribution

The Core Idea: Rotate Instead of Reflect

Imagine you're trying to park a car facing north, but it's currently facing south. You have two choices: (A) pick the car up, flip it like a pancake so it faces north β€” that's a reflection. Or (B) turn the steering wheel and drive it around in a U-turn β€” that's a rotation through a major angle (more than 90Β°). Both end with the car facing north, but option B stays in the same world of driving maneuvers. The original algorithm chose option A; this paper shows how to choose option B.

Why This Matters for Angles

The original algorithm used a function called arcsin to compute angles. The problem is that arcsin can only return angles between βˆ’90Β° and +90Β° β€” a half-circle. This means you're looking at your eigenvectors through a narrow window: you can see them in the front hemisphere, but if one wanders to the back, the algorithm forces it to the front via a reflection, making it look like a small angle when it was actually a big one.

The new algorithm uses arctan2 β€” a function that returns angles across the full 360Β° circle. But a naive application of arctan2 breaks on edge cases (as the original paper showed). The insight is:

Only the first rotation in each subspace needs the full 360Β° range. All subsequent rotations in that subspace are inherently limited to 180Β° β€” the vector is guaranteed to be in the front hemisphere after the first rotation aligns its leading component. This is the "modified arctan2" method.
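The range difference is easy to see numerically. A direction at 150° — in the back hemisphere — gets folded to 30° by arcsin, while arctan2 recovers it. This is a minimal illustration of the range issue, not the paper's angle recursion:

```python
import numpy as np

true_angle = np.deg2rad(150)              # a direction in the back hemisphere
x, y = np.cos(true_angle), np.sin(true_angle)

via_arcsin = np.arcsin(y)                 # range limited to [-90°, +90°]
via_arctan2 = np.arctan2(y, x)            # full 360° range

print(round(np.rad2deg(via_arcsin), 1))   # 30.0  -- folded to the front
print(round(np.rad2deg(via_arctan2), 1))  # 150.0 -- the true direction
```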

Comparing the two methods side by side, step by step, on the same example — a left-handed basis being oriented — shows where they diverge.

Original (arcsin) Method

Starting basis: v₁, vβ‚‚, v₃ form a left-handed system. Both methods start here.

New (modified arctan2) Method

Same starting point. The difference emerges at step 2.

The Reflection Matrix Simplifies

With the original method, the reflection matrix S could have any pattern of Β±1's along its diagonal β€” like (1, βˆ’1, 1, 1, βˆ’1, 1, 1). With the new method, reflections are deferred to the very last, irreducible dimension. The new S looks like:

S_tan = diag(1, 1, 1, ..., 1, ±1)

Only the very last entry might be βˆ’1 (a reflection), and only when det(V) = βˆ’1.

When a vector entry is exactly zero (sparse vectors), the standard formula stalls because it divides by zero. The paper handles this by adjusting the indexing: instead of looking at the immediately preceding entry, the algorithm looks back to the first nonzero entry. This is formalized as:

ΞΈk = arctan2(ak |sin ΞΈkβˆ’j|, |akβˆ’j|), j β‰₯ 1

where akβˆ’j is the first nonzero entry before ak. There's also a special case when the second entry is zero, where the sine term drops out entirely.

The recipe β€” how the algorithm actually works

The Algorithm, Step by Step

Think of this algorithm as an assembly line that processes eigenvectors one at a time, from the most important to the least. At each station, the current vector is rotated into alignment with its target axis. The whole process is like solving a Rubik's cube: you fix one face at a time, and each fix constrains what's left.

Sort eigenvectors

Arrange eigenvectors by their eigenvalue size (biggest first). The most important patterns get processed first.

Optional: Orient to first orthant

If you expect the dominant eigenvector to have all positive entries (common in finance where the "market mode" lifts all boats), flip it if needed. This is the OrientToFirstOrthant flag.

For each reducible subspace…

Compute rotation angles using the modified arctan2 formula. The first angle spans 360Β°; the rest span 180Β°. Apply the corresponding Givens rotations to align the current eigenvector with its target axis.

Handle the last dimension

The final eigenvector lives in an irreducible subspace β€” there are no axes left to rotate around. If it's pointing the wrong way, reflect it. This is the only place a reflection might be needed.

Reconstruct the oriented matrix

Apply all reflections to the sorted eigenvector matrix: 𝒱 = VΒ·S. The result is a pure rotation away from the identity.

The paper provides a complete pseudocode listing (Algorithm 1) for the orient_eigenvectors function. Key subroutines:

  • SortEigenvectors β€” sorts by |eigenvalue| descending
  • ReduceDimensionByOne β€” computes angles and applies rotation for one subspace
  • SolveRotationAnglesInSubDim β€” the heart: uses arctan2 with the guaranteed-positive trick
  • ConstructSubspaceRotationMtx β€” builds the composite Givens rotation
  • MakeGivensRotation β€” creates a single 2D rotation embedded in N dimensions
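The overall scheme can be sketched in plain numpy — zero each eigenvector's trailing entries with Givens rotations whose angles come from arctan2, then absorb any leftover reflection into the last dimension. This is a compact illustration under those assumptions, not the thucyd package's actual code:

```python
import numpy as np

def givens(n, i, j, theta):
    """Identity with a rotation by theta in the (i, j) plane."""
    G = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    G[i, i] = c
    G[j, j] = c
    G[i, j] = s
    G[j, i] = -s
    return G

def orient(V):
    """Find a rotation R and reflection S with R.T @ V @ S = I.

    Columns of V are assumed to be eigenvalue-sorted unit eigenvectors.
    """
    n = V.shape[0]
    W = V.copy()
    Rt = np.eye(n)                  # accumulates R.T as a product of Givens rotations
    for k in range(n - 1):          # one reducible subspace per eigenvector
        for j in range(k + 1, n):   # zero the entries below the pivot
            # Only this first angle per subspace can span the full 360°;
            # afterwards W[k, k] > 0, so later angles stay within a 180° window.
            theta = np.arctan2(W[j, k], W[k, k])
            G = givens(n, k, j, theta)
            W = G @ W
            Rt = G @ Rt
    S = np.eye(n)                   # any leftover reflection sits in the
    S[-1, -1] = np.sign(W[-1, -1])  # last, irreducible dimension
    return Rt.T, S

# Usage: orient a random orthogonal "eigenvector" matrix.
rng = np.random.default_rng(42)
V, _ = np.linalg.qr(rng.standard_normal((7, 7)))
R, S = orient(V)
print(np.allclose(R.T @ V @ S, np.eye(7)))  # True
print(np.isclose(np.linalg.det(R), 1.0))    # True: R is a pure rotation
```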

The Python package thucyd (v0.2.5+) implements this. Install via pip install thucyd or conda install thucyd.

The algorithm is in hand. Now let's see what happens when we use it on real data.
The proof β€” real financial data reveals the difference

Results: Seeing Full Circles

The Dataset

Damask uses real data from the Chicago Mercantile Exchange (CME): quotes and trades for 7 foreign exchange contracts (EUR/USD, USD/JPY, GBP/USD, USD/CHF, AUD/USD, NZD/USD, CAD/USD) over 23 business days in May 2023.

7 FX pairs · 23 business days · 14 columns (7 quotes + 7 trades) · 500–1000 records/day

Each day's data was carefully processed: mid-prices derived from quotes, trades signed for direction, all filtered to a common timescale, downsampled to remove autocorrelation, and mapped through a statistical copula to produce clean multivariate Gaussian panels. The quote and trade panels (7 columns each) were then separately eigenanalyzed via SVD.

The Big Picture: Half-Circle vs. Full-Circle

Each dot is one day's eigenvector orientation for that mode, shown across 6 modes and 23 days; radius = participation score.

Under the arcsin method, all points are confined to the right half-circle (−90° to +90°), so points that appear near the edges may actually be wrapped around from the other side.

What the Full Circle Reveals

The switch from arcsin to arctan2 exposes three key findings:

1. Informative modes cluster tightly. Quote mode 1 and trade mode 1 point in a consistent direction β€” they're the "real signal" in the data. For quotes, modes 2 and 3 are also somewhat directed.

2. Noisy modes scatter uniformly. Quote modes 4–6 and trade modes 2–6 spread evenly around the full circle β€” they're indistinguishable from random noise. This was ambiguous with only a half-circle view.

3. Outliers become visible. In quote mode 3, one day stands apart from the cluster. It turns out this was May 10, 2023 β€” when the US Consumer Price Index was released. EUR/USD responded anomalously, creating a genuine data outlier that was invisible with the arcsin method.

Participation Score

The participation score (PS) measures how many of the 7 currencies contribute to a given eigenvector. Think of an orchestra: if all instruments play equally, PS = 1.0. If one instrument dominates while the rest are silent, PS = 1/7 β‰ˆ 0.14.

It's calculated from the inverse participation ratio (IPR): IPR = Σᵢ vᵢ⁴ for a unit-norm eigenvector v, and then PS = 1/(N × IPR). Mode 1 has high PS (≈0.8–1.0), meaning all currencies participate. Random modes have low, scattered PS values.
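In code the score is a direct transcription of those formulas:

```python
import numpy as np

def participation_score(v):
    """PS = 1 / (N * IPR), where IPR = sum(v_i**4) for a unit-norm vector."""
    v = v / np.linalg.norm(v)
    ipr = np.sum(v ** 4)
    return 1.0 / (len(v) * ipr)

uniform = np.ones(7)                    # all 7 currencies participate equally
one_hot = np.zeros(7)
one_hot[0] = 1.0                        # one currency dominates

print(participation_score(uniform))     # ≈ 1.0
print(participation_score(one_hot))     # ≈ 0.143, i.e. 1/7
```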

Random Matrix Theory Cross-Validation

Can eigenvalue analysis confirm what we see from the eigenvector directions?
Imagine generating a completely random dataset with no real structure β€” just noise. Even this random data would produce eigenvalues, and they'd follow a predictable distribution called the Marčenko–Pastur (MP) distribution. It's like the "background hum" of eigenvalues. Any real eigenvalue that sticks out above this hum probably carries genuine information.
The MP distribution depends on the ratio q = N/T — the number of features over the number of records. The smaller q, the narrower the noise zone.

The shaded region is the MP distribution β€” eigenvalues here could be pure noise. Eigenvalues to the right are likely real signal.
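The edges of the MP bulk are (1 ∓ √q)² for unit-variance noise — a standard result, though the formula isn't spelled out above. A quick numpy check with the dimensions used in this paper:

```python
import numpy as np

def mp_edges(n_features, n_samples):
    """Bulk edges (1 -/+ sqrt(q))**2 of the Marchenko-Pastur law, q = N/T."""
    q = n_features / n_samples
    return (1 - np.sqrt(q)) ** 2, (1 + np.sqrt(q)) ** 2

lo, hi = mp_edges(7, 500)               # q = 0.014, as in the FX panels
print(round(lo, 3), round(hi, 3))       # 0.777 1.251 -- a narrow noise zone

# Eigenvalues of a pure-noise correlation matrix stay near this band:
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 7))
noise_eigs = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))
print(np.round(np.sort(noise_eigs), 2))
```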

For the FX data: with N = 7 features and T ≈ 500–1000 records per day, q ≈ 0.007–0.014 is very small, so the MP distribution is narrow. The results:

For quotes: modes 1–3 have eigenvalues that fall outside the MP distribution → genuine signal. This matches the directional analysis perfectly: modes 1–3 are directed. Modes 4–7 fall within the MP distribution → indistinguishable from noise. Again, a perfect match: modes 4–6 scatter randomly.

For trades: only mode 1 falls outside the MP distribution → genuine signal. Modes 2–7 are all within the noise zone, which explains why trade eigenvectors for modes 2–6 scatter uniformly on the full circle.

The beautiful conclusion: eigenvector directional analysis and eigenvalue spectral analysis agree. But the directional analysis provides independent evidence β€” it doesn't rely on the somewhat subjective rescaling of the MP distribution that's needed for the eigenvalue approach.

Practical payoff β€” how to make eigenvectors stable over time

Eigenvector Stabilization

Now that we can distinguish informative modes from noisy ones, Damask proposes two strategies to stabilize eigenvectors as they evolve:

Dynamic Stabilization

For informative modes that wobble slightly day to day: smooth them with a causal time filter. Stack recent eigenvectors end-to-end (weighted), find the resultant direction, and re-orthogonalize.

Filter used: h[n] = [1, 2, 3, 2, 1]/9 — a 5-day window with a 2-day delay. After filtering, scatter is visibly tighter and participation scores increase.

Static Stabilization

For noisy modes whose directions are meaningless: set their rotation angles to zero, fixing them to the coordinate axes. This trades variance for bias.

Geometrically, the stabilized eigenvector matrix 𝒱_modal is "closer" to the identity. Unlike PCA, all modes remain — none are discarded.

Dynamic stabilization is like using image stabilization on a camera β€” the real scene is preserved but made steadier. Static stabilization is like photoshopping out background noise β€” you know it's not real information, so you replace it with a clean default.

How the Averaging Works

You can't simply average angles to find the mean direction of wobbling vectors. Instead, you stack the vectors end-to-end (possibly with weights), measure the resultant vector's direction, and renormalize. This is formalized as:

Mh[n] = (h βˆ— 𝒱)[n] β†’ normalize β†’ re-orthogonalize via orientation algorithm

The convolution with filter h weights recent days more. Normalization restores unit length. The orientation algorithm restores orthogonality.
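A 2-D sketch of this vector averaging, using the paper's 5-tap filter on an invented wobbling series (the angle series here is synthetic, purely for illustration):

```python
import numpy as np

h = np.array([1, 2, 3, 2, 1]) / 9.0     # the paper's 5-day causal filter

# A hypothetical mode wobbling around 40° over five consecutive days.
rng = np.random.default_rng(1)
angles = np.deg2rad(40 + rng.normal(0.0, 5.0, size=5))
vs = np.column_stack([np.cos(angles), np.sin(angles)])  # one unit vector per day

# Stack the vectors end-to-end with the filter weights, then renormalize;
# this avoids the wrap-around pitfalls of averaging raw angles.
resultant = h @ vs
smoothed = resultant / np.linalg.norm(resultant)
print(round(np.rad2deg(np.arctan2(smoothed[1], smoothed[0])), 1))  # ≈ 40
```

In the full N-dimensional setting, the smoothed columns are then re-orthogonalized by running the orientation algorithm again, as described above.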

The Angle Matrix for Static Stabilization

For the 7-dimensional quote system with 3 informative modes, the 21 embedded angles are arranged in a matrix. Static stabilization zeros out the angles for noisy modes 4–6:


  0  θ₁₂  θ₁₃  θ₁₄  θ₁₅  θ₁₆  θ₁₇   ← mode 1
     0   θ₂₃  ΞΈβ‚‚β‚„  ΞΈβ‚‚β‚…  θ₂₆  θ₂₇   ← mode 2
         0   θ₃₄  θ₃₅  θ₃₆  θ₃₇   ← mode 3
              0   ΞΈβ‚„β‚…  θ₄₆  θ₄₇   ← mode 4
                   0   θ₅₆  θ₅₇   ← mode 5
                        0   θ₆₇   ← mode 6
                             0

Effect on Correlation Matrices

The whole point is to produce cleaner, more stable correlation estimates. The paper shows a progression:


Original correlations show wide dispersion across the 23-day sample. Each circle represents one element of the correlation matrix across days.

Ledoit–Wolf shrinkage is a classical technique that blends a noisy empirical correlation matrix with the identity matrix: Σ_shr = α Σ̂ + (1 − α) I. But it applies to the whole matrix uniformly.

The insight here is more surgical: you can rotate away the informative modes first, apply shrinkage only to the noisy remainder, and then rotate back. Static stabilization with angles set to zero is equivalent to shrinkage with Ξ± = 0 for the noisy modes β€” maximum shrinkage, targeted precisely where it's needed.
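The uniform blend is a one-liner. Here α is hand-picked for illustration, whereas the actual Ledoit–Wolf estimator derives it from the data:

```python
import numpy as np

def uniform_shrinkage(corr, alpha):
    """Blend an empirical correlation matrix toward the identity target."""
    return alpha * corr + (1.0 - alpha) * np.eye(corr.shape[0])

corr_hat = np.array([[1.0, 0.6],
                     [0.6, 1.0]])
print(uniform_shrinkage(corr_hat, 0.5))
# Off-diagonals shrink 0.6 -> 0.3; the unit diagonal is preserved.
```

Static stabilization corresponds to the α = 0 limit of this blend, applied only to the noisy modes in their rotated frame.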

arcsin vs. Modified arctan2: Full Comparison

Feature                                  arcsin (original)                      Modified arctan2 (new)
Angular range (primary rotation)         π (180°)                               2π (360°)
Reflection matrix S                      Any pattern of ±1's                    diag(1, ..., 1, ±1)
Angular wrap-around artifacts            Present at π boundary                  Eliminated
Edge-case robustness                     Naive arctan2 breaks on signed zeros   Signed zeros avoided by design
Best suited for                          Regression (sign stability matters)    Directional statistics (full angular disambiguation)
Informative vs. noisy mode distinction   Ambiguous (half-circle view)           Clear (clustering vs. full-circle scatter)
Implementation                           thucyd method='arcsin'                 thucyd method='arctan2'

Key Takeaways

What you'd tell a friend over coffee:

1. Software-generated eigenvectors have a fundamental sign ambiguity. An algorithm exists (with free code) to fix it consistently.

2. The original fix saw the world through a half-circle. The new "modified arctan2" method sees the full circle by recognizing that only the first rotation in each subspace needs the wide view.

3. The full-circle view cleanly separates signal from noise in eigenvectors β€” matching what eigenvalue theory predicts, but providing independent confirmation.

4. Eigenvectors can be stabilized over time: smooth the informative ones (dynamic), fix the noisy ones (static). This produces cleaner correlation matrices.

5. Both methods serve different purposes: arcsin for regression applications, arctan2 for directional statistics. The thucyd Python package supports both.

6. Real-world test on FX markets found that the new method revealed an outlier caused by a US CPI announcement β€” invisible with the old method.