A paper about making eigenvectors, the hidden "compass needles" inside data, point in consistent directions, so we can actually understand what they're telling us.
Jay Damask · arXiv:2402.08139 · February 2024
Scroll down to explore the full paper ↓
Why should I care?
The Problem with Wobbly Compass Needles
Imagine you're navigating through fog using a compass, but every time you glance down, the needle might have randomly flipped 180°. North becomes south without warning. You can't trust your heading. Now imagine you have seven compasses, all interconnected: if one flips, it confuses the others. That's the situation data scientists face with eigenvectors.
When scientists or financial analysts break complex data into its fundamental patterns (a technique called eigenanalysis), they get two things for each pattern: a number saying how important the pattern is (the eigenvalue) and an arrow saying what direction the pattern points (the eigenvector).
Here's the catch: the standard software that computes eigenvectors (svd and eig calls) doesn't guarantee the sign of those arrows. The arrow could point north or south; mathematically both are equally valid. When you're tracking how patterns evolve over time, this sign ambiguity is catastrophic. It's like trying to track a stock market trend when your chart randomly inverts.
"The same eigenvector, computed at two nearby moments in time, might appear on opposite sides of a hemisphere, even when the actual direction has barely changed."
In 2020, Jay Damask published an algorithm (and a free Python package called thucyd) to fix this. It worked well, but it had a limitation: it could only see half the compass rose, because angles were stuck in a 180° window. This paper doubles that to 360°, which turns out to matter enormously when distinguishing real patterns from noise.
Now that we know what's at stake, let's build the vocabulary we need.
Building blocks: these concepts power everything that follows
The Foundation: Rotations, Reflections & Handedness
Eigenvectors: The DNA of a Dataset
Think of a dataset with 7 measurements per record β say, 7 currency exchange rates. You can think of each record as a point in a 7-dimensional room. Eigenvectors are the natural "compass directions" of that room: they reveal the axes along which the data actually stretches or compresses. The first eigenvector points in the direction of maximum variation; the second points perpendicular to it in the direction of next-most variation; and so on.
The Sign Problem
Software computes eigenvectors V by solving a mathematical equation that has a built-in ambiguity: if v is a solution, then −v is equally valid. It's like saying "the strongest pattern runs along this line" without saying which direction along the line. North or south? The software picks essentially at random.
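This ambiguity is easy to see numerically. The following toy check (our own example, not from the paper) uses NumPy to confirm that an eigenvector and its negation satisfy the same defining equation:

```python
import numpy as np

# Toy illustration: both v and -v solve A v = lam v, so an eigensolver
# is free to return either sign.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # symmetric matrix, eigenvalues 1 and 3
lam, V = np.linalg.eigh(A)
v = V[:, 1]                         # eigenvector for the largest eigenvalue

# The defining equation holds equally well for v and for -v:
assert np.allclose(A @ v, lam[1] * v)
assert np.allclose(A @ (-v), lam[1] * (-v))
```

Different linear-algebra backends, or the same backend on slightly perturbed inputs, may legitimately return either sign.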
Rotations vs. Reflections
Try me
Drag the slider to see the difference between a rotation and a reflection.
Left: rotation keeps handedness (the "L" stays readable). Right: reflection flips it (the "L" appears mirrored). Eigenvectors can secretly contain reflections, and this paper is about detecting and handling them.
Handedness and the Determinant
In 3D, we can tell if a coordinate system is "right-handed" or "left-handed", like the difference between your actual right hand and its reflection in a mirror. The mathematical test is the determinant of the eigenvector matrix V:
det(V) = +1 → pure rotation
det(V) = −1 → contains reflections
The goal of the orientation algorithm is to produce a matrix V that lives in the special orthogonal group SO(N), the club of pure rotations. To get there, we need to detect and fix any hidden reflections.
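As a quick sketch (our own 2D example, not from the paper), the determinant test looks like this in NumPy:

```python
import numpy as np

# A 2D rotation has det = +1; flipping one axis turns it into a reflection
# with det = -1 while keeping the columns orthonormal.
theta = 0.3
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
reflection = rotation.copy()
reflection[:, 1] *= -1.0            # mirror one axis

assert np.isclose(np.linalg.det(rotation), 1.0)     # pure rotation: in SO(2)
assert np.isclose(np.linalg.det(reflection), -1.0)  # contains a reflection
```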
A Givens rotation is the simplest possible rotation: it rotates within exactly one 2D plane while leaving everything else untouched. In a 7-dimensional space, you'd need a series of these simple rotations β applied one after another β to swing a vector into alignment with a coordinate axis. It's like turning a combination lock: each click adjusts one pair of coordinates.
For an N-dimensional space, there are N(N−1)/2 such angles embedded in the eigenvector matrix. For 7 dimensions, that's 21 angles total: 6 + 5 + 4 + 3 + 2 + 1, a decreasing number of angles across the 6 modes.
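A minimal NumPy sketch of a Givens rotation (our own helper, not the paper's pseudocode) makes the "one plane at a time" idea concrete:

```python
import numpy as np

def make_givens(n, i, j, theta):
    """Identity everywhere except a 2D rotation in the (i, j) plane."""
    G = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    G[i, i] = G[j, j] = c
    G[i, j], G[j, i] = -s, s
    return G

# Rotating within the (0, 1) plane of a 7D space leaves axes 2..6 untouched:
G = make_givens(7, 0, 1, np.pi / 4)
assert np.allclose(G @ G.T, np.eye(7))      # orthogonal
assert np.isclose(np.linalg.det(G), 1.0)    # pure rotation
assert 7 * 6 // 2 == 21                     # N(N-1)/2 planes for N = 7
```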
The Central Equation
Rᵀ V S = I
In plain English: "Un-rotate the eigenvector matrix V (that's the Rᵀ part), after fixing any reflections (that's the S part), and you should get back to the identity matrix I, a perfectly aligned coordinate system." Finding R and S is the algorithm.
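A two-dimensional numeric check (our own toy numbers) shows how the pieces fit: if V hides a reflection, S undoes it and Rᵀ un-rotates what remains:

```python
import numpy as np

theta = np.radians(30)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # a pure rotation
S = np.diag([1.0, -1.0])                          # undoes a mirrored axis
V = R @ S                                # "eigenvector matrix" with det = -1

# The central equation: un-reflect with S, un-rotate with R^T, land on I.
assert np.isclose(np.linalg.det(V), -1.0)
assert np.allclose(R.T @ V @ S, np.eye(2))
```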
With the vocabulary in place, here comes the key insight of this paper.
The "aha" moment: this is the paper's main contribution
The Core Idea: Rotate Instead of Reflect
Imagine you're trying to park a car facing north, but it's currently facing south. You have two choices: (A) pick the car up and flip it like a pancake so it faces north, which is a reflection. Or (B) turn the steering wheel and drive it around in a U-turn, which is a rotation through a major angle (more than 90°). Both end with the car facing north, but option B stays in the same world of driving maneuvers. The original algorithm chose option A; this paper shows how to choose option B.
Why This Matters for Angles
The original algorithm used a function called arcsin to compute angles. The problem is that arcsin can only return angles between −90° and +90°, a half-circle. This means you're looking at your eigenvectors through a narrow window: you can see them in the front hemisphere, but if one wanders to the back, the algorithm forces it to the front via a reflection, making it look like a small angle when it was actually a big one.
The new algorithm uses arctan2, a function that returns angles across the full 360° circle. But a naive application of arctan2 breaks on edge cases (as the original paper showed). The insight is:
Only the first rotation in each subspace needs the full 360° range. All subsequent rotations in that subspace are inherently limited to 180°: the vector is guaranteed to be in the front hemisphere after the first rotation aligns its leading component. This is the "modified arctan2" method.
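The range difference is easy to demonstrate with NumPy (a toy example of ours, not the paper's formula):

```python
import numpy as np

# A unit vector at 135 degrees: negative x, positive y (the "back" hemisphere).
x, y = np.cos(np.radians(135)), np.sin(np.radians(135))

# arcsin sees only the y component and can only answer in [-90, 90]:
angle_arcsin = np.degrees(np.arcsin(y))       # 45: looks like a small angle
# arctan2 uses both components and recovers the true quadrant:
angle_arctan2 = np.degrees(np.arctan2(y, x))  # 135: the full-circle answer

assert np.isclose(angle_arcsin, 45.0)
assert np.isclose(angle_arctan2, 135.0)
```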
Explore
Compare the two methods side by side. Click each step to see how a left-handed basis gets oriented.
Original (arcsin) Method
Starting basis: v₁, v₂, v₃ form a left-handed system. Both methods start here.
New (modified arctan2) Method
Same starting point. The difference emerges at step 2.
The Reflection Matrix Simplifies
With the original method, the reflection matrix S could have any pattern of ±1's along its diagonal, like (1, −1, 1, 1, −1, 1, 1). With the new method, reflections are deferred to the very last, irreducible dimension. The new S looks like:
S_tan = diag(1, 1, 1, ..., 1, ±1)
Only the very last entry might be −1 (a reflection), and only when det(V) = −1.
When a vector entry is exactly zero (sparse vectors), the standard formula stalls because it divides by zero. The paper handles this by adjusting the indexing: instead of looking at the immediately preceding entry, the algorithm looks back to the first nonzero entry. This is formalized as:
where a_(k−j) is the first nonzero entry before a_k. There's also a special case when the second entry is zero, where the sine term drops out entirely.
The recipe: how the algorithm actually works
The Algorithm, Step by Step
Think of this algorithm as an assembly line that processes eigenvectors one at a time, from the most important to the least. At each station, the current vector is rotated into alignment with its target axis. The whole process is like solving a Rubik's cube: you fix one face at a time, and each fix constrains what's left.
Sort eigenvectors
Arrange eigenvectors by their eigenvalue size (biggest first). The most important patterns get processed first.
Optional: Orient to first orthant
If you expect the dominant eigenvector to have all positive entries (common in finance where the "market mode" lifts all boats), flip it if needed. This is the OrientToFirstOrthant flag.
For each reducible subspace…
Compute rotation angles using the modified arctan2 formula. The first angle spans 360°; the rest span 180°. Apply the corresponding Givens rotations to align the current eigenvector with its target axis.
Handle the last dimension
The final eigenvector lives in an irreducible subspace: there are no axes left to rotate around. If it's pointing the wrong way, reflect it. This is the only place a reflection might be needed.
Reconstruct the oriented matrix
Apply all reflections to the sorted eigenvector matrix: V_oriented = V·S. The result is a pure rotation away from the identity.
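Steps 1, 4, and 5 can be sketched in a few lines of NumPy (a toy of ours; the Givens-rotation bookkeeping of steps 2–3 is omitted, and the names are not the paper's):

```python
import numpy as np

def orient_sketch(eigvals, V):
    """Sort modes by |eigenvalue|, then reflect only the last axis if needed."""
    order = np.argsort(-np.abs(eigvals))        # step 1: biggest first
    eigvals, V = eigvals[order], V[:, order]
    S = np.eye(V.shape[0])
    if np.linalg.det(V) < 0:                    # hidden reflection detected
        S[-1, -1] = -1.0                        # step 4: flip the last axis only
    return eigvals, V @ S                       # step 5: V.S is a pure rotation

vals, Vor = orient_sketch(np.array([1.0, 3.0, 2.0]),
                          np.diag([1.0, 1.0, -1.0]))   # det = -1 input
assert np.isclose(np.linalg.det(Vor), 1.0)             # now in SO(3)
assert list(vals) == [3.0, 2.0, 1.0]
```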
The paper provides a complete pseudocode listing (Algorithm 1) for the orient_eigenvectors function. Key subroutines:
SortEigenvectors: sorts by |eigenvalue| descending
ReduceDimensionByOne: computes angles and applies rotation for one subspace
SolveRotationAnglesInSubDim: the heart of the method; uses arctan2 with the guaranteed-positive trick
ConstructSubspaceRotationMtx: builds the composite Givens rotation
MakeGivensRotation: creates a single 2D rotation embedded in N dimensions
The Python package thucyd (v0.2.5+) implements this. Install via pip install thucyd or conda install thucyd.
The algorithm is in hand. Now let's see what happens when we use it on real data.
The proof: real financial data reveals the difference
Results: Seeing Full Circles
The Dataset
Damask uses real data from the Chicago Mercantile Exchange (CME): quotes and trades for 7 foreign exchange contracts (EUR/USD, USD/JPY, GBP/USD, USD/CHF, AUD/USD, NZD/USD, CAD/USD) over 23 business days in May 2023.
7 FX pairs · 23 business days · 14 columns (7 quotes + 7 trades) · 500–1000 records/day
Each day's data was carefully processed: mid-prices derived from quotes, trades signed for direction, all filtered to a common timescale, downsampled to remove autocorrelation, and mapped through a statistical copula to produce clean multivariate Gaussian panels. The quote and trade panels (7 columns each) were then separately eigenanalyzed via SVD.
The Big Picture: Half-Circle vs. Full-Circle
Interactive
Toggle between the two methods to see how eigenvector pointing directions change across 6 modes and 23 days. Each dot is one day's eigenvector orientation for that mode. Radius = participation score.
Showing the arcsin method. All points are confined to the right half-circle (−90° to +90°). Some points that appear near the edges may actually be wrapped around from the other side.
What the Full Circle Reveals
The switch from arcsin to arctan2 exposes three key findings:
1. Informative modes cluster tightly. Quote mode 1 and trade mode 1 point in a consistent direction: they're the "real signal" in the data. For quotes, modes 2 and 3 are also somewhat directed.
2. Noisy modes scatter uniformly. Quote modes 4–6 and trade modes 2–6 spread evenly around the full circle: they're indistinguishable from random noise. This was ambiguous with only a half-circle view.
3. Outliers become visible. In quote mode 3, one day stands apart from the cluster. It turns out this was May 10, 2023, when the US Consumer Price Index was released. EUR/USD responded anomalously, creating a genuine data outlier that was invisible with the arcsin method.
Participation Score
The participation score (PS) measures how many of the 7 currencies contribute to a given eigenvector. Think of an orchestra: if all instruments play equally, PS = 1.0. If one instrument dominates while the rest are silent, PS = 1/7 ≈ 0.14.
It's calculated from the inverse participation ratio (IPR): IPR = Σᵢ vᵢ⁴, and then PS = 1/(N × IPR). Mode 1 has high PS (≈0.8–1.0), meaning all currencies participate. Random modes have low, scattered PS values.
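In code, the score follows directly from those two formulas (our own helper, normalizing the eigenvector to unit length first):

```python
import numpy as np

def participation_score(v):
    """PS = 1 / (N * IPR), with IPR = sum of fourth powers of the entries."""
    v = v / np.linalg.norm(v)
    ipr = np.sum(v ** 4)
    return 1.0 / (len(v) * ipr)

uniform = np.ones(7)                 # all 7 currencies contribute equally
spike = np.zeros(7); spike[0] = 1.0  # one currency dominates

assert np.isclose(participation_score(uniform), 1.0)
assert np.isclose(participation_score(spike), 1.0 / 7)   # about 0.14
```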
Random Matrix Theory Cross-Validation
Can eigenvalue analysis confirm what we see from the eigenvector directions?
Imagine generating a completely random dataset with no real structure, just noise. Even this random data would produce eigenvalues, and they'd follow a predictable distribution called the Marčenko–Pastur (MP) distribution. It's like the "background hum" of eigenvalues. Any real eigenvalue that sticks out above this hum probably carries genuine information.
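For unit-variance noise, the edges of the MP "hum" have a standard closed form, λ± = (1 ± √q)² with q = N/T; the paper's rescaling refinements are omitted in this sketch of ours:

```python
import numpy as np

def mp_edges(n_features, n_records):
    """Support edges of the Marchenko-Pastur law for unit-variance noise."""
    q = n_features / n_records
    return (1 - np.sqrt(q)) ** 2, (1 + np.sqrt(q)) ** 2

low, high = mp_edges(7, 700)        # q = 0.01: a narrow noise band
assert np.isclose(low, 0.81)
assert np.isclose(high, 1.21)
# Any empirical eigenvalue above `high` is a candidate "real signal" mode.
```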
Explore
The MP distribution depends on the ratio q = N/T. Drag the slider to see how the "noise zone" changes.
The shaded region is the MP distribution: eigenvalues here could be pure noise. Eigenvalues to the right are likely real signal.
For the FX data: with N = 7 features and T ≈ 500–1000 records, q is very small, so the MP distribution is narrow. The results:
For quotes, modes 1–3 have eigenvalues that fall outside the MP distribution: genuine signal. This matches the directional analysis perfectly, since modes 1–3 are directed. Modes 4–7 fall within the MP distribution, indistinguishable from noise. Again a perfect match: modes 4–6 scatter randomly.
For trades, only mode 1 falls outside the MP distribution as genuine signal. Modes 2–7 are all within the noise zone, which explains why trade eigenvectors for modes 2–6 scatter uniformly on the full circle.
The beautiful conclusion: eigenvector directional analysis and eigenvalue spectral analysis agree. But the directional analysis provides independent evidence, since it doesn't rely on the somewhat subjective rescaling of the MP distribution that's needed for the eigenvalue approach.
Practical payoff: how to make eigenvectors stable over time
Eigenvector Stabilization
Now that we can distinguish informative modes from noisy ones, Damask proposes two strategies to stabilize eigenvectors as they evolve:
Dynamic Stabilization
For informative modes that wobble slightly day to day: smooth them with a causal time filter. Stack recent eigenvectors end-to-end (weighted), find the resultant direction, and re-orthogonalize.
Filter used: h[n] = [1, 2, 3, 2, 1]/9 (5-day, 2-day delay). After filtering, scatter is visibly tighter and participation scores increase.
Static Stabilization
For noisy modes whose directions are meaningless: set their rotation angles to zero, fixing them to the coordinate axes. This trades variance for bias.
Geometrically, the stabilized eigenvector matrix V_modal is "closer" to the identity. Unlike PCA, all modes remain; none are discarded.
Dynamic stabilization is like using image stabilization on a camera: the real scene is preserved but made steadier. Static stabilization is like photoshopping out background noise: you know it's not real information, so you replace it with a clean default.
How the Averaging Works
You can't simply average angles to find the mean direction of wobbling vectors. Instead, you stack the vectors end-to-end (possibly with weights), measure the resultant vector's direction, and renormalize. This is formalized as:
The convolution with filter h weights recent days more. Normalization restores unit length. The orientation algorithm restores orthogonality.
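A minimal sketch of the resultant-vector idea (our own toy; the cross-mode re-orthogonalization step described above is left out):

```python
import numpy as np

h = np.array([1, 2, 3, 2, 1]) / 9.0     # the 5-tap filter from the text

def smooth_direction(recent_vs):
    """Weight the last 5 unit vectors, sum them, renormalize (oldest first)."""
    resultant = sum(w * v for w, v in zip(h, recent_vs))
    return resultant / np.linalg.norm(resultant)

# Five noisy observations of roughly the same direction:
rng = np.random.default_rng(0)
vs = [np.array([1.0, 0.0]) + 0.1 * rng.standard_normal(2) for _ in range(5)]
vs = [v / np.linalg.norm(v) for v in vs]
v_smooth = smooth_direction(vs)

assert np.isclose(np.linalg.norm(v_smooth), 1.0)   # unit length restored
assert v_smooth[0] > 0.9                           # still near the base direction
```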
The Angle Matrix for Static Stabilization
For the 7-dimensional quote system with 3 informative modes, the 21 embedded angles are arranged in a matrix. Static stabilization zeros out the angles for noisy modes 4–6:
The whole point is to produce cleaner, more stable correlation estimates. The paper shows a progression:
Compare
Original correlations show wide dispersion across the 23-day sample. Each circle represents one element of the correlation matrix across days.
Ledoit–Wolf shrinkage is a classical technique that blends a noisy empirical correlation matrix with an identity matrix: Σ_shr = αΣ̂ + (1−α)I. But it applies to the whole matrix uniformly.
The insight here is more surgical: you can rotate away the informative modes first, apply shrinkage only to the noisy remainder, and then rotate back. Static stabilization with angles set to zero is equivalent to shrinkage with α = 0 for the noisy modes: maximum shrinkage, targeted precisely where it's needed.
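A hedged sketch of the contrast (our own helpers; the paper works in the rotated angle coordinates, whereas this toy operates directly on the eigendecomposition):

```python
import numpy as np

def uniform_shrink(corr, alpha):
    """Classical Ledoit-Wolf-style blend: alpha * corr + (1 - alpha) * I."""
    return alpha * corr + (1 - alpha) * np.eye(corr.shape[0])

def targeted_shrink(corr, n_informative):
    """Keep the top modes; give the noisy remainder maximum shrinkage
    (its subspace is replaced by its identity share, i.e. alpha = 0 there)."""
    vals, vecs = np.linalg.eigh(corr)
    order = np.argsort(-vals)
    vals, vecs = vals[order], vecs[:, order]
    kept = vecs[:, :n_informative]
    top = kept @ np.diag(vals[:n_informative]) @ kept.T
    resid = np.eye(corr.shape[0]) - kept @ kept.T   # projector onto noisy modes
    return top + resid

C = np.array([[1.0, 0.3],
              [0.3, 1.0]])
assert np.allclose(uniform_shrink(C, 0.0), np.eye(2))   # full uniform shrinkage
assert np.allclose(targeted_shrink(C, 2), C)            # nothing labeled noisy
```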
1. Software-generated eigenvectors have a fundamental sign ambiguity. An algorithm exists (with free code) to fix it consistently.
2. The original fix saw the world through a half-circle. The new "modified arctan2" method sees the full circle by recognizing that only the first rotation in each subspace needs the wide view.
3. The full-circle view cleanly separates signal from noise in eigenvectors, matching what eigenvalue theory predicts but providing independent confirmation.
4. Eigenvectors can be stabilized over time: smooth the informative ones (dynamic), fix the noisy ones (static). This produces cleaner correlation matrices.
5. Both methods serve different purposes: arcsin for regression applications, arctan2 for directional statistics. The thucyd Python package supports both.
6. A real-world test on FX markets found that the new method revealed an outlier caused by a US CPI announcement, invisible with the old method.