A1
Neurons communicate via brief electrical pulses called action potentials. When a neuron fires, calcium ions flood into the cell. We exploit this by using calcium imaging: neurons are genetically engineered to express GCaMP6s, a protein that fluoresces in the presence of calcium. A miniature microscope implanted on the mouse's head records this fluorescence, producing a brightness time series for each neuron at 30 frames per second.
GCaMP6s has a ~600 ms decay time, which acts as a low-pass filter on neural activity. Frequencies above ~7 Hz are not reliably resolved. This constrains our analysis to the 0.01–7 Hz range.
Our dataset contains 3,938 neurons across 18 sessions, totaling ~87 minutes of recording at 30 Hz.
A2
The microscope captures 30 snapshots per second, giving a sampling rate of \( f_s = 30 \) Hz. The Nyquist-Shannon theorem states that the maximum resolvable frequency is half the sampling rate:
\[ f_{\text{Nyquist}} = \frac{f_s}{2} = 15 \text{ Hz} \]
Frequencies above the Nyquist limit cannot be accurately represented and produce distortion called aliasing. In practice, the GCaMP indicator's slow dynamics further restrict the useful bandwidth to ~7 Hz.
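Aliasing can be illustrated with a small numerical sketch. Only the 30 Hz sampling rate comes from the text; the 20 Hz test tone and the helper name `dominant_frequency` are illustrative assumptions, not part of the recording pipeline.

```python
import numpy as np

fs = 30.0                 # sampling rate in Hz (from the text)
nyquist = fs / 2          # 15 Hz
t = np.arange(300) / fs   # 10 s of samples

def dominant_frequency(signal, fs):
    """Return the frequency (Hz) of the largest non-DC peak in the FFT."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1 / fs)
    return freqs[1:][np.argmax(spectrum[1:])]

f_true = 20.0             # hypothetical tone above the Nyquist limit
apparent = dominant_frequency(np.sin(2 * np.pi * f_true * t), fs)
```

Because 20 Hz lies above the 15 Hz limit, the sampled tone is indistinguishable from one at \( f_s - 20 = 10 \) Hz, which is where the spectral peak appears.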
A3
Calcium signals exhibit slow downward drift from photobleaching (the fluorescent protein losing brightness over time). We remove this by fitting and subtracting a linear trend:
\[ x_{\text{detrended}}(t) = x(t) - (a t + b) \]
where \( a \) is the slope and \( b \) the intercept of the best-fit line.
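A minimal sketch of linear detrending, assuming an ordinary least-squares fit over frame indices. The function name `linear_detrend` and the synthetic trace values are hypothetical.

```python
import numpy as np

def linear_detrend(x):
    """Subtract the least-squares line a*t + b from a 1-D trace."""
    x = np.asarray(x, dtype=float)
    t = np.arange(len(x))
    a, b = np.polyfit(t, x, deg=1)   # slope a, intercept b
    return x - (a * t + b)

# Synthetic trace: slow downward drift plus noise (illustrative values only)
rng = np.random.default_rng(0)
t = np.arange(900)                   # 30 s at 30 Hz
trace = 100.0 - 0.02 * t + rng.normal(scale=0.5, size=t.size)
detrended = linear_detrend(trace)
residual_slope = np.polyfit(t, detrended, deg=1)[0]
```

After detrending, a second line fit to the result has essentially zero slope and the trace has zero mean.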
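Behavior video (25 fps) and calcium imaging (30 fps) are recorded at different rates. We resample behavior labels to 30 fps using nearest-neighbor interpolation, so that each target frame \( j \) takes the label of the nearest behavior frame:
\[ \ell_{30}[j] = \ell_{25}\!\left[ \operatorname{round}\!\left( j \, \frac{n - 1}{m - 1} \right) \right], \quad j = 0, \ldots, m - 1 \]
where \( \ell_{25} \) and \( \ell_{30} \) are the original and resampled label sequences, \( n \) is the number of behavior frames, and \( m \) is the target number of calcium-matched frames.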
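One way to sketch nearest-neighbor label resampling is shown here. The index mapping (evenly spaced positions rounded to the nearest source frame) is an assumption about how the alignment is done, and `resample_labels_nearest` is a hypothetical helper name.

```python
import numpy as np

def resample_labels_nearest(labels, m):
    """Map n behavior labels onto m target frames by nearest source index."""
    labels = np.asarray(labels)
    n = len(labels)
    # m evenly spaced positions across the n source frames, rounded to integers
    idx = np.round(np.linspace(0, n - 1, m)).astype(int)
    return labels[idx]

# One second of behavior video (25 frames) mapped to 30 calcium frames
behavior = np.array([0] * 10 + [1] * 15)   # 0 = solo, 1 = social (toy labels)
matched = resample_labels_nearest(behavior, m=30)
```

Nearest-neighbor mapping keeps labels categorical: no intermediate values are created, and the first and last frames are preserved.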
Behavior video is processed through SLEAP (a deep-learning pose tracker, 15 body keypoints per animal). A frame is labeled "social" when the resident mouse's nose is within 10 pixels of the intruder's nearest body part.
A4
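The SMI quantifies each neuron's preference for social vs. solo states as a normalized difference in mean activity:
\[ \text{SMI} = \frac{\mu_{\text{social}} - \mu_{\text{solo}}}{\mu_{\text{social}} + \mu_{\text{solo}} + \varepsilon} \]
where \( \mu_{\text{social}} \) and \( \mu_{\text{solo}} \) are the neuron's mean activity during social and solo frames, and \( \varepsilon = 10^{-12} \) prevents division by zero when both means are near zero.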
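A short sketch of the index, assuming the normalized difference takes the common form \( (\mu_{\text{social}} - \mu_{\text{solo}}) / (\mu_{\text{social}} + \mu_{\text{solo}} + \varepsilon) \) and that activity values are non-negative. The function name `smi` and the toy inputs are hypothetical.

```python
import numpy as np

def smi(trace, is_social, eps=1e-12):
    """Normalized difference in mean activity between social and solo frames."""
    trace = np.asarray(trace, dtype=float)
    is_social = np.asarray(is_social, dtype=bool)
    mu_social = trace[is_social].mean()
    mu_solo = trace[~is_social].mean()
    # eps keeps the ratio finite when both means are near zero
    return (mu_social - mu_solo) / (mu_social + mu_solo + eps)

example = smi([1.0, 1.0, 3.0, 3.0], [False, False, True, True])   # social mean 3, solo mean 1
silent = smi([0.0, 0.0, 0.0, 0.0], [False, False, True, True])    # both means zero
```

Positive values indicate higher activity during social frames, negative values higher activity during solo frames.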
To test whether each neuron's SMI is significantly different from zero, a Wilcoxon rank-sum test compares the neuron's social and solo activity distributions. With 3,938 neurons tested simultaneously, the resulting p-values are corrected using Benjamini-Hochberg FDR (false discovery rate) at α = 0.05. This controls the expected proportion of false positives among all neurons declared significant:
\[ \text{FDR} = \mathbb{E}\!\left[ \frac{V}{\max(R, 1)} \right] \le \alpha \]
where \( V \) is the number of false positives and \( R \) is the total number of neurons declared significant.
87% of neurons pass BH-FDR correction, showing significant modulation — but in both directions (some excited, some suppressed). This heterogeneity means population averaging cancels out opposing responses.
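The Benjamini-Hochberg step-up procedure can be sketched as follows. This is a generic textbook implementation, not necessarily the exact routine used for the analysis, and the p-values in the example are made up.

```python
import numpy as np

def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean mask of tests declared significant at FDR level alpha."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)
    # The i-th smallest p-value is compared against alpha * i / m
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    significant = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])     # largest rank meeting its threshold
        significant[order[: k + 1]] = True   # reject it and every smaller p-value
    return significant

mask = benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.6], alpha=0.05)
```

In the toy example the thresholds are 0.01, 0.02, 0.03, 0.04, and 0.05, so only the two smallest p-values are declared significant.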
A5
Any signal can be decomposed into a sum of sine waves at different frequencies. The Discrete Fourier Transform (DFT) computes this decomposition for sampled data. Given \( N \) samples \( x[0], \ldots, x[N-1] \):
\[ X[k] = \sum_{n=0}^{N-1} x[n] \, e^{-j 2\pi k n / N}, \quad k = 0, \ldots, N-1 \]
The power at frequency bin \( k \) is the squared magnitude:
\[ P[k] = |X[k]|^2 \]
Bin \( k \) corresponds to real-world frequency \( f = k \cdot f_s / N \) Hz.
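The DFT definition and the bin-to-frequency mapping can be checked numerically. This sketch evaluates the sum directly as a matrix product and compares it with NumPy's FFT; the 5 Hz test tone and the 60-sample length are arbitrary choices.

```python
import numpy as np

def dft(x):
    """Direct O(N^2) evaluation of X[k] = sum_n x[n] * exp(-2j*pi*k*n/N)."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    n = np.arange(N)
    k = n.reshape(-1, 1)
    return np.exp(-2j * np.pi * k * n / N) @ x

fs = 30.0
N = 60
t = np.arange(N) / fs
x = np.cos(2 * np.pi * 5.0 * t)       # 5 Hz test tone

X = dft(x)
power = np.abs(X) ** 2                # squared magnitude per bin
k_peak = int(np.argmax(power[: N // 2]))
f_peak = k_peak * fs / N              # convert bin index to Hz
```

With \( N = 60 \) and \( f_s = 30 \) Hz, each bin spans 0.5 Hz, so the 5 Hz tone lands in bin 10.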
A6
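A single FFT of the entire signal produces a noisy spectral estimate. Welch's method (1967) reduces this variance by averaging spectra across overlapping, windowed segments:
\[ \hat{P}(f) = \frac{1}{K} \sum_{i=1}^{K} \left| \text{FFT}\{ w \cdot x_i \}(f) \right|^2 \]
where \( x_i \) is the \( i \)-th of \( K \) overlapping segments and \( w \) is the window function (normalization constants omitted).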
Figure A1. Welch's method: the signal is split into overlapping segments, each windowed and FFT'd, then the resulting power spectra are averaged to reduce noise.
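A minimal sketch using `scipy.signal.welch`. The Hann window, 10-second segments, and 50% overlap are assumptions chosen for illustration; the text does not specify the segment parameters. The test signal is synthetic.

```python
import numpy as np
from scipy.signal import welch

fs = 30.0
rng = np.random.default_rng(0)
t = np.arange(int(120 * fs)) / fs                  # 2 minutes of samples
x = np.sin(2 * np.pi * 5.0 * t) + rng.normal(scale=1.0, size=t.size)

# 300-sample segments give 0.1 Hz frequency resolution at 30 Hz
freqs, psd = welch(x, fs=fs, window="hann", nperseg=300, noverlap=150)
f_peak = freqs[np.argmax(psd)]
```

Longer segments improve frequency resolution but leave fewer segments to average, so the estimate becomes noisier; the segment length trades one against the other.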
A7
| Band | Range | Period | Captures |
|---|---|---|---|
| Infraslow | 0.01 – 0.1 Hz | 10 – 100 s | Very slow drifts; partly photobleaching artifact |
| Slow | 0.1 – 1.0 Hz | 1 – 10 s | Network-level synchronization |
| Delta | 1.0 – 4.0 Hz | 0.25 – 1 s | Slow rhythmic activity |
| Theta | 4.0 – 7.0 Hz | 0.14 – 0.25 s | Social behavior and attention |
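Total power in a band is computed by integrating the PSD, denoted \( S(f) \), using the trapezoidal rule:
\[ P_{\text{band}} = \int_{f_{\text{lo}}}^{f_{\text{hi}}} S(f) \, df \approx \sum_{i} \frac{S(f_i) + S(f_{i+1})}{2} \, (f_{i+1} - f_i) \]
For cross-neuron comparison, band power is normalized by total power:
\[ \tilde{P}_{\text{band}} = \frac{P_{\text{band}}}{\sum_{b} P_b} \]
where the sum runs over the four bands.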
This yields a value in [0, 1] representing what fraction of the neuron's power resides in each band. Fractional power is used for spectral clustering.
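Band integration and normalization can be sketched as below. The band edges come from the table above; the helper names, the inclusive treatment of band edges, and the flat test spectrum are assumptions for illustration.

```python
import numpy as np

BANDS = {
    "infraslow": (0.01, 0.1),
    "slow": (0.1, 1.0),
    "delta": (1.0, 4.0),
    "theta": (4.0, 7.0),
}

def trapezoid(y, x):
    """Trapezoidal rule: sum of mean heights times interval widths."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def band_powers(freqs, psd):
    """Integrate the PSD over each band (band edges treated as inclusive)."""
    out = {}
    for name, (lo, hi) in BANDS.items():
        mask = (freqs >= lo) & (freqs <= hi)
        out[name] = trapezoid(psd[mask], freqs[mask])
    return out

def fractional_powers(powers):
    """Divide each band power by the sum over all bands."""
    total = sum(powers.values())
    return {name: p / total for name, p in powers.items()}

# Flat spectrum: each band's power should be close to its width in Hz
freqs = np.linspace(0.0, 15.0, 1501)
psd = np.ones_like(freqs)
powers = band_powers(freqs, psd)
fractions = fractional_powers(powers)
```

For a flat spectrum, delta and theta each hold roughly 3 Hz worth of power, so each accounts for a little under half of the total.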
A8
In Section A7, we computed how much total power a neuron has in each frequency band. That tells us, on average, how strong each band is — but it collapses the entire recording into a single number. A bandpass filter does something different: it isolates a specific frequency band as a time-varying signal, so we can watch how that band's activity rises and falls moment to moment.
For example, applying a 4–7 Hz bandpass filter to a neuron's trace removes all activity outside the theta range. The output is a new waveform that oscillates purely at theta frequencies. This lets us overlay theta activity onto behavioral timestamps and visually inspect whether theta power increases during social contact (as shown in Figure 5 of the main paper).
Every filter has a frequency response \( H(f) \) that describes how much it amplifies or suppresses each frequency. For a bandpass filter, frequencies inside the passband (e.g., 4–7 Hz) pass through with gain ≈ 1 (unchanged), while frequencies outside the passband are multiplied by a gain close to 0 (suppressed). The transition between passband and stopband is not instantaneous — it follows a gradual rolloff curve.
The Butterworth design is defined by having the flattest possible response inside the passband. For a lowpass Butterworth filter of order \( N \) with cutoff \( f_c \):
\[ |H(f)| = \frac{1}{\sqrt{1 + (f / f_c)^{2N}}} \]
At the cutoff frequency \( f = f_c \), the gain is exactly \( 1/\sqrt{2} \approx 0.707 \) (i.e., power is halved, the −3 dB point). Frequencies well below \( f_c \) pass through nearly unchanged; frequencies well above \( f_c \) are attenuated. For a bandpass filter, two cutoffs are combined: a lower cutoff \( f_{\text{lo}} \) and an upper cutoff \( f_{\text{hi}} \).
Figure A2. Butterworth bandpass frequency response for the theta band (4–7 Hz). Order 4 (blue) produces a sharper rolloff than order 2 (gray dashed). The passband region is shaded.
The order \( N \) controls how sharply the filter transitions from passband to stopband. Higher orders produce sharper rolloff, but they also introduce more ringing (oscillatory artifacts near the edges) and can become numerically unstable. We use order \( N = 4 \).
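A sketch of a theta-band Butterworth bandpass with SciPy. The second-order-sections form and the zero-phase (forward-backward) filtering via `sosfiltfilt` are assumptions made here for numerical stability and to avoid phase lag; the text does not state how the filter is applied. The helper name `bandpass` and the test tones are hypothetical.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass(x, f_lo, f_hi, fs=30.0, order=4):
    """Butterworth bandpass applied forward and backward (zero phase)."""
    sos = butter(order, [f_lo, f_hi], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

fs = 30.0
t = np.arange(1800) / fs                      # 60 s of samples
in_band = np.sin(2 * np.pi * 5.5 * t)         # inside the 4-7 Hz passband
out_band = np.sin(2 * np.pi * 1.0 * t)        # outside the passband
theta = bandpass(in_band + out_band, 4.0, 7.0, fs=fs)

# Compare away from the edges, where filter transients can distort the output
core = slice(300, -300)
match = np.corrcoef(theta[core], in_band[core])[0, 1]
```

Forward-backward filtering applies the magnitude response twice, so the effective rolloff is steeper than that of a single pass at the same order.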
A9
After bandpass filtering, a neuron's signal oscillates up and down at the band's frequency. During a period of strong theta activity, the filtered signal swings between large positive and negative values. During weak theta activity, the swings are small. The overall "loudness" of the oscillation is changing over time, but the signal itself keeps crossing zero, making it hard to directly compare against behavioral states.
We need a way to extract a smooth, non-negative curve that tracks how strong the oscillation is at each moment — ignoring the up-and-down cycling. This curve is called the envelope, or equivalently the instantaneous amplitude.
Consider a sine wave: \( x(t) = A \cos(2\pi f t) \). Its amplitude is \( A \), but measuring \( A \) from the signal alone is not straightforward because the signal passes through zero twice per cycle. If we had a second copy of the signal shifted by exactly 90° (a quarter cycle), namely \( A \sin(2\pi f t) \), we could combine them:
\[ \sqrt{\big(A \cos(2\pi f t)\big)^2 + \big(A \sin(2\pi f t)\big)^2} = A \sqrt{\cos^2(2\pi f t) + \sin^2(2\pi f t)} = A \]
The squared cosine and squared sine always sum to 1, so the magnitude recovers the amplitude \( A \) regardless of where we are in the oscillation cycle. The Hilbert transform manufactures that companion signal — it shifts every frequency component by exactly −90° in phase without changing any amplitudes, so we can compute the amplitude at every time point.
Formally, the Hilbert transform \( \mathcal{H}\{x(t)\} \) shifts every positive-frequency component of \( x(t) \) by −90° in phase without changing any amplitudes. For a pure cosine, it produces a sine. For a multi-frequency signal (like our bandpass-filtered neural trace), it shifts each constituent frequency independently by the same −90°.
Combining the original signal with its Hilbert transform produces the analytic signal:
\[ z(t) = x(t) + j \, \mathcal{H}\{x(t)\} \]
Here \( j = \sqrt{-1} \). The analytic signal is complex-valued: the real part is the original signal, and the imaginary part is the 90°-shifted version.
The envelope is the magnitude (absolute value) of the analytic signal at each time point:
\[ A(t) = |z(t)| = \sqrt{x(t)^2 + \mathcal{H}\{x(t)\}^2} \]
This produces a smooth, non-negative curve that traces the peaks of the oscillation. When the oscillation is large, the envelope is high. When the oscillation is small, the envelope is low. The rapid up-and-down cycling is removed, leaving only the slow amplitude modulation.
Figure A4. The Hilbert envelope (red) traces the instantaneous amplitude of a bandpass-filtered oscillation (gray). The envelope is always non-negative and smooth, making it easy to compare against behavioral states.
Taking \( |x(t)| \) directly would produce a bumpy, pulsating curve (flipping negative half-cycles upward) rather than a smooth envelope. The Hilbert approach yields a genuinely smooth curve because the 90° companion fills in the gaps between peaks, providing continuous amplitude information.
A10
Purpose: Convert raw calcium traces into a compact set of numbers that a classifier can use — 6 spectral features per 1-second window.
Each neuron's signal is divided into 1-second, non-overlapping windows (30 frames). Windows are labeled by behavioral purity:
Each window is linearly detrended, then its Welch PSD is computed and integrated into band powers.
| Feature | Description |
|---|---|
| Infraslow power | Integrated PSD, 0.01–0.1 Hz |
| Slow power | Integrated PSD, 0.1–1 Hz |
| Delta power | Integrated PSD, 1–4 Hz |
| Theta power | Integrated PSD, 4–7 Hz |
| Spectral entropy | Shannon entropy of the normalized PSD (see below) |
| Theta/delta ratio | \( P_\theta / \max(P_\delta, \, \varepsilon) \), where \( \varepsilon = 10^{-12} \) |
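The PSD is normalized to a probability distribution, then Shannon entropy is computed:
\[ p_k = \frac{S(f_k)}{\sum_{j} S(f_j)}, \qquad H = -\sum_{k} p_k \log p_k \]
where \( S(f_k) \) is the PSD at frequency bin \( k \).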
Low entropy = power concentrated at few frequencies. High entropy = power spread broadly.
The theta/delta ratio captures spectral shape (which band dominates) rather than raw power magnitude, making it robust to overall signal brightness. It emerged as the top discriminative feature for Cluster 1 classification.
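The two shape features can be sketched as follows. The base-2 logarithm is an assumption (the base only sets the units of entropy), and the function names and toy spectra are hypothetical.

```python
import numpy as np

def spectral_entropy(psd):
    """Shannon entropy (in bits) of the PSD normalized to sum to 1."""
    psd = np.asarray(psd, dtype=float)
    p = psd / psd.sum()
    p = p[p > 0]                      # treat 0 * log(0) as 0
    return float(-np.sum(p * np.log2(p)))

def theta_delta_ratio(p_theta, p_delta, eps=1e-12):
    """Theta power divided by delta power, guarded against a zero denominator."""
    return p_theta / max(p_delta, eps)

flat = spectral_entropy(np.ones(16))                      # power spread evenly over 16 bins
peaked = spectral_entropy(np.array([0.0, 0.0, 1.0, 0.0])) # all power in one bin
ratio = theta_delta_ratio(2.0, 4.0)
```

A flat spectrum over 16 bins gives the maximum entropy of \( \log_2 16 = 4 \) bits, while a single-bin spectrum gives zero.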
A11
Purpose: Discover whether neurons form distinct spectral subpopulations — splitting the population before classification reveals a signal that the whole-population average hides.
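Neurons are clustered by their 4-dimensional fractional band power vectors (infraslow, slow, delta, theta fractions summing to 1). K-Means iterates between two steps until convergence:
1. Assignment: each neuron is assigned to its nearest centroid, \( c_i = \arg\min_k \lVert \mathbf{x}_i - \boldsymbol{\mu}_k \rVert^2 \).
2. Update: each centroid is recomputed as the mean of its assigned neurons, \( \boldsymbol{\mu}_k = \frac{1}{|C_k|} \sum_{i \in C_k} \mathbf{x}_i \).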
We evaluated k = 2 through 6 using the silhouette score:
\[ s(i) = \frac{b(i) - a(i)}{\max\big(a(i), \, b(i)\big)} \]
where \( a(i) \) = mean intra-cluster distance and \( b(i) \) = mean nearest-cluster distance. Range: [−1, +1]; higher is better. k = 2 consistently achieved the best score (~0.399) across all 18 sessions.
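The clustering and model-selection loop can be sketched in plain NumPy. This is a simplified stand-in, not the analysis code: the farthest-point initialization is an assumption chosen to make the example deterministic, and the two synthetic groups of fractional band powers are made up.

```python
import numpy as np

def kmeans(X, k, n_iter=100):
    """Lloyd's algorithm with farthest-point initialization."""
    centroids = [X[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        centroids.append(X[np.argmax(d)])
    centroids = np.array(centroids)
    for _ in range(n_iter):
        # Step 1 (assignment): nearest centroid for every point
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        # Step 2 (update): centroid becomes the mean of its members
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

def silhouette_score(X, labels):
    """Mean of s(i) = (b - a) / max(a, b) over all points."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    scores = []
    for i in range(len(X)):
        same = labels == labels[i]
        if same.sum() == 1:               # singleton cluster: s(i) defined as 0
            scores.append(0.0)
            continue
        a = D[i, same].sum() / (same.sum() - 1)
        b = min(D[i, labels == c].mean() for c in np.unique(labels) if c != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

# Two synthetic groups of 4-D fractional band powers (rows sum to 1)
rng = np.random.default_rng(0)
group_a = rng.dirichlet(200 * np.array([0.10, 0.60, 0.25, 0.05]), size=60)
group_b = rng.dirichlet(200 * np.array([0.05, 0.30, 0.35, 0.30]), size=40)
X = np.vstack([group_a, group_b])

scores = {k: silhouette_score(X, kmeans(X, k)[0]) for k in range(2, 7)}
best_k = max(scores, key=scores.get)
```

On data with two well-separated groups, splitting either group further lowers the silhouette score, so the sweep selects k = 2.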
A12
Purpose: Test whether spectral features can actually predict social vs. solo behavior — the classifier is the evaluation tool, not the contribution.
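Features are z-scored using training-set statistics only:
\[ z = \frac{x - \mu_{\text{train}}}{\sigma_{\text{train}}} \]
where \( \mu_{\text{train}} \) and \( \sigma_{\text{train}} \) are the per-feature mean and standard deviation computed on the training windows.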
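A brief sketch of z-scoring without leakage: the mean and standard deviation are estimated on the training windows and then reused unchanged on the test windows. The helper name and the synthetic feature matrices are hypothetical.

```python
import numpy as np

def zscore_with_train_stats(X_train, X_test):
    """Standardize both sets using the training mean and standard deviation."""
    mu = X_train.mean(axis=0)
    sigma = X_train.std(axis=0)
    sigma = np.where(sigma == 0, 1.0, sigma)   # avoid dividing by zero for constant features
    return (X_train - mu) / sigma, (X_test - mu) / sigma

rng = np.random.default_rng(0)
X_train = rng.normal(loc=5.0, scale=2.0, size=(200, 6))   # 6 spectral features per window
X_test = rng.normal(loc=5.0, scale=2.0, size=(50, 6))
Z_train, Z_test = zscore_with_train_stats(X_train, X_test)
```

The test features will not have exactly zero mean or unit variance after this step, which is expected: no test-set statistic has influenced the transform.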
Three linear classifiers were tested:
Social and solo windows are often imbalanced. Balanced class weights increase the misclassification penalty for the minority class, preventing the classifier from defaulting to the majority label.
The primary metric is Area Under the ROC Curve. The ROC plots true positive rate vs. false positive rate across all classification thresholds. AUC = 0.5 indicates chance; AUC = 1.0 indicates perfect separation.
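The ROC construction can be sketched by sweeping a threshold over the classifier scores and integrating the resulting curve. The function names and the four-sample example are illustrative.

```python
import numpy as np

def roc_points(y_true, scores):
    """True and false positive rates at every distinct score threshold."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    # Start above every score (nothing predicted positive), then lower the threshold
    thresholds = np.concatenate(([np.inf], np.unique(scores)[::-1]))
    pos, neg = scores[y_true == 1], scores[y_true == 0]
    tpr = np.array([(pos >= th).mean() for th in thresholds])
    fpr = np.array([(neg >= th).mean() for th in thresholds])
    return fpr, tpr

def auc_from_roc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(0.5 * (tpr[1:] + tpr[:-1]) * np.diff(fpr)))

fpr, tpr = roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
auc = auc_from_roc(fpr, tpr)
perfect = auc_from_roc(*roc_points([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9]))
```

In the first example, three of the four positive-negative pairs are ranked correctly, giving AUC = 0.75.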
A13
Purpose: Check whether the two spectral clusters are spatially segregated or intermixed — ruling out the possibility that cluster identity is just a location artifact.
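Each neuron's spatial footprint \( w(x, y) \) (from CNMF-E source extraction) is reduced to a centroid via center of mass:
\[ \bar{x} = \frac{\sum_{x, y} x \, w(x, y)}{\sum_{x, y} w(x, y)}, \qquad \bar{y} = \frac{\sum_{x, y} y \, w(x, y)}{\sum_{x, y} w(x, y)} \]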
Euclidean distances between centroids are computed to test spatial organization:
\[ d_{ij} = \sqrt{(\bar{x}_i - \bar{x}_j)^2 + (\bar{y}_i - \bar{y}_j)^2} \]
If Cluster 1 neurons were spatially grouped, the intra-cluster nearest-neighbor (NN) distance would be smaller than the cross-cluster NN distance. The opposite is observed: Cluster 1 neurons are spatially intermixed with Cluster 0, indicating that spectral identity is an intrinsic cellular property rather than a location-dependent one.
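The centroid and nearest-neighbor computations can be sketched as below. The helper names and the toy coordinates are hypothetical, and the footprint is assumed to be a 2-D non-negative array indexed as (row, column).

```python
import numpy as np

def centroid(footprint):
    """Center of mass (x, y) of a 2-D non-negative footprint image."""
    ys, xs = np.indices(footprint.shape)
    total = footprint.sum()
    return (xs * footprint).sum() / total, (ys * footprint).sum() / total

def mean_nn_distance(points_a, points_b, same_set=False):
    """Mean distance from each point in A to its nearest neighbor in B."""
    D = np.linalg.norm(points_a[:, None, :] - points_b[None, :, :], axis=2)
    if same_set:
        np.fill_diagonal(D, np.inf)   # a point is not its own neighbor
    return float(D.min(axis=1).mean())

footprint = np.zeros((5, 6))
footprint[2, 3] = 1.0                 # single bright pixel at row 2, column 3
cx, cy = centroid(footprint)

cluster1 = np.array([[0.0, 0.0], [3.0, 4.0]])
cluster0 = np.array([[3.0, 4.0], [6.0, 8.0]])
intra = mean_nn_distance(cluster1, cluster1, same_set=True)
cross = mean_nn_distance(cluster1[:1], cluster0)
```

Comparing `intra` against `cross` across all neurons of a cluster is the basis for the grouped-versus-intermixed comparison described above.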
Each neuron's spatial footprint is weighted by its band power and overlaid to produce maps of the spatial distribution of delta and theta activity across the field of view.
A14
We use GroupKFold (5 folds) with recording sessions as groups. In each fold, all windows from a set of sessions are held out for testing while the remaining sessions are used for training. No data from a test session appears during training.
Performance is reported as the mean AUC across folds.
Figure A5. GroupKFold cross-validation. In each fold, entire recording sessions are held out for testing (red). No windows from a test session appear in training (blue).
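A minimal sketch of session-grouped splitting with scikit-learn's `GroupKFold`. The 18 sessions and 5 folds come from the text; the number of windows per session and the random features and labels are placeholders.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

rng = np.random.default_rng(0)
n_sessions, windows_per_session = 18, 20            # 20 windows/session is a placeholder
groups = np.repeat(np.arange(n_sessions), windows_per_session)
X = rng.normal(size=(groups.size, 6))                # 6 spectral features per window
y = rng.integers(0, 2, size=groups.size)             # toy social/solo labels

folds = list(GroupKFold(n_splits=5).split(X, y, groups=groups))

# No session may contribute windows to both the training and the test side
no_leak = all(set(groups[train]).isdisjoint(groups[test]) for train, test in folds)
tested = np.sort(np.concatenate([test for _, test in folds]))
```

Splitting by session rather than by window prevents windows from the same recording, which share slow drifts and imaging conditions, from appearing on both sides of a fold.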
A15
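A permutation test determines whether classification performance exceeds what would be expected by chance. The procedure:
1. Train and evaluate the classifier on the real labels to obtain the observed AUC.
2. Shuffle the social/solo labels, retrain, and record the AUC; repeat for 100 shuffles to build a null distribution.
3. Compute the p-value as the fraction of shuffled AUCs that equal or exceed the observed AUC:
\[ p = \frac{\#\{ \text{AUC}_{\text{shuffled}} \ge \text{AUC}_{\text{real}} \}}{100} \]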
If \( p < 0.05 \), the classifier performs significantly better than chance.
We run this test three times: once on the full population, and once on each spectral cluster separately. This is the key comparison — the whole-population classifier fails the permutation test (p = 0.196), but the Cluster 1 classifier passes (p = 0.020), revealing that the social signal is concentrated in the theta-enriched minority rather than spread across all neurons.
Figure A6. Permutation test. The gray histogram shows the distribution of AUC scores from 100 label-shuffled runs (null distribution). The red line marks the real classifier's AUC. The p-value is the fraction of shuffled AUCs that equal or exceed the real AUC.
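The permutation logic can be sketched end to end on synthetic data. The classifier here is a deliberately simple stand-in (projection onto the difference of class means), and a single fixed train/test split replaces the cross-validation; both are assumptions made to keep the example short. Only the 100 shuffles and the p-value definition follow the figure caption.

```python
import numpy as np

def auc(y, s):
    """Probability that a random positive outscores a random negative."""
    pos, neg = s[y == 1], s[y == 0]
    wins = (pos[:, None] > neg).sum() + 0.5 * (pos[:, None] == neg).sum()
    return wins / (len(pos) * len(neg))

def fit_and_score(X, y, n_train):
    """Train a mean-difference linear scorer on the first n_train rows, test on the rest."""
    X_tr, y_tr, X_te, y_te = X[:n_train], y[:n_train], X[n_train:], y[n_train:]
    w = X_tr[y_tr == 1].mean(axis=0) - X_tr[y_tr == 0].mean(axis=0)
    return auc(y_te, X_te @ w)

rng = np.random.default_rng(0)
n = 400
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 6)) + 1.0 * y[:, None]       # synthetic features carrying real signal

real_auc = fit_and_score(X, y, n_train=200)
# Null distribution: break the feature-label link by shuffling labels, then retrain
null_aucs = np.array([fit_and_score(X, rng.permutation(y), 200) for _ in range(100)])
p_value = float(np.mean(null_aucs >= real_auc))
```

With 100 shuffles the smallest nonzero p-value is 0.01, which sets the resolution of the test.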
A16
To test whether band power differs between social and solo windows, we use the Wilcoxon rank-sum test (also called the Mann-Whitney U test). This is a non-parametric test: it makes no assumptions about the shape of the data distribution, only that the two groups are independent.
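The procedure:
1. Pool the band-power values from social and solo windows and rank them from smallest to largest.
2. Sum the ranks belonging to each condition.
3. Compare the observed rank sum against its distribution under the null hypothesis that condition labels are exchangeable, which yields the p-value.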
A small p-value means the band-power values in one condition tend to be systematically larger than in the other; that is, the two rank distributions are shifted apart. This test generates the p-values that Bonferroni correction (A18) is then applied to.
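The rank-sum computation can be written out by hand and checked against `scipy.stats.ranksums`. This sketch uses the large-sample normal approximation without tie correction, which is what `ranksums` also does; the two synthetic samples are made up.

```python
import numpy as np
from scipy.stats import norm, rankdata, ranksums

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test via the normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = rankdata(np.concatenate([x, y]))       # rank the pooled values
    w = ranks[:n1].sum()                           # rank sum of the first group
    mean_w = n1 * (n1 + n2 + 1) / 2                # expected rank sum under the null
    sd_w = np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    return z, 2 * norm.sf(abs(z))

rng = np.random.default_rng(0)
social = rng.normal(loc=0.3, scale=1.0, size=200)  # toy band-power values
solo = rng.normal(loc=0.0, scale=1.0, size=200)

z, p = rank_sum_test(social, solo)
reference = ranksums(social, solo)
```

Because only ranks enter the statistic, a few extreme band-power values cannot dominate the result the way they can in a t-test.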
A17
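A p-value indicates whether an effect exists; Cohen's d quantifies its magnitude. It measures the distance between two group means in units of pooled standard deviation:
\[ d = \frac{\bar{x}_1 - \bar{x}_2}{s_p}, \qquad s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}} \]
where \( \bar{x}_i \), \( s_i^2 \), and \( n_i \) are the mean, variance, and size of group \( i \).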
| \( \lvert d \rvert \) | Interpretation |
|---|---|
| ~0.2 | Small effect |
| ~0.5 | Medium effect |
| ~0.8 | Large effect |
Theta-band social vs. solo: d = 0.235 (small but consistent across 14/18 sessions). Delta-band: d = 0.069.
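Cohen's d with a pooled standard deviation can be sketched as follows. The function name and the two toy samples are illustrative, and sample variances use the unbiased (n − 1) denominator.

```python
import numpy as np

def cohens_d(x1, x2):
    """Difference in means divided by the pooled standard deviation."""
    x1, x2 = np.asarray(x1, dtype=float), np.asarray(x2, dtype=float)
    n1, n2 = len(x1), len(x2)
    pooled_var = ((n1 - 1) * x1.var(ddof=1) + (n2 - 1) * x2.var(ddof=1)) / (n1 + n2 - 2)
    return float((x1.mean() - x2.mean()) / np.sqrt(pooled_var))

# Means 5 and 3, both with sample variance 2.5
d = cohens_d([3, 4, 5, 6, 7], [1, 2, 3, 4, 5])
```

Here the means differ by 2 and the pooled standard deviation is \( \sqrt{2.5} \approx 1.58 \), so \( d \approx 1.26 \).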
A18
Testing 4 frequency bands simultaneously inflates the false-positive rate. The Bonferroni correction divides the significance threshold by the number of tests:
\[ \alpha_{\text{corrected}} = \frac{\alpha}{m} = \frac{0.05}{4} = 0.0125 \]
Delta and theta pass Bonferroni correction (both p < 0.001), with theta showing the strongest effect. Infraslow (p = 0.435) and slow (p = 0.088) do not reach significance.
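The threshold arithmetic can be checked in a few lines. The infraslow and slow p-values are those reported above; delta and theta are reported only as p < 0.001, so 0.001 is used here as a placeholder upper bound rather than an actual value.

```python
alpha = 0.05
p_values = {
    "infraslow": 0.435,
    "slow": 0.088,
    "delta": 0.001,   # placeholder upper bound (reported as p < 0.001)
    "theta": 0.001,   # placeholder upper bound (reported as p < 0.001)
}

threshold = alpha / len(p_values)                       # 0.05 / 4
passes = {band: p < threshold for band, p in p_values.items()}
```

The slow band (p = 0.088) would fail even at the uncorrected 0.05 level, so the correction does not change which bands are declared significant in this case.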