Welcome to EIT! (NeurIPS 2022)

The paper preprint can be found here. If you use our code, please cite our paper as below.

[cite] Ran Liu, Mehdi Azabou, Max Dabagia, Jingyun Xiao, and Eva L. Dyer. "Seeing the forest and the tree: Building representations of both individual and collective dynamics with transformers." Advances in Neural Information Processing Systems 35 (2022).


Complex systems (such as the brain) contain multiple individual elements (e.g. neurons) that interact dynamically to generate their outputs. The process by which these local interactions give rise to large-scale behaviors is important in many domains of science and engineering, from ecology and social networks to microbial interactions and brain dynamics. A natural way to model the activity of a system is to build a collective or population-level view, where we consider individuals (or channels) jointly to determine the dynamics of the population. However, studying systems from this population-level perspective might cause the model to lose sight of the contributions of different individuals’ dynamics to the final prediction or inference. Moving forward, we need methods that can build good population-level representations while also providing an interpretable view of the data at the individual level.


EIT (Embedded Interaction Transformer) presents a new framework for modeling time-varying observations of a system by using dynamic embeddings of individual channels to construct a population-level view.

EIT decomposes multi-channel time-series by first learning rich features from individual time-series and then incorporating information and learned interactions across the individuals in the population. One critical benefit of our model is spatial/individual separability: it builds a population-level representation from embeddings of individual channels, which naturally leads to channel-level permutation invariance. In domain generalization tasks, this means a trained model can be tested with permuted channels or an entirely different number of channels.
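To see why this separability yields permutation invariance: self-attention without positional encodings is permutation-equivariant, so pooling over channels produces the same population summary regardless of channel order. A minimal PyTorch sketch of this property (illustrative only, not taken from the EIT codebase):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# A transformer encoder with no positional encodings, followed by mean
# pooling, is invariant to the order of its input tokens -- here, channels.
enc = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=16, nhead=4, batch_first=True),
    num_layers=1,
).eval()  # eval() disables dropout so the comparison is deterministic

x = torch.randn(1, 5, 16)      # (batch, channels, features)
perm = torch.randperm(5)       # shuffle the channel order
with torch.no_grad():
    a = enc(x).mean(dim=1)           # population summary, original order
    b = enc(x[:, perm]).mean(dim=1)  # population summary, permuted order
print(torch.allclose(a, b, atol=1e-5))  # True
```

The same argument carries over to a different number of channels, since self-attention places no constraint on sequence length.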

Overview of EIT. (A) A traditional state-space view treats the collective dynamics as a population from the start and uses a population encoder to learn how the dynamics evolve over time, which creates a highly abstracted latent space. (B) EIT first learns individual dynamics with an individual encoder. After establishing individual representations, we feed them into an interaction encoder to build a population representation. The two encoders work together to build a representation space that is richer than that of the traditional approach. (C) The detailed architecture: EIT consists of an individual transformer that processes the data of each individual, an interaction transformer that processes embeddings at each timepoint, and two projection modules, one at the end of each transformer.
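The pipeline in panel (C) can be sketched as below. This is a hedged approximation, not the authors' implementation: the class name `EITSketch`, all dimensions, the mean-pooling step, and the projection heads are illustrative assumptions, and temporal positional encodings are omitted for brevity.

```python
import torch
import torch.nn as nn

class EITSketch(nn.Module):
    """Illustrative two-stage design: a per-channel temporal transformer,
    then a per-timepoint cross-channel transformer (not the official code)."""

    def __init__(self, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Linear(1, d_model)  # lift each scalar sample to d_model
        make_encoder = lambda: nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, n_heads, dim_feedforward=128,
                                       batch_first=True),
            n_layers,
        )
        self.individual = make_encoder()   # attends over time, one channel at a time
        self.interaction = make_encoder()  # attends over channels, one timepoint at a time
        self.indiv_proj = nn.Linear(d_model, d_model)  # projection head, individual level
        self.pop_proj = nn.Linear(d_model, d_model)    # projection head, population level

    def forward(self, x):
        # x: (batch, channels, time)
        B, C, T = x.shape
        h = self.embed(x.unsqueeze(-1))                  # (B, C, T, d)
        h = self.individual(h.reshape(B * C, T, -1))     # temporal attention per channel
        h = h.reshape(B, C, T, -1)
        z = h.permute(0, 2, 1, 3).reshape(B * T, C, -1)  # regroup by timepoint
        z = self.interaction(z)                          # cross-channel attention
        z = z.reshape(B, T, C, -1)
        indiv = self.indiv_proj(h)                       # (B, C, T, d) individual view
        pop = self.pop_proj(z.mean(dim=2))               # (B, T, d) population view
        return indiv, pop
```

In this sketch, channel-level permutation invariance of `pop` falls out of the design: the interaction transformer uses no channel positional encodings, and channels are mean-pooled.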


Our contributions are as follows:

  • We propose a novel framework that models individual dynamics before collective dynamics, and realize it with EIT, a multi-stage transformer. EIT learns both population-level and individual-level representations by decoupling the dynamics of individuals from their interactions in multivariate time-series.
  • We introduce methods for generalization across datasets with different input sizes (numbers of channels) and channel orderings. We further propose to measure the alignment of time-series across datasets through the Wasserstein divergence.
  • We apply EIT to both many-body systems and neural recordings. After demonstrating our model's robust decoding performance, we validate its ability to transfer individual dynamics by performing domain generalization across different populations of neurons and finding neuron alignments.
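As a rough illustration of the Wasserstein-based alignment measure mentioned above: for two equal-size 1-D samples, the empirical 1-Wasserstein distance reduces to the mean absolute difference of their sorted values. The function name and toy data below are hypothetical, not the paper's exact estimator.

```python
import numpy as np

def wasserstein_1d(u, v):
    """Empirical 1-Wasserstein distance between two equal-size 1-D samples:
    mean absolute difference of the sorted values (a simplified sketch of
    the divergence used to compare time-series distributions)."""
    return np.mean(np.abs(np.sort(u) - np.sort(v)))

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, 1000)   # e.g., an embedding feature from dataset A
b = rng.normal(0.5, 1.0, 1000)   # a shifted feature from dataset B
print(wasserstein_1d(a, a.copy()))  # 0.0 -- identical distributions
print(wasserstein_1d(a, b))         # close to the 0.5 shift between the means
```

Comparing such distances between the embedding distributions of channels in two recordings gives a cost that could, for instance, be fed to a matching procedure to find channel alignments.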

