Release (and paper) of 'seraphim' 2.0, an extended toolbox for studying phylogenetically informed movements

Published on March 24, 2026, by Simon Dellicour

Ten years after the release and publication of version 1.0, we are happy to announce the release of version 2.0 of the toolbox “seraphim”, our R package for studying phylogenetically informed movements. This toolbox can, for instance, be used to investigate the impact of environmental factors on the dispersal history and dynamics of viral lineages, to estimate lineage dispersal statistics, to map continuous phylogeographic reconstructions, or to conduct continuous phylogeographic simulations. The application note presenting this new version has been published today in Bioinformatics (Dellicour et al. 2026) and the “seraphim” R package is available on GitHub along with a detailed manual, tutorials and associated example files.

What’s new in “seraphim” 2.0?

  • improvement of the functions that can be used to visualise continuous phylogeographic reconstructions of viral spread (see below for an example). For instance, highest posterior density polygons used to display the uncertainty associated with the Bayesian inference can now be saved as vector objects and can either be estimated by time slice or retrieved from a maximum clade credibility tree.
  • estimation of new lineage dispersal statistics such as the isolation-by-distance signal metric (Dellicour et al. 2024).
  • the first post hoc analysis of the impact of environmental factors on the pace of viral spread now focuses on testing the association between environmental factors and the diffusion (instead of the dispersal) velocity of viral lineages (Dellicour et al. 2025).
  • addition of a second post hoc analysis to investigate the isolation-by-resistance signal, i.e. to investigate to what extent environmental factors can be associated with a deviation from an isolation-by-distance pattern (Dellicour et al. 2025).
  • the package can now also be used to follow prior-informed (as opposed to post hoc) landscape phylogeographic approaches to investigate the impact of environmental factors on the diffusion velocity of lineages. Such prior-informed landscape phylogeographic analyses can for instance be conducted through an environmental factor based multidimensional scaling transformation (Dellicour et al. 2025).
  • finally, the package now includes four phylogeographic simulators: (i) randomisation of tree branches on an environmental raster according to various randomisation procedures (Dellicour et al. 2016), (ii) simulations of a relaxed random walk diffusion process along time-scaled phylogenies (which can, e.g., be used to investigate the impact of barriers on the dispersal frequency of lineages; Dellicour et al. 2018), (iii) simulations based on a birth-death process and a Brownian random walk or a relaxed random walk diffusion process (Dellicour et al. 2024), and (iv) simulations of a relaxed random walk diffusion process with a dispersal velocity impacted by an environmental raster (Dellicour et al. 2025).

Figure 1: examples of visualisations that can be generated with the toolbox “seraphim” 2.0. Visualisations are based on a continuous phylogeographic analysis of the yellow fever virus (YFV) outbreak that started around 2015 in southeastern Brazil (Hill et al. 2022). (A) Continuous phylogeographic reconstruction of the dispersal history of YFV outbreak lineages: maximum clade credibility (MCC) tree and overall 80% highest posterior density (HPD) regions reflecting the uncertainty of the Bayesian phylogeographic inference summarized from 1000 trees sampled from the post-burn-in posterior tree distribution. MCC tree nodes are colored according to their time of occurrence and 80% HPD regions were computed for successive time layers and then superimposed using the same color scale to reflect time. The underlying map delimiting the Brazilian states was retrieved from the Database of Global Administrative Areas (GADM). (B) Evolution of the maximal wavefront distance from the epidemic origin: the solid curve represents the median value and the surrounding polygon the 95% HPD interval. Those estimates are also based on 1000 trees sampled from the post-burn-in posterior tree distribution, and the uncertainty polygon is colored according to the same time scale used in panel A. (C) Evaluation of the diffusion velocity of viral lineages through the estimation of the weighted diffusion coefficient (WDC): kernel density estimates of the diffusion coefficient (DC) parameters, with the posterior WDC estimates on the x-axis and the coefficient of variation of the diffusion coefficient among the branches of each sampled tree on the y-axis. In this graph, the three contours show, in shades of decreasing darkness, the 50%, 75%, and 95% HPD regions via kernel density estimation, respectively.
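To make the two quantities plotted in panel C more concrete, here is a minimal sketch of how they could be computed for a single sampled tree. It is written in plain Python for illustration and does not use the “seraphim” API. It assumes the commonly used definition of the weighted diffusion coefficient, the sum of squared branch displacements divided by four times the sum of branch durations, and it assumes a sample standard deviation for the coefficient of variation. The function names and the toy inputs are hypothetical.

```python
from statistics import mean, stdev


def weighted_diffusion_coefficient(distances_km, durations_yr):
    """Assumed definition: sum(d_i^2) / (4 * sum(t_i)), in km^2/year.

    distances_km: geographic distance travelled along each tree branch.
    durations_yr: duration of each branch, in years.
    """
    if not distances_km or len(distances_km) != len(durations_yr):
        raise ValueError("need one distance and one duration per branch")
    return sum(d * d for d in distances_km) / (4.0 * sum(durations_yr))


def diffusion_coefficient_variation(distances_km, durations_yr):
    """Coefficient of variation of the per-branch coefficients d_i^2 / (4 t_i).

    Assumption: the sample standard deviation is used; the package may
    use a different estimator.
    """
    per_branch = [d * d / (4.0 * t) for d, t in zip(distances_km, durations_yr)]
    return stdev(per_branch) / mean(per_branch)


# Hypothetical branches of one sampled tree (not data from the YFV analysis).
distances = [10.0, 20.0, 30.0]
durations = [1.0, 2.0, 2.0]
print(weighted_diffusion_coefficient(distances, durations))  # 70.0
print(diffusion_coefficient_variation(distances, durations))
```

Repeating this calculation for each tree sampled from the posterior distribution would give one point per tree, which is the kind of cloud that the kernel density contours in panel C summarise.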

Reference: Dellicour S, Faria NR, Rose R, Lemey P, Pybus OG (2026). SERAPHIM 2.0: an extended toolbox for studying phylogenetically informed movements. Bioinformatics 42: btag093