This page contains some of the most important overview content from the public data release paper.

### 1. Simulation Overview

#### Description of the Simulations

Illustris is a suite of large volume, cosmological hydrodynamical simulations run with the moving-mesh code Arepo and including a comprehensive set of physical models critical for following the formation and evolution of galaxies across cosmic time. Each simulates a volume of $(106.5 {\rm Mpc})^3$ and self-consistently evolves five different types of resolution elements from a starting redshift of $z=127$ to the present day, $z=0$. These components are: dark matter particles, gas cells, passive gas tracers, stars and stellar wind particles, and supermassive black holes.

This data release includes the snapshots at all 136 available redshifts, halo and subhalo catalogs at each snapshot, and two distinct merger trees. Six primary realizations of the Illustris volume are released, including the flagship Illustris-1 run. These realizations include three resolution levels with the fiducial "full" baryonic physics model, and a dark matter only analog for each. In addition, there are four distinct, high time resolution, smaller volume "subboxes".

Caption. The most important numerical parameters for the six full volume runs. Gravitational softenings for all particle types other than DM are fixed in comoving kpc (with value equal to that of the DM) until $z=1$, after which they are held at their $z=1$ values, such that at $z=0$ they are half the DM softening length. $m_{\rm baryon}$ is the "target gas mass" (i.e. only the mean mass). The number of gas cells equals the $N_{\rm GAS}$ value only in the initial conditions; it then drops as stars and black holes form. Moreover, the total number of baryonic particles (gas cells + star particles + wind particles + black holes) is also not conserved, since gas cells can be refined/de-refined to keep their mass within a factor of 2 around $m_{\rm baryon}$. In contrast, the total numbers of tracers and dark matter particles are both conserved for the duration of the simulation.
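The softening convention in the caption can be made concrete with a short sketch. It assumes our reading of the caption: the DM softening is constant in comoving units at all times, while the other types share that comoving value down to $z=1$ and keep the corresponding physical length afterwards. The function name and the example value of 1.0 kpc are hypothetical and are not taken from the table.

```python
def physical_softening_kpc(eps_comoving_kpc, z, is_dark_matter):
    """Physical softening length in kpc at redshift z (illustrative sketch).

    eps_comoving_kpc is the comoving softening shared by all particle
    types at z >= 1 (a hypothetical input, not a value from the table).
    """
    if z < 0.0:
        raise ValueError("redshift must be non-negative")
    if is_dark_matter or z >= 1.0:
        # Constant comoving softening: the physical length scales with
        # the scale factor a = 1 / (1 + z).
        return eps_comoving_kpc / (1.0 + z)
    # Non-DM types below z = 1: physical length frozen at its z = 1 value.
    return eps_comoving_kpc / 2.0


# At z = 0 the non-DM softening is half of the DM softening.
dm_today = physical_softening_kpc(1.0, 0.0, is_dark_matter=True)
baryon_today = physical_softening_kpc(1.0, 0.0, is_dark_matter=False)
```

Under this reading the two softenings agree at all $z \geq 1$ and differ by a factor of two at $z=0$, as stated in the caption.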

In the table above we provide an overview of the specifications of the six Illustris runs, including the computational volume, gravitational softening lengths, and masses of the different particle/cell types, which collectively indicate the resolution and dynamic range achieved. To emphasize the variety of galaxy formation and evolution phenomena which can be addressed with the Illustris simulations, in the following figure we give the approximate number of a selection of interesting astrophysical objects that can be found in the simulated box, from dark-matter dominated halos at $z=0$ to luminous active galactic nuclei (AGN) at higher redshifts.

Caption. Overview of the variety of galaxy formation and evolution phenomena accessible in the Illustris simulations. A few classes of interesting objects are listed for each of the four mass components present in the simulation: dark matter, stars, gas, and black holes. These are visualized in the left column, for different volumes and spatial scales, as dark-matter density, stellar light, gas density and gas temperature maps, with black holes denoted as black dots. The approximate number present in the Illustris-1 volume is given (from bottom to top), for a) galaxy clusters at $z=0$ with total mass $M_{200c}> 10^{14} {\rm M}_\odot$; b) Milky Way-like halos at $z=0$ ($6 \times 10^{11} < M_{200c}< 2 \times 10^{12} {\rm M}_\odot$); c) gravitationally-bound objects (dark or luminous) resolved with more than a thousand particles at the end of the reionization epoch; d) galaxies at $z=0$ with stellar mass exceeding $10^{10} {\rm M}_\odot$, including both centrals and satellites, from elliptical to disk morphologies; e) satellite galaxies at $z=0$ more massive than the Large Magellanic Cloud (stellar mass $> 1.5 \times 10^9 {\rm M}_\odot$), in any mass host; f) massive, compact galaxies at $z=2$ according to the selection of Barro et al. (2013); g) clusters of galaxies at $z=0$ emitting in the X-rays with luminosity exceeding $10^{42}$ erg/s; h) sources at $z=0$ with neutral hydrogen mass exceeding $5 \times 10^8 {\rm M}_\odot$; i) $10^{12} {\rm M}_\odot$ halos at $z=3$ with at least one damped Lyman-alpha system (HI column density $> 10^{20.3} {\rm cm}^{-2}$) within $50 {\rm kpc}$; j) black holes at $z=0$ more massive than $10^9 {\rm M}_\odot$; k) black-hole merger remnants at $z=0$, i.e. subgrid black-hole binaries with $M_{\rm BH} > 10^6 {\rm M}_\odot$ for each BH and 1 Gyr delay between the simulation BH merger time and the actual BH merger; l) AGNs at $z=1$ with bolometric luminosity greater than $10^{45}$ erg/s.

A series of analyses based on the Illustris suite have already been performed. These include 1) comparisons to observations and studies of the impact of different feedback models on the distribution and content of gas on large scales, within halos and in the circumgalactic regime; 2) characterizations of the properties of galactic stellar halos, of the satellite populations across host masses, of the star formation histories and of the morphologies and angular-momentum build up of Illustris galaxies; 3) applications of shock finder algorithms; 4) analyses of the formation of massive, compact galaxies at high redshifts; 5) quantification of the galaxy merger rates; and 6) applications of post-processing radiative transfer algorithms in the study of cosmic reionization. See the up-to-date List of Results for references.

#### Physical Models and Numerical Methods

All of the "full physics" Illustris runs contain the following physical components:

• Primordial and metal-line radiative cooling in the presence of a redshift-dependent, spatially uniform, ionizing UV background field, with self-shielding corrections.
• Stochastic star formation in dense gas.
• Pressurization of the ISM due to unresolved supernovae using an effective equation of state model of a two-phase medium.
• Stellar evolution with the associated mass loss (gas recycling) and chemical enrichment, taking into account SN Ia/II and AGB stars.
• Galactic-scale outflows with an energy-driven, kinetic wind scheme.
• Seeding and growth of supermassive black holes.
• Feedback from AGN in both quasar and radio (bubble) modes, as well as modifications to the cooling curve of nearby gas due to radiation proximity effects.

For complete details on the behavior, implementation, parameter selection, and validation of these physical models, see Vogelsberger+ (2013), which describes the feedback models, and Torrey+ (2014), which compares the model output with observations from $z=0$ to $z=3$.

The Illustris simulations employ the Arepo code which evolves the equations of continuum hydrodynamics coupled with self-gravity. The spatial discretization of the fluid is provided by an unstructured, moving, Voronoi tessellation. On the volumes defined by individual cells Godunov's method is employed, with a directionally unsplit MUSCL-Hancock scheme and an exact Riemann solver. The Voronoi mesh is generated from a set of control points which move with the local fluid velocity modulo mesh regularization corrections. Gravitational forces are computed using the Tree-PM approach, with long-range forces calculated with a Fourier particle-mesh method, and short-range forces with a hierarchical tree algorithm. The code is second order in space, and with hierarchical adaptive time-stepping, also second order in time. During the simulation we employ a Monte Carlo tracer particle scheme (see Genel+ 2013) to follow the Lagrangian evolution of baryons.

In terms of both physical models and numerical methods, the Illustris simulations rely on a substantial foundation of previous work. In the following figure we provide an abridged reference tree covering both the physical models and numerical methods. The papers along any given branch are essential for understanding the details and limitations of the data.

Caption. Reference tree for the major components of Illustris, including both numerical methods and physical models. Each paper links to its arXiv or ADS entry (only if viewed at full size). We generally include both models and methods which were directly implemented in Illustris, while entries in the dark subboxes indicate model data inputs.

### 2. Data Access

There are two complementary ways to access the Illustris data products.

• Raw files can be directly downloaded, and example scripts are provided as a starting point for local analysis.
• A web-based API can be used, either through a web browser or programmatically in an analysis script, to perform common search and extraction tasks.

These two approaches can be combined. For example, you may be forced to download the full redshift zero group catalog in order to perform a complex search not supported by the API. After locally determining a sample of interesting galaxies, you could then extract their individual merger trees (and/or raw particle data) without needing to download the full simulation merger tree (or a full snapshot).

#### Direct File Download and Example Scripts

All of the primary data products for Illustris are released in HDF5 format. This is a portable, self-describing, binary specification suitable for large numerical datasets, for which file access routines are available in all common computing languages. We use only the basic features of the format: groups, attributes, and datasets, with one and two dimensional numeric arrays.

In order to maintain reasonable filesizes, most outputs are split across multiple file "pieces" (or "chunks"). For example, each snapshot of Illustris-1 is split into 512 sequentially numbered files. Individual links to each file chunk are available by selecting a particular simulation on the main data page. Pre-computed sha256 checksums are provided for all files so that their integrity can be verified.
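As an illustration of the integrity check mentioned above, the following minimal sketch computes the sha256 digest of a downloaded file chunk and compares it with a published value. It uses only the Python standard library; the function names are our own, and the expected digest is assumed to be the hexadecimal string listed next to the file on the data page.

```python
import hashlib


def sha256_of_file(path, block_size=1 << 20):
    """Return the hex sha256 digest of a file, read in 1 MiB blocks
    so that large chunks do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_chunk(path, expected_hex):
    """True if the file's digest matches the published checksum."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A file for which `verify_chunk` returns `False` was most likely truncated or corrupted in transfer and should be downloaded again.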

For a getting-started guide and reference see the Example Scripts Documentation (in IDL, Python, and Matlab).

#### Web-based API

We have implemented a web-based interface (API) which can respond to a variety of user requests and queries. It is a well-defined interface between the user and the Illustris data products, which is expressed in terms of the required input(s) and expected output(s) for each type of request. The provided functionality is independent, as much as possible, from the underlying data structure, heterogeneity, format, and access methods. The API can be used in addition to, or in place of, the download and local analysis of large data files. At a high level, the API allows a user to search, extract, visualize, and analyze. In each case, the goal is to reduce the data response size, either by extracting an unmodified subset, or by calculating a derivative quantity.

As specific examples, the following types of requests can be handled through the current API, for any simulation at any snapshot:

• List the available simulations, their snapshots, and all associated metadata.
• List all objects in the Subfind group catalog and their properties.
• Search with numeric range(s) over any field(s) present in the Subfind group catalogs.
• Return all fields from the group catalog for a specific halo or subhalo.
• Return a full snapshot cutout of the particle/cell data for a given halo or subhalo.
• Return a subset of this 'group cutout' containing only specified particle/cell type(s), and/or specific field(s) for each type.
• Return the complete merger history, or just the main progenitor branch, for a given subhalo, for any of the merger trees.
• Download all raw snapshot, group catalog, merger tree, and supplementary data catalog files which exist.
• Download subsets of raw snapshot files, containing only specified particle/cell type(s), and/or specific field(s) for each type.
• Crossmatch subhalos between full physics runs and their dark matter only analogues.
• Traverse relationships between halos and subhalos, for instance from a satellite subhalo to its parent FoF group to the primary (central) subhalo of that group.
• Traverse descendant and primary progenitor links across adjacent snapshots, as available in the SubLink merger trees.
• View or render visualizations of the different components (e.g. dark matter, gas, stars) of halos and subhalos, when available.
• Retrieve or calculate additional properties, beyond what is available in the group catalogs, for halos and subhalos, when available.

For a getting-started guide, cookbook of examples, and API reference see the Web API Documentation (in IDL, Python, and Matlab).
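To give a flavor of a programmatic search over catalog fields, the sketch below assembles a request URL for a numeric range query. Everything about the URL layout here is an assumption made for illustration: the base URL is a placeholder, and both the `simulation/snapshots/N/subhalos/` path and the `field__gt` / `field__lt` parameter naming should be checked against the Web API Documentation before use. No request is actually sent.

```python
from urllib.parse import urlencode


def subhalo_search_url(base_url, simulation, snapshot, ranges):
    """Build a (hypothetical) subhalo search URL.

    ranges maps a catalog field name to a (minimum, maximum) tuple;
    either bound may be None to leave that side open.
    """
    params = []
    for field, (lower, upper) in sorted(ranges.items()):
        if lower is not None:
            params.append((f"{field}__gt", lower))  # assumed naming scheme
        if upper is not None:
            params.append((f"{field}__lt", upper))  # assumed naming scheme
    url = f"{base_url.rstrip('/')}/{simulation}/snapshots/{snapshot}/subhalos/"
    if params:
        url += "?" + urlencode(params)
    return url


# Placeholder base URL and an illustrative snapshot number.
example = subhalo_search_url(
    "https://example.org/api", "Illustris-1", 135, {"mass": (70.0, 80.0)}
)
```

Keeping URL construction separate from the request itself makes it easy to inspect or log the query before sending it with an HTTP client of choice.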

#### Further Online Tools

Subhalo Search Form: We provide a simple search form through which users can query the subhalo database. The search capabilities that exist in the API are exposed in a more human-friendly interface, to enable exploration without the need to write code or write URLs by hand. For example, objects can be selected based on total mass, stellar mass, star formation rate, gas metallicity, or size. The output is a familiar spreadsheet type format, which lists properties from the group catalogs. In addition, each subhalo row provides links to a common set of web-based tools for introspection. These include a full listing of all catalog fields, a form for selecting particle types and initiating an extraction of particles from the snapshot, merger tree visualization, and links to pre-rendered images, when available.

Explorer: The Illustris Explorer is an experiment in the visualization, exploration, and dissemination of large data sets -- in particular, those generated by large, astrophysical simulations such as Illustris. It uses the approach of thin-client interaction with derived data products, in this case, pre-computed imagery layered under group catalog information. Rapid search over group properties spatially overlays the results on top of the pre-rendered images. All mass components of the simulation are present: the continuous gas and dark matter fields, stellar light from individual stars, and black holes. We have found the interface particularly useful in exploring the spatial relationships between these four components and the discrete halos and subhalos identified with substructure finding algorithms.

Merger Tree Tool: As a demonstration of the potential of rich client applications built on top of the Illustris API, the above figure shows a snapshot of the currently available interface for interactively exploring the merger trees. A zoomed-in portion of the SubLink tree for the 500th most massive central subhalo of Illustris-1 at $z=0$ is shown. The tree is vector based and rendered client side, so each node can be interacted with individually. The informational popup provides a link back into the API, where the details of the selected progenitor subhalo can be interrogated.

### 3. Scientific Remarks and Cautions

The Illustris Simulations (particularly Illustris-1) have been shown to resolve many details of the small-scale properties of galaxies, as well as the evolution of stars and gas within the cosmic web. Illustris-1 reproduces many observational facts on the demographics and properties of the galaxy populations at various epochs, and on the distribution of gas on large scales. This has been achieved with a comprehensive galaxy formation model which is intended to account for all the primary processes that are believed to be important for the formation and evolution of galaxies.

However, the enormous dynamical range and the variety and complexity of physics phenomena involved in these numerical endeavours necessarily involve some modeling uncertainties. We have identified below the known problems and points of caution in the Illustris simulated output that any user of the public data must be aware of before embarking on the analysis of the released products. These points should be carefully taken into account before advancing scientific conclusions or making comparisons to observational results.

#### Caveats with the Illustris Galaxy Formation Model

Limitations in the Illustris implementations of the stellar and AGN feedback, and possibly of the adopted star-formation recipe, give rise to a series of issues in the simulated galaxy populations and gas content of halos in comparison to observational constraints. These all point to an inefficient quenching of the star formation in galaxies at different masses and regimes, and in some cases also to qualitatively unrealistic behaviors of the feedback models. In particular, the following issues applicable to the highest-resolution realization (Illustris-1) must be noted:

• The cosmic star formation rate density is too high at $z \lesssim 1$, possibly because of an inefficient quenching of galaxies residing in halos of $10^{11-12} {\rm M}_\odot$ (see Figs. 8 and 2 in Vogelsberger+ (2014b) and Genel+ (2014), respectively).
• The stellar mass function at $z \lesssim 1$ is too high at both the high and the low ends of the sampled stellar mass range, $M_\star \lesssim 10^{10} {\rm M}_\odot$ and $M_\star \gtrsim 10^{11.5} {\rm M}_\odot$ (see Fig. 11 in Vogelsberger+ 2014b and Fig. 3 in Genel+ 2014).
• The physical extent of galaxies can be a factor of a few larger than observed for $M_\star \lesssim 10^{10.7} {\rm M}_\odot$ (see Fig. 9 in Snyder+ 2015).
• The galaxy color distribution deviates from observations in that it does not exhibit a clear bimodality between red and blue galaxies, and the green valley and the blue cloud appear overpopulated with respect to the red sequence (especially for $M_\star \gtrsim 10^{10} {\rm M}_\odot$, see Fig. 14 in Vogelsberger+ 2014b).
• About 10 percent of disk galaxies in the mass range $M_\star \sim 10^{10.5-11} {\rm M}_\odot$ at $z=0$ exhibit strong stellar and gaseous ring-like features, and appear as an additional sub-population in the $G_{\rm ini}-M_{20}$ plane (see Fig. 5 in Snyder+ 2015); such features appear to be even more frequent at higher redshifts. Via fragmentation, stellar rings may give rise to spurious stellar clumps that the Subfind algorithm identifies as subhalos but whose origin and existence are not necessarily physically well motivated. Furthermore, these stellar rings are often associated with cores in the stellar and dark matter components, visible in the inner radial density profiles. These cores can extend up to $\sim 10$ kpc in radius and are likely not realistic in detail.
• The total gas mass within $R_{\rm 500c}$ is underestimated at late times by a factor of 3-10 in halos with $M_{\rm 500c} \sim 10^{13-14} {\rm M}_\odot$, because the Illustris radio-mode feedback operates too violently (see Fig. 10 in Genel+ 2014).
• For similar reasons, the bolometric X-ray luminosity of the hot coronae of elliptical galaxies is lower than that of spiral galaxies by a large factor, contradicting observational constraints (see Section 5.2 of Bogdan+ 2015); and the predictions for the Sunyaev-Zel'dovich signals from Illustris clusters are not reliable (Popa et al. 2015, in prep).

For some items of this list we have intentionally omitted more specific quantifications of the tensions with observations, for two reasons: on the one hand, not all observational results agree with each other, making quantitative statements necessarily partial; on the other hand, great care is necessary to properly map simulated variables into observationally-derived quantities.

For example, we note that the adopted low star-formation density threshold value and the low thermal energy content of galactic winds may be the cause of spurious star formation in the circumgalactic medium around Milky Way-like galaxies, at large distances from the natural, dense sites of star formation activity (i.e. disks, see Marinacci+ 2014a). However, no observational data are available to properly quantify such a phenomenon. Similarly, the impact of the AGN feedback on the dark-matter distribution within Illustris halos might be overestimated, but direct observational constraints are lacking.

Furthermore, while a first analysis of the stellar ages of Illustris galaxies seemed to reveal an overestimation of the predicted stellar ages for $M_\star \lesssim 10^{10.5} {\rm M}_\odot$ galaxies (see Fig. 25 of Vogelsberger+ 2014b), we have now recognized that such a comparison to observations is rather inconclusive, as the shape of the age-mass relation of galaxies strongly depends, in the first place, on whether stellar ages are measured by mass- or light-weighting.
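The sensitivity to the weighting choice can be seen in a toy calculation. The numbers below are invented for illustration and do not come from the simulation: a galaxy is represented by two stellar populations, an old one that dominates the mass and a young one that is much brighter per unit mass.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean of values."""
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical two-population galaxy (arbitrary units for mass and light).
ages_gyr = [1.0, 10.0]      # young population, old population
masses = [1.0, 9.0]         # the old population holds 90% of the mass
luminosities = [5.0, 5.0]   # the young population is far brighter per unit mass

mass_weighted_age = weighted_mean(ages_gyr, masses)         # 9.1 Gyr
light_weighted_age = weighted_mean(ages_gyr, luminosities)  # 5.5 Gyr
```

In this toy case the same galaxy is assigned an age of 9.1 Gyr or 5.5 Gyr depending only on the weighting, which is why the two definitions should not be mixed when comparing to observations.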

To better inform which features of the simulations should be trusted when making science conclusions, note also the following points more directly related to numerical choices:

• In both the snapshots and halo catalogs, metallicity values should be used and interpreted with care. These depend on the underlying choices for stellar evolution and metal enrichment, with tabulated yields being uncertain and continuously updated. Furthermore, no metallicity floor has been imposed on the output data, so that the metallicities of a small fraction of gas and star elements take minuscule, unrealistic values. Any user should feel free to adopt the most convenient and appropriate metallicity floor.
• In the Subfind catalogs, relatively low-mass, stellar- or gas-dominated objects at small galactocentric distances from their host halos may be artifacts and should be considered with care. These may be the results of the fragmentation of the aforementioned stellar rings in disk galaxies, and may appear as outliers in halo/galaxy scaling relations involving sizes, masses, metallicities and mass-to-light ratios.
• Low-mass BHs in relatively low-mass subhalos should also be considered with care, particularly those hosted in satellite subhalos of more massive galaxies or at low redshifts. Because spurious motions of BH particles are prevented by repositioning the BH on the halo potential minimum, in some cases low-mass BHs in satellite galaxies are repositioned on the central halo on artificially short timescales. These "empty" satellites may then be repopulated with new BH seeds, regardless of redshift. The vast majority of these late-forming, satellite-hosted seeds do not grow significantly before merging with the central BH, so the effects are largely confined to BHs with mass $<10^6 {\rm M}_\odot$.
• In all Illustris simulations, the time-variable UVB radiation field of Faucher-Giguère et al. (2009), in particular the FG11 (2011) version, has been enabled only for $z < 6$. Therefore, the ionization state of the IGM above redshift six should be studied with caution.
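As an example of the metallicity-floor suggestion in the list above, the following sketch clips a list of metallicity values at a user-chosen floor before taking logarithms. The default floor of $10^{-10}$ is an arbitrary placeholder, not a recommended value, and the function names are our own.

```python
import math


def apply_metallicity_floor(metallicities, floor=1e-10):
    """Replace every value below `floor` by `floor` (placeholder default)."""
    if floor <= 0.0:
        raise ValueError("floor must be positive")
    return [max(z, floor) for z in metallicities]


def log10_metallicity(metallicities, floor=1e-10):
    """log10 of the floored metallicities; safe for zero-valued inputs."""
    return [math.log10(z) for z in apply_metallicity_floor(metallicities, floor)]
```

Applying the floor before the logarithm avoids both domain errors for exactly zero values and extreme outliers that would otherwise dominate averages taken in log space.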

### 4. Community Considerations

#### Citation

To support proper attribution, recognize the effort of individuals involved, and monitor ongoing usage and impact, we request the following. Any publication making use of data from the Illustris simulations should cite the release paper (Nelson et al. 2015c) as well as the original paper introducing the project (Vogelsberger et al. 2014a). Furthermore, extensive use of the data, or studies of galaxy properties and populations, should cite, if appropriate, Vogelsberger et al. (2014b) as well as Genel et al. (2014). Any investigation of the black hole population should cite, if appropriate, Sijacki et al. (2015).

Finally, use of any of the supplementary data products should include the relevant citation. A full and up to date list will be maintained here:

#### Collaboration and Contributions

The full snapshots of Illustris-1 are sufficiently large that it will be prohibitive for most users to acquire or store a large number of them. We note that transferring 1.5 TB (per snapshot) at 10 MB/s will take roughly 42 hours. As a result, projects which require access to the entire snapshot set may benefit from closer interaction with members of the Illustris collaboration. In particular, many team members are open to more direct collaboration, which can include guest access to compute resources which are local to full copies of the data. We welcome ideas for joint projects, so long as they intersect with the interests of collaboration members and do not overlap with existing efforts. In practice, we suggest contacting the author(s) who have already published work using Illustris data on related scientific topics.
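The transfer-time estimate quoted above follows from simple arithmetic, sketched here so that it can be repeated for other file sizes and connection speeds. Decimal units are assumed (1 TB = $10^6$ MB), and the function name is our own.

```python
def transfer_time_hours(size_tb, rate_mb_per_s):
    """Hours needed to move `size_tb` terabytes at `rate_mb_per_s` MB/s,
    assuming decimal units (1 TB = 1e6 MB) and a sustained rate."""
    if rate_mb_per_s <= 0.0:
        raise ValueError("rate must be positive")
    size_mb = size_tb * 1e6
    return size_mb / rate_mb_per_s / 3600.0


# One 1.5 TB snapshot at 10 MB/s: 150,000 s, or about 41.7 hours.
hours_per_snapshot = transfer_time_hours(1.5, 10.0)
```

The result, about 41.7 hours, matches the "roughly 42 hours" quoted in the text.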

We also welcome contributions to the data release. These can take the form of either analysis code, or computed data products:

• If you would like to develop an (expensive) analysis routine, we can run it against one or all simulations or snapshots. The resulting data can be made immediately public through the Illustris API. Alternatively, the resulting data can be made privately available until an initial publication is released, and then released publicly.
• If you would like to develop an (inexpensive, fast) analysis routine, we can integrate it into the Illustris API, such that it can be requested on demand for any object. In this case, analysis should be restricted to subhalo or halo particles, and take at most a few seconds.
• If you produce a data set derived from the Illustris simulations, and would like to make it publicly available, we can host and distribute it alongside the other supplementary data catalogs.
• If you would like to develop a web-based client application which leverages the API (similar to the Explorer or Merger Tree visualization tools), we can provide assistance, and ultimately integrate and host it on this site, if desired.

#### Future Data Releases

We anticipate the ongoing release of additional data products, for which further documentation will be provided online:

Rockstar and Consistent-Trees. We plan to release Rockstar group catalogs and the Consistent-Trees merger trees built upon them for the six Illustris boxes in the near future, and will provide further documentation at that time. These group catalogs can include a different subhalo population than identified with the Subfind algorithm, particularly during mergers. The algorithm used to construct the C-Trees also has fundamental differences to both LHaloTree and SubLink. This can provide a powerful comparison and consistency check for any scientific analysis. We also anticipate that some users will simply be more familiar with these outputs, or need them as inputs to other tools.

Additional Supplementary Data Catalogs:

• Dark-matter halo catalogs at selected snapshots will be released including dark-matter density profiles fit parameters, fit-independent concentration estimates, halo formation times, and halo shapes. (Lead: Annalisa Pillepich, Eddie Chua)
• Mock images and property catalogs of Illustris-1 stellar halos will be released, at a selection of snapshots between $z=0$ and $z=2$. (Lead: Annalisa Pillepich)
• Lightcones for mock-observed survey fields, in HST and JWST filters. (Lead: Greg Snyder)

Additional Simulations: Several smaller simulations related to Illustris have been discussed in previous papers, including a series of $25 {\rm Mpc}/h$ boxes with variations on the input feedback parameters. These can be released in the future if there is community interest. Ongoing and future projects, including higher resolution zooms of individual systems, as well as larger volumes, will also be released through this platform in the future.

API Functionality Expansion: There is significant room for the development of additional features in the web-based API, in particular for (i) on-demand visualization tasks, (ii) on-demand analysis tasks, and (iii) client-side, browser based tools for data exploration and visualization. Examples include (i) requesting an image of projected gas density for a given halo, (ii) requesting a power-law radial slope measurement of a stellar halo or best-fit NFW parameters, and (iii) an interactive 3D representation of the subhalos within a given halo. We welcome community input and direct contributions in any of these directions.