Illustris - Data Access

General

1. Are there any useful additional resources for understanding the data?

Since Illustris is, in many aspects, a descendant of the Millennium simulations, much of the excellent Millennium documentation may be useful, particularly with respect to the merger tree structure. Since the Arepo code owes much of its heritage to Gadget-2, there are also many similarities at the particle level.

2. What is the difference between a halo and a subhalo? Group and subgroup? FoF and Subfind?

There are currently two distinct types of objects for which group catalogs exist (once the Rockstar groups are released, there will be three). We presented the following naming definitions in the data release paper:

"Group" == "FoF Group" == "FoF Halo" == "Halo"
"Subgroup" == "Subfind Group" == "Subhalo"

The first type is computed with a friends-of-friends (FoF) algorithm, while the second is computed with the Subfind algorithm. Each FoF halo can have zero or more subhalos that belong to it. One of the properties of the Subfind algorithm is that gravitationally unbound member particles are removed. Therefore, for very small FoF halos (near the minimum particle limit of 32), it is possible that this "unbinding" procedure will remove enough particles that there is no subhalo (above the minimum particle limit of 20) left. In this case, the FoF halo will have no subhalos. If a FoF halo does have subhalos, they are of two fundamentally different types:

The "Central" or "Primary" subhalo, of which there can be only one per FoF halo, which is by construction the most massive.
"Satellite" or "secondary" subhalos, of which there can be zero or one or many.

3. How do I find the parent halo of a subhalo? How do I find the subhalos of a halo?

For a given subhalo, the catalog field SubhaloGrNr gives its "group number", i.e. the index (or ID) of its parent halo (or "group").

Note that this index (or ID) refers to the Group* datasets in the catalog, not the Subhalo* datasets. For example, to obtain the halo mass of subhalo i, you could access Group_M_Crit200[SubhaloGrNr[i]].

For a given halo, the catalog field GroupFirstSub gives the index (or ID) of its "first" (i.e. primary, or "central") subhalo. This value is invalid (equal to -1) unless GroupNsubs >= 1. As above, this index (or ID) refers to the Subhalo* datasets in the catalog. To access the stellar mass of the central subhalo of group j, you could take SubhaloMassInRadType[GroupFirstSub[j],4].

If a halo has more than one subhalo (GroupNsubs > 1), then all subhalos after the first are satellites. The indices (or IDs) of these satellite subhalos are given by {GroupFirstSub+1, GroupFirstSub+2, ..., GroupFirstSub+GroupNsubs-1}, or in python, range(GroupFirstSub+1, GroupFirstSub+GroupNsubs).
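The indexing relations above can be sketched in a few lines of Python. This is a minimal illustration on small synthetic arrays (the values are made up, not real catalog data), and the helper names `satellites_of` and `parent_of` are hypothetical; in practice the arrays would be loaded from the group catalog.

```python
import numpy as np

# Synthetic stand-ins for the catalog fields (NOT real Illustris data).
GroupFirstSub = np.array([0, 3, -1, 4])        # central subhalo index, -1 if none
GroupNsubs = np.array([3, 1, 0, 2])            # number of subhalos in each halo
SubhaloGrNr = np.array([0, 0, 0, 1, 3, 3])     # parent halo index of each subhalo


def satellites_of(halo_index):
    """Return the subhalo indices of all satellites of the given halo."""
    first = int(GroupFirstSub[halo_index])
    nsubs = int(GroupNsubs[halo_index])
    if nsubs < 1:
        # GroupFirstSub is -1 here, so there is nothing to index.
        return []
    return list(range(first + 1, first + nsubs))


def parent_of(subhalo_index):
    """Return the index of the parent halo (into the Group* datasets)."""
    return int(SubhaloGrNr[subhalo_index])


print(satellites_of(0))   # satellites of halo 0
print(parent_of(4))       # parent halo of subhalo 4
```

Checking `GroupNsubs` before using `GroupFirstSub` avoids accidentally indexing with -1, which in numpy would silently return the last element.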

4. What is the difference between a "subhaloID", "SubfindID", and just "id"?

These are all different names for the same value, and are used interchangeably in this documentation and most papers. They all refer to the "ID" (or "index") of a subhalo in the group catalogs, as determined by the Subfind substructure algorithm, i.e. the index of any dataset within the Subhalo group of the HDF5 file.

For instance, this is the value used when loading a single object from the catalog with the groupcat.loadSingle() function, when loading particles belonging to a given object from the snapshot with the snapshot.loadSubhalo() function, or when using the SubLink merger trees to find the SubfindID of a progenitor or descendant subhalo at different snapshots.

5. Does every halo have at least one subhalo? Can Group_R_Crit200 be zero?

Small, low-mass halos near the minimum particle count threshold of 32 may not contain any bound subhalo. In particular, the central subhalo may fall below the required 20 particle minimum for a Subfind object during the iterative unbinding procedure. In this case, GroupFirstSub == -1 and spherical overdensity (SO) quantities such as the virial radius (Group_R_Crit200) are not calculated and may appear as zero.
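As a small sketch of how such halos might be excluded before an analysis, the following uses synthetic arrays (made-up values, not real catalog data) and a boolean mask on GroupFirstSub; the variable names `has_central` and `valid_radii` are chosen here purely for illustration.

```python
import numpy as np

# Synthetic stand-ins for catalog fields (NOT real Illustris data).
GroupFirstSub = np.array([0, 5, -1, 9, -1])
Group_R_Crit200 = np.array([210.0, 95.0, 0.0, 40.0, 0.0])

# Halos with a valid central subhalo have GroupFirstSub >= 0.
has_central = GroupFirstSub >= 0

# Keep only radii of halos for which SO quantities were computed.
valid_radii = Group_R_Crit200[has_central]

print(has_central.sum(), "of", len(GroupFirstSub), "halos have a central subhalo")
```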

6. Where is the gas temperature of cells? I don't see it in PartType0.

The gas temperature must be derived from the internal energy $u$ and the electron abundance $x_e$ ($=n_e/n_H$), which are the PartType0 snapshot fields InternalEnergy and ElectronAbundance, respectively. First, the mean molecular weight can be calculated as $$\mu = \frac{4}{1 + 3 X_H + 4 X_H x_e} \times m_p.$$ Then, the temperature in Kelvin is given by $$T = (\gamma - 1) \times \frac{u}{k_B} \times \frac{\rm{UnitEnergy}}{\rm{UnitMass}} \times \mu.$$ Here, $\gamma = 5/3$ is the adiabatic index, $k_B$ is the Boltzmann constant in CGS units, $m_p$ is the proton mass in grams, and $X_H = 0.76$ is the hydrogen mass fraction. Finally, UnitMass and UnitEnergy are the code units expressed in CGS units. Specifically, UnitEnergy = UnitMass * UnitLength^2 / UnitTime^2, where UnitLength is 1 kpc and UnitTime is approximately 1 Gyr (such that the velocity unit is 1 km/s), so their ratio is $10^{10}$ in CGS units (cm^2/s^2).
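The temperature calculation can be written as a short numpy function. This is a minimal sketch under the definitions given in this answer: the function name `gas_temperature` is hypothetical, the physical constants are standard CGS values, and the inputs are assumed to be the InternalEnergy and ElectronAbundance arrays already loaded from PartType0.

```python
import numpy as np

K_B = 1.380649e-16             # Boltzmann constant [erg/K]
M_P = 1.67262192e-24           # proton mass [g]
X_H = 0.76                     # hydrogen mass fraction
GAMMA = 5.0 / 3.0              # adiabatic index
UNIT_ENERGY_PER_MASS = 1.0e10  # UnitEnergy / UnitMass in CGS [cm^2/s^2]


def gas_temperature(internal_energy, electron_abundance):
    """Gas temperature in Kelvin from u (code units) and x_e = n_e / n_H."""
    u = np.asarray(internal_energy, dtype=np.float64)
    x_e = np.asarray(electron_abundance, dtype=np.float64)
    # Mean molecular weight in grams.
    mu = 4.0 / (1.0 + 3.0 * X_H + 4.0 * X_H * x_e) * M_P
    return (GAMMA - 1.0) * u / K_B * UNIT_ENERGY_PER_MASS * mu


# Example with made-up input values (not taken from a real snapshot).
temperature = gas_temperature([100.0, 1.0e4], [1.16, 1.16])
print(temperature)
```

Casting to 64-bit floats before multiplying avoids precision loss if the snapshot fields were stored in single precision.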

7. How are DM IDs related between the full physics and dark matter only runs?

They are the same, meaning that particles with the same ID started at the same location in the initial conditions.

8. Are all black holes guaranteed to belong to a halo or subhalo?

Not necessarily. In particular, at z=0 in Illustris-1 there are a handful of black hole particles which are outside of all FoF halos. This number is very small, and may occur if e.g. a BH is seeded in a low mass halo which is then stripped or disrupted, falling below the minimum mass threshold for subhalos. In this case, a BH may reside outside of a group until it falls into the potential center of a new halo.

IllustrisTNG Specific

1. Do Illustris and TNG snapshots have the same numbers/redshifts?

No. While the final, z=0 snapshot of Illustris is number 135, the z=0 snapshot of TNG is number 99. All higher redshifts will also be saved at different snapshot numbers, and there may not be an exact correspondence between a snapshot of Illustris and of TNG. This will also be true for other simulations. In general, to make a comparison between any two simulations, the closest snapshots to a given redshift should be found in each.
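Finding the closest snapshot to a target redshift in each simulation could be sketched as follows. The redshift lists here are short, made-up examples (not the real Illustris or TNG snapshot lists), and the function `closest_snapshot` is a hypothetical helper that assumes the list position equals the snapshot number.

```python
import numpy as np


def closest_snapshot(snapshot_redshifts, target_z):
    """Return the position of the snapshot whose redshift is nearest target_z."""
    z = np.asarray(snapshot_redshifts, dtype=float)
    return int(np.argmin(np.abs(z - target_z)))


# Illustrative redshift lists for two simulations (made-up values).
z_sim_a = [3.0, 2.0, 1.0, 0.5, 0.0]
z_sim_b = [2.9, 1.5, 0.9, 0.0]

snap_a = closest_snapshot(z_sim_a, 1.0)
snap_b = closest_snapshot(z_sim_b, 1.0)
print(snap_a, z_sim_a[snap_a])
print(snap_b, z_sim_b[snap_b])
```

Printing the matched redshifts alongside the snapshot numbers makes it easy to see how large the mismatch between the two simulations is.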

2. I receive an HDF5 error 'dataset does not exist' when trying to load a snapshot field.

Keep in mind the 'mini' snapshots! You may be trying to load e.g. NeutralHydrogenAbundance or MagneticField for one of the 80 mini snapshots, while these fields are only available in the 20 full snapshots.

3. Can two particles/cells ever have the same ID?

This can occur for PartType4 (stars) in some extremely rare cases. One example is ID 116384809780 in TNG100-1 snapshot 14. This may only affect TNG100-1. Typically zero or at most a few such duplicates exist, and they are only an issue for algorithms that require perfect ID uniqueness.
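If an algorithm depends on unique IDs, a quick check for duplicates is possible with numpy. The sketch below uses a made-up ID array and a hypothetical helper name `find_duplicate_ids`; for a real snapshot the input would be the ParticleIDs of the particle type in question.

```python
import numpy as np


def find_duplicate_ids(ids):
    """Return the sorted array of ID values that occur more than once."""
    unique_ids, counts = np.unique(np.asarray(ids), return_counts=True)
    return unique_ids[counts > 1]


# Made-up example IDs (not from a real snapshot).
example_ids = np.array([5, 3, 5, 7, 3, 9], dtype=np.uint64)
print(find_duplicate_ids(example_ids))
```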

Original Illustris Specific

1. Why are there only 134 snapshots for Illustris-1? What happened to #53 and #55?

For Illustris-1, snapshots 53 and 55 were corrupted and cannot be used. Although the group catalogs exist, the merger trees skip over these snapshots, as if they are not present. In general, for most analysis tasks, the best approach is to simply pretend that these two snapshots do not exist.

2. Why does the size of Illustris-1 snapshots suddenly get larger at #72?

Prior to snapshot 72, the Coordinates field of all particle types was written in single precision floating point (32-bit). This was changed to double precision (64-bit) for snapshots 72 through 135.

3. Why are the dark matter IDs so large? Why are they not sequential?

The dark matter particle IDs encode their starting positions in the initial conditions, using a Peano-Hilbert key with 21 bits per dimension (in order to fit into a 64-bit ID). If you would like to make use of this information, get in touch for more details. (Note: DM IDs are sequential, and small, in TNG).

4. What is this 'GFM_Metals' field which can be found floating around the snapshots?

In original Illustris, this field is present for stars and gas in the original snapshot files, but should not be used. In particular, the values are incorrect, and do not contain valid information. (These are available and valid in TNG).

Merger Trees

1. Is every subhalo in the merger trees? Are halos in the merger trees?

The vast majority of subhalos (at every snapshot) are in both the SubLink and LHaloTree merger trees, but not all. If a subhalo does not have an established progenitor/descendant history, then it will not be included. This is most common for the smallest subhalos, near the resolution limit of 20 particles, and at high redshift.

FoF halos are not tracked in the merger trees, which are based solely on subhalos.

2. I understand FirstProgenitor and Descendant links, but how does NextProgenitor work?

The 'next progenitor' of a subhalo with ID N points to the "next subhalo" which shares the same descendant as subhalo N. Be careful, therefore: it is not a pointer from a given subhalo to a progenitor of that subhalo. By "next subhalo" we mean the next most massive (for LHaloTree), or the one with the next most massive history (for SubLink).
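To make the linking concrete, here is a minimal sketch of how all progenitors of one subhalo could be collected: follow FirstProgenitor once, then walk the NextProgenitor chain. The three lists form a tiny made-up tree in which each value is simply a position in the lists (with -1 meaning "no link"); this is an illustrative toy layout, not the actual on-disk format of the SubLink or LHaloTree files, and `progenitors_of` is a hypothetical helper.

```python
# Toy tree with five entries; values are positions in these lists, -1 = no link.
first_progenitor = [1, 3, -1, -1, -1]
next_progenitor = [-1, 2, 4, -1, -1]
descendant = [-1, 0, 0, 1, 0]


def progenitors_of(entry):
    """Return all direct progenitors of the given tree entry, in chain order."""
    result = []
    p = first_progenitor[entry]
    while p != -1:
        result.append(p)
        # Step sideways to the next entry that shares the same descendant.
        p = next_progenitor[p]
    return result


print(progenitors_of(0))
print(progenitors_of(1))
```

Every entry reached this way has the starting entry as its descendant, which is a useful consistency check when first working with the trees.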

3. Why are there sudden dips in the mass accretion history (MAH) of a subhalo?

Sometimes a subhalo briefly changes its central/satellite status according to SUBFIND. Since centrals are generally more massive than satellites in SUBFIND, the change results in a visible dip or spike in the MAH of the subhalo.

Web-based API

1. Can I download binary (HDF5) data and inspect it in-memory, without first saving a (temporary) file to disk?

Yes, but this feature is relatively immature in most languages and HDF5 wrappers. For doing this in Python, see h5py issue #552 or the PyTables 'in-memory' page. With the HDF5 library (in C), this requires HDF5 v1.8.9+, and you should look at HDF5 File Image Operations [PDF].

2. wget: unrecognized option `--content-disposition'

The --content-disposition option is how wget knows what filename to save the download as. This option was added in wget v1.11 in Jan 2008. If you are using software this old, you are going to have many problems! Please upgrade.

3. wget: Error 403: Forbidden

In order to use wget to download data files, including snapshots, catalogs, supplementary catalogs, and so on, you need to pass your API key with the request by adding it to the command line. You can find your API key at the top of the Data Access page. It can be included with the following syntax: wget --header="API-Key: KEY_HERE" "https://www.tng-project.org/api/path/to/file.hdf5".
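The same header can be attached from Python. The following is a minimal sketch using only the standard library; it builds the request object without sending it, the URL and KEY_HERE are the same placeholders as in the wget example, and `build_request` is a hypothetical helper name.

```python
import urllib.request


def build_request(url, api_key):
    """Create a request that carries the API key in an 'API-Key' header."""
    return urllib.request.Request(url, headers={"API-Key": api_key})


req = build_request("https://www.tng-project.org/api/path/to/file.hdf5", "KEY_HERE")

# urllib normalizes the header name's capitalization; HTTP header names
# are case-insensitive, so the server treats it the same as "API-Key".
print(req.header_items())

# To actually download, one would pass the request to urllib.request.urlopen(req).
```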
