Spiral galaxy image data

Alexander Gruber
  • 14 Sep '18

I am trying to figure out how to download image data of as many galaxies as possible, as high resolution as possible. I'm particularly interested in the data being separated into types (e.g. "spirals", "ellipticals", etc.) if that is possible.

I'm having difficulty understanding the guide for accessing the data. Is there a simple way to do a bulk download like this?

Dylan Nelson
  • 14 Sep '18

Hi Alex,

Are you just interested in PNGs, or do you need the images in actual scientific units?

There aren't any simple measurements of type (e.g. spiral vs elliptical) in the group catalogs, but there are a few supplementary catalogs which have some relevant information. If you download one of these, you can use it to make a selection of 'subhalo IDs' for each type, then you could run a wget command to download the images for all the subhalos in each list.

You can look at either the "(b) Photometric Non-Parametric Stellar Morphologies" catalog or the "(c) Stellar Circularities" catalog. For the first, you will probably need to read the reference to see what "Gini,M20" means and how to use it to separate out the types you're interested in. For the second, you can take high values of the "CircAbove07Frac" parameter as indicators of a disk.

Alexander Gruber
  • 14 Sep '18

I am just interested in PNGs (at least, at the moment). This would be for an image processing experiment so scale is not super important.

I see the list of subhalo commands (I think) but I confess I'm not sure how to construct a wget or any other type of download right now. I'm at square one. The documentation is a little overwhelming. Is this the right page to be looking at for just basic downloading? Do I need to register for an API key? Forgive my ignorance.

Dylan Nelson
  • 14 Sep '18

Hi Alex,

An easy way is maybe to wget a list of links that you first put in a text file.

The links will look like:

http://www.illustris-project.org/api/Illustris-1/snapshots/135/subhalos/132699/stellar_mocks/image_fof.png

where the number 132699 is the subhalo ID, and this is the only thing you need to change.

To get a list of subhalo IDs, I suggest you download the Stellar Circularities supplemental catalog for Illustris-1 at z=0. This is an HDF5 file, described on the documentation page. One of its entries is SubfindID, which is what you need. You could first try to get just the first 10 or 100 using this procedure, then later sort them by the CircAbove07Frac field as I mentioned above.

To actually run the wget command, you will need to register and get an API key, and follow the syntax e.g. shown at the top of the galaxy observatory page.
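As a rough sketch of the link-list approach described above, the following Python writes one image URL per line to a text file. The URL pattern is the one quoted in this post; the function names and the output filename are made up for illustration.

```python
# URL pattern quoted above; only the subhalo ID changes.
URL_TEMPLATE = (
    "http://www.illustris-project.org/api/Illustris-1/snapshots/135/"
    "subhalos/{subhalo_id}/stellar_mocks/image_fof.png"
)


def build_image_urls(subhalo_ids):
    """Return one image URL per subhalo ID."""
    return [URL_TEMPLATE.format(subhalo_id=int(sid)) for sid in subhalo_ids]


def write_url_list(subhalo_ids, path="image_urls.txt"):
    """Write the URLs to a text file, one per line (filename is hypothetical)."""
    urls = build_image_urls(subhalo_ids)
    with open(path, "w") as handle:
        handle.write("\n".join(urls) + "\n")
    return urls
```

The resulting file can then be handed to wget through its -i option, together with the API key in the syntax shown on the galaxy observatory page.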

Alexander Gruber
  • 13 Oct '18

Thanks for all your help. I've got it working well enough to have downloaded a few pngs just using a wget on the first 100 subhalo IDs. I've gone through the data access guide for loading the group 135 data and have that working now, but I'm having trouble figuring out how to open that stellar circularities catalog.

Dylan Nelson
  • 13 Oct '18

Hi Alexander,

This is just an HDF5 file. You can use "h5py" in python to load it, and its webpage has some simple tutorials to help there.
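A minimal sketch of a first look at such a file with h5py: it only lists the datasets and their shapes, so it makes no assumption about how the catalog is laid out internally. The helper names are hypothetical.

```python
def list_datasets(node, prefix=""):
    """Recursively collect (path, shape) for every dataset below a group.

    Works on an open h5py.File or Group: anything exposing .keys() is
    treated as a group, everything else as a dataset.
    """
    found = []
    for name in node.keys():
        item = node[name]
        path = f"{prefix}/{name}"
        if hasattr(item, "keys"):
            # A group: descend into it.
            found.extend(list_datasets(item, path))
        else:
            # A dataset: record its shape if it has one.
            found.append((path, getattr(item, "shape", None)))
    return found


def describe_hdf5(path):
    """Open an HDF5 file read-only and list its datasets."""
    import h5py  # imported here so list_datasets itself has no dependencies

    with h5py.File(path, "r") as handle:
        return list_datasets(handle)
```

Calling describe_hdf5("stellar_circs.hdf5") should show where fields such as SubfindID and CircAbove07Frac live, after which they can be read as arrays with the usual h5py slicing, e.g. handle[dataset_path][:].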

Alexander Gruber
  • 14 Oct '18

I figured out how to get a [SubfindID, CircAbove07Frac] list and sort it by CircAbove07Frac, then took the SubfindID entries and put them into a file list of the form you gave above. It seems like most of them are not getting found, though. Does every subhalo have an image?

Dylan Nelson
  • 14 Oct '18

No, only relatively large subhalos have images, and your list is probably dominated by small objects which aren't really of interest to you. If you apply the same criterion as the images (M* > 10^10 Msun), then you should find nearly all of them.
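The selection could be sketched roughly as below. It assumes three arrays that are aligned entry by entry (SubfindID, CircAbove07Frac, and a stellar mass per subhalo) and that the stellar mass has already been converted to solar masses; the catalog stores masses in its own units, so check the documentation for the conversion. The function name and the top_n parameter are illustrative.

```python
import numpy as np


def select_disk_candidates(subfind_id, circ_frac, stellar_mass_msun,
                           min_mass_msun=1e10, top_n=100):
    """Return the SubfindIDs of the most disk-like massive subhalos.

    Keeps subhalos with M* > min_mass_msun, then sorts the survivors by
    CircAbove07Frac from highest to lowest and returns the first top_n IDs.
    """
    subfind_id = np.asarray(subfind_id)
    circ_frac = np.asarray(circ_frac, dtype=float)
    stellar_mass_msun = np.asarray(stellar_mass_msun, dtype=float)

    # Same mass criterion as the images; also drop undefined circularities.
    keep = (stellar_mass_msun > min_mass_msun) & np.isfinite(circ_frac)
    ids, frac = subfind_id[keep], circ_frac[keep]

    order = np.argsort(frac)[::-1]  # descending in CircAbove07Frac
    return ids[order][:top_n]
```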

Alexander Gruber
  • 14 Nov '18

Sorry to bother you again-- I have two more questions. If I should be posting them as separate threads, please let me know and I'll do so.

The first question is probably dumb, but how do I link up a supplementary data catalog to use with the main group catalog at the same time? Earlier, when I sorted the SubfindIDs by CircAbove07Frac, I was using stellar_circs.hdf5. If I want to get, say, the SubhaloMass associated with that SubfindID, the SubhaloMass is located in groups_135.0.hdf5. Can these hdf5s cooperate together so that this is possible?

Second, I noticed that there is some discussion of (relative) coordinates of particles in the properties for halos and subhalos. Does this imply that there is a way to access point cloud data-- specifically, to get a list of coordinates corresponding to the stars/other bodies that comprise a galaxy? Currently I've been trying to reverse engineer data in this form from the images via several methods (e.g. sampling points using the image as a histogram distribution), but it occurs to me that this data may exist already and I'm being redundant.

Dylan Nelson
  • 14 Nov '18

Hi Alex,

Yes, the SubfindID is actually the index into the group catalog, at that snapshot. You can load all the fields of a single subhalo with the illustris_python.groupcat.loadSingle() function of the scripts (see examples). Just a caution that "groups_135.0.hdf5" is only one of many "chunks" of the catalog, so be careful with the indexing here.
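Because the SubfindID is an index into the full group catalog, linking the two files comes down to array indexing. The sketch below assumes the field has been loaded for the whole snapshot (all chunks combined), for instance with illustris_python.groupcat.loadSubhalos(); the wrapper names are hypothetical.

```python
import numpy as np


def field_for_subfind_ids(full_catalog_field, subfind_ids):
    """Pick the catalog entries that belong to the given SubfindIDs.

    full_catalog_field must cover the whole snapshot, so that position i
    holds the value for subhalo i.
    """
    full_catalog_field = np.asarray(full_catalog_field)
    subfind_ids = np.asarray(subfind_ids, dtype=np.int64)
    if subfind_ids.size and (
        subfind_ids.min() < 0 or subfind_ids.max() >= len(full_catalog_field)
    ):
        raise IndexError(
            "SubfindID outside the loaded catalog; was only one chunk loaded?"
        )
    return full_catalog_field[subfind_ids]


def subhalo_mass_for_ids(base_path, snap, subfind_ids):
    """Load SubhaloMass for the full catalog and index it by SubfindID."""
    import illustris_python as il  # the helper scripts mentioned above

    loaded = il.groupcat.loadSubhalos(base_path, snap, fields=["SubhaloMass"])
    # A single requested field may come back as a bare array or inside a
    # dict depending on the script version, so handle both.
    masses = loaded["SubhaloMass"] if isinstance(loaded, dict) else loaded
    return field_for_subfind_ids(masses, subfind_ids)
```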

Second, yes you can always obtain all the member particles/cells of a given subhalo/halo. If you have the snapshot downloaded, you can directly use the illustris_python.snapshot.loadSubhalo() function of the helper scripts. If you haven't downloaded the snapshot, you can get a "cutout" directly from the web-API using [base]/subhalos/{id}/cutout.hdf5 (see docs).
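For the web-API route, a sketch using only the Python standard library might look like the following. The URL follows the pattern quoted earlier in this thread with [base] taken to be the snapshot URL; the "api-key" header name is an assumption that should be checked against the API documentation, and the function names are illustrative.

```python
import urllib.request

API_ROOT = "http://www.illustris-project.org/api"


def cutout_url(simulation, snap, subhalo_id):
    """Build [base]/subhalos/{id}/cutout.hdf5 for one subhalo."""
    return (
        f"{API_ROOT}/{simulation}/snapshots/{int(snap)}"
        f"/subhalos/{int(subhalo_id)}/cutout.hdf5"
    )


def download_cutout(simulation, snap, subhalo_id, api_key, out_path):
    """Fetch the particle cutout of one subhalo and save it to out_path."""
    request = urllib.request.Request(
        cutout_url(simulation, snap, subhalo_id),
        headers={"api-key": api_key},  # header name is an assumption
    )
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        out.write(response.read())
    return out_path
```

The saved file is again plain HDF5, so it can be opened with h5py like the supplementary catalog.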

Alexander Gruber
  • 7 Jan

Thanks-- I pulled the subhalos. A couple questions about the results. It seems like many of the subhalos consist of more than one galaxy that appear topologically separate, e.g. these two:
tempexample1.png
tempexample2.png
and also there are some (quite a few) subhalos consisting of only a few sparse stars, e.g.
tempexample3.png
I suppose this was to be expected as subhalos and galaxies aren't the same thing, and there are obvious difficulties in defining what constitutes a single galaxy based purely on coordinate data (particularly in the case of mergers or big interacting clusters). Is there a way among the other physical parameters to separate the stars into galaxy-like groups? In other words, I recall you saying there isn't a simple measurement of galaxy type-- is there a measurement to select discrete galaxies?

Dylan Nelson
  • 7 Jan

Hi Alex,

I think you are seeing galaxies wrap around the periodic box, as here:

http://www.illustris-project.org/data/forum/topic/136/some-questions-on-particles/
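One common way to undo such wrapping, sketched here under the assumption of a cubic periodic box, is to express every position relative to a reference point and shift each offset by a whole number of box lengths so that it becomes the shortest one. The box side length has to be supplied in the same units as the coordinates (it is an input here, not something this sketch looks up), and the function name is hypothetical.

```python
import numpy as np


def recenter_periodic(coords, box_size, center=None):
    """Return positions relative to center, unwrapped across a periodic box.

    coords   : (N, 3) array of positions
    box_size : side length of the periodic box, same units as coords
    center   : reference point; defaults to the first particle
    """
    coords = np.asarray(coords, dtype=float)
    if center is None:
        center = coords[0]
    delta = coords - np.asarray(center, dtype=float)
    # Shift each offset by a whole number of box lengths so it becomes the
    # shortest periodic separation, i.e. lands within half a box of zero.
    delta -= box_size * np.round(delta / box_size)
    return delta
```

After this step, a galaxy that straddles the box edge appears as one compact object instead of two pieces on opposite sides.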

Alexander Gruber
  • 9 Jan

You mean it could be the result of wraparound? That is certainly possible for a few of the examples I've seen, but I'm not sure it could be for most of them. For example, the following set is rotated to be viewed more or less head-on through one of the coordinate planes, and there would definitely have to be more than one cluster, whether or not there is wraparound.

example2.png

or, for a more extreme example, the subhalo with the greatest size point cloud looks like this on my machine

exampleangle1.png

Have I misunderstood in some way how these are structured?

Dylan Nelson
  • 9 Jan

Can you provide the run,snap,subhalo_Id of that first example?

Alexander Gruber
  • 10 Jan

That would be subhalo 62 in snap 135 of Illustris 3.

Dylan Nelson
  • 10 Jan

Hi Alex,

This is of course a satellite of the first group. I checked and loaded the dark matter, stars, and gas particles, but found them all localized and not quite as you show above. For instance, for the gas:

In [9]: pos_gas[:,0].max() - pos_gas[:,0].min()
Out[9]: 41.335938

In [10]: pos_gas[:,1].max() - pos_gas[:,1].min()
Out[10]: 32.875

In [11]: pos_gas[:,2].max() - pos_gas[:,2].min()
Out[11]: 42.847656

In [15]: pos_gas.shape
Out[15]: (143, 3)

In [16]: pos_gas[:,0].mean()
Out[16]: 74701.11

Maybe verify you have the same numbers above, and check that you are loading the correct particles?

Alexander Gruber
  • 10 Jan

This would just be for the star particles. I got them with

data = il.snapshot.loadHalo(basePath,135,62,'stars');
data['Coordinates']

which I then exported to a csv. I'm not getting the same numbers, e.g.

>>> data['Coordinates'][:,1].max() - data['Coordinates'][:,1].min()
1741.0605

Dylan Nelson
  • 10 Jan

Hi Alex,

Yes, your numbers are correct (halo 62, not subhalo 62), and it's true this halo spans ~1.5 Mpc. This is slightly large for its mass, and in this case I would guess a large halo-halo merger has just taken place, such that what you are effectively seeing is two (or more) halos "bridged" into a single object. This "FoF bridging" can always occur since the FoF algorithm will at some point link together two previously distinct structures. If you look at the merger tree of this halo, you'll probably be able to spot the distinct (large) progenitor branches. If you look at the subhalos, you see the first three have very large M ~ 10^11.1, and the fourth has M ~ 10.1 (i.e. the first true satellite, the other three being roughly equal mass, essentially central galaxies).
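The halo versus subhalo mix-up can be checked directly by loading the same ID both ways and comparing the spatial extent along each axis. This is a sketch using the loadHalo() and loadSubhalo() functions of the helper scripts; the wrapper names are hypothetical, and it ignores periodic wrapping, which would need to be handled first for objects near the box edge.

```python
import numpy as np


def axis_extent(coords):
    """Return max - min along x, y and z for an (N, 3) coordinate array."""
    coords = np.asarray(coords, dtype=float)
    return coords.max(axis=0) - coords.min(axis=0)


def compare_halo_and_subhalo(base_path, snap, object_id, part_type="stars"):
    """Load the same ID as a halo and as a subhalo and compare extents."""
    import illustris_python as il  # the helper scripts used in this thread

    # Note: this assumes both objects contain at least one particle of the
    # requested type; otherwise no 'Coordinates' entry is returned.
    halo = il.snapshot.loadHalo(base_path, snap, object_id, part_type)
    subhalo = il.snapshot.loadSubhalo(base_path, snap, object_id, part_type)
    return {
        "halo": axis_extent(halo["Coordinates"]),
        "subhalo": axis_extent(subhalo["Coordinates"]),
    }
```

A halo extent far larger than the subhalo extent for the same ID is the signature of having loaded the FoF group rather than the individual galaxy.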

Alexander Gruber
  • 11 Jan

I see-- so, I guess then that, in general, Subhalos more likely will correspond to distinct galaxies than Halos?

Is the intuition here that Subhalos correspond to clusters and Halos correspond to clusters that are close enough to be significantly interacting?

Dylan Nelson
  • 11 Jan

Yes you should always use 'subhalos' as 'galaxies'. A general medium-sized halo will have a large central galaxy and many satellite galaxies, each of which will be picked up as separate subhalos.
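To go from a halo to its galaxies, one option is to list the subhalos that belong to it. The sketch below assumes the halo catalog provides the index of each halo's first subhalo and its number of subhalos (the GroupFirstSub and GroupNsubs fields, as I understand the group catalog documentation), that a halo's subhalos are stored contiguously with the central first, and that a negative first index marks a halo without subhalos. Please verify these assumptions against the documentation; the function name is illustrative.

```python
def subhalo_ids_in_halo(group_first_sub, group_nsubs, halo_id):
    """Return (central_id, satellite_ids) for one halo.

    group_first_sub : per-halo index of the first (central) subhalo
    group_nsubs     : per-halo number of subhalos
    """
    first = int(group_first_sub[halo_id])
    count = int(group_nsubs[halo_id])
    if first < 0 or count == 0:
        # Halo without any subhalo: no galaxy to return.
        return None, []
    ids = list(range(first, first + count))
    return ids[0], ids[1:]
```

Each returned ID can then be passed to illustris_python.snapshot.loadSubhalo() to get the particles of that one galaxy.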
