A strange bug about hdf5 file created by myself

Zijian Zhang
  • 18 Sep '20

Hi,

I created some hdf5 files with ~1e7 items (~1 GB), organized the same way as the original simulation files, in the JupyterLab workspace. However, I can't do any calculation on them, e.g. np.where or f['PartType0']['Coordinates'][:,1]. Running 'top' in the terminal, I found the process always became a zombie. I don't understand why, because the 600 files you provide (e.g. /home/tnguser/sims.TNG/L205n2500TNG/output/snapdir_099/snap_099.0.hdf5) have ~2e7 items and work fine. Maybe I can provide my id and password in some way?

Thanks

Dylan Nelson
  • 18 Sep '20

Hi Zijian,

First, these files are a bit strange

# h5ls -r snap_099.hdf5
/                        Group
/PartType0               Group
/PartType0/Coordinates   Dataset {12116520/Inf, 3}
/PartType0/ElectronDensity Dataset {12116520/Inf}

I have never seen "Inf" like that. Perhaps you are creating them in an odd way? That may be what is causing h5py to be very slow and to misbehave when reading them.

I would suggest you create them in the simplest way possible. Given a numpy array x, write it to the file like:

with h5py.File('out.hdf5','w') as f:
    f['x'] = x
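To make the suggestion above concrete, here is a minimal, self-contained round trip (the filename and array are placeholders): a plain fixed-shape dataset with no maxshape, chunking, or filters, written and then read back.

```python
import numpy as np
import h5py

# A plain fixed-shape dataset: no maxshape, no chunking, no filters.
x = np.random.rand(100, 3)
with h5py.File('out.hdf5', 'w') as f:
    f['x'] = x

with h5py.File('out.hdf5', 'r') as f:
    y = f['x'][:]        # read the whole dataset
    col = f['x'][:, 1]   # or just one column, as in the question

assert np.array_equal(x, y)
```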
Zijian Zhang
  • 19 Sep '20

Hi Dylan,

Thank you for your reply!

"Inf" is shown because I set the maxshape of dataset is maxshape=(None,3) (or maxshape=(None,)) and I am trying to set the shape large enough to see whether the problem still exists. However, I find that sometimes the cpu usage is high (~99%) but sometimes is low (<10%), which causes the time of my calculation varys from ~1h to 10h (maybe longer). I find some explanations (e.g. https://stackoverflow.com/questions/47259358/jupyter-notebook-low-cpu-usage) but the link given in this link has been dead. Could you please give some suggestions?

Cheers

Dylan Nelson
  • 20 Sep '20

Hello,

You shouldn't use None in a shape; it forces chunked storage and leads to non-ideal behavior. Just use a plain fixed size, and I also suggest avoiding all filters/compression/chunking/etc.
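One way to repair an already-written file along these lines (filenames and the demo dataset are hypothetical) is to copy each extensible dataset into a new file as a plain fixed-shape dataset. Note this reads each dataset fully into memory, which is fine at the ~1 GB scale mentioned above:

```python
import numpy as np
import h5py

# Build a small demo "source" file with an extensible (chunked) dataset,
# standing in for the problematic snapshot file.
coords = np.arange(30.0).reshape(10, 3)
with h5py.File('src.hdf5', 'w') as f:
    f.create_dataset('PartType0/Coordinates', data=coords, maxshape=(None, 3))

# Copy every dataset into a new file as a plain fixed-shape dataset
# (no maxshape, no chunking, no filters), as suggested above.
with h5py.File('src.hdf5', 'r') as src, h5py.File('dst.hdf5', 'w') as dst:
    def copy_plain(name, obj):
        if isinstance(obj, h5py.Dataset):
            dst[name] = obj[...]  # read fully, write contiguous
    src.visititems(copy_plain)
```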
