
pdf_maker.py demo

Chris Morrison edited this page Apr 13, 2017 · 7 revisions

Now that we've created an HDF5 data file with the pair_maker.py portion of the-wizz, we can start thinking about creating clustering redshifts for sub-samples of this catalog. In this example we'll create a clustering-z for a sample selected in photometric redshift with z=0.3-0.5. Included in the demo files is a Jupyter notebook that runs the selection below, produces a clustering-z using the-wizz, and plots the result. Again, you can find the files for this demo within the-wizz repository at

the-wizz/tests/data

First off, we need to select this sub-sample from the FITS file we used to create the HDF5 data file. That is, we need to cut a sub-sample out of the file

COSMOS_iband_2009_radecidstomp_regionzp_best.fits

Using your favorite catalog manipulation tool, we make a selection on the photometric redshift column as

0.3 <= zp_best < 0.5

and save it to a file. I've chosen the name COSMOS_iband_2009_radecidstomp_regionzp_best_zp0.3t0.5.fits. Now we have everything we need to create a clustering-z for this sample.
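The cut itself is straightforward; here is a minimal sketch using numpy, with a small synthetic structured array standing in for the FITS catalog (in practice you would read the real columns with a FITS reader such as astropy.io.fits):

```python
import numpy as np

# Synthetic stand-in for the COSMOS catalog: id, position, and photo-z columns.
catalog = np.array(
    [(0, 150.1, 2.2, 0.25), (1, 150.2, 2.3, 0.30),
     (2, 150.3, 2.4, 0.45), (3, 150.4, 2.5, 0.50)],
    dtype=[("id", "i8"), ("ra", "f8"), ("dec", "f8"), ("zp_best", "f8")],
)

# Apply the photometric-redshift selection 0.3 <= zp_best < 0.5.
mask = (catalog["zp_best"] >= 0.3) & (catalog["zp_best"] < 0.5)
sub_sample = catalog[mask]

print(sub_sample["id"])  # only ids 1 and 2 survive the cut
```

Note the asymmetric bounds: the lower edge is inclusive and the upper edge exclusive, so an object at exactly zp_best = 0.5 falls into the next bin rather than being double counted.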

Using this sub-sample we run the pdf_maker portion of the-wizz with the following command:

python the-wizz/pdf_maker.py \
    --input_pair_hdf5_file=COSMOS30_iband_2009_the-wizz_nregion8.hdf5 \
    --pair_scale_name=kpc100t300 \
    --unknown_sample_file=COSMOS_iband_2009_radecidstomp_regionzp_best_zp0.3t0.5.fits \
    --unknown_index_name=id \
    --unknown_stomp_region_name=stomp_region \
    --output_pdf_file_name=COSMOS30_iband_2009_kpc100t300_z0.3t0.5.ascii \
    --z_min=0.01 --z_max=1.5 --z_n_bins=25 \
    --z_binning_type=logspace \
    --use_inverse_weighting \
    --n_processes=2

In order, this command for pdf_maker:

- loads our HDF5 file,
- selects the physical scale R=100-300 kpc as the one to run,
- loads the sub-sample of objects we just created,
- specifies the unique id of these objects to match in the HDF5 file,
- specifies the column name of the spatial regions to bootstrap over,
- defines the name of the file to output results to,
- sets the minimum and maximum redshifts to bin the reference sample over,
- sets the number of bins for the reference sample,
- sets the type of binning, in this case logspace binning,
- enables signal-matched-filter (inverse) weighting as a function of physical scale, and
- uses two cores to compute the results.

Descriptions of each of these flags can be found in the file input_flags.py.
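To build intuition for the --z_binning_type=logspace flag, here is an illustration of what logarithmically spaced bin edges between --z_min and --z_max look like. This assumes simple log spacing in z; the-wizz's exact edge convention is defined in its source, so treat this as a sketch rather than the tool's precise binning:

```python
import numpy as np

# Assumed convention: 25 bins means 26 edges spaced logarithmically
# between z_min and z_max (the-wizz's actual convention may differ).
z_min, z_max, n_bins = 0.01, 1.5, 25
edges = np.logspace(np.log10(z_min), np.log10(z_max), n_bins + 1)

# Logspace binning gives narrow bins at low redshift and wide bins at
# high redshift, roughly tracking how precision degrades with distance.
print(edges[0], edges[-1], len(edges))
```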

If we plot the results, they look like this:

[Figure: Clustering-z for the bin z=0.3-0.5 from COSMOS, normalized to an integral of 1.]
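The unit-integral normalization mentioned in the caption can be done with a simple trapezoidal integral. A sketch with made-up recovery values (the real amplitudes would be read from the pdf_maker .ascii output file):

```python
import numpy as np

# Hypothetical bin centers and raw clustering-z amplitudes; a toy
# recovery peaked near z ~ 0.4, standing in for the pdf_maker output.
z = np.linspace(0.05, 1.45, 25)
raw_nz = np.exp(-0.5 * ((z - 0.4) / 0.1) ** 2)

# Normalize so the curve integrates to 1 over the binned redshift range.
nz = raw_nz / np.trapz(raw_nz, z)

print(np.trapz(nz, z))  # ~1.0 by construction
```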

That's the basics of it. There are more advanced options for running the-wizz or getting higher-quality recoveries, but this is the simplest analysis one can do.

With the output of the code one could then use one of the many galaxy bias mitigation techniques in post-processing to yield a more accurate clustering-z estimate. These bias mitigations can take the form of, e.g., Newman 2008, Menard et al. 2013, or Rahman et al. 2016. The choice is left to the user.
