For the estimation of gamma in k-prototypes, the current implementation appears to estimate gamma as 0.5 * (standard deviation of all numeric data).
However, the paper [Huang 1997] says that gamma is guided by the "average standard deviation of numeric attributes". If that is the case, shouldn't we compute the standard deviation of each numeric attribute and then take the mean of those values?
Generally speaking, γ_l is related to σ_l, the average standard deviation of numeric attributes in cluster l. In practice, σ_l can be used as a guidance to determine γ_l. However, since σ_l is unknown before clustering, the overall average standard deviation σ of numeric attributes can be used for all σ_l.
So yes, it appears you are correct in your statement.
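To make the difference concrete, here is a minimal sketch of the two estimates in plain Python. It is not the library's code: the function names and the toy data are made up for illustration, the "pooled" version only follows the behavior as described in this issue, and the 0.5 factor is taken from that same description. With NumPy arrays the two would roughly correspond to `0.5 * X.std()` versus `0.5 * X.std(axis=0).mean()`.

```python
from statistics import mean, pstdev


def gamma_pooled_std(rows):
    """0.5 * std of all numeric values pooled into one list.

    Hypothetical helper mirroring the behavior described in this issue.
    """
    pooled = [value for row in rows for value in row]
    return 0.5 * pstdev(pooled)


def gamma_mean_attribute_std(rows):
    """0.5 * mean of the per-attribute (per-column) standard deviations.

    Hypothetical helper following the reading of Huang (1997) quoted above.
    """
    columns = list(zip(*rows))  # transpose rows into attribute columns
    return 0.5 * mean(pstdev(column) for column in columns)


# Toy data (assumed): two numeric attributes on very different scales.
rows = [(1.0, 100.0), (2.0, 200.0), (3.0, 300.0)]

print("pooled std estimate:        ", gamma_pooled_std(rows))
print("mean attribute std estimate:", gamma_mean_attribute_std(rows))
```

When the attributes are on different scales, the pooled estimate also picks up the spread between the column means, so it comes out larger than the mean of the per-attribute standard deviations. If all numeric attributes are standardized beforehand, the two estimates are much closer.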