
Java predict results are different from Python predict results when loading the same model #11221

Closed
henghamao opened this issue Feb 8, 2025 · 4 comments

@henghamao

We used Python code to train an XGBoost model, and then used Java to load the model and make predictions.
We observed that the Java prediction results are quite different from the Python results when loading the same model.
Python version:

Name: xgboost
Version: 2.1.4
Summary: XGBoost Python Package

Java version:

<dependency>
    <groupId>ml.dmlc</groupId>
    <artifactId>xgboost4j_2.12</artifactId>
    <version>2.1.4</version>
</dependency>

The same version, 2.1.4, was used in both the Python and the Java code.

Here is the code to reproduce the issue.

import xgboost as xgb
import numpy as np


X = np.random.rand(100, 229)
y = np.random.rand(100)

dtrain = xgb.DMatrix(X, label=y)

# set the training parameters
params = {
    'objective': 'reg:squarederror',
    'eval_metric': 'rmse',
    'max_depth': 3,
    'eta': 0.1
}

num_round = 10
model = xgb.train(params, dtrain, num_round)
model.save_model('xgb_model.json')

# load the model and predict
model = xgb.Booster(model_file='xgb_model.json')
new_data=[-0.01, 0.255, -0.24, -0.015, 0.6666666, -0.01, -0.72, 0.61916846, 0.4, 0.9957447, 0.619647, 0.0, 0.0, -0.0029980468, -0.005, -0.0027490235, -0.000625, 0.70212764, 5.714286, 0.005, 0.0, 0.0035106381, -0.0125, 0.010638298, -1.0, -0.01074707, 0.00011330161, -4.69, -0.51347524, -1.465625, -0.00299689, 0.002615717, 0.0, 5.671828e-05, -0.00056682917, -0.00056682917, -5.6711848e-05, -0.00010632944, 0.005612563, 0.00299689, 0.003053608, 0.002430065, 0.002430065, 0.0029401786, 0.0028905615, -0.002615717, -0.0025589992, -0.0031825416, -0.0031825416, -0.0026724285, -0.0027220456, 5.671828e-05, -0.00056682917, -0.00056682917, -5.6711848e-05, -0.00010632944, -0.0006235474, -0.0006235474, -0.00011343013, -0.00016304772, 0.0, 0.00051011733, 0.00046049975, 0.00051011733, 0.00046049975, -4.961759e-05, 10.0, 10.0, 0.0, 0.0, 20000.0, 20000.0, 0.0, 10.0, 10.0, 0.0, 1.1916667, 1.4128441, 13.0, 0.10091743, -0.01, -0.01, -0.01, -0.005, -0.01, 0.0, -0.01, -0.01, -0.01, -0.005, -0.01, 0.0, -0.01, -0.01, -0.005, 0.0, -0.01, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -18.0, -13.0, -42.0, -3.0, -19.0, -7.0, 16.0, 22.0, 27.0, 25.0, -14.0, 18.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.003612679, 0.003773796, 0.006413864, 0.002510638, 0.0017606382, 0.002399527, 4.754286, 4.8142858, 5.44898, 4.214286, 3.6373627, 4.1428576, 0.0, 35.0, 0.0, 69.0, 0.0, 28.0, 0.0, 5.0, 0.0, 25.0, -0.16666667, 0.7637626, -0.16666667, 0.7637626, -0.33333334, 0.57735026, 0.0, 0.0, 0.2, 0.75828755, 0.2, 0.75828755, 0.4, 0.6519202, 0.0, 0.0, -0.5, 0.8164966, -0.5, 0.8164966, -0.45, 0.95597535, 0.0, 0.0, -0.5, 1.1355499, -0.525, 1.1525143, -0.425, 1.238367, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5595238, 0.27740905, -1.1515151, 3.2273796, 0.30294397, 0.33926272, 0.6926407, -0.34872004, -1.1388888, 2.5796635, -2.0, 2.0317724, 0.6044226, 0.27588674, 0.68835616, -0.5550575, 0.3097345, 1.1402917, -0.4318182, 1.9581538, 0.31690142, 0.49123436, 0.643168, -0.7472216, -0.09598604, 1.0664079, 0.14205608, 1.1953759, 0.20737448, 0.9390114, 0.72867227, -1.1205546, -0.11328125, -0.08678538, 0.035141055, 0.17725138, 44.092304, -0.62646484, -0.77785134, 0.07804155, 0.180319, 43.923176]
dtest = xgb.DMatrix(np.array([new_data]))
python_prediction = model.predict(dtest)
print("Python predict result 1:", python_prediction)

new_data2 = list(range(229))
dtest = xgb.DMatrix(np.array([new_data2]))
python_prediction = model.predict(dtest)
print("Python predict result 2:", python_prediction)

We used Python to train the XGBoost model, then loaded the model and made two predictions.
One prediction used a custom float array, and the other used a sequential array of the values 0 to 228.
Output:

Python predict result 1: [0.49624503]
Python predict result 2: [0.5902365]
// Imports and a class wrapper (name arbitrary) added so the snippet compiles as-is.
import java.util.Arrays;

import ml.dmlc.xgboost4j.java.Booster;
import ml.dmlc.xgboost4j.java.DMatrix;
import ml.dmlc.xgboost4j.java.XGBoost;
import ml.dmlc.xgboost4j.java.XGBoostError;

public class PredictExample {
    public static void main(String[] args) {
        try {
            Booster booster = XGBoost.loadModel("/data/release/infinity_stock4/xgb_model.json");
            float[] mat = {-0.01f, 0.255f, -0.24f, -0.015f, 0.6666666f, -0.01f, -0.72f, 0.61916846f, 0.4f, 0.9957447f, 0.619647f, 0.0f, 0.0f, -0.0029980468f, -0.005f, -0.0027490235f, -0.000625f, 0.70212764f, 5.714286f, 0.005f, 0.0f, 0.0035106381f, -0.0125f, 0.010638298f, -1.0f, -0.01074707f, 0.00011330161f, -4.69f, -0.51347524f, -1.465625f, -0.00299689f, 0.002615717f, 0.0f, 5.671828e-05f, -0.00056682917f, -0.00056682917f, -5.6711848e-05f, -0.00010632944f, 0.005612563f, 0.00299689f, 0.003053608f, 0.002430065f, 0.002430065f, 0.0029401786f, 0.0028905615f, -0.002615717f, -0.0025589992f, -0.0031825416f, -0.0031825416f, -0.0026724285f, -0.0027220456f, 5.671828e-05f, -0.00056682917f, -0.00056682917f, -5.6711848e-05f, -0.00010632944f, -0.0006235474f, -0.0006235474f, -0.00011343013f, -0.00016304772f, 0.0f, 0.00051011733f, 0.00046049975f, 0.00051011733f, 0.00046049975f, -4.961759e-05f, 10.0f, 10.0f, 0.0f, 0.0f, 20000.0f, 20000.0f, 0.0f, 10.0f, 10.0f, 0.0f, 1.1916667f, 1.4128441f, 13.0f, 0.10091743f, -0.01f, -0.01f, -0.01f, -0.005f, -0.01f, 0.0f, -0.01f, -0.01f, -0.01f, -0.005f, -0.01f, 0.0f, -0.01f, -0.01f, -0.005f, 0.0f, -0.01f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, -18.0f, -13.0f, -42.0f, -3.0f, -19.0f, -7.0f, 16.0f, 22.0f, 27.0f, 25.0f, -14.0f, 18.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.003612679f, 0.003773796f, 0.006413864f, 0.002510638f, 0.0017606382f, 0.002399527f, 4.754286f, 4.8142858f, 5.44898f, 4.214286f, 3.6373627f, 4.1428576f, 0.0f, 35.0f, 0.0f, 69.0f, 0.0f, 28.0f, 0.0f, 5.0f, 0.0f, 25.0f, -0.16666667f, 0.7637626f, -0.16666667f, 0.7637626f, -0.33333334f, 0.57735026f, 0.0f, 0.0f, 0.2f, 0.75828755f, 0.2f, 0.75828755f, 0.4f, 0.6519202f, 0.0f, 0.0f, -0.5f, 0.8164966f, -0.5f, 0.8164966f, -0.45f, 0.95597535f, 0.0f, 0.0f, -0.5f, 1.1355499f, -0.525f, 1.1525143f, -0.425f, 1.238367f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.5595238f, 0.27740905f, -1.1515151f, 3.2273796f, 0.30294397f, 0.33926272f, 0.6926407f, -0.34872004f, -1.1388888f, 2.5796635f, -2.0f, 2.0317724f, 0.6044226f, 0.27588674f, 0.68835616f, -0.5550575f, 0.3097345f, 1.1402917f, -0.4318182f, 1.9581538f, 0.31690142f, 0.49123436f, 0.643168f, -0.7472216f, -0.09598604f, 1.0664079f, 0.14205608f, 1.1953759f, 0.20737448f, 0.9390114f, 0.72867227f, -1.1205546f, -0.11328125f, -0.08678538f, 0.035141055f, 0.17725138f, 44.092304f, -0.62646484f, -0.77785134f, 0.07804155f, 0.180319f, 43.923176f};
            // 1 row x 229 columns; the fourth argument sets the "missing" value to 0.0f
            DMatrix dMatrix = new DMatrix(mat, 1, 229, 0.0f);
            float[][] javaPredictions = booster.predict(dMatrix);
            System.out.println("Java predict result 1: " + Arrays.toString(javaPredictions[0]));

            for (int i = 0; i < mat.length; i++) {
                mat[i] = i;  // sequential values 0..228
            }
            dMatrix = new DMatrix(mat, 1, 229, 0.0f);
            javaPredictions = booster.predict(dMatrix);
            System.out.println("Java predict result 2:" + Arrays.toString(javaPredictions[0]));
        } catch (XGBoostError e) {
            e.printStackTrace();
        }
    }
}

In the Java code, we load the same model and also make two predictions.
One prediction used the same float array, and the other used the same sequential array of 0 to 228.
We found that prediction 1 differs considerably between the Python and Java code, while prediction 2 is exactly the same.
Output:

Java predict result 1: [0.5563295]
Java predict result 2:[0.5902365]

We have no idea why prediction result 1 is so different.

@trivialfis
Member

Thank you for sharing; I will look into it. My initial guess is that there is a floating-point error and one of the values is quite close to a split value. Python uses f64 while Java uses f32.

Any chance you can share the model for reproducing the issue?
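
For illustration only, a minimal Python sketch of that f32/f64 point; the value 0.1 and the split threshold 0.1000000007 below are made up for demonstration, not taken from the model. The same feature value can land on opposite sides of a split depending on the precision it is stored in:

import numpy as np

value_f64 = np.float64(0.1)        # value as Python/NumPy stores it (f64)
value_f32 = np.float32(value_f64)  # the same value after rounding to f32
split = 0.1000000007               # hypothetical split threshold near the value

print(value_f64 < split)           # True  -> routed to the left child
print(float(value_f32) < split)    # False -> routed to the right child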

@henghamao
Author

Thanks for looking into the issue.
The issue is easily reproducible.
The Python code (repeated below) shows how the model is trained on random values.
We tried more than 5 runs, and in every run we observed a different value for prediction result 2.

import xgboost as xgb
import numpy as np


X = np.random.rand(100, 229)
y = np.random.rand(100)

dtrain = xgb.DMatrix(X, label=y)

params = {
    'objective': 'reg:squarederror',
    'eval_metric': 'rmse',
    'max_depth': 3,
    'eta': 0.1
}

num_round = 10
model = xgb.train(params, dtrain, num_round)
model.save_model('xgb_model.json')
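
One aside, not in the original report: because X and y come from np.random.rand without a seed, every run trains a different model, so the absolute prediction values naturally change from run to run; the Python-vs-Java mismatch within a single run is the actual issue. Fixing a seed makes runs comparable:

import numpy as np

np.random.seed(0)  # any fixed seed, so repeated runs train on identical random data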

@ayoub317
Contributor

ayoub317 commented Feb 9, 2025

Hello,
In my opinion, this issue is due to the way missing and NaN values are handled across the different bindings.
In Python, the default value for missing data is NaN (reference).
But in Java you are setting the missing value to 0 with DMatrix(mat, 1, 229, 0.0f).
Both of your test inputs contain 0 values, which Java then interprets as missing, causing the misalignment.
To fix this, you can modify your DMatrix instantiation as follows: DMatrix dMatrix = new DMatrix(mat, 1, 229, Float.NaN).
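
If this diagnosis is right, it can also be confirmed entirely from the Python side: passing missing=0.0 to xgb.DMatrix mirrors the Java call DMatrix(mat, 1, 229, 0.0f), so the prediction would be expected to shift to the Java number. A minimal sketch (new_data is the 229-value array from the original report):

import numpy as np
import xgboost as xgb

model = xgb.Booster(model_file='xgb_model.json')
row = np.array([new_data])

dtest_nan = xgb.DMatrix(row)                # default: missing = NaN (Python behaviour)
dtest_zero = xgb.DMatrix(row, missing=0.0)  # zeros treated as missing, like the Java call

print(model.predict(dtest_nan))   # original Python result
print(model.predict(dtest_zero))  # expected to match "Java predict result 1"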

In Java, the default missing value is 0 in all cases except for two specific places. @trivialfis could you please review this PR to change the default value to NaN as well? It makes more sense than 0 and improves consistency across bindings.

@henghamao
Author

@ayoub317
Thanks!
With this change, we get the same prediction results in the Java code:

DMatrix dMatrix = new DMatrix(mat, 1, 229, Float.NaN);
