Selecting some features and inputting them as a character vector to the "select" parameter of ggpicrust2 doesn't work. #134

Geonhui-Kang · 2025-01-25T12:28:59Z

Hello. I got warnings from my ggpicrust2 analysis.

Below is my code for the ggpicrust2 analysis.
results_file_input_4 <- ggpicrust2(data = abundance_data_filtered[, c(1,40:47,48:57)], metadata = metadata[c(39:46,47:56), ], group = "Group", pathway = "KO", daa_method = "edgeR", ko_to_kegg = TRUE, order = "pathway_class", p_values_bar = TRUE, x_lab = "pathway_name", select = result_4_select)

The number of features with statistical significance exceeds 30, leading to suboptimal visualization. Please use 'select' to reduce the number of features.
Currently, you have these features: "ko05412", "ko03450", "ko04142", "ko00604", "ko04260", "ko05142", "ko04973", "ko04974", "ko04976", "ko00565", "ko00624", "ko00941", "ko01053", "ko00100", "ko05219", "ko00531", "ko00364", "ko05130", "ko03050", "ko00361", "ko05143", "ko04020", "ko05414", "ko05012", "ko05150", "ko05131", "ko00196", "ko02060", "ko04622", "ko00511", "ko04972", "ko00540", "ko00140", "ko05100", "ko05410", "ko00906", "ko04210", "ko00944", "ko04144", "ko00930".
You can find the statistically significant features with the following command:
daa_results_df %>% filter(p_adjust < 0.05) %>% select(c("feature","p_adjust"))

So I selected a subset of these features, fewer than 30, and passed it to ggpicrust2 through the "select" parameter.

result_4_select <- c("ko05412", "ko03450", "ko04142", "ko00604", "ko04260", "ko05142", "ko04973", "ko04974", "ko04976", "ko00565", "ko00624", "ko00941", "ko01053", "ko00100", "ko05219", "ko00531", "ko00364", "ko05130", "ko03050", "ko00361", "ko05143", "ko04020", "ko05414", "ko05012", "ko05150", "ko05131", "ko00196", "ko02060", "ko04622")

But I got an error saying "Some selected samples are not present in the abundance data."
I can't understand this error, because the pathways I selected are the ones listed in the analysis output.

Can I get some information about this problem?

Thank you.


cafferychen777 · 2025-01-27T03:36:22Z

Hi! I've identified the issue with your code. The error message "Some selected samples are not present in the abundance data" occurs due to incorrect parameter usage. Let me explain in detail:

In your code, you're using:

results_file_input_4 <- ggpicrust2(
    data = abundance_data_filtered[, c(1,40:47,48:57)], 
    metadata = metadata[c(39:46,47:56), ], 
    group = "Group", 
    pathway = "KO", 
    daa_method = "edgeR", 
    ko_to_kegg = TRUE, 
    order = "pathway_class", 
    p_values_bar = TRUE, 
    x_lab = "pathway_name", 
    select = result_4_select
)

The issue is with how the select parameter is being used. In the ggpicrust2 function, the select parameter should be used during visualization, not during the initial analysis. Here's how to modify your code:

# Step 1: Perform differential analysis without using select parameter
results_file_input_4 <- ggpicrust2(
    data = abundance_data_filtered[, c(1,40:47,48:57)], 
    metadata = metadata[c(39:46,47:56), ], 
    group = "Group", 
    pathway = "KO", 
    daa_method = "edgeR", 
    ko_to_kegg = TRUE, 
    order = "pathway_class", 
    p_values_bar = TRUE, 
    x_lab = "pathway_name"
)

# Step 2: Get the differential analysis results dataframe
daa_results_df <- results_file_input_4[[1]]$results

# Step 3: Use pathway_errorbar for visualization with select parameter
p <- pathway_errorbar(
    abundance = abundance_data_filtered[, c(1,40:47,48:57)],
    daa_results_df = daa_results_df,
    Group = metadata[c(39:46,47:56), ]$Group,
    ko_to_kegg = TRUE,
    p_values_threshold = 0.05,
    order = "pathway_class",
    select = result_4_select,  # Use select here
    p_value_bar = TRUE,
    x_lab = "pathway_name"
)

This modification should resolve the issue. The select parameter is meant for filtering pathways during the visualization stage, not during the differential analysis stage.
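If you would rather build the select vector from the results table than type it by hand, something like the following may help. It is a minimal base R sketch that assumes daa_results_df has the feature and p_adjust columns mentioned in the warning message. The toy data frame and the helper name top_features are made up for illustration and are not part of ggpicrust2.

```r
# Toy stand-in for daa_results_df; real results come from ggpicrust2().
# The column names `feature` and `p_adjust` are taken from the warning text.
toy_daa_results <- data.frame(
  feature  = c("ko05412", "ko03450", "ko04142", "ko00604", "ko04260"),
  p_adjust = c(0.001, 0.20, 0.03, 0.004, 0.049),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not a ggpicrust2 function): keep the significant
# features, sort them by adjusted p-value, and return at most `n` IDs.
top_features <- function(daa_df, n = 30, alpha = 0.05) {
  sig <- daa_df[!is.na(daa_df$p_adjust) & daa_df$p_adjust < alpha, ]
  sig <- sig[order(sig$p_adjust), ]
  head(unique(sig$feature), n)
}

# Keep the three most significant features from the toy table.
selected <- top_features(toy_daa_results, n = 3)
print(selected)
```

The resulting character vector could then be passed as select in Step 3 in place of a hand-written list.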

If you still encounter issues, please check:

- Ensure that the pathway IDs in result_4_select exactly match those in the feature column of daa_results_df.
- Verify that the sample order is consistent between your data and metadata.
- Confirm that the grouping information in the Group column is correct.
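The three checks above can be sketched in base R. The objects below are small made-up stand-ins, and the layout is an assumption (feature IDs in the first column of the abundance table, a sample ID column in the metadata that I have called sample_name here), so adjust the names to your own data.

```r
# Toy stand-ins (made up). Assumed layout: the abundance table has feature
# IDs in its first column and one column per sample; the metadata has one
# row per sample.
abundance <- data.frame(
  feature = c("K00001", "K00002"),
  S1 = c(10, 5), S2 = c(3, 8), S3 = c(7, 1)
)
metadata <- data.frame(
  sample_name = c("S1", "S2", "S3"),  # hypothetical column name
  Group = c("A", "A", "B")
)
daa_results_df <- data.frame(
  feature  = c("ko05412", "ko03450"),
  p_adjust = c(0.01, 0.03)
)
select_ids <- c("ko05412", "ko99999")  # second ID is deliberately wrong

# Check 1: selected IDs that do not appear in the results table.
missing_ids <- setdiff(select_ids, daa_results_df$feature)

# Check 2: sample columns appear in the same order as the metadata rows.
same_order <- identical(colnames(abundance)[-1],
                        as.character(metadata$sample_name))

# Check 3: number of samples per group.
group_counts <- table(metadata$Group)

print(missing_ids)
print(same_order)
print(group_counts)
```

An empty missing_ids and a TRUE same_order would suggest the inputs line up. Anything else points to where the mismatch is.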

I hope this helps! Feel free to ask if you have any questions.

Geonhui-Kang closed this as completed Feb 3, 2025
