Can Loose Hits be Reported in Publication? #303

Open
drbmanna opened this issue Jan 23, 2025 · 2 comments

@drbmanna

Dear CARD RGI Team,

I'm analyzing metagenome-assembled genomes (MAGs) from wastewater samples for antibiotic resistance genes (ARGs). Using RGI main with the 'exclude nudge' and 'high quality coverage' parameters, I observe 2-5 strict hits and 150-400 loose hits per sample.
While the CARD 2017 and 2023 papers indicate that loose hits are relevant for ARG discovery, I'm seeking:

  • Guidelines for reporting resistome dynamics using loose hits
  • Necessary precautions when interpreting loose hits results

Any relevant literature references discussing loose hits would be appreciated.

Thank you,
BM

@agmcarthur
Collaborator

Hello! I'll quote some of the documentation, found here: https://github.com/arpcard/rgi/blob/master/docs/rgi_main.rst. We do not yet have an RGI publication.

The RGI analyzes genome or proteome sequences under a Perfect, Strict, and Loose (a.k.a. Discovery) paradigm. The Perfect algorithm is most often applied to clinical surveillance as it detects perfect matches to the curated reference sequences in CARD. In contrast, the Strict algorithm detects previously unknown variants of known AMR genes, including secondary screen for key mutations, using detection models with CARD's curated similarity cut-offs to ensure the detected variant is likely a functional AMR gene. The Loose algorithm works outside of the detection model cut-offs to provide detection of new, emergent threats and more distant homologs of AMR genes, but will also catalog homologous sequences and spurious partial matches that may not have a role in AMR. Combined with phenotypic screening, the Loose algorithm allows researchers to hone in on new AMR genes.

There is also a discussion of the technical definitions of Perfect, Strict, and Loose in issue #140.
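
As a rough illustration of the paradigm quoted above, here is a minimal sketch of how a single hit might be binned. It is a simplification under stated assumptions, not RGI's actual implementation: the `Hit` class and its field names are hypothetical, and the sketch ignores the secondary mutation screen and nudging.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical container for one alignment against a CARD reference."""
    bitscore: float               # bitscore of the query against the reference
    pass_bitscore: float          # curated bitscore cut-off of the detection model
    identical_to_reference: bool  # True when the query matches the reference exactly


def classify(hit: Hit) -> str:
    """Bin a hit as Perfect, Strict, or Loose (simplified)."""
    if hit.identical_to_reference:
        return "Perfect"
    # Strict: not identical, but at or above the model's curated cut-off.
    if hit.bitscore >= hit.pass_bitscore:
        return "Strict"
    # Loose: anything that falls below the curated cut-off.
    return "Loose"
```

Because the cut-off in this sketch is a bitscore rather than a percent identity, a hit can land in the Strict bin with a modest identity, and in the Loose bin across a wide range of identities.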

Overall, look at the percent similarity of Loose annotations to the CARD reference, examine the sequence alignment, and be careful with your interpretation. Experimental validation is likely needed, see https://www.thewrightlab.com/antibiotic-resistance-platform.
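
One way to triage Loose annotations before inspecting alignments by hand is to rank them by percent identity and reference coverage. The sketch below assumes the tab-delimited RGI main output with the columns `Cut_Off`, `Best_Identities`, and `Percentage Length of Reference Sequence`; check these against the header of your own file. The function names and the thresholds are hypothetical, chosen by the user, and are not recommended cut-offs.

```python
import csv


def load_hits(handle):
    """Read a tab-delimited RGI results table into a list of dict rows."""
    return list(csv.DictReader(handle, delimiter="\t"))


def filter_loose(hits, min_identity, min_coverage):
    """Keep Loose hits meeting user-chosen identity and coverage thresholds.

    Both thresholds are percentages. Rows are returned sorted by percent
    identity, highest first, so the closest homologs are reviewed first.
    """
    kept = []
    for row in hits:
        if row["Cut_Off"] != "Loose":
            continue  # Perfect and Strict hits are handled separately
        identity = float(row["Best_Identities"])
        coverage = float(row["Percentage Length of Reference Sequence"])
        if identity >= min_identity and coverage >= min_coverage:
            kept.append(row)
    return sorted(kept, key=lambda r: float(r["Best_Identities"]), reverse=True)


# Example usage (the file name is a placeholder):
# with open("sample_rgi_output.txt", newline="") as fh:
#     candidates = filter_loose(load_hits(fh), min_identity=40.0, min_coverage=80.0)
```

The surviving rows are candidates for alignment inspection and experimental follow-up, not confirmed resistance genes.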

FYI, it sounds like you are using an older version of RGI. As of version 6.0.0, exclude_nudge has been replaced with include_nudge.

@agmcarthur agmcarthur self-assigned this Jan 27, 2025
@drbmanna
Author

Hi Andrew,

Thank you for the response. From issue #140, I understand that loose hits are those below the bitscore cut-off of strict hits. I have a few specific questions about interpreting these results:

  1. What is the minimum similarity percentage required for a sequence to be considered a potential ARG in the loose category, even with a low bit score? In my analysis using the CARD RGI web server (https://card.mcmaster.ca/analyze/rgi), I've noticed two genes (qacG, adeF) classified as strict hits despite having low identity percentages (44.9% and 69.93% respectively). Given that these are strict hits, can these similarity levels be trusted to indicate functional ARGs?

  2. For loose hits, I'm seeing similarities as low as 20%. What would be an appropriate percent identity cutoff for filtering loose hits?

  3. I noticed the web server (RGI 6.0.3 and CARD 3.3.0) still shows the exclude_nudge option.

I appreciate your help in interpreting these results!
