
Zero length sequence ranges aren't allowed if 'strict_bounds=True' #93

Closed
dhauge opened this issue Aug 4, 2016 · 2 comments

dhauge commented Aug 4, 2016

If a 'Fasta' index is created with 'strict_bounds=True', then attempts to extract 0-length sequences fail. This might be an odd case, but it can happen in limiting cases of algorithms that want to treat the sequence as a standard Python sequence. The following tests demonstrate the error:

from unittest import TestCase

from pyfaidx import Fasta


class TestZeroLengthSequenceSubRange(TestCase):
    def test_as_raw_zero_length_subsequence(self):
        # With as_raw=True, slicing returns a plain string.
        fasta = Fasta('data/genes.fasta', as_raw=True, strict_bounds=True)
        expect = ''
        result = fasta['gi|557361099|gb|KF435150.1|'][100:100]
        assert result == expect

    def test_zero_length_subsequence(self):
        # Without as_raw, slicing returns a Sequence object; compare its .seq.
        fasta = Fasta('data/genes.fasta', strict_bounds=True)
        expect = ''
        result = fasta['gi|557361099|gb|KF435150.1|'][100:100]
        assert result.seq == expect

The first test, which uses raw sequences, fails with:

Error
Traceback (most recent call last):
  File "/Users/hauge/dev/tools/pyfaidx/tests/test_feature_sequence_as_raw.py", line 57, in test_zero_length_subsequence
    result = fasta['gi|557361099|gb|KF435150.1|'][100:100]
  File "/Users/hauge/dev/tools/pyfaidx/pyfaidx/__init__.py", line 581, in __getitem__
    return self._fa.get_seq(self.name, start + 1, stop)[::step]
  File "/Users/hauge/dev/tools/pyfaidx/pyfaidx/__init__.py", line 715, in get_seq
    return self.faidx.fetch(name, start, end)
  File "/Users/hauge/dev/tools/pyfaidx/pyfaidx/__init__.py", line 447, in fetch
    seq = self.from_file(name, start, end)
  File "/Users/hauge/dev/tools/pyfaidx/pyfaidx/__init__.py", line 493, in from_file
    "invalid.\n".format(start, end))
pyfaidx.FetchError: Requested coordinates start=101 end=100 are invalid.
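
The coordinates in this error follow from how the half-open Python slice is translated: the traceback shows `__getitem__` calling `get_seq(self.name, start + 1, stop)`, so `[100:100]` becomes the 1-based, inclusive range start=101, end=100. The following is only a minimal sketch of that translation and of a bounds check that would accept an empty range. The helper names and the sequence length of 500 are hypothetical and are not part of pyfaidx.

```python
def slice_to_one_based(start, stop):
    """Map a half-open, 0-based slice [start:stop) to 1-based inclusive (start, end)."""
    return start + 1, stop


def is_valid_range(start, end, seq_len):
    """Hypothetical strict bounds check that still permits zero-length ranges.

    In 1-based inclusive coordinates an empty range shows up as end == start - 1,
    so that case is accepted instead of being rejected as end < start.
    """
    return 1 <= start <= seq_len + 1 and start - 1 <= end <= seq_len


start, end = slice_to_one_based(100, 100)
print(start, end)                               # 101 100
print(is_valid_range(start, end, seq_len=500))  # True (500 is an assumed length)
```

Under this sketch, a range whose end is more than one position before its start, or whose end lies past the sequence, is still rejected.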

If the logic in 'from_file' is fixed to allow a zero-length sequence, the second test will still fail when it tries to construct a 'Sequence':

Error
Traceback (most recent call last):
  File "/Users/hauge/dev/tools/pyfaidx/tests/test_feature_sequence_as_raw.py", line 57, in test_zero_length_subsequence
    result = fasta['gi|557361099|gb|KF435150.1|'][100:100]
  File "/Users/hauge/dev/tools/pyfaidx/pyfaidx/__init__.py", line 583, in __getitem__
    return self._fa.get_seq(self.name, start + 1, stop)[::step]
  File "/Users/hauge/dev/tools/pyfaidx/pyfaidx/__init__.py", line 118, in __getitem__
    raise ValueError("Coordinates start=%s and end=%s imply a diffent length than sequence (length %s)." % (self.start, self.end, len(self.seq)))
ValueError: Coordinates start=101 and end=100 imply a diffent length than sequence (length 0).
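
The message above suggests that the length implied by the coordinates is compared against the length of the sequence string. As a hedged illustration only (this is not the pyfaidx implementation, and the function names are hypothetical), a consistency check written with 1-based inclusive arithmetic accepts the empty case naturally, because start=101 and end=100 imply a length of 0:

```python
def implied_length(start, end):
    """Length of a 1-based, inclusive range; an empty range has end == start - 1."""
    return end - start + 1


def check_coordinates(seq, start, end):
    """Hypothetical check: raise if the coordinates disagree with len(seq)."""
    if implied_length(start, end) != len(seq):
        raise ValueError(
            "Coordinates start=%s and end=%s imply a different length "
            "than sequence (length %s)." % (start, end, len(seq))
        )


check_coordinates('', 101, 100)      # passes: implied length 0 matches ''
check_coordinates('ACGT', 101, 104)  # passes: implied length 4 matches 'ACGT'
```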

mdshw5 commented Aug 5, 2016

Thanks for pointing this out. I'll absolutely fix this since my goal for the project is to mimic Python sequence types as much as possible.


mdshw5 commented Nov 21, 2019

I sincerely forgot about this issue, and thanks to @prihoda for submitting a PR. Also thanks @dhauge for not only writing tests, but basing them on my existing test style. Thanks! Will close this issue as #155 provides a fix.

@mdshw5 mdshw5 closed this as completed Nov 21, 2019
mdshw5 added a commit that referenced this issue Nov 22, 2019
Support zero-length sequences, fixes #93. Release new version 0.5.6 to PyPI.