If the size of the content parsed by StringScanner to parse huge XML exceeds a certain size, have it removed.

See: #150
naitoh committed Jun 20, 2024
1 parent f704011 commit 57fa969
Showing 3 changed files with 40 additions and 0 deletions.
2 changes: 2 additions & 0 deletions lib/rexml/parsers/baseparser.rb
@@ -204,6 +204,8 @@ def peek depth=0

      # Returns the next event. This is a +PullEvent+ object.
      def pull
        @source.drop_parsed_content

        pull_event.tap do |event|
          @listeners.each do |listener|
            listener.receive event
7 changes: 7 additions & 0 deletions lib/rexml/source.rb
@@ -55,6 +55,7 @@ class Source
    attr_reader :encoding

    module Private
      SCANNER_RESET_SIZE = 100000
      PRE_DEFINED_TERM_PATTERNS = {}
      pre_defined_terms = ["'", '"', "<"]
      pre_defined_terms.each do |term|
@@ -84,6 +85,12 @@ def buffer
      @scanner.rest
    end

    def drop_parsed_content
      if @scanner.pos > SCANNER_RESET_SIZE
        @scanner.string = @scanner.rest
      end
    end

    def buffer_encoding=(encoding)
      @scanner.string.force_encoding(encoding)
    end
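
The effect of drop_parsed_content can be tried in isolation with Ruby's standard strscan library. The following is a minimal sketch under stated assumptions, not the REXML code itself: RESET_SIZE is a deliberately tiny, hypothetical threshold (the commit uses 100000), and the free-standing drop_parsed_content(scanner) helper exists only for illustration. It relies on StringScanner#string= replacing the scanned string and resetting the scan pointer to 0.

```ruby
require "strscan"

# Hypothetical threshold for this sketch only; the commit uses 100000.
RESET_SIZE = 10

# Illustration-only helper: once the scan pointer has moved past RESET_SIZE,
# replace the scanner's string with the unscanned remainder so the already
# parsed prefix is no longer referenced by the scanner.
def drop_parsed_content(scanner)
  if scanner.pos > RESET_SIZE
    scanner.string = scanner.rest
  end
end

scanner = StringScanner.new("<a>" + "x" * 20 + "</a><b/>")
scanner.scan_until(/<\/a>/)    # consume everything up to and including </a>

pos_before = scanner.pos       # pointer sits just after </a>
drop_parsed_content(scanner)
pos_after = scanner.pos        # pointer after the string was swapped
remaining = scanner.string     # only the unparsed tail is left

puts "pos before: #{pos_before}, pos after: #{pos_after}, string: #{remaining}"
```

Because the pointer returns to 0 after the swap, any position derived from the scanner counts from the last reset rather than from the start of the document, which is why the test in this commit only asserts an upper bound on parser.position.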
31 changes: 31 additions & 0 deletions test/test_baseparser.rb
@@ -0,0 +1,31 @@
# frozen_string_literal: false

require 'rexml/parsers/baseparser'

module REXMLTests
  class BaseParserTester < Test::Unit::TestCase
    include REXML

    N_ELEMENTS = 100
    N_STRING = 'a' * 50000
    def build_xml(n_elements)
      xml = '<?xml version="1.0"?><root>'

      n_elements.times do |i|
        xml << '<child >'
        xml << N_STRING
        xml << '</child>'
      end
      xml << '</root>'
    end

    def test_parse_large_xml
      xml = build_xml(N_ELEMENTS)
      parser = REXML::Parsers::BaseParser.new(xml)
      while parser.has_next?
        parser.pull
        assert_compare REXML::Source::SCANNER_RESET_SIZE + N_STRING.size, ">", parser.position
      end
    end
  end
end
