Improve path-based routing #173
It is possible we could use https://github.com/toritori0318/lua-resty-r3 or https://github.com/APItools/router.lua to build a routing tree and do a dispatch on that.
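With router.lua, for instance, a dispatch tree could be built once from the mapping rules and reused per request. The following is only a sketch: the route, handler, service id, and use of `ngx.ctx` are illustrative assumptions, not code from this project.

```lua
-- Sketch: building a dispatch tree with router.lua instead of iterating
-- over every service's mapping rules on each request.
local router = require 'router'  -- https://github.com/APItools/router.lua

local r = router.new()

-- One route per mapping rule pattern; the handler records which service
-- owns the pattern (service id 42 is made up for illustration).
r:get('/accounts/:id', function(params)
  ngx.ctx.service_id = 42
end)

-- Per request: a single tree lookup instead of N regex matches.
local ok = r:execute(ngx.req.get_method(), ngx.var.uri)
if not ok then
  ngx.exit(404)
end
```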
Adding to the 3.3 milestone as agreed with @andrewdavidmackenzie
When improving this, I think we should come up with a better design. We will need not only path-based routing, but also header-, session-, and user-based routing. We should come up with a pluggable design that allows registering code to do this routing; this is something that will be required for the ostia project. We should evaluate several designs and see which is more flexible. One would be a "conditional" policy: a policy that embeds a policy chain within itself and executes it only when a certain condition is met. That would allow creating independent trees of policies. But maybe independent trees are not what we want. Maybe we should introduce a new concept of a "named policy chain" (basically just a chain with a name: no host, no paths, ...) and some concept of a router/dispatcher that sends the request to the correct named chain when certain conditions are met.
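The "conditional policy" idea could be sketched roughly like this. This is hypothetical: `apicast.policy` and `apicast.policy_chain` are real APIcast modules, but the configuration shape and the `condition` callback are assumptions made for illustration.

```lua
-- Sketch of a "conditional" policy: it wraps an inner policy chain and
-- runs it only when its condition holds for the current request.
-- The config fields (condition, policy_chain) are illustrative, not an
-- actual APIcast schema.
local policy = require('apicast.policy')
local policy_chain = require('apicast.policy_chain')

local _M = policy.new('Conditional policy')

local new = _M.new
function _M.new(config)
  local self = new(config)
  -- condition is assumed to be a function(context) -> boolean
  self.condition = config.condition
  -- the embedded chain executes only when the condition is met
  self.chain = policy_chain.build(config.policy_chain)
  return self
end

function _M:rewrite(context)
  if self.condition(context) then
    self.chain:rewrite(context)
  end
end

return _M
```

A router/dispatcher for "named policy chains" could follow the same shape: keep a table of name -> chain, evaluate the conditions, and forward the request to the chain whose condition matches.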
@mayorova I understand your concerns about performance. I think that the efficient way of solving this is using a prefix tree, like the library @mikz mentioned: https://github.com/toritori0318/lua-resty-r3 However, notice that it's labeled as experimental. Also, I'd need to check all the kinds of regexes that we allow in the mapping rules to see how to build the prefix tree in our case.

Having said that, I wonder if the performance penalty is noticeable in typical use cases. My concern is that we might introduce a complex algorithm and nobody will benefit from it. That's why I went ahead and benchmarked the mapping rules matching. For these tests, I'm going to emulate the case of an APIcast user with 100 services and 100 mapping rules for each of them. I believe that most users will have way fewer than that. Here's the code, using the benchmark-ips lib by @mikz:

```lua
require('resty.core')

local mapping_rules_matcher = require 'apicast.mapping_rules_matcher'
local MappingRule = require 'apicast.mapping_rule'

math.randomseed(os.time())

-- Define valid charset (numbers and letters).
local charset = {}
for c = 48, 57 do table.insert(charset, string.char(c)) end  -- 0-9
for c = 65, 90 do table.insert(charset, string.char(c)) end  -- A-Z
for c = 97, 122 do table.insert(charset, string.char(c)) end -- a-z

local function randomString(length)
  local res = {}
  for i = 1, length do
    local random_char = charset[math.random(1, #charset)]
    table.insert(res, random_char)
  end
  return table.concat(res, '')
end

-- 100 random path patterns, one per mapping rule.
local patterns = {}
for i = 1, 100 do
  table.insert(patterns, '/' .. randomString(32))
end

local rules = {}
for i = 1, 100 do
  table.insert(rules, MappingRule.new('GET', patterns[i], {}, {}, 'hits', 1))
end

require('ips')(function(b)
  b.time = 5
  b.warmup = 2

  -- Each iteration emulates 100 services with 100 rules each; '/12345'
  -- never matches, so every rule is checked (the worst case).
  local function apply_mapping_rules()
    for i = 1, 100 do
      mapping_rules_matcher.matches('GET', '/12345', {}, rules)
    end
  end

  b:report('mapping rules', function() return apply_mapping_rules() end)
end)
```

When I run this, I get around 500 i/s on my laptop. That means that matching 100 services with 100 rules each takes 1/500 = 0.002 s = 2 ms, and that's when all of them need to be checked because there isn't a match.

If this is accurate, I wonder if this is worth optimizing. If nobody has reported a problem with this under a production workload, I'd rather lower the priority of this issue and leave it out of the 3.3 milestone.

Do you think that we should test with higher numbers of services and rules? Please let me know if you see any flaws in my reasoning.
@davidor so is the conclusion that path-based routing does not have performance issues and this is ready for QE?
@MarkCheshire my conclusion is that we have no numbers that this should meet: how many services, how many mapping rules, what patterns, ... Using our examples it looks fine. It is possible there are some pathological cases where this is not fine for someone, but we need examples of those cases so we can see the difference between path routing enabled and disabled.
I don't have a better suggestion than the performance test David already covered. Beyond that, we need to see whether customers start to experience a performance impact and then adjust the implementation accordingly.
Exactly. I think we have some information indicating it's "probably OK" and no information indicating there is a problem. So I propose we close this as-is, have QE test it in 2.2 (a separate request Mark made to them), not spend more time discussing, benchmarking, or implementing changes, and then deal with any hypothetical problems if and when they appear in the future.
OK. So let's leave this out for 3.3, but keep the issue open for future consideration. |
This PR introduced routing requests to different services based on the path, which allows having the same Public Base URL in multiple services.
This works, but not very efficiently. In the worst case it iterates over all services and all mapping rules, doing regex matching for each one of them. After the service id is found, the results of the regex matching are discarded, and matching is performed again for all the rules of that service.
I think this can be improved.
As a first step, it's probably worth doing a benchmark to understand better how serious the problem is.
Then, we can think about how this can be improved. I am thinking of two ways:
1. Use the `find_service_cascade` function as it is now, but store the results of `usage` and `matched_patterns` to reuse them later, avoiding regex matching again.
2. On configuration load, build some kind of index based on the mapping rule patterns (across all services), which will be used for service id lookup on incoming requests.
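The second option could be sketched as a plain Lua index built at configuration load time. Everything below is hypothetical: the module, the shape of `services`, and the literal-first-segment heuristic are illustrations, not part of the PR.

```lua
-- Sketch of option 2: at configuration load, index services by the first
-- literal segment of each mapping rule pattern, so that per request only
-- the candidate services need full regex matching.
local _M = {}

-- Extract the first path segment if it contains no pattern metacharacters,
-- e.g. '/accounts/{id}' -> 'accounts'; returns nil for patterns like '/.*'.
local function literal_first_segment(pattern)
  return pattern:match('^/([^/{}%*%$]+)')
end

function _M.build(services)
  local index = { catch_all = {} }
  for _, service in ipairs(services) do
    local seen = {}  -- avoid indexing the same service twice per segment
    for _, rule in ipairs(service.mapping_rules) do
      local segment = literal_first_segment(rule.pattern) or 'catch_all'
      if not seen[segment] then
        seen[segment] = true
        index[segment] = index[segment] or {}
        table.insert(index[segment], service)
      end
    end
  end
  return index
end

-- Per request: only services sharing the first segment (plus those with
-- non-literal prefixes) go through the full regex matching.
function _M.candidates(index, path)
  local segment = path:match('^/([^/]+)')
  local result = {}
  for _, service in ipairs(index[segment] or {}) do
    table.insert(result, service)
  end
  for _, service in ipairs(index.catch_all) do
    table.insert(result, service)
  end
  return result
end

return _M
```

This keeps `find_service_cascade` as the final arbiter; the index only narrows the set of services it has to consider.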