Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eliminate redundant token joining and const_get calls #2655

Merged
merged 5 commits into from
Dec 11, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
39 changes: 39 additions & 0 deletions docs/data-flow.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
## Data Flow

This shows the general data flow when a user runs facter on the command to lookup facts `a.b` and `c`.

Generally, facter loads fact definitions (`LoadedFact`) to determine all of the things it could collect, including internal (aka core) facts, custom facts (implemented using the `Facter.add` API) and external facts (json, yaml, bash, etc). Each `LoadedFact` specifies a name like `os.family` and a class that can be called later to collect the values, e.g. `Facts::Linux::Os::Release`.

The `QueryParser` parse both user queries `a.b` and `c` and matches each query against all LoadedFacts, returning an array of `SearchedFacts`. These are more like SearchableFacts, since they haven't been searched yet.

Facter attempts to lookup the facts from the cache, otherwise it calls the `InternalFactManager` and `ExternalFactManager` to resolve facts.

For internal facts, facter wraps each `SearchedFact` with a `CoreFact`. The `CoreFact` calls the `call_the_resolver` method on the class that the `SearchedFact` references. The `call_the_resolver` method then typically delegates to a resolver and returns the fact value which may be scalar or structured data. For example, `os.family` returns a string, but `gce` returns a Hash.

```mermaid
flowchart TD
CLI[facter a.b c] --> Facter[Facter.to_user_output]
Facter --> FactManager[FactManager#resolve_facts]
FactManager --> FactLoader[FactLoader.load]
FactLoader -->|internal| InternalFactLoader[InternalLoader.core_facts]
FactLoader -->|custom| CustomFactLoader[ExternalFactLoader.custom_facts]
FactLoader -->|external| ExternalFactLoader[ExternalFactLoader.external_facts]
InternalFactLoader --> QueryParser[QueryParser.parse]
CustomFactLoader --> QueryParser
ExternalFactLoader --> QueryParser
QueryParser -->|empty query| AllSearchable[All loaded facts are searchable]
QueryParser -->|not empty| SomeSearchable[Match query tokens to loaded facts]
AllSearchable --> SearchedFacts[Array of SearchedFacts]
SomeSearchable --> SearchedFacts
SearchedFacts --> CacheManager[CacheManager.resolve_facts]
CacheManager -->|internal| InternalFactManager[InternalFactManager.resolve_facts]
InternalFactManager --> CoreFact[CoreFact#create]
CoreFact --> SearchedFact[SearchedFact -> call_the_resolver]
SearchedFact --> Resolvers[Facter::Resolvers::*.resolve]
CacheManager -->|external| ExternalFactManager[ExternalFactManager.resolve_facts]
Resolvers --> ResolvedFacts[Array of ResolvedFacts]
ExternalFactManager --> ResolvedFacts
ResolvedFacts --> CacheFacts[CacheManager.cache_facts]
CacheFacts --> FilterFacts[FactFilter#filter_facts!]
FilterFacts --> Formatter
```
7 changes: 4 additions & 3 deletions lib/facter/framework/core/fact_loaders/class_discoverer.rb
Original file line number Diff line number Diff line change
Expand Up @@ -21,9 +21,10 @@ def discover_classes(operating_system)

def find_nested_classes(mod, discovered_classes)
mod.constants.each do |constant_name|
if mod.const_get(constant_name).instance_of? Class
discovered_classes << mod.const_get(constant_name)
elsif mod.const_get(constant_name).instance_of? Module
obj = mod.const_get(constant_name)
if obj.instance_of? Class
discovered_classes << obj
elsif obj.instance_of? Module
find_nested_classes(Module.const_get("#{mod.name}::#{constant_name}"), discovered_classes)
end
end
Expand Down
44 changes: 24 additions & 20 deletions lib/facter/framework/parsers/query_parser.rb
Original file line number Diff line number Diff line change
Expand Up @@ -18,18 +18,18 @@ class << self
# Because a root fact will always be resolved by a collection of child facts,
# we can return one or more child facts for each parent.
#
# query - is the user input used to search for facts
# loaded_fact - is a list with all facts for the current operating system
# @param query_list [Array] The list of facts to search for
# @param loaded_facts [Array] All of the fact definitions for the current operating system
#
# Returns a list of SearchedFact objects that resolve the users query.
def parse(query_list, loaded_fact)
# @return [Array<SearchedFact>] a list of searchable facts that resolve the user's query
def parse(query_list, loaded_facts)
matched_facts = []
@query_list = query_list

return no_user_query(loaded_fact) unless query_list.any?
return no_user_query(loaded_facts) unless query_list.any?

query_list.each do |query|
found_facts = search_for_facts(query, loaded_fact)
found_facts = search_for_facts(query, loaded_facts)
matched_facts << found_facts
end

Expand All @@ -44,15 +44,17 @@ def no_user_query(loaded_facts)
searched_facts
end

def search_for_facts(query, loaded_fact_hash)
def search_for_facts(query, loaded_facts)
resolvable_fact_list = []
query = query.to_s
query_tokens = query.end_with?('.*') ? [query] : query.split('.')
size = query_tokens.size

# Try to match the most specific query_tokens to the least, returning the first match
size.times do |i|
query_token_range = 0..size - i - 1
resolvable_fact_list = get_facts_matching_tokens(query_tokens, query_token_range, loaded_fact_hash)
query_fact = query_tokens[query_token_range].join('.')
resolvable_fact_list = get_facts_matching_tokens(query_tokens, query_fact, loaded_facts)

return resolvable_fact_list if resolvable_fact_list.any?
end
Expand All @@ -62,12 +64,10 @@ def search_for_facts(query, loaded_fact_hash)
resolvable_fact_list
end

def get_facts_matching_tokens(query_tokens, query_token_range, loaded_fact_hash)
def get_facts_matching_tokens(query_tokens, query_fact, loaded_facts)
resolvable_fact_list = []

loaded_fact_hash.each do |loaded_fact|
query_fact = query_tokens[query_token_range].join('.')

loaded_facts.each do |loaded_fact|
next unless found_fact?(loaded_fact.name, query_fact)

searched_fact = construct_loaded_fact(query_tokens, loaded_fact)
Expand All @@ -79,16 +79,20 @@ def get_facts_matching_tokens(query_tokens, query_token_range, loaded_fact_hash)
end

def found_fact?(fact_name, query_fact)
# This is the case where the fact_name contains a wildcard like
# blockdevice_.*_model and we're querying for the legacy fact
# specifically using 'blockdevice_sba_model' and we don't want the query
# 'blockdevice.sba.model' to match
fact_with_wildcard = fact_name.include?('.*') && !query_fact.include?('.')

processed_equery_fact = query_fact.gsub('\\', '\\\\\\\\')

return false if fact_with_wildcard && !query_fact.match("^#{fact_name}$")

# Must escape metacharacters (like dots) to ensure the correct fact is found
return false unless fact_with_wildcard || fact_name.match("^#{Regexp.escape(processed_equery_fact)}($|\\.)")

true
if fact_with_wildcard
# fact_name contains wildcard, so we're intentially not escaping.
query_fact.match("^#{fact_name}$")
else
processed_equery_fact = query_fact.gsub('\\', '\\\\\\\\')
# Must escape metacharacters (like dots) to ensure the correct fact is found
fact_name.match("^#{Regexp.escape(processed_equery_fact)}($|\\.)")
end
end

def construct_loaded_fact(query_tokens, loaded_fact)
Expand Down