'Enable fuzzy CPE matching' options lead to I/O Exception in the FuzzyVulnerableSoftwareSearchManager #2104

Closed
AndrewR777 opened this issue Oct 28, 2022 · 12 comments
Labels
defect Something isn't working p2 Non-critical bugs, and features that help organizations to identify and reduce risk

@AndrewR777

Selecting 'Enable fuzzy CPE matching' options leads to I/O Exception in the FuzzyVulnerableSoftwareSearchManager

Steps to Reproduce:

One may reproduce the error on a clean install.
I used Ubuntu 22.04 clean image with 30 GB HDD and 16 GB RAM.

I started as a sudo user and switched to root:
sudo -i
Then I installed Docker:

apt-get update
apt-get install     ca-certificates     curl     gnupg     lsb-release
mkdir -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo   "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
apt-get update
apt-get install docker-ce docker-ce-cli containerd.io docker-compose-plugin

and the latest (4.6.2) version of Dependency Track:

docker pull dependencytrack/bundled
docker volume create --name dependency-track
docker run -d -m 8192m -p 8080:8080 --name dependency-track -v dependency-track:/data dependencytrack/bundled

I opened Administration -> Analyzers -> Internal and ticked all 3 'Enable fuzzy CPE matching' checkboxes.

Now one can upload an SBOM file and inspect the log file (dependency-track.log):

2022-10-28 13:11:49,749 [] ERROR [org.dependencytrack.search.FuzzyVulnerableSoftwareSearchManager] An I/O Exception occurred while searching Lucene index
org.apache.lucene.index.IndexNotFoundException: no segments* file found in SimpleFSDirectory@/data/.dependency-track/index/vulnerablesoftware lockFactory=org.apache.lucene.store.NativeFSLockFactory@1e50f7fb: files: []
	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:715)
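
The `files: []` part of the message suggests the index directory exists but is empty. A minimal sketch for checking this from outside the application, assuming only that Lucene commit files have names starting with `segments` (the function name `has_lucene_segments` is made up for illustration, and the path is the one from the log line above):

```python
from pathlib import Path


def has_lucene_segments(index_dir: str) -> bool:
    """Return True if index_dir contains at least one file named segments*."""
    path = Path(index_dir)
    if not path.is_dir():
        return False
    return any(p.is_file() and p.name.startswith("segments") for p in path.iterdir())


if __name__ == "__main__":
    # Path as seen inside the container; adjust it when checking the
    # Docker volume from the host.
    index_dir = "/data/.dependency-track/index/vulnerablesoftware"
    print(index_dir, "has a segments file:", has_lucene_segments(index_dir))
```

A result of `False` matches the situation the exception describes; it does not say why the index was never built.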

Additional Details:

As a result of this error a number of vulnerabilities are not found.

See the log file in attachment.
dependency-track.log

@proteus-russ

I upgraded from 4.5 to 4.6.2 and am seeing this error in the log as well, although I haven't noticed any missing vulnerabilities after enabling the new fuzzy matching options.

@AndrewR777
Author

AndrewR777 commented Oct 29, 2022

Hi proteus-russ, thank you for the prompt confirmation!

Concerning the difference that the FuzzyVulnerableSoftwareSearchManager makes, kindly see the examples below.

I downloaded the image with 4.5.0 and compared the results with 4.6.2, using two very simple projects.
The latest version reports fewer (valid) vulnerabilities than 4.5.0 because of the problem with FuzzyVulnerableSoftwareSearchManager:
56 -> 50
39 -> 30 vulnerabilities.

This is how you can easily reproduce it:
1. Run v4.5.0
(there are no 'Enable fuzzy CPE matching' options there; I assume this mechanism is switched ON by default).
2. Provide a valid GitHub Advisory token, restart the container, and wait until GitHub Advisory mirroring completes.
3. Create a new project. Manually add a Django component with the following PURL:
pkg:pypi/[email protected]

Without using the Fuzzy Logic the system will not find any vulnerability; the Tracker does need Fuzzy Logic to locate the correct package in this case.

As a result, v.4.6.2 finds nothing.
v.4.5.0 gives us the following valid issue: GHSA-qrw5-5h28-6cmg (GITHUB)

and if we add a component with the following PURL pkg:pypi/[email protected], v 4.5.0 will find 29 vulnerabilities.

@AndrewR777
Author

All in all, I wanted to emphasize how important the Fuzzy Logic module is; it would be great if we could use it again.

@yruss972

yruss972 commented Nov 7, 2022

Also having these errors on v4.6.1

@nscuro nscuro added defect Something isn't working p2 Non-critical bugs, and features that help organizations to identify and reduce risk and removed in triage labels Nov 20, 2022
@nscuro
Member

nscuro commented Nov 21, 2022

@officerNordberg Ever run into this before?

@officerNordberg
Contributor

officerNordberg commented Nov 21, 2022

We are using v4.6.2 and FuzzyMatchingEnabled with no errors that I can see. My suggestion would be to stop the server and delete your VulnerableSoftwareIndex. Restart it and make sure the indexer completes before restarting or testing again. This takes a long time. I've corrupted the index by doing successive reboots before it has completed. I've noted here #1641 my frustration with the Indexer. We found and fixed threading issues it had a couple releases ago but I still think there are problems with the implementation overall. It is not resilient and can get itself into stuck states that seem to only be fixed by a manual deletion. I'm no Lucene expert so I'm not sure how to improve this without all sorts of health checks and/or automatically rebuilding the index whenever exceptions like this are caught. It is very frustrating when you can't find a record that you know is in the database because the query is not SQL and goes against Lucene instead.

DO NOT TRY THIS WITHOUT A BACKUP OF THE INDEX FOLDER, reindexing appears to be broken
$ cat rebuild_index.sh

#!/bin/bash

/usr/local/bin/docker-compose -f /opt/docker-compose.yml stop  dtrack-apiserver
rm -rf /var/lib/docker/volumes/dependency-track/_data/.dependency-track/index/vulnerablesoftware/
/usr/local/bin/docker-compose -f /opt/docker-compose.yml start dtrack-apiserver
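
Given the warning above about keeping a backup of the index folder, the delete step could copy the folder aside first. This is a sketch only: `backup_and_clear_index` and the backup location are hypothetical, and it assumes the API server has already been stopped, as in the script.

```python
import shutil
import time
from pathlib import Path


def backup_and_clear_index(index_dir: str, backup_root: str) -> Path:
    """Copy index_dir into backup_root, then delete index_dir.

    Intended to run only while the API server is stopped.
    Returns the path of the backup copy.
    """
    source = Path(index_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"index directory not found: {source}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"{source.name}-{stamp}"
    # copytree raises if the target already exists, so an older backup
    # is never overwritten silently.
    shutil.copytree(source, target)
    # Delete the original only after the copy has succeeded.
    shutil.rmtree(source)
    return target
```

With the paths from the script, `index_dir` would be `/var/lib/docker/volumes/dependency-track/_data/.dependency-track/index/vulnerablesoftware`, with `backup_root` being any directory that has enough free space.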

@AndrewR777
Author

AndrewR777 commented Nov 25, 2022

Hi officerNordberg,
> We are using v4.6.2 and FuzzyMatchingEnabled with no errors that I can see

Could you try it once again: enable the internal analyzer and set all three 'Fuzzy' options ON?
Sometimes we need to restart the Tracker and wait for the 'internal analysis' task:

726 [] INFO [org.dependencytrack.tasks.VulnerabilityAnalysisTask] Analyzing portfolio
2022-11-24 15:52:50,734 [] INFO [org.dependencytrack.tasks.repositories.RepositoryMetaAnalyzerTask] Performing component repository metadata analysis against 182 components in project: d7723c13-48b1-45be-87d2-3fe021fe79d9
2022-11-24 15:52:50,735 [] INFO [org.dependencytrack.tasks.VulnerabilityAnalysisTask] Analyzing 182 components in project: d7723c13-48b1-45be-87d2-3fe021fe79d9
2022-11-24 15:52:50,755 [] INFO [org.dependencytrack.tasks.scanners.InternalAnalysisTask] Starting internal analysis task
2022-11-24 15:52:50,780 [] INFO [org.dependencytrack.tasks.InternalComponentIdentificationTask] Starting internal component identification
2022-11-24 15:52:50,801 [] INFO [org.dependencytrack.tasks.InternalComponentIdentificationTask] Internal component identification completed in 00:00:20
2022-11-24 15:52:51,589 [] ERROR [org.dependencytrack.search.FuzzyVulnerableSoftwareSearchManager] An I/O Exception occurred while searching Lucene index
org.apache.lucene.index.IndexNotFoundException: no segments* file found in SimpleFSDirectory@/data/.dependency-track/index/vulnerablesoftware lockFactory=org.apache.lucene.store.NativeFSLockFactory@26032433: files: []
	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:715)
	at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:84)
	at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:64)
	at org.dependencytrack.search.IndexManager.getIndexSearcher(IndexManager.java:169)
	at org.dependencytrack.search.FuzzyVulnerableSoftwareSearchManager.searchIndex(FuzzyVulnerableSoftwareSearchManager.java:139)
	at org.dependencytrack.search.FuzzyVulnerableSoftwareSearchManager.fuzzySearch(FuzzyVulnerableSoftwareSearchManager.java:191)
	at org.dependencytrack.search.FuzzyVulnerableSoftwareSearchManager.fuzzySearch(FuzzyVulnerableSoftwareSearchManager.java:126)
	at org.dependencytrack.search.FuzzyVulnerableSoftwareSearchManager.fuzzyAnalysis(FuzzyVulnerableSoftwareSearchManager.java:98)
	at org.dependencytrack.tasks.scanners.InternalAnalysisTask.versionRangeAnalysis(InternalAnalysisTask.java:148)
	at org.dependencytrack.tasks.scanners.InternalAnalysisTask.analyze(InternalAnalysisTask.java:90)
	at org.dependencytrack.tasks.scanners.InternalAnalysisTask.inform(InternalAnalysisTask.java:65)
	at org.dependencytrack.tasks.VulnerabilityAnalysisTask.performAnalysis(VulnerabilityAnalysisTask.java:163)
	at org.dependencytrack.tasks.VulnerabilityAnalysisTask.analyzeComponents(VulnerabilityAnalysisTask.java:122)
	at org.dependencytrack.tasks.VulnerabilityAnalysisTask.inform(VulnerabilityAnalysisTask.java:93)
	at alpine.event.framework.BaseEventService.lambda$publish$0(BaseEventService.java:101)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
2022-11-24 15:52:51,611 [] ERROR [org.dependencytrack.search.FuzzyVulnerableSoftwareSearchManager] An I/O Exception occurred while searching Lucene index
org.apache.lucene.index.IndexNotFoundException: no segments* file found in SimpleFSDirectory@/data/.dependency-track/index/vulnerablesoftware lockFactory=org.apache.lucene.store.NativeFSLockFactory@26032433: files: []
	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:715)
	at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:84)
	at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:64)
	at org.dependencytrack.search.IndexManager.getIndexSearcher(IndexManager.java:169)
	at org.dependencytrack.search.FuzzyVulnerableSoftwareSearchManager.searchIndex(FuzzyVulnerableSoftwareSearchManager.java:139)
	at org.dependencytrack.search.FuzzyVulnerableSoftwareSearchManager.fuzzySearch(FuzzyVulnerableSoftwareSearchManager.java:191)
	at org.dependencytrack.search.FuzzyVulnerableSoftwareSearchManager.fuzzySearch(FuzzyVulnerableSoftwareSearchManager.java:126)
	at org.dependencytrack.search.FuzzyVulnerableSoftwareSearchManager.fuzzyAnalysis(FuzzyVulnerableSoftwareSearchManager.java:100)
	at org.dependencytrack.tasks.scanners.InternalAnalysisTask.versionRangeAnalysis(InternalAnalysisTask.java:148)
	at org.dependencytrack.tasks.scanners.InternalAnalysisTask.analyze(InternalAnalysisTask.java:90)
	at org.dependencytrack.tasks.scanners.InternalAnalysisTask.inform(InternalAnalysisTask.java:65)
	at org.dependencytrack.tasks.VulnerabilityAnalysisTask.performAnalysis(VulnerabilityAnalysisTask.java:163)
	at org.dependencytrack.tasks.VulnerabilityAnalysisTask.analyzeComponents(VulnerabilityAnalysisTask.java:122)
	at org.dependencytrack.tasks.VulnerabilityAnalysisTask.inform(VulnerabilityAnalysisTask.java:93)
	at alpine.event.framework.BaseEventService.lambda$publish$0(BaseEventService.java:101)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)

@officerNordberg
Contributor

officerNordberg commented Nov 27, 2022

The reindex events are dispatched in DefaultObjectGenerator prior to the Event system initialization.

2022-11-27 16:27:09,988 INFO [DefaultObjectGenerator] Initializing default object generator
2022-11-27 16:27:09,997 INFO [DefaultObjectGenerator] Dispatching event to reindex licenses
2022-11-27 16:27:09,998 INFO [DefaultObjectGenerator] Dispatching event to reindex projects
2022-11-27 16:27:09,999 INFO [DefaultObjectGenerator] Dispatching event to reindex components
2022-11-27 16:27:09,999 INFO [DefaultObjectGenerator] Dispatching event to reindex vulnerabilities
2022-11-27 16:27:09,999 INFO [DefaultObjectGenerator] Dispatching event to reindex vulnerablesoftware

2022-11-27 16:27:13,469 INFO [EventSubsystemInitializer] Initializing asynchronous event subsystem

DefaultObjectGenerator's reindex events are getting dropped. I never see this line executed in any of the Indexers: LOGGER.info("Starting reindex task. This may take some time.");

The event subsystem initialization used to come before the DefaultObjectGenerator:

2022-05-19 14:50:05,182 [] INFO [org.dependencytrack.event.EventSubsystemInitializer] Initializing asynchronous event subsystem
2022-05-19 14:50:05,275 [] INFO [org.dependencytrack.persistence.DefaultObjectGenerator] Initializing default object generator
2022-05-19 14:50:05,296 [] INFO [org.dependencytrack.persistence.DefaultObjectGenerator] Dispatching event to reindex licenses
2022-05-19 14:50:05,298 [] INFO [org.dependencytrack.persistence.DefaultObjectGenerator] Dispatching event to reindex projects
2022-05-19 14:50:05,298 [] INFO [org.dependencytrack.persistence.DefaultObjectGenerator] Dispatching event to reindex components
2022-05-19 14:50:05,298 [] INFO [org.dependencytrack.persistence.DefaultObjectGenerator] Dispatching event to reindex vulnerabilities
2022-05-19 14:50:05,298 [] INFO [org.dependencytrack.persistence.DefaultObjectGenerator] Dispatching event to reindex vulnerablesoftware

@AndrewR777 you'll still need to reindex your server but that can't happen until this is resolved. Maybe an admin feature to trigger a reindex through the UI might be useful down the road as well.
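
The ordering problem described above can be modelled with a toy event bus. This is not Alpine's event framework. `ToyEventBus` and `run_startup` are invented for illustration, and the sketch simply assumes that an event published while nobody is subscribed is lost, which is the behaviour reported in this comment.

```python
from typing import Callable, Dict, List


class ToyEventBus:
    """Toy bus: an event reaches only handlers subscribed before publish()."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[str], None]]] = {}
        self.dropped: List[str] = []

    def subscribe(self, event: str, handler: Callable[[str], None]) -> None:
        self._handlers.setdefault(event, []).append(handler)

    def publish(self, event: str) -> None:
        handlers = self._handlers.get(event, [])
        if not handlers:
            # Nobody is listening yet, so the event is lost.
            self.dropped.append(event)
            return
        for handler in handlers:
            handler(event)


def run_startup(event_subsystem_first: bool) -> List[str]:
    """Run the two startup steps in the given order; return handled events."""
    bus = ToyEventBus()
    handled: List[str] = []
    event = "reindex vulnerablesoftware"

    def init_event_subsystem() -> None:
        bus.subscribe(event, handled.append)

    def default_object_generator() -> None:
        bus.publish(event)

    steps = [init_event_subsystem, default_object_generator]
    if not event_subsystem_first:
        steps.reverse()
    for step in steps:
        step()
    return handled


if __name__ == "__main__":
    print("event subsystem first:", run_startup(True))
    print("object generator first:", run_startup(False))
```

With the event subsystem initialized first, the reindex event is handled. With the order reversed, as in the 2022-11-27 log, the handled list stays empty.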

@officerNordberg
Contributor

@syalioune can you describe why this change was needed?
ac6186c#diff-f1c1b2e33984786db489cb4229fb7610cabae6dc83df075488c9e1236a91268c

@nscuro
Member

nscuro commented Nov 28, 2022

@officerNordberg Thanks for investigating. I think the change was necessary because EventSubsystemInitializer also takes care of initializing TaskScheduler. The interval of many tasks in TaskScheduler has been made configurable, and the default configuration is populated in DefaultObjectGenerator.

Perhaps it makes sense to move the initialization of default config properties into its own class, and switch the order of EventSubsystemInitializer and DefaultObjectGenerator again.

@syalioune
Contributor

syalioune commented Nov 28, 2022

> @officerNordberg Thanks for investigating. I think the change was necessary because EventSubsystemInitializer also takes care of initializing TaskScheduler. The interval of many tasks in TaskScheduler has been made configurable, and the default configuration is populated in DefaultObjectGenerator.

That's exactly the reason. However, that was maybe a bit strict, since the default values can be derived from the enum ConfigPropertyConstants if the value is not already in the database.
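
A sketch of that fallback idea, under stated assumptions: `ExampleConfigProperty`, its two entries, and `resolve` are invented for illustration and are not the real ConfigPropertyConstants. The dictionary stands in for values already stored in the database.

```python
from enum import Enum
from typing import Dict


class ExampleConfigProperty(Enum):
    # (property name, default value); both entries are made up.
    TASK_INTERVAL_HOURS = ("example.task.interval.hours", "24")
    INDEX_CHECK_ENABLED = ("example.index.check.enabled", "true")

    @property
    def property_name(self) -> str:
        return self.value[0]

    @property
    def default_value(self) -> str:
        return self.value[1]


def resolve(prop: ExampleConfigProperty, stored: Dict[str, str]) -> str:
    """Prefer the stored value; fall back to the enum's default."""
    return stored.get(prop.property_name, prop.default_value)


if __name__ == "__main__":
    stored = {"example.task.interval.hours": "6"}
    print(resolve(ExampleConfigProperty.TASK_INTERVAL_HOURS, stored))
    print(resolve(ExampleConfigProperty.INDEX_CHECK_ENABLED, stored))
```

With a lookup like this, a scheduler would not depend on the defaults having been written to the database first, which is what would allow the initializer order to be relaxed.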

> Perhaps it makes sense to move the initialization of default config properties into its own class, and switch the order of EventSubsystemInitializer and DefaultObjectGenerator again.

Maybe create a distinct IndexSubsystemInitializer to make it more explicit and decouple it from the DefaultObjectGenerator. That initializer could be last.

> Maybe an admin feature to trigger a reindex through the UI might be useful down the road as well.

Totally agree.

syalioune added a commit to syalioune/frontend that referenced this issue Nov 29, 2022
@nscuro nscuro linked a pull request Nov 29, 2022 that will close this issue
3 tasks
syalioune added a commit to syalioune/frontend that referenced this issue Dec 5, 2022
syalioune added a commit to syalioune/dependency-track that referenced this issue Dec 6, 2022
…d listener

A REST API is also exposed to allow index rebuild through the GUI. See DependencyTrack#2104
Automatic periodic consistency check with database are performed if enabled

Signed-off-by: Alioune SY <[email protected]>
@nscuro nscuro closed this as completed in c40c117 Dec 6, 2022
@nscuro nscuro added this to the 4.7 milestone Dec 6, 2022
nscuro pushed a commit to DependencyTrack/frontend that referenced this issue Dec 6, 2022
* Feature: Enable lucene index rebuild through UI

See DependencyTrack/dependency-track#2104

Signed-off-by: Alioune SY <[email protected]>

* Fix: Restoring lucene index build during startup by having a dedicated listener

Taking into account review comments

Signed-off-by: Alioune SY <[email protected]>

Signed-off-by: Alioune SY <[email protected]>
mulder999 pushed a commit to mulder999/dependency-track that referenced this issue Dec 23, 2022
…dencyTrack#2200)

* Fix: Restoring lucene index build during startup by having a dedicated listener

A REST API is also exposed to allow index rebuild through the GUI. See DependencyTrack#2104
Automatic periodic consistency check with database are performed if enabled

Signed-off-by: Alioune SY <[email protected]>

* Fix: Restoring lucene index build during startup by having a dedicated listener

Taking into account review comments

Signed-off-by: Alioune SY <[email protected]>

* Fix: Restoring lucene index build during startup by having a dedicated listener

Fixing unit tests.

Signed-off-by: Alioune SY <[email protected]>

Signed-off-by: Alioune SY <[email protected]>

Fixes DependencyTrack#2104

Signed-off-by: mulder999 <[email protected]>
@github-actions
Contributor

github-actions bot commented Jan 6, 2023

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 6, 2023
stephan-wolf-ais pushed a commit to AISAutomation/dependency-track that referenced this issue Mar 1, 2023
…dencyTrack#2200)

Fixes DependencyTrack#2104