Commit d5409c4

[Linux] Fix system-probe enablement conditions (#336)
Fixes the logic around starting / stopping the system-probe service, with different paths depending on the Agent version:

- before 6/7.18.0: the datadog-agent-sysprobe service is independent from the datadog-agent service and is started / stopped / restarted by the role, depending on the system_probe_config.enabled variable.
- between 6/7.18.0 and 6/7.24.0: the datadog-agent-sysprobe service depends on datadog-agent, therefore only the datadog-agent service needs to be restarted on configuration changes. The system_probe_config.enabled variable still defines that behavior.
- since 6/7.24.1: the network_config.enabled variable defines if the service should be started or not. The system_probe_config.enabled variable is still supported for backwards compatibility, but is not recommended in the documentation anymore.

Makes sure the role doesn't crash when system_probe_config has a None or empty value.
1 parent 0b30e66 commit d5409c4
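
The version paths described in the commit message can be illustrated with role variables. This is a minimal sketch, not part of the commit: the role name `datadog.datadog` and the `datadog_agent_version` variable are assumptions about how the role is typically consumed, and the API key is a placeholder.

```yaml
# Sketch only: role name and datadog_agent_version are assumptions,
# they do not appear in this commit.
- hosts: servers
  roles:
    - { role: datadog.datadog, become: yes }
  vars:
    datadog_api_key: "<YOUR_API_KEY>"
    datadog_agent_version: "7.24.1"
    # Agent 6/7.24.1 and later: network_config.enabled decides whether
    # the system-probe service is started.
    network_config:
      enabled: true
    # For an Agent older than 6/7.24.1, the commit message states that
    # system_probe_config.enabled drives the service instead:
    # system_probe_config:
    #   enabled: true
```

Per the commit message, `system_probe_config.enabled` keeps working on 6/7.24.1+ for backwards compatibility, so existing playbooks should not need an immediate change.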

9 files changed, +87 -33 lines changed

README.md
+10 -8

@@ -163,7 +163,7 @@ The following variables are available for live processes:

 #### System probe

-The system probe is configured under the `network_config` variable. Any variables nested underneath are written to the `system-probe.yaml`.
+The system probe is configured under the `system_probe_config` variable. Any variables nested underneath are written to the `system-probe.yaml`, in the `system_probe_config` section.

 [Network Performance Monitoring][7] (NPM) is configured under the `network_config` variable. Any variables nested underneath are written to the `system-probe.yaml`, in the `network_config` section.

@@ -183,13 +183,15 @@ network_config:
   enabled: true
 ```

-Once modification is complete, follow the steps below:
+**Note**: This configuration works with Agent 6.24.1+ and 7.24.1+. For older Agent versions, refer to [the public documentation][8] on how to enable system-probe.
+
+On Linux, once this modification is complete, follow the steps below if you installed an Agent version older than 6.18.0 or 7.18.0:

 1. Start the system-probe: `sudo service datadog-agent-sysprobe start` **Note**: If the service wrapper is not available on your system, run this command instead: `sudo initctl start datadog-agent-sysprobe`.
-2. [Restart the Agent][8]: `sudo service datadog-agent restart`.
+2. [Restart the Agent][9]: `sudo service datadog-agent restart`.
 3. Enable the system-probe to start on boot: `sudo service enable datadog-agent-sysprobe`.

-For manual setup, refer to the [NPM][9] documentation.
+For manual setup, refer to the [NPM][8] documentation.

 #### Agent v5

@@ -334,7 +336,7 @@ To downgrade to a prior version of the Agent:

 Below are some sample playbooks to assist you with using the Datadog Ansible role.

-The following example sends data to Datadog US (default), enables logs, and configures a few checks.
+The following example sends data to Datadog US (default), enables logs, NPM and configures a few checks.

 ```yml
 - hosts: servers
@@ -403,7 +405,7 @@ The following example sends data to Datadog US (default), enables logs, and conf
         version: 1.11.0
       datadog-postgres:
         action: remove
-    system_probe_config:
+    network_config:
       enabled: true
 ```

@@ -530,6 +532,6 @@ For more details, see [Critical Bug in Uninstaller for Datadog Agent 6.14.0 and
 [5]: https://github.com/DataDog/integrations-core
 [6]: https://docs.datadoghq.com/infrastructure/process/
 [7]: https://docs.datadoghq.com/network_performance_monitoring/
-[8]: https://docs.datadoghq.com/agent/guide/agent-commands/#restart-the-agent
-[9]: https://docs.datadoghq.com/network_performance_monitoring/installation/?tab=agent#setup
+[8]: https://docs.datadoghq.com/network_performance_monitoring/installation/?tab=agent#setup
+[9]: https://docs.datadoghq.com/agent/guide/agent-commands/#restart-the-agent
 [10]: https://app.datadoghq.com/help/agent_fix

ci_test/install_agent_6.yaml
+4

@@ -18,6 +18,10 @@
         env: dev
       trace.concentrator:
         extra_aggregators: version
+    system_probe_config:
+      sysprobe_socket: /opt/datadog-agent/run/sysprobe.sock
+    network_config:
+      enabled: true
     datadog_checks:
       process:
         init_config:

ci_test/install_agent_7.yaml
+4

@@ -18,6 +18,10 @@
         env: dev
       trace.concentrator:
         extra_aggregators: version
+    system_probe_config:
+      sysprobe_socket: /opt/datadog-agent/run/sysprobe.sock
+    network_config:
+      enabled: true
     datadog_checks:
       process:
         init_config:

defaults/main.yml
+2 -4

@@ -2,11 +2,9 @@
 role_version: 4.7.1

 # default system-probe.yaml options
-system_probe_config:
-  enabled: false
+system_probe_config: {}

-network_config:
-  enabled: false
+network_config: {}

 # define if the datadog-agent services should be enabled
 datadog_enabled: yes

handlers/main.yml
+6

@@ -1,5 +1,11 @@
 ---

+- name: restart datadog-agent-sysprobe
+  service:
+    name: datadog-agent-sysprobe
+    state: restarted
+  when: datadog_enabled and datadog_sysprobe_enabled and not ansible_check_mode and not ansible_facts.os_family == "Windows"
+
 - name: restart datadog-agent
   service:
     name: datadog-agent

manual_tests/test_6_full.yml
+2 -1

@@ -6,13 +6,14 @@
     datadog_api_key: "123456"
     datadog_agent_allow_downgrade: true
     system_probe_config:
-      enabled: true
       source_excludes:
         "*":
           - 8301
       dest_excludes:
         "*":
           - 8301
+    network_config:
+      enabled: true
     datadog_config:
       tags: "mytag0, mytag1"
       log_level: INFO

manual_tests/test_7_full.yml
+2 -1

@@ -6,13 +6,14 @@
     datadog_api_key: "123456"
     datadog_agent_allow_downgrade: true
     system_probe_config:
-      enabled: true
       source_excludes:
         "*":
           - 8301
       dest_excludes:
         "*":
           - 8301
+    network_config:
+      enabled: true
     datadog_config:
       tags: "mytag0, mytag1"
       log_level: INFO

tasks/agent-linux.yml
+56 -18

@@ -1,8 +1,19 @@
 ---
-- name: populate service facts
+- name: Populate service facts
   service_facts:

-- name: add "{{ datadog_user }}" user to additional groups
+- name: Set before 6/7.24.1 flag
+  set_fact:
+    datadog_before_7241: "{{ datadog_major is defined and datadog_minor is defined and datadog_bugfix is defined
+      and datadog_major | int < 8
+      and (datadog_minor | int < 24 or (datadog_minor | int == 24 and datadog_bugfix | int < 1)) }}"
+
+- name: Set before 6/7.18.0 flag
+  set_fact:
+    datadog_before_7180: "{{ datadog_major is defined and datadog_minor is defined
+      and datadog_major | int < 8 and datadog_minor | int < 18 }}"
+
+- name: Add "{{ datadog_user }}" user to additional groups
   user: name="{{ datadog_user }}" groups="{{ datadog_additional_groups }}" append=yes
   when: datadog_additional_groups | default([], true) | length > 0
   notify: restart datadog-agent
@@ -78,40 +89,55 @@
     mode: 0640
     owner: "root"
     group: "{{ datadog_group }}"
+  notify:
+    "{% if datadog_before_7180 %}restart datadog-agent-sysprobe{% else %}restart datadog-agent{% endif %}"

-- name: Ensure datadog-agent is running
-  service:
-    name: datadog-agent
-    state: started
-    enabled: yes
-  when: not datadog_skip_running_check and datadog_enabled and not ansible_check_mode
-
-- name: set system probe installed
+- name: Set system probe installed
   set_fact:
     datadog_sysprobe_installed: "{{ ansible_facts.services['datadog-agent-sysprobe'] is defined
       or ansible_facts.services['datadog-agent-sysprobe.service'] is defined }}"
   when: not datadog_skip_running_check

-- name: set system probe enabled
+# Before 6/7.24.1, system_probe_config controls the system-probe service
+# datadog_minor is only defined when a specific Agent version is given
+# (see tasks/parse-version.yml)
+- name: Set system probe enabled (before 6/7.24.1)
   set_fact:
     datadog_sysprobe_enabled: "{{ system_probe_config is defined
+      and 'enabled' in (system_probe_config | default({}, true))
       and system_probe_config['enabled']
       and datadog_sysprobe_installed }}"
   when: not datadog_skip_running_check
+    and datadog_before_7241

-- name: Ensure datadog-agent-sysprobe is running if enabled and installed
+# Since 6/7.24.1, setting enabled: true in network_config is enough to start the system-probe service:
+# https://docs.datadoghq.com/network_monitoring/performance/setup/?tab=agent#setup
+- name: Set system probe enabled (since 6/7.24.1)
+  set_fact:
+    datadog_sysprobe_enabled: "{{
+      ((system_probe_config is defined
+        and 'enabled' in (system_probe_config | default({}, true))
+        and system_probe_config['enabled'])
+      or (network_config is defined
+        and 'enabled' in (network_config | default({}, true))
+        and network_config['enabled']))
+      and datadog_sysprobe_installed }}"
+  when: not datadog_skip_running_check
+    and (not datadog_before_7241)
+
+- name: Ensure datadog-agent is running
   service:
-    name: datadog-agent-sysprobe
+    name: datadog-agent
     state: started
     enabled: yes
-  when: not datadog_skip_running_check and datadog_enabled and not ansible_check_mode and datadog_sysprobe_enabled
+  when: not datadog_skip_running_check and datadog_enabled and not ansible_check_mode

-- name: Ensure datadog-agent-sysprobe is stopped if disabled or not installed
+- name: Ensure datadog-agent-sysprobe is running if enabled and installed
   service:
     name: datadog-agent-sysprobe
-    state: stopped
-    enabled: no
-  when: not datadog_skip_running_check and (not datadog_enabled or not datadog_sysprobe_enabled) and datadog_sysprobe_installed
+    state: started
+    enabled: yes
+  when: not datadog_skip_running_check and datadog_enabled and not ansible_check_mode and datadog_sysprobe_enabled

 - name: Ensure datadog-agent, datadog-agent-process and datadog-agent-trace are not running
   service:
@@ -124,6 +150,18 @@
     - datadog-agent-process
     - datadog-agent-trace

+# Stop system-probe manually on Agent versions < 6/7.18, as it was not tied
+# to the main Agent service: https://github.com/DataDog/datadog-agent/pull/4883
+- name: Ensure datadog-agent-sysprobe is stopped if disabled or not installed (before 6/7.18.0)
+  service:
+    name: datadog-agent-sysprobe
+    state: stopped
+    enabled: no
+  when: not datadog_skip_running_check
+    and (not datadog_enabled or not datadog_sysprobe_enabled)
+    and datadog_before_7180
+    and datadog_sysprobe_installed
+
 - name: Ensure datadog-agent-security is not running
   service:
     name: datadog-agent-security
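
The two version flags set at the top of tasks/agent-linux.yml compare integers by hand. As a side illustration only (this is not what the commit does), the same minor/bugfix comparison could be sketched with Ansible's built-in `version` test. The playbook below is standalone; the `example_*` fact names and the variable values are made up for the example, and the `is defined` guards used by the role are left out for brevity.

```yaml
- hosts: localhost
  gather_facts: false
  vars:
    # Example values only; the role derives these in tasks/parse-version.yml.
    datadog_major: "7"
    datadog_minor: "24"
    datadog_bugfix: "0"
  tasks:
    - name: Compute version flags with the version test
      set_fact:
        # Compare "minor.bugfix" so that Agent 6 and Agent 7 share one threshold.
        example_before_7241: "{{ datadog_major | int < 8
          and (datadog_minor ~ '.' ~ datadog_bugfix) is version('24.1', '<') }}"
        example_before_7180: "{{ datadog_major | int < 8
          and (datadog_minor ~ '.' ~ datadog_bugfix) is version('18.0', '<') }}"

    - name: Show the flags
      debug:
        msg: "before 6/7.24.1: {{ example_before_7241 }}, before 6/7.18.0: {{ example_before_7180 }}"
```

With the example values (7.24.0), the first flag evaluates to true and the second to false, which corresponds to the "between 6/7.18.0 and 6/7.24.0" path from the commit message.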

templates/system-probe.yaml.j2
+1 -1

@@ -1,6 +1,6 @@
 # Managed by Ansible

-{% if system_probe_config is defined and system_probe_config|length > 0 -%}
+{% if system_probe_config is defined and system_probe_config | default({}, true) | length > 0 -%}
 system_probe_config:
 {# The "first" option is only supported by jinja 2.10+
    which is not present on older systems (CentOS 7, Debian 8, etc.)
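
The `default({}, true)` guard added in the template and in the tasks exists because a variable can be defined yet hold None, in which case applying `length` or `in` to it directly would fail. A minimal standalone sketch of that behaviour, assuming a local playbook run (nothing here is taken from the role beyond the variable name):

```yaml
- hosts: localhost
  gather_facts: false
  vars:
    # A key with no value parses as null, which Jinja sees as None.
    system_probe_config:
  tasks:
    - name: Length check that tolerates None
      debug:
        # default(..., true) also replaces falsy values such as None,
        # so length is applied to an empty dict instead.
        msg: "{{ system_probe_config | default({}, true) | length }}"

    - name: Membership check that tolerates None
      debug:
        msg: "{{ 'enabled' in (system_probe_config | default({}, true)) }}"
```

The second argument to `default` matters here: without it, the filter only replaces undefined variables, and a None value would pass through unchanged.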
