[8.18] [Automatic Import] Fix unstructured syslog flow (elastic#213042) (elastic#213208)

# Backport

This will backport the following commits from `main` to `8.18`:
- [[Automatic Import] Fix unstructured syslog flow (elastic#213042)](https://github.com/elastic/kibana/pull/213042)

<!--- Backport version: 9.6.4 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)

## Summary

This PR fixes the unstructured syslog flow. It picks 5 samples, sends them to the LLM to create a grok pattern, and tests all the samples against that pattern. Any samples that remain unparsed are sent in for the next round of pattern generation, and so on.

The result is a list of patterns that together match all the samples. A grok processor is created from those patterns; it breaks the syslog messages down into JSON for ECS mapping, categorization, and the related graphs.

### Checklist

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
- [x] The PR description includes the appropriate Release Notes section, and the correct `release_note:*` label is applied per the [guidelines](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

Source commit: 715a72fa1832242e3a96a664695f84f75346d106 (committed to `main` on 2025-03-04). The same fix was also backported to `9.0` in https://github.com/elastic/kibana/pull/213118.
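
The round-based flow described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `collectPatterns`, `ProposePattern`, `toyProposer` and the `maxRounds` cap are hypothetical names, and a plain `RegExp` stands in for both the LLM-generated grok pattern and the pipeline that tests it.

```typescript
// Hypothetical stand-in for the LLM call: given a batch of samples,
// return one pattern (here a RegExp source instead of a grok pattern).
type ProposePattern = (batch: string[]) => string;

const BATCH_SIZE = 5; // the summary says 5 samples are sent per round

function collectPatterns(
  samples: string[],
  proposePattern: ProposePattern,
  maxRounds = 10 // assumed safety cap, not taken from the commit
): { patterns: string[]; unparsed: string[] } {
  const patterns: string[] = [];
  let unparsed = [...samples];

  for (let round = 0; round < maxRounds && unparsed.length > 0; round++) {
    const pattern = proposePattern(unparsed.slice(0, BATCH_SIZE));
    const matcher = new RegExp(pattern);
    const remaining = unparsed.filter((sample) => !matcher.test(sample));
    // Keep the pattern only if it parsed at least one more sample.
    if (remaining.length < unparsed.length) {
      patterns.push(pattern);
    }
    unparsed = remaining;
  }
  return { patterns, unparsed };
}

// Toy proposer: chooses a pattern from the first character of the first sample.
const toyProposer: ProposePattern = (batch) =>
  /^\d/.test(batch[0]) ? '^\\d+ .*$' : '^<\\d+>.*$';

const loopResult = collectPatterns(
  ['<30>1 charon started', '2016 nginx error', '<166>squid miss'],
  toyProposer
);
console.log(loopResult.patterns, loopResult.unparsed);
```

The loop stops either when no unparsed samples remain or when the assumed round cap is reached, so a proposer that never makes progress cannot spin forever.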
bhapas authored Mar 5, 2025
1 parent b7908a4 commit ecca188
Showing 15 changed files with 305 additions and 188 deletions.

@@ -14,12 +14,14 @@ export const unstructuredLogState = {
   jsonSamples: ['{"message":"dummy data"}'],
   finalized: false,
   ecsVersion: 'testVersion',
-  errors: { test: 'testerror' },
+  errors: [{ test: 'testerror' }],
   additionalProcessors: [],
+  isFirst: false,
+  unParsedSamples: ['dummy data'],
+  currentPattern: '%{GREEDYDATA:message}',
 };

 export const unstructuredLogResponse = {
-  grok_patterns: [
+  grok_pattern:
     '####<%{MONTH} %{MONTHDAY}, %{YEAR} %{TIME} (?:AM|PM) %{WORD:timezone}> <%{WORD:log_level}> <%{WORD:component}> <%{DATA:hostname}> <%{DATA:server_name}> <%{DATA:thread_info}> <%{DATA:user}> <%{DATA:empty_field}> <%{DATA:empty_field2}> <%{NUMBER:timestamp}> <%{DATA:message_id}> <%{GREEDYDATA:message}>',
-  ],
 };

@@ -12,3 +12,35 @@ export const EX_ANSWER_LOG_TYPE: SamplesFormat = {
   header: false,
   columns: ['ip', 'timestamp', 'request', 'status', '', 'bytes'],
 };
+export const LOG_FORMAT_EXAMPLE_LOGS = [
+  {
+    example:
+      '[18/Feb/2025:22:39:16 +0000] CONNECT conn=20597223 from=10.1.1.1:1234 to=10.2.3.4:4389 protocol=LDAP',
+    format: 'Structured',
+  },
+  {
+    example:
+      '2021-10-22 22:12:09,871 DEBUG [org.keycloak.events] (default task-3) operationType=CREATE, realmId=test, clientId=abcdefgh userId=sdfsf-b89c-4fca-9088-sdfsfsf, ipAddress=10.1.1.1, resourceType=USER, resourcePath=users/07972d16-b173-4c99-803d-90f211080f40',
+    format: 'Structured',
+  },
+  {
+    example:
+      '<166>Aug 21 22:08:13 myfirewall.my-domain.tld (squid-1)[6802]: 1598040493.253 325 175.16.199.1 TCP_MISS/304 2912 GET https://github.com/3ilson/pfelk/file-list/master - HIER_DIRECT/81.2.69.145 -',
+    format: 'Unstructured',
+  },
+  {
+    example:
+      '<30>1 2021-07-03T23:01:56.547105-05:00 pfSense.example.com charon 18610 - - 08[CFG] ppk_id = (null)',
+    format: 'Unstructured',
+  },
+  {
+    example:
+      '2016/10/25 14:49:34 [error] 54053#0: *1 open() "/usr/local/Cellar/nginx/1.10.2_1/html/favicon.ico" failed (2: No such file or directory)',
+    format: 'Unstructured',
+  },
+  {
+    example:
+      '2025/02/12|14:42:42:871|FAKePolicyNumber-ws-sharedendorsement-autocore-54--fhfh-rghrg-0|INFO |http-nio-8080-exec-58 |RatingHelper.sendToPolicyPro:1521 |-call to PolicyPro for /rest/v2/actions/ISSUEEXT successful',
+    format: 'Unstructured',
+  },
+];

@@ -10,6 +10,7 @@ import { LOG_FORMAT_DETECTION_PROMPT } from './prompts';
 import type { LogDetectionNodeParams } from './types';
 import { SamplesFormat } from '../../../common';
 import { LOG_FORMAT_DETECTION_SAMPLE_ROWS } from '../../../common/constants';
+import { LOG_FORMAT_EXAMPLE_LOGS } from './constants';

 export async function handleLogFormatDetection({
   state,
@@ -26,6 +27,7 @@ export async function handleLogFormatDetection({
   const logFormatDetectionResult = await logFormatDetectionNode.invoke({
     ex_answer: state.exAnswer,
     log_samples: samples.join('\n'),
+    example_logs: LOG_FORMAT_EXAMPLE_LOGS,
     package_title: state.packageTitle,
     datastream_title: state.dataStreamTitle,
   });

@@ -27,11 +27,18 @@ Follow these steps to do this:
 * 'leef': If the log samples have Log Event Extended Format (LEEF) then classify it as "name: leef".
 * 'fix': If the log samples have Financial Information eXchange (FIX) then classify it as "name: fix".
 * 'unsupported': If you cannot put the format into any of the above categories then classify it with "name: unsupported".
-2. Header: for structured and unstructured format:
+2. You can look at the example_logs in the context to understand different log formats.
+3. Header: for structured and unstructured format:
 - if the samples have any or all of priority, timestamp, loglevel, hostname, ipAddress, messageId in the beginning information then set "header: true".
 - if the samples have a syslog header then set "header: true"
 - else set "header: false". If you are unable to determine the syslog header presence then set "header: false".
-3. Note that a comma-separated list should be classified as 'csv' if its rows only contain values separated by commas. But if it looks like a list of comma separated key-values pairs like 'key1=value1, key2=value2' it should be classified as 'structured'.
+4. Note that a comma-separated list should be classified as 'csv' if its rows only contain values separated by commas. But if it looks like a list of comma separated key-values pairs like 'key1=value1, key2=value2' it should be classified as 'structured'.
+<example_logs>
+\`\`\`json
+{example_logs}
+\`\`\`
+</example_logs>
 You ALWAYS follow these guidelines when writing your response:
 <guidelines>
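
As an illustration of how labelled examples like `LOG_FORMAT_EXAMPLE_LOGS` could end up inside the `{example_logs}` placeholder, here is a small sketch. It does not use the LangChain prompt API from the diff: `renderTemplate` is a hypothetical helper, and serializing the array with `JSON.stringify` is an assumption made for the example.

```typescript
interface ExampleLog {
  example: string;
  format: 'Structured' | 'Unstructured';
}

// Hypothetical helper (not the LangChain API): fills {name} placeholders
// and leaves unknown placeholders untouched.
function renderTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (match: string, name: string) =>
    name in values ? values[name] : match
  );
}

// Two of the examples added in this commit.
const exampleLogs: ExampleLog[] = [
  {
    example:
      '<30>1 2021-07-03T23:01:56.547105-05:00 pfSense.example.com charon 18610 - - 08[CFG] ppk_id = (null)',
    format: 'Unstructured',
  },
  {
    example:
      '[18/Feb/2025:22:39:16 +0000] CONNECT conn=20597223 from=10.1.1.1:1234 to=10.2.3.4:4389 protocol=LDAP',
    format: 'Structured',
  },
];

const exampleTemplate = '<example_logs>\n{example_logs}\n</example_logs>';

// Assumption: the examples are serialized to pretty-printed JSON first.
const renderedPrompt = renderTemplate(exampleTemplate, {
  example_logs: JSON.stringify(exampleLogs, null, 2),
});
console.log(renderedPrompt);
```

Serializing explicitly keeps each example paired with its `format` label in the rendered text, which is what the new step 2 of the prompt relies on.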

@@ -7,13 +7,10 @@

 export const GROK_EXAMPLE_ANSWER = {
   rfc: 'RFC2454',
-  regex:
-    '/(?:(d{4}[-]d{2}[-]d{2}[T]d{2}[:]d{2}[:]d{2}(?:.d{1,6})?(?:[+-]d{2}[:]d{2}|Z)?)|-)s(?:([w][wd.@-]*)|-)s(.*)$/',
-  grok_patterns: ['%{WORD:key1}:%{WORD:value1};%{WORD:key2}:%{WORD:value2}:%{GREEDYDATA:message}'],
+  grok_pattern: '%{WORD:key1}:%{WORD:value1};%{WORD:key2}:%{WORD:value2}:%{GREEDYDATA:message}',
 };

 export const GROK_ERROR_EXAMPLE_ANSWER = {
-  grok_patterns: [
+  grok_pattern:
     '%{TIMESTAMP:timestamp}:%{WORD:value1};%{WORD:key2}:%{WORD:value2}:%{GREEDYDATA:message}',
-  ],
 };

This file was deleted.

This file was deleted.

@@ -10,7 +10,6 @@ import { StateGraph, END, START } from '@langchain/langgraph';
 import type { UnstructuredLogState } from '../../types';
 import { handleUnstructured } from './unstructured';
 import type { UnstructuredGraphParams, UnstructuredBaseNodeParams } from './types';
-import { handleUnstructuredError } from './error';
 import { handleUnstructuredValidate } from './validate';

 const graphState: StateGraphArgs<UnstructuredLogState>['channels'] = {
@@ -30,6 +29,10 @@ const graphState: StateGraphArgs<UnstructuredLogState>['channels'] = {
     value: (x: string[], y?: string[]) => y ?? x,
     default: () => [],
   },
+  currentPattern: {
+    value: (x: string, y?: string) => y ?? x,
+    default: () => '',
+  },
   grokPatterns: {
     value: (x: string[], y?: string[]) => y ?? x,
     default: () => [],
@@ -42,8 +45,12 @@ const graphState: StateGraphArgs<UnstructuredLogState>['channels'] = {
     value: (x: boolean, y?: boolean) => y ?? x,
     default: () => false,
   },
+  unParsedSamples: {
+    value: (x: string[], y?: string[]) => y ?? x,
+    default: () => [],
+  },
   errors: {
-    value: (x: object, y?: object) => y ?? x,
+    value: (x: object[], y?: object[]) => y ?? x,
     default: () => [],
   },
   additionalProcessors: {
@@ -54,11 +61,16 @@ const graphState: StateGraphArgs<UnstructuredLogState>['channels'] = {
     value: (x: string, y?: string) => y ?? x,
     default: () => '',
   },
+  isFirst: {
+    value: (x: boolean, y?: boolean) => y ?? x,
+    default: () => false,
+  },
 };

 function modelInput({ state }: UnstructuredBaseNodeParams): Partial<UnstructuredLogState> {
   return {
     finalized: false,
+    isFirst: true,
     lastExecutedChain: 'modelInput',
   };
 }
@@ -72,10 +84,10 @@ function modelOutput({ state }: UnstructuredBaseNodeParams): Partial<Unstructure
 }

 function validationRouter({ state }: UnstructuredBaseNodeParams): string {
-  if (Object.keys(state.errors).length === 0) {
+  if (Object.keys(state.unParsedSamples).length === 0) {
     return 'modelOutput';
   }
-  return 'handleUnstructuredError';
+  return 'handleUnparsed';
 }

 export async function getUnstructuredGraph({ model, client }: UnstructuredGraphParams) {
@@ -84,9 +96,6 @@ export async function getUnstructuredGraph({ model, client }: UnstructuredGraphP
   })
     .addNode('modelInput', (state: UnstructuredLogState) => modelInput({ state }))
     .addNode('modelOutput', (state: UnstructuredLogState) => modelOutput({ state }))
-    .addNode('handleUnstructuredError', (state: UnstructuredLogState) =>
-      handleUnstructuredError({ state, model, client })
-    )
     .addNode('handleUnstructured', (state: UnstructuredLogState) =>
       handleUnstructured({ state, model, client })
     )
@@ -100,11 +109,10 @@ export async function getUnstructuredGraph({ model, client }: UnstructuredGraphP
       'handleUnstructuredValidate',
       (state: UnstructuredLogState) => validationRouter({ state }),
       {
-        handleUnstructuredError: 'handleUnstructuredError',
+        handleUnparsed: 'handleUnstructured',
         modelOutput: 'modelOutput',
       }
     )
-    .addEdge('handleUnstructuredError', 'handleUnstructuredValidate')
     .addEdge('modelOutput', END);

   const compiledUnstructuredGraph = workflow.compile();
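
The routing change above can be illustrated without the LangGraph runtime. The sketch below is a simplified model, not the code from this commit: `RouterState`, `lastValue`, `route` and `edgeMap` are hypothetical names. It mirrors two things from the diff: the `y ?? x` reducer shape used by every channel, and the router that loops back to `handleUnstructured` while unparsed samples remain.

```typescript
interface RouterState {
  unParsedSamples: string[];
}

// Same shape as the channel reducers in the diff: the update wins when it is
// defined, otherwise the current value is kept.
const lastValue = <T>(current: T, update?: T): T => update ?? current;

// Simplified router: the diff uses Object.keys(...).length, which for an
// array is equivalent to checking its length.
function route(state: RouterState): 'modelOutput' | 'handleUnparsed' {
  return state.unParsedSamples.length === 0 ? 'modelOutput' : 'handleUnparsed';
}

// Router result -> next node, copied from the conditional edge map in the diff.
const edgeMap: Record<'modelOutput' | 'handleUnparsed', string> = {
  handleUnparsed: 'handleUnstructured',
  modelOutput: 'modelOutput',
};

let routerState: RouterState = { unParsedSamples: ['dummy data'] };
const firstHop = edgeMap[route(routerState)];

// A validation step that parsed everything reports an empty list. Because []
// is not null or undefined, the reducer replaces the old value with it.
routerState = { unParsedSamples: lastValue(routerState.unParsedSamples, []) };
const secondHop = edgeMap[route(routerState)];

console.log(firstHop, secondHop);
```

One consequence of the `??` reducer is that a node can clear a list by returning `[]`, while returning nothing for that key leaves the previous value in place.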

@@ -14,6 +14,9 @@ export const GROK_MAIN_PROMPT = ChatPromptTemplate.fromMessages([
 <samples>
 {samples}
 </samples>
+<errors>
+{errors}
+</errors>
 </context>`,
   ],
   [
@@ -22,18 +25,19 @@ export const GROK_MAIN_PROMPT = ChatPromptTemplate.fromMessages([
 Your goal is to accurately extract key components such as timestamps, hostnames, priority levels, process names, events, VLAN information, MAC addresses, IP addresses, STP roles, port statuses, messages and more.
 Follow these steps to help improve the grok patterns and apply it step by step:
-1. Familiarize yourself with various syslog message formats.
-2. PRI (Priority Level): Encoded in angle brackets, e.g., <134>, indicating the facility and severity.
-3. Timestamp: Use \`SYSLOGTIMESTAMP\` for RFC 3164 timestamps (e.g., Aug 10 16:34:02). Use \`TIMESTAMP_ISO8601\` for ISO 8601 (RFC 5424) timestamps. For epoch time, use \`NUMBER\`.
-4. If the timestamp could not be categorized into a predefined format, extract the date time fields separately and combine them with the format identified in the grok pattern.
-5. Make sure to identify the timezone component in the timestamp.
-6. Hostname/IP Address: The system or device that generated the message, which could be an IP address or fully qualified domain name
-7. Process Name and PID: Often included with brackets, such as sshd[1234].
-8. VLAN information: Usually in the format of VLAN: 1234.
-9. MAC Address: The network interface MAC address.
-10. Port number: The port number on the device.
-11. Look for status codes ,interface ,log type, source ,User action, destination, protocol, etc.
-12. message: This is the free-form message text that varies widely across log entries.
+1. If there are errors try to identify the root cause and provide a solution.
+2. Familiarize yourself with various syslog message formats.
+3. PRI (Priority Level): Encoded in angle brackets, e.g., <134>, indicating the facility and severity.
+4. Timestamp: Use \`SYSLOGTIMESTAMP\` for RFC 3164 timestamps (e.g., Aug 10 16:34:02). Use \`TIMESTAMP_ISO8601\` for ISO 8601 (RFC 5424) timestamps. For epoch time, use \`NUMBER\`.
+5. If the timestamp could not be categorized into a predefined format, extract the date time fields separately and combine them with the format identified in the grok pattern.
+6. Make sure to identify the timezone component in the timestamp.
+7. Hostname/IP Address: The system or device that generated the message, which could be an IP address or fully qualified domain name
+8. Process Name and PID: Often included with brackets, such as sshd[1234].
+9. VLAN information: Usually in the format of VLAN: 1234.
+10. MAC Address: The network interface MAC address.
+11. Port number: The port number on the device.
+12. Look for status codes ,interface ,log type, source ,User action, destination, protocol, etc.
+13. message: This is the free-form message text that varies widely across log entries.
 You ALWAYS follow these guidelines when writing your response:
@@ -54,54 +58,3 @@ export const GROK_MAIN_PROMPT = ChatPromptTemplate.fromMessages([
   ],
   ['ai', 'Please find the JSON object below:'],
 ]);
-
-export const GROK_ERROR_PROMPT = ChatPromptTemplate.fromMessages([
-  [
-    'system',
-    `You are an expert in Syslogs and identifying the headers and structured body in syslog messages. Here is some context for you to reference for your task, read it carefully as you will get questions about it later:
-<context>
-<current_pattern>
-{current_pattern}
-</current_pattern>
-</context>`,
-  ],
-  [
-    'human',
-    `Please go through each error below, carefully review the provided current grok pattern, and resolve the most likely cause to the supplied error by returning an updated version of the current_pattern.
-<errors>
-{errors}
-</errors>
-Follow these steps to help improve the grok patterns and apply it step by step:
-1. Familiarize yourself with various syslog message formats.
-2. PRI (Priority Level): Encoded in angle brackets, e.g., <134>, indicating the facility and severity.
-3. Timestamp: Use \`SYSLOGTIMESTAMP\` for RFC 3164 timestamps (e.g., Aug 10 16:34:02). Use \`TIMESTAMP_ISO8601\` for ISO 8601 (RFC 5424) timestamps. For epoch time, use \`NUMBER\`.
-4. If the timestamp could not be categorized into a predefined format, extract the date time fields separately and combine them with the format identified in the grok pattern.
-5. Make sure to identify the timezone component in the timestamp.
-6. Hostname/IP Address: The system or device that generated the message, which could be an IP address or fully qualified domain name
-7. Process Name and PID: Often included with brackets, such as sshd[1234].
-8. VLAN information: Usually in the format of VLAN: 1234.
-9. MAC Address: The network interface MAC address.
-10. Port number: The port number on the device.
-11. Look for status codes ,interface ,log type, source ,User action, destination, protocol, etc.
-12. message: This is the free-form message text that varies widely across log entries.
-You ALWAYS follow these guidelines when writing your response:
-<guidelines>
-- Make sure to map the remaining message part to \'message\' in grok pattern.
-- Make sure to add \`{packageName}.{dataStreamName}\` as a prefix to each field in the pattern. Refer to example response.
-- Do not respond with anything except the processor as a JSON object enclosed with 3 backticks (\`), see example response above. Use strict JSON response format.
-</guidelines>
-You are required to provide the output in the following example response format:
-<example_response>
-A: Please find the JSON object below:
-\`\`\`json
-{ex_answer}
-\`\`\`
-</example_response>`,
-  ],
-  ['ai', 'Please find the JSON object below:'],
-]);

@@ -25,7 +25,7 @@ export interface HandleUnstructuredNodeParams extends UnstructuredNodeParams {
 }

 export interface GrokResult {
-  grok_patterns: string[];
+  grok_pattern: string;
   message: string;
 }

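
With `GrokResult` now carrying a single `grok_pattern`, each model response contributes one pattern to the accumulated list. The sketch below shows one way that accumulation might look. It is an assumption-laden illustration rather than the handler from this commit: `SinglePatternResponse`, `hasGrokPattern` and `appendPattern` are hypothetical names, and only the `grok_pattern` field is checked.

```typescript
// Hypothetical minimal view of a model response; only grok_pattern is required here.
interface SinglePatternResponse {
  grok_pattern: string;
}

// Hypothetical type guard for the parsed JSON payload.
function hasGrokPattern(value: unknown): value is SinglePatternResponse {
  if (typeof value !== 'object' || value === null) {
    return false;
  }
  return typeof (value as Record<string, unknown>).grok_pattern === 'string';
}

// Appends the pattern from a raw JSON response, skipping exact duplicates.
function appendPattern(grokPatterns: string[], rawResponse: string): string[] {
  const parsed: unknown = JSON.parse(rawResponse);
  if (!hasGrokPattern(parsed)) {
    throw new Error('Response does not contain a string grok_pattern');
  }
  return grokPatterns.includes(parsed.grok_pattern)
    ? grokPatterns
    : [...grokPatterns, parsed.grok_pattern];
}

const firstRound = appendPattern([], '{"grok_pattern":"%{GREEDYDATA:message}"}');
const secondRound = appendPattern(firstRound, '{"grok_pattern":"%{GREEDYDATA:message}"}');
console.log(firstRound, secondRound);
```

Returning a new array instead of mutating the input fits the replace-on-update reducers used for the graph state.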