Commit b3edb48

AG-38480 Add stop functionality to CSS tokenizer callback and implement hasToken function
Squashed commit of the following:

commit c7c02e5
Merge: 0984c52 be39a69
Author: jellizaveta <[email protected]>
Date: Mon Feb 3 17:42:56 2025 +0300

    Merge branch 'master' into fix/AG-38480_2

commit 0984c52
Author: jellizaveta <[email protected]>
Date: Mon Feb 3 16:25:34 2025 +0300

    update changelog and readme

commit 3dab213
Author: jellizaveta <[email protected]>
Date: Mon Feb 3 16:23:49 2025 +0300

    add test

commit 9cd8541
Author: jellizaveta <[email protected]>
Date: Fri Jan 31 13:05:55 2025 +0300

    add jsDoc

commit 5a62369
Author: jellizaveta <[email protected]>
Date: Fri Jan 31 12:15:59 2025 +0300

    update changelog, package.json

commit 29da501
Author: Dávid Tóta <[email protected]>
Date: Fri Jan 31 12:03:47 2025 +0300

    Applied suggestion

commit f8d2593
Author: Dávid Tóta <[email protected]>
Date: Fri Jan 31 12:03:39 2025 +0300

    Applied suggestion

commit 5063cc0
Merge: 80e3e1e 193e35c
Author: jellizaveta <[email protected]>
Date: Fri Jan 31 10:22:51 2025 +0300

    merge master, resolve conflicts

commit 80e3e1e
Author: jellizaveta <[email protected]>
Date: Fri Jan 31 10:16:07 2025 +0300

    fix date in changelog

commit 8dd7ada
Author: jellizaveta <[email protected]>
Date: Fri Jan 31 10:15:22 2025 +0300

    fix arrow function

commit 25fc5c9
Author: jellizaveta <[email protected]>
Date: Fri Jan 31 10:08:06 2025 +0300

    remove indent

commit add1b6b
Author: jellizaveta <[email protected]>
Date: Fri Jan 31 10:07:09 2025 +0300

    update stop function

commit 4502a69
Author: jellizaveta <[email protected]>
Date: Thu Jan 30 19:11:53 2025 +0300

    bind stop function to the context

commit 5699351
Author: Dávid Tóta <[email protected]>
Date: Thu Jan 30 18:00:21 2025 +0300

    Applied suggestion

commit 2ed879d
Author: jellizaveta <[email protected]>
Date: Thu Jan 30 15:44:16 2025 +0300

    fix order with balance

commit 8eff1ed
Author: jellizaveta <[email protected]>
Date: Thu Jan 30 14:19:04 2025 +0300

    remove import

commit 71fd503
Author: jellizaveta <[email protected]>
Date: Thu Jan 30 14:10:16 2025 +0300

    add indent

commit 3c2eeba
Author: jellizaveta <[email protected]>
Date: Thu Jan 30 14:09:17 2025 +0300

    update tests

commit 7e691c6
Author: jellizaveta <[email protected]>
Date: Thu Jan 30 13:48:30 2025 +0300

    update changelog date

commit d5937cd
Author: jellizaveta <[email protected]>
Date: Thu Jan 30 13:46:55 2025 +0300

    update function-token.test.ts

... and 21 more commits
1 parent be39a69 commit b3edb48

31 files changed, +493 -228 lines changed

packages/agtree/CHANGELOG.md (+8)

@@ -8,6 +8,14 @@ The format is based on [Keep a Changelog], and this project adheres to [Semantic
 [Keep a Changelog]: https://keepachangelog.com/en/1.0.0/
 [Semantic Versioning]: https://semver.org/spec/v2.0.0.html

+## [3.0.0-alpha.4] - 2024-02-03
+
+### Changed
+
+- Updated [@adguard/css-tokenizer] to `v1.2.0` which introduces the `hasToken` function.
+
+[3.0.0-alpha.4]: https://github.com/AdguardTeam/tsurlfilter/releases/tag/agtree-3.0.0-alpha.4
+
 ## [3.0.0-alpha.3] - 2025-01-30

 ### Changed

packages/agtree/package.json (+1 -1)

@@ -1,6 +1,6 @@
 {
     "name": "@adguard/agtree",
-    "version": "3.0.0-alpha.3",
+    "version": "3.0.0-alpha.4",
     "description": "Tool set for working with adblock filter lists",
     "keywords": [
         "adblock",

packages/agtree/src/parser/cosmetic/cosmetic-rule-parser.ts (+1 -2)

@@ -1,5 +1,5 @@
 import { sprintf } from 'sprintf-js';
-import { TokenType } from '@adguard/css-tokenizer';
+import { hasToken, TokenType } from '@adguard/css-tokenizer';

 import { CosmeticRuleSeparatorUtils } from '../../utils/cosmetic-rule-separator';
 import { AdblockSyntax } from '../../utils/adblockers';
@@ -44,7 +44,6 @@ import { UboScriptletInjectionBodyParser } from './body/ubo-scriptlet-injection-
 import { AdgScriptletInjectionBodyParser } from './body/adg-scriptlet-injection-body-parser';
 import { BaseParser } from '../base-parser';
 import { UboPseudoName } from '../../common/ubo-selector-common';
-import { hasToken } from '../css/has-token';

 /**
  * Possible error messages for uBO selectors. Formatted with {@link sprintf}.

packages/agtree/src/parser/css/balancing.ts (+36 -3)

@@ -15,6 +15,38 @@ import { sprintf } from 'sprintf-js';
 import { AdblockSyntaxError } from '../../errors/adblock-syntax-error';
 import { END_OF_INPUT, ERROR_MESSAGES } from './constants';

+/**
+ * Utility type to get the last element from a tuple, handling optional last elements correctly.
+ *
+ * @param T - The tuple to extract the last element from.
+ * @returns The last element of the tuple if present; `L | undefined` if the last element is optional.
+ */
+// eslint-disable-next-line @typescript-eslint/no-unused-vars
+type Last<T extends unknown[]> = T extends [...infer _I, infer L]
+    ? L
+    // eslint-disable-next-line @typescript-eslint/no-unused-vars
+    : T extends [...infer _I, (infer L)?] ? L | undefined : never;
+
+/**
+ * Utility type to remove the last element from a tuple, handling optional last elements correctly.
+ *
+ * @param T - The tuple to remove the last element from.
+ * @returns A tuple without the last element. If the last element is optional, it is also removed.
+ */
+// eslint-disable-next-line @typescript-eslint/no-unused-vars
+type OmitLast<T extends unknown[]> = T extends [...infer Rest, infer _Last]
+    ? Rest
+    // Handles cases where the last element is optional
+    // eslint-disable-next-line @typescript-eslint/no-unused-vars
+    : T extends [...infer Rest, (infer _Last)?]
+        ? Rest
+        : never;
+
+/**
+ * Extracts the parameters of `OnTokenCallback` as a tuple.
+ */
+type OnTokenCallbackParameters = Parameters<OnTokenCallback>;
+
 /**
  * Extended version of `OnTokenCallback` which also receives a `balance` parameter.
  *
@@ -23,11 +55,12 @@ import { END_OF_INPUT, ERROR_MESSAGES } from './constants';
  * @param end End index in the source string.
  * @param props Additional properties of the token (if any - can be `undefined`, depending on the token type).
  * @param balance Calculated balance level of the token.
+ * @param stop Function to halt tokenization.
  * @note This function is keeping the same signature as the original `OnTokenCallback` to avoid breaking changes,
  * just adding the `balance` parameter at the end.
  */
 export type OnBalancedTokenCallback = (
-    ...args: [...Parameters<OnTokenCallback>, ...[balance: number]]
+    ...args: [...OmitLast<OnTokenCallbackParameters>, balance: number, Last<OnTokenCallbackParameters>]
 ) => ReturnType<OnTokenCallback>;

 /**
@@ -72,7 +105,7 @@ const tokenizeWithBalancedPairs = (

     tokenizeExtended(
         raw,
-        (type: TokenType, start, end, props) => {
+        (type: TokenType, start, end, props, stop) => {
             if (tokenPairs.has(type)) {
                 // If the token is an opening token, push its corresponding closing token to the stack.
                 // It is safe to use non-null assertion here, because we have checked that the token exists in the map.
@@ -95,7 +128,7 @@ const tokenizeWithBalancedPairs = (
                 }
             }

-            onToken(type, start, end, props, stack.length);
+            onToken(type, start, end, props, stack.length, stop);
         },
         onError,
         functionHandlers,
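
The tuple helpers in this file exist so that `balance` can be placed before the trailing `stop` argument rather than after it. A minimal sketch of how the resulting parameter order resolves is shown here. `TokenType` and `OnTokenCallback` are declared locally as stand-ins (an assumption for illustration; the real ones come from `@adguard/css-tokenizer`), and the tuple members are left unlabeled to keep the sketch compatible with older TypeScript versions.

```typescript
// Stand-in declarations (assumed shapes, not imported from the library).
type TokenType = number;
type OnTokenCallback = (
    type: TokenType,
    start: number,
    end: number,
    props: Record<string, unknown> | undefined,
    stop: () => void,
) => void;

// Same helper logic as in the diff: take the last tuple element...
type Last<T extends unknown[]> = T extends [...infer _I, infer L]
    ? L
    : T extends [...infer _I, (infer L)?] ? L | undefined : never;

// ...and everything except the last tuple element.
type OmitLast<T extends unknown[]> = T extends [...infer Rest, infer _Last]
    ? Rest
    : T extends [...infer Rest, (infer _Last)?] ? Rest : never;

type Params = Parameters<OnTokenCallback>;

// Resolves to (type, start, end, props, balance, stop) => void.
type OnBalancedTokenCallback = (
    ...args: [...OmitLast<Params>, number, Last<Params>]
) => ReturnType<OnTokenCallback>;

const seen: string[] = [];
const state = { stopped: false };

// The callback receives `balance` as the fifth argument and `stop` as the sixth.
const onToken: OnBalancedTokenCallback = (type, start, end, _props, balance, stop) => {
    seen.push(`${type}@${start}-${end} balance=${balance}`);
    if (balance === 0) {
        stop();
    }
};

onToken(1, 0, 3, undefined, 0, () => { state.stopped = true; });
```

Keeping `stop` last means a balanced callback and a plain token callback agree on where the halting function lives, which is why the diff splits the original parameter tuple instead of simply appending `balance`.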

packages/agtree/src/parser/css/has-token.ts (-38)

This file was deleted.
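
With `stop` available in the token callback, a `hasToken`-style check can bail out at the first match instead of tokenizing the whole input. The following is a rough sketch of that idea only. `toyTokenize`, `ToyToken`, `hasToyToken`, and the simplistic token rules are hypothetical stand-ins and do not reflect the actual `@adguard/css-tokenizer` implementation.

```typescript
// Hypothetical token kinds for a toy tokenizer (not the library's `TokenType`).
enum ToyToken { Ident, OpenParen, CloseParen, Colon, Other }

type OnToyToken = (type: ToyToken, start: number, end: number, stop: () => void) => void;

/** Toy tokenizer: emits one token per run of letters, or per single other character. */
function toyTokenize(raw: string, onToken: OnToyToken): void {
    let halted = false;
    const stop = (): void => { halted = true; };
    let i = 0;

    // The loop checks the flag before producing each token, so `stop()` ends the scan early.
    while (i < raw.length && !halted) {
        const start = i;
        const ch = raw.charAt(i);
        if (/[a-z]/i.test(ch)) {
            while (i < raw.length && /[a-z]/i.test(raw.charAt(i))) {
                i += 1;
            }
            onToken(ToyToken.Ident, start, i, stop);
        } else {
            i += 1;
            let type = ToyToken.Other;
            if (ch === '(') type = ToyToken.OpenParen;
            else if (ch === ')') type = ToyToken.CloseParen;
            else if (ch === ':') type = ToyToken.Colon;
            onToken(type, start, i, stop);
        }
    }
}

// Demo-only counter that makes the early exit observable.
let emitted = 0;

/** Returns `true` as soon as any token type from `tokens` is seen. */
function hasToyToken(raw: string, tokens: Set<ToyToken>): boolean {
    let found = false;
    toyTokenize(raw, (type, _start, _end, stop) => {
        emitted += 1;
        if (tokens.has(type)) {
            found = true;
            stop();
        }
    });
    return found;
}
```

For the input `div:contains(foo)` and a set containing only `ToyToken.Colon`, this sketch visits two tokens (`div` and `:`) before halting; without `stop`, the callback would run for every remaining token as well.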

packages/css-tokenizer/CHANGELOG.md (+8)

@@ -7,6 +7,14 @@ The format is based on [Keep a Changelog][keepachangelog], and this project adhe
 [keepachangelog]: https://keepachangelog.com/en/1.0.0/
 [semver]: https://semver.org/spec/v2.0.0.html

+## [1.2.0] - 2025-01-31
+
+### Added
+
+- Stop functionality to `CSS tokenizer` callback and implement `hasToken` function.
+
+[1.2.0]: https://github.com/AdguardTeam/tsurlfilter/releases/tag/css-tokenizer-v1.2.0
+
 ## [1.1.1] - 2024-09-19

 ### Fixed

packages/css-tokenizer/README.md (+30 -1)

@@ -24,6 +24,7 @@ Table of contents:
         - [`tokenize`](#tokenize)
         - [`tokenizeExtended`](#tokenizeextended)
     - [Utilities](#utilities)
+        - [`hasToken`](#hastoken)
         - [`TokenizerContext`](#tokenizercontext)
         - [`decodeIdent`](#decodeident)
         - [`CSS_TOKENIZER_VERSION`](#css_tokenizer_version)
@@ -176,10 +177,17 @@ where
  * @param start Token start offset
  * @param end Token end offset
  * @param props Other token properties (if any)
+ * @param stop Function to halt the tokenization process
  * @note Hash tokens have a type flag set to either "id" or "unrestricted". The type flag defaults to "unrestricted" if
  * not otherwise set
  */
-type OnTokenCallback = (type: TokenType, start: number, end: number, props?: Record<string, unknown>) => void;
+type OnTokenCallback = (
+    type: TokenType,
+    start: number,
+    end: number,
+    props: Record<string, unknown> | undefined,
+    stop: () => void
+);
 ```

 ```ts
@@ -232,6 +240,27 @@ function tokenizeExtended(

 ### Utilities

+#### `hasToken`
+
+```ts
+/**
+ * Checks if the given raw string contains any of the specified tokens.
+ *
+ * @param raw - The raw string to be tokenized and checked.
+ * @param tokens - A set of token types to check for in the raw string.
+ * @param tokenizer - The tokenizer function to use. Defaults to `tokenizeExtended`.
+ *
+ * @example hasToken('div:contains("foo")', new Set([TokenType.Function]), tokenizeExtended); // true
+ *
+ * @returns `true` if any of the specified tokens are found in the raw string, otherwise `false`.
+ */
+function hasToken = (
+    raw: string,
+    tokens: Set<TokenType>,
+    tokenizer: TokenizerFunction = tokenizeExtended,
+): boolean
+```
+
 #### `TokenizerContext`

 A class that represents the tokenizer context. It is used to manage the tokenizer state and provides access to the
