Commit ab20070

Merge branch 'master' into syntest
2 parents 6cba069 + 8238d80 commit ab20070

33 files changed

+1928
-906
lines changed

.travis.yml

-7
Original file line numberDiff line numberDiff line change
@@ -25,13 +25,6 @@ cache:
2525

2626
script:
2727
- cargo build
28-
- |
29-
if [ "$TRAVIS_OS_NAME" == "linux" ]
30-
then
31-
p=$(cd ./target/debug/build/onig_sys-*/out/lib/ && pwd)
32-
echo "adding $p to linker path"
33-
export LD_LIBRARY_PATH="${p}:${LD_LIBRARY_PATH}"
34-
fi
3528
- cargo test
3629
- make assets
3730
- make syntest

CHANGELOG.md

+54
@@ -0,0 +1,54 @@
1+
# Version 3.0
2+
3+
This is a major release with multiple breaking API changes, although upgrading shouldn't be too difficult. It fixes bugs and comes with some nice new features.
4+
5+
## Breaking changes and upgrading
6+
7+
- The `SyntaxSet` API has been revamped to use a builder and an arena of contexts. See [example usage](https://github.com/trishume/syntect/blob/51208d35a6d98c07468fbe044d5c6f37eb129205/examples/gendata.rs#L25-L28).
8+
- Many functions now need to be passed the `SyntaxSet` that goes with the rest of their arguments because of this new arena.
9+
- Filename added to `LoadingError::ParseSyntax`
10+
- Many functions in the `html` module now take the `newlines` version of syntaxes.
11+
- These methods have also been renamed, partially so that code that needs updating doesn't break without a compile error.
12+
- The HTML they output also treats newlines slightly differently, and I think more correctly, although it looks uglier when you read the raw HTML.
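To illustrate why an arena means the set has to be passed alongside other arguments, here is a minimal std-only sketch. It is not syntect's implementation: `ContextId`, `SetBuilder`, `ContextSet` and `included_names` are hypothetical names, and the real builder works on full syntax definitions rather than name lists. The idea shown is that links between contexts become plain indices, which only mean something together with the set that owns the arena.

```rust
/// Index of a context inside the set's arena (hypothetical name).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ContextId(usize);

#[derive(Debug)]
struct Context {
    name: String,
    /// Links to other contexts are indices into the arena, not shared pointers.
    includes: Vec<ContextId>,
}

/// Mutable stage: contexts are added and refer to each other by name.
#[derive(Default)]
struct SetBuilder {
    pending: Vec<(String, Vec<String>)>,
}

impl SetBuilder {
    fn add(&mut self, name: &str, includes: &[&str]) {
        let includes = includes.iter().map(|s| s.to_string()).collect();
        self.pending.push((name.to_string(), includes));
    }

    /// Resolve names to arena indices once, producing an immutable set.
    /// Names that match no context are dropped in this sketch.
    fn build(self) -> ContextSet {
        let names: Vec<String> = self.pending.iter().map(|(n, _)| n.clone()).collect();
        let contexts = self
            .pending
            .into_iter()
            .map(|(name, includes)| {
                let includes = includes
                    .iter()
                    .filter_map(|inc| names.iter().position(|n| n == inc))
                    .map(ContextId)
                    .collect();
                Context { name, includes }
            })
            .collect();
        ContextSet { contexts }
    }
}

/// Immutable stage: owns every context in one `Vec`.
struct ContextSet {
    contexts: Vec<Context>,
}

impl ContextSet {
    fn find(&self, name: &str) -> Option<ContextId> {
        self.contexts.iter().position(|c| c.name == name).map(ContextId)
    }

    fn get(&self, id: ContextId) -> &Context {
        &self.contexts[id.0]
    }
}

/// Following a link needs the set, because an id alone is just a number.
fn included_names(set: &ContextSet, name: &str) -> Vec<String> {
    match set.find(name) {
        Some(id) => set
            .get(id)
            .includes
            .iter()
            .map(|&inc| set.get(inc).name.clone())
            .collect(),
        None => Vec::new(),
    }
}
```

Because the built set holds no shared mutable pointers, a structure like this can be read from several threads at once, which is consistent with the thread-safety benefit the changelog mentions.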
13+
14+
#### Breaking rename upgrade guide
15+
16+
- `SyntaxSet::add_syntax -> SyntaxSetBuilder::add`
17+
- `SyntaxSet::load_syntaxes -> SyntaxSetBuilder::add_from_folder`
18+
- `SyntaxSet::load_plain_text_syntax -> SyntaxSetBuilder::add_plain_text_syntax`
19+
- `html::highlighted_snippet_for_string -> html::highlighted_html_for_string`: also change to `newlines` `SyntaxSet`
20+
- `html::highlighted_snippet_for_file -> html::highlighted_html_for_file`: also change to `newlines` `SyntaxSet`
21+
- `html::styles_to_coloured_html -> html::styled_line_to_highlighted_html`: also change to `newlines` `SyntaxSet`
22+
- `html::start_coloured_html_snippet -> html::start_highlighted_html_snippet`: return type also changed
23+
24+
## Major changes and new features
25+
26+
- Use arena for contexts (#182 #186 #187 #190 #195): This makes the code cleaner, enables use of syntaxes from multiple threads, and prevents accidental misuse.
27+
- This involves a new `SyntaxSetBuilder` API for constructing new `SyntaxSet`s
28+
- See the revamped [parsyncat example](https://github.com/trishume/syntect/blob/51208d35a6d98c07468fbe044d5c6f37eb129205/examples/parsyncat.rs).
29+
- Encourage use of newlines (#197 #207 #196): The `nonewlines` mode is often buggy so we made it easier to use the `newlines` mode.
30+
- Added a `LinesWithEndings` utility for iterating over the lines of a string with `\n` characters.
31+
- Reengineer the `html` module to use `newlines` syntaxes.
32+
- Add helpers for modifying highlighted lines (#198): For use cases like highlighting a piece of text in a blog code snippet or debugger. This allows you to reach into the highlighted spans and add styles.
33+
- Check out `split_at` and `modify_range` in the `util` module.
34+
- New `ThemeSet::add_from_folder` function (#200): For modifying existing theme sets.
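The span helpers mentioned above can be pictured with a small std-only sketch. The functions `split_spans_at` and `restyle_range` are hypothetical stand-ins written for this note, not the signatures of `split_at` and `modify_range` in the `util` module, and the sketch assumes byte offsets that fall on UTF-8 character boundaries.

```rust
/// Split styled spans at byte offset `split_i` of the concatenated text.
/// Assumes `split_i` falls on a UTF-8 character boundary.
fn split_spans_at<'a, S: Clone>(
    spans: &[(S, &'a str)],
    split_i: usize,
) -> (Vec<(S, &'a str)>, Vec<(S, &'a str)>) {
    let mut before = Vec::new();
    let mut after = Vec::new();
    let mut remaining = split_i;
    for (style, text) in spans {
        let text: &'a str = *text;
        if remaining > 0 && remaining >= text.len() {
            // The whole span lies before the split point.
            before.push((style.clone(), text));
            remaining -= text.len();
        } else if remaining > 0 {
            // The split point is inside this span: cut it in two.
            let (head, tail) = text.split_at(remaining);
            before.push((style.clone(), head));
            after.push((style.clone(), tail));
            remaining = 0;
        } else {
            after.push((style.clone(), text));
        }
    }
    (before, after)
}

/// Apply `f` to the style of every span piece inside bytes `start..end`.
/// Assumes `start <= end`.
fn restyle_range<'a, S: Clone>(
    spans: &[(S, &'a str)],
    start: usize,
    end: usize,
    f: impl Fn(&S) -> S,
) -> Vec<(S, &'a str)> {
    let (mut out, rest) = split_spans_at(spans, start);
    let (middle, tail) = split_spans_at(&rest, end - start);
    out.extend(middle.into_iter().map(|(s, t)| (f(&s), t)));
    out.extend(tail);
    out
}
```

With spans `[("plain", "let x"), ("num", " = 42;")]`, calling `restyle_range(&spans, 4, 5, |_| "mark")` restyles only the `x`, which is the kind of operation a blog snippet or debugger highlight needs.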
35+
36+
## Bug Fixes
37+
38+
- Improve nonewlines regex rewriting: #212 #211
39+
- Reengineer theme application to match Sublime: #209
40+
- Also mark contexts referenced by name as "no prototype" (same as ST): #180
41+
- Keep `with_prototype` when switching contexts with `set`: #177 #166
42+
- Fix unused import warning: #174
43+
- Ignore trailing dots in selectors: #173
44+
- Fix `embed` to not include prototypes: #172 #160
45+
46+
## Upgraded dependencies
47+
48+
- plist: `0.2 -> 0.3`
49+
- regex: `0.2 -> 1.0`
50+
- onig: `3.2.1 -> 4.1`
51+
52+
# Prior versions
53+
54+
See the GitHub release notes: <https://github.com/trishume/syntect/releases>

Cargo.toml

+6-5
@@ -7,31 +7,32 @@ keywords = ["syntax", "highlighting", "highlighter", "colouring", "parsing"]
77
categories = ["parser-implementations", "parsing", "text-processing"]
88
readme = "Readme.md"
99
license = "MIT"
10-
version = "2.1.0" # remember to update html_root_url
10+
version = "3.0.0" # remember to update html_root_url
1111
authors = ["Tristan Hume <[email protected]>"]
1212
exclude = [
1313
"testdata/*",
1414
]
1515

1616
[dependencies]
1717
yaml-rust = { version = "0.4", optional = true }
18-
onig = { version = "3.2.1", optional = true }
18+
onig = { version = "4.1", optional = true }
1919
walkdir = "2.0"
20-
regex-syntax = { version = "0.4", optional = true }
20+
regex-syntax = { version = "0.6", optional = true }
2121
lazy_static = "1.0"
22+
lazycell = "1.0"
2223
bitflags = "1.0"
2324
plist = "0.3"
2425
bincode = { version = "1.0", optional = true }
2526
flate2 = { version = "1.0", optional = true, default-features = false }
2627
fnv = { version = "1.0", optional = true }
27-
serde = { version = "1.0", features = ["rc"] }
28+
serde = "1.0"
2829
serde_derive = "1.0"
2930
serde_json = "1.0"
3031

3132
[dev-dependencies]
3233
criterion = "0.2"
3334
rayon = "1.0.0"
34-
regex = "0.2"
35+
regex = "1.0"
3536
getopts = "0.2"
3637
pretty_assertions = "0.5.0"
3738

Readme.md

+46-37
@@ -7,27 +7,23 @@
77

88
`syntect` is a syntax highlighting library for Rust that uses [Sublime Text syntax definitions](http://www.sublimetext.com/docs/3/syntax.html#include-syntax). It aims to be a good solution for any Rust project that needs syntax highlighting, including deep integration with text editors written in Rust. It's used in production by at least two companies, and by [many open source projects](#projects-using-syntect).
99

10-
If you are writing a text editor (or something else needing highlighting) in Rust and this library doesn't fit your needs, I consider that a bug and you should file an issue or email me.
10+
If you are writing a text editor (or something else needing highlighting) in Rust and this library doesn't fit your needs, I consider that a bug and you should file an issue or email me. I consider this project mostly complete; I still maintain it and review PRs, but it's not under heavy development.
1111

12-
**Note:** I consider this project "done" in the sense that it works quite well for its intended purpose, accomplishes the major goals I had, and I'm unlikely to make any sweeping changes.
13-
I won't be committing much anymore because the marginal return on additional work isn't very high. Rest assured if you submit PRs I will review them and likely merge promptly.
14-
I'll also quite possibly still fix issues and definitely offer advice and knowledge on how the library works. Basically I'll be maintaining the library but not developing it further.
15-
I've spent months working on, tweaking, optimizing, documenting and testing this library. If you still have any reasons you don't think it fits your needs, file an issue or email me.
12+
## Important Links
1613

17-
### Rendered docs: <https://docs.rs/syntect>
14+
- API docs with examples: <https://docs.rs/syntect>
15+
- [Changelogs and upgrade notes for past releases](https://github.com/trishume/syntect/releases)
1816

1917
## Getting Started
2018

2119
`syntect` is [available on crates.io](https://crates.io/crates/syntect). You can install it by adding this line to your `Cargo.toml`:
2220

2321
```toml
24-
syntect = "2.1"
22+
syntect = "3.0"
2523
```
2624

2725
After that take a look at the [documentation](https://docs.rs/syntect) and the [examples](https://github.com/trishume/syntect/tree/master/examples).
2826

29-
**Note:** with stable Rust on Linux there is a possibility you might have to add `./target/debug/build/onig_sys-*/out/lib/` to your `LD_LIBRARY_PATH` environment variable. I dunno why or even if this happens on other places than Travis, but see `travis.yml` for what it does to make it work. Do this if you see `libonig.so: cannot open shared object file`.
30-
3127
If you've cloned this repository, be sure to run
3228

3329
```
@@ -36,17 +32,10 @@ git submodule update --init
3632

3733
to fetch all the required dependencies for running the tests.
3834

39-
### Feature Flags
40-
41-
Syntect makes heavy use of [cargo features](http://doc.crates.io/manifest.html#the-features-section), to support users who require only a subset of functionality. In particular, it is possible to use the highlighting component of syntect without the parser (for instance when hand-rolling a higher performance parser for a particular language), by adding `default-features = false` to the syntect entry in your `Cargo.toml`.
42-
43-
For more information on available features, see the features section in `Cargo.toml`.
44-
4535
## Features/Goals
4636

4737
- [x] Work with many languages (accomplished through using existing grammar formats)
48-
- [x] Highlight super quickly, faster than every editor except Sublime Text 3
49-
- [x] Load up quickly, currently in around 23ms but could potentially be even faster.
38+
- [x] Highlight super quickly, faster than nearly all text editors
5039
- [x] Include easy to use API for basic cases
5140
- [x] API allows use in fancy text editors with piece tables and incremental re-highlighting and the like.
5241
- [x] Expose internals of the parsing process so text editors can do things like cache parse states and use semantic info for code intelligence
@@ -55,6 +44,7 @@ For more information on available features, see the features section in `Cargo.t
5544
- [x] Well documented, I've tried to add a useful documentation comment to everything that isn't utterly self explanatory.
5645
- [x] Built-in output to coloured HTML `<pre>` tags or 24-bit colour ANSI terminal escape sequences.
5746
- [x] Nearly complete compatibility with Sublime Text 3, including lots of edge cases. Passes nearly all of Sublime's syntax tests, see [issue 59](https://github.com/trishume/syntect/issues/59).
47+
- [x] Load up quickly, currently in around 23ms but could potentially be even faster.
5848

5949
## Screenshots
6050

@@ -65,22 +55,29 @@ There's currently an example program called `syncat` that prints one of the sour
6555
![Solarized Light](http://i.imgur.com/l3zcO4J.png)
6656
![InspiredGithub](http://i.imgur.com/a7U1r2j.png)
6757

68-
## Roadmap
69-
70-
- [x] Sketch out representation of a Sublime Text syntax
71-
- [x] Parse `.sublime-syntax` files into the representation.
72-
- [x] Write an interpreter for the `.sublime-syntax` state machine that highlights an incoming iterator of file lines into an iterator of scope-annotated text.
73-
- [x] Parse TextMate/Sublime Text theme files
74-
- [x] Highlight a scope-annotated iterator into a colour-annotated iterator for display.
75-
- [x] Ability to dump loaded packages as binary file and load them with lazy regex compilation for fast start up times.
76-
- [x] Bundle dumped default syntaxes into the library binary so library users don't need an assets folder with Sublime Text packages.
77-
- [x] Add nice API wrappers for simple use cases. The base APIs are designed for deep high performance integration with arbitrary text editor data structures.
78-
- [x] Document the API better and make things private that don't need to be public
79-
- [x] Detect file syntax based on first line
80-
- [x] Make it really fast (mosty two hot-paths need caching, same places Textmate 2 caches)
81-
- [ ] Make syncat a better demo, and maybe more demo programs
82-
- [ ] Add sRGB colour correction (not sure if this is necessary, could be the job of the text editor)
83-
- [ ] Add C bindings so it can be used as a C library from other languages.
58+
## Example Code
59+
60+
Prints highlighted lines of a string to the terminal. See the [easy](https://docs.rs/syntect/latest/syntect/easy/index.html) and [html](https://docs.rs/syntect/latest/syntect/html/index.html) module docs for more basic use case examples.
61+
62+
```rust
63+
use syntect::easy::HighlightLines;
64+
use syntect::parsing::SyntaxSet;
65+
use syntect::highlighting::{ThemeSet, Style};
66+
use syntect::util::{as_24_bit_terminal_escaped, LinesWithEndings};
67+
68+
// Load these once at the start of your program
69+
let ps = SyntaxSet::load_defaults_newlines();
70+
let ts = ThemeSet::load_defaults();
71+
72+
let syntax = ps.find_syntax_by_extension("rs").unwrap();
73+
let mut h = HighlightLines::new(syntax, &ts.themes["base16-ocean.dark"]);
74+
let s = "pub struct Wow { hi: u64 }\nfn blah() -> u64 {}";
75+
for line in LinesWithEndings::from(s) {
76+
let ranges: Vec<(Style, &str)> = h.highlight(line, &ps);
77+
let escaped = as_24_bit_terminal_escaped(&ranges[..], true);
78+
println!("{}", escaped);
79+
}
80+
```
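The example above feeds the highlighter lines that still carry their trailing `\n`, which is what the `newlines` syntaxes expect. As a rough std-only picture of that iteration (an assumption about the behaviour, not the code of `LinesWithEndings`; `lines_with_endings` is a hypothetical helper), `str::split_inclusive` keeps the terminator where `str::lines` strips it:

```rust
/// Yield each line of `s` including its trailing '\n', if it has one.
fn lines_with_endings<'a>(s: &'a str) -> impl Iterator<Item = &'a str> + 'a {
    s.split_inclusive('\n')
}

/// For contrast: `str::lines` removes the line terminator.
fn lines_without_endings<'a>(s: &'a str) -> impl Iterator<Item = &'a str> + 'a {
    s.lines()
}
```

Concatenating the pieces from `lines_with_endings` gives back the original string, so no byte of the input is lost between the caller and the highlighter.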
8481

8582
## Performance
8683

@@ -110,7 +107,13 @@ All measurements were taken on a mid 2012 15" retina Macbook Pro.
110107
- ~1.9ms to parse and highlight the 30 line 791 character `testdata/highlight_test.erb` file. This works out to around 16,000 lines/second or 422 kilobytes/second.
111108
- ~250ms end to end for `syncat` to start, load the definitions, highlight the test file and shut down. This is mostly spent loading.
112109

113-
### Caching
110+
## Feature Flags
111+
112+
Syntect makes heavy use of [cargo features](http://doc.crates.io/manifest.html#the-features-section), to support users who require only a subset of functionality. In particular, it is possible to use the highlighting component of syntect without the parser (for instance when hand-rolling a higher performance parser for a particular language), by adding `default-features = false` to the syntect entry in your `Cargo.toml`.
113+
114+
For more information on available features, see the features section in `Cargo.toml`.
115+
116+
## Caching
114117

115118
Because `syntect`'s API exposes internal cacheable data structures, there is a caching strategy that text editors can use that allows the text on screen to be re-rendered instantaneously regardless of the file size when a change is made after the initial highlight.
116119

@@ -120,11 +123,15 @@ This way from the time the edit happens to the time the new colouring gets rende
120123

121124
Any time the file is changed the latest cached state is found, the cache is cleared after that point, and a background job is started. Any already running jobs are stopped because they would be working on old state. This way you can just have one thread dedicated to highlighting that is always doing the most up-to-date work, or sleeping.
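A minimal sketch of the "find the latest cached state, clear the cache after that point" step might look as follows. Everything here is an assumption made for illustration: the checkpoint interval, the `StateCache` type, and the toy parser state (a brace nesting depth) stand in for the real parse and highlight states, and the background job is reduced to a plain function call.

```rust
/// Caches the parser state at the start of every `interval`-th line.
struct StateCache<S> {
    interval: usize,
    /// `checkpoints[k]` is the state at the start of line `k * interval`.
    checkpoints: Vec<S>,
}

impl<S: Clone> StateCache<S> {
    fn new(interval: usize, initial: S) -> Self {
        assert!(interval > 0);
        StateCache { interval, checkpoints: vec![initial] }
    }

    /// Store `state` if `line` is the next checkpoint that is still missing.
    fn record(&mut self, line: usize, state: &S) {
        if line % self.interval == 0 && line / self.interval == self.checkpoints.len() {
            self.checkpoints.push(state.clone());
        }
    }

    /// Drop checkpoints after `edited_line` and return where to resume.
    fn invalidate_from(&mut self, edited_line: usize) -> (usize, S) {
        // A checkpoint at line k * interval only depends on earlier lines,
        // so every checkpoint at or before the edited line stays valid.
        self.checkpoints.truncate(edited_line / self.interval + 1);
        let last = self.checkpoints.len() - 1;
        (last * self.interval, self.checkpoints[last].clone())
    }
}

/// Toy parser state: current brace nesting depth.
fn parse_line(depth: usize, line: &str) -> usize {
    line.chars().fold(depth, |d, c| match c {
        '{' => d + 1,
        '}' => d.saturating_sub(1),
        _ => d,
    })
}

/// Re-parse from `start`, refilling checkpoints along the way.
fn reparse_from(cache: &mut StateCache<usize>, lines: &[&str], start: usize, mut depth: usize) -> usize {
    for (i, line) in lines.iter().enumerate().skip(start) {
        cache.record(i, &depth);
        depth = parse_line(depth, line);
    }
    depth
}
```

After an edit, the caller would invoke `invalidate_from` with the edited line and hand the returned line and state to the highlighting job, so the amount of re-parsing before the edit point is bounded by the interval rather than by the file size.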
122125

123-
### Parallelizing
126+
## Parallelizing
127+
128+
Since 3.0, `syntect` can be used to do parsing/highlighting in parallel. `SyntaxSet` is both `Send` and `Sync` and so can easily be used from multiple threads. It is also `Clone`, which means you can construct a syntax set and then clone it to use for other threads if you prefer.
124129

125-
`syntect` doesn't provide any built-in facilities to enable highlighting in parallel. Some of the important data structures are not thread-safe, either, most notably `SyntaxSet`. However, if you find yourself in need of highlighting lots of files in parallel, the recommendation is to use some sort of thread pooling, along with the `thread_local!` macro from `libstd`, so that each thread that needs, say, a `SyntaxSet`, will have one, while minimizing the amount of them that need to be initialized. For adding parallelism to a previously single-threaded program, the recommended thread pooling is [`rayon`](https://github.com/nikomatsakis/rayon). However, if you're working in an already-threaded context where there might be more threads than you want (such as writing a handler for an Iron request), the recommendation is to force all highlighting to be done within a fixed-size thread pool using [`rust-scoped-pool`](https://github.com/reem/rust-scoped-pool). An example of the former is in `examples/parsyncat.rs`.
130+
Compared to older versions, there's nothing preventing the serialization of a `SyntaxSet` either. So you can directly deserialize a fully linked `SyntaxSet` and start using it for parsing/highlighting. Before, it was always necessary to do linking first.
126131

127-
See [#20](https://github.com/trishume/syntect/issues/20) and [#78](https://github.com/trishume/syntect/pull/78) for more detail and discussion about why `syntect` doesn't provide parallelism by default.
132+
It is worth mentioning that regex compilation is done lazily, only when the regexes are actually needed. Once a regex has been compiled, the compiled version is used for all threads after that. Note that this is done using interior mutability, so if multiple threads happen to encounter the same uncompiled regex at the same time, compiling might happen multiple times. After that, one of the compiled regexes will be used. Currently, when a `SyntaxSet` is cloned, the regexes in the cloned set need to be recompiled.
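The "compile possibly several times, keep one result" behaviour can be sketched with the standard library alone. This is an illustration under assumptions, not syntect's code: `LazyPattern` is a hypothetical type, the "compiled regex" is just the pattern's characters, and `std::sync::OnceLock` (Rust 1.70 or later) is swapped in for whatever cell the crate uses. The important detail is that the value is computed outside the cell and then offered with `set`, so racing threads may each compile, but only the first `set` is kept.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, OnceLock};
use std::thread;

/// Stand-in for a regex: the "compiled" form is just the pattern's chars.
struct LazyPattern {
    source: String,
    compiled: OnceLock<Vec<char>>,
    /// Counts how often compilation actually ran (for demonstration only).
    compile_runs: AtomicUsize,
}

impl LazyPattern {
    fn new(source: &str) -> Self {
        LazyPattern {
            source: source.to_string(),
            compiled: OnceLock::new(),
            compile_runs: AtomicUsize::new(0),
        }
    }

    /// Return the compiled form, compiling on first use.
    fn compiled(&self) -> &[char] {
        if let Some(done) = self.compiled.get() {
            return done;
        }
        // No lock is held here, so several threads may compile at once.
        self.compile_runs.fetch_add(1, Ordering::SeqCst);
        let fresh: Vec<char> = self.source.chars().collect();
        // The first `set` wins; a losing thread's result is simply dropped.
        let _ = self.compiled.set(fresh);
        self.compiled.get().expect("cell was filled by this or another thread")
    }
}

/// Use one shared pattern from `threads` threads and collect the lengths seen.
fn use_from_threads(pattern: Arc<LazyPattern>, threads: usize) -> Vec<usize> {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let p = Arc::clone(&pattern);
            thread::spawn(move || p.compiled().len())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

The trade-off is some duplicated work in a race in exchange for never blocking a thread on another thread's compilation; every thread still ends up reading the same stored value.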
133+
134+
For adding parallelism to a previously single-threaded program, the recommended thread pooling is [`rayon`](https://github.com/nikomatsakis/rayon). However, if you're working in an already-threaded context where there might be more threads than you want (such as writing a handler for an Iron request), the recommendation is to force all highlighting to be done within a fixed-size thread pool using [`rust-scoped-pool`](https://github.com/reem/rust-scoped-pool). An example of the former is in `examples/parsyncat.rs`.
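As a dependency-free picture of sharing one immutable set across a bounded number of workers, the sketch below uses `std::thread::scope` (Rust 1.63 or later) in place of `rayon` or `rust-scoped-pool`. `KeywordSet` and `count_in_parallel` are hypothetical stand-ins: the set here only counts keywords, whereas a real program would pass a `&SyntaxSet` to its highlighting code in the same way.

```rust
use std::thread;

/// Stand-in for an immutable, `Sync` syntax set shared by reference.
struct KeywordSet {
    keywords: Vec<&'static str>,
}

impl KeywordSet {
    fn count_keywords(&self, text: &str) -> usize {
        text.split_whitespace()
            .filter(|w| self.keywords.iter().any(|k| *k == *w))
            .count()
    }
}

/// Process `files` on at most `workers` threads, all borrowing the same set.
fn count_in_parallel(set: &KeywordSet, files: &[&str], workers: usize) -> Vec<usize> {
    let workers = workers.max(1);
    // Ceiling division so that no more than `workers` chunks are created.
    let chunk_len = ((files.len() + workers - 1) / workers).max(1);
    thread::scope(|scope| {
        let handles: Vec<_> = files
            .chunks(chunk_len)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk.iter().map(|f| set.count_keywords(f)).collect::<Vec<_>>()
                })
            })
            .collect();
        // Joining in spawn order keeps results in the same order as `files`.
        handles.into_iter().flat_map(|h| h.join().unwrap()).collect()
    })
}
```

No cloning or `thread_local!` storage is needed in this arrangement, because the workers only ever hold shared references to the set.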
128135

129136
## Examples Available
130137

@@ -173,4 +180,6 @@ Below is a list of projects using Syntect, in approximate order by how long they
173180

174181
## License and Acknowledgements
175182

183+
Thanks to [Robin Stocker](https://github.com/robinst) and [Keith Hall](https://github.com/keith-hall) for their substantial contributions, which account for the most important and impressive improvements `syntect` has had post-`v1.0`! They deserve lots of credit for where `syntect` is today.
184+
176185
Thanks to [Textmate 2](https://github.com/textmate/textmate) and @defuz's [sublimate](https://github.com/defuz/sublimate) for the existing open source code I used as inspiration and in the case of sublimate's `tmTheme` loader, copy-pasted. All code (including defuz's sublimate code) is released under the MIT license.

assets/default_newlines.packdump

25.3 KB
Binary file not shown.

assets/default_nonewlines.packdump

25.4 KB
Binary file not shown.

benches/highlighting.rs

+23-6
@@ -4,18 +4,19 @@ extern crate syntect;
44

55
use criterion::{Bencher, Criterion};
66

7-
use syntect::parsing::{SyntaxSet, SyntaxDefinition, ScopeStack};
7+
use syntect::parsing::{SyntaxSet, SyntaxReference, ScopeStack};
88
use syntect::highlighting::{ThemeSet, Theme};
99
use syntect::easy::HighlightLines;
10+
use syntect::html::highlighted_html_for_string;
1011
use std::str::FromStr;
1112
use std::fs::File;
1213
use std::io::Read;
1314

14-
fn do_highlight(s: &str, syntax: &SyntaxDefinition, theme: &Theme) -> usize {
15+
fn do_highlight(s: &str, syntax_set: &SyntaxSet, syntax: &SyntaxReference, theme: &Theme) -> usize {
1516
let mut h = HighlightLines::new(syntax, theme);
1617
let mut count = 0;
1718
for line in s.lines() {
18-
let regions = h.highlight(line);
19+
let regions = h.highlight(line, syntax_set);
1920
count += regions.len();
2021
}
2122
count
@@ -33,16 +34,16 @@ fn highlight_file(b: &mut Bencher, file: &str) {
3334
};
3435

3536
// don't load from dump so we don't count lazy regex compilation time
36-
let ps = SyntaxSet::load_defaults_nonewlines();
37+
let ss = SyntaxSet::load_defaults_nonewlines();
3738
let ts = ThemeSet::load_defaults();
3839

39-
let syntax = ps.find_syntax_for_file(path).unwrap().unwrap();
40+
let syntax = ss.find_syntax_for_file(path).unwrap().unwrap();
4041
let mut f = File::open(path).unwrap();
4142
let mut s = String::new();
4243
f.read_to_string(&mut s).unwrap();
4344

4445
b.iter(|| {
45-
do_highlight(&s, syntax, &ts.themes["base16-ocean.dark"])
46+
do_highlight(&s, &ss, syntax, &ts.themes["base16-ocean.dark"])
4647
});
4748
}
4849

@@ -56,8 +57,24 @@ fn stack_matching(b: &mut Bencher) {
5657
});
5758
}
5859

60+
fn highlight_html(b: &mut Bencher) {
61+
let ss = SyntaxSet::load_defaults_newlines();
62+
let ts = ThemeSet::load_defaults();
63+
64+
let path = "testdata/parser.rs";
65+
let syntax = ss.find_syntax_for_file(path).unwrap().unwrap();
66+
let mut f = File::open(path).unwrap();
67+
let mut s = String::new();
68+
f.read_to_string(&mut s).unwrap();
69+
70+
b.iter(|| {
71+
highlighted_html_for_string(&s, &ss, syntax, &ts.themes["base16-ocean.dark"])
72+
});
73+
}
74+
5975
fn highlighting_benchmark(c: &mut Criterion) {
6076
c.bench_function("stack_matching", stack_matching);
77+
c.bench_function("highlight_html", highlight_html);
6178
c.bench_function_over_inputs(
6279
"highlight",
6380
|b, s| highlight_file(b, s),
