-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature request: expose 'with_flags' to allow compilation of unicode patterns #51
Comments
That API looks reasonable to me! And your branch in |
I've been putting together a branch with this and ran into quite a bit of difficulty trying to expose a bitwise-or friendly flag enum. I'm sure a PyO3 expert could solve it, but I'm somewhat stumped. Even if I can figure it out, I feel like it adds disproportionately much new rust code just to expose a minor nicety. So I'm now proposing: pattern = regress.Regex(pattern_str, flags="u") Which would also accept I'll have a PR up for this when I get back to my desk, later today, and I might share some of what I tried and struggled with back here. |
resolves crate-py#51 There is no support for `Flags` structs as provided by the `regress` crate. But a string can be passed down and is converted to Flags by the `with_flags` constructor. Exposing flags as a struct/enum adds significant complexity deemd not worth the addition code at this time. (See crate-py#51 for more details.) A python test verifies that the alternate constructor pattern works and provides support for the `"u"` flag.
resolves crate-py#51 There is no support for `Flags` structs as provided by the `regress` crate. But a string can be passed down and is converted to Flags by the `with_flags` constructor. Exposing flags as a struct/enum adds significant complexity deemed not worth the additional code at this time. (See crate-py#51 for more details.) A python test verifies that the alternate constructor pattern works and provides support for the `"u"` flag.
Okay, I put up #52 with my initial solution. Here's a patch on top of it which... didn't quite work: flags-enum.diffdiff --git a/src/lib.rs b/src/lib.rs
index e2df317..ca0f1a3 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -1,5 +1,5 @@
use pyo3::{create_exception, prelude::*, types::PySlice};
-use regress::{Match, Range, Regex};
+use regress::{Flags, Match, Range, Regex};
create_exception!(regress, RegressError, pyo3::exceptions::PyException);
@@ -13,12 +13,15 @@ struct RegexPy {
impl RegexPy {
#[new]
#[pyo3(signature = (value, flags=None))]
- fn init(value: &str, flags: Option<&str>) -> PyResult<Self> {
+ fn init(value: &str, flags: Option<&FlagTypes>) -> PyResult<Self> {
match flags {
- Some(f) => match Regex::with_flags(value, f) {
- Ok(inner) => Ok(RegexPy { inner }),
- Err(e) => Err(RegressError::new_err(e.to_string())),
- },
+ Some(f) => {
+ let regress_flags: Flags = Flags::from(f);
+ match Regex::with_flags(value, regress_flags) {
+ Ok(inner) => Ok(RegexPy { inner }),
+ Err(e) => Err(RegressError::new_err(e.to_string())),
+ }
+ }
None => match Regex::new(value) {
Ok(inner) => Ok(RegexPy { inner }),
Err(e) => Err(RegressError::new_err(e.to_string())),
@@ -38,6 +41,42 @@ impl RegexPy {
}
}
+#[repr(transparent)]
+#[pyclass(name = "Flags", module = "regress", frozen)]
+#[derive(Clone)]
+struct FlagsPy {
+ inner: Flags,
+}
+
+#[pymethods]
+impl FlagsPy {
+ #[new]
+ fn init(value: &str) -> Self {
+ FlagsPy {
+ inner: Flags::from(value),
+ }
+ }
+}
+
+#[derive(FromPyObject)]
+enum FlagTypes {
+ #[pyo3(annotation = "regress.Flags")]
+ FlagsPy(FlagsPy),
+ #[pyo3(transparent, annotation = "str")]
+ String(String),
+}
+
+impl From<&FlagTypes> for Flags {
+ // convert any Flag type (FlagsPy or str) to regress Flags
+ #[inline]
+ fn from(f: &FlagTypes) -> Self {
+ match f {
+ FlagTypes::FlagsPy(flags_py) => flags_py.inner,
+ FlagTypes::String(s) => Flags::from(s.as_str()),
+ }
+ }
+}
+
#[repr(transparent)]
#[pyclass(name = "Match", module = "regress", frozen)]
struct MatchPy {
@@ -70,6 +109,7 @@ impl MatchPy {
fn regress_py(py: Python, m: &Bound<'_, PyModule>) -> PyResult<()> {
m.add_class::<MatchPy>()?;
m.add_class::<RegexPy>()?;
+ m.add_class::<FlagsPy>()?;
m.add("RegressError", py.get_type_bound::<RegressError>())?;
Ok(())
}
diff --git a/tests/test_regress.py b/tests/test_regress.py
index c8e4f1d..a146425 100644
--- a/tests/test_regress.py
+++ b/tests/test_regress.py
@@ -72,3 +72,17 @@ def test_with_flags():
flagless_match = flagless_regex.find("a 0")
assert match is not None
assert flagless_match is None
+
+
+def test_with_flags_struct():
+ # same as the above flags test, but using the struct variant
+ # "L" for letters, "Z" for spaces, "N" for numerics
+ pattern = r"^\p{L}\p{Z}\p{N}$"
+
+ regex = regress.Regex(pattern, flags=regress.Flags("u"))
+ flagless_regex = regress.Regex(pattern)
+
+ match = regex.find("a 0")
+ flagless_match = flagless_regex.find("a 0")
+ assert match is not None
+ assert flagless_match is None Basically, I tried to add a
This surprised me. I was trying to make that enum express the Python |
Context
I got to spend a few hours working on getting
regress
support forpattern
intocheck-jsonschema
, and it works great! ... On version 0.4.5! 😅I don't fully understand the issue yet (need more time to research), but I think that I need to pass the
"u"
flag explicitly to Regress.Purely if you're curious, my current work is here and is passing in CI with v0.4.5 pinned: https://github.com/python-jsonschema/check-jsonschema/compare/use-regress-for-patterns
MRE
The following python sample demonstrates a change in behavior since
0.4.5
:On 0.4.5 I get a match, and on 2024.8.1 I get
None
.To save anyone the effort of looking up those character classes, roughly:
L
is "letters",Z
is "spaces", andN
is "numbers".Change proposal
I'm not sure if the API in Python needs to reflect the
regress
crate perfectly.I was thinking that the ideal might be...
So that we nicely match the
re
module's API.I'd like to work on this when I can throw some cycles at this, if this seems like a good path.
The text was updated successfully, but these errors were encountered: