Dynamic xpaths not working with HTML task preannotation via flask ML backend #6929
As a further explanation: I have really complicated HTML, including complex tables. The only way to get static XPaths that match the Label Studio DOM would be for Label Studio to provide the DOM to the ML backend. I have now modified the Label Studio frontend to get the DOM, so I can generate static XPaths from it and have the HTML predictions rendered, but it isn't a very elegant solution. I really appreciate the great job you are doing with Label Studio. It would be great if you could consider providing the DOM to the ML backend so that predictions match even with complex HTML.
Hello, We'll create a feature request! We greatly appreciate your feedback and the opportunity to consider your suggestion. Your request will be evaluated and ranked alongside other roadmap items. If our product team opts to proceed with your idea, we will keep you updated throughout the process. Please understand that while we take all requests seriously, we cannot promise implementation or a specific timeframe. Thank you,
Hello Abu, Anyway, thanks for the great work you guys are doing; it's fantastic software!
Hello, I've let the product team know about this feature request and will follow up as soon as I have an update from them. Thank you,
Hello Abu, Thanks for keeping me updated! I appreciate the consideration and look forward to any future updates. Best,
Describe the bug
“XPath-based predictions are added to the task data but not visually displayed in the UI when using dynamic XPaths like .//p[contains(., 'Nutzenbewertung')].”
To Reproduce
Steps to reproduce the behavior:
Use the following HTML in a Label Studio project:
<p>Nutzenbewertung gemäß § 35a SGB V</p>
<p>Another paragraph containing Nutzenbewertung.</p>
Add predictions using the XPath:
//*[contains(text(), 'Nutzenbewertung,')]
Observe that the prediction is added to the task data but not displayed in the UI.
Expected behavior
“The prediction should be visually displayed in the UI, as the XPath matches the correct text nodes.”
Environment (please complete the following information):
Additional context
The XPath works in a Flask backend and in a browser (whether it shows the raw HTML or the HTML rendered within Label Studio), but the prediction is not displayed in Label Studio.
Predictions that use exact XPaths derived from annotations made in the Label Studio interface, such as /p[34]/text()[1], work when hard-coded in the Flask backend, but dynamic XPaths fail.
Task source with prediction and annotation (the annotation was made after the prediction, so I am sure the prediction was not visible even before I annotated):
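One possible backend-side workaround is to resolve the match to a positional path before returning the prediction. The following is only a minimal sketch using Python's standard `html.parser`; the names `TextNodeIndexer` and `resolve_static_spans` are hypothetical, and the `<p>` markup in the usage lines is an assumption based on the XPaths quoted in this report. It numbers elements and text nodes as they appear in the source HTML, so it can only agree with Label Studio where the browser does not restructure the markup (for example, by inserting `tbody` into tables).

```python
from html.parser import HTMLParser

# Elements that never have children, so they are counted but not pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TextNodeIndexer(HTMLParser):
    """Record a positional XPath for every text node in an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.stack = [self._frame("", "")]  # synthetic root frame
        self.nodes = []                      # [xpath, text] pairs in document order

    @staticmethod
    def _frame(tag, path):
        return {"tag": tag, "path": path, "tags": {}, "texts": 0, "open_text": False}

    def handle_starttag(self, tag, attrs):
        parent = self.stack[-1]
        parent["open_text"] = False
        parent["tags"][tag] = parent["tags"].get(tag, 0) + 1
        if tag not in VOID_TAGS:
            path = f"{parent['path']}/{tag}[{parent['tags'][tag]}]"
            self.stack.append(self._frame(tag, path))

    def handle_endtag(self, tag):
        # Close the nearest open element with this name; ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i]["tag"] == tag:
                del self.stack[i:]
                break
        self.stack[-1]["open_text"] = False

    def handle_comment(self, data):
        self.stack[-1]["open_text"] = False  # a comment splits adjacent text nodes

    def handle_data(self, data):
        frame = self.stack[-1]
        if frame["open_text"]:
            self.nodes[-1][1] += data        # continuation of the same text node
            return
        frame["texts"] += 1
        frame["open_text"] = True
        self.nodes.append([f"{frame['path']}/text()[{frame['texts']}]", data])


def resolve_static_spans(html, needle):
    """Return one whole-text-node span per text node that contains `needle`."""
    indexer = TextNodeIndexer()
    indexer.feed(html)
    indexer.close()
    return [
        {"start": xpath, "end": xpath, "startOffset": 0,
         "endOffset": len(text), "text": text}
        for xpath, text in indexer.nodes
        if needle in text
    ]


# Usage with assumed sample markup:
sample = ("<p>Nutzenbewertung gemäß § 35a SGB V</p>\n"
          "<p>Another paragraph containing Nutzenbewertung.</p>")
for span in resolve_static_spans(sample, "Nutzenbewertung"):
    print(span["start"], span["startOffset"], span["endOffset"])
```

Each returned span covers the full text node, mirroring the `startOffset`/`endOffset` pattern of the prediction in this report; narrowing it to the matched substring would only require `text.find(needle)`.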
{
"id": 5084,
"data": {
"html": "<p>Nutzenbewertung gemäß § 35a SGB V</p>\n<p>Another paragraph containing Nutzenbewertung.</p>"
},
"annotations": [
{
"id": 1306,
"result": [
{
"id": "735b0e8f-2c80-4168-b358-6662097a7739",
"type": "labels",
"value": {
"end": ".//p[contains(., 'Nutzenbewertung')]",
"text": "Nutzenbewertung gemäß § 35a SGB V",
"start": ".//p[contains(., 'Nutzenbewertung')]",
"labels": [
"Ursprungsprojekt"
],
"endOffset": 33,
"startOffset": 0
},
"origin": "prediction",
"to_name": "text",
"from_name": "Was war der ursprüngliche Auftrag?"
},
{
"id": "Wtcf3gI9g7",
"type": "labels",
"value": {
"end": "/p[1]/text()[1]",
"text": "Nutzenbewertung gemäß § 35a SGB V",
"start": "/p[1]/text()[1]",
"labels": [
"Ursprungsprojekt"
],
"endOffset": 33,
"startOffset": 0,
"globalOffsets": {
"end": 33,
"start": 0
}
},
"origin": "manual",
"to_name": "text",
"from_name": "Was war der ursprüngliche Auftrag?"
}
],
"created_username": " XXXXX, 1",
"created_ago": "0 minutes",
"completed_by": {
"id": 1,
"first_name": "",
"last_name": "",
"avatar": null,
"email": "[email protected]",
"initials": "ch"
},
"was_cancelled": false,
"ground_truth": false,
"created_at": "2025-01-17T15:32:22.888219Z",
"updated_at": "2025-01-17T15:32:22.888240Z",
"draft_created_at": "2025-01-17T15:32:09.385624Z",
"lead_time": 27.26,
"import_id": null,
"last_action": null,
"task": 5084,
"project": 1,
"updated_by": 1,
"parent_prediction": 175,
"parent_annotation": null,
"last_created_by": null
}
],
"predictions": [
{
"id": 175,
"result": [
{
"id": "735b0e8f-2c80-4168-b358-6662097a7739",
"type": "labels",
"value": {
"end": ".//p[contains(., 'Nutzenbewertung')]",
"text": "Nutzenbewertung gemäß § 35a SGB V",
"start": ".//p[contains(., 'Nutzenbewertung')]",
"labels": [
"Ursprungsprojekt"
],
"endOffset": 33,
"startOffset": 0
},
"origin": "prediction",
"to_name": "text",
"from_name": "Was war der ursprüngliche Auftrag?"
}
],
"model_version": "v1.0",
"created_ago": "10 minutes",
"score": 1,
"cluster": null,
"neighbors": null,
"mislabeling": 0,
"created_at": "2025-01-17T15:22:24.182589Z",
"updated_at": "2025-01-17T15:22:24.182602Z",
"model": null,
"model_run": null,
"task": 5084,
"project": 1
}
]
}
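In the task source above, the prediction and the manual annotation differ only in their `start` and `end` values. A backend could flag such results before sending them. The sketch below is illustrative only: `STATIC_XPATH` and `non_static_results` are hypothetical names, and the regular expression encodes an assumption (not documented behaviour) that a displayable path is a chain of `tag[index]` steps ending in `text()[index]`, like the paths Label Studio writes for manual annotations.

```python
import re

# Assumption: "static" means purely positional steps ending in a text node,
# e.g. /p[34]/text()[1]. Anything else (predicates, //, functions) is flagged.
STATIC_XPATH = re.compile(r"^(?:/[A-Za-z][\w-]*\[\d+\])*/text\(\)\[\d+\]$")


def non_static_results(task):
    """Return (prediction id, result id, field, xpath) for non-positional XPaths."""
    flagged = []
    for prediction in task.get("predictions", []):
        for result in prediction.get("result", []):
            value = result.get("value", {})
            for field in ("start", "end"):
                xpath = value.get(field, "")
                if not STATIC_XPATH.match(xpath):
                    flagged.append((prediction["id"], result["id"], field, xpath))
    return flagged
```

Run against the task above, this would report both the `start` and `end` fields of prediction 175, while a prediction that uses `/p[1]/text()[1]` would pass.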
Prediction that passes the static XPath taken from Label Studio, also sent via the backend (this one shows up in the GUI and is highlighted):
{
"id": 5083,
"data": {
"html": "<p>Nutzenbewertung gemäß § 35a SGB V</p>\n<p>Another paragraph containing Nutzenbewertung.</p>"
},
"annotations": [],
"predictions": [
{
"id": 174,
"result": [
{
"id": "123KBhsJdU-5w",
"type": "labels",
"value": {
"end": "/p[1]/text()[1]",
"text": "Nutzenbewertung gemäß § 35a SGB V",
"start": "/p[1]/text()[1]",
"labels": [
"Ursprungsprojekt"
],
"endOffset": 33,
"startOffset": 0
},
"origin": "prediction",
"to_name": "text",
"from_name": "Was war der ursprüngliche Auftrag?"
}
],
"model_version": "v1.0",
"created_ago": "14 minutes",
"score": 1,
"cluster": null,
"neighbors": null,
"mislabeling": 0,
"created_at": "2025-01-17T15:21:34.936985Z",
"updated_at": "2025-01-17T15:21:34.936995Z",
"model": null,
"model_run": null,
"task": 5083,
"project": 1
}
]
}
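For reference, the working prediction above could be assembled in the backend roughly as follows. This is a sketch, not the Label Studio SDK: `build_prediction` is a hypothetical helper, and the field names and the `"labels"` type are copied from the task JSON in this report rather than from official documentation.

```python
import uuid


def build_prediction(xpath, text, label, from_name, to_name,
                     model_version="v1.0", score=1.0):
    """Build one prediction whose region spans a whole text node at `xpath`."""
    return {
        "model_version": model_version,
        "score": score,
        "result": [
            {
                "id": uuid.uuid4().hex[:10],  # any unique region id
                "type": "labels",
                "from_name": from_name,
                "to_name": to_name,
                "value": {
                    "start": xpath,            # positional path, not a query
                    "end": xpath,
                    "startOffset": 0,
                    "endOffset": len(text),
                    "text": text,
                    "labels": [label],
                },
            }
        ],
    }


prediction = build_prediction(
    xpath="/p[1]/text()[1]",
    text="Nutzenbewertung gemäß § 35a SGB V",
    label="Ursprungsprojekt",
    from_name="Was war der ursprüngliche Auftrag?",
    to_name="text",
)
```

The offsets are character counts within the text node, so `endOffset` equals the length of the labelled text when the whole node is selected.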