[JENKINS-69129] Support escaped emoji characters in XML files #7778
Context
See JENKINS-69129.
Problem
A Job DSL script like
will create a `config.xml` file with
unlike the GUI, which will create a `config.xml` file with
While XStream can read in the second file just fine, it cannot cope with the first and actually reads the wrong character, as reported by the user.
Evaluation
While one workaround would be to get Job DSL to not escape the character, I could not figure out a way to do this. Job DSL is using some Groovy functionality to output the XML, and I could not find a way to configure it to not escape the output.
However, Job DSL's output is perfectly valid XML. Core is at fault here for not reading this valid XML correctly. Why is core reading the wrong character?
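To make the question concrete, here is a minimal sketch of what decoding an escaped emoji involves in Java. The reference `&#128640;` (U+1F680) is picked purely for illustration and is not taken from the reported job. The class and method names are hypothetical. `decodeTruncated` shows one plausible way a parser could end up with the wrong character, by narrowing the code point to a single 16-bit `char`; this is an assumption about the failure mode, not a confirmed diagnosis.

```java
final class EscapedEmojiCodePoint {

    /** Extracts the decimal code point from a reference of the form "&#NNN;". */
    static int codePointOf(String reference) {
        return Integer.parseInt(reference.substring(2, reference.length() - 1));
    }

    /** Correct decoding: a supplementary code point becomes a surrogate pair. */
    static String decode(String reference) {
        return new String(Character.toChars(codePointOf(reference)));
    }

    /** Hypothetical faulty decoding: the cast keeps only the low 16 bits. */
    static String decodeTruncated(String reference) {
        return String.valueOf((char) codePointOf(reference));
    }

    public static void main(String[] args) {
        String reference = "&#128640;"; // illustrative value only (U+1F680)
        String good = decode(reference);
        String bad = decodeTruncated(reference);

        // One code point, but two UTF-16 code units.
        System.out.println(good.codePointCount(0, good.length())); // 1
        System.out.println(good.length());                         // 2
        // The truncated variant yields an unrelated character.
        System.out.printf("U+%04X%n", (int) bad.charAt(0));        // U+F680
    }
}
```

Any emoji outside the Basic Multilingual Plane has a code point above U+FFFF, so a reader that stores the decoded reference in a single `char` cannot represent it.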
I looked up how we are reading XML, and I was horrified to discover we are reading XML with `net.sf.kxml:kxml2`, a library last released 13 years ago. This was well before the emoji era, so I am not surprised it cannot read newer Unicode characters correctly.
To fix this, I looked into replacing this library with the regular StAX library that is bundled with the Java Platform. XStream already supports this. Indeed, when I switched to StAX, the problem went away. I therefore conclude there must be some bug in `kxml2`.
Solution
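The StAX behavior noted in the evaluation can be checked in isolation with the JDK's `javax.xml.stream` API. The sketch below is not the code path that XStream or Jenkins uses. The class name, element name, and sample emoji are hypothetical and chosen only for illustration. It reads the same text once as an escaped character reference and once as a literal character, then compares the results.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class StaxEscapedEmojiCheck {

    /** Concatenates all character data in the document using a StAX reader. */
    static String readText(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            StringBuilder text = new StringBuilder();
            while (reader.hasNext()) {
                // Text may arrive in several CHARACTERS events, so append each one.
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    text.append(reader.getText());
                }
            }
            reader.close();
            return text.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Escaped form (numeric character reference) versus literal form.
        String escaped = readText("<description>&#128640;</description>");
        String literal = readText("<description>\uD83D\uDE80</description>");

        System.out.println(escaped.equals(literal));
        System.out.printf("U+%X%n", escaped.codePointAt(0));
    }
}
```

If both forms decode to the same string, the reader is treating the escaped reference as the single supplementary code point it denotes, which is the behavior this PR relies on.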
I looked into the current state of `kxml2`. While Stefan Haustein does appear to be still maintaining `kxml2` at https://github.com/kobjects/kxml2 under a new Maven artifact (`com.github.kobjects:kxml2`), I see no evidence that anyone has tested this new `com.github.kobjects:kxml2` with XStream. So trying to upgrade to this newer version of `kxml2` seems risky, and it is not guaranteed to fix the bug either.
In contrast, the Java Platform can be relied on to provide a decent version of StAX, and that does fix the bug. So I think it is preferable to switch to StAX for two reasons:
Implementation
XStream's `KXml2Driver` (which we were using before) used `kxml2` for reading XML and `PrettyPrintWriter` for writing XML, giving us the pretty-printed XML we are all familiar with in `$JENKINS_HOME`. XStream's `StandardStaxDriver` uses StAX for writing by default, which creates a number of backwards compatibility problems because it writes out XML 1.0 (not 1.1, as we currently do) without UTF-8 encoding in the prolog. Rather than create a compatibility nightmare, I am preserving the old behavior of using `PrettyPrintWriter` for writing XML. So the only change to the status quo is that we are now using StAX to read XML.
I am not yet removing the `kxml2` JAR file from the Jenkins WAR in this PR because I have not done the research to determine whether doing so would affect any plugins. If, once this change has been tested in several weekly releases and proven to be stable, there is a desire to remove this no-longer-needed JAR file, the removal can be explored at that time.
Bonus
As a bonus, I was able to fix an `@Ignore`d test that was being skipped due to deficiencies in `kxml2`. Those deficiencies are not present in StAX, so the test can now be executed successfully (modulo changing the test to use the correct exception type).
Testing done
I added two new tests: `testEmoji` (which passed before this PR) and `testEmojiEscaped` (which failed before this PR). These two tests both now pass with this PR. I also did an end-to-end test with Job DSL as shown in the problem statement and verified that the issue no longer occurred. I also saved a bunch of pages in the Jenkins UI and checked the XML that was written to make sure it looked the same as before and had no changes to the XML prolog or pretty-printing.
I also ran `mvn clean verify -Psmoke-test` and `mvn clean verify -Dtest=hudson.bugs.DateConversionTest,hudson.cli.UpdateViewCommandTest,hudson.model.CauseTest,hudson.model.ComputerConfigDotXmlTest,hudson.model.ParametersAction2Test,hudson.model.QueueTest,hudson.model.ViewTest,hudson.PluginManagerTest,hudson.util.RobustReflectionConverterTest,hudson.util.XStream2AnnotationTest,hudson.util.XStream2Security383Test,jenkins.install.InstallStateTest,jenkins.security.ClassFilterImplTest,jenkins.security.Security637Test,jenkins.widgets.BuildTimeTrendTest`.
I also successfully tested the incremental build from this PR in `jenkinsci/bom` in jenkinsci/bom#1915.
Proposed changelog entries
Support emoji in Job DSL scripts.
Proposed upgrade guidelines
N/A