Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JENKINS-69129] Support escaped emoji characters in XML files #7778

Merged
merged 4 commits into from
Apr 3, 2023

Conversation

basil
Copy link
Member

@basil basil commented Mar 29, 2023

Context

See JENKINS-69129.

Problem

A Job DSL script like

folder('test') {
    displayName('Fox 🦊')
}

will create a config.xml file with

<displayName>Fox &#129418;</displayName>

unlike the GUI, which will create a config.xml file with

<displayName>Fox 🦊</displayName>

While XStream can read in the second file just fine, it cannot cope with the first and actually reads the wrong character, as reported by the user.

Evaluation

While one workaround would be to get Job DSL to not escape the character, I could not figure out a way to do this. Job DSL is using some Groovy functionality to output the XML, and I could not find a way to configure it to not escape the output.

However, Job DSL's output is perfectly valid XML. Core is at fault here for not reading this valid XML correctly. Why is core reading the wrong character?

I looked up how we are reading XML, and I was horrified to discover we are reading XML with net.sf.kxml:[email protected], a library last released 13 years ago. This was well before the emoji era, so I am not surprised it cannot read newer Unicode characters correctly.

To fix this, I looked into replacing this library with the regular StAX library that is bundled with the Java Platform. XStream already supports this. Indeed, when I switched to StAX, the problem went away. So I therefore conclude there must be some bug in kxml2.

Solution

I looked into the current state of kxml2. While Stefan Haustein does appear to be still maintaining kxml2 at https://github.com/kobjects/kxml2 under a new Maven artifact (com.github.kobjects:kxml2), I see no evidence that anyone has tested this new com.github.kobjects:kxml2 with XStream. So trying to upgrade to this newer version of kxml2 seems risky, and it is not guaranteed to fix the bug either.

In contrast, the Java Platform can be relied on to provide a decent version of StAX, and that does fix the bug. So I think it is preferable to switch to StAX for two reasons:

  1. The fewer ancient third-party libraries we depend on, the easier it is to maintain Jenkins.
  2. StAX does not have the emoji bug.

Implementation

XStream's KXml2Driver (which we were using before) used kxml2 for reading XML and PrettyPrintWriter for writing XML, giving us the pretty-printed XML we are all familiar with in $JENKINS_HOME. XStream's StandardStaxDriver uses StAX for writing by default, which creates a number of backwards compatibility problems because it writes out XML 1.0 (not 1.1, as we currently do) without UTF-8 encoding in the prolog. Rather than create a compatibility nightmare, I am preserving the old behavior of using PrettyPrintWriter for writing XML. So the only change to the status quo is that we are now using StAX to read XML.

I am not yet removing the kxml2 JAR file from the Jenkins WAR in this PR because I have not done the research to determine whether doing so would affect any plugins. If, once this change has been tested in several weekly releases and proven to be stable, there is a desire to remove this no-longer-needed JAR file, the removal can be explored at that time.

Bonus

As a bonus, I was able to fix an @Ignored test that was being skipped due to deficiencies in kxml2. Those deficiencies are not present in StAX, so the test can now be executed successfully (modulo changing the test to use the correct exception type).

Testing done

I added two new tests: testEmoji (which passed before this PR) and testEmojiEscaped (which failed before this PR). These two tests both now pass with this PR. I also did an end-to-end test with Job DSL as shown in the problem statement and verified that the issue no longer occurred. I also saved a bunch of pages in the Jenkins UI and checked the XML that was written to make sure it looked the same as before and had no changes to the XML prolog or pretty-printing.

I also ran mvn clean verify -Psmoke-test and mvn clean verify -Dtest=hudson.bugs.DateConversionTest,hudson.cli.UpdateViewCommandTest,hudson.model.CauseTest,hudson.model.ComputerConfigDotXmlTest,hudson.model.ParametersAction2Test,hudson.model.QueueTest,hudson.model.ViewTest,hudson.PluginManagerTest,hudson.util.RobustReflectionConverterTest,hudson.util.XStream2AnnotationTest,hudson.util.XStream2Security383Test,jenkins.install.InstallStateTest,jenkins.security.ClassFilterImplTest,jenkins.security.Security637Test,jenkins.widgets.BuildTimeTrendTest.

I also successfully tested the incremental build from this PR in jenkinsci/bom in jenkinsci/bom#1915.

Proposed changelog entries

Support emoji in Job DSL scripts.

Proposed upgrade guidelines

N/A

Submitter checklist

  • The Jira issue, if it exists, is well-described.
  • The changelog entries and upgrade guidelines are appropriate for the audience affected by the change (users or developers, depending on the change) and are in the imperative mood (see examples).
    • Fill in the Proposed upgrade guidelines section only if there are breaking changes or changes that may require extra steps from users during upgrade.
  • There is automated testing or an explanation as to why this change has no tests.
  • New public classes, fields, and methods are annotated with @Restricted or have @since TODO Javadocs, as appropriate.
  • New deprecations are annotated with @Deprecated(since = "TODO") or @Deprecated(forRemoval = true, since = "TODO"), if applicable.
  • New or substantially changed JavaScript is not defined inline and does not call eval to ease future introduction of Content Security Policy (CSP) directives (see documentation).
  • For dependency updates, there are links to external changelogs and, if possible, full differentials.
  • For new APIs and extension points, there is a link to at least one consumer.

Desired reviewers

@mention

Maintainer checklist

Before the changes are marked as ready-for-merge:

  • There are at least two (2) approvals for the pull request and no outstanding requests for change.
  • Conversations in the pull request are over, or it is explicit that a reviewer is not blocking the change.
  • Changelog entries in the pull request title and/or Proposed changelog entries are accurate, human-readable, and in the imperative mood.
  • Proper changelog labels are set so that the changelog can be generated automatically.
  • If the change needs additional upgrade steps from users, the upgrade-guide-needed label is set and there is a Proposed upgrade guidelines section in the pull request title (see example).
  • If it would make sense to backport the change to LTS, a Jira issue must exist, be a Bug or Improvement, and be labeled as lts-candidate to be considered (see query).

@basil basil added the bug For changelog: Minor bug. Will be listed after features label Mar 29, 2023
@basil basil marked this pull request as draft March 29, 2023 05:27
@basil basil marked this pull request as ready for review March 29, 2023 06:04
basil added a commit to basil/bom that referenced this pull request Mar 29, 2023
Copy link
Member

@NotMyFault NotMyFault left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work

@timja
Copy link
Member

timja commented Apr 2, 2023

/label ready-for-merge


This PR is now ready for merge, after ~24 hours, we will merge it if there's no negative feedback.

Thanks!

@comment-ops-bot comment-ops-bot bot added the ready-for-merge The PR is ready to go, and it will be merged soon if there is no negative feedback label Apr 2, 2023
@NotMyFault NotMyFault merged commit ce3f6a8 into jenkinsci:master Apr 3, 2023
krisstern pushed a commit to krisstern/jenkins that referenced this pull request Apr 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug For changelog: Minor bug. Will be listed after features ready-for-merge The PR is ready to go, and it will be merged soon if there is no negative feedback
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants