Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add nullable annotations for XmlDataDocument #41541

Merged
merged 6 commits into from
Sep 1, 2020

Conversation

terrajobst
Copy link
Contributor

No description provided.

@ghost
Copy link

ghost commented Aug 28, 2020

Tagging subscribers to this area: @roji, @ajcvickers
See info in area-owners.md if you want to be subscribed.

if (rowElem == null)
{
return null;
return null!;
Copy link

@pgovind pgovind Aug 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might cause a NRE if someone calls Row.Something. We should make the property return a DataRow? instead I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this ! needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the smart person that I am I only changed the return type of the property 😟

Fixed

@@ -21,13 +18,13 @@ namespace System.Xml
[Obsolete("XmlDataDocument class will be removed in a future release.")]
public class XmlDataDocument : XmlDocument
{
private DataSet _dataSet;
private DataSet _dataSet = null!;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use [MemberNotNull] on the Init and AttachDataSet methods instead? You can get rid of this null! if you do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same for _pointers, _columnChangeList, _foliationLock and _attrXml.

Copy link
Contributor Author

@terrajobst terrajobst Aug 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that. The problem is that there is this constructor as well:

        internal XmlDataDocument(XmlImplementation imp) : base(imp)
        {
        }

For now, I've null! initialized the values there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may not be entirely consistent, but anywhere we've used = null!; on fields we've tried to add a comment next to it indicating why.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that there is this constructor as well

What ensures _dataSet is set to non-null when that ctor is used? Or the field is never used when that ctor is used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What ensures _dataSet is set to non-null when that ctor is used? Or the field is never used when that ctor is used?

I believe the field is never used in that instance. It's instantiated by XmlDataImplementation which exposes it as XmlDocument.


private DataSetMapper _mapper;
internal Hashtable _pointers; // Hastable w/ all pointer objects used by this XmlDataDocument. Hashtable are guaranteed to work OK w/ one writer and mutiple readers, so as long as we guarantee
private DataSetMapper _mapper = null!;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here - use [MemberNotNull] on the Init method

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -50,7 +47,7 @@ public override string Value
get
{
XPathNodeType xnt = _curNode.NodeType;
return xnt == XPathNodeType.Element || xnt == XPathNodeType.Root ? _curNode.InnerText : _curNode.Value;
return xnt == XPathNodeType.Element || xnt == XPathNodeType.Root ? _curNode.InnerText : _curNode.Value!;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's protecting _curNode.Value from being null? (here and below)

Copy link
Contributor Author

@terrajobst terrajobst Aug 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quite frankly I don't know. But I can't make the property nullable because the base definition isn't nullable. But changing the code to return an empty string when _curNode.Value is null seems worse than potentially lying.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I can't make the property nullable because the base definition isn't nullable

Does that mean the base is defined incorrectly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking through the code it seems like other overrides return an empty string instead, so it seems the intent was non-nullable.

@@ -49,7 +46,7 @@ internal bool NextRightRowElement()
// Returns true if the current node is on a row element (head of a region)
Copy link
Member

@carlossanlop carlossanlop Aug 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line 16 of this file: CurrentNode should probably be nullable. This is an internal class, and is the base class for other two classes, which are in this same PR.
At some point, the underlying private field for the currentNode is set to null. For example, line 61 of RegionIterator.cs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line 91 in RegionIterator.cs is doing the same, setting the value to null.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TreeIterator.cs seems to be doing the same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Since this follows the typical iterator pattern, I have annotated the NextXxx methods with [MemberNotNullIf(true, CurrentNode)]. Sadly this doesn't do much for us because the compiler still can't infer that iter.CurrentNode isn't null. for cases like this:

for (bool fMore = iter.NextRowElement(); fMore; fMore = iter.NextRowElement())
{
    rowElem = (XmlBoundElement)(iter.CurrentNode!);
    EnsureDisconnectedDataRow(rowElem);
}

@@ -67,9 +64,10 @@ private XPathNodeType DecideXPNodeTypeForTextNodes(XmlNode node)
//the function can only be called on text like nodes.
Debug.Assert(XmlDataDocument.IsTextNode(node.NodeType));
XPathNodeType xnt = XPathNodeType.Whitespace;
while (node != null)
XmlNode? n = null;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change node to n here? I think we should revert this

Copy link
Contributor Author

@terrajobst terrajobst Aug 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because node is a parameter, which is non-nullable. The code assigns to the parameter, which then can include null values as the loop goes. And the assert would null ref when one passes null to the method. So in order to have a nullable variable for the loop but keep the parameter as non-nullable I had to introduce a local 😢

@@ -339,7 +337,7 @@ internal string Value
}
else if (_column.ColumnMapping == MappingType.Attribute || _fOnValue)
{
DataRow row = Row;
DataRow row = Row!;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a possible NRE. Row currently could return a null! and the next line will fail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I understand the code is that when _column is non-null, Row will have a value (see the definition of Row). Sadly I don't know how to express this invariant. Marking Row as non-null seems worse though.

@@ -376,7 +374,7 @@ internal string InnerText
}
else
{
DataRow row = Row;
DataRow row = Row!;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another possible NRE from using the Row property

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (_column != null)
{
rowElem = _node as XmlBoundElement;
rowElem = (XmlBoundElement)_node;
Copy link
Member

@carlossanlop carlossanlop Aug 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, technically the change is not really wrong (aside from the fact that now it could throw if the cast fails), but it's not a change related to nullability.
Up to you if you want to revert it.

Copy link
Member

@jozkee jozkee Aug 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still represents a change of behavior from returning null to throwing InvalidCastException plus the method is already expected to return null.

If we lack code coverage, we would not catch if this is an invasive change regardless of the assertions.

Copy link
Contributor Author

@terrajobst terrajobst Aug 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous code was this:

rowElem = _node as XmlBoundElement;
Debug.Assert(rowElem != null);
  • If the cast doesn't succeed, the next line asserts
  • If the _node is null, the cast will succeed but the next line asserts
  • In release builds, a null reference would be returned

My reasoning was: a hard cast seems more desirable because at least that makes the code fail consistently, rather than asserting in debug builds an null referencing in release builds.

But if you guys feel we shouldn't do this, I can revert.

Debug.Assert(CurrentNode != null);
return true;
}
return false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect a simpler change would still satisfy the compiler:

return CurrentNode is XmlBoundElement be && be.Row != null;

Does it not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does. For some reason, this doesn't work:

XmlBoundElement? be = CurrentNode as XmlBoundElement;
return ((be != null) && (be.Row != null));

Fixed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason, this doesn't work

The compiler doesn't track the relationship between variables for nullability.

[MemberNotNull(nameof(_columnChangeList))]
[MemberNotNull(nameof(_mapper))]
[MemberNotNull(nameof(_foliationLock))]
[MemberNotNull(nameof(_attrXml))]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also use the ctor that takes a params array (though you'll likely also need to suppress a CLS compliance warning).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Random side question: why do we get CLS compliance warnings on attributes of non-public members?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK CLS compliance is about consistent attribution and doesn't take visibility into consideration. If you say "the entire assembly/type is CLS compliant", then the compiler will force you to mark the ones that aren't, no matter their visibility. Presumably, that was done to make the rules simpler.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also use the ctor that takes a params array (though you'll likely also need to suppress a CLS compliance warning).

I noticed the CLS compliance warning and looked around what other parts did. The other locations in S.D.C seemed use multiple applications, so that's what I did. Do you want me to change them all?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK CLS compliance is about consistent attribution and doesn't take visibility into consideration.

That isn't true for method return types:

image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want me to change them all?

You don't need to. I was just calling out the alternative.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK CLS compliance is about consistent attribution and doesn't take visibility into consideration.

That isn't true for method return types:

OK, now I'm confused :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Me too. ;)

@terrajobst terrajobst requested a review from eerhardt August 31, 2020 17:36
@@ -2539,7 +2537,7 @@ public XmlDataDocument() : base(new XmlDataImplementation())
{
Init();
AttachDataSet(new DataSet());
_dataSet.EnforceConstraints = false;
_dataSet!.EnforceConstraints = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) if AttachDataSet was annotated with [MemberNotNull(nameof(_dataSet))], this ! wouldn't be needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call! Fixed.


// This constructor is used by XmlDataImplementation.CreateDocument(), which
// exposes it as XmlDocument. The methods using these fields are never called.
_dataSet = null!;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need both the = null! here and in the field initializers at the top of the class?

 private DataSet _dataSet = null!;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My apologies, I thought I reverted them already, which I didn't. Fixed.

}

// This constructor is used by XmlDataImplementation.CreateDocument(), which
// exposes it as XmlDocument. The methods using these fields are never called.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this comment is 100% accurate. From looking at the code, it looks like XmlDataImplementation.CreateDocument() can be called publicly:

using System.Xml;

var x = new XmlDataDocument();
x.AppendChild(x.CreateElement("h"));

var y = (XmlDataDocument)x.Implementation.CreateDocument();

Console.WriteLine(y.DataSet?.ToString() ?? "<null>");

The code above prints <null>.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is fair. But a document constructed from there will also null ref, for example:

y.GetRowFromElement(y.DocumentElement);

Without some surgery I don't believe an XmlDataDocument that was created via XmlDataImplementation.CreateDocument() can be used safely today 😢

@@ -67,9 +64,10 @@ private XPathNodeType DecideXPNodeTypeForTextNodes(XmlNode node)
//the function can only be called on text like nodes.
Debug.Assert(XmlDataDocument.IsTextNode(node.NodeType));
XPathNodeType xnt = XPathNodeType.Whitespace;
while (node != null)
XmlNode? n = null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this need to be XmlNode? n = node;? As written, I don't think you will ever get into the loop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Damn. Good catch! Fixed.

Copy link
Member

@eerhardt eerhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good.

Copy link

@pgovind pgovind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now.

Copy link
Member

@jozkee jozkee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than this comment, LGTM.

In order to match with src
@terrajobst terrajobst merged commit 381c164 into dotnet:master Sep 1, 2020
@terrajobst terrajobst deleted the nullable-xmldatadocument branch September 1, 2020 01:09
carlossanlop pushed a commit to carlossanlop/runtime that referenced this pull request Sep 1, 2020
* Add nullable annotations for XmlDataDocument

* Address code review feedback

* More code review feedback

* Address Eric's feedback

* Annotate _dataSet as not null when AttachDataSet is called

* Fix GetElementById return type annotation

In order to match with src

Co-authored-by: Jeff Handley <[email protected]>
Co-authored-by: David Cantú <[email protected]>
@roji
Copy link
Member

roji commented Sep 4, 2020

@terrajobst sorry for not reviewing, personal issues... Not sure I'd have a lot to contribute on the XML stuff anyway :)

@ghost ghost locked as resolved and limited conversation to collaborators Dec 7, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants