Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip NorthwindSplitIncludeQueryMySqlTest tests for MariaDB and MySQ < 8.0 #1553

Merged

Conversation

lauxjpn
Copy link
Collaborator

@lauxjpn lauxjpn commented Nov 9, 2021

The way some of the test queries have been implemented upstream relies on a specific way of implicit ordering of the OrderDetails entities that can be non-deterministic on MariaDB and MySQL < 8.0.

This is not an issue with MySQL or MariaDB, but with the way the queries have been implemented by the EF Core team.

We therefore skip them for now in these cases (because indirectly manipulating the order in theses cases is a pain), so our CI tests won't fail (false positive) in a non-deterministic fashion.

@lauxjpn lauxjpn added this to the 6.0.0-rc.2 milestone Nov 9, 2021
@lauxjpn lauxjpn self-assigned this Nov 9, 2021
@lauxjpn lauxjpn merged commit 8faabf1 into PomeloFoundation:master Nov 9, 2021
@lauxjpn lauxjpn deleted the test/skip_non_deterministic branch November 9, 2021 15:17
@mguinness
Copy link
Collaborator

@ajcvickers Is this something that should be taken into account upstream?

@ajcvickers
Copy link

@smitpatel @roji Test ordering issue?

@roji
Copy link

roji commented Nov 10, 2021

@lauxjpn can you provide more details so we can see if this is something we need to look into?

@lauxjpn
Copy link
Collaborator Author

lauxjpn commented Nov 11, 2021

@lauxjpn can you provide more details so we can see if this is something we need to look into?

@roji The issues can be found in the NorthwindSplitIncludeQueryMySqlTest.cs file.

Its test cases will run fine on MySQL 8 without having to override anything from the base class, because MySQL 8 implicitly sorts/returns the OrderDetails entities as SQL Server (and Postgres) does.

But MariaDB does not.


Let's look at the Include_collection_SelectMany_GroupBy_Select test case for example.

LINQ query
(from o in ss.Set<Order>().Include(o => o.OrderDetails).Where(o => o.OrderID == 10248)
    from od in ss.Set<OrderDetail>()
    select o)
.GroupBy(e => e.OrderID)
.Select(e => e.OrderBy(o => o.OrderID).FirstOrDefault()
Generated SQL (SQL Server)
SELECT [t0].[OrderID], [t0].[CustomerID], [t0].[EmployeeID], [t0].[OrderDate], [t].[OrderID], [t0].[OrderID0], [t0].[ProductID]
FROM (
    SELECT [o].[OrderID]
    FROM [Orders] AS [o]
    CROSS JOIN [Order Details] AS [o0]
    WHERE [o].[OrderID] = 10248
    GROUP BY [o].[OrderID]
) AS [t]
LEFT JOIN (
    SELECT [t1].[OrderID], [t1].[CustomerID], [t1].[EmployeeID], [t1].[OrderDate], [t1].[OrderID0], [t1].[ProductID]
    FROM (
        SELECT [o1].[OrderID], [o1].[CustomerID], [o1].[EmployeeID], [o1].[OrderDate], [o2].[OrderID] AS [OrderID0], [o2].[ProductID], ROW_NUMBER() OVER(PARTITION BY [o1].[OrderID] ORDER BY [o1].[OrderID]) AS [row]
        FROM [Orders] AS [o1]
        CROSS JOIN [Order Details] AS [o2]
        WHERE [o1].[OrderID] = 10248
    ) AS [t1]
    WHERE [t1].[row] <= 1
) AS [t0] ON [t].[OrderID] = [t0].[OrderID]
ORDER BY [t].[OrderID], [t0].[OrderID], [t0].[OrderID0], [t0].[ProductID];

SELECT [o3].[OrderID], [o3].[ProductID], [o3].[Discount], [o3].[Quantity], [o3].[UnitPrice], [t].[OrderID], [t0].[OrderID], [t0].[OrderID0], [t0].[ProductID]
FROM (
    SELECT [o].[OrderID]
    FROM [Orders] AS [o]
    CROSS JOIN [Order Details] AS [o0]
    WHERE [o].[OrderID] = 10248
    GROUP BY [o].[OrderID]
) AS [t]
LEFT JOIN (
    SELECT [t1].[OrderID], [t1].[OrderID0], [t1].[ProductID]
    FROM (
        SELECT [o1].[OrderID], [o2].[OrderID] AS [OrderID0], [o2].[ProductID], ROW_NUMBER() OVER(PARTITION BY [o1].[OrderID] ORDER BY [o1].[OrderID]) AS [row]
        FROM [Orders] AS [o1]
        CROSS JOIN [Order Details] AS [o2]
        WHERE [o1].[OrderID] = 10248
    ) AS [t1]
    WHERE [t1].[row] <= 1
) AS [t0] ON [t].[OrderID] = [t0].[OrderID]
INNER JOIN [Order Details] AS [o3] ON [t0].[OrderID] = [o3].[OrderID]
ORDER BY [t].[OrderID], [t0].[OrderID], [t0].[OrderID0], [t0].[ProductID];
Generated SQL (MySQL/MariaDB)
SELECT `t0`.`OrderID`, `t0`.`CustomerID`, `t0`.`EmployeeID`, `t0`.`OrderDate`, `t`.`OrderID`, `t0`.`OrderID0`, `t0`.`ProductID`
FROM (
    SELECT `o`.`OrderID`
    FROM `Orders` AS `o`
    CROSS JOIN `Order Details` AS `o0`
    WHERE `o`.`OrderID` = 10248
    GROUP BY `o`.`OrderID`
) AS `t`
LEFT JOIN (
    SELECT `t1`.`OrderID`, `t1`.`CustomerID`, `t1`.`EmployeeID`, `t1`.`OrderDate`, `t1`.`OrderID0`, `t1`.`ProductID`
    FROM (
        SELECT `o1`.`OrderID`, `o1`.`CustomerID`, `o1`.`EmployeeID`, `o1`.`OrderDate`, `o2`.`OrderID` AS `OrderID0`, `o2`.`ProductID`, ROW_NUMBER() OVER(PARTITION BY `o1`.`OrderID` ORDER BY `o1`.`OrderID`) AS `row`
        FROM `Orders` AS `o1`
        CROSS JOIN `Order Details` AS `o2`
        WHERE `o1`.`OrderID` = 10248
    ) AS `t1`
    WHERE `t1`.`row` <= 1
) AS `t0` ON `t`.`OrderID` = `t0`.`OrderID`
ORDER BY `t`.`OrderID`, `t0`.`OrderID`, `t0`.`OrderID0`, `t0`.`ProductID`;

SELECT `o3`.`OrderID`, `o3`.`ProductID`, `o3`.`Discount`, `o3`.`Quantity`, `o3`.`UnitPrice`, `t`.`OrderID`, `t0`.`OrderID`, `t0`.`OrderID0`, `t0`.`ProductID`
FROM (
    SELECT `o`.`OrderID`
    FROM `Orders` AS `o`
    CROSS JOIN `Order Details` AS `o0`
    WHERE `o`.`OrderID` = 10248
    GROUP BY `o`.`OrderID`
) AS `t`
LEFT JOIN (
    SELECT `t1`.`OrderID`, `t1`.`OrderID0`, `t1`.`ProductID`
    FROM (
        SELECT `o1`.`OrderID`, `o2`.`OrderID` AS `OrderID0`, `o2`.`ProductID`, ROW_NUMBER() OVER(PARTITION BY `o1`.`OrderID` ORDER BY `o1`.`OrderID`) AS `row`
        FROM `Orders` AS `o1`
        CROSS JOIN `Order Details` AS `o2`
        WHERE `o1`.`OrderID` = 10248
    ) AS `t1`
    WHERE `t1`.`row` <= 1
) AS `t0` ON `t`.`OrderID` = `t0`.`OrderID`
INNER JOIN `Order Details` AS `o3` ON `t0`.`OrderID` = `o3`.`OrderID`
ORDER BY `t`.`OrderID`, `t0`.`OrderID`, `t0`.`OrderID0`, `t0`.`ProductID`;

The generated SQL queries for MSSQL and MySQL (and MariaDB) are the same.

Let's only look at the first one of the two generated SQL queries, and its most inner subquery:

SELECT `o1`.`OrderID`, `o1`.`CustomerID`, `o1`.`EmployeeID`, `o1`.`OrderDate`, `o2`.`OrderID` AS `OrderID0`, `o2`.`ProductID`, ROW_NUMBER() OVER(PARTITION BY `o1`.`OrderID` ORDER BY `o1`.`OrderID`) AS `row`
FROM `Orders` AS `o1`
CROSS JOIN `Order Details` AS `o2`
WHERE `o1`.`OrderID` = 10248

The LINQ query uses .Select(e => e.OrderBy(o => o.OrderID).FirstOrDefault()).

This is being translated into a ROW_NUMBER() OVER(PARTITION BY `o1`.`OrderID` ORDER BY `o1`.`OrderID`) AS `row` clause, so that the FirstOrDefault() part of the LINQ query can later use the row column to figure out what the first record (with the WHERE `t1`.`row` <= 1 clause).

The issue is, that ORDER BY `o1`.`OrderID` is not ordered by enough columns to be deterministic. Let's take a look at the first row returned for this inner most subquery for MySQL and MariaDB respectively:

MySQL

OrderID CustomerID EmployeeID OrderDate OrderID0 ProductID row
10248 VINET 5 1996-07-04 00:00:00 10248 11 1

MariaDB

OrderID CustomerID EmployeeID OrderDate OrderID0 ProductID row
10248 VINET 5 1996-07-04 00:00:00 10799 24 1

MariaDB is highly optimized and if you don't order something explicitly, you might not just get an unordered result, but it could even be in a non-deterministic order (over multiple runs).

To make the query deterministic, it actually needs to be ROW_NUMBER() OVER(PARTITION BY `o1`.`OrderID` ORDER BY `o1`.`OrderID`, `o2`.`OrderID`, `o2`.`ProductID`) AS `row`.

And ordering the Orders.OrderDetails collection in the LINQ query is tricky.

@roji
Copy link

roji commented Nov 11, 2021

@lauxjpn thanks for the details... FWIW from the PostgreSQL I frequently hit this exact kind of non-deterministic ordering issue - PostgreSQL also orders things differently from SQL Server and Sqlite, so explicit ordering is frequently necessary. We generally fix these test issues as they're discovered.

Can I ask you to open an issue in the EF Core repo with the above information (and also for any other similar case you come across)? We'll fix this so that in 7.0 you'll be able to unskip.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants