-
Notifications
You must be signed in to change notification settings - Fork 326
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Snowflake Dialect #9486
Comments
https://github.com/enso-org/enso/tree/wip/jd/snowflake-dialect Note: Nothing == Nothing in Snowflake is ... True.
|
Radosław Waśko reports a new STANDUP for yesterday (2024-06-11): Progress: Recoinessance on Snowflake Dialect. Fixed compilation errors. Moved tests to separate project. Allow secrets in config, prepare confign and readme, Debugging why alraedy fixed Aprache Arrow errors is haunting me - apparently not fixed if you run through the launcher. It should be finished by 2024-06-18. Next Day: Next day I will be working on the same task. Fix Snowflake running thru launcher. See which tests are working, see how to split into 2 PRs and what basic stuff needs to work for 1st iteration. Check if no regressions in Postgres. |
Radosław Waśko reports a new STANDUP for today (2024-06-12): Progress: Fixed running Snowflake with launcher. Fixed most critical errors. Now the test suite completes (with many failures though). Prepared a first iteration PR - we are at not worse stage than before, but split and can run the tests nicely. It should be finished by 2024-06-18. Next Day: Next day I will be working on the same task. Work on aligning the tests and fixing mismatches in implementation. Try to extract common logic between backends where applicable to avoid duplication. |
- Part of #9486 - Building on top of initial work by @jdunkerley and finishing it - Reverted the changes to the Postgres_Dialect from last Snowflake work and split the Snowflake_Dialect into a separate module. - Moved from `rounding_decimal_places_not_allowed_for_floats` to `supports_float_round_decimal_places` (as too confusing). - Added Snowflake_Dialect type. - Extracted `Snowflake_Spec` into separate `Snowflake_Tests` - It imports the common tests from `Table_Tests`. - Some initial adaptations to make the snowflake dialect not-crash. - Adding `Internals_Access` proxy to allow external implementations to access our internal data structures without directly exposing them to users. Users should not use these. - Adding profiling of SQL to check performance.
Radosław Waśko reports a new STANDUP for yesterday (2024-06-13): Progress: Investigating what we can do to make Snowflake tests faster. Increasing some timeouts in project-manager tests to make Types PR pass. Merged both pending PRs. Initial work on StaticModuleScope. It should be finished by 2024-06-18. Next Day: Next day I will be working on the same task. Continue work on Types for a bit. Then continue work on Snowflake. |
Radosław Waśko reports a new STANDUP for yesterday (2024-06-17): Progress: Speeding up tests by sharing connection. Avoid allocating tables in some tests - instead we build a table by constructing a query containing literal values - thus we avoid a CREATE query which had quite a bit of overhead. Reduced number of tests run by default (additional tests enabled by env var) from 1449 to 984. Seems there is 30-50% time reduction - the new tests take ~33 minutes to run, it used to be 45-60 mins. It should be finished by 2024-06-18. Next Day: Next day I will be working on the same task. Continue work on the tests, start to actually fix stuff as there's still a lot of failures to handle... |
Radosław Waśko reports a new STANDUP for yesterday (2024-06-18): Progress: Fixing aggregate tests, add row number and many others. Adding ordering, fixing dialect etc. Fixed literal table problem in Snowflake - yesterday's gains are not as impressive anymore as there were some failures before. It should be finished by 2024-06-18. Next Day: Next day I will be working on the same task. Continue work on the tests and fixes. |
Radosław Waśko reports a new 🔴 DELAY for today (2024-06-19): Summary: There is 10 days delay in implementation of the Snowflake Dialect (#9486) task. 7 days is my time off + at least 3 more days to finish the integration. Delay Cause: Lots of small edge cases to handle, lots of tests need updating becaue of Subtle differences in Snowflake. Moreover iteration time is very slow because Snowflake is running remote and the query latency (network + server speed) is much larger than for local backends. |
Radosław Waśko reports a new STANDUP for today (2024-06-19): Progress: Continuing work on fixing tests. Fixing Column Operations. Fixing also what I broke in other backend's tests while fixing Snowflake. Created a PR with a snapshot of my changes. Remaining changes in another PR. It should be finished by 2024-06-28. Next Day: Next day I will be working on the same task. Continue work on the tests and fixes. |
- Part of #9486 - Fixing our tests to not rely on deterministic ordering of created Tables in Database backends - Before, SQLite and Postgres used to mostly return rows in the order they were inserted in, but Snowflake does not. - Fixing various parts of Snowflake dialect.
Radosław Waśko reports a new STANDUP for the provided date (2024-06-28): Progress: Added a few more tests in the Types task. Updated Snowlake tests after order_by renamed to sort. Improving message in Test framework for easier work. Fix a bug that if no network connection Snowflake tests used to succeed with 0 tests run instead of failing. Investigating IFF operator in Snowflake, adding tests for literal table variant. It should be finished by 2024-06-28. Next Day: Next day I will be working on the same task. Finish investigating the IFF problem. See what needs fixing next. Try gathering an overview. |
Radosław Waśko reports a new 🔴 DELAY for yesterday (2024-07-02): Summary: There is 10 days delay in implementation of the Snowflake Dialect (#9486) task. Delay Cause: Iteration time on Snowflake is extremely long as it runs ~50-100x slower than other DBs for small queries because of the network latency and other kinds of overhead. Lots of subtle differences from Postgres dialect, each needing some custom handling. This is a very large task. |
Radosław Waśko reports a new STANDUP for the provided date (2024-07-01): Progress: Fixed COUNT aggregations. Sped up tests by sharing tables and connection even more. Fix shortest/longest. Investigate first/last, created some followup tickets. It should be finished by 2024-07-08. Next Day: Next day I will be working on the same task. Continue. |
Radosław Waśko reports a new STANDUP for yesterday (2024-07-02): Progress: Implemented date/time support in Snowflake. Updated type mapping to reflect true state of things. Introduced notion of implicit conversions that do not trigger Inexact_Type_Coercion warning. It should be finished by 2024-07-08. Next Day: Next day I will be working on the same task. Work on date/time operations. |
Radosław Waśko reports a new STANDUP for yesterday (2024-07-03): Progress: Implemented most date/time operation variants for Snowflake. Created ticket for inconsistencie. WIP trying a fix for the type inference problem. It should be finished by 2024-07-08. Next Day: Next day I will be working on the same task. Continue. Small cloud ticket. Types? |
Radosław Waśko reports a new STANDUP for yesterday (2024-07-04): Progress: Investigating and fixing value type inference problems. Modified some problematic operations. Trying to batch round spec as it's extremely slow. It should be finished by 2024-07-08. Next Day: Next day I will be working on the same task. Finish round batching. Small cloud ticket. Types work. |
Radosław Waśko reports a new STANDUP for the provided date (2024-07-05): Progress: Prepared a PR for the Snowflake dialect to synchronize my changes. Some small fixes needed before it's mergable. Still further work needed on the dialect (at least one more PR expected). Working on Round batching - currently it did not improve performance because we are fetching types in N queries. Working on Types: introduced the No_Such_Method static Warning. Prototype for resolving the type hierarchy (Any). Investigating why pass data are randomly missing - very weird bug - if I print the pass data in error branch then they are present and there's no errors and if I don't print they are missing... It should be finished by 2024-07-08. Next Day: Next day I will be working on the same task. Fix issues for PR (tests for downloading Integer column in Snowflake). Batched types fetch (quite big performance win expected). |
Radosław Waśko reports a new 🔴 DELAY for today (2024-07-10): Summary: There is 14 days delay in implementation of the Snowflake Dialect (#9486) task. Delay Cause: Still more time needed as this has many parts. |
Radosław Waśko reports a new STANDUP for the provided date (2024-07-08): Progress: Fixed so that we don't do additional queries when fetching a Table. This fixed round batching to be actually 2-6x faster now. Hoped for overall improvement but total performance of test suite seemed to strangely be unaffected. Added tests checking that a Next Day: Next day I will be working on the same task. Investigate and fix the Image write regression. Implement the Integer materialization in Snowflake to be able to get the PR through. |
- Related to #9486 - Fixes types in literal tables that are used throughout the tests - Tries to makes testing faster by disabling some edge cases, trying batching some queries, re-using the main connection and trying to re-use tables more - Implements date/time type mapping and operations for Snowflake - Updates type mapping to correctly reflect what Snowflake does - Disables warnings for Integer->Decimal coercion as that's too annoying and implicitly understood in Snowflake - Allows to select a Decimal column with `..By_Type ..Integer` (only in Snowflake backend) because the Decimal column there is its 'de-facto' Integer column replacement.
Radosław Waśko reports a new STANDUP for the provided date (2024-07-08): Progress: Investigating Cloud file problems together with Paweł. Got the Snowflake 3 PR merged. Prepared 4th PR - Implemented the Integer materialization. It should be finished by 2024-07-22. Next Day: Next day I will be working on the same task. Continue fixing Snowflake tests. |
…er type, other type mapping tests (#10518) - Related to #9486 - Ensures that even though an integer column in Snowflake is represented by `Decimal` type, if the values are small enough, they are materialized as `Integer`. - If the values are larger, they are still read in as `Decimal`. - Adds tests for some other `Decimal` edge cases (various precisions and scales), and for `Float`.
Radosław Waśko reports a new STANDUP for yesterday (2024-07-11): Progress: Further progress on fixing Snowflake tests. Created pt. 5 PR - fixing division / modulo, expressions, filter, missing values tests, checking joins; removed unnecessary warnings when materializing to Decimal to Integer column (reverse variant of the implicit conversion). It should be finished by 2024-07-22. Next Day: Next day I will be working on the #9812 task. Work on Types. |
- Related to #9486 - Batching of expression tests - Fixing arithmetic by simplifying `%` and `/` operations - Trying to share some more tables, sometimes improving performance sometimes not really - Adding sorting and other fixes to tests to make them pass: Missing_Values_Spec, Filter_Spec, Map_Spec - Fixing warnings related to materialization of Decimal->Integer, thus fixing Join_Spec.
Radosław Waśko reports a new STANDUP for yesterday (2024-07-15): Progress: Summarising a big picture what is left to do in Snowflake: union, replace/merge, distinct, table upload (some issues with DDL transactions need a workaround), some small nits. Fixed some easier issues with table upload. It should be finished by 2024-07-24. Next Day: Next day I will be working on the same task. Continue Snowflake work - work on some small issues in column operations, then union. |
Radosław Waśko reports a new STANDUP for yesterday (2024-07-16): Progress: Column Ops down to 0 failures. Fixing Union tests. It should be finished by 2024-07-24. Next Day: Next day I will be working on the same task. Continue |
Radosław Waśko reports a new STANDUP for yesterday (2024-07-17): Progress: Fixed Union tests. Implemented Distinct for Snowflake, Fixed Merge tests. Put up a PR. Some work on types. It should be finished by 2024-07-24. Next Day: Next day I will be working on the same task. Continue Snowflake / types stuff. |
Radosław Waśko reports a new STANDUP for yesterday (2024-07-18): Progress: Got the PR reviewed but tests were failing, mostly due to changes to Next Day: Next day I will be working on the same task. Investigate the bug. Merge the PR. Fix or pend remaining Snowflake failures in next PR. |
Radosław Waśko reports a new STANDUP for today (2024-07-19): Progress: Found and reported the bug with weird interaction of Next Day: Next day I will be working on the same task. Getting the pending PRs merged. Plan what to do next. |
) - Part of #9486 - Fixes `Table.union`, `merge` and `distinct` tests - Replaces `distinct_on` in `Context` that was actually a Postgres specific addition leaking into the base with a more abstract `Context_Extension` mechanism. - This allows us to implement the Snowflake-specific `DISTINCT` using `QUALIFY`.
Number(38,0)
as aBigInt
(it has no integer types).Long
unless it exceed in which case we should flip to BigInteger and read as that.Snowflake_Spec
tests.Snowflake_Tests
project running what we think are correct from theTable_Tests
.The text was updated successfully, but these errors were encountered: