Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A unified, robust and bug-free connection management interface for the ORM #1001

Merged
merged 31 commits into from
Mar 20, 2022

Conversation

blazing-gig
Copy link
Contributor

@blazing-gig blazing-gig commented Dec 6, 2021

Description

This PR provides for a much more robust implementation of the connection management interface of the ORM which is primarily geared towards improving performance and usability.

Motivation and Context

I was really excited that a native asyncIO based ORM was in town and implemented a lot of API constructs similar to the Django ORM since I personally like the way queries are expressed in Django. When I was tinkering around,

  • The first thing I noticed was that connections to the DB were being established on boot-up which I wasn't very comfortable with.
  • The second thing I noticed, after tinkering around with the source code, was that though the docs said that Tortoise.get_connection returns a connection for a given DB alias, there was another construct named current_transaction_map and another method named get_connection in tortoise/transactions.py that was being used internally for storing and retrieving connections.

This PR tries to address the above problems/inconsistencies as follows:

  • A unified connection management interface for accessing, modifying and deleting connections using asyncio native constructs.
  • Lazy connection creation i.e the underlying connection to the DB gets established only upon execution of a query.
  • Fixes a lot of bugs that could occur due to connections being stored in two different places ( Tortoise._connections and current_transaction_map). Also, the usage of ContextVars has been done in the most optimal way.

How Has This Been Tested?

  • All test cases present currently pass for all environments. Once the design and solution are agreed upon, I could write tests for the newly added code.
  • Any code attempting to access a specific connection using an alias has now been refactored to use the new connections interface (including all existing tests)

Once the solution has been accepted and reviewed by the core team, I can update the tests, docs and changlog accordingly.

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added the changelog accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@abondar
Copy link
Member

abondar commented Dec 6, 2021

Hi!

Thank you for your PR

Could you provide motivation for this piece?

Lazy connection creation i.e the underlying connection to the DB gets established only upon execution of a query.

When I was designing this one on the start I did that so that tortoise could fail fast, and user of orm would know that something gone wrong from the start

With lazy connection that could be delayed into future, which could bring more confusion

@blazing-gig
Copy link
Contributor Author

blazing-gig commented Dec 7, 2021

Hey @abondar

Thanks for responding quickly. While I do agree to your point (failing fast), I see the following concerns which might take higher precedence over this:

  • When used in a web framework, If multiple databases are present, lets say 5 and each had a pool size of 5, then the server would have to wait until all 25 connections are established depending on the connection pool implementation. This could become a bottleneck as it increases boot-time for services (especially when auto-scaled). The goal is to keep the boot-time to a minimum.
  • If the DB connection for some reason couldn't be established, it would only affect the routes which contain DB interactions. There could be other routes which don't interact with the DB at all. So instead of the entire service failing(DoS), only part of it would fail which can be detected and fixed quite easily. (A variant of the fail fast in some sense)
  • If most routes of a web server don't interact with the DB at all, then delaying connection creation just creates better availability of resources until the point it's created.

Also, most ORMs' including Django and SQLAlchemy have resorted to lazy connection creation for probably the same reasons.

Let me know your thoughts on the same. Thanks again 😊

@blazing-gig
Copy link
Contributor Author

Hey @abondar , is there any update on this or are we blocked somewhere ? Any concerns ?

@blazing-gig
Copy link
Contributor Author

blazing-gig commented Dec 19, 2021

Hey @abondar , any thoughts/updates on this ? I'd really like this to get merged so that I'd be able to use the ORM with so much more confidence in both my personal as well as professional python projects.

Also, I've fixed breaking tests as well. It would be great if either you or someone from the core team can take a look at this so that we can move forward.

# Conflicts:
#	tortoise/backends/base/client.py
@blazing-gig
Copy link
Contributor Author

@long2ice @abondar Is the project still being actively maintained? If yes, how do I create more traction around this PR to take things forward? Any help from the team would be greatly appreciated. Thanks.

@coveralls
Copy link

coveralls commented Dec 22, 2021

Pull Request Test Coverage Report for Build 2002631079

  • 187 of 196 (95.41%) changed or added relevant lines in 11 files are covered.
  • 4 unchanged lines in 3 files lost coverage.
  • Overall coverage increased (+0.2%) to 93.818%

Changes Missing Coverage Covered Lines Changed/Added Lines %
tortoise/contrib/test/init.py 31 32 96.88%
tortoise/models.py 4 5 80.0%
tortoise/init.py 10 12 83.33%
tortoise/connection.py 87 89 97.75%
tortoise/backends/base/client.py 29 32 90.63%
Files with Coverage Reduction New Missed Lines %
tortoise/backends/asyncpg/client.py 1 98.1%
tortoise/backends/base_postgres/client.py 1 89.11%
tortoise/init.py 2 95.45%
Totals Coverage Status
Change from base Build 2001658813: 0.2%
Covered Lines: 5228
Relevant Lines: 5507

💛 - Coveralls

Copy link
Member

@abondar abondar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi!

PR seems to be legit to me

Sorry for slow responses, even though project is being maintained, it is done on available free time basis, which could be quite limited at times

@long2ice could you please prepare release for this PR?

@long2ice
Copy link
Member

Thanks for your contribution! @abondar I will spend some time to review it. Which is a big change and break some api, I have some concerns about that and considering whether we need to accept these changes. And also please see https://github.com/tortoise/tortoise-orm/pull/1001/checks?check_run_id=4581321590 to make that pass.

@blazing-gig
Copy link
Contributor Author

Hey @abondar @long2ice , thanks for the response. I do understand that most of the work done is during free time. Its just that I found myself some time to work on this and thought that it would be great if merged quickly so that I could recommend using the ORM at my workplace for a few projects.

Personally, from the research I've done, I feel that this project has an edge over other async ORMs' out there in the market with respect to feature set, performance and ease-of-use (especially for Django buffs). Hoping to become a long term contributor in the near future.

Also, @long2ice I've made sure that all APIs' that currently exist for fetching DB connections have been made backward compatible with the new design and so this wouldn't cause breaking changes off the bat. These APIs' can be marked as deprecated and probably removed in a future release. Let me know your thoughts.

blazing-gig and others added 2 commits March 5, 2022 23:34
# Conflicts:
#	tests/schema/test_generate_schema.py
#	tortoise/backends/asyncpg/client.py
@blazing-gig
Copy link
Contributor Author

@long2ice I've fixed all breaking tests from my end. Also, I've removed the custom ConnectionPoolWrapper implementation in the psycopg client class as I felt it was unnecessary and was complicating things. Feel free to drop in ur suggestions if any.

@blazing-gig
Copy link
Contributor Author

@long2ice Also, please trigger the pipeline again

@long2ice
Copy link
Member

Looks great! Just need add the changelog

@blazing-gig
Copy link
Contributor Author

@long2ice Hey, I've updated the changelog as well. Please do have a look at it. I think it can be merged now. Let me knw if there's anything else that needs to be done. Thanks 🙂

@blazing-gig
Copy link
Contributor Author

@long2ice Hey I think test_bulk_create_mix_specified test is failing since the resulting list from the queryset is not sorted but assertListSortEqual is expecting it to be sorted... Could you please fix this from ur end ?

@blazing-gig
Copy link
Contributor Author

@long2ice Also this is similar to the bulk issue which was fixed earlier I believe

@blazing-gig
Copy link
Contributor Author

@long2ice Either you could add an order_by clause on the query or refactor the test to just check for existence.

@long2ice
Copy link
Member

What about merge develop first?

@blazing-gig
Copy link
Contributor Author

@long2ice I've already merged from develop. But the issue still remains...

@blazing-gig
Copy link
Contributor Author

@long2ice

async def test_bulk_create_mix_specified(self):
    await UniqueName.bulk_create(
        [UniqueName(id=id_) for id_ in range(10000, 11000)]
        + [UniqueName() for _ in range(1000)]
    )
    
    all_ = await UniqueName.all().values("id", "name") # --> This is gonna return results in a random order
    self.assertEqual(len(all_), 2000)
    
    self.assertListSortEqual(
        all_[:1000], [{"id": id_, "name": None} for id_ in range(10000, 11000)], sorted_key="id"
    ) # --> This expectation is gonna fail
    inc = all_[1000]["id"]
    self.assertListSortEqual(
        all_[1000:], [{"id": val + inc, "name": None} for val in range(1000)], sorted_key="id"
    )

The bottomline is that sometimes the test passes and sometimes it fails becoz of the inconsistency I've highlighted above...
Would you be able to fix this ?

@long2ice
Copy link
Member

strange, ci of develop is right

@blazing-gig
Copy link
Contributor Author

strange, ci of develop is right

@long2ice Not really sure but from what I observed the tests seem to pass sometimes and fail during others. Nevertheless could you pls check the specific concern I've highlighted above? What's your take on it?

@blazing-gig
Copy link
Contributor Author

@long2ice Also, I'd suggest triggering the pipeline again... If it succeeds then it could be a one off inconsistency which we could potentially ignore. If not, it has to be dealt with. Either which way, triggering the pipeline again could add some clarity to the matter.

@blazing-gig
Copy link
Contributor Author

@long2ice I guess my observation is true. Since the pipeline is passing, I believe we can proceed to merge it and then probably look into the inconsistency issue. If not, I believe the merge is getting unnecessarily delayed... Let me knw ur thoughts

@long2ice long2ice merged commit 9709355 into tortoise:develop Mar 20, 2022
@long2ice
Copy link
Member

OK, I merge it. Thanks!

@blazing-gig
Copy link
Contributor Author

@long2ice thanks! Also, do you plan on creating a separate issue for the inconsistency which I've mentioned above? If so, let me knw. Would be happy to contribute there as well... Also would be looking to solve other issues related to connection management and related behavior.

netbsd-srcmastr referenced this pull request in NetBSD/pkgsrc Nov 23, 2023
0.20.0
------
Added
^^^^^
- Allow ForeignKeyField(on_delete=NO_ACTION)
- Support `pydantic` 2.0.

Fixed
^^^^^
- Fix foreign key constraint not generated on MSSQL Server.
- Fix testcase error with python3.11

Breaking Changes
^^^^^^^^^^^^^^^^
- Drop support for `pydantic` 1.x.
- Drop support for `python` 3.7.
- Param `config_class` of `pydantic_model_creator` is renamed to `model_config`.
- Attr `config_class` of `PydanticMeta` is renamed to `model_config`.

0.19
====

0.19.3
------
Added
^^^^^
- Added config_class option to pydantic model genator that allows the developer to customize the generated pydantic model's `Config` class.
Fixed
^^^^^
- Fastapi example test not working.
- Fix create index sql error.
- Fix dependencies resolve error.
- Fix ignoring zero value of limit.
- Fix ForeignKeyField is none when fk is integer 0.
- Fix limit ignore zero.
- Fix min/max value validators for decimal fields.

0.19.2
------
Added
^^^^^
- Added `schema` attribute to Model's Meta to specify exact schema to use with the model.
Fixed
^^^^^
- Mixin does not work.
- `using_db` wrong position in model shortcut methods.
- Fixed connection to `Oracle` database by adding database info to DBQ in connection string.
- Fixed ORA-01435 error while using `Oracle` database
- Fixed processing of `ssl` option in MySQL connection string.
- Fixed type hinting for `QuerySetSingle`.

0.19.1
------
Added
^^^^^
- Added `Postgres`/`SQLite` partial indexes support.
- Added `Microsoft SQL Server`/`Oracle` support, powered by `asyncodbc <https://github.com/tortoise/asyncodbc>`_, note that which is **not fully tested**.
- Added `optional` parameter to `pydantic_model_creator`.
- Added `using_db` parameter to `Model` shortcut methods.
Fixed
^^^^^
- `TimeField` for `MySQL` will return `datetime.timedelta` object instead of `datetime.time` object.
- Fix on conflict do nothing.
- Fix `_custom_generated_pk` attribute not set in `Model._init_from_db` method.

0.19.0
------
Added
^^^^^
- Added psycopg backend support.
- Added a new unified and robust connection management interface to access DB connections which includes support for
  lazy connection creation and much more. For more details, check out this `PR <https://github.com/tortoise/tortoise-orm/pull/1001>`_
- Added `TimeField`.
- Added `ArrayField`.
Fixed
^^^^^
- Fix `bulk_create` doesn't work correctly with more than 1 update_fields.
- Fix `bulk_update` errors when setting null for a smallint column on postgres.
Deprecated
^^^^^^^^^^
- Existing connection management interface and related public APIs which are deprecated:
 - `Tortoise.get_connection`
 - `Tortoise.close_connections`
Changed
^^^^^^^
- Refactored `tortoise.transactions.get_connection` method to `tortoise.transactions._get_connection`.
 Note that this method has now been marked **private to this module and is not part of the public API**
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants