
Query Performance Improvement: Tag Retrieval

정회성 edited this page Oct 29, 2024 · 1 revision

Data Specification

  • Members: 10 records
  • Categories: 100 records (10 per member)
  • Tags: 2000 records (200 per member)
  • Templates: 100,000 records (10,000 per member)
  • Source Codes: 100,000 to 500,000 records (1 to 5 randomly generated per template)

Test Conditions

  • 10 threads, each executing the request 100 times
  • Total of 1000 requests executed
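
The test conditions above can be reproduced with a small harness along the following lines. This is a minimal sketch, not the tool that produced the measurements: the class `LoadTestSketch`, its `Result` type, and the placeholder request are hypothetical. It assumes that "total elapsed time" means the sum of the individual request times, which matches the reported average being the total divided by 1000.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class LoadTestSketch {

    /** Outcome of one run: how many requests ran and the sum of their elapsed times. */
    static final class Result {
        final long requestCount;
        final long totalElapsedMillis;

        Result(long requestCount, long totalElapsedMillis) {
            this.requestCount = requestCount;
            this.totalElapsedMillis = totalElapsedMillis;
        }

        long averageElapsedMillis() {
            return requestCount == 0 ? 0 : totalElapsedMillis / requestCount;
        }
    }

    /** Runs `request` repetitionsPerThread times on each of `threads` worker threads. */
    static Result run(int threads, int repetitionsPerThread, Runnable request) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong requestCount = new AtomicLong();
        AtomicLong totalElapsedNanos = new AtomicLong();

        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < repetitionsPerThread; i++) {
                    long start = System.nanoTime();
                    request.run();
                    totalElapsedNanos.addAndGet(System.nanoTime() - start);
                    requestCount.incrementAndGet();
                }
            });
        }

        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.MINUTES)) {
                throw new IllegalStateException("load test did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("load test was interrupted", e);
        }
        return new Result(requestCount.get(),
                TimeUnit.NANOSECONDS.toMillis(totalElapsedNanos.get()));
    }

    public static void main(String[] args) {
        // Placeholder request: replace the empty body with a call to the tag retrieval API.
        Result result = run(10, 100, () -> { });
        System.out.println("Total request count: " + result.requestCount);
        System.out.println("Total elapsed time: " + result.totalElapsedMillis + " ms");
        System.out.println("Average elapsed time: " + result.averageElapsedMillis() + " ms");
    }
}
```

Summing per-request times rather than measuring wall-clock time keeps the average independent of how many threads run in parallel.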

Template Generation Conditions

  • Tags used: 20 tags (all existing tags)
  • Source Codes: 2 codes

Before Optimization

Speed Measurement

  • Total request count: 1000
  • Total elapsed time: 5,268,308 ms
  • Average elapsed time: 5,268 ms

Query Analysis

A total of [2 + number of tags] queries are executed (202 with the 200 tags in this test).

1. Retrieve Templates (based on MemberId)

  • Repository: TemplateJpaRepository
  • Method: findByMemberId
    SELECT
        t1_0.id,
        t1_0.category_id,
        t1_0.created_at,
        t1_0.description,
        (SELECT COUNT(*) 
         FROM likes 
         WHERE likes.template_id = t1_0.id),
        t1_0.member_id,
        t1_0.modified_at,
        t1_0.title 
    FROM
        template t1_0 
    WHERE
        t1_0.member_id = ?
  • Number of Calls: 1 time

2. Retrieve Tag List for Template (based on TemplateId)

  • Repository: TemplateTagJpaRepository
  • Method: findDistinctByTemplateIn
    SELECT
        DISTINCT tt1_0.tag_id 
    FROM
        template_tag tt1_0 
    WHERE
        tt1_0.template_id IN (?, ?, ?, ?) # as many as the number of templates
  • Number of Calls: 1 time

3. Retrieve Tag Information (based on TagId)

  • Repository: TagJpaRepository
  • Method: fetchById
    SELECT
        t1_0.id,
        t1_0.created_at,
        t1_0.modified_at,
        t1_0.name 
    FROM
        tag t1_0 
    WHERE
        t1_0.id = ?
  • Number of Calls: 200 times (as many as the number of tags)
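
The three steps above add up to 2 + (number of tags) queries. The following sketch imitates that flow with in-memory maps so the count can be checked. Everything in it (the class `TagQueryCountSketch`, the maps standing in for tables, the `queryCount` field) is hypothetical and only mirrors the order of the repository calls, not the real repositories.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class TagQueryCountSketch {
    // In-memory stand-ins for the template, template_tag and tag tables.
    private final Map<Long, List<Long>> templateIdsByMemberId;
    private final Map<Long, Set<Long>> tagIdsByTemplateId;
    private final Map<Long, String> tagNameById;
    int queryCount = 0;

    TagQueryCountSketch(Map<Long, List<Long>> templateIdsByMemberId,
                        Map<Long, Set<Long>> tagIdsByTemplateId,
                        Map<Long, String> tagNameById) {
        this.templateIdsByMemberId = templateIdsByMemberId;
        this.tagIdsByTemplateId = tagIdsByTemplateId;
        this.tagNameById = tagNameById;
    }

    List<String> findTagNames(long memberId) {
        // Query 1: templates of the member (findByMemberId).
        queryCount++;
        List<Long> templateIds = templateIdsByMemberId.getOrDefault(memberId, List.of());

        // Query 2: distinct tag IDs of those templates (findDistinctByTemplateIn).
        queryCount++;
        Set<Long> tagIds = new LinkedHashSet<>();
        for (Long templateId : templateIds) {
            tagIds.addAll(tagIdsByTemplateId.getOrDefault(templateId, Set.of()));
        }

        // Queries 3..n: one lookup per tag ID (fetchById).
        List<String> names = new ArrayList<>();
        for (Long tagId : tagIds) {
            queryCount++;
            names.add(tagNameById.get(tagId));
        }
        return names;
    }

    /** Builds one member with one template carrying `tagCount` tags and counts the queries. */
    static int countQueries(int tagCount) {
        Set<Long> tagIds = new LinkedHashSet<>();
        Map<Long, String> names = new java.util.HashMap<>();
        for (long id = 1; id <= tagCount; id++) {
            tagIds.add(id);
            names.put(id, "tag-" + id);
        }
        TagQueryCountSketch sketch = new TagQueryCountSketch(
                Map.of(1L, List.of(10L)), Map.of(10L, tagIds), names);
        sketch.findTagNames(1L);
        return sketch.queryCount;
    }

    public static void main(String[] args) {
        System.out.println("Queries for 200 tags: " + countQueries(200));
    }
}
```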

Necessary Tasks for Improvement

Query Optimization

1st Improvement: Covering Index

Covering Index

  • An index that contains all the data required to satisfy the query.
  • A query is covered when every column it uses in SELECT, WHERE, ORDER BY, GROUP BY, etc. is part of the index, so the table rows themselves never have to be read.

The existing logic retrieves every template, with all of its columns, for a given member ID.
However, only the template IDs are used afterwards.

Thus, we will modify the logic to retrieve only the template IDs.
With this change the query can be answered from the index alone (a covering index), which improves query performance.

Query
@Query("""
    SELECT t.id  
    FROM Template t  
    WHERE t.member.id = :memberId  
""")  
List<Long> findAllIdsByMemberId(Long memberId);
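
As a rough mental model of why selecting only the ID helps, the sketch below imitates a secondary index in plain Java. It is an analogy, not how any database is implemented, and the class `CoveringIndexSketch` with its fields is hypothetical. It assumes an index whose entries hold both `member_id` and the template `id` (in MySQL InnoDB, for example, a secondary index on `member_id` also stores the primary key).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class CoveringIndexSketch {
    // Stand-in for the template table: primary key -> remaining columns of the row.
    private final Map<Long, String[]> rowsById = new HashMap<>();
    // Stand-in for an index on member_id whose entries also carry the template id.
    private final Map<Long, List<Long>> idsByMemberId = new TreeMap<>();
    // Counts how often a full row had to be fetched from the "table".
    int rowLookups = 0;

    void insert(long id, long memberId, String title, String description) {
        rowsById.put(id, new String[] {title, description});
        idsByMemberId.computeIfAbsent(memberId, k -> new ArrayList<>()).add(id);
    }

    /** Like findByMemberId: all columns are needed, so every index hit costs a row lookup. */
    List<String[]> findRowsByMemberId(long memberId) {
        List<String[]> rows = new ArrayList<>();
        for (long id : idsByMemberId.getOrDefault(memberId, List.of())) {
            rowLookups++;
            rows.add(rowsById.get(id));
        }
        return rows;
    }

    /** Like findAllIdsByMemberId: the index entries already contain the answer. */
    List<Long> findIdsByMemberId(long memberId) {
        return List.copyOf(idsByMemberId.getOrDefault(memberId, List.of()));
    }

    public static void main(String[] args) {
        CoveringIndexSketch table = new CoveringIndexSketch();
        for (long id = 1; id <= 5; id++) {
            table.insert(id, 1L, "title-" + id, "description-" + id);
        }
        table.findIdsByMemberId(1L);
        System.out.println("Row lookups after id-only query: " + table.rowLookups);
        table.findRowsByMemberId(1L);
        System.out.println("Row lookups after full-row query: " + table.rowLookups);
    }
}
```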

Proof of Covering Index Usage

Before: [figure omitted]

After: [figure omitted]

2nd Improvement: Covering Index + Use Subqueries Instead of IN Clause

With very large datasets, an IN clause that carries a long list of values can degrade performance. In our code, the query that retrieves template tags passes every template ID in the IN clause (currently 100,000).

We will improve this by replacing the ID list with a subquery, that is, a query nested inside the main query.
Instead of the application first loading the template IDs and then sending them back as bind parameters, the database resolves them itself and can do so from indexed columns.

Concretely, we will combine the lookup of template IDs by member ID and the lookup of template tags into a single query.
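
To make the difference concrete, the sketch below builds both statement shapes and counts their bind parameters. It is illustrative only: the class `InClauseSketch` and its methods are hypothetical, and the SQL strings are simplified from the generated queries shown elsewhere on this page.

```java
import java.util.Collections;

class InClauseSketch {

    /** Before: one bind parameter per template ID, so the statement grows with the data. */
    static String inListSql(int templateCount) {
        String placeholders = String.join(", ", Collections.nCopies(templateCount, "?"));
        return "SELECT DISTINCT tt.tag_id FROM template_tag tt "
                + "WHERE tt.template_id IN (" + placeholders + ")";
    }

    /** After: the database resolves the template IDs itself; only the member ID is bound. */
    static String subquerySql() {
        return "SELECT DISTINCT tt.tag_id FROM template_tag tt "
                + "WHERE tt.template_id IN "
                + "(SELECT t.id FROM template t WHERE t.member_id = ?)";
    }

    /** Counts the '?' placeholders in a statement. */
    static long bindParameterCount(String sql) {
        return sql.chars().filter(c -> c == '?').count();
    }

    public static void main(String[] args) {
        System.out.println("IN list, 10,000 templates: "
                + bindParameterCount(inListSql(10_000)) + " bind parameters");
        System.out.println("Subquery: "
                + bindParameterCount(subquerySql()) + " bind parameter");
    }
}
```

With the subquery form the statement text stays the same size no matter how many templates a member owns.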

Reference: SQL IN Clause Tuning


3rd Improvement: Tag Information Retrieval

Previously, after retrieving the tag IDs related to the templates from the template tags, we queried the tag table one ID at a time. As a result, the tag query ran as many times as there were tags (200 times in this test).

To solve this problem, we will merge the logic for retrieving template tags and the logic for retrieving tags into a single query.

@Query("""
    SELECT DISTINCT t  
    FROM Tag t  
    WHERE t.id IN (  
        SELECT DISTINCT tt.id.tagId    
        FROM TemplateTag tt    
        WHERE tt.id.templateId IN        
            (SELECT te.id FROM Template te WHERE te.member.id = :memberId)
    )
""")  
List<Tag> findDistinctTagNameByMemberIdIn(Long memberId);
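
To show what the merged query computes, the nested IN subqueries can be read from the inside out. The sketch below expresses the same three steps over in-memory collections. It is a hypothetical illustration (the class `MergedTagQuerySketch` and its parameter shapes are not from the project) and says nothing about how the database actually executes the statement.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class MergedTagQuerySketch {

    /**
     * @param memberByTemplateId stand-in for template: template id -> member id
     * @param templateTags       stand-in for template_tag: rows of {templateId, tagId}
     * @param tagNameById        stand-in for tag: tag id -> name
     */
    static List<String> findDistinctTagNames(long memberId,
                                             Map<Long, Long> memberByTemplateId,
                                             List<long[]> templateTags,
                                             Map<Long, String> tagNameById) {
        // Innermost subquery: SELECT te.id FROM Template te WHERE te.member.id = :memberId
        Set<Long> templateIds = memberByTemplateId.entrySet().stream()
                .filter(e -> e.getValue() == memberId)
                .map(Map.Entry::getKey)
                .collect(Collectors.toSet());

        // Middle subquery: SELECT DISTINCT tt.id.tagId ... WHERE tt.id.templateId IN (...)
        Set<Long> tagIds = templateTags.stream()
                .filter(row -> templateIds.contains(row[0]))
                .map(row -> row[1])
                .collect(Collectors.toSet());

        // Outer query: SELECT DISTINCT t FROM Tag t WHERE t.id IN (...)
        return tagNameById.entrySet().stream()
                .filter(e -> tagIds.contains(e.getKey()))
                .map(Map.Entry::getValue)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<Long, Long> templates = Map.of(10L, 1L, 11L, 1L, 20L, 2L);
        List<long[]> templateTags = List.of(
                new long[] {10, 100}, new long[] {11, 100},
                new long[] {11, 101}, new long[] {20, 102});
        Map<Long, String> tags = Map.of(100L, "java", 101L, "spring", 102L, "sql");
        System.out.println(findDistinctTagNames(1L, templates, templateTags, tags));
    }
}
```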

After Optimization

Speed Measurement

1st Improvement

  • Total request count: 1000
  • Total elapsed time: 3,632,279 ms
  • Average elapsed time: 3,632 ms

2nd Improvement

  • Total request count: 1000
  • Total elapsed time: 2,704,116 ms
  • Average elapsed time: 2,704 ms

3rd Improvement

  • Total request count: 1000
  • Total elapsed time: 92,743 ms
  • Average elapsed time: 92 ms

Query Analysis

A total of 1 query is executed.

1. Retrieve Tag Information (based on MemberId)

  • Repository: TemplateTagJpaRepository
  • Method: findDistinctTagNameByMemberIdIn
    SELECT
        DISTINCT t1_0.id,
        t1_0.created_at,
        t1_0.modified_at,
        t1_0.name 
    FROM
        tag t1_0 
    WHERE
        t1_0.id IN (SELECT
            DISTINCT tt1_0.tag_id 
        FROM
            template_tag tt1_0 
        WHERE
            tt1_0.template_id IN (SELECT
                t2_0.id 
            FROM
                template t2_0 
            WHERE
                t2_0.member_id = ?))
  • Number of Calls: 1 time

Performance Improvement Results

Before Improvement

  • Total request count: 1000
  • Total elapsed time: 5,268,308 ms
  • Average elapsed time: 5,268 ms

After Improvement

  • Total request count: 1000
  • Total elapsed time: 92,743 ms
  • Average elapsed time: 92 ms
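
The measurements above work out to roughly a 57-fold reduction in elapsed time (5,268,308 ms / 92,743 ms), or about 98% less time per request. The short sketch below recomputes the averages and speedups for each stage from the reported totals. The class `SpeedupSketch` and its method names are hypothetical; only the totals come from this page.

```java
class SpeedupSketch {

    /** Average time per request in milliseconds. */
    static double averageMillis(long totalElapsedMillis, int requestCount) {
        return (double) totalElapsedMillis / requestCount;
    }

    /** How many times faster the "after" run is compared with the "before" run. */
    static double speedup(long beforeTotalMillis, long afterTotalMillis) {
        return (double) beforeTotalMillis / afterTotalMillis;
    }

    public static void main(String[] args) {
        int requests = 1000;
        long before = 5_268_308L;
        String[] labels = {"1st improvement", "2nd improvement", "3rd improvement"};
        long[] totals = {3_632_279L, 2_704_116L, 92_743L};

        for (int i = 0; i < totals.length; i++) {
            System.out.printf("%s: average %.1f ms, %.1fx faster than before%n",
                    labels[i], averageMillis(totals[i], requests), speedup(before, totals[i]));
        }
    }
}
```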
