forked from facebook/mysql-5.6
-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FB8-269: mysql_client_test MTR test stabilized #8
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Summary: Sometimes `opt_plugin_dir` contains `/` at the end, so `dlpath` ends up with `//`, and `unpack_filename()` fixes it. So resulting path length should be taken from `unpack_filename()` instead of from the earlier `strxnmov()` which could be incorrect. Once it's correct then the following code works to exclude the `dlpath` from the error message which then doesn't get truncated due to the 128 character limit in `ER_CANT_OPEN_LIBRARY`. Reviewed By: Pushapgl Differential Revision: D31223049 fbshipit-source-id: d72a6e1d8fc
Summary: --skip-flush-master-info was an optimization to skip flushing master.info (implemented at 7aff3e77ff0645f4cf2897859e1ef255f8835ee1). However, it had the following issues. - It still called sprintf to copy master.info attributes for each event, which took substantial cpu time - It skipped relay-log.info and worker-relay-log.info.N flushing too. This breaks recovery logic unless operating with slave_use_idempotent_for_recovery. To fix these issues, this diff adds a new system variable --skip-flush-relay-worker-info that controls whether to flush relay-log.info and worker-relay-log.info.N. --skip-flush-master-info no longer skips flushing these files. These parameters now skip calling sprintf as well to save cpu time. Squash with D7411993 (facebook@c17c918) Reviewed By: hermanlee Differential Revision: D31397154 fbshipit-source-id: e155b497b04
Summary: Adding support for superusers/root to truncate the table performance_schema.sql_findings Reviewed By: yizhang82 Differential Revision: D31426614 fbshipit-source-id: a6863cbe22d
Summary: rpl_raft_purged_gtids_dump_threads MTR failed due to "Cannot replicate because the master purged required binary logs" after server4 will tail server2. Try to sync server4 before switch to tail server2. Reviewed By: bhatvinay Differential Revision: D30818078 fbshipit-source-id: 754988c835e
Summary: Rows_query_log_event::has_trx_meta_data() is expensive as it does a bunch of string manipulation to figure out if it is prepended by TRX_META_DATA. In a random profile, it shows to take up 2.7% CPU cycles on secondaries. The function is called multiple times from different places during applying a trx from worker threads: 1. Rows_query_log_event::do_apply_event(): The 'apply' of this event stashes the trx_meta_data_json into 'rli' _only_ if the event contains trx_meta_data. This is the first invocation of Rows_query_log_event::has_trx_meta_data() 2. Query_log_event::do_apply_event(): It needs to cleanup the 'rows_query_ev' stashed in 'rli' _only_ if it has trx_meta_data. Hence it calls Rows_query_log_event::has_trx_metadata() again 3. Log_event::apply_event(): This extracts the timestamp (for 'milli-secs behind master' calculation) from trx_meta_data (stashed inside rli in facebook#1) and needs to check if the event contains trx_meta_data. Hence it calls Rows_query_log_event::has_trx_metadata() again It is best to cache this information in the event itself so that the cost of Rows_query_log_event::has_trx_metadata() is paid only once. This reduces the cost of this method by 3x. In a subsequent diff, I intend to cache the result of Query_log_event::extract_trx_metadata() since this method is also called multiple times and it is expensive as it does a bunch of json parsing and string construction/manipulation. Reviewed By: abhinav04sharma Differential Revision: D31373879 fbshipit-source-id: 720a648da96
Summary: Added the test case for multi-query prepared statement. Verified that no crashes occurred. Reviewed By: george-reynya Differential Revision: D31476047 fbshipit-source-id: 0cc60261b6d
Summary: Setting DSCP only makes sense for MySQL connections with an explicitly handled socket When using the service API, the protocol type is PLUGIN. Currently Mysql crashes when calling get_protocol_classic on a service api THD. For now we will simply fail the set. When we have support for response attributes, we can return true, and have the plugin be responsible for checking dscp settings and updating accordingly Reviewed By: lth Differential Revision: D31286968 fbshipit-source-id: f9305efcac4
…n_history Summary: Added 5 additional columns to ocksdb_compaction_history table. These include number of input files, total bytes in the input files, total number of output files, size of the output files and cpu consumed. Reviewed By: yizhang82 Differential Revision: D31458407 fbshipit-source-id: 30973938d1e
Summary: **Porting Notes** Problem: Running highly concurrent dump/restore tools could lead to the server terminating. Root cause was that there was a race condition when releasing MDLs which could lead invalid memory access. Specifically, MDL_context::release_lock() executed code that might lead to deletion of corresponding MDL_lock object first and then called MDL_ticket_store::remove() method which might have accessed MDL_lock::m_key key in the deleted object. The former happens really rarely, when we release the last ticket for this lock, so MDL_lock object becomes unused, and there are too many unused MDL_lock objects in the system, and the random search we use to choose objects to release picks up the specific MDL_lock object. While the latter happens any time when ticket store contains enough tickets so hash-based ticket map is employed. Solution: Move the call to MDL_ticket_store::remove() earlier in MDL_context::relase_lock() so that no access needs to be made to MDL_lock::key after it can become invalid. Only manual testing done, as it is very challenging to create a reliable test case which triggers this rare situation. Reference: mysql/mysql-server@d0a24b8 Porting notes: drop when porting to 8.0.26 Squash with null Reviewed By: hermanlee Differential Revision: D31517200 fbshipit-source-id: 4b30c1bb2cf
Summary: 1. Trimming of the binlog is left to raft plugin (based on the current leader's log). Server should skip this step as part of recovery. This essentially means setting 'valid_pos' to the last successfully parsed trx in the trx log (instead of the engine's view of the trx log) in MYSQL_BIN_LOG::recover() 2. executed_gtid_set should be initialized based on the engine's view of the trx log file cordinates. So, during startup appropriate flags need to be passed into MYSQL_BIN_LOG::init_gtid_sets(). init_gtid_sets() is already changed to handle this, but the flag was not set correctly during server startup 3. Another fix is in MYSQL_BIN_LOG::init_gtid_sets()to corretly set the position to read and calculate executed-gtid-set (based on the file name read from the engine) Reviewed By: anirbanr-fb Differential Revision: D31284902 fbshipit-source-id: fb582c36e54
Summary: This is a straight port of a 5.6 patch to 8.0 SHOW RAFT STATUS and SHOW GLOBAL STATUS go to the same functions in the plugin and access shared data structures. These functions return internal char * pointers to plugin global variables. They need to be serialized with LOCK_status otherwise it leads to race conditions. Reviewed By: li-chi Differential Revision: D31318340 fbshipit-source-id: 0627450c135
Summary: This is a straight port of similar patch on 5.6 raft reset master is inherently not raft compliant. Its get rid of all binlogs and make an instance appear like a fresh single instance database without consulting the ring. We need to block it. However under certain circumstances (in the future ) e.g. during first time replicaset build, or when we can guarantee that instance is single node raft ring, we can potentially do a reset master followed by rebootstrap of raft. This can also be achieved by disabling raft, reset master and then re-enabling raft, however to keep open the door, I have left a force option and will thread a mechanism from the plugin to call reset_master(force=true) Reviewed By: li-chi Differential Revision: D31299214 fbshipit-source-id: 2128f1845b6
Summary: Porting over just the binlog fsync portions of facebook@ac1a06f as well as group commits from facebook@4b8125171ad8 and their corresponding tests. Some key differences in this diff compared to the original: - Use `performance_schema` in tests instead of `information_schema` as per https://dev.mysql.com/doc/refman/5.7/en/performance-schema-variable-table-migration.html - Use `PSI_NOT_INSTRUMENTED` in the `my_strdup` call in `sql/mysql.cc`. - Add `SHOW_SCOPE_GLOBAL` to all `SHOW_VAR` structs. - Use `snprintf` instead of `my_snprintf`. - Remove `HAVE_REPLICATION` ifdef in `sql/sys_vars.cc`. - Use native C++ atomic adds instead of `my_atomic_add64` in `counter_histogram_increment`. Reference Patch: facebook@ac1a06f facebook@4b8125171ad8 Reviewed By: yizhang82 Differential Revision: D31223117 fbshipit-source-id: 9346181c0cb
Summary: Setting topology config in MTR so that feature that use topology config can be tested easily. Setting rpl_raft_skip_smc_updates to avoid unnecessary calls to SMC (even though we supply a dummy replicaset name). Reviewed By: li-chi Differential Revision: D31543877 fbshipit-source-id: 6ce3c43cab2
…ansition Summary: This is a port of an existing patch made to 5.6 for Raft. Raft will do a stop of the SQL thread during StopAllWrites. Then it will repoint the binlog files and during that action, the SQL threads have to remain stopped. We block it out in this diff by keeping an atomic bool which can be checked from other functions. This only applies to raft mode i.e. enable_raft_plugin = true. Reviewed By: li-chi Differential Revision: D31319499 fbshipit-source-id: f2577823f58
Summary: We have run into multiple problems when upgrading the MySQL client library in fbcode when the FB-added client error messages change their value because we have to adjust them when Oracle adds more client errors. To try to avoid this continuing to happen in the future, add a bunch of place holder client error messages to push the FB-added ones way down the list. Reviewed By: aditya-jalan, lth Differential Revision: D31160326 Squash with D6325225 fbshipit-source-id: 3c5e4d064e2
Summary: This change will return INSTANT_ADD_COLUMN from myrocks layer to sql layer. Details: 1. add rocksdb_instant_ddl variables to enable/disable instant ddl for rocksdb globally. 2. add rocksdb_support_instant() to check whether alter statement is can be instant-ddl Reviewed By: yizhang82 Differential Revision: D29544396 fbshipit-source-id: 7a78cca509f
Summary: Updating follower info on mysql side to support features like replication lag push back and support SQL like `SHOW SLAVE HOSTS`. Added a extension to `SHOW SLAVE HOSTS` to display raft followers: `SHOW SLAVE HOSTS WITH RAFT`. Reviewed By: bhatvinay, anirbanr-fb Differential Revision: D31557400 fbshipit-source-id: 94ecaf8853f
Summary: raft python fetch region from hostname. update raft_config.py to support different hostname format Reviewed By: yizhang82 Differential Revision: D31599527 fbshipit-source-id: 3914cc65e5c
Summary: There are 1.9% CPU spends on get_default_db_collation(). The caller for get_default_db_collation() is Query_log_event::do_apply_event() and most of time is spend on read data from Data Dictionary(DD). Similar to db_read_only/db_context, Add a cache(db name -> CHARSET_INFO*) into THD. Reviewed By: yizhang82 Differential Revision: D31452361 fbshipit-source-id: a848868c973
Summary: This way we'll see them in test log outputs and see if the MTR_UNIQUE_IDS_DIR are doing their thing. Reviewed By: luqun Differential Revision: D31520478 fbshipit-source-id: 7dc62a0f58f
Summary: Fixing the clock_gettime regression in slave sql thread. **Change** * Scope of thd::set_time is much wider and also used in handle_sql_slave. Moved clock_gettime used for start_cputime to separate method set_start_cputime which is called as part of dispatch command. * Renamed set_cpu_time method to get_cpu_time because we are computing the execution time using diff from currenttime with starttime. * Unified the cpu time that we set in response_attributes and perf-schema to avoid 1 additional call to clock_gettime. **Notes** * 5.6 Primary also uses clock_gettime as part of [mysql_execute_command](https://github.com/facebook/mysql-5.6/blob/fb-mysql-5.6.35/sql/sql_parse.cc#L3634), so cpu cycles spent in clock_gettime is same in 5.6 primary and 8.0 primary. squash with D19539990 (facebook@b9b78a9) and D29437104 Reviewed By: yizhang82 Differential Revision: D31564251 fbshipit-source-id: 2b8c79ae004
Summary: Over the recent SEV0, raft instances crash looping because of block skew. Even after the clock is normal, mysqld_safe failed to bring back mysqld because extended 24hours backoff. This diff reduces the max backoff time from 24h to 30mins. So it will still keep trying, but not so aggressively that filling up the trx logs. Reviewed By: bart2 Differential Revision: D31661172 fbshipit-source-id: 34543c76941
Summary: QOS level is a 6-bit DSCP value and TOS is a 8-bit value from a DSCP shifted to left by 2. 1. Have `mysql_get_option` to return ToS value 0 by default if `mysql->options.extension` is uninitialized 2. `mysql->options.extension->tos` is always initialized with 0 3. In client.cc, we accept ToS value from 0 to 255. mysql_options returns 1 to fail the call if ToS is outside of [0, 255] Reviewed By: aditya-jalan Differential Revision: D31163643 fbshipit-source-id: e5e9ff67f95
Summary: Additional histogram information for group commit latency, engine commit latency. We already have semi-sync histogram_trx_wait, so this will help in understanding the latency distribution across the group commit stages. Reference Patch: facebook@8ae4531 Reviewed By: george-reynya Differential Revision: D31499639 fbshipit-source-id: 77c0c8a5013
Summary: As part of MYSQL_DIGEST_END, we compute the hash and store it in the digest storage structure. However, we are not reusing this value but instead computing another one from the same digest storage to set the sql_id. Altering to reuse this computed value. Retaining the re-computation as a fallback. Reviewed By: yoshinorim Differential Revision: D31674192 fbshipit-source-id: e4893f4380c
…queries by admin users Summary: Porting D31669602 to 8.0 Reviewed By: mzait, yoshinorim Differential Revision: D31671612 fbshipit-source-id: 06cc2c3868a
Summary: This diff ensures that prepare_inner() function is not called when bypass sql is enabled, because it is not necessary for running bypass engine later. To do this, a new hton function, is_bypass_sql_enabled(), is defined. As a result of skipping prepare_inner(), lex->result is not populated and read_set bitmap is not set. So I made necessary change to nosql_access as well. Reviewed By: yizhang82 Differential Revision: D31458076 fbshipit-source-id: 3d0f30ae2b9
Summary: This diff adds a new query attribute 'response_attrs_contain_server_cpu'. When the value of the query attribute is set to 'ON', then the query response attribute 'server_cpu' will be populated. Reviewed By: george-reynya Differential Revision: D31626851 fbshipit-source-id: 8ae522a534c
Summary: - Disable `thread_cache` test since with 64 schedulers and trivial queries they are all executed on different schedulers by the current scheduler workers without ever involving additional workers. So the number of idle workers is always 0 during the test. Once the schedulers are made configurable, this test will be fixed to use only 1 scheduler. - Adjust `idx_threads` to remove any records about `thread/thread_pool/worker` that are picked up by the queries. The presence of workers is completely random based on what other tests have run prior to this test. Reviewed By: lth Differential Revision: D31491229 fbshipit-source-id: d5d4e65b9e8
inikep
pushed a commit
that referenced
this pull request
May 17, 2023
Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
May 18, 2023
Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
May 26, 2023
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 fbshipit-source-id: d96ebcef966 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 fbshipit-source-id: 8e7fdb8 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
Jun 1, 2023
Summary: 1. Disable more tests that were missed last time due to missing features 2. handler_basic.test : move FLUSH STATS *after* CREATE TABLE as CREATE TABLE also calls handler::write_row for innodb, not surprisingly. 3. Account for explain select format differences. 4. rocksdb.test: disable portion of test that use partition, sort the result for better stability 5. consistent_snapshot_mixed_engines: 8.0 now properly supports cross-engine snapshots (it calls the callback for each engine to create snasphot) so rebaseline for the behavioral differences - now innodb and rocksdb can share the same snapshot. 6. rpl_rocksdb_snapshot: rebaseline for binlog position change and don't use master_auto_position with master_log_pos. 7. rpl_crash_safe_wal_corrupt/rpl_gtid_crash_safe_wal_corrupt : Enable idempotent_recovery. Note they still need a missing patch to work completely. 8. rpl_gtid_rocksdb_sys_header: need to set rpl_allow_error=1 as the test is expecting binlog corruption 9. delete singledelete_idempotent_table as we don't have rbr_idempotent_table anymore. Reviewed By: lloyd Differential Revision: D17676018
inikep
pushed a commit
that referenced
this pull request
Jun 1, 2023
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 fbshipit-source-id: d96ebcef966 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 fbshipit-source-id: 8e7fdb8 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
Jun 14, 2023
Summary: 1. Disable more tests that were missed last time due to missing features 2. handler_basic.test : move FLUSH STATS *after* CREATE TABLE as CREATE TABLE also calls handler::write_row for innodb, not surprisingly. 3. Account for explain select format differences. 4. rocksdb.test: disable portion of test that use partition, sort the result for better stability 5. consistent_snapshot_mixed_engines: 8.0 now properly supports cross-engine snapshots (it calls the callback for each engine to create snasphot) so rebaseline for the behavioral differences - now innodb and rocksdb can share the same snapshot. 6. rpl_rocksdb_snapshot: rebaseline for binlog position change and don't use master_auto_position with master_log_pos. 7. rpl_crash_safe_wal_corrupt/rpl_gtid_crash_safe_wal_corrupt : Enable idempotent_recovery. Note they still need a missing patch to work completely. 8. rpl_gtid_rocksdb_sys_header: need to set rpl_allow_error=1 as the test is expecting binlog corruption 9. delete singledelete_idempotent_table as we don't have rbr_idempotent_table anymore. Reviewed By: lloyd Differential Revision: D17676018
inikep
pushed a commit
that referenced
this pull request
Jun 14, 2023
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 fbshipit-source-id: d96ebcef966 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 fbshipit-source-id: 8e7fdb8 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
Jun 19, 2023
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 fbshipit-source-id: d96ebcef966 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 fbshipit-source-id: 8e7fdb8 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
Jun 23, 2023
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 fbshipit-source-id: d96ebcef966 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 fbshipit-source-id: 8e7fdb8 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
Apr 25, 2024
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
May 7, 2024
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
May 8, 2024
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
May 9, 2024
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
May 10, 2024
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
May 13, 2024
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
May 15, 2024
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
May 16, 2024
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
May 17, 2024
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
May 17, 2024
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
May 21, 2024
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
May 21, 2024
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
May 30, 2024
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
Jun 28, 2024
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
Jul 2, 2024
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
Jul 19, 2024
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
Jul 19, 2024
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
Jul 30, 2024
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
Jul 31, 2024
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
Aug 2, 2024
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
inikep
pushed a commit
that referenced
this pull request
Aug 6, 2024
Summary: [Porting Notes] We want to dump raft logs to vanilla async replicas regardless of whether it's the relay log or binlog. Effectively after this change we'll dump relay logs on the followers and binlogs on the leader. When the raft role changes, the logs to the dumped are also changed. Dump_log class is introduced as a thin wrapper/continer around mysql_bin_log or rli->relay_log and is inited with mysql_bin_log to emulate vanilla mysql behavior. Dump threads use the global dump_log object instead of mysql_bin_log directly. We switch the log in dump log only when raft role changes (in binlog_change_to_binlog() and binlog_change_to_apply_log()). During raft role change we take all log releated locks (LOCK_log, LOCK_index, LOCK_binlog_end_pos, and dump log lock) to serialize it with other log operations like dumping logs. Related doc - https://fb.quip.com/oTVAAdgEi4zY This diff contains below 7 patches: D23013977 D24766787 D24716539 D24900223 D24955284 D25174166 D25775525 Reviewed By: luqun Differential Revision: D26141496 ------------------------------------------------------------------------------- Passing raw_log pointer to wait_with_heartbeat() and wait_without_heartbeat() Summary: When enable_raft_plugin is OFF Dump_log::lock() is a no-op. Which means that when enable_raft_plugin is OFF there can be a race between log switching and dump threads. This could lead to a scenario where the raw_log that wait_next_event() is working on might be different than what wait_with_heartbeat()/wait_without_heartbeat() is working on. This can cause deadlocks because wait_with_heartbeat()/wait_without_heartbeat()'s mysql_cond_wait would unlock and then lock a different log's LOCK_binlog_end_pos mutex which would then never be unlocked by wait_next_event(). Reviewed By: anirbanr-fb Differential Revision: D32152658 ----------------------------------------------------------------------------------------- Fix rpl_raft_dump_raft_logs Summary: This tests completes but fails because the following warning exists: ``` 2022-08-30T16:28:00.159525Z 11 [ERROR] [MY-013114] [Repl] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Slave has more GTIDs than the master has, using the master's SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, e.g., after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replicated to the slave. Suggest to replicate any transactions that master has rolled back from slave to master, and/or commit empty transactions on master to account for transactions that have been', Error_code: MY-013114 ``` Since the MTR result file is valid, we can suppress this error. Reviewed By: yichenshen Differential Revision: D39141846 ------------------------------------------------------------------------------- Fix heap overflow in group_relay_log_name handling Summary: We were accessing group_relay_log_name in Query_log_event::do_apply_event_worker() but it's assigned only after the coordinator thread encounters an end event (i.e. xid event or a query event with "COMMIT" or "ROLLBACK" query). This was causing a race between accessing group_relay_log_name in the worker thread and writing it on the coordinator thread. We don't need to set transaction position in events other than end event, so now we set transaction position in query event only if it's an end event. The race is eliminated because group_relay_log_name is set before enqueuing the event to the worker thread (in both dep repl and vanilla mts). Reviewed By: lth Differential Revision: D28767430 ------------------------------------------------------------------------------- fix memory during MYSQL_BIN_LOG::open_existing_binlog Summary: asandebug complain there are memory leaks during MYSQL_BIN_LOG open Direct leak of 50 byte(s) in 1 object(s) allocated from: #0 0x67460ef in malloc #1 0x93f0777 in my_raw_malloc(unsigned long, int) #2 0x93f064a in my_malloc(unsigned int, unsigned long, int) #3 0x93f0eb0 in my_strdup(unsigned int, char const*, int) #4 0x8af01a6 in MYSQL_BIN_LOG::open(unsigned int, char const*, char const*, unsigned int) #5 0x8af8064 in MYSQL_BIN_LOG::open_binlog(char const*, char const*, unsigned long, bool, bool, bool, Format_description_log_event*, unsigned int, RaftRotateInfo*, bool) #6 0x8b00c00 in MYSQL_BIN_LOG::new_file_impl(bool, Format_description_log_event*, RaftRotateInfo*) #7 0x8d65e47 in rotate_relay_log(Master_info*, bool, bool, bool, RaftRotateInfo*) #8 0x8d661c0 in rotate_relay_log_for_raft(RaftRotateInfo*) #9 0x8c7696a in process_raft_queue #10 0xa0fa1fd in pfs_spawn_thread(void*) #11 0x7f8c9a12b20b in start_thread release these memory before assign them Reviewed By: Pushapgl Differential Revision: D28819752
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
https://jira.percona.com/browse/FB8-269
Fixed 2 issues:
test_option_tos test adjusted to support IPv4 as well.
The test was failing when only IPv4 was available.
Client disconnection problem fixed.
When we use SSL, Vio object is being reinitialized during SSL handshake.
It's members thread_id and signal_mask were not propagated from old Vio
object which caused problems during vio_shutdown (SIGALRM signal not
sent to the thread waiting in ppoll).
No MTR test as I was able to reproduce the problem running many MTR test
instances in parallel.