Documentation #17

chandlerNick · 2025-02-07T09:11:32Z

Added docstrings to most functions. A few private functions were simple enough that they did not seem to need one.

se-jaeger

Hi, added some minor changes, which we will discuss offline.
Other than that, LGTM 🚀

se-jaeger · 2025-02-14T10:19:18Z

tab_err/error_mechanism/_ear.py

@@ -43,10 +69,10 @@ def _sample(self: EAR, data: pd.DataFrame, column: str | int, error_rate: float,
            raise ValueError(msg)

        # we offset the upper bound of the lower_error_index by a) the existing number of errors in the row, and b) the number of errors to-be generated.
-        upper_bound = len(se_data) - sum(se_mask) - n_errors
+        upper_bound = len(se_data) - sum(se_mask) - n_errors  # upper bound = length of data - current number of errors - number of errors to be generated


Suggested change

upper_bound = len(se_data) - sum(se_mask) - n_errors # upper bound = length of data - current number of errors - number of errors to be generated

upper_bound = len(se_data) - sum(se_mask) - n_errors

I think it's fine without this comment. Maybe think about renaming se_...
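For reference, a minimal sketch of what the `upper_bound` line computes, written outside the class. The function name `pick_error_window` and the `seed` parameter are hypothetical and only serve the illustration; the arithmetic follows the line under review. Note that `Generator.integers` excludes the high endpoint by default.

```python
import numpy as np
import pandas as pd


def pick_error_window(se_data: pd.Series, se_mask: pd.Series, n_errors: int, seed: int = 0) -> range:
    """Hypothetical helper: pick a contiguous positional window of n_errors slots
    among the rows that do not contain an error yet."""
    rng = np.random.default_rng(seed)

    n_error_free = len(se_data) - int(se_mask.sum())
    if n_errors > n_error_free:
        msg = "Not enough error-free rows left to place the requested errors."
        raise ValueError(msg)

    # Same arithmetic as in the diff: length - existing errors - errors to generate.
    upper_bound = len(se_data) - int(se_mask.sum()) - n_errors

    # integers(0, upper_bound) draws from [0, upper_bound), so the window
    # [lower, lower + n_errors) always stays inside the error-free rows.
    lower = int(rng.integers(0, upper_bound)) if upper_bound > 0 else 0
    return range(lower, lower + n_errors)


se_data = pd.Series(range(10))
se_mask = pd.Series([True, True] + [False] * 8)  # two errors already present
window = pick_error_window(se_data, se_mask, n_errors=3)
```

With 10 rows, 2 existing errors, and 3 errors to generate, `upper_bound` is 5, so the window starts somewhere in positions 0 to 4 of the 8 error-free rows.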

se-jaeger · 2025-02-14T10:20:22Z

tab_err/error_mechanism/_ear.py

        lower_error_index = self._random_generator.integers(0, upper_bound) if upper_bound > 0 else 0
        error_index_range = range(lower_error_index, lower_error_index + n_errors)
-        selected_rows = data_column_error_free.sort_values(by=condition_to_column).iloc[error_index_range, :]
+        selected_rows = data_column_error_free.sort_values(by=condition_to_column).iloc[error_index_range, :]  # Sort by the condition_to_column values


Suggested change

selected_rows = data_column_error_free.sort_values(by=condition_to_column).iloc[error_index_range, :] # Sort by the condition_to_column values

selected_rows = data_column_error_free.sort_values(by=condition_to_column).iloc[error_index_range, :]

I think it's fine without this comment. Maybe think about renaming se_...

se-jaeger · 2025-02-14T10:21:23Z

tab_err/error_mechanism/_enar.py

        error_index_range = range(lower_error_index, lower_error_index + n_errors)
-        selected_rows = se_data_error_free.sort_values().iloc[error_index_range]
+        selected_rows = se_data_error_free.sort_values().iloc[error_index_range]  # Introduce errors to locations of sorted values


Suggested change

selected_rows = se_data_error_free.sort_values().iloc[error_index_range] # Introduce errors to locations of sorted values

selected_rows = se_data_error_free.sort_values().iloc[error_index_range]

I think that's clear enough without this comment.
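For readers of this thread, a small sketch of how the two `sort_values().iloc[...]` lines differ. The toy data and the column names `age` and `income` are made up for illustration; only the `sort_values` / `iloc` calls mirror the reviewed lines.

```python
import pandas as pd

data = pd.DataFrame(
    {"age": [30, 10, 50, 20, 40], "income": [3, 9, 1, 7, 5]},
    index=["a", "b", "c", "d", "e"],
)
error_index_range = range(1, 3)

# As in _enar.py: order the target column by its own values,
# then take a contiguous positional slice of the sorted Series.
enar_rows = data["age"].sort_values().iloc[error_index_range]

# As in _ear.py: order the rows by a different (conditioning) column,
# then take the same positional slice of the sorted DataFrame.
ear_rows = data.sort_values(by="income").iloc[error_index_range, :]

# The index labels identify which original rows would receive errors.
print(enar_rows.index.tolist())
print(ear_rows.index.tolist())
```

In both cases the selection is positional on the sorted object, while the returned index labels still point back to the original rows.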

se-jaeger · 2025-02-14T10:24:13Z

tab_err/error_mechanism/_error_mechanism.py

+
+        Description:
+            Does error checking for the abstract method '_sample'.
+            Assigns the _random_generator attribute.
+            Calls subclass _sample method.


Suggested change

Description:

Does error checking for the abstract method '_sample'.

Assigns the _random_generator attribute.

Calls subclass _sample method.

I feel that's not necessary for users.

chandlerNick added 5 commits February 5, 2025 16:15

Documentation in progress...

939878a

Adding more documentation

d2a6a80

Adding back the result of the Error_Types.ipynb

391b267

Finished adding doc strings

41c40b0

Reformatting files

3c9eddb

chandlerNick closed this Feb 7, 2025

chandlerNick reopened this Feb 7, 2025

minor changes

27cd53b

se-jaeger approved these changes Feb 14, 2025

se-jaeger force-pushed the ToDos branch from e880e10 to 27cd53b Compare February 14, 2025 10:53

update doc dependencies

9f5cc25
