Merge branch 'develop' into feature/GREAT-494/regular-expression-parameter-builder

* develop:
  Bugfix/devrel 189/datadocs filter on date is broken (#4218)
  [BUGFIX] Datadocs filter on date is broken (#4217)
  add conditional check for 'expect_column_values_to_be_in_type_list' (#4200)
  [BUGFIX] Update validate_configuration for core Expectations that don't return… (#4216)
  [MAINTENANCE] RBP testing framework changes (#4184)
  [MAINTENANCE] minor type hints clean up (#4214)
  [MAINTENANCE] Update default candidate strings SimpleDateFormatString parameter builder (#4193)
  [FEATURE] Support Multi-Dimensional Metric Computations Generically for Multi-Batch Parameter Builders (#4206)
  [MAINTENANCE]: Remove temp file accidentally committed (#4201)
  [MAINTENANCE] Refactor relative imports (#4195)
  [MAINTENANCE] Revert changes to `dependency_graph` pipeline
  fixes two references to the Getting Started tutorial (#4189)
  [MAINTENANCE] Refactor RuleBasedProfiler toolkit pattern (#4191)
  [FEATURE] Update `suite new --profile` to work with Rule Based Profiler (#4171)
  [MAINTENANCE] Make `cli_integration` stage in primary `great_expectations` pipeline run in parallel (#4187)
  [FEATURE] Introduce Top-Level Abstract "ConfigPeer" class for handling configuration output for Marshmallow Schema validated Config objects (#4183)
Shinnnyshinshin committed Feb 16, 2022
2 parents c350925 + 9c3a72c commit a4896a2
Showing 120 changed files with 4,417 additions and 1,324 deletions.
6 changes: 3 additions & 3 deletions .github/pull_request_template.md
@@ -21,11 +21,11 @@ Previous Design Review notes:
 ### Definition of Done
 Please delete options that are not relevant.

-- [ ] My code follows the Great Expectations [style guide](https://docs.greatexpectations.io/en/latest/contributing/style_guide.html?highlight=style%20guide)
-- [ ] I have performed a [self-review](https://docs.greatexpectations.io/en/latest/contributing/contribution_checklist.html?highlight=checklist) of my own code
+- [ ] My code follows the Great Expectations [style guide](https://docs.greatexpectations.io/docs/contributing/style_guides/code_style)
+- [ ] I have performed a [self-review](https://docs.greatexpectations.io/docs/contributing/contributing_checklist) of my own code
 - [ ] I have commented my code, particularly in hard-to-understand areas
 - [ ] I have made corresponding changes to the documentation
-- [ ] I have added [unit tests](https://docs.greatexpectations.io/en/latest/contributing/testing.html#contributing-testing-writing-unit-tests) where applicable and made sure that new and existing tests are passing.
+- [ ] I have added [unit tests](https://docs.greatexpectations.io/docs/contributing/contributing_test#writing-unit-and-integration-tests) where applicable and made sure that new and existing tests are passing.
 - [ ] I have run any local integration tests and made sure that nothing is broken.

2 changes: 1 addition & 1 deletion azure-pipelines.yml
@@ -349,7 +349,7 @@ stages:
             displayName: 'pytest'
   - stage: cli_integration
-    dependsOn: [lint, usage_stats_integration, required, db_integration]
+    dependsOn: [lint]
     pool:
       vmImage: 'ubuntu-latest'

2 changes: 2 additions & 0 deletions docs/changelog.md
@@ -3,6 +3,8 @@ title: Changelog
 ---

 ### Develop
+* [BUGFIX] Fix datepicker filter on data docs (#4217)
+* [FEATURE] Allow Data Docs to be rendered in night mode (#4130)

 ### 0.14.6
 * [FEATURE] Create profiler from DataContext (#4070)
2 changes: 1 addition & 1 deletion docs/intro.md
@@ -5,7 +5,7 @@ slug: /

 Welcome to Great Expectations!

-Great Expectations is the leading tool for [validating](./reference/core_concepts.md#expectations), [documenting](./reference/core_concepts.md#data-docs), and [profiling](./reference/core_concepts.md#profiling) your data to maintain quality and improve communication between teams. Head over to our [getting started](./tutorials/getting_started/intro.md) tutorial.
+Great Expectations is the leading tool for [validating](./reference/core_concepts.md#expectations), [documenting](./reference/core_concepts.md#data-docs), and [profiling](./reference/core_concepts.md#profiling) your data to maintain quality and improve communication between teams. Head over to our [getting started](./tutorials/getting_started/tutorial_overview.md) tutorial.

 Software developers have long known that automated testing is essential for managing complex codebases. Great Expectations brings the same discipline, confidence, and acceleration to data science and data engineering teams.

13 changes: 13 additions & 0 deletions docs/reference/checkpoints_and_actions.md
@@ -105,6 +105,19 @@ redefined.
 * `action_list`: actions that share the same user-defined name will be updated, otherwise a new action will be appended
 * `validations`

+:::caution API note
+
+If the use case calls for instantiating the Checkpoint explicitly, then it is crucial to ensure that only serializable
+values are passed as arguments to the constructor. Specifically, if `batch_request` is specified at any level of the
+hierarchy of the Checkpoint configuration (at the top level and/or as part of the validators list structure), then no
+runtime `batch_request` can contain `batch_data`, only a database query. This is because `batch_data` is used to specify
+dataframes (Pandas, Spark), which are not serializable (while database queries are plain text, which is serializable).
+
+The proper mechanism for specifying non-serializable parameters is to pass them dynamically to the Checkpoint `run()`
+method. Hence, in a typical scenario, one would instantiate the Checkpoint class with serializable parameters only,
+while specifying any non-serializable parameters, commonly dataframes, as arguments to the Checkpoint `run()` method.
+:::
+
 ## SimpleCheckpoint class

 For many use cases, the SimpleCheckpoint class can be used to simplify the process of specifying a Checkpoint
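
The API note added to `checkpoints_and_actions.md` turns on the difference between plain-text values and in-memory objects. As a rough, library-free illustration (not Great Expectations code), the sketch below checks whether a configuration dictionary survives JSON serialization. `FakeDataFrame`, `is_serializable`, and the dictionary layout are hypothetical stand-ins chosen for this example; only the standard-library `json` module is real.

```python
import json


class FakeDataFrame:
    """Hypothetical stand-in for an in-memory Pandas or Spark dataframe."""


def is_serializable(config: dict) -> bool:
    """Return True if the config can be written out as JSON."""
    try:
        json.dumps(config)
        return True
    except TypeError:
        # json.dumps raises TypeError for objects it cannot encode.
        return False


# A database query is plain text, so a config carrying it can be stored.
query_config = {
    "batch_request": {"runtime_parameters": {"query": "SELECT * FROM t"}}
}

# An in-memory object cannot be stored; per the note, it would be passed
# at run time rather than placed in the constructor arguments.
batch_data_config = {
    "batch_request": {"runtime_parameters": {"batch_data": FakeDataFrame()}}
}

print(is_serializable(query_config))
print(is_serializable(batch_data_config))
```

The same check could be run on any constructor arguments before instantiating a configuration object, which is one way to catch a stray dataframe early.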
2 changes: 2 additions & 0 deletions docs_rtd/changelog.rst
@@ -7,6 +7,8 @@ Changelog

 develop
 -----------------
+* [BUGFIX] Fix datepicker filter on data docs (#4217)
+* [FEATURE] Allow Data Docs to be rendered in night mode (#4130)

 0.14.6
 -----------------
2 changes: 1 addition & 1 deletion docusaurus.config.js
@@ -105,7 +105,7 @@ module.exports = {
           items: [
             {
               label: 'Getting Started',
-              to: 'docs/'
+              to: 'docs/tutorials/getting_started/tutorial_overview'
             }
           ]
         },
73 changes: 14 additions & 59 deletions great_expectations/checkpoint/checkpoint.py
@@ -24,6 +24,7 @@
     batch_request_contains_batch_data,
     get_batch_request_as_dict,
 )
+from great_expectations.core.config_peer import ConfigOutputModes, ConfigPeer
 from great_expectations.core.usage_statistics.usage_statistics import (
     get_checkpoint_run_usage_statistics,
     usage_statistics_enabled_method,
@@ -33,7 +34,6 @@
 from great_expectations.data_context.types.base import CheckpointConfig
 from great_expectations.data_context.types.resource_identifiers import GeCloudIdentifier
 from great_expectations.data_context.util import substitute_all_config_variables
-from great_expectations.util import filter_properties_dict
 from great_expectations.validation_operators import ActionListValidationOperator
 from great_expectations.validation_operators.types.validation_operator_result import (
     ValidationOperatorResult,
@@ -43,9 +43,9 @@
 logger = logging.getLogger(__name__)


-class CheckpointBase:
+class BaseCheckpoint(ConfigPeer):
     """
-    CheckpointBase class is initialized from CheckpointConfig typed object and contains all functionality
+    BaseCheckpoint class is initialized from CheckpointConfig typed object and contains all functionality
     in the form of interface methods (which can be overwritten by subclasses) and their reference implementation.
     """

@@ -176,7 +176,7 @@ def run(
         return CheckpointResult(
             run_id=run_id,
             run_results=run_results,
-            checkpoint_config=self.checkpoint_config,
+            checkpoint_config=self.config,
         )

     def get_substituted_config(
@@ -186,7 +186,7 @@ def get_substituted_config(
         if runtime_kwargs is None:
             runtime_kwargs = {}

-        config_kwargs: dict = self.get_config(mode="json_dict")
+        config_kwargs: dict = self.get_config(mode=ConfigOutputModes.JSON_DICT)

         template_name: Optional[str] = runtime_kwargs.get("template_name")
         if template_name:
@@ -212,7 +212,7 @@ def _get_substituted_template(
             checkpoint: Checkpoint = self.data_context.get_checkpoint(
                 name=template_name
             )
-            template_config: dict = checkpoint.checkpoint_config.to_json_dict()
+            template_config: dict = checkpoint.config.to_json_dict()

             if template_config["config_version"] != source_config["config_version"]:
                 raise ge_exceptions.CheckpointError(
@@ -370,7 +370,7 @@ def _run_validation(

     def self_check(self, pretty_print=True) -> dict:
         # Provide visibility into parameters that Checkpoint was instantiated with.
-        report_object: dict = {"config": self.checkpoint_config.to_json_dict()}
+        report_object: dict = {"config": self.config.to_json_dict()}

         if pretty_print:
             print(f"\nCheckpoint class name: {self.__class__.__name__}")
@@ -416,87 +416,42 @@ def self_check(self, pretty_print=True) -> dict:

         return report_object

-    # noinspection PyShadowingBuiltins
-    def get_config(
-        self,
-        mode: str = "typed",
-        clean_falsy: bool = False,
-    ) -> Union[CheckpointConfig, dict, str]:
-        config: CheckpointConfig = self.checkpoint_config
-
-        if mode == "typed":
-            return config
-
-        if mode == "commented_map":
-            return config.commented_map
-
-        if mode == "dict":
-            config_kwargs: dict = config.to_dict()
-            if clean_falsy:
-                filter_properties_dict(
-                    properties=config_kwargs,
-                    clean_falsy=True,
-                    inplace=True,
-                )
-
-            return config_kwargs
-
-        if mode == "json_dict":
-            config_kwargs: dict = config.to_json_dict()
-            if clean_falsy:
-                filter_properties_dict(
-                    properties=config_kwargs,
-                    clean_falsy=True,
-                    inplace=True,
-                )
-
-            return config_kwargs
-
-        if mode == "yaml":
-            return config.to_yaml_str()
-
-        raise ValueError(f'Unknown mode {mode} in "CheckpointBase.get_config()".')
-
     @property
-    def checkpoint_config(self) -> CheckpointConfig:
+    def config(self) -> CheckpointConfig:
         return self._checkpoint_config

-    @checkpoint_config.setter
-    def checkpoint_config(self, value: CheckpointConfig):
-        self._checkpoint_config = value
-
     @property
     def name(self) -> Optional[str]:
         try:
-            return self.checkpoint_config.name
+            return self.config.name
         except AttributeError:
             return None

     @property
     def config_version(self) -> Optional[float]:
         try:
-            return self.checkpoint_config.config_version
+            return self.config.config_version
         except AttributeError:
             return None

     @property
     def action_list(self) -> List[Dict]:
         try:
-            return self.checkpoint_config.action_list
+            return self.config.action_list
         except AttributeError:
             return []

     @property
     def validations(self) -> List[Dict]:
         try:
-            return self.checkpoint_config.validations
+            return self.config.validations
         except AttributeError:
             return []

     @property
     def ge_cloud_id(self) -> Optional[UUID]:
         try:
-            return self.checkpoint_config.ge_cloud_id
+            return self.config.ge_cloud_id
         except AttributeError:
             return None

@@ -508,7 +463,7 @@ def __repr__(self) -> str:
         return str(self.get_config())


-class Checkpoint(CheckpointBase):
+class Checkpoint(BaseCheckpoint):
     """
     --ge-feature-maturity-info--
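
The `checkpoint.py` changes above remove a per-class `get_config()` that switched on string modes and instead inherit that method from `ConfigPeer`, with the subclass exposing only a `config` property. The sketch below is a toy version of that pattern, written for illustration and not taken from the library: `OutputMode`, `ToyConfig`, `ToyConfigPeer`, and `ToyCheckpoint` are hypothetical names, and the real `ConfigPeer` and `ConfigOutputModes` may behave differently.

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Union


class OutputMode(Enum):
    """Hypothetical stand-in for an enum of output modes."""

    TYPED = "typed"
    DICT = "dict"
    JSON_DICT = "json_dict"


class ToyConfig:
    """Hypothetical config object; not the real CheckpointConfig."""

    def __init__(self, name: str, action_list: list):
        self.name = name
        self.action_list = action_list

    def to_dict(self) -> dict:
        return {"name": self.name, "action_list": self.action_list}


class ToyConfigPeer(ABC):
    """Base class that owns get_config(); subclasses only expose `config`."""

    @property
    @abstractmethod
    def config(self) -> ToyConfig:
        ...

    def get_config(
        self, mode: OutputMode = OutputMode.TYPED
    ) -> Union[ToyConfig, dict]:
        if mode is OutputMode.TYPED:
            return self.config
        if mode in (OutputMode.DICT, OutputMode.JSON_DICT):
            return self.config.to_dict()
        # In this toy version, anything that is not an OutputMode member
        # (for example a bare string) is rejected.
        raise ValueError(f"Unknown mode {mode!r}")


class ToyCheckpoint(ToyConfigPeer):
    def __init__(self, config: ToyConfig):
        self._config = config

    @property
    def config(self) -> ToyConfig:
        return self._config


checkpoint = ToyCheckpoint(ToyConfig(name="my_checkpoint", action_list=[]))
print(checkpoint.get_config(mode=OutputMode.JSON_DICT))
```

Keeping the mode dispatch in one base class means each new config-backed class only has to say where its config lives, which is presumably the motivation for the 59 deleted lines in this file.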
37 changes: 30 additions & 7 deletions great_expectations/cli/suite.py
@@ -79,10 +79,13 @@ def suite(ctx):
 @click.option(
     "--profile",
     "-p",
-    is_flag=True,
-    default=False,
-    help="""Generate a starting expectation suite automatically so you can refine it further. Assumes --interactive
-flag.
+    "profiler_name",
+    is_flag=False,
+    flag_value="",
+    default=None,
+    help="""Generate a starting expectation suite automatically so you can refine it further.
+Takes in an optional name; if provided, a profiler of that name will be retrieved from your Data Context.
+Assumes --interactive flag.
 """,
 )
 @click.option(
@@ -106,7 +109,7 @@ def suite_new(
     expectation_suite: Optional[str],
     interactive_flag: bool,
     manual_flag: bool,
-    profile: bool,
+    profiler_name: Optional[str],
     batch_request: Optional[str],
     no_jupyter: bool,
 ) -> None:
@@ -117,6 +120,9 @@ def suite_new(
     context: DataContext = ctx.obj.data_context
     usage_event_end: str = ctx.obj.usage_event_end

+    # Only set to true if `--profile` or `--profile <PROFILER_NAME>`
+    profile: bool = _determine_profile(profiler_name)
+
     interactive_mode, profile = _process_suite_new_flags_and_prompt(
         context=context,
         usage_event_end=usage_event_end,
@@ -131,12 +137,25 @@ def suite_new(
         expectation_suite_name=expectation_suite,
         interactive_mode=interactive_mode,
         profile=profile,
+        profiler_name=profiler_name,
         no_jupyter=no_jupyter,
         usage_event=usage_event_end,
         batch_request=batch_request,
     )


+def _determine_profile(profiler_name: Optional[str]) -> bool:
+    profile: bool = profiler_name is not None
+    if profile:
+        if profiler_name:
+            msg = "Since you supplied a profiler name, utilizing the RuleBasedProfiler"
+        else:
+            msg = "Since you did not supply a profiler name, defaulting to the UserConfigurableProfiler"
+        cli_message(string=f"<yellow>{msg}</yellow>")
+
+    return profile
+
+
 def _process_suite_new_flags_and_prompt(
     context: DataContext,
     usage_event_end: str,
@@ -156,8 +175,7 @@ def _process_suite_new_flags_and_prompt(
         batch_request: --batch-request from the `suite new` CLI command
     Returns:
-        Dictionary with keys of processed parameters and boolean values e.g.
-        {"interactive": True, "profile": False}
+        Tuple with keys of processed parameters and boolean values
     """

     interactive_mode: Optional[CLISuiteInteractiveFlagCombinations]
@@ -190,6 +208,7 @@ def _suite_new_workflow(
     expectation_suite_name: Optional[str],
     interactive_mode: CLISuiteInteractiveFlagCombinations,
     profile: bool,
+    profiler_name: Optional[str],
     no_jupyter: bool,
     usage_event: str,
     batch_request: Optional[
@@ -263,6 +282,7 @@ def _suite_new_workflow(
             context=context,
             expectation_suite_name=expectation_suite_name,
             profile=profile,
+            profiler_name=profiler_name,
             usage_event=usage_event,
             interactive_mode=interactive_mode,
             no_jupyter=no_jupyter,
@@ -529,6 +549,7 @@ def suite_edit(
         context=context,
         expectation_suite_name=expectation_suite,
         profile=False,
+        profiler_name=None,
         usage_event=usage_event_end,
         interactive_mode=interactive_mode,
         no_jupyter=no_jupyter,
@@ -668,6 +689,7 @@ def _suite_edit_workflow(
     context: DataContext,
     expectation_suite_name: str,
     profile: bool,
+    profiler_name: Optional[str],
     usage_event: str,
     interactive_mode: CLISuiteInteractiveFlagCombinations,
     no_jupyter: bool,
@@ -755,6 +777,7 @@ def _suite_edit_workflow(
             renderer = SuiteProfileNotebookRenderer(
                 context=context,
                 expectation_suite_name=expectation_suite_name,
+                profiler_name=profiler_name,
                 batch_request=batch_request,
             )
             renderer.render_to_disk(notebook_file_path=notebook_path)
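
The `--profile` option in `suite.py` now has three states: absent (`None`), given without a value (empty string), and given with a profiler name. The sketch below reproduces that three-state behavior with the standard-library `argparse` module, as an analogue only: the CLI itself uses `click`, and `describe` plus the program name `suite-new-sketch` are hypothetical names made up for this example.

```python
import argparse
from typing import Optional

parser = argparse.ArgumentParser(prog="suite-new-sketch")
parser.add_argument(
    "--profile",
    "-p",
    dest="profiler_name",
    nargs="?",     # the value after the flag is optional
    const="",      # used when the flag is given without a value
    default=None,  # used when the flag is absent
)


def describe(profiler_name: Optional[str]) -> str:
    """Map the three possible states to the profiler that would be used."""
    if profiler_name is None:
        return "no profiling"
    if profiler_name == "":
        return "UserConfigurableProfiler"
    return f"RuleBasedProfiler named {profiler_name!r}"


for argv in ([], ["--profile"], ["--profile", "my_profiler"]):
    args = parser.parse_args(argv)
    print(argv, "->", describe(args.profiler_name))
```

Distinguishing `None` from the empty string is what lets `_determine_profile` compute `profile` with `profiler_name is not None` and then choose the profiler type with a plain truthiness check.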
