attempted fix for hierarchy stream primary key #28
Conversation
attempted fix for hierarchy stream primary key
overhauld of the primary key designations and flatten of schemas to make them possible
Thanks for this PR @jlloyd-widen. I'm trying to fix #30 first; generally the idea here seems right! Thanks for adding the test, I'll come back after I get #30 fixed.
I'm a little worried about responses for GAQL queries like the one at https://github.com/AutoIDM/tap-googleads/blob/main/tap_googleads/streams.py#L316 . Looks like you can have multiple nested objects returned by Google's API. Maybe we just pull the primary key out of the record and put it at the root level and leave the rest? I haven't dug in enough to know.
This is pretty well addressed by your PR. I wonder if leaving the flattening up to the USER would be appropriate here, since we could hop onto the new functionality in https://gitlab.com/meltano/sdk/-/merge_requests/236/diffs#5419fd77f507b41e5e0fe17b6fbf7375fbf2c3eb . That doesn't answer the question of how to get a primary key for each of these reports, though. The work you put in to make the primary key for each stream is good. Things I'm thinking about:
I'm not in love with the surrogate key idea. We're at a trade-off point: do we keep the data as close to the source as possible by leaving it in the same structure as before and adding a surrogate key, or do we de-nest the objects ourselves? Just questions, not really answers to anything.
Gave this a shot with some realistic data; it fails on the
Also fails on the
Added Data for one of the failures
@visch This got me thinking that the best way to handle this probably wouldn't be to introduce flattening like I have here, but instead to introduce the ability to copy (not cut or flatten) the primary keys from wherever they naturally are in the JSON structure to the root level of the record. You could use JSON paths to designate each primary key. For example, if the primary key JSON path points at object_1.example_primary_key, then:

{
  "root_data_1": "foo",
  "object_1": {
    "example_primary_key": 1,
    "other_data_1": "bar"
  }
}

would become:

{
  "root_data_1": "foo",
  "example_primary_key": 1,
  "object_1": {
    "example_primary_key": 1,
    "other_data_1": "bar"
  }
}

The downside is that this introduces duplicate data into almost all of the streams; the upsides are
I think I might do this in a separate PR to demonstrate and then we can decide on which approach is preferred. I'll link all these issues and PRs once I have it.
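To make the idea concrete, here is a minimal sketch of that copy-to-root approach. The helper name, dotted-path format, and placement of the copied key are my own assumptions for illustration; they are not the tap's actual API.

from typing import Any, Dict, List

def copy_primary_keys_to_root(record: Dict[str, Any], pk_paths: List[str]) -> Dict[str, Any]:
    """Copy each primary key value from its nested location up to the record root.

    The nested objects are left untouched, so the record keeps the shape the API
    returned; only the key values are duplicated at the top level.
    """
    for path in pk_paths:
        levels = path.split(".")
        val: Any = record
        for level in levels:
            val = val[level]      # walk down one level per path segment
        record[levels[-1]] = val  # expose the leaf value at the root
    return record

record = {"root_data_1": "foo", "object_1": {"example_primary_key": 1, "other_data_1": "bar"}}
copy_primary_keys_to_root(record, ["object_1.example_primary_key"])
# record is now {"root_data_1": "foo", "object_1": {...}, "example_primary_key": 1}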
Revert "overhauld of the primary key designations and flatten of schemas to m…"
Revert "attempted fix for hierarchy stream primary key"
I think I like that idea best! Maybe the key name could be something like _sdc_primary_key to follow Singer conventions thus far?
That makes sense for a single primary key. What's the naming convention for a composite primary key? UPDATE: Actually there isn't a convention, but the closest pattern I could come up with was
Hmm, I was thinking we'd say SCHEMA primary_keys = ['_sdc_primary_key'] and if
We may want to add a check to be sure the generated primary key is actually unique? But that might be overkill, honestly. Maybe I'm missing something? Let me know! I guess we could always just not use
That's a better idea. I'm in favor of using the
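If we did want that uniqueness check, a rough sketch (purely illustrative, not code from this PR) could simply track the generated _sdc_primary_key values per stream and warn on a collision:

import logging
from typing import Set

logger = logging.getLogger(__name__)
seen_keys: Set[str] = set()  # generated _sdc_primary_key values seen so far in this stream

def check_generated_pk(pk_value: str) -> None:
    """Warn if a generated primary key value repeats within a single sync."""
    if pk_value in seen_keys:
        logger.warning("Duplicate generated primary key: %s", pk_value)
    seen_keys.add(pk_value)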
Fix pk with config
@visch updated the PR to match the discussion. Give it a run and let me know what you think. Hopefully this also resolves the errors you were getting.
Skimmed the PR; generally looks good. I'll deep dive once we get these bugs ironed out.
@visch Your account apparently doesn't have the field available that I was using to set a primary key, so I used an alternative field in the same query. Give this new commit a shot in your account and see if it works.
Ran this and it did work! Doing a code review now.
# walk the nested record down one level per path segment
for i, level in enumerate(levels):
    val = val[level]
    # at the leaf, append the value to the composite key string (":"-separated)
    if i == len(levels) - 1:
        pk_str += ":" + str(val) if pk_str else str(val)
I know for MSSQL there's a limit to the size a PK index can be: https://docs.microsoft.com/en-us/sql/relational-databases/tables/primary-and-foreign-key-constraints?view=sql-server-ver15#:~:text=A%20table%20can%20contain%20only,key%20length%20of%20900%20bytes.
Postgres has a similar limit, I believe. I think we're fine for now, but we may have to evaluate this at some point and maybe just md5 the whole thing.
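If key length ever does become a problem, a sketch of the md5 idea above (not what the tap does today) would be to hash the composite string so the stored key stays a fixed 32 characters:

import hashlib

def hashed_pk(pk_str: str) -> str:
    # md5 keeps the stored primary key at 32 hex characters, well under MSSQL's 900-byte index limit
    return hashlib.md5(pk_str.encode("utf-8")).hexdigest()

hashed_pk("1234567890:9876543210:2022-01-01")  # -> 32-character hex digest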
Made some changes to queries while this was being written. We'll have to merge this PR, but I think we'll be pretty close to set after that. I'll go ahead and merge them and make a PR for this branch on your GitHub. We'll see how this goes :D
Merged with latest changes on AutoIDM main branch
Merged your PR on my repo's branch, so this should not have merge conflicts anymore and should be up to date.
Added a few more things (you were too fast for me, and I forgot a few things, ha). Could you run the latest stuff with your Google Ads data to be sure it works on your side? After that I think we're good to go. Part of me wants to run an audit on some data here to be sure we didn't miss any data in the migration. I'll think about that; if this works on your side I may just merge it!
@@ -0,0 +1,32 @@
from tap_googleads.utils import replicate_pk_at_root
Thanks for writing tests!
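For readers following along, a test for this helper might look roughly like the sketch below. The (record, key_paths) signature of replicate_pk_at_root is an assumption on my part and may not match the actual implementation in this PR.

from tap_googleads.utils import replicate_pk_at_root

def test_replicate_pk_at_root():
    # hypothetical signature: (record, list of dotted key paths)
    record = {"object_1": {"example_primary_key": 1, "other_data_1": "bar"}}
    result = replicate_pk_at_root(record, ["object_1.example_primary_key"])
    assert result["example_primary_key"] == 1            # key copied to the root
    assert result["object_1"]["other_data_1"] == "bar"   # nested data left untouched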
Fix pk merged
Sorry for the happy trigger finger ;) I ran all streams locally with your latest changes. Everything ran smoothly, so I merged it. Let me know if you have any other edits.
* Sync all customers for a given stream
* Add logging to see when we retry requests
* Update currently_syncing with customerId too. Write state as soon as we update it
* Add the customerId to the bookmark keys
* Add shuffle for customerId and tap_stream_id; add shuffle unit tests
* Bug fix for when currently_syncing is null
* Fix exception handling typeError
* Fix none cases for currently_syncing
* Fix currently_syncing to write a tuple we can read in later
* Add get_customer_ids so we can use it in the tests
* Fix manipulated_state to account for customer_ids
* Update assertion for currently_syncing
* Fix currently syncing assertion
* Move bookmark access into Full Table assertions section Full Table doesn't need the "stream_name and customer id" key logic
* Remove duplicate assertion
* Revert 6db016e7ec29c2b00973b671c1efdf9451aca9c2
* Update bookmark to read stream->customer->replication_key
* Update tap to write bookmarks as stream->customer->replication_key
* Update manipulated state to nest stream->customer->replication_key
* Run bookmark assertions for every customer
* Fix dict comprehension typo
* Fix conflict with main
* Remove `get_state_key` again, use env var instead of hardcoded value
* Add missing dependency
* Move currently-syncing-null-out to the end of sync to prevent gaps
* Sort selected_streams and customers to guarantee consistency across runs
* Don't let the tap write (None, None)
* Sort selected_streams and customers effectively
* Update currently_syncing test assertions
* Add sort functions for streams and customers
* Update `shuffle` to handle a missing value
* Update unit tests to use sort_function, add a test for shuffling streams
* Add end date (AutoIDM#28)
* Add optional end date, add unit tests
Co-authored-by: Andy Lu <[email protected]>
* Test functions can't be named run_test apparently
* Rename do_thing
* Extract `get_queries_from_sync` as a function
* Remove unused variable
* Refactor tests to be more explicit
* Mock singer.utils.now to return a specific date
Co-authored-by: Andy Lu <[email protected]>
* add conversion_window test
* fixed conversion window unittests, bug removed
Co-authored-by: dylan-stitch <[email protected]>
Co-authored-by: Andy Lu <[email protected]>
Co-authored-by: kspeer <[email protected]>
* add conversion window test
* add conversion window test
* wip updated tests to worka with currently syncing dev branch [skip ci]
* Revert removal of metric compatibility removal (AutoIDM#29)
* Revert removal of metric compatibility removal
* Whitespace cleanup
* Add currently syncing (AutoIDM#24)
* Sync all customers for a given stream
* Add logging to see when we retry requests
* Update currently_syncing with customerId too. Write state as soon as we update it
* Add the customerId to the bookmark keys
* Add shuffle for customerId and tap_stream_id; add shuffle unit tests
* Bug fix for when currently_syncing is null
* Fix exception handling typeError
* Fix none cases for currently_syncing
* Fix currently_syncing to write a tuple we can read in later
* Add get_customer_ids so we can use it in the tests
* Fix manipulated_state to account for customer_ids
* Update assertion for currently_syncing
* Fix currently syncing assertion
* Move bookmark access into Full Table assertions section Full Table doesn't need the "stream_name and customer id" key logic
* Remove duplicate assertion
* Revert 6db016e7ec29c2b00973b671c1efdf9451aca9c2
* Update bookmark to read stream->customer->replication_key
* Update tap to write bookmarks as stream->customer->replication_key
* Update manipulated state to nest stream->customer->replication_key
* Run bookmark assertions for every customer
* Fix dict comprehension typo
* Fix conflict with main
* Remove `get_state_key` again, use env var instead of hardcoded value
* Add missing dependency
* Move currently-syncing-null-out to the end of sync to prevent gaps
* Sort selected_streams and customers to guarantee consistency across runs
* Don't let the tap write (None, None)
* Sort selected_streams and customers effectively
* Update currently_syncing test assertions
* Add sort functions for streams and customers
* Update `shuffle` to handle a missing value
* Update unit tests to use sort_function, add a test for shuffling streams
* Add end date (AutoIDM#28)
* Add optional end date, add unit tests
Co-authored-by: Andy Lu <[email protected]>
* Test functions can't be named run_test apparently
* Rename do_thing
* Extract `get_queries_from_sync` as a function
* Remove unused variable
* Refactor tests to be more explicit
* Mock singer.utils.now to return a specific date
Co-authored-by: Andy Lu <[email protected]>
* add conversion_window test
* fixed conversion window unittests, bug removed
Co-authored-by: dylan-stitch <[email protected]>
Co-authored-by: Andy Lu <[email protected]>
Co-authored-by: kspeer <[email protected]>
* Bump to v0.2.0, update changelog (AutoIDM#31)
* Bump to v0.2.0, update changelog
* Add link for this PR, fix link syntax
* Update changelog format
* expanded conversion window testing for error case, BUG linked
* parallelism 8 -> 12
* added unittest for start date within conversion window
Co-authored-by: kspeer <[email protected]>
Co-authored-by: Dylan <[email protected]>
Co-authored-by: dylan-stitch <[email protected]>
Co-authored-by: Andy Lu <[email protected]>
fixes #18. Specifically, it fixes the primary key errors mentioned in that issue. This required flattening the JSON structures returned by the API. There were also several instances where query terms were missing from the Google query, and the primary keys were simply incorrect.
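As a hedged illustration of the missing-query-terms point (the GAQL fields below are standard Google Ads API fields, but the class and key names are placeholders, not this tap's actual code): a stream's GAQL has to select every field the tap later declares as a primary key, otherwise the key values never appear in the API response at all.

# Illustration only: the declared keys must be covered by fields in the SELECT clause.
CAMPAIGN_GAQL = """
    SELECT customer.id, campaign.id, campaign.name
    FROM campaign
"""

class CampaignStreamSketch:
    # placeholder names, not the tap's real stream class or key naming
    primary_keys = ["customer_id", "campaign_id"]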