Tap refactoring using singer generator #47
```json | ||
{ | ||
"start_date": "2019-01-01T00:00:00Z", | ||
"user_agent": "tap-appsflyer <api_user_email@your_company.com>", | ||
"app_id": "abc1e2swewe", | ||
"api_token": "askawqewdqwer123445666" | ||
... | ||
} |
```json | |
{ | |
"start_date": "2019-01-01T00:00:00Z", | |
"user_agent": "tap-appsflyer <api_user_email@your_company.com>", | |
"app_id": "abc1e2swewe", | |
"api_token": "askawqewdqwer123445666" | |
... | |
} | |
```json | |
{ | |
"start_date": "2019-01-01T00:00:00Z", | |
"user_agent": "tap-appsflyer <api_user_email@your_company.com>", | |
"app_id": "abc1e2swewe", | |
"api_token": "askawqewdqwer123445666" | |
... | |
}``` |
See the Singer docs on discovery mode | ||
[here](https://github.com/singer-io/getting-started/blob/master/docs/DISCOVERY_MODE.md | ||
|
||
5. Run the Tap in Sync Mode (with catalog) and [write out to state file](https://github.com/singer-io/getting-started/blob/master/docs/RUNNING_AND_DEVELOPING.md |
See the Singer docs on discovery mode | |
[here](https://github.com/singer-io/getting-started/blob/master/docs/DISCOVERY_MODE.md | |
5. Run the Tap in Sync Mode (with catalog) and [write out to state file](https://github.com/singer-io/getting-started/blob/master/docs/RUNNING_AND_DEVELOPING.md | |
See the Singer docs on discovery mode | |
[here](https://github.com/singer-io/getting-started/blob/master/docs/DISCOVERY_MODE.md) | |
5. Run the Tap in Sync Mode (with catalog) and [write out to state file](https://github.com/singer-io/getting-started/blob/master/docs/RUNNING_AND_DEVELOPING.md) |
|
||
To [check the tap](https://github.com/singer-io/singer-tools) | ||
```bash | ||
> tap-mixpanel --config tap_config.json --catalog catalog.json | singer-check-tap > state.json |
> tap-mixpanel --config tap_config.json --catalog catalog.json | singer-check-tap > state.json | |
> tap-appsflyer --config tap_config.json --catalog catalog.json | singer-check-tap > state.json |
data following the [Singer spec](https://github.com/singer-io/getting-started/blob/master/SPEC.md). | ||
This is a [Singer](https://singer.io) tap that produces JSON-formatted data | ||
following the [Singer | ||
spec](https://github.com/singer-io/getting-started/blob/master/SPEC.md). |
spec](https://github.com/singer-io/getting-started/blob/master/SPEC.md). | |
spec](https://github.com/singer-io/getting-started/blob/master/docs/SPEC.md). |
url="http://singer.io", | ||
classifiers=["Programming Language :: Python :: 3 :: Only"], | ||
py_modules=["tap_appsflyer"], | ||
install_requires= ["attrs==25.1.0", "singer-python==6.1.0", "requests==2.32.3", "backoff==2.2.1"], |
install_requires= ["attrs==25.1.0", "singer-python==6.1.0", "requests==2.32.3", "backoff==2.2.1"], | |
install_requires= ["attrs==25.1.0", | |
"singer-python==6.1.0", | |
"requests==2.32.3", | |
"backoff==2.2.1"], |
sync(client=client, | ||
config=parsed_args.config, | ||
catalog=parsed_args.catalog, | ||
state=state) |
sync(client=client, | |
config=parsed_args.config, | |
catalog=parsed_args.catalog, | |
state=state) | |
sync(client=client, | |
config=parsed_args.config, | |
catalog=parsed_args.catalog, | |
state=state) |
if config_request_timeout and float(config_request_timeout): | ||
self.request_timeout = float(config_request_timeout) | ||
else: | ||
self.request_timeout = REQUEST_TIMEOUT |
Logically your code is correct, just a suggestion.
if config_request_timeout and float(config_request_timeout): | |
self.request_timeout = float(config_request_timeout) | |
else: | |
self.request_timeout = REQUEST_TIMEOUT | |
self.request_timeout = float(config_request_timeout) if config_request_timeout else REQUEST_TIMEOUT |
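One caveat on the one-line form: it is not strictly equivalent to the original four lines. The original falls back to `REQUEST_TIMEOUT` when the configured value parses to `0.0`, while the conditional expression would keep `0.0` for a non-empty string such as `"0"`. A minimal sketch of a helper that preserves the original fallback is below; the name `resolve_timeout` and the default of 300 seconds are assumptions for illustration, not taken from this PR.

```python
REQUEST_TIMEOUT = 300  # assumed default; the tap defines its own constant


def resolve_timeout(config_request_timeout, default=REQUEST_TIMEOUT):
    """Return the configured timeout as a float, or the default when unset or zero.

    Hypothetical helper mirroring the original if/else behaviour.
    """
    if not config_request_timeout:
        # Covers None, empty string, and numeric zero.
        return default
    # float() raises ValueError on non-numeric input, as the original code does.
    parsed = float(config_request_timeout)
    # A value that parses to 0.0 (e.g. "0") also falls back to the default.
    return parsed if parsed else default
```

With this helper the constructor line would read `self.request_timeout = resolve_timeout(config_request_timeout)`.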
while True: | ||
response = self.client.get( | ||
extraction_url, self.params, self.headers | ||
) | ||
if not response: | ||
LOGGER.warning("No records found on Page %s", page_count) | ||
break | ||
|
||
with singer.metrics.http_request_timer(self.parse_source_from_url(self.client.base_url)) as timer: | ||
resp = SESSION.send(response) | ||
timer.tags[singer.metrics.Tag.http_status_code] = resp.status_code | ||
return resp |
Remove while here.
if value.lower() == "TRUE".lower(): | ||
record[field_name] = True | ||
else: | ||
record[field_name] = False |
if value.lower() == "TRUE".lower(): | |
record[field_name] = True | |
else: | |
record[field_name] = False | |
record[field_name] = value.lower() == "true" |
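The simplified comparison behaves the same as the original branch for any string input. Both forms raise `AttributeError` if `value` is not a string (for example `None`). A small sketch of a guarded variant follows; `parse_bool` is a hypothetical name, and treating non-string values as `False` is an assumption rather than something this PR specifies.

```python
def parse_bool(value):
    """Map a CSV cell to a bool: only a case-insensitive "true" is True.

    Hypothetical helper; returning False for non-string input is an assumption.
    """
    if not isinstance(value, str):
        return False
    return value.lower() == "true"
```

The assignment would then become `record[field_name] = parse_bool(value)`.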
from_datetime = bookmark_date =self.get_bookmark(state) | ||
to_datetime = self.get_stop(from_datetime, datetime.datetime.now(pytz.utc)) | ||
|
||
current_max_bookmark_date = bookmark_date_to_utc = bookmark_date |
from_datetime = bookmark_date =self.get_bookmark(state) | |
to_datetime = self.get_stop(from_datetime, datetime.datetime.now(pytz.utc)) | |
current_max_bookmark_date = bookmark_date_to_utc = bookmark_date | |
bookmark_date = self.get_bookmark(state) | |
from_datetime = bookmark_date | |
to_datetime = self.get_stop(from_datetime, datetime.datetime.now(pytz.utc)) | |
bookmark_date_to_utc = bookmark_date | |
current_max_bookmark_date = bookmark_date_to_utc |
Do the same wherever required?
try: | ||
first_line = next(csv_data) | ||
# Push the line back into the iterator | ||
csv_data = iter([first_line] + list(csv_data)) | ||
except StopIteration: | ||
LOGGER.warning("No data available in the CSV.") | ||
return state | ||
reader = csv.DictReader(csv_data, fieldnames) | ||
|
||
try: | ||
next(reader) # Skip the header row | ||
except StopIteration: | ||
LOGGER.warning("No data available after header row.") | ||
return state |
try: | |
first_line = next(csv_data) | |
# Push the line back into the iterator | |
csv_data = iter([first_line] + list(csv_data)) | |
except StopIteration: | |
LOGGER.warning("No data available in the CSV.") | |
return state | |
reader = csv.DictReader(csv_data, fieldnames) | |
try: | |
next(reader) # Skip the header row | |
except StopIteration: | |
LOGGER.warning("No data available after header row.") | |
return state | |
try: | |
reader = csv.DictReader(csv_data, fieldnames) | |
next(reader) # Skip the header row | |
except StopIteration: | |
LOGGER.warning("No data available in the CSV.") | |
return state |
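The merged `try` block relies on `next()` raising `StopIteration` when a `csv.DictReader` with explicit `fieldnames` is given empty input, so the separate first-line check is not needed. A self-contained sketch with in-memory data is below; `read_rows` and the field names are hypothetical, and the function returns a list instead of the tap's `state` to keep the example standalone.

```python
import csv
import io
import logging

LOGGER = logging.getLogger(__name__)


def read_rows(csv_data, fieldnames):
    """Return the data rows as dicts, skipping the header row.

    Hypothetical helper; returns an empty list when the input has no lines.
    """
    reader = csv.DictReader(csv_data, fieldnames)
    try:
        next(reader)  # Skip the header row
    except StopIteration:
        LOGGER.warning("No data available in the CSV.")
        return []
    return list(reader)


fieldnames = ["event_time", "event_name"]  # hypothetical column names
empty = read_rows(io.StringIO(""), fieldnames)
rows = read_rows(
    io.StringIO("Event Time,Event Name\n2019-01-01 00:00:00,install\n"),
    fieldnames,
)
```

Note that a header-only file does not trigger the warning in either version: `next(reader)` succeeds on the header, and the following loop simply iterates zero rows.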
for i, row in enumerate(reader): | ||
xform_record = self.xform(row) | ||
transformed_record = transformer.transform( | ||
xform_record, schema, stream_metadata | ||
) | ||
try: | ||
record_timestamp = strptime_to_utc( | ||
transformed_record[self.replication_keys[0]] | ||
) | ||
except KeyError as _: | ||
LOGGER.error( | ||
"Unable to process Record, Exception occurred: %s for stream %s", | ||
_, | ||
self.__class__, | ||
) | ||
continue |
for i, row in enumerate(reader): | |
xform_record = self.xform(row) | |
transformed_record = transformer.transform( | |
xform_record, schema, stream_metadata | |
) | |
try: | |
record_timestamp = strptime_to_utc( | |
transformed_record[self.replication_keys[0]] | |
) | |
except KeyError as _: | |
LOGGER.error( | |
"Unable to process Record, Exception occurred: %s for stream %s", | |
_, | |
self.__class__, | |
) | |
continue | |
for _, row in enumerate(reader): | |
xform_record = self.xform(row) | |
transformed_record = transformer.transform( | |
xform_record, schema, stream_metadata | |
) | |
try: | |
record_timestamp = strptime_to_utc( | |
transformed_record[self.replication_keys[0]] | |
) | |
except KeyError as ex: | |
LOGGER.error( | |
"Unable to process Record, Exception occurred: %s for stream %s", | |
ex, | |
self.__class__, | |
) | |
continue |
Requested changes inline.
Description of change
singer-python upgrade to 6.1.0 and backoff to 2.2.1
Manual QA steps
Risks
Rollback steps
AI generated code
https://internal.qlik.dev/general/ways-of-working/code-reviews/#guidelines-for-ai-generated-code