Skip to content
This repository has been archived by the owner on Feb 20, 2023. It is now read-only.

Extract data reformat logic into a separate file #11

Merged
merged 1 commit into from
Jan 12, 2022

Conversation

difince
Copy link
Contributor

@difince difince commented Dec 14, 2021

Extract data reformat logic into a separate file - github_data.py

Available commands:

python ml-conversational-analytic-tool/featureVector.py  <raw_data_file> --words <words_file> --name <output_file_name>
python ml-conversational-analytic-tool/github_data.py  <raw_data_file> --name <output_file_name>
python ml-conversational-analytic-tool/commentAnalysis.py "Comment Message - Merry Christmas" --words <words_file>

--name is an optional parameter (supported by github_data.py and featureFector.py)
--words is an optional parameter (supported by featureVector.py and commentAnalysis.py)

Fixes: #4

Signed-off-by: Diana Atanasova [email protected]

Copy link
Contributor

@annajung annajung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks Diana, left a few comments - PTAL

Copy link
Contributor

@ericwb ericwb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you split this PR into multiple? I see a lot of code refactoring mixed with other changes. Very large set of changes to review.

@annajung
Copy link
Contributor

@ericwb if you filter out html files, it's easier to see the changes :)
I do agree that autogenerated docs are not helpful in reviewing PRs. We should look at how we can improve this process.

@tzstoyanov
Copy link
Contributor

Any autogenerated files should be in a second commit in the PR.

@difince
Copy link
Contributor Author

difince commented Dec 15, 2021

@ericwb, @tzstoyanov @annajung I have reverted all auto-generated HTML files, so now the PR contains 5 files.
I will later create an additional PR to add all HTML files

@tzstoyanov
Copy link
Contributor

I will later create an additional PR to add all HTML files

I think it is better to have all related changes in the same PR, just add a second commit with the doc updates.

@difince
Copy link
Contributor Author

difince commented Dec 15, 2021

@tzstoyanov I agree with you that it is better to have all related changes in the same PR, having it as a separate commit is a good approach, and less error prone. I just realized that code reviews could be done for a specific commit as well. So, no need to filter out *.html files.
But IMO it would be a good practice, all additional commits created to address review comments - in the end, to be squashed to the initial commit and thus keep the log history cleaner. But if we want to keep a commit (HTML related) separate from the squash, the reviewer won't be able to use GITHub "squash and merge" action. The squash need to be done by the PR author
WDYT?

@tzstoyanov
Copy link
Contributor

But IMO it would be a good practice, all additional commits created to address review comments - in the end, to be squashed to the initial commit and thus keep the log history cleaner.

I think commits should not be squashed, as each commit is a separate logical change in the code. Keeping commits simplifies finding problems using git blame ... or git bisect ..., but may be I'm biased as this is the approach chosen in the kernel development workflow :)

Copy link
Contributor

@tzstoyanov tzstoyanov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@difince
Copy link
Contributor Author

difince commented Dec 15, 2021

@tzstoyanov sometime as a follow-up, we commit just a typo fix that just makes a noise I think. @annajung suggested that we could elaborate GitHub Actions for HTML auto-generated code.

@difince difince requested review from annajung and removed request for pramodrj07 December 15, 2021 20:30
Copy link
Contributor

@annajung annajung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you! lgtm

@difince difince requested a review from ericwb January 4, 2022 15:38
Copy link
Contributor

@ericwb ericwb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like some typos on filenames

@difince difince requested a review from pramodrj07 as a code owner January 4, 2022 15:44
@difince difince requested a review from ericwb January 4, 2022 15:46
@difince difince force-pushed the words_flag branch 2 times, most recently from cb3a919 to 502b6c7 Compare January 4, 2022 18:31
@difince difince requested a review from ericwb January 5, 2022 09:57
@difince difince requested a review from ericwb January 10, 2022 19:37
Copy link
Contributor

@ericwb ericwb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ericwb ericwb enabled auto-merge (squash) January 11, 2022 16:04
@ericwb ericwb merged commit c23e3a6 into vmware-archive:main Jan 12, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove requirements for words file for annotation preparation
5 participants