Fix missing column and variable reference before assignment #2
Conversation
Signed-off-by: Anna Jung (VMware) <[email protected]>
@difince @enyinna1234 @pramodrj07 @tzstoyanov PTAL, thanks!
Are the changes in README.md related to the other commits in any way?
```diff
 if model_type == 'CNN':
     model = BaseCNN()
-elif model_type == 'LSTM':
+if model_type == 'LSTM':
```
Since I am still trying to understand the code base, can I ask the intention behind this change?
When a user passes a model type that is not 'CNN' or 'LSTM', it raises a "variable referenced before assignment" error. Therefore, I made CNN the default by removing the elif, so the model is always initialized. I also added input validation to make sure that users can only pass CNN or LSTM.
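The fix described above can be sketched as follows (a minimal sketch: `BaseCNN`/`BaseLSTM` here are placeholder stand-ins for the repo's real model classes, and `build_model` is a hypothetical helper, not the actual function in `run.py`):

```python
# Placeholder stand-ins for the repo's model classes.
class BaseCNN:
    pass

class BaseLSTM:
    pass

def build_model(model_type):
    # Validate first: only 'CNN' and 'LSTM' are accepted.
    if model_type not in ('CNN', 'LSTM'):
        raise Exception("Model must be either CNN or LSTM")
    # After validation, anything that is not 'LSTM' must be 'CNN',
    # so 'model' is always assigned and no elif is needed.
    model = BaseCNN()
    if model_type == 'LSTM':
        model = BaseLSTM()
    return model
```

With the old `if`/`elif` and no validation, an unknown model type left `model` unassigned, which triggered the reference-before-assignment error at the first use.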
I think that setting a model default value should be in the argument definition:
parser.add_argument('model', default='CNN', help='Model type to use for training')
It is more straightforward and keeps the code clean.
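As a side note on the suggestion above: for a positional argument, argparse only falls back to `default` when `nargs='?'` is also set, so a runnable sketch of this option would look like (parser name and help text mirror the snippet, not the full parser in `run.py`):

```python
import argparse

parser = argparse.ArgumentParser()
# A positional argument needs nargs='?' for its default to apply.
parser.add_argument('model', nargs='?', default='CNN',
                    help='Model type to use for training')

args = parser.parse_args([])   # no model on the command line
print(args.model)              # falls back to 'CNN'
```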
Sorry for the confusion, I should not have used the word default. I don't want to set CNN as the default. The user should always explicitly pass in the algorithm they want to use, with the flags appropriate for that training.
With the input validation in place, the else branch will always be CNN.
Ah, ok - I didn't get that logic. So in that case maybe it is better to extend the validation to exit with an error if the model parameter is missing? I see two cases:
- set the model's default value in the argument definition, or
- if the model is a mandatory parameter, exit with an error when it is missing
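For the second case, argparse already provides this behavior: a positional argument is mandatory by default, and `parse_args` exits with a usage error when it is missing. A minimal sketch (parser names here are illustrative, not the full parser in `run.py`):

```python
import argparse

parser = argparse.ArgumentParser()
# Positional arguments are required by default.
parser.add_argument('model', help='Model type to use for training')

try:
    parser.parse_args([])      # no model supplied
except SystemExit:
    # argparse printed a usage error and tried to exit.
    print("argparse exited: the 'model' argument is required")
```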
I do have validation logic that throws an exception - see line 56.
I see, so in that case model must be either CNN or LSTM in run(). Thanks!
@pramodrj07 no, it's a separate commit to update just the docs with the correct path to run the files.
Okay!! LGTM.
```diff
@@ -51,6 +51,10 @@ def run(annotated_filename, dataset_filename, outcome, encoding_type, model_type
     parser.add_argument('-pad', action='store_true', default=False, help='Pad total length of each pull')
 
+    args = parser.parse_args()
+
+    if args.model != 'CNN' and args.model != 'LSTM':
+        raise Exception("Model must be either CNN or LSTM")
```
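An alternative worth noting (not what the PR does, just a common argparse idiom): the `choices` parameter performs the same validation and documents the supported values in `--help` automatically. A sketch, with illustrative parser names:

```python
import argparse

parser = argparse.ArgumentParser()
# 'choices' rejects unsupported values with a usage error and
# lists CNN/LSTM in the generated help text.
parser.add_argument('model', choices=['CNN', 'LSTM'],
                    help='Model type to use for training')

args = parser.parse_args(['LSTM'])
print(args.model)              # LSTM
```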
I would suggest adding the supported models to the argument's help string as well, something like this:
parser.add_argument('model', help='Model type to use for training, supported CNN and LSTM')
Done! Thanks
Taking Tzvetomir's notes into consideration, LGTM.
```diff
@@ -181,9 +182,11 @@ def pullStringConversation(self, export_filename="", export=True):
                 comment_row["Body"])
             string_conversations.append(conversation.encode("ascii", "ignore").decode())
             pull_urls.append(row["URL"])
+            pull_numbers.append(row["Number"])
 
         # Export converation field dataset
```
Typo: conversation
Good catch. It looks like the same typo appears in multiple places, so I will leave that out of this PR and make a separate cleanup PR.
When the model is trained, in order to run an inference service to serve it, the model should be exported. An optional parameter "-save=name" is added to export the model with the given name. By default, the model is not exported.

Models are exported in the directory models/&lt;name&gt;-&lt;outcome&gt;/&lt;version&gt;/ and are compressed into the file models/&lt;name&gt;-&lt;outcome&gt;/&lt;name&gt;-&lt;outcome&gt;-&lt;version&gt;.tar.gz. The model's version is hardcoded to "0001"; managing different model versions is TBD.

The exported models are tested with kserve; the layout of the directories and the archive file is designed the way the kserve tensorflow predictor expects.

fixes vmware-archive#2

Signed-off-by: Tzvetomir Stoyanov (VMware) <[email protected]>
When the model is trained, in order to run an inference service to serve it, the model should be exported. Two optional parameters are introduced: "-save NAME" and "-save_version VERSION". By default, the model is not exported. If "-save NAME" is specified, the model is saved using the given NAME. If "-save_version VERSION" is specified together with "-save NAME", the model is saved using the given NAME and VERSION. The "-save_version" is ignored if "-save" is missing; by default, version "001" is used.

Models are exported in the directory models/&lt;NAME&gt;-&lt;outcome&gt;/&lt;VERSION&gt;/ and are compressed into the file models/&lt;NAME&gt;-&lt;outcome&gt;/&lt;NAME&gt;-&lt;outcome&gt;-&lt;VERSION&gt;.tar.gz.

The exported models are tested with kserve; the layout of the directories and the archive file is designed the way the kserve tensorflow predictor expects.

fixes vmware-archive#2

Signed-off-by: Tzvetomir Stoyanov (VMware) <[email protected]>
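The directory and archive layout from the commit message can be sketched like this (`export_model` is a hypothetical helper; the real code would write the trained model into the version directory before archiving, where this sketch only creates the paths):

```python
import os
import tarfile

def export_model(name, outcome, version="001", base="models"):
    # models/<NAME>-<outcome>/<VERSION>/
    model_dir = os.path.join(base, f"{name}-{outcome}", version)
    os.makedirs(model_dir, exist_ok=True)
    # models/<NAME>-<outcome>/<NAME>-<outcome>-<VERSION>.tar.gz
    archive = os.path.join(base, f"{name}-{outcome}",
                           f"{name}-{outcome}-{version}.tar.gz")
    with tarfile.open(archive, "w:gz") as tar:
        # Per the commit message, the kserve tensorflow predictor
        # expects the version directory at the top of the archive.
        tar.add(model_dir, arcname=version)
    return archive
```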
Fixes a KeyError when running run.py due to the missing 'Number' column in the annotated data.
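The failure mode is easy to reproduce in isolation (the row dict below is a hypothetical stand-in for a row of the annotated data, not the repo's actual data structure):

```python
# A row without the "Number" column, as in the un-fixed annotated data.
row = {"URL": "https://example.com/pull/1"}

try:
    pull_number = row["Number"]
except KeyError as err:
    # Reading a column that was never written raises KeyError.
    print(f"KeyError: {err}")
```

Appending `row["Number"]` during dataset export, as the diff above does, ensures the column exists when run.py later reads it.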