Conversation
wlhgtc commented Jul 12, 2020
- support multi-layer decoders
- return all top_k_predictions (tokens) from beam_search
- support loading pre-trained embedding files for the target embedding
Can you run the auto formatter and make sure all the tests pass?
It's hard to review with random formatting changes everywhere, and I suspect that this code doesn't work yet.
target_embedding_dim: int = None,
scheduled_sampling_ratio: float = 0.0,
use_bleu: bool = True,
bleu_ngram_weights: Iterable[float] = (0.25, 0.25, 0.25, 0.25),
Can you add the new parameters at the end, so that the code stays backwards compatible?
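A minimal sketch of the ordering being asked for, using a toy stand-in class rather than the real model (the actual constructor has more parameters than shown here): existing parameters keep their positions, and new ones are appended at the end with defaults so existing positional calls keep working.

```python
from typing import Iterable


class Seq2SeqLike:
    """Toy stand-in illustrating parameter ordering, not the actual model class."""

    def __init__(
        self,
        max_decoding_steps: int,
        attention=None,
        beam_size: int = None,
        target_embedding_dim: int = None,
        scheduled_sampling_ratio: float = 0.0,
        use_bleu: bool = True,
        # New parameters come after every pre-existing one and carry defaults,
        # so an existing positional call like Seq2SeqLike(50, None, 5) still works.
        bleu_ngram_weights: Iterable[float] = (0.25, 0.25, 0.25, 0.25),
        target_decoder_layers: int = 1,
    ) -> None:
        self.bleu_ngram_weights = bleu_ngram_weights
        self.target_decoder_layers = target_decoder_layers
```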
Finished!
) -> None:
    super().__init__(vocab)
    self.source_embedding_dim = source_embedding_dim
Why no underscore before _source_embedding_dim?
Why did you add this parameter at all? Isn't it possible to get the source embedding dimension from the source embedder, without having to specify it? Also, if it's added here it needs to be added to the documentation.
This parameter is useful when you have extra features.
Suppose you have both word embeddings (600 dim) and POS tag embeddings (600 dim). The combined embedding then becomes (batch, length, 1200), but the encoder accepts 600-dim tensors, so you either project with a linear layer or add the two directly (BERT-style). This parameter helps with that.
But I haven't finished all of the "feature_merge" code in an elegant way, so I removed it.
I will add some modules to finish it in the future.
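A rough sketch of the kind of feature merging described above (the module and its names are illustrative, not code from this PR): concatenate the 600-dim word and POS embeddings into a 1200-dim tensor and project back down to the encoder's 600 dims, or simply add them.

```python
import torch
from torch import nn


class FeatureMerge(nn.Module):
    """Merge word and POS-tag embeddings so the result matches the encoder input dim."""

    def __init__(self, word_dim: int = 600, pos_dim: int = 600,
                 encoder_dim: int = 600, mode: str = "project") -> None:
        super().__init__()
        self.mode = mode
        # Used only in "project" mode: (word_dim + pos_dim) -> encoder_dim.
        self.projection = nn.Linear(word_dim + pos_dim, encoder_dim)

    def forward(self, word_emb: torch.Tensor, pos_emb: torch.Tensor) -> torch.Tensor:
        if self.mode == "add":
            # Element-wise sum, BERT-style; requires word_dim == pos_dim == encoder_dim.
            return word_emb + pos_emb
        # (batch, length, word_dim + pos_dim) -> (batch, length, encoder_dim)
        return self.projection(torch.cat([word_emb, pos_emb], dim=-1))
```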
max_decoding_steps: int,
attention: Attention = None,
beam_size: int = None,
decoder_layers: int = 2,
Should this be called target_decoder_layers, and default to 1?
Finished.
# if len(indices.shape) > 1:
#     indices = indices[0]
batch_predicted_tokens = []
for indices in top_k_predictions:
Why is the extra loop necessary now?
The original code only returned the top-1 result from beam search. That's not convenient if we want to evaluate a top-5 score (or pick a result with some hand-crafted algorithm), so I used the code segment from CopyNet to get all of the results.
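A sketch of what returning every beam looks like, assuming top_k_predictions has shape (batch_size, beam_size, max_length) and an end_index marking the end symbol (both names are assumptions, not taken from the PR):

```python
from typing import Dict, List

import torch


def beams_to_tokens(
    top_k_predictions: torch.Tensor,   # (batch_size, beam_size, max_length)
    index_to_token: Dict[int, str],
    end_index: int,
) -> List[List[List[str]]]:
    """Turn every beam of every batch element into a list of token strings."""
    all_predicted_tokens = []
    for top_k in top_k_predictions:            # over the batch
        batch_predicted_tokens = []
        for indices in top_k:                  # over the beams (the "extra loop")
            ids = indices.tolist()
            if end_index in ids:
                ids = ids[: ids.index(end_index)]   # stop at the end symbol
            batch_predicted_tokens.append([index_to_token[i] for i in ids])
        all_predicted_tokens.append(batch_predicted_tokens)
    return all_predicted_tokens
```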
1. fix formatting issues 2. rename (and remove) some parameters
The pretrained tests were broken in
I have the same question here that I have for allenai/allennlp#4462: Would it not be easier to flatten/unflatten the decoder state in the model, so that from the outside it looks exactly the same, and all existing code that works with encoder/decoder models doesn't need any changes?
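A minimal sketch of the flatten/unflatten idea being suggested (shapes and helper names are assumptions, not code from either PR): reshape the (num_layers, batch, hidden) LSTM state to (batch, num_layers * hidden) so outside code still sees a single 2-D state tensor, and reshape it back before the next decoder step.

```python
import torch


def flatten_decoder_state(state: torch.Tensor) -> torch.Tensor:
    """(num_layers, batch, hidden) -> (batch, num_layers * hidden)."""
    num_layers, batch, hidden = state.shape
    return state.transpose(0, 1).reshape(batch, num_layers * hidden)


def unflatten_decoder_state(state: torch.Tensor, num_layers: int) -> torch.Tensor:
    """(batch, num_layers * hidden) -> (num_layers, batch, hidden)."""
    batch, flat = state.shape
    hidden = flat // num_layers
    return state.reshape(batch, num_layers, hidden).transpose(0, 1).contiguous()
```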
@dirkgr After thinking carefully about your advice, I prefer the first option; for the second one we would need to repeat the same code in many models.
@matt-gardner Sorry, it's on my side. This PR aims to support a multi-layer decoder in the seq2seq model.
The code here will run against the latest master version of allennlp, but you might have to merge/resolve conflicts here before that happens.
step : `int`
    The time step in beam search decoding.

>>>>>>> 5d9098f6084a12da77b02d40e0d9392113aeb805
You checked in some unmerged files. I don't think it's serious, but we can't merge it like this.
fixed
for predicted_token in predicted_tokens:
    assert all(isinstance(x, str) for x in predicted_token)
predicted_tokens is now a list of lists?
Yes, it contains the top-n sequences; you can see it here.
I don't know about the SSH failure. @epwalsh, is it possible that this test can never succeed when the PR comes from a fork?
# Conflicts:
#   allennlp_models/generation/models/simple_seq2seq.py
Just fixed the SSH issue with the docs. There was another build error in that job but it was because of bad formatting in a docstring. I think my suggestion would fix that though.
fix doc format
Co-authored-by: Evan Pete Walsh <[email protected]>
@epwalsh Thanks for your advice. Now all tests pass; can we merge it into master?
Thanks for sticking with it!