Switch to TF 2.0 and new NLU components #5266

Merged: 829 commits, Feb 26, 2020
Changes from 250 commits
9cdea3a
fix test pipelines
Ghostvv Feb 19, 2020
22775e4
black
Ghostvv Feb 19, 2020
5e94e13
Merge branch 'tf2' into tf2-val
Ghostvv Feb 19, 2020
f03c204
Merge branch 'tf2-val' into tf2-old-crf
Ghostvv Feb 19, 2020
99c8cdd
reuse existing methods
tabergma Feb 19, 2020
a175f3c
update docs
tabergma Feb 19, 2020
2623866
black formatting
tabergma Feb 19, 2020
2843007
Merge branch 'master' into tf2
tabergma Feb 19, 2020
9a34f59
fix pbar in convertfeaturizer
Ghostvv Feb 19, 2020
176927c
merge tf2
Ghostvv Feb 19, 2020
3952704
raise deprecation warning
tabergma Feb 19, 2020
0f489df
Merge branch 'tf2' into tf2-val
tabergma Feb 19, 2020
2fa3106
fix test
tabergma Feb 19, 2020
ebe604f
Merge branch 'tf2-val' into tf2-old-crf
tabergma Feb 19, 2020
83e450a
Merge branch 'tf2' into tf2-val
tabergma Feb 20, 2020
3e4b79f
Merge pull request #5259 from RasaHQ/tf2-val
tabergma Feb 20, 2020
792024d
Merge branch 'tf2' into tf2-old-crf
tabergma Feb 20, 2020
79451db
add bias feature again
tabergma Feb 20, 2020
d11a4eb
added changelog entries
tabergma Feb 20, 2020
828b795
review comments
tabergma Feb 20, 2020
b41ea6b
add missing comma
tabergma Feb 20, 2020
2dd3841
refactored model loading for convert
dakshvar22 Feb 20, 2020
b47a45d
add test for docker configs
tabergma Feb 20, 2020
1646fab
fix docs
tabergma Feb 20, 2020
8109eed
added class descriptions
dakshvar22 Feb 20, 2020
b2f70bd
Merge branch 'tf2' of github.com:RasaHQ/rasa into tf2
dakshvar22 Feb 20, 2020
2df9c9a
remove colon
Ghostvv Feb 20, 2020
b51c020
suppress logging statement for tensorflow version from transformers
dakshvar22 Feb 20, 2020
43eb701
Merge branch 'tf2' of github.com:RasaHQ/rasa into tf2
dakshvar22 Feb 20, 2020
0fb940d
use configs from files in docs
tabergma Feb 20, 2020
8199fcf
review comments
dakshvar22 Feb 20, 2020
95c92ab
Merge branch 'tf2' of github.com:RasaHQ/rasa into tf2
dakshvar22 Feb 20, 2020
f297015
add links to pipeline docs
dakshvar22 Feb 20, 2020
9822b31
added language model specific info to docs
dakshvar22 Feb 20, 2020
5fb1e09
fix typo
dakshvar22 Feb 20, 2020
0d39d24
Merge pull request #5267 from RasaHQ/tf2-old-crf
Ghostvv Feb 20, 2020
357f2c4
make sparsity configurable, Response selector is a subclass of diet s…
Ghostvv Feb 20, 2020
b6d2667
rename constant
Ghostvv Feb 20, 2020
d9f624a
Merge branch 'tf2' into tf2-params
Ghostvv Feb 20, 2020
b74c7d1
fix duplicate link
dakshvar22 Feb 20, 2020
e2caee0
use self.epochs to set current epoch
Ghostvv Feb 20, 2020
c976905
Merge pull request #5273 from RasaHQ/pipeline-docs
Ghostvv Feb 20, 2020
a75e70c
Merge branch 'tf2' into tf2-params
Ghostvv Feb 20, 2020
ccca6ac
update links in changelogs
tabergma Feb 20, 2020
ccf3789
remove diet selector
Ghostvv Feb 20, 2020
2444e01
add changelog for removed mitie docker image
tabergma Feb 20, 2020
d2426f9
add weight sparsity to the docs
Ghostvv Feb 20, 2020
c4871b4
remove doc markers
Ghostvv Feb 20, 2020
672ea5b
review comments on docs
tabergma Feb 20, 2020
f67b634
made transformers lib optional and removed a few other deps
dakshvar22 Feb 20, 2020
0223cf6
made transformers lib optional and removed a few other deps
dakshvar22 Feb 20, 2020
fff220a
merge conflict
dakshvar22 Feb 20, 2020
f1855b8
review comments on docs
tabergma Feb 20, 2020
a45468a
Update docs/nlu/components.rst
dakshvar22 Feb 20, 2020
b82ff9b
fix link in docs
Ghostvv Feb 20, 2020
2e28cc2
remove diet selector
Ghostvv Feb 20, 2020
f62c65b
remove doc markers
Ghostvv Feb 20, 2020
a8870e3
fix link in docs
Ghostvv Feb 20, 2020
e63051f
Merge branch 'tf2-selector' of https://github.com/RasaHQ/rasa into tf…
Ghostvv Feb 20, 2020
dbb6635
review comments
tabergma Feb 20, 2020
a06cff4
fix imports in tests/utitlities.py
tabergma Feb 20, 2020
1899f44
merge tf2
Ghostvv Feb 20, 2020
1a8b820
use json.dump and json.load in lexical syntactic featurizer
tabergma Feb 20, 2020
140dba9
retrieval_intent is now a constant
tabergma Feb 20, 2020
181c7a1
merge tf2
Ghostvv Feb 20, 2020
5602304
made transformers lib optional and removed a few other deps
dakshvar22 Feb 20, 2020
5621809
Update docs/nlu/components.rst
dakshvar22 Feb 20, 2020
5dc7fba
Merge branch 'transformers-pipeline' of github.com:RasaHQ/rasa into t…
dakshvar22 Feb 20, 2020
41266a0
renaming functions
tabergma Feb 20, 2020
3cfd243
droprate -> drop rate
tabergma Feb 20, 2020
6378522
bump tensorflow text to use latest versions
dakshvar22 Feb 20, 2020
c8c3c3b
fixing persisting lexical syntactic featurizer
tabergma Feb 20, 2020
ab16d5d
Merge branch 'tf2' into transformers-pipeline
dakshvar22 Feb 20, 2020
f555d09
Merge branch 'tf2' into bump_tensorflow_text
dakshvar22 Feb 20, 2020
7d61e4f
merge tf2-params
Ghostvv Feb 20, 2020
dc1b55f
Merge pull request #5275 from RasaHQ/tf2-selector
Ghostvv Feb 20, 2020
14dfdf5
improve docstrings of components
tabergma Feb 20, 2020
f1f6e77
Merge branch 'master' into tf2
tabergma Feb 20, 2020
d6b7612
merge tf2
Ghostvv Feb 20, 2020
45a11a1
update docs on model options
tabergma Feb 20, 2020
179ad4f
revert back to old requirements
dakshvar22 Feb 20, 2020
364d96a
fix merge conflict
dakshvar22 Feb 20, 2020
b6b1ff9
fix persisting and loading of ted policy
tabergma Feb 20, 2020
8ccbb39
Merge branch 'tf2' into bump_tensorflow_text
dakshvar22 Feb 20, 2020
6b4e7b6
removed unnecessary deps again
dakshvar22 Feb 20, 2020
281e14a
Merge branch 'tf2' into transformers-pipeline
dakshvar22 Feb 20, 2020
7a74f98
remove flask from test
tabergma Feb 20, 2020
f98d53e
Merge branch 'tf2' into transformers-pipeline
dakshvar22 Feb 20, 2020
ff0ae98
Merge pull request #5276 from RasaHQ/transformers-pipeline
dakshvar22 Feb 20, 2020
fadb74f
Merge branch 'tf2' into bump_tensorflow_text
dakshvar22 Feb 20, 2020
e9f5a2d
don't import raise_warning directly
tabergma Feb 20, 2020
8d57ba8
Merge branch 'master' into tf2
tabergma Feb 20, 2020
a460e5a
Merge pull request #5278 from RasaHQ/bump_tensorflow_text
dakshvar22 Feb 20, 2020
abcdc2e
Merge branch 'tf2' into tf2-sparsitz
tabergma Feb 20, 2020
7589027
review comment
tabergma Feb 20, 2020
72cc021
add missing masked_lm option to response selector
tabergma Feb 20, 2020
1c09550
Use ResponseSelector instead of DIETSelector
tabergma Feb 20, 2020
c2eef82
Merge pull request #5282 from RasaHQ/tf2-sparsitz
tabergma Feb 21, 2020
3200732
clean up NLU tests
tabergma Feb 21, 2020
8f89857
Merge branch 'tf2' into tf2-tests
tabergma Feb 21, 2020
6a205ac
update diet classifier test
tabergma Feb 21, 2020
da4b6e2
Merge branch 'master' into tf2
tabergma Feb 21, 2020
9ed77a3
Merge branch 'tf2' into tf2-tests
tabergma Feb 21, 2020
5c8ed35
clean up
tabergma Feb 21, 2020
33015fd
update example configs
tabergma Feb 21, 2020
63f5f69
reduce number of train epochs
tabergma Feb 21, 2020
eb81127
fix random seed test
tabergma Feb 21, 2020
f1cc9a7
raise exception instead of NotImplemented
tabergma Feb 21, 2020
989f5fd
added mitie docker image again
tabergma Feb 21, 2020
95f5fb5
clean up imports
tabergma Feb 21, 2020
446ff97
update config path in docker file
tabergma Feb 21, 2020
39aeed0
make comment start from capital S
Ghostvv Feb 21, 2020
dd6f1c8
refactor updating EVAL_NUM_EPOCHS
tabergma Feb 21, 2020
90c1203
fix tests
tabergma Feb 21, 2020
4d6eb7e
move pickle dump and load to io utils
tabergma Feb 21, 2020
64ff5ca
review comments
tabergma Feb 21, 2020
16466ef
review comments
tabergma Feb 21, 2020
765939f
use jsonpickle instead of pickle
tabergma Feb 21, 2020
cf84917
Merge branch 'tf2' into tf2-tests
tabergma Feb 21, 2020
8eb283f
fix types
tabergma Feb 21, 2020
cbe6f10
Merge branch 'tf2' into tf2-tests
tabergma Feb 21, 2020
d5ac30e
print warning on epochs not set.
tabergma Feb 21, 2020
b45e1f4
deprecate provides and requires in nlu
Ghostvv Feb 21, 2020
bb9bf49
Merge branch 'tf2' into tf2-required
Ghostvv Feb 21, 2020
6e76414
fix entity extractor import
Ghostvv Feb 21, 2020
9d07852
fix loading TED policy
tabergma Feb 24, 2020
3488f7f
Merge branch 'master' into tf2
tabergma Feb 24, 2020
064e946
Merge branch 'tf2' into tf2-tests
tabergma Feb 24, 2020
5bb9b7f
Merge branch 'tf2' into tf2-required
tabergma Feb 24, 2020
4f8a7ad
check if tag id dict exists.
tabergma Feb 24, 2020
9fa9bfd
Merge branch 'tf2' into tf2-tests
tabergma Feb 24, 2020
a9df694
Merge branch 'tf2' into tf2-required
tabergma Feb 24, 2020
d12ff0a
update docstrings in components.py
tabergma Feb 24, 2020
befeac5
add empty pipeline validation
Ghostvv Feb 24, 2020
0329a43
merge tf2
Ghostvv Feb 24, 2020
4dd23d0
fix refs in docstings
tabergma Feb 24, 2020
287343c
change json_pickle to pickle_dump
Ghostvv Feb 24, 2020
dc932b7
remove all traces of component.required and provides
Ghostvv Feb 24, 2020
9e3b51b
rename test
Ghostvv Feb 24, 2020
b78f5bc
merge tf2
Ghostvv Feb 24, 2020
0c281bb
force_download of HF model weights
tabergma Feb 24, 2020
fe6b90a
add docstrings to Policy
Ghostvv Feb 24, 2020
d1ae222
Merge pull request #5305 from RasaHQ/tf2-docstrings
Ghostvv Feb 24, 2020
92a90fa
fix test
Ghostvv Feb 24, 2020
2dc9215
merge tf2
Ghostvv Feb 24, 2020
6a12f02
review comments on docs
tabergma Feb 24, 2020
3c06032
fix tests
Ghostvv Feb 24, 2020
bfcde0d
Update data/configs_for_docs/default_config.yml
Ghostvv Feb 24, 2020
ac8c2ec
Update data/configs_for_docs/default_english_config.yml
Ghostvv Feb 24, 2020
8d0c3dd
Update data/configs_for_docs/default_spacy_config.yml
Ghostvv Feb 24, 2020
877e52f
Update data/configs_for_docs/pretrained_embeddings_convert_config_1.yml
Ghostvv Feb 24, 2020
4874095
Update data/configs_for_docs/pretrained_embeddings_convert_config_2.yml
Ghostvv Feb 24, 2020
0543f7a
Update data/configs_for_docs/pretrained_embeddings_spacy_config_1.yml
Ghostvv Feb 24, 2020
628b07a
Update data/configs_for_docs/pretrained_embeddings_spacy_config_2.yml
Ghostvv Feb 24, 2020
d9f06c8
Update data/configs_for_docs/supervised_embeddings_config_1.yml
Ghostvv Feb 24, 2020
5ae181d
Update data/configs_for_docs/supervised_embeddings_config_2.yml
Ghostvv Feb 24, 2020
36a5be9
Update rasa/core/policies/keras_policy.py
Ghostvv Feb 24, 2020
c4d1eef
create DOCS_URL_MIGRATION_GUIDE
Ghostvv Feb 24, 2020
1d7aead
update choosing a pipeline.
tabergma Feb 24, 2020
350bc80
undo changes
tabergma Feb 24, 2020
aa7171b
refactor config checks
Ghostvv Feb 24, 2020
ecf939b
create removal changelog
Ghostvv Feb 24, 2020
6c7f3f7
Merge pull request #5293 from RasaHQ/tf2-required
Ghostvv Feb 24, 2020
1e77c69
Update rasa/nlu/classifiers/diet_classifier.py
Ghostvv Feb 24, 2020
18f7d9a
Update rasa/nlu/classifiers/diet_classifier.py
Ghostvv Feb 24, 2020
db0d794
set num_tags to None in init
Ghostvv Feb 24, 2020
b02135c
substitute Any with Type[...]
Ghostvv Feb 24, 2020
1831f46
Update rasa/nlu/classifiers/diet_classifier.py
Ghostvv Feb 24, 2020
8728599
Update rasa/nlu/classifiers/diet_classifier.py
Ghostvv Feb 24, 2020
6bf316f
Merge branch 'tf2' into tf2-tests
tabergma Feb 24, 2020
3915a0b
add missing import
tabergma Feb 24, 2020
84bced6
fix types
tabergma Feb 24, 2020
c7e2199
Merge pull request #5290 from RasaHQ/tf2-tests
tabergma Feb 25, 2020
3f8fd72
Merge branch 'master' into tf2
tabergma Feb 25, 2020
0ae0e58
fix incorrect import
tabergma Feb 25, 2020
682455f
Merge branch 'tf2' into tf2-docs
tabergma Feb 25, 2020
2b36f1e
Merge pull request #5306 from RasaHQ/tf2-docs
tabergma Feb 25, 2020
f914fbe
documentation review comments
tabergma Feb 25, 2020
13985f5
documentation review comments
tabergma Feb 25, 2020
78b32ca
documentation review comments
tabergma Feb 25, 2020
b758670
documentation review comments
tabergma Feb 25, 2020
1cb7c53
substitute loss and sim strings with constants
Ghostvv Feb 25, 2020
12c208b
fix doc warnings
tabergma Feb 25, 2020
4504271
address rasa init problems
tabergma Feb 25, 2020
2175f27
documentation review comments
tabergma Feb 25, 2020
8e6e586
fix formatting error
akelad Feb 25, 2020
c22adff
update migration guide
tabergma Feb 25, 2020
81caf9e
modify comments
Ghostvv Feb 25, 2020
5892968
Merge branch 'tf2' of https://github.com/RasaHQ/rasa into tf2
Ghostvv Feb 25, 2020
0ecff89
update choosing a pipeline
tabergma Feb 25, 2020
9ee5cf8
add note for old terminology.
tabergma Feb 25, 2020
7a865ab
undo docker changes
tabergma Feb 25, 2020
9cc389d
refactor data helpers
Ghostvv Feb 25, 2020
82117b2
Merge branch 'tf2' of https://github.com/RasaHQ/rasa into tf2
Ghostvv Feb 25, 2020
8cfa6e5
substitute feature name strings with constants
Ghostvv Feb 25, 2020
292134a
refactor layers preparation
Ghostvv Feb 25, 2020
3057c1e
update components.rst
tabergma Feb 25, 2020
85ff063
update choosing a pipeline
tabergma Feb 25, 2020
d92f689
quick fix for docs typos/formatting
akelad Feb 25, 2020
830c66d
use migration guide constant
tabergma Feb 25, 2020
c0afb86
refactor loss and f1 helpers
Ghostvv Feb 25, 2020
9688646
Merge branch 'tf2' of https://github.com/RasaHQ/rasa into tf2
Ghostvv Feb 25, 2020
4de13a8
review comments on featurizers
tabergma Feb 25, 2020
cf96b61
fix docstrings in components
Ghostvv Feb 25, 2020
6fce774
merge tf2
Ghostvv Feb 25, 2020
935b90d
review comments on lexical_syntactic_featuirzer.
tabergma Feb 25, 2020
4f657fb
review comments on convert
tabergma Feb 25, 2020
208f5e4
review comments on hugging face components
tabergma Feb 25, 2020
b36a712
Merge branch 'master' into tf2
tabergma Feb 25, 2020
c5b337d
rename inverted tag and label dicts
Ghostvv Feb 25, 2020
cb9cd5a
Merge branch 'tf2' of https://github.com/RasaHQ/rasa into tf2
Ghostvv Feb 25, 2020
ea84dc9
remove _find_example_for_tag
Ghostvv Feb 25, 2020
dc2d5a9
remove setting numpy random seed in train
Ghostvv Feb 25, 2020
a306da7
review comments
tabergma Feb 25, 2020
be46495
create no entity tag constant
Ghostvv Feb 25, 2020
54bd698
Merge branch 'tf2' of https://github.com/RasaHQ/rasa into tf2
Ghostvv Feb 25, 2020
bb2b6cb
add type to tf_layers
Ghostvv Feb 25, 2020
d1aa219
update constants comment
Ghostvv Feb 25, 2020
12bdf87
remove magic numbers probs
Ghostvv Feb 25, 2020
4eda2e5
fix type of Data in model data
Ghostvv Feb 25, 2020
f1f6c43
add axis=
Ghostvv Feb 25, 2020
0542b28
add explanatory comments
Ghostvv Feb 25, 2020
1e8b7b9
check if responses are present.
tabergma Feb 25, 2020
937813d
review comments
tabergma Feb 25, 2020
2886ea0
add comment and type
Ghostvv Feb 25, 2020
e2e5139
Merge branch 'tf2' of https://github.com/RasaHQ/rasa into tf2
Ghostvv Feb 25, 2020
483713b
rename relative lengths
Ghostvv Feb 25, 2020
c6e6f27
remove batch_tuple_sizes
tabergma Feb 26, 2020
502ef22
review comments
tabergma Feb 26, 2020
baab754
review comments
tabergma Feb 26, 2020
1cb18d2
add docstring
tabergma Feb 26, 2020
ccf25a4
add comments to model data
Ghostvv Feb 26, 2020
d44181a
add comments to model_data
Ghostvv Feb 26, 2020
eb5cf6b
create tmp dir for convert
tabergma Feb 26, 2020
67dad7a
update type
tabergma Feb 26, 2020
e0ab5f7
add comments
Ghostvv Feb 26, 2020
d58534b
Merge branch 'tf2' of https://github.com/RasaHQ/rasa into tf2
Ghostvv Feb 26, 2020
4cf5fff
change comment
Ghostvv Feb 26, 2020
4fefe76
recalculate number of examples after balancing
Ghostvv Feb 26, 2020
8443d50
reorganize methods in model_data
Ghostvv Feb 26, 2020
1c5f9da
remove num_neg check from ted
Ghostvv Feb 26, 2020
e06caaf
update requirements
Ghostvv Feb 26, 2020
2854674
fix nlu comparison test
tabergma Feb 26, 2020
554ad4c
update requirements
Ghostvv Feb 26, 2020
c7bbae1
Merge branch 'tf2' of https://github.com/RasaHQ/rasa into tf2
Ghostvv Feb 26, 2020
727ed61
update version
Ghostvv Feb 26, 2020
c33e408
Update alt_requirements/requirements_pretrained_embeddings_convert.txt
Ghostvv Feb 26, 2020
1956976
Fixed an issue with AWS persistor
Feb 26, 2020
7229a3e
Merge branch 'master' into tf2
Ghostvv Feb 26, 2020
189355b
Merge branch 'master' into tf2
tmbo Feb 26, 2020
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# Minimum Install Requirements
-r ../requirements.txt
@dakshvar22 what was the library you said we could remove?


-tensorflow_text==1.15.1
-tensorflow_hub==0.6.0
+tensorflow_text==2.1.0rc0
+tensorflow_hub==0.7.0
4 changes: 3 additions & 1 deletion changelog/4817.improvement.rst
@@ -1,2 +1,4 @@
Part of Slack sanitization:
-Multiple garbled URL's in a string coming from slack will be converted into actual strings. ``Example: health check of <http://eemdb.net|eemdb.net> and <http://eemdb1.net|eemdb1.net> to health check of eemdb.net and eemdb1.net``
+Multiple garbled URL's in a string coming from slack will be converted into actual strings.
+``Example: health check of <http://eemdb.net|eemdb.net> and <http://eemdb1.net|eemdb1.net> to health check of
+eemdb.net and eemdb1.net``
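The conversion described in this changelog entry can be sketched in a few lines of Python. This is only an illustration of the idea: the regular expression and the function name `sanitize_slack_links` are assumptions made for this example, not the code used in the Slack connector.

```python
import re

# Slack wraps links as <http://target|label>. The pattern below is an
# illustrative assumption, not the expression used in Rasa itself.
SLACK_LINK = re.compile(r"<https?://[^|>]+\|([^>]+)>")


def sanitize_slack_links(text: str) -> str:
    """Replace every <url|label> occurrence with its plain label."""
    return SLACK_LINK.sub(r"\1", text)


message = (
    "health check of <http://eemdb.net|eemdb.net> "
    "and <http://eemdb1.net|eemdb1.net>"
)
print(sanitize_slack_links(message))
```

Text without Slack-style links passes through unchanged, because the substitution only touches spans that match the pattern.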
5 changes: 5 additions & 0 deletions changelog/5065.feature.rst
@@ -0,0 +1,5 @@
Add :ref:`LexicalSyntacticFeaturizer` to sparse featurizers.

``LexicalSyntacticFeaturizer`` does the same featurization as the ``CRFEntityExtractor``. We extracted the
featurization into a separate component so that the features can be reused and featurization is independent from the
entity extraction.
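As a rough illustration of what lexical features of this kind can look like, the sketch below computes a few per-token features for a window of one token before and after the current one. The feature names (`low`, `title`, `upper`, `digit`, `word3`, `word2`) are borrowed from the CRF feature lists in the test configs of this PR. Their definitions here, and the helper names `token_features` and `window_features`, are assumptions for illustration and not the component's actual implementation.

```python
from typing import Dict, List, Union

Feature = Union[str, bool]


def token_features(token: str) -> Dict[str, Feature]:
    """Compute simple lexical features for one token (assumed definitions)."""
    return {
        "low": token.lower(),       # lower-cased token
        "title": token.istitle(),   # starts with an upper-case letter
        "upper": token.isupper(),   # fully upper-case
        "digit": token.isdigit(),   # consists of digits only
        "word3": token[-3:],        # last three characters
        "word2": token[-2:],        # last two characters
    }


def window_features(tokens: List[str], index: int) -> Dict[str, Dict[str, Feature]]:
    """Collect features for the token at `index` and its direct neighbours."""
    features = {}
    for offset, key in ((-1, "before"), (0, "token"), (1, "after")):
        position = index + offset
        if 0 <= position < len(tokens):
            features[key] = token_features(tokens[position])
    return features


print(window_features(["fly", "to", "Berlin"], 2))
```

Keeping featurization in its own function, separate from any tagging logic, mirrors the motivation given in the changelog entry: the same features can then be reused by more than one downstream component.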
7 changes: 7 additions & 0 deletions changelog/5187.feature.rst
@@ -0,0 +1,7 @@
Integrate language models from HuggingFace's `Transformers <https://github.com/huggingface/transformers>`_ Library.

Add a new NLP component :ref:`HFTransformersNLP <HFTransformersNLP>` which
tokenizes and featurizes incoming messages using a specified pre-trained model with the Transformers library as the backend.
Add ``LanguageModelTokenizers`` and ``LanguageModelFeaturizers`` which use the information from ``HFTransformersNLP``
and sets them correctly for message object.
Language models currently supported: BERT, OpenAIGPT, GPT-2, XLNet, DistilBert, RoBERTa
15 changes: 15 additions & 0 deletions changelog/5230.feature.rst
@@ -0,0 +1,15 @@
Refactor how GPU and CPU environments are configured for TensorFlow 2.0.

Please refer to the :ref:`documentation <tensorflow_usage>` to understand
which environment variables to set in what scenarios. A couple of examples are shown below as well:

.. code-block:: python

    # This specifies to use 1024 MB of memory from GPU with logical ID 0 and 2048 MB of memory from GPU with logical ID 1
    TF_GPU_MEMORY_ALLOC="0:1024, 1:2048"

    # Specifies that at most 3 CPU threads can be used to parallelize multiple non-blocking operations
    TF_INTER_OP_PARALLELISM_THREADS="3"

    # Specifies that at most 2 CPU threads can be used to parallelize a particular operation.
    TF_INTRA_OP_PARALLELISM_THREADS="2"
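To make the `TF_GPU_MEMORY_ALLOC` format concrete, here is a minimal sketch of how a string such as `"0:1024, 1:2048"` could be turned into a mapping from logical GPU ID to memory in MB. The function name `parse_gpu_memory_alloc` is hypothetical, and the parsing rules (comma-separated `id:megabytes` pairs, surrounding whitespace ignored) are inferred from the example above rather than taken from the Rasa code.

```python
import os
from typing import Dict


def parse_gpu_memory_alloc(raw: str) -> Dict[int, int]:
    """Parse "0:1024, 1:2048" into {0: 1024, 1: 2048} (GPU ID -> MB)."""
    allocation: Dict[int, int] = {}
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty input and trailing commas
        gpu_id, memory_mb = entry.split(":")
        allocation[int(gpu_id)] = int(memory_mb)
    return allocation


# Fall back to the example value from the changelog if the variable is unset.
raw_value = os.environ.get("TF_GPU_MEMORY_ALLOC", "0:1024, 1:2048")
print(parse_gpu_memory_alloc(raw_value))
```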
12 changes: 12 additions & 0 deletions changelog/5266.feature.rst
@@ -0,0 +1,12 @@
Added a new NLU component ``DIETClassifier`` and a new policy ``TEDPolicy``.

DIET (Dual Intent and Entity Transformer) is a multi-task architecture for intent classification and entity
recognition. You can read more about this component in our :ref:`documentation <diet-classifier>`.
The new component will replace the ``EmbeddingIntentClassifier`` and the ``CRFEntityExtractor`` in the future.
Those two components are deprecated from now on.
See :ref:`migration guide <migration-to-rasa-1.8>` for details on how to
switch to the new component.

``TEDPolicy`` is the new name for ``EmbeddingPolicy``. ``EmbeddingPolicy`` is deprecated from now on.
The functionality of ``TEDPolicy`` and ``EmbeddingPolicy`` is the same. Please update your configuration file
to use the new name for the policy.
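Since the entry states that only the name changes between ``EmbeddingPolicy`` and ``TEDPolicy``, updating a configuration amounts to a rename. The sketch below applies that rename to a policy list that has already been loaded into Python dictionaries. The helper `rename_policies` and the `RENAMED_POLICIES` table are hypothetical names for this example. Loading and saving the YAML file is left out.

```python
from typing import Any, Dict, List

# Old name -> new name, as stated in the changelog entry above.
RENAMED_POLICIES = {"EmbeddingPolicy": "TEDPolicy"}


def rename_policies(policies: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return a copy of the policy list with deprecated names replaced."""
    updated = []
    for policy in policies:
        policy = dict(policy)  # shallow copy, the input stays untouched
        name = policy.get("name")
        if name in RENAMED_POLICIES:
            policy["name"] = RENAMED_POLICIES[name]
        updated.append(policy)
    return updated


old_policies = [{"name": "EmbeddingPolicy", "random_seed": 42}]
print(rename_policies(old_policies))
```

All other keys of a policy entry, such as `random_seed`, are carried over unchanged.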
1 change: 1 addition & 0 deletions changelog/5266.improvement.rst
@@ -0,0 +1 @@
We updated our code to TensorFlow 2.
9 changes: 9 additions & 0 deletions changelog/5266.misc.rst
@@ -0,0 +1,9 @@
We deprecated all existing pipeline templates, ``SklearnIntentClassifier`` and ``KerasPolicy``.

Please list the components you want to use directly in your configuration file.
Check out :ref:`Choosing a Pipeline <choosing-a-pipeline>` to decide what components to
include in your pipeline.

Use ``DIETClassifier`` instead of ``SklearnIntentClassifier``.

Use ``TEDPolicy`` instead of ``KerasPolicy``.
6 changes: 6 additions & 0 deletions changelog/663.feature.rst
@@ -0,0 +1,6 @@
The sentence vector of the ``SpacyFeaturizer`` and ``MitieFeaturizer`` can be calculated using max or mean pooling.

To specify the pooling operation, set the option ``pooling`` for the ``SpacyFeaturizer`` or the ``MitieFeaturizer``
in your configuration file. The default pooling operation is ``mean``. The mean pooling operation also does not take
into account words, that do not have a word vector.
See our :ref:`documentation <components>` for more details.
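The difference between the two pooling operations can be shown with a small pure-Python sketch. It assumes that a word without a word vector is represented by an all-zero vector, which is an assumption made for this example. The function name `pool_word_vectors` is likewise hypothetical and not part of the featurizers.

```python
from typing import List, Sequence


def pool_word_vectors(
    word_vectors: Sequence[Sequence[float]], pooling: str = "mean"
) -> List[float]:
    """Combine per-word vectors into one sentence vector."""
    if not word_vectors:
        raise ValueError("At least one word vector is required.")
    dims = len(word_vectors[0])

    if pooling == "max":
        # Element-wise maximum over all words.
        return [max(vector[d] for vector in word_vectors) for d in range(dims)]

    if pooling == "mean":
        # Skip words without a vector (assumed to be all zeros here).
        known = [v for v in word_vectors if any(x != 0.0 for x in v)]
        if not known:
            return [0.0] * dims
        return [sum(v[d] for v in known) / len(known) for d in range(dims)]

    raise ValueError(f"Unknown pooling operation: {pooling}")


vectors = [[1.0, 4.0], [3.0, 2.0], [0.0, 0.0]]
print(pool_word_vectors(vectors, "mean"))
print(pool_word_vectors(vectors, "max"))
```

In this example the third word has no vector, so the mean is taken over the first two words only, while max pooling considers every position.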
9 changes: 3 additions & 6 deletions changelog/699.misc.rst
@@ -1,6 +1,3 @@
-The `EmbeddingPolicy <https://rasa.com/docs/rasa/core/policies/#embedding-policy>`_
-replaces the ``KerasPolicy`` in new Rasa projects generated with ``rasa init``.
-The `EmbeddingPolicy <https://rasa.com/docs/rasa/core/policies/#embedding-policy>`_
-is now the recommended machine learning policy. Please see the
-`migration guide <https://rasa.com/docs/rasa/migration-guide/#rasa-1-7-to-rasa-1-8>`_
-if you want to switch to this new policy in an existing project.
+The :ref:`TEDPolicy <ted_policy>` replaces the ``KerasPolicy`` in new Rasa projects generated with ``rasa init``.
+The :ref:`TEDPolicy <ted_policy>` is now the recommended machine learning policy. Please see the
+:ref:`migration guide <migration-to-rasa-1.8>` if you want to switch to this new policy in an existing project.
14 changes: 14 additions & 0 deletions data/configs_for_docs/default_config.yml
@@ -0,0 +1,14 @@
language: "en"

pipeline:
- name: WhitespaceTokenizer
- name: RegexFeaturizer
- name: LexicalSyntacticFeaturizer
- name: CountVectorsFeaturizer
- name: CountVectorsFeaturizer
  analyzer: "char_wb"
  min_ngram: 1
  max_ngram: 4
- name: DIETClassifier
- name: EntitySynonymMapper
- name: DIETSelector
15 changes: 15 additions & 0 deletions data/configs_for_docs/default_english_config.yml
@@ -0,0 +1,15 @@
language: "en"

pipeline:
- name: ConveRTTokenizer
- name: ConveRTFeaturizer
- name: RegexFeaturizer
- name: LexicalSyntacticFeaturizer
- name: CountVectorsFeaturizer
- name: CountVectorsFeaturizer
  analyzer: "char_wb"
  min_ngram: 1
  max_ngram: 4
- name: DIETClassifier
- name: EntitySynonymMapper
- name: DIETSelector
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
language: "en"

pipeline: "pretrained_embeddings_convert"
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
language: "en"

pipeline:
- name: "ConveRTTokenizer"
- name: "ConveRTFeaturizer"
- name: "EmbeddingIntentClassifier"
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
language: "en"

pipeline: "pretrained_embeddings_spacy"
10 changes: 10 additions & 0 deletions data/configs_for_docs/pretrained_embeddings_spacy_config_2.yml
@@ -0,0 +1,10 @@
language: "en"

pipeline:
- name: "SpacyNLP"
- name: "SpacyTokenizer"
- name: "SpacyFeaturizer"
- name: "RegexFeaturizer"
- name: "CRFEntityExtractor"
- name: "EntitySynonymMapper"
- name: "SklearnIntentClassifier"
3 changes: 3 additions & 0 deletions data/configs_for_docs/supervised_embeddings_config_1.yml
@@ -0,0 +1,3 @@
language: "en"

pipeline: "supervised_embeddings"
13 changes: 13 additions & 0 deletions data/configs_for_docs/supervised_embeddings_config_2.yml
@@ -0,0 +1,13 @@
language: "en"

pipeline:
- name: "WhitespaceTokenizer"
- name: "RegexFeaturizer"
- name: "CRFEntityExtractor"
- name: "EntitySynonymMapper"
- name: "CountVectorsFeaturizer"
- name: "CountVectorsFeaturizer"
  analyzer: "char_wb"
  min_ngram: 1
  max_ngram: 4
- name: "EmbeddingIntentClassifier"
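The two `supervised_embeddings` docs configs suggest that a template name stands for an explicit list of components. Assuming the `_1` and `_2` files are meant to be equivalent, which their naming suggests but this diff does not state, the expansion from a deprecated template name to an explicit pipeline can be sketched as follows. The `TEMPLATES` table and `expand_pipeline` are hypothetical names for this example, and only the template shown above is included.

```python
import copy
from typing import Any, Dict

# Template name -> explicit component list, copied from
# supervised_embeddings_config_2.yml in this diff.
TEMPLATES = {
    "supervised_embeddings": [
        {"name": "WhitespaceTokenizer"},
        {"name": "RegexFeaturizer"},
        {"name": "CRFEntityExtractor"},
        {"name": "EntitySynonymMapper"},
        {"name": "CountVectorsFeaturizer"},
        {
            "name": "CountVectorsFeaturizer",
            "analyzer": "char_wb",
            "min_ngram": 1,
            "max_ngram": 4,
        },
        {"name": "EmbeddingIntentClassifier"},
    ],
}


def expand_pipeline(config: Dict[str, Any]) -> Dict[str, Any]:
    """Replace a template name in `pipeline` with its component list."""
    pipeline = config.get("pipeline")
    if isinstance(pipeline, str):
        # A deep copy keeps the template table safe from later edits.
        return {**config, "pipeline": copy.deepcopy(TEMPLATES[pipeline])}
    return config  # already an explicit list, nothing to do


config = {"language": "en", "pipeline": "supervised_embeddings"}
expanded = expand_pipeline(config)
print([component["name"] for component in expanded["pipeline"]])
```

A config that already lists its components is returned as-is, so the function can be applied to old and new style configs alike.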
4 changes: 2 additions & 2 deletions data/test/config_embedding_test.yml
@@ -2,5 +2,5 @@ language: en
 pipeline:
 - name: "CountVectorsFeaturizer"
   max_ngram: 3
-- name: "EmbeddingIntentClassifier"
-  epochs: 10
+- name: "DIETClassifier"
+  epochs: 10
Original file line number Diff line number Diff line change
@@ -15,7 +15,7 @@ pipeline:
 # features for word before token
 - ["low", "title", "upper", "pos", "pos2"]
 # features of token itself
-- ["bias", "low", "word3", "word2", "upper", "title", "digit", "pos", "pos2", "pattern"]
+- ["low", "word3", "word2", "upper", "title", "digit", "pos", "pos2"]
 # features for word after the token we want to tag
 - ["low", "title", "upper", "pos", "pos2"]
 max_iterations: 50
Original file line number Diff line number Diff line change
@@ -3,7 +3,7 @@ language: "en"
 pipeline:
 - name: "WhitespaceTokenizer"
 - name: "CountVectorsFeaturizer"
-- name: "EmbeddingIntentClassifier"
+- name: "DIETClassifier"
   epochs: 2
 - name: "ResponseSelector"
   epochs: 2
11 changes: 11 additions & 0 deletions data/test_config/config_pretrained_embeddings_mitie.yml
@@ -0,0 +1,11 @@
language: "en"

pipeline:
- name: "MitieNLP"
  model: "data/total_word_feature_extractor.dat"
- name: "MitieTokenizer"
- name: "MitieEntityExtractor"
- name: "EntitySynonymMapper"
- name: "RegexFeaturizer"
- name: "MitieFeaturizer"
- name: "SklearnIntentClassifier"
10 changes: 10 additions & 0 deletions data/test_config/config_pretrained_embeddings_mitie_2.yml
@@ -0,0 +1,10 @@
language: "en"

pipeline:
- name: "MitieNLP"
  model: "data/total_word_feature_extractor.dat"
- name: "MitieTokenizer"
- name: "MitieEntityExtractor"
- name: "EntitySynonymMapper"
- name: "RegexFeaturizer"
- name: "MitieIntentClassifier"
Original file line number Diff line number Diff line change
@@ -3,5 +3,6 @@ language: "en"
 pipeline:
 - name: "CountVectorsFeaturizer"
 - name: "EmbeddingIntentClassifier"
+  epochs: 2
 - name: "DucklingHTTPExtractor"
   url: "http://duckling:8000"
3 changes: 2 additions & 1 deletion data/test_config/embedding_random_seed.yaml
@@ -1,3 +1,4 @@
 policies:
-- name: EmbeddingPolicy
+- name: TEDPolicy
   random_seed: 42
+  epochs: 2
2 changes: 1 addition & 1 deletion docker/Dockerfile_full
@@ -72,7 +72,7 @@ FROM base AS runner
WORKDIR /app

# Copy over default pipeline config
-COPY sample_configs/config_pretrained_embeddings_spacy_duckling.yml config.yml
+COPY docker/configs/config_supervised_embeddings_duckling.yml config.yml

# Copy over mitie model
COPY --from=builder /app/data/total_word_feature_extractor.dat data/total_word_feature_extractor.dat
Original file line number Diff line number Diff line change
@@ -41,9 +41,6 @@ RUN apt-get update -qq \
# Make sure we have the latest pip version
RUN pip install -U pip

-# Download mitie model
-RUN wget -P /app/data/ https://s3-eu-west-1.amazonaws.com/mitie/total_word_feature_extractor.dat
-
# Copy only what we really need
COPY README.md .
COPY setup.py .
@@ -54,22 +51,19 @@ COPY requirements.txt .
COPY LICENSE.txt .

# Install dependencies
-RUN pip install --no-cache-dir -r alt_requirements/requirements_pretrained_embeddings_mitie.txt
+RUN pip install --no-cache-dir -r alt_requirements/requirements_pretrained_embeddings_convert.txt

# Install Rasa as package
COPY rasa ./rasa
-RUN pip install .[sql,mitie]
+RUN pip install .[sql,convert]

# Runtime stage which uses the virtualenv which we built in the previous stage
FROM base AS runner

WORKDIR /app

# Copy over default pipeline config
-COPY sample_configs/config_pretrained_embeddings_mitie.yml config.yml
-
-# Copy over mitie model
-COPY --from=builder /app/data/total_word_feature_extractor.dat data/total_word_feature_extractor.dat
+COPY docker/configs/config_pretrained_embeddings_convert.yml config.yml

# Copy virtualenv from previous stage
COPY --from=builder /build /build
2 changes: 1 addition & 1 deletion docker/Dockerfile_pretrained_embeddings_spacy_de
@@ -67,7 +67,7 @@ FROM base AS runner
WORKDIR /app

# Copy over default pipeline config
-COPY sample_configs/config_pretrained_embeddings_spacy_de.yml config.yml
+COPY docker/configs/config_pretrained_embeddings_spacy_de.yml config.yml

# Copy virtualenv from previous stage
COPY --from=builder /build /build
2 changes: 1 addition & 1 deletion docker/Dockerfile_pretrained_embeddings_spacy_en
@@ -67,7 +67,7 @@ FROM base AS runner
WORKDIR /app

# Copy over default pipeline config
-COPY sample_configs/config_pretrained_embeddings_spacy.yml config.yml
+COPY docker/configs/config_pretrained_embeddings_spacy_en.yml config.yml

# Copy virtualenv from previous stage
COPY --from=builder /build /build
15 changes: 15 additions & 0 deletions docker/configs/config_pretrained_embeddings_convert.yml
@@ -0,0 +1,15 @@
language: "en"

pipeline:
- name: ConveRTTokenizer
- name: ConveRTFeaturizer
- name: RegexFeaturizer
- name: LexicalSyntacticFeaturizer
- name: CountVectorsFeaturizer
- name: CountVectorsFeaturizer
  analyzer: "char_wb"
  min_ngram: 1
  max_ngram: 4
- name: DIETClassifier
- name: EntitySynonymMapper
- name: DIETSelector
16 changes: 16 additions & 0 deletions docker/configs/config_pretrained_embeddings_spacy_de.yml
@@ -0,0 +1,16 @@
language: "de"

pipeline:
- name: SpacyNLP
- name: SpacyTokenizer
- name: SpacyFeaturizer
- name: RegexFeaturizer
- name: LexicalSyntacticFeaturizer
- name: CountVectorsFeaturizer
- name: CountVectorsFeaturizer
  analyzer: "char_wb"
  min_ngram: 1
  max_ngram: 4
- name: DIETClassifier
- name: EntitySynonymMapper
- name: DIETSelector
16 changes: 16 additions & 0 deletions docker/configs/config_pretrained_embeddings_spacy_en.yml
@@ -0,0 +1,16 @@
language: "en"

pipeline:
- name: SpacyNLP
- name: SpacyTokenizer
- name: SpacyFeaturizer
- name: RegexFeaturizer
- name: LexicalSyntacticFeaturizer
- name: CountVectorsFeaturizer
- name: CountVectorsFeaturizer
  analyzer: "char_wb"
  min_ngram: 1
  max_ngram: 4
- name: DIETClassifier
- name: EntitySynonymMapper
- name: DIETSelector
16 changes: 16 additions & 0 deletions docker/configs/config_supervised_embeddings_duckling.yml
@@ -0,0 +1,16 @@
language: "en"

pipeline:
- name: WhitespaceTokenizer
- name: RegexFeaturizer
- name: LexicalSyntacticFeaturizer
- name: CountVectorsFeaturizer
- name: CountVectorsFeaturizer
analyzer: "char_wb"
min_ngram: 1
max_ngram: 4
- name: DIETClassifier
- name: EntitySynonymMapper
- name: DIETSelector
- name: DucklingHTTPExtractor
url: "http://duckling:8000"