Bugs with Rasa X UI while uploading new training data #3580

Closed
psds01 opened this issue May 24, 2019 · 16 comments

Labels
area:rasa-x/ui ✨ All issues focused on the Rasa X frontend
type:enhancement ✨ Additions of new features or changes to existing ones, should be doable in a single PR

Comments

@psds01
Contributor

psds01 commented May 24, 2019

Rasa version: 1.0.1

Python version: 3.6.8/ Anaconda

Operating system (windows, osx, ...): Ubuntu

Issue: First of all, great work, team Rasa, on the Rasa X project. This is just beautiful.

Rasa X UI bugs/features: I actually have found 4 "things" in UI of Rasa X that are not working as expected.

  1. [More of a Feature Request] Under Training -> NLU Training -> Training data: if I already have training data loaded there and try to upload a new training data file, it doesn't ask me for confirmation, such as "Are you sure? You might not have downloaded this tagged data." It should offer options like "Download current data and upload" and "Ignore current data and upload".

  2. If I upload a new training data file in the same UI as above, the new file's content does not show up immediately. I have to refresh the page before it shows the training data that I just uploaded. Until then, the UI keeps showing the stale data.

  3. The contents of the new training data are displayed in reverse order. What I mean is, if I have content like `{ "rasa_nlu_data": { "common_examples": ... } }`, the UI will show the intents in the reverse of the file order.

  4. It takes more than 2 minutes for my Rasa X UI to load. During this time, I see a file named rasa.db_journal being created and deleted at least a hundred times. (This does not happen with rasa init.)

Content of configuration file (config.yml):

Content of domain file (domain.yml) (if used & relevant):

@akelad
Contributor

akelad commented May 24, 2019

awesome @psds01 thanks so much for your feedback :D glad you like Rasa X!

  1. I believe this is something we've already incorporated in our designs for future updates -- @abhilasharoy can you confirm?
  2. @gausie can you look into that?
  3. Hmm, is it very important that it's in the order of the file?
  4. @ricwo can you look into that?

@akelad
Contributor

akelad commented May 24, 2019

actually re 3: I guess in future the plan is to group by intent anyways right @abhilasharoy ?

@psds01
Contributor Author

psds01 commented May 24, 2019

First approach: let's say I have 10 conversations between a bot and a user. Normally, I would combine all user inputs into a single nlu.json file, upload it to Rasa X, tag it, and train Rasa/custom NLU. Having done that, I would have to go through these 10 conversations again to create "stories" to train Rasa Core. This works if you have a small number of conversations to tag.

But let's say I have 1000 such conversations that need tagging. To train both NLU and Core, I can upload a single conversation, tag its intents, and generate a Rasa story for that conversation using a custom script. I can do this for all 1000 files. This gives me 1000 intent files and 1000 story files, but I have to go over each file only once (unlike the first approach, where I go through the conversations once for intents and again for stories).

In the first approach, I make 2000 passes in total (all 1000 conversations at once for intent tagging, plus 1000 conversations one at a time for story creation). In the second approach, I tag 1000 files one at a time and can generate the 1000 stories from them directly, saving me the manual trouble of writing 1000 Rasa stories.

So for me at least, it makes sense to keep my intents in file order rather than grouped by intent.

@abhilasharoy

@akelad

  1. This has not actually been handled yet (UX-wise). It'll be on my to-do list now.
  2. After thinking about this, I remember that it was a conscious decision to keep the order this way. Uploading from files may be a bit weird (as mentioned here), but this order works better when training data is added manually, because the user is then able to see the most recently added sentence on top.

And in the future, it would not be grouped by intent by default on this screen (though we do have designs for an "intents" landing page, where this would happen). But discovery can still be super easy because users can filter by intent at will.

@psds01
Contributor Author

psds01 commented May 24, 2019

We can avoid this (generating stories and intents separately for a conversation) if my 'nlu.json' file also had two additional fields, namely story_id and message_id. To be more specific, instead of

{
    "rasa_nlu_data": {
        "common_examples": [
            {
                "text": "Hi",
                "intent": "",
                "entities": []
            },
            {
                "text": "Bye",
                "intent": "",
                "entities": []
            }
        ]
    }
}

if I had

{
    "rasa_nlu_data": {
        "common_examples": [
            {
                "story_id":"1",
                "message_id":"1",
                "text": "Hi",
                "intent": "",
                "entities": []
            },
            {
                "story_id":"1",
                "message_id":"2",
                "text": "Bye",
                "intent": "",
                "entities": []
            }
        ]
    }
}

then it would save a lot of time while tagging files. I wouldn't need to worry about the order in which the intent text appears in the Rasa X UI. I could always generate multiple Rasa stories from a single nlu.json based on story_id and message_id.

This would make it very easy to tag data, whether intents or stories.
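
A minimal sketch of the idea above, in Python. It assumes the proposed `story_id` and `message_id` fields (which are not part of the existing Rasa NLU format) are present on every example and hold numeric strings, and that a story skeleton only needs the tagged user intents in order; bot actions would still have to be filled in by hand. The function name `stories_from_nlu` is hypothetical, and the output uses the Markdown story layout (`## name` followed by `* intent` lines).

```python
import json
from collections import defaultdict


def stories_from_nlu(nlu: dict) -> str:
    """Group tagged examples by story_id and emit one story skeleton per group.

    Assumes every example carries the proposed (hypothetical) "story_id" and
    "message_id" fields as numeric strings.
    """
    examples = nlu["rasa_nlu_data"]["common_examples"]

    # Collect the examples that belong to each conversation.
    by_story = defaultdict(list)
    for ex in examples:
        by_story[ex["story_id"]].append(ex)

    blocks = []
    for story_id in sorted(by_story, key=int):
        # Restore the original turn order, whatever order the UI exported.
        turns = sorted(by_story[story_id], key=lambda ex: int(ex["message_id"]))
        lines = ["## story_{}".format(story_id)]
        lines += ["* {}".format(ex["intent"]) for ex in turns]
        blocks.append("\n".join(lines))

    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # "nlu.json" is the tagged file downloaded from the UI (path is an assumption).
    with open("nlu.json") as f:
        print(stories_from_nlu(json.load(f)))
```

Because the turns are re-sorted by `message_id`, the display or export order of the examples no longer matters.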

@psds01
Contributor Author

psds01 commented May 24, 2019

@akelad @ricwo @gausie
Any suggestions?

@ricwo
Contributor

ricwo commented May 24, 2019

Regarding the startup delay: do you experience this with a fresh project (after rasa init), or after importing lots of training data? It sounds like it could be due to a very large database.

@psds01
Contributor Author

psds01 commented May 24, 2019

Thanks @ricwo for the kind comment. There is no startup delay with rasa init.
When I train on custom data, the sizes of databases are: rasa.db = 2.5 MB and tracker.db = 151.6 kB. Is 2.5 MB a large DB in this case?
Thanks,

tmbo added the type:bug, area:rasa-x/backend, area:rasa-x/ui, and type:enhancement labels and removed the type:bug label on May 27, 2019
@tmbo
Member

tmbo commented May 27, 2019

It makes more sense to open a separate issue for the startup delay as that is separate from the whole training data upload topic. @psds01 do you mind doing that?

@psds01
Contributor Author

psds01 commented May 27, 2019

Thanks for the suggestion, @tmbo. Here's the issue: #3611

@rgstephens
Contributor

rgstephens commented Jul 31, 2019

I just ran across the import problem described in item 1 of @psds01's original post. I was surprised to find all my data removed when I imported new data.

My request is that the UI be changed to let the user choose whether to replace or add to the existing training data when they do the import.
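
A rough sketch of what the two options could mean for the data, assuming the `common_examples` layout shown earlier in this thread. This is not how Rasa X implements the import; `combine_examples` and the mode names `"replace"` and `"append"` are hypothetical, and treating an example as a duplicate when its text and intent both match is an assumption.

```python
from typing import List


def combine_examples(existing: List[dict], uploaded: List[dict], mode: str) -> List[dict]:
    """Return the training examples that should remain after an import.

    mode="replace": discard the existing examples (the behaviour reported above).
    mode="append":  keep the existing examples and add the uploaded ones,
                    skipping exact (text, intent) duplicates.
    """
    if mode == "replace":
        return list(uploaded)

    if mode == "append":
        seen = {(ex["text"], ex["intent"]) for ex in existing}
        merged = list(existing)
        for ex in uploaded:
            key = (ex["text"], ex["intent"])
            if key not in seen:
                seen.add(key)
                merged.append(ex)
        return merged

    raise ValueError("mode must be 'replace' or 'append', got {!r}".format(mode))
```

A confirmation dialog would then only need to pass the user's choice through as `mode`.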

@rgstephens
Contributor

Is this getting any attention? I'm sorry to see #4067 getting resolved without addressing this related issue.

@akelad
Contributor

akelad commented Aug 13, 2019

@rgstephens could you clarify what you mean? That PR is related to addressing the NLU training data page, not related to Rasa X

@rgstephens
Contributor

Sounds like I misread that issue. Never mind.

@stale

stale bot commented Feb 3, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the status:stale label on Feb 3, 2020
tmbo removed the status:stale and area:rasa-x/backend labels on Feb 3, 2020
rasabot-exalate added and then removed the type:enhancement and type:enhancement_:sparkles: labels on Mar 21, 2022 (with Exalate Issue Sync)
@alexweidauer
Contributor

Thank you for raising this issue about Rasa X. We decided to stop supporting the Community Edition (free version) of ‘Rasa X’ (see more info here: https://rasa.com/blog/rasa-x-community-edition-changes/). That’s why we’ve closed your issue. We suggest that Rasa Enterprise customers raise the issue with our customer support team if they haven’t done so already.


8 participants