Make high-level plan for 1st usable text analysis #115

Open
7 tasks
blcham opened this issue Dec 1, 2022 · 7 comments

@blcham
Contributor

blcham commented Dec 1, 2022

Text analysis info document

Tasks:

  • Discuss Marketa's results, comment on and correct them so that they are understandable, and pick the statistics and patterns most interesting for us
  • Merge the vocabularies of Marketa, Ondrej and Vojtech (remove duplicates and errors; this can be tested by a script)
  • Compute the selected statistics
  • ? Implement the most interesting patterns and recompute the statistics
  • ? Include extraction of the WO action to improve the statistics
  • Design how the maintenance planner will provide, for every known TC with history, a list of probable components and failures that might occur. The list should be ordered by probability.
  • Design the suggestion by MK: "determine where critical finding can occur, which means prioritize or highlight the tasks which may lead to causing the delay of maintenance visit (WP)"
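
The vocabulary merge and its duplicate check could be scripted roughly as follows. This is only a minimal sketch: it assumes each vocabulary can be exported as a plain list of term labels, and the normalization rule (casefolding plus whitespace collapsing), the function names and the sample terms are all hypothetical, not taken from the actual lexicons.

```python
from collections import defaultdict


def normalize(label: str) -> str:
    """Casefold and collapse whitespace so trivially different spellings match."""
    return " ".join(label.casefold().split())


def merge_vocabularies(vocabularies):
    """Merge term lists keyed by source name.

    Returns (merged, duplicates): merged is the sorted list of normalized
    terms, duplicates maps each term found more than once to its sources.
    """
    sources = defaultdict(list)
    for source, terms in vocabularies.items():
        for term in terms:
            key = normalize(term)
            if key:  # skip empty labels, one obvious kind of error
                sources[key].append(source)
    merged = sorted(sources)
    duplicates = {k: v for k, v in sources.items() if len(v) > 1}
    return merged, duplicates


# Made-up sample terms, for illustration only.
vocabularies = {
    "marketa": ["Fuel pump", "corrosion ", "Crack"],
    "ondrej": ["fuel  pump", "Leak"],
    "vojtech": ["CRACK", "", "Hydraulic line"],
}
merged, duplicates = merge_vocabularies(vocabularies)
```

The `duplicates` report is meant for manual review before anything is dropped, since two sources may use the same label for different concepts.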
@blcham blcham added this to the 1st Usable Text Analysis milestone Dec 1, 2022
@lalisand
Collaborator

lalisand commented Dec 4, 2022

I think we need to summarize what we have and what we want to have.

We have:

  1. Two lexicons of aircraft components and failures (by students Tomáš Vojtěch and Markéta Adamcová), both imported into the termit CSAT deployment
  2. A set of suggested patterns to be used by the text analysis to improve its precision/recall (by Markéta Adamcová)
  3. Annotated/analyzed datasets from the students' work above, with the evaluation described in their theses

We want to do (have):

  1. Decide what will be implemented in the CSAT SW stack
  2. Implement the results of 1. in the maintenance planner

@blcham you may add some links here (where applicable), as it is not easy for me to find some of the listed artefacts stored on GD
@kalamartin please add any relevant information from your side, I know you are doing some experiments here as well

@kalamartin
Collaborator

@lalisand @blcham Markéta Adamcová made a good point about a method for reducing the number of workstep rows that carry no information. It would be great to find a way to filter these out so that we can improve the precision of the text analysis.

Regardless of the above:

  1. Otherwise, my suggestion is still the same. As a first step, rigidly implement text analysis in the maintenance planner so that every known TC with history gets a suggestion of its most probable component and failure.
  2. The second step could be adding the probability of occurrence of this component and failure.
  3. The third step would be to determine where critical findings can occur, i.e. to prioritize or highlight the tasks that may delay the maintenance visit (WP).
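
A rough sketch of how the three steps could fit together is given here. It assumes the text analysis already yields one record per historical finding; the `Finding` fields, the function names and the sample records are hypothetical, and the probabilities are plain relative frequencies rather than anything agreed in this thread.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Finding(NamedTuple):
    tc: str             # task card identifier
    component: str
    failure: str
    caused_delay: bool  # assumed flag: the related WP overran its schedule


def rank_findings(history):
    """Steps 1 and 2: per TC, (component, failure) pairs ordered by relative frequency."""
    counts = defaultdict(Counter)
    for f in history:
        counts[f.tc][(f.component, f.failure)] += 1
    return {
        tc: [(pair, n / sum(c.values())) for pair, n in c.most_common()]
        for tc, c in counts.items()
    }


def delay_rate(history):
    """Step 3: per TC, share of recorded findings linked to a delayed WP."""
    total, delayed = Counter(), Counter()
    for f in history:
        total[f.tc] += 1
        delayed[f.tc] += f.caused_delay
    return {tc: delayed[tc] / total[tc] for tc in total}


# Made-up records, for illustration only.
history = [
    Finding("TC-1", "fuel pump", "leak", False),
    Finding("TC-1", "fuel pump", "leak", True),
    Finding("TC-1", "fuel pump", "leak", False),
    Finding("TC-1", "valve", "crack", True),
    Finding("TC-2", "panel", "corrosion", False),
]
ranking = rank_findings(history)
rates = delay_rate(history)
```

The first entry of `ranking[tc]` would be the single suggestion of step 1, the full list with its frequencies covers step 2, and `rates` could drive the highlighting of step 3.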

@lalisand
Collaborator

lalisand commented Dec 7, 2022

@kalamartin thanks for this. Did you do some experiments yourself? Did you take the results of any of the analyses performed by our students and use them somewhere, or not?

@kalamartin
Collaborator

@lalisand I have run my own experiments to identify WOs raised from planned TCs in WPs that overran their scheduled TAT, so I have a reference for the tasks that led to the TAT being extended.
However, this was done with old data provided by Mira, and even after the changes I do not expect an improvement in the accuracy of determining the component and failure in each row (based on Marketa's results).
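
For readers who want to reproduce this kind of experiment, one possible shape of it is sketched below. This is not the script actually used: the record layout, the field names (`scheduled_tat_days`, `actual_tat_days`, `source_tc`) and the sample data are assumptions made for illustration.

```python
from collections import Counter


def delayed_task_cards(work_packages, work_orders):
    """Count, per planned TC, the WOs it raised inside WPs that overran their TAT."""
    overdue = {
        wp["id"]
        for wp in work_packages
        if wp["actual_tat_days"] > wp["scheduled_tat_days"]
    }
    counts = Counter(
        wo["source_tc"]
        for wo in work_orders
        if wo["wp"] in overdue and wo["source_tc"]  # ignore WOs with no source TC
    )
    return counts.most_common()


# Made-up records, for illustration only.
work_packages = [
    {"id": "WP-1", "scheduled_tat_days": 10, "actual_tat_days": 13},
    {"id": "WP-2", "scheduled_tat_days": 7, "actual_tat_days": 7},
]
work_orders = [
    {"wp": "WP-1", "source_tc": "TC-1"},
    {"wp": "WP-1", "source_tc": "TC-1"},
    {"wp": "WP-1", "source_tc": "TC-3"},
    {"wp": "WP-2", "source_tc": "TC-1"},
    {"wp": "WP-1", "source_tc": None},
]
result = delayed_task_cards(work_packages, work_orders)
```

Note that this only shows which TCs raised WOs in overdue WPs; it does not establish that a given WO caused the overrun.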

@lalisand
Collaborator

lalisand commented Dec 8, 2022

@kalamartin OK, this sounds good. I understand the issue with text analysis precision. How good is the current precision? Are the results useful (at least partly), or must the analysis be improved before CSAT is willing to use it?

@blcham
Contributor Author

blcham commented Dec 9, 2022

> Two lexicons of aircraft components and failures (by students Tomáš Vojtěch and Markéta Adamcová), both imported in termit CSAT deployment

@lalisand:

  1. I believe that we also have a lexicon from Ondrej Vitovec, so we should include it as well.
  2. I updated the task list in the first comment of this thread (I included some of the tasks; others need to be discussed and added).

@lalisand
Collaborator

@blcham

  1. OK, we can check this with Ondřej.
  2. Makes sense; shall we start with a meeting with Markéta Adamcová? If so, let's suggest a time/day on Slack.

3 participants