Categorising labour market skills into EU's taxonomy of skills

This repo contains my code for the DTU exam project in Deep Learning.
The code that reproduces the main result of the paper can be found in the notebook main.ipynb.

What to expect

In this project I propose a matching model that assigns skills written in natural language to a common EU taxonomy with 13,485 unique skill categories. Applying a pre-trained sentence BERT model gives surprisingly good results when matching skills from the EU taxonomy to their own descriptions. By embedding the skills and their descriptions separately and then matching them back together, the model assigns 1 in 5 skills to its correct description. Moreover, for 50.7% of skills the correct description is among the 10 most semantically similar descriptions. With fine-tuning, the model has potential as a labour market matching application.
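The evaluation described above can be illustrated with a minimal sketch. It is not the code from main.ipynb: the function names `cosine` and `top_k_accuracy` are hypothetical, and the small hand-written vectors stand in for the sentence BERT embeddings of skills (queries) and descriptions (documents). It assumes that semantic similarity is measured with cosine similarity and that query i belongs to document i.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_k_accuracy(query_vecs, doc_vecs, k):
    """Share of queries whose own document (same index) is among
    the k most similar documents."""
    hits = 0
    for i, q in enumerate(query_vecs):
        scores = [cosine(q, d) for d in doc_vecs]
        # Rank document indices from most to least similar to the query.
        ranked = sorted(range(len(doc_vecs)), key=lambda j: scores[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(query_vecs)


# Toy stand-ins for embeddings (assumption: real ones come from a sentence BERT model).
query_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
doc_vecs = [[0.6, 0.8, 0.0], [0.0, 0.5, 0.5], [0.0, 0.2, 0.9]]

top1 = top_k_accuracy(query_vecs, doc_vecs, k=1)
top2 = top_k_accuracy(query_vecs, doc_vecs, k=2)
print(f"top-1 accuracy: {top1:.3f}, top-2 accuracy: {top2:.3f}")
```

In this toy example the second query ranks the first document above its own, so it counts as a miss at k=1 but as a hit at k=2. The figures reported above correspond to the same idea with k=1 and k=10 over the full taxonomy.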

Data

The data used to evaluate the model is the EU's taxonomy of skills, ESCO. The dataset contains 13,485 unique skills, each with a description. The query column contains the skills and the document column contains the respective descriptions (all in Danish).

MartinBA741/SkillsColBERT
