From 8df381c4f78c75d42727ae1dd870dd69b21ddeea Mon Sep 17 00:00:00 2001
From: Moiz Rauf
Date: Mon, 24 Aug 2020 16:03:26 +0200
Subject: [PATCH] added supplementary file information to readme

---
 README.md | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index ae3fe4d..e5d701a 100644
--- a/README.md
+++ b/README.md
@@ -40,6 +40,11 @@ The following are the best found configurations:
 |w2v-cbow | 5 | 100 | 15 | 0.1 | 3|
 |w2v-sg | 5 | 50 | 15 | 0.1 | 3|
 
+## Supplementary dataset
+In addition to the three benchmark datasets, we provide two supplementary files providing additional information regarding the identifiers present in the Idbench.
+1. [Identifier_cross_lang_freq_stats.csv](identifier_cross_lang_freq_stats.csv) provides statistics of number of times the selected identifiers occur in JavaScript, Python and Java.
+2. [Identifier_role_stats.csv](identifier_role_stats.csv) provides information on how often an identifier appears as a function name, variable name or property name.
+
 ## Survey Conducted to Build the Dataset
 
 IdBench is build based on a survey of 500 software developers. The following gives shows the instructions shown to participants, examples of questions asked, and details about the distribution of participants.
@@ -61,4 +66,4 @@ Here is an example of a question asked during the indirect survey:
 
 Finally, some statistics about the geographical distribution and previous experience of the participants:
 
-![](https://raw.githubusercontent.com/sola-st/IdBench/master/images/participants.png)
\ No newline at end of file
+![](https://raw.githubusercontent.com/sola-st/IdBench/master/images/participants.png)