added supplementary file information to readme
MoizRauf authored Aug 24, 2020
1 parent 07b2035 commit 8df381c
Showing 1 changed file with 6 additions and 1 deletion.
7 changes: 6 additions & 1 deletion README.md
@@ -40,6 +40,11 @@ The following are the best found configurations:
|w2v-cbow | 5 | 100 | 15 | 0.1 | 3|
|w2v-sg | 5 | 50 | 15 | 0.1 | 3|

## Supplementary dataset
In addition to the three benchmark datasets, we provide two supplementary files with additional information about the identifiers in IdBench.
1. [Identifier_cross_lang_freq_stats.csv](identifier_cross_lang_freq_stats.csv) reports how often the selected identifiers occur in JavaScript, Python, and Java.
2. [Identifier_role_stats.csv](identifier_role_stats.csv) reports how often an identifier appears as a function name, variable name, or property name.
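
The two files can be inspected with a few lines of Python. The following is a minimal sketch, not part of the benchmark itself: it assumes both files are comma-separated with a header row, and the helper name `load_stats` is hypothetical. Because the column names are not documented here, the sketch only lists whatever columns it finds.

```python
import csv
from pathlib import Path


def load_stats(path):
    """Read one supplementary CSV and return (column_names, rows).

    Assumption: the file is comma-separated and its first line is a header.
    Each row is returned as a dict keyed by column name.
    """
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        # fieldnames is None when the file is empty, so fall back to [].
        return list(reader.fieldnames or []), rows


if __name__ == "__main__":
    # File names as given in the list above; skipped if not present locally.
    for name in ("identifier_cross_lang_freq_stats.csv",
                 "identifier_role_stats.csv"):
        if Path(name).exists():
            columns, rows = load_stats(name)
            print(name, columns, len(rows))
```

Run from the repository root, this prints the column names and row count of each file, which is usually enough to decide how to join them with the benchmark data.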

## Survey Conducted to Build the Dataset
IdBench is built from a survey of 500 software developers. The following shows the instructions given to participants, examples of the questions asked, and details about the distribution of participants.

@@ -61,4 +66,4 @@ Here is an example of a question asked during the indirect survey:

Finally, some statistics about the geographical distribution and previous experience of the participants:

![](https://raw.githubusercontent.com/sola-st/IdBench/master/images/participants.png)
