04.23.2018: We have released an update to the dataset. V2.1 includes the following:
1. Over 1 million queries
2. ~182k Well Formed Answers
3. Query type is now included for every query.
4. Bias in the evaluation set fixed (a small portion of the answers in the V2.0 evaluation set could be found in the V1.1 set and the V2.0 well-formed sets; these have been removed from eval and added to train).
5. Utilities and Readme now available.
03.01.2018: We have released an update to the dataset. V2.0 includes the following:
1. ~900,000 unique queries
2. ~160k Well Formed Answers
01.30.2017: We have released an update to the dataset! V1.1 contains the following:
1. Improvements to dataset and evaluation scripts
12.01.2016: We have released our dataset! V1.0 contains the following:
1. 100,000 unique query answer pairs
As we improve the quality of our data, we will publish updates to the dataset. Follow us on Twitter (MSMarcoAI) for updates.