Microsoft Machine Reading Comprehension (MS MARCO) is a new large-scale dataset for reading comprehension and question answering. In MS MARCO, all questions are sampled from real, anonymized user queries. The context passages from which the answers are derived are extracted from real web documents using the most advanced version of the Bing search engine. The answers are written by humans, and only for queries where they could summarize an answer.

Contributors to the MS MARCO dataset

Tri Nguyen
Applied Scientist

Tong Wang
Applied Scientist

Xia Song
Principal Applied Scientist

Mir Rosenberg
Principal Lead PM