Microsoft Machine Reading Comprehension (MS MARCO) is a new large-scale dataset for reading comprehension and question answering. In MS MARCO, all questions are sampled from real, anonymized user queries. The context passages, from which the answers in the dataset are derived, are extracted from real web documents using the most advanced version of the Bing search engine. The answers to the queries are written by humans whenever they could summarize an answer from the passages.

Contributors to the MS MARCO dataset

Tri Nguyen
Applied Scientist

Xia Song
Senior Applied Scientist

Mir Rosenberg
Principal Lead PM

Saurabh Tiwary
Principal Applied Science Manager