References:
Abadani, Negin, Jamshid Mozafari, Afsaneh Fatemi, Mohammad Ali Nematbakhsh, and Arefeh Kazemi. 2021. “ParSQuAD: Machine Translated SQuAD Dataset for Persian Question Answering.” In 2021 7th International Conference on Web Research (ICWR). IEEE. Tehran, Iran.
Ayoubi, Sajjad, and Mohammad Yasin Davoodeh. 2021. PersianQA: A Dataset for Persian Question Answering. GitHub Repository.
Bajaj, Payal, Daniel Campos, Nick Craswell, Li Deng, Jianfeng Gao, Xiaodong Liu, Rangan Majumder, Andrew McNamara, Bhaskar Mitra, and Tri Nguyen. 2016. “MS MARCO: A Human Generated Machine Reading Comprehension Dataset.” arXiv preprint arXiv:1611.09268.
Craswell, Nick, Bhaskar Mitra, Emine Yilmaz, and Daniel Campos. 2021a. “Overview of the TREC 2020 Deep Learning Track.” arXiv preprint arXiv:2102.07662.
Craswell, Nick, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, and Jimmy Lin. 2021b. “MS MARCO: Benchmarking Ranking Models in the Large-Data Regime.” pp. 1566–76 in Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval.
Craswell, Nick, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, and Ellen M. Voorhees. 2020. “Overview of the TREC 2019 Deep Learning Track.” arXiv preprint arXiv:2003.07820.
Devlin, Jacob, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. “BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding.” pp. 4171–86 in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Minneapolis, Minnesota: Association for Computational Linguistics.
Hashemi, Helia, Mohammad Aliannejadi, Hamed Zamani, and W. Bruce Croft. 2020. “ANTIQUE: A Non-Factoid Question Answering Benchmark.” Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 12036 LNCS: 166–73. doi: 10.1007/978-3-030-45442-5_21.
Johnson, Jeff, Matthijs Douze, and Hervé Jégou. 2019. “Billion-Scale Similarity Search with GPUs.” IEEE Transactions on Big Data 7 (3): 535–47.
Karpukhin, Vladimir, Barlas Oguz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih. 2020. “Dense Passage Retrieval for Open-Domain Question Answering.” pp. 6769–81 in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP).
Kazemi, Arefeh, Jamshid Mozafari, and Mohammad Ali Nematbakhsh. 2022. “PersianQuAD: The Native Question Answering Dataset for the Persian Language.” IEEE Access 10: 26045–57. doi: 10.1109/ACCESS.2022.3157289.
Khashabi, Daniel, Arman Cohan, Siamak Shakeri, Pedram Hosseini, Pouya Pezeshkpour, Malihe Alikhani, Moin Aminnaseri, Marzieh Bitaab, Faeze Brahman, and Sarik Ghazarian. 2021. “ParsiNLU: A Suite of Language Understanding Challenges for Persian.” Transactions of the Association for Computational Linguistics 9: 1163–78.
Kwiatkowski, Tom, Jennimaria Palomaki, Olivia Redfield, Michael Collins, Ankur Parikh, Chris Alberti, Danielle Epstein, Illia Polosukhin, Jacob Devlin, and Kenton Lee. 2019. “Natural Questions: A Benchmark for Question Answering Research.” Transactions of the Association for Computational Linguistics 7: 453–66.
Lin, Jimmy, Xueguang Ma, Sheng-Chieh Lin, Jheng-Hong Yang, Ronak Pradeep, and Rodrigo Nogueira. 2021. “Pyserini: A Python Toolkit for Reproducible Information Retrieval Research with Sparse and Dense Representations.” pp. 2356–62 in Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval.
Liu, Ye, Kazuma Hashimoto, Yingbo Zhou, Semih Yavuz, Caiming Xiong, and Philip S. Yu. 2021a. “Dense Hierarchical Retrieval for Open-Domain Question Answering.” pp. 188–200 in Findings of the Association for Computational Linguistics: EMNLP 2021.
Liu, Zhenghao, Kaitao Zhang, Chenyan Xiong, Zhiyuan Liu, and Maosong Sun. 2021b. “OpenMatch: An Open Source Library for Neu-IR Research.” pp. 2531–35 in Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval.
Mitra, Bhaskar, and Nick Craswell. 2018. “An Introduction to Neural Information Retrieval.” Foundations and Trends® in Information Retrieval 13 (1): 1–126.
Qu, Yingqi, Yuchen Ding, Jing Liu, Kai Liu, Ruiyang Ren, Wayne Xin Zhao, Daxiang Dong, Hua Wu, and Haifeng Wang. 2021. “RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering.” pp. 5835–47 in Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
Rajpurkar, Pranav, Robin Jia, and Percy Liang. 2018. “Know What You Don’t Know: Unanswerable Questions for SQuAD.” pp. 784–89 in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Melbourne, Australia.
Rajpurkar, Pranav, Jian Zhang, Konstantin Lopyrev, and Percy Liang. 2016. “SQuAD: 100,000+ Questions for Machine Comprehension of Text.” pp. 2383–92 in Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Austin, Texas, USA.
Robertson, Stephen, and Hugo Zaragoza. 2009. “The Probabilistic Relevance Framework: BM25 and Beyond.” Foundations and Trends® in Information Retrieval 3 (4): 333–89.
Salton, Gerard, and Christopher Buckley. 1988. “Term-Weighting Approaches in Automatic Text Retrieval.” Information Processing & Management 24 (5): 513–23.
Xia, Fen, Tie-Yan Liu, Jue Wang, Wensheng Zhang, and Hang Li. 2008. “Listwise Approach to Learning to Rank: Theory and Algorithm.” pp. 1192–99 in Proceedings of the 25th International Conference on Machine Learning. Helsinki, Finland.
Yang, Peilin, Hui Fang, and Jimmy Lin. 2017. “Anserini: Enabling the Use of Lucene for Information Retrieval Research.” pp. 1253–56 in Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval. Tokyo, Japan.
Zhang, Xinyu, Andrew Yates, and Jimmy Lin. 2020. “A Little Bit Is Worse Than None: Ranking with Limited Training Data.” pp. 107–12 in Proceedings of SustaiNLP: Workshop on Simple and Efficient Natural Language Processing.