Mining Aligned NL-Code Pairs from Stack Overflow MSR ’18, May 28–29, 2018, Gothenburg, Sweden
REFERENCES
[1]
Miltiadis Allamanis, Earl T Barr, Christian Bird, and Charles Sutton. 2015. Suggesting
Accurate Method and Class Names. In Joint Meeting on Foundations of
Software Engineering (ESEC/FSE). ACM, 38–49.
[2]
Miltiadis Allamanis, Hao Peng, and Charles Sutton. 2016. A Convolutional
Attention Network for Extreme Summarization of Source Code. arXiv preprint
arXiv:1602.03001 (2016).
[3]
Miltiadis Allamanis, Daniel Tarlow, Andrew D Gordon, and Yi Wei. 2015. Bimodal
Modelling of Source Code and Natural Language. In International Conference on
Machine Learning (ICML). 2123–2132.
[4]
Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural Machine
Translation by Jointly Learning to Align and Translate. In International Conference
on Learning Representations (ICLR).
[5]
Antonio Valerio Miceli Barone and Rico Sennrich. 2017. A Parallel Corpus of
Python Functions and Documentation Strings for Automated Code Documentation
and Code Generation. arXiv preprint arXiv:1707.02275 (2017).
[6]
Shaunak Chatterjee, Sudeep Juvekar, and Koushik Sen. 2009. Sniff: A Search Engine
for Java Using Free-form Queries. In International Conference on Fundamental
Approaches to Software Engineering. Springer, 385–400.
[7]
Aditya Desai, Sumit Gulwani, Vineet Hingorani, Nidhi Jain, Amey Karkare,
Mark Marron, Subhajit Roy, and others. 2016. Program Synthesis using Natural
Language. In International Conference on Software Engineering (ICSE). ACM, 345–356.
[8]
Premkumar Devanbu. 2015. New Initiative: the Naturalness of Software. In
International Conference on Software Engineering (ICSE), Vol. 2. IEEE, 543–546.
[9]
Christine Franks, Zhaopeng Tu, Premkumar Devanbu, and Vincent Hellendoorn.
2015. CACHECA: A Cache Language Model based Code Suggestion Tool. In
International Conference on Software Engineering (ICSE), Vol. 2. IEEE, 705–708.
[10]
Mark Gabel and Zhendong Su. 2010. A Study of the Uniqueness of Source Code.
In International Symposium on Foundations of Software Engineering (FSE). ACM,
147–156.
[11]
Yarin Gal and Zoubin Ghahramani. 2016. A Theoretically Grounded Application
of Dropout in Recurrent Neural Networks. In Annual Conference on Neural
Information Processing Systems (NIPS). 1019–1027.
[12]
Jiatao Gu, Zhengdong Lu, Hang Li, and Victor O.K. Li. 2016. Incorporating
Copying Mechanism in Sequence-to-Sequence Learning. In Annual Meeting of
the Association for Computational Linguistics (ACL). ACL, 1631–1640.
[13]
Abram Hindle, Earl T Barr, Mark Gabel, Zhendong Su, and Premkumar Devanbu.
2016. On the naturalness of software. Commun. ACM 59, 5 (2016), 122–131.
[14]
Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alex Acero, and Larry
Heck. 2013. Learning Deep Structured Semantic Models for Web Search using
Clickthrough Data. In International Conference on Information and Knowledge
Management (CIKM). ACM, 2333–2338.
[15]
Srinivasan Iyer, Ioannis Konstas, Alvin Cheung, and Luke Zettlemoyer. 2016.
Summarizing Source Code using a Neural Attention Model. In Annual Meeting of
the Association for Computational Linguistics (ACL). ACL, 2073–2083.
[16]
Alan Jaffe, Jeremy Lacomis, Edward J. Schwartz, Claire Le Goues, and Bogdan
Vasilescu. 2018. Meaningful Variable Names for Decompiled Code: A Machine
Translation Approach. In International Conference on Program Comprehension
(ICPC). ACM.
[17]
Nal Kalchbrenner and Phil Blunsom. 2013. Recurrent Continuous Translation
Models. In Conference on Empirical Methods in Natural Language Processing
(EMNLP). ACL, 1700–1709.
[18]
Diederik P. Kingma and Jimmy Ba. 2014. Adam: A Method for Stochastic Optimization.
CoRR abs/1412.6980 (2014). http://arxiv.org/abs/1412.6980
[19] Philipp Koehn. 2010. Statistical Machine Translation. Cambridge University Press.
[20]
Xi Victoria Lin, Chenglong Wang, Luke Zettlemoyer, and Michael D Ernst. 2018.
NL2Bash: A Corpus and Semantic Parser for Natural Language Interface to the
Linux Operating System. International Conference on Language Resources and
Evaluation (LREC) (2018).
[21]
Nicholas Locascio, Karthik Narasimhan, Eduardo De Leon, Nate Kushman, and
Regina Barzilay. 2016. Neural Generation of Regular Expressions from Natural
Language with Minimal Domain Knowledge. In Conference on Empirical Methods
in Natural Language Processing (EMNLP). ACL, 1918–1923.
[22]
Thang Luong, Hieu Pham, and Christopher D. Manning. 2015. Effective Approaches
to Attention-based Neural Machine Translation. In Conference on Empirical
Methods in Natural Language Processing (EMNLP). ACL, 1412–1421.
[23]
Lance A Miller. 1981. Natural Language Programming: Styles, Strategies, and
Contrasts. IBM Systems Journal 20, 2 (1981), 184–215.
[24]
Dana Movshovitz-Attias and William W Cohen. 2013. Natural Language Models
for Predicting Programming Comments. In Annual Meeting of the Association for
Computational Linguistics (ACL). ACL, 35–40.
[25]
Graham Neubig. 2017. Neural Machine Translation and Sequence-to-Sequence
Models: A Tutorial. arXiv preprint arXiv:1703.01619 (2017).
[26] Graham Neubig, Chris Dyer, Yoav Goldberg, Austin Matthews, Waleed Ammar,
Antonios Anastasopoulos, Miguel Ballesteros, David Chiang, Daniel Clothiaux,
Trevor Cohn, Kevin Duh, Manaal Faruqui, Cynthia Gan, Dan Garrette, Yangfeng
Ji, Lingpeng Kong, Adhiguna Kuncoro, Gaurav Kumar, Chaitanya Malaviya, Paul
Michel, Yusuke Oda, Matthew Richardson, Naomi Saphra, Swabha Swayamdipta,
and Pengcheng Yin. 2017. DyNet: The Dynamic Neural Network Toolkit. arXiv
preprint arXiv:1701.03980 (2017).
[27]
Anh Tuan Nguyen, Hoan Anh Nguyen, Tung Thanh Nguyen, and Tien N Nguyen.
2014. Statistical Learning Approach for Mining API Usage Mappings for Code
Migration. In International Conference on Automated Software Engineering (ASE).
ACM, 457–468.
[28]
Anh Tuan Nguyen, Tung Thanh Nguyen, and Tien N Nguyen. 2013. Lexical
Statistical Machine Translation for Language Migration. In Joint Meeting on
Foundations of Software Engineering (ESEC/FSE). ACM, 651–654.
[29]
Haoran Niu, Iman Keivanloo, and Ying Zou. 2016. Learning to Rank Code
Examples for Code Search Engines. Empirical Software Engineering (2016), 1–33.
[30]
Yusuke Oda, Hiroyuki Fudaba, Graham Neubig, Hideaki Hata, Sakriani Sakti,
Tomoki Toda, and Satoshi Nakamura. 2015. Learning to Generate Pseudo-Code
from Source Code Using Statistical Machine Translation. In International Conference
on Automated Software Engineering (ASE). IEEE, 574–584.
[31]
Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus,
and Gerardo Canfora. 2012. Mining Source Code Descriptions from Developer
Communications. In International Conference on Program Comprehension (ICPC).
IEEE, 63–72.
[32]
Chris Quirk, Raymond Mooney, and Michel Galley. 2015. Language to Code:
Learning Semantic Parsers for If-This-Then-That Recipes. In Annual Meeting of
the Association for Computational Linguistics (ACL). 878–888.
[33]
Maxim Rabinovich, Mitchell Stern, and Dan Klein. 2017. Abstract Syntax Networks
for Code Generation and Semantic Parsing. In Proceedings of the 55th
Annual Meeting of the Association for Computational Linguistics (Volume 1: Long
Papers). Association for Computational Linguistics, Vancouver, Canada, 1139–1149.
http://aclweb.org/anthology/P17-1105
[34]
Veselin Raychev, Martin Vechev, and Andreas Krause. 2015. Predicting Program
Properties from “Big Code”. In ACM Symposium on Principles of Programming
Languages (POPL). ACM, 111–124.
[35]
Victor S Sheng, Foster Provost, and Panagiotis G Ipeirotis. 2008. Get Another
Label? Improving Data Quality and Data Mining using Multiple, Noisy Labelers.
In ACM SIGKDD International Conference on Knowledge Discovery and Data
Mining. ACM, 614–622.
[36]
Richard Socher, Eric H Huang, Jeffrey Pennington, Christopher D Manning, and
Andrew Y Ng. 2011. Dynamic Pooling and Unfolding Recursive Autoencoders
for Paraphrase Detection. In Advances in Neural Information Processing Systems
(NIPS). 801–809.
[37]
Giriprasad Sridhara, Lori Pollock, and K Vijay-Shanker. 2011. Generating Parameter
Comments and Integrating with Method Summaries. In International
Conference on Program Comprehension (ICPC). IEEE, 71–80.
[38]
Nitish Srivastava, Geoffrey E. Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan
Salakhutdinov. 2014. Dropout: a Simple Way to Prevent Neural Networks from
Overfitting. Journal of Machine Learning Research 15, 1 (2014), 1929–1958.
[39]
Ilya Sutskever, Oriol Vinyals, and Quoc V Le. 2014. Sequence to Sequence
Learning with Neural Networks. In Advances in Neural Information Processing
Systems (NIPS). 3104–3112.
[40]
Bogdan Vasilescu, Casey Casalnuovo, and Premkumar Devanbu. 2017. Recovering
Clear, Natural Identifiers from Obfuscated JavaScript Names. In Joint Meeting on
the Foundations of Software Engineering (ESEC/FSE). ACM. to appear.
[41]
Yushi Wang, Jonathan Berant, and Percy Liang. 2015. Building a Semantic Parser
Overnight. In Annual Meeting of the Association for Computational Linguistics
(ACL). ACL, 1332–1342.
[42]
Yi Wei, Nirupama Chandrasekaran, Sumit Gulwani, and Youssef Hamadi. 2015.
Building Bing Developer Assistant. Technical Report. MSR-TR-2015-36, Microsoft
Research.
[43]
Edmund Wong, Taiyue Liu, and Lin Tan. 2015. CloCom: Mining Existing Source
Code for Automatic Comment Generation. In International Conference on Software
Analysis, Evolution, and Reengineering (SANER). IEEE, 380–389.
[44]
Edmund Wong, Jinqiu Yang, and Lin Tan. 2013. AutoComment: Mining question
and answer sites for automatic comment generation. In International Conference
on Automated Software Engineering (ASE). IEEE, 562–567.
[45]
Di Yang, Aftab Hussain, and Cristina Videira Lopes. 2016. From query to usable
code: an analysis of Stack Overflow code snippets. In Working Conference on
Mining Software Repositories (MSR). ACM, 391–402.
[46]
Ziyu Yao, Daniel S. Weld, Wei-Peng Chen, and Huan Sun. 2018. StaQC: A
Systematically Mined Question-Code Dataset from Stack Overflow. In WWW
2018: The 2018 Web Conference.
[47]
Pengcheng Yin and Graham Neubig. 2017. A Syntactic Neural Model for
General-Purpose Code Generation. In Meeting of the Association for Computational
Linguistics (ACL).
[48]
Alexey Zagalsky, Ohad Barzilay, and Amiram Yehudai. 2012. Example Overflow:
Using Social Media for Code Recommendation. In International Workshop on
Recommendation Systems for Software Engineering (RSSE). IEEE Press, 38–42.