ABSTRACT
This article explores the natural language generation capabilities of large language models with application to the production of two types of learning resources common in programming courses. Using OpenAI Codex as the large language model, we create programming exercises (including sample solutions and test cases) and code explanations, assessing these qualitatively and quantitatively. Our results suggest that the majority of the automatically generated content is both novel and sensible, and in some cases ready to use as is. When creating exercises we find that it is remarkably easy to influence both the programming concepts and the contextual themes they contain, simply by supplying keywords as input to the model. Our analysis suggests that there is significant value in massive generative machine learning models as a tool for instructors, although there remains a need for some oversight to ensure the quality of the generated content before it is delivered to students. We further discuss the implications of OpenAI Codex and similar tools for introductory programming education and highlight future research streams that have the potential to improve the quality of the educational experience for both teachers and students alike.
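The keyword-steering idea described above (supplying programming concepts and a contextual theme as input to influence the generated exercise) can be sketched as a small prompt-building helper. This is a minimal illustrative sketch, not the authors' actual prompts: the function name, prompt layout, and the commented-out model call (including the model name) are all assumptions.

```python
# Hypothetical sketch of keyword-primed exercise generation.
# The prompt format below is an illustrative assumption, not the
# exact priming text used in the article.

def build_exercise_prompt(concepts, theme):
    """Assemble a prompt header listing the programming concepts
    and the contextual theme the generated exercise should use."""
    return (
        '"""Exercise\n'
        f"keywords: {', '.join(concepts)}\n"
        f"theme: {theme}\n"
        'Write a problem statement, a sample solution, and test cases.\n'
        '"""\n'
    )

prompt = build_exercise_prompt(["for loop", "list"], "ice hockey")

# The prompt would then be sent to a large code model, e.g. via the
# OpenAI completions endpoint (model name is an assumption):
# completion = openai.Completion.create(
#     model="code-davinci-002", prompt=prompt, max_tokens=512)
print(prompt)
```

Because the keywords appear verbatim in the prompt, the model's continuation tends to weave both the named concepts and the theme into the exercise it produces, which is the steering effect the abstract reports.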