Towards README-EVAL: Interpreting README File Instructions
James Paul White <jimwhite@uw.edu>, Department of Linguistics, University of Washington
Watch this GitHub repository for updates coming soon: https://github.com/jimwhite/README-EVAL
Humans learn natural language in rich perceptual contexts: sensory input is processed into streams of utterance percepts and meaning percepts, and language is learned from the correlations between them. Some recent NLP research has employed grounding-inspired methods such as response-based and reinforcement learning, but the domains have either not been very rich (as in games or database queries) or been ones in which computers have poor perceptual capabilities (as in vision or robotics). To apply the concept of grounded natural language learning by machines most effectively, the most appropriate domain is that of computers themselves. The task proposed here is learning to build software packages using the instructions present in README files.
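To make the task concrete, here is a minimal sketch (the README text, function name, and expected script are hypothetical illustrations, not taken from the poster or any real package): given natural-language build instructions, the system must produce an executable build script.

```python
# Hypothetical README excerpt; the task is to ground this text in shell actions.
readme_excerpt = """\
Building from source:
  1. Run ./configure
  2. Run make
  3. Run make install
"""

def interpret_readme(text: str) -> str:
    """Placeholder for a learned model that maps README text to a build script."""
    raise NotImplementedError

# A successful system would map readme_excerpt to something like:
expected_script = "./configure\nmake\nmake install\n"
```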
S. R. K. Branavan, Nate Kushman, Tao Lei, and Regina Barzilay. 2012. Learning High-Level Planning from Text. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Volume 1, pages 126-135.
Dan Goldwasser and Dan Roth. 2013. Learning from Natural Instructions. Machine Learning, 94(2):205-232.
Related Work: Linux Plan Corpus
Neal Lesh, Charles Rich, and Candace L. Sidner. 1999. Using Plan Recognition in Human-Computer Collaboration. Courses and Lectures - International Centre for Mechanical Sciences, pages 23-32.
Nate Blaylock and James F. Allen. 2004. Statistical Goal Parameter Recognition. In Proceedings of the Fourteenth International Conference on Automated Planning and Scheduling (ICAPS).
Example from the corpus, showing a stated goal and one subject's recorded actions:
Goal: know-filespace-usage-file(rl.exe)
(find(,rl.exe,,,**dot_0**))
(du(mail,rl.exe))
The Linux Plan Corpus consists of 457 interactive shell sessions, averaging 6.1 actions each, captured from human experimental subjects attempting to satisfy one of 19 different goals, each stated as an English sentence. Although the corpus has been used successfully by these and other researchers, the natural variation in human behavior means that a corpus of this relatively small size is quite noisy. As a result, researchers have had to rely on artificially generated data such as the Monroe Plan Corpus in order to obtain results that can be compared more easily across system evaluations.
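As I understand the corpus format, each session pairs an English-stated goal with an ordered sequence of shell actions. A minimal sketch of a record type for it (the class and field names are mine, not the corpus's):

```python
from dataclasses import dataclass

@dataclass
class PlanSession:
    """One Linux Plan Corpus session: a goal statement plus the
    shell commands the subject issued while pursuing it."""
    goal: str           # e.g. "know-filespace-usage-file(rl.exe)"
    actions: list[str]  # recorded shell commands, in order

example = PlanSession(
    goal="know-filespace-usage-file(rl.exe)",
    actions=["find(,rl.exe,,,**dot_0**)", "du(mail,rl.exe)"],
)
```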
Grounded Language Learning
Analogy (& Metaphor) in Language
That man runs fast.
That computer runs fast.
run = repeating_process(action, actor, time = short)
<actor> does <action> in <time>
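A rough sketch of how that frame could be instantiated for both sentences (the template comes from above; the dictionary representation and slot values are my own illustration, not a formal semantics):

```python
# The same "run" frame covers both the literal and the metaphorical
# sentence; only the actor binding differs.
template = "<actor> does <action> in <time>"

literal = {"actor": "that man", "action": "running", "time": "a short time"}
metaphorical = {"actor": "that computer", "action": "executing instructions",
                "time": "a short time"}

for binding in (literal, metaphorical):
    sentence = template
    for slot, value in binding.items():
        sentence = sentence.replace(f"<{slot}>", value)
    print(sentence)
# -> that man does running in a short time
# -> that computer does executing instructions in a short time
```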
Douglas Hofstadter. 1995. Fluid Concepts and Creative Analogies: Computer Models of the Fundamental Mechanisms of Thought.
George Lakoff and Rafael Núñez. 2000. Where Mathematics Comes From: How the Embodied Mind Brings Mathematics into Being.
Grounding machine learning of natural language in the computer domain enables learning of non-computer domains by reversing the direction of metaphorical projection.
[Figure: four example packages (bsf, ant, java, gcc), each shown with x: Source Files and y: RPM Spec.]
V (validation): Does the dependent package still build using its "gold" maintainer-written script?
T (test): Generate labels (build scripts) for one or more of these packages.
Dependent (Source) → Dependency (Target)
From Dependencies To Validation
The package dependency DAG becomes training, test, and evaluation data: dependency targets are chosen for test (the build scripts output by the system are used to build them), and the dependency sources (the dependent packages) are used for validation (their maintainer-written build scripts are run as-is to observe whether the system-built dependencies are likely to be good).
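A minimal sketch of that split, assuming edges are (dependent, dependency) pairs (function and variable names are my assumptions; the poster describes the scheme only in prose):

```python
import random

def split_for_eval(edges: list[tuple[str, str]], test_fraction: float = 0.2,
                   seed: int = 0) -> tuple[set[str], set[str]]:
    """Pick dependency targets for test; their dependents become validation."""
    targets = sorted({tgt for _, tgt in edges})
    rng = random.Random(seed)
    test_targets = set(rng.sample(targets, max(1, int(test_fraction * len(targets)))))
    # Validation: dependents whose maintainer-written ("gold") build scripts
    # are run unchanged against the system-built versions of the test targets.
    validation_sources = {src for src, tgt in edges if tgt in test_targets}
    return test_targets, validation_sources
```

Splitting on target packages rather than on individual edges means each test package is built once by the system, while every package that depends on it contributes an independent validation check.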
Shared Public Evaluation Platform
Fedora Core 17 has 1,673 package nodes that have both a build script (averaging 6.9 lines) and some declared dependency relationship. Of these, 1,009 are leaves, and the 664 internal nodes are each the target of an average of 7 dependencies (roughly 4,600 dependency edges in total).
README-EVAL Score
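The poster does not spell out the scoring formula here; a natural definition consistent with the validation scheme above (my assumption, not the poster's stated metric) is the fraction of validation builds that succeed:

```python
def readme_eval_score(validation_results: dict[str, bool]) -> float:
    """Hypothetical score: the fraction of validation packages whose gold
    build scripts succeed against the system-built test targets."""
    return sum(validation_results.values()) / max(1, len(validation_results))
```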
I would like to make the README-EVAL evaluation system available as an easy-to-use, publicly shared service.