Computer Languages and Natural Languages
CMPS 5J
October 1, 2018
Announcements
Webcasting for lectures
Username: cmps-5j-1
Password: programming
(The video only captures what is on the screen and what the microphone picks up; remind me to repeat your questions.)
Canvas Discussions as class forum
Late work policy (added to syllabus today)
Reading + CodeHS:
Work submitted/completed after the original due date will be worth half credit.
Late work is accepted up until the end of the last week of lecture.
Labs:
Missed labs may be made up in TA office hours only with TA permission for half credit.
Exams:
Late/retry policy to be announced later.
Waitlist status (as of Oct 1 @ 10am)
I won’t be using permission codes to bypass the waitlist system.
In the last 48 hours...
Relevant Tweets
Sam Aaron (programmer and musician from recent reading) reflects on programming practice:
https://twitter.com/samaaron/status/1046319072403746817
Ryan Kirkbride (music psychologist from first-day demo video) is optimistic about this class:
https://twitter.com/ryankirkbride26/status/1046465365717516290
COBOL at UCSC
https://en.wikipedia.org/wiki/COBOL
#showerthoughts
Programming in science, towards public policy
Input: An estimate of the current state of the world, at some coarse resolution
Processing: An approximate simulation of the relevant physical phenomena and model of how we interact with that environment
Output: A prediction about the future state of the world, at some coarse resolution
We are using programming as a way to evaluate and argue for public policy.
Programming in history, towards politics
Authors (in 1787):
- Alexander Hamilton
- James Madison
- John Jay
Hamilton (2015 musical)
Computer analysis suggested authorship for the Federalist Papers (FP) in 1964. Can statistical authorship attribution tell us who wrote this NYT opinion piece?
Reflection
The people doing these projects are mostly not software engineers, and maybe they don't even consider themselves programmers. They might be data scientists. They might be climatologists, historians, or people who advise policymakers. Most likely, they are teams of people with different backgrounds. Programming is essential here, but the area of impact is something outside of a computer.
Programming can be fun, but it can also be deadly serious.
Let’s have fun?
Computer and Natural Languages
Language popularity
Computer languages of the world:
https://www.tiobe.com/tiobe-index/
(Via Silicon Valley, California’s programming culture has an outsized influence on the global programming culture)
Natural languages of the world:
https://en.wikipedia.org/wiki/List_of_languages_by_number_of_native_speakers
... of California (because we’re at a UC)
https://statisticalatlas.com/state/California/Languages
Language groupings
Natural languages
https://en.wikipedia.org/wiki/List_of_language_families
It’s easier to translate between languages within close families.
Sometimes you can understand a language you’ve never officially studied if it shares enough structure and vocabulary.
Programming languages
https://en.wikipedia.org/wiki/Programming_paradigm
This class is mostly about imperative, object-oriented languages.
My past research focused on declarative, logic programming languages.
Should the prof who studies Japanese literature teach a linguistics of English class? Sure.
One big difference
Polyglots are people who speak many languages fluently.
Natural language polyglots are rare. Most people know one or two languages.
Programming language polyglots are common. Most programmers know several languages, even spanning multiple paradigms.
Google products are written in a mixture of C++, Java, and Python. Some of their products are implemented using Google-internal languages.
Polyglot programming is not about having exceptional intelligence -- it’s about practice, support materials, and being able to Google things when you are confused.
Computer language syntax
Computers interpret language very precisely:
#Hash Tag vs #HashTag
johnny5@example.com vs johnny-5@example.com
Interpretation is language-dependent:
johnny-5 is a variable name in Lisp (like “x”)
johnny-5 is an expression in Java (like “x - 5”)
Java is case-sensitive; Lisp is mostly not (like DNS):
Example.com vs example.com
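A minimal Java sketch of the two points above, offered as an illustration rather than anything from the course materials. The class name SyntaxDemo, the method names, and the variable values are assumptions made up for this example.

```java
public class SyntaxDemo {
    // In Java, "johnny-5" is read as the expression "johnny - 5",
    // not as a single name.
    static int johnnyMinusFive(int johnny) {
        return johnny-5; // same meaning as: johnny - 5
    }

    // Java is case-sensitive: "total" and "Total" are two different variables.
    static int caseSensitiveSum() {
        int total = 1;
        int Total = 10;
        return total + Total;
    }

    public static void main(String[] args) {
        System.out.println(johnnyMinusFive(12)); // prints 7
        System.out.println(caseSensitiveSum());  // prints 11
    }
}
```

Naming two variables "total" and "Total" is legal but confusing for human readers; it appears here only to show that Java treats them as distinct.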
Natural language syntax
“My UCSC experience”
“MyUCSC experience”
“Hot dog meat”
“Hot dog meat”
When whitespace is optional
三个和尚没水喝
Sān gè héshàng méi shuǐ hē
Three monks have no water to drink.
(~Too many cooks spoil the broth.)
drawPoint(10,20,1.2942)
drawPoint( 10,  20,  1.2942)
“Draw a point at location x=10, y=20, z=1.2942”
You don’t have to segment out the words when programming, but it helps. Sometimes it is enforced by the local culture.
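One way to check that the spacing makes no difference to the computer is to run both spellings of the call. This is only a sketch: the slide does not say what drawPoint does, so the version below is a hypothetical stand-in that returns a text description instead of drawing anything.

```java
public class WhitespaceDemo {
    // Hypothetical stand-in for the slide's drawPoint: it only describes the point.
    static String drawPoint(int x, int y, double z) {
        return "point at x=" + x + ", y=" + y + ", z=" + z;
    }

    public static void main(String[] args) {
        // Same call, written with and without extra spaces.
        String compact = drawPoint(10,20,1.2942);
        String spaced  = drawPoint( 10,   20,   1.2942 );

        System.out.println(compact);
        System.out.println(compact.equals(spaced)); // prints true
    }
}
```

Java ignores the extra spaces between tokens, so the choice between the two spellings is about readability for people.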
Parsing a statement
Is that my glass of water?
¿Es ese mi vaso de agua?
那是我的水吗?(Nà shì wǒ de shuǐ ma?)
Bu benim suyum mu?
There is a glass here.
It contains water.
It’s mine.
Is that right?
int x;
x = 2*x+1;
There is a variable that holds integer values called “x”. Take the current value of x, multiply it by two, add one, and store that value back into x.
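The reading above can be tried out in a small runnable sketch. One assumption is added: x gets a starting value of 3, because Java refuses to compile code that reads a local variable before it has been given a value. The class and method names are made up for this example.

```java
public class UpdateDemo {
    // Applies the slide's statement once: take x, double it, add one, store it back.
    static int update(int x) {
        x = 2*x+1;
        return x;
    }

    public static void main(String[] args) {
        int x = 3;   // assumed starting value, not from the slide
        x = 2*x+1;   // right side is evaluated first (2*3+1), then stored into x
        System.out.println(x); // prints 7
    }
}
```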
In some finance software, each month:
rate = 1.05;
fee = 25;
balance = balance * rate + fee;
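A rough sketch of how those three statements might sit inside a program that repeats them every month. The starting balance of 1000, the three-month loop, and the names MonthlyUpdate and nextBalance are assumptions for illustration; the slide only gives the three statements.

```java
public class MonthlyUpdate {
    // One month's update, using the rate and fee from the slide.
    static double nextBalance(double balance) {
        double rate = 1.05;
        double fee = 25;
        balance = balance * rate + fee;
        return balance;
    }

    public static void main(String[] args) {
        double balance = 1000.0; // hypothetical starting balance
        for (int month = 1; month <= 3; month++) {
            balance = nextBalance(balance);
            System.out.printf("after month %d: %.2f%n", month, balance);
        }
    }
}
```

Each pass through the loop reads the old balance, computes the new one, and stores it back under the same name, just like the x example.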
Most programming languages are analytic (they don’t use inflection), but Perl is one exception.
Context-dependent meaning
In Java, does “log(s)” add the contents of s (some text) to a log file or does it compute the logarithm of s (some number)?
In Java, what is “x”?
In English, who is “him” or “the one on the left”?
Is “chat” an English or French word?
What does “bank” mean?
… as a noun?
… as a verb?
http://wordnetweb.princeton.edu/perl/webwn?s=bank
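For the log(s) question, Java can hold both meanings at once and pick one from context, namely the type of s. The sketch below is one possible illustration: the class LogDemo and its in-memory list standing in for a log file are assumptions, while Math.log is the standard library's natural logarithm.

```java
import java.util.ArrayList;
import java.util.List;

public class LogDemo {
    // Hypothetical in-memory stand-in for a log file.
    static final List<String> entries = new ArrayList<>();

    // Meaning 1: s is text, so record it.
    static void log(String s) {
        entries.add(s);
    }

    // Meaning 2: s is a number, so compute its natural logarithm.
    static double log(double s) {
        return Math.log(s);
    }

    public static void main(String[] args) {
        log("user signed in");              // Java picks the text version
        System.out.println(entries.size()); // prints 1
        System.out.println(log(1.0));       // Java picks the numeric version; prints 0.0
    }
}
```

A human reader resolves "bank" from the surrounding sentence; the Java compiler resolves log from the declared type of its argument.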
Communicating with other programmers (including yourself)
Everything a computer needs to know is in bits or wires.
We added recognizable symbols to programming languages for our own human benefit, for each other.
Non-lecture course status
R01 was due before class.
First two chunks of CodeHS due before Friday lecture.
How is lab going?