1 of 24

Computer Languages and Natural Languages

CMPS 5J

October 1, 2018

2 of 24

Announcements

3 of 24

Webcasting for lectures

https://webcast.ucsc.edu/

Username: cmps-5j-1

Password: programming

(The video only captures what is on the screen and what the microphone picks up; remind me to repeat your questions.)

4 of 24

Canvas Discussions as class forum

5 of 24

Late work policy (added to syllabus today)

Reading + CodeHS:

Work submitted/completed after the original due date will be worth half credit.

Late work is accepted up until the end of the last week of lecture.

Labs:

Missed labs may be made up in TA office hours only with TA permission for half credit.

Exams:

Late/retry policy to be announced later.

6 of 24

Waitlist status (as of Oct 1 @ 10am)

I won’t be using permission codes to bypass the waitlist system.

7 of 24

In the last 48 hours...

8 of 24

Relevant Tweets

Sam Aaron (programmer and musician from recent reading) reflects on programming practice:

https://twitter.com/samaaron/status/1046319072403746817

Ryan Kirkbride (music psychologist from first-day demo video) is optimistic about this class:

https://twitter.com/ryankirkbride26/status/1046465365717516290

9 of 24

COBOL at UCSC

https://en.wikipedia.org/wiki/COBOL

10 of 24

#showerthoughts

11 of 24

Programming in science, towards public policy

Input: An estimate of the current state of the world, at some coarse resolution

Processing: An approximate simulation of the relevant physical phenomena and model of how we interact with that environment

Output: A prediction about the future state of the world, at some coarse resolution

We are using programming as a way to evaluate and argue for public policy.

12 of 24

Programming in history, towards politics

Authors (in 1787):
- Alexander Hamilton
- James Madison
- John Jay

Hamilton (2015 musical)

Computer analysis suggested authorship for the Federalist Papers (FP) in 1964. Can statistical authorship attribution tell us who wrote this NYT opinion?
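One common ingredient in statistical authorship attribution is comparing how often different texts use small, everyday words. The sketch below is only an illustration of that idea, not the method used in the 1964 study: the marker words, the class name `FunctionWordRates`, and the sample sentences are all made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FunctionWordRates {
    // Hypothetical marker words; a real study would choose these from the data.
    static final List<String> MARKERS = List.of("upon", "while", "whilst");

    // Returns how often each marker word occurs per 1000 words of text.
    static Map<String, Double> ratesPerThousand(String text) {
        String[] words = text.toLowerCase().split("[^a-z]+");
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String m : MARKERS) {
            counts.put(m, 0);
        }
        int total = 0;
        for (String w : words) {
            if (w.isEmpty()) {
                continue; // split can produce an empty first entry
            }
            total++;
            if (counts.containsKey(w)) {
                counts.put(w, counts.get(w) + 1);
            }
        }
        Map<String, Double> rates = new LinkedHashMap<>();
        for (String m : MARKERS) {
            rates.put(m, total == 0 ? 0.0 : 1000.0 * counts.get(m) / total);
        }
        return rates;
    }

    public static void main(String[] args) {
        // Made-up sentences, not quotations from any real author.
        String sampleA = "Upon reflection, we rely upon the people.";
        String sampleB = "Whilst we wait, the people decide.";
        System.out.println(ratesPerThousand(sampleA));
        System.out.println(ratesPerThousand(sampleB));
    }
}
```

Two texts with very different rates for the same marker words are less likely to share an author; real analyses use many more words and much longer texts.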

13 of 24

Reflection

The people doing these projects are mostly not software engineers, and maybe they don't even consider themselves programmers. They might be data scientists. They might be climatologists, historians, or people who advise policymakers. Most likely, they are teams of people with different backgrounds. Programming is essential here, but the area of impact is something outside of a computer.

Programming can be fun, but it can also be deadly serious.

Let’s have fun?

14 of 24

Computer and Natural Languages

15 of 24

Language popularity

Computer languages of the world:

https://www.tiobe.com/tiobe-index/

(Via Silicon Valley, California’s programming culture has an outsized influence on the global programming culture)

16 of 24

Language groupings

Natural languages

https://en.wikipedia.org/wiki/List_of_language_families

It’s easier to translate between languages within close families.

Sometimes you can understand a language you’ve never officially studied if it shares enough structure and vocabulary.

Programming languages

https://en.wikipedia.org/wiki/Programming_paradigm

This class is mostly about imperative, object-oriented languages.

My past research focused on declarative, logic programming languages.

Should the prof who studies Japanese literature teach a linguistics of English class? Sure.

17 of 24

One big difference

Polyglots are people who speak many languages fluently.

Natural language polyglots are rare. Most people know one or two languages.

Programming language polyglots are common. Most programmers know several languages, even spanning multiple paradigms.

Google products are written in a mixture of C++, Java, and Python. Some of their products are implemented using Google-internal languages.

Polyglot programming is not about having exceptional intelligence -- it’s about having practice, support materials, and being able to Google things when you are confused.

18 of 24

Computer language syntax

Computers interpret language very precisely:

#Hash Tag vs #HashTag

johnny5@example.com vs johnny-5@example.com

Interpretation is language-dependent:

johnny-5 is a variable name in Lisp (like “x”)

johnny-5 is an expression in Java (like “x - 5”)

Java is case-sensitive, Lisp is mostly not (like DNS):

Example.com vs example.com
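A small Java sketch of the two points above, for anyone who wants to try it. The class and method names (`Johnny5`, `johnnyMinusFive`, `distinctNames`) are invented for this example.

```java
public class Johnny5 {
    // In Java, "johnny-5" is read as: the variable johnny, minus the number 5.
    static int johnnyMinusFive(int johnny) {
        return johnny-5;
    }

    // Java is case-sensitive, so "example" and "Example" are two different variables.
    static int distinctNames() {
        int example = 1;
        int Example = 2;
        return example + Example;
    }

    public static void main(String[] args) {
        System.out.println(johnnyMinusFive(12));
        System.out.println(distinctNames());
    }
}
```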

19 of 24

Natural language syntax

“My UCSC experience”

“MyUCSC experience”

“Hot dog meat”

“Hot dog meat”

20 of 24

When whitespace is optional

三个和尚没水喝

Sān gè héshàng méi shuǐ hē

Three monks have no water to drink.

(~Too many cooks spoil the broth.)

drawPoint(10,20,1.2942)

drawPoint( 10, 20, 1.2942)

“Draw a point at location x=10, y=20, z=1.2942”

You don’t have to segment out the words when programming, but it helps. Sometimes it is enforced by the local culture.
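To see that Java treats the two spellings the same way, here is a minimal sketch. The `drawPoint` method below is a hypothetical stand-in that only describes the point as text; it is not a real drawing function.

```java
public class Whitespace {
    // Hypothetical stand-in for drawPoint: it just describes the point as text.
    static String drawPoint(double x, double y, double z) {
        return "x=" + x + ", y=" + y + ", z=" + z;
    }

    // No spaces between the arguments.
    static String compact() {
        return drawPoint(10,20,1.2942);
    }

    // Extra spaces between the arguments.
    static String spaced() {
        return drawPoint( 10,  20,  1.2942 );
    }

    public static void main(String[] args) {
        // Both calls mean the same thing to Java.
        System.out.println(compact().equals(spaced()));
    }
}
```

Whitespace still matters in a few places: `int x` declares a variable, while `intx` is just a single name.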

21 of 24

Parsing a statement

Is that my glass of water?

¿Es ese mi vaso de agua?

那是我的水吗？ (Nà shì wǒ de shuǐ ma?)

Bu benim suyum mu?

There is a glass here.
It contains water.
It’s mine.
Is that right?

int x;
x = 2*x+1;

There is a variable that holds integer values called “x”. Take the current value of x, multiply it by two, add one, and store that value back into x.

In some finance software, each month:

rate = 1.05;
fee = 25;
balance = balance * rate + fee;
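The two update statements above can be tried out in a small Java sketch. Two things are assumptions made here: the starting balance of 1000, and passing in a starting value for x, since Java will not let you read a local variable before it has been given a value.

```java
public class MonthlyUpdate {
    // The statement x = 2*x+1, applied once to a starting value.
    static int update(int x) {
        x = 2*x+1; // read the current x, double it, add one, store it back
        return x;
    }

    // One month of the finance example: balance = balance * rate + fee.
    static double nextBalance(double balance) {
        double rate = 1.05;
        double fee = 25;
        return balance * rate + fee;
    }

    public static void main(String[] args) {
        System.out.println(update(3));

        double balance = 1000.0; // assumed starting balance, not from the slide
        for (int month = 1; month <= 3; month++) {
            balance = nextBalance(balance);
            System.out.println("Month " + month + ": " + balance);
        }
    }
}
```

In both cases the right-hand side is computed from the old value first, and only then is the result stored back under the same name.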

Most programming languages are analytic (they don’t use inflection), but Perl is one exception.

22 of 24

Context-dependent meaning

In Java, does “log(s)” add the contents of s (some text) to a log file or does it compute the logarithm of s (some number)?

In Java, what is “x”?

In English, who is “him” or “the one on the left”?

Is “chat” an English or French word?

What does “bank” mean?

… as a noun?

… as a verb?

http://wordnetweb.princeton.edu/perl/webwn?s=bank
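Returning to the “log(s)” question: in Java the answer depends on which methods named log are available and on the type of s. The sketch below defines both meanings side by side; the class `LogMeanings` and its in-memory list standing in for a log file are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class LogMeanings {
    // Hypothetical stand-in for a log file: a list of messages kept in memory.
    static final List<String> MESSAGES = new ArrayList<>();

    // Meaning 1: s is some text, so record it.
    static void log(String s) {
        MESSAGES.add(s);
    }

    // Meaning 2: s is some number, so compute its natural logarithm.
    static double log(double s) {
        return Math.log(s);
    }

    public static void main(String[] args) {
        log("started");               // Java picks log(String)
        System.out.println(log(1.0)); // Java picks log(double)
        System.out.println(MESSAGES);
    }
}
```

Java uses the surrounding context (the type of the argument) to choose a meaning, much as a reader uses the rest of the sentence to decide what “bank” means.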

23 of 24

Communicating with other programmers (including yourself)

All that computers need to know is in bits or wires.

We added recognizable symbols to programming languages for our own human benefit, for each other.

24 of 24

Non-lecture course status

R01 was due before class.

First two chunks of CodeHS due before Friday lecture.

How is lab going?