1 of 18

History of AI

2 of 18

1.Inception of AI (1943 – 1955)

  • The first work that is now generally recognized as AI was done by Warren McCulloch and Walter Pitts (1943).
  • They drew on three sources:
    • Knowledge of the basic physiology and function of neurons in the brain.
    • A formal analysis of propositional logic (AND, OR, NOT, IMPLICATION).
    • Turing’s theory of computation.

They proposed a model of artificial neurons in which each neuron is characterized as being “on” or “off”, with a switch to “on” occurring in response to stimulation by a sufficient number of neighbouring neurons.
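A minimal sketch of such a threshold neuron, written in Python for illustration. The function names and the way inhibition is modelled (a negated input with threshold 0) are assumptions made for this example, not the notation of the 1943 paper.

```python
def mp_neuron(inputs, threshold):
    """Return 1 ("on") if the summed stimulation reaches the threshold, else 0 ("off")."""
    return 1 if sum(inputs) >= threshold else 0


def logic_and(a, b):
    # Both neighbours must be on to switch this neuron on.
    return mp_neuron([a, b], threshold=2)


def logic_or(a, b):
    # A single active neighbour is sufficient.
    return mp_neuron([a, b], threshold=1)


def logic_not(a):
    # Assumption: an inhibitory input is modelled as a negated signal.
    return mp_neuron([-a], threshold=0)


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "AND:", logic_and(a, b), "OR:", logic_or(a, b))
    print("NOT 0:", logic_not(0), "NOT 1:", logic_not(1))
```

Changing only the threshold turns the same unit from AND into OR, which is the sense in which the connectives of propositional logic can be built from simple neuron-like elements.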

HEBBIAN LEARNING: McCulloch and Pitts also suggested that suitably defined networks could learn. Donald Hebb (1949) demonstrated a simple updating rule for modifying the connection strengths between neurons; the rule is now called Hebbian learning.
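Hebb's rule is often summarized as "strengthen a connection when the neurons on both sides are active together". The sketch below uses the common textbook form, weight change = learning rate × input × output. The learning rate, the starting weights, and the function name are assumptions chosen for illustration.

```python
def hebbian_update(weights, inputs, output, learning_rate=0.1):
    """Return new weights after one Hebbian step: w <- w + rate * input * output."""
    return [w + learning_rate * x * output for w, x in zip(weights, inputs)]


# Two input connections; only the first input fires together with the output.
weights = [0.0, 0.0]
for _ in range(3):
    weights = hebbian_update(weights, inputs=[1, 0], output=1)

print(weights)  # the first weight has grown, the second is unchanged
```

Only the connection whose input was active together with the output is strengthened; the inactive connection keeps its original strength.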

3 of 18

1.Inception of AI (1943 – 1955)

  • 1950: Two undergraduate students at Harvard, Marvin Minsky and Dean Edmonds, built the first neural network computer, SNARC, which used 3000 vacuum tubes and a surplus automatic pilot mechanism from a B-24 bomber to simulate a network of 40 neurons.

  • 1950: Alan Turing published “Computing Machinery and Intelligence”, in which he introduced the Turing test, machine learning, genetic algorithms, and reinforcement learning. He suggested that it would be easier to create human-level AI by developing learning algorithms and then teaching the machine than by programming its intelligence by hand.

4 of 18

2.The Birth of AI (1956)

  • McCarthy convinced Minsky, Claude Shannon, and Nathaniel Rochester to help him bring together U.S. researchers interested in automata theory, neural nets, and the study of intelligence.
  • They organized a two-month workshop at Dartmouth in the summer of 1956. The proposal called for “a 2 month, 10 man study of artificial intelligence” to be carried out during the summer of 1956 at Dartmouth College in Hanover, New Hampshire.
  • The attempt was to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves.

5 of 18

2. 1956: Logic Theorist (Allen Newell and Herbert Simon)

  • “We have invented a computer program capable of thinking non-numerically, and thereby solved the venerable mind–body problem.”

6 of 18

3.Early enthusiasm, great expectations(1952-1969)

Only a few years earlier, computers had been seen as machines for arithmetic and nothing else. The intellectual establishment, by and large, preferred to believe that

“a machine can never do X”.

AI researchers responded by demonstrating one X after another, focusing on tasks such as games, puzzles, mathematics, and IQ tests.

General Problem Solver (GPS): Newell and Simon’s early success with Logic Theorist was followed up by GPS. Unlike Logic Theorist, this program was designed from the start to imitate human problem-solving protocols. Within the limited class of puzzles it could handle, it turned out that the order in which the program considered subgoals and possible actions was similar to that in which humans approached the same problems.

Physical symbol system hypothesis: Newell and Simon formulated the hypothesis that “a physical symbol system has the necessary and sufficient means for general intelligent action.” What they meant is that any system (human or machine) exhibiting intelligence must operate by manipulating data structures composed of symbols.

7 of 18

3.Early enthusiasm, great expectations(1952-1969)

At IBM, Nathaniel Rochester and his colleagues produced some of the first AI programs.

1952: Arthur Samuel began writing a series of programs for checkers (draughts) that eventually learned to play at a strong amateur level.

1958: At the MIT AI Lab, McCarthy defined the high-level language Lisp, which became the dominant AI programming language for the next 30 years.

1958: McCarthy published a paper entitled “Programs with Common Sense”, in which he described the Advice Taker, a hypothetical program that can be seen as the first complete AI system.

1959: Herbert Gelernter constructed the Geometry Theorem Prover, which was able to prove theorems that many students of mathematics would find quite tricky.

8 of 18

3.Early enthusiasm, great expectations(1952-1969)

1963: Microworlds, limited domains in which problems appear to require intelligence to solve.

James Slagle’s SAINT program (1963) was able to solve closed-form calculus integration problems typical of first-year college courses.

Tom Evans’s ANALOGY program (1968) solved geometric analogy problems that appear in IQ tests.

Daniel Bobrow’s STUDENT program (1967) solved algebra story problems, such as the following:

If the number of customers Tom gets is twice the square of 20% of the number of advertisements he runs, and the number of advertisements he runs is 45, what is the number of customers Tom gets?
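For reference, the arithmetic behind this story problem can be checked in a few lines of Python. This only evaluates the equation the sentence describes; it is not a reconstruction of how STUDENT translated English into equations, and the variable names are assumptions.

```python
advertisements = 45

# "20% of the number of advertisements he runs"
twenty_percent = advertisements * 20 // 100

# "twice the square of" that quantity
customers = 2 * twenty_percent ** 2

print(twenty_percent, customers)  # prints: 9 162
```

So the answer to the sample problem is 162 customers.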

10 of 18

3.Early enthusiasm, great expectations(1952-1969)

Hebb’s learning methods were enhanced by Bernie Widrow (Widrow and Hoff, 1960; Widrow, 1962), who called his networks adalines, and by Frank Rosenblatt (1962) with his perceptrons.

The perceptron convergence theorem (Block et al., 1962) says that the learning algorithm can adjust the connection strengths of a perceptron to match any input data, provided such a match exists.
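The theorem can be illustrated with a small sketch of the classic perceptron learning rule on a data set for which a match exists (logical AND, which is linearly separable). The function names, the learning rate of 1, and the epoch limit are assumptions made for this example.

```python
def predict(weights, bias, x):
    """Threshold unit: output 1 if the weighted sum is positive, else 0."""
    activation = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if activation > 0 else 0


def train_perceptron(samples, learning_rate=1, max_epochs=100):
    """Adjust the weights after every misclassified sample until an epoch has no mistakes."""
    weights = [0] * len(samples[0][0])
    bias = 0
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in samples:
            error = target - predict(weights, bias, x)
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
                bias += learning_rate * error
        if mistakes == 0:
            break  # every sample is matched, so learning has converged
    return weights, bias


AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_DATA)

for x, target in AND_DATA:
    print(x, "target:", target, "predicted:", predict(weights, bias, x))
```

The guarantee applies only when a separating set of weights exists. For data such as XOR, where no single threshold unit can match every input, the loop above would simply stop at `max_epochs` without converging.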

11 of 18

4.A dose of reality (1966-1973)

  • The term “visible future” is open to interpretation, but Simon also made more concrete predictions: that within 10 years a computer would be chess champion, and a significant mathematical theorem would be proved by machine. These predictions came true (or approximately true) within 40 years rather than 10.
  • Simon’s overconfidence was due to the promising performance of early AI systems on simple examples. In almost all cases, however, these early systems turned out to fail miserably when tried out on wider selections of problems and on more difficult problems.

Reasons for failure:

  1. Many early AI systems were based on “informed introspection” about how humans perform a task, rather than on a careful analysis of the task and of the algorithm needed to solve it.
  2. Many of the problems that AI was trying to solve were intractable.
  3. Early AI systems worked well on microworld problems but did not scale to larger ones.
  4. The basic structures used to generate intelligent behaviour had fundamental limitations. Perceptrons, for example, could learn anything they could represent, but they could represent very little.

12 of 18

5.Knowledge-based systems: The key to power?(1969-1979)

Early AI systems relied on general-purpose search mechanisms that strung together elementary reasoning steps. These are called weak methods because, although general, they do not scale up to large or difficult problem instances.

One difficulty arose because most early programs knew nothing of their subject matter; they succeeded by means of simple syntactic manipulations.

The alternative to weak methods is to use more powerful, domain-specific knowledge that allows larger reasoning steps and can more easily handle typically occurring cases in narrow areas of expertise. One might say that to solve a hard problem, you have to almost know the answer already. The DENDRAL program was an early example of this approach.

Expert systems are knowledge-intensive systems. Stanford began the Heuristic Programming Project (HPP) to investigate how far the new methodology of expert systems could be applied to other areas.

MYCIN was a system for diagnosing blood infections. With about 450 rules, it performed better than junior doctors. It used a calculus of uncertainty called certainty factors, which seemed to fit well with how doctors assess the impact of evidence on a diagnosis.
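To give a feel for certainty factors, the sketch below applies the combination rule commonly cited for two pieces of evidence that both support the same conclusion: CF = CF1 + CF2 × (1 − CF1). The function name and the sample values are hypothetical, and the rules for negative or conflicting evidence are deliberately left out.

```python
def combine_supporting_cf(cf1, cf2):
    """Combine two certainty factors in [0, 1] that support the same hypothesis."""
    if not (0 <= cf1 <= 1 and 0 <= cf2 <= 1):
        raise ValueError("this sketch only handles supporting evidence in [0, 1]")
    # The second piece of evidence closes part of the remaining gap to certainty.
    return cf1 + cf2 * (1 - cf1)


# Hypothetical example: two rules suggest the same organism with CF 0.6 and 0.5.
combined = combine_supporting_cf(0.6, 0.5)
print(round(combined, 2))
```

Each additional supporting rule raises the combined value (here to about 0.8), but the result can never exceed 1, which matches the intuition that corroborating evidence increases confidence without ever producing more than certainty.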

13 of 18

6.AI becomes an industry(1980-present)

  • The first successful commercial expert system, R1, began operation at the Digital Equipment Corporation (McDermott, 1982).
  • The program helped configure orders for new computer systems; by 1986, it was saving the company an estimated $40 million a year.
  • By 1988, DEC’s AI group had 40 expert systems deployed, with more on the way. DuPont had 100 in use and 500 in development, saving an estimated $10 million a year. Nearly every major U.S. corporation had its own AI group and was either using or investigating expert systems.
  • In 1981, the Japanese announced the “Fifth Generation” project, a 10-year plan to build intelligent computers running Prolog.
  • Overall, the AI industry boomed from a few million dollars in 1980 to billions of dollars in 1988, including hundreds of companies building expert systems, vision systems, robots, and software and hardware specialized for these purposes.
  • Soon after that came a period called the “AI Winter,” in which many companies fell by the wayside as they failed to deliver on extravagant promises.

14 of 18

7. The return of neural networks(1986-present)

In the mid-1980s, at least four groups reinvented the back-propagation learning algorithm. It was applied to many learning problems in computer science and psychology.

The widespread dissemination of these results in the collection Parallel Distributed Processing caused great excitement. These networks have the capability to learn from examples.

These connectionist models were seen as competitors to the symbolic models promoted by Newell and Simon and to the logicist approach of McCarthy, both of which assume that, at some level, humans manipulate symbols.

A connectionist model compares its predicted output with the true (expected) output and modifies its parameters to decrease the difference.
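The compare-and-adjust loop can be sketched for a single linear unit trained by gradient descent on squared error. Back-propagation applies the same idea through several layers using the chain rule; this one-unit version, its learning rate, and its sample values are assumptions chosen to keep the example short.

```python
def train_step(weight, bias, x, target, learning_rate=0.1):
    """One gradient-descent step for the unit prediction = weight * x + bias."""
    prediction = weight * x + bias
    error = prediction - target  # difference between predicted and expected output
    # Gradient of 0.5 * error**2 is error * x for the weight and error for the bias.
    weight -= learning_rate * error * x
    bias -= learning_rate * error
    return weight, bias, error ** 2


weight, bias = 0.0, 0.0
losses = []
for _ in range(50):
    weight, bias, loss = train_step(weight, bias, x=2.0, target=1.0)
    losses.append(loss)

print("first loss:", losses[0], "last loss:", losses[-1])
```

Each step moves the parameters a little in the direction that shrinks the error, so the recorded loss falls from its initial value toward zero.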

15 of 18

8.AI adopts the scientific method(1987-present)

The brittleness of expert systems led to a new, more scientific approach:

  • Probability in place of Boolean logic.
  • Machine learning rather than hand coding.
  • Experimental results rather than philosophical claims.

Shared benchmarks became the norm for demonstrating progress:

  • The UC Irvine repository for machine learning data sets
  • The International Planning Competition for planning algorithms
  • The LibriSpeech corpus for speech recognition
  • The MNIST data set for handwritten digit recognition
  • ImageNet and COCO for image object recognition
  • SQuAD for natural language question answering

16 of 18

Hidden Markov models (HMMs) came to dominate speech recognition in the 1980s. They are based on a rigorous mathematical theory, and they are trained on a large corpus of real speech data, which ensures that performance is robust.

Note that there was no scientific claim that humans use HMMs to recognize speech; rather, HMMs provided a mathematical framework for understanding and solving the problem.

1988 was an important year for the connection between AI and other fields such as statistics, operations research, decision theory, and control theory.

Pearl’s Probabilistic Reasoning in Intelligent Systems (1988) led to a new acceptance of probability and decision theory in AI.

Pearl’s Bayesian networks yielded a rigorous and efficient formalism for representing uncertain knowledge, as well as practical algorithms for probabilistic reasoning.
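As a rough illustration of what representing uncertain knowledge means, the sketch below encodes the smallest possible network, one cause node (Disease) with one effect node (Test), and answers a query by Bayes' rule. All names and probabilities are hypothetical values invented for this example; real Bayesian networks have many nodes and use dedicated inference algorithms.

```python
# Hypothetical network: Disease -> Test
P_DISEASE = 0.01                 # prior P(disease)
P_POSITIVE_GIVEN_DISEASE = 0.9   # P(test positive | disease)
P_POSITIVE_GIVEN_HEALTHY = 0.05  # P(test positive | no disease)


def posterior_disease_given_positive():
    """Return P(disease | test positive) by enumerating both values of the cause."""
    joint_disease = P_DISEASE * P_POSITIVE_GIVEN_DISEASE
    joint_healthy = (1 - P_DISEASE) * P_POSITIVE_GIVEN_HEALTHY
    return joint_disease / (joint_disease + joint_healthy)


posterior = posterior_disease_given_positive()
print(round(posterior, 3))
```

With these made-up numbers the posterior is roughly 0.15: a positive test raises the probability of disease well above the 1% prior, yet it stays far from certain. A Boolean rule of the form "positive test implies disease" cannot express this kind of graded conclusion.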

Rich Sutton’s work on reinforcement learning was the other major contribution of 1988.

17 of 18

9. The emergence of intelligent agents(1995-present)

  • Perhaps encouraged by the progress in solving the subproblems of AI, researchers have also started to look at the “whole agent” problem again.
  • The work of Allen Newell, John Laird, and Paul Rosenbloom on SOAR (Newell, 1990; Laird et al., 1987) is the best-known example of a complete agent architecture.
  • One of the most important environments for intelligent agents is the Internet. AI systems have become so common in Web-based applications that the “-bot” suffix has entered everyday language.
  • AI technologies underlie many Internet tools, such as search engines, recommender systems, and Web site aggregators.

    • Human-level AI
    • Artificial General Intelligence (AGI)
    • Friendly AI

18 of 18

10. The availability of very large data sets(2001-present)

  • Throughout the 60-year history of computer science, the emphasis has been on the algorithm as the main subject of study. But some recent work in AI suggests that for many problems, it makes more sense to worry about the data and be less picky about what algorithm to apply.
  • This is true because of the increasing availability of very large data sources: for example, trillions of words of English and billions of images from the Web (Kilgarriff and Grefenstette, 2006); or billions of base pairs of genomic sequences.
  • Yarowsky’s (1995) work on word-sense disambiguation: given the use of the word “plant” in a sentence, does that refer to flora or factory? Previous approaches to the problem had relied on human-labeled examples combined with machine learning algorithms. Yarowsky showed that the task can be done, with accuracy above 96%, with no labeled examples at all.
  • Banko and Brill (2001) show that techniques like this perform even better as the amount of available text goes from a million words to a billion and that the increase in performance from using more data exceeds any difference in algorithm choice; a mediocre algorithm with 100 million words of unlabeled training data outperforms the best known algorithm with 1 million words.
  • Hays and Efros (2007) discuss the problem of filling in holes in a photograph.
  • The “knowledge bottleneck” in AI, the problem of how to express all the knowledge that a system needs, may be solved in many applications by learning methods rather than hand-coded knowledge engineering, provided the learning algorithms have enough data to go on (Halevy et al., 2009).
  • Kurzweil (2005) writes that “today, many thousands of AI applications are deeply embedded in the infrastructure of every industry.”