1 of 44

Neural Machine Translation English-Bengali

Rishi Dey Chowdhury

Ayush Bilkhiwal

Sujeet Kumar

Confidential

Customized for Lorem Ipsum LLC

Version 1.0

2 of 44

Hello! হ্যালো!

English Input: How are you?
Model-generated scores: তুমি কেমন আছো?: 0.80, তুমি কেমন আছ?: 0.70
Reference: তুমি কেমন আছো?


3 of 44

Outline

Overview

Problems to solve

Strategy

Model Architecture

Hyperparameters

Model Performance

Translation Results

Conclusion


4 of 44

Overview

  • Language is at the heart of communication. With growing computational power and a growing need to convert English content into Indic languages so that it is more accessible to local people, neural network based methods have overtaken the earlier statistical methods and produce better machine (automated) translations.
  • Here we explore one such direction in Neural Machine Translation: converting English content into a North-Eastern language (in our case, Bengali).

5 of 44

Project objective

  • Create baseline systems using different architectures for the English-Bengali language pair.
  • Identify features that can be used to build feature-based MT systems.

6 of 44

Literature Survey

7 of 44

  • A detailed search of previous work on Neural Machine Translation for English to Bengali (and vice versa) revealed that very few approaches have been tried for this language pair.
  • The methods adopted in these papers were mostly simple RNN, LSTM and GRU models [1].
  • Other efforts use back translation to increase the number of training examples and so train a better model [2].
  • As far as our search shows, only one paper used a BiLSTM with attention and a modern self-attention based Transformer architecture [3].
  • This is mostly due to the lack of a large parallel corpus.


8 of 44

Datasets

  • English-Bangla (en-bn) datasets from Samanantar and WAT Indic (Workshop on Asian Translation).
  • The Samanantar dataset contains 92,51,702 parallel sentences. The WAT Indic dataset contains 2390 parallel sentences.
  • The data covers various domains such as news, sports, tech, entertainment, lifestyle, education, business, and general.
  • Training
    • 1,54,836 parallel sentences from Samanantar for our small model.
    • 7,28,047 parallel sentences from Samanantar for our large model.
    • The English and Bengali data contain 1,16,349 and 1,48,459 unique words respectively.

9 of 44

Datasets

  • Validation
    • 3694 sentences taken from Samanantar.
    • 1194 sentences from WAT Indic.
  • Testing
    • 1194 sentences from WAT Indic.

10 of 44

Data Preprocessing

  • Byte-Pair Encoding to tokenize both the train and test data.
  • Vocabulary size of 32,000 for both English and Bengali language.
  • Normalization
    • Convert the sentences to lower case, then apply standard Unicode normalization.
  • Tokenization
    • Add the bos (beginning of sentence) token at the start and the eos (end of sentence) token at the end of each encoded sentence in both languages.
    • Use the unk (unknown) token for unknown subwords and the pad (padding) token to pad the sentences.
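
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the NFKC normalization form, the token strings, the integer ids and the toy vocabulary are all assumptions, and whitespace splitting stands in for the Byte-Pair Encoding step.

```python
import unicodedata

# Special tokens named on the slide; the strings and ids are hypothetical.
PAD, UNK, BOS, EOS = "<pad>", "<unk>", "<bos>", "<eos>"


def normalize(sentence: str) -> str:
    """Lowercase, then apply Unicode normalization (NFKC is an assumption)."""
    return unicodedata.normalize("NFKC", sentence.lower()).strip()


def encode(sentence: str, vocab: dict[str, int], max_len: int) -> list[int]:
    """Map tokens to ids, wrap with bos/eos, and pad to max_len.

    A real pipeline would split into BPE subwords here instead of on spaces.
    Sentences longer than max_len are truncated, which can drop the eos token.
    """
    tokens = [BOS] + normalize(sentence).split() + [EOS]
    ids = [vocab.get(tok, vocab[UNK]) for tok in tokens][:max_len]
    return ids + [vocab[PAD]] * (max_len - len(ids))


# Toy vocabulary, for illustration only.
vocab = {PAD: 0, UNK: 1, BOS: 2, EOS: 3, "how": 4, "are": 5, "you?": 6}
print(encode("How are YOU?", vocab, max_len=8))
```

Any word missing from the vocabulary falls back to the unk id, and every sequence in a batch ends up with the same length.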

11 of 44

Output Sentence Selection

A neural network is just a way to calculate the conditional probability of the next word in the output sentence, given the words generated so far in the output sequence and, in the case of machine translation, the input sentence. There are several ways to find the output sequence with the highest joint probability, such as Greedy Search, Beam Search and Minimum Bayes Risk. Evaluating all possible combinations is computationally very expensive, so we resort to heuristic methods that approximate the most probable output sequence. We used two:

  • Greedy Search: picks the next token id with the highest softmax probability. It works for short sentences but is not suitable in many cases.
  • Minimum Bayes Risk (MBR): generates multiple candidate translations and compares each of them with all the others using a similarity score (in our case ROUGE). Choosing the candidate with the highest similarity score gives the translation that is most in consensus with all the generated samples.
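
The MBR selection described above can be sketched as below. It is a minimal sketch under assumptions: the similarity is a unigram ROUGE F1 over whitespace tokens (the exact ROUGE variant is not stated here), the function names `rouge1_f1` and `mbr_select` are hypothetical, and the English samples are made up for illustration.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between two whitespace-tokenized sentences."""
    cand, ref = Counter(candidate.split()), Counter(reference.split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def mbr_select(samples: list[str]) -> tuple[str, list[float]]:
    """Return the sample with the highest mean similarity to all other samples.

    Expects at least two samples.
    """
    scores = []
    for i, cand in enumerate(samples):
        others = [s for j, s in enumerate(samples) if j != i]
        scores.append(sum(rouge1_f1(cand, o) for o in others) / len(others))
    best = max(range(len(samples)), key=scores.__getitem__)
    return samples[best], scores


# Hypothetical sampled translations for one input sentence.
samples = ["i am hungry", "i am hungry", "i am a hungry man", "hungry"]
best, scores = mbr_select(samples)
print(best, [round(s, 2) for s in scores])
```

The candidate that agrees most with the rest of the samples wins, which is why outliers among the samples rarely get selected.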

ROUGE SCORE =

MBR Selection Criterion:
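
One common way to write these two quantities is given below. It assumes a unigram ROUGE F1 score between a candidate $c$ and a reference $r$ (the ROUGE variant actually used is not stated here) and $n$ sampled candidates $y_1, \dots, y_n$; the symbols are introduced only for this illustration.

```latex
\mathrm{ROUGE}(c, r) = \frac{2\,P\,R}{P + R},
\qquad P = \frac{|c \cap r|}{|c|},
\qquad R = \frac{|c \cap r|}{|r|}

\hat{y} = \arg\max_{y_i} \; \frac{1}{n-1} \sum_{j \neq i} \mathrm{ROUGE}(y_i, y_j)
```

Here $|c \cap r|$ counts the unigrams shared by the two sentences, and $\hat{y}$ is the candidate with the highest average similarity to the other samples.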

12 of 44

MODELS

13 of 44

Architecture

1 Layer Transformer Architecture

We varied the Transformer's hyperparameters and trained 2 of these models to compare their performance.

14 of 44

With Attention head 0

Reference: https://jalammar.github.io/illustrated-transformer/

15 of 44

[Figure: attention heads (0-7) and the feed-forward neural network]

Reference: https://jalammar.github.io/illustrated-transformer/

16 of 44

With Multi headed Self Attention

Here we use 2 heads, which capture the importance of the word "tired" as well as the importance of the word "animal".

If we add all the attention heads to the picture, however, things become harder to interpret.

Reference: https://jalammar.github.io/illustrated-transformer/
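
To make the idea of several attention heads concrete, here is a toy sketch of multi-head self-attention in plain Python. It is a simplification under stated assumptions: the learned query, key, value and output projections of a real Transformer are omitted, each head simply attends over its own slice of the embedding, and the input vectors are made up.

```python
import math


def softmax(xs: list[float]) -> list[float]:
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention for one head; q, k, v are lists of vectors."""
    d_k = len(k[0])
    out, weights = [], []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d_k) for kj in k]
        w = softmax(scores)
        weights.append(w)
        out.append([sum(wj * vj[d] for wj, vj in zip(w, v))
                    for d in range(len(v[0]))])
    return out, weights


def multi_head_self_attention(x, num_heads):
    """Split each embedding into num_heads slices, attend per head, concatenate."""
    d_head = len(x[0]) // num_heads
    per_head = []
    for h in range(num_heads):
        part = [row[h * d_head:(h + 1) * d_head] for row in x]
        out, _ = attention(part, part, part)
        per_head.append(out)
    # Concatenate the head outputs position by position.
    return [sum((per_head[h][t] for h in range(num_heads)), [])
            for t in range(len(x))]


# Three positions with a 4-dimensional embedding, split across 2 heads.
x = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
y = multi_head_self_attention(x, num_heads=2)
print(len(y), len(y[0]))
```

Each head produces its own set of attention weights over the positions, which is what lets different heads focus on different words.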

17 of 44

Architecture

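| Hyperparameter | 8 Heads | 8 Heads Big |
| --- | --- | --- |
| Number of Layers | 1 | 1 |
| Number of Heads | 8 | 8 |
| Embedding Dimension | 256 | 256 |
| Key Dimension | 32 | 32 |
| Value Dimension | 32 | 32 |
| Number of Parallel Sentences | 1,54,836 | 7,28,047 |
| Epochs | 10 | 10 |
| Batch Size | 256 | 256 |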
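
As a rough sense of scale for these hyperparameters, the sketch below counts the weights in the attention projections of one attention block. It assumes a standard multi-head attention layout without bias terms; the class and method names are hypothetical and the result is illustrative, not a parameter count reported for these models.

```python
from dataclasses import dataclass


@dataclass
class AttentionConfig:
    num_heads: int
    d_model: int  # embedding dimension
    d_k: int      # key (and query) dimension per head
    d_v: int      # value dimension per head

    def attention_params(self) -> int:
        """Weights in the Q, K, V and output projections of one attention block."""
        qk = 2 * self.d_model * self.num_heads * self.d_k
        v = self.d_model * self.num_heads * self.d_v
        out = self.num_heads * self.d_v * self.d_model
        return qk + v + out


small = AttentionConfig(num_heads=8, d_model=256, d_k=32, d_v=32)
# With 8 heads of key dimension 32, the heads together span the 256-dim embedding.
print(small.num_heads * small.d_k, small.attention_params())
```

Under these assumptions, every added layer repeats this cost (plus the feed-forward weights), which is relevant when the training set is small.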

18 of 44

Architecture

2 Layer Transformer Architecture

We varied the Transformer's hyperparameters and trained 4 of these models to compare their performance.

19 of 44

Architecture

| Hyperparameter | 4 Heads | 8 Heads | 8 Heads Big | 8 Heads Dim Mod |
| --- | --- | --- | --- | --- |
| Number of Layers | 2 | 2 | 2 | 2 |
| Number of Heads | 4 | 8 | 4 | 8 |
| Embedding Dimension | 256 | 256 | 256 | 512 |
| Key Dimension | 32 | 32 | 32 | 64 |
| Value Dimension | 32 | 32 | 32 | 64 |
| Number of Parallel Sentences | 1,54,836 | 1,54,836 | 7,28,047 | 1,54,836 |
| Epochs | 10 | 10+10 | 10 (pretrained weights from 8-heads)+10 | 10+10 |
| Batch Size | 256 | 256 | 256 | 256 |

20 of 44

Architecture

3 Layer Transformer Architecture

Hyperparameters:

Number of Layers = 3

Number of Heads = 4

Embedding Dimension = 256

Key Dimension = 32

Value Dimension = 32

Parallel Sentences = 1,54,836

Epochs = 10

Batch Size = 256

21 of 44

Architecture

4 Layer Transformer Architecture

We varied the Transformer's hyperparameters and trained 2 of these models to compare their performance.

22 of 44

Architecture

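| Hyperparameter | 4 Heads | 4 Heads Big |
| --- | --- | --- |
| Number of Layers | 4 | 4 |
| Number of Heads | 4 | 4 |
| Embedding Dimension | 256 | 256 |
| Key Dimension | 32 | 32 |
| Value Dimension | 32 | 32 |
| Number of Parallel Sentences | 1,54,836 | 7,28,047 |
| Epochs | 10 | 10 |
| Batch Size | 256 | 256 |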

23 of 44

RESULTS

24 of 44

Results

  • For 1 layer, we compare the accuracy of 8 Heads and 8 Heads Big.

For smaller dataset

For larger dataset

25 of 44

Results

  • For 1 layer, we compare the loss of 8 Heads and 8 Heads Big.

For smaller data

For larger data

26 of 44

Results

| Metric | 8 Heads | 8 Heads Big |
| --- | --- | --- |
| BLEU | 1.51 | 3.03 |
| chrF2 | 25.52 | 29.33 |
| TER | 100.29 | 96.24 |
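
TER can exceed 100 because it divides the number of edits by the length of the reference, so a hypothesis much longer than its reference needs more edits than the reference has words. The sketch below illustrates this with a simplified TER that uses word-level edit distance only; real TER also allows block shifts, and the function names and example sentences here are hypothetical.

```python
def word_edit_distance(hyp: list[str], ref: list[str]) -> int:
    """Levenshtein distance over words (insert, delete, substitute)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        cur = [i]
        for j, r in enumerate(ref, start=1):
            cur.append(min(prev[j] + 1,              # delete h
                           cur[j - 1] + 1,           # insert r
                           prev[j - 1] + (h != r)))  # substitute or match
        prev = cur
    return prev[-1]


def simple_ter(hypothesis: str, reference: str) -> float:
    """Edits per reference word, as a percentage (block shifts are ignored)."""
    hyp, ref = hypothesis.split(), reference.split()
    return 100.0 * word_edit_distance(hyp, ref) / len(ref)


print(simple_ter("i am hungry", "i am hungry"))
print(simple_ter("but but but but but but", "i love you"))
```

A degenerate output that repeats one word many times is penalized heavily by this measure, while a perfect match scores 0.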

27 of 44

Manual Evaluation of 8 Heads on Test Data

| English Sentence | Bengali Translation | Adequacy | Fluency |
| --- | --- | --- | --- |
| Are we leaving for good? | আমরা ভাল যাব? | 0 | 0 |
| Investigation was taken away. | তদন্ত করে খুঁজতে তদন্ত করা হয়েছে। | 0 | 0 |
| The palace is an extended part of a huge complex. | প্রাসাদটি একটি জটিল জটিল একটি জটিল স্তর ধারণ করে। | 0 | 0 |
| Then he abruptly disappeared. | তারপর হঠাৎ হারিয়ে গেলো সে। | 4 | 4 |
| He had a rope tied around his waist. | তার কোমর ঘিরে ফেলে তিনি। | 1 | 4 |
| There have been numerous ideas and attempts to reduce the amount of carbon emissions. | অনেকগুলি অণুযায়ী অনেক কিছু সফ্টভ্যতার জন্য প্রয়োজনীয় উপাদান সরবরাহ করা হয়েছে। | 0 | 0 |
| In India, faith and Nature have had a deep link since ancient times. | ভারতে, বিশ্বাস এবং প্রকৃতি এবং প্রকৃতি যে প্রাচীন কালে ছিল ভারত। | 2 | 1 |

28 of 44

Manual Evaluation of 8 Heads on Test Data

| English Sentence | Bengali Translation | Adequacy | Fluency |
| --- | --- | --- | --- |
| What kind? | কোন ধরনের? | 4 | 4 |
| India has emerged as a bright spot in the global economy which is driving global growth as well. | ভারতে বিশ্ব অর্থনীতি সারা বিশ্বে একটি উজ্জ্বল শিল্প সঞ্চার করে এবং বিশ্বের অর্থনীতি বিশ্বব্যাপী বিনিয়োগ করছে। | 2 | 2 |
| Do you think it is possible for mere humans to come to know our almighty Creator, as stated here in the Bible? | আপনার কি মনে হয়, তুমি কি জানো আর আমাদের সৃষ্টিকর্তা সম্বন্ধে জানতে পারবে না? | 3 | 2 |

29 of 44

Results

| Heads | English | Bengali Translation: MBR Score (10 samples) | Reference Translation |
| --- | --- | --- | --- |
| 8 | I love you. | তোমায় ভালোবাসি।?: 0.83, আমি তোমায় ভালবাসি।: 0.83 | আমি তোমাকে ভালোবাসি। |
| 8 | How are you. | তুমি কেমন আছ।: 0.93, তুমি কেমন আছ?: 0.89 | তুমি কেমন আছো। |
| 8 | I am hungry. | আমি ক্ষুধার্ত মানুষ।: 0.80, আমি ক্ষুধার্ত।: 0.74 | আমি ক্ষুধার্ত। |
| 8 | I am a boy. | আমি ছেলে।: 0.98, আমি তো ছেলে।: 0.85 | আমি ছেলে। |
| 8 Big | I love you. | ভালোবাসি।: 0.36, আমার সঙ্গে ভালবাসার সম্পর্ক।: 0.36 | আমি তোমাকে ভালোবাসি। |
| 8 Big | How are you. | কেমন আছো তুমি।: 0.99, তুমি কেমন আছো।: 0.99 | তুমি কেমন আছো। |
| 8 Big | Hyderabad is a beautiful city. | হায়দরাবাদের সুন্দর শহর।: 0.8576, হায়দ্রাবাদের একটি শহর।: 0.8573 | হায়দ্রাবাদ একটি সুন্দর শহর। |
| 8 Big | My name Rishi. | আমার নাম ঋষি।: 0.87, আমার নাম।: 0.83 | আমার নাম ঋষি। |

30 of 44

Results

  • For 2 layers, we compare the accuracy of 4 Heads and 8 Heads.

31 of 44

Results

  • For 2 layers, we compare the loss of 4 Heads and 8 Heads.

32 of 44

Results

  • For 2 layers, we compare the accuracy of 8 Heads and 8 Heads Big.

33 of 44

Results

  • For 2 layers, we compare the accuracy of 8 Heads Big and 8 Heads Dim Mod.

The same trend holds for the losses as well.

34 of 44

Results

| Metric | 4 Heads | 8 Heads | 8 Heads Big | 8 Heads Dim Mod |
| --- | --- | --- | --- | --- |
| BLEU | 0.80 | 1.07 | 2.82 | 0.06 |
| chrF2 | 20.25 | 20.48 | 29.59 | 5.04 |
| TER | 107.92 | 98.69 | 93.91 | 99.45 |

35 of 44

Manual Evaluation of 8 Heads Big on Test Data

| English Sentence | Bengali Translation | Adequacy | Fluency |
| --- | --- | --- | --- |
| His demise is anguishing. | তাঁর মৃত্যু মহাসমাঢ়। | 4 | 3 |
| This is the ninth interaction in the series by the Prime Minister through video conference with the beneficiaries of various Government schemes. | প্রধানমন্ত্রীর বিভিন্ন প্রকল্পের মাধ্যমে এই আলোচনা সভা ছাড়াও প্রধানমন্ত্রী বিভিন্ন ধরনের সচিবদের সঙ্গে আলাপ-আলোচনা করবেন। | 0 | 1 |
| He said that the Union Government is working with an approach of “isolation to integration” to develop all the hitherto under-developed parts of the country. | প্রধানমন্ত্রী বলেছেন, দেশের সার্বিক উন্নয়নের লক্ষ্যে কেন্দ্রীয় সরকার একযোগে কাজ করছে। | 1 | 4 |
| Imran khan taking oath | শপথ নিলেন ইমরান খান। | 4 | 4 |
| Samsung has been heavily rumoured to launch two new mid-end smartphones the Galaxy J7 (2017) and Galaxy J5 (2017). | স্যামসাং গ্যালাক্সি এম ০১ (২০১৭) এবং স্যামসাং (২০১৭-১৮৯), স্যামসাং এর দুটি নতুন স্মার্টফোন বাজারে এসেছে। | 1 | 3 |
| Hence, the people in the area are in panic. | ফলে আতঙ্কে রয়েছে এলাকাবাসী। | 4 | 4 |
| The issue has not come to my notice. | বিষয়টি আমার নজরে আসেনি। | 4 | 4 |

36 of 44

Manual Evaluation of 8 Heads Big on Test Data

| English Sentence | Bengali Translation | Adequacy | Fluency |
| --- | --- | --- | --- |
| But there is no use. | কিন্তু তাতে কোনও লাভ হয়নি। | 2 | 4 |
| I just hate feeling helpless. | আমি শুধু অসহায় বোধ করি। | 1 | 4 |
| I congratulate the Finance Minister Arun Jaitley Jee for presenting an excellent Budget. | এ নিয়ে বাজেট বক্তৃতায় প্রধানমন্ত্রী নরেন্দ্র মোদীর সঙ্গে বিভিন্ন বাজেটের জন্য শুভেচ্ছা জানাই। | 0 | 0 |

37 of 44

Results

| Heads | English | Bengali Translation: MBR Score (10 samples) | Reference Translation |
| --- | --- | --- | --- |
| 4 | I love you. | আমি তোমাকে ভালোবাসি।: 0.98, আমি তোমাকে ভালবাসি।: 0.97 | আমি তোমাকে ভালোবাসি। |
| 4 | Thank You! | ধন্যবাদ!: 0.99, ধন্যবাদ তোমার!: 0.72 | ধন্যবাদ! |
| 4 | Modiji is India's Prime Minister. | কিন্তু নরেন্দ্র মোদী সরকার।: 0.81, নরেন্দ্র মোদী সরকারের ভারত।: 0.77 | মোদিজি ভারতের প্রধানমন্ত্রী |
| 8 | I am tired. | আমি ক্লান্ত হয়ে গেছি।: 0.93, ক্লান্ত হয়ে গেছি।: 0.86 | আমি ক্লান্ত |
| 8 | How are you? | তুমি কেমন আছো?: 0.80, তুমি কেমন আছ?: 0.70 | তুমি কেমন আছো |
| 8 | Hello! | হ্যালো! | হ্যালো! |
| 8 | Let's start! | আসুন শুরু যাক! | চল শুরু করি! |
| 8 Big | Let's Start! | চলো, শুরু করি!: 0.72, চলো শুরু করছি!: 0.72 | চল শুরু করি! |
| 8 Big | I like Durga Puja very much. | আমি দুর্গা পূজা করতে খুব পছন্দ করি।: 0.76, আমার কাছে দুর্গা পূজা খুব ভাল লাগে।: 0.76, আমি দুর্গা পূজা খুব ভালোবাসি।: 0.75 | আমি দুর্গা পূজা করতে খুব পছন্দ করি। |
| 8 Big | I am hungry | আমি ক্ষুধার্ত।: 0.95, আমার ক্ষুধার্ত।: 0.87 | আমি ক্ষুধার্ত। |
| 8 Dim | I love you. | কিন্তু অপরিবর্তিত থাকতে পেরেছিলেন ঠিক।: 0.52, ১৩ ঈশ্বরের বাক্য বাইবেল জোর নেই।: 0.52 | আমি তোমাকে ভালোবাসি। |

38 of 44

Results

  • For 3 Layers & 4 Heads

39 of 44

Results

| Metric | 4 Heads |
| --- | --- |
| BLEU | 0.01 |
| chrF2 | 2.85 |
| TER | 211.89 |

| Heads | English | Bengali Translation: MBR Score (10 samples) | Reference Translation |
| --- | --- | --- | --- |
| 4 | I love you. | কলকাতা, নোয়াখালী মতো-এর নিচে পূর্ণ করে তাদের কম কাছে খুশি করা এবং অন্যটি যা বিভিন্ন পরিশ্রম হয়।: 0.37, তারা গোকীতে আমি এই ভূমিকা।: 0.37 | আমি তোমাকে ভালোবাসি। |

Manual Evaluation of all examples gives 0 Adequacy and 0 Fluency

40 of 44

Results

  • For 4 layers, we compare the accuracy of 4 Heads and 4 Heads Big.

41 of 44

Results

  • For 4 layers, we compare the loss of 4 Heads and 4 Heads Big.

42 of 44

Results

| Metric | 4 Heads | 4 Heads Big |
| --- | --- | --- |
| BLEU | 0.00 | 0.00 |
| chrF2 | 3.29 | 2.18 |
| TER | 429.87 | 100.00 |

| Heads | English | Bengali Translation: MBR Score (10 samples) | Reference Translation |
| --- | --- | --- | --- |
| 4 | I love you. | কিন্তু কিন্তু কিন্তু কিন্তু কিন্তু … (the word কিন্তু repeated throughout the output) | আমি তোমাকে ভালোবাসি। |
| 4 Big | I love you. | দেশের ছোট সরকারি ওঁকে সামনে।: 0.43, রাস্তায় কোন তেমনটা কোন পেয়েছে পারে কারো এখানে এ গ্রেপ্তারবেন হয় করে সঙ্গে।: 0.43 | আমি তোমাকে ভালোবাসি। |

Manual Evaluation of all examples gives 0 Adequacy and 0 Fluency

43 of 44

Conclusion

  • The model has trouble fitting as it grows in complexity while having little data available to train all the parameters.
  • Of all the models trained, it is striking that the least complex one, the Transformer with 1 layer and 8 heads, seems to perform best. The likely reason is that the training data is not very large.
  • A model exposed to more data performs better than one exposed to less data.
  • A model trained on more data translates named entities better, even ones it has not seen before in the data (such as the name Rishi, or place names like Hyderabad).
  • The model learns associations between words quite well, e.g. "Modiji" is converted to "Narendra Modi" in translations because the latter appears many times in the dataset.
  • Larger models tend to be robust to punctuation marks.
  • The model has trouble discriminating between spellings such as ভাল and ভালো.

44 of 44

| English | Bengali Translation: MBR Score | Reference Translation |
| --- | --- | --- |
| Thank You! | ধন্যবাদ!: 0.99, ধন্যবাদ তোমার!: 0.72 | ধন্যবাদ! |