1 of 7

Can AI Preserve Our Science Legacy?

Creating software to solve your problems

NASA INTERNATIONAL SPACE APPS CHALLENGE

2022

2 of 7

PROBLEM STATEMENT

Search Through Documents

Users may need to search for a specific word across the available documents.

Semantic Analysis

Users may need document summaries, keywords, etc.

Access Documents

Users may need to access documents from terabytes of data.

3 of 7

SOLUTIONS PROPOSED

Relevancy Based Search

The main functionality is to search for words across multiple documents and rank the documents by relevancy.

Summarizer and Keyword Extractor

The main functionality is to provide a summary of the provided documents and a list of keywords.

4 of 7

Relevancy Based Search

1. Build VSM (Vector Space Model) Dictionary

Clean the text extracted from files, then iterate over each word to build the dictionary.

i. Calculate tf-idf

TF-IDF: Term Frequency and Inverse Document Frequency

ii. Normalize Values

Normalize to compensate for the effect of document length.

2. Calculate Similarity

Calculate the similarity for the input word.

3. List Relevant Documents

Sort the documents by their calculated similarity.
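The relevancy search steps on this slide could be sketched roughly as follows. This is a minimal illustration, not the team's actual code: the function names (tokenize, build_vsm, search), the log-based IDF, and the L2 length normalization are all assumptions. For a single query word, the cosine similarity with a length-normalized document vector reduces to that word's normalized weight in the document, which is what the sketch ranks by.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Clean text: lowercase it and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def build_vsm(docs):
    """Map each document name to {term: normalized tf-idf weight}.

    `docs` is a dict of {name: raw text}. Names are hypothetical.
    """
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    n_docs = len(counts)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for c in counts.values() for term in c)
    vsm = {}
    for name, c in counts.items():
        # tf-idf with raw term counts and a plain log IDF (an assumption).
        weights = {t: tf * math.log(n_docs / df[t]) for t, tf in c.items()}
        # L2 normalization compensates for document length.
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vsm[name] = {t: (w / norm if norm else 0.0) for t, w in weights.items()}
    return vsm


def search(vsm, word):
    """Return (name, score) pairs for documents containing `word`, best first."""
    term = word.lower()
    scores = {name: vec.get(term, 0.0) for name, vec in vsm.items()}
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, score) for name, score in ranked if score > 0]


if __name__ == "__main__":
    docs = {
        "a": "rocket engine rocket fuel",
        "b": "mars rover wheels",
        "c": "engine test stand",
    }
    print(search(build_vsm(docs), "engine"))
```

Note that a word appearing in every document gets an IDF of zero in this sketch, so it matches nothing. A smoothed IDF would be one way to change that behavior.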

5 of 7

Summarizer

1. Data Pre-processing

2. Extract Sentences

3. Score Sentences

4. Select Top Sentences

5. Generate Summary
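One way the summarizer pipeline on this slide might look is a frequency-based extractive summarizer. The slide does not say how sentences are scored, so the scoring rule here (average word frequency per sentence), the small stopword list, and the names split_sentences and summarize are assumptions made for illustration only.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {
    "the", "a", "an", "of", "to", "in", "and", "is", "are",
    "for", "on", "that", "this", "it", "with", "as", "by",
}


def content_words(text):
    """Pre-processing: lowercase, keep alphabetic tokens, drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def split_sentences(text):
    """Naive sentence extraction: split after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, n_sentences=2):
    """Return the top-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(content_words(text))

    def score(sentence):
        # Average frequency of the sentence's content words (assumed rule).
        tokens = content_words(sentence)
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    top = sorted(ranked[:n_sentences])  # restore document order
    return " ".join(sentences[i] for i in top)


if __name__ == "__main__":
    text = ("The rover landed on Mars. "
            "The rover collected Mars soil samples. "
            "Engineers celebrated.")
    print(summarize(text, n_sentences=1))
```

Averaging rather than summing the word frequencies keeps long sentences from winning on length alone. Either choice would fit the steps listed on the slide.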

6 of 7

FUTURE EXTENSIONS

Automated VSM Creation

Introduce Databases

Enable Multi-phrase Search

Document Section Based Summaries

7 of 7

Thanks

Do you have any questions?
