Introduction to Data Science

By

S.V.V.D.Jagadeesh

Sr. Assistant Professor

Dept of Artificial Intelligence & Data Science

LAKIREDDY BALI REDDY COLLEGE OF ENGINEERING

Previously Discussed Topics

  • Introduction to Neo4j
  • Basic Structures of Neo4j
  • Neo4j Representations
  • Example of Neo4j
  • Cypher, a graph query language
  • Cypher query
  • Example 2
  • Create Query
  • Final Graph

Tuesday, March 25, 2025

LBRCE

OOP

Session Outcomes

At the end of this session, students will be able to:

  • Understand the basics of text mining. (Understand-L2)

Introduction to Text Mining

  • Text mining, or text analytics, is a discipline that combines language science and computer science with statistical and machine learning techniques.
  • Text mining is used to analyze texts and turn them into a more structured form.
  • Insights are then derived from this structured form.
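As a minimal sketch of "turning text into a more structured form", the snippet below builds a bag-of-words count for each document and then derives one simple insight from it. The documents and the function names (`tokenize`, `bag_of_words`) are assumptions made for illustration; real pipelines usually add steps such as stop-word removal and stemming.

```python
import re
from collections import Counter

def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def bag_of_words(documents: list[str]) -> list[Counter]:
    """Turn each raw document into a word -> count mapping (the structured form)."""
    return [Counter(tokenize(doc)) for doc in documents]

# Hypothetical example documents, not taken from the slides.
docs = [
    "Text mining turns text into structure.",
    "Structure makes insight possible.",
]
bags = bag_of_words(docs)

# A simple insight derived from the structured form: word counts over all documents.
total = sum(bags, Counter())
print(total.most_common(3))
```

Once every document is a mapping of words to counts, standard statistical and machine learning techniques can be applied to it.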

Text Mining in Real World

  • Autocomplete and spelling correctors constantly analyze the text you type before you send an email or text message.
  • When Facebook autocompletes your status with the name of a friend, it uses a technique called named entity recognition, although this is only one component of its repertoire.
  • The goal isn’t only to detect that you’re typing a noun, but also to guess that you’re referring to a person and to recognize who it might be.
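The idea of guessing which person is being typed can be illustrated with a toy sketch. This is not how Facebook does it and it is not real named entity recognition, which relies on trained statistical models. It is only a capitalization heuristic plus a prefix lookup in a hypothetical friend list; `suggest_friends` and the names are invented for the example.

```python
def suggest_friends(status: str, friends: list[str]) -> list[str]:
    """Suggest friends whose name starts with the last word being typed.

    Only a capitalized last word is treated as a possible person name,
    which is a crude heuristic, not real named entity recognition.
    """
    words = status.split()
    if not words:
        return []
    last = words[-1]
    if not last[0].isupper():
        return []
    prefix = last.lower()
    return [name for name in friends if name.lower().startswith(prefix)]

# Hypothetical friend list for illustration only.
friends = ["Marleen Peters", "Mark Jones", "John Smith"]

print(suggest_friends("Having lunch with Mar", friends))
print(suggest_friends("Having lunch with friends", friends))
```

The first call proposes both friends whose names begin with "Mar"; the second proposes nobody, because the last word is not capitalized and is therefore not treated as a name.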

  • Google uses many types of text mining when presenting you with the results of a query.
  • What comes to mind when someone says “Chelsea”? Chelsea could be many things: a person; a soccer club; a neighborhood in Manhattan, New York, or in London; a food market; a flower show; and so on.
  • Google knows this and returns different answers to the question “Who is Chelsea?” versus “What is Chelsea?”

  • To provide the most relevant answer, Google must do (among other things) all of the following:
    ■ Preprocess all the documents it collects for named entities
    ■ Perform language identification
    ■ Detect what type of entity you’re referring to
    ■ Match a query to a result
    ■ Detect the type of content to return (PDF, adult-sensitive)
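The "detect what type of entity you’re referring to" step can be sketched for the “Who is Chelsea?” versus “What is Chelsea?” example. This is a rule-based illustration under stated assumptions, not Google's method: the question-word mapping `EXPECTED_TYPES` and the small knowledge base `ENTITIES` are hypothetical.

```python
# Hypothetical mapping from a question word to the entity types it usually asks about.
EXPECTED_TYPES = {
    "who": {"person"},
    "where": {"place"},
    "what": {"organization", "place", "event"},
}

# Hypothetical knowledge base: several entities share the name "Chelsea".
ENTITIES = [
    {"name": "Chelsea", "type": "person", "label": "a person named Chelsea"},
    {"name": "Chelsea", "type": "organization", "label": "a soccer club"},
    {"name": "Chelsea", "type": "place", "label": "a neighborhood in Manhattan"},
    {"name": "Chelsea", "type": "event", "label": "a flower show"},
]

def answer_candidates(query: str) -> list[str]:
    """Return entities named in the query whose type fits the question word."""
    words = query.lower().rstrip("?").split()
    if not words:
        return []
    wanted = EXPECTED_TYPES.get(words[0], set())
    return [
        entity["label"]
        for entity in ENTITIES
        if entity["name"].lower() in words and entity["type"] in wanted
    ]

print(answer_candidates("Who is Chelsea?"))
print(answer_candidates("What is Chelsea?"))
```

The same name yields different candidates depending on the question word, which is the behavior described for the two queries.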

  • Google uses text mining for much more than answering queries.
  • Besides shielding its Gmail users from spam, it also sorts emails into categories such as social, updates, and forums.
  • When you combine text with other logic and mathematics, it’s possible to go much further than answering simple questions.
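Sorting emails into categories is a text classification task. The sketch below uses hand-written keyword lists purely for illustration. The keywords, the `categorize` function, and the `"primary"` fallback label are assumptions; a production filter would typically learn its rules from labeled mail rather than hard-code them.

```python
import re

# Hypothetical keyword lists; a real system would learn these from labeled mail.
CATEGORY_KEYWORDS = {
    "spam": {"lottery", "winner", "prize"},
    "social": {"friend", "tagged", "followed"},
    "updates": {"invoice", "receipt", "shipped"},
    "forums": {"thread", "reply", "digest"},
}

def categorize(email_text: str) -> str:
    """Pick the category whose keywords overlap most with the email's words."""
    words = set(re.findall(r"[a-z]+", email_text.lower()))
    scores = {
        category: len(words & keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    # Fall back to a hypothetical default label when no keyword matches.
    return best if scores[best] > 0 else "primary"

print(categorize("You are a lottery winner, claim your prize"))
print(categorize("Your order has shipped, receipt attached"))
print(categorize("See you at noon"))
```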

  • Text mining has many applications, including, but not limited to:
    ■ Entity identification
    ■ Plagiarism detection
    ■ Topic identification
    ■ Text clustering
    ■ Translation
    ■ Automatic text summarization
    ■ Fraud detection
    ■ Spam filtering
    ■ Sentiment analysis

Difficulties of Text Mining

  • Text mining is a complicated task, and even many seemingly simple things can’t be done satisfactorily.
  • For instance, take the task of guessing the correct address.
  • It is difficult to return the exact result with certainty; note how Google Maps prompts you for more information when you search for “Springfield.”
  • In this case a human wouldn’t have done any better without additional context, but this ambiguity is one of the many problems you face in a text mining application.
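The “Springfield” ambiguity can be made concrete with a small lookup sketch. The gazetteer below is a hypothetical, heavily shortened stand-in for a real place database, and `resolve_place` is an illustrative name. The point is only that a name with several candidates cannot be resolved without extra context.

```python
from typing import Optional

# Hypothetical gazetteer: place name -> US states that have a city of that name.
GAZETTEER = {
    "springfield": ["Illinois", "Massachusetts", "Missouri"],
    "sacramento": ["California"],
}

def resolve_place(name: str, state: Optional[str] = None) -> str:
    """Resolve a place name, asking for more context when it is ambiguous."""
    candidates = GAZETTEER.get(name.strip().lower(), [])
    if state is not None:
        candidates = [s for s in candidates if s.lower() == state.lower()]
    if not candidates:
        return "no match"
    if len(candidates) == 1:
        return f"{name.strip().title()}, {candidates[0]}"
    return "ambiguous: please specify one of " + ", ".join(candidates)

print(resolve_place("Springfield"))
print(resolve_place("Springfield", state="Missouri"))
print(resolve_place("Sacramento"))
```

Without the state, the function can only report the ambiguity and ask for more information. Once that context is supplied, the lookup resolves to a single place.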

  • Another problem is spelling mistakes and different (correct) spelling forms of a word.
  • Take the following three references to New York: “NY,” “Neww York,” and “New York.”
  • For a human, it’s easy to see they all refer to the city of New York.
  • Because of the way our brain interprets text, understanding text with spelling mistakes comes naturally to us; people may not even notice them.
  • But for a computer these are unrelated strings unless we use algorithms to tell it that they’re referring to the same entity.
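One way to tell a computer that “NY,” “Neww York,” and “New York” refer to the same entity is to combine an alias table with approximate string matching. The sketch below uses `difflib.get_close_matches` from the Python standard library. The canonical list, the alias table, and the 0.8 cutoff are assumptions chosen for illustration.

```python
import difflib
from typing import Optional

# Hypothetical canonical names and abbreviations for illustration only.
CANONICAL = ["New York", "Los Angeles", "Chicago"]
ALIASES = {"ny": "New York", "nyc": "New York", "la": "Los Angeles"}

def normalize_city(raw: str, cutoff: float = 0.8) -> Optional[str]:
    """Map a raw string to a canonical city name, or None if nothing is close."""
    key = raw.strip().lower()
    # Abbreviations share too few characters for fuzzy matching, so look them up.
    if key in ALIASES:
        return ALIASES[key]
    lowered = {name.lower(): name for name in CANONICAL}
    # Fuzzy matching handles spelling mistakes such as a doubled letter.
    matches = difflib.get_close_matches(key, list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None

for text in ["NY", "Neww York", "New York", "Boston"]:
    print(text, "->", normalize_city(text))
```

The alias table is needed because “NY” is too short to be similar to “New York” at the character level, while the fuzzy match catches the misspelling.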

  • Related problems are synonyms and the use of pronouns.
  • Try assigning the right person to the pronoun “she” in the following sentences: “John gave flowers to Marleen’s parents when he met her parents for the first time. She was so happy with this gesture.”

  • Language algorithms are also sensitive to the context the language is used in, even if the language itself remains the same.
  • English models won’t work for Arabic and vice versa, but even within English, an algorithm trained on Twitter data isn’t likely to perform well on legal texts.

Summary

  • Session Outcomes
  • Introduction to Text Mining
  • Text Mining in Real World
  • Difficulties of Text Mining