1 of 6

Analysis of Articles Published on Data Science

Jicksy John

Kristin Bässe

Lakshmi Prasannakumar

2 of 6

Data Collection

How we collected the data and what we looked for

[1] News API; accessed 07/14/18; https://newsapi.org/

© 2017 Udacity. All rights reserved.

  • We collected the data in Python using News API[1]
  • Articles were collected from 16 popular public publishing sources
  • Total articles collected: 9,961
  • Time period: 02/20/2018 to 07/14/2018
  • Articles were fetched for the keywords Data Science, Data Analytics, Machine Learning, Business Analytics, and Artificial Intelligence
  • GitHub link for project code: https://github.com/jicksy/news-analysis
  • Findings: General Article Publishing Trend, Top Publishers, Gender Ratio of Authors
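
The collection step could look roughly like the sketch below. It is not the project's actual code (see the GitHub link for that). It assumes the News API "everything" search endpoint, exact-phrase matching via quotes, English-only results, and a page size of 100; the names `build_request_url` and `fetch_page` are hypothetical. The 16 sources are not listed on the slide, so no source filter is applied here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint documented by News API for full-text article search.
EVERYTHING_URL = "https://newsapi.org/v2/everything"

# Keywords listed on the slide.
KEYWORDS = [
    "Data Science",
    "Data Analytics",
    "Machine Learning",
    "Business Analytics",
    "Artificial Intelligence",
]


def build_request_url(keyword, api_key, start="2018-02-20", end="2018-07-14", page=1):
    """Build one search URL for an exact-phrase keyword within the date range."""
    params = {
        "q": f'"{keyword}"',   # quotes request an exact phrase match
        "from": start,
        "to": end,
        "language": "en",      # assumption: English-only articles
        "pageSize": 100,       # assumption: 100 results per page
        "page": page,
        "apiKey": api_key,
    }
    return f"{EVERYTHING_URL}?{urlencode(params)}"


def fetch_page(url):
    """Download one page of results and return its list of article dicts."""
    with urlopen(url) as response:
        payload = json.load(response)
    return payload.get("articles", [])
```

Calling `fetch_page(build_request_url(keyword, api_key))` for each entry in `KEYWORDS` would return one page of articles per keyword; a valid API key and network access are required.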

3 of 6

1. General Article Publishing Trend

Graph plotted using Google Sheets
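
Before plotting, the articles need to be counted per time bucket. The sketch below shows one way to do that, assuming each article is a dict with an ISO 8601 `publishedAt` field as returned by News API, and assuming a monthly bucket (the slide does not state the granularity). The function name and the sample records are hypothetical.

```python
from collections import Counter


def articles_per_month(articles):
    """Count articles by year-month taken from each 'publishedAt' timestamp."""
    counts = Counter()
    for article in articles:
        published = article.get("publishedAt")
        if published:                      # skip records without a timestamp
            counts[published[:7]] += 1     # "2018-07-14T09:30:00Z" -> "2018-07"
    return sorted(counts.items())


# Hypothetical records for illustration only, not the project's data.
sample = [
    {"publishedAt": "2018-03-01T08:00:00Z"},
    {"publishedAt": "2018-03-15T12:30:00Z"},
    {"publishedAt": "2018-07-14T09:30:00Z"},
    {"publishedAt": None},
]
print(articles_per_month(sample))  # [('2018-03', 2), ('2018-07', 1)]
```

The resulting (month, count) pairs can be pasted into a spreadsheet and charted as a line graph.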

4 of 6

2. Top Publishers Identified

Word cloud generated in Python using WordCloud[2]

[2] Adiljadoon (Kaggle), WordCloud with Python; accessed 07/19/18; https://www.kaggle.com/adiljadoon/word-cloud-with-python

  • The size of each publisher's name in the image reflects the number of articles published on that website

→ e.g. Business Insider and Forbes have the most articles containing data science and related keywords
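
A frequency-based word cloud like this one can be sketched as follows. This is an illustration, not the project's code: it assumes each article dict carries a `source` object with a `name` field (the News API response shape) and that the third-party `wordcloud` package is installed. The function names, output path, and sample records are hypothetical.

```python
from collections import Counter


def publisher_frequencies(articles):
    """Map each publisher name to the number of articles it contributed."""
    counts = Counter()
    for article in articles:
        name = (article.get("source") or {}).get("name")
        if name:                           # ignore articles with no publisher
            counts[name] += 1
    return dict(counts)


def save_publisher_cloud(frequencies, path="publishers.png"):
    """Render the frequencies as a word cloud image and write it to disk."""
    from wordcloud import WordCloud        # third-party: pip install wordcloud

    cloud = WordCloud(width=800, height=400, background_color="white")
    cloud.generate_from_frequencies(frequencies)  # word size scales with count
    cloud.to_file(path)


# Hypothetical records for illustration only, not the project's data.
sample = [
    {"source": {"name": "Forbes"}},
    {"source": {"name": "Business Insider"}},
    {"source": {"name": "Forbes"}},
]
frequencies = publisher_frequencies(sample)
```

Passing `frequencies` to `save_publisher_cloud` writes the image. Using `generate_from_frequencies` rather than raw text keeps multi-word publisher names such as "Business Insider" intact.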

5 of 6

3. Gender Ratio of Authors

  • We analyzed the top 60 authors and found that there are twice as many male authors as female authors

Pie chart created using Google Sheets
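
The ratio itself is simple arithmetic once each author has a label. The slide does not say how gender was determined, so the sketch below assumes the labels were assigned by hand and stored as the strings "male" and "female". The function name and the sample labels are hypothetical and are not the project's data.

```python
from collections import Counter


def male_to_female_ratio(labels):
    """Return the number of male-labeled authors per female-labeled author."""
    counts = Counter(labels)
    female = counts.get("female", 0)
    if female == 0:
        raise ValueError("ratio is undefined without any female-labeled authors")
    return counts.get("male", 0) / female


# Hypothetical labels for illustration only.
sample_labels = ["male", "female", "male", "male", "female", "male"]
ratio = male_to_female_ratio(sample_labels)
print(ratio)  # 2.0
```

A ratio of 2.0 corresponds to the "twice as many" finding; the two counts behind it are what the pie chart displays.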

6 of 6

Group Photo

Kristin Bässe (@Kristin), Lakshmi P (@lakshmi), Jicksy John (@jicksy)