Additional explorations in text analysis
Kevin Lanning
SICSS – South Florida
Overview: Three projects
Using language to identify scholarly communities (brief)
Personality and ego development (not brief)
Word use in news transcripts (brief)
Papers are in Zotero and links to R code are (mostly) in the papers - or just ask me.
Language of scholarly communities
Personality and ego development
personality > traits
The construct(s) of ego development
“maturity”
Level | Cognitive | Social | Moral |
Autonomous/Integrated (8-9) | Fulfillment | Interdependence | Complexity |
Individualistic (7) | Development | Mutuality | Tolerance |
Conscientious (6) | Achievement | Responsibility | Self-criticism |
Self-Aware (5) | Adjustment | Helpfulness | Exceptions |
Conformist (4) | Appearances | Loyalty | Obedience |
Self-Protective (3) | Trouble | Wariness | Opportunism |
Impulsive (2) | Good vs. bad | Solipsism | Urges |
The measure: Wash U. Sentence Completion Test
When a child will not join in group activities…
Impulsive (2) | … they are sick |
Self-Protective (3) | … give him 2 choices, join or sit by himself |
Conformist (4) | … he might be tired |
Self-Aware (5) | … I wonder what is wrong |
Conscientious (6) | … I wonder if he doesn't feel good about himself |
Individualistic (7) | … it may be a healthy or unhealthy sign |
Autonomous/Integrated (8-9) | … it's sometimes a reflection on the group, not the child. |
Overview / goals
The data
Sample | Impulsive (2) | Self-Protective (3) | Conformist (4) | Self-Aware (5) | Conscientious (6) | Individualistic (7) | Autonomous/Integrated (8-9) | total |
Responses at each level |
Univ I | 63 | 184 | 1287 | 1622 | 1006 | 178 | 24 | 4364 |
Univ II | 187 | 1045 | 6746 | 9933 | 7081 | 1407 | 159 | 26558 |
Midlife | 27 | 120 | 1131 | 1945 | 1640 | 479 | 76 | 5418 |
Exemplar | 141 | 402 | 1007 | 2167 | 2412 | 1130 | 334 | 7593 |
Total | 418 | 1751 | 10171 | 15667 | 12139 | 3194 | 593 | 43933 |
Words coded at each level |
N words | 1456 | 7876 | 47969 | 103618 | 108911 | 39925 | 10655 | 320410 |
N distinct words | 398 | 1413 | 3273 | 5493 | 6129 | 3932 | 2032 | 10670 |
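As a quick way to read the response counts, the sketch below computes the share of responses in each sample scored at Conscientious (6) or higher. The counts are copied from the table; the cut point at level 6 and the function name are illustrative choices, not part of the original analysis.

```python
# Response counts per level, copied from the table
# (levels 2, 3, 4, 5, 6, 7, 8-9).
counts = {
    "Univ I":   [63, 184, 1287, 1622, 1006, 178, 24],
    "Univ II":  [187, 1045, 6746, 9933, 7081, 1407, 159],
    "Midlife":  [27, 120, 1131, 1945, 1640, 479, 76],
    "Exemplar": [141, 402, 1007, 2167, 2412, 1130, 334],
}


def share_at_or_above(row, first_index):
    """Proportion of responses at the level with this index or higher."""
    return sum(row[first_index:]) / sum(row)


# Index 4 is Conscientious (6); this cut point is an illustrative assumption.
high_share = {name: share_at_or_above(row, 4) for name, row in counts.items()}

for name, share in high_share.items():
    print(f"{name}: {share:.3f}")
```

With these counts the Exemplar sample has the largest share of higher-level responses (about half), followed by Midlife, Univ II, and Univ I.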
Some LIWC results
Ego level as sequence
| 2 | 3 | 4 | 5 | 6 | 7 | 8-9 |
Impulsive (2) | | 0.90 | 0.85 | 0.83 | 0.80 | 0.75 | 0.70 |
Self-Protective (3) | 0.90 | | 0.90 | 0.91 | 0.88 | 0.84 | 0.80 |
Conformist (4) | 0.84 | 0.90 | | 0.95 | 0.92 | 0.88 | 0.83 |
Self-Aware (5) | 0.83 | 0.91 | 0.95 | | 0.98 | 0.96 | 0.93 |
Conscientious (6) | 0.79 | 0.87 | 0.91 | 0.98 | | 0.99 | 0.97 |
Individualistic (7) | 0.75 | 0.83 | 0.87 | 0.96 | 0.99 | | 0.98 |
Autonomous/Integrated (8-9) | 0.70 | 0.79 | 0.83 | 0.93 | 0.97 | 0.98 | |
Correlations between word use at different levels support a simplex model: adjacent levels correlate most strongly, and correlations fall as levels move apart.
Correlations between word counts across all 10670 terms (top) or the 1811 most common terms (bottom).
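One simple check of the simplex pattern is to average the correlations by how many steps apart two levels are. The sketch below uses the values above the diagonal of the table; the function name and the "lag" framing are illustrative, not taken from the original analysis.

```python
# Correlations above the diagonal of the table, one list per row.
# Row order: levels 2, 3, 4, 5, 6, 7 (each paired with all higher levels).
upper = [
    [0.90, 0.85, 0.83, 0.80, 0.75, 0.70],  # level 2 with 3, 4, 5, 6, 7, 8-9
    [0.90, 0.91, 0.88, 0.84, 0.80],        # level 3 with 4, 5, 6, 7, 8-9
    [0.95, 0.92, 0.88, 0.83],              # level 4 with 5, 6, 7, 8-9
    [0.98, 0.96, 0.93],                    # level 5 with 6, 7, 8-9
    [0.99, 0.97],                          # level 6 with 7, 8-9
    [0.98],                                # level 7 with 8-9
]


def mean_by_lag(upper_rows):
    """Average correlation for pairs of levels 1, 2, ... steps apart."""
    by_lag = {}
    for row in upper_rows:
        for offset, r in enumerate(row):
            by_lag.setdefault(offset + 1, []).append(r)
    return {lag: sum(vals) / len(vals) for lag, vals in sorted(by_lag.items())}


lag_means = mean_by_lag(upper)
for lag, mean_r in lag_means.items():
    print(f"{lag} step(s) apart: mean r = {mean_r:.3f}")
```

Under a simplex, the mean correlation should drop steadily as the number of steps grows, which is what these values show on average (individual rows are not perfectly monotone).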
Expanding the model
Expanding the ego lexicon
level | nWords | Facet Weight | facet | Original dictionaries: words in SCTs |
2 | 4 | 0.9 | aggression | bothers, fight, hate, violence |
2 | 2 | 0.7 | banal-hyperbolic | amazing, awesome |
2 | 6 | 1 | banal-cool | cool, fine, liked, nice, ok, okay |
2 | 4 | 0.8 | prohibition | cant, nobody, nothing, rules |
… | … | … | … | … |
hate (.78), violence, fight, bothers, hating, hatred, dislike, fights, racism, fighting, disrespect, annoys, hates, bashing, disgusts, think, injustice, bigotry, complain, blaming, violent, hated, bullying, pisses, bullies, scares, loathe, anymore, animosity, misogyny, irks, bullshit (.50) …
Impulsive-aggression words in expanded dictionary
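The slides do not say how the dictionaries were expanded. The values in parentheses (.78 down to .50) look like similarity scores with a cutoff near .50, so the sketch below assumes expansion by cosine similarity to the average of the seed-word vectors. The `expand` function, the cutoff default, and the three-dimensional toy vectors are all hypothetical; real word embeddings would have hundreds of dimensions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def expand(seed_words, vectors, cutoff=0.50):
    """Return (word, similarity) pairs at or above the cutoff, best first.

    Similarity is measured against the centroid of the seed-word vectors.
    This is an assumed method, not the one documented in the slides.
    """
    dims = len(next(iter(vectors.values())))
    centroid = [
        sum(vectors[w][i] for w in seed_words) / len(seed_words)
        for i in range(dims)
    ]
    scored = [(w, cosine(vec, centroid)) for w, vec in vectors.items()]
    kept = [(w, s) for w, s in scored if s >= cutoff]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy vectors invented for illustration only.
toy_vectors = {
    "hate": [0.9, 0.1, 0.0],
    "fight": [0.8, 0.3, 0.1],
    "violence": [0.85, 0.2, 0.05],
    "bothers": [0.6, 0.5, 0.1],
    "hatred": [0.88, 0.15, 0.02],
    "garden": [0.0, 0.2, 0.9],
}

# Seed words are the original level-2 aggression facet from the table.
seeds = ["bothers", "fight", "hate", "violence"]
expanded = expand(seeds, toy_vectors)
for word, score in expanded:
    print(f"{word} ({score:.2f})")
```

In this toy example "hatred" joins the facet because it sits close to the seed words, while "garden" falls below the cutoff and is left out.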
Ego level in multiple samples
(a weak test, passed)
(a stronger test, failed)
Step 3: Reexamining ego level in words and texts (today)
When ego level is computed as averages, estimates for long responses are too low…
…and estimates for short responses may be too high.
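One way averaging can produce this pattern is sketched below, assuming a response is scored as the mean level of its dictionary words. The level-2 entries ("rules", "nice") come from the facet table earlier in the deck; the other entries and the `average_ego_level` function are hypothetical and only serve the illustration.

```python
# Word-level scores. "rules" and "nice" are level-2 words in the original
# dictionaries; "wonder" and "reflection" are hypothetical entries.
toy_lexicon = {"rules": 2, "nice": 2, "wonder": 5, "reflection": 8}


def average_ego_level(text, lexicon):
    """Mean level of the dictionary words in a response, or None if none match."""
    words = text.lower().split()
    scores = [lexicon[w] for w in words if w in lexicon]
    if not scores:
        return None
    return sum(scores) / len(scores)


short_response = "a reflection"
long_response = "i wonder if it is a reflection on the rules and not something nice"

print(average_ego_level(short_response, toy_lexicon))
print(average_ego_level(long_response, toy_lexicon))
```

The short response rests on a single scored word, so its estimate is extreme. The long response contains the same high-level word, but the average is pulled down by the lower-level words around it.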
LIWC scales associated with prediction errors in sentence completions
Exploring a regression approach
Regressions: Sentence completions
Original dictionaries
Expanded dictionaries
Blogs
Blogs: Predicting age from sentence-completion derived models
Model | From original dictionaries: Blogs | From original dictionaries: Persons | From expanded dictionaries: Blogs | From expanded dictionaries: Persons |
Summary
Summary of personality stuff
Fox and MSNBC
Fox > MSNBC
MSNBC > Fox