How Data Works:
The #DigitalSkeptic’s Deceptively Thin Guide to Thriving in the Post-Information Economy
By Jonathan Blum
This is a book of hints and clues. Ostensibly, it’s a satchel of implements you might want to carry as you navigate the hostile Himalayas of data. Know these 125-or-so pages, and you’ll be pleasantly surprised by the fresh leverage you’ll have over information. You’ll be less overwhelmed by the heft and bulk of the data you still must hoe. And you’ll revel, as do I, in mocking the bad-actor information that bores and misleads us all.
If I can walk the line from Theater Major to Data Nerd, so can you.
These days, a volume on the vagaries of digital information could catalogue any old thing. With so many digital descriptions jammed up on the Information Superhighway, just how “informational” can any one bit be? Look up, look down. Glance to the left or to the right: Nothing but vaguery masquerading as facts. Did roughly 40 of every 100 women in the United States actually vote against what would have been the first-ever female president? Did the United Kingdom really pass up its singular 12-century-long ambition to dominate Europe by withdrawing from the European Union? Who saw either coming? Not a soul.
The high and the mighty are equally uninformed. Lowell McAdam, CEO at telecom giant Verizon, got paid something like $18 million in 2016 to decide to fork over another $4.8 billion for failing Web-search operation Yahoo!, only to learn that, starting around 2013, the information on something like 1.5 billion Yahoo! customers was stolen by what was then a bunch of teenagers. McAdam might seem like a data dope, save for the fact that the CEOs of Sony, Target, The National Archives, Evernote, Dropbox and many, many other companies had their customers’ data stolen, too.
Hints and clues. It’s all these giants of industry had. Lack of knowledge is the ultimate leveler.
The hint- and clue-factor now seeps into the larger concept of “facts.” Who knows what’s what anymore? Wasn’t it news when humans could not agree on fundamental ideas like the age of the universe or the nature of God or man? But now, those contested ideas are just easy talking points for a light lunch. We can no longer agree on serious stuff like the relative warmth of the planet, or the effect of weather; or whether we will have jobs ... or whether we should have jobs. We can’t even count how many people showed up at an inauguration in Washington, D.C. In a world this unknowable, what chance do Adam Smith and his invisible hands of the market have? One hand can no longer count the fingers on the other.
It’s the Hints and Clues Era, a kind of Post-Information Age, where we’re all feeling our way through the digital dust kicked up from an impact crater.
I talk about the Post-Information Age as a gritty sandstorm. People think that’s a showy analogy. It isn’t. The time-to-dust ratio for any bit of information is a practical figure you need to get a feel for, pronto, in just about any informational setting. Considering how low visibility usually is when you need information, an instructional, step-by-step textbook to deal with all that dust is not of much use. Whatever curriculum is on those pages tends to blow away.
Only a guidebook of general principles printed into the well-worn, dog-eared manual of your mind lasts out here. Something like what high-altitude climbers Reinhold Messner and Peter Habeler put in their backpacks when they climbed Mount Everest in 1978 without supplemental oxygen. These two pioneers knew then what we must learn now: Carry only the simple and useful approaches that increase the odds of making progress through a vertiginous terrain. In this case, analytics.
It is the pretense of precision, numeric or otherwise, that will end your climb every time.
It’s been a slog to keep this book short and aired out. When you punch a hole this big in your life to answer a question as cosmic as “How data works,” there’s a pressure to go huge. And I did make the genuine effort toward that “big book.” All the research for an authoritative text on a layman’s guide to information science is locked safely in my closet: How human intuition, not quantitative thinking and engineering, is the key to analytic success. How advanced-sounding ideas like “machine learning” and “deep learning” are basically nothing but software that writes more software, that then surreptitiously tracks us deciding which automatically written code means something. There’s the research on how information science is a perception problem larger than engineering, kind of like the Wizard of Oz, but without all the witches and smoke. I collected practical, hip, counterintuitive examples across product design, weather prediction, U.S. Labor Department data, and even Maine lobster boats. Some of that work does peek out in the coming pages. Some will appear online, somewhere. But that big book is not this little book.
Sure, I would love to play an info-age Jean-Paul Sartre or Samuel Beckett, and empty my virtual Paris apartments of “brilliant” writing and research.
But frankly, with information, short is all it deserves.
I’ve done all I can to build in breaks and breathing spots between the heavy ideas. Let’s break for a smoke now and meet Tim Munro. We first met back in 2013, when he was running an American contemporary classical music sextet.
You think the problems with information you face are tough? Try running a classical music ensemble in the 21st century. Never mind that the 400-year-old musical liturgy of Bach down through Brahms and into Bartok was a tract of pure human genius that spawned offset printing, product design and digital compression technologies used in the military and by the deaf. By the dawn of the 21st century, the classical music sector was “digitally enhanced” into little more than a shabby Web chat room. Every orchestra, opera and classical ensemble went into full-scale decline as free music, unfettered access to concert video and feeble-sounding audio technology destroyed the palate required for top-flight classical music. Yet here was Munro, figuring out how to turn arcane American contemporary classical music into a real paying gig.
“We perform the same piece so many times that the notes are not the issue any more. The pieces become kind of like tools — ones we use to reach through the darkness, both digitally and live, to grab the audience by the ears, shake them as hard as we can and remind them what it is like to be alive.”
That gesture of connecting is the key of information analytics. Not engineering. Not software. Not neural networks. Not deep learning. That human thing is the thing.
And if you ever stray too far from that moral obligation to work at agonizing human speed to connect one soul to another, you are on the grand slide down into being just another digital dirtbag mortgaging tomorrow to make a quick and dirty buck today.
Humanity at scale. That’s the only progress.
There is no shortage of rants in this Post-Information Age of ours. The collapse of all things informational has become a nice little growth sector for storytellers: Weapons of Math Destruction, by Cathy O’Neil, Free Ride by Rob Levine and Freeloading, Chris Ruen’s heartbreaking account of the decline of the music industry, are the starting points in the skeptical liturgy. I penned many a data rant myself. Just type “#digitalskeptic” into any search engine and read for yourself about the digital collapse of the New York Stock Exchange, the slumification of digital advertising, the implosion of news, music, or higher education.
But it’s not news anymore that the Information Age is a Leonard Cohen song: Everybody knows. The dice were loaded, the war is over, the good guys lost, the fight was fixed, and the rich got rich.
There will be charts, from time to time. Not many, since graphics and maps mostly lie. But visualization of some data is indispensable. The set I use all the time is the Recording Industry Association of America’s sales database. It diagrams the unit sales volume and inflation-adjusted sales dollars for each 45, LP, 8-track tape, or digital download since the early 1970s.
I could jazz up these data with a slick interactive interface that wiggles. But with this stuff, blunt is the point: You have to see exactly what the 21st-century music executive sees: the wonky, ’80s, low-tech, Lady PacMan design of how each recording technology from 45 RPM singles through cassettes and compact discs increased the number of units sold and the number of dollars made. That is, until the impossible happened in the early 21st century, when unit sales of digital downloads effectively doubled while actual inflation-adjusted dollar sales effectively halved.
The figures are stunning: a 70-percent drop not in bottom-line, after-expense profits, but in top-line, before-costs sales.
That’s the new reality we need to model: More yields less.
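To get a feel for what “more yields less” does to the money behind each individual sale, here is a minimal back-of-the-envelope sketch. It assumes the round figures used above (units roughly doubling, inflation-adjusted dollars roughly halving) rather than actual RIAA numbers, and the function name is made up for illustration. It is a separate calculation from the 70-percent top-line figure.

```python
def revenue_per_unit_ratio(unit_multiplier, dollar_multiplier):
    """Return new revenue per unit as a fraction of old revenue per unit."""
    return dollar_multiplier / unit_multiplier

# Illustrative round numbers only (an assumption, not RIAA data):
# units "effectively doubled", dollars "effectively halved".
ratio = revenue_per_unit_ratio(unit_multiplier=2.0, dollar_multiplier=0.5)
drop_pct = (1 - ratio) * 100

print(f"Revenue per unit is {ratio:.2f}x the old level, a {drop_pct:.0f}% drop")
```

Under those assumed multipliers, each unit sold brings in about a quarter of what it used to. Swap in whatever multipliers your own market shows.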
The story in this music industry data is essential! Look carefully at how it charts the roll-out of 8-Track, vinyl-single and, finally, digital downloads and online streaming. See how each new format leads to an increase in units sold and dollars captured — until markets went digital around the year 2000. Then dollars fled the market, and in real inflation-adjusted terms, continue to collapse.
Yes, the assumption really is that sometime around 2030 the music industry, as we knew it, will cease to exist. Our future instead is to chant music replayed to us for nothing from distant servers, like some sort of digital stone age, when tunes blew in on the wind and echoed off canyon walls.
For some reason, whenever I break down the decline of digital markets, people think it’s a cheap shot. I don’t know why, since the world is such a mess. But let me be clear: Music, financial services and publishing are not the only things in decline. Aggregate U.S. productivity for the first quarter of each year over the early Information Age has been in strict decline. I charted the trend below.
What other story can this information possibly be telling?
What makes the RIAA’s digital narrative on the destruction of the music industry indispensable is that, for any setting, it helps us time how fast information is turning to dust. Compare those music charts above to any other collapsing information market. How about the ongoing implosion of sovereign nation-states? You know, our governments.
It’s hard out there for a government: Economies no longer seem to recruit the time and passions of their citizens. Immigration is overwhelming many nationalities. Violence, not consensus, is the communication tool of choice. Who knows what to do? ... until you compare the timing and narrative of the problems our governments face to the timing and narrative of the problems our music business faced.
Aren’t our laws nothing more than yet another information-based system facing its own disruption at the hands of a governmental spin on Napster and iTunes? Aren’t Brexit and the U.S. election of Donald Trump just disruptive digital ideas, tuned against our rule of law?
Can’t we compare global politics now to the music industry circa 1999? Where our politics are starting the tough slog through a 70-percent reduction in total value?
It’s a powerful metaphor you can make real choices from: We are living through the governmental equivalent of finding new ways to make politics work. We should be searching for new artists with the hustle and talent to innovate how we govern, not looking for the next Friendster or MySpace that promises to go back to some past that will never return.
It’s the fresh idea we need. But there’s a long way ahead. Better be ready.
There is a kind of informational ornithology that has emerged from the hints I’ve collected. It seems smart to present these samples as a simple catalogue, a sort of list of different ways of thinking about information. The list seems to strip the chrome and fake-plastic sheen of “Big Data.”
Here then are highlights of how to think about information in the Post-Information Age, ordered as they might be in a section of this book:
Let’s get a feel for the power of a story here: Once upon a time there was a world we all lived in where all the information was worth less every second. And since information was deflating as you touched it, analytics became like bread making. Just backwards. The informational loaf of flour, water and yeast was falling so fast that simply tossing in additional ingredients only increased the model’s density and ramped up the model’s speed-to-uselessness.
All those deflating loaves of information made for a new type of analytical baker who knew enough to keep as much data out of a nascent model as possible. Her job was to mix in enough wholesome real information in her models to fire up a reasonable prediction.
For the record, in that world (which is really our world) maybe one in 10,000 models did seem to grow before everyone’s eyes. These are the informational “unicorns” that go on to become commercial apps like Google, Facebook and Uber. If you are lucky enough to blunder into one of these rare growing digital beasts, by all means enjoy it. It will be like finding oil in your backyard, where you star in your own personal reality-TV version of the Digital Beverly Hillbillies.
Laugh track and all.
Most of the time, though, your models will not bag unicorns. Instead, you will slog along on a grand quest to find the right storytelling metaphor that triages enough data to boil the nonsense you touch into a narrative informational scale you can manage. You will hunt endlessly for the right story that culls your stupid data into information you can test.
And that, friends, is all about asking the right questions of the right information.
Let’s try it out by taking a big imaginary step into a successful indie bookstore. (We go into why simple bookstores are sort of magical analytical outposts later on in this book.)
What is the bookstore’s sales story I might be trying to tell? Well, what is the information about said “book?” It’s not a plot or a binder filled with pages. It’s a story described, in part, by its title, author, price and other factors. What, then, is a good question to ask of that information? That depends. Are you interested in anticipating sales or do you care about anticipating returns of unsold books? Or maybe, like Starbucks, you seek to anticipate sales in coffee, drinks and other merchandise in your bookstore. (For the record, the music in Starbucks stores is surprisingly important to drive per-customer revenue. The operation went as far as to invest in its own music label called Hear Music. That story is a bit too smoothed-out and corporate to make it into this book. But it’s not totally uninteresting.)
What, then, does a bookstore care about with its data? Maybe it’s whether lower prices increase the likelihood of a book sale. That depends on the book. Does the author matter? Yes, but in this illiterate age not as much as you think. What about topics, color or the cover? Do they matter? What you seek are the basic descriptive elements of the information you have that matter to the question at the heart of your story.
Here’s why we all have a shot at being good storytelling analysts: A good model almost never has a lot of descriptive elements. Usually no more than six. Four is just fine. All it takes is a lot of trial and error, and insightful comparisons, to unlock the relationships between these basic factors to boil down your model to a few ideas: If prices matter to book sales, how much of a price cut would help? Make it specific: If I cut prices by 25 cents on every dollar, how much more do I sell of whose books? Are Maurice Sendak’s kids’ books better discount sellers than John Grisham novels? (The answer is no, usually.)
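One way to make the 25-cents-on-the-dollar question concrete is to first ask how many more books a price cut has to move just to keep revenue where it was. The sketch below is only that arithmetic. The function name is hypothetical, and it ignores costs, returns and everything else a real bookstore would care about.

```python
def breakeven_unit_uplift(price_cut):
    """Fractional increase in units needed to keep revenue flat after a price cut.

    price_cut is a fraction of the original price, e.g. 0.25 for 25 cents
    off every dollar.
    """
    if not 0 <= price_cut < 1:
        raise ValueError("price_cut must be at least 0 and less than 1")
    # Revenue = price * units, so units must scale by 1 / (1 - price_cut).
    return 1 / (1 - price_cut) - 1

uplift = breakeven_unit_uplift(0.25)
print(f"A 25% price cut needs {uplift:.1%} more units sold to hold revenue flat")
```

Anything short of that uplift means the discount shrank the top line. That makes it a handy yardstick when you compare how Sendak and Grisham titles respond to the same cut.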
Remember, the past and the future are divided by an unknowable present. So you must be careful crossing the Sahara of the “then” into the watery world of tomorrow. You have to feel your way to what about past sales information gives off the right sense of future best-sellers. Is there a way you can historically look for how price cuts drove sales? How about comparing changes in prices to sales? Are you learning something in that comparison? Are you forgetting something? Are you telling a story or are you lost in your own bullshit?
Either way, you’re a storyteller. Not a scientist. Your intuition is what guides you. And that’s what the true information gods I deal with do. They exploit this uncanny, X-Men-like skill to swap contrasting narrative elements to mine out predictive relationships in otherwise useless data.
And, here comes the humbling part, these data experts do it in their heads. Just like musical giants like John Coltrane, Jimi Hendrix or Miles Davis, these analytics rock stars can keep the narrative elements straight in their mind’s eye, and quickly make connections that build great predictions. There’s nothing you and I can do but stand back and marvel at how simple these humans make it look to reduce complex information into simple, predictive elements. It’s like ground lightning, on a summer night. It just sort of … happens.
But it is surprising how far just about anybody can get with powerful analytics using simple models, the right tools, the right narrative and the right questions. All it takes is the confidence to trust your intuition.
There’ll be more exercises and tactics later. But for now, if you can just take one single step and play the analytical Dorothy from The Wizard of Data Oz and believe you can make the trip by clicking your heels, you can.
And you get the fun of pretending to wear Ruby Red Slippers!
Good stories bring confidence. And with it, the ability to list out the last of the information science ornithology. Here is the last round of concepts needed to make data much more manageable.
One last hint: When I deal with the armies of the analytical undead that are chained to massive machine-learning or artificial-intelligence projects, I sense the steep price they pay for pretending something is real when it’s not. Don’t ever forget you are learning to make choices through the worst implosion of value in human history. Our decline is so steep, and this age is so dark, that there probably will not be a Shakespeare, Cervantes or Lu Xun to legitimize our time. Our future is probably to be like the Sea Peoples, who were known to invade ancient Egypt around 1200 BC but of whom no real record exists. In these times, just staying in the game is progress. Remember that.
Because analytics done right is a qualitative, sensual process that’s more about nuance and deft touch than it is about specific rules or engineering. Software helps. But pen and paper work just fine. Do data right and, dare I say it, it humanizes rather than destroys:
Reaching through the darkness, reminding somebody what it is like to be alive.
How Data Works won’t just be me offering clues. There will be plenty of chats with the smart people who were courageous enough to give me honest answers on the limits of information, large data and analytics.
Here’s a brief outline of the characters you will meet in interviews distributed throughout this book.
I have several more possibilities: Kathy Frankovic is a terrific pollster who can explain how Donald Trump got elected. Like for real. Edward Witten, the Charles Simonyi Professor at the School of Natural Sciences at Princeton, tells the awesome story of how far-fetched computational intelligence is.  And How Data Works would not be complete without Adam Grayson, CFO of Evil Angel, one of just a dozen surviving distributors in the imploding adult entertainment business. Here’s what he has to say about analytics:
“Even though we collect user data from all over the world in real time, we can’t listen to any of it. What happens is so random and so off-the-wall that none of it is of any use.” The reason? “You cannot regress an orgasm.” 
Considering what a contrarian dork I can be, those close to data journalism and academic data science have been remarkably gracious and interested in my work. Many instructors and data journalists have hoped that the hints and clues in this book could be aimed, at least a little bit, at student data journalism and those teaching computer analytics. I promised I would try to organize some bits they could use.
Here are some sample sections that will probably work as short chapters. 
We will wrap with me asking a tough question: Are robots racist? It’s the tale of how I worked with one of the world’s leading spatial statistical experts to analyze 6 million online loans for evidence of whether Web lenders obeyed the basics of U.S. federal fair-lending laws. Our work indicated something like 40 out of every 100 Web loans have real questions to answer. Redlands, Calif.-based mapping giant Esri made that study part of its teaching curricula.
The lesson here is clear: If I can do top-level analytics, you can, too.
Finally, the final hint and clue: me. I’m not sure why, but more and more people seem interested in the road I’ve taken, and what on earth I’m doing studying information science. I really did get a theatre degree, albeit from Columbia University. Here, then, is the story: I’m an acceptably imperfect married man, living in Harrison, N.Y. I went to Jamaica High School in Queens. For about 10 years, ending in late 2014, I made a good living as an independent wordsmith, though I never went to journalism school. And I certainly did not grow up loving words or books. (Spelling, what is that?) But somehow I got paid organizing these graphical bits of grammaric logic. In the process, I built one of the first Web-based business journalism ‘content engines’ in the early 21st century, through which I published at CNNMoney, TheStreet, Entrepreneur Magazine and many, many others. I was one of the first to use cloud-based collaboration tools to create news. I managed to channel everyone’s inner rage at the race-to-the-bottom Information Age by pioneering the short-investor brand, the #DigitalSkeptic, and, in turn, my remarkable relationship with FinePrint Literary Management.
I’m proud of my track record as an earner at operations with ABCNews, CNBC, MTV and many others. I’m self-taught as a professional video producer, forensic accountant, business analyst, carpenter, electrician and project manager. My current explorations through information science and sophisticated data marketing have led me to professional teams that develop technologies for floating homes that address sea-level rise, to the next generation of secure communications tools for journalists and to the frontiers of regenerative capitalism. I sit on the board of the Society of American Business Editors and Writers. I have all the social media trappings of a professional writing career:
● A following of 8,000 or so @digitalskeptic
● Solid email list of about 2,000
● Control of a terrific hashtag: #digitalskeptic
● Reasonable video assets: https://www.youtube.com/watch?v=2holW7WTahA
● Workshops that teach basic analytics to underperforming schools
● Cool T-Shirts. http://digitalskeptic.storenvy.com/
● A mobile app in development: The How Data Works Magic Eight Ball app. Shake your phone, get a data ‘answer.’
● Placeholder website: http://thedigitalskeptic.com/ that will roll into a silent web-auction site for limited-edition, hand-made copies of the book, featuring art by Malcolm Campbell Moran and Jonathan Marshall. 
● Relationships with art galleries for installations tied to bookstore-based events.
Let’s end with a final clue: I have learned to do what few have: row a type of Venetian gondola called a sandolo in the ancient, standing style. I have wobbled my way under the Rialto Bridge on Venice’s Grand Canal, several times. Here is what I learned: It is much more challenging negotiating the vaporetto and the choppy waves near the Rialto Bridge than it is to learn the analytics needed to write about what lots of data does to lots of humans.
It turns out that tracking the hints and clues of information science is nowhere near hopeless. If we can just feel our way through the disgust of making such a hash of the early days of this digital stuff, we will be fine.
http://www1.salary.com/Lowell-C-McAdam-Salary-Bonus-Stock-Options-for-VERIZON-COMMUNICATIONS-INC.html