Reading Google’s Latest 10K with Many Eyes
A number of tools on the web will take your documents and analyze the contents statistically to help you identify patterns. For the investment community, these technologies are not quite ready for prime time. In other words, the potential to glean insights is hit-or-miss. But I’ve been impressed with a tool launched a year ago by IBM (IBM) called Many Eyes, which does some simple stats on document contents and provides the output in various forms. Many Eyes is a data visualization tool that can also handle natural language. When the data is words, as in a lengthy 10K form, I begin to wonder if some processing of these words can tell us something we don’t already know.
I’ve taken the contents of Google’s (GOOG) 2007 10K and loaded it as a data set in the visualization tool. Do you think any meaningful insights can be derived? Here’s a summary of my results in this simple experiment.
First, I loaded the Product Section of the document, which lists and discusses about 35 products within Google’s consumer portfolio. Enterprise products (Google Apps, Google Appliance) are covered under a separate section. I excluded terms/words like “Google”, “users”, “web” and others that don’t inform the analysis.
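For readers who want to try the same kind of count outside Many Eyes, here is a minimal Python sketch of term frequency with an exclusion list. It is not how Many Eyes works internally; the function name `term_frequencies`, the `min_length` cutoff and the sample sentence are all assumptions made for illustration, and the exclusion list holds only the three words named above.

```python
import re
from collections import Counter

# Assumption: only the three excluded words named in the post are listed here.
EXCLUDED = {"google", "users", "web"}

def term_frequencies(text, excluded=EXCLUDED, min_length=3):
    """Count lowercase alphabetic words, skipping short and excluded terms."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(
        w for w in words
        if len(w) >= min_length and w not in excluded
    )

# Made-up sample text, not taken from the 10K.
sample = (
    "Google Maps helps users find places. Gmail is free. "
    "Maps on mobile and Gmail on mobile are free."
)
freqs = term_frequencies(sample)
for term, count in freqs.most_common(4):
    print(term, count)
```

In practice the exclusion list would grow as uninformative words show up near the top of the ranking, which is the adjustment described above.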
Click here or the image below to see the visualization results.
Some themes that stand out (based on term frequency) include:
=> Mobile
=> Maps
=> Gmail
=> Groups
=> News
Nothing here is a surprise, as these are all product categories where Google has a significant product in the market.
Some other terms that stood out to me are “free” and [user] “experience”, two central aspects of Google’s focus and way of doing business.
One valuable thing I learned from these results is the lack of product focus on web 2.0 trends and drivers. There’s not a major role for “social networking” themes like “syndication”, “tagging”, “sharing”, “commenting”, etc. across the product portfolio. Could this speak to weakness in product breadth? Where’s the web 2.0 lingo and product focus?
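The absence of a term can be checked just as mechanically as its presence. The sketch below counts a short watch list of web 2.0 phrases and reports the ones that never appear. The list is limited to the terms quoted above, and the names `phrase_counts` and `missing_terms`, along with the sample text, are hypothetical.

```python
import re

# Watch list taken from the terms quoted in the post.
WEB20_TERMS = ["social networking", "syndication", "tagging", "sharing", "commenting"]

def phrase_counts(text, phrases=WEB20_TERMS):
    """Count whole-word occurrences of each phrase, ignoring case."""
    lowered = text.lower()
    return {
        p: len(re.findall(r"\b" + re.escape(p) + r"\b", lowered))
        for p in phrases
    }

def missing_terms(text, phrases=WEB20_TERMS):
    """Return the phrases that do not occur in the text at all."""
    return [p for p, n in phrase_counts(text, phrases).items() if n == 0]

# Made-up sample text, not taken from the 10K.
sample = "Our products make sharing photos simple. Sharing is free."
counts = phrase_counts(sample)
absent = missing_terms(sample)
print(counts)
print(absent)
```

Exact matching like this misses variants such as "share" or "tags", so a zero count is a prompt to look closer rather than proof that a theme is absent.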
Next, I uploaded the Risk Factors section in the hope that it would reveal patterns related to Google’s concerns. Here, I also adjusted the contents by excluding terms that don’t inform the analysis (like “Google” or “risk”).
Click here or the image below to see the visualization results.
Results seem to cluster along the following themes, presented in order of magnitude:
=> Advertisers, advertising, clicks, search, (ad) network
=> Access, (network) providers, systems, operations
=> Intellectual, laws, protection
Again, no surprises here.
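I grouped the terms into themes by eye, but the ranking step can be sketched in a few lines of Python: sum the counts of the terms in each theme and sort the totals. The theme labels below are my own shorthand, the sample text is made up, and "network" is assigned to the advertising theme only, since a single-word count cannot tell an ad network from a network provider.

```python
import re
from collections import Counter

# Assumption: theme labels are invented; the terms come from the list above.
THEMES = {
    "advertising": {"advertisers", "advertising", "clicks", "search", "network"},
    "infrastructure": {"access", "providers", "systems", "operations"},
    "legal": {"intellectual", "laws", "protection"},
}

def theme_totals(text, themes=THEMES):
    """Sum term counts per theme and return (theme, total) pairs, largest first."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    totals = [
        (name, sum(counts[term] for term in terms))
        for name, terms in themes.items()
    ]
    return sorted(totals, key=lambda pair: pair[1], reverse=True)

# Made-up sample text, not taken from the 10K.
sample = (
    "Advertisers pay for clicks. Search advertising drives clicks. "
    "Our systems need access. Laws change."
)
ranking = theme_totals(sample)
print(ranking)
```
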
If Google’s (implicit) objective or hope with this 10K has been to convey comfort that it has its eye on the ball, I think it’s done the job with this document. That tells us something.
Bubbling up a level, could the future of investing involve more sophisticated tools and analysis that take this quick-and-dirty approach to the next level? That would change the nature of investor disclosure and document drafting quite a bit. When words can be converted to data instantly, a lot more thought will go into their selection.
Take a look at some of these visualizations and share any patterns you’re able to detect. Or load another company’s filings into the tool to identify patterns, dislocations to strategy and hidden themes.
Filed under: Google, User Experiences, Web Apps, Innovation
Tags: visualization, 10K, financial reports, hidden meanings, documents, patterns, eyes


