Big data hyper-hypo-hyperbole or reality?

Is “BIG DATA” hype? Have we over-sugared ourselves with too much big data candy? I’ll dodge the answer and instead present you with four interesting resources addressing this issue.

First up is a great Big Data intro video shown at the 2012 SAS Analytics conference. What I like about this video (even though it could have been cut by 30 seconds) is that it really frames the issues well.

Second is an excellent article on big data recently published by the Harvard Business Review. This article points out that big data will make an impact, but not in the traditional sense. “Traditional” big data analytics focuses on prediction, but in the future big data will have a more transformative impact on areas such as mobile-location analytics, personalized medicine, and artificial intelligence.

Third, in this blog post on big data Jason Rushin notes that

In this era of digital everything, nearly every marketer has access to more data than they can reasonably handle. A single web visit by a single customer can result in thousands of data points across items viewed, locations, durations, browser, referral, clickstream, frequency, etc.  Couple that with device, payment methods, demographic data, product attributes, not to mention data across your other channels, and any retailer is quickly drowning in data.

Rushin points out that, regardless of the size of your data set, what matters is whether you can act on it. He advises you to look for solutions that can readily supply BI value and insights.

Finally, I encourage you to spend 40 minutes and watch this video presentation by Jim Stogdill on how corporations will evolve leveraging big data (tasty tidbit: hear how a corporation is compared to a nematode).

Innovation in social analytics

Data analysis is the new plastics. Remember this scene from the movie The Graduate?

Below is a curated list of articles from this week of innovative social analytics and business intelligence initiatives.

In this article from O’Reilly Radar, we learn that social network analysis is an amalgamation of social science disciplines such as sociology, political science, psychology, and anthropology, combined with traditional mathematical measurements. At its core, social network analysis measures relationships between people and organizations. But cutting-edge research is also looking at ways to leverage social network analysis as a form of early warning system for natural disasters. Much social network analysis has been regressive in nature; the future will focus more on real time analysis.

And speaking of real time analytics, the article from the Washington Post makes the argument that real time results may have a significant influence on the upcoming 2012 elections.

“Perry is done,” came a Twitter posting from a viewer called (at)PatMcPsu, even while the Texas governor struggled to name the third of three federal agencies he said he would eliminate as president. Another, called (at)sfiorini, messaged, “Whoa? Seriously, Rick Perry? He can’t even name the agencies he wants to abolish. Wow. Just wow.”

The key point to remember is that the “real time citizen” is no longer content to remain passive. Additionally, will the “real time citizen” quietly wait for poll stations and voting counts to close in other states before announcing the results of his/her own state? It will be interesting to watch how quietly or loudly Mr. and Mrs. Real Time Citizen react in 2012.

Finally, social app analytics start-up Kontagent snagged $12 million in a Series B round. According to an interview with Kontagent’s founder, what makes Kontagent unique is that it does not perform “traditional” social analytics functions (such as conversation monitoring, tabulating likes, etc.). Instead, it performs deep analytics with a focus on teasing out profitability KPIs, and it has a team of data analytics and data visualization scientists working to help clients understand, interpret, and make informed business decisions based on Kontagent’s proprietary data visualization techniques.

 

Research from CISCO, innovation in business intelligence services, and predictive Web data mining

Below are three articles discussing emerging analytical theories on the nexus between Web+Social+Mobile:

Executive Primer: CISCO CIO Summit (.pdf): Excellent primer on how The Cloud, generally, is affecting enterprise IT strategic direction. Two gems: Chapter 6 “Together, the Customer Is Everywhere and Everyone” and Chapter 10 “Scenario Planning: Are You Ready?”.

Business Intelligence 2.0: Are we there yet? (.pdf): Excellent paper focusing on innovation in business intelligence; includes an excellent benefits analysis chart.

Toward Emerging Topic Detection for Business Intelligence: Predictive Analysis of ‘Meme’ Dynamics (.pdf): This is for analytical geeks only (:-D). The paper discusses the problem of monitoring the Web to spot emerging memes. Essentially, using predictive algorithms to tease out future memes, which would be useful to brand managers in terms of seeding current campaigns with flavors of the future as dictated by the algorithm. The risk is that it can get a bit tautological.

 

Leveraging data analytics for competitive advantage

Two articles recently caught my interest. The first article from the Financial Times, Smarter leaders are betting big on data (registration required) focuses on how companies use data analytics for business intelligence purposes. The best quote from this article:

Data is the new plastics

The second article from the Los Angeles Times, He’s start-ups’ best friend, profiled angel investor Ron Conway and his theories about investing in start-ups. The most telling quote from this article:

His current focus is “real-time data” companies that help people share what they’re doing instantly – using text, photos and video. “This sector is going to be huge,” he said.

As real-time data begins to inundate firms more and more by virtue of their forays into the social web and mobile world, data analytics offers a way for firms to utilize this data in novel ways to deliver more engaging and relevant experiences to their customers. For example, a firm could use data analytics in a predictive manner to dynamically deliver more relevant web pages based on consumers’ behavior throughout a firm’s website. Similarly, firms can use a service like Flowtown in conjunction with a service like First American Core Logic’s lead qualification services to gain insight into a registrant by combining their social persona with their transactional persona and then deliver relevant data and content based on this combined persona. Firms that begin to leverage data analytics will have distinct advantages over their competition in the near and long-term future.

Using text analytics to increase customer engagement and loyalty

I love it when research/theory manifests in application/practicality. In 2007, I wrote about research being conducted on semantic analysis related to social media and blogs, and now there are companies using products stemming from this type of research.

Information Week covered text analytics, describing how JetBlue uses text analytics to understand customer sentiment from email messages, which informed the airline how to draft its customer bill of rights. And KMWorld discusses how the burgeoning field of “customer experience analysis” uses text analytics to increase customer engagement and loyalty.

Customers today aren’t just customers–they’re influencers and social networkers. Across the Web at any hour, they’re sharing observations about your company’s products and services, and those of your competitors…These new modes of customer behavior make it essential for companies to move beyond traditional ways of gathering, analyzing, and acting on customer information – Information Week

For a long time, text analytics was a technology in search of a business need. Now, thanks to social media, the need is there; the question is whether the technology can ramp up fast enough to be commercial – KMWorld

Where social media in real estate sometimes has the floor manners of a dog’s breakfast, it’ll become increasingly important for real estate firms to engage in text-sentiment analysis as part of their overall CRM and customer experience efforts. Here’s a list of companies that offer text-sentiment analysis services:


Peering Under the Hood at Facebook

If one stops and ponders the amount of data and content users add to Facebook on a daily basis, it’s truly staggering. I’ve often wondered what the Facebook data team does with this data and content. Recently, I stumbled across two insightful articles and a video series that shed some light on this.

The first article discusses how the Facebook data team uses statistical analysis to make informed product development decisions (the article also touches on Google’s use of data modeling and statistics).

Facebook’s Data Team used R in 2007 to answer two questions about new users: (i) which data points predict whether a user will stay? and (ii) if they stay, which data points predict how active they’ll be after three months?

For the first question, Itamar’s team used recursive partitioning (via the rpart package) to infer that just two data points are significantly predictive of whether a user remains on Facebook: (i) having more than one session as a new user, and (ii) entering basic profile information.

For the second question, they fit the data to a logistic model using a least angle regression approach (via the lars package), and found that activity at three months was predicted by variables related to three classes of behavior: (i) how often a user was reached out to by others, (ii) frequency of third party application use, and (iii) what Itamar termed “receptiveness” — related to how forthcoming a user was on the site.
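To make the recursive partitioning step described above more concrete, here is a minimal sketch in Python (not the R rpart package the team used). The data, the feature names (`multi_session`, `has_profile`, `uploaded_photo`), and the depth limit are all made up for illustration; the toy data is rigged so that only users with more than one session and a filled-in profile stay.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, features):
    """Return the boolean feature whose split gives the lowest weighted impurity."""
    best = None
    for f in features:
        yes = [y for r, y in zip(rows, labels) if r[f]]
        no = [y for r, y in zip(rows, labels) if not r[f]]
        score = (len(yes) * gini(yes) + len(no) * gini(no)) / len(labels)
        if best is None or score < best[1]:
            best = (f, score)
    return best[0]

def grow(rows, labels, features, depth=0, max_depth=2):
    """Recursively partition the rows; leaves hold the majority label."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or gini(labels) == 0.0 or not features:
        return majority
    f = best_split(rows, labels, features)
    yes = [(r, y) for r, y in zip(rows, labels) if r[f]]
    no = [(r, y) for r, y in zip(rows, labels) if not r[f]]
    if not yes or not no:
        return majority
    rest = [g for g in features if g != f]
    return {
        "feature": f,
        "yes": grow([r for r, _ in yes], [y for _, y in yes], rest, depth + 1, max_depth),
        "no": grow([r for r, _ in no], [y for _, y in no], rest, depth + 1, max_depth),
    }

# Hypothetical new-user records: (multi_session, has_profile, uploaded_photo, stayed)
raw = [
    (True, True, False, True), (True, True, True, True),
    (True, False, True, False), (True, False, False, False),
    (False, True, True, False), (False, False, False, False),
    (False, True, False, False), (False, False, True, False),
]
features = ["multi_session", "has_profile", "uploaded_photo"]
rows = [dict(zip(features, r[:3])) for r in raw]
stayed = [r[3] for r in raw]

tree = grow(rows, stayed, features)
print(tree)
```

On this toy data the tree splits on the session count and then on the profile flag, and never uses the photo feature. Real packages add pruning and significance checks that this sketch leaves out.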

The second article, posted by the Facebook data team in response to this Economist article, gives a very insightful description as to how the Facebook data team uses statistical analysis to answer an important question:

We were asked a simple question: is Facebook increasing the size of people’s personal networks? This is a particularly difficult question to answer, so as a first attempt we looked into the types of relationships people do maintain, and the relative size of these groups.

What the Facebook data team found was that a user’s passive network is 2 to 2.5 times larger than their active network (i.e., a reciprocal network where there is an active two-way communication happening), and that a passive network is just as important as a reciprocal network in building buzz.

The stark contrast between reciprocal and passive networks shows the effect of technologies such as News Feed. If these people were required to talk on the phone to each other, we might see something like the reciprocal network, where everyone is connected to a small number of individuals. Moving to an environment where everyone is passively engaged with each other, some event, such as a new baby or engagement can propagate very quickly through this highly connected network.
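As a rough illustration of the distinction, the sketch below counts a user's reciprocal and passive networks from two event logs. The log formats, the names, and the definition of "passive" as "viewed someone's update" are my assumptions, not Facebook's actual method.

```python
def reciprocal_network(user, messages):
    """Contacts with whom `user` has exchanged messages in both directions."""
    sent = {dst for src, dst in messages if src == user}
    received = {src for src, dst in messages if dst == user}
    return sent & received

def passive_network(user, views):
    """Contacts whose updates `user` has viewed, whether or not anyone replied."""
    return {owner for viewer, owner in views if viewer == user}

# Hypothetical logs: (sender, recipient) and (viewer, owner of the update)
messages = [("ann", "bob"), ("bob", "ann"), ("ann", "cat"), ("dan", "ann")]
views = [("ann", "bob"), ("ann", "cat"), ("ann", "dan"), ("ann", "eve"), ("bob", "ann")]

recip = reciprocal_network("ann", messages)
passive = passive_network("ann", views)
print(sorted(recip), sorted(passive), len(passive) / len(recip))
```

The ratio in this toy example is arbitrary; the point is only that the two networks are computed from different kinds of events.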

I’ll take a leap and say that these findings helped drive some of the reasoning behind the updated profile home page and business page “lifestreaming” functionality. Facebook’s focus on having people set up a profile–and update this profile–and immediately engage with other people, coupled with an emphasis on increasing a user’s penetration within their passive network, is critical to Facebook’s continued growth. [Update: for an excellent three-part analysis of the new Facebook pages go here, here, and here]. We can see an example of this passive network effect below, where a Facebook user posted a short note that his twins are soon to be featured on CSI; the news spread quickly and opened up several channels of commentary:

[Image: passive network buzz using Facebook News Feed]

Here’s an additional link to some interesting insights by Facebook’s former head of data and analytics, Jeff Hammerbacher, into Facebook’s approach to data analytics and lessons learned (these are fairly long videos, but really really fun to watch). Hammerbacher discusses how they analyze terabytes of data in near-real time to allow their various business units to make more informed decisions. My key take-away from the videos is that a graphical display of data that allows users to also “hack” the data to gain deeper insights yields great product development and customer relationship management gains.

Asserting expertise and authority with a blog

“You either have high home prices or lower home prices and lower home prices are what we want, and people shouldn’t be afraid of that,” said Robert Shiller, Yale finance professor, in a Reuters interview. “Most of us care about our children and grandchildren, and these people have to buy houses so why would we want high home prices. We want economic growth, we don’t want high home prices.”

So, as the slow ride down continues, what’s happening in the realm of social media that will help you when the ride hits bottom and the ascent begins anew? For starters, Business Week Online, in its Feb 21, 2008 issue, is a great source for ideas.

Go ahead and bellyache about blogs. But you cannot afford to close your eyes to them, because they’re simply the most explosive outbreak in the information world since the Internet itself. And they’re going to shake up just about every business—including yours. It doesn’t matter whether you’re shipping paper clips, pork bellies, or videos of Britney in a bikini, blogs are a phenomenon that you cannot ignore, postpone, or delegate. Given the changes barreling down upon us, blogs are not a business elective. They’re a prerequisite. citation

Here’s a tip elite athletes adhere to: remember your competition is yourself and those out there who take the time to do one little extra thing, whether it’s one more hand-eye coordination exercise, or 55 more stairs to run, and it’s that one little extra thing that can separate a winner from a loser.

Ideas circulate as fast as scandal. Potential customers are out there, sniffing around for deals and partners. While you may be putting it off, you can bet that your competitors are exploring ways to harvest new ideas from blogs, sprinkle ads into them, and yes, find out what you and other competitors are up to. citation

Yes, social media will change the way real estate practices are conducted. One way–for the better–is simply to allow you to engage in more meaningful discussions with clients and potential clients. As a real estate professional, blogs operate as your authority imprimatur. As mainstream media begins to gobble up the blog premise and “commoditize” this presence, you will look out-of-date and “old school” if you don’t similarly innovate your mode(s) of communication.

“Mainstream media companies will master blogs as an advertising tool and take over vast commercial stretches of the blogosphere. Over the next five years, this could well divide winners and losers in media. And in the process, mainstream media will start to look more and more like—you guessed it—blogs.” citation

Using prediction markets in real estate

Many posts have been written on this paper, Using Prediction Markets to Track Information Flows: Evidence from Google. What’s interesting is the influence of proximity on prediction markets. According to the paper, sharing an office had the highest influence (as opposed to, for instance, communicating exclusively via email), and part of cultivating an innovative culture is to optimize physical locations to promote idea sharing, collaboration, etc. Microsoft also experimented with prediction markets to anticipate product deliverables. Innovative real estate firms could employ similar tactics among their real estate agent base to predict market changes and the buyer behavior that follows, and use these insights to better manage operations.

Real estate data integration for multi-channel marketing

The tightest definition of multichannel customer management I have yet found is:

Multichannel customer management refers to the design, deployment, coordination, and evaluation of channels through which firms and customers interact, with the goal of enhancing customer value through effective customer acquisition, retention, and development.

Neslin, et al. have authored a definitive research article that real estate firms can use to understand the challenges pertaining to “modern” real estate practices relating to client relationship, and agent relationship, issues. The research paper explores five primary challenges and analyzes the issues pertaining thereto.

Neslin begins by identifying the challenges:

[F]ive major challenges for managers: (1) data integration, (2) understanding consumer behavior, (3) channel evaluation, (4) allocation of resources across channels, and (5) coordination of channel strategies.

This post is the first in a four- or five-part series that will explore Neslin’s position and extrapolate it to real estate marketing and client relationship best practices.

Neslin begins by identifying the many ways in which consumers engage retail firms–kiosks, call centers, catalogs, bricks-and-mortar stores, etc. Real estate firms have similar interaction vehicles–front-yard signs, websites, office walk-ins, etc. Next, Neslin defines “channel”:

By “channel,” we mean a customer contact point, or a medium through which the firm and the customer interact.

He then sets the basis for his study: that the focus of MCM is on the customer, as MCM is a customer-centric function. Neslin next identifies the major phases of a client interaction:

First, customer perceptions and preferences drive channel choices (e.g., the customer may prefer the Internet for search because it is easy to use). Second, the customer learns from and evaluates his or her experiences, which feed back into the perceptions and preferences that guide his or her next shopping task (e.g., the customer may learn that the Internet search did not answer all the important questions). Third, the customer chooses both channels and firms, so from the customer perspective, it is a two-dimensional choice.

The relevant question then is: what investments must a firm make to harness this consumer interaction data? Neslin argues that firms do not necessarily have to invest in processes that involve “full data integration” in a quest to develop a “single view” of a customer. This suggests that firms should instead make strategic investments in data acquisition at key points in a transaction.

Real estate firms can leverage key consumer data acquisition “channels” or points. First, any point where a consumer registers for information is a channel. This real estate site contains at least 15 registration opportunities for clients during key phases of a transaction: from beginning (click-to-chat) to contacting an agent to book a showing appointment. Of course, many firms already have this data. So what’s the next step?

Data overlays.

That is, real estate firms should consider augmenting this core consumer registration data with real time, or post-transaction, data overlays from data aggregation companies like Experian, Acxiom, Equifax, etc. These overlays take the form of additional data elements such as demographics, psychographics, household income levels, lifestage, etc.
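Mechanically, an overlay is a join between the firm's registration records and the vendor's file on a shared key. The sketch below assumes email is that key and uses invented field names and values; real vendor files and matching rules will differ.

```python
def apply_overlay(registrations, overlay, key="email"):
    """Left-join overlay attributes onto registration records by `key`."""
    enriched = []
    for reg in registrations:
        extra = overlay.get(reg[key], {})
        # Registration fields win on conflict, so first-party data is never overwritten.
        enriched.append({**extra, **reg})
    return enriched

# Hypothetical registration records captured on the firm's site
registrations = [
    {"email": "pat@example.com", "zip": "60647", "interest": "condo"},
    {"email": "lee@example.com", "zip": "60614", "interest": "townhome"},
]

# Hypothetical overlay file keyed by email (only one registrant matched)
overlay = {
    "pat@example.com": {"lifestage": "young family", "income_band": "75-100k"},
}

enriched = apply_overlay(registrations, overlay)
for record in enriched:
    print(record)
```

Unmatched registrants simply keep their original fields, which is usually what you want: an overlay should add to the record, never block it.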

Another form of consumer data can be supplied by real estate agents. Although somewhat rare, some agents actually keep client profiles (likes, desires, familial relationships). Why? Because these agents know that understanding a client’s profile allows them to serve this client (and like clients) at a degree somewhat higher than the norm. These agents use these profiles as their competitive differentiator.

Creating client profiles (either at the per record level, or aggregate level) should be considered a first step for any real estate firm that’s serious about multi-channel management. By using such profiles, firms can engage clients at a more relevant and informative level, thus maximizing the return on investment the customer is making by spending time on the real estate firm’s site. Similarly, a firm maximizes its own return on investment by allocating tight marketing resources in a more intelligent and cost-conscious manner.

Real estate zip code search optimization

It looks like this company is winning the Chicago real estate search engine optimization strategy and execution race. These representative results speak for themselves: 60647 homes for sale, and 60647 townhomes for sale, and 60647 condos for sale all have this website listed in Google’s top slot (at least as of the date of this post). But what really sends this site over the top in terms of customer service and Internet consumer convenience is its RSS feed.

Multichannel marketing forensics

Kevin Hillstrom, President of MineThatData, has written an excellent whitepaper on conducting a multichannel forensics analysis. Why is this whitepaper an important resource for real estate firms? Because real estate firms are engaged in complex multichannel marketing endeavors, yet only a handful of these firms analyze their data from a multichannel perspective.

How does a firm begin its forensics analysis? Hillstrom explains:

  1. Understand the Retention Mode your product, brand or channel resides in.
  2. Understand the Migration Mode your product, brand or channel resides in.
  3. Combine the Retention and Migration Mode, understand which of twelve retention/migration modes your business operates in. This determines the way you will grow your business, long-term.
  4. Map the Ecosystem, so that the executive can clearly understand how all products, brands and channels interact with each other.
  5. Forecast the Ecosystem. This allows the executive to understand the long-term health of the ecosystem, given various marketing initiatives.

A key point Hillstrom makes is to look at multichannel businesses as ecosystems, where each product and division is interdependent on one another (a biodiversity perspective would also apply). Unfortunately, many companies are still balkanized in this regard.

For the most part, real estate firms have at least centralized their focus around a core product and service: representing buyers and sellers of homes and other forms of real estate, combined with highly related ancillary businesses such as rentals, REO, mortgage and title services, etc. This is a real estate firm’s ecosystem.

Hillstrom, in this whitepaper, has identified several business modes and strategic considerations related thereto. With the exception of certain commercial divisions and investment services, real estate firms fall within one of the two following modes: Acquisition / Equilibrium Mode and Acquisition / Transfer Mode. Both modes imply a constant sourcing of new customers with differences in how customers adopt new products or services. In the case of the former, Hillstrom states customers occasionally migrate, whereas in the case of the latter, the assumption is that customers will migrate to another product (much like a professional baseball player over his career migrates between teams).

So how can real estate firms a) position their products and services more relevantly to new sources of customers while b) moving the “may migrate” class into the “probably will transfer” segment? Hillstrom advocates mapping the ecosystem:

A key aspect of Multichannel Forensics is the mapping of the ecosystem you work in. Each combination of products, brands and channels are mapped. Any relationships in equilibrium or transfer are mapped with arrows, arrows that indicate the direction of the relationship.
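One way to picture such a map is as a set of labeled arrows between business lines. The sketch below is only an illustration: the customer IDs, the business lines, and especially the two cutoffs are placeholders of mine, not Hillstrom's actual thresholds, which are not given here.

```python
def migration_rates(last_year, this_year):
    """For each ordered pair (a, b), the share of last year's `a` customers who bought `b` this year."""
    rates = {}
    for a, buyers in last_year.items():
        for b, later in this_year.items():
            if a != b and buyers:
                rates[(a, b)] = len(buyers & later) / len(buyers)
    return rates

def label_edges(rates, equilibrium=0.2, transfer=0.5):
    """Turn migration rates into labeled arrows. Both cutoffs are placeholder assumptions."""
    edges = {}
    for pair, rate in rates.items():
        if rate >= transfer:
            edges[pair] = "transfer"
        elif rate >= equilibrium:
            edges[pair] = "equilibrium"
    return edges

# Hypothetical customer IDs by business line
last_year = {"rentals": {1, 2, 3, 4}, "sales": {5, 6, 7, 8, 9}}
this_year = {"sales": {1, 2, 3, 10}, "mortgage": {1, 5, 6, 7}, "rentals": {4}}

edges = label_edges(migration_rates(last_year, this_year))
for (src, dst), mode in sorted(edges.items()):
    print(f"{src} -> {dst}: {mode}")
```

In this made-up data most renters go on to buy, so the rentals-to-sales arrow is labeled a transfer, while only a minority of renters use the mortgage service.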

The next step is to forecast the ecosystem, which, Hillstrom argues, enables executives to engage in valuable scenario analyses.

The benefit to a real estate firm in undertaking these analytical steps is that it will have a deeper understanding as to how its agents influence (negatively or positively) the firm’s sales of its primary and ancillary products and services. What’s also beneficial about Hillstrom’s whitepaper is that he actually gives you a step-by-step process by which to perform the analysis.

Gatineau Project marketing metrics

Eric Peterson continues to provide great insight. He has an exclusive profile of the Microsoft Gatineau project. At first glance, the Gatineau project is quite impressive. What’s particularly pleasing is that it appears to have been designed for marketing personnel and business managers. The visual representation of the data clearly indicates relevant campaign success and failure metrics.

Nevertheless, there are some considerations: Will this service give an accurate and full representation of data across multiple universes, or is it limited to the MSN universe? Can firms track their competitors with this program? And the demographic data seems to be self-reported data from MSN rather than drawn from a wider sample data set; thus, how representative is the demographic data in Gatineau?

False profiles and the Internet consumer

Arguably, nothing messes with a firm’s loyalty and/or CRM strategy more than a multitude of false consumer profiles polluting a CRM database. In seeking to elevate one’s marketing engagement index, it’s often helpful to understand the demographic profile of a consumer. But if such a consumer does not self-report this, or if such data is not inferred, then firms are at the mercy of the garbage.

Interestingly, a research team claims in their research paper:

The profiles users may contain fake information. We believe that our proposed algorithm can be used to identify and refine the profiles which contain bogus demographic information.

Essentially, this team analyzed web log files for search patterns and used an algorithm to predict gender or age. They claim a lift in accuracy of 30.4% on gender prediction and 50.3% on age prediction over traditional methodologies.

What makes this exciting is that, assuming further testing bears out the team’s claims, companies like HitWise or WebTrends can incorporate this algorithm into their search pattern analysis products. Firms can then use this core demographic information to craft more relevant landing pages, calls to action, etc., on their websites.

Profiling hedonic data in social networks

Continuing the discussion from the McKinsey interview of Cammie Dunaway, she states:

[Yahoo!] is using behavioral data–really mining the wealth of transactional data we have about how people are spending their time online and trying to marry that data with attitudinal data…that’s where the most powerful insights can really come from.

Insights into what? It could be many things. Two of the most studied motivational data elements are utilitarian motivations and hedonic motivations. Utilitarian motivations center around goal-oriented behavior (e.g., I logged in to check my email, I checked my email, I logged out). Hedonic motivations are more social in nature (e.g., I logged in to explore, to analyze, to decide, to eventually take action).

In real estate search, companies have typically focused on rewarding utilitarian behavior, often in a very reactionary manner. Consumer searches site > Consumer registers > Consumer selects home > Consumer is “passed off” to a real estate agent. Of course, the ultimate goal is to consummate a sale. And improving the “experience” of looking for a home on a real estate firm’s website could actually lead to more loyalty, referrals, and sales.

Nevertheless, overly focusing on “experience” at the expense of a goal can scuttle both consumer loyalty and ROI. Thus, balance lies in properly testing and deploying Web 2.0 assets that fulfill consumer goals while logically jibing with the product subject matter.

So how does mining attitudinal data fit this balanced approach or paradigm? Incenting consumers to add profile information that logically fits a goal is one idea. For example, if a real estate firm’s goal was to create a social network on their site targeted at tapping a suburban soccer mom demographic looking to buy a home, logical profile information may be zip code (current residence and desired residence), schools, sports, design preferences, and home type.

Zip code is important because the firm could relate this consumer to an agent who serves that zip code, where the agent serves as the social network ombudsman(woman) to answer questions and otherwise kick-start the group. Secondly, once a firm understands home type preferences and desired location, the firm can relate specific home information, community information and statistics, and other moms in the network to this person. The additional profile information constitutes community building information (e.g., relating moms who have children in similar sports). These steps help build a community and take the burden off the real estate firm to be all things to all consumers (if a mom has questions about how her child can join a traveling baseball team, she could ask the real estate agent, but more likely she’d ask the community). This way the firm’s “social asset” reinforces the firm’s local expertise, which allows for an eventual monetization of this consumer as she “graduates” through the process into ultimately looking at home types and eventually purchasing a home.

By tracking profile data, the consumer’s interaction with the group (communications, postings, etc.), and her use of utilities (e.g., widget downloads pertaining to design elements, video home tours, community data, statistics, etc.), a firm could create an “engagement” index to validate whether its site is properly satiating consumers’ needs (Circuit City does this). The experience of this for the consumer is not so much having real estate listings and drip marketing pushed her way, but related data presented in a way that allows her to more deeply engage in the process and begin building a community before actually living in a community. Finally, in terms of lifetime value, this type of social network could operate as a forum for a firm–and its real estate agents–to cultivate a valid and meaningful long-term relationship with consumers after they have actually bought a home (thus closing the circle by adding transactional data to previously compiled attitudinal data).
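As a minimal sketch of what such an engagement index might look like, the snippet below computes a weighted sum of a consumer's tracked activity. The activity categories and weights are invented for illustration; a firm would need to choose and validate its own.

```python
# Hypothetical weights: how much each kind of activity counts toward engagement
WEIGHTS = {
    "profile_fields": 1.0,    # profile fields completed
    "posts": 3.0,             # messages or postings in the group
    "widget_downloads": 2.0,  # design widgets, community data, etc.
    "video_tours": 2.0,       # video home tours watched
}

def engagement_index(activity, weights=WEIGHTS):
    """Weighted sum of activity counts; missing activity types count as zero."""
    return sum(weight * activity.get(kind, 0) for kind, weight in weights.items())

# Hypothetical activity counts for two consumers
browser = {"profile_fields": 2}
joiner = {"profile_fields": 5, "posts": 2, "video_tours": 1}

print(engagement_index(browser))
print(engagement_index(joiner))
```

A simple sum like this is easy to explain to agents and managers. Whether it actually tracks loyalty or sales is something the firm would have to test against its transactional data.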