Netflix foretells 'House of Cards' success with Cassandra big data engine

Big data algorithms helped Netflix map the viewing patterns of its customers

House of Cards, starring Kevin Spacey, is the first major TV show to completely bypass the usual television ecosystem of networks and cable operators and premiere on the streaming service Netflix.

It may seem like Netflix took a big risk buying in unproven content rather than licensing content that was already successful. In reality, however, Netflix knew that the series would be a hit, based on data about the viewing habits of its 33 million users.

Using the NoSQL database Apache Cassandra, Netflix was able to gather real-time data about the programmes its customers were watching, their demographics and viewing patterns, and build up an authoritative picture of the kind of content that would be well received.

Matt Pfeil, co-founder and VP of Customer Solutions at big data software company DataStax, which worked with Netflix to implement Cassandra, explained that this is the first time that programming has been developed with the aid of big data algorithms.

“Netflix has all these data points about movies getting watched, and they can look for things like, do people like actions or drama who are our highest returning customers? Who is the lead in most of those? What type of characteristics of films provide the most engaged watching experience? And then they can use that to go figure out which series they should potentially buy,” he said.
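The kind of question Pfeil describes can be illustrated with a small sketch. This is not Netflix's method, which the article does not detail; it is a minimal Python example under invented assumptions. The records, the field names (`user`, `genre`, `lead`, `watched_min`, `runtime_min`) and the helpers `returning_users` and `engagement_by` are all hypothetical. It treats anyone with at least two viewings as a returning customer and averages the fraction of each programme they watched, grouped by genre or by lead actor.

```python
from collections import defaultdict

# Hypothetical viewing records: every field name and value is invented for illustration.
VIEWS = [
    {"user": "u1", "genre": "drama", "lead": "Actor A", "watched_min": 50, "runtime_min": 50},
    {"user": "u1", "genre": "drama", "lead": "Actor A", "watched_min": 45, "runtime_min": 50},
    {"user": "u2", "genre": "action", "lead": "Actor B", "watched_min": 20, "runtime_min": 100},
    {"user": "u3", "genre": "drama", "lead": "Actor C", "watched_min": 30, "runtime_min": 60},
    {"user": "u3", "genre": "action", "lead": "Actor B", "watched_min": 90, "runtime_min": 100},
]


def returning_users(views, min_views=2):
    """Return the set of users with at least `min_views` viewing records (assumed threshold)."""
    counts = defaultdict(int)
    for view in views:
        counts[view["user"]] += 1
    return {user for user, n in counts.items() if n >= min_views}


def engagement_by(views, key, users):
    """Average fraction of runtime watched, grouped by `key`, restricted to `users`."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum of fractions, record count]
    for view in views:
        if view["user"] in users:
            bucket = totals[view[key]]
            bucket[0] += view["watched_min"] / view["runtime_min"]
            bucket[1] += 1
    return {k: total / n for k, (total, n) in totals.items()}


if __name__ == "__main__":
    loyal = returning_users(VIEWS)
    print(engagement_by(VIEWS, "genre", loyal))  # engagement per genre
    print(engagement_by(VIEWS, "lead", loyal))   # engagement per lead actor
```

At Netflix's scale, a grouping like this would run against a distributed store rather than an in-memory list, but the shape of the question is the same: filter to the customers of interest, then aggregate by a content attribute.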

Netflix began moving its data to Amazon Web Services (AWS) in 2010 and replaced its Oracle SQL database with Apache Cassandra the following year. According to Netflix, Oracle's SQL database inhibited the exchange of data around the world and required regular downtime for schema changes.

“From a practical computer science-level, traditional relational database technologies were not built to accommodate large volumes of data, especially in any way shape or form from a cost-effective perspective,” said Pfeil.

Netflix chose Cassandra because it offered a globally distributed data model, along with the flexibility to create and manage data clusters quickly.

By mid-2011, Netflix was using six major applications with Cassandra, including its subscriber system, A/B testing, and viewing history service (including positions at which members stopped watching a streaming programme).

Each cluster has a multiple of 12 nodes. In addition to the six clusters, one for each application in production, Netflix has a shared Cassandra cluster with 12 nodes, used for smaller applications that don’t need their own cluster.

According to Adrian Cockcroft, cloud architect at Netflix, the regular downtime that was needed for schema changes to the Oracle SQL database is no longer necessary, and a Cassandra cluster can be created in any region of the world in 10 minutes.

“We don’t have to plan capacity in advance, we don’t need to ask permission of other people to build things for us, and we don’t worry about running out of space or power,” he said.

Netflix is by no means the only big company using Cassandra to process big data in real time. DataStax has more than 250 customers worldwide, including 20 of the Fortune 100 companies, and the company claims that there is demand for the technology in almost every vertical.

For example, Rackspace is using Cassandra to monitor the metrics on all of its servers to determine which ones are under heavy load and might fall over.

“Everything from tech, healthcare, education, financials, retail. We're true platform players, so we're across the board,” said Pfeil.

DataStax already integrates Apache Hadoop and Apache Solr into its NoSQL big data platform, and Pfeil expects the technology to continue evolving over the next ten years.

“If you talk about this age as the data age, we're still in the teenage years, and as it matures there's going to be orders of magnitude of different types of technologies that just encompass big data,” he said. “The more data you have and the more you can do with it, the smarter this business decision.”



Comments

SoulHonky said: If data is so important, shouldn't you wait for actual data about how the show fared and how it affected subscriptions before calling it a hit or a success?


