Before looking at the big data solution that helped SEGA achieve customer segmentation, customer lifetime value analysis, quality-of-service monitoring, and subscriber churn prediction, let us briefly cover what SEGA is and why it needed such a solution.
You may not have heard the name before, but you have probably played or seen games such as Sonic the Hedgehog, Streets of Rage, or Yakuza. All of these games are products of SEGA Corporation.
SEGA is a Japanese multinational video game and entertainment company headquartered in Shinagawa, Tokyo. It was founded in 1960 by Martin Bromley and Richard Stewart. For 63 years, SEGA has been entertaining players with its video games, currently under the leadership of Haruki Satomi, the chairman and CEO of SEGA.
The name SEGA is short for Service Games, a business whose roots lie in coin-operated slot machines during the World War II era. After many ups and downs, SEGA has become one of the biggest video game companies in the world, with a latest reported revenue of $3.14 billion and a customer base of 30 million.
The Big Data Problem SEGA Was Facing
From iconic games like Sonic the Hedgehog to recent favorites like Warhammer, SEGA has delighted gamers worldwide for decades. Across its base of 30 million customers, SEGA collects more than 25,000 data events every second on player behavior and in-game interactions.
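To put that rate in perspective, here is a quick back-of-the-envelope calculation. It assumes the 25,000 events per second figure holds as a steady average, which the source does not state, so treat the results as rough orders of magnitude.

```python
# Rough volume at the quoted ingestion rate.
# Assumption: 25,000 events/second is a steady average, not a peak.
EVENTS_PER_SECOND = 25_000

SECONDS_PER_MINUTE = 60
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

events_per_minute = EVENTS_PER_SECOND * SECONDS_PER_MINUTE
events_per_day = EVENTS_PER_SECOND * SECONDS_PER_DAY

print(f"{events_per_minute:,} events per minute")  # 1,500,000
print(f"{events_per_day:,} events per day")        # 2,160,000,000
```

At that pace, a single day produces on the order of two billion events, which is why a laptop-based workflow quickly runs out of room.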
During the COVID-19 pandemic, SEGA saw its number of active players double, but it struggled to gain actionable insights from that much data. Its data analysts lacked the computing capacity to handle data at that scale in an environment that would let the data science teams deliver actionable analytics.
SEGA has long been a powerhouse of innovation and creativity in the gaming industry, producing multi-million-selling franchises like Sonic the Hedgehog. With over 30 million customers, SEGA is still focused on creating great gaming experiences, but now with data as the primary driver, collected at a rate of more than 25,000 events per second.
The goal is to harness that data to build an interactive gaming community in which the player's experience is personalized across all touchpoints, from interactions with customer service teams to new in-game features that drive engagement. These ambitions were not producing results, however, because the teams were tied up dealing with the big data problem itself.
Before adopting the big data solution, SEGA's data was scattered across different environments, which made it difficult for teams to operationalize unstructured data. Streaming data ingestion was neither efficient nor effective enough for analytics and machine learning.
At the time, it was difficult for me to process our data because it was stored on different platforms and architectures. We had things stored on S3, Redshift, and others were placed on Microsoft Azure.
Mr. Stanley Wang, Data Scientist at SEGA
On the same problem, Mr. Wang added:
One of the biggest headaches was managing the ingestion of all these data sources in one place so that we could use them for ML projects.
Working in Jupyter Notebook on their own machines, the data science team spent much of its time just importing and accessing data from these various sources. This hampered their ability to analyze the data and provide insights, both to the product team for game innovation and to the studio teams for commercial and marketing decisions.
We had bottlenecks when three studios tried to make analytical queries on the same Redshift table at the same time, while a fourth had already launched a job lasting 10 hours, blocking its use.
Felix Baker, a Data Services Manager at SEGA
The Big Data Solution
After testing several other options, SEGA selected the Databricks Lakehouse Platform on AWS as the foundation for its data engineering and analytics.
We tried cloud-based data warehouses, but when tested, they didn’t have sufficient ingest capabilities for our streaming needs.
Felix Baker, a Data Services Manager at SEGA
This is how Databricks became SEGA’s big data solution. With Databricks, SEGA can handle the compute volume needed for structured and unstructured data, including financial data, anonymized customer information, and in-game behavior and analytics data. Databricks also helped SEGA build more efficient, smoother data pipelines that feed BI reports and ML models focused on improving the gaming experience.
The lakehouse architecture suits our needs perfectly,
Francis Hart, Director of Online Technology at SEGA.
Because Databricks provides a common platform where all the data is stored, teams can collaborate on the same data and deliver insights in real time, which has improved productivity and efficiency.
SEGA now uses Databricks SQL to track key metrics and group users into cohorts by playstyle, through both BI reports and real-time data. These insights drive community activities, help evaluate new features, identify opportunities to better engage the community, prevent pirated usage, improve player engagement, and more.
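To make the idea of cohorting by playstyle concrete, here is a minimal sketch in plain Python. It is not SEGA's implementation and does not use Databricks SQL. The session records, field names, and playstyle labels are all hypothetical, invented only to show the grouping step behind such a report.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical session records; field names and values are assumptions.
sessions = [
    {"player_id": "p1", "playstyle": "explorer", "minutes_played": 42},
    {"player_id": "p2", "playstyle": "competitive", "minutes_played": 95},
    {"player_id": "p3", "playstyle": "explorer", "minutes_played": 30},
    {"player_id": "p4", "playstyle": "casual", "minutes_played": 12},
    {"player_id": "p5", "playstyle": "competitive", "minutes_played": 65},
]


def cohort_metrics(records):
    """Group sessions by playstyle; report distinct players and mean session length."""
    by_style = defaultdict(list)
    for record in records:
        by_style[record["playstyle"]].append(record)
    return {
        style: {
            "players": len({r["player_id"] for r in rows}),
            "avg_minutes": mean(r["minutes_played"] for r in rows),
        }
        for style, rows in by_style.items()
    }


metrics = cohort_metrics(sessions)
for style in sorted(metrics):
    print(style, metrics[style])
```

In a lakehouse setting the same grouping would typically be expressed as a SQL aggregation over an events table, but the logic is the same: one row per cohort, one column per metric.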
SEGA has also developed its own ML algorithm to tailor games and updates to players based on their interactions. For example, if new players struggle to establish themselves in a game within a certain period, SEGA examines and updates the UI to make it easier to use.
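The source does not describe how struggling players are detected, so the following is only an illustrative sketch. It uses a simple hand-written rule rather than an ML model, and every field name and threshold in it is a hypothetical assumption, not SEGA's actual logic.

```python
from dataclasses import dataclass


@dataclass
class NewPlayer:
    # All fields are hypothetical signals, chosen for illustration only.
    player_id: str
    days_since_install: int
    tutorial_completed: bool
    levels_cleared: int


# Assumed thresholds; a real system would tune or learn these from data.
ONBOARDING_WINDOW_DAYS = 7
MIN_LEVELS_CLEARED = 3


def is_struggling(player: NewPlayer) -> bool:
    """True if the onboarding window has passed and the player is not yet established."""
    window_elapsed = player.days_since_install >= ONBOARDING_WINDOW_DAYS
    established = (
        player.tutorial_completed and player.levels_cleared >= MIN_LEVELS_CLEARED
    )
    return window_elapsed and not established


def struggling_share(players) -> float:
    """Fraction of players flagged as struggling (0.0 for an empty input)."""
    players = list(players)
    if not players:
        return 0.0
    return sum(is_struggling(p) for p in players) / len(players)


new_players = [
    NewPlayer("p1", days_since_install=10, tutorial_completed=True, levels_cleared=5),
    NewPlayer("p2", days_since_install=9, tutorial_completed=False, levels_cleared=0),
    NewPlayer("p3", days_since_install=8, tutorial_completed=True, levels_cleared=1),
    NewPlayer("p4", days_since_install=2, tutorial_completed=False, levels_cleared=0),
]
share = struggling_share(new_players)
print(f"{share:.0%} of new players flagged as struggling")
```

A rising share after a release would be the kind of signal that prompts a closer look at the UI. The player p4 is not flagged because the assumed onboarding window has not elapsed yet.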
What did SEGA achieve after solving this Big Data problem?
With the Databricks Lakehouse in place, SEGA unlocked gaming insights that improve the player experience and boost monetization opportunities through targeted product innovations designed to drive engagement and revenue. Data science and studio teams work faster and more efficiently because all the data is available in one centralized environment that makes model execution simple. Streamlined data access on the Databricks Lakehouse gives SEGA more valuable data than it has ever had.
With the previous architecture, we managed to collect data every half-hour, at best. With Databricks we can collect them every minute.
Felix Baker
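The difference between those two collection intervals can be worked out directly. The sketch below uses only the intervals from the quote, plus the 25,000 events per second rate mentioned earlier. Treating that rate as a steady average is an assumption.

```python
MINUTES_PER_DAY = 24 * 60  # 1,440 minutes

old_interval_min = 30  # "every half-hour, at best"
new_interval_min = 1   # "every minute"

# How many collection runs fit in a day at each interval.
old_runs_per_day = MINUTES_PER_DAY // old_interval_min
new_runs_per_day = MINUTES_PER_DAY // new_interval_min

# How much fresher the data is: ratio of the two intervals.
freshness_gain = old_interval_min / new_interval_min

# Assumption: 25,000 events/second as a steady average.
EVENTS_PER_SECOND = 25_000
events_per_old_batch = EVENTS_PER_SECOND * old_interval_min * 60
events_per_new_batch = EVENTS_PER_SECOND * new_interval_min * 60

print(f"Runs per day: {old_runs_per_day} -> {new_runs_per_day}")
print(f"Data is up to {freshness_gain:.0f}x fresher")
print(f"Events per batch: {events_per_old_batch:,} -> {events_per_new_batch:,}")
```

In other words, the worst-case staleness of a data point drops from 30 minutes to 1 minute, and each batch is roughly 30 times smaller.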
With the Databricks Lakehouse as the foundation of its machine learning and data analytics infrastructure, SEGA is preparing to pursue more use cases, including sentiment analysis on social media to gauge prerelease excitement and post-update reviews; player behavior analysis to uncover “unsuspected” styles of play; real-time game statistics during streamer broadcasts to further customize communication; and analysis of distributor data for sales and financial forecasts.
Building a loyal and prosperous community takes a lot of collective effort. With data insights generated by the activity of its passionate customers, SEGA is now well placed to deliver a range of customer experiences designed to improve brand quality while increasing revenue, today and in the future.
Having better and faster insights into our data allows us to deliver a better community gaming experience that increases customer satisfaction for not only our games but also the entire SEGA experience,
Francis Hart