Data from Adobe Campaign, Kickdynamic, Dynmark and other data sources was difficult to access.
In September 2013, Profusion’s data science team responded by designing a new analytical database (ADB) that merged data from multiple siloed data sources into one unified data warehouse.
The database enabled data scientists to query all of these data sources using SQL. Data analysts were able to use the new infrastructure to produce detailed marketing reports in one hour instead of eight. This marked a radical shift for the business, enabling quick insights for our teams of marketing consultants and campaign managers.
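As a rough illustration of what a single SQL query across formerly siloed sources can look like, here is a minimal sketch using Python's built-in sqlite3 module. The table names (email_sends, sms_sends), columns and rows are hypothetical stand-ins invented for this example, not the real ADB schema or the actual structure of any source system.

```python
import sqlite3

# In-memory database standing in for the unified warehouse (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables, one per formerly siloed source.
    CREATE TABLE email_sends (customer_id INTEGER, campaign TEXT);
    CREATE TABLE sms_sends   (customer_id INTEGER, campaign TEXT);
    INSERT INTO email_sends VALUES (1, 'spring'), (2, 'spring'), (3, 'summer');
    INSERT INTO sms_sends   VALUES (1, 'spring'), (3, 'summer'), (3, 'spring');
""")

# One query spans both sources: number of contacts per campaign and channel.
query = """
    SELECT campaign, channel, COUNT(*) AS contacts
    FROM (
        SELECT campaign, 'email' AS channel FROM email_sends
        UNION ALL
        SELECT campaign, 'sms' AS channel FROM sms_sends
    ) AS all_sends
    GROUP BY campaign, channel
    ORDER BY campaign, channel
"""
report = conn.execute(query).fetchall()
for row in report:
    print(row)
```

Once the data sits in one place, a cross-channel report like this is a single query rather than a manual merge of separate exports.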
The datasets we work with are becoming increasingly large. We use Hadoop, Spark, Zeppelin and Hue to provide a powerful, unified analytics environment for big data.
Hadoop is a data infrastructure designed for parallelised computation. The advantage of a Hadoop cluster over a server running a traditional SQL-based data warehouse is that we can grow and scale it incrementally by adding one Hadoop node at a time.
Hadoop gives us parallelised computation, but writing parallelised algorithms is time-consuming and complex. Spark hides much of that complexity, making it easy for data scientists to harness machine learning libraries optimised for parallelised computation. Spark also provides a unified framework where querying and analysis can happen at the same time.
Notebook environment: Zeppelin and Hue
Zeppelin and Hue provide a user-friendly notebook environment where Spark and Hive code can be written and shared with colleagues. Exploring data and developing code here is much quicker and easier than at the command line.
Eucalyptus semantic layer
Currently it takes three to four months to train new data scientists to use and query the analytical database. Within Adobe Campaign, the data table structure is very complex. Adobe Campaign has been designed for users to log and store data efficiently, not to understand and query it easily.
Our data science team has designed new clean and clear data tables known as the semantic layer. This allows newly recruited data analysts and data scientists to get to grips with the data at speed and produce valuable insights soon after they join. They need relatively little training to do so.
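One common way to build a layer like this is with SQL views that join the raw tables once and expose readable column names. The sketch below assumes that approach purely for illustration; the source does not say how the Eucalyptus semantic layer is implemented, and the raw tables (rcp, dlv_log), their columns and the status codes are hypothetical, not Adobe Campaign's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical raw tables with terse, source-specific names.
    CREATE TABLE rcp (rcp_id INTEGER, eml TEXT);
    CREATE TABLE dlv_log (rcp_id INTEGER, dlv_id INTEGER, st INTEGER);
    INSERT INTO rcp VALUES (1, 'a@example.com'), (2, 'b@example.com');
    INSERT INTO dlv_log VALUES (1, 10, 1), (1, 11, 0), (2, 10, 1);

    -- Semantic-layer view: readable names, status codes decoded, join done once.
    CREATE VIEW email_deliveries AS
    SELECT r.eml    AS email_address,
           d.dlv_id AS delivery_id,
           CASE d.st WHEN 1 THEN 'delivered' ELSE 'failed' END AS status
    FROM dlv_log AS d
    JOIN rcp AS r ON r.rcp_id = d.rcp_id;
""")

# A new analyst queries the view without learning the raw table structure.
rows = conn.execute(
    "SELECT email_address, COUNT(*) FROM email_deliveries "
    "WHERE status = 'delivered' GROUP BY email_address ORDER BY email_address"
).fetchall()
print(rows)
```

The analyst only needs to know email_deliveries and its three columns; the join logic and the meaning of the status codes live in one place.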
Data from these clean tables is fed into a dashboard for marketers, campaign managers, email developers and third-party stakeholders. Anyone can explore and ask questions of our marketing data without needing a data analyst or data scientist. Key stakeholders can explore and interpret the data in real time to inform commercial decisions around time-critical client or customer needs.