
The (Data) Layer Cake

We are now thinking about data and analytics capabilities in three ‘layers’.

Platform Layer

Let’s start with the data platform. We agree with Gartner that “the market for cloud-based DBMS solutions for analytical use cases has matured and settled.” The necessary components, such as ETL, database, analytics and reporting, are increasingly packaged in a single service offering that presents a consistent ‘experience’ to all users. Our primary platform of interest is Microsoft Fabric, but other platforms are available, including Google BigQuery, Databricks and Snowflake, to name but a few.

We use a combination of two Gartner Critical Capabilities reports, Cloud Database Management Systems for Analytical Use Cases and Analytics and Business Intelligence Platforms, as a framework to compare these platforms and explore the valuable use cases. (Apologies, the reports are behind the Gartner paywall, but you can search for vendor-provided copies.)

Though we think the overall platform concept will be stable for a little while to come, the pace of change across these platforms is still breathtaking. We keep tabs by plugging the various blogs, such as the Google Cloud Blog and the Microsoft Fabric Blog, into an RSS reader such as Feedly or Inoreader, and from there we parse the content with AI to highlight key developments. You will see just by looking at the Microsoft Fabric June Updates how quickly things are moving.
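To make the "feed in, highlights out" idea concrete, here is a minimal sketch in Python. It is not our actual setup: the sample feed, the keyword list and the function names are all made up for illustration, and a plain keyword match stands in for the AI step. A real version would fetch each blog's RSS URL and hand the item text to a language model instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real one would be downloaded from a blog's RSS URL.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Platform Blog</title>
    <item>
      <title>AutoML preview announced</title>
      <description>New AutoML features for notebooks.</description>
    </item>
    <item>
      <title>Office closure notice</title>
      <description>Holiday schedule for the team.</description>
    </item>
  </channel>
</rss>"""

# Assumed watch list; this keyword match is a stand-in for the AI step.
KEYWORDS = ("automl", "preview", "copilot")


def parse_items(rss_text):
    """Return a list of (title, description) tuples from an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        description = item.findtext("description", default="")
        items.append((title, description))
    return items


def highlight(items, keywords=KEYWORDS):
    """Return the titles of items whose title or description mentions a keyword."""
    hits = []
    for title, description in items:
        text = f"{title} {description}".lower()
        if any(keyword in text for keyword in keywords):
            hits.append(title)
    return hits


if __name__ == "__main__":
    for title in highlight(parse_items(SAMPLE_RSS)):
        print(title)
```

Keeping the parsing and the filtering in separate functions means the keyword match can later be swapped for a model call without touching the feed handling.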

ML Pipeline Layer

The second layer we are focusing on is the Machine Learning pipeline, and in particular AutoML. Again, though we highlight AutoML in Fabric, all of the other platforms offer a similar capability. Understanding the business context and identifying valuable hypotheses is still an essential human capability, but much of the workload from that point forward is increasingly commoditised. “You implement each step from data ingestion, cleansing, and preparation, to training machine learning models and generating insights, and then consume those insights using visualization tools like Power BI.”
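At its core, the part that AutoML commoditises is trying several candidate models and keeping the one that scores best on held-out data. The sketch below shows only that idea in plain Python. It does not use Fabric's AutoML or any platform API; the data, the two candidate models and every name in it are hypothetical, and real AutoML tools also search hyperparameters and feature transformations.

```python
from statistics import mean

# Hypothetical (x, y) data, already ingested and cleansed.
TRAIN = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]
VALID = [(5.0, 10.1), (6.0, 11.9)]


def fit_mean(rows):
    """Baseline model: always predict the average training target."""
    avg = mean(y for _, y in rows)
    return lambda x: avg


def fit_line(rows):
    """Simple linear regression fitted by ordinary least squares."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in rows)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def mse(model, rows):
    """Mean squared error of a fitted model on (x, y) rows."""
    return mean((model(x) - y) ** 2 for x, y in rows)


# Assumed candidate list; a real AutoML search space is far larger.
CANDIDATES = {"mean_baseline": fit_mean, "linear_regression": fit_line}


def select_model(train, valid):
    """Fit every candidate on train and return the best name plus all scores."""
    scores = {name: mse(fit(train), valid) for name, fit in CANDIDATES.items()}
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    best, scores = select_model(TRAIN, VALID)
    print(best, scores)
```

The human work stays where the paragraph above puts it: deciding what to predict and why. The loop over candidates is the part a platform can run for you.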

This has given us a platform on which to explore some very sophisticated predictive analytics scenarios around the SaaS Metrics model. In particular, we are working through the scenarios on the PyMC Labs blog, which we think offer profoundly valuable insights into customer data.

AI Integration Layer

The third layer we are calling the ‘AI integration’ layer. It is impossible to summarise the Cambrian explosion of capabilities over the last 18 months. The focus has of course been on OpenAI, but all vendors are now bolting on AI capabilities like some steampunk fantasy, whether it works or not and whether it is appropriate or not. In the realm of ‘AI integration’ we are looking at companies such as LangChain, who are building out the ‘plumbing’ between the AI models that allows things like Retrieval Augmented Generation (RAG), and Cohere, who are focusing on building valuable business applications. We are particularly excited about Generative UI, both generally and in the data analytics space in particular.
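For readers new to RAG, the pattern is simple: retrieve the documents most relevant to a question, then pass them to the model as context alongside the question. The sketch below shows the retrieval and prompt-building half in plain Python. It does not use LangChain or any vendor API; the documents and function names are hypothetical, word-count cosine similarity stands in for the embedding search a real system would use, and the call to the language model is left out.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge base; a real system would index far more text.
DOCUMENTS = [
    "Monthly recurring revenue grew after the pricing change.",
    "Customer churn is highest in the first ninety days.",
    "The data platform ingests sales records every night.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count Counters."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    query = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, documents, k=1):
    """Assemble the retrieved context and the question into one prompt string."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    print(build_prompt("When is customer churn highest?", DOCUMENTS))
```

The resulting string is what would be sent to the model. The ‘plumbing’ libraries earn their keep by handling the indexing, chunking and model calls that this sketch skips.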

We are using AI at every stage in our business analysis and development process. We think even unstructured interaction with ChatGPT offers a step change in productivity across almost every activity. Please feel free to follow along with our ongoing experiment on GitHub and YouTube.
