Community Comment: Part 24 - Self-service data product use cases exist, with caveats
- Self-service data product use cases exist, with caveats
- Users need to know what they're doing
- Stakeholders need to be aware of benefits & drawbacks
- Some commercial products (e.g. Power BI) introduce unique risks
These are the comments I provided in reaction to a community discussion thread.
Owner & Data Architect at Data Analytics Firm:
I love Power BI, BUT
I want to break it up.
I believe getting value from data involves 2 separate parts:
– getting the data and building the data model
– analysing the data
I believe the first part (the data engineering part) should be done at the central level by data experts who ensure data is consistent, well documented, and overall of high quality.
The second part (data visualization and analysis) should be done by business users.
Why?
-data viz tools are getting easier by the day
-business people understand their data better
-business people should not have to wait for someone to build their reports
Power BI seamlessly integrates the 2 parts and that is the problem.
When business people build a Power BI report:
They build data pipelines by mixing and matching different Excel files, without ensuring data quality (they don’t know how), rebuilding something that already exists, etc.
Overall, they build low-quality foundations.
When technical people build the Power BI reports:
Technical people do not know what indicators are important for the business. So they pick a good-looking chart for each column in the table, and call it a day (I might exaggerate but it’s often like this).
Let’s stop asking business people to build data pipelines, and stop asking technical people to deliver insights.
Owner & Data Architect at Multiple Firms:
I tend to agree with you [Owner at Data Analytics Firm] when we are talking about creating 'SYSTEMATIC' Data / Information products (=high quality, repeatable, needs to run stable over time, agreed upon etc).
AFAIK this CAN be technically done by using SSAS as a semantic layer, right? A business-focused data engineer, let's call them 'Analytics / Semantics Engineer', would then build that conformed 'semantic layer' instead of your 'business person' persona.
On the other hand, I think it should not be 'forbidden' to create OPPORTUNISTIC data products too (one-off / ad hoc insight, no requirement to run in a stable fashion over time). If we did so we would be back in the Business Objects / Cognos era….
Working opportunistically in Power BI works very well!
The issue is that, as a consumer, you are often not aware whether the data product is built in a SYSTEMATIC or OPPORTUNISTIC way.
Making this distinction absolutely clear could help!
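One way to make this distinction visible to consumers is to attach it to each data product as explicit metadata. The following is a minimal sketch in Python, not something proposed in the thread: the names `BuildMode`, `DataProduct`, `banner`, and `allowed_in_production` are hypothetical, and a real deployment would more likely keep this label in a data catalog or in the report description.

```python
from dataclasses import dataclass
from enum import Enum


class BuildMode(Enum):
    """Hypothetical label for how a data product was built."""
    SYSTEMATIC = "systematic"        # governed, repeatable, expected to run stably
    OPPORTUNISTIC = "opportunistic"  # one-off / ad hoc, no stability requirement


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical catalog entry for a report or dataset."""
    name: str
    build_mode: BuildMode
    owner: str

    def banner(self) -> str:
        # Text a consumer would see before using the product.
        if self.build_mode is BuildMode.SYSTEMATIC:
            return f"[SYSTEMATIC] {self.name}: governed, maintained by {self.owner}"
        return (
            f"[OPPORTUNISTIC] {self.name}: ad hoc, no stability guarantee "
            f"(contact {self.owner})"
        )


def allowed_in_production(product: DataProduct) -> bool:
    # Assumed policy for this sketch: only systematic products are promoted as-is.
    return product.build_mode is BuildMode.SYSTEMATIC


if __name__ == "__main__":
    report = DataProduct("Quarterly sales deep-dive", BuildMode.OPPORTUNISTIC, "sales-analytics")
    print(report.banner())
    print("Promote to production:", allowed_in_production(report))
```

The point of the sketch is only that the label travels with the product, so a consumer does not have to guess how it was built.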
Owner & Data Architect at Multiple Firms:
Interview with [Chief Data Officer at Ministry of Justice & Security (Netherlands)] on his DQM:
https://prudenza.typepad.com/files/english—the-data-quadrant-model-interview-ronald-damhof.pdf
Owner & Data Architect at Data Analytics Firm:
Fully agree on your distinction.
Business should not have to ask data engineering for help when running an ad-hoc analysis on 3 Excel files.
So yes, my point is for systematic analysis.
The problem is when companies build an opportunistic data product, and they think that the product is ready to use for systematic analysis.
Then they get upset when it doesn’t work, data is bad quality, etc.
It’s like building a POC and then deciding to use the POC as if it were the full product.
Yes, it’s already possible to build a semantic layer in Power BI (using a Power BI dataset based on SSAS technology).
Because Power BI can do everything, companies (from what I have seen) get confused about what IT should do, what citizen developers should do, etc.
Owner & Data Architect at Data Analytics Firm:
[Owner & Data Architect at Multiple Firms]
Yes, I have never been a fan of the ‘citizen’ version of jobs.
The misbelief is that code is what separates ‘experts’ from other people.
So by creating a low-code / no-code platform, regular people can do a lot of what experts did in the past.
But coding is a small part of what separates experts. The important part is how they think: creating automatic tests, refactoring code to make it readable, using different environments and not pushing directly to production, versioning their code, and, very importantly, experience.
So getting rid of code gives the false belief that people can build solid applications.
Gfesser:
Thank you for drawing a distinction between the systematic and opportunistic. The opportunistic route can definitely be a challenging one to take, and I've seen issues surface soon after data is made available this way. Are there use cases to provide self-service? Of course there are, but users need to know what they're doing, and stakeholders need to be made aware of benefits and drawbacks with this approach. And with Power BI specifically, there are benefits and drawbacks that need to be kept in mind unless it is absolutely certain that reporting is limited to ad hoc or PoC purposes. The compound challenge of course being that it's often hard to prevent this type of reporting from being repurposed as production ready. I explained in an AWS data platform use case, for example, that while it was convenient to offer users access to Power BI in Azure to hit the data in the solution we built via Power BI Gateway, this has risks including lack of visibility for a broader audience, inability to perform ad hoc querying of models from outside Power BI, and inconsistency across locally implemented data models for the same tables.
Building a Data Platform on AWS from Scratch: Part 2
https://www.linkedin.com/pulse/building-data-platform-aws-from-scratch-part-2-erik-gfesser
Owner & Data Architect at Data Analytics Firm:
Erik Gfesser
I believe you identified the real issue, which is "The compound challenge of course being that it's often hard to prevent this type of reporting from being repurposed as production ready.".
Building 'opportunistic' reports is not a problem. It's one of the main innovations brought by modern BI tools.
Instead of waiting for IT to build views or tables in SQL, now anyone can analyze their data mostly using a drag-and-drop interface.
The problem is when opportunistic reports are moved to production.
Then 'exceptions' that were not considered happen in production (a join creates duplicated rows, data is missing, some data is not refreshed, etc). That is when management starts no longer trusting the reports.
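The three 'exceptions' named above (duplicated rows after a join, missing data, stale refreshes) are the kind of thing a few automated checks can catch before a report reaches production. The following is a minimal sketch in plain Python, not taken from the thread: the function names, the `order_id` / `amount` columns, and the one-day freshness threshold are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def duplicated_keys(rows, key):
    """Return key values that appear more than once (a sign of join fan-out)."""
    seen, dupes = set(), set()
    for row in rows:
        value = row[key]
        if value in seen:
            dupes.add(value)
        seen.add(value)
    return dupes


def rows_missing(rows, column):
    """Return rows where the given column is absent or None."""
    return [row for row in rows if row.get(column) is None]


def is_stale(last_refreshed, now, max_age):
    """True if the last refresh is older than the allowed age."""
    return now - last_refreshed > max_age


if __name__ == "__main__":
    # Hypothetical result of joining orders to customers; order 1 was
    # duplicated because its customer matched two address rows.
    joined = [
        {"order_id": 1, "amount": 10.0},
        {"order_id": 1, "amount": 10.0},
        {"order_id": 2, "amount": None},
    ]
    print("Duplicated order ids:", duplicated_keys(joined, "order_id"))
    print("Rows missing amount:", rows_missing(joined, "amount"))
    print(
        "Stale:",
        is_stale(datetime(2024, 1, 1), datetime(2024, 1, 3), timedelta(days=1)),
    )
```

Checks like these are cheap to skip in an opportunistic report and cheap to require before the same report is treated as a systematic one.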
New post in reaction to comments of above post:
https://www.linkedin.com/posts/lucazanna_linkedin-ideas-activity-6981350110693572608-covS
Owner & Data Architect at Data Analytics Firm:
Best part of LinkedIn:
Sharing an idea and having people complete it / improve it / clarify it based on their experience.
Thank you for the discussion [Owner & Data Architect at Multiple Firms], Erik Gfesser, [Cloud Solution Architect Data & AI at Microsoft].