Roles and Responsibilities

Once you've defined your cloud-scale analytics strategy, you need to organize teams to deliver on it. This article describes some of the roles and responsibilities you should consider for cloud-scale analytics. You can map these roles and responsibilities to the various teams we've discussed in previous articles.

Important

This article highlights potential roles and responsibilities, but it isn't a complete list. Consider this article's guidance and then adapt it to what works within your organization. If you're a small organization, you might not staff every one of these roles, but that shouldn't prevent you from deploying a cloud-scale analytics platform. If you're a large organization, you might decide to streamline and consolidate roles.

Roles

A cloud-scale analytics deployment involves multiple roles. The following table lists each role with its other job titles, responsibilities, skills, and the teams it applies to.

Role | Other Job Title | Responsibilities | Skills | Applies to
Solution Architect | Platform Architect, Solution Architect | Design and oversight of cloud technologies to meet business/customer needs | Cloud technology subject matter expertise, architecture pattern development | Data application teams, platform group
Platform Ops | Cloud Engineer, Infrastructure Engineer, Systems Engineer | Assisting with cloud technology design, implementing and enabling cloud services and capabilities, and managing cloud resources | Cloud technology subject matter expertise, programming, DevOps | Data landing zone ops, platform group
Security Architect | | Security standards and policy design, security standards and policy oversight, security tool decisions, security assessments and audits, and security patterns and processes | Security technology subject matter expertise, regulation, compliance, and legal controls, decision making, threat protection, and vulnerability management | Platform group
Security Engineer | Security Ops | Security policy, standards, and tooling implementation, and assistance with security assessments and/or audits | Security technology subject matter expertise, regulation, compliance, and legal controls, threat detection, vulnerability management, and SOC processes | Platform group
DataOps Engineer | Data Platform Ops | Orchestrating the analytic pipeline, promoting features to production, and automating quality | Agile development, DevOps, statistical process control | Data application teams
Data Modeling Architect | Data Modeler | Data modeling, data mapping, and data patterns | Industry data standards, data tooling capabilities, lake database templates, and data governance | Data application teams
Data Solution Architect | Data Solution Architect | Data platform best practices application, data publishers/data product owner guidance, and data patterns | Industry data standards, data tooling capabilities, lake database templates, and data governance | Data application teams, platform group
Data Engineer | Data QA Engineer, ETL Engineer | Lake database templates/data lakes/warehouses/marts implementation, data movement, and data transformation | Databases, programming/scripting, simple storage, and data APIs | Data application teams
Data Owner | Source Data Owner, Technical Publisher | Access approvals, data quality, business term definition, usage rule definition, and specification of data in control file for onboarding data source | Domain subject matter expertise and business relationships | Data application teams
Data Product Owner | Data Scientist Manager, Data Analyst Manager | Vision of data product, data product usage, and metric definitions | Business subject matter expertise, business relationships, analytics concepts, user experience design, and product management | Data application teams
Data Analyst | Data Visualization, Designer, Business Data Analyst, BI Developer, BI Engineer, Reporting Analyst | Visualization, charts, graphs, dashboards, tables, reports, and exploratory data analysis | Programming/scripting/SQL, statistics, data cleaning, and data visualization | Consumers, data application teams
Data Scientist | Machine Learning Researcher, Machine Learning Engineer, Quantitative Analyst, AI Programmer | Algorithms, models, data product curation, exploratory data analysis, measuring and improvement of results, and communicating findings | Business subject matter expertise, advanced mathematics, machine learning, data mining tools, programming/scripting/SQL, statistics, and data visualization | Data application teams
Data Governance Manager | Data Governance Lead, Data Governance Sponsor | Data governance program oversight, data governance standards, policies and rules, and approval of tools to support governance capabilities | Governance regulations and control frameworks, business relationships, and business strategy alignment to objectives | Platform group, governance
Data Steward | Data Trustee | Data meaning, data quality, data compliance, fitness of data assets, knowledge of data products and their use, data team outreach, conceptual subject area ownership, subject area leadership with technical data owner, and data subject area stewardship | Domain subject matter expertise, data quality, governance regulations, and governance control frameworks | Data application teams, governance

Responsibilities

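A cloud-scale analytics deployment involves multiple areas of responsibility. The following tables provide an overview of each area. Following the usual RACI convention, R marks a role that is responsible for carrying out the work, and A marks the role that is accountable for it.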

Compute

Areas of Responsibility Data Owner Data Product Owner Data Analyst Data Scientist Data Governance Manager Data Steward Solution Architect Platform Ops Security Architect Security Engineer DataOps Engineer Data Solution Architect Data Modeling Architect Data Engineer
Manage the UI for requesting compute A R
Specify what tools are used to bring compute to the data platform infrastructure as code (IaC) template A R
Configure and monitor compute that accesses the data platform A R R R R R
Provide Producer support for compute that accesses the data platform A R R
Understand, monitor, and execute business rules & cleansing for data curation A R R R R

Data lifecycle

Areas of Responsibility Data Owner Data Product Owner Data Analyst Data Scientist Data Governance Manager Data Steward Solution Architect Platform Ops Security Architect Security Engineer DataOps Engineer Data Solution Architect Data Modeling Architect Data Engineer
Responsible for data model architecture for zone(s) in the data platform R R A R R
Drive architectural approval for overall continuity of the data platform R A R
Own source data loaded to data platform A R
Manage source data loaded to data platform A R R
Manage the ingestion service in the data platform A
Manage the handshake service in the data platform A
Manage archive data in raw zone (retention and archiving policies) R A R
Prepare data for ingestion to drop or landing to other zones A
Change management - changes to existing data sources A R
Perform end-to-end testing and report results R R A R R R
Perform post-testing bug fixes R R A R R R

Data products and operations

Areas of Responsibility Data Owner Data Product Owner Data Analyst Data Scientist Data Governance Manager Data Steward Solution Architect Platform Ops Security Architect Security Engineer DataOps Engineer Data Solution Architect Data Modeling Architect Data Engineer
Manage backup and recovery A R
Manage database performance and availability A R R R
Manage ITSM processes for the data platform (Incident, Problem, Change, Request etc.) R R R R R R R R R R R R
Oversee service level agreements for data platform services A R
Support and manage Azure subscriptions A R
Understand relationships between organizations, subscriptions, licenses, user accounts, and tenants, to set a subscription model A R
Review monthly Azure bill and understand usage and charges R R A
Manage costs and create showback and chargeback reports R A
Cost management - build a costing model that defines reporting and charges R A
Cost management - use cost reports to track trends and usage growth R A
Perform code review to ensure robust integrity R R A R
Build & deliver approved data solutions architecture R R R R

Drop Zone

Areas of Responsibility Data Owner Data Product Owner Data Analyst Data Scientist Data Governance Manager Data Steward Solution Architect Platform Ops Security Architect Security Engineer DataOps Engineer Data Solution Architect Data Modeling Architect Data Engineer
Manage UI for Drop Zone provisioning A R
Request a drop zone A R R
Approve drop zone provisioning A R
Manage drop zone provisioning execution A R

General

Areas of Responsibility Data Owner Data Product Owner Data Analyst Data Scientist Data Governance Manager Data Steward Solution Architect Platform Ops Security Architect Security Engineer DataOps Engineer Data Solution Architect Data Modeling Architect Data Engineer
Manage stakeholder comms once architecture is approved A
Gauge reusability for future projects R R R R R A R R
Business team training
Infrastructure team support R R R R R R A R R R
Incident management - Data platform related issues R A
Define, communicate, and drive execution of data strategy and data governance strategy A

Governance

Areas of Responsibility Data Owner Data Product Owner Data Analyst Data Scientist Data Governance Manager Data Steward Solution Architect Platform Ops Security Architect Security Engineer DataOps Engineer Data Solution Architect Data Modeling Architect Data Engineer
Owns partnership between data product owner/data owner to ensure best practices R R A R
Main liaison with EA, cloud, and security on documented standards related to the data platform R R A R
Partner with data product owner/data owner for data integrity efforts R R A R
Care for all data in the data platform R R A R
Define metadata management standards related to the data platform (policies and rules about metadata creations and maintenance) R R A R R
Ensure data quality and data management standards in the data platform R R R R A R R R
Ensure data security and compliance in the data platform R R A R R R R
Monitor and review data quality in the data platform R R R R A R R R
Measure and report data quality in the data platform to management R R A R

Lake database templates

Areas of Responsibility Data Owner Data Product Owner Data Analyst Data Scientist Data Governance Manager Data Steward Solution Architect Platform Ops Security Architect Security Engineer DataOps Engineer Data Solution Architect Data Modeling Architect Data Engineer
Manage lake database templates mapping for ingesting data into the data platform R A R R
Own lake database templates in the data platform R R A R R

Onboarding

Areas of Responsibility Data Owner Data Product Owner Data Analyst Data Scientist Data Governance Manager Data Steward Solution Architect Platform Ops Security Architect Security Engineer DataOps Engineer Data Solution Architect Data Modeling Architect Data Engineer
Own new data use case process requests R R A
Identify the data owners for data use cases A R R
Collect data use case requirements from data owners and data product owners R R A
Approve new data use cases R R A
Determine data use case classification and data domain R R A
Define other data use case metadata for Data Catalog (asset description, contacts, folder structure, etc.) R R A R R R R
Identify data access requirements for data use cases R R A R R R R
Facilitate funding approval for data use cases R R A R R
Facilitate privacy assessment for sensitive data onboarding R R A R R R
Obtain architecture pattern approval for data use cases R R A R R
Collaborate with Data Engineers for data use cases R R A R
Ensure existing data model standards are met and aligned A
Ensure that governance and risk guidelines are followed A
Create "Source to Target" Mapping Doc and own updates R R A R
Approve "Source to Target" Mapping Document R R A R

Purview

Areas of Responsibility Data Owner Data Product Owner Data Analyst Data Scientist Data Governance Manager Data Steward Solution Architect Platform Ops Security Architect Security Engineer DataOps Engineer Data Solution Architect Data Modeling Architect Data Engineer
Read access to the Data Glossary R R A
Manage Data Glossary (create, update, delete) R R A R
Approve Data Glossary changes R R A
Read access to the Data Catalog R R A
Manage Data Catalog (create, update, delete) R R A R
Approve Data Catalog changes R R A
Register a data source in Purview R A R R R R
Manage Purview ADLS Scanning A R R
Manage Purview permissions A R R
Manage Purview policies (dataset access) A R
Approve Purview policies (dataset access) R R A
Monitor Purview Insight Reports R R A R
Manage Purview Collections A R R

Security

Areas of Responsibility Data Owner Data Product Owner Data Analyst Data Scientist Data Governance Manager Data Steward Solution Architect Platform Ops Security Architect Security Engineer DataOps Engineer Data Solution Architect Data Modeling Architect Data Engineer
Manage encryption & encryption services for the data platform R A R R R
Approve data access requests for the Structured zone R R R R
Approve data access requests for the Gold Zone R R A
Approve data access requests for the Raw Zone R R R R
Monitor and audit data platform access R R A R
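
If you adapt the responsibility tables above for your own organization, it can help to keep them in a machine-readable form and check them for gaps. The following Python sketch is one possible way to do that, not part of the guidance itself. It assumes R means responsible and A means accountable, and it assumes you want at most one accountable role per area. The role names come from the table headers above. The assignments in `EXAMPLE_MATRIX` are hypothetical placeholders for illustration and do not reproduce this article's tables, and `check_matrix` is a name introduced only for this example.

```python
from collections import Counter

# Role names as they appear in the responsibility table headers.
ROLES = {
    "Data Owner", "Data Product Owner", "Data Analyst", "Data Scientist",
    "Data Governance Manager", "Data Steward", "Solution Architect",
    "Platform Ops", "Security Architect", "Security Engineer",
    "DataOps Engineer", "Data Solution Architect",
    "Data Modeling Architect", "Data Engineer",
}

# HYPOTHETICAL assignments, for illustration only. Replace them with the
# assignments your organization agrees on.
EXAMPLE_MATRIX = {
    "Manage backup and recovery": {
        "Platform Ops": "A",
        "DataOps Engineer": "R",
    },
    # An area that nobody has been assigned to yet.
    "Business team training": {},
    # An area with two accountable roles, which the check flags.
    "Approve new data use cases": {
        "Data Governance Manager": "A",
        "Data Steward": "A",
    },
}


def check_matrix(matrix, roles):
    """Return a list of human-readable problems found in the matrix."""
    problems = []
    for area, assignments in matrix.items():
        # Flag role names that are not in the agreed role list.
        unknown = sorted(set(assignments) - roles)
        if unknown:
            problems.append(f"{area}: unknown role(s) {', '.join(unknown)}")

        # Flag markers other than R and A.
        counts = Counter(assignments.values())
        invalid = sorted(set(counts) - {"A", "R"})
        if invalid:
            problems.append(f"{area}: unexpected marker(s) {', '.join(invalid)}")

        # Flag areas without exactly one accountable role.
        if counts["A"] == 0:
            problems.append(f"{area}: no accountable role")
        elif counts["A"] > 1:
            problems.append(f"{area}: {counts['A']} accountable roles")
    return problems


if __name__ == "__main__":
    for problem in check_matrix(EXAMPLE_MATRIX, ROLES):
        print(problem)
```

Some areas might intentionally have no single accountable role, for example when a responsibility is shared across every team. In that case, treat the "no accountable role" message as a prompt for review rather than an error.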

Next step

Learn about group alignment within data management landing zones and data landing zones: