Microsoft Purview Unified Catalog CLI (UC)¶
Complete implementation of Microsoft Purview Unified Catalog functionality with feature parity to UnifiedCatalogPy.
Overview¶
The Unified Catalog (uc) command group provides comprehensive management of Microsoft Purview's modern data governance features:
- Governance Domains - Organizational contexts for data assets
- Glossary Terms - Business terminology and definitions with full metadata
- Data Products - Curated data asset collections with full CRUD lifecycle management (NEW: update & delete)
- Objectives & Key Results (OKRs) - Data governance goal tracking and measurement
- Critical Data Elements (CDEs) - Important data element definitions with data types
- Health Management - Automated governance health monitoring and recommendations (NEW)
- Workflow Management - Approval workflows and business process automation (NEW)
- Custom Attributes / Business Metadata - User-defined metadata attributes. If the `datagovernance/catalog/attributes` endpoint is unavailable (HTTP 405), use Atlas business metadata via `pvw types putTypeDefs` with `businessMetadataDefs`.
- Metadata Cleanup - Resolve expired attribute names to their parent definition and safely clean up definitions.
- Access Requests - Data access workflow management (coming soon)
Quick Start¶
Get started with Unified Catalog commands:
# 1. List governance domains
pvw uc domain list
# 2. Search glossary terms
pvw uc term search --query "customer"
# 3. List data products
pvw uc dataproduct list
# 4. View OKRs
pvw uc objective list
# 5. Browse CDEs
pvw uc cde list
Rich Console Output¶
All commands feature colorized table output with:
- Status Indicators: ✅ Active, 🚧 Draft, ❌ Deprecated
- Color Coding: Green (success), Yellow (warnings), Red (errors)
- Smart Formatting: Auto-truncated descriptions, aligned columns
- Progress Feedback: Real-time operation status
Pro Tips¶
# Get help for any command
pvw uc --help
pvw uc domain --help
# Use JSON output for scripting
pvw uc domain list --output json
# Filter results with search
pvw uc term search --query "finance"
Authentication¶
The UC client uses the same authentication as other Purview CLI commands:
- Azure CLI: Run `az login` first
- Service Principal: Set `AZURE_CLIENT_ID`, `AZURE_TENANT_ID`, `AZURE_CLIENT_SECRET`
- Account Name: Set the `PURVIEW_NAME` environment variable or pass it via config
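When UC commands run from a script with a service principal, it can help to confirm up front that the variables listed above are set. The following is a minimal pre-flight sketch, not part of the CLI: the function name `check_sp_env` is hypothetical, and it only tests that each variable is non-empty, not that the credentials are valid.

```bash
# Hypothetical helper (not part of pvw): report which documented variables are unset.
# Prints one "missing: NAME" line per absent variable and returns 1 if any are absent.
check_sp_env() {
    missing=0
    for var in AZURE_CLIENT_ID AZURE_TENANT_ID AZURE_CLIENT_SECRET PURVIEW_NAME; do
        # Indirect lookup of the variable whose name is stored in $var (POSIX-compatible).
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            printf 'missing: %s\n' "$var"
            missing=1
        fi
    done
    return "$missing"
}
```

A script could call `check_sp_env || exit 1` before its first `pvw uc` command. With Azure CLI authentication the three `AZURE_*` variables are not needed, so this check only makes sense on the service principal path.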
Complete Command Reference¶
Governance Domains (pvw uc domain)¶
Manage organizational contexts for data governance:
# List all domains (with rich table output)
pvw uc domain list
# Create a new domain
pvw uc domain create --name "Finance" --description "Financial data domain" \
  --type "BusinessUnit" --owner-id "user@company.com"
# Get domain details
pvw uc domain show --domain-id "abc-123"
# Update domain properties
pvw uc domain update --domain-id "abc-123" --name "Finance Analytics" \
  --status "Published"
# Delete domain (with confirmation)
pvw uc domain delete --domain-id "abc-123" --confirm
Glossary Terms (pvw uc term)¶
Manage business terminology with comprehensive metadata:
# Search terms across all domains
pvw uc term search --query "customer"
# List terms in a specific domain
pvw uc term list --domain-id "abc-123"
# Create a basic term
pvw uc term create --name "Customer ID" --description "Unique identifier" \
  --domain-id "abc-123"
# Create term with full metadata
pvw uc term create --name "GDPR" --description "Data Protection Regulation" \
  --domain-id "abc-123" --acronym "GDPR" --status "Draft" \
  --resource-name "Official Site" --resource-url "https://gdpr.eu"
# Update term properties
pvw uc term update --term-id "term-456" --status "Published" \
  --acronym "CUST_ID"
# Delete term
pvw uc term delete --term-id "term-456" --confirm
# Sync UC terms to a classic glossary (create missing terms)
pvw uc term sync-classic --domain-id "abc-123" --glossary-guid "gloss-guid"
# Sync and update existing classic terms to match UC
pvw uc term sync-classic --domain-id "abc-123" --glossary-guid "gloss-guid" --update-existing
# Full two-way reconciliation: update existing + delete terms removed from UC
pvw uc term sync-classic --domain-id "abc-123" --glossary-guid "gloss-guid" --update-existing --delete-removed
# Preview changes without applying (dry run)
pvw uc term sync-classic --domain-id "abc-123" --glossary-guid "gloss-guid" --update-existing --delete-removed --dry-run
# Auto-create the classic glossary if it doesn't exist yet
pvw uc term sync-classic --domain-id "abc-123" --create-glossary
sync-classic option reference¶
| Option | Description |
|---|---|
| `--domain-id` | Governance domain ID to sync terms from |
| `--glossary-guid` | Target classic glossary GUID |
| `--create-glossary` | Create the classic glossary if it does not exist |
| `--update-existing` | Update classic terms that already exist in the glossary |
| `--delete-removed` | Delete classic terms that no longer exist in the UC domain |
| `--dry-run` | Preview all changes without applying them |

Note: `--delete-removed` is opt-in to prevent accidental data loss. Always use `--dry-run` first when running against a production glossary.
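The dry-run-first advice can be wrapped in a small script so the destructive run never happens without a preview. This is a sketch under stated assumptions: the wrapper name `sync_classic_safely` is hypothetical, it uses only the flags documented above, and it assumes `pvw` exits non-zero when the dry run fails.

```bash
# Hypothetical wrapper (not part of pvw): preview a sync-classic run, then apply it
# only after explicit confirmation.
# Usage: sync_classic_safely <domain-id> <glossary-guid>
sync_classic_safely() {
    domain_id=$1
    glossary_guid=$2

    # Step 1: dry run with the same flags the real run will use.
    pvw uc term sync-classic --domain-id "$domain_id" --glossary-guid "$glossary_guid" \
        --update-existing --delete-removed --dry-run || return 1

    # Step 2: ask before applying; anything other than "yes" aborts.
    printf 'Apply these changes? Type "yes" to continue: '
    read -r answer
    if [ "$answer" != "yes" ]; then
        echo "Aborted; nothing was changed."
        return 2
    fi

    # Step 3: the same command without --dry-run.
    pvw uc term sync-classic --domain-id "$domain_id" --glossary-guid "$glossary_guid" \
        --update-existing --delete-removed
}
```

Keeping the flag set identical between the preview and the real run is the point of the wrapper: the preview then describes exactly what the second command will do.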
Data Products (pvw uc dataproduct)¶
Manage curated data asset collections with lifecycle tracking and full CRUD operations:
# List all data products (with full IDs displayed)
pvw uc dataproduct list
# List products in specific domain
pvw uc dataproduct list --domain-id "abc-123"
# List with filtering
pvw uc dataproduct list --status Published
# Show specific data product details
pvw uc dataproduct show --product-id "560f1496-f0d3-4c8e-b343-8636bd4f9d4a"
# Create basic data product
pvw uc dataproduct create --name "Customer 360" \
  --description "Complete customer analytics" \
  --domain-id "abc-123" --type "Analytical"
# Create with full metadata
pvw uc dataproduct create --name "Sales Dashboard" --domain-id "abc-123" \
  --type "Operational" --update-frequency "Daily" \
  --business-use "Track sales KPIs" \
  --owner-id "sales@company.com" --endorsed
# Update data product (smart partial updates - only specify fields to change)
pvw uc dataproduct update --product-id "560f1496-f0d3-4c8e-b343-8636bd4f9d4a" \
--status Published
# Update multiple fields at once
pvw uc dataproduct update --product-id "560f1496-f0d3-4c8e-b343-8636bd4f9d4a" \
--description "Updated comprehensive customer analytics" \
--endorsed \
--update-frequency Monthly
# Update status and business metadata
pvw uc dataproduct update --product-id "prod-789" \
--status Published \
--business-use "Updated business justification"
# Delete product (with confirmation prompt)
pvw uc dataproduct delete --product-id "prod-789"
# Delete without confirmation
pvw uc dataproduct delete --product-id "prod-789" --yes
Key Features:
- Smart Updates: Fetches current state first, then applies only specified changes
- Partial Updates: Update individual fields without affecting others
- Full ID Display: All list commands show complete UUIDs (no truncation)
- Safe Deletion: Confirmation prompt by default, `--yes` to skip
- Rich Formatting: Tables with status colors and proper alignment
Custom Metadata Cleanup (pvw uc metadata)¶
Manage business metadata assigned to assets and safely delete obsolete definitions.
# List business metadata definitions and attributes
pvw uc metadata list
pvw uc metadata list --output json
# Remove business metadata group from a specific asset
pvw uc metadata delete --asset-id "entity-guid" --group "Glossaire"
# Delete a business metadata definition directly by definition name
pvw uc metadata delete-definition --name "Glossaire" --dry-run
pvw uc metadata delete-definition --name "Glossaire"
# Cleanup flow (safe):
# 1) Validate only (no delete)
pvw uc metadata cleanup --name "Glossaire" --check-only --verbose
# 2) Execute deletion when safe
pvw uc metadata cleanup --name "Glossaire" --verbose
# You can also pass an attribute name; CLI resolves it to parent definition name
pvw uc metadata cleanup --name "SecteursActivite" --check-only --verbose
Cleanup behavior:
- --check-only: verifies resolution and definition readability only.
- --dry-run: shows intended delete action without execution.
- --verbose: prints endpoint path selection and raw error payloads.
- Deletion is blocked when the definition is still referenced by assets. Remove assignments first, then rerun cleanup.
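The check-then-delete flow above can be repeated over several definition names in one pass. The sketch below is illustrative only: `cleanup_definitions` is a hypothetical helper, and it assumes `pvw uc metadata cleanup` exits non-zero when a check fails or a deletion is blocked, which the source does not state.

```bash
# Hypothetical helper (not part of pvw): run the documented cleanup flow per name.
# Assumption: pvw returns a non-zero exit status on a failed check or blocked delete.
cleanup_definitions() {
    failed=0
    for name in "$@"; do
        # Validate first; skip the deletion if the check does not pass.
        if ! pvw uc metadata cleanup --name "$name" --check-only; then
            echo "skipped: $name (check failed)"
            failed=1
            continue
        fi
        # The CLI still refuses to delete while assets reference the definition.
        if pvw uc metadata cleanup --name "$name"; then
            echo "deleted: $name"
        else
            echo "blocked: $name (remove asset assignments, then rerun)"
            failed=1
        fi
    done
    return "$failed"
}
```

For example, `cleanup_definitions "Glossaire" "SecteursActivite"` would report one line per name and return non-zero if any of them could not be removed.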
Objectives & Key Results (pvw uc objective)¶
Track data governance goals and measure progress:
# List all objectives
pvw uc objective list
# List objectives for domain
pvw uc objective list --domain-id "abc-123"
# Create objective
pvw uc objective create --definition "Achieve 95% data quality" \
  --domain-id "abc-123" \
  --target-date "2025-12-31T23:59:59.000Z" \
  --status "Active"
# Create key result
pvw uc objective create-key-result --objective-id "obj-123" \
  --definition "Reduce errors by 50%" \
  --progress 25 --goal 50 --max 100 \
  --domain-id "abc-123" --status "OnTrack"
# Update objective progress
pvw uc objective update --objective-id "obj-123" --progress 75
# Delete objective
pvw uc objective delete --objective-id "obj-123" --confirm
Critical Data Elements (pvw uc cde)¶
Define and manage important data elements with type information:
# List all CDEs
pvw uc cde list
# List CDEs in domain
pvw uc cde list --domain-id "abc-123"
# Create CDE with data type
pvw uc cde create --name "Social Security Number" \
  --description "US SSN for identity verification" \
  --domain-id "abc-123" --data-type "String" \
  --status "Published"
# Create with validation rules
pvw uc cde create --name "Email Address" --domain-id "abc-123" \
  --data-type "String" --format "email" \
  --required --sensitive
# Update CDE properties
pvw uc cde update --cde-id "cde-456" --status "Deprecated" --description "Legacy field - use NewEmail instead"
# Delete CDE
pvw uc cde delete --cde-id "cde-456" --confirm
Custom Attributes / Business Metadata¶
- The `pvw uc attribute create/list` commands call the `datagovernance/catalog/attributes` endpoint. Many tenants and regions return HTTP 405 because the endpoint is not enabled.
- When you hit 405 or get empty results, define the attributes as Atlas **businessMetadataDefs** instead and upsert them with `pvw types putTypeDefs`.

Sample payload (simplified):
```json
{
  "businessMetadataDefs": [
    {
      "category": "BUSINESS_METADATA",
      "name": "Glossaire",
      "description": "Glossaire attributes",
      "typeVersion": "1.0",
      "attributeDefs": [
        {
          "name": "SecteursActivite",
          "typeName": "array<string>",
          "isOptional": true,
          "cardinality": "SINGLE"
        },
        {
          "name": "Secteur",
          "typeName": "string",
          "isOptional": true,
          "cardinality": "SINGLE"
        }
      ]
    }
  ]
}
```
Once defined, the attributes can be set on terms via customAttributes / managedAttributes (already supported by `pvw uc term import-csv` and `pvw uc term update`).
Health Monitoring (pvw uc health) NEW¶
Monitor governance health and get automated recommendations to improve your data governance posture:
```bash
# List all health findings and recommendations
pvw uc health query
# Filter by severity
pvw uc health query --severity High
pvw uc health query --severity Medium
pvw uc health query --severity Low
# Filter by status
pvw uc health query --status NotStarted
pvw uc health query --status InProgress
pvw uc health query --status Resolved
# Filter by finding type
pvw uc health query --finding-type Discoverability
pvw uc health query --finding-type Quality
# Get detailed information about a specific health action
pvw uc health show --action-id "5ea3fc78-6a77-4098-8779-ed81de6f87c9"
# Update health action status and track progress
pvw uc health update \
--action-id "5ea3fc78-6a77-4098-8779-ed81de6f87c9" \
--status InProgress \
--reason "Working on assigning glossary terms to data products"
# Assign health action to team member
pvw uc health update \
--action-id "5ea3fc78-6a77-4098-8779-ed81de6f87c9" \
--assigned-to "user@company.com"
# Mark health action as resolved
pvw uc health update \
--action-id "5ea3fc78-6a77-4098-8779-ed81de6f87c9" \
--status Resolved \
--reason "All data products now have published glossary terms assigned"
# Delete a health action
pvw uc health delete --action-id "5ea3fc78-6a77-4098-8779-ed81de6f87c9"
# Get health summary statistics (if available)
pvw uc health summary
# Output health findings in JSON format for automation
pvw uc health query --json
```
Health Finding Types:
- Missing glossary terms (High severity): Data products without published terms
- Missing OKRs (Medium): Data products without defined objectives
- Missing data quality scores (Medium): Products/assets without quality metrics
- Classification gaps (Medium): Data assets missing proper classifications
- Description quality issues (Medium): Short or missing descriptions
- Domain completeness (Medium): Business domains without critical data elements

Key Features:
- Automated Monitoring: Continuous governance health checks
- Prioritized Findings: Severity-based recommendations (High/Medium/Low)
- Actionable Insights: Clear recommendations for each finding
- Progress Tracking: Update status and track resolution
- Rich Formatting: Color-coded severity (Red=High, Yellow=Medium, Green=Low)
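For reporting or automation, the severity filter and JSON output can be combined to save one file per severity level. This is only a sketch: `export_health_findings` and the file naming are hypothetical, the flags are the ones shown in the examples above, and it assumes `--json` writes nothing but JSON to standard output.

```bash
# Hypothetical helper (not part of pvw): write one JSON file per severity level.
# Usage: export_health_findings [output-directory]
export_health_findings() {
    out_dir=${1:-health-export}
    mkdir -p "$out_dir" || return 1
    for severity in High Medium Low; do
        # --severity and --json are the flags used in the documented examples.
        pvw uc health query --severity "$severity" --json > "$out_dir/health-$severity.json" \
            || { echo "query failed for severity $severity" >&2; return 1; }
    done
}
```

Running `export_health_findings reports` would leave `health-High.json`, `health-Medium.json`, and `health-Low.json` in `reports/`, ready for a dashboard or a diff against the previous run.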
Workflow Management (pvw workflow) NEW¶
Manage approval workflows and business process automation in Purview:
# List all workflows
pvw workflow list
# Get workflow details
pvw workflow get --workflow-id "workflow-123"
# Create a new workflow (requires JSON definition file)
pvw workflow create --workflow-id "approval-flow-1" \
--payload-file workflow-definition.json
# Execute a workflow
pvw workflow execute --workflow-id "workflow-123"
# Execute with parameters
pvw workflow execute --workflow-id "workflow-123" \
--payload-file execution-params.json
# List workflow executions/runs
pvw workflow executions --workflow-id "workflow-123"
# Get specific execution details
pvw workflow execution-details --workflow-id "workflow-123" \
--execution-id "exec-456"
# Update workflow configuration
pvw workflow update --workflow-id "workflow-123" \
--payload-file updated-workflow.json
# Delete a workflow
pvw workflow delete --workflow-id "workflow-123"
# Output workflows in JSON format for scripting
pvw workflow list --json
Workflow Use Cases:
- Data Access Requests: Automated approval flows for data access
- Term Certification: Glossary term review and approval processes
- Data Product Publishing: Multi-stage approval for publishing data products
- Classification Review: Automated classification validation workflows
- Quality Gate Enforcement: Data quality checks before promotion

Key Features:
- Full Lifecycle Management: Create, execute, monitor, and delete workflows
- Execution Tracking: Monitor workflow runs and get detailed status
- Flexible Definition: JSON-based workflow configuration
- Rich Formatting: Table display with full workflow IDs visible
Beautiful Console Output¶
Experience professional-grade CLI formatting with:
- Rich Tables: Colorized columns with proper alignment
- Status Icons: ✅ Active, 🚧 Draft, ❌ Deprecated, ⚠️ Warning
- Color Coding: Green (success), Yellow (warnings), Red (errors)
- Smart Formatting: Auto-truncated text, responsive columns
- Progress Feedback: Real-time operation status and completion
Sample Output¶
pvw uc domain list
Governance Domains
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Domain ID ┃ Name          ┃ Type         ┃ Status    ┃ Owners   ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━┩
│ fin-001   │ ✅ Finance    │ BusinessUnit │ Published │ CFO Team │
│ mkt-002   │ 🚧 Marketing  │ Department   │ Draft     │ CMO Team │
│ ops-003   │ ✅ Operations │ Operational  │ Active    │ COO Team │
└───────────┴───────────────┴──────────────┴───────────┴──────────┘
Found 3 domains across 2 business units
Use 'pvw uc domain show --domain-id <ID>' for detailed view
Migration & Compatibility¶
Legacy Support¶
The original data_product command remains available for backward compatibility:
# These commands are equivalent:
pvw uc dataproduct list --domain-id "abc-123"
pv data_product dataproduct list --domain-id "abc-123" # Legacy syntax
UnifiedCatalogPy Compatibility¶
This implementation provides complete feature parity with the popular UnifiedCatalogPy library:
- All API Methods: Full coverage of UC REST endpoints
- Same Data Models: Compatible request/response structures
- Rich Output: Enhanced with console formatting
- Error Handling: Comprehensive validation and user feedback
- Authentication: Azure CLI, Service Principal, and managed identity support
API Reference¶
Purview REST Endpoints¶
| Feature | Endpoint | Methods |
|---|---|---|
| Governance Domains | `/datagovernance/governanceDomains` | GET, POST, PUT, DELETE |
| Glossary Terms | `/datagovernance/terms` | GET, POST, PUT, DELETE |
| Data Products | `/datagovernance/dataProducts` | GET, POST, PUT, DELETE |
| Objectives | `/datagovernance/objectives` | GET, POST, PUT, DELETE |
| Key Results | `/datagovernance/objectives/{id}/keyResults` | GET, POST, PUT, DELETE |
| Critical Data Elements | `/datagovernance/criticalDataElements` | GET, POST, PUT, DELETE |
| Health Management | `/datagovernance/health` | GET, POST (Coming Soon) |
Authentication Methods¶
- Azure CLI (Recommended)
    az login
    pvw uc domain list
- Service Principal (Windows `set` syntax shown; use `export` in bash or zsh)
    set AZURE_CLIENT_ID=your-client-id
    set AZURE_TENANT_ID=your-tenant-id
    set AZURE_CLIENT_SECRET=your-secret
    pvw uc domain list
- Managed Identity (Azure VMs/Functions)
    # Automatically detected in Azure environments
    pvw uc domain list
Error Handling & Troubleshooting¶
Common Issues & Solutions¶
| Error | Cause | Solution |
|---|---|---|
| `Authentication failed` | No valid credentials | Run `az login` or set service principal env vars |
| `Permission denied` | Missing UC access | Contact admin for Purview data governance permissions |
| `Domain not found` | Invalid domain ID | Use `pvw uc domain list` to get valid IDs |
| `Rate limit exceeded` | Too many API calls | Built-in retry logic handles this automatically |
| `Network timeout` | Connection issues | Check firewall and proxy settings |
Debug Mode¶
Enable detailed logging for troubleshooting:
pv --debug uc domain list
pv --verbose uc term search --query "customer"
Integration & Workflows¶
Working with Other CLI Commands¶
Unified Catalog integrates seamlessly with the broader Purview CLI:
# Export governance structure
pvw uc domain list --output json > domains.json
pvw uc term list --domain-id "abc-123" --output json > terms.json
# Search and link assets to data products
pv search query --keywords "customer" --output json > assets.json
# Use asset GUIDs to link to data products via API
# Bulk domain creation from CSV
while IFS=, read -r name desc type; do
  pvw uc domain create --name "$name" --description "$desc" --type "$type"
done < domains.csv
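A plain comma split breaks on header rows, blank lines, and descriptions that contain commas, so it is worth previewing the generated commands before creating anything. The sketch below prints the commands instead of running them; `preview_domain_commands` is a hypothetical helper, and it assumes a three-column `name,description,type` file without quoted fields.

```bash
# Hypothetical helper (not part of pvw): read name,description,type rows on stdin
# and print one `pvw uc domain create` command per valid row without executing it.
preview_domain_commands() {
    line_no=0
    # The `|| [ -n "$name" ]` keeps a final row that lacks a trailing newline.
    while IFS=, read -r name desc type extra || [ -n "$name" ]; do
        line_no=$((line_no + 1))
        # Skip blank lines and a header row that starts with "name".
        [ -z "$name" ] && continue
        [ "$line_no" -eq 1 ] && [ "$name" = "name" ] && continue
        # Require exactly three fields; a fourth usually means an unquoted comma.
        if [ -z "$type" ] || [ -n "$extra" ]; then
            echo "line $line_no: expected 3 fields, skipping" >&2
            continue
        fi
        printf 'pvw uc domain create --name "%s" --description "%s" --type "%s"\n' \
            "$name" "$desc" "$type"
    done
}
```

Usage would be `preview_domain_commands < domains.csv`; once the printed commands look right, the loop above can be run for real. Files with quoted fields or embedded commas need a proper CSV parser rather than `IFS=,`.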
Common Workflows¶
- Setting Up Governance
    # 1. Create governance domains
    pvw uc domain create --name "Finance" --type "BusinessUnit"
    # 2. Define glossary terms
    pvw uc term create --name "Revenue" --domain-id "fin-001"
    # 3. Establish objectives
    pvw uc objective create --definition "95% data quality" --domain-id "fin-001"
- Data Product Lifecycle
    # Draft → Review → Published → Deprecated
    pvw uc dataproduct create --name "Customer360" --status "Draft"
    pvw uc dataproduct update --product-id "dp-123" --status "Published"
    pvw uc dataproduct update --product-id "dp-123" --status "Deprecated"
- Progress Tracking
    # Monitor OKR progress
    pvw uc objective list --status "Active"
    pvw uc objective update --objective-id "obj-123" --progress 75
Technical Implementation¶
Architecture Overview¶
The UC implementation follows a clean, modular design:
- Client Layer: `purviewcli/client/_unified_catalog.py` - API interactions
- CLI Layer: `purviewcli/cli/unified_catalog.py` - User interface
- Models: Comprehensive data structures for all UC entities
- Output: Rich console formatting with status indicators and color coding
Key Technologies¶
- Rich Library: Console output with tables and progress bars
- Click Framework: Robust CLI with command groups and validation
- Azure Identity: Authentication with multiple methods
- Multiple Formats: Table, JSON, YAML output options
- Error Recovery: Comprehensive error handling with user guidance
UnifiedCatalogPy Parity¶
This implementation achieves 100% feature parity with the popular UnifiedCatalogPy library:
| Feature | UnifiedCatalogPy | PurviewCLI UC | Status |
|---|---|---|---|
| Governance Domains | ✅ | ✅ | Complete |
| Glossary Terms | ✅ | ✅ | Complete |
| Data Products | ✅ | ✅ | Complete |
| Objectives & KRs | ✅ | ✅ | Complete |
| Critical Data Elements | ✅ | ✅ | Complete |
| Rich Console Output | ❌ | ✅ | Enhanced |
| CLI Integration | ❌ | ✅ | Unique |
Contributing¶
Help improve the Unified Catalog functionality:
- Add Features: Extend `_unified_catalog.py` with new API methods
- Enhance CLI: Update `unified_catalog.py` with new commands
- Improve Output: Add formatting and visualization options
- Write Tests: Ensure reliability with comprehensive test coverage
- Update Docs: Keep examples and references current
Additional Resources¶
- Microsoft Purview Documentation
- UnifiedCatalogPy GitHub
- Purview CLI Main Documentation
- API Reference Guide
The Unified Catalog CLI brings Microsoft Purview's data governance features to your command line, with formatted console output and comprehensive feature coverage.