Jira Analyzer

[Type]: AI Pipeline

[Language]: Python

[License]: BSL 1.1

Overview

An AI-powered analysis pipeline that transforms raw Jira feature requests into prioritized, revenue-enriched intelligence. The system extracts issues from Jira, enriches them with Salesforce revenue data, uses local LLMs to rate vertical-specific relevance, ranks by revenue-weighted priority, and publishes ranked reports to Confluence.

Key Innovation: By running all AI inference through local Ollama models, the pipeline keeps sensitive customer data and feature request details on-premises during analysis; nothing is sent to an external LLM provider. The repository also includes a standalone deduplication utility that combines TF-IDF similarity, sentence-transformer embeddings, and LLM-based semantic comparison to identify duplicate feature requests independently of the main pipeline.

The pipeline produces priority-ranked feature requests weighted by customer revenue impact, enabling product teams to make data-driven roadmap decisions across multiple industry verticals.

Analysis Pipeline

Key Features

Jira Integration

Automated extraction of feature requests via JQL (Jira Query Language) with configurable filters and pagination
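
The pagination loop can be sketched as follows. This is a minimal illustration, not the project's own client: it assumes a search response that carries `issues` and `total` and accepts a start offset and page size, and `fetch_page` and `fake_fetch` are hypothetical names standing in for the real HTTP call.

```python
from typing import Callable, Dict, Iterator, List

# Hypothetical transport: takes (jql, start_at, max_results) and returns the
# decoded JSON body of a Jira search response.
FetchPage = Callable[[str, int, int], Dict]


def iter_issues(jql: str, fetch_page: FetchPage, page_size: int = 50) -> Iterator[Dict]:
    """Yield every issue matching `jql`, requesting one page at a time."""
    start_at = 0
    while True:
        page = fetch_page(jql, start_at, page_size)
        issues: List[Dict] = page.get("issues", [])
        yield from issues
        start_at += len(issues)
        # Stop on an empty page or once the reported total has been reached.
        if not issues or start_at >= page.get("total", 0):
            break


def fake_fetch(jql: str, start_at: int, max_results: int) -> Dict:
    """In-memory stand-in for the Jira API, used only for this example."""
    data = [{"key": f"FR-{i}"} for i in range(1, 8)]
    return {"total": len(data), "issues": data[start_at:start_at + max_results]}


keys = [issue["key"] for issue in iter_issues("type = 'Feature Request'", fake_fetch, page_size=3)]
```

Passing the transport in as a callable keeps the loop testable without network access.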

Revenue Enrichment

Salesforce account matching to attach revenue data for revenue-weighted prioritization

Vertical Rating

Local LLM-based scoring of feature relevance across industry verticals using Ollama
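
A single rating call might look like the sketch below. It uses Ollama's local `/api/generate` endpoint with streaming disabled; the 0 to 10 scale, the prompt wording, and the function names are assumptions made for illustration, not the project's actual prompts.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_prompt(summary: str, vertical: str, definition: str) -> str:
    """Ask for one relevance score as JSON (wording and scale are assumed)."""
    return (
        f"Vertical: {vertical}\nDefinition: {definition}\n"
        f"Feature request: {summary}\n"
        'Rate relevance from 0 to 10. Reply as JSON: {"score": <integer>}'
    )


def parse_score(raw: str) -> int:
    """Extract the score from the model's JSON reply and clamp it to 0..10."""
    score = int(json.loads(raw)["score"])
    return max(0, min(10, score))


def rate(summary: str, vertical: str, definition: str, model: str = "qwen3") -> int:
    """Send one non-streaming generate request to the local Ollama server."""
    payload = json.dumps({
        "model": model,
        "prompt": build_prompt(summary, vertical, definition),
        "stream": False,
        "format": "json",
    }).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return parse_score(json.loads(response.read())["response"])
```

Because the request goes to localhost, the feature request text stays on the machine running Ollama.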

Duplicate Detection

Standalone companion utility: TF-IDF similarity, sentence-transformer embeddings, and LLM verification for independent dedup analysis

Priority Ranking

Revenue-weighted scoring with cross-vertical aggregation for data-driven roadmap decisions
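
One way to picture revenue-weighted ranking is the sketch below. The formula (revenue scaled by mean relevance across verticals) and the 0 to 10 score range are assumptions chosen for illustration; the page does not state the project's actual weighting.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FeatureRequest:
    key: str
    revenue: float                   # revenue attached from the matched account
    vertical_scores: Dict[str, int]  # assumed 0-10 relevance per vertical


def priority(req: FeatureRequest) -> float:
    """Assumed formula: revenue scaled by mean relevance across verticals."""
    if not req.vertical_scores:
        return 0.0
    mean_relevance = sum(req.vertical_scores.values()) / len(req.vertical_scores)
    return req.revenue * mean_relevance / 10


def rank_requests(requests: List[FeatureRequest]) -> List[FeatureRequest]:
    """Highest priority first."""
    return sorted(requests, key=priority, reverse=True)


requests = [
    FeatureRequest("FR-1", 100_000, {"retail": 2, "finance": 4}),
    FeatureRequest("FR-2", 40_000, {"retail": 9, "finance": 9}),
    FeatureRequest("FR-3", 500_000, {"retail": 1, "finance": 1}),
]
ranked_keys = [r.key for r in rank_requests(requests)]
```

Under this formula a high-revenue request with low relevance can still outrank a highly relevant request from a smaller account, which is the trade-off any revenue weighting has to tune.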

Confluence Publishing

Automated report generation with ranked tables published directly to Confluence Cloud
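
The table-building half of this step can be sketched with the standard library alone. The column names and the `ranked_table` helper are hypothetical, and the REST call that uploads the result to a Confluence page is deliberately left out.

```python
from html import escape
from typing import Iterable, Sequence


def ranked_table(headers: Sequence[str], rows: Iterable[Sequence[object]]) -> str:
    """Render rows as an HTML table, escaping every cell value."""
    head = "".join(f"<th>{escape(str(h))}</th>" for h in headers)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(cell))}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table><tbody><tr>{head}</tr>{body}</tbody></table>"


html_table = ranked_table(
    ["Rank", "Issue", "Summary"],
    [(1, "FR-3", "SSO & audit logs"), (2, "FR-2", "Bulk export")],
)
```

Escaping each cell matters here because issue summaries are user-entered text and may contain characters such as `&` or `<`.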

Technical Implementation

The pipeline is structured as a series of composable stages, each responsible for one transformation of the feature request data. This design allows stages to be tested independently, re-run selectively, and extended with new data sources or analysis steps.
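
The composable-stage idea can be illustrated with a short sketch. The stage names, the record shape, and the `FAKE_REVENUE` lookup are placeholders invented for this example; the real stages would call Salesforce, Ollama, and so on.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]
Stage = Callable[[List[Record]], List[Record]]


def run_pipeline(records: List[Record], stages: Iterable[Stage]) -> List[Record]:
    """Feed the output of each stage into the next."""
    for stage in stages:
        records = stage(records)
    return records


# Placeholder data standing in for a Salesforce lookup.
FAKE_REVENUE = {"FR-1": 5_000, "FR-2": 20_000}


def enrich_stage(records: List[Record]) -> List[Record]:
    """Attach revenue to each record without mutating the input."""
    return [{**r, "revenue": FAKE_REVENUE.get(str(r["key"]), 0)} for r in records]


def rank_stage(records: List[Record]) -> List[Record]:
    """Order records by revenue, highest first."""
    return sorted(records, key=lambda r: r["revenue"], reverse=True)


result = run_pipeline([{"key": "FR-1"}, {"key": "FR-2"}], [enrich_stage, rank_stage])
```

Because every stage shares one signature, a selective re-run is just a call to `run_pipeline` with a shorter stage list.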

Local LLM Inference: All AI processing runs through Ollama (qwen3), keeping sensitive feature request and customer data entirely on-premises with zero external API calls for inference
3-Stage Duplicate Detection (standalone utility): TF-IDF cosine similarity for fast candidate screening, sentence-transformer embeddings for semantic matching, and LLM-based verification to eliminate false positives
Revenue Pipeline: Salesforce API integration matches Jira reporter organizations to Salesforce accounts, enriching each request with revenue data for business-impact weighting
Vertical Scoring: LLM evaluates each feature request against configurable industry vertical definitions, producing relevance scores that enable per-vertical priority views
Confluence Integration: Generates formatted HTML tables with priority rankings and publishes directly to Confluence Cloud pages via REST API
Excel Export: Full analysis results exported to structured Excel workbooks via openpyxl for offline review and stakeholder distribution
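
The first duplicate-detection stage listed above, TF-IDF cosine screening, can be sketched in pure Python. This is a standard-library stand-in written for illustration, not the utility's own code; the 0.5 threshold and the function names are arbitrary assumptions, and the embedding and LLM stages are not shown.

```python
import math
import re
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs: List[str]) -> List[Dict[str, float]]:
    """Term frequency times smoothed inverse document frequency."""
    tokenized = [tokenize(d) for d in docs]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def candidate_pairs(docs: List[str], threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Index pairs similar enough to pass on to the slower stages."""
    vectors = tfidf_vectors(docs)
    return [
        (i, j)
        for i, j in combinations(range(len(docs)), 2)
        if cosine(vectors[i], vectors[j]) >= threshold
    ]


summaries = [
    "Add SSO login via SAML",
    "Support SAML SSO login",
    "Export reports to CSV",
]
pairs = candidate_pairs(summaries)
```

Only the pairs that survive this cheap lexical screen would need the more expensive embedding and LLM comparisons.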

The system is designed for enterprise-scale analysis, processing hundreds of feature requests through the full pipeline while maintaining data privacy through exclusive use of local models.

AI Pipeline Engineering

This project demonstrates an AI pipeline architecture that combines local LLM inference for data privacy, multi-stage NLP for duplicate detection, and enterprise API integration across Jira, Salesforce, and Confluence. Because all inference runs through Ollama, sensitive customer data and feature request details are never sent to an external LLM provider, which makes the pipeline suitable for enterprise environments with strict data governance requirements.

Technology Stack

Python, Ollama (qwen3), sentence-transformers, rapidfuzz, Jira REST API, Salesforce API, Confluence Cloud API, pandas, openpyxl

Enterprise Value

Transforms feature request management from manual triage to automated, data-driven prioritization:

Revenue-Driven Prioritization: Feature requests ranked by actual customer revenue impact, replacing subjective prioritization with data-driven decisions
Cross-Vertical Intelligence: Identifies features with broad appeal across industry verticals, maximizing development ROI
Operational Efficiency: Automates hours of manual Jira triage, Salesforce lookups, and duplicate identification into a single pipeline run
Data Privacy: All AI inference runs locally via Ollama—no customer data or feature details are sent to external LLM providers
Stakeholder Reporting: Auto-published Confluence reports and Excel exports keep product teams aligned without manual report creation
