# Music Analyser - Project Context & Documentation

This document serves as a comprehensive guide to the **Music Analyser** project. It outlines the vision, technical decisions, current architecture, and future roadmap. **Use this document to provide context to future AI agents or developers.**

## 1. Project Vision

The goal of this project is to build a personal analytics dashboard that:

1. **Regularly queries** the Spotify API (24/7) to collect a complete history of listening habits.
2. Stores this data locally (or in a private database) to ensure ownership and completeness.
3. Provides **rich analysis** and visualizations (similar to "Spotify Wrapped", but on-demand and more detailed).
4. Integrates **AI (Google Gemini)** to provide qualitative insights, summaries, and trend analysis (e.g., "You started the week with high-energy pop but shifted to lo-fi study beats by Friday").

## 2. Roadmap & Phases

### Phase 1: Foundation & Data Collection (Current Status: ✅ COMPLETED)

- **Goal:** Reliable data ingestion and storage.
- **Deliverables:**
  - FastAPI backend.
  - SQLite database (with SQLAlchemy).
  - Spotify OAuth logic (refresh-token flow).
  - Background worker for 24/7 polling.
  - Docker containerization + GitHub Actions (multi-arch build).

### Phase 2: Visualization (Next Step)

- **Goal:** View the raw data.
- **Deliverables:**
  - Frontend (React + Vite).
  - Basic data table / list view of listening history.
  - Basic filtering (by date, artist).

### Phase 3: Analysis & AI

- **Goal:** Deep insights.
- **Deliverables:**
  - Advanced charts/graphs.
  - AI integration (Gemini 2.5/3 Flash) to generate text summaries of listening trends.
  - Email reports (optional).

## 3. Technical Architecture

### Backend

- **Language:** Python 3.11+
- **Framework:** FastAPI (high performance, easy to use).
- **Dependencies:** `httpx` (async HTTP), `sqlalchemy` (ORM), `pydantic` (validation).

### Database

- **Current:** SQLite (`music.db`).
  - *Decision:* Chosen for simplicity in Phase 1.
- **Future path:** The code uses SQLAlchemy, so migrating to **PostgreSQL** (e.g., Supabase) only requires changing the connection string in `database.py`.

### Database Schema

1. **`Track` Table:**
   - Stores unique tracks.
   - Columns: `id` (Spotify ID), `name`, `artist`, `album`, `duration_ms`, `metadata_json` (stores the *entire* raw Spotify JSON response for future-proofing).
2. **`PlayHistory` Table:**
   - Stores individual listening events.
   - Columns: `id`, `track_id` (FK), `played_at` (timestamp), `context_uri`.

### Authentication Strategy

- **Challenge:** The background worker runs headless (there is no user to click "Login").
- **Solution:** We use the **Authorization Code Flow with Refresh Tokens**.
  1. The user runs the local helper script (`backend/scripts/get_refresh_token.py`) once.
  2. This generates a long-lived `SPOTIFY_REFRESH_TOKEN`.
  3. The backend uses this token to automatically request new short-lived access tokens whenever needed.

### Background Worker Logic

- **File:** `backend/run_worker.py` -> `backend/app/ingest.py`
- **Process:**
  1. The worker wakes up every 60 seconds.
  2. It calls the Spotify `recently-played` endpoint (limit 50).
  3. It iterates through the returned tracks.
  4. **Deduplication:** It checks `(track_id, played_at)` against the DB. If the pair exists, skip; if not, insert.
  5. **Metadata:** If the track is new to the system, its full metadata is saved immediately.

### AI Integration

- **Model:** Google Gemini (target: 2.5 Flash or 3 Flash).
- **Status:** A service class exists (`AIService`) but is not yet fully wired into the daily workflow.

### Deployment

- **Docker:** Multi-stage build (`python-slim`).
- **CI/CD:** GitHub Actions workflow (`docker-publish.yml`).
  - Builds for `linux/amd64` and `linux/arm64`.
  - Pushes to GitHub Container Registry (ghcr.io).

## 4. How to Run

### Prerequisites

- Spotify Client ID & Secret.
- Google Gemini API Key.
- Docker (optional).

### Local Development

1. **Setup Env:**
   ```bash
   cp backend/.env.example backend/.env
   # Fill in details
   ```
2.
   **Install:**
   ```bash
   cd backend
   pip install -r requirements.txt
   ```
3. **Run API:**
   ```bash
   uvicorn app.main:app --reload
   ```
4. **Run Worker:**
   ```bash
   python run_worker.py
   ```

### Docker

```bash
docker build -t music-analyser-backend ./backend
docker run --env-file backend/.env music-analyser-backend
```
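For orientation, the database schema described in section 3 could be sketched in SQLAlchemy roughly as follows. This is an illustrative sketch, not the project's actual `models.py`: the table names, column types, and the unique constraint on `(track_id, played_at)` are assumptions inferred from the schema and deduplication notes above.

```python
# Hypothetical sketch of the Track / PlayHistory schema (not the actual models.py).
from sqlalchemy import (Column, DateTime, ForeignKey, Integer, String, Text,
                        UniqueConstraint, create_engine)
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Track(Base):
    __tablename__ = "tracks"  # assumed name
    id = Column(String, primary_key=True)   # Spotify track ID
    name = Column(String, nullable=False)
    artist = Column(String, nullable=False)
    album = Column(String)
    duration_ms = Column(Integer)
    metadata_json = Column(Text)            # entire raw Spotify JSON, for future-proofing

class PlayHistory(Base):
    __tablename__ = "play_history"  # assumed name
    id = Column(Integer, primary_key=True, autoincrement=True)
    track_id = Column(String, ForeignKey("tracks.id"), nullable=False)
    played_at = Column(DateTime, nullable=False)
    context_uri = Column(String)
    # One row per listen; (track_id, played_at) is the natural dedup key.
    __table_args__ = (UniqueConstraint("track_id", "played_at"),)

# In-memory engine for this sketch; the project uses sqlite:///music.db,
# and swapping the URL for a Postgres one is the planned migration path.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
```

Enforcing the dedup key as a database-level `UniqueConstraint` (rather than only checking in application code) would also guard against races if the worker ever runs concurrently.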
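The refresh-token exchange from the authentication strategy might look like this with `httpx`. The token endpoint, Basic-auth header, and `grant_type=refresh_token` body follow Spotify's documented OAuth flow, and the environment variable `SPOTIFY_REFRESH_TOKEN` matches this document; the function itself is a sketch, not the project's actual code.

```python
# Sketch of step 3 of the auth strategy: trading the long-lived refresh token
# for a short-lived access token. Illustrative, not the project's real module.
import base64
import os

import httpx

SPOTIFY_TOKEN_URL = "https://accounts.spotify.com/api/token"

async def get_access_token() -> str:
    """Request a fresh short-lived access token using the stored refresh token."""
    # Spotify expects HTTP Basic auth with "client_id:client_secret" base64-encoded.
    auth = base64.b64encode(
        f"{os.environ['SPOTIFY_CLIENT_ID']}:{os.environ['SPOTIFY_CLIENT_SECRET']}".encode()
    ).decode()
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            SPOTIFY_TOKEN_URL,
            headers={"Authorization": f"Basic {auth}"},
            data={
                "grant_type": "refresh_token",
                "refresh_token": os.environ["SPOTIFY_REFRESH_TOKEN"],
            },
        )
    resp.raise_for_status()
    return resp.json()["access_token"]
```

In practice the worker would cache the returned token and only re-request it when the previous one expires (Spotify access tokens last about an hour).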
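The worker's deduplication step (section "Background Worker Logic") can be sketched as below. The function name and the in-memory `seen`/`store` stand-ins are hypothetical; the real logic lives in `backend/app/ingest.py` and checks `(track_id, played_at)` against the database instead.

```python
# Sketch of the dedup pass over one recently-played batch (illustrative only).

def ingest_once(items: list[dict], seen: set, store: list) -> int:
    """Record plays not already seen; return how many were newly inserted."""
    inserted = 0
    for item in items:
        # (track_id, played_at) uniquely identifies one listen.
        key = (item["track"]["id"], item["played_at"])
        if key in seen:
            continue  # already recorded this exact play
        seen.add(key)
        store.append(item)
        inserted += 1
    return inserted
```

The real worker wraps this in a loop that sleeps 60 seconds between iterations and fetches up to 50 items per call; because consecutive `recently-played` responses overlap heavily, most items in each batch are skipped by the dedup check.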