mirror of
https://github.com/bnair123/MusicAnalyser.git
synced 2026-02-25 11:46:07 +00:00
Add skip tracking, compressed heatmap, listening log, docs, tests, and OpenAI support
Major changes:
- Add skip tracking: poll currently-playing every 15s, detect skips (<30s listened)
- Add listening-log and sessions API endpoints
- Fix ReccoBeats client to extract spotify_id from href response
- Compress heatmap from 24 hours to 6 x 4-hour blocks
- Add OpenAI support in narrative service (use max_completion_tokens for new models)
- Add ListeningLog component with timeline and list views
- Update all frontend components to use real data (album art, play counts)
- Add docker-compose external network (dockernet) support
- Add comprehensive documentation (API, DATA_MODEL, ARCHITECTURE, FRONTEND)
- Add unit tests for ingest and API endpoints
125
docs/API.md
Normal file
@@ -0,0 +1,125 @@
# API Documentation

The MusicAnalyser Backend is built with FastAPI. It provides endpoints for data ingestion, listening history retrieval, and AI-powered analysis.

## Base URL
- Default local development: `http://localhost:8000`
- Docker environment: proxied via Nginx at `http://localhost:8991/api`

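As a minimal sketch of how a client might target either base URL, the helper below joins a base URL with an endpoint path and query parameters. The function name `build_url` and the two constants are illustrative and not part of the project; the `/history` and `/sessions` paths and their parameters are taken from the endpoint list in this document.

```python
from urllib.parse import urlencode

# Base URLs as listed above; which one applies depends on how the stack is run.
LOCAL_BASE_URL = "http://localhost:8000"
DOCKER_BASE_URL = "http://localhost:8991/api"


def build_url(base_url: str, path: str, **params) -> str:
    """Join a base URL and an endpoint path, appending any query parameters."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    # Drop parameters left as None so the server-side defaults apply.
    query = {key: value for key, value in params.items() if value is not None}
    return f"{url}?{urlencode(query)}" if query else url


print(build_url(LOCAL_BASE_URL, "/history", limit=10))
print(build_url(DOCKER_BASE_URL, "/sessions", days=7))
```
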
---

## Endpoints

### 1. Root / Health Check
- **URL**: `/`
- **Method**: `GET`
- **Response**:
  ```json
  {
    "status": "ok",
    "message": "Music Analyser API is running"
  }
  ```

### 2. Get Recent History
Returns a flat list of recently played tracks.
- **URL**: `/history`
- **Method**: `GET`
- **Query Parameters**:
  - `limit` (int, default=50): Number of items to return.
- **Response**: List of PlayHistory objects with nested Track data.

### 3. Get Tracks
Returns a list of unique tracks in the database.
- **URL**: `/tracks`
- **Method**: `GET`
- **Query Parameters**:
  - `limit` (int, default=50): Number of tracks to return.

### 4. Trigger Spotify Ingestion
Manually triggers a background task to poll Spotify for recently played tracks.
- **URL**: `/trigger-ingest`
- **Method**: `POST`
- **Response**:
  ```json
  {
    "status": "Ingestion started in background"
  }
  ```

### 5. Trigger Analysis Pipeline
Runs the full stats calculation and AI narrative generation for a specific timeframe.
- **URL**: `/trigger-analysis`
- **Method**: `POST`
- **Query Parameters**:
  - `days` (int, default=30): Number of past days to analyze.
  - `model_name` (str): LLM model to use.
- **Response**:
  ```json
  {
    "status": "success",
    "snapshot_id": 1,
    "period": { "start": "...", "end": "..." },
    "metrics": { ... },
    "narrative": { ... }
  }
  ```

### 6. Get Analysis Snapshots
Retrieves previously saved analysis reports.
- **URL**: `/snapshots`
- **Method**: `GET`
- **Query Parameters**:
  - `limit` (int, default=10): Number of snapshots to return.

### 7. Detailed Listening Log
Returns a refined listening log with skip detection and listening duration calculations.
- **URL**: `/listening-log`
- **Method**: `GET`
- **Query Parameters**:
  - `days` (int, 1-365, default=7): Timeframe.
  - `limit` (int, 1-1000, default=200): Max plays to return.
- **Response**:
  ```json
  {
    "plays": [
      {
        "id": 123,
        "track_name": "Song Name",
        "artist": "Artist Name",
        "played_at": "ISO-TIMESTAMP",
        "listened_ms": 180000,
        "skipped": false,
        "image": "..."
      }
    ],
    "period": { "start": "...", "end": "..." }
  }
  ```

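The sketch below shows one way a client could interpret a `/listening-log` response. The 30-second skip threshold is taken from the commit message for this change ("detect skips (<30s listened)"); the helper names `is_skip` and `summarize_plays` are hypothetical, and the backend's actual skip logic may differ.

```python
SKIP_THRESHOLD_MS = 30_000  # assumption: "<30s listened" from the commit message


def is_skip(listened_ms: int, threshold_ms: int = SKIP_THRESHOLD_MS) -> bool:
    """Classify a play as a skip when it was heard for less than the threshold."""
    return listened_ms < threshold_ms


def summarize_plays(payload: dict) -> dict:
    """Aggregate a /listening-log payload using the server-provided `skipped` flag."""
    plays = payload.get("plays", [])
    skipped = sum(1 for play in plays if play.get("skipped"))
    total_ms = sum(play.get("listened_ms") or 0 for play in plays)
    return {
        "plays": len(plays),
        "skipped": skipped,
        "skip_rate": skipped / len(plays) if plays else 0.0,
        "listened_minutes": total_ms / 60_000,
    }


# Hypothetical payload shaped like the response documented above.
sample = {
    "plays": [
        {"id": 123, "track_name": "Song Name", "listened_ms": 180000, "skipped": False},
        {"id": 124, "track_name": "Other Song", "listened_ms": 12000, "skipped": True},
    ]
}
print(summarize_plays(sample))
```
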
### 8. Session Statistics
Groups plays into listening sessions (Marathon, Standard, Micro).
- **URL**: `/sessions`
- **Method**: `GET`
- **Query Parameters**:
  - `days` (int, 1-365, default=7): Timeframe.
- **Response**:
  ```json
  {
    "sessions": [
      {
        "start_time": "...",
        "end_time": "...",
        "duration_minutes": 45,
        "track_count": 12,
        "type": "Standard"
      }
    ],
    "summary": {
      "count": 10,
      "avg_minutes": 35,
      "micro_rate": 0.1,
      "marathon_rate": 0.05
    }
  }
  ```

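To make the `summary` fields concrete, here is a small sketch that derives them from a list of sessions. It assumes `micro_rate` and `marathon_rate` are the fractions of sessions typed `"Micro"` and `"Marathon"`, and that `avg_minutes` is a plain mean; the document does not spell these definitions out, and `summarize_sessions` is a hypothetical name.

```python
def summarize_sessions(sessions: list[dict]) -> dict:
    """Compute summary fields from session dicts with `duration_minutes` and `type`."""
    count = len(sessions)
    if count == 0:
        return {"count": 0, "avg_minutes": 0.0, "micro_rate": 0.0, "marathon_rate": 0.0}
    total_minutes = sum(s["duration_minutes"] for s in sessions)
    micro = sum(1 for s in sessions if s["type"] == "Micro")
    marathon = sum(1 for s in sessions if s["type"] == "Marathon")
    return {
        "count": count,
        "avg_minutes": total_minutes / count,  # assumed: arithmetic mean
        "micro_rate": micro / count,           # assumed: share of Micro sessions
        "marathon_rate": marathon / count,     # assumed: share of Marathon sessions
    }


# Hypothetical sessions shaped like the response documented above.
example_sessions = [
    {"duration_minutes": 45, "track_count": 12, "type": "Standard"},
    {"duration_minutes": 5, "track_count": 2, "type": "Micro"},
    {"duration_minutes": 150, "track_count": 40, "type": "Marathon"},
    {"duration_minutes": 40, "track_count": 11, "type": "Standard"},
]
print(summarize_sessions(example_sessions))
```
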
43
docs/ARCHITECTURE.md
Normal file
@@ -0,0 +1,43 @@
# Architecture Overview

MusicAnalyser is a full-stack personal analytics platform designed to collect, store, and analyze music listening habits using the Spotify API and Google Gemini AI.

## System Components

### 1. Backend (FastAPI)
- **API Layer**: Handles requests from the frontend, manages the database, and triggers analysis.
- **Database**: SQLite, used for local storage of listening history, track metadata, and AI snapshots.
- **ORM**: SQLAlchemy manages the data models and relationships.
- **Services**:
  - `SpotifyClient`: Handles OAuth2 flow and API requests.
  - `StatsService`: Computes complex metrics (heatmaps, sessions, top tracks, hipster scores).
  - `NarrativeService`: Interfaces with Google Gemini to generate text-based insights.
  - `IngestService`: Manages the logic of fetching and deduplicating Spotify "recently played" data.

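The deduplication step handled by `IngestService` can be sketched as follows. This is an illustration only: it assumes a play is uniquely identified by the pair (`track_id`, `played_at`), which the document does not state, and `new_plays` is a hypothetical helper rather than the project's function.

```python
def new_plays(fetched: list[dict], seen: set[tuple[str, str]]) -> list[dict]:
    """Return only plays not seen before, recording their keys in `seen`."""
    fresh = []
    for item in fetched:
        key = (item["track_id"], item["played_at"])  # assumed uniqueness key
        if key in seen:
            continue  # already stored by an earlier poll
        seen.add(key)
        fresh.append(item)
    return fresh


# Two overlapping polls of "recently played": the second repeats one play.
seen_keys: set[tuple[str, str]] = set()
first_poll = [{"track_id": "t1", "played_at": "2026-02-25T10:00:00Z"}]
second_poll = [
    {"track_id": "t1", "played_at": "2026-02-25T10:00:00Z"},
    {"track_id": "t2", "played_at": "2026-02-25T10:04:00Z"},
]
print(len(new_plays(first_poll, seen_keys)))
print([p["track_id"] for p in new_plays(second_poll, seen_keys)])
```
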
### 2. Background Worker
- A standalone Python script (`run_worker.py`) that polls the Spotify API every 60 seconds.
- Ensures a continuous record of listening history even when the dashboard is not open.

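A polling loop of this kind might look like the sketch below. It is not the contents of `run_worker.py`: the function `run_worker`, its parameters, and the error handling are assumptions, and only the 60-second interval comes from the description above. The `sleep` and `max_cycles` parameters exist so the loop can be exercised without waiting.

```python
import time
from typing import Callable, Optional

POLL_INTERVAL_SECONDS = 60  # interval stated in the worker description


def run_worker(
    poll_once: Callable[[], None],
    interval: float = POLL_INTERVAL_SECONDS,
    max_cycles: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `poll_once` repeatedly, pausing `interval` seconds between calls."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        try:
            poll_once()
        except Exception as exc:
            # Keep the loop alive so one failed poll does not stop ingestion.
            print(f"poll failed: {exc}")
        cycles += 1
        sleep(interval)
    return cycles
```

Passing `max_cycles=None` (the default) makes the loop run until the process is stopped, which matches a long-lived worker.
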
### 3. Frontend (React)
- **Framework**: Vite + React.
- **Styling**: Tailwind CSS for a modern, dark-themed dashboard.
- **Visualizations**: Recharts for radar charts and heatmaps; Framer Motion for animations.
- **State**: Managed via standard React hooks (`useState`, `useEffect`) and local storage for caching.

### 4. External Integrations
- **Spotify API**: Primary data source for tracks, artists, and listening history.
- **ReccoBeats API**: Used for fetching audio features (BPM, Energy, Mood) for tracks.
- **Genius API**: Used for fetching song lyrics to provide deep content analysis.
- **Google Gemini**: Large Language Model used to "roast" the user's taste and generate personas.

## Data Flow

1. **Ingestion**: `Background Worker` → `Spotify API` → `Database (PlayHistory)`.
2. **Enrichment**: `Ingest Logic` → `ReccoBeats/Genius/Spotify` → `Database (Track/Artist)`.
3. **Analysis**: `Frontend` → `Backend API` → `StatsService` → `NarrativeService (Gemini)` → `Database (Snapshot)`.
4. **Visualization**: `Frontend` ← `Backend API` ← `Database (Snapshot/Log)`.

## Deployment
- **Containerization**: Both Backend and Frontend are containerized using Docker.
- **Docker Compose**: Orchestrates the backend (including worker) and frontend (Nginx proxy) services.
- **CI/CD**: GitHub Actions builds multi-arch images (amd64/arm64) and pushes to GHCR.
89
docs/DATA_MODEL.md
Normal file
@@ -0,0 +1,89 @@
# Data Model Documentation

This document describes the database schema for the MusicAnalyser project. The project uses SQLite with SQLAlchemy as the ORM.

## Entity Relationship Diagram Overview

- **Artist** (Many-to-Many) **Track**
- **Track** (One-to-Many) **PlayHistory**
- **AnalysisSnapshot** (Independent)

---

## Tables

### `artists`
Stores unique artists retrieved from Spotify.

| Field | Type | Description |
|-------|------|-------------|
| `id` | String | Spotify ID (Primary Key) |
| `name` | String | Artist name |
| `genres` | JSON | List of genre strings |
| `image_url` | String | URL to artist profile image |

### `tracks`
Stores unique tracks retrieved from Spotify, enriched with audio features and lyrics.

| Field | Type | Description |
|-------|------|-------------|
| `id` | String | Spotify ID (Primary Key) |
| `name` | String | Track name |
| `artist` | String | Display string for artists (e.g., "Artist A, Artist B") |
| `album` | String | Album name |
| `image_url` | String | URL to album art |
| `duration_ms` | Integer | Track duration in milliseconds |
| `popularity` | Integer | Spotify popularity score (0-100) |
| `raw_data` | JSON | Full raw response from Spotify API for future-proofing |
| `danceability` | Float | Audio feature: Danceability (0.0 to 1.0) |
| `energy` | Float | Audio feature: Energy (0.0 to 1.0) |
| `key` | Integer | Audio feature: Key |
| `loudness` | Float | Audio feature: Loudness in dB |
| `mode` | Integer | Audio feature: Mode (0 for Minor, 1 for Major) |
| `speechiness` | Float | Audio feature: Speechiness (0.0 to 1.0) |
| `acousticness` | Float | Audio feature: Acousticness (0.0 to 1.0) |
| `instrumentalness` | Float | Audio feature: Instrumentalness (0.0 to 1.0) |
| `liveness` | Float | Audio feature: Liveness (0.0 to 1.0) |
| `valence` | Float | Audio feature: Valence (0.0 to 1.0) |
| `tempo` | Float | Audio feature: Tempo in BPM |
| `time_signature` | Integer | Audio feature: Time signature |
| `lyrics` | Text | Full lyrics retrieved from Genius |
| `lyrics_summary` | String | AI-generated summary of lyrics |
| `genre_tags` | String | Combined genre tags for the track |
| `created_at` | DateTime | Timestamp of record creation |
| `updated_at` | DateTime | Timestamp of last update |

### `play_history`
Stores individual listening instances.

| Field | Type | Description |
|-------|------|-------------|
| `id` | Integer | Primary Key (Auto-increment) |
| `track_id` | String | Foreign Key to `tracks.id` |
| `played_at` | DateTime | Timestamp when the track was played |
| `context_uri` | String | Spotify context URI (e.g., playlist or album URI) |
| `listened_ms` | Integer | Computed duration the track was actually heard |
| `skipped` | Boolean | Whether the track was likely skipped |
| `source` | String | Ingestion source (e.g., "spotify_recently_played") |

### `analysis_snapshots`
Stores periodic analysis results generated by the AI service.

| Field | Type | Description |
|-------|------|-------------|
| `id` | Integer | Primary Key |
| `date` | DateTime | When the analysis was performed |
| `period_start` | DateTime | Start of the analyzed period |
| `period_end` | DateTime | End of the analyzed period |
| `period_label` | String | Label for the period (e.g., "last_30_days") |
| `metrics_payload` | JSON | Computed statistics used as input for the AI |
| `narrative_report` | JSON | AI-generated narrative and persona |
| `model_used` | String | LLM model identifier (e.g., "gemini-1.5-flash") |

### `track_artists` (Association Table)
Facilitates the many-to-many relationship between tracks and artists.

| Field | Type | Description |
|-------|------|-------------|
| `track_id` | String | Foreign Key to `tracks.id` |
| `artist_id` | String | Foreign Key to `artists.id` |

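To illustrate how the association table resolves the many-to-many relationship, the sketch below builds a reduced version of the schema with Python's built-in `sqlite3` module. It is not the project's SQLAlchemy model code: only a few columns are included, the composite primary key on `track_artists` is an assumption, and `artists_for_track` is a hypothetical helper.

```python
import sqlite3

# Reduced schema: only the columns needed to show the relationship.
SCHEMA = """
CREATE TABLE artists (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE tracks (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE track_artists (
    track_id TEXT NOT NULL REFERENCES tracks(id),
    artist_id TEXT NOT NULL REFERENCES artists(id),
    PRIMARY KEY (track_id, artist_id)  -- assumed: one row per track/artist pair
);
"""


def artists_for_track(conn: sqlite3.Connection, track_id: str) -> list[str]:
    """Return the names of all artists linked to a track, sorted by name."""
    rows = conn.execute(
        "SELECT a.name FROM artists a "
        "JOIN track_artists ta ON ta.artist_id = a.id "
        "WHERE ta.track_id = ? ORDER BY a.name",
        (track_id,),
    ).fetchall()
    return [name for (name,) in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany("INSERT INTO artists VALUES (?, ?)", [("a1", "Artist A"), ("a2", "Artist B")])
conn.execute("INSERT INTO tracks VALUES (?, ?)", ("t1", "Song Name"))
conn.executemany("INSERT INTO track_artists VALUES (?, ?)", [("t1", "a1"), ("t1", "a2")])
print(artists_for_track(conn, "t1"))
```
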
61
docs/FRONTEND.md
Normal file
@@ -0,0 +1,61 @@
# Frontend Documentation

The frontend is a React application built with Vite and Tailwind CSS. It uses Ant Design for some UI components and Recharts for data visualization.

## Main Components

### `Dashboard.jsx`
The primary layout component that manages data fetching and state.
- **Features**:
  - Handles API calls to `/snapshots` and `/trigger-analysis`.
  - Implements local storage caching to reduce API load.
  - Displays a global loading state during analysis.
  - Contains the main header with a refresh trigger.

### `NarrativeSection.jsx`
Displays the AI-generated qualitative analysis.
- **Props**:
  - `narrative`: Object containing `persona`, `vibe_check_short`, and `roast`.
  - `vibe`: Object containing audio features used to generate dynamic tags.
- **Purpose**: Gives the user an "identity" based on their music taste (e.g., "THE MELANCHOLIC ARCHITECT").

### `StatsGrid.jsx`
A grid of high-level metric cards.
- **Props**:
  - `metrics`: The `metrics_payload` from a snapshot.
- **Displays**:
  - **Minutes Listened**: Total listening time converted to days.
  - **Obsession**: The #1 most played track with album art background.
  - **Unique Artists**: Count of different artists encountered.
  - **Hipster Score**: A percentage indicating how obscure the user's taste is.

### `VibeRadar.jsx`
Visualizes the "Sonic DNA" of the user.
- **Props**:
  - `vibe`: Audio feature averages (acousticness, danceability, energy, etc.).
- **Visuals**:
  - **Radar Chart**: Shows the balance of audio features.
  - **Mood Clusters**: Floating bubbles representing "Party", "Focus", and "Chill" percentages.
  - **Whiplash Meter**: Shows volatility in tempo, energy, and valence between consecutive tracks.

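One plausible reading of the Whiplash Meter's "volatility between consecutive tracks" is the mean absolute change of a feature from one track to the next. The sketch below illustrates that idea only; the document does not give the formula the backend uses, and `whiplash` is a hypothetical name.

```python
def whiplash(values: list[float]) -> float:
    """Mean absolute jump between consecutive values (assumed volatility measure)."""
    if len(values) < 2:
        return 0.0  # a single track has no transition to measure
    jumps = [abs(later - earlier) for earlier, later in zip(values, values[1:])]
    return sum(jumps) / len(jumps)


# Hypothetical tempos (BPM) for three tracks played back to back.
tempos = [120, 80, 140]
print(whiplash(tempos))  # (40 + 60) / 2 = 50.0
```

The same function could be applied separately to energy and valence sequences, since each is a per-track number in play order.
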
### `TopRotation.jsx`
A horizontal scrolling list of the most played tracks.
- **Props**:
  - `volume`: Object containing `top_tracks` array.
- **Purpose**: Quick view of recent favorites.

### `HeatMap.jsx`
Visualizes when the user listens to music.
- **Props**:
  - `timeHabits`: Compressed heatmap data (7x6 grid for days/time blocks).
  - `sessions`: List of recent listening sessions.
- **Visuals**:
  - **Grid**: Days of the week vs. Time blocks (12am, 4am, etc.).
  - **Session Timeline**: Vertical list of recent listening bouts with session type (Marathon vs. Micro).

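The 7x6 grid follows from folding 24 hourly buckets into six 4-hour blocks per day. The sketch below shows that compression under the assumption that play counts within a block are summed; the backend may aggregate differently, and `compress_heatmap` is a hypothetical name.

```python
BLOCK_HOURS = 4  # 24 hours / 4 = 6 blocks per day


def compress_heatmap(hourly: list[list[int]]) -> list[list[int]]:
    """Fold each day's 24 hourly counts into 6 four-hour block totals."""
    compressed = []
    for day in hourly:
        if len(day) != 24:
            raise ValueError("each day must have 24 hourly counts")
        # Block 0 covers 12am-4am, block 1 covers 4am-8am, and so on.
        compressed.append(
            [sum(day[start:start + BLOCK_HOURS]) for start in range(0, 24, BLOCK_HOURS)]
        )
    return compressed


# Hypothetical week: three plays at 5am and two at 11pm on the first day.
week = [[0] * 24 for _ in range(7)]
week[0][5] = 3
week[0][23] = 2
print(compress_heatmap(week)[0])
```
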
### `ListeningLog.jsx`
A detailed view of individual plays.
- **Features**:
  - **Timeline View**: Visualizes listening sessions across the day for the last 7 days.
  - **List View**: A table of individual plays with skip status detection.
  - **Timeframe Filter**: Toggle between 24h, 7d, 14d, and 30d views.