# Data Model Documentation

This document describes the database schema for the MusicAnalyser project. The project uses SQLite with SQLAlchemy as the ORM.

## Entity Relationship Diagram Overview

- **Artist** (Many-to-Many) **Track**
- **Track** (One-to-Many) **PlayHistory**
- **AnalysisSnapshot** (Independent)

---

## Tables

### `artists`

Stores unique artists retrieved from Spotify.

| Field | Type | Description |
|-------|------|-------------|
| `id` | String | Spotify ID (Primary Key) |
| `name` | String | Artist name |
| `genres` | JSON | List of genre strings |
| `image_url` | String | URL to artist profile image |

### `tracks`

Stores unique tracks retrieved from Spotify, enriched with audio features and lyrics.

| Field | Type | Description |
|-------|------|-------------|
| `id` | String | Spotify ID (Primary Key) |
| `name` | String | Track name |
| `artist` | String | Display string for artists (e.g., "Artist A, Artist B") |
| `album` | String | Album name |
| `image_url` | String | URL to album art |
| `duration_ms` | Integer | Track duration in milliseconds |
| `popularity` | Integer | Spotify popularity score (0-100) |
| `raw_data` | JSON | Full raw response from Spotify API for future-proofing |
| `danceability` | Float | Audio feature: Danceability (0.0 to 1.0) |
| `energy` | Float | Audio feature: Energy (0.0 to 1.0) |
| `key` | Integer | Audio feature: Key |
| `loudness` | Float | Audio feature: Loudness in dB |
| `mode` | Integer | Audio feature: Mode (0 for Minor, 1 for Major) |
| `speechiness` | Float | Audio feature: Speechiness (0.0 to 1.0) |
| `acousticness` | Float | Audio feature: Acousticness (0.0 to 1.0) |
| `instrumentalness` | Float | Audio feature: Instrumentalness (0.0 to 1.0) |
| `liveness` | Float | Audio feature: Liveness (0.0 to 1.0) |
| `valence` | Float | Audio feature: Valence (0.0 to 1.0) |
| `tempo` | Float | Audio feature: Tempo in BPM |
| `time_signature` | Integer | Audio feature: Time signature |
| `lyrics` | Text | Full lyrics retrieved from Genius |
| `lyrics_summary` | String | AI-generated summary of lyrics |
| `genre_tags` | String | Combined genre tags for the track |
| `created_at` | DateTime | Timestamp of record creation |
| `updated_at` | DateTime | Timestamp of last update |

### `play_history`

Stores individual listening instances.

| Field | Type | Description |
|-------|------|-------------|
| `id` | Integer | Primary Key (Auto-increment) |
| `track_id` | String | Foreign Key to `tracks.id` |
| `played_at` | DateTime | Timestamp when the track was played |
| `context_uri` | String | Spotify context URI (e.g., playlist or album URI) |
| `listened_ms` | Integer | Computed duration the track was actually heard |
| `skipped` | Boolean | Whether the track was likely skipped |
| `source` | String | Ingestion source (e.g., "spotify_recently_played") |

### `analysis_snapshots`

Stores periodic analysis results generated by the AI service.

| Field | Type | Description |
|-------|------|-------------|
| `id` | Integer | Primary Key |
| `date` | DateTime | When the analysis was performed |
| `period_start` | DateTime | Start of the analyzed period |
| `period_end` | DateTime | End of the analyzed period |
| `period_label` | String | Label for the period (e.g., "last_30_days") |
| `metrics_payload` | JSON | Computed statistics used as input for the AI |
| `narrative_report` | JSON | AI-generated narrative and persona |
| `model_used` | String | LLM model identifier (e.g., "gemini-1.5-flash") |

### `track_artists` (Association Table)

Facilitates the many-to-many relationship between tracks and artists.

| Field | Type | Description |
|-------|------|-------------|
| `track_id` | String | Foreign Key to `tracks.id` |
| `artist_id` | String | Foreign Key to `artists.id` |
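
The relationships described above can be illustrated with a small, self-contained sketch. It is not the project's SQLAlchemy model code: it uses Python's standard-library `sqlite3` module so that it runs without extra dependencies, and it creates only a reduced subset of the columns. The helper names (`build_demo_db`, `artists_for_track`, `play_count`), the sample rows, and the SQLite column types chosen here (for example, storing `played_at` as ISO 8601 text and `skipped` as 0/1) are assumptions made for illustration.

```python
import sqlite3

# Reduced, illustrative version of the schema: only key columns are included.
# Column types are assumptions about how the documented types map to SQLite.
SCHEMA = """
CREATE TABLE artists (
    id   TEXT PRIMARY KEY,   -- Spotify ID
    name TEXT
);
CREATE TABLE tracks (
    id   TEXT PRIMARY KEY,   -- Spotify ID
    name TEXT
);
CREATE TABLE track_artists (
    track_id  TEXT REFERENCES tracks(id),
    artist_id TEXT REFERENCES artists(id)
);
CREATE TABLE play_history (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    track_id    TEXT REFERENCES tracks(id),
    played_at   TEXT,        -- ISO 8601 timestamp stored as text
    listened_ms INTEGER,
    skipped     INTEGER      -- 0 or 1; SQLite has no native boolean type
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a few hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO artists VALUES (?, ?)",
        [("a1", "Artist A"), ("a2", "Artist B")],
    )
    conn.execute("INSERT INTO tracks VALUES (?, ?)", ("t1", "Example Track"))
    # One track linked to two artists through the association table.
    conn.executemany(
        "INSERT INTO track_artists VALUES (?, ?)",
        [("t1", "a1"), ("t1", "a2")],
    )
    # One track with two listening instances.
    conn.executemany(
        "INSERT INTO play_history (track_id, played_at, listened_ms, skipped) "
        "VALUES (?, ?, ?, ?)",
        [
            ("t1", "2024-01-01T10:00:00", 180000, 0),
            ("t1", "2024-01-02T09:30:00", 12000, 1),
        ],
    )
    conn.commit()
    return conn


def artists_for_track(conn: sqlite3.Connection, track_id: str) -> list[str]:
    """Resolve the many-to-many link: artist names for one track, sorted."""
    rows = conn.execute(
        "SELECT a.name FROM artists a "
        "JOIN track_artists ta ON ta.artist_id = a.id "
        "WHERE ta.track_id = ? ORDER BY a.name",
        (track_id,),
    ).fetchall()
    return [row[0] for row in rows]


def play_count(conn: sqlite3.Connection, track_id: str) -> int:
    """Resolve the one-to-many link: number of plays recorded for one track."""
    row = conn.execute(
        "SELECT COUNT(*) FROM play_history WHERE track_id = ?",
        (track_id,),
    ).fetchone()
    return row[0]
```

With the sample data, `artists_for_track(conn, "t1")` walks `track_artists` to reach both artists, while `play_count(conn, "t1")` counts the rows in `play_history` that point back at the same track.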