feat: migrate to PostgreSQL and enhance playlist curation

- Migrate database from SQLite to PostgreSQL (100.91.248.114:5433) - Fix playlist curation to use actual top tracks instead of AI name matching - Add /playlists/history endpoint for historical playlist viewing - Add Playlist Archives section to frontend with expandable history - Add playlist-modify-* scopes to Spotify OAuth for playlist creation - Rewrite Genius client to use official API (fixes 403 scraping blocks) - Ensure playlists are created on Spotify before curation attempts - Add DATABASE.md documentation for PostgreSQL schema - Add migrations for PlaylistConfig and composition storage
2026-06-25 09:08:29 +02:00 · 2025-12-30 22:24:56 +04:00
parent 26b4895695
commit 272148c5bf
19 changed files with 1130 additions and 145 deletions
@@ -0,0 +1,271 @@
+# Database Documentation
+
+## PostgreSQL Connection Details
+
+| Property | Value |
+|----------|-------|
+| Host | `100.91.248.114` |
+| Port | `5433` |
+| User | `bnair` |
+| Password | `Bharath2002` |
+| Database | `music_db` |
+| Data Location (on server) | `/opt/DB/MusicDB/pgdata` |
+
+### Connection String
+```
+postgresql://bnair:Bharath2002@100.91.248.114:5433/music_db
+```
+
+## Schema Overview
+
+```
+┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
+│     artists     │     │   track_artists  │     │     tracks      │
+├─────────────────┤     ├──────────────────┤     ├─────────────────┤
+│ id (PK)         │◄────┤ artist_id (FK)   │     │ id (PK)         │
+│ name            │     │ track_id (FK)    │────►│ reccobeats_id   │
+│ genres (JSON)   │     └──────────────────┘     │ name            │
+│ image_url       │                              │ artist          │
+└─────────────────┘                              │ album           │
+                                                 │ image_url       │
+                                                 │ duration_ms     │
+                                                 │ popularity      │
+                                                 │ raw_data (JSON) │
+                                                 │ danceability    │
+                                                 │ energy          │
+                                                 │ key             │
+                                                 │ ... (audio)     │
+                                                 │ genres (JSON)   │
+                                                 │ lyrics          │
+                                                 │ created_at      │
+                                                 │ updated_at      │
+                                                 └─────────────────┘
+                                                         │
+                                                         │
+                                                         ▼
+┌─────────────────────┐                          ┌─────────────────┐
+│ analysis_snapshots  │                          │  play_history   │
+├─────────────────────┤                          ├─────────────────┤
+│ id (PK)             │                          │ id (PK)         │
+│ date                │                          │ track_id (FK)   │
+│ period_start        │                          │ played_at       │
+│ period_end          │                          │ context_uri     │
+│ period_label        │                          │ listened_ms     │
+│ metrics_payload     │                          │ skipped         │
+│ narrative_report    │                          │ source          │
+│ model_used          │                          └─────────────────┘
+│ playlist_theme      │
+│ ... (playlist)      │
+│ playlist_composition│
+└─────────────────────┘
+
+┌─────────────────────┐
+│   playlist_config   │
+├─────────────────────┤
+│ key (PK)            │
+│ spotify_id          │
+│ last_updated        │
+│ current_theme       │
+│ description         │
+│ composition (JSON)  │
+└─────────────────────┘
+```
+
+## Tables
+
+### `tracks`
+Central entity storing Spotify track metadata and enriched audio features.
+
+| Column | Type | Description |
+|--------|------|-------------|
+| `id` | VARCHAR | Spotify track ID (primary key) |
+| `reccobeats_id` | VARCHAR | ReccoBeats UUID for audio features |
+| `name` | VARCHAR | Track title |
+| `artist` | VARCHAR | Display artist string (e.g., "Drake, Future") |
+| `album` | VARCHAR | Album name |
+| `image_url` | VARCHAR | Album art URL |
+| `duration_ms` | INTEGER | Track duration in milliseconds |
+| `popularity` | INTEGER | Spotify popularity score (0-100) |
+| `raw_data` | JSON | Full Spotify API response |
+| `danceability` | FLOAT | Audio feature (0.0-1.0) |
+| `energy` | FLOAT | Audio feature (0.0-1.0) |
+| `key` | INTEGER | Musical key (0-11) |
+| `loudness` | FLOAT | Audio feature (dB) |
+| `mode` | INTEGER | Major (1) or minor (0) |
+| `speechiness` | FLOAT | Audio feature (0.0-1.0) |
+| `acousticness` | FLOAT | Audio feature (0.0-1.0) |
+| `instrumentalness` | FLOAT | Audio feature (0.0-1.0) |
+| `liveness` | FLOAT | Audio feature (0.0-1.0) |
+| `valence` | FLOAT | Audio feature (0.0-1.0) |
+| `tempo` | FLOAT | BPM |
+| `time_signature` | INTEGER | Beats per bar |
+| `genres` | JSON | Genre tags (deprecated, use Artist.genres) |
+| `lyrics` | TEXT | Full lyrics from Genius |
+| `lyrics_summary` | VARCHAR | AI-generated summary |
+| `genre_tags` | VARCHAR | AI-generated tags |
+| `created_at` | TIMESTAMP | Record creation time |
+| `updated_at` | TIMESTAMP | Last update time |
+
+### `artists`
+Artist entities with genre information.
+
+| Column | Type | Description |
+|--------|------|-------------|
+| `id` | VARCHAR | Spotify artist ID (primary key) |
+| `name` | VARCHAR | Artist name |
+| `genres` | JSON | List of genre strings |
+| `image_url` | VARCHAR | Artist profile image URL |
+
+### `track_artists`
+Many-to-many relationship between tracks and artists.
+
+| Column | Type | Description |
+|--------|------|-------------|
+| `track_id` | VARCHAR | Foreign key to tracks.id |
+| `artist_id` | VARCHAR | Foreign key to artists.id |
+
+### `play_history`
+Immutable log of listening events.
+
+| Column | Type | Description |
+|--------|------|-------------|
+| `id` | INTEGER | Auto-increment primary key |
+| `track_id` | VARCHAR | Foreign key to tracks.id |
+| `played_at` | TIMESTAMP | When the track was played |
+| `context_uri` | VARCHAR | Spotify context (playlist, album, etc.) |
+| `listened_ms` | INTEGER | Duration actually listened |
+| `skipped` | BOOLEAN | Whether track was skipped |
+| `source` | VARCHAR | Source of the play event |
+
+### `analysis_snapshots`
+Stores computed statistics and AI-generated narratives.
+
+| Column | Type | Description |
+|--------|------|-------------|
+| `id` | INTEGER | Auto-increment primary key |
+| `date` | TIMESTAMP | When analysis was run |
+| `period_start` | TIMESTAMP | Analysis period start |
+| `period_end` | TIMESTAMP | Analysis period end |
+| `period_label` | VARCHAR | Label (e.g., "last_30_days") |
+| `metrics_payload` | JSON | StatsService output |
+| `narrative_report` | JSON | NarrativeService output |
+| `model_used` | VARCHAR | LLM model name |
+| `playlist_theme` | VARCHAR | AI-generated theme name |
+| `playlist_theme_reasoning` | TEXT | AI explanation for theme |
+| `six_hour_playlist_id` | VARCHAR | Spotify playlist ID |
+| `daily_playlist_id` | VARCHAR | Spotify playlist ID |
+| `playlist_composition` | JSON | Track list at snapshot time |
+
+### `playlist_config`
+Configuration for managed Spotify playlists.
+
+| Column | Type | Description |
+|--------|------|-------------|
+| `key` | VARCHAR | Config key (primary key, e.g., "six_hour") |
+| `spotify_id` | VARCHAR | Spotify playlist ID |
+| `last_updated` | TIMESTAMP | Last update time |
+| `current_theme` | VARCHAR | Current playlist theme |
+| `description` | VARCHAR | Playlist description |
+| `composition` | JSON | Current track list |
+
+## Schema Modifications (Alembic)
+
+All schema changes MUST go through Alembic migrations.
+
+### Creating a New Migration
+
+```bash
+cd backend
+source venv/bin/activate
+
+# Auto-generate migration from model changes
+alembic revision --autogenerate -m "description_of_change"
+
+# Or create empty migration for manual SQL
+alembic revision -m "description_of_change"
+```
+
+### Applying Migrations
+
+```bash
+# Apply all pending migrations
+alembic upgrade head
+
+# Apply specific migration
+alembic upgrade <revision_id>
+
+# Rollback one migration
+alembic downgrade -1
+
+# Rollback to specific revision
+alembic downgrade <revision_id>
+```
+
+### Migration Best Practices
+
+1. **Test locally first** - Always test migrations on a dev database
+2. **Backup before migrating** - `pg_dump -h 100.91.248.114 -p 5433 -U bnair music_db > backup.sql`
+3. **One change per migration** - Keep migrations atomic
+4. **Include rollback logic** - Implement `downgrade()` function
+5. **Review autogenerated migrations** - They may miss nuances
+
+### Example Migration
+
+```python
+# alembic/versions/xxxx_add_new_column.py
+from alembic import op
+import sqlalchemy as sa
+
+revision = 'xxxx'
+down_revision = 'yyyy'
+
+def upgrade():
+    op.add_column('tracks', sa.Column('new_column', sa.String(), nullable=True))
+
+def downgrade():
+    op.drop_column('tracks', 'new_column')
+```
+
+## Direct Database Access
+
+### Using psql
+```bash
+psql -h 100.91.248.114 -p 5433 -U bnair -d music_db
+```
+
+### Using Python
+```python
+import psycopg2
+conn = psycopg2.connect(
+    host='100.91.248.114',
+    port=5433,
+    user='bnair',
+    password='Bharath2002',
+    dbname='music_db'
+)
+```
+
+### Common Queries
+
+```sql
+-- Recent plays
+SELECT t.name, t.artist, ph.played_at 
+FROM play_history ph 
+JOIN tracks t ON ph.track_id = t.id 
+ORDER BY ph.played_at DESC 
+LIMIT 10;
+
+-- Top tracks by play count
+SELECT t.name, t.artist, COUNT(*) as plays 
+FROM play_history ph 
+JOIN tracks t ON ph.track_id = t.id 
+GROUP BY t.id, t.name, t.artist 
+ORDER BY plays DESC 
+LIMIT 10;
+
+-- Genre distribution
+SELECT genre, COUNT(*) 
+FROM artists, jsonb_array_elements_text(genres::jsonb) AS genre 
+GROUP BY genre 
+ORDER BY count DESC;
+```