Files
MusicAnalyser/docs/DATABASE.md
bnair123 272148c5bf feat: migrate to PostgreSQL and enhance playlist curation
- Migrate database from SQLite to PostgreSQL (100.91.248.114:5433)
- Fix playlist curation to use actual top tracks instead of AI name matching
- Add /playlists/history endpoint for historical playlist viewing
- Add Playlist Archives section to frontend with expandable history
- Add playlist-modify-* scopes to Spotify OAuth for playlist creation
- Rewrite Genius client to use official API (fixes 403 scraping blocks)
- Ensure playlists are created on Spotify before curation attempts
- Add DATABASE.md documentation for PostgreSQL schema
- Add migrations for PlaylistConfig and composition storage
2025-12-30 22:24:56 +04:00

9.9 KiB

Database Documentation

PostgreSQL Connection Details

Property Value
Host 100.91.248.114
Port 5433
User bnair
Password Bharath2002
Database music_db
Data Location (on server) /opt/DB/MusicDB/pgdata

Connection String

postgresql://bnair:Bharath2002@100.91.248.114:5433/music_db

Schema Overview

┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│     artists     │     │   track_artists  │     │     tracks      │
├─────────────────┤     ├──────────────────┤     ├─────────────────┤
│ id (PK)         │◄────┤ artist_id (FK)   │     │ id (PK)         │
│ name            │     │ track_id (FK)    │────►│ reccobeats_id   │
│ genres (JSON)   │     └──────────────────┘     │ name            │
│ image_url       │                              │ artist          │
└─────────────────┘                              │ album           │
                                                 │ image_url       │
                                                 │ duration_ms     │
                                                 │ popularity      │
                                                 │ raw_data (JSON) │
                                                 │ danceability    │
                                                 │ energy          │
                                                 │ key             │
                                                 │ ... (audio)     │
                                                 │ genres (JSON)   │
                                                 │ lyrics          │
                                                 │ created_at      │
                                                 │ updated_at      │
                                                 └─────────────────┘
                                                         │
                                                         │
                                                         ▼
┌─────────────────────┐                          ┌─────────────────┐
│ analysis_snapshots  │                          │  play_history   │
├─────────────────────┤                          ├─────────────────┤
│ id (PK)             │                          │ id (PK)         │
│ date                │                          │ track_id (FK)   │
│ period_start        │                          │ played_at       │
│ period_end          │                          │ context_uri     │
│ period_label        │                          │ listened_ms     │
│ metrics_payload     │                          │ skipped         │
│ narrative_report    │                          │ source          │
│ model_used          │                          └─────────────────┘
│ playlist_theme      │
│ ... (playlist)      │
│ playlist_composition│
└─────────────────────┘

┌─────────────────────┐
│   playlist_config   │
├─────────────────────┤
│ key (PK)            │
│ spotify_id          │
│ last_updated        │
│ current_theme       │
│ description         │
│ composition (JSON)  │
└─────────────────────┘

Tables

tracks

Central entity storing Spotify track metadata and enriched audio features.

Column Type Description
id VARCHAR Spotify track ID (primary key)
reccobeats_id VARCHAR ReccoBeats UUID for audio features
name VARCHAR Track title
artist VARCHAR Display artist string (e.g., "Drake, Future")
album VARCHAR Album name
image_url VARCHAR Album art URL
duration_ms INTEGER Track duration in milliseconds
popularity INTEGER Spotify popularity score (0-100)
raw_data JSON Full Spotify API response
danceability FLOAT Audio feature (0.0-1.0)
energy FLOAT Audio feature (0.0-1.0)
key INTEGER Musical key (0-11)
loudness FLOAT Audio feature (dB)
mode INTEGER Major (1) or minor (0)
speechiness FLOAT Audio feature (0.0-1.0)
acousticness FLOAT Audio feature (0.0-1.0)
instrumentalness FLOAT Audio feature (0.0-1.0)
liveness FLOAT Audio feature (0.0-1.0)
valence FLOAT Audio feature (0.0-1.0)
tempo FLOAT BPM
time_signature INTEGER Beats per bar
genres JSON Genre tags (deprecated, use Artist.genres)
lyrics TEXT Full lyrics from Genius
lyrics_summary VARCHAR AI-generated summary
genre_tags VARCHAR AI-generated tags
created_at TIMESTAMP Record creation time
updated_at TIMESTAMP Last update time

artists

Artist entities with genre information.

Column Type Description
id VARCHAR Spotify artist ID (primary key)
name VARCHAR Artist name
genres JSON List of genre strings
image_url VARCHAR Artist profile image URL

track_artists

Many-to-many relationship between tracks and artists.

Column Type Description
track_id VARCHAR Foreign key to tracks.id
artist_id VARCHAR Foreign key to artists.id

play_history

Immutable log of listening events.

Column Type Description
id INTEGER Auto-increment primary key
track_id VARCHAR Foreign key to tracks.id
played_at TIMESTAMP When the track was played
context_uri VARCHAR Spotify context (playlist, album, etc.)
listened_ms INTEGER Duration actually listened
skipped BOOLEAN Whether track was skipped
source VARCHAR Source of the play event

analysis_snapshots

Stores computed statistics and AI-generated narratives.

Column Type Description
id INTEGER Auto-increment primary key
date TIMESTAMP When analysis was run
period_start TIMESTAMP Analysis period start
period_end TIMESTAMP Analysis period end
period_label VARCHAR Label (e.g., "last_30_days")
metrics_payload JSON StatsService output
narrative_report JSON NarrativeService output
model_used VARCHAR LLM model name
playlist_theme VARCHAR AI-generated theme name
playlist_theme_reasoning TEXT AI explanation for theme
six_hour_playlist_id VARCHAR Spotify playlist ID
daily_playlist_id VARCHAR Spotify playlist ID
playlist_composition JSON Track list at snapshot time

playlist_config

Configuration for managed Spotify playlists.

Column Type Description
key VARCHAR Config key (primary key, e.g., "six_hour")
spotify_id VARCHAR Spotify playlist ID
last_updated TIMESTAMP Last update time
current_theme VARCHAR Current playlist theme
description VARCHAR Playlist description
composition JSON Current track list

Schema Modifications (Alembic)

All schema changes MUST go through Alembic migrations.

Creating a New Migration

cd backend
source venv/bin/activate

# Auto-generate migration from model changes
alembic revision --autogenerate -m "description_of_change"

# Or create empty migration for manual SQL
alembic revision -m "description_of_change"

Applying Migrations

# Apply all pending migrations
alembic upgrade head

# Apply specific migration
alembic upgrade <revision_id>

# Rollback one migration
alembic downgrade -1

# Rollback to specific revision
alembic downgrade <revision_id>

Migration Best Practices

  1. Test locally first - Always test migrations on a dev database
  2. Backup before migrating - pg_dump -h 100.91.248.114 -p 5433 -U bnair music_db > backup.sql
  3. One change per migration - Keep migrations atomic
  4. Include rollback logic - Implement downgrade() function
  5. Review autogenerated migrations - They may miss nuances

Example Migration

# alembic/versions/xxxx_add_new_column.py
from alembic import op
import sqlalchemy as sa

revision = 'xxxx'
down_revision = 'yyyy'

def upgrade():
    op.add_column('tracks', sa.Column('new_column', sa.String(), nullable=True))

def downgrade():
    op.drop_column('tracks', 'new_column')

Direct Database Access

Using psql

psql -h 100.91.248.114 -p 5433 -U bnair -d music_db

Using Python

import psycopg2
conn = psycopg2.connect(
    host='100.91.248.114',
    port=5433,
    user='bnair',
    password='Bharath2002',
    dbname='music_db'
)

Common Queries

-- Recent plays
SELECT t.name, t.artist, ph.played_at 
FROM play_history ph 
JOIN tracks t ON ph.track_id = t.id 
ORDER BY ph.played_at DESC 
LIMIT 10;

-- Top tracks by play count
SELECT t.name, t.artist, COUNT(*) as plays 
FROM play_history ph 
JOIN tracks t ON ph.track_id = t.id 
GROUP BY t.id, t.name, t.artist 
ORDER BY plays DESC 
LIMIT 10;

-- Genre distribution
SELECT genre, COUNT(*) 
FROM artists, jsonb_array_elements_text(genres::jsonb) AS genre 
GROUP BY genre 
ORDER BY count DESC;