# Database Documentation

## PostgreSQL Connection Details

| Property | Value |
|----------|-------|
| Host | `100.91.248.114` |
| Port | `5433` |
| User | `bnair` |
| Password | `Bharath2002` |
| Database | `music_db` |
| Data Location (on server) | `/opt/DB/MusicDB/pgdata` |

### Connection String
```
postgresql://bnair:Bharath2002@100.91.248.114:5433/music_db
```

## Schema Overview

```
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│     artists     │     │   track_artists  │     │     tracks      │
├─────────────────┤     ├──────────────────┤     ├─────────────────┤
│ id (PK)         │◄────┤ artist_id (FK)   │     │ id (PK)         │
│ name            │     │ track_id (FK)    │────►│ reccobeats_id   │
│ genres (JSON)   │     └──────────────────┘     │ name            │
│ image_url       │                              │ artist          │
└─────────────────┘                              │ album           │
                                                 │ image_url       │
                                                 │ duration_ms     │
                                                 │ popularity      │
                                                 │ raw_data (JSON) │
                                                 │ danceability    │
                                                 │ energy          │
                                                 │ key             │
                                                 │ ... (audio)     │
                                                 │ genres (JSON)   │
                                                 │ lyrics          │
                                                 │ created_at      │
                                                 │ updated_at      │
                                                 └─────────────────┘
                                                         │
                                                         │
                                                         ▼
┌─────────────────────┐                          ┌─────────────────┐
│ analysis_snapshots  │                          │  play_history   │
├─────────────────────┤                          ├─────────────────┤
│ id (PK)             │                          │ id (PK)         │
│ date                │                          │ track_id (FK)   │
│ period_start        │                          │ played_at       │
│ period_end          │                          │ context_uri     │
│ period_label        │                          │ listened_ms     │
│ metrics_payload     │                          │ skipped         │
│ narrative_report    │                          │ source          │
│ model_used          │                          └─────────────────┘
│ playlist_theme      │
│ ... (playlist)      │
│ playlist_composition│
└─────────────────────┘

┌─────────────────────┐
│   playlist_config   │
├─────────────────────┤
│ key (PK)            │
│ spotify_id          │
│ last_updated        │
│ current_theme       │
│ description         │
│ composition (JSON)  │
└─────────────────────┘
```

## Tables

### `tracks`
Central entity storing Spotify track metadata and enriched audio features.

| Column | Type | Description |
|--------|------|-------------|
| `id` | VARCHAR | Spotify track ID (primary key) |
| `reccobeats_id` | VARCHAR | ReccoBeats UUID for audio features |
| `name` | VARCHAR | Track title |
| `artist` | VARCHAR | Display artist string (e.g., "Drake, Future") |
| `album` | VARCHAR | Album name |
| `image_url` | VARCHAR | Album art URL |
| `duration_ms` | INTEGER | Track duration in milliseconds |
| `popularity` | INTEGER | Spotify popularity score (0-100) |
| `raw_data` | JSON | Full Spotify API response |
| `danceability` | FLOAT | Audio feature (0.0-1.0) |
| `energy` | FLOAT | Audio feature (0.0-1.0) |
| `key` | INTEGER | Musical key (0-11) |
| `loudness` | FLOAT | Audio feature (dB) |
| `mode` | INTEGER | Major (1) or minor (0) |
| `speechiness` | FLOAT | Audio feature (0.0-1.0) |
| `acousticness` | FLOAT | Audio feature (0.0-1.0) |
| `instrumentalness` | FLOAT | Audio feature (0.0-1.0) |
| `liveness` | FLOAT | Audio feature (0.0-1.0) |
| `valence` | FLOAT | Audio feature (0.0-1.0) |
| `tempo` | FLOAT | BPM |
| `time_signature` | INTEGER | Beats per bar |
| `genres` | JSON | Genre tags (deprecated, use Artist.genres) |
| `lyrics` | TEXT | Full lyrics from Genius |
| `lyrics_summary` | VARCHAR | AI-generated summary |
| `genre_tags` | VARCHAR | AI-generated tags |
| `created_at` | TIMESTAMP | Record creation time |
| `updated_at` | TIMESTAMP | Last update time |

### `artists`
Artist entities with genre information.

| Column | Type | Description |
|--------|------|-------------|
| `id` | VARCHAR | Spotify artist ID (primary key) |
| `name` | VARCHAR | Artist name |
| `genres` | JSON | List of genre strings |
| `image_url` | VARCHAR | Artist profile image URL |

### `track_artists`
Many-to-many relationship between tracks and artists.

| Column | Type | Description |
|--------|------|-------------|
| `track_id` | VARCHAR | Foreign key to tracks.id |
| `artist_id` | VARCHAR | Foreign key to artists.id |

### `play_history`
Immutable log of listening events.

| Column | Type | Description |
|--------|------|-------------|
| `id` | INTEGER | Auto-increment primary key |
| `track_id` | VARCHAR | Foreign key to tracks.id |
| `played_at` | TIMESTAMP | When the track was played |
| `context_uri` | VARCHAR | Spotify context (playlist, album, etc.) |
| `listened_ms` | INTEGER | Duration actually listened |
| `skipped` | BOOLEAN | Whether track was skipped |
| `source` | VARCHAR | Source of the play event |

### `analysis_snapshots`
Stores computed statistics and AI-generated narratives.

| Column | Type | Description |
|--------|------|-------------|
| `id` | INTEGER | Auto-increment primary key |
| `date` | TIMESTAMP | When analysis was run |
| `period_start` | TIMESTAMP | Analysis period start |
| `period_end` | TIMESTAMP | Analysis period end |
| `period_label` | VARCHAR | Label (e.g., "last_30_days") |
| `metrics_payload` | JSON | StatsService output |
| `narrative_report` | JSON | NarrativeService output |
| `model_used` | VARCHAR | LLM model name |
| `playlist_theme` | VARCHAR | AI-generated theme name |
| `playlist_theme_reasoning` | TEXT | AI explanation for theme |
| `six_hour_playlist_id` | VARCHAR | Spotify playlist ID |
| `daily_playlist_id` | VARCHAR | Spotify playlist ID |
| `playlist_composition` | JSON | Track list at snapshot time |

### `playlist_config`
Configuration for managed Spotify playlists.

| Column | Type | Description |
|--------|------|-------------|
| `key` | VARCHAR | Config key (primary key, e.g., "six_hour") |
| `spotify_id` | VARCHAR | Spotify playlist ID |
| `last_updated` | TIMESTAMP | Last update time |
| `current_theme` | VARCHAR | Current playlist theme |
| `description` | VARCHAR | Playlist description |
| `composition` | JSON | Current track list |

## Schema Modifications (Alembic)

All schema changes MUST go through Alembic migrations.

### Creating a New Migration

```bash
cd backend
source venv/bin/activate

# Auto-generate migration from model changes
alembic revision --autogenerate -m "description_of_change"

# Or create empty migration for manual SQL
alembic revision -m "description_of_change"
```

### Applying Migrations

```bash
# Apply all pending migrations
alembic upgrade head

# Apply specific migration
alembic upgrade <revision_id>

# Rollback one migration
alembic downgrade -1

# Rollback to specific revision
alembic downgrade <revision_id>
```

### Migration Best Practices

1. **Test locally first** - Always test migrations on a dev database
2. **Backup before migrating** - `pg_dump -h 100.91.248.114 -p 5433 -U bnair music_db > backup.sql`
3. **One change per migration** - Keep migrations atomic
4. **Include rollback logic** - Implement `downgrade()` function
5. **Review autogenerated migrations** - They may miss nuances

### Example Migration

```python
# alembic/versions/xxxx_add_new_column.py
from alembic import op
import sqlalchemy as sa

revision = 'xxxx'
down_revision = 'yyyy'

def upgrade():
    op.add_column('tracks', sa.Column('new_column', sa.String(), nullable=True))

def downgrade():
    op.drop_column('tracks', 'new_column')
```

## Direct Database Access

### Using psql
```bash
psql -h 100.91.248.114 -p 5433 -U bnair -d music_db
```

### Using Python
```python
import psycopg2
conn = psycopg2.connect(
    host='100.91.248.114',
    port=5433,
    user='bnair',
    password='Bharath2002',
    dbname='music_db'
)
```

### Common Queries

```sql
-- Recent plays
SELECT t.name, t.artist, ph.played_at 
FROM play_history ph 
JOIN tracks t ON ph.track_id = t.id 
ORDER BY ph.played_at DESC 
LIMIT 10;

-- Top tracks by play count
SELECT t.name, t.artist, COUNT(*) as plays 
FROM play_history ph 
JOIN tracks t ON ph.track_id = t.id 
GROUP BY t.id, t.name, t.artist 
ORDER BY plays DESC 
LIMIT 10;

-- Genre distribution
SELECT genre, COUNT(*) 
FROM artists, jsonb_array_elements_text(genres::jsonb) AS genre 
GROUP BY genre 
ORDER BY count DESC;
```