mirror of
https://github.com/bnair123/MusicAnalyser.git
synced 2026-02-25 11:46:07 +00:00
- Migrate database from SQLite to PostgreSQL (100.91.248.114:5433) - Fix playlist curation to use actual top tracks instead of AI name matching - Add /playlists/history endpoint for historical playlist viewing - Add Playlist Archives section to frontend with expandable history - Add playlist-modify-* scopes to Spotify OAuth for playlist creation - Rewrite Genius client to use official API (fixes 403 scraping blocks) - Ensure playlists are created on Spotify before curation attempts - Add DATABASE.md documentation for PostgreSQL schema - Add migrations for PlaylistConfig and composition storage
272 lines
9.9 KiB
Markdown
272 lines
9.9 KiB
Markdown
# Database Documentation
|
|
|
|
## PostgreSQL Connection Details
|
|
|
|
| Property | Value |
|
|
|----------|-------|
|
|
| Host | `100.91.248.114` |
|
|
| Port | `5433` |
|
|
| User | `bnair` |
|
|
| Password | `Bharath2002` |
|
|
| Database | `music_db` |
|
|
| Data Location (on server) | `/opt/DB/MusicDB/pgdata` |
|
|
|
|
### Connection String
|
|
```
|
|
postgresql://bnair:Bharath2002@100.91.248.114:5433/music_db
|
|
```
|
|
|
|
## Schema Overview
|
|
|
|
```
|
|
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
|
|
│ artists │ │ track_artists │ │ tracks │
|
|
├─────────────────┤ ├──────────────────┤ ├─────────────────┤
|
|
│ id (PK) │◄────┤ artist_id (FK) │ │ id (PK) │
|
|
│ name │ │ track_id (FK) │────►│ reccobeats_id │
|
|
│ genres (JSON) │ └──────────────────┘ │ name │
|
|
│ image_url │ │ artist │
|
|
└─────────────────┘ │ album │
|
|
│ image_url │
|
|
│ duration_ms │
|
|
│ popularity │
|
|
│ raw_data (JSON) │
|
|
│ danceability │
|
|
│ energy │
|
|
│ key │
|
|
│ ... (audio) │
|
|
│ genres (JSON) │
|
|
│ lyrics │
|
|
│ created_at │
|
|
│ updated_at │
|
|
└─────────────────┘
|
|
│
|
|
│
|
|
▼
|
|
┌─────────────────────┐ ┌─────────────────┐
|
|
│ analysis_snapshots │ │ play_history │
|
|
├─────────────────────┤ ├─────────────────┤
|
|
│ id (PK) │ │ id (PK) │
|
|
│ date │ │ track_id (FK) │
|
|
│ period_start │ │ played_at │
|
|
│ period_end │ │ context_uri │
|
|
│ period_label │ │ listened_ms │
|
|
│ metrics_payload │ │ skipped │
|
|
│ narrative_report │ │ source │
|
|
│ model_used │ └─────────────────┘
|
|
│ playlist_theme │
|
|
│ ... (playlist) │
|
|
│ playlist_composition│
|
|
└─────────────────────┘
|
|
|
|
┌─────────────────────┐
|
|
│ playlist_config │
|
|
├─────────────────────┤
|
|
│ key (PK) │
|
|
│ spotify_id │
|
|
│ last_updated │
|
|
│ current_theme │
|
|
│ description │
|
|
│ composition (JSON) │
|
|
└─────────────────────┘
|
|
```
|
|
|
|
## Tables
|
|
|
|
### `tracks`
|
|
Central entity storing Spotify track metadata and enriched audio features.
|
|
|
|
| Column | Type | Description |
|
|
|--------|------|-------------|
|
|
| `id` | VARCHAR | Spotify track ID (primary key) |
|
|
| `reccobeats_id` | VARCHAR | ReccoBeats UUID for audio features |
|
|
| `name` | VARCHAR | Track title |
|
|
| `artist` | VARCHAR | Display artist string (e.g., "Drake, Future") |
|
|
| `album` | VARCHAR | Album name |
|
|
| `image_url` | VARCHAR | Album art URL |
|
|
| `duration_ms` | INTEGER | Track duration in milliseconds |
|
|
| `popularity` | INTEGER | Spotify popularity score (0-100) |
|
|
| `raw_data` | JSON | Full Spotify API response |
|
|
| `danceability` | FLOAT | Audio feature (0.0-1.0) |
|
|
| `energy` | FLOAT | Audio feature (0.0-1.0) |
|
|
| `key` | INTEGER | Musical key (0-11) |
|
|
| `loudness` | FLOAT | Audio feature (dB) |
|
|
| `mode` | INTEGER | Major (1) or minor (0) |
|
|
| `speechiness` | FLOAT | Audio feature (0.0-1.0) |
|
|
| `acousticness` | FLOAT | Audio feature (0.0-1.0) |
|
|
| `instrumentalness` | FLOAT | Audio feature (0.0-1.0) |
|
|
| `liveness` | FLOAT | Audio feature (0.0-1.0) |
|
|
| `valence` | FLOAT | Audio feature (0.0-1.0) |
|
|
| `tempo` | FLOAT | BPM |
|
|
| `time_signature` | INTEGER | Beats per bar |
|
|
| `genres` | JSON | Genre tags (deprecated, use Artist.genres) |
|
|
| `lyrics` | TEXT | Full lyrics from Genius |
|
|
| `lyrics_summary` | VARCHAR | AI-generated summary |
|
|
| `genre_tags` | VARCHAR | AI-generated tags |
|
|
| `created_at` | TIMESTAMP | Record creation time |
|
|
| `updated_at` | TIMESTAMP | Last update time |
|
|
|
|
### `artists`
|
|
Artist entities with genre information.
|
|
|
|
| Column | Type | Description |
|
|
|--------|------|-------------|
|
|
| `id` | VARCHAR | Spotify artist ID (primary key) |
|
|
| `name` | VARCHAR | Artist name |
|
|
| `genres` | JSON | List of genre strings |
|
|
| `image_url` | VARCHAR | Artist profile image URL |
|
|
|
|
### `track_artists`
|
|
Many-to-many relationship between tracks and artists.
|
|
|
|
| Column | Type | Description |
|
|
|--------|------|-------------|
|
|
| `track_id` | VARCHAR | Foreign key to tracks.id |
|
|
| `artist_id` | VARCHAR | Foreign key to artists.id |
|
|
|
|
### `play_history`
|
|
Immutable log of listening events.
|
|
|
|
| Column | Type | Description |
|
|
|--------|------|-------------|
|
|
| `id` | INTEGER | Auto-increment primary key |
|
|
| `track_id` | VARCHAR | Foreign key to tracks.id |
|
|
| `played_at` | TIMESTAMP | When the track was played |
|
|
| `context_uri` | VARCHAR | Spotify context (playlist, album, etc.) |
|
|
| `listened_ms` | INTEGER | Duration actually listened |
|
|
| `skipped` | BOOLEAN | Whether track was skipped |
|
|
| `source` | VARCHAR | Source of the play event |
|
|
|
|
### `analysis_snapshots`
|
|
Stores computed statistics and AI-generated narratives.
|
|
|
|
| Column | Type | Description |
|
|
|--------|------|-------------|
|
|
| `id` | INTEGER | Auto-increment primary key |
|
|
| `date` | TIMESTAMP | When analysis was run |
|
|
| `period_start` | TIMESTAMP | Analysis period start |
|
|
| `period_end` | TIMESTAMP | Analysis period end |
|
|
| `period_label` | VARCHAR | Label (e.g., "last_30_days") |
|
|
| `metrics_payload` | JSON | StatsService output |
|
|
| `narrative_report` | JSON | NarrativeService output |
|
|
| `model_used` | VARCHAR | LLM model name |
|
|
| `playlist_theme` | VARCHAR | AI-generated theme name |
|
|
| `playlist_theme_reasoning` | TEXT | AI explanation for theme |
|
|
| `six_hour_playlist_id` | VARCHAR | Spotify playlist ID |
|
|
| `daily_playlist_id` | VARCHAR | Spotify playlist ID |
|
|
| `playlist_composition` | JSON | Track list at snapshot time |
|
|
|
|
### `playlist_config`
|
|
Configuration for managed Spotify playlists.
|
|
|
|
| Column | Type | Description |
|
|
|--------|------|-------------|
|
|
| `key` | VARCHAR | Config key (primary key, e.g., "six_hour") |
|
|
| `spotify_id` | VARCHAR | Spotify playlist ID |
|
|
| `last_updated` | TIMESTAMP | Last update time |
|
|
| `current_theme` | VARCHAR | Current playlist theme |
|
|
| `description` | VARCHAR | Playlist description |
|
|
| `composition` | JSON | Current track list |
|
|
|
|
## Schema Modifications (Alembic)
|
|
|
|
All schema changes MUST go through Alembic migrations.
|
|
|
|
### Creating a New Migration
|
|
|
|
```bash
|
|
cd backend
|
|
source venv/bin/activate
|
|
|
|
# Auto-generate migration from model changes
|
|
alembic revision --autogenerate -m "description_of_change"
|
|
|
|
# Or create empty migration for manual SQL
|
|
alembic revision -m "description_of_change"
|
|
```
|
|
|
|
### Applying Migrations
|
|
|
|
```bash
|
|
# Apply all pending migrations
|
|
alembic upgrade head
|
|
|
|
# Apply specific migration
|
|
alembic upgrade <revision_id>
|
|
|
|
# Rollback one migration
|
|
alembic downgrade -1
|
|
|
|
# Rollback to specific revision
|
|
alembic downgrade <revision_id>
|
|
```
|
|
|
|
### Migration Best Practices
|
|
|
|
1. **Test locally first** - Always test migrations on a dev database
|
|
2. **Backup before migrating** - `pg_dump -h 100.91.248.114 -p 5433 -U bnair music_db > backup.sql`
|
|
3. **One change per migration** - Keep migrations atomic
|
|
4. **Include rollback logic** - Implement `downgrade()` function
|
|
5. **Review autogenerated migrations** - They may miss nuances
|
|
|
|
### Example Migration
|
|
|
|
```python
|
|
# alembic/versions/xxxx_add_new_column.py
|
|
from alembic import op
|
|
import sqlalchemy as sa
|
|
|
|
revision = 'xxxx'
|
|
down_revision = 'yyyy'
|
|
|
|
def upgrade():
|
|
op.add_column('tracks', sa.Column('new_column', sa.String(), nullable=True))
|
|
|
|
def downgrade():
|
|
op.drop_column('tracks', 'new_column')
|
|
```
|
|
|
|
## Direct Database Access
|
|
|
|
### Using psql
|
|
```bash
|
|
psql -h 100.91.248.114 -p 5433 -U bnair -d music_db
|
|
```
|
|
|
|
### Using Python
|
|
```python
|
|
import psycopg2
|
|
conn = psycopg2.connect(
|
|
host='100.91.248.114',
|
|
port=5433,
|
|
user='bnair',
|
|
password='Bharath2002',
|
|
dbname='music_db'
|
|
)
|
|
```
|
|
|
|
### Common Queries
|
|
|
|
```sql
|
|
-- Recent plays
|
|
SELECT t.name, t.artist, ph.played_at
|
|
FROM play_history ph
|
|
JOIN tracks t ON ph.track_id = t.id
|
|
ORDER BY ph.played_at DESC
|
|
LIMIT 10;
|
|
|
|
-- Top tracks by play count
|
|
SELECT t.name, t.artist, COUNT(*) as plays
|
|
FROM play_history ph
|
|
JOIN tracks t ON ph.track_id = t.id
|
|
GROUP BY t.id, t.name, t.artist
|
|
ORDER BY plays DESC
|
|
LIMIT 10;
|
|
|
|
-- Genre distribution
|
|
SELECT genre, COUNT(*)
|
|
FROM artists, jsonb_array_elements_text(genres::jsonb) AS genre
|
|
GROUP BY genre
|
|
ORDER BY count DESC;
|
|
```
|