mirror of
https://github.com/bnair123/MusicAnalyser.git
synced 2026-02-25 11:46:07 +00:00
feat: migrate to PostgreSQL and enhance playlist curation
- Migrate database from SQLite to PostgreSQL (100.91.248.114:5433) - Fix playlist curation to use actual top tracks instead of AI name matching - Add /playlists/history endpoint for historical playlist viewing - Add Playlist Archives section to frontend with expandable history - Add playlist-modify-* scopes to Spotify OAuth for playlist creation - Rewrite Genius client to use official API (fixes 403 scraping blocks) - Ensure playlists are created on Spotify before curation attempts - Add DATABASE.md documentation for PostgreSQL schema - Add migrations for PlaylistConfig and composition storage
This commit is contained in:
271
docs/DATABASE.md
Normal file
271
docs/DATABASE.md
Normal file
@@ -0,0 +1,271 @@
|
||||
# Database Documentation
|
||||
|
||||
## PostgreSQL Connection Details
|
||||
|
||||
| Property | Value |
|
||||
|----------|-------|
|
||||
| Host | `100.91.248.114` |
|
||||
| Port | `5433` |
|
||||
| User | `bnair` |
|
||||
| Password | `Bharath2002` |
|
||||
| Database | `music_db` |
|
||||
| Data Location (on server) | `/opt/DB/MusicDB/pgdata` |
|
||||
|
||||
### Connection String
|
||||
```
|
||||
postgresql://bnair:Bharath2002@100.91.248.114:5433/music_db
|
||||
```
|
||||
|
||||
## Schema Overview
|
||||
|
||||
```
|
||||
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
|
||||
│ artists │ │ track_artists │ │ tracks │
|
||||
├─────────────────┤ ├──────────────────┤ ├─────────────────┤
|
||||
│ id (PK) │◄────┤ artist_id (FK) │ │ id (PK) │
|
||||
│ name │ │ track_id (FK) │────►│ reccobeats_id │
|
||||
│ genres (JSON) │ └──────────────────┘ │ name │
|
||||
│ image_url │ │ artist │
|
||||
└─────────────────┘ │ album │
|
||||
│ image_url │
|
||||
│ duration_ms │
|
||||
│ popularity │
|
||||
│ raw_data (JSON) │
|
||||
│ danceability │
|
||||
│ energy │
|
||||
│ key │
|
||||
│ ... (audio) │
|
||||
│ genres (JSON) │
|
||||
│ lyrics │
|
||||
│ created_at │
|
||||
│ updated_at │
|
||||
└─────────────────┘
|
||||
│
|
||||
│
|
||||
▼
|
||||
┌─────────────────────┐ ┌─────────────────┐
|
||||
│ analysis_snapshots │ │ play_history │
|
||||
├─────────────────────┤ ├─────────────────┤
|
||||
│ id (PK) │ │ id (PK) │
|
||||
│ date │ │ track_id (FK) │
|
||||
│ period_start │ │ played_at │
|
||||
│ period_end │ │ context_uri │
|
||||
│ period_label │ │ listened_ms │
|
||||
│ metrics_payload │ │ skipped │
|
||||
│ narrative_report │ │ source │
|
||||
│ model_used │ └─────────────────┘
|
||||
│ playlist_theme │
|
||||
│ ... (playlist) │
|
||||
│ playlist_composition│
|
||||
└─────────────────────┘
|
||||
|
||||
┌─────────────────────┐
|
||||
│ playlist_config │
|
||||
├─────────────────────┤
|
||||
│ key (PK) │
|
||||
│ spotify_id │
|
||||
│ last_updated │
|
||||
│ current_theme │
|
||||
│ description │
|
||||
│ composition (JSON) │
|
||||
└─────────────────────┘
|
||||
```
|
||||
|
||||
## Tables
|
||||
|
||||
### `tracks`
|
||||
Central entity storing Spotify track metadata and enriched audio features.
|
||||
|
||||
| Column | Type | Description |
|
||||
|--------|------|-------------|
|
||||
| `id` | VARCHAR | Spotify track ID (primary key) |
|
||||
| `reccobeats_id` | VARCHAR | ReccoBeats UUID for audio features |
|
||||
| `name` | VARCHAR | Track title |
|
||||
| `artist` | VARCHAR | Display artist string (e.g., "Drake, Future") |
|
||||
| `album` | VARCHAR | Album name |
|
||||
| `image_url` | VARCHAR | Album art URL |
|
||||
| `duration_ms` | INTEGER | Track duration in milliseconds |
|
||||
| `popularity` | INTEGER | Spotify popularity score (0-100) |
|
||||
| `raw_data` | JSON | Full Spotify API response |
|
||||
| `danceability` | FLOAT | Audio feature (0.0-1.0) |
|
||||
| `energy` | FLOAT | Audio feature (0.0-1.0) |
|
||||
| `key` | INTEGER | Musical key (0-11) |
|
||||
| `loudness` | FLOAT | Audio feature (dB) |
|
||||
| `mode` | INTEGER | Major (1) or minor (0) |
|
||||
| `speechiness` | FLOAT | Audio feature (0.0-1.0) |
|
||||
| `acousticness` | FLOAT | Audio feature (0.0-1.0) |
|
||||
| `instrumentalness` | FLOAT | Audio feature (0.0-1.0) |
|
||||
| `liveness` | FLOAT | Audio feature (0.0-1.0) |
|
||||
| `valence` | FLOAT | Audio feature (0.0-1.0) |
|
||||
| `tempo` | FLOAT | BPM |
|
||||
| `time_signature` | INTEGER | Beats per bar |
|
||||
| `genres` | JSON | Genre tags (deprecated, use Artist.genres) |
|
||||
| `lyrics` | TEXT | Full lyrics from Genius |
|
||||
| `lyrics_summary` | VARCHAR | AI-generated summary |
|
||||
| `genre_tags` | VARCHAR | AI-generated tags |
|
||||
| `created_at` | TIMESTAMP | Record creation time |
|
||||
| `updated_at` | TIMESTAMP | Last update time |
|
||||
|
||||
### `artists`
|
||||
Artist entities with genre information.
|
||||
|
||||
| Column | Type | Description |
|
||||
|--------|------|-------------|
|
||||
| `id` | VARCHAR | Spotify artist ID (primary key) |
|
||||
| `name` | VARCHAR | Artist name |
|
||||
| `genres` | JSON | List of genre strings |
|
||||
| `image_url` | VARCHAR | Artist profile image URL |
|
||||
|
||||
### `track_artists`
|
||||
Many-to-many relationship between tracks and artists.
|
||||
|
||||
| Column | Type | Description |
|
||||
|--------|------|-------------|
|
||||
| `track_id` | VARCHAR | Foreign key to tracks.id |
|
||||
| `artist_id` | VARCHAR | Foreign key to artists.id |
|
||||
|
||||
### `play_history`
|
||||
Immutable log of listening events.
|
||||
|
||||
| Column | Type | Description |
|
||||
|--------|------|-------------|
|
||||
| `id` | INTEGER | Auto-increment primary key |
|
||||
| `track_id` | VARCHAR | Foreign key to tracks.id |
|
||||
| `played_at` | TIMESTAMP | When the track was played |
|
||||
| `context_uri` | VARCHAR | Spotify context (playlist, album, etc.) |
|
||||
| `listened_ms` | INTEGER | Duration actually listened |
|
||||
| `skipped` | BOOLEAN | Whether track was skipped |
|
||||
| `source` | VARCHAR | Source of the play event |
|
||||
|
||||
### `analysis_snapshots`
|
||||
Stores computed statistics and AI-generated narratives.
|
||||
|
||||
| Column | Type | Description |
|
||||
|--------|------|-------------|
|
||||
| `id` | INTEGER | Auto-increment primary key |
|
||||
| `date` | TIMESTAMP | When analysis was run |
|
||||
| `period_start` | TIMESTAMP | Analysis period start |
|
||||
| `period_end` | TIMESTAMP | Analysis period end |
|
||||
| `period_label` | VARCHAR | Label (e.g., "last_30_days") |
|
||||
| `metrics_payload` | JSON | StatsService output |
|
||||
| `narrative_report` | JSON | NarrativeService output |
|
||||
| `model_used` | VARCHAR | LLM model name |
|
||||
| `playlist_theme` | VARCHAR | AI-generated theme name |
|
||||
| `playlist_theme_reasoning` | TEXT | AI explanation for theme |
|
||||
| `six_hour_playlist_id` | VARCHAR | Spotify playlist ID |
|
||||
| `daily_playlist_id` | VARCHAR | Spotify playlist ID |
|
||||
| `playlist_composition` | JSON | Track list at snapshot time |
|
||||
|
||||
### `playlist_config`
|
||||
Configuration for managed Spotify playlists.
|
||||
|
||||
| Column | Type | Description |
|
||||
|--------|------|-------------|
|
||||
| `key` | VARCHAR | Config key (primary key, e.g., "six_hour") |
|
||||
| `spotify_id` | VARCHAR | Spotify playlist ID |
|
||||
| `last_updated` | TIMESTAMP | Last update time |
|
||||
| `current_theme` | VARCHAR | Current playlist theme |
|
||||
| `description` | VARCHAR | Playlist description |
|
||||
| `composition` | JSON | Current track list |
|
||||
|
||||
## Schema Modifications (Alembic)
|
||||
|
||||
All schema changes MUST go through Alembic migrations.
|
||||
|
||||
### Creating a New Migration
|
||||
|
||||
```bash
|
||||
cd backend
|
||||
source venv/bin/activate
|
||||
|
||||
# Auto-generate migration from model changes
|
||||
alembic revision --autogenerate -m "description_of_change"
|
||||
|
||||
# Or create empty migration for manual SQL
|
||||
alembic revision -m "description_of_change"
|
||||
```
|
||||
|
||||
### Applying Migrations
|
||||
|
||||
```bash
|
||||
# Apply all pending migrations
|
||||
alembic upgrade head
|
||||
|
||||
# Apply specific migration
|
||||
alembic upgrade <revision_id>
|
||||
|
||||
# Rollback one migration
|
||||
alembic downgrade -1
|
||||
|
||||
# Rollback to specific revision
|
||||
alembic downgrade <revision_id>
|
||||
```
|
||||
|
||||
### Migration Best Practices
|
||||
|
||||
1. **Test locally first** - Always test migrations on a dev database
|
||||
2. **Backup before migrating** - `pg_dump -h 100.91.248.114 -p 5433 -U bnair music_db > backup.sql`
|
||||
3. **One change per migration** - Keep migrations atomic
|
||||
4. **Include rollback logic** - Implement `downgrade()` function
|
||||
5. **Review autogenerated migrations** - They may miss nuances
|
||||
|
||||
### Example Migration
|
||||
|
||||
```python
|
||||
# alembic/versions/xxxx_add_new_column.py
|
||||
from alembic import op
|
||||
import sqlalchemy as sa
|
||||
|
||||
revision = 'xxxx'
|
||||
down_revision = 'yyyy'
|
||||
|
||||
def upgrade():
|
||||
op.add_column('tracks', sa.Column('new_column', sa.String(), nullable=True))
|
||||
|
||||
def downgrade():
|
||||
op.drop_column('tracks', 'new_column')
|
||||
```
|
||||
|
||||
## Direct Database Access
|
||||
|
||||
### Using psql
|
||||
```bash
|
||||
psql -h 100.91.248.114 -p 5433 -U bnair -d music_db
|
||||
```
|
||||
|
||||
### Using Python
|
||||
```python
|
||||
import psycopg2
|
||||
conn = psycopg2.connect(
|
||||
host='100.91.248.114',
|
||||
port=5433,
|
||||
user='bnair',
|
||||
password='Bharath2002',
|
||||
dbname='music_db'
|
||||
)
|
||||
```
|
||||
|
||||
### Common Queries
|
||||
|
||||
```sql
|
||||
-- Recent plays
|
||||
SELECT t.name, t.artist, ph.played_at
|
||||
FROM play_history ph
|
||||
JOIN tracks t ON ph.track_id = t.id
|
||||
ORDER BY ph.played_at DESC
|
||||
LIMIT 10;
|
||||
|
||||
-- Top tracks by play count
|
||||
SELECT t.name, t.artist, COUNT(*) as plays
|
||||
FROM play_history ph
|
||||
JOIN tracks t ON ph.track_id = t.id
|
||||
GROUP BY t.id, t.name, t.artist
|
||||
ORDER BY plays DESC
|
||||
LIMIT 10;
|
||||
|
||||
-- Genre distribution
|
||||
SELECT genre, COUNT(*)
|
||||
FROM artists, jsonb_array_elements_text(genres::jsonb) AS genre
|
||||
GROUP BY genre
|
||||
ORDER BY count DESC;
|
||||
```
|
||||
Reference in New Issue
Block a user