feat: migrate to PostgreSQL and enhance playlist curation

- Migrate database from SQLite to PostgreSQL (100.91.248.114:5433)
- Fix playlist curation to use actual top tracks instead of AI name matching
- Add /playlists/history endpoint for historical playlist viewing
- Add Playlist Archives section to frontend with expandable history
- Add playlist-modify-* scopes to Spotify OAuth for playlist creation
- Rewrite Genius client to use official API (fixes 403 scraping blocks)
- Ensure playlists are created on Spotify before curation attempts
- Add DATABASE.md documentation for PostgreSQL schema
- Add migrations for PlaylistConfig and composition storage
Author: bnair123
Date: 2025-12-30 22:24:56 +04:00
parent 26b4895695
commit 272148c5bf
19 changed files with 1130 additions and 145 deletions

View File

@@ -5,7 +5,7 @@
 ## OVERVIEW
-Personal music analytics dashboard polling Spotify 24/7. Core stack: Python (FastAPI, SQLAlchemy, SQLite) + React (Vite, Tailwind, AntD). Integrates AI (Gemini) for listening narratives.
+Personal music analytics dashboard polling Spotify 24/7. Core stack: Python (FastAPI, SQLAlchemy, PostgreSQL) + React (Vite, Tailwind, AntD). Integrates AI (Gemini) for listening narratives.
 ## STRUCTURE
@@ -54,7 +54,7 @@ Personal music analytics dashboard polling Spotify 24/7. Core stack: Python (Fas
 ## CONVENTIONS
 - **Single Container Multi-Process**: `backend/entrypoint.sh` starts worker + API (Docker anti-pattern, project-specific).
-- **SQLite Persistence**: Production uses SQLite (`music.db`) via Docker volumes.
+- **PostgreSQL Persistence**: Production uses PostgreSQL on internal server (100.91.248.114:5433, database: music_db).
 - **Deduplication**: Ingestion checks `(track_id, played_at)` unique constraint before insert.
 - **Frontend State**: Minimal global state; primarily local component state and API fetching.
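The deduplication convention can be sketched in isolation. This is an illustrative sketch, not project code: the table shape mirrors the `(track_id, played_at)` unique constraint, the `ingest` helper is invented for the example, and stdlib `sqlite3` stands in for the real SQLAlchemy/PostgreSQL layer.

```python
# Illustrative dedup sketch: UNIQUE(track_id, played_at) plus a pre-insert check.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE play_history (
        id INTEGER PRIMARY KEY,
        track_id TEXT NOT NULL,
        played_at TEXT NOT NULL,
        UNIQUE (track_id, played_at)
    )"""
)

def ingest(conn, track_id, played_at):
    # Check the unique pair first so a repeated poll result is skipped quietly
    row = conn.execute(
        "SELECT 1 FROM play_history WHERE track_id = ? AND played_at = ?",
        (track_id, played_at),
    ).fetchone()
    if row:
        return False  # already recorded; skip duplicate
    conn.execute(
        "INSERT INTO play_history (track_id, played_at) VALUES (?, ?)",
        (track_id, played_at),
    )
    conn.commit()
    return True

print(ingest(conn, "abc123", "2025-12-30T10:00:00Z"))  # True: new play
print(ingest(conn, "abc123", "2025-12-30T10:00:00Z"))  # False: duplicate skipped
```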

View File

@@ -79,25 +79,26 @@ Open your browser to: **http://localhost:8991**
         ┌────────┴────────┐
         ▼                 ▼
   ┌──────────┐     ┌──────────────┐
-  │  SQLite  │     │ Spotify API  │
-  │ music.db │     │  Gemini AI   │
+  │PostgreSQL│     │ Spotify API  │
+  │ music_db │     │  Gemini AI   │
   └──────────┘     └──────────────┘
 ```
 - **Backend Container**: Runs both the FastAPI server AND the background Spotify polling worker
 - **Frontend Container**: Nginx serving the React build, proxies `/api/` to backend
-- **Database**: SQLite stored in a Docker named volume (`music_data`) for persistence
+- **Database**: PostgreSQL hosted on internal server (100.91.248.114:5433)
 ## Data Persistence
-Your listening history is stored in a Docker named volume:
-- Volume name: `music_data`
-- Database file: `/app/music.db`
-- Migrations run automatically on container startup
+Your listening history is stored in PostgreSQL:
+- Host: `100.91.248.114:5433`
+- Database: `music_db`
+- Data Location (on server): `/opt/DB/MusicDB/pgdata`
+- Migrations run automatically on container startup via Alembic
 To backup your data:
 ```bash
-docker cp $(docker-compose ps -q backend):/app/music.db ./backup.db
+pg_dump -h 100.91.248.114 -p 5433 -U bnair music_db > backup.sql
 ```
 ## Local Development
@@ -136,4 +137,4 @@ Access at http://localhost:5173 (Vite proxies `/api` to backend automatically)
 | `SPOTIFY_REFRESH_TOKEN` | Yes | Long-lived refresh token from OAuth |
 | `GEMINI_API_KEY` | Yes | Google Gemini API key |
 | `GENIUS_ACCESS_TOKEN` | No | Genius API token for lyrics |
-| `DATABASE_URL` | No | SQLite path (default: `sqlite:///./music.db`) |
+| `DATABASE_URL` | No | PostgreSQL URL (default: `postgresql://bnair:Bharath2002@100.91.248.114:5433/music_db`) |

View File

@@ -3,9 +3,10 @@ FROM python:3.11-slim
 WORKDIR /app
-# Install system dependencies
+# Install system dependencies (including PostgreSQL client libs for psycopg2)
 RUN apt-get update && apt-get install -y --no-install-recommends \
     curl \
+    libpq-dev \
     && rm -rf /var/lib/apt/lists/*
 COPY requirements.txt .

View File

@@ -0,0 +1,32 @@
"""add_composition_to_snapshot
Revision ID: 24fafb6f6e98
Revises: 86ea83950f3d
Create Date: 2025-12-30 10:43:05.933962
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = '24fafb6f6e98'
down_revision: Union[str, Sequence[str], None] = '86ea83950f3d'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
"""Upgrade schema."""
# ### commands auto generated by Alembic - please adjust! ###
op.add_column('analysis_snapshots', sa.Column('playlist_composition', sa.JSON(), nullable=True))
# ### end Alembic commands ###
def downgrade() -> None:
"""Downgrade schema."""
# ### commands auto generated by Alembic - please adjust! ###
op.drop_column('analysis_snapshots', 'playlist_composition')
# ### end Alembic commands ###

View File

@@ -0,0 +1,41 @@
"""add_playlist_config_table
Revision ID: 7e28cc511ef8
Revises: 5ed73db9bab9
Create Date: 2025-12-30 10:30:36.775553
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = '7e28cc511ef8'
down_revision: Union[str, Sequence[str], None] = '5ed73db9bab9'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
"""Upgrade schema."""
# ### commands auto generated by Alembic - please adjust! ###
op.create_table('playlist_config',
sa.Column('key', sa.String(), nullable=False),
sa.Column('spotify_id', sa.String(), nullable=False),
sa.Column('last_updated', sa.DateTime(), nullable=True),
sa.Column('current_theme', sa.String(), nullable=True),
sa.Column('description', sa.String(), nullable=True),
sa.PrimaryKeyConstraint('key')
)
op.create_index(op.f('ix_playlist_config_key'), 'playlist_config', ['key'], unique=False)
# ### end Alembic commands ###
def downgrade() -> None:
"""Downgrade schema."""
# ### commands auto generated by Alembic - please adjust! ###
op.drop_index(op.f('ix_playlist_config_key'), table_name='playlist_config')
op.drop_table('playlist_config')
# ### end Alembic commands ###

View File

@@ -0,0 +1,32 @@
"""add_composition_to_playlist_config
Revision ID: 86ea83950f3d
Revises: 7e28cc511ef8
Create Date: 2025-12-30 10:39:27.121477
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = '86ea83950f3d'
down_revision: Union[str, Sequence[str], None] = '7e28cc511ef8'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
"""Upgrade schema."""
# ### commands auto generated by Alembic - please adjust! ###
op.add_column('playlist_config', sa.Column('composition', sa.JSON(), nullable=True))
# ### end Alembic commands ###
def downgrade() -> None:
"""Downgrade schema."""
# ### commands auto generated by Alembic - please adjust! ###
op.drop_column('playlist_config', 'composition')
# ### end Alembic commands ###

View File

@@ -2,17 +2,33 @@ import os
 from sqlalchemy import create_engine
 from sqlalchemy.orm import sessionmaker, declarative_base
-SQLALCHEMY_DATABASE_URL = os.getenv("DATABASE_URL", "sqlite:///./music.db")
+# PostgreSQL connection configuration
+# Uses docker hostname 'music_db' when running in container, falls back to external IP for local dev
+POSTGRES_HOST = os.getenv("POSTGRES_HOST", "music_db")
+POSTGRES_PORT = os.getenv("POSTGRES_PORT", "5432")
+POSTGRES_USER = os.getenv("POSTGRES_USER", "bnair")
+POSTGRES_PASSWORD = os.getenv("POSTGRES_PASSWORD", "Bharath2002")
+POSTGRES_DB = os.getenv("POSTGRES_DB", "music_db")
-connect_args = {}
-if SQLALCHEMY_DATABASE_URL.startswith("sqlite"):
-    connect_args["check_same_thread"] = False
+# Build the PostgreSQL URL
+# Format: postgresql://user:password@host:port/database
+SQLALCHEMY_DATABASE_URL = os.getenv(
+    "DATABASE_URL",
+    f"postgresql://{POSTGRES_USER}:{POSTGRES_PASSWORD}@{POSTGRES_HOST}:{POSTGRES_PORT}/{POSTGRES_DB}",
+)
-engine = create_engine(SQLALCHEMY_DATABASE_URL, connect_args=connect_args)
+# PostgreSQL connection pool settings for production
+engine = create_engine(
+    SQLALCHEMY_DATABASE_URL,
+    pool_size=5,  # Maintain 5 connections in the pool
+    max_overflow=10,  # Allow up to 10 additional connections
+    pool_pre_ping=True,  # Verify connection health before using
+)
 SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
 Base = declarative_base()
 def get_db():
     db = SessionLocal()
     try:
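The URL fallback in this hunk can be demonstrated without a database: an explicit `DATABASE_URL` wins, otherwise the URL is assembled from the `POSTGRES_*` parts. The `build_db_url` helper below is invented for illustration (the real module does this at import time); defaults mirror the diff.

```python
# Hypothetical helper mirroring the DATABASE_URL fallback logic in database.py.
def build_db_url(env):
    # Defaults taken from the diff above; any of them can be overridden per-key.
    host = env.get("POSTGRES_HOST", "music_db")
    port = env.get("POSTGRES_PORT", "5432")
    user = env.get("POSTGRES_USER", "bnair")
    password = env.get("POSTGRES_PASSWORD", "Bharath2002")
    db = env.get("POSTGRES_DB", "music_db")
    # A full DATABASE_URL takes precedence over the assembled parts.
    return env.get(
        "DATABASE_URL",
        f"postgresql://{user}:{password}@{host}:{port}/{db}",
    )

print(build_db_url({}))  # postgresql://bnair:Bharath2002@music_db:5432/music_db
print(build_db_url({"DATABASE_URL": "postgresql://u:p@localhost:5433/x"}))
```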

View File

@@ -10,6 +10,7 @@ from .models import (
     PlayHistory as PlayHistoryModel,
     Track as TrackModel,
     AnalysisSnapshot,
+    PlaylistConfig,
 )
 from . import schemas
 from .ingest import (
@@ -220,13 +221,18 @@ async def refresh_six_hour_playlist(db: Session = Depends(get_db)):
     end_date = datetime.utcnow()
     start_date = end_date - timedelta(hours=6)
+    spotify_client = get_spotify_client()
     playlist_service = PlaylistService(
         db=db,
-        spotify_client=get_spotify_client(),
+        spotify_client=spotify_client,
         recco_client=get_reccobeats_client(),
         narrative_service=NarrativeService(),
     )
+    # Ensure playlists exist (creates on Spotify if needed)
+    user_id = await spotify_client.get_current_user_id()
+    await playlist_service.ensure_playlists_exist(user_id)
     result = await playlist_service.curate_six_hour_playlist(start_date, end_date)
     snapshot = AnalysisSnapshot(
@@ -239,6 +245,7 @@ async def refresh_six_hour_playlist(db: Session = Depends(get_db)):
         playlist_theme=result.get("theme_name"),
         playlist_theme_reasoning=result.get("description"),
         six_hour_playlist_id=result.get("playlist_id"),
+        playlist_composition=result.get("composition"),
     )
     db.add(snapshot)
     db.commit()
@@ -256,13 +263,18 @@ async def refresh_daily_playlist(db: Session = Depends(get_db)):
     end_date = datetime.utcnow()
     start_date = end_date - timedelta(days=1)
+    spotify_client = get_spotify_client()
     playlist_service = PlaylistService(
         db=db,
-        spotify_client=get_spotify_client(),
+        spotify_client=spotify_client,
         recco_client=get_reccobeats_client(),
         narrative_service=NarrativeService(),
     )
+    # Ensure playlists exist (creates on Spotify if needed)
+    user_id = await spotify_client.get_current_user_id()
+    await playlist_service.ensure_playlists_exist(user_id)
     result = await playlist_service.curate_daily_playlist(start_date, end_date)
     snapshot = AnalysisSnapshot(
@@ -273,6 +285,7 @@ async def refresh_daily_playlist(db: Session = Depends(get_db)):
         metrics_payload={},
         narrative_report={},
         daily_playlist_id=result.get("playlist_id"),
+        playlist_composition=result.get("composition"),
     )
     db.add(snapshot)
     db.commit()
@@ -286,32 +299,71 @@ async def refresh_daily_playlist(db: Session = Depends(get_db)):
 @app.get("/playlists")
 async def get_playlists_metadata(db: Session = Depends(get_db)):
     """Returns metadata for the managed playlists."""
-    latest_snapshot = (
-        db.query(AnalysisSnapshot)
-        .filter(AnalysisSnapshot.six_hour_playlist_id != None)
-        .order_by(AnalysisSnapshot.date.desc())
-        .first()
-    )
+    six_hour_config = (
+        db.query(PlaylistConfig).filter(PlaylistConfig.key == "six_hour").first()
+    )
+    daily_config = (
+        db.query(PlaylistConfig).filter(PlaylistConfig.key == "daily").first()
+    )
     return {
         "six_hour": {
-            "id": latest_snapshot.six_hour_playlist_id
-            if latest_snapshot
+            "id": six_hour_config.spotify_id
+            if six_hour_config
             else os.getenv("SIX_HOUR_PLAYLIST_ID"),
-            "theme": latest_snapshot.playlist_theme if latest_snapshot else "N/A",
-            "reasoning": latest_snapshot.playlist_theme_reasoning
-            if latest_snapshot
-            else "N/A",
-            "last_refresh": latest_snapshot.date.isoformat()
-            if latest_snapshot
+            "theme": six_hour_config.current_theme if six_hour_config else "N/A",
+            "reasoning": six_hour_config.description if six_hour_config else "N/A",
+            "last_refresh": six_hour_config.last_updated.isoformat()
+            if six_hour_config
             else None,
+            "composition": six_hour_config.composition if six_hour_config else [],
         },
         "daily": {
-            "id": latest_snapshot.daily_playlist_id
-            if latest_snapshot
+            "id": daily_config.spotify_id
+            if daily_config
             else os.getenv("DAILY_PLAYLIST_ID"),
-            "last_refresh": latest_snapshot.date.isoformat()
-            if latest_snapshot
+            "theme": daily_config.current_theme if daily_config else "N/A",
+            "reasoning": daily_config.description if daily_config else "N/A",
+            "last_refresh": daily_config.last_updated.isoformat()
+            if daily_config
             else None,
+            "composition": daily_config.composition if daily_config else [],
         },
     }
+@app.get("/playlists/history")
+def get_playlist_history(
+    limit: int = Query(default=20, ge=1, le=100),
+    db: Session = Depends(get_db),
+):
+    """Returns historical playlist snapshots."""
+    snapshots = (
+        db.query(AnalysisSnapshot)
+        .filter(
+            (AnalysisSnapshot.playlist_theme.isnot(None))
+            | (AnalysisSnapshot.six_hour_playlist_id.isnot(None))
+            | (AnalysisSnapshot.daily_playlist_id.isnot(None))
+        )
+        .order_by(AnalysisSnapshot.date.desc())
+        .limit(limit)
+        .all()
+    )
+    result = []
+    for snap in snapshots:
+        result.append(
+            {
+                "id": snap.id,
+                "date": snap.date.isoformat() if snap.date else None,
+                "period_label": snap.period_label,
+                "theme": snap.playlist_theme,
+                "reasoning": snap.playlist_theme_reasoning,
+                "six_hour_id": snap.six_hour_playlist_id,
+                "daily_id": snap.daily_playlist_id,
+                "composition": snap.playlist_composition or [],
+            }
        )
+    return {"history": result}

View File

@@ -130,3 +130,18 @@ class AnalysisSnapshot(Base):
     daily_playlist_id = Column(
         String, nullable=True
     )  # Spotify playlist ID for 24-hour playlist
+    playlist_composition = Column(
+        JSON, nullable=True
+    )  # Store the track list at this snapshot
+class PlaylistConfig(Base):
+    __tablename__ = "playlist_config"
+    key = Column(String, primary_key=True, index=True)  # e.g., "six_hour", "daily"
+    spotify_id = Column(String, nullable=False)
+    last_updated = Column(DateTime, default=datetime.utcnow)
+    current_theme = Column(String, nullable=True)
+    description = Column(String, nullable=True)
+    composition = Column(JSON, nullable=True)

View File

@@ -1,35 +1,103 @@
 import os
-import lyricsgenius
+import requests
 from typing import Optional, Dict, Any
+import re
 class GeniusClient:
     def __init__(self):
         self.access_token = os.getenv("GENIUS_ACCESS_TOKEN")
-        if self.access_token:
-            self.genius = lyricsgenius.Genius(self.access_token, verbose=False, remove_section_headers=True)
-        else:
-            print("WARNING: GENIUS_ACCESS_TOKEN not found. Lyrics enrichment will be skipped.")
+        self.base_url = "https://api.genius.com"
+        self.headers = (
+            {"Authorization": f"Bearer {self.access_token}"}
+            if self.access_token
+            else {}
+        )
+        if not self.access_token:
+            print(
+                "WARNING: GENIUS_ACCESS_TOKEN not found. Lyrics enrichment will be skipped."
+            )
             self.genius = None
+        else:
+            self.genius = True
     def search_song(self, title: str, artist: str) -> Optional[Dict[str, Any]]:
+        """
+        Searches for a song on Genius and returns metadata + lyrics.
+        """
         if not self.genius:
             return None
         try:
             # Clean up title (remove "Feat.", "Remastered", etc for better search match)
             clean_title = title.split(" - ")[0].split("(")[0].strip()
-            song = self.genius.search_song(clean_title, artist)
-            if song:
-                return {
-                    "lyrics": song.lyrics,
-                    "image_url": song.song_art_image_url,
-                    "artist_image_url": song.primary_artist.image_url
-                }
+            query = f"{clean_title} {artist}"
+            response = requests.get(
+                f"{self.base_url}/search",
+                headers=self.headers,
+                params={"q": query},
+                timeout=10,
+            )
+            if response.status_code != 200:
+                print(f"Genius API Error: {response.status_code}")
+                return None
+            data = response.json()
+            hits = data.get("response", {}).get("hits", [])
+            if not hits:
+                return None
+            song = hits[0]["result"]
+            lyrics = self._scrape_lyrics(song.get("url")) if song.get("url") else None
+            return {
+                "lyrics": lyrics,
+                "image_url": song.get("song_art_image_url")
+                or song.get("header_image_url"),
+                "artist_image_url": song.get("primary_artist", {}).get("image_url"),
+            }
         except Exception as e:
             print(f"Genius Search Error for {title} by {artist}: {e}")
             return None
+    def _scrape_lyrics(self, url: str) -> Optional[str]:
+        try:
+            headers = {
+                "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
+                "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
+                "Accept-Language": "en-US,en;q=0.5",
+            }
+            response = requests.get(url, headers=headers, timeout=10)
+            if response.status_code != 200:
+                return None
+            html = response.text
+            lyrics_divs = re.findall(
+                r'<div[^>]*data-lyrics-container="true"[^>]*>(.*?)</div>',
+                html,
+                re.DOTALL,
+            )
+            if not lyrics_divs:
+                return None
+            lyrics = ""
+            for div in lyrics_divs:
+                text = re.sub(r"<br\s*/?>", "\n", div)
+                text = re.sub(r"<[^>]+>", "", text)
+                text = (
+                    text.replace("&amp;", "&")
+                    .replace("&quot;", '"')
+                    .replace("&#x27;", "'")
+                )
+                lyrics += text + "\n"
+            return lyrics.strip() if lyrics.strip() else None
+        except Exception as e:
+            print(f"Lyrics scrape error: {e}")
+            return None
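The regex extraction in `_scrape_lyrics` can be exercised on a canned snippet without any network call. This is a standalone sketch with hypothetical HTML; real Genius pages can nest markup inside the lyrics container, which the non-greedy `</div>` match would cut short, so treat it as illustrative only.

```python
# Standalone demo of the data-lyrics-container extraction on canned HTML.
import re

html = (
    '<div data-lyrics-container="true">Hello darkness<br/>my old friend</div>'
    '<div class="ad">ignore me</div>'
    '<div data-lyrics-container="true">I&#x27;ve come to talk again</div>'
)

# Same pattern as the client: non-greedy body match, DOTALL for multi-line lyrics
lyrics_divs = re.findall(
    r'<div[^>]*data-lyrics-container="true"[^>]*>(.*?)</div>',
    html,
    re.DOTALL,
)

lyrics = ""
for div in lyrics_divs:
    text = re.sub(r"<br\s*/?>", "\n", div)  # keep line breaks as newlines
    text = re.sub(r"<[^>]+>", "", text)     # strip any remaining tags
    text = text.replace("&amp;", "&").replace("&quot;", '"').replace("&#x27;", "'")
    lyrics += text + "\n"

print(lyrics.strip())
# Hello darkness
# my old friend
# I've come to talk again
```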

View File

@@ -119,6 +119,7 @@ class NarrativeService:
 2. Provide a "description" (2-3 sentences explaining why).
 3. Identify 10-15 "curated_tracks" (song names only) that fit this vibe and the artists listed.
 4. Return ONLY valid JSON.
+5. Do NOT output internal variable names (e.g. 'part_of_day', 'avg_valence') in the description. Translate them to natural language (e.g. 'morning listens', 'happy vibe').
 **REQUIRED JSON:**
 {{
@@ -192,6 +193,7 @@ class NarrativeService:
 2. Be specific - reference actual metrics from the data.
 3. Be playful but not cruel.
 4. Return ONLY valid JSON.
+5. Translate all technical metrics (e.g. 'discovery_rate', 'valence', 'hhi') into natural language descriptions. Do NOT use the variable names themselves.
 **LISTENING HIGHLIGHTS:**
 - Peak listening: {peak_listening}

View File

@@ -24,89 +24,239 @@ class PlaylistService:
async def ensure_playlists_exist(self, user_id: str) -> Dict[str, str]: async def ensure_playlists_exist(self, user_id: str) -> Dict[str, str]:
"""Check/create playlists. Returns {six_hour_id, daily_id}.""" """Check/create playlists. Returns {six_hour_id, daily_id}."""
six_hour_env = os.getenv("SIX_HOUR_PLAYLIST_ID") from app.models import PlaylistConfig
daily_env = os.getenv("DAILY_PLAYLIST_ID")
if not six_hour_env: six_hour_config = (
six_hour_data = await self.spotify.create_playlist( self.db.query(PlaylistConfig)
user_id=user_id, .filter(PlaylistConfig.key == "six_hour")
name="Short and Sweet", .first()
description="AI-curated 6-hour playlists based on your listening habits", )
) daily_config = (
six_hour_env = str(six_hour_data["id"]) self.db.query(PlaylistConfig).filter(PlaylistConfig.key == "daily").first()
)
if not daily_env: six_hour_id = six_hour_config.spotify_id if six_hour_config else None
daily_data = await self.spotify.create_playlist( daily_id = daily_config.spotify_id if daily_config else None
user_id=user_id,
name="Proof of Commitment",
description="Your daily 24-hour mix showing your music journey",
)
daily_env = str(daily_data["id"])
return {"six_hour_id": str(six_hour_env), "daily_id": str(daily_env)} if not six_hour_id:
six_hour_id = os.getenv("SIX_HOUR_PLAYLIST_ID")
if not six_hour_id:
six_hour_data = await self.spotify.create_playlist(
user_id=user_id,
name="Short and Sweet",
description="AI-curated 6-hour playlists based on your listening habits",
)
six_hour_id = str(six_hour_data["id"])
self._save_playlist_config("six_hour", six_hour_id, "Short and Sweet")
if not daily_id:
daily_id = os.getenv("DAILY_PLAYLIST_ID")
if not daily_id:
daily_data = await self.spotify.create_playlist(
user_id=user_id,
name="Proof of Commitment",
description="Your daily 24-hour mix showing your music journey",
)
daily_id = str(daily_data["id"])
self._save_playlist_config("daily", daily_id, "Proof of Commitment")
return {"six_hour_id": six_hour_id, "daily_id": daily_id}
def _save_playlist_config(
self,
key: str,
spotify_id: str,
description: str = None,
theme: str = None,
composition: List[Dict[str, Any]] = None,
):
from app.models import PlaylistConfig
config = self.db.query(PlaylistConfig).filter(PlaylistConfig.key == key).first()
if not config:
config = PlaylistConfig(key=key, spotify_id=spotify_id)
self.db.add(config)
else:
config.spotify_id = spotify_id
if description:
config.description = description
if theme:
config.current_theme = theme
if composition:
config.composition = composition
config.last_updated = datetime.utcnow()
self.db.commit()
async def _hydrate_tracks(
self, track_ids: List[str], sources: Dict[str, str]
) -> List[Dict[str, Any]]:
"""Fetch full track details for a list of IDs."""
from app.models import Track
db_tracks = self.db.query(Track).filter(Track.id.in_(track_ids)).all()
track_map = {t.id: t for t in db_tracks}
missing_ids = [tid for tid in track_ids if tid not in track_map]
if missing_ids:
spotify_tracks = await self.spotify.get_tracks(missing_ids)
for st in spotify_tracks:
if not st:
continue
track_map[st["id"]] = {
"id": st["id"],
"name": st["name"],
"artist": ", ".join([a["name"] for a in st["artists"]]),
"image": st["album"]["images"][0]["url"]
if st["album"]["images"]
else None,
"uri": st["uri"],
}
result = []
for tid in track_ids:
track = track_map.get(tid)
if not track:
continue
if hasattr(track, "name") and not isinstance(track, dict):
track_data = {
"id": track.id,
"name": track.name,
"artist": track.artist,
"image": track.image_url,
"uri": f"spotify:track:{track.id}",
}
else:
track_data = track
track_data["source"] = sources.get(tid, "unknown")
result.append(track_data)
return result
async def curate_six_hour_playlist( async def curate_six_hour_playlist(
self, period_start: datetime, period_end: datetime self, period_start: datetime, period_end: datetime
) -> Dict[str, Any]: ) -> Dict[str, Any]:
"""Generate 6-hour playlist (15 curated + 15 recommendations).""" """Generate 6-hour playlist (15 curated + 15 recommendations)."""
from app.models import Track from app.models import Track, PlayHistory
from app.services.stats_service import StatsService from app.services.stats_service import StatsService
from sqlalchemy import func
stats = StatsService(self.db) stats = StatsService(self.db)
data = stats.generate_full_report(period_start, period_end) data = stats.generate_full_report(period_start, period_end)
top_tracks_period = [t["id"] for t in data["volume"].get("top_tracks", [])][:15]
if len(top_tracks_period) < 5:
fallback_tracks = (
self.db.query(Track.id, func.count(PlayHistory.id).label("cnt"))
.join(PlayHistory, Track.id == PlayHistory.track_id)
.group_by(Track.id)
.order_by(func.count(PlayHistory.id).desc())
.limit(15)
.all()
)
top_tracks_period = [tid for tid, _ in fallback_tracks]
listening_data = { listening_data = {
"peak_hour": data["time_habits"]["peak_hour"], "peak_hour": data["time_habits"].get("peak_hour", 12),
"avg_energy": data["vibe"]["avg_energy"], "avg_energy": data["vibe"].get("avg_energy", 0.5),
"avg_valence": data["vibe"]["avg_valence"], "avg_valence": data["vibe"].get("avg_valence", 0.5),
"total_plays": data["volume"]["total_plays"], "total_plays": data["volume"].get("total_plays", 0),
"top_artists": data["volume"]["top_artists"][:10], "top_artists": data["volume"].get("top_artists", [])[:10],
} }
theme_result = self.narrative.generate_playlist_theme(listening_data) theme_result = self.narrative.generate_playlist_theme(listening_data)
curated_track_names = theme_result.get("curated_tracks", []) curated_details = []
curated_tracks: List[str] = [] for tid in top_tracks_period:
for name in curated_track_names: track_obj = self.db.query(Track).filter(Track.id == tid).first()
track = self.db.query(Track).filter(Track.name.ilike(f"%{name}%")).first() if track_obj:
if track: curated_details.append(
curated_tracks.append(str(track.id)) {
"id": str(track_obj.id),
"energy": track_obj.energy,
"source": "history",
}
)
recommendations: List[str] = [] rec_details = []
if curated_tracks: seed_ids = top_tracks_period[:5] if top_tracks_period else []
recs = await self.recco.get_recommendations( if seed_ids:
seed_ids=curated_tracks[:5], raw_recs = await self.recco.get_recommendations(
seed_ids=seed_ids,
size=15, size=15,
) )
recommendations = [ for r in raw_recs:
str(r.get("spotify_id") or r.get("id")) rec_id = str(r.get("spotify_id") or r.get("id"))
for r in recs if rec_id:
if r.get("spotify_id") or r.get("id") rec_details.append(
] {
"id": rec_id,
"energy": r.get("energy"),
"source": "recommendation",
}
)
final_tracks = curated_tracks[:15] + recommendations[:15] all_candidates = curated_details[:15] + rec_details[:15]
optimized_tracks = self._optimize_playlist_flow(all_candidates)
final_track_ids = [t["id"] for t in optimized_tracks]
sources = {t["id"]: t["source"] for t in optimized_tracks}
# Hydrate for persistence/display
full_tracks = await self._hydrate_tracks(final_track_ids, sources)
playlist_id = None
from app.models import PlaylistConfig
config = (
self.db.query(PlaylistConfig)
.filter(PlaylistConfig.key == "six_hour")
.first()
)
if config:
playlist_id = config.spotify_id
if not playlist_id:
playlist_id = os.getenv("SIX_HOUR_PLAYLIST_ID")
playlist_id = os.getenv("SIX_HOUR_PLAYLIST_ID")
if playlist_id: if playlist_id:
theme_name = f"Short and Sweet - {theme_result['theme_name']}"
desc = f"{theme_result['description']}\n\nCurated: {len(curated_details)} tracks + {len(rec_details)} recommendations"
await self.spotify.update_playlist_details( await self.spotify.update_playlist_details(
playlist_id=playlist_id, playlist_id=playlist_id,
name=f"Short and Sweet - {theme_result['theme_name']}", name=theme_name,
description=( description=desc,
f"{theme_result['description']}\n\nCurated: {len(curated_tracks)} tracks + {len(recommendations)} recommendations"
),
) )
await self.spotify.replace_playlist_tracks( await self.spotify.replace_playlist_tracks(
playlist_id=playlist_id, playlist_id=playlist_id,
track_uris=[f"spotify:track:{tid}" for tid in final_tracks], track_uris=[f"spotify:track:{tid}" for tid in final_track_ids],
)
self._save_playlist_config(
"six_hour",
playlist_id,
description=desc,
theme=theme_result["theme_name"],
composition=full_tracks,
) )
return { return {
"playlist_id": playlist_id, "playlist_id": playlist_id,
"theme_name": theme_result["theme_name"], "theme_name": theme_result["theme_name"],
"description": theme_result["description"], "description": theme_result["description"],
"track_count": len(final_tracks), "track_count": len(final_track_ids),
"curated_count": len(curated_tracks), "sources": sources,
"rec_count": len(recommendations), "composition": full_tracks,
"curated_count": len(curated_details),
"rec_count": len(rec_details),
"refreshed_at": datetime.utcnow().isoformat(), "refreshed_at": datetime.utcnow().isoformat(),
} }
@@ -120,33 +270,86 @@ class PlaylistService:
stats = StatsService(self.db) stats = StatsService(self.db)
         data = stats.generate_full_report(period_start, period_end)
-        top_all_time = self._get_top_all_time_tracks(limit=30)
-        recent_tracks = [track["id"] for track in data["volume"]["top_tracks"][:20]]
-        final_tracks = (top_all_time + recent_tracks)[:50]
+        top_all_time_ids = self._get_top_all_time_tracks(limit=30)
+        recent_tracks_ids = [track["id"] for track in data["volume"]["top_tracks"][:20]]
+        favorites_details = []
+        for tid in top_all_time_ids:
+            track_obj = self.db.query(Track).filter(Track.id == tid).first()
+            if track_obj:
+                favorites_details.append(
+                    {
+                        "id": str(track_obj.id),
+                        "energy": track_obj.energy,
+                        "source": "favorite_all_time",
+                    }
+                )
+        discovery_details = []
+        for tid in recent_tracks_ids:
+            track_obj = self.db.query(Track).filter(Track.id == tid).first()
+            if track_obj:
+                discovery_details.append(
+                    {
+                        "id": str(track_obj.id),
+                        "energy": track_obj.energy,
+                        "source": "recent_discovery",
+                    }
+                )
+        all_candidates = favorites_details + discovery_details
+        optimized_tracks = self._optimize_playlist_flow(all_candidates)
+        final_track_ids = [t["id"] for t in optimized_tracks]
+        sources = {t["id"]: t["source"] for t in optimized_tracks}
+        # Hydrate for persistence/display
+        full_tracks = await self._hydrate_tracks(final_track_ids, sources)
-        playlist_id = os.getenv("DAILY_PLAYLIST_ID")
+        playlist_id = None
+        from app.models import PlaylistConfig
+        config = (
+            self.db.query(PlaylistConfig).filter(PlaylistConfig.key == "daily").first()
+        )
+        if config:
+            playlist_id = config.spotify_id
+        if not playlist_id:
+            playlist_id = os.getenv("DAILY_PLAYLIST_ID")
         theme_name = f"Proof of Commitment - {datetime.utcnow().date().isoformat()}"
         if playlist_id:
+            desc = (
+                f"{theme_name} reflects the past 24 hours plus your all-time devotion."
+            )
             await self.spotify.update_playlist_details(
                 playlist_id=playlist_id,
                 name=theme_name,
-                description=(
-                    f"{theme_name} reflects the past 24 hours plus your all-time devotion."
-                ),
+                description=desc,
             )
             await self.spotify.replace_playlist_tracks(
                 playlist_id=playlist_id,
-                track_uris=[f"spotify:track:{tid}" for tid in final_tracks],
+                track_uris=[f"spotify:track:{tid}" for tid in final_track_ids],
             )
+            self._save_playlist_config(
+                "daily",
+                playlist_id,
+                description=desc,
+                theme=theme_name,
+                composition=full_tracks,
+            )
         return {
             "playlist_id": playlist_id,
             "theme_name": theme_name,
             "description": "Daily mix refreshed with your favorites and discoveries.",
-            "track_count": len(final_tracks),
-            "favorites_count": len(top_all_time),
-            "recent_discoveries_count": len(recent_tracks),
+            "track_count": len(final_track_ids),
+            "sources": sources,
+            "composition": full_tracks,
+            "favorites_count": len(favorites_details),
+            "recent_discoveries_count": len(discovery_details),
             "refreshed_at": datetime.utcnow().isoformat(),
         }
@@ -165,3 +368,29 @@ class PlaylistService:
         )
         return [track_id for track_id, _ in result]
+
+    def _optimize_playlist_flow(
+        self, tracks: List[Dict[str, Any]]
+    ) -> List[Dict[str, Any]]:
+        """
+        Sort tracks to create a smooth flow (Energy Ramp).
+        Strategy: Sort by energy (Low -> High -> Medium).
+        """
+        if not any("energy" in t for t in tracks):
+            return tracks
+        for t in tracks:
+            if "energy" not in t or t["energy"] is None:
+                t["energy"] = 0.5
+        sorted_tracks = sorted(tracks, key=lambda x: x["energy"])
+        n = len(sorted_tracks)
+        low_end = int(n * 0.3)
+        high_start = int(n * 0.7)
+        low_energy = sorted_tracks[:low_end]
+        medium_energy = sorted_tracks[low_end:high_start]
+        high_energy = sorted_tracks[high_start:]
+        return low_energy + high_energy + medium_energy
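The energy-ramp ordering above is easy to sanity-check in isolation. Below is a hypothetical standalone copy of the same logic (named `energy_ramp` here for illustration; it is not the service method itself) that runs without the app:

```python
def energy_ramp(tracks):
    """Order tracks low -> high -> medium energy, mirroring the ramp strategy."""
    if not any("energy" in t for t in tracks):
        return tracks
    for t in tracks:
        if t.get("energy") is None:
            t["energy"] = 0.5  # neutral default when audio features are missing
    ordered = sorted(tracks, key=lambda t: t["energy"])
    n = len(ordered)
    low_end, high_start = int(n * 0.3), int(n * 0.7)
    # Low-energy opener, high-energy middle, medium-energy cooldown
    return ordered[:low_end] + ordered[high_start:] + ordered[low_end:high_start]


demo = [{"id": i, "energy": e} for i, e in enumerate([0.8, 0.2, 0.5, 0.9, 0.3])]
print([t["energy"] for t in energy_ramp(demo)])
```

With five tracks the low block holds one track and the high block starts at index three, so the 0.2 track opens, the 0.8/0.9 pair lands in the middle, and the 0.3/0.5 pair closes the playlist.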
View File
@@ -71,6 +71,26 @@ class SpotifyClient:
             return None
         return response.json()
+
+    async def get_tracks(self, track_ids: List[str]) -> List[Dict[str, Any]]:
+        """Fetch multiple tracks by ID."""
+        if not track_ids:
+            return []
+        token = await self.get_access_token()
+        # Spotify's /tracks endpoint accepts at most 50 IDs per request;
+        # any IDs beyond the first 50 are silently dropped here.
+        ids_param = ",".join(track_ids[:50])
+        async with httpx.AsyncClient() as client:
+            response = await client.get(
+                f"{SPOTIFY_API_BASE}/tracks",
+                params={"ids": ids_param},
+                headers={"Authorization": f"Bearer {token}"},
+            )
+        if response.status_code != 200:
+            print(f"Error fetching tracks: {response.text}")
+            return []
+        return response.json().get("tracks", [])
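Because the endpoint caps a single request at 50 IDs, callers with longer lists need to batch. A minimal sketch of a batching helper (the name `chunked` and the call-site are hypothetical, not part of the client):

```python
def chunked(ids, size=50):
    """Yield successive batches of at most `size` IDs (Spotify /tracks limit)."""
    for i in range(0, len(ids), size):
        yield ids[i:i + size]


# Hypothetical call-site sketch inside an async method of the client:
#     tracks = []
#     for batch in chunked(track_ids):
#         tracks.extend(await self.get_tracks(batch))

print([len(batch) for batch in chunked(list(range(120)))])
```

A 120-ID list yields batches of 50, 50, and 20, so no IDs are dropped.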
     async def get_artists(self, artist_ids: List[str]) -> List[Dict[str, Any]]:
         """
         Fetches artist details (including genres) for a list of artist IDs.
View File
@@ -10,7 +10,7 @@ tenacity==8.2.3
 python-dateutil==2.9.0.post0
 requests==2.31.0
 alembic==1.13.1
+psycopg2-binary==2.9.9
 scikit-learn==1.4.0
-lyricsgenius==3.0.1
 google-genai==1.56.0
 openai>=1.0.0
View File
@@ -16,8 +16,9 @@ from http.server import HTTPServer, BaseHTTPRequestHandler
 # CONFIGURATION - You can hardcode these or input them when prompted
 SPOTIFY_CLIENT_ID = input("Enter your Spotify Client ID: ").strip()
 SPOTIFY_CLIENT_SECRET = input("Enter your Spotify Client Secret: ").strip()
-REDIRECT_URI = "http://localhost:8888/callback"
-SCOPE = "user-read-recently-played user-read-playback-state"
+REDIRECT_URI = "http://127.0.0.1:8888/callback"
+SCOPE = "user-read-recently-played user-read-playback-state playlist-modify-public playlist-modify-private"

 class RequestHandler(BaseHTTPRequestHandler):
     def do_GET(self):
@@ -36,6 +37,7 @@ class RequestHandler(BaseHTTPRequestHandler):
 
         # Shut down server
         raise KeyboardInterrupt
+
 def get_token(code):
     url = "https://accounts.spotify.com/api/token"
     payload = {
@@ -49,24 +51,27 @@ def get_token(code):
     response = requests.post(url, data=payload)
     if response.status_code == 200:
         data = response.json()
-        print("\n" + "="*50)
+        print("\n" + "=" * 50)
         print("SUCCESS! HERE ARE YOUR CREDENTIALS")
-        print("="*50)
+        print("=" * 50)
         print(f"\nSPOTIFY_REFRESH_TOKEN={data['refresh_token']}")
         print(f"SPOTIFY_CLIENT_ID={SPOTIFY_CLIENT_ID}")
         print(f"SPOTIFY_CLIENT_SECRET={SPOTIFY_CLIENT_SECRET}")
         print("\nSave these in your .env file or share them with the agent.")
-        print("="*50 + "\n")
+        print("=" * 50 + "\n")
     else:
         print("Error getting token:", response.text)
+
 def start_auth():
-    auth_url = "https://accounts.spotify.com/authorize?" + urllib.parse.urlencode({
-        "response_type": "code",
-        "client_id": SPOTIFY_CLIENT_ID,
-        "scope": SCOPE,
-        "redirect_uri": REDIRECT_URI,
-    })
+    auth_url = "https://accounts.spotify.com/authorize?" + urllib.parse.urlencode(
+        {
+            "response_type": "code",
+            "client_id": SPOTIFY_CLIENT_ID,
+            "scope": SCOPE,
+            "redirect_uri": REDIRECT_URI,
+        }
+    )
     print(f"Opening browser to: {auth_url}")
     try:
@@ -74,7 +79,7 @@ def start_auth():
     except:
         print(f"Could not open browser. Please manually visit: {auth_url}")

-    server_address = ('', 8888)
+    server_address = ("", 8888)
     httpd = HTTPServer(server_address, RequestHandler)
     print("Listening on port 8888...")
     try:
@@ -83,5 +88,6 @@ def start_auth():
         pass
     httpd.server_close()
+
 if __name__ == "__main__":
     start_auth()
View File
@@ -0,0 +1,126 @@
import pytest
from unittest.mock import Mock, AsyncMock, MagicMock
from datetime import datetime

from app.services.playlist_service import PlaylistService
from app.models import PlaylistConfig, Track


@pytest.fixture
def mock_db():
    session = MagicMock()
    # Default: queries find no existing rows
    session.query.return_value.filter.return_value.first.return_value = None
    return session


@pytest.fixture
def mock_spotify():
    client = AsyncMock()
    client.create_playlist.return_value = {"id": "new_playlist_id"}
    client.get_tracks.return_value = []
    return client


@pytest.fixture
def mock_recco():
    return AsyncMock()


@pytest.fixture
def mock_narrative():
    service = Mock()
    service.generate_playlist_theme.return_value = {
        "theme_name": "Test Theme",
        "description": "Test Description",
        "curated_tracks": [],
    }
    return service


@pytest.fixture
def playlist_service(mock_db, mock_spotify, mock_recco, mock_narrative):
    return PlaylistService(mock_db, mock_spotify, mock_recco, mock_narrative)


@pytest.mark.asyncio
async def test_ensure_playlists_exist_creates_new(
    playlist_service, mock_db, mock_spotify
):
    # Setup: DB empty, env vars assumed unset (or mocked)
    mock_db.query.return_value.filter.return_value.first.return_value = None

    result = await playlist_service.ensure_playlists_exist("user123")

    assert result["six_hour_id"] == "new_playlist_id"
    assert result["daily_id"] == "new_playlist_id"
    assert mock_spotify.create_playlist.call_count == 2
    # Verify persistence: one add/commit per playlist
    assert mock_db.add.call_count == 2
    assert mock_db.commit.call_count == 2


@pytest.mark.asyncio
async def test_ensure_playlists_exist_loads_from_db(
    playlist_service, mock_db, mock_spotify
):
    # Setup: DB already holds both configs
    mock_six = PlaylistConfig(key="six_hour", spotify_id="db_six_id")
    mock_daily = PlaylistConfig(key="daily", spotify_id="db_daily_id")

    # ensure_playlists_exist calls query(PlaylistConfig).filter(...).first()
    # twice; return mock_six on the first call and mock_daily on the second.
    # Relying on call order is fragile, but acceptable for a unit test.
    mock_filter = mock_db.query.return_value.filter
    mock_filter.side_effect = None  # clear any side_effect set elsewhere
    mock_filter.return_value.first.side_effect = [mock_six, mock_daily]

    result = await playlist_service.ensure_playlists_exist("user123")

    assert result["six_hour_id"] == "db_six_id"
    assert result["daily_id"] == "db_daily_id"
    mock_spotify.create_playlist.assert_not_called()


def test_optimize_playlist_flow(playlist_service):
    tracks = [
        {"id": "1", "energy": 0.8},  # high
        {"id": "2", "energy": 0.2},  # low
        {"id": "3", "energy": 0.5},  # medium
        {"id": "4", "energy": 0.9},  # high
        {"id": "5", "energy": 0.3},  # low
    ]
    # Sorted by energy: 2(0.2), 5(0.3), 3(0.5), 1(0.8), 4(0.9). With n=5 the
    # low block is the first int(5 * 0.3) = 1 track and the high block starts
    # at int(5 * 0.7) = 3, so low + high + medium = [2] + [1, 4] + [5, 3].
    optimized = playlist_service._optimize_playlist_flow(tracks)
    ids = [t["id"] for t in optimized]

    assert ids == ["2", "1", "4", "5", "3"]
    assert optimized[0]["energy"] == 0.2
    assert optimized[1]["energy"] == 0.8
View File
@@ -7,21 +7,12 @@ services:
     image: ghcr.io/bnair123/musicanalyser:latest
     container_name: music-analyser-backend
     restart: unless-stopped
-    volumes:
-      - music_data:/app/data
+    env_file:
+      - .env
     environment:
-      - DATABASE_URL=sqlite:////app/data/music.db
-      - SPOTIFY_CLIENT_ID=${SPOTIFY_CLIENT_ID}
-      - SPOTIFY_CLIENT_SECRET=${SPOTIFY_CLIENT_SECRET}
-      - SPOTIFY_REFRESH_TOKEN=${SPOTIFY_REFRESH_TOKEN}
-      - GEMINI_API_KEY=${GEMINI_API_KEY}
-      - GENIUS_ACCESS_TOKEN=${GENIUS_ACCESS_TOKEN}
-      - OPENAI_API_KEY=${OPENAI_API_KEY}
-      - OPENAI_APIKEY=${OPENAI_APIKEY}
-      - SIX_HOUR_PLAYLIST_ID=${SIX_HOUR_PLAYLIST_ID}
-      - DAILY_PLAYLIST_ID=${DAILY_PLAYLIST_ID}
+      - DATABASE_URL=postgresql://bnair:Bharath2002@music_db:5432/music_db
     ports:
-      - '8000:8000'
+      - '8088:8000'
     networks:
       - dockernet
     healthcheck:
@@ -45,10 +36,6 @@ services:
       backend:
         condition: service_healthy

-volumes:
-  music_data:
-    driver: local
-
 networks:
   dockernet:
     external: true
View File: docs/DATABASE.md (new file, 271 lines)
@@ -0,0 +1,271 @@
# Database Documentation
## PostgreSQL Connection Details
| Property | Value |
|----------|-------|
| Host | `100.91.248.114` |
| Port | `5433` |
| User | `bnair` |
| Password | `Bharath2002` |
| Database | `music_db` |
| Data Location (on server) | `/opt/DB/MusicDB/pgdata` |
### Connection String
```
postgresql://bnair:Bharath2002@100.91.248.114:5433/music_db
```
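When wiring `DATABASE_URL` into a service, it helps to sanity-check the DSN before handing it to a driver. A minimal standard-library sketch (the values are the ones documented above; the validation step itself is an illustration, not part of the app):

```python
from urllib.parse import urlsplit

DSN = "postgresql://bnair:Bharath2002@100.91.248.114:5433/music_db"

# urlsplit handles any scheme, so it parses host/port/db out of the DSN
parts = urlsplit(DSN)
print(parts.hostname, parts.port, parts.path.lstrip("/"))
```

This catches malformed DSNs (wrong port, missing database name) early, before SQLAlchemy or psycopg2 raises a less readable connection error.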
## Schema Overview
```
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
│ artists │ │ track_artists │ │ tracks │
├─────────────────┤ ├──────────────────┤ ├─────────────────┤
│ id (PK) │◄────┤ artist_id (FK) │ │ id (PK) │
│ name │ │ track_id (FK) │────►│ reccobeats_id │
│ genres (JSON) │ └──────────────────┘ │ name │
│ image_url │ │ artist │
└─────────────────┘ │ album │
│ image_url │
│ duration_ms │
│ popularity │
│ raw_data (JSON) │
│ danceability │
│ energy │
│ key │
│ ... (audio) │
│ genres (JSON) │
│ lyrics │
│ created_at │
│ updated_at │
└─────────────────┘
┌─────────────────────┐ ┌─────────────────┐
│ analysis_snapshots │ │ play_history │
├─────────────────────┤ ├─────────────────┤
│ id (PK) │ │ id (PK) │
│ date │ │ track_id (FK) │
│ period_start │ │ played_at │
│ period_end │ │ context_uri │
│ period_label │ │ listened_ms │
│ metrics_payload │ │ skipped │
│ narrative_report │ │ source │
│ model_used │ └─────────────────┘
│ playlist_theme │
│ ... (playlist) │
│ playlist_composition│
└─────────────────────┘
┌─────────────────────┐
│ playlist_config │
├─────────────────────┤
│ key (PK) │
│ spotify_id │
│ last_updated │
│ current_theme │
│ description │
│ composition (JSON) │
└─────────────────────┘
```
## Tables
### `tracks`
Central entity storing Spotify track metadata and enriched audio features.
| Column | Type | Description |
|--------|------|-------------|
| `id` | VARCHAR | Spotify track ID (primary key) |
| `reccobeats_id` | VARCHAR | ReccoBeats UUID for audio features |
| `name` | VARCHAR | Track title |
| `artist` | VARCHAR | Display artist string (e.g., "Drake, Future") |
| `album` | VARCHAR | Album name |
| `image_url` | VARCHAR | Album art URL |
| `duration_ms` | INTEGER | Track duration in milliseconds |
| `popularity` | INTEGER | Spotify popularity score (0-100) |
| `raw_data` | JSON | Full Spotify API response |
| `danceability` | FLOAT | Audio feature (0.0-1.0) |
| `energy` | FLOAT | Audio feature (0.0-1.0) |
| `key` | INTEGER | Musical key (0-11) |
| `loudness` | FLOAT | Audio feature (dB) |
| `mode` | INTEGER | Major (1) or minor (0) |
| `speechiness` | FLOAT | Audio feature (0.0-1.0) |
| `acousticness` | FLOAT | Audio feature (0.0-1.0) |
| `instrumentalness` | FLOAT | Audio feature (0.0-1.0) |
| `liveness` | FLOAT | Audio feature (0.0-1.0) |
| `valence` | FLOAT | Audio feature (0.0-1.0) |
| `tempo` | FLOAT | BPM |
| `time_signature` | INTEGER | Beats per bar |
| `genres` | JSON | Genre tags (deprecated, use Artist.genres) |
| `lyrics` | TEXT | Full lyrics from Genius |
| `lyrics_summary` | VARCHAR | AI-generated summary |
| `genre_tags` | VARCHAR | AI-generated tags |
| `created_at` | TIMESTAMP | Record creation time |
| `updated_at` | TIMESTAMP | Last update time |
### `artists`
Artist entities with genre information.
| Column | Type | Description |
|--------|------|-------------|
| `id` | VARCHAR | Spotify artist ID (primary key) |
| `name` | VARCHAR | Artist name |
| `genres` | JSON | List of genre strings |
| `image_url` | VARCHAR | Artist profile image URL |
### `track_artists`
Many-to-many relationship between tracks and artists.
| Column | Type | Description |
|--------|------|-------------|
| `track_id` | VARCHAR | Foreign key to tracks.id |
| `artist_id` | VARCHAR | Foreign key to artists.id |
### `play_history`
Immutable log of listening events.
| Column | Type | Description |
|--------|------|-------------|
| `id` | INTEGER | Auto-increment primary key |
| `track_id` | VARCHAR | Foreign key to tracks.id |
| `played_at` | TIMESTAMP | When the track was played |
| `context_uri` | VARCHAR | Spotify context (playlist, album, etc.) |
| `listened_ms` | INTEGER | Duration actually listened |
| `skipped` | BOOLEAN | Whether track was skipped |
| `source` | VARCHAR | Source of the play event |
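Ingestion deduplicates plays on the `(track_id, played_at)` unique constraint. One way to express that in PostgreSQL is an upsert; the statement below is a sketch of the idea (the actual ingestion code checks for an existing row before inserting, which has the same effect):

```python
# Sketch: dedup insert relying on the (track_id, played_at) unique constraint.
INSERT_PLAY = """
INSERT INTO play_history (track_id, played_at, context_uri, listened_ms, skipped, source)
VALUES (%(track_id)s, %(played_at)s, %(context_uri)s, %(listened_ms)s, %(skipped)s, %(source)s)
ON CONFLICT (track_id, played_at) DO NOTHING
"""

# With psycopg2 this would be executed as: cur.execute(INSERT_PLAY, row_dict)
print("ON CONFLICT" in INSERT_PLAY)
```

`ON CONFLICT ... DO NOTHING` makes the insert idempotent, so re-polling the same Spotify history window cannot create duplicate rows.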
### `analysis_snapshots`
Stores computed statistics and AI-generated narratives.
| Column | Type | Description |
|--------|------|-------------|
| `id` | INTEGER | Auto-increment primary key |
| `date` | TIMESTAMP | When analysis was run |
| `period_start` | TIMESTAMP | Analysis period start |
| `period_end` | TIMESTAMP | Analysis period end |
| `period_label` | VARCHAR | Label (e.g., "last_30_days") |
| `metrics_payload` | JSON | StatsService output |
| `narrative_report` | JSON | NarrativeService output |
| `model_used` | VARCHAR | LLM model name |
| `playlist_theme` | VARCHAR | AI-generated theme name |
| `playlist_theme_reasoning` | TEXT | AI explanation for theme |
| `six_hour_playlist_id` | VARCHAR | Spotify playlist ID |
| `daily_playlist_id` | VARCHAR | Spotify playlist ID |
| `playlist_composition` | JSON | Track list at snapshot time |
### `playlist_config`
Configuration for managed Spotify playlists.
| Column | Type | Description |
|--------|------|-------------|
| `key` | VARCHAR | Config key (primary key, e.g., "six_hour") |
| `spotify_id` | VARCHAR | Spotify playlist ID |
| `last_updated` | TIMESTAMP | Last update time |
| `current_theme` | VARCHAR | Current playlist theme |
| `description` | VARCHAR | Playlist description |
| `composition` | JSON | Current track list |
## Schema Modifications (Alembic)
All schema changes MUST go through Alembic migrations.
### Creating a New Migration
```bash
cd backend
source venv/bin/activate
# Auto-generate migration from model changes
alembic revision --autogenerate -m "description_of_change"
# Or create empty migration for manual SQL
alembic revision -m "description_of_change"
```
### Applying Migrations
```bash
# Apply all pending migrations
alembic upgrade head
# Apply specific migration
alembic upgrade <revision_id>
# Rollback one migration
alembic downgrade -1
# Rollback to specific revision
alembic downgrade <revision_id>
```
### Migration Best Practices
1. **Test locally first** - Always test migrations on a dev database
2. **Backup before migrating** - `pg_dump -h 100.91.248.114 -p 5433 -U bnair music_db > backup.sql`
3. **One change per migration** - Keep migrations atomic
4. **Include rollback logic** - Implement `downgrade()` function
5. **Review autogenerated migrations** - They may miss nuances
### Example Migration
```python
# alembic/versions/xxxx_add_new_column.py
from alembic import op
import sqlalchemy as sa
revision = 'xxxx'
down_revision = 'yyyy'
def upgrade():
op.add_column('tracks', sa.Column('new_column', sa.String(), nullable=True))
def downgrade():
op.drop_column('tracks', 'new_column')
```
## Direct Database Access
### Using psql
```bash
psql -h 100.91.248.114 -p 5433 -U bnair -d music_db
```
### Using Python
```python
import psycopg2
conn = psycopg2.connect(
host='100.91.248.114',
port=5433,
user='bnair',
password='Bharath2002',
dbname='music_db'
)
```
### Common Queries
```sql
-- Recent plays
SELECT t.name, t.artist, ph.played_at
FROM play_history ph
JOIN tracks t ON ph.track_id = t.id
ORDER BY ph.played_at DESC
LIMIT 10;
-- Top tracks by play count
SELECT t.name, t.artist, COUNT(*) as plays
FROM play_history ph
JOIN tracks t ON ph.track_id = t.id
GROUP BY t.id, t.name, t.artist
ORDER BY plays DESC
LIMIT 10;
-- Genre distribution
SELECT genre, COUNT(*)
FROM artists, jsonb_array_elements_text(genres::jsonb) AS genre
GROUP BY genre
ORDER BY count DESC;
```
View File
@@ -1,12 +1,14 @@
 import React, { useState, useEffect } from 'react';
 import axios from 'axios';
-import { Card, Button, Typography, Space, Spin, message, Tooltip as AntTooltip } from 'antd';
+import { Card, Button, Typography, Space, Spin, message, Tooltip as AntTooltip, Collapse, Empty } from 'antd';
 import {
   PlayCircleOutlined,
   ReloadOutlined,
   HistoryOutlined,
   InfoCircleOutlined,
-  CustomerServiceOutlined
+  CustomerServiceOutlined,
+  CalendarOutlined,
+  DownOutlined
 } from '@ant-design/icons';
 import Tooltip from './Tooltip';
 import TrackList from './TrackList';
@@ -17,6 +19,10 @@ const PlaylistsSection = () => {
   const [loading, setLoading] = useState(true);
   const [refreshing, setRefreshing] = useState({ sixHour: false, daily: false });
   const [playlists, setPlaylists] = useState(null);
+  const [history, setHistory] = useState([]);
+  const [loadingHistory, setLoadingHistory] = useState(false);
+  const [showHistory, setShowHistory] = useState(false);

   const fetchPlaylists = async () => {
     try {
@@ -30,10 +36,30 @@ const PlaylistsSection = () => {
     }
   };

+  const fetchHistory = async () => {
+    if (loadingHistory) return;
+    setLoadingHistory(true);
+    try {
+      const response = await axios.get('/api/playlists/history');
+      setHistory(response.data.history || []);
+    } catch (error) {
+      console.error('Failed to fetch playlist history:', error);
+      message.error('Failed to load playlist history');
+    } finally {
+      setLoadingHistory(false);
+    }
+  };
+
   useEffect(() => {
     fetchPlaylists();
   }, []);

+  useEffect(() => {
+    if (showHistory && history.length === 0) {
+      fetchHistory();
+    }
+  }, [showHistory]);
+
   const handleRefresh = async (type) => {
     const isSixHour = type === 'six-hour';
     setRefreshing(prev => ({ ...prev, [isSixHour ? 'sixHour' : 'daily']: true }));
@@ -43,6 +69,7 @@ const PlaylistsSection = () => {
       await axios.post(endpoint);
       message.success(`${isSixHour ? '6-Hour' : 'Daily'} playlist refreshed!`);
       await fetchPlaylists();
+      if (showHistory) fetchHistory();
     } catch (error) {
       console.error(`Refresh failed for ${type}:`, error);
       message.error(`Failed to refresh ${type} playlist`);
@@ -53,6 +80,32 @@ const PlaylistsSection = () => {
   if (loading) return <div className="flex justify-center p-8"><Spin size="large" /></div>;

+  const historyItems = history.map((item) => ({
+    key: item.id,
+    label: (
+      <div className="flex justify-between items-center w-full">
+        <div className="flex items-center space-x-3">
+          <Text className="text-gray-400 font-mono text-xs">
+            {new Date(item.date).toLocaleDateString(undefined, { month: 'short', day: 'numeric' })}
+          </Text>
+          <Text className="text-white font-medium">{item.theme}</Text>
+          <span className="px-2 py-0.5 rounded text-[10px] bg-slate-700 text-blue-300 border border-slate-600">
+            {item.period_label || '6h'}
+          </span>
+        </div>
+        <Text className="text-gray-500 text-xs">{item.composition?.length || 0} tracks</Text>
+      </div>
+    ),
+    children: (
+      <div className="pl-2">
+        <Paragraph className="text-gray-300 text-sm italic mb-2 border-l-2 border-blue-500 pl-3 py-1">
+          "{item.reasoning}"
+        </Paragraph>
+        <TrackList tracks={item.composition} maxHeight="max-h-96" />
+      </div>
+    ),
+  }));
+
   return (
     <div className="mt-8 space-y-6">
       <div className="flex items-center space-x-2">
@@ -162,6 +215,39 @@ const PlaylistsSection = () => {
         </div>
       </Card>
     </div>

+      <div className="mt-8 border-t border-slate-700 pt-6">
+        <Button
+          type="text"
+          onClick={() => setShowHistory(!showHistory)}
+          className="flex items-center text-gray-400 hover:text-white p-0 text-lg font-medium mb-4 transition-colors"
+        >
+          <CalendarOutlined className="mr-2" />
+          Playlist Archives
+          <DownOutlined className={`ml-2 text-xs transition-transform duration-300 ${showHistory ? 'rotate-180' : ''}`} />
+        </Button>
+        {showHistory && (
+          <div className="animate-fade-in">
+            {loadingHistory ? (
+              <div className="flex justify-center p-8"><Spin /></div>
+            ) : history.length > 0 ? (
+              <Collapse
+                items={historyItems}
+                bordered={false}
+                className="bg-transparent"
+                expandIconPosition="end"
+                ghost
+                style={{ background: 'transparent' }}
+              />
+            ) : (
+              <Empty description={<span className="text-gray-500">No playlist history available yet</span>} />
+            )}
+          </div>
+        )}
+      </div>
     </div>
   );
 };