Movie Recommendation Bot

I built an AI-powered Movie Recommendation Bot that uses GPT-2 and MongoDB vector search to answer movie queries in natural language.

The bot uses:

  • GPT-2 for generating conversational responses.
  • MongoDB Atlas on Azure to store and manage text embeddings.
  • PyMongo to connect to MongoDB.
  • Sentence Transformers to generate embeddings with pre-trained models.
  • Hugging Face Transformers for various NLP tasks.
  • Gradio to create an interactive web interface.
  • Hugging Face Spaces for deploying the bot.

Try the bot here. Dig into the code here.
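Before the bot can answer anything, the movie dataset has to be embedded and stored in Atlas. A minimal sketch of that preparation step, assuming a `movies` collection with a `plot_embedding` field and the `all-MiniLM-L6-v2` model (the collection, field, and model names are assumptions, not necessarily what the bot uses):

```python
def build_movie_doc(title: str, plot: str, embedding: list[float]) -> dict:
    """Shape one movie record for MongoDB, embedding included."""
    return {"title": title, "plot": plot, "plot_embedding": embedding}


def prepare_dataset(movies: list[dict], mongo_uri: str) -> None:
    """Embed every plot and bulk-insert the documents into Atlas."""
    # Heavy imports kept inside the function so the helper above
    # stays importable without these libraries installed.
    from pymongo import MongoClient
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")          # assumed model
    collection = MongoClient(mongo_uri)["moviedb"]["movies"]  # assumed names

    docs = [
        build_movie_doc(m["title"], m["plot"], model.encode(m["plot"]).tolist())
        for m in movies
    ]
    collection.insert_many(docs)
```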

How It Works

```mermaid
graph TD
    A[User Input] -->|Query| B[Gradio Interface]
    B --> C[Query Processing]
    C --> D[Generate Query Embedding]
    D --> E[Vector Search in MongoDB]
    E --> F[Retrieve Similar Movies]
    F --> G[Format Search Results]
    G --> H[Combine Query and Results]
    H --> I[Tokenize for GPT-2]
    I --> J[Generate Response with GPT-2]
    J --> K[Decode GPT-2 Output]
    K --> L[Format Final Response]
    L --> M[Display in Gradio Interface]
    M --> N[User Views Response]

    subgraph "Data Preparation"
        O[Load Movie Dataset] --> P[Generate Movie Embeddings]
        P --> Q[Store in MongoDB]
    end

    subgraph "External Services"
        R[Hugging Face Models]
        S[MongoDB Atlas]
    end

    R -.-> D
    R -.-> I
    R -.-> J
    S -.-> E
    S -.-> Q
```
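The vector-search step in the diagram maps onto an Atlas `$vectorSearch` aggregation stage. A sketch of how a query embedding might become a pipeline, assuming an index named `vector_index` over a `plot_embedding` field (both names are assumptions):

```python
def vector_search_pipeline(query_embedding: list[float],
                           limit: int = 5,
                           num_candidates: int = 100) -> list[dict]:
    """Build an aggregation pipeline for Atlas Vector Search."""
    return [
        {
            "$vectorSearch": {
                "index": "vector_index",          # assumed index name
                "path": "plot_embedding",         # assumed embedding field
                "queryVector": query_embedding,
                "numCandidates": num_candidates,  # ANN candidate pool size
                "limit": limit,                   # results returned
            }
        },
        # Keep only what the prompt needs, plus the similarity score.
        {"$project": {"_id": 0, "title": 1, "plot": 1,
                      "score": {"$meta": "vectorSearchScore"}}},
    ]

# Usage (requires a live Atlas connection):
# results = list(collection.aggregate(vector_search_pipeline(embedding)))
```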

Design Decisions and Trade-offs

The bot is designed to operate within the limitations of free-tier Hugging Face resources. Responses are sometimes truncated mid-sentence; this is not a bug but a consequence of the 150-token generation limit imposed to stay within the free tier.
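One way to make the truncation less jarring is to cut the generated text back to the last complete sentence. A sketch of the generation step under the 150-token cap, with a small trimming helper (the lazy imports and prompt handling are illustrative assumptions, not the bot's exact code):

```python
def trim_to_last_sentence(text: str) -> str:
    """Drop a trailing partial sentence left by the token cap."""
    cut = max(text.rfind("."), text.rfind("!"), text.rfind("?"))
    return text[: cut + 1] if cut != -1 else text


def generate_answer(prompt: str, max_new_tokens: int = 150) -> str:
    """Generate a reply with GPT-2, capped at 150 new tokens."""
    from transformers import GPT2LMHeadModel, GPT2Tokenizer  # lazy import

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,  # the free-tier cap discussed above
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Decode only the newly generated tokens, then tidy the tail.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return trim_to_last_sentence(
        tokenizer.decode(new_tokens, skip_special_tokens=True)
    )
```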

I deliberately kept the bot within free resources to balance functionality with available computational capacity.

Potential improvements with more resources:

  • More Advanced Language Models: Models like OpenAI GPT or Google Gemma would significantly improve response quality, but would substantially increase operational costs.
  • Increase Token Limits: Expanding from 150 to at least 1000 tokens would allow longer, more comprehensive responses with more movie details and recommendations.
  • Multi-turn Conversations: Retaining context across follow-up questions would enhance user experience, but requires additional computational resources.
  • Hybrid Search: Combining vector and keyword search could provide more accurate recommendations, at the cost of increased index size and higher computational requirements.
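The hybrid-search idea above is often implemented with reciprocal rank fusion, which merges a vector ranking and a keyword ranking without needing their scores to be comparable. A minimal sketch (the constant `k=60` is the conventional default for this technique, not something the bot uses today):

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of movie titles into one.

    Each title scores sum(1 / (k + rank)) across the lists it
    appears in; a higher combined score means a better final rank.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, title in enumerate(ranking, start=1):
            scores[title] = scores.get(title, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# "Alien" ranks high in both lists, so it wins overall.
vector_hits = ["Alien", "Moon", "Solaris"]
keyword_hits = ["Sunshine", "Alien", "Moon"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])  # "Alien" first
```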

These improvements are beyond the current scope due to resource constraints.