Movie Recommendation Bot

I built an AI-powered Movie Recommendation Bot that uses GPT-2 and MongoDB vector search to answer movie queries in natural language.

The bot uses:

  • GPT-2 for generating conversational responses.
  • MongoDB Atlas on Azure to store and manage text embeddings.
  • PyMongo to connect to MongoDB.
  • Sentence Transformers to generate embeddings with pre-trained models.
  • Hugging Face Transformers for various NLP tasks.
  • Gradio to create an interactive web interface.
  • Hugging Face Spaces for deploying the bot.

Try the bot here. Dig into the code here.
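Before the bot can answer anything, the movie dataset has to be embedded and stored in Atlas. A minimal sketch of that preparation step, assuming a `movies` collection with a `plot_embedding` field and the `all-MiniLM-L6-v2` model (the collection, field, and model names are assumptions, not necessarily what the bot uses):

```python
def build_movie_doc(title: str, plot: str, embedding: list[float]) -> dict:
    """Shape one movie record for MongoDB, embedding included."""
    return {"title": title, "plot": plot, "plot_embedding": embedding}


def prepare_dataset(movies: list[dict], mongo_uri: str) -> None:
    """Embed every plot and bulk-insert the documents into Atlas."""
    # Heavy imports kept inside the function so the helper above
    # stays importable without these libraries installed.
    from pymongo import MongoClient
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")          # assumed model
    collection = MongoClient(mongo_uri)["moviedb"]["movies"]  # assumed names

    docs = [
        build_movie_doc(m["title"], m["plot"], model.encode(m["plot"]).tolist())
        for m in movies
    ]
    collection.insert_many(docs)
```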

How It Works

```mermaid
graph TD
    A[User Input] -->|Query| B[Gradio Interface]
    B --> C[Query Processing]
    C --> D[Generate Query Embedding]
    D --> E[Vector Search in MongoDB]
    E --> F[Retrieve Similar Movies]
    F --> G[Format Search Results]
    G --> H[Combine Query and Results]
    H --> I[Tokenize for GPT-2]
    I --> J[Generate Response with GPT-2]
    J --> K[Decode GPT-2 Output]
    K --> L[Format Final Response]
    L --> M[Display in Gradio Interface]
    M --> N[User Views Response]

    subgraph "Data Preparation"
        O[Load Movie Dataset] --> P[Generate Movie Embeddings]
        P --> Q[Store in MongoDB]
    end

    subgraph "External Services"
        R[Hugging Face Models]
        S[MongoDB Atlas]
    end

    R -.-> D
    R -.-> I
    R -.-> J
    S -.-> E
    S -.-> Q
```
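The vector-search step in the diagram maps onto an Atlas `$vectorSearch` aggregation stage. A sketch of how a query embedding might become a pipeline, assuming an index named `vector_index` over a `plot_embedding` field (both names are assumptions):

```python
def vector_search_pipeline(query_embedding: list[float],
                           limit: int = 5,
                           num_candidates: int = 100) -> list[dict]:
    """Build an aggregation pipeline for Atlas Vector Search."""
    return [
        {
            "$vectorSearch": {
                "index": "vector_index",          # assumed index name
                "path": "plot_embedding",         # assumed embedding field
                "queryVector": query_embedding,
                "numCandidates": num_candidates,  # ANN candidate pool size
                "limit": limit,                   # results returned
            }
        },
        # Keep only what the prompt needs, plus the similarity score.
        {"$project": {"_id": 0, "title": 1, "plot": 1,
                      "score": {"$meta": "vectorSearchScore"}}},
    ]

# Usage (requires a live Atlas connection):
# results = list(collection.aggregate(vector_search_pipeline(embedding)))
```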

Design Decisions and Trade-offs

The bot is designed to operate within the limitations of free-tier Hugging Face resources. Responses are sometimes truncated mid-sentence; this is not a bug but a consequence of the 150-token generation limit imposed to stay within the free tier.
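One way to make the truncation less jarring is to cut the generated text back to the last complete sentence. A sketch of the generation step under the 150-token cap, with a small trimming helper (the lazy imports and prompt handling are illustrative assumptions, not the bot's exact code):

```python
def trim_to_last_sentence(text: str) -> str:
    """Drop a trailing partial sentence left by the token cap."""
    cut = max(text.rfind("."), text.rfind("!"), text.rfind("?"))
    return text[: cut + 1] if cut != -1 else text


def generate_answer(prompt: str, max_new_tokens: int = 150) -> str:
    """Generate a reply with GPT-2, capped at 150 new tokens."""
    from transformers import GPT2LMHeadModel, GPT2Tokenizer  # lazy import

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,  # the free-tier cap discussed above
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Decode only the newly generated tokens, then tidy the tail.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return trim_to_last_sentence(
        tokenizer.decode(new_tokens, skip_special_tokens=True)
    )
```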

I deliberately kept the bot within free resources to balance functionality with available computational capacity.

Potential improvements with more resources:

  • More Advanced Language Models: Models like OpenAI GPT or Google Gemma would significantly improve response quality, but would substantially increase operational costs.
  • Increase Token Limits: Expanding from 150 to at least 1000 tokens would allow longer, more comprehensive responses with more movie details and recommendations.
  • Multi-turn Conversations: Retaining context across follow-up questions would enhance user experience, but requires additional computational resources.
  • Hybrid Search: Combining vector and keyword search could provide more accurate recommendations, at the cost of increased index size and higher computational requirements.
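The hybrid-search idea above is often implemented with reciprocal rank fusion, which merges a vector ranking and a keyword ranking without needing their scores to be comparable. A minimal sketch (the constant `k=60` is the conventional default for this technique, not something the bot uses today):

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of movie titles into one.

    Each title scores sum(1 / (k + rank)) across the lists it
    appears in; a higher combined score means a better final rank.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, title in enumerate(ranking, start=1):
            scores[title] = scores.get(title, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# "Alien" ranks high in both lists, so it wins overall.
vector_hits = ["Alien", "Moon", "Solaris"]
keyword_hits = ["Sunshine", "Alien", "Moon"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])  # "Alien" first
```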

These improvements are beyond the current scope due to resource constraints.