Article

Chef Dalle: Transforming Cooking with Multi-Model Multimodal AI

by Brendan Hannon *, Yulia Kumar *, J. Jenny Li and Patricia Morreale
Department of Computer Science and Technology, Kean University, Union, NJ 07083, USA
* Authors to whom correspondence should be addressed.
Computers 2024, 13(7), 156; https://doi.org/10.3390/computers13070156
Submission received: 6 May 2024 / Revised: 3 June 2024 / Accepted: 5 June 2024 / Published: 21 June 2024

Abstract

In an era where dietary habits significantly impact health, technological interventions can offer personalized and accessible food choices. This paper introduces Chef Dalle, a recipe recommendation system that leverages multi-model and multimodal human-computer interaction (HCI) techniques to provide personalized cooking guidance. The application integrates voice-to-text conversion via Whisper and ingredient image recognition through GPT-Vision. It employs a recipe filtering system that uses user-provided ingredients to fetch recipes, which are then evaluated by multiple AI models through the OpenAI, Google Gemini, and Anthropic (Claude) APIs to deliver highly personalized recommendations. These methods enable users to interact with the system using voice, text, or images, accommodating various dietary restrictions and preferences. Furthermore, the use of DALL-E 3 for generating recipe images enhances user engagement. User feedback mechanisms allow for the refinement of future recommendations, demonstrating the system’s adaptability. Chef Dalle showcases potential applications ranging from home kitchens to grocery stores and restaurant menu customization, addressing accessibility and promoting healthier eating habits. This paper underscores the significance of multimodal HCI in enhancing culinary experiences, setting a precedent for future developments in the field.

1. Introduction

The contemporary intersection of technology and daily living has revolutionized how individuals interact with their environments, particularly in personal health and dietary habits. With the global rise in dietary-related health issues, such as obesity and nutritional deficiencies [1], there is a pressing need for personalized solutions that cater to individual dietary preferences and restrictions. Chef Dalle, a recipe recommendation system, represents a pioneering step in this direction, utilizing multimodal human-computer interaction (HCI) to revolutionize the culinary experience.
Designed to meet a broad spectrum of dietary needs, Chef Dalle integrates advanced technologies through OpenAI API [2], including voice-to-text conversion, image recognition, and machine learning algorithms, to offer a user-centric platform that enhances accessibility and user engagement. Utilizing Whisper [3] for voice-to-text conversion, GPT-Vision [4] for ingredient image recognition, and DALL-E 3 [5] for generating recipe images, Chef Dalle’s recommendation engine leverages the capabilities of OpenAI, Google Gemini, and Anthropic APIs to analyze and recommend recipes based on an extensive array of user inputs, including dietary preferences and available ingredients. Each user interaction enriches the system’s understanding, as feedback is integrated in real-time with each API request, enhancing the precision of subsequent recommendations and ensuring the system adapts continuously to user preferences.
Furthermore, Chef Dalle has the unique capability to generate new recipes on the fly when no existing recipes match the user’s specific ingredients or preferences, ensuring a continuous expansion of its recipe database and a highly personalized user experience.
Chef Dalle’s utility extends beyond home kitchens, with potential applications in grocery stores, restaurants, and even challenging dietary scenarios, such as onboard airplanes or within low-income families. By simplifying meal discovery and preparation, Chef Dalle not only fosters healthier eating habits but also introduces a level of convenience and personalization previously unattainable. Moreover, its multimodal interaction capabilities significantly enhance accessibility, making it an invaluable tool for individuals with diverse abilities and preferences.
This paper delves into Chef Dalle’s development and application, elucidating its impact on the HCI field and its implications for dietary planning and health. Through an exhaustive analysis of its features and functionalities, we aim to showcase how Chef Dalle capitalizes on the latest advancements in artificial intelligence and machine learning to provide a personalized cooking assistant that is adaptable, accessible, and in tune with users’ dietary requirements.
RQ1: How does the integration of multimodal AI technologies in Chef Dalle enhance user experience in recipe recommendations?
RQ2: How do the various modalities of the OpenAI API (Whisper for voice recognition, DALL-E 3 for image generation, and GPT-4 Vision for ingredient identification) in Chef Dalle enhance accessibility to users?
RQ3: In what ways does Chef Dalle contribute to promoting healthier eating habits and making home cooking more accessible to diverse user populations?

2. Related Works

The fusion of artificial intelligence (AI) with culinary applications has significantly broadened the scope of recipe recommendation systems, introducing a new era of personalized cooking experiences. This evolution is evident in applications like SuperCook [6], Yummly [7], and BigOven [8], which have redefined meal planning by recommending recipes based on the ingredients users have on hand, using machine learning algorithms to search large datasets. While these platforms demonstrate the potential of intelligent systems to transform the culinary landscape by making recipe discovery more intuitive and tailored to individual preferences, they often return very large result sets, so users must still sift through thousands of recipes.
The concept of multimodality in these applications extends beyond mere recipe suggestions. It encompasses various forms of user interaction, such as textual inputs, image recognition, and voice commands, thereby making cooking more accessible and engaging for a diverse audience. The integration of multimodal large language models (LLMs) like ChatGPT-4 (vision and turbo) [9], Google Gemini (advanced) [10], Anthropic’s Claude 3 (opus) [11], and others has propelled a variety of applications, such as AssureAIDoctor [12], an AI doctor assistant that utilizes DALL-E and GPT-4 to improve access to medical advice for disadvantaged individuals. Additionally, the Multilingual Eyes Multimodal Traveler’s App (MEMTA) [13], a travel assistant designed for tourists and visually impaired users, among other uses, utilizes GPT-4, GPT-Vision, and YOLO v8 object detection. The Growth Mindset Emojifier (GMSE) [14] app is a feedback tool that lets educators use emojis in student assignment comments, using GPT-4 and GPT-Vision. These multimodal tools have not only transformed healthcare and educational tools but have also revolutionized the culinary world.
Recently, platforms have begun moving away from recipe datasets in favor of generating custom recipes with an AI model. Apps such as PlantJammer [15] and Let’s Foodie [16] focus on reducing waste by using leftover ingredients on hand, allowing users to edit the recipe to create their perfect custom recipe. Alternatively, DishGen [17] and ChefGPT [18] have leveraged AI to generate recipes tailored to users’ inputs, such as ingredients on hand or the description of an item. These culinary platforms share similarities with multimodal AI applications in other fields, such as Meta’s Smartglasses [19] and Samsung’s Galaxy S24 [20], which leverage AI to interpret the physical and digital world innovatively, from object recognition to intuitive photo editing. Such multimodal applications demonstrate the versatility of AI in understanding and responding to a wide array of user inputs and preferences, thus making technology more accessible and personalized.
Various studies have deeply explored the integration of artificial intelligence (AI) into culinary and health-focused applications. Hwang et al. delve into AI’s role in fostering culinary creativity and sustainable cooking practices in their work “Recipe 2.0” [21]. Similarly, Kansaksiri et al. investigate generative recipes and ChatGPT-powered nutrition assistance in “Smart Cuisine” [22]. Degerli and Tatlisus [23] show how AI tools can be used for recipe correction, recipe adaptation, recipe detailing, time management, and presentation.
Research on the reliability of dietary advice generated by AI, focusing on individuals with food allergies, is provided by Niszczota and Rybicka [24], while AI’s capability in supporting renal diets is explored by Qarajeh et al. [25], emphasizing the importance of accuracy and reliability in AI-generated health and dietary recommendations.
Moreover, the broader implications of AI in transforming food systems toward enhanced sustainability, efficiency, and consumer experiences across the global food industry are comprehensively reviewed by Pravin and Sundarapandiyan [26], highlighting the technology’s potential to revolutionize food production, nutrition, and culinary experiences globally. From precision agriculture to personalized nutrition and food safety assurance, AI-driven solutions are increasingly adopted, pointing to a future where food systems are more resilient, equitable, and sustainable.
This integration of AI into culinary technologies and multimodal applications not only enriches personal cooking experiences but also heralds a new era of technology-driven solutions across sectors, emphasizing the significant potential of AI to foster innovation, sustainability, and personalization in daily life.

Distinctive Features of Chef Dalle

Chef Dalle distinguishes itself from existing recipe recommendation systems through a unique blend of multimodal AI integration, dynamic recipe generation, and a user-centric design philosophy. Unlike platforms like Yummly, SuperCook, and BigOven, which primarily rely on pre-existing datasets and limited input modalities, Chef Dalle leverages cutting-edge AI technologies to offer a more adaptable and personalized culinary experience.
The key differentiators include the following:
Multimodal Input: Chef Dalle seamlessly integrates voice commands (Whisper), image recognition (GPT-Vision), and text input, catering to diverse user preferences and abilities. This multimodal approach simplifies recipe discovery and enhances accessibility for users with varying needs.
Dynamic Recipe Generation: In a groundbreaking departure from traditional systems, Chef Dalle can generate new recipes on the fly when no suitable matches are found in its extensive database. This ensures that users always receive relevant recommendations, even for unique ingredient combinations or specific dietary requirements.
Interactive Cooking Guidance: The Chatbox Assistant, powered by GPT-4-Turbo, provides real-time cooking support, answering user queries, offering ingredient substitutions, and guiding users through each recipe step. This feature fosters a more engaging and interactive cooking experience.
Multi-API Recommendation Engine: Chef Dalle employs a sophisticated recommendation engine that leverages multiple AI models (OpenAI, Google Gemini, Anthropic) to analyze user input and generate highly personalized recipe suggestions. This multi-API approach ensures a wider range of culinary options and greater accuracy in meeting user preferences.
Visual Appeal: The integration of DALL-E 3 allows Chef Dalle to generate high-quality images for recipes, enhancing the visual appeal of the platform and providing users with a clear expectation of the final dish.
These distinctive features collectively position Chef Dalle as a frontrunner in the next generation of recipe recommendation systems, offering a more personalized, accessible, and engaging culinary experience than existing solutions.

3. Methodology

3.1. System Architecture

Chef Dalle intricately combines AI-driven and web technologies to craft a user-centric culinary platform. Users interact with the system through a web interface, where they can input their dietary preferences, their allergies, and the specific ingredients they wish to use in recipes. This input is enhanced by state-of-the-art AI functionalities, including image uploads processed for ingredient recognition via GPT-Vision and voice-to-text conversion through Whisper. The system’s backend, developed with Flask 3.0.2 and SQLAlchemy 2.0.27, manages user interactions, stores data, and handles complex queries with Flask-Migrate for database migrations, ensuring the database structure can evolve seamlessly alongside the application’s growing needs.
The core of the application lies in its recommendation engine, which accesses a vast repository of over 231,639 recipes stored in a CSV file format. This engine uses state-of-the-art AI models from OpenAI, Google Gemini, and Anthropic APIs to analyze user inputs and past interactions to tailor recipe suggestions, which are then formatted in JSON for easy front-end retrieval. For enhancing visual engagement, high-quality recipe images are generated on-demand using DALL-E 3, stored as base64-encoded strings in the database for efficiency and quick retrieval.

3.1.1. User Interface (UI)

The UI is meticulously designed, focusing on ease of use, esthetic appeal, and accessibility. Built using HTML5, CSS3, and JavaScript, the interface adapts to various screen sizes and devices, ensuring a seamless user experience across different platforms. The recipes are presented using Bootstrap cards, which organize information clearly and attractively, enhancing the visual appeal and readability.
AJAX is implemented for asynchronous data fetching, allowing the application to deliver real-time updates such as recipe suggestions and user feedback without the need for page reloads. This not only improves interaction speed but also enhances overall user engagement, making culinary exploration both intuitive and enjoyable. Figure 1 displays Chef Dalle’s user interface, including the home page, a recommendation, and the profile page. A full demo can be found at [Video S1], and the source code can be found at [Video S1].

3.1.2. Chatbox Assistant

A standout feature of the Chef Dalle application, the Chatbox Assistant, is ingeniously integrated into each recipe card, significantly enhancing the cooking experience for users. Activated by the ‘Start Cooking’ button, this interactive tool utilizes OpenAI’s GPT technology to deliver dynamic, responsive support tailored to the needs of home cooks. Upon activation, the chatbox employs the OpenAI API to append precise quantities to ingredients, a critical feature given the existing dataset often lacks specific measurements. This functionality ensures that users receive comprehensive information necessary to begin their culinary preparations without external references.
As users progress through their cooking, the chatbox further enhances user engagement by enriching the recipe steps. Each time a user clicks the ‘Next Step’ button, the chatbox sends the current step to the OpenAI API, which returns expanded instructions that include detailed cooking techniques. This enriched content helps demystify complex cooking processes, making each step clear and executable. Additionally, for steps requiring precise timing, the chatbox features an integrated timer that counts down and alerts the user with both visual and audible notifications until the user interacts with the ‘Stop’ button where the countdown is displayed.
By embedding such advanced functionalities into the Chatbox Assistant, Chef Dalle not only enhances the practicality of its platform but also transforms the static activity of following a recipe into a more engaging, supportive, and educational experience. This interactive tool makes the platform accessible to novice cooks and enriches the cooking experience for culinary enthusiasts, ensuring every user can confidently achieve their best results. The snapshots of Chatbox Assistant can be seen in Figure 2.

3.1.3. Flask Application

Acting as the middleware, Flask orchestrates the system’s core functionalities, including request handling, session management, and communications between the front end, AI models, and the database. Its lightweight nature ensures efficient processing, while Flask’s extensive libraries support complex functionalities like user authentication and image processing. Figure 3 shows the architecture of Chef Dalle and the role of Flask as a middleware.

3.1.4. Recommendation Engine

This engine operates by initially filtering recipes from a dataset based on user input such as ingredients and allergies. These selected recipes, along with their details, are then evaluated using a multi-API approach involving the OpenAI API, Google Gemini API, and Anthropic API. Each API assesses the recipes considering the user’s ingredients, dietary preferences, allergies, and any liked or disliked recipes, returning five recommendations each. After collating these fifteen recommendations, duplicates are removed to form a refined list of unique recipes. This list is re-evaluated by the OpenAI API, which considers the same user-specific parameters to finalize the top recommendations presented to the user. In cases where no suitable matches are found within the existing dataset, the recommendation engine can trigger the generation of new recipes. This process involves leveraging the AI models’ understanding of flavor profiles, ingredient compatibility, and dietary restrictions to create novel recipe combinations that align with the user’s input. The generated recipes are then evaluated and refined before being presented to the user, ensuring their quality and relevance. This process not only leverages real-time data processing but also utilizes insights from multiple AI technologies to dynamically adjust and refine recipe suggestions, ensuring highly personalized and relevant outcomes.
The logic of this code is shown in Algorithm 1.
Algorithm 1: Recipe Recommendation System
Input: user_diet_preferences: list of user’s dietary preferences, allergies: list of user’s allergies, recipes_df: data frame containing recipe data, ingredients: list of user’s ingredients, liked_recipes: list of IDs of recipes the user likes (optional), disliked_recipes: list of IDs of recipes the user dislikes (optional)
Output: The AI-generated response containing top recommended recipes that match user preferences and do not contain allergens.
 1. Get and store user inputs: ingredients, allergies, user_diet_preferences
 2. If liked_recipes is not empty:
 3.   For each liked_recipe_id in liked_recipes:
 4.     Find index in recipes_df matching liked_recipe_id
 5.     If index is found:
 6.       Collect recipe info for the prompt
 7. If disliked_recipes is not empty:
 8.   For each disliked_recipe_id in disliked_recipes:
 9.     Find index in recipes_df matching disliked_recipe_id
 10.     If index is found:
 11.       Collect recipe info for the prompt
 12. For each recipe in recipes_df:
 13.   If recipe contains a user-specified ingredient from ingredients:
 14.     Add recipe to filtered_recipes
 15.   If recipe contains a user-specified allergen from allergies:
 16.     Remove recipe from filtered_recipes
 17. prompt = CreatePrompt(user_diet_preferences, allergies, ingredients, liked_recipes, disliked_recipes, filtered_recipes[['name', 'ingredients']])
 18. results = CallConcurrently(AnthropicAPI, OpenAIAPI, GeminiAPI) with prompt
 19. all_recommendations = list(set(results))
 20. top_recommendations = Call OpenAIAPI(user_diet_preferences, allergies, ingredients, liked_recipes, disliked_recipes, all_recommendations)
 21. recommended_names = [name.strip() for name in top_recommendations[:5]]
 22. recommended_recipes = recipes_df[recipes_df['name'].isin(recommended_names)]
 23. Return recommended_recipes (DataFrame containing the recommended recipes).
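The filtering and final lookup steps of Algorithm 1 (steps 12 to 16 and 21 to 22) can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain Python lists of dictionaries rather than a Pandas data frame, the function names filter_recipes and lookup_by_name are hypothetical, and exact, case-insensitive ingredient matching is an assumption made for brevity.

```python
def filter_recipes(recipes, ingredients, allergies):
    """Keep recipes that use at least one user ingredient and contain no allergen."""
    wanted = {item.strip().lower() for item in ingredients}
    banned = {item.strip().lower() for item in allergies}
    kept = []
    for recipe in recipes:
        items = {item.strip().lower() for item in recipe["ingredients"]}
        # Assumption: ingredients match by exact name, ignoring case.
        if items & wanted and not items & banned:
            kept.append(recipe)
    return kept


def lookup_by_name(recipes, names, limit=5):
    """Map model-returned recipe names back to full recipe records."""
    by_name = {recipe["name"].strip().lower(): recipe for recipe in recipes}
    picked = []
    for name in names[:limit]:
        match = by_name.get(name.strip().lower())
        if match is not None:  # silently skip names the dataset does not contain
            picked.append(match)
    return picked
```

Skipping unknown names in lookup_by_name is one possible way to guard against a model returning a recipe title that is not in the dataset.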

3.1.5. Database

The system maintains a robust database using SQLAlchemy for efficient ORM. This includes secure user credential storage, tracking dietary preferences and allergies, and recording recipe feedback. Base64-encoded images of recipes are also managed for efficient retrieval, using Flask-Migrate to maintain database integrity across schema changes.

3.2. AI Models

The overall AI infrastructure applied to the application logic can be seen in Figure 4. More detailed explanations of the models in use are provided later in this section.
Figure 4 illustrates a comparison of the top large language models (LLMs) in terms of the various factors and methodologies used to derive the optimal recipe. The two main methodologies are as follows:
(1)
Mixture of Experts. This is a machine learning technique where multiple expert models contribute to a final output, improving the decision-making process through a specialized yet collaborative approach. This strategy is effective in enhancing the adaptability and accuracy of models across various inputs and conditions [27,28].
(2)
Multi-Head Mixture of Experts. This method enhances model performance by specializing different ‘expert’ neural networks within the larger model structure, each tailored to optimize a specific domain of the task at hand, such as taste or nutritional value. Such architecture allows for better scaling and efficiency in handling complex tasks by distributing them among multiple experts, each processed in parallel [27].
Applying the mentioned methodologies effectively addresses complex multifactorial problems like recipe generation, which demands consideration of diverse factors such as taste preferences, nutritional needs, and allergy safety [29,30,31].
As can be seen from Figure 4, three different models, and therefore three different APIs, return recipe responses. In the case of Chef Dalle, each API returns exactly 5 recipes, so there are 15 recipes in total. The app then removes duplicates and submits the same prompt to the OpenAI API again, this time with only the recipes returned by the three APIs instead of the filtered dataset; the final selections are then displayed on the recipe cards. The described process can be expressed as the following formula.
$f_{\mathrm{api}}(p)$ represents the call to an API with prompt $p$ and returns a list of recipes.
$R = f_{\mathrm{api}}(p_1) \cup f_{\mathrm{api}}(p_2) \cup f_{\mathrm{api}}(p_3)$, where $R$ is the total set of recipes returned from three different API calls, each contributing five recipes.
$g(R)$ removes duplicates from the set of recipes $R$, yielding $R'$.
$h(R')$ takes the de-duplicated set of recipes $R'$ and refines them by submitting them to the OpenAI API, returning a refined list of recipes $R''$.
$\mathrm{display}(R'')$ displays the final refined list of recipes $R''$ in the required format (JSON).
The entire process can be described as a composite function that encapsulates calling APIs, concatenating responses, removing duplicates, refining results, and displaying them:
$$\mathrm{display}\left(h\left(g\left(f_{\mathrm{api}}(p_1) \cup f_{\mathrm{api}}(p_2) \cup f_{\mathrm{api}}(p_3)\right)\right)\right)$$
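The composite function above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the application's code: the model callables passed to recommend are stand-ins for the real OpenAI, Gemini, and Anthropic clients, refine is a placeholder for the second OpenAI call and merely truncates the list, and all function names are hypothetical.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def call_models(prompt, models):
    """f_api applied to each model concurrently; results keep the order of `models`."""
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        return list(pool.map(lambda model: model(prompt), models))


def dedupe(result_lists):
    """g: merge the per-model lists, dropping case-insensitive duplicates."""
    seen, unique = set(), []
    for names in result_lists:
        for name in names:
            key = name.strip().lower()
            if key not in seen:
                seen.add(key)
                unique.append(name.strip())
    return unique


def refine(candidates, limit=5):
    """h: placeholder for the re-evaluation step; here it only truncates."""
    return candidates[:limit]


def display(recipes):
    """display: serialize the final list as JSON for the front end."""
    return json.dumps({"recipes": recipes})


def recommend(prompt, models):
    return display(refine(dedupe(call_models(prompt, models))))
```

Unlike list(set(...)), the order-preserving dedupe keeps the candidates in a stable order, which makes the prompt sent to the refinement step reproducible.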

3.2.1. Whisper for Voice-to-Text Conversion

The integration of Whisper for voice recognition [3] allows users to verbally input their cooking preferences, ingredients at hand, or dietary restrictions. The system captures these voice inputs, converting them into text that is then analyzed to understand user requests accurately. This feature significantly enhances Chef Dalle’s accessibility and user-friendliness, making recipe discovery a hands-free experience that accommodates busy kitchen environments or users with mobility limitations.

3.2.2. GPT-Vision for Ingredient Recognition

GPT-Vision [4] plays a pivotal role in identifying ingredients from user-uploaded images. When a user takes a picture of available ingredients, GPT-Vision processes this image to recognize and list the ingredients. This sophisticated image recognition capability simplifies the input process, allowing users to easily add ingredients to their profile without manually typing them. The identified ingredients are automatically populated into the ingredient form field, streamlining the recipe search and recommendation process.

3.2.3. DALL-E 3 for Image Generation

DALL-E 3 [5] addresses the need for visual content within the Chef Dalle platform, particularly when displaying recipe suggestions. If a requested recipe lacks a corresponding image in the database, DALL-E 3 generates a high-quality image representative of the final dish. This not only enriches the user experience by providing a visual expectation of recommended recipes but also enhances the content’s appeal and engagement. The ability to generate images on demand ensures that every recipe, regardless of its source, is visually represented, making the exploration of culinary options more engaging and informative.

3.2.4. GPT-4-Turbo for Recipe Recommendation and Cooking Guidance

GPT-4-Turbo is used extensively by Chef Dalle for both recipe recommendations and interactive cooking guidance. This model excels in understanding and generating human-like text based on complex user inputs. For recipe recommendations, GPT-4-Turbo analyzes the user’s ingredients, allergies, dietary preferences, and liked and disliked recipes, along with the recipes filtered by the user’s ingredients and allergies, to generate a curated list of recipes that match the user’s needs. The model is instructed to return five recipes in a JSON object named “recipe_name”. This model takes part in the first stage of our recommendation logic, in which API requests are made concurrently to GPT-4-Turbo, Claude-3, and Gemini 1.5, each returning five recipes. With the new list of AI-selected recipes, we have GPT-4-Turbo reanalyze the same inputs, but this time, instead of the filtered recipes, we supply only the recipes returned by the three models, giving GPT-4-Turbo the final decision.
Moreover, GPT-4-Turbo plays a crucial role in the chatbox functionality, providing real-time cooking guidance. The model returns the ingredient list with quantities along with expanded step-by-step instructions every time the “next step” button is clicked, sending the original step to the API and returning the step in a more detailed format than on the recipe card. Users can seek advice on alternative ingredients or any query that they have on the specific step. The model processes these queries to deliver informative, engaging, and contextually relevant responses that enhance the user’s cooking experience, making complex culinary tasks accessible and enjoyable. This dual functionality not only improves user satisfaction but also encourages exploration and creativity in cooking.
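Because each model is asked to return its five recipes as JSON, the application has to turn a free-text model reply into a list of names. The sketch below shows one defensive way to do this. The exact response schema is not given in the paper, so the shape {"recipe_name": [...]} is an assumption, as is the helper name parse_recipe_names.

```python
import json


def parse_recipe_names(reply, limit=5):
    """Extract recipe names from a model reply that should contain a JSON object.

    Assumed schema (not confirmed by the paper): {"recipe_name": ["...", "..."]}.
    Returns an empty list when no usable JSON object is found.
    """
    # Models sometimes wrap JSON in extra prose, so slice from the first "{"
    # to the last "}" before parsing.
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        return []
    try:
        payload = json.loads(reply[start:end + 1])
    except json.JSONDecodeError:
        return []
    names = payload.get("recipe_name", [])
    if isinstance(names, str):
        names = [names]
    elif not isinstance(names, list):
        return []
    return [n.strip() for n in names if isinstance(n, str) and n.strip()][:limit]
```

Returning an empty list on malformed output lets the caller fall back to the other models' results instead of failing the whole request.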

3.2.5. Claude-3-Opus for Recipe Recommendations

Claude-3-Opus by Anthropic is another AI model integrated into Chef Dalle, primarily focused on generating recipe recommendations. Claude-3-Opus analyzes the same inputs as the GPT-4-Turbo to make recommendations from the filtered recipes. The API call to the model asks it to return five recipes in a JSON object for a user based on their ingredients, dietary restrictions, allergies, any liked or disliked recipes, and the filtered recipe list. The model’s ability to parse and understand natural language with high accuracy allows it to tailor recommendations that closely match user inputs.

3.2.6. Gemini 1.5 for Recipe Recommendations

Gemini 1.5, developed by Google, is utilized within the Chef Dalle platform to enhance the recipe recommendation process further. Similar to Claude-3-Opus and GPT-4-Turbo, Gemini 1.5 is tasked with analyzing the user’s ingredients, dietary preferences, allergies, and liked or disliked recipes. This AI model processes the same set of user inputs and the filtered list of recipes to generate its own set of recommendations. During the concurrent API request phase, Gemini 1.5 is instructed to return five potential recipes, formatted as a JSON object, which it selects based on its analysis of historical data patterns and user-specific needs. This approach not only ensures personalized recommendations but also integrates broader culinary trends and preferences, offering users dishes that are both tailored to their tastes and reflective of contemporary cooking styles.
The recommendations from Gemini 1.5 are then included in the pool of potential dishes that are re-evaluated by GPT-4-Turbo in the final decision-making step. This integration ensures that the recommendations are diverse and comprehensive, covering a wide array of culinary options that might appeal to different palates and dietary requirements.

3.3. Data Collection and Preparation

The recipe dataset is sourced from Kaggle’s “food-com-recipes-and-user-interactions” by shuyangli94. This dataset provides a comprehensive collection of recipes along with user interactions, but we only utilized the RAW_recipes.csv file, which contains 230,186 recipes.

3.3.1. Data Cleaning and Preprocessing

The dataset was processed in a Google Colab notebook using the Pandas library to focus on key features relevant to our recommendation system. Here is an outline of the cleaning and preprocessing steps:
  • Deduplication: removes any duplicate entries to ensure dataset uniqueness.
  • Column Selection: isolates relevant data such as recipe names, cooking times, tags, nutritional content, and ingredients.
  • Textual Cleaning and Normalization: ensures consistency in the data by removing special characters and standardizing the text to lowercase.
  • Handling Missing Values: drops entries with incomplete data to ensure integrity.
  • Nutrition Data Transformation: expands and categorizes nutritional information for detailed analysis.
  • Feature Aggregation: combines relevant attributes into a single feature to improve the matching algorithm’s efficacy.
  • RecipeID Assignment: assigns each recipe a unique identifier.
  • Data Integration: loads the CSV file into a Pandas data frame and ensures that recipe IDs are treated as strings.
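Several of the steps listed above can be sketched as follows. The authors performed the cleaning with Pandas in a Google Colab notebook; this dependency-free version with plain dictionaries is only an illustration. The column subset in KEEP, the "combined" feature, the sequential IDs, and the function names are assumptions, and the nutrition transformation is omitted.

```python
import re

# Assumed subset of the dataset's columns used for matching.
KEEP = ("name", "minutes", "tags", "ingredients")


def clean_text(value):
    """Lowercase, replace special characters with spaces, collapse whitespace."""
    lowered = re.sub(r"[^a-z0-9 ]", " ", value.lower())
    return re.sub(r"\s+", " ", lowered).strip()


def clean_recipes(rows):
    """Apply column selection, missing-value handling, normalization,
    deduplication, feature aggregation, and ID assignment to raw rows."""
    cleaned, seen = [], set()
    for row in rows:
        # Handling missing values: drop rows lacking any required column.
        if any(row.get(col) is None or str(row[col]).strip() == "" for col in KEEP):
            continue
        # Column selection plus textual normalization.
        record = {col: clean_text(str(row[col])) for col in KEEP}
        # Deduplication on the normalized values.
        key = tuple(record.values())
        if key in seen:
            continue
        seen.add(key)
        # Feature aggregation: one text field for matching.
        record["combined"] = " ".join(record[c] for c in ("name", "tags", "ingredients"))
        # Recipe ID assignment, stored as a string.
        record["recipe_id"] = str(len(cleaned) + 1)
        cleaned.append(record)
    return cleaned
```

Here deduplication runs after normalization, so rows that differ only in case or punctuation are also treated as duplicates; the paper lists deduplication first, which would keep such near-duplicates.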

3.3.2. Database Structure

User Model: Stores information about the user, including login, password, dietary preferences, allergies, saved recipes, and user feedback.
User Feedback Model: This collects recipe IDs, the user ID, and whether the user liked or disliked the recipe. The feedback is used to improve the recommendation engine’s accuracy continually.
Recipe Image Model: This manages the storage and retrieval of base64-encoded recipe images, since DALL-E 3 returns a URL that expires after an hour. When DALL-E 3 generates a new image, it is stored in the database for future reference with the matching recipe ID from the CSV file.
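The image caching behavior described for the Recipe Image Model can be sketched as a look-up-then-generate pattern. The application itself uses SQLAlchemy; this sketch swaps in the standard-library sqlite3 module to stay self-contained. The table and column names are hypothetical, and generate stands in for a function that calls DALL-E 3 and returns the raw image bytes.

```python
import base64
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) a small image cache keyed by recipe ID."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS recipe_image ("
        "recipe_id TEXT PRIMARY KEY, image_b64 TEXT NOT NULL)"
    )
    return conn


def get_image_b64(conn, recipe_id, generate):
    """Return the cached base64 image, generating and storing it on a miss."""
    row = conn.execute(
        "SELECT image_b64 FROM recipe_image WHERE recipe_id = ?", (recipe_id,)
    ).fetchone()
    if row is not None:
        return row[0]  # cache hit: no image generation needed
    # Cache miss: `generate` is a stand-in for the image-generation call.
    image_b64 = base64.b64encode(generate(recipe_id)).decode("ascii")
    conn.execute(
        "INSERT INTO recipe_image (recipe_id, image_b64) VALUES (?, ?)",
        (recipe_id, image_b64),
    )
    conn.commit()
    return image_b64
```

Storing the encoded bytes rather than the returned URL means the image stays available after the URL expires, at the cost of a larger database.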

3.4. Workflow for Recommendations

The workflow can be observed in Figure 5.

3.5. Implementation Challenges

During the development of Chef Dalle, several implementation challenges were encountered, particularly in handling diverse and complex dietary restrictions and ensuring the system’s scalability.
One of the primary challenges was accurately capturing and representing the wide range of dietary restrictions and preferences users might have. From common allergies to specific dietary philosophies, the system needed to be flexible enough to accommodate a vast array of constraints. To address this, Chef Dalle leverages the knowledge and capabilities of large language models (LLMs) through API integrations with OpenAI, Google Gemini, and Anthropic. These LLMs have a deep understanding of various dietary restrictions and can generate recipe recommendations that adhere to specific diets. By relying on the LLMs’ knowledge, Chef Dalle can handle a wide range of dietary restrictions without the need for extensive manual tagging or the filtering of recipes.
Scalability was another critical concern, as Chef Dalle aimed to serve a growing user base and maintain an expanding recipe database. To ensure the system’s scalability, the team implemented concurrent API requests to the multiple LLMs, allowing for efficient processing of user queries and the generation of recipe recommendations. This distributed approach helps to handle increased traffic and reduces the reliance on a single model.
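A minimal sketch of such concurrent requests is shown below, assuming a thread pool is an acceptable way to issue them. The three `ask_*` functions are hypothetical placeholders for the real provider clients, and `query_all` is an illustrative name rather than a function from the application.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for the provider API calls.
def ask_openai(prompt):
    return f"openai: {prompt}"


def ask_gemini(prompt):
    return f"gemini: {prompt}"


def ask_anthropic(prompt):
    return f"anthropic: {prompt}"


PROVIDERS = {"openai": ask_openai, "gemini": ask_gemini, "anthropic": ask_anthropic}


def query_all(prompt, providers=PROVIDERS, timeout_s=30):
    """Send the same prompt to every provider in parallel and collect the answers."""
    results = {}
    with ThreadPoolExecutor(max_workers=len(providers)) as pool:
        futures = {name: pool.submit(fn, prompt) for name, fn in providers.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result(timeout=timeout_s)
            except Exception:
                # A failed or slow provider yields None instead of aborting the batch.
                results[name] = None
    return results


answers = query_all("vegan pasta with tomatoes")
```

Because each request is mostly waiting on the network, running them in parallel keeps the total latency close to that of the slowest provider rather than the sum of all three.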
Furthermore, to optimize resource usage and minimize the need for constant image generation, Chef Dalle stores generated recipe images in a database. This approach allows for quick retrieval of previously generated images, reducing the load on the DALL-E 3 model and improving the system’s overall performance.
To further enhance scalability, the team plans to implement a system that allows users to provide their own API keys for the LLMs. This will distribute the workload across user-provided resources, enabling the system to scale more effectively as the user base grows.
As Chef Dalle continues to evolve and attract more users, the team will monitor the system’s performance and make necessary optimizations to ensure a smooth and responsive user experience.

3.6. Test Cases

To rigorously evaluate Chef Dalle’s effectiveness and versatility, a structured methodology was implemented, focusing on simulating a range of user interactions. This included testing for basic functionality, accessibility features, dietary preference handling, and user feedback adaptation. Each test case was designed to reflect potential real-world usage scenarios, ensuring comprehensive coverage of Chef Dalle’s capabilities. The methodology also accounted for system responsiveness on various devices and the accuracy of AI-driven functionalities like voice-to-text conversion and image recognition. In Table 1 below, ✓ stands for ‘As expected’.
These test cases illustrate Chef Dalle’s broad functionality spectrum, from accommodating specific dietary needs to leveraging AI for enhanced user interaction. Through systematic testing, Chef Dalle demonstrates a high degree of accuracy and reliability across different functionalities, underscoring its potential to significantly improve the culinary experience for users with diverse needs and preferences.

4. Results

4.1. Enhanced User Experience through Multimodal AI Integration

Chef Dalle introduces a novel approach to recipe discovery by employing AI-driven modalities that cater to a variety of user interactions. The integration of voice-to-text conversion, image recognition, and textual input methods offers a versatile and user-friendly platform. The dynamic generation of recipes significantly enhances the user experience by ensuring that users always receive relevant recommendations, even when their ingredient combinations are unique or uncommon. This feature not only expands the range of culinary possibilities but also reinforces the system’s adaptability and personalization. The high accuracy of ingredient recognition and voice-to-text conversion is expected to significantly reduce the complexity of finding recipes that align with users’ dietary preferences and restrictions. While initial user feedback is promising, the system’s design suggests a streamlined process that likely enhances overall user satisfaction and engagement.
It is important, however, to acknowledge the limited scope of user feedback at this stage. Future work could provide valuable insights into actual user engagement and the system’s impact on dietary habits, offering a more comprehensive understanding of Chef Dalle’s effectiveness in real-world scenarios.

4.2. Accessibility

Chef Dalle’s advanced features, including Whisper for voice recognition and GPT-Vision for ingredient identification, play a crucial role in making cooking and recipe discovery accessible to all users, including those with disabilities. For instance, voice recognition enables users with visual impairments to interact with the system hands-free, while image recognition assists users who may find manual ingredient entry challenging, as shown in Figure 6. This design philosophy underscores Chef Dalle’s commitment to inclusivity, potentially facilitating a wider adoption across diverse user demographics.

4.3. Promotion of Healthier Eating Habits

By providing personalized recipe recommendations based on individual dietary needs and preferences, Chef Dalle is positioned to influence users’ eating habits positively. The system’s ability to filter recipes according to dietary restrictions, allergies, or specific health-related goals suggests a direct impact on promoting healthier eating patterns. While empirical data from user engagement would provide concrete evidence, the theoretical basis indicates that Chef Dalle could serve as an invaluable tool in guiding users toward more nutritious and balanced meals.

4.4. Scope of Application

The scope of application for Chef Dalle is vast, touching various aspects of daily life and offering transformative solutions to cooking and meal planning challenges across different settings. At its core, Chef Dalle is designed not just as a recipe recommendation system but as a multifaceted tool that bridges the gap between dietary needs, culinary creativity, and accessibility.
In the Kitchen: Chef Dalle stands as an invaluable assistant in home kitchens, where it tailors recipes to the ingredients available, considers dietary restrictions and preferences, and even offers visual guides through DALL-E 3-generated images. This support is crucial for encouraging home cooking, exploring new cuisines, and ensuring nutritious meals are accessible and enjoyable to prepare.
Grocery Stores: Envision a scenario where shoppers at grocery stores can use Chef Dalle to scan items they intend to purchase and receive instant recipe recommendations based on those ingredients. This application not only enhances the shopping experience but also aids in meal planning and reducing food waste by suggesting recipes that use all the purchased ingredients.
Restaurants: For culinary professionals, Chef Dalle could serve as a source of inspiration, helping chefs to devise new dishes or adapt existing ones to meet contemporary dietary trends or ingredient availability. In restaurant settings, it could also offer personalized menu recommendations to patrons based on their dietary preferences, creating a customized dining experience.
Airplane Kitchens: The constraints of airplane kitchens, where space is limited and ingredient storage is challenging, make Chef Dalle a potential game changer. Providing recipes that can be executed within these constraints can enhance the quality of in-flight meals, offering passengers healthier, tastier, and more varied dining options.
Low-Income Families: One of the most impactful applications of Chef Dalle lies in its potential to assist low-income families. By generating recipes that are not only cost-effective but also nutritionally balanced and tailored to the limited ingredients they might have on hand, Chef Dalle can play a pivotal role in improving dietary habits and food security among economically disadvantaged groups.

5. Evaluation of Chef Dalle’s Recommendation System

5.1. User Studies

To rigorously validate Chef Dalle’s adaptability and effectiveness, we conducted comprehensive user studies involving 20 diverse participants. This section presents a mixed-method analysis.

5.2. Methodology

Our study engaged with participants during a social event, where they interacted with Chef Dalle by inputting both actual and hypothetical ingredients. Participants provided feedback through the like/dislike system on each recipe card used to improve recommendations. This feedback loop allowed us to gather valuable data to assess the system’s adaptability and performance.
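One simple way such a like/dislike loop could influence recommendations is sketched below. The exact adaptation algorithm is not specified here, so this is an assumption: disliked recipes are removed, and the remaining candidates are ordered by how many ingredients they share with liked recipes. The function name `rerank` and the sample data are hypothetical.

```python
def rerank(candidates, liked_ids, disliked_ids):
    """Drop disliked recipes and move recipes similar to liked ones to the front.

    candidates: list of dicts with an 'id' (str) and 'ingredients' (set of str).
    """
    # Collect the ingredients of liked recipes that appear among the candidates.
    liked_ingredients = set()
    for recipe in candidates:
        if recipe["id"] in liked_ids:
            liked_ingredients |= recipe["ingredients"]

    kept = [r for r in candidates if r["id"] not in disliked_ids]
    # Stable sort: larger overlap with liked ingredients comes first.
    return sorted(
        kept,
        key=lambda r: len(r["ingredients"] & liked_ingredients),
        reverse=True,
    )


candidates = [
    {"id": "1", "ingredients": {"beans", "rice", "cilantro"}},
    {"id": "2", "ingredients": {"pasta", "tomato", "basil"}},
    {"id": "4", "ingredients": {"rice", "egg"}},
    {"id": "3", "ingredients": {"pasta", "tomato", "garlic"}},
]
ranked_ids = [r["id"] for r in rerank(candidates, liked_ids={"2"}, disliked_ids={"1"})]
# ranked_ids == ["2", "3", "4"]
```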
To evaluate the system’s accuracy and effectiveness, we employed a mixed-method approach, combining quantitative analysis of user feedback with qualitative observations. We tracked the number of likes and dislikes for each recipe recommendation and calculated the average rating for each user’s session. Additionally, we conducted semi-structured interviews with participants to gain deeper insights into their experience with Chef Dalle and their perception of the app’s ability to adapt to their preferences.
Over two days, participants engaged in a total of 120 sessions, generating 600 recipe recommendations. The quantitative analysis revealed a notable increase in user satisfaction:
  • The average rating for recipe recommendations rose significantly from 3.2 (out of 5) in the initial session to 4.5 in the final session. This indicates that Chef Dalle effectively adapted to user preferences over time, leading to increased satisfaction.
  • The average number of likes per session grew from 2.1 to 4.3, while the average number of dislikes decreased from 1.8 to 0.4. This further supports the conclusion that Chef Dalle’s recommendations became more aligned with user preferences as the system learned.
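For orientation, the figures reported above can be turned into a few derived quantities. The short calculation below uses only the numbers stated in this section; the per-participant average assumes sessions were spread evenly, which is not stated.

```python
participants, sessions, recommendations = 20, 120, 600

# Averages implied by the totals (assuming an even spread across participants).
sessions_per_participant = sessions / participants   # 6.0
recs_per_session = recommendations / sessions        # 5.0

# Change in average rating between the initial and final sessions.
initial_rating, final_rating = 3.2, 4.5
rating_gain = round(final_rating - initial_rating, 2)                  # 1.3
relative_gain_pct = round((final_rating - initial_rating) / initial_rating * 100, 1)  # 40.6

# Share of likes among all like/dislike responses per session.
initial_like_share_pct = round(2.1 / (2.1 + 1.8) * 100, 1)  # 53.8
final_like_share_pct = round(4.3 / (4.3 + 0.4) * 100, 1)    # 91.5
```

In other words, the average rating rose by about 41%, and the share of likes among explicit responses rose from roughly 54% to roughly 91%.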

5.3. Quantitative Analysis

5.3.1. Bar Chart of Average Ratings per Session

User satisfaction was measured by the average ratings provided for each recipe recommendation over multiple sessions. This progression is illustrated in the bar chart below (Figure 7).

5.3.2. Line Graph of Likes and Dislikes per Session

We also tracked the number of likes and dislikes that each recipe received per session to gauge user engagement and system accuracy. As depicted in Figure 8, there was a consistent increase in likes and a decrease in dislikes across the sessions. This trend highlights Chef Dalle’s learning capabilities and its increasing alignment with user preferences.

5.3.3. Pie Chart of User Dietary Preferences

The diversity of dietary preferences among participants was captured to demonstrate the system’s ability to handle varied dietary requirements effectively (Figure 9).

5.4. Qualitative Analysis

Qualitative observations from user interviews provided further evidence of Chef Dalle’s effectiveness. Participants consistently praised the app’s ability to learn from their interactions and tailor recommendations to their taste preferences and dietary requirements. Several users highlighted specific instances where Chef Dalle suggested recipes that perfectly matched their desired cuisine or incorporated ingredients they had previously liked.

Feedback Themes from Qualitative Interviews

The qualitative feedback revealed several key themes:
  • Personalization and Adaptability: users appreciated how Chef Dalle adapted recommendations based on their interactions and tastes.
“I was impressed by how quickly Chef Dalle picked up on my love for Mediterranean flavors. After just a few likes, it started recommending dishes with feta, olives, and herbs that I absolutely loved.”
  • Dietary Accommodation: users with specific dietary restrictions noted that Chef Dalle was helpful in suggesting suitable recipes and substitutions.
“As someone with dietary restrictions, I often struggle to find suitable recipes. Chef Dalle’s ability to filter recommendations based on my preferences and provide accurate substitution suggestions was a game-changer.”
  • Engagement and Interaction: participants highlighted the interactive aspects, such as the ability to ask questions about cooking steps or request alternative ingredients.
“The interactive cooking guidance not only helped me cook but also made the whole process much more fun and engaging. I could ask for different ingredient options, and Chef Dalle would instantly adjust the recipe.”
  • Visual Appeal and Clarity: the high-quality images provided by Chef Dalle were frequently mentioned as helpful for understanding the final dish.
“The images are so vivid and clear; it’s almost like I can taste the dish before I even start cooking. It really helps to have a visual goal in mind.”
These themes demonstrate the system’s strong performance in personalization, dietary accommodation, user engagement, and visual communication, all of which are crucial for a successful recipe recommendation system.

5.5. Comparative Analysis against Existing Platforms

To delineate Chef Dalle’s innovative edge in the recipe recommendation landscape, we compared its unique features against those of established platforms—Supercook, BigOven, and Yummly. This analysis underscores Chef Dalle’s superior capabilities in personalization, user interaction, and technological integration.

5.5.1. Chef Dalle’s Unique Features

Chef Dalle distinguishes itself through several pioneering features that directly enhance user experience and adaptability:
  • Multimodal Input: Unlike any of its competitors, Chef Dalle integrates voice, image, and text inputs, making recipe discovery more accessible and user-friendly. This contrasts sharply with platforms like Supercook and BigOven, which rely solely on text-based inputs.
  • Dynamic Recipe Generation: Chef Dalle’s ability to create new recipes on the fly offers a unique solution that is not addressed by Yummly or Supercook, ensuring that users receive personalized recommendations regardless of how unusual their ingredient combinations are.
  • Interactive Cooking Guidance: With the integration of GPT-4-Turbo, Chef Dalle provides interactive, step-by-step cooking assistance. This feature is more advanced than Yummly’s guided cooking mode, as it includes dynamic ingredient substitutions and real-time Q&A.
  • Multi-API Recommendation Engine: leveraging multiple AI models, Chef Dalle offers a broader and more accurate culinary exploration than any single-API system used by BigOven or Supercook, enhancing the precision of personalized recommendations.
  • Visual Appeal: the use of DALL-E 3 to generate high-quality recipe images sets Chef Dalle apart, offering users a visual taste of what to expect, a feature not fully utilized by any of the compared platforms.
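To make the multi-API idea concrete, the sketch below shows one possible way to merge ranked suggestions from several models into a single list. How the model outputs are actually combined is not detailed in this section, so the Borda-style point scheme, the function name `pick_by_vote`, and the sample rankings are all assumptions for illustration.

```python
from collections import Counter


def pick_by_vote(rankings, top_k=3):
    """Merge per-model rankings into one list using Borda-style points.

    rankings: mapping of model name -> list of recipe IDs, best first.
    """
    votes = Counter()
    for ranked_ids in rankings.values():
        for position, recipe_id in enumerate(ranked_ids[:top_k]):
            # First place earns top_k points, second top_k - 1, and so on.
            votes[recipe_id] += top_k - position
    # Highest score first; ties are broken by recipe ID for determinism.
    return sorted(votes, key=lambda rid: (-votes[rid], rid))


rankings = {
    "openai": ["7", "2", "9"],
    "gemini": ["2", "7", "5"],
    "anthropic": ["2", "9", "7"],
}
merged = pick_by_vote(rankings)
# merged == ["2", "7", "9", "5"]
```

A scheme like this rewards recipes that several models rank highly, which reduces the influence of any single model's outlier suggestion.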

5.5.2. Visual Comparative Analysis

Table 2 presents a feature comparison matrix between Chef Dalle and its competitors: Supercook, BigOven, and Yummly. As shown in the table, Chef Dalle stands out in terms of input modalities, recipe personalization, user interaction, technological sophistication, visual presentation, and recipe sources.

5.5.3. Conclusions

Chef Dalle’s multimodal input, dynamic recipe generation, and advanced AI integration differentiate it from existing platforms. While competitors offer valuable features, Chef Dalle excels in personalization, adaptability, and user engagement, establishing it as a leading-edge solution in culinary technology.

6. Discussion

Chef Dalle’s deployment of multimodal AI technologies—voice recognition, image processing, and textual analysis—significantly elevates the culinary experience. This integration reflects a growing trend in HCI, where user convenience and accessibility are paramount. The seamless interaction facilitated by these modalities not only simplifies the recipe discovery process but also embodies a shift toward more intuitive, user-centered design in digital platforms.
The integration of multiple AI technologies, specifically OpenAI, Google Gemini, and Anthropic APIs, not only fortifies Chef Dalle’s recipe recommendations but also significantly enhances the system’s adaptability to diverse user needs. This multi-API approach allows Chef Dalle to effectively handle a wide array of dietary preferences and restrictions, showcasing the system’s capability to deliver personalized culinary solutions with unprecedented accuracy and user satisfaction.
Chef Dalle’s emphasis on accessibility highlights the potential of AI to democratize cooking, making it more approachable for individuals with varying abilities and dietary needs. By breaking down barriers to entry, such as the complexity of recipe discovery and adaptation to specific nutritional requirements, Chef Dalle aligns with broader societal goals of inclusivity. Continued advancements in AI could further enhance these capabilities, offering even more personalized and accessible culinary aids. The discussion around accessibility also raises questions about digital divide issues, where technology availability and digital literacy may limit access to such innovative solutions.
The potential of Chef Dalle to promote healthier eating habits invites a broader discussion on the role of technology in public health. By providing tailored recipe recommendations, the system has the potential to influence dietary choices directly, encouraging home cooking and the consumption of nutritious foods. This aspect of Chef Dalle’s functionality underscores the importance of integrating nutritional science with AI development, ensuring that recommendations not only cater to user preferences but also align with health guidelines.
Chef Dalle’s innovative approach to recipe recommendations has the potential to play a significant role in promoting sustainability, particularly in the context of reducing food waste. By offering personalized recipe suggestions based on the specific ingredients that users already have at their disposal, Chef Dalle encourages the efficient use of food items that might otherwise be overlooked or discarded. This not only helps in minimizing food waste but also aids individuals in discovering new and creative ways to prepare meals from their existing pantry items. Such a system can be particularly beneficial in households where leftovers and odd ingredients accumulate without a clear plan for use. By intelligently suggesting recipes that utilize these ingredients, Chef Dalle contributes to a more sustainable kitchen practice, ensuring that food is consumed more mindfully and wasted less. This aligns with broader environmental goals of reducing the global food waste footprint, as reducing waste at the consumer level is critical in addressing the overall challenge of food sustainability.
Exploring the potential of Chef Dalle underscores the critical need for robust security measures and ethical considerations, especially concerning data privacy and model integrity. Addressing these issues is crucial as AI systems become more integrated into daily life. The threat of model poisoning and privacy breaches highlights the importance of prioritizing security and ethics in AI development—an area ripe for further research.
The development and deployment of AI-driven systems like Chef Dalle also raise significant concerns about potential biases and privacy. Biases in AI models can perpetuate existing inequalities or skew recommendations, affecting their accuracy and fairness. Combining regular audits with the inclusion of diverse perspectives in the development process is essential to mitigate these biases and ensure the system’s effectiveness. Additionally, Chef Dalle collects sensitive information like dietary preferences, making the security of this data and transparency about its use paramount. Implementing robust security measures, such as encryption and secure storage practices, and clearly communicating these practices to users will help safeguard privacy and bolster user trust.
As AI technologies become staples in daily life, prioritizing ethical considerations and developing responsible AI guidelines is essential. This includes ongoing monitoring of the system’s impact, regular ethical reviews, and engaging with diverse stakeholders to align the technology with the broader interests of users and society. By addressing these broader implications and ethical considerations, Chef Dalle can enhance the culinary experience while contributing to the responsible development and deployment of AI technologies in the food and health domains.

7. Conclusions and Future Research

Chef Dalle has embarked on an innovative journey to transform the culinary landscape through multimodal AI integration, redefining the recipe discovery and meal preparation process. By adeptly combining voice-to-text conversion, image recognition, and textual input, Chef Dalle has not only made cooking more accessible but has also significantly streamlined the recipe search process. This seamless fusion of technologies facilitates a user experience that is both engaging and highly intuitive, aligning perfectly with modern expectations of digital interaction.
The system’s dedication to accessibility and inclusivity stands as a testament to the transformative potential of AI in democratizing cooking, making it accessible to individuals regardless of their abilities or dietary constraints. Chef Dalle’s approach goes beyond mere convenience, embodying a deep commitment to fostering healthier eating habits and sustainable living. By leveraging AI to provide personalized recipe suggestions, Chef Dalle encourages the efficient use of ingredients, potentially reducing food waste and guiding users toward more nutritious meal choices.
The ability to generate new recipes represents a promising avenue for future research and development. Refining the recipe generation process to incorporate user feedback, cultural preferences, and nutritional considerations could further enhance the system’s personalization and utility. Additionally, exploring the potential of user-generated recipes and their integration into the platform could foster a vibrant culinary community and contribute to the continuous expansion of Chef Dalle’s recipe database.
Looking forward, the scope of Chef Dalle’s application is vast, potentially impacting various sectors, including home kitchens, grocery stores, restaurants, and beyond. Its capacity to adapt to constrained environments like airplane kitchens and support low-income families highlights its versatility and alignment with broader societal and environmental goals. The utilization of multiple AI technologies, in particular, the innovative integration of APIs from OpenAI, Google Gemini, and Anthropic, has been pivotal in achieving high accuracy and adaptability in Chef Dalle’s recipe recommendations. This multi-API approach allows Chef Dalle to cater to diverse dietary preferences and restrictions precisely, ensuring every recommendation is optimized for user satisfaction. The success of this strategy encourages us to explore further integrations and collaborations with other advanced AI technologies to enhance the system’s capabilities continuously. The anticipated integration with technologies such as Sora for generating instructional cooking videos promises to enhance the culinary experience further, making sophisticated recipes accessible to all skill levels.
However, the journey ahead is not without its challenges. Robust security measures and ethical considerations, especially data privacy and model integrity, are paramount to ensuring the system’s continued trustworthiness and effectiveness. Addressing these concerns will be crucial as Chef Dalle becomes increasingly integrated into daily life.
In conclusion, Chef Dalle represents a significant leap forward in leveraging AI to enhance human-computer interaction within the culinary domain. Its innovative approach not only improves the user experience but also promotes healthier eating habits, accessibility, and sustainability. As we look to the future, continuous improvement and research will be vital to unlocking Chef Dalle’s full potential, making it an indispensable tool in kitchens worldwide and a leading example of how AI can be harnessed to improve our quality of life.
Chef Dalle’s potential integration with upcoming technologies, such as Sora for instructional video generation, presents exciting possibilities for enhancing the cooking experience. This prospective feature could revolutionize how users learn to cook, making complex recipes accessible through step-by-step video guides. There are also plans to create a feature that integrates an AI assistant to help take a user through each step of the cooking process for recipes, including setting timers to remind the user that it is time for the next step.
Additionally, future research could explore the specific impact of each modality on user engagement and satisfaction, providing insights into how similar approaches could be optimized for other applications. Further studies could investigate the long-term impact of such systems on dietary behavior and health outcomes.

Supplementary Materials

The supporting information can be found at: https://www.youtube.com/watch?v=F4pXGr25V50, Video S1: Chef Dalle Recipe Recommender. Source code can be found at https://github.com/BStainz/ChefDalle.

Author Contributions

Conceptualization, B.H.; methodology, B.H.; software, B.H.; validation, Y.K. and J.J.L.; formal analysis, B.H.; investigation, B.H.; resources, Y.K.; data curation, P.M.; writing—original draft preparation, B.H.; writing—review and editing, Y.K., J.J.L. and P.M.; visualization, B.H.; supervision, P.M.; project administration, Y.K.; funding acquisition, Y.K. All authors have read and agreed to the published version of the manuscript.

Funding

This study was not directly supported by any agency.

Data Availability Statement

The data are only available on request due to privacy restrictions (the personal nature of human–chatbot communication).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Facts Sheet—Malnutrition. Available online: https://www.who.int/news-room/fact-sheets/detail/malnutrition (accessed on 15 March 2024).
  2. API Reference—OpenAI API. Available online: https://platform.openai.com/docs/api-reference (accessed on 15 March 2024).
  3. Speech to Text—OpenAI API. Available online: https://platform.openai.com/docs/guides/speech-to-text/quickstart (accessed on 15 March 2024).
  4. Vision—OpenAI API. Available online: https://platform.openai.com/docs/guides/vision (accessed on 15 March 2024).
  5. Image Generation—OpenAI API. Available online: https://platform.openai.com/docs/guides/images (accessed on 15 March 2024).
  6. SuperCook—Zero Waste Recipe Generator. Available online: https://www.supercook.com/#/desktop (accessed on 15 March 2024).
  7. Yummly: Personalized Recipe Recommendation and Search. Available online: https://www.yummly.com/ (accessed on 24 March 2024).
  8. 1,000,000+ Recipes, Meal Planner and Grocery List|BigOven. Available online: https://www.bigoven.com/ (accessed on 15 March 2024).
  9. ChatGPT. Available online: https://chatgpt.com/ (accessed on 24 March 2024).
  10. Gemini. Available online: https://gemini.google.com/app (accessed on 24 March 2024).
  11. Claude. Available online: https://claude.ai/chats (accessed on 24 March 2024).
  12. Kumar, Y.; Delgado, J.; Kupershtein, E.; Hannon, B.; Gordon, Z.; Li, J.J.; Morreale, P. AssureAIDoctor—A Bias-Free AI Bot. In Proceedings of the 2023 International Symposium on Networks, Computers and Communications (ISNCC), Doha, Qatar, 23–26 October 2023; pp. 1–6.
  13. Villalobos, W.; Kumar, Y.; Li, J.J.; Morreale, P. The Multilingual Eyes Multimodal Traveler’s App. In Proceedings of the 9th International Congress on Information and Communication (ICICT 2024), London, UK, 19–22 February 2024; Available online: https://9thinternationalcongressoni.sched.com/event/1aGPZ/the-multilingual-eyes-multimodal-travelers-app (accessed on 24 March 2024).
  14. Kupershtein, E.; Kumar, Y.; Manikandan, A.; Morreale, P.; Li, J.J. ChatGPT as a Game-Changer for Embedding Emojis in Faculty Feedback. In Proceedings of the 2023 Congress in Computer Science, Computer Engineering & Applied Computing (CSCE), Las Vegas, NV, USA, 24–27 July 2023; pp. 1039–1046.
  15. Plant Jammer—Inspiration Google Play Page. Available online: https://play.google.com/store/apps/details?id=com.plantjammer.plantjammer&hl=en_US (accessed on 15 March 2024).
  16. AI Recipe Generator (100% Free)|Lets Foodie. Available online: https://letsfoodie.com/ai-recipe-generator/ (accessed on 15 March 2024).
  17. DishGen|AI Recipes|AI Recipe Generator. Available online: https://www.dishgen.com/ (accessed on 15 March 2024).
  18. ChefGPT—Your AI-Powered Personal Chef. Available online: https://www.chefgpt.xyz/ (accessed on 15 March 2024).
  19. Ray-Ban Meta Smart Glasses|Meta Store|Meta Store. Available online: https://www.meta.com/smart-glasses/ (accessed on 24 March 2024).
  20. Galaxy AI|Mobile AI on Galaxy S24 Ultra|Samsung, US. Available online: https://www.samsung.com/us/smartphones/galaxy-s24-ultra/galaxy-ai/ (accessed on 24 March 2024).
  21. Hwang, A.; Badreddine, S.; Gifford, F.; Besold, T.R. Recipe 2.0: Information Presentation for AI-Supported Culinary Idea Generation. In Proceedings of the 14th International Conference on Computational Creativity (ICCC), Waterloo, ON, Canada, 19–23 June 2023.
  22. Kansaksiri, P.; Panomkhet, P.; Tantisuwichwong, N. Smart cuisine: Generative recipe & chatgpt powered nutrition assistance for sustainable cooking. Procedia Comput. Sci. 2023, 225, 2028–2036.
  23. Değerli, A.H.; Tatlisu, N.B. Cooking with ChatGPT and Bard: A Study on Competencies of AI Tools on Recipe Correction, Adaption, Time Management and Presentation. J. Tour. Gastron. Stud. 2024, 11, 2658–2673.
  24. Niszczota, P.; Rybicka, I. The credibility of dietary advice formulated by CHATGPT: Robo-diets for people with food allergies. Nutrition 2023, 112, 112076.
  25. Qarajeh, A.; Tangpanithandee, S.; Thongprayoon, C.; Suppadungsuk, S.; Krisanapan, P.; Aiumtrakul, N.; Garcia Valencia, O.A.; Miao, J.; Qureshi, F.; Cheungpasitporn, W. AI-powered renal diet support: Performance of CHATGPT, Bard Ai, and Bing Chat. Clin. Pract. 2023, 13, 1160–1172.
  26. Pravin, M.; Sundarapandiyan, R. Integrating Artificial Intelligence in food systems: Future trends, innovations, and prospects for sustainable development and enhanced culinary experiences. Int. J. Multidimens. Res. Perspect. 2024, 2, 49–62.
  27. Chen, Z.; Deng, Y.; Wu, Y.; Gu, Q.; Li, Y. Towards understanding a mixture of experts in deep learning. arXiv 2022, arXiv:2208.02813.
  28. Masoudnia, S.; Ebrahimpour, R. Mixture of experts: A literature survey. Artif. Intell. Rev. 2014, 42, 275–293.
  29. Wu, X.; Huang, S.; Wang, W.; Wei, F. Multi-Head Mixture-of-Experts. arXiv 2024, arXiv:2404.15045.
  30. Krishnamurthy, Y.; Watkins, C.; Gaertner, T. Improving Expert Specialization in Mixture of Experts. arXiv 2023, arXiv:2302.14703.
  31. Yang, J.C.; Korecki, M.; Dailisan, D.; Hausladen, C.I.; Helbing, D. LLM Voting: Human Choices and AI Collective Decision Making. arXiv 2024, arXiv:2402.01766.
Figure 1. Chef Dalle app. (a) Home page; (b) recipe card; (c) profile page.
Figure 1. Chef Dalle app. (a) Home page; (b) recipe card; (c) profile page.
Computers 13 00156 g001
Figure 2. Chatbox Assistant. (a) Asks user to set timer; (b) countdown begins; (c) timer goes off.
Figure 2. Chatbox Assistant. (a) Asks user to set timer; (b) countdown begins; (c) timer goes off.
Computers 13 00156 g002
Figure 3. Chef Dalle system architecture.
Figure 3. Chef Dalle system architecture.
Computers 13 00156 g003
Figure 4. The factors and methods considered by the top LLMs to obtain the best recipe.
Figure 4. The factors and methods considered by the top LLMs to obtain the best recipe.
Computers 13 00156 g004
Figure 5. App workflow for recommendations.
Figure 5. App workflow for recommendations.
Computers 13 00156 g005
Figure 6. App GUI snapshot: (a) output from image identification; (b) image uploaded.
Figure 6. App GUI snapshot: (a) output from image identification; (b) image uploaded.
Computers 13 00156 g006
Figure 7. Improvement in average ratings over multiple sessions.
Figure 8. The trend of likes and dislikes over two sessions.
Figure 9. Distribution of dietary preferences among participants.
Table 1. Test cases for Chef Dalle.
| Case | Description | Input Data | Expected Output | Test Result |
|---|---|---|---|---|
| 1 | Basic recipe recommendation using a single ingredient | User inputs “pasta” | System recommends pasta recipes | |
| 2 | Allergy filtering | User inputs “pasta” as ingredients and “peanut” as an allergy | System excludes recipes containing peanuts | |
| 3 | Voice-to-text conversion accuracy for multiple items: ingredients | User says “beans, onions, tomatoes, cilantro” into the microphone | System accurately converts the spoken words to “tomatoes, onions” separated by comma in the input field | |
| 4 | Voice-to-text conversion accuracy for a single item: allergy | User says “vegan” into microphone | System accurately converts the spoken words to “vegan” ingredients in the input field | |
| 5 | Dietary preference accommodation | User inputs “vegan” as diet and inputs “pasta” as “ingredients” | System recommends vegan pasta recipes | |
| 6 | Image recognition for ingredient identification | User uploads an image of “flour, cinnamon sticks, butter, baking powder, sugar, brown sugar, eggs, milk” | System identifies “flour, cinnamon sticks, butter, baking powder, sugar, brown sugar, eggs, milk” and adds ingredients to input field | |
| 7 | Recipe image generation with DALL-E 3 | User requests a recipe without an existing image | System generates and displays a new recipe image. | Some images have a camera in the photo. |
| 8 | User dislike feedback adaptation | User dislikes a “beans, rice, cilantro” recipe, then searches for “beans, rice, cilantro” | System does not show disliked recipe and returns a different set of recommendations | |
| 9 | User like feedback adaptation | User likes a pasta recipe, then searches for pasta recipes again | System prioritizes similar pasta recipes in recommendations | |
| 10 | Usability on mobile devices | Accessing Chef Dalle on a smartphone | System is fully functional with responsive design | |
| 11 | Multi-ingredient recipe suggestions | User inputs “chicken, lemon, garlic” | System suggests recipes that use all three ingredients | |
| 12 | Handling rare ingredients | User inputs “kohlrabi” | System suggests appropriate recipes or alternatives if direct matches are not found | |
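Test cases 2, 8, and 11 in Table 1 describe filtering behavior that can be illustrated with a small sketch. The code below is not the Chef Dalle implementation: the `Recipe` record, the `recommend` function, and the `CATALOG` entries are hypothetical names and data introduced only for illustration, and allergens are matched by exact ingredient name, which is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recipe:
    """Hypothetical recipe record; not the paper's actual data model."""
    name: str
    ingredients: frozenset


def recommend(recipes, ingredients, allergies=(), disliked=()):
    """Return names of recipes that use every requested ingredient,
    contain no listed allergen, and were not previously disliked."""
    wanted = {i.strip().lower() for i in ingredients}
    banned = {a.strip().lower() for a in allergies}
    skipped = set(disliked)
    return [
        r.name
        for r in recipes
        if wanted <= r.ingredients           # case 11: all ingredients present
        and not (banned & r.ingredients)     # case 2: exclude allergens
        and r.name not in skipped            # case 8: hide disliked recipes
    ]


# Illustrative catalog; the entries are invented for this example.
CATALOG = [
    Recipe("Peanut noodles", frozenset({"pasta", "peanut", "soy sauce"})),
    Recipe("Garlic pasta", frozenset({"pasta", "garlic", "olive oil"})),
    Recipe("Lemon garlic chicken", frozenset({"chicken", "lemon", "garlic"})),
]

print(recommend(CATALOG, ["pasta"], allergies=["peanut"]))     # case 2
print(recommend(CATALOG, ["pasta"], disliked=["Garlic pasta"]))  # case 8
print(recommend(CATALOG, ["chicken", "lemon", "garlic"]))      # case 11
```

Expressing each constraint as a set operation keeps the three test cases independent, so any one of them can be checked in isolation. A production system would likely need fuzzier allergen matching (for example, treating "peanut butter" as containing peanuts), which this sketch does not attempt.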
Table 2. Feature comparison matrix.
| Feature | Chef Dalle | Supercook | BigOven | Yummly |
|---|---|---|---|---|
| Input Modalities | Voice, image, and text | Text | Text | Text |
| Recipe Personalization | Dynamic recipe generation, multi-API engine, and continuous learning | Limited, ingredient-based | Limited | Advanced filtering and personalization |
| User Interaction | Chatbox Assistant and real-time cooking guidance | Missing ingredient suggestions | Social features and recipe clipping | Step-by-step cooking mode and video recipes |
| Technological Sophistication | Multiple AI models and DALL-E 3 for image generation | Basic | Large recipe database and menu planning | Personalized recommendations and advanced filtering |
| Visual Presentation | High-quality images | Visual recipe organization | Visual recipe organization | Some video recipes |
| Recipe Sources | User-provided ingredients, existing dataset, and AI-generated recipes | User-provided ingredients and curated database | User-generated, curated database, and web import | Curated database |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Hannon, B.; Kumar, Y.; Li, J.J.; Morreale, P. Chef Dalle: Transforming Cooking with Multi-Model Multimodal AI. Computers 2024, 13, 156. https://doi.org/10.3390/computers13070156

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.