ChatGPT Translation of Program Code for Image Sketch Abstraction
Abstract
1. Introduction
2. Related Work
3. Problem Setting and the Skeleton App
4. Methodology of the M-to-PY Conversion
4.1. The M-to-PY Methodology
- Select LLM(s) to work with;
- Understand the original MATLAB code with the help of LLM(s);
- Identify conversion challenges;
- Develop the M-to-PY converter prototype with the help of LLM(s);
- Develop an associated App (optional);
- Test and refine the converter using test cases generated by LLM(s);
- Evaluate and learn from the result;
- Compare the results with existing tools.
- (1) Complex Data Structure: Linked List;
- (2) Complex Algorithm: Quick Sort;
- (3) Complex Library: NumPy;
- (4) Error-Prone Code: Recursion;
- (5) Multithreading;
- (6) Modular code.
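To make test case (2) more concrete, the following is a hypothetical example of what a Quick Sort conversion check could look like. The MATLAB snippet in the comment, the function name `quick_sort`, and the helper `check_conversion` are illustrative assumptions and are not taken from the authors' test material. The sketch also shows one typical conversion pitfall, the move from 1-based to 0-based indexing.

```python
# Hypothetical MATLAB input for the converter (illustrative only):
#
#   function sorted = quick_sort(v)
#       if numel(v) <= 1
#           sorted = v;
#           return;
#       end
#       pivot = v(1);              % MATLAB indexing starts at 1
#       rest = v(2:end);
#       sorted = [quick_sort(rest(rest < pivot)), pivot, ...
#                 quick_sort(rest(rest >= pivot))];
#   end


def quick_sort(values):
    """Python counterpart of the MATLAB function sketched above."""
    if len(values) <= 1:
        return list(values)
    pivot = values[0]              # Python indexing starts at 0
    rest = values[1:]
    smaller = [x for x in rest if x < pivot]
    larger_or_equal = [x for x in rest if x >= pivot]
    return quick_sort(smaller) + [pivot] + quick_sort(larger_or_equal)


def check_conversion(cases):
    """Compare the translated function with Python's built-in sort."""
    return all(quick_sort(case) == sorted(case) for case in cases)
```

Checking the translated function against a trusted reference such as `sorted` is one simple way to decide whether a test case passes.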
- Choosing programming language and database;
- Designing the Skeleton App GUI;
- Developing the Skeleton App prototype;
- Testing the Skeleton App;
- Refining the Skeleton App;
- Comparing with existing tools;
- Creating test cases;
- Evaluating the results.
4.2. Experiments with LLMs
4.3. Pair Programming with LLMs
4.4. LLMs Test Generation
4.5. M-to-PY Conversion with LLMs
5. Motion Detection with ChatGPT-4
Algorithm 1: High-level pseudocode of the case study
Input: A GIF file containing multiple frames
Output: GIF files with outlined objects, skeletons of objects, prominent objects circled, and motion detection

    Function process_gif(input_gif):
        frames = load_gif(input_gif)
        max_width, max_height = get_max_dimensions(frames)
        outlined_frames = []
        skeletons_frames = []
        prominent_objects_frames = []
        motion_detection_frames = []
        previous_skeleton = None
        for frame in frames:
            resized_frame = resize_frame(frame, max_width, max_height)
            outlined_frame = create_outline(resized_frame)  //Outline Creation
            outlined_frames.append(outlined_frame)
            skeleton = create_skeleton(outlined_frame)  //Skeleton Creation
            skeletons_frames.append(skeleton)
            prominent_object_frame = highlight_prominent_object(outlined_frame)  //Prominent Object Highlighting
            prominent_objects_frames.append(prominent_object_frame)
            if previous_skeleton is not None:  //Motion Detection
                motion_frame = detect_motion(previous_skeleton, skeleton)
                motion_detection_frames.append(motion_frame)
            previous_skeleton = skeleton
        outlined_objects_gif = save_gif(outlined_frames)
        skeletons_gif = save_gif(skeletons_frames)
        prominent_objects_gif = save_gif(prominent_objects_frames)
        motion_detection_gif = save_gif(motion_detection_frames)
        return outlined_objects_gif, skeletons_gif, prominent_objects_gif, motion_detection_gif
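The control flow of Algorithm 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the four stage functions are passed in as parameters because the pseudocode does not define them, and loading, resizing, and saving of GIFs are left out. The name `process_frames` is hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def process_frames(
    frames: Sequence[Frame],
    create_outline: Callable[[Frame], Frame],
    create_skeleton: Callable[[Frame], Frame],
    highlight_prominent_object: Callable[[Frame], Frame],
    detect_motion: Callable[[Frame, Frame], Frame],
) -> Tuple[List[Frame], List[Frame], List[Frame], List[Frame]]:
    """Run the four per-frame stages of Algorithm 1 on already-loaded frames."""
    outlined, skeletons, prominent, motion = [], [], [], []
    previous_skeleton: Optional[Frame] = None
    for frame in frames:
        outline = create_outline(frame)
        outlined.append(outline)
        skeleton = create_skeleton(outline)
        skeletons.append(skeleton)
        prominent.append(highlight_prominent_object(outline))
        # Motion needs two skeletons, so the first frame yields no motion frame.
        if previous_skeleton is not None:
            motion.append(detect_motion(previous_skeleton, skeleton))
        previous_skeleton = skeleton
    return outlined, skeletons, prominent, motion
```

One consequence visible in the sketch is that the motion detection output contains one frame fewer than the other three outputs.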
Algorithm 2: Object Outlining in GIF Frames
Input: GIF file
Output: GIF with outlined objects

    frames ← [frame for frame in imageio.get_reader(GIF path)]  //Read GIF frames
    max_width ← max([frame.width for frame in frames])  //Determine frame dimensions
    max_height ← max([frame.height for frame in frames])  //to standardize the frame size
    outlines ← []  //Initialize output: a list to store the outlined frames
    for frame in frames do  //Process each frame
        frame ← resize_with_padding(frame, max_width, max_height)  //Resize frame
        gray ← cv2.cvtColor(frame, cv2.COLOR_RGB2GRAY)  //Convert to grayscale
        blurred ← cv2.GaussianBlur(gray, (5, 5), 0)  //Apply Gaussian blur
        edges ← cv2.Canny(blurred, 50, 150)  //Apply the Canny edge detector
        kernel ← np.ones((5, 5), np.uint8)  //Morphological operations
        dilated ← cv2.dilate(edges, kernel, iterations = 2)  //Apply dilation
        eroded ← cv2.erode(dilated, kernel, iterations = 2)  //Apply erosion to close gaps
        //Find contours: extract contours from the processed image
        contours, _ ← cv2.findContours(eroded, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        outline ← cv2.drawContours(frame.copy(), contours, -1, (0, 255, 0), 2)  //Draw the contours
        outlines.append(outline)  //Add the outlined frame to the list
    end for
    Save as GIF: save the outlined frames as a new GIF file.
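Algorithm 2 calls `resize_with_padding` without defining it. One possible reading is sketched below, assuming frames are NumPy arrays of shape (height, width, channels), so that the size comes from `shape` rather than from `width` and `height` attributes. The sketch only pads and does not scale, which is an assumption; the helper `max_dimensions` is also hypothetical.

```python
import numpy as np


def max_dimensions(frames):
    """Return (max_width, max_height) over frames stored as H x W x C arrays."""
    max_height = max(f.shape[0] for f in frames)
    max_width = max(f.shape[1] for f in frames)
    return max_width, max_height


def resize_with_padding(frame: np.ndarray, target_width: int, target_height: int) -> np.ndarray:
    """Center a frame on a black canvas of the target size (no scaling)."""
    height, width = frame.shape[:2]
    if width > target_width or height > target_height:
        raise ValueError("frame is larger than the target canvas")
    top = (target_height - height) // 2
    left = (target_width - width) // 2
    # Keep any trailing channel axis so grayscale and color frames both work.
    canvas_shape = (target_height, target_width) + frame.shape[2:]
    canvas = np.zeros(canvas_shape, dtype=frame.dtype)
    canvas[top:top + height, left:left + width] = frame
    return canvas
```

Padding to a common size keeps every frame's pixels unchanged, which matters when later stages compare outlines or skeletons across frames.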
- Motion Analysis and Tracking: the skeleton output can be used to analyze or track the movement of the prominent object in each frame. This analysis is essential in biomechanics or animation fields, where understanding movement dynamics is crucial [30].
- Machine Learning and Feature Extraction: the obtained skeletons can serve as a simplified yet effective feature set for machine learning algorithms, particularly in image recognition or classification tasks [31].
- Object Recognition and Categorization: the structural information provided by the skeleton can be instrumental in recognizing or categorizing different objects within various frames [32].
- 3D Modeling and Visualization: skeleton data can inform 3D modeling processes, aiding in the construction of detailed visualizations or reconstructions, especially in fields requiring a deep understanding of an object’s structure [33].
- Development of Control Systems in Robotics: skeletal data can be pivotal in developing algorithms for robotic movements or gesture-based control systems, enhancing the interaction between robots and their environment [34].
- Augmented Reality (AR) Applications: leveraging the skeleton data can provide a unique interactive experience, overlaying digital information onto real-world objects [35].
Algorithm 3: Cheetah in the circle
Input: The list of frames
Output: A new GIF with circles drawn around the prominent object in each frame

    For each frame, process to highlight the prominent object:
        Resize the frame with padding to match the maximum dimensions.
        Convert the frame to grayscale.
        Apply Gaussian Blur to reduce noise.
        Apply thresholding to create a binary image.
        Find contours in the binary image.
        Filter out small contours based on a size threshold to remove noise.
        Identify the largest contour, assuming it is the prominent object.
        Draw a circle around the largest contour.
        Save the frame.
    Save the processed frames with circles as a new GIF.
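The contour filtering and circle steps of Algorithm 3 can be illustrated without any image library. The sketch below assumes contours are given as lists of (x, y) points; the function names and the `min_area` default are hypothetical. The circle is centered on the mean of the contour points and reaches the farthest point, so it encloses the contour but is not necessarily the smallest such circle. With OpenCV, `cv2.minEnclosingCircle` would give the tight version.

```python
import math


def polygon_area(points):
    """Absolute area of a closed polygon via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def circle_around_largest(contours, min_area=100.0):
    """Return ((cx, cy), radius) for the largest contour, or None if all are too small."""
    # Drop small contours, which are treated as noise.
    candidates = [c for c in contours if polygon_area(c) >= min_area]
    if not candidates:
        return None
    # The largest remaining contour is assumed to be the prominent object.
    largest = max(candidates, key=polygon_area)
    cx = sum(x for x, _ in largest) / len(largest)
    cy = sum(y for _, y in largest) / len(largest)
    radius = max(math.hypot(x - cx, y - cy) for x, y in largest)
    return (cx, cy), radius
```

Returning `None` when every contour is below the threshold lets the caller save the frame unchanged instead of circling noise.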
6. Discussion
7. Conclusions
- LLMs as AI Pair Programmers (RQ1): This study confirms that Large Language Models, such as ChatGPT-4, Bard, and Bing, can be effectively used as AI pair programmers in MATLAB-to-Python conversion for complex projects. These models excel in understanding the logic and structure of the code, translating MATLAB code into Python with high accuracy, and providing valuable insights into code functionality.
- Comparison with Existing Tools (RQ2): The research shows that LLMs outperform existing language-to-language converter tools in accuracy and efficiency. In handling complex MATLAB code, LLMs produced functional, accurate, and efficient Python code more effectively and swiftly than current tools.
- Automation Using OpenAI API (RQ3): The study explored the automation of the MATLAB-to-Python conversion process using the OpenAI API. It identified challenges such as token limitations for large code segments and occasional inaccuracies in complex translations. The research suggests segmenting code for translation and combining the API’s output with manual review as effective solutions.
- Feasibility of Skeleton App Development (RQ4): The study confirms that a functional Skeleton App can be created with the help of ChatGPT after the M-to-PY translation. The Skeleton App, available online at no cost, is user-friendly and adept at generating image skeletons, showcasing the potential of LLMs in code translation and application development.
- Commercial Considerations: While using ChatGPT-4 and similar models incurs costs, this was not a scalability concern for this project. However, it may become relevant for larger applications.
- Shift Towards Testing Efforts: As the reliability of code generation by LLMs increases, this research highlights a shift towards enhancing testing efforts, where LLMs can provide significant assistance.
- Overall Impact and Future Applications: This research validates the advanced capabilities of the latest LLMs in app generation for image sketch abstraction and 3D motion detection. The developed M-to-PY converter bridges the gap between MATLAB scientific code and AI code in Python, facilitating the scientific use of AI technologies.
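The segmentation suggested under RQ3 could be prepared along the following lines. This is a minimal sketch, not the authors' implementation: it splits MATLAB source at `function` definitions and groups the pieces under a token budget. The function names, the word-count token estimate, and the default budget of 800 (chosen only to stay below the roughly 1000-token limit reported in the testing notes) are all assumptions. The API call itself is left out.

```python
import re

# Zero-width split point before each line that starts a function definition.
FUNCTION_START = re.compile(r"^(?=[ \t]*function\b)", re.MULTILINE)


def estimate_tokens(text: str) -> int:
    """Very rough token estimate: count whitespace-separated words."""
    return len(text.split())


def split_matlab_source(source: str, max_tokens: int = 800):
    """Group MATLAB functions into chunks that fit a token budget."""
    units = [u for u in FUNCTION_START.split(source) if u.strip()]
    chunks, current = [], ""
    for unit in units:
        # Start a new chunk when adding this unit would exceed the budget.
        if current and estimate_tokens(current + unit) > max_tokens:
            chunks.append(current)
            current = ""
        current += unit
    if current:
        chunks.append(current)
    # A single function larger than the budget is still emitted as one chunk.
    return chunks
```

Splitting at function boundaries keeps each definition intact within a request, though names shared across chunks still need the manual review that the conclusions recommend.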
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. OpenAI Quickstart. 2023. Available online: https://platform.openai.com/docs/quickstart (accessed on 1 September 2023).
2. Leonard, K.; Morin, G.; Hahmann, S.; Carlier, A. A 2D shape structure for decomposition and part similarity. In Proceedings of the 2016 23rd International Conference on Pattern Recognition (ICPR), Cancun, Mexico, 4–8 December 2016; pp. 3216–3221.
3. Javaid, M.A. Understanding Dijkstra’s Algorithm. SSRN Electron. J. 2013, 14–26.
4. Layton, D. Open AI Code Interpreter. 2023. Available online: https://medium.com/@dlaytonj2/open-ai-code-interpreter-what-you-need-to-know-29c57085835e (accessed on 1 September 2023).
5. OMPC. 2008. Available online: http://ompc.juricap.com/ (accessed on 1 September 2023).
6. Meta AI. Deep Learning to Translate between Programming Languages. 2020. Available online: https://ai.facebook.com/blog/deep-learning-to-translate-between-programming-languages/ (accessed on 1 September 2023).
7. Koziolek, H.; Gruener, S.; Ashiwal, V. ChatGPT for PLC/DCS Control Logic Generation. arXiv 2023, arXiv:2305.15809.
8. Li, R.; Pu, C.; Fan, F.; Tao, J.; Xiang, Y. Leveraging ChatGPT for Power System Programming Tasks. arXiv 2023, arXiv:2305.11202.
9. Tsai, M.L.; Ong, C.W.; Chen, C.L. Exploring the use of large language models (LLMs) in chemical engineering education: Building core course problem models with Chat-GPT. Educ. Chem. Eng. 2023, 44, 71–95.
10. Merow, C.; Serra-Diaz, J.M.; Enquist, B.J.; Wilson, A.M. AI chatbots can boost scientific coding. Nat. Ecol. Evol. 2023, 7, 960–962.
11. Kumar, Y.; Morreale, P.; Sorial, P.; Delgado, J.; Li, J.J.; Martins, P. A Testing Framework for AI Linguistic Systems (testFAILS). Electronics 2023, 12, 3095.
12. Maurer, B.; Geo, P. Tech, AI Chatbot Geotechnical Engineer: How AI Language Models Like “ChatGPT” Could Change the Profession. Engrxiv 2023.
13. Crokidakis, N.; de Menezes, M.A.; Cajueiro, D.O. Questions of science: Chatting with ChatGPT about complex systems. arXiv 2023, arXiv:2303.16870.
14. Nikolic, S.; Daniel, S.; Haque, R.; Belkina, M.; Hassan, G.M.; Grundy, S.; Sandison, C. ChatGPT versus engineering education assessment: A multidisciplinary and multi-institutional benchmarking and analysis of this generative artificial intelligence tool to investigate assessment integrity. Eur. J. Eng. Educ. 2023, 48, 559–614.
15. Vercel. AI Code Translator. 2023. Available online: https://vercel.com/templates/next.js/ai-code-translator (accessed on 1 September 2023).
16. CodeConvert. 2023. Available online: https://www.codeconvert.ai/ (accessed on 1 September 2023).
17. Ma, J.; Ren, X.; Li, H.; Li, W.; Tsviatkou, V.Y.; Boriskevich, A.A. Noise-against skeleton extraction framework and application on hand gesture recognition. IEEE Access 2023, 11, 9547–9559.
18. Wang, T.; Yamakawa, Y. Edge-Supervised Linear Object Skeletonization for High-Speed Camera. Sensors 2023, 23, 5721.
19. Lovanshi, M.; Tiwari, V. Human skeleton pose and spatio-temporal feature-based activity recognition using ST-GCN. Multimed. Tools Appl. 2023.
20. Usmani, A.; Siddiqui, N.; Islam, S. Skeleton joint trajectories based human activity recognition using deep RNN. Multimed. Tools Appl. 2023.
21. Skeleton App. 2023. Available online: https://skeleton-app-65563db1db74.herokuapp.com/ (accessed on 1 September 2023).
22. Skeleton App Repo. 2023. Available online: https://github.com/ykumar2020/SkeletonApp (accessed on 1 November 2023).
23. Discussion with ChatGPT-4 on Motion Detection. Available online: https://chat.openai.com/share/9f06d172-d095-4199-95d3-40406ea00ed7 (accessed on 1 September 2023).
24. Python Code and Its Results. 2023. Available online: https://colab.research.google.com/drive/1yy3h4_FGTsDOf5wOTFs34p5GXMn_QU1X?usp=sharing (accessed on 1 September 2023).
25. Bergstrom, A.C.; Conran, D.; Messinger, D.W. Gaussian blur and relative edge response. arXiv 2023, arXiv:2301.00856.
26. Canny, J. A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. 1986, 6, 679–698.
27. Soille, P. Morphological Image Analysis: Principles and Applications; Springer: Berlin, Germany, 1999; Volume 2, pp. 170–171.
28. Chakraborty, D. OpenCV Contour Approximation (cv2.approxPolyDP). PyImageSearch. 2021. Available online: https://pyimagesearch.com/2021/10/06/opencv-contour-approximation/ (accessed on 15 January 2024).
29. GIF Image of a Running Cheetah. Available online: https://i.pinimg.com/originals/61/5d/b5/615db596fe3db4b7e04b54af1aea5826.gif (accessed on 15 January 2024).
30. Zarka, N.; Alhalah, Z.; Deeb, R. Real-Time Human Motion Detection and Tracking. In Proceedings of the 2008 3rd International Conference on Information and Communication Technologies: From Theory to Applications, Damascus, Syria, 7–11 April 2008; pp. 1–6.
31. Ali, M.; Kumar, D. A Combination between Deep learning for feature extraction and Machine Learning for Recognition. In Proceedings of the 2021 12th International Conference on Computing Communication and Networking Technologies (ICCCNT), Kharagpur, India, 6–8 July 2021; pp. 01–05.
32. Carvalho, L.E.; von Wangenheim, A. 3D object recognition and classification: A systematic literature review. Pattern Anal. Appl. 2019, 22, 1243–1292.
33. Cárdenas, J.L.; Ogayar, C.J.; Feito, F.R.; Jurado, J.M. Modeling of the 3D Tree Skeleton Using Real-World Data: A Survey. IEEE Trans. Vis. Comput. Graph. 2023, 29, 4920–4935.
34. Yang, N.; Duan, F.; Wei, Y.; Liu, C.; Tan, J.T.C.; Xu, B.; Zhang, J. A study of the human-robot synchronous control system based on skeletal tracking technology. In Proceedings of the 2013 IEEE International Conference on Robotics and Biomimetics (ROBIO), Shenzhen, China, 12–14 December 2013; pp. 2191–2196.
35. Singh, J.; Urvashi; Singh, G.; Maheshwari, S. Augmented Reality Technology: Current Applications, Challenges and its Future. In Proceedings of the 2022 4th International Conference on Inventive Research in Computing Applications (ICIRCA), Coimbatore, India, 21–23 September 2022; pp. 1722–1726.
36. Yang, X.; Wang, X.; Liu, Z.; Shu, F. M2Coder: A Fully Automated Translator from Matlab M-functions to C/C++ Codes for ACS Motion Controllers. In Proceedings of the International Conference on Guidance, Navigation and Control, Tianjin, China, 5–7 August 2022; Springer Nature: Singapore, 2022; pp. 3130–3139.
37. Benefits of AI Tools Section at Stack Overflow Developer Survey. 2023. Available online: https://survey.stackoverflow.co/2023/#benefits-of-ai-tools (accessed on 15 January 2024).
Converter Tool | Test Case 1 | Test Case 2 | Test Case 3 | Test Case 4 | Test Case 5 | Test Case 6 |
---|---|---|---|---|---|---|
M-to-PY | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Vercel | ✓ | ✓ | ✓ | ✓ | ✓ | ✕ |
CodeConvert AI | ✓ | ✓ | ✓ | ✓ | ✕ | ✕ |
OMPC | ✕ | ✕ | ✕ | ✕ | ✕ | ✕ |
# | Testing Parameter | Outcome | Testing Notes |
---|---|---|---|
1 | Feeding whole text at once | Failure | ChatGPT did not allow more than 1000 tokens |
2 | Feeding MATLAB code snippets in parts | Failure | Many errors, as the chat not only generated different variable names but also assumed and produced different types and shapes of data structures |
3 | Running whole code | Failure | Did not work; the cause was hard to pinpoint because the source code consists of several classes, modules, and many functions, with a single main script that calls other modules, creates objects, etc. |
4 | Possible workarounds | Failure | Few tools were available, the converters did not work, and manual debugging required time and expertise with MATLAB |
Criteria | Object Outlines | Skeletons |
---|---|---|
Representation | The complete boundary of the object | Simplified, centerline representation of the object |
Pros | A clear understanding of shape and orientation; easy calculation of features such as area and perimeter | Simplified analysis of complex shapes; less sensitive to boundary noise; useful for analyzing the movement of object parts |
Cons | Sensitive to noise and small boundary variations | Loses information about object’s width and shape |
Application to Motion Detection | Better for understanding overall object movement and shape changes | Better for analyzing the movement of specific object parts |
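How consecutive skeletons might be compared for motion detection can be sketched with a pixelwise difference. The source does not specify how `detect_motion` in Algorithm 1 works, so the approach below is an assumption: both skeletons are treated as binary masks of the same shape, and a pixel counts as motion when it belongs to exactly one of them. The helper `motion_score` is hypothetical.

```python
import numpy as np


def detect_motion(previous_skeleton: np.ndarray, skeleton: np.ndarray) -> np.ndarray:
    """Mark pixels that belong to exactly one of two binary skeleton masks."""
    prev_mask = previous_skeleton > 0
    curr_mask = skeleton > 0
    changed = np.logical_xor(prev_mask, curr_mask)
    # Return a 0/255 image so the result can be saved as a frame.
    return changed.astype(np.uint8) * 255


def motion_score(previous_skeleton: np.ndarray, skeleton: np.ndarray) -> float:
    """Fraction of pixels that changed between the two masks."""
    changed = np.count_nonzero(detect_motion(previous_skeleton, skeleton))
    return float(changed) / skeleton.size
```

Because skeletons are thin centerlines, even a small shift of a limb changes many of their pixels, which fits the table's point that skeletons suit the analysis of moving object parts.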
Criteria | Semi-Automated Approach | Fully Automated Approach |
---|---|---|
Efficiency | Time-intensive but allows for greater control during the conversion process. | Faster, as it requires less human intervention. |
Accuracy | High, due to human oversight correcting potential errors. | May vary, depending on the sophistication of the automation algorithms. |
Ease of Implementation | More complex, requiring a balance between automated tools and human input. | Simpler, as it relies heavily on pre-defined algorithms. |
Flexibility | Highly adaptable to specific requirements of different projects. | Less adaptable, often designed for general cases. |
User Involvement | Requires significant user involvement for decision-making and error correction. | Minimal user input is needed, as the process is largely self-sufficient. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Kumar, Y.; Gordon, Z.; Alabi, O.; Li, J.; Leonard, K.; Ness, L.; Morreale, P. ChatGPT Translation of Program Code for Image Sketch Abstraction. Appl. Sci. 2024, 14, 992. https://doi.org/10.3390/app14030992