Galaxy Morphology Classification using OpenAI Visual Language Models
Lanz Anthonee Lagman1*, Prospero C. Naval, Jr.2, Reinabelle C. Reyes3
1Computational Science Research Center, University of the Philippines - Diliman, Quezon City, Philippines
2Department of Computer Science, University of the Philippines - Diliman, Quezon City, Philippines
3National Institute of Physics, University of the Philippines - Diliman, Quezon City, Philippines
* Presenter: Lanz Anthonee Lagman, email: lalagman1@up.edu.ph
Visual language models (VLMs), which integrate a vision encoder with large language models (LLMs)
or small language models (SLMs), offer a new approach to machine learning (ML). Unlike traditional
models confined to either tabular or image data and requiring extensive preprocessing, VLMs can
combine text and image inputs for immediate deployment without prior training. We evaluate
OpenAI’s VLMs, GPT-4o and GPT-4o-mini, for zero-shot, prompt-based classification of images from
the Galaxy10 DECaLS dataset. Two samples were created: a 10-class set and a reclassified 4-class set.
GPT-4o and GPT-4o-mini achieved 41.20% and 38.50% accuracy, respectively, on the 10-class set,
but performed better on the 4-class set, with 79.00% and 73.00%, respectively. These results suggest
the models struggle to distinguish galaxy classes that differ only subtly in appearance but perform
better on more visually distinct class groupings, indicating the need for image preprocessing.
Fine-tuning OpenAI's models on image data, once that capability becomes available, could further
improve classification performance.
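
The zero-shot, prompt-based setup described above can be sketched as follows. This is a minimal illustration, not the authors' actual pipeline: the class names, prompt wording, and the `classify_galaxy`/`parse_label` helpers are assumptions; only the OpenAI chat-completions call with an inline base64 image reflects the public API.

```python
import base64

# Illustrative 4-class grouping; the paper's exact reclassified labels
# are not given in the abstract, so these names are assumptions.
CLASSES_4 = ["smooth round", "disturbed or merging", "edge-on disk", "spiral"]

def build_prompt(classes):
    """Zero-shot prompt asking the model to reply with exactly one label."""
    options = "; ".join(classes)
    return (
        "Classify the galaxy in this image into exactly one of the "
        f"following morphological classes: {options}. "
        "Reply with the class name only."
    )

def parse_label(reply, classes):
    """Map the model's free-text reply back onto a known class label."""
    reply = reply.strip().lower()
    for c in classes:
        if c.lower() in reply:
            return c
    return None  # reply did not match any class

def classify_galaxy(client, image_path, classes, model="gpt-4o"):
    """Send one galaxy image to the chat API for zero-shot classification."""
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    resp = client.chat.completions.create(
        model=model,
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": build_prompt(classes)},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    )
    return parse_label(resp.choices[0].message.content, classes)
```

In use, `client` would be an `openai.OpenAI()` instance; accuracy is then the fraction of images whose parsed label matches the Galaxy10 DECaLS ground truth.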


Keywords: visual language models, galactic morphology, image classification