Machine Learning for Astronomical Source Classification


This is a demonstration of machine learning applied to classifying different types of astronomical sources in a dataset.

This ML code uses the scikit-learn ML library for Python.

The ML code you are about to run takes two data files (.csv) as input. Each file contains some typical features (brightness, redshift and mass) for a few types of astronomical sources (AGN, GRB, supernova, quasar, star).

One of the input files is used to train the ML model; the other is the unseen test data file, referred to on this page as the "actual" data file.
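As a rough illustration of this train-then-test workflow, the sketch below fits a scikit-learn classifier on one table and predicts labels for a second, unseen table. The choice of RandomForestClassifier, the column names, the header row and the tiny inline CSV data are all assumptions made for illustration; this page does not state which model the actual code uses.

```python
import csv
import io

from sklearn.ensemble import RandomForestClassifier

# Hypothetical inline CSV text standing in for the two uploaded files.
TRAIN_CSV = """brightness,redshift,mass,label
1.0,0.10,5.0,Star
1.2,0.12,5.5,Star
0.9,0.08,4.8,Star
9.0,2.00,50.0,Quasar
9.5,2.20,52.0,Quasar
8.8,1.90,49.0,Quasar
"""
TEST_CSV = """brightness,redshift,mass,label
1.1,0.11,5.2,Star
9.2,2.10,51.0,Quasar
"""


def load_table(text):
    """Return (features, labels), treating the last column as the label."""
    rows = list(csv.reader(io.StringIO(text)))
    body = rows[1:]  # skip the (assumed) header row
    features = [[float(value) for value in row[:-1]] for row in body]
    labels = [row[-1] for row in body]
    return features, labels


X_train, y_train = load_table(TRAIN_CSV)
X_test, y_test = load_table(TEST_CSV)

# Assumed model choice; any scikit-learn classifier exposes the same fit/predict calls.
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)        # learn from the training file only
predictions = model.predict(X_test)  # label the unseen rows

for predicted, expected in zip(predictions, y_test):
    print(f"predicted={predicted} expected={expected}")
```

The test labels are never passed to fit; they are held back only so that the predictions can be scored afterwards.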

Sample training and actual data files (synthetically generated) are available for download below.

Download Sample Files

Download Training Data File
Download Actual Data File

Try out the ML code with your own data...

This code is flexible enough to work with .csv data files other than the sample files provided above.

As long as the training and actual/test data files are structured in a tabular format similar to the sample files, the ML code should work.

Take a look at the sample files provided above and pre-process your data files such that they have the same structure as the sample files.

The sample files have 3 features, but your data files can have any number of features of any kind.

Important note: Make sure that the last column in both your training and actual data files is the target variable (label), so that the accuracy score, classification report and confusion matrix can be computed.
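Before uploading your own files, a quick structural check along these lines may help. This is a minimal sketch using only the Python standard library; the function names are hypothetical, it assumes each file starts with a header row, and the check that every test label also appears in the training data is an added assumption rather than a stated requirement of this page.

```python
import csv
import io


def read_rows(text):
    """Parse CSV text into a header row and a list of non-empty data rows."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    return rows[0], rows[1:]


def structure_problems(train_text, test_text):
    """Return a list of readable problems; the list is empty if the files line up."""
    train_header, train_rows = read_rows(train_text)
    test_header, test_rows = read_rows(test_text)
    problems = []
    if train_header != test_header:
        problems.append("column headers differ between the two files")
    width = len(train_header)
    for name, rows in (("training", train_rows), ("test", test_rows)):
        if any(len(row) != width for row in rows):
            problems.append(f"{name} file has rows with a different column count")
    # The last column is the label; flag test labels never seen in training.
    train_labels = {row[-1] for row in train_rows}
    unseen = {row[-1] for row in test_rows} - train_labels
    if unseen:
        problems.append(f"labels missing from training data: {sorted(unseen)}")
    return problems


# Hypothetical file contents for illustration.
train = "brightness,redshift,mass,label\n1.0,0.1,5.0,Star\n9.0,2.0,50.0,Quasar\n"
good = "brightness,redshift,mass,label\n1.1,0.1,5.2,Star\n"
bad = "brightness,redshift,label\n1.1,0.1,GRB\n"

print(structure_problems(train, good))
print(structure_problems(train, bad))
```

Returning a list of problems rather than raising on the first one lets you fix a file in a single pass.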

Fire up the ML code for Astronomical source classification!

When you upload the data files and hit the "Run ML Code" button, the ML model is trained on the training data and tested on unseen data with the same structure.

The accuracy score, confusion matrix and classification report are presented on the next page as the output.
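For readers unfamiliar with these three outputs, the sketch below computes them with scikit-learn's metrics functions. The true and predicted labels are made up for illustration and are not results from the sample files.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Hypothetical true and predicted labels for six test sources.
y_true = ["AGN", "AGN", "GRB", "GRB", "Star", "Star"]
y_pred = ["AGN", "GRB", "GRB", "GRB", "Star", "Star"]
labels = ["AGN", "GRB", "Star"]  # fixes the row/column order of the matrix

# Fraction of sources whose predicted label matches the true label.
accuracy = accuracy_score(y_true, y_pred)

# Rows are true classes, columns are predicted classes.
matrix = confusion_matrix(y_true, y_pred, labels=labels)

# Per-class precision, recall and F1 score as a formatted text table.
report = classification_report(y_true, y_pred, labels=labels, zero_division=0)

print(f"Accuracy: {accuracy:.3f}")
print(matrix)
print(report)
```

In this made-up case one AGN is mislabelled as a GRB, so it shows up off the diagonal in the first row of the confusion matrix, while the accuracy score alone would not reveal which classes were confused.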

Upload training data file:


Upload actual data file:



