The categories must be nominal (no inherent order). If the categories have a natural ranking (like "Low, Medium, High"), you should use Ordinal Logistic Regression instead.
Multinomial logistic regression outputs a vector of probabilities, one per class, that sums to 1.0. The class with the highest probability is the predicted outcome.

Key Differences at a Glance

| Feature | Binary | Multinomial |
| --- | --- | --- |
| Outcome Classes | Two | Three or more |
| Function | Sigmoid | Softmax |
| Example | Fraud vs. Not Fraud | Red vs. Blue vs. Green |
| Complexity | Simple; one set of weights | Higher; weights for each class |

When to Use Which?

Logistic Regression: Binary and Multinomial
Binary logistic regression uses the sigmoid function to map any real-valued number into a value between 0 and 1. The Math: It models the "log-odds" of the probability of the positive class as a linear function of the input features.