This seems to be a very confusing subject for most, and I've had difficulty while learning how to setup Keras NN models as the addition/subtraction of layers and neurons creates vastly different outcomes in model. Normally I wouldn't just link out to others, but there is a very well written synopsis found on StackExchange below that lays it out in a very simple fashion. Very brief summary:
Input (first) layer: Neurons = Number of features in the datasetHidden layer(s): Neurons = Somewhere between 1 and the amount in the input later (take the mean); Number of hidden layers: 1 works for *most* applications, maybe none.Output (last) layer: exactly 1 unless it's a classification problem and you utilize the softmax activation, in which case the number equals the number of classes you are predicting
https://stats.stackexchange.com/questions/181/how-to-choose-the-number-of-hidden-layers-and-nodes-in-a-feedforward-neural-netw
Meaning in the case of a dataset with 20 features:
#Example Keras Binary Classification model
model = Sequential()
model.add(Dense(20, activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',optimizer='adam')
#Example Keras Multi-Class model
model = Sequential()
model.add(Dense(20, activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(3, activation='softmax')) #If I...