This seems to be a confusing subject for most people, and I had difficulty with it myself while learning how to set up Keras NN models, since adding or removing layers and neurons produces vastly different outcomes in the model. Normally I wouldn't just link out to others, but there is a very well-written synopsis on StackExchange (linked below) that lays it out simply. A very brief summary:

  1. Input (first) layer: Neurons = Number of features in the dataset
  2. Hidden layer(s): Neurons = somewhere between 1 and the number in the input layer (take the mean of the input and output layer sizes); number of hidden layers: 1 works for *most* applications, and sometimes none is needed.
  3. Output (last) layer: exactly 1 for regression or binary classification (a multi-class softmax output needs one neuron per class)

Meaning in the case of a dataset with 20 features:

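# Example Keras classification model
from keras.models import Sequential
from keras.layers import Dense, Input

model = Sequential()

model.add(Input(shape=(20,)))              # each sample has 20 features
model.add(Dense(20, activation='relu'))    # first layer: one neuron per feature
model.add(Dense(10, activation='relu'))    # hidden layer: roughly the mean of 20 and 1
model.add(Dense(1, activation='sigmoid'))  # output layer: a single probability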
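
To make the arithmetic behind those layer sizes explicit, here is a minimal sketch in plain Python. The helper names `suggest_layer_sizes` and `dense_param_count` are my own illustrations, not part of Keras, and rounding the mean down is an assumption chosen so that 20 features gives the 10 hidden neurons used above.

```python
def suggest_layer_sizes(n_features, n_outputs=1):
    """Rule-of-thumb Dense layer widths: [first, hidden, output].

    Hypothetical helper: the hidden width is the mean of the input
    and output sizes, rounded down (an assumption), and at least 1.
    """
    hidden = max(1, (n_features + n_outputs) // 2)
    return [n_features, hidden, n_outputs]


def dense_param_count(n_features, layer_sizes):
    """Trainable parameters (weights + biases) in a stack of Dense layers."""
    total = 0
    fan_in = n_features  # number of inputs feeding the current layer
    for units in layer_sizes:
        total += fan_in * units + units  # weight matrix plus one bias per unit
        fan_in = units
    return total


sizes = suggest_layer_sizes(20)
print(sizes)                         # [20, 10, 1]
print(dense_param_count(20, sizes))  # 641
```

Changing the number of features or the widths in `sizes` shows how quickly the parameter count moves, which is part of why small changes to layers and neurons can lead to such different models.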