There are a few rules of thumb when scaling resources for Graylog2:
graylog2-server nodes should have a focus on CPU power.
graylog2-web-interface nodes mostly wait for HTTP answers from the rest of the system and can be rather small.
graylog2-radio nodes act as workers. They don't know each other, and you can shut them down at any point in time without changing the cluster state at all.
Also keep in mind that messages are stored only in Elasticsearch. If you have data loss on Elasticsearch, the messages are gone (unless you have backups, of course).
MongoDB stores only meta information and will be abstracted with a general database layer in future versions. This will allow you to use other databases like MySQL instead.
This is a minimum Graylog2 setup that can be used for smaller, non-critical or test setups. Nothing is redundant, but it is easy and quick to set up.
This is a setup for bigger production environments. It has several graylog2-server nodes that share the load behind a load balancer. The load balancer can ping the graylog2-server nodes via REST/HTTP to check if they are alive and take dead nodes out of the cluster.
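The liveness check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular load balancer implements it: the function names `http_probe` and `live_nodes` are made up for this example, and the node URLs you pass in are placeholders you would replace with the REST addresses of your own graylog2-server nodes.

```python
from http.client import HTTPException
from typing import Callable, Iterable, List
from urllib.request import urlopen


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with an HTTP 2xx status in time.

    Hypothetical helper: any connection error, timeout, HTTP error status
    or malformed URL counts as "node is not alive".
    """
    try:
        with urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError, HTTPException):
        # URLError/HTTPError and socket timeouts are subclasses of OSError;
        # ValueError is raised for URLs that urllib cannot parse.
        return False


def live_nodes(urls: Iterable[str],
               probe: Callable[[str], bool] = http_probe) -> List[str]:
    """Keep only the nodes that pass the probe, preserving their order."""
    return [url for url in urls if probe(url)]
```

A load balancer would run a check like this periodically and route traffic only to the URLs returned by `live_nodes`. The probe is passed in as a parameter so that the selection logic can be tried out without a running cluster.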
This is a big setup that allows you to shut down or lose big parts of the system without losing messages. The messages are written to graylog2-radio nodes behind a load balancer. Radio nodes are configured from the web interface and write the received messages to a Kafka cluster (AMQP is supported, too). graylog2-server nodes read messages from the Kafka cluster and distribute the load automatically and very evenly. If no graylog2-server node is running, or message processing is stopped on all of them, messages simply queue up on the Kafka broker disks until they are read. This way you can even shut down the whole Elasticsearch cluster if you want and never lose any messages.
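The buffering behaviour can be illustrated with a toy model. The sketch below does not use Kafka or any Graylog2 API: the class `BufferedPipeline` and its attributes are invented for this example, with an in-memory queue standing in for the Kafka broker disks and a plain list standing in for Elasticsearch.

```python
from collections import deque


class BufferedPipeline:
    """Toy model of radio nodes -> message queue -> server nodes."""

    def __init__(self) -> None:
        self.broker = deque()    # stands in for the Kafka broker disks
        self.indexed = []        # stands in for Elasticsearch
        self.processing = True   # False while all server nodes are stopped

    def receive(self, message: str) -> None:
        """A radio node accepts a message and writes it to the queue."""
        self.broker.append(message)

    def process(self) -> int:
        """Server nodes drain the queue in order; returns messages moved."""
        if not self.processing:
            return 0  # nothing is read, so messages stay queued
        moved = 0
        while self.broker:
            self.indexed.append(self.broker.popleft())
            moved += 1
        return moved
```

While `processing` is `False`, calls to `receive` keep succeeding and the queue simply grows. Once processing resumes, `process` delivers everything that accumulated in the original order, which is the property that lets you stop the server nodes or Elasticsearch without losing messages.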