Elasticsearch and trace logging for everything

I needed to enable logging on everything for Elasticsearch. By everything I mean literally everything I mean literally everything, if there was a log line, I wanted to see it.

I recently started moving my ELK stack to Docker, backed by an NFS share. I started to immediately receive an error saying it couldn't obtain a lock, without any other real information about why and wanted to enable trace logging to figure it out.

The first thing I wanted to do to figure out this problem, enable trace logging on everything from Elasticsearch. By everything I mean literally everything, if there was a log entry, I wanted to see it. It was kind of amazing how many answers to that question was along the lines of What do you mean everything? I understand what everything means, so here's the answer, you need to create (or modify) the log4j2.properties file and set rootLogger.level to trace. Unfortunately, there isn't a way to do it by setting an environment variable, or Java option.

I am using Elasticsearch 7.2.0 and used the default log4j2.properties file as a starting point. The only thing I changed was commenting out the rootLogger.level = info and added my own, rootLogger.level = trace. After doing this, there was a ton of log entries that get spewed out, but that's what I was expecting and wanting to see.

Here is my new log4j2.properties, if you're not using Docker, you will not want to copy this whole file, just change the necessary lines:

status = error

# log action execution errors for easier debugging
logger.action.name = org.elasticsearch.action
logger.action.level = debug

appender.rolling.type = Console
appender.rolling.name = rolling
appender.rolling.layout.type = ESJsonLayout
appender.rolling.layout.type_name = server

# switch these 2 lines to toggle trace/info
rootLogger.level = trace
#rootLogger.level = info

rootLogger.appenderRef.rolling.ref = rolling

appender.deprecation_rolling.type = Console
appender.deprecation_rolling.name = deprecation_rolling
appender.deprecation_rolling.layout.type = ESJsonLayout
appender.deprecation_rolling.layout.type_name = deprecation

logger.deprecation.name = org.elasticsearch.deprecation
logger.deprecation.level = warn
logger.deprecation.appenderRef.deprecation_rolling.ref = deprecation_rolling
logger.deprecation.additivity = false

appender.index_search_slowlog_rolling.type = Console
appender.index_search_slowlog_rolling.name = index_search_slowlog_rolling
appender.index_search_slowlog_rolling.layout.type = ESJsonLayout
appender.index_search_slowlog_rolling.layout.type_name = index_search_slowlog

logger.index_search_slowlog_rolling.name = index.search.slowlog
logger.index_search_slowlog_rolling.level = trace
logger.index_search_slowlog_rolling.appenderRef.index_search_slowlog_rolling.ref = index_search_slowlog_rolling
logger.index_search_slowlog_rolling.additivity = false

appender.index_indexing_slowlog_rolling.type = Console
appender.index_indexing_slowlog_rolling.name = index_indexing_slowlog_rolling
appender.index_indexing_slowlog_rolling.layout.type = ESJsonLayout
appender.index_indexing_slowlog_rolling.layout.type_name = index_indexing_slowlog

logger.index_indexing_slowlog.name = index.indexing.slowlog.index
logger.index_indexing_slowlog.level = trace
logger.index_indexing_slowlog.appenderRef.index_indexing_slowlog_rolling.ref = index_indexing_slowlog_rolling
logger.index_indexing_slowlog.additivity = false

Since I'm running this in Docker Swarm and I didn't want to modify the base image, I needed to map that file. I couldn't figure out how to get NFS to mount just a single file so I opted to just do the whole directory. To be honest, I didn't try very hard, I was about to murder this thing.

The directory you need to mount is /usr/share/elasticsearch/config. There are several files in there, since I didn't modify anything in the original image, here they are:

elasticsearch.yml:

cluster.name: "docker-cluster"
network.host: 0.0.0.0

jvm.options:

## JVM configuration

################################################################
## IMPORTANT: JVM heap size
################################################################
##
## You should always set the min and max JVM heap
## size to the same value. For example, to set
## the heap to 4 GB, set:
##
## -Xms4g
## -Xmx4g
##
## See https://www.elastic.co/guide/en/elasticsearch/reference/current/heap-size.html
## for more information
##
################################################################

# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space

-Xms1g
-Xmx1g

################################################################
## Expert settings
################################################################
##
## All settings below this section are considered
## expert settings. Don't tamper with them unless
## you understand what you are doing
##
################################################################

## GC configuration
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly

## G1GC Configuration
# NOTE: G1GC is only supported on JDK version 10 or later.
# To use G1GC uncomment the lines below.
# 10-:-XX:-UseConcMarkSweepGC
# 10-:-XX:-UseCMSInitiatingOccupancyOnly
# 10-:-XX:+UseG1GC
# 10-:-XX:InitiatingHeapOccupancyPercent=75

## DNS cache policy
# cache ttl in seconds for positive DNS lookups noting that this overrides the
# JDK security property networkaddress.cache.ttl; set to -1 to cache forever
-Des.networkaddress.cache.ttl=60
# cache ttl in seconds for negative DNS lookups noting that this overrides the
# JDK security property networkaddress.cache.negative ttl; set to -1 to cache
# forever
-Des.networkaddress.cache.negative.ttl=10

## optimizations

# pre-touch memory pages used by the JVM during initialization
-XX:+AlwaysPreTouch

## basic

# explicitly set the stack size
-Xss1m

# set to headless, just in case
-Djava.awt.headless=true

# ensure UTF-8 encoding by default (e.g. filenames)
-Dfile.encoding=UTF-8

# use our provided JNA always versus the system one
-Djna.nosys=true

# turn off a JDK optimization that throws away stack traces for common
# exceptions because stack traces are important for debugging
-XX:-OmitStackTraceInFastThrow

# flags to configure Netty
-Dio.netty.noUnsafe=true
-Dio.netty.noKeySetOptimization=true
-Dio.netty.recycler.maxCapacityPerThread=0

# log4j 2
-Dlog4j.shutdownHookEnabled=false
-Dlog4j2.disable.jmx=true

-Djava.io.tmpdir=${ES_TMPDIR}

## heap dumps

# generate a heap dump when an allocation from the Java heap fails
# heap dumps are created in the working directory of the JVM
-XX:+HeapDumpOnOutOfMemoryError

# specify an alternative path for heap dumps; ensure the directory exists and
# has sufficient space
-XX:HeapDumpPath=data

# specify an alternative path for JVM fatal error logs
-XX:ErrorFile=logs/hs_err_pid%p.log

## JDK 8 GC logging

8:-XX:+PrintGCDetails
8:-XX:+PrintGCDateStamps
8:-XX:+PrintTenuringDistribution
8:-XX:+PrintGCApplicationStoppedTime
8:-Xloggc:logs/gc.log
8:-XX:+UseGCLogFileRotation
8:-XX:NumberOfGCLogFiles=32
8:-XX:GCLogFileSize=64m

# JDK 9+ GC logging
9-:-Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m
# due to internationalization enhancements in JDK 9 Elasticsearch need to set the provider to COMPAT otherwise
# time/date parsing will break in an incompatible way for some date patterns and locals
9-:-Djava.locale.providers=COMPAT

log4j2.properties (modified):

status = error

# log action execution errors for easier debugging
logger.action.name = org.elasticsearch.action
logger.action.level = debug

appender.rolling.type = Console
appender.rolling.name = rolling
appender.rolling.layout.type = ESJsonLayout
appender.rolling.layout.type_name = server

# switch these 2 lines to toggle trace/info
rootLogger.level = trace
#rootLogger.level = info

rootLogger.appenderRef.rolling.ref = rolling

appender.deprecation_rolling.type = Console
appender.deprecation_rolling.name = deprecation_rolling
appender.deprecation_rolling.layout.type = ESJsonLayout
appender.deprecation_rolling.layout.type_name = deprecation

logger.deprecation.name = org.elasticsearch.deprecation
logger.deprecation.level = warn
logger.deprecation.appenderRef.deprecation_rolling.ref = deprecation_rolling
logger.deprecation.additivity = false

appender.index_search_slowlog_rolling.type = Console
appender.index_search_slowlog_rolling.name = index_search_slowlog_rolling
appender.index_search_slowlog_rolling.layout.type = ESJsonLayout
appender.index_search_slowlog_rolling.layout.type_name = index_search_slowlog

logger.index_search_slowlog_rolling.name = index.search.slowlog
logger.index_search_slowlog_rolling.level = trace
logger.index_search_slowlog_rolling.appenderRef.index_search_slowlog_rolling.ref = index_search_slowlog_rolling
logger.index_search_slowlog_rolling.additivity = false

appender.index_indexing_slowlog_rolling.type = Console
appender.index_indexing_slowlog_rolling.name = index_indexing_slowlog_rolling
appender.index_indexing_slowlog_rolling.layout.type = ESJsonLayout
appender.index_indexing_slowlog_rolling.layout.type_name = index_indexing_slowlog

logger.index_indexing_slowlog.name = index.indexing.slowlog.index
logger.index_indexing_slowlog.level = trace
logger.index_indexing_slowlog.appenderRef.index_indexing_slowlog_rolling.ref = index_indexing_slowlog_rolling
logger.index_indexing_slowlog.additivity = false

role_mapping.yml:

# Role mapping configuration file which has elasticsearch roles as keys
# that map to one or more user or group distinguished names

#roleA:   this is an elasticsearch role
#  - groupA-DN  this is a group distinguished name
#  - groupB-DN
#  - user1-DN   this is the full user distinguished name

#power_user:
#  - "cn=admins,dc=example,dc=com"
#user:
#  - "cn=users,dc=example,dc=com"
#  - "cn=admins,dc=example,dc=com"
#  - "cn=John Doe,cn=other users,dc=example,dc=com"

users:
Empty, 0 byte file.

users_roles:
Empty, 0 byte file.

To set this up in my docker-compose.yml file I added 2 new volumes, 1 for the logs  (because why not) and 1 for the config directory. I'm using NFS for the volumes, your volume config may be different.

docker-compose.yml:

version: '3.7'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.2.0
    environment: 
      - "ES_JAVA_OPTS=-Xms7g -Xmx7g"
      - discovery.type=single-node
    volumes:
      - elasticsearchdata:/usr/share/elasticsearch/data
      - elasticsearchconfig:/usr/share/elasticsearch/config
      - elasticsearchlogs:/usr/share/elasticsearch/logs
    ports:
      - 9200:9200
    networks:
      - ext
volumes:
  elasticsearchdata:
    driver: local
    driver_opts:
      type: nfs
      device: ":/volumes/elasticsearch/data"
      o: "addr=${NFSSERVER},vers=4,rw"
  elasticsearchconfig:
    driver: local
    driver_opts:
      type: nfs
      device: ":/volumes/elasticsearch/config"
      o: "addr=${NFSSERVER},vers=4,rw"
  elasticsearchlogs:
    driver: local
    driver_opts:
      type: nfs
      device: ":/volumes/elasticsearch/logs"
      o: "addr=${NFSSERVER},vers=4,rw"
networks:
  ext:

That's it for enabling trace logging for everything in Elasticsearch running in Docker.