Custom log4j appender in Hadoop 2

How to specify a custom log4j appender in Hadoop 2 (Amazon EMR)?

Hadoop 2 ignores my log4j configuration file that contains a custom appender, overriding it with its internal file. There is a flag, -Dhadoop.root.logger, that specifies the logging threshold, but it does not help with a custom appender.

Answers

1. In order to change it at the name node, you can change the log4j properties file under /home/hadoop/.

2. In order to change it for the container logs, you need to change it in the YARN containers jar, since they hard-coded loading the file directly from the project resources.

2.1 SSH to the slave (on EMR you can also simply add this as a bootstrap action, so you don't need to SSH to each of the nodes).

2.2 Override the container-log4j.properties in the jar resources:

jar uf /home/hadoop/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.2.0.jar container-log4j.properties
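A hedged sketch of step 2.2 as a bootstrap-style script (the jar path and version come from the command above; the existence guard and backup step are my additions, so adjust for your cluster):

```shell
#!/bin/sh
# Sketch: replace container-log4j.properties inside the nodemanager jar.
# Run from the directory containing your edited container-log4j.properties,
# so the jar entry path matches the file's relative path.
set -e
JAR=/home/hadoop/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.2.0.jar
if [ -f "$JAR" ] && [ -f container-log4j.properties ]; then
    cp "$JAR" "$JAR.bak"                      # keep a backup of the original jar
    jar uf "$JAR" container-log4j.properties  # 'jar uf' updates the entry in place
fi
```

Because `jar uf` only replaces the named entry, the rest of the jar is left untouched.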

I know this question has been answered already, but there is a better way of doing this, and this information isn't easily available anywhere. There are actually at least two log4j configuration files that get used in Hadoop (at least for YARN). I'm using Cloudera, but it will be similar for other distributions.

Local log4j.properties file

Location: /etc/hadoop/conf/log4j.properties (on the client machines)

There is the log4j.properties file that gets used by the normal Java process. It affects the logging of all the stuff that happens in the Java process, but not inside of YARN/MapReduce. So all your driver code, anything that plugs MapReduce jobs together (e.g., Cascading initialization messages), will log according to the rules you specify here. This is almost never the logging properties file you care about.

As you'd expect, this file is parsed after invoking the hadoop command, so you don't need to restart any services when you update your configuration.

If this file exists, it will take priority over the one sitting in your jar (because it's usually earlier in the classpath). If this file doesn't exist the one in your jar will be used.

Container log4j.properties file

Location: /etc/hadoop/conf/container-log4j.properties (on the data node machines)

This file decides the properties of the output from all the map and reduce tasks, and is nearly always what you want to change when you're talking about hadoop logging.

In newer versions of Hadoop/YARN someone caught a dangerously virulent strain of logging fever, and now the default logging configuration ensures that single jobs can generate several hundred megs of unreadable junk, making your logs quite hard to read. I'd suggest putting something like this at the bottom of the file to get rid of most of the extremely helpful messages about how many bytes have been processed:
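As a sketch in log4j properties syntax (the logger names below are examples of typically noisy MapReduce shuffle/merge loggers, not a definitive list; check your own logs for the real culprits):

```properties
# Raise the threshold for chatty shuffle/merge progress loggers (example names)
log4j.logger.org.apache.hadoop.mapreduce.task.reduce=WARN
log4j.logger.org.apache.hadoop.mapred.Merger=WARN
```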

By default this file usually doesn't exist, in which case the copy of it found in hadoop-yarn-server-nodemanager-stuff.jar (as mentioned by uriah kremer) will be used. However, like with the other log4j.properties file, if you do create /etc/hadoop/conf/container-log4j.properties it will be used for all your YARN stuff. Which is good!

Note: No matter what you do, a copy of container-log4j.properties in your jar will not be used for these properties, because the YARN nodemanager jars are higher in the classpath. Similarly, despite what the internet tells you, -Dlog4j.configuration=PATH_TO_FILE will not alter your container logging properties, because the option doesn't get passed on to YARN when the container is initialized.

Look for the environment script in the deployment; that is the script being sourced before executing the hadoop command. I see the following code in it; see if modifying that helps.

HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.root.logger=${HADOOP_ROOT_LOGGER:-INFO,console}"
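The ${HADOOP_ROOT_LOGGER:-INFO,console} part is plain shell default-expansion, so exporting HADOOP_ROOT_LOGGER before invoking the hadoop command overrides the INFO,console default without editing any file. A quick illustration with no Hadoop required:

```shell
# ${VAR:-default} expands to $VAR when it is set, otherwise to the default.
unset HADOOP_ROOT_LOGGER
echo "-Dhadoop.root.logger=${HADOOP_ROOT_LOGGER:-INFO,console}"
# prints: -Dhadoop.root.logger=INFO,console

export HADOOP_ROOT_LOGGER="DEBUG,console"
echo "-Dhadoop.root.logger=${HADOOP_ROOT_LOGGER:-INFO,console}"
# prints: -Dhadoop.root.logger=DEBUG,console
```

Note this only affects the local hadoop process (the client-side logging described earlier), not the container logging.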
