Step-by-step guide to exposing Spark JMX metrics and funneling them to Datadog
Please read my previous article for context on why exposing Spark JMX metrics is needed and how they are used.
Below are the steps we need to follow to expose Spark JMX metrics and export them to Datadog.
Step one: Configure the Spark job to expose JMX metrics.
This can be done by adding the args below to spark.driver.extraJavaOptions for a given Spark job. They can also be added in spark-defaults.conf, but that is not recommended because they then apply to every job on the cluster.
'spark.driver.extraJavaOptions': '
- -Djava.rmi.server.hostname=
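As a rough illustration of what such an options string can look like, here is a minimal Python sketch that joins the standard JVM remote-JMX system properties into a single value for spark.driver.extraJavaOptions. The helper name build_jmx_java_options, the hostname 127.0.0.1, and the port 9178 are hypothetical placeholders, and turning off authentication and SSL is an assumption that only suits a trusted network. The exact flags used in the original setup are not shown above, so treat this as one plausible combination rather than the author's configuration.

```python
def build_jmx_java_options(hostname: str, port: int) -> str:
    """Join standard JVM remote-JMX system properties into one options string.

    hostname and port are placeholders; pick values that fit your cluster.
    """
    flags = [
        # Enable the JVM's remote JMX agent.
        "-Dcom.sun.management.jmxremote",
        # Port the JMX agent listens on (hypothetical value passed in).
        f"-Dcom.sun.management.jmxremote.port={port}",
        # Assumption: no auth and no SSL, acceptable only on a trusted network.
        "-Dcom.sun.management.jmxremote.authenticate=false",
        "-Dcom.sun.management.jmxremote.ssl=false",
        # Hostname that RMI advertises to connecting clients.
        f"-Djava.rmi.server.hostname={hostname}",
    ]
    return " ".join(flags)


# Per-job configuration, in the same dict shape as the snippet above.
spark_conf = {
    "spark.driver.extraJavaOptions": build_jmx_java_options("127.0.0.1", 9178),
}
```

Keeping the flags in a list makes it easy to see each one on its own line and to drop or add one per job without editing a long quoted string.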
Step two: In spark-defaults.conf add “spark.metrics.namespace ${}”.
Why → When JMX metrics are emitted, they carry the application ID (for example application_xxx) as the metric prefix. That ID changes on every restart, so to track metrics across application restarts the prefix needs to be the application name instead of the ID.
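The effect of the namespace on metric names can be sketched in a few lines of Python. This does not call Spark. It only mimics how a metric name is assembled from a namespace, an instance, and a source metric, under the assumption that names follow the namespace.instance.metric layout. The application IDs, the application name my_etl_job, and the helper metric_name are all hypothetical.

```python
def metric_name(namespace: str, instance: str, metric: str) -> str:
    """Assemble a dotted metric name: <namespace>.<instance>.<metric>."""
    return ".".join([namespace, instance, metric])


METRIC = "BlockManager.memory.memUsed_MB"

# Default behaviour: the namespace is the application ID, which is new on each run.
run1 = metric_name("application_1700000000000_0001", "driver", METRIC)
run2 = metric_name("application_1700000000000_0002", "driver", METRIC)

# With the namespace set to the application name, both runs share one name.
named1 = metric_name("my_etl_job", "driver", METRIC)
named2 = metric_name("my_etl_job", "driver", METRIC)

print(run1 == run2)      # False: each restart creates a new metric series
print(named1 == named2)  # True: one stable series across restarts
```

A stable name is what lets a dashboard or monitor keep following the same series after the job is restarted.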