While the New Relic Kubernetes OpenTelemetry Collector is designed to be robust and reliable, issues can still arise. This document provides troubleshooting steps for common problems you might encounter.
General Collector Pod Issues
Check out the logs of the Collector pod that's experiencing issues. Run this command:
$ kubectl logs <otel-pod-name> -n newrelic
To enable detailed DEBUG level logging for troubleshooting, set the verboseLog parameter to true in the nr-k8s-otel-collector Helm chart.
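As a rough sketch of how the verboseLog parameter could be switched on for an existing release, the helper below wraps `helm upgrade`. The release name, the `newrelic` Helm repository alias, the default namespace, and the `DRY_RUN` switch are all assumptions made for illustration; adjust them to match your installation.

```shell
# Sketch only: the release name, the "newrelic" repo alias, and DRY_RUN are
# illustrative assumptions, not values taken from the chart documentation.

# run: execute a command, or only print it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# enable_verbose_logging RELEASE [NAMESPACE]
# Re-applies the release with its current values plus verboseLog=true.
enable_verbose_logging() {
  local release="$1" namespace="${2:-newrelic}"
  run helm upgrade "$release" newrelic/nr-k8s-otel-collector \
    -n "$namespace" --reuse-values --set verboseLog=true
}
```

Running `DRY_RUN=1 enable_verbose_logging my-collector` prints the command for review without touching the cluster; dropping `DRY_RUN` performs the upgrade, after which the pod logs should contain DEBUG output.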
Metric collection failures
Problem: Metrics are not being collected or sent to New Relic.
Troubleshooting:
Verify scrape configurations: Ensure the prometheusreceiver configurations within the collector's configuration (extraConfig or default) are correct.
$ kubectl describe configmap prometheus-config -n monitoring
Check pod annotations: If you're using Prometheus service discovery, confirm that your application pods carry the prometheus.io/scrape=true annotation. Because this is an annotation rather than a label, search the pod manifests for it:
$ kubectl get pods --namespace=[your-namespace] -o yaml | grep 'prometheus.io/scrape'
Test network connectivity: Ensure the collector pod can reach the metric endpoints.
$ kubectl exec [prometheus-pod-name] -- curl http://service:port
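The annotation check can also be narrowed down to the pods that are not opted in. The following is a minimal sketch: the function name `pods_missing_scrape` is hypothetical, and it assumes its input is one `POD<TAB>VALUE` line per pod, such as the jsonpath query in the trailing comment would produce.

```shell
# pods_missing_scrape: read "POD<TAB>VALUE" lines on stdin and print the names
# of pods whose prometheus.io/scrape annotation value is not exactly "true"
# (this includes pods where the annotation is absent and VALUE is empty).
pods_missing_scrape() {
  awk -F '\t' '$1 != "" && $2 != "true" { print $1 }'
}

# Example invocation (commented out because it needs cluster access).
# The backslashes escape the dots inside the annotation key:
# kubectl get pods -n "$NAMESPACE" \
#   -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.prometheus\.io/scrape}{"\n"}{end}' \
#   | pods_missing_scrape
```

An empty result suggests every pod in the namespace carries the annotation; any names printed are candidates to inspect if their metrics are missing.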
Configuration overrides not taking effect
Problem: Custom configurations are not properly applied.
Troubleshooting:
Review your values.yaml: Double-check your values.yaml file for typos or incorrect indentation in the extraConfig section.
$ grep extraConfig helm-charts/charts/nr-k8s-otel-collector/values.yaml
Validate applied ConfigMaps: The Helm chart generates ConfigMaps from your values.yaml. Inspect the resulting ConfigMap to ensure your custom settings are present.
$ kubectl describe configmap [collector-configmap-name] -n monitoring
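One indentation mistake that is easy to miss by eye is a tab character, which YAML does not permit for indentation. The helper below is a small sketch for spotting such lines; the function name `find_tab_indentation` is hypothetical, and the file path you pass is whatever values file you supply to Helm.

```shell
# find_tab_indentation FILE: print "LINE_NUMBER:content" for every line whose
# leading whitespace contains a tab. Exit status is 0 when such lines are
# found and non-zero when the file has none (standard grep behaviour).
find_tab_indentation() {
  grep -n "^ *$(printf '\t')" "$1"
}

# Example (commented out; the path is the one used earlier in this section):
# find_tab_indentation helm-charts/charts/nr-k8s-otel-collector/values.yaml
```

If the file is clean, it may also help to compare it with what the release actually received, for example with `helm get values <release-name> -n <namespace>`, where both placeholders depend on your installation.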
Collector failing to start
Problem: The OpenTelemetry collector pod fails to initialize or crashes repeatedly.
Troubleshooting:
Inspect pod logs: The most common first step. Look for specific error messages that indicate misconfigurations or missing dependencies.
$ kubectl logs [collector-pod-name] --namespace=monitoring
Verify environment variables: Ensure required environment variables are correctly injected.
$ kubectl exec [collector-pod-name] -- env | grep -i [variable-name]
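When a pod crashes repeatedly, it can help to first identify which pods have restarted and then read the logs of the crashed container instance. The sketch below assumes the default `kubectl get pods` table layout (NAME, READY, STATUS, RESTARTS, AGE); the function name `restarting_pods` is hypothetical.

```shell
# restarting_pods: read `kubectl get pods` table output on stdin and print
# "NAME STATUS RESTARTS" for every pod with at least one restart.
# NR > 1 skips the header row; "$4 + 0" forces a numeric comparison.
restarting_pods() {
  awk 'NR > 1 && $4 + 0 > 0 { print $1, $3, $4 }'
}

# Example invocations (commented out because they need cluster access):
# kubectl get pods --namespace=monitoring | restarting_pods
# kubectl logs [collector-pod-name] --namespace=monitoring --previous
```

The `--previous` flag asks kubectl for the logs of the prior container instance, which is usually where the error that caused the crash appears.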
Network failures
Problem: The collector cannot communicate or send data to New Relic.
Troubleshooting:
Check DNS resolution: Ensure the collector pod can resolve service names or New Relic endpoints.
$ kubectl exec [collector-pod-name] -- nslookup service-name
Run connectivity tests: Test connectivity to internal services or external New Relic endpoints.
$ kubectl exec [collector-pod-name] -- curl -I http://service-name:port
Review network policies: If you have strict network policies in your cluster, ensure they allow traffic for the OpenTelemetry Collector pods to internal services and external New Relic endpoints.
$ kubectl describe networkpolicy -n [namespace]
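The DNS and connectivity checks above can be combined into one helper so they run in the same order each time. This is a sketch under several assumptions: the function names and the `DRY_RUN` switch are hypothetical, the target host is passed in by you, and the collector image is assumed to ship `nslookup` and `curl`, which may not be the case for minimal images.

```shell
# run: execute a command, or only print it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# check_egress POD NAMESPACE HOST
# Step 1 resolves HOST from inside the pod; step 2 requests its headers
# over HTTPS with a 10-second limit so a blocked route fails quickly.
check_egress() {
  local pod="$1" namespace="$2" host="$3"
  run kubectl exec "$pod" -n "$namespace" -- nslookup "$host"
  run kubectl exec "$pod" -n "$namespace" -- curl -sS -I --max-time 10 "https://$host"
}
```

For example, `DRY_RUN=1 check_egress otel-pod newrelic otlp.nr-data.net` prints both commands without executing them; the host shown is only an example, so substitute the endpoint your account actually sends data to.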
Support
If you have issues with the OpenTelemetry observability for Kubernetes, refer to:
- The Issues section on GitHub: search for similar problems, or consider opening a new issue.