Job Description
- Configure, implement, and manage Datadog monitoring and observability platform - Set up dashboards, alerts, and metrics for infrastructure and application monitoring - Ensure system reliability and performance through proactive monitoring strategies - Work on observability practices including logs, metrics, and traces - Develop and maintain scripts using Python, Bash, or PowerShell for automation - Collaborate with stakeholders for requirements gathering and monitoring solutions - Integrate Datadog with cloud and on-premise infrastructure environments - Support incident detection, troubleshooting, and resolution processes - Implement best practices for monitoring, alerting, and performance optimization - Work with infrastructure-as-code tools like Terraform and Ansible (if applicable)