Lead Infrastructure Engineer - Infrastructure Monitoring
JPMorgan Chase & Co.
Benefits
- Health care coverage
- Retirement savings plan
- Backup childcare
- Tuition reimbursement
- Mental health support
- Financial coaching
About the Role
We have an exciting opportunity for you to collaborate with passionate professionals, solve complex problems, and grow your career in a supportive, innovative environment.
As a Lead Infrastructure Engineer at JPMorgan Chase within Corporate Technology's Enterprise Observability Platforms, you will help build and operate a strategic, market-leading Infrastructure Monitoring platform that strengthens critical service resilience and delivers trusted operational insights. You will be a hands-on technical contributor on a high-performing agile team, building secure, stable, and scalable observability solutions that turn telemetry into actionable insights, modernize event-to-incident workflows, and enable automation and AIOps-driven reliability improvements aligned to the firm’s business objectives.
Job responsibilities
- Engineer, operate, and continuously improve the firm’s Infrastructure Monitoring platforms, ensuring availability, performance, scalability, and security.
- Build and run enterprise-grade Infrastructure Monitoring capabilities across Linux, Windows, and complex Network estates, including platform-level onboarding and lifecycle management.
- Design and implement platform services, integrations, and telemetry collection across metrics, logs, and events, including OpenTelemetry collection patterns where applicable.
- Develop and maintain standardized onboarding patterns (agents/collectors, configurations, dashboards, alert policies) to accelerate safe adoption at scale.
- Improve monitoring signal quality and usability through baselining, threshold strategy, noise reduction, enrichment, and topology/context alignment.
- Develop secure, high-quality automation and production code; review, debug, and improve code/configuration written by others.
- Automate platform operations and reduce toil through scripting and CI/CD-driven configuration management; implement infrastructure-as-code deployment patterns.
- Manage and maintain production health for the monitoring platform: lead triage, perform RCA, and deliver preventative engineering and resilience improvements.
- Partner with infrastructure, application, and SRE teams to align platform capabilities to SLIs/SLOs, operational readiness, and continuous improvement goals.
- Contribute to a culture of diversity, opportunity, inclusion, and respect.
Required qualifications, capabilities, and skills
- Formal training or certification on infrastructure engineering concepts and 5+ years applied experience.
- Proficiency with enterprise operating systems (Linux and/or Windows), including administration, troubleshooting, performance analysis, and operational best practices within regulated production environments.
- Proven hands-on experience delivering and operating enterprise-scale Infrastructure Monitoring solutions across Linux, Windows, and/or Network estates.
- Solid understanding and hands-on implementation of observability and telemetry concepts, including metrics, logs, and events, with experience using OpenTelemetry collection patterns and integrating telemetry into downstream components.
- Proficiency in automation and engineering practices, including scripting and development with Python, Ansible, PowerShell / Bash, and applying CI/CD-driven workflows for controlled, secure, and repeatable change management.
- Well-rounded experience in infrastructure across hardware platforms, operating systems, networking, storage, and databases (MS SQL Server, Oracle, Cassandra), including common deployment patterns, integration architectures, scaling and resiliency considerations, and performance assessment.
- Experience implementing Infrastructure-as-Code (IaC) and configuration management practices using tools such as Terraform, enabling standardized provisioning and scalable, repeatable deployments.
- Hands-on experience operating in hybrid infrastructure environments, including enterprise on-prem platforms and public/private cloud, with familiarity supporting and migrating monitoring capabilities across cloud boundaries.
- Demonstrated ability to improve monitoring signal quality through baselining, threshold strategy, noise reduction, enrichment, and topology/context alignment, supporting reliable event-to-incident workflows and operational insights.
- Experience developing, reviewing, debugging, and maintaining secure, high-quality production code and platform configurations, including automation supporting monitoring platforms and platform operations.
Preferred qualifications, capabilities, and skills
- Hands-on experience operating one or more enterprise monitoring platforms such as SCOM, Tivoli, SMARTS, IBM Instana, DX NetOps, ITNM, Netcool Suite.
- Experience with modern observability ecosystems such as Splunk, Dynatrace, Grafana, and Prometheus, and with interoperability patterns for telemetry integration, routing, and visualization.
- Experience with Kubernetes (e.g., EKS) for container orchestration and operations.
- Experience with topology-driven monitoring and correlation approaches for large-scale infrastructure environments.
- Knowledge of Event Management & AIOps workflows (noise reduction, anomaly detection, probable cause analysis, guided remediation) with appropriate controls.
We offer a competitive total rewards package including base salary determined based on the role, experience, skill set and location. Those in eligible roles may receive commission-based pay and/or discretionary incentive compensation, paid in the form of cash and/or forfeitable equity, awarded in recognition of individual achievements and contributions. We also offer a range of benefits and programs to meet employee needs, based on eligibility. These benefits include comprehensive health care coverage, on-site health and wellness centers, a retirement savings plan, backup childcare, tuition reimbursement, mental health support, financial coaching and more. Additional details about total compensation and benefits will be provided during the hiring process.
We recognize that our people are our strength and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. We also make reasonable accommodations for applicants’ and employees’ religious practices and beliefs, as well as mental health or physical disability needs. Visit our FAQs for more information about requesting an accommodation.
JPMorgan Chase & Co. is an Equal Opportunity Employer, including Disability/Veterans
Our professionals in our Corporate Functions cover a diverse range of areas from finance and risk to human resources and marketing. Our corporate teams are an essential part of our company, ensuring that we’re setting our businesses, clients, customers and employees up for success.