Basic Information
Ref Number
Last day to apply
Primary Location
Country
Job Type
Work Style
Description and Requirements
Must have 8+ years of experience.
Troubleshooting experience supporting applications hosted in AWS on EKS and EC2.
Advanced Java application troubleshooting including thread and heap analysis.
Understanding of complex environments in which multiple applications interact within a firewalled network.
Splunk log analysis for application troubleshooting.
Must be able to develop ad hoc queries on the fly to pinpoint application issues.
Needs to be able to articulate search parameters and commands.
Experience limited to prewritten queries and dashboards is not sufficient.
This role does not involve Splunk administration, such as rolling out agents.
Datadog, AppDynamics, or other APM tool – Use of Datadog, AppDynamics, or an equivalent application performance monitoring (APM) tool such as Dynatrace or New Relic to troubleshoot and monitor application health and performance. This includes investigating business transactions, information points, and health rules to pinpoint application problems.
Java Application / Database interaction and troubleshooting – Comprehensive understanding of application-to-database interaction and troubleshooting.
This includes Oracle and MongoDB databases.
Areas of understanding include JDBC connection utilization and troubleshooting; connection pool, query, and cache optimization; and the ability to analyze database reports such as AWR and make recommendations.
Automation – Experience creating and maintaining automated pipelines with tools like GitLab and Jenkins.
Linux knowledge – Supporting applications running in a Linux environment.
Shell scripting for automation of administration tasks.
Understanding of OS settings for application performance optimization, administration of microservices, and troubleshooting of logging and forwarding issues.
Additional Job Description
Excellent communication skills.
Able to convey information to leadership and technical audiences.
Providing technical mentoring and day-to-day direction of technical staff responding to alarms, triaging issues, and escalating appropriately.
Constantly striving to improve team response.
EEO Statement