NTT DATA Services strives to hire exceptional, innovative and passionate individuals who want to grow with us. If you want to be part of an inclusive, adaptable, and forward-thinking organization, apply now.
We are currently seeking a Production Management Engineer to join our team in Halifax, Nova Scotia (CA-NS), Canada (CA).
Job Responsibilities Include:
- Monitor and resolve system errors and disruptions. Document resolutions. Manage incidents per the ITIL lifecycle. Liaise with upstream data owners to provide resolution. Respond to and resolve user inquiries and operational requests. Document and review handling steps for support scenarios.
- Prepare and present stability reports. Analyze alert and stability trends and make recommendations. Investigate the root cause of issues and educate developers about it so they can mitigate it.
- Automate (1) resolution of common problems, (2) routine investigations, and (3) routine user requests using scripts or an available programming platform. Lead reliability or business-driven projects. Provide reliability engineering.
- You will work closely with engineering/development teams to design, build and maintain systems, and help them decide on products to use, schema design and query tuning
- You will troubleshoot issues across the entire stack: hardware, software, application and network
- You will mentor other SREs on standard methodology for monitoring and troubleshooting complex code and database issues
- Represent the SRE organization in design reviews and operational readiness exercises for new and existing services
- Participate in the on-call rotation and conference calls with other specialists across different time zones
Job Requirements Include:
- Hands-on Unix experience
- Hands-on experience with SQL-based DB
- Three-tier support with DBs such as IBM DB2, Sybase, MongoDB, Greenplum, KDB
- Excellent analytical and communication skills
- Ability to prioritize and willingness to take ownership
- Problem-solving mindset and solution enabler
- Great problem-solving and debugging ability
- Familiarity with financial products such as equities, fixed income and securities, the different types of risk in an investment bank, and trade flow
- Contribute to system design and architecture through strong database knowledge
- Knowledge of automation-related activities using scripting languages like Python, Perl, Ruby and Bash
- Hands-on experience with enterprise tools like AppDynamics, Grafana, Splunk, Dynatrace
- Awareness of and ability to reason about modern software and system architectures, including load balancing, queuing, caching, distributed systems failure modes, microservices, Cloud, etc.
- Deep understanding of operating system concepts such as processes, memory allocation and the network stack; an understanding of how applications are affected by these and the ability to debug them.
- Practical experience running large scale online systems is always an advantage
About NTT Data Canada
NTT Data Canada drives outcomes that keep their clients a step ahead in the digitally dynamic world. Their team of more than 50,000 professionals worldwide works with clients to address the challenges of today and tomorrow – whether it’s helping jump-start a cloud migration, reinvent the customer experience, streamline business processes or upgrade ageing infrastructure. As a division of NTT DATA Corporation, a top 10 global IT services provider with 120,000+ employees in more than 50 countries, they excel in blending IT and business expertise with decades of industry know-how. NTT Data Canada offers one of the industry’s most comprehensive services portfolios, designed to modernize business and technology to deliver the outcomes that matter most to their clients.