Job Details

Logging and Monitoring Operations Engineer
Posted on: 27/09/2018

Infinity Consulting Solutions, Inc
  • 4 to 6 Year(s)
  • CHICAGO-60602, IL, US

PowerShell, interpersonal skills

  • Job Description

    Logging and Monitoring Operations Engineer

    ICS is partnered with a Fortune 500 financial services organization in Chicago seeking a Logging and Monitoring Operations Engineer.

    This Engineer will be responsible for leading an offshore team in enterprise logging and monitoring efforts.

    They will improve operational stability, reduce the risk of experimentation, and increase overall function of technology assets by providing robust, comprehensive logging, monitoring, and notification solutions.

    Oversee the technical implementation of the Enterprise Logging and Monitoring effort outlined below

    Translate strategic monitoring goals into tactical plans, then execute them within a team

    Build, maintain, and operate tools comprising the Enterprise Monitoring and Logging system to provide visibility into the operation of technology assets

    Meet the monitoring, logging, and outage notification needs of system, application, and business owners

    Work with teams to onboard specific applications with useful metrics and measures

    Integrate logically-connected metrics to provide higher-level awareness of overall health

    Assist in discovering root-level problems not clearly visible from their initial impacts

    Mature overall organizational awareness and increase incident response capability

    Enable more efficient usage of development and operations staff time

    Emphasize collaboration and automation

    Become a force-multiplier within the organization by unlocking and sharing new capabilities

    Relentless pursuit of process improvement

    Primary Accountabilities/Responsibilities:

    Work with leadership and colleagues to define and modify Logging & Monitoring strategy

    Translate strategy into an actionable, tactical plan to accomplish high-level goals

    Mentor and guide team members and colleagues in Logging & Monitoring tactics and operations

    Make high-level decisions and perform low-level technical configurations to build and maintain a global monitoring system

    Operate, maintain, and expand monitoring tools

    Follow and execute change management procedures

    Develop and leverage new technologies to improve IT situational awareness

    Work with development teams, system owners, application owners, and business stakeholders to identify and monitor important infrastructure and business systems

    Create methods to detect errors and outages

    Create notifications to appropriate groups when issues occur

    Provide expertise, tools, and assistance to operations, development, and support teams for monitoring IT systems, infrastructure, applications, tools, processes and tasks

    Collaborate with support teams and business partners to ensure the business is operating and to detect as quickly as possible when anything goes wrong

    Collaborate with DevOps to automate monitoring capability as part of building any new project or system

    Collaborate with IT Infrastructure to develop a comprehensive window into the global operation

    Actively seek out improvements and solutions in IT operational awareness

    Future expansion of this role may include opportunities for team leadership and management

    Job Requirements:

    Bachelor's degree in Computer Science, or experience in a related field, preferred

    Must be authorized to work in the US for any employer

    4-6 years of experience in Enterprise IT efforts to build, maintain, and deploy large-scale infrastructure or development projects, including:

    2-4 years of direct experience working with monitoring, logging, or telemetry software such as: Splunk, Zabbix, Nagios, SolarWinds, SCOM, Pingdom, Graylog, Logentries, Metricbeat, Elastic/ELK, Grafana, NXLog, EventTracker, Prometheus, Datadog, PagerDuty, AlertOps, Opsgenie, or others

    Experience working both independently and in a team-oriented, collaborative environment is essential

    Demonstrated ability to conform to shifting priorities, demands and timelines through analytical and problem-solving capabilities

    Ability to remain flexible during times of change and react to project adjustments and alterations promptly, efficiently and positively

    Strong written and oral communication skills

    Strong interpersonal skills

    Must be able to learn, understand and apply new technologies

    Strong customer orientation

    Excellent analytical and problem-solving capability

    Ability to effectively prioritize and execute tasks in a high-pressure environment is crucial

    Ability to influence colleagues and communicate effectively across all levels of the organization

    Ability to manage multiple projects and work effectively under time constraints as necessary

    Excellent verbal, written and relationship skills used to interact with a global group of technical and non-technical people

    Attention to detail is a must

    Ideal candidate will have the following additional experience:

    Championing and driving an organization's logging and monitoring strategy

    Implementing large-scale monitoring projects

    Utilizing configuration as code or other strategies to bake monitoring into infrastructure at the earliest stages of implementation

    Automating management, configuration, or other tasks for consistency and reliability

    Scripting using PowerShell or Bash

    Software development, particularly in .NET or .NET Core

    Using git for version control of software or scripts

    Experience on or with a NOC-style 24-hour monitoring and response team

    Creation of runbooks for controlled responses to incidents, errors, or problems

    Disaster Recovery planning

    Ability to speak to the applicability and potential value of the following concepts:
    DevOps, Continuous Integration, Continuous Delivery, Configuration as Code, Cattle not Pets, Customer Value Stream, Iterative Improvement

  • Operations Manager
    Information Technology
  • No Preference
    Full-Time Job
Recruiter Details
Doug Klares
1350 Broadway, Suite 2205, NEW YORK-10018, NY, US