Job Details

Logging and Monitoring Operations Engineer

Chicago, IL 60602, US
09/27/2018

Required Skills

PowerShell, interpersonal skills

Company

Infinity Consulting Solutions, Inc

Experience

4 to 6 years

Job Description

Logging and Monitoring Operations Engineer

ICS has partnered with a Fortune 500 financial services organization in Chicago that is seeking a Logging and Monitoring Operations Engineer.

This Engineer will be responsible for leading an offshore team on enterprise logging and monitoring efforts.

They will improve operational stability, reduce the risk of experimentation, and increase overall function of technology assets by providing robust, comprehensive logging, monitoring, and notification solutions.

Oversee the technical implementation of the Enterprise Logging and Monitoring effort outlined below:

Translate strategic monitoring goals into tactical plans, then execute them within a team

Build, maintain, and operate tools comprising the Enterprise Monitoring and Logging system to provide visibility into the operation of technology assets

Meet the monitoring, logging, and outage notification needs of system, application, and business owners

Work with teams to onboard specific applications with useful metrics and measures

Integrate logically-connected metrics to provide higher-level awareness of overall health

Assist in discovering root-level problems not clearly visible from their initial impacts

Mature overall organizational awareness and increase incident response capability

Enable more efficient usage of development and operations staff time

Emphasize collaboration and automation

Become a force-multiplier within the organization by unlocking and sharing new capabilities

Relentlessly pursue process improvement

Primary Accountabilities/Responsibilities:

Work with leadership and colleagues to define and modify Logging & Monitoring strategy

Translate strategy into an actionable, tactical plan to accomplish high-level goals

Mentor and guide team members and colleagues in Logging & Monitoring tactics and operations

Make high-level decisions and perform low-level technical configurations to build and maintain a global monitoring system

Operate, maintain, and expand monitoring tools

Follow and execute change management procedures

Develop and leverage new technologies to improve IT situational awareness

Work with development teams, system owners, application owners, and business stakeholders to identify and monitor important infrastructure and business systems

Create methods to detect errors and outages

Create notifications to appropriate groups when issues occur

Provide expertise, tools, and assistance to operations, development, and support teams for monitoring IT systems, infrastructure, applications, tools, processes and tasks

Collaborate with support teams and business partners to ensure our business is operating and to detect as quickly as possible when anything goes wrong

Collaborate with DevOps to automate monitoring capability as part of building any new project or system

Collaborate with IT Infrastructure to develop a comprehensive window into the global operation

Actively seek out improvements and solutions in IT operational awareness

Future expansion of this role may include opportunities for team leadership and management

Job Requirements:

Bachelor's degree in Computer Science or experience in a related field preferred

Must be authorized to work in the US for any employer

4-6 years of experience in Enterprise IT efforts to build, maintain, and deploy large-scale infrastructure or development projects, including:

2-4 years of direct experience working with monitoring, logging, or telemetry software such as Splunk, Zabbix, Nagios, SolarWinds, SCOM, Pingdom, Graylog, Logentries, Metricbeat, Elastic/ELK, Grafana, NXLog, EventTracker, Prometheus, Datadog, PagerDuty, AlertOps, OpsGenie, or others

Experience working both independently and in a team-oriented, collaborative environment is essential

Demonstrated ability to conform to shifting priorities, demands and timelines through analytical and problem-solving capabilities

Ability to remain flexible during times of change and react to project adjustments and alterations promptly, efficiently and positively

Strong written and oral communication skills

Strong interpersonal skills

Must be able to learn, understand and apply new technologies

Strong customer orientation

Excellent analytical and problem-solving capability

Ability to effectively prioritize and execute tasks in a high-pressure environment is crucial

Ability to influence colleagues and communicate effectively across all levels of the organization

Ability to manage multiple projects and work effectively under time constraints as necessary

Excellent verbal, written and relationship skills used to interact with a global group of technical and non-technical people

Attention to detail is a must

Ideal candidate will have the following additional experience:

Championing and driving an organization's logging and monitoring strategy

Implementing large-scale monitoring projects

Utilizing configuration as code or other strategies to bake monitoring into infrastructure at the earliest stages of implementation

Automating management, configuration, or other tasks for consistency and reliability

Scripting using PowerShell or Bash

Software development, particularly in .NET or .NET Core

Using Git for version control of software or scripts

Working on or with a NOC-style 24-hour monitoring and response team

Creating runbooks for controlled responses to incidents, errors, or problems

Disaster recovery planning

Ability to speak to the applicability and potential value of the following concepts:
DevOps, Continuous Integration, Continuous Delivery, Configuration as Code, Cattle not Pets, Customer Value Stream, Iterative Improvement



Operations Manager
Information Technology

No Preference
Full-Time Job
Other
1

Candidate Requirements

Bachelor's

Recruiter Details
Doug Klares
1350 Broadway, Suite 2205, New York, NY 10018, US