Job Details

Site Reliability Engineer

SILVER SPRING-20904, MD, US
08/11/2019


Required Skills

    JavaScript
Company

Infinity Consulting Solutions, Inc

Experience

3 to 5 Year(s)

Job Description

Responsibilities

Serve as a primary point of contact responsible for the overall health, performance, and capacity of one or more of our Internet-facing services

Gain deep knowledge of our complex applications

Assist in the roll-out and deployment of new product features and installations to facilitate our rapid iteration and constant growth.

Drive efficiencies in systems and processes: capacity planning, configuration management, performance tuning, monitoring, and root cause analysis

Lead an objective no-blame post-incident analysis and review process

On behalf of operations, be the point person for capacity planning and help the team anticipate and prepare for growth


Support the creation of end-to-end availability and performance of mission-critical services

Build automation to prevent problem recurrence.

Partner with specialists to build automated responses for non-exceptional service conditions.

Develop reliability tools and frameworks for use by all operations teams

Ensure all key services are measured and monitored, and raise alerts when needed

Partner with specialists on automating the deployment and configuration processes

Develop tools to improve our ability to rapidly deploy and effectively monitor custom applications in a large-scale UNIX and Windows environment.

Function well in a fast-paced, rapidly-changing environment.

Be on-call when required to support our operations centers

Requirements

Bachelor's degree in Computer Science, Information Technology, Mathematics, Software or Broadcast Engineering, or another technical discipline, or related practical experience.

3+ years of experience troubleshooting in Unix/Linux

Good programming skills in one or more of C/C++, Java, JavaScript, Python, and Perl, and an ability to pick up new languages.

Experience in the Linux environment and a good understanding of its fundamentals and internals: filesystems and modern memory management, threads and processes, the user/kernel-space divide, etc.

Background in configuration and management of large-scale platforms (virtualization, cloud, Unix, Linux, Java, SQL, Oracle).

A good understanding of large-scale distributed systems in practice, including multi-tier architectures, application security, monitoring and storage systems.

Working knowledge of the TCP/IP stack, internet routing and load balancing

Working exposure to linear and digital broadcasting and platforms preferred

Knowledge of most of these: data structures, relational and non-relational databases, networking, Linux internals, filesystems, web architecture, and related topics

Previous experience working with geographically-distributed coworkers.

Strong verbal, written, and interpersonal communication and customer service skills, and the ability to work well in a global, diverse, team-focused environment

Good organizational and conceptual skills combined with proven critical-thinking, analytical, problem-solving, and decision-making abilities

Ability to multitask within related functions


Others
Information Technology

No Preference
Full-Time Job
Other
1

Candidate Requirements
Bachelor's

Walk-in Information
8/2/2019

Recruiter Details
Doug Klares
1350 Broadway, Suite 2205, NEW YORK-10018, NY