Software Engineer - SRE
Bethesda Softworks @ Rockville, MD, US
The Bethesda.net team is seeking a Site Reliability Engineer (SRE) to help solve our toughest problems. Working on the platform’s technical foundations, you will help improve performance and stability. Working with our product teams, you will mentor engineers and guide future development to bring sustainable growth to our services.
Given the choice between fast and perfect, you seek the proper balance. Your experience brings an understanding of how past and present choices affect the future potential of systems and teams. When you see repetitive work or manual processes, you actively seek to reduce the wasted effort and increased risk they bring.
- 2 years of experience as a software engineer
- A good grasp of software engineering principles and strong problem-solving, design, and testing skills
- Experience developing and designing software solutions for a live platform
- Experience operating and deploying systems in a cloud environment
- Experience engineering automated build/deploy systems that include continuous integration and infrastructure as code
- Familiarity with Docker container-based systems
- Able to troubleshoot complex systems in a live environment quickly and effectively
- Familiarity with Linux system administration
- Review new and existing services for performance, reliability, and sustainable coding practices
- Understand and define infrastructure as code to support the systems being developed
- Write clean, maintainable code that is suitable for continuous integration and deployment (CI/CD), following best practices and software guidelines
- Maintain and contribute to common code libraries that can be used by engineers to leverage the platform in a consistent manner
- Learn and utilize diverse languages and technologies as needed: Python, Go, Nginx, Redis, MySQL, AWS technologies, etc.
- Develop solutions in support of platform reliability under the leadership of a technical lead and senior engineers
- Support systems in a 24x7 environment including troubleshooting, hot fixing, and root cause analysis
- Implement automation for repeated and time-consuming tasks
- Participate in the on-call rotation with the rest of the engineering team to provide escalated support for Tier 1 and Tier 2
- Other duties as assigned