
About the Role
Our work environment includes:
Growth opportunities
Casual work attire
Regular social events
As an SRE you will:
Be on call during business hours to respond to Alpha Nodus’s infrastructure availability incidents
Be the first line of defense to restore the systems, and then help debug and fix the issues in a way so that they don’t arise again.
Lead the effort to debug, resolve and restore the availability of services with the help of product development, customer success, and the operations team across services and levels of the stack. and ensure that similar incidents don't arise again.
Improve the infrastructure to prevent incidents from ever happening in the first place
Create monitoring and alerts on symptoms and not on outages
Design, build and maintain core infrastructure to allow scaling of Alpha Nodus Infrastructure to 100’s of enterprise customers
Improve the deployment, provisioning and rollout process
Document every action so findings turn into repeatable actions and then turn into automated recovery
Be the champion for all aspects of security and access control for the production environment
Requirements
You may be a fit if you have some of these qualifications:
Bachelor’s degree in computer science or other technical disciplines
1-3 years of professional experience as an SRE
Strong programming skills: Python, Node.js, javascript
Experienced with Nginx, Docker, Kubernetes, chef, terraform, or similar technologies
Have the urge to collaborate and communicate asynchronously with different departments
Have the urge to document all the things so you don't need to learn the same thing twice
Have the urge to deliver quickly and iterate fast

EMAIL your Resume and Cover Letter to jobs@alphanodus.com