Platform Systems Engineer
Your Responsibilities:
- Identify, resolve, and escalate incidents quickly and efficiently.
- Oversee the real-time health and functionality of the platform.
- Prioritize and address issues reported by users.
- Use application logs and stack traces for issue diagnosis.
- Collaborate with fellow engineers to uphold and enhance the application's health and functionality.
- Build operational tools, improvements, and automation.
- Engage in incident management and participate in post-incident evaluations.
- Compose clear and user-friendly documentation for the environment and operational processes.
- Work as part of a team on a 24x7 rotating shift schedule.
Your Credentials:
- Demonstrated commitment to ownership, customer support, and ethical integrity, along with excellent communication skills.
- Proven ability to resolve complex engineering problems through debugging and analysis.
- Proficiency in operating and troubleshooting Linux/Unix systems in production.
- Familiarity with shell scripting and other scripting languages.
- Knowledge of container solutions (Docker) and orchestration tools (Kubernetes).
- Experience implementing logging, telemetry, and monitoring tools such as Splunk, Prometheus, and Grafana.
- Proficiency with version control systems, especially Git.
- Robust skills in debugging and problem-solving across applications, systems, and networks (TCP/IP).
- Experience working within a 24x7 shift rotation.
- A bachelor's degree or higher.
Added Advantages:
- Proficiency in working with APIs and familiarity with serialized formats such as JSON and YAML.
- Knowledge of interactive web-based computing and notebook platforms such as Jupyter Notebook, Deepnote, Apache Zeppelin, JetBrains Datalore, Nextjournal, and Google Colab.
- Capability to understand and troubleshoot Python scripts.
- Operations background in sectors like e-commerce, retail banking, or payment systems.
- Experience operating AWS in large-scale production settings.