Apple

Site Reliability Engineer


Pay: Competitive
Location: Austin, Texas
Employment type: Other

This job is now closed

  • Job Description

      Req#: 200545012

      Summary

      Our team is collaborative; we work closely with partner teams to deliver the best results for Apple. We strive to balance the best solution with the need to get things done for each engineering challenge we face. Good ideas are heard and results are rewarded.

      Key Qualifications

      • At least 3 years in Site Reliability Engineering, DevOps, or infrastructure-focused roles.
      • Basic Linux expertise.
      • Experience supporting internet-facing production services and distributed systems.
      • Ability to implement and coordinate telemetry using monitoring and observability tools such as Splunk, Grafana, Prometheus, or similar.
      • Experience troubleshooting and resolving issues in Kubernetes from both an OS and an application perspective.
      • Hands-on experience with scripting languages such as Bash and Python (required).
      • Experience building and operating container orchestration systems such as Kubernetes or EKS.
      • Experience designing, building, and maintaining infrastructure with a cloud provider such as AWS.
      • Automation advocate - you truly believe in removing operational load via software.
      • A strong sense of ownership. At the same time you’re a great teammate who communicates clearly and transparently.
      • Self-motivated, inquisitive, and always looking to learn more.
      • Nice to have:
      • Good understanding of networking, TCP/IP fundamentals, and basic troubleshooting.
      • Experience with disaster recovery and capacity planning.
      • Experience in deployment automation based on Terraform or CloudFormation.
      • Working experience with systems built on open source storage and search technologies, including Cassandra, Kafka, Solr, Postgres, and Redis.

      Description

      As an SRE at Apple, you'll:

      • Operate, monitor, and triage all aspects of our production and non-production environments.
      • Pioneer and implement the next-generation telemetry system for News, Stocks, Weather, and Books.
      • Prepare alert handling procedures and runbooks, and collaborate with the offshore SRE team.
      • Automate deployment and orchestration of services into the cloud environment, as well as other routine processes.
      • Actively participate in capacity planning and disaster recovery exercises.
      • Interact with and support partner teams, including engineering, SRE, QA, and project management, and create self-service solutions for them.
      • Cultivate and maintain relationships with internal and external third-party vendors.

      Education & Experience

      Bachelor of Science in Computer Science or equivalent experience is required.

      Additional Requirements

  • About the company

      Work at Apple! Join a team and inspire the work. Discover how you can make an impact: See our areas of work, worldwide locations, and opportunities for students.