Senior System Administrator, Technical Operations (Redwood City, CA)


Job Description:

We are looking for a Sr. System Administrator to join our fast-growing SaaS business. This role is part of the Technical Operations (TechOps) team, which supports and scales the Zuora service and its underlying infrastructures. TechOps provides 24x7x365 service support and is responsible for the availability, security, and performance of the Zuora service, which is business critical for every single Zuora customer. The Sr. System Administrator will bring stellar system administration skills in Linux and open source (LAMP) environments, with significant proficiency in designing, monitoring, managing, and scaling globally dispersed infrastructures for online service businesses.

The Sr. System Administrator will engage with other TechOps personnel to collaborate on infrastructure design and be responsible for fault tolerance, high availability, and helping to maintain a high quality of service for Zuora operations. This role will partner with Development, QA, and Release teams to ensure that production infrastructures are ready for on-time deployment of product releases and that operational requirements are met, so each product release can be adequately supported by the TechOps organization.

The Sr. System Administrator will engage with Development teams at various phases of the development lifecycle to collaborate on high-level application design and be responsible for capacity, scalability, and a high quality of service for Zuora operations.

The Sr. System Administrator will leverage industry best practices in design and systems architecture to help manage our growing network of servers and optimize performance of the service, while continuously contributing to monitoring and management.

Job Responsibilities:

  1. Own the design and build-out of server infrastructures, both virtual and physical, with fluency in load balancer configurations, network administration, and MySQL databases as they relate to service availability, capacity management, performance, and security
  2. Be fluent in automated system provisioning, data centers, and writing automation scripts/tools to manage a large infrastructure with a small, effective staff of talented system engineers
  3. Cultivate partnerships with Development: infuse and help prioritize operational requirements, drive teams to surface key performance indicators (KPIs), craft the tools to interrogate, monitor, and alarm on those KPIs, write standard operating procedures for responding to the alarms, and ensure dependencies are well managed and fault tolerance is shipped intact
  4. Participate in on-call rotations for service support with a sense of urgency in maintaining service operations for high-volume, business-critical web properties
  5. Clearly communicate and produce documentation for various technology architectures
  6. Use critical thinking and keen judgment, and leverage best practices wherever possible
  7. Be a champion of change management, preferably with an ITIL/CMDB approach to asset and configuration management

Requirements:

  1. 5+ years of experience supporting high-growth Linux server environments (200+ nodes), preferably with most of that experience at online service businesses
  2. Fluency in monitoring and management approaches using toolsets such as Nagios, syslog-ng, Splunk, Puppet, Chef, and RRD-based tools such as MRTG, Cacti, Munin, and Ganglia
  3. Strong knowledge of building and optimizing LAMP and open source application stacks (Apache, Tomcat, MySQL, Python, Ruby, Chef, memcached, ActiveMQ, etc.), with significant expertise in RHEL-derived Linux distributions such as CentOS
  4. Experience with both physical and virtual automated system provisioning (PXE boot, DHCP)
  5. Expertise in DNS, LDAP, various storage strategies, and network or distributed file systems
  6. Knowledge of networking in both physical data center and virtual hosting (cloud computing) environments, including VLANs and how and why they are used, the difference between L2 and L3 protocols, fault tolerance in Internet connectivity, and management of various load balancing solutions (F5, Citrix, Cisco, Foundry)
  7. Knowledge of MySQL DB administration including backups and replication topologies is a plus
  8. Expertise with open source asset management systems such as OCS Inventory and GLPI is a plus
  9. Affinity for and ability to succeed in fast-paced, high-growth, startup-like environments
  10. Strong communication skills with proficiency in producing design diagrams and operational procedures
  11. Experience engaging in development lifecycles to help shape custom application output
  12. A positive, customer-service attitude and a proactive approach to engaging with Development teams
  13. Excellent multi-tasking and prioritization skills
  14. A degree from a 4-year university is desired (ideally in a technical field)

Local applicants preferred.


To apply, email your resume to careers@zuora.com and include the job title in the subject field.

Zuora is an Equal Opportunity Employer.

No third-party applications accepted.