Zuora provides the leading cloud-based subscription management platform that functions as a system of record for subscription businesses across all industries. Powering the Subscription Economy®, the Zuora platform was architected specifically for dynamic, recurring subscription business models and acts as an intelligent subscription management hub that automates and orchestrates the entire subscription order-to-cash process, including billing and revenue recognition.
At Zuora, every employee is the CEO of their career. Leading our mission are over 1,200 passionate and innovative ZEOs who value freedom, responsibility and accountability in equal measure because they have the capacity to make shift happen. Our culture isn’t an empty branding effort: our ZEOs love working here, and it shows in our 4.5+ rating on Glassdoor. We take it very seriously. We encourage our employees to be curious, creative, and focused on our shared mission of enabling our customers to be successful.
Zuora serves more than 1,000 companies around the world, including Box, Komatsu, Rogers, Schneider Electric, Xplornet and Zendesk. Headquartered in Silicon Valley, Zuora also operates offices in Atlanta, Boston, Frisco, Denver, San Francisco, London, Paris, Beijing, Sydney, Chennai and Tokyo.
We are looking for a Site Reliability Engineer for our Chennai office.
- Be part of a global SRE team based in Chennai, India and San Jose, US.
- Improve and build upon our automation tools for systems provisioning, monitoring, trending, and management.
- Communicate effectively with fellow SREs and other engineering teams, and describe problems succinctly and in enough detail to hand off an ongoing problem to another team or a peer for completion.
- During a crisis, lead the effort to triage and mitigate the issue
- Manage real-time communications during outages with both technical and non-technical audiences
- Perform periodic on-call duty as part of a global team maintaining the availability and performance of RevPro SaaS.
- Strategize with fellow SREs and other engineering teams on complex problems, and make decisions and recommendations about systems improvements after analyzing possible courses of action.
- Carry out performance analysis, proactive troubleshooting, continual improvement and capacity planning for production virtualized environments
- Administer the web servers, application servers and databases that run applications.
- Develop policies and procedures that improve overall platform stability.
- Participate in reviews of outages in order to improve overall product stability.
- Build relationships with development teams and technology leaders across the company
- 3-5 years of experience operating and scaling services in a distributed, internet-scale environment
- Strong experience with Oracle Database; hands-on experience with Postgres or MySQL is a plus.
- Strong knowledge of Linux operating systems and environment.
- Experience with monitoring, trending, and logging tools such as Logstash/Elasticsearch/Kibana, CloudWatch and Splunk.
- Experience with virtualization and Amazon Web Services (AWS).
- Solid scripting skills; experience with Shell, Python, PL/SQL, etc.
- Experience with the setup and configuration of tools such as Ansible, Terraform, Postfix, central logging (syslog-ng), SNMP and monitoring systems (e.g. Nagios, Ganglia, Cacti), and other reporting tools.
- Experience in handling production outages and root cause analysis
- Strong crisis management and leadership ability; experience with incident management.
- Hands-on operational experience in a high-volume or critical production service environment
- Effective communication skills, whether talking to individual contributors or to executive management
- Ultimate self-starter
- Strong troubleshooting and problem resolution skills
- Experience creating tools for infrastructure (IaaS and PaaS) management and automation a plus
- Experience with complex SaaS or production, revenue-critical web services environments is a strong plus.
- Experience with Unix/Linux system administration, especially in Red Hat Linux (CentOS) and Ubuntu environments
- Experience with environment configurations at network, OS and application levels
- Experience with environment monitoring in 24/7 web application and e-commerce environments
- Ability to use scripting languages to automate tasks and gather data
- Demonstrated ability to use problem-solving techniques such as root cause analysis to resolve issues
- Demonstrated ability to write and present effective materials, including presentations, status reporting, technical diagrams and flowcharts
- Ability to follow and adhere to policies, procedures and standards relating to systems management; may recommend process improvements.
- B.S. degree (required); M.S. degree or equivalent technical training
- Ability to handle periodic on-call duty
At Zuora, different perspectives, experiences and contributions matter. Everyone counts. Zuora is proud to be an equal opportunity employer committed to creating an inclusive environment for all.