OpenPaaS story: Road to Production
- May 11, 2020
- Posted by: Linh Thao Nguyen
- Category: Services
In the life cycle of an application, deploying to production is one of the most important phases. OpenPaaS at Linagora is no exception, and this article describes the problems we encountered and how the OpenPaaS team dealt with them during the first phase of going to production.
PROVE
There are plenty of companies out there with millions of applications offering similar functionality. How do we make customers trust us and choose our solutions? At Linagora, we bring together many convincing arguments. One of them is that we build Open Source Software: we respect customers’ data privacy and keep everything transparent (source code, documentation, roadmap, etc.).
What customers care about most is security and performance. When you develop locally with a handful of users, you can focus on features alone. Once your application goes live and serves thousands or even millions of users, performance and security become the real problems, and we need to prove ourselves to customers on both. For performance, the OpenPaaS team has developed a tool that challenges our servers by simulating a production environment with thousands of user requests over a set period of time. For security, we have done our best to upgrade vulnerable dependencies and to refactor the code base to avoid mistakes that leak sensitive information.
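To make the idea of such a load test concrete, here is a minimal sketch in Python. It is not the OpenPaaS tool itself: the names `run_load_test`, `fake_request` and `summarize` are made up for this illustration, and `fake_request` only sleeps instead of calling a real server. The sketch runs a number of concurrent "virtual users", each sending requests in a loop until a deadline, and records latencies and errors.

```python
import asyncio
import time
from dataclasses import dataclass, field


@dataclass
class LoadTestResult:
    latencies: list = field(default_factory=list)  # seconds, successful calls only
    errors: int = 0

    @property
    def requests(self) -> int:
        return len(self.latencies) + self.errors


async def fake_request() -> None:
    """Hypothetical stand-in for one HTTP call to the server under test."""
    await asyncio.sleep(0.001)


async def run_load_test(send, virtual_users: int, duration_s: float) -> LoadTestResult:
    """Run `virtual_users` concurrent loops that call `send` until time is up."""
    result = LoadTestResult()
    deadline = time.monotonic() + duration_s

    async def user_loop() -> None:
        while time.monotonic() < deadline:
            started = time.monotonic()
            try:
                await send()
                result.latencies.append(time.monotonic() - started)
            except Exception:
                # Any failure counts as an error instead of stopping the run.
                result.errors += 1

    await asyncio.gather(*(user_loop() for _ in range(virtual_users)))
    return result


def summarize(result: LoadTestResult) -> dict:
    """Return the request count, error count and 95th percentile latency."""
    ordered = sorted(result.latencies)
    p95 = ordered[int(0.95 * (len(ordered) - 1))] if ordered else None
    return {"requests": result.requests, "errors": result.errors, "p95_s": p95}
```

A run could look like `summarize(asyncio.run(run_load_test(fake_request, virtual_users=1000, duration_s=60)))`, with `fake_request` replaced by a coroutine that performs a real request against the Sandbox server.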
PREPARE
1. Distributed system
To go live, the system needs to ensure:
- Availability
- Fault tolerance
- Speed
- Scalability
The OpenPaaS team has distributed its services (databases, James, the Node.js server, etc.) using Kubernetes and the Helm package manager.
2. Sandbox environment
Of course, we already had testing environments during the R&D process, but to be sure everything is fine before going live, we need an environment that is close to the production one. Our DevOps team did a great job setting it up. Now, with a few clicks, a newly released version is available for testing.
3. Data store
Docker comes with great features that give developers a quick and easy way to start services such as databases, Elasticsearch, RabbitMQ, etc. Production, however, is a completely different story: data is the most important asset, and it requires a stable data store with backup and restore for when bad things happen. We don’t trust Docker on this point (at least not yet). So the OpenPaaS team decided to store data outside of Kubernetes (k8s), directly on real machines.
We also held many workshops and worked hard on data backup and restore. With many technologies running in a distributed environment, we need a reliable and safe way to back up and restore data.
4. Releasing workflow
Going live requires the development team to have a smooth workflow for deploying a new version quickly and easily to the Sandbox and Production environments. In the OpenPaaS team, we have defined clear steps for this. For example, once the development team has fixed bugs, we publish a release candidate, deploy it to the Sandbox environment, and ask QA to verify the fixes and any possible impact on other features. Then, if everything is OK, we publish an official version and deploy it to the Sandbox again to make sure the basic features work before deploying it to production.
Our QA team combines manual and automated tests on the Sandbox environment. This lets manual testing focus on the bug fixes while automated tests cover most of the basic functionality.
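The release steps described in this section can be read as an ordered list of gates, where a failure at any gate stops the release. The Python sketch below illustrates only that idea; the stage names paraphrase the steps above, and `run_release`, `build_stages` and the check functions are hypothetical, not part of our actual tooling.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a check that returns True when the stage succeeded.
Stage = Tuple[str, Callable[[str], bool]]


def run_release(version: str, stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failing gate.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, check in stages:
        if not check(version):
            break
        completed.append(name)
    return completed


def always_ok(version: str) -> bool:
    """Placeholder check; a real one would call CI, Helm, a test suite, etc."""
    return True


def build_stages(qa_passes: bool) -> List[Stage]:
    """Build the six stages; `qa_passes` simulates the QA verdict."""
    return [
        ("release candidate published", always_ok),
        ("release candidate deployed to Sandbox", always_ok),
        ("QA verified fixes and regressions", lambda version: qa_passes),
        ("official version published", always_ok),
        ("official version checked on Sandbox", always_ok),
        ("official version deployed to production", always_ok),
    ]
```

With `build_stages(False)`, `run_release` stops after the Sandbox deployment of the release candidate, so nothing that QA rejected can reach production.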
DEPLOY
1. Migrating
Cool! Now we come to an interesting part: deploying our production application to customers. Because each customer has its own legacy platform and data, we need a specific migration process for each one. This requires us to work with customers on their particular data structures and business rules. For example, our mail server James needs to reindex all of the emails from the old platform into Elasticsearch, and with millions of emails on the old platform, a single process reindexing them sequentially takes a very long time to finish. So the James team came up with a parallel solution that uses multiple processes to reindex the emails.
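As a rough illustration of the parallel approach (not the James team's actual implementation), the Python sketch below splits a list of message IDs into batches and hands the batches to a pool of workers. It uses threads as a stand-in for the multiple processes mentioned above, and `reindex_parallel`, `chunked` and the `index_batch` callback are hypothetical names; a real `index_batch` would send each batch to Elasticsearch.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def reindex_parallel(
    message_ids: Sequence[str],
    index_batch: Callable[[Sequence[str]], int],
    workers: int = 4,
    batch_size: int = 100,
) -> int:
    """Index all messages using `workers` concurrent workers.

    `index_batch` receives one batch of IDs and returns how many it indexed.
    The function returns the total number of indexed messages.
    """
    batches = list(chunked(message_ids, batch_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map spreads the batches over the workers and re-raises
        # any exception from a worker when its result is consumed.
        return sum(pool.map(index_batch, batches))
```

Batching keeps each indexing call reasonably sized, and the worker count can be tuned to what the search cluster can absorb during the migration.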
2. Monitoring
Once we go live, we need a convenient way to monitor the system. This requires each service in the system to expose a way to be monitored, typically an API, and those APIs are then consumed by monitoring tools.
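One common shape for such an API is a health endpoint that reports the state of each dependency. The Python sketch below shows the idea with the standard library only; the `/health` path, the JSON layout and the names `health_report` and `make_handler` are assumptions made for this example, not the API our services actually expose.

```python
import json
from http.server import BaseHTTPRequestHandler
from typing import Callable, Dict, Tuple


def health_report(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, dict]:
    """Run every check and return (HTTP status, JSON-serializable body)."""
    components = {}
    for name, check in checks.items():
        try:
            components[name] = "up" if check() else "down"
        except Exception:
            # A check that raises is reported as down instead of crashing.
            components[name] = "down"
    healthy = all(state == "up" for state in components.values())
    body = {"status": "up" if healthy else "down", "components": components}
    return (200 if healthy else 503), body


def make_handler(checks: Dict[str, Callable[[], bool]]):
    """Build a request handler class that serves the report on GET /health."""

    class HealthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/health":  # hypothetical path
                self.send_error(404)
                return
            status, body = health_report(checks)
            payload = json.dumps(body).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

    return HealthHandler
```

To serve it, pass the handler to `http.server.HTTPServer`, for example `HTTPServer(("127.0.0.1", 8080), make_handler(checks)).serve_forever()`. A monitoring tool can then poll the endpoint and raise an alert on any non-200 response.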
3. Trial feedback
Customers get a trial period to try out the platform before it officially goes live. During this period, we receive a lot of feedback from them:
- Bug fixes: Obviously, bugs are always there 😉
- Improvements & customization: For example, the application theme and language.
- Feature requests: Yes, customers need specific features to handle their specific business needs.
- Performance: Performance depends on the deployed system and the capabilities of the specific servers (memory, network, etc.).
4. Hot fixes
Once deployed, if there are bugs that crash the system or make it misbehave, we need quick fixes for customers to protect the user experience, followed later by proper fixes that require more time to design, refactor, etc.
Conclusion
All the results above require close coordination among many roles: Developers, DevOps, QA, BA, PO, Designer. The OpenPaaS team worked hard to get through these first steps and take our product live.
Of course, these are only some small successes in the first phase, and many challenges are still waiting for us ahead. We need to keep improving our skills, knowledge, and resources to meet them, and we look forward to them with the highest motivation.
If you want to be part of our team, build awesome things, and solve interesting challenges like these together, join us here. For Vietnamese people, visit us here.
Enjoy, and thank you!
Tuan Le Cong – Javascript Team Leader