Rootconf 2017 // Ruchi Singh
Migration of 300 microservices from AWS Cloud to Snapdeal Cloud
@rchsngh569
The next 15 mins...
Largest e-commerce marketplace in India
Buyer Seller
Overview of Snapdeal Cloud: Cirrus
And the story begins... Our plan for migration
Checklist before starting migration
Why did we build a private cloud?
Cost - A major driving factor
Public clouds are great while growth is unpredictable
Past an inflection point, public clouds stop being cost-effective; we needed an alternative.
And how we made it cost-effective:
SD cloud is built using 100% open-source components.
Analysis of some enterprise technologies and a calculation of operational costs gave us clarity about our idea.
We have a team to build, automate and manage the DC and Cloud Platform.
Other factors...
Performance - One machine, one service; much higher performance, optimized for our own use
Security - Advanced enterprise firewall, intrusion detection, DDoS prevention
Data sovereignty - keeping our critical data within boundaries
To Summarize...
Gaining knowledge about infrastructure...
After planning steps
Gotchas during migration and our fixes for them
-- Security groups to redirect traffic between services running in the old cloud and the new cloud
-- Data was out of sync, so we took a delta and dumped that data onto the new machines
-- Launched machines were not able to handle the load, so we extended our infrastructure at run time
-- The new system and applications needed strong monitoring, and we were missing some parts of it. We use EFK and Icinga for monitoring
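The "take a delta and dump it" fix can be sketched roughly as below. This is a minimal illustration, not our actual tooling: it assumes every record carries an `updated_at` timestamp and that the time of the initial bulk copy is known. The names `compute_delta`, `apply_delta` and the record fields are hypothetical.

```python
from datetime import datetime, timezone


def compute_delta(source_rows, last_sync):
    """Return the rows changed after the initial bulk copy (assumed schema)."""
    return [row for row in source_rows if row["updated_at"] > last_sync]


def apply_delta(target, delta):
    """Upsert the delta rows into the target store, keyed by 'id'."""
    for row in delta:
        target[row["id"]] = row
    return target


# Illustrative data only: the bulk copy happened at last_sync.
last_sync = datetime(2017, 1, 1, tzinfo=timezone.utc)
source_rows = [
    {"id": 1, "value": "a", "updated_at": datetime(2016, 12, 30, tzinfo=timezone.utc)},
    {"id": 2, "value": "b2", "updated_at": datetime(2017, 1, 2, tzinfo=timezone.utc)},
    {"id": 3, "value": "c", "updated_at": datetime(2017, 1, 3, tzinfo=timezone.utc)},
]
# State of the new machines after the bulk copy: row 2 is stale, row 3 is missing.
target = {
    1: source_rows[0],
    2: {"id": 2, "value": "b", "updated_at": datetime(2016, 12, 31, tzinfo=timezone.utc)},
}

delta = compute_delta(source_rows, last_sync)
apply_delta(target, delta)
```

Only rows 2 and 3 are shipped, which keeps the catch-up window short compared with repeating the full dump.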
Our plan for failure... Rollback strategies
If a service migration fails for any reason, these are the rollback strategies we have:
Execution part: technical tools
-- Our YAML files for each individual service (our infrastructure-as-code source)
-- Dendrite for service discovery (Nerve and Synapse)
-- SaltStack, Chef (orchestration tools)
-- Git, Jenkins (CI/CD pipeline)
-- Automation scripts
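To give a feel for a per-service definition used as the infrastructure-as-code source, here is a small sketch of validating one before automation acts on it. It assumes the YAML file has already been parsed into a dict; the field names (`name`, `instances`, `port`, `depends_on`) and the `load_spec` helper are hypothetical and do not reflect our real schema.

```python
from dataclasses import dataclass

# Hypothetical set of keys every service definition must provide.
REQUIRED_KEYS = {"name", "instances", "port"}


@dataclass(frozen=True)
class ServiceSpec:
    name: str
    instances: int
    port: int
    depends_on: tuple = ()


def load_spec(raw):
    """Validate a parsed service definition and return a typed spec."""
    missing = REQUIRED_KEYS - raw.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    instances = int(raw["instances"])
    if instances < 1:
        raise ValueError("instances must be >= 1")
    return ServiceSpec(
        name=raw["name"],
        instances=instances,
        port=int(raw["port"]),
        depends_on=tuple(raw.get("depends_on", ())),
    )


# Illustrative definition, as it might look after parsing the YAML file.
spec = load_spec({"name": "cart", "instances": 4, "port": 8080, "depends_on": ["catalog"]})
```

Failing fast on a malformed definition is cheaper than discovering it after machines have been launched.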
Key Learnings
-- Plan, plan and plan for your cloud migration! This is where projects fail.
-- Understand your services, architecture and their dependencies
-- We created a live service dependency graph to facilitate migration
-- Don’t migrate as-is; fix the problems that you never got time to fix
-- Follow strict naming conventions and make sure all launched services are registered with every orchestration tool you are using
-- Automate and monitor everything!
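One way a service dependency graph can drive a migration is by grouping services into waves, so that a service moves only after everything it depends on has moved. The sketch below uses Python's standard `graphlib` module (Python 3.9+). The service names and the dependencies-first ordering are assumptions for illustration, not a description of how our own graph was used.

```python
from graphlib import TopologicalSorter


def migration_waves(dependencies):
    """Group services into waves; each wave depends only on earlier waves.

    `dependencies` maps a service to the set of services it depends on.
    A cycle in the graph raises graphlib.CycleError from prepare().
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable order
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical services and their dependencies.
deps = {
    "checkout": {"cart", "payments"},
    "cart": {"catalog"},
    "payments": set(),
    "catalog": set(),
}

waves = migration_waves(deps)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

Services within the same wave have no dependencies on each other, so they can be migrated in parallel.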
But our cloud is hybrid...
But we didn’t stop there; we are still using the public cloud for these purposes:
-- Disaster recovery: for data backup and recovery
-- New service/company acquisition: a newly acquired company that used to run in the public cloud takes some time to migrate
-- On-demand: at critical times like Diwali, when we reach maximum capacity and still need to grow
And it’s party time!
After 1.5 years of ups and downs, a few downtimes and challenges, Snapdeal.com is running on its own cloud.
Thank you!