Wednesday, October 29, 2025

Fitch Group achieves multi-Region resiliency for mission-critical Kafka infrastructure with Amazon MSK Replicator


Real-time data streaming and event processing are critical components of modern distributed systems architectures. Apache Kafka has emerged as a leading platform for building real-time data pipelines and enabling asynchronous communication between microservices and applications. However, running and managing Kafka clusters at scale can be challenging, requiring specialized expertise and significant operational overhead.

Amazon Managed Streaming for Apache Kafka (Amazon MSK) is a fully managed service that enables you to build and run production Kafka applications. With Amazon MSK, you can rely on AWS to handle the heavy lifting of provisioning and managing Kafka clusters, while you focus on building innovative applications and real-time data processing pipelines.

In this post, we explore how Fitch Group, one of the top credit rating companies, used Amazon MSK and Amazon MSK Replicator to achieve multi-Region resiliency for their mission-critical Kafka infrastructure.

About Fitch Group and their need for multi-Region resiliency

As a leading global financial information services provider, Fitch Group delivers vital credit and risk insights, robust data, and dynamic tools to champion more efficient, transparent financial markets. With employees in over 30 countries, Fitch Group’s culture of credibility, independence, and transparency is embedded throughout its structure, which includes Fitch Ratings, one of the world’s top three credit ratings agencies, and Fitch Solutions, a leading provider of insights, data, and analytics.

To stay competitive and efficient in the fast-paced financial industry, Fitch Group strategically adopted an event-driven microservices architecture. At the heart of this ecosystem lies Kafka, specifically Amazon MSK, which serves as the backbone for their data integration systems.

Fitch Group uses Kafka to enable applications to send ratings-related business events, facilitating automation within their ratings workflow systems and providing real-time or near real-time processing. This architectural choice has significantly reduced the time to market for end-user-facing systems like Fitch Ratings Pro and Fitch Group Ratings websites. Moreover, Kafka’s robust capabilities allow for seamless aggregation and distribution of data from many disparate systems through their data platform, enhancing data consistency, reliability, and accessibility across the organization.

Given the critical role that Kafka plays in Fitch Group’s architecture, providing robust disaster recovery (DR) mechanisms became paramount. Any disruption to their Kafka infrastructure could have significant repercussions on their ratings workflow automation, real-time processing, and end-user-facing systems, potentially exposing Fitch Group to regulatory, financial, and reputational risks.

To achieve the desired levels of resiliency, Fitch Group had the following key requirements:

  • Multi-Region deployment – Deploy MSK clusters across multiple AWS Regions to provide business continuity and maintain service availability during Regional or service events
  • Automated replication – Replicate Kafka data across Regions in near real time with minimal latency and data loss
  • Consistent topic namespaces – Maintain the same Kafka topic names and structures across source and destination clusters to minimize application changes
  • Rapid recovery – In the event of a failover, enable applications to seamlessly start consuming from the replicated cluster with minimal Recovery Time Objective (RTO) and Recovery Point Objective (RPO)

Solution overview

Fitch Group chose to implement their multi-Region Kafka deployment using Amazon MSK and MSK Replicator. MSK Replicator is a fully managed replication service that enables continuous, automated data replication between MSK clusters within the same Region or across different Regions. It supports replicating data between clusters with different configurations, including varying broker counts, storage volumes, and Kafka versions. Here’s how Fitch Group used MSK Replicator to achieve their multi-Region resiliency goals:

  • Deployed MSK clusters in two separate Regions, with the primary cluster in the primary Region and the secondary cluster in a different Region for disaster recovery
  • Configured MSK Replicator to continuously replicate data from the primary cluster to the secondary cluster, maintaining the same topic names and structures across both clusters
  • Implemented application failover logic to automatically switch to consuming from the secondary cluster if the primary cluster becomes unavailable, with minimal recovery time and data loss
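The replication setup described in the list above can be sketched as a request for the MSK `CreateReplicator` API. This is a minimal illustration, not Fitch Group's actual configuration: the function name, ARNs, subnet IDs, and security group IDs are hypothetical placeholders, and the field names reflect our understanding of the API shape, so verify them against the current Amazon MSK API reference before use. The part that matters for the "same topic names" requirement is setting the topic name configuration to `IDENTICAL`.

```python
from typing import Any, Dict, List


def build_replicator_request(
    name: str,
    source_cluster_arn: str,
    target_cluster_arn: str,
    source_subnet_ids: List[str],
    target_subnet_ids: List[str],
    source_security_group_ids: List[str],
    target_security_group_ids: List[str],
    service_role_arn: str,
) -> Dict[str, Any]:
    """Build a CreateReplicator-style request (field names are assumptions)."""

    def cluster(arn: str, subnets: List[str], groups: List[str]) -> Dict[str, Any]:
        # Each cluster entry pairs the MSK cluster ARN with the VPC settings
        # the replicator uses to reach it.
        return {
            "AmazonMskCluster": {"MskClusterArn": arn},
            "VpcConfig": {"SubnetIds": subnets, "SecurityGroupIds": groups},
        }

    return {
        "ReplicatorName": name,
        "ServiceExecutionRoleArn": service_role_arn,
        "KafkaClusters": [
            cluster(source_cluster_arn, source_subnet_ids, source_security_group_ids),
            cluster(target_cluster_arn, target_subnet_ids, target_security_group_ids),
        ],
        "ReplicationInfoList": [
            {
                "SourceKafkaClusterArn": source_cluster_arn,
                "TargetKafkaClusterArn": target_cluster_arn,
                "TargetCompressionType": "NONE",
                "TopicReplication": {
                    # Replicate every topic and pick up new ones automatically.
                    "TopicsToReplicate": [".*"],
                    "DetectAndCopyNewTopics": True,
                    "CopyTopicConfigurations": True,
                    # Keep topic names identical on the target cluster so
                    # clients need no topic-name changes after failover.
                    "TopicNameConfiguration": {"Type": "IDENTICAL"},
                },
                "ConsumerGroupReplication": {
                    "ConsumerGroupsToReplicate": [".*"],
                    "DetectAndCopyNewConsumerGroups": True,
                    "SynchroniseConsumerGroupOffsets": True,
                },
            }
        ],
    }
```

Assuming the field names hold, the resulting dictionary could be passed to `boto3.client("kafka", region_name=...).create_replicator(**request)`. Building the request separately from the call keeps the configuration easy to review and to unit test without AWS credentials.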

The following diagram illustrates this architecture.

Benefits achieved

By implementing Amazon MSK and MSK Replicator, Fitch Group realized several key benefits:

  • Enhanced disaster recovery – The multi-Region deployment provides business continuity even in the face of Regional or service events
  • Simplified operations – The managed capability of MSK Replicator offloads the operational complexity of self-managing custom replication solutions, reducing the burden on Fitch Group’s IT team
  • Scalability – The solution can scale to handle varying data loads, so that DR capabilities grow alongside business needs
  • Minimal application changes – MSK Replicator supports replicating topics with the same name, which eliminates the need for client application changes, reducing development effort and potential errors
  • Seamless failover and failback – Bidirectional replication capabilities enable quick switching of operations to the standby Region with minimal disruption, and easy reversion after the primary Region is restored
  • Improved testing capabilities – The setup facilitates regular DR exercises without impacting production systems, allowing Fitch Group to validate their DR plans consistently
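Because topic names are identical on both clusters, client-side failover can in principle reduce to choosing which bootstrap brokers to connect to. The following is a minimal sketch of that idea, not Fitch Group's implementation: the function names, the broker endpoints, and the "any broker reachable over TCP" health rule are all assumptions made for illustration. A production health check would likely look at more than TCP reachability.

```python
import socket
from typing import Callable, Sequence


def tcp_reachable(endpoint: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to 'host:port' succeeds."""
    host, _, port = endpoint.rpartition(":")
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except (OSError, ValueError):
        # Unreachable host, refused connection, timeout, or malformed endpoint.
        return False


def choose_bootstrap_servers(
    primary: Sequence[str],
    secondary: Sequence[str],
    is_healthy: Callable[[str], bool] = tcp_reachable,
) -> str:
    """Prefer the primary cluster; fall back to the secondary one.

    A cluster counts as usable here if at least one of its brokers passes
    the health check (an illustrative rule, not an MSK recommendation).
    """
    for cluster in (primary, secondary):
        if any(is_healthy(broker) for broker in cluster):
            return ",".join(cluster)
    raise RuntimeError("no reachable Kafka cluster")
```

The returned string can be used as the `bootstrap.servers` value of a Kafka client. Injecting `is_healthy` makes the selection logic testable in a DR exercise without touching real brokers, and failback follows from the same rule: once the primary passes the check again, it is chosen first.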

Conclusion

By using Amazon MSK and MSK Replicator, Fitch Group has successfully implemented a highly resilient and scalable Kafka infrastructure that meets their stringent business continuity and disaster recovery requirements. This multi-Region deployment enables them to process mission-critical financial data at scale while keeping downtime and data loss to a minimum during service events or disasters. As Fitch Group continues to innovate and grow, their robust Kafka infrastructure provides a solid foundation for future expansion and the development of new data-driven services, ultimately enhancing their ability to deliver timely and accurate financial insights to their clients.


About the authors

Kalyan Janaki is a Senior Big Data & Analytics Specialist with Amazon Web Services. He helps customers architect and build highly scalable, performant, and secure cloud-based solutions on AWS.

Venu Nemallikanti is the Enterprise Architect and Lead for Event Streaming at Fitch Group, a globally recognized financial information services provider operating in over 30 countries. His primary responsibilities include overseeing the architecture and implementation of event streaming solutions, ensuring the seamless integration and performance of systems that deliver credit ratings, research, data, and analytics to a global clientele.

Chaitanya Shah is a Principal Technical Account Manager with AWS, based out of New York. He loves to code and actively contributes to the AWS solutions labs to help customers solve complex problems. He provides guidance to AWS customers on best practices for their cloud migrations. He also specializes in AWS data transfer and the data and analytics domain.

Oleg Chugaev is a Principal Solutions Architect and Serverless evangelist with 20+ years in IT, holding multiple AWS certifications. At AWS, he drives customers through their cloud transformation journeys by converting complex challenges into actionable roadmaps for both technical and business audiences.
